
Chinese text dual attention network for aspect-level sentiment classification

  • Xinjie Sun ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    sxj123lps@163.com

    Affiliations Institute of Computer Science, Liupanshui Normal University, Liupanshui, Guizhou, China, Guizhou Huohua Technology Co., Ltd, Liupanshui, Guizhou, China

  • Zhifang Liu,

    Roles Data curation, Investigation, Visualization, Writing – original draft

    Affiliation Institute of Computer Science, Liupanshui Normal University, Liupanshui, Guizhou, China

  • Hui Li,

    Roles Data curation, Methodology

    Affiliation Institute of Computer Science, Liupanshui Normal University, Liupanshui, Guizhou, China

  • Feng Ying,

    Roles Funding acquisition, Investigation, Writing – original draft

    Affiliation Institute of Computer Science, Liupanshui Normal University, Liupanshui, Guizhou, China

  • Yu Tao

    Roles Data curation

    Affiliation Institute of Computer Science, Liupanshui Normal University, Liupanshui, Guizhou, China

Abstract

English text has a clear and compact subject structure, which makes it easy to find dependency relationships between words. Chinese text, however, often conveys information through situational context, which results in loose sentence structures; most Chinese comments and experimental summary texts even lack subjects. This makes it challenging to determine dependency relationships between words in Chinese text, especially in aspect-level sentiment recognition. To address this problem, a Chinese text dual attention network for aspect-level sentiment recognition is proposed. First, Chinese syntactic dependency analysis is proposed, and a sentiment dictionary is introduced to quickly and accurately extract aspect-level sentiment words, extract opinions, and classify sentiment trends in text. Additionally, the CNN-BILSTM model and position coding are introduced to extract context-level features. Finally, a two-level attention mechanism is used to better extract fine-grained aspect-level sentiment. Compared with ten advanced baseline models, the proposed model achieves better performance, with Accuracy of 0.9180, 0.9080 and 0.8380 on the three datasets, respectively. Extensive experiments demonstrate that this method achieves higher performance in aspect-level sentiment recognition in less time, and ablation experiments demonstrate the importance of each module of the model.

1 Introduction

With the increasing popularity of online education in China, more and more educational datasets are being recorded. On online education platforms, almost all students post comments and upload their homework, including course learning summaries and experiment summaries. These datasets contain both knowledge point descriptions and possible sentiment expressions related to the teaching and mastery of the course. A large amount of sentimental data can be found in these online text datasets [1]. Sentiment recognition is an essential area of Natural Language Processing (NLP) [2], involving the categorization of text according to its sentimental tone or subjective feeling. The primary objective of sentiment recognition is to analyze the sentimental inclination conveyed in brief subjective texts and determine their sentiment trends.

Aspect Based Sentiment Analysis (ABSA) is a highly detailed subfield of sentiment recognition that concentrates on detecting the different elements present in a text and their corresponding sentimental states, including positive, negative, or neutral [3]. This type of analysis aims to accurately express the outline of sentimental information by analyzing the aspect-level sentiment triad, which comprises the aspect word, the corresponding opinion word, and the corresponding sentiment trend [4]. For instance, the comment “I don’t like database, but the teacher is cute.” contains two aspect words, namely “database” and “teacher.” The corresponding opinion words are “don’t like” and “cute,” respectively, and negative sentiment and positive sentiment are assigned as the sentimental trends of “database” and “teacher.”

Convolutional Neural Networks (CNNs) [5] and Long Short-Term Memory (LSTM) networks [6] are significant models utilized for addressing sentiment recognition challenges. These models are designed to continually acquire new features based on the sequence of the context within the text, but they do not take into account the interrelationship between individual words. To address this limitation, the attention mechanism was proposed in literature [7], which more effectively correlates the words in sentences. Due to the effectiveness of the attention mechanism, BERT [53] is frequently employed for sentiment classification. One approach involves integrating the encoding structure of BERT into other models as an embedding layer, while the other involves directly utilizing BERT for aspect-level sentiment classification. Both methods have been shown to enhance results. As the corresponding sentimental trends of aspect words and their corresponding opinion words are dependent on each other, the utilization of the attention mechanism can assist in determining these trends.

For aspect-level sentiment analysis, the dependency tree-based Graph Neural Network (GNN) has gained significant attention as a promising area of research [8–10]. The information interaction between different words in the GNN is enhanced in literature [11, 12], whereby syntactic information is utilized to obtain sentence features. In addition, this has been shown to enhance the precision of sentiment recognition of textual data. By employing graph-based techniques, it becomes possible to capture both the semantic and syntactic connections between individual words, thereby enabling a more comprehensive comprehension of the sentiment conveyed within a particular text.

Although good results have been achieved by the above methods, there are still some problems that need to be addressed [9]. Firstly, the contextual information in the original sentences tends to be overlooked by models that integrate both syntactic and semantic information. Secondly, good results for complex statements are difficult to achieve; notably, the efficacy of these models is significantly impacted by the precision and correctness of the outputs generated through syntactic parsing. For example, consider the comment: “Who dares to compare service with big brands, right?” Most models will classify the sentiment towards the service as positive. However, in reality, this evaluation often contains elements of sarcasm, which is inherently negative. Additionally, for sentiment datasets accumulated on online learning platforms, many course summaries and experimental summaries are composed of long texts with complex sentiments, and the relatively loose Chinese grammar further decreases the accuracy of sentiment recognition.

To address the lack of sentimental knowledge assistance in existing models, a new aspect-level sentiment analysis network (ASN-CSD) is proposed. The proposed network integrates context-level features and a sentiment dictionary to fully utilize syntactic dependency and obtain contextual feature information. Firstly, a pre-trained Word2vec model is used to obtain sentence encodings. Secondly, a sentiment dictionary-based syntactic dependency analysis algorithm is introduced to identify aspect-level sentiment information accurately and quickly; through attention mechanisms, the aspect-level sentiment words are intrinsically interconnected, and the model is fine-tuned accordingly to optimize performance. Thirdly, a bidirectional Long Short-Term Memory (LSTM) network is employed to extract bidirectional features from the entire sentence; by incorporating attention mechanisms, the interconnections between words can be grasped, enabling a more comprehensive understanding of their relationships. Finally, the complete context sentence features and aspect-level sentiment features are concatenated so that the various feature parts influence each other and are mutually beneficial. The primary contribution of this article is the ASN-CSD framework, which integrates context-level features and a sentiment dictionary and effectively utilizes syntactic dependency to obtain contextual feature information for aspect-level sentiment analysis. The specific contributions of this research are outlined below.

  1. A sentiment dictionary is developed by incorporating the interdependence relationships among the words in a given sentence, and a network model (DDP-ASDN) is designed to accurately and efficiently identify the sentiment words in sentences along with their corresponding sentiment trends.
  2. A novel dual attention network for aspect-level sentiment analysis is presented. The network incorporates both the contextual information of the complete sentence and the aspect-level sentiment dyads of DDP-ASDN, achieving a local-to-holistic combination of information.
  3. The effectiveness of the model is assessed on two publicly available Chinese datasets and a Chinese dataset collected from the “U+ Wisdom Teaching Platform”. Experimental results demonstrate that the model achieves improved Accuracy, Efficiency, and Macro-F1 on different datasets. These results further highlight the importance of the dual attention framework in fusing context and sentiment aspects.

The remainder of the article is structured as follows: Section 1 introduces the most recent research on fine-grained aspect-level sentiment analysis. Section 2 presents the latest techniques for sentiment recognition and aspect-level sentiment recognition. Section 3 describes the architecture of the neural network proposed in this article. Section 4 presents the datasets and experimental results of this research. Finally, Section 5 summarizes the article’s key takeaways.

2 Related work

2.1 Sentiment classification

Sentiment classification based on artificial neural networks mainly endeavors to confirm the feature distance of sentiment in a short text by actively updating the weight relationships of the word vectors, and finally trains the optimal feature distance under the model [13]. Sentiment recognition is often accomplished through Convolutional Neural Networks (CNN) [14, 15], but when the data dependence is relatively high, CNN often does not perform well enough due to the lack of an internal dependence mechanism. Recurrent Neural Networks (RNN) [16, 17] can handle short-term dependence; however, vanishing and exploding gradients are often observed when dealing with long-term dependencies in text. Long Short-Term Memory (LSTM) networks and Bi-directional Long Short-Term Memory (BILSTM) networks [18, 19] can overcome the challenges of long short-term dependencies to a certain extent, but gradient vanishing still occurs when the text dependence length exceeds a certain limit. Attention mechanisms [20] can capture dependencies between local and global features and, compared with CNN and RNN, have few parameters and low model complexity; however, they cannot effectively learn positional information in the text.

Using a single model alone has certain disadvantages. Currently, the mainstream research direction is to stack and modify models, which can help overcome some of these issues. For example, Yang et al. [21] implemented text classification by weighting a modified RNN and combining it with attention. Similarly, Wang et al. [22] designed a BILSTM structure incorporating attention, which can give priority to various segments of the sentence. Dragoni et al. [23] used a neural word embedding approach combined with conventional network models to address the shortcomings of earlier techniques.

The combination of word embedding models and neural network models for sentiment recognition is a significant area of research [24]. For example, Liang et al. [25] employed a pre-trained Word2vec structure for word embedding and used a CNN to continuously learn sentence characteristics and classify sentiments with the aid of the attention mechanism. Meanwhile, Tang et al. [26] utilized a pre-trained Word2vec structure and employed an LSTM to acquire long short-term associations of the text, then integrated this with an attention mechanism to classify sentiment.

2.2 Aspect-level sentiment classification

2.2.1 Method based on sentiment dictionary.

Sentiment dictionary-based techniques utilize annotated sentiment trend words to classify the sentiment category. First introduced by Whissell et al. [27], early sentiment dictionaries were mainly available in English, such as SentiWordNet (http://sentiwordnet.isti.cnr.it/), NTUSD (http://academiasinicanlplab.github.io/), ANTUSD, and HowNet (http://openhownet.thunlp.org/download). Liang et al. [28] developed a “Chinese sentimental vocabulary ontology library” (http://ir.dlut.edu.cn/info/1013/1142.htm). Ding et al. [29] recommended a comprehensive sentiment analysis algorithm that utilizes contextual sentiment words and evaluates the distance between sentiment words and aspects. The method assigns different weights to sentimental words and advances the precision of aspect-level sentiment recognition. Xu et al. [30] combined an enriched sentiment dictionary encompassing a larger number of sentimental words with a redesigned dictionary-based sentiment recognition rule. The experimental results demonstrate enhanced accuracy of sentiment recognition using this method. Wu et al. [31] proposed constructing a slang sentiment dictionary using online resources, which can effectively support sentiment analysis of online resources.

To some extent, the accuracy of aspect-level sentiment recognition is effectively improved by sentiment dictionary-based methods. However, as the internet evolves, a growing number of new words emerge online; moreover, simple sentiment dictionaries ignore the contextual meaning, leading to a rapid decline in recognition accuracy.

2.2.2 Deep learning-based approach.

According to recent research, deep learning techniques surpass conventional machine learning methods in the ABSA (Aspect-Based Sentiment Analysis) classification task. For example, Xu et al. [32] recommended a novel algorithm that leverages gated CNNs to handle the correspondence between sentences and aspect words; specifically, they designed one Tanh-gated CNN and one ReLU-gated CNN to handle this task separately. Similarly, Wang et al. [33] proposed a hybrid model comprising RNNs and CRFs to learn features that effectively discriminate between classes and the bidirectional propagation information of aspect words and their corresponding opinion words. This framework outperforms other baseline frameworks on the SemEval2014 dataset.

LSTM is a classical model for aspect-level text sentiment analysis, as it can handle both long- and short-term dependencies. Tang et al. [34] recommended the TD-LSTM network, developed on the basis of target dependence, which divides the text into two parts centered on the aspect words and achieves good results by feeding one part into the LSTM network in forward order and the other in reverse order. Building on the TD-LSTM network, the TC-LSTM network was developed and demonstrated improved performance. Jelodar et al. [35] applied the LSTM network to analyze aspect words in the context of COVID-19, contributing to decision-making in COVID-19 prevention and treatment.

The introduction of attention mechanisms further improved the precision of aspect-level sentiment recognition. Wang et al. [22] recommended an LSTM model incorporating attention and aspect embedding, which uses the aspect to construct an attention vector and focuses on crucial segments of the text. This approach achieved good outcomes on the SemEval2014 dataset, demonstrating the efficacy of attention mechanisms in ABSA. Liu et al. [36] extended the attention model by distinguishing between the fine-grained left and right context and focusing on the contribution of each word to the aspect, resulting in significantly improved performance on the T-Dataset and Z-Dataset datasets. Yang et al. [37] recommended an alternating coattention network, which can alternately model aspect-level and context-level attention, enabling effective feature learning. This network outperformed traditional attention-based baseline models on the SemEval2014 dataset, providing a new research direction for aspect-level sentiment recognition. To address the ASTE problem, Peng et al. [38] proposed a two-step pipeline that predicts aspects and corresponding opinions in the text in the first step and matches the results with sentiment trends in the second step, producing triples (aspect word, corresponding opinion, corresponding sentiment trend). This approach promotes aspect-level sentiment classification. Graph convolutional neural networks (GCNs) [39] and BERT [40] models have exhibited remarkable performance in text and sentiment classification. For instance, Wang et al. [41] proposed the KGBGCN model, which effectively addresses the challenge of capturing key information in lengthy documents and overcomes the deficiency in classification accuracy due to the lack of domain-specific knowledge. Additionally, Xiao et al. [42] introduced the BERT4GCN model, which integrates the grammatical sequential features from BERT’s pre-trained language model (PLM) with syntactic knowledge derived from dependency graphs. This model has yielded exceptional results in aspect-based sentiment classification (ABSC) tasks. Liang et al. [43] proposed a new solution that constructs graphs using dependency trees and common-sense knowledge of sentiment. This solution utilizes graph convolutional networks to capture sentiment dependencies corresponding to specific aspects.

3 Framework construction

In this research, the ASN-CSD framework is constructed for aspect-level sentiment classification. The framework diagram is presented in Fig 1.

This research proposes an approach for aspect-level sentiment recognition consisting of six stages, outlined as follows:

  1. Step 1: The input text is encoded using a pre-trained Word2vec model. The text is divided into two categories: short text, which is directly encoded into word vectors, and long text, which first needs sentence identification and is then encoded into sentence-level vectors.
  2. Step 2: The framework uses a CNN to process the input text vector matrix and extract the main features. Two hidden layers are used to facilitate backpropagation for parameter updates. Additionally, the newly designed DDP-ASDN algorithm is employed to extract aspect-level sentiment binary groups based on the sentiment dictionary and internal aspect-level syntactic dependencies.
  3. Step 3: The main features extracted by the CNN are further processed by a BILSTM model to analyze sentences with bidirectional dependencies. Two hidden layers are used to facilitate backpropagation for parameter updates.
  4. Step 4: Position coding is added to the results of the previous step to obtain a more precise positional relationship between the whole sentence and its local parts.
  5. Step 5: Attention mechanisms are employed to attend to the relationship between the whole context level and the aspect-level, and to merge the two levels.
  6. Step 6: The outputs of the preceding stage are normalized, and the aspect-level sentiment recognition result is derived.

A case of aspect-level sentiment recognition is presented in Fig 2. This study uses the Chinese short text “I don’t like database, but the teacher is cute.” to demonstrate the syntactic dependency relationship between aspect-level sentiment words and their corresponding opinion words, construct sentiment binary tuples, and ultimately obtain sentiment ternary tuples with sentimental tendencies through analysis.

3.1 Short text and long text sentence recognition

To facilitate Natural Language Processing of short text T and summary-length long text LT in the context of online education, a method is proposed to convert them into word-level features. Specifically, short text T is identified as a single statement and converted into Tword, which consists of n words. LT, which is typically longer and contains multiple sentences, is divided into compartments based on punctuation marks such as commas and periods, and each compartment is then converted into a sentence-level feature LTsentence containing n words. Thus, LTword is composed of m such sentence-level features. This method provides a useful tool for analyzing and processing natural language text in online education. (1) (2)
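The short/long text handling described above can be sketched as follows. This is a minimal sketch: the exact punctuation set is an assumption based on the commas and periods mentioned in the text.

```python
import re

def segment_text(text):
    """Split a long text LT into sentence-level segments on punctuation,
    mirroring the LT -> {LTsentence_1, ..., LTsentence_m} decomposition in
    Section 3.1; a short text T with no delimiter comes back whole."""
    return [s for s in re.split(r"[，。！？,.!?]", text) if s.strip()]

print(segment_text("我不喜欢数据库，但是老师很可爱。"))
# ['我不喜欢数据库', '但是老师很可爱']
```

A short comment therefore yields one word-level unit, while a course summary yields m clause-level units for sentence-level encoding.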

3.2 Acquisition of embedding vector Word2vec

Word2vec is a widely used pre-trained model for mapping words to vectors and measuring the distance and relationship between them. In this study, the skip-gram method of Word2vec is utilized to generate embedding vectors for the word-level features Tword and LTword. Specifically, and are obtained by mapping each word in Tword and LTword to a z-dimensional vector using the trained Word2vec model.

By leveraging the semantic information captured in the Word2vec vectors, this method provides an effective approach for analyzing and processing natural language text in the context of online education. These embedding vectors can be used in a variety of applications, such as measuring similarity between words and identifying relevant concepts in the text. (3) (4)
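A minimal sketch of the embedding lookup follows; the random table here stands in for the pre-trained skip-gram vectors, which a real system would load from a trained Word2vec model rather than randomize.

```python
import numpy as np

# Toy vocabulary and a random z-dimensional table standing in for the
# pre-trained skip-gram Word2vec vectors (illustrative only).
z = 8
vocab = {"我": 0, "不": 1, "喜欢": 2, "数据库": 3, "老师": 4, "可爱": 5}
E = np.random.default_rng(0).normal(size=(len(vocab), z))

def embed(words):
    """Map a word-level feature sequence Tword to its (n, z) matrix."""
    return np.stack([E[vocab[w]] for w in words])

V = embed(["我", "不", "喜欢", "数据库"])
print(V.shape)  # (4, 8)
```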

3.3 Aspect-level DDP-ASDN model construction

Sentence dependency refers to the dependency between words within a sentence. The label, part of speech, and location information of words can be obtained through dependency analysis. Syntactic analysis of the input text yields pairs of associated words, including long-distance dependencies, making it one of the commonly used methods in sentiment analysis tasks. As an example, consider the sentence “I don’t like database, but the teacher is also cute.”; its dependency analysis is shown in Fig 3. The relationship types between words are mostly related to nouns, and the evaluation object, “database,” and its corresponding opinion, “don’t like,” can be extracted based on the nature of the words.
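Extracting such (aspect, opinion) pairs from dependency arcs can be sketched as below. The arcs and relation labels are hand-written stand-ins for real parser output (e.g. from a Chinese dependency parser such as LTP or HanLP), and the noun list is an illustrative assumption.

```python
# Hand-written dependency arcs (head, dependent, relation) for the running
# example; the relation names are illustrative, not a parser's exact labels.
arcs = [
    ("喜欢", "我", "subj"),      # I -> like
    ("喜欢", "不", "neg"),       # negation of "like"
    ("喜欢", "数据库", "obj"),   # database -> like
    ("可爱", "老师", "subj"),    # teacher -> cute
]
nouns = {"数据库", "老师"}       # assumed evaluation-object (noun) list

def extract_pairs(arcs, nouns):
    """Pair each noun-like dependent with its governing predicate,
    folding a negation arc into the opinion word."""
    negated = {head for head, dep, rel in arcs if rel == "neg"}
    pairs = []
    for head, dep, rel in arcs:
        if rel in ("subj", "obj") and dep in nouns:
            opinion = "不" + head if head in negated else head
            pairs.append((dep, opinion))
    return pairs

print(extract_pairs(arcs, nouns))
# [('数据库', '不喜欢'), ('老师', '可爱')]
```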

The sentiment dictionary collects commonly used sentimental words and their corresponding labels. The sentence dependency analysis model, augmented with the sentiment dictionary, can directly detect the correlations of sentimental words in text, and the processing of other unrelated words can be skipped. The framework consists of six layers. In the first layer, the incoming parameters Vword and LVword are used to calculate the position p of each word and its corresponding label L. The second layer is the sentiment dictionary layer, where Vword and LVword are compared with the pre-constructed sentiment dictionary. In the third layer, if Vword and LVword belong to the sentiment dictionary, a new sentiment evaluation object and its corresponding sentiment evaluation label are built. In the fourth layer, two fully connected operations are performed to obtain h. In the fifth layer, a normalization operation is performed using the softmax function. The sixth layer is the aspect layer, where a new aspect-level binary group is obtained by calculation. The architecture of the DDP-ASDN model is exhibited in Fig 4.

The processing process of the DDP-ASDN model is outlined in Algorithm 1:

Algorithm 1: DDP-ASDN

Input: input parameters Vword, LVword

Output: aspect-level binary group

1 some description;

2 for each sentence, until the last sentence do

3  Calculate the position p of the word and its corresponding label L;

4  P = {pw1, pw2, pw3, ⋯, pwn};

5  L = {lw1, lw2, lw3, ⋯, lwn};

6  Look up the sentiment dictionary;

7  Dic = {Wk1, Wk2, Wk3, ⋯, Wkz};

8  Dicl = {Wl1, Wl2, Wl3, ⋯, Wlz};

9  if Vword, LVword in Dic then

10   Construct a new sentiment evaluation object;

11    ;

12  Construct the corresponding sentiment evaluation labels;

13 ;

14  Enter the hidden layer for calculation;

15 ;

16  Enter the softmax for the calculation;

17 ;

18  Combined into a new aspect-level binary group;

19 ;

20end

21 end
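As a rough illustration, the dictionary-matching core of Algorithm 1 can be sketched in Python. The dictionary entries and the label logits below are placeholders, not the paper's constructed Dic/Dicl or trained hidden-layer weights.

```python
import math

# Placeholder sentiment dictionary with signed scores (illustrative only).
sentiment_dict = {"喜欢": 2.0, "可爱": 1.5, "讨厌": -2.0}

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def ddp_asdn(words):
    """Scan one sentence and emit aspect-level binary groups
    (sentiment word, evaluation label) for dictionary hits; the softmax
    plays the role of the fifth (normalization) layer."""
    groups = []
    for w in words:
        if w in sentiment_dict:                      # dictionary layer
            score = sentiment_dict[w]
            p_neg, p_pos = softmax([-score, score])  # placeholder logits
            groups.append((w, "positive" if p_pos > p_neg else "negative"))
    return groups

print(ddp_asdn(["我", "不", "喜欢", "数据库", "老师", "可爱"]))
# [('喜欢', 'positive'), ('可爱', 'positive')]
```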

3.4 CNN-BILSTM-ATTENTION model construction

The main features of the input vector are obtained through CNN, and are taken as the input for subsequent calculations to reduce the amount of computing resources required for the entire algorithm. Next, the input is passed through the BILSTM layer to collect the long short-term associations between words in the sentence. The following five steps can be used to describe the specific processing procedure.

Step 1: The eigenvalues and are computed by the CNN model’s convolution layers for the incoming Vword and LVword, respectively. The computation process is defined as follows: (5) (6)

ReLU is the activation function, refers to the weight, while corresponds to the bias term, and i is the index over the convolution kernels. Then, the convolution layer results and are passed into the max pooling layer Max() of the CNN model to calculate and . Finally, the results are passed through two fully connected layers, which are defined as follows: (7) (8)

Step 2: and are fed into the BILSTM layer separately to obtain and through forward propagation, followed by reverse propagation to obtain and . Finally, the forward and reverse propagation results of the short text are added to obtain , and the forward and reverse propagation results of the long text are added to obtain . which and represent hidden layer trainable parameters, and represent input layer trainable parameters, and represent bias, ⊕ represents the concatenation of vectors. (9) (10) (11) (12) (13) (14)
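The direction handling in Step 2 can be illustrated with a toy recurrence. This sketch uses a plain tanh RNN cell with random weights rather than full LSTM gating, so it only demonstrates the forward/backward passes and the ⊕ concatenation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 4, 8, 6                      # sequence length, input dim, hidden dim
X = rng.normal(size=(n, d))            # embedded sentence (e.g. Vword)
Wx = rng.normal(size=(d, h)) * 0.1     # input-to-hidden weights
Wh = rng.normal(size=(h, h)) * 0.1     # hidden-to-hidden weights

def rnn(seq):
    """Run the recurrence over one direction and return all hidden states."""
    s = np.zeros(h)
    states = []
    for x in seq:
        s = np.tanh(x @ Wx + s @ Wh)
        states.append(s)
    return np.stack(states)

h_fwd = rnn(X)                               # forward propagation
h_bwd = rnn(X[::-1])[::-1]                   # backward propagation, re-aligned
H = np.concatenate([h_fwd, h_bwd], axis=-1)  # ⊕ over the two directions
print(H.shape)  # (4, 12)
```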

Step 3: Position coding (Pe) is added after the BILSTM layer to effectively incorporate location information. First, the outputs of the CNN layer, and , and are fed into the position coding layer to obtain the corresponding word position information. The location information is encoded following the Transformer’s position coding method. The corresponding position coding is calculated as follows: (15)

The relative position of each word in the short text is denoted as , while the set of relative positions of each word in the long text is represented as . Similarly, the set of relative positions of each word at the aspect-level after the DDP-ASDN model is represented as , and z denotes the embedding dimension. The position coding of each word is combined into a new z-dimensional matrix using the sine and cosine functions. The combined output matrices of the short text, long text, and aspect-level are given as follows: (16)
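The sinusoidal scheme referenced above (sine on even embedding indices, cosine on odd ones, as in the Transformer) can be written as:

```python
import numpy as np

def positional_encoding(n, z):
    """Transformer-style sinusoidal position coding for n positions and
    embedding dimension z: sin on even indices, cos on odd indices."""
    pos = np.arange(n)[:, None]
    i = np.arange(z)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / z)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

Pe = positional_encoding(4, 8)
print(Pe.shape)   # (4, 8)
print(Pe[0, :2])  # position 0: sin terms are 0, cos terms are 1
```

The resulting matrix has the same shape as the embedded sequence, so it can be added element-wise to the BILSTM outputs in Step 4.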

Step 4: The results of position coding, , , and , are added to the results of the BILSTM layer, and , to obtain new encoded representations for the short text and the long text . (17)

Step 5: The encoded representations, , and the sentiment evaluation label are used as input to the normalization layer to obtain the new short text encoding hV, long text encoding hLV, and sentiment evaluation label encoding TDDP, which performs the following steps: (18)

The impact factors are then computed based on the normalized results, which are defined as follows. The symbol a represents the ATTENTION mechanism, which calculates the attention between different words. (19)

The attention weight distribution is obtained by calculating the influence factor, defined as follows: (20)

Then the weight factors are summed over, which can be defined as follows: (21)

Then the attention results of the two layers are summed to obtain the attention that contains the context and aspect-level relationships. This is defined as follows: (22)
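The influence-factor/weight/weighted-sum sequence described in Eqs (19)–(21) corresponds to standard dot-product attention, which can be sketched as follows; the query/key/value shapes here are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(query, keys, values):
    """Dot-product attention: influence factors, weight distribution,
    and weighted sum, mirroring Eqs (19)-(21)."""
    scores = keys @ query      # influence factor per word
    alpha = softmax(scores)    # attention weight distribution
    return alpha @ values      # weighted sum of value vectors

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))    # context-level hidden states
a = rng.normal(size=(8,))      # aspect-level query vector
ctx = attention(a, H, H)
print(ctx.shape)  # (8,)
```

Running one attention pass with a context-level query and one with an aspect-level query, then summing, yields the two-level fusion of Eq (22).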

3.5 Classification results

The attention vector undergoes normalization after being processed by a fully connected layer, and the classification label is obtained by taking the maximum value of the normalized result. Finally, the aspect-level triples are combined for each aspect sentiment recognition ensemble, where WAtt represents the trainable parameters of the dual attention layer. (23) (24)
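The classification step can be sketched as follows. The weight matrix W is a random stand-in for the trained parameters WAtt, and the three-way label set is an assumption.

```python
import math
import random

LABELS = ("negative", "neutral", "positive")  # assumed label set

def classify(att_vec, W):
    """Project the fused attention vector, normalize with softmax,
    and take the arg-max label, as in Eqs (23)-(24)."""
    logits = [sum(w * x for w, x in zip(row, att_vec)) for row in W]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return LABELS[probs.index(max(probs))], probs

rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]
att = [rng.gauss(0, 1) for _ in range(8)]
label, probs = classify(att, W)
print(label, round(sum(probs), 6))  # probabilities sum to 1
```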

4 Experiment

4.1 Datasets

Two publicly available datasets and one dataset collected by our team were selected for this paper. Detailed descriptions of the datasets are provided below.

  1. The Fine-grained Automotive Comment Standard Dataset (dataset 1) comprises 56,920 comments collected from various automotive forums. User comments were annotated with labels such as manufacturer, brand, model, attribute, descriptive value, tendency, and more. This dataset can be downloaded from: https://www.datatang.com/dataset/info/text/135
  2. The Tan Songbo Hotel Review Corpus (dataset 2) consists of 10,000 articles automatically collected and collated from Ctrip, comprising 7,000 positive and 3,000 negative sentiment samples. This dataset can be downloaded from: https://pan.baidu.com/s/1TrumHVMk-Kc4PJz8INMYbg
  3. The U+ Wisdom Teaching Platform Dataset (dataset 3) was extracted from the U+ New Engineering Wisdom Cloud Platform. The dataset includes course evaluation data and experimental summaries from 50 universities, comprising 9,000 long texts and 12,000 short texts. The dataset contains two sentiment categories, positive and negative, which can be further decomposed into fine-grained sentiments, totaling 30,000. This dataset can be downloaded from: https://www.eec-cn.com

All experiments were conducted on four Nvidia V100 GPUs, and the datasets were split into a ratio of 6:1:3 for training, validation, and testing sets.
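The 6:1:3 split can be reproduced with a few lines of Python:

```python
import random

def split_dataset(samples, ratios=(6, 1, 3), seed=42):
    """Shuffle and split samples into train/validation/test at 6:1:3."""
    random.Random(seed).shuffle(samples)
    total = sum(ratios)
    n = len(samples)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

train, val, test = split_dataset(list(range(1000)))
print(len(train), len(val), len(test))  # 600 100 300
```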

4.2 Baseline methods

In order to assess the efficacy of the recommended techniques, a comparison is made with the ten most advanced methods as described below:

ATAE-LSTM [44]: The sentences are processed based on the LSTM model, weighted and aggregated by combining the association of context and aspect terms.

BiGCN [45]: A graph convolutional network with two levels of interaction is devised based on dependency tree and word co-occurrence relationships to fully learn the node representation.

TNet [46]: The sentence feature representation is encoded using BILSTM and undergoes continuous aspect-level context encoding and attention mechanism, and the final feature representation is extracted using CNN.

DualGCN [47]: A dual-channel structure in which SynGCN, based on the dependency parsing probability matrix, and SemGCN, based on the attention mechanism, integrate grammar knowledge and semantic information. Orthogonal regularization and differential regularization in SemGCN help the framework grasp semantic associations distinct from the grammatical structure.

MemNet [11]: Context sentences are treated as external memory, and a multi-hop attention mechanism is applied to the word vector representations of the context. The final aspect representation is derived from the output of the last hop.

DM-GCN [48]: Syntactic graph and semantic graph are constructed based on dependent tree and multi-head self-attention mechanism respectively. Information is extracted through syntactic graph convolution (Syntax GCN) and semantic graph convolution (Semantic GCN) respectively. The Common GCN module is utilized with a parameter sharing strategy to acquire shared information from the two spaces. The information extracted from the three channels is fused and used for the classification task.

ASGCN [49]: Feature representations of sentences are acquired using BILSTM. Aspect-level context representations are learned through dependency tree-based GCN, and aggregated context representations are used for classification using the attention mechanism.

GTS [50]: A grid tagging scheme is designed to solve the opinion triplet extraction problem in an end-to-end system. The adopted inference strategy makes full use of the mutual indication between different opinion elements.

IMN+IOG [51]: The interactive network IMN is used to extract explicit target aspects in the sentence. Then, the IOG framework is used to derive the associated opinion words to generate the final triples.

IAN [52]: Two LSTM models are used to encode the context and the aspect separately. An interactive attention mechanism is then employed to model the relationships between them.

BERT [53]: The pre-trained model is fine-tuned on inputs constructed with the special [CLS] and [SEP] tokens.

RoBERTa [54]: RoBERTa builds on BERT's masked-language-modeling strategy and modifies key hyperparameters, including removing the next-sentence-prediction objective and training with larger batch sizes and learning rates. It is relatively friendly to Chinese text classification.

dotGCN [55]: This model introduces a discrete latent tree as a substitute for the traditional dependency tree. The new structure is language-independent and specifically tailored to the particular aspect.

4.3 Evaluation criteria

The evaluation metrics employed in this article are Accuracy (Acc), Precision (P), Recall (R), and Macro-F1 (F1), defined as follows [56]:

$$\mathrm{Acc}=\frac{TP+TN}{TP+TN+FP+FN},\qquad P=\frac{TP}{TP+FP},\qquad R=\frac{TP}{TP+FN},\qquad F1=\frac{2PR}{P+R} \tag{25}$$

Macro-F1 is the unweighted mean of the per-class F1 scores.

TP stands for the true positive class: an instance that is positive and is predicted as positive. The true negative class is denoted by TN: an instance that is negative and is predicted as negative. FP represents the false positive class: an instance that is actually negative but is predicted as positive. The false negative class is denoted by FN: an instance that is actually positive but is falsely predicted as negative.
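A minimal sketch of how these four metrics follow from the confusion-matrix counts (the function names are our own; Macro-F1 is shown for the binary case by averaging the F1 of the positive and negative classes):

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 for the positive class
    of a binary confusion matrix."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return acc, p, r, f1

def macro_f1(tp, tn, fp, fn):
    """Binary Macro-F1: average the F1 of the positive class and of
    the negative class (where the roles of TP/TN and FP/FN swap)."""
    _, _, _, f1_pos = metrics(tp, tn, fp, fn)
    _, _, _, f1_neg = metrics(tn, tp, fn, fp)
    return (f1_pos + f1_neg) / 2

acc, p, r, f1 = metrics(tp=80, tn=90, fp=10, fn=20)
# acc = 0.85, r = 0.8, p = 80/90 ≈ 0.889
```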

4.4 Experimental environment

In this paper, the deep learning model is constructed based on Word2vec, CNN, BILSTM, POS (position coding), DDP-ASDN, ATTENTION, and aspect-level sentiment dictionaries. The parameter configurations are presented in Table 1:

4.5 Experimental results

In this paper, the performance of the recommended ASN-CSD framework and the baseline models is compared on three datasets to evaluate ABSA. The results are presented in Table 2 for the fine-grained automotive comment standard dataset, Table 3 for the Tan Songbo-Hotel review corpus, and Table 4 for the U+ wisdom teaching platform dataset. The experiments demonstrate that the ASN-CSD model performs best on all three datasets, yielding the highest Accuracy, Precision, Recall, and Macro-F1 together with the lowest computational time. The proposed model outperforms the suboptimal models by approximately 0.1000 on all evaluation indices. The superior performance of the recommended framework can be attributed to its dual attention mechanism over context and aspect-level sentiment dependence, which enables better identification of aspect-level sentiment. Furthermore, the proposed model exhibits excellent computational efficiency, running approximately 100.0 seconds faster than most models. This is due to the use of a sentiment dictionary to pre-identify the dependency relationships between internal words, thus reducing computation time.

Table 2. Classification results on dataset fine-grained automotive comment standard dataset (Optimal results are bold).

https://doi.org/10.1371/journal.pone.0295331.t002

Table 3. Classification results on Tan Songbo-Hotel review corpus (Optimal results are bold).

https://doi.org/10.1371/journal.pone.0295331.t003

Table 4. Classification results on the U + Wisdom teaching platform dataset (Optimal results are bold).

https://doi.org/10.1371/journal.pone.0295331.t004

After conducting further analysis of the experimental outcomes, a noticeable observation is that most of the models performed better on the two public datasets than on the U+ wisdom teaching platform dataset. This can be attributed to the highly specialized nature of the data in the domestic education field and the limited coverage of the sentiment dictionary. Moreover, the sentiment evaluation summaries written by students can contain ambiguous expressions that fail to convey their true feelings. Additionally, the performance of most baseline models is inferior on the Chinese datasets compared to their English counterparts. However, the ASN-CSD model effectively handles Chinese data and performs well in specialized fields, indicating its strong generalization ability and potential for future promotion. Furthermore, models that use sentiment dictionaries and dependency information tend to outperform the others. This underscores the advantages of combining aspect-level sentiment dictionaries, dependency relationships, and contextual mechanisms for the ABSA task.

4.6 Ablation experiment

In this section, five ablated comparison models are created, alongside the complete model, to further verify the effectiveness of each module in the ASN-CSD model. They are described as follows:

  1. ASN-CSD∼CBA: The CNN-BILSTM-ATTENTION module is removed, leaving only the DDP-ASDN and position coding modules.
  2. ASN-CSD∼POS: The position coding module is removed, retaining the CNN-BILSTM-ATTENTION and DDP-ASDN modules.
  3. ASN-CSD∼ATT: The dual attention mechanism is removed, retaining the CNN-BILSTM module, position coding, and DDP-ASDN modules.
  4. ASN-CSD∼ASD: The aspect sentiment dictionary is removed, retaining the CNN-BILSTM-ATTENTION module, position coding, and syntactic dependency.
  5. ASN-CSD∼DDP: The syntax-dependent DDP is removed, retaining the CNN-BILSTM-ATTENTION module, position coding, and sentiment dictionary.
  6. ASN-CSD: The complete model presented in this paper, which retains the CNN-BILSTM-ATTENTION module, position coding, and DDP-ASDN module. These models are designed to compare and evaluate the performance of each module, allowing for a better understanding of the contribution of each module to the overall performance of the model.
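The ablation variants above amount to toggling one component of the full model at a time, which can be expressed schematically as follows (the module names are hypothetical shorthand for the components described above, not identifiers from the authors' code):

```python
# Shorthand for the five components of the full ASN-CSD model:
# CBA = CNN-BILSTM-ATTENTION, POS = position coding,
# ATT = dual attention, ASD = aspect sentiment dictionary,
# DDP = syntactic dependency.
FULL = {"CBA", "POS", "ATT", "ASD", "DDP"}

# Each ablated variant removes exactly one component from the full set.
ABLATIONS = {f"ASN-CSD~{m}": FULL - {m} for m in sorted(FULL)}
ABLATIONS["ASN-CSD"] = FULL  # the complete model

for name, modules in sorted(ABLATIONS.items()):
    removed = sorted(FULL - modules)
    print(f"{name}: removes {removed if removed else 'nothing'}")
```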

The performance of the five ablated models on the three datasets is presented in Figs 5–7. It is observed that removing the syntactic dependency analysis or the sentiment dictionary leads to a significant decline in performance, with ASN-CSD∼DDP showing the most significant decrease and the average accuracy decreasing to below 0.2768. This highlights the crucial role of syntactic dependency analysis and the sentiment dictionary in aspect-level sentiment recognition. The sentiment dictionary identifies sentiment words, while syntactic dependency quickly identifies the sentiment tendency of these words, demonstrating the effectiveness of the DDP-ASDN layer presented in this paper.

Fig 5. Classification results in fine-grained automotive comment standard dataset after removal of partial modules.

https://doi.org/10.1371/journal.pone.0295331.g005

Fig 6. Classification results in Tan Songbo-Hotel review corpus after removal of some modules.

https://doi.org/10.1371/journal.pone.0295331.g006

Fig 7. Classification results in U + Wisdom teaching platform dataset after removal of some modules.

https://doi.org/10.1371/journal.pone.0295331.g007

The performance of the ASN-CSD∼POS and ASN-CSD∼ATT models on the three datasets reveals a decrease after the position coding or the attention mechanism is removed. On dataset 3, the minimum drop in precision is only 0.0064, which can be attributed to the BILSTM and syntactic dependency models already capturing part of the relationships between words within the sentence.

The ASN-CSD∼CBA model shows that aspect-level sentiment classification can still be performed when all contextual association features are removed and only the DDP-ASDN model is retained. However, this results in a performance decrease of around 0.1000, mainly due to the lack of context-level feature vectors.

When further examining the ASN-CSD∼CBA and ASN-CSD∼ASD models on datasets 1 and 2, it is observed that ASN-CSD∼ASD performed better than ASN-CSD∼CBA. This result highlights the importance of the sentiment dictionary for aspect-level sentiment recognition, which proves more crucial than the contextual correlation features. However, on dataset 3, the ASN-CSD∼ASD model performed less effectively than the ASN-CSD∼CBA model, possibly because dataset 3 uses more professional vocabulary, leading to lower coverage by the sentiment dictionary. This result further emphasizes the significance of the proposed sentiment dictionary in aspect-level sentiment recognition.

The worst performance on the three datasets is exhibited by the ASN-CSD∼DDP model, highlighting the importance of syntactic dependency analysis in aspect-level sentiment recognition. Without dependency analysis, the direct sentiment relationships within sentences cannot be obtained, leading to poorer aspect-level sentiment recognition. This underscores the necessity of the syntactic-dependency design adopted in this paper.

Overall, the complete ASN-CSD framework achieved the best performance for aspect-level sentiment recognition. The ablation experiments strongly demonstrate the necessity and effectiveness of each module in the ASN-CSD.

5 Conclusion

The ASN-CSD framework for aspect-level sentiment classification in Chinese short and long texts is presented in this article. The input text is encoded into word vectors using the pre-trained Word2vec model, and the aspect-level sentiment tuples are extracted by the newly designed DDP-ASDN model, which relies on a sentiment dictionary and internal aspect-level syntax. The main feature values of a sentence are extracted using CNN-BILSTM-ATTENTION, taking into account the long- and short-term dependencies and context-level feature relationships of the sentence. Additionally, position coding is calculated to facilitate a more accurate acquisition of the aspect-level characteristics, and dual attention is applied at the context and aspect levels. The experiments conducted on three datasets have demonstrated the superiority of the model and the importance of each component.

However, the model’s performance can be further improved by establishing a professional Chinese sentiment dictionary and a more accurate recognition algorithm for implicit aspect-level words.

References

  1. Wen Zhi-xiao, Liang Zhi-jian. Fine-grained sentiment analysis model with both word-level and semantic-level attention. Journal of North University of China (Natural Science Edition), 2022, 43(5):431–440.
  2. Zhang Yan, Li Tian-rui. Review of comment-oriented aspect-based sentiment analysis. Computer Science, 2020, 47(6):194–200.
  3. Pontiki M, Galanis D, Papageorgiou H, et al. SemEval-2016 task 5: aspect based sentiment analysis. International Workshop on Semantic Evaluation, 2016: 19-30.
  4. XU L, CHIA Y K, BING L D. Learning span-level interactions for aspect sentiment triplet extraction. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 4755-4766.
  5. Xin W, Liu Y, Sun C, et al. Predicting polarities of tweets by composing word embeddings with long short-term memory. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, July 26-31, 2015. Stroudsburg: ACL, 2015: 1343-1353.
  6. Huang B, Carley K. Parameterized convolutional neural networks for aspect level sentiment classification. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31-November 4, 2018. Stroudsburg: ACL, 2018: 1091-1096.
  7. VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need. Advances in Neural Information Processing Systems, 2017, 30.
  8. ZHANG W, LI X, DENG Y, et al. A survey on aspect-based sentiment analysis: tasks, methods, and challenges. arXiv preprint arXiv:2203.01054, 2022.
  9. Zhang Z, Zhou Z L, Wang Y N. SSEGCN: syntactic and semantic enhanced graph convolutional network for aspect-based sentiment analysis. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 4916-4925.
  10. Tang H, Yin Q, Chang L, et al. Enhancing aspect-based sentiment classification with local semantic information. Springer, Singapore, 2022: 118–131.
  11. PANG S, XUE Y, YAN Z, et al. Dynamic and multi-channel graph convolutional networks for aspect-based sentiment analysis. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021: 2627-2636.
  12. ZHONG Q, DING L, LIU J, et al. Knowledge graph augmented network towards multiview representation learning for aspect-based sentiment analysis. arXiv preprint arXiv:2201.04831, 2022.
  13. Liu Xin, Qi Rui-hua, Xu Lin-hong, et al. Sentiment analysis of Russian tweets with multi-level features. Journal of Chinese Computer Systems, 2021, 42(6):1176–1183.
  14. Kim H, Jeong Y S. Sentiment classification using convolutional neural networks. Applied Sciences-Basel, 2019, 9(11):1–14.
  15. Zhang Y, Zhang Z, Miao D, et al. Three-way enhanced convolutional neural networks for sentence-level sentiment classification. Information Sciences, 2019, 477:55–64.
  16. Choi G, Oh S, Kim H. Improving document-level sentiment classification using importance of sentences. Entropy, 2020, 22(12):1–11. pmid:33266520
  17. Wang Gen-sheng, Huang Xue-jian, Min Lu. GRU neural network text emotion classification model based on multi-feature fusion. Journal of Chinese Computer Systems, 2019, 40(10):2130–2138.
  18. Du Yong-ping, Zhao Xiao-zheng, Pei Bing-bing. Short text sentiment classification based on CNN-LSTM model. Journal of Chinese Computer Systems, 2019, 40(10):2130–2138.
  19. Zhu Y L, Zheng W B, Tang H. Interactive dual attention network for text sentiment classification. Computational Intelligence and Neuroscience, 2020, 2020(3):1–11.
  20. LI Z, GUO Q, FENG C, et al. Multimodal sentiment analysis based on interactive transformer and soft mapping. Wireless Communications and Mobile Computing, 2022, 12(6):561–572.
  21. YANG Z, YANG D, DYER C, et al. Hierarchical attention networks for document classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016: 1480-1489.
  22. WANG Y, HUANG M, ZHU X, et al. Attention-based LSTM for aspect-level sentiment classification. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2016: 606-615.
  23. Dragoni M, Petrucci G. A neural word embeddings approach for multidomain sentiment analysis. IEEE Trans. Affect. Comput., 2017, 8(4):457–470.
  24. Jameel S, Bouraoui Z, Schockaert S. Unsupervised learning of distributional relation vectors. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, 2018: 23–33.
  25. Liang B, Liu Q, Xu J, Zhou Q, Zhang P. Aspect-based sentiment analysis based on multi-attention CNN. Comput. Res. Development. Chin., 2017, 54(8):1724–1735.
  26. Tang D, Qin B, Feng X, Liu T. Effective LSTMs for target-dependent sentiment classification. Proc. COLING 26th Int. Conf. Comput. Linguistics, Tech. Papers, 2016: 3298–3307.
  27. RAO Y, LEI J, WENYIN L, et al. Building emotional dictionary for sentiment analysis of online news. World Wide Web, 2014, 17(4):723–742.
  28. LIANG Y, LIN H F. Construction and application of Chinese emotional corpus. Proceedings of the 13th Chinese Conference on Chinese Lexical Semantics. Berlin: Springer, 2012: 122-133.
  29. DING X W, LIU B, YU P S. A holistic lexicon-based approach to opinion mining. Proceedings of the 2008 International Conference on Web Search and Web Data Mining. New York: ACM, 2008: 231-240.
  30. XU G, YU Z, YAO H, et al. Chinese text sentiment analysis based on extended sentiment dictionary. IEEE Access, 2019, 7:43749–43762.
  31. WU L, MORSTATTER F, LIU H. SlangSD: building and using a sentiment dictionary of slang words for short-text sentiment classification. Language Resources & Evaluation, 2016(6):1–14.
  32. XUE W, LI T. Aspect based sentiment analysis with gated convolutional networks. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2018: 2514-2523.
  33. LIANG Y, MENG F, ZHANG J, et al. An iterative knowledge transfer network with routing for aspect-based sentiment analysis. arXiv preprint arXiv:2004.01935, 2020.
  34. TANG D, QIN B, FENG X, et al. Effective LSTMs for target-dependent sentiment classification. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2016: 3298-3307.
  35. JELODAR H, WANG Y L, ORJI R, et al. Deep sentiment classification and topic discovery on novel coronavirus or COVID-19 online discussions: NLP using LSTM recurrent neural network approach. IEEE Journal of Biomedical and Health Informatics, 2020, 24(10):2733–2742. pmid:32750931
  36. LIU J, ZHANG Y. Attention modeling for targeted sentiment. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2017: 572-577.
  37. YANG C, ZHANG H, JIANG B, et al. Aspect-based sentiment analysis with alternating coattention networks. Information Processing and Management, 2019, 56(3):463–478.
  38. Peng H, Xu L, Bing L, et al. Knowing what, how and why: a near complete solution for aspect-based sentiment analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(5):8600-8607.
  39. Zhang Shunxiang, Zhao Tong, Wu Houyue, Zhu Guangli, Li KuanChing. TS-GCN: aspect-level sentiment classification model for consumer reviews. Computer Science and Information Systems, 2023, 20(1):117–136.
  40. Kai Zhang, Kun Zhang, Mengdi Zhang, Hongke Zhao, Qi Liu, Wei Wu, et al. Incorporating dynamic semantics into pre-trained language model for aspect-based sentiment analysis. arXiv preprint arXiv:2203.16369.
  41. Wang Yifei, Wang Yongwei, Hu Hao, Zhou Shengnan, Wang Qinwu. Knowledge-graph- and GCN-based domain Chinese long text classification method. Applied Sciences, 2023, 13(13):7915.
  42. Xiao Zeguan, Wu Jiarun, Chen Qingliang, Deng Congjian. BERT4GCN: using BERT intermediate layers to augment GCN for aspect-based sentiment classification. arXiv preprint arXiv:2110.00171.
  43. Liang Bin, Su Hang, Gui Lin, Cambria Erik, Xu Ruifeng. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowledge-Based Systems, 2022, 235:107643.
  44. WANG Y, HUANG M, ZHU X, et al. Attention-based LSTM for aspect-level sentiment classification. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016.
  45. ZHANG M, QIAN T. Convolution over hierarchical syntactic and lexical graphs for aspect level sentiment analysis. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 3540-3549.
  46. Li X, Bing L, Lam W, et al. Transformation networks for target-oriented sentiment classification. arXiv preprint arXiv:1805.01086, 2018.
  47. Ruifan Li, Hao Chen, Fangxiang Feng, Zhanyu Ma, Xiaojie Wang, Eduard Hovy. Dual graph convolutional networks for aspect-based sentiment analysis. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021: 6319–6329.
  48. DONG L, WEI F, TAN C, et al. Adaptive recursive neural network for target-dependent Twitter sentiment classification. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2014: 49-54.
  49. ZHANG C, LI Q C, SONG D W. Aspect-based sentiment classification with aspect-specific graph convolutional networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Hong Kong: Association for Computational Linguistics, 2019: 4568-4578.
  50. WU Z, YING C C, ZHAO F, et al. Grid tagging scheme for aspect-oriented fine-grained opinion extraction. Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA: ACL, 2020: 2576-2585.
  51. LI X, BING L D, LI P J, et al. A unified model for opinion target extraction and target sentiment prediction. Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, 2019: 6714–6721.
  52. Ma D, Li S, Zhang X, et al. Interactive attention networks for aspect-level sentiment classification. arXiv preprint arXiv:1709.00893, 2017.
  53. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  54. Liu Yinhan, Ott Myle, Goyal Naman, Du Jingfei, Joshi Mandar, Chen Danqi, et al. RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  55. Chen Chenhua, Teng Zhiyang, Wang Zhongqing, Zhang Yue. Discrete opinion tree induction for aspect-based sentiment analysis. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022.
  56. Sokolova M, Japkowicz N, Szpakowicz S. Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. Proc. Australas. Joint Conf. Artif. Intell., 2006: 1015–1021.