
Syntactic denoising and multi-strategy auxiliary enhancement for aspect-based sentiment analysis

  • Lu Liu,

    Roles Conceptualization, Formal analysis, Methodology, Software, Writing – original draft, Writing – review & editing

    Affiliation Institute of Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China

  • Da Li,

    Roles Investigation, Supervision, Validation

    Affiliation Institute of Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China

  • Chuanxu Yue,

    Roles Formal analysis, Visualization, Writing – review & editing

    Affiliation Institute of Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China

  • Xiaojin Gao,

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Writing – review & editing

    Affiliation Science and Technology Service Platform, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China

  • Yunhai Zhu

    Roles Data curation, Funding acquisition, Project administration, Validation, Writing – review & editing

    zhuyh@sdas.org

    Affiliation Science and Technology Service Platform, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China

Abstract

Aspect-based sentiment analysis (ABSA) aims to identify the sentiment polarity associated with specific aspect terms within sentences. Existing studies have primarily focused on constructing graphs from dependency trees of sentences to extract syntactic features. However, given that public datasets are often derived from online reviews, the syntactic structures of these sentences frequently exhibit irregularities. As a result, the performance of syntactic-based Graph Convolutional Network (GCN) models is adversely impacted by the noise introduced during dependency parsing. Moreover, the interaction between syntactic and semantic information in these approaches is often insufficient, which significantly impairs the model’s ability to accurately detect sentiment. To address these challenges, we propose a novel approach called Syntactic Denoising with Multi-strategy Auxiliary Enhancement (SDMAE) for the ABSA task. Specifically, we prune the original dependency tree by focusing on context words with specific part-of-speech features that are critical for conveying the sentiment of aspect terms, and then construct the graph. We introduce a Multi-channel Adaptive Aggregation Module (MAAM), a feature aggregation system that employs a multi-head attention mechanism to integrate semantic and syntactic GCN output representations. Furthermore, we design a multi-strategy task learning framework that incorporates sentiment lexicons and supervised contrastive learning to enhance the model’s performance in aspect sentiment recognition. Comprehensive experiments conducted on four benchmark datasets demonstrate that our approach achieves significant performance improvements compared to several state-of-the-art methods across all evaluated datasets.

Introduction

Sentiment analysis, as a pivotal research area in Natural Language Processing (NLP), has garnered considerable attention from scholars both domestically and internationally. The rapid proliferation of the internet has led to the emergence of various platforms, such as e-commerce [1,2], social forums [3–5], and online healthcare [6,7]. As a result, online comments have surged dramatically, often containing subjective evaluations of products by consumers or opinions on major social events and government decisions. Consequently, there is an increasing demand within the industry to extract intricate sentiment patterns from this data. Such capabilities would enable management in various sectors, including e-commerce, social forums, and governmental organizations, to enhance and adjust their products and policies accordingly.

Traditional sentiment analysis tasks primarily focus on discerning coarse-grained sentiment polarity at the sentence level. However, these tasks face considerable challenges when addressing the complexity and variability of online reviews, which has spurred the advancement of the Aspect-based Sentiment Analysis (ABSA) task. ABSA aims to determine the sentiment polarity associated with specific aspect terms within a sentence. Consider, for instance, the example illustrated in Fig 1.

Fig 1. The demonstration of ABSA.

The sentence “Apple’s products look great, except that the charging speed is a little slow.” has two aspects: products and speed, with their sentiment polarities being positive and negative, respectively.

https://doi.org/10.1371/journal.pone.0329018.g001

The sentence “Apple’s products look great, except that the charging speed is a little slow” contains two aspect terms: “products,” which is associated with a positive sentiment polarity, and “speed,” which is linked to a negative sentiment polarity. Traditional sentiment analysis techniques struggle with sentences that express multiple, conflicting sentiments. In contrast, the ABSA task requires models to identify the sentiment orientations associated with particular entities in online consumer reviews, rather than delivering a generalized sentiment assessment of the entire sentence.

Initial methodologies for ABSA were labor-intensive, including lexicon-based, rule-based, and machine learning-based approaches. The effectiveness of these techniques heavily relied on the quality of feature engineering, which limited their generalization and transferability. The advent of deep learning has led to the development of sequence-based neural network models, such as Recurrent Neural Networks (RNNs) and their variants, including Long Short-Term Memory (LSTM) networks. These models enable the automated extraction of semantic information from textual data. Researchers have increasingly employed these models to autonomously extract semantic information related to both context and aspect terms. To facilitate the automatic identification of key word features within textual data, attention-based approaches have been extensively explored. The development of various attention mechanisms specifically tailored for ABSA has led to significant improvements in its performance.

Inspired by feature extraction techniques in computer vision, researchers have begun applying Convolutional Neural Networks (CNNs) to the ABSA task. This approach involves extracting various local features from textual data using sliding convolutional kernel windows. In 2017, Zhang et al. [8] developed an undirected graph derived from the dependency tree of sentences and were the first to apply Graph Convolutional Networks (GCNs) to model contextual dependencies, yielding notable results. This pioneering work has driven a significant increase in the use of GCNs by researchers in recent years to capture syntactic information in ABSA. Recent studies have also integrated external knowledge bases, including sentiment lexicons and knowledge graphs, into ABSA models to incorporate commonsense knowledge. These approaches [9–15] adjust weights within the syntactic dependency graph or establish connections between entities, enabling models to prioritize significant opinion words and their relationships with entities.

It is important to note that, although these methods have demonstrated notable enhancements in performance, several challenges persist that have yet to be adequately addressed:

  1. There is a considerable presence of noisy information within the dependency trees of informal online reviews.
  2. The significance of opinion words with sentiment tendencies is frequently neglected in the analysis of semantic information.
  3. The interplay between semantic and syntactic information in the text is not sufficiently explored.

To address the challenges in the ABSA task outlined above, we propose SDMAE, which integrates syntactic denoising with a multi-strategy auxiliary enhancement. Specifically, to mitigate noisy information within dependency trees derived from online reviews, we introduce a syntactic denoising pruning strategy. This methodology eliminates extraneous word dependencies by incorporating part-of-speech and positional distance attributes, thereby constructing a syntactic denoising graph. Subsequently, we employ a pre-trained BERT model to encode the text and create a semantic graph through multi-head attention operations applied to the textual semantic features. Graph Convolutional Networks (GCNs) are then utilized to extract features from both the semantic graph and the syntactic denoising graph. The semantic and syntactic features are subsequently integrated via the Multi-channel Adaptive Aggregation Module (MAAM). To effectively capture sentiment information conveyed by opinion words in context, we propose a sentiment refinement strategy. Using the SenticNet 8 sentiment lexicon, we generate a sentiment vector for each text in the dataset and compute the mean squared error loss between this vector and the output produced by the adaptive aggregation module. This strategy enhances the model’s ability to identify and incorporate key sentiment indicators. Additionally, to improve the model’s capacity to distinguish between different sample classes, we incorporate supervised contrastive learning as an auxiliary strategy during training.

To summarize, the main contributions of our work are as follows:

  • We introduce SDMAE, an innovative approach to ABSA that integrates syntactic denoising with a multi-strategy auxiliary enhancement framework. This method markedly improves the performance of ABSA.
  • We propose an aspect-oriented syntactic denoising algorithm that efficiently eliminates extraneous noise information from the dependency tree. This is achieved by combining words’ part-of-speech tags with their non-linear positional attributes relative to aspect words, thereby facilitating the pruning of the dependency tree.
  • We design an affective refinement strategy module that employs a sentiment lexicon to aid the model in recognizing significant sentiment indicators within the contextual framework. The integration of this module with supervised contrastive learning techniques enhances the overall training process of the model.
  • Extensive experiments on four benchmark datasets (REST14, REST15, REST16, and Twitter) demonstrate the effectiveness of our proposed methodology in the ABSA task.

Related works

Sentiment analysis can be classified into two categories based on the granularity of sentiment entities: coarse-grained and fine-grained analysis. Coarse-grained sentiment analysis includes both document-level and sentence-level sentiment evaluations. In contrast, ABSA primarily focuses on sentiment entities at the word level, which presents greater challenges compared to document-level and sentence-level sentiment analysis. As ABSA research gains momentum, traditional approaches that rely on manual feature construction are increasingly being phased out due to their limited generalization across domains and the substantial labor costs involved. The advent of deep learning has paved the way for new methodologies and approaches in the ABSA task. In this section, we provide an overview of the prevailing deep learning-based methodologies for ABSA, including neural network-based approaches, GCN-based models, and methods that incorporate external knowledge.

Neural network-based methods

Sequential models are commonly employed for analyzing sequential data, such as text and speech. Unlike conventional methods that rely on manually designed features, neural network-based approaches demonstrate strong generalization abilities and have achieved remarkable performance. For instance, Tang et al. [16] utilize two LSTM networks to capture information from both sides of the aspect, combining this information to form the final representation for sentiment classification. However, methods relying solely on sequence models struggle to capture the importance of contextual word information. Since attention mechanisms can prioritize critical components of sentences, researchers have explored integrating sequence models with attention mechanisms. Wang et al. [17] combine aspect embeddings with sentence representations, merging the attention mechanism with LSTM networks to extract significant contextual semantic features. Huang et al. [18] introduce the concept of attention over-attention, derived from machine translation, to simultaneously model context and aspect features using LSTM networks. Ma et al. [19] leverage LSTM to obtain representations of both context and aspects, employing an interactive attention network to extract contextual information that significantly influences aspect sentiment polarity. Chen et al. [20] use Bidirectional Long Short-Term Memory (BiLSTM) networks to model contextual information and implement multiple attention mechanisms to effectively capture long-range context dependencies. Fan et al. [21] design a multi-granularity attention mechanism that combines coarse-grained and fine-grained attention to enhance model performance. Zhu et al. [22] propose a CNN for phrase extraction and introduce a cross-correlation attention mechanism, which allocates weights to phrases based on words in the context and adjusts weights for individual words in the context according to the phrases. The integration of attention mechanisms has resulted in significant improvements in ABSA task performance, enhancing the model’s ability to capture relevant semantic features from both context and aspect terms.

GCN-based methods

Despite notable advancements in performance from the integration of attention mechanisms, approaches focusing solely on semantic aspects continue to face challenges in effectively capturing long-distance syntactic dependencies within the context. Graph Convolutional Networks (GCN) have been widely adopted for the ABSA task due to their ability to model long-distance dependencies in the dependency tree. Zhang et al. [8] convert the dependency tree into an adjacency graph and apply GCN to model this graph, yielding excellent results. To provide the model with additional contextual information, Zhang et al. [23] propose incorporating conceptual hierarchies of syntax and lexicon, thereby constructing hierarchical syntactic and lexical graphs. They then develop a bi-level GCN aimed at the comprehensive integration of information from both the hierarchical syntactic and lexical graphs. Pang et al. [24] introduce a dynamic multi-channel GCN model that separately models syntactic and semantic information, and design a parameter-sharing GCN to extract common information, which is then concatenated after average pooling for ABSA. Li et al. [25] propose a dual-channel GCN to model semantic and syntactic graphs, employing orthogonal and differential regularizers to aid the model in thoroughly learning semantic features.

Since both semantic and syntactic features can be influenced by irrelevant words, edge effects, and other local factors, Wang et al. [26] introduce a distance-based syntactic weighting algorithm to prune the dependency parse tree. By combining aspect-fusion attention, they further filter opinion words in the context, achieving precise identification of aspect terms. In the ABSA task, the sentiment polarity associated with aspect terms is sometimes contingent upon specific contextual phrases. To avoid introducing irrelevant context and syntactic dependencies, researchers have focused on extracting localized segment information relevant to specific aspects. Ahmad et al. [27] propose a specific aspect-based segmentation framework that segments the sentence, retaining only the portion related to the particular aspect, and then uses GCN to extract both syntactic and semantic features.

You et al. [28] demonstrate that dependency trees can introduce extraneous noise due to irrelevant associations, which may lead to erroneous alignments between aspects and their associated sentiment words. To address this, they employ sentiment-aware contextual trees, incorporating phrase segmentation and hierarchical structures alongside graph attention mechanisms. This approach allows the model to effectively capture detailed syntactic information from both contextual and dependency trees, thereby ensuring precise alignment between aspect terms and their respective sentiment words.

Despite the notable performance improvements achieved by graph-based methods in the ABSA task, challenges remain in effectively leveraging external knowledge to enhance the models’ capabilities in sentiment recognition.

External knowledge-based methods

External knowledge has been shown to enhance the natural language understanding abilities of models, with wide-ranging applications across various NLP tasks, such as event detection [29], text classification [30], and more. To optimize the utilization of external resources, including sentiment lexicons [31,32] and knowledge graphs [33,34], researchers have proposed a variety of strategies to improve the integration of external knowledge into neural networks in a more comprehensive and efficient manner. Ma et al. [10] integrate the LSTM network with the SenticNet 4 [35] sentiment lexicon by enhancing LSTM units to incorporate sentiment information into deep neural networks. Following this, approaches utilizing graph neural networks (GNNs) have become the dominant strategy for tackling the ABSA task, prompting researchers to explore the potential of combining GNNs with external knowledge sources. Zhou et al. [36] combine a syntactic dependency graph with commonsense knowledge graphs using GCN. Zhong et al. [15] propose a multi-view representation enhancement network that integrates knowledge graphs into the embedding space, further facilitating the acquisition of aspect-specific knowledge representations through attention mechanisms. Liang et al. [11] employ the SenticNet 6 [37] sentiment lexicon to adjust the weights within the dependency parse graph. Their experimental findings show that incorporating sentiment information helps the model prioritize opinion words with significant sentiment relevance. Gu et al. [12] integrate sentiment knowledge and part-of-speech information into the original syntactic dependency graph, resulting in an augmented graph that incorporates syntactic, sentiment, and part-of-speech information simultaneously. Zheng et al. [38] develop a framework for incorporating sentiment knowledge at the corpus level by utilizing sentiment lexicons [32]. 
This framework enhances the model’s ability to retain, modify, and share sentiment knowledge, enabling the transfer of sentiment knowledge during training on different datasets within the same domain by initializing new model nodes with pre-existing sentiment knowledge node representations. Hao et al. [13] propose a network model that integrates three channels of GCN, leveraging the Concept knowledge graph [34] alongside the SenticNet sentiment lexicon. This model effectively captures contextual semantic features, conceptual knowledge, and sentiment knowledge, thereby enhancing its ability to express aspects and represent sentence dependency graphs. Additionally, the incorporation of interactive attention mechanisms further optimizes the coordination between aspects and context.

The aforementioned studies suggest that a model’s sentiment analysis capabilities can be significantly improved through the integration of external knowledge. However, there is still a need for further exploration of effective knowledge fusion methods.

Existing methods merely assign contextual position encodings and sentiment lexicon scores to an adjacency matrix in a simplistic manner, overlooking contextual noise and contrastive learning of sentiment knowledge. Our SDMAE model therefore employs Gaussian functions to mitigate noise, while leveraging contrastive learning to fuse sentiment knowledge, drawing on both for richer feature information.

Methodology

Task Description: Given a sentence S of length n, S = {w_1, w_2, w_3, ..., w_{t+1}, ..., w_{t+m}, ..., w_n}, and an aspect term A = {w_{t+1}, ..., w_{t+m}}, which is an m-word subsequence of S (a sentence may contain multiple aspect terms). The goal of the ABSA task is to identify the sentiment polarity of each aspect term A in the sentence. Fig 2 illustrates the framework of our proposed method SDMAE, which includes five key components: 1) Encoder Layer: This layer encodes the input sentence into a suitable representation for further processing. 2) Dual-channel GCN Layer: This layer employs two parallel GCNs to model semantic and syntactic graphs separately. 3) Multi-channel Adaptive Aggregation Module: This module adaptively aggregates features from both the semantic and syntactic GCNs. 4) Output Layer: This layer generates the final sentiment prediction representation based on the aggregated features. And 5) Multi-Strategy Auxiliary Module: This module incorporates additional strategies to enhance the model’s performance. Next, we will provide detailed descriptions of each component of SDMAE.

Fig 2. The overall structure of SDMAE.

The sentence-aspect pair encoder is mainly a BERT-based encoding process. The DualGCN module contains SemGCN and SynGCN. Loss_ar and Loss_scl refer to the multi-strategy auxiliary enhancement losses.

https://doi.org/10.1371/journal.pone.0329018.g002

Encoder layer

For a given sentence encoded using pre-trained BERT [39], we follow the BERT-SPC [40] approach to pre-process the text: we concatenate the sentence with the aspect words and encode the concatenated representation.

X = [CLS] S [SEP] A [SEP]    (1)

where [CLS] and [SEP] are the special tokens in pre-trained BERT, which denote the sentence-pooling representation of the fine-tuned BERT for the downstream classification task and the token used for separating the two segments, respectively. The last hidden layer states of BERT, H, are obtained as the semantic feature representation of the sentence-aspect pair: H = {h_0, h_1, ..., h_n}, where h_0 denotes the pooled output, i.e., the vector representation at [CLS]. The word vectors obtained from BERT have more powerful representations than GloVe word vectors and can automatically capture the semantic, positional, part-of-speech, and syntactic information of the text.

Given that linear encoding offers a restricted amount of learnable information throughout the model training process, we employ a Gaussian function-based positional weighting, which is calculated as demonstrated below:

d_i = |i - \tau_a|    (2)

p(h_i) = \exp(-d_i^2 / (2\sigma^2)) \cdot h_i    (3)

where \tau_a denotes the position index of the aspect term.

Following the application of the function p(), we derive semantic information with nonlinear distance features, which is more effective than linear feature perception and enables modulation of the weight distribution through adaptive adjustment of the σ parameter.
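As an illustrative sketch of this positional weighting (the exact distance definition, σ value, and handling of multi-word aspects are our assumptions), the Gaussian weight of each context word can be computed from its distance to the nearest aspect token:

```python
import numpy as np

def gaussian_position_weights(n_tokens, aspect_indices, sigma=2.0):
    """Weight each token by a Gaussian of its distance to the nearest
    aspect token (sketch of the position function p())."""
    positions = np.arange(n_tokens)
    aspects = np.asarray(aspect_indices)
    # distance from every token to the closest aspect token
    dist = np.abs(positions[:, None] - aspects[None, :]).min(axis=1)
    return np.exp(-(dist ** 2) / (2 * sigma ** 2))

# "The charging speed is a little slow" with aspect "speed" (index 2)
w = gaussian_position_weights(7, [2], sigma=2.0)
```

Weights decay smoothly with distance from the aspect, so nearby opinion words retain more influence than distant ones.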

Dual-channel GCN layer

SemGCN.

To improve the representation of semantic features and aid the model’s understanding, we develop SemGCN using a multi-head attention mechanism. We create a semantic graph by assessing the scores from the various attention heads, where the attention scores between each word pair reflect their semantic relevance. The calculation process is outlined as follows:

Q_i = H W_i^Q,  K_i = H W_i^K,  V_i = H W_i^V    (4)

score_i = \mathrm{softmax}(Q_i K_i^\top / \sqrt{d_k})    (5)

H^{mha} = (score_1 V_1 \| \cdots \| score_k V_k) W^O    (6)

A^{sem} = \frac{1}{k} \sum_{i=1}^{k} score_i    (7)

where Q, K, and V are derived from the semantic features H through linear transformations and fed into the multi-head attention mechanism, yielding the multi-head attention output denoted as H^{mha}. By averaging the attention scores score_i from each attention head, we construct the semantic graph of the sentence. To acquire semantic graph information, we employ a GCN to model the semantic graph, denoted as SemGCN. The position function p() is utilized to process the multi-head attention output and derive the initial representation for SemGCN. The hidden states of SemGCN are updated as follows:

h_i^{(l)} = \mathrm{ReLU}\Big( \sum_{j=1}^{n} \frac{A^{sem}_{ij}}{E_i} W^{(l)} h_j^{(l-1)} + b^{(l)} \Big)    (8)

H_{sem} = \{ h_1^{(l)}, h_2^{(l)}, \ldots, h_n^{(l)} \}    (9)

where W^{(l)} and b^{(l)} represent the learnable weight and bias matrices, respectively, A^{sem} is the adjacency matrix of the semantic graph, E_i is the degree of node i in the semantic graph, and h_j^{(l-1)} represents the output of node j from layer l-1. Upon traversing through l layers of SemGCN, the final output of SemGCN, H_{sem}, is derived.
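The two steps above can be sketched in numpy as follows; the head projections, the degree normalization, and the random initialization are simplified assumptions on our part, not the paper's exact implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_graph(H, Wq, Wk, n_heads):
    """Average per-head scaled dot-product attention scores into one
    semantic adjacency matrix (sketch; heads are slices of one projection)."""
    n, d = H.shape
    dk = d // n_heads
    Q, K = H @ Wq, H @ Wk
    scores = []
    for h in range(n_heads):
        q = Q[:, h * dk:(h + 1) * dk]
        k = K[:, h * dk:(h + 1) * dk]
        scores.append(softmax(q @ k.T / np.sqrt(dk)))
    return np.mean(scores, axis=0)          # n x n semantic graph

def gcn_layer(A, H, W, b):
    """One degree-normalized GCN update with ReLU activation."""
    deg = A.sum(axis=1, keepdims=True) + 1e-9
    return np.maximum(0.0, (A @ H) / deg @ W + b)

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))                 # 5 tokens, hidden size 8
A = semantic_graph(H, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)), n_heads=2)
H1 = gcn_layer(A, H, rng.normal(size=(8, 8)), np.zeros(8))
```

Each row of the semantic graph is an averaged attention distribution, so it sums to one and can be convolved directly.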

SynGCN.

The dependency tree provides syntactic dependency information that illustrates the relationships between words within a sentence, thereby enabling the connection of two distant words through specific types of dependency relations. In the context of the ABSA task, the majority of dependency parse trees for sentences are generated using NLP toolkits, such as spaCy or Stanford NLP, which are then employed to construct syntactic graphs. The general rule is that if there is a dependency arc connecting word i and word j in the parse tree, the corresponding position Ai,j in the adjacency matrix A is set to 1; otherwise, it is set to 0. The construction of Ai,j can be computed as follows:

A_{ij} = \begin{cases} 1, & i = j \ \text{or} \ \mathrm{Rel}(w_i, w_j) \\ 0, & \text{otherwise} \end{cases}    (10)

where “self-loop” represents that the position A_{i,i} corresponding to word i in the matrix is 1, and “Rel” represents a specific type of dependency relation between the words.
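The rule of Eq (10) can be sketched as below, assuming the parser's arcs are given as (head, dependent) index pairs; the example arcs are illustrative, not actual spaCy output:

```python
import numpy as np

def dependency_adjacency(n_tokens, arcs):
    """Build the adjacency matrix of Eq (10) from dependency arcs,
    with self-loops on the diagonal and undirected edges."""
    A = np.zeros((n_tokens, n_tokens), dtype=int)
    np.fill_diagonal(A, 1)          # self-loops: A[i][i] = 1
    for head, dep in arcs:
        A[head, dep] = 1
        A[dep, head] = 1            # treat the graph as undirected
    return A

# "The food is great": illustrative arcs food <- is, great <- is
A = dependency_adjacency(4, [(2, 1), (2, 3)])
```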

However, it is noteworthy that the experimental datasets, derived from online comments, encompass a substantial amount of noisy information, including numerous words unrelated to aspect sentiment, spurious dependencies, and non-standard syntactic structures. Consequently, denoising the original dependency tree is imperative. We propose an Aspect-Oriented Syntactic Denoising (AOSD) algorithm that integrates aspect information, part-of-speech tags, and distance weights to reconstruct and prune the dependency trees.

Algorithm 1 delineates the detailed procedure of our proposed AOSD method. We employ spaCy to generate the dependency tree T from the original sentence and predefine a set of part-of-speech tags, concentrating on specific word information. The selection of part-of-speech tags is based on research conducted by Gu and Shuang et al. [41,42], which demonstrated that words belonging to part-of-speech tags such as adjectives, adverbs, and verbs significantly influence the sentiment orientation toward a given aspect.

Algorithm 1. Aspect-oriented syntactic denoising.

1: Input: sentence S = {w_1, ..., w_n}, aspect A = {a_1, ..., a_m}, dependency tree T (parsed by spaCy), Part-of-Speech_List = [VERB, ADV, ADJ, ADP]
2: Output: Aspect-Oriented Syntactic Denoising matrix M
3: Initialize a zero matrix M ∈ R^{n×n}
4: for i = 1 to n do
5:   for j = 1 to m do
6:     connect w_i and a_j directly in T
7:     if POS(w_i) not in Part-of-Speech_List then
8:       M[i][a_j] ← 0
9:     else
10:      M[i][a_j] ← p(|i − a_j|)    ▷ Gaussian weight of the relative distance
11:    end if
12:    M[a_j][i] ← M[i][a_j]    ▷ keep M symmetric
13:   end for
14:   M[i][i] ← 1    ▷ self-loop
15: end for
16: return M

Firstly, we traverse the words within the sentence to reconstruct the dependency tree, establishing a linkage between each node and the aspect terms. Subsequently, we eliminate the dependencies of nodes that do not fall into the predefined part-of-speech list. Eventually, we create the dependency matrix M, in which the weight between nodes and aspect terms is computed based on the relative distance between the nodes and the aspects. The visualization of the AOSD algorithm is depicted in Fig 3.
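A sketch of the AOSD procedure in Python; the Gaussian distance weighting and the treatment of aspect-internal links are our assumptions, and a real implementation would read POS tags and arcs from spaCy rather than taking them as arguments:

```python
import numpy as np

KEEP_POS = {"VERB", "ADV", "ADJ", "ADP"}    # part-of-speech list from Algorithm 1

def aosd_matrix(pos_tags, aspect_indices, sigma=2.0):
    """Sketch of AOSD: every surviving token is linked to the aspect,
    tokens outside KEEP_POS are pruned, and links are weighted by a
    Gaussian of the token-aspect distance."""
    n = len(pos_tags)
    M = np.zeros((n, n))
    for i in range(n):
        if pos_tags[i] not in KEEP_POS and i not in aspect_indices:
            continue                         # prune dependencies of this node
        for a in aspect_indices:
            w = np.exp(-((i - a) ** 2) / (2 * sigma ** 2))
            M[i, a] = M[a, i] = max(M[i, a], w)
    np.fill_diagonal(M, 1.0)                 # self-loops
    return M

# "The battery life is great" with aspect "battery life" (indices 1, 2)
M = aosd_matrix(["DET", "NOUN", "NOUN", "AUX", "ADJ"], [1, 2])
```

Here the determiner and auxiliary verb are disconnected from the aspect, while the adjective "great" keeps a distance-weighted link.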

By employing Algorithm 1, we construct the syntactic denoising adjacency matrix M. Analogous to SemGCN, we also utilize the position-weighted representation p(H^{mha}) as the initial input for SynGCN. The node update equation of SynGCN is presented as follows:

h_i^{(l)} = \mathrm{ReLU}\Big( \sum_{j=1}^{n} \frac{M_{ij}}{D_i} W^{(l)} h_j^{(l-1)} + b^{(l)} \Big)    (11)

where D_i denotes the degree of node i in M.

where W^{(l)} and b^{(l)} represent the learnable weight and bias matrices, respectively, and h_j^{(l-1)} represents the output of node j from layer l-1. After traversing l layers of SynGCN, the final output of SynGCN, H_{syn}, is derived.

MAAM layer

In previous dual-channel ABSA approaches, the aggregation of semantic and syntactic modules was predominantly accomplished via simple concatenation or summation, or through gate mechanisms. Nevertheless, these methods have failed to achieve satisfactory performance, signifying their limitations in facilitating mutual learning between semantic and syntactic information. Inspired by the transformer architecture, we design a Multi-channel Adaptive Aggregation Module (MAAM), as depicted in Fig 4.

The output of the dual-channel GCN is utilized to compute cross-attention, which is subsequently input into the residual normalization and feedforward neural network layers. The calculation process can be expressed as follows:

H'_{sem} = \mathrm{LN}\big( H_{sem} + \mathrm{MHA}(Q_{sem}, K_{syn}, V_{syn}) \big)    (12)

H^{c}_{sem} = \mathrm{LN}\big( H'_{sem} + \mathrm{FFN}(H'_{sem}) \big)    (13)

H'_{syn} = \mathrm{LN}\big( H_{syn} + \mathrm{MHA}(Q_{syn}, K_{sem}, V_{sem}) \big)    (14)

H^{c}_{syn} = \mathrm{LN}\big( H'_{syn} + \mathrm{FFN}(H'_{syn}) \big)    (15)

where LN() denotes layer normalization, FFN() is a feed-forward neural network, and MHA() represents the multi-head attention mechanism. Q, K, and V are the outputs of H_{sem} and H_{syn} after being processed by the linear layers.

However, during the training process, the model might tend to favor either semantic or syntactic features. Hence, we concatenate H_{sem} and H_{syn} to derive H^{c}:

H^{c} = H_{sem} \,\|\, H_{syn}    (16)

where \| represents the concat operation. To achieve effective integration of H^{c}_{sem}, H^{c}_{syn}, and H^{c}, the attention scores associated with these components are weighted and combined utilizing a softmax function. The computation for the fusion output is delineated as follows:

s = \mathrm{softmax}\big( \sigma( (H^{c}_{sem} \,\|\, H^{c}_{syn} \,\|\, H^{c}) W + b ) \big)    (17)

H_{fin} = s_1 \odot H^{c}_{sem} + s_2 \odot H^{c}_{syn} + s_3 \odot H^{c}    (18)

where σ denotes the activation function (we use ReLU()), W and b represent the learnable weight and bias matrices, respectively, and \| denotes the concat operation. This approach allows a complete interaction between the semantic and syntactic features, resulting in the final output H_{fin}.
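The softmax-gated channel fusion can be sketched as below, assuming a per-token scalar score for each of the three channels (the exact shape of the scoring layer is our assumption):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(channels, W, b):
    """Softmax-gated fusion of three channels: score each channel per
    token with ReLU + linear, normalize the scores across channels,
    and take the gated sum."""
    scores = np.stack([np.maximum(0.0, C @ W + b) for C in channels], axis=0)
    gate = softmax(scores, axis=0)           # (3, n, 1), sums to 1 per token
    return sum(g * C for g, C in zip(gate, channels))

rng = np.random.default_rng(1)
chans = [rng.normal(size=(5, 8)) for _ in range(3)]   # sem / syn / concat channels
H_fin = adaptive_fuse(chans, rng.normal(size=(8, 1)), 0.0)
```

Because the gates are non-negative and sum to one per token, the fused output is an element-wise convex combination of the three channels, so no single channel can be silently discarded or inflated.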

To obtain the contextual feature for a specific aspect, we employ a zero-mask mechanism. This approach preserves the context vectors while replacing the aspect word vectors with zeros. Consequently, we derive the aspect-specific contextual feature H_{mask} = \{h_1^{mask}, \ldots, h_n^{mask}\}.

h_i^{mask} = \begin{cases} 0, & t+1 \le i \le t+m \\ h_i^{fin}, & \text{otherwise} \end{cases}    (19)
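Following the description above, the zero-mask mechanism can be sketched as:

```python
import numpy as np

def zero_mask(H_fin, aspect_indices):
    """Zero-mask mechanism: aspect positions are replaced with zeros
    while all context representations are preserved."""
    H_mask = H_fin.copy()
    H_mask[list(aspect_indices)] = 0.0
    return H_mask

rng = np.random.default_rng(4)
H_fin = rng.normal(size=(6, 8))
H_mask = zero_mask(H_fin, [2, 3])   # aspect spans tokens 2-3
```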

Output layer

We construct the final classification representation of the model from the output of the multi-head attention mechanism over the semantic features, H^{mha}, and the output of the aspect-level masking mechanism, H_{mask}. Subsequently, we employ a linear layer followed by a softmax function to compute the sentiment probability distribution y:

r = f(H^{mha}) \,\|\, f(H_{mask})    (20)

y(s,a) = \mathrm{softmax}( W_o\, r + b_o )    (21)

where f() represents the average pooling function, \| indicates vector concatenation, W_o and b_o respectively denote the learnable weight and bias matrices, and (s, a) represents the sentence-aspect pair.

Multi-strategy auxiliary module

Affective refinement strategy.

The majority of methods related to the ABSA task that employ sentiment lexicons tend to combine dependency relationships to incorporate sentiment information, often neglecting the importance of sentiment information within the semantic context. In response to this gap, we propose an affective refinement strategy based on the SenticNet 8 [31] sentiment lexicon. Through a systematic traversal of the sentences within the corpus and the integration of the SenticNet 8 sentiment lexicon, we develop an affective score vector for each sentence, denoted as lex = {score_1, score_2, score_3, ..., score_n}.

score_i = \begin{cases} \mathrm{SenticNet}(w_i), & w_i \in \text{SenticNet 8} \\ 0, & \text{otherwise} \end{cases}    (22)

where score_i represents the sentiment score of word i in the sentiment lexicon. If word i does not exist in the sentiment lexicon, then score_i = 0. To enhance the model’s ability to recognize sentiment cues in sentences, we establish a mapping from the output H_{fin} of the MAAM layer to H_{lex}. By employing the Mean Squared Error (MSE) function to minimize the discrepancy between H_{lex} and the sentence sentiment vector lex, we derive the sentiment refinement loss, denoted as Loss_{ar}. The calculation process is as follows:

H_{lex} = W_{lex} H_{fin} + b_{lex}    (23)

Loss_{ar} = \frac{1}{n} \sum_{i=1}^{n} ( H_{lex,i} - score_i )^2    (24)

where W_{lex} and b_{lex} denote the learnable weight and bias matrices, respectively.
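A numpy sketch of this refinement loss; projecting each token representation to a single score and the illustrative lexicon values are our assumptions:

```python
import numpy as np

def affective_refinement_loss(H_fin, lex, W, b):
    """MSE between the projected per-token sentiment scores and the
    lexicon score vector (sketch of the affective refinement strategy)."""
    pred = (H_fin @ W + b).squeeze(-1)      # (n,) predicted sentiment scores
    return np.mean((pred - lex) ** 2)

rng = np.random.default_rng(2)
H_fin = rng.normal(size=(4, 8))             # 4 tokens, hidden size 8
lex = np.array([0.0, 0.8, 0.0, -0.6])       # illustrative SenticNet-style scores
loss = affective_refinement_loss(H_fin, lex, rng.normal(size=(8, 1)), 0.0)
```

Minimizing this loss nudges the fused representation toward agreeing with the lexicon on which tokens carry sentiment.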

Supervised contrastive learning strategy.

To enhance the model’s ability in inter-aspect modeling and in distinguishing inter-class relationships, we employ supervised contrastive learning to assist model training. Specifically, we regard aspect representations with the same sentiment polarity in a mini-batch as positive examples and those with different sentiment polarities as negative examples.

Loss_{scl} = \sum_{i \in B} \frac{-1}{|C(i)|} \sum_{j \in C(i)} \log \frac{\exp(r_i \cdot r_j / \tau)}{\sum_{k \in B, k \ne i} \exp(r_i \cdot r_k / \tau)}    (25)

where i indicates the index value in the mini-batch sample B, C(i) represents the set of positive samples of the i-th aspect (with |C(i)| their number), r_i denotes the aspect term representation in Formula 21, and τ represents the temperature.
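A reference sketch of this supervised contrastive objective over a mini-batch; the cosine normalization of representations and the τ value are our assumptions:

```python
import numpy as np

def supcon_loss(reps, labels, tau=0.1):
    """Supervised contrastive loss: aspects sharing a polarity label in
    the mini-batch are positives for each other, all others negatives."""
    reps = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sim = reps @ reps.T / tau
    n = len(labels)
    loss, count = 0.0, 0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue                        # no positives: anchor is skipped
        denom = sum(np.exp(sim[i, k]) for k in range(n) if k != i)
        loss += -sum(np.log(np.exp(sim[i, j]) / denom) for j in pos) / len(pos)
        count += 1
    return loss / max(count, 1)

rng = np.random.default_rng(3)
reps = rng.normal(size=(6, 8))              # 6 aspect representations
labels = [0, 0, 1, 1, 2, 2]                 # positive / neutral / negative
loss = supcon_loss(reps, labels)
```

Representations that cluster by polarity yield a lower loss than random ones, which is exactly the signal that sharpens inter-class boundaries during training.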

Model training

We utilize the cross-entropy loss function with L2 regularization to train the model and derive the cross-entropy loss Losssa.

Loss_{sa} = -\sum_{(s,a) \in D} \sum_{c \in C} y_c \log \hat{y}_c + \lambda \lVert \theta \rVert^2  (26)

where D and C denote the set of training sample pairs and the set of sentiment categories, respectively, y_c and \hat{y}_c are the ground-truth and predicted probabilities of category c, λ stands for the L2 regularization parameter, and θ denotes all trainable parameters.

The final loss Losstotal is acquired by weighting the model training loss Losssa, the affective refinement loss Lossar, and the supervised contrastive loss Lossscl, which can be computed as follows:

Loss_{total} = Loss_{sa} + \alpha Loss_{ar} + \beta Loss_{scl}  (27)

where α represents the weighting parameter of the affective refinement loss Lossar, and β is the weighting parameter of the sample supervised contrastive loss Lossscl. The goal of model training is to minimize the total loss Losstotal.

Experiment

Datasets

We conduct evaluations on four publicly available English datasets for the ABSA task, namely the Twitter dataset from ACL14 [43], and the Rest14, Rest15, and Rest16 datasets from SemEval 2014 [44], SemEval 2015 [45], and SemEval 2016 [46], respectively. Sentences in the Rest14, Rest15, and Rest16 datasets may contain one or more aspect terms, while each sentence in the Twitter dataset contains a single aspect term. Sentiment polarities in these datasets are classified as positive, negative, or neutral. Detailed statistics for each dataset are presented in Table 1.

Implementation details and training parameters

In the experimental design, we utilize the pre-trained ‘bert-base-uncased’ version of BERT to generate word embeddings. The input construction for BERT follows the BERT-SPC methodology, with a hidden-layer dimension of 768. Both the SemGCN and SynGCN architectures consist of two layers. The weighting value α for the affective refinement strategy and the weighting value β for the supervised contrastive strategy both lie within the interval (0, 1]. During training, we initialize the parameter weights using a uniform distribution. The batch size is set to 16, the learning rates are 2e-5 and 5e-5, respectively, and the L2 regularization parameter is 1e-5. We use accuracy and macro F1 to evaluate the performance of the model: accuracy is the proportion of correctly predicted samples, while macro F1 is the unweighted average of the per-label F1 scores. The optimal parameter settings for each dataset are presented in Table 2. The maximum number of epochs is 30, and training is stopped early if no better performance emerges for five consecutive epochs. All experiments are conducted on an NVIDIA GeForce RTX 3090 GPU.
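For concreteness, the two evaluation metrics can be computed as follows (a plain-Python sketch; the integer label set standing for the three polarities is an assumption):

```python
def accuracy(y_true, y_pred):
    """Proportion of samples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Macro F1: per-label F1 computed independently, then averaged
    without class weighting."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

Because macro F1 weights every class equally, it penalizes models that neglect the smaller neutral class, which accuracy alone can mask.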

Table 2. The optimal parameter combination on each dataset.

https://doi.org/10.1371/journal.pone.0329018.t002

Baselines

To assess the efficacy of the SDMAE model, we refer to three types of baselines: 1) Neural Network-based methods, 2) Graph-based methods, and 3) Graph-based with BERT methods. We conduct comparisons with the following baseline models on the four datasets.

  • TD-LSTM: Tang et al. [16] design two LSTMs to obtain the context information of aspect words and then concatenate them to predict sentiment polarity.
  • AOA: Huang et al. [18] employ the idea of attention over attention to model the relationship between aspects and the context for predicting sentiment polarity.
  • IAN: Ma et al. [19] propose an interactive attention network to jointly model aspect word and context representations to predict sentiment polarity.
  • RAM: Chen et al. [20] utilize multiple attention mechanisms to capture the sentiment features of specific aspect words in long-distance texts, effectively reducing the influence of irrelevant factors.
  • MemNet: Tang et al. [47] develop a memory network, combining the attention mechanism with the external memory to calculate the importance of context word to the aspect term.
  • BERT: Song et al. [42] employ BERT’s final hidden state to encode and represent the context for sentiment prediction.
  • ASGCN: Zhang et al. [8] utilize a GCN to model the syntactic graph based on the dependency tree and combine attention mechanisms to predict sentiment.
  • CDT: Sun et al. [48] exploit BiLSTM to learn the semantics of contextual words and model the syntactic structure information of dependency trees through convolution.
  • BiGCN: Zhang et al. [23] develop an interactive bidirectional graph convolutional network to integrate word co-occurrence information and syntactic features.
  • kumaGCN: Chen et al. [49] achieve representations tailored for the ABSA task through the integration of syntactic dependency graphs and latent graphs.
  • R-GAT: Wang et al. [50] propose a relational graph attention network and an aspect-oriented tree structure that concentrates on aspects by reshaping and pruning dependency trees.
  • ACLT: Zhou et al. [51] develop aspect-centered tree structure to adaptively correlate aspects with opinion words, thereby reducing the distance between the aspect and the corresponding opinion words.
  • DGEDT: Tang et al. [52] integrate BiGCN and the transformer structure to capture sentiment features from various perspectives; BiAffine is then employed for feature fusion.
  • Dual-GCN: Li et al. [25] introduce a dual-channel GCN that separately models both the semantic and syntactic graphs. The framework captures semantic correlations through the application of orthogonal and differential regularizers.
  • SenticGCN: Liang et al. [11] incorporate sentiment knowledge from the SenticNet lexicon into syntactic structures for the enhancement of the model’s sentiment perception capability.
  • ASHGAT: Ouyang et al. [53] construct a word-level hypergraph matrix based on syntactic dependencies to enhance the syntactic and semantic connections between the aspect and the contextual words.
  • TCKGCN: Hao et al. [13] integrate semantic, conceptual, and affective features and propose a three-channel knowledge fusion model for sentiment analysis, strengthening the coordination of aspect and context through an interactive attention mechanism.

Experimental results and analysis

From the experimental outcomes presented in Table 3, it is evident that SDMAE achieves the best performance on the Rest14, Rest15, and Rest16 datasets compared with previous research methodologies, and its performance on the Twitter dataset is comparable to that of previous methods. These findings demonstrate the efficacy of the SDMAE method. Detailed comparisons yield the following observations.

Table 3. Performance of different models on the four datasets, with results sourced from the original papers or reproduced using code released by the authors. The best performance is marked in bold, the second best is indicated in italics, and a dash (‘-’) indicates that the original paper did not report the result.

https://doi.org/10.1371/journal.pone.0329018.t003

Firstly, within the neural network-based methodologies, when compared to the TD-LSTM models, the IAN and AOA models demonstrate significantly superior performance across all datasets. This is because the IAN and AOA methods provide specific aspect information to the model. It has been affirmed that integrating such information is beneficial for the model to retrieve key information within the context. Among the neural network methods, BERT performs the best, suggesting that the semantic features obtained by BERT are more effective than those acquired by other approaches. The graph-based method has enhanced the overall performance, indicating the necessity of incorporating syntactic structure information in the ABSA task. This is because the introduction of syntactic dependency trees infuses syntactic knowledge into the model. By combining syntactic knowledge, the model can learn opinion word information that has a long-distance dependency relationship with aspect terms.

In comparison with ASGCN, BiGCN showcases enhanced performance, suggesting that incorporating word co-occurrence information from the corpus is effective. R-GAT trims dependency trees and introduces attention mechanisms into graph neural networks to dynamically adjust node weights within the syntactic graph. In contrast to ASGCN, a remarkable improvement is observed on the Twitter and Rest14 datasets, validating that aspect-oriented pruning is advantageous. ACLT reduces the distance between aspects and their corresponding opinion words by learning aspect-centered tree structures; compared to standard dependency trees, it can adaptively associate aspects and opinion words during model training, improving performance across all datasets. Notably, the accuracy on the Rest16 dataset reaches 92.15%, indicating that minimizing the distance between aspects and their corresponding opinion words enables the model to better recognize sentiment polarity in the ABSA task.

SenticGCN and TCKGCN integrate external knowledge with semantic information or syntactic dependency trees. In comparison to the ASGCN model, SenticGCN demonstrates an average improvement of 2.70% in accuracy and 5.74% in macro F1 on the Rest14, Rest15, and Rest16 datasets. TCKGCN exhibits an average improvement of 2.57% in accuracy and 4.13% in macro F1 on the datasets excluding Rest14. This validates that external knowledge can direct models to learn sentiment and common-sense information, assist in sentiment classification, and notably enhance performance on the ABSA task. Meanwhile, among the graph-based methods, those using BERT also perform better, indicating the superiority of BERT in semantic encoding.

Ablation experiment

The impact of various components in SDMAE.

To further investigate the influence of each component of the model on performance, an ablation experiment is conducted: 1) w/o AOSD denotes deleting the syntactic denoising graph and utilizing the original syntactic dependency graph. 2) w/o Lossar, w/o Lossscl, and w/o Lossar & Lossscl respectively represent removing only the affective refinement loss, removing only the supervised contrastive loss, and removing both simultaneously while computing only the classification loss of the model. The results of the ablation experiment are presented in Table 4.

Table 4. Experimental results of ablation of various components in SDMAE, the best performance is marked in bold.

https://doi.org/10.1371/journal.pone.0329018.t004

On the one hand, in the w/o AOSD case, where the original dependency tree is used to generate graphs, a significant performance decline is observed on the Rest14, Rest15, and Rest16 datasets, while performance on the Twitter dataset remains nearly equivalent to that with AOSD. This suggests that the proportion of noisy information in the syntactic structure of the Twitter dataset is relatively low. On the other hand, we ablate the affective refinement strategy and the supervised contrastive learning strategy individually; the two strategies contribute differently across datasets. When both strategies are removed simultaneously, a relatively substantial performance degradation occurs compared to the complete method. Evidently, removing any module impacts the performance of the model and leads to varying degrees of decline.

The different fusion ways of semantic and syntactic information.

To demonstrate the efficacy of the MAAM layer in information aggregation, we conduct ablation experiments by utilizing three common information aggregation methods in place of MAAM. w/o MAAM indicates the removal of the MAAM layer and its substitution with three common information aggregation methods namely “sum”, “concat” and “gate”. As illustrated in Fig 5, in the “gate” aggregation, the LeakyReLU activation function is utilized to compute the gate control output.

Fig 5. Different aggregation methods of semantic and syntactic information.

https://doi.org/10.1371/journal.pone.0329018.g005

As can be clearly observed from Table 5 and Fig 6, on the Rest14, Twitter, and Rest15 datasets, accuracy and macro F1 deteriorate to varying degrees when the MAAM layer is removed. Compared to the MAAM layer, these aggregation methods are incapable of fully aggregating semantic and syntactic features. On the Rest16 dataset, although the “concat” method achieves higher accuracy, its macro F1 decreases considerably, and the accuracy with the MAAM layer remains comparable.

Fig 6. The variation between accuracy and F1 value under different aggregation methods.

https://doi.org/10.1371/journal.pone.0329018.g006

Table 5. Comparison of effectiveness between MAAM module and common aggregation methods, the best performance is marked in bold.

https://doi.org/10.1371/journal.pone.0329018.t005

The impact of MHA headcount.

In addition, we also investigate the influence of the number of heads in the multi-head attention (MHA) on experimental performance. Visualization results of the ablation experiment on the number of MHA heads are shown in Fig 7. We increase the number of attention heads from 2 to 12 and conduct experiments on the four publicly available datasets. As the results indicate, our model attains the optimal accuracy on all datasets when the number of attention heads is 6, and performance begins to deteriorate as the number of heads increases further. For macro F1, Twitter, Rest15, and Rest16 likewise achieve their optimal values with 6 heads, while Rest14 performs best with 8 heads. Overall, the model achieves superior performance with 6 attention heads, and performance declines above or below this threshold. The reason is that the learning ability of the model rises with the number of attention heads and peaks at 6; beyond this threshold, the node features become overly cumbersome, and distributing weight across too many heads causes the model to lose the ability to select important nodes.

Fig 7. The variation between accuracy and F1 value under different numbers of MHA heads.

https://doi.org/10.1371/journal.pone.0329018.g007

The visualization matrix of the AOSD algorithm.

To visually present the AOSD algorithm, we use an example sentence for visualization. Fig 8 depicts the original adjacency matrix A and the adjacency matrix M after denoising by the AOSD algorithm for the sentence “the charging speed of this phone is extremely fast and stable”, where the given aspect term is “charging speed”. In the pruned syntactic matrix produced by the AOSD denoising algorithm, the attention of the aspect term “charging speed” to the opinion words “extremely”, “fast”, and “stable” increases significantly. Compared with the original syntactic adjacency matrix, this makes it easier for the model to capture the relationship between the opinion words and the aspect term, thereby facilitating correct sentiment decisions. To further analyze the benefit the AOSD algorithm brings to SDMAE, we conduct a case study on the original matrix and the AOSD matrix for two text examples, as shown in Table 6.

Fig 8. The visualization matrix of original syntactic and after AOSD of sentence “the charging speed of this phone is extremely fast and stable”.

https://doi.org/10.1371/journal.pone.0329018.g008

Table 6. Results of case study on the original matrix and AOSD matrix, with the aspect words underlined.

https://doi.org/10.1371/journal.pone.0329018.t006

Discussion

The goal of this study was to tackle two critical challenges in ABSA, namely syntactic noise interference and insufficient integration of semantic-syntactic information, while validating the effectiveness of the proposed SDMAE method. Through the collaborative design of multiple components, SDMAE addresses these research goals in a targeted manner, as elaborated below.

Syntactic noise

Online review data often contains noise due to non-standard syntactic structures, which interferes with the model’s ability to capture sentiment-related relationships. The AOSD algorithm trims the original syntactic dependency tree by filtering part-of-speech (prioritizing VERB, ADV, ADJ, etc., which are strongly associated with sentiment) and applying non-linear position weighting relative to aspect terms. Ablation experiments show that when AOSD is removed, the accuracy on the Rest14, Rest15, and Rest16 datasets decreases by 0.98%, 2.03%, and 1.63% respectively, and the macro-F1 values drop by 1.16%, 1.83%, and 5.2% respectively. Take the sentence “The charging speed of this phone is extremely fast and stable” as an example. AOSD enhances the syntactic association weights between “charging speed” and “fast”/“stable”, enabling the model to more accurately focus on key components related to sentiment, directly addressing the issue of “syntactic noise interference”.
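The pruning idea can be sketched roughly as follows (the POS set follows the prioritization above, but the decay function and edge-weight form are illustrative assumptions, not the exact AOSD formulation):

```python
import numpy as np

# Illustrative POS set: tags treated as sentiment-bearing (an assumption
# following the VERB/ADV/ADJ prioritization described in the text).
SENTIMENT_POS = {"VERB", "ADV", "ADJ"}

def aosd_prune(adj, pos_tags, aspect_idx):
    """Keep dependency edges whose target word has a sentiment-bearing POS
    tag (or is the aspect term itself) and down-weight kept edges by a
    non-linear function of the distance to the aspect term."""
    n = len(pos_tags)
    M = np.zeros_like(adj, dtype=float)
    for i in range(n):
        for j in range(n):
            if adj[i, j] and (pos_tags[j] in SENTIMENT_POS or j in aspect_idx):
                dist = min(abs(j - a) for a in aspect_idx)
                M[i, j] = adj[i, j] / (1.0 + np.log1p(dist))  # position decay
    return M
```

Edges pointing at function words (determiners, prepositions) simply vanish from the graph, which is the denoising effect illustrated by the adjacency matrices in Fig 8.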

Semantic-syntactic fusion

The inefficient integration of semantic and syntactic information can limit the model’s understanding of sentiment relationships. The MAAM module in SDMAE achieves dynamic interaction between semantic and syntactic features through a multi-head attention mechanism. Compared with traditional fusion methods, MAAM shows more stable performance across datasets. On the Rest14 dataset, the macro-F1 of MAAM is 1.97% higher than that of the “sum” method; on the Rest15 dataset, the accuracy (Acc) is 3.5% higher than that of the “concat” method. This adaptive fusion mechanism allows the model to flexibly balance the contributions of semantics and syntax, effectively overcoming the problem of “insufficient integration of semantic-syntactic information” and verifying the rationality of the SDMAE design.
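A simplified sketch of such attention-based channel fusion (projection matrices are omitted and the residual averaging is an assumption; the head count of 6 follows the reported best setting):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def maam_fuse(h_sem, h_syn, num_heads=6):
    """The semantic channel attends to the syntactic channel head by head;
    the attended result is averaged with the semantic features."""
    n, d = h_sem.shape
    dh = d // num_heads                       # per-head feature slice
    out = np.zeros_like(h_sem)
    for h in range(num_heads):
        q = h_sem[:, h * dh:(h + 1) * dh]     # queries from semantics
        k = v = h_syn[:, h * dh:(h + 1) * dh] # keys/values from syntax
        attn = softmax(q @ k.T / np.sqrt(dh)) # scaled dot-product attention
        out[:, h * dh:(h + 1) * dh] = attn @ v
    return (out + h_sem) / 2.0
```

Splitting the hidden dimension across heads lets each head weight the syntactic channel differently, which is what gives the fusion its adaptivity relative to a flat “sum” or “concat”.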

Threats to validity and limitations

Despite the promising performance of SDMAE, several factors related to the design, data, and methodology may affect the validity and generalizability of the conclusions. These threats are outlined as follows:

Construct Validity. The sentiment annotation of the datasets relies on manual judgment, which introduces subjectivity. Ambiguous sentiment expressions (e.g., “not bad”) may lead to annotation inconsistencies due to differences in annotators’ interpretations. Furthermore, the part-of-speech filtering strategy in the AOSD module (prioritizing VERB, ADV, and ADJ) is based on general sentiment heuristics. Such heuristics may not be universally applicable in all domains, potentially limiting the accuracy of sentiment structure extraction.

Internal Validity. The performance of SDMAE depends on hyperparameters such as the number of attention heads and the weighting of multi-strategy losses, which are tuned on benchmark datasets. Although these configurations lead to strong empirical results, they also pose a risk of overfitting to the experimental setup. Changes in hyperparameter settings (e.g., reducing the number of attention heads) may result in performance variability, thereby threatening the internal consistency of the conclusions.

External Validity. The experiments are conducted exclusively on English-language online review datasets. The model’s generalization ability to other languages or to different types of text (e.g., formal writing, domain-specific documents) has not been evaluated. This lack of validation across diverse data sources limits the external applicability of the findings.

Limitations of the SDMAE Model. Although SDMAE introduces several innovations, it still exhibits certain limitations: (1) Insufficient Coverage of Sentiment Knowledge: The sentiment refinement relies on SenticNet 8, which lacks complete coverage of domain-specific terminology, leading to suboptimal sentiment vector representations. (2) Inadequate Handling of Complex Sentences: The MAAM module struggles to distinguish sentiment associations in sentences with multiple conflicting aspects (e.g., “The food is good but the service is bad”). The absence of an aspect-specific hierarchical attention mechanism hinders accurate aspect-opinion alignment. (3) Unverified Domain Adaptability: While SDMAE performs well on general review datasets, its effectiveness in professional or domain-specific contexts remains untested. The use of a general-domain BERT encoder may fail to capture fine-grained sentiment cues in specialized domains.

Conclusions and future work

In this paper, we propose the SDMAE method to tackle the issues of syntactic noise and inefficient fusion of semantic and syntactic information in the ABSA task. We address these challenges by integrating the aspect-oriented syntactic denoising algorithm and a multi-strategy auxiliary approach. Specifically, AOSD effectively prunes the dependency trees to reduce noise by incorporating the position of aspect words, the part-of-speech of the context, and the feature information of contextual distance. The Multi-channel Adaptive Aggregation Module combines the weights of semantic and syntactic features in the model, facilitating the effective integration of these features through attention mechanisms. Furthermore, the affective refinement strategy and supervised contrastive learning strategy are employed to enhance the model’s ability in sentiment recognition and inter-class discrimination, respectively. Experiments conducted on four benchmark datasets demonstrate that SDMAE outperforms previous methods, confirming its effectiveness.

We believe that although our model fully leverages the emotional knowledge of words and contextual distance information, relying solely on the SenticNet sentiment lexicon to provide sufficient emotional knowledge for the ABSA task remains inadequate. Further exploration of the acquisition and utilization of emotional knowledge is required. In future work, we aim to enhance the model’s emotional recognition capabilities by considering emotional interaction effects and leveraging the prompt-learning abilities of Large Language Models, integrating more relevant information from various dimensions.

References

  1. Hajek P, Hikkerova L, Sahut J-M. Fake review detection in e-commerce platforms using aspect-based sentiment analysis. J Bus Res. 2023;167:114143.
  2. Lengkeek M, van der Knaap F, Frasincar F. Leveraging hierarchical language models for aspect-based sentiment analysis on financial data. Inf Process Manag. 2023;60(5):103435.
  3. Kumar N, Hanji BR. Aspect-based sentiment score and star rating prediction for travel destination using Multinomial Logistic Regression with fuzzy domain ontology algorithm. Expert Systems with Applications. 2024;240:122493.
  4. Li H, Yu BXB, Li G, Gao H. Restaurant survival prediction using customer-generated content: an aspect-based sentiment analysis of online reviews. Tourism Management. 2023;96:104707.
  5. Amendola M, Cavaliere D, De Maio C, Fenza G, Loia V. Towards echo chamber assessment by employing aspect-based sentiment analysis and GDM consensus metrics. Online Social Networks and Media. 2024;39–40:100276.
  6. Zhao Y, Zhang L, Zeng C, Lu W, Chen Y, Fan T. Construction of an aspect-level sentiment analysis model for online medical reviews. Information Processing & Management. 2023;60(6):103513.
  7. Li X, Luo Y, Wang H, Lin J, Deng B. Doctor selection based on aspect-based sentiment analysis and neutrosophic TOPSIS method. Engineering Applications of Artificial Intelligence. 2023;124:106599.
  8. Zhang C, Li Q, Song D. Aspect-based sentiment classification with aspect-specific graph convolutional networks. In: Proceedings of the EMNLP-IJCNLP, Hong Kong, China. 2019. p. 4568–78.
  9. Xing FZ, Pallucchini F, Cambria E. Cognitive-inspired domain adaptation of sentiment lexicons. Information Processing & Management. 2019;56(3):554–64.
  10. Ma Y, Peng H, Cambria E. Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. In: Proceedings of the AAAI, New Orleans, LA, USA. 2018. p. 5876–83.
  11. Liang B, Su H, Gui L, Cambria E, Xu R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowledge-Based Systems. 2022;235:107643.
  12. Gu T, Zhao H, He Z, Li M, Ying D. Integrating external knowledge into aspect-based sentiment analysis using graph neural network. Knowledge-Based Systems. 2023;259:110025.
  13. Hao J, Pei L, He Y, Xing Z, Weng Y. TCKGCN: Graph convolutional network for aspect-based sentiment analysis with three-channel knowledge fusion. Neurocomputing. 2024;600:128163.
  14. Yan H, Yi B, Li H, Wu D. Sentiment knowledge-induced neural network for aspect-level sentiment analysis. Neural Comput Applic. 2022;34(24):22275–86.
  15. Zhong Q, Ding L, Liu J, Du B, Jin H, Tao D. Knowledge graph augmented network towards multiview representation learning for aspect-based sentiment analysis. IEEE Trans Knowl Data Eng. 2023;35(10):10098–111.
  16. Tang D, Qin B, Feng X, Liu T. Effective LSTMs for target-dependent sentiment classification. arXiv preprint 2015. https://arxiv.org/abs/1512.01100
  17. Wang Y, Huang M, Zhu X, Zhao L. Attention-based LSTM for aspect-level sentiment classification. In: Proceedings of the EMNLP, Austin, TX, USA. 2016. p. 606–15.
  18. Huang B, Ou Y, Carley KM. Aspect level sentiment classification with attention-over-attention neural networks. In: Proceedings of the SBP-BRiMS, Washington, DC, USA. 2018. p. 197–206.
  19. Ma D, Li S, Zhang X, Wang H. Interactive attention networks for aspect-level sentiment classification. arXiv preprint 2017.
  20. Chen P, et al. Recurrent attention network on memory for aspect sentiment analysis. In: Proceedings of the EMNLP, Copenhagen, Denmark. 2017. p. 452–61.
  21. Fan F, Feng Y, Zhao D. Multi-grained attention network for aspect-level sentiment classification. In: Proceedings of the EMNLP, Brussels, Belgium. 2018. p. 3433–42.
  22. Zhu C, Yi B, Luo L. Base on contextual phrases with cross-correlation attention for aspect-level sentiment analysis. Expert Systems with Applications. 2024;241:122683.
  23. Zhang M, Qian T. Convolution over hierarchical syntactic and lexical graphs for aspect level sentiment analysis. In: Proceedings of the EMNLP. 2020. p. 3540–9.
  24. Pang S, et al. Dynamic and multi-channel graph convolutional networks for aspect-based sentiment analysis. In: Findings of ACL-IJCNLP. 2021. p. 2627–36.
  25. Li R, et al. Dual graph convolutional networks for aspect-based sentiment analysis. In: Proceedings of the ACL-IJCNLP. 2021. p. 6319–29.
  26. Wang Z, et al. DAGCN: Distance-based and aspect-oriented graph convolutional network for aspect-based sentiment analysis. In: Findings of NAACL. 2024. p. 1863–76.
  27. Ahmad KM, et al. Aspect-specific parsimonious segmentation via attention-based graph convolutional network for aspect-based sentiment analysis. Knowl-Based Syst. 2024:112169.
  28. You L, Peng J, Jin H, Claramunt C, Zeng H, Zhang Z. DRGAT: dual-relational graph attention networks for aspect-based sentiment classification. Information Sciences. 2024;668:120531.
  29. Ling T, Chen L, Lai Y, Liu H-L. Span-based few-shot event detection via aligning external knowledge. Neural Netw. 2024;176:106327. pmid:38692187
  30. Wu M. Commonsense knowledge powered heterogeneous graph attention networks for semi-supervised short text classification. Expert Syst Appl. 2023;232:120800.
  31. Cambria E, et al. SenticNet 8: Fusing emotion AI and commonsense AI for interpretable, trustworthy, and explainable affective computing. In: Proceedings of the HCII. 2024.
  32. Hu M, Liu B. Mining and summarizing customer reviews. In: Proceedings of the KDD, Seattle, WA, USA. 2004. p. 168–77.
  33. Miller GA. WordNet: a lexical database for English. Commun ACM. 1995;38(11):39–41.
  34. Ji L, Wang Y, Shi B, Zhang D, Wang Z, Yan J. Microsoft concept graph: mining semantic concepts for short text understanding. Data Intelligence. 2019;1(3):238–70.
  35. Cambria E, et al. SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: Proceedings of the COLING, Osaka, Japan. 2016. p. 2666–77.
  36. Zhou J, Huang JX, Hu QV, He L. SK-GCN: modeling syntax and knowledge via graph convolutional network for aspect-level sentiment classification. Knowledge-Based Systems. 2020;205:106292.
  37. Cambria E, et al. SenticNet 6: Ensemble application of symbolic and subsymbolic AI for sentiment analysis. In: Proceedings of the CIKM. 2020. p. 105–14.
  38. Zheng Y, Li X, Nie J-Y. Store, share and transfer: Learning and updating sentiment knowledge for aspect-based sentiment analysis. Inf Sci. 2023;635:151–68.
  39. Devlin J. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint 2018. https://arxiv.org/abs/1810.04805
  40. Song Y, et al. Attentional encoder network for targeted sentiment classification. arXiv preprint 2019. https://arxiv.org/abs/1902.09314
  41. Gu T, Zhao H, He Z, Li M, Ying D. Integrating external knowledge into aspect-based sentiment analysis using graph neural network. Knowledge-Based Systems. 2023;259:110025.
  42. Shuang K, Gu M, Li R, Loo J, Su S. Interactive POS-aware network for aspect-level sentiment classification. Neurocomputing. 2021;420:181–96.
  43. Dong L, et al. Adaptive recursive neural network for target-dependent twitter sentiment classification. In: Proceedings of the ACL, Baltimore, MD, USA. 2014. p. 49–54.
  44. Pontiki M, et al. SemEval-2014 Task 4: aspect based sentiment analysis. In: Proceedings of the SemEval, Dublin, Ireland. 2014. p. 27–35.
  45. Papageorgiou H, et al. SemEval-2015 Task 12: aspect based sentiment analysis. In: Proceedings of the 9th International Workshop on Semantic Evaluation. 2015. p. 486–95.
  46. Pontiki M, et al. SemEval-2016 Task 5: aspect based sentiment analysis. In: Proceedings of the SemEval, San Diego, CA, USA. 2016. p. 19–30.
  47. Tang D, Qin B, Liu T. Aspect level sentiment classification with deep memory network. In: Proceedings of the EMNLP, Austin, TX, USA. 2016. p. 214–24.
  48. Sun K, et al. Aspect-level sentiment analysis via convolution over dependency tree. In: Proceedings of the EMNLP-IJCNLP, Hong Kong, China. 2019. p. 5679–88.
  49. Chen C, Teng Z, Zhang Y. Inducing target-specific latent structures for aspect sentiment classification. In: Proceedings of the EMNLP. 2020. p. 5596–607.
  50. Wang K, et al. Relational graph attention network for aspect-based sentiment analysis. In: Proceedings of the ACL. 2020. p. 3229–38.
  51. Zhou Y, et al. To be closer: learning to link up aspects with opinions. In: Proceedings of the EMNLP. 2021. p. 3899–909.
  52. Tang H, et al. Dependency graph enhanced dual-transformer structure for aspect-based sentiment classification. In: Proceedings of the ACL. 2020. p. 6578–88.
  53. Ouyang J, Xuan C, Wang B, Yang Z. Aspect-based sentiment classification with aspect-specific hypergraph attention networks. Expert Systems with Applications. 2024;248:123412.