Table 1.
Comparative analysis of work related to multimodal sentiment analysis.
Fig 1.
The proposed multimodal sentiment analysis model.
Fig 2.
Collection and preprocessing of tourism review data.
Fig 3.
The process of constructing the text graph.
Fig 4.
The process of constructing the image graph.
Fig 5.
The process of constructing the fusion graph [37].
Fig 6.
The principle of the GraphSAGE algorithm.
Table 2.
Parameter settings for BERT, ResNet, and GraphSAGE in the proposed model.
Table 3.
Text-only (single-modal) comparison of the proposed model against six SOTA models on the Yelp, TripAdvisor, and Ctrip datasets.
Table 4.
Multimodal comparison of the proposed model against six SOTA models on the Yelp, TripAdvisor, and Ctrip datasets.
Fig 7.
Heat map of the proposed model's experimental results on the Yelp dataset.
Table 5.
Performance of the proposed model under different loss function weight settings (α, β) on four datasets (TripAdvisor, Amazon, Ctrip, MM-CPC) in the text-only modality.
Table 6.
Results of the comparison experiments on each of the other four datasets in the multimodal setting.
Table 7.
Computational efficiency comparison between the proposed model and TETFN in terms of training time, inference time, and memory usage.
Fig 8.
Accuracy and F1-score comparison between the proposed model and baseline methods (TFN, LMF, MulT) on the TripAdvisor and Yelp datasets across three sentiment classes.
Fig 9.
Performance impact of removing individual components (BERT text encoder, ResNet visual encoder, or GraphSAGE aggregator) measured by relative change in accuracy (Δ Acc) and macro-F1 score (Δ F1) across all test datasets.
Table 8.
Ablation study results comparing BERT and Word2Vec for text feature extraction (Accuracy, Precision, Recall, F1-score, AUC-ROC).
Table 9.
Ablation study results comparing ResNet and plain CNN for image feature extraction (Accuracy, Precision, Recall, F1-score, AUC-ROC).
Table 10.
Ablation study results comparing GraphSAGE and standard GCN for multimodal graph aggregation (Accuracy, Precision, Recall, F1-score, AUC-ROC).