Table 1.
CMU-MOSI dataset information.
Table 2.
CMU-MOSEI dataset information.
Fig 1.
Structure of AFR-BERT multimodal sentiment analysis model.
AFR-BERT is divided into four network modules, which correspond to data input, data fusion, data analysis, and data output.
Fig 2.
(Forward) denotes forward propagation through the model. (Backward) denotes backward propagation through the model.
Fig 3.
Cross-modal fusion attention mechanism structure.
(Kt) denotes the text feature data. (Ka) denotes the audio feature data. (Relu, Row Softmax, softmax, concat) are function operations. (Mask) is a matrix.
Fig 4.
Scaled dot product attention structure.
(Q) denotes the query matrix. (K) denotes the key matrix. (V) denotes the value matrix. (Mask) denotes the matrix operations used to handle variable-length sequences. () is the scale factor.
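The components named in this caption can be illustrated with a minimal sketch in plain Python. It follows the standard formulation of scaled dot-product attention, not necessarily the exact computation used in AFR-BERT. The scale factor is assumed to be the square root of the key dimension, since the symbol is missing from the caption. The mask convention (1 keeps a position, 0 hides a padded position) and the function names are also assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V, mask=None):
    """Standard scaled dot-product attention on nested lists.

    Q: n_q x d_k, K: n_k x d_k, V: n_k x d_v.
    mask: optional n_q x n_k matrix of 0/1 (assumed convention: 1 = keep).
    Returns (output, weights), where output is n_q x d_v.
    """
    d_k = len(K[0])
    scale = math.sqrt(d_k)  # assumed scale factor (standard formulation)
    d_v = len(V[0])
    output, weights = [], []
    for i, q in enumerate(Q):
        # Dot product of the query with every key, divided by the scale factor.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        if mask is not None:
            # Masked positions get a large negative score, so their
            # softmax weight underflows to zero.
            scores = [s if keep else -1e9 for s, keep in zip(scores, mask[i])]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows.
        output.append([sum(w_j * v[c] for w_j, v in zip(w, V)) for c in range(d_v)])
    return output, weights


if __name__ == "__main__":
    Q = [[1.0, 0.0], [0.0, 1.0]]
    K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    # The third key position is treated as padding and masked out.
    mask = [[1, 1, 0], [1, 1, 0]]
    out, attn = scaled_dot_product_attention(Q, K, V, mask)
    print(out)
    print(attn)
```

Masking before the softmax, rather than zeroing weights afterwards, keeps each row of attention weights summing to one over the unmasked positions, which is the usual way to handle padded sequences of different lengths.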
Table 3.
Optimal parameter settings.
Table 4.
Comparative experiments of multimodal sentiment analysis models on the CMU-MOSI dataset.
Fig 5.
Cross-sectional histograms of Corr metrics for each model on the CMU-MOSI dataset.
Table 5.
Comparative experiments of multimodal sentiment analysis models on the CMU-MOSEI dataset.
Fig 6.
Cross-sectional histograms of Corr metrics for each model on the CMU-MOSEI dataset.
Table 6.
Ablation experiments on the CMU-MOSI dataset.
Table 7.
Sample analysis.