Fig 1.
Structural diagram of the multi-channel embedding (MSMCE) module.
This module consists of three main components: Encoder, Channel Embedding, and Channel Concatenation, which are designed to enhance the feature representation capability of mass spectrometry data. Note: This diagram illustrates the processing of a single MS vector through the module; in practice, the module processes a batch of B such vectors concurrently.
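The three components named in the caption can be sketched as follows. This is a minimal NumPy illustration under assumed shapes and linear operations, not the paper's implementation: the Encoder is stood in for by a single projection, the Channel Embedding by one learned projection per channel, and the Channel Concatenation by stacking the per-channel features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: input MS-vector length, encoder dim, channel count.
L, D, C = 1000, 64, 8

# Encoder: project the raw MS vector into a D-dimensional feature (assumed linear).
W_enc = rng.standard_normal((L, D)) * 0.01

# Channel Embedding: one learned projection per channel (assumed linear maps).
W_ch = rng.standard_normal((C, D, D)) * 0.01

def msmce_forward(x):
    h = x @ W_enc                                          # Encoder: (L,) -> (D,)
    channels = [h @ W_ch[c] for c in range(C)]             # Channel Embedding: C views
    return np.stack(channels)                              # Channel Concatenation: (C, D)

x = rng.standard_normal(L)       # one simulated MS vector
out = msmce_forward(x)
print(out.shape)                 # (8, 64): a multi-channel feature map per vector
```

In practice the module would apply the same operations to a batch of B vectors at once, yielding a (B, C, D) tensor.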
Table 1.
Dataset information.
Fig 2.
Mass spectrometry data processing workflow.
This figure illustrates the preprocessing pipeline for mass spectrometry data, covering both SpiderMass and LC-MS data processing methods. The final output is a Feature Matrix, which serves as the input for subsequent analyses.
Fig 3.
t-SNE visualization of all studied datasets.
This figure shows two-dimensional t-SNE visualizations of the four datasets used in this study, illustrating their intrinsic structure and class separability. Five panels are shown because the Canine Sarcoma dataset appears in both its binary and 12-class views. (a) Canine Sarcoma dataset, binary classification view; (b) Canine Sarcoma dataset, 12-class view with all sarcoma subtypes; (c) NSCLC dataset; (d) CRLM dataset; (e) RCC dataset.
Fig 4.
Comparison between the training process with MSMCE and without MSMCE.
In the blue path, the input data first passes through the MSMCE module for feature transformation before being fed into the classification model for training. The loss gradient optimizes both the classification model and the MSMCE module. In the red path, the input data is directly fed into the classification model, and the loss gradient is only used to optimize the classification model.
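The gradient-flow difference between the two paths can be made concrete with a toy NumPy sketch. Linear layers and a squared loss stand in for the real MSMCE module and classification model purely to show which parameters receive gradient updates; all shapes and names here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((16, 10))   # toy batch of 16 "MS vectors"
y = rng.standard_normal((16, 1))    # toy targets
lr = 0.01

# Blue path: x -> MSMCE (W_m) -> classifier (W_c); the loss gradient reaches both.
W_m = rng.standard_normal((10, 10)) * 0.1
W_c = rng.standard_normal((10, 1)) * 0.1
h = x @ W_m
err = h @ W_c - y                        # d(loss)/d(pred) for 0.5 * MSE
grad_Wc = h.T @ err / len(x)
grad_Wm = x.T @ (err @ W_c.T) / len(x)   # chain rule carries the gradient into MSMCE
W_c -= lr * grad_Wc
W_m -= lr * grad_Wm                      # blue path: MSMCE is optimized jointly

# Red path: x -> classifier (V_c) only; the gradient updates the classifier alone.
V_c = rng.standard_normal((10, 1)) * 0.1
err2 = x @ V_c - y
V_c -= lr * (x.T @ err2 / len(x))        # red path: no feature-transform parameters
```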
Table 2.
Comparison of classification performance of different models on the NSCLC dataset.
Table 3.
Comparison of classification performance of different models on the CRLM dataset.
Table 4.
Comparison of classification performance of different models on the RCC dataset.
Fig 5.
Comparison of confusion matrices for the Transformer and MSMCE-Transformer on three binary classification datasets.
This figure presents a comparative analysis of the confusion matrices for the baseline Transformer model (top row) and the MSMCE-Transformer model (bottom row) across the NSCLC, CRLM, and RCC datasets. The confusion matrices for the baseline Transformer clearly reveal that the model predicts all samples as a single class, resulting in extremely poor discriminative performance. In stark contrast, the confusion matrices for the MSMCE-Transformer model show a high concentration of values along the diagonal, indicating high numbers of true positives and true negatives.
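The two failure/success patterns described above can be reproduced with a minimal confusion-matrix computation. The labels and predictions below are toy data, not the paper's results: a degenerate predictor that collapses to one class piles counts into a single column, while an accurate predictor concentrates counts on the diagonal.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Count (true class, predicted class) pairs into an n_classes x n_classes grid."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

y_true = np.array([0, 0, 1, 1, 1, 0])

# All samples predicted as class 0 (like the baseline Transformer's collapse).
collapsed = confusion_matrix(y_true, np.zeros_like(y_true))
print(collapsed)   # [[3 0]
                   #  [3 0]]  -> one column filled, class 1 never predicted

# Perfect predictions (the diagonal concentration seen for MSMCE-Transformer).
diagonal = confusion_matrix(y_true, y_true)
print(diagonal)    # [[3 0]
                   #  [0 3]]  -> all counts on the diagonal
```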
Table 5.
Comparison of classification performance of different models on the Canine Sarcoma dataset (Classes = 2).
Table 6.
Comparison of classification performance of different models on the Canine Sarcoma dataset (Classes = 12).
Table 7.
Ablation study results: Impact of different module combinations on ResNet-50 classification performance.
Fig 6.
Accuracy trends during training and validation in the ablation study.
This figure illustrates the training- and validation-set accuracy curves for ResNet-50 and its variants as MSMCE submodules are progressively introduced, highlighting the contribution of each component to model performance. Before the Channel Embedding module is introduced, the accuracy curves exhibit pronounced fluctuations; after Channel Embedding and Channel Concatenation are incorporated, the curves become notably more stable, indicating improved robustness and convergence.
Fig 7.
Radar chart of model training efficiency on the Canine Sarcoma (12-class) dataset.
This figure presents the computational efficiency of ResNet-50, DenseNet-121, EfficientNet-B0, LSTM, and Transformer, along with their corresponding MSMCE-enhanced versions. The MSMCE-enhanced models achieve significantly improved classification accuracy while maintaining lower computational cost (FLOPs) and smaller model sizes. Radar charts for training efficiency on the other datasets are provided in the supplementary materials.