MoGraphDRP: Multi-omics and graph fusion with bilinear attention for predicting drug sensitivity | PLOS One

Fig 1.

Overall architecture of the MoGraphDRP model for drug response prediction through the integration of cellular multi-omics data and composite drug structure representation.
(A) Cellular omics data are processed separately through fully connected layers and ultimately fused into a compressed vector within the cell feature encoder module. Simultaneously, drug representations are extracted through two pathways: the first employs a molecular graph derived from SMILES and a GCN-based graph network; the second compresses chemical fingerprints (Morgan, ESPF, and PubChem) within the drug feature encoder module. The outputs of both pathways are merged using an attention-based integration module to form the final drug representation.
(B) The final vectors of the drug and the cell are fed into the Bilinear Attention module. This module, leveraging multiple attention heads, models feature-level interactions between the two vectors and produces an enriched vector denoted as f, which represents the nonlinear drug–cell interaction. This vector is passed to a prediction network composed of multilayer perceptrons (MLPs) to estimate the IC50 value.
(C) In the final stage, the initial predicted output y, along with the interaction vector f, is passed into an XGBoost-based boosting model that acts as the final refinement module to enhance prediction accuracy.
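Stage (C) amounts to a stacking step: the interaction vector f and the initial prediction y become the input features of the boosting model. The snippet below is a minimal sketch of that feature assembly only, assuming f is stored as one row per drug–cell pair. The function name `stack_for_booster`, the array shapes, and the toy data are hypothetical; fitting the boosting model itself is left out.

```python
import numpy as np

def stack_for_booster(f, y_init):
    """Concatenate interaction vectors with initial predictions, row by row.

    f      : (n_pairs, k) interaction vectors (hypothetical layout)
    y_init : (n_pairs,)   initial IC50 predictions from the deep model
    Returns an (n_pairs, k + 1) feature matrix for a boosted regressor.
    """
    f = np.asarray(f, dtype=float)
    y_col = np.asarray(y_init, dtype=float).reshape(-1, 1)
    if f.shape[0] != y_col.shape[0]:
        raise ValueError("f and y_init must describe the same pairs")
    return np.hstack([f, y_col])

# Toy data: 5 drug-cell pairs with an 8-dimensional interaction vector (assumed sizes).
rng = np.random.default_rng(0)
f_toy = rng.normal(size=(5, 8))
y_toy = rng.normal(size=5)
features = stack_for_booster(f_toy, y_toy)
```

The resulting matrix, paired with the measured IC50 values as targets, is what a gradient-boosted regressor would be trained on in a setup of this kind.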

Fig 2.

Architecture of the drug graph module based on GCN.
First, the molecular graph with node feature vectors of dimension 78 is input into a three-layer graph neural network, where each layer employs message averaging, a ReLU activation function, and dropout with a rate of 0.4. Finally, using a mean pooling function, a fixed-length vector of dimension 128 is generated as the representation of the entire graph.
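The forward pass described in this caption can be sketched in plain NumPy. This is an illustrative sketch, not the authors' implementation: the self-loops in the averaging step, the hidden widths of 128, the random weights, and the four-atom toy graph are all assumptions; only the input dimension of 78, the three layers, the ReLU activation, the dropout rate of 0.4, and the 128-dimensional mean-pooled output come from the caption.

```python
import numpy as np

def mean_aggregate(adj):
    """Row-normalised adjacency with self-loops (assumed), so each node
    averages its own features with those of its neighbours."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)

def gcn_forward(x, adj, weights, drop_rate=0.4, rng=None):
    """Three-layer mean-aggregation GCN followed by mean pooling.

    Dropout is applied only when an rng is supplied (training mode).
    """
    a_hat = mean_aggregate(adj)
    h = x
    for w in weights:
        h = np.maximum(a_hat @ h @ w, 0.0)        # average, project, ReLU
        if rng is not None:                       # inverted dropout
            mask = rng.random(h.shape) >= drop_rate
            h = h * mask / (1.0 - drop_rate)
    return h.mean(axis=0)                         # mean pooling over atoms

rng = np.random.default_rng(0)
# Toy molecule: a chain of 4 atoms, each with a 78-dimensional feature vector.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 78))
dims = [78, 128, 128, 128]                        # hidden widths are assumed
weights = [rng.normal(scale=0.1, size=(dims[i], dims[i + 1])) for i in range(3)]

graph_vec = gcn_forward(x, adj, weights)          # inference: no dropout
```

Because pooling averages over atoms, the output length stays at 128 regardless of how many atoms the molecule has.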

Fig 3.

Structure of the Bilinear Attention module in the MoGraphDRP architecture.
First, the compressed vectors of the cell and the drug are mapped into a shared space through linear layers. Then, feature-level interactions between them are learned via multi-head bilinear attention using the module's weight matrices. The output of this module consists of two parts: (1) the predicted IC50 value for the drug–cell pair, and (2) the compressed interaction vector f, which contains the features extracted by the model. Both outputs are used as inputs for the final stage of the model, the ensemble network, to enhance prediction accuracy.
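One generic way to realise feature-level bilinear attention between two single vectors is sketched below. The caption does not give the exact formulation, so everything here is an assumption made for illustration: the scoring rule, the softmax over all feature pairs, the way each head is reduced to a vector, and the sizes (cell 64, drug 128, shared space 16, 4 heads). The names `w_cell`, `w_drug`, and `head_mats` are hypothetical.

```python
import numpy as np

def softmax_all(scores):
    """Softmax over every entry of a matrix (numerically stabilised)."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def bilinear_attention(cell, drug, w_cell, w_drug, head_mats):
    """Multi-head bilinear attention between one cell and one drug vector.

    cell, drug     : 1-D input vectors
    w_cell, w_drug : linear maps into a shared d-dimensional space
    head_mats      : list of (d, d) bilinear matrices, one per head
    Returns the concatenated head outputs as the interaction vector.
    """
    u = w_cell @ cell                      # cell in the shared space, (d,)
    v = w_drug @ drug                      # drug in the shared space, (d,)
    heads = []
    for m in head_mats:
        scores = np.outer(u, v) * m        # entry (i, j) is u_i * m_ij * v_j
        attn = softmax_all(scores)         # weights over feature pairs
        heads.append(u * (attn @ v))       # attended drug signal per cell feature
    return np.concatenate(heads)

rng = np.random.default_rng(0)
d, n_heads = 16, 4                         # assumed sizes
cell_vec = rng.normal(size=64)
drug_vec = rng.normal(size=128)
w_cell = rng.normal(scale=0.1, size=(d, 64))
w_drug = rng.normal(scale=0.1, size=(d, 128))
head_mats = [rng.normal(size=(d, d)) for _ in range(n_heads)]

f_vec = bilinear_attention(cell_vec, drug_vec, w_cell, w_drug, head_mats)
```

In a full model, a small MLP on top of this vector would produce the IC50 estimate; that part is omitted here.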

Table 1.

Implementation details of the MoGraphDRP method.

Fig 4.

Schematic of the data preparation process.

Table 2.

Performance comparison of the MoGraphDRP method with other state-of-the-art approaches on the benchmark dataset.

Fig 5.

(a) Comparison of the performance of different models based on MAE and RMSE metrics on the benchmark dataset. The MoGraphDRP model achieved lower error values in both metrics compared to other methods, indicating higher accuracy in predicting IC50 values.
(b) Performance comparison of different models based on SCC, PCC, and R² metrics. MoGraphDRP achieved both the highest linear correlation coefficient (PCC = 0.9689) and the highest explained variance (R² = 0.9388) among all methods.
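For reference, the five metrics named in this caption can be computed as in the standard-library sketch below. The definitions are the usual textbook ones (SCC taken as the Pearson correlation of average ranks); the toy values are invented for illustration and are not results from the paper.

```python
import math

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mt) ** 2 for a in y_true)
    var_p = sum((b - mp) ** 2 for b in y_pred)
    return cov / math.sqrt(var_t * var_p)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def scc(y_true, y_pred):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pcc(ranks(y_true), ranks(y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination (explained variance)."""
    mt = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mt) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Invented toy values, only to exercise the functions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
scores = {name: fn(y_true, y_pred)
          for name, fn in [("MAE", mae), ("RMSE", rmse), ("PCC", pcc),
                           ("SCC", scc), ("R2", r2)]}
```

Lower is better for MAE and RMSE, while PCC, SCC, and R² improve as they approach 1.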

Fig 6.

Loss curve over 200 epochs.

Fig 7.

Pearson Correlation Coefficient (PCC) trend on the validation set during training.

Fig 8.

Distribution of predicted IC50 values for the top 10 sensitive and top 10 resistant drugs.
This pattern demonstrates the model’s capability to distinguish between active and inactive compounds, even in the absence of direct experimental data.

Fig 9.

Heatmap of a subset of the predicted IC50 matrix for 30 drugs and 30 cell lines.
The color variation indicates the level of cell sensitivity to different drugs. Darker colors represent stronger responses (lower IC50 values).

Fig 10.

Feature importance scores (F score) of the XGBoost ensemble model.
Features fN represent dimensions of the Bilinear Attention output vector, and predicted IC50 is the preliminary deep model prediction.

Table 3.

Performance comparison of the MoGraphDRP method with different GNN architectures.

Table 4.

Performance comparison of the MoGraphDRP method with variations in the drug fingerprint data.

Table 5.

Performance comparison of the MoGraphDRP method with variations in cell line data.

Table 6.

Performance comparison of the MoGraphDRP method with architectural variations.

Table 7.

Comparison of MoGraphDRP performance on the independent test set before and after ensemble.
