Graph Former-CL: A novel graph transformer with contrastive learning framework for enhanced drug-drug interaction prediction | PLOS One

Fig 1.

The training process begins with molecular augmentation to generate diverse views of drug structures, followed by Graph Transformer encoding with hierarchical pooling.
Contrastive learning maximizes similarity between augmented pairs, while cross-modal fusion integrates graph and sequence features. The inference phase applies the trained model to predict interactions for new drug pairs.

Fig 2.

Overview of Graph Former-CL architecture.
The framework processes Drug A (molecular graph) and Drug B (SMILES sequence) through parallel Graph Transformer and Contrastive Learning pathways. The Graph Transformer incorporates spatial encoding, multi-head attention, hierarchical pooling, and feature extraction. The Contrastive Learning module applies domain-specific augmentations, including atom masking, bond perturbation, scaffold hopping, and subgraph sampling. Cross-Modal Fusion integrates the two representations using cross-attention and adaptive fusion mechanisms. DDI Prediction then encodes the drug pair and passes it to an MLP classifier, which outputs interaction probability scores.

Fig 3.

Spatial encoding mechanism for molecular graphs.
The left panel shows an example molecular graph with atoms and bonds. The center panel displays the spatial encoding matrix computed from chemical distances between atoms, where colors represent different distance values (warmer colors indicate closer chemical distances). The right panel illustrates the multi-head attention mechanism incorporating spatial encoding, where Q (queries), K (keys), and V (values) are processed through H parallel attention heads with spatial bias S, followed by concatenation and a linear transformation to produce the final output. Each attention head independently computes position-aware attention using learnable projections, enabling the model to capture diverse chemical interaction patterns simultaneously.
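
The attention computation described in this caption can be illustrated with a minimal sketch. It assumes the common formulation softmax(QK^T / sqrt(d) + S)V for each head and assumes a bias of the form S = -distance, so that closer atoms receive more attention; neither choice is confirmed by the caption. The names `spatial_attention_head` and `multi_head` are hypothetical, the learnable projections are replaced by identity inputs, and the final linear transformation is omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention_head(Q, K, V, S):
    """One head: softmax(Q K^T / sqrt(d) + S) V.

    Q, K, V are lists of per-atom vectors; S[i][j] is the spatial bias
    for the atom pair (i, j). The additive placement of S is an assumption.
    """
    d = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) + S[i][j]
            for j, k in enumerate(K)
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        )
    return out


def multi_head(heads_qkv, S):
    """Run H heads with a shared bias S and concatenate per-atom outputs."""
    per_head = [spatial_attention_head(Q, K, V, S) for Q, K, V in heads_qkv]
    n_atoms = len(per_head[0])
    return [[x for head in per_head for x in head[i]] for i in range(n_atoms)]


# Toy 3-atom chain 0-1-2 with hop distances; bias = -distance (assumed form).
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
S = [[-float(hops) for hops in row] for row in dist]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in atom features
out = multi_head([(X, X, X), (X, X, X)], S)  # H = 2 heads, output width 4
```

In a trained model each head would apply its own learned projections to obtain Q, K, and V, which is what lets different heads attend to different interaction patterns.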

Fig 4.

Domain-specific molecular augmentation strategies and contrastive learning framework.
The top panel shows four augmentation techniques applied to an original molecule: atom masking (removing non-essential atoms), bond perturbation (modifying bond types), subgraph sampling (extracting functional groups), and scaffold hopping (replacing molecular scaffolds while preserving pharmacophores). The middle panel displays the contrastive loss computation using a similarity matrix between original and augmented representations. The bottom panel illustrates the three-phase training protocol with its performance gains; the ensemble strategy yields a combined improvement of +2.34%.
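
A contrastive loss over a similarity matrix can be sketched as follows. The caption does not name the exact loss, so this example assumes an InfoNCE-style objective with cosine similarity and a temperature of 0.1; the function names and the toy embeddings are hypothetical, and all vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(originals, augmented):
    """sim[i][j] = cosine(original i, augmented j)."""
    return [[cosine(u, v) for v in augmented] for u in originals]


def info_nce(originals, augmented, temperature=0.1):
    """Mean InfoNCE loss; the positive for row i is column i (assumed pairing)."""
    sim = similarity_matrix(originals, augmented)
    losses = []
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        m = max(logits)
        # log-sum-exp over all candidates, computed stably
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: each augmented view stays close to its own original.
orig_views = [[1.0, 0.0], [0.0, 1.0]]
aug_views = [[0.9, 0.1], [0.1, 0.9]]
swapped_views = [aug_views[1], aug_views[0]]  # deliberately mismatched pairs

loss_aligned = info_nce(orig_views, aug_views)
loss_swapped = info_nce(orig_views, swapped_views)
```

Under these assumptions the loss is small when each augmented view is most similar to its own original and grows when the pairing is broken, which is the behavior that "maximizing similarity between augmented pairs" relies on.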

Table 1.

Dataset statistics.

Table 2.

Hyperparameter configuration.

Fig 5.

Performance comparison on the DrugBank dataset.
The top panel shows accuracy comparison across five methods (CNN-DDI, GMPNN, SSF-DDI, GraphCL, and Graph Former-CL), with Graph Former-CL achieving 98.20% accuracy. Statistical significance tests confirm improvements with p < 0.001 against major baselines. The inductive setting results show Graph Former-CL achieving 82.45% accuracy on novel drugs (+5.23% improvement). The bottom panel displays AUC performance comparison, with Graph Former-CL achieving the highest AUC of 99.34%.

Table 3.

Performance comparison on the DrugBank dataset.

Table 4.

Performance comparison on TWOSIDES dataset.

Table 5.

Statistical significance tests (p-values).

Table 6.

Inductive setting results (unknown drugs).

Fig 6.

Component contribution analysis and ablation study results.
The left panel shows performance degradation when removing individual components, with the full model achieving 98.20% accuracy. The right panel ranks component importance by performance drop when removed, identifying Cross-Modal Fusion (−1.28%) and Contrastive Learning (−0.86%) as critical components. The analysis reveals that components work together beyond individual contributions, achieving +1.12% total synergy when all components are combined.
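
The ablation arithmetic in this caption can be reproduced with a short sketch. It assumes the reported drops are in percentage points relative to the full model's 98.20% accuracy; the variable and function names are hypothetical, and only the two components quoted in the caption are included.

```python
FULL_ACCURACY = 98.20  # full model accuracy (%), from the caption

# Accuracy drop in percentage points when a component is removed (caption values).
DROPS = {
    "Cross-Modal Fusion": 1.28,
    "Contrastive Learning": 0.86,
}


def ablated_accuracy(full, drop):
    """Accuracy of the model variant with one component removed."""
    return round(full - drop, 2)


# Rank components by how much accuracy is lost without them (largest first).
ranked = sorted(DROPS.items(), key=lambda item: item[1], reverse=True)

for name, drop in ranked:
    print(f"{name}: {ablated_accuracy(FULL_ACCURACY, drop):.2f}% (-{drop:.2f})")
```

Under that assumption, removing Cross-Modal Fusion leaves 96.92% accuracy and removing Contrastive Learning leaves 97.34%.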

Table 7.

Ablation study results on DrugBank.

Table 8.

Computational performance comparison.

Fig 7.

Cross-dataset transfer learning performance analysis.
The transfer performance matrix shows accuracy percentages for different source-target dataset combinations, with color coding indicating transfer quality (red: 90-95% excellent, green: 85-90% good). The best transfer is achieved from DeepDDI to DrugBank (93.21% accuracy). The legend shows that Graph Former-CL maintains consistent performance across different domain characteristics and molecular diversity levels.

Table 9.

Cross-dataset transfer learning results.

Fig 8.

Attention mechanism analysis and molecular interpretation.
The left panel shows attention weight heatmaps for different atom pairs. The center panel displays the top attended substructures: benzene ring (0.847 attention, CYP450 binding relevance) and carboxyl group (0.823 attention, metabolic site relevance). The right panel shows mechanism-specific accuracy scores for different DDI types, with CYP450 inhibition achieving 97.8% accuracy. The bottom panel demonstrates a pharmacophore recognition example using the ketoconazole-midazolam interaction, showing a predicted CYP3A4 inhibition mechanism with 0.89 confidence, which has been clinically validated.

Table 10.

Top 10 substructures by attention weight.

Table 11.

Impact of different augmentation strategies.

Table 12.

Analysis of prediction errors.

Table 13.

Detected interaction mechanisms.

Fig 9.

Clinical application case studies for COVID-19 and elderly polypharmacy scenarios.
The top panel shows the COVID-19 drug interaction network with risk assessment levels: high risk (>0.8, red), medium risk (0.5-0.8, orange), and low risk (<0.5, green). Key interactions include Remdesivir-Lopinavir (0.89 high risk) and Dexamethasone-Warfarin (0.92 high risk). The bottom panel demonstrates novel drug candidate DDI predictions, showing the model processing new drug structures through Graph Former-CL to output risk scores and mechanisms, achieving 95% ± 2% model confidence for novel drug predictions including Aducanumab (0.78) and Sotorasib (0.91).
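
The risk levels in this caption amount to a simple thresholding of the predicted interaction score, sketched below. The thresholds come from the caption; the function name `risk_level` is hypothetical, and treating a score of exactly 0.8 as medium risk is an assumption, since the caption lists 0.8 only as the upper end of the medium band.

```python
def risk_level(score):
    """Map a predicted interaction score in [0, 1] to a risk band.

    Bands follow the caption: high (> 0.8), medium (0.5-0.8), low (< 0.5).
    Boundary handling at exactly 0.8 and 0.5 is an assumption of this sketch.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score > 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


# Scores quoted in the caption.
examples = {
    "Remdesivir-Lopinavir": 0.89,
    "Dexamethasone-Warfarin": 0.92,
    "Aducanumab": 0.78,
    "Sotorasib": 0.91,
}
levels = {name: risk_level(score) for name, score in examples.items()}
```

With these thresholds, the Aducanumab score of 0.78 falls in the medium band, while the other three quoted scores are high risk.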

Table 14.

COVID-19 drug interaction predictions.

Table 15.

Common polypharmacy interactions detected.

Table 16.

Novel drug interaction predictions.
