Fig 1.
Protein–protein interactions (PPIs) can emerge through two principal mechanisms: similarity and complementarity.
(A) In social networks, individuals with similar attributes tend to connect, a phenomenon termed homophily. (B) Analogously, in PPI networks, both similar and complementary proteins tend to interact. (C) A Q-Q plot, a statistical tool for comparing distributions, confirms this pattern: many points deviate from the straight reference line, indicating that interacting protein pairs do not follow a single, simple rule but instead represent a mix of interaction types. (D) Clustering analysis further separates the binding protein pairs into two groups with markedly different levels of structural dissimilarity. These groups correspond to interactions driven primarily by structural similarity and by complementarity, respectively.
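As a rough illustration of how the points in a Q-Q plot such as panel (C) are obtained, the sketch below pairs each sorted observation with a reference quantile. It assumes a standard normal reference distribution and the plotting positions (i + 0.5) / n; the caption does not state which reference or positions were used, and the function name `qq_points` is hypothetical.

```python
from statistics import NormalDist

def qq_points(sample):
    """Pair each sorted observation with the matching reference quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    reference = NormalDist()  # assumed reference: standard normal
    return [
        (reference.inv_cdf((i + 0.5) / n), value)  # (theoretical, observed)
        for i, value in enumerate(ordered)
    ]

# Made-up dissimilarity values, for illustration only.
points = qq_points([0.3, 0.1, 0.2])
```

If the observations followed the reference distribution up to location and scale, these points would fall close to a straight line; systematic departures suggest a mixture or a different distribution.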
Fig 2.
(A) Each protein is first converted into a structural representation derived from its contact map and covalent connectivity.
These structural features are encoded into compact protein embeddings using a pre-trained graph neural network autoencoder. In simple terms, this step transforms complex structural information into a learnable numerical representation of each protein. (B) The adaptive fusion block integrates different sources of structural information using a gating mechanism, which automatically determines how much weight to assign to each feature type depending on the context. (C) The core of DMG-PPI models two complementary principles of protein interactions. AMP captures similarity-based relationships, where proteins with similar structural patterns tend to interact. DMP captures complementarity-based relationships, where structurally distinct but functionally compatible proteins interact. The outputs of these channels are combined to form refined protein representations. Finally, the interaction between two proteins is quantified by combining their representations into a pair-level descriptor, which is used to predict whether the proteins interact.
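The gating idea in panel (B) can be sketched in a few lines. This is a minimal illustration, assuming a single scalar sigmoid gate computed from the concatenated features; the actual fusion block in DMG-PPI may use vector-valued gates or a different parameterization, and the names `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(feat_a, feat_b, gate_weights, gate_bias=0.0):
    """Blend two equal-length feature vectors with a context-dependent gate.

    The gate g in (0, 1) is computed from both inputs, so the mixing
    weight changes with the features themselves (assumed form).
    """
    concat = list(feat_a) + list(feat_b)
    score = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    g = sigmoid(score)
    fused = [g * a + (1.0 - g) * b for a, b in zip(feat_a, feat_b)]
    return g, fused

# With all-zero gate weights the gate is 0.5 and the result is a plain average.
g, fused = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0, 0.0, 0.0])
```

In a trained model the gate weights would be learned, so that the gate moves toward 1 when the first feature source is more informative and toward 0 otherwise.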
Fig 3.
Comparative performance of DMG-PPI and benchmark methods.
(A) Heatmaps showing the performance of each method across the following evaluation metrics: Micro-F1, AUPRC, LRAP, NDCG, AUROC, and overall z-score. Each row shows the performance of the corresponding method; darker colors indicate better performance. (B) Precision–recall curves for PPI prediction on SHS27K, comparing the top five methods; shaded areas represent the range between the highest and lowest results. (C) Robustness of DMG-PPI to label perturbation on SHS27K, evaluated using precision–recall curves at different perturbation ratios.
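One common way to obtain an overall z-score like the one in panel (A) is to standardize each metric across methods and then average the standardized values per method. The sketch below assumes exactly that recipe with a population standard deviation; the caption does not specify how the score was computed, and the method names and values are made up.

```python
from statistics import mean, pstdev

def overall_z_scores(scores):
    """scores: {method: {metric: value}} -> {method: mean z-score over metrics}."""
    methods = list(scores)
    metrics = list(next(iter(scores.values())))
    totals = {m: 0.0 for m in methods}
    for metric in metrics:
        values = [scores[m][metric] for m in methods]
        mu, sigma = mean(values), pstdev(values)
        for m in methods:
            # A metric on which all methods tie carries no ranking information.
            z = 0.0 if sigma == 0 else (scores[m][metric] - mu) / sigma
            totals[m] += z
    return {m: totals[m] / len(metrics) for m in methods}

# Hypothetical methods and values, for illustration only.
summary = overall_z_scores({
    "method_a": {"micro_f1": 0.8, "auroc": 0.9},
    "method_b": {"micro_f1": 0.6, "auroc": 0.7},
})
```

Averaging z-scores puts metrics with different ranges on a common scale, so no single metric dominates the overall ranking.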
Fig 4.
Protein–protein interaction (PPI) examples that are correctly predicted by DMG-PPI but not by HIGH-PPI or TUnA.
The corresponding PPI complex structures are predicted using GRAMM and visualized with ChimeraX. (A) O15264 (Mitogen-activated protein kinase 13) and P51452 (Dual specificity protein phosphatase 3); (B) P11233 (Ras-related protein Ral-A) and Q15208 (Serine/threonine-protein kinase 38); (C) P11597 (Cholesteryl ester transfer protein) and P02647 (Apolipoprotein A-I); (D) P01033 (Metalloproteinase inhibitor 1) and Q9H306 (Matrix metalloproteinase-27).
Fig 5.
The interpretability provided by DMG-PPI.
(A) Surface representation of a protein complex (UniProt IDs: O76528 and P41240). The likelihood of a residue being a hotspot is visualized using a color gradient from low (green) to high (red). (B) DMG-PPI accurately identifies the interaction mode between proteins. Dashed lines represent the predictions computed by DMG-PPI, while solid lines represent the ground truth. The mode of each PPI is indicated by edge color, which varies along a spectrum from blue (similarity-driven) to red (complementarity-driven).
Fig 6.
Performance comparison across message pathways.
(A) The Micro-F1 of each variant on SHS27K and SHS148K. (B) The AUPR across PPI types on SHS27K; the x-axis labels denote the PPI types and their corresponding percentages. (C) Patterns of the top-500 PPIs predicted by each variant. The x-axis represents the absolute value of the degree difference between the interacting proteins, while the y-axis indicates the percentage.
Fig 7.
Directional influence among PPI types.
Each row corresponds to a specific PPI type, while each column represents the type influencing it. The value at row i and column j indicates how strongly type j contributes to type i, as learned by DMG-PPI. Arrows mark influences stronger than 0.2. Overall, the map highlights directional relationships among interaction types and reflects known biological mechanisms.
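To make the row and column convention concrete, the sketch below extracts the directed influences that would receive an arrow. It assumes a strict comparison against the 0.2 threshold and skips diagonal (self-influence) entries; neither detail is stated in the caption, and the type labels and matrix values are placeholders.

```python
def strong_influences(matrix, labels, threshold=0.2):
    """Return (source, target, weight) for entries above the threshold.

    matrix[i][j] is read as "type j contributes to type i", so the
    source is the column label and the target is the row label.
    """
    arrows = []
    for i, row in enumerate(matrix):
        for j, weight in enumerate(row):
            if i != j and weight > threshold:  # assumption: ignore self-influence
                arrows.append((labels[j], labels[i], weight))
    return arrows

# Placeholder labels and values, for illustration only.
labels = ["type_a", "type_b", "type_c"]
matrix = [
    [0.50, 0.30, 0.10],
    [0.05, 0.60, 0.25],
    [0.20, 0.00, 0.70],
]
arrows = strong_influences(matrix, labels)
```

Here the entry at row 0, column 1 yields an arrow from type_b to type_a, matching the reading that columns influence rows.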
Fig 8.
Distribution of Euclidean distances between node embeddings for edges in motif-rich versus motif-poor regions.
The similarity of distances across regions suggests that DMG-PPI learns representations largely independent of local motif density.
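The quantity plotted here can be computed as follows. This is a minimal sketch, assuming embeddings are stored per protein and that edges have already been assigned to motif-rich or motif-poor regions; how that assignment is made is not described in the caption, and all identifiers and values below are hypothetical.

```python
import math
from statistics import mean

def edge_distances(embeddings, edges):
    """Euclidean distance between the two endpoint embeddings of each edge."""
    return [math.dist(embeddings[u], embeddings[v]) for u, v in edges]

# Toy embeddings and edge groups, for illustration only.
embeddings = {"p1": [0.0, 0.0], "p2": [3.0, 4.0], "p3": [0.0, 1.0]}
motif_rich_edges = [("p1", "p2")]
motif_poor_edges = [("p1", "p3")]

rich = edge_distances(embeddings, motif_rich_edges)
poor = edge_distances(embeddings, motif_poor_edges)
rich_mean, poor_mean = mean(rich), mean(poor)
```

Comparing the two resulting distance distributions, for example with histograms or summary statistics, gives the kind of comparison shown in the figure.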