Figure 1.
Dependence of Kd on various single biophysical features.
(A) Change in the accessible interface surface area (ΔASA); (B) ΔASA normalized to the total interface area; (C) percentage of the change in accessible surface area that is non-polar; (D) the total number of interfacial H bonds; (E) the number of intermolecular interfacial H bonds; (F) the number of intramolecular H bonds; (G) van der Waals energy; (H) volume of cavities; (I) number of hotspots; (J) electrostatic Coulombic energy; (K) RMSD between bound and unbound structures for interface Cα atoms; (L) percentage of rotamers that do not change conformation upon binding. Each point represents one PDB file in the database, and the line is a linear fit to all data points in the database.
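The linear fit drawn in each panel, and the R-value associated with it, could be computed along the following lines. This is a minimal sketch, not the authors' script: the function name `linear_fit_with_r` and the sample values are hypothetical, and whether the affinity axis holds Kd or a transform of it (such as log Kd) is an assumption left to the caller.

```python
import numpy as np

def linear_fit_with_r(feature, affinity):
    """Fit a straight line to (feature, affinity) points and report Pearson's R.

    `affinity` can be Kd or any transform of it; the caption does not say
    which scale is plotted, so that choice is left to the caller.
    """
    x = np.asarray(feature, dtype=float)
    y = np.asarray(affinity, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree 1 = straight line
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation coefficient
    return slope, intercept, r

# Made-up illustrative values, NOT data from the database.
delta_asa = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0]
log_kd = [-6.0, -7.1, -7.9, -9.2, -9.8]
slope, intercept, r = linear_fit_with_r(delta_asa, log_kd)
```

One call per panel, with the same affinity vector and a different feature vector, would give the twelve fitted lines.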
Figure 2.
Amino acid interface propensities.
(A) Amino acid propensities to be in an interface relative to the protein surface, calculated according to [5]. (B) Amino acid propensities for high-affinity (black) and low-affinity (grey) complexes.
Figure 3.
Improvement in R-value for high-resolution structures.
Bar plot displaying the correlation (R-value) between different biophysical features and Kd when using only high-resolution structures (red bars) and when using all structures (grey bars).
Figure 4.
Receiver operating characteristic (ROC) analysis.
Each graph shows the true positive rate vs. the false positive rate in discriminating high- from low-affinity PPIs (red line), medium- from low-affinity PPIs (green line) and high- from medium-affinity PPIs (blue line) for one feature. Each point represents a particular cut-off value used to discriminate between the two groups. Features included in the figure are (A) ΔASA; (B) ΔASA/ASA; (C) van der Waals energy; (D) the total number of interfacial H bonds; (E) the number of intermolecular interfacial H bonds; (F) the number of intramolecular H bonds; (G) percentage of rotamers that do not change conformation upon binding; and (H) the number of hotspots.
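The cut-off sweep behind such a curve can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for the figure: the names `roc_points` and `auc` are hypothetical, a complex is called positive when its feature value is at or above the cut-off (the direction would be flipped for features where lower values indicate higher affinity), and the area under the curve is taken with the trapezoidal rule.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr), one point per candidate cut-off.

    labels: 1 for the positive group (e.g. high affinity), 0 for the other.
    Assumption: a complex is predicted positive when score >= cut-off.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Start above every score so the curve begins at (0, 0),
    # then lower the cut-off through each distinct score.
    cutoffs = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    tpr = np.array([((scores >= c) & (labels == 1)).sum() / n_pos for c in cutoffs])
    fpr = np.array([((scores >= c) & (labels == 0)).sum() / n_neg for c in cutoffs])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up feature values for four complexes; the first two are "positive".
fpr, tpr = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
area = auc(fpr, tpr)
```

Running the same sweep on each pair of affinity groups would give the three coloured curves of a panel.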
Figure 5.
Incorporating more features in the prediction improves the correlation with Kd and the ROC performance.
The best possible weights for combining the features into one equation were obtained using a linear fit to the experimental data. The x-axis shows the number of features used to predict Kd and to discriminate between the two groups. The y-axis shows the best value obtained for each number of features used in the equation. The analysis was performed on all structures in the database (filled circles) and on high-resolution structures only (red stars). (A) AUC values for discriminating high- vs. low-affinity (red), medium- vs. low-affinity (green) and medium- vs. high-affinity (blue) PPIs. (B) Pearson's correlation coefficient for the full dataset (filled circles) and for high-resolution structures only (red stars).
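One way to obtain "the best value for each number of features" is an exhaustive search over feature subsets, fitting linear weights for each subset by least squares. The sketch below assumes that approach; the legend does not state how subsets were chosen, so the search strategy, the function name `best_subset_r` and the synthetic data are all hypothetical.

```python
from itertools import combinations
import numpy as np

def best_subset_r(X, y, k):
    """Try every k-feature subset; return (best_r, best_columns, weights).

    X: (n_complexes, n_features) feature matrix; y: affinity values.
    The weights come from an ordinary least-squares fit with an intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    best = (-np.inf, None, None)
    for cols in combinations(range(X.shape[1]), k):
        A = np.column_stack([X[:, list(cols)], np.ones(len(y))])  # features + intercept
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = np.corrcoef(A @ w, y)[0, 1]  # Pearson's R, predicted vs. observed
        if r > best[0]:
            best = (r, cols, w)
    return best

# Synthetic data: y depends on features 0 and 2 only (illustration, not real data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 4))
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 2] + 0.05 * rng.normal(size=60)
r1, cols1, _ = best_subset_r(X_demo, y_demo, 1)
r2, cols2, _ = best_subset_r(X_demo, y_demo, 2)
```

Because the weights are fitted on the same data they are scored on, the best in-sample R can only stay level or rise as features are added, so an upward trend along the x-axis is expected from this procedure alone.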