Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets
Fig 2
Scoring protein interactions by their conditional dependence accurately recovers direct protein-protein contacts within multi-subunit complexes.
A. We compared pairwise Pearson correlation coefficients between protein elution profiles (red curve) with the derived conditionally dependent interactions (i.e., direct contact predictions; black curve) for their ability to recapitulate true protein contacts in 10 complexes with known 3D structures. High-scoring conditionally dependent interactions were strongly enriched for true contacts, whereas the most highly correlated protein elution profiles were not. We also plot precision-recall curves for predictions made with alternative choices of λ (gray curves); these also outperform correlation alone, suggesting that performance is robust to the selection of this parameter. The random line (dashed) is the theoretical baseline: the number of true positives (TP) divided by the total number of possible subunit pairs (TP:335 / Total:1583).
B. Evaluation of conditionally dependent interactions on an additional 19 non-redundant complexes, showing consistent performance on a held-out set. Random = (TP:261 / Total:1575).
C. Evaluation on the combined 29 complexes used in A and B. Direct contact probability thresholds and correlation coefficient thresholds are marked in black and red text, respectively. Random = (TP:596 / Total:3158).
D. Distributions of the area under the precision-recall curve (PR AUC) for the 29 individual complexes. PR AUC varies widely across complexes, but direct contact predictions outperform both correlation and random.
Precision = TP/(TP+FP); recall = TP/(TP+FN).
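The random baselines and the precision and recall definitions in this legend can be reproduced with a few lines of Python. The sketch below is illustrative only and is not the authors' evaluation code: the function names `random_baseline` and `precision_recall_points` are hypothetical, the toy scores and labels are made up for demonstration, and tied scores are not treated specially.

```python
def random_baseline(true_contacts, total_pairs):
    """Expected precision of a random predictor: TP / total possible subunit pairs."""
    return true_contacts / total_pairs


def precision_recall_points(scores, labels):
    """Return (precision, recall) after each prediction, ranked by descending score.

    labels: 1 for a true direct contact, 0 otherwise.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one true contact is required to compute recall")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = []
    for _, is_contact in ranked:
        if is_contact:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)      # TP / (TP + FP)
        recall = tp / total_positives   # TP / (TP + FN)
        points.append((precision, recall))
    return points


# Baselines from the counts reported in panels A, B, and C.
baselines = {
    "A": random_baseline(335, 1583),
    "B": random_baseline(261, 1575),
    "C": random_baseline(596, 3158),
}

# Toy data (made up): four candidate subunit pairs, two of them true contacts.
toy_points = precision_recall_points([0.9, 0.8, 0.6, 0.4], [1, 0, 1, 0])
```

With the counts given in the legend, the baselines are approximately 0.212 (A), 0.166 (B), and 0.189 (C). The counts for C are the sums of those for A and B (335 + 261 = 596; 1583 + 1575 = 3158).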