
Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models

Fig 4

Inferred Potts couplings encode energetics and structural information about native and competitor folds, reflecting both positive and negative designs.

A. Values of Jij(a, b) (inferred from a MSA of structure SB with the Potts-ACE method) vs. −E(a, b) across all pairs of sites i, j and of amino acids a, b (found at least once in the MSA on those sites). Couplings and MJ energy parameters are shown in the consensus gauge, in which the entries attached to the most probable amino acid at each site are fixed to zero. Red symbols correspond to pairs (i, j) in contact, while blue symbols correspond to pairs not in contact.

B. Lower triangle: contact map cij of structure SB. Full blue squares correspond to pairs of sites i, j in contact. Green and red dots show, respectively, true and false positives among the 28 largest scores with the ACE method (Methods). Upper triangle: average contact map c̄ij, computed over all competitor folds weighted with their Boltzmann weights (Methods). The four missed contacts (all touching the central site 4) correspond to large c̄ij. Red squares locate the four false positives.

C. Pressure λij for each pair of sites (i, j), computed from Eq [2], vs. c̄ij for structure SB. The 195 pairs of sites which can never be in contact on any fold due to the lattice geometry are shown with magenta pluses. The 28 contacts on SB (red symbols) are partitioned into the Unique-Native (UN, 14 full triangles) and Shared-Native (SN, 14 empty triangles) classes, according to, respectively, their absence or presence in the closest competitor structure, SF (Fig 4D). The remaining 128 pairs of sites (blue symbols) are not in contact on SB, and are partitioned into the Closest-Competitor (CC, 14 full squares) and the Non-Native (NN, 114 empty squares) classes, according to, respectively, whether or not they are in contact in the closest competitor structure, SF. Similar results are found for SA, SC and SD, see Table 2 and Figs H, I, and J in S1 Text. As in Fig 4A, we use coupling and MJ entries expressed in the consensus gauge, since the consensus sequence corresponds to, or is close to, the best folding sequence, which is used as the reference sequence in our theoretical calculation of the pressure (S1 Text, Section III). Changing the gauge, e.g. to the least-probable gauge, affects the amplitudes of the pressures λij, but does not qualitatively alter the results.

D. Structure SF, the closest competitor structure to SB. Note that the four missed contacts (among the top 28 FAPC scores with the ACE method) are carried by the center of the cube (site i = 4 on SB and SF), see fold SB in Fig 2A and its contact map in Fig 4B. Two of the four false positives are contacts on SF, and are thus in the CC class.
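
The consensus gauge used in panels A and C can be illustrated with a short NumPy sketch. This is not the authors' code: the array layout `J[i, j, a, b]`, the integer-encoded alignment `msa`, and the helper names `consensus_sequence` and `to_consensus_gauge` are all assumptions made for illustration. The sketch shifts each coupling block by the standard gauge transformation J'ij(a, b) = Jij(a, b) − Jij(ci, b) − Jij(a, cj) + Jij(ci, cj), where ci is the most probable amino acid at site i, so that every entry involving a consensus amino acid becomes zero. The accompanying shift of the local fields is omitted.

```python
import numpy as np


def consensus_sequence(msa, q):
    """Most frequent amino acid (integer code 0..q-1) in each MSA column.

    msa: integer array of shape (n_sequences, L). Hypothetical input format.
    """
    counts = np.stack([np.bincount(col, minlength=q) for col in msa.T])  # (L, q)
    return counts.argmax(axis=1)


def to_consensus_gauge(J, consensus):
    """Return couplings in the gauge where J'[i, j, c_i, :] = J'[i, j, :, c_j] = 0.

    J: array of shape (L, L, q, q), assumed layout J[i, j, a, b].
    consensus: integer array of shape (L,), consensus amino acid per site.
    """
    L = J.shape[0]
    i = np.arange(L)[:, None]   # site index along rows
    j = np.arange(L)[None, :]   # site index along columns
    ci = consensus[:, None]     # consensus amino acid at site i
    cj = consensus[None, :]     # consensus amino acid at site j

    row = J[i, j, ci, :]        # (L, L, q): J_ij(c_i, b)
    col = J[i, j, :, cj]        # (L, L, q): J_ij(a, c_j)
    corner = J[i, j, ci, cj]    # (L, L):    J_ij(c_i, c_j)

    return J - row[:, :, None, :] - col[:, :, :, None] + corner[:, :, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, q = 5, 4                                # toy sizes, not those of the paper
    msa = rng.integers(0, q, size=(50, L))     # synthetic alignment
    J = rng.normal(size=(L, L, q, q))          # synthetic couplings
    c = consensus_sequence(msa, q)
    Jc = to_consensus_gauge(J, c)
    print("max |J'| on consensus rows:", np.abs(Jc[0, 1, c[0], :]).max())
```

Under the same assumptions, a site-independent energy matrix such as the MJ parameters E(a, b) could be put in this gauge by first broadcasting it to shape `(L, L, q, q)`, for example with `np.broadcast_to`, and then calling `to_consensus_gauge` on the result.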


doi: https://doi.org/10.1371/journal.pcbi.1004889.g004