Beauty Is in the Eye of the Beholder: Proteins Can Recognize Binding Sites of Homologous Proteins in More than One Way

Understanding the mechanisms of protein–protein interaction is a fundamental problem with many practical applications. The fact that different proteins can bind similar partners suggests that convergently evolved binding interfaces are reused in different complexes. A set of protein complexes composed of non-homologous domains interacting with homologous partners at equivalent binding sites was collected in 2006, offering an opportunity to investigate this point. We considered 433 pairs of protein–protein complexes from the ABAC database (AB and AC binary protein complexes sharing a homologous partner A) and analyzed the extent of physico-chemical similarity at the atomic and residue level at the protein–protein interface. Homologous partners of the complexes were superimposed using Multiprot, and similar atoms at the interface were quantified using a five-class grouping scheme and a distance cut-off. We found that the number of interfacial atoms with similar properties is systematically lower in the non-homologous proteins than in the homologous ones. We assessed the significance of the similarity by bootstrapping the atomic properties at the interfaces. We found that the similarity of binding sites is highly significant between homologous proteins, as expected, but generally not significant between the non-homologous proteins that bind to homologous partners. Furthermore, evolutionarily conserved residues are not colocalized within the binding sites of non-homologous proteins. We could only identify a limited number of cases of structural mimicry at the interface, suggesting that this property is less generic than previously thought. Our results support the hypothesis that different proteins can interact with similar partners using alternative strategies, but do not support convergent evolution.
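The similarity count and its significance test summarized above can be illustrated with a minimal sketch. It assumes the two interfaces are already superimposed and that each atom carries one of the five class labels. The function names, the cut-off value of 1.5, the number of resamples, and the choice to resample the class labels of one interface with replacement are illustrative assumptions, not the exact protocol of the study.

```python
import numpy as np

def count_similar(coords_a, classes_a, coords_b, classes_b, cutoff=1.5):
    """Count atoms of interface A that have an atom of the same class
    in interface B within `cutoff` (same unit as the coordinates)."""
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    classes_a = np.asarray(classes_a)
    classes_b = np.asarray(classes_b)
    # Pairwise distance matrix of shape (n_a, n_b).
    dist = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    same_class = classes_a[:, None] == classes_b[None, :]
    return int(np.any((dist <= cutoff) & same_class, axis=1).sum())

def resampling_p_value(coords_a, classes_a, coords_b, classes_b,
                       cutoff=1.5, n_resamples=1000, seed=0):
    """Estimate how often resampled class labels reach the observed count.

    Assumption: labels of interface B are resampled with replacement while
    coordinates stay fixed; the original procedure may differ.
    """
    rng = np.random.default_rng(seed)
    classes_b = np.asarray(classes_b)
    observed = count_similar(coords_a, classes_a, coords_b, classes_b, cutoff)
    hits = 0
    for _ in range(n_resamples):
        resampled = rng.choice(classes_b, size=classes_b.size, replace=True)
        if count_similar(coords_a, classes_a, coords_b, resampled, cutoff) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_resamples + 1)

# Toy data (hypothetical coordinates and labels).
coords_a = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
classes_a = [1, 2]
coords_b = [[0.5, 0.0, 0.0], [5.0, 0.0, 0.5], [10.0, 0.0, 0.0]]
classes_b = [1, 3, 2]
observed, p_value = resampling_p_value(coords_a, classes_a, coords_b, classes_b)
```

A small P-value here would mean that the observed number of matching atoms is rarely reached when the physico-chemical labels are drawn at random, which is the sense in which the similarity is called significant.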

The five atom groups used in the present study. We took the grouping obtained by Mintseris and Weng on a data set of 327 protein interfaces [2], and modified it so as to have an identical grouping for main-chain atoms. The five groups roughly represent: yellow: hydrophobic side chains; green: positively charged side chains; blue: side chains of polar residues, polar portions of TYR and TRP, and main-chain H-bond donors/acceptors; gray: negatively charged side chains; orange: ALA, CYS, other main-chain atoms, and non-polar portions of TYR and TRP.
Group  Pseudo-atoms
1      VAL SC, LEU SC, ILE SC, PHE SC1, PHE SC2, MET SC2
2      LYS SC1, LYS SC2, ARG SC2
3      All Cα, ALA SC, SER SC, THR SC, GLN SC1, GLN SC2, ASN SC1, ASN SC2, HIS SC2, TYR SC2, ARG SC1
4      GLU SC1, GLU SC2, ASP SC1, ASP SC2
5      MET SC1, PRO SC1, PRO SC2, TRP SC1, TRP SC2, CYS SC, HIS SC1, TYR SC1

Table 1: Grouping for coarse-grain representation. Here, SC refers to side-chain pseudo-atoms.

1: Category of the pairs of complexes; min: minimum number of superimposed elements; q 5%: 5% quantile of the distribution. A quantile equal to 10 means that 5% of the interfaces have fewer than 10 superimposed points (and consequently, 95% of the data are above this limit).

Figure 19: Distribution of P-values for the co-localization of conserved residues (residues with ConSurf score < -1) at protein-protein interfaces of ABAC pairs of all categories. The color scheme is the same as in Figure 4. White bars correspond to a number of conserved residues equal to zero.
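The coarse-grain grouping of Table 1 can be expressed as a simple lookup, shown here as a sketch. The group contents are copied from the table; the names `GROUPS`, `CA_GROUP` and `group_of`, and the use of the string "CA" to denote a Cα pseudo-atom, are assumptions made for illustration.

```python
# Side-chain pseudo-atoms per group, as listed in Table 1.
GROUPS = {
    1: ["VAL SC", "LEU SC", "ILE SC", "PHE SC1", "PHE SC2", "MET SC2"],
    2: ["LYS SC1", "LYS SC2", "ARG SC2"],
    3: ["ALA SC", "SER SC", "THR SC", "GLN SC1", "GLN SC2",
        "ASN SC1", "ASN SC2", "HIS SC2", "TYR SC2", "ARG SC1"],
    4: ["GLU SC1", "GLU SC2", "ASP SC1", "ASP SC2"],
    5: ["MET SC1", "PRO SC1", "PRO SC2", "TRP SC1", "TRP SC2",
        "CYS SC", "HIS SC1", "TYR SC1"],
}

# Table 1 places every C-alpha in group 3, whatever the residue type.
CA_GROUP = 3

# Inverted index: "RES PSEUDOATOM" -> group number.
PSEUDO_ATOM_TO_GROUP = {
    name: group for group, names in GROUPS.items() for name in names
}

def group_of(residue, pseudo_atom):
    """Return the Table 1 group of a pseudo-atom, e.g. ("LYS", "SC1")."""
    if pseudo_atom == "CA":  # assumed label for the C-alpha pseudo-atom
        return CA_GROUP
    return PSEUDO_ATOM_TO_GROUP[f"{residue} {pseudo_atom}"]
```

For example, `group_of("ARG", "SC1")` and `group_of("ARG", "SC2")` fall into different groups (3 and 2), reflecting that the table splits a residue's side chain into chemically distinct portions.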

Comparison between our results and Humphris and Kortemme [1]
There is a significant overlap between the data used in the present study and those used by Humphris and Kortemme [1]. They used 20 multi-specific proteins, seen in complex with different partners in 65 PDB structures. The 20 clusters overlap with our 433 pairs to the following extent:
• cluster 1 corresponds to 6 ABAC pairs,
• cluster 4 corresponds to 2 ABAC pairs,
• cluster 5 corresponds to 3 ABAC pairs,
• cluster 8 corresponds to 3 ABAC pairs,
• cluster 9 corresponds to 111 ABAC pairs,
• cluster 10 corresponds to 1 ABAC pair,
• cluster 13 corresponds to 7 ABAC pairs,
• cluster 17 corresponds to 1 ABAC pair,
• clusters 11, 12, 16 and 18 together correspond to 130 ABAC pairs.
Other clusters are not represented in our data set. The clusters cover 264 ABAC pairs. The correspondence is established by mapping the SCOP family of the promiscuous proteins from Humphris and Kortemme to the SCOP family of the A and A' domains of the ABAC pairs. Proteins from clusters 11 (Ran), 12 (Ras), 16 (Rac) and 18 (Cdc42) used by Humphris and Kortemme belong to the same SCOP family (52592, G proteins).
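As a quick consistency check, the per-cluster counts listed above can be tallied against the stated total. This is only an arithmetic sketch; the dictionary name and key strings are illustrative.

```python
# Number of ABAC pairs mapped to each Humphris and Kortemme cluster,
# as listed in the text (clusters 11, 12, 16 and 18 share one SCOP family).
pairs_per_cluster = {
    "1": 6,
    "4": 2,
    "5": 3,
    "8": 3,
    "9": 111,
    "10": 1,
    "13": 7,
    "17": 1,
    "11/12/16/18": 130,
}

total_covered = sum(pairs_per_cluster.values())  # ABAC pairs covered by the clusters
fraction_covered = total_covered / 433           # share of the 433 ABAC pairs
print(total_covered, round(fraction_covered, 2))
```

The sum reproduces the 264 pairs quoted in the text, which is roughly 61% of the 433 ABAC pairs; clusters 9 and 11/12/16/18 alone account for 241 of them.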
In Figure 22 A and B, we report the similarity P-values of all the ABAC pairs covered by the Humphris and Kortemme clusters, obtained using the Cα interface representation. The color code is the same as in previous figures: red = category A, orange = category M, blue = category E, cyan = category I, green = category S; white points correspond to domains with no conserved residues. Figures 22 C and D show P-value histograms for the two groups of clusters that correspond to a substantial number of ABAC pairs:
• cluster 9: elastase from [1], corresponding to ABAC pairs of SCOP family 50514 = eukaryotic proteases,
• clusters 11, 12, 16 and 18: Ran, Ras, Rac and Cdc42 from [1], corresponding to ABAC pairs of SCOP family 52592 = G proteins.
Cluster 9 belongs to group 1 delineated in [1], where all interaction partners probably share key interaction residues, and clusters 11, 12, 16 and 18 belong to group 2, where each binding partner prefers its own subset of wild-type residues within the promiscuous binding site. Our results are in good agreement with the distinction between these two groups: the distribution of similarity P-values on the A/A' domains is clearly dominated by low values for cluster 9 (Figure 22 C), whereas the trend is less clear for clusters 11/12/16/18 (Figure 22 D). This indicates that the A/A' binding sites are clearly similar within the pairs of cluster 9, and more variable within the pairs of clusters 11/12/16/18, in agreement with the group classification. Our results provide additional information on the similarity between the B and C binding sites.