Fig 1.
The definitions of ‘protrusions’ and ‘co-insertable protruding hydrophobes’.
Panel A shows a cartoon representation of the C2 domain of human phospholipase A2 (PDB ID: 1RLW), and panel B shows the convex hull for the same protein. Panel C shows the structure of the ENTH domain (PDB ID: 1H0A), which contains an amphipathic helix. The corresponding convex hull is shown in panel D. All Cα- and Cβ-atoms are shown as spheres. Hydrophobes are coloured orange. The convex hull for the Cα- and Cβ-atomic coordinates is shown in blue. All spheres visible on the convex hull representation are vertex residues. ‘Protrusions’ are defined as vertex residues with low local protein density, and are shown as large grey spheres. ‘Co-insertable protruding hydrophobes’ are protruding hydrophobes that are adjacent vertices of the convex hull, and are shown connected by orange lines. Small black spheres mark vertex residues that have high local density and therefore do not meet the criteria for protrusions.
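The definitions above can be illustrated with a minimal sketch. It is not the implementation used for the figures. It assumes one coordinate per residue (in Å), a brute-force hull that is only practical for small toy inputs in general position, a local density counted as the number of other points within 10 Å (1 nm), and the density cutoff d < 22 quoted in the legend of Fig 3. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def hull_edges(points, eps=1e-9):
    """Edges of the convex hull, found by brute force.

    A triple of points is a hull facet if every other point lies on one
    side of its plane. Assumes general position (no four coplanar hull
    points); cost grows as n**4, so this is for toy inputs only.
    """
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        normal = _cross(_sub(points[j], points[i]), _sub(points[k], points[i]))
        if _dot(normal, normal) < eps:
            continue  # collinear triple, no plane defined
        sides = [_dot(normal, _sub(p, points[i]))
                 for n, p in enumerate(points) if n not in (i, j, k)]
        if all(s < eps for s in sides) or all(s > -eps for s in sides):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def local_density(points, index, radius=10.0):
    """Number of other points within `radius` of point `index` (assumed definition)."""
    return sum(1 for n, p in enumerate(points)
               if n != index and dist(p, points[index]) <= radius)


def co_insertable_pairs(points, is_hydrophobe, max_density=22, radius=10.0):
    """Return (protrusions, pairs of adjacent protruding hydrophobes)."""
    edges = hull_edges(points)
    vertices = {v for edge in edges for v in edge}
    protrusions = {v for v in vertices
                   if local_density(points, v, radius) < max_density}
    pairs = sorted((a, b) for a, b in edges
                   if a in protrusions and b in protrusions
                   and is_hydrophobe[a] and is_hydrophobe[b])
    return protrusions, pairs


# Toy input: a tetrahedron with one buried point (index 4).
toy_points = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (0, 0, 10), (1, 1, 1)]
toy_hydrophobes = [True, True, False, False, True]
toy_protrusions, toy_pairs = co_insertable_pairs(toy_points, toy_hydrophobes)
```

In the toy input the buried point is hydrophobic but is not a hull vertex, so it is neither a protrusion nor part of a co-insertable pair. For real structures a dedicated convex hull routine would replace the brute-force search.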
Fig 2.
Hydrophobes are more common on protruding positions in peripheral proteins than in the reference sets.
The plots show frequencies of hydrophobes on surface amino acids, both on protrusions (A, C, E, G) and among all solvent-exposed amino acids (B, D, F, H) for peripheral proteins (blue) and the reference datasets (red). The horizontal axes show the mean fraction (Eq 1) of protrusions or solvent-exposed amino acids that are hydrophobic. The vertical axes show the fraction of protein families for each set. Plots A-D show the comparison between the datasets ‘Peripheral’ and ‘Non-binding surfaces’, and plots E-H the comparison between ‘Peripheral-P’ and ‘Reference Proteins’.
Fig 3.
On peripheral proteins (‘Peripheral’ dataset), protrusions in low-density regions are more often hydrophobes than on the ‘Non-binding surfaces’.
The plot shows the logarithm of the odds ratio (Eq 10) comparing the frequency of hydrophobes on ‘vertex residues’ in peripheral proteins and non-binding surfaces. Positive values reflect higher frequencies in the peripheral proteins. The horizontal axis shows the protein density d around the protrusion, measured as the number of Cα and Cβ atoms within 1 nm. Vertex residues are all on the convex hull, but only the vertex residues with d < 22 are protrusions. The leftmost bar with d < 7 corresponds mostly to chain termini. More precisely, the vertical axis shows the log odds ratio between A and B for vertex residues with d between l and u, where A denotes the dataset ‘Peripheral’, B the ‘Non-binding surfaces’, l and u denote the lower and upper limits of the ranges given on the horizontal axis, and d is the local protein density defined in ‘Materials and methods’. Error bars are 95% confidence intervals.
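As an illustration of the quantity plotted per density range, the following sketch computes a log odds ratio from two lists of vertex residues. It rests on several assumptions: the textbook definition of the odds ratio (Eq 10 itself is not reproduced in this legend), the natural logarithm, unweighted frequencies, and a half-open range l ≤ d < u. The function names and the toy data are hypothetical.

```python
from math import log


def hydrophobe_fraction(residues, lower, upper):
    """Fraction of hydrophobes among residues with lower <= d < upper.

    `residues` is a list of (density, is_hydrophobe) pairs.
    """
    in_range = [is_h for d, is_h in residues if lower <= d < upper]
    return sum(in_range) / len(in_range)


def log_odds_ratio(p_a, p_b):
    """Natural log of the standard odds ratio (assumed form of Eq 10)."""
    return log((p_a / (1 - p_a)) / (p_b / (1 - p_b)))


# Toy data: (local density d, is the vertex residue a hydrophobe?)
set_a = [(8, True), (12, True), (15, False), (20, False), (30, True)]
set_b = [(9, True), (10, False), (14, False), (18, False), (21, False)]

p_a = hydrophobe_fraction(set_a, 7, 22)  # 2 of 4 residues in range
p_b = hydrophobe_fraction(set_b, 7, 22)  # 1 of 5 residues in range
value = log_odds_ratio(p_a, p_b)         # log(4), positive: more frequent in A
```

The sketch fails for empty ranges and for fractions of exactly 0 or 1; any real analysis would need to handle those cases and to apply the weighting described in ‘Materials and methods’.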
Fig 4.
The ‘protruding hydrophobes’ tend to be ‘co-insertable’ in peripheral proteins.
Panel A shows the comparison between the datasets ‘Peripheral’ and ‘Non-binding surfaces’, and panel B the comparison between ‘Peripheral-P’ and ‘Reference Proteins’. The tendency for protrusions to be co-insertable is quantified by the weighted frequency of co-insertion (Eq 9), and is compared between each dataset and a null model using the odds ratio (Eq 10). Positive values reflect higher frequencies of co-insertion than in the null model. More precisely, we plot the comparison (Eq 10) of set against null, where set represents either the set of peripheral proteins (blue) or the corresponding reference set (red), and null represents the respective null model, in which hydrophobes have been relocated randomly among protrusions as described in ‘Materials and methods’. Error bars are 95% confidence intervals.
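The null model can be sketched as a label shuffle: the number of hydrophobes on a protein is kept, but their positions are redistributed at random among its protrusions. The example below is only an illustration. It uses a simplified, unweighted stand-in for the frequency of co-insertion (the fraction of protruding hydrophobes with at least one adjacent protruding hydrophobe), not Eq 9, and all names are hypothetical.

```python
import random


def co_insertion_frequency(is_hydrophobe, edges):
    """Fraction of hydrophobes with at least one adjacent hydrophobe.

    `is_hydrophobe[i]` labels protrusion i; `edges` lists pairs of
    protrusions that are adjacent vertices of the convex hull.
    """
    hydrophobes = {i for i, h in enumerate(is_hydrophobe) if h}
    if not hydrophobes:
        return 0.0
    paired = set()
    for a, b in edges:
        if a in hydrophobes and b in hydrophobes:
            paired.update((a, b))
    return len(paired) / len(hydrophobes)


def null_frequencies(is_hydrophobe, edges, n_samples=1000, seed=0):
    """Frequencies after randomly relocating hydrophobes among protrusions."""
    rng = random.Random(seed)
    labels = list(is_hydrophobe)
    samples = []
    for _ in range(n_samples):
        rng.shuffle(labels)  # keeps the number of hydrophobes fixed
        samples.append(co_insertion_frequency(labels, edges))
    return samples


# Toy protein: six protrusions arranged in a ring, two adjacent hydrophobes.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
labels = [True, True, False, False, False, False]

observed = co_insertion_frequency(labels, ring_edges)
null = null_frequencies(labels, ring_edges)
null_mean = sum(null) / len(null)
```

For this ring, 6 of the 15 possible placements of two hydrophobes are adjacent, so the null mean is expected to be close to 0.4, below the observed value of 1.0.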
Fig 5.
‘Co-insertable protruding hydrophobes’ are common in peripheral proteins and rare in the reference sets.
The plots show the occurrence of ‘co-insertable protruding hydrophobes’ on protein surfaces. Panels A-C show the comparison between the sets ‘Peripheral’ and ‘Non-binding surfaces’ and panels D-F the comparison between ‘Peripheral-P’ and ‘Reference Proteins’. Panels A, B, D, and E show the weighted fraction (Eq 5) of proteins that have protruding hydrophobes in the peripheral proteins (blue) and the reference sets (red). We differentiate here between protrusions that have at least one co-insertable protruding hydrophobe (labeled “Co-ins.”), and those that have not (labeled “isolated”). The analysis is done separately for two groups of proteins according to the total number of protrusions on the protein surface ([0, 25〉 in panels A and D, [25, 50〉 in panels B and E). Panels C and F show the frequency distribution of the total number of protruding residues (“# protrusions”) for all proteins. The selections analysed in panels A, B, D, and E are found between the dashed lines in panels C and F. Error bars in panels A, B, D, and E are 95% confidence intervals.
Fig 6.
Protruding hydrophobes are found on the membrane-binding sites of well-known membrane-binding domains.
The figure shows the convex hull (in blue) of the Cα- and Cβ-atoms of selected peripheral membrane-binding domains. The Cβ-atoms of ‘the Likely Inserted Hydrophobe’ are shown as orange spheres and the Cβ-atoms of experimentally identified membrane-binding residues as grey spheres. The Likely Inserted Hydrophobe is an amino acid that has been experimentally verified to be a membrane-binding residue for A, B, D and F. For C and E the Likely Inserted Hydrophobe is located in the same area as the residues identified by experiments. A: C2 domain of human phospholipase A2 (PDB ID: 1RLW [18]); B: PX domain of P40PHOX (PDB ID: 1H6H [19]); C: snake phospholipase A2 (PDB ID: 1POA [20]); D: C1 domain of protein kinase C δ (PDB ID: 1PTR [21]); E: Epsin ENTH domain (PDB ID: 1H0A [14]); F: FYVE domain of yeast vacuolar protein sorting-associated protein 27 (PDB ID: 1VFY [22]).
Fig 7.
Protruding hydrophobes predict experimentally verified binding sites.
The figure shows comparisons of predicted binding residues (‘the Likely Inserted Hydrophobe’) with experimentally verified binding sites for a manually curated dataset of 24 proteins (listed in S2 Table). The vertical axis corresponds to values of the angle (Eq 11) between the two vectors connecting the center of the protein with the predicted and the known binding sites, respectively. Smaller angles imply better agreement between prediction and experiment. Asterisks (*) mark proteins where the Likely Inserted Hydrophobe is an amino acid experimentally identified to be interacting with the membrane. The grey boxplots show the distribution of angles when the known binding site residues are compared to all protruding amino acids on the protein. 1iaz is analysed in its soluble monomeric state, although it forms a transmembrane pore upon oligomerisation. The structure of bovine α-lactalbumin (PDB ID: 1F6S) has no identified protruding hydrophobes and is marked with a cross at 180°.
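The angle used for this comparison can be illustrated as follows. The sketch assumes the usual arccosine of the normalised dot product and a single coordinate for each binding site; how the known binding site is reduced to one point (Eq 11) is not specified in this legend. The function name is hypothetical.

```python
from math import acos, degrees, sqrt


def angle_between_sites(center, predicted, known):
    """Angle in degrees between the vectors center->predicted and center->known."""
    u = [p - c for p, c in zip(predicted, center)]
    v = [k - c for k, c in zip(known, center)]
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    cos_theta = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return degrees(acos(max(-1.0, min(1.0, cos_theta))))


# Toy coordinates: perpendicular, identical and opposite directions.
perpendicular = angle_between_sites((0, 0, 0), (1, 0, 0), (0, 2, 0))
aligned = angle_between_sites((1, 1, 1), (2, 1, 1), (5, 1, 1))
opposite = angle_between_sites((0, 0, 0), (0, 0, 3), (0, 0, -1))
```

Only the directions from the protein center matter, not the distances, so a prediction on the same face of the protein as the known site gives a small angle even when the two sites are several residues apart.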
Fig 8.
Comparing predictions based on protruding hydrophobes with the predicted IBS in the Orientation of Proteins in Membranes (OPM) database.
The plots show the distributions of the median ‘insertion coordinate’ from OPM for ‘the Likely Inserted Hydrophobe’ in each family (measured at the Cα-atom, ‘Peripheral’ dataset). Values greater than or equal to zero correspond to atoms positioned in the hydrophobic core or at the boundary. Hence insertion coordinate values close to zero indicate agreement with OPM. Panels A and C show data for the Likely Inserted Hydrophobes and panels B and D for a null model of randomly selected ‘protruding’ residues. Panels C and D show cumulative histograms (accumulated with decreasing insertion coordinate).
Fig 9.
Hydrophobic protrusions in peripheral proteins are more frequent on turns, bends and α-helices than in the reference set (‘Non-binding surfaces’).
Panel A shows the weighted number (Eq 2) of ‘protruding hydrophobes’ associated with the different types of secondary structure elements. We have differentiated between protrusions that have at least one co-insertable protruding hydrophobe (right, labeled “Co-ins.”), and those that have not (left, labeled “Isolated”). Panel B compares the weighted frequencies (Eq 4) of hydrophobes on protruding secondary structures between the peripheral membrane proteins and the reference set using the odds ratio (Eq 10). Positive values reflect higher frequencies in the peripheral proteins. More precisely, panel A shows the values $N_{\mathrm{hydrophobe}\,|\,\mathrm{protrusion} \cap \mathrm{sse}}$ and panel B the corresponding comparisons (Eq 10) between A and B for each sse, where A denotes the peripheral proteins, B the reference set and sse specifies the secondary structure (see color legend). Error bars in panel B are 95% confidence intervals.
Fig 10.
Large aliphatic and aromatic side chains are over-represented on protrusions of peripheral proteins.
Panel A shows the weighted fractions (Eq 4) of hydrophobic amino acids on protrusions from peripheral proteins (blue) and from proteins in the reference set (red, ‘Non-binding surfaces’). In panel B, the contrast between the two sets is quantified by the odds ratio (Eq 10), so that positive values reflect higher frequencies in the set of peripheral proteins than in the reference set. More precisely, the vertical axis denotes this comparison for each amino acid aa, with aa representing each of the standard amino acids. Error bars are 95% confidence intervals.
Fig 11.
Differences in number of polypeptide chains between the protein models present in the dataset ‘Peripheral’ (quaternary structure model from OPM) and the models in ‘Peripheral-P’ (quaternary structure model predicted by PISA).
The difference is calculated for each of the PDB IDs occurring in both datasets. When more chains are present in the PISA models, the difference (horizontal axis) is negative.