Fig 1.
The outline of the comparative analyses.
A) For all domains of every SCOP class, pairwise structural alignments were created with TMalign (blue: helices; yellow: strands; red: unaligned regions). B) Structural alignments with a TM score below 0.5 were excluded from the analysis, and the remaining pairwise alignments were ordered according to the sequence similarity of the aligned structures. C) Structurally unaligned regions (red) were refined with Rascal, resulting in high-quality pairwise alignments. D) In the pairwise alignments, the secondary structure, RSA and contact density were determined for each residue.
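The filtering and grouping steps of panel B can be illustrated with a minimal sketch. The record layout `(pair_id, tm_score, seq_similarity_percent)` and the function name `bin_alignments` are hypothetical; only the TM score cutoff of 0.5 and the 10% bin width (used in Fig 2) come from the captions. How an alignment with exactly 100% similarity is binned is also an assumption.

```python
from collections import defaultdict

TM_SCORE_CUTOFF = 0.5  # cutoff stated in the Fig 1 caption
BIN_WIDTH = 10         # percent; bin width stated in the Fig 2 caption


def bin_alignments(alignments):
    """Drop low-TM-score alignments and group the rest into similarity bins.

    `alignments` is an iterable of (pair_id, tm_score, seq_similarity_percent)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    bins = defaultdict(list)
    for pair_id, tm_score, similarity in alignments:
        if tm_score < TM_SCORE_CUTOFF:
            continue  # alignments with TM score below 0.5 are excluded
        # Assumption: 100% similarity is folded into the top (90-100%) bin.
        lower = min(int(similarity // BIN_WIDTH) * BIN_WIDTH, 100 - BIN_WIDTH)
        bins[(lower, lower + BIN_WIDTH)].append(pair_id)
    return dict(bins)
```

Each key of the returned dictionary is a `(lower, upper)` bin in percent, so per-bin statistics can be computed by iterating over the bins in sorted order.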
Fig 2.
Alpha helices can accept more mutations than beta strands.
A-C) Secondary structure similarity of pairwise alignments as a function of sequence similarity. Pairwise alignments were grouped into 10% bins based on their sequence similarity. As sequences diverge, α-helices change significantly less than β-strands in the case of all-α, all-β and α+β domains. D-F) Because the buriedness of helices and strands differs greatly in α/β and α+β domains (see S1 Fig), we added relative solvent accessibility (RSA) as a covariate, using alignments with 10–20% similarity. When this correction is applied (i.e. the different levels of buriedness are taken into account), residues of helices are significantly more robust to mutations than residues of strands in all SCOP classes, except for the most buried residues with RSA < 0.1 (diamonds indicate significant differences, tests of proportions, p < 0.05 after Holm-Bonferroni correction).
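The significance marks in panels D-F rest on tests of proportions followed by a Holm-Bonferroni correction. The sketch below shows one way such a comparison could be carried out. The pooled two-sided z-test is an assumption, since the caption does not specify which test of proportions was used, and the function names are hypothetical.

```python
import math


def two_proportion_p(k1, n1, k2, n2):
    """Two-sided p-value of a pooled two-proportion z-test.

    Assumption: the caption only says "tests of proportions"; the pooled
    z-test is used here as one common choice.
    """
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # both proportions are 0 or 1; no detectable difference
    z = (k1 / n1 - k2 / n2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Step-down procedure: compare the (rank+1)-th smallest p-value
        # with alpha / (m - rank) and stop at the first failure.
        if p_values[idx] > alpha / (m - rank):
            break
        reject[idx] = True
    return reject
```

Here `k1` and `k2` would be, for example, the numbers of helix and strand residues that keep their secondary structure within one RSA bin, and `n1` and `n2` the corresponding totals; the p-values of all RSA bins are then passed together to `holm_bonferroni`.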
Fig 3.
The effect of the number of residue contacts on secondary structure similarity.
A) Residues in helices have significantly more non-covalent interactions than residues in strands (ANCOVA using all-α and all-β domains). Using the two regression lines between RSA and the number of inter-residue contacts of each residue, we excluded all helix residues with a higher-than-average number of contacts and all strand residues with a lower-than-average number of contacts, and subsequently determined secondary structure similarity with the remaining residues. B) When the remaining residue sets are used in all four SCOP classes, the difference in robustness between α-helices and β-strands disappears, or even reverses (stars indicate significant differences, tests of proportions, p < 0.05 after Holm-Bonferroni correction), indicating that the higher robustness of helices is caused by their higher contact density.
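The residue selection of panel A can be sketched as follows. This is one possible reading of the caption, in which "average" is taken to mean the contact number predicted by the regression line at the residue's RSA; that interpretation, the ordinary least-squares fit, and all function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def filter_by_contacts(residues, keep):
    """Keep residues on one side of their own RSA-versus-contacts line.

    `residues` is a list of (rsa, contacts) pairs for one secondary
    structure type (hypothetical layout). Use keep="below" for helices
    (drops residues with more contacts than expected) and keep="above"
    for strands (drops residues with fewer contacts than expected).
    """
    slope, intercept = fit_line([r for r, _ in residues],
                                [c for _, c in residues])
    kept = []
    for rsa, contacts in residues:
        expected = slope * rsa + intercept
        if keep == "below" and contacts <= expected:
            kept.append((rsa, contacts))
        elif keep == "above" and contacts >= expected:
            kept.append((rsa, contacts))
    return kept
```

Applying `filter_by_contacts(helix_residues, "below")` and `filter_by_contacts(strand_residues, "above")` yields two residue sets with more comparable contact numbers, which is the purpose of the exclusion described in the caption.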
Fig 4.
Within the same protein domains, helices diverge faster than strands, indicating higher robustness.
A) The relationship between global sequence similarity and sequence similarity in secondary structures in α/β domains. The pairwise structural alignments were grouped into 10% bins (see Fig 2); boxes represent 25–75%, whiskers 10–90%. Note that the difference between helices and strands declines below 40% sequence similarity, because sequence similarity cannot be negative, and random sequences have an expected similarity of 5–6%. In the alignments with 10–90% sequence similarity, helices are significantly more diverged than strands and coils in each bin (p < 0.05, t-tests), also when the differences in their RSA are taken into account (p < 0.05, ANCOVA). B) An example of the independent effect of secondary structure on sequence divergence in α/β domains, using the pairwise structural alignments with 40–60% divergence, and ANCOVA with global sequence similarity and the average RSA of secondary structure as continuous predictors. Within the same domains, helices are significantly more diverged than strands (p < 2 × 10⁻¹⁶, whiskers represent 95% confidence intervals), which in turn are more diverged than coils (p < 2 × 10⁻¹⁶). C-D) The same as A-B, but for α+β domains.
Fig 5.
The effect of secondary structure on robustness and pathogenicity of point mutations.
A) Point mutations in experimentally determined structures are significantly less likely to change secondary structure in helices than in strands. (On all panels “*” represents significance below 0.05 and “**” significance below 0.005, controlled for false discovery rate with the Benjamini-Hochberg method; error bars represent 95% confidence intervals.) B) The frequency of pathogenic mutants among conservative mutations that do not result in a change in secondary structure, and among secondary structure breaking mutations. Mutants were grouped according to the RSA of the wild type. Mutations that are predicted to break secondary structure are significantly more pathogenic than the ones that do not change secondary structure, particularly in the case of buried residues. C) Mutations with the same PolyPhen-2 (PP2) score are more likely to be pathogenic if they are predicted to change secondary structure, indicating that information on secondary structure can be used to improve pathogenicity prediction tools. The numbers of mutations are 33492, 9953, 9082, 13369 for the PP2 score ranges 0–0.49, 0.5–0.89, 0.9–0.99, 0.99–1.0, respectively. D) Mutations that cause disease are significantly more destabilizing (have a larger effect on the free energy of folding) than neutral mutations in the RSA bins lower than 0.4. E) The higher pathogenicity of mutations that break secondary structure is probably caused by their stronger destabilizing effect on protein structure: the difference between secondary structure changing and non-changing mutations is highly significant in all RSA bins.
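The significance marks on these panels are controlled for false discovery rate with the Benjamini-Hochberg method. A minimal sketch of that step-up procedure is given below; the function name and the example p-values are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is declared significant.

    Step-up procedure: find the largest rank k (1-based, p-values sorted
    in ascending order) with p_(k) <= (k / m) * alpha, then reject the
    hypotheses with the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes
    reject = [False] * m
    for idx in order[:cutoff_rank]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per RSA bin, for illustration only.
example = [0.001, 0.04, 0.03, 0.5]
significant = benjamini_hochberg(example, alpha=0.05)
```

Calling the function once with `alpha=0.05` and once with `alpha=0.005` would correspond to the “*” and “**” levels used on the panels.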