Fig 1.

B-factor plots.

Theoretical B-factors calculated by Seq-GNM (blue) and by the original GNM obtained from structure (orange), plotted against observed experimental B-factors (black) for acyl-CoA dehydrogenase (PDB id: 1JQI), along with the contact map predicted by Seq-GNM (using a threshold score, shown in blue) and the contact map of the structure (using a 10 Å cut-off distance). (a) The Seq-GNM with values obtained from RaptorX produces a correlation of 0.56 with experiment and 0.77 with the GNM obtained from structure. Moreover, the contact maps show that the contacts used by the Seq-GNM and structural GNM approaches are remarkably similar. (b) The Seq-GNM with values obtained from EVcouplings produces a correlation of 0.60 with experiment and 0.68 with the GNM obtained from structure. The B-factors obtained by applying GNM to the experimental structure yield a correlation of 0.57. The contact map captures the dominant contacts, with noise coming from poorly predicted EVcouplings scores.
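As a rough illustration of how a thresholded contact map turns into theoretical B-factors, the sketch below builds a GNM Kirchhoff (connectivity) matrix and takes the diagonal of its pseudo-inverse, which in the standard GNM is proportional to the B-factor of each residue. The function names, the toy score matrix, and the rule that sequential neighbors are always in contact are assumptions made for this example; they are not taken from the Seq-GNM implementation.

```python
import numpy as np

def kirchhoff_from_scores(scores, threshold):
    """Build a GNM Kirchhoff matrix from a symmetric matrix of pairwise scores.

    A pair (i, j) counts as a contact when its score is >= threshold.
    Assumption for this sketch: sequential neighbours (i, i+1) are always
    in contact so that the network stays connected.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    contact = scores >= threshold
    contact = contact | contact.T          # enforce symmetry
    idx = np.arange(n - 1)
    contact[idx, idx + 1] = True           # chain connectivity (assumed)
    contact[idx + 1, idx] = True
    np.fill_diagonal(contact, False)       # no self-contacts
    gamma = -contact.astype(float)         # off-diagonal: -1 per contact
    np.fill_diagonal(gamma, contact.sum(axis=1))  # diagonal: contact count
    return gamma

def gnm_bfactors(gamma):
    """Relative B-factors: diagonal of the pseudo-inverse of the Kirchhoff matrix."""
    return np.diag(np.linalg.pinv(gamma)).copy()

def pearson(x, y):
    """Pearson correlation between two B-factor profiles."""
    return float(np.corrcoef(x, y)[0, 1])

# Toy example: 5 residues, no predicted contacts above the threshold,
# so only the assumed chain connectivity remains.
toy_scores = np.zeros((5, 5))
toy_gamma = kirchhoff_from_scores(toy_scores, threshold=0.98)
toy_b = gnm_bfactors(toy_gamma)
```

In this toy chain the terminal residues have the fewest neighbors and therefore the largest relative B-factors. The values are only meaningful up to a scale factor, which is why profiles are compared by correlation rather than by absolute value.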


Fig 2.

Color-coded ribbon diagrams using experimental and theoretical B-factors obtained by Seq-GNM.

The observed crystallographic B-factors (left) and the B-factors predicted by the Seq-GNM, superimposed on the structure. The three proteins selected (2JAO, 1F2J, and 1UMK) are high-resolution structures with resolution better than 2.0 Å. Residues are color-coded by B-factor on a blue–white–red spectrum, where blue represents the lowest B-factors (less mobility) and red represents the highest B-factors (more mobility). The B-factor scores are converted to percentile ranks so that they can be compared across different proteins. Each protein is also rotated 180° so that both sides can be visualized and compared. The proteins were selected to cover a variety of secondary structure compositions: 2JAO contains primarily alpha helices, 1UMK is mainly composed of beta-sheets, and 1F2J is a combination of alpha helices and beta-sheets.
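The percentile-rank conversion mentioned in this caption can be sketched as follows. The caption does not give the exact definition, so this example assumes one common convention: the rank of a residue is the fraction of residues in the same protein whose B-factor is less than or equal to its own. The function name is hypothetical.

```python
def percentile_rank(values):
    """Map each B-factor to the fraction of values less than or equal to it.

    Assumed convention for this sketch: ranks fall in (0, 1], ties share
    the same rank, and the ranking is done within a single protein.
    """
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]

# Toy B-factors for four residues (illustrative numbers only).
ranks = percentile_rank([10.0, 30.0, 20.0, 40.0])
# ranks == [0.25, 0.75, 0.5, 1.0]
```

Because the ranks depend only on ordering within one protein, profiles from proteins with very different absolute B-factor scales end up on the same 0 to 1 axis.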


Fig 3.

Comparison of B-factors obtained by GNM and Seq-GNM with experimental B-factors.

(a) Boxplot of the correlation between B-factors predicted by the Seq-GNM and experimentally observed B-factors (blue), compared with that of the GNM obtained from structure (orange), for a subset of 39 structures with resolution better than 2.0 Å. (b) Distribution of the same correlations in 10 bins of width 0.1. A Student's t-test reveals no significant difference between the two distributions (p = 0.055), indicating that the Seq-GNM produces results competitive with the original GNM from structure. The mean correlation of the Seq-GNM is 0.53, while that of the GNM from structure is 0.58.


Fig 4.

Comparison of B-factors obtained from experiments, Seq-GNM, GNM from monomeric structure, and GNM from oligomeric structure.

B-factors are shown on the respective structures for (a) 5'(3')-deoxyribonucleotidase (2JAO) and (b) aldehyde dehydrogenase 7A1 (2J6L). (a) The correlation of Seq-GNM B-factors with experimental B-factors is 0.83, while the correlation of GNM B-factors obtained from the monomer with experimental B-factors is 0.63. When a dimer is used for the GNM analysis, the correlation of GNM B-factors with experimental B-factors increases to 0.72. (b) The correlation of Seq-GNM B-factors with experimental B-factors is 0.61, while that of GNM B-factors obtained from the monomer is 0.37. When a tetramer is used for the GNM analysis, the correlation increases to 0.76. The change in GNM correlation between monomer and oligomer clearly shows the drawback of depending on the crystal structure of the biounit. In contrast, Seq-GNM captures the interface B-factors correctly.


Fig 5.

Distribution of correlation coefficients.

The distribution of correlation coefficients between B-factors from Seq-GNM and from the GNM obtained from structure. (a) The average correlation coefficient is 0.63 with RaptorX EC values. (b) The average correlation coefficient is 0.43 with EVcouplings EC values.


Fig 6.

Comparison of theoretical B-factors on disease versus neutral mutant sites.

Ribbon diagrams of two human enzymes, human lysozyme (a) and cytochrome reductase (b), colored according to their B-factors predicted by the Seq-GNM. Red indicates high-mobility sites and blue indicates low-mobility sites. Each protein contains two known nSNVs. I56T and R57Q are disease-associated and occur at low-mobility (rigid) sites. Conversely, the neutral nSNVs T116S and T70N occur at high-mobility sites.


Fig 7.

Observed to expected ratio plots for disease and neutral nSNVs.

Observed-to-expected ratios for 436 disease nSNVs (red) and 302 neutral nSNVs (blue) from 139 human enzymes. The %B-factor scores derived from the Seq-GNM are grouped into 5 bins of width 0.2.
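A minimal sketch of an observed-to-expected calculation over five %B-factor bins is given below. The caption does not define the expected count, so the example assumes that it is the number of variants that would fall in a bin if variants were distributed like all residue positions (the background). The function names and the toy numbers are hypothetical.

```python
def bin_index(score, n_bins=5):
    """Bin a score in [0, 1]; a score of exactly 1.0 goes to the last bin."""
    return min(int(score * n_bins), n_bins - 1)

def observed_to_expected(variant_scores, background_scores, n_bins=5):
    """Observed/expected variant counts per %B-factor bin.

    Assumption for this sketch: the expected count in a bin is the total
    number of variants times the fraction of background sites in that bin.
    """
    observed = [0] * n_bins
    background = [0] * n_bins
    for s in variant_scores:
        observed[bin_index(s, n_bins)] += 1
    for s in background_scores:
        background[bin_index(s, n_bins)] += 1
    ratios = []
    for obs, bg in zip(observed, background):
        expected = len(variant_scores) * bg / len(background_scores)
        ratios.append(obs / expected if expected > 0 else float("nan"))
    return ratios

# Toy data: background sites spread evenly over the five bins,
# variants concentrated at low %B-factor (rigid) sites.
toy_background = [0.1, 0.3, 0.5, 0.7, 0.9] * 2
toy_variants = [0.05, 0.15, 0.10, 0.90]
toy_ratios = observed_to_expected(toy_variants, toy_background)
```

A ratio above 1 in a bin means that variants of that class occur there more often than the background distribution of sites would suggest, and a ratio below 1 means they are depleted.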


Fig 8.

ROC curves for disease prediction performance comparing Seq-GNM, experimental B-factors and evolutionary parameters.

ROC curves are plotted using 10 randomly selected training and testing data sets containing 80% and 20% of the data, respectively. (a) ROC curve of Seq-GNM. (b) ROC curve of experimental B-factors. (c) ROC curve of evolutionary parameters, where primate, mammal, and vertebrate Fitch rates computed with the Fitch algorithm [57], together with Entropy2, are used as features for training. (d) ROC curve of the evolutionary parameters used in (c) with the addition of Seq-GNM.
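The repeated 80%/20% evaluation can be sketched as below. The classifier used for these curves is not specified in the caption, so this example skips training altogether and scores each variant with a single pre-computed number (for instance one minus its %B-factor, which is an assumption), then computes the area under the ROC curve on each held-out 20%. Stratifying the split by class is also an assumption, made so that every test set contains both disease and neutral variants. All names are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_test_indices(labels, test_fraction, rng):
    """Pick a random test subset holding test_fraction of each class."""
    test = []
    for cls in (0, 1):
        members = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(members)
        n_test = max(1, round(test_fraction * len(members)))
        test.extend(members[:n_test])
    return test

def repeated_holdout_auc(labels, scores, n_repeats=10, test_fraction=0.2, seed=0):
    """AUC on the held-out subset for each of n_repeats random splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        test = stratified_test_indices(labels, test_fraction, rng)
        aucs.append(roc_auc([labels[i] for i in test],
                            [scores[i] for i in test]))
    return aucs

# Toy data: 1 = disease, 0 = neutral; disease variants get higher scores.
toy_labels = [1] * 10 + [0] * 10
toy_scores = [0.9] * 10 + [0.1] * 10
toy_aucs = repeated_holdout_auc(toy_labels, toy_scores)
```

With a trained multi-feature model, as in panels (c) and (d), the remaining 80% of each split would be used to fit the model before scoring the test subset. That step is omitted here because the model is not described in the caption.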


Table 1.

Disease prediction results showing the accuracy, sensitivity, and selectivity of Seq-GNM compared with experimental B-factors, SIFT, PolyPhen-2, and evolutionary parameters.


Fig 9.

Flowchart of Seq-GNM method for nSNV predictions.

A workflow of our method, which uses predicted evolutionary couplings to determine protein dynamics and assess the functional impact of nSNVs. The initial input is an amino acid sequence, which is used to obtain a multiple sequence alignment (MSA). From the MSA, evolutionary coupling pairs are predicted with RaptorX and EVcouplings. The high-scoring evolutionary coupling pairs are assigned as contacts in our Seq-GNM to compute the dynamics profile of each protein. The dynamics profiles obtained from Seq-GNM can give insight into the functional impact of nSNVs. This was done for a curated set of 139 structures.


Fig 10.

Comparison of theoretical B-factors.

Boxplots comparing the correlations of B-factors predicted by our Seq-GNM for (a) RaptorX and (b) EVcouplings with those of the structural GNM for all 139 structures, using a constant threshold for EC contacts. The GNM analysis is conducted 8 times, each time with a constant threshold between 0.92 and 0.99. The best average correlations are produced with a constant threshold of 0.98.
