Structure-Based Predictive Models for Allosteric Hot Spots

doi:10.1371/journal.pcbi.1000531

Table 1.

Training and independent data sets of proteins, with PDB identifiers for the inactive and active state structures, for various classes of molecules.

Table 2.

Feature Set 1.

Table 3.

Feature Set 2.

Table 4.

Top 20 highest performing feature/kernel degree combinations (as ranked by F1) using Feature Set 1.
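
A ranking of this kind can be illustrated with a minimal sketch, assuming F1 is the usual harmonic mean of precision and recall. The model records, feature names, and confusion counts below are hypothetical placeholders, not values from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*P*R / (P + R), with P = tp/(tp+fp) and R = tp/(tp+fn)."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rank_by_f1(models, top_n=20):
    """Return the top_n model records sorted by descending F1."""
    return sorted(
        models,
        key=lambda m: f1_score(m["tp"], m["fp"], m["fn"]),
        reverse=True,
    )[:top_n]


# Hypothetical feature/kernel-degree combinations with made-up counts.
example_models = [
    {"features": ("feat_a", "feat_b"), "degree": 2, "tp": 8, "fp": 2, "fn": 2},
    {"features": ("feat_a",), "degree": 1, "tp": 5, "fp": 5, "fn": 5},
    {"features": ("feat_b", "feat_c"), "degree": 3, "tp": 9, "fp": 1, "fn": 1},
]
ranked = rank_by_f1(example_models)
```

Here `ranked[0]` is the combination with the highest F1 among the placeholder records.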

Table 5.

Summary of the performance of the four feature sets.

Table 6.

Top 20 highest performing feature/kernel degree combinations (as ranked by F1) using Feature Set 2.

Figure 1.

Feature usage in the top 300 SVM models using Feature Set 1.

For each feature, the number of models (frequency) in the top 300, as ranked by F1 performance on the training data, that used that particular feature was tabulated.
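
The tabulation described in this caption can be sketched roughly as follows. This is an illustration only: the `(f1, features)` tuple layout and the feature names are assumptions made for the example, not the authors' data format.

```python
from collections import Counter


def feature_usage(models, top_n=300):
    """Count how many of the top_n models (by F1) use each feature.

    models: list of (f1, features) tuples, where features is an
    iterable of feature names (hypothetical layout).
    """
    top = sorted(models, key=lambda m: m[0], reverse=True)[:top_n]
    counts = Counter()
    for _, features in top:
        # set() ensures a feature is counted once per model.
        counts.update(set(features))
    return counts


# Hypothetical models: (F1 on training data, features used).
example = [
    (0.9, ("feat_a", "feat_b")),
    (0.8, ("feat_a",)),
    (0.1, ("feat_c",)),
]
usage = feature_usage(example, top_n=2)
```

With `top_n=2`, only the two best placeholder models contribute, so `feat_c` does not appear in the tally.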

Figure 2.

Feature usage in the top 300 SVM models using Feature Set 2.

For each feature, the number of models (frequency) in the top 300, as ranked by F1 performance on the training data, that used that particular feature was tabulated.

Table 7.

Feature/kernel degree combinations from the top 300 models which used only sequence or inactive state structural information.

Table 8.

Top 20 highest performing feature/kernel degree combinations (as ranked by F1) using the top 8 Set 1 features augmented with the deformation energy of the active state (abbreviated def-energ-r in the table) and the difference in deformation energy between the inactive and active states (abbreviated diff-def-energ) (Augmented Feature Set 1).

Table 9.

Performance of the top Feature Set 1-models on the independent data set.

Table 10.

Performance of top Feature Set 2-models on the independent data set.

Table 11.

Performance of models that used only inactive state structure and/or sequence information from the top 300 on the independent data set.

Table 12.

Performance of the top 20 models consisting of the top 8 features from Set 1 augmented with the deformation energy of the active state (abbreviated def-energ-a in the table) and the difference in deformation energy between the inactive and active states (abbreviated diff-def-energ) on the independent data set (Augmented Feature Set 1).

Figure 3.

Improvement of F1 upon successive feature addition.

The bar on the far right represents a feature combination from the top 10 models. Preceding bars represent feature combinations where each bar contains one feature fewer than the bar to its right.

Table 13.

Feature/kernel degree combinations from the top 300 models that used only two or three features.

Table 14.

Top 20 highest performing feature/kernel degree combinations (as ranked by F1) using all possible combinations of a mixture of Set 1 and Set 2 features that were found most frequently in the top-scoring models made using all possible combinations of each of the two feature sets separately (Hybrid Feature Set).

Table 15.

Performance of the top models consisting of mixtures of the top Set 1 and Set 2 features on the independent data set (Hybrid Feature Set).

Table 16.

The top 9 models with the highest precision on the independent data set that were used in the structural analysis.

Figure 4.

Hotspot predictions mapped to the inactive state structure of lac repressor.

(A) Predictions made by the top 9 highest-precision Hybrid Feature Set models according to the voting scheme for lac repressor, mapped onto the inactive state structure (1tlf). Experimentally tested residues are rendered in van der Waals spheres, with known non-hotspots in small spheres and known hotspots in larger ones. For other residues, the prediction is shown along the backbone trace, but no experimental data are available to test the prediction. Each residue in the structure is colored according to a blue→green→red heat map whose extremes are as follows: red represents residues predicted to be hotspots by 9/9 of the models, and blue represents residues predicted to be hotspots by 0/9 models (predicted non-hotspots by 9/9 models). (Refer to the color bar above for the exact mapping of the number of hotspot predictions to the color.) For ease of viewing, only one dimer (chains A and B) is shown. His 74 and Asp 278, residues that were not in the independent data set but were studied experimentally and found to be allosterically active, are rendered in van der Waals mode as well [63]. Correct positive (hotspot) and negative (non-hotspot) predictions are colored according to the heat map, while false predictions are colored gray. The inducer molecule IPTG is rendered as sticks and colored by element. (B) The complete set of residues that caused the I^S phenotype, rendered in van der Waals spheres. The hotspots depicted in (A) are the subset of these for which no substitution caused an I⁻ phenotype (completely nonfunctional). Incorrect predictions, i.e., false negatives, are colored gray.
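
The per-residue vote count and its mapping to a blue→green→red color can be sketched as below. The linear interpolation through green is an assumption (the caption defers to the figure's color bar for the exact mapping), and the residue labels and predictions are hypothetical.

```python
def vote_counts(model_predictions):
    """Count, per residue, how many models predict a hotspot.

    model_predictions: list of dicts (one per model) mapping a
    residue label to True (hotspot) or False (non-hotspot).
    """
    residues = model_predictions[0].keys()
    return {r: sum(1 for p in model_predictions if p[r]) for r in residues}


def heat_color(votes, n_models=9):
    """Map a vote count to an (R, G, B) tuple with components in [0, 1].

    Assumed mapping: 0 votes -> blue, half the votes -> green,
    all votes -> red, with linear interpolation in between.
    """
    t = votes / n_models
    if t <= 0.5:
        return (0.0, 2 * t, 1 - 2 * t)   # blue -> green
    return (2 * t - 1, 2 - 2 * t, 0.0)   # green -> red


# Hypothetical predictions from three models for two residues.
preds = [
    {"res_1": True, "res_2": False},
    {"res_1": True, "res_2": False},
    {"res_1": True, "res_2": True},
]
counts = vote_counts(preds)
colors = {r: heat_color(v, n_models=len(preds)) for r, v in counts.items()}
```

In this toy example `res_1` receives every vote and maps to pure red, while `res_2` receives one of three votes and falls between blue and green.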

Figure 5.

Hotspot predictions mapped to the inactive state structure of myosin II.

Predictions made by the top 9 highest-precision Hybrid Feature Set models according to the voting scheme for the myosin II motor domain, mapped onto the inactive state structure (1vom). Refer to Figure 4 above for an explanation of the coloring. Residues that met our criteria for classification as hotspots and were included in the independent data set are rendered in van der Waals spheres. Residues 454–459 of Switch-II (a region with high homology to the switch region of G-proteins that couples GTP hydrolysis to effector-domain conformation) are depicted in van der Waals spheres as well and colored according to the heat map.
