Table 1.
The number of mutations and the percentage of deleterious mutations for published methods.
Table 2.
The number of proteins, mutations and self G-square for each data set.
Table 3.
The pairwise G values between the different data sets.
Figure 1.
The contributions to G for HumanPoly and PrimateMut.
Only the 150 substitution types accessible by single-nucleotide changes are shown in color; all others are shown in gray. Wild-type residue types are given along the x-axis and mutant residue types along the y-axis. Blue squares indicate substitution types that are overrepresented in PrimateMut, while orange squares indicate substitution types that are overrepresented in HumanPoly.
Table 4.
Performance of the models trained on human polymorphism data and on primate polymorphism data.
Figure 2.
The cross-validation results of five SVM models trained on data sets containing 10%, 30%, 50%, 70%, and 90% deleterious mutations (x-axis = 0.1, 0.3, 0.5, 0.7, and 0.9, respectively).
(a) Values for TPR, TNR, PPV, and NPV. (b) Values for MCC, BACC, AUC, and ACC.
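The measures named in these captions follow from the four confusion-matrix counts. The sketch below uses their standard textbook definitions and assumes that deleterious mutations are the positive class; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the paper. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
from math import sqrt, nan


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else nan


def confusion_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion counts.

    Assumption: 'positive' means deleterious, 'negative' means neutral.
    """
    tpr = _ratio(tp, tp + fn)            # true positive rate (sensitivity)
    tnr = _ratio(tn, tn + fp)            # true negative rate (specificity)
    ppv = _ratio(tp, tp + fp)            # positive predictive value
    npv = _ratio(tn, tn + fn)            # negative predictive value
    acc = _ratio(tp + tn, tp + fp + tn + fn)  # overall accuracy
    bacc = (tpr + tnr) / 2               # balanced accuracy
    # Matthews correlation coefficient
    mcc = _ratio(tp * tn - fp * fn,
                 sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"TPR": tpr, "TNR": tnr, "PPV": ppv, "NPV": npv,
            "ACC": acc, "BACC": bacc, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in confusion_metrics(tp=80, fp=10, tn=90, fn=20).items():
        print(f"{name}: {value:.3f}")
```

TPR and TNR are each computed within a single true class, so they do not depend on the ratio of deleterious to neutral mutations in the test set, whereas PPV, NPV, ACC, and MCC do.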
Figure 3.
(a) TPR, (b) TNR, (c) PPV, and (d) NPV of five SVM models trained on five different data sets (train_10, train_30, train_50, train_70, and train_90) and tested on nine different test data sets, ranging from 10% deleterious (x-axis = 0.1) to 90% deleterious (x-axis = 0.9).
Figure 4.
(a) ACC, (b) BACC, (c) MCC, and (d) AUC of five SVM models trained on five different data sets (train_10, train_30, train_50, train_70, and train_90) and tested on nine different test data sets, ranging from 10% deleterious (x-axis = 0.1) to 90% deleterious (x-axis = 0.9).
Table 5.
Top five predictors tested on the CASP9 targets (117 targets).