PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations

doi:10.1371/journal.pcbi.1003440

Table 1.

Principles and training datasets of eight evaluated tools.

Figure 1.

Workflow diagram describing construction of independent datasets.

The sources of mutation data are shown in yellow and intermediate datasets in white. The Protein Mutant Database (PMD) testing dataset and the testing dataset compiled from studies on massively mutated proteins (MMP) are shown in blue, and the PredictSNP benchmark dataset in green. Data from the original training datasets of all evaluated tools, shown in red, were removed from the newly constructed datasets.

Figure 2.

Distribution of amino acids in the PredictSNP benchmark dataset.

Expected distributions of amino acid residues were extracted from 105,990 sequences in the non-redundant OWL protein database (release 26.0) [58].

Table 2.

Performance of the individual prediction tools and PredictSNP on three independent datasets.

Figure 3.

Overall receiver operating characteristic curves for all three independent datasets.

Comparison of PredictSNP and its constituent tools on the PredictSNP benchmark dataset (A). Comparison of PredictSNP and other consensus classifiers on the MMP dataset (B) and the PMD-UNIPROT dataset (C). The dashed line represents random ranking, with AUC equal to 0.5.

Table 3.

Performance of consensus classifiers with PMD-UNIPROT and MMP datasets.

Figure 4.

Workflow diagram of PredictSNP.

Upon submission of the input sequence and specification of the investigated mutations, the integrated predictors of pathogenicity evaluate each mutation and the consensus prediction is calculated. Meanwhile, the UniProt and PMD databases are queried to gather the relevant annotations.

Figure 5.

Graphical user interface of PredictSNP.

The web server input (left) and output (right) pages.
