Fig 1.

The framework of PredSAV.

(A) Feature representation. A total of 1521 sequence, Euclidean and Voronoi neighborhood features are initially generated. (B) Two-step feature selection. Stability selection is used as the first step: the top 152 features, those with scores larger than 0.2, are selected. The second step is a wrapper-based feature selection, in which features are evaluated by 5-fold cross-validation with the gradient tree boosting (GTB) algorithm. (C) Prediction model. Finally, gradient boosted trees are built for prediction.
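The two-step selection in panel (B) can be illustrated with a minimal sketch. It assumes a greedy forward search for the wrapper step, which the caption does not specify, and it takes the scoring function as a parameter. The names `filter_by_stability`, `forward_wrapper_selection` and `toy_evaluate`, along with the toy scores, are hypothetical and stand in for the stability-selection scores and the 5-fold cross-validated GTB evaluation described above.

```python
from typing import Callable, Dict, List, Sequence


def filter_by_stability(scores: Dict[str, float], cutoff: float = 0.2) -> List[str]:
    """Step 1: keep features whose stability score exceeds the cutoff,
    ranked from highest to lowest score."""
    kept = [name for name, score in scores.items() if score > cutoff]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


def forward_wrapper_selection(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> List[str]:
    """Step 2 (assumed greedy forward search): add features in rank order
    and keep each one only if it improves the evaluation score."""
    selected: List[str] = []
    best = float("-inf")
    for name in ranked:
        candidate = selected + [name]
        score = evaluate(candidate)
        if score > best:
            selected, best = candidate, score
    return selected


# Hypothetical stability scores for four toy features.
stability_scores = {"a": 0.9, "b": 0.5, "c": 0.3, "d": 0.1}

# Hypothetical stand-in for a 5-fold cross-validated GTB score:
# each feature contributes a fixed gain (or loss) to the subset score.
toy_gain = {"a": 0.3, "b": 0.2, "c": -0.1, "d": 0.0}


def toy_evaluate(features: List[str]) -> float:
    return sum(toy_gain[f] for f in features)


ranked = filter_by_stability(stability_scores, cutoff=0.2)
selected = forward_wrapper_selection(ranked, toy_evaluate)
print(ranked, selected)
```

In practice `evaluate` would train and cross-validate a classifier on the candidate subset; passing it in as a callable keeps the sketch independent of any particular learning library.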

Table 1.

Performance of selected attributes with the two-step feature selection method.

The first column lists different cutoffs of stability selection scores.

Fig 2.

ROC curves of our two-step algorithm and three other existing feature selection methods.

Table 2.

Rankings of feature importance for the optimal selected features.

SN, EN and VN represent sequence neighborhood, Euclidean neighborhood and Voronoi neighborhood, respectively. The numbers in the brackets denote the positions in the sliding window for sequence neighborhood features.

Fig 3.

The relative importance and ranking of the optimal feature group, as evaluated by gradient tree boosting.

Each bar represents the importance score of the corresponding feature group.

Fig 4.

Comparison of the AUC values of the three methods using 5-fold cross-validation on the benchmark dataset.

Table 3.

Prediction performance of PredSAV classifiers in comparison with six other prediction tools on the benchmark dataset.

Fig 5.

The ROC curves of seven classifiers on the benchmark dataset.

Table 4.

Prediction performance of PredSAV classifiers in comparison with six other prediction tools on the independent test dataset.

Fig 6.

The ROC curves of seven classifiers on the independent test dataset.

Fig 7.

Prediction examples of the functional effects of SAVs in two proteins by PredSAV and other methods.

Red denotes disease-associated variants and blue denotes neutral variants. (A) and (B) show the proteins PAH (PDB ID: 1J8U, chain A) and LSS (PDB ID: 1W6K, chain A), respectively. The 3-D structures are rendered using PyMOL [75].
