Table 1.
Seven benchmark datasets used to train and test our predictor.
Figure 1.
This graph shows the distribution of the top 322 features.
SVM-RFE ranked the features according to their ability to separate the different categories in each dataset, so the ranking lists and top features differ from dataset to dataset. Nevertheless, the proportions of the feature types are consistent across all seven datasets: physical-chemical properties captured by PROFEAT constitute the majority, followed by PSSM features and then GO annotation features. The bar chart shows the numbers of the three kinds of features among the top features for each dataset.
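The SVM-RFE ranking described above can be sketched with scikit-learn's RFE wrapper; this is an illustrative setup on synthetic data, not the paper's exact pipeline (the kernel, elimination step, and feature counts here are assumptions):

```python
# Minimal SVM-RFE sketch: recursively drop the lowest-weight features
# of a linear SVM until the desired number remain.
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.datasets import make_classification

# Synthetic stand-in for a protein feature matrix (assumed sizes).
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=10, random_state=0)

svm = SVC(kernel="linear")          # linear kernel exposes coef_ for ranking
rfe = RFE(estimator=svm, n_features_to_select=10, step=1)
rfe.fit(X, y)

# Indices of the selected top features (analogous to a "top 322" list).
top = [i for i, kept in enumerate(rfe.support_) if kept]
print(len(top))
```

Because the SVM is refit after every elimination round, the resulting ranking is dataset-specific, which is why each benchmark dataset yields a different top-feature list.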
Figure 2.
This graph shows the pipeline from the query sequence to the final output, including all intermediate steps.
Figure 3.
This graph compares the prediction accuracies obtained with SVM-RFE and F-score feature selection.
Gray dotted lines highlight the selected top features for high (top 70) and low (top 322) similarity datasets.
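The F-score used as the baseline selector is commonly the Chen–Lin formulation, which scores each feature by between-class separation over within-class variance. A minimal sketch, assuming binary labels (the multi-class extension and the exact variant used in the paper are not specified here):

```python
import numpy as np

def f_score(X, y):
    """Chen-Lin F-score per feature for binary labels y in {0, 1}.

    Higher scores mean the feature separates the two classes better.
    """
    Xp, Xn = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    num = (Xp.mean(axis=0) - mean_all) ** 2 + (Xn.mean(axis=0) - mean_all) ** 2
    den = Xp.var(axis=0, ddof=1) + Xn.var(axis=0, ddof=1)
    return num / den

# Synthetic check: only feature 0 carries the class signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=100) > 0).astype(int)

scores = f_score(X, y)
print(scores.argmax())  # feature 0 should score highest
```

Unlike SVM-RFE, the F-score rates each feature independently, ignoring feature interactions, which is one reason the two selectors can pick different top-feature sets.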
Table 2.
Prediction performance of our method on the seven datasets.
Figure 4.
This graph shows the ROC curves for the 1189 dataset.
Table 3.
Performance comparison of different methods on the 1189 dataset.
Table 4.
Performance comparison of different methods on the D640 dataset.
Table 5.
Performance comparison of different methods on the 25PDB dataset.
Table 6.
Performance comparison of different methods on the D1185 dataset.
Table 7.
Performance comparison of different methods on the D8244 dataset.
Table 8.
Performance comparison of different methods on the Z277 dataset.
Table 9.
Performance comparison of different methods on the Z498 dataset.
Table 10.
Examples of prediction results produced by our predictor on five datasets.
Figure 5.
This graph shows the overlapping PROFEAT features of Z277 and Z498.
After feature selection by SVM-RFE, 157 and 155 PROFEAT features appear among the top 322 features for datasets Z277 and Z498, respectively, and the two sets overlap substantially (117 common features).