Wiggle—Predicting Functionally Flexible Regions from Primary Sequence
Figure 4
Predictor Performance Is a Function of Protein Length
(A) Effect of sequence length on the false-positive (thick line) and false-negative (thin line) error rates. Shorter sequences tend to have a higher false-positive rate for FFR identification when the predictor is trained on a nonpartitioned dataset.
(B) Comparison of prediction results for SVMs trained on a nonpartitioned dataset (dashed lines) and on a partitioned dataset containing proteins of up to 200 residues (solid lines). Improvements were seen in both the false-positive (black) and false-negative (red) rates.
(C) Comparison of prediction results for SVMs trained on a nonpartitioned dataset (dashed lines) and on a partitioned dataset containing proteins larger than 200 residues (solid lines). Minor improvements were observed in the false-positive (black) and false-negative (red) rates.
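The length partitioning and the two error rates compared in panels (B) and (C) can be illustrated with a minimal sketch. This is not the authors' code: the record layout (dicts with per-residue `labels` and `predictions`), the function names, and the rate definitions (false positives over all true negatives, false negatives over all true positives) are assumptions made for illustration. Only the 200-residue cutoff comes from the caption.

```python
# Hypothetical sketch: split proteins at the 200-residue cutoff named in the
# caption and compute per-residue error rates for each partition.
# The data layout and rate definitions are assumptions, not the paper's.

LENGTH_CUTOFF = 200  # "up to 200 residues" vs. "larger than 200 residues"


def partition_by_length(proteins, cutoff=LENGTH_CUTOFF):
    """Return (short, long) lists; short means length <= cutoff."""
    short = [p for p in proteins if len(p["labels"]) <= cutoff]
    long_ = [p for p in proteins if len(p["labels"]) > cutoff]
    return short, long_


def error_rates(proteins):
    """Return (false_positive_rate, false_negative_rate) pooled over residues.

    labels[i] is truthy when residue i belongs to a flexible region (FFR);
    predictions[i] is the predictor's call for the same residue.
    """
    fp = fn = negatives = positives = 0
    for p in proteins:
        for truth, pred in zip(p["labels"], p["predictions"]):
            if truth:
                positives += 1
                if not pred:
                    fn += 1  # flexible residue missed by the predictor
            else:
                negatives += 1
                if pred:
                    fp += 1  # rigid residue wrongly called flexible
    fpr = fp / negatives if negatives else float("nan")
    fnr = fn / positives if positives else float("nan")
    return fpr, fnr
```

Under these assumptions, `error_rates` would be called once on each partition returned by `partition_by_length`, for predictions from a model trained on the full dataset and again for predictions from a model trained on that partition alone, which gives the dashed and solid curves their respective inputs.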