Machine learning algorithm validation with a limited sample size

Fig 5

Factors other than sample size that influence overfitting when K-fold CV is used.

Feature selection by SVM-RFE and t-test, classification by logistic regression, and sample size fixed at N = 100. A: Number of features varied from 20 to 200. B: Parameter tuning grid size varied from 2 × 2 to 200 × 2, with the penalty set to L1 or L2 and C = e^i, where i varied from −4 to 4. Thick lines show fitted fifth-order polynomial trends. C: Number of CV folds varied from two-fold to leave-one-out. Thick dashed lines show fitted fifth-order polynomial trends.
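The tuning grid in panel B can be illustrated with a minimal sketch. It assumes that the exponents i are evenly spaced between −4 and 4, which the caption does not state. The function name `make_param_grid` and the lowercase penalty labels are hypothetical choices made for illustration; this is not the authors' code.

```python
import math

def make_param_grid(n_c_values):
    """Build an (n_c_values x 2) grid of (penalty, C) pairs.

    Assumption: exponents i are evenly spaced on [-4, 4]; C = e^i.
    """
    if n_c_values < 2:
        raise ValueError("need at least two C values to span the range")
    step = 8.0 / (n_c_values - 1)
    exponents = [-4.0 + k * step for k in range(n_c_values)]
    # Two penalties (L1, L2) crossed with every value of C.
    return [(penalty, math.exp(i))
            for penalty in ("l1", "l2")
            for i in exponents]

smallest = make_param_grid(2)    # the 2 x 2 grid
largest = make_param_grid(200)   # the 200 x 2 grid
print(len(smallest), len(largest))
```

Under these assumptions the smallest grid holds 4 candidate settings and the largest holds 400, so the number of models compared within each CV run grows a hundredfold across panel B.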
