Classification performance bias between training and test sets in a limited mammography dataset

doi:10.1371/journal.pone.0282402

Fig 1.

Data resampling and evaluation procedure.

Fig 2.

Interaction of the cross-validated training and test set performances for 4 different logistic regression model types.

Scatter points were based on shuffled splits into different train-test sets. All 4 models showed an anti-diagonal trend in which training and test AUCs traded off against each other. Dark symbols represent previously published performances [7].
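
The way such scatter points can be collected is sketched below. This is a minimal illustration, not the authors' pipeline: it assumes synthetic data, a fixed scoring rule in place of the fitted logistic regression models, no inner cross-validation, and a 30% test fraction. The names `auc`, `shuffled_split`, and `auc_pairs` are hypothetical helpers introduced only for this example.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: fraction of positive/negative pairs in which the
    positive case scores higher (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_split(n_cases, test_fraction, rng):
    """Shuffle case indices and cut them into disjoint train and test index lists."""
    indices = list(range(n_cases))
    rng.shuffle(indices)
    n_test = round(n_cases * test_fraction)
    return indices[n_test:], indices[:n_test]


def auc_pairs(labels, scores, n_splits=200, test_fraction=0.3, seed=0):
    """Return one (train AUC, test AUC) pair per shuffled split."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_splits):
        train_idx, test_idx = shuffled_split(len(labels), test_fraction, rng)
        train_auc = auc([labels[i] for i in train_idx], [scores[i] for i in train_idx])
        test_auc = auc([labels[i] for i in test_idx], [scores[i] for i in test_idx])
        pairs.append((train_auc, test_auc))
    return pairs


# Synthetic stand-in for a small dataset (assumption): 100 cases, half positive,
# with scores drawn from two overlapping normal distributions.
data_rng = random.Random(1)
labels = [i % 2 for i in range(100)]
scores = [data_rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

pairs = auc_pairs(labels, scores)
```

Each element of `pairs` corresponds to one scatter point (training AUC on one axis, test AUC on the other). With a fixed pool of cases, a split that happens to place easier cases in the test set leaves harder ones for training, which may be one intuition for an anti-diagonal pattern in a limited dataset.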

Fig 3.

Interaction of the cross-validated training and test set performances for 4 logistic regression model types across different splits into train-test sets.

Solid symbols indicate splits with significant differences in patient age or lesion size; their exclusion would not have changed the overall distributions.

Fig 4.

Interaction of the cross-validated training and test set performances for two SVM model types.

Scatter points from different train-test splits are randomly distributed.

Fig 5.

Box plots of cross-validated AUCs using different numbers of training cases.
