Accuracy in the prediction of disease epidemics when ensembling simple but highly correlated models
Fig 2
Performance of 38 base learner logistic models (labeled by model-building generation) and several ensembles.
The ensembles are: a simple soft-vote model average across all base learner models; 10 weighted averages (Mx), each of four base learner models, with weights based on Brier scores (each set of four was randomly chosen from the set of all possible selections of one model from each of the four groups indicated in Fig 1); and stacked regression models (with lasso, ridge or elastic-net penalties) fitted to the cross-validated probabilities of epidemics from all base learner models. A. Specificity (Sp) versus sensitivity (Se); B. markedness (MKD) versus informedness (IFD); C. area under the precision-recall curve (PR-AUC) versus area under the receiver operating characteristic curve (ROC-AUC); D. modified confusion entropy (MCEN) versus normalized expected mutual information (IMN). The dashed line in each panel is a linear regression fit through the data, shown only as a visual reference. Metrics are defined in Table 1.
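As an illustration of two of the ensembles named above and of the metrics in panels A and B, the following is a minimal sketch, not the authors' implementation. It assumes that "weights based on Brier scores" means weights proportional to the inverse Brier score, that a 0.5 probability threshold is used to classify epidemics, and that IFD and MKD follow their standard definitions (Se + Sp - 1 and PPV + NPV - 1); Table 1 remains the authoritative source. All function names and the toy data are hypothetical.

```python
import numpy as np

def brier_score(y_true, p):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y_true) ** 2))

def soft_vote(probs):
    """Unweighted average of base-learner probabilities (rows = models)."""
    return np.asarray(probs, dtype=float).mean(axis=0)

def brier_weighted_average(probs, y_true):
    """Weighted average of base-learner probabilities.

    ASSUMPTION: weights are proportional to 1 / Brier score, so better
    calibrated models count more. The source does not state the scheme.
    A Brier score of exactly 0 would need special handling.
    """
    probs = np.asarray(probs, dtype=float)
    scores = np.array([brier_score(y_true, p) for p in probs])
    weights = 1.0 / scores
    weights /= weights.sum()          # normalize so the weights sum to 1
    return weights @ probs, weights

def confusion_metrics(y_true, p, threshold=0.5):
    """Se, Sp, informedness and markedness at a given probability threshold.

    Assumes both classes are observed and both are predicted at least once;
    otherwise one of the ratios is undefined.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(p, dtype=float) >= threshold
    tp = np.sum(y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    se, sp = tp / (tp + fn), tn / (tn + fp)
    ppv, npv = tp / (tp + fp), tn / (tn + fn)
    return {"Se": se, "Sp": sp, "IFD": se + sp - 1, "MKD": ppv + npv - 1}

# Hypothetical toy data: 6 observations (1 = epidemic), 4 base learners.
y = np.array([1, 0, 1, 0, 1, 0])
probs = np.array([
    [0.9, 0.2, 0.7, 0.4, 0.6, 0.1],
    [0.8, 0.3, 0.6, 0.6, 0.4, 0.2],
    [0.7, 0.1, 0.8, 0.3, 0.7, 0.3],
    [0.6, 0.4, 0.5, 0.5, 0.3, 0.4],
])

p_soft = soft_vote(probs)
p_weighted, weights = brier_weighted_average(probs, y)
metrics_soft = confusion_metrics(y, p_soft)
metrics_weighted = confusion_metrics(y, p_weighted)
```

In this toy example the Brier scores are computed on the same observations used for evaluation, which is done only for brevity; in practice the weights would be derived from held-out or cross-validated predictions.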