Reliability-enhanced data cleaning in biomedical machine learning using inductive conformal prediction

doi:10.1371/journal.pcbi.1012803

Fig 9

Number of wrongly labeled samples detected at different percentages of permuted training labels in the COVID-19 patient ICU admission prediction task. Panels show the number of samples flagged as wrongly labeled by LR models (A–C) and LDA models (D–F) at three detection thresholds: 0.8 (A,D), 0.5 (B,E), and 0.2 (C,F). The cleaning process shown uses conformal predictor hyperparameters optimized on the validation dataset for each classifier and each percentage of permuted labels.

doi: https://doi.org/10.1371/journal.pcbi.1012803.g009
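
To make the role of the detection threshold in the caption concrete, the following is a minimal sketch of threshold-based label flagging with inductive conformal p-values. It is an illustration under stated assumptions, not the authors' implementation: the nonconformity score (one minus the predicted probability of a label), the flagging rule (flag a sample when another label has the largest p-value and that p-value reaches the threshold), the helper names `icp_p_values`, `flag_suspect_labels`, and `make_probs`, and the synthetic probabilities standing in for a fitted classifier are all hypothetical.

```python
import numpy as np


def icp_p_values(cal_scores, test_scores):
    """Inductive conformal p-values: (count of calibration scores >= test score, plus 1) / (n_cal + 1)."""
    cal_scores = np.asarray(cal_scores, dtype=float)
    test_scores = np.asarray(test_scores, dtype=float)
    # Compare every test score with every calibration score.
    at_least = cal_scores[None, :] >= test_scores[:, None]
    return (at_least.sum(axis=1) + 1) / (cal_scores.size + 1)


def flag_suspect_labels(probs, labels, cal_probs, cal_labels, threshold):
    """Flag samples whose best alternative label has a p-value >= threshold.

    Assumed rule for illustration only; the paper may use a different criterion.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    # Assumed nonconformity score: 1 - predicted probability of the label.
    cal_scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n, k = probs.shape
    # One p-value per sample and candidate label.
    p = np.column_stack(
        [icp_p_values(cal_scores, 1.0 - probs[:, c]) for c in range(k)]
    )
    own = p[np.arange(n), labels]
    masked = p.copy()
    masked[np.arange(n), labels] = -np.inf  # exclude the given label
    best_alt = masked.max(axis=1)
    return (best_alt >= threshold) & (best_alt > own)


def make_probs(labels, confidence):
    """Build two-class probabilities giving `confidence` to `labels` (synthetic stand-in for a model)."""
    out = np.empty((len(labels), 2))
    idx = np.arange(len(labels))
    out[idx, labels] = confidence
    out[idx, 1 - labels] = 1.0 - confidence
    return out


rng = np.random.default_rng(0)

# Synthetic calibration set with clean labels.
cal_labels = rng.integers(0, 2, size=200)
cal_probs = make_probs(cal_labels, rng.beta(5, 2, size=200))

# Synthetic training set: probabilities follow the true labels,
# then 20% of the recorded labels are flipped to mimic permutation.
true_labels = rng.integers(0, 2, size=500)
train_probs = make_probs(true_labels, rng.beta(5, 2, size=500))
noisy_labels = true_labels.copy()
flipped = rng.choice(500, size=100, replace=False)
noisy_labels[flipped] = 1 - noisy_labels[flipped]

# Count flagged samples at the three thresholds named in the caption.
counts = {}
for threshold in (0.8, 0.5, 0.2):
    flags = flag_suspect_labels(
        train_probs, noisy_labels, cal_probs, cal_labels, threshold
    )
    counts[threshold] = int(flags.sum())
    print(f"threshold={threshold}: {counts[threshold]} samples flagged")
```

Under this assumed rule, lowering the threshold can only flag the same samples or more, so a high threshold such as 0.8 gives the most conservative cleaning and 0.2 the most aggressive.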