Fig 1.
The binary confusion matrix with positive (+) and negative (-) classes.
Note that, in the format shown, the labelling provided by the reference data is in the columns and the results of the classifier being assessed are in the rows. Abbreviations used are defined in the text.
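As a reading aid for the matrix layout described above, the following is a minimal Python sketch that derives the metrics named in the later captions from the four cells of a binary confusion matrix. It uses the standard textbook definitions rather than definitions quoted from the text. The function name `binary_metrics`, the dictionary keys, and the example counts are hypothetical, and undefined ratios are returned as NaN, which is a convenience choice here, not a convention taken from the paper.

```python
from math import sqrt


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero (assumed convention)."""
    return num / den if den != 0 else float("nan")


def binary_metrics(tp, fp, fn, tn):
    """Standard accuracy metrics from a binary confusion matrix.

    Layout follows the caption: rows hold the classifier's labels and
    columns hold the reference labels, so
        tp = classifier +, reference +    fp = classifier +, reference -
        fn = classifier -, reference +    tn = classifier -, reference -
    """
    total = tp + fp + fn + tn
    recall = _ratio(tp, tp + fn)          # also called sensitivity
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)       # positive predictive value
    npv = _ratio(tn, tn + fn)             # negative predictive value
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "prevalence": _ratio(tp + fn, total),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "J": recall + specificity - 1,                # Youden's J
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),
        "MCC": _ratio(tp * tn - fp * fn, mcc_den),    # Matthews correlation
        "LR+": _ratio(recall, 1 - specificity),       # positive likelihood ratio
        "LR-": _ratio(1 - recall, specificity),       # negative likelihood ratio
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the figures).
    for name, value in binary_metrics(tp=80, fp=90, fn=20, tn=810).items():
        print(f"{name}: {value:.3f}")
```

Note that some libraries print the matrix transposed relative to this caption (reference in rows, classifier in columns), so the cell order should be checked before counts are passed in.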
Fig 2.
Confusion matrices arising from the use of a gold standard reference, an imperfect reference with correlated errors (accuracy = 0.90) and an imperfect reference with independent errors (accuracy = 0.90).
(a) Prevalence = 0.1 and (b) Prevalence = 0.3.
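To illustrate how an imperfect reference with independent errors reshapes the confusion matrix, the sketch below computes the expected cell proportions under the usual conditional-independence assumption: given the true class, the classifier and the reference err independently. This is an illustrative model under stated assumptions, not necessarily the procedure used to produce the figure. The function name `apparent_matrix` is hypothetical, the classifier's recall and specificity are made-up example values, and reading "accuracy = 0.90" as reference recall = specificity = 0.90 is an assumption. The correlated-error case is not covered because it needs an explicit dependence model.

```python
def apparent_matrix(prevalence, clf_recall, clf_specificity,
                    ref_recall, ref_specificity):
    """Expected confusion-matrix proportions against an imperfect reference.

    Assumes classifier and reference errors are independent given the true
    class. Returns (tp, fp, fn, tn) as proportions that sum to 1, where the
    first letter pair is judged against the reference labels, not the truth.
    """
    p, q = prevalence, 1 - prevalence
    # classifier +, reference +
    tp = p * clf_recall * ref_recall + q * (1 - clf_specificity) * (1 - ref_specificity)
    # classifier +, reference -
    fp = p * clf_recall * (1 - ref_recall) + q * (1 - clf_specificity) * ref_specificity
    # classifier -, reference +
    fn = p * (1 - clf_recall) * ref_recall + q * clf_specificity * (1 - ref_specificity)
    # classifier -, reference -
    tn = p * (1 - clf_recall) * (1 - ref_recall) + q * clf_specificity * ref_specificity
    return tp, fp, fn, tn


if __name__ == "__main__":
    # Hypothetical classifier (recall 0.8, specificity 0.9); the reference is
    # assumed to have recall = specificity = 0.90.
    for prev in (0.1, 0.3):
        tp, fp, fn, tn = apparent_matrix(prev, 0.8, 0.9, 0.9, 0.9)
        print(f"prevalence={prev}: apparent recall={tp / (tp + fn):.3f}, "
              f"apparent prevalence={tp + fn:.3f}")
```

Setting `ref_recall = ref_specificity = 1` recovers the gold standard case, in which the apparent recall equals the classifier's true recall.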
Fig 3.
Relationships of apparent Recall, Precision, Specificity and Negative Predictive Value with prevalence, assessed using three imperfect reference standards of differing quality that contain errors independent of those in the classification.
The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.
Fig 4.
Relationships of apparent J, F1 and MCC with prevalence, assessed using three imperfect reference standards of differing quality that contain errors independent of those in the classification.
The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.
Fig 5.
Relationships of apparent Recall, Precision, Specificity and Negative Predictive Value with prevalence, assessed using three imperfect reference standards of differing quality that contain errors correlated with those in the classification.
The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.
Fig 6.
Relationships of apparent J, F1 and MCC with prevalence, assessed using three imperfect reference standards of differing quality that contain errors correlated with those in the classification.
The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.
Fig 7.
Relationship of apparent MCC with prevalence for a poor classification (Recall = Specificity = 0.5, J = 0) assessed with an imperfect reference (Recall = Specificity = 0.7) containing correlated errors.
The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.
Fig 8.
Relationships of apparent LR+ and LR- values with prevalence, assessed using three imperfect reference standards.
(a) Error in the reference is independent of that in the classification and (b) error in the reference is correlated with that in the classification. Note that in Fig 2B the Y axis for the positive LR was trimmed for visualisation purposes; the apparent value obtained for LR+ rises to 909.2. The true relationship, obtained using a gold standard reference, is also shown for comparative purposes with a dashed black line.