Aggregating soft labels from crowd annotations improves uncertainty estimation under distribution shift

Fig 10

Reliability diagram and expected calibration error (ECE, displayed as ECE × 100) for each soft labeling method in POS tagging.

Black bars indicate the accuracy in each confidence bin, and red bars indicate the gap between accuracy and confidence. Confidences are computed from the average of the logits produced by models trained with 10 different random seeds, with no temperature scaling. ECE for aggregation is comparable to that of the best-performing methods (WaWA and ZBS). Models trained using aggregated soft labels are better calibrated in both the low- and high-confidence regimes.
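
The quantities plotted in a reliability diagram of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes equal-width confidence bins, takes confidence to be the maximum softmax probability of the seed-averaged logits, and uses 10 bins, none of which the caption specifies. The function names `reliability_bins` and `ece` and the array layout are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def reliability_bins(seed_logits, labels, n_bins=10):
    """Per-bin statistics for a reliability diagram.

    seed_logits: array of shape (n_seeds, n_examples, n_classes)
    labels:      array of shape (n_examples,) with gold class indices
    Returns a list of (bin_index, weight, accuracy, mean_confidence)
    for every non-empty bin. Bin count and equal-width binning are
    assumptions of this sketch.
    """
    # Average the logits across seeds, then apply softmax once
    # (no temperature scaling).
    probs = softmax(np.mean(seed_logits, axis=0))
    conf = probs.max(axis=-1)
    correct = (probs.argmax(axis=-1) == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin b holds confidences in (edges[b], edges[b + 1]].
    idx = np.digitize(conf, edges[1:-1], right=True)

    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((b, mask.mean(), correct[mask].mean(), conf[mask].mean()))
    return rows

def ece(rows):
    # Weighted average of |accuracy - confidence| over non-empty bins.
    return sum(w * abs(acc - c) for _, w, acc, c in rows)
```

In a plot built from `reliability_bins`, the per-bin accuracy would give the bar height and `abs(acc - c)` the gap drawn on top of it; multiplying the result of `ece` by 100 puts it on a percentage scale.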

doi: https://doi.org/10.1371/journal.pone.0323064.g010