Fig 1.
Uncertainty quantification approach.
(A) Bounding boxes are drawn around Diptera specimens to generate cropped and uncropped versions of the dataset. (B) CNNs are trained on both image variants to evaluate the effect of cropping on classification performance. (C) Model confidence is estimated using test-time augmentation (TTA) and test-time dropout (TTD) with Monte Carlo sampling. The figure shows an example for one family, Fanniidae, but the uncertainty quantification was applied to all families in the dataset.
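The Monte Carlo step in panel (C) can be illustrated with a minimal sketch. It assumes that confidence is taken as the mean softmax probability of the predicted class over repeated stochastic forward passes; the caption does not state the exact definition, so this is one plausible reading rather than the authors' implementation. `toy_predict` is a hypothetical stand-in for a CNN with dropout or random augmentation active at test time, and `mc_confidence` is an illustrative name.

```python
import math
import random
from statistics import mean, pstdev


def toy_predict(image, rng=random.Random(0)):
    """Hypothetical stochastic classifier: noisy logits followed by softmax.

    In practice this would be a CNN forward pass with dropout kept active
    (TTD) or with a randomly augmented copy of the image (TTA).
    """
    logits = [2.0 + rng.gauss(0, 0.3), 0.5 + rng.gauss(0, 0.3), rng.gauss(0, 0.3)]
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_confidence(stochastic_predict, image, n_iter=100):
    """Summarise n_iter stochastic passes (assumed definition of confidence)."""
    runs = [stochastic_predict(image) for _ in range(n_iter)]
    n_classes = len(runs[0])
    # Average the class probabilities over all Monte Carlo passes.
    mean_probs = [mean([r[c] for r in runs]) for c in range(n_classes)]
    # Predicted class: the one with the highest mean probability.
    pred = max(range(n_classes), key=lambda c: mean_probs[c])
    # Spread of the predicted-class probability across passes.
    spread = pstdev([r[pred] for r in runs])
    return pred, mean_probs[pred], spread


pred, conf, spread = mc_confidence(toy_predict, image=None)
print(pred, round(conf, 3), round(spread, 3))
```

A low spread together with a high mean probability indicates that the prediction is stable under the sampled perturbations; a high spread flags images for which the model's output depends strongly on the dropout mask or augmentation drawn.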
Table 1.
Total number of images and number of unique species and genera for each family in the dataset.
Table 2.
Number of images in the training (60%), validation (20%), and testing (20%) datasets.
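A 60/20/20 partition of this kind could be produced as in the sketch below. The caption gives only the proportions; splitting within each family (a stratified split), the fixed seed, and the function name `split_60_20_20` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_60_20_20(items, seed=0):
    """Split (image_id, family) pairs into train/val/test lists.

    Assumption: the split is done per family so that every family keeps
    roughly the same 60/20/20 proportions in each subset.
    """
    by_family = defaultdict(list)
    for image_id, family in items:
        by_family[family].append(image_id)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for family, ids in sorted(by_family.items()):
        rng.shuffle(ids)
        n_train = round(0.6 * len(ids))
        n_val = round(0.2 * len(ids))
        train += [(i, family) for i in ids[:n_train]]
        val += [(i, family) for i in ids[n_train:n_train + n_val]]
        # Whatever remains after train and val goes to the test set.
        test += [(i, family) for i in ids[n_train + n_val:]]
    return train, val, test


# Toy input: two placeholder families with ten images each.
items = [(f"img_{k}", "family_a") for k in range(10)]
items += [(f"img_{k}", "family_b") for k in range(10, 20)]
train, val, test = split_60_20_20(items)
print(len(train), len(val), len(test))
```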
Table 3.
Overview of the data augmentation techniques used during test-time augmentation (TTA) and the probability (p), given as a percentage, with which each technique is applied.
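Applying each augmentation independently with its own probability p can be sketched as follows. The two transforms and their p values are placeholders chosen for illustration, not the entries of Table 3, and the image is represented as a nested list of grey values to keep the example free of dependencies.

```python
import random


def hflip(img):
    """Mirror the image left to right (placeholder transform)."""
    return [row[::-1] for row in img]


def brighten(img, delta=10):
    """Add a constant to every pixel, clipped at 255 (placeholder transform)."""
    return [[min(255, px + delta) for px in row] for row in img]


# (transform, p) pairs; both the transforms and the p values are assumptions.
PIPELINE = [(hflip, 0.5), (brighten, 0.3)]


def augment(img, pipeline=PIPELINE, rng=random):
    """Apply each transform independently with its probability p."""
    for transform, p in pipeline:
        if rng.random() < p:
            img = transform(img)
    return img


image = [[0, 50, 100], [150, 200, 250]]
print(augment(image, rng=random.Random(1)))
```

Calling `augment` once per Monte Carlo iteration yields a different randomly perturbed copy of the same test image on each pass.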
Table 4.
Comparison of three CNNs in terms of accuracy and efficiency.
The table includes the number of parameters, overall accuracy (OA), Kappa score, mean confidence, number of training epochs, time per epoch, test time for 100 Monte Carlo iterations, and whether the images were cropped to the bounding box.
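Two of the reported metrics can be computed from true and predicted labels as sketched below. The sketch assumes that "Kappa score" refers to Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement (equal to OA) and $p_e$ is the agreement expected by chance; the caption does not say which kappa variant was used. The labels "A" and "B" are placeholders.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance (assumed variant)."""
    n = len(y_true)
    p_o = overall_accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


y_true = ["A", "A", "B", "B"]
y_pred = ["A", "A", "B", "A"]
print(overall_accuracy(y_true, y_pred))  # 0.75
print(cohen_kappa(y_true, y_pred))       # 0.5
```

In this toy case three of four labels agree ($p_o = 0.75$) and the chance agreement is $p_e = (2 \cdot 3 + 2 \cdot 1) / 16 = 0.5$, giving $\kappa = 0.5$.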
Fig 2.
Confusion matrices for EfficientNetB4: (a) trained on images cropped to the bounding box and (b) trained on uncropped images.
Fig 3.
Confidence values of EfficientNetB4 for correctly classified images across all 15 Diptera families, with and without cropping to the bounding box.
Fig 4.
Comparison of EfficientNetB4 confidence values on 15 Diptera families.
The figure highlights differences in confidence between models trained on images cropped to the Diptera’s bounding box (red bounding box) and models trained on full (uncropped) images. For each Diptera family, two examples from the test dataset are shown: one with the smallest and one with the largest difference in confidence values.
Fig 5.
Confidence values for three Diptera families with morphological similarities.
Predicted and actual labels for selected examples classified by EfficientNetB4 using images cropped to their bounding boxes. The model’s predicted class, corresponding ground truth label, and prediction confidence are shown for each image.