Fig 1.
Typical vial images illustrating varying lighting conditions and irregularities in media surface.
Original vial image (A) and egg density representation from QuantiFly (B) for transparent defined medium (DM). Original vial image (C) and egg density representation (D) for opaque sugar/yeast (SY) medium. Red arrows highlight artefacts present in the images: (1) clumped eggs; (2) marks on vial base; (3) bubble artefacts in media; (4) specular reflection from vial surface; and (5) clumped eggs on food surface. Colour bar represents pixel density estimate of an egg.
Fig 2.
Bias in baseline algorithm predictions can be corrected using linear transformation.
The baseline algorithm tends to over-estimate low egg counts and under-estimate high egg counts, with the error polarity and magnitude related to the mean of the training set. (A-B) Error associated with egg counts (y-axis) from a set of vials containing transparent (A) or opaque (B) media, selected to span a broad range of ground-truth values (x-axis). There is a significant linear correlation with an x-axis intercept close to the mean of the distribution in these two datasets; the 95% confidence interval is indicated by dashed coloured lines. (C) By calculating a bias correction during training, it is possible to correct the baseline predictive estimate for vial counts. Accuracy improved for 10 different datasets when bias correction was applied (transparent media, a-e; opaque media, f-j). (D) Summary comparison of transparent media datasets without correction (light-blue) and with bias correction (dark-blue), and of opaque media without correction (yellow-orange) and with correction (dark-orange). Bias correction increased accuracy in both media types. Each dataset was captured independently and contains 8 vial images, with the exception of e and j, which represent the vials depicted in A and B respectively. Leave-one-out cross-validation was performed for each vial (7 in and 1 out). Accuracy represents the average of five statistical replicates; error bars represent SE (n = 5, statistical replicates); * represents P<0.05, Mann-Whitney one-tailed test.
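The linear bias correction and leave-one-out procedure described in this caption can be illustrated with a minimal sketch. It assumes the error (prediction minus ground truth) is modelled as a straight line in the ground-truth count and that the correction inverts that line; the function names `fit_bias`, `correct` and `loo_corrected` are hypothetical and are not taken from the QuantiFly source, and the exact form of the correction used by the software may differ.

```python
def fit_bias(truth, pred):
    """Least-squares fit of error = slope * truth + intercept,
    where error = pred - truth (assumed error model)."""
    n = len(truth)
    errors = [p - t for p, t in zip(pred, truth)]
    mean_t = sum(truth) / n
    mean_e = sum(errors) / n
    sxx = sum((t - mean_t) ** 2 for t in truth)
    sxy = sum((t - mean_t) * (e - mean_e) for t, e in zip(truth, errors))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_t
    return slope, intercept


def correct(pred, slope, intercept):
    """Invert pred = (1 + slope) * truth + intercept to estimate truth."""
    return (pred - intercept) / (1.0 + slope)


def loo_corrected(truth, pred):
    """Leave-one-out: fit the bias on all vials but one, then
    correct the prediction for the held-out vial."""
    corrected = []
    for i in range(len(pred)):
        train_t = truth[:i] + truth[i + 1:]
        train_p = pred[:i] + pred[i + 1:]
        slope, intercept = fit_bias(train_t, train_p)
        corrected.append(correct(pred[i], slope, intercept))
    return corrected
```

With eight vials, each call to `fit_bias` inside `loo_corrected` sees seven vials, matching the "7 in and 1 out" scheme. A negative fitted slope corresponds to the pattern in panels A-B, where low counts are over-estimated and high counts under-estimated.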
Table 1.
Performance of algorithm on defined media and SY-media datasets.
Fig 3.
Schematic illustration of training and evaluation modes of QuantiFly software.
(Left) Training mode: steps required to train a QuantiFly model to recognise eggs in an image scene. (Right) Evaluation mode: steps involved in evaluating images in bulk with a pre-trained model. Blue circles depict points at which a user must provide information to the system, either by specifying input/output locations of files or by labelling eggs in images.
Fig 4.
Characterisation of the quantity of training material required to achieve high prediction accuracy with QuantiFly software.
The accuracy of the QuantiFly software was compared on transparent and opaque media datasets after training with 1–7 training images. (A) Average accuracy of the algorithm compared to digital ground-truth for transparent media datasets (a-e). (B) Average number of eggs labelled for each level of training for transparent media datasets. (C) Average accuracy of the algorithm compared to digital ground-truth for opaque media datasets (f-j). (D) Average number of eggs labelled for each level of training for opaque media datasets. Each level of training was performed on every image in the dataset and repeated 5 times for each dataset, and the accuracy was averaged across all data. Error bars are SE. A paired Student's t-test (two-way) was performed on the data (p<0.05).
Fig 5.
Comparison of QuantiFly performance with that of a human counter.
Digital images were captured for four nutritionally different transparent media (A; C1-C4) and four different opaque media (B; D1-D4). Estimates of the eggs in each vial were compared for the following methods: automated counts from the QuantiFly algorithm; manual counts from a human; and a digital on-screen ground-truth count (grey). (C) Image of an opaque media vial with densely clustered eggs; red arrow 2 shows a region with a high level of clustering. Error bars represent standard error of differences in vial densities in each condition (C1-C4, n = 8; D1-D4, n = 5 vials per condition).
Table 2.
Performance of QuantiFly on transparent and opaque media compared to human manual counts and digital ground-truth counts for each dataset.
Fig 6.
(A) The number of replicates required to achieve a confidence interval of below 0.05 was calculated for the manual human count and for the QuantiFly software, using the population standard deviation calculated from Fig 5A and 5C. The plot shows the number of replicates required to separate conditions which differ by 1.1-fold. (B) Projected time requirements for counting vials for a single condition using the existing manual approach or the QuantiFly software.
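The caption does not state the formula used to derive the number of replicates, so the following is only a sketch of one common approach. It assumes a normal approximation in which the two-sided confidence interval on the difference between two condition means, each with the same population standard deviation, must be narrower than the difference itself. The function name `replicates_needed` and its parameters are hypothetical, and the numbers in the usage note are illustrative, not values from the study.

```python
import math
from statistics import NormalDist


def replicates_needed(mean, sd, fold, alpha=0.05):
    """Smallest n per condition such that the (1 - alpha) confidence
    half-width on the difference of two means, z * sd * sqrt(2 / n),
    is below the difference implied by the fold change.
    Assumes equal sd in both conditions and a normal approximation."""
    delta = mean * (fold - 1.0)                 # absolute difference to resolve
    z = NormalDist().inv_cdf(1.0 - alpha / 2)   # two-sided critical value
    return math.ceil(2.0 * (z * sd / delta) ** 2)
```

For an illustrative mean of 100 eggs and a 1.1-fold difference, `replicates_needed(100, 10, 1.1)` gives 8 replicates and `replicates_needed(100, 20, 1.1)` gives 31. Under this model, doubling the standard deviation roughly quadruples the number of replicates, which is why a counting method with lower variability needs fewer vials per condition.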