Crowdsourcing image analysis for plant phenomics to generate ground truth data for machine learning

doi:10.1371/journal.pcbi.1006337

Fig 1.

Overall schema of datasets (boxes) and processes (arrows) that led to the analyses (red).

Top row: The Expert Labeled dataset was used as a gold standard to analyze how well the different experimental groups (blue boxes) performed. Bottom row: The labeling from each experimental group was used to train an ML classifier. Each ML classifier was then tested against an expert-labeled test set.

Fig 2.

Example image used during training to demonstrate correct placement of bounding boxes around tassels.

Fig 3.

Drawing boxes around tassels.

Left: Sample participant-drawn boxes. Right: The red box is the gold standard box and the black box is a participant-drawn box.

Fig 4.

Density of precision-recall pairs by group.

Density based on a total of 61,888 participant-drawn boxes. A: Master MTurkers. B: MTurkers. C: Course Credit participants. D: Violin plots showing the distribution of F-measure per image per user, where white circles show the distribution median, black bars the second and third quartiles, and black lines the 95% confidence intervals.
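
The precision, recall, and F-measure values summarized in this figure can be illustrated with a minimal sketch. It assumes that precision is the share of a participant-drawn box that overlaps the gold standard box and that recall is the share of the gold standard box that is covered; this area-based definition, the `Box` type, and all function names are illustrative assumptions, not taken from the captions.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of a box; degenerate boxes have area 0."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they do not overlap)."""
    width = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    height = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    return max(0.0, width) * max(0.0, height)


def precision_recall_f(drawn: Box, gold: Box) -> Tuple[float, float, float]:
    """Area-based precision, recall, and F-measure (assumed definition)."""
    overlap = intersection_area(drawn, gold)
    precision = overlap / area(drawn) if area(drawn) > 0 else 0.0
    recall = overlap / area(gold) if area(gold) > 0 else 0.0
    total = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


gold_box = Box(0, 0, 10, 10)
drawn_box = Box(0, 0, 5, 10)  # lies fully inside the gold box, covers half of it
print(precision_recall_f(drawn_box, gold_box))
```

Under this assumed definition, a box drawn entirely inside the gold standard box scores perfect precision but reduced recall, while an oversized box scores the reverse.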

Table 1.

Parameter estimates from the ANOVA with the Master MTurker group as baseline.

Table 2.

Parameter estimates in linear mixed effects regression of time spent on each image.

Fig 5.

Both accuracy and time per question change as participants progress through the task.

A: Time spent in log scale as a function of image order. B: Mean F value decreases very slightly over the survey process.

Fig 6.

Best Linear Unbiased Predictors for images.

BLUPs are calculated in both analyses, for F_mean and for time in log scale. Color represents image difficulty as determined by an expert.
