Fig 1.
Differences in decay of fresh-cut lettuce stored in modified atmosphere packages (MAP) at 3.5°C.
The decay rating on the 0 to 10 scale corresponds to the estimated percentage of decayed tissue divided by ten. Note that the tissue samples were removed from the MAP bags before being photographed. At times, decay can be accompanied by profound cell lysis. The photograph in the lower-right corner shows a tissue sample with most or all of the cells already disrupted. This sample was photographed while still inside the MAP bag; therefore, a plastic mesh bag can be seen above the cell lysate. The mesh bag was used to keep the fresh-cut tissue together during sample preparation.
Fig 2.
Examples of indexes used for evaluations of agreement between pairs of ratings.
The Pearson correlation coefficient (r) measures how closely the ratings follow the best-fitting line (shown in orange). Lin’s concordance coefficient (ρc) combines the closeness of the ratings to the best-fitting line with how well the best-fitting line conforms to the identity line. The identity line (also called the line of equality, 1:1 line, or x = y line) passes through the origin at 45 degrees (slope of 1) and represents perfect agreement between evaluations (shown as a dashed black line). The coefficient of bias (Cb) is the ratio of the concordance coefficient to the correlation coefficient, Cb = ρc/r. The upper row shows pairs of ratings with a perfect correlation coefficient but an imperfect concordance coefficient. The lower row shows pairs of ratings with a coefficient of bias of 1 but an imperfect concordance coefficient.
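The three indexes can be computed from a pair of ratings along the following lines. This is a minimal sketch, not the authors' code: the function name `agreement_indexes` is hypothetical, and it assumes the usual form of Lin's coefficient with population (1/n) variances and covariance.

```python
from math import sqrt

def agreement_indexes(x, y):
    """Return (r, rho_c, c_b) for two equal-length lists of ratings.

    Assumption: population (1/n) moments, as in the usual definition
    of Lin's concordance coefficient.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    # Pearson r: closeness of the points to the best-fitting line.
    r = cov_xy / sqrt(var_x * var_y)
    # Lin's rho_c: also penalizes departure from the identity line.
    rho_c = 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
    # Coefficient of bias, Cb = rho_c / r.
    c_b = rho_c / r
    return r, rho_c, c_b
```

For example, a rater who scores every sample exactly two points above another rater gives r = 1 but ρc below 1, which is the situation illustrated in the upper row.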
Fig 3.
Results of principal component analysis performed on visual ratings.
Decay of fresh-cut lettuce was evaluated in 90 MAP bags by nine raters, with each rater evaluating the set of bags twice. R1 to R5 were experienced raters (blue color), while R6 to R9 were inexperienced raters (orange color). Each pair of ratings from the same rater is connected by an obround. The overall mean of all ratings is indicated by the red circle.
Fig 4.
Relationship between the composite reference standard (CRS) and 18 individual ratings performed by nine raters.
Orange and blue lines show the best fit between the CRS and the two sets of ratings. Black dashed lines are identity lines that show perfect agreement between the CRS and individual ratings. Within each panel, values in the lower right corner show the Pearson correlation coefficient (r), Lin’s concordance coefficient (ρc), and the coefficient of bias (Cb) between the two independent ratings of the same rater (a measure of repeatability, or intra-rater reliability). Values in the upper left corner show the same coefficients between the CRS and the individual rater (measures of accuracy). The accuracy of a rater is the mean of the accuracies calculated for the rater’s two ratings. R1 to R5 were experienced raters, while R6 to R9 were inexperienced raters.
Table 1.
Effect of rater’s experience on accuracy and reliability of visual ratings.
Fig 5.
Difference between composite reference standard (CRS) and 18 individual ratings performed by nine raters on the set of 90 MAP bags.
The upper panel shows a Bland-Altman plot. The blue line indicates the mean difference between the CRS and the ratings. Orange lines show the upper and lower values of the 95% limit of agreement. The lower panel shows the percentage of ratings in each CRS bin that fall outside the 95% limit of agreement. Orange bars show the frequency of overestimates, while blue bars show the frequency of underestimates.
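The quantities plotted in a Bland-Altman analysis can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function name `bland_altman` is hypothetical, the difference is taken as rating minus CRS (so overestimates are positive), and the 95% limit of agreement is taken as the mean difference ± 1.96 sample standard deviations.

```python
from statistics import mean, stdev

def bland_altman(reference, ratings, z=1.96):
    """Return (mean_diff, lower, upper, pct_over, pct_under).

    Assumptions: difference = rating - reference, and the limit of
    agreement is mean_diff +/- z * sample SD of the differences.
    """
    diffs = [r - c for c, r in zip(reference, ratings)]
    mean_diff = mean(diffs)
    sd = stdev(diffs)  # sample (n - 1) standard deviation
    lower = mean_diff - z * sd
    upper = mean_diff + z * sd

    n = len(diffs)
    # Percentage of ratings above the upper value (overestimates)
    # and below the lower value (underestimates).
    pct_over = 100.0 * sum(d > upper for d in diffs) / n
    pct_under = 100.0 * sum(d < lower for d in diffs) / n
    return mean_diff, lower, upper, pct_over, pct_under
```

Applying the same counts within each CRS bin, rather than over all ratings at once, would give per-bin percentages of the kind shown in the lower panel.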
Fig 6.
Percentage of ratings for individual raters that are outside of the 95% limit of agreement.
The limit of agreement was calculated from differences between composite reference standard (CRS) and 18 individual ratings performed by nine raters on the set of 90 MAP bags. R1 to R5 were experienced raters, while R6 to R9 were inexperienced raters. Orange bars show frequency of overestimates, while blue bars show frequency of underestimates.
Fig 7.
Decay progress calculated from 4,535 bags evaluated in eight different experiments.
Upper panel shows the mean decay and the standard deviation of decay calculated from means of eight experiments. Lower panel shows the frequency of lettuce samples in five bins. These bins were developed for simpler presentation of data. They combine visual ratings from the 0 to 10 rating scale, where 0 indicates no decay and 10 indicates complete decay.
Fig 8.
Profiles of H-values from Kruskal-Wallis tests that were calculated from either individual weekly ratings or from the area under the decay progress stairs (AUDePS) scores.
H-values were calculated separately for eight experiments, scaled to the 0 to 100 scale (where 100 is the maximum H-value for the experiment), and averaged. The orange and blue horizontal lines indicate periods where higher H-values were detected from ratings or AUDePS scores, respectively.
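The scaling and averaging step can be sketched as below. This is a hedged illustration only: the function name `scale_and_average` is hypothetical, the H-values are assumed to have been computed beforehand, and each experiment's profile is assumed to be scaled by the largest H-value in the list passed in for that experiment.

```python
def scale_and_average(h_profiles):
    """Scale each experiment's H-values to 0-100, then average.

    h_profiles: list of equal-length lists, one per experiment,
    each holding the H-values for successive evaluation times.
    Assumption: 100 corresponds to the maximum of each list.
    """
    scaled = [
        [100.0 * h / max(profile) for h in profile]
        for profile in h_profiles
    ]
    n_experiments = len(scaled)
    # Average across experiments at each evaluation time.
    return [sum(column) / n_experiments for column in zip(*scaled)]
```

Scaling before averaging keeps an experiment with unusually large H-values from dominating the mean profile.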
Fig 9.
Profiles of H-values from Kruskal-Wallis tests that were calculated from estimates of the time needed to reach a certain level of decay (e.g., T10D is time to 10% decay, T100D is time to 100% decay).
H-values were calculated separately for eight experiments, scaled to the 0 to 100 scale (where 100 is the maximum H-value for the experiment), and averaged. The orange, green, and blue horizontal lines indicate periods where the highest H-values were detected from T10D, T20D to T90D, or T100D, respectively.