Table 1.
Sequence of sampling (randomized complete block design).
Table 2.
Diagnostic code list.
Table 3.
Diagnostic code distribution and nondiagnostic summary.
Table 4.
Mean probability, across dogs, of obtaining ≥1 truth-consistent site when sampling k sites (without replacement) from four sites under three evaluation rules, with 95% bootstrap CIs (2.5th–97.5th percentiles).
Fig 1.
Probability of capturing ≥1 truth-consistent site vs. number of sites sampled (k), by rule.
Lines show means across dogs; bands show 95% bootstrap CIs. m_any = ≥1 pathologist’s read in the truth set; m_both_any = both in the truth set (not necessarily equal); m_both_same = both agree and the agreed code is in the truth set.
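The quantity p(k) plotted here can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each dog contributes a count m of sites (out of four) that are truth-consistent under a given rule, so that p(k) is the complement of a hypergeometric "all misses" probability, and it assumes a plain percentile bootstrap that resamples dogs. The function names, the replicate count, and the percentile indexing are hypothetical.

```python
import random
from math import comb

N_SITES = 4  # four sites per dog, as stated in the captions


def p_at_least_one(m, k, n=N_SITES):
    """P(>=1 truth-consistent site) when k of n sites are drawn without
    replacement and m of the n sites are truth-consistent."""
    if not (0 <= m <= n and 1 <= k <= n):
        raise ValueError("need 0 <= m <= n and 1 <= k <= n")
    # comb(n - m, k) is 0 when k > n - m, so the result is 1.0 in that case.
    return 1.0 - comb(n - m, k) / comb(n, k)


def mean_p_with_bootstrap_ci(m_per_dog, k, n_boot=2000, seed=0):
    """Mean p(k) across dogs with a percentile bootstrap CI (assumed scheme:
    resample dogs with replacement, take 2.5th and 97.5th percentiles)."""
    rng = random.Random(seed)
    probs = [p_at_least_one(m, k) for m in m_per_dog]
    mean = sum(probs) / len(probs)
    boots = []
    for _ in range(n_boot):
        resample = [rng.choice(probs) for _ in probs]
        boots.append(sum(resample) / len(resample))
    boots.sort()
    lo = boots[int(0.025 * (n_boot - 1))]
    hi = boots[int(0.975 * (n_boot - 1))]
    return mean, lo, hi
```

For example, a dog with one truth-consistent site out of four has p(2) = 1 − C(3,2)/C(4,2) = 0.5 under this sketch; the same function would be applied separately for each rule (m_any, m_both_any, m_both_same), since each rule yields its own m per dog.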
Table 5.
Paired, within-dog differences in p(k) for adjacent contrasts (bootstrap 95% CIs) and location tests.
Table 6.
Per-read correctness by site (GLIMMIX least-squares means on the probability scale).
Fig 2.
Model-based site effects on per-reading correctness.
GLIMMIX least-squares means with 95% CIs for LH, LI, RH, RI on the probability scale. Neither the site effect (p = 0.599) nor the pathologist effect (p = 0.781) was statistically significant. Point: LS-mean predicted probability; horizontal line: 95% CI.
Table 7.
Inter-pathologist percent agreement by site (exact 95% CIs).
Table 8.
Intra-observer agreement by site pair and pathologist.
Table 9.
Pooled (across pathologists) intra-observer agreement by site pair.
Fig 3.
Inter-pathologist agreement by site pair.
Simple κ with 95% CIs for each site pair pooled across pathologists, with N overlaid. LI–RH shows the highest agreement. Open circle: κ estimate; horizontal line: 95% CI.
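For readers unfamiliar with the statistic, the simple (unweighted, Cohen's) κ point estimate can be sketched as below. This is an illustrative sketch only: it assumes the two inputs are paired lists of diagnostic codes (for instance, the codes assigned at the two sites of a pair), the function name is hypothetical, and the confidence intervals shown in the figure are not reproduced.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Unweighted Cohen's kappa for two paired sequences of categorical codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(codes_a)
    # Observed agreement: fraction of pairs with identical codes.
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement from the two marginal code distributions.
    count_a, count_b = Counter(codes_a), Counter(codes_b)
    categories = count_a.keys() | count_b.keys()
    p_exp = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_obs - p_exp) / (1.0 - p_exp)
```

Perfectly matching code lists give κ = 1, while agreement no better than the marginals predict gives κ = 0.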
Table 10.
Mixed-effects models: main effects and clustering by dog.
Table 11.
Mixed-effects logistic models: main effects and clustering by dog.
Table 12.
Pairwise site comparisons for ordinal-scale outcome variables (cumulative logit models), with dog-level clustering (ICC).