The meaning of significant mean group differences for biomarker discovery

doi:10.1371/journal.pcbi.1009477

Table 1.

Biomarkers in homogeneous vs. heterogeneous conditions.


Fig 1.

Simulations of the degree to which 2 groups overlap at different effect sizes.

(a) Percentage of autistic individuals (red) falling within 1 SD and 2 SDs of the control (blue) distribution at effect sizes of d = 0.2, 0.5, 1, and 2.7. 0 = mean, σ = SD. Simulations are based on 10,000 random draws, assuming the same SD and absolute mean difference in the population. The red shaded area indicates the percentage of cases above 2 SDs. (b) Precision of effect size estimates with sample sizes of N = 20 and N = 100 per group. Although sample size does not bias the effect size estimates themselves, it does substantially affect their precision, which is reflected in the width of the CI. Purple shading denotes CIs around 1 SD of the mean and red shading CIs around 2 SDs of the mean. For example, for a small effect size at Cohen’s d of 0.2, with N = 100 participants per group, between 60% and 75% of autistic people would fall within 1 SD of the control mean. With smaller samples of N = 20 per group, the ranges grow to 40%–85% within 1 SD and to 75%–100% within 2 SDs. At Cohen’s d of 0.5, with N = 100 versus N = 20, 55%–71% versus 45%–80% of people with ASD would fall within 1 SD, and 89%–97% versus 85%–100% within 2 SDs, of the TD mean. Hence, with small sample sizes, the range of possible results is so wide that it is difficult to infer from a single study how frequent or how severe abnormalities on that measure are. As recently noted, studies with small sample sizes (low power), paired with publication bias and file drawer effects as well as high sample variability (true heterogeneity within a condition), can lead a whole field to overestimate the magnitude of the true population effect [49,50]. ASD, autism spectrum disorder; CI, confidence interval; SD, standard deviation; TD, typically developing.
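The overlap figures in this caption can be approximated with a short script. The sketch below is not the authors' simulation code; it assumes both groups are normally distributed with SD = 1, places the control mean at 0, and shifts the case group by Cohen's d. The function names (`fraction_within`, `simulated_fraction_within`, `sampling_range`) are hypothetical, and `sampling_range` uses a simple repeated-sampling percentile range as a stand-in for the CIs in panel (b), whose exact method is not stated here.

```python
import random
from statistics import NormalDist


def fraction_within(d, k):
    """Analytic share of cases ~ N(d, 1) lying within k SDs of a control mean of 0."""
    case = NormalDist(mu=d, sigma=1.0)
    return case.cdf(k) - case.cdf(-k)


def simulated_fraction_within(d, k, n_draws=10_000, seed=0):
    """Monte Carlo version of fraction_within, using n_draws random cases."""
    rng = random.Random(seed)
    hits = sum(abs(rng.gauss(d, 1.0)) <= k for _ in range(n_draws))
    return hits / n_draws


def sampling_range(d, k, n_per_group, n_studies=2_000, seed=0):
    """Central 95% range of the observed share across simulated studies.

    Assumption: the control mean and SD are treated as known (0 and 1),
    so only the case sample varies from study to study.
    """
    rng = random.Random(seed)
    shares = sorted(
        sum(abs(rng.gauss(d, 1.0)) <= k for _ in range(n_per_group)) / n_per_group
        for _ in range(n_studies)
    )
    lower = shares[int(0.025 * n_studies)]
    upper = shares[int(0.975 * n_studies) - 1]
    return lower, upper


if __name__ == "__main__":
    for d in (0.2, 0.5, 1.0, 2.7):
        within_1 = fraction_within(d, 1)
        within_2 = fraction_within(d, 2)
        print(f"d = {d}: {within_1:.1%} within 1 SD, {within_2:.1%} within 2 SDs")
    for n in (20, 100):
        low, high = sampling_range(0.2, 1, n)
        print(f"d = 0.2, N = {n}: observed share within 1 SD ranges {low:.0%} to {high:.0%}")
```

Under these assumptions the spread of observed shares is much wider at N = 20 than at N = 100, which is the point panel (b) makes about precision.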


Fig 2.

Simulation of how central tendencies and the shape of the distributions impact group overlap.

Translating the central tendencies of mean and SD into median and interquartile ranges in (a) a normal distribution and (b) a skewed distribution. Illustration of group overlap when (c) the case group is skewed but the control group is normal, (d) both groups are skewed, (e) exponentially modified gamma distribution with strong skewness, (f) with milder skewness, (g) platykurtosis, (h) leptokurtosis, (i) bimodal equal, and (j) bimodal asymmetric. SD, standard deviation.
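The translation from mean and SD to median and interquartile range in panels (a) and (b) can be illustrated numerically. This is a minimal sketch, not the procedure used for the figure: the skewed example uses a log-normal sample chosen purely for illustration, and the helper names `normal_median_iqr` and `sample_summary` are hypothetical.

```python
import random
from statistics import NormalDist, mean, quantiles


def normal_median_iqr(mu, sd):
    """Median and IQR implied by a normal distribution with mean mu and SD sd."""
    dist = NormalDist(mu, sd)
    q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
    return mu, q3 - q1  # in a normal distribution the median equals the mean


def sample_summary(values):
    """Mean, median, and IQR estimated from a sample."""
    q1, q2, q3 = quantiles(values, n=4)  # three quartile cut points
    return {"mean": mean(values), "median": q2, "iqr": q3 - q1}


# Illustrative right-skewed sample (assumption: log-normal, not the figure's distribution).
rng = random.Random(0)
skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

if __name__ == "__main__":
    med, iqr = normal_median_iqr(0.0, 1.0)
    print(f"normal: median = {med}, IQR = {iqr:.3f} SDs")
    print("skewed:", {k: round(v, 2) for k, v in sample_summary(skewed).items()})
```

In the normal case the IQR is about 1.349 SDs and the median coincides with the mean; in the right-skewed sample the mean is pulled above the median, so mean and SD no longer describe where most cases lie.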


Fig 3.

Average effect sizes of meta-analyses per modality.

The distributions of the original data included in the meta-analyses were often not reported. This is exemplified in S1 Table, in which we checked the information on data distributions given in the 49 original papers included in a review of emotion recognition [20]. However, the majority of papers employed parametric statistics, which may be taken as indicating a normal distribution. EF, executive function; ET, eye-tracking; fMRI, functional MRI; MMN, mismatch negativity; PRS, polygenic risk score; ROI, region of interest; sMRI, structural MRI; SNP, single-nucleotide polymorphism; ToM, theory of mind.


Table 2.

Use of the terms “on average” and/or “biomarker” in published papers, based on a PubMed search across domains.
