Correcting for batch effects in case-control microbiome studies

doi:10.1371/journal.pcbi.1006102

Fig 1.

The percentile-normalization procedure converts case and control values into percentiles of the control distribution, which allows normalized data to be pooled across studies.

Conceptual plot shows theoretical feature (OTU 1) abundance distributions for control samples and case samples from two independent studies. Converting a control distribution into percentiles of itself naturally gives rise to a uniform distribution (represented by flat blue distributions in central panels), while converting the case distribution into percentiles of the control distribution produces a non-uniform distribution when these two distributions differ (represented by skewed orange distributions in central panels). The right-most panel shows the result of pooling percentile distributions from study 1 and study 2. Percentile-normalization places data from separate studies onto a standardized axis that allows for cross-study comparison. Each simulated case and control distribution was produced by randomly sampling 100 times from a lognormal distribution. Study 1 control parameters: μ = 0.1 and σ = 0.7. Study 1 case parameters: μ = 0.8 and σ = 0.5. Study 2 control parameters: μ = 1.5 and σ = 0.2. Study 2 case parameters: μ = 1.75 and σ = 0.13.
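
The simulation described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `percentile_normalize`, the random seed, and the tie-handling rule (averaging the "strictly below" and "at or below" ranks) are assumptions made here. Only the lognormal parameters and the sample size of 100 come from the caption.

```python
import numpy as np

def percentile_normalize(values, controls):
    """Express each value as a percentile (0-100) of the control distribution."""
    ref = np.sort(np.asarray(controls, dtype=float))
    values = np.asarray(values, dtype=float)
    # Count controls strictly below and at-or-below each value; averaging the
    # two counts treats ties symmetrically (an assumption of this sketch).
    below = np.searchsorted(ref, values, side="left")
    at_or_below = np.searchsorted(ref, values, side="right")
    return 100.0 * (below + at_or_below) / (2.0 * ref.size)

rng = np.random.default_rng(0)  # arbitrary seed

# Lognormal (mu, sigma) parameters taken from the caption.
studies = {
    "study1": {"control": (0.1, 0.7), "case": (0.8, 0.5)},
    "study2": {"control": (1.5, 0.2), "case": (1.75, 0.13)},
}

pooled = {"control": [], "case": []}
for params in studies.values():
    control = rng.lognormal(*params["control"], size=100)
    case = rng.lognormal(*params["case"], size=100)
    # Each study is normalized against its OWN controls before pooling.
    pooled["control"].append(percentile_normalize(control, control))
    pooled["case"].append(percentile_normalize(case, control))

pooled = {group: np.concatenate(parts) for group, parts in pooled.items()}
print("mean control percentile:", pooled["control"].mean())
print("mean case percentile:   ", pooled["case"].mean())
```

Controls normalized against themselves spread evenly over the 0 to 100 range in both studies, even though the raw abundances sit on very different scales, while the case percentiles skew upward in both. That shared axis is what makes the pooled panel interpretable.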

Fig 2.

Batch effects between healthy controls from different studies can be reduced by ComBat and percentile-normalization.

Non-metric multidimensional scaling (NMDS) plot showing the distribution of healthy controls from three colorectal cancer studies in ordination space (Bray-Curtis distances of relative abundance OTU-level data). Despite standardized bioinformatic processing, the gut microbiomes of healthy controls differed significantly across studies (PERMANOVA p < 0.001; batch accounts for 6.342% of the total variance). Studies were still significantly different after applying ComBat, an established batch-correction method (PERMANOVA p < 0.01). Percentile-normalization, however, stabilized the variance across studies more effectively and removed any apparent batch effect (PERMANOVA p > 0.5).

Fig 3.

Pooling non-normalized samples from different studies can give rise to many spurious associations.

The control group from one study is gradually substituted with randomly chosen control samples from another study (non-normalized, percentile-normalized, limma-corrected, and ComBat-corrected), keeping the total number of case and control samples fixed at n = 40 (see conceptual illustration on the left). Mixing in non-normalized control samples from another study gave rise to spurious results due to batch effects (blue lines). ComBat- and limma-corrected data showed fewer spurious associations (green and red lines). Percentile-normalization showed no increase in spurious results along the titration gradient (orange lines).
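
The substitution step of this titration can be sketched as follows for a single feature. The helper name `titrate_controls`, the group sizes, the seed, and the lognormal distributions are illustrative assumptions and are not taken from the study data. The sketch only shows how a control group of fixed size is gradually replaced, and how the raw values drift as a result.

```python
import numpy as np

def titrate_controls(own_controls, other_controls, n_substituted, rng):
    """Swap n_substituted randomly chosen own controls for randomly chosen
    controls from another study, keeping the group size unchanged."""
    own = np.asarray(own_controls, dtype=float)
    other = np.asarray(other_controls, dtype=float)
    keep = rng.choice(own.size, size=own.size - n_substituted, replace=False)
    borrow = rng.choice(other.size, size=n_substituted, replace=False)
    return np.concatenate([own[keep], other[borrow]])

rng = np.random.default_rng(2)  # arbitrary seed

# Hypothetical single-feature abundances with a strong batch offset.
own = rng.lognormal(0.1, 0.7, size=20)
other = rng.lognormal(1.5, 0.2, size=60)

medians = {}
for k in (0, 5, 10, 15, 20):
    mixed = titrate_controls(own, other, k, rng)
    medians[k] = float(np.median(mixed))
    print(f"{k:2d} substituted -> group size {mixed.size}, median {medians[k]:.2f}")
```

With non-normalized values, the median of the mixed control group moves toward the other study's scale as more samples are substituted. A case-control test can mistake that drift for a biological difference.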

Fig 4.

False positive rates are reduced by batch-correction methods.

Random sets of 40 Baxter controls and random sets of 40 Zeller controls were selected for null case-control comparisons (20 iterations). Smaller points show the fraction of p-values ≤ 0.05 within a given iteration, while larger points show the average across all 20 iterations. Within each category, smaller points are randomly jittered along the x-axis for better visualization. The fraction of p-values ≤ 0.05 is highly inflated for non-normalized data (red dashed line shows the null expectation for p-values). Only abundant OTUs (detected in at least a third of case or control samples) were included in this analysis.
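
A null comparison of this kind can be sketched on simulated data. Several pieces below are assumptions of this sketch rather than details from the caption: the statistical test (a rank-sum test with a normal approximation and no tie correction), the simulated abundances and per-feature batch shifts, the study sizes, and all function names. The group size of 40 and the 20 iterations follow the caption. The prevalence filter is omitted because the simulated values are never zero.

```python
import math
import numpy as np

def midranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    return (last_rank - (counts - 1) / 2.0)[inverse]

def rank_sum_pvalue(a, b):
    """Two-sided rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    ranks = midranks(np.concatenate([a, b]))
    u = ranks[:n1].sum() - n1 * (n1 + 1) / 2.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return math.erfc(abs(z) / math.sqrt(2.0))

def percentile_normalize(values, controls):
    """Column-wise percentiles (0-100) of values within the control columns."""
    out = np.empty(values.shape, dtype=float)
    for j in range(values.shape[1]):
        ref = np.sort(controls[:, j])
        below = np.searchsorted(ref, values[:, j], side="left")
        at_or_below = np.searchsorted(ref, values[:, j], side="right")
        out[:, j] = 100.0 * (below + at_or_below) / (2.0 * ref.size)
    return out

def null_fractions(study_a, study_b, rng, n=40, iterations=20, alpha=0.05):
    """Fraction of features with p <= alpha in repeated control-vs-control tests."""
    fractions = []
    for _ in range(iterations):
        a = study_a[rng.choice(study_a.shape[0], size=n, replace=False)]
        b = study_b[rng.choice(study_b.shape[0], size=n, replace=False)]
        p = np.array([rank_sum_pvalue(a[:, j], b[:, j]) for j in range(a.shape[1])])
        fractions.append(np.mean(p <= alpha))
    return np.array(fractions)

rng = np.random.default_rng(1)  # arbitrary seed
n_features = 200
base = rng.normal(0.0, 1.0, n_features)
batch_shift = rng.normal(0.0, 0.5, n_features)  # hypothetical batch effect

# Two hypothetical studies of healthy controls (samples x features).
controls_a = rng.lognormal(base, 1.0, size=(500, n_features))
controls_b = rng.lognormal(base + batch_shift, 1.0, size=(500, n_features))

raw = null_fractions(controls_a, controls_b, rng)
normalized = null_fractions(
    percentile_normalize(controls_a, controls_a),
    percentile_normalize(controls_b, controls_b),
    rng,
)
print("non-normalized:       ", raw.mean())
print("percentile-normalized:", normalized.mean())
```

Because both groups are healthy controls, any feature with p ≤ 0.05 is a false positive. In this simulation, the non-normalized fraction should sit well above 0.05, while the percentile-normalized fraction should sit near or below it.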

Fig 5.

OTUs significant across CRC studies, but not within a given study.

Pooling data provides greater statistical power to detect subtle yet consistent differences in OTU abundances across sample groups. 18 OTUs are labeled by their most resolved taxonomic annotation. None of the OTUs in this plot was significant within either the Baxter or the Zeller study, but each became significant after pooling the percentile-normalized datasets (q ≤ 0.05).

Table 1.

Normalization methods impact the number of significant genus-level associations between cases and controls across multiple diseases.

Fig 6.

Genera that show a significant difference between CRC cases and controls within a given study, but not after pooling.

12 genera showed significant differences between cases and controls within a study (q ≤ 0.05), but not after pooling across CRC studies.

Fig 7.

Genera that do not show a significant difference between CRC cases and controls within a given study, but do after pooling.

Two genera did not show significant differences between cases and controls within a study, but became significant after pooling across CRC studies (q ≤ 0.05).
