Fig 1.
A: Workflow of RNA extraction, processing and hybridization at the study’s baseline (BL) and follow up (FU) examinations. I) Blood collection, monocyte enrichment and RNA isolation were performed following the same standard operating procedures at the study center for both BL and FU. II) RNA processing, including amplification, purification, and dilution, was performed using the same protocol at BL and FU; however, these steps were performed by different personnel at a different study site. III) At BL, Illumina HT12 BeadChips version 3 were used for hybridization, and at FU, Illumina HT12 BeadChips version 4. The BeadChips were scanned with the BeadArray Reader at BL and with the iScan at FU. B: Definition of replicated samples. 15 subjects were randomly selected, and RNA was isolated from these 15 subjects at both BL and FU. Based on the time point of RNA isolation and the time point of RNA processing, hybridization (preparation) and scan, three groups of sample replicates were defined: i) RNA isolated, prepared and arrays scanned at BL (BLrep), ii) RNA isolated at BL, stored for 5 years at -80°C, then prepared and arrays scanned at FU (BLFUrep) and iii) RNA isolated, prepared and scanned at FU (FUrep). C: Factors affecting observed gene expression differences between replicate measures. BLrep and BLFUrep differ by the time point of RNA preparation and array scan and thus reflect technical differences. For BLFUrep and FUrep, RNA preparation and array scan were performed at the same time point; therefore, observed variation mainly reflects biological differences. Observed differences between BLrep and FUrep comprise technical and biological variation. (rep = replicated sample).
Table 1.
Number of RNA measurements used for evaluating batch effect removal approaches at baseline and 5-year follow up.
Fig 2.
Batch effects between baseline and 5-year follow up samples.
Replicates of RNA samples were measured at baseline and follow up. Overall gene expression of those groups was used to generate plots visualizing distributions and variation of transcriptomes. A: Density plots show a clear shift between samples measured at baseline and follow up, with higher expression at the 5-year FU measurement. B: The plot of principal component (PC) 1 against PC2 from PC analysis indicates strong batch effects between examination dates. Measurements from the same date (BLFUrep and FUrep) are very similar to each other, although RNA was extracted 5 years apart. C: Boxplots of 9 measurements from the same GHS individual grouped by BLrep, BLFUrep and FUrep. Again, distributions of samples measured within the same batch are very similar to each other. BLrep: RNA extracted and measured at BL, BLFUrep: RNA extracted at BL and measured at FU, FUrep: RNA extracted and measured at FU. Red: BLrep measurements, blue: BLFUrep measurements, dark grey: FUrep measurements.
Fig 3.
Comparison of different batch effect removal approaches.
Replicates of RNA samples extracted at BL were hybridized on Illumina HT12 microarrays at both examination dates. Overall gene expression was rescaled by seven different approaches. Components of variance are visualized as PCA plots. Replicate samples extracted and measured at baseline (BLrep) are marked red and repeated measures at 5-year follow up (BLFUrep) in blue. Correction based on A: Deming regression, B: Passing-Bablok regression, C: linear mixed models, D: 3rd order polynomial regression and E: qspline [10] did not remove batch effects from the gene expression data: the PCA plots still show separate clusters for replicates processed and hybridized at the two time points. F: After applying ComBat, G: quantile normalization followed by ComBat, H: ReplicateRUV and I: quantile normalization plus ReplicateRUV, no clustering of samples by batch was observed, indicating successful removal of batch effects.
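As an illustration of the quantile normalization step named in panels G and I, the following minimal sketch maps every array onto a common reference distribution. It is not the pipeline used in the study: the function name `quantile_normalize`, the list-of-lists input layout, and the tie handling (ties broken by probe order) are assumptions made for this example.

```python
def quantile_normalize(samples):
    """Map every sample onto the same intensity distribution.

    `samples` is a list of equal-length lists, one list of probe
    intensities per array (hypothetical layout chosen for this sketch).
    Ties are broken by probe order, which is a simplification.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples need the same number of probes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_probes)
    ]

    normalized = []
    for s in samples:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    batch = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
    print(quantile_normalize(batch))
```

To normalize each batch separately, the function would be called once on the arrays of one batch and once on the arrays of the other, before any batch correction method is applied.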
Table 2.
Comparison of batch effect removal methods.
Fig 4.
Comparison of gene expression data between baseline and 5-year follow up.
Bland-Altman plots were produced using microarray data from one GHS individual to evaluate agreement of repeated measures before and after batch effect removal. Expression differences between technical replicates hybridized at baseline and 5-year follow up were therefore plotted against the mean expression of both time points for each probe. Each dot represents one probe and dense clusters of probes are marked blue. The majority of probes had low expression values. A: In uncorrected data, large expression differences between baseline and follow up data could be observed. Those differences were strongly dependent on the mean expression. B: After applying ComBat, differences were largely reduced, but were increased for probes with high expression. C: When quantile normalization was performed separately in each batch followed by ComBat, the best results were obtained in terms of agreement between repeated measures.
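The quantities behind a Bland-Altman plot can be sketched as follows. The per-probe mean and difference follow the description above; the sign convention (follow up minus baseline) and the conventional 1.96 SD limits of agreement are assumptions added for illustration, and `bland_altman` is a hypothetical helper name.

```python
from statistics import mean, stdev


def bland_altman(baseline, follow_up):
    """Return per-probe means and differences, the bias, and limits of agreement.

    `baseline` and `follow_up` hold the expression of the same probes in
    the two technical replicates. Differences are follow up minus
    baseline (an assumed convention).
    """
    if len(baseline) != len(follow_up):
        raise ValueError("both replicates need the same probes")

    means = [(b + f) / 2 for b, f in zip(baseline, follow_up)]  # x-axis
    diffs = [f - b for b, f in zip(baseline, follow_up)]        # y-axis

    bias = mean(diffs)               # systematic shift between batches
    spread = 1.96 * stdev(diffs)     # conventional limits of agreement
    return means, diffs, bias, (bias - spread, bias + spread)


if __name__ == "__main__":
    x, y, bias, limits = bland_altman([1.0, 2.0, 3.0], [2.0, 3.0, 5.0])
    print(x, y, bias, limits)
```

A bias near zero with narrow limits, and no trend of the differences along the mean axis, would correspond to good agreement between repeated measures.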
Fig 5.
Clustering of sample replicates after batch effect correction.
RNA samples extracted at BL and measured at BL (BLrep) or FU (BLFUrep) were clustered based on pairwise distance of overall gene expression. Each GHS individual is represented by an ID between 1 and 15. Batch membership is indicated by the labels BL for BLrep and BLFU for BLFUrep. RNA used for hybridization and scanning was taken from the same stock at both time points. As a consequence, overall gene expression for one individual should be very similar between technical replicates. Before batch effect correction, samples fell into two clusters representing batches. A: After applying ComBat, clustering improved. B: Quantile normalization plus ComBat led to clusters that mainly discriminate between individuals, indicating retained biological effects. C: ReplicateRUV led to comparable results and D: quantile normalization plus ReplicateRUV led to almost perfect classification.
Fig 6.
Comparison of batch effect removal by ReplicateRUV and ComBat in the full dataset.
Overall gene expression was corrected for batch effects by either ReplicateRUV or ComBat. A, B: Components of variance are visualized as PCA plots. Samples extracted and measured at baseline (BL) are marked red, repeated measures at 5-year follow up (BLFUrep) in blue and follow up (FU) samples in grey. Batch correction based on A: quantile normalization plus ReplicateRUV resulted in clusters indicating remaining batch effects, while B: quantile normalization plus ComBat removed those effects. C: 50 subjects with BL and FU gene expression data available were drawn from 1092 individuals in 22 iterations. Gene expression was hierarchically clustered and the number of subjects whose BL and FU samples fell into direct proximity was counted. On the y-axis, the proportion of correctly clustered pairs is shown for different batch effect removal approaches. Quantile normalization plus ComBat led to the highest proportion of pairs, indicating maintained intra-individual similarity between time points. D: Example dendrogram from hierarchical clustering after quantile normalization plus ReplicateRUV. Each individual is represented by an ID between 1 and 50. The labels BL and FU represent time points (BL: baseline, FU: 5-year follow up). One subject was identified whose BL and FU samples clustered in direct proximity (40.BL, 40.FU). E: Clustering based on quantile normalized and ComBat corrected data led to 16 individuals with BL and FU in the same cluster. F: Hierarchical clustering of quantile normalized data for 15 subjects with BLFUrep and FUrep measurements. Here, for 7 individuals (46.7%) all measurements fell into the same cluster.
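The pair-counting idea in panel C can be approximated with a simple proxy, sketched here for a single draw of subjects. Instead of building a dendrogram, the sketch treats a subject as correctly paired when its BL and FU profiles are mutual nearest neighbours by Euclidean distance. This is an assumption made for illustration and not necessarily the criterion used in the study; `fraction_paired` and the dictionary layout are hypothetical.

```python
import math


def fraction_paired(profiles):
    """Proportion of subjects whose BL and FU profiles are mutual nearest neighbours.

    `profiles` maps (subject_id, time_point) to an expression vector,
    with time_point being "BL" or "FU". Every subject is assumed to
    have both time points.
    """
    keys = list(profiles)

    def nearest(key):
        # Closest other sample by Euclidean distance.
        return min(
            (k for k in keys if k != key),
            key=lambda k: math.dist(profiles[key], profiles[k]),
        )

    subjects = {subject for subject, _ in keys}
    paired = sum(
        1
        for s in subjects
        if nearest((s, "BL")) == (s, "FU") and nearest((s, "FU")) == (s, "BL")
    )
    return paired / len(subjects)


if __name__ == "__main__":
    toy = {
        (1, "BL"): [0.0, 0.0],
        (1, "FU"): [0.0, 1.0],
        (2, "BL"): [10.0, 10.0],
        (2, "FU"): [0.0, 5.0],
    }
    print(fraction_paired(toy))
```

Repeating this over several random draws of subjects and averaging the returned proportions would mirror the iteration scheme described for panel C.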
Fig 7.
Maintenance of biological variation after quantile normalization and ComBat.
To assess whether biological sources of variability were maintained after batch effect removal, associations between each probe and body mass index (BMI) were calculated using linear mixed models within each batch containing 1092 samples before and after applying ComBat. A, B: For each probe, we plotted the effect of BMI on expression in ComBat corrected data on the x-axis and in quantile-normalized but uncorrected data on the y-axis. BMI beta estimates were highly correlated between corrected and uncorrected datasets in A: BL samples (R = 0.995) and B: FU samples (R = 0.986). C, D: The effect of BMI on gene expression in BL samples was plotted against the effect observed in FU samples after C: quantile normalization and D: quantile normalization followed by ComBat. The correlation between BMI effect estimates at BL and FU was slightly higher for ComBat corrected data (r = 0.787) than for data that were only quantile normalized (r = 0.781).
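The comparison of per-probe BMI effect estimates comes down to a correlation between two vectors of beta estimates. The following minimal sketch computes a Pearson correlation with the standard library only. The beta values in the usage example are invented placeholders, not study results, and the mixed-model fitting that would produce real estimates is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Placeholder per-probe BMI betas (hypothetical values).
    beta_uncorrected = [0.010, -0.004, 0.022, 0.000, -0.015]
    beta_corrected = [0.011, -0.003, 0.020, 0.001, -0.014]
    print(pearson_r(beta_uncorrected, beta_corrected))
```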