Table 1.
Per-library sequencing metrics.
Fig 1.
Box plots of the mean per-base Q-scores for R1 (left) and R2 (right) across three HiSeq X RTA versions. The mean, standard deviation (SD), and mean overall difference in Q-scores between RTA 2.7.7 and the two older versions (RTA 2.7.1 and 2.7.5) are listed above the box plots for each base and RTA version.
Fig 2.
Examples of average base call quality scores for whole genome bisulfite sequencing of libraries prepared from lymphoblastoid cell line NA10860.
Per-nucleotide quality scores (averaged per sequencing cycle) are shown separately for read 1 and read 2. A-D) SPLAT libraries. E-H) TSDM libraries. Panels A-C and E-G show Q-scores obtained with the HiSeq X RTA versions noted in each panel. Panels D and H show the corresponding data generated on the HiSeq 2500 platform. Box plots of Q-scores for guanines in read 2 only are shown in the rightmost panels.
Fig 3.
Examples of average base call quality scores for whole genome bisulfite sequencing of libraries prepared from leukemia cell line REH.
Per-nucleotide quality scores (averaged per sequencing cycle) are shown separately for read 1 and read 2. A, B) SPLAT libraries. C-F) TSDM libraries. Data from the HiSeq X RTA versions are plotted in panels A and C-E. Corresponding data generated on the HiSeq 2500 are plotted in panels B and F, and data generated on the NovaSeq are plotted in panel G. Box plots of Q-scores for guanines in read 2 only are shown in the rightmost panels.
Fig 4.
Variation in global methylation rates depends less on RTA version than on library preparation method.
A) Global methylation levels (average methylation level across the whole genome) for DNA samples NA10860 and REH are shown for the different library preparation methods and are colored according to RTA software version. B) Box plots showing the average methylation in 100 kb windows for the various libraries and RTA versions; the median values are denoted in the panel.
Table 2.
Global methylation levels computed from R1 and R2 separately.
Fig 5.
Scatter plots illustrating the high correlation of methylation calls.
A) Average methylation in 100 kb windows (n = 28,795) shown for data generated on HiSeq X and HiSeq 2500 systems. B) Methylation at individual CpG sites covered by more than 10 reads (n = 7.5 M) shown for the corresponding data sets.
Fig 6.
Correlation plots showing pairwise comparisons across a shared set of 3 M CpG sites covered by 10 reads or more in each dataset.
A) Heatmap of the Pearson correlation coefficients for comparisons across all the library types, sequencing software versions, and cell types used in the study, ordered by hierarchical clustering. B) The corresponding root mean square error (RMSE) values for the library/software comparisons, in the same order as plotted in panel A.
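As an illustration of the two metrics named in this caption, the following minimal sketch computes the Pearson correlation coefficient and the RMSE for every pair of datasets over a shared set of CpG sites. It is not the pipeline used in the study, which the captions do not describe; the function names (`pearson`, `rmse`, `pairwise_metrics`), the dataset labels, and the toy methylation values are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def pairwise_metrics(datasets):
    """Return {(name_a, name_b): (pearson_r, rmse)} for every pair of datasets.

    `datasets` maps a label to methylation fractions (0 to 1) at the same
    CpG sites, listed in the same order for every dataset.
    """
    return {
        (a, b): (pearson(datasets[a], datasets[b]), rmse(datasets[a], datasets[b]))
        for a, b in combinations(datasets, 2)
    }


# Hypothetical toy data: three shared CpG sites, two datasets.
toy = {
    "lib_A": [0.0, 0.4, 0.8],
    "lib_B": [0.1, 0.5, 0.9],
}
metrics = pairwise_metrics(toy)
```

The toy example shows why the two panels complement each other: a constant offset between datasets leaves the correlation at its maximum while the RMSE still registers the difference.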