Figure 1.
Studied three-generational pedigree.
Pedigree of eight individuals of European descent that was studied with exome capture arrays.
Table 1.
NGS run statistics for eight exomes, based on aligned high-quality sequencing reads.
Figure 2.
Sequence coverage of targeted exons.
The graph illustrates the cumulative coverage of targeted bases after sequencing 0.5 Gbp (red), 1 Gbp (blue), 1.5 Gbp (green), and 2 Gbp (purple). Sequencing 1 Gbp resulted in nearly 10x coverage of 50% of all targets; 2 Gbp of data increased this proportion to 88%. Depending on a study's goal, maximum coverage might not always be required.
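A cumulative coverage curve of this kind can be sketched from per-base read depths. The following is a minimal illustration, not the pipeline used in the study: the function names and the toy depth values are hypothetical, and it assumes one integer depth per targeted base.

```python
from typing import List, Sequence


def fraction_covered(depths: Sequence[int], min_depth: int) -> float:
    """Return the fraction of targeted bases with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def cumulative_coverage(depths: Sequence[int], max_depth: int) -> List[float]:
    """Return the fraction of bases covered at >= 1x, >= 2x, ..., >= max_depth x."""
    return [fraction_covered(depths, k) for k in range(1, max_depth + 1)]


# Hypothetical per-base depths for ten targeted bases (not data from the study).
toy_depths = [0, 3, 8, 10, 12, 15, 20, 25, 30, 40]

# Seven of the ten toy bases have depth >= 10.
print(fraction_covered(toy_depths, 10))
```

Plotting the list returned by `cumulative_coverage` against depth, once per sequencing yield, would give one curve per data set.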
Table 2.
Genomic variants detected in eight exomes, based on aligned data from two 454 GS FLX runs.
Figure 3.
Sensitivity of genotype calling based on HCDiff SNPs, AllDiff SNPs, and the proposed coverage-dependent genotype calling approach. A) False negative rates are based on concordance with a subset of 44,513 SNPs that overlapped with genotypes obtained with Illumina 1M-Duo BeadChips. The coverage-dependent variant calling approach, which calibrates cut-off rates according to array-based genotypes, is the most sensitive method, detecting >96% of SNPs at 5x coverage and >99% of all SNPs at ≥8x coverage. B) False positive rates. HCDiff is the most conservative algorithm and yields a lower false positive rate, while the more relaxed dynamic genotype calling algorithm yields error rates twice as high at lower coverage.
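Error rates of this kind can be estimated by comparing NGS calls with array genotypes at shared sites. The sketch below is an assumption about how such a comparison might be set up, not the authors' procedure: it treats the array as truth, reduces each genotype to variant/non-variant, and uses the textbook definitions FN / (FN + TP) and FP / (FP + TN). The function name and the toy site IDs are hypothetical.

```python
from typing import Dict, Tuple


def error_rates(array_variant: Dict[str, bool],
                ngs_variant: Dict[str, bool]) -> Tuple[float, float]:
    """Return (false negative rate, false positive rate) over shared sites.

    Each dict maps a site ID to True if a non-reference genotype was observed.
    The array genotypes are treated as truth (an assumption of this sketch).
    """
    shared = array_variant.keys() & ngs_variant.keys()
    tp = sum(1 for s in shared if array_variant[s] and ngs_variant[s])
    fn = sum(1 for s in shared if array_variant[s] and not ngs_variant[s])
    fp = sum(1 for s in shared if not array_variant[s] and ngs_variant[s])
    tn = sum(1 for s in shared if not array_variant[s] and not ngs_variant[s])
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need both variant and non-variant array sites")
    return fn / (fn + tp), fp / (fp + tn)


# Hypothetical toy genotypes (not data from the study).
array_calls = {"rs1": True, "rs2": True, "rs3": False, "rs4": False, "rs5": True}
ngs_calls = {"rs1": True, "rs2": False, "rs3": True, "rs4": False, "rs5": True}

fn_rate, fp_rate = error_rates(array_calls, ngs_calls)
print(f"FN rate: {fn_rate:.2f}, FP rate: {fp_rate:.2f}")
```

Stratifying the shared sites by read depth before calling `error_rates` would give rates per coverage level.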
Figure 4.
Variant read distribution across eight exomes.
Illustration of the dynamic nature of optimal cut-off rates for calling heterozygous/homozygous variants. At lower coverage (<10x) the ideal cut-off in our data is 88% variant reads, while it is 78% at coverage ≥20x. Optimal use of the data should take advantage of even low-coverage targets. Data are based on comparison with Illumina-genotyped SNPs. Green triangles: Illumina heterozygous genotypes; blue diamonds: Illumina homozygous genotypes. NGS genotypes are placed according to their percentage of variant reads (y axis).
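A coverage-dependent cut-off of this kind might be sketched as follows. Only the two thresholds stated in the caption (88% below 10x, 78% at ≥20x) come from the source. The linear interpolation between 10x and 20x, the function names, and the rule that a variant-read fraction at or above the cut-off is called homozygous are assumptions made for illustration.

```python
def hom_cutoff(coverage: int) -> float:
    """Return the variant-read fraction separating heterozygous from homozygous calls."""
    if coverage < 10:
        return 0.88  # stated in the caption for coverage < 10x
    if coverage >= 20:
        return 0.78  # stated in the caption for coverage >= 20x
    # Assumption: interpolate linearly between the two stated thresholds.
    return 0.88 - (coverage - 10) * (0.88 - 0.78) / 10


def call_zygosity(variant_reads: int, coverage: int) -> str:
    """Classify a site already flagged as variant as heterozygous or homozygous."""
    if coverage <= 0 or not 0 <= variant_reads <= coverage:
        raise ValueError("need coverage > 0 and 0 <= variant_reads <= coverage")
    fraction = variant_reads / coverage
    return "homozygous" if fraction >= hom_cutoff(coverage) else "heterozygous"


# Hypothetical read counts (not data from the study).
print(call_zygosity(7, 8))    # 87.5% variant reads at 8x, below the 88% cut-off
print(call_zygosity(24, 30))  # 80% variant reads at 30x, above the 78% cut-off
```

The sketch does not separate variant from reference sites; a lower threshold for that decision is not given in the caption and would have to be calibrated separately.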