Figure 1.
Three-generation extended pedigrees.
Panel A) shows a 3-generation extended pedigree in which the numbers give each individual's heterozygous genotype mismatch rate (%) at 15× coverage with base quality Q20 and no mapping error; Panel B) gives the corresponding mismatch rates for the standard approach of ignoring relatedness. Panels C) and D) display the heterozygous mismatch rates (%) when a fixed sequencing effort of 150× is allocated differently among family members: in Panel C) founders receive 30× and non-founders 5×, and in Panel D) founders receive 6× and non-founders 21×.
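The coverage figures in this caption can be cross-checked with a little arithmetic. The sketch below assumes that every founder receives the same coverage and every non-founder receives the same coverage, and that both allocations spend exactly 150×; the function name `solve_family_counts` is hypothetical, and the resulting counts are inferred from the caption's numbers rather than stated in it.

```python
from fractions import Fraction

def solve_family_counts(alloc_c, alloc_d, total):
    """Solve for (founders, non_founders) from two coverage allocations.

    Each allocation is (founder_coverage, non_founder_coverage), and both
    are assumed to satisfy f * founder_cov + n * non_founder_cov == total.
    Uses Cramer's rule on the resulting 2x2 linear system.
    """
    (a1, b1), (a2, b2) = alloc_c, alloc_d
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("allocations are proportional; counts are not identifiable")
    founders = Fraction(total * b2 - total * b1, det)
    non_founders = Fraction(a1 * total - a2 * total, det)
    return founders, non_founders

# Panel C: founders 30x, non-founders 5x; Panel D: founders 6x, non-founders 21x.
founders, non_founders = solve_family_counts((30, 5), (6, 21), 150)
print(founders, non_founders)

# Uniform 15x coverage (Panels A and B) over the same family size.
print((founders + non_founders) * 15)
```

Under these assumptions the two allocations are consistent with a pedigree of 4 founders and 6 non-founders, and uniform 15× coverage of those 10 individuals also totals 150×, so all four panels compare designs with the same sequencing effort.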
Table 1.
Percentage of missing non-reference genotypes (i.e. false negatives) per individual in families, for variants called by joint modeling of family data and by the standard approach of ignoring relatedness, at sequencing coverage between 5× and 30× and for input sequence data with Phred-scaled base quality Q20 (error rate of 1% per base) or Q30 (error rate of 0.1% per base) without mapping error.
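The Q20 and Q30 error rates quoted in these captions follow from the standard Phred definition, Q = -10 log10(p), where p is the per-base error probability. A minimal sketch of the conversion is given below; the function names are illustrative and not taken from PolyMutt or GATK.

```python
import math

def phred_to_error_rate(q):
    """Per-base error probability for a Phred-scaled quality score q."""
    return 10 ** (-q / 10)

def error_rate_to_phred(p):
    """Phred-scaled quality score for a per-base error probability p."""
    return -10 * math.log10(p)

for q in (20, 30):
    print(f"Q{q}: {phred_to_error_rate(q):.1%} error per base")
```

Each increase of 10 in the quality score corresponds to a tenfold drop in the per-base error rate, which is why Q30 data carry one tenth of the base-calling errors of Q20 data.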
Table 2.
Genotype mismatch rates (%) for different family structures with sequencing coverage of 5×, 15×, and 30× and input bases with Phred-scaled quality Q20 (1% error rate) or Q30 (0.1% error rate) without mapping error.
Figure 2.
Mismatch rates (%) for 4 categories of genotypes as a function of reference allele frequency in quartet pedigrees (two siblings and their parents) with base quality Q20 at 15× coverage without mapping error.
The 4 categories are (A) all genotypes, (B) homozygotes for the alternative allele, (C) heterozygotes and (D) homozygotes for the reference allele.
Table 3.
Mendelian inconsistency rates per triplet (father, mother and offspring) for genotypes called by joint modeling of family data (top panel) and by the standard approach in which relatedness was ignored, i.e. individuals were treated as unrelated (bottom panel), for sequencing coverage of 5× to 30× and bases with Phred-scaled quality Q20 (1% error rate) or Q30 (0.1% error rate) without mapping error.
Figure 3.
Power to detect de novo mutations (DNMs) in different pedigree structures for coverage from 5× to 40×.
Panel A) shows the power for trios with base quality Q20 and Q30, and Panel B) compares the power for trios, nuclear families with 2 and 3 siblings, and 3-generation extended pedigrees (shown in Figure 1) for base quality Q20 without mapping error.
Table 4.
Number of false positive de novo mutations per billion bases detected by PolyMutt with joint modeling of family data, for sequencing at coverage of 5× to 40× with Phred-scaled base quality Q20 (1% error rate) without mapping error, in different pedigree structures.
Table 5.
Heterozygous mismatch rates (%) and Mendelian inconsistency rates (%) per site for call sets generated by PolyMutt (family-aware) and by the standard approaches using PolyMutt (ignoring relatedness) and GATK, from empirically calibrated alignments of simulated reads with base quality Q20 in the pedigree shown in Figure 1.
Figure 4.
Receiver operating characteristic (ROC) curves of PolyMutt and the standard methods for de novo mutation (DNM) detection from empirically calibrated alignments of simulated reads at 30× coverage with base quality Q20.
PolyMutt (ignoring relatedness) and GATK calls were obtained by jointly calling each trio with PolyMutt and GATK, respectively, under the assumption that the individuals in the trio are unrelated.
Figure 5.
Comparison of two variant call sets from SardiNIA low-pass sequencing data, in which variant calling was carried out by explicitly modeling family relatedness (family calls) or by the standard approach of ignoring relatedness (standard calls).