Figure 1.
A recessive and compound heterozygote model of the phenotype.
In the left part of the figure (A and B), two rare recessive variants at the same gene locus are assumed to be directly genotyped. In the right part (C and D), two non-causal SNPs with higher minor allele frequencies, in LD with the causal SNPs, are genotyped instead. The upper panels (A and C) show the log-scaled frequencies of the cross-genotypes of the two variants. The lower panels (B and D) show an example of the genetic model under illustrative parameters: GRR_AA = 8, GRR_AaBb = 7, GRR_BB = 6, and r²_ac = r²_bd = 0.1.
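The genetic model in this caption can be sketched as a small penetrance function. This is an illustration only, not the authors' code: the function name, the 5% baseline (borrowed from the Figure 2 caption), the use of the largest applicable relative risk when several risk genotypes co-occur, and the treatment of every double heterozygote as a compound heterozygote (phase is not observed in unphased genotypes) are all assumptions.

```python
def penetrance(n_a, n_b, baseline=0.05, grr_aa=8.0, grr_aabb=7.0, grr_bb=6.0):
    """Disease probability given risk-allele counts at causal SNPs a and b.

    n_a, n_b: number of risk alleles (0, 1, or 2) at SNP a and SNP b.
    The GRR defaults are the illustrative values from the Figure 1 caption;
    the baseline prevalence is an assumed value.
    """
    if n_a not in (0, 1, 2) or n_b not in (0, 1, 2):
        raise ValueError("allele counts must be 0, 1, or 2")

    relative_risks = [1.0]            # single carriers and non-carriers stay at baseline
    if n_a == 2:                      # homozygous AA
        relative_risks.append(grr_aa)
    if n_b == 2:                      # homozygous BB
        relative_risks.append(grr_bb)
    if n_a == 1 and n_b == 1:         # double heterozygote AaBb (assumed in trans)
        relative_risks.append(grr_aabb)

    # Assumption: when more than one risk genotype applies, use the largest GRR.
    return min(1.0, baseline * max(relative_risks))


if __name__ == "__main__":
    for n_a in range(3):
        for n_b in range(3):
            print(n_a, n_b, round(penetrance(n_a, n_b), 3))
```

Under this sketch a carrier of a single risk allele (for example Aa with bb) has the same risk as a non-carrier, which is what makes the model recessive.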
Figure 2.
The expected P values for the CDH test.
The −log10(P) values for the two causal SNPs (left, A and B) and for the single-SNP chi-squared test (right, C and D) are derived as a function of the genotype relative risk (GRR_AA = GRR_BB = GRR_AaBb, ranging from 1 to 10), the minor allele frequency (q = q1 = q2, ranging from 0.01 to 0.05 with N fixed at 10,000; A and C), and the total sample size N (ranging from 6,000 to 10,000 with q fixed at 0.05; B and D). The baseline prevalence of the binary phenotype is fixed at 5% in all analyses.
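To give a feel for how such expected values depend on GRR, q, and N, the sketch below computes a 1-df chi-squared statistic on the expected 2×2 table of a collapsed risk-genotype indicator (AA, BB, or AaBb) against disease status. It is not the authors' derivation. It assumes a population sample of size N, Hardy-Weinberg proportions, independence between the two causal SNPs, and a baseline risk that applies to everyone without a risk genotype; all function names are hypothetical.

```python
import math


def risk_genotype_frequency(q1, q2):
    """P(AA or BB or AaBb), assuming HWE and independence between the two SNPs."""
    p_aa = q1 ** 2
    p_bb = q2 ** 2
    p_aabb = (2 * q1 * (1 - q1)) * (2 * q2 * (1 - q2))
    # AaBb is disjoint from AA and BB; AA and BB can co-occur, so subtract the overlap.
    return p_aa + p_bb - p_aa * p_bb + p_aabb


def expected_neg_log10_p(grr, q, n, baseline=0.05):
    """-log10(P) of a 1-df chi-squared test evaluated on expected cell counts."""
    f = risk_genotype_frequency(q, q)
    risk = min(1.0, baseline * grr)       # disease risk with a risk genotype

    a = n * f * risk                      # risk genotype, affected
    b = n * f * (1 - risk)                # risk genotype, unaffected
    c = n * (1 - f) * baseline            # no risk genotype, affected
    d = n * (1 - f) * (1 - baseline)      # no risk genotype, unaffected

    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))    # upper tail of chi-squared with 1 df
    return -math.log10(p) if p > 0 else math.inf  # inf if P underflows


if __name__ == "__main__":
    for grr in (1, 2, 5, 10):
        print(grr, round(expected_neg_log10_p(grr, q=0.05, n=10_000), 2))
```

Evaluating the statistic on expected counts ignores sampling noise, so it only indicates the trend: the value grows with GRR, with q, and with N.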
Table 1.
Percentage of P values smaller than or equal to the test threshold for single SNP analysis and collapsed genotype analysis of two causal variants.
Figure 3.
The power of CDH and single SNP analysis.
Proportion of P values ≤ 5×10⁻⁸ from the CDH analysis (green dots) and from the single-SNP Cochran-Armitage test of the two tagging SNPs c (red dots) and d (blue dots). Four SNPs were re-sampled 10,000 times from the Illumina 550K chip. SNPs a and b were physically close (<200 kb) and had low MAFs (<5%). SNP c was in LD with a, and SNP d was in LD with b. The genotypic relative risk was simulated according to the genotypes of a and b under the recessive and compound heterozygote model, where GRR_AA = GRR_BB = GRR_AaBb. The baseline prevalence of the binary phenotype was fixed at 5%. A, r²_ac × r²_bd ≤ 0.1; B, 0.1 < r²_ac × r²_bd ≤ 0.5; C, 0.5 < r²_ac × r²_bd ≤ 0.9; D, r²_ac × r²_bd > 0.9.
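The panel assignment and the power summary described in this caption can be sketched as follows. The function names and the input layout (one tuple of r²_ac, r²_bd, and a P value per simulated replicate) are hypothetical; only the bin boundaries and the 5×10⁻⁸ threshold come from the caption.

```python
from collections import defaultdict

GENOME_WIDE_ALPHA = 5e-8  # significance threshold stated in the caption


def ld_panel(r2_ac, r2_bd):
    """Return the Figure 3 panel (A to D) for a pair of tagging r-squared values."""
    product = r2_ac * r2_bd
    if product <= 0.1:
        return "A"
    if product <= 0.5:
        return "B"
    if product <= 0.9:
        return "C"
    return "D"


def power_by_panel(replicates, alpha=GENOME_WIDE_ALPHA):
    """Proportion of replicates with P <= alpha, computed separately per panel.

    replicates: iterable of (r2_ac, r2_bd, p_value) tuples (assumed layout).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r2_ac, r2_bd, p_value in replicates:
        panel = ld_panel(r2_ac, r2_bd)
        totals[panel] += 1
        if p_value <= alpha:
            hits[panel] += 1
    return {panel: hits[panel] / totals[panel] for panel in totals}


if __name__ == "__main__":
    toy = [(1.0, 1.0, 1e-9), (1.0, 1.0, 0.01), (0.2, 0.2, 1e-10)]
    print(power_by_panel(toy))
```

The toy input is made up purely to exercise the functions and says nothing about the power reported in the figure.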
Figure 4.
The power of CDH and the weighted sum statistic (WSS) [29] was plotted against the proportion of causal variants in the sampled region. A region spanning 200 kb was randomly sampled 10,000 times over the Illumina 550K chip without replacement. For each sampling, a binary trait was simulated by considering a proportion of the rare variants in the region to be causal under the recessive-set model described in [29]. Other parameters were fixed (α = 0.05, n = 10,000, and GRR = 10 for carriers of any homozygote or CH genotype of the causal variants). Four sets of P values were derived when (1) all SNPs in the region were analyzed by CDH (blue), (2) all SNPs with MAF < 0.05 were analyzed by WSS (red), (3) all non-causal SNPs were analyzed by CDH (green), and (4) all non-causal variants with MAF < 0.05 were analyzed by WSS (purple). The power was defined as the proportion of P values smaller than or equal to 5×10⁻⁸.
Figure 5.
Association between SNPs at MC1R and the red hair color in the Rotterdam Study.
The −log10(P) values for association with red hair color were plotted for each genotyped SNP according to its chromosomal position (blue dots) and for the CDH test in each sliding window of 100 SNPs (green dots, plotted at the left-most SNP of the window). The LD patterns in the Rotterdam Study population and in the HapMap CEU samples (release 27), together with the known genes in the region, are aligned below according to the physical position of the SNPs (genome build 36.3). The orange bar indicates the physical position of the MC1R gene. The yellow bar indicates the region between the two SNPs that yielded the most significant P value of the CDH test (left-most SNP rs258322 and right-most SNP rs8058895).
Figure 6.
Frequency of diplotypes and the prevalence of red hair in the Rotterdam Study.
The causal SNP a is rs1805007 and b is rs1805008. The tagging SNP c is rs2011877 and d is rs2302898. The causal alleles A and B are shown in red. Common alleles are shown on a green background and minor alleles on an orange background.
Table 2.
Frequency of red hair phenotype as a function of genotype of two non-causal SNPs tagging the causal variants at the MC1R gene locus.