A powerful method for pleiotropic analysis under composite null hypothesis identifies novel shared loci between Type 2 Diabetes and Prostate Cancer

doi:10.1371/journal.pgen.1009218

Fig 1.

Scenario I: QQ plots for pleiotropic analysis of null data on traits from 2 independent case-control studies.

Observed −log₁₀(p-values) are plotted on the y-axis and expected −log₁₀(p-values) on the x-axis. Either each study has 1,000 unrelated cases and 1,000 unrelated controls, or Study 1 has 4 times the sample size of Study 2, where Study 2 has 1,000 unrelated cases and 1,000 unrelated controls. Type I error performance of tests of pleiotropic effect of a genetic variant on the 2 traits is based on 9.99 million null variants with genetic effects that are either {β₁ = 0 = β₂}, {β₁ = 0, β₂ = log(1.15)}, or {β₁ = log(1.15), β₂ = 0}. The gray shaded region represents a conservative 95% confidence interval for the expected distribution of p-values. Only p-values ≥ 10⁻¹⁰ are shown.
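As an illustration of how the two axes of such a QQ plot can be derived, the following minimal sketch pairs sorted observed −log₁₀(p-values) with their expected values under a uniform null. The function name `qq_points` and the plotting position i/(n+1) are assumptions made for this example; the caption does not state which plotting position or confidence-band construction was used, and the shaded band is not reproduced here.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) pairs of -log10 p-values, sorted ascending.

    Assumption for this sketch: the i-th smallest of n p-values is compared
    with i / (n + 1), its expectation under a Uniform(0, 1) null.
    """
    n = len(pvalues)
    # Observed values on the -log10 scale, smallest first.
    observed = sorted(-math.log10(p) for p in pvalues)
    # Expected values on the -log10 scale, smallest first.
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    return list(zip(expected, observed))


if __name__ == "__main__":
    for exp, obs in qq_points([0.5, 0.01, 0.1]):
        print(f"expected={exp:.3f} observed={obs:.3f}")
```

Points lying close to the diagonal (observed roughly equal to expected) indicate a well-calibrated test on null data.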

Fig 2.

Scenario II: QQ plots for pleiotropic analysis of null data on traits from 2 case-control studies with different proportions of overlapping controls.

Observed −log₁₀(p-values) are plotted on the y-axis and expected −log₁₀(p-values) on the x-axis. The two studies are assumed to have equal sample sizes, with equal numbers of cases and controls in each. Each study has 1,000 unrelated cases and 1,000 unrelated controls, of which either 20%, 40%, 80% or 100% of the controls are shared between the two studies. Type I error performance of tests of pleiotropic effect of a genetic variant on the 2 traits is based on 9.99 million null variants with genetic effects that are either {β₁ = 0 = β₂}, {β₁ = 0, β₂ = log(1.15)}, or {β₁ = log(1.15), β₂ = 0}. The gray shaded region represents a conservative 95% confidence interval for the expected distribution of p-values. Only p-values ≥ 10⁻¹⁰ are shown.

Fig 3.

Scenario I: Power of PLACO, maxP and naive approaches at genome-wide significance level (5 × 10⁻⁸) for varying genetic effects of traits from 2 independent case-control studies.

Sobel’s approach is excluded from this figure since it has <1% power across all scenarios. The first naive approach (‘Naive-1’) declares pleiotropic association when p_Trait1 < 5 × 10⁻⁸ and p_Trait2 < 5 × 10⁻⁵, while the second naive approach (‘Naive-2’) uses a more liberal criterion: p_Trait1 < 5 × 10⁻⁸ and p_Trait2 < 5 × 10⁻³. Either each study has 1,000 unrelated cases and 1,000 unrelated controls, or Study 1 has 4 times the sample size of Study 2, where Study 2 has 1,000 unrelated cases and 1,000 unrelated controls.
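The decision rules compared in this figure can be sketched as simple threshold checks on the two single-trait p-values. The Naive-1 and Naive-2 thresholds are taken from the caption; the maxP rule shown here assumes the usual definition, in which the larger of the two p-values is used as the test's p-value. Function names are hypothetical, and PLACO itself is not implemented in this sketch.

```python
GENOME_WIDE = 5e-8  # genome-wide significance level used in the figure


def naive_1(p_trait1, p_trait2):
    """Naive-1: genome-wide significant for Trait 1 and p < 5e-5 for Trait 2."""
    return p_trait1 < GENOME_WIDE and p_trait2 < 5e-5


def naive_2(p_trait1, p_trait2):
    """Naive-2: genome-wide significant for Trait 1 and p < 5e-3 for Trait 2."""
    return p_trait1 < GENOME_WIDE and p_trait2 < 5e-3


def max_p(p_trait1, p_trait2):
    """maxP (assumed definition): the larger p-value must be genome-wide significant."""
    return max(p_trait1, p_trait2) < GENOME_WIDE


if __name__ == "__main__":
    p1, p2 = 1e-9, 1e-4
    print(naive_1(p1, p2), naive_2(p1, p2), max_p(p1, p2))
```

Note that the naive rules as written are asymmetric in the two traits: they require the stricter threshold for Trait 1 only.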

Fig 4.

Manhattan plot of the PLACO p-values of pleiotropic association of common genetic variants with outcomes (traits) T2D and PrCa.

The black horizontal dashed line corresponds to the genome-wide significance level α = 5 × 10⁻⁸. The 44 loci with genome-wide significant pleiotropic lead SNPs are highlighted. A locus is defined by clumping SNPs within a ±500 Kb radius of the lead SNP that have LD r² > 0.2. Within each locus, a PLACO-significant SNP with genetic effects in opposite directions for T2D and PrCa is plotted as a solid triangle (24 such loci); otherwise it is plotted as a solid circle. Each identified pleiotropic locus is categorized (color-coded) as follows. Three loci harbor SNPs that are marginally genome-wide significant for both T2D and PrCa (single-trait p < 5 × 10⁻⁸). Four loci contain SNPs that are marginally genome-wide significant for one disease and in close proximity (i.e., in the same locus) to another SNP that is marginally genome-wide significant for the other disease. Ten loci contain SNPs that are marginally genome-wide significant for one disease and in close proximity to another SNP that is marginally suggestively significant (single-trait p < 10⁻⁵) for the other disease. Two loci harbor SNPs that are marginally suggestively significant (but not genome-wide significant) for both T2D and PrCa. No locus contains SNPs that are marginally suggestively significant (but not genome-wide significant) for one disease and in close proximity to another SNP that is marginally suggestively significant (but not genome-wide significant) for the other disease. The remaining 25 loci identified by PLACO contain SNPs that are not even marginally suggestively significant for either T2D or PrCa.
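The locus definition and the triangle/circle marker rule described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the `Snp` record, its field names, and both function names are hypothetical, and LD r² with the lead SNP is taken as precomputed input rather than estimated from genotype data.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str            # hypothetical identifier field
    pos: int             # base-pair position on the lead SNP's chromosome
    r2_with_lead: float  # precomputed LD r^2 with the lead SNP


def locus_members(lead_pos, snps, radius_bp=500_000, r2_min=0.2):
    """Keep SNPs within +/- radius_bp of the lead SNP that have r^2 > r2_min."""
    return [
        s for s in snps
        if abs(s.pos - lead_pos) <= radius_bp and s.r2_with_lead > r2_min
    ]


def marker_shape(beta_t2d, beta_prca):
    """Triangle if the two effect estimates have opposite signs, else circle."""
    return "triangle" if beta_t2d * beta_prca < 0 else "circle"


if __name__ == "__main__":
    snps = [
        Snp("rsA", 1_000_000, 0.9),
        Snp("rsB", 1_400_000, 0.1),   # close enough, but LD too weak
        Snp("rsC", 1_700_000, 0.8),   # strong LD, but outside the radius
    ]
    print([s.rsid for s in locus_members(1_100_000, snps)])
    print(marker_shape(0.05, -0.03))
```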

Table 1.

The coloc colocalization posterior probability () for the lead SNPs from each of the 43 pleiotropic loci identified by PLACO.

Table 2.

The potentially novel loci detected by PLACO and with convincing evidence ( and ) of being causal for both T2D and PrCa from colocalization analysis.

Fig 5.

Regional association plot of significant pleiotropic locus near RGS17 with annotations such as CADD scores, RegulomeDB scores, and cis eQTL association p-values from 6 tissues.

Tissues considered are whole blood from eQTLGen Consortium; and adipose, liver, muscle-skeletal, pancreas, and prostate tissues from GTEx v8.

Fig 6.

Regional association plot of significant pleiotropic locus near UBAP2 with annotations such as CADD scores, RegulomeDB scores, and cis eQTL association p-values from 6 tissues.

Tissues considered are whole blood from eQTLGen Consortium; and adipose, liver, muscle-skeletal, pancreas, and prostate tissues from GTEx v8.
