Ranking and characterization of established BMI and lipid associated loci as candidates for gene-environment interactions

Phenotypic variance heterogeneity across genotypes at a single nucleotide polymorphism (SNP) may reflect underlying gene-environment (G×E) or gene-gene interactions. We modeled variance heterogeneity for blood lipids and BMI in up to 44,211 participants and investigated relationships between variance effects (Pv), G×E interaction effects (with smoking and physical activity), and marginal genetic effects (Pm). Correlations between Pv and Pm were stronger for SNPs with established marginal effects (Spearman’s ρ = 0.401 for triglycerides, and ρ = 0.236 for BMI) compared to all SNPs. When Pv and Pm were compared for all pruned SNPs, only BMI was statistically significant (Spearman’s ρ = 0.010). Overall, SNPs with established marginal effects were overrepresented in the nominally significant part of the Pv distribution (Pbinomial <0.05). SNPs from the top 1% of the Pm distribution for BMI had more significant Pv values (PMann–Whitney = 1.46×10−5), and the odds ratio of SNPs with nominally significant (<0.05) Pm and Pv was 1.33 (95% CI: 1.12, 1.57) for BMI. Moreover, BMI SNPs with nominally significant G×E interaction P-values (Pint<0.05) were enriched with nominally significant Pv values (Pbinomial = 8.63×10−9 and 8.52×10−7 for SNP × smoking and SNP × physical activity, respectively). We conclude that some loci with strong marginal effects may be good candidates for G×E, and variance-based prioritization can be used to identify them.


Author summary
Most contemporary studies of gene-environment interactions focus on gene variants that are known to bear strong and reliable associations with the traits of interest. The strategy is intuitive because it helps limit the number of tests performed by focusing on a relatively small number of gene variants. However, this approach is predicated on an implicit assumption that these loci are strong candidates for interactions owing to their established relationships with the index traits. The counter-argument is that, because these loci have highly consistent signals within and between populations that vary by environmental characteristics, the probability that these variants interact with other factors is low. The current analysis tests whether variants with strong marginal effects signals (i.e., those prioritized through conventional genome-wide association analyses) are strong or weak candidates for gene-environment interactions. Here we describe analyses focused on lipids and BMI that test this hypothesis by comparing marginal effect signals with variance effect signals and those derived from explicit genome-wide, gene-environment interaction analyses. We conclude that for BMI, there are features of the top-ranking marginal effect loci that render them stronger candidates for interactions than is true of variants with weaker marginal effects signals. These findings are likely to help optimize the efficiency of future gene-environment interaction analyses by providing evidence-based rankings for strong candidate loci.

Introduction
Gene-environment (G×E) interactions may contribute to complex diseases, but their detection has proven challenging; hence, a variety of approaches have been developed to enhance power. Most G×E analyses focus on loci that are strong biological candidates [1] or those with highly significant marginal effects [2]. The latter approach is attractive because these loci are available in many large cohorts, and can be conveniently followed up with interaction analyses if environmental data are accessible. Moreover, selecting SNPs with strong and reproducible marginal effect signals is a pragmatic data-reduction step that may improve power [3], although this approach risks omitting other promising candidates [4]. In a linear regression setting, the presence of interaction effects drives phenotypic variance heterogeneity by genotype [3,5]. Exploiting variance heterogeneity as a signature of interactions is appealing because, unlike standard approaches for assessing G×E interactions, no explicit information about environmental exposures is needed [6] and multiple exposures can be simultaneously considered.
Here we explored whether loci identified in large-scale genome-wide association studies (GWAS) of blood lipids and body mass index (BMI) are strong candidates for G×E interactions by comparing genome-wide variance heterogeneity P-value distributions generated using Levene's test against P-value distributions for marginal effects and explicit G×E interaction effects (for smoking and physical activity).

Results
We assessed between-genotype variance heterogeneity for up to 1,927,671 directly genotyped or imputed SNPs (HapMap II CEU reference panel [7]) that passed quality control (QC). Meta-analyses of Levene's test summary statistics [8] were performed for BMI (n ≤ 44,211 participants), and blood concentrations of high-density lipoprotein cholesterol (HDL-C) (n ≤ 34,315), low-density lipoprotein cholesterol (LDL-C) (n ≤ 34,180), total cholesterol (TC) (n ≤ 34,318) and triglycerides (TG) (n ≤ 34,110). We then obtained marginal effects results for the same index traits and SNPs from publicly available GWAS summary data from the GIANT (Genetic Investigation of ANthropometric Traits) Consortium [9] and GLGC (Global Lipids Genetics Consortium) [10,11].
We compared the genome-wide marginal effects with between-genotype variance heterogeneity results for each of the five cardiometabolic traits by calculating the association between marginal effects (P m ) and variance heterogeneity (P v ) P-values using the rank-based Spearman correlation (ρ). This was done using a set of 42,710 pruned SNPs produced using the --indep-pairwise command in PLINK (see Materials and Methods) to account for linkage disequilibrium (LD) among variants.
As shown in Table 1 (see also Fig 1A and S1 Table), the Spearman's ρ for the association between P m and P v for all pruned SNPs was of very small magnitude and only statistically significant for BMI. Restricting the analysis to SNPs meeting progressively more conservative P m thresholds (P m <0.05; P m <10 −4 ; previously established loci with P m <5×10 −8 in external datasets) yielded corresponding increases in the magnitude of these correlations, which were statistically significant for all traits except TC when focusing on previously established loci. The BMI correlation at the P m <0.05 threshold, as well as the test of equality with ρ for all SNPs, was statistically significant, suggesting concordance between marginal and variance signals at a nominal level of significance. The odds ratio (OR) for a SNP to have both P m <0.05 and P v <0.05 as compared to P v ≥0.05 was 1.33 (95% CI: 1.12, 1.57) for BMI, while the 95% CIs of ORs for other traits included 1. On the other hand, the P-value for a non-zero ρ for TG was statistically significant when focusing on the established loci and at P m <10 −4 , suggesting concordance between marginal and variance signals at more conservative P m thresholds.
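The odds ratio reported above can be illustrated with a small calculation on a 2×2 table of pruned SNPs cross-classified by P m <0.05 and P v <0.05. The text does not state how the confidence interval was derived, so the sketch below assumes the common Woolf (log-based) interval; the function name and the cell labels are hypothetical and no counts from the study are implied.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    Assumed cell layout (hypothetical labels, not taken from the paper):
      a: P_m <  0.05 and P_v <  0.05
      b: P_m >= 0.05 and P_v <  0.05
      c: P_m <  0.05 and P_v >= 0.05
      d: P_m >= 0.05 and P_v >= 0.05
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1, as reported for BMI, indicates that nominally significant marginal and variance signals co-occur more often than expected under independence.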
We further compared P m with interaction P-values from exposure-specific (smoking and physical activity) genome-wide interaction tests for BMI (P int ); this was only done for BMI owing to the requirement for an adequately powered external dataset (such a dataset was accessible through the GIANT consortium) (Table 2). Marginal effects GWAS were performed by strata of smokers vs. non-smokers and physically active vs. inactive participants (n = 210,316 European-ancestry adults [12]) respectively, and a heterogeneity test [12] was used to generate exposure-specific P int distributions. Spearman's ρ for the pruned set of SNPs in the SNP × physical activity and the SNP × smoking analyses were low and not statistically significant (Table 2). We also compared P int values and P v values for BMI. Spearman's ρ for the pruned set of SNPs were low and not statistically significant.
We next tested if the number of previously established marginal effect SNPs (P m <5×10 −8 ) that were also nominally significant (P v <0.05) for variance heterogeneity was greater than expected by chance (Tables 3 and 4, Fig 1). For 4 out of the 5 index traits, we observed enrichment at the lower end of the P v distribution (P v <0.05) for the established GWAS-derived lead SNPs. Thus, the nominally significant regions of the P v distributions were generally enriched for GWAS-derived loci.

Fig 1. A. For each lipid trait (HDL-C, LDL-C, TG and TC on the vertical axis) we ranked P v from Levene's test for all SNPs from lowest to highest so that the lowest P v for a given trait was assigned a rank equal to 1. We scaled ranks into percentiles such that the lowest P v corresponded to the 100 th percentile. We then plotted percentile-scaled ranks of GWAS-derived loci (black sticks on the blue axis) on the distribution of percentile-scaled ranks of genome-wide P v (blue axis) for each trait and marked in red loci with P v <0.05. Loci names are presented above the axis for the P v distribution of a given trait and are positioned in the same order as percentile-scaled ranks of GWAS-derived loci, but are equally spaced to facilitate cross-trait comparison (loci names with Levene's test P v <0.05 are highlighted in red). To the left of each axis we present counts of GWAS-derived loci with P v <0.05 and total number of GWAS-derived loci in the analysis separated by a dash, as well as the P-value for the binomial test (P binomial ).
B. Percentile-scaled ranks of GWAS-derived SNPs for BMI on the genome-wide distribution of P-values obtained from Levene's test (P v ) and between-strata difference test P-values (P int ) from the 'SNP × Physical Activity' and 'SNP × Smoking' interaction tests for BMI. For each analysis, we ranked P-values for all SNPs from lowest to highest so that the lowest P-value for a given trait was assigned a rank equal to 1. We scaled ranks into percentiles such that the lowest P-value corresponded to the 100 th percentile. We then plotted percentile-scaled ranks of GWAS-derived loci (black sticks on the blue axis) on the distribution of percentile-scaled ranks of genome-wide P-values (blue axis) from all four approaches and marked in red loci with P v <0.05 or P int <0.05 (or 95 th percentile for average rank between SNP × PA and SNP × Smoking). Loci names are presented above the axis for the P-value distribution of a given trait and are positioned in the same order as the percentile-scaled ranks of GWAS-derived loci, but are equally spaced to facilitate cross-trait comparisons (loci names with P v <0.05 or P int <0.05 are highlighted in red). To the left of each axis conveying each respective P-value distribution, we present counts of GWAS-derived BMI loci with P v <0.05 or P int <0.05 (or 95 th percentile for the average rank of the SNP × PA and SNP × Smoking interaction tests) and the total number of GWAS-derived loci in the analysis separated by a dash, as well as the P-value for the binomial test (P binomial ).
https://doi.org/10.1371/journal.pgen.1006812.g001
We also performed enrichment analyses to test if previously established marginal effects SNPs (P m <5×10 −8 ) are enriched for nominally significant (P int <0.05) interactions in the SNP × physical activity or SNP × smoking analyses, but no enrichment was observed (Table 3; Fig 1B). By contrast, for the physical activity and smoking interaction tests (using all pruned SNPs), the lower end of the P int distribution (P int <0.05) was enriched with SNPs that were nominally significant in the Levene's test analysis (P v <0.05) (Table 4). This enrichment translated into an OR of 1.08 (95% CI: 1.01, 1.14) for a SNP to have P int <0.05 given P v <0.05 vs. P v ≥0.05.
Finally, in the pruned SNP-set we used the Mann-Whitney U test to probe for systematic differences in P v and P m ranks. P-values were ordered from least significant to most significant, and the lowest 100 th centile (i.e. the most significantly associated SNPs) was compared to the remaining 99 centiles for each of the five traits. For BMI, SNPs in the lowest 100 th centile of the P m distribution had markedly higher P v ranks (i.e. more significant P v ) than the remaining SNPs (P Mann-Whitney = 1.46×10 −5 ; Table 5). Even when excluding previously established lead SNPs (P m <5×10 −8 ) for BMI (or SNPs within +/-500kb of them), SNPs from the lowest 100 th centile of the P m rank-ordered distribution had higher P v ranks than the remaining SNPs (P Mann-Whitney = 4.30×10 −4 ; Table 5). Conversely, no difference in P v ranks was observed for SNPs from the lowest 100 th centile of the P m rank-ordered distribution for the four blood lipid traits; this may reflect trait-specific G×E effects or differences in statistical power by trait. No differences in P v ranks between SNPs from the lowest 99 th centile of the P m rank-ordered distribution compared to SNPs from the 98 th to 1 st centiles of the distribution were observed for any trait (P Mann-Whitney >0.05; Table 5).
Similarly, no difference in P m ranks was observed for SNPs from the lowest 100 th centile of the P v rank-ordered distribution for any trait (P Mann-Whitney >0.05; Table 6). To assess whether a trait with a non-normal distribution (e.g. BMI) or strong marginal associations could cause spurious association between the marginal and variance signals, we recapitulated the analysis pipeline (correlation analysis, enrichment analysis, comparisons of rank P m and P v values) in simulations described in the Materials and Methods. These simulations did not reveal evidence of type I error inflation caused by the non-normal distribution of an outcome trait or by strong marginal effects. For instance, we extracted correlation P-values of P m , P v and P int generated from 5,000 simulations. QQ-plots of the 5,000 correlation P-values, 2,500 binomial P-values, and 2,500 Mann-Whitney U test P-values revealed no inflation (S1A-S1C Fig, S2A and S2B Fig and S3A and S3B Fig, respectively). Repeating these analyses on subsets of SNPs with low P m values did not materially change the results.

Discussion
Collectively, our analyses highlight a few variants with genome-wide significant marginal effects that may be strong candidates for G×E interactions owing to their strong concurrent variance heterogeneity P-values. For BMI, such SNPs are also overrepresented in the nominally significant part of the P v distribution. FTO is an excellent example, as it conveys strong marginal effects [13], exhibits high between-genotype heterogeneity here (Tables 2 and 3 and Fig 1B) and elsewhere [5], and reportedly interacts with physical activity, diet and other lifestyle exposures [2,14,15] and is associated with macronutrient intake [16,17].
Although variance heterogeneity tests are potentially powerful screening tools for G×E interactions, like most interaction tests, they may be bias prone. For example, apparent differences in phenotypic variances across genotypes may be caused by scaling, particularly when the phenotypic means also differ substantially [18], such that the per-genotype means and variances for index traits are correlated. However, where necessary we transformed variables, and the correlations between P m and P v were generally weak, excluding this as a likely source of bias. Using simulated data, we investigated whether the non-normal distribution of a trait can cause a spurious association between marginal and variance signals, which we show is highly improbable. Through further simulations, we assessed whether SNPs with large marginal effects inflate P v , but observed no inflation, indicating that large genetic marginal effects do not artificially inflate variance heterogeneity to a meaningful extent, and SNPs with low P m and low P v -values are thus likely to be strong candidates for G×E interactions, at least in the case of BMI. It might also be that combining populations from ancestral (e.g., hunter-gatherers) and contemporary environments increases variance heterogeneity owing to diversity in population substructure rather than G×E interactions per se [19]. However, this seems unlikely here, as the cohorts examined are from Westernized European-ancestry populations. There are several additional explanations for between-genotype variance heterogeneity, such as variance misclassification that can occur when the index variant is located within a haplotype containing rare functional variants that convey strong marginal effects [5].
Hence, although variance heterogeneity tests represent a useful data-reduction step, before conclusions are drawn about the presence or absence of G×E interactions, index variants should be validated by testing their interactions with explicit environmental exposures, as we did here with smoking and physical activity. However, genome-wide G×E interactions datasets are not comprised of functionally validated G×E interactions, as no such resource is currently available for human complex traits. This limitation inhibits the extent to which causal effects can be attributed to the top-ranking loci and their interactions with smoking or physical activity.
We conclude that the common approach of prioritizing loci with established genome-wide significant association signals without further discrimination for G×E interaction analyses might be useful, but the efficiency of such analyses could be substantially improved by focusing on variants with low P-values for both variance heterogeneity and marginal effects. We provide these rankings here to facilitate this approach.

Materials and methods
A detailed project flow-chart is shown in Fig 2.

Study sample
We performed a genome-wide search for SNPs whose associations with the following traits are characterized by high between-genotype variance heterogeneity: BMI, TC, TG, HDL-C and LDL-C. The variance heterogeneity analyses were performed using Levene's test [20] in up to 44,211 participants of European descent from seven population-based cohorts. Descriptions of these cohorts are presented in S2 Table. To minimize bias that might result from unequal sample sizes between SNPs when calculating the correlations between the P-values from the marginal (P m ) and variance heterogeneity (P v ) meta-analyses, we restricted the sample size for analyses to 26,000 participants for BMI and to 24,000 participants for lipid traits (S4 Fig).

Genotyping and imputation
A detailed summary of sample sizes, genotyping platforms, genotype calling algorithms, sample and SNP quality control filters, and analysis software for all participating cohorts is provided in S2 and S3 Tables. For each individual, SNPs were imputed using the CEU reference panel of HapMap II [7] (S2 Table). We excluded SNPs with low imputation quality (below 0.3 for MACH, 0.4 for IMPUTE, and 0.8 for PLINK imputed data), Hardy-Weinberg equilibrium P <10 −6 , directly genotyped SNP call rate < 95%, and minor allele frequency (MAF) < 1%.

Selection of SNPs identified through GWAS
We identified SNPs that have been robustly associated (P<5×10 −8 ) with the five cardiometabolic traits in European ancestry populations: 77 SNPs associated with BMI discovered by GIANT [9]; and 58 SNPs associated with LDL-C, 71 SNPs associated with HDL-C, 74 SNPs associated with TC, and 40 SNPs associated with TG [10,11] discovered by GLGC.

Variance heterogeneity analyses
We used Levene's test [20] to identify SNPs that show heterogeneity of phenotypic variances ($\sigma_i^2$) across the three genotype groups at each SNP locus (i = 0, 1, or 2). We first log10-transformed all five traits, then applied a z-score transformation by subtracting the sample mean and dividing by the sample standard deviation (SD), and further Winsorized the z-score values at 4 SD. The transformed phenotype Y was then used to calculate Z, defined by the absolute deviation of each participant's phenotype from the sample mean of his or her respective genotype group at a given SNP locus. For each trait, participating cohorts provided the necessary summary statistics for each genotype at each marker [8]. Specifically, the per genotype group counts ($n_{0s}$, $n_{1s}$, $n_{2s}$), per genotype means of Z ($\bar{Z}_{0s}$, $\bar{Z}_{1s}$, $\bar{Z}_{2s}$), and per genotype group variances of Z ($\sigma_{0s}^2$, $\sigma_{1s}^2$, $\sigma_{2s}^2$) were centrally collected and meta-analyzed. The minimum number of observations required per genotype group was 30 participants per cohort.
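The cohort-level preparation described above can be sketched as follows. This is a minimal illustration in Python rather than the software used by the participating cohorts; the function names, the dictionary layout of the returned summary statistics, and the choice to skip genotype groups below the minimum count are assumptions made for the example.

```python
import math
from statistics import mean, stdev, variance


def transform_trait(values, cap=4.0):
    """log10-transform, z-score (sample SD), then Winsorize at +/- cap SD."""
    logged = [math.log10(v) for v in values]
    m, sd = mean(logged), stdev(logged)
    return [max(-cap, min(cap, (x - m) / sd)) for x in logged]


def levene_summaries(y, genotypes, min_count=30):
    """Per-genotype count, mean and variance of Z for one SNP in one cohort.

    Z is the absolute deviation of each transformed phenotype from the mean
    of its own genotype group (0, 1 or 2 copies of the coded allele).
    Groups with fewer than min_count participants are skipped (an assumed
    handling of the 30-participant minimum stated in the text).
    """
    summaries = {}
    for g in (0, 1, 2):
        group = [yi for yi, gi in zip(y, genotypes) if gi == g]
        if len(group) < min_count:
            continue
        centre = mean(group)
        z = [abs(v - centre) for v in group]
        summaries[g] = {"n": len(z), "mean_z": mean(z), "var_z": variance(z)}
    return summaries
```

Only these three numbers per genotype group need to leave each cohort, which is what makes a central meta-analysis possible without sharing individual-level data.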
Meta-analyses were performed using the formula derived previously [8], in which N is the combined sample size, and $\bar{Z}_{is}$ and $s^2_{Z_{is}}$ are the sample mean and variance of Z in the i-th genotype group of the s-th study, respectively. When combining summary-level data to calculate the Levene's test statistic L, the following natural weights $\omega_{is}$ and $\gamma_i$ were calculated: $\omega_{is} = n_{is} / \sum_s n_{is}$ and $\gamma_i = n_i / N$, where $n_i$ is the sum of genotype counts in the i-th genotype group across all participating cohorts. These weights are determined by the frequency of the marker amongst the cohorts, such that each set of weights sums to 1, i.e. $\sum_s \omega_{is} = 1$ and $\sum_i \gamma_i = 1$. The meta-analysis Levene's test P-value is obtained by comparing L to an F-distribution with df 1 = 2 and df 2 = N−3.
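To illustrate how a pooled Levene's statistic can be assembled from the per-cohort summary statistics and the weights defined above, the sketch below assumes the standard one-way ANOVA decomposition applied to Z (between-genotype sum of squares over within-genotype sum of squares). The exact expression in [8] may differ in detail, and the function name and input layout are hypothetical. Because df 1 = 2, the upper tail of the F-distribution has a closed form, so no statistical library is needed.

```python
def meta_levene(cohorts):
    """Pooled Levene's test from per-cohort summary statistics (a sketch).

    cohorts: list of dicts mapping genotype (0, 1, 2) to a tuple
             (n_is, mean of Z, variance of Z) for that cohort.
    Returns (L, P) with P from an F-distribution with df1 = 2, df2 = N - 3.
    """
    groups = (0, 1, 2)
    n_i = {g: sum(c[g][0] for c in cohorts) for g in groups}
    N = sum(n_i.values())
    # Pooled genotype means: sum over studies of omega_is * Zbar_is,
    # with omega_is = n_is / sum_s n_is.
    zbar_i = {g: sum(c[g][0] / n_i[g] * c[g][1] for c in cohorts) for g in groups}
    # Grand mean: sum over genotypes of gamma_i * Zbar_i, gamma_i = n_i / N.
    zbar = sum(n_i[g] / N * zbar_i[g] for g in groups)
    between = sum(n_i[g] * (zbar_i[g] - zbar) ** 2 for g in groups)
    # Within-genotype sum of squares rebuilt from each cohort's variance of Z
    # plus the shift of each cohort mean from the pooled genotype mean.
    within = sum(
        (c[g][0] - 1) * c[g][2] + c[g][0] * (c[g][1] - zbar_i[g]) ** 2
        for c in cohorts
        for g in groups
    )
    df2 = N - 3
    L = (df2 / 2) * between / within
    # Survival function of F(2, df2): P(F > x) = (1 + 2x/df2) ** (-df2/2).
    p_value = (1 + 2 * L / df2) ** (-df2 / 2)
    return L, p_value
```

Under this decomposition, splitting one cohort into two and supplying their separate summary statistics returns the same L as analysing the combined cohort, which is the property that justifies meta-analysing summary-level data.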

Comparison between marginal effects and variance heterogeneity P-values
Marginal effects P-values for BMI and the relevant lipid traits were obtained from publicly available GWAS summary data from the GIANT [9] and GLGC [10,11] consortia, respectively (all cohorts included here in the Levene's meta-analysis were also included in the GIANT and GLGC datasets).
To illustrate our findings, we rank-ordered the P-values (from lowest to highest) from both marginal effects and variance effects analyses for all 1,927,671 SNPs so that the lowest P-value for a given trait was assigned a rank equal to the lowest 100 th centile. These rank-scaled distributions for P m for all five traits are presented in Fig 1. We calculated Spearman's correlations for each of the five cardiometabolic traits between P m and P v . This was done using a pruned set of SNPs. Pruning was performed in the TwinGene cohort using the --indep-pairwise 50 5 0.1 command in PLINK [21] by calculating LD (r 2 ) for each pair of SNPs within a window of 50 SNPs, removing one of a pair of SNPs if r 2 >0.1; we proceeded by shifting the window 5 SNPs forwards and repeating the procedure. Spearman's correlations were computed for categories of SNPs: i) all pruned SNPs, ii) the subset of SNPs that was nominally significant (P m <0.05) in the marginal effects analysis, iii) the subset of SNPs with P m <10 −4 in the marginal effects analysis, and iv) SNPs that were previously established in conventional marginal effects GWAS meta-analyses (P m <5×10 −8 ). We also compared Spearman's correlations between these categories of SNPs using the test for equality of two correlations [22].
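The correlation step can be sketched in a few lines. The Spearman coefficient below is the Pearson correlation of tie-averaged ranks, which is its standard definition. For the comparison of two correlations, the sketch assumes a Fisher z-transformation test for independent samples; whether this matches the test in [22], and how that test treats overlapping SNP subsets, is not stated in the text. Function names are hypothetical, and this is Python rather than the Stata code used for the reported analyses.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def compare_correlations(r1, n1, r2, n2):
    """Fisher z-test for two correlations (assumes independent samples).

    Returns the z statistic and a two-sided normal P-value.
    """
    z = (math.atanh(r1) - math.atanh(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return z, math.erfc(abs(z) / math.sqrt(2))
```

In use, `spearman_rho` would be applied to the P m and P v vectors of each SNP category in turn, and `compare_correlations` to the resulting coefficient for a subset against that for all pruned SNPs.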
Next, we performed enrichment analyses to test if there was a higher number of established SNPs in the nominally significant variance P-value (P v <0.05) distribution than expected by chance under the binomial distribution.
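As an illustration of this enrichment test, the sketch below computes the upper-tail binomial probability of observing at least k established SNPs with P v <0.05 out of n, when each SNP has a 5% chance of crossing that threshold under the null. The one-sided form is an assumption consistent with the phrase "higher number than expected by chance"; the function name is hypothetical.

```python
from math import comb


def binomial_enrichment_p(k, n, p0=0.05):
    """One-sided P(X >= k) for X ~ Binomial(n, p0).

    k:  number of established SNPs with P_v < 0.05
    n:  total number of established SNPs tested
    p0: expected proportion under the null (0.05 for a nominal threshold)
    """
    return sum(comb(n, j) * p0 ** j * (1 - p0) ** (n - j) for j in range(k, n + 1))
```

With n established loci for a trait, roughly 0.05 × n are expected below the nominal threshold by chance, so small P-values from this function indicate over-representation.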
We also tested if there is a difference in P v ranks for SNPs from the lowest 100 th centile of the P m rank-ordered distribution for all five traits and the rest of SNPs in the pruned set of SNPs using the Mann-Whitney U test, including and excluding established SNPs (or SNPs that were +/-500kb from the reported lead SNP). This analysis was repeated for SNPs from the 99 th centile vs SNPs from 1 st to 98 th centiles of the P m rank-ordered distribution. The same Mann-Whitney U tests were used to study differences in P m ranks for SNPs from the lowest 100 th and 99 th centiles of the P v rank-ordered distribution and the rest of SNPs in the pruned set of SNPs.
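The centile comparison can be sketched as below: SNPs are split into the most significant 1% of the P m distribution and the remainder, and the P v values of the two groups are compared with a Mann-Whitney U test. The sketch uses the large-sample normal approximation without tie or continuity corrections, which is an assumption; statistical packages may apply corrections or exact methods. Function names are hypothetical.

```python
import math


def split_top_centile(pm, pv):
    """P_v values for SNPs in the most significant 1% of P_m, and for the rest."""
    order = sorted(range(len(pm)), key=lambda i: pm[i])
    cut = max(1, len(pm) // 100)
    top = [pv[i] for i in order[:cut]]
    rest = [pv[i] for i in order[cut:]]
    return top, rest


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U for sample a, z statistic, two-sided P-value).
    """
    pooled = sorted(list(a) + list(b))
    # Map each distinct value to the average of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(rank_of[v] for v in a)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return u_a, z, math.erfc(abs(z) / math.sqrt(2))
```

A negative z for the top-centile group indicates that its P v values tend to be smaller (more significant) than those of the remaining SNPs, which is the pattern reported for BMI.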
All analyses were performed using Stata 12 (StataCorp LP, TX, USA), unless specified otherwise.

SNP × Physical activity and SNP × Smoking interaction analyses for the outcome of BMI
We used now-published data from 210,316 European-ancestry adults (from the GIANT consortium) pertaining to marginal effects meta-analyses for BMI that had been performed separately by strata of smoking (45,968 smokers vs. 164,355 non-smokers) [23]. The genetic marginal effect estimates, calculated separately within each of the two strata, were compared using a heterogeneity test [12] to infer the presence or absence of SNP × smoking interaction effects. The same analyses were performed using physical activity as a binary stratifying variable in up to 180,287 European-ancestry adults (42,065 physically active vs. 138,222 physically inactive) [24]. We calculated Spearman correlations between the P-values derived from the marginal effects meta-analysis and the P int from the interaction effects meta-analysis (i.e., the between-strata heterogeneity test for SNP × smoking and SNP × physical activity interactions from the GIANT consortium); these tests were undertaken for all SNPs and for those SNPs that were nominally significant (P m <0.05) in the marginal effects analysis. We then performed enrichment analyses to test if the numbers of nominally significant (P int <0.05) GWAS-derived SNPs from both SNP × physical activity and SNP × smoking analyses were greater than expected by chance under the binomial distribution. We further calculated the OR of having P int <0.05 given P v <0.05 versus P v ≥0.05 for both SNP × physical activity and SNP × smoking interaction analyses in a pruned set of TwinGene SNPs produced using the --indep-pairwise 50 5 0.8 command in PLINK [21].
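A between-strata heterogeneity test of this kind can be sketched as a z-test on the difference between the two stratum-specific effect estimates. The form below, including the optional correlation term r between the estimates, is an assumption about the general shape of such a test and is not taken from [12]; setting r = 0 treats the strata as independent. The function and parameter names are hypothetical.

```python
import math


def strata_difference_test(beta1, se1, beta2, se2, r=0.0):
    """z-test for a difference between two stratum-specific effect estimates.

    beta1, se1: effect estimate and standard error in stratum 1 (e.g. smokers)
    beta2, se2: effect estimate and standard error in stratum 2 (e.g. non-smokers)
    r:          assumed correlation between the two estimates (0 = independent)
    Returns the z statistic and a two-sided normal P-value (the P_int analogue).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    z = (beta1 - beta2) / se_diff
    return z, math.erfc(abs(z) / math.sqrt(2))
```

Applied SNP by SNP to the stratified GWAS results, the resulting P-values form the P int distribution that is then ranked and compared with P m and P v.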
Thereafter, we calculated the average rank for each SNP's ranking on the P int rank-ordered distributions from the SNP × smoking and SNP × physical activity interaction analyses and performed enrichment analysis using these average ranks with >95 th centile instead of P int <0.05 as the cut-off.

Simulations
We simulated genetic data for 44,000 individuals from a pruned set of 50,335 SNPs with allele frequencies, effect estimates and P m values drawn from the GIANT consortium. We generated an outcome trait by summing the products of the simulated allele counts and effect estimates over all SNPs for each individual, and subsequently added a randomly generated non-normal error term such that the trait resembles the observed distribution of the transformed BMI trait used in the main (real data) analyses. We also simulated a fixed binary interacting factor with 30% prevalence. Using this simulated dataset, we calculated P m , P v and P int values for each SNP and undertook i) pairwise Spearman correlation analyses between P m , P v and P int values (5,000 simulations), ii) enrichment analysis using binomial tests (2,500 simulations) and iii) Mann-Whitney U tests to determine systematic differences in P v and P m ranks (2,500 simulations). Following the same pipeline, we created additional simulated datasets narrowing down SNPs to i) those with P m values from the lowest percentile (n = 504; highest P m = 5×10 −3 ) and to ii) genome-wide significant SNPs (n = 71; P m <5×10 −8 ), and tested the pairwise Spearman correlation for P m , P v and P int values (1,000 simulations for both sets). Simulations were run using the statistical software R (v. 3.3.2) [25].

Supporting information
S1 Fig. A. Quantile-quantile plot of Spearman correlation test P-values for ranks of P m and P v . The figure illustrates 5,000 Spearman correlation P-values testing for correlation between P m and P v values drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. The dashed red line represents the correlation P-value obtained from the "real data" analysis presented in the main text. B. Quantile-quantile plot of Spearman correlation test P-values for ranks of P m and P int . The figure illustrates 5,000 Spearman correlation P-values testing for correlation between P m and P int values drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. C. Quantile-quantile plot of Spearman correlation test P-values for ranks of P int and P v . The figure illustrates 5,000 Spearman correlation P-values testing for correlation between P int and P v values drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. (TIF)

S2 Fig. A. Quantile-quantile plot of binomial test P-values for enrichment of variants with P v <0.05 among variants with P m <0.05. The figure illustrates 2,500 binomial P-values testing for enrichment of variants with P v <0.05 among all variants with P m <0.05. P v and P m values were drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. B. Quantile-quantile plot of binomial test P-values for enrichment of variants with P v <0.05 among variants with P int <0.05. The figure illustrates 2,500 binomial P-values testing for enrichment of variants with P v <0.05 among all variants with P int <0.05. P v and P int values were drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. The dashed red line represents the correlation P-value obtained from the "real data" analysis presented in the main text. (TIF)

S3 Fig. A. Quantile-quantile plot of Mann-Whitney U test P-values for systematic differences in P v ranks among variants with top ranking and lower ranking P m values. The figure illustrates 2,500 Mann-Whitney U P-values testing for systematic differences in P v ranks among those variants with the most significant P m values (100 th percentile of P m distribution) and the remaining variants (1-99 percentile of P m distribution). P v and P m values were drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. The dashed red line represents the correlation P-value obtained from the "real data" analysis presented in the main text. B. Quantile-quantile plot of Mann-Whitney U test P-values for systematic differences in P m ranks among variants with top ranking and lower ranking P v values. The figure illustrates 2,500 Mann-Whitney U P-values testing for systematic differences in P m ranks among those variants with the most significant P v values (100 th percentile of P v distribution) and the remaining variants (1-99 percentile of P v distribution). P v and P m values were drawn from a simulated dataset of 44,000 individuals and 50,335 SNPs. In the figure, the distribution under the null hypothesis is represented as a black line while its 95% confidence interval is represented as dashed gray lines. The dashed red line represents the correlation P-value obtained from the "real data" analysis presented in the main text.