## Correction

26 Nov 2014: The PLOS ONE Staff (2014) Correction: A One-Degree-of-Freedom Test for Supra-Multiplicativity of SNP Effects. PLOS ONE 9(11): e114890. https://doi.org/10.1371/journal.pone.0114890

## Abstract

Deviation from multiplicativity of genetic risk factors is biologically plausible and might explain why Genome-wide association studies (GWAS) so far could unravel only a portion of disease heritability. Still, evidence for SNP-SNP epistasis has rarely been reported, suggesting that 2-SNP models are overly simplistic. In this context, it was recently proposed that the genetic architecture of complex diseases could follow limiting pathway models. These models are defined by a critical risk allele load and imply multiple high-dimensional interactions. Here, we present a computationally efficient one-degree-of-freedom “supra-multiplicativity-test” (SMT) for SNP sets of size 2 to 500 that is designed to detect risk alleles whose joint effect is fortified when they occur together in the same individual. Via a simulation study we show that the SMT is powerful in the presence of threshold models, even when only about 30–45% of the model SNPs are available. In addition, we demonstrate that the SMT outperforms standard interaction analysis under recessive models involving just a few SNPs. We apply our test to 10 consensus Alzheimer’s disease (AD) susceptibility SNPs that were previously identified by GWAS and obtain evidence for supra-multiplicativity () that is not attributable to either two-way or three-way interaction.

**Citation: **Herold C, Ramirez A, Drichel D, Lacour A, Vaitsiakhovich T, Nöthen MM, et al. (2013) A One-Degree-of-Freedom Test for Supra-Multiplicativity of SNP Effects. PLoS ONE 8(10): e78038. https://doi.org/10.1371/journal.pone.0078038

**Editor: **Angelo Scuteri, INRCA, Italy

**Received: **May 23, 2013; **Accepted: **September 9, 2013; **Published: ** October 30, 2013

**Copyright: ** © 2013 Herold et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **TV was supported by the Deutsche Forschungsgemeinschaft grant BE 3828/3-2. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Despite thousands of confirmed disease susceptibility variants [1], the findings from Genome-wide association studies (GWAS) so far explain only a portion of the heritability of complex diseases [2]. Multi-SNP approaches like interaction and pathway analysis were proposed [3] to detect the still unexplained portion of genetic disease risk. While Genome-wide interaction analysis has become computationally feasible [4], [5], only few, if any, replicable interactions have been found so far. In order to explain this lack of evidence for interaction, Zuk et al. [6] suggested that common diseases may follow so-called limiting pathway liability models (LPLMs). An LPLM is defined by multiple risk factors which imply a risk threshold. Individuals with a risk allele load above the threshold have a strongly increased disease risk, while a baseline risk applies below the threshold. LPLMs can be viewed as a special case of the larger class of liability models [7], [8], which allow the risk contribution of the involved factors to vary. In addition, the LPLMs focus on a single pathway that is under polygenic influence. In contrast, Li et al. [8] describe two sources of liability to depression, namely genetic liability for stress sensitivity mediating depression, and genetic liability for depression in general. Both sources are shown to be under polygenic control. A key feature of these models and the simpler LPLMs is that they imply epistasis that goes beyond two-way interaction. Further important classes of more complex high-dimensional models have been discussed in [9].

Although pointed out previously [10], it is worthwhile to recall that diverging definitions and interpretation of the terms “interaction” or “epistasis” in the literature often lead to confusion. The topic is intrinsically difficult, since the statistical definition of interaction is scale-dependent [11]. In this paper, as in the majority of statistical publications on the topic, we interpret interaction as deviation from multiplicative relative risks, which corresponds to deviation from additivity on the logarithmic scale used in logistic regression models. This definition is the appropriate definition for rare diseases [12] and will also prove to be appropriate in the settings we are going to investigate.
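As a worked illustration of this definition for two risk factors, the following uses notation that is our own assumption and is not taken from the original text: $RR_{10}$ and $RR_{01}$ denote the relative risks of carrying only the first or only the second risk factor, and $RR_{11}$ the relative risk of carrying both.

```latex
\begin{align}
\text{multiplicativity (no interaction):}\quad
  & RR_{11} = RR_{10}\,RR_{01}
  \;\Longleftrightarrow\;
  \log RR_{11} = \log RR_{10} + \log RR_{01},\\
\text{supra-multiplicativity:}\quad
  & RR_{11} > RR_{10}\,RR_{01}.
\end{align}
```

For example, $RR_{10} = 1.2$ and $RR_{01} = 1.5$ give $RR_{11} = 1.8$ under multiplicativity; a joint relative risk above 1.8 would count as interaction in the sense used here.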

The risk allele threshold models proposed by Zuk et al. [6] lead to marginal effects that are comparable with effect sizes observed in GWAS and imply both low and high-dimensional interactions. However, pairwise interaction, although present, is typically so small that it would be detectable only with samples of several hundred thousand individuals. In this sense, LPLMs would be consistent with the expected importance of genetic interaction [10], [13], [14] on the one hand and the lack of statistical evidence for its presence on the other hand.

The search for deviation from multiplicativity in all medium-sized SNP subsets of a GWAS panel is clearly infeasible and not a realistic strategy in the coming years. However, an important research question is how to decide whether a *given* set of SNPs displays “supra-multiplicativity” of allelic risks. In this paper, we present a powerful one degree of freedom (d.f.) regression test for deviation from multiplicativity which simultaneously addresses interactions of all orders and which is particularly powerful in the presence of threshold models.

## Results

### Empirical Levels

Table 1 shows results from the simulations under the model with marginal effects, but no interaction effects of any kind. Under all scenarios, the empirical levels are slightly lower than the nominal level. This phenomenon is caused by the application of a Bonferroni-correction to not completely independent test statistics. The conservativeness is significant for and all SNP sizes, as well as for and for SNP sets with less than 30 SNPs. However, the observed conservativeness is rather small in size. The strongest difference we observe is an empirical level of 0.042 at for 40 SNPs. Therefore, we conclude that the application of the Bonferroni-correction is sufficient for practical purposes. The results of the power will support this claim.

Table 2 shows “nominal levels” or, depending on the perspective, power levels for the non-multiplicative models and . Model involves non-zero dominance effects and model has several pairwise interactions. In other words, a classic logistic regression test for dominance deviation or two-way interaction would be the method of choice. Under both models, the SMT shows a measurable excess over the nominal level, i.e., already 1-way interactions ( = dominance effects) and two-way interactions produce small signals. The excess over the nominal level is very small, in particular under scenario . Under scenario the excess is slightly more substantial: at a nominal level of 0.005 the empirical level is 0.021. The SMT will typically be applied to detect higher-order interactions, in which case adjustment for a priori known or data-derived dominance or interaction terms is warranted. When such terms are added as additional covariate parameters, the excess over the nominal level disappears (table 2). In other words, adjustment for significant interaction terms is possible and should be applied in order to investigate which terms drive the significance of a successful SMT application.

### Power of Single-marker Analysis and Conventional Interaction Analysis

Table 3 and table 4 show the power of conventional one-SNP-at-a-time and pairwise interaction analysis in the presence of different LPLMs. Let us first consider the penetrance value . Such a penetrance might be considered to be unusually high, but leads to allele relative risks no larger than 1.4. Consistently, power for single-marker analysis in table 3 is on a level that is reasonable given a sample size of 3000 cases and 3000 controls. With 50 SNPs, there is 62% power to detect at least one SNP at the Genome-wide level of . Power increases with decreasing size of the SNP set. This is not too surprising, since in our set-up we kept the portion of individuals above the liability threshold constant. As a consequence, the marginal effects become larger when fewer SNPs are part of the model. With , power of single-marker analysis is low for more than 30 SNPs and at , only under the 10-SNP model some power is left. However, it has to be emphasized that much higher penetrances have been suggested [6] and that we included the lower values for the purpose of completeness.

Power for pairwise interaction is typically absent (table 4), with the exception of the 10 SNP model and to some extent the 20 SNP model for . Even under the 10-SNP model interaction can be detected only at a significance level of , i.e., under the assumption that the 10 SNPs of the model are known a priori and that only these SNPs are tested for interaction. In a Genome-wide search it is impossible to detect any of the interactions at the experiment-wide significance level of . In summary, our LPLMs are consistent with the models proposed by Zuk et al. [6] and also consistent with the results of the last years of GWAS analysis: the LPLMs imply some detectable marginal effects, but pairwise interaction effects are not identifiable with the given sample size.

### Power of SMT

Figures 1, 2, 3, and 4 show power levels of the SMT in the presence of LPLMs which follow the specifications by Zuk et al. [6]. We start with the description of the power curve in figure 1. As expected, power increases with the percentage of model SNPs that can be included. In order to reach a power of 80%, only about 35% of the risk SNPs have to be known. Here, the required portion is smaller for the models with more SNPs, but the effect on the power curves depends only moderately on the number of SNPs in the model. The figure also shows that below a portion of 20% of known SNPs there is typically no power to detect an involvement in a liability model.

Figures 1 to 4. The $x$-axis represents the available percentage of SNPs of the complete model, the $y$-axis the power levels.

When the increased penetrance is reduced to 0.5 (figure 2), the portion of required SNPs increases as expected. However, a power level of 80% can still be reached when only about 35% to 55% of the involved SNPs are known. In addition, all models have 100% power when more than 60% of the SNPs are included.

When the increased penetrance is lowered to 0.3 (figure 3), power drops markedly. As mentioned before, 0.3 is a rather small penetrance value for a liability threshold model. We included this model for completeness and in order to show the limits of our approach. For 80% power, between 60% and 95% of the SNPs now have to be known a priori. It seems unlikely that such a high portion of disease SNPs is known in advance, at least at the current state of research. For higher penetrance values, it is more reasonable to assume that the required portion of SNPs is known a priori. Finally, at the lowest penetrance value (figure 4), power vanishes completely, except for the 10-SNP model. This penetrance value, however, is so small that single-marker analysis also has no power, cf. table 3. In other words, hardly any method will allow detection of association with the sample size investigated here.

In figure 5 the impact of incomplete tagging of the true causal variants ($r^2 = 0.80$ for all SNPs) is exemplified for the 30-SNP LPLM (). The critical allele load risk threshold under this model is . Under perfect tagging (blue curve, $r^2 = 1$), the regression estimates for “supra-multiplicativity” grow along with the allele load and rise to a sharp maximum that coincides with the simulated parameter. Under imperfect tagging (red curve), the regression estimates have a smaller range, which reflects the overall reduced power to detect supra-multiplicativity, cf. also the power curve under incomplete tagging in figure 6. On the other hand, the curve displays a similar behavior as before: it rises along with the allele threshold to a maximum. The true maximum is missed, but the peak occurs very close to the simulated parameter. We conclude that the SMT works well also when the true causal SNPs are only captured by variants in LD.

Figure 5. The $x$-axis represents the allele threshold, the $y$-axis the supra-multiplicativity effect estimate. The circles indicate the “true” risk allele threshold.

Figure 6. Comparison of fixed threshold and modified threshold model. The $x$-axis represents the available percentage of SNPs of the complete model, the $y$-axis the power levels.

Figure 6 contrasts the power of the SMT under the 30-SNP LPLM (blue curve) with its power under the modified LPLM, which allows variation of individual SNP risks (red curve). A slightly higher portion of known SNPs is now required to reach a given power level. The number of required SNPs increases by two on average. The SNP effects that result under the modified model are quite high (relative risks up to 2). As a consequence, a higher portion of the variance is now explained by marginal effects and the portion of variance explained by the supra-multiplicativity is reduced. In view of the fact that the modified LPLM implies strong marginal effects, we conclude that the SMT is rather robust against deviations from equal SNP contributions.

Table 5 shows the performance of the SMT in the presence of 3-SNP recessive models, i.e., models in which the high-risk genotype is defined by homozygosity for the risk allele at each of three SNPs. The statistical interaction method of choice would be the analysis of either all two-way interactions or the analysis of three-way interaction. In the standard logistic regression setting, a 1-d.f. test for three-way interaction can be constructed by testing the allelic interaction parameter . When dominance variance terms shall be considered, as well, eight different three-way interaction terms can be formulated and simultaneously tested. Pairwise interaction in each of three SNP pairs can be assessed either with a 1-d.f. allelic test for interaction (investigation of ) or a 4-d.f. genotypic test for interaction.

At low significance levels, particularly at , the 2-SNP 1-d.f. interaction tests outperform the SMT, which is partially caused by the fact that power values for the pairwise tests are not corrected for the multiple testing of three pairs. At , the SMT outperforms single-marker analysis and all standard interaction tests, under all four recessive models investigated. Under model REZ-C, for instance, power levels are below 0.02 for all standard tests while the SMT has remarkable power of 80%. This result might be surprising at first glance, but can be explained by the architecture of the SMT. The completely recessive model implies single-SNP dominance variation, two-way and three-way interactions. The interaction terms often will not reach significance, either because the effect is too small or because of the high number of degrees of freedom. The SMT, however, combines the evidence for two- and three-way interaction in a single test statistic that follows a $\chi^2$-distribution with just one degree of freedom. In addition, the SMT takes into account that the effect directions of risk alleles are not altered, a fact that is not addressed when two-way and three-way interaction terms are combined in a standard regression test. Thus, the SMT can also be a powerful alternative when the goal is “just” interaction analysis of SNP sets of small size.

### Data Analysis

We applied the SMT to known Alzheimer’s disease (AD) susceptibility loci. According to the GWAS catalogue [1], there are currently 10 confirmed consensus AD genes [15]–[19]. These genes, together with the SNP showing strongest evidence for association, are APOE (rs429358), ABCA7 (rs3764650), CR1 (rs3818361), PICALM (rs3851179), CLU (rs11136000), BIN1 (rs744373), EPHA1 (rs11767557), CD2AP (rs9349407), CD33 (rs3865444) and MS4A (rs610932). We obtained genotypes for these SNPs from a previously unpublished late-onset AD Genome-wide case-control study, genotyped on the Illumina Omni1M micro-array. AD patients were recruited within the German Dementia Competence Network, DCN (http://www.kompetenznetz-demenzen.de) and at the interdisciplinary memory clinic of the Department of Psychiatry and Department of Neurology at the University Hospital in Bonn, Germany. Diagnosis of AD dementia was established according to NINCDS-ADRDA criteria [20]. All patients gave written informed consent for participation in the entire study protocol. After application of standard quality control procedures, 850,612 genotypes of 649 cases and 1,096 selected controls were available. We used the software IMPUTE2 [21] to impute genotypes based on the February 2012 release of the 1000 Genomes Project [22] and extracted the 10 AD SNPs for analysis.

As expected, our data shows overwhelming evidence for association of APOE with AD (, table 6). For the remaining SNPs, the level of significance that is reached with single-marker analysis is only moderate. However, for 8 out of 10 SNPs the risk allele is consistent with the risk allele reported in the GWAS catalogue, which suggests that the effects of the known SNPs are reflected in our data, but not significant because of a lack of power. Of note, for two SNPs, rs744373 () and rs3865444 (), the effect direction is not in concordance with the GWAS catalogue. When the SMT is applied to a priori known SNPs, it is essential to count the number of risk alleles according to risk allele specification of the outside source and not according to the risk allele assignment that would follow from the data itself. Indeed, it cannot be expected that all susceptibility variants have the “correct” allele direction in a sample that is smaller than the samples from which the AD consensus SNPs were derived. Therefore, we computed the individual risk allele loads according to the GWAS consensus risk alleles.

Prior to application of the SMT, we screened the data for SNP-dominance effects, and two- and three-way interactions. None of the dominance or pairwise interaction terms was nominally significant. Analysis of three-way interaction of 120 SNP triples was not significant after adjustment for multiple testing ().

Table 7 shows the results of the supra-multiplicativity analysis. The allele thresholds 6, 7 and 12 yield nominally significant p-values, , and , respectively. While the lower thresholds have effect estimates below zero (, ), the higher threshold has an effect estimate above zero (). In other words, for thresholds disease risk is lower than what would be expected under complete multiplicativity, while for disease risk is “supra-multiplicative”. Thus, the effect directions are consistent with the notion of a threshold model: disease risk increases over-proportionally with higher risk allele loads. However, a sharply defined threshold, as observable with simulated data (figure 5), is not detectable. Instead, the effect estimates grow gradually with increasing threshold (figure 7). For illustrative purposes, table 6 also contains odds ratios computed from a two-by-two contingency table with case-control status and the number of individuals with a risk allele load below/above the threshold as attributes. These odds ratios also grow gradually with increasing allele threshold.
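The illustrative odds ratios from the two-by-two table can be sketched as follows. This is a minimal sketch and not the authors' code: the function name is hypothetical, and we assume that "above the threshold" means an allele load equal to or greater than the threshold.

```python
def threshold_odds_ratio(loads, is_case, threshold):
    """Odds ratio of case status for allele load >= threshold vs. below it.

    Assumption: individuals with load equal to the threshold count as 'above'.
    """
    a = b = c = d = 0  # a: above/case, b: above/control, c: below/case, d: below/control
    for load, case in zip(loads, is_case):
        above = load >= threshold
        if above and case:
            a += 1
        elif above:
            b += 1
        elif case:
            c += 1
        else:
            d += 1
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: empty off-diagonal cell")
    return (a * d) / (b * c)
```

Called once per allele threshold, such a function yields the sequence of odds ratios whose gradual growth is described above.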

Figure 7. The $x$-axis represents the allele threshold, the $y$-axis the supra-multiplicativity effect estimate.

In summary, we see an over-proportional increase in disease risk, along with growing risk allele load. Since the supra-multiplicativity is not attributable to lower order interactions terms, we can state that the deviation from multiplicativity would have gone undetected without the SMT.

## Discussion

We presented the SMT as a test for deviation from multiplicativity. It can be applied to sets of a size of up to 500 SNPs and allows joint investigation of all $k$-way interactions, $2 \le k \le n$, by a single parameter. With $n$ SNPs, there are $2^n - n - 1$ allelic interaction terms that theoretically could be analyzed in a regression framework. With growing $n$, such systematic investigation quickly reaches the limit of computability. The SMT, however, can simultaneously assess interactions of different orders. In addition, the formulation as a one-parameter test improves power via the reduction of degrees of freedom and by reduction of the multiple testing burden. The test is powerful under two assumptions: First, effect *sizes* of SNP alleles have to depend on the combination of genotypes at other disease SNPs, but effect *directions* must not depend on the genotypes at other model SNPs. Indeed, the indicator parameter counts the number of risk alleles of an individual. As a consequence, the SMT will lose power when effect directions can be reversed. Second, a substantial part of the SNPs under investigation must deviate from multiplicativity. This assumption might be considered to be a strong one, but is, on the other hand, fulfilled by biologically plausible models which at the same time are easily specified. In particular, limiting pathway liability models (LPLMs) lead to strong supra-multiplicativity which can be detected by the SMT, as shown in our simulation study. Moreover, the SMT remained powerful when we altered the LPLMs such that marginal SNP effects were allowed to vary substantially. In addition, we exemplified the usefulness of the SMT for small SNP sets by showing that it is much more powerful than standard interaction tests under 3-SNP-recessive models.
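How quickly the number of candidate interaction terms grows can be made concrete with a short count. The sketch below assumes one allelic interaction term per SNP subset of size at least two; the function names are hypothetical.

```python
from math import comb


def interaction_terms_by_order(n):
    """Number of k-way allelic interaction terms for k = 2..n (one per SNP subset)."""
    return {k: comb(n, k) for k in range(2, n + 1)}


def total_interaction_terms(n):
    """All subsets of n SNPs minus the empty set and the n single-SNP terms."""
    return 2 ** n - n - 1
```

For the 10 AD SNPs analyzed here this gives 45 pairs and 120 triples, the latter matching the number of three-way tests reported in the data analysis, and 1,013 terms over all orders.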

Zuk et al. [6] have proposed LPLMs as a possible explanation for missing evidence for epistasis in GWAS. With the SMT it is now possible to systematically screen GWAS susceptibility SNPs for their involvement in an LPLM. More generally, our approach can be used as a tool to screen pre-defined SNP sets for supra-multiplicativity. Thereby, it has the potential to assess the presence of simultaneous effects of known SNPs and to contribute to the judgement of their joint relevance. In this context, previously confirmed, particular $k$-way interactions can be used as model covariates, in order to detect supra-multiplicativity on top of such interactions.

Our analysis of known Alzheimer’s disease (AD) susceptibility SNPs in previously unpublished data revealed supra-multiplicativity () that was neither attributable to SNP dominance, nor to pairwise or three-way interaction effects. A sharp risk allele threshold as it would be characteristic of LPLMs was not identifiable. Instead, deviation from multiplicativity developed gradually with increasing allele load, and, therefore, was not detectable with standard interaction analysis.

Confirmed supra-multiplicativity of risk factors is of potential relevance in various fields. In the presence of supra-multiplicativity, prediction of disease risk can be improved when the joint distribution of risk factors is fully modeled, in contrast to prediction based on multiplication of risk factors. For application in clinical practice, it is important that all risk factors as such, as well as their amount of supra-multiplicativity, are confirmed by independent studies. In addition, independent studies would have to specifically investigate whether prediction can be improved over prediction based on the dominant APOE locus.

Supra-multiplicativity is perhaps even more important for treatment and drug development than it is for risk prediction. Supra-multiplicative risk factors imply over-proportionally strong disease risk when they occur together. Conversely, this means that disease risk can be substantially reduced when the effects on phenotype of only a portion of the risk factors can be blocked. Under a risk threshold model, for instance, disease risk can be reduced to a baseline level when the number of still active risk factors can be brought below the risk threshold. In particular, this means that for a complex disease it might not be necessary to have an antidote for all risk factors; it might be sufficient to inactivate just some of them. Stringent statistical evidence for strong supra-multiplicativity of the risk factors of the disease under investigation, however, is an essential prerequisite for such a perspective.

## Methods

Our work was stimulated by the manuscript of Zuk et al. [6], who suggested limiting pathway liability models (LPLMs) as an explanation for the missing evidence for interaction in GWAS. An LPLM can be defined as follows: Consider a set of $n$ SNPs, and specify for each SNP its risk allele. Let $a$, $0 \le a \le 2n$, be the number of risk alleles of an individual. Let $t$, $0 < t \le 2n$, be a liability threshold. For individuals with $a < t$, the risk to be a case is set equal to a baseline penetrance $f_0$, while for individuals with $a \ge t$ risk alleles, an increased penetrance $f_1$ applies. It is typically assumed that and that . By design, such a model leads to marginal SNP effects and also to $k$-way interaction effects for all $2 \le k \le n$. Indeed, the effect of the risk allele of SNP $i$, $1 \le i \le n$, depends on the number of risk alleles present at the other SNPs. Since the interaction effects are distributed over all orders, particular single interaction terms are rather small. As a consequence, a search for, for instance, pairwise interaction is not the method of choice in the presence of an LPLM. Motivated by this, we construct a one degree of freedom (d.f.) test for deviation from multiplicativity for a set of $n$ SNPs that simultaneously addresses all $k$-way interaction effects, $2 \le k \le n$. An implementation can be found in our software package INTERSNP [23] (http://intersnp.meb.uni-bonn.de).
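The LPLM definition can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the simulation code used for the paper: the function names are hypothetical, genotypes are drawn under Hardy-Weinberg equilibrium with independent SNPs, and individuals with a load equal to the threshold are treated as being above it.

```python
import random


def lplm_penetrance(allele_load, threshold, baseline, elevated):
    """Disease risk under an LPLM: elevated at or above the threshold, else baseline."""
    return elevated if allele_load >= threshold else baseline


def simulate_individual(freqs, threshold, baseline, elevated, rng):
    """Draw one individual: genotype per SNP (0, 1 or 2 risk alleles), load, status.

    Assumption: two independent allele draws per SNP (Hardy-Weinberg equilibrium)
    and independence between SNPs.
    """
    genotype = [(rng.random() < p) + (rng.random() < p) for p in freqs]
    load = sum(genotype)
    risk = lplm_penetrance(load, threshold, baseline, elevated)
    affected = rng.random() < risk
    return genotype, load, affected
```

Repeated calls with a seeded `random.Random` instance, keeping affected individuals as cases and unaffected ones as controls, would give a simple case-control sample under the model.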

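We consider a set of $n$ SNPs with corresponding parameters $x_i$, $i = 1, \ldots, n$, which follow the allele coding used in the logistic regression framework described elsewhere [24]. We introduce a series of indicator parameters $I_t$, $t = 1, \ldots, 2n$, where $t$ is a liability threshold that shall be investigated. In case an individual has at least $t$ risk alleles, we set $I_t = 1$, otherwise we set $I_t = 0$. Next, we test the liability threshold $t$ by comparing

$$H_0:\ \operatorname{logit}(P) = \beta_0 + \sum_{i=1}^{n} \beta_i x_i$$

against

$$H_1:\ \operatorname{logit}(P) = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \gamma_t I_t.$$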
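A single-threshold version of this comparison can be sketched in pure Python. Several points are our assumptions and not statements about INTERSNP: the two nested logistic models are compared with a likelihood-ratio statistic, the fit uses a plain Newton-Raphson iteration, the indicator equals 1 for allele loads at or above the threshold, and all function names are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def prob(beta, row):
    """Logistic probability; the linear predictor is clamped to avoid overflow."""
    eta = sum(bj * v for bj, v in zip(beta, row))
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def max_loglik(X, y, iters=20):
    """Maximised logistic log-likelihood via Newton-Raphson from beta = 0."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = prob(beta, row)
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += w * row[j] * row[k]
        beta = [bj + sj for bj, sj in zip(beta, solve(hess, grad))]
    return sum(math.log(prob(beta, r)) if yi else math.log(1.0 - prob(beta, r))
               for r, yi in zip(X, y))


def chi2_sf_1df(stat):
    """Upper tail probability of a chi-square distribution with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


def smt_pvalue(genotypes, y, t):
    """P-value for adding the indicator [allele load >= t] to the marginal model."""
    X0 = [[1.0] + [float(g) for g in row] for row in genotypes]
    X1 = [x + [1.0 if sum(row) >= t else 0.0] for x, row in zip(X0, genotypes)]
    stat = max(0.0, 2.0 * (max_loglik(X1, y) - max_loglik(X0, y)))
    return chi2_sf_1df(stat)
```

Here `genotypes` is a list of per-individual risk allele counts (0, 1 or 2 per SNP) and `y` the case-control status. The pure-Python fit is only practical for small examples; a production analysis would rely on an optimized implementation.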

In other words, we use the marginal effects of the SNPs as covariates and investigate whether the disease risk increases sharply when at least $t$ risk alleles are present. Since the optimal cut-off is not known in advance, all values of $t$ have to be tested. We compute the final p-value as $p = \min(1,\ m \cdot \min_t p_t)$, where $p_t$ is the p-value obtained for cut-off $t$ and $m$ is the number of cut-offs for which there are individuals both above and below the threshold. The method will be conservative since the tests are not independent for different values of $t$. In the results section, we will show that the Bonferroni-correction is, nevertheless, more than acceptable for practical purposes. We note that the suggested test allows the inclusion of further covariate parameters. Adaptation to quantitative traits is also straightforward.
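The scan over cut-offs and the Bonferroni step can be sketched as follows. The function names are hypothetical, and we assume that a cut-off counts as testable when at least one individual has a load below it and at least one has a load at or above it, and that the corrected p-value is capped at 1.

```python
def valid_cutoffs(loads):
    """Cut-offs t with at least one individual below t and one at or above t."""
    lowest, highest = min(loads), max(loads)
    return list(range(lowest + 1, highest + 1))


def smt_final_pvalue(pvalues_by_cutoff):
    """Bonferroni-corrected minimum p-value over all tested cut-offs, capped at 1."""
    m = len(pvalues_by_cutoff)
    return min(1.0, m * min(pvalues_by_cutoff.values()))
```

With observed allele loads between 3 and 8, for instance, the testable cut-offs are 4 to 8. Three tested cut-offs with a smallest p-value of 0.01 would give a corrected p-value of 0.03.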

By design, the proposed test can detect various types of deviation from multiplicativity. Therefore, we call it a supra-multiplicativity test (SMT). The SMT is constructed to optimize power in the presence of LPLMs. More generally, one can expect the SMT to be powerful when a substantial portion of the subsets of the SNP set under investigation deviate from multiplicativity. In addition, effect directions should not be reversed in combination with other risk alleles, since the test is built on risk allele counts per individual.

### Simulation Set-up

#### Null models.

Let $g \in \{0, 1, 2\}$ be the number of risk alleles at a given SNP and let $f_g$ be the respective penetrance. To investigate the proposed test under the null hypothesis “no deviation from multiplicativity”, we simulated case-control data for $n$ disease SNPs under a completely multiplicative model. We considered 1,000 permutation replicates for each choice of $n$, and, for each replicate, randomly selected allele frequencies from a uniform distribution and randomly assigned relative risk values from 1.2 to 1.5. In order to investigate the SMT in the presence of covariates, we considered “semi-null” models, i.e., models with deviation from multiplicativity that was attributable either to single SNP dominance effects or to two-way interaction. We simulated 30 SNPs. In model , 15 SNPs were simulated under a model of recessive type, the relative risk for homozygote carriers was set to . In model , 7 SNP pairs were simulated under a double-recessive model, i.e., the relative risk was set to be for the two-locus genotype with 4 risk alleles. All other effects were combined multiplicatively.
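A completely multiplicative model of the kind used for the null simulations can be sketched as below. The function name and the baseline penetrance argument are hypothetical; the text does not state the baseline used for the null models.

```python
def multiplicative_penetrance(genotype, relative_risks, baseline):
    """Penetrance under pure multiplicativity: baseline * prod(RR_i ** g_i).

    genotype holds risk allele counts (0, 1 or 2) per SNP; the result is capped at 1.
    """
    risk = baseline
    for g, rr in zip(genotype, relative_risks):
        risk *= rr ** g
    return min(risk, 1.0)
```

For example, one risk allele at a SNP with relative risk 1.2 and two at a SNP with relative risk 1.5 multiply an assumed baseline of 0.01 by 1.2 × 1.5² = 2.7.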

#### LPLMs.

In order to assess power of the SMT, we investigated limiting pathway liability models (LPLMs) as suggested by Zuk et al. [6]. Let $n$ be the number of model SNPs and let $t$ be a respective allele load threshold. In our set-up, $n$ ranged from 10 to 50 in steps of 10 and the threshold $t$ was chosen such that about 1.5% of the general population had an allele load equal to or above the threshold $t$. For individuals below $t$, a baseline penetrance value of 0.03 was assumed. For individuals above the threshold, we assumed a strongly increased penetrance , where was chosen from . For each parameter constellation , we simulated 1,000 data sets with 3,000 cases and 3,000 controls and estimated empirical power at the level as the portion of simulated data sets significant at . We decided to present power at the level , since an exhaustive search over all subsets of a given set of SNPs quickly leads to a high number of tests. In addition, analysis of is a number of tests that can be analyzed within a reasonable time frame with our implementation. In practice, a less stringent level might be considered sufficient, depending on the number of models that are actually tested, cf. also the section “Data analysis”. Power was not only analyzed for the entire SNP set, but also for marker subsets of all possible sizes, in order to mimic the situation that not all SNPs belonging to a threshold model will be available or known. For a set of $n$ SNPs, the number of subsets of size $k$ can be enormously high. Therefore, we could not analyze all subsets, but investigated the “first” $k$ SNPs, in the arbitrary order implied by the set-up, to assess power for subsets of size $k$. This procedure also has the advantage of improved comparability when moving from subset size $k$ to subset size $k+1$.
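One way to pick an allele load threshold with a given population tail probability is sketched below. The function names are hypothetical, and we assume Hardy-Weinberg equilibrium and independent SNPs, so that the allele load is a sum of independent Bernoulli draws, two per SNP. The text does not describe how the threshold was actually computed.

```python
def allele_load_distribution(freqs):
    """P(load = k) for k = 0..2n, by convolving two allele draws per SNP."""
    dist = [1.0]
    for p in freqs:
        for _ in range(2):  # two alleles per SNP
            nxt = [0.0] * (len(dist) + 1)
            for load, prob in enumerate(dist):
                nxt[load] += prob * (1.0 - p)
                nxt[load + 1] += prob * p
            dist = nxt
    return dist


def choose_threshold(freqs, target=0.015):
    """Threshold t whose tail probability P(load >= t) is closest to the target."""
    dist = allele_load_distribution(freqs)
    tails = []
    tail = 0.0
    for load in range(len(dist) - 1, -1, -1):
        tail += dist[load]
        tails.append((load, tail))
    best_load, _ = min(tails, key=lambda lt: abs(lt[1] - target))
    return best_load
```

Because the allele load is discrete, the target of 1.5% can only be met approximately, which is in line with the "about 1.5%" in the description above.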

We also investigated the potential impact of incomplete SNP coverage on the power of the SMT. For this purpose, we assumed that the true causal SNPs of the 30-SNP LPLM () are not available, but only tagged by proxy SNPs with an $r^2$ of 0.80. After genotype imputation, such approximation of causal SNPs by SNPs in linkage disequilibrium is realistic.

#### Modified threshold models.

The LPLMs suggested in [6] might be considered to be too simplistic. In particular, differences in the contribution of individual SNPs should be allowed in multi-SNP models [9]. Therefore, we modified the 30-SNP LPLM () from above as follows. We maintained the load threshold from before, but varied the weighting of the risk allele contributions from different SNPs. For one third of the SNPs, the risk allele counts were weighted with a factor of 0.5, for another third of the SNPs, the risk allele counts were weighted with a factor of 1, and the remaining SNPs were weighted with a factor of 2.
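The weighted allele load of the modified model can be sketched as follows. The function names are hypothetical, and assigning the weights to the first, second and last third of the SNPs in order is an assumption made only for illustration.

```python
def thirds_weights(n):
    """Weights 0.5, 1 and 2 for the first, second and remaining third of n SNPs."""
    third = n // 3
    return [0.5] * third + [1.0] * third + [2.0] * (n - 2 * third)


def weighted_allele_load(genotype, weights):
    """Sum of risk allele counts (0, 1 or 2 per SNP), each multiplied by its weight."""
    return sum(w * g for w, g in zip(weights, genotype))
```

The weighted load then takes the place of the plain allele count when it is compared with the unchanged load threshold.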

#### Recessive models.

We investigated “completely recessive” models defined by three SNPs. A baseline penetrance of 0.03 was assumed. Only individuals who were homozygous for the risk allele at all three SNPs were assigned an increased penetrance. We simulated data under different scenarios, defined by varying population allele frequencies and risk genotype penetrance values. In detail, model REZ-A was defined by allele frequencies of 0.2, 0.5 and 0.8 for the three SNPs and a penetrance of 0.20 for the high-risk 3-SNP genotype. Model REZ-B was defined by allele frequencies of 0.2, 0.3 and 0.4 and a high-risk genotype penetrance of 0.70, model REZ-C by allele frequencies of 0.4, 0.5 and 0.6 and a high-risk genotype penetrance of 0.10, and model REZ-D by allele frequencies of 0.6, 0.7 and 0.8 and a high-risk genotype penetrance of 0.05. The resulting allelic relative risks ranged from 1.05 to 1.19 under model REZ-A, from 1.03 to 1.06 under REZ-B, from 1.05 to 1.08 under REZ-C, and from 1.10 to 1.13 under REZ-D.
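The four recessive scenarios can be encoded compactly, as in the following sketch. The allele frequencies and penetrance values are those listed above. The computation of the high-risk genotype frequency and of the resulting disease prevalence additionally assumes Hardy-Weinberg proportions and independence of the three SNPs, which is an assumption of this illustration.

```python
from math import prod

BASELINE = 0.03

# (risk allele frequencies of the three SNPs, high-risk genotype penetrance)
MODELS = {
    "REZ-A": ((0.2, 0.5, 0.8), 0.20),
    "REZ-B": ((0.2, 0.3, 0.4), 0.70),
    "REZ-C": ((0.4, 0.5, 0.6), 0.10),
    "REZ-D": ((0.6, 0.7, 0.8), 0.05),
}

def genotype_penetrance(genotypes, high_risk_penetrance, baseline=BASELINE):
    """Increased penetrance only if all SNPs are homozygous for the risk allele."""
    return high_risk_penetrance if all(g == 2 for g in genotypes) else baseline

def high_risk_frequency(freqs):
    """Population frequency of the high-risk genotype (HWE, independent SNPs)."""
    return prod(f * f for f in freqs)

def prevalence(freqs, high_risk_penetrance, baseline=BASELINE):
    """Disease prevalence implied by the two-level penetrance function."""
    q = high_risk_frequency(freqs)
    return baseline * (1 - q) + high_risk_penetrance * q

if __name__ == "__main__":
    for name, (freqs, pen) in MODELS.items():
        print(name, high_risk_frequency(freqs), prevalence(freqs, pen))
```

Under these assumptions the high-risk genotype is rare in every scenario, so the prevalence stays close to the baseline penetrance, in line with the small marginal effects reported above.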

## Author Contributions

Conceived and designed the experiments: TB CH AR FJ MMN WM. Performed the experiments: TB CH AR. Analyzed the data: CH. Contributed reagents/materials/analysis tools: TV DD AL AR MMN FJ WM. Wrote the paper: CH AR TB.

## References

- 1. Hindorff L, MacArthur J, Morales J, Junkins H, Hall P, et al. (2012) A catalog of published genome-wide association studies. URL www.genome.gov/gwastudies.
- 2. Maher B (2008) Personal genomes: The case of the missing heritability. Nature 456(7218): 18–21.
- 3. Mechanic L, Chen H, Amos C, Chatterjee N, Cox N, et al. (2012) Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genet Epidemiol 36: 22–35.
- 4. Wan X, Yang C, Yang Q, Xue H, Fan X, et al. (2010) BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet 87: 325–40.
- 5. Herold C, Mattheisen M, Lacour A, Vaitsiakhovich T, Angisch M, et al. (2012) Integrated genomewide pathway association analysis with INTERSNP. Hum Hered 73: 63–72.
- 6. Zuk O, Hechter E, Sunyaev S, Lander E (2012) The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci USA 109: 1193–1198.
- 7. Falconer D (1967) The inheritance of liability to diseases with variable age of onset, with particular reference to diabetes mellitus. Ann Hum Genet 31: 1–20.
- 8. Li X, McGue M, Gottesman I (2012) Two sources of genetic liability to depression: interpreting the relationship between stress sensitivity and depression under a multifactorial polygenic model. Behav Genet 42: 268–77.
- 9. Wray NR, Goddard ME (2010) Multi-locus models of genetic risk of disease. Genome Medicine 2: 10.
- 10. Phillips P (1998) The language of gene interaction. Genetics 149: 1167–1171.
- 11. Thompson W (1991) Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol 44: 221–232.
- 12. Clayton D (2012) Link functions in multi-locus genetic models: Implications for testing, prediction, and interpretation. Genet Epidemiol 36: 409–418.
- 13. Carlborg O, Haley C (2004) Epistasis: Too often neglected in complex trait studies? Nat Rev Genet 5: 618–625.
- 14. Cordell H (2009) Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet 10: 392–404.
- 15. Harold D, Abraham R, Hollingworth P, Sims R, Gerrish A, et al. (2009) Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat Genet 41: 1088–1093.
- 16. Lambert J, Heath S, Even G, Campion D, Sleegers K, et al. (2009) Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat Genet 41: 1094–1099.
- 17. Seshadri S, Fitzpatrick A, Ikram M, DeStefano A, Gudnason V, et al. (2010) Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA 303: 1832–1840.
- 18. Hollingworth P, Harold D, Sims R, Gerrish A, Lambert J, et al. (2011) Common variants at ABCA7, MS4A6A/MS4A4E, EPHA1, CD33 and CD2AP are associated with Alzheimer’s disease. Nat Genet 43: 429–435.
- 19. Naj A, Jun G, Beecham G, Wang L, Vardarajan B, et al. (2011) Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat Genet 43: 436–441.
- 20. McKhann G, Drachman D, Folstein M, Katzman R, Price D, et al. (1984) Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology 34: 939–944.
- 21. Howie B, Donnelly P, Marchini J (2009) A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genetics 5: e1000529.
- 22. Genomes Project Consortium, Abecasis GR, Altshuler D, Auton A, Brooks LD, et al. (2010) A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073.
- 23. Herold C, Steffens M, Brockschmidt F, Baur M, Becker T (2009) INTERSNP: Genome-wide interaction analysis guided by a priori information. Bioinformatics 25: 3275–3281.
- 24. Cordell H, Clayton D (2002) A unified stepwise regression procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data: application to HLA in type 1 diabetes. Am J Hum Genet 70: 124–141.