Numerous obesity loci have been identified using genome-wide association studies. A UK study indicated that physical activity may attenuate the cumulative effect of 12 of these loci, but replication studies are lacking. Therefore, we tested whether the aggregate effect of these loci is diminished in adults of European ancestry reporting high levels of physical activity. Twelve obesity-susceptibility loci were genotyped or imputed in 111,421 participants. A genetic risk score (GRS) was calculated by summing the BMI-associated alleles of each genetic variant. Physical activity was assessed using self-administered questionnaires. Multiplicative interactions between the GRS and physical activity on BMI were tested in linear and logistic regression models in each cohort, with adjustment for age, age², sex, study center (for multicenter studies), and the marginal terms for physical activity and the GRS. These results were combined using meta-analysis weighted by cohort sample size. The meta-analysis yielded a statistically significant GRS × physical activity interaction effect estimate (Pinteraction = 0.015). However, a statistically significant interaction effect was only apparent in North American cohorts (n = 39,810, Pinteraction = 0.014 vs. n = 71,611, Pinteraction = 0.275 for Europeans). In secondary analyses, both the FTO rs1121980 (Pinteraction = 0.003) and the SEC16B rs10913469 (Pinteraction = 0.025) variants showed evidence of SNP × physical activity interactions. This meta-analysis of 111,421 individuals provides further support for an interaction between physical activity and a GRS in obesity disposition, although these findings hinge on the inclusion of cohorts from North America, indicating that these results are either population-specific or non-causal.
We undertook analyses in 111,421 adults of European descent to examine whether physical activity diminishes the genetic risk of obesity predisposed by 12 single nucleotide polymorphisms, as previously reported in a study of 20,000 UK adults (Li et al, PLoS Med. 2010). Although the study by Li et al is widely cited, the original report has not been replicated to our knowledge. Therefore, we sought to confirm or refute the original study's findings in a combined analysis of 111,421 adults. Our analyses yielded a statistically significant interaction effect (Pinteraction = 0.015), confirming the original study's results; we also identified an interaction between the FTO locus and physical activity (Pinteraction = 0.003), verifying previous analyses (Kilpelainen et al, PLoS Med., 2010), and we detected a novel interaction between the SEC16B locus and physical activity (Pinteraction = 0.025). We also examined the power constraints of interaction analyses, thereby demonstrating that sources of within- and between-study heterogeneity and the manner in which data are treated can inhibit the detection of interaction effects in meta-analyses that combine many cohorts with varying characteristics. This suggests that combining many small studies that have measured environmental exposures differently may be relatively inefficient for the detection of gene × environment interactions.
Citation: Ahmad S, Rukh G, Varga TV, Ali A, Kurbasic A, Shungin D, et al. (2013) Gene × Physical Activity Interactions in Obesity: Combined Analysis of 111,421 Individuals of European Ancestry. PLoS Genet 9(7): e1003607. https://doi.org/10.1371/journal.pgen.1003607
Editor: David B. Allison, University of Alabama at Birmingham, United States of America
Received: December 26, 2012; Accepted: May 18, 2013; Published: July 25, 2013
Copyright: © 2013 Ahmad et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Some of the work leading to this publication benefited from support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115317 (DIRECT), resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007–2013) and EFPIA companies' in kind contribution. The current study was funded by Novo Nordisk, the Swedish Research Council, Påhlssons Foundation, the Swedish Heart-Lung Foundation, and the Skåne Regional Health Authority (all to PWF). The Fenland Study is funded by the Wellcome Trust and the Medical Research Council (MC_U106179471). The present part of the HEALTH2006 study was funded by The Lundbeck Foundation Centre for Applied Medical Genomics in Personalised Disease Prediction, Prevention and Care (LuCamp, www.lucamp.org). The Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center at the University of Copenhagen partially funded by an unrestricted donation from the Novo Nordisk Foundation (www.metabol.ku.dk). The present part of the Inter99 study was funded by The Lundbeck Foundation Centre for Applied Medical Genomics in Personalised Disease Prediction, Prevention and Care (LuCamp, www.lucamp.org). The Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center at the University of Copenhagen partially funded by an unrestricted donation from the Novo Nordisk Foundation (www.metabol.ku.dk). InterAct study was supported by funding from the European Union (Integrated Project LSHM-CT-2006-037197 in the Framework Programme 6 of the European Community) and the Medical Research Council, UK. 
The MDC study was supported by research grants from Swedish Research Council, the Swedish Heart-Lung Foundation, Strategic Research grant to EXODIAB, Linneus grant to Lund University Diabetes Centre (LUDC), The Albert Påhlsson Foundation, The Novo Nordisk Foundation, The Swedish Diabetes Foundation and an equipment grant from The Knut and Alice Wallenberg Foundation. METSIM study: This work has been supported by the Academy of Finland, the Finnish Diabetes Research Foundation, the Finnish Cardiovascular Research Foundation, EVO grant from the Kuopio University Hospital (5263), NIH grants DK093757, DK072193 and DK062370. The NHS study was supported by grants DK091718, HL071981, HL073168, CA87969, CA49449, CA055075, HL34594, HL088521, U01HG004399, DK080140, 5P30DK46200, U54CA155626, DK58845, U01HG004728-02, EY015473, DK70756, CA134958, and DK46200 from the National Institutes of Health, with additional support for genotyping from Merck Research Laboratories, North Wales, PA. The Swedish Twin Registry is supported by the Ministry for Higher Education, the Swedish Research Council, and GenomEUtwin; the US National Institutes of Health; and the Swedish Foundation for Strategic Research. EI and AG were supported by the Swedish Heart-Lung Foundation (grant no. 20120197) and Swedish Research Council (VR; grant no. 2012-1397) when working on this paper. The WGHS is supported by HL043851 and HL080467 from the National Heart, Lung, and Blood Institute and CA047988 from the National Cancer Institute, the Donald W. Reynolds Foundation and the Fondation Leducq, with collaborative scientific support and funding for genotyping provided by Amgen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Obesity is a major risk factor for many non-communicable diseases including type 2 diabetes, cardiovascular disease, and certain cancers. Genetic predisposition and lifestyle factors are known to increase obesity susceptibility, and the technological breakthroughs that came with genome-wide association studies (GWAS) have led to the successful identification of a large number of obesogenic loci. Recent studies suggest that physical activity may modify genetic susceptibility to obesity, with the genetic burden being higher in physically inactive than in active persons. The most extensively studied example of a gene × physical activity interaction in obesity is for the FTO locus, which was recently replicated in a meta-analysis comprising 240,000 persons. Elsewhere, Li et al reported that physical activity offsets the aggregated genetic risk of 12 obesogenic loci.
In the current study, we aimed to replicate the findings of Li et al in a sample collection of 111,421 individuals of European ancestry. We also undertook detailed analyses focused on the role of within- and between-study factors to establish how the design of gene × environment interaction meta-analyses impacts the power to detect interactions.
Supplementary Table S1 shows participant characteristics for each of the 11 participating cohorts.
Genetic risk score (GRS) × physical activity interactions
The forest plot in Figure 1 shows the interaction coefficients across the 11 cohorts included in the meta-analysis, along with the overall interaction effect estimate (Pinteraction = 0.015). Table 1 summarizes the adjusted main effects of the GRS on BMI and obesity in the combined data from all cohorts and by strata of physical activity. Each unit increase in the GRS, equivalent to one BMI-raising allele, was associated with a mean 0.161 (SE = 0.006) kg/m2 higher BMI (P = 2.1×10−176), which corresponds to 465 g heavier weight for a person 1.70 m tall. Overall, among physically inactive individuals (with a Cambridge Physical Activity Index [CPAI] of 1), each additional BMI-raising allele was associated with 0.186 (SE = 0.006) kg/m2 higher BMI, equivalent to 538 g in weight for a person 1.70 m tall (P = 4.8×10−47), whereas the effect in the most physically active group (CPAI of 4) was 0.143 kg/m2 per GRS allele (SE = 0.011, P = 5.6×10−40), or 413 g in weight for a person 1.70 m tall. In the ‘combined active’ group (individuals with a CPAI of 2–4), each additional risk allele was associated with 0.150 kg/m2 (SE = 0.007, P = 3.3×10−107) higher BMI, or 434 g in weight for a person 1.70 m tall (Figure 2). As illustrated in Figure 3, in the inactive group (CPAI of 1), the difference in BMI between persons with a low (≤11 alleles) and high (>11 alleles) GRS was 0.647 kg/m2 (SE = 0.06; P = 1.9×10−25), while the difference in the combined active group was 0.532 kg/m2 (SE = 0.03; P = 6.6×10−67).
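The weight equivalents quoted above follow from multiplying each per-allele BMI difference by height squared. A minimal sketch of that arithmetic, assuming the 1.70 m reference height used in the text (the function name is illustrative and not part of the study):

```python
def bmi_diff_to_grams(bmi_diff_kg_m2, height_m=1.70):
    """Convert a BMI difference (kg/m2) into a weight difference in grams
    for a person of the given height, using weight = BMI * height^2."""
    return bmi_diff_kg_m2 * height_m ** 2 * 1000.0


# Per-allele BMI effects reported in the text for three groups.
for label, beta in [("all participants", 0.161),
                    ("inactive (CPAI 1)", 0.186),
                    ("most active (CPAI 4)", 0.143)]:
    print(f"{label}: {bmi_diff_to_grams(beta):.0f} g per risk allele")
```

The same conversion applies to any other reference height by changing `height_m`.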
Physical activity was estimated according to the Cambridge Physical Activity Index (CPAI), where the inactive group is defined as individuals with a CPAI of 1 and the ‘combined active’ group as individuals with a CPAI of 2–4.
The CPAI characterizes total physical activity levels by considering both occupational and leisure time physical activity . Sensitivity analyses were performed in the GLACIER and MDC cohorts (n = 39,000) where interaction terms (gene × physical activity) were modeled separately for occupational and leisure time physical activity, but these results were not materially different from the main analyses (data not shown). Within these two cohorts, we additionally adjusted the models for putative confounding by smoking and education, but the results were essentially the same irrespective of whether these additional covariates were or were not included; hence, for the sake of comparability, we focus on the results with the regression models adjusted as reported by Li et al . We also undertook sensitivity analyses in European and North American cohorts separately (Supplementary Figures S1a and S1b), which revealed a statistically significant GRS × physical activity interaction effect in the latter (n = 39,810, Pinteraction = 0.014), but not the former (n = 71,611, Pinteraction = 0.275).
Individual SNP × physical activity interactions
In analyses modeling the interaction of each of the 12 individual SNPs and physical activity, two tests of interaction were nominally statistically significant: the FTO rs1121980 variant, which concurs with previous reports of interaction at this locus, and the SEC16B rs10913469 locus, which has not previously been reported (Table 2). It should be noted that several of the cohorts used here are included in Kilpeläinen et al., and so this is not entirely independent confirmation of these findings. The magnitude of the interaction effects (βGE) for FTO rs1121980 and SEC16B rs10913469 variants was −0.052 and −0.049 kg/m2 per risk allele respectively, which compares with βGE of −0.108 kg/m2 per 8.33 alleles for the GRS (equivalent to 1 allele on the bi-allelic scale). For FTO, the interaction effect was almost 10-fold larger in North American than in European cohorts, whereas for the SEC16B locus, the interaction effect was approximately twice the magnitude in North American vs. European cohorts. Supplementary Table S2 shows individual SNP interaction results across each of the 11 cohorts. In models excluding the FTO and SEC16B variants from the GRS, the interaction test was no longer statistically significant (in the entire cohort [Pinteraction = 0.25] or separately within the cohorts from North America [Pinteraction = 0.39] and Europe [Pinteraction = 0.44]), strongly suggesting that the GRS × physical activity interaction result is driven by the inclusion of one or both of these variants.
Statistical power simulations
Power to detect interactions.
We began by estimating power to detect the original interaction effect reported by Li et al (Supplementary Figure S2a). We estimated that a sample size of N = 110,000 (equivalent to the sample collection included in this meta-analysis) yields close to 100% power to detect the estimated interaction effect of βGE = −0.07 kg/m2 per GRS allele from Li et al. Under the same assumptions, a sample size of N = 20,000 (roughly equivalent to that of the Li et al study) yields around 83% power to detect βGE = −0.07 kg/m2. Although power to detect the interaction effect from the original study is adequate in the current analysis, we observed a much smaller interaction effect estimate in our meta-analysis (βGE = −0.013 kg/m2 per GRS allele), which may be owing to the Winner's curse. Indeed, to gain adequate power (80%) to detect this small effect, given the distributions of the GRS and physical activity variables reported in Li et al, and assuming that these independent variables are not correlated, would require a sample size considerably larger than the current study (Supplementary Figure S2a).
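The power figures above were obtained by simulation. As a rough cross-check, the sketch below uses a closed-form normal approximation instead, which is an assumption of this example and not the authors' method. It assumes an uncorrelated GRS and binary physical activity variable, a GRS variance of 5.06, a 30% prevalence of inactivity and a BMI s.d. of 3.5 kg/m2, all values quoted elsewhere in the text; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def interaction_power(n, beta_ge, bmi_sd=3.5, var_g=5.06,
                      p_inactive=0.30, alpha=0.05):
    """Approximate two-sided power to detect a GRS x activity interaction.

    Assumes the GRS and the binary activity variable are independent, so
    the standard error of the interaction coefficient is approximately
    bmi_sd / sqrt(n * var_g * var_e), with var_e = p * (1 - p).
    """
    var_e = p_inactive * (1.0 - p_inactive)
    se = bmi_sd / sqrt(n * var_g * var_e)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    z = abs(beta_ge) / se
    # Probability that the test statistic falls in either rejection region.
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)


print(interaction_power(20_000, -0.07))
print(interaction_power(110_000, -0.07))
print(interaction_power(110_000, -0.013))
```

Under these assumptions the approximation gives power in the low 80% range at N = 20,000 and close to 100% at N = 110,000 for βGE = −0.07, in line with the simulated values, and low power at N = 110,000 for βGE = −0.013.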
Error, variance and statistical power.
We also estimated sample sizes required to detect the interaction between physical activity and the GRS (βGE = −0.07 kg/m2 per GRS allele, at 80% power and critical alpha 0.05) when the GRS is dichotomized (GRS </> 11.2 alleles) and all else is held equal; under this scenario, a sample size of approximately 370,000 observations is required (compared with 20,000 observations when the GRS is expressed on a continuum) (Supplementary Figures S2a and S2b), which is owing to the decreased variance in the GRS that occurs with dichotomization (σ2 = 5.06 to σ2 = 0.25) (see Supplementary Table S3 for further details). Loss of power would also be anticipated when a continuous physical activity variable is dichotomized, a concept that is discussed at length elsewhere . We also noted that power to detect the interaction increases as the correlation between the two predictor variables increases, as shown in Supplementary Table S4. The ratio of physically inactive to active persons within a population also influences the variable's variance, and hence sample size requirements; providing the interaction effect is approximately linear, the required sample size is smallest when this ratio is balanced and all else remains equal, as shown in Figure S3.
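The effect of dichotomization on sample size can be illustrated with a standard large-sample approximation. The expression below is a sketch under the assumption that the GRS and physical activity are uncorrelated; it is not taken from the paper, and the symbols $\sigma_\varepsilon$ (residual BMI s.d.), $\sigma_G^2$ (GRS variance), $\sigma_E^2$ (physical activity variance) and $z_{\text{power}}$ are introduced here for illustration.

```latex
\operatorname{Var}\bigl(\hat\beta_{GE}\bigr) \approx
  \frac{\sigma_\varepsilon^{2}}{N\,\sigma_G^{2}\,\sigma_E^{2}}
\quad\Longrightarrow\quad
N \approx
  \frac{\bigl(z_{1-\alpha/2} + z_{\text{power}}\bigr)^{2}\,\sigma_\varepsilon^{2}}
       {\beta_{GE}^{2}\,\sigma_G^{2}\,\sigma_E^{2}}

\frac{N_{\text{dichotomized}}}{N_{\text{continuous}}}
  = \frac{\sigma_{G,\text{continuous}}^{2}}{\sigma_{G,\text{dichotomized}}^{2}}
  = \frac{5.06}{0.25} \approx 20
```

With everything else held equal, the required sample size scales inversely with the GRS variance, so a roughly 20-fold drop in variance implies a roughly 20-fold larger sample, which is the same order as the difference between the two scenarios described above.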
Combining results from multiple cohorts can also lead to a substantial loss of power owing to inflation of model error. Sources of error may include imprecise measurement of exposures and outcomes, variable LD structures between populations, and differences in the magnitude of the relationships of BMI with underlying adiposity phenotypes across populations. In order to account for differences in such error, we compared models based on simulations where the population BMI σ increased from 3.5 (as reported in Li et al) to 4.0, 4.5 and 5.5, when all else is held equal. These analyses (Table S5) show that the population σ for BMI is inversely related with statistical power to detect the interaction; for example, a sample size of 31,000 yields ∼80% power to detect βGE = −0.07 kg/m2 per GRS allele if the population BMI σ = 4.5, whereas the required sample size increases to 46,000 to detect the same effect if the population BMI σ = 5.5; a sample size of N∼30,000 is required to achieve 80% power to detect βGE = −0.07 kg/m2 per GRS allele for the population BMI σ = 4.39, as observed in this study.
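The dependence of sample size on the population BMI σ can be sketched with a closed-form approximation. This is an illustration under stated assumptions (uncorrelated predictors, GRS variance 5.06, 30% inactivity prevalence, two-sided alpha of 0.05), not the simulation procedure used in the study, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_n(beta_ge, bmi_sd, var_g=5.06, p_inactive=0.30,
               alpha=0.05, power=0.80):
    """Approximate sample size needed to detect an interaction effect.

    Uses N = (z_alpha + z_power)^2 * sd^2 / (beta^2 * var_g * var_e),
    which assumes the GRS and binary activity variable are independent.
    """
    nd = NormalDist()
    z_total = nd.inv_cdf(1.0 - alpha / 2.0) + nd.inv_cdf(power)
    var_e = p_inactive * (1.0 - p_inactive)
    return ceil(z_total ** 2 * bmi_sd ** 2 / (beta_ge ** 2 * var_g * var_e))


# Required N grows with the square of the outcome standard deviation.
for sd in (3.5, 4.0, 4.39, 4.5, 5.5):
    print(f"BMI sd {sd}: N ~ {required_n(-0.07, sd):,}")
```

Because the required N is proportional to σ², moving from σ = 4.5 to σ = 5.5 raises it by a factor of about 1.5, which is consistent with the increase from roughly 31,000 to 46,000 described above.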
Here we sought to replicate a widely cited study in which an interaction on BMI was reported between physical activity and a GRS comprised of 12 obesity-predisposing gene variants. The original study is one of the largest and most well conducted single-cohort interaction studies published to date, yet to our knowledge no evidence has been published to show that these findings are replicable. Our study included a collection of cohorts whose sample totaled almost six times the size of the study reported by Li et al; the meta-analyzed interaction coefficient is directionally consistent with the original report and statistically significant in the current analysis (Pinteraction = 0.015). In secondary analyses, we explored whether any of the individual SNP × physical activity interaction tests were statistically significant; of these, the FTO locus (rs1121980) (Pinteraction = 0.003), consistent with previous findings, and the SEC16B rs10913469 variant yielded statistically significant interaction effects (Pinteraction = 0.025). The latter finding was not statistically significant after correction for multiple testing, there is no published literature suggesting that this locus is exercise-responsive, and a recent analysis in a randomized clinical trial of lifestyle intervention did not yield evidence of SNP × treatment interactions at the SEC16B rs10913469 locus on weight change phenotypes, although that analysis was likely underpowered and may be false negative. Thus, validation of the interaction effect observed here for SEC16B rs10913469 is necessary to confirm or refute its effect-modifying role for physical activity and obesity.
It is widely acknowledged that initial reports of genetic association signals are often of considerably greater effect magnitude than yielded by subsequent replication attempts; this phenomenon is termed the Winner's curse . The large Winner's curse differential (ΔβGE = 0.057 kg/m2 per GRS allele for the comparison of βGE reported by Li et al  and observed in the current study) has a dramatic effect on the sample size required for replication, with around 530,000 individuals (>25 times the size of the original study) being required to yield power of 80% to detect the interaction effect reported in this study (βGE = −0.013 kg/m2 per GRS allele).
We also conducted a range of simulation analyses to determine how within- and between-study factors impact power to detect interactions in meta-analyses. We show that the optimal setting is one where i) for a given interaction effect size (βGE), the independent variables are expressed on a continuous scale (and if physical activity is dichotomized and the interaction effect is approximately linear, the categories should be equally prevalent (i.e., 50%/50%)), ii) the variance in the GRS is large, iii) the GRS and environmental exposure are correlated, and iv) the population variance in the outcome is small, which in part relates to whether exposure and outcome measurements are standardized across studies and measured with reasonable precision (the latter of which is discussed at length elsewhere).
One of the principal arguments for conducting and reporting studies on gene × lifestyle interactions is that they may help identify persons within target populations who are likely to respond well or poorly to specific lifestyle interventions, thus optimizing the delivery and success of the interventions; the same principle may apply to other medical therapies such as drug treatment and surgery. The targeting of lifestyle interventions using genetic information is appealing as it may improve cost-efficiency, reduce harmful side effects, and increase the health-promoting effects of diet and lifestyle factors . However, very few reported gene × lifestyle interactions have been replicated, which may be because many of the original findings were false positive, the reported interaction effects were cohort-specific, or because subsequent studies were underpowered and yielded false negative results . The study by Li et al  appears well conducted and was performed in a relatively large cohort. The paper was also published in a high impact general medical journal, which implies that the authors' findings are clinically relevant, yet, like most studies of gene × environment interaction, they lacked replication. Importantly, the clinical translation of findings on gene × lifestyle interactions requires that the interaction effect sizes are of a sufficient magnitude to ensure that stratified therapeutic interventions will yield meaningfully different results across genotype groups. The interaction effect size reported in this study is probably too small to be of any clinical value; it is worth noting, though, that in observational studies, where the precision and accuracy with which exposures and outcomes are measured is often low, and where synthetic genetic associations exist (i.e., the observed locus is merely a tag for the latent functional locus), the underlying interaction effect sizes are likely to be underestimated.
A second incentive for conducting studies on gene × lifestyle interactions is that doing so may elucidate biological pathways that lead to the targeting of therapeutic interventions. Most or all of the SNPs studied here probably tag functional variants, with no specific functional role of their own. The functional relevance of the genes most proximal to these SNPs is discussed in detail elsewhere. The majority of these genes regulate CNS-mediated body weight regulation, energy balance, taste, and satiation; although not clearly established, these genes might also regulate reciprocal behaviors; for example, variants in MC4R and FTO are reportedly associated with physical activity.
Although we found statistical evidence of an interaction between physical activity and the GRS in the meta-analysis, it is unlikely that all of the gene variants that comprise the GRS contribute to this interaction effect. For example, the FTO variant included in the GRS has been shown previously to interact with physical activity on obesity , a finding that was confirmed here, and the SEC16B variant also yielded a nominally significant interaction effect in this study. In combination, the two variants yielded an interaction effect size comparable to that seen here for the GRS × physical activity interaction, and the GRS × physical activity interaction test was not statistically significant when the FTO and SEC16B variants were excluded from the GRS, suggesting that these two loci underlie the aggregate genetic effect of all 12 SNPs combined. It is difficult to accurately speculate on whether the GRS × physical activity interaction reported by Li et al  is also driven by the FTO and SEC16B interaction effects, as formal comparisons of this nature were not reported in their paper. Refitting the alleles that comprise a GRS to maximally exploit this information in a regression model (i.e., by weighting the alleles by their interaction effect estimates obtained from SNP × physical activity interaction analyses) would likely increase the magnitude of the observed interaction effect for the GRS; however, to achieve this with minimal bias would require further sample collections to validate these new genetic models, which goes beyond the scope of the present study. Nonetheless, we include the relevant information in Table 2, so that other investigators can construct such weighted models.
It is also important to highlight that the interaction results reported by Li et al  were not statistically significant once persons with prevalent CVD and cancer were excluded; the inclusion of these individuals may have confounded the interaction effect owing to reporting biases attributable to disease labeling or changes in weight and behavior attributable to the disease processes, although the fact that we have replicated their findings in cohorts that were largely free of these diseases suggests this is not the case. It is also possible that the inclusion of diseased individuals in Li et al's study  augmented the interaction effect through hitherto unknown causal mechanisms.
As a general point, it is important to bear in mind that in observational studies, such as those reported here, marginal and interaction effect estimates may not reflect causal processes. This is because physical activity and obesity correlate with other lifestyle, sociodemographic, and metabolic factors, and the gene variants included in the GRS are unlikely to be functional. Thus, even replicated examples of gene × lifestyle interactions may be confounded by latent variables. Reverse causality is a further concern, particularly with cross-sectional data (for example, it is possible that there is a relationship between the GRS and physical activity that is dependent on BMI level).
In summary, our meta-analysis of 111,421 samples from 11 cohorts of European ancestry yielded results that support those of Li et al. However, these effects appear evident only when the cohorts from North America (n = 39,810) are included in this meta-analysis. We also demonstrate using simulated data that combining many small cohorts that vary in their classification of physical activity and other factors is a relatively inefficient approach to studying interactions; hence, future studies of gene × lifestyle interactions might prove most effective if focused on a small collection of large cohorts within which standardized and valid lifestyle assessment methods are available.
Materials and Methods
A total of 111,421 participants from the 11 participating cohorts had genotype and phenotype data necessary for the current analyses. Descriptions of the cohorts included in the current analyses are shown in supplementary Table S6. All participants provided written informed consent and the studies were approved by the relevant institutional review boards and conducted according to the Declaration of Helsinki.
Body composition and physical activity assessment
In most studies, height and weight were measured using wall-mounted stadiometers and calibrated balance-beam scales, respectively (see Supplementary Table S7). As an exception, weight in the NHS, HPFS, and WGHS was self-reported. BMI was calculated as weight in kilograms (kg) divided by height in meters squared (m2). Obesity was defined according to WHO criteria.
Information on physical activity was obtained from self-administered questionnaires, which in most instances were validated. Occupational physical activity in most studies was categorized as i) sedentary or standing; ii) light but partly physically active; iii) light and physically active; and iv) sometimes or often physically straining. Leisure time physical activity during the past three months was categorized as exercising: i) occasionally; ii) 1–2 times/week; iii) 2–3 times/week; or iv) >3 times/week. Among leisure-time physical activity (four categories), participants with missing information were given the lowest intensity score, i.e. classified as being ‘occasionally active’. The CPAI was computed by cross-tabulation of occupational and leisure time physical activity, classifying an individual's total physical activity level according to a four-level scale (inactive, moderately inactive, moderately active and active), as previously described . Because some cohorts could not compute the CPAI owing to a lack of specific physical activity data, a binary variable was computed in all cohorts, which classified participants into active (top 80% of the physical activity frequency distribution) and inactive (bottom 20% of the physical activity distribution). This classification most closely matches the frequency distribution obtained when dichotomizing the CPAI variable by combining moderately inactive, moderately active and active individuals (see Supplementary Table S7 for further details), but, as noted in the Results, may not be the most statistically powerful classification.
DNA was extracted from peripheral blood cells and diluted using standard approaches (see Supplementary Table S8 for further details). Twelve established obesity susceptibility loci (or their proxies with an r2>0.8) were genotyped in the 11 cohorts (Supplementary Table S8). In all cohorts, the genotyping success rates for all 12 variants exceeded 95% and most genotypes were in Hardy-Weinberg equilibrium (P>0.001). The exception to this was for the SH2B1 rs7498665 SNP in the METSIM and HEALTH2006 cohorts, which did not conform to Hardy-Weinberg expectations; sensitivity analyses indicated that removing this SNP from the GRS for the METSIM cohort made no material difference to the overall results (data not shown), and so the results shown here are for the full GRS.
Genetic risk score (GRS)
At each SNP locus, genotypes were coded as 0, 1 and 2 indicating the number of risk alleles (those associated with higher BMI in previous meta-analyses), and the overall genetic burden for each participant was determined by summing the total number of risk alleles into a GRS, using methods previously described.
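The unweighted score described above amounts to a simple sum of risk-allele counts. A minimal sketch, assuming complete genotypes for all 12 SNPs (missing-genotype imputation is not covered here; the function name and the example participant are hypothetical):

```python
def genetic_risk_score(risk_allele_counts, n_snps=12):
    """Sum per-SNP risk-allele counts (each 0, 1 or 2) into an unweighted GRS."""
    if len(risk_allele_counts) != n_snps:
        raise ValueError(f"expected {n_snps} genotypes, got {len(risk_allele_counts)}")
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each genotype must be coded as 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


# Hypothetical participant: risk-allele counts at the 12 loci.
participant = [1, 2, 0, 1, 1, 2, 1, 0, 1, 1, 2, 0]
print(genetic_risk_score(participant))
```

With 12 bi-allelic loci the score ranges from 0 to 24.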
In cohorts where genotypes were directly assessed (i.e., not imputed from GWAS data), missing genotypes were imputed in participants with four or fewer missing values using previously described methods . Sensitivity analyses performed in the GLACIER and MDC cohorts (n = 39,000) showed that there was no material difference in the effect estimates when analyses were performed with or without imputed genotypes (data not shown), so here only results for the GRS using imputed values are presented. The GRS was normally distributed in all cohorts.
Statistical analyses were performed using the SAS software (SAS Institute, Cary, NC), R software (http://www.r-project.org/) and STATA (version 12, StataCorp, College Station, TX, USA). General linear models (GLM) were used to test the association of the GRS with BMI. Logistic regression was used to test genetic associations with obesity. All analyses were adjusted for age, age², sex, study center (for multi-center studies), and physical activity (where appropriate), and we assumed additive effects of the alleles. Interaction tests for individual SNPs and the GRS with physical activity (for outcomes BMI or obesity) were performed by including a SNP (or GRS) × physical activity interaction term in the model, with the marginal effect terms also included. The genetic effect estimates for BMI were also calculated by strata of physical activity (i.e. inactive vs. combined active), as described above.
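Written out, the linear interaction model for BMI described above takes roughly the following form. The coefficient labels $\beta_1$ to $\beta_3$, $\gamma$ and the error term $\varepsilon$ are notation assumed for this sketch; $\beta_G$, $\beta_E$ and $\beta_{GE}$ correspond to the marginal and interaction effects referred to in the text.

```latex
\mathrm{BMI} = \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{age}^2
  + \beta_3\,\mathrm{sex} + \gamma\,\mathrm{center}
  + \beta_G\,\mathrm{GRS} + \beta_E\,\mathrm{PA}
  + \beta_{GE}\,(\mathrm{GRS} \times \mathrm{PA}) + \varepsilon
```

With physical activity coded 0 for inactive and 1 for active, the per-allele genetic effect is $\beta_G$ in the inactive stratum and $\beta_G + \beta_{GE}$ in the active stratum, so a negative $\beta_{GE}$ indicates an attenuated genetic effect in active individuals. For the obesity outcome, the same linear predictor would enter a logistic model.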
Meta-analyses were undertaken using the metan command in STATA (version 12, StataCorp, College Station, TX, USA). A summary interaction effect estimate was calculated for all 11 cohorts combined using meta-analysis weighted by cohort sample size to summarize the pairwise (SNP/GRS × physical activity) interaction coefficients and SE derived from each cohort. Meta-analyses were repeated using random and fixed effects models, but between-study heterogeneity was low (χ2 = 15.51, I2 = 3.3% and P-val = 0.415); thus, the results were not materially different to the weighted approach (data not shown), leading us to present only the weighted results here. Analysis of data from the InterAct Study, which includes multiple sub-cohorts, was conducted as described elsewhere . The full InterAct Study includes two Swedish study centers in Malmö and Umeå, which overlap extensively with the GLACIER and MDC cohorts. Thus, these Swedish InterAct cohort samples were not included in the main analyses.
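The text does not give the exact weighting formula, so the following is a sketch of one common sample-size-weighted scheme (a weighted Z-score combination with weights equal to the square root of each cohort's sample size), offered as an assumption and not as the procedure run in STATA. The function name and the example cohort values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sample_size_weighted_meta(cohorts):
    """Combine per-cohort interaction results by sample-size-weighted Z-scores.

    cohorts: iterable of (beta, se, n) tuples, one per cohort.
    Returns (combined_z, two_sided_p).
    """
    weighted_sum = 0.0
    weight_sq_sum = 0.0
    for beta, se, n in cohorts:
        weight = sqrt(n)                    # weight by square root of cohort size
        weighted_sum += weight * (beta / se)
        weight_sq_sum += weight ** 2
    z = weighted_sum / sqrt(weight_sq_sum)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical cohort-level interaction estimates: (beta, se, n).
example = [(-0.020, 0.015, 20_000), (-0.010, 0.012, 30_000), (-0.015, 0.020, 10_000)]
z, p = sample_size_weighted_meta(example)
print(f"Z = {z:.2f}, P = {p:.3f}")
```

Larger cohorts contribute more to the combined statistic, while the direction of each cohort's effect is preserved through the sign of its Z-score.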
The code-generating program MLPowSim was used to generate R code for simulations and power estimation, with 1,000 iterations for each sample size simulation. To estimate power for different sample sizes, we simulated a 12-SNP GRS using a random normal distribution with mean (s.d.) 11.2 (2.2); physical activity was simulated using a binomial distribution assuming the population prevalence of physical inactivity was 30%, as estimated by Li et al. The approach (described in detail in the Supplementary Material S1) was used to simulate different scenarios for the predictor variables: i) the GRS expressed as a continuous or dichotomized variable (Supplementary Figures S2a and S2b), ii) a range of frequencies for the binary physical activity variable and variances (σ²) (Figure S3), iii) a range of effect sizes for βGE (Supplementary Figures S2a and S2b), iv) a range of covariances between the two predictor variables (Figure S3), and v) a range of variances (σ²) for the population (Supplementary Table S5).
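The simulation logic can be sketched in Python. This stands in for, and is not, the R code generated by MLPowSim. It assumes a GRS drawn from a normal distribution with mean 11.2 and s.d. 2.2, a binary activity indicator with 30% inactive, the marginal effects used in the main power calculations (βG = 0.154, βE = −0.313), a residual s.d. of 3.5, and an arbitrary intercept of 25. With a binary exposure, the interaction coefficient equals the difference between the stratum-specific GRS slopes, so the sketch tests that difference using a pooled residual variance and a normal approximation to the t distribution. All function and parameter names are hypothetical.

```python
import math
import random

def interaction_p_value(grs, active, bmi):
    """Two-sided p-value for the GRS x activity term (binary activity)."""
    inv_sxx_sum, rss, slopes = 0.0, 0.0, []
    for level in (0, 1):  # 0 = inactive, 1 = active
        x = [g for g, a in zip(grs, active) if a == level]
        y = [b for b, a in zip(bmi, active) if a == level]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        slope = sxy / sxx
        rss += sum((yi - my - slope * (xi - mx)) ** 2 for xi, yi in zip(x, y))
        inv_sxx_sum += 1.0 / sxx
        slopes.append(slope)
    sigma2 = rss / (len(grs) - 4)  # four fitted parameters in total
    z = (slopes[1] - slopes[0]) / math.sqrt(sigma2 * inv_sxx_sum)
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate_power(n, beta_ge, iterations=200, beta_g=0.154, beta_e=-0.313,
                   p_inactive=0.30, grs_mean=11.2, grs_sd=2.2, resid_sd=3.5,
                   alpha=0.05, seed=1):
    """Fraction of simulated cohorts in which the interaction is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        grs = [rng.gauss(grs_mean, grs_sd) for _ in range(n)]
        active = [0 if rng.random() < p_inactive else 1 for _ in range(n)]
        # 25.0 is an arbitrary intercept; it does not affect the test.
        bmi = [25.0 + beta_g * g + beta_e * a + beta_ge * g * a
               + rng.gauss(0.0, resid_sd) for g, a in zip(grs, active)]
        if interaction_p_value(grs, active, bmi) < alpha:
            hits += 1
    return hits / iterations

# Small illustrative run; the study used 1,000 iterations per sample size.
print(simulate_power(n=5000, beta_ge=-0.07, iterations=100))
```

Raising `iterations` to 1,000 and looping over a grid of `n` and `beta_ge` values would give power curves of the kind described for the supplementary figures, although pure Python is slow at the largest sample sizes.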
The main power calculations were performed using estimates obtained from Li et al: a GRS marginal effect (βG) of 0.154 kg/m² per GRS risk allele, a physical activity marginal effect (βE) of −0.313 kg/m² (active vs. inactive), a physical inactivity prevalence of 30%, and an s.d. of 3.5. We assumed that the GRS and physical activity are not correlated, and a two-sided critical alpha of 0.05 was used in the calculations. Although the interaction effect estimate (βGE) is not explicitly reported in Li et al's paper, we were able to estimate it from the GRS effect estimates reported in Table 2 of their paper (βGE ≈ −0.07) by approximating the difference in βG between the two combined activity categories (active vs. inactive). To accommodate imprecision in the estimation of βGE and the possibility that Li et al's study was affected by the ‘winner's curse’, and thus over-estimated the interaction effect size one could hope to observe in other cohorts, we show statistical power estimations for interaction effects ranging from −0.05 to −0.10 (Supplementary Figure S2a). We also simulated the GRS as a binary variable and compared power using this approach with one where the GRS is expressed on a continuum (Supplementary Figure S2b), as GRSs are often reported on the binary scale in genetic association studies.
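For orientation, the same inputs can be fed into a closed-form normal approximation. This is a swapped-in shortcut, not the simulation approach used in the study. It assumes the 3.5 figure is the residual s.d. of BMI, that the GRS (s.d. 2.2) is independent of a binary activity variable with prevalence p, and that the standard error of βGE is then approximately σ / (s.d.GRS × √(N·p·(1−p))). The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def interaction_se(n, p_inactive=0.30, grs_sd=2.2, resid_sd=3.5):
    """Approximate SE of the interaction coefficient for a binary exposure."""
    return resid_sd / (grs_sd * math.sqrt(n * p_inactive * (1.0 - p_inactive)))

def approx_power(n, beta_ge, alpha=0.05, **design):
    """Two-sided power under a normal approximation."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(beta_ge) / interaction_se(n, **design)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def required_n(beta_ge, power=0.80, alpha=0.05,
               p_inactive=0.30, grs_sd=2.2, resid_sd=3.5):
    """Sample size for the target power (ignores the negligible far tail)."""
    z_sum = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power)
    n = (z_sum * resid_sd / (beta_ge * grs_sd)) ** 2
    return math.ceil(n / (p_inactive * (1.0 - p_inactive)))

for beta_ge in (-0.05, -0.07, -0.10):
    print(beta_ge, required_n(beta_ge), round(approx_power(20000, beta_ge), 3))
```

Under these assumptions an interaction of −0.07 needs roughly 19,000 participants for 80% power, in line with the size of the original UK study, and the p(1−p) term shows why the required sample size grows as the prevalence of inactivity moves away from 50%.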
Forest plot showing the meta-analysis of interaction coefficients (GRS × Cambridge Physical Activity Index) in relation to BMI in the three North American cohorts (a) and the meta-analysis of interaction coefficients (GRS × Cambridge Physical Activity Index) in relation to BMI in the eight European cohorts (b).
Sample size and power to detect an interaction (βGE = −0.013 to −0.10) between a normally distributed genetic risk score (expressed on a continuous [panel A] or binary [panel B] scale) and physical activity (30% inactive and 70% active). Critical alpha = 0.05. All other parameters are taken from Li et al.
Sample size required for 80% power to detect a gene × physical activity interaction in obesity when the prevalence of physical activity (and the variable's variance) varies and all other parameters are fixed. Mean and variance of the genetic risk score are set at 11.2 and 5.06 respectively. Statistical power and critical alpha are fixed at 80% and 0.05 respectively. Solid line represents required sample sizes, dashed line represents σ2 for corresponding prevalence of physical activity, and dotted lines mark the 50th and 80th centile cut-points and the respective sample size requirements for the binary physical activity variable. Power calculations assume a linear interaction effect.
Additional details on statistical power simulation.
Cohort-specific descriptive statistics.
Interactions between the 12 SNPs and CPAI (4 level scale) on BMI across each of the 11 cohorts.
Power to detect gene × physical activity interaction in obesity for the different simulation settings: physical activity is a binary variable, and variance of genetic risk score varies.
Power to detect a gene × physical activity interaction in obesity for the different simulation settings: physical activity is either binary or approximated by a normal distribution, with different degrees of correlation between the physical activity variable and the genetic risk score.
Sample sizes required to detect an interaction between a genetic risk score (12 SNPs) and physical activity (binary) when the standard deviation (S.D.) in the outcome (BMI) varies and all other parameters are fixed.
Study description of participating cohorts.
Cohort-specific methods used for measuring body mass index and physical activity.
We are indebted to the study participants who dedicated their time and samples to these studies. We also thank the VIP and Umeå Medical Biobank staff for biomedical data collection and preparation. We specifically thank John Hutiainen, Åsa Ågren and Sara Nilsson (Umeå Medical Biobank) for data organization, Kerstin Enquist and Thore Johansson (Västerbottens County Council) for expert technical assistance with DNA preparation, and David Hunter, Patrice Soule and Hardeep Ranu (Harvard School of Public Health) for expert assistance with planning and undertaking genotyping of GLACIER samples (GLACIER study). We thank Malin Svensson for excellent technical assistance in genotyping (MDC study). We thank the Fenland Study Investigators, Fenland Study Co-ordination team and the Epidemiology Field, Data and Laboratory teams (FENLAND study). The Health2006 Study was initiated by Allan Linneberg (PI) and Torben Jørgensen (co-PI) (HEALTH 2006 study). We thank all the participants of the NHS and the HPFS for their continued cooperation (HPFS study). INTER99 (Denmark): The Inter99 study was initiated by Torben Jørgensen (PI), Knut Borch-Johnsen (co-PI), Hans Ibsen and Troels F. Thomsen. The steering committee comprises the former two and Charlotta Pisinger (INTER99 study). We are grateful to all participants who gave their time and effort to the study. We are also extremely grateful to all persons who contributed to the data collection across the study sites. A special thanks to the MRC Epidemiology Unit physical activity technical team; Mark Betts, Laura Lamming and Stefanie Mayle who assisted with data reduction, cleaning and processing. Your efforts are highly appreciated (INTERACT consortium study). We are highly thankful to the study participants (METSIM study). We thank all the participants of the NHS and the HPFS for their continued cooperation (NHS study). We express our profound thanks to all the study participants who contributed to this research (TWINGENE 2000 study). 
We acknowledge the study participants in the WGHS for their contribution in making this study possible (WGHS study).
Conceived and designed the experiments: SA TVV AA AK DS FR PWF. Performed the experiments: NLP ML NJW. Analyzed the data: SA GR TVV AA AK UE RWK AYC AG LMR QQ AS CHS CEE GC MKJ RMT KHA TJ MA NG AL. Contributed reagents/materials/analysis tools: SB CL PKEM NLP MB AH KLM LTP OP RAS PMR EI ML TH LQ NJW DIC GH FBH MOM PWF. Wrote the paper: SA AA AK GP FR PWF.
- 1. Ogden CL, Carroll MD, Curtin LR, McDowell MA, Tabak CJ, et al. (2006) Prevalence of overweight and obesity in the United States, 1999–2004. JAMA 295: 1549–1555.
- 2. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, et al. (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316: 889–894.
- 3. Loos RJ, Lindgren CM, Li S, Wheeler E, Zhao JH, et al. (2008) Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet 40: 768–775.
- 4. Scuteri A, Sanna S, Chen WM, Uda M, Albai G, et al. (2007) Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet 3: e115.
- 5. Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, et al. (2009) Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet 41: 18–24.
- 6. Willer CJ, Speliotes EK, Loos RJ, Li S, Lindgren CM, et al. (2009) Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet 41: 25–34.
- 7. Andreasen CH, Stender-Petersen KL, Mogensen MS, Torekov SS, Wegner L, et al. (2008) Low physical activity accentuates the effect of the FTO rs9939609 polymorphism on body fat accumulation. Diabetes 57: 95–101.
- 8. Franks PW, Jablonski KA, Delahanty LM, McAteer JB, Kahn SE, et al. (2008) Assessing gene-treatment interactions at the FTO and INSIG2 loci on obesity-related traits in the Diabetes Prevention Program. Diabetologia 51: 2214–2223.
- 9. Sonestedt E, Roos C, Gullberg B, Ericson U, Wirfalt E, et al. (2009) Fat and carbohydrate intake modify the association between genetic variation in the FTO genotype and obesity. Am J Clin Nutr 90: 1418–1425.
- 10. Rampersaud E, Mitchell BD, Pollin TI, Fu M, Shen H, et al. (2008) Physical activity and the association of common FTO gene variants with body mass index and obesity. Arch Intern Med 168: 1791–1797.
- 11. Kilpelainen TO, Qi L, Brage S, Sharp SJ, Sonestedt E, et al. (2011) Physical activity attenuates the influence of FTO variants on obesity risk: a meta-analysis of 218,166 adults and 19,268 children. PLoS Med 8: e1001116.
- 12. Li S, Zhao JH, Luan J, Ekelund U, Luben RN, et al. (2010) Physical activity attenuates the genetic predisposition to obesity in 20,000 men and women from EPIC-Norfolk prospective population study. PLoS Med 7: e1000332.
- 13. InterAct C (2012) Validity of a short questionnaire to assess physical activity in 10 European countries. Eur J Epidemiol 27: 15–25.
- 14. Xiao R, Boehnke M (2009) Quantifying and correcting for the winner's curse in genetic association studies. Genet Epidemiol 33: 453–462.
- 15. Ragland DR (1992) Dichotomizing continuous outcome variables: dependence of the magnitude of association and statistical power on the cutpoint. Epidemiology 3: 434–440.
- 16. Wong MY, Day NE, Luan JA, Chan KP, Wareham NJ (2003) The detection of gene-environment interaction for continuous traits: should we deal with measurement error by bigger studies or better measurement? Int J Epidemiol 32: 51–57.
- 17. Delahanty LM, Pan Q, Jablonski KA, Watson KE, McCaffery JM, et al. (2012) Genetic predictors of weight loss and weight regain after intensive lifestyle modification, metformin treatment, or standard care in the Diabetes Prevention Program. Diabetes Care 35: 363–366.
- 18. Goring HH, Terwilliger JD, Blangero J (2001) Large upward bias in estimation of locus-specific effects from genomewide scans. Am J Hum Genet 69: 1357–1369.
- 19. Franks PW (2011) Gene × environment interactions in type 2 diabetes. Curr Diab Rep 11: 552–561.
- 20. Franks PW, Nettleton JA (2010) Invited commentary: Gene × lifestyle interactions and complex disease traits–inferring cause and effect from observational data, sine qua non. Am J Epidemiol 172: 992–997 discussion 998–999.
- 21. Loos RJ (2012) Genetic determinants of common obesity and their value in prediction. Best Pract Res Clin Endocrinol Metab 26: 211–226.
- 22. Ekelund U, Brage S, Besson H, Sharp S, Wareham NJ (2008) Time spent being sedentary and weight gain in healthy adults: reverse or bidirectional causality? Am J Clin Nutr 88: 612–617.
- 23. Loos RJ, Rankinen T, Tremblay A, Perusse L, Chagnon Y, et al. (2005) Melanocortin-4 receptor gene and physical activity in the Quebec Family Study. Int J Obes (Lond) 29: 420–428.
- 24. Metcalf BS, Hosking J, Jeffery AN, Voss LD, Henley W, et al. (2011) Fatness leads to inactivity, but inactivity does not lead to fatness: a longitudinal study in children (EarlyBird 45). Arch Dis Child 96: 942–947.
- 25. Cecil JE, Tavendale R, Watt P, Hetherington MM, Palmer CN (2008) An obesity-associated FTO gene variant and increased energy intake in children. N Engl J Med 359: 2558–2566.
- 26. Jonsson A, Franks PW (2009) Obesity, FTO gene variant, and energy intake in children. N Engl J Med 360: 1571–1572 author reply 1572.
- 27. Qi Q, Chu AY, Kang JH, Jensen MK, Curhan GC, et al. (2012) Sugar-sweetened beverages and genetic risk of obesity. N Engl J Med 367: 1387–1396.
- 28. Ridker PM, Chasman DI, Zee RY, Parker A, Rose L, et al. (2008) Rationale, design, and methodology of the Women's Genome Health Study: a genome-wide association study of more than 25,000 initially healthy American women. Clin Chem 54: 249–255.
- 29. Obesity: preventing and managing the global epidemic. Report of a WHO consultation. World Health Organ Tech Rep Ser 894: i–xii, 1–253.
- 30. Renstrom F, Shungin D, Johansson I, Investigators M, Florez JC, et al. (2011) Genetic predisposition to long-term nondiabetic deteriorations in glucose homeostasis: Ten-year follow-up of the GLACIER study. Diabetes 60: 345–354.
- 31. Renstrom F, Payne F, Nordstrom A, Brito EC, Rolandsson O, et al. (2009) Replication and extension of genome-wide association study results for obesity in 4923 adults from northern Sweden. Hum Mol Genet 18: 1489–1496.
- 32. InterAct C, Langenberg C, Sharp S, Forouhi NG, Franks PW, et al. (2011) Design and cohort description of the InterAct Project: an examination of the interaction of genetic and lifestyle factors on the incidence of type 2 diabetes in the EPIC Study. Diabetologia 54: 2272–2282.
- 33. Browne WJ, Mousa G, Parker RMA (2009) A Guide to Sample Size Calculations for Random Effect Models via Simulation and the MLPowSim Software Package. Bristol, United Kingdom: University of Bristol.