Phenotype Refinement Strengthens the Association of AHR and CYP1A1 Genotype with Caffeine Consumption

  • George McMahon,

    Affiliations Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom, School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom

  • Amy E. Taylor,

    Affiliations Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom, United Kingdom Centre for Tobacco and Alcohol Studies, University of Bristol, Bristol, United Kingdom, School of Experimental Psychology, University of Bristol, Bristol, United Kingdom

  • George Davey Smith,

    Affiliations Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom, School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom

  • Marcus R. Munafò

    Affiliations Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom, United Kingdom Centre for Tobacco and Alcohol Studies, University of Bristol, Bristol, United Kingdom, School of Experimental Psychology, University of Bristol, Bristol, United Kingdom


Two genetic loci, one in the cytochrome P450 1A1 (CYP1A1) and 1A2 (CYP1A2) gene region (rs2472297) and one near the aryl-hydrocarbon receptor (AHR) gene (rs6968865), have been associated with habitual caffeine consumption. We sought to establish whether a more refined and comprehensive assessment of caffeine consumption would provide stronger evidence of association, and whether a combined allelic score comprising these two variants would further strengthen the association. We used data from between 4,460 and 7,520 women in the Avon Longitudinal Study of Parents and Children, a longitudinal birth cohort based in the United Kingdom. Self-report data on coffee, tea and cola consumption (including consumption of decaffeinated drinks) were available at multiple time points. Both genotypes were individually associated with total caffeine consumption, and with coffee and tea consumption. There was no association with cola consumption, possibly due to low levels of consumption in this sample. There was also no association with measures of decaffeinated drink consumption, indicating that the observed association is most likely mediated via caffeine. The association was strengthened when a combined allelic score was used, accounting for up to 1.28% of phenotypic variance. This was not associated with potential confounders of observational association. A combined allelic score accounts for sufficient phenotypic variance in caffeine consumption that this may be useful in Mendelian randomization studies. Future studies may therefore be able to use this combined allelic score to explore causal effects of habitual caffeine consumption on health outcomes.


Caffeine is one of the most widely-consumed psychoactive substances world-wide, and while coffee and tea consumption dominate, it is also present in some soft drinks [1]. There is also considerable inter-individual variability in preference for caffeine [2], in part due to genetic factors. Twin studies have consistently indicated substantial (∼50%) heritability of caffeine consumption (typically assessed as coffee consumption) [3]–[9]. Recently, a number of genome-wide association studies have identified variants robustly associated with caffeine consumption (again, typically assessed as coffee consumption) [10]–[12]. In particular, two loci, one in the cytochrome P450 1A1 (CYP1A1) and 1A2 (CYP1A2) gene region on chromosome 15 and one near the aryl-hydrocarbon receptor (AHR) gene on chromosome 7, have been found to be associated with habitual caffeine consumption across a number of studies [10]–[13]. Two single nucleotide polymorphisms, rs2472297 in between CYP1A1 and CYP1A2, and rs6968865 51 kb upstream of AHR, provide the strongest signals, each with an effect equivalent to an increased consumption of ∼0.2 cups per day per risk (T) allele. The genes are biologically plausible candidates for caffeine consumption phenotypes as they both encode members of the same biochemical pathway. AHR is known to induce CYP1A1 and CYP1A2 by binding to the DNA in the region between these two genes [12], and low CYP1A2 activity has been associated with higher caffeine toxicity [14].

A limitation of studies to date is that they have typically used a single measure of caffeine consumption (e.g., coffee). One study [11] measured total caffeine consumption, but coffee contributed towards 80% of this, and data on other sources of caffeine were not reported separately. While coffee represents the major source of caffeine consumption in some countries, other sources of caffeine can be important. We have previously shown that phenotypic assessments which more accurately capture the exposure of interest can improve the precision of genetic association studies [15], particularly when the exposure (e.g., caffeine consumption) is strongly influenced by behaviour or behavioural choices (e.g., preference for coffee or tea). We therefore sought to establish whether using a more comprehensive phenotypic assessment of caffeine consumption, using measures of coffee, tea and cola consumption, would provide stronger evidence of association with rs2472297 and rs6968865. We were also interested in whether a combined allelic score comprising these two variants would further strengthen the association with caffeine consumption.

Materials and Methods

Study Sample

The Avon Longitudinal Study of Parents and Children (ALSPAC) sample is a longitudinal birth cohort that comprises 20,248 pregnancies. The mothers of 14,541 (71.8%) pregnancies were recruited antenatally during 1990–92 (Phase I). Post-natal recruitment to the ‘Focus@7’ clinical assessment at the age of ∼7 years recruited a further 456 children from 452 (2.2% of eligible) pregnancies (Phase II). Recruitment during ages 8–18 years (Phase III) added a further 257 children from 254 (1.2% of eligible) pregnancies, giving an overall total of 15,247 (75.3% of eligible) enrolled pregnancies; from these pregnancies there were 14,775 live-born children, of whom 14,701 were alive at one year of age. The phases of enrolment are described in more detail in the cohort profile paper [16]. The ALSPAC website contains details of all the data that are available through a fully searchable data dictionary. Ethics approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees (Bristol and Weston Health Authority, Southmead Health Authority, Frenchay Health Authority).

Measures of Caffeine Consumption

Data on coffee and tea consumption were collected via self-report during pregnancy at 8, 18 and 32 weeks gestation and 2, 47, 85, 97 and 145 months after delivery. Participants were asked to report “current daily coffee and tea drinking”, as number of drinks, separately for weekdays and weekends. Similar questions were asked for cola consumption in drinks per week. For cola consumption, questions were open format at 8, 18, and 32 weeks gestation, and 2 months after delivery, and closed format at later time points (“never or rarely”, “once in 2 weeks”, “1 to 3 times a week”, “4 to 7 times a week”, “once a day or more”). Closed format responses were recoded to 0, 0.5, 2, 5.5 and 7 drinks per week, and cola consumption values further recoded to reflect daily consumption. Outlying daily consumption values (>10 drinks for coffee, >15 drinks for tea and >21 drinks for cola) were coded as missing data. Similar questions were also asked for decaffeinated coffee, tea and cola consumption at the same time points, and coded in the same way. In order to obtain a measure of total daily caffeine consumption, number of cups of tea and coffee were summed with drinks per day of cola, weighted with respect to approximate caffeine content (coffee 75; tea 40; cola 34.5) [17], [18]. The distribution of total caffeine consumption, and coffee and tea consumption, is shown in Figures S1–S3.
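The recoding and weighting steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code (their analysis was carried out in R). It assumes the caffeine weights are in mg per drink, that coffee and tea inputs are already averaged to drinks per day, and that the total is treated as missing whenever any component is missing; none of these details is stated in the text, and the function names are hypothetical.

```python
from typing import Optional

# Closed-format cola responses mapped to drinks per week, as given in the text.
COLA_CLOSED_FORMAT = {
    "never or rarely": 0.0,
    "once in 2 weeks": 0.5,
    "1 to 3 times a week": 2.0,
    "4 to 7 times a week": 5.5,
    "once a day or more": 7.0,
}

# Approximate caffeine content per drink (ASSUMED here to be mg per drink).
CAFFEINE_PER_DRINK = {"coffee": 75.0, "tea": 40.0, "cola": 34.5}

# Daily consumption above these limits is coded as missing.
OUTLIER_LIMIT = {"coffee": 10.0, "tea": 15.0, "cola": 21.0}


def cola_per_day(response: str) -> float:
    """Recode a closed-format cola response to drinks per day."""
    return COLA_CLOSED_FORMAT[response] / 7.0


def clean(drink: str, per_day: Optional[float]) -> Optional[float]:
    """Return None (missing) for missing or outlying daily values."""
    if per_day is None or per_day > OUTLIER_LIMIT[drink]:
        return None
    return per_day


def total_caffeine(coffee: Optional[float], tea: Optional[float],
                   cola: Optional[float]) -> Optional[float]:
    """Caffeine-weighted sum of daily drinks.

    ASSUMPTION: the total is missing if any component is missing.
    """
    values = {
        "coffee": clean("coffee", coffee),
        "tea": clean("tea", tea),
        "cola": clean("cola", cola),
    }
    if any(v is None for v in values.values()):
        return None
    return sum(CAFFEINE_PER_DRINK[d] * v for d, v in values.items())


# Hypothetical participant: 2 coffees and 3 teas per day, cola once a day or more.
print(total_caffeine(2, 3, cola_per_day("once a day or more")))
```

Keeping the recode table, weights and outlier limits as named constants makes it straightforward to apply the same coding to the decaffeinated measures, which the text says were coded in the same way.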


Genotypes at the CYP1A1 (rs2472297) and AHR (rs6968865) loci were available from GWAS genotyping data. A total of 10,015 ALSPAC mothers were genotyped on the Illumina 660K quad chip at the Centre National de Genotypage, Paris, resulting in 557,124 directly genotyped SNPs before quality control. Genotypes were called with Illumina GenomeStudio and PLINK (v1.07) was used to carry out quality control steps.

Individuals were excluded from further analysis on the basis of having incorrect sex assignments; minimal or excessive heterozygosity; disproportionate levels of individual missingness (>5%); evidence of cryptic relatedness (>10% identical by descent); and being of non-European ancestry (as detected by a multidimensional scaling analysis seeded with HapMap 2 individuals). SNPs with a minor allele frequency of <1% and call rate of <95% were removed. Furthermore, only SNPs which passed an exact test of Hardy–Weinberg equilibrium (P>5×10⁻⁶) were considered for further use. Population stratification was assessed by means of multidimensional scaling of genome-wide identity by state (IBS) pairwise distances using the four (YOR, CEU, CHB, JPT) HapMap populations as a reference. Cryptic relatedness was assessed using estimates of the proportion of SNPs expected to be identical by descent given estimates of IBS. Subjects with a relatedness of 0.1 or higher were excluded. Genotypes were imputed with Markov Chain Haplotyping software (MaCH 1.0.16) (45) using CEPH individuals from phase 2 of the HapMap project as a reference set (release 22). SNP rs2472297 was directly genotyped, had a MAF of 0.27, HWE P-value of 0.1 and 0.02% missingness before imputation. SNP rs6968865 was imputed with an imputation quality of 0.96, and MAF of 0.39. After imputation genotypes were available for 8,340 subjects. The frequencies of the T allele were 0.27 in rs2472297 and 0.61 in rs6968865.
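The SNP-level thresholds above can be expressed as a simple filter. The sketch below is illustrative only: the actual quality control was done in PLINK, the `SnpStats` container and `passes_qc` function are hypothetical names, and it assumes a SNP is dropped if it fails any one of the three criteria (the text does not spell out how the MAF and call-rate criteria were combined).

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    rsid: str
    maf: float        # minor allele frequency
    call_rate: float  # proportion of samples with a called genotype
    hwe_p: float      # P-value from an exact test of Hardy-Weinberg equilibrium


MIN_MAF = 0.01        # SNPs with MAF < 1% removed
MIN_CALL_RATE = 0.95  # SNPs with call rate < 95% removed
MIN_HWE_P = 5e-6      # SNPs must have HWE P > 5e-6


def passes_qc(snp: SnpStats) -> bool:
    """ASSUMPTION: a SNP is kept only if it meets all three thresholds."""
    return (snp.maf >= MIN_MAF
            and snp.call_rate >= MIN_CALL_RATE
            and snp.hwe_p > MIN_HWE_P)


# Values reported in the text for rs2472297 (0.02% missingness = 0.9998 call rate).
rs2472297 = SnpStats("rs2472297", maf=0.27, call_rate=0.9998, hwe_p=0.1)
print(passes_qc(rs2472297))
```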

Statistical Analysis

Data on total caffeine consumption, and consumption of tea, coffee, cola and their decaffeinated counterparts, were analysed in a linear regression on number of T alleles in a univariate analysis of each SNP. Linear regression was carried out using the lm function in R (v. 2.14.0). Best-guess genotypes were used for analysis.

To obtain joint effects that take into account genotypes at both SNPs simultaneously, following Sulem and colleagues [12], the number of T alleles was summed across SNPs to derive a combined SNP score of the total number of T alleles per subject, which was then used in a regression with phenotype data. For rs6968865 the T allele is the major allele, so that the SNP score contained one minor allele and one major (i.e., reference) allele. Weighting alleles using effect sizes obtained from Sulem and colleagues [12] (rs2472297 by 0.31, rs6968865 by 0.26) provided similar results and we present the results for the unweighted SNP score for simplicity.
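As a rough illustration of the score construction, the following Python sketch builds the unweighted and weighted scores and estimates a per-allele effect by ordinary least squares. It is not the authors' R code; the function names and the toy data are hypothetical, and only the two weights are taken from the text.

```python
from statistics import mean

# Per-allele weights quoted in the text from Sulem and colleagues [12].
WEIGHTS = {"rs2472297": 0.31, "rs6968865": 0.26}


def unweighted_score(t_rs2472297: int, t_rs6968865: int) -> int:
    """Total number of T alleles across both SNPs (0 to 4)."""
    return t_rs2472297 + t_rs6968865


def weighted_score(t_rs2472297: int, t_rs6968865: int) -> float:
    """T-allele counts weighted by the published effect sizes."""
    return (WEIGHTS["rs2472297"] * t_rs2472297
            + WEIGHTS["rs6968865"] * t_rs6968865)


def ols_slope(x, y) -> float:
    """Slope from a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical toy data: (T alleles at rs2472297, T alleles at rs6968865)
# and total caffeine in mg/day. These are NOT values from the study.
genotypes = [(0, 1), (1, 1), (0, 2), (1, 2), (2, 2)]
caffeine = [200.0, 220.0, 215.0, 240.0, 260.0]

scores = [unweighted_score(a, b) for a, b in genotypes]
print(ols_slope(scores, caffeine))  # estimated mg/day per additional T allele
```

The unweighted score treats a T allele at either locus as interchangeable, which is why it is simpler to interpret; the weighted version only changes the scale of the predictor when the two weights are similar, consistent with the text's report that both gave similar results.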

We examined within-locus non-additivity by testing the significance of a second heterozygote term, and between-locus non-additivity by testing for a joint effect beyond the sum of the effects of both SNPs individually. Our results indicated that these SNPs act additively, and their effects are independent (although we cannot rule out more complicated interactions between these SNPs in the presence of other factors).
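One way to picture the within-locus check: in a model with an additive term plus a heterozygote indicator, the heterozygote coefficient equals the distance of the heterozygote mean from the midpoint of the two homozygote means. The sketch below computes that quantity directly from group means. It is an assumption-laden illustration (hypothetical function name, no significance test, and it does not cover the between-locus interaction), not a reproduction of the authors' analysis.

```python
from statistics import mean


def dominance_deviation(t_allele_counts, phenotype) -> float:
    """Heterozygote mean minus the midpoint of the homozygote means.

    Equals the coefficient d in y = a + b*g + d*[g == 1], where g is the
    number of T alleles (0, 1 or 2). A value near zero is consistent with
    additive effects. Requires at least one subject in each genotype group.
    """
    groups = {0: [], 1: [], 2: []}
    for g, y in zip(t_allele_counts, phenotype):
        groups[g].append(y)
    m0, m1, m2 = (mean(groups[g]) for g in (0, 1, 2))
    return m1 - (m0 + m2) / 2.0


# Hypothetical toy data (not from the study): perfectly additive effects.
print(dominance_deviation([0, 0, 1, 1, 2, 2], [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]))
```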

Data used for this submission will be made available on request to the ALSPAC executive committee. The ALSPAC data management plan describes in detail the policy regarding data sharing, which is through a system of managed open access.


Characteristics of Participants

The total sample available for analysis comprised between 4,460 and 7,520 women (see Figure 1 for a summary of how this sample was arrived at). Levels of missingness were low unless questions on caffeine consumption were not included in one or more versions of the questionnaire at that time point. More information on ALSPAC mothers' response rates has been published previously [16].

Figure 1. Study Participant Flow Diagram.

Due to study attrition, data obtained when the cohort first started have a higher number of responses than variables collected later. Thus the number of participants on whom data are available is given as a range.

Consumption of coffee tended to increase roughly linearly across time points (means 1.18 to 2.30 drinks per day). Consumption of tea (means 2.73 to 3.18 drinks per day) and cola (means 0.60 to 2.31 drinks per week) varied across time points, but with no clear pattern of change. As a result, total daily caffeine consumption tended to increase across time points (means 206.8 mg to 306.1 mg). These data are shown in Tables 1–4. In general, cola consumption was considerably less than tea and coffee consumption, reflecting approximately 4% to 11% of total caffeine consumption in drinks per day.

Table 1. Association of CYP1A1 rs2472297, AHR rs6968865 and combined SNP score with total caffeine consumption (mg).

Table 2. Association of CYP1A1 rs2472297, AHR rs6968865 and combined SNP score with coffee consumption.

Table 3. Association of CYP1A1 rs2472297, AHR rs6968865 and combined SNP score with tea consumption.

Table 4. Association of CYP1A1 rs2472297, AHR rs6968865 and combined SNP score with cola consumption.

Caffeine Consumption

Across all time points, total caffeine consumption was associated with both CYP1A1 (βs = 8.7 to 21.4, Ps = 1.59×10⁻³ to 3.33×10⁻¹⁰) and AHR (βs = 4.0 to 14.6, Ps = 1.15×10⁻¹ to 3.34×10⁻⁶) genotypes (Table 1). Similarly, total caffeine consumption was also associated with the combined SNP score, and the statistical evidence for this association was considerably stronger (βs = 5.9 to 17.1, Ps = 1.15×10⁻³ to 3.74×10⁻¹⁴).

In general, the proportion of phenotypic variance explained across all time points was small, as would be expected for the association of common variants with complex behavioural phenotypes. For CYP1A1, the proportion of phenotypic variance explained ranged from 0.15% to 0.88%, while for AHR it ranged from 0.04% to 0.48%. However, the combined SNP score accounted for a somewhat higher proportion of phenotypic variance on average, ranging from 0.16% to 1.28%.
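For a single predictor such as one SNP or the combined score, the proportion of phenotypic variance explained by a linear regression is the squared correlation between predictor and phenotype. The following Python sketch shows that calculation; it is illustrative only, the function name and toy numbers are hypothetical, and the reported percentages come from the authors' own regressions in R.

```python
from statistics import mean


def variance_explained(x, y) -> float:
    """R-squared from a simple linear regression of y on x.

    With one predictor this equals the squared Pearson correlation.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data (not from the study): SNP score and caffeine in mg/day.
score = [1, 2, 2, 3, 4]
caffeine = [200.0, 220.0, 215.0, 240.0, 260.0]
print(f"{100 * variance_explained(score, caffeine):.1f}% of variance explained")
```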

Estimates of the proportion of phenotypic variance obtained using GCTA [19] for the two SNPs in the 2-SNP score were broadly similar to those obtained using linear regression (0.10% to 1.10% vs 0.16% to 1.28%). GCTA analysis for the remaining directly-genotyped SNPs available accounted for additional phenotypic variance, although these estimates may be unreliable due to relatively small sample size (see Table S1).

Stratified analyses further indicated that these associations were present for consumption of coffee (combined SNP score: βs = 0.047 to 0.120, Ps = 2.34×10⁻² to 5.46×10⁻⁵) and tea (combined SNP score: βs = 0.076 to 0.209, Ps = 2.58×10⁻² to 1.23×10⁻⁸), but not cola (combined SNP score: βs = −0.046 to 0.032, Ps = 9.15×10⁻¹ to 5.51×10⁻²) (Tables 2–4). Interestingly, associations for tea consumption were generally stronger than for coffee consumption. Removing participants who reported zero consumption of coffee, tea and/or cola did not alter these results substantially.

There was no evidence that either AHR or CYP1A1 genotypes, or the combined SNP score, was associated with consumption of decaffeinated coffee, tea or cola (see Tables S2–S4), indicating that the associations observed are specific to caffeinated drinks. Again, removing participants who reported zero consumption of coffee, tea and/or cola did not alter these results substantially. We also did not observe any association with measures of aversion to coffee, tea or cola taken during pregnancy (data available on request).

Potential Confounders

Next we assessed the association of the combined SNP score with potential confounders (year of birth, educational attainment, measures of socioeconomic position, alcohol use, tobacco use). These indicated no evidence of association (Table 5), suggesting that the combined SNP score may be a useful instrumental variable in Mendelian randomization analyses [20], [21]. This is in contrast with the association of total caffeine consumption with the same potential confounders, which shows very strong evidence of association at multiple time points (Table 6). A full description of these variables is provided in the ALSPAC cohort profile [16].

Table 5. Association of combined SNP score with potential confounders.

Table 6. Association of total caffeine consumption (mg) with potential confounders.


Our results confirm that two SNPs in AHR and CYP1A1 are associated with caffeine consumption, and extend previous findings in two important ways. First, our results are the first to show association in a sample where caffeine consumption via caffeinated beverages other than coffee is common. Moreover, we show that a combined caffeine consumption phenotype derived from measures of consumption of three caffeinated beverages (coffee, tea and cola) provides a stronger signal than any one of these measures separately. Second, our results also confirm that these associations are due to caffeine consumption, rather than some other common characteristic of caffeinated beverages. Using measures of consumption of decaffeinated drinks as negative controls, we found no evidence of association with either AHR or CYP1A1. While our results hold for both SNPs individually, our strongest results are obtained when both SNPs are combined to create a 2-SNP genetic risk score.

Observationally, caffeine (or, more commonly, coffee) consumption has been shown to be associated with a number of health outcomes [22]. Evidence from longitudinal studies suggests that long-term coffee consumption may in fact be protective against cardiovascular disease [22], [23] and lower the risk of all-cause mortality [24]. Coffee consumption also shows an inverse association with diabetes, although this may be due to antioxidant compounds within coffee rather than caffeine itself [23]. Observational studies suggest that coffee consumption may have further beneficial health effects, including reducing risk of several cancers, such as endometrial, liver and prostate cancer [25]–[27] and protecting against depression, attention deficit hyperactivity disorder and Alzheimer disease [28]–[30]. Conversely, it is recommended that caffeine consumption is restricted during pregnancy due to its association with adverse pregnancy outcomes such as intrauterine growth retardation and miscarriage [31], [32]. Observational studies also suggest that caffeine consumption may be detrimental to bone health, leading to increased fracture risk [33]. However, these studies all suffer from the usual problems of residual confounding and reverse causality which limit the causal inferences that can be drawn from observational data.

Mendelian randomization (MR) offers one approach to better understanding the causal nature of the observed associations between caffeine consumption and health outcomes. Genetic variants are randomly assorted during gamete formation and conception, and therefore should be unrelated to other lifestyle factors associated with coffee consumption which may confound observational associations [34]. Health outcomes cannot affect the genes that an individual has, so we know that associations from MR analyses are not due to reverse causality [34]. This may be particularly important in observational studies of the effects of caffeine as individuals may alter levels of caffeine consumption in response to ill health. In addition, caffeine consumption is difficult to measure accurately as it is usually obtained from food frequency questionnaires [35], so observational estimates may be biased by random or non-random measurement error. In contrast, MR can provide accurate estimates of the magnitude of lifelong exposure to a risk factor [36].

Critically, we have shown that the two SNPs in AHR and CYP1A1, and our 2-SNP genetic risk score, are not associated with a range of potential confounders that may give rise to spurious associations in studies of health outcomes putatively related to caffeine consumption. This, together with the clear evidence of association with caffeine consumption, indicates that the 2-SNP genetic risk score could be used as an instrumental variable in MR analyses. The greater variance explained by the combined score would increase statistical power and reduce the sample size required to detect associations with health outcomes, compared to using either SNP individually. The risk score explains up to 1.3% of the variance in caffeine consumption, which although small in absolute terms is relatively large by the standards of common genetic variants. This is comparable to the variance explained in body mass index (BMI) by variants in the FTO gene, and in cigarette consumption by variants in the CHRNA5-A3-B4 gene cluster [15], [37], which have been used in MR studies of the causal effects of BMI and smoking on health outcomes [38]–[41]. The 2-SNP score for caffeine consumption may therefore be a suitable instrument to explore the causal effects of caffeine consumption on a range of health outcomes.
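The link between variance explained and instrument strength can be made concrete with the first-stage F-statistic, a conventional summary in instrumental variable analysis that the text itself does not report. The sketch below applies the standard single-instrument formula F = R²(n − 2)/(1 − R²). The pairing of R² values with sample sizes is purely illustrative, since the text does not state which sample size corresponds to which variance estimate, and the function name is hypothetical.

```python
def first_stage_f(r_squared: float, n: int, k: int = 1) -> float:
    """F-statistic for k instruments explaining r_squared of the exposure
    variance in a sample of size n: (R2 / k) / ((1 - R2) / (n - k - 1))."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# Illustrative pairings only: upper variance estimates from the text with the
# two ends of the reported sample size range (ASSUMED, not reported together).
for label, r2 in [("CYP1A1 alone", 0.0088), ("combined score", 0.0128)]:
    for n in (4460, 7520):
        print(f"{label}, n={n}: F = {first_stage_f(r2, n):.1f}")
```

For a fixed sample size, F rises roughly in proportion to R² when R² is small, which is the sense in which a score explaining more variance reduces the sample size needed.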

There are some limitations to this study that should be considered when interpreting our results. First, caffeine consumption was measured using a food frequency questionnaire, and these may have modest reliability and validity [35]. We were also only able to capture tea, coffee and cola drinks as sources of dietary caffeine, and not other sources (e.g., chocolate). However, tea, coffee and soft drinks (including cola) together account for ∼90% of caffeine consumption in similar populations, and the levels of consumption we observed are similar to those observed in other studies [32]. While more detailed assessments of caffeine consumption are possible, these are difficult to obtain on the scale necessary for genetic association studies. Future studies could obtain more detailed phenotypic information on selected, genetically-informative individuals [42]. Second, levels of cola consumption were low in this sample, so that this, together with the relatively low levels of caffeine in cola drinks, may account for the lack of association observed. It is also possible that participants were responding to questions about “cola” consumption at least in part as questions about all soda consumption. To better understand whether this lack of association is genuine will require the study of populations where levels of cola consumption are higher. Third, our sample was restricted to women only. Rates of caffeine consumption may differ between men and women, although there are no clear reasons to expect that the pattern of results we observed would differ in males. While patterns of consumption during pregnancy may not be typical, our data extend to ∼12 years post-pregnancy. It is likely that the women in our sample reverted to pre-pregnancy patterns of caffeine consumption over time. Fourth, we only included 2 SNPs in our analysis. These were chosen on the basis of being those for which there is the clearest evidence from recent GWAS of caffeine consumption. 
Future studies may extend our 2-SNP score by including further variants. Fifth, although we are optimistic that these genotypes, and the 2-SNP score, can be used as instrumental variables in MR analyses, potential pleiotropic effects will need to be considered. Metabolic enzyme genotypes typically relate to several metabolic differences which may give rise to associations with health outcomes. In principle, this can be tested by examining the association of genotype with health outcome separately in those who do and do not consume caffeinated drinks [43] – the genotype should not be associated with the outcome in the latter group if the association is mediated via caffeine consumption (although this can give rise to collider bias [44]). Finally, participants of non-European ancestry were excluded during preparation of GWAS data, given that differences in ancestry can bias genetic association studies. Therefore, genotypes were only available for participants of European ancestry. However, >95% of ALSPAC participants are of European ancestry, so we think it unlikely that this influenced our results.

In conclusion, our data confirm the association of AHR and CYP1A1 genotypes with caffeine consumption, and extend previous work by showing that this association holds for tea consumption as well as coffee consumption. Moreover, no association is observed for decaffeinated tea or coffee consumption. This strengthens the argument that the association is mediated via caffeine consumption, although it remains possible that other compounds present in both tea and coffee mediate this association. Future work, perhaps selecting participants on the basis of AHR and CYP1A1 genotype, could explore this possibility through the administration of caffeine in a laboratory setting. Finally, the relatively large proportion of variance in caffeine consumption accounted for by the combined SNP score, and the lack of association of this with potential confounders, means that it could be used in Mendelian randomization studies to explore the causal effects of habitual caffeine consumption on health-related outcomes.

Supporting Information

Figure S1.

Distribution of total caffeine consumption (mg).


Figure S2.

Distribution of total coffee consumption (cups per day).


Figure S3.

Distribution of total tea consumption (cups per day).


Table S1.

Variance in total caffeine consumption explained using linear regression and GCTA.


Table S2.

Association of CYP1A1 rs2472297, AHR rs6968865 and combined genetic score with decaffeinated coffee consumption.


Table S3.

Association of CYP1A1 rs2472297, AHR rs6968865 and combined genetic score with decaffeinated tea consumption.


Table S4.

Association of CYP1A1 rs2472297, AHR rs6968865 and combined genetic score with decaffeinated cola consumption.



We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. The UK Medical Research Council, the Wellcome Trust and the University of Bristol provide core support for ALSPAC. This publication is the work of the authors and George McMahon and Marcus Munafò will serve as guarantors for the contents of this paper.

Author Contributions

Conceived and designed the experiments: GM GDS MM. Performed the experiments: GM. Analyzed the data: GM. Contributed to the writing of the manuscript: GM AET GDS MM.


  1. Drewnowski A (2001) The science and complexity of bitter taste. Nutr Rev 59: 163–169.
  2. Hughes JR, Higgins ST, Bickel WK, Hunt WK, Fenwick JW, et al. (1991) Caffeine self-administration, withdrawal, and adverse effects among coffee drinkers. Arch Gen Psychiatry 48: 611–617.
  3. Conterio F, Chiarelli B (1962) Study of the inheritance of some daily life habits. Heredity (Edinb) 17: 347–359.
  4. Hettema JM, Corey LA, Kendler KS (1999) A multivariate genetic analysis of the use of tobacco, alcohol, and caffeine in a population based sample of male and female twins. Drug Alcohol Depend 57: 69–78.
  5. Kendler KS, Prescott CA (1999) Caffeine intake, tolerance, and withdrawal in women: a population-based twin study. Am J Psychiatry 156: 223–228.
  6. Luciano M, Kirk KM, Heath AC, Martin NG (2005) The genetics of tea and coffee drinking and preference for source of caffeine in a large community sample of Australian twins. Addiction 100: 1510–1517.
  7. Reynolds CA, Barlow T, Pedersen NL (2006) Alcohol, tobacco and caffeine use: spouse similarity processes. Behav Genet 36: 201–215.
  8. Swan GE, Carmelli D, Cardon LR (1996) The consumption of tobacco, alcohol, and coffee in Caucasian male twins: a multivariate genetic analysis. J Subst Abuse 8: 19–31.
  9. Vink JM, Staphorsius AS, Boomsma DI (2009) A genetic analysis of coffee consumption in a sample of Dutch twins. Twin Res Hum Genet 12: 127–131.
  10. Amin N, Byrne E, Johnson J, Chenevix-Trench G, Walter S, et al. (2012) Genome-wide association analysis of coffee drinking suggests association with CYP1A1/CYP1A2 and NRCAM. Mol Psychiatry 17: 1116–1129.
  11. Cornelis MC, Monda KL, Yu K, Paynter N, Azzato EM, et al. (2011) Genome-wide meta-analysis identifies regions on 7p21 (AHR) and 15q24 (CYP1A2) as determinants of habitual caffeine consumption. PLoS Genet 7: e1002033.
  12. Sulem P, Gudbjartsson DF, Geller F, Prokopenko I, Feenstra B, et al. (2011) Sequence variants at CYP1A1-CYP1A2 and AHR associate with coffee consumption. Hum Mol Genet 20: 2071–2077.
  13. Josse AR, Da Costa LA, Campos H, El-Sohemy A (2012) Associations between polymorphisms in the AHR and CYP1A1-CYP1A2 gene regions and habitual caffeine consumption. Am J Clin Nutr 96: 665–671.
  14. Carrillo JA, Benitez J (1996) CYP1A2 activity, gender and smoking, as variables influencing the toxicity of caffeine. Br J Clin Pharmacol 41: 605–608.
  15. Munafo MR, Timofeeva MN, Morris RW, Prieto-Merino D, Sattar N, et al. (2012) Association between genetic variants on chromosome 15q25 locus and objective measures of tobacco exposure. J Natl Cancer Inst 104: 740–748.
  16. Fraser A, Macdonald-Wallis C, Tilling K, Boyd A, Golding J, et al. (2013) Cohort Profile: the Avon Longitudinal Study of Parents and Children: ALSPAC mothers cohort. Int J Epidemiol 42: 97–110.
  17. Ministry-of-Agriculture-Fisheries-and-Food (1998) MFF UK - Survey of caffeine and other methylxanthines in energy drinks and other caffeine-containing products (updated). London: Ministry-of-Agriculture-Fisheries-and-Food.
  18. Food-Standards-Agency (2004) Survey of caffeine levels in hot beverages. Food Standards Agency.
  19. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42: 565–569.
  20. Gage SH, Davey Smith G, Zammit S, Hickman M, Munafo MR (2013) Using Mendelian Randomisation to Infer Causality in Depression and Anxiety Research. Depress Anxiety.
  21. Davey Smith G, Ebrahim S (2003) ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32: 1–22.
  22. O'Keefe JH, Bhatti SK, Patil HR, Dinicolantonio JJ, Lucan SC, et al. (2013) Effects of Habitual Coffee Consumption on Cardiometabolic Disease, Cardiovascular Health, and All-cause Mortality. J Am Coll Cardiol.
  23. Campos H, Baylin A (2007) Coffee consumption and risk of type 2 diabetes and heart disease. Nutr Rev 65: 173–179.
  24. Freedman ND, Park Y, Abnet CC, Hollenbeck AR, Sinha R (2012) Association of coffee drinking with total and cause-specific mortality. N Engl J Med 366: 1891–1904.
  25. Je Y, Giovannucci E (2012) Coffee consumption and risk of endometrial cancer: findings from a large up-to-date meta-analysis. Int J Cancer 131: 1700–1710.
  26. Lai GY, Weinstein SJ, Albanes D, Taylor PR, McGlynn KA, et al. (2013) The association of coffee intake with liver cancer incidence and chronic liver disease mortality in male smokers. Br J Cancer.
  27. Wilson KM, Balter K, Moller E, Adami HO, Andren O, et al. (2013) Coffee and risk of prostate cancer incidence and mortality in the Cancer of the Prostate in Sweden Study. Cancer Causes Control 24: 1575–1581.
  28. Lara DR (2010) Caffeine, mental health, and psychiatric disorders. J Alzheimers Dis 20 Suppl 1: S239–248.
  29. Lucas M, Mirzaei F, Pan A, Okereke OI, Willett WC, et al. (2011) Coffee, caffeine, and risk of depression among women. Arch Intern Med 171: 1571–1578.
  30. Eskelinen MH, Kivipelto M (2010) Caffeine as a protective factor in dementia and Alzheimer's disease. J Alzheimers Dis 20 Suppl 1: S167–174.
  31. Infante-Rivard C, Fernandez A, Gauthier R, David M, Rivard GE (1993) Fetal loss associated with caffeine intake before and during pregnancy. JAMA 270: 2940–2943.
  32. Group CS (2008) Maternal caffeine intake during pregnancy and risk of fetal growth restriction: a large prospective observational study. BMJ 337: a2332.
  33. Liu H, Yao K, Zhang W, Zhou J, Wu T, et al. (2012) Coffee consumption and risk of fractures: a meta-analysis. Arch Med Sci 8: 776–783.
  34. Ebrahim S, Davey Smith G (2008) Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum Genet 123: 15–33.
  35. Schliep KC, Schisterman EF, Mumford SL, Perkins NJ, Ye A, et al. (2013) Validation of different instruments for caffeine measurement among premenopausal women in the BioCycle study. Am J Epidemiol 177: 690–699.
  36. Davey Smith G, Ebrahim S (2005) What can mendelian randomisation tell us about modifiable behavioural and environmental exposures? BMJ 330: 1076–1079.
  37. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, et al. (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316: 889–894.
  38. Freathy RM, Kazeem GR, Morris RW, Johnson PC, Paternoster L, et al. (2011) Genetic variation at CHRNA5-CHRNA3-CHRNB4 interacts with smoking status to influence body mass index. Int J Epidemiol 40: 1617–1628.
  39. Nordestgaard BG, Palmer TM, Benn M, Zacho J, Tybjaerg-Hansen A, et al. (2012) The effect of elevated body mass index on ischemic heart disease risk: causal estimates from a Mendelian randomisation approach. PLoS Med 9: e1001212.
  40. Timpson NJ, Harbord R, Davey Smith G, Zacho J, Tybjaerg-Hansen A, et al. (2009) Does greater adiposity increase blood pressure and hypertension risk? Mendelian randomization using the FTO/MC4R genotype. Hypertension 54: 84–90.
  41. Tyrrell J, Huikari V, Christie JT, Cavadino A, Bakker R, et al. (2012) Genetic variation in the 15q25 nicotinic acetylcholine receptor gene cluster (CHRNA5-CHRNA3-CHRNB4) interacts with maternal self-reported smoking status during pregnancy to influence birth weight. Hum Mol Genet 21: 5344–5358.
  42. Ware JJ, Timpson N, Davey Smith G, Munafo MR (2014) A recall-by-genotype study of CHRNA5-A3-B4 genotype, cotinine and smoking topography: study protocol. BMC Med Genet 15: 13.
  43. Davey Smith G (2011) Use of genetic markers and gene-diet interactions for interrogating population-level causal influences of diet on health. Genes Nutr 6: 27–43.
  44. Cole SR, Platt RW, Schisterman EF, Chu H, Westreich D, et al. (2010) Illustrating bias due to conditioning on a collider. Int J Epidemiol 39: 417–420.