Coffee consumption and risk of breast cancer: A Mendelian randomization study

Background Observational studies have reported either null or weak protective associations for coffee consumption and risk of breast cancer.
Methods We conducted a two-sample Mendelian randomization (MR) analysis to evaluate the relationship between coffee consumption and breast cancer risk using 33 single-nucleotide polymorphisms (SNPs) associated with coffee consumption from a genome-wide association (GWA) study on 212,119 female UK Biobank participants of White British ancestry. Risk estimates for breast cancer were retrieved from publicly available GWA summary statistics from the Breast Cancer Association Consortium (BCAC) on 122,977 cases (of which 69,501 were estrogen receptor (ER)-positive, 21,468 ER-negative) and 105,974 controls of European ancestry. Random-effects inverse variance weighted (IVW) MR analyses were performed along with several sensitivity analyses to assess the impact of potential MR assumption violations.
Results A one cup per day increase in genetically predicted coffee consumption in women was not associated with risk of total (IVW random-effects; odds ratio (OR): 0.91, 95% confidence interval (CI): 0.80–1.02, P: 0.12, P for instrument heterogeneity: 7.17e-13), ER-positive (OR: 0.90, 95% CI: 0.79–1.02, P: 0.09) or ER-negative breast cancer (OR: 0.88, 95% CI: 0.75–1.03, P: 0.12). Null associations were also found in the sensitivity analyses using MR-Egger (total breast cancer; OR: 1.00, 95% CI: 0.80–1.25), weighted median (OR: 0.97, 95% CI: 0.89–1.05) and weighted mode (OR: 1.00, CI: 0.93–1.07).
Conclusions The results of this large MR study do not support an association between genetically predicted coffee consumption and breast cancer risk, but we cannot rule out the existence of a weak association.


Several genome-wide association studies (GWAS) on coffee or caffeine consumption have been previously published [32][33][34][35][36][37]. One of these GWAS was a meta-analysis conducted by the Coffee and Caffeine Genetics Consortium in 2015 incorporating summary statistics from 28 population-based studies of European ancestry, and reported six loci associated with coffee consumption that were involved either in the pharmacokinetics (cytochrome P450 1A1 (CYP1A1)/cytochrome P450 1A2 (CYP1A2), aryl hydrocarbon receptor (AHR)) or pharmacodynamics of caffeine (brain-derived neurotrophic factor (BDNF) and solute carrier family 6 member 4 (SLC6A4)) [35]. A more recent and larger GWAS was conducted among individuals (179,954 males and 212,119 females) of White British ancestry in the UK Biobank (UKB) cohort [37], and identified 35 genetic variants strongly associated with coffee intake.
Mendelian randomization (MR) is a method that uses genetic variation arising from meiosis as a natural experiment, to investigate the potential causal relationship between an exposure and an outcome [38,39]. MR estimates are less susceptible to bias from potential reverse causality and confounding compared to estimates from observational studies, because genetic variants are randomly distributed at conception [40,41]. A recent MR study assessed the potential causal association between coffee consumption and risk of several cancers, including breast cancer, and concluded that coffee consumption is unlikely to be associated with overall breast cancer susceptibility [37]. However, the latter study did not report associations by breast cancer subtypes. In the current MR study, we investigated the relationship between genetically predicted coffee consumption and risk of breast cancer overall as well as breast cancer subtypes incorporating several MR methods to assess the impact of potential MR assumption violations.

Genetic data on coffee consumption
We used 35 single nucleotide polymorphisms (SNPs) that were associated with coffee consumption at the genome-wide significance level (p < 5e-8) in the combined population of men and women in UKB [37]; their beta estimates (SNP-coffee), however, were derived from analyses restricted to the female population. In a sensitivity analysis, we combined beta estimates (SNP-coffee) for both men and women to increase statistical power. The UKB is a population-based cohort study of more than 500,000 participants aged 38 to 73 years, who enrolled in the study between 2006 and 2010 from across the UK [42]. Coffee consumption was measured via self-administered questionnaires and was defined as cups of decaffeinated coffee, instant coffee, ground coffee and any other type of coffee (UKB Data field ID: 1508) consumed per day [37]. Briefly, the UKB participants were genotyped using the Affymetrix UK Biobank Axiom array and imputed against the UK10K, 1000 Genomes Phase 3 and Haplotype Reference Consortium panels [37]. The GWAS was conducted using the BOLT-LMM software [43] to model the genetic association accounting for cryptic relatedness in the UKB sample. SNPs were clumped at r² < 0.01 using a 10-Mb window [37].

Statistical power
Statistical power calculations were conducted using the online mRnd calculator (available at http://cnsgenomics.com/shiny/mRnd/). Using an estimated 1% variance of coffee consumption explained by the instruments [37], the study had 80% power with a type I error rate of 0.05 to detect associations of odds ratios of 0.89, 0.87 and 0.80 per one cup of coffee per day and risk of overall, ER-positive and ER-negative breast cancer, respectively.
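The power figure for overall breast cancer can be roughly reproduced with a short calculation. The sketch below is an assumption about how such a calculator works: it uses a commonly cited approximation for MR power with a binary outcome (a non-centrality parameter built from the sample size, the case fraction and the variance explained by the instruments) and is not taken from the mRnd source. The function name `mr_power_binary` is hypothetical; the sample sizes are the BCAC counts given in the abstract.

```python
from math import sqrt
from statistics import NormalDist


def mr_power_binary(n_total, case_fraction, r2, odds_ratio, alpha=0.05):
    """Approximate power of an MR analysis with a binary outcome (assumed formula)."""
    k = case_fraction
    # Effect on the risk-difference scale implied by the odds ratio.
    b_mr = k * (odds_ratio / (1 + k * (odds_ratio - 1)) - 1)
    # Approximate variance of the MR estimate given instrument strength r2.
    v_mr = (k * (1 - k) - b_mr ** 2) / (n_total * r2)
    ncp = b_mr ** 2 / v_mr  # non-centrality parameter of a 1-df chi-square test
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    root = sqrt(ncp)
    # Tail probability of a non-central 1-df chi-square, written via the normal CDF.
    return (1 - z.cdf(z_crit - root)) + z.cdf(-z_crit - root)


cases, controls = 122_977, 105_974  # overall breast cancer, from the abstract
power_overall = mr_power_binary(
    n_total=cases + controls,
    case_fraction=cases / (cases + controls),
    r2=0.01,          # variance in coffee consumption explained by the instruments
    odds_ratio=0.89,  # detectable effect per cup per day
)
print(f"approximate power: {power_overall:.2f}")
```

Under these assumptions the result lands close to the 80% quoted above; small differences from the online calculator are to be expected.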

Statistical analysis
Main MR analysis. We conducted a two-sample MR using summary association data for 33 coffee-associated SNPs. We ran both fixed- and random-effects inverse-variance weighted (IVW) models, but the random-effects IVW model was considered the main analysis due to the large number of SNPs and the substantial observed heterogeneity [45,46]. The IVW MR approach combines individual MR estimates across SNPs to derive an overall weighted estimate of the potential causal effect. We calculated the MR-derived odds ratio (OR) of breast cancer risk for a one cup per day increase in genetically predicted coffee consumption. This study used publicly available data.
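As an illustration of the IVW approach, the sketch below pools SNP-level summary statistics with a weighted regression through the origin and returns both fixed- and random-effects standard errors. It assumes first-order weights (the inverse variance of the SNP-outcome estimates) and a multiplicative random-effects model in which the standard error is inflated by the overdispersion implied by Cochran's Q; the exact options used in the analysis are not stated here. The function name `ivw` and the four SNPs are hypothetical.

```python
from math import exp, sqrt


def ivw(beta_x, beta_y, se_y):
    """IVW estimate with fixed- and (multiplicative) random-effects standard errors."""
    w = [1 / s ** 2 for s in se_y]  # inverse-variance weights
    sxx = sum(wi * bx ** 2 for wi, bx in zip(w, beta_x))
    sxy = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    estimate = sxy / sxx  # weighted regression through the origin
    se_fixed = sqrt(1 / sxx)
    # Cochran's Q: dispersion of the SNP-specific estimates around the pooled one.
    q = sum(wi * (by - estimate * bx) ** 2 for wi, bx, by in zip(w, beta_x, beta_y))
    overdispersion = max(1.0, q / (len(beta_x) - 1))
    se_random = se_fixed * sqrt(overdispersion)  # never narrower than fixed effects
    return estimate, se_fixed, se_random


# Hypothetical summary statistics: SNP-coffee betas (cups/day per allele),
# SNP-breast cancer log odds ratios and their standard errors.
beta_x = [0.10, 0.08, 0.12, 0.05]
beta_y = [-0.010, -0.004, -0.015, 0.002]
se_y = [0.005, 0.006, 0.004, 0.007]

log_or, se_f, se_r = ivw(beta_x, beta_y, se_y)
odds_ratio = exp(log_or)
ci_low, ci_high = exp(log_or - 1.96 * se_r), exp(log_or + 1.96 * se_r)
print(f"OR per cup/day: {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Exponentiating the pooled log odds ratio gives the OR per one cup per day, which is the scale on which the results are reported.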
Sensitivity analyses. The IVW MR approach assumes that all genetic variants satisfy the instrumental variable assumptions, namely that the genetic variants are: 1) associated with coffee consumption, 2) not associated with confounders of the association between coffee consumption and breast cancer, and 3) associated with breast cancer only via their association with coffee consumption [45,47,48]. We tested for potential violation of the first MR assumption by measuring the strength of the genetic instruments using F-statistics. The F-statistic is the ratio of the mean square of the model to the mean square of error [49]. Cochran's Q test and the I² statistic were used to quantify the heterogeneity in effect sizes between the genetic instruments [50], which may indicate horizontal pleiotropy that could violate the third MR assumption. To further test and attempt to correct for potential violation of the second and third MR assumptions, we used several approaches such as MR-Egger regression [51], the weighted median [52] and mode [53] methods, and the MR pleiotropy residual sum and outlier test (MR-PRESSO) [54].
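The instrument-strength and heterogeneity checks can be sketched from summary statistics alone. The example below makes two assumptions that are not stated in the text: the per-SNP F-statistic is approximated as the squared ratio of the SNP-coffee beta to its standard error, and Cochran's Q is computed on the SNP-specific ratio estimates with first-order inverse-variance weights. The function name `instrument_diagnostics` and all input values are hypothetical.

```python
def instrument_diagnostics(beta_x, se_x, beta_y, se_y):
    """Per-SNP F-statistics plus Cochran's Q and I^2 for the ratio estimates."""
    # Approximate F-statistic for a single instrument: (beta / SE)^2.
    f_stats = [(bx / sx) ** 2 for bx, sx in zip(beta_x, se_x)]
    # SNP-specific causal (Wald ratio) estimates and their inverse-variance weights.
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_x, se_y)]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    # I^2: share of the variability in the estimates beyond what chance would produce.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return f_stats, q, i2


# Hypothetical data in which the last SNP behaves like a pleiotropic outlier.
beta_x = [0.10, 0.08, 0.12, 0.05]
se_x = [0.010, 0.012, 0.011, 0.009]
beta_y = [-0.010, -0.004, -0.015, 0.020]
se_y = [0.005, 0.006, 0.004, 0.007]

f_stats, q_stat, i_squared = instrument_diagnostics(beta_x, se_x, beta_y, se_y)
print("minimum F:", round(min(f_stats), 1))
print("Cochran's Q:", round(q_stat, 2), "I^2:", round(i_squared, 2))
```

An F-statistic above roughly 10 is a widely used rule of thumb for adequate instrument strength, while a large Q relative to its degrees of freedom points to heterogeneity between instruments.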
MR-Egger. MR-Egger is an adaptation of Egger regression, which allows for directional pleiotropy by introducing an intercept in the weighted regression model. Values away from zero for the intercept term are an indication of horizontal pleiotropy [51]. The MR-Egger approach provides unbiased results in the presence of pleiotropic instruments assuming that the magnitude of pleiotropic effects is independent of the size of the SNP-coffee consumption effects, which is called the Instrument Strength Independent of Direct Effects (InSIDE) assumption [51].
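A minimal sketch of MR-Egger regression follows: it is the same weighted regression as IVW, except that the intercept is estimated rather than fixed at zero. The code assumes inverse-variance weights based on the SNP-outcome standard errors, orients every SNP so that its association with the exposure is positive, and does not let the residual variance fall below 1 when computing standard errors. The function name `mr_egger` and the data are hypothetical.

```python
from math import sqrt


def mr_egger(beta_x, beta_y, se_y):
    """Weighted linear regression of SNP-outcome on SNP-exposure with an intercept."""
    # Orient each SNP so that its exposure association is positive.
    bx = [abs(x) for x in beta_x]
    by = [y if x >= 0 else -y for x, y in zip(beta_x, beta_y)]
    w = [1 / s ** 2 for s in se_y]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx                    # causal estimate adjusted for pleiotropy
    intercept = mean_y - slope * mean_x  # average directional pleiotropic effect
    rss = sum(wi * (y - intercept - slope * x) ** 2 for wi, x, y in zip(w, bx, by))
    sigma2 = max(1.0, rss / (len(bx) - 2))
    se_slope = sqrt(sigma2 / sxx)
    se_intercept = sqrt(sigma2 * (1 / sw + mean_x ** 2 / sxx))
    return intercept, se_intercept, slope, se_slope


# Hypothetical data with a constant pleiotropic shift of 0.01 on the log-odds scale.
beta_x = [0.10, -0.20, 0.30, 0.15]
beta_y = [0.06, -0.11, 0.16, 0.085]
se_y = [0.01, 0.01, 0.01, 0.01]

egger_intercept, egger_intercept_se, egger_slope, egger_slope_se = mr_egger(
    beta_x, beta_y, se_y
)
print("intercept:", round(egger_intercept, 3), "slope:", round(egger_slope, 3))
```

In this toy example every SNP carries the same direct effect, so the intercept absorbs it and the slope recovers the underlying effect of 0.5, which is the situation the InSIDE assumption describes.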
Weighted median. We used the weighted median method, which orders the MR estimates obtained from each instrument and weights them by the inverse of their variance. Selecting the median result provides a single MR estimate with confidence intervals estimated using a parametric bootstrap method [52]. The weighted median does not require that the size of any pleiotropic effects of the instruments is uncorrelated with their effects on the intermediate phenotype, but assumes that at least half of the instruments are valid [55].
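The weighted median can be sketched as follows. The code assumes inverse-variance weights for the SNP-specific ratio estimates, linear interpolation between the two estimates that straddle the 50% point of the cumulative weight, and a parametric bootstrap that redraws the SNP-exposure and SNP-outcome associations from normal distributions. The function names, the bootstrap size and the data are hypothetical.

```python
import random
from statistics import stdev


def weighted_median(values, weights):
    """Median of `values` where each value counts in proportion to its weight."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    cum, running = [], 0.0
    for wi in w:
        running += wi
        cum.append((running - 0.5 * wi) / total)  # cumulative weight at each midpoint
    below = max(i for i, c in enumerate(cum) if c < 0.5)
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return v[below] + (v[below + 1] - v[below]) * frac


def weighted_median_mr(beta_x, se_x, beta_y, se_y, n_boot=1000, seed=1):
    """Weighted median estimate with a parametric bootstrap standard error."""
    weights = [(x / s) ** 2 for x, s in zip(beta_x, se_y)]
    estimate = weighted_median([y / x for x, y in zip(beta_x, beta_y)], weights)
    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        bx = [rng.gauss(m, s) for m, s in zip(beta_x, se_x)]
        by = [rng.gauss(m, s) for m, s in zip(beta_y, se_y)]
        boots.append(weighted_median([y / x for x, y in zip(bx, by)], weights))
    return estimate, stdev(boots)


# Hypothetical data: four SNPs agree on an effect of 0.5, one is a strong outlier.
beta_x = [0.10, 0.08, 0.12, 0.05, 0.09]
se_x = [0.01, 0.01, 0.01, 0.01, 0.01]
beta_y = [0.05, 0.04, 0.06, 0.025, 0.20]
se_y = [0.01, 0.01, 0.01, 0.01, 0.01]

median_estimate, median_se = weighted_median_mr(beta_x, se_x, beta_y, se_y)
print("weighted median:", round(median_estimate, 3))
```

Because the four concordant SNPs carry more than half of the total weight, the outlier does not move the estimate, which is the robustness property described above.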
Weighted mode. The mode-based causal estimate consistently estimates the true causal effect when the largest group of instruments with consistent MR estimates is valid [53].
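A simplified sketch of the mode-based idea is given below: the SNP-specific ratio estimates are smoothed with a weighted Gaussian kernel and the location of the highest density is taken as the estimate. The bandwidth rule (a Silverman-type rule using the smaller of the standard deviation and the scaled median absolute deviation), the grid search and the function name `weighted_mode` are assumptions made for illustration, and the bootstrap step needed for confidence intervals is omitted.

```python
from math import exp
from statistics import median, stdev


def weighted_mode(ratios, ratio_se, phi=1.0, grid_points=2001):
    """Location of the peak of an inverse-variance weighted kernel density."""
    n = len(ratios)
    raw = [1 / s ** 2 for s in ratio_se]
    total = sum(raw)
    w = [r / total for r in raw]  # weights normalised to sum to 1
    # Bandwidth from the spread of the ratio estimates.
    centre = median(ratios)
    mad = 1.4826 * median(abs(r - centre) for r in ratios)
    spread = min(stdev(ratios), mad) or stdev(ratios)  # fall back if the MAD is zero
    h = 0.9 * spread * phi / n ** 0.2
    lo, hi = min(ratios), max(ratios)
    best_x, best_density = lo, -1.0
    for i in range(grid_points):
        x = lo + (hi - lo) * i / (grid_points - 1)
        density = sum(wi * exp(-0.5 * ((x - r) / h) ** 2) for wi, r in zip(w, ratios))
        if density > best_density:
            best_x, best_density = x, density
    return best_x


# Hypothetical ratio estimates: four SNPs cluster near 0.10, two are outliers.
ratios = [0.10, 0.11, 0.09, 0.10, 0.50, -0.30]
ratio_se = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05]

mode_estimate = weighted_mode(ratios, ratio_se)
print("weighted mode:", round(mode_estimate, 3))
```

The estimate follows the largest cluster of similar ratio estimates and ignores the two outlying SNPs, even though they pull in opposite directions by different amounts.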
MR-PRESSO. We used the MR-PRESSO outlier test to identify outlier SNPs, which could have pleiotropic effects [54]. This method regresses the SNP-outcome associations on the SNP-exposure associations and uses the squared residuals to identify outliers.
In addition, we repeated the analysis after excluding SNPs whose associations with coffee consumption among women had p-values larger than 1e-05, to avoid weak instrument bias. We also used beta estimates from a previous GWAS as an alternative instrument of eight SNPs (rs1260326, rs1481012, rs17685, rs7800944, rs6265, rs9902453, rs2472297 and rs4410790) associated with coffee consumption [35], both to ensure that our results were robust to the choice of instruments and because these eight SNPs are linked to caffeine metabolism and may be less likely to have pleiotropic actions. All analyses were performed using the mrrobust package in Stata [73] and the MendelianRandomization package in R [74].

Results
The associations between the genetic instruments with coffee consumption and breast cancer are shown in the S1 Table. One variant (rs17817964 in FTO) was strongly associated with overall (P = 4.67E-20), ER-positive (P = 2.48E-13) and ER-negative breast cancer (P = 1.56E-09).

MR-Egger
Results based on the MR-Egger regression did not show any association between genetically predicted coffee consumption and risk of total breast cancer or its subtypes (Figs 1-3, S2 Table).

Discussion
In this comprehensive MR analysis of coffee consumption and risk of breast cancer, we observed that in the majority of analyses genetically predicted consumption of coffee was not associated with overall, ER-positive or ER-negative breast cancer. In line with our results, a recent large MR study on the association between coffee consumption and risk of being diagnosed with or dying from cancer overall and by anatomical subsite reported no evidence for an association with risk of breast cancer [37]. Compared to the previous study, our study added results by ER status and presented detailed sensitivity analyses to assess potential violations of MR assumptions.
Coffee is among the most commonly consumed beverages worldwide, and drinking it provides exposure to a range of biologically active compounds [75]. Higher coffee consumption has been associated with decreased risk of all-cause, cardiovascular and cancer mortality among non-smokers [76]. Several observational studies have investigated the association between coffee consumption and risk of breast cancer development, but findings have been inconsistent [31,77,78]. The most recent meta-analysis synthesized evidence from 21 prospective cohort studies [31], and reported a weak inverse association between coffee consumption and risk of total (OR higher vs. lower = 0.96, 95% CI = 0.93-1.00) and postmenopausal breast cancer (OR = 0.92, 95% CI = 0.88-0.98). Null associations were reported by estrogen or progesterone receptor status [31]. When a dose-response meta-analysis was conducted among 13 prospective studies [31], the association per one cup of coffee per day was nominally significant (OR for postmenopausal disease = 0.97, 95% CI = 0.95-1.00), which was consistent with the finding of the current MR study (OR = 0.90, 95% CI 0.79-1.02). In agreement, the World Cancer Research Fund Third Expert Report graded the evidence on coffee consumption and breast cancer risk as "limited, no conclusion" [79].
MR studies can be useful in nutritional epidemiology, as they are less susceptible to biases that are commonly present in the traditional observational literature [80], namely exposure measurement error, residual confounding and reverse causation. MR estimates warrant a causal interpretation only if the assumptions of the instrumental variable approach hold. Though it is not possible to prove the validity of the assumptions in their entirety, we performed several sensitivity analyses to detect potential violations and derived estimates that are potentially robust against violations of these assumptions. The majority of the sensitivity analyses supported our main analysis finding.
Several limitations should be considered when interpreting our findings. Our MR analysis had appropriate statistical power to detect an OR of 0.89 per cup of coffee per day for risk of overall breast cancer. Observational studies have detected smaller associations between coffee consumption and breast cancer risk than this [31]. We were therefore unable to rule out the possibility that coffee consumption has a weaker association that we were not powered to detect. A weakness of using summary-level data in two-sample MR is that stratified analyses by covariates of interest (e.g. smoking, alcohol, obesity, physical activity), which would have allowed us to investigate potential interactions between risk factors, are not possible; however, previous observational studies have in general not identified interactions with these variables [31]. Although we examined clinically meaningful disease subtypes such as ER+/− breast cancer, we could not examine breast cancer by menopausal status; however, 85% of breast cancer cases in our sample are postmenopausal. Although our genetic instruments are robustly associated with coffee consumption, coffee consumption itself is a heterogeneous phenotype, which may limit the generalizability of our findings to specific coffee types or preparation procedures. In addition, we are currently unable to isolate and classify genetic variants into caffeine and non-caffeine aspects of coffee given that the genetic loci heavily overlap, and future research into the biological mechanisms of the genetic instruments is warranted when more data become available; until then, a potential role of micronutrients obtained through coffee consumption in reducing breast cancer risk cannot be ruled out. Another limitation was that two-sample MR assumes linearity, so we could not evaluate the potential existence of non-linear associations.

Conclusions
In summary, the results of this large MR study do not support an association between genetically predicted coffee consumption and breast cancer risk, but we cannot rule out the existence of a weak association.