Genome-wide association meta-analysis of fish and EPA+DHA consumption in 17 US and European cohorts

Background Regular fish and omega-3 consumption may have several health benefits and are recommended by major dietary guidelines. Yet, their intakes remain remarkably variable both within and across populations, which could partly owe to genetic influences. Objective To identify common genetic variants that influence fish and dietary eicosapentaenoic acid plus docosahexaenoic acid (EPA+DHA) consumption. Design We conducted genome-wide association (GWA) meta-analysis of fish (n = 86,467) and EPA+DHA (n = 62,265) consumption in 17 cohorts of European descent from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Consortium Nutrition Working Group. Results from cohort-specific GWA analyses (additive model) for fish and EPA+DHA consumption were adjusted for age, sex, energy intake, and population stratification, and meta-analyzed separately using fixed-effect meta-analysis with inverse variance weights (METAL software). Additionally, heritability was estimated in 2 cohorts. Results Heritability estimates for fish and EPA+DHA consumption ranged from 0.13–0.24 and 0.12–0.22, respectively. A significant GWA for fish intake was observed for rs9502823 on chromosome 6: each copy of the minor allele (Freq A = 0.015) was associated with 0.029 servings/day (~1 serving/month) lower fish consumption (P = 1.96 × 10^-8). No significant association was observed for EPA+DHA, although rs7206790 in the obesity-associated FTO gene was among top hits (P = 8.18 × 10^-7). Post-hoc calculations demonstrated 95% statistical power to detect a genetic variant associated with effect size of 0.05% for fish and 0.08% for EPA+DHA. Conclusions These novel findings suggest that non-genetic personal and environmental factors are principal determinants of the remarkable variation in fish consumption, representing modifiable targets for increasing intakes among all individuals. Genes underlying the signal at rs72838923 and mechanisms for the association warrant further investigation.



Introduction
Consumption of fish (including finfish and shellfish) and long-chain omega-3 fatty acids is linked to lower risk of several chronic diseases, in particular fatal coronary heart disease [1]. These beneficial associations in observational studies are supported by randomized controlled trials demonstrating favorable effects of fish or fish oil on numerous chronic disease risk factors and on cardiac mortality [1,2]. As a result, regular fish consumption is recommended by all major national and international dietary guidelines [1]. In contrast to these guidelines and in comparison to many other foods, remarkable variation exists in the amount of fish consumption within and across populations. In many Western nations, approximately one-third of individuals consume no fish at all, approximately one-third consume fish but relatively rarely (up to once per week), and approximately one-third consume fish more frequently [3]. While some of this wide variation in fish consumption is undoubtedly due to personal and environmental factors (e.g., culture, geographic residence, family habits, socioeconomic status), the potential contribution of intrinsic biologic factors, such as genetic variation, is not well established. In one analysis among Danish twins, the estimated heritability of fish consumption was 17% in men and 61% in women, based on additive genetic effects [4]. Such heritability could relate, for example, to differences in genes related to taste, digestion, fatty acid metabolism, or other unknown processes related to food preferences. Yet the genetic variants underlying this estimated heritability are unknown, and such heritability estimates also require further replication.
A basic concept underlying "personalized nutrition" is that a person's genes can influence their behaviors and responses to the environment. Dietary habits, including the consumption of fish, are among the most relevant factors that influence the development of chronic diseases. Elucidating whether, and in what manner, specific genes alter fish and long-chain omega-3 fatty acid consumption would have implications for understanding influences on variation in fish intake within populations and the biology of partiality to foods. Furthermore, identification of such variants could also inform the development of personalized nutrition, that is, dietary recommendations based on genetic preferences for consumption.
As has been seen with other characteristics such as physiologic risk factors, genome-wide association (GWA) studies may lead to discovery of novel genes and biologic pathways that influence the individual characteristic of interest. Although such studies have been performed for major macronutrients (e.g., fat, carbohydrate, protein) [5,6], few analyses have been done for specific foods [7], whose intakes may be influenced by complex characteristics of tastes, textures, aromas, and nutrient contents. The ability to undertake food-specific genetic analyses has been limited by the modest sample sizes of individual cohorts having both dietary and genetic information and the potential lack of reproducibility of genetic findings discovered in any single cohort.
We therefore performed a collaborative investigation to estimate heritability of and assess how common genetic variation relates to dietary consumption of both fish and long-chain omega-3 fatty acids (eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA)) as part of

Cohorts
The present work was a collaboration among 17 US and European population-based cohort studies participating in the Nutrition Working Group of the CHARGE Consortium (S1 Table).  [5,6] and are provided in S1 Table. All persons studied were of European descent, consented to genetic research, and provided written informed consent. For each study, examination protocols were approved by local institutional review boards at Johns Hopkins University (

Assessment of fish and omega-3 fatty acid consumption
Usual dietary intake was assessed in each cohort using detailed food frequency questionnaires designed to capture the dietary habits of the population under study (S2 Table). Typically, participants were asked to indicate how often, on average, they had consumed various foods and beverages over the past year according to multiple frequency categories (e.g., 9 categories ranging from <1/month to 6+/day), with usual portion sizes specified on the questionnaire or by the participant. Fish intake was generally assessed using multiple questions, such as on consumption of tuna fish; dark meat fish such as salmon or sardines; other white fish; shellfish; and fried fish or fish sandwiches. For each question, the midpoint of each frequency category was used to estimate usual intake, which was then multiplied by the specified portion size; these intakes were summed across all questions on fish. For this analysis, we standardized fish consumption in each cohort to 100 g servings/day. In 12 cohorts, total dietary consumption of eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) was estimated by linking the dietary assessment tool to a food composition table specific to the cohort (e.g., the USDA food composition database in the US). For each of the types of foods consumed, the frequency and average portion size were multiplied by the content of EPA/DHA in the food. The total was calculated by summing across all foods in the questionnaire. For cohorts that included nutrients from supplements, the portion of EPA+DHA from supplements was excluded from our analysis.
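The intake calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frequency categories, their midpoints, the portion sizes, and the EPA+DHA contents below are hypothetical placeholders, not values taken from any cohort questionnaire or food composition table.

```python
# Hypothetical midpoints (times per day) for a few FFQ frequency categories.
# Real questionnaires differ by cohort (see S2 Table); these are placeholders.
FREQ_MIDPOINT_PER_DAY = {
    "never": 0.0,
    "1-3/month": 2.0 / 30.0,
    "1/week": 1.0 / 7.0,
    "2-4/week": 3.0 / 7.0,
    "1/day": 1.0,
}


def fish_servings_per_day(responses, portion_grams, serving_grams=100.0):
    """Sum (frequency midpoint x portion size) over all fish questions,
    then express the total in standardized 100 g servings per day."""
    total_grams = sum(
        FREQ_MIDPOINT_PER_DAY[category] * portion_grams[item]
        for item, category in responses.items()
    )
    return total_grams / serving_grams


def epa_dha_mg_per_day(responses, portion_grams, epa_dha_mg_per_gram):
    """Sum (frequency x portion size x EPA+DHA content) over all foods."""
    return sum(
        FREQ_MIDPOINT_PER_DAY[category]
        * portion_grams[item]
        * epa_dha_mg_per_gram[item]
        for item, category in responses.items()
    )


# Hypothetical participant: tuna once a day, salmon once a week.
responses = {"tuna": "1/day", "salmon": "1/week"}
portions = {"tuna": 100.0, "salmon": 140.0}   # grams per portion (assumed)
content = {"tuna": 2.0, "salmon": 10.0}       # mg EPA+DHA per gram (assumed)

servings = fish_servings_per_day(responses, portions)
epa_dha = epa_dha_mg_per_day(responses, portions, content)
```

Standardizing to a common serving size makes intakes comparable across cohorts whose questionnaires specify different portions.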

Heritability estimates
To evaluate potential heritability of fish and EPA+DHA consumption, we estimated heritability in two family-based cohorts (FamHS and FHS) using the variance components method in Sequential Oligogenic Linkage Analysis Routines (SOLAR; Texas Biomedical Research Institute; San Antonio, TX), adjusting for age and sex. Briefly, heritability is estimated by maximum likelihood as the ratio of the genetic variance to the total phenotypic variance [8].

Genotyping and analysis
Genome-wide genotyping was conducted in each cohort using Affymetrix or Illumina platforms. Each study performed quality control for genotyped single nucleotide polymorphisms (SNPs) based on minor allele frequency (MAF), call rate, and departure from Hardy-Weinberg equilibrium (S3 Table). Phased haplotypes from HapMap CEU were used to impute ~2.5 million autosomal SNPs using a Hidden Markov Model algorithm implemented in MACH, IMPUTE, or BimBam. Study-specific GWA analyses were conducted within each cohort using genotyped and imputed SNP dosages assuming an additive genetic model. Fish and EPA+DHA consumption were separately evaluated as the dependent variable using linear regression with robust standard errors, adjusted for age, sex, energy intake (kcal/d), study-specific centers where applicable, and population stratification principal components when the cohort lambda was >1. SNPs with low MAF (<1%) or low imputation quality (MACH: R^2 <0.3; IMPUTE: proper info <0.4) were excluded. Quality control for cohort-level GWAS results was performed to ensure correct specification of the minor allele and agreement in frequencies with the reference population (HapMap CEU), consistent distributions of effect sizes and standard errors, and examination of QQ plots to assess any large inflation of test statistics. Results across studies were combined using fixed-effect meta-analysis with inverse variance weights (METAL software) [9].
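For readers unfamiliar with the pooling step, the following is a minimal Python sketch of a fixed-effect, inverse-variance-weighted meta-analysis for a single SNP. It illustrates the general technique only and is not the METAL implementation; the function name and the per-cohort estimates are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-cohort effect estimates for one SNP.

    Each cohort is weighted by 1 / SE^2, so more precise cohorts
    contribute more to the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical estimates (servings/day per minor allele) from two cohorts.
pooled_beta, pooled_se, pooled_z, pooled_p = fixed_effect_meta(
    betas=[-0.03, -0.03], ses=[0.01, 0.01]
)
```

With two equally precise cohorts, the pooled estimate equals their common value and the pooled standard error shrinks by a factor of the square root of 2.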
The association results from individual studies as well as meta-analyses were adjusted for genomic control. To explore potential heterogeneity by geographic region, separate meta-analyses were performed within cohorts from Europe and from the USA. Genome-wide significance was considered at the Bonferroni-corrected threshold of P < 5 × 10^-8. Statistical power to detect a true association at various effect sizes (heritability) was calculated using GWAPower software for the analysis of 1 million independent SNPs for both fish and EPA+DHA (Feng S, 2011 BMC Genetics), assuming linkage disequilibrium (r^2) of 0.5 between a SNP and the putative causal variant and 10% of variance explained by three covariates.
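A minimal Python sketch of genomic control and the significance threshold follows. It assumes the common convention of computing lambda as the median observed 1-df chi-square statistic divided by its expected median (about 0.4549) and inflating standard errors only when lambda exceeds 1; it also assumes the 5 × 10^-8 threshold corresponds to 0.05 divided by 1 million independent tests. Function names are hypothetical.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

# Bonferroni threshold assuming 1 million independent SNPs (assumption).
GENOME_WIDE_ALPHA = 0.05 / 1_000_000


def genomic_control_lambda(z_scores):
    """Ratio of the observed median chi-square to its expected median."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def gc_adjust_se(ses, lam):
    """Inflate standard errors by sqrt(lambda); no change if lambda <= 1."""
    factor = math.sqrt(max(lam, 1.0))
    return [se * factor for se in ses]


# Hypothetical example: z-scores sitting at the null median give lambda ~ 1.
lam_null = genomic_control_lambda([0.6744898, -0.6744898, 0.6744898])
adjusted = gc_adjust_se([0.01], 1.07)
unchanged = gc_adjust_se([0.01], 0.99)
```

A lambda close to 1 indicates little inflation of test statistics from population stratification or other systematic bias.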

Exploratory analysis of plasma phospholipid EPA and DHA
Results from genome-wide association analyses of circulating EPA and DHA are publicly available (http://faculty.washington.edu/rozenl/files/) [10]. These databases were mined to test whether the top SNPs from the fish and EPA+DHA intake GWAS are associated with circulating levels of EPA and DHA.

Results
The 17 cohorts were from the US, Estonia, Finland, Greece, Italy, and the Netherlands and included 86,467 participants with information on fish consumption and 62,265 with information on EPA+DHA consumption. Across participating cohorts, mean fish consumption ranged from  1). Mean intake of EPA+DHA ranged from 89 (Rotterdam) to 563 (HBCS) mg/d and was generally consistent with findings on fish intake, except in THISEAS (Greece), which had relatively higher intakes of fish than EPA+DHA, suggesting predominant consumption of white (non-oily) fish. In general, participants in European cohorts had higher fish consumption than those in US cohorts.
The heritability estimates for fish intake were 0.13±0.03 (FamHS) and 0.24±0.02 (FHS); and for EPA+DHA intake, 0.12±0.03 (FamHS) and 0.22±0.02 (FHS).
In GWA meta-analyses of fish (17 cohorts) and EPA+DHA (11 cohorts) consumption, the genomic control lambda values were 1.07 and 0.99, respectively (S1 and S2 Figs). A genome-wide significant association was observed for fish intake on chromosome 6 for rs9502823 (Table 2). The minor allele (Freq A = 0.015) was associated with 0.029 servings/day lower fish consumption (P = 1.96 × 10^-8). This SNP was mapped to the LOC285768 gene, of unknown function (Fig 1, top panel), and was not identified in the NHGRI-EBI GWAS catalogue (http://www.ebi.ac.uk/gwas/, search on Feb 23, 2017). The second top hit was rs17396472 on chromosome 3, which did not achieve genome-wide statistical significance (P = 5.62 × 10^-8).
No genome-wide significant association was observed for EPA+DHA consumption (S1 and S2 Figs). The top association for EPA+DHA consumption was observed for rs11877506 (P = 1.18 × 10^-7) (Table 2). Additionally, rs7206790 in the obesity-associated FTO gene was among the top SNPs for EPA+DHA intake: the body mass index-raising G allele was associated with 7 mg/day greater EPA+DHA intake (Fig 1, bottom panel; P = 7.44 × 10^-7).
To obtain more information on the locus associated with fish consumption on chromosome 6, we investigated data from the ENCODE project. Using CEU 1000 Genomes data, we calculated the LD within the region 250 kb upstream and downstream from rs9502823. Mapping the SNPs found in the rs9502823 LD block to ENCODE regulatory regions, we identified rs72838923 (in complete LD with rs9502823) as a functional candidate. rs72838923 falls within an experimentally determined H3K27ac region, identified in several cell types in ENCODE. H3K27ac regions are thought to be markers of active enhancers. In addition, rs72838923 falls within a DNase hypersensitivity peak, which was identified experimentally across 65 cell types from the ENCODE project. There is further evidence of transcription factor binding sites in this region, for FOXA1 among others, from ENCODE ChIP-seq experiments. In addition, mapping rs72838923 on the UCSC genome browser suggested that this SNP is found within a region of conservation across mammals.
Due to the varying ranges of average fish consumption in US versus European studies, we performed exploratory subgroup GWA meta-analysis stratified by geographic location. No significant associations were identified in either US or European studies for fish (S3 Fig) or EPA+DHA consumption (S4 Fig).
A prior consortium analysis including several of these same cohorts reported on genome-wide association of SNP variants with plasma phospholipid EPA and DHA, the concentrations of which are determined by both dietary intake and endogenous metabolic regulation [11]. In exploratory analysis, we evaluated whether the top 5 hits for fish consumption and the top 4 hits for estimated dietary EPA+DHA consumption identified in the present analysis were associated with plasma phospholipid concentrations of EPA or DHA in that prior analysis [10], adjusting for multiple comparisons (9 SNPs x 2 fatty acids = 18 tests; Bonferroni-corrected alpha of 0.05/18 = 0.0028). No significant associations were identified (S4 Table).
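The multiple-comparison threshold for this exploratory lookup is simple arithmetic, sketched below in Python. The function names are hypothetical and the P values in the usage example are invented for illustration only.

```python
def bonferroni_alpha(n_snps, n_traits, alpha=0.05):
    """Per-test significance level after Bonferroni correction."""
    return alpha / (n_snps * n_traits)


def passes_threshold(p_value, n_snps, n_traits, alpha=0.05):
    """True if a lookup P value survives the corrected threshold."""
    return p_value < bonferroni_alpha(n_snps, n_traits, alpha)


# 9 SNPs (5 for fish, 4 for EPA+DHA) tested against 2 fatty acids = 18 tests.
lookup_alpha = bonferroni_alpha(n_snps=9, n_traits=2)

# Invented P values, for illustration only.
example_hits = [passes_threshold(p, 9, 2) for p in (0.04, 0.001)]
```

A nominal P value of 0.04 would therefore not count as significant here, whereas 0.001 would.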

Discussion
In this large GWA meta-analysis of 17 US and European cohorts totaling 86,467 participants, we found evidence that common genetic variation may be associated with consumption of fish. We found no genome-wide significant association of common variants with EPA+DHA intake. While the sample size for the analysis of EPA+DHA was smaller than the fish intake analysis, with 62,265 individuals the analysis had 95% power to detect an effect size (heritability) of 0.08%. We identified one locus on chromosome 6 in association with fish consumption. The SNP was mapped to LOC285768 with unknown function. The next closest gene is forkhead box Q1 (FOXQ1) which is a member of the cancer-associated forkhead-box (FOX) gene family [12]. Our evaluation of data from ENCODE, taken together, identified a functional candidate, rs72838923, that appears to lie within a transcriptionally active region of the genome. While the association was statistically significant, the magnitude of effect was small, with the minor allele being associated with a difference of 0.03 servings/day or approximately 1 serving/month of fish. This finding is more likely to be relevant for understanding the biology of food preferences than for influencing clinical outcomes, although even small differences in fish consumption, over a lifetime, could influence health. The nonsignificant top associations identified in chromosomes 1, 3, and 12 each represent intragenic regions of genes highly expressed in the brain, but these associations did not achieve genome-wide significance.
In heritability analyses, we found evidence for modest heritability of fish (0.13 to 0.24) and EPA+DHA (0.12 to 0.22) consumption. Our GWA results identified one locus in association with fish intake that cannot fully account for this observed heritability, suggesting that observed heritability might be due to remarkably small effects across a large number of SNPs, other types of genetic variation such as copy number variants, epigenetic modifications, or multiple unobserved genetic interactions with unknown environmental factors. This challenge of "missing" or unaccounted for heritability is a frequent finding in GWA analyses of common diseases and traits [13]. Heritability analyses may overestimate heritability due to unmeasured shared environmental influences, for example from in utero/placental influences through childhood and adult life. In this light, our heritability findings are lower than those previously reported [4] and represent an additional important new contribution. Our findings support the need for future investigations of the possible explanations for the modest but as yet missing heritability of fish and EPA+DHA consumption.
This investigation had several strengths. Our pooling of multiple large, well-established cohorts provided a very large sample of participants for investigating our research questions. Our post-hoc power calculations demonstrate 95% statistical power to detect a genetic variant associated with an effect size of 0.05% for fish consumption and 0.08% for EPA+DHA consumption. We adjusted for total reported energy intake, which helps to address any systematic over- or under-reporting by individuals and also real differences in total food consumed (i.e., due to differences in age, sex, body size, or physical activity), facilitating evaluation of dietary composition. All the studies in the meta-analysis used comparable dietary assessment tools that were appropriate for the population under study, providing the highest quality data that can be reasonably collected across multiple large epidemiological studies.
Limitations should be considered. While dietary intakes assessed by food frequency questionnaire represent a reasonably valid method to collect data on usual dietary habits in large populations [14], such data also include measurement error, which could limit the ability to detect true associations. However, many validation studies have demonstrated that fish and EPA+DHA consumption are measured reasonably well by food frequency questionnaires, whether compared with multiple diet records or with objective circulating or tissue biomarkers [15,16,17,18]. Indeed, because biomarker levels also represent imperfect measures of "true" habitual consumption with uncorrelated errors compared to questionnaire estimates, the actual correlations of estimated fish or EPA+DHA consumption with "true" consumption are likely much higher, in the range of 0.8 or more. Compared with a candidate gene approach, GWA has lower statistical power to detect small genetic effects. Yet, candidate gene approaches for evaluating fish consumption would be strongly limited by imperfect knowledge of which genes affect known systems and biologic processes related to food preferences and, even more so, which genes may affect currently unknown systems and biologic influences on food preferences.
In summary, this large pooling project across 17 established cohorts identified modest heritability of fish and omega-3 fatty acid consumption and one genetic locus associated with fish consumption. These findings suggest that genetic variation may have small effects on fish consumption and, by extension, that other modifiable factors (for example, childhood diet, culture, education, income, and local availability) are the main determinants of the remarkable differences in fish consumption within and across populations, representing targets for increasing fish intake among all individuals.

Acknowledgments
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the other funders.
The Atherosclerosis Risk in Communities (ARIC) study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, N01-HC-
We thank all study participants as well as everybody involved in the Helsinki Birth Cohort Study. Helsinki Birth Cohort Study has been supported by grants from the Academy of Finland, the Finnish Diabetes Research Society, Folkhälsan Research Foundation, Novo Nordisk Foundation, Finska Läkaresällskapet, Signe and Ane Gyllenberg Foundation, University of Helsinki, European Science Foundation (EUROSTRESS), Ministry of Education, Ahokas Foundation, Emil Aaltonen Foundation.
The Health, Aging and Body Composition (Health ABC) study was supported in part by the Intramural Research Program of the NIH, National Institute on Aging contracts N01AG62101, N01AG62103, and N01AG62106. The genome-wide association study was funded by NIA grant R01 AG032098 to Wake Forest University Health Sciences and genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268200782096C.
The use of Health 2000 data in this study has been financially supported by the Academy of Finland (grant 250207) and Orion-Farmos Research Foundation. The authors would like to thank the many colleagues who contributed to collection and phenotypic characterization of the clinical samples, and DNA extraction and genotyping of the data, especially Eija Hämäläinen, Minttu Jussila, Outi Törnwall, Päivi Laiho, and the staff from the Genotyping Facilities at the Wellcome Trust Sanger Institute. They would also like to acknowledge those who agreed to participate in the H2000 study.
Invecchiare in Chianti (aging in the Chianti area, InCHIANTI) study investigators thank the Intramural Research Program of the NIH, National Institute on Aging who are responsible for the InCHIANTI samples. Investigators also thank the InCHIANTI participants. The InCHIANTI study baseline (1998-2000) was supported as a "targeted project" (ICS110.1/RF97.71) by the Italian Ministry of Health and in part by the U.S. National Institute on Aging (Contracts: 263 MD 9164 and 263 MD 821336).
The Multi-Ethnic Study of Atherosclerosis (MESA) and MESA SHARe project are conducted and supported by contracts N01-HC-95159 through N01-HC-95169 and RR-024156 from the National Heart, Lung, and Blood Institute (NHLBI). Funding for MESA SHARe genotyping was provided by NHLBI Contract N02-HL-6-4278. The authors thank the participants of the MESA study, the Coordinating Center, MESA investigators, and study staff for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
The NHS and HPFS are supported by the National Cancer Institute (NHS: UM1 CA186107, HPFS: UM1 CA167552) with additional support for genotyping. The NHS Breast Cancer GW scan was performed as part of the Cancer Genetic Markers of Susceptibility initiative of the NCI (R01CA40356, U01-CA98233). The NHS/HPFS type 2 diabetes GWAS (U01HG004399) is a component of a collaborative project that includes 13 other GWAS funded as part of the Gene Environment-Association Studies (GENEVA) under the NIH Genes, Environment and Health Initiative (GEI) (U01HG004738, U01HG004422, U01HG004402, U01HG004729, U01HG004726, 01HG004735, U01HG004415, U01HG004436, U01HG004423, U01HG004