Genome-Wide Association Study of Plasma Polyunsaturated Fatty Acids in the InCHIANTI Study

Polyunsaturated fatty acids (PUFA) have a role in many physiological processes, including energy production, modulation of inflammation, and maintenance of cell membrane integrity. High plasma PUFA concentrations have been shown to have beneficial effects on cardiovascular disease and mortality. To identify genetic contributors of plasma PUFA concentrations, we conducted a genome-wide association study of plasma levels of six omega-3 and omega-6 fatty acids in 1,075 participants in the InCHIANTI study on aging. The strongest evidence for association was observed in a region of chromosome 11 that encodes three fatty acid desaturases (FADS1, FADS2, FADS3). The SNP with the most significant association was rs174537 near FADS1 in the analysis of arachidonic acid (AA; p = 5.95×10−46). Minor allele homozygotes had lower AA compared to the major allele homozygotes and rs174537 accounted for 18.6% of the additive variance in AA concentrations. This SNP was also associated with levels of eicosadienoic acid (EDA; p = 6.78×10−9) and eicosapentanoic acid (EPA; p = 1.07×10−14). Participants carrying the allele associated with higher AA, EDA, and EPA also had higher low-density lipoprotein (LDL-C) and total cholesterol levels. Outside the FADS gene cluster, the strongest region of association mapped to chromosome 6 in the region encoding an elongase of very long fatty acids 2 (ELOVL2). In this region, association was observed with EPA (rs953413; p = 1.1×10−6). The effects of rs174537 were confirmed in an independent sample of 1,076 subjects participating in the GOLDN study. The ELOVL2 SNP was associated with docosapentanoic and DHA but not with EPA in GOLDN. These findings show that polymorphisms of genes encoding enzymes in the metabolism of PUFA contribute to plasma concentrations of fatty acids.


Introduction
Polyunsaturated fatty acids (PUFA) are the class of fatty acids with multiple desaturations in the aliphatic tail. Short chain PUFA (up to 16 carbons) are synthesized endogenously by fatty acid synthase. Long chain PUFA are fatty acids of 18 carbons or more in length with two or more double bonds. Depending on the position of the first double bond proximate to the methyl end, PUFA are classified as n-6 or n-3. Long chain PUFA are either directly absorbed from food or synthesized from the two essential fatty acids linoleic acid (LA; 18:2n-6) and alpha-linolenic acid (ALA; 18:3n-3) through a series of desaturation and elongation processes [1]. The initial step in PUFA biosynthesis is the desaturation of ALA and LA by the enzyme d6-desaturase (FADS2; GeneID 9415) (Figure 1). PUFA modulate the inflammatory response through a number of different mechanisms, including modulation of cyclooxygenase and lipoxygenase activity [2]. Cyclooxygenase and lipoxygenase are essential for production of eicosanoids and resolvins [2][3][4]. Since n-3 and n-6 fatty acids compete for the same metabolic pathway and produce eicosanoids with differing effects, it has been theorized that the balance of the two classes of PUFA may be important in the pathogenesis of inflammatory diseases.
Epidemiological studies have shown that fatty acid consumption and plasma levels, in particular of the n-3 family, are associated with reduced risk of cardiovascular disease [5][6][7], diabetes [8][9][10], depression [11,12], and dementia [13]. However, not all studies show significant associations, and there have been inconsistencies in the direction of the associations, especially for the n-6 fatty acids [14,15]. The different methods (dietary questionnaire or biomarkers) for assessing PUFA status may contribute to discrepant results [16][17][18]. The disadvantage of relying on dietary PUFA intake is that any method of reporting dietary intake carries intrinsic inaccuracies, which plasma measurements circumvent. In addition, direct measures of PUFA reflect the cumulative effects of intake and endogenous metabolism. Dietary fatty acids can be converted into longer chain PUFA or stored for energy; thus, another reason for inconsistent results may be the general lack of control for individual differences in metabolism once fatty acids are consumed.
Previous studies have examined the association of genetic variants, especially polymorphisms in the FADS genes, with fatty acid concentrations in plasma and erythrocyte membranes [19][20][21]. There are three FADS genes (FADS1 [GeneID 3992], FADS2, and FADS3 [GeneID 3995]) clustered on chromosome 11. Variants in FADS1 and FADS2 have been consistently shown to be associated with PUFA concentrations. It is unknown whether other loci also determine fatty acid concentrations. To address this question, we conducted a genome-wide association study of plasma fatty acid concentrations in participants in the InCHIANTI study.

Results
Linoleic acid (LA) constituted the highest proportion of total fatty acids, followed by arachidonic acid (AA) (Table 1). The narrow heritability was highest for AA (37.7%), followed by LA (35.9%), eicosadienoic acid (EDA, 33.3%), alpha-linolenic acid (ALA, 28.1%), eicosapentanoic acid (EPA, 24.4%), and docosahexanoic acid (DHA, 12.0%). For EDA, AA, and EPA, genome-wide significant signals fell in the FADS1/FADS2/FADS3 region on chromosome 11 (Figure 2, Figure 3, Table S1). Of these, the most significant SNP was rs174537 for AA (P = 5.95×10−46), where the variant explained 18.6% of the additive variance of AA concentrations. This SNP was significantly associated with EDA (P = 6.78×10−9) and EPA (P = 1.04×10−14). The associations with LA (P = 5.58×10−7) and ALA (P = 2.76×10−5) did not reach genome-wide significance, and there was no association with DHA (P = 0.3188). Presence of the minor allele (T) was associated with lower concentrations of longer chain fatty acids (EDA, AA, EPA), but with higher concentrations of LA and ALA (Table 2). With the exception of DHA, the SNPs exhibiting the strongest evidence of association with the fatty acids examined in this study mapped to the FADS1, FADS2, and FADS3 cluster. The most significant SNP for DHA was on chromosome 12 within the SLC26A10 gene (GeneID 65012, rs2277324; P = 2.65×10−9 for DHA). In all cases, inclusion of the most significant SNP as a covariate in the model resulted in attenuation of the effect of the other SNPs in the region (Figure S1). Accordingly, associated SNPs in this region were in significant linkage disequilibrium with each other in the InCHIANTI sample (Figure S2).
To investigate whether this SNP has an effect on other cardiovascular disease risk factors, we examined the association of rs174537 with plasma lipid parameters. Significant association was observed with total cholesterol (P = 0.027) and LDL-cholesterol (P = 0.011), but not with either HDL-C (P = 0.775) or triglycerides (P = 0.862; Table 2). The minor allele homozygotes (TT) had 8 mg/dL lower total cholesterol and 9 mg/dL lower LDL-C compared with GG subjects.

Author Summary
Polyunsaturated fatty acids (PUFA) have a number of beneficial effects on human health. Plasma PUFA concentrations are determined by a combination of dietary intake and metabolic efficiency. To determine the genes involved in PUFA homeostasis, we scanned the genome for genetic variations associated with plasma PUFA concentrations. The fatty acid desaturase gene, studied in previous candidate gene association studies, was the strongest determinant of plasma PUFA. A second gene encoding a fatty acid elongase was associated with long chain PUFA. The results of this study contribute to our understanding of the genetics of PUFA homeostasis. These genetic markers may be useful tools to examine the interrelationship between diet, genetics, and disease.
In the GOLDN study, there were significant associations of the FADS SNP rs174537 with ALA, LA, AA, EPA and DHA (P<0.001) and a marginal association with docosapentaenoic acid (DPA) (P = 0.068) (Table 2). As in the InCHIANTI study, presence of the T allele was associated with higher ALA and LA concentrations and lower AA, EPA, DPA and DHA concentrations. Consistent with the InCHIANTI study, this SNP was associated with total cholesterol and LDL-C but not with triglycerides or HDL-C. We also observed strong associations of rs953413 with DPA (P = 0.002) and DHA (P<0.001). The presence of the minor allele (A) was associated with lower DHA, higher DPA and higher AA compared to major allele homozygotes (Table 2). The remaining 3 SNPs (rs2277324, rs16940765, rs17718324) were not associated with fatty acid concentrations in the GOLDN study (data not shown).

Discussion
The genome-wide association approach enables comprehensive examination of the genome to identify novel loci contributing to PUFA homeostasis. In addition, the significance of the genes previously reported in association with PUFA can be assessed relative to other regions in the genome. Here, we demonstrated that polymorphisms in the FADS cluster are the strongest determinants of plasma and erythrocyte fatty acid concentrations, explaining up to 18.6% of the additive variance in plasma AA levels. Consistent with prior reports, the greatest evidence of association was observed in the region containing FADS1, FEN1 (flap structure specific endonuclease, GeneID 2237), two hypothetical proteins (C11orf9 [GeneID 745], C11orf10 [GeneID 746]), and the promoter region of FADS2 [20,21]. With the exception of EDA, the direction of the association of rs174537 with plasma and erythrocyte fatty acids is consistent with previous reports. We find higher levels of ALA and LA, which is suggestive of an accumulation of the initial substrates of the PUFA metabolic pathway. The cluster of SNPs ranging from rs174537 to rs509360 showed the strongest evidence for association, and contains the haplotype block previously examined in relation to plasma and erythrocyte fatty acids [20,21]. Based on the HapMap CEU data, the r2 values between rs174537 and previously reported SNPs were ≥0.8. If functional polymorphisms exist within this region, they could affect the expression of both desaturases. To this end, in a recent report of genome-wide association of global gene expression, rs174546, which is in LD with rs174537 (r2 = 0.99), was associated with FADS1 expression (LOD = 5, P = 1.6×10−6) but not FADS2 expression (LOD = 0.7, P = 0.07) in lymphoblastoid cells [22,23]. The allele associated with higher AA showed greater expression of FADS1, consistent with our results.
Since FADS1 and FADS2 expression varies by tissue type, it would be of interest to examine the effect of the variant on gene expression in other cell types [24].
The T allele associated with lower AA was also associated with decreased LDL-C and total cholesterol. The association with LDL-C was also observed in a large meta-analysis of plasma lipid concentrations in ~8500 subjects [25,26]. In this meta-analysis, there was stronger evidence of association with high density lipoprotein (HDL-C) and triglycerides (TG), where the T allele displayed lower HDL-C and higher triglyceride concentrations. Finally, in the Wellcome Trust Case-Control Consortium coronary artery disease (CAD) study, the T allele was associated with increased prevalence of CAD (P = 0.0375) [27]. The increased prevalence of CAD, low HDL-C and high TG is consistent with lower AA concentrations in the T allele carriers. Endogenous PUFA are natural ligands of peroxisome proliferator activating receptor alpha (PPARA) [28]. PPARA activation has been shown to elevate HDL-C and lower TG by inducing the expression of ApoAI, ApoAII and lipoprotein lipase and by suppressing ApoCIII [29][30][31][32]. Thus the low AA, EPA and EDA in the T allele carriers would result in lower PPARA activation. Under this hypothesis, we would expect the T allele carriers to display higher LDL-C, since PPARA is known to enhance LDL-C clearance [33]. However, in both the InCHIANTI and GOLDN studies, lower concentrations of LDL-C were observed. It is likely that there are other mechanisms by which fatty acids regulate lipoprotein homeostasis, for example through membrane fluidity. It is possible that the higher concentrations of linoleic and linolenic acid in the T allele carriers result in increased membrane fluidity, thus increasing LDL-receptor recycling and leading to lower LDL-C.
The elongation of very long chain fatty acid (ELOVL) family genes encode elongases that catalyze the rate-limiting condensation reaction resulting in the synthesis of very long chain fatty acids (VLCFA) [34]. To date, six ELOVL genes have been described. ELOVL1, 3 and 6 are involved in synthesis of monounsaturated and saturated long chain fatty acids, while ELOVL2, 4 and 5 elongate polyunsaturated fatty acids [35]. In this study, rs953413 in ELOVL2 was the third most significant SNP in the analysis of EPA, with strong, although not genome-wide significant, association with the long chain fatty acids EPA and DHA. In GOLDN, there was no significant association of this SNP with EPA, but a significant association was observed with DPA. In mammals, two elongation steps are required for the synthesis of DHA from EPA. First, EPA is elongated to DPA, then to 24:5n-3, followed by a desaturation and retroconversion step to form DHA [1] (Figure 1). The two initial elongation steps of 20-C and 22-C fatty acids are mediated by ELOVL2 [36]. rs953413 is associated with the substrate EPA (InCHIANTI) and with the products DPA (GOLDN) and DHA (both studies) of the ELOVL2 pathway. Plasma DPA levels were not measured in InCHIANTI, so whether this association is also present in this population cannot be investigated. Why the ELOVL2 SNP was not associated with EPA in GOLDN is not clear; however, it may reflect differences in fatty acid metabolism in erythrocytes versus plasma, as they represent two slightly different pools of fatty acids [37]. Plasma fatty acids reflect short term intake of fatty acids, whereas erythrocyte levels reflect long term intake. Thus the different results between the plasma and erythrocyte fatty acids may reflect dietary differences between subjects in the GOLDN (USA) and the InCHIANTI (Italy) studies. Regardless of these differences, the results of this study are suggestive of a role for ELOVL2 in the conversion of EPA to DHA in humans.
The presence of the minor (A) allele was associated with higher EPA/DPA and lower DHA. If rs953413, located in intron 1, is the functional SNP (or is in LD with the functional SNP), this variant would likely be associated with lower expression of the ELOVL2 or result in a less efficient variant of the elongase resulting in decreased elongation of EPA to DHA. In lymphoblastoid cells, this SNP was not associated with ELOVL2 expression (LOD = 0.4, P = 0.2) [22,23]. Further investigation in other cell lines and functional analysis of the different variants is warranted. In summary, we have shown that the major loci for fatty acid concentrations in both plasma and erythrocyte membranes are in genes involved in the metabolism of PUFA. The FADS locus on chromosome 11 was the major contributor of plasma fatty acid concentrations, and thus may have implications for cardiovascular disease. In addition, we have identified a second promising locus in ELOVL2 that is involved in the homeostasis of longer chain n-3 fatty acids. Future studies should investigate the interactions between dietary intake, circulating levels of fatty acids and genetic variants on risk of diseases such as cardiovascular disease.

Sample Description
The InCHIANTI study is a population-based epidemiological study aimed at evaluating factors that influence mobility in the older population living in the Chianti region of Tuscany, Italy. Details of the study have been previously reported [38]. Briefly, 1616 residents were selected from the population registry of Greve in Chianti (a rural area: 11 709 residents with 19.3% of the population greater than 65 years of age) and Bagno a Ripoli (Antella village near Florence; 4704 inhabitants, with 20.3% greater than 65 years of age). The participation rate was 90% (n = 1453) and participants ranged between 21-102 years of age. Overnight fasted blood samples were collected for genomic DNA extraction and measurement of plasma fatty acids. Genotyping was completed for 1231 subjects using the Illumina Infinium HumanHap 550 genotyping chip (ver1 and ver3 chips were used). The study protocol was approved by the Italian National Institute of Research and Care of Aging Institutional Review.
There were 85 parent-offspring pairs, 6 sib-pairs and 2 half-sibling pairs documented. We investigated any further familial relationships using IBD of 10,000 random SNPs using RELPAIR and uncovered 1 parent-offspring pair, 79 sibling pairs and 13 half-sibling pairs [39]. We utilized the correct family structure inferred from genetic data for all analyses. In addition, we identified 2 duplicated samples and removed these from the study. Sample quality was assessed using the GAINQC program (http://www.sph.umich.edu/csg/abecasis/GainQC/). The average genotype completeness and heterozygosity rates were 98% and 32%, respectively. We excluded subjects that had less than 97% genotype completeness (n = 12), a heterozygosity rate of less than 30% (n = 5), or misspecified sex based on heterozygosity of the X chromosome SNPs (n = 1). The final sample size used for SNP quality control was 1210.
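The sample-level exclusion rules above can be expressed as a simple filter. The following is a minimal sketch, not the GAINQC implementation: the `SampleQC` record, its field names, and the three example samples are hypothetical, and only the two cutoffs (97% completeness, 30% heterozygosity) and the sex check come from the text.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample QC summary (field names are assumptions)."""
    sample_id: str
    completeness: float    # fraction of SNPs with a genotype call
    heterozygosity: float  # fraction of called SNPs that are heterozygous
    sex_mismatch: bool     # recorded sex disagrees with X-chromosome data


# Cutoffs stated in the text.
MIN_COMPLETENESS = 0.97
MIN_HETEROZYGOSITY = 0.30


def passes_qc(sample: SampleQC) -> bool:
    """Return True when the sample meets all three inclusion criteria."""
    return (
        sample.completeness >= MIN_COMPLETENESS
        and sample.heterozygosity >= MIN_HETEROZYGOSITY
        and not sample.sex_mismatch
    )


# Made-up example records, for illustration only.
samples = [
    SampleQC("S1", completeness=0.985, heterozygosity=0.32, sex_mismatch=False),
    SampleQC("S2", completeness=0.960, heterozygosity=0.32, sex_mismatch=False),
    SampleQC("S3", completeness=0.990, heterozygosity=0.28, sex_mismatch=False),
]
kept = [s.sample_id for s in samples if passes_qc(s)]
print(kept)
```

Keeping the cutoffs as named constants makes it easy to see how the retained sample size responds to stricter or looser thresholds.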
The confirmation study population consisted of 1120 white men and women in the United States participating in the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) Study. The majority of participants were re-recruited from the ongoing National Heart, Lung, and Blood Institute (NHLBI) Family Heart Study (FHS) [40] in two genetically homogeneous centers (Minneapolis, MN and Salt Lake City, UT). GOLDN is part of the Program for Genetic Interactions (PROGENI) Network, a group of NIH-funded intervention studies of gene-environment interactions. The primary aim of the GOLDN study was to characterize the genetic components of the triglyceride response following a high-fat meal and treatment with the hypolipidemic drug fenofibrate. Detailed study design and methodology have been previously described [41,42]. In the replication sample, we excluded persons with missing genotypes or extreme fatty acid values. The final data set consists of information on 1076 individuals. The protocol for this study was approved by the Human Studies Committee of the Institutional Review Board at the University of Minnesota, the University of Utah and Tufts University/New England Medical Center. Written informed consent was obtained from all participants.

Biochemical Measurements
InCHIANTI: Plasma fatty acid measurement methods have been described previously [43]. Briefly, blood samples were collected in the morning after a 12-hr overnight fast. Aliquots of plasma were immediately obtained and stored at −80 °C. Fatty acid methyl esters (FAME) were prepared through transesterification using Lepage and Roy's method with modification [44,45]. Separation of FAME was carried out on an HP-6890 gas chromatograph (Hewlett-Packard, Palo Alto, CA) with a 30-m fused silica column (HP-225; Hewlett-Packard). FAMEs were identified by comparison with pure standards (NU Chek Prep, Inc., Elysian, MN). For quantitative analysis of fatty acids as methyl esters, calibration curves for FAME (ranging from C14:0 to C24:1) were prepared by adding six increasing amounts of individual FAME standards to the same amount of internal standard (C17:0; 50 µg). The correlation coefficients for the calibration curves of fatty acids were in all cases higher than 0.998 in the range of concentrations studied. Fatty acid concentration was expressed as a percentage of total fatty acids. The coefficient of variation for all fatty acids was on average 1.6% for intra-assay and 3.3% for inter-assay. HDL-C, total cholesterol and triglycerides were determined using commercial enzymatic tests (Roche Diagnostics, Mannheim, Germany). Serum low-density lipoprotein cholesterol (LDL-C) was computed with the Friedewald formula (LDL-C = total cholesterol − HDL-C − triglycerides/5).
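As a worked illustration of the Friedewald calculation, the sketch below computes LDL-C from the three measured lipids. The function name and the example values are hypothetical; all quantities are assumed to be in mg/dL, the unit used for the cholesterol differences reported in this study.

```python
def friedewald_ldl(total_cholesterol: float, hdl: float, triglycerides: float) -> float:
    """Estimate LDL-C (mg/dL) as total cholesterol - HDL-C - triglycerides / 5."""
    return total_cholesterol - hdl - triglycerides / 5.0


# Made-up example values in mg/dL.
ldl = friedewald_ldl(total_cholesterol=200.0, hdl=50.0, triglycerides=150.0)
print(ldl)  # 120.0
```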
GOLDN: Fatty acids (FA) in erythrocyte membranes were measured following procedures described previously [46]. Briefly, lipids were extracted from the erythrocyte membranes with a mixture of chloroform:methanol (2:1, v/v), collected in heptane and injected onto a capillary Varian CP7420 100-m column with a Hewlett Packard 5890 gas chromatograph (GC) equipped with an HP6890A autosampler. The GC was configured for a single capillary column with a flame ionization detector and interfaced with HP ChemStation software. The initial temperature of 190 °C was increased to 240 °C over 50 minutes. Fatty acid methyl esters from 12:0 through 24:1n9 were separated, identified and expressed as percent of total fatty acids. Triglycerides were measured using a glycerol-blanked enzymatic method (Trig/GB, Roche Diagnostics Corporation, Indianapolis, IN) and cholesterol was measured using a cholesterol esterase, cholesterol oxidase reaction (Chol R1, Roche Diagnostics Corporation) on the Roche/Hitachi 911 Automatic Analyzer (Roche Diagnostics Corporation). For HDL-cholesterol, the non-HDL-cholesterol was first precipitated with magnesium/dextran. LDL-cholesterol was measured by a homogeneous direct method (LDL Direct Liquid Select Cholesterol Reagent, Equal Diagnostics, Exton, PA).
Table 2. Associations of fatty acids and plasma lipids by rs174537 (FADS1) and rs953413 (ELOVL2) in InCHIANTI and GOLDN study.

Assessment of Dietary Intake
In the InCHIANTI study, dietary intake was assessed using a food-frequency questionnaire (FFQ) created for the European Prospective Investigation into Cancer and Nutrition (EPIC) study, which has previously been validated to provide good estimates of dietary intake in this study population [47,48]. In GOLDN, habitual dietary intake was estimated using the validated dietary history questionnaire (DHQ) developed by the National Cancer Institute [49]. We excluded subjects who reported <800 kcal or >5500 kcal in men and <600 kcal or >4500 kcal in women.
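The energy-intake exclusion rule can be written as a small predicate. This is a minimal sketch under stated assumptions: the function name and the `"M"`/`"F"` sex coding are hypothetical, and intakes exactly at a boundary are treated as plausible because the text excludes only values strictly below or above the limits.

```python
# Plausible daily energy intake ranges (kcal) stated in the text, by sex.
ENERGY_LIMITS = {"M": (800.0, 5500.0), "F": (600.0, 4500.0)}


def plausible_energy_intake(kcal: float, sex: str) -> bool:
    """Return True when reported intake lies within the sex-specific range."""
    low, high = ENERGY_LIMITS[sex]
    return low <= kcal <= high


# Made-up example reports, for illustration only.
reports = [("A", "M", 2400.0), ("B", "M", 700.0), ("C", "F", 4600.0), ("D", "F", 650.0)]
retained = [pid for pid, sex, kcal in reports if plausible_energy_intake(kcal, sex)]
print(retained)
```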
GOLDN: Five SNPs were selected for replication in the GOLDN study: rs953413, rs2277324, rs16940765, rs17718324 and rs174537. One of these, rs2277324, failed genotyping, and therefore another SNP in high LD, rs923838 (r2 = 0.89 in HapMap), was used as a proxy for this SNP. DNA was extracted from blood samples and purified using commercial Puregene reagents (Gentra System, Inc.) following the manufacturer's instructions. SNPs were genotyped using the 5' nuclease allelic discrimination TaqMan assay with allele-specific probes on the ABI Prism 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA, USA) according to standard laboratory protocols. The primers and probes were pre-designed (Assay-on-Demand) by the manufacturer (Applied Biosystems) (Assay ID: FEN_rs174537: C___2269026_10, HRH4_rs16940765: C__32711739_10, SPARC_rs17718324: C__34334455_10, ELOVL2_rs953413: C___7617198_10, rs923828: C___2022671_10).

Statistical Analysis
InCHIANTI GWAS: Inverse normal transformation was applied to plasma fatty acid concentrations to avoid inflated type I error due to non-normality [51]. The genotypes were coded 0, 1 and 2, reflecting the number of copies of the allele being tested (additive genetic model). For X-chromosome analysis, the average phenotype of males hemizygous for a particular allele was assumed to match the average phenotype of females homozygous for the same allele. Association analysis was conducted by fitting a simple regression model using the fastAssoc option in MERLIN [52]. Narrow heritability reflects the ratio of the trait's additive variance to the total variance [51,53]. In all the analyses, the models were adjusted for sex, age and age squared. The genomic control method was used to control for effects of population structure and cryptic relatedness [54]. An approximate genome-wide significance threshold of 1×10−7 (~0.05/495343 SNPs) was used. For each fatty acid concentration, a second analysis included the most significant SNP from the first pass analysis as a covariate. Linkage disequilibrium coefficients within the region of interest were calculated using GOLD [55].
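To make three of the steps above concrete, the sketch below shows a rank-based inverse normal transformation, a simple regression of the transformed trait on additive genotype codes, and the Bonferroni-style threshold. It is an illustration only and not the MERLIN fastAssoc implementation: it ignores family structure, covariates and genomic control, the Blom offset of 0.375 is an assumption (the text does not say which rank offset was used), and the function names and example data are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.375):
    """Map values to normal quantiles by rank; tied values share an average rank.

    offset=0.375 (Blom) is an assumption, not taken from the paper.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


def additive_regression(genotypes, trait):
    """Least-squares slope and R^2 of the trait on allele counts (0, 1, 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in trait)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Bonferroni-style threshold for the number of SNPs quoted in the text.
threshold = 0.05 / 495343

# Made-up data: allele counts and a trait expressed as % of total fatty acids.
genotypes = [0, 0, 1, 1, 2, 2]
raw_trait = [9.8, 9.1, 8.0, 8.4, 6.9, 7.3]
z = inverse_normal_transform(raw_trait)
slope, r_squared = additive_regression(genotypes, z)
print(f"slope={slope:.3f} R^2={r_squared:.3f} threshold={threshold:.3e}")
```

In a single-predictor regression of this kind, R^2 is the proportion of trait variance explained by the additively coded SNP, which is the sense in which a single variant can be said to account for a share of the variance.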
For the other phenotypes (total cholesterol, triglycerides, LDL-cholesterol, HDL-cholesterol and BMI), the traits were normalized by either natural log or square root transformation when necessary. Associations for each SNP were investigated using the general linear model (GLM) procedure in SAS.
GOLDN: Inverse normal transformation was applied to erythrocyte membrane fatty acid concentrations to achieve approximate normality. For the additive model, genotype coding was based on the number of variant alleles at the polymorphic site. Because no significant modification by sex was observed, men and women were analyzed together. We used generalized estimating equation (GEE) linear regression with an exchangeable correlation structure, as implemented in the GENMOD procedure in SAS (Windows version 9.0, SAS Institute, Cary, NC), to adjust for correlated observations due to familial relationships. Potential confounding factors included study center, age, sex, BMI, smoking (never, former and current smoker), alcohol consumption (nondrinker and current drinker), physical activity, use of cholesterol-lowering drugs, diabetes drugs, hypertension drugs, and hormones. A two-tailed P value of <0.05 was considered to be statistically significant.