Uncovering the DNA methylation landscape in key regulatory regions within the FADS cluster

Genetic variants near and within the fatty acid desaturase (FADS) cluster are associated with polyunsaturated fatty acid (PUFA) biosynthesis, levels of several disease biomarkers and risk of human disease. However, determining the functional mechanisms by which these genetic variants impact PUFA levels remains a challenge. Utilizing an Illumina 450K array, we previously reported strong allele-specific methylation (ASM) associations (p = 2.69×10−29) between a single nucleotide polymorphism (SNP) rs174537 and DNA methylation of CpG sites located in the putative enhancer region between FADS1 and FADS2, in human liver tissue. However, this array only featured 20 CpG sites within this 12kb region. To better understand the methylation landscape within this region, we conducted bisulfite sequencing of the region between FADS1 and FADS2. Liver tissues from 50 male subjects (27 European Americans, 23 African Americans) were obtained from the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) study, and used to ascertain the genotype at rs174537 and methylation status across the region of interest. Associations between rs174537 genotype and methylation status of 136 CpG sites were determined. Age-adjusted linear regressions were used to assess ASM associations with rs174537 genotype. The majority of CpG sites (117 out of 136, 86%) exhibited high levels of methylation with the greatest variability observed at three key regulatory regions–the promoter regions for FADS1 and FADS2 and a putative enhancer site between the two genes. Eight CpG sites within the putative enhancer region displayed significant (FDR p <0.05) ASM associations with rs174537. These data support the concept that both genetic and epigenetic factors regulate PUFA biosynthesis, and raise fundamental questions as to how genetic variants such as rs174537 impact DNA methylation in distant regulatory regions, and ultimately the capacity of tissues to synthesize PUFAs.

Genetic variants near and within the fatty acid desaturase (FADS) cluster are associated with polyunsaturated fatty acid (PUFA) biosynthesis, levels of several disease biomarkers and risk of human disease. However, determining the functional mechanisms by which these genetic variants impact PUFA levels remains a challenge. Utilizing an Illumina 450K array, we previously reported strong allele-specific methylation (ASM) associations (p = 2.69×10 −29 ) between a single nucleotide polymorphism (SNP) rs174537 and DNA methylation of CpG sites located in the putative enhancer region between FADS1 and FADS2, in human liver tissue. However, this array only featured 20 CpG sites within this 12kb region. To better understand the methylation landscape within this region, we conducted bisulfite sequencing of the region between FADS1 and FADS2. Liver tissues from 50 male subjects (27 European Americans, 23 African Americans) were obtained from the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) study, and used to ascertain the genotype at rs174537 and methylation status across the region of interest. Associations between rs174537 genotype and methylation status of 136 CpG sites were determined. Age-adjusted linear regressions were used to assess ASM associations with rs174537 genotype. The majority of CpG sites (117 out of 136, 86%) exhibited high levels of methylation with the greatest variability observed at three key regulatory regions-the promoter regions for FADS1 and FADS2 and a putative enhancer site between the two genes. Eight CpG sites within the putative enhancer region displayed significant (FDR p <0.05) ASM associations with rs174537. These data support the concept that both genetic and epigenetic factors regulate PUFA biosynthesis, and raise fundamental questions as to how genetic variants such as rs174537 PLOS

Introduction
Polyunsaturated fatty acids (PUFAs) are vital for normal growth and development, serving as key structural components of biological membranes and modulating critical signal transduction events. In particular, long-chain PUFAs (LC-PUFAs) are precursors of bioactive metabolites, which have been implicated in several human diseases including cardiovascular disease, cancer, and inflammation [1][2][3][4]. Historically, the metabolic conversion and endogenous synthesis of LC-PUFAs from dietary PUFAs has been considered to be slow and relatively uniform in humans. Studies over the past decade  have demonstrated that genetic and epigenetic variations near and throughout the fatty acid desaturase (FADS) gene cluster (Fig 1) account for large variations in circulating and cellular LC-PUFA levels. Several common polymorphisms within the FADS gene cluster, including the single nucleotide polymorphism (SNP) rs174537, have been shown to be highly associated with circulating and tissue PUFA levels and PUFA product-to-precursor ratios [9-11, 28, 29]. In light of these studies, the conventional paradigm of slow, inefficient and uniform LC-PUFA biosynthesis in humans has been challenged. Genetic variants in the FADS gene cluster have also shown strong associations with intermediate biomarkers of disease, such as total cholesterol, low-density lipoprotein (LDL), highdensity lipoprotein (HDL), triglycerides, phospholipids, C-reactive protein, and pro-inflammatory eicosanoids [ . Most notable is rs174537, located approximately 15kb downstream of FADS1, which displays the strongest association (p<10 −40 ) with blood and tissue levels of arachidonic acid (ARA; C20:4n-6) and the metabolic conversion capacity, represented by product to precursor ratios (e.g. ARA/DGLA) [11,29,34]. In addition, there are marked allele frequency differences of rs174537 between racial/ethnic groups, with African populations having a derived haplotype that is associated with higher ARA levels and more efficient LC-PUFA biosynthesis [11, 35,36]. Together, these studies reveal the importance of genomic factors on PUFA biosynthesis and chronic inflammatory diseases in racially diverse populations.
Despite consistent associations between SNPs near and in the FADS cluster and LC-PUFA levels and related functional phenotypes, much remains unknown regarding the molecular mechanism of these genomic variants on PUFA metabolism in critical tissues and organs. A few studies indicate that genetic variants such as rs174537, and SNPs in high linkage disequilibrium (LD) with rs174537, may alter LC-PUFA levels via changes in FADS gene expression levels. This could occur in several ways, including altering the regulatory landscape (e.g., promoter or enhancer) of a gene, alternative RNA splicing, transcript degradation, or transcription of non-coding RNA. Additionally, epigenetic modifications such as DNA methylation may influence FADS gene expression levels. We recently performed a genome-wide, allele-specific methylation (ASM) analysis with rs174537 in human liver tissue to test the hypothesis that rs174537 (or variants in high LD) interacts with key regulatory regions within the FADS cluster and be associated with DNA methylation levels in a mechanistically important way [34]. Utilizing the Illumina HumanMethylation450 BeadChip, only one region of the genome (within the 12kb region located between FADS1 and FADS2), on chromosome 11q12.2, showed an ASM association with rs174537 in both African Americans and European Americans. This strong association was observed with methylation site cg27386326 (i.e., 61587979 in this manuscript) (p<10 −20 ), which is located~3.5kb from the FADS1 and~7.8kb from the FADS2 transcription initiation sites, in a region with a putative enhancer signature, near transcription factor binding sites for c-Fos, STAT3 and MafK. The methylation proportion differed by 0.40 between homozygotes at rs174537 (0.44, GG genotype; 0.84, TT genotype) and was highly associated with LC-PUFA biosynthesis.
A limitation of the aforementioned study [34] was that the Illumina HumanMethylation450 BeadChip (485,577 CpG sites) only featured a small number of probes (N = 20) within the potentially important 12kb regulatory region between FADS1 and FADS2. The 12kb region between FADS1 and FADS2, which are transcribed in opposite directions, contains three key regions of interest: (i) the FADS1 promoter region (approximately chr11: 61,584,650-61,586,300), (ii) a putative enhancer region (approximately chr11: 61,587,300-61,589,000), and (iii) the FADS2 promoter region (approximately chr11: 61,594,300-61,595,600) (Fig 1). To comprehensively examine the potential associations between the rs174537 genotypes and the methylation of all CpG sites within this regulatory region, we conducted bisulfite sequencing of the 12kb region between FADS1 and FADS2. Our goal was to characterize the DNA methylation landscape of this regulatory region and identify any new ASM associations between rs174537 and CpG sites within this region.

Study samples
DNA and liver tissue from 50 individuals (27 European American and 23 African American) were obtained from the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) study [37]. PDAY was an autopsy study designed to enroll men and women between 15-34 years or age who died of non-cardiovascular disease related causes (e.g., traumatic injuries). Autopsies were performed within 72 hours of death, and livers were frozen at -80˚C. DNA was then isolated from these tissue samples, as previously described [37]. Since the original PDAY study was geared towards atherosclerosis, levels of HDL and non-HDL, thiocyanate and glycohemoglobin were quantified. Age, body mass index (BMI) and whether the patient was hypertensive were also documented. Since all study subjects were deceased at the time of study, use of these liver tissue specimens is not considered human subjects research.

Genotyping and DNA methylation analysis
Liver tissue samples were originally genotyped at rs174537 and evaluated for DNA methylation using the Illumina HumanMethylation450 BeadChip, as previously described [34]. A portion of the 12kb region between FADS1 and FADS2 ( Fig 1B) was bisulfite sequenced by Zymo Research, Inc. After evaluation of repeat sequences and allowing for PCR failures,~10.5kb out of the 12kb region could be sequenced, and 136 CpG sites were reliably quantitated. Methylation levels 5% were excluded across all samples, based on the assay's limit of detection and sensitivity. The name of each CpG is based on its genomic position on chromosome 11 (0-based format), based on genome build GRCh37/hg19. All methylation data from this study cohort has been provided as a supplemental file (S1 File).

Statistical analysis
ASM analysis was conducted to identify CpG sites most associated with the genotype at rs1745 37. By ethnicity, each of the 136 CpG sites was evaluated for an association with rs174537 using a linear regression model adjusted for age. Genotypes for rs174537 were coded for a dominant genetic model, relative to the T-allele. CpG site associations in the two cohorts were then analyzed in a trans-ethnic meta-analysis. The meta-analysis was computed using METAL, which implements a weighted inverse normal method (weighted by sample size) [38]. For each analysis (ethnic-specific and meta-analysis), the Bonjamini-Hochberg FDR (BH-FDR) p-values were calculated and reported along with the raw p-values. Significance was set at the 0.05 BH-FDR level. All statistical analyses were conducted using commercially available software (STATA and SAS) and open source statistical software (R).

Results
DNA methylation throughout the sequenced region was characterized, and included 136 CpG sites that spanned the three key regulatory regions: (i) the FADS1 promoter region, (ii) the putative enhancer region and (iii) the FADS2 promoter region (Fig 1). DNA methylation analyses were successfully performed on all 50 individuals (27 European American and 23 African American) that passed quality control metrics. Overall, we observed similar characteristics between the European and African American groups, with the exception of BMI and hypertension. African Americans in this cohort were more hypertensive despite having lower BMI. Characteristics of the study cohort are provided in Table 1.
Examining the average methylation status across the region revealed that variance in methylation was greatest within the three key regulatory regions (Fig 2). CpG sites outside of these three regions displayed high levels of methylation (>80% methylation, on average). Further investigation demonstrated substantial variability in the methylation status of CpG sites located in the putative enhancer region (Fig 2).
To assess the association of rs174537 with the methylation status of the 136 CpG sites, a linear regression was computed, adjusting for age. In the overall, race-combined meta-analysis, the greatest ASM with rs174537 occurred at the CpG site located at chr11:61587979 (i.e., cg27386326; Fig 3). This was the same site that demonstrated the greatest ASM (p = 2.69×10 −29 ) with rs174537 in our previous methylation GWAS study. The meta-analysis across the region also revealed an additional seven significantly associated (FDR <0.05) CpG sites (a total of eight), which were localized within the putative enhancer region (61587835, 61587979, 61588092, 61588096,  (Table 2). While these sites did not meet the FDR level of significance, they show promise for further investigation in larger cohorts. The complete ASM results for all 136 CpG sites with their Cg IDs have been provided in S1 Table (S1 Table). In many cases, we observed methylation levels of adjacent CpG sites to be correlated with one another. For example, adjacent CpG sites to chr11:61587979 (i.e. cg27386326), namely 61588092 and 61588096, were highly correlated (r = 0.94), suggesting that CpG sites within close proximity to one another can experience similar levels of methylation. Similarly, three CpG sites within the FADS2 promoter region (i.e. 61594865, 61594876, 61594907, r = 0.86) were strongly correlated with each other. While CpG site 61594865 was the only one that displayed trends towards a strong ASM association with rs175537 in the European American population, it is possible that these adjacent CpG sites could be just as influential in larger studies. While the data from Fig 2 appears to indicate that there are differences in methylation status by race within these three key regulatory regions, this is in fact explained by differences in genotype, with the allele frequencies of rs174537 being different by race. Analysis stratified by race (Figs 2 &  4) reveal that the European Americans displayed lower levels of methylation, on average, at CpG sites within the FADS2 promoter region compared to African American subjects. In contrast, African Americans expressed lower methylation at CpG sites 61587979 (i.e. cg27386326), 61588145 and 61589043 within the enhancer region. However, the racial differences in the methylation status of CpG sites within the enhancer region indicated in Fig 2 were observed to be strongly dependent on genotype at rs174537 (Fig 4). When stratified by race, CpG site chr11:61587979 (i.e. cg27386326) continued to display the strongest ASM association with rs174537 in European Americans. However, the strongest ASM was observed at CpG 61588145 in African Americans; this site is only 166 bp away from 61587979 (i.e. cg27386326) and also located in the putative enhancer region. Table 2 reports the CpG sites with the strongest ASM associations for European and African Ancestry populations and the overall meta-analysis for this cohort.

Discussion
Omega-6 and omega-3 LC-PUFAs such as arachidonic acid (ARA) and docosahexaenoic acid (DHA) have long been recognized to have vital structural roles in cellular membranes, brain development and function, and inflammation [39][40][41]. Given this strong relationship between LC-PUFAs and human physiology and recent evidence demonstrating the significant impact of genomic variants, including rs174537, on LC-PUFA levels and related phenotypes, we conducted a more in-depth investigation of the methylation status of 136 CpG sites between FADS1 and FADS2 genes. Our goals were to 1) characterize the DNA methylation landscape within this potentially important regulatory area; 2) identify new ASM associations with genotype at rs174537; and 3) better understand the molecular mechanism by which genetic and epigenetic variations with the FADS cluster may influence LC-PUFA biosynthesis in human tissues.
Seven new CpG sites were associated within the sequenced region, in addition to the previously identified CpG site 61587979 (i.e. cg27386326), totaling eight CpG sites exhibiting significant ASM associations with rs174537. CpG sites that displayed the greatest ASM association with rs174537 were localized within the putative enhancer region. These associations were replicated within each ethnic group and strengthened by the meta-analysis. Notably, there were additional signals of suggestive significance that may be enhanced with larger sample sizes. For example, European Americans showed suggestive associations at both the FADS1 (p = 0.014) and FADS2 (p = 0.018) promoter regions; and African Americans showed a suggestive FADS2 promoter signal (p = 0.006) which was supported by CpG sites in close-proximity (Fig 3).
The DNA methylation landscape of the sequenced region revealed areas of hyper-and hypo-methylation, with the three key regulatory regions (the FADS1 promoter, the putative enhancer, and the FADS2 promoter regions) displaying the greatest variation in methylation levels. In general, we observed lower average levels of methylation (<25%) in the two promoter regions. Provided that increased DNA methylation at promoter regions is generally associated with lower gene expression levels, it is likely that FADS1 and FADS2 gene expression levels were not greatly suppressed in this cohort. Ultimately, this needs to be experimentally tested. Average methylation levels in the putative enhancer region were generally higher than the promoter regions (~50%), but displayed a great deal of variability between subjects, ethnicity, and genotype, with genotype being the most influential factor.
Individuals homozygous for the major allele (i.e. GG genotype at rs17547), regardless of race, displayed the lowest levels of methylation within the putative enhancer region (Fig 4). Given that the GG genotype is associated with increased metabolic conversion capacities of dietary PUFAs to LC-PUFAs, it is possible that their low levels of methylation does not suppress FADS1 and FADS2 gene expression. Alternatively, individuals with GT and TT genotypes who have slower metabolic conversion capacities, with TT subjects exhibiting deficiencies in LC-PU-FAs, could be experiencing reduced FADS1 and FADS2 expression levels due to their increased DNA methylation. This suggests that reducing methylation levels in this regulatory region for individuals having GT and/or TT genotypes could potentially improve their endogenous synthesis of LC-PUFAs.
It is clear that there is some relationship between the rs174537 genotype and methylation in the sequenced region, and that this relationship has the potential to strongly impact LC-PUFA biosynthesis. It is unlikely that the rs174537 genotype affects the methylation status directly within these regulatory regions, since it resides over 30kb from the associated CpG sites. rs174537 resides in a large LD block, so other SNPs within this block and closer to the regulatory region would show similar relationships. Further studies are necessary to better understand the three-dimensional architecture of the FADS cluster and to determine if these genetic and epigenetic variants interact in that context. Epigenetic regulation has emerged as an important research focus, with environmental and particularly dietary exposure being key factors that influence PUFA metabolism and FADS gene expression levels. For example, Hoile et al. recently reported significant changes in DNA methylation of CpG sites within the FADS region in response to changing the fatty acid composition of the human diet [42]. Specifically, they discovered three CpG sites in the FADS2 promoter region to be most influenced by changes in diet. Interestingly, the single CpG site we identified in the FADS2 promoter within the European American cohort in this study (i.e. 61594865) is also one that Hoile et al. found to be impacted by diet. Together, these data imply that, in addition to being highly associated with rs174537, these CpG sites could play a pivotal role in sensing dietary, circulating, and tissue PUFA levels, and potentially regulate FADS gene transcription.
Genomic variants near and within the FADS cluster have been shown to influence downstream processes from the PUFA biosynthesis pathway, including inflammation, cardiovascular disease, cancer, and other chronic diseases. For example, maternal fat intake in rats has been shown to alter the epigenetic regulation of FADS2 in offspring liver. Additionally, the methylation status of CpG sites in two critical regulatory regions of the FADS cluster have been shown to be associated with blood and tissue PUFA levels as well as an important phenotype (e.g. immediate and intermediate memory) in toddlers [43,44]. It is clear that further investigation is needed to tease out the importance of the methylation status of CpG sites in this important FADS regulatory region.
This study, along with previous work from our lab, has demonstrated the significant racial differences in the genetic and epigenetic status of variants in the FADS cluster as well as LC-PUFA biosynthesis with African Ancestry populations having higher frequencies of variants associated with elevated LC-PUFA levels. We postulate that these genomic variants, along with high omega-6 PUFA exposure in the modern Western diet, have the capacity to create destructive genetic/epigenetic-dietary PUFA interactions. This, in turn, could have important implications on downstream inflammatory processes and disease states, including hypertension, diabetes, heart disease, and cancer [35,36,45,46]. Taken together, all of these studies suggest that a "one size fits all" dietary PUFA recommendation may not be appropriate. Further work is clearly needed to investigate the impact of these genomic variants on these proposed disease states in ethnically diverse populations.
While the DNA methylation landscape of this region has been well characterized and strong ASM associations with rs174537 have been identified in this limited study cohort, the methylation status of additional CpG sites may strongly influence circulating and tissue PUFA levels. A limitation of this work is that while it provided important new data regarding the methylation status of key CpG sites within the 12kb region of the FADS cluster, there may be numerous other regulatory mechanisms that impact LC-PUFA biosynthesis. Another limitation is due to the relatively small number of liver samples; it was not possible to power the study to evaluate the impact of the methylation of these CpG sites on PUFA levels. However, this work points out specific CpG sites where the methylation status is likely to affect LC-PUFA biosynthesis, and thus provides important knowledge to the field regarding which CpG sites to examine in future studies. Furthermore, it is encouraging that the CpG site (chr11:61594865) we identified in the FADS2 promoter region using ASM, for the European American population, was also identified by Hoile et al. [42] from whole blood samples, suggesting that there is overlap between DNA methylation in liver tissue and circulating blood cells for these specific CpG sites.

Conclusions
In conclusion, this in-depth characterization of the region between FADS1 and FADS2 genes revealed eight key CpG sites that were associated with genotype at rs174537. These data validate our previous observation of ASM between rs174537 and the methylation status of cg27386326 located in the putative enhancer region. Additionally, seven new key CpG sites were identified, based on FDR significance. Taken together, these data show a significant association between the genomic region tagged by rs174537 and DNA methylation in the putative enhancer region. Further investigation is needed to determine if altering the DNA methylation landscape at this region can influence FADS1 and FADS2 gene expression levels, ultimately impacting metabolic conversion capacities and the overall synthesis of LC-PUFAs in humans. In addition, it is unknown if alterations in the DNA methylation landscape will preferentially benefit individuals with certain genotypes (e.g. GG vs. TT). Needless to say, this study highlights the importance of genetic and epigenetic factors that may strongly impact the capacity of tissues such as the liver to synthesize LC-PUFAs.
Supporting information S1 Table. ASM results stratified by race and meta-analyses for all 136 CpG sites. The estimates, standard errors, raw p-values and FDR p-values are reported for each CpG site. The FADS1 promoter (green), putative enhancer (yellow) and FADS2 promoter (blue) regions are highlighted, using the same color scheme as in the manuscript. The name of each CpG is based on its genomic position on chromosome 11 (0-based format), based on genome build GRCh37/hg19; cg IDs are provided when available. (XLSX) S1 File. DNA methylation data from study cohort. All relevant data pertaining to this study, including demographics, clinical variables, and CpG sites, have been provided in the form of a supplemental excel file. All data is de-identified and a data dictionary has been provided within the file.