Fetal DNA Methylation Associates with Early Spontaneous Preterm Birth and Gestational Age

Spontaneous preterm birth (PTB, <37 weeks gestation) is a major public health concern, and children born preterm have a higher risk of morbidity and mortality throughout their lives. Recent studies suggest that fetal DNA methylation of several genes varies across a range of gestational ages (GA), but it is not yet clear whether fetal epigenetic changes associate with PTB. The objective of this study is to interrogate methylation patterns across the genome in fetal leukocyte DNA from African Americans with early PTB (24 1/7–34 0/7 weeks; N = 22) or term births (39 0/7–40 6/7 weeks; N = 28) and to evaluate the association of each CpG site with PTB and GA. DNA methylation was assessed across the genome with the HumanMethylation450 BeadChip. For each individual sample and CpG site, the proportion of DNA methylation was estimated. The associations between methylation and PTB or GA were evaluated by fitting a separate linear model for each CpG site, adjusting for relevant covariates. Overall, 29 CpG sites associated with PTB (FDR<.05; 5.7×10^−10<p<2.9×10^−6) independent of GA. Also, 9637 sites associated with GA (FDR<.05; 9.5×10^−16<p<1.0×10^−3), with 61.8% decreasing in methylation with shorter GA. GA-associated CpG sites were depleted in the CpG islands of their respective genes (p<2.2×10^−16). Gene set enrichment analysis (GSEA) supported enrichment of GA-associated CpG sites in genes that play a role in embryonic development as well as the extracellular matrix. Additionally, this study replicated the association of several CpG sites associated with gestational age in other studies (CRHBP, PIK3CD and AVP). Dramatic differences in fetal DNA methylation are evident in fetuses born preterm versus at term, and the patterns established at birth may provide insight into the long-term consequences associated with PTB.


Introduction
Despite advances in health care, the rate of preterm birth (PTB; birth before 37 weeks of gestation) has been increasing for the last 25 years [1]. Specifically, children born preterm are more likely to be hospitalized, to have diminished cognitive performance, and to develop behavioral problems such as ADHD during childhood [2,3]. Along these lines, many adult-onset diseases have been linked to adverse intrauterine conditions or adverse pregnancy outcomes [4,5]. Thus, PTB imparts not only a difficult start but also considerable challenges throughout life [1,6]. Spontaneous preterm birth (PTB), which occurs without indications, is common and contributes to significant neonatal morbidity and mortality over time [7].
Several epidemiologic, behavioral and biological factors (e.g. race, socioeconomic status, malnutrition, smoking, and infection) have been associated with PTB, but the mechanistic pathways that link these risk factors to PTB are still unclear [8,9,10]. The field of epigenetics has the potential to provide a greater understanding of the pathways that contribute to or result from PTB [11]. Indeed, specific risk factors may promote epigenetic changes that result in PTB or that predispose a neonate to adult-onset diseases. Although epigenetic differences associate with many prenatal exposures and complex traits, published studies that evaluate maternal and fetal epigenetic changes during pregnancy, their influence on pregnancy outcome, and fetal programming of adult-onset diseases are limited [12,13]. The study of epigenetic patterns during early development is likely to provide more information about environmental and behavioral influences on long-term outcomes than the study of individuals later in life. In time, such studies may suggest biomarkers for developmental outcomes.
DNA methylation is an epigenetic modification required for proper gene regulation and cellular differentiation during fetal development [14,15]. Over the first years of life, DNA methylation of many genes appears to be relatively stable [16,17]. Therefore, DNA methylation patterns of certain genes established at birth may result in a developmental trajectory with long-term consequences. We have previously shown that DNA methylation of certain genes associates with gestational age (GA) in term deliveries [18], and evidence suggests that DNA methylation differences in key genes may provide insight into biological pathways that underlie PTB. The primary objective of this study is to interrogate methylation patterns across the genome in DNA derived from umbilical cord blood leukocytes of a high risk African American cohort and to evaluate the association of each CpG site with PTB and GA.

Methods
This study was approved by the Institutional Review Boards of Centennial Women's Hospital, Western Institutional Review Board and the University of Texas Medical Branch.

Subjects and Sample Collection
The Nashville Birth Cohort (NBC) was established to examine genetic risk factors and changes in the biochemical pathways that distinguish spontaneous preterm from term labor. All subjects were recruited at Centennial Women's Hospital and the Perinatal Research Center in Nashville, TN beginning in 2003. Pregnant women were enrolled during their first clinical visit after obtaining informed consent. Maternal demographic and clinical data were recorded from medical records or by interviews during the consenting process. Demographic and clinical data specific to the fetus were collected from clinical records. Gestational age of the neonate was determined by maternal reporting of the last menstrual period and corroboration by ultrasound dating. Race was identified by self-report tracing back three generations on both the maternal and paternal sides of the fetus. Only African Americans of non-Hispanic ethnicity were included in this study.
Subjects were included in this study if they had contractions (rate of 2 contractions/10 minutes) leading to delivery either preterm or at term. Cases were delivered preterm with intact membranes between 24 1/7 weeks and 34 0/7 weeks. Controls were delivered at term (>39 0/7 weeks) with spontaneous term labor and delivery and no current or past pregnancy-related complications, including PTB and preterm or prelabor rupture of the membranes (pPROM). Subjects who had multiple gestations, preeclampsia, placenta previa, fetal anomalies, and/or medical or surgical complications during pregnancy were excluded from the study. Subjects who had any surgical procedures during pregnancy, or who were treated for preterm labor or for suspected intra-amniotic infection and delivered at term, were excluded from the control group. Maternal demographic and clinical data were collected from medical records or through self-report at the time of consent.
Race, socioeconomic factors (education, household income, marital status, and insurance status), and behavioral factors (cigarette smoking) were documented by maternal self-report. Intra-amniotic infection was determined by amniotic fluid culture or by PCR for 16S ribosomal RNA. In cases where culture or PCR data were not available, infection was assessed by the presence of four of the following clinical or histologic signs: high fever (>102°C), high CRP (>0.8 U/ml), abdominal tenderness, fetal tachycardia, mucopurulent vaginal discharge, histologic chorioamnionitis, or funisitis.

Biological Sample Collection and DNA Extraction
Umbilical cord blood samples were collected in EDTA tubes soon after placental delivery. Blood samples were centrifuged at 3,000 RPM to separate plasma, and buffy coats were aliquoted and stored at −80°C. DNA was extracted using the Autopure automated system (Gentra Systems, Minneapolis, MN).

DNA Methylation Analysis
For each subject, >485,000 CpG sites across the genome were interrogated using the HumanMethylation450 BeadChip (Illumina, San Diego, CA) [19,20]. Briefly, 1 µg of DNA was converted with sodium bisulfite, amplified, fragmented, and hybridized on the BeadChip according to the manufacturer's instructions. CpGassoc [21] was used to perform quality control and calculate β values. Data points with probe detection p-values >.001 were set to missing, and CpG sites with missing data for >10% of samples were excluded from analysis; 483,830 CpG sites passed these criteria. Samples with probe detection call rates <90% and those with an average intensity value of either <50% of the experiment-wide sample mean or <2,000 arbitrary units (AU) were excluded from further analysis. One sample of male DNA was included on each BeadChip as a technical control throughout the experiment and assessed for reproducibility, confirming a Pearson correlation coefficient >0.99 for all pairwise comparisons of technical replicates. For each individual sample and CpG site, the signals from methylated (M) and unmethylated (U) bead types were used to calculate a beta value as β = M/(U+M).
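The probe-level steps described above can be illustrated with a minimal Python sketch. It is not the CpGassoc implementation; the function names, the dictionary layout, and the CpG identifiers `cg_a` and `cg_b` are hypothetical and chosen only for illustration. It computes β = M/(U+M), sets data points with detection p-values above .001 to missing, and drops CpG sites with more than 10% missing data.

```python
def beta_value(meth, unmeth):
    """Beta = M / (U + M); returns None when both signals are zero."""
    total = meth + unmeth
    return meth / total if total > 0 else None


def qc_betas(meth, unmeth, det_p, p_cut=0.001, max_missing=0.10):
    """Apply the probe-level QC rules to per-CpG signal lists.

    meth, unmeth, det_p: dict mapping CpG id -> list of per-sample values.
    Returns a dict CpG id -> list of beta values (None = missing) for the
    CpG sites whose fraction of missing samples is at most max_missing.
    """
    kept = {}
    for cpg in meth:
        betas = [
            None if p > p_cut else beta_value(m, u)
            for m, u, p in zip(meth[cpg], unmeth[cpg], det_p[cpg])
        ]
        missing = sum(b is None for b in betas) / len(betas)
        if missing <= max_missing:
            kept[cpg] = betas
    return kept


# Hypothetical signals for two CpG sites across four samples.
meth = {"cg_a": [3000, 1000, 500, 4000], "cg_b": [2000, 2000, 2000, 2000]}
unmeth = {"cg_a": [1000, 3000, 500, 1000], "cg_b": [2000, 2000, 2000, 2000]}
det_p = {"cg_a": [0.0, 0.0, 0.0, 0.0], "cg_b": [0.0, 0.05, 0.0, 0.0]}

kept = qc_betas(meth, unmeth, det_p)
```

In this toy input, `cg_b` has one of four data points set to missing (25%), so it is excluded, while `cg_a` is retained with its four β values.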

Statistical Analysis
We used MethLAB [22] to test for association with PTB via linear regressions that modeled β values as the outcome and PTB as the independent variable, adjusting for GA, gender, chip, and row on the chip. Based on previous reports and their potential contribution to PTB, we examined birth weight percentile, gravidity, parity, infection and smoking as potential confounding factors in our analysis; these factors did not associate with methylation of any CpG site after adjustment for multiple testing (FDR<.05; data not shown). Birth weight percentile was based on estimated gestational age (GA) in accordance with the United States national registry [23]. We subsequently used MethLAB to fit similar linear regressions that modeled GA as the independent variable, adjusting for gender, chip, and row on the chip. Because it has been suggested that logit-transformed β values (a.k.a. M values) may perform better in statistical analyses [24], we also examined associations with M values using the strategy described above. Because there was no significant difference between the results, we present results based on untransformed β to ease biological interpretation.
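As an illustration of the per-CpG model described above, the following Python sketch fits an ordinary least squares regression for a single CpG site and returns the coefficient and t-statistic for PTB. It is not MethLAB code: the helper names, the toy data, and the reduced covariate set (intercept, PTB, GA, sex; chip and row are omitted for brevity) are assumptions made for the example. The logit transform is written as log2(β/(1−β)), one common definition of the M value.

```python
import math


def logit_m_value(beta, eps=1e-6):
    """Logit-transform a beta value into an M value (log2 odds of methylation)."""
    b = min(max(beta, eps), 1 - eps)  # clamp to avoid log of zero
    return math.log2(b / (1 - b))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def ols_t(y, X, j):
    """Fit y = X b by least squares; return (b[j], t-statistic of b[j])."""
    n, p = len(y), len(X[0])
    XtX = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    b = solve(XtX, Xty)
    rss = sum((y[i] - sum(X[i][a] * b[a] for a in range(p))) ** 2
              for i in range(n))
    sigma2 = rss / (n - p)  # residual variance
    # j-th diagonal element of (X'X)^-1: solve X'X v = e_j and take v[j]
    e_j = [1.0 if a == j else 0.0 for a in range(p)]
    v = solve(XtX, e_j)
    return b[j], b[j] / math.sqrt(sigma2 * v[j])


# Hypothetical toy data for one CpG site (8 samples); not values from the study.
ptb = [1, 1, 1, 1, 0, 0, 0, 0]
ga = [26.0, 29.0, 31.0, 33.0, 39.0, 39.5, 40.0, 40.5]
sex = [0, 1, 0, 1, 0, 1, 0, 1]
noise = [0.001, -0.001, -0.001, 0.001, 0.001, -0.001, -0.001, 0.001]
beta = [0.30 + 0.10 * p + 0.002 * g + e for p, g, e in zip(ptb, ga, noise)]

X = [[1.0, p, g, s] for p, g, s in zip(ptb, ga, sex)]  # intercept, PTB, GA, sex
coef_ptb, t_ptb = ols_t(beta, X, j=1)
m_values = [logit_m_value(b) for b in beta]  # alternative outcome for the same model
```

In a genome-wide analysis the same fit would be repeated for each CpG site, and the model with GA as the independent variable would simply omit the PTB column.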
The location of each CpG site was determined using the Illumina array annotation for the HumanMethylation450 BeadChip based on build 37 of the human genome. We tested for enrichment among GA-associated sites by comparing the number of GA-associated CpG sites that did or did not occur in a particular gene region (e.g. promoter, 5′UTR, body, 1st exon, 3′UTR, or intragenic regions) to the number of non-GA-associated sites that did or did not occur in that gene region, using Fisher's exact test. We then performed similar tests of enrichment for CpG-rich regions defined as islands or CpG-poor regions defined as shores [25,26]. CpG sites with 1000 Genomes Project variants physically contained within the Illumina probe were noted in the analyses but not excluded a priori. In addition, we examined whether significant GA-associated CpG sites were enriched or depleted on the X chromosome using Fisher's exact test.
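The enrichment comparison described above amounts to a Fisher's exact test on a 2×2 table (GA-associated or not, by in-region or not). The Python sketch below is one way to compute it with the standard library only; the function names are hypothetical, and the counts are approximations back-calculated from the percentages reported for CpG islands (14.9% of 9637 GA-associated sites versus 31.3% of the remaining sites among the 483,830 that passed quality control), not counts taken from the study's tables. The p-value is returned on the log10 scale to avoid floating-point underflow.

```python
import math


def log_comb(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)


def log_hypergeom_pmf(k, K, n, N):
    """log P(X = k) when drawing n items from N, of which K are 'in region'."""
    return log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the table [[a, b], [c, d]].

    Returns (odds_ratio, log10_p). The p-value sums the probabilities of all
    tables with the same margins that are no more likely than the observed one.
    """
    N = a + b + c + d
    K = a + c  # column total: sites in the region
    n = a + b  # row total: GA-associated sites
    lo, hi = max(0, n - (N - K)), min(n, K)
    log_obs = log_hypergeom_pmf(a, K, n, N)
    all_logs = [log_hypergeom_pmf(k, K, n, N) for k in range(lo, hi + 1)]
    logs = [lp for lp in all_logs if lp <= log_obs + 1e-7]
    top = max(logs)
    log_p = top + math.log(sum(math.exp(lp - top) for lp in logs))
    odds = (a * d) / (b * c) if b * c else math.inf
    return odds, min(log_p, 0.0) / math.log(10)


# Approximate counts reconstructed from reported percentages (assumption).
ga_in_island, ga_total = 1436, 9637            # about 14.9%
other_in_island, other_total = 148422, 474193  # about 31.3%

odds, log10_p = fisher_exact_two_sided(
    ga_in_island, ga_total - ga_in_island,
    other_in_island, other_total - other_in_island,
)
```

An odds ratio below 1 indicates depletion of GA-associated sites in the region, and a value above 1 indicates enrichment.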
We used GSEAPrerank [27,28] to evaluate whether GA-associated CpG sites were located in genes that were enriched for specific biological processes and cellular components. Significance of the gene ontology enrichment was corrected for an FDR<.05 following 1000 permutations.
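The per-CpG association tests in this section are thresholded at FDR<.05, but the text does not name the FDR procedure. The sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice for array-wide testing, purely as an illustration; the function name and the example p-values are hypothetical. It does not reproduce the permutation-based FDR used by GSEAPrerank.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five CpG sites.
pvals = [0.001, 0.008, 0.039, 0.041, 0.9]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
```

With these inputs only the first two sites remain below the 0.05 threshold after adjustment.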

Results
The cohort, described in Table 1, consists of African American preterm (GA range 24.1-34.0 weeks) and term (39.0-40.9 weeks) births. Though the groups differed by GA and birthweight, they did not differ significantly in demographic or clinical factors.

Preterm Birth (PTB)
After accounting for multiple comparisons (FDR<.05) and confounding factors (gender, gestational age, and chip effects), 29 CpG sites associated with PTB independently of GA (Figure 1A; Table 2; 5.7×10^−10<p<2.9×10^−6; −.17<Δβ<.26). Based on annotation with data from the 1000 Genomes Project, 5 of these 29 CpG probes (17.2%) contain a SNP (estimated average minor allele frequency of 15.5%), suggesting that we could be observing a genetic rather than an epigenetic association for these 5 CpG sites; the methylated and unmethylated signals for these five sites are shown in Figure S1. In some cases, the pattern appears consistent with SNP-induced methylation differences, while in other cases there is no strong pattern of clustering. Results were not significantly altered by adjustment for maternal smoking, infection, birth weight percentile, or gravidity (data not shown), nor were they altered by logit-transformation of the beta values. Among the CpG sites associated with PTB, we observed increased DNA methylation of a site (cg13250001) in GSK3B (glycogen synthase kinase 3 beta; p = 1.7×10^−6; Δβ = −.06) and decreased methylation of a CpG site (cg25376491) in MAML1 (mastermind-like 1; p = 1.8×10^−6; Δβ = .14) in fetuses with PTB. In addition, 3 other CpG sites in GSK3B and 4 in MAML1 were nominally associated with PTB (p<.05).

Gestational Age
Our analyses of PTB above all included GA as a covariate because PTB and GA are by definition correlated (r = .93), and there is overwhelming agreement between the associations of DNA methylation with PTB (unadjusted for GA) and with GA itself (Figure S2).
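The agreement between the two sets of association results can be summarized by a Pearson correlation between per-CpG t-statistics, with the sign of the PTB statistics reversed so that both analyses point in the same direction (as described for Figure S2). The Python sketch below illustrates that calculation; the function name and the t-statistics are hypothetical and are not values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical t-statistics for the same five CpG sites in each analysis.
t_ptb = [-4.0, 2.5, -1.0, 0.5, 3.0]  # PTB (unadjusted for GA)
t_ga = [3.6, -2.0, 1.2, -0.1, -3.3]  # GA

# Reverse the sign of the PTB statistics before comparing with GA.
r = pearson_r([-t for t in t_ptb], t_ga)
```

The same function could be applied to pairs of technical replicates to check reproducibility across BeadChips.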
In fact, 9637 CpG sites associated with GA independent of gender and chip effects (FDR<.05; 9.5×10^−16<p<1.0×10^−3; −.024<Δβ per week<.023; Figure 1B; Table S1). GA-associated CpG sites were depleted in the promoter, first exon and 3′UTR regions and enriched in the 5′UTR, gene body and intragenic regions (2.2×10^−16<p<2.6×10^−3; Table 3) when compared to CpG sites that were not associated with GA via Fisher's exact test. Associated CpG sites were also depleted in CpG islands (14.9% vs. 31.3%; p<2.2×10^−16) and enriched in CpG shores (34.1% vs. 22.8%; p<2.2×10^−16). Examining the directionality of GA-associated CpG sites, 61.8% (5958 CpG sites) had lower methylation in subjects with lower GA; these CpG sites were twice as likely to be located in CpG islands (p<2.2×10^−16; Table 3) and less likely to occur in the gene body (p<2.2×10^−16) and 3′UTR (p = 1.5×10^−9). While the sample size was not sufficient to look for sex-specific differences (i.e. interactions between age and sex), we did note a depletion of GA-associated CpG sites on the X chromosome (.5% vs. 2.4%; p<2.2×10^−16); the depletion of GA-associated sites both in CpG islands and on the X chromosome is consistent with a previous report of age-associated methylation in children [29].
Gene set enrichment analysis (GSEA) was used to gain further insight into the functional context of GA-associated CpG sites (FDR<.05; Table 4). Prominent biological processes that were enriched in GA-associated CpG sites were related to embryonic development. For example, 9 sites in the 5′UTR and body of histone deacetylase 4 (HDAC4; 1.3×10^−11<p<9.8×10^−4; Δβ per week between −.0023 and −.01) have higher methylation levels in fetuses with lower GA. HDAC4 is involved in numerous identified pathways including system development, multicellular organismal development, anatomical structure development, organ development, and nervous system development. Several other CpG sites involved in epigenetic regulation during development were also identified. Specifically, CpG sites in the gene body of … (3.6<t<5.3; .0040<Δβ per week<.0053) and the 5′UTR of TET1 (tet methylcytosine dioxygenase 1; 1.5×10^−7<p<2.7×10^−4; 4.0<t<6.4; .0046<Δβ per week<.01) also associate with GA (Table S1).
Among the enriched cellular components are several groups that relate to extracellular regions. Remodeling of the extracellular matrix is required to support pregnancy and parturition [30], and increased attention has recently been focused on the role of matrix metallopeptidases (MMPs) and tissue inhibitors of metalloproteinases (TIMPs) in preterm birth [31]. In this study, 4 CpG sites in the promoter of MMP9 (5.6×10^−7<p<3.2×10^−4; 4.0<t<6.0; .0021<Δβ per week<.0033) had higher methylation with increasing gestational age. MMP9 is involved in the breakdown of the extracellular matrix in the process of cervical ripening, and increased expression has been seen in pPROM compared to preterm birth with intact membranes [32]. Further, 1 CpG site in the gene body of the MMP9 inhibitor, TIMP2, also associates with GA (p = 1.4×10^−5; t = −5.0; Δβ per week = −.0053).
To complement our discovery approach, we evaluated the association between CpG sites in genes that had been associated with GA in a previous study that used a less dense array with 27,578 CpG sites [18] (… decreasing GA (t = −4.49; p = 6.5×10^−5; Δβ per week = .01). CRHBP regulates corticotrophin-releasing hormone (CRH), a principal regulator of the hypothalamic-pituitary-adrenal (HPA) axis. In addition, methylation increased in a CpG site in the promoter of PIK3CD (phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit delta) with decreasing GA (p = 2.4×10^−8; t = −7.0; Δβ per week = −.0062). PIK3CD inhibitors are currently being explored for their therapeutic value as anti-inflammatory drugs [33]. One limitation of this strategy is that GA and PTB represent correlated but etiologically distinct phenotypes. Thus, replicating associations observed with GA may not capture the same breadth of candidate genes that could be explored in a study focused on PTB. For example, IGFBP1 has been considered as a marker for preterm birth in vaginal infection and leaking amniotic fluid [34]. We observed associations between GA and 6 CpG sites located in the gene body of insulin-like growth factor 2 mRNA binding protein 1 (IGF2BP1; 2.1×10^−12<p<1.9×10^−4; −4.1<t<10.2; −.0087<Δβ per week<.02), though the direction of the association changed based on proximity of the CpG site to the CpG island.

Discussion
By examining DNA methylation across the genome, we identified 29 CpG sites that associated with PTB independently of GA in leukocyte DNA from high-risk African American fetuses. Among these are CpG sites in GSK3B (glycogen synthase kinase 3 beta), which is involved in neuronal migration, development, and polarization, particularly during early embryonic development [35,36]. Interestingly, GSK3B is a negative regulator of MAML1 (mastermind-like 1) [37], a component of the Notch pathway [38,39], and a CpG site in MAML1 also associated with PTB. GSK3B decreases transcription in the notch pathway through inhibition of MAML1 [37]. Consistent with the role of GSK3B in regulating MAML1, there was an inverse relationship in the associations for the CpG sites in these genes. During development, the Notch pathway is integral to several developmental processes including neurogenesis, cardiovascular function, angiogenesis as well as intestinal and bone development [40].
Additionally, 9637 CpG sites associated with GA when it was modeled separately from PTB. Our analyses suggest enrichment of GA-associated CpG sites in biological processes involved not only in embryonic and organ development but also in neurogenesis, nervous system development and neuron development. These processes involve extensive epigenetic regulation, so it is not surprising that we observed associations with CpG sites in genes related to shaping epigenetic patterns during development: HDAC4, DNMT1, DNMT3A, DNMT3B, and TET1. For example, CpG sites in TET1 and DNMT3B have lower DNA methylation in subjects with shorter GA. TET1 functions to hydroxylate 5-methylcytosine (mC) into 5-hydroxymethylcytosine (hmC) [41]. TET1 has been implicated in normal embryogenesis, and the depletion of TET1 leads to low birth weight (LBW) in mouse pups [42]. TET1 promotes active demethylation while DNMT3B promotes de novo methylation; these two processes are highly involved in the establishment of tissue-specific DNA methylation patterns during development [41,43]. Though these results are indicative of the developmental time sampled (i.e. 32 versus 38 weeks), they may also support the hypothesis of epigenetic programming during fetal development [44].
The cellular components most enriched for genes with GA-associated CpG sites were primarily related to the extracellular region. Genes such as MMP9 and TIMP2 are integral to the process of parturition [45]. MMP9 has previously been considered as a biomarker for preterm birth [46] and has been thought to play a role in premature rupture of the membranes (PROM) because of its role in the degradation of the amniochorion basement membranes [47]. MMP9 levels are higher following PROM when compared to term deliveries, while TIMP2 levels decrease. DNA methylation differences in these and other genes related to extracellular matrix function support further study of the role of the fetal extracellular matrix throughout pregnancy and during parturition.
Many studies of fetal programming or prenatal exposures focus on fetuses with intrauterine growth restrictions or that were small for gestational age. Recent studies in the field support associations between GA and both DNA methylation and gene expression differences, but note lesser or no associations with birth weight [18,48]. Similarly, in this study we identified numerous associations between DNA methylation and PTB, which is measured by GA, but no associations with percentile birth weight. Based on this, Stunkel and colleagues hypothesize that birth weight may be a less appropriate measure of adverse outcomes than GA [48]. Along these lines, we identified associations between GA and DNA methylation of CpG sites in insulin-like growth factor 2 mRNA binding protein 1 (IGF2BP1), a developmentally regulated gene that binds IGF2 and has been a focus of the fetal programming literature [49]. DNA methylation in IGF2 has been linked to various pregnancy-related conditions including birth weight [50]. IGFBP proteins are secreted from the placenta, decidua and fetal membranes in increasing amounts across gestation and are abundant in amniotic fluid [51]. Detection of IGFBP-1 in cervical-vaginal secretions is reliably used to detect preterm premature rupture of the membranes, which precedes 40% of spontaneous PTB cases [52,53]. However, we were not able to identify PTB-associated DNA methylation differences.
Figure 1. doi:10.1371/journal.pone.0067489.g001
Our results were consistent with previous studies of DNA methylation and gestational age. Despite differences between cohorts and study design, we replicated >80% of CpG sites associated with GA in a previous study [18], further supporting the role of these genes in embryonic development and parturition. For example, CpG sites in CRHBP associated with GA. CRHBP binds CRH, limiting its activity, and changes in the relative ratios of CRH to CRHBP associate with timing of birth [54,55]. Prior to parturition, CRHBP levels decrease while CRH levels increase, facilitating labor in both term and preterm deliveries [56]. In women who deliver preterm there is a decrease in plasma levels of CRHBP compared to women who deliver at term [57].
The goal of this study was to identify associations between DNA methylation and PTB. However, PTB is defined by GA at birth; thus, the differences observed may correspond to differences in the developmental stage rather than to the causes or consequences of PTB. In this study, the correlation between association tests for PTB and GA is strong (r = .93; Figure S2), and delineation of these factors is complex, particularly in a study with a relatively small sample size. Thus, larger studies will be required to identify DNA methylation differences exclusive to PTB. Future studies of methylation as a risk factor for PTB should also focus on maternal methylation during pregnancy; a prospective study design could avoid confounding due to differences in GA by sampling at standardized time points, and could allow comparisons between maternal and fetal methylation changes. However, even with our relatively small sample of fetal cord blood DNA, we were able to identify robust associations using a stringent phenotype definition that compared samples from early preterm and later term deliveries in a high-risk cohort; in general, African American women are 3-4 times more likely than Caucasian women to deliver in the early preterm period [7]. Another limitation is the use of whole umbilical cord blood. While an ideal design would examine DNA methylation in a single cell type, this approach and our results were consistent with previous studies [18,58]. Still, our results support the idea that epigenetic differences exist in fetuses born at different gestational ages. Recent studies suggest that DNA methylation patterns in many genes may be relatively stable over the first two years of life [16,17], and further studies will be necessary to determine whether persisting differences in DNA methylation may underlie the physiological correlates of PTB.

Table 3. Enrichment analysis examining whether GA-associated CpG sites are enriched in certain regions, and whether associated CpG sites are enriched for a particular direction of the t-statistic. Column headers: GA-associated; Not GA-associated; p-value; (+) GA-associated; (−).

Supporting Information
Figure S1 Scatter plots of the unmethylated vs. methylated signals (A versus B) for the five PTB-associated CpG sites that have 1000 Genomes SNPs within the probe. (TIF)
Figure S2 Correlation between the t-statistics depicting association analysis of CpG sites with PTB (x-axis) compared to GA (y-axis). All CpG sites are depicted whether or not they were associated with the outcome. In order to compare the results from analyses of PTB and GA more directly, we reversed the sign of the t-statistics for PTB in this plot.