Maternal Smoking during Pregnancy and DNA-Methylation in Children at Age 5.5 Years: Epigenome-Wide-Analysis in the European Childhood Obesity Project (CHOP)-Study

Mounting evidence links prenatal exposure to maternal tobacco smoking with disruption of DNA methylation (DNAm) profile in the blood of infants. However, data on the postnatal stability of such DNAm signatures in childhood, as assessed by Epigenome Wide Association Studies (EWAS), are scarce. Objectives of this study were to investigate DNAm signatures associated with in utero tobacco smoke exposure beyond the 12th week of gestation in whole blood of children at age 5.5 years, to replicate previous findings in young European and American children and to assess their biological role by exploring databases and enrichment analysis. DNA methylation was measured in blood of 366 children of the multicentre European Childhood Obesity Project Study using the Illumina Infinium HM450 Beadchip (HM450K). An EWAS was conducted using linear regression of methylation values at each CpG site against in utero smoke exposure, adjusted for study characteristics, biological and technical effects. Methylation levels at five HM450K probes in MYO1G (cg12803068, cg22132788, cg19089201), CNTNAP2 (cg25949550), and FRMD4A (cg11813497) showed differential methylation that reached epigenome-wide significance according to the false-discovery-rate (FDR) criteria (q-value<0.05). Whereas cg25949550 showed decreased methylation (-2% DNAm ß-value), increased methylation was observed for the other probes (9%: cg12803068; 5%: cg22132788; 4%: cg19089201 and 4%: cg11813497) in exposed relative to non-exposed subjects. This study thus replicates previous findings in children ages 3 to 5, 7 and 17 and confirms the postnatal stability of MYO1G, CNTNAP2 and FRMD4A differential methylation. The role of this differential methylation in mediating childhood phenotypes, previously associated with maternal smoking, requires further investigation.


Introduction
There is growing evidence from both candidate gene [1] and Epigenome Wide Association Studies (EWAS) that prenatal exposure to maternal tobacco smoking substantially alters DNA methylation (DNAm) in newborn blood [2][3][4][5][6]. Evidence also exists for effects of dosage, tissue and timing of exposure. Importantly, several of these studies have reported similar findings, with changes in methylation consistently observed at AHRR, CNTNAP2, CYP1A1, GFI1 and MYO1G at birth.
Recently, the postnatal stability of maternal smoking-associated differential methylation has been investigated. An EWAS using the Illumina Infinium HumanMethylation450 BeadChip (HM450K), conducted in 12- to 18-year-old offspring of mothers who smoked during pregnancy [6], found persistence of differentially methylated regions (DMRs) in MYO1G and CNTNAP2. A similar study using longitudinal blood samples collected at birth, at age 7 and at age 17 years [7] demonstrated that some smoking-associated DMRs are time-stable postnatally (AHRR, MYO1G, CYP1A1, CNTNAP2), whereas others are not (ATP9A, GFI1, KLF13). Importantly, both the duration and the intensity (cigarettes/day) of exposure appear to influence the observed effect. Despite these findings, no equivalent EWAS has yet been carried out in blood of children younger than 7, except a very recent study in 3-5 year old US children [8] that used EWAS data to replicate 26 differentially methylated CpG sites previously reported in newborns [2].
In order to further explore the postnatal stability of maternal smoking-induced epigenetic change in early childhood, we carried out an EWAS in 366 children at 5.5 years of age as part of the multicentre European Childhood Obesity Project (CHOP) Study [9]. In addition, we explored biological databases and performed enrichment analysis to gain biological insight into the role of MYO1G, CNTNAP2 and FRMD4A.

Results
Overall 5 CpG sites in genes MYO1G (three sites), CNTNAP2 and FRMD4A showed epigenome-wide significance according to the FDR criteria (Table 1).
EWAS significance and effect size of the 431 313 analysed CpG sites are also illustrated by a volcano plot and a QQ-plot (Fig 1).
The location of differentially methylated CpG sites in MYO1G, CNTNAP2 and FRMD4A is depicted in Fig 2. The three significant differentially methylated CpG sites in MYO1G were situated at the 3' gene region, in a CpG island (cg22132788, cg19089201) and the 2kb region downstream of this CpG island, i.e. in an S shore (cg12803068) (Fig 2A). The significantly differentially methylated CpG in the CNTNAP2 gene (cg25949550) was situated in an S shore, close to the transcription start site (TSS) (Fig 2B). One significantly differentially methylated CpG site in FRMD4A (cg11813497) was located downstream of the TSS (Fig 2C). Amongst the top 25 differentially methylated CpG sites in association with maternal smoking were other previously reported CpGs in the MYO1G (cg04180046) and FRMD4A (cg15507334) genes (Table 1), but these did not survive correction for multiple testing.
CpGs showing significant differential methylation in association with maternal smoking after week 12 of gestation generally showed positive associations: MYO1G, +9% at cg12803068, +5% at cg22132788 and +4% at cg19089201; FRMD4A, +4% at cg11813497. In addition, cg04180046 in the 3' region of MYO1G showed a +5% difference and cg15507334 in the 5' region of FRMD4A a +4% difference in mean CpG methylation between exposed and unexposed children, but neither reached FDR significance. In contrast, cg25949550 in the S shore of the CNTNAP2 gene showed a difference of -2% in mean DNA methylation between exposed and unexposed children (Table 1).
Direction and effect sizes of differential methylation in the MYO1G and CNTNAP2 genes found in the CHOP cohort (5.5 years) were quite similar to those from other EWAS based on Illumina 450K arrays from the SEED (5 years), ALSPAC (7.5 and 17.7 years) and SYS (15.6 years) cohorts (S1 Table) [6][7][8].
Table 1 notes. Assembly Build 37. DNAm ß-value difference is the adjusted mean difference in ß-values between in utero tobacco exposed and not exposed children for the 25 CpG sites identified by the M-value based EWAS regression models described below; e.g. 0.09 means that the difference in the proportion of DNA methylation between exposed and unexposed children is 9%. P-value: uncorrected p-value obtained by standard linear regression of M-values of DNA methylation at the respective CpG site (outcome) on indicator coded in utero tobacco exposure by maternal smoking during pregnancy (main predictor), adjusted for sex, age at blood draw, indicator coded country of study centre, maternal education, exposure to parental smoking at age 4 years, the estimated proportions of six major white blood cell (WBC) types (CD4+ T cells, CD8+ T cells, B cells, NK cells, monocytes and granulocytes) and the top 30 principal components derived from control probes.
PANTHER protein classes in this network were >5-fold enriched (Bonferroni-adjusted P-value<0.05) for actin binding motor proteins (PC00040), cell junction proteins (PC00070), actin family cytoskeletal proteins (PC00041) and G-protein modulators (PC00022) (see S7 Table).
Figure caption (partial): ... with prenatal smoking status. The direction of differential methylation is marked with arrows (↑ increase or ↓ decrease).

Discussion
This study aimed to identify associations between in utero exposure to maternal smoking and DNA methylation levels in blood of 366 European children of age 5.5 years. Overall five CpG sites in gene regions MYO1G (cg12803068, cg22132788, cg19089201), CNTNAP2 (cg25949550) and FRMD4A (cg11813497) showed epigenome-wide significance. Interestingly, the CpG sites we identified in the CNTNAP2 and MYO1G genes were the same as those previously reported in newborns [2], 3-5 year old children [8], adolescents at age 12-16 years [6] and longitudinally in newborns, 7 year old children and 17 year old adolescents [7]. Moreover, the effect size and direction of effect were more or less equivalent across these studies, highlighting the reproducibility of findings and postnatal stability of in utero smoking exposure-associated DNA methylation disruption in these genes. The differentially methylated CpG site cg11813497 in the promoter region of FRMD4A in our study has previously been reported in a single EWAS in blood of newborns with the largest proportion of smoking mothers yet examined [3]. However, whereas that study identified 3 other FRMD4A related CpG sites (cg20344448, cg25464840, cg15507334), these were not significantly differentially methylated in our dataset [3]. One of these sites (cg25464840) was also reported in an EWAS of 5-12 year old children [12].
As only one of these previously reported CpG sites in FRMD4A (cg15507334) was among the top 25 ranked p-values in our study, it is interesting to speculate that whereas differential methylation at CpG site cg11813497 may persist throughout childhood, this may not be the case for the other 2 CpG sites in this gene. However, it is equally likely that the reduced power of our study relative to previous studies [3] played a role in our findings.
Several other differentially methylated regions (DMR) associated with maternal smoking during pregnancy in children or adolescents in previous EWAS, e.g. CpGs related to AHRR or CYP1A1 [6,7], did not show significance in our EWAS analyses.

Biological meaning of CpG sites sensitive to maternal smoking during pregnancy
Prenatal tobacco smoke exposure is associated with a wide range of adverse health outcomes in offspring. Children of smoking mothers have an increased risk of speech-processing and attention control deficits [13,14], autism [15], allergy [16], asthma [17], overweight and obesity [18] and nicotine dependency [19]. Interestingly, MYO1G, CNTNAP2 and FRMD4A gene products have previously been associated with many of these outcomes, and differential DNA methylation across their gene loci has been consistently replicated in the blood of newborns or youth exposed to in utero tobacco smoke. However, knowledge of the functional relevance of these DMRs at CNTNAP2, MYO1G and FRMD4A is limited. Therefore, we investigated publicly available epigenome and transcriptome databases and performed functional network and enrichment analysis to gain insights into the potential functional relevance of the observed differential methylation in these genes.
Increased methylation in the 3' MYO1G gene region might correlate with increased MYO1G transcription according to three lines of evidence. First, the DNA methylation and gene expression database MethHC [25] shows that in urogenital cancer the methylation at the 3' gene region of MYO1G positively correlates with transcription, contrary to the promoter/5'UTR region methylation (S2 Fig). Second, Encyclopedia of DNA Elements (ENCODE) [26] DNA methylation data from three blood and endothelial cell lines (GM12878, K562 and HUVEC), measured by Illumina 450K, show that high methylation (>60%) in the 3' MYO1G region and low methylation (<20%) in the 5' gene region correlate with high MYO1G transcription in GM12878 (S1 Fig) [27]. On the contrary, in K562 and HUVEC cell lines where MYO1G transcription is absent [28], methylation at the 3' gene region is low (<20% in HUVEC) while the 5' gene region methylation is high (>60% in K562 and HUVEC) (S1 Fig). Third, methylation in the 3' MYO1G gene body (cg04180046) was recently reported to positively correlate with MYO1G transcription in leukocytes of 144 healthy Chinese adults in an EWAS of smoking [24]. Thus, these data support the observation that, unlike promoter methylation, gene body methylation in MYO1G might correlate with its active transcription [29].
Figure caption (partial): Regional association plots are shown using the gene map (UCSC genome browser, hg19) with a graph of -log10 p-values on the y-axis, the nucleotide position on the x-axis, and the position of selected Illumina Infinium HumanMethylation450 BeadChip probes. Abbreviations: CGI: CpG island; TSS: transcription start site; FDR: false discovery rate p-value.
The 3' MYO1G gene region might have a regulatory role since it overlaps with monomethylated and trimethylated lysine 4 on histone H3 (H3K4) and binding sites for Myc, Max, YY and USF transcriptional factors (S3 Fig). It also binds transcriptional repressors such as EZH2, a histone-lysine N-methyltransferase involved in regulation of histone and DNA methylation [30] and CTCF, an insulator protein that affects mRNA splicing, promoter-enhancer looping and whose binding affinity is modulated by DNA methylation [31].
Exposure to nicotine may change leukocyte numbers and increase leukocyte-endothelial adhesion [32]. This is in line with GO terms in our pathway analysis such as cell migration (GO:0016477), localization of cell (GO:0051674), cell motility (GO:0048870), movement of cell or subcellular component (GO:0006928) and cell adhesion (GO:0007155). Moreover, MYO1G might be implicated in the transmission of smoking effects on the cardiovascular system [33], since this type I unconventional myosin is specifically expressed in the plasma membrane of B and T lymphocytes and mast cells [28,34] and regulates leukocyte adhesion, mobility and phagocytosis [35][36][37].
Sensory perception of sound (GO:0007605) was another overrepresented biological term in our pathway analysis, which might be associated with impaired auditory processing in infants exposed to prenatal smoking [38]. However, this overrepresentation might also be due to the presence of other nonconventional myosins such as MYO1B, MYO1C and MYO1E in our interaction network. Unlike MYO1G, they are also expressed in sensory cells of the inner ear [39] and their mutations have been associated with hearing loss [40].
Based on current knowledge, it would be of interest to understand the mechanisms that keep 3' MYO1G gene region differentially methylated even after the withdrawal of tobacco toxins and what role this plays for phenotypes related to child development.
CNTNAP2. Decreased methylation at CNTNAP2 (cg25949550) has been reported both in offspring exposed to prenatal smoking [2,5-7] and in adults [22,24]. This CpG is located in the S shore, 800 bp downstream of the transcriptional start site, and overlaps with binding sites of several transcriptional repressors such as SIN3A, CTCF, REST and CTBP2 (S4 Fig). Adult smokers also show decreased methylation at nearby cg21322436 [22,24] and cg16254309 [41]. These two CpGs are located in N shores and S shores (0-2kb regions upstream and downstream of CGI, respectively) in the 5' gene region (Fig 2B, S4 Fig). In addition, adult smokers also show increased methylation at cg11207515 and cg1737210, which are located >1 Mb (mega base pairs) downstream of the CNTNAP2 transcriptional start site (Fig 2B) [22][23][24]. These DMRs found in blood of adult smokers have not yet been found in offspring in association with exposure to prenatal tobacco smoke.
Given the evidence from Gene Expression Atlas [42] that tobacco down-regulates CNTNAP2 transcription in cancerous human lung tissue (E-GEOD-13309) and normal human bronchial epithelial cells (E-GEOD-10718) [43,44], we speculate that CNTNAP2 transcription could also be downregulated in individuals exposed to tobacco smoke. Whether this could be mediated through methylation decrease in one CpG and increased recruitment of transcriptional repressors remains to be addressed. This might even be detectable in blood since CNTNAP2 protein expression has also been found in B lymphocytes and endothelial cells [45,46]. CNTNAP2 is a cell adhesion protein whose decreased expression leads to aberrant neuronal migration and was included in some of our overrepresented GO terms such as 'protein localization to juxtaparanode region of axon' (GO:0071205), 'neuron recognition' (GO:0008038) and 'neuron projection development' (GO:0031175). Since CNTNAP2 mutations have been linked with impaired language development, autism spectrum disorder and intellectual disability [47], it is possible that the prenatal smoking-associated methylation in CNTNAP2 and its persistency until at least adolescence [6,7] might represent an additional mechanism contributing to impaired neurodevelopment in children exposed to in utero tobacco smoke.
FRMD4A. Increased methylation in FRMD4A (cg20344448, cg11813497, cg25464840 and cg15507334) has been associated with prenatal smoking in newborns [3,5] and children [12] but not observed in adult smokers. These CpGs are situated in the potential promoter/5'UTR region (Fig 2C), which in neuroblastoma cells binds GATA-2 (S5 Fig), a hematopoietic transcription factor that regulates proliferation of blood cell lineages. Interestingly, binding of GATA-2 to target genes is prevented by DNA methylation during lymphoid differentiation [48]. In vitro experiments suggest that smoking might modulate FRMD4A transcription since FRMD4A expression in human bronchial cells treated with tobacco smoke condensate was upregulated (E-GEOD-14383) [49]. However, no relation between DNA methylation and expression of FRMD4A was found in 526 adults used for replication of EWAS findings related to prenatal smoking exposure in older children [12].
Prenatal exposure to nicotine exerts strong effects on cytoskeleton reorganization and increases expression of cytoskeletal proteins [50]. Our functional network analysis was overrepresented for 2 GO terms related to morphogenesis (GO:0009653 and GO:0032989), both of which included FRMD4A and MYO1G. FRMD4A is a scaffold protein that regulates epithelial polarization and cell-cell junctions [51] and its upregulation increases intercellular adhesion [52]. The influence of prenatal nicotine on cytoskeleton organization is further supported by overrepresented PANTHER Pathways such as Dopamine receptor mediated signalling (P05912) and Nicotine pharmacodynamics (P06587) pathways. Although they did not include FRMD4A, the EPB41L1 erythrocyte cytoskeleton protein that shares the FERM-C protein domain with FRMD4A was found in both of these pathways. Interestingly, EPB41L1 and its related protein EPB41L3 interact with CNTNAP2 in brain where they may anchor this axonal transmembrane protein to the actin cytoskeleton [53].
The relevance of promoter DNA methylation on FRMD4A function in development remains unclear. Since mutations in FRMD4A gene locus associate with nicotine dependence [54], prenatal smoking-induced DNA methylation of FRMD4A that persists until childhood might be an additional mechanism contributing to increased nicotine dependency of offspring exposed to in utero tobacco smoke [19,55].

Strengths and limitations
With the exception of a very recent replication study in 3-5 year old US children [8] our study is the first comprehensive EWAS with the HM450K BeadChip addressing the effect of continued in utero tobacco smoke exposure in children of age 5.5 years from four different European populations (Belgian, German, Italian and Spanish).
This complements previous EWAS that were conducted in British, Norwegian and French-Canadian populations [6,7,12] at different ages from birth to age 17 years. Although our sample size is reasonably large (n = 366), as is the exposure group (n = 58), it is likely that the study was somewhat underpowered to detect smaller effect sizes at all genomic loci likely to be sensitive to methylation disruption in association with maternal smoking in utero. Future meta-analyses across multiple cohorts using the HM450K platform should now be performed to further define the scope of methylation disruption associated with this exposure in pregnancy.

Conclusions
This EWAS showed effects of continued in utero tobacco smoke exposure beyond the 12th week of gestation on the methylation profile of children at age 5.5 years in Belgian, German, Italian and Spanish populations. We replicated previous findings confirming the postnatal stability of methylation disruption at the MYO1G and CNTNAP2 genes in association with in utero exposure to maternal smoking throughout pregnancy and confirmed differential methylation of a CpG site in FRMD4A that was previously found only in newborns.
The consistent replication of these methylated CpG sites and the exploration of biological databases combined with enrichment analysis suggests that their biological function and health-related phenotypes should be further explored.

Study design and analysed population
The study population used is a subset of 366 children out of 543 children aged 5.5 years from study centres in Germany, Belgium, Italy and Spain of the European Childhood Obesity Trial Study (CHOP), registered at clinicaltrials.gov as NCT00338689 (URL: http://clinicaltrials.gov/ct2/show/NCT00338689?term=NCT00338689&rank=1).
Inclusion criteria were availability of blood buffy coats collected at age 5.5, valid exposure data on maternal smoking during pregnancy and information on basic characteristics of mother and offspring (sex, age of blood draw, country of study centre).
Characteristics of the analysed study population are listed in Table 2. Details on this ongoing prospective nutritional intervention trial with n = 1678 enrolled healthy infants around birth (n = 550 higher protein formula, n = 540 lower protein formula, n = 588 breastfed) have been published previously [9,56].

Ethics Statement
The study was conducted according to the principles expressed in the Declaration of Helsinki.

Assessment of child's smoking exposure during pregnancy
In utero tobacco exposure was derived from maternal questionnaires administered during the first 8 weeks after delivery: "Did the child's mother smoke during early pregnancy (up to the 12th week of gestation)?" "Did the child's mother smoke in the further course of pregnancy (beyond the 12th week of gestation)?" Only 18 of 76 mothers who reported smoking in pregnancy ceased after the 12th week of gestation. Given previous findings demonstrating the necessity for prolonged exposure [8,57], results are reported only for mothers who smoked throughout pregnancy. The analysed exposure variable was coded 1 if the child was exposed to maternal smoking beyond the 12th gestational week and 0 if not exposed beyond this point. Exposure was set to missing if a child was exposed only up to the 12th week of gestation.
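The coding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's own code: the function name `code_exposure` and its two boolean arguments are hypothetical stand-ins for the two questionnaire items.

```python
from typing import Optional


def code_exposure(smoked_early: bool, smoked_late: bool) -> Optional[int]:
    """Code in utero exposure from the two questionnaire answers.

    smoked_early: mother smoked up to the 12th week of gestation
    smoked_late:  mother smoked beyond the 12th week of gestation
    Both argument names are hypothetical; they are not taken from the study.
    """
    if smoked_late:
        return 1      # exposed beyond the 12th gestational week
    if smoked_early:
        return None   # exposed only up to week 12: set to missing
    return 0          # no exposure during pregnancy


answers = [(True, True), (True, False), (False, False)]
coded = [code_exposure(early, late) for early, late in answers]
print(coded)  # [1, None, 0]
```

Returning `None` for early-only exposure keeps those children out of both comparison groups, which matches the 58 exposed children (76 smokers minus 18 who ceased) reported in the text.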

Measurement of epigenome-wide DNA methylation
During the child's physical examination at age 5.5 years, blood was drawn to collect peripheral blood cells from buffy coats. Genomic DNA was extracted from buffy coats using a standard precipitation procedure. Bisulfite conversion of 800 ng genomic DNA was conducted with the EZ-96 DNA Methylation Kit (Zymo Research, Irvine, CA, USA) and converted DNA samples were hybridised on the Infinium HumanMethylation450 BeadChip (HM450K) according to the manufacturer's instructions (Illumina Inc., San Diego, USA). DNA extraction, bisulfite conversion and methylation analysis were performed at the Genome Analysis Center of Helmholtz Zentrum Muenchen, Munich, Germany.
DNA samples were randomly assigned to each of the 33 HM450K slides each containing 12 arrays and processed in one batch by the same laboratory staff to reduce potential batch effects. The HM450K platform measures methylation at 485577 CpG sites located throughout the genome [58]. It covers 21231 (99%) genes of the UCSC RefGenes and has a coverage of 96% of the CpG islands (CGI), 92% of the CGI shores (0-2 kb from CGI), 86% of the CGI shelves (2-4 kb from CGI) and interrogates 16232 differentially methylated regions (DMR).
Table 2. Characteristics of the analysed population. Columns: characteristic; not exposed; exposed; exposed vs. not exposed (P-value). Exposed: child exposed to any in utero tobacco smoking by maternal smoking during pregnancy beyond the 12th week of gestation; Not exposed: no exposure to in utero tobacco smoking at any time during pregnancy. P-value: test of percent or mean difference in study population characteristic among exposed vs. not exposed group either by chi-square test or t-test. doi:10.1371/journal.pone.0155554.t002
DNA methylation (DNAm) status at each HM450K probe was determined as methylation ß-value derived from the intensity ratio of the methylated allele (M) to the sum of intensities of the unmethylated allele (U) and the methylated allele (M) plus an offset of 100 [58]. As individual blood samples contain many genomic copies of each CpG site, the calculated ß-value can be interpreted as the percentage of methylation at that interrogated CpG site for that specific sample with a range from 0 (= completely unmethylated) to 1 (= completely methylated). Methylation ß-values were converted to M-values for statistical analyses by taking the base 2 logarithm of the ratio of the raw methylation ß-value and 1 minus the ß-value [59].
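The ß-value and M-value definitions given above can be written out as a short sketch. The function names and the example intensities are illustrative assumptions; only the formulas (offset of 100, base 2 logarithm of ß/(1-ß)) come from the text.

```python
import math


def beta_value(meth: float, unmeth: float, offset: float = 100.0) -> float:
    """Beta = M / (M + U + offset), using the offset of 100 stated in the text."""
    return meth / (meth + unmeth + offset)


def m_value(beta: float) -> float:
    """M-value = log2(beta / (1 - beta)); undefined at beta = 0 or beta = 1."""
    return math.log2(beta / (1.0 - beta))


# Hypothetical probe intensities, chosen so that beta equals 0.8
beta = beta_value(meth=4000.0, unmeth=900.0)
print(beta)                      # 0.8
print(round(m_value(beta), 6))   # 2.0
```

A ß-value of 0.5 maps to an M-value of 0, and the transformation stretches values near 0 and 1, which is one reason M-values are often preferred as the regression outcome.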

Quality control and normalisation of methylation data
Data pre-processing and normalisation were performed mainly according to the approach of Touleimat and Tost [60] with some adaptations, e.g. the beta-mixture quantile normalization (BMIQ) step for the ß-values [61], as recommended by a recent review of pre-processing and normalisation procedures [62]. Data were only retained for probes that had signals from at least 3 functional beads on the array and had detection p-values p ≤ 0.01. Moreover, only samples with at least 80% significant probe methylation signals per sample were retained. Colour bias correction using smooth quantile normalisation and background adjustment based on negative control probes were conducted separately for the two colour channels with the R-package lumi. Lumi was also used for signal background correction by subtraction of the negative control probe signal. Several features of the pre-processing used in this study were described previously in more detail [63]. In contrast to previous approaches [60], no probe filtering according to proximity of the CpG site to SNPs of minor allele frequency ≥ 5% within 50bp was conducted. In addition, probes on the X and Y chromosomes were not excluded. However, after BMIQ-normalisation and removal of duplicates, identified cross-binding probes were excluded as previously described [64], leaving a final data set of 431313 CpG methylation values in each of 366 children for EWAS analysis.
Batch effects and other technical noise were assessed by principal components analysis (PCA) of control probes and accounted for by adjustment for principal components (PC) in the EWAS analysis [65]. Heterogeneity of cell mixture distribution in the samples was assessed and accounted for by Houseman's method [66]. In detail, we used the validation data set consisting of purified cell samples as described in [66] (CD4+ T cells, CD8+ T cells, B cells, NK cells, monocytes and granulocytes) as a reference to estimate the relationship between cell types and methylation. After data pre-processing, we extracted ß-values of the 475 methylation sites available in our study that overlapped with the 500 methylation sites showing the strongest relation with cell types in [66]. These data were used to estimate proportions of the six above-mentioned cell types in samples of the CHOP study using the R function projectWBC() provided by Andres Houseman, restricting the WBC estimates to positive numbers (option nonnegative = TRUE). The estimated WBC proportions were included as covariates in the regression models (see below).
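The probe-level and sample-level retention rules described above can be illustrated with a small sketch. It assumes the detection threshold is p ≤ 0.01 and that a sample's "significant" signals are those passing the probe-level rule; the function names and the toy data are hypothetical, and the study itself used R (lumi) rather than this code.

```python
def keep_probe(n_beads: int, detection_p: float) -> bool:
    """Retain a measurement with at least 3 beads and detection p <= 0.01 (assumed cut-off)."""
    return n_beads >= 3 and detection_p <= 0.01


def sample_call_rate(measurements) -> float:
    """Fraction of a sample's (n_beads, detection_p) pairs that pass keep_probe()."""
    passed = sum(keep_probe(n, p) for n, p in measurements)
    return passed / len(measurements)


def keep_sample(measurements, min_rate: float = 0.80) -> bool:
    """Retain a sample only if at least 80% of its probe signals pass."""
    return sample_call_rate(measurements) >= min_rate


# Toy sample with five probes: two fail (too few beads, high detection p)
toy = [(5, 0.001), (2, 0.001), (4, 0.2), (3, 0.01), (10, 0.0)]
print(sample_call_rate(toy))  # 0.6
print(keep_sample(toy))       # False
```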

Statistical Analysis (EWAS)
Potential associations between methylation M-values at each CpG site and prenatal exposure to tobacco smoke beyond the 12th week of gestation were determined by standard linear regression models with adjustment for sex, age at blood draw (months), study centre (Germany (DE), Italy (IT), Spain (ES), reference Belgium (BE)), maternal education (high = 12+ yrs. vs. low = basic schooling, and middle = 10 to <12 yrs. vs. low), postnatal smoking exposure at age 4 years (by mother (yes/no), by father (yes/no)), the estimated proportions of six major white blood cell types (CD4+ T cells, CD8+ T cells, B cells, NK cells, monocytes and granulocytes) and the top 30 principal components (PC) derived from control probes on the HM450K platform.
The latter were included to account for biological and technical noise and batch effects [65,66]. The former adjustment factors were selected based on previous studies [6][7][8] and to account for the study design of the CHOP study (multi-centre study with populations from 4 European countries) [9]. Model adjustment for child's sex, age at blood draw, study centre, maternal education and postnatal smoke exposure was also conducted because lambda (λ) [67] was substantially improved from a model adjusted for cell mixture only (λ = 1.26), over a model adjusted for cell-mixture and 30 PC (λ = 0.98) to the final fully adjusted model (λ = 0.91).
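The text does not define λ beyond citing [67]. Assuming it is the usual genomic-control inflation factor (the median observed 1-df chi-square statistic divided by its expected median under the null), it could be computed from a vector of EWAS p-values roughly as sketched below. The function name is hypothetical, and the example p-values are synthetic, not study data.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Median observed chi-square(1) statistic over its expected null median.

    Each two-sided p-value (0 < p <= 1) is mapped back to a chi-square(1)
    statistic via the squared normal quantile at p/2.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / expected_median


# Synthetic, perfectly uniform p-values: no inflation expected
uniform_p = [(i - 0.5) / 999 for i in range(1, 1000)]
print(round(genomic_inflation(uniform_p), 3))  # 1.0

# Squaring the p-values makes them systematically too small (inflation)
inflated_p = [p * p for p in uniform_p]
print(genomic_inflation(inflated_p) > 1.0)  # True
```

Under this definition, values above 1 indicate more small p-values than expected by chance, and values below 1 indicate a deflated test statistic distribution.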
An association of a differently methylated CpG site (M-values) with exposure was considered significant at the epigenome-wide level, if the false discovery rate (FDR) was below 0.05 [68]. EWAS-wide significance was also evaluated and illustrated by QQ-plots produced with module gcontrol2 of R-package gap. Volcano plots of the M-value based regression coefficients of the exposure variable were used to simultaneously illustrate effect size and EWAS significance.
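The FDR procedure is cited [68] but not named in the text. Assuming it is the Benjamini-Hochberg step-up procedure, the q-values compared against 0.05 could be obtained as in the following sketch; the function name and the four example p-values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
print([round(x, 4) for x in q])                # [0.04, 0.0533, 0.0533, 0.2]
print([i for i, x in enumerate(q) if x < 0.05])  # [0]
```

In this toy example only the first test passes the q < 0.05 threshold, although three of the four raw p-values are below 0.05.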

Functional characterization of methylated CpGs
Differentially methylated sites were annotated according to Illumina (www.illumina.com, HumanMethylation450_15017482_v1.csv) and viewed in the UCSC genome browser, GRCh37/hg19 assembly [69]. The location of overlapping transcription factor binding sites, transcripts and chromatin modifications was assessed with ENCODE ChIP-seq data from blood (GM12878, K562) and endothelial (HUVEC) cell lines [26]. Associations between transcription and DNA methylation at CpGs along the MYO1G locus were retrieved from the pan-cancer methylation database MethHC [25]. General and tobacco-related expression patterns of CNTNAP2 and FRMD4A were assessed using the Human Proteome Map [45] and the Gene Expression Atlas at the European Bioinformatics Institute [42], respectively. All data were last retrieved in February 2016.