Genome-Wide Association Study of Gene by Smoking Interactions in Coronary Artery Calcification

Many GWAS have identified novel loci associated with common diseases, but have focused only on main effects of individual genetic variants rather than interactions with environmental factors (GxE). Identification of GxE interactions is particularly important for coronary heart disease (CHD), a major preventable source of morbidity and mortality with strong non-genetic risk factors. Atherosclerosis is the major cause of CHD, and coronary artery calcification (CAC) is directly correlated with quantity of coronary atherosclerotic plaque. In the current study, we tested for genetic variants influencing extent of CAC via interaction with smoking (GxS), by conducting a GxS discovery GWAS in Genetic Epidemiology Network of Arteriopathy (GENOA) sibships (N = 915 European Americans) followed by replication in Framingham Heart Study (FHS) sibships (N = 1025 European Americans). Generalized estimating equations accounted for the correlation within sibships in strata-specific groups of smokers and nonsmokers, as well as GxS interaction. Primary analysis found SNPs that showed suggestive associations (p≤10−5) in GENOA GWAS, but these index SNPs did not replicate in FHS. However, secondary analysis was able to replicate candidate gene regions in FHS using other SNPs (+/−250 kb of GENOA index SNP). In smoker and nonsmoker groups, replicated genes included TCF7L2 (p = 6.0×10−5) and WWOX (p = 4.5×10−6); and TNFRSF8 (p = 7.8×10−5), respectively. For GxS interactions, replicated genes included TBC1D4 (p = 6.9×10−5) and ADAMTS9 (P = 7.1×10−5). Interestingly, these genes are involved in inflammatory pathways mediated by the NF-κB axis. Since smoking is known to induce chronic and systemic inflammation, association of these genes likely reflects roles in CAC development via inflammatory pathways. Furthermore, the NF-κB axis regulates bone remodeling, a key physiological process in CAC development. 
In conclusion, GxS GWAS has yielded evidence for novel loci that are associated with CAC via interaction with smoking, providing promising new targets for future population-based and functional studies of CAC development.


Introduction
Recent genome-wide association studies (GWAS) have identified numerous novel loci associated with common diseases and their risk factors. However, GWAS have typically focused only on main effects of individual genetic variants, rather than interactions with other genes (epistasis) and with environmental factors (GxE interactions). Although GxE interactions provide a well-established paradigm for progression of complex chronic diseases, more precise biological and statistical characterization of this interplay remains elusive [1]. Identification of GxE interactions is particularly important for coronary heart disease (CHD), a major preventable source of morbidity and mortality with strong nongenetic risk factors such as physical activity, diet, and smoking.
Atherosclerosis is the major cause of CHD, and extent of coronary atherosclerosis is the most powerful predictor of subsequent clinical events. Non-invasive imaging of coronary artery calcification (CAC) has emerged as a useful method to assess CHD risk. The quantity of CAC, measured by computed tomography (CT), is heritable [2,3] and correlates directly with the quantity of coronary atherosclerotic plaque. Furthermore, CAC scores predict all-cause mortality [4] and coronary outcomes in asymptomatic individuals, as shown in a cohort of over 10,000 individuals followed for 5 years [5,6]. A CAC score >100 has demonstrated clinical relevance, representing the transition from mild to moderate coronary atherosclerosis [7,8]. Additionally, a CAC score >100 is associated with a 7-fold increased risk for myocardial infarction (MI) and CHD death after adjusting for traditional risk factors [9].
A recent CAC GWAS conducted by the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium (CHARGE) included five independent cohorts for discovery and three cohorts for replication [10]. The strongest SNP associations for CAC quantity and score >100 were found on chromosome 9p21 (top SNPs near CDKN2A and CDKN2B) and within the PHACTR1 gene on chromosome 6p24. These same regions are associated with early CHD. To date, no GWAS has investigated associations between GxE interactions and CAC.
Cigarette smoking, a major risk factor for CHD, is associated with CAC. In a study of over 30,000 asymptomatic adults, Hoff et al. reported a significant, independent association between having ever smoked and CAC score >100 (OR = 1.8 in men and 1.5 in women) [11]. Recently, North et al. found several chromosomal regions with evidence for linkage with CAC quantities only in nonsmokers (chromosomes 4 and 6) or only in smokers (chromosomes 11 and 13), with significant genotype by smoking interactions (p<0.05) [12].
In the current study, we used GWAS to test for genes that interact with smoking for CAC score >100 in European Americans from the Genetic Epidemiology Network of Arteriopathy study (GENOA). Our primary analysis included GWAS analysis of GENOA subgroups stratified by smoking status, as well as genome-wide tests for variants that show significant Gene by Smoking interactions (GxS). Primary analysis also attempted to replicate GENOA index SNPs that showed suggestive associations (p<10−5) in FHS. For secondary analysis, we tested SNPs located in candidate gene regions (+/−250 kb of GENOA index SNP) for associations with CAC in the Framingham Heart Study (gene-based strategy).

Ethics Statement
These studies received approval from Institutional Review Boards (University of Texas Health Science Center at Houston IRB, University of Michigan Health Sciences and Behavioral Sciences IRB, Boston University IRB), and study participants gave written informed consent.

Characteristics of Study Cohorts
The discovery cohort consisted of sibships of European ancestry who participated in the GENOA study as part of the NHLBI Family Blood Pressure Program (FBPP) (FBPP Investigators, 2002). Sibships containing at least two individuals diagnosed with essential hypertension before age 60 years were recruited from Rochester, Minnesota. All other siblings were invited to participate regardless of hypertensive status. Exclusion criteria were secondary hypertension, alcoholism or drug abuse, pregnancy, insulin-dependent diabetes mellitus, or active malignancy. GENOA sibships recruited during Phase 1 (N = 1,583, during 1995 to 2000) were invited to participate in Phase 2 (N = 1,241, during 2000 to 2004) and received electron beam CT scans of the heart. Other information collected by the GENOA study included demographic, environmental, anthropometric, and physiological data. Participant measurements of blood pressure and other clinical and physiological data have been described elsewhere [10]. Individuals with a history of coronary revascularization (n = 83) were excluded from measurement of CAC. Participants with a history of myocardial infarction (n = 19), stroke (n = 27), positive angiogram (n = 31), missing data (n = 2), or self-reported Hispanic ancestry (n = 2) were excluded from analyses. Of the 1,077 GENOA participants with CAC and risk factor measures, 915 participants (539 women and 376 men in 421 sibships) had genotype data.
The replication study participants were European Americans from the Offspring Cohort of the Framingham Heart Study (FHS) who participated in Offspring Exam 7 (n = 1,314), conducted between 1998 and 2001. FHS participants were excluded from the analyses if they had a cardiovascular disease (CVD) event prior to Exam 7 (N = 160), yielding a final sample of 1,025 siblings with valid genotype data in 431 sibships. A CVD event was defined as the occurrence of coronary death, myocardial infarction, stable or unstable angina pectoris, atherothrombotic stroke, intermittent claudication, cardiovascular death, or hospitalized coronary insufficiency. Risk factors were assessed from participants undergoing a routine physical examination, anthropometry, and laboratory collection during offspring examination [10].

GWAS Genotyping
GENOA participants were genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0. A total of 669,293 SNPs were genotyped in the 915 participants after passing quality control measures including exclusion of SNPs with a call rate <95%, or a minor allele frequency (MAF) <0.01. These measured SNP genotypes were used for imputation (

CAC Measurements
GENOA participants were imaged with an Imatron C-150 electron beam CT scanner (Imatron Inc.) as previously described (O'Donnell et al., 2011). Scan results were initially reviewed by a radiologist for technical quality, and then scored by a radiologic technologist. A focus of CAC was considered to exist if there were at least four contiguous pixels ≥130 Hounsfield Units (HU) in density. The CAC score is a calculation based on the number, area, and density of CAC foci summed from the four major epicardial arteries using the method of Agatston et al. (1990).
For FHS participants, a multidetector CT exam was conducted between 2002 and 2005, with a calcified lesion in the coronary arteries defined as an area of at least 3 contiguous pixels >130 HU with the use of 3-dimensional connectivity criteria (6 points). Agatston scores developed for electron beam CT scans were modified for multidetector CT scans as described previously (Parikh 2007). A CAC score cutpoint of 100 to define a qualitative outcome was used in all analyses and is referred to as CAC.
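The scoring and dichotomization described above can be illustrated with a minimal sketch. It assumes the conventional Agatston density weights (1 for 130-199 HU, 2 for 200-299 HU, 3 for 300-399 HU, 4 for 400 HU or more) and lesion areas already measured in mm²; the function names and the example lesions are hypothetical, and the scanner-specific modifications used in FHS are not modeled.

```python
from typing import Iterable, Tuple


def density_weight(peak_hu: float) -> int:
    """Conventional Agatston weight for a lesion's peak attenuation (HU)."""
    if peak_hu < 130:
        return 0  # below the calcification threshold, contributes nothing
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4


def agatston_score(lesions: Iterable[Tuple[float, float]]) -> float:
    """Sum area (mm^2) x density weight over all lesions in all arteries.

    Each lesion is an (area_mm2, peak_hu) pair; how lesions are segmented
    from the scan is outside the scope of this sketch.
    """
    return sum(area * density_weight(hu) for area, hu in lesions)


def cac_over_100(score: float) -> bool:
    """Binary outcome used in the analyses: CAC score above 100."""
    return score > 100


# Hypothetical lesions, for illustration only.
example_lesions = [(10.0, 150), (5.0, 250), (20.0, 450), (1.0, 320)]
example_score = agatston_score(example_lesions)
example_outcome = cac_over_100(example_score)
```

Treating the outcome as binary keeps the downstream models logistic, at the cost of discarding variation in score on either side of the cutpoint.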

Categorization of smoking status
Smokers (current and previous) and nonsmokers (never) were classified based on self-report. Previous smokers were categorized by having smoked more than 100 cigarettes in a lifetime according to the National Health Interview Survey [15]. Self-reports were used for classification since biochemical measures (e.g., cotinine levels) of smoking amounts were not available for GENOA participants. A dichotomous categorization of smoking status rather than a quantitative measure (e.g., pack-years) was chosen due to the inherent high dimensionality of GWAS analysis (2.1 million SNPs with 4.2 million main effects and interaction variables per SNP).

Data Analysis
To account for the correlation within sibships, the Genome-Wide Association Analysis with Family (GWAF) package was used within R statistical software version 2.10, employing a generalized estimating equations (GEE) model [16]. Covariates in the GEE model included age, sex, body mass index (BMI), pulse pressure, diabetes, systolic blood pressure (SBP), use of anti-hypertensive medications, use of lipid-lowering medications, and LDL-cholesterol. To assess population stratification in GENOA GWAS, the first ten principal components (PC) were calculated using Eigenstrat [17]. However, none of the PCs were significant (p<0.05) in GEE models (p = 0.82 for the first PC), so they were not used in subsequent analyses. To assess interactions of genotype by smoking (GxS) status, subgroups of smokers and nonsmokers were analyzed separately. Results from the stratified analyses were then used to test for interaction in the combined group. Effect differences for smoking were assessed by testing if beta coefficients differed from zero for associations of individual SNPs with CAC. The test for interaction approximates a 2-sample t-test with the following null hypothesis: $H_0: \beta_{\text{smokers}} = \beta_{\text{nonsmokers}}$, where $\beta_{\text{smokers}}$ is the beta coefficient for the SNP in smokers and $\beta_{\text{nonsmokers}}$ is the beta coefficient for the SNP in nonsmokers.
For all 2.1 million discovery SNPs, the interaction test statistic was calculated from the following equation as adapted from Heid et al. [18].
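The equation itself is not reproduced in this text. As a hedged illustration only, a commonly used form of a stratified-difference test divides the difference in stratum-specific betas by the standard error of that difference. The sketch below assumes that form, with an optional correlation term `r` between the two estimates (set to 0 for independent strata); the function names and example inputs are hypothetical and are not taken from the study.

```python
import math


def interaction_z(beta_s: float, se_s: float,
                  beta_n: float, se_n: float,
                  r: float = 0.0) -> float:
    """Z statistic for H0: beta_smokers = beta_nonsmokers.

    Assumed form: (beta_s - beta_n) / sqrt(se_s^2 + se_n^2 - 2*r*se_s*se_n),
    where r is the correlation between the two stratum estimates.
    """
    variance = se_s ** 2 + se_n ** 2 - 2.0 * r * se_s * se_n
    if variance <= 0:
        raise ValueError("variance of the difference must be positive")
    return (beta_s - beta_n) / math.sqrt(variance)


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical stratum-specific estimates, for illustration only.
z_example = interaction_z(beta_s=1.0, se_s=0.3, beta_n=0.0, se_n=0.4)
p_example = two_sided_p(z_example)
```

Because the test uses only per-stratum betas and standard errors, it can be applied to every SNP after the stratified GWAS without refitting a joint model.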
Primary analysis started with GxS GWAS in GENOA sibships, followed by testing of index SNPs that showed suggestive evidence for association (p≤10−5) in FHS sibships. A genome-wide significance threshold based on Bonferroni correction was used to account for multiple testing of 2.1 million SNPs (p≤2.3×10−8). Analyses in FHS were carried out within the corresponding strata-specific/interaction subgroups as described in GENOA. Secondary analysis used a gene-based approach that tested multiple SNPs in FHS sibships that were located in associated gene regions (+/−250 kb of GENOA index SNPs with GWAS p≤10−5). We also considered genes containing SNPs with p-values between 10−4 and 10−5, and that were near genes (≤250 kb) with functions

Results
GENOA study participants were in 421 sibships ranging in size from 1 to 10, with the majority of sibships consisting of size 2 (44%) or size 3 (21%). The general characteristics of the GENOA participants stratified by gender and smoking status are presented in Table 1. Overall, more participants were female (59%) than male (41%), and males had a higher proportion of smokers (58%) compared to females (38%). Participants in all categories had similar ages (mean age of 58.1 years). Males had higher mean CAC scores than females, regardless of smoking status. Smokers had higher mean CAC scores than nonsmokers in both men (341 versus 271) and women (150 versus 84). Table 1 also shows general characteristics for the stratified FHS participants. Overall, FHS participants were older than GENOA participants, with lower use of hypertensive and lipid-lowering medications, and lower frequency of diabetes. As in GENOA, smokers had higher mean CAC scores in both men (468 versus 238) and women (107 versus 81). Table 2 presents a list of index SNPs and their nearest genes that reached p≤10−5 separately for smokers and nonsmokers in GENOA, and for GxS interactions. Primary analysis also included attempts to replicate our discovery results (index SNPs with p≤10−5) in the FHS cohort. Table 2 shows corresponding p values for GENOA index SNPs in FHS sibships. Overall, only one SNP (COLEC11 rs12990669 in GENOA smokers, p = 1.4×10−8) reached genome-wide significance levels (p≤2.3×10−8) after Bonferroni corrections for multiple testing in GENOA or FHS (Table 2). However, this signal may represent a false positive result as other SNPs in COLEC11 did not show associations.
We conducted secondary analysis to investigate gene regions containing SNPs that showed suggestive associations in GENOA (p≤10−5) by testing multiple SNPs in FHS sibships within +/−250 kb of GENOA index SNPs [21]. We also included genes containing SNPs with p-values between 10−4 and 10−5 in GENOA with known functions relevant to CAC (Table 2). Table 3 shows the results for FHS subgroups for SNPs that reached significance thresholds using an LD-based Bonferroni approach to correct for multiple testing (p≤0.05/number of LD blocks in each gene region), thus avoiding potential overcorrection for correlated SNPs [19].
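The two multiple-testing thresholds used in this study can be sketched as simple Bonferroni divisions. The helper name and the LD block count in the example are hypothetical; the actual number of LD blocks per gene region is not given in this text. Dividing 0.05 by exactly 2.1 million gives roughly 2.4×10−8, close to the quoted 2.3×10−8, presumably because 2.1 million is itself a rounded SNP count.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Genome-wide threshold for roughly 2.1 million SNPs.
genome_wide = bonferroni_threshold(0.05, 2_100_000)

# Region-level threshold: one test per LD block rather than per SNP.
# The block count of 12 is a made-up example, not a value from the study.
hypothetical_ld_blocks = 12
region_level = bonferroni_threshold(0.05, hypothetical_ld_blocks)


def passes_region_threshold(p_value: float, n_ld_blocks: int) -> bool:
    """True if a SNP's p-value meets the LD-block-based threshold."""
    return p_value <= bonferroni_threshold(0.05, n_ld_blocks)
```

Counting LD blocks instead of SNPs treats tightly correlated SNPs as a single effective test, which is why the region-level threshold is far less stringent than the genome-wide one.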
For GxS interactions, rs4410439 in the gene for ADAM metallopeptidase thrombospondin type 1 motif, 9 (ADAMTS9) showed associations in GENOA (Table 3, p = 7.1×10−5). We also found associations with ADAMTS9 (rs4688504) in the FHS cohort (p = 1.0×10−4). Another variant (rs1560540) that showed associations for GxS interaction in GENOA (p = 6.9×10−5) was located in the gene for TBC1 domain family member 4 (TBC1D4). We found stronger associations with TBC1D4 (rs1062087) in the FHS cohort (p = 4.2×10−7). Figure 1 shows the stratified effects for genotypes for ADAMTS9 (rs4410439) and TBC1D4 (rs1560540) from the GxS interaction results in Table 3. Panel A shows the additive genotype effects (odds ratios obtained by exponentiation of the beta coefficients) for each smoking stratum used to calculate the interaction test for rs4410439 in ADAMTS9. For smokers (b = 0.59, SE = 0.17, p = 4.5×10−4), the odds ratio (OR) for reference AA genotypes was set at 1, the OR for AC genotypes was 1.80 (95% confidence interval of 1.47-2.13), and the OR for CC genotypes was 3.23 (0.35-7.40
We compared the results of the GxS interaction GWAS with the recent GWAS meta-analysis for main SNP effects on CAC quantity [10]. We interrogated the SNPs with the strongest associations in the GWAS meta-analysis including rs1333049 (CDKN2B), rs9349379 and rs2026458 (PHACTR1), rs3809346 (COL4A2), rs6783981 (SERPIN1), rs17676451 (HAL), rs6604023 (CDC7), and rs8001186 (IRS2). However, we did not find significant associations with any of these loci for GxS interaction (GENOA discovery cohort), with nominal p-values ranging from 0.09 to 0.49.
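The genotype odds ratios quoted earlier in this section for rs4410439 in smokers follow from exponentiating the beta coefficient under an additive model, where each copy of the effect allele multiplies the odds by exp(beta). The sketch below is an illustration with a hypothetical function name. With the published beta of 0.59 it gives about 1.80 for one copy and about 3.25 for two copies; the small gap from the reported 3.23 presumably reflects rounding of the published beta.

```python
import math
from typing import Dict


def additive_odds_ratios(beta: float) -> Dict[int, float]:
    """Odds ratios for 0, 1, and 2 copies of the effect allele.

    Under an additive (log-odds) model the reference genotype has OR = 1
    and each extra allele copy multiplies the odds by exp(beta).
    """
    return {copies: math.exp(beta * copies) for copies in (0, 1, 2)}


# Beta for rs4410439 in smokers as reported in the text (b = 0.59).
smoker_ors = additive_odds_ratios(0.59)
```

Confidence intervals are omitted here because the text does not state how the published intervals were derived.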

Discussion
We used a genome-wide approach to investigate GxS interactions to identify genetic variants associated with CAC exclusively in either smokers or nonsmokers, and with GxS interactions in the GENOA cohort. Our primary analysis used GWAS in GENOA, followed by attempts to replicate associated index SNPs (p≤10−5) in FHS. However, these tests did not yield SNP associations that met genome-wide significance in GENOA, or standard SNP-based replication in FHS (Table 2). Primary analyses were likely limited by the relatively small sample size of the discovery GENOA cohort, which reduced statistical power for main effects in GWAS results within strata. In secondary analysis, we used a gene-based approach, testing multiple SNPs in FHS within associated gene regions (+/−250 kb of GENOA index SNPs) [19]. We found many instances of SNPs within gene regions that showed significant associations in FHS after correction for multiple testing using numbers of LD blocks in each region (Table 3). Overall, these genes were not represented by identical SNPs in the two cohorts, likely due to differences in allele frequencies or functional differences of SNPs in different regions of the same gene. In some instances, we observed different directions of effect (beta coefficients) for different SNPs in the same gene (Table 3). Using theoretical modeling, Lin and coworkers demonstrated that valid "flip-flop" associations may occur even for identical SNPs due to correlations with other causal variants that differ among cohorts because of interaction effects or differences in LD patterns caused by sampling variation within ethnic groups or evolutionary history between ethnic groups [22].
Table 3. Secondary analysis: results of gene-based replication in FHS for genes containing SNPs associated with CAC in the GENOA discovery cohort.
In general, replication in GxE GWAS may prove challenging, given the complex nature of traits like CAC that are influenced by numerous interactions among multiple loci and environmental factors, as well as differences among cohorts in LD structure and environmental exposures. Our GWAS of GxS interactions identified several genes that showed concordance of results in GENOA and FHS and that are involved in diverse cellular processes relevant to CAC, such as inflammation and osteogenesis. In particular, inflammation is a likely mediator of GxS interactions, since many of the deleterious effects of smoking are due to induction of inflammatory responses, contributing to chronic diseases such as CHD [23]. Inflammatory markers are also well-known risk factors for type 2 diabetes (T2D), providing a likely physiological connection between development of CHD and T2D [24]. Recent in vitro experiments in human umbilical vein endothelial cells demonstrated that nicotine stimulates cellular inflammatory response via activation of the NF-κB transcription factor axis by a second messenger pathway [25]. In a rat model, exposure to cigarette smoke caused changes in levels of inflammatory markers including NF-κB in cardiac tissues [26]. In addition to inflammation, the NF-κB axis plays a central role in CAC quantity and bone remodeling by induction of osteoclast differentiation [27,28].

Smokers
In GENOA and FHS, the gene for transcription factor 7 like-2 (TCF7L2) showed associations only in smokers (Table 3). TCF7L2 (chromosome 10) has shown associations with T2D in a previous GWAS [29], and may impair pancreatic beta-cell function with effects on blood glucose homeostasis [30]. TCF7L2 has shown associations with angiographically determined CHD in diabetic and non-diabetic patients [31], as well as with CVD, ischemic stroke, peripheral artery disease, and all-cause mortality [32]. TCF7L2 encodes the Tcf-4 transcription factor in the Wnt signaling pathway, directly regulating beta-catenin, a major activator of the NF-κB axis [33]. Activation of the NF-κB axis may provide a clue concerning the role of TCF7L2 in CAC development, since osteogenesis and bone remodeling are regulated by NF-κB [27,28].
The gene for WW domain-containing oxidoreductase (WWOX) also showed associations only in smokers from both GENOA and FHS (Table 3). WWOX (chromosome 16) is an established tumor suppressor gene that is associated with CHD, bone development, and higher methylation levels in smokers [34]. In previous population-based studies, variants in WWOX have shown associations with CHD and left ventricular mass [35,36]. Ablation of WWOX in knock-out mouse strains (Wwox -/-) caused development of osteosarcomas, as well as osteopenia and bone growth retardation [34,37].
In GENOA nonsmokers, we found associations in the gene for tumor necrosis factor receptor superfamily, member 8 (TNFRSF8) (Table 3). Like TCF7L2, this gene may also influence CAC via the NF-κB axis that regulates inflammatory response and bone remodeling [27,28]. TNFRSF8 is a member of the TNF-receptor superfamily, whose members play key roles in signaling pathways that regulate NF-κB activation via interaction with TNF cytokines. In FHS, the associated variant in this chromosomal region was in an intergenic region near PLOD1 (rs4304595, p = 2.0×10−4) (Table 3).
For GxS interactions, rs1062087 within the gene for TBC1 domain family member 4 (TBC1D4) showed stronger association in FHS than the SNP identified in GENOA (Table 3). Interestingly, rs1062087 is a nonsynonymous variant (Ile818Val) that is located in the same LD block as rs1560540, which showed associations in GENOA. TBC1D4 encodes the AS160 protein that mediates insulin homeostasis by regulating glucose uptake in fat and muscle cells via GLUT4 glucose transporters. Inflammatory markers (TNF-α, IL-1, IL-6) are associated with a reduction of AS160 activities, resulting in increased insulin resistance [38]. Another variant (rs4410439) that showed associations for GxS interactions in GENOA (p = 7.1×10−5) was located in the gene for ADAM metallopeptidase thrombospondin type 1 motif, 9 (ADAMTS9) (Table 3). We also found associations with ADAMTS9 (rs4688504) in the FHS cohort (p = 1.0×10−4) (Table 3). ADAMTS9 encodes a metalloprotease that is involved in thrombosis, cleaving versican, proteoglycans, and aggrecan. In transgenic mice that were heterozygous for an inactivated allele carrying the LacZ reporter gene (Adamts9 +LacZ ), ADAMTS9 haploinsufficiency altered cardiovascular development and allostasis, resulting in valvular and aortic anomalies [39]. As with TCF7L2, both TBC1D4 and ADAMTS9 have shown associations in GWAS meta-analysis for T2D [29,40]. Such genes may provide new insights into the well-known relationship between CHD and T2D, perhaps mediated by inflammatory processes.
Perhaps the best future use of existing GWAS data from epidemiological cohorts is the identification of loci involved in interactions (gene by gene, gene by environment) that underlie complex diseases such as CHD and their risk factors. Our results demonstrate that such interactions (i.e., gene by smoking) may be generalizable among cohorts, given that many of the genes identified in GENOA also showed significant associations in FHS. These interactions are likely to reflect the role of particular metabolic or physiological pathways that include many genes, and that interact with environmental factors such as smoking. In our study, we found that many of the replicated genes were involved in inflammatory pathways mediated by the NF-κB axis. In addition, three of the loci associated with CAC also showed associations in GWAS meta-analysis of T2D [40], a chronic disease with altered inflammatory pathways and increased CHD risk. Since smoking may cause chronic and systemic inflammation, association of these genes in GENOA likely reflects their roles in CAC development and progression via participation in inflammatory pathways. Interestingly, the NF-κB axis also regulates bone remodeling, providing a link between inflammation and pathways of osteogenesis involved in development and progression of CAC. Additional genetic studies will be required for further tests of these genes in other human populations, as well as functional studies to understand how these genes influence gene by smoking interactions.