Replication of Putative Susceptibility Loci from Genome-Wide Association Studies Associated with Coronary Atherosclerosis in Chinese Han Population

Background Coronary atherosclerosis, the main cause of cardiovascular disease, is a progressive disease. Recent genome-wide association studies (GWASs) discovered several novel loci associated with coronary artery disease (CAD) or its main complication, myocardial infarction (MI). In this study, we investigated the associations between previously reported CAD- and MI-associated variants and coronary atherosclerosis in a Chinese Han population. Methodology/Principal Findings We performed a case-control association study with 2,335 coronary atherosclerosis patients and 1,078 controls, all Han Chinese from China undergoing coronary angiography. Fourteen single nucleotide polymorphisms (SNPs), located at 1p13.3, 1q41, 2q36.3, 6q25.1, 9p21.3, 10q11.21 and 15q22.33, were genotyped in our sample collection. Six SNPs at 9p21 were associated with coronary atherosclerosis susceptibility (Ptrend<0.05) and rs10757274 showed the most significant association (P = 2.38×10−08, OR = 1.34). These associations remained significant after adjustment for multiple comparisons. Rs17465637 at 1q41 (Ptrend = 6.83×10−03, OR = 0.86) also showed significant association with coronary atherosclerosis, but the association was not significant after adjustment for multiple comparisons. Additionally, rs501120 (P = 8.36×10−03, OR = 0.80) at 10q11.21 was associated with coronary atherosclerosis in females, but did not show association in males or in all participants. Variants at 1p13.3, 2q36.3, 6q25.1 and 15q22.33 showed no associations with coronary atherosclerosis or main cardiovascular risk factors in our data. Conclusions/Significance Our findings indicated that variants at 9p21 were significantly associated with coronary atherosclerosis in Han Chinese. Variants at 1q41 showed suggestive evidence of association and variants at 10q11.21 showed suggestive evidence of association in females, which warrant further study in a larger sample.


Introduction
Coronary atherosclerosis is a progressive disease whose potential consequences include coronary artery disease (CAD) and its main complication, myocardial infarction (MI). CAD and MI are leading causes of death and disability worldwide and have a rapidly increasing incidence in developing countries [1]. Epidemiological studies have revealed that both genetic and environmental risk factors contribute to the pathogenesis of atherosclerosis [2]. However, the molecular mechanism of atherogenesis, including formation, proliferation and atheroma rupture, has not yet been clarified.
The biological complexity of atherosclerosis implies the involvement of a large number of genes and their functional variants in its pathogenesis [3]. During the past five decades, large-scale epidemiological studies and genetic association analyses have identified multiple risk factors and susceptibility genes for coronary atherosclerosis [4]. Variants of several functionally important genes, including ApoE [5] and LDLR [6], have been implicated in susceptibility to coronary atherosclerosis in the general population. With improved genotyping technologies and the completion of the human HapMap project, genome-wide association studies (GWASs) have recently become an important research method in genetic studies. Recent GWASs and meta-analyses in CAD and MI identified several new susceptibility loci [7,8,9,10,11,12]. Among these loci, the strongest association signals were on chromosome 9p21.3, which were also correlated with stroke, abdominal aortic and intracranial aneurysms in several other cohorts [13]. Variants on chromosome 1p13.3 were also found to be significantly associated with CAD and LDL cholesterol concentration in recent GWASs [14,15,16,17,18], reinforcing the mechanistic relationship between the variability in LDL levels and CAD risk [19].
Most of these GWASs were conducted in Caucasian populations, and several replication studies have been performed in Chinese populations from China. Rs2383206 and rs2383207 were investigated for association with CAD in 1,360 cases and 1,360 gender-matched controls of Chinese Han, and only the rs2383207 locus was found to be significantly associated with CAD [20]. The associations of rs10757274, rs2383206, and rs10757278 with MI were investigated in Chinese Han subjects in a hospital-based case-control study (432 cases and 430 controls), and all three SNPs showed association with MI [21]. Rs1333049 was found to be an independent determinant of coronary plaque progression in 1,034 non-diabetic patients but not in 1,012 type 2 diabetes mellitus (T2DM) patients from Chinese Han [22]. Rs2383206, rs1004638 and rs10757278, in a strong linkage disequilibrium (LD) block, were investigated in 510 CAD patients, 558 patients with ischemic stroke and 557 shared controls of Chinese Han, and showed significant associations with CAD and weak association with ischemic stroke [23]. Four SNPs at 9p21 were genotyped in 425 MI patients, 687 patients with ischemic stroke, and 1,377 healthy controls recruited from the Chinese population residing in Taiwan, and the results showed that genetic variations in the 9p21 region are associated with MI but not with stroke [24]. All the above-mentioned studies in Chinese populations were limited to 9p21 variants.
Here, we undertook a replication study in a large cohort of coronary atherosclerosis patients and controls from the Chinese Han population. We examined 14 SNPs located in seven chromosomal regions which showed strong or moderate associations with CAD and MI in recent GWASs [7,8,9,10]. Among these new risk loci, variants at 9p21.3 have been firmly validated in follow-up replication studies from different populations [20,25,26,27]. Since the initial GWASs were conducted in Caucasian populations, whose LD pattern is quite different from that of Chinese, and the most significantly associated SNP also differed between studies, fine-mapping of 9p21.3 in the Chinese population is needed. We selected eight SNPs, which represent the most associated and independent SNPs at 9p21.3 in the previous studies, to perform a limited investigation of the associations of the region in Han Chinese.

Results
Clinical and biochemical characteristics according to the presence of significant coronary stenoses are summarized in Table 1. As anticipated, the prevalence of established cardiovascular risk factors was higher in coronary atherosclerosis cases than that in controls. Overall, the characteristics of our patients were typical for patients undergoing coronary angiography for the evaluation of coronary atherosclerosis, with a male preponderance (61.8%) and a high prevalence of T2DM (15.8%), hypertension (59.5%), smoking (30.1%) as well as drinking (10.7%). Notably, the levels of serum LDL cholesterol were significantly lower but HDL cholesterol levels were higher in patients with significant coronary stenoses than in patients without such lesions.
Among the 14 SNPs, rs17228212, with a minor allele frequency (MAF) of less than 1%, was removed from further association analysis. In addition, rs7044859, which deviated from Hardy-Weinberg equilibrium (HWE) in controls (P<0.01), was also eliminated from further analysis (Table 2). The twelve SNPs that conformed to HWE were tested for association in 2,335 coronary atherosclerosis patients and 1,078 control subjects. The LD structure among the seven SNPs at 9p21.3 that conformed to HWE was examined with Haploview 4.2 (Figure 1A).  Table 3). After accounting for the most significant SNP, rs10757274, no other SNPs at 9p21.3 were independently associated with coronary atherosclerosis in logistic regression analysis, suggesting that rs10757274 alone could account for the association signal of this region (Table 4). All seven SNPs remained significant after adjustment for age and sex. Rs599839 at 1p13.3 was not associated with coronary atherosclerosis in univariate analyses; however, it showed a nominal association after adjusting for age and sex (Table 3). We also investigated associations in a multivariate model including age, sex, rs17465637 at 1q41, rs599839 at 1p13.3 and rs10757274 at 9p21.3, and all three SNPs remained significant (P = 0.009 for rs17465637, P = 0.02 for rs599839, and P = 3.83×10−07 for rs10757274).
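The two quality-control filters described above (MAF below 1%, HWE P<0.01 in controls) can be illustrated with a minimal Python sketch. It makes several assumptions that are not from the study: the HWE test is a 1-df chi-square goodness-of-fit test rather than whatever test Plink applied, both filters are computed from control genotype counts, and the function names and example counts are hypothetical.

```python
import math

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square goodness-of-fit test for HWE.

    Assumption: a chi-square approximation, used here only for illustration.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(control_counts, maf_min=0.01, hwe_p_min=0.01):
    """Keep a SNP only if it clears both thresholds (hypothetical helper)."""
    return (maf(*control_counts) >= maf_min
            and hwe_chi2_p(*control_counts) >= hwe_p_min)

# Hypothetical control genotype counts, not data from the study
print(passes_qc((25, 50, 25)))   # genotypes in exact HWE proportions
print(passes_qc((50, 0, 50)))    # no heterozygotes: strong HWE departure
```

For SNPs with very low minor allele counts, an exact HWE test is generally preferable to the chi-square approximation used in this sketch.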
Previous studies have suggested sex-specific heritability of coronary atherosclerosis [28,29]. To address this issue, we performed association analysis stratified by sex. Rs17465637 at 1q41 and rs496892 at 9p21, which showed association in all participants, were associated with coronary atherosclerosis in the female subgroup but not in males. Rs501120 at 10q11.21 was associated with disease in the female group (Pfemale = 7.53×10−03), but did not show association in males or in all participants. Additionally, rs2383207 was associated with disease in the male subgroup but not in females. We also performed association analysis between the selected SNPs and the known risk factors for coronary atherosclerosis (hypertension and diabetes). We found that seven SNPs at 9p21 were associated with susceptibility to atherosclerosis complicated by hypertension, and five SNPs in block 2 at 9p21 showed associations in the non-hypertension subgroup (Figure 1A). In the cohort without diabetes, the distributions of rs599839 at 1p13.3, rs17465637 at 1q41 and seven SNPs at 9p21 showed significant differences between cases and controls. We found no evidence for association with the 6q25.1, 2q36.3 and 15q22.33 loci. The results are summarized in Table 5.
Finally, we examined serum lipid levels with respect to genotypes of the twelve variants. In univariate analyses, rs17465637 at 1q41 was associated with HDL cholesterol (P = 0.018); no other SNPs showed association with serum LDL cholesterol or HDL cholesterol. SNP rs599839 at 1p13.3 was associated with plasma levels of total and LDL cholesterol in previous studies [14,15,16,17]; however, it showed no evidence of association in our data (Table 6).

Discussion
Coronary atherosclerosis, the primary cause of CAD, is a progressive disease. MI is the last step in the development of CAD, and thrombogenic factors ultimately determine whether or not infarction occurs [11,12]. In the past few years, a number of novel susceptibility genes for CAD or MI were identified using GWASs. In particular, several SNPs identified by the WTCCC, McPherson et al. and Helgadottir et al. met the criteria for genome-wide association [7,8,10,30]. Other studies found associations of these variants with myocardial infarction [25,27,31]. Potential associations between variants at these risk loci and angiographically characterized coronary atherosclerosis are unknown, although some investigations have been made [32]. We therefore aimed to investigate the association of these reported variants with coronary atherosclerosis in a large Chinese population consisting of well-characterized patients and controls. Our data validated the variation at 9p21 for association with coronary atherosclerosis, and provided evidence for the association of rs17465637 at 1q41.
Variants at 9p21 showed association with genome-wide significance in our Chinese Han collection. Rs10757274 was the most significant SNP among the seven SNPs in this region, and after accounting for rs10757274, no other SNPs at 9p21.3 showed an association signal. Rs10757274 was also the most significant SNP in the GWAS by McPherson et al. [8]. However, rs1333049 was most significantly associated with CAD risk in the GWASs by the WTCCC and the German Cardiogenics Consortium [9,10], and rs10757278 was the lead SNP in the GWAS of MI by Helgadottir et al. [7]. Indeed, rs1333049 and rs10757278 were in complete LD in both CEU and CHB data from HapMap, but were in low LD (r² = 0.32) in the YRI data from HapMap. The LD between rs1333049 and rs10757274 was also high in both CEU and CHB data from HapMap (r² > 0.89), as well as in our study (r² = 0.76) and previous ones (r² = 0.89) [33] (Figure 1A-B). Since the LD in this critical region is strong, using a population with low LD may help in localizing the association signal to the causal variants and therefore aid downstream analysis in future studies.
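The r² values quoted above follow the standard definition of pairwise LD between two biallelic SNPs: r² = D² / (pA(1−pA)pB(1−pB)), with D = pAB − pA·pB. The sketch below is a minimal Python illustration of that formula. It assumes the haplotype frequency pAB is already known; tools such as Haploview estimate it from unphased genotypes, a step not reproduced here. The function name and example frequencies are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom

# Hypothetical frequencies, not taken from HapMap or from the study
print(ld_r2(0.5, 0.5, 0.5))    # alleles always travel together: complete LD
print(ld_r2(0.25, 0.5, 0.5))   # haplotype frequency equals the product: no LD
```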
The nearest described protein-coding genes to these risk variants within chromosome 9p21 are CDKN2A/2B, which encode inhibitors of CDK4/CDK6. The CDKN2A gene generates two transcripts derived from the alternative first exons, E1a and E1b, which incorporate exon 2 and 3 encoding p16/CDKN2A and p14/ARF, respectively [34]. A fourth gene, MTAP, is located further upstream in close proximity. All these genes play roles in cancer, cell-cycle control, apoptosis and aging; however, their roles in atherogenesis are not clear. A newly discovered large antisense noncoding RNA (designated antisense noncoding RNA in the INK4 locus, ANRIL) is embedded in these genes, with a first exon located in the promoter of the p14/ARF gene and overlapping the two exons of p15/CDKN2B [35]. Previous studies showed that ANRIL was expressed in atherosclerotic tissue [36] and that the expression of ANRIL was coordinated with that of p14/ARF and possibly with p16/CDKN2A and p15/CDKN2B in both physiological and pathological conditions [35,37], suggesting that it might regulate the expression of these genes. The expression of ANRIL transcripts (EU741058 and NR_003529) in atherosclerotic plaque tissue was directly correlated with severity of atherosclerosis; however, no such correlation was found for CDKN2A, CDKN2B and MTAP [32]. Taken together, ANRIL might play a role in the pathogenesis of atherosclerosis. On the other hand, targeted deletion of the 9p21 non-coding interval in mice provided direct evidence that the risk interval has a pivotal role in regulation of cardiac CDKN2A/B expression, and suggested that this region affects CAD progression by altering the dynamics of vascular cell proliferation [38].
On chromosome 1q41, SNP rs17465637 was associated with coronary atherosclerosis in our study (Ptrend = 6.83×10−03, OR = 0.86, 95% CI = 0.77-0.96) and also showed moderate association with HDL cholesterol (P = 0.02). Rs17465637 was first reported to be associated with CAD in a GWAS by Samani et al. [9]. It was then reported to be associated with early-onset MI in the GWAS by the Myocardial Infarction Genetics Consortium [39], and the association with MI was replicated in a Japanese sample [31]. However, in a large-scale association study from nine European studies, rs17465637 showed no significant association with CAD; instead, a SNP (rs3008621) in an adjacent haplotype block showed significant association [19]. Both rs17465637 and rs3008621 are located in an intron of the melanoma inhibitory activity family, member 3 (MIA3) gene (also known as ARNT or TANGO).   (HMECs). Additionally, the migrating capacity of premonocytic cells through fibrinogen or HMECs was increased after stimulation of these cells with recombinant MIA3. These results suggested that MIA3 reduced the attachment to fibrinogen or other cell adhesion molecules [41]. This process is fundamental for the formation and progression of atherosclerotic plaque and also for plaque instability, which might play an important role in the development of coronary atherosclerosis.
Rs501120 at 10q11.21 lies upstream of the CXCL12 gene, which codes for stromal cell-derived factor-1 (SDF-1), a member of the family of chemoattractant cytokines known as chemokines and the ligand for cell-surface chemokine receptor 4. SDF-1 is highly expressed in smooth muscle cells, endothelial cells, and macrophages in human atherosclerotic plaques, and was reported to be involved in the induction of platelet aggregation [42]. The combined analysis of the WTCCC study and the German MI study revealed for the first time that rs501120 at 10q11.21 was associated with CAD [9].
In 11,550 CAD cases and 11,205 controls from 9 European studies, the association of rs501120 with CAD was replicated and was stronger in women than in men [19]. In our study, the SNP was associated with coronary atherosclerosis in women but not in men or in all participants (Pall = 0.19, Pfemale = 8.36×10−03, Pmale = 0.67). These results suggested that rs501120 might affect the pathology of atherosclerosis in a sex-dependent manner, and the mechanism requires further investigation.
SNP rs599839, on chromosome 1p13.3, has recently been reported to be associated with CAD [9,19] and LDL cholesterol in several GWASs [14,15,16,17], and the association with lipid concentrations was replicated in a Japanese population [43]. In univariate analyses of our study, SNP rs599839 was associated with neither coronary atherosclerosis nor serum lipid levels; however, the association with coronary atherosclerosis appeared significant after adjusting for age, sex and other significant SNPs in multivariate analyses. Assuming a prevalence of 0.50 and using a significance level of 0.05, our study had only 24.6% power to detect association with rs599839 (MAF of 7.3%) in 2,335 CAD patients vs. 1,078 controls. An even larger-scale case-control study in the Chinese Han population should be performed to evaluate the association of rs599839 with CAD susceptibility and serum lipid levels.
Table 6. Distribution of serum lipid levels in the study according to genotypes of rs599839 and rs17465637. Columns: total cholesterol, triglycerides, HDL-C and LDL-C (mmol/l, ± SD).
Variants at 2q36.3, 6q25.1, 10q11.21 and 15q22.33 showed no associations in our data, and the following reasons might explain the lack of association with atherosclerosis. First, the minor allele frequencies of certain SNPs (rs6922269 and rs17228212) were low (MAF <5%) in the Chinese population, and the sample size of our study was moderate, which limited our power to detect the association. Therefore, further study with an even larger sample size is needed to assess the associations. Second, previous GWASs might have uncovered risk alleles in these regions that are important for populations of Caucasian descent but may not contribute much to risk in Chinese because of different environments or other gene interactions, and therefore we did not replicate the associations of these regions using the SNPs uncovered previously.
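To illustrate why a low MAF limits power at this sample size, the following Python sketch approximates the power of a two-sided allelic test using a normal approximation for the difference between two proportions. This is not the calculation behind the 24.6% figure: the text states neither the assumed effect size nor the software used, and the prevalence assumption is not used here. The odds ratios below are hypothetical inputs and the function name is illustrative.

```python
import math
from statistics import NormalDist

def allelic_power(maf_controls, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    norm = NormalDist()
    n1, n2 = 2 * n_cases, 2 * n_controls          # allele counts per group
    p2 = maf_controls
    odds1 = odds_ratio * p2 / (1 - p2)            # allele odds in cases
    p1 = odds1 / (1 + odds1)                      # allele frequency in cases
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)       # pooled frequency under H0
    se0 = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se1 = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    delta = abs(p1 - p2)
    # Probability that the test statistic falls in either rejection region
    return (norm.cdf((delta - z_crit * se0) / se1)
            + norm.cdf((-delta - z_crit * se0) / se1))

# Sample sizes and MAF from the text; the odds ratios are hypothetical
for assumed_or in (1.1, 1.2, 1.3):
    print(assumed_or, round(allelic_power(0.073, assumed_or, 2335, 1078), 3))
```

Under this approximation, power for a fixed odds ratio rises with both the MAF and the number of alleles sampled, which is the reasoning behind the call for a larger sample.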
In conclusion, we have confirmed genetic associations for coronary atherosclerosis with 9p21. Additionally, our data showed suggestive evidence of association at 1q41 and suggestive evidence of association at 10q11.21 in females, which warrant further study in a larger sample. These findings provided a strong foundation for further investigation of these loci as risk factors for coronary atherosclerosis in Chinese Han population.

Ethics Statement
Approval to undertake this study was granted by the Ethics Review Committee of the Chinese National Human Genome Center at Shanghai, and the study was conducted according to the principles of the Declaration of Helsinki. Written informed consent was obtained from each recruited subject.

Study Populations
A total of 3,413 individuals were included in this study, consisting of 2,335 atherosclerosis patients and 1,078 unrelated controls free of atherosclerosis. Samples were collected from the cardiovascular care units of three hospitals in Shanghai. To reduce potential confounding from ethnic background, we only enrolled people with self-reported central Han Chinese origin, including indigenous people from Shanghai, Zhejiang Province, Jiangsu Province and Anhui Province. Recent analyses of genome-wide SNP variation have shown that the central Han Chinese can be regarded as a single homogeneous population [44,45]. The diagnosis of coronary atherosclerosis was made on the basis of coronary angiography. Consensus diagnosis of coronary atherosclerosis was performed by two experienced doctors, who carefully evaluated the status by medical history, stenosis status and physical examination. Individuals with at least 50% atherosclerotic occlusion in more than two branches of the coronary artery were included as cases in the current study. All controls were individuals without coronary stenosis. Those with coronary myocardial bridge were excluded from the study. At enrollment, anthropometric measures, medication usage and family history data were collected from each subject by a trained interviewer. The demographic and risk factor information of all case and control samples is summarized in Table 1. Genomic DNA of all samples was isolated from whole blood using the FlexiGene DNA Kit (Qiagen, Valencia, CA, USA).  33 and four tag SNPs of two blocks at 9p21.3 including: rs7044859, rs496892, rs7865618 and rs1333049) [9].

SNP selection and genotyping
Eight SNPs (rs599839, rs6922269, rs10757278, rs2383206, rs7865618, rs501120, rs10757274 and rs17228212) were genotyped using SNPStream (Beckman Coulter, Fullerton, CA). Primer design for PCR and single base extension (SBE) was performed with Beckman Coulter Autoprimer software. Another six SNPs (rs17465637, rs2943634, rs1333049, rs496892, rs2383207 and rs7044859) were genotyped using TaqMan chemistry (Applied Biosystems). TaqMan genotyping assays with probes labeled with the fluorophores FAM and VIC were purchased from Applied Biosystems. The Universal PCR Master Mix from Applied Biosystems was used in a 5 µl total reaction volume with 10 ng DNA per reaction. Allelic discrimination was measured automatically on an ABI Prism 7900HT (Applied Biosystems) with Sequence Detection Systems 2.1 software (auto caller confidence level, 95%). To evaluate the concordance of the two platforms, we selected rs1333049 to be re-genotyped in 100 randomly selected samples using the SNPStream system. The concordance rate between the genotypes from TaqMan and SNPStream was 99%.

Statistical analysis
Genotype distributions were evaluated for departure from HWE with Plink software [46] (version 1.07, http://pngu.mgh.harvard.edu/purcell/plink/). We performed genotype and Cochran-Armitage trend tests to assess genotype-phenotype association using Plink software. Allele frequencies for cases and controls were used to calculate the odds ratio (OR) and the 95% confidence interval (CI). Conditional logistic regression was performed to assess whether the most significant SNP in the associated region was sufficient to model the association. Multiple logistic regression was used to evaluate whether each SNP was independently associated with CAD and serum lipid levels when adjusted for age and sex. Continuous data were expressed as mean ± standard deviation (SD), and the independent-samples t-test was employed to analyze differences between the two study groups. A two-tailed P-value of <0.05 was considered statistically significant, whereas a corrected value of P<0.0036 was considered significant after Bonferroni correction for 14 SNPs. Statistical calculations were performed with SPSS 15.0 (SPSS Inc., Chicago, IL, USA) unless otherwise specified.
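The core single-SNP statistics named above can be sketched in a few lines of Python. The sketch shows a Cochran-Armitage trend test with additive weights (0, 1, 2), an allelic odds ratio with a Woolf (log-scale) 95% confidence interval, and the Bonferroni threshold 0.05/14 ≈ 0.0036. It is an illustration, not the Plink or SPSS implementation: the CI method is an assumption (the text does not state which was used), and the genotype counts and function names are hypothetical.

```python
import math

def chi2_1df_p(chi2):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test; inputs are genotype counts (AA, AB, BB)."""
    n_i = [r + s for r, s in zip(cases, controls)]   # genotype totals
    big_n, big_r = sum(n_i), sum(cases)
    s_rt = sum(t * r for t, r in zip(weights, cases))
    s_nt = sum(t * n for t, n in zip(weights, n_i))
    s_ntt = sum(t * t * n for t, n in zip(weights, n_i))
    num = big_n * (big_n * s_rt - big_r * s_nt) ** 2
    den = big_r * (big_n - big_r) * (big_n * s_ntt - s_nt ** 2)
    chi2 = num / den
    return chi2, chi2_1df_p(chi2)

def allelic_or(cases, controls, z=1.96):
    """Odds ratio for allele B vs. allele A with a Woolf 95% CI (assumed method)."""
    a = 2 * cases[2] + cases[1]        # B alleles in cases
    b = 2 * cases[0] + cases[1]        # A alleles in cases
    c = 2 * controls[2] + controls[1]  # B alleles in controls
    d = 2 * controls[0] + controls[1]  # A alleles in controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se_log), odds_ratio * math.exp(z * se_log)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# Hypothetical genotype counts (AA, AB, BB), not data from the study
cases, controls = (900, 1100, 335), (480, 480, 118)
print(trend_test(cases, controls))
print(allelic_or(cases, controls))
print(bonferroni_threshold(0.05, 14))
```

Working from genotype counts keeps the trend test and the allelic odds ratio on the same input, which mirrors how the per-SNP results are reported.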