Analysis of 17,576 Potentially Functional SNPs in Three Case–Control Studies of Myocardial Infarction

Myocardial infarction (MI) is a common complex disease with a genetic component. While several single nucleotide polymorphisms (SNPs) have been reported to be associated with risk of MI, they do not fully explain the observed genetic component of MI. We have been investigating the association between MI and SNPs that are located in genes and have the potential to affect gene function or expression. We have previously published studies that tested about 12,000 SNPs for association with risk of MI, early-onset MI, or coronary stenosis. In the current study we tested 17,576 SNPs that could affect gene function or expression. In order to use genotyping resources efficiently, we staged the testing of these SNPs in three case–control studies of MI. In the first study (762 cases, 857 controls) we tested 17,576 SNPs and found 1,949 SNPs that were associated with MI (P<0.05). We tested these 1,949 SNPs in a second study (579 cases and 1,159 controls) and found that 24 SNPs were associated with MI (1-sided P<0.05) and had the same risk alleles in the first and second study. Finally, we tested these 24 SNPs in a third study (475 cases and 619 controls) and found that 5 SNPs in 4 genes (ENO1, FXN (2 SNPs), HLA-DPB2, and LPA) were associated with MI in the third study (1-sided P<0.05) and had the same risk alleles in all three studies. The false discovery rate for this group of 5 SNPs was 0.23. Thus, we have identified 5 SNPs that merit further examination for their potential association with MI. One of these SNPs (in LPA) has been previously shown to be associated with risk of cardiovascular disease in other studies.


Introduction
Myocardial infarction (MI) is a prevalent and often fatal consequence of coronary heart disease. Each year approximately 865 thousand Americans are diagnosed with MI and about 180 thousand die from the disease [1]. MI occurs when thrombosis, precipitated by ruptured or eroded atherosclerotic plaque, leads to acute ischemia and subsequent necrosis of the myocardium.
Risk factors for MI include age, sex, elevated LDL-cholesterol, hypertension, low HDL-cholesterol, smoking, type 2 diabetes, and family history of cardiovascular disease. Risk of MI has a genetic component, as evidenced by large twin studies [2], which showed that death from cardiovascular disease is more highly correlated among identical twins than among fraternal twins. Among genetic variants that are associated with MI, some can affect traditional risk factors [3,4], but for others the underlying biological explanation for the association is not known [5,6].
The identification of genetic variants that are associated with MI is a challenging task, because variants that are associated with MI are expected to only modestly increase the risk of MI, and because a large number of variants could potentially be tested. Thus, very large studies are needed to detect modest association and account for multiple testing confidently [7]. We have limited the magnitude of multiple testing in the past [8][9][10] by testing SNPs with high prior probability for association with MI. Previously, we described the results from investigating ~12,000 such SNPs [8][9][10]. Here we asked if we could identify SNPs that are associated with MI by investigating ~17,000 SNPs. We identified 5 SNPs that appear to be associated with MI and should be investigated in additional studies of MI.

Objectives
To identify genetic polymorphisms associated with MI, we interrogated three case-control studies comprising cases with a history of MI and controls without a history of MI. The first two case-control studies (Study 1 and Study 2) identified SNPs nominally associated with MI. The hypotheses that these SNPs are associated with MI were tested in Study 3. We determined the allele frequency of each SNP in pools of case and control DNA prior to determining the genotype of a smaller number of SNPs for all individual DNA samples.

SNP Selection
The 17,576 SNPs investigated in Study 1 are located in 10,152 genes. Of these 17,576 SNPs, 2,767 were tested in at least one of 3 previously reported studies [8][9][10]. We previously reported that one of these SNPs (rs3798220 in LPA) was associated with cardiovascular disease [10]. Most of these SNPs (65%) could potentially affect gene function or expression because they cause an amino-acid change in a predicted open reading frame (missense SNPs), or because they are located in exon acceptor or donor splice sites and could change the splicing pattern of predicted open reading frames. We also considered as potentially functional some SNPs that are located in regions that are known to be involved in transcriptional regulation (predicted transcription factor binding sites) or RNA stability (3′ or 5′ untranslated regions, or predicted microRNA binding sites).

Allele Frequency and Genotype Determination
Initially, the allele frequency for each individual SNP was determined for all the cases and all the controls in pools of DNA. Pools were made by mixing equal volumes of standardized DNA from each individual member of the pool. Prior to pooling, DNA concentration for each sample was determined in triplicate using Picogreen fluorescent detection (Invitrogen). Measurements were repeated for samples that had high variation of fluorescence values (5% or greater coefficient of variation). DNA concentrations were determined from mean fluorescence values using a standard curve of salmon sperm DNA. DNA samples were then diluted to 6 ng/mL using automated liquid handling robotics (Beckman Coulter Fx, or Perkin Elmer Multiprobe II). The final concentration was confirmed using Picogreen fluorescent detection. Typically, several unique pools of DNA were made for cases and for controls, each made up of about 50 cases or controls. For each SNP, two real-time PCR reactions were performed, using 3 ng of pooled DNA in each reaction and allele-specific primers. The allele frequency in each pool was calculated from amplification curves for each allele. Genotyping of individual DNA samples was done by performing two real-time PCR reactions for each individual sample, using 0.3 ng DNA from each sample and allele-specific primers.
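As an illustration of how an allele frequency can be read from two allele-specific amplification curves, the following minimal sketch uses a commonly used relation for allele-specific real-time PCR, frequency = 1 / (2^ΔCt + 1), where ΔCt is the difference in threshold cycle between the two reactions. This relation, the assumption of perfect and equal doubling per cycle, the function name, and the threshold-cycle values are all assumptions made for illustration; the text does not specify the exact calculation that was used.

```python
def allele_frequency_from_ct(ct_allele1: float, ct_allele2: float) -> float:
    """Estimate the frequency of allele 1 in a DNA pool.

    Assumes (hypothetically) that both allele-specific reactions double the
    product every cycle, so a reaction that crosses the threshold delta_ct
    cycles later started with 2**delta_ct times less template.
    """
    delta_ct = ct_allele1 - ct_allele2
    return 1.0 / (2.0 ** delta_ct + 1.0)


# Hypothetical threshold cycles for the two allele-specific reactions of one pool.
# Allele 1 crosses the threshold 2 cycles later, so it is the rarer allele.
freq_allele1 = allele_frequency_from_ct(ct_allele1=26.0, ct_allele2=24.0)
print(f"Estimated frequency of allele 1: {freq_allele1:.2f}")
```

In practice, unequal amplification efficiency between the two primers would bias such an estimate, which is one reason pooled estimates are typically followed by individual genotyping, as was done here.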

Ethics
Subjects of all three studies gave written informed consent and completed a questionnaire approved by the Institutional Review

Statistical methods
We assessed association between MI status and allele frequencies by two-tailed χ² tests, and between MI status and genotype by logistic regression using an additive inheritance model (Wald test). In Study 2 and Study 3, since we tested a single prespecified risk allele for each SNP, we present one-sided P values and 90% confidence intervals (for odds ratios greater than one, there is 95% confidence that the true risk estimate is greater than the lower bound of a 90% confidence interval). We used a P threshold value of 0.05 in all three studies, and adjusted for multiple testing by calculating the false discovery rate (FDR) in Study 3. FDR was calculated using the MULTTEST procedure (SAS statistical package Version 9.1); for SNPs that were in the same gene, only the SNP with the higher (less significant) P value was included in the calculation.
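The two tests described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code: the function names are hypothetical, the allele counts are invented, the χ² test is the uncorrected Pearson statistic on a 2×2 allele-count table, and the Wald test uses a normal approximation with a standard error back-calculated from a reported 90% confidence interval. The last example uses the fully adjusted ENO1 estimate reported in Results (OR = 1.09, 90% CI 0.85-1.38).

```python
import math

Z_90 = 1.6448536269514722  # normal quantile bounding a two-sided 90% interval


def allele_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    Returns the statistic and its two-tailed P value. For 1 df the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b)
                   * (case_a + ctrl_a) * (case_b + ctrl_b))
    chi2 = numerator / denominator
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def one_sided_wald(log_or, se):
    """One-sided P value (risk allele increases risk) and 90% CI for the OR."""
    z = log_or / se
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z)
    ci = (math.exp(log_or - Z_90 * se), math.exp(log_or + Z_90 * se))
    return p_one_sided, ci


def se_from_ci90(lower, upper):
    """Recover the standard error of log(OR) from a reported 90% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_90)


# Hypothetical allele counts: risk / other allele in cases, then in controls.
chi2, p_two_tailed = allele_chi2(20, 10, 10, 20)
print(f"chi2 = {chi2:.2f}, two-tailed P = {p_two_tailed:.4f}")

# ENO1 after full adjustment, using the OR and 90% CI reported in Results.
p_eno1, ci_eno1 = one_sided_wald(math.log(1.09), se_from_ci90(0.85, 1.38))
print(f"ENO1 one-sided P = {p_eno1:.2f}")
```

The sketch also makes the parenthetical statement about confidence intervals concrete: a one-sided P value falls below 0.05 exactly when the lower bound of the 90% confidence interval exceeds 1. For ENO1 the recomputed one-sided P value comes out close to the reported 0.28, within the rounding of the published odds ratio and interval.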

Results
We measured the allele frequencies of 17,576 putative functional SNPs in Study 1 cases and controls using pooled DNA samples and identified 1,949 SNPs that were associated with MI (P<0.05) and had minor allele frequency estimates that were greater than 1%. For these 1,949 SNPs, we determined allele frequencies in Study 2 cases and controls using pooled DNA samples and verified that the risk allele identified in Study 1 was also associated with risk of MI in Study 2. For those SNPs that were associated with MI and had the same risk alleles in both pooling studies, we then confirmed the association of the SNP with MI in Study 1 and Study 2 by genotyping individual DNA samples. We found that the risk alleles of 24 SNPs in 23 genes were associated with MI in both studies using an additive inheritance model (Table 2) and a P value threshold of 0.05. Next we tested the hypotheses that the risk alleles of these 24 SNPs would be associated with MI in Study 3. The power to detect association with MI for these 24 SNPs (based on the risk and allele frequency observed in Study 2) ranged from 41% (for rs3812475 in TRMT12) to 83% (for rs725660 in LOC388553). We found that the risk alleles of 5 SNPs in 4 genes (ENO1, FXN (2 SNPs), HLA-DPB2, and LPA) were associated with MI using an additive inheritance model after adjustment for age and sex (Table 3). The false discovery rate for these 5 SNPs was 0.23. The distribution of the genotypes for each of the SNPs did not deviate from what was expected under Hardy-Weinberg equilibrium (P>0.5). Further adjustment for traditional risk factors (dyslipidemia, hypertension, smoking status, and BMI) did not appreciably change the risk estimate for 4 of these SNPs (Table 3, LPA, FXN (2 SNPs), and HLA-DPB2). However, the risk for the ENO1 SNP was not statistically significant after further adjustment for traditional risk factors (OR = 1.09, 90% CI 0.85-1.38, P = 0.28).
Dyslipidemia could be confounding the association of the ENO1 SNP with MI since this SNP trended toward association with dyslipidemia (P = 0.1).
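The Hardy-Weinberg check mentioned in these results can be illustrated with a χ² goodness-of-fit test on genotype counts. The sketch below is one standard way to perform such a check and is an assumption; the text does not state which test was used. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes and returns the statistic
    and its P value; a large P value indicates no detectable deviation.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical genotype counts for one SNP in one study.
chi2, p_value = hwe_chi2_test(370, 460, 170)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

For SNPs with rare minor alleles, where expected counts of the rare homozygote are small, an exact test is usually preferred over this approximation.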

Discussion
We conducted an analysis of 17,576 SNPs that could potentially affect gene function or expression in three case-control studies of MI and identified 5 SNPs in four genes (ENO1, FXN (2 SNPs), HLA-DPB2, and LPA) that were associated with MI. The false discovery rate for this group of 5 SNPs was 0.23, indicating that several of these SNPs are expected to be associated with MI.
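To make the false discovery rate concrete, the sketch below implements the Benjamini-Hochberg step-up adjustment, which is the adjustment requested by the FDR option of the SAS MULTTEST procedure. Whether this particular option was the one used is an assumption, since the methods name only the procedure. The function name and the P values are hypothetical; in the analysis described here the input would hold one P value per gene tested in Study 3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each adjusted value is the smallest FDR at which that test would be
    declared significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical one-sided P values for a handful of tests.
raw = [0.004, 0.03, 0.02, 0.60, 0.35]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"P = {p:.3f} -> FDR-adjusted = {q:.3f}")
```

Read this way, an FDR of 0.23 for a group of 5 SNPs means that roughly one of the five (0.23 × 5 ≈ 1.2) is expected to be a false positive.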
The first SNP is located in ENO1, a gene that encodes α-enolase, a glycolytic enzyme that catalyzes the conversion of 2-phospho-D-glycerate to phosphoenolpyruvate. α-enolase is also known to be a plasminogen receptor on the surface of hematopoietic cells and endothelial cells [11]. Thus, α-enolase could contribute to fibrinolysis, hemostasis, and arterial thrombus formation, processes that are critical in the pathophysiology of MI. The SNP in ENO1 (rs1325920) is located about 1 kb upstream of the gene and could be involved in transcriptional regulation.
Two of the SNPs are in the FXN gene. The FXN gene encodes frataxin, a mitochondrial protein involved in maintaining cellular iron homeostasis [12]. Expanded GAA triplet repeats in intron 1 of FXN lead to silencing of the FXN gene and to accumulation of iron in the mitochondria, which makes mitochondria sensitive to oxidative stress [13]. These changes lead to Friedreich's ataxia, an autosomal recessive disease of the central nervous system that is frequently associated with hypertrophic cardiomyopathy [12]. The two SNPs in FXN found to be associated with MI are located in the 3′ untranslated region of FXN (rs10890) and in a putative transcription factor binding site (rs3793456); thus, one or both of these SNPs could have an effect on FXN gene expression. These two SNPs are in linkage disequilibrium (r² = 0.57 in Study 1) and thus are not independent of one another. Whether these SNPs are associated with increased sensitivity of mitochondria to oxidative stress or with other mild manifestations of Friedreich's ataxia is not known.
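For reference, the linkage disequilibrium measure quoted for the two FXN SNPs is defined from haplotype frequencies as r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), with D = p_AB − p_A p_B. The sketch below evaluates this definition; the function name and the frequencies are hypothetical and are not the FXN data, and with unphased genotypes the haplotype frequency p_AB would first have to be estimated.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation between alleles at two biallelic loci.

    p_a and p_b are the frequencies of allele A at locus 1 and allele B at
    locus 2; p_ab is the frequency of the haplotype carrying both.
    """
    d = p_ab - p_a * p_b  # deviation from independent assortment
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: independence would give p_ab = 0.4 * 0.5 = 0.20.
r2 = ld_r_squared(p_ab=0.35, p_a=0.4, p_b=0.5)
print(f"r^2 = {r2:.3f}")
```

An r² of 0 indicates independent loci and an r² of 1 indicates that each allele at one locus perfectly predicts the allele at the other, so an intermediate value such as the one reported means the two association signals overlap substantially but not completely.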
The fourth SNP (rs3798220 in LPA) encodes an isoleucine to methionine substitution at amino acid 4399 of apolipoprotein(a). We have previously shown that this SNP is associated with coronary artery narrowing and with increased levels of plasma lipoprotein(a) in case-control studies [10]. This SNP was also associated with incident myocardial infarction in the Cardiovascular Health Study (CHS), a population-based prospective study of about 5000 individuals aged 65 or older [14]. The low minor allele frequency of this SNP in LPA (1% in the European American population of CHS [14]) suggests that this SNP accounts for only a small fraction of the total variability of plasma Lp(a) levels. Our previous data showed that Lp(a) levels were 5.9-fold higher in carriers of the 4399 methionine allele than in noncarriers [10], and high levels of Lp(a) are associated with an increased risk for MI [15,16]. Additionally, one can speculate that the association of the 4399 methionine allele with increased risk of disease could also be due to the isoleucine to methionine change in apolipoprotein(a), which may result in a more deleterious form of Lp(a).
Lastly, the fifth SNP (rs35410698), located in HLA-DPB2, was also associated with MI in this study. HLA-DPB2 is a pseudogene in the Human Leukocyte Antigen (HLA) region. The HLA region is a highly polymorphic, gene-rich region. Linkage disequilibrium in this region can extend across hundreds of kilobases and encompass HLA as well as non-HLA genes [17]. Therefore, additional genotyping of SNPs in this region would be needed in order to know which gene variant in this region could be associated with MI.

Limitations
We analyzed case-control studies that were retrospectively collected and did not include fatal cases of MI. Therefore, SNPs specifically associated with fatal MI would not have been identified. There were some differences between the participants in these three studies. Specifically, Study 3 controls were recruited from patients who underwent coronary catheterization, whereas Study 1 and Study 2 controls were recruited from a lipid clinic population and from community centers. Thus, SNPs that were associated with MI in Study 1 and Study 2 but not in Study 3 might be explained by the differences between these studies. For example, a SNP in THBS4 (rs1866389) that was found to be associated with MI in Study 1 and Study 2 but not in Study 3 has been previously reported to be associated with premature MI [18]. However, the power to detect the association of THBS4 with MI in Study 3 was limited (40% power); thus, the lack of association in Study 3 may represent a false negative result. The false discovery rate for the 5 SNPs that were associated with MI in Study 3 was 0.23. Thus, we expect that some of the SNPs we identified could be false positive associations (type 1 error); replication from additional studies is required to validate the observed associations. We have looked for support for these associations in the published data from the Wellcome Trust Case Control Consortium [19]; unfortunately, none of the 5 SNPs we report here was genotyped in that study. Finally, although the SNPs in this study could potentially affect gene function, additional linkage disequilibrium analysis would be needed in order to determine if other SNPs in these regions could better account for the associations with MI we observed.

Conclusion
We identified 5 SNPs in 4 genes that are likely associated with MI. These SNPs merit investigation in additional studies of MI.