MiRNA-Related SNPs and Risk of Esophageal Adenocarcinoma and Barrett’s Esophagus: Post Genome-Wide Association Analysis in the BEACON Consortium

Incidence of esophageal adenocarcinoma (EA) has increased substantially in recent decades. Multiple risk factors have been identified for EA and its precursor, Barrett’s esophagus (BE), such as reflux, European ancestry, male sex, obesity, and tobacco smoking, and several germline genetic variants were recently associated with disease risk. Using data from the Barrett’s and Esophageal Adenocarcinoma Consortium (BEACON) genome-wide association study (GWAS) of 2,515 EA cases, 3,295 BE cases, and 3,207 controls, we examined single nucleotide polymorphisms (SNPs) that potentially affect the biogenesis or biological activity of microRNAs (miRNAs), small non-coding RNAs implicated in post-transcriptional gene regulation, and deregulated in many cancers, including EA. Polymorphisms in three classes of genes were examined for association with risk of EA or BE: miRNA biogenesis genes (157 SNPs, 21 genes); miRNA gene loci (234 SNPs, 210 genes); and miRNA-targeted mRNAs (177 SNPs, 158 genes). Nominal associations (P<0.05) of 29 SNPs with EA risk, and 25 SNPs with BE risk, were observed. None remained significant after correction for multiple comparisons (FDR q>0.50), and we did not find evidence for interactions between variants analyzed and two risk factors for EA/BE (smoking and obesity). This analysis provides the most extensive assessment to date of miRNA-related SNPs in relation to risk of EA and BE. While common genetic variants within components of the miRNA biogenesis core pathway appear unlikely to modulate susceptibility to EA or BE, further studies may be warranted to examine potential associations between unassessed variants in miRNA genes and targets with disease risk.


Introduction
Incidence of esophageal adenocarcinoma (EA) in Western countries has risen sharply in recent decades, while median survival remains less than one year [1]. EA typically arises within a columnar metaplastic precursor epithelium known as Barrett's esophagus (BE). Established risk factors for EA and BE include symptomatic gastroesophageal reflux disease (GERD), European ancestry, male sex, obesity, and tobacco smoking [2]. Less is known about the role of inherited genetic variation and its interplay with environmental factors. Candidate-gene-based studies have associated altered risk of EA or BE with DNA polymorphisms in genes that function in a wide range of biological pathways (inflammation, detoxification, DNA repair, angiogenesis, and apoptosis) [3][4][5][6][7][8][9][10][11][12][13], while a recent linkage-based genetic analysis of sibling pairs provided preliminary evidence for several novel germline mutations [14]. These studies have been limited by small sample sizes and the need for validation. In the last few years, large genome-wide association studies of BE and EA identified multiple SNPs significantly associated with disease risk [15][16][17][18], including variants located in three transcription factors, a transcriptional co-activator, and the major histocompatibility complex locus, none of which were previously implicated by candidate-based studies.
A large body of work has established that microRNAs (miRNAs), small non-coding RNAs that function in post-transcriptional gene regulation, act as oncogenes or tumor suppressors in a variety of tissues, and their deregulation can lead to cancer [19,20]. MiRNA gene loci are transcribed by RNA polymerase II to generate primary (pri-) miRNA transcripts, which are then cleaved by the nuclear RNAse DROSHA complex to form stem-loop precursor (pre-) miRNAs. Following nuclear export to the cytoplasm, pre-miRNAs undergo further processing by the DICER complex to generate mature miRNAs, which typically bind to the 3' untranslated region of target messenger RNAs (mRNAs) and mediate translational repression or RNA degradation. MiRNAs have been linked to inflammatory pathways [21,22], which are likely to play important roles in BE/EA. Changes in miRNA expression have been detected at multiple stages in the development of EA [23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38], and certain miRNAs may be associated with prognosis [33,35].
Many studies have reported associations between miRNA-related SNPs and risk of diverse cancers [39]. These SNPs may reside in a) miRNA biogenesis genes, b) miRNA gene loci, or c) miRNA-targeted mRNAs. Functional miRNA-related SNPs may affect global miRNA expression levels, processing or expression of individual miRNAs, and miRNA target gene specificity. A previous case-control study of 346 esophageal cancer cases (86% EA) and an equal number of matched controls reported that seven miRNA-related SNPs from a panel of 41 total SNPs tested were associated with altered risk of esophageal carcinoma, with the association of one SNP in the pre-miRNA-423 region remaining statistically significant after correction for multiple comparisons [40]. These SNPs have not been validated in independent study populations or evaluated for potential associations with BE, and it is currently unknown whether additional SNPs in the miRNA pathway may modulate disease risk. Using data from a recent genome-wide association study (GWAS) of 2,515 EA cases, 3,295 BE cases, and 3,207 controls [16], we selected 157 biogenesis pathway SNPs, 234 miRNA SNPs, and 177 mRNA target SNPs and assessed their associations with risks of EA and BE.

Study population and SNP genotyping
The Barrett's and Esophageal Adenocarcinoma Genetic Susceptibility Study (BEAGESS) included men and women diagnosed with EA or BE, and control participants pooled from 14 individual studies conducted in Western Europe, Australia, and North America over the past twenty years. Detailed study population characteristics and genotyping protocols have been published [16]. Briefly, all EA and BE case participants were confirmed by histologic examination, and a set of population control individuals was drawn from the included Barrett's and Esophageal Adenocarcinoma Consortium (BEACON) studies to serve as a comparison group for both EA and BE case participants. The current analysis employed a pooled dataset that has been described previously [41], and included all BEAGESS participants, additional BE and EA patients from the UK Barrett's Esophagus Gene Study and the UK Stomach and Oesophageal Cancer Study (SOCS), respectively [16], and additional controls from a hospital-based case-control study of melanoma conducted at the MD Anderson Cancer Center (Houston, TX) [42]. Genotyping of buffy coat or whole blood DNA from all participants was conducted using the Illumina Omni1M Quad platform, in accordance with standard quality control procedures [43]. All participants gave written informed consent, and this project was approved by the ethics review board of the Fred Hutchinson Cancer Research Center. We selected all unrelated participants with missing genotyping call rates < 2%; thus the final study sample included 2,515 EA cases, 3,295 BE cases, and 3,207 controls. Three control participants were excluded from analyses involving BE cases, because of familial relation to cases.

Selection of miRNA-related SNPs
SNPs selected for this study are located in or near a) miRNA biogenesis genes (+/-2kb), b) miRNA gene loci (+/-25bp), or c) predicted or verified mRNA targets of miRNAs (S1 Fig). We excluded from consideration SNPs that failed Illumina quality measures or standard quality control procedures [43]. Specifically, SNPs were excluded if any of the following criteria were satisfied: i) Illumina GenTrain score < 0.6 or cluster separation < 0.4; ii) >5% missing call rate over samples; iii) discordant genotype calls in any pair of duplicate study samples; iv) Mendelian error in either one of the HapMap QC trios or the small number of families identified in the BEACON data; v) significant departure from Hardy-Weinberg Equilibrium (P<10⁻⁴); vi) minor allele frequency (MAF) < 1%.
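The six exclusion criteria listed above can be read as a single pass/fail predicate per SNP. The following is a minimal sketch of that logic, not the pipeline used in the study; the `SnpQc` record and its field names are hypothetical, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality metrics (hypothetical field names, for illustration only)."""
    gentrain: float               # Illumina GenTrain score
    cluster_sep: float            # Illumina cluster separation
    missing_rate: float           # fraction of samples with no call
    duplicate_discordance: bool   # any discordant call among duplicate sample pairs
    mendelian_error: bool         # any Mendelian error in trios or families
    hwe_p: float                  # Hardy-Weinberg equilibrium test P value
    maf: float                    # minor allele frequency


def passes_qc(s: SnpQc) -> bool:
    """Return True only if none of exclusion criteria i) to vi) is met."""
    excluded = (
        s.gentrain < 0.6 or s.cluster_sep < 0.4   # i)
        or s.missing_rate > 0.05                  # ii)
        or s.duplicate_discordance                # iii)
        or s.mendelian_error                      # iv)
        or s.hwe_p < 1e-4                         # v)
        or s.maf < 0.01                           # vi)
    )
    return not excluded
```

A SNP is retained only when every criterion is cleared, so a rare variant (MAF below 1%) is dropped even if its genotyping quality is otherwise high.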
To identify potentially functional SNPs in miRNA-targeted mRNAs, we used a database of polymorphisms predicted to alter miRNA-mRNA regulation [44]. Two filters were imposed to limit the set of SNPs to those most likely to be functional. First, we considered miRNA-mRNA interactions only for miRNAs shown to be expressed in the esophagus at some point in the disease progression from normal squamous epithelium to BE/EA. Based on the union of several published reports [32,34,35,36,37,38], 135 expressed miRNAs were identified (S1 Table). Second, after the previous exclusions, we considered only a subset of miRNA-mRNA interactions (3%) that were predicted to be most strongly affected by genetic variants in the target mRNA (ΔS>0.85, where S is the predicted regulation score for a given miRNA:mRNA pairing). LD-based pruning of these filtered SNPs (n = 175) was performed as described previously (145/175 SNPs were retained).
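The two filters described above (expression of the miRNA in esophageal tissue, then ΔS>0.85) can be sketched as a simple selection over predicted interactions. This is an illustrative sketch only: the record layout (`snp`, `mirna`, `delta_s` keys), the function name, and the example identifiers are assumptions and do not reflect the actual schema of the database in [44]. LD-based pruning is not shown.

```python
def select_target_snps(predictions, expressed_mirnas, min_delta_s=0.85):
    """Keep SNPs with at least one predicted interaction passing both filters.

    predictions: iterable of dicts with hypothetical keys 'snp', 'mirna', 'delta_s'.
    expressed_mirnas: set of miRNA names reported as expressed in the esophagus.
    """
    kept = set()
    for p in predictions:
        # Filter 1: miRNA must be in the expressed set.
        if p["mirna"] not in expressed_mirnas:
            continue
        # Filter 2: predicted change in regulation score must exceed the cutoff.
        if p["delta_s"] > min_delta_s:
            kept.add(p["snp"])
    return sorted(kept)


# Hypothetical toy records (identifiers are placeholders, not study data).
toy_predictions = [
    {"snp": "snp_A", "mirna": "miR-630", "delta_s": 0.91},
    {"snp": "snp_B", "mirna": "miR-630", "delta_s": 0.40},
    {"snp": "snp_C", "mirna": "miR-not-expressed", "delta_s": 0.99},
]
selected = select_target_snps(toy_predictions, {"miR-630"})
```

In this toy input only `snp_A` is retained: `snp_B` falls below the ΔS cutoff, and `snp_C` is paired with a miRNA outside the expressed set.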
A literature search was also conducted to identify miRNA-related SNPs shown to be associated with cancer susceptibility (S2 Table) [39, . Variants not already captured by our selection process were added if available in the Omni1M dataset, or proxy SNPs in high LD were substituted where possible, as indicated (S3 Table). The final set of polymorphisms included 157 biogenesis pathway SNPs (21 genes), 234 miRNA SNPs (210 genes), and 177 target mRNA SNPs (158 genes). Minor and major alleles were reported throughout using the 'plus' strand designation.

Statistical analysis
Unconditional multivariate logistic regression was used to compute odds ratios for risk of EA or BE associated with a given SNP variant, using an additive model (per-allele), while adjusting for age, sex, and the first four principal components (PCs) derived from principal components analysis (PCA) to account for population stratification by ancestry [41]. Inclusion of the first four PCs was empirical, as described previously in the parent GWAS report [16], and based on assessment of i) the percentage of variance accounted for (S2 Fig) and ii) the observed separation of study participants according to pairwise comparisons of individual principal components (S3 Fig). To correct for multiple comparisons when assessing statistical significance, we used the Bonferroni or Benjamini-Hochberg false-discovery rate (FDR) methods [68,69]. The threshold for Bonferroni significance was α = 0.05/568 = 8.80×10⁻⁵. Stratified analyses by body mass index (BMI, kg/m²) and history of cigarette smoking were also conducted, and evidence for interaction was assessed by including a product term in the logistic regression model. Smoking history and BMI were defined categorically in stratified analyses (smoking: ever/never, or pack-years: 0, >0 & <15, 15-29, 30-44, 45+; BMI: <25, 25-29, 30-34, 35+), and continuously (pack-years, BMI) to test for interaction. Statistical analyses were conducted using STATA/SE version 13.1 (College Station, TX).
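To illustrate the per-allele (additive) model described above, the following minimal sketch fits an unconditional logistic regression by Newton-Raphson in plain Python and converts the genotype coefficient to an odds ratio. It is not the STATA code used in the study; the function names (`solve_linear`, `fit_logistic`) and the toy data are hypothetical. Adjustment covariates (age, sex, PCs) or a SNP-by-exposure product term would simply be appended as extra columns of each row.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic fit; an intercept column is added here."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical toy data: minor-allele count per person and case (1) / control (0) status.
toy_genotypes = [0] * 30 + [1] * 30
toy_status = [1] * 10 + [0] * 20 + [1] * 20 + [0] * 10
beta = fit_logistic([[g] for g in toy_genotypes], toy_status)
per_allele_or = math.exp(beta[1])  # odds ratio per additional minor allele
```

With only two genotype classes in the toy data, the fitted per-allele odds ratio coincides with the cross-product ratio of the 2x2 table, (20/10)/(10/20) = 4, which gives a quick sanity check on the fit.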

Characteristics of study participants
The distributions of demographic characteristics among controls, EA cases, and BE cases are shown in Table 1. EA cases were somewhat older (mean age: 65.0 years) and more frequently male (87%) relative to controls (mean age: 58.4 years, 73% male) and BE cases (mean age: 63.0 years, 76% male). Among participants with non-missing data for BMI and smoking history, the percentage reporting ever having smoked cigarettes was higher among EA (75%) and BE (66%) cases than among controls (59%). Heavy smoking (45+ pack years) was more prevalent among EA cases (21%) than among controls (14%) or BE cases (14%), while obesity (BMI 30+) was more prevalent among EA (30%) and BE (37%) cases than among controls (20%).

Associations of miRNA-related SNPs and risk of EA or BE
Of the 157 biogenesis pathway SNPs, 234 miRNA SNPs, and 177 mRNA target SNPs evaluated in this study (S3 Table), 29 were nominally associated (P<0.05) with risk of EA (Table 2), and 25 with risk of BE (Table 3). Three SNPs satisfied P<0.05 in both analyses (EA and BE): RDH8 rs1644730 T>A, miR-3117 rs7526812 T>C, and miR-4467 rs12534337 G>A. For each of these SNPs, the risk estimates for EA and BE were similar and in the same direction (S4 Table). After correction for multiple comparisons (n = 568 total SNPs), using the Bonferroni or false discovery rate method, none of the observed associations remained statistically significant (FDR q>0.50).
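The multiple-comparison corrections referred to above can be sketched in a few lines. The block below computes the Bonferroni threshold for 568 tests, the number of nominal (P<0.05) hits expected if no SNP were truly associated, and Benjamini-Hochberg q-values for a list of P values. The function names and the toy P values are hypothetical; this is an illustration of the standard procedures, not the study's code.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values (q-values), in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q


n_snps = 568
threshold = bonferroni_threshold(0.05, n_snps)   # about 8.80e-5
expected_nominal = 0.05 * n_snps                 # about 28.4 hits expected under the global null

toy_p = [0.01, 0.04, 0.03, 0.005]                # hypothetical P values
toy_q = bh_qvalues(toy_p)
```

Under the global null, roughly 28 of 568 tests would be expected to reach P<0.05 by chance, which is close to the 29 (EA) and 25 (BE) nominal associations observed and is consistent with none surviving correction.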

Discussion
Using genotyping data from a recent consortium-based GWAS, we evaluated the association of 568 miRNA-related SNPs with risks of EA and BE. In total, 29 SNPs were found to be nominally associated (P<0.05) with risk of EA, and 25 with risk of BE, with three shared variants identified between these analyses. None remained significant after correction for multiple comparisons. Aberrant expression of miRNAs has been reported in many cancers, and several studies have described miRNA expression changes at specific stages in the development of EA, which may be associated with progression or prognosis [24,29,31,33,35,36]. Inherited genetic variation in the miRNA pathway has been linked to altered susceptibility to a variety of cancers, but few studies have focused on esophageal cancer, and in particular, EA (as opposed to esophageal squamous cell carcinoma, ESCC) [39,40,70,71]. The largest previous study included 346 esophageal cancer cases (86% EA, n = 296) [40], and identified seven SNPs significantly associated with cancer risk, five of which were associated specifically with EA. Of those, a single SNP in the pre-miR-423 region remained significant after adjustment for multiple comparisons. This variant did not replicate in the present study of >2,500 EA cases and >3,200 controls.
Since BE is an established risk factor for, and the only known precursor of, EA, it was of interest to compare the list of SNPs nominally associated with risk of each condition. For each of the three variants reaching P<0.05 in both analyses (RDH8 rs1644730 T>A, miR-3117 rs7526812 T>C, and miR-4467 rs12534337 G>A), the direction and magnitude of the OR were very similar for BE and EA (S4 Table), raising the possibility that the association with risk of BE might account for the association with risk of EA. However, given our use of a single control group for comparison to both the EA and BE case groups, cautious interpretation is required, since some shared associations could be expected by chance alone. When EA and BE cases were combined into a single case group and compared to the same set of controls, all three of these variants were highly ranked among the set of 31 nominally significant SNPs (S7 Table), and two of the three (rs7526812 T>C, rs12534337 G>A) had smaller P values than observed in the individual analyses, without reaching Bonferroni or FDR significance thresholds. rs1644730 T>A is located within a predicted miR-630 binding site in the 3'UTR of all-trans-retinol dehydrogenase 8 (RDH8), which encodes a short-chain dehydrogenase/reductase enzyme involved in rhodopsin regeneration in the vision pathway. Of interest, the enzymatic activity of RDH8 has also been linked to estrogen biosynthesis [72]. Given the significantly higher incidence of BE/EA among males versus females, a potential protective effect of estrogens has been proposed, and some evidence exists for reduced disease risk associated with hormone replacement therapy [73]. An intriguing hypothesis relates to whether the suggested inverse association observed in this study for rs1644730 T>A may reflect impaired miRNA-mediated repression of RDH8 and consequent elevations in estrogen synthesis.
In exploratory analyses stratified by sex, the inverse association of this variant with risk of EA appeared stronger among males (OR = 0.86, p = 0.00057, q = 0.23) relative to the overall study sample (OR = 0.88, p = 0.0011, q = 0.54), and was not evident among the limited pool (n = 1200) of female participants (OR = 0.97, p = 0.79). rs12534337 G>A is located in the miR-4467 precursor, identified by deep sequencing of the small RNA transcriptome of normal and malignant B cells, and paired normal and tumor breast tissue (68,69). Expression of miR-4467 has not been assessed in the esophagus, and no studies have examined functional effects of this miRNA. The proto-oncogene JUNB mRNA is predicted by the TargetScan algorithm to be one of 23 conserved targets of miR-4467, but experimental validation has not been reported. Any influence of this SNP on the biogenesis of miR-4467 remains to be determined. rs7526812 T>C is located ~15 kb downstream from, and is in strong linkage disequilibrium (r² = 1.0) with, a variant in the miR-3117 gene locus (rs12402181). rs7526812 T>C is also a missense polymorphism situated within an exon of the SGIP gene, which has been genetically linked to fat mass [74].
Five of the seven SNPs reported to be associated with risk of esophageal cancer (EA/ESCC) by Ye et al. [40] were included in our analysis of genotyped SNPs from the BEACON GWAS, including their single SNP (pre-miR-423 rs6505162) reported as significant after correction for multiple comparisons. Three of these five SNPs had been associated with EA (miR-423 rs6505162, miR-196a-2 rs11614913, and RAN rs14035), while two reached borderline significance (XPO5 rs11077 and pri-miR-219-1 rs213210). In our study, none of these five SNPs were (nominally) associated (P<0.05) with either EA or BE (S8 Table), and nearly all ORs were very close to 1. Ye et al. evaluated three genetic models (additive, recessive, dominant) and reported the best-fitting model in their analysis, in contrast to our approach of assessing exclusively the additive model. Re-analysis of these five SNPs using the alternative models resulted in one polymorphism reaching borderline significance (pri-miR-219-1 rs213210 T>C, dominant model, EA OR 1.17, 95% CI 1.01-1.47, P = 0.06), in the same direction as the previously published results (OR 1.61, P = 0.058). Ye et al. did not evaluate any of the nominally-associated variants identified in our analysis. Multiple factors could account for discrepancies between studies. First, many apparent signals in association studies may be false positives, and replication in large, independent populations is critical (our EA analysis included more than eight times as many cases as the previous study). Second, different approaches were taken in adjustment for covariates. While both studies included only individuals of European ancestry, we also adjusted for population stratification via inclusion of several PCs derived from principal components analysis, but chose not to adjust for smoking status.
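The additive, dominant, and recessive models mentioned above differ only in how the minor-allele count (0, 1, or 2) is coded before it enters the regression. A minimal sketch of that coding follows; the function name `code_genotype` is hypothetical.

```python
def code_genotype(minor_allele_count, model):
    """Map a minor-allele count (0, 1, 2) to the covariate used by each genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        # Per-allele effect: each extra minor allele multiplies the odds by the same factor.
        return minor_allele_count
    if model == "dominant":
        # Carriers of one or two minor alleles versus non-carriers.
        return 1 if minor_allele_count >= 1 else 0
    if model == "recessive":
        # Minor-allele homozygotes versus everyone else.
        return 1 if minor_allele_count == 2 else 0
    raise ValueError("unknown model: " + model)


codings = {m: [code_genotype(g, m) for g in (0, 1, 2)]
           for m in ("additive", "dominant", "recessive")}
```

Because reporting the best-fitting of three codings gives each SNP three chances to reach nominal significance, results obtained that way are not directly comparable to results from a single pre-specified additive model.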
Strengths of our study included the use of pooled data from the BEACON GWAS, which provided the largest sample size to date in the evaluation of miRNA-related SNPs and risks of EA and BE. Inclusion of both BE and EA cases allowed for a comparison of the genetic variation associated with risk of a neoplastic precursor lesion and the cancer that arises from it. The availability of covariate data for smoking history and BMI further enabled us to evaluate gene-environment interactions for two established risk factors for these conditions. Our assessment of 568 polymorphisms significantly expands upon past efforts to examine genetic variation in miRNA pathways in relation to risk of Barrett's esophagus and esophageal adenocarcinoma. In particular, our inclusion of 157 SNPs within 21 genes implicated in miRNA biogenesis allowed for broad coverage of this canonical core pathway, activity of which underlies the global expression of miRNAs.
This study also had certain limitations. First, while the EA/BE GWAS employed in our analyses represents the largest study of its kind currently available, it remains plausible that among the variants examined, some true associations of small magnitude were missed due to power constraints, especially those present exclusively in population sub-groups. Missing data for smoking/obesity in a fraction of participants further restricted the size of our study sample for stratified analyses and assessment of effect modification. Second, our examination of genetic variation within miRNA gene loci and miRNA-targeted mRNAs was not comprehensive. Of the ~1600 miRNAs deposited in miRBase, only ~200 were represented in our analysis. We used stringent rather than exhaustive inclusion criteria for selecting miRNA-targeted mRNAs for analysis, and prioritized targets for which expression of cognate miRNAs has been reported in esophageal tissue. While this category of variants could have been vastly expanded based on bioinformatic predictions, the majority of such predicted miRNA:mRNA interactions are not validated, and are of unknown biological significance.
In sum, our results do not provide evidence for an association between common genetic variation in components of the miRNA biogenesis core pathway and altered risk of EA or BE. Additional studies may be justified to examine potential associations between unassessed germline variants in miRNA gene loci and miRNA target genes and risk of these conditions. Validation in independent study populations, coupled with functional evaluation of the biological effects of specific SNPs, would be needed to substantiate any polymorphisms as causal risk factors for BE or EA.
his past studies of EA and BE (FINBAR Study). Genotyping data for MD Anderson controls [42] was obtained from dbGaP through accession number phs000187.v1.p1.