rs4919510 in hsa-mir-608 Is Associated with Outcome but Not Risk of Colorectal Cancer

Background Colorectal cancer is the third most commonly diagnosed cancer and the third leading cause of cancer-related death in the United States. MicroRNAs, a class of small non-coding RNAs, have been implicated in the pathogenesis and prognosis of colorectal cancer, although few studies have examined the relationship between germline variation in microRNAs and risk or prognosis. We therefore investigated the association between a SNP in hsa-mir-608, which lies within the 10q24 locus, and colorectal cancer. Methods and Results A cohort consisting of 245 cases and 446 controls was genotyped for rs4919510. The frequency of the GG genotype was significantly higher in African American (15%) than in Caucasian (3%) controls. There was no significant association between rs4919510 and colorectal cancer risk (African American: OR GG vs. CC 0.89 [95% CI, 0.41–1.80]; Caucasian: OR GG vs. CC 1.76 [95% CI, 0.48–6.39]). However, we did observe an association with survival. The GG genotype was associated with an increased risk of death in Caucasians (HR GG vs. CC 3.54 [95% CI, 1.38–9.12]) and with a non-significant trend toward a reduced risk of death in African Americans (HR GG vs. CC 0.36 [95% CI, 0.12–1.07]). Conclusions These results suggest that rs4919510 may be associated with colorectal cancer survival in a manner that is dependent on race.


Introduction
Colorectal carcinoma is the third most commonly diagnosed cancer and the third leading cause of cancer-related mortality in the United States. Estimates for 2011 predict 141,210 new cases of colorectal cancer and 49,380 deaths [1]. In the past ten years, there have been significant advances in our understanding of the natural history and molecular mechanisms underlying colorectal cancer development. In addition, substantive guidelines for screening have been developed by the US Preventive Services Task Force. Therapeutic improvements have also been achieved, particularly after the 2004 approval of bevacizumab (Avastin®) for advanced disease. Despite these advances, the five-year survival for colorectal cancer has increased only ~5 percentage points in recent years [1].
Although this survival trend holds true for both African Americans and Caucasians, a readily apparent health disparity exists for African Americans. By age 65, African Americans have an approximately 25% greater probability of developing colorectal carcinoma compared to Caucasians [1,2]. Additionally, colorectal cancer incidence in African Americans in 2007 was estimated to be essentially the same as in 1975, whereas Caucasians have experienced a 17-percentage point drop in incidence. Twenty-four percent of colorectal cancers are detected at late stage in African Americans, compared to 19% in Caucasians. This may in part explain the significantly worse overall survival rate in African Americans, who have a 15% increased risk of dying from colorectal cancer when compared to Caucasians [1,2].
MicroRNAs (miRNAs) are a class of small, non-coding RNAs consisting of approximately 22-25 nucleotides in their processed, mature form. To date, over 1000 miRNAs have been discovered in humans, with some estimates predicting a final count of several thousand [3,4]. Physiologically, miRNAs act as a rheostat, fine-tuning translational output through targeted mRNA binding and repression. The target repertoire of a miRNA is defined primarily by its seed sequence, approximately nucleotides 2-7 at the 5′ end of its mature form. A variety of pathologic associations have been attributed to altered miRNA networks, particularly in cancer, with miRNAs able to function as both oncogenes and tumor suppressors [5,6,7].
Numerous studies have linked both aberrant expression and genetic variation in miRNAs to colorectal cancer risk, diagnosis, prognosis, and drug response [8,9,10,11]. Single nucleotide polymorphisms (SNPs) in miRNAs can affect their biogenesis, processing, and/or target site binding in a variety of ways, as highlighted in our recent review [12]. For example, since microRNAs are processed in a step-wise fashion from pri to pre to a mature strand, a process guided by stereotypical secondary structure, one can conceptualize how a single base pair change could affect processing or recognition by guide components [12]. Alternatively, a SNP in the mature sequence can alter target site interactions by either strengthening or weakening hybridization kinetics, or if in the seed sequence, it can significantly transform the target library of the miRNA itself. Interestingly, the prevalence of these SNPs in miRNAs is significantly lower than predicted in the remainder of the genome, speaking to the evolutionary conservation and importance of these small biomolecules [13,14]. However, a few reports have recently highlighted an association between these low frequency germline variations and cancer risk and/or prognosis [12].
Loss of heterozygosity in the 10q24 locus has been reported in a number of human cancers, including but not limited to colorectal, prostate, pancreatic, and brain [15,16,17,18]. Despite numerous investigations into a possible tumor suppressor gene in this hotspot, strong evidence is lacking [19]. Interestingly, hsa-mir-608, a microRNA about which virtually nothing is known functionally, lies within an intron of SEMA4G in this region. Furthermore, hsa-mir-608 harbors a SNP, rs4919510, at nucleotide 22 of its 25-nucleotide mature sequence. This C-G polymorphism is common in several populations. Here we report that the rs4919510 germline polymorphism within hsa-mir-608 is associated with colorectal cancer survival in a race-specific manner.

Ethics statement
All participants gave written informed consent. The protocol was in compliance with the Declaration of Helsinki and approved by the Institutional Review Boards of the National Cancer Institute. The ethics exemption number is 11289.

Study Population: The NCI-University of Maryland Colorectal Cancer Case-Control Study
The study population consisted of 691 subjects. Incident colorectal cancer cases (n = 245) and controls (n = 446) were recruited during 1992-2003 and 1998-2003, respectively, from the greater Baltimore, Maryland area. The controls were accrued from both a hospital setting (n = 236) and a community setting (n = 210). The inclusion and exclusion criteria have been previously described [20]. In brief, subjects were self-reported Caucasians or African Americans born in the United States, and were excluded if they self-reported a history of cancer other than colorectal, HIV, HBV, HCV, or IV drug use, were institutionalized, or had a mental impairment. Information to determine disease stage, treatment, and survival was obtained from medical records and pathology reports, the Social Security Death Index, and the National Death Index. Disease staging was completed according to the tumor-node-metastasis system of the American Joint Committee on Cancer. The survival period was determined from date of hospital admission for surgery to date of last completed search for death entries in the Social Security Death Index (2010). Informed consent was obtained from all participants, and epidemiological questionnaires covering personal history, family medical history, past medical history, tobacco history, dietary information, and information on work environment were administered to all subjects. The study was approved by the institutional review boards of the participating institutions.
Genotyping. Genomic DNA was isolated from buffy coat or colorectal tissue using the Qiagen FlexiGene DNA Kit or the DNeasy tissue kit, respectively (Qiagen, Valencia, CA). Cases and controls were genotyped for rs4919510 in mir-608 using the TaqMan assay (Life Technologies, Carlsbad, CA) at the Ohio State University Genotyping Core. The case, control, negative control, and duplicate samples were randomly distributed for order of processing, with 10% duplicates to test both inter- and intra-plate concordance. All parties involved in genotyping were blinded to the case, control, and duplicate status of the samples. Samples that failed to genotype were recorded as undetermined.
Statistical Analysis. Statistical analyses were performed using STATA 11.0 (College Station, TX). A p-value of less than 0.05 was used as the criterion for statistical significance, and all statistical tests were two-sided. Departures from Hardy-Weinberg equilibrium were determined using a χ² test. Odds ratios (OR) and their corresponding 95% confidence intervals (CI) were estimated using a univariable unconditional logistic regression model and an adjusted model that included the covariates age (continuous) and gender (categorical). Hazard ratios (HR) and 95% CI were estimated using a univariable Cox proportional hazards regression model and an adjusted model for the covariates age (continuous), gender (categorical) and stage of disease (categorical). Survival time was calculated from date of surgery to date of either last known follow up (last NDI update 12/31/2008) or date of death due to colorectal cancer. Of the 245 patients, there were 117 events, 52 of which were in African Americans, 45 of which were

Study population and miR-608 genotype analysis
We investigated the relationship between rs4919510, a SNP in the mature sequence of hsa-mir-608, and colorectal cancer risk and disease outcome in 245 colorectal cancer cases and 446 controls. Relevant population characteristics are described in Table 1. Overall, the cases did not differ from the controls in terms of age or race; however, the case cohort contained significantly more males than each respective control group (population, hospital, or total) (Table 1). The minor allele frequency [22] of rs4919510 was significantly higher in African Americans than in Caucasians: 40.5% and 17.5%, respectively (Table 2). In addition, the frequency of the GG genotype was significantly higher in African American (15%) than in Caucasian (3%) controls, in agreement with observations for this SNP in HapMap [23]. Since the frequencies of rs4919510 genotypes did not differ between hospital and population control groups, all analyses are presented using the overall control group (Tables 2 and 3). rs4919510 did not deviate from Hardy-Weinberg equilibrium proportions in either the population or hospital controls.
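As an illustration of the Hardy-Weinberg check described above, the following minimal sketch computes the G allele frequency and a 1-degree-of-freedom χ² goodness-of-fit statistic from genotype counts. It is not the STATA procedure used in the study; the function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from Table 2.

```python
import math

def hwe_chi_square(n_cc, n_cg, n_gg):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions.

    Returns the G allele frequency, the chi-square statistic, and its p-value.
    Assumes both alleles are present (all expected counts are non-zero).
    """
    n = n_cc + n_cg + n_gg
    # Each GG carrier contributes two G alleles, each CG carrier one.
    q = (2 * n_gg + n_cg) / (2 * n)
    p = 1 - q
    observed = (n_cc, n_cg, n_gg)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value

# Hypothetical genotype counts, chosen only for illustration.
maf, chi2, p_value = hwe_chi_square(n_cc=70, n_cg=100, n_gg=30)
print(f"G allele frequency = {maf:.3f}, chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05 in such a test would be read, as in the study, as no detectable departure from Hardy-Weinberg proportions.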

No association between rs4919510 and colorectal cancer risk
The GG and CG genotypes of rs4919510 were not associated with colorectal cancer risk (Table 2). In the total population, the frequency of the GG genotype was 8% in cases and 8% in controls. In African Americans, the GG genotype was present in 15% of controls and 13% of cases, corresponding to an adjusted OR of 0.89 (95% CI, 0.41-1.80). In Caucasians, the GG genotype was present in 3% of controls and 5% of cases, with an OR of 1.76 (95% CI, 0.48-6.39). There was no association between the SNP and KRAS mutation status in the whole cohort. In African Americans, it was notable that the GG genotype was not observed in mutant KRAS tumors, but the difference did not reach statistical significance (Figure S1).
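To make the odds ratio comparison concrete, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table of GG versus CC counts in cases and controls. For a single binary predictor, this cross-product ratio matches the estimate from a univariable logistic regression; the age- and gender-adjusted estimates reported here require a regression model and are not reproduced. The function name `odds_ratio_wald` and all counts are hypothetical.

```python
import math

def odds_ratio_wald(case_gg, case_cc, control_gg, control_cc, z_crit=1.96):
    """Unadjusted odds ratio (GG vs. CC) with a Wald confidence interval.

    All four cell counts must be positive.
    """
    odds_ratio = (case_gg * control_cc) / (case_cc * control_gg)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_gg + 1 / case_cc + 1 / control_gg + 1 / control_cc)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from Table 2.
or_gg, ci_low, ci_high = odds_ratio_wald(case_gg=12, case_cc=40,
                                         control_gg=20, control_cc=80)
print(f"OR = {or_gg:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

An interval that spans 1, as in both racial groups in this study, indicates that the data are compatible with no association between genotype and risk.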

Association between rs4919510 and colorectal cancer survival
We found a significant association between rs4919510 and colorectal cancer survival. In Caucasians, the homozygous variant genotype, GG, was associated with a significant increase in risk of death from colorectal cancer, represented by a univariable hazard ratio (HR) of 3.54 (95% CI, 1.38-9.12), which remained significant after adjustment for age, gender, histology and stage in our multivariable model: HR GG vs. CC 2.95 (95% CI, 1.13-7.71) (Table 3, Figure 1A). The CG genotype also showed a trend towards poor outcome, although the comparison to the CC referent genotype was not statistically significant (Table 3).
In African Americans, we observed a protective association between the GG genotype and survival, although the association only approached significance in both the adjusted (HR GG vs. CC 0.38 [95% CI, 0.13-1.13; p = 0.082]) and unadjusted (HR GG vs. CC 0.36 [95% CI, 0.12-1.07; p = 0.066]) models (Table 3, Figure 1B). Kaplan-Meier survival curves graphically depict the divergent effects of this SNP in these populations (Figure 1). No significant survival associations were found when rs4919510 was analyzed without stratification by race, or when the cohort was stratified by cancer stage (Table 3 and data not shown).
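The relationship between the reported hazard ratios, their confidence intervals, and the p-values can be checked with a short calculation. The sketch below assumes the intervals are Wald-type (symmetric on the log scale), back-calculates the standard error of log(HR) from the interval width, and derives a two-sided normal p-value. The function name `wald_from_ci` is hypothetical, and because the inputs are rounded published values, the results are only approximate.

```python
import math

def wald_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z statistic, and two-sided p-value from a ratio and its 95% CI.

    Assumes the interval was computed as exp(log(estimate) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value

# Unadjusted GG vs. CC hazard ratios as reported in the text.
se_aa, z_aa, p_aa = wald_from_ci(0.36, 0.12, 1.07)   # African Americans
se_ca, z_ca, p_ca = wald_from_ci(3.54, 1.38, 9.12)   # Caucasians
print(f"African Americans: z = {z_aa:.2f}, p = {p_aa:.3f}")
print(f"Caucasians:        z = {z_ca:.2f}, p = {p_ca:.3f}")
```

Under these assumptions the African American estimate gives a p-value of roughly 0.07, close to the reported p = 0.066, while the Caucasian estimate falls below 0.05, consistent with the significance statements above.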

Discussion
Our results show an association between a germline variant in hsa-mir-608 and prognosis, but not risk, of colorectal cancer. In Caucasians, the variant was associated with an increased risk of death due to colorectal cancer. However, a trend towards the opposite effect was observed in African Americans. Previous studies have also demonstrated how race can modulate the effect of a SNP in colorectal cancer [24]. Of note, however, while the association was statistically significant in Caucasians (n = 145), it only approached significance in African Americans. This could suggest that the relationship is not epidemiologically relevant, but given the smaller number of individuals in the African American group (n = 94), it is perhaps more reflective of our reduced power. In both populations, the SNP was associated with survival in a recessive model.
While this manuscript was in preparation, Xing et al. published a study on colorectal cancer prognosis in association with rs4919510 within a Chinese population of 408 patients [25]. They reported that the variant allele (C) was associated with a reduced risk of recurrence and death (HR = 0.61, 95% CI = 0.41-0.92); therefore, the G allele was associated with adverse outcome. This is in general agreement with our findings in Caucasians. Xing et al. did not include a control arm to examine risk associations, though as described above, our results do not support a role for this SNP in colorectal cancer risk. These studies strengthen the hypothesis that there is an association between rs4919510 and colorectal cancer prognosis that might be race-specific.
While we cannot currently explain the exact functional mechanism through which this SNP may affect colorectal cancer prognosis, there are several plausible hypotheses. As mentioned before, rs4919510 lies within the mature sequence of miR-608, and is located at the junction of the stem with the canonical hairpin loop. Since this rigid secondary structure is a requisite for recognition, and thus processing, of precursor miRNA by the RNase Drosha, it is feasible that disruption of structure at this critical point might affect recognition or subsequent processing. Each miRNA has hundreds of targets; thus a single change in a cell's miRNA kinetic profile could have a disproportionately large effect on protein output, perhaps large enough to skew, even slightly, the overall prognosis of a disease.
A change in a miRNA's mature sequence could also theoretically alter its target repertoire once processed [26,27]. Although rs4919510 does not lie within the seed sequence of miR-608, hybridization kinetics outside of this region have been shown to be important in target recognition [28]. The exact importance, relative weight, and subsequent shifting of preference due to miRNA sequence variations have yet to be fully elucidated, though examples exist [12]. An extension of this principle could be applied to the SNP in hsa-mir-608 in relation to colorectal cancer, whose putative targets include BCL-xL, SEPT9, and CDK6 (www.targetscan.org). Indeed, an alteration in transcript targeting by miR-608 to any of those genes could have consequences directly related to cancer cell survival.
Xing et al. also reported that the association between the GG genotype and poor outcome was only observed in those patients who had been treated with chemotherapy, the majority of whom had been on a FOLFOX regimen. Interestingly, miR-608 is predicted to target at least two of the key enzymes involved in activation of 5-FU, thymidine kinase and folylpolyglutamate synthase (www.targetscan.org). In addition, there have been several studies examining variant chemotherapeutic response rates in colorectal carcinoma by race. In one study, although time to disease progression was not affected by race, a significantly higher percentage of Caucasians responded to a FOLFOX or IROX regimen [29]. Therefore, rs4919510 could theoretically play a role in differential response rates to chemotherapeutics across populations.
In addition to a putative functional association with miR-608, linkage disequilibrium with SNPs in neighboring genes cannot be ruled out. Of note, there is significantly less linkage predicted for rs4919510 from current 1000 Genomes data in the African American population, where only 7 SNPs in 2 unique genes are predicted to be in linkage disequilibrium with rs4919510, as opposed to 39 SNPs in 5 unique genes in Caucasians (www.broadinstitute.org/mpg/snap/ldsearch.php#, r² > 0.8 and distance < 500 kb). Of particular interest among the list of differentially linked gene alleles in Caucasians is leucine zipper tumor suppressor 2 (LZTS2), which contains several SNPs in the first intron and 5′ UTR predicted to be strongly associated with rs4919510. Recent findings from two separate studies have shown that knockdown of LZTS2 expression sensitizes cells to paclitaxel therapy in addition to antagonizing proliferation of several cancer cell lines by down-regulation of myc and cyclin D1 through engagement of the NF-κB pathway [30,31]. If the G allele of rs4919510 is truly in linkage with a SNP in LZTS2, the expression of this gene may be altered, contributing to the poor prognosis of this variant genotype in our Caucasian cohort. However, this hypothesis remains to be investigated.
In summary, our study supports the hypothesis that a SNP in the mature sequence of hsa-mir-608 can affect the prognosis of colorectal cancer patients. Interestingly, the effect of this SNP seems to vary by race, as demonstrated by our findings and others [25]. While the association between this SNP and colorectal cancer survival is notable as demonstrated in three cohorts, further studies should be conducted to confirm the race-specific nature of the SNP's effect. In addition, exploration of the functional role of this SNP, the mechanism of its interaction with 5-FU, as well as its potential involvement in other cancers is warranted.

Figure S1. rs4919510 and KRAS mutation status. Percentage of KRAS mutations in rs4919510 CC, CG and GG samples in (A) the whole population, (B) Caucasians, and (C) African Americans. Exact percentages are shown in (D). WT denotes wild-type, mut denotes mutant. (PDF)