miR-485-5p Binding Site SNP rs8752 in HPGD Gene Is Associated with Breast Cancer Risk

Background Single nucleotide polymorphisms (SNPs) that reside in microRNA target sites may play an important role in breast cancer development and progression. To assess the association between microRNA target site SNPs and breast cancer risk, we performed a large case-control study in China. Methods We performed a two-stage case-control study including 2744 breast cancer cases and 3125 controls. In stage I, we genotyped 192 SNPs within microRNA binding sites identified from the “Patrocles” database using custom Illumina GoldenGate VeraCode assays on the Illumina BeadXpress platform. In stage II, SNPs potentially associated with breast cancer risk were genotyped in an independent replication set using the TaqMan platform. Results In stage I, 15 SNPs were significantly associated with breast cancer risk (P<0.05). In stage II, one SNP, rs8752, was replicated at P<0.05. This SNP is located in the 3’ untranslated region (UTR) of the 15-hydroxyprostaglandin dehydrogenase (HPGD) gene at 4q34-35, within a miR-485-5p binding site. Compared with the GG genotype, the combined GA+AA genotypes had a significantly higher risk of breast cancer (OR = 1.18; 95% CI: 1.06-1.31, P = 0.002). Specifically, this SNP was associated with estrogen receptor (ER) positive breast cancer (P = 0.0007) but not with ER negative breast cancer (P = 0.23), although the P value for heterogeneity was not significant. Conclusion Through a systematic case-control study of microRNA binding site SNPs, we identified a new breast cancer risk variant, rs8752 in HPGD, in Chinese women. Further studies are warranted to investigate the underlying mechanism for this association.


Introduction
Breast cancer is the most common female malignancy worldwide, and its incidence has been increasing during the past several decades in both developing and developed countries [1]. It is widely accepted that environmental and genetic factors contribute to the development of breast cancer. Although environmental factors play an important role in breast cancer, an individual's risk of breast cancer is determined by genetic susceptibility. Numerous investigations have suggested that microRNAs are essential for various biological processes and diseases, including tumorigenesis [2,3,4,5,6,7]. MicroRNAs inhibit translation by binding to the 3' UTRs of target mRNAs. In recent years, many studies have revealed that SNPs or mutations within microRNA binding sites may affect cancer susceptibility by disrupting the miRNA-mRNA interaction and mRNA expression [8,9,10,11]. Several bioinformatic methods have been introduced to predict candidate SNPs located in microRNA target sites, including "Patrocles" and "PolymiRTS" [12,13,14,15,16]. Some case-control studies have investigated the association between SNPs in microRNA binding sites and breast cancer risk [17,18,19,20,21,22]. Wang et al. found that a miRNA binding site SNP in the 3' UTR of the IL23R gene may be associated with the risk of breast cancer and contribute to the early development of breast cancer in Chinese women [23]. Teo et al. first reported the association between the miRNA binding site SNP rs8679 in the DNA repair gene PARP1 and breast cancer risk [24]. Zheng et al. reported that the presence of SNPs at the miR-124 binding site may be a marker for breast cancer risk and prognosis [25]. Kontorovich et al. found that heterozygous carriers of SNP rs11169571 had an approximately 2-fold increased risk of developing breast cancer, whereas heterozygotes for the rs895819 SNP had an approximately 50% reduced risk [26]. Saetrom et al. suggested that allele-specific regulation of BMPR1B by miR-125b explains the observed disease risk [27]. Brendle et al. showed that the A allele of the ITGB4 SNP rs743554 was associated with negative hormone receptor status and poor breast cancer-specific survival, especially in women with more aggressive tumors [28]. However, most of these are candidate gene studies that included only a few SNPs and therefore cannot capture the overall contribution of such SNPs to breast cancer etiology.
In this study, high-throughput SNP genotyping was used in a large case-control study including genome-wide microRNA binding site SNPs. The results may provide new insights into the cause of breast cancer, and new molecular markers for breast cancer diagnosis.

Ethics statement
The Ethics Committee of Tianjin Medical University Cancer Institute and Hospital approved the study protocol, and all patients and controls gave written informed consent before participating in the study.

Study subjects
A total of 5869 individuals (2744 breast cancer cases and 3125 controls) were involved in this study. The cases were newly diagnosed, histologically confirmed breast cancer patients recruited at Tianjin Medical University Cancer Hospital between January 1, 2006 and December 31, 2008. During the same period, controls were enrolled from the nearby community; they were genetically unrelated to the patients and were frequency matched to the patients by age (±3 years). Our study comprised two stages. In stage I, we randomly selected 1349 patients and 1572 cancer-free female controls for SNP screening. In stage II, to validate the findings from stage I, a validation set of 1395 cases and 1553 controls was genotyped. Participants were interviewed by trained investigators using a structured questionnaire covering demographic characteristics, personal habits, family history, occupational exposure history, eating habits, physical exercise, and reproductive factors. For the cases, clinical information on tumor features and disease severity, including morphology, tumor size, lymph node metastasis, organ metastasis, tumor stage, and estrogen receptor (ER) and progesterone receptor (PR) status, was also collected. Each participant provided 10 ml of venous blood. The study was approved by the Institutional Review Board of Tianjin Medical University, and informed consent was obtained from all patients.

SNP selection
The "Patrocles" online database (http://www.patrocles.org/) was used to select genome-wide microRNA target SNPs. Among the 5035 microRNA binding site SNPs provided by the database, 1742 had been confirmed. SNPs that satisfied the following criteria were considered for inclusion: (1) the SNP was located in a microRNA seed-region binding site, with the seed region defined according to the "7-mirs" criteria [12]; and (2) the SNP had reported population frequency data in Chinese (http://www.ncbi.nlm.nih.gov/snp/), with a minor genotype frequency ≥0.05. In this way, a total of 192 microRNA target SNPs were included in our study. Detailed information for these SNPs is given in Table S1 in File S1.
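The two inclusion criteria amount to a simple filter over candidate SNP records. The following Python sketch is illustrative only: the record fields (`in_seed_site`, `minor_freq_chinese`), the placeholder identifiers, and the example values are hypothetical and are not drawn from Patrocles or dbSNP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateSnp:
    rs_id: str                           # placeholder identifier, not a real rsID
    in_seed_site: bool                   # criterion (1): within a seed-region binding site
    minor_freq_chinese: Optional[float]  # criterion (2): None if no Chinese frequency data


MIN_FREQ = 0.05  # frequency threshold stated in the text


def passes_selection(snp: CandidateSnp) -> bool:
    """Return True only if both inclusion criteria are met."""
    if not snp.in_seed_site:
        return False
    if snp.minor_freq_chinese is None:
        return False
    return snp.minor_freq_chinese >= MIN_FREQ


# Hypothetical candidates chosen to exercise each rejection path.
candidates = [
    CandidateSnp("rsA", True, 0.21),   # passes both criteria
    CandidateSnp("rsB", True, 0.02),   # too rare
    CandidateSnp("rsC", False, 0.30),  # not in a seed-region site
    CandidateSnp("rsD", True, None),   # no Chinese frequency data
]
selected = [s.rs_id for s in candidates if passes_selection(s)]
```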

Genomic DNA samples
Whole blood samples from each participant were collected and stored in Vacutainer tubes (BD, Franklin Lakes, NJ) containing EDTA as an anticoagulant. Total genomic DNA was extracted from the whole blood using the QIAGEN DNA Extraction Kit (QIAGEN Inc.). The extracted DNA was stored at -20°C in TE buffer.

SNP genotyping
In stage I, SNP genotyping was conducted using Illumina GoldenGate SNP Genotyping Arrays according to the manufacturer's instructions. Only plates with a consistently high call rate in the initial calling were used. If the call rate was <80%, we repeated the experiment. In stage II, genotyping was performed using the TaqMan platform in 384-well plates and read with the Sequence Detection Software on an ABI Prism 7900 instrument according to the manufacturer's instructions (Applied Biosystems, Foster City, CA). Primers and probes were supplied by Applied Biosystems. The PCR conditions were as follows: 50°C for 2 minutes, 95°C for 10 minutes, and 60°C for 1 minute for 40 cycles. After two rounds of genotyping, the genotyping success rate was 99%. Five percent of the samples were selected for repeat genotyping, and the results were 100% concordant.
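The two quality-control quantities used here, call rate and repeat concordance, can be computed as in the minimal Python sketch below. It assumes genotypes are stored as strings with `None` for a failed call; the function names and the example plate are hypothetical and do not represent the vendors' software.

```python
from typing import Optional, Sequence

REPEAT_THRESHOLD = 0.80  # plates below this call rate were re-run, per the text


def call_rate(genotypes: Sequence[Optional[str]]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def concordance(first: Sequence[Optional[str]],
                second: Sequence[Optional[str]]) -> float:
    """Fraction of samples called in both runs whose genotypes agree."""
    pairs = [(a, b) for a, b in zip(first, second)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no samples called in both runs")
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical plate: 7 of 10 samples called, so it falls below the threshold.
plate = ["GG", "GA", None, "AA", "GG", None, "GG", "GG", None, "GA"]
rate = call_rate(plate)
needs_repeat = rate < REPEAT_THRESHOLD

# Hypothetical repeat-genotyping subset with identical calls in both runs.
run1 = ["GG", "GA", "AA", "GG"]
run2 = ["GG", "GA", "AA", "GG"]
repeat_concordance = concordance(run1, run2)
```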

Statistical analysis
A χ² test was used to evaluate differences between breast cancer cases and controls in the distributions of major demographic variables and environmental risk factors, as well as in the genotypes of the selected SNPs. Hardy-Weinberg equilibrium was assessed in controls with a χ² goodness-of-fit test. Unconditional logistic regression was used to examine the association between the SNPs and breast cancer risk by estimating odds ratios (ORs) and 95% confidence intervals (CIs), with and without adjustment for age, smoking status, menopausal status, oral contraceptive use, history of benign breast disease, and family history of cancer. All statistical tests were two-sided, and P < 0.05 was considered significant. Correction for multiple comparisons was not performed. We used SAS software version 9.0 (SAS Institute) for all statistical analyses.
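As an illustration of two of these calculations, the following Python sketch computes a Hardy-Weinberg χ² goodness-of-fit test (1 degree of freedom for a biallelic SNP) and a crude, unadjusted odds ratio with a Wald 95% CI from a 2×2 table under a dominant model (GA+AA versus GG). It is a simplified stand-in, not the adjusted logistic regression run in SAS, and all genotype counts are hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2(n_gg: int, n_ga: int, n_aa: int):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    n = n_gg + n_ga + n_aa
    p = (2 * n_gg + n_ga) / (2 * n)  # G allele frequency
    q = 1.0 - p                      # A allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_gg, n_ga, n_aa)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_sf_1df(stat)


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude OR and Wald CI.

    a = carrier cases, b = non-carrier cases,
    c = carrier controls, d = non-carrier controls.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


# Hypothetical control genotype counts lying exactly on HWE proportions.
hwe_stat, hwe_p = hwe_chi2(490, 420, 90)

# Hypothetical 2x2 table: 600/400 carrier/non-carrier cases, 500/500 controls.
or_, ci_lo, ci_hi = odds_ratio_wald(600, 400, 500, 500)
```

A logistic regression with a single binary genotype covariate and no adjustment yields the same OR as the 2×2 table; the adjusted estimates reported in this study additionally account for the covariates listed above.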

Results
The demographic characteristics of the 2744 breast cancer cases and 3125 cancer-free controls (stages I and II combined) are presented in Table 1. Age was matched between cases and controls (P = 0.447). The differences in smoking status (P < 0.001), oral contraceptive use (P < 0.001), menopausal status (P < 0.001), history of benign breast disease (P < 0.001), and family history of cancer (P < 0.001) were statistically significant between cases and controls. These characteristics were comparable between samples from stage I and samples from stage II (Table S2 in File S1).
In stage II, among the 15 SNPs identified in stage I, SNP rs8752 in the HPGD gene (the duplex structure between miR-485-5p and HPGD is shown in Figure S1 in File S1) was significantly associated with breast cancer risk in an independent replication set (P = 0.018). When data from stage I and stage II were combined, the GA and AA genotypes of rs8752 were associated with a higher risk of breast cancer than the GG genotype (OR = 1.18; 95% CI: 1.06-1.31, P = 0.002). Furthermore, we assessed the association between rs8752 and breast cancer risk according to ER and PR status. The association was significant for ER positive breast cancer (OR = 1.24; 95% CI: 1.10-1.40, P = 0.0007) but not for ER negative breast cancer (OR = 1.09; 95% CI: 0.95-1.26, P = 0.23), although the P value for heterogeneity was not significant. The association between rs8752 and breast cancer risk was similar for PR positive and PR negative breast cancer (Table 3).
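The reported ORs and CIs can be checked for internal consistency by recovering the standard error of log(OR) from the CI width and recomputing a Wald P value. The Python sketch below does this and also runs an approximate heterogeneity test comparing the ER positive and ER negative estimates. It rests on assumptions not stated in the text: the CIs are taken to be Wald intervals, the inputs are rounded published values, and the two subtype estimates are treated as independent even though they share the same controls. The method the authors used for the heterogeneity test is not described, so this is a rough check only.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def log_or_and_se(or_: float, lo: float, hi: float, z_crit: float = 1.96):
    """Recover log(OR) and its standard error from a 95% Wald CI."""
    return math.log(or_), (math.log(hi) - math.log(lo)) / (2.0 * z_crit)


def wald_p(or_: float, lo: float, hi: float) -> float:
    beta, se = log_or_and_se(or_, lo, hi)
    return two_sided_p(beta / se)


def heterogeneity_p(est1, est2) -> float:
    """Approximate z-test for a difference between two log(OR) estimates."""
    b1, s1 = log_or_and_se(*est1)
    b2, s2 = log_or_and_se(*est2)
    z = (b1 - b2) / math.sqrt(s1 ** 2 + s2 ** 2)
    return two_sided_p(z)


# (OR, lower, upper) as reported in the text.
overall = (1.18, 1.06, 1.31)
er_pos = (1.24, 1.10, 1.40)
er_neg = (1.09, 0.95, 1.26)

p_overall = wald_p(*overall)
p_er_pos = wald_p(*er_pos)
p_er_neg = wald_p(*er_neg)
p_het = heterogeneity_p(er_pos, er_neg)
```

Under these assumptions the recomputed values are about 0.002 overall, about 0.0005 for ER positive disease, and about 0.23 for ER negative disease, close to the reported P values given the rounding of the inputs. The approximate heterogeneity P value is about 0.17, in line with the statement that heterogeneity was not significant.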

Discussion
In this study, we performed a two-stage case-control study including 2744 cases and 3125 controls. Among the 192 SNPs genotyped, the A allele of rs8752 in the HPGD gene was associated with an increased risk of breast cancer in both stages. This study provides evidence for a novel breast cancer susceptibility variant on chromosome 4q34-35.
Our study covered three SNPs (IGF1R rs2654981, NFAT5 rs7359387, NEFL rs1059111) in genes that were previously studied in the context of breast cancer risk. Overexpression of the IGF1 receptor (IGF1R) has been associated with a number of hematological neoplasias and solid tumors, including breast cancer [29]. Han-Sung Kang found that seven of the 51 IGF1R SNPs were in linkage disequilibrium (LD) within one haplotype block and were likely to be associated with breast cancer risk [30]. Sebastien Jauliac found, using cell lines derived from human breast and colon carcinomas, that NFAT5 was expressed in invasive human ductal breast carcinomas and participates in promoting carcinoma invasion [31]. NEFL has been shown to act as a tumor suppressor in breast carcinogenesis [32,33]. However, we found a significant association between these three SNPs and breast cancer risk only in stage I. In stage II (the validation set), we did not find a significant association.
MicroRNA-related SNPs can generally be categorized into three groups: SNPs in microRNA sequences, SNPs in microRNA biogenesis pathway genes, and SNPs in microRNA target sites [34,35,36]. To date, SNPs in microRNA sequences and microRNA biogenesis pathway genes have been systematically studied, and important findings have been reported [37,38]. However, most previous studies of the association between microRNA binding site SNPs and breast cancer risk were based on a candidate gene strategy. Results from these studies are not sufficient to represent the role of such SNPs in the etiology of cancer. For this reason, we conducted a systematic case-control study including genome-wide microRNA binding site SNPs.
The HPGD gene at chromosome 4q34-35 encodes a member of the short-chain non-metalloenzyme alcohol dehydrogenase protein family. The encoded enzyme is responsible for the metabolism of prostaglandins, which function in a variety of physiologic and cellular processes, such as inflammation. HPGD is widely distributed in various mammalian tissues such as lung, breast, prostate, placenta and gut. Recent studies have shown a reduction of HPGD in some cancers, such as colorectal, breast, prostate, and lung cancer [39,40,41,42,43]. Many studies have revealed that HPGD may have tumor-suppressive properties [44,45]. Ido Wolf et al. reported that HPGD was an epigenetically silenced tumor suppressor gene in breast cancer and that there was an association between HPGD expression and ER pathway activity. Prostaglandin E2 (PGE2) is a major stimulator of aromatase expression, thus leading to increased synthesis of estrogen within the breast [40]. PGE2 levels are regulated not only by its synthesis but also by its degradation. The key enzyme responsible for the biological inactivation of prostaglandins is NAD+-linked HPGD [41]. Our results add another dimension to these findings: the A allele of rs8752 in HPGD was positively associated with breast cancer risk, and the association was ER status specific. SNP rs8752 (G/A) is located in the miR-485-5p binding site and is likely to disrupt the miR-485-5p/HPGD interaction. As shown in Figure S1 in File S1, the A allele cannot be targeted by miR-485-5p, which would result in increased HPGD protein expression, a possible underlying mechanism for the observed association with the risk of breast cancer.
To the best of our knowledge, this is the largest systematic case-control study investigating the association between microRNA target SNPs and breast cancer risk. Nevertheless, our study has several limitations. First, we included only SNPs with a high frequency of variation, namely those with three genotypes and a minor genotype frequency ≥0.05. This strategy will inevitably miss some low-frequency SNPs that are associated with breast cancer risk. Second, functional studies are critical to confirm the associations found in this study, but such studies were not performed at this stage. Third, correction for multiple comparisons was not performed in this study, although our design, with a large sample size and a replication set, supports the repeatability of our findings.
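To put the third limitation in numerical terms, the short Python sketch below computes how many nominally significant SNPs would be expected by chance at each stage, along with the corresponding Bonferroni thresholds. It assumes independent tests and a global null hypothesis, which are simplifications because SNPs can be correlated. Bonferroni is used here only as a familiar reference point and was not applied in this study.

```python
ALPHA = 0.05          # nominal significance level used in the study
N_STAGE1 = 192        # SNPs tested in stage I
N_STAGE2 = 15         # SNPs carried forward to stage II
REPLICATION_P = 0.018 # stage II P value reported for rs8752
COMBINED_P = 0.002    # combined-stage P value reported for rs8752

# Expected number of false positives if every null hypothesis were true.
expected_fp_stage1 = N_STAGE1 * ALPHA   # about 9.6 (15 were observed)
expected_fp_stage2 = N_STAGE2 * ALPHA   # about 0.75 (1 replicated)

# Bonferroni-corrected per-test thresholds, for reference only.
bonferroni_stage1 = ALPHA / N_STAGE1    # about 2.6e-4
bonferroni_stage2 = ALPHA / N_STAGE2    # about 3.3e-3

replication_passes_bonferroni = REPLICATION_P < bonferroni_stage2
combined_passes_stage2_bonferroni = COMBINED_P < bonferroni_stage2
```

Under these simplifying assumptions, the stage II P value of 0.018 does not fall below the 15-test Bonferroni threshold, whereas the combined P value of 0.002 does. The combined estimate includes the stage I data that were used to select the SNP, so it is not an independent test.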
In summary, our findings suggest that common variants in the HPGD gene might be associated with breast cancer risk among Chinese women. Further large studies are warranted to confirm these findings and to examine the biological mechanisms underlying the association.

Supporting Information
File S1 Supporting file including Figure S1, Table S1, and Table S2. Table S1. 192 microRNA binding site SNPs identified from the "Patrocles" database. Table S2. Characteristics of breast cancer cases and cancer-free controls (stage I and stage II). Figure S1. The duplex structure of miR-485-5p and the 3' UTR of the HPGD gene.