Associations of Two Common Genetic Variants with Breast Cancer Risk in a Chinese Population: A Stratified Interaction Analysis

Recent genome-wide association studies (GWAS) have identified a series of new genetic susceptibility loci for breast cancer (BC). However, the correlations between these variants and breast cancer are still not clear. To explore the role of breast cancer susceptibility variants in a Southeast Chinese population, we genotyped two common SNPs, at chromosome 6q25 (rs2046210) and in TOX3 (rs4784227), in a case-control study with a total of 702 breast cancer cases and 794 healthy controls. We also evaluated the interactions among genetic variants, risk factors, and tumor subtypes. Associations of genotypes with breast cancer risk were evaluated using multivariate logistic regression to estimate odds ratios (OR) and their 95% confidence intervals (95% CI). The results indicated that both polymorphisms were significantly associated with the risk of breast cancer, with per allele OR = 1.35 (95%CI = 1.17–1.57) for rs2046210 and per allele OR = 1.24 (95%CI = 1.06–1.45) for rs4784227. Furthermore, in subgroup stratified analyses, we observed that the T allele of rs4784227 was significantly associated with elevated risk among postmenopausal women (OR = 1.44, 95%CI = 1.11–1.87) but not among premenopausal women, with a heterogeneity P value of 0.064. These findings suggest that the genetic variants at chromosome 6q25 and in the TOX3 gene may play important roles in breast cancer development in a Chinese population, and the underlying biological mechanisms need to be further elucidated.


Introduction
Breast cancer is one of the most common malignancies worldwide, ranking first in incidence and second in mortality among all cancers diagnosed in women. For the year 2014, it was estimated that approximately 232,670 women in the United States would be diagnosed with breast cancer and 40,000 would die from it [1]. The incidence of BC is also increasing rapidly in developing countries, particularly in China [2]. Breast cancer is a heterogeneous disease in which multiple environmental and genetic factors play important roles [3]. Epidemiological studies have indicated that age, obesity, a family history of BC, previous benign breast disease, and menstrual and reproductive factors are associated with increased risk of BC [4][5][6]. In family-based studies, several high-penetrance inherited mutations, including those in BRCA1, BRCA2, TP53 and PTEN, were identified as contributing to increased susceptibility to breast cancer [7]. However, only about 25% of the familial risk and 5% of BC incidence can be explained by these high-penetrance mutations [8][9]. Therefore, the identification of low-penetrance genes could have a significant impact on the risk estimation of breast cancer.
In the past few years, several genome-wide association studies (GWAS) have identified a number of novel genetic susceptibility variants and loci that were independently associated with elevated risk of breast cancer [10][11][12][13][14][15][16][17][18]. Among them, two single nucleotide polymorphisms (SNPs), rs2046210 at 6q25 and rs4784227 in the TOX3 gene, were highlighted for their potential biological contribution to the development of breast cancer. SNP rs2046210 is located 180 kb upstream of estrogen receptor 1 (ESR1) and downstream of C6orf97. The ESR1 gene is of particular interest in breast carcinogenesis as it encodes estrogen receptor α (ERα). ERα regulates estrogen signal transduction and plays an important role in breast cancer [19]. The TOX3 gene is located at chromosome 16q12.1 [10] and belongs to the large, diverse family of high-mobility group (HMG) box proteins [20]. TOX3 regulates calcium-dependent transcription through interaction with the cAMP response element binding protein [21]. In human tissues, TOX3 is mainly expressed in the brain [22]. It is also expressed in breast, with lower levels in breast tumors than in normal tissue [23], suggesting that it may be a candidate tumor suppressor gene.
Studies by Zheng et al. [17] and Long et al. [18] first identified rs2046210 and rs4784227 as genetic susceptibility loci for breast cancer in European and Asian populations, respectively. However, several subsequent replication studies did not produce the same results. For instance, Stacy et al. [24] were unable to replicate the findings in Europeans, and Cai et al. [25] similarly failed to validate the associations in African Americans. One possible explanation for the conflicting results is that the study populations came from different ethnic groups and regions. Other factors may include family history or menstrual and reproductive status. In addition, tumor subtypes stratified by estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (Her-2) status may also play an important role because of their different etiologic pathways. Thus, more extensive studies, especially studies in various populations that also consider risk factors and tumor subtypes, can improve our understanding of genetic variants in BC etiology. We therefore performed a case-control study of 702 BC patients and 794 healthy controls to evaluate the associations between these two SNPs and breast cancer risk in women from Fujian Province in southeastern China. We also assessed the interactions between risk loci, traditional risk factors and specific molecular subtypes of BC defined by ER and PR status.

Study population
This study was a hospital-based case-control study that included 702 breast cancer patients and 794 healthy controls. All participants were genetically unrelated Chinese from Fuzhou City and its surrounding regions. Patients were consecutively recruited from the Fujian Medical University Union Hospital, Fujian, China, between June 2009 and March 2014. All BC cases were histopathologically confirmed without restriction of histological type or age. Healthy controls were frequency-matched to the BC patients by age (±5 years) and randomly selected from persons undergoing routine health examinations in the same hospital. Each participant was interviewed face-to-face by trained interviewers to gather information on demographic data, menstrual history, reproductive and breastfeeding history, previous benign breast disease history, environmental exposure history and family history of breast cancer. In addition, approximately 3 ml of venous blood was collected from each subject. Written informed consent was obtained from all participants via an institutional consent form. The study and this consent procedure were approved by the Ethics Committee of Fujian Medical University Union Hospital. The clinicopathological data of BC patients were obtained from medical records. The ER and PR status were determined by an immunohistochemical method evaluating the percentage of cancer cell nuclear staining, and a percentage of stained cells ≥10% was considered positive.

Genotyping
Genomic DNA was extracted from EDTA anti-coagulated whole blood using a commercially available kit according to the manufacturer's protocol (The Whole-Blood DNA Extraction Kit; Bioteke, Beijing, China). Genotyping for the two selected SNPs was performed with a custom-by-design 2×48-Plex SNPscan Kit (Cat:G0104K; Genesky Biotechnologies Inc., Shanghai, China). The kit was developed according to patented SNP genotyping technology by Genesky Biotechnologies Inc., which is based on double ligation and multiplex fluorescence PCR. Primers and probes were designed and synthesized by Invitrogen (CA, USA). DNA samples were ligated and amplified by PCR according to the manufacturer's recommendations. The resulting data were analyzed with an ABI3730XL sequencer and GeneMapper 4.0 Software (Applied Biosystems, Foster City, CA). For quality control, genotyping was performed without knowledge of the case or control status of the subjects, and approximately equal numbers of case and control samples were assayed on each 96-well plate with two blank controls. In addition, a 5% random sample of cases and controls was genotyped twice, and the concordance rate was 100%. Genotyping failed in only one breast cancer case because of DNA quality or quantity, and the average call rate for all SNPs was higher than 99%.

Statistical analyses
Statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS, version 18.0) for Windows (SPSS, Chicago, IL). Differences between cases and controls in demographic characteristics, risk factors and genotype frequencies were evaluated using the χ² test (for categorical variables) or Student's t-test (for continuous variables). Hardy-Weinberg equilibrium (HWE) was evaluated by a goodness-of-fit χ² test comparing the observed genotype frequencies with the expected ones in controls. Associations among genotypes, tumor subtypes and breast cancer risk were estimated by computing odds ratios (ORs) and 95% confidence intervals (CIs) from multivariate logistic regression with adjustment for age, BMI, age at menarche, age at first live birth, menopausal status and family history of breast cancer. All statistical tests were two-sided, and a P-value of <0.05 was considered statistically significant.
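As an illustration of the Hardy-Weinberg check described above, the following is a minimal sketch of a one-degree-of-freedom goodness-of-fit χ² test. It is not the SPSS procedure used in the study; the function name `hwe_chi_square` and the genotype counts are hypothetical and chosen only for demonstration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium (1 df).

    n_aa, n_ab, n_bb are observed genotype counts (e.g. GG, GA, AA in controls).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # HWE expectations
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts, NOT taken from the study
chi2, p_value = hwe_chi_square(n_aa=300, n_ab=500, n_bb=200)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 in controls would be read as no evidence of departure from equilibrium, which is the criterion applied to both SNPs here.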

Population characteristics
Selected characteristics of the breast cancer cases and healthy controls are summarized in Table 1. There were no significant differences in age, BMI, menopausal status or previous benign breast disease between the two groups (P>0.05). However, compared with healthy controls, breast cancer patients tended to have an earlier age at menarche, an earlier age at first live birth and a higher proportion of family history of breast cancer (P<0.05). Among 701 breast cancer cases, 479 (68.3%) were ER positive and 442 (63.1%) were PR positive.

Associations between SNP genotypes and breast cancer risk
The allele and genotype distributions of rs2046210 and rs4784227 in cases and controls are shown in Table 2. The observed genotype frequencies for the two SNPs were both in Hardy-Weinberg equilibrium in the control group (P=0.49 for rs2046210 and P=0.94 for rs4784227). In the single-locus analyses, both polymorphisms showed significant differences in genotype distribution between cases and controls, with per allele OR=1.35 (95%CI=1.17–1.57) for rs2046210 and per allele OR=1.24 (95%CI=1.06–1.45) for rs4784227. Multivariate logistic regression analyses also revealed that, for rs2046210, the GA and AA carriers were at higher risk of BC compared with the GG homozygotes (OR=1.63, 95%CI=1.29–2.13 and OR=1.68, 95%CI=1.21–2.33, respectively). Similarly, for rs4784227 in the dominant model, a significantly increased risk was observed for the CT+TT genotypes compared with the CC genotype (OR=1.27, 95%CI=1.03–1.56), indicating that CT or TT carriers had an increased risk of breast cancer compared with CC homozygotes. In addition, the associations for variant genotypes in the two polymorphisms were both dose-dependent (P trend<0.001 for rs2046210 and P trend=0.009 for rs4784227).
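The per-allele estimates above can be sanity-checked from the published numbers alone. The sketch below back-calculates the log odds ratio, its standard error and an approximate Wald test from each reported OR and 95% CI, assuming the intervals are symmetric on the log scale with a critical value of 1.96. The helper name `wald_from_ci` is hypothetical, and the resulting P values are approximations limited by rounding in the reported figures, not a re-analysis of the data.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back-calculate log-OR, standard error, Wald z and two-sided P from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    # On the log scale the CI is beta +/- z_crit * se, so its width is 2 * z_crit * se
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p_value


# Per-allele ORs and 95% CIs as reported in the text
reported = {
    "rs2046210": (1.35, 1.17, 1.57),
    "rs4784227": (1.24, 1.06, 1.45),
}

for snp, (or_, low, high) in reported.items():
    beta, se, z, p = wald_from_ci(or_, low, high)
    # The geometric mean of the CI limits should sit close to the point estimate
    midpoint = math.exp((math.log(low) + math.log(high)) / 2)
    print(f"{snp}: beta={beta:.3f} se={se:.3f} z={z:.2f} P~{p:.2g} CI midpoint={midpoint:.3f}")
```

If the geometric midpoint of an interval drifts noticeably from its point estimate, that usually signals rounding or a transcription slip in the reported values.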

Associations between SNPs and breast cancer characteristics
To further evaluate the association between the two polymorphisms and breast cancer risk, we performed subgroup stratified analyses according to different epidemiological characteristics and tumor subtypes. As shown in Table 3, scores of 0, 1 and 2 were assigned to the genotypes GG, GA and AA for rs2046210 and CC, CT and TT for rs4784227, respectively (additive model). The pooled ORs and 95% CIs were calculated in logistic regression analyses treating genotypes as ordinal variables. For the rs2046210-A allele, significantly increased risks of breast

The combined effects of rs2046210 and rs4784227
The combined effects of the two polymorphisms are shown in Table 4. All cases and controls were categorized into five groups according to the number of risk alleles they carried (rs2046210-A and rs4784227-T). The total number of risk alleles ranged from 0 to 4, and those with no risk alleles were regarded as the reference group. When compared to the reference group, the ORs of BC risk for

Discussion
In the present case-control study, we investigated the associations between two candidate SNPs and the risk of breast cancer in a Southeast Chinese population. We found that both rs2046210 at 6q25 and rs4784227 in the TOX3 gene were significantly associated with increased BC risk. Zheng et al. [17] conducted a three-stage GWAS which found that rs2046210 was strongly associated with breast cancer in Asian and European populations. However, several subsequent studies did not show the same results [24,25]. One possible reason may be genetic differences across regions and ethnic groups; another could be differences in linkage disequilibrium (LD) patterns among populations. In our study population, rs2046210 was confirmed to be significantly associated with BC risk, which strengthens the observation that this polymorphism plays an important role in breast cancer. In the stratified analyses, the associations between rs2046210 and BC risk appeared similar across subgroups, although the association was slightly stronger in ER negative than in ER positive tumor subtypes. This difference was small and not statistically significant, but it was consistent with previous reports [17,25]. It may arise because rs2046210 is independently associated with the risk of BC in BRCA1 mutation carriers [26], and breast cancer patients with BRCA1 mutations are more often estrogen receptor negative [27].
The SNP rs2046210 lies 180 kb upstream of the ESR1 gene. ESR1 encodes estrogen receptor α, which is activated by the hormone estrogen. Breast cancer is a hormone-dependent malignancy, and cumulative exposure to sex hormones has been linked to the development of BC [28]. In vitro experiments have also shown that activating ESR1 mutations result in continued responsiveness to anti-estrogen therapies [29]. Given the proximity of rs2046210 to ESR1, researchers hypothesize that the polymorphism itself, or causal variants in LD with it, might regulate ESR1 gene expression and contribute to elevated susceptibility to breast cancer. Another SNP, rs9397435 (2.9 kb away from rs2046210), has been identified in fine-mapping studies as conferring BC risk in Asian, European and African populations [24]. However, no functional study has yet confirmed whether rs2046210 affects the expression of ESR1, so the functional mechanism of this polymorphism requires further investigation.
The SNP rs4784227 is located 18.4 kb upstream of the TOX3 gene, in an evolutionarily conserved portion of an intron of the LOC643714 gene. This SNP has been implicated as a functional genetic risk variant for breast cancer in several in vitro experiments. Long et al. [18] demonstrated that the T risk allele could reduce luciferase activity and alter DNA-protein binding patterns. Another study showed that rs4784227 is enriched in the cistromes of FOXA1 and ESR1 in a cancer-cell-specific manner, modulating the affinity of chromatin for FOXA1 and resulting in allele-specific gene expression [30].
In our study, we confirmed rs4784227 as a BC susceptibility locus in a Southeast Chinese population. Further stratified analyses suggested that the positive association was stronger in ER/PR positive than in ER/PR negative breast cancer, which was consistent with previous data [18,31]. In addition, we observed that the T allele of rs4784227 was significantly associated with breast cancer risk among postmenopausal women (OR=1.44, 95%CI=1.11–1.87), whereas no evidence of a significant association was found among premenopausal women (OR=1.02, 95%CI=0.82–1.27).
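The contrast between the two menopausal strata can be examined with a simple heterogeneity test on the log odds ratios. The sketch below is an approximation built only from the rounded ORs and CIs quoted above, assuming independent strata and normally distributed log-ORs; the function names are hypothetical, and this is not necessarily the method behind the heterogeneity P value of 0.064 reported in the abstract, so the two figures need not agree exactly.

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log odds ratio and its standard error from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se


def heterogeneity_two_strata(stratum_a, stratum_b):
    """z-test for the difference between two independent log odds ratios.

    For two strata this is equivalent to Cochran's Q test with 1 df (Q = z**2).
    """
    b1, se1 = log_or_and_se(*stratum_a)
    b2, se2 = log_or_and_se(*stratum_b)
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value


# rs4784227 T allele, values as reported in the text: (OR, CI low, CI high)
postmenopausal = (1.44, 1.11, 1.87)
premenopausal = (1.02, 0.82, 1.27)

z, p = heterogeneity_two_strata(postmenopausal, premenopausal)
print(f"z = {z:.2f}, approximate heterogeneity P = {p:.3f}")
```

Because the inputs are rounded to two decimals and the original model was covariate-adjusted, this should be read as a rough plausibility check on the borderline heterogeneity rather than a replacement for the reported test.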
It is well established that endogenous estrogen in postmenopausal women is mainly produced in adipose tissue through the conversion of androgens by aromatase [32]. Circulating levels of estrogens and androgens have been shown to be positively associated with the risk of breast cancer in postmenopausal women, particularly for the ER positive tumor subtype [19][33][34]. In some case-control studies, certain polymorphisms in fibroblast growth factor receptor 2 (FGFR2) and methionine synthase reductase (MTRR) were reported to elevate individual susceptibility to postmenopausal breast cancer [35][36][37]. The FGFR2 gene was found to affect carcinogenesis through cellular signal transduction [38][39][40], while the MTRR gene mainly influences folate metabolism and plays important roles in DNA methylation and synthesis [37]. Furthermore, a multi-variant analysis of genes in the estrogen metabolic pathway revealed that the association was concentrated in polymorphisms of the androgen-to-estrogen conversion sub-pathway, and that this association was confined to postmenopausal women with sporadic estrogen receptor positive tumors [41]. Considering these results, we hypothesize that the SNP rs4784227 may influence transcriptional regulation in the androgen-to-estrogen conversion sub-pathway, which could result in longer estrogen exposure for postmenopausal women and an increased risk of BC. This may also partly explain why rs4784227 is more strongly associated with ER/PR positive breast cancer. However, this is only one speculation about the possible biological mechanisms linking rs4784227 and breast cancer, and it needs to be confirmed by follow-up studies.
In conclusion, the present study confirmed that SNPs rs2046210 and rs4784227 contribute to increased breast cancer susceptibility in a Southeast Chinese population. Moreover, our data provide additional evidence for the correlations among genetic variants, risk factors and tumor molecular subtypes. One main limitation of this study is that the sample size was still not large enough, which can affect the precision and accuracy of the results; in addition, some epidemiological characteristics could not be included in the stratified analyses. The functions of these two SNPs also remain unclear. Therefore, larger ethnically matched studies are warranted, and further investigations into the potential biological mechanisms of these two polymorphisms are needed.