Genetic Variants at 12p11 and 12q24 Are Associated with Breast Cancer Risk in a Chinese Population

Background A recent genome-wide association study (GWAS) identified three new breast cancer susceptibility loci at 12p11, 12q24 and 21q21 in populations of European descent. However, because of genetic heterogeneity, the role of these loci in breast cancer susceptibility in populations of non-European descent is largely unknown. Methodology/Principal Findings Here, we genotyped three variants (rs10771399 at 12p11, rs1292011 at 12q24 and rs2823093 at 21q21) in an independent case–control study with a total of 1792 breast cancer cases and 1867 cancer-free controls in a Chinese population. We found that rs10771399 and rs1292011 were significantly associated with risk of breast cancer, with per-allele odds ratios (ORs) of 0.85 (95% confidence interval (CI): 0.76–0.96; P = 0.010) and 0.84 (95% CI: 0.76–0.95; P = 4.50×10−3), respectively, consistent with those reported in populations of European descent. Similar effects were observed between ER/PR positive and negative breast cancer for both loci. However, we did not find a significant association between rs2823093 and breast cancer risk (OR = 0.97, 95% CI: 0.76–1.24; P = 0.795). Conclusions/Significance Our results indicate that genetic variants at 12p11 and 12q24 may also play an important role in breast cancer development in Chinese women.


Introduction
Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death among women around the world, accounting for 23% of new cancer cases and 14% of cancer deaths in 2008 [1]. In the past 10 years, breast cancer incidence has increased by 20–30% in China's urban registries [2]. Menstrual and reproductive factors have been associated with risk of breast cancer among both pre- and postmenopausal women in China [3]. In addition to environmental exposures, including lifestyle and behavioral factors, understanding the genetic factors related to breast cancer is also important because identifying such factors may be useful for risk prediction. Over the last 5 years, genome-wide association studies (GWAS), a powerful method for investigating the genetic determinants of complex diseases, have identified more than 20 common breast cancer susceptibility loci [4][5][6][7][8][9][10][11][12][13][14][15]. However, most GWAS were conducted in populations of European descent; it is therefore necessary to evaluate the generalizability of GWAS findings in populations of other ancestries.
Recently, Maya et al. carried out a large-scale genome-wide association analysis of breast cancer and further evaluated 72 promising associations in ~70 000 cases and ~68 000 controls from 41 case-control studies and 9 breast cancer GWAS. Three new breast cancer susceptibility loci, rs10771399 (12p11), rs1292011 (12q24) and rs2823093 (21q21), were identified in European women, and rs10771399 was also associated with breast cancer risk in Asian women [16]. Following these findings, Antonis et al. genotyped rs10771399 and rs1292011 in 12 599 BRCA1 and 7132 BRCA2 mutation carriers, and found that rs10771399 near PTHLH was also associated with breast cancer risk for BRCA1 mutation carriers [17]. To evaluate whether these three newly identified variants are also associated with breast cancer risk in Chinese women, we conducted a case-control study with 1792 breast cancer cases and 1867 controls in a Chinese population in Jiangsu province of eastern China.

Ethics Statement
This study was approved by the institutional review board of Nanjing Medical University. The design and conduct of the current study involving human subjects were clearly described in a research protocol. All participants took part voluntarily and provided written informed consent before taking part in this research.

Study subjects
A total of 1792 breast cancer cases and 1867 cancer-free controls were included in the current study, which has been described previously [18]. In brief, all of the breast cancer cases were consecutively recruited, without restrictions of age or histological type, from the First Affiliated Hospital of Nanjing Medical University, the Cancer Hospital of Jiangsu Province and the Gulou Hospital, Nanjing, China, from January 2004 to April 2010. These breast cancer patients were newly diagnosed and histopathologically confirmed. Those who had a history of cancer, metastasized cancer from other organs, radiotherapy or chemotherapy were excluded from the case group. All of the cancer-free female controls were randomly selected from more than 30 000 participants in a community-based screening program conducted in Jiangsu province, China. These controls were frequency-matched to the cases on age (5-year interval) and residential area (urban or rural) and were recruited during the same period as the patients. All of the cases and the controls were genetically unrelated, ethnic Han Chinese women. After informed consent was obtained from each participant, a structured questionnaire was completed face-to-face by trained interviewers to collect individual information including demographic data, menstrual and reproductive history, environmental exposure, history of benign breast disease and family history of breast cancer in first-degree relatives (parents, siblings, and children). After the interview, approximately 5 ml of venous blood was collected from each subject. The estrogen receptor (ER) and progesterone receptor (PR) status, determined by immunohistochemical examination of breast tumors, was obtained from the patients' hospital medical records.

Genotyping
Genotyping was performed using the TaqMan allelic discrimination assay on the 7900HT Real-time PCR System (Applied Biosystems, Foster City, CA) without knowledge of the subjects' status (case or control). Two negative controls were included in each 384-well reaction plate for quality control, and the genotyping results were determined using SDS 2.3 Allelic Discrimination Software (Applied Biosystems).

Statistical analyses
Hardy-Weinberg equilibrium for the distribution of each single nucleotide polymorphism (SNP) was evaluated using the goodness-of-fit χ² test by comparing the observed genotype frequencies with the expected ones among the controls. Differences between the cases and the controls in demographic characteristics, selected variables and genotype frequencies were analyzed using the Student's t test (for continuous variables) and the χ² test (for categorical variables). Logistic regression analyses were employed to evaluate the associations between SNPs and the risk of breast cancer and to estimate the odds ratios (ORs) and their 95% confidence intervals (CIs) with adjustment for age, age at menarche and menopausal status. The heterogeneity of associations between subgroups was assessed using the χ²-based Q-test.
All of the statistical analyses were performed with Statistical Analysis System software (9.1.3; SAS Institute, Cary, NC, USA).
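As an illustration of the Hardy-Weinberg check described above, the goodness-of-fit χ² test can be sketched in a few lines of Python. This is a minimal sketch, not the SAS procedure used in the study: the function name `hwe_chi_square` and the genotype counts are hypothetical, and the P value uses the closed form for a χ² distribution with 1 degree of freedom.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) and returns (chi2, p_value).
    Assumes both alleles are present, so every expected count is positive.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p                            # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only
chi2, p_value = hwe_chi_square(1200, 580, 87)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 in the controls would be read as no evidence of departure from Hardy-Weinberg equilibrium.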

Results
Characteristics of the 1792 breast cancer cases and the 1867 cancer-free controls are presented in Table 1. Age was comparable between cases and controls (P > 0.05). The breast cancer cases showed an earlier age at menarche (P < 0.05), a later age at first live birth (P < 0.05) and a different distribution of menopausal status (P < 0.05) when compared with the controls. Among the 1792 breast cancer subjects, 803 (55.5%) cases were ER positive while 810 (56.1%) cases were PR positive.
The genotype distributions of the three SNPs in cases and controls and their associations with breast cancer risk are summarized in Table 2. The observed genotype frequencies of the three SNPs followed Hardy-Weinberg equilibrium in the controls (P > 0.05 for all SNPs). Significant associations were observed between rs10771399 at 12p11 and rs1292011 at 12q24 and breast cancer risk (P = 0.010 and 4.50×10−3, respectively). The G allele of rs10771399 and the G allele of rs1292011 were associated with a decreased risk of breast cancer (per-allele OR = 0.85, 95% CI: 0.76–0.96, P = 0.010; and per-allele OR = 0.84, 95% CI: 0.76–0.95, P = 4.50×10−3, respectively), which is consistent with the results reported by Maya et al. [16]. However, no significant association was observed between rs2823093 at 21q21 and breast cancer risk.
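The relation between a reported OR, its 95% CI and the Wald P value can be sketched as follows. This is a rough consistency check, not the adjusted logistic regression used in the study; the function name `wald_from_ci` is hypothetical. Because the published estimates are rounded to two decimals, the recovered P values are expected to approximate the reported ones rather than reproduce them exactly.

```python
import math

def wald_from_ci(or_est, lower, upper, z_crit=1.959964):
    """Recover the standard error of log(OR), the Wald z statistic and the
    two-sided P value from an odds ratio and its 95% confidence limits."""
    log_or = math.log(or_est)
    # A 95% CI on the log scale spans 2 * z_crit standard errors
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value

# Rounded estimates as reported in the text
for snp, est in {
    "rs10771399": (0.85, 0.76, 0.96),
    "rs1292011": (0.84, 0.76, 0.95),
    "rs2823093": (0.97, 0.76, 1.24),
}.items():
    se, z, p = wald_from_ci(*est)
    print(f"{snp}: SE = {se:.3f}, z = {z:.2f}, P = {p:.3g}")
```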
We further evaluated the associations of rs10771399 and rs1292011 with breast cancer risk in subgroups stratified by age, age at menarche, age at first live birth, and menopausal status (premenopausal and natural menopausal). As shown in Table 3, there was no significant difference between subgroups for associations of rs10771399 and rs1292011 with breast cancer risk (P for heterogeneity > 0.05). Similar per-allele ORs were observed between ER/PR positive and negative breast cancer for both SNPs.
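For two subgroups, the χ²-based Q-test of heterogeneity reduces to a comparison of the two log ORs weighted by their inverse variances. The following is a minimal sketch under that two-subgroup assumption; the function name `q_test_two_groups` and the subgroup estimates are hypothetical and are not values from Table 3.

```python
import math

def q_test_two_groups(or1, se1, or2, se2):
    """Cochran's Q for two subgroup estimates (1 degree of freedom).

    or1, or2: subgroup odds ratios; se1, se2: standard errors of log(OR).
    Returns (Q, p_value).
    """
    betas = (math.log(or1), math.log(or2))
    weights = (1 / se1 ** 2, 1 / se2 ** 2)       # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    # Upper tail of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(q / 2))
    return q, p_value

# Hypothetical subgroup estimates, for illustration only
q, p_het = q_test_two_groups(0.80, 0.08, 0.90, 0.08)
print(f"Q = {q:.3f}, P for heterogeneity = {p_het:.3f}")
```

A P for heterogeneity above 0.05 would indicate no significant difference between the two subgroup-specific ORs.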

Discussion
In this study, we evaluated the associations of genetic variants at 12p11 (rs10771399), 12q24 (rs1292011) and 21q21 (rs2823093) with breast cancer susceptibility in an independent case-control study with 1792 breast cancer cases and 1867 controls in a Chinese population. We found that rs10771399 and rs1292011, but not rs2823093, were significantly associated with altered risk of breast cancer in our population. The OR of 0.85 for rs10771399 in this Chinese population was similar to those observed in
Because rs10771399 is a proxy for the 12p11 region, we further functionally annotated the linkage disequilibrium (LD) block containing rs10771399 using the UCSC genome browser, and found that no known genes overlapped with this region (Figure S1). The nearest gene, PTHLH, encodes a protein that regulates endochondral bone development and epithelial-mesenchymal interactions during the formation of the mammary glands, whose receptor is responsible for most cases of humoral hypercalcemia of malignancy [19]. This gene has been reported to be involved in the metastasis of breast cancer [20]. We searched for SNPs in strong LD (r² > 0.8) with rs10771399 based on CHB/JPT data from the 1000 Genomes pilot (Table S1), and predicted the biological functions of these variants using an online tool, RegulomeDB [21]. As shown in Table S1, three SNPs (rs788463, rs10843066 and rs7957915) were predicted to be in regulatory elements and likely to affect the binding of transcription factors. For example, rs788463 lies 2.4 kb upstream of the lead SNP rs10771399 and is in complete LD (r² = 1.0) with it. Analyses of DNase footprinting and position-weight matrices (PWM) suggest that rs788463 is in the binding sites of the transcription factors Osf2, C/EBP and C/EBPalpha. ChIP-seq data suggest that this variant is located at sites of multiple histone modification marks, including H3K27me3, H3K4me1 and H3K4me2.
Taken together, this evidence indicates that genetic variants at 12p11 (such as rs788463) may affect the binding sites of transcription factors (such as C/EBP), modify the function of regulatory elements and ultimately be involved in the development of breast cancer. However, the potential target genes are unclear, and the above statements need to be experimentally validated in the future. Similarly, we functionally annotated the LD block containing rs1292011 at 12q24 (Figure S2) and performed functional prediction for SNPs in strong LD with rs1292011 (r² > 0.8) (Table S2). There are no genes in the LD region of rs1292011. The variant rs1391721, which is close to the lead SNP rs1292011 (390 bp upstream) and highly correlated with it (r² = 0.96), has the potential to affect the binding between transcription factors and their binding sites. This variant falls within a footprint containing Evi-1 and GATA-1/2/3 motifs in the MCF-7 cell line and is predicted to substantially affect footprinting of these motifs, resulting in allelic imbalance in chromatin accessibility. Again, this evidence is based on bioinformatic prediction, and functional experiments are warranted to validate these insights.
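The r² statistic used above to define strong LD can be computed from phased haplotype counts for a pair of biallelic SNPs. The following is a minimal sketch of that definition, assuming phased haplotypes are available; the function name `ld_r_squared` and the counts are hypothetical and are not taken from the 1000 Genomes or HapMap data.

```python
def ld_r_squared(n11, n10, n01, n00):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    n11: haplotypes carrying the minor allele at both SNPs
    n10: minor allele at SNP 1 only
    n01: minor allele at SNP 2 only
    n00: minor allele at neither SNP
    Assumes both SNPs are polymorphic in the sample.
    """
    total = n11 + n10 + n01 + n00
    p1 = (n11 + n10) / total          # minor allele frequency at SNP 1
    p2 = (n11 + n01) / total          # minor allele frequency at SNP 2
    d = n11 / total - p1 * p2         # LD coefficient D
    return d * d / (p1 * (1 - p1) * p2 * (1 - p2))

# Hypothetical haplotype counts, for illustration only
r2 = ld_r_squared(28, 2, 1, 149)
print(f"r^2 = {r2:.2f}")
```

Under this definition, r² = 1.0 means that the two SNPs carry identical information in the sample, which is why a lead SNP and a perfectly correlated neighbor cannot be separated by association data alone.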
We did not observe significant evidence of an association between rs2823093 and breast cancer risk in our study. Maya et al. [16] also reported no significant association of rs2823093 in women of Asian ancestry (OR = 1.14, 95% CI: 0.93–1.40) from BCAC (Breast Cancer Association Consortium). Such discrepancies between populations are common. For example, in a consortium effort including 23637 breast cancer patients and 25579 controls of East Asian ancestry, Zheng et al. [22] investigated 67 independent breast cancer susceptibility loci identified by GWAS primarily in European-ancestry populations and found 31 loci at P < 0.05 in a direction consistent with that reported previously. For the inconsistency of rs2823093 in this study, one explanation is genetic heterogeneity between populations: the identified SNP rs2823093 may be a good proxy for causal variants at 21q21 in populations of European descent but a poor one in Asians. Moreover, as shown in Table 2, in contrast to the other two replicated SNPs, the minor allele frequency of rs2823093 is lower in the Asian population (MAF = 0.042) than in populations of European descent (MAF = 0.300), which also reflects the ethnic difference and may result in low statistical power in this study. Another explanation is that the 21q21 locus may have no effect in Asian women because it is inactive in the absence of a specific exposure that is common among European women but rare among Asians. Taken together, discrepancies between ethnicities in the association between genetic variants and breast cancer risk could be explained partly by different genetic backgrounds. Other possible explanations include disease heterogeneity, study design and sample size. Well-designed fine-mapping studies with large sample sizes, following resequencing of this region, may help to clearly evaluate the role of 21q21 in breast cancer development in populations of non-European descent.
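The effect of minor allele frequency on statistical power can be illustrated with a normal-approximation power calculation for an allele-based comparison of cases and controls. This is a rough sketch, not the adjusted logistic regression used in the study: it assumes independent alleles, a two-sided test at α = 0.05 with the opposite tail ignored, and a purely hypothetical per-allele OR of 0.90. Only the sample sizes and the two MAF values come from the text.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

def allelic_power(maf_controls, odds_ratio, n_cases, n_controls, z_crit=1.959964):
    """Approximate power of a two-proportion z test on allele frequencies."""
    p0 = maf_controls
    # Allele frequency in cases implied by the per-allele odds ratio
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    a1, a0 = 2 * n_cases, 2 * n_controls          # number of alleles
    p_bar = (a1 * p1 + a0 * p0) / (a1 + a0)
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / a1 + 1 / a0))
    se_alt = math.sqrt(p1 * (1 - p1) / a1 + p0 * (1 - p0) / a0)
    return normal_cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)

HYPOTHETICAL_OR = 0.90   # assumed effect size, not an estimate from the study
power_low_maf = allelic_power(0.042, HYPOTHETICAL_OR, 1792, 1867)
power_high_maf = allelic_power(0.300, HYPOTHETICAL_OR, 1792, 1867)
print(f"power at MAF 0.042: {power_low_maf:.2f}")
print(f"power at MAF 0.300: {power_high_maf:.2f}")
```

With the sample size held fixed, the same assumed effect is detected with lower power at the lower allele frequency, which is the mechanism the paragraph above refers to.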
In summary, our study evaluated, in a Chinese population, three new breast cancer loci that were identified in populations of European descent. Our results suggest that genetic variants at 12p11 and 12q24, tagged by rs10771399 and rs1292011, respectively, may also play an important role in breast cancer susceptibility in Chinese women. Further studies are warranted to clarify the biological mechanisms linking these two loci with breast cancer risk and to determine the causal variants in breast carcinogenesis.

Supporting Information
Figure S1 Overview of the LD block containing rs10771399 at 12p11 from the UCSC browser (NCBI36/hg18). A 250-kb window upstream and downstream of the proxy SNP rs10771399 at 12p11 was annotated. The linkage disequilibrium (LD) region was generated using the HaploView 4.2 software according to HapMap II+III CHB data. (DOC)
Figure S2 Overview of the LD block containing rs1292011 at 12q24 from the UCSC browser (NCBI36/hg18). A 250-kb window upstream and downstream of the proxy SNP rs1292011 at 12q24 was annotated. The linkage disequilibrium (LD) region was generated using the HaploView 4.2 software according to HapMap II+III CHB data.