Strong Impact of TGF-β1 Gene Polymorphisms on Breast Cancer Risk in Indian Women: A Case-Control and Population-Based Study

Introduction. TGF-β1 is a multi-functional cytokine that plays an important role in breast carcinogenesis. The critical role of TGF-β1 signaling in breast cancer progression is well documented. Some TGF-β1 polymorphisms influence its expression; however, their impact on breast cancer risk is not clear.
Methods. We analyzed 1222 samples in a candidate gene-based genetic association study on two distantly located and ethnically divergent case-control groups of Indian women, followed by a population-based genetic epidemiology study analyzing these polymorphisms in other Indian populations. The c.29C>T (Pro10Leu, rs1982073 or rs1800470) and c.74G>C (Arg25Pro, rs1800471) polymorphisms in the TGF-β1 gene were analyzed using direct DNA sequencing, and peripheral levels of TGF-β1 were measured by ELISA.
Results. The c.29C>T substitution increased breast cancer risk, irrespective of ethnicity and menopausal status. On the other hand, the c.74G>C substitution reduced breast cancer risk significantly in the north Indian group (p = 0.0005), and only in pre-menopausal women. The protective effect of the c.74G>C polymorphism may be ethnicity-specific, as no association was seen in the south Indian group. The polymorphic status of c.29C>T was comparable among Indo-Europeans, Dravidians, and Tibeto-Burmans. Interestingly, we found that Tibeto-Burmans lack polymorphism at the c.74G>C locus, as is true for Chinese populations. However, the Brahmins of Nepal (Indo-Europeans) showed polymorphism in 2.08% of alleles. Mean TGF-β1 was significantly elevated in patients in comparison to controls (p<0.001).
Conclusion. The c.29C>T and c.74G>C polymorphisms in the TGF-β1 gene significantly affect breast cancer risk, which correlates with the elevated TGF-β1 level in the patients. The c.29C>T locus is polymorphic across ethnically different populations, but the c.74G>C locus is monomorphic in Tibeto-Burmans and polymorphic in other Indian populations.


Introduction
Transforming growth factor beta (TGF-β) signaling is one of the most commonly altered cellular pathways in human cancers [1][2][3][4]. TGF-β1 is a multi-functional cytokine that plays an important role in breast carcinogenesis [5]. TGF-β1 is a potent inhibitor of proliferation of epithelial, endothelial and hematopoietic cells, and it acts as a tumor suppressor. TGF-β1 has a dual role in carcinogenesis: it has tumor-suppressive effects in epithelial cells, but promotes tumor invasion and metastasis during later stages of carcinoma progression [6][7][8]. Specific pathways are involved in the switch between the pro- and anti-tumor roles of TGF-β1 [9]. A majority of breast cancers secrete elevated TGF-β1 in the tumor micro-environment, associated with malignant epithelial cells, stromal cells, or both [10]. Increased immunoreactivity for TGF-β protein correlates with poor prognosis and increased lymph node involvement [11], and elevated TGF-β is associated with tamoxifen resistance [12]. The role of TGF-β in cancer stem cells has been widely recognized [13,14], and TGF-β signaling in breast cancer has been extensively reviewed [15]. Consequently, TGF-β is regarded as a potential target for the management of cancer [16][17][18], and inhibition of TGF-β has been tried for treating cancer, but without significant success so far [19][20][21][22][23][24][25][26][27][28].
TGF-β genes are regarded as low-penetrance genes in cancer [29]. There are three isoforms of TGF-β (TGF-β1, TGF-β2, and TGF-β3), of which TGF-β1 is the most widely expressed [30]. The TGF-β1 gene is located on chromosome 19q13.1 (OMIM 190180) [31]. So far, several polymorphisms in the TGF-β1 gene have been reported and found to affect TGF-β1 protein expression [32]. The relationship between TGF-β1 polymorphisms and breast cancer has been studied in several populations and remains a subject of research interest because the data lack consensus [33][34][35][36][37][38][39][40][41]. One of the most commonly studied polymorphisms in the TGF-β1 gene is the c.29C>T substitution (rs1800470), resulting in a proline (CCG) to leucine (CTG) change at codon 10 (Pro10Leu) of the protein [29]. Another substitution, c.74G>C (rs1800471), resulting in replacement of arginine (CGG) with proline (CCG) at codon 25 (Arg25Pro) of the protein, has been relatively less studied [42]. The c.29C>T substitution results in increased secretion of the cytokine [43], making it a strong candidate for analysis in breast cancer. These polymorphisms have not been widely analyzed in Indian populations, apart from analysis of the c.29C>T polymorphism in some Indian populations [44][45][46].
We conducted the present case-control study on a fairly large sample size to: 1) investigate the association between TGF-β1 polymorphisms (c.29C>T and c.74G>C) and breast cancer risk in India, 2) evaluate variation of the association across ethnically different populations, 3) compare genotype frequencies of these polymorphisms between Dravidian, Indo-European and Tibeto-Burman populations of India, and 4) compare TGF-β1 genotypes with other Asian populations from a medico-evolutionary point of view.
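The codon assignments quoted for the two substitutions can be checked with a short sketch. The Python below is illustrative only: the helper names `locate` and `substitute` and the three-entry codon table are assumptions introduced here, not part of the study. It maps a coding-sequence position to its codon number and applies the stated base change to the stated codon.

```python
# Minimal subset of the standard genetic code, limited to the codons
# named in the text (illustrative; not a full codon table).
CODON_TABLE = {"CCG": "Pro", "CTG": "Leu", "CGG": "Arg"}


def locate(cds_position):
    """Map a 1-based coding-sequence position to (codon number, 0-based offset in codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    offset = (cds_position - 1) % 3
    return codon_number, offset


def substitute(codon, offset, ref, alt):
    """Replace the reference base at `offset` with the alternate base."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


# c.29C>T: position 29 falls on the second base of codon 10 (CCG -> CTG).
codon_no, off = locate(29)
mutant = substitute("CCG", off, "C", "T")
print(codon_no, CODON_TABLE["CCG"], "->", CODON_TABLE[mutant])  # 10 Pro -> Leu

# c.74G>C: position 74 falls on the second base of codon 25 (CGG -> CCG).
codon_no, off = locate(74)
mutant = substitute("CGG", off, "G", "C")
print(codon_no, CODON_TABLE["CGG"], "->", CODON_TABLE[mutant])  # 25 Arg -> Pro
```

Both substitutions hit the middle base of their codon, which is why each one changes the encoded amino acid.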

Study subjects
Ethics statement. This case-control study was carried out with the approval of the Ethics Committee of the King George's Medical University, Lucknow. The subject recruitment and sample collection were done only after obtaining written informed consent of the participants.
The north Indian group, consisting of 113 patients and 113 control samples, was recruited from the Department of Surgery, King George's Medical University, Lucknow. The South Indian group, consisting of 352 patients and 126 control samples, was recruited from the Rai Memorial Hospital, Chennai, Nizam's Institute of Medical Sciences, Hyderabad, and Kasturba Medical College, Manipal University, Manipal. Women with histopathologically confirmed diagnosis of breast cancer were recruited as cases. Women visiting the clinic for problems other than breast cancer were recruited as controls after proper clinical investigation and/or a mammogram confirming no evidence of breast cancer. Women with any breast disorder or other systemic inflammatory disease were excluded from the control group. General health history of the cases and controls was collected with an appropriately designed proforma. A detailed description of the general and clinical characteristics of the patients is provided in Table 1.
Three (Dravidian, Indo-European, and Tibeto-Burman) of the four major linguistic groups inhabiting the Indian mainland were included in this study. After analyzing the Indo-European case-control group from northern India and the Dravidian case-control group from southern India, we extended the analysis to the Tibeto-Burman populations from north-eastern India. Striking differences in allele frequency between Indian and East Asian (Chinese) populations [47], particularly at the c.74G>C locus [48], encouraged us to genotype both SNPs in Tibeto-Burman populations in order to further explore the medico-evolutionary significance of TGF-β1 polymorphisms. Tibeto-Burmans in India have close genetic affinities with East Asian populations [49]. We recruited a total of 508 Tibeto-Burmans from north-eastern regions of India, Nepal, and those residing in other states of India. Samples were collected from Khasi of Meghalaya, Ao-Naga, Naga Sema, and Chakhesang Naga of Nagaland, Nyshi of Arunachal Pradesh, Mizo of Mizoram, Poumai Naga of Manipur, Sherpa and Subba of Darjeeling (West Bengal), and Tibeto-Burmans residing in Mysore (Karnataka). Since both Indo-European and Tibeto-Burman populations inhabit Nepal, we recruited people from the Nepali Brahmin (Indo-European) and Magar (Tibeto-Burman) communities to compare the genotype frequency with other populations of South-East Asia.

Genotyping
Isolation of DNA for genotyping was carried out as described in our earlier report [50]. The target TGF-β1 fragment was amplified using the primers GAGGCCCTCCTACCTTTTG (F) and GCAGCTTGGACAGGATCT (R), and PCR products were analyzed on a 2% agarose gel stained with ethidium bromide. The amplified products were analyzed by direct DNA sequencing using the big dye chain terminator cycle sequencing kit (ABI) on a 3730 DNA analyzer (Applied Biosystems).

Statistical analysis
Genotype data for the control population were tested for fit to Hardy-Weinberg equilibrium. Statistical software available at http://ihg.gsf.de/cgi-bin/hw/hwa1.pl was employed for this purpose. The frequencies of the two alleles at the polymorphic sites were compared between cases and controls to find the risk allele. Genotype data were compared using a 2×3 contingency table with the Chi Square test or Fisher's exact test, using statistical tools available at http://www.vassarstats.net. Fisher exact P values were calculated using 2×2 or 2×3 contingency tables, but wherever the software could not calculate Fisher exact values due to large sample size, the Chi Square P value was used. Peripheral values of TGF-β1 were compared between cases and controls using the non-parametric Mann-Whitney U-test. Age-dependent multivariate Cox regression analysis was used to assess the genotype-associated risk factors of breast cancer, considering genotypes as a risk event and socio-demographic factors as other variables (confounder covariates). Two-sided P values of less than 0.05 were considered significant for statistical inference.
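As a rough illustration of the Hardy-Weinberg check, the sketch below computes expected genotype counts from the observed allele frequency, a goodness-of-fit statistic, and the inbreeding coefficient F. It is an assumption-laden stand-in: the study used an online tool and reported exact P values, whereas this sketch uses the chi-square approximation with one degree of freedom. The function name and the genotype counts are hypothetical.

```python
import math


def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value, F). This is a chi-square approximation,
    not the exact test reported in the text. Assumes both alleles are
    present, so that the expected heterozygote count is non-zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # F = 1 - observed heterozygotes / expected heterozygotes.
    f = 1.0 - n_ab / expected[1]
    return chi2, p_value, f


# Hypothetical control genotype counts (CC, CT, TT), not taken from the study.
chi2, p_value, f = hardy_weinberg_test(25, 50, 25)
print(chi2, p_value, f)  # 0.0 1.0 0.0 (counts exactly at equilibrium)
```

A value of F near zero with a large P value, as reported for the control groups in this study, indicates no detectable departure from equilibrium.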

Subject characteristics
We did not find any statistically significant difference in general characteristics between cases and controls (Table 1). However, a somewhat larger proportion of breast cancer patients in the north Indian group fell in the younger age group (15.93% versus 1.99%, Table 1). More than 88% of breast cancer patients in both north Indian and south Indian groups were sporadic. The incidence of familial breast cancer in our subject population was about 11%, which is lower than reported in other populations. There was no apparent correlation between tobacco chewing or smoking and the incidence of breast cancer in the study population.

TGF-β c.29C>T (codon 10) polymorphism
Genotype data were in Hardy-Weinberg equilibrium for both north Indian (F = 0.0186, Exact P = 1.0) and south Indian groups (F = 0.0648, Exact P = 0.586). Analysis of the pooled data for all breast cancer patients versus controls showed that the C>T substitution increased breast cancer risk (p = 0.00007 for allele comparison and 0.000003 for genotype comparison) (Table 2). Group-wise analysis showed that the C>T substitution at codon 10 increased breast cancer risk in both north Indian (p = 0.0012 for allele comparison and 0.0037 for genotype comparison) and south Indian groups (p = 0.0413 for allele comparison and 0.0004 for genotype comparison) (Table 3). Sub-group analysis showed that the C>T substitution increased breast cancer risk in the north Indian group, irrespective of menopausal status (Table 4). However, in south Indians, although the association was significant in post-menopausal women, it was only marginally significant in pre-menopausal women (Table 4).
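The allele and genotype comparisons reported here can be sketched as contingency-table tests. The example below is a simplified illustration, assuming a Pearson chi-square test without continuity correction (the study used Fisher's exact test where the software allowed it). All function names and genotype counts are hypothetical and are not the study's data.

```python
import math


def chi_square(table):
    """Pearson chi-square test for a 2x2 or 2x3 table (df = 1 or 2).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, df = 1
    elif df == 2:
        p_value = math.exp(-stat / 2.0)             # chi-square tail, df = 2
    else:
        raise ValueError("only 2x2 and 2x3 tables are supported in this sketch")
    return stat, df, p_value


def allele_table(case_genotypes, control_genotypes):
    """Collapse (CC, CT, TT) genotype counts into a 2x2 table of C and T allele counts."""
    rows = []
    for cc, ct, tt in (case_genotypes, control_genotypes):
        rows.append([2 * cc + ct, 2 * tt + ct])
    return rows


def odds_ratio_for_t(alleles):
    """Odds ratio for carrying the T allele in cases relative to controls."""
    (case_c, case_t), (ctrl_c, ctrl_t) = alleles
    return (case_t * ctrl_c) / (case_c * ctrl_t)


# Hypothetical (CC, CT, TT) counts, for illustration only.
cases, controls = (20, 50, 30), (40, 45, 15)
alleles = allele_table(cases, controls)
print("genotype (2x3):", chi_square([list(cases), list(controls)]))
print("allele (2x2):", chi_square(alleles), "OR(T):", odds_ratio_for_t(alleles))
```

Reporting both tests mirrors the text: the allele comparison asks whether T is over-represented on case chromosomes, while the genotype comparison tests the distribution of CC, CT, and TT as a whole.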

TGF-β c.74G>C (codon 25) polymorphism
Genotype data for this polymorphism were in Hardy-Weinberg equilibrium for both north Indian (F = 0.031, Exact P = 0.656) and south Indian groups (F = 0.0413, Exact P = 1.0). Analysis of the pooled data for both study groups showed that the codon 25 polymorphism was not associated with breast cancer risk (p = 0.063 for allele comparison and 0.165 for genotype comparison) (Table 2). In group-wise analysis, a significant association was observed in the north Indian group (p = 0.0016 for allele comparison and 0.0018 for genotype comparison) (Table 5), such that the substitution was protective against breast cancer. However, the polymorphism showed no association in the south Indian group (p = 0.327 for allele comparison and 0.554 for genotype comparison) (Table 5). In sub-group analysis on the basis of menopausal status, the difference was significant only in the pre-menopausal group of north Indian women (p = 0.011 for allele comparison and p = 0.005 for genotype comparison) (Table 6). In the post-menopausal group, no difference between cases and controls was seen at the genotype level (p = 0.104). The frequencies of the two alleles and the genotypes at this site were comparable between south Indian cases and controls (Table 5), and the protective effect seen in the north Indian group was not evident in the south Indian group (Tables 5 and 6).
The polymorphic status of +29C>T was comparable among the Indo-European (North), Dravidian (South), and Tibeto-Burman (North-East) Indian populations (Figure 1). Interestingly, the +74G>C substitution was observed in the Indo-European and Dravidian populations at a frequency of 5-8%, but was completely absent in Tibeto-Burmans. Tibeto-Burmans invariably possessed the 'GG' genotype at the +74G>C locus. The Magar group (Tibeto-Burman) of Nepal also did not exhibit any polymorphism at this locus. However, the Brahmins of Nepal (Indo-European) showed a polymorphism frequency comparable to other Indo-European populations. It is clear that the polymorphism at the c.29C>T locus is very common and widespread. On the other hand, the c.74G>C locus is polymorphic in the Dravidian and Indo-European populations, but completely monomorphic in the Tibeto-Burman populations of India, irrespective of location and caste status.

Serum level of TGF-β1 in breast cancer patients and control subjects
We also measured the serum level of TGF-β1 in a subset of cases and controls of the north Indian group (Figure 2). The mean peripheral TGF-β1 level was significantly higher in the cases than in the controls (U = 324.00, p<0.001; Figure 2a). Further, the mean TGF-β1 level in cases was also significantly higher than in controls across all three genotypes at the +29C>T polymorphism (CC: U = 72.00, p = 0.028; CT: U = 3.00, p<0.001; and TT: U = 11.00, p = 0.042) (Figure 2b). In contrast, at the +74G>C polymorphism, the GG genotype showed a significantly higher mean TGF-β1 level in cases than in controls (U = 212.00, p<0.001), but the TGF-β1 level for the CG+CC genotypes did not differ significantly between the two groups (U = 9.00, p = 0.630; Figure 2c).
TGF-β1 Gene Polymorphism in Indian Women. PLOS ONE | www.plosone.org
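The U statistics quoted for the serum comparison come from a rank-based test. A minimal sketch of the Mann-Whitney U computation follows, assuming a two-sided normal approximation without tie or continuity correction. The study does not state how its P values were derived, so the P value method, the function name, and the sample values here are all hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using mid-ranks and a normal approximation.

    No tie correction or continuity correction is applied, so the p value
    is only a rough guide for small samples or heavily tied data.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        mid_rank = (i + j) / 2.0 + 1.0  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical serum values in arbitrary units, not the study's measurements.
controls = [18.2, 20.1, 22.5, 19.7, 21.3]
cases = [27.4, 30.2, 25.9, 33.1, 28.8]
print(mann_whitney_u(cases, controls))
```

A small U, as in several of the genotype-wise comparisons reported here, means the two groups' values barely overlap in rank order.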

Association of TGF-β polymorphism with covariates
To determine the predictors (covariates) of breast cancer risk, genotype data for both polymorphisms (c.29C>T and c.74G>C) in north Indians were further analyzed by multivariate Cox regression (Table 7). None of the investigated covariates showed a significant association with genotype-associated breast cancer risk, except menopausal status. For both polymorphisms, menopausal status showed a significant (p<0.001) association with breast cancer risk.

Discussion
We observed that the c.29C>T substitution at codon 10 of the TGF-β1 gene significantly increases the risk of breast cancer in Indian populations. The patients exhibited a far higher frequency of the substitution in comparison to the controls. We found that the allele frequency at this locus in Indian populations is comparable to other populations across the globe (see the NCBI database). In sub-group analysis, we found this substitution to increase breast cancer risk irrespective of ethnicity, as both north and south Indian women carrying the substitution were at an increased risk of breast cancer. Comparison of the pre-menopausal and post-menopausal cases with all controls suggested that the c.29C>T substitution increases breast cancer risk irrespective of menopausal status. Three other studies from India have analyzed the c.29C>T locus in breast cancer [44][45][46]. Two of them reported no association of this polymorphism with breast cancer risk [44,45]; however, Joshi et al. (2011) reported that TGF-β1 *29C was protective against breast cancer and suggested this as a plausible reason for the relatively lower incidence of breast cancer in western Indian women in comparison to white women [46]. The allele and genotype frequencies in our study were comparable to those in Joshi et al. (2011), and the data support that *29C is a protective allele and *29T a risk allele. Nevertheless, it is worth noting that our inference is in contrast to two other studies from India [44,45].
Published data on the c.29C>T polymorphism in breast cancer lack consensus. As a result, five meta-analyses have been conducted on this polymorphism. Interestingly, all five meta-analyses were published in the same year [29,42,51,52,53]. Two of these meta-analyses reported no association between the c.29C>T polymorphism and breast cancer [29,51]; two others reported no overall association between this substitution and breast cancer risk, but an increased risk of breast cancer with the 10P allele in Caucasians [42,52]; and the fifth reported a significant association of 10P in the overall analysis as well as in the Caucasian group [53]. Contrary to the observations of all these meta-analyses, particularly the latter three, we found the alternate allele ('T' or 'leucine') to be a risk factor for breast cancer. Our results raise further questions about the association of this polymorphism with breast cancer.
We observed that the c.74G>C substitution was significantly protective against breast cancer in the north Indian population only. The north Indian patient population exhibited a lower frequency of the substitution in comparison to the controls. Sub-grouping of north Indian cases according to menopausal status revealed a significant protective effect of this substitution in pre-menopausal women only. A clear ethnicity-based impact on breast cancer risk of the genotypes at the c.74G>C site was evident, as the protective effect of the 'CC' genotype was not seen in the south Indian group. This polymorphism has been relatively less studied in comparison to the c.29C>T substitution. Only one study on breast cancer from India has analyzed this polymorphism, finding no significant difference between cases and controls [44]; however, that study had severely low statistical power because of its very small sample size. Two other Indian studies on TGF-β1 polymorphisms in breast cancer did not analyze this polymorphism [45,46]. We are the first to genotype this polymorphism in a large sample and to report a protective effect of the substitution. Our analysis of Tibeto-Burman populations of India found no variation at this locus. This observation is interesting, but not surprising, as one of our earlier studies showed complete absence of a 25 bp deletion polymorphism in the MyBPC3 gene (causing various forms of cardiomyopathy) in these populations despite its presence in almost all other Indian populations at about 4% frequency [47]. The Shanghai breast cancer study also found no sequence variation at the c.74G>C locus in a cohort of 1111 Chinese patients [48]. Most other populations across the world exhibit a small frequency of the 'C' allele, showing widespread existence of this polymorphism (see the NCBI database).
Table 6. Comparison of +74G>C genotype data between case groups as per menopause and the controls.
The highly polymorphic status of the c.29C>T locus among Indian and North-Eastern Indian populations shows widespread existence of this polymorphism. Monomorphism at the c.74G>C locus points to medical and evolutionary significance associated with this locus. The absence of the protective allele (C) may suggest a relatively higher risk of breast cancer in the Tibeto-Burmans in comparison to the Dravidians and Indo-Europeans. Similarly, the absence of the 'C' allele in Chinese populations may indicate increased breast cancer risk in comparison to Indian populations. This notion is supported by a higher incidence of breast cancer in Chinese populations in comparison to Indian (Dravidian) populations, as reported in an epidemiological study comparing breast cancer incidence over a period of three decades [54]. From an evolutionary point of view, our data further support the proposal that the people of the north-eastern region of India are genetically closer to Chinese/East Asian populations [49].
We observed that the TGF-β1 level in breast cancer patients was significantly elevated as compared to the control group. The elevated TGF-β1 level could be due to a higher frequency of the risk genotypes in the cases. Further analysis on the basis of genotypes suggested that the TGF-β1 level of cases in comparison to controls was significantly higher in all genotypes of the c.29C>T locus, while at the c.74G>C locus it was significantly higher only in the absence of the 'C' allele. Intra-tumoral expression of TGF-β1 has been found to be significantly higher in invasive breast cancer patients [55]. It is well documented that TGF-β1 polymorphic variants are functionally associated with the level of TGF-β1 expression [40,56,57]. Therefore, it is plausible that TGF-β1 polymorphisms affect breast cancer risk by modulating the level of TGF-β1 expression. In multistage progression of tumors, TGF-β exerts growth-inhibitory effects in the initial phase; however, these growth-inhibitory effects are abolished and the malignant tumor-promoting action of TGF-β is activated in the later stages [58]. Significant correlation of TGF-β1 allelic variants with elevated TGF-β1 level suggests their critical role in deciding cancer initiation and progression. Nevertheless, a direct correlation between allelic variants, the level of expression, and cancer risk or progression is difficult to derive, since the level of TGF-β1 expression and its pro- and anti-apoptotic effects may differ at different stages of cancer progression. A stage-specific analysis of the TGF-β1 expression level and haplotype analysis of all the polymorphisms of this locus could help further understand the breast cancer risk associated with TGF-β1 variations. Further details such as ER and HER2 status, treatment outcome, recurrence rate, and drug resistance data would have allowed more detailed investigations, which could not be undertaken because such data were unavailable.
In conclusion, the c.29C>T substitution increases breast cancer risk irrespective of ethnicity and menopausal status. This polymorphism is quite common across the world. The c.74G>C polymorphism, on the other hand, showed ethnic variations such that the substitution decreased breast cancer risk in the north Indian populations, but not in their south Indian counterparts. This could be due to a significant impact of other co-occurring genetic variations affecting the risk due to this polymorphism. In other words, the genetic background perhaps becomes more influential in the case of the c.74G>C polymorphism. The c.74G>C locus is polymorphic across the world with a moderate frequency of the 'CC' genotype, except in the North-East Indian, Nepalese, and Chinese populations. Monomorphism at this locus may suggest increased breast cancer risk in these populations in comparison to other ethnic groups. The increased level of TGF-β1 in the patients in comparison to the controls could suggest a possible mechanism for the effect of TGF-β1 polymorphisms on breast cancer. However, further in vitro studies are required in order to decipher the mechanism of increased cancer risk in the carriers of certain TGF-β1 genotypes. The significant impact of the c.74G>C polymorphism on breast cancer risk encourages more studies on this polymorphism. In addition to identifying genetic risk factors for breast cancer, our study has revealed striking differences in genetic variation between different ethnic groups, which could have important implications for human health.