Synergistic effect of collagenase-1 (MMP1), stromelysin-1 (MMP3) and gelatinase-B (MMP9) gene polymorphisms in breast cancer

Background Extracellular matrix degradation by matrix metalloproteinases (MMPs) is an important mechanism in tumor invasion and metastasis. Genetic variations of MMPs have been associated with multiple cancers. The present study aimed to elucidate the association of MMP-1, -3 and -9 genetic variants with epidemiological and clinicopathological variables by haplotype, LD, MDR, survival and in silico analyses among South Indian women. Material and methods The MMP3-1171 5A/6A and MMP9-1562 C/T SNPs were genotyped by allele-specific polymerase chain reaction and the MMP1-1607 1G/2G polymorphism by restriction fragment length polymorphism assay in 300 BC patients and 300 age-matched healthy controls. Statistical analysis was performed using the SNPStats and SPSS software. Linkage disequilibrium and gene-gene interaction analyses were performed using Haploview and MDR software, respectively. Further, transcription factor binding sites in the promoter regions containing the SNPs under study were predicted using AliBaba2.1 software. Results We observed an increased frequency of the 2G-allele of MMP1, the 6A-allele of MMP3 and the T-allele of MMP9 (p<0.05) in BC subjects. The 2G-6A haplotype (minor alleles of MMP-1 and MMP-3, respectively) showed an increased susceptibility to BC. Further, MMP polymorphisms were associated with clinical characteristics of BC patients such as steroid hormone receptor status, lymph node involvement and metastasis. SNP combinations were in perfect LD in controls. MDR analysis revealed a positive interaction between the SNPs. The 5-year survival rate and Cox regression analysis showed a significant association with clinicopathological variables. Conclusion Our results suggest that the MMP1-1607 1G/2G, MMP3-1171 5A/6A and MMP9-1562 C/T gene polymorphisms have a synergistic effect on breast cancer.
The interactions of MMPs with clinical risk factors such as lymph node involvement showed a strong correlation and might influence the 5-year survival rate, suggesting their potential role in breast carcinogenesis.


Introduction
Breast cancer (BC) is the most common cancer and a leading cause of death in women worldwide and in India. It is a multi-factorial, polygenic disease resulting from the interplay of genetic, epigenetic, environmental and lifestyle factors [1]. The breast microenvironment is composed of extracellular matrix (ECM) and stromal cells, including endothelial and immune cells, fibroblasts and adipocytes, and plays a crucial role in mammary duct morphogenesis. Key enzymes regulating ECM turnover are matrix metalloproteinases (MMPs) and tissue inhibitors of metalloproteinases (TIMPs) [2].
The matrix metalloproteinases are a family of secreted Zn-dependent endopeptidases that play an important role in physiological processes, and their deregulation is associated with various diseases including cancers [3]. Deregulated MMPs contribute to cancer progression through cell proliferation, angiogenesis, invasion [4], metastasis and escape from immune surveillance [5][6]. Activation of MMPs can be controlled by proteolytic enzymes such as plasmin, while their inhibition is controlled by their specific endogenous TIMPs [7].
Progress in the knowledge about the role of MMPs and their inhibitors in tumourigenesis has led to numerous studies testing a potential association of single nucleotide polymorphisms (SNPs) in these genes with cancer susceptibility and progression [10,11]. Although research has been performed to explore the role of SNPs of these genes in breast cancer individually [12][13][14][15][16], the results are inconclusive. Therefore, the present study was performed to derive a more precise estimate of the association of MMPs and to elucidate the synergistic effect of genetic polymorphisms in the regulatory regions of MMP1 (rs1799750), MMP3 (rs35068180) and MMP9 (rs3918242) on susceptibility to and progression of breast cancer by analysis of SNPs, haplotypes, LD, MDR, survival and in silico analysis along with epidemiological and clinicopathological variables in South Indian women.

Ethics statement
This case-control study was carried out with the approval of the Ethics Committee of Mehdi Nawaj Jung (MNJ) Institute of Oncology & Regional Cancer Centre, Hyderabad, Telangana State, INDIA. The subject recruitment and sample collection were done only after obtaining written informed consent from the participants.

Study subjects
The present study consists of 300 confirmed breast cancer patients and 300 healthy controls from South India. Patients with breast cancer were consecutively recruited from the MNJ regional cancer center, Hyderabad; women with any other cancer or other systemic inflammatory disease were excluded from both the case and control groups. Patients were enrolled from the department of oncology between August 2011 and August 2016. Selection criteria for cases included patients who were histopathologically confirmed as having breast cancer by medical and surgical oncologists. The patients were subjected to detailed demographic, clinical and pathological investigations. Staging of cancer was documented according to the AJCC-TNM classification system. During the same period, the control group was drawn from the same region with similar socio-economic status, and the individuals included had no personal history of cancer or other malignant conditions. The general health history of the controls was collected with an appropriately designed proforma.

Data collection
A detailed description of the baseline characteristics of the breast cancer patients and healthy controls is shown in Table 1.
The clinical profile of breast cancer was evaluated with the help of medical and surgical oncologists according to the Union for International Cancer Control (UICC) and tumor-node-metastasis (TNM) classification for breast cancer (WHO) [17], and was noted in the case proforma from the tumor registries as shown in Table 2.

Genomic DNA extraction
From each subject, 4 ml of blood was drawn into vacutainer tubes containing ethylene-diamine-tetra-acetic acid (EDTA) and stored at 4˚C. Genomic DNA was extracted from the whole blood sample using a non-enzymatic salting-out method [18].
Genotyping was performed without knowledge of the subjects' case/control status. Furthermore, to ensure the accuracy of the genotyping data, our data were confirmed by Sanger sequencing analysis and the results were found to be in 100% concordance.

Statistical analysis
The continuous variables are expressed as the mean ± standard deviation (Mean±SD). The Chi-square test for goodness-of-fit was used to analyze the difference in the frequency distribution between cases and controls for discontinuous variables. The Hardy-Weinberg equilibrium test was performed in controls and patients for each SNP. The allele and genotype frequencies for all the polymorphisms were compared between cases and controls using the Chi-square (χ²) test. Adjusted odds ratios (AOR) were calculated by adjusting for covariates such as age, and haplotype frequencies were estimated in controls and cases using SNPStats [21]. Linkage disequilibrium (LD) plots of controls and cases were generated using the Haploview program [22]. Gene-gene interactions were determined by MDR analysis [23]. We also compared the allele and genotype distributions for all clinical and histopathological characteristics of cases. p-values <0.05 were considered statistically significant.
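As an illustration of the allele-level comparison described above, the following minimal sketch computes an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval and a Pearson chi-square statistic from genotype counts. It is not the SNPStats implementation and does not adjust for age; the function names and the genotype counts in the example are hypothetical and chosen only for demonstration.

```python
import math


def allele_counts(hom_ref, het, hom_var):
    """Return (reference, variant) allele counts from genotype counts."""
    return 2 * hom_ref + het, 2 * hom_var + het


def odds_ratio_ci(case_var, case_ref, ctrl_var, ctrl_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    or_ = (case_var * ctrl_ref) / (case_ref * ctrl_var)
    se = math.sqrt(1 / case_var + 1 / case_ref + 1 / ctrl_var + 1 / ctrl_ref)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical genotype counts (homozygous reference, heterozygous, homozygous variant).
case_ref, case_var = allele_counts(120, 110, 70)
ctrl_ref, ctrl_var = allele_counts(150, 100, 50)

or_, lo, hi = odds_ratio_ci(case_var, case_ref, ctrl_var, ctrl_ref)
chi2 = chi_square_2x2(case_var, case_ref, ctrl_var, ctrl_ref)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f}), chi-square = {chi2:.2f}")
```

An odds ratio whose confidence interval excludes 1 corresponds to a significant difference in allele frequency between the two groups at the chosen level.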
The Kaplan-Meier survival analysis was carried out on the follow-up data available from 216 breast cancer cases, taking death occurring within 5 years of diagnosis as the event, to calculate the median 5-year survival (OS) rate. The multivariate analysis of the probable predictive factors for survival was carried out using Cox's proportional hazards regression analysis.
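The Kaplan-Meier (product-limit) estimate used for this kind of analysis can be sketched as follows. This is a generic illustration, not the SPSS procedure used in the study; the function names and the follow-up times are hypothetical, with time measured in months and an event flag of 1 for death and 0 for censoring.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    prob = 1.0
    for time, surv in curve:
        if time <= t:
            prob = surv
    return prob


# Hypothetical follow-up data for eight patients (months, event flag).
times = [6, 12, 12, 24, 36, 48, 60, 60]
events = [1, 1, 0, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
five_year = survival_at(curve, 60)
print(f"Estimated 5-year survival: {five_year:.2f}")
```

Patients censored at a given time are counted as at risk for deaths at that same time, which is the usual convention for tied observations.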

In silico analysis
The SNPs in the promoter regions of MMP-1, -3 and -9 genes were studied for the presence of transcription factor binding sites using the AliBaba2.1 online tool (http://www.gene-regulation.com/pub/programs/alibaba2/index.html).

Characteristics of study population
The baseline characteristics of controls and patients are listed in Table 1. In the present study, the clinicopathological profiles of breast cancer patients revealed that the majority of the patients (66%) were at clinical stage T0-T2 (early stage). With respect to steroid hormone receptor status, 176 (58.6%) patients were ER positive, 167 (55.6%) were PgR positive and 163 (54.33%) were HER2/neu positive, whereas 59 (19.66%) were triple negative. The histopathological classification of breast cancer at the time of diagnosis showed that 82.33% (247) of patients had ductal carcinoma and 17.66% (53) had invasive lobular carcinoma. Positive lymph node involvement was seen in 73.33% of the patients at the time of diagnosis. Among the total cases, 102 (34%) reported metastasis, as presented in Table 2. The SNPs of MMP1 (-1607 1G/2G), MMP3 (-1171 5A/6A) and MMP9 (-1562 C/T) were genotyped using specific primers and thermal cycling conditions as displayed in Table 3.
The distribution of genotype and allele frequencies of SNPs under study for 300 controls and 300 breast cancer patients is shown in Table 4 (S1 File).

Table 3 (columns: SNP, rs number, Assay, Primers, Ta, RE, Gel band pattern, Ref.)

Distribution of allelic and genotype frequencies of C-1562 T polymorphisms in MMP9 gene
The allelic frequencies of the MMP9-1562 C and T alleles were 0.67 and 0.33 in controls, compared with 0.58 and 0.42 in patients, respectively; an increased frequency of the T-allele of MMP9 (OR 1.44, 95% CI 1.14-1.83, p<0.0023) was observed in BC patients, as summarized in Table 4. The frequencies of the three MMP9 genotypes were 50% (C/C), 33.7% (C/T) and 16.3% (T/T) in controls and 40.3% (C/C), 35.7% (C/T) and 24% (T/T) in BC patients, and deviated from those expected under Hardy-Weinberg equilibrium (p<0.01). The frequency of the variant homozygote (T/T) was higher in the BC group than in controls, with a 1.82-fold increased risk for BC (OR 95% CI However, the 5A/6A genotype of the MMP3-1171 5A/6A polymorphism was found to be significantly associated with lymph node positive cases (OR 2.58, 95% CI 1.39-4.8, p = 0.01). There was no significant association of the MMP9-1562 C/T gene polymorphism with clinicopathological variables, as summarized in Table 5 (S1 File).
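The Hardy-Weinberg comparison mentioned above can be illustrated with a goodness-of-fit chi-square test on genotype counts. The sketch below is a generic calculation, not the SNPStats output; the genotype counts are an assumption, obtained by rounding the reported MMP9 percentages against 300 subjects per group, and the function name is hypothetical.

```python
import math


def hwe_chi_square(hom_ref, het, hom_var):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions."""
    n = hom_ref + het + hom_var
    p = (2 * hom_ref + het) / (2 * n)  # reference allele frequency
    q = 1 - p                          # variant allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (hom_ref, het, hom_var)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts (C/C, C/T, T/T), rounded from the reported percentages.
ctrl_chi2, ctrl_p = hwe_chi_square(150, 101, 49)
case_chi2, case_p = hwe_chi_square(121, 107, 72)
print(f"controls: chi-square = {ctrl_chi2:.2f}, p = {ctrl_p:.2g}")
print(f"patients: chi-square = {case_chi2:.2f}, p = {case_p:.2g}")
```

With these assumed counts, both groups show fewer heterozygotes than Hardy-Weinberg proportions predict, which is consistent with the deviation reported at p<0.01.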

Haplotype analysis
Haplotype analysis of the MMP1 and MMP3 genes present on chromosome 11q was performed to calculate the combined effect of the MMP1-1607 1G/2G and MMP3-1171 5A/6A polymorphisms on breast cancer. Of the four haplotypes obtained, 2G (MMP1)-5A (MMP3) was the most common and was therefore taken as the reference. The frequency of the 1G-5A haplotype (alleles in order of MMP-1 and MMP-3) was higher in controls than in patients and might confer protection against breast cancer (OR 0.45, 95% CI 0.29-0.69, p<0.0001) (Table 6, S1 File).

Linkage disequilibrium
In the present study, pairwise LD estimates were obtained for the MMP1, MMP3 and MMP9 gene polymorphisms in the case and control groups separately. The analysis revealed that most of the SNP marker combinations exhibited perfect LD scores, with the exception of a few combinations that showed a differential pattern of high LD scores in each analysis group (controls and cases). The SNP loci combinations MMP3-MMP1 and MMP9-MMP1 showed perfect LD in controls and cases; however, D' = 0.92 and D' = 0.61 were observed between MMP3-MMP9 in controls and cases, respectively, as shown in Table 7 and Fig 1 (S1 File).
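For reference, the pairwise LD measure D' can be computed from haplotype and allele frequencies as in the minimal sketch below. It assumes the two-locus haplotype frequency is already known; Haploview estimates haplotype frequencies from unphased genotypes before computing LD, and that step is not reproduced here. The function name and the frequencies in the example are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute value of Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B;
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0
    return abs(d / d_max)


# Hypothetical frequencies: every B-carrying haplotype also carries A.
complete = d_prime(p_ab=0.30, p_a=0.40, p_b=0.30)
# Hypothetical frequencies: partial association between the two alleles.
partial = d_prime(p_ab=0.20, p_a=0.40, p_b=0.30)
print(f"D' (complete) = {complete:.2f}, D' (partial) = {partial:.2f}")
```

A value of D' = 1 means that at least one of the four possible haplotypes is absent, whereas values below 1 indicate that all four haplotypes are observed.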

Multifactor dimensionality reduction (MDR) Analysis
Association of higher-order interactions with breast cancer risk was analyzed by MDR analysis, as summarized in Fig 2. The interaction information analysis revealed a moderate effect between the markers -1607 1G/2G of MMP1, -1171 5A/6A of MMP3 and -1562 C/T of MMP9, which conferred risk towards the progression of breast cancer. High-risk and low-risk genotypic combinations were determined based on the threshold value, which was 1   Table 9. However, the genotypes did not significantly affect survival time among breast cancer patients (p>0.05) (Fig 4) (S2 File).
With respect to clinicopathological characteristics, our analysis revealed a significant association of the median 5-year survival rate with late stage of breast cancer at diagnosis, negative HER2/neu receptor status and presence of metastasis (p<0.05).
In addition, the multivariate Cox regression analysis confirmed these results, indicating a significant influence of the clinicopathological variables on breast cancer development and progression (Table 10, S2 File).

In silico analysis
The prediction of transcription factor binding sites (TFBSs) for the MMP1-1607 1G>2G polymorphism showed that the 1G allele has no TFBS, whereas the 2G allele has a binding site for C/EBP alpha (CCAAT/enhancer-binding protein alpha), as depicted in Fig 5. With respect to the MMP3-1171 5A>6A polymorphism, our analysis revealed that the 5A allele has an NF-kappaB TFBS, while the 6A allele results in loss of the NF-kappaB binding site as depicted in

Discussion
Carcinogenesis is a growing health problem worldwide that is characterized at the cellular level by self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of programmed cell death, limitless replicative potential, angiogenesis, tissue invasion and metastasis [24]. MMPs are proteolytic enzymes that degrade extracellular matrix and basement membrane. Most of the studied SNPs have been reported to have a functional role in breast cancer progression [25][26][27]. MMP-1 (interstitial collagenase) and MMP-3 (stromelysin) are multifunctional enzymes encoded by structurally related genes localized to chromosome 11q, and are involved in physiological and pathological tissue remodeling [28]. MMP3 is known to lyse basal membrane collagen and induce the synthesis of other MMPs such as MMP1 and MMP9 (gelatinase B) [29]. MMP9, a key player in angiogenesis, is present on chromosome 20q. The present study
The promoter genetic variation of the MMP-1 gene arises from insertion or deletion of a guanine nucleotide (G) at position -1607 relative to the transcriptional start site; consequently, one allele (insertion) has two Gs (2G), whereas the other allele (deletion) has only one G at this position (1G). The insertion creates the core sequence (5'-GGA-3') of a binding site for the Ets transcription factors, and it was demonstrated in vitro that the 2G allele had a higher transcriptional activity [30]. In the present study, as summarized in Table 4, the frequency of the 2G-allele was found to be predominant in the breast cancer group compared to controls, with a 2-fold increased risk for BC. Our results are in conformity with earlier published data in colorectal carcinoma and renal cell carcinoma [31]. Further, the 2G-allele was also found to be associated with invasiveness of lung cancer and endometrial cancer [32].
In contrast, the insertion of an adenine in the -1171 position in the MMP-3 gene promoter sequence halves its transcriptional activity [33]. It is known that the higher transcriptional activity associated with the 5A allele may enhance tumor invasiveness [34]. Our study revealed that an individual with the 5A/6A genotype has an increased risk for the development of breast cancer (p = 0.01).
Our findings are in accordance with earlier studies carried out on non-small cell lung carcinoma in a North China population [35] and on early stage oral sub-mucous fibrosis and head and neck carcinoma [36]. Further, our findings are in agreement with a meta-analysis report on matrix metalloproteinase 1 and 3 and cancer risk [37].
A C to T substitution at position -1562 in the MMP-9 promoter region may result in a loss of binding of a nuclear protein with an increase in transcriptional activity [38]. With respect to the MMP9-1562 C/T promoter polymorphism, our study revealed that the T-allele was predominant in the breast cancer group compared to controls, with a 1.44-fold increased risk for BC. These findings are in accordance with reports on multiple cancers such as lung [39] and colorectal cancer [40]. Further, Przybylowska et al (2006) have shown that the T-allele of MMP9-1562 C/T was associated with tumor expression and influences the malignant potential of breast carcinoma [38]. Overexpression of the MMP1, MMP3 and MMP9 genes has been found to be positively associated with the clinicopathological characteristics of several malignancies [41][42][43][44]. In the present study, SNPs of the MMP1, MMP3 and MMP9 genes were correlated with clinicopathological features to assess their association with breast cancer progression and susceptibility. Our results revealed a significant association of the 2G/2G genotype of MMP1-1607 with steroid hormone receptor status and metastasis, suggesting the importance of the 2G/2G genotype in the progression of breast cancer. For the MMP3-1171 5A/6A polymorphism, the 5A/6A genotype was significantly associated with positive lymph node involvement.
The LD analysis was carried out for controls and BC patients independently to determine the risk-conferring genetic markers. The SNP loci combination was in perfect LD (D' = 1), demonstrating their strong association. Haplotype analysis of the MMP gene cluster on chromosome 11q showed a significant association with the 1G-5A variant alleles of the MMP1 and MMP3 gene polymorphisms. Further, the MDR analysis was carried out to study gene-gene interactions. The results of the present study suggest that the 2G-allele of MMP1-1607 1G/2G and the T-allele of MMP9-1562 C/T may be associated with altered enzyme activity, favouring tumour-related mechanisms and promoting tumor development and progression.
The interaction dendrogram further confirmed that the MMP1-1607 1G/2G and MMP3-1171 5A/6A SNPs have a strong correlation, whereas the MMP9-1562 C/T polymorphism has an additive effect on the risk of breast cancer development.
Furthermore, the SNPs in the promoter regions of the MMP-1, -3 and -9 genes might be involved in gain or loss of potential transcription factor binding sites (TFBSs); we therefore analysed the TFBSs for the SNPs under study using the AliBaba2.1 online tool. Our results revealed that for the MMP1-1607 1G>2G polymorphism, the 2G allele was associated with a transcription factor binding site for C/EBPalpha (also known as CCAAT/enhancer-binding protein alpha), leading to enhanced activity of the gene. Similarly, for the MMP3-1171 5A>6A polymorphism, the 5A allele was associated with an NF-kappaB site, while the 6A allele was associated with lack of the NF-kappaB site, leading to reduced transcriptional activity. Likewise, for the MMP9-1562 C>T polymorphism, the C allele was associated with the Sp1 nuclear protein, while the T allele was associated with lack of the Sp1 site, leading to increased activity of the gene.
In addition, the associations of the MMP1, 3 and 9 SNPs with the 5-year survival rate were assessed using Kaplan-Meier analysis. Our data showed a decreased survival rate for the risk genotypes of all selected SNPs of the MMP-1, -3 and -9 genes, but the difference was not statistically significant (p>0.05). However, a significant association of clinicopathological characteristics such as late stage at diagnosis, HER2/neu receptor negative status and presence of metastasis with the 5-year survival rate was observed (p<0.05), suggesting the importance of these clinicopathological features in the progression of breast cancer.
Overall, our results revealed that the polymorphisms in the promoter region of MMP1, MMP3 and MMP9 when correlated with clinicopathological characteristics and survival rate have shown significant effects on the risk and progression of breast cancer, substantiated by in-silico analysis. These functional polymorphisms in MMPs could lead to altered gene expression, subsequently creating imbalance in the vital MMP system that results in excessive ECM degradation and deregulated ECM dynamics in cancer development.
In conclusion, our results suggest that the MMP1-1607 1G/2G, MMP3-1171 5A/6A and MMP9-1562 C/T gene polymorphisms have a strong correlation with breast cancer. The 2G-6A haplotype (minor alleles of MMP-1 and MMP-3, respectively) showed an increased susceptibility to BC and may have potential application as a biological marker for identification of individuals at risk. The interactions of MMPs with breast cancer related environmental and clinical risk factors such as lymph node involvement have a strong correlation and influence the survival rate, suggesting their potential role in breast carcinogenesis.
To the best of our knowledge, this is the first study reporting on the synergistic effects of SNPs in the MMP1, MMP3 and MMP9 genes in correlation with epidemiological and clinical variables, together with LD, MDR, survival rate and in silico analysis. However, our study has several limitations. First, the study was small and restricted to individuals of South Indian ethnicity, so it is uncertain whether these results can be generalized to other populations. Second, a few patients had missing hormonal receptor status, which may bias the results indicating an association with advanced disease status. Third, our LD and MDR analysis included only 600 samples, and this may have limited the power of the pooled results. Therefore, collaborative studies on different populations are necessary to corroborate our findings.