Type II Transmembrane Serine Protease Gene Variants Associate with Breast Cancer

Type II transmembrane serine proteases (TTSPs) are related to tumor growth, invasion, and metastasis in cancer. Genetic variants in these genes may alter their function, leading to cancer onset and progression, and affect patient outcome. Here, 464 breast cancer cases and 370 controls were genotyped for 82 single-nucleotide polymorphisms covering eight genes. Association of the genotypes was estimated against breast cancer risk, breast cancer–specific survival, and survival in different treatment groups, and clinicopathological variables. SNPs in TMPRSS3 (rs3814903 and rs11203200), TMPRSS7 (rs1844925), and HGF (rs5745752) associated significantly with breast cancer risk (P trend = 0.008–0.042). SNPs in TMPRSS1 (rs12151195 and rs12461158), TMPRSS2 (rs2276205), TMPRSS3 (rs3814903), and TMPRSS7 (rs2399403) associated with prognosis (P = 0.004–0.046). When estimating the combined effect of the variants, the risk of breast cancer was higher with 4–5 alleles present compared to 0–2 alleles (P = 0.0001; OR, 2.34; 95% CI, 1.39–3.94). Women with 6–8 survival-associating alleles had a 3.3 times higher risk of dying of breast cancer compared to women with 1–3 alleles (P = 0.001; HR, 3.30; 95% CI, 1.58–6.88). The results demonstrate the combined effect of variants in TTSPs and their related genes in breast cancer risk and patient outcome. Functional analysis of these variants will lead to further understanding of this gene family, which may improve individualized risk estimation and development of new strategies for treatment of breast cancer.


Introduction
Breast cancer is the most common cancer among women in western countries. The known high risk susceptibility genes for breast cancer, e.g. BRCA1, BRCA2, ATM, and PALB2, are responsible for approximately 20% of the hereditary cases [1], but several unknown breast cancer-predisposing genetic factors still exist. Genetic risk factors with a low or moderate penetrance also affect the risk of sporadic breast cancers and may act together with environmental and lifestyle factors and with each other to enhance cancer predisposition and progression [2,3].
Type II transmembrane serine proteases (TTSPs) degrade components of the extracellular matrix (ECM) [4,5]. The 17 members of the human TTSP family have physiological and pathological roles in digestion, cardiac function, blood pressure regulation, hearing, iron metabolism, and epithelial homeostasis [4,6]. In cancer, the TTSPs are related especially to tumor growth, invasion, and metastasis [7].
TTSPs are divided by their structures into subfamilies [4]. TMPRSS1, TMPRSS2, and TMPRSS3 belong to the Hepsin/TMPRSS subfamily [4]. TMPRSS1 is overexpressed in prostate and breast cancers, and its expression and localization have also been related to epithelial integrity [8,9]. In terms of cancer-related risk, Pal and coworkers (2006) reported five TMPRSS1 single nucleotide polymorphisms (SNPs) as associated with prostate cancer in men of European origin [10], although another study identified no associated variants [11]. The androgen-regulated TMPRSS2, however, is strongly associated with prostate cancer and forms a fusion gene with ETS transcription factor (TF) family members that occurs in roughly half of prostate cancer cases [12,13]. This fusion has been studied in ovarian cancer but was not detected [14]. TMPRSS3 is overexpressed in epithelial ovarian cancer and is a potential diagnostic marker and therapy target [15][16][17][18].
TMPRSS11E/DESC1 belongs to the HAT/DESC subfamily of the TTSPs and is upregulated in tumors of different origins, including breast [19]. DESC1 can convert pro-urokinase-type plasminogen activator (uPA, PLAU) to active uPA [19]. uPA belongs to the serine proteases and is an important factor in the plasminogen activation system associated with several cancers, including breast cancer and especially tumor invasion and metastasis [20]. In addition, uPA is suggested to be a suitable breast cancer biomarker when planning appropriate adjuvant therapy [21].
The third subfamily of the TTSPs is the matriptase subfamily. Matriptase/ST14 is widely expressed in tissues rich in epithelial cells, such as breast, ovary, intestine, and prostate, and in tumors of epithelial origin derived from these tissues [22][23][24]. We have previously found that in the Eastern Finnish population, a variant in the ST14 gene, rs704624, is associated with a poor patient outcome and low matriptase mRNA expression in breast cancer patients [25]. Moreover, negative/low matriptase protein expression is independently predictive of poor survival [25]. We also reported a genetic risk factor in TMPRSS6, which encodes matriptase-2, to be associated with elevated breast cancer risk and poor outcome [26,27]. In addition, TMPRSS6 is mutated in breast carcinomas [28]. TMPRSS6 is located on chromosome 22q12-13, where an allelic imbalance has been observed in breast and colorectal cancers [29,30]. TMPRSS7/Matriptase-3 is a recently identified, evolutionarily conserved TTSP expressed in brain, ovary, and testis [31]. No reports have described TMPRSS7 in breast tissue or breast cancer.
Like uPA, PRSS8 (Prostasin) is a serine protease activated by matriptase [32,33]. Both uPA and PRSS8 are proteases involved in proteolytic cascades whereas hepatocyte growth factor (HGF) does not have proteolytic activity but instead is active in tumorigenesis, angiogenesis, and tissue regeneration via the HGF-Met pathway [34]. HGF is synthesized in pro-form, and like uPA and PRSS8, is processed to active form by the TTSP matriptase [35].
All of the genes investigated in this study are connected to the proteolytic activity that takes place through several cascades and leads to ECM degradation [4,6]. The TTSPs are also involved in maintaining epithelial integrity [4,6,9], and in cancer, alterations in their function may destabilize tumor epithelia, and thus induce tumor invasion.
No published reports have addressed the association of most TTSP genetic variants with breast cancer. Here we evaluated the role of several TTSP variants and related genes in breast cancer patients. Motivated by our previous findings, we hypothesized that in addition to ST14 [25] and TMPRSS6 [26,27], genes coding TTSPs and related genes exist as variants associated with breast cancer risk and patient outcome. To address this hypothesis, we genotyped tagging SNPs (tagSNPs) of five TTSP genes, TMPRSS1, TMPRSS2, TMPRSS3, TMPRSS7, and TMPRSS11E, two other serine proteases uPA and PRSS8, and HGF, and investigated their association with breast cancer risk and patient survival. We also tested whether the effect of the associated variants differs among the treatment groups and estimated the association of clinicopathological parameters with these variants.

DNA Samples
A sample set of 464 invasive breast cancer cases and 370 controls from the Kuopio Breast Cancer Project (KBCP) was available for genotyping (Table S1). The KBCP material consists of 497 prospective breast cancer cases and 458 controls from the province of Northern Savo in Eastern Finland. The cases were diagnosed at Kuopio University Hospital between April 1990 and December 1995, and the age- and long-term area-of-residence-matched controls were selected from the National Population Register during the same time period [25,36,37]. The maximum follow-up time of the patients was 20 years (February 2011). Genomic DNA was extracted from peripheral blood lymphocytes using standard procedures [38]. The KBCP is approved by the joint ethics committee of the University of Eastern Finland and the Kuopio University Hospital (written consents 1/1989 and 61/2010). Each patient gave informed written consent for participation in the study.

SNP genotyping
Genotyping of 76 of the 82 SNPs was done using MassARRAY (Sequenom Inc., San Diego, CA, USA) and iPLEX Gold (Sequenom Inc.) in 384-well plate format as previously described [39]. Duplicate analysis was done for 6.7% of KBCP samples for quality control. All primer sequences are available upon request. Six of the SNPs were genotyped using the 5′ nuclease assay (TaqMan) with the Mx3000P Real-Time PCR System (Stratagene, La Jolla, CA, USA) according to the manufacturer's instructions. Primers and probes for TMPRSS1 rs41523449 and TMPRSS11E rs2708699 were supplied by Applied Biosystems as Custom TaqMan SNP Genotyping Assays. PRSS8 rs2855475, TMPRSS3 rs2839506 and rs9325634, and TMPRSS2 rs7275220 were supplied by Applied Biosystems as TaqMan Genotyping Assays. TaqMan genotyping was done as previously described [25]. TaqMan Genotyping Master Mix (Applied Biosystems) was used with the following cycling conditions: 10 minutes at 95°C, followed by 45-60 cycles of 15 seconds at 92°C and 1 minute at 60°C. Duplicate genotyping was done for 4.2% of samples for quality control, and the overall call rate was >95%. If the duplicate and its pair were discordant, the genotypes of the sample were discarded. Greater than 98% overall concordance was required for both iPLEX- and TaqMan-genotyped SNPs.
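The quality-control rules described above (call rate, duplicate concordance, and discarding discordant samples) can be illustrated with a minimal Python sketch. The sample identifiers, genotype calls, and function names below are hypothetical and serve only as an illustration; they are not the KBCP data or the software used in this study.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call (None = no call)."""
    called = sum(1 for g in genotypes.values() if g is not None)
    return called / len(genotypes)


def duplicate_concordance(primary, duplicate):
    """Compare duplicate calls with the primary calls for the same samples.

    Returns (concordance, discordant_ids). Pairs with a missing call are skipped.
    """
    compared = 0
    discordant = []
    for sample_id, dup_call in duplicate.items():
        first_call = primary.get(sample_id)
        if first_call is None or dup_call is None:
            continue
        compared += 1
        if first_call != dup_call:
            discordant.append(sample_id)
    if compared == 0:
        return float("nan"), discordant
    return (compared - len(discordant)) / compared, discordant


# Hypothetical calls for one SNP (sample id -> genotype); not study data.
primary = {"s1": "CC", "s2": "CT", "s3": "TT", "s4": None, "s5": "CT"}
duplicate = {"s1": "CC", "s2": "CT", "s5": "TT"}

rate = call_rate(primary)
concordance, discordant = duplicate_concordance(primary, duplicate)

# Discard the genotypes of samples whose duplicate pair was discordant.
cleaned = {s: g for s, g in primary.items() if s not in discordant}
print(rate, concordance, discordant)
```

In practice the same two quantities would be compared against the thresholds stated in the text (call rate >95%, concordance >98%) for each SNP.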

Statistical analysis
Differences in SNP genotype frequencies between cases and controls were computed using the Armitage trend test (http://ihg.gsf.de/cgi-bin/hw/hwa1.pl) and logistic regression analysis. Concordance with Hardy-Weinberg equilibrium was calculated with the χ² test. Association of the genotypes with clinicopathological variables was analyzed with the χ² test, and the logistic regression analysis was used to evaluate the significance levels for the risks (odds ratios (ORs)) of the associated variables. Kaplan-Meier (log-rank test) analysis was used to calculate the breast cancer-specific survival (BCSS), and the multivariate survival analysis was performed using a Cox regression model. In all analyses P ≤ 0.05 was considered significant. Statistical analyses were performed using SPSS v 19.0 (IBM SPSS statistics 19) and Haploview 4.2 [40]. P values were not corrected for multiple testing so as to avoid eliminating potentially important findings. Therefore, some of the results may need to be interpreted with caution and in addition be replicated in independent data sets. Genetic power estimation for the association studies was calculated using the Genetic Power Calculator, case-control for discrete traits (http://pngu.mgh.harvard.edu/~purcell/gpc/) [41]. In the calculations, α was set as 0.05 and breast cancer prevalence as 1% [42]. The mean (0.23) of the observed MAFs of the genotyped SNPs in our sample set was used as the high-risk allele frequency. The allele frequencies were assumed to be equal for the risk SNP and the marker SNP, and the D' was set as 1, corresponding to perfect linkage disequilibrium (LD). The risk for the homozygous and heterozygous high-risk allele genotypes was assumed to be similar (1.2 or 1.5). In silico estimation for the SNP effects was done by using FastSNP [43] and F-SNP [44].
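For readers unfamiliar with the two genotype-level tests named above, the following Python sketch shows one standard way to compute the Hardy-Weinberg χ² test and the Cochran-Armitage trend test with additive weights (0, 1, 2). It is an illustration under stated assumptions, not the online tool or SPSS procedure used in this study; the genotype counts are hypothetical, and both statistics are referred to a χ² distribution with 1 degree of freedom.

```python
import math


def chi2_p_1df(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, chi2_p_1df(chi2)


def armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    cases and controls are genotype counts ordered by the number of
    minor alleles (0, 1, 2); weights encode the assumed additive model.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    r_total = sum(cases)
    s_total = sum(controls)
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    numerator = n * (n * sum_wr - r_total * sum_wn) ** 2
    denominator = r_total * s_total * (n * sum_w2n - sum_wn ** 2)
    chi2 = numerator / denominator
    return chi2, chi2_p_1df(chi2)


# Hypothetical genotype counts; not taken from the study.
hwe_stat, hwe_p = hwe_chi2(25, 50, 25)
trend_stat, trend_p = armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
print(hwe_stat, hwe_p, trend_stat, trend_p)
```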

Electrophoretic mobility shift assay
MCF7 cells were grown in minimum essential media containing 10% FBS, 1 mM sodium pyruvate, 1.5 g/l sodium bicarbonate, 1× NEAA, 2 mM L-glutamine, 0.01 mg/ml insulin, 100 U/ml penicillin, and 0.1 mg/ml streptomycin. For nuclear protein extraction, the cells were harvested in 1× PBS and spun down for 5 minutes at 1000 × g at 4°C. Pelleted cells were lysed in 4-5× volumes of lysis buffer [10 mM HEPES, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM dithiothreitol, 0.5% (v/v) NP-40, protease inhibitors (Roche)] and incubated on ice for 5 minutes. Lysate was centrifuged for 1 minute at 12,000 × g at 4°C and the supernatant discarded. The nuclear proteins were extracted in 2× volumes of extraction buffer [20 mM HEPES, 1.5 mM MgCl2, 420 mM NaCl, 0.2 mM EDTA, 0.5 mM dithiothreitol, 25% (v/v) glycerol, protease inhibitors (Roche)] for 30 minutes on ice, and vortexed a few times during incubation. Lysate was centrifuged for 1 minute at 12,000 × g at 4°C, and the supernatant, containing nuclear proteins, was transferred to a fresh tube. Protein concentration was measured with the Bradford method using Coomassie brilliant blue (Merck, Darmstadt, Germany). Twenty-five micrograms of the protein extract was incubated for 40 minutes at 22°C with a 35 bp 32P-labelled DNA-oligomer corresponding to the T or C alleles of the rs12151195 (upper strand 5′-GCTCCTTCCTAAAATAT/CAGATGATCTACAAG-3′). DNA-oligomers were Klenow fill-in labeled. To prove the specific binding, 50× and 75× molar excesses of unlabeled oligomers were incubated with nuclear proteins for 10 minutes at 22°C prior to incubation with 32P-labeled oligomers. The complexes were separated at 22°C on 4% nondenaturing polyacrylamide gels using 0.25× Tris-borate-EDTA buffer. The gels were dried and visualized using a phosphoimager (FLA3000; Fuji, Tokyo, Japan).

Results
SNPs in TMPRSS3, TMPRSS7, and HGF associate with breast cancer risk
Altogether, 82 SNPs in eight TTSPs and related genes were genotyped in a sample set of 464 invasive breast cancer cases and 370 controls (Table S2). TMPRSS3 SNPs rs3814903 and rs11203200, TMPRSS7 SNP rs1844925, and HGF SNP rs5745752 associated significantly with breast cancer risk (P overall = 0.029, 0.008, 0.042, and 0.017, respectively) (Table 1). All SNPs were consistent with the Hardy-Weinberg equilibrium. According to the power calculations, our sample set with 464 cases and 370 controls has 83% power to detect a risk allele that is in perfect LD with the marker allele and has a relative risk of 1.5.

Increasing number of alleles significantly affects breast cancer risk and prognosis
To estimate the combined effect of the associating alleles, we assessed two new variables. We summed separately the number of alleles of four breast cancer risk-associating SNPs (rs3814903, rs11203200, rs1844925, and rs5745752) and five survival-associating SNPs (rs12151195, rs12461158, rs2276205, rs3814903, and rs2399403). In risk estimation women were divided into three groups carrying 0-2, 3, or 4-5 breast cancer risk alleles, since none of the cases or controls had six or more alleles (maximum eight). The risk of getting breast cancer was significantly higher with three risk alleles present (P = 0.003; OR, 1.70; 95% CI, 1.20-2.42, logistic regression analysis), and even higher with four to five alleles (P = 0.0001; OR, 2.34; 95% CI, 1.39-3.94, logistic regression analysis) compared to having 0-2 alleles.
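The construction of the combined risk-allele variable can be sketched in Python as follows. This is a minimal illustration, assuming genotypes are coded as the number of risk alleles (0, 1, or 2) per SNP. The counts per group are hypothetical, and the odds ratio is computed from a 2 × 2 table with a Wald confidence interval, which corresponds to an unadjusted logistic regression with the 0-2 allele group as reference; the function names are ours and not part of any software used in the study.

```python
import math


def risk_allele_count(genotypes):
    """Sum the risk alleles (0, 1, or 2 per SNP) across the risk-associated SNPs."""
    return sum(genotypes)


def risk_group(n_alleles):
    """Assign the three groups used in the risk estimation."""
    if n_alleles <= 2:
        return "0-2"
    if n_alleles == 3:
        return "3"
    return "4-5"


def odds_ratio_ci(case_exp, ctrl_exp, case_ref, ctrl_ref, z=1.96):
    """Odds ratio versus the reference group with a Wald 95% confidence interval."""
    odds_ratio = (case_exp * ctrl_ref) / (ctrl_exp * case_ref)
    se = math.sqrt(1 / case_exp + 1 / ctrl_exp + 1 / case_ref + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# One hypothetical woman: risk-allele counts at the four risk SNPs.
example_group = risk_group(risk_allele_count((1, 0, 2, 1)))

# Hypothetical (cases, controls) per group; not the study counts.
counts = {"0-2": (100, 120), "3": (60, 40), "4-5": (40, 20)}
ref_cases, ref_controls = counts["0-2"]
results = {
    group: odds_ratio_ci(cases, controls, ref_cases, ref_controls)
    for group, (cases, controls) in counts.items()
    if group != "0-2"
}
print(example_group, results)
```

The same summation applies to the survival-associated SNPs, with the groups redefined to match the allele ranges used in the survival analysis.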
In the multivariate survival analysis, patients with four or five risk alleles had a significantly poorer BCSS than those with one to three risk alleles (P = 0.037; HR, 1.96; 95% CI, 1.04-3.71) (Table 2, Fig. 1F). Moreover, the women with six to eight risk alleles had 3.3 times higher risk of dying of breast cancer compared with the women with one to three risk alleles (P = 0.001; HR, 3.30; 95% CI, 1.58-6.88) (Table 2, Fig. 1F). In the multivariate analysis including all five survival-related SNPs, only rs3814903 remained significant (overall P = 0.029, data not shown). All multivariate analyses included age, tumor grade, histological type, tumor size, nodal status, ER status, and HER2 status.
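The BCSS curves compared between allele groups are Kaplan-Meier estimates. As a minimal sketch of that estimator (not the SPSS procedure used here, and without the covariate adjustment of the Cox model), the following Python function computes the survival probability at each distinct event time. The follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the patient died of breast cancer at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up (years) for four patients; the second is censored.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
print(curve)
```

Computing one such curve per allele group and comparing them with a log-rank test would mirror the univariate part of the analysis.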

Survival-associating SNPs in TTSPs affect patient outcome in different treatment groups
Overall survival (OS), BCSS, and recurrence-free survival (RFS) were assessed with a multivariate analysis according to the survival-associating SNPs and the combined risk allele variable in different treatment groups of the breast cancer patients. Among patients treated with radiation therapy, TMPRSS1 SNP rs12151195 and TMPRSS3 SNP rs3814903 associated significantly with OS, BCSS, and RFS (P = 0.000001, 0.0003, and 0.013, and P = 0.016, 0.049, and 0.027, respectively) (Table 3). In addition, rs12461158 in TMPRSS1 associated with OS (P = 0.011), and TMPRSS2 SNP rs2276205 with OS and BCSS (P = 0.019 and 0.020, respectively) (Table 3). Among the patients receiving only radiation therapy, TMPRSS1 SNP rs12151195 and TMPRSS3 rs3814903 associated with OS (P = 0.004 and P = 0.015, respectively) (Table 4).
An effect of the number of the survival-associating alleles was also seen in the different treatment groups. Women having more alleles had poorer OS and BCSS when treated with radiation therapy, compared with the women having fewer alleles (P = 0.00001 and 0.00004, respectively) (Table 3). Also, the RFS time was comparatively shorter among the women having more risk alleles (P = 0.001) (Table 3). The increasing number of risk alleles additionally affected the group treated with radiation therapy only: the OS was poorer among women carrying more than six risk alleles compared to those carrying five or fewer (P = 0.010) (Table 4). However, none of these SNPs or the combined survival variant remained significant in the group of patients treated with hormone therapy or in the group receiving no treatment at all (data not shown). All multivariate analyses included age, tumor grade, histological type, tumor size, nodal status, ER status, and HER2 status.

Nuclear proteins from breast cancer cells bind differentially to the rs12151195 T allele-harboring region
Because rs12151195 (T/C) is a potential gene regulatory SNP, we tested whether the T/C difference influences the binding of nuclear proteins to this gene region. To that end, an electrophoretic mobility shift assay with nuclear proteins from human breast cancer cells was performed. Interestingly, the ds-oligomer harboring the common T allele formed a high molecular mass nuclear protein-DNA complex that was not evident with the rare C allele (Figure S1). However, the identity of the bound nuclear protein remains to be elucidated.

Discussion
Proteolytic enzymes like the TTSPs associate with tumor invasion and metastasis in cancer. In this study, we genotyped 82 tagSNPs from seven serine protease genes and HGF and evaluated their role in breast cancer risk and survival. We found both risk- and survival-associating variants in five genes: TMPRSS1, TMPRSS2, TMPRSS3, TMPRSS7, and HGF. More important, we found that the more breast cancer risk-associating or survival-associating variants from this gene family the women had, the higher the risk of developing breast cancer or dying of it. Some of the survival-associating variants and especially the combined survival variants maintained their impairing effect also when treatment regimens were included in the analysis. These results suggest that genetic alterations disturb the function of these genes and proteins to cause enhanced proteolysis and epithelial disorder, possibly leading to cancer onset and progression.
Four variants in this study associated with breast cancer risk. The strongest single marker association was with TMPRSS3 intronic variant rs11203200; the minor allele carriers had a greater than 1.6-fold risk of developing breast cancer compared with the major allele homozygotes. The minor allele homozygotes were very rare; only three patients in our sample set carried the minor allele on both chromosomes. In the in silico analysis, this variant seems to have a possible effect as an intronic enhancer, and the minor allele may destroy the binding site of TF lyf-1. In addition, a 5′ untranslated region (UTR) variant is in strong LD with rs11203200 (1000 Genomes Project) [45]. The 5′UTR contains regulatory elements, and its variation thus may affect gene expression. Although no studies have previously evaluated the role of TMPRSS3 in breast cancer, it is overexpressed in epithelial ovarian cancer, and is a potential diagnostic marker and therapy target [15][16][17][18]. Another breast cancer risk-associating TMPRSS3 variant in our study was rs3814903. Whether these variants are responsible for the possible changes in gene expression on the mRNA or protein level in breast cancer remains to be elucidated.
TMPRSS7 (matriptase-3) has rarely been studied, with only one publication by Szabo and colleagues (2005) [31], and has no known role in any cancer, but we found here that rs1844925 associated with breast cancer risk. It is an intronic variant at the start site of the gene, near a missense variant, rs11929695, that is in LD with rs1844925 (r² = 0.759) and changes an amino acid, with possible functional effects (1000 Genomes Project) [45].
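Because several arguments in this discussion rest on pairwise LD between a genotyped tagSNP and a nearby variant, a short Python sketch of the two usual LD measures may be helpful. It assumes that the haplotype frequency and the two allele frequencies are known (for example, from phased reference data); the frequencies below are hypothetical and are not the 1000 Genomes values cited here.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise linkage disequilibrium between two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    Returns (D, |D'|, r squared).
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: equal allele frequencies, always inherited together.
perfect = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical frequencies: complete LD (|D'| = 1) but unequal allele frequencies.
unequal = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.5)
print(perfect, unequal)
```

The second example shows why r² can be well below 1 even when D' equals 1: r² also depends on how closely the allele frequencies of the two variants match.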
HGF works via its cognate receptor, Met, with several roles in signaling in different pathways. It can be linked to cancer progression [34], but there are no published associations with breast cancer risk. We found that the women homozygous for the minor allele of HGF SNP rs5745752 had a higher risk of breast cancer whereas the major allele was significantly protective. In addition, the minor alleles of three HGF variants associated with the clinical variables of higher tumor grade, positive nodal status, bigger tumor size, negative ER and PR status, and overall higher tumor stage. These associations may support the role of HGF in many actions of cancer progression.
Our results strongly indicate that an increased number of risk alleles enhances the risk of breast cancer. Women who had four to five risk alleles had 2.3-fold higher risk of breast cancer compared with women carrying up to two risk alleles, and the overall risk with the combined risk allele variable was higher than with any of the SNPs on their own. This result, even with the small number of genes involved, supports the polygenic risk model as suggested by recently published iCOGS (Collaborative Oncological Gene-environment Study) studies [46]. Here, the associated variants are all non-coding and do not affect the structure of the proteins translated from these genes. The variants may, however, affect the transcriptional regulation of these genes and thus disrupt signaling in and among cell types. In the case of tagSNPs, the effective or functional variant may also be nearby in the same LD block.
In addition to breast cancer risk estimation, we studied the association of the serine protease genetic variants with BCSS and survival in different treatment groups. We found five breast cancer survival-associated SNPs in four TTSP genes, and also evidence of a poor response to radiotherapy associated with these variants. The strongest association was with the minor allele of TMPRSS1 rs12151195, which was a marker of poor prognosis and associated with PR-negative tumors. In addition, the minor allele carriers of the variant rs12151195 had poorer OS, BCSS, and RFS among patients given radiotherapy. rs12151195 is located after the 3′UTR of TMPRSS1, but its potential regulatory function is not known. Interestingly, the T/C difference in rs12151195 influences the binding of nuclear protein(s) to the gene region; however, because the identity of the putative TF binding to the gene region is currently unclear (in silico searches failed to predict a strong candidate TF), it is difficult to judge the possible regulatory effect of the SNP. According to the 1000 Genomes Project data from Finns, more than 20 variants are present in a 25 kb area with LD (r²) ≥ 0.75 in both directions from rs12151195 [45]. Further studies taking these variables into account are needed. In the same gene, TMPRSS1, the minor allele of rs12461158 associated with better survival, as well as with negative HER2 status. In two other studies, TMPRSS1 gene variants have been found to associate with prostate cancer susceptibility but not with the prognosis [10,47]. However, the genotyped variants were not the same as in our study. TMPRSS1 expression in breast cancer is enhanced on the protein level, as assessed with immunohistochemistry [8]. In that study, the knockdown of TMPRSS1 in breast cancer cells with high TMPRSS1 expression led to decreased invasion in a Matrigel invasion assay, suggesting that it has a role in tumor invasion [8].
Most of the studies concerning TMPRSS1 and TMPRSS2 have been done in prostate cancer, and very little is known about their role in breast cancer.
Our results show that the TMPRSS2 rs2276205 minor allele is associated with better survival in breast cancer patients; thus, the major allele impairs the prognosis. Interestingly, an in silico analysis showed that the minor allele possibly disrupts the binding site of GATA-1, which is present with the major allele. In addition, the minor allele associated with tumor PR positivity, which may be connected with survival via treatment. Silencing of TMPRSS2 causes sensitivity to tamoxifen, one of the most widely used drugs in treating breast cancer [48]. Moreover, TMPRSS2 is androgen-regulated and forms a fusion gene with ETS TFs in prostate cancer [13]. Whether the fusion gene occurs in breast cancer is not known. Interestingly, androgens and the androgen receptor affect breast cancer risk and prognosis, although the data are somewhat complicated [49].
In our study, the only SNP associated with both breast cancer risk and survival was TMPRSS3 rs3814903. Surprisingly, the patients who were heterozygous for TMPRSS3 rs3814903 had poorer survival than those who were homozygous. The SNP sits approximately 1100 bp upstream from the gene TMPRSS3 and may thus affect regulation of gene expression. The in silico analysis suggested rs3814903 to associate with splicing regulation and detected the minor allele to create a binding motif for splicing factor SRp55 that is absent with the major allele. In addition, in the same analysis, the major allele creates a binding site for the splicing factor SF2/ASF, also known as SRSF1. SRSF1 is a reported proto-oncogene and involved in mammary epithelial transformation via its overexpression and by splicing regulation of Bcl-2 family tumor suppressors [50,51].
Interestingly, TMPRSS7, encoding matriptase-3, was the only gene having SNPs associated with both breast cancer risk and prognosis. Our results show that the TMPRSS7 intronic variant rs2399403 associated with breast cancer survival. Compared with those who were homozygous for the major allele, the minor allele carriers had significantly poorer survival. Moreover, in the group of patients with no adjuvant therapy, the rare allele carriers of rs2399403 had significantly poorer BCSS, and in addition, their RFS was marginally significantly shorter (data not shown). Previously, Szabo and colleagues (2005) identified matriptase-3 as a functional serine protease [31]; therefore, it might affect tumor invasiveness. More important, TMPRSS7 belongs to the same TTSP subfamily as matriptase and matriptase-2, which we have found to be associated with breast cancer [25][26][27], which makes it of interest for further study.
As with the breast cancer risk-associating SNPs, we combined the survival-associated SNPs into a new variable and estimated its association with prognosis. The risk of dying of breast cancer was 3.3-fold higher among women who had six to eight risk alleles compared with women with one to three alleles. The effect of an increasing number of risk alleles was also seen in the radiation therapy group. However, this was not the case in the group of patients with no adjuvant treatment (data not shown). Therefore, although some of the TTSP SNPs were significantly associated with survival in the whole study population, this association might reflect their prediction of response to cancer therapies rather than their role as prognostic markers. The studied SNPs might associate with patient outcome by affecting metabolic pathways or response to cytotoxic treatments. To our knowledge, though, no in vitro or clinical data address these specific SNPs and their potential association with the efficacy of radiotherapy or chemotherapy.
In summary, the results of this study suggest that these genetic variants should be evaluated as an overall pattern instead of as single markers when assessing patient cancer risk, prognosis, and treatment. If the breast cancer risk- and survival-associating SNPs lead to changes in gene function or expression levels, they may affect tumor invasion, metastasis, and response to treatment as a whole protein family. The effects of different TTSPs on the enhancement of tumor invasion are not necessarily parallel, but we can hypothesize that one overexpressed protein enhances ECM degradation and that underexpression of another protein destabilizes epithelial integrity [9,34]. Small sequence changes can affect TF binding or epigenetic regulation and thus lead to changes in gene expression. However, this idea needs to be studied in vivo in functional studies of these associated variants to confirm the current results.

Figure S1. Differential binding of nuclear proteins from breast cancer cells to oligomers corresponding to rs12151195 alleles. Electrophoretic mobility shift assay was performed as described in the materials and methods. The upper arrow indicates the position of differential DNA-protein complex formation. The lower arrow indicates the free probe. Lanes 1 and 2: oligomers corresponding to rs12151195 C and T alleles without proteins; lanes 3 and 4: oligomers corresponding to rs12151195 C and T alleles with 25 µg of MCF7 breast cancer cell nuclear proteins; lanes 5 and 6: oligomers corresponding to rs12151195 C and T alleles with 25 µg of MCF7 cell nuclear proteins and with 50× molar excess of unlabeled oligomers; and lanes 7 and 8: oligomers corresponding to rs12151195 C and T alleles with 25 µg of MCF7 cell nuclear proteins and with 75× molar excess of unlabeled oligomers. (TIF)