Genetic Variations in the Transforming Growth Factor Beta Pathway as Predictors of Bladder Cancer Risk

Bladder cancer is the fifth most common cancer in the United States, and genetic markers that predict susceptibility in high-risk populations are needed. The purpose of our study was to determine whether genetic variations in the transforming growth factor-beta (TGF-β) pathway are associated with bladder cancer risk. We identified 356 single-nucleotide polymorphisms (SNPs) in 37 key genes from this pathway and evaluated their association with cancer risk in 801 cases and 801 controls. Forty-one SNPs were significantly associated with cancer risk, and after adjusting for multiple comparisons, 9 remained significant (Q-value ≤0.1). Haplotype analysis further revealed that three haplotypes within VEGFC and two haplotypes in EGFR were significantly associated with increased bladder cancer risk compared to the most common haplotype. Classification and regression tree analysis further revealed potential high-order gene-gene interactions, with VEGFC: rs3775194 being the initial split, which suggests that this variant is responsible for the most variation in risk. Individuals carrying the common genotype for VEGFC: rs3775194 and EGFR: rs7799627 and the variant genotype for VEGFR: rs4557213 had a 4.22-fold increase in risk, a much larger effect than that conferred by the common genotype for VEGFR: rs4557213. Our study provides the first epidemiological evidence supporting a connection between TGF-β pathway variants and bladder cancer risk.


Introduction
Bladder cancer commonly affects the elderly and men, with an estimated 73,510 new cases and 14,880 deaths in the United States in 2012 [1]. Major risk factors for bladder cancer include male gender, older age, tobacco smoking, and occupational exposure to aromatic amines [2]. It is increasingly recognized that genetic susceptibility may contribute to bladder carcinogenesis [3]. Identifying susceptible individuals with the aid of genetic markers could therefore reduce health care costs, increase the cost-benefit of screening and surveillance, and improve the treatment and survival of cancer patients.
The transforming growth factor-beta (TGF-β) pathway plays important roles in many cancer types and has been implicated in the tumorigenesis of bladder transitional cell carcinoma. Many studies have indicated that TGF-β signaling contributes to epithelial-mesenchymal transition, angiogenesis, migration, and metastasis in many types of malignant tumors [4] [5,6]. In normal cells, TGF-β regulates cell growth, differentiation, matrix production, and apoptosis [7]. TGF-β-induced apoptosis is frequently mediated by the SMAD-dependent pathway but may also occur through both p53-dependent and p53-independent mechanisms [8,9], and involves caspase activation, upregulation of proapoptotic factors (e.g., Bax), and/or downregulation of antiapoptotic factors (e.g., Bcl-2 and Bcl-xL) [10][11][12]. These factors are core components of the cellular apoptotic machinery. The TGF-β receptor 1 variant rs11466445 (TGFBR1*6A) has been associated with an increased risk of breast and ovarian cancers but not colorectal or bladder cancer [13][14][15][16]. However, another study of 7 genetic variants in two key members of the TGF-β pathway (TGFB1 and TGFBR1) and the clinical outcome of muscle-invasive bladder cancer indicated that TGFBR1: rs868, located in the 3′-untranslated region, was significantly associated with disease-specific mortality [17]. It has also been reported that genetic variants in RUNX3, a tumor suppressor of the TGF-β pathway, may modulate bladder cancer risk [18].
Since the TGF-β pathway plays an essential role in cellular processes, we hypothesized that polymorphisms of TGF-β pathway genes may modulate the risk of bladder cancer. To test this hypothesis, we conducted a large case-control study to evaluate the effects of single-nucleotide polymorphisms (SNPs) in key genes from this pathway. To our knowledge, this is the first study to explore the association between a comprehensive panel of polymorphisms in the TGF-β pathway genes and bladder cancer risk and to identify subgroups that would be more likely to have higher cancer risk.

Ethics Statement
Written informed consent was obtained from all participants prior to enrollment in this study. The study was approved by the Institutional Review Boards at MD Anderson Cancer Center, Baylor College of Medicine, and Kelsey-Seybold Clinic.

Study Population and Data Collection
This case-control study started patient recruitment in 1999 and is currently ongoing. Bladder cancer patients were recruited from the University of Texas MD Anderson Cancer Center and Baylor College of Medicine. The cases were patients with newly diagnosed, histologically confirmed bladder cancer. None of the patients had undergone chemotherapy or radiotherapy prior to study enrollment. There were no restrictions on age, gender, or disease stage at recruitment. The control subjects were healthy individuals without a prior history of cancer (except nonmelanoma skin cancer). They were recruited from the Kelsey-Seybold Clinic, the largest private multispecialty group practice in the Houston metropolitan area. Cases and controls were matched on age (±5 years), sex, and ethnicity. All study participants provided signed informed consent and completed a 45-minute in-person interview conducted by trained MD Anderson staff interviewers. After each interview, a 40 ml peripheral blood sample was drawn into coded, heparinized tubes for subsequent DNA isolation and analysis. Individuals who had smoked fewer than 100 cigarettes in their lifetime were defined as never smokers, individuals who had smoked 100 or more cigarettes but had quit more than 1 year prior to diagnosis or interview were defined as former smokers, and individuals who were currently smoking or had stopped less than 1 year prior were defined as current smokers. In this study, former and current smokers were defined as ever smokers. Since more than 90% of our recruited cases had pure transitional cell carcinoma, we only included this histology in the study. Since more than 90% of our study population was self-reported non-Hispanic Caucasian based on the questionnaire data, we restricted the analysis to Caucasians to limit the confounding effect of population stratification while retaining most of the statistical power.
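The smoking-status definitions above can be written as a small decision rule. The following is a minimal sketch, not the study's actual coding procedure: the function and argument names are hypothetical, and the handling of boundary values (exactly 100 cigarettes, exactly 1 year since quitting) is an assumption, since the text does not specify them.

```python
def classify_smoking(lifetime_cigarettes, currently_smoking, years_since_quit=None):
    """Return 'never', 'former', or 'current' following the study's definitions.

    Assumptions (not stated in the text): exactly 100 lifetime cigarettes counts
    as a smoker, and quitting exactly 1 year ago counts as a former smoker.
    """
    if lifetime_cigarettes < 100:
        return "never"
    if currently_smoking:
        return "current"
    if years_since_quit is not None and years_since_quit < 1:
        # Stopped less than 1 year before diagnosis or interview.
        return "current"
    return "former"


def is_ever_smoker(status):
    """Former and current smokers are pooled as ever smokers."""
    return status in ("former", "current")
```

For example, a participant who smoked 5,000 cigarettes and quit six months before the interview would be classified as a current smoker under this rule.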

SNP Selection and Genotyping
A panel of 356 SNPs in 37 genes (Table S1) was selected on the basis of the following criteria. Briefly, we used Ingenuity Systems Pathway Analysis software (http://www.ingenuity.com) and National Center for Biotechnology Information (NCBI) PubMed (http://www.ncbi.nlm.nih.gov) to identify a list of TGF-β pathway-related genes. For each gene, we selected tagging SNPs by the binning algorithm of LDSelect software (http://droog.gs.washington.edu/ldSelect.pl, version 1.0) (r² < 0.8, MAF > 0.05) within 10 kb upstream of the transcriptional start site or 10 kb downstream of the transcriptional stop site. SNP frequency and LD data were based on the International HapMap Project database, release 22, human genome build 36. We also included potentially functional SNPs with minor allele frequency (MAF) greater than 0.01 in the coding (synonymous and non-synonymous SNPs) and regulatory regions (promoter, splicing site, 5′-untranslated region, and 3′-untranslated region). Genotyping was performed using Illumina's iSelect custom SNP array (Illumina, San Diego, CA). Genotypes were analyzed and exported using the Illumina BeadStudio software. Any SNP with a call rate <95% was excluded from further analysis.
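The call-rate exclusion and the allele-frequency thresholds can be illustrated with a short sketch. This is an illustration under stated assumptions, not the BeadStudio workflow: genotypes are assumed to be coded as the number of variant alleles (0, 1, 2) with `None` for a failed call, the function names are hypothetical, and the MAF is computed from the genotyped sample, whereas the study applied its MAF criteria to HapMap data at the selection stage.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as counts of the variant allele (0, 1, 2)."""
    called = [g for g in genotypes if g is not None]
    variant_freq = sum(called) / (2 * len(called))
    return min(variant_freq, 1 - variant_freq)


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.05):
    """Exclude SNPs with call rate below 95%; the MAF check is an added assumption."""
    if call_rate(genotypes) < min_call_rate:
        return False
    return minor_allele_frequency(genotypes) > min_maf
```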

Statistical Analysis
Laboratory and epidemiological data were merged, and most analyses were performed using STATA 10.0 (Stata Corporation, College Station, TX) and HelixTree (Golden Helix, Bozeman, MT) software. Distributions of characteristics in cases and controls were evaluated by the χ² test (for categorical variables) or Student's t-test (for continuous variables). For each SNP, we tested Hardy-Weinberg equilibrium using the goodness-of-fit χ² test to compare the observed with the expected genotype frequencies in control subjects. The effects of SNP genotypes on bladder cancer risk were estimated as odds ratios (ORs) and 95% confidence intervals (95% CI) using multivariate unconditional logistic regression under the dominant, recessive, and additive models of inheritance, adjusted for age, gender, and smoking status, where appropriate. The best-fitting model was the one with the smallest P value among the three models. A Q-value was calculated to account for multiple comparisons. The Q-value measures the expected proportion of false positives incurred (false discovery rate) when a SNP is called significant. Bootstrap resampling was performed 100 times to internally validate the results [19]. The combined effect of unfavorable genotypes was analyzed by summing all risk genotypes from the SNPs showing statistical significance in the main analysis (P < 0.05), with equal contribution from each variant. Haplotype analysis was performed using the expectation-maximization algorithm implemented in the HelixTree software. Higher-order gene-gene interactions were evaluated using classification and regression tree (CART) analysis implemented in the HelixTree software. All statistical tests were two-sided.
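Two of the steps described above, the Hardy-Weinberg goodness-of-fit test and the coding of genotypes under the three inheritance models, can be sketched in a few lines. This is a minimal illustration with hypothetical function names, assuming a biallelic SNP and a 1-degree-of-freedom χ² test; it does not reproduce the STATA or HelixTree implementations.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts in controls and returns (chi2, p_value),
    using 1 degree of freedom for a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def code_genotype(n_variant_alleles, model):
    """Encode a genotype (0, 1, or 2 variant alleles) for logistic regression."""
    if model == "dominant":   # carriers of at least one variant allele vs. none
        return 1 if n_variant_alleles >= 1 else 0
    if model == "recessive":  # homozygous variant vs. all others
        return 1 if n_variant_alleles == 2 else 0
    if model == "additive":   # per-allele effect
        return n_variant_alleles
    raise ValueError("model must be 'dominant', 'recessive', or 'additive'")
```

The coded variable would then enter the logistic regression alongside age, gender, and smoking status, and the model with the smallest P value would be retained.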

Characteristics of the Study Population
As shown in Table 1, 801 Caucasian cases and 801 Caucasian controls were enrolled in this study. Cases and controls did not differ significantly in sex (P = 1) or age (P = 0.051). As expected, cases reported smoking more cigarettes per day than controls (25.6 versus 22.5, P < 0.001). Among smokers, cases reported higher pack-years of smoking than controls (43.0 versus 29.9, P < 0.001). There was a significant difference between cases and controls in smoking status (P < 0.001), with a higher percentage of current smokers among cases and a higher percentage of never smokers among controls (P < 0.001).

Individual SNPs and Bladder Cancer Risk
The associations of all significant SNPs (P < 0.05) with risk are summarized in Table 2. Among the 356 SNPs examined, 41 SNPs were significantly associated with cancer risk (P < 0.05), and after adjusting for multiple comparisons, 9 remained significant, with a Q-value ≤0.1. These SNPs were located in three genes in the TGF-β pathway: VEGFC, the epidermal growth factor receptor gene (EGFR), and SMAD3.
To internally validate these results, we next performed random bootstrap sampling of the significant SNPs for 100 iterations and listed the number of times that the P value was <0.05. Eight of these nine top SNPs had highly consistent results, with bootstrap P values <0.05 for greater than 90% of the samplings (Table 2).
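A false-discovery-rate adjustment of the per-SNP P values can be sketched as follows. The text does not state which Q-value procedure was used, so this example substitutes the Benjamini-Hochberg adjustment as a stand-in; it is an assumption, and Storey-type Q-values would generally differ somewhat. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Used here as a stand-in for the study's Q-value (an assumption).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_after_fdr(p_values, threshold=0.1):
    """Indices of SNPs whose adjusted value is at or below the threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(adjusted) if q <= threshold]
```

Applied to a vector of 356 P values, `significant_after_fdr` would return the SNPs retained at the 0.1 threshold under this stand-in procedure.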
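The bootstrap check can be illustrated by resampling subjects with replacement and counting how often a SNP stays significant. This is a simplified sketch under explicit assumptions: it uses an unadjusted Pearson χ² test on a 2x2 carrier-by-status table instead of the adjusted logistic regression used in the study, and the function names and input layout are hypothetical.

```python
import math
import random


def chi2_p_2x2(a, b, c, d):
    """Pearson chi-square P value (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a margin is empty, so no association can be tested
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2))


def bootstrap_support(subjects, n_iter=100, alpha=0.05, seed=0):
    """Count bootstrap resamples in which the association has P < alpha.

    subjects: list of (is_case, is_carrier) boolean pairs.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        sample = rng.choices(subjects, k=len(subjects))
        a = sum(1 for case, carrier in sample if case and carrier)
        b = sum(1 for case, carrier in sample if case and not carrier)
        c = sum(1 for case, carrier in sample if not case and carrier)
        d = sum(1 for case, carrier in sample if not case and not carrier)
        if chi2_p_2x2(a, b, c, d) < alpha:
            hits += 1
    return hits
```

A SNP with a return value above 90 out of 100 iterations would correspond to the "greater than 90% of the samplings" criterion described above.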

Cumulative Effects of Multiple Unfavorable Genotypes on Cancer Risk
Because abnormal TGF-β signaling can result in the activation of multiple downstream genes and 9 SNPs reached statistical significance after adjustment for multiple comparisons in the main effect analysis, we used combined analysis to determine whether multiple unfavorable genotypes in the TGF-β pathway have an additive effect on bladder cancer risk. There was a significant dose-response trend of increased bladder cancer risk with an increasing number of unfavorable genotypes. Compared with the reference group of subjects with 0-2 unfavorable genotypes, the groups with 3 or 4 unfavorable genotypes had significantly elevated risks, with ORs of 1.73 (95% CI, 1.28-2.33, P < 0.001) and 2.15 (95% CI, 1.59-2.91, P < 0.001), respectively, whereas the high-risk group with 5-9 unfavorable genotypes had a significantly elevated risk, with an OR of 2.57 (95% CI, 1.92-3.43, P < 0.001) (P for trend = 1.07 × 10⁻¹⁰) (Table 3).
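The grouping by number of unfavorable genotypes and the comparison of each group against the reference can be sketched as below. This is an illustration only: it computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from 2x2 counts, whereas the reported ORs come from adjusted logistic regression, and the function names are hypothetical.

```python
import math


def risk_group(n_unfavorable):
    """Assign a subject to one of the four groups used in the combined analysis."""
    if n_unfavorable <= 2:
        return "0-2"  # reference group
    if n_unfavorable == 3:
        return "3"
    if n_unfavorable == 4:
        return "4"
    return "5-9"


def odds_ratio_ci(cases_group, cases_ref, controls_group, controls_ref, z=1.96):
    """Crude odds ratio and Wald confidence interval versus the reference group."""
    odds_ratio = (cases_group * controls_ref) / (cases_ref * controls_group)
    se_log_or = math.sqrt(
        1 / cases_group + 1 / cases_ref + 1 / controls_group + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Each subject's count is the sum of risk genotypes across the significant SNPs, with every variant contributing equally, as described in the statistical methods.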

Higher-order Gene-gene Interactions
We next explored higher-order gene-gene interactions to determine whether complex interactions among these significant SNPs could further modulate bladder cancer risk. The final tree structure identified several potential interactions among the top nine SNPs (Figure 1). VEGFC: rs3775194 was identified as the initial split, which suggests its important role in gene-gene interaction and its potential to predict cancer risk. The final tree structure also identified four terminal nodes with significantly higher risk than the low-risk genetic profile of node 1 (Figure 1). In particular, node 5 had a significantly elevated risk, with an OR of 3.28 (95% CI, 1.88-5.72), while node 6 had an even higher risk, with an OR of 4.22 (95% CI, 1.46-12.17).
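How a classification tree chooses its initial split can be illustrated with a small sketch. The splitting criterion used by the HelixTree CART implementation is not described in the text, so this example assumes the Gini impurity reduction commonly used in CART; the function names and the binary carrier coding are hypothetical.

```python
def gini(n_cases, n_controls):
    """Gini impurity of a node containing cases and controls."""
    n = n_cases + n_controls
    if n == 0:
        return 0.0
    p = n_cases / n
    return 2 * p * (1 - p)


def best_initial_split(subjects, snp_names):
    """Pick the SNP whose carrier/non-carrier split most reduces Gini impurity.

    subjects: list of (is_case, {snp_name: carries_variant}) pairs.
    Returns (snp_name, impurity_reduction).
    """
    n = len(subjects)
    n_cases = sum(1 for is_case, _ in subjects if is_case)
    parent = gini(n_cases, n - n_cases)
    best_name, best_gain = None, -1.0
    for name in snp_names:
        carriers = [is_case for is_case, g in subjects if g[name]]
        others = [is_case for is_case, g in subjects if not g[name]]
        child = 0.0
        for side in (carriers, others):
            cases = sum(side)
            child += len(side) / n * gini(cases, len(side) - cases)
        gain = parent - child
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain
```

A full tree would apply the same search recursively within each child node, and the terminal nodes would then be compared against the lowest-risk node to obtain ORs.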

Discussion
Bladder cancer remains a challenging disease because of its high rate of recurrence and the accompanying high medical costs. Using genetic markers to determine risk may help to identify high-risk populations for early screening, diagnosis, and therapy, which may improve clinical outcome. This is the first study to explore the association between a comprehensive panel of polymorphisms of TGF-β pathway genes and bladder cancer risk and to identify subgroups that would more likely have higher cancer risk. We identified and evaluated 356 SNPs in 37 key genes from the TGF-β pathway for their associations with bladder cancer risk. Our results identified 41 SNPs significantly associated with bladder cancer susceptibility, and nine of them remained significant after adjustment for multiple comparisons (Q ≤ 0.1). In particular, SNPs in VEGFC showed the most significant associations in single-SNP and haplotype analyses. CART analysis further revealed potential high-order gene-gene interactions and categorized subjects into different risk groups according to their specific polymorphic signatures. Our study provides the first epidemiological evidence supporting a connection between a comprehensive panel of TGF-β pathway SNPs and bladder cancer risk.
Five of the top nine significant SNPs were in VEGFC, which is known to have important functions contributing to bladder cancer risk (Table 2). VEGFC is a member of the platelet-derived growth factor/vascular endothelial growth factor (PDGF/VEGF) family. This gene functions in angiogenesis, lymphangiogenesis, and endothelial cell growth and survival, affects the permeability of blood vessels, and facilitates nodal metastasis. Several studies have correlated elevated VEGF expression with disease recurrence or progression, often as an independent predictor in multivariate analysis. There is a positive correlation between VEGF-C expression and lymphatic invasion in patients with breast, gastric, and cervical cancer [20][21][22]. VEGF-C expression was found in the cytoplasm of transitional carcinoma cells and was associated with lymph node metastasis in bladder cancer [23][24]; it has also been found to correlate with clinical parameters such as tumor size, pathological T stage, pathological grade, lymphatic-venous involvement, and pelvic lymph node metastasis in bladder cancer patients [23]. Haplotype analysis further identified three significant haplotypes of VEGFC, which suggests that haplotype-based analysis may be more informative than single-SNP analysis and that resequencing DNA samples carrying the high-risk and low-risk haplotypes may improve risk assessment. The five most significant (Q < 0.1) VEGFC SNPs we identified are all located in intronic regions and may contribute to alterations in gene expression or splicing. Alternatively, these SNPs may be linked to other causal variants in VEGFC.
Three of the nine most significant SNPs were in EGFR (Table 2). EGFR is a tyrosine kinase transmembrane receptor in the ErbB family of receptors expressed on the surface of epithelial cells [25]. EGFR regulates important processes in carcinogenesis, including cell survival, tumor invasion, and angiogenesis, and is involved in many malignancies, including bladder cancer [26]. EGFR overexpression is frequently observed in tumors and precancerous lesions and induces tumor formation in animal studies. EGFR expression in bladder cancer independently predicts disease progression and mortality, and both VEGF and EGFR are emerging as important targets for the treatment of metastatic bladder cancer [25]. These EGFR SNPs may affect gene transcription, thus altering protein level, or they may be linked to other causal variants in EGFR. In addition, we identified SMAD3: rs12324036 as significantly associated with cancer risk. It is well known that TGF-β/Activin/Nodal signaling is transduced by SMAD2 and SMAD3, and an increased TGF-β1 level can revert a malignant phenotype to a less aggressive phenotype in a rat bladder carcinoma cell line lacking TGF-β1 [27]. Our previous study also identified SMAD3: rs11632964 as significantly associated with lung cancer overall survival [28], which further highlights the important role of SMAD3 in cancer. The variant allele of SMAD3: rs12324036 may affect gene transcription, thus altering protein level. Alternatively, it may be linked to other causal variants in SMAD3. Overall, our study highlights the association of EGFR, VEGFC, and SMAD3 polymorphisms with bladder cancer risk.
We next applied a pathway-based approach to comprehensively evaluate the effect of the nine significant SNPs (Q < 0.1) on the risk of bladder cancer. We identified a gene-dosage effect for the nine SNPs that were significant after multiple comparison adjustment. Those with 5-9 risk genotypes had the highest risk of bladder cancer, suggesting that additional variations within this key pathway were detrimental and had a larger effect than any single variant. Furthermore, the effect of each individual SNP was modest, but the risk for individuals with five to nine of these risk genotypes was more than doubled. This also emphasizes the importance of including multiple SNPs within a shared pathway when examining joint effects in risk assessment.
Within the framework of a pathway, we hypothesized that gene-gene interactions would further modulate the risk of bladder cancer. Potential gene-gene interactions among three variants were observed, with VEGFC: rs3775194 being the initial split in our CART analysis, suggesting that it accounts for the most variation in risk. Individuals carrying the common genotype for VEGFC: rs3775194 and EGFR: rs7799627 and the variant genotype for VEGFR: rs4557213 had a 4.22-fold increase in risk, a much larger effect than that conferred by the common genotype for VEGFR: rs4557213.
In summary, we have performed a pathway-based analysis of TGF-β pathway genes and bladder cancer risk. These data provide important genetic information for predicting individuals at risk for bladder cancer and identifying tumors at an early, curable stage. In addition, our relatively comprehensive query of TGF-β pathway polymorphisms and our large population with detailed risk information provide substantial evidence for the involvement of SNPs as predictors or modulators of bladder cancer risk. However, our study has some limitations. First, further fine mapping and functional assays are necessary to reveal the potential molecular mechanisms of these SNPs or other linked causal polymorphisms. Second, only Caucasians were included in this study; it would be interesting to examine these SNPs in minority populations. Finally, although our results were largely validated internally, replication studies in independent populations are needed to confirm them.

Supporting Information
Table S1. Selected genes of the TGF-β pathway for this study. (DOC)

Figure 1. Higher-order gene-gene interactions and bladder cancer risk. Note: Each node shows the number and percentage of cases and the OR with 95% confidence interval in parentheses. *Significant at P < 0.05. doi:10.1371/journal.pone.0051758.g001