Molecular characterization of low grade and high grade bladder cancer

Background Bladder cancer (BC) is the 9th most common cancer diagnosis worldwide. Low grade (LG) tumors represent 70% of all BCs and are characterized by frequent recurrence and a rare ability (10–15%) to progress to high grade (HG) and invade. The remaining 30% are HG, rapidly invasive BCs that are resistant to therapy. Identifying biomarkers that predict which tumors are able to progress is a key goal for improving patient outcome. This study focuses on the most promising prognostic markers. Materials and methods TP53 and FGFR3 mutational status and Survivin, CK19, CK20, E-cadherin and CD44 gene expression were analyzed in 66 BCs. Results Survivin was associated with tumor grade (p<0.05). Moreover, Survivin correlated with CD44 in TP53 wild type (p = 0.0242) and FGFR3 wild type (p = 0.0036) tumors. In particular, the Survivin-CD44 correlation was associated with HG FGFR3 wild type BCs (p = 0.0045). Unsupervised hierarchical clustering based on gene expression data identified four distinct molecular groups reflecting the patient histology (p = 0.038). Conclusion We suggest Survivin both as a biomarker associated with G3 BCs, though negatively related to TP53 mutational status, and as a potential novel therapeutic target.



Introduction
Bladder cancer (BC) is the ninth most common malignant disease and one of the most common causes of cancer death worldwide [1]. Men are more affected than women (3.5:1 ratio), and BC incidence increases with age and differs considerably between geographical regions. European countries (Spain, Italy, Denmark and Switzerland), North America, some northern African countries and western Asia show the highest incidence, while Central and South America, Sub-Saharan Africa and Southeast Asia show the lowest rates [2]. Cigarette smoke is the main causative factor, responsible for about half of all BCs, while several other industrial chemicals are associated with 20% of BC development [3]. The most common BC symptom is haematuria (microscopic or macroscopic). Macroscopic haematuria is associated with an advanced pathological stage. Unfortunately, microscopic haematuria is not adequately evaluated and there is no active screening for BC [4].
At diagnosis, about 70% of all BCs are low grade (LG) BCs (Ta/pT1/CIS): typically non-invasive tumors that grow as superficial papillary protrusions, are genetically associated with FGFR3 mutations and are characterized by a high risk of recurrence but a low propensity to progress to high grade (HG), invade and metastasize. However, a small percentage of LG tumors (10-15%) are able to progress to HG and become invasive, likely due to acquired TP53 mutations [5,6]. The remaining 30% of BCs are HG, genetically associated with TP53 mutations and characterized by a fast, direct progression to invasive tumors (pT2-4). They are mainly flat BCs that develop from severe dysplasia or carcinoma in situ (CIS) and are associated with resistance to therapy and poor prognosis. Many genetic factors contribute to the tumorigenesis and progression of BC [6,7], such as mutations in both the TP53 and FGFR3 genes [8][9], as well as alterations in the expression of genes involved in cell morphology, epithelial-mesenchymal transition (EMT) and apoptosis resistance [10], such as CK genes [11], CD44 [12], E-cadherin [13] and Survivin [14].
Currently, tumor, node, and metastasis (TNM) staging and grading systems are insufficient to accurately predict BC evolution. However, the presence of grade 3 (pTa-1G3 or CIS) is the main predictor of progression and mortality in patients affected by non-invasive BC [15]. Predicting which tumors can acquire the ability to progress, invade and/or metastasize is crucial to guide initial therapy and improve patient outcome. Thus, there is a clear need for tumor markers that can be incorporated into clinical practice, adding prognostic information on treatment response and outcome to the conventional TNM and grading systems [16].
In this study we molecularly characterized 66 histologically non-invasive BCs, considering the most promising prognostic markers. We analyzed TP53 and FGFR3 mutational status and the gene expression of Survivin, CK19, CK20, CD44 and E-cadherin. In particular, we performed an association analysis of the target genes within high grade (HG) and low grade (LG) BCs. We then compared the molecular status with tumor grade and morphology (papillary or flat) in order to characterize those BCs with a high propensity to progress, such as HG flat G3 (CIS), and to highlight putative molecular targets for therapy. Furthermore, we investigated possible correlations among genes and performed hierarchical clustering analysis in order to identify a stratification of our population driven solely by molecular status.

Sample collection
A total of 66 BC samples were analyzed: 58/66 patients were male and 8/66 female, with a mean age of 75. The sample set was composed of 48 formalin-fixed paraffin-embedded (FFPE) tissues and 18 fresh tissues, collected at the Versilia Hospital (Viareggio, Italy). All 66 samples were histologically non-invasive: 33 were histologically diagnosed as HG and 33 as LG tumors (Table 1 and Table A in S1 Tables).
A detailed grading was available only for the 48 FFPE samples: 9/48 were G1 (pTa), 13/48 were G2 (pTa) and 26/48 were G3. In the G3 group, 7/26 showed papillary morphology (pTa) while the remaining 19 had no papillary morphology (pTis) at histological diagnosis (Table 2 and Table A in S1 Tables). An example of the histologic classification of G1 (A), G2 (B) and G3 (C), as well as of LG and HG BCs, among our cases is shown in Fig 1. The study was approved by the Ethics Committee of the University Hospital of Pisa and all methods were performed in accordance with approved guidelines.

TP53 and FGFR3 gene mutational analysis
DNA extracted from 4 x 10 μm FFPE tissue sections (Macherey-Nagel, Düren, Germany) was used to amplify TP53 exons 4-9 and FGFR3 exons 7, 10 and 15 by PCR (Applied Biosystems, MA). The PCR product was purified (Millipore) and analyzed on an ABI-XL 3130 Sequencer (Applied Biosystems, MA, USA). Because of the poor quality of the DNA extracted from the FFPE tissues, we could not determine the TP53 and FGFR3 mutational status of all the samples.

Gene expression analysis of Survivin, CK20, CD44 and E-cadherin
Total RNA extracted from 4 x 10 μm FFPE tissue sections (Helix Extraction System, Diatech) was reverse transcribed into cDNA (Invitrogen, Carlsbad, CA). Gene expression levels were analyzed on a Rotor-Gene 6000 (Corbett, Qiagen, CA). An endogenous reference gene (beta-2 microglobulin) was used to normalize the expression level of each gene (Table C in S1 Tables). Because of the poor quality of the RNA extracted from the FFPE tissues, we could not perform the gene expression analysis on all the samples.
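The text does not state the formula used to normalize target genes to the reference gene. As an illustration only, the sketch below uses the common 2^-ΔCt relative quantification, which assumes roughly 100% amplification efficiency for both genes. The function name and the Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative mRNA level of a target gene versus a reference gene.

    Assumption (not stated in the paper): 2^-dCt quantification, where
    dCt = Ct(target) - Ct(reference) and both amplifications are ~100% efficient.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles after the
# reference gene (e.g. beta-2 microglobulin), so it is 2^3 = 8 times less abundant.
level = relative_expression(ct_target=25.0, ct_reference=22.0)
print(level)
```

A lower Ct for the target than for the reference yields a value above 1, and each additional cycle of delay halves the estimated relative level.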

Statistical analysis
Chi-square and Fisher tests were used to analyze the frequencies of TP53 and FGFR3 genetic alterations, while Student's t-test was used to study differential gene expression in the patient population. Unsupervised hierarchical clustering analysis considering all the gene expression data was performed with Tanagra software.
Multivariable analysis was used to investigate correlations among genes, taking into account the effects of all histological and morphological variables. A p-value ≤ 0.05 was considered statistically significant in all analyses (StatGraphics XVI software).
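For readers unfamiliar with the Fisher test applied to mutation frequencies, the following is a minimal sketch of a two-sided Fisher exact test on a 2x2 table (for instance, mutated versus wild type by LG versus HG). It is written with the Python standard library only and is not the StatGraphics procedure used in the study. The counts in the example are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = LG / HG, columns = mutated / wild type
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

The exact test is generally preferred over the chi-square approximation when expected cell counts are small, as is often the case in cohorts of this size.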

Association analysis of gene expression levels with LG and HG BCs
To study the relationship between the molecular profile and the histological classification (low versus high grade), we analyzed, in HG and LG BCs, the expression of genes involved in BC development: Survivin, CK19, CK20, CD44 and E-cadherin (Fig 2). Survivin was significantly overexpressed (p = 0.04) in HG tumors, with mRNA levels 3.8 fold higher than in LG tumors (mRNA levels: 0.58 ± 0.19 in HG and 0.15 ± 0.06 in LG) (Fig 2A). HG tumors showed a four fold down-regulation of CK20 compared to LG BCs (mRNA levels: 0.24 ± 0.13 in LG and 0.06 ± 0.02 in HG) (Fig 2B). CK19 mRNA levels decreased slightly from LG to HG tumors (mRNA levels: 27.09 ± 5.53 and 18.22 ± 4.12 respectively; data not shown). E-cadherin mRNA levels were 3.3 fold higher in the LG than in the HG group (mRNA levels: 1.34 ± 0.61 and 0.44 ± 0.14 respectively), while CD44 was over-expressed 2.1 fold in LG compared to HG tumors (mRNA levels: 0.24 ± 0.14 and 0.11 ± 0.06 respectively) (Fig 2C and 2D). However, the differences in CK19 (data not shown), CK20, E-cadherin and CD44 gene expression were not significant.
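The fold differences quoted above are ratios of group mean mRNA levels. The short sketch below recomputes two of them from the rounded means given in the text. The function name is illustrative, and ratios of rounded means need not match the reported fold values exactly.

```python
def fold_change(mean_numerator: float, mean_denominator: float) -> float:
    """Ratio of two group means; values above 1 mean the numerator group is higher."""
    if mean_denominator == 0:
        raise ValueError("fold change is undefined for a zero denominator")
    return mean_numerator / mean_denominator


# Rounded mean mRNA levels quoted in the text
survivin_hg, survivin_lg = 0.58, 0.15
ck20_lg, ck20_hg = 0.24, 0.06

print(f"Survivin, HG over LG: {fold_change(survivin_hg, survivin_lg):.2f}")
print(f"CK20, LG over HG: {fold_change(ck20_lg, ck20_hg):.2f}")
```

A fold change summarizes the size of a difference only. Whether that difference is significant depends on the within-group variability, which is why a four fold change in CK20 can still fail to reach significance.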
Then, within LG and HG tumors, we investigated the relationship between the expression of these target genes and the TP53-FGFR3 mutational status of the tumor. In LG tumors, Survivin expression appeared to be inversely related to TP53 mutational status: TP53 mutations, alone or in combination with FGFR3 mutations, were associated with a decrease in Survivin mRNA levels compared to wild type LG BCs. In particular, TP53 mutated tumors showed a 12.7 fold and a 17.5 fold decrease in Survivin mRNA levels compared to wild type (data not significant) and FGFR3 mutated LG BCs (p = 0.021), respectively (Fig 3A). On the other hand, FGFR3 mutated and TP53-FGFR3 double mutated BCs showed a 9 fold decrease in CD44 mRNA levels compared to both wild type and TP53 mutated tumors (data not significant; Fig 3A). The down-regulation of Survivin associated with TP53 mutational status was also confirmed in HG tumors, compared to both wild type and FGFR3 mutated tumors. In addition, TP53 mutated tumors showed a 4 fold increase in CK20 mRNA compared to wild type tumors, although not significantly (Fig 3B).
Moreover, multivariable analysis showed that Survivin and CD44 were strongly correlated with each other in both TP53 wild type (p = 0.0242) and FGFR3 wild type (p = 0.0036) BCs, whereas E-cadherin correlated with CD44 exclusively in TP53 mutated tumors (p = 0.011) and with CK20 only in FGFR3 mutated BCs (p = 0.033). Within HG tumors, E-cadherin correlated with CK20 in BCs harboring TP53 mutations (p = 0.0025) and with Survivin in TP53 wild type tumors (p = 0.0129). In FGFR3 wild type HG tumors, Survivin and CD44 were significantly correlated with each other (p = 0.0045), as were CK20 and E-cadherin (p = 0.0001). Within LG BCs, Survivin and CK20 correlated with each other in TP53 wild type (p = 0.0159) or FGFR3 wild type (p = 0.0104) tumors. Multivariable analysis on the total patient population showed a significant positive correlation between CK19 and CK20 (p = 0.006) (Fig 4).

Hierarchical clustering analysis
Considering the well-known heterogeneity of BC, we performed an unsupervised hierarchical clustering combining CD44, E-cadherin, Survivin and CK20 gene expression data. Because the analysis required all four gene expression values for each sample, the number of samples examined was reduced to 44 for the LG/HG analysis and to 30 for the grading and papillary morphology analysis. Fig 7 shows that the gene expression profiles obtained by combining the expression data revealed a similar trend between Cluster 2 and the G2 group, and between Cluster 4 and the flat G3 group. The distribution of the two histotypes (LG and HG) across the four clusters was statistically significant (p = 0.03) (Fig 8 and Table 7). On the other hand, the distribution of grading (G1, G2 and G3) and of tumor morphology across the four clusters was not statistically significant. Cluster 1 and Cluster 4 were the most distant groups: Cluster 1 contained more LG than HG cases, while Cluster 4, conversely, contained more HG cases.
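To illustrate what an unsupervised hierarchical clustering of four-gene expression profiles involves, the sketch below implements a simple agglomerative procedure in plain Python. Average linkage and Euclidean distance are assumptions made for the example, since the linkage and distance settings used in Tanagra are not stated in the text. The sample profiles are invented.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(points, k):
    """Repeatedly merge the two closest clusters (average linkage) until k remain.

    points: sequence of equal-length numeric tuples, one per sample.
    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(points))]

    def average_linkage(c1, c2):
        # Mean Euclidean distance over all cross-cluster pairs of samples
        pair_sum = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return pair_sum / (len(c1) * len(c2))

    while len(clusters) > k:
        # combinations() yields index pairs with a < b, so deleting b is safe
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical profiles, ordered as (CD44, E-cadherin, Survivin, CK20)
profiles = [
    (0.10, 1.30, 0.10, 0.20),
    (0.12, 1.20, 0.15, 0.25),
    (0.30, 0.40, 0.60, 0.05),
    (0.28, 0.50, 0.55, 0.06),
]
print(agglomerative_clusters(profiles, k=2))
```

Because the procedure never sees the histological labels, any agreement between the resulting clusters and LG/HG status has to be assessed afterwards, for example with a contingency test. In practice, expression values are usually standardized per gene before clustering so that no single gene dominates the distance.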

Discussion
BC is the 9th most common cancer diagnosis worldwide [2]. It is a heterogeneous disease, both clinically and pathologically, and its detection is based on methods with low sensitivity. Approximately 70% of all BCs are non-invasive low grade (LG) BCs (Ta/pT1/CIS), typically growing as superficial papillary protrusions, genetically associated with FGFR3 mutations and characterized by a high risk of recurrence together with a rare (10-15%) ability to progress to high grade (HG) and invade. This ability is likely due to acquired TP53 mutations. The remaining 30% of tumors are HG, flat and genetically associated with TP53 mutations. They develop from severe dysplasia or carcinoma in situ (CIS) and progress directly to invasive disease (pT2-4). They are associated with resistance to cancer therapy and poor prognosis [5,6,7]. Predicting which LG tumors can progress and become invasive is a key goal for improving patient outcome. TNM staging and grading systems are not sufficient to accurately predict BC evolution. However, grade appears to be the best prognostic indicator of BC progression and mortality: 50% of CIS and at least 25% of pTaG3 tumors become invasive [15]. The main medical need, still unmet, is to identify molecular markers that can be translated into diagnostic or prognostic tools [16]. Here, we focused on the most promising prognostic markers.
First, we observed that HG and LG BCs did not show any significant association with TP53 and/or FGFR3 somatic mutations (Tables 5 and 6), in contrast with the literature [17,18], which reports TP53 and FGFR3 mutations as strictly related to HG and LG tumors respectively. This discrepancy could be explained by the very small size of our study population, which underscores the importance of validating these findings in a larger cohort. Nevertheless, conflicting results have already been published concerning TP53 and FGFR3 mutation frequency in BC, as underlined by Neuzillet et al. (2012), who showed how stage and grade can act as confounding factors, creating spurious associations between the risks of each mutation. Only a large study population, including all BC grades and stages, allows the association analysis between TP53 and FGFR3 to be properly adjusted [19]. Moreover, given the high frequency of TP53 mutations in the LG group, we cannot exclude that our LG study population includes a large number of high-propensity LG tumors able to evolve into HG ones (mainly associated with TP53 mutations). Indeed, within LG tumors, it is currently not possible to discriminate at diagnosis those that are able to acquire invasive features. From this perspective, mutations in the TP53 gene could represent an early genetic event necessary to drive the 15% of LG BCs able to acquire an invasive phenotype. On the other hand, in accordance with the literature, we found TP53 and FGFR3 mutations to be almost mutually exclusive [17]: only 2/27 LG and 1/31 HG tumors showed co-occurrence of TP53 and FGFR3 mutations. Regarding the morphology of HG tumors, our data show a higher frequency (about two fold) of TP53 mutations in flat tumors compared to papillary ones. This is in accordance with the literature, which indicates that HG tumors with a high propensity to invade (such as flat HG) are mainly associated with TP53 mutations [5,6].
In addition, HG tumors showed significant Survivin over-expression compared to LG non-invasive tumors, supporting the role that Survivin could play not only as a predictive biomarker for BC progression [20], but also as a potential therapeutic target. Moreover, we observed that Survivin expression seems to be inversely related to TP53 mutational status in both LG and HG BCs. In particular, within LG tumors, TP53 mutations (alone or in combination with FGFR3 mutations) were associated with low Survivin mRNA levels compared to wild type BCs. In detail, TP53 mutated BCs showed a 12.7 fold and a 17.5 fold decrease in Survivin mRNA levels compared to wild type (data not significant) and FGFR3 mutated BCs (p = 0.021), respectively (Fig 3A). Regarding grading, Survivin was significantly overexpressed in G3 compared with G1 (p = 0.04), consistent with the clear trend associated with the aggressive state of BC. Although not statistically significant, we found a 4 fold decrease in CK20 mRNA levels in HG compared to LG BCs, in line with the well-known heterogeneity of CK20 expression: in BC, the proportion of CK20 positive cells ranges from 15% to 97% [21]. Moreover, in the study conducted by Bassily et al, CK20 expression was reported across all grades: in 75% (3/4) of low malignant potential tumors, 83% (5/6) of LG tumors, 38% (3/8) of organ-confined HG tumors, and 67% (6/9) of HG tumors that invaded adjacent structures [22]. E-cadherin and CD44 data showed a trend already supported by the literature [23,24]. CK20, CD44 and E-cadherin mRNA levels were not statistically associated with G1, G2 or G3 BCs, likely due to the small size of our study population. However, among all the correlations identified (Fig 4), we highlight the significant positive correlation between Survivin and CD44 in both TP53 wild type (p = 0.0242) and FGFR3 wild type (p = 0.0036) tumors, and in FGFR3 wild type HG BCs (p = 0.0045).
These results, not previously described in the literature, suggest a role that Survivin and CD44 could play together in tumor progression, especially in TP53-FGFR3 wild type BCs. Survivin and CD44 are two key factors in cancer development: Survivin is a crucial protein in apoptosis inhibition, whereas CD44 is a well-known stemness marker associated with resistance to treatment [25]. Besides their prognostic role, Survivin and CD44 could also represent important targets for cancer therapy, blocking progression especially in the TP53-FGFR3 wild type BCs able to become invasive [26].
Moreover, on the basis of the histological and morphological diagnosis, we described and molecularly characterized three BC sub-populations: LG with papillary morphology (G1 + G2), HG papillary G3 and flat G3 (CIS). Indeed, Survivin showed a statistically significant up-regulation (7.7 fold) in flat G3 tumors compared to the papillary LG group (p = 0.017) and papillary G3 tumors (p = 0.016) (Fig 6), while no difference was found between papillary LG and papillary G3. These results suggest that HG BCs with a higher propensity to progress, such as the flat G3 (CIS) tumors, could be molecularly distinguished. In accordance with the literature [27,28], the E-cadherin data describe an interesting trend negatively correlated with BC progression (Fig 6C), although not statistically significant, likely due to the low number of analyzed samples. CK20 and CD44 were also not statistically different among groups, probably also because of the small size of the analyzed population.
Considering the molecular heterogeneity we found in our BC population, we decided to use our gene expression data to perform an unsupervised hierarchical clustering analysis. We chose the unsupervised procedure so that the clustering would not depend on any prior knowledge of the dataset, allowing us to explore the resulting groups and look retrospectively for a biological meaning. The combination of the gene expression values of CK20, CD44, E-cadherin and Survivin distinguished four clusters (Fig 7), which contained a progressively increasing number of HG cases from Cluster 1 to Cluster 4, in which 80% of the cases were HG (Fig 8). Conversely, 88% of Cluster 1 was composed of LG cases. The distribution of the two histotypes was statistically significant, with a p-value of 0.036 (Table 7). The use of unsupervised clustering strengthens our findings, since no prior bias was introduced into the analysis, and our results lend further weight to the use of molecular classification for stratifying cancer populations. On the other hand, no statistically significant distribution of grading or tumor morphology was found among the clusters, probably due to the very small size of our study population. It is important to underline that the grouping we obtained with a few markers does not necessarily reflect the one that could be obtained from a whole-genome expression data analysis.
In conclusion, our study confirms FGFR3 and TP53 as mutually exclusive mutations, and underlines the significant negative correlation between Survivin gene expression and TP53 mutational status in both LG and HG tumors. We suggest Survivin not only as a prognostic marker, in accordance with Jeon et al [29], who showed the prognostic role of Survivin in BCs by a meta-analysis approach, but above all as a potential therapeutic target, especially in TP53 wild type tumors. Notably, our study focuses exclusively on BCs that were non-invasive at diagnosis, showing that Survivin is statistically related to HG lesions and indicative of BCs with a high potential to invade (such as CIS). In addition, for the first time, we report a significant positive correlation between Survivin and CD44 in TP53-FGFR3 wild type tumors, suggesting the potential role they could play not only as predictors but also as therapeutic targets, especially for blocking the progression of the TP53-FGFR3 wild type BCs able to invade. These findings need to be further investigated by functional studies. Finally, our clustering analysis shows how crucial it is to compare the histopathological approach with the molecular one, which is able to identify subgroups that otherwise could not be revealed. In the long run, studies like ours should aim at translating molecular findings into clinically meaningful biomarkers. Stratifying patients into more specific and better characterized molecular groups will make it possible to establish the clinical relevance of each BC, provide a better prognosis and develop personalized therapeutic strategies.