We aimed to determine the validity of two risk scores for patients with primary non-muscle invasive bladder cancer in different European settings.
We included 1,892 patients with primary stage Ta or T1 non-muscle invasive bladder cancer who underwent a transurethral resection in Spain (n = 973), the Netherlands (n = 639), or Denmark (n = 280). We evaluated recurrence-free survival and progression-free survival according to the European Organisation for Research and Treatment of Cancer (EORTC) and the Spanish Urological Club for Oncological Treatment (CUETO) risk scores for each patient and used the concordance index (c-index) to indicate discriminative ability.
The three cohorts were comparable with respect to age and sex, but the Danish cohort had a larger proportion of patients with high stage and grade at diagnosis (p<0.01). At least one recurrence occurred in 839 (44%) patients, and 258 (14%) patients had a progression, during a median follow-up of 74 months. Patients from Denmark had the highest 10-year recurrence and progression rates (75% and 24%, respectively), whereas patients from Spain had the lowest rates (34% and 10%, respectively). The EORTC and CUETO risk scores both predicted progression better than recurrence: c-indices ranged from 0.72 to 0.82 for progression and from 0.55 to 0.61 for recurrence.
Citation: Vedder MM, Márquez M, de Bekker-Grob EW, Calle ML, Dyrskjøt L, Kogevinas M, et al. (2014) Risk Prediction Scores for Recurrence and Progression of Non-Muscle Invasive Bladder Cancer: An International Validation in Primary Tumours. PLoS ONE 9(6): e96849. https://doi.org/10.1371/journal.pone.0096849
Editor: Georgios Gakis, Eberhard-Karls University, Germany
Received: November 26, 2013; Accepted: April 12, 2014; Published: June 6, 2014
Copyright: © 2014 Vedder et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This research received funding from the European Community’s Seventh Framework program FP7/2007-2011 under grant agreement 201663 (Uromol project, http://www.uromol.eu/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: Ewout Steyerberg is a PLOS ONE Editorial Board member. This does not alter the authors’ adherence to PLOS ONE Editorial policies and criteria.
Bladder cancer is the most common malignancy of the urinary tract and a major health issue. Most patients with bladder cancer are diagnosed with non-muscle invasive disease (NMIBC: stage Ta or T1). After transurethral resection (TUR), disease recurs in 30–60% of patients, and approximately 10–15% progress to muscle-invasive disease within 5 years of diagnosis. Therefore, regular cystoscopy is carried out for surveillance after TUR. To better target surveillance, risk scores for the prediction of recurrence and progression have been developed. The best known are the European Organisation for Research and Treatment of Cancer (EORTC) and the Spanish Urological Club for Oncological Treatment (CUETO) risk scores, the latter focusing on patients treated with bacillus Calmette-Guérin (BCG). Despite their potential usefulness in daily practice, few studies have externally validated these models, and no study focussed on primary NMIBC. In addition, since the EORTC score was based on a cohort of patients included in 7 clinical trials, the question arises whether these scores are still valid for predictive purposes in a broader set of NMIBC patients. The EORTC and CUETO scores were based on specimens evaluated by central pathology and specialized pathologists, whereas the specimens included in the present study had been evaluated by routine pathology. In the present study, we investigated the external validity of these risk scores in patients with primary NMIBC across European centres in an everyday routine setting.
We included 1,892 patients with primary NMIBC from three countries: Spain, Denmark, and the Netherlands. Patients from Spain were recruited between 1998 and 2001 from 18 general and university hospitals as part of the Spanish Bladder Cancer/EPIdemiology of Cancer of the UROthelium (EPICURO) study. All centres are listed in Appendix Table S1. Patients from Denmark were selected from the patient records of the Aarhus University Hospital between 1979 and 2007 on the basis of being at higher risk of progression. For the Netherlands, we included consecutive patients from the Erasmus MC who underwent a TUR between 1990 and 2012. Patient and tumour characteristics and data on recurrence and progression after TUR of the primary NMIBC were extracted from hospital records up to November 2012. All patients had histologically confirmed NMIBC and were treated according to the centres’ usual procedures. At the Erasmus MC in the Netherlands, follow-up of patients was according to the EAU guidelines at the time, and risk-adapted according to the EORTC risk scores. At the Aarhus University Hospital in Denmark, the common follow-up strategy for all patients was follow-up every three months. In Spain, protocols for the follow-up of bladder cancer patients were developed within each centre. For patients with non-muscle invasive bladder cancer, follow-up consisted of bladder endoscopy every three months in the first year, every six months in the second year, and annually thereafter to complete five years of monitoring. White light cystoscopy was used in all centres participating in our study.
Disease progression was defined as cystoscopically detected tumour relapse with histological confirmation at tumour stage T2 or higher (progression to a muscle-invasive tumour stage); it was assumed that tumour progression always precedes death from cancer. Patients who died of bladder cancer without a recorded progression were therefore recorded as having had a progression at the time of death. Recurrence was defined as cystoscopically detected tumour relapse with histological confirmation. Data from the three cohorts were harmonized, anonymized, and combined in one data set for statistical analyses, stratified by cohort.
All Danish and Spanish patients gave their written informed consent, and the study was approved by the Central Denmark Region Committees on Biomedical Research Ethics (1994/2920) and by the Ethics Committees of each Spanish participating centre and the Institutional Review Board of the U.S. National Cancer Institute, NIH, USA. This observational study was exempted from formal ethical approval in the Netherlands. All data were anonymized before being used in this study.
The EORTC scores for recurrence and progression were based on data from 2,596 patients diagnosed with Ta/T1 tumours from seven EORTC trials. A limitation of the EORTC scores was the low number of patients treated with bacillus Calmette-Guérin (BCG). Therefore, the CUETO group developed a scoring model in 1,062 BCG-treated patients. The EORTC score incorporated the number of tumours (single, 2–7 or ≥8), tumour size (<3 cm or ≥3 cm), prior recurrence rate (primary, ≤1 recurrence/year, >1 recurrence/year), T stage (Ta or T1), concomitant carcinoma in situ (yes/no), and grade (1, 2, or 3). The CUETO model incorporated gender, age (<60, 60–70, >70 years), recurrent tumour (yes/no), number of tumours (≤3 or >3), T stage (Ta or T1), concomitant carcinoma in situ (yes/no), and grade (1, 2, or 3).
For all patients, we calculated risks for recurrence and progression according to the EORTC and CUETO scores based on the primary tumour. Standard pathologic procedures were followed in each cohort. Tumour grade was scored according to the 1973 system, and pathological stage according to the 2002 staging system. Data on the presence of concomitant carcinoma in situ (CIS) were incomplete (n = 990, 52% missing), as were data on the number of tumours (n = 346, 18% missing). We used a multiple imputation strategy resulting in five sets of complete data to compute risk scores. We subsequently averaged these risk scores for each patient. Patient scores were then categorized into four risk groups, i.e. low, intermediate low, intermediate high, and high risk for recurrence or progression, as originally specified for the EORTC and CUETO scores. The two highest risk groups were combined because of low numbers. Observed recurrence-free survival (RFS) and progression-free survival (PFS) were calculated from the date of TUR of the primary tumour. An event for RFS was defined as recurrence, or as progression if progression occurred as the first event during follow-up. Follow-up was censored at the last date of follow-up, the date of death, or 120 months, whichever came first. We used standard Kaplan–Meier plots to visualize recurrence and progression patterns in relation to risk groups. This cause-specific analysis was not adjusted for the competing risk of death before recurrence or progression, since we focused on the discriminative ability of the 2 risk scores (quantified by a concordance measure, the c-index). We conducted subgroup analyses for patients receiving only BCG treatment after TUR. Furthermore, to obtain further insight into the validity of the scores, we refitted them with a Cox regression analysis stratified by cohort, recalculating the risk scores with EORTC and CUETO coefficients based on our data.
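As an illustration of the concordance measure described above, the following is a minimal sketch of Harrell's c-index for right-censored data. It is not the analysis code used in this study (the analyses were run in SPSS and R); the function name, argument names, the simplified handling of tied follow-up times, and the toy data are all assumptions made for illustration. In this study the c-index was evaluated per cohort, so a function like this would be applied to each cohort separately.

```python
from itertools import combinations


def harrell_c_index(times, events, scores, horizon=120.0):
    """Harrell's c-index for right-censored data (illustrative sketch).

    times   : follow-up time per patient, in months
    events  : 1 if the event (e.g. recurrence) was observed, 0 if censored
    scores  : risk score per patient; higher means higher predicted risk
    horizon : administrative censoring time in months (120 = ten years)
    """
    # Apply administrative censoring: follow-up beyond the horizon is
    # truncated and treated as censored.
    data = []
    for t, e, s in zip(times, events, scores):
        if t > horizon:
            t, e = horizon, 0
        data.append((t, e, s))

    concordant = 0.0
    usable = 0
    for a, b in combinations(data, 2):
        # Order the pair so that `first` has the shorter follow-up time.
        first, second = (a, b) if a[0] <= b[0] else (b, a)
        # A pair is usable only if the patient with the shorter time had
        # an observed event. Simplification (assumption): pairs with
        # identical follow-up times are skipped.
        if first[0] == second[0] or not first[1]:
            continue
        usable += 1
        if first[2] > second[2]:
            concordant += 1.0   # higher score had the earlier event
        elif first[2] == second[2]:
            concordant += 0.5   # tied scores count as half

    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data, not taken from the study.
toy_times = [10, 25, 40, 60, 130]
toy_events = [1, 1, 0, 1, 1]
toy_scores = [9, 4, 5, 2, 0]
print(harrell_c_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is the scale on which the c-indices in this paper should be read.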
We used likelihood ratio statistics to determine the statistical significance of predictors. For comparability with the original EORTC and CUETO scores, we scaled the refitted regression coefficients by the inverse of the Cox regression coefficient for the original scores in our data. For example, the refitted score for T1 vs Ta in the EORTC model for recurrence was calculated as: (multivariable coefficient for T1 vs Ta) × 1/(coefficient for the EORTC score for recurrence). SPSS (version 20.0, SPSS Inc, Chicago, Illinois, USA) and R (version 2.15.2 for Windows, http://www.r-project.org/) were used for data analysis.
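The scaling step described above can be written compactly. The notation is an assumption introduced here for illustration and does not appear in the original text: $\hat{\beta}_j$ denotes the refitted multivariable Cox coefficient for predictor $j$, and $\hat{\gamma}$ denotes the Cox coefficient of the original score when it is entered as a single covariate in our data.

```latex
\[
  w_j^{\text{refit}} \;=\; \hat{\beta}_j \times \frac{1}{\hat{\gamma}},
  \qquad \text{e.g.} \qquad
  w_{\text{T1 vs Ta}}^{\text{refit}}
  \;=\;
  \frac{\hat{\beta}_{\text{T1 vs Ta}}}{\hat{\gamma}_{\text{EORTC, recurrence}}}
\]
```

On this reading, $\hat{\gamma}$ is the log hazard ratio per point of the original score, so dividing by it expresses each refitted effect in the same point units as the original score.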
We included 1,892 patients: 280 from Denmark, 639 from the Netherlands, and 973 from Spain. During 10 years of follow-up, 209 (11%) patients died before a recurrence occurred, 839 (44%) patients had a recurrence, and 258 (14%) had a progression. Median follow-up for those without recurrence was 74 months. Ninety-eight patients (n = 90 from the Netherlands, n = 8 from Denmark) had no follow-up because they were lost to follow-up immediately after TUR. CIS (yes/no) and number of tumours were imputed in patients with missing data, based on 902 patients with information on CIS and 1,546 patients with information on the number of tumours, as well as complete information on tumour stage, grade, and size, and on progression- and recurrence-free survival (time and yes/no). The mean age was 66 years and the majority of patients were male (Table 1). We do not present totals over all cohorts because of the substantial differences in settings between cohorts. The Danish cohort had a larger proportion of patients with high stage and grade (P<0.01), and relatively more recurrences and progressions. The distribution of patients over the risk groups is shown in Table 2.
The EORTC score did not separate low-risk from high-risk patients well with respect to disease recurrence (Figures 1a–c, c-indices 0.55 to 0.61). Discrimination was somewhat better for progression (Figures 2a–c, c-indices 0.72 to 0.81). The CUETO score performed similarly (Figures 1d–f and 2d–f). Subgroup analyses in patients receiving BCG treatment (n = 449) showed poorer results (Figures S1a–f and S2a–f).
Figure 1. Recurrence by risk group (a–c: EORTC score; d–f: CUETO score). Full line: low risk patients, dotted line: intermediate risk patients, dashed line: high risk patients. Number of patients per country: Denmark n = 280; The Netherlands n = 639; Spain n = 973.
Figure 2. Progression by risk group (a–c: EORTC score; d–f: CUETO score). Full line: low risk patients, dotted line: intermediate risk patients, dashed line: high risk patients. Number of patients per country: Denmark n = 280; The Netherlands n = 639; Spain n = 973.
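The curves described in the legends above are Kaplan–Meier estimates (see Methods). As a reminder of what that estimator computes, the following is a minimal sketch of the product-limit calculation for a single risk group. It is an illustration only, not the code used to produce the figures; the function name and the toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate (illustrative sketch).

    times  : follow-up time per patient, in months
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event was observed.
    """
    ordered = sorted(zip(times, events))
    n_at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(ordered) and ordered[i][0] == t:
            n_events += ordered[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event and censored patients both leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy data for one risk group, not taken from the study.
toy_curve = kaplan_meier([6, 12, 12, 30, 48], [1, 1, 0, 1, 0])
for month, prob in toy_curve:
    print(month, round(prob, 3))
```

Event-free survival at ten years would be read from the last step of such a curve at or before 120 months, computed separately for each risk group and cohort.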
When we refitted the EORTC score for recurrence in Cox regression models, the prognostic effects of multiple tumours, tumour size, CIS and tumour grade were largely confirmed, but T1 tumours showed no increased risk over Ta tumours (results not shown). For progression, tumour size and CIS were less predictive than in the original EORTC score, while the effect of grade was stronger. For the CUETO score, gender was confirmed to be predictive of recurrence. While older age was not predictive of recurrence, we confirmed its value for predicting progression in the refitted CUETO score (p<0.01).
The EORTC risk tables have become a standard of care with their inclusion in European guidelines. The CUETO risk model was developed more recently, with a focus on patients treated with BCG. External validation of a prognostic model on a new dataset is crucial to assess its generalizability. In our study, the EORTC and CUETO risk scores showed only modest discriminative ability for the recurrence of NMIBC, with c-indices of at most 0.61. Prediction of progression was better, with c-indices ranging from 0.72 to 0.82. Our findings were consistent across the cohorts from Denmark, Spain and the Netherlands, and are in line with another external validation of the EORTC risk score and with a validation in primary bladder cancer cases.
Remarkably, although the CUETO score was specifically developed for patients treated with BCG, it discriminated better in the overall population than in the selected BCG population. BCG treatment, which has become a common treatment to manage intermediate- and high-risk NMIBC, was used in 449 patients, of whom over 50% were at low risk of recurrence and progression according to the CUETO risk scores. For the EORTC risk scores, we noted that BCG treatment was usually administered to higher-risk patients with a relatively narrow distribution of risk scores. This homogeneity in risk may partly explain the poor discriminative ability of the scores in those treated with BCG. More research in this specific group of patients is needed, also because the low number of BCG patients in the current study limited statistical power.
In the original study that presented the EORTC risk scores, prior recurrence rate was an important prognostic factor for both recurrence and progression. In the clinical setting, a surveillance plan must be established directly after TUR of the primary tumour. It is therefore of great importance that the EORTC risk score also has predictive value for these patients, who have not yet had a recurrence. We found that predicting recurrence was very difficult for primary tumours. The heterogeneity in recurrence risk becomes better known once one or more recurrences have been observed.
A possible explanation for the poor performance of the risk scores for the prediction of recurrence outside controlled trials is interobserver variability in bladder cancer staging and grading by pathologists. To partly overcome these issues, new methods for bladder cancer pathology were introduced in 1998 and 2004. The 1998 method has been shown to be an improvement over the 1973 method, which was used for our patients.
The poor predictability of recurrence may also relate to other factors, unrelated to the (observed) pathology of the disease. For example, detection of all primary tumours may be difficult at primary tumour presentation. Tumour tissue may be left behind, falsely leading to classification as a recurrent tumour. The quality of the TUR may be important but it could not be considered in our evaluation. Moreover, detection policies may vary between urologists with respect to surveillance intervals and treatment modalities (e.g. TUR vs ablation). Progression is a more robust end point, which may partly explain its better predictability with the EORTC and CUETO scores.
The retrospective analysis is a limitation of this study, and explains the presence of missing values in important variables such as CIS and number of tumours. We used multiple imputation, which has been shown to be a reliable method to handle missing data. We had no detailed information on treatments and surveillance policies, which may have changed over time. The treatment modalities may have led to a dilution of differences between the risk groups. On the other hand, a real-life situation was considered with respect to the standard care of urologists. We furthermore note that a selected group of high-risk patients was included from Denmark, which can be explained by the fact that these patients originated from a specialised university medical centre. However, patients from Spain were a representative sample of the standard primary NMIBC population in that country, and patients from the Netherlands, though originating from an academic centre, were similar to the general Dutch primary NMIBC patient population.
It is clear that the EORTC and CUETO scores need further improvement. Several markers have shown promising results, such as FGFR3 and Ki67, which improved c-indices for prediction of progression from 0.75 to 0.82 in one study. Various other promising molecular and germline markers are available, which need further rigorous evaluation of their usefulness to predict recurrence and progression. Future risk scores will again need external validation, considering discrimination and other aspects of predictive performance, such as calibration (correspondence between observed and predicted risks) and clinical usefulness (ability to make better decisions).
We conclude that the discriminatory ability of currently available risk scores is poor for recurrence and moderate for progression in primary NMIBC. Since successful discrimination of low and high risk patients is essential to the right intensity of bladder cancer surveillance, new risk markers are urgently needed to improve risk classification in NMIBC patients.
Figure S1. A–F. Kaplan-Meier estimates of recurrence of bladder cancer in a ten-year period from transurethral resection of a bladder tumour for patients with non-muscle invasive bladder cancer treated with BCG. Full line: low risk patients, dotted line: intermediate risk patients, dashed line: high risk patients. Number of patients per country: Denmark n = 52; The Netherlands n = 108; Spain n = 289.
Figure S2. A–F. Kaplan-Meier estimates of progression of bladder cancer in a ten-year period from transurethral resection of a bladder tumour for patients with non-muscle invasive bladder cancer treated with BCG. Full line: low risk patients, dotted line: intermediate risk patients, dashed line: high risk patients. Number of patients per country: Denmark n = 52; The Netherlands n = 108; Spain n = 289.
We would like to thank all who were involved in study design, funding, and data collection used for this analysis, specifically Debra Silverman, Nathaniel Rothman, Adonina Tardón, Alfredo Carrato, Josep Lloreta, and the rest of the Spanish Bladder Cancer/EPICURO Study investigators.
Analyzed the data: MV NM ES. Contributed reagents/materials/analysis tools: MM WB EZ LD. Wrote the paper: MV EdB-G. Critical revision of the manuscript for important intellectual content: ES MV MM EdB-G MC LD MK US PUM FA WB TØ EZ FR NM. Obtained funding: TØ.
- 1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, et al. (2010) GLOBOCAN 2008 v2.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 10. Lyon, France: International Agency for Research on Cancer.
- 2. Babjuk M, Burger M, Zigeuner R, Shariat SF, van Rhijn BW, et al. (2013) EAU guidelines on non-muscle-invasive urothelial carcinoma of the bladder: update 2013. Eur Urol 64: 639–653.
- 3. Kirkali Z, Chan T, Manoharan M, Algaba F, Busch C, et al. (2005) Bladder cancer: epidemiology, staging and grading, and diagnosis. Urology 66: 4–34.
- 4. Sylvester RJ, van der Meijden AP, Oosterlinck W, Witjes JA, Bouffioux C, et al. (2006) Predicting recurrence and progression in individual patients with stage Ta T1 bladder cancer using EORTC risk tables: a combined analysis of 2596 patients from seven EORTC trials. Eur Urol 49: 466–477; discussion 475–477.
- 5. Fernandez-Gomez J, Madero R, Solsona E, Unda M, Martinez-Pineiro L, et al. (2009) Predicting nonmuscle invasive bladder cancer recurrence and progression in patients treated with bacillus Calmette-Guerin: the CUETO scoring model. J Urol 182: 2195–2203.
- 6. Hernandez V, De La Pena E, Martin MD, Blazquez C, Diaz FJ, et al. (2011) External validation and applicability of the EORTC risk tables for non-muscle-invasive bladder cancer. World J Urol 29: 409–414.
- 7. Buethe DD, Sexton WJ (2011) Bladder cancer: validating the EORTC risk tables in BCG-treated patients. Nat Rev Urol 8: 480–481.
- 8. van Rhijn BW, Zuiverloon TC, Vis AN, Radvanyi F, van Leenders GJ, et al. (2010) Molecular grade (FGFR3/MIB-1) and EORTC risk scores are predictive in primary non-muscle-invasive bladder cancer. Eur Urol 58: 433–441.
- 9. Rosevear HM, Lightfoot AJ, Nepple KG, O’Donnell MA (2011) Usefulness of the Spanish Urological Club for Oncological Treatment scoring model to predict nonmuscle invasive bladder cancer recurrence in patients treated with intravesical bacillus Calmette-Guerin plus interferon-alpha. J Urol 185: 67–71.
- 10. Fernandez-Gomez J, Madero R, Solsona E, Unda M, Martinez-Pineiro L, et al. (2011) The EORTC tables overestimate the risk of recurrence and progression in patients with non-muscle-invasive bladder cancer treated with bacillus Calmette-Guerin: external validation of the EORTC risk tables. Eur Urol 60: 423–430.
- 11. Xylinas E, Kent M, Kluth L, Pycha A, Comploj E, et al. (2013) Accuracy of the EORTC risk tables and of the CUETO scoring model to predict outcomes in non-muscle-invasive urothelial carcinoma of the bladder. Br J Cancer 109: 1460–1466.
- 12. Porta N, Calle ML, Malats N, Gomez G (2012) A dynamic model for the risk of bladder cancer progression. Stat Med 31: 287–300.
- 13. Fristrup N, Ulhoi BP, Birkenkamp-Demtroder K, Mansilla F, Sanchez-Carbayo M, et al. (2012) Cathepsin E, maspin, Plk1, and survivin are promising prognostic protein markers for progression in non-muscle invasive bladder cancer. Am J Pathol 180: 1824–1834.
- 14. Rubin DB, Schenker N (1991) Multiple imputation in health-care databases: an overview and some applications. Stat Med 10: 585–598.
- 15. Harrell FE Jr (2001) Regression Modeling Strategies. New York: Springer-Verlag.
- 16. Justice AC, Covinsky KE, Berlin JA (1999) Assessing the generalizability of prognostic information. Ann Intern Med 130: 515–524.
- 17. Fahmy N, Lazo-Langner A, Iansavichene AE, Pautler SE (2013) Effect of anticoagulants and antiplatelet agents on the efficacy of intravesical BCG treatment of bladder cancer: A systematic review. Can Urol Assoc J 7: E740–749.
- 18. Vergouwe Y, Moons KG, Steyerberg EW (2010) External validity of risk models: Use of benchmark values to disentangle a case-mix effect from incorrect coefficients. Am J Epidemiol 172: 971–980.
- 19. Kompier LC, van der Aa MN, Lurkin I, Vermeij M, Kirkels WJ, et al. (2009) The development of multiple bladder tumour recurrences in relation to the FGFR3 mutation status of the primary tumour. J Pathol 218: 104–112.
- 20. Epstein JI, Amin MB, Reuter VR, Mostofi FK (1998) The World Health Organization/International Society of Urological Pathology consensus classification of urothelial (transitional cell) neoplasms of the urinary bladder. Bladder Consensus Conference Committee. Am J Surg Pathol 22: 1435–1448.
- 21. Montironi R, Lopez-Beltran A (2005) The 2004 WHO classification of bladder tumors: a summary and commentary. Int J Surg Pathol 13: 143–153.
- 22. Gonul II, Poyraz A, Unsal C, Acar C, Alkibay T (2007) Comparison of 1998 WHO/ISUP and 1973 WHO classifications for interobserver variability in grading of papillary urothelial neoplasms of the bladder. Pathological evaluation of 258 cases. Urol Int 78: 338–344.
- 23. Ambler G, Omar RZ, Royston P (2007) A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome. Stat Methods Med Res 16: 277–298.
- 24. Dutch Cancer Registration (2010) Integraal Kankercentrum Nederland. www.iknl.nl.
- 25. van Rhijn BW (2012) Combining molecular and pathologic data to prognosticate non-muscle-invasive bladder cancer. Urol Oncol 30: 518–523.
- 26. Shariat SF, Lotan Y, Vickers A, Karakiewicz PI, Schmitz-Drager BJ, et al. (2010) Statistical consideration for clinical biomarker research in bladder cancer. Urol Oncol 28: 389–400.
- 27. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, et al. (2010) Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 21: 128–138.
- 28. Vickers AJ, Elkin EB (2006) Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 26: 565–574.
- 29. Vickers A (2010) Prediction models in urology: are they any good, and how would we know anyway? Eur Urol 57: 571–573; discussion 574.