Association of surgeon and hospital volume with short-term outcomes after robot-assisted radical prostatectomy: Nationwide, population-based study

Background and objective Few studies have investigated the association between surgical volume and outcome of robot-assisted radical prostatectomy (RARP) in an unselected cohort. We sought to investigate the association between surgical volume and peri-operative and short-term outcomes in a nation-wide, population-based study group.
Methods A total of 9,810 RARPs registered in the National Prostate Cancer Register of Sweden (2015–2018) were included. Associations between outcome and volume were analyzed with multivariable logistic regression including age, PSA-density, number of positive biopsy cores, cT stage, Gleason score, and extent of lymph node dissection.
Results Compared to the lowest volume group, surgeons and hospitals in the highest volume group had shorter operative time (surgeon OR 9.20, 95% CI 7.11–11.91; hospital OR 2.16, 95% CI 1.53–3.06), less blood loss (surgeon OR 2.58, 95% CI 2.07–3.21; hospital no difference), more often nerve sparing intention (surgeon OR 2.89, 95% CI 2.34–3.57; hospital OR 2.02, 95% CI 1.66–2.44), and more often negative margins (surgeon OR 1.90, 95% CI 1.54–2.35; hospital OR 1.28, 95% CI 1.07–1.53). There was a wide range in outcome between hospitals and surgeons with similar volume that remained after adjustment.
Conclusions High surgeon and hospital volume were associated with better outcomes. The range in outcome was wide in all volume groups, which indicates that factors besides volume are of importance. Registration of surgical performance is essential for quality control and improvement.

Introduction
Already 40 years ago, Luft and colleagues described an association between high hospital surgical volume and lower mortality and suggested a centralization of certain surgical procedures [1]. Since then, there have been many publications in support of an association between high surgical volume and better outcome [2,3].
Radical prostatectomy (RP) is no exception, and numerous publications have reported an association between high volume and good outcomes including risk of complications, readmission, positive surgical margin, incontinence, risk of disease recurrence, mortality and costs [4,5]. These reports have stimulated initiatives aiming at centralization of RP. For example, the NICE guidelines state that at least 150 robot-assisted radical prostatectomies (RARPs) per year should be performed in a hospital in order to ensure cost-effectiveness [6], and in Germany, a minimum of 50 RPs are required in order for a clinic to be certified as a "prostate cancer center" [7]. Parallel to this process, there has been a movement in the opposite direction, with wide dissemination of robotic technology, so that even some small hospitals in Sweden have invested in surgical robots. A decentralization of RPs has also been reported from, for example, Germany, where RPs are performed at an increasing number of hospitals [7].
Compared to the large number of reports on outcome after open retropubic RP, there are few studies on the association between volume and outcome for RARP. To date, these reports have been either large population-based studies with limited possibilities to adjust for case mix or reports with comprehensive data from specialized tertiary referral centers [4,8].
The aim of this study was to investigate the association of surgeon and hospital volume with perioperative and short-term outcomes in a nationwide, population-based cohort of men.

Materials and methods
The National Prostate Cancer Register (NPCR) of Sweden captures 98% of all newly diagnosed prostate cancer cases in Sweden [9]. NPCR collects information on cancer characteristics, work-up, and primary treatment and, since 2015, also comprehensive information on RP and radical radiotherapy. A detailed description of the RP form (PiS: Prostatectomy in Sweden form) has been published [10]. In summary, the PiS form contains pre-, peri-, and postoperative variables and exists in a short version with 60 variables and a more comprehensive version with 83 variables. Each reporting department decides which form to use. The variable list for the two PiS forms is available in Swedish at: https://www.cancercentrum.se/samverkan/cancerdiagnoser/prostata/kvalitetsregister/dokument/. NPCR registers the identity of each individual surgeon by use of a code, and the code key is kept at each department. A majority of RP surgeons in Sweden perform surgeries at one hospital, but some surgeons perform RPs at several hospitals and are thus registered with multiple codes.

We can provide access to the dataset on a remote server on demand, where analyses can be performed and aggregated data in Figures and Tables can be exported, but no data on individual participants can be exported. External researchers should contact the corresponding author, who will forward the request for data to the PCBaSe reference group, which will then decide whether the access described above can be granted. Researchers can also apply for collaborations based on data in PCBaSe including PiS with a standardized form. After approval, a study file will be uploaded to a remote access server for statistical analysis. Users will be charged for software licenses, administration, and data management. For more information, contact: npcr@npcr.se.

PLOS ONE
Data on re-admission and comorbidity were obtained by linkage to the Patient Registry in Prostate Cancer data Base Sweden (PCBaSe RAPID 2018) [11]. Charlson comorbidity index (CCI) was calculated as previously described [12,13]. NPCR/PCBaSe RAPID has been approved by the Research Ethics Board Uppsala (Dnr 2017-263). The board waived the need for consent. In Sweden, cancer registers use an opt-out design for consent, meaning that individuals are informed but no written consent is collected. If an individual does not want to participate, he will not be registered in NPCR/PCBaSe. The present study included all RARPs registered in NPCR performed between 1 January 2015 and 31 December 2018. RARPs performed after an initial period of active surveillance were not included (Fig 1). Hospitals and surgeons who had registered ≤5 RARP/year were excluded.

Statistical analysis
Cut-off values for surgeon and hospital volume were based on the Swedish national guidelines for prostate cancer, which recommend that an RP surgeon should perform at least 25 RPs per year and that there should be at least two surgeons at each department who perform this number of RPs [14]. Multiples of these numbers were used to define cut-off values for each volume group: very low (surgeon <13/hospital <50), low (surgeon 13-25/hospital 50-100), intermediate (surgeon 25-50/hospital 100-150), high (surgeon 50-75/hospital 150-200) and very high (surgeon ≥75/hospital ≥200). The mean number of RARPs per year was calculated by dividing the accumulated number of RARPs per surgeon and hospital by the study period of 4 years. The grouping of hospital volume was performed before the exclusion of RARPs with missing data on surgeon identity.
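As an illustration of the grouping described above, the following minimal Python sketch assigns a surgeon or hospital to a volume group from its accumulated number of procedures. The cut-off values are taken from the text; the function names are hypothetical, and the handling of values that fall exactly on a boundary (lower bound inclusive, upper bound exclusive) is an assumption, since the ranges as printed overlap at 25, 50, and so on.

```python
from bisect import bisect_right

# Cut-offs (mean RARPs per year) as listed in the text.
SURGEON_CUTS = [13, 25, 50, 75]
HOSPITAL_CUTS = [50, 100, 150, 200]
LABELS = ["very low", "low", "intermediate", "high", "very high"]


def mean_annual_volume(total_procedures, study_years=4):
    """Accumulated number of procedures divided by the study period."""
    return total_procedures / study_years


def volume_group(mean_per_year, cuts):
    """Map a mean annual volume to a volume group label.

    Assumption: a value equal to a cut-off belongs to the higher group,
    e.g. exactly 25/year counts as 'intermediate' for a surgeon.
    """
    # bisect_right counts how many cut-offs are <= the value.
    return LABELS[bisect_right(cuts, mean_per_year)]


if __name__ == "__main__":
    # Hypothetical surgeon with 120 registered RARPs over 4 years.
    per_year = mean_annual_volume(120)
    print(per_year, volume_group(per_year, SURGEON_CUTS))
```

If the opposite boundary convention is preferred, `bisect_left` can be substituted for `bisect_right`.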
The Multiple Imputation by Chained Equations (MICE) method [15] was used to impute missing data on PSA, prostate volume, number of positive biopsy cores, cT stage, Gleason score and extent of lymph node dissection. The following additional variables were included to improve predictions: region, patient age at surgery, year of surgery, total mm of cancer in biopsy cores, number of biopsies with cancer, vital status, and time from surgery to death or 31 December 2018, whichever came first. Data were imputed 20 times, and results from the models fitted to each dataset were pooled using Rubin's rules [16].
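The pooling step can be sketched as follows. This is a minimal Python illustration of Rubin's rules for a single coefficient (for example, a log odds ratio) estimated in each imputed dataset, not the code used in the study. The function names and the input numbers are hypothetical, and the confidence interval uses a normal quantile as a simplification, whereas Rubin's rules formally use a t reference distribution.

```python
from math import exp, sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets.

    estimates: coefficient estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)          # average of the m estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


def pooled_odds_ratio(log_ors, variances, z=1.96):
    """Pooled OR with an approximate 95% CI (normal approximation)."""
    pooled, total = pool_rubin(log_ors, variances)
    se = sqrt(total)
    return exp(pooled), exp(pooled - z * se), exp(pooled + z * se)


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from three imputations.
    print(pooled_odds_ratio([0.5, 0.7, 0.6], [0.04, 0.04, 0.04]))
```

The total variance is never smaller than the average within-imputation variance, which is how uncertainty due to the missing data enters the pooled confidence interval.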
Outcome variables were dichotomized; thresholds for short operative time and low blood loss were based on the median. Univariable logistic regression was used to calculate odds ratios (OR) for the outcomes. In the multivariable logistic regression, we included age at RARP, PSA, prostate volume, PSA density, number of positive biopsy cores, cT stage, Gleason score, and extent of lymph node dissection. Surgeon and hospital volume were analyzed in two separate models. The multivariable logistic regression models were subsequently used to construct covariate-adjusted funnel plots using the method proposed by Spiegelhalter [17].
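To make the dichotomization and the odds ratio concrete, the sketch below shows a median split and the unadjusted OR with a Wald 95% confidence interval for one volume group against a reference group. For a single binary predictor, this cross-product OR equals the one obtained from univariable logistic regression. The function names and counts are hypothetical, and treating values equal to the median as "short" or "low" is an assumption; the covariate-adjusted models in the study are not reproduced here.

```python
from math import exp, log, sqrt
from statistics import median


def dichotomize_at_median(values):
    """True where a value is at or below the median (e.g. 'short' time)."""
    cut = median(values)
    return [v <= cut for v in values]


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald CI from a 2x2 table.

    a, b: counts with / without the favourable outcome in the group of interest
    c, d: counts with / without the favourable outcome in the reference group
    """
    or_ = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_) - z * se_log_or)
    upper = exp(log(or_) + z * se_log_or)
    return or_, lower, upper


if __name__ == "__main__":
    # Hypothetical operative times in minutes.
    print(dichotomize_at_median([100, 150, 200, 250]))
    # Hypothetical counts: 60/100 short procedures vs 40/100 in the reference.
    print(odds_ratio_wald(60, 40, 40, 60))
```

Because the interval is symmetric on the log scale, the product of the two limits equals the square of the OR, which offers a quick consistency check on reported estimates.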
All statistical analyses were performed in R Statistical Software (Version 3.6.1; R Foundation for Statistical Computing, Vienna, Austria).

Results
A total of 10,612 men in NPCR underwent RARP between 1 January 2015 and 31 December 2018; 802 men were excluded, leaving a final study population of 9,810 men (Fig 1). Of the 10,612 men, 9,868 were also reported in the National Patient Registry, while 740 men were registered in NPCR only, and 430 men were registered in the Patient Register only. Overall, 74% (7252/9810) of all procedures were performed by surgeons who performed >25 RARP/year, and 58% (5734/9810) of all RARPs were performed at high or very high volume hospitals.
Table 1 shows clinical characteristics of the study population according to surgeon volume (for hospital volume, see S1 Table). There was no clear association between case mix and surgeon or hospital volume. There was a wide range in the proportion of lymph node dissection, with the highest proportion in RARPs performed by high volume surgeons and at very high volume hospitals. Very high volume surgeons and hospitals had the highest completeness of registration of data.
Table 2 shows ORs for the factors that were included as adjustments in the multivariable analysis of surgeon volume (for hospital volume, see S2 Table). Prostate volume and lymph node dissection were associated with long operative time and high blood loss. There were fewer negative surgical margins with increasing PSA, PSA density, number of positive cores, higher T stage, and extended lymph node dissection. Nerve sparing intention was less common with increasing patient age, comorbidity, PSA, prostate volume, number of positive cores, cT stage, and Gleason score. Extended lymph node dissection was negatively associated with all outcomes except nerve sparing intention. Results were virtually identical when we analyzed these factors in a complete case analysis.
Fig 2 shows outcomes from multivariable analyses adjusted for clinical characteristics including surgeon and hospital volume. Compared to the lowest volume group, surgeons and hospitals in the highest volume group had shorter operative time (surgeon OR 9.20, 95% CI 7.11-11.91; hospital OR 2.16, 95% CI 1.53-3.06). Blood loss was smaller in RARPs performed by surgeons in the highest volume group compared to the lowest volume group (OR 2.58, 95% CI 2.07-3.21). There was a U-shaped, negative association between hospital volume and blood loss. Surgeons and hospitals in the highest volume groups had higher odds of nerve sparing intent compared to very low volume surgeons (OR 2.89, 95% CI 2.34-3.57) and hospitals (OR 2.02, 95% CI 1.66-2.44). Negative surgical margins were more common in procedures performed by the highest volume surgeons compared to the lowest volume surgeons (OR 1.90, 95% CI 1.54-2.35). There was no clear association between hospital volume and negative surgical margins. There was no statistically significant association between volume and readmission, with the exception of low volume hospitals, which had higher odds (OR 1.41, 95% CI 1.02-1.95) of no readmission compared to very low volume hospitals.
Fig 3 shows adjusted funnel plots for the five outcomes. For individual surgeons, there was a wide range in the proportion of procedures with a short operative time and low blood loss across the range of volume. All very high volume surgeons were at or above the median value for short operative time and low blood loss, but the range was wide also in this group and there were a few 'positive outliers'. For hospital volume, the widest range in operative time and blood loss was observed among very low volume hospitals. There was a wide range in the proportion of nerve sparing procedures for all surgeon volumes. The proportion of nerve sparing procedures was highest among very high volume hospitals.
All very high volume surgeons were above the median value for high proportion of negative margins. For hospital volume, the widest range was observed in very high volume hospitals. Compared to the other outcomes, there was a smaller range in the proportion of men with no readmission across both surgeon and hospital volume.
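The logic of the funnel plots described above can be sketched in a few lines of Python. A funnel plot compares each unit's observed proportion with control limits around a target proportion, and these limits narrow as the number of procedures grows. The sketch uses the normal approximation to the binomial and is unadjusted, whereas the plots in this study were covariate-adjusted. The function names, the target of 0.5, and the counts are hypothetical.

```python
from math import sqrt


def funnel_limits(target, n, z=1.96):
    """Approximate 95% control limits for a proportion based on n procedures."""
    half_width = z * sqrt(target * (1 - target) / n)
    return max(0.0, target - half_width), min(1.0, target + half_width)


def flag_unit(successes, n, target, z=1.96):
    """Classify a surgeon or hospital relative to the control limits."""
    lower, upper = funnel_limits(target, n, z)
    proportion = successes / n
    if proportion > upper:
        return "above"
    if proportion < lower:
        return "below"
    return "within"


if __name__ == "__main__":
    # The same observed proportion (60%) is judged differently
    # depending on volume, because the limits depend on n.
    print(flag_unit(6, 10, target=0.5))
    print(flag_unit(600, 1000, target=0.5))
```

This illustrates why wide variation among low volume units is partly expected by chance, while the same deviation in a high volume unit falls outside the limits.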

Discussion
In this nation-wide, population-based register study on robot-assisted radical prostatectomy (RARP), the highest volume surgeons, compared to the lowest volume surgeons, had shorter operative time, less blood loss, more often nerve sparing intent, and more negative surgical margins, also after adjusting for case mix and potential confounders. For hospital volume of RARP, associations between volume and perioperative outcomes were weaker. Some low volume surgeons and low volume hospitals had better results than some surgeons and hospitals with higher volume; however, the low number of observations increased the risk of chance findings in the former groups. There were also high volume surgeons and hospitals who performed below average.

Strengths and limitations
The strengths of this study include the nation-wide, population-based study group with a virtually complete capture of all men in Sweden who underwent RARP at all types of hospitals. In contrast to many studies based on administrative registries, we had access to detailed patient and cancer characteristics, enabling us to adjust for potential confounders in our analyses. Another strength is the inclusion of both surgeon and hospital volume. Exclusion of surgeons and hospitals with <5 RARP/year could have introduced a bias. We argue the opposite: including these observations would have led to erroneous estimates, since it is likely that many of these registrations were of poor quality. With this threshold, we avoided erroneous registration of RARP at hospitals where there is no surgical robot and registration of RARP performed by a visiting high volume surgeon at a small hospital. The proportion of missing data was small for all variables in the model except for number of positive biopsy cores. We applied multiple imputation to handle missing data; however, results were virtually identical when we analyzed these factors in a complete case analysis.
Limitations of our study include the lack of data on functional and long-term oncological outcomes and on surgical experience for individual surgeons prior to the study period. Given this lack of data, we had to assume that all surgeons practiced during the entire study period, which would "falsely" lower the average number of RARPs/year for some surgeons.
A systematic review on the association between volume and outcome after RP reported 49 original publications, of which 11 studies examined both surgeon and hospital volume [4]. High surgeon and hospital RP volume were associated with better outcome; however, the association varied between outcomes. For example, cost and mortality were more dependent on institutional factors, whereas incontinence and length of hospital stay were more related to the individual surgeon [4].
We found a clear dose-response association between surgeon volume and short operative time and low blood loss, which could not be observed for hospital volume, indicating that these outcomes are more dependent on the individual surgeon's technique than on hospital factors. Other studies have reported similar associations [18,19]. Surgeons and hospitals in the highest volume groups had higher odds of a nerve sparing intent compared to very low volume surgeons and hospitals. However, nerve sparing intent was based on the surgeon's self-assessment and is not an outcome; data on functional outcomes are needed to investigate the validity of this variable.
We found that the highest volume surgeons had almost two-fold higher odds of negative surgical margins vs. the lowest volume surgeons. Surgical margin status in organ confined prostate cancer is an important measure of surgical quality, and positive surgical margins are associated with increased risk of biochemical recurrence and secondary treatment [20]. Most previous studies have reported a higher proportion of negative margins with increasing RP volume [21–23]. For example, in a recent study based on the American National Cancer Database, there was a higher proportion of negative margins in high volume hospitals [22]. Besides surgical technique, margin status is also affected by T stage, Gleason score, quality of the pathology assessment, etc., which could explain the lack of association with hospital volume and the wide range between hospitals in the same volume group.
We found no association between readmission and volume, which is in line with a previous study from PCBaSe [24] but in contrast to a register-based US study [22].
The type of volume-outcomes association differed between outcomes. For example, for operative time and blood loss, we found a dose-response association with improved outcome with increased volume but for negative surgical margins, there seemed to be a threshold effect where only the very-high volume group had significantly better results. One potential explanation could be that the learning curve to reach improved oncological outcome with RARP is longer than the learning curve for operative time and blood loss.
We speculate that surgeon volume is the driver in the volume-outcome association. Compared to many previous studies, the association between hospital volume and perioperative outcomes was weak in our study [4], likely because there were very high volume hospitals with many low to intermediate volume surgeons and lower volume hospitals with high or very high volume surgeons.
We observed a wide range in outcomes also for surgeons and hospitals within the same volume group. However, for most of the outcomes, variation within a volume group was wider with decreasing volumes. For example, the largest variation in operative time was seen for surgeons and hospitals in the lowest volume groups.
The proportion of men who received a nerve sparing procedure ranged from less than 10% to close to 100% among very low volume surgeons and from less than 30% to more than 90% among high volume hospitals. Similarly, the proportion of negative surgical margins among very high volume surgeons varied from <70% to >80%. For readmission, the variation was smaller in all volume groups.
In these comparisons within a volume group, there should be limited influence of case mix, suggesting that there are other factors that affect outcome. Random effects can probably explain some variation, especially in the lower volume groups. There are numerous reports describing heterogeneity between surgeons, including those with high volumes [25–29]. In a recent population-based Swedish study, significant variation in functional and oncological outcomes was seen, also for experienced surgeons. Even though surgeons' experience and annual volume could explain some of the observed heterogeneity in outcomes, most of the heterogeneity remained after adjusting for these factors. Individual surgeon skill and talent probably explain some variation. Some surgeons are more skilled than others, but to what extent these skills can be improved by training is unknown and probably differs between individuals [29].
Our results do not answer whether prostate cancer surgery can or should be centralized but rather emphasize the need to register and report surgical performance in order to ensure quality control and quality improvement. When outcomes are registered, best practice can be identified and disseminated. Continuous registration and feedback from registers such as the NPCR can help raise the minimum quality standards for surgeons and hospitals regardless of volume. We argue that quality registration should be compulsory, since quality assurance of RP has been shown to improve outcome. For example, in a quality assurance program at the Pelvic cancer surgical center in London, UK, the complication rate fell from 13% to 7% and urinary continence 3 months after surgery increased from 57% to 67% [30].
In 2018, NPCR introduced electronic forms for patient reported outcome measures (ePROM) for men undergoing RP and radical radiotherapy. Patients are asked to fill in electronic questionnaires before treatment and at 3 and 12 months after treatment. The results are immediately available online at the secured Information Network for Cancer Care (INCA) platform, where comparisons between regions, hospitals, and individual surgeons within the same hospital can be performed. Capture was limited in 2018, so there are not yet enough data to investigate the association between volume and PROM.

Conclusions
In this nation-wide, population-based register study, we found strong associations between high surgeon volume and good perioperative and short-term outcomes, whereas for hospitals, the association between volume and outcome was weaker. There was a wide range in outcome between hospitals and surgeons with similar surgical volume, indicating that factors other than volume are important. These findings highlight the need to register and report surgical performance to facilitate quality control and improvement. Future studies in NPCR will include functional results based on ePROM.

Supporting information
S1 Table. Baseline characteristics of men in Prostate Cancer data Base Sweden (PCBaSe) according to hospital volume; mean number of robot assisted radical prostatectomies (RARP)/year performed in a hospital. Limits for volume groups are shown as mean number of RARP/year performed in a hospital. CCI is calculated at diagnosis. (DOCX)
S2 Table. Odds ratios (OR) and 95% confidence intervals (CI) of the covariates for respective outcome according to surgical volume in a hospital. (DOCX)
S1 File. List of the hospitals included. (DOCX)