The role of gadolinium in magnetic resonance imaging for early prostate cancer diagnosis: A diagnostic accuracy study

Objective: Prostate lesions detected with multiparametric magnetic resonance imaging (mpMRI) are classified for their malignant potential according to the Prostate Imaging-Reporting And Data System (PI-RADS™2). In this study, we evaluate the diagnostic accuracy of mpMRI with and without gadolinium, with emphasis on the added diagnostic value of dynamic contrast enhancement (DCE).
Materials and methods: This retrospective study included 286 prostate lesions in 213 eligible patients; n = 116/170 lesions, 49/59% malignant, in the peripheral (Pz) and transitional zone (Tz), respectively. A stereotactic MRI-guided prostate biopsy served as the histological ground truth. All patients received an mpMRI with DCE. The influence of DCE on the prediction of malignancy was analyzed by blinded, separate assessment of the imaging protocol without DCE and of the DCE.
Results: Significant (CSPca) and insignificant (IPca) prostate cancers were evaluated separately to enhance the potential effects of the DCE in the detection of CSPca. The Receiver Operating Characteristics Area Under Curve (ROC-AUC), sensitivity (Se) and specificity (Spe) of PIRADS-without-DCE in the Pz were 0.70/0.47/0.86 for all cancers (IPca and CSPca merged) and 0.73/0.54/0.82 for CSPca. PIRADS-with-DCE for the same patients showed ROC-AUC/Se/Spe of 0.70/0.49/0.86 for all Pz cancers and 0.69/0.54/0.81 for CSPca in the Pz, respectively (p>0.05, chi-squared test). Results were similar for the Tz: AUC/Se/Spe for PIRADS-without-DCE were 0.75/0.61/0.79 for all cancers and 0.67/0.54/0.71 for CSPca, not influenced by DCE (0.66/0.47/0.81 for all Tz cancers and 0.61/0.39/0.75 for CSPca in the Tz). The added Se and Spe of DCE for the detection of CSPca were 88/34% and 78/33% in the Pz and Tz, respectively.
Conclusion: DCE showed no significant added diagnostic value and lower specificity for the prediction of CSPca compared to the non-enhanced sequences. Our results support that gadolinium might be omitted without compromising the diagnostic accuracy of mpMRI for prostate cancer.


Introduction
Magnetic resonance imaging (MRI) is a non-invasive and accurate diagnostic method for the early diagnosis of prostate cancer (Pca) [1]. Especially in the last 3 years, MRI has been poised to replace the Transrectal Ultrasound-guided (TRUS) biopsy and become the standard of care for the early diagnosis of Pca in patients with elevated Prostate-Specific Antigen (PSA), while still maintaining its role in follow-up, active surveillance and staging. The European Association of Urology, European Society for Radiotherapy and Oncology and International Society of Geriatric Oncology guidelines propose the role of MRI in early prostate cancer diagnosis in view of an MR-guided biopsy in clinical scenarios with persistent suspicion of malignancy after at least one negative TRUS biopsy [2]. This approach is the most widely accepted and is also supported by the European Society of Medical Oncology and the British National Institute for Health and Care Excellence [3,4]. To standardize the language of image interpretation, the European Society of Urogenital Radiology and the American College of Radiology released a structured reporting system, the Prostate Imaging Reporting And Data System (PI-RADS) [5], updated to PI-RADS™2 in 2015 and PI-RADS™2.1 in 2019 [1,6,7,8]. PI-RADS requires a diagnostic standard of anatomical (T2-weighted, T2w) and functional sequences (Diffusion-Weighted Imaging, DWI), including a series of Dynamic Contrast Enhancement (DCE). The combined protocol (T2w, DWI, and DCE) is summarized as a multiparametric MRI (mpMRI).
Although the utility and diagnostic value of contrast enhancement was enthusiastically endorsed in the first steps of structured prostate imaging [5,9,10], the increasing demand for prostate MRI examinations [11], debated issues such as gadolinium toxicity and tissue deposition [12,13], and the cost inflation caused by longer scanning times and the use of gadolinium have stimulated the community to re-assess the added value of the DCE. Independent research groups converge towards the opinion that DCE has no significant added value in diagnostic accuracy [14][15][16][17][18][19][20][21][22][23]. However, the field remains heavily debated, with datasets that support the value of DCE in distinguishing clinically significant Pca (CSPca) from insignificant Pca (IPca) [24][25][26][27][28], especially in the hands of inexperienced readers [28] or for smaller lesions [29]. Currently, DCE is a standard recommendation in the most recent update of the prostate imaging guidelines (PI-RADSv2.1) [7,8] and a common practice in many radiological units.
The current study aims to assess the DCE necessity in the mpMRI protocol for the first Pca diagnosis using a retrospective database. The independent and added value of DCE was evaluated for all cancers and CSPca separately in the peripheral (Pz) and the transitional (Tz) prostate zones. Overall, we provide evidence that the added value of the DCE is not statistically significant, and gadolinium could be omitted without hampering the diagnostic accuracy of mpMRI.

Ethical statement
Data were analyzed retrospectively and fully anonymized, following the ethical standards laid down in the 1964 Declaration of Helsinki and its amendments as well as the European Regulation 536/2014. The Institutional Review Board of the University Hospital of Jena approved the study and waived the requirement to obtain legally valid informed consent from the included subjects (6/2019) [30].

Study design and participant flow
The study was designed according to the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines [31] and included n = 286 lesions from N = 213 eligible patients aged 64 ± 7 years (mean/σ), screened with mpMRI in our department between 1/2012 and 11/2017 (S1 Fig). The MRI was conducted upon clinical suspicion of prostate cancer based on an elevated PSA assay and, in the vast majority, after an inconclusive transrectal ultrasound-guided biopsy. The diagnostic MRI was conducted at least four weeks after the ultrasound-guided biopsy to avoid artifacts. An MR-guided biopsy followed within three months post-diagnosis (mean/σ = 40/38 days) and served as the histological ground truth. After the MRI-guided prostate biopsy, no further ultrasound-guided biopsies followed. In N = 6 patients with a negative first MRI-guided biopsy and persisting clinical suspicion of Pca, the MRI-guided biopsy was repeated after an interval of 612 ± 231 days (mean ± standard deviation). From a total of 225 patients, we excluded 12 patients due to a lack of mpMRI before MRI-guided prostate biopsy (total eligible patients/lesions N/n = 213/286). One hundred seventy lesions (59%) derived from the transitional zone (74 malignant and 41 CSPca) and 116 lesions (41%) from the peripheral zone (79 malignant and 48 CSPca). The flow of participants in the study is thoroughly described according to the STARD guidelines in the supplement (S1 Table).
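The cohort figures above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the counts stated in this section; the variable names are illustrative and not part of the study's workflow.

```python
# Counts taken from the participant flow described in the text.
screened_patients = 225
excluded_patients = 12  # no mpMRI before the MRI-guided biopsy
eligible_patients = screened_patients - excluded_patients

tz_lesions = 170  # transitional zone
pz_lesions = 116  # peripheral zone
total_lesions = tz_lesions + pz_lesions

# Share of lesions per zone, rounded to whole percentages for reporting.
tz_share = round(100 * tz_lesions / total_lesions)
pz_share = round(100 * pz_lesions / total_lesions)

print(eligible_patients, total_lesions, tz_share, pz_share)
```

The totals agree with the reported N/n = 213/286 and with the 59%/41% split between the transitional and peripheral zones.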

Imaging protocol
Ten patients (15 lesions) were examined in a 1.5T MRI system and the rest in a 3.0T setup using a superficial multi-array coil (Philips Ingenia, Philips Medical Systems, Böblingen, Germany). The following protocol was applied as the standard of diagnosis with an average duration of 20 min without DCE and 35-40 min with DCE (S2 Table): a T2w turbo spin-echo (T2wTSE HR) in 2 mm resolution, a DWI at 5 different b-values (b = 0, 100, 500, 800 and 1000 s/mm²) and a T1-weighted (T1w) Fast Field Echo with DCE in 25 repetitions with 13.35 s temporal resolution and 7 s delay [29]. A weight-adjusted bolus of gadoteridol (ProHance®, Bracco Imaging S.p.A., Konstanz, Germany) of 0.1 mmol/kg was injected at a flow rate of 3 ml/min.

Image evaluation
Two radiologists, one with intermediate experience (IP, 5th year of training in radiology) and one board-certified radiologist (AM, with more than 15 years of board certification for prostate MRI), graded all lesions according to the qualitative criteria of PI-RADS™2 for T2w, DWI, and DCE [1,30]. The grading of non-enhanced sequences (T2w, DWI) on the basis of the 5-point Likert scale, which stratifies the level of suspicion of malignancy (Supplemental information spreadsheet), was performed without the influence of DCE.
The DCE was graded separately by the same radiologists, blinded to the T2w and DWI sequences [1], after an interval of at least one week. DCE scoring was binomial, based on the PIRADSv2 criteria, described as "positive" or "negative" depending on the speed and amplitude of the wash-in phase using the software DynaCAD v2 (Invivo, Gainesville, FL, USA). The DCE does not influence the final score of PI-RADS 4 or 5 lesions according to the PI-RADS™v2 criteria and is only relevant for triaging ambiguous lesions (PI-RADS 3) in the Tz. However, we extended our analysis to include PI-RADS 4 and 5 lesions to assess the putative role of DCE in detecting CSPca in candidates for MR-guided prostate biopsy.
The retrospective analysis was based on the joint opinion of both readers because the separate reports were not accessible.

MR-guided prostate biopsy
All patients were scheduled for an MRI-guided prostate biopsy 40 ± 38 days (mean/σ) after the initial assessment. A prophylactic antibiotic regimen with fluoroquinolones starting 24 hours before the biopsy and a coagulation screening (international normalized ratio, partial thromboplastin time, and platelet count) were applied as a standard of care before the biopsy. The biopsy was performed at the same field strength as the diagnostic imaging, using the compatible, minimally invasive biopsy device DynaTRIM and its dedicated software DynaCAD (Invivo, a Philips Healthcare Company, Best, The Netherlands) to obtain an average of 2 biopsies per lesion. The size of biopsied lesions varied between 5 and 57 mm (S1 Table).

Statistics and data analysis
Logistics and descriptive statistics were performed with the Microsoft Office suite 365 (Microsoft Ireland Operations Limited, Dublin, Ireland). The receiver operating characteristics (ROC), Analysis of Variance (ANOVA) with Dunn's post-hoc test and the Mann-Whitney rank-sum test were performed with the Statistical Package for the Social Sciences version 25 (IBM GmbH, Ehningen, Germany). The Shapiro-Wilk test was used for validating the normal distribution hypothesis. The threshold for statistical significance was set at 0.05 (α = 0.05). Outliers were included in the data analysis and not treated separately. Percentages are rounded up to the closest integer only for reporting purposes. Graphical processing of vectorized images and halftones was accomplished using Inkscape (GPL v2+, https://inkscape.org).
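For readers unfamiliar with the ROC-AUC, the following minimal sketch illustrates how the area under the curve can be obtained from ordinal scores such as PI-RADS categories. It uses the rank-based (Mann-Whitney) interpretation of the AUC, i.e. the probability that a randomly chosen malignant lesion receives a higher score than a randomly chosen benign one, with ties counted as one half. This is not the SPSS procedure used in the study; the function name `roc_auc` and the example scores are hypothetical and not taken from the study data.

```python
from itertools import product


def roc_auc(scores_malignant, scores_benign):
    """Rank-based AUC: P(malignant score > benign score), ties count 0.5."""
    wins = 0.0
    for m, b in product(scores_malignant, scores_benign):
        if m > b:
            wins += 1.0
        elif m == b:
            wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical PI-RADS scores (1-5); NOT data from this study.
malignant = [5, 4, 4, 3, 5]
benign = [2, 3, 4, 1]

auc = roc_auc(malignant, benign)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of malignant from benign lesions.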

Dynamic contrast enhancement has a low specificity for prostate cancer detection
T2w and DWI are the leading sequences for Pca diagnosis in the Tz and Pz, respectively [1,8]. However, DCE retains a role in the Tz for risk stratification of ambiguous lesions (PI-RADS 3) and, possibly, for the prediction of CSPca (Gleason equal to or more than 7) towards the IPca. Both statements were retrospectively evaluated in a database of n = 286 lesions (N = 213 patients) with a histological ground truth based on an MRI-guided prostate biopsy (S1 Fig).
In the PI-RADS™v2, DCE is binomially evaluated as "positive" or "negative," based on the fast-arterial wash-in phase, and can influence the T2w score only in case of ambiguous (PI-RADS 3) lesions in the Tz. We analyzed the predictive value of DCE in triaging ambiguous lesions as well as in predicting CSPca in PI-RADS scores 4 and 5 (abbreviated as "all PI-RADS scores"). All steps in data analysis were performed for (i) all cancers (IPca+CSPca), and (ii) CSPca while respecting the individualities of the Pz and Tz (Table 1). The number of correct predictions (TP and TN) amongst the malignant lesions was high, and the DCE sensitivity for Pz/Tz was 82/77% for all cancers and 88/78% for CSPca (Table 1). However, DCE was associated with a high number of false positives (i.e., benign lesions classified as cancers), which, especially in the Tz, outnumbered the TP predictions (Table 1).
Although CSPca is usually highly vascularized, with a vivid, ultrafast kinetic in the early wash-in phase of the DCE (Fig 1A.i, 1A.ii, 1B and 1C), high-grade cancers can be associated with slow perfusion dynamics as well (Fig 1A.iv, 1D). Similarly, prostatitis and benign prostate hyperplasia often simulate Pca due to hypervascularization (Fig 1F.i, 1F.ii, 1G and 1H). Hence, a slow wash-in DCE such as observed in a prostatitis example (Fig 1F.iv) does not exclude cancer, and might even harbor a high-grade Gleason 7b CSPca (Fig 1A.iv). An intermediate wash-in kinetic can be a feature of both prostatitis (Fig 1F.i and 1F.ii) and a high-grade, Gleason 9 CSPca (Fig 1A.iii). This significant variance in DCE behavior explains the low DCE specificity, which amounts to Pz/Tz = 33/34% for all cancers and Pz/Tz = 34/33% for CSPca (Table 1). The small n of ambiguous (PI-RADS 3) lesions does not allow for a safe statistical result; however, all ambiguous lesions (Pz n = 5 and Tz n = 14) were benign, and in the vast majority falsely overcalled by the DCE (Table 1). All in all, DCE has a moderate-to-high sensitivity, especially for the detection of significant cancers, but a very low specificity of 33-34%, which considerably restricts its diagnostic value as a Pca biomarker.
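The relationship between the confusion-matrix counts and the sensitivity and specificity reported in this section can be sketched as follows. The counts below are hypothetical and chosen only to illustrate a test with high sensitivity and low specificity; they are not the values of Table 1, and the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Proportion of malignant lesions rated DCE-positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of benign lesions rated DCE-negative: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only; NOT taken from Table 1.
tp, fn = 40, 10  # malignant lesions: DCE-positive / DCE-negative
tn, fp = 20, 40  # benign lesions: DCE-negative / DCE-positive

se = sensitivity(tp, fn)
spe = specificity(tn, fp)
print(f"Se = {se:.0%}, Spe = {spe:.0%}")
```

In this illustrative example the false positives (40) equal the true positives (40), which shows how a test can detect most cancers while still overcalling the majority of benign lesions.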

Dynamic contrast enhancement does not improve the accuracy of biparametric MRI for the first detection of prostate cancer
Due to the low DCE specificity, adjunct costs [32], and side effects associated with i.v. gadolinium enhancers, an enhancer-free, bi-parametric (T2w and DWI) MRI protocol was recently suggested as an alternative in prostate imaging. We analyzed the PI-RADS ROC curves with and without DCE in both the Pz and Tz for the detection of Pca and CSPca, separately.
PI-RADS™v2 is determined by the DWI score for the Pz and the T2w score for the Tz; however, our analysis reveals no statistically significant differences between both sequences (p>0.05, chi-squared test) for both the Pz and Tz, regardless of the malignancy level (Tables 2 and 3). Hence, non-enhanced MR sequences performed equivalently for the diagnosis of IPca and CSPca in our database.
Next, we evaluated the standalone diagnostic accuracy of the DCE, T2w, and DWI/ADC for the peripheral and transitional zone. The DCE ROC-AUC for the Pz for all cancers/CSPca was 0.63/0.69, significantly lower compared to the T2w for the detection of all cancers, p = 0.020, chi-squared test. In the separate CSPca analysis, the DCE standalone performance was equivalent to T2w and DWI, p>0.05, chi-squared test (Fig 2A, Table 2). In the Tz, the DCE-AUC was 0.61/0.59 for all cancers/CSPca, also considerably lower compared to T2w and [...] between high- and low-grade cancers (p>0.05). Previous literature suggested that the DCE could bridge the diagnostic gap of ADC and facilitate the differentiation between low- and high-grade cancers [24][25][26]. By selecting the clinically significant cancers (Fig 2C, 2D, 2G and 2H) we could observe that the DCE ROC-AUC was not superior to the DWI/ADC either in the Pz (Fig 2C, 2D, and Table 2) or Tz (Fig 2G, 2H and Table 3), p>0.05, chi-squared test. We conclude that DCE had no significant added diagnostic value to T2w or DWI/ADC for the detection of CSPca.
To directly answer the question of whether DCE has an added value to mpMRI, we tested head-to-head the diagnostic accuracy of PI-RADSv2 with and without DCE (Fig 3) for all cancers and CSPca in the Pz and Tz, respectively. The AUC of PI-RADS with DCE was 0.70/0.69/0.66/0.61 for the Pz (all cancers) / Pz (CSPca) / Tz (all cancers) / Tz (CSPca), respectively. Omitting the DCE did not statistically influence the diagnostic accuracy of PI-RADS (AUC = 0.70/0.73/0.75/0.67), p = 0.96/0.09/0.08/0.14, chi-squared test, respectively (Fig 3A, 3B, 3E and 3F). We further questioned whether the DCE could be beneficial for the stratification of small tumors. Hence, each Pz and Tz database (Fig 3I and 3J as a histogram) was split into two subgroups, setting 11 mm as a threshold for the smaller tumors. The ROC analysis (Fig 3C, 3D, 3G and 3H) showed that DCE did not improve the PIRADS performance, and, in the case of small Tz tumors, even significantly worsened the bi-parametric prognostic value, p = 0.04, chi-squared test (Fig 3G). For the small and large lesions in the Pz (Fig 3C and 3D) and for the large Tz lesions (Fig 3H), the effect of DCE was equivocal (p = 0.54/0.20/0.71). Thus, DCE did not increase the diagnostic accuracy of PI-RADS for IPca or CSPca in our database.

The character of lesions overcalled by the dynamic contrast enhancement
The tissue perfusion, as assessed by the DCE, reflects the degree of neovascularization and vessel permeability [33]. Since neoangiogenesis cascades can be induced by cancer, benign prostate hyperplasia, and chronic inflammation through different mechanisms [34][35][36], we questioned whether DCE tends to overcall particular types of benign lesions. The retrospective analysis showed that approximately 65-70% of all benign lesions and ca. 80% of the Atypical Small Acinar Proliferation (ASAP) and Gleason 3+3 lesions were overcalled by the DCE, showing no preference for the Pz or Tz (S3 Table). All in all, our study failed to associate a specific benign prostate pathology with the DCE overcalling.

Discussion
This study concludes that the DCE had no added value in the diagnostic accuracy of PI-RADS; hence, selected candidates could be screened for prostate cancer with a faster, gadolinium-free protocol. This trend [14,15] [...] [24] with N = 313 and De Visschere [17] encompassing N = 257 patients. Various methodological differences can be spotted between our study and the reports mentioned above, such as the use of superficial vs. endorectal coil, MRI-guided biopsy vs. MRI/TRUS biopsy [37], whole-mount preparation as the gold standard [17] and variation of the DWI b-values [21,24]. Technical differences between studies impede, on the one hand, the head-to-head comparison; on the other hand, they reveal the reproducibility of the main result under different conditions. Various other groups with smaller databases also converge towards the opinion that the DCE is not necessary for prostate mpMRI [16,[18][19][20]]. The meta-analyses of Woo et al. [20] and Alabousi et al. [23] concluded that "the performance of bpMRI was similar to that of mpMRI in the (first) diagnosis of prostate cancer." Despite the accumulating evidence, DCE is a subject of debate and a current guideline recommendation in prostate imaging [2,38]. The hypothesis that DCE could facilitate the differentiation between low- and high-grade cancers [24][25][26] was not confirmed by our study, as we did not find any significant advantage of the DCE ROC-AUC over the T2w and DWI for either IPca or CSPca. Numerous recent reports support the role of the DCE as a problem solver for inexperienced readers. Gatti et al. (N = 68) suggest that PIRADS without DCE is a valid alternative for expert readers, whereas less experienced ones need DCE to improve the sensitivity [28]. This hypothesis could not be tested in our study because we had access only to the final joint report of the experienced and inexperienced reader. Alternative DCE-validation approaches might improve diagnostic accuracy. Sun et al. [39] mention that DCE performed better than T2w and DWI in volumetric Pca studies, and Parra et al. analyze the DCE image entropy to classify prostate cancer based on the behavior of "DCE-microdomains" [25]. Altogether, the DCE remains a highly debated field in prostate mpMRI and a persisting challenge for the PI-RADS steering committee.
One of the main findings of our study is that DCE has a moderate-to-good sensitivity but a very low specificity for prostate cancer, especially for the CSPca. The low specificity of DCE was already commented on by earlier studies, such as Kozlowski et al. [40], who conclude that DCE reduces the specificity of T2w for a small gain in sensitivity. DCE is an indirect index of vascular permeability, which is a feature of neovessels occurring in Pca but also in benign prostate hyperplasia (BPH), atypical hyperplasia, and chronic inflammation [35]. Neoangiogenesis is a putative link between inflammation and cancer [34], albeit activated through different mechanisms in each situation [36]. Hence, overlapping neovessel formation in benign and malignant conditions could explain the low specificity of DCE. Besides, neovascularization might not necessarily be a feature of Pca [41][42][43]. Vessel markers such as the Vascular Endothelial Growth Factor Receptor 2 are not consistently elevated in Pca patients [44,45]. A genome meta-analysis has failed to correlate Pca with any VEGF polymorphism, whereas such an association was proven for bladder cancer [46]. Hence, neovascularization might not necessarily be a feature of the CSPca, which could partially explain the low DCE specificity [33].
PIRADSv1 was a powerful albeit complex scoring system, leading to an interrater agreement rate of 0.39-0.64 [47][48][49]. Simplification of the scoring system in PIRADSv2 significantly improved the interrater agreement: from 0.64 to 0.70 in the study of Becker et al. [47], from 0.39 to 0.56 in the study of Tewes et al. [48] and from 1.33 (Bland-Altman statistics for sumscore) to 0.41 in Krishna et al. [49]. Interobserver agreement studies for mp- vs. bi-parametric MRI have not been performed yet; a further simplification is nevertheless quite likely to improve the interrater agreement. Krishna et al. [49] show that the kappa rate of DWI for the peripheral zone (equaling PIRADSv2 without DCE) was 0.51, improved compared to the PIRADSv2 kappa = 0.41. This highlights the necessity for future dedicated study designs towards an unbiased, rater-experience-weighted [28] evaluation of the interrater performance between mp- and bi-parametric MRI.
Among the limitations of this study, as well as of the majority of similar cited studies, is the retrospective character, which can provide only a low level of evidence, even if performed as a multicenter study [18] or meta-analysis [20,23]. Our study includes a low number of PI-RADS 3 lesions, especially in the Tz. Nevertheless, equally low proportions of PI-RADS 3 lesions were observed in other studies (De Visschere et al. [17], 8%). Moreover, databases with a higher percentage of ambiguous lesions (Cristel et al. [24], 17%; Junker et al. [21], 20%) also report a low DCE specificity.
Even though numerous studies have meanwhile converged on the conclusion that gadolinium could be omitted without hampering the diagnostic accuracy of MRI, the use of a gadolinium enhancer remains a matter of debate [27] and a recommendation in the current prostate imaging guidelines [7,8,50]. With our contribution, we aim to strengthen the accumulating evidence towards the optimization of the upcoming guidelines for prostate diagnostics.
Supporting information S1