Evaluation of different mathematical models and different b-value ranges of diffusion-weighted imaging in peripheral zone prostate cancer detection using b-value up to 4500 s/mm2

Objectives To evaluate the diagnostic performance of different mathematical models and different b-value ranges of diffusion-weighted imaging (DWI) in peripheral zone prostate cancer (PZ PCa) detection. Methods Fifty-six patients with histologically proven PZ PCa who underwent DWI-magnetic resonance imaging (MRI) using 21 b-values (0–4500 s/mm2) were included. The mean signal intensities of the regions of interest (ROIs) placed in benign PZs and cancerous tissues on DWI images were fitted using mono-exponential, bi-exponential, stretched-exponential, and kurtosis models. The b-values were divided into four ranges: 0–1000, 0–2000, 0–3200, and 0–4500 s/mm2, grouped as A, B, C, and D, respectively. ADC, <D>, D*, f, DDC, α, Dapp, and Kapp were estimated for each group. The adjusted coefficient of determination (R2) was calculated to measure goodness-of-fit. Receiver operating characteristic curve analysis was performed to evaluate the diagnostic performance of the parameters. Results All parameters except D* showed significant differences between cancerous tissues and benign PZs in each group. The areas under the curve (AUCs) of ADC were comparable in groups C and D (p = 0.980) and were significantly higher than those in groups A and B (p < 0.05 for all). The AUCs of ADC and Kapp in groups B and C were similar (p = 0.07 and p = 0.954, respectively), and were significantly higher than those of the other parameters (p < 0.001 for all). The AUC of ADC in group D was slightly higher than that of Kapp (p = 0.002), and both were significantly higher than those of the other parameters (p < 0.001 for all). Conclusions ADC derived from the conventional mono-exponential model with a high maximum b-value (3200 s/mm2) is an optimal parameter for PZ PCa detection.


Introduction
Magnetic resonance imaging (MRI) is considered the best imaging technique for the detection, staging and surveillance of prostate cancer (PCa) because of its ability to provide functional and anatomic information [1][2][3]. Diffusion-weighted imaging (DWI) is a powerful and dominant component of MRI in peripheral zone (PZ) PCa detection [4].
In traditional diffusion theory, a mono-exponential model, based on the assumption of homogeneous Gaussian diffusion, has been adopted for diffusion analysis in most clinical studies [5,6]. However, molecular diffusion in biological tissues does not occur freely, but rather faces many forms of hindrance, including that from membranes and intracellular organelles [7]. Therefore, many non-Gaussian diffusion models, including bi-exponential, stretched-exponential, and kurtosis models, have been developed to describe the complicated behavior of water diffusion. Some previous studies have suggested that the bi-exponential model [8,9], stretched-exponential model [10], and diffusion kurtosis imaging [11] might provide useful information for PCa evaluation. However, the observed benefits from those non-Gaussian diffusion models required validation [12,13], warranting further studies to explore and compare their roles in the diagnosis of PCa.
The choice of an appropriate b-value is another crucial aspect of DWI that affects PCa detection efficiency. An appropriate b-value is essential for calculating precise diffusion parameters to aid PCa diagnosis [14,15]. Nevertheless, the maximum b-values used in prostate studies have varied greatly, from 500 to 3000 s/mm2 [12,16,17], while brain studies have used values up to 5000 s/mm2 [18]. An earlier study investigated the influence of b-value range on DWI parameters and its effect on diagnostic performance for PCa [16]. However, the maximum b-value in that study was 2300 s/mm2; the effect of an ultrahigh b-value (up to 4500 s/mm2) remains unclear.
In this study, we aimed to address the influence of the range of b-values on the discriminatory value of the diffusion parameters of four different mathematical models (mono-exponential, bi-exponential, stretched-exponential and kurtosis) and to determine the optimal parameter and its optimal b-value range for PCa detection.

Materials and methods

Patients
This prospective study was approved by the Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology Institutional Review Board; written informed consent was obtained from each patient prior to examination. Between May 2014 and March 2015, patients with clinically suspected PCa due to either a rising or elevated prostate-specific antigen (PSA) level and/or palpable prostatic nodule, with no prior therapy, were recruited to this study. Initially, 166 consecutive patients were identified. Patients were excluded if a) they had histological confirmation of no cancer (n = 43); b) they had no histological confirmation of cancer (n = 51); c) the images were of poor quality due to susceptibility artifact or movement (n = 4); or d) the cancerous focus was located in the transition zone (n = 12). After these exclusions, 56 patients remained.

TRUS-guided biopsy
After the MRI examination, all patients underwent transrectal ultrasound (TRUS)-guided biopsies. All the biopsies were performed by a single urologist with more than 20 years of experience using an ultrasound system (Hawk 2102, BK Medical, Denmark) equipped with a 5.1 MHz endocavitary probe, with a spring-loaded biopsy gun with an 18 gauge core biopsy needle. To match biopsy sextants and MR images, the PZ of the prostate was divided into six regions in MRI, which corresponded to the TRUS-guided biopsy (apex, mid gland, and base on each side of the PZ), and two core biopsies per region were obtained. Cores were individually labeled according to their location with respect to the biopsy scheme. Each specimen was histologically analyzed as cancerous or noncancerous by an experienced pathologist (with > 15 years of experience).

MRI data assessment
MRI assessment was performed according to the methods used by Rosenkrantz et al. and Li et al. [11,19], and post-processing was performed using the FUNCTOOL (GE Healthcare 4.6) workstation. MR images were interpreted by the consensus of two observers (ZY Feng, and XD Min) with more than 5 years of experience in prostate MRI (each observer had read more than one thousand prostate MRIs). Disagreements between the two observers were resolved by a re-evaluation with the study coordinator, a radiologist with more than 15 years of experience in prostate MRI. On the MR images, the prostate was divided into approximately equal thirds in the cranio-caudal direction, with each third demarcating the left and right basal, mid, and apical sextants, corresponding to the TRUS-guided biopsy. The readers treated each sextant as a separate focus for purposes of the analysis. Regions of interest (ROIs) were placed within the proven cancerous tissues and benign PZs on the T2WI and then automatically copied to DWI. The focus suspicious for PCa with a diameter of not less than 5 mm was delineated on the MRI of the prostate corresponding to the biopsy sextants. If both biopsies obtained in one region were negative or positive for PCa, the region was considered cancer negative or cancer positive respectively. If one biopsy was positive for cancer but the other was negative, the region was still considered cancer positive [11]. The sizes of the ROIs were chosen to be as large as possible with minimal contamination from uninvolved tissues. The SNR for each ROI was calculated for DWI with b-values of 4500 s/mm 2 . The SNR was defined as the ratio between the average signal intensity of ROIs and the mean of the standard deviation (SD) of the signals in the bilateral internal obturator muscles [20].
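The SNR definition above can be sketched in a few lines of Python. This is only an illustration: the function name `roi_snr`, the pixel values, and the use of the sample standard deviation are assumptions, not details taken from the study.

```python
from statistics import mean, stdev

def roi_snr(roi_signal, left_muscle, right_muscle):
    """SNR = mean ROI signal / mean of the SDs in the two obturator muscles.

    Assumption: the sample standard deviation is used for each muscle ROI.
    """
    noise = mean([stdev(left_muscle), stdev(right_muscle)])
    return mean(roi_signal) / noise

# Hypothetical pixel intensities at b = 4500 s/mm2 (illustrative values only).
roi = [52.0, 48.0, 50.0, 50.0]
left = [9.0, 11.0, 10.0, 10.0]
right = [8.0, 12.0, 10.0, 10.0]
snr = roi_snr(roi, left, right)
print(round(snr, 2))
```

Averaging the two muscle SDs, rather than pooling the pixels, follows the wording of the definition ("the mean of the standard deviation").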

Modeling
All analyses in the current study were performed on an ROI level [13,21]. We performed non-linear least-squares fitting based on the Levenberg-Marquardt algorithm using Matlab (MathWorks, Natick, MA, USA) to fit the following models to the mean signal of each ROI (Fig 1):
1. Mono-exponential model: $S(b) = S_0 \exp(-b \cdot \mathrm{ADC})$ (1)
2. Bi-exponential model: $S(b) = S_0 \left[ (1 - f) \exp(-b \langle D \rangle) + f \exp(-b D^{*}) \right]$ (2)
3. Stretched-exponential model: $S(b) = S_0 \exp\left[ -(b \cdot \mathrm{DDC})^{\alpha} \right]$ (3)
4. Diffusion kurtosis imaging: $S(b) = S_0 \exp\left( -b D_{app} + b^{2} D_{app}^{2} K_{app} / 6 \right)$ (4)
where S(b) is the signal intensity at a particular b-value, and S0 is the signal intensity at b = 0 s/mm2. ADC is the diffusion coefficient of the mono-exponential model. <D>, D*, and f are the diffusion parameters of the bi-exponential model: f is the perfusion fraction, <D> is the pure molecular diffusion, and D* is the pseudo-diffusion coefficient. DDC is the diffusion coefficient of the stretched-exponential model, and α is the water diffusion heterogeneity index, which lies between 0 and 1. Dapp is the diffusion coefficient of the kurtosis model, and Kapp is the kurtosis.
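As a rough illustration, the four signal models named above can be written as plain Python functions. The functional forms are the commonly used textbook expressions and are assumed, not confirmed, to match the ones fitted in the study; the function names and the example parameter values are hypothetical. The Levenberg-Marquardt fitting step itself is not shown.

```python
import math

def mono_exponential(b, s0, adc):
    # Gaussian diffusion: single decay constant ADC.
    return s0 * math.exp(-b * adc)

def bi_exponential(b, s0, f, d, d_star):
    # f: perfusion fraction, d: pure diffusion <D>, d_star: pseudo-diffusion D*.
    return s0 * ((1.0 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))

def stretched_exponential(b, s0, ddc, alpha):
    # alpha in (0, 1]; alpha = 1 reduces to the mono-exponential model.
    return s0 * math.exp(-((b * ddc) ** alpha))

def kurtosis(b, s0, d_app, k_app):
    # k_app = 0 reduces to the mono-exponential model.
    return s0 * math.exp(-b * d_app + (b ** 2) * (d_app ** 2) * k_app / 6.0)

# Illustrative comparison at b = 2000 s/mm2 (parameter values are made up).
b = 2000.0
print(mono_exponential(b, 100.0, 1.0e-3))
print(kurtosis(b, 100.0, 1.0e-3, 0.6))
```

With a positive kurtosis, the predicted signal at high b-values stays above the mono-exponential prediction, which is the departure from Gaussian behavior that the non-Gaussian models are meant to capture.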

Statistical analysis
To validate the dependence of the models on b-values, the b-values were divided into four different ranges: 0-1000, 0-2000, 0-3200, and 0-4500 s/mm2, grouped as A, B, C, and D, respectively.
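The effect of the b-value range on a fitted parameter can be illustrated with a small simulation. In this sketch the list of 21 b-values is hypothetical (the acquired values are not listed here), the signal is generated from a kurtosis-type decay with made-up parameters, and ADC is estimated by a log-linear least-squares fit, a simplification substituted for the Levenberg-Marquardt non-linear fit used in the study.

```python
import math

# Maximum b-value (s/mm2) of each group, as defined in the text.
B_MAX = {"A": 1000, "B": 2000, "C": 3200, "D": 4500}

# Hypothetical set of 21 b-values between 0 and 4500 s/mm2.
B_VALUES = [0, 50, 100, 150, 200, 300, 400, 500, 600, 800, 1000,
            1200, 1500, 1800, 2000, 2400, 2800, 3200, 3600, 4000, 4500]

def kurtosis_signal(b, s0=1.0, d_app=1.0e-3, k_app=0.6):
    # Noise-free non-Gaussian decay used as synthetic input (illustrative values).
    return s0 * math.exp(-b * d_app + b * b * d_app * d_app * k_app / 6.0)

def log_linear_adc(b_values, signals):
    """Least-squares slope of ln(S) against b; ADC is the negated slope."""
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_l = sum(logs) / n
    sxy = sum((b - mean_b) * (l - mean_l) for b, l in zip(b_values, logs))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx

adc_by_group = {}
for group, b_max in B_MAX.items():
    subset = [b for b in B_VALUES if b <= b_max]
    signals = [kurtosis_signal(b) for b in subset]
    adc_by_group[group] = log_linear_adc(subset, signals)

for group, adc in adc_by_group.items():
    print(group, f"{adc:.3e}")
```

Because the synthetic decay flattens at high b-values, the estimated ADC falls as the range widens from group A to group D, which is why parameters from different b-value ranges are not directly interchangeable.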
To measure the goodness-of-fit, the adjusted coefficient of determination (R2) was calculated for each model, Eqs (1)-(4), in each group. One-way ANOVA with the Bonferroni test was used for multiple comparisons of adjusted R2 between models.
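A minimal sketch of the adjusted R2 calculation follows. It assumes the convention in which the residual degrees of freedom are n minus the number of fitted coefficients; the exact convention used in the study is not stated, and the function name and data are hypothetical.

```python
def adjusted_r2(observed, fitted, n_params):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - n_params).

    Assumption: n_params counts every fitted coefficient, including S0.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)

# Illustrative data: five observations and the values predicted by some model.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.0, 4.1, 4.9]
print(adjusted_r2(observed, fitted, n_params=2))
print(adjusted_r2(observed, fitted, n_params=3))
```

For the same residuals, a model with more free parameters receives a lower adjusted R2, so the statistic penalizes the extra flexibility of the bi-exponential model (four coefficients) relative to the mono-exponential model (two).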
ADC, <D>, D*, f, DDC, α, Dapp, and Kapp were estimated in the benign PZs and cancerous tissues. The parameters were calculated using all the b-values in groups A, B, C, and D. Normal distribution was assessed using the Kolmogorov-Smirnov test; if the data presented a normal distribution, an independent t-test was performed to compare the differences in parameters between benign PZs and cancerous tissues. For non-normally distributed variables, the Mann-Whitney U test was used. Next, receiver operating characteristic (ROC) curve analysis was performed, and the area under the curve (AUC) was calculated to evaluate and compare the diagnostic accuracy of the parameters in distinguishing cancerous tissues from benign PZs. Differences in AUCs were assessed using the method of DeLong et al. [22].
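The AUC can be computed directly from its rank-based interpretation: the probability that a randomly chosen cancerous ROI is ranked as more suspicious than a randomly chosen benign ROI. The sketch below uses that definition with hypothetical ADC values; it is not the MedCalc procedure used in the study and does not implement the DeLong comparison.

```python
def auc_rank(positive_scores, negative_scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical ADC values (x10^-3 mm2/s); cancerous tissue tends to have lower ADC.
adc_cancer = [0.6, 0.7, 0.8, 0.9]
adc_benign = [0.85, 1.1, 1.3, 1.5]

# Negate ADC so that a higher score means "more suspicious for cancer".
auc_adc = auc_rank([-x for x in adc_cancer], [-x for x in adc_benign])
print(auc_adc)
```

For parameters that are higher in cancer, such as Kapp, the scores would be passed without negation.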
The adjusted R2 was calculated with Matlab (MathWorks, Natick, MA, USA). The statistical analysis was performed using SPSS software (SPSS for Windows 19.0, Chicago, IL, USA). The ROC analysis was performed using MedCalc version 13.0.0.0 for Windows (MedCalc Software, Mariakerke, Belgium). For all tests, the significance level was set at p < 0.05.

Results
The final study population comprised 56 PZ PCa patients (age range, 52-82 years; median age, 67 years). The median PSA level in these 56 patients was 26.036 ng/mL (range, 2.06-1000 ng/mL). Among the total 336 ROIs, 198 ROIs were evaluated as benign PZs, and 138 ROIs were evaluated as cancerous tissues. The SNR of the ROIs at the highest b-value of 4500 s/mm2 was between 8.784 and 64.118. The mean (±SD) ROI SNRs in cancerous tissues and benign PZs were 27.784 ± 10.245 and 17.035 ± 8.107, respectively. The mean (±SD) ROI sizes in cancerous tissues and benign PZs were 169.780 ± 107.914 mm2 and 97.57 ± 22.041 mm2, respectively.

Goodness-of-fit of the models
The adjusted R2 was calculated to test the goodness-of-fit of the models. The mean values and standard deviations of the adjusted R2 are provided in Table 1. The bi-exponential and the stretched-exponential models provided the highest adjusted R2 among the four models in every group. There were no significant differences between the bi-exponential and stretched-exponential models in any of the four groups (p > 0.05 for all). The goodness-of-fit of the kurtosis model was slightly lower than that of the bi-exponential and the stretched-exponential models in all groups (p < 0.05 for all). The mono-exponential model provided the worst goodness-of-fit in the four groups. Significant differences were observed between the kurtosis model and the other three models in all groups (p < 0.05 for all). Similarly, the mono-exponential model differed significantly from the other three models in all groups (p < 0.05 for all). The mean adjusted R2 of the mono-exponential model decreased with increasing maximum b-value. No clear tendency could be established for the bi-exponential, stretched-exponential or kurtosis models (Fig 2).

Summary of DWI parameters
The mean values and standard deviations of the diffusion parameters measured in groups A, B, C, and D are summarized in Table 2. The mean values of ADC, <D>, f, DDC, α, and Dapp were significantly lower in cancerous tissues than in benign PZs in each group (p < 0.001 for all). In contrast, Kapp was significantly higher in cancerous tissues than in benign PZs in each group (p < 0.001 for all). We also found a significant difference in D* between cancerous tissues and benign PZs in groups B and C (p < 0.05), but not in groups A and D (p > 0.05). However, D* showed a large standard deviation in all groups (Fig 3).

Analysis of ROC curves
The AUC, cutoff, sensitivity, and specificity values of the parameters for distinguishing cancerous tissues from benign PZs are reported in Table 3. The ROC curves of each parameter calculated using different b-value ranges are shown in   The parameters ADC, <D>, DDC and K app provided the highest AUCs among the four models. The AUCs of ADC in groups C and D were comparable (p = 0.980) and were significantly higher than those of groups A and B (p< 0.05 for all). The AUCs of <D> calculated in groups A, B, and C were significantly higher than in group D (p< 0.001 for all). The AUC of <D> in groups A and B showed no significant difference (p = 0.136), and the value of group B was significantly higher than in groups C and D (p< 0.001 for all). The AUCs of DDC calculated using groups B, C and D showed no significant difference (p > 0.05 for all). The AUCs of DDC calculated using groups B and D were significantly higher than in group A (p = 0.016 and p = 0.014, respectively). The AUCs of DDC calculated using groups A and C showed no significant difference (p = 0.102). The AUCs of K app calculated using groups B, C, D were significantly higher than in group A (p< 0.05 for all); however, no significant difference was observed among groups B, C, and D (p > 0.05 for all).
For the parameters calculated using the b-value of group A, the AUCs of DDC and K app were not significantly different (p = 0.204); the AUCs of ADC and K app were not significantly different (p = 0.070); the AUC of DDC was slightly higher than the AUC of ADC (p = 0.014); and the AUCs of ADC, DDC and K app were significantly higher than those of the other parameters (p< 0.05 for all). For the parameters calculated using the b-values of groups B and C, the AUCs of ADC and K app were similar (p = 0.07 and p = 0.954, respectively), but were significantly higher than those of the other parameters (p< 0.001 for all). For the parameters calculated using the b-value of group D, the AUC of ADC was slightly higher than that of K app (0.957 vs 0.953, p = 0.002); the AUCs of ADC and K app were significantly higher than those of the other parameters (p< 0.001 for all).
Discussion
diagnostic performance relative to conventional ADC. ADC derived from the conventional mono-exponential model with a high maximum b-value (3200 s/mm2) provided excellent diagnostic performance in PZ PCa detection, such that no extra benefit was obtained from cumbersome non-Gaussian models (bi-exponential, stretched-exponential, and kurtosis models) or from time-consuming ultrahigh b-values (> 3200 s/mm2).
Appropriate b-values must be carefully selected for the use of DWI in PCa detection. In clinical applications, DWI is typically performed using b-values of 800-1500 s/mm2 [4,23]. Nevertheless, the clinical benefits of high-b-value DWI have been explored from head to body [17,24,25]. Until now, consensus on the optimal b-value range of DWI for PCa diagnosis has been lacking. As reported in a previous study, b-value distribution mainly influences the repeatability of DWI-derived parameters rather than the diagnostic performance [26]; however, the influence of b-value distribution on diagnostic performance is beyond the scope of this study. Our study focused on the influence of b-value range on the diagnostic performance of the parameters of the various DWI models. Our study showed that, for PCa detection, DDC and Kapp calculated using b-values up to 2000 s/mm2 outperformed those calculated using b-values up to a maximum of 1000 s/mm2. According to a previous study, DWI with a maximum b-value of 2000 s/mm2 is beneficial for the estimation of the stretched-exponential and kurtosis model parameters [27]. Our results are also in agreement with a study by Mazzoni [16], which found that Kapp demonstrated a higher AUC for PCa detection using a higher b-value (2300 s/mm2) than a low b-value (800 s/mm2). However, the highest b-value they used was only 2300 s/mm2. For brain kurtosis imaging, empirical evidence indicates that maximum b-values of 2000 to 3000 s/mm2 are appropriate [7,28]. However, no reasonable maximum b-value has been recommended for prostate stretched-exponential and kurtosis imaging. Our results revealed that b-values above 2000 s/mm2 provided no extra diagnostic value for these two models. Therefore, maximum b-values of approximately 2000 s/mm2 may be appropriate for prostate stretched-exponential and kurtosis model imaging. The suggestion of Rosenkrantz et al., that a maximal b-value of 1500-2000 s/mm2 is suitable for body kurtosis imaging, also supports this conclusion [28]. For the bi-exponential model, <D> provided the highest diagnostic value. The AUC of <D> was comparable in groups A and B and was significantly lower in groups C and D. This finding highlights the importance of the maximum b-value for the bi-exponential model. Some studies showed that the parameters of the bi-exponential model depended heavily on the choice of b-values and that it is critical to select an appropriate range of b-values in prostate imaging to obtain accurate parameters [15,29]. In contrast to <D>, DDC and Kapp, the diagnostic value of ADC calculated using a maximum b-value of 3200 s/mm2 was superior to that derived using maximum b-values of 2000 s/mm2 and 1000 s/mm2. The increased maximum b-value provided greater contrast between cancerous and non-cancerous tissues, adding to the diagnostic performance. However, when the maximum b-value was higher than 3200 s/mm2, the rapid decrease in the goodness-of-fit (0.871 to 0.800) may outweigh the extra benefit of ADC.

Figure caption (partial): In group A, Kapp had the largest AUC (0.940), but the AUCs of ADC and Kapp were not significantly different (p = 0.070). In groups B and C, the AUCs of ADC and Kapp were comparable and significantly higher than those of the other parameters. In group D, the AUC of ADC was slightly higher than that of Kapp (0.957 vs 0.953, p = 0.002), and both were significantly higher than those of the other parameters (p < 0.001 for all).
Non-Gaussian diffusion models have been proposed to describe complicated water diffusion behavior and are believed to provide a more accurate description of the DWI signal decay curves obtained using higher b-values [30]. Our results showed that the diagnostic performance of parameters derived from the bi-exponential model was low despite excellent goodness-of-fit. The better fit of the bi-exponential model may result from its larger number of free parameters. However, the additional parameters may cause over-fitting of the data, resulting in poor repeatability and reliability [12,13,30,31]. A study by Merisaari et al. found that, compared with the mono-exponential, stretched-exponential, and kurtosis models, the parameters of the bi-exponential model demonstrated the worst repeatability and diagnostic performance [26]. The poor repeatability and reliability of the bi-exponential model limit its clinical utility. When calculated with a high maximum b-value (≥ 2000 s/mm2), the ROC analysis in our study did not reveal significantly superior performance of the parameters derived from the stretched-exponential and kurtosis models over the conventional ADC for PCa detection, although these models provided a higher-quality fit to the DWI signal decay. These results are consistent with several previous studies [13,32,33]. High reliability and repeatability are preconditions for using parameters derived from quantitative diffusion models for disease diagnosis and characterization, and this may be why the mono-exponential model is superior at maintaining robust diagnostic performance. Nevertheless, the additional value for PCa of non-Gaussian diffusion models compared with the mono-exponential model remains controversial and requires further research. It is well known that the mono-exponential, bi-exponential, stretched-exponential, and kurtosis models are all phenomenological DWI signal models. However, their physiologic basis remains unclear and requires further evaluation [34].
There were several limitations in this study. First, this study took the TRUS-guided biopsy as the standard of reference. The use of whole-mount histopathology would improve the accuracy of the agreement between MR images and histopathology. However, it is unreasonable to expect all of our subjects to undergo prostatectomy. Because TRUS-guided biopsy has good specificity but poor sensitivity, a small tumor focus may be missed, resulting in a false-negative diagnosis. In our study, we focused primarily on large tumors, which could partially obviate this problem. Moreover, TRUS-guided biopsy has been the standard of reference in a large number of studies [11,19,35,36], which also illustrates the feasibility of the method used in our study. Second, the role of DWI parameters in the assessment of the transition zone was not evaluated, because TRUS-guided biopsy may miss some cancers that arise in the transition zone and because 70-75% of PCa are located in the PZ. Additionally, DWI is the dominant sequence for PZ PCa detection, while the dominant sequence for transition zone PCa is T2WI [5]; thus, it is more important to evaluate the diagnostic performance of DWI for PZ PCa. Third, compared with whole-mount step-section histopathology, TRUS-guided biopsy may lead to the under-grading of a fraction of cancers; therefore, this study did not aim to explore the predictive accuracy of different mathematical models with regard to Gleason score. Fourth, the influence of SNR on diffusion model parameters of different b-value ranges was not evaluated. The issue of low SNR must be considered for diffusion imaging, especially for high-b-value images. Although high b-values up to 4500 s/mm2 were used in our study, the SNR of the highest-b-value images still reached an extremely high level. Several strategies may have contributed to the increased SNR: imaging at a higher field strength (3.0 Tesla); an increased number of excitations (10 for the highest b-value image); a slice thickness of 5 mm, which is thicker than the 3 mm commonly used in clinical settings; and an in-plane resolution of 3.125 × 2.917 mm rather than the 2.5 × 2.5 mm routinely used in clinical settings. Despite the limitations, we believe our methodological strategies provide sufficient validity for the principal results of our study.

Conclusion
In conclusion, our study suggested that ADC derived from a conventional mono-exponential model using high-b-value (3200 s/mm2) DWI is an optimal choice for routine clinical application in PZ PCa detection. For PCa detection, the diagnostic value of parameters derived from non-mono-exponential diffusion models or higher b-values (> 3200 s/mm2) warrants further multicenter exploration. Furthermore, as highly parameterized non-Gaussian diffusion models have a higher information content than the single-parameter mono-exponential model, they may provide useful information for differentiating PCa from other prostatic lesions and might be promising for monitoring PCa progression or response to therapy, which also needs further evaluation.
Supporting information
S1 Data. Computed parameters for benign and cancerous tissues. (XLS) S1