Diffusion weighted imaging in patients with rectal cancer: Comparison between Gaussian and non-Gaussian models

Purpose The purpose of this study was to compare the performance of four diffusion models, including mono and bi-exponential both Gaussian and non-Gaussian models, in diffusion weighted imaging of rectal cancer. Material and methods Nineteen patients with rectal adenocarcinoma underwent MRI examination of the rectum before chemoradiation therapy including a 7 b-value diffusion sequence (0, 25, 50, 100, 500, 1000 and 2000 s/mm2) at a 1.5T scanner. Four different diffusion models including mono- and bi-exponential Gaussian (MG and BG) and non-Gaussian (MNG and BNG) were applied on whole tumor volumes of interest. Two different statistical criteria were recruited to assess their fitting performance, including the adjusted-R2 and Root Mean Square Error (RMSE). To decide which model better characterizes rectal cancer, model selection was relied on Akaike Information Criteria (AIC) and F-ratio. Results All candidate models achieved a good fitting performance with the two most complex models, the BG and the BNG, exhibiting the best fitting performance. However, both criteria for model selection indicated that the MG model performed better than any other model. In particular, using AIC Weights and F-ratio, the pixel-based analysis demonstrated that tumor areas better described by the simplest MG model in an average area of 53% and 33%, respectively. Non-Gaussian behavior was illustrated in an average area of 37% according to the F-ratio, and 7% using AIC Weights. However, the distributions of the pixels best fitted by each of the four models suggest that MG failed to perform better than any other model in all patients, and the overall tumor area. Conclusion No single diffusion model evaluated herein could accurately describe rectal tumours. These findings probably can be explained on the basis of increased tumour heterogeneity, where areas with high vascularity could be fitted better with bi-exponential models, and areas with necrosis would mostly follow mono-exponential behavior.


Results
All candidate models achieved a good fitting performance with the two most complex models, the BG and the BNG, exhibiting the best fitting performance. However, both criteria for model selection indicated that the MG model performed better than any other model. In particular, using AIC Weights and F-ratio, the pixel-based analysis demonstrated that tumor areas better described by the simplest MG model in an average area of 53% and 33%, respectively. Non-Gaussian behavior was illustrated in an average area of 37% according to the F-ratio, and 7% using AIC Weights. However, the distributions of the pixels best fitted by PLOS

Introduction
Diffusion weighted imaging (DWI) has an increasing clinical role in the imaging of patients with rectal cancer, especially in the restaging phase after chemoradiation treatment (CRT) [1]. It has been confirmed that DWI improves the diagnostic accuracy when added to conventional T2 sequences for detecting residual disease after CRT [2,3]. The latter has been mainly accomplished by detecting high signals on the high b-value diffusion images by means of visual assessment [1][2][3][4], and occasionally by measuring apparent diffusion coefficient (ADC) of these areas [4].
In the era of minimally invasive surgical treatment or even wait and see policies [5,6], it is of paramount importance to develop and validate non-invasive imaging biomarkers that could provide prognostic information on the therapeutic outcome, before initiating the treatment [7]. To serve the latter requirements, an ongoing shift from qualitative evaluation of diffusion images to more quantitative strategies, including measurement of ADC [4,[8][9] and tumor volume [10] on high b-value images, is in progress. The most commonly used diffusion related biomarker is the ADC calculated from a mono-exponential model which assumes that the molecular displacement probability function is Gaussian. It has been shown that in several normal tissues, as well as, in malignant, heterogeneous tissues, there is a deviation between the Gaussian diffusion models and the experimental data, noticeable in high b-values which could be attributed to interactions of water molecules with anatomical structures, like cellular membranes. This means, that in the presence of increased tissue heterogeneity the assumption that water displacements can be described by a Gaussian probability function, is no longer valid. In such cases, non-Gaussian models like kurtosis has been shown to fit the data more accurately in the brain [11], breast [12], prostate [13], liver [14] and pancreas [15].
In the current study, four different models (mono-and bi-exponential fitting to Gaussian and non-Gaussian distributions) were applied on data from patients with rectal cancer, to identify which model provided the best performance in terms of fitting quality.

Patients
This study retrospectively assessed nineteen patients who were diagnosed with histologically proven non-mucinous type rectal adenocarcinoma at Maastricht University Medical Center medical center between April 2014 and July 2015. Twelve patients were male, seven females. Median age was 66 (range 45-84 years). Patients were selected from a patient group of n = 28 patients who all underwent a primary staging MRI examination including a dedicated DWI sequence before treatment. Nine patients were excluded because the DW images could not be assessed due to peristaltic motion effects. The study was approved by the Maastricht UMC Medical Review Ethics Committee while informed consent was waived.

Image acquisition
Diffusion weighted imaging (DWI) using a single shot Spin-Echo Echo Planar Imaging sequence was acquired on a 1.5 T whole-body magnetic resonance scanner (Ingenia, Philips, Best, the Netherlands). For signal reception a 32-channel flexible anterior phased-array coil and built-in posterior coil were used. Seven b-values (number of signal averages) including: 0 (5), 25 (5), 50 (5), 100 (5), 500 (5), 1000 (10) and 2000 (10) s/mm 2 were acquired. The most important DWI sequence parameters were TR/TE: 4414/79ms; FOV: 320 x 247 mm 2 ; acquisition matrix: 176x109; reconstruction matrix: 256x256; slice thickness 5mm and intersection gap 0.4mm. The diffusion gradients were applied in 3 orthogonal axes (tetrahedral scheme), parallel imaging factor was 1.9 and a spectral selective fat saturation pulse was used. The DWI scan time was 08mins and 24s for the acquisition of 20 axial slices. Before the initiation of the diffusion sequence all patients were injected 20mg of Butylscopolamine (Buscopan, Boehringer Ingelheim Pharma, Ingelheim, Germany) to reduce peristaltic motion.

Image analysis
Diffusion data were post processed with in-house developed software [16] which was able to generate parametric maps of a number of model related parameters. For each patient the tumor was traced manually slice by slice by a trained radiologist with 7 years of experience in rectal MRI. Regions of interest (ROIs) were drawn on the b1000 images, including only the areas with high signal intensity therefore avoiding necrotic parts of the tumors encompassing as much of the tumor volume as possible, avoiding the outmost tumor margins in order to minimize partial volume averaging (Fig 1). The T2-weighted sequences were at the reader's disposal for anatomical reference.
All pixel values belonging to the tumor were used as input for signal intensity curves as a function of b-value. Model specific curves were graphically overlaid on the data in order to gain insight into each model performance qualitatively. For visual and quantitative evaluation of each approach, statistical criteria permitted direct comparison of the fitting outcome ( Fig  2). Signal to Noise Ratio (SNR) maps were calculated on a pixel by pixel basis for each individual b-value based on the following formula that is valid when images obtained with parallel imaging techniques are considered [17]: where SI tumor was the signal intensity of each tumoral pixel and sd air the standard deviation of a region of interest drown in the air near the anterior abdominal wall. The analysis was based upon pixels with adequate model compliance to the data. A statistical goodness-of-fit metric relying on the adjusted-R 2 (adj-R 2 ) was applied indifferently of the model, and pixels above a threshold of equal to 50% were included in the analysis. Moreover, in case a ROI showed a very high SNR value at b0 (i.e. 1000), pixels within the ROI with SNR above the 95 th quantile of the SNR distribution were also excluded from the analysis as conveying spurious signal. Finally, the analysis was applied individually to each patient, all tumours were visualised on DWI series, and quantitative data were presented as mean ± standard deviation (SD).

Diffusion signal modelling
The DWI biomarkers were quantified on a pixel by pixel basis leading to the generation of parametric maps (Figs 3 and 4) of each individual biomarker using the following formulae: 1. ADC from the mono-exponential Gaussian fit, according to: 2. D bg , D bg Ã and f bg from the bi-exponential Gaussian fit, according to:  3. K mng and D mng from the mono-exponential non-Gaussian fit, according to: 4. D bng , f bng , D bng Ã , K bng from the bi-exponential non-Gaussian fit, according to: where S(b) is the signal intensity (SI) at a given b-value, S 0 the SI without any diffusion weighting gradient (b-value equal to 0), ADC the apparent diffusion coefficient, D bg the Gaussian true diffusion coefficient, D bg Ã the Gaussian pseudo-diffusion coefficient, f bg the Gaussian micro-perfusion fraction, D mng is the mono-exponential non-Gaussian diffusion coefficient, K mng the mono-exponential kurtosis coefficient, D bng is the bi-exponential non-Gaussian true diffusion coefficient, D bng Ã is the bi-exponential non-Gaussian pseudo-diffusion coefficient, f bng is the non-Gaussian micro-perfusion fraction and K bng the bi-exponential kurtosis coefficient. The kurtosis coefficient expresses the degree of deviation from the Gaussian distribution and is a dimensionless parameter, whose value may be either 0 (expressing perfect Gaussian distribution) or higher. All models were applied on multi-slice regions of interest including rectal tumors and their associated biomarkers were derived for every pixel in the region using non-linear fitting techniques. The mean post processing time to generate all diffusion parametric maps was 04:32mins (min: 01:45mins, max: 08:31mins) depending on the number of pixels belonging to the tumor.
Nonlinear Least Squares (NLLS) was applied as a fitting method for calculating the biomarkers from the four different models. NLLS, based on the Levenberg-Marquardt algorithm [18], are minimization problems in mathematics that given initial, lower, and upper bounds for each calculated biomarker (i.e. D bg , D bg Ã and f bg for the bi-exponential Gaussian fit) consider the diffusion model by a linear one, and iteratively refine the values of the parameters to reach their optimal values. The following constraints in the initialization values intended to limit possible effect of local minima in the fitting procedure: 1. Mono-exponential Gaussian (MG) model: ADC from 0.1 (10 -3 mm 2 /s) to 4.0 (10 -3 mm 2 /s) with an initial value of 1.5 (10 -3 mm 2 /s).

Statistical metrics
Statistical metrics including the R-square (R 2 ) and the Root Mean Square Error (RMSE) are frequently used criteria to determine the goodness-of-fit of a model. Studies showed that R 2 and conclusively metrics that mainly rely on the measurement of the absolute distance between the fitted curve and the given signal have been adequate metrics in nonlinear fitting problems [19][20]. Therefore, the bias-corrected adjusted-R 2 (adj-R 2 ) that accounts for the number of degree of freedom (DOF) was used instead of the R 2 . Both adj-R 2 and RMSE were included in the analysis to only assess how close the fitted curve was to the measured signal intensity curve, thus providing a strong statistical indicator about the fitting accuracy of the four examined models and the derived diffusion parameters.
Akaike Information Criteria (AIC) [21] and the F-test statistics (F-ratio) [22] were recruited for statistical evaluation of the performance of the four investigated models in terms of model selection. A low value for AIC signifies a good model. However, a direct comparison of the AIC values is meaningless when comparing a series of models [23]. Instead, model selection was performed using AIC Weights. The second metric for model selection relied on a hypothesis test using F-ratio with a 5% level of significance. F-ratio was calculated based on the following equation: where DF is the degree of freedom given by the number of the b-values minus the number of model parameters, and subscripts 1 and 2 present the simpler and the more complex examined models respectively. F-ratio indicates a pairwise comparison between two candidate models for best fitting, choosing the more complex model (i.e. with subscript 2) in case its p-value is less than the one from the F-table with a 5% level of significance. Multiple pairwise comparisons were conducted between the four candidate models until the best model was determined. To further extend the statistical analysis, all derived parameters were tested and exhibited non-normal distribution with a p-value of 5% as significant. Therefore, Wilcoxon-Mann-Whitney test was used to disclose any significant differences between all four models.
For adj-R 2 and AIC Weights, success is measured on the basis of the maximum value of the criterion while the opposite is true for the RMSE where the lowest score pinpoints the most successful model. F-ratio relies on a hypothesis test and no measurements can be displayed. Table 1 summarizes the percentage of fitted pixels within the ROIs of all patients after the thresholding process was applied to all four models. All candidate models effectively fit a wide area of each ROI above the defined threshold, specifically more than 85% of the pixels in all cases. The Gaussian bi-exponential model (BG) fitted the highest number of pixels of each patient with an adj-R 2 value superior to 95% (for 15 out of 19 patients). SNR of the tumour area for every b-value, expressed as mean ± SD, is also shown on Table 1. An average SNR level above 240 was achieved in the low b-value range (b-value from 0 to 100 s/mm 2 ), decaying smoothly to 85 as the b-value increases. Table 2 gives an overview of the goodness-of-fit of the four candidate models, where the adj-R 2 and especially the RMSE indicate that the two most complex models, the BG and the BNG, exhibit the best fitting performance. In case of the adj-R 2 , the BG model showed the best fitting performance in fourteen out of nineteen patients whereas using RMSE the most complex model (BNG) best fitted tumor areas from all patients. Mean ± SD of each derived parameter from the four examined models are presented in Table 3 for all tumoral pixels. In a significant number of patients (12 out of 19) the ADC from the MG and the D bng from the BNG model showed a statistical dependence according to the Wilcoxon-Mann-Whitney test (p-value higher than 5%). On the contrary, no statistical significant differences (p-value less than 5%) were found between micro-perfusion, micro-perfusion fraction and kurtosis related parameters, respectively.

Model selection
The statistical analysis was performed with respect to the model selection criteria of the AIC Weights and the F-ratio. To obtain a more detailed insight into the performance of the four examined models, the number of pixels that were attributed to a certain criterion according to the best fitting were calculated and depicted in Table 4. The majority of pixels from each ROI were in general assigned to the model proven to be the most successful. In case of the Akaike Weights and the F-ratio, most of the pixels seemed to be better characterized by the monoexponential Gaussian decaying curve and the ADC parameter. In the same twelve out of nineteen patients, most of the pixels from the tumor area were better characterized by the MG model either using AIC Weights or F-ratio. However, the F-ratio led to a more balanced Abbreviations: MG, mono-exponential Gaussian; ADC, apparent diffusion coefficient (10 −3 mm 2 /s); BG, biexponential Gaussian; D bg , Gaussian true diffusion (10 −3 mm 2 /s); D bg *, Gaussian pseudo-diffusion (10 −3 mm 2 /s); f bg , Gaussian micro-perfusion fraction; MNG, mono-exponential non-Gaussian; D mng , monoexponential non-Gaussian diffusion (10 −3 mm 2 /s); K mng , mono-exponential kurtosis; BNG, bi-exponential non-Gaussian; D bng , bi-exponential non-Gaussian true diffusion (10 −3 mm 2 /s); D bng *, bi-exponential non-Gaussian pseudo-diffusion (10 −3 mm 2 /s); f bng , non-Gaussian micro-perfusion fraction; K bng , bi-exponential kurtosis coefficient. distribution of the overall number of pixels from all ROIs to the four examined models when compared to the AIC Weights. The BG model was the second most successful in both criteria, and non-Gaussian behavior was observed in almost 37% and 7% of pixels when F-ratio and AIC Weights were calculated, respectively. The statistical analysis was extended in terms of the SNR calculations to assess any potential influence they had on the model selection process. The mean value of SNR for every pixel of all ROIs was calculated for the subgroup of low (0, 25, 50, 100 s/mm 2 ) and high (500, 1000, 2000 s/mm 2 ) b-values, respectively. The corresponding min-to-max SNR ranges for each group were in turn divided into 8 sub-regions of equal width in which all models were individually tested with respect to the aforementioned analysis criteria. The scope of this analysis was focused in testing the level of stability each model performance shows at different levels of SNR. Table 5 shows the percentage of pixels best fitted by each model at each SNR interval for the low and the high b-value range according to the F-ratio, respectively. A consistent behavior, in terms of which model best fitted the most pixels, was shown in the low, high and overall b-value range (Table 4) with minor alterations to the percentage differences along the SNR intervals. Similar results in terms of stability were obtained in case of the AIC Weights.
In order to conclude to a single model that achieves best fitting performance for each patient individually, an average percentage of pixels best described by each model was first calculated for each patient and then summarized as shown in Table 6 and Fig 5. The purpose of evaluating fitting performance for each patient individually was to remove dependence from lesion size (number of pixels assigned as tumor) and more importantly to decide upon the most appropriate model for personalized data fitting. The results shown equivalent results to those presented in Table 4.

Discussion
Diffusion imaging is gaining increasing attention for rectal cancer imaging not only qualitatively but also quantitatively [8,24]. The predictive value of ADC in assessing the treatment outcome has already been demonstrated by a limited number of studies [25][26][27]. In the vast majority of these studies a mono-exponential Gaussian algorithm is used in order to extract quantitative information. In the presented study, we acquired multiple b-values located on the low and very high range, in order to bring out micro-perfusion contamination and deviations from the Gaussian behavior [28].
A comprehensive statistical analysis was conducted to assess fitting quality of each model using the adj-R 2 and RMSE. As reported in the literature, metrics that rely on the measurement of the absolute distance between the fitted curve and the acquired diffusion signal tend to favour the most complex models [19][20]. Statistically, a complex model like BNG would better fit the data than a simple model like MG causing overfitting and consequently false A trend was distinguished when considering both chosen criteria (i.e. AIC Weights and the F-ratio) for model selection, which evince the MG as the most reliable fitting algorithm. However, a high heterogeneity of all ROIs was prominent and statistically presented through the percentage of pixels best fitted by each of the four models. Table 4 shows that the AIC Weights and especially the F-ratio showed a balanced and smoothed distribution of the pixels better fitted by the models. Extending the current analysis into eight different SNR intervals, Table 5 depicted that model performance was not influenced by the SNR of the tumor area. These results were further confirmed when average percentages of all pixels of the tumor area of each patient that best fitted by each model were summarized in Table 6 and Fig 5. These preliminary findings suggest that in heterogeneous tissue areas, a single model cannot quantitatively describe all the underling anatomical and functional diversity. Therefore, a composite diffusion model (CDM) map could be useful to reflect tumour heterogeneity by presenting the most accurate diffusion model on a pixel by pixel basis (Fig 6). Measured mean signal and fitted curves from the four examined models, applied to two different areas (Region A and B) from Fig 6, were depicted in the following figure (Fig 7). As seen from Fig 7, "Region A" which was classified according to the model selection criteria as an area with a mono-exponential behavior was better fitted by the MG model compared to "Region B" which was better described by BG and BNG.
In the present study, when measuring the signal decay in tumoral pixels, we observed a significant deviation not only in the low b-value area that can be explained in the basis of microperfusion contamination, but in the high b-value area due to probably increased tumoral heterogeneity. Although the signal to noise ratio (SNR) decreases considerably with higher b-values, our acquisition protocol with asymmetric averaging scheme and utilization of state of the  We must acknowledge several limitations in the current study. The relative small number of patients is a potential limitation although for the purpose of the study it was considered adequate. Another limitation was the absence of the application of motion correction techniques between the different b-values. Motion correction in the rectum is not a trivial problem often requiring the development of sophisticated elastic deformation algorithms, which was not under the scope of this study. However, proper administration of antiperistaltic drugs just before the diffusion acquisition minimized such motion-related issues.
In conclusion, the current study indicates that there is no single diffusion model that can describe rectal tumors accurately. Our results suggest that a combination of different models can add value for describing tumor heterogeneity quantitatively in the context of composite diffusion model maps. These findings probably can be explained on the basis of increased tumoral heterogeneity in these lesions, where areas with high vascularity could be better fitted by bi-exponential models, and areas with necrosis would mostly follow mono-exponential behavior.
Supporting information S1 Dataset. The dataset used in this analysis is available in the file S1_Dataset.zip. The derived DWI parameters from the four examined models and the statistical analysis results are provided in csv format. (ZIP)