Signal to noise and b-value analysis for optimal intra-voxel incoherent motion imaging in the brain

Intravoxel incoherent motion (IVIM) is a method that can provide quantitative information about perfusion in the human body, in vivo, and without contrast agent. Unfortunately, the IVIM perfusion parameter maps are known to be relatively noisy in the brain, in particular for the pseudo-diffusion coefficient, which might hinder its potential broader use in clinical applications. Therefore, we studied the conditions to produce optimal IVIM perfusion images in the brain. IVIM imaging was performed on a 3-Tesla clinical system in four healthy volunteers, with 16 b values 0, 10, 20, 40, 80, 110, 140, 170, 200, 300, 400, 500, 600, 700, 800, 900 s/mm2, repeated 20 times. We analyzed the noise characteristics of the trace images as a function of b-value, and the homogeneity of the IVIM parameter maps across number of averages and sub-sets of the acquired b values. We found two peaks of noise of the trace images as function of b value, one due to thermal noise at high b-value, and one due to physiological noise at low b-value. The selection of b value distribution was found to have higher impact on the homogeneity of the IVIM parameter maps than the number of averages. Based on evaluations, we suggest an optimal b value acquisition scheme for a 12 min scan as 0 (7), 20 (4), 140 (19), 300 (9), 500 (19), 700 (1), 800 (4), 900 (1) s/mm2.


Introduction
Intravoxel incoherent motion (IVIM) is a method to separate perfusion effects from thermal diffusion effects from images acquired using diffusion-weighted magnetic resonance [1]. A relatively large amount of experimental evidence consistent with the interpretation that the IVIM method can provide in vivo perfusion information has been collected in the last few years [2]. In particular, the IVIM method has been shown to be applicable in a broad range of brain clinical investigations [3], both in the context of hyperperfused lesions such as in high-grade glioma [4][5][6][7][8][9][10][11][12], and hypoperfused lesions such as strokes [13][14][15][16], vasospasm [17], cerebral lymphoma [18] and cerebral death [19]. In addition, the method has shown promise for the survival prognosis in high-grade brain glioma [20,21], in differentiating recurrent tumor from radiation necrosis for brain metastases treated with radiosurgery [22], and as a surrogate a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 marker for the progression of cerebral small vessel disease [23,24]. Unfortunately, IVIM perfusion parameters maps are known to be noisy [25][26][27], and this is particularly harmful for the detection of hypoperfused lesions, because the quality of the IVIM signal equation fit decreases with decreasing perfusion fraction. Optimizing the acquisition parameters might help reduce this drawback.
Several studies on the effect of b-value distribution on the IVIM reconstruction have been conducted. However, to the best of our knowledge no exhaustive evaluation for the optimal choice of b value in the brain has been conducted, considering the number of averaged repetitions. Lemke et al. studied b value distribution in liver [28]. In the brain, Chabert et al. studied 10 subjects with two b value distributions [29]. Hu et. al [30] studied retrospectively 22 healthy males with 12 b-value sets in low and high groups, suggesting total of eight b-values up to 800-1000 s/mm 2 . A further study in eight healthy subjects with an acquisition with four repetitions in the upper abdomen, an optimization derived from based on Cramér-Rao Lower Bounds suggested to use twice as many b-values as b = 0 images [31,32]. In [33], simulations and 16 healthy volunteers were analyzed in a rapid measurement setting evaluating two b value distributions. Reproducibility across sites and scanner models with selected b-value set for two of IVIM parameters f and D parameters only, was evaluated in [34]. Further, inter-rater reliability with eight subjects in various organs, including brain, was studied in [35], and short-term repeatability in [36].
The purpose of this study was to quantify in the brain the conditions to produce optimal IVIM perfusion images. For this, we acquired 20 averages of 16 b values ranging from 0 to 900 s/mm 2 during a 1-hour scan in four volunteers. We studied the noise characteristics of the trace images in a large number of sub-sets of the acquired b values and number of averages. Finally, we studied and optimized the signal-to-noise properties of the IVIM perfusion maps, using various b set selection strategies.

Data acquisition
IVIM imaging was performed on a 3-Tesla clinical system (Siemens, Erlangen, Germany) in four healthy volunteers (1 female 25 y, 3 males 26, 38, and 38 y) with parameters: TR/TE 4000/ 92 ms, FOV 22x22 cm, matrix size 148x148, for a voxel size of 1.5x1.5x6 mm 3 in approval of local ethics committee. The diffusion weighting was applied in three orthogonal directions from which the trace was computed, with 16 b values 0, 10,20,40,80,110,140,170,200, 300, 400, 500, 600, 700, 800, 900 s/mm 2 . The acquisition was repeated 20 times. Images with number of averages 1-5 of the measurements were calculated so that in the analysis, there was 20 images including 1 repetition, 10 average images each containing 2 consecutive repetitions, and correspondingly for average images containing 3,4 and 5 consecutive repetitions each. The complete analysis flow is shown in Fig 1. This study was approved by the Commission cantonale (VD) d'éthique de la recherche sur I'être humain (Protocole 322/11). Written consent was obtained from all participants.

IVIM model fitting
The IVIM imaging data were post-processed using FSL (FMRIB's Software Library, www. fmrib.ox.ac.uk/fsl, 5.0.4) [37], and locally written python and C++ code. The b0 images were co-registered to the first b0 image. The co-registered b0 image were averaged, after which all other b-value images were co-registered to the averaged b0 to assess motion during scanning session. The DWI decay curves were fitted with Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm implemented in the Dlib-ml C++ library [38]. We followed the fitting procedure described in [26], the fittings were initialized with slopes and perfusion fraction values calculated analytically from the log-transform of the signal and an extrapolated line at b = 0, from the tail values selected according to the available b values (see below) in the images so that all b values > = 300 s/mm 2 were included. The D parameter was fitted first with the values at the tail of the decay curve, and then fixed for the fitting of the f and D � parameters. This provided two-stage fittings of the curve with two and three degrees of freedom in the two stages of the fitting procedure, correspondingly. It is to be noted, that in special case of evaluating only three b-values, the fitting procedure is reduced to be identical to determination of initialization of values, as in such situation there are no other data points available for the fitting.

Analysis set-up
Post-processing was done for all acquired images and for three sets of b-values combinations, which all contained the fix 3 b values: b = 0, 200, and 900 s/mm 2 , and which were produced as follow: • In the first set, b values below 200 s/mm 2 were selected, following the strategy of maximum sampling in a b value range of the IVIM effect.
• In the second set, b values above 200 s/mm 2 were selected to follow the approach of maximizing good starting estimate of D [39].
• Finally in the third set, one value above 200 s/mm 2 and one value below 200 s/mm 2 were selected alternatively, to give emphasis in the mid-range of b-values, as study in [40], has been shown to improve accuracy in estimation of fraction f between two exponential decays.
The 3 strategies of b values were then pooled in one ensemble, and further augmented with manually selected b value sets to homogenize the sets and to evaluate previous work on IVIM brain [33], to a final ensemble of 95 different sets of b values (Appendix A in S1 File).  Values 200, 900 were fixed to follow suggestions of earlier b value set optimization studies [29,33], and were considered as suggested optimal for optimizing f parameter.

Brain segmentation
The brain was segmented in grey matter (GM) and white matter (WM) with average of b0 image in 1 hour acquisition. We applied manually explored b0 image intensity value thresholds for GM and WM, followed by manual edits to remove artifacts (minor intensity inhomogeneity, noise), using ITK-SNAP (www.itksnap.org, version 3.8.0). The voxels of the GM and the WM maps were then pooled for further analysis. The WM region was eroded with one voxel before extracting voxel values, to address potential partial volume effect from neighboring regions.

Standard deviation analysis inside the GM and WM of the original data
To get an overview of the noise characteristics of the original data, the coefficient of variance (i.e. standard deviation divided by the mean) of the voxel intensity values in the GM and WM was calculated for each b value and each average (i.e. 1 to 20 averages) trace images, and plotted as heatmaps for the first of the scanned subjects.

Standard deviation analysis inside the GM and WM of the IVIM parameters
The standard deviation of the IVIM parameters (f, D, D � ) in GM and WM were calculated for the sets of b values and for the averaged images containing 1-5 repetitions. Each set of different b value was analyzed separately for the number of repetitions in average images ranging from 1 to 5. The number of average images were: 20 for 1-average, 10 for 2-average, 6 for 3-average, 5 for 4-average, and 4 for 5-average, total 45 images for each b value setting. The corresponding number of b0 images were 1, 2, 3, 4 and 5. The b0 images were averaged in the latter four partitions.

Optimal b value set and the number of averages for a given scan time of 3 min, 6 min and 12 min
Finally, we assessed the optimal set of b values and number of averages to produce the most homogenous D � maps possible, from all subsets analyzed (see above and Appendix A in S1 File). In addition to regression analysis with four subject and 95 different b-value sets, we performed Cramér-Rao Lower Bound (CRLB) search [41] using measured noise in b-values in range 0-900 s/mm 2 across repetitions, optimizing for all IVIM parameters together and allowing individual b values to have different number of averages to each other.

Statistical analysis
Statistical significance of monotonic upward or downward trends in number of averaged images and number of b values were tested with the Mann-Kendall trend test. Effect of b-value selection, number of repetitions, and the number of b-values were used in multivariate analysis to analyze the relative contribution of these variables to standard deviation IVIM parameter maps in general linear model (GLM) analysis for overall effect of b value set, number of b values in set and number of repetitions to Standard Deviation (SD) (in formula SD~b value selection + number of b values + number of repetition). Also, we applied GLM to evaluate effect of including individual b value in b value set. P-values less than 0.05 after Bonferroni correction were considered statistically significant, while raw p-values are reported unless otherwise noted. All statistical tests were done in RStudio environment (v 1.1.383, 2017 RStudio, Inc.).

Standard deviation inside GM and WM
The coefficient of variance (CoV) as a function of b value and number of averages in the GM and WM (Fig 2) varied in 15%-40% for GM and 9%-10.5% for WM. WM CoV had two peaks at the low number of b-values: One at high b values, due to thermal noise, and interestingly, another one at low b values, most probably due to physiological noise arising from variation in blood flow during the cardiac and respiratory cycles. The peak at low b values was more prominent in the GM compared to the WM. This might probably be due to more prominent capillary network in the GM. We observed that this increase in SD is due to the inclusion of low bvalue < 200 s/mm 2 with a larger signal SD due to physiological noise (see Fig 2), and that number of repetitions did not bring notable improvement to it in four evaluated cases and repetitions up to 20. When b-value sets having the same number of b-values were averaged and four subjects were combined in the GLM analysis, interaction term between the number of b-values and number of repetitions in WM D (p<1.0x10 -6 ), and effect of adding variation between subjects to the model in D � and f (p<1.0x10 -6 ) were found to be significant, and were thus addressed in the subsequent analysis. We observed that the number of b-values was affecting SD of WM and GM in IVIM parameter maps more than the averaging of the signal (p<1.0x10 -6 ). Overall, with all b-value sets and number of averages, when testing the effect of an individual b-value set to SD, selection of individual b-value set was explaining SD significantly better than signal averaging (p<1.0x10 -10 ).

PLOS ONE
Signal to noise and b-value analysis for optimal intra-voxel incoherent motion imaging in the brain We also evaluated the b value sets and b values individually to see their effect on SD. The suggested b values which had statistical significance (p<0.001 after Bonferroni correction over b values) in explaining SD, and which made a statistically significant decreasing difference to SD, are listed in Table 1. Generally, when other than the fixed b values of 0, 200, 900 s/mm 2 were considered in all b-value sets, the D � had increased SD when using b values 10, 20, 40, 110, 600 s/mm 2 , in accordance with noise measurements in Fig 2. Also, high b-values (with one exception of 10, 110, 500, 600, 700 s/mm 2 ) other than 900, increased SD of f. The same effect was visible when b values sets with a maximum scan time of 3, 6 and 12 minutes were considered.
In CRLB analysis where optimized b value settings were aiming for all three IVIM parameters together, the suggested b values were largely in agreement from suggestion from direct SD evaluations with the ensemble of b value sets in GM and in WM. However, the CRLB analysis suggested different number of averages for selected b values, with non-uniform b value averaging and more emphasis towards using small b values with some of the highest available b values.

Optimal b value set and the number of averages for a given scan time of 3, 6 and 12 min
All of the optimized b value sets in Table 1 were analyzed individually for their SD in GM and WM (D � Fig 3, S1 and S2 Figs in S1 File for f and D), and mean f (S3 Fig in S1 File). We found negligible improvement between 3 min to 12 min for b value sub-set of 0, 200, 900 s/

PLOS ONE
Signal to noise and b-value analysis for optimal intra-voxel incoherent motion imaging in the brain mm 2 , which reflects situation where direct analytical estimation of the IVIM parameters is applied from the trace images without fitting. The three b value sub-set was providing reasonably good D � parameter maps in terms of SD, with some expense in quality of f values, while noting that low SD for those images may come in expense of losing true D � signal.
In other b value sets, low SD in D � together with reasonable f was found with 12 min scan (0 (8), 10 (8)   was apparent difference between optimized 12 min sub-set obtained from GLM approach, and 6 min and 12 min sub-set with CRLB approach (D-F in Figs 4 and 5). From all of the evaluated b-value subsets, the 12 min sub-set suggested from CRLB analysis provided most stable parameter maps, for both f and D � (F in Figs 4 and 5).

Discussion
In this extensive study on the dependence of the homogeneity of IVIM parametric maps on the set of values and number of averages in the brain, we found an expected general trend toward an increase in image homogeneity with an increasing number of averages and a less trivial relationship between image homogeneity and the number of different b values used. We found that the inhomogeneity of the IVIM parametric maps increased significantly with the inclusion of b values smaller than 200 s/mm 2 , which showed large SD due to physiological noise, probably mainly due to cardiac pulsation and to a lesser extent to the respiratory cycle. This effect was more pronounced in the GM compared to the WM, but was not observed for the parameter D, because the calculation of this parameter does not include values with physiological noise. D maps produced with two b values above 200 s/mm 2 and 2 averaged repetitions each showed already excellent and almost optimal image homogeneity, and only negligible improvements could be obtained if more repetitions or b values were added to the analysis, while f and D � benefitted from signal averaging especially when applied to lower b values < 200 s/mm 2 . While there were differences in absolute values f and D � between parameter maps using optimized b value sets (Figs 4 and 5), the IVIM parameter values with 12 min and multiple b values (D and F in Figs 4 and 5) were generally within values expected to be found in healthy brain tissue [16]. Overall, taking also into consideration the subjective aspect of the image, the set of parameters 12 min (0 (7), 20 (4), 140 (19), 300 (9), 500 (19), 700 (1), 800 (4), 900 (1) s/mm 2 ) seems to provide a good compromise to evaluate f and D � with reasonable low variation. The choice of optimal scan parameters is in general not trivial, due to the interdependence of a relatively large number of parameters (such as scan time, resolution, TR, TE, bandwidth, for DWI the choice of the profile of the diffusion-sensitizing gradients) and effects (such as hardware related noise, eddy currents, field inhomogeneities, patient related physiological and motion artefacts). In addition, image homogeneity is not necessarily identical with holding maximal physiological or pathological information. Our analysis suggests that the set of b values should be selected with care for IVIM perfusion imaging, particularly when aiming for high-quality D � and f parameter maps, and that optimizing for IVIM parameter maps may be required to be performed separately as optimizing for certain parameter comes with the expense of another. The optimal number of repetitions in average images differs between b values, due to need for addressing physiological noise in the low b values, and thermal noise in the high b values. In particular, it seems reasonable to suggest increasing the number of repetitions at low b values (<200s/mm 2 ) to average over physiological noise. In addition to compensation of physiological noise, placing more repetitions to mid-range of b values (200-400 s/ mm 2 ) when making average images improves the estimation of fraction f, and D � . We speculate that mid-range b values are located where dominance of D � changes to D, and therefore they have an additional contribution in finding the fraction f between the two components of the model. A further option to consider to decrease physiological noise effects, although it increases scan time and is difficult to implement in the daily clinical routine because not very practical, is to use triggered acquisitions, such as cardiac gating [42] and respiratory triggering. The use of triggered acquisition has already been shown to decrease measurement variability in IVIM liver imaging [43]. Also, cerebrospinal fluid (CSF), which also undergoes periodic pulsations driven by cardiac and respiratory forces, and participate in the IVIM signal through partial volume [44,45], could be suppressed using an inversion-recovery pulse [46,47], or even better, a T2-prepared inversion pulse [48], which permits a better recovery of blood signal with similar suppression of the CSF signal.
There is to our knowledge no study on the choice of b value in the brain for IVIM perfusion imaging. Outside brain, typically 5 to 16 b values have been used to sample perfusion and diffusion IVIM effects [49]. Using Monte-Carlo simulation, Lemke et al. suggested an optimized set of 16 b values for the measurement of a low, medium and high perfusion fractions using Monte-Carlo simulations [28]. Cho et al. optimized the set b value using Monte-Carlo simulations and applied it successfully to IVIM parameters estimation in breast cancer [31]. Pang et al. evaluated different combinations values in prostate cancer [50], Ter Voert et al in the liver [51], and Dyvorne et al. found a subset of 4 optimized b values for the liver [52].
Our study had several limitations. Only four subjects were scanned, while each case was analyzed extensively. Only one sequence of acquisition with maximum b value of 900 s/mm 2 was used, while sub-sets with the same acquired data was analyzed. Similarly, one acquired fitting approach was used, while the fitting procedure may affect the quality of measured IVIM parameter maps [26]. In our fitting approach, the perfusion fraction and pseudo diffusion were not expected to affect the fit of D (with >300 mm/s 2 ). We evaluated b value sets only in terms of precision; the acquisition parameter optimization for accuracy or jointly for precision and accuracy was beyond the scope of this work and is left for future investigations. It is left to future studies to assess disease related IVIM parameter values. Preliminary evaluation with one volunteer addressed the effect of the number of repetitions to the distribution of DWI signal intensity values [53]. Neither short term repeatability nor reliability was addressed. Future work should evaluate the repeated measurements with the suggested b-value sets in test-retest setting with larger number of subjects. Our best found b value set requires 12 min scan, which is to be considered relatively high in the clinical setting. Additional work should compare practical usefulness and potential benefit of the created higher quality IVIM parameter maps in specific clinical applications, such as stroke imaging or radiomics feature extraction in tumor imaging.
In conclusion, we evaluated the signal to noise dependence of IVIM trace image data and its effect on the quality of IVIM parameter maps of D, f and D � . We found that physiological noise at low b value and thermal noise at high b values propagates to the parameter maps. We suggest compensating for this effect by increasing the number of averages in those b values, with additional weighting at mid-range (200-400 s/mm 2 ) of b values. Overall, the set of parameters (0 (7), 20 (4), 140 (19), 300 (9), 500 (19), 700 (1), 800 (4), 900 (1) s/mm 2 ) seems to provide a good compromise to evaluate f and D � with reasonable low variation.