Optimizing the intrinsic parallel diffusivity in NODDI: An extensive empirical evaluation

Purpose NODDI is widely used in parameterizing microstructural brain properties. The model includes three signal compartments: intracellular, extracellular, and free water. The neurite compartment intrinsic parallel diffusivity (d∥) is set to 1.7 μm2⋅ms−1, though the effects of this assumption have not been extensively explored. This work investigates the optimality of d∥ = 1.7 μm2⋅ms−1 under varying imaging protocol, age groups, sex, and tissue type in comparison to other biologically plausible values of d∥. Methods Model residuals were used as the optimality criterion. The model residuals were evaluated in function of d∥ over the range from 0.5 to 3.0 μm2⋅ms−1. This was done with respect to tissue type (i.e., white matter versus gray matter), sex, age (infancy to late adulthood), and diffusion-weighting protocol (maximum b-value). Variation in the estimated parameters with respect to d∥ was also explored. Results Results show d∥ = 1.7 μm2⋅ms−1 is appropriate for adult brain white matter but it is suboptimal for gray matter with optimal values being significantly lower. d∥ = 1.7 μm2⋅ms−1 was also suboptimal in the infant brain for both white and gray matter with optimal values being significantly lower. Minor optimum d∥ differences were observed versus diffusion protocol. No significant sex effects were observed. Additionally, changes in d∥ resulted in significant changes to the estimated NODDI parameters. Conclusion The default (d∥) of 1.7 μm2⋅ms−1 is suboptimal in gray matter and infant brains.


Introduction
In diffusion weighted magnetic resonance imaging (dMRI), biophysical models are used for relating the dMRI signal to microstructural properties in white and gray matter [1][2][3][4][5][6][7]. Neurite orientation dispersion and density imaging (NODDI) [7], separates the brain tissue microstructure landscape into three compartments: intracellular space or neurites (axons, dendrites), extracellular tissue matrix, and a free water compartment. In spite of its shortcomings, much like the case of other techniques such as diffusion tensor imaging (DTI), NODDI offers useful information and has been widely used in the investigation of brain tissue microstructure as a function of early development, cognitive function and aging as well as a number of neurological conditions [8][9][10][11][12][13].
Biophysical modeling relies on simplifying assumptions about the tissue properties. Besides the separation of tissue into three compartments, the NODDI model is characterized by the following features or assumptions. Each compartment is represented by its own normalized signal and volume fraction. Water exchange between compartments is assumed negligible. Neurites are modeled as sticks (cylinders of zero radius) for capturing highly anisotropic architecture of neuronal tissue. Diffusion inside the neurites is described by a diffusivity parallel to the sticks, which is referred to as the intrinsic diffusivity, d k , and zero diffusivity perpendicular to them. The orientation distribution function (ODF) of the sticks at each voxel is modeled by an axially symmetric Watson distribution, W [14], which itself is characterized by a concentration parameter κ and mean orientation μ. Highly aligned sticks like those seen in white matter bundles are reflected by high κ values, while highly dispersed sticks like those seen in gray matter fibers are reflected by low κ. The extra-neurite compartment is directionally correlated with the neurite ODF, and modeled as a Gaussian anisotropic compartment.
The local parallel diffusivity of the extracellular space is set equal to the intra-neurite intrinsic diffusivity, d k , whereas the perpendicular diffusivity d ? is related to the neurite water fraction, f ic , and d k by the mean-field tortuosity model [15] as d ? = (1 − f ic )d k . The free-water compartment is modeled as having isotropic diffusion with free diffusivity d iso = 3 μm 2 � ms −1 and volume fraction f iso . The intrinsic diffusivity d k for NODDI is assumed to be 1.7 μm 2 �ms −1 . This is selected to be a biologically reasonable value, which approximates the mean parallel diffusivity from DTI in a healthy coherent white matter region [1]. The parameters that are estimated from acquired data using non-linear gradient descent and heuristic initializations are the water fraction of the neurite compartment f ic , the concentration (κ) and mean orientation (μ) of the Watson distribution. The signal S(b, g) from the unit diffusion gradient direction g for sticks oriented along unit vector n and b-matrix (bgg t ) is given by where xWðn; μ; kÞdn; such that g; n; μ 2 S 2 : A ic , A ec , and A iso , are the intra-cellular, extra-cellular, and free-water isotropic compartments signal contributions respectively. W(n, μ; κ) is the Watson distribution with κ concentration and oriented along μ. S 0 is the un-attenuated signal i.e. S(0, 1), and D e (n) = f ic d k nn t + (1 − f ic ) d k I 3 is the axially symmetric extra-cellular apparent diffusion tensor.
Recently, the model assumptions have become a topic of discussion in the field. The more relevant discussions have focused around the fixed parallel intrinsic diffusivity and equality between parallel intrinsic diffusivity of the extra-and intra-cellular compartment [16]. Of the two, the equality assumption is the more difficult to assess, but has been explored in several reports. While no consensus has been reached, most reports suggest that the intra-cellular parallel intrinsic diffusivity is larger than the extra-cellular one [17][18][19][20]. Yet, this may depend on tissue type [21] and most studies have focused on white matter. Also, some sustain that the differences may not be substantial and independent validation experiments are needed [16].
With respect to the fixed diffusivity assumption, [22] proposed a framework for relaxing the fixed constraints. The study reported that microscopic parallel diffusivities varied across the brain, and that white matter values where considerably larger than that assumed by NODDI. It is important to note, however, that the ability to "estimate intrinsic diffusivity" in [22] comes at a cost, which is the reduction to two-compartment model. In this sense, then, the model in [22] is not fully comparable to the model in NODDI, since the former gives up on estimating the CSF volume fraction. Others [23,24] have also relaxed the fixed diffusivity constraint and made it a free parameter. However, this resulted in unwanted effects on the other parameters in the form of unstable and degenerate estimates. Originally, it was considered unlikely that variation in d k across regions and subjects was significant enough to remove trends in the estimated parameters [1]. Additionally, the fixing of d k is necessary for stability in the parameter estimates and for speeding up convergence of the fitting procedure. Plus, the value that was chosen was the value that minimized the fitting errors for voxels in the midsagittal plane of the corpus callosum [1].
Taking into consideration the non-consensus on the equality assumption and the still widespread use of the technique, here we choose to build on earlier work [25] which investigated the assumption of fixed diffusivity. This consisted of looking at optimal values of the parallel intrinsic diffusivity according to the model residuals. Results suggested that the default value was reasonable in white matter, but it was sub-optimal in gray matter. While recent publications have found our method useful [26,27], this earlier work only considered a single axial slice from three age matched participants and dMRI data acquired with the same imaging protocol. For this reason, we propose a more extensive investigation that considers a diverse array of data in terms of age populations, imaging protocols, and is conducted across the full brain.
We choose to do this only for the case of the original NODDI technique and not for its variants [28,29], as the vast majority of applications have implemented the original version.

Data
Datasets acquired with multiple b-value sequences (suitable for implementing the NODDI technique [7]) were readily available for use in this work from a number of existing neuroimaging studies. These include imaging data from individuals with a broad range of ages and acquired with imaging protocols that vary in regards to number and magnitude of b values as well as number of diffusion encoding directions. dMRI sets include infants, adolescents, young adults, adults, and aging adults. All dMRI sets were collected on a 3T MR750 Discovery scanner (General Electric, Waukesha, WI). A brief description of each study is provided below and details are summarized in Table 1. All procedures for the included studies were approved by the University of Wisconsin-Madison Institutional Review Board.
Neonates study (Neonates). Participants are from a study of neonatal white matter microstructure. Diffusion scans contain 6 non-diffusion weighted volumes and diffusion encoded along 63 directions. Other imaging parameters include: TR/TE = 8400/94ms, 2mm isotropic resolution.
Teen study (Teen-I). Participants in this cohort were drawn from a study of emotion in adolescents. Diffusion scans contain 6 non-diffusion weighted volumes and diffusion encoded along 62 non-collinear directions. Other imaging parameters include: TR/TE = 8400/94 ms and 2 mm isotropic resolution.
Twin teen study (Teen-II). Participants are from a cohort of 130 adolescent twins. Diffusion scans contain 6 non-diffusion weighted volumes and diffusion encoded along 57 directions. Other parameters include 2.0 mm isotropic resolution and TR/TE = 8000/66.2 ms.
Midlife meditation study (Midlife-I). Participants in this cohort were drawn from a study of emotion regulation, asthma, and sleep part of the National Center for Complementary and Alternative Medicine (NCCAM). Diffusion scans contain 6 non-diffusion weighted volumes and diffusion encoded along 57 directions. Other parameters include 2.0 mm isotropic resolution and TR/TE = 8000/66.2 ms.
Preclinical Alzheimer's disease risk study (AD-Risk). Participants were cognitively unimpaired individuals with and without increased risk for Alzheimer's disease recruited from the Wisconsin Registry for Alzheimer's Prevention and Wisconsin Alzheimer's Disease Research Center. Diffusion scans contain 7 non-diffusion weighted volumes and diffusion encoded along 105 non-collinear directions. Other imaging parameters include: TR/TE = 6500/102 ms, sagittal slices 3mm thick, and in-plane resolution of 2.5 mm × 2.5 mm.

Intrinsic diffusivity optimization
Optimality of d k = 1.7 μm 2 �ms −1 was considered by minimizing the model residuals as in [1]. Other biologically plausible values were considered for comparison in the interval [0.5, 3.0] μm 2 �ms −1 in increments of 0.1 μm 2 �ms −1 . For each of the 26 values, the model was fitted to the measured dMRI signal voxel by voxel using the Matlab (The MathWorks, Inc., Natick, MA) NODDI toolbox (http://nitrc.org/projects/noddi_toolbox). Predictions of the signal were then calculated at each voxel from the estimated parameters. With the measured and predicted signals for each d k setting, the root mean squared (RMS) residual was computed at each voxel. A linear search across the 26 different points was then performed for locating the value of d k corresponding to the lowest RMS residual value per voxel. This was done in order to generate a brain map of optimum d k and for looking at the optimality of d k = 1.7 μm 2 �ms −1 across brain regions.

Tissue type segmentations
White matter (WM) and gray matter (GM) masks were obtained for each individual in order to probe the influence of tissue type on the fitting residuals. This was conducted by running FSL's [30] FAST tool [31] with meand diffusivity (MD) and fractional anisotropy (FA) maps as input channels. FA and MD maps were obtained from tensor fits using a weighted least squares method. For the AD-risk study, the shells with b values of 4.8 and 7.5 ms�μm −2 were excluded in the tensor fitting.

Influences of age, sex, and protocol
The availability of data from the various studies allowed for selection of several subgroups that were organized according to age, sex, and protocol. With the data sets organized this way, the residual analysis was performed for the following three cases: Groups for age analysis. Subgroups of 16 participants (roughly half male and half female) were selected from three studies as follows: One group of 16 subjects age approximately one month from the Neonates study. One group of 16 subjects ages between 10 and 19 from the Teen-II study. Six groups, 16 subjects each, extracted from the Midlife-I study, for the six age categories of: 20-29, 30-39, 40-49, 50-59, and 60-65 years. Note that, except for the neonates, these data sets have matching protocols so that the main difference per category was age. In order to help disambiguate protocol from age influences, two additional scans were obtained for one adult: one with the infant protocol and one with the adult protocol.
Groups for sex analysis. From the Teen-I study, two subgroups one of 30 females and one of 30 males were selected. The two groups were matched by age (13 years old), so that the main difference between the groups was sex.
Groups for protocol analysis. Two groups of 16 subjects (roughly half females and half males) with ages ranging from 50-59 years were selected, one from the Midlife-I study and one from the AD-risk study. In this case, the assumed main difference between the groups was the acquisition protocol.

Results
The results are organized as follows. (1) We first show how variation in d k translates to variability in the estimated parameters. (2) Then, the model RMS residuals, with respect to d k are shown to differ between tissue types. (3) This is followed by the presentation of voxel-wise optimized d k maps and the ways in which the optimality of d k = 1.7 μm 2 �ms −1 is influenced by age, sex, protocol and tissue type.

Estimated model parameters and d k
Upon completion of the various model fits, the dependence of the estimated model parameters to variations in d k was explored. For all model parameter maps, mean values were calculated over WM and GM regions. Fig 1 shows these values plotted with respect to d k . This analysis reveals a dependence on d k for all three parameters irrespective of the study as well as variation in the comparison of parameters among the studies. For example, for gray matter values of d k that are lower than the assumed value would weaken variation of the neurite density across the teen and adult subjects. On the other hand, lower values of d k in gray matter would enhance differences in the ODF concentration parameter across all studies.

Model residuals with respect to d k
The values of d k that result in the closest agreement between the measured and predicted signals as dictated by the RMS residuals were explored next. For each of the resulting 26 RMS residual maps, mean values across WM and GM were calculated. These are plotted with respect to d k in Fig 2. These plots reveal that d k values in GM that achieve minimum RMS residuals deviate from the default setting (1.7 μm 2 � ms −1 ) for all studies. For WM, the lowest values in the RMS residual curves occur in the neighborhood of the default setting. Notably, most WM curves, with the exception of the Neonate study, exhibit broad ranges of lowest values as compared to the majority of GM curves. The better defined minima in WM for the infants could be related to a maximum b value that better matches the characteristics of the young brain tissue (i.e. longer T 2 , higher water content) such that diffusion weighting in the signal is more adequate. This is in line with the AD-risk study, which used a max b-value of 7.5 ms�μm −2 and the WM RMS residual curves are noticeably more convex. The remaining studies have maximum b values that are likely on the low end of the optimal range for capturing effects of more restrictive intra-neurite environment, which could help explain the shallower curves in WM.
Optimized d k maps. Optimum intrinsic diffusivity whole brain maps were created by selecting at each voxel the value that corresponded to the smallest RMS residual. Resulting optimal d k maps were median filtered using a box kernel (size 3x3x3 in voxels). The filtering helps to enhance the underlying structure in the distribution of values between white and gray matter. The pattern is spatially consistent before filtering, but it is more difficult to appreciate due to the shallowness of the residual curves for white matter for some of the studies (i.e. Teen-I, Teen-II, Midlife-I). Fig 3 shows optimum d k maps for one subject selected randomly from each of the six studies. These maps reveal moderate to substantial contrast between WM and GM regions. The non-uniformly distributed d k in these maps suggests that d k = 1.7 μm 2 �ms −1 may not be appropriate for all brain regions and all populations.
Optimized d k and age. Optimal d k maps were computed for the cohort organized by age group. These maps were further masked into WM and GM regions and average optimal d k values were obtained for each region. Fig 4A shows the distributions of average optimal d k values according to age group. These plots show distinct distributions between WM and GM average optimal d k for all age groups greater than 10 years. The majority of WM optimal d k values are distributed around the default operating point (1.7 μm 2 �ms −1 ), while all GM optimal d k values are reduced by at least 0.4 μm 2 �ms −1 . These trends are fairly consistent for all distributions corresponding to ages 10 years and above. For the group of less than 1 year (i.e. infants) there is a greater degree of closeness between the WM and GM distributions of average optimal d k in comparison to the rest of the age groups. In this case, optimal d k values fall approximately between 1.4 and 1.5 μm 2 �ms −1 for WM and 1.2 and 1.3 μm 2 �ms −1 for GM. For each age group, a pairwise t-test was conducted in order to assess statistical significance of the tissue-wise difference in average optimal d k . The testing showed that for all groups the optimum d k for GM and WM were significantly different (p < 0.01). A multiple group test revealed that average optimal d k is significantly different between the infant and the rest of the older age groups in both WM and GM, while no significant differences were found between any of the other groups. The mean optimum d k values for the two additional scans on one adult, Fig 4B, are in agreement with those values from same age group for both the infant and adult protocols, pointing to the fact that the observed trends are more a result of differences in age rather than in protocol.
Optimized d k and sex. Optimal d k maps were also computed for the cohort organized according to sex. Average optimal d k values were obtained across WM and GM regions. The distributions of average optimal d k values according to sex category revealed significantly different values between WM and GM with ranges that are consistent with the same age group (10-19 years) from the age-dependence analysis. Yet, no significant effects of sex were observed, a result that is compatible with the age-dependent analysis, which also showed no obvious split in optimal d k between the male and female participants.
Optimized d k and acquisition protocol. Finally, optimal d k maps were also computed for the cohort of subjects with data acquired under differing imaging protocols. Based on the observation that the age dependence analysis revealed no obvious age effects for ages 10 and above, data from the Teen-I study was also included in this cohort despite the unmatched age.  The data sets from the groups with the highest b value protocol show optimal d k values that are lower than d k = 1.7 μm 2 �ms −1 . In GM, this analysis reveals a seemingly decaying trend in optimal d k distributions with respect to maximum b value. Pair-wise t-tests revealed all distributions in GM are significantly shifted down compared to WM distributions, consistent with the observed trend in the previous age and sex comparisons.

Discussion
In this work we studied the implications of diverse multi-shell dMRI data on the optimality of the NODDI parallel intrinsic diffusivity d k = 1.7 μm 2 �ms −1 . The results suggests model assumptions for d k may be suboptimal for specific ages (i.e., infants) and also in gray matter. Although not examined, the optimality of d k = 1.7 μm 2 �ms −1 may also vary with pathology. We also observed that suboptimal d k leads to biases in the estimated NODDI parameters. Of Optimizing the intrinsic parallel diffusivity in NODDI particular interest would be a drop of neurite density in gray matter, a result that is consistent with findings in a recent study [32].
For gray matter, the optimal d k is significantly lower than 1.7 μm 2 �ms −1 . In white matter of the adult brain, values of the optimal d k hover around the default d k = 1.7 μm 2 �ms −1 and just below the range [1.9, 2.2] μm 2 �ms −1 of intra-axonal diffusivities in white matter reported elsewhere [33], though, further analysis (see below) suggested high FA regions in the adult brain contained average optimal d k that falls in this range. It is important to note, however, that the ranges of residual minima in white matter were broad and shallow.
Further, a finer grain analysis indicates that protocol and age also have an impact on the optimality of d k = 1.7 μm 2 �ms −1 , both in white and gray matter. The age-dependence analysis revealed that the newborn brain optimum d k in white and gray matter are closer in value compared to those in the adult brains. Both WM and GM values of optimum d k are different, however, from that used in recent studies [24,34] that have implemented NODDI in the infant brain. The value in these studies was set to 2.0 μm 2 �ms −1 , likely because average DTI axial diffusivity in high FA regions (see below) of newborns is close to this number. Interestingly, at this setting, and using the 1.7 μm 2 �ms −1 for the adult brain, nearly any difference between the infants ODF concentration parameter and that of the older age brains would be removed in gray matter. Using the optimal setting for d k , would result in appreciable differences in ODF concentration parameter between the adults and the infants. On the other hand, using the optimal settings for d k , would weaken the differences in intra-cellular volume fraction between the infant and the older subjects.
This analysis also showed that in the adult brain optimum intrinsic diffusivity values do not vary appreciably with age. However, optimum d k values in GM are much lower than those in WM and different from the default d k = 1.7 μm 2 �ms −1 . With regards to imaging protocol, high b value and more diffusion weighted volumes appeared to yield less noisy and more stable optimal intrinsic diffusivity and NODDI parameter estimates.
In hindsight, the sub-optimality of the assumed d k = 1.7 μm 2 �ms −1 in gray matter is not surprising since this value was originally estimated in the adult corpus callosum [1]. Also, suboptimality of the current state of the model in gray matter might be related to the idea that the impermeable 'stick' representation of neurites is only adequate for myelinated axons but not for dendrites or non-myelinated axons, as others have suggested [35]. In general, however, the variation of optimal intrinsic diffusivity across tissue types is in agreement with findings of axial diffusivity variation across the brain reported in [32].
Studies have reported decreasing DTI axial diffusivity with age [36][37][38]. Thus, the trend of increasing optimum d k with age in WM seen in Fig 4A prompted further investigation. For comparison, averages of DTI axial diffusivity over WM and GM were computed for all subjects in all age groups, Fig 6A and 6B. The resulting axial diffusivity age trajectories are in agreement with previous studies [36][37][38]. However, while these numbers pertain to the whole of white matter, regional differences in developmental trajectories of DTI quantities in the neonate brain have been observed [39]. In the infants, a further look into high FA (>0.5) regions, which reduce to portions of the corpus callosum and the internal capsule, revealed that average optimal d k in these regions is comparable to that seen in the adult global WM. These regions in the infant are thought to be myelinated by one month after birth and to have higher fiber coherence than other white matter areas [39]. The lower FA regions (not shown) in the infant brain, which presumably reflect less or not-yet myelinated axons and or lower fiber coherence, exhibit values of average optimal d k that are similar to those of whole WM. For the older age groups, the axial diffusivity distributions in gray matter mimic those of the optimal d k . For the infants, this is true for both the WM and GM distributions. Also, the optimal d k distribution separation between WM and GM is less for the infants than for the rest of the older age groups. Based on all this, it could be speculated that the neonatal gray matter neurites and white matter neurites are more similar than they are in the adults. Therefore, the model fit for less coherent, nonmyelinated fibers in neonatal white matter would be more similar to the fit in the neonatal gray matter than to the fit in the adult whole WM, as it is illustrated in Fig 4A.

Assumed equal intra-and extra-cellular d k
As mentioned in the introduction, another important assumption of the model is that of equal d k in the intra-and extra-cellular compartments. Thus, one of the limitations of this work is that it was carried out while maintaining this and other assumptions of the model. In order to glimpse at the appropriateness of this assumption as it pertains to this work, a similar model residual optimization was done for the case where the extra-to intra-cellular parallel diffusivity ratio took on values different than 1. In this case, the model was adjusted so that the extra-cellular diffusivity was expressed as a fraction of the intra-cellular diffusivity value. The ratios ranged from 0.1 to 1.3 in 0.1 increments. In this case the number of fits increases dramatically for each subject (26x13 = 338), as do the memory and time requirements. Therefore, the analysis was restricted to two subjects, one infant and one adult, and for a single axial slice. Additionally, in order to circumvent the long fitting times using the Matlab tool box, for this part of the analysis the AMICO NODDI toolbox [40] was used instead.  Optimizing the intrinsic parallel diffusivity in NODDI results have been presented by other reports [23,32], which show that unconstrained multicompartment biophysical models lead to issues in parameter estimation. Particularly, the shape of the lowest residual regions in these contour plots is evocative the pipe-like structures for the fitting cost function landscapes of non-constrained multi-compartment models reported in [32] and [23].

Generalizability
Great effort was made in order to make this as an exhaustive analysis as possible in terms of the diversity of the data that was used. Yet, we acknowledge it is not fully generalizable to the wider scope of neuroimaging biophysical modeling diffusion research, for which it should consider, among others, conditions of pathology and ex-vivo experiments. Nonetheless, we believe that these results are highly informative considering the broad range of ages and imaging protocols investigated. Finally, this analysis was performed for Watson-NODDI only, not for other flavors of the technique which include Bingham-NODDI [28] or NODDIx [29].

Conclusion
In this work, dependence of the estimated NODDI parameters on the parallel intrinsic diffusivity d k was observed. Optimum d k in white matter of the adult brain is similar to the currently used value d k = 1.7 μm 2 �ms −1 but significantly lower in gray matter. Optimal d k is also lower than the default value for the newborn brain in white and gray matter. Effects of imaging protocol on the optimum d k were also observed. Finally, it is important to consider that, despite its limitations, recent analysis suggests that NODDI metrics provide information that is congruent with histologically equivalent metrics [41].