When Structure Affects Function – The Need for Partial Volume Effect Correction in Functional and Resting State Magnetic Resonance Imaging Studies

Functional and, more recently, resting state magnetic resonance imaging have become established tools to investigate functional brain networks. Most studies use these tools to compare different populations without controlling for potential differences in underlying brain structure, which might affect the functional measurements of interest. Here, we combine a simulation approach with an evaluation of real resting state magnetic resonance imaging data to investigate the potential impact of partial volume effects on established functional and resting state magnetic resonance imaging analyses. We demonstrate that differences in the underlying structure lead to a significant increase in detected functional differences in both types of analyses. The largest increases in functional differences are observed for the highest signal-to-noise ratios and when signal with the lowest amount of partial volume effects is compared to any other partial volume effect constellation. In real data, structural information explains about 25% of the within-subject variance observed in degree centrality, an established resting state connectivity measurement. Controlling this measurement for structural information can substantially alter correlational maps obtained in group analyses. Our results question current approaches that evaluate these measurements in diseased populations with known structural changes without controlling for the underlying structural differences.


Introduction
Functional and resting state magnetic resonance imaging (fMRI and rsMRI) have become established tools to investigate brain function and connectivity. Numerous studies in the past decades have applied fMRI and, more recently, rsMRI-based measurements to evaluate experience and clinical phenotype-related functional alterations. These studies have provided a vast heterogeneity of findings which have been attributed to group differences in brain functional activity or connectivity. However, the effect of potential between-group structural differences has not been systematically investigated. These structural differences are likely to give rise to partial volume effects, defined as differential relative contribution of grey matter, white matter, or cerebrospinal fluid to the observed voxel- or region-wise signal [22]. In our study, we further assume that the observed voxel-wise signal containing these partial volume effects is a linear combination of the signal from the different tissue types. These effects have been largely ignored under the implicit assumption that they are controlled for in fMRI and rsMRI analyses. This is because within-subject statistical maps [23,24] and within-subject correlational maps [25,26] are computed first and are then used to evaluate between-group differences. Moreover, the signal in typical fMRI and rsMRI analyses might be sufficiently strong to make differences in noise levels across groups negligible, in particular when applying strict correction for multiple comparisons. Therefore, it might be argued that partial volume effects introduced by underlying structural differences, potentially present in fMRI and rsMRI analyses, do not strongly affect the final statistics in these kinds of analyses.
However, this assumption is rather questionable especially because of the relatively low spatial resolution of fMRI and the thickness of the targeted cortical structures, which is in the range of 2-3 millimeters [27]. Thus, even subtle structural between-group differences might lead to differences in the amount of partial volume effects contributing to the signal in the corresponding functional voxels and correspondingly to differences in the observed signal [28]. These limitations are particularly applicable to studies of aging and of neurodegenerative disorders which are generally characterized by grey matter loss. For example, reductions in grey matter volume in a region of interest would inevitably lead to increased contribution of cerebrospinal fluid (CSF) to signal measured in the corresponding region. The increased contribution of CSF would lead to higher noise and correspondingly reduce the within-subject statistical estimates such as beta coefficients. The changes in within-subject statistics due to increased partial volume effects would then transfer to second-level statistics, erroneously suggesting between-group functional differences.
Similarly, rsMRI studies commonly use measurements that are based on within-subject metrics such as correlation coefficients computed between different regions over time. These functional connectivity maps (e.g. Fisher's z-transformed or original) are then used to directly compare the different groups or to extract other more advanced connectivity indices such as the total flow or the clustering coefficient. With the same arguments as above, between-group differences in grey matter volume in a region of interest would lead to greater noise level in this region in the group with lower grey matter volume. Correlation coefficients are well known to be strongly affected by noise. Correspondingly, decreased average correlation strength would be observed in this group in the affected region.
Here we investigate the effects of differential partial volume contribution from grey matter, white matter, and cerebrospinal fluid on results of typical fMRI and rsMRI analyses.

Generated data
All data generation steps and statistical analyses were implemented in Matlab 7.12 (MathWorks Inc., Sherborn, MA). To obtain realistic parameters for the data generation procedure, a publicly available fMRI dataset (auditory block design experiment, BOLD/EPI images, 2T Siemens, 96 acquisitions, TR = 7 s, voxel size 3 × 3 × 3 mm³, 64 slices, matrix size 64 × 64) of a single subject, commonly used for teaching purposes, was downloaded from the statistical parametric mapping website. The imaging data and a more detailed description of the auditory paradigm and the imaging sequence can be found at the following URL: http://www.fil.ion.ucl.ac.uk/spm/data/auditory. These single subject auditory task fMRI data were then used to manually define three voxels of interest, each located in the centre of an easily localized grey matter, white matter, or cerebrospinal fluid anatomical structure (Figure 1) to reduce partial volume effects introduced by the signal from other tissue types. Means and standard deviations of the signal from each voxel over time were computed to obtain distribution characteristics for each tissue class. The means and standard deviations for white matter and cerebrospinal fluid were then used to generate random Gaussian signal for both tissue types with the same mean and standard deviation as observed in the auditory fMRI data.
To generate grey matter signal of a block design fMRI experiment, a boxcar function with the mean corresponding to the mean of the grey matter voxel was computed, generating data for 300 time bins and a block duration of 30 seconds (Figure 2a and Figure S1 in File S1). For rsMRI analyses, as these mostly focus on correlation-based measurements, a sinusoidal function with the same mean and also 300 time bins was used to generate the voxel-wise grey matter signal. Noise in the grey matter signal was simulated by adding random uncorrelated Gaussian noise to the boxcar function and to the sinusoidal function, with the standard deviation corresponding to that observed in the original grey matter voxel. As real fMRI and rsMRI data may contain correlated noise across the different tissue types, e.g. due to subjects' motion, we additionally generated data with correlated (r = 0.2, across tissue types and voxels) random Gaussian noise for both types of analyses. To evaluate how different signal-to-noise ratios affect the results, the grey matter signal using the boxcar or the sinusoidal function was generated for four different signal-to-noise ratios (0.5, 1, 1.5, and 2) by keeping the noise constant whilst scaling the signal with a factor determined as the signal-to-noise ratio multiplied by the standard deviation of noise (Figure 2b,c). The signal for the fMRI simulation study thereby refers to the activation level with respect to the baseline of the simulated block paradigm [29]. For the rsMRI, signal refers to the standard deviation of the generated sinusoidal signal without noise. Noise refers in both cases to the standard deviation of random fluctuations added on top of the grey matter signal. The grey matter fMRI and rsMRI signal was convolved with a hemodynamic response function using the spm_get_bf function provided by the SPM8 software package (Statistical Parametric Mapping software: http://www.fil.ion.ucl.ac.uk/spm/) with a time bin length of 1 second.
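The signal scaling described above can be illustrated with a minimal Python sketch. It is not the original Matlab implementation: the function names, the baseline of 1000, and the noise standard deviation of 10 are hypothetical values chosen for illustration, the block design is assumed to alternate rest and task blocks starting with rest, and the convolution with the hemodynamic response function is omitted.

```python
import random


def boxcar(n_bins=300, block_len=30):
    """Alternate rest (0.0) and task (1.0) blocks of block_len time bins."""
    return [float((t // block_len) % 2) for t in range(n_bins)]


def simulate_grey_matter(baseline, noise_sd, snr, n_bins=300, block_len=30, seed=0):
    """Boxcar signal plus uncorrelated Gaussian noise.

    The noise SD is held constant; the activation amplitude is scaled as
    snr * noise_sd, following the scaling rule described in the text.
    Adding the activation on top of the baseline is an assumption.
    """
    rng = random.Random(seed)
    amplitude = snr * noise_sd
    return [
        baseline + amplitude * b + rng.gauss(0.0, noise_sd)
        for b in boxcar(n_bins, block_len)
    ]


if __name__ == "__main__":
    # Hypothetical tissue parameters, one series per simulated SNR level.
    for snr in (0.5, 1.0, 1.5, 2.0):
        series = simulate_grey_matter(baseline=1000.0, noise_sd=10.0, snr=snr)
        print(snr, round(min(series), 1), round(max(series), 1))
```

Replacing `boxcar` with a sinusoid of matching mean would give the rsMRI variant under the same scaling rule.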
Further, to simulate partial volume effects, 30 different constellations of the mixture between grey matter, white matter, and cerebrospinal fluid signal were generated (Table 1) using linear combinations of signal from each tissue type. Thereby, for each of the 6 different ratios of white matter and cerebrospinal fluid contribution, 5 different percent contributions of grey matter were generated (Table 1). Following this procedure, 40 datasets were generated separately for fMRI and rsMRI for each of the 4 signal-to-noise ratios and each of the 30 constellations of differential tissue contributions (4 × 30 = 120 combinations). Each dataset included 980 functional voxels. For rsMRI correlational analyses, only 60 functional voxels per subject were generated for computational reasons. These 40 datasets were then split into two equal groups of 20 each and used for subsequent statistical analyses.
Figure 2. Data generation and statistical results. a) Data generated using the boxcar function to simulate a block design functional magnetic resonance imaging (fMRI) signal. Original signal, signal with Gaussian noise, and the convolved noisy signal are displayed. b) Simulated fMRI time series with four different signal-to-noise ratios are displayed. c) Simulated rsMRI time series with two different signal-to-noise ratios are displayed. d) Two exemplary results of the fMRI simulation study for estimation of beta coefficients are displayed for the 980 functional voxels generated for each constellation of partial volume effect contribution. gm: grey matter, wm: white matter, csf: cerebrospinal fluid, SNR: signal-to-noise ratio.
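The linear mixing of tissue signals can be sketched in Python as follows. The grey matter fractions and the white matter shares of the remaining volume used here are placeholders, not the values from Table 1, and the function names are hypothetical; only the structure (5 grey matter levels × 6 white matter to cerebrospinal fluid ratios = 30 constellations, each a weighted sum with weights adding up to one) follows the text.

```python
def mix_tissues(gm, wm, csf, f_gm, f_wm, f_csf):
    """Voxel signal as a linear combination of three tissue time series."""
    if abs(f_gm + f_wm + f_csf - 1.0) > 1e-9:
        raise ValueError("tissue fractions must sum to 1")
    return [f_gm * g + f_wm * w + f_csf * c for g, w, c in zip(gm, wm, csf)]


def constellations(gm_fractions, wm_shares):
    """Enumerate (f_gm, f_wm, f_csf) triples.

    wm_shares gives the white matter share of the non-grey-matter volume;
    the rest of that volume is assigned to cerebrospinal fluid.
    """
    result = []
    for f_gm in gm_fractions:
        remainder = 1.0 - f_gm
        for share in wm_shares:
            result.append((f_gm, remainder * share, remainder * (1.0 - share)))
    return result


if __name__ == "__main__":
    # Placeholder values for illustration only (not taken from Table 1).
    combos = constellations(
        gm_fractions=[0.5, 0.6, 0.7, 0.8, 0.9],
        wm_shares=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    )
    print(len(combos))  # 5 grey matter levels x 6 ratios
```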

Real data
To evaluate effects of structural information on rsMRI measurements, a freely available dataset of 21 healthy control subjects (11M/10F, 22-61 years old) comprising rsMRI and T1-weighted data was downloaded from www.nitrc.org. A detailed description of this dataset including sequence parameters is provided in Landman et al. [30]. In brief, for each subject 7 min (210 time points, TR/TE = 2000/30 ms) of rsMRI data were acquired using a 2D echo planar imaging sequence with an in-plane resolution of 3 × 3 mm (240 mm field of view) and thirty-seven 3 mm transverse slices with 1 mm slice gap. Structural data were obtained using an MPRAGE sequence (TR/TE/TI = 6.7/3.1/842 ms) with a 1 × 1 × 1.2 mm³ resolution. Only data from the first of two available MRI sessions were used for each subject. Preprocessing of imaging data was performed using the statistical parametric mapping software (SPM12b, http://www.fil.ion.ucl.ac.uk/spm/software/spm12/) implemented in Matlab 7.12. Preprocessing comprised motion correction of rsMRI data, co-registration to the structural scans, segmentation and spatial normalization of structural data preserving the concentration to MNI (Montreal Neurological Institute) space using unified segmentation [31], application of these spatial normalization parameters to the co-registered functional data, and smoothing by a Gaussian kernel with 8 mm FWHM (full width at half maximum). In the spatial normalization step, the initial resolution of 3 × 3 × 3 mm³ was kept for rsMRI data to avoid artificial up- or downsampling. Z-transformed voxel-wise degree centrality metrics were extracted from the preprocessed rsMRI images using the REST toolbox [32] with default settings (removing linear trend: "Detrend" option, and "Bandpass" filtering with a high- and low-pass frequency filter of 0.01 and 0.08 Hz).
Degree centrality is an established functional connectivity metric representing for each voxel the number of Pearson correlations with all other voxels in the brain exceeding a predefined threshold. The threshold was set to r > 0.25 as suggested by Buckner et al. [33] for this metric. All computations were restricted to a grey matter mask obtained by thresholding the MNI template used for spatial normalization by a value of 0.2 (> 20% probability of being grey matter).
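As an illustration of this definition, the following Python sketch counts, for each voxel, the supra-threshold Pearson correlations with all other voxels. It is a toy implementation on plain lists and does not reproduce the REST toolbox; the function names are hypothetical, and the subsequent z-transformation of the degree values and the grey matter masking are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def degree_centrality(series, threshold=0.25):
    """Number of other voxels whose correlation with each voxel exceeds threshold."""
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series[i], series[j]) > threshold:
                degree[i] += 1
                degree[j] += 1
    return degree


if __name__ == "__main__":
    voxels = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
    print(degree_centrality(voxels))
```

In this toy input the first two voxels are perfectly correlated with each other and anti-correlated with the third, so only the first pair contributes to the degree counts.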
Further, as we wanted to evaluate the relative contribution of grey and white matter tissue to the rsMRI signal in each voxel, the segmented and normalized grey and white matter probability maps were also smoothed with a Gaussian kernel of 8 mm FWHM and downsampled to a resolution of 3 × 3 × 3 mm³ to match the rsMRI data.

Statistical analyses of generated data
For fMRI analyses, within-subject first-level beta coefficient maps contrasting the two conditions as generated by the boxcar function were computed using the general linear model design for each possible combination of signal-to-noise ratio and amount of partial volume effects (Figure 2d). These first-level contrast maps were then entered into independent samples t-tests comparing, for each signal-to-noise ratio, the data having differential grey matter contribution but the same white matter to cerebrospinal fluid ratio. To simulate a realistic whole-brain analysis and assuming 10e6 functional voxels for a typical whole-brain fMRI experiment, a conservative Bonferroni threshold of p < .05 corrected for this number of voxels was applied in all fMRI analyses.
For rsMRI analyses, Pearson correlation maps were computed for each dataset between the 60 generated voxel time series, resulting in 1770 ((60 × 60 − 60)/2) inter-voxel correlation coefficients in the lower left triangle. These correlation coefficients were Fisher's z-transformed to approximate a Gaussian distribution and entered into an independent samples t-test comparing, for each signal-to-noise ratio, the data coming from each tissue class combination across the two groups. For these analyses, full Bonferroni correction of p < .05 was applied to control for multiple comparisons.
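The arithmetic in this step can be checked with a short Python sketch. The helper names are hypothetical, and the assumption that the Bonferroni correction divides the alpha level by the 1770 unique correlations is ours; the text only states that full Bonferroni correction was applied.

```python
import math


def n_unique_pairs(n_voxels):
    """Entries in the lower triangle of an n x n correlation matrix (no diagonal)."""
    return (n_voxels * n_voxels - n_voxels) // 2


def fisher_z(r):
    """Fisher's z-transform, z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    pairs = n_unique_pairs(60)
    print(pairs)
    print(fisher_z(0.5))
    # Assumes one test per unique correlation coefficient.
    print(bonferroni_alpha(0.05, pairs))
```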
Importantly, the obtained datasets used for the statistical comparisons do not differ in any other parameters besides the amount of partial volume effects introduced into the data. Correspondingly, all differences observed in group comparisons either in fMRI or rsMRI analyses can be attributed to differences in the amount of partial volume effects. Functional measurements are expected to reflect true functional but not structural differences. Accordingly, all significant between-group differences induced by partial volume effects in this study are considered as false positive errors as they do not reflect true functional differences.

Statistical analyses of real data
In a first analysis, we aimed to evaluate how grey and white matter signal in each voxel contribute to the degree centrality value observed in the corresponding voxel. For this, we deployed a leave-one-out approach to compute voxel-wise general linear models (GLMs) predicting degree centrality values using voxel-wise grey and white matter information. Thereby, voxel-wise GLMs obtained using all but one subjects are used to predict degree centrality of the subject who was not used for training. We then computed correlations between predicted and observed degree centrality maps for each subject.
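For a single voxel, the leave-one-out scheme described above might look like the following Python sketch. It assumes an ordinary least squares model with an intercept and two predictors (grey and white matter probability); the inclusion of an intercept, the function names, and the toy data are assumptions for illustration, and the code is not the original Matlab implementation.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Fails with ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_glm(gm, wm, dc):
    """Least squares fit of dc = b0 + b1 * gm + b2 * wm via the normal equations."""
    rows = [[1.0, g, w] for g, w in zip(gm, wm)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, dc)) for i in range(3)]
    return solve_linear(xtx, xty)


def leave_one_out_predictions(gm, wm, dc):
    """Predict each subject's value from a model fitted on all other subjects."""
    predictions = []
    for k in range(len(dc)):
        keep = [i for i in range(len(dc)) if i != k]
        b0, b1, b2 = fit_glm(
            [gm[i] for i in keep], [wm[i] for i in keep], [dc[i] for i in keep]
        )
        predictions.append(b0 + b1 * gm[k] + b2 * wm[k])
    return predictions


if __name__ == "__main__":
    # Toy values for one voxel across five hypothetical subjects.
    gm = [0.2, 0.5, 0.7, 0.4, 0.9]
    wm = [0.6, 0.1, 0.2, 0.5, 0.05]
    dc = [1.0 + 2.0 * g - 3.0 * w for g, w in zip(gm, wm)]
    print(leave_one_out_predictions(gm, wm, dc))
```

Repeating this for every voxel yields one predicted map per held-out subject, which can then be correlated with that subject's observed map.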
As we assume that regions with a higher degree of partial volume effects, as indicated by their white matter and cerebrospinal fluid contribution, show higher noise levels, we would expect these regions to be more strongly desynchronized with respect to other brain regions. Correspondingly, we would expect lower degree centrality values for regions with higher contributions from non-grey matter tissues. To test this hypothesis, we divided all voxel-wise degree centrality values obtained for all subjects into 10 chunks, based either on white matter or cerebrospinal fluid probability values (the latter defined as 1 − grey matter − white matter) observed in the corresponding voxels. Each chunk was defined as a white matter or cerebrospinal fluid range of 0.1, e.g. all voxels with white matter values between 0 and 0.1, between 0.1 and 0.2, and so on. We then computed independent samples t-tests to compare degree centrality values observed in the chunks defined by white matter or cerebrospinal fluid values, applying a Bonferroni-corrected threshold of p < .05.
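The chunking step can be sketched in Python as below. The helper names are hypothetical, and the treatment of values lying exactly on a chunk boundary (assigned to the upper chunk, with a probability of 1 placed in the last chunk) is an assumption, since the text does not specify it.

```python
def csf_probability(gm, wm):
    """Cerebrospinal fluid probability as the remainder, 1 - gm - wm."""
    return 1.0 - gm - wm


def chunk_index(probability, n_chunks=10):
    """Index of the equal-width chunk (0..n_chunks-1) that a probability falls in."""
    index = int(probability * n_chunks)
    return min(max(index, 0), n_chunks - 1)


def group_by_chunk(values, probabilities, n_chunks=10):
    """Collect degree centrality values by the chunk of their tissue probability."""
    groups = {}
    for value, probability in zip(values, probabilities):
        groups.setdefault(chunk_index(probability, n_chunks), []).append(value)
    return groups


if __name__ == "__main__":
    # Toy degree centrality values with their white matter probabilities.
    print(group_by_chunk([5, 7, 9], [0.05, 0.07, 0.55]))
```

The per-chunk lists produced this way are what would be passed to the pairwise t-tests.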
Lastly, we evaluated the impact of partial volume effects on statistical maps obtained using standard SPM regression analyses. For this, we computed in a first step voxel-wise GLMs for all subjects predicting degree centrality values using the corresponding grey and white matter probabilities. In a second step, we performed two standard SPM regression analyses with age and gender as covariates, first using the observed degree centrality values and then using the residual degree centrality values after removing the variance explained by the GLM computed in step one (from here on referred to as adjustment). As we were not interested in age and gender effects per se but rather in the similarity of statistical maps obtained with and without adjustment of degree centrality maps for structural information, a liberal threshold of p < 0.05 at voxel level with a cluster threshold of > 30 voxels was applied in these analyses. We then evaluated positive and negative correlations with age and gender in both analyses, resulting in four statistical maps for each. To assess the similarity between the obtained statistical maps, Jaccard indices were computed for the binarized maps obtained with and without adjustment. This index of similarity is defined as the size of the intersection divided by the size of the union of the observed statistical maps. It equals one when the overlap is perfect and zero when no overlap exists.
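The Jaccard index as defined above can be written as a short Python function. The maps are represented here as flat lists of 0/1 values, which is an illustrative simplification of binarized statistical images; raising an error when both maps are empty is our choice, as the index is undefined in that case.

```python
def jaccard_index(map_a, map_b):
    """Intersection over union of two equally sized binarized maps."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must have the same number of voxels")
    intersection = sum(1 for a, b in zip(map_a, map_b) if a and b)
    union = sum(1 for a, b in zip(map_a, map_b) if a or b)
    if union == 0:
        raise ValueError("Jaccard index is undefined when both maps are empty")
    return intersection / union


if __name__ == "__main__":
    unadjusted = [1, 1, 0, 0]
    adjusted = [1, 0, 1, 0]
    print(jaccard_index(unadjusted, adjusted))  # one shared voxel out of three
```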

Generated data
The comparison of fMRI data with differential contribution of grey matter signal to the generated functional voxels and uncorrelated noise resulted in a strongly increased number of false positive errors (Figure 3). These increases were observed for all signal-to-noise ratios and for all constellations of white matter to cerebrospinal fluid ratios. Even the smallest difference in the amount of partial volume contribution tested in this study results in a false positive detection of differences between the generated fMRI datasets. Most importantly, the number of false positive errors strongly increases with an increased signal-to-noise ratio. We observe the highest number of false positive errors when comparing signal with the lowest partial volume effect contribution to any other constellation of partial volume effects tested in this study.
When assuming noise correlated across the three tissue types, we observe a strong but less pronounced increase in the number of false positive errors for generated fMRI data. In contrast to data with uncorrelated noise, an increased number of false positives is associated with a higher CSF contribution and a lower signal-to-noise ratio (Figure S2 in File S1).
Similarly, in rsMRI analyses a strong dependence is observed between the false positive error rate and the amount of partial volume effects (Figure 4). Though the overall sensitivity of rsMRI analyses to partial volume effects is less evident as compared to fMRI, the amount of partial volume effect-related differences surviving the correction for multiple comparisons is still substantial for most comparisons. Also for rsMRI data, greater signal-to-noise ratio and comparing signal with lowest partial volume effect contribution to other constellations lead to an increased sensitivity to partial volume effects. Higher amounts of false positive errors are thereby observed for combinations with higher grey matter contribution.
For rsMRI data with correlated noise, we find a substantially lower number of false positive errors as compared with uncorrelated data. Higher numbers of false positive errors are associated with higher signal-to-noise ratios and with higher grey matter contribution (Figure S3 in File S1).

Real data
The median correlation strength between observed and predicted degree centrality maps of each subject in the leave-one-out cross-validation was r = .50, corresponding to an explained variance of 25%, with the lowest correlation strength being 0.4 (all p < .001, Bonferroni corrected for multiple comparisons) (Figure 5a). When comparing degree centrality across different chunks of white matter and cerebrospinal fluid probabilities, significant differences in mean degree centrality are observed for most of the comparisons (Figure 5b-d). Consistently lower degree centrality values are observed in regions with higher white matter or cerebrospinal fluid probabilities.
For the four contrasts evaluated in the current study, Jaccard indices between significance maps obtained with and without adjustment of degree centrality values for structural information ranged between 0.22 and 0.56 (Figure 6).

Discussion
We demonstrate in generated data that small to moderate differences in partial volume effects as induced by differential tissue contribution substantially increase the false-positive rate for both fMRI and rsMRI. These increases are observed for all signal-to-noise ratios evaluated in this study with the highest signal-to-noise ratios being more affected by differences in partial volume effects. Importantly, the generated data used in this study for group comparisons do not differ in any other aspects besides the degree of signal contribution from different tissue types. The observed differences can therefore clearly be attributed to differential partial volume effects. We demonstrate that even minor differences in these can have a strong impact on statistical outcome. We observe the highest number of false-positive errors when groups with a very low degree of partial volume effects are compared to groups with any other partial volume effect contribution evaluated in this study. We further show in a real rsMRI dataset that underlying structural differences are significantly linked to observed functional connectivity measurements and that adjusting for this information can have a substantial impact on the observed group-level statistical findings.
Our findings in real rsMRI data suggest that underlying structural information can explain about 25% of the variance observed in functional connectivity metrics (determination coefficient of 0.25, corresponding to the observed median correlation of 0.5 between predicted and observed degree centrality values), with lower degree centrality values observed in regions with higher contributions of non-grey matter tissues. Adjusting for this structural information can have a substantial impact on the observed statistical maps. Importantly, all of these findings refer to a healthy control population. Based on these results and on our results for the generated data, we would expect an even stronger link between structural and rsMRI metrics in diseased populations affected by neurodegenerative processes. Grey matter tissue loss in these populations would lead to a higher contribution of non-grey matter tissues and correspondingly to decreased functional connectivity metrics.
In the simulation part of our study, we further find a lower sensitivity of fMRI and rsMRI data to false positive errors when assuming correlated noise across tissue types and voxels. These findings are not surprising considering that differences in beta coefficients observed between simulated groups with different constellations of partial volume effects are due to differential noise properties across tissue types. Introducing correlations in this noise leads to a more homogeneous combined final signal across the different constellations of grey and white matter contribution and correspondingly to fewer between-group differences in the estimated beta coefficients. Additionally, in rsMRI analyses further differences are introduced because correlated noise (across voxels) is recognized as signal when computing correlations. Correspondingly, the combined signal becomes more homogeneous between simulated groups with different constellations of partial volume effects. Importantly, though the number of false positive errors due to partial volume effects decreases for both fMRI and rsMRI analyses when assuming correlated noise, these correlations are known to introduce different types of biases, as for example repeatedly shown in studies evaluating the effects of motion or scanner instabilities [34][35][36].
The magnitude of structural differences evaluated in this study in generated data is commonly observed in normal aging [37][38][39][40], in many neurological and psychiatric diseases [38,41,42], between males and females [40,43], but also for learning and treatment-induced structural changes [44][45][46][47]. Also, the signal-to-noise ratios used in our study for simulation of fMRI and rsMRI data are of comparable amplitude to those evaluated in other simulation studies [29]. If not controlled for, these differences in the underlying structure might significantly bias the interpretation of observed functional and resting state differences. Numerous studies have applied both fMRI and rsMRI without controlling for partial volume effects to study between-group differences in different populations revealing for example significant changes of hippocampal connectivity in Alzheimer's disease [5,48]. Considering that strong hippocampal atrophy is well validated in this disease, the question remains whether these findings indeed reflect real functional alterations or merely capture the increase in partial volume effects resulting from neurodegeneration of this brain structure. Other studies reported, for example, electroconvulsive therapy induced functional connectivity changes in prefrontal regions in depression [49]. Considering that this treatment is known to induce structural changes, the underlying nature of the observed functional differences would require further exploration [47,50,51].
Intriguingly, the number of false positive errors for both fMRI and rsMRI strongly increases with an increased signal-to-noise ratio. This effect is due to the fact that greater signal-to-noise ratio also leads to initially higher average beta and correlation coefficients as compared to a noisier signal. Introducing partial volume effects increases the noise level and leads to a stronger average decrease of the extracted statistical measurements. Correspondingly, the group statistics become more sensitive to differences introduced by the differential degree of partial volume effects. Moreover, fMRI and, in particular, rsMRI studies often apply region-of-interest approaches extracting mean values from predefined anatomical brain areas but without restricting computations of mean values to the grey matter compartments of corresponding regions [52][53][54]. This procedure leads to even stronger averaging of signal from different tissue types and therefore magnifies partial volume effects and correspondingly the risk of false positives. It is important to note that partial volume effects have been for decades a central methodological focus in studies applying positron emission tomography [22,28,55,56,57,58]. These studies have resulted in a large number of tools and methodological developments which allow one to control for partial volume effects using, for example, tissue probability estimates obtained from high resolution structural magnetic resonance scans [22,55,56]. Studies applying fMRI and rsMRI as more recent developments have largely ignored these effects and methodological advances by assuming that the applied statistical procedure provides a sufficient control for these effects. As shown in this study, the currently applied fMRI and rsMRI statistical analyses are strongly affected by partial volume effects induced by underlying structural differences.
Our findings indicate that these effects should be taken into account in future studies to allow a more functional interpretation of fMRI and rsMRI outcomes.
It is important to note that in the simulation part of our study we make several assumptions on the properties of signal and noise in fMRI and rsMRI data. We assume the observed final voxel-wise signal to be a linear combination of the signal from different tissue types. This assumption is based on geometric properties of the voxel-wise MR signal. It is most plausible to assume that a tissue covering, for example, 2/3 of the volume of a voxel also contributes 2/3 to the observed signal in the corresponding voxel. In contrast, any assumption of non-linear contribution would require further assumptions of more complex interactions between tissue types and MR physics, e.g. differential spatial point spread functions for different tissue types. We further assume that noise in grey matter, white matter and cerebrospinal fluid is sufficiently described by a Gaussian distribution. This assumption is also made in commonly applied parametric statistics to analyse fMRI and rsMRI data and has been repeatedly used in previous studies to simulate fMRI data [29,59]. Lastly, in our study we only evaluate the situation assuming either uncorrelated noise or a correlation of 0.2 between tissue types and voxels. All of these assumptions might affect the observed relationship between partial volume effects and the observed functional differences. Deviations from these assumptions in real data might therefore result in different findings regarding the impact of partial volume effects on rsMRI and fMRI analyses.
Another important issue is related to the correction procedure proposed in our study and concerns the dissociation of partial volume effects from potentially real functional differences which are induced by differences in the underlying structure. The main assumption behind the proposed correction procedure is that the variability in functional signal is different from the one observed in underlying structure. Correspondingly, in case that these are strongly correlated, the proposed approach is unlikely to dissociate between these two types of effects.
To conclude, although simulation approaches as applied in our study are in general very powerful for uncovering mechanisms behind hypothesized effects, they are also limited by the necessity of numerous assumptions which may or may not hold. Therefore, they cannot be considered as direct evidence for the existence of such effects in real data and require further studies to validate the existence and impact of these effects in real fMRI data. Similarly, further studies are also required to establish the impact of these effects on fMRI and rsMRI differences observed across different pathological conditions.

Author Contributions
Conceived and designed the experiments: JD AB. Performed the experiments: JD. Analyzed the data: JD. Contributed reagents/materials/analysis tools: JD. Wrote the paper: JD AB.