Percent amplitude of fluctuation: A simple measure for resting-state fMRI signal at single voxel level

The amplitude of low-frequency fluctuation (ALFF) measures the resting-state functional magnetic resonance imaging (RS-fMRI) signal of each voxel. However, the unit of the blood oxygenation level-dependent (BOLD) signal is arbitrary, and hence ALFF is sensitive to the scale of the raw signal. A well-accepted standardization procedure is to divide each voxel's ALFF by the global mean ALFF, yielding mALFF. Although fractional ALFF (fALFF), the ratio of the ALFF to the total amplitude within the full frequency band, offers a possible solution for standardization, it mixes in the fluctuation power of the full frequency band and thus cannot reveal the true amplitude characteristics of a given frequency band. The current study borrowed the percent signal change used in task fMRI studies and proposed the percent amplitude of fluctuation (PerAF) for RS-fMRI. We first applied PerAF and mPerAF (i.e., PerAF divided by the global mean PerAF) to eyes open (EO) vs. eyes closed (EC) RS-fMRI data. PerAF and mPerAF yielded prominent differences between EO and EC, consistent with previous studies. We then performed a test-retest reliability analysis and found that (PerAF ≈ mPerAF ≈ mALFF) > (fALFF ≈ mfALFF). Head motion regression (Friston-24) increased the reliability of PerAF but decreased that of all other metrics (e.g., mPerAF, mALFF, fALFF, and mfALFF). These results suggest that mPerAF is a valid, more reliable, and more straightforward metric, and hence a promising one for voxel-level RS-fMRI studies. Future studies could use both PerAF and mPerAF. To promote future application of PerAF, we implemented it in a new version of the REST package named RESTplus.

Informed consent was obtained from all participants. The study was approved by the Institutional Review Board of the New York University School of Medicine and New York University.
Participants had three resting-state sessions. Sessions 2 and 3 were collected 45 min apart, 5-16 months (mean 11 ± 4 months) after Session 1. During each scanning session, participants were instructed to keep their eyes open continuously, and the word "Relax" was centrally projected in white against a black background.
For more information regarding Dataset-1 collection, please refer to (Shehzad et al. 2009).

Data preprocessing
The preprocessing was performed using the Data Processing Assistant for Resting-State fMRI (DPARSF) (Chao-Gan and Yu-Feng 2010) (http://www.restfmri.net), and included: 1) discarding the first 10 timepoints to allow the longitudinal magnetization to reach a steady state and the participants to adapt to the scanning noise; 2) slice timing correction; 3) head motion correction; 4) co-registration, spatial normalization, and resampling to a 3 mm isotropic voxel size; 5) spatial smoothing with an isotropic Gaussian kernel with a FWHM of 6 mm; 6) removing the linear trend of the time series; and 7) band-pass (0.01-0.08 Hz) filtering. Three participants were excluded from further analyses because of excessive head motion (more than 2.0 mm of maximal translation or 2.0° of maximal rotation) during scanning.
Because head motion regression is drawing increasing attention in RS-fMRI studies, we also explored its effect. Before band-pass filtering, we regressed out the Friston-24 head motion parameters for each participant. The Friston-24 parameters include 6 head motion parameters (3 for translation and 3 for rotation), their historical effects (position in the previous scan, 6 parameters), and the squares of these 12 parameters (Friston et al. 1996).
A recent RS-fMRI study comprehensively investigated the effects of a set of head motion parameters on a set of measurements and concluded that Friston-24 performed the best on most RS-fMRI measurements (Yan et al. 2013a).
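The construction of the 24 regressors described above can be sketched as follows. This is a minimal illustration, not the DPARSF implementation: the function name `friston24` is hypothetical, and padding the "previous scan" values of the first timepoint with zeros is an assumption.

```python
def friston24(motion):
    """Expand T x 6 realignment parameters into T x 24 regressors.

    motion: list of T rows, each holding 6 values
            (3 translations followed by 3 rotations).
    Each output row is [current 6, previous-scan 6, squares of those 12].
    """
    expanded = []
    for t, current in enumerate(motion):
        # Position at the previous scan; zeros for the first scan (assumption).
        previous = motion[t - 1] if t > 0 else [0.0] * 6
        twelve = list(current) + list(previous)
        expanded.append(twelve + [v * v for v in twelve])
    return expanded


# Toy realignment parameters for three scans (made-up numbers).
motion = [
    [0.00, 0.00, 0.00, 0.000, 0.000, 0.000],
    [0.10, -0.20, 0.05, 0.001, 0.000, -0.002],
    [0.15, -0.10, 0.00, 0.002, 0.001, -0.001],
]
regressors = friston24(motion)
```

Each row of `regressors` would then enter a voxel-wise linear regression as a nuisance covariate, with the residual time series passed on to band-pass filtering.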

Test-retest reliability of PerAF, ALFF, and fALFF
PerAF was calculated as described in section "2.1". The ALFF and fALFF analyses were performed using REST (Song et al. 2011) (http://www.restfmri.net). After preprocessing, the time series of each voxel was transformed into the frequency domain with a fast Fourier transform (FFT), and the power spectrum was obtained. Since the power at a given frequency is proportional to the square of the amplitude of that frequency component, the square root was calculated at each frequency of the power spectrum, and the square roots were averaged across 0.01-0.08 Hz at each voxel. This averaged square root was taken as the ALFF (Zang et al. 2007). The ratio of the sum of amplitudes within the low-frequency band (i.e., ALFF) to that of the whole frequency band was then computed as the fALFF (Zou et al. 2008).
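The ALFF and fALFF calculation for a single voxel can be sketched as below. This is an illustration under stated assumptions, not the REST code: the function names are hypothetical, a naive discrete Fourier transform from the standard library stands in for an FFT routine, the amplitude is scaled by 1/n (scaling conventions differ between packages), the zero-frequency term is excluded from the full band, and the repetition time of 2 s is a made-up value.

```python
import cmath
import math


def amplitude_spectrum(series):
    """Return DFT amplitudes for bins 0..n//2 (naive O(n^2) transform)."""
    n = len(series)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(series))
        amps.append(abs(coeff) / n)  # square root of the power at bin k
    return amps


def alff_falff(series, tr, low=0.01, high=0.08):
    """ALFF: mean amplitude within [low, high] Hz.
    fALFF: summed amplitude in that band over the summed full-band amplitude."""
    n = len(series)
    amps = amplitude_spectrum(series)
    freqs = [k / (n * tr) for k in range(len(amps))]
    band = [a for f, a in zip(freqs, amps) if low <= f <= high]
    full = [a for f, a in zip(freqs, amps) if f > 0]  # drop the DC term
    return sum(band) / len(band), sum(band) / sum(full)


# Synthetic voxel: baseline 1000, a 0.05 Hz component (inside the band)
# and a 0.20 Hz component (outside the band).
tr = 2.0  # hypothetical repetition time in seconds
n = 100
series = [1000.0
          + 5.0 * math.sin(2 * math.pi * 0.05 * tr * i)
          + 2.0 * math.sin(2 * math.pi * 0.20 * tr * i)
          for i in range(n)]
alff, falff = alff_falff(series, tr)
```

In this toy series, multiplying every sample by a constant multiplies `alff` by the same constant but leaves `falff` unchanged, which is the scale sensitivity of ALFF that motivates the standardization discussed in the text.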
The original ALFF value is not very suitable for comparison, so ALFF of each voxel was divided by the global mean ALFF of each participant (Zang et al. 2007) (we call this result mALFF).
The same procedure was performed for fALFF (Zou et al. 2008) (mfALFF). Mathematically, this step is not necessary for PerAF, since the scaling factor has already been normalized by dividing by the temporal mean.
However, to have a fair comparison with ALFF and fALFF and to investigate the effect of the global mean value, the PerAF of each voxel was also divided by the global mean PerAF of each participant (thus we have both PerAF and mPerAF). In the original paper reporting Dataset-1, the authors did not use the cerebellum and the inferior part of the temporal lobe because these brain areas were not covered in some participants (Shehzad et al. 2009). Therefore, we made an intersection mask within which all 75 scanning sessions (3 sessions for each of the 25 participants) were covered (Fig. 2). Specifically, the mean fMRI image of each session was spatially normalized and then binarized (using the logical function in MATLAB). All 75 binary images and a whole-brain mask provided in the software REST (Song et al. 2011) were then combined into the intersection mask. The following statistical analysis was constrained within this intersection mask. It has been proposed that z-transformation of ALFF could improve the normality of its distribution across subjects (Zuo et al. 2010a). Therefore, we also transformed PerAF, ALFF, and fALFF to their respective z-score maps, i.e., subtracting the global mean and dividing by the standard deviation (SD), thus generating zPerAF, zALFF, and zfALFF. The different metrics and their derivatives are summarized in Table 1. As the original ALFF value is not suitable for comparison, it was excluded from further analysis.
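The standardization steps above can be sketched in a few lines. This is a minimal illustration with hypothetical function names. The PerAF definition used here (mean absolute deviation from the temporal mean, as a percentage of that mean) is assumed from section "2.1", which is not reproduced in this excerpt, and using the sample SD for the z-score is likewise an assumption.

```python
import statistics


def peraf(series):
    """Percent amplitude of fluctuation of one voxel's time series
    (assumed definition: mean |x - mean| / mean, in percent)."""
    mu = statistics.fmean(series)
    return 100.0 * statistics.fmean([abs((x - mu) / mu) for x in series])


def mean_standardize(values):
    """Divide each voxel's value by the global mean (e.g. PerAF -> mPerAF)."""
    global_mean = statistics.fmean(values)
    return [v / global_mean for v in values]


def z_standardize(values):
    """Subtract the global mean and divide by the SD (e.g. PerAF -> zPerAF)."""
    global_mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    return [(v - global_mean) / sd for v in values]


def intersection_mask(masks):
    """Keep a voxel only if every session's binary mask covers it."""
    return [all(voxel) for voxel in zip(*masks)]


# Toy time series: the same fluctuation at two different raw signal scales.
voxel = [990.0, 1010.0, 1000.0, 1000.0]
voxel_scaled = [3.0 * x for x in voxel]
peraf_value = peraf(voxel)
peraf_scaled = peraf(voxel_scaled)

# Toy voxel-wise PerAF values inside a mask, and two toy session masks.
peraf_map = [0.4, 0.5, 0.6, 0.9]
mperaf_map = mean_standardize(peraf_map)
zperaf_map = z_standardize(peraf_map)
mask = intersection_mask([[1, 1, 0], [1, 0, 1]])
```

Because `peraf_value` and `peraf_scaled` agree, the example shows why division by the global mean is not mathematically required for PerAF, whereas it is for ALFF.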
To investigate the test-retest reliability of PerAF, ALFF, and fALFF over time, the intraclass correlation coefficient (ICC) was calculated between each pair of the 3 sessions in Dataset-1. ICC has been widely used in previous studies of test-retest reliability (Zuo et al. 2010a; Shehzad et al. 2009; Zuo et al. 2010b; Liao et al. 2013). Dataset-1 allows assessment of both long-term reliability (5-16 months apart) and short-term reliability (< 1 h apart). ICC > 0.5 was considered moderate or higher test-retest reliability (Shehzad et al. 2009; Zuo et al. 2010a) and was used as the threshold in this study. As shown in Fig. 3, for all the measures, including PerAF, mPerAF, zPerAF, mALFF, zALFF, fALFF, mfALFF, and zfALFF, most cortical areas showed moderate to high short-term (session 2 against session 3) test-retest reliability. Long-term test-retest reliability was lower than short-term reliability (see Fig. 3 for session 1 against session 2 and Supplementary Figure 1 (Fig. S1) for session 1 against session 3). Reliability was much higher in the gray matter than in the white matter. fALFF and its derivative maps showed the worst test-retest reliability among the three metrics (Fig. 3).
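For one voxel, an ICC between two sessions can be sketched as below. The text does not state which ICC form was used, so the one-way random-effects form, ICC = (MSb - MSw) / (MSb + (k - 1) MSw), is an assumption here, and the function name and the toy values are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects ICC.

    scores: list of n subjects, each a list of k repeated measurements
            (e.g. one voxel's mPerAF in session 2 and session 3).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    ms_within = sum((x - m) ** 2
                    for row, m in zip(scores, subject_means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy data: four subjects, two sessions each (made-up values).
identical = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
noisy = [[1.0, 1.2], [2.0, 1.8], [3.0, 3.1], [4.0, 4.3]]
icc_identical = icc_oneway(identical)
icc_noisy = icc_oneway(noisy)
reliable = icc_noisy > 0.5  # threshold used in the text
```

Applying such a function voxel by voxel, for each pair of sessions, would give ICC maps of the kind thresholded at 0.5 in the text.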
To view the spatial overlap between PerAF and each of the other two methods, we selected the more comparable metrics, i.e., mPerAF, mALFF, and mfALFF, and performed an overlap analysis on voxels with ICC > 0.5 for all the metrics. As shown in Figure 5, mPerAF largely overlapped with mALFF, and mfALFF was mostly included in mPerAF because the test-retest reliability of mfALFF was lower than that of mPerAF and mALFF.
Head motion regression slightly affected the ICC of both short- and long-term test-retest reliability (Fig. S3-5).

Dataset-2: comparison between EO and EC
2.3.1. Data description
Dataset-2 was from a published dataset (Zou et al. 2015)

Data preprocessing
Preprocessing was the same as described in section "2.2.2". Because the whole brain was not completely covered in every participant, we made an intersection mask within which all 68 scanning sessions (2 sessions for each of the 34 participants) were covered (Fig. 6). The detailed method was the same as in "2.2.2". To compare the amount of head motion between EO and EC, we calculated framewise head motion (Power et al. 2012). Framewise head motion measures the head motion of each timepoint relative to its prior timepoint. Zang and colleagues used the sum of framewise head motion for rotation and translation separately (Zang et al. 2007) (see formulae 1 and 2 in that reference). Power and colleagues integrated the sum of the 6 framewise head motion parameters into a single measure, named framewise displacement (FD) (Power et al. 2012). FD is widely used, and hence the current study also used FD (Power et al. 2012). A paired t-test was performed on FD between EO and EC.
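The FD calculation can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name is hypothetical, and converting rotations to millimetres on a sphere of radius 50 mm, as well as setting FD of the first timepoint to zero, are assumptions.

```python
import math


def framewise_displacement(motion, radius_mm=50.0, rotations_in_degrees=False):
    """Sum of absolute frame-to-frame changes in the 6 motion parameters.

    motion: list of T rows [tx, ty, tz, rx, ry, rz], translations in mm.
    Rotations are converted to arc length on a sphere of radius_mm.
    Returns T values; the first is 0 because it has no prior timepoint.
    """
    fd = [0.0]
    for previous, current in zip(motion, motion[1:]):
        total = 0.0
        for j in range(6):
            delta = abs(current[j] - previous[j])
            if j >= 3:  # rotation parameters
                if rotations_in_degrees:
                    delta = math.radians(delta)
                delta *= radius_mm
            total += delta
        fd.append(total)
    return fd


# Toy realignment parameters (rotations in radians, made-up numbers).
motion = [
    [0.0, 0.0, 0.0, 0.000, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.002, 0.0, 0.0],
    [0.1, 0.2, 0.0, 0.002, 0.0, 0.0],
]
fd = framewise_displacement(motion)
mean_fd = sum(fd) / len(fd)
```

A per-session summary such as `mean_fd` is the kind of value that would enter the paired t-test between EO and EC.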
To show the spatial pattern of PerAF, the averaged PerAF map of the 34 participants in the EC state is shown in Figure 7A. The pattern for EO was very similar to that for EC (not shown here).
The histograms were quite different among the 3 measures (Fig. 7F-J). The histogram of averaged PerAF was very similar to that of averaged mPerAF, and that of averaged fALFF was similar to that of averaged mfALFF. The histograms of zPerAF, zALFF, and zfALFF were similar to those of mPerAF, mALFF, and mfALFF, respectively (Fig. S6).
The distribution shown in the histograms of PerAF and mPerAF has a long tail on the right side (Fig. 7F, H). The distribution in the histogram of fALFF, mfALFF, and mALFF did show such a long tail (Fig. 7G, I, J). The patterns of extreme values (> 4 SD) of PerAF, fALFF, mPerAF, mfALFF, and mALFF were different (Fig. 7K-O). For PerAF and mPerAF, the voxels with extremely high values were near the skull base (Fig. 7K). There were almost no such extreme values for fALFF and mfALFF (Fig. 7L, 7N). For mALFF, the voxels with extreme values were located either near large vessels or in the gray matter. We plotted the timecourse of one participant's voxel that showed a very large PerAF (5.87 %). As shown in Figure 7P, the BOLD signal intensity at some timepoints of this timecourse was nearly zero. Undoubtedly, this voxel was affected by noise.
Paired t-tests were performed between EO and EC. Multiple comparison correction was performed within the intersection mask. A combination of an individual voxel P value < 0.05 and a cluster size > 6156 mm³ was used, corresponding to a corrected P < 0.05 based on Monte Carlo simulation (rmm = 5, smoothness = 6 mm, 1000 simulations) (from the AFNI software, as implemented in REST). In addition, to view potential differences between EO and EC outside the brain, the results of the paired t-test for the PerAF map (i.e., without standardization by the global mean PerAF) are also shown without multiple comparison correction, i.e., only a voxel-level P < 0.05 without a cluster size threshold was adopted.
In the case without standardization by the global mean, significantly lower (corrected for multiple comparisons) PerAF in EO than in EC was observed in widely distributed brain regions, including the bilateral primary sensorimotor cortex (PSMC), supplementary motor area (SMA), paracentral lobule, primary auditory cortex extending to the superior and middle temporal gyrus, thalamus, precuneus, visual cortex, and posterior cingulate cortex (P < 0.05, corrected) (Fig. 8A1). Only a small part of the brain (e.g., inferior orbital frontal cortex, gyrus rectus) showed significantly higher PerAF in EO than in EC.
For fALFF (in the case without standardization by the global mean), the pattern of difference between EO and EC was similar to that of PerAF, but with a smaller volume for most clusters (Fig. 8A1 vs. Fig. 8D).
In the cases with global mean standardization, the between-condition differences of mPerAF (Fig. 8B), zPerAF (Fig. 8C), mALFF (Fig. 8G), and zALFF (Fig. 8H) were very similar. Significantly higher fluctuation in EO than in EC was found in the bilateral middle occipital gyrus and orbitofrontal cortex. Significantly lower fluctuation in EO than in EC was found in the bilateral PSMC, SMA, paracentral lobule, thalamus, and primary auditory cortex (P < 0.05, corrected). For mfALFF (Fig. 8E) and zfALFF (Fig. 8F), the pattern of difference between EO and EC was generally similar to that of mPerAF, zPerAF, mALFF, and zALFF, except in the frontal pole and PSMC. mfALFF and zfALFF showed almost no difference in the frontal pole, while mPerAF, zPerAF, mALFF, and zALFF showed a large cluster. The cluster in the PSMC detected by mfALFF and zfALFF was smaller than that detected by mPerAF, zPerAF, mALFF, and zALFF.
The results of EO versus EC showed prominent inconsistency between comparisons with and without global mean standardization, for PerAF (vs. mPerAF) as well as for fALFF (vs. mfALFF) (Fig. 8A1 vs. Fig. 8B, Fig. 8D vs. Fig. 8E). Specifically, in the case of no global mean standardization, only a small area showed higher fluctuation in EO than in EC. However, after standardization, a few other areas showed significantly higher fluctuation in EO than in EC, including the bilateral middle occipital gyrus and a large area in the orbitofrontal cortex (not applicable to the mfALFF and zfALFF results). Brain areas showing significantly lower fluctuation in EO than in EC were slightly smaller than those without standardization. The prominent inconsistency suggested that the global mean PerAF had a strong effect. We therefore performed a paired t-test on the global mean PerAF between EO and EC. The global mean PerAF was calculated within a brain mask provided in REST (Song et al. 2011). The global mean PerAF was marginally lower in EO than in EC (P = 0.0614).
When no brain mask was used and no multiple comparison correction was performed, the eyeballs showed significantly higher PerAF in EO than EC (Fig. 8A2). The difference in eyeballs extended to a large area of the frontal scalp and even to the orbitofrontal cortex.
The effect of head motion regression on the difference between EO and EC depended strongly on the measure used (Fig. 9). It had very little effect on mPerAF, zPerAF, mALFF, and zALFF (Fig. 9B, C, G, and H, respectively), but a prominent effect on PerAF, fALFF, mfALFF, and zfALFF (Fig. 9A, D, E, F). Specifically, after Friston-24 head motion regression, the pattern of difference between EO and EC in PerAF was very similar to that in mPerAF (Fig. 9A-reg vs. Fig. 9B-no). For fALFF, mfALFF, and zfALFF, a few clusters generally disappeared after Friston-24 head motion regression (Fig. 9D, E, F).
One possible reason for the prominent effect of head motion on the PerAF difference between EO and EC is a potential difference in head motion between EO and EC. To test this assumption, we calculated the amount of head motion, i.e., FD. The FD was 0.1036 ± 0.0331 (mean ± standard deviation) for EO and 0.1095 ± 0.0514 for EC. The difference was not significant (P = 0.3068).

Implementation and usage of PerAF calculation toolkit
Entering "PerAF_GUI" in the MATLAB command window opens the PerAF GUI. It supports the NIfTI image format. Users need to set the input directory containing the preprocessed data, as well as the output directory. Users can select PerAF (without standardization by global mean), mPerAF, or zPerAF.
We also implemented a command-line tool for Linux, named REST-PerAF, based on AFNI (Cox 1996), for calculation of PerAF. It can be downloaded at http://www.restfmri.net.
More usage details can be found in the manual, which can be downloaded at http://www.restfmri.net.

mean and divided by standard deviation). The original ALFF map is mathematically unsuitable for comparison and hence not listed here. The white contours denote the boundary of the intersection mask. The ICC threshold was set at ≥ 0.5 for all maps. L: left side of the brain. R: right side of the brain.

Fig. 4
Histogram of test-retest reliability of all voxels. The Y axis is the number of voxels in each bin (with an ICC step of 0.05). The upper panel (a) is the short-term (session 2 against session 3) reliability. In general, short-term reliability was better than long-term reliability. For short-term reliability, most voxels had ICC > 0.5 for all measures. Comparing the number of voxels with ICC > 0.5 among measures, PerAF, mPerAF, and zPerAF performed slightly better than mALFF and zALFF, and much better than fALFF, mfALFF, and zfALFF. Please see Table 2 for the detailed number of voxels. For long-term reliability (session 1 against session 2), mPerAF and zPerAF performed similarly with

Fig. 5
Test-retest reliability overlapping maps of mPerAF with mALFF and mfALFF on voxels with ICC > 0.5. The upper row is for mPerAF and mfALFF (a: short-term; b: long-term) and the lower row is for mPerAF and mfALFF (c: short-term; d: long-term). L: left side of the brain. R: right side of the brain.

Fig. 6
Intersection mask of Dataset-2. The left panel shows, for each voxel in Dataset-2, how many sessions (34 subjects × 2 = 68 sessions in total) covered it. The right panel is the intersection mask covered by all 68 sessions.