Reproducibility and Temporal Structure in Weekly Resting-State fMRI over a Period of 3.5 Years

Resting-state functional MRI (rs-fMRI) permits study of the brain’s functional networks without requiring participants to perform tasks. Robust changes in such resting state networks (RSNs) have been observed in neurologic disorders, and rs-fMRI outcome measures are candidate biomarkers for monitoring clinical trials, including trials of extended therapeutic interventions for rehabilitation of patients with chronic conditions. In this study, we aim to present a unique longitudinal dataset reporting on a healthy adult subject scanned weekly over 3.5 years and identify rs-fMRI outcome measures appropriate for clinical trials. Accordingly, we assessed the reproducibility, and characterized the temporal structure of, rs-fMRI outcome measures derived using independent component analysis (ICA). Data was compared to a 21-person dataset acquired on the same scanner in order to confirm that the values of the single-subject RSN measures were within the expected range as assessed from the multi-participant dataset. Fourteen RSNs were identified, and the inter-session reproducibility of outcome measures—network spatial map, temporal signal fluctuation magnitude, and between-network connectivity (BNC)–was high, with executive RSNs showing the highest reproducibility. Analysis of the weekly outcome measures also showed that many rs-fMRI outcome measures had a significant linear trend, annual periodicity, and persistence. Such temporal structure was most prominent in spatial map similarity, and least prominent in BNC. High reproducibility supports the candidacy of rs-fMRI outcome measures as biomarkers, but the presence of significant temporal structure needs to be taken into account when such outcome measures are considered as biomarkers for rehabilitation-style therapeutic interventions in chronic conditions.


Introduction
Functional magnetic resonance imaging (fMRI) can noninvasively reveal the functional organization of the human brain, even in the absence of explicit tasks. Referred to as resting-state functional MRI (rs-fMRI), the method exploits synchronous fluctuations in blood oxygen level dependent (BOLD) signal throughout intrinsic brain functional networks [1]. The ability to study the brain's functional networks without requiring participants to perform explicit tasks has clinical appeal, as it allows use of an identical protocol for all patients, regardless of cognitive or physical limitations. This is especially important in chronic conditions that affect motor function, and the need for non-invasive and reproducible biomarkers is enhanced by advances in long-term therapeutic interventions for such chronic conditions [2][3][4][5]. Further, robust changes in resting state networks (RSNs) have been observed in chronic diseases such as spinal cord injury [6], cerebral palsy [4,5], Parkinson's disease [7,8], multiple sclerosis [9,10], and stroke [11], indicating that rs-fMRI based network outcome measures have the potential to serve as biomarkers for chronic diseases and their progression, as well as the effects of possible therapeutics.
Here we report the use of a unique longitudinal dataset that covers the time span of 185 weeks with weekly repeat measures. The dataset is exceptional in its length and frequency of acquisition and provides a unique opportunity to gain insight into two different aspects of rs-fMRI derived measures that were previously not accessible: 1) the reproducibility of the RSN outcome measures over an extended time period relevant for long-term clinical trials, and 2) inter-session temporal characteristics of the multi-year time courses of the rs-fMRI based outcome measures, provided through time series analysis.
In this study, we aimed to: 1) present the unique longitudinal dataset reporting on a healthy adult subject scanned weekly over 3.5 years, 2) identify RSN outcome measures appropriate for clinical trials, with high intra-subject inter-session reproducibility over an extended timeframe, and 3) identify potential parameters-of-interest by assessing the existence of temporal structure within RSN outcome measure time courses. To achieve these goals, we first investigated the intra-subject inter-session reproducibility of independent component analysis (ICA)-derived rs-fMRI outcome measures, namely network spatial maps, BOLD temporal signal fluctuation magnitudes, and temporal correlations between pairs of functional networks (between-network connectivity; BNC). We then performed time series analysis on the time courses of the RSN outcome measures to assess their temporal structure.
The RSN outcome measures were stable over the period of 185 weeks, with executive RSNs showing the highest reproducibility. Significant trend, annual periodicity, and persistence existed in the time courses of the outcome measures, suggesting that when such outcome measures are considered as biomarkers for rehabilitation-style therapeutic interventions in chronic conditions, it may be beneficial to take into consideration the temporal structure of the outcome measure.

Participants
The longitudinal single-subject dataset was acquired from a healthy volunteer (40 years of age at time of initial scan; male). A total of 158 sessions of MRI data was acquired on a weekly basis, over a span of 185 weeks. Scans were typically performed on Thursday mornings at 11:30am; in cases of scheduling conflicts, scans were performed on different days of the week and/or times, or skipped, depending on the types of conflicts. The initial image acquisition was performed on the 7 th of December, 2009, and the last image acquisition was performed on the

Image Acquisition
All participants were scanned on a 3T Philips Achieva scanner (Philips HealthCare, Best, Netherlands).
Rs-fMRI data of the subject was acquired using a multi-slice SENSE-EPI pulse sequence [13,14] with TR/TE = 2000/30 ms, SENSE factor = 2, flip angle = 75°, 37 axial slices, nominal resolution = 3x3x3 mm 3 , 1 mm gap, 16 channel neuro-vascular coil, number of dynamics (frames) per run = 200. Identical imaging parameters were used to acquire the Kirby-21 rs-fMRI data [12], except for the number of dynamics per run, which was 210. Only the first 200 dynamics of the multi-participant data were analyzed, in order to match the length of the runs with the single-subject data. One of the 21 healthy volunteer dataset was identified to include excess motion and was excluded from further data analysis. The rs-fMRI scans were always acquired after the T1w scans, to allow participants to get acclimated to the noise and environment inside the scanner. Participants were instructed to stay as still as possible with their eyes closed during the entire scan, and no other instruction was provided.

Data Processing
Preprocessing of the rs-fMRI datasets was performed using SPM8 (http://www.fil.ion.ucl.ac. uk/spm) [15] and Matlab (Natick, MA). The preprocessing pipeline included: 1) slice timing correction, used to correct differences in image acquisition time between image slices, 2) motion correction, 3) co-registration, used to align structural images to functional images, 4) unified segmentation-normalization [16], used to transform functional images to normalized Montreal Neurological Institute (MNI) space (2x2x2 mm 3 ), 5) high pass filtering with 0.01 Hz cutoff, used to eliminate slowly varying background noise and effects of scanner drift, and 6) spatial smoothing using 6 mm full-width at half-maximum Gaussian kernel (i.e., twice the nominal size of the rs-fMRI acquisition voxel), used to suppress noise and reduce effects of imperfect normalization.
Group ICA of fMRI toolbox (GIFT) software (http://mialab.mrn.org/software/gift) [17] was used to perform group independent component analysis (GICA) [18]. Single-and multi-participant datasets were combined, and two steps of principal component analysis (PCA) data reduction were performed for group level analysis, where individual session data were first reduced to 70 principal components. The reduced data was then concatenated in the temporal direction and further reduced to 35 principal components.
Estimation of the number of independent components (i.e., 35) was guided by order selection using the minimum description length (MDL) criterion [19]. The dimensionality of the individual session PCA data reduction (i.e., 70) was set by doubling the estimated component number, to ensure robust backreconstruction [20,21] following the ICA decomposition.
ICA [22] is one of the most commonly used methods for analyzing rs-fMRI data. It models the data as a linear mixture of signals originating from spatially-independent sources, and then estimates the sources by maximizing their independence [23]. These sources include not only the spontaneous fluctuations in BOLD signals in functional networks, but also "nuisance" signals such as those arising from head motion, respiration, and cardiac pulsations. Later, these nuisance components are eliminated and only sources identified as RSNs are retained for further analysis. One of the biggest advantages of the method is that it allows the analysis of rs-fMRI data without a priori knowledge of the sources [18,24]. In this study, ICA was performed using the InfoMax algorithm [22], and the process yielded a total of 35 aggregate independent component (IC) spatial maps and associated time courses. Single-session maps for each session (single-subject) and participant (multi-participant) were obtained using backreconstruction [18] via the "GICA3" procedure [20]. A flowchart that visualizes the details of the preprocessing and GICA steps is presented as S1 Fig. RSNs were identified manually from the 35 ICs estimated. Three were rejected due to low reliability of the ICs as assessed using the ICASSO toolbox [25]. The spatial distribution (i.e., grey matter vs. white matter and cerebral spinal fluid) and temporal frequency power distribution of the remaining 32 ICs were manually assessed, and 18 ICs were eliminated as representing non-neuronal sources such as head motion, respiration, and cardiac pulsations-specifically, the peak activations of the networks were required to be within the gray matter, and RSN spatial maps were required to have low overlap with vascular and ventricular regions. Motion artifact components that display high intensity values around the edges of the brain were also identified and eliminated.-The process identified the remaining 14 ICs as RSNs that represent unique functional networks.

Resting State Network Outcome Measures
The reproducibility and temporal structure (trend, annual periodicity, and persistence) were assessed for three types of RSN outcome measures: spatial similarity of RSN maps, temporal signal fluctuation magnitude, and BNC.
Spatial similarity of RSN maps. The spatial similarity of each week's RSN spatial maps to the group mean map, as calculated using η 2 [26], was obtained as an outcome measure. First, the single-session RSN maps for each week were obtained through backreconstruction of the aggregate maps, and converted to z-score using Fisher's r-to-z transformation [18,20]. A given voxel's value in each RSN maps, therefore, represents the weight of the RSN time course with respect to the measured relative BOLD signal. The similarity measure η 2 [26][27][28] was defined as: where i represent voxel index within a brain, n is number of voxels in a brain, a i and b i are the values at position i in maps a and b, respectively, m i is the mean value of the two images at position i, and M is the grand mean across the mean image m. η 2 values can range from 0 to 1, where 0 indicates no similarity between two images, and 1 indicates that two images are identical.
By calculating the fraction of the variance in one image accounted for by variance in a second image, η 2 reports on the difference in the values at corresponding points in the two images. One of the biggest advantages of η 2 is that it allows the quantification of differences/similarity of the two images instead of the correlational relationship between them [26].
Finally, for each network, a spatial overlap map [29] was created in order to provide visual means to assess the repeatability of the single-session RSN maps. The process first involved thresholding (z-score > 1) of the single session maps to obtain voxels most representative of each RSN. The resulting binary maps were subsequently summed, and then normalized by dividing the maps by the total number of image acquisition, and multiplying by 100 to convert to percentage.
Temporal fluctuation magnitude of RSN time courses. The magnitude of temporal signal fluctuations for each RSN was calculated as the quadratic mean (root mean square; RMS) of the backreconstructed time courses that were scaled to the original data to represent percent signal change [18,20].
Between-network connectivity of RSN time courses. BNC, a measure of synchrony between RSNs, was computed for each session as the Pearson correlation coefficient of the network time courses [30,31].

Analysis of RSN Outcome Measures over Sessions
Reproducibility. The intra-class correlation (ICC), a metric of test-retest reliability, is widely used in the rs-fMRI reproducibility literature. However, the ICC cannot be used for the present study, which is a longitudinal case report, because there is no 'class' (or group) of subjects who underwent 3.5 years of weekly scanning. Instead, for each type of RSN outcome measure (i.e., spatial similarity of RSN maps, temporal fluctuation magnitude, and BNC), intrasubject inter-session reproducibility was characterized using coefficient of variation (CV), defined as the ratio of standard deviation (SD) to mean, expressed in percentage. CV enables the comparison of data sets with different means, by providing a standardized measure of dispersion. However, the calculated CV can appear artificially inflated if a mean value of a data set is close to zero. Therefore, in order to help keep things in perspective, we also report corresponding SD values.
Time series analysis-trend, annual periodicity, and persistence. The single-subject, multi-year acquisition of the longitudinal rs-fMRI dataset enabled time series analysis of weekly RSN outcome measures and observation of their temporal structure (trend, annual periodicity, and persistence) over the extended time period.
Existence of linear trends in the weekly RSN outcome measures was tested using general linear model. Significance of the trend was tested using F statistics, and each outcome measure's p-value was adjusted for multiple comparisons using false discovery rate (FDR) correction for the number of tested RSNs.
Recognizing that changes in degree of subject motion as well as changes in signal intensity due to variable scanning environment over time may introduce linear trends not of neurological origin, the degree of subject motion over each session over the 185 weeks period was assessed to identify potential confounds. A quantitative measure of subject motion was provided by frame-wise displacement (FD) [32], which was calculated by summing the absolute value of the three differenced translational realignment parameters and the three differenced rotational parameters, which were converted from radians to millimeters by assuming a brain radius of 50 mm.
Additionally, potential changes in signal intensity arising from a variable scanning environment were assessed using a dataset from a concurrent ongoing phantom stability scan within the F.M. Kirby Research Center. The study was run by placing a 15 cm diameter silicone oil filled sphere "phantom" in a 32 channel head coil and running gradient-echo echo-planar imaging (EPI) scans (TR/TE = 3000/40ms, 20 axial slices, imaging matrix: 64x64, field of view: 230 x 230 mm, 3mm slice thickness, 1 mm gap, 300 dynamics) [33], which is equivalent to running a rs-fMRI scan. Signal intensity of the phantom from the corresponding weeks was calculated, and a weekly signal intensity measure was constructed for detection of any linear trend. To prevent reporting of spurious results, permutation tests with 1000 iterations were performed.
The significance of the spectral peaks at 0.0192 weeks -1 (annual periodicity; 1/52.18 weeks) in the RSN outcome measures was tested using a procedure for finding spectral peaks in time series described by Ahdesmaki et al [34,35] at p < 0.05, after FDR correction for the number of tested RSNs. The robust detection method uses Fisher's g-test for the detection of periodic fluctuations in multiple time courses. The method also incorporates regression based methods to find the spectral estimate of a time course instead of using the basic periodogram, allowing robust estimation of spectral peaks in non-uniformly sampled data with unknown noise characteristics. This makes the method especially appropriate for the 185 weeks dataset (with 158 time points) described in this study. We refer readers to the above mentioned references for further details of the detection method. The seasonal effect on RSN outcome measures was also investigated by correlating the RSN outcome measure time courses with the recorded daily maximum temperature of Baltimore (freely distributed by the National Oceanic and Atmospheric Administration (NOAA); www.noaa.gov). To prevent reporting of spurious results, permutation tests with 1000 iterations were performed.
Finally, existence of autocorrelation within the RSN outcome measures was assessed by estimating the autoregressive moving average (ARMA) models of the weekly RSN outcome measures using automatic spectral analysis [36,37]. As with the robust spectral peak detection method used to observe annual periodicity, the method is specifically designed to account for non-uniformly sampled data with unknown noise characteristics. ARMA model is estimated in two parts, through separate estimation of the autoregressive (AR) and moving average (MA) models. Traditional ARMA model estimation algorithms often utilize maximum likelihood (ML) approach for both the AR and MA model estimations. However, the ML approach is known to provide poor estimates of the MA model when there are missing data. We therefore utilize the automatic spectral analysis method, which uses reduced statistics algorithm [36,37] to improve estimates of MA model. Reproducibility of the spatial map similarity measure for both single-and multi-participant datasets is presented using violin plots (Fig 3). For visualization purposes, the violin plots were sorted based on the interquartile range of the single-subject data.

Functional Networks
For each RSN, the degree of spatial similarity of the single-subject dataset's backreconstructed spatial maps to the group mean map was found to be high, with mean η 2 values ranging from 0.747 to 0.841 (Fig 3, Table 1). Also, for all RSNs, the median (second quartile) η 2 values of the single-subject dataset were within the range of η 2 values of the multi-participant dataset.
The two visual networks (Vis-a and Vis-b) and a sensorimotor network (Smot-dor) showed the lowest intra-subject inter-session reproducibility, with CV values of 6.68, 4.86, and 4.75%, while the executive networks (Exec-R and Exec-L) showed the highest reproducibility, with CV values of 1.60 and 1.65% (Table 1). Similarly, the sensorimotor network (Smot-ven) and visual  RSN spatial maps for representative weekly sessions. RSN mean spatial maps (leftmost column), representative backreconstructed weekly single-session spatial maps (middle eight columns), and overlap maps (rightmost column) for the 14 RSNs. The degree of spatial similarity of each session's spatial map to the corresponding mean map, as measured using eta-squared (η 2 ), is indicated below the single-session maps.
A test of equal variance (f-test) indicated that Smot-ven network showed higher intra-subject inter-session spatial map reproducibility compared to its inter-participant reproducibility (p < 0.05, corrected; Table 1). The Vis-a network, on the other hand, showed an opposite trend; intra-subject inter-session reproducibility was significantly lower than the inter-participant reproducibility (p < 0.05, corrected; Table 1). The higher intra-subject inter-session spatial map reproducibility for the Smot-ven network was preserved when the same analysis was performed using only the first twenty sessions of the single-subject data, matching the multiparticipant dataset's number of sessions (not reported separately to avoid overlap). Such was not the case for the Vis-a network; while the trend of lower intra-subject inter-session spatial map reproducibility was still observed, the difference was no longer significant.
Temporal fluctuation magnitude of RSN time courses. Reproducibility of the RMS % BOLD value for each session's backreconstructed time course for both the single-and multiparticipant datasets are presented using violin plots in Fig 4. For visualization purposes, the violin plots were sorted based on the interquartile range of the single-subject data.
The single-subject, median temporal signal fluctuation magnitude of each RSN, was within the range of temporal signal fluctuation magnitude values of the same RSNs within the multiparticipant dataset, as shown in Fig 4. Also, the ranges of the mean temporal signal fluctuation magnitude for the single-and multiple-participant dataset were similar-from 0.454% to 2.02% and 0.716% to 2.282%, respectively ( Table 2). Reproducibility of RSN spatial maps. Spatial similarity of each session's RSN spatial map to the corresponding group mean map, measured using eta-squared (η 2 ), for single-subject (blue) and multiparticipant (yellow) datasets, is visualized using violin plots. The first, second, and third quartiles of the data are represented within the violin plots as dotted lines.  (Table 2). For the multi-participant dataset, Aud and Cb networks showed the lowest inter-participant reproducibility with CV values of 111.4 and 101%, respectively, and Attn-dor and Attn-ven networks had the highest reproducibility, with CV values of 37.7 and 38.7%, respectively.
Test of equal variance (f-test) between the single-subject and multi-participant dataset indicated higher intra-subject inter-session reproducibility of temporal fluctuation magnitude, with eight RSN networks (Exec-L, Exec-R, DMN-c, Smot-ven, aud, Vis-b, Cb, and Sal networks) showing significantly higher intra-subject inter-session reproducibility. In contrast, only two RSN networks (Smot-dor and DMN-a) showed significantly lower intra-subject inter-session reproducibility of the networks ( Table 2; p < 0.05, corrected). In order to ensure that such high intra-subject inter-session reproducibility is not solely due to the larger sample size of the single-subject dataset, reproducibility was also calculated using the first twenty sessions from the single-subject dataset (not reported separately). The observation was consistent, with six of the eight RSNs  (Table 3, S2 Table). The top ten RSN pairs with the largest mean BNC values for the single-and multi-participant datasets are reported in Table 3. There was a significant overlap between the top ten lists from the single-and multi- participant datasets, with seven out of ten RSN pairs with the strongest connectivity in the single-subject dataset's top ten list also appearing in the multi-participant dataset's top ten list. Values of mean and SD BNCs are also visualized as matrices in Fig 5(a) and 5(b), respectively. Within each matrix, the single-subject dataset is presented below the main diagonal, and the multi-participant dataset is presented above. For each RSN pair, differences in mean and SD BNC between single-and multi-participant datasets was small, as shown in Fig 5(c).
In order to verify that the mean BNC values for each RSN pair in the single-subject dataset are within the range of BNC values for the corresponding RSN pairs in the multi-participant dataset, the difference in mean BNC between the two datasets were used to sort and identify ten RSN pairs with the smallest (Fig 5d; top) and the largest (Fig 5d; bottom) differences. Fig  5d-bottom shows that the single-subject mean BNC values for the ten RSN pairs with the largest differences are still within the range of the corresponding multi-participant BNC values.
Additionally, CV values of the BNC measures were used to sort and identify ten RSN pairs with the highest reproducibility for both the single-and multi-participant datasets (Table 4). For the single-subject dataset (Table 4; top), the Exec-L/Exec-R network pair was shown to be the most reproducible, with a CV value of 13.6%, while the Smot-dor/Smot-ven network pair was the most reproducible for the multi-participant dataset (Table 4; bottom), with a CV value of 27.4%. For the single-subject dataset, the somatosensory and visual networks had the most reproducible correlations with other RSNs, and this observation also held for the multi-participant dataset.
Finally, recognizing that artificially inflated CV values can arise if a mean value of a data set is close to zero, and therefore to help keep things in perspective, we also report corresponding  (Tables 3 and 4, S2 Table, and Fig 5b). BNC SD values ranged from 0.082 to 0.29 (Fig 5b and S2 Table)

Time Series Analysis
Trend. RSNs and RSN pairs with significant (after correction for multiple comparisons) linear trends in rs-fMRI outcome measures are visualized as matrices in the top row of Fig 7. These matrices are color-coded to indicate statistically identified positive, negative, and no trend. For each RSN outcome measure, the intercept and slope of the estimated linear trend, as well as the slope's corresponding F statistic and associated p-value, are listed in Table 5. Eleven out of the fourteen RSNs showed significant linear trends in η 2 over 185 weeks. Of these eleven RSNs, ten showed positive trends, while Exec-R showed a negative trend. In comparison, only two (Vis-a and DMN-b) of 14 RSNs showed significant trends in the temporal fluctuation magnitude and twenty-nine out of 105 RSN pairs showed significant trends in BNC. All trends were positive for temporal fluctuation magnitude and BNC. Significant linear trends in BNC were more pronounced in RSN pairs containing DMN-a or DMN-c networks, although significant trends were observed in various RSN pairs involving all categories of functional networks (i.e., auditory, sensorimotor, visual, DMN, attention, executive, salience, and cerebellar).
The weekly FD (a measure of subject motion) and signal intensity measures from the phantom stability study (a measure of week-to-week scanner stability), did not show significant linear trends (not reported separately). Additionally, permutation tests with 1000 iterations confirmed that significant linear trends no longer exist when weekly outcome measures are randomized. Annual periodicity. RSNs and RSN pairs with significant (after correction for multiple comparisons) annual periodicity in relevant outcome measures are visualized as matrices in the middle row of Fig 7. The matrices are color-coded in black and red, where red blocks highlight RSNs and RSN pairs with significant annual periodicity. Table 6 lists p-values for RSNs and RSN pairs with significant periodicity, for each outcome measure. Nine out of 14 total RSNs showed significant annual periodicity in the η 2 measure over the period of 185 weeks. In comparison, only three (Vis-a, Attn-ven, and Attn-dor) of the 14 RSNs showed significant annual periodicity for temporal fluctuation magnitude measures, and none of the 105 RSN pairs showed significant annual periodicity for the BNC measures. Additionally, Table 6 shows that the majority of the RSNs with significant annual periodicity also display good correlation with Baltimore's daily maximum temperature, with correlation coefficients ranging from 0.27 to 0.34. Permutation tests with 1000 iterations confirmed that the observed significant linear trends no longer exist when weekly outcome measures are randomized.
Persistence. After ARMA models that best described the autocorrelation structure of RSN outcome measure time courses had been estimated, the order of the AR portion of the model   Table, which lists the ARMA models that best describe the autocorrelation structure of the weekly outcome measures. Twelve out of 14 RSNs displayed significant autocorrelation in the η 2 measure over the period of 185 weeks. In comparison, only three (Vis-a, DMN-b, and Sal) out of 14 RSNs displayed significant autocorrelation for temporal fluctuation magnitude measures, and 26 of the 105 RSN pairs displayed significant autocorrelation in BNC. The order of estimated ARMA models was shown to vary, where AR and MA orders ranged from 0 to 3 for the three types of RSN outcome measures (S3 Table).

Discussion
Previous reproducibility studies of intrinsic functional networks showed that RSNs are reproducible across participants [29,38,39], within participants over durations of weeks to months  [29,40,41], as well as up to a year [42]. The primary goal of this study was to investigate whether intra-subject inter-session RSN outcome measures were reproducible over an even longer period that is more relevant for rehabilitation studies, and this is confirmed by the results.

Functional Networks
The 14 RSNs identified in this study (Fig 1) were in good correspondence with those consistently reported in previous rs-fMRI studies. The identified RSNs included an auditory network [41], ventral and dorsal sensorimotor networks [1,24,41], two visual networks [24,41], three default mode networks [24,41,43], ventral and dorsal attention networks [41], left and right executive-function networks [44], a salience network [45], and a cerebellar network [29,46]. Of the 35 initially estimated ICs, 21 were rejected as nuisance components (of non-neuronal sources); this rate of rejection of nuisance components was consistent with previous studies [29].

Reproducibility
Spatial similarity of RSN maps. As group mean maps (Fig 2; left column) are obtained by averaging multiple RSN spatial maps across sessions, the most robust "core" activation regions in the RSN spatial maps are preserved within the group mean maps. Measuring the spatial similarity of each backreconstructed RSN map to its corresponding group mean map, therefore, may provide insight into the degree of robustness of core activation region, reflected by the value of η 2 , and the degree of robustness of "non-core" activations, reflected by the variance of η 2 . The high mean η 2 values of all RSNs, ranging from 0.747 to 0.84, show that the 14 RSNs identified in the study have very robust core activation regions. Stronger core activation regions, however, did not translate to higher reproducibility. The Vis-a network, for example, showed that a network can have a strong core activation region and also have variable noncore activation regions that lead to increased variability in the spatial similarity measure (Table 1 and Fig 3). This network's variable non-core activation regions can also be visually observed in Fig 2. For both single-and multi-participant datasets, the three networks with the lowest spatial map inter-session reproducibility were exteroceptive in nature (i.e., related to the external world), namely visual and sensorimotor networks (Vis-a, Vis-b, Smot-dor, and Cb). Consistent with Kosslyn's reports on ideation [47], "mind wandering" during rest may have led to increased modulation in the exteroceptive networks.
Finally, for nine out of 14 RSNs, the intra-subject inter-session reproducibility, as measured using CV, of the RSN spatial maps was higher than, or similar to, inter-participant reproducibility, with Smot-ven network displaying significant difference after Bonferroni correction for multiple comparisons. While this observation may be explained by the inherent higher interparticipant variability in the dataset, such a difference could also arise due to imperfect spatial normalization across participants. This suggests that caution may be warranted in using atlasbased seed placement for seed-based correlation, and that more sophisticated spatial normalization methods (e.g., large deformation diffeomorphic metric mapping (LDDMM) [48,49]) may be beneficial.
Temporal fluctuation magnitudes of RSN time courses. For all RSNs, mean intra-subject inter-session RMS % BOLD values (Fig 4, Table 2) were comparable to those reported in a previous multi-participant rs-fMRI study [41]. Additionally, RSNs that showed low spatial map intra-subject inter-session reproducibility also tended to show low temporal fluctuation magnitude intra-subject inter-session reproducibility. Many such least-reproducible RSNs were of exteroceptive nature, and thus may support the previously stated hypothesis that "mind wandering" [47] during rest may have led to increased modulation in exteroceptive networks.
It should also be noted that compared to the CV values of the network spatial maps (Table 1), the CV values of the temporal fluctuation magnitude ( Table 2) were significantly larger, ranging from 24.5 to 85.8% for the single-subject dataset and from 37.9 to 111.4% for the multi-participant dataset. These significantly larger CV values were driven by the small mean values and large SD values of the temporal signal fluctuation measurement. Using CV as a measure of dispersion is suboptimal when a mean value is close to zero, as calculated CV measure becomes sensitive to small changes. In this case, a larger issue may be the relatively outcome measures. Red blocks indicate significant annual periodicity and black boxes no annual periodicity. BOTTOM AR orders of the estimated ARMA models for RSNs and RSN pairs are visualized for each outcome measures, where black box indicates no autocorrelation, red box AR order of 1, yellow box AR order of 2, and white box AR order of 3. Refer to S3 large SD values of the RSN temporal fluctuation magnitude; the fact that the SD values of many RSNs are close to the corresponding mean values indicates that the measure may be relatively insensitive to small effect sizes, and caution is thus warranted when using the this outcome measure in longitudinal studies.
Between-network connectivity of RSN time courses. The ranges of mean BNC values for the single-subject dataset (-0.133-0.660), and the multi-participant dataset (-0.104-0.606) ( Table 3, S2 Table) were similar, and strong BNC values were observed between networks within the same functional domains; such as between the Vis-a Exec-L/Exec-R network pairs. Overall, the mean BNC values for the single-and multi-participant datasets were similar, reflected by the highly symmetric combined correlation matrix of the datasets (Fig 5a). In addition, network pairs that showed strong connectivity within the single-subject dataset also showed strong connectivity within the multi-participant dataset.
Similar to the large CV values that were observed for temporal fluctuation magnitude measurements, the CV values for the BNC measurements were also relatively high, driven by the Intercept and slope of the estimated linear trend, as well as the slope's F statistic and p-value in three RSN outcome measures, namely the (a) spatial similarity (eta-squared, η 2 ), (b) temporal fluctuation magnitude, and (c) BNC, for each RSNs with significant linear trends are listed.
doi:10.1371/journal.pone.0140134.t005 Table 6. RSNs with significant annual periodicity and/or significant correlation with daily maximum temperature (Baltimore MD, USA) in outcome measures.
(a) Eta-squared (η 2 ) RSN p-value (annual periodicity) Correlation coefficient (w/ daily maximum temperature) p-value (w/ daily maximum temperature)  Table. It should be noted, however, that in cases where a mean BNC value is close to zero, the corresponding SD value needs to be very small before the BNC value of a particular functional network pair can be used as a reproducible longitudinal measure. Therefore, caution should be taken when investigating network pairs with high variance, relative to the corresponding mean BNC value.
For both the single-and multi-participant datasets, RSN pairs with connections to the somatosensory and visual networks demonstrated higher reproducibility (Table 4). This observation agrees with results from a previous study that reported robust inter-participant reliability for the primary visual network [50], compared with other networks.
One of the main observations of this study was the very high intra-subject inter-session reproducibility of the Exec-L and Exec-R networks. For the single-subject dataset, these networks were the most reproducible RSNs for spatial map (Table 1) and temporal signal fluctuation magnitude ( Table 2) outcome measures. Additionally, connectivity between the Exec-L/ Exec-R RSN pair was the most reproducible, with the smallest CV value of 13.6% (Fig 4,  Table 4). This RSN pair was also among the most reproducible networks for all rs-fMRI outcome measures in the multi-participant dataset. Such high intra-subject inter-session reproducibility of the executive control network components has an important implication, as the executive control network is known to be actively involved in functions such as impulse control [51] and consciousness [52,53], and is closely related to disease states such as substance abuse [51], unresponsive wakefulness syndrome [52], and Alzheimer's disease [54]. Accordingly, this suggests that the executive RSN components and the related outcome measures have high potential for serving as biomarkers of disease states.

Time Series Analysis
Trend. η 2 values of weekly RSN spatial maps correspond to degrees of spatial similarity between weekly maps and a group mean map. Therefore, the existence of a trend in the η 2 measurements (Fig 7 TOP(a), Table 5a) is noteworthy. Specifically, a positive trend in η 2 indicates that as time passes, weekly single-session maps become more stable, looking more like the group mean map. A negative trend, on the other hand, would indicate that weekly maps become more variable, looking less like the group mean map as time passes. Previous studies show that a constant and linear decrease of gray matter volume can be initiated as early as the 20's [55][56][57]. While the exact cause of this decline is still under debate, one theory suggests that the decline is due to neuronal and synaptic pruning in the human cortex during reorganization following neural maturation [58]. Our observation of positive trends in the η 2 values for most of the RSNs may partially be explained by such neuronal pruning of gray matter. Preliminary analysis of the subject's high resolution T1w structural images revealed that the subject's gray matter volume decreased by about 0.30 ml/month over the study period [59]; this linear decrease in gray matter volume is consistent with previous reports [55].
Possible long-term habituation of the subject to the scanning environment, and a subsequent decrease of subject motion, was also identified as possible causes for the almost ubiquitous positive trends in η 2 measures. Analysis of the degree of subject motion, represented by weekly FD measures, however, showed that there was no significant trend in the degree of movement across imaging sessions. Similarly, there was no significant trend in the weekly signal intensity measures from the phantom stability data, indicating that any contribution of variable scanning environment over the 185 weeks to the observed linear trends in the η 2 values was minimal.
The effect of age on RSNs has been extensively explored in previous literature [20,[60][61][62][63][64][65][66][67][68]. Specifically, the level of coherent activity and degree of co-activation within RSNs are shown to increase during early childhood and young adulthood, indicating neuronal maturation of the networks. This phase is followed by decreases in the level of coherent activity and degree of coactivation in the older population, marked by subsequent cognitive decline [20,[60][61][62][63][64][65][66][67][68]. Such decreases in coherent activity, measured using the magnitude of temporal fluctuation, and degree of co-activation, measured using BNC, were not observed in the subject. Instead, all significant trends reported in this study were positive (Fig 7 TOP(b-c) and Table 5b and 5c). One reason for the discrepancy may be that majority of the above-mentioned studies compared cohorts of young participants (10-34 yrs) and much older participants (60-93 yrs), thus not including the age range of the subject studied here (ca. 40 yrs). Two studies that did include participants in their 40's [20,68] reported mixed results for different RSNs. It should also be noted that the effects of age on functional connectivity could be mediated by various factors such as stress, education, and exercise [69,70].
A limitation of this study is that the existence of a trend was inferred only using general linear model. This method reports the best fitting linear model of the type specified, but more sophisticated trend detection algorithms may reveal more complex types of trends. Also, it should be noted that effect sizes in the temporal analyses are very small, which may indicate that observed significance is driven by the large number of degrees of freedom. And while this data set is large for fMRI, it is not an exceptionally long time series relative to the norm of seasonal studies.
Annual periodicity. The unique longitudinal dataset of the study enabled us to investigate whether seasonal (more specifically, annual) patterns exists in RSN outcome measures, and the results show that such seasonal patterns exist in the weekly η 2 and temporal fluctuation magnitude measures of relevant RSNs (Fig 7 MIDDLE and Table 6). This result is also confirmed by the good correlation between the RSN outcome measure time courses and Baltimore's daily maximum temperature observed in Table 6. However, the cause and mechanism of the observed seasonal patterns is unknown at this time. There are, however, several studies that look into the fluctuating patterns of shorter timeframe (e.g., diurnal and monthly) in functional connectivity measures, and we looked to see whether such higher frequency fluctuation in relevant RSNs translate to lower frequency annual fluctuation.
Interaction between circadian rhythmicity and time awake (homeostatic process), and the resulting diurnal rhythms within various biological systems that range from gene expression [71,72] to body temperature [73] is well-known. Diurnal rhythms were also shown to affect higher order cognitive functions [74][75][76], and recent studies have shown that diurnal rhythms also exist in the strength of functional connectivity [77][78][79]. One study in particular showed that highly rhythmic connectivity patterns exist within sub-systems of DMN and sensorimotor network [77]. However, the observed diurnal rhythm in DMN and sensorimotor networks did not translate to the existence of annual periodicity in the same RSNs (i.e., DMN and sensorimotor networks did not show annual periodicity; Fig 7 TOP(b)). None of the BNC measures display significant annual periodicity (Fig 7 MIDDLE(c) and Table 6c), and this result is consistent in part with a previous study [80], which reported a lack of monthly fluctuation of BNC values. It is not clear at this time why some RSN outcome measures, but not others, show annual periodicity.
Persistence. The persistence, or autocorrelation, of a system describes the system's tendency to stay in the same state from one observation to the next, and is a common feature of many biological systems. While the existence of persistence within a system can complicate the understanding of its underlying mechanism by reducing the number of independent variables and introducing multiple confounding parameters that are not easily separable, persistence can also be exploited to predict future observations based on those of the past [81,82]. Realization of the existence of persistence in a system, and the subsequent estimation of a best-fit mathematical model of the persistence, therefore can lead to: 1) better understanding of the system by quantitatively assessing the fraction of the system's variance explained by the measured persistence, and 2) better prediction of the behavior of a time course based on past observations. Our results show that persistence is a characteristic of many RSNs and RSN pairs, for all three types of RSN outcome measures (Fig 7 BOTTOM and S3 Table). Within the context of using rs-fMRI derived outcome measures as patient-specific biomarkers of recovery during clinical trials, this observation may lead to the development of more accurate inferences from such data.

Limitations
A major limitation of this report is that only one subject underwent the weekly scanning for 185 weeks. Scanning additional participants would have helped to ensure the generality of our findings. However, ensuring compliance of multiple participants over such a prolonged period would have been difficult. We have sought to address this issue by including results from a previously acquired multi-participant dataset on the same scanner and with identical acquisition parameters [12], and ensured that the healthy male individual of the longitudinal dataset was a representative healthy control. Estimated means of all network outcome measures for the longitudinal dataset-spatial maps, temporal fluctuation magnitude, and BNC-were within the range of the corresponding outcome measures from the multi-participant dataset, as shown in Figs 3-5 -i.e., in all cases, values for the longitudinal dataset were within the range of those computed for the multi-participant dataset.
While we report some outcome measures for the multi-participant dataset, we refrain from an in-depth analysis of it, as the focus of this paper was on long-term reproducibility of the single-subject data. However, published reproducibility studies have consistently shown that rs-fMRI derived outcome measures were robust across participants [29,38,39,83], and here we briefly summarize previous literature on this subject. An early study by Chen et al., acquired data from 14 healthy participants over the period of 16 days and reported that the intrinsic networks were consistent across multiple sessions [29]. Meindl et al., assessed the reproducibility of DMN networks across multiple sessions in 18 healthy participants each scanned three times over the period of a week, and found DMN networks to be highly spatially consistent across sessions [38]. Shehzad et al. reported modest to high inter-participant reproducibility in a dataset acquired from 26 participants at three different times over five months [83].
Another limitation of this study is that we report only on outcome measures derived using group ICA (GICA) [18]. However, as one of the most commonly used methods to estimate RSNs, GICA offers many advantages. GICA is an extension of the ICA method, in which reducing and concatenating multi-session/multi-participant fMRI data allows ICA to be applied once to the aggregate data. This obviates the cumbersome and potentially inaccurate matching of components estimated separately from data from individual sessions and/or participants. Furthermore, the alternative method of performing ICA separately for each session may result in different numbers of components for different sessions, depending on the level of noise. A possible consequence of such approach is that networks may split into varying numbers of sub-networks, as seen in higher-order ICA analysis [20]. This in turn introduces additional uncertainty into the network identification process, and may make group inference across sessions unfeasible. Alternatively, use of GICA [18] allows delineation of identical networks for the single-and multi-participant datasets, eliminating the need to perform ICA separately on each dataset. There have been concerns that the identification of aggregate networks using a single dataset of multiple groups, and then backreconstructing the single-session datasets, may bias the results towards the group mean. However, the original report by Calhoun et al., [18], and a study by Schmidhorst and Holland [84] show that GICA can identify a network present in as little as 10% of the study population. Also, the findings reported in this study are consistent with reports of the reproducibility of rs-fMRI outcome measures estimated using alternative analytic approaches such as seed based temporal correlation [85][86][87][88][89].
Finally, due to time constraints, additional physiological or psychological measures were not obtained during the acquisition of this dataset. Together, such additional measures would have provided valuable information regarding the nature of the temporal structure we have observed in this study. Indeed, there are now ongoing endeavors to acquire longitudinal rs-fMRI data like ours, augmented with the regular parallel acquisition of many auxiliary measures, ranging from sleep data to blood samples. One such example is the MyConnectome project (http://myconnectome.org/wp/), during which along with rs-fMRI, biological samples (i.e., blood) as well as data about daily life activities were collected. We expect our study, along with other unique longitudinal studies, will provide new insight into understanding the dynamics of brain function over time.

Conclusions
Rs-fMRI allows noninvasive observation of brain networks, and has potential to yield biomarkers for clinical trials in neurological diseases where such RSNs may change. The goal of this study was to present a unique longitudinal dataset reporting on a healthy adult subject scanned weekly over 3.5 years, and identify RSN outcome measures with high intra-subject inter-session reproducibility over prolonged timeframes appropriate for rehabilitation trials. ICA was used to identify fourteen RSNs that represent unique functional networks. Three types of rs-fMRI outcome measures, namely spatial map similarity, temporal fluctuation magnitude, and BNC, were found to be reproducible across the extended study period. In particular, the Exec-R and Exec-L networks, which are closely related to disease states such as substance abuse and Alzheimer's disease, showed high intra-subject inter-session reproducibility for all three types of RSN outcome measures, suggesting that these networks may be of particular interest.
Additionally, we sought to identify potential parameters-of-interest for clinical studies, by assessing the existence of temporal structure in the three types of rs-fMRI outcomes measures. Time series analysis showed that the RSN outcome measures displayed properties including linear trend, annual periodicity, and persistence. This finding suggests that when RSN outcome measures are considered as imaging biomarkers for lengthy therapeutic interventions in chronic conditions it may be beneficial to take the temporal structure parameters into consideration. spatial maps, visualized using boxplots. Spatial similarity of each session's RSN spatial map to the corresponding group mean map, measured using eta-squared (η 2 ), for single-subject (a) and multi-participant (b) datasets, is visualized using box plots (end of boxes: quartiles, bar within boxes: median, small dots: outliers). In (b), for each RSN, the mean η 2 of the single-subject dataset is overlaid as a large gray circle. (DOCX) S3 Fig. Reproducibility of RSN signal temporal fluctuation magnitude, visualized using boxplots. Blood oxygenation level dependent (BOLD) signal fluctuation magnitude for each session's RSN time courses, calculated as root-mean-squared (RMS) % BOLD for single-subject (a) and multi-participant (b) datasets, is visualized using boxplots. In (b), for each RSN, the mean RMS % BOLD value for the single-subject dataset is overlaid as a large gray circle. (DOCX) S1 Table. Acquisition dates of the 158 resting state functional MRI (rs-fMRI) scans. A total of 158 scans were acquired over the period of 185 weeks.