Task-evoked activity quenches neural correlations and variability across cortical areas

Many large-scale functional connectivity studies have emphasized the importance of communication through increased inter-region correlations during task states. In contrast, local circuit studies have demonstrated that task states primarily reduce correlations among pairs of neurons, likely enhancing their information coding by suppressing shared spontaneous activity. Here we sought to adjudicate between these conflicting perspectives, assessing whether co-active brain regions during task states tend to increase or decrease their correlations. We found that variability and correlations primarily decrease across a variety of cortical regions in two highly distinct data sets: non-human primate spiking data and human functional magnetic resonance imaging data. Moreover, this observed variability and correlation reduction was accompanied by an overall increase in dimensionality (reflecting less information redundancy) during task states, suggesting that decreased correlations increased information coding capacity. We further found in both spiking and neural mass computational models that task-evoked activity increased the stability around a stable attractor, globally quenching neural variability and correlations. Together, our results provide an integrative mechanistic account that encompasses measures of large-scale neural activity, variability, and correlations during resting and task states.

Reviewer #2: The authors analyse two different datasets (human fMRI and NHP electrophysiological data) for spontaneous and evoked activity under different tasks. They monitor the variability and correlations in the signal in rest and task conditions concluding that the task reduces the variability and the correlations in the signal compared to the rest state. The study is performed carefully and the authors provide evidence in support to their conclusion. They also perform numerical simulations for a minimal model to provide an interpretation of the observed reduced variability in terms of the dynamical evolution of the systems in resting conditions and under tasks.
A primary premise of our analysis is based on the evidence that fMRI signals are indirectly related to underlying neural population activity. Several studies, including a recent study using simultaneous multi-unit activity, deconvolved wide-field calcium imaging data, and optical imaging (i.e., to capture the BOLD signal) have indicated that a significant of variance in the BOLD signal is associated with neural population activity (both LFP and spiking). Specifically, that study showed that deconvolved wide-field calcium imaging signals explained between 60-70% of the variance of a deconvolved BOLD signal (Ma et al. 2016) . Older studies have also illustrated that the BOLD signal is not only related to LFP signals (r^2=0.521), but also with multi-unit activity (r^2=0.445) (Logothetis et al. 2001) . One of the goals of our manuscript was to build on these studies to demonstrate that properties of neural dynamics (e.g., changes to correlation and variability structure during task states) were observable in both neural and fMRI data. However, while these inferences are limited given that fMRI imaging and electrophysiology were performed separately and in two different species, we believe that our analyses helps to bridge an important gap between two distinct subfields within neuroscience (human fMRI and NHP electrophysiology) to provide a more holistic understanding of task-dependent correlation/variability changes. However, we have mentioned the limitations of these assumptions in our results in the Discussion (p. 23): " Despite replication of results across species and task paradigms, our conclusions are based on independently obtained data sets from two species and across task designs. Thus, to ensure variability and correlation quenching are from the same neural origin, it will be important for future work to investigate these questions with experimental designs that simultaneously record both BOLD signal and spiking activity across multiple cortical areas." R2 also suggested that the reduction in variance is due to the field alignment in a large magnetic field during stimulation. While this may be theoretically plausible, we argued [above] and in the Discussion that neural dynamics are strongly associated with BOLD dynamics. In addition, in our numerical simulations that simulate spike rate dynamics, a nonlinear transformation of the spike rate signal to the BOLD signal (via the Balloon-Windkessel model (Friston, Harrison, and Penny 2003) ) preserves variability reduction changes (SFig 6), suggesting that variability reductions would be observed independent of fluctuations induced by changes to the magnetic field. We have emphasized this distinction in the Discussion (p. 23): " In addition, our computational model results demonstrated that reductions in neural variability and correlations were preserved after nonlinearly transforming the spike rate signal to the fMRI BOLD signal with the Balloon-Windkessel model (Friston, Harrison, and Penny 2003) , suggesting that the observed signal changes are likely due to the BOLD signal changes rather than MRI artifacts ." ************************************************************************************************************* The main conclusion by the authors would be more convincing if observed systematically by controlling the amplitude of the external stimulation. The task is not well defined in the manuscript and the duration of the application of some seconds suggests that the stimulation is quite large. Moreover, I think that applying, in a sequence, different stimulations (not otherwise specified) is also a non-rigorous procedure. Indeed, it is possible (as suggested by numerical simulations) that the application of a short in time small amplitude stimulation would not reduce the variability. ************************************************************************************************************* 3. We apologize that the manuscript was not clearer. In our empirical data, the amplitude of the external stimulation was controlled for by each task condition separately . This meant that 'stimulations' (or activations) of the same condition (e.g., 2-back for faces in the working memory task) were controlled for separately from stimulations of a different condition (e.g., 2-back for tools in the working memory task). Though the precise temporal shape of the task-related stimulation for each task condition is unknown (and likely changes depending on the brain region), the external stimulation was flexibly accounted for by using finite impulse response (FIR) regression. This meant that each task condition (or, 'stimulation') was flexibly and rigorously accounted for uniquely. We have added a sentence clarifying this in the Methods section (highlighted below), detailed in the Methods section "fMRI: Preprocessing" (p. 28): "FIR modeled task blocks were modeled separately for task conditions within each of the seven tasks. Thus, the mean task-evoked activation was differentially accounted for according to each specific task condition. In particular, two conditions were fit for the emotion cognition task, where coefficients were fit to either the face condition or shape condition. For the gambling reward task, one condition was fit to trials with the punishment condition, and the other condition was fit to trials with the reward condition. For the language task, one condition was fit for the story condition, and the other condition was fit to the math condition. For the motor task, six conditions were fit: (1) cue; (2) right hand trials; (3) left hand trials; (4) right foot trials; (5) left foot trials; (6) tongue trials. For the relational reasoning task, one condition was fit to trials when the sets of objects were matched, and the other condition was fit to trials when the objects were not matched. For the social cognition task, one condition was fit if the objects were interacting socially (theory of mind), and the other condition was fit to trials where objects were moving randomly. Lastly, for the working memory task, 8 conditions were fit: (1) 2-back body trials; (2) 2-back face trials; (3) 2-back tool trials; (4) 2-back place trials; (5) 0-back body trials; (6) 0-back face trials; (7) 0-back tool trials; (8) 0-back place trials. Since all tasks were block designs, each time point for each block was modeled separately for each task condition (i.e., FIR model), with a lag extending up to 25 TRs after task block offset." This was also mentioned in another Methods section, "fMRI: Task state activation analysis". Note, that for task activation analysis we used a canonical HRF to model the activation amplitude (rather than FIR model). We have added a clause (highlighted) to emphasize that activations were obtained for each condition separately (p. 28): "We performed a task GLM analysis on fMRI task data to evaluate the task-evoked activity. The task timing for each of the 24 task conditions was convolved with the SPM canonical hemodynamic response function to obtain task-evoked activity estimates for each task condition separately [65]. FIR modeling was not used when modeling task-evoked activity. Coefficients were obtained for each parcel in the Glasser et al. (2016) cortical atlas for each of the 24 task conditions." ************************************************************************************************************* It would be also more appropriate to analyse the variability by averaging the response to the same task, not a series of different ones as I understand from the manuscript. I therefore suggest to analyse data for different durations of the stimulation applied, by using the same stimulation frame not a sequence of different ones, whose effects on the evoked activity can be averaged out in the statistical procedure. ************************************************************************************************************* 4. Though for the main manuscript we analyze how variability is reduced across all task conditions together (after accounting for the condition-specific activation responses separately), we also illustrate results for variability reductions for each task separately (averaged across conditions within each task, though mean activations were removed for each condition separately within each task) (SFig 8a).
However, our non-human primate data was only collected in two subjects. This small sample size in subjects is standard in the non-human primate electrophysiology literature, given the difficulty of training and performing surgeries on large NHP populations (e.g., see (Brincat et al. 2018;Siegel, Buschman, and Miller 2015) ). Importantly, in return for the two subjects, a large number of trials was obtained in each subject, providing high-powered and statistically robust results for each subject independently. Thus, given the large number of trials for each non-human primate, we took the approach of analyzing each subject separately in our non-human primate data set, rather than performing a group-analysis. We agree with R2 that a group test across two subjects would provide statistically underpowered results.
This approach to analyzing subjects individually has recently gained popularity in several human fMRI studies, where large amounts of data are obtained for individual subjects, rather than small amounts of data across a large subject cohort (Huth et al. 2012;Nakai and Nishimoto 2020;Gordon et al. 2017) . In this approach, each additional subject in the data set serves as replication to the initial subject (since each subject is analyzed separately). ************************************************************************************************************* As a minor remark, the author should specify what they mean by a negative stimulation amplitude in the numerical study and how this would compare to experimental data. Moreover, since the authors relate the reduced variability to the high activity level induced by the task, how can they justify the reduced variability by a de-activation stimulation? ************************************************************************************************************* 6. In the numerical simulation, negative stimulation is indicated by a negative input value into the neural mass. In the one and two-dimensional firing rate models (governed by equations 8 and 10/11, respectively), A negative stimulation refers to a negative value for the parameter (p. 32). In our spiking s i model (Fig. 6a,b), this corresponds biophysically to the activation of the inhibitory neurons within a local population. We have clarified the phrasing in the Methods sections, "Model: One-dimensional minimal network model" and "Model: Two-dimensional minimal network model" to reference how is defined: s i In "Model: One-dimensional minimal network model", p. 32: "We simulated neural population activity injecting a fixed input (boxcar input) with amplitudes ranging from in 0.01 increments ( Fig 7C)." − , ] s i ∈ [ 5 5 and "Each trial was run for 20 seconds. Fig 7C was generated using input amplitudes of ." − , , } s i ∈ { 3 0 3 In "Model: Two-dimensional minimal network model", p. 33: "We injected fixed input into both neural populations with amplitudes ranging from − , ] s i ∈ [ 5 5 in 0.01 increments (Fig. 7E)" We apologize to the reviewer for not making our being clearer; the results portion that indicates that more task-active regions reduced their variability was poorly phrased. The results illustrated an association of task-evoked activation magnitude with variability reduction (S3 Fig). (Magnitude here reflects the absolute value of the task-evoked activation coefficient, which is estimated from a task GLM. This indicates that the magnitude of either positive or negative activation coefficients is associated with variability reductions. This was more precisely indicated in the S3 Fig axis labels.) This analysis captures the magnitude of fMRI BOLD deflections relative to baseline, independent of the sign of the deflection. This analysis was motivated by and replicates a previous study that showed that voxels with a greater fMRI BOLD activation magnitude had greater variability reductions (see Figure 4A in (He 2013) ). To better characterize our results, we have rephrased our wording to indicate that task-evoked activation magnitudes are correlated with variability and correlation reductions. Critically, this analysis is consistent with our model, which suggests that highly active or deactive states would result in variability reductions. The results paragraph has been revised as follows (see p. 12): "Previous work has shown that regions that have strong task activations (i.e., the magnitude of the task-induced activation, positive or negative) tend to have greater variability reductions (He 2013) . (Task activation magnitudes reflect the deviation of the BOLD activity relative to baseline, or the inter-block interval.) We sought to replicate this effect in the current data set, while extending those results to demonstrate that more task-active regions also tend to reduce their FC during task states. We first correlated regional task-evoked activation magnitude with task-evoked variability reduction (task variance minus rest variance) across regions at the group-level. We found that regions with greater task-evoked activation magnitudes (averaged across tasks) exhibited greater variability reductions during task states, confirming previous findings in a finger tapping task (exploratory cohort rho=-0.32, p<10e-9; replication cohort rho=-0.49, p<10e-22; Supplementary Fig 3A,C) (He 2013) . This negative relationship was also observed in 6/7 of the HCP tasks when analyzed separately (FDR-corrected p <0.01; S8 Fig). To link regional task activation magnitudes with FC decreases, we tested for a correlation between regional task-evoked activation magnitude and the average FC change during task states for each region. Consistent with our hypothesis, we found that regions with greater task-evoked activation magnitudes (averaged across tasks) reduced their average FC more during task states (exploratory cohort rho=-0.25, p<10e-05; replication cohort rho=-0.20, p =0.0002; S3 Fig). When tasks were analyzed separately, this negative correlation was observed in 4/7 of the HCP tasks (FDR-corrected p <0.05; S9 Fig). Thus, brain areas with higher levels of task-evoked activation magnitudes (i.e., changes in activity relative to baseline) tend to reduce both their task-evoked variability and FC." ************************************************************************************************************* Reviewer #3: In behaving animals, pairwise correlation among neurons in a local network is generally reduced during evoked activity. By contrast, at larger-scale when we study interactions across brain regions we find that correlations (functional connectivity: FC) can both increase and decrease. Authors claim that taken together these observations are incompatible. In this study authors have addressed this problem. To this end, they have analysed synchrony and variability in spiking activity recorded from primates and in fMRI signals recorded from humans. They argue that unlike previous observations, task related activity quenches correlations and variability across cortical areas (fMRI measurements). They provide an explanation of reduction in cortical variability and correlation in evoked activity in terms of attractor dynamics. The key to the explanation is neuron mass model transfer function and how the attractor is shifted by the inputs.
Overall there is huge amount of work has gone into this manuscript: from analysis of spiking activity and fMRI data to simulation of neuron mass models with one and two populations. But I think manuscript needs some more work in terms of data analysis to really establish the results based on the experimental data. The major problem with the manuscript is the choice of the model which is not suitable to address the problem. Moreover, the even though the explanation based on the results seems fine it does not apply to the neuronal networks in the brain. So we have an experimental observation at the level of inter brain region correlations not only that needs to demonstrated with greater rigour but also there is no explanation of the observations. In the following I elaborate on these.
8. Addressing the issue of incomparable spatial resolutions: While multi-unit spiking data was collected for the NHP data set, we did not use the individual units for analyses. Instead, we computed the mean-field spike rate within each of the six cortical areas. Though this is an unusual approach for multi-unit spiking data sets because information and spatial specificity is thrown out due to spatial downsampling, we intentionally chose this approach to better match the spatial resolution of our fMRI data (i.e., mean-field cortical areas, rather than individual neurons). For this study, we were interested in making statistical inferences at the level of large neural populations (i.e., inter-area dynamics), rather than at the local network (Doiron et al. 2016) . We acknowledge that the approach of averaging spikes across the multi-units prevents us from making any inferences at the level of a local population.
Importantly, computing the mean-field spike rate of a cortical area (by averaging multi-units) provides a better spatial comparison to our fMRI data. Moreover, few NHP spiking data sets are well-positioned to investigate inter-region correlations for more than two regions given the overall difficulty of obtaining simultaneous electrophysiological recordings from multiple cortical sites. Thus, while we significantly simplified (downsampled) our spatial recordings, we leveraged this unique NHP data set to directly address mean-field, inter-region correlations to facilitate translation to our human fMRI data. More broadly, we sought to bridge the gap between human neuroimaging and NHP electrophysiology (particularly with respect to "functional connectivity" and "noise correlation" analyses), since few links between these subfields are often made.
R2 raised related doubts regarding whether the neural origins of fMRI and mean-field spiking activity are related. We believe the work of Nikos Logothetis' group and more recently, Elizabeth Hillman's group, have provided strong evidence that the fMRI signal has neural (spiking) origins (Logothetis et al., 2001;Ma et al., 2016) . We have addressed these concerns to R2 (see response #2 above), and have revised the manuscript accordingly. In general, we have substituted the use of 'multi-unit' with 'mean-field' (with the exception of the Methods section, where we detail that mean-field spike rates were derived from the average spiking activity of a multi-unit recording): " First, we characterize task-evoked neural variability and correlations in empirical data using two highly distinct data sets: multi-site and mean-field NHP spike rates and whole-brain human fMRI… The combination of simultaneously recorded mean-field spike rate recordings from six cortical sites, whole-brain fMRI obtained from seven different cognitive tasks, and dynamical systems modeling and analysis provide a comprehensive account of task-related correlation and variability changes spanning species and data acquisition techniques." (p. 3, Introduction) "NHP mean-field spiking and human fMRI data… In particular, we focused on neural variability and correlation changes across large cortical areas (mean-field) in our electrophysiology data set (rather than between pairs of neurons) given our focus on large-scale neural interactions, and to facilitate a comparison between different correlation approaches (FC in fMRI data and spike count correlation in electrophysiology data)." (p. 4, Results) " a,b) Using mean-field spike rate data collected simultaneously from six different cortical areas (Siegel, Buschman, and Miller 2015) , we compared the spiking variability and spike count correlations between task-state (i.e., following task cue onset) and rest-state spiking activity… This was performed by estimating the mean-field spike rate by averaging across multi-units in each cortical area, allowing us to target the activity of large neural populations." (p. 5, Fig 1  caption).
"We measured mean-field spike recordings from six different cortical areas during a motion-color categorization task." (p. 6, Fig 2 caption) "In the previous sections, we provided empirical evidence that task states reduce mean-field inter-area correlations and variability in spike rate and fMRI data." (p. 14) " a) Mean-field spike recordings from six different cortical regions fed into our analyses." (p. 41, S1 Fig caption) 9. Addressing the issue of incomparable temporal resolutions: R3 also raises an important point regarding the fact that fMRI data and spiking data are sampled at different timescales. To mitigate this issue, variability and correlations for our NHP data set were computed across trials , where spiking activity was temporally downsampled (to obtain spike counts per trial). This placed the temporal time scale of our NHP data (500ms-1000ms per trial) in line with our fMRI data (sampled at 720ms), such that the temporal timescales were on similar orders of magnitude. We have added a clarifying sentence in the beginning of our Results section to highlight our effort to match the temporal resolution of spiking and fMRI data: " In addition to spatially downsampling our NHP data to evaluate mean-field spike rates in each cortical area, we also temporally downsampled our NHP data to investigate variability and correlation changes across trials (on the order of hundreds of milliseconds), which appropriately matches the sampling rate of our fMRI data (720ms). " (p. 4,Results) In addition, in our numerical simulations, we verified that reductions in variability and correlations were preserved after nonlinearly transforming the original spike rate signal to fMRI signal (SFig 6). This analysis showed that similar changes to neural dynamics (i.e., variability/correlation changes) are likely to be observed despite the application of a low-pass filter to high temporal resolution data (i.e., neural-to-BOLD signal transformation). This was highlighted in a paragraph in the Discussion section: "Despite converging results, there are several key differences in our two empirical data sets. First, the time scale of fMRI BOLD activity is much slower than the NHP spiking activity. However, these differences were mitigated by measuring spiking variability across trials, which is comparable to the time scale of fMRI's sampling rate (in the hundreds of milliseconds). In addition, our computational model results demonstrated that reductions in neural variability and correlations were preserved after nonlinearly transforming the spike rate signal to the fMRI BOLD signal with the Balloon-Windkessel model [41]." (p. 23, Discussion) *************************************************************************************************************
In general, the ongoing and evoked spike rates (averaged across all recording sites) range from 10-15Hz (Fig 2, p. 6): Neural variability and correlations decrease during task states relative to rest in spiking data. Results for the replication subject are reported in S1 Fig. a) We measured mean-field spike recordings from six different cortical areas during a motion-color categorization task. b) We calculated the average spike rate change across all recordings during the rest period (ITI) and task period (task cue), across trials. Each data point reflects the firing rate across 25 consecutive trials. c) We calculated the cross-trial spiking variance for each region during task and rest states, and then averaged across all regions. Each data point reflects the spiking variance across 25 consecutive trials. d) We calculated the average cross-trial neural correlation for task and rest states between all pairs of recorded brain regions. (Spike rates were averaged within each cortical area.) Each data point reflects the correlation across 25 consecutive trials. e-g) For each pair of brain regions, we visualize the correlation matrices between each recording site for the averaged rest period, task period, and the differences between task versus rest state spike count correlations. h) We also observed no increases in covariance (non-normalized correlation) (Duff et al. 2018;M. W. Cole, Yang, et al. 2016;Siegel, Donner, and Engel 2012) . For panels e-h , plots were thresholded and tested for multiple comparisons using an FDR-corrected p<0.05 threshold. Boxplots indicate the interquartile range of the distribution, dotted black line indicates the mean, grey line indicates the median, and the distribution is visualized using a swarm plot. ************************************************************************************************************* -Authors argue that in task-state correlation increase in the replication subject is because of a decrease in unshared variance (Page 7 first para). I am not sure I understand the line of argumentation. Could authors format this in mathematical equations. ************************************************************************************************************* 13. After re-analysis of our data with the corrected conversion to Hz mentioned above, we found that the previously reported increases in correlation in the replication NHP were reproduced using measures of covariance (un-normalized correlation). (Note, however, that we did not find any correlation/covariance increases in the exploratory NHP.) Though the overall increase in covariance was fairly weak, these results did not change the primary finding that the majority of pairwise correlations (and covariances) were strongly reduced. Interestingly, the correlation/covariances that increased were between visual areas and frontal areas (i.e., MT and IT with PFC and FEF). While these mean-field spike count correlation increases have not been reported in the literature, these increases appear to be consistent with our fMRI data (which see correlation increases among the frontoparietal network and the visual network; see Fig 3). Though these mean-field correlation increases between frontal and visual areas appear to be a novel finding in the NHP electrophysiology literature, we believe a full analysis on these correlation/covariance increases are beyond the scope of this manuscript. Moreover, these correlation increases were not replicated in the exploratory NHP data. Nevertheless, we hope to explore these phenomena in more detail in a future study. We have also revised the appropriate Results and Discussion sections to reference these findings.

Results (p. 7):
"We did not identify any pairwise correlation and covariance increases in our exploratory NHP ( Fig 2G). However, in our replication NHP we found correlation increases between visual and frontal areas (i.e., MT/IT and PFC/FEF) (S1 Fig). When analyzed with covariance (rather than correlation), we found these covariance increases to be weak relative to the observed covariance decreases. (Moreover, the baseline correlation strength between these areas was very low during the ITI period.) Though these correlation increases were only observed in 1 of 2 NHPs, they were generally consistent with our fMRI data (below), which showed that though there were few correlation increases, variability and correlations across cortex were dominated by decreases during task states." Discussion (p. 21-22): "Though we largely focused on FC decreases during task states in both data sets, we identified a small number of correlations that increased during task state. Most of these correlation increases were primarily between regions belonging to different functional networks, such as frontal and visual areas, which is consistent with previous literature (M. Cole et al. 2014;Krienen, Yeo, and Buckner 2014;Gonzalez-Castillo and Bandettini 2017) . Some correlation increases have also been reported in the NHP spiking literature, where spike count correlations between units with similar task receptive fields tend to decrease, while spike count correlations between units with dissimilar task receptive fields increase (Ruff and Cohen 2014) . This appears to be conceptually consistent with the present findings, where we focus on mean-field correlation changes between functionally distinct cortical areas (rather than between pairs of individual neurons). Specifically, we found that regions in the same network tend to decrease their correlations, while regions across functionally distinct areas/networks marginally increase their correlations. In our fMRI data, we found marginal task-state correlation increases between different networks, such as the frontoparietal and visual networks (Fig 3F). Correlation increases were also observed in one of the two NHPs between frontal (PFC and FEF) and visual (MT and IT) areas (S1 Fig). (However, we note we did not observe any correlation increases in our exploratory NHP.) While these correlation increases appear to be statistically reliable in the second NHP, the spontaneous (resting-state) correlations between these areas were very low and the correlation increases small. Thus, it will be important for future investigations to directly evaluate the functional relevance of these marginal correlation increases" Re-analysis of the replication subject with corrected spiking units (S1 Fig, p. 42). but using data from a replication subject. We find nearly identical patterns between the exploratory and replication subjects, with the exception that we did not replicate any correlation increases. a) Mean-field spike recordings from six different cortical regions fed into our analyses. b) As in our empirical fMRI data set, we calculated the global variability across task and rest states (estimated using the standard deviation across trials). c) We then calculated the global neural correlation (i.e., the spike count correlation across trials) for task and rest states between all pairs of recorded brain regions. (Spike rates were averaged within each cortical area.) d-f) For each pair of brain regions, we visualized the correlation matrices between each recording site for the averaged rest, task, and the differences between task versus rest state spike count correlations. For panels d-f , plots were thresholded and tested for multiple comparisons using an FDR-corrected p<0.05 threshold. Boxplots indicate the interquartile range of the distribution, dotted black line indicates the mean, grey line indicates the median, and the distribution is visualized using a swarm plot. ************************************************************************************************************* -In the section 'Task-state FC ...' authors write that 'Our current finding illustrate that mean-field spike count correlation decrease...'. Are they referring multi-unit spiking activity as mean-field activity --if so this is wrong. ************************************************************************************************************* 14. We thank the reviewer for pointing out the confusing references to both 'multi-unit spiking activity' and 'mean-field spiking activity'. While multi-unit spiking activity was recorded , only mean-field spiking activity was analyzed (by averaging across multi-unit spikes). We use the following definition: "Mean Field Theory: An analysis that ignores correlations between variables so that the effect of fluctuations can be "smoothed out" by averaging and the system's behavior described according to the mean effect of all variables that comprise a unit" (Miller 2018) .

is really uninformative. I do not understand what is the purpose of showing that FC of each brain area is on average reduced. If authors want to show that FC is reduced then they need to show that FC for each pair of brain regions is decreased in evoked state as compared
to the ongoing activty. So I suggest that authors show a scatter diagram with x-axis as the FC of a pair in ongoing state and y-axis as FC of the same pair in evoked state. Panel 3i is good but please do not make a binary decision and show the full histogram of pair-wise difference in FC in the two states. ************************************************************************************************************* 16. We thank R3 for their feedback. We have opted to keep Fig 3H, given that graph-theoretic measures, such as weighted degree centrality (or "global brain connectivity") are commonly used network metrics that describe FC (and FC changes). Moreover, this provides a reasonable simplification of visualizing the differential effects of FC changes across a cortical surface, which is a more intuitive visualization. This contrasts Fig 3F, which shows the entire task vs. rest FC matrix (with no reference to cortical organization). However, we have taken R3's feedback regarding Panel 3I. We have replaced it (i.e., a binary decision on FC increases v. decreases, and instead have plotted the suggested scatter plot (task-state FC v. rest-state FC for all pairs of connections). We have additionally plotted changes to regional variance as a scatter plot ( Fig  3C), replacing the previous bar plot that indicated the percentage of increases/decreases in regional variance changes from rest to task. The figures (and captions) have also been updated for the replication cohort (    Fig. a) We first compared the global variability during task and rest states, which is averaged across all brain regions, and then b) computed the task-versus rest-state variability for each brain region. c) Scatter plot depicting the variance of each parcel during task states (y-axis) and rest states (x-axis). Dotted grey line denotes no change between rest and task states. d) We next compared the correlation matrices for resting state blocks with (e) task state blocks, and (f) computed the task-versus rest-state correlation matrix difference. g) We found that the average FC between all pairs of brain regions is significantly reduced during task state. h) We found that the average correlation for each brain region, decreased for each brain region during task state. i) Scatter plot depicting the FC (correlation values) of each pair of parcels during task states (y-axis) and rest states (x-axis). Dotted grey line denotes no change between rest and task states. For panels b-f, and h, plots were tested for multiple comparisons using an FDR-corrected p <0.05 threshold. Boxplots indicate the interquartile range of the distribution, dotted black line indicates the mean, grey line indicates the median, and the distribution is visualized using a swarm plot.
Also updated figure (using un-normalized time series with variance and covariance) (S5 Fig, p. 47). Updated panels are S5C and S5I Fig:  S5 Fig. Non-normalized data using variance and covariance, using the full set of 352 subjects. Neural variance and covariance decreased during task states in human fMRI data. We successfully replicated results from Fig 3 using, but without z-normalizing the time series (and using covariance instead of correlation). The combination of reduced correlations (Fig 3) and covariance measures suggested that shared signal dynamics is reduced from task to rest (M. W. Cole, Yang, et al. 2016;Duff et al. 2018;Siegel, Donner, and Engel 2012) . a) We first compared the global variability during task and rest states, which is averaged across all brain regions, and then b) computed the task-versus rest-state variability for each brain region. c) Scatter plot depicting the variance of each parcel during task states (y-axis) and rest states (x-axis). Dotted grey line denotes no change between rest and task states. d) We next compared the covariance matrices for resting state blocks with (e) task state blocks, and (f) computed the task-versus rest-state covariance matrix difference. g) We found that the average covariance between all pairs of brain regions is significantly reduced during task state. h) We found that the average covariance for each brain region, decreased for each brain region during task state. i) Scatter plot depicting the FC (covariance values) of each pair of parcels during task states (y-axis) and rest states (x-axis). Dotted grey line denotes no change between rest and task states. For panels b-f, and h , plots were tested for multiple comparisons using an FDR-corrected p <0.05 threshold. Boxplots indicate the interquartile range of the distribution, dotted black line indicates the mean, grey line indicates the median, and the distribution is visualized using a swarm plot. *************************************************************************************************************
We have made the corresponding revisions in the text to clarify this intuition (p. 14): "Using this previously established model, we systematically perturbed this balanced network under a distribution of inputs (both excitatory and/or inhibitory inputs) to estimate the excitatory output (i.e., mean-field transfer function) of a cortical population. Though most long-range cortical connections are excitatory, we incorporated excitatory and inhibitory stimulation effects on a local population (Fig 6B). This is because long-range excitatory afferents may target local inhibitory neurons, producing a net inhibitory effect. Under the presence of inputs, we found that the population transfer function approximated a sigmoid activation function (Fig 6B). We note that the upper bound on the sigmoid transfer function (Fig 6D) is likely due to inhibitory feedback on excitatory activity rather than the true saturating spiking regime in neurons. This is because excitatory neurons in a local population typically do not reach a saturating spiking regime even for strong visual stimuli (Priebe and Ferster 2008) , and instead reach an upper bound due to strong inhibitory stabilization preventing runaway excitation (Hennequin et al. 2018) . Importantly, simplifying the mean-field transfer function of a cortical area allowed us to focus our modeling efforts on simplified networks across large cortical areas (Joglekar et al. 2018) . " The population spike rate (excitatory cells only) subject to inhibitory regulation. We systematically stimulated a subset of the neural population and measured the corresponding mean excitatory spike rate. Spike rates were normalized between 0 and 1. Excitatory stimulation was implemented by stimulating 400 excitatory neurons, and inhibitory stimulation was implemented by stimulating 400 inhibitory neurons. Spiking statistics were calculated across 30 trials, with each point in the scatter plot indicating a different 50ms time bin. c) Population neural variability (excitatory cells only), as a function of input stimulation. d) Based on panel b, we approximated the mean field neural transfer function as a sigmoid. A sigmoid transfer function produces optimal input-output dynamics for a narrow range of inputs (gray). The same input distribution mean shifted by some excitatory/inhibitory stimulation produces a quenched dynamic range.
In addition, in the revised manuscript we have added a large-scale network model comprising of 300 mean-field units (S11 Fig). We included 4 large-scale network models: 1) 300 unit network with random connections (excitatory efficacies); 2) 300 unit network with random connections (80% excitatory, 20% inhibitory efficacies); 3) 300 unit network with clustered connections (100% excitatory efficacies); 4) 300 unit network with clustered connections (80% excitatory, 20% inhibitory efficacies). Models #2 and #4 account for the possibility of an excitatory drive impinging primarily on local inhibitory neurons. In all models, we found that correlations primarily decreased in the presence of stimulation.

-Page 16 first para: "We found a highly negative association between mean and variance under experimental perturbation, suggesting that at the mean-field level, mean and variance cannot be mechanistically dissociated" --This is correct when we assume that the operating point of the network is right at the middle of the Sigmoid function. Authors have conveniently chosen a range of 0-1Hz for activity. In reality a network is operating at 1-2Hz (per neuron) baseline and stimulus evoked activity can push the network to very high firing rates transiently but certainly up to 20Hz in steady state. But cortical network activity will not reach saturation range --if it does please provide suitable references. So the input induced reduction in variability while correct in the chosen model, does not make sense for a cortical network --biological neural networks, irrespective of the spatial scale we observe them do not work in a saturation regime during evoked activity whereas in this model evoked activity induced reduction in variability is reached when the network moves towards saturation. Authors have also ignored their own data which shows that multi-unit firing rates different by a small amount in evoked and ongoing activity states --that is even with the chosen Signoid function input induced shift in the operating point by input will be only 0.11 Hz.
************************************************************************************************************* 18. We apologize to the reviewer for not being clearer. The specific operating range the sigmoid transfer function was not intended to be 0-1Hz, but rather the normalized (min and max) of the neural population's spike rate. (In other words, 0 and 1 represent the minimum and maximum firing rate for a given neural population.) This was chosen because the typical mean-field spike rates for neural populations likely varies with location/cortical area, and so accounting for regional idiosyncrasies would not provide a generalizable model. While this was explicitly labeled for the Fig. 6b, the axes were not explicitly labeled for Fig. 6d, Fig. 7a, Fig. 8a,c. We have revised those figures to ensure that 'normalized spike rate' was explicitly placed in the y-axis labels.
Relatedly, R3 raises an interesting point, suggesting that biologically, neurons do not typically reach a saturating spiking regime. In general, we agree that many neurons (particularly in our data set) do not reach a saturating spiking regime. Moreover, our intention behind the sigmoid transfer function (particularly the upper bound of the sigmoid) was not to model saturating spiking activity, but rather excitatory spiking activity subject to inhibitory regulation . In other words, the upper bound of the sigmoid transfer function implicitly models inhibitory stabilization to prevent runaway excitation (i.e., epilepsy) or saturation. We apologize that this was not made clearer. The general intuition was taken from the concept of inhibitory stabilization, providing an 'effective' upper bound on stimulus-driven activity (Hennequin, Agnes, and Vogels 2017;Ahmadian and Miller 2019) .
The use of the balanced spiking model in Fig 6A justified the use of a sigmoid transfer function, since we measured the mean-field excitatory activity subject to feedback inhibition under a continuous range of stimulation (Fig 6B). Thus, though biologically neurons don't necessarily reach saturating levels of activity, the sigmoid function (at the population level) can still provide an accurate I/O description of bounded excitatory output given some stimulus-drive.
We have revised the text in the Results section introducing the sigmoid transfer function, as well as parts of the Discussion section to ensure further clarity.
Though larger models may provide additional insight (e.g., insight into how network organization may affect correlations), these models contain a huge parameter space, potentially obfuscating the critical dynamical mechanisms that contribute to correlation changes. Using our simple model, we were able to pinpoint a dynamical mechanism -the characteristic time scale near a fixed point attractor -that explains changes to simulated variability and correlations independent of complex network parameters. Importantly, while we pinpoint a specific dynamical mechanism that governs these statistical changes, we also acknowledge a sigmoid transfer function (at the mean-field) can be implemented with a range of biological (and network) mechanisms, such as clustered excitatory connectivity (Litwin-Kumar and Doiron 2012) , and/or a transition from loose to tight E-I balance (Hennequin et al. 2018) . We note that the mechanisms we suggest are theoretically consistent with many of the explanations in other papers, where variability is reduced due to strengthening dynamics around a stable fixed point. We characterized this in the Discussion section (p. 22): " The quenching of output variability can be explained by different biological mechanisms, such as clustered excitatory connectivity in local circuits, tightening of E-I balance due to inhibitory feedback, neural adaptation, and/or irregular synaptic vesicle release (Doiron et al. 2016;Rosenbaum, Rubin, and Doiron 2012;Hennequin et al. 2018;Litwin-Kumar and Doiron 2012;Deco and Hugues 2012) . The manifestation of these biological mechanisms can be summarized at the mean-field by the reduction of the response variability due to the decreased slope in the sigmoid transfer function during highly active or inactive states." However, to ameliorate any concerns that the mechanisms observed in a simple case would not scale appropriately to a large-scale network, we have provided additional numerical simulations with n=300 units. Given that the space of possible models increases in large-scale models (and an exhaustive overview of all possible scenarios would be beyond the scope of this study), we limited our analysis to include simulations with only sparse and random connectivity matrices with different connectivity weights. Below (and in the manuscript), we show the results demonstrating that at the large-scale, stimulus-driven activity reduces global variability and correlations: On variability changes in large-scale networks (Results, p.16): "We also generalized the findings from our minimal (single region) model to large-scale firing rate models (with 300 regions), where we found variability decreases during task-evoked states in both network models with random structural connections and clustered structural connections (SFig 11). We demonstrated this for network models with excitatory connections only, as well as networks with both E and I connections." On correlation changes in large-scale networks (Results, p. 20): " To ensure that our findings in simplified two node networks would scale to large-scale network models, we simulated large-scale firing rate models (with 300 regions). We found correlation decreases during task-evoked states in both network models with random structural connections and clustered structural connections (SFig 11), suggesting that the mechanisms we identified in these minimal models likely scale to the larger networks. We demonstrated this for network models with excitatory connections only, as well as networks with both E and I connections." S11 Fig, p. 53: "S11 Figure. Variability and correlations are quenched in large-scale network models (300 regions) with both random and clustered structural connections. For each structural connectivity matrix, we randomly sampled synaptic weights from a normal distribution with either 100% E connections (given evidence that most long-range connections are excitatory, Joglekar et al., 2018), or .0, σ .2 μ = 1 = 0 80% E and 20% I connections ( ). For each network model (4 in total), we simulated 20 .0, σ .2 μ = 1 = 1 subjects for 10 seconds each (100ms sampling rate). For simplicity, during the task state, all units were stimulated with a fixed input. a) Random structural connectivity matrix (20% connectivity density) for an example subject. b) The average across all pairwise correlations during the rest and task states for the network model with 80% E and 20% I connections. The rest state exhibits higher correlations than the task state. c) The variability (variance across time) averaged across brain regions during the rest and task states for the network model with 80% E and 20% I connections. The rest state exhibits higher variability than the task state. d) The task minus rest FC matrix (correlation difference) between all 300 regions. Correlations decreased from rest to task states. e-g) The same analyses as b-d , but using only excitatory connections only. h) Clustered structural connectivity matrix (10 communities, 20% within-community density, 3% out-of-community density). i-k) The same analyses as b-d , but using the clustered connectivity matrix with 80% E and 20% I connections. l-n) The same analyses as b-d , but using the clustered connectivity matrix with 100% E connections. Boxplots indicate the interquartile range of the distribution, dotted black line indicates the mean, and the distribution is visualized using a swarm plot. Plots d, g, k, n, were corrected for multiple comparisons and thresholded using an FDR-corrected p<0.05. " Methods of the 300-unit network model (Methods, "Model: 300 unit firing rate model To verify the findings observed in our minimal models would scale to larger networks (one and two-dimensional models), we included a 300 region mean-field firing rate model. We chose 300 regions given that most whole-brain human atlases contain 200-400 cortically defined parcels (Glasser et al. 2016;Power et al. 2011;Schaefer et al. 2018) . Our model followed the same equations as in our minimal model, though inter-area weights were appropriately scaled relative regional self-coupling parameters. Specifically, the network dynamics obeyed the equations where describes the activity of each population, and all other variables are as described x i above. Inter-regional coupling was set to be greater than local coupling (2:1 ratio), given evidence from previous studies that global coupling is greater than local coupling (Deco et al. 2013;Ito et al. 2017;M. W. Cole, Ito, et al. 2016) . Specific parameters for this network model were specified such that: , the mean of inter-region coupling parameters was , was sampled from a Gaussian distribution with mean 0 and standard during rest state and during task state. s i = 0 s i = 1 In total, we ran simulations for two classes of network models: a network with random connections and a network with clustered communities (S11 Fig). For the random network, we randomly sampled connections with 20% probability rate between all pairs of regions. For the clustered network model, we generated 10 communities of 30 nodes each. Regions within the community had a 20% probability rate for establishing a connection. Between-community connections had a 3% probability rate for establishing a connection.
For each class of network (random or clustered), we weighted connections with either positive weights only (i.e., only E connections) or both positive and negative weights (i.e., both E and I connections. For the E-only network, weights were sampled from a normal distribution with parameters . For the network with both E and I weights (80% E, 20% I), weights , σ .2 μ = 1 = 0 were sampled from a normal distribution with parameters . , σ .2 μ = 1 = 1 Both the rest and task state simulation was run for 10 seconds each and sampled at 100ms. For each group analysis (S11 Fig), we simulated 20 subjects worth of data. The full model code will be made publicly available, and was written in python 3.7.3." R3 also makes a good point regarding the lack of inter-area synaptic delay in our model. As mentioned above, we opted to exclude synaptic delay parameters in our neural mass models due to the increased complexity (adding additional dynamic variables), which would make detailed dynamical systems analyses difficult. Instead, we showed in our balanced spiking model (which contains synaptic delays) that variability reductions occur during stimulus-driven activity even with a synaptic delay (Fig. 6b,c). Moreover, the time scale of synaptic dynamics is significantly faster (i.e., <10ms) than the time scale of variability/correlation changes we analyzed (>500ms in our spiking data and fMRI data). Because inter-areal synaptic delays operate on a much faster timescale than the variability/correlation reductions we were interested in, we believe that our model simplification (i.e., exclusion of synaptic delays) was appropriate.
Other studies have characterized inter-area neuron-to-neuron correlation changes using detailed biophysical models (Huang et al. 2019). We have referenced this paper (which offers a biophysical interpretation of correlation changes) in the Discussion section (p. 22). We wanted to offer a different perspective to contrast those models, using simple models and providing dynamical (rather than biophysical) explanations.
While the Deco and Hughes 2012 paper provides mechanistic explanations for variability reductions that apply at the mean-field level, they use a local spiking neural network model. More importantly, the proposed mechanisms are limited to explaining neural variability and were not extended to mechanisms explaining inter-area correlations/functional connectivity. We specifically wanted to extend these findings to identify mechanisms that could explain inter-area mean-field correlation changes , which could facilitate translation to the level of functional connectivity in fMRI data. In fact, we were inspired by these papers such that we used the model from Litwin-Kumar and Doiron 2012 to justify the use of a sigmoid transfer function to describe mean-field population dynamics. The publications R3 has mentioned (i.e., Litwin-Kumar and , Doiron et al. 2016, Deco and Hughes 2012, Hennequin et al. 2018 have been cited throughout the paper, but most prominently in the Discussion. We thank the reviewer for bringing our attention to Bujan et al. 2015. We have now included a reference to that study as well. Discussion (p. 22): " The quenching of output variability can be explained by different biological mechanisms, such as clustered excitatory connectivity in local circuits, tightening of E-I balance due to inhibitory feedback, the statistics of feedforward inputs, neural adaptation, and/or irregular synaptic vesicle release (Bujan, Aertsen, and Kumar 2015;Deco and Hugues 2012;Doiron et al. 2016;Hennequin et al. 2018;Litwin-Kumar and Doiron 2012) . The manifestation of these biological mechanisms can be summarized at the mean-field by the reduction of the response variability due to the decreased slope in the sigmoid transfer function during highly active or inactive states."