## Abstract

Divisive normalization, a prominent descriptive model of neural activity, is employed by theories of neural coding across many different brain areas. Yet, the relationship between normalization and the statistics of neural responses beyond single neurons remains largely unexplored. Here we focus on noise correlations, a widely studied pairwise statistic, because its stimulus and state dependence plays a central role in neural coding. Existing models of covariability typically ignore normalization despite empirical evidence suggesting it affects correlation structure in neural populations. We therefore propose a pairwise stochastic divisive normalization model that accounts for the effects of normalization and other factors on covariability. We first show that normalization modulates noise correlations in qualitatively different ways depending on whether normalization is shared between neurons, and we discuss how to infer when normalization signals are shared. We then apply our model to calcium imaging data from mouse primary visual cortex (V1), and find that it accurately fits the data, often outperforming a popular alternative model of correlations. Our analysis indicates that normalization signals are often shared between V1 neurons in this dataset. Our model will enable quantifying the relation between normalization and covariability in a broad range of neural systems, which could provide new constraints on circuit mechanisms of normalization and their role in information transmission and representation.

## Author summary

Cortical responses are often variable across identical experimental conditions, and this variability is shared between neurons (noise correlations). These noise correlations have been extensively studied to understand how they impact neural coding and what mechanisms determine their properties. Here we show how correlations relate to divisive normalization, a mathematical operation widely adopted to describe how the activity of a neuron is modulated by other neurons via divisive gain control. We introduce the first statistical model of this relation. We extensively validate the model and investigate parameter inference in synthetic data. We find that our model, when applied to data from mouse visual cortex, outperforms a popular model of noise correlations that does not include normalization, and it reveals diverse influences of normalization on correlations. Our work demonstrates a framework to measure the relation between noise correlations and the parameters of the normalization model, which could become an indispensable tool for quantitative investigations of noise correlations in the wide range of neural systems that exhibit normalization.

**Citation: **Weiss O, Bounds HA, Adesnik H, Coen-Cagli R (2023) Modeling the diverse effects of divisive normalization on noise correlations. PLoS Comput Biol 19(11): e1011667. https://doi.org/10.1371/journal.pcbi.1011667

**Editor: **Xue-Xin Wei, UT Austin: The University of Texas at Austin, UNITED STATES

**Received: **January 12, 2023; **Accepted: **November 7, 2023; **Published: ** November 30, 2023

**Copyright: ** © 2023 Weiss et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **Code for running model simulations and fitting the model to experimental data was written in MATLAB and is publicly available on GitHub (https://github.com/CoenCagli-Lab/RoG2_Pairwise_Ratio_of_Gaussian_Variables). Calcium imaging data are publicly available on Zenodo (https://doi.org/10.5281/zenodo.10003708).

**Funding: **This work was supported by NIH grants EY030578 (R.C.C.); DA056400 (R.C.C.); NS107574 (H.A.); NS107613 (H.A.); MH120680 (H.A.); EY023756 (H.A.); New York Stem Cell Foundation (H.A.); and NSF-DGE grant 1752814 (H.A.B.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Neurons in the sensory cortices of the brain exhibit substantial response variability across identical experimental trials [1, 2]. These fluctuations in activity are often shared between pairs of simultaneously recorded neurons, a phenomenon called noise correlations [3]. Because the presence of these correlations can constrain the amount of information encoded by neural populations and impact behavior [4–13], noise correlations have been widely studied. This work has also revealed that correlations are often modulated by stimulus and state variables [3, 14], and therefore can play an important role in computational theories of sensory coding. For instance, noise correlations could emerge from neurons performing Bayesian inference and reflect the statistics of sensory inputs [15–18] and prior expectations [19–21]. From a mechanistic point of view, such a statistical structure of noise correlations poses strong constraints on circuit models of cortical activity [22–26]. To better understand the functional impact of noise correlations on neural coding and behavior, and the mechanisms that generate them, we need to be able to quantitatively characterize and interpret how noise correlations in neural populations are affected by experimental variables.

For this reason, successful descriptive models of neural activity have been developed to capture noise correlations [27–34]. However, none of those models considers divisive normalization [35–37], an operation observed in a wide range of neural systems [38–40] which has also been implicated in modulating the structure of noise correlations. Experimental phenomena that are accompanied by changes in noise correlations, including contrast saturation [41], surround suppression [42–44], and attentional modulations of neural activity [45–47] have been successfully modeled using divisive normalization [36, 48–50], although those models only captured average firing rates of individual neurons. Additionally, some numerical simulation studies have shown how normalization can affect noise correlations [51, 52]. These results indicate that it is important to quantify the relative contribution of normalization and other factors to modulation of noise correlations in experimental data.

We propose a stochastic normalization model, the pairwise Ratio of Gaussians (RoG), to capture the across-trial joint response statistics for pairs of simultaneously recorded neurons. This builds on our previous method that considered the relationship between normalization and single-neuron response variability (hence we refer to it as the independent RoG; [53]). In these RoG models, neural responses are described as the ratio of two random variables: the numerator, which represents excitatory input to the neuron, and the denominator (termed normalization signal), which represents the suppressive effect of summed input from a pool of neurons [35, 54]. Our pairwise RoG allows for the numerators and denominators that describe the individual responses to be correlated across pairs; in turn, these correlations induce correlations between the ratio variables (i.e., the model neurons’ activity; Fig 1A). In this paper, we derive and validate a bivariate Gaussian approximation to the joint distribution of pairwise responses, which greatly simplifies the problem of fitting the model and interpreting its behavior. The model provides a mathematical relationship between noise correlations and normalization, which predicts qualitatively different effects of normalization on noise correlations, depending on the relative strength and sign of the correlation between numerators and between denominators. This could explain the diversity of modulations of noise correlations observed in past work [46, 52, 55]. To provide practical guidance for data-analytic applications of our model, we investigate the accuracy and stability of parameter inference, and illustrate the conditions under which our pairwise RoG affords better estimates of single-trial normalization signals compared to the independent RoG. 
We then demonstrate that the model accurately fits pairwise responses recorded in the mouse primary visual cortex (V1), and often outperforms a popular alternative that ignores normalization. In our dataset, we find that when the correlation parameter between denominators is significantly different from zero, it is positive, indicating that those pairs share their normalization signals.

Fig 1. (A) The pairwise RoG model describes pairs of neural responses (*R*_{1}, *R*_{2}), where each response is computed as the ratio of two stimulus-driven signals on a trial-by-trial basis: numerators (*N*_{1}, *N*_{2}), representing the driving inputs; and denominators (*D*_{1}, *D*_{2}), representing the suppressive signals. Across trials, the numerators and denominators are distributed according to bivariate Gaussian distributions with correlation coefficients (*ρ*_{N}, *ρ*_{D}), respectively. The resulting response distribution is approximately Gaussian with correlation coefficient *ρ*_{NC}. (B-E) Comparison of the normal approximation we derived for the pairwise RoG noise covariance (B and C) and noise correlation (D and E) and the true values (estimated across 1*e*6 simulated trials) for 1*e*4 experiments (i.e., simulated pairs of neural responses). Each experiment used different model parameters and each trial was randomly drawn from the corresponding distribution. (B and D) scatter plot; (C and E) histogram of the percent difference between the Taylor approximation and the true value. The red marker indicates the median percent difference. Model parameters were drawn uniformly from bounded intervals. The ranges of the mean parameters were chosen to reproduce realistic firing rates of V1 cortical neurons, while the *α*, *β* parameters were chosen such that the variances of *N* and *D* are relatively small and the probability that *D* ≤ 0 is negligible [57].

Our results highlight the importance of modeling the relation between normalization and covariability to interpret the rich phenomenology of noise correlations. Our model and code provide a data-analytic tool that will allow researchers to further investigate such a relationship, and to quantitatively evaluate predictions made by normative and mechanistic models regarding the role of correlated variability and normalization in neural coding and behavior.

## Methods

### Ethics statement

All experiments on animals were conducted with approval of the Animal Care and Use Committee of the University of California, Berkeley.

### Generative model—pairwise Ratio of Gaussians (RoG)

Here we describe in detail the RoG model and derive the Gaussian approximation to the ratio variable. Note that our RoG is entirely different from those used in [56] and in [48], despite the same acronym: [56] refers to the distribution obtained from the ratio between two Gaussian distributions, whereas we refer to the random variable that results from the ratio of two Gaussian variables. [48] does not refer to probability distributions at all, but rather to a surround suppression model in which the center and surround mechanisms are characterized by Gaussian integration over their respective receptive fields, while the RoG considered here is a model of neural covariability in general.

We build from the standard normalization model [35, 54] which computes the across-trial average neural response (e.g., firing rate) as the ratio between a driving input to a neuron (*N*) and the summed input from a pool of nearby neurons (*D*):
$$R = f_{\div}(N, D) = \frac{N}{D} \tag{1}$$
where *f*_{÷} is the division operator; this functional notation is convenient for later derivations in which we consider the derivative of division. Our goal is to model the joint activity of pairs of neurons, so we extend the normalization model by considering two model neurons *R*_{1}, *R*_{2}. Since we are interested in trial-to-trial variability, we assume that a pair of neural responses *R*_{t} = (*R*_{1}, *R*_{2})_{t} on a single trial *t* can be written as the element-wise ratio of two Gaussian random vectors, *N*_{t} = (*N*_{1}, *N*_{2})_{t} and *D*_{t} = (*D*_{1}, *D*_{2})_{t}, with additive Gaussian random noise *η*_{t} = (*η*_{1}, *η*_{2})_{t} to capture the residual (i.e., stimulus-independent) variability.

As detailed further below, the numerators of two neurons can be correlated, and similarly for the denominators. In general, there can be correlations between the numerators and denominators (e.g., (*N*_{1}, *D*_{2}) may be correlated), requiring us to consider the joint, four-dimensional Gaussian distribution for the vector (*N*_{t}, *D*_{t}). However, in this paper we consider the simpler model in which *N*_{t} and *D*_{t} are independent and are each distributed according to their respective two-dimensional Gaussian distributions. This assumption allows for simplified mathematical derivations and is supported by our previous work which found that including a parameter for the correlation between *N* and *D* caused over-fitting to single-neuron data [53]. However, we have also derived the equations for the case that numerators and denominators are correlated (see S1 Text), and implemented them in the associated code toolbox, so that interested researchers can test if their data warrant the inclusion of those additional free parameters.

We therefore write the generative model for the pairwise RoG as:
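$$\mathbf{R}_t = f_{\div}(\mathbf{N}_t, \mathbf{D}_t) + \boldsymbol{\eta}_t, \qquad \mathbf{N}_t \sim \mathcal{N}(\boldsymbol{\mu}_N, \boldsymbol{\Sigma}_N), \quad \mathbf{D}_t \sim \mathcal{N}(\boldsymbol{\mu}_D, \boldsymbol{\Sigma}_D), \quad \boldsymbol{\eta}_t \sim \mathcal{N}(\boldsymbol{\mu}_\eta, \boldsymbol{\Sigma}_\eta) \tag{2}$$
where *f*_{÷} is applied element-wise, *μ*_{N}, *μ*_{D} are the two-dimensional vectors of means of the numerator and denominator, respectively, *Σ*_{N}, *Σ*_{D} are the respective 2 × 2 covariance matrices, and (*μ*_{η}, **Σ**_{η}) are the mean and covariance matrix for the residual component of the model.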
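The generative model can be simulated directly. The following is a minimal sketch in Python (standard library only), not the authors' MATLAB toolbox; the function names, the parameter values, and the choice to specify each covariance matrix through standard deviations and a correlation coefficient are assumptions made for illustration.

```python
import math
import random


def sample_bivariate_gaussian(rng, mean, sd, rho):
    """Draw one sample from a 2D Gaussian with given means, SDs and correlation."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mean[0] + sd[0] * z1
    # Mixing z1 and z2 gives the second coordinate correlation rho with the first.
    x2 = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


def simulate_pairwise_rog(n_trials, mu_n, sd_n, rho_n, mu_d, sd_d, rho_d,
                          mu_eta, sd_eta, rho_eta, seed=0):
    """Simulate single-trial responses R = N / D + eta for a pair of neurons.

    N and D are sampled independently of each other, as in the simplified model.
    """
    rng = random.Random(seed)
    responses = []
    for _ in range(n_trials):
        n = sample_bivariate_gaussian(rng, mu_n, sd_n, rho_n)
        d = sample_bivariate_gaussian(rng, mu_d, sd_d, rho_d)
        eta = sample_bivariate_gaussian(rng, mu_eta, sd_eta, rho_eta)
        responses.append((n[0] / d[0] + eta[0], n[1] / d[1] + eta[1]))
    return responses


def pearson_r(pairs):
    """Pearson correlation between the two coordinates of a list of pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Illustrative (hypothetical) parameters: the denominator mean is 10 SDs above
# zero, so the probability of D <= 0 is negligible.
trials = simulate_pairwise_rog(
    n_trials=20000,
    mu_n=(10.0, 10.0), sd_n=(2.0, 2.0), rho_n=0.5,
    mu_d=(2.0, 2.0), sd_d=(0.2, 0.2), rho_d=0.0,
    mu_eta=(0.0, 0.0), sd_eta=(0.1, 0.1), rho_eta=0.0,
)
print("empirical noise correlation:", round(pearson_r(trials), 3))
```

In this example only the numerators are correlated, so any noise correlation in the simulated responses is inherited from *ρ*_{N} and diluted by the independent denominator variability.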

For the independent RoG, the ratio variable in general follows a Cauchy distribution whose moments are not well defined. To fit the model, we used the result that when the denominator has negligible probability mass at values less than or equal to zero, the ratio distribution can be approximated by a Gaussian distribution with mean and variance that can be derived from a Taylor expansion [57–60]. This assumption is justified since the denominator is the sum of the non-negative responses from a pool of neurons [36] and is therefore unlikely to attain values less than or equal to zero.

For the pairwise extension, we can use the multivariate delta method (an application of a Taylor expansion) to compute the mean and covariance for the joint distribution of ratio variables [61] under the assumption that *μ*_{D} > 0. We note that the true distribution of the ratio of bivariate or multivariate Gaussian vectors is unknown (although there is some work on ratios of complex Gaussian variables [62, 63]) and has higher-order statistics (e.g., skewness, kurtosis) that are not well approximated by an equivalent Gaussian. In this paper, we are interested in modeling the noise covariance as this is the most widely studied statistic in the field, and we show that the approximations we derive are very accurate (see Fig 1). Future work could extend the model to account for these statistics by using higher-order terms in the Taylor expansion or a non-Gaussian copula.

To derive equations for the mean and covariance of the pairwise RoG, we use a Taylor expansion around the point (*μ*_{N}, *μ*_{D}):
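$$f_{\div}(N_i, D_i) = \frac{\mu_{N_i}}{\mu_{D_i}} + \frac{1}{\mu_{D_i}}\left(N_i - \mu_{N_i}\right) - \frac{\mu_{N_i}}{\mu_{D_i}^{2}}\left(D_i - \mu_{D_i}\right) + \text{higher-order terms}, \qquad i \in \{1, 2\} \tag{3}$$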

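Using only the first-order terms, we derive expressions for the mean and covariance matrix of the RoG:
$$\mathbb{E}[R_i] \approx \frac{\mu_{N_i}}{\mu_{D_i}} + \mu_{\eta_i} \tag{4}$$
$$\mathrm{cov}(R_i, R_j) \approx \frac{\mathrm{cov}(N_i, N_j)}{\mu_{D_i}\mu_{D_j}} + \frac{\mu_{N_i}\mu_{N_j}}{\mu_{D_i}^{2}\mu_{D_j}^{2}}\,\mathrm{cov}(D_i, D_j) + \mathrm{cov}(\eta_i, \eta_j), \qquad i, j \in \{1, 2\} \tag{5}$$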

Note that the variance of the denominator influences the mean of the ratio variable through a second-order term, hence it does not appear in Eq (4) (see [58] for the second-order Taylor expansion for the mean of a ratio variable). From Eq (5), we can obtain expressions for the variance of each neuron in the pair and their covariance and correlation. First, we adopt the following notation to simplify the equations: let and let . Then: (6) (7)
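For reference, the first-order delta method can be written out explicitly for a pair of neurons. The block below is a sketch under the assumptions of the simplified model (numerators and denominators independent of each other, additive Gaussian noise); the shorthand *v* for the coefficient of variation, and the symbols *σ*_{η} and *ρ*_{η} for the noise standard deviation and correlation, are introduced here for illustration and need not match the notation of Eqs (6) and (7).

```latex
% Illustrative shorthand (not necessarily the paper's notation):
%   v_{N_i} = \sigma_{N_i} / \mu_{N_i}, \quad v_{D_i} = \sigma_{D_i} / \mu_{D_i}
\begin{aligned}
\sigma_{R_i}^{2} &\approx \left(\frac{\mu_{N_i}}{\mu_{D_i}}\right)^{2}
    \left(v_{N_i}^{2} + v_{D_i}^{2}\right) + \sigma_{\eta_i}^{2}, \\
\mathrm{cov}(R_1, R_2) &\approx \frac{\mu_{N_1}\mu_{N_2}}{\mu_{D_1}\mu_{D_2}}
    \left(\rho_N\, v_{N_1} v_{N_2} + \rho_D\, v_{D_1} v_{D_2}\right)
    + \rho_\eta\, \sigma_{\eta_1}\sigma_{\eta_2}, \\
\rho_{NC} &= \frac{\mathrm{cov}(R_1, R_2)}{\sigma_{R_1}\,\sigma_{R_2}} .
\end{aligned}
```

Without additive noise, the means cancel and the correlation depends only on the coefficients of variation and on (*ρ*_{N}, *ρ*_{D}).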

Eq (7) is commonly referred to as the formula for “spurious” correlation of ratios found when comparing ratios of experimental variables [64], and we further generalize this in S1 Text. To the extent that tuning similarity between neurons reflects similarity in the driving inputs, and that those driving inputs are variable, neurons with more similar tuning would have larger *ρ*_{N}, which in turn implies larger noise correlations according to Eq (7). This is consistent with the widespread empirical observation that signal correlations and noise correlations are correlated [3].
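To make the dependence on *ρ*_{N} and *ρ*_{D} concrete, the first-order approximation to the noise correlation can be evaluated numerically. The sketch below assumes independent numerators and denominators and is written in Python for illustration; the function name and all parameter values are hypothetical.

```python
import math


def rog_noise_correlation(mu_n, mu_d, sd_n, sd_d, rho_n, rho_d,
                          sd_eta=(0.0, 0.0), rho_eta=0.0):
    """First-order (delta method) noise correlation for a pair of ratio variables.

    Assumes N and D are independent of each other; the additive noise is optional.
    """
    ratio = [mu_n[i] / mu_d[i] for i in range(2)]
    v_n = [sd_n[i] / mu_n[i] for i in range(2)]  # coefficient of variation of N
    v_d = [sd_d[i] / mu_d[i] for i in range(2)]  # coefficient of variation of D
    var_r = [ratio[i] ** 2 * (v_n[i] ** 2 + v_d[i] ** 2) + sd_eta[i] ** 2
             for i in range(2)]
    cov_r = (ratio[0] * ratio[1]
             * (rho_n * v_n[0] * v_n[1] + rho_d * v_d[0] * v_d[1])
             + rho_eta * sd_eta[0] * sd_eta[1])
    return cov_r / math.sqrt(var_r[0] * var_r[1])


# Compare independent (rho_d = 0) and strongly shared (rho_d = 0.9) denominators
# as the relative variability of the denominator doubles.
for rho_d in (0.0, 0.9):
    for sd_d in (0.2, 0.4):
        rho_nc = rog_noise_correlation(
            mu_n=(10.0, 10.0), mu_d=(2.0, 2.0),
            sd_n=(2.0, 2.0), sd_d=(sd_d, sd_d),
            rho_n=0.5, rho_d=rho_d,
        )
        print(f"rho_D={rho_d}, sd_D={sd_d}: rho_NC={rho_nc:.3f}")
```

In this toy setting, more variable denominators reduce the noise correlation when they are independent and increase it when they are strongly shared, which illustrates the qualitatively different regimes discussed in the text.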

### Parametrization of the pairwise RoG for contrast responses

In the form described so far, the pairwise RoG has 10 stimulus-dependent parameters and 5 stimulus-independent parameters for the additive noise. For any stimulus condition, there are only five relevant measurements that can be derived from the neural data (the response means and variances for each neuron in a pair, and their correlation), so the model is over-parametrized. Therefore, to apply the RoG to neural data, we need to reduce the total number of parameters.

The generality of this model provides a procedure for converting a standard normalization model (i.e., a model for the mean response) into a RoG model that specifies both mean and (co)-variance. In this paper, we use the example of contrast-gain control, which has been widely used to model the response of single neurons and neural populations to visual input with varying contrast [36, 65–67]. By adapting such a model, we can reduce the stimulus dependence of the means of the numerator and denominator (*μ*_{N}, *μ*_{D}). In the contrast-gain control model, the neural response as a function of contrast *c* (0 − 100%) is computed as a “hyperbolic ratio” [36, 65]:
$$R(c) = R^{\max}\,\frac{c^{2}}{\epsilon^{2} + c^{2}} + R_0 \tag{8}$$
where *R*^{max} is the maximum response rate, *ϵ* is the semi-saturation constant (the contrast at which the stimulus-driven response reaches *R*^{max}/2), which also prevents division by 0, and *R*_{0} is the spontaneous activity of the neuron (the response at 0% contrast). We can convert this standard model into an RoG by setting the mean of the numerator and denominator in the RoG to the numerator and denominator in this equation:
$$\mu_N(c) = R^{\max} c^{2}, \qquad \mu_D(c) = \epsilon^{2} + c^{2} \tag{9}$$

By using this functional form, we can substitute the stimulus-dependent parameters of the RoG (*μ*_{N}(*c*), *μ*_{D}(*c*)) with the stimulus-independent parameters (*R*^{max}, *ϵ*). Another model simplification is to assume that individual neural variability and mean neural response are related by a power function as has been observed in the visual cortex [68–70]:
$$\sigma_N^{2} = \alpha_N\, \mu_N^{\beta_N}, \qquad \sigma_D^{2} = \alpha_D\, \mu_D^{\beta_D} \tag{10}$$

This parametrization allows the Fano factor (the ratio of the variance to the mean) to vary with stimulus input (as long as *β* ≠ 1) and allows for both over-dispersion (Fano factor >1) and under-dispersion (Fano factor <1). Moreover, as with the mean, the four stimulus-dependent variance parameters of the model (the variances of the numerator and denominator for each neuron) can be replaced with four pairs of stimulus-independent parameters (*α*_{N}, *β*_{N}, *α*_{D}, *β*_{D}). Lastly, in principle, the parameters controlling correlation (*ρ*_{N}, *ρ*_{D}) can vary with stimulus conditions but for computational simplicity we assume that (*ρ*_{N}, *ρ*_{D}) are stimulus-independent. However, even with this assumption, our model can capture stimulus-dependent noise correlations (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1) as often observed *in vivo* [41, 71, 72].
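The contrast parametrization can be sketched in a few lines of Python. This is an illustration rather than the toolbox implementation: the squared-contrast form of the hyperbolic ratio is an assumption of this sketch, and the function names and parameter values are hypothetical.

```python
def contrast_means(c, r_max, eps):
    """Means of the numerator and denominator at contrast c (in percent).

    Assumes a hyperbolic ratio with squared contrast (exponent 2).
    """
    mu_n = r_max * c ** 2
    mu_d = eps ** 2 + c ** 2
    return mu_n, mu_d


def power_law_variance(mu, alpha, beta):
    """Variance tied to the mean by a power function: alpha * mu ** beta."""
    return alpha * mu ** beta


def fano_factor(mu, alpha, beta):
    """Variance-to-mean ratio; constant in mu only when beta == 1."""
    return power_law_variance(mu, alpha, beta) / mu


# Hypothetical parameters for one neuron.
r_max, eps = 30.0, 20.0
alpha_d, beta_d = 0.5, 1.2
for c in (2, 8, 16, 32, 64, 100):
    mu_n, mu_d = contrast_means(c, r_max, eps)
    print(f"c={c:3d}%  mean response={mu_n / mu_d:6.2f}  "
          f"Fano factor of D={fano_factor(mu_d, alpha_d, beta_d):.2f}")
```

Because the variances are functions of the means, each contrast level adds no new free parameters: the same (*α*, *β*) pairs describe the variability at every stimulus condition.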

### Fitting the RoG to data

We optimize the values of the parameters, given a dataset, by maximum likelihood estimation. In this paper, we validate various properties of the pairwise RoG using synthetic data produced from the generative model. We will demonstrate the applicability of this model to neural data analysis by fitting the pairwise RoG to calcium imaging data (see Data collection and processing).

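Based on our previous discussion, we assume that the model parameters (collectively denoted Θ) are stimulus-independent. We consider our dataset {*R*_{t}(*s*)} where *s* is the stimulus and *t* indexes the trial. We assume that, for each stimulus, our data is independent and identically distributed according to the Gaussian approximation 𝒩(*μ*_{R}(*s*; Θ), *Σ*_{R}(*s*; Θ)), with mean and covariance given by Eqs (4) and (5), and that data is independent across stimuli. We can therefore compute the negative log-likelihood of the data using the following equation (see S2 Text for derivation):
$$-\log \mathcal{L}(\Theta) = \sum_{s} \frac{T_s}{2}\left[\log\det\boldsymbol{\Sigma}_R(s;\Theta) + \mathrm{tr}\!\left(\boldsymbol{\Sigma}_R(s;\Theta)^{-1}\hat{\boldsymbol{\Sigma}}(s)\right) + \left(\hat{\boldsymbol{\mu}}(s) - \boldsymbol{\mu}_R(s;\Theta)\right)^{\top}\boldsymbol{\Sigma}_R(s;\Theta)^{-1}\left(\hat{\boldsymbol{\mu}}(s) - \boldsymbol{\mu}_R(s;\Theta)\right)\right] + \text{const} \tag{11}$$
where *T*_{s} is the number of trials for stimulus *s*, and $\hat{\boldsymbol{\mu}}(s)$ and $\hat{\boldsymbol{\Sigma}}(s)$ are the empirical mean and covariance across trials computed from the data.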
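Because the likelihood is Gaussian, the negative log-likelihood for one stimulus condition depends on the data only through the empirical mean and covariance. The following Python sketch shows this for the bivariate case; it is a generic identity for Gaussian likelihoods rather than the toolbox code, the function names are hypothetical, and the empirical covariance is assumed to use the 1/*T* normalization.

```python
import math


def empirical_moments(trials):
    """Empirical mean and covariance (1/T normalization) of a list of 2D responses."""
    t = len(trials)
    mean = [sum(x[i] for x in trials) / t for i in range(2)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in trials) / t
            for j in range(2)] for i in range(2)]
    return mean, cov


def gaussian_nll_from_moments(n_trials, emp_mean, emp_cov, model_mean, model_cov):
    """Negative log-likelihood of n_trials i.i.d. bivariate Gaussian observations,
    computed from their empirical mean and covariance."""
    (a, b), (c, d) = model_cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [emp_mean[i] - model_mean[i] for i in range(2)]
    # tr(inv(Sigma) @ S): how well the model covariance matches the data spread.
    trace_term = sum(inv[i][j] * emp_cov[j][i] for i in range(2) for j in range(2))
    # Mahalanobis distance between empirical and model means.
    quad_term = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    return 0.5 * n_trials * (math.log(det) + trace_term + quad_term
                             + 2.0 * math.log(2.0 * math.pi))


def total_nll(data_by_stimulus, model_by_stimulus):
    """Sum the per-stimulus terms, assuming independence across stimuli.

    data_by_stimulus: dict mapping stimulus -> list of (R1, R2) trials.
    model_by_stimulus: dict mapping stimulus -> (model_mean, model_cov).
    """
    total = 0.0
    for s, trials in data_by_stimulus.items():
        emp_mean, emp_cov = empirical_moments(trials)
        model_mean, model_cov = model_by_stimulus[s]
        total += gaussian_nll_from_moments(len(trials), emp_mean, emp_cov,
                                           model_mean, model_cov)
    return total
```

Working from the empirical moments means the cost of each likelihood evaluation does not grow with the number of trials, which matters when the likelihood is evaluated many times during optimization.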

In practice, we have found that it is computationally faster to first optimize the parameters for each neuron in the pair separately (which is equivalent to fitting the independent RoG model), and then optimize the correlation parameters (i.e., the *ρ* parameters) with the single-neuron model parameters fixed. This two-step optimization process is referred to as the inference functions for marginals method in the copula literature, and is known to be mathematically equivalent to maximum likelihood estimation for Gaussian copulas [73], which is the case we consider here. This points to an extension or alternative to the pairwise RoG that considers the bivariate distribution to be some non-Gaussian copula with Gaussian marginals, which we leave for future work. We assumed that the pairwise distribution is Gaussian for computational simplicity, but others have used non-Gaussian copulas to model neural populations [74].

#### Cross-validated goodness of fit.

To measure the quality of model fit, we used a cross-validated pseudo-*R*^{2} measure [75], as follows. During fitting, we divided the recording trials for each pair and for each stimulus into training and test sets (for simulation studies, we used two or ten-fold cross validation; for the calcium analysis we used leave-one-out cross-validation). We then fit the parameters of the model for each training set and used the following equation to assess the model prediction on the held-out data:
$$\text{Goodness of fit} = \frac{LL_{\text{null}} - LL_{\text{fit}}}{LL_{\text{null}} - LL_{\text{oracle}}} \tag{12}$$
where *LL*_{fit} is the negative log-likelihood (using Eq (11)) for the test data using the optimized parameters, *LL*_{null} is the negative log-likelihood of the data assuming that there is no modulation of the responses by stimulus contrast, and *LL*_{oracle} is the negative log-likelihood of the data using the empirical mean, variance, and covariance of the training data per stimulus condition. The reported goodness of fit score is the median across all training and test splits of the computed score (Eq (12)). Because of this cross-validation, goodness of fit values can be <0 (the fit model is worse than the null model) or >1 (the fit model performance is better than the oracle).
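As a small illustration of how the score behaves, the normalized measure can be computed from the three negative log-likelihoods of each cross-validation split and then summarized by the median. The Python sketch below assumes the score is the fraction of the gap between null and oracle that the fitted model closes; the function names and the numbers are hypothetical.

```python
import statistics


def goodness_of_fit(nll_fit, nll_null, nll_oracle):
    """Normalized score: 0 when the fit matches the null model and 1 when it
    matches the oracle. All inputs are negative log-likelihoods on held-out data."""
    return (nll_null - nll_fit) / (nll_null - nll_oracle)


def cross_validated_score(splits):
    """Median score across splits; each split is (nll_fit, nll_null, nll_oracle)."""
    return statistics.median(goodness_of_fit(*split) for split in splits)


# Hypothetical held-out negative log-likelihoods for three splits.
splits = [(62.0, 100.0, 55.0), (48.0, 90.0, 50.0), (110.0, 105.0, 60.0)]
print("median goodness of fit:", round(cross_validated_score(splits), 3))
```

The second split scores above 1 (the fitted model generalizes better than the oracle) and the third scores below 0 (worse than the null model), which is possible only because the score is evaluated on held-out data.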

#### Quantifying the accuracy of the estimated correlation parameters.

As we are interested in interpreting the correlation model parameters (*ρ*_{N}, *ρ*_{D}), we need to assess the accuracy of the maximum likelihood estimator. For simulations, we directly compare the estimated *ρ* values to the true values used to generate the data. For real neural data, however, we do not have access to the true values: instead, we compute confidence intervals. To do so, we perform a bootstrap fit procedure: given a set of pairwise neural responses {*R*_{t}(*s*)} with *T* simultaneously recorded trials, we sample these trials with replacement *T* times and then fit the pairwise RoG using the resampled set of neural responses as our observations. Repeating this procedure for a large number of samples (in the analysis in sections Inference of correlation parameters and Pairwise Ratio of Gaussians model captures correlated variability in mouse V1 we used 1000 bootstrap samples) gives us sets of fit *ρ*_{N}, *ρ*_{D}, which we use to compute a 90% confidence interval. Using the synthetic data, we validate these confidence intervals by measuring the empirical coverage probability and comparing to the nominal confidence level. These confidence intervals allow one to quantify the accuracy of the *ρ* parameter estimates. We then demonstrate one possible use of the confidence intervals, with an application focused specifically on the sign of the *ρ* parameters.
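The bootstrap procedure can be sketched as follows. This Python example is a generic illustration: it uses the percentile method for the interval (an assumption, since other interval constructions are possible) and takes an arbitrary `statistic` function; in the actual analysis the statistic would be the fitted *ρ* parameters from the pairwise RoG, for which the Pearson correlation below is only a cheap stand-in.

```python
import math
import random


def pearson_r(pairs):
    """Pearson correlation; used here only as a stand-in for a model fit."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(trials, statistic, n_boot=1000, level=0.90, seed=0):
    """Percentile bootstrap confidence interval for statistic(trials).

    Trials are resampled jointly (both neurons together) so that the
    simultaneity of the recordings is preserved.
    """
    rng = random.Random(seed)
    t = len(trials)
    estimates = []
    for _ in range(n_boot):
        resample = [trials[rng.randrange(t)] for _ in range(t)]
        estimates.append(statistic(resample))
    estimates.sort()
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = min(n_boot - 1, int((1.0 + level) / 2.0 * n_boot))
    return estimates[lo_idx], estimates[hi_idx]


def sign_is_determined(interval):
    """True when the interval excludes zero, so the sign of the estimate is clear."""
    lo, hi = interval
    return lo > 0.0 or hi < 0.0
```

A typical call would be `bootstrap_ci(trials, pearson_r)` with `trials` a list of (*R*_{1}, *R*_{2}) tuples; an interval that excludes zero supports a statement about the sign of the parameter.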

### Model comparison

We compared the pairwise RoG to a modified version of the modulated Poisson model [69], using Gaussian noise instead of Poisson [76]. We call this model the *modulated Gaussian* (MG). The original model is a compound Poisson-Gamma distribution, in which the Poisson rate parameter is the product of the mean tuning curve and a random gain variable that is Gamma distributed. The parameters of the Gamma distribution depend on the mean tuning curve (*f*(*s*)) and the variance of the gain variable (*σ*_{G}). Additionally, there are two sources contributing to (tuned) covariability: the correlation between the multiplicative gains (*ρ*_{G}), and the correlation between the Poisson processes (*ρ*_{P}). For the modulated Gaussian model, we use a bivariate Gaussian distribution whose moments (i.e., mean, variance, and covariance) are parametrized according to the moments of the modulated Poisson model. We made this modification to the modulated Poisson model for two main reasons. First, because we are examining continuously valued fluorescence traces as opposed to discrete spike count data, a continuous distribution is more appropriate for analysis. Second, the original modulated Poisson, while including a parametrization of the noise covariance between neurons, has no simple closed form for the bivariate distribution, which complicates the comparison of goodness of fit between the two models. By using a bivariate Gaussian distribution, we can more directly compare this model to our proposed pairwise RoG.

More explicitly, the pairwise neural responses are distributed as:
(13)
where we assume the mean tuning curve is the contrast-response curve (Eq (8)), *σ*_{G_i} is the standard deviation of the multiplicative gain for neuron *i*, and *ρ*_{G} is the correlation between the multiplicative gains. *ρ*_{P} is no longer interpreted as the point process correlation; instead, *ρ*_{P} controls the portion of the tuned covariability that is independent of the shared gain. As with the RoG, we also model the untuned variability *η* as additive bivariate Gaussian noise. We then fit the model parameters to data by minimizing the negative log-likelihood (Eq (11), with the model mean and covariance defined in Eq (13)). As with the pairwise RoG, we use cross-validation to account for model complexity and compute the goodness of fit scores using Eq (12). An extension to this model was recently proposed that incorporates normalization by assuming the rate parameter is a ratio term in which the denominator is a Gaussian random variable, then deriving moments of the distribution for optimization [77]. However, this model does not currently account for noise correlations, so we chose to instead adapt the Poisson-Gamma model.

The most relevant difference between RoG and MG is in how each model accounts for the effect of normalization on (co)variability. In the RoG, normalization directly influences variability by division operating on random variables. This creates flexible dependencies between the mean firing rate, individual neuron variability and shared covariability. In the MG, normalization influences the gain of neurons through the interaction between the mean firing rate (i.e., the standard normalization model) and the gain parameter *σ*_{G}, which is assumed to be a slowly fluctuating source of variability that scales how the mean firing rate affects variability. In this way, the normalization signal for the MG is a deterministic factor. The MG is therefore a simpler model that can only account for overdispersion, whereas the RoG allows for both overdispersion and underdispersion, and for diverse patterns of covariability (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1), albeit at the cost of additional parameters.

### Inference of single trial normalization from measured neural activity

Because of the probabilistic formulation of the RoG, we can use Bayes theorem to compute the posterior probability of the normalization variable *D*_{t} in a single trial, given the observed neural responses *R*_{t}:
(14)
where multiplication by |*D*_{1}*D*_{2}| occurs due to the change of variables formula for probability density functions. From this distribution, we can find the maximum a posteriori (MAP) estimates of the normalization strength in a single trial by differentiating the posterior distribution with respect to the denominator variables and finding the maxima by setting the partial derivatives to 0. For ease of computation, we solve the equivalent problem of finding the zeros of the partial derivatives of the negative logarithm of the posterior distribution. In our previous work [53], we found that, when subtracting the mean additive noise from the simulated activity, the MAP estimate remained unbiased. Thus, for simplicity, we assume that we can subtract off the mean spontaneous activity and consider instead the posterior *p*(*D* | *R* − *μ*_{η}). To obtain an estimate for the denominator strength *D*_{t} we look at the partial derivatives of the negative log posterior with respect to *D*_{1}, *D*_{2} and solve to obtain the MAP estimate. This procedure leads to a two-dimensional system of bivariate quadratic equations:
(15)
where the coefficients *A*_{1}, *A*_{2}, *B*_{1}, *B*_{2}, *C* are functions of the parameters of the model (see S3 Text for the derivation of Eq (15) and the full expressions of these coefficients).

A basic result from the algebra of polynomial systems (Bézout’s theorem) tells us that this system has four pairs of possibly complex valued solutions [78]. In fact, as solving this system amounts to solving a quartic equation in one variable, there exists an algebraic formula (with radicals) for solutions to this system as a function of the coefficients. This solution is too long to include here and uninformative but was found using the Symbolic Math toolbox from MATLAB and is included in our toolbox (Code and Data Availability).

Because all the variables involved are real-valued, we are only interested in the existence of real solutions to this two-dimensional system. However, there is no theoretical guarantee that there will be any real solutions. In practice we take the real part of the algebraic solution to this system and find which pair of solutions minimizes the negative log posterior. Alternatively, we can find the MAP by directly minimizing the negative logarithm of Eq (14) (see S3 Text) using numerical optimization. We have verified that, when real-valued solutions to Eq (15) exist, these coincide with the solutions obtained by numerical minimization. However, because numerical optimization must be performed on a per-trial basis, it is far too time consuming when there are many experimental trials, so we use the algebraic solution to Eq (15).
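The numerical alternative can be illustrated with a short Python sketch. It is not the algebraic solution used in the toolbox: it swaps in a simple derivative-free compass search, and it relies on several stated assumptions, namely that the mean noise has already been subtracted from the response, that the residual noise variance is ignored, that numerators and denominators are independent, and that the search is restricted to positive denominators. Under these assumptions the negative log posterior is, up to a constant, the sum of −log(*D*_{1}*D*_{2}) and the two Gaussian quadratic forms for *N* = *R* ∘ *D* and for *D*. All function names are hypothetical.

```python
import math


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance for a 2D Gaussian with symmetric covariance."""
    (a, b), (_, c) = cov
    det = a * c - b * b
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    return (c * dx0 ** 2 - 2.0 * b * dx0 * dx1 + a * dx1 ** 2) / det


def neg_log_posterior(d, r, mu_n, cov_n, mu_d, cov_d):
    """Negative log posterior of the denominators (up to a constant), given the
    noise-subtracted responses r. Non-positive denominators are excluded."""
    if d[0] <= 0.0 or d[1] <= 0.0:
        return math.inf
    n = (r[0] * d[0], r[1] * d[1])  # numerators implied by r and d
    return (-math.log(d[0] * d[1])  # Jacobian term from the change of variables
            + 0.5 * mahalanobis_sq(n, mu_n, cov_n)
            + 0.5 * mahalanobis_sq(d, mu_d, cov_d))


def map_denominator(r, mu_n, cov_n, mu_d, cov_d, tol=1e-8, max_iter=100000):
    """Compass search for the MAP denominators, started at the prior mean."""
    d = [float(mu_d[0]), float(mu_d[1])]
    best = neg_log_posterior(d, r, mu_n, cov_n, mu_d, cov_d)
    step = 0.1 * max(mu_d)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in (0, 1):
            for sign in (1.0, -1.0):
                cand = list(d)
                cand[i] += sign * step
                val = neg_log_posterior(cand, r, mu_n, cov_n, mu_d, cov_d)
                if val < best:
                    d, best, improved = cand, val, True
        if not improved:
            step *= 0.5  # shrink the search pattern once no move helps
    return tuple(d)


# Hypothetical parameters; a response above the mean ratio (10 / 2 = 5) pulls
# the estimated denominators below their prior mean of 2.
mu_n, cov_n = (10.0, 10.0), ((1.0, 0.5), (0.5, 1.0))
mu_d, cov_d = (2.0, 2.0), ((0.04, 0.02), (0.02, 0.04))
print(map_denominator((6.0, 6.0), mu_n, cov_n, mu_d, cov_d))
```

The loop runs once per trial, which makes clear why a closed-form solution is preferable when many trials have to be processed.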

### Generating realistic pairwise neural activity from the model

To constrain our simulations to realistic parameter values for the contrast response function (Eq (8)), we took the single-neuron best-fit parameters to macaque V1 data analyzed in our previous work (for details see [53]) and created parameter pairs by considering all combinations (*N* = 11628 pairs) of these parameters. Using the generative model for the pairwise RoG (Eq (2)) and the contrast-response parametrization (Eq (8)), we can simulate single-trial neural activity from these parameter pairs and specific values for (*ρ*_{N}, *ρ*_{D}). These synthetic data allow us to explore properties of the pairwise model without having to exhaustively explore the full parameter space.

### Data collection and processing

#### Animal preparation.

Data were collected from CaMKII-tTA;tetO-GCaMP6s mice [79], expressing GCaMP6s in cortical excitatory neurons. Mice were implanted with headplates and cranial windows over V1 [80]. Briefly, mice were anesthetized with 2% isoflurane and administered 2 mg/kg of dexamethasone and 0.5 mg/kg of buprenorphine. Animals were secured in a stereotaxic frame (Kopf) and warmed with a heating pad. The scalp was removed and the skull was lightly etched. A craniotomy was made over V1 using a 3.5 mm skin biopsy punch. A cranial window, consisting of two 3 mm diameter circular coverslips glued to a 5 mm diameter circular coverslip, was placed onto the craniotomy, and secured into place with Metabond (C&B). Then a custom-made titanium headplate was secured via Metabond (C&B) and the animals were allowed to recover in a heated cage.

#### Behavioral task and visual stimuli.

During imaging, mice were head-fixed in a tube and performed an operant visual detection task [81]. Briefly, mice were trained to withhold licking when no stimulus was present, and to lick within a response window after stimulus presentation. Mice were water-restricted and given a water reward for correct detection. Visual stimuli were drifting sinusoidal gratings (2 Hz, 0.08 cycles/degree) presented for 500 ms followed by a 1000 ms response window. Stimuli were generated and presented using PsychoPy2 [82]. Visual stimuli were presented using a gamma corrected LCD monitor (Podofo, 25 cm, 1024x600 pixels, 60 Hz refresh rate) located 10 cm from the right eye. The contrast of the gratings was varied across 7 levels: {2, 8, 16, 32, 64, 80, 100}%, except for 2 recording sessions in which the 80% contrast level was omitted. This did not alter any of the analysis, allowing sessions to be combined into a single dataset.

#### Calcium imaging.

Once they learned the task, mice started performing under the 2p microscope, and V1 was imaged via cranial window. Imaging was performed using a 2-photon microscope (Sutter MOM, Sutter Inc.), with a 20X magnification (1.0 NA) water-immersion objective (Olympus Corporation). Recordings were done in L2/3 in an 800 x 800 *μ*m field of view, with 75–100 mW of 920 nm laser light (Chameleon; Coherent Inc). An electrically tunable lens (Optotune) was used to acquire 3 plane volumetric images at 6.36 Hz. Planes were 30 *μ*m apart. Acquisition was controlled with ScanImage (Vidrio Technologies).

Calcium imaging data were motion-corrected and ROIs were extracted using suite2p [83], and all data were neuropil subtracted with a coefficient of 0.7 (we also analyzed data using neuropil coefficients of 0.4 and 1, and see S4 Text for additional analysis with deconvolved data; all the results presented in the main text were qualitatively similar across preprocessing methods).

#### Data processing.

Processing of calcium imaging data was performed using custom MATLAB code. Fluorescence traces for individual trials and cells (average of the neuropil subtracted fluorescence across a ROI) consisted of 24 frames: 4 frames of pre-stimulus blank, followed by 3 frames of stimulus presentation and 17 frames of post-stimulus blanks corresponding to the response window for the behavioral task. In our analyses, we considered one extra frame to account for onset delays and calcium dynamics. Baseline fluorescence (*F*_{0}) was computed as the median across pre-stimulus frames (1–5), and the stimulus evoked fluorescence (Δ*F*/*F*) was computed as the mean of the normalized fluorescence per frame ((*F*(*i*) − *F*_{0})/*F*_{0} for *F*(*i*) the fluorescence of frame *i*) across frames corresponding to stimulus response. Spontaneous Δ*F*/*F* was computed as above during blank trials. Cells were included in further analysis if the evoked response at the highest contrast was at least 2 standard deviations above the spontaneous mean fluorescence. Across 9 recording sessions, 295/8810 neurons met this inclusion criterion. This small percentage is due to sessions recorded using gratings with fixed spatial frequency and orientation; thus, included neurons are visually responsive and selective for this combination.
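The Δ*F*/*F* computation and inclusion criterion described above can be sketched as follows. This is an illustrative Python version (the original analysis used custom MATLAB code, which is not shown here). The function names are hypothetical, the response-frame indices in the example are placeholders rather than the exact frames used in the paper, and taking the standard deviation over spontaneous (blank-trial) responses is an assumption about how the criterion was applied.

```python
import numpy as np

def evoked_dff(traces, baseline_idx, response_idx):
    """Per-trial evoked dF/F.

    traces: array (n_trials, n_frames) of neuropil-subtracted fluorescence.
    baseline_idx / response_idx: slices or index arrays selecting frames.
    """
    traces = np.asarray(traces, dtype=float)
    # F0: median over the pre-stimulus frames, computed per trial.
    f0 = np.median(traces[:, baseline_idx], axis=1, keepdims=True)
    # Normalized fluorescence per frame, then mean over the response frames.
    dff = (traces - f0) / f0
    return dff[:, response_idx].mean(axis=1)

def is_responsive(evoked_high_contrast, spontaneous, n_sd=2.0):
    """Inclusion test: mean evoked response at the highest contrast is at least
    n_sd spontaneous SDs above the spontaneous mean (assumed reading)."""
    threshold = spontaneous.mean() + n_sd * spontaneous.std(ddof=1)
    return bool(evoked_high_contrast.mean() >= threshold)

# Toy example: 3 trials x 24 frames, baseline of 100 and a step to 150.
traces = np.full((3, 24), 100.0)
traces[:, 5:9] = 150.0
baseline = slice(0, 5)   # frames 1-5 in the text's 1-indexed numbering
response = slice(5, 9)   # placeholder response window, for illustration only
dff = evoked_dff(traces, baseline, response)
```

In this toy case every trial yields Δ*F*/*F* = (150 − 100)/100 = 0.5.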

## Results

We developed the pairwise Ratio of Gaussians model (RoG, Fig 1A) to quantify the relationship between normalization and response covariability (i.e., noise correlations) that has been suggested in empirical studies of neural activity in visual cortex [52, 55, 84, 85]. In the standard divisive normalization model (Eq (1)), the mean response is computed as the ratio of the excitatory drive (numerator *N*) to a normalization signal summarizing inhibitory drive (denominator *D*). Our pairwise RoG considers a pair of neurons where each individual neural response is well characterized by the standard normalization equation with corresponding numerators (*N*_{1}, *N*_{2}) and denominators (*D*_{1}, *D*_{2}). We then assume that the numerators and denominators are bivariate Gaussian random vectors, which allows the possibility for correlations to exist among the numerators (denoted *ρ*_{N}) and among the denominators (*ρ*_{D}). From this, we derived equations for the mean responses and covariance matrix of the pair as a function of the numerators and denominators (Eqs (4) and (5)). These equations depend on the Gaussian approximation to the ratio of two Gaussian random variables. We verified the validity of this approximation for the moments of interest (mean, variance, and covariance), by simulating the activity of pairs of neurons, and comparing the covariance (Fig 1B and 1C) and correlation (Fig 1D and 1E) of the true ratio distribution and of the approximate distribution (Eqs (6) and (7)). The mean and variance are identical to the independent RoG model (a special case of the pairwise RoG, with numerators and denominators independent between neurons) which we validated previously [53].

### Modulations of correlated variability depend on sharing of normalization

Within the RoG modeling framework, there are two main sources of response (co)-variability: the numerator (excitatory drive) and the denominator (normalization). Depending on the value of the corresponding *ρ* parameters, each of these sources of variability can be independent (*ρ* = 0) or shared (*ρ* ≠ 0), and therefore contribute differently to noise correlations. Consequently, understanding modulations of noise correlations in the pairwise RoG requires understanding how normalization and external stimuli affect the relative amplitude of these sources, and how the effects depend on whether those sources are shared.

First, we studied the relationship between normalization and noise correlations for the lowest contrast stimuli (Fig 2, yellow symbols). We define normalization strength for a given neuron as the mean of the denominator; for a fixed stimulus contrast, this is determined by the semi-saturation constant *ϵ* in Eq (8). When the normalization signals are positively correlated (shared normalization, *ρ*_{D} = 0.5) but the excitatory drive is independent (*ρ*_{N} = 0), increasing normalization strength tends to decrease the magnitude of noise correlations (Fig 2A). Conversely, when the normalization signals are independent (*ρ*_{D} = 0) but there is shared driving input (*ρ*_{N} = 0.5), the magnitude of noise correlations tends to increase with increasing normalization (Fig 2B). Intuitively, this is due to how the model partitions neuronal covariability into two sources and how normalization separately affects the variability of these sources. As mentioned at the beginning of this section, these sources are the correlations among the numerators and among the denominators. The correlations arising from the numerator are unaffected by normalization strength, while the correlations arising from the denominator tend to decrease with normalization strength. So, when *ρ*_{N} = 0, the noise correlations are solely due to the denominator cofluctuations and thus tend to decrease with normalization. However, when *ρ*_{D} = 0, the numerator covariability drives the response covariability, which is unchanged by normalization strength. We see increased noise correlations in this scenario because normalization decreases individual neuronal response variability [53], so the proportional contribution of the numerator term increases. This is derived more completely later in this section.
Therefore, increasing normalization strength reduces a source of noise correlations when normalization is shared, whereas it reduces a source of independent noise when normalization is not shared.
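The two regimes described above can be checked with a small Monte Carlo sketch. It is not the simulation behind Fig 2: both neurons share the same hypothetical means, the variance is assumed to follow a power law `alpha * mu**beta` with illustrative values, residual noise is omitted, and normalization strength is varied directly through the mean of the denominator.

```python
import numpy as np

def noise_correlation(mu_n, mu_d, rho_n, rho_d, alpha=0.1, beta=1.25,
                      n_trials=50_000, seed=0):
    """Pearson correlation of ratio-of-Gaussians responses for a symmetric pair."""
    rng = np.random.default_rng(seed)

    def draw(mu, rho):
        var = alpha * mu ** beta                  # assumed variance model
        cov = np.array([[var, rho * var], [rho * var, var]])
        return rng.multivariate_normal([mu, mu], cov, size=n_trials)

    responses = draw(mu_n, rho_n) / draw(mu_d, rho_d)
    return np.corrcoef(responses[:, 0], responses[:, 1])[0, 1]

# Increasing mean denominator = increasing normalization strength.
strengths = [20.0, 60.0, 200.0]

# Shared normalization, independent drive (rho_D = 0.5, rho_N = 0).
shared_norm = [noise_correlation(50.0, d, rho_n=0.0, rho_d=0.5) for d in strengths]

# Shared drive, independent normalization (rho_N = 0.5, rho_D = 0).
shared_drive = [noise_correlation(50.0, d, rho_n=0.5, rho_d=0.0) for d in strengths]
```

Under these assumptions, `shared_norm` decreases and `shared_drive` increases as the mean denominator grows, mirroring the qualitative pattern reported for Fig 2A and 2B.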

Each panel shows, for a combination of *ρ*_{N}, *ρ*_{D} specified in the panel title, the median noise correlation of all generated neural pairs binned according to *ϵ*_{1} × *ϵ*_{2}, a contrast independent measure of the common normalization strength. Bins with fewer than 100 pairs were discarded. Neural responses were generated from the contrast-response parametrization (Eq (8)). Noise correlation strength was computed across 1e3 simulated trials drawn from the pairwise RoG model. For each contrast level and combination of *ρ*_{N}, *ρ*_{D}, 1e5 simulated experiments were created. See S1 Fig for a more systematic exploration of the factors influencing modulation of noise correlations for two simulated pairs that match the large scale experiment considered here. Model parameters were drawn uniformly from the following intervals:

We found similar effects of normalization on noise correlations at higher contrast levels (Fig 2, orange and red symbols) although the slope of the relationship became shallower as contrast increased. This is due to the saturating effect of the contrast response function (Eq (8)): at high contrast, (co)fluctuations of the normalization signal across trials have relatively little effect on the responses of a neural pair, so the correlation in neural responses will be relatively unaffected by normalization strength. We also observed that the magnitude of noise correlations generally increased with contrast when the denominators were correlated and the numerators independent (Fig 2A and S2 Fig), whereas it decreased when the numerators were correlated (Fig 2B and 2C and S2 Fig). Importantly, for the analysis of normalization strength at fixed contrast, we used the contrast semi-saturation constants (Eq (8); i.e., a pure change in the denominator) as a measure of normalization strength. Conversely, increasing stimulus contrast increases both the numerator and denominator (Eq (8)). This explains our observation that, even though normalization is stronger at higher contrast, noise correlations can be modulated in different ways by contrast and by normalization, because changing stimulus contrast also affects the numerator term.

Indeed, these results can be derived from examining the equation for the correlation in the pairwise RoG (Eq (7)) rearranged as follows (see S5 Text for more details):
(16)
and by recognizing that the coefficient of variation of *D*_{i} () is a decreasing function of normalization strength (*μ*_{D} or *ϵ*), whereas the ratio is *often* an increasing function of contrast.

We further analyze this equation, first in the case of changing normalization strength while keeping contrast fixed. The term proportional to *ρ*_{N} is an increasing function of the mean normalization strength since the denominator is a decreasing function of *μ*_{D}. Conversely, the term proportional to *ρ*_{D} decreases with normalization since the denominator increases with *μ*_{D}. In these two cases, the monotonic dependence of noise correlation on normalization strength is guaranteed by Eq (16) regardless of the specific parameter values. Similar patterns emerge when *ρ*_{N}, *ρ*_{D} < 0, except the signs of the noise correlations are reversed (S2 Fig). When the correlations of input and normalization signals are both different from zero, the relationship between noise correlation and normalization strength resembles a combination of the two previously described scenarios, and the specific parameter values determine which of the two terms in Eq (16) dominates. For instance, in our simulations with (*ρ*_{N} = 0.5, *ρ*_{D} = 0.5) (Fig 2C), the magnitude of noise correlations increased with normalization strength on average similar to (Fig 2B), indicating that the magnitude of the term proportional to *ρ*_{N} is usually larger than the magnitude of the *ρ*_{D} term, but this trend is not consistent, as evidenced by the increased spread of the scatter. When the input strength and normalization signal have opposite correlations (e.g., *ρ*_{N} = 0.5, *ρ*_{D} = −0.5), we obtained similar results; however, the magnitude of noise correlations was on average closer to 0 due to cancellation between the two sources of covariability (S2 Fig).
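The monotonicity argument above can be made concrete with a first-order (delta-method) sketch. This is an illustration under stated assumptions, not a reproduction of Eq (16): it assumes no residual noise (*η* ≡ 0), numerators independent of denominators, and a linear expansion of each ratio around its means. The shorthand c_{N_i} = σ_{N_i}/μ_{N_i} and c_{D_i} = σ_{D_i}/μ_{D_i} for the coefficients of variation is introduced here and may differ from the paper's notation.

```latex
% Assumptions: eta = 0; N_i independent of D_j; first-order expansion.
% c_{N_i} = \sigma_{N_i}/\mu_{N_i}, \quad c_{D_i} = \sigma_{D_i}/\mu_{D_i}
\begin{align}
R_i = \frac{N_i}{D_i}
  &\approx \frac{\mu_{N_i}}{\mu_{D_i}}
     \left(1 + \frac{N_i-\mu_{N_i}}{\mu_{N_i}}
             - \frac{D_i-\mu_{D_i}}{\mu_{D_i}}\right), \\
\operatorname{Var}(R_i)
  &\approx \frac{\mu_{N_i}^2}{\mu_{D_i}^2}
     \left(c_{N_i}^2 + c_{D_i}^2\right), \\
\operatorname{Cov}(R_1,R_2)
  &\approx \frac{\mu_{N_1}\mu_{N_2}}{\mu_{D_1}\mu_{D_2}}
     \left(\rho_N\, c_{N_1}c_{N_2} + \rho_D\, c_{D_1}c_{D_2}\right), \\
\operatorname{Corr}(R_1,R_2)
  &\approx \frac{\rho_N}{\sqrt{\left(1+\frac{c_{D_1}^2}{c_{N_1}^2}\right)
                               \left(1+\frac{c_{D_2}^2}{c_{N_2}^2}\right)}}
   + \frac{\rho_D}{\sqrt{\left(1+\frac{c_{N_1}^2}{c_{D_1}^2}\right)
                         \left(1+\frac{c_{N_2}^2}{c_{D_2}^2}\right)}}.
\end{align}
```

In this sketch, if the coefficient of variation of the denominator falls as its mean grows while the numerator statistics stay fixed, the denominator of the *ρ*_{N} term shrinks and that of the *ρ*_{D} term grows. The *ρ*_{N} contribution therefore increases with normalization strength and the *ρ*_{D} contribution decreases, in line with the argument in the text.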

A similar analysis of Eq (16) shows that the effects of stimulus contrast are opposite to those of a pure change of normalization strength, because is often an increasing function of contrast but a decreasing function of normalization strength. Notice that we assumed there is no residual noise component (*η* ≡ 0), but all the analyses above remain valid when the amplitude of noise variance *σ*_{η} is relatively small compared to (*δ*_{N}, *δ*_{D}) (see S5 Text).

In summary, our analysis shows that normalization and stimulus contrast can have diverse effects on noise correlations, depending on whether neurons share their normalization signals and on the interplay between multiple sources of variability.

### Inference of correlation parameters

The above analysis demonstrates that the relationship between noise correlations and normalization depends on how this correlated variability arises: either through cofluctuations in the excitatory drive or normalization signal (determined by *ρ*_{N}, *ρ*_{D} respectively). To employ these insights when fitting to data, we need to know how well we can infer these parameters from data. To do so, we generated synthetic neural data using realistic values for the single-neuron parameters (see Generating realistic pairwise neural activity from the model) and uniformly randomly sampled *ρ*_{N}, *ρ*_{D} parameters in the range [-0.9,0.9]. We then assessed the quality of the maximum likelihood estimate of the parameters by calculating bootstrapped confidence intervals (with N = 1000 bootstrap samples) and comparing the estimator and true values (see Quantifying the accuracy of the estimated correlation parameters).

First, we assessed the validity of the confidence interval by examining how well the empirical coverage probability matches the confidence level as constructed. To do so, we constructed 90% confidence intervals for the (*ρ*_{N}, *ρ*_{D}) parameters via bootstrap resampling. We then grouped by the ground truth *ρ* values using a sliding window and counted the proportion of cases in that bin for which the bootstrap confidence interval contains the true value. We found that for both *ρ*_{N} and *ρ*_{D} the coverage is near the nominal level: for *ρ*_{N} it is slightly lower (Fig 3A left) while for *ρ*_{D} it is nearly equivalent (Fig 3A right). This indicates that the confidence intervals constructed via the bootstrap are valid and can be used for further analysis.
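The coverage check described above can be sketched on a simpler stand-in problem. Because fitting the pairwise RoG is beyond a short example, this sketch swaps the maximum likelihood estimate of *ρ* for the sample Pearson correlation of bivariate Gaussian data and uses a percentile bootstrap. The function names, sample sizes, and the percentile method are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def bootstrap_ci(x, y, n_boot=500, level=0.90, rng=None):
    """Percentile-bootstrap confidence interval for the Pearson correlation."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    idx = rng.integers(0, n, size=(n_boot, n))   # resample trials with replacement
    xb = x[idx] - x[idx].mean(axis=1, keepdims=True)
    yb = y[idx] - y[idx].mean(axis=1, keepdims=True)
    r = (xb * yb).sum(axis=1) / np.sqrt((xb ** 2).sum(axis=1) * (yb ** 2).sum(axis=1))
    lo, hi = np.quantile(r, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

def empirical_coverage(true_rho, n_datasets=200, n_trials=200, seed=0):
    """Fraction of synthetic datasets whose 90% CI contains the true value."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_rho], [true_rho, 1.0]])
    hits = 0
    for _ in range(n_datasets):
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n_trials)
        lo, hi = bootstrap_ci(xy[:, 0], xy[:, 1], rng=rng)
        hits += int(lo <= true_rho <= hi)
    return hits / n_datasets

coverage = empirical_coverage(true_rho=0.5)
```

If the intervals are well calibrated, `coverage` should land near the nominal 0.90, up to sampling error from the finite number of synthetic datasets. Repeating this over bins of the true value gives the kind of coverage curve described for Fig 3A.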

Plots were generated with 11628 synthetic parameter pairs with uniformly randomly generated *ρ*_{N}, *ρ*_{D} ∈ [−0.9, 0.9], contrast levels {6.25, 12.5, 25, 50, 100}, 1000 synthetic trials and 1000 bootstrap resamples. The left column are the results of the analysis for *ρ*_{N}, the right column for *ρ*_{D}. (A) Empirical coverage probability for the 90% confidence intervals as a function of the ground truth *ρ* values. The dotted line indicates the nominal confidence level. Coverage probability was computed as the proportion of cases with a specified range of *ρ* values for which the 90% confidence intervals contained the true value. We used a moving window with width 0.4 and a step size of 0.2. (B) Direct comparison between the true *ρ* value and the maximum likelihood estimator for the *ρ* value. The darker colors are the pairs for which the *ρ* parameters are significantly different from 0 (i.e., the 90% confidence interval excludes 0), whereas the lighter colors are not significant (i.e., the 90% confidence interval includes 0).

Next, we directly compared the true and inferred *ρ* values (Fig 3B). In general, the maximum likelihood estimators are largely unable to recover the true generating values (the overall Pearson correlation between the fit and true (*ρ*_{N}, *ρ*_{D}) = (0.48, 0.18), the mean squared error across pairs is (0.32, 0.57)), indicating that these parameters are not identifiable with the parametrization we considered (contrast tuning). Further, this analysis indicates that inference of *ρ*_{D} is in general more difficult than it is for *ρ*_{N}. The lack of identifiability of these parameters is likely due to the numerous multiplicative interactions between the parameters. For instance, by looking closer at Eq (7), we can see that the contribution of the *ρ* parameters to the noise correlation is multiplied by the respective standard deviations for the numerator or denominator variables. Such interactions may make it difficult to infer the exact value of the *ρ* parameters (see S6 Text for further discussion).

As we established the validity of the bootstrapped confidence intervals (Fig 3A), one could select pairs for which the parameter inference is accurate to the desired precision (i.e., selecting those pairs whose confidence intervals are a certain width). Additionally, one can also perform population-level analyses on the *ρ* parameters, as we demonstrate for experimental data (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1). Here, we introduce and validate another kind of analysis one can perform with the bootstrapped confidence intervals. Rather than the exact magnitude of the *ρ* parameters, we are often only interested in the *sign* of these parameters, as in deriving the relationship between normalization strength and noise correlations (Modulations of correlated variability depend on sharing of normalization). Moreover, because all the parameters in Eq (7) are *a priori* positive besides the *ρ* parameters, we reasoned that the accuracy of sign inference might be higher. Therefore, we considered the subset of cases where the parameters are significantly different from 0, which we define as cases where the 90% confidence interval does not include 0 and the pairwise goodness of fit is greater than the independent goodness of fit (see Cross-validated goodness of fit). First, for those pairs (for *ρ*_{N}, 4598/11628 pairs were significant, for *ρ*_{D}, 1625/11628), we see a much stronger relationship between the fit and true *ρ* values (Fig 3B, darker colors; Pearson correlation between significant fit and true (*ρ*_{N}, *ρ*_{D}) = (0.89, 0.57); mean squared error across pairs = (0.09,0.43)). 
Importantly, the plots also show that, for pairs with *ρ* parameters significantly different from 0, the sign of the inferred *ρ* parameter matches the sign of the true *ρ* parameter in the large majority of cases for both *ρ*_{N} and *ρ*_{D}, although far more pairs reach significance for *ρ*_{N} (for *ρ*_{N}, 4509/4598, about 98%, of pairs significantly different from 0 have the same sign; for *ρ*_{D}, 1309/1625, about 81%). From this, we conclude that, for these significant *ρ* parameters, the *sign* of the inferred *ρ* parameter is accurate.

In summary, our analysis indicates that it is difficult to estimate the precise value of the *ρ* parameters in general, for the stimulus parametrization considered here (i.e., the classical normalization model for contrast tuning). However, we have provided a method to calculate bootstrapped confidence intervals for the maximum likelihood estimators of the *ρ* parameters and have shown that these confidence intervals accurately represent the uncertainty around those estimates. We then demonstrated one possible use-case for these confidence intervals: for *ρ* estimates that are significantly different from 0, the *ρ* estimators are accurately able to recover the sign of the ground truth *ρ* parameters.

### Pairwise model improves single-trial inference of normalization strength even when noise correlations are small

In past work that connected normalization to modulation of noise correlations, stimulus and experimental manipulations (e.g., contrast, attention) were used as proxies for normalization strength [51, 52, 55] because normalization strength cannot be measured directly. However, these manipulations also affect other factors that drive neural responses (as we have illustrated in Modulations of correlated variability depend on sharing of normalization, Fig 2), which could confound their use as measures of the normalization signal. Therefore, quantitatively testing the relationship between noise correlations and normalization requires estimating the single-trial normalization strength for a pair of neurons. One of the advantages of our probabilistic, generative formulation of the pairwise RoG model (Eq (2)) is that it allows us to infer the single-trial normalization strength from measured neural activity (see Inference of single trial normalization from measured neural activity). The independent RoG model also provides an estimate for the single-trial normalization, which is known to be a valid estimator for the ground-truth normalization strength for data generated from the independent RoG [53]. We found similar results for the pairwise RoG, so we examined how the pairwise estimates for the single-trial normalization strength compare to estimates based on the independent model.

One possibility is that the estimate derived from the pairwise model would outperform the independent model as the magnitude of noise correlations increases. However, this is not necessarily the case. Fig 4A and 4B demonstrate this with two example synthetic neuron pairs. Because the single-trial normalization inference depends on the single-trial neural activity, correlations between neurons will induce correlations between the inferred normalization signals even for the independent RoG (Fig 4A). However, when noise correlations are small due to cancellation between *ρ*_{N} and *ρ*_{D}, the independent model will infer minimal correlation between normalization signals while the pairwise model will correctly infer that the single-trial normalization is correlated (Fig 4B).

(A and B) Two simulated experiments drawn from the pairwise RoG with different noise correlations arising from different underlying *ρ*_{N}, *ρ*_{D} values to compare the pairwise and independent estimators of single-trial normalization with the ground truth normalization signal. (A) has overall noise correlation of 0.21 across contrasts generated with *ρ*_{N} = 0, *ρ*_{D} = 0.3, (B) has overall noise correlation of -0.05 across contrasts generated with *ρ*_{N} = −0.3, *ρ*_{D} = 0.3. Z-scoring performed across trials. Random parameters drawn from *R*^{max} ∈ [10, 100], *ϵ* ∈ [15, 25], *α*_{N}, *α*_{D} ∈ [0.1, 1], *β*_{N}, *β*_{D} ∈ [1, 1.5], *η* ≡ 0. Contrast levels were {6.25, 12.5, 25, 50, 100}. (C and D) Comparison of the single-trial normalization inference in the pairwise and independent RoG models for the mean-squared error (C) and the correlation between the estimate and the true value (D), as it depends on *ρ*_{N}, *ρ*_{D} and the contrast levels. Left: lowest contrast level (6.25); middle: intermediate contrast level (25); right: full contrast (100). Each bin corresponds to the median difference between the pairwise and independent models across simulated experiments. (C and D) used 11628 synthetic pairs (see Generating realistic pairwise neural activity from the model).

To demonstrate this principle, we computed the difference of the mean squared error (Fig 4C) and correlations (Fig 4D) between the pairwise RoG and independent RoG normalization inference (with respect to the ground truth values) for many simulated pairs (11628) with systematically varying *ρ*_{N}, *ρ*_{D} values. First, we found that the distinction between the two models depended on the contrast level: as contrast levels increase, the quality of the pairwise and independent estimators of normalization signal become more distinct. Second, the improvement of the pairwise RoG estimate over the independent estimate increased when the magnitude of the *ρ*_{D} parameter increased. Consistent with the intuition provided by Fig 4B, the largest improvement occurred when the *ρ*_{N} parameter had a large value that was the opposite sign of *ρ*_{D}. The dependence on the *ρ*_{D} parameter reflects that the estimator for the pairwise model incorporates knowledge about correlation between the normalization signals.

In summary, this analysis shows that the pairwise model estimates of the single trial normalization can improve upon the independent model even when noise correlations are small. As the single-trial normalization estimator from the independent model was previously shown to be accurate [53], our results imply that the pairwise model estimate is also able to recover the ground-truth normalization strength. Additionally, we have outlined the conditions in which those estimates are preferable to those obtained with the independent model.

### Pairwise Ratio of Gaussians model captures correlated variability in mouse V1

To test how well the model captures experimental data, we applied it to calcium imaging data recorded in V1 of mice responding to sinusoidal gratings of varying contrast levels (see Data collection and processing). We analyzed neurons that were strongly driven by the visual stimuli (N = 295 neurons, 5528 simultaneously recorded pairs; see Data collection and processing for inclusion criteria). We focused on stimulus contrast tuning (Eq (8)) because the formulation of the corresponding standard normalization model captures firing rate data well [36], and visual contrast affects both normalization strength and the strength of noise correlations [41].

Because the RoG framework has not been validated before on mouse V1 fluorescence data, we first applied the independent RoG and found that it provided excellent fits (average cross-validated goodness of fit 0.85, 95% c.i [0.846,0.858], 5163/5528 pairs with goodness of fit >0.5) on par with that found in macaque V1 data recorded with electrode arrays [53], thus demonstrating that the RoG framework is flexible enough to capture datasets with different statistics. In both cases, the analysis was performed on visually responsive neurons, that therefore exhibited strong contrast tuning of the firing rate, partly explaining the high performance. However, the independent RoG could not capture correlated variability, which was prominent in the data (median noise correlation across all pairs and contrasts 0.117, c.i. [0.115, 0.120], 2781/5528 pairs had noise correlations significantly different from 0).

Therefore, we tested whether the pairwise RoG could capture correlated variability in the data. Fig 5A and 5B demonstrate that the model can capture contrast-dependent noise correlations, both for pairs with positive (example in Fig 5A; 4327/5528 pairs) and negative median noise correlations (example in Fig 5B; 1201/5528 pairs). Importantly, even though the *ρ*_{N}, *ρ*_{D} parameters were stimulus-independent, the pairwise model captured substantial changes in noise correlations with contrast for many of the pairs analyzed (2991/5528 had greater than 0.5 correlation between the observed noise correlations and model fit, across contrast levels). However, this ability to capture correlations comes at the cost of larger model complexity. To account for this, we next quantitatively compared the pairwise and independent models, using a cross-validated goodness of fit score (Eq (12)). The pairwise model slightly outperformed the independent model on average and for most pairs (median difference in goodness of fit = 0.0121, *p* < 0.001, 4123/5528 pairs with pairwise goodness of fit greater than independent), indicating that the additional free parameters are warranted. Furthermore, because the independent model is a special case of the pairwise model with noise correlations fixed at zero, we found, as expected, that the performance difference between the two models increased for pairs of neurons with larger noise correlations (see S7 Text).

(A and B) Pairwise neural responses for two example pairs of neurons in mouse V1 with positive (A) and negative (B) median noise correlations. From left to right: 1) empirical mean and covariance ellipses (∼1 standard deviation from the empirical mean) for pairwise responses at each contrast level; 2) the RoG model predicted means and covariance ellipses, the panel title includes the cross-validated goodness of fit score; 3) the modulated Gaussian (MG) predicted means and covariance ellipses; 4) compares the two model fit noise correlation values (continuous lines), with the empirical values as a function of contrast (error bars are 68% bootstrapped confidence interval). Neuronal pair in (A) had 93 repeats of each stimulus contrast, pair in (B) had 68 repeats. (C) Scatter plot across all pairs of the goodness of fit score for modulated Gaussian vs. the goodness of fit for the RoG. (D) Histogram of the difference between the scores in (C). Contrast levels are {2,8,16,32,64,80,100}.

To benchmark the RoG against a widely adopted alternative model, we considered the modulated Poisson model that was previously shown to capture noise correlations in macaque V1 [69]. For application to our imaging dataset and for a fair comparison with the RoG, we used Gaussian noise instead of Poisson [76] and termed this the modulated Gaussian (MG) model (see Model comparison). The example pairs demonstrate that, while in some pairs the MG can capture the modulation of noise correlations with contrast as well as the RoG (Fig 5A), it is not able to capture it in other pairs while the RoG can (Fig 5B). Across the dataset, for the majority of pairs (5238/5528), the pairwise RoG had a higher goodness of fit score (Eq (12)) than the MG (Fig 5C and 5D, median difference between goodness of fit for RoG and MG = 0.238, 95% c.i. [0.232,0.245]). These results were also largely independent of the specific preprocessing method applied to the calcium imaging data (see S4 Text). Moreover, although both models capture the tuning of noise correlations with contrast level by using stimulus-independent correlation parameters, the RoG model better predicts the trend in noise correlations with contrast than the MG (median difference in correlations between model fit noise correlations and empirical noise correlations = 0.0771, 95% c.i. [0.071, 0.084]; 1450/5528 pairs had statistically significant correlation between the pairwise RoG predictions and empirical noise correlations compared to 1008/5528 for the MG). In principle the MG model’s ability to capture the modulation of noise correlations with contrast could be improved by including contrast dependence in the correlation parameters explicitly, although this would increase model complexity.

These results demonstrate that the pairwise RoG captures a range of effects of stimulus contrast on noise correlations observed in experimental data and performs competitively against a popular alternative model that does not account for normalization explicitly.

Next, we analyzed the correlation parameters (*ρ*_{N}, *ρ*_{D}) in the model fit (Fig 6). We first only selected those pairs of neurons whose pairwise goodness of fit exceeded 0.5 and the independent goodness of fit measure (3920/5528 total), and we computed the bootstrapped 90% confidence interval for the (*ρ*_{N}, *ρ*_{D}) parameters (see Quantifying the accuracy of the estimated correlation parameters).

Histograms comparing the inferred *ρ*_{N} (A) and *ρ*_{D} (B) values for all neuronal pairs meeting our goodness of fit criteria (outlined) and the subset of those pairs significantly different from zero with 90% confidence (filled). The histograms for all pairs and pairs significantly different from 0 are normalized separately.

Examining the correlation parameters for all of the pairs meeting the goodness of fit criteria (Fig 6A and Fig 6B outlined), we see a significant bias towards positive values (median *ρ*_{N}, *ρ*_{D} parameter values = 0.84, 1). This is partially due to the large number of cases in which the fit parameters were exactly equal to ±1 (for *ρ*_{N}, *ρ*_{D}, 1442 and 2700 fit values were ±1). However, even when excluding these pairs, the trend within the population is still towards positive fit *ρ* values (median fit *ρ*_{N}, *ρ*_{D} values excluding extreme pairs is 0.29, 0.22). This analysis suggests that these signals are, on aggregate, shared among the population recorded; in particular, this suggests that normalization is typically shared between the pairs recorded.

As a complementary analysis, we then focused on the cases where the parameters were assessed to be significantly different from zero (1270/3920 for *ρ*_{N}, 192/3920 for *ρ*_{D}; Fig 6A and 6B, filled). The proportion of pairs for which the estimated *ρ* parameters are significantly different from 0 is similar to that in the synthetic data (see Inference of correlation parameters). For these significant pairs, we found that nearly all inferred *ρ*_{N} (1239/1270) and *ρ*_{D} (191/192) parameters were positive (see Fig 6), suggesting that normalization signals are generally shared for these pairs of neurons.
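The selection step described above (keeping a pair only when its confidence interval excludes zero, then reading off the sign) can be sketched as follows. This is an illustration under stated assumptions: it takes a list of bootstrap estimates of a *ρ* parameter as given and uses a simple percentile interval, whereas the actual resampling and refitting procedure is described in the Methods and is not reproduced here. The function names and the three example lists are hypothetical.

```python
def percentile_interval(samples, level=0.90):
    """Percentile interval from bootstrap estimates (assumed already computed)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * n)
    hi_idx = min(n - 1, int((1.0 - tail) * n))
    return ordered[lo_idx], ordered[hi_idx]


def classify_sign(samples, level=0.90):
    """Label a parameter by whether its interval lies above, below, or across 0."""
    lower, upper = percentile_interval(samples, level)
    if lower > 0:
        return "positive"
    if upper < 0:
        return "negative"
    return "undetermined"


# Hypothetical bootstrap estimates of a rho parameter for three pairs.
pair_a = [0.3 + 0.005 * i for i in range(100)]   # all estimates above 0
pair_b = [-0.5 + 0.01 * i for i in range(100)]   # estimates straddle 0
pair_c = [-0.9 + 0.005 * i for i in range(100)]  # all estimates below 0

for name, estimates in [("A", pair_a), ("B", pair_b), ("C", pair_c)]:
    lower, upper = percentile_interval(estimates)
    print(f"pair {name}: 90% interval [{lower:.2f}, {upper:.2f}] -> {classify_sign(estimates)}")
```

Only pairs labeled "positive" or "negative" would enter a sign-based analysis; pairs labeled "undetermined" are the ones for which the data do not constrain whether the signal is shared.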

In summary, our results demonstrate a new approach to quantify how strongly normalization signals are shared between neurons, and to explain the diverse effects of normalization on noise correlations.

## Discussion

We introduced a stochastic model of divisive normalization, the pairwise RoG, to characterize the trial-to-trial covariability between cortical neurons (i.e., noise correlations). The model provides excellent fits to calcium imaging recordings from mouse V1, capturing diverse effects of stimulus contrast and normalization strength on noise correlations (Figs 5 and 6). We demonstrated that the effect of normalization on noise correlations differs depending on the sources of the variability, and that the model can accommodate both increases and decreases in noise correlations with normalization (Fig 2) as past experiments had suggested. We then investigated the accuracy of inference of a key model parameter, which determines whether normalization is shared between neurons, and we provided a procedure for quantifying the uncertainty of this inference using bootstrapping (Fig 3). Lastly, we derived a Bayesian estimator for the single-trial normalization signals of simultaneously recorded pairs. Surprisingly, this estimator can be more accurate than the estimator based on the model that ignores noise correlations (the independent RoG) even when noise correlations are negligible (Fig 4).

As a descriptive, data analytic tool, our modeling framework complements normative and mechanistic theories of neural population variability. For instance, normative probabilistic accounts of sensory processing have suggested that divisive normalization may play a role in the inference of perceptual variables by modulating neural representations of uncertainty [15, 18, 75, 77, 86–88]. Similarly, normalization could play a key role in multisensory cue combination [39, 89, 90]. However, the posited effect of normalization on covariability has not been tested quantitatively, as normalization signals are often not measurable. The pairwise RoG will allow researchers to test these hypotheses by providing a means with which to estimate normalization signals from neural data and relate these to measures of neural covariability. In circuit-based models of neural dynamics such as the stabilized supralinear network [25] and the ORGaNICs architecture [91], the normalization computation emerges naturally from the network dynamics [26] and shapes the structure of stimulus-dependent noise correlations [92]. By quantifying the parametric relation between normalization and covariability, our descriptive tool will enable mapping those parameters onto the different circuit motifs and cell types posited by these network models.

When comparing the RoG to the modulated Gaussian model (see Model comparison), we found that the RoG had better performance for the majority of pairs (Fig 5C). We chose to adapt the modulated Poisson model [69] as a comparison to the RoG because it was shown to successfully capture noise correlations in recordings from macaque V1. Moreover, this model belongs to the class of Generalized Linear Models, which are among the most widely used encoding models for neural activity [27, 93]. There are numerous alternative descriptive models of correlated neural population activity, among the most popular of these being latent variable models (LVMs), in which population-wide activity arises from interactions between a small set of unobserved variables [28, 33, 94–97]. This effectively partitions the population noise covariance into underlying causes (i.e., latents) that are responsible for coordinating neural responses, which resembles our attribution of noise correlations to either shared input drive or shared normalization pools (Fig 2). The RoG, on the other hand, is a pairwise model that seeks to explicitly characterize neural interactions through divisive normalization, which cannot be done with any existing LVMs; integrating normalization into the LVM framework is an important future extension of our model. One benefit of our current approach is that the RoG can be applied to any scenario in which two or more neurons are simultaneously recorded. LVMs can only be applied to relatively large populations of simultaneously recorded neurons to estimate the globally shared latent factors. This is not always feasible for regions of the brain that are difficult to record from or using techniques such as intracellular voltage recordings [98, 99]. The downside of a method such as the RoG is scalability to large populations, as the model parameters must be optimized for each recorded pair, which can be computationally expensive for modern datasets with thousands of neurons [100]. 
Nonetheless, we were able to fit the RoG to data across multiple preprocessing methods (∼90000 pairs total) in a reasonable time (∼27 hours running in parallel on a 28-core server without GPU acceleration), suggesting that using the pairwise RoG on a large dataset remains practical.

Three models have directly studied the relationship between normalization and across-trial covariability [51, 52, 101]. Tripp’s [51] simulation work on velocity tuning in the middle temporal area (MT) consistently predicted that normalization would decorrelate neural responses. However, we found that noise correlations could also increase with normalization. This is because Tripp modeled correlations as arising solely from tuning similarity between neurons. Conversely, in the RoG framework, noise correlations originate from input correlations *ρ*_{N} and correlations between normalization signals *ρ*_{D}. Our model therefore offers more flexibility than Tripp’s by allowing the relationship between normalization and correlation to depend on the sources of correlations. Verhoef and Maunsell [52] investigated the effect of attention on noise correlations by using a recurrent network implementation of the normalization model of attention [49]. They describe multiple patterns of the effect of normalization on noise correlations, depending on the tuning similarity between a pair of neurons and on where attention is directed. Our model does not currently account for the effect of attention, but this would be possible by adapting the standard normalization model of attention, which would require an additional parameter for the attentional gain. These two prior models are also primarily simulation based, whereas our model is meant to be data analytic. Lastly, Ruff and Cohen [101] proposed a normalization model to explain how attention increases correlations between V1 and MT neurons [55]. They modeled the trial-averaged MT neural responses as a function of trial-averaged responses of pools of V1 neurons. After fitting the parameters, single-trial MT responses were predicted by feeding the pooled single-trial V1 responses into the equation.
By construction, variability in predicted MT neural responses only arises from variability in the V1 neural responses, which only occur in the numerator of their normalization model. Our model also allows for variability in the denominator of the normalization equation and therefore their model can be seen as a special case of the pairwise RoG.
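The distinction between variability in the numerator only and variability in both numerator and denominator can be illustrated with a toy simulation. This is a rough sketch, not the fitting procedure of the paper: it assumes each response is the ratio of a Gaussian numerator and a Gaussian denominator, with the two numerators correlated by *ρ*_{N} and the two denominators by *ρ*_{D}, and it omits any additive noise term. All means, standard deviations, and function names are hypothetical choices made for illustration.

```python
import math
import random


def correlated_gaussian_pair(rng, means, sds, rho):
    """Draw two Gaussian variables with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = means[0] + sds[0] * z1
    x2 = means[1] + sds[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_noise_correlation(rho_n, rho_d, sd_d, n_trials=20000, seed=0):
    """Noise correlation of two ratio responses for one fixed stimulus."""
    rng = random.Random(seed)
    r1, r2 = [], []
    for _ in range(n_trials):
        # Hypothetical numerator (input drive) statistics.
        n1, n2 = correlated_gaussian_pair(rng, (10.0, 10.0), (1.0, 1.0), rho_n)
        # Hypothetical denominator (normalization signal) statistics;
        # sd_d = 0 removes all denominator variability.
        d1, d2 = correlated_gaussian_pair(rng, (5.0, 5.0), (sd_d, sd_d), rho_d)
        r1.append(n1 / d1)
        r2.append(n2 / d2)
    return pearson(r1, r2)


numerator_only = simulate_noise_correlation(rho_n=0.5, rho_d=0.0, sd_d=0.0)
independent_norm = simulate_noise_correlation(rho_n=0.5, rho_d=0.0, sd_d=0.5)
shared_norm = simulate_noise_correlation(rho_n=0.5, rho_d=0.8, sd_d=0.5)

print(f"numerator variability only:      {numerator_only:.3f}")
print(f"independent normalization noise: {independent_norm:.3f}")
print(f"shared normalization noise:      {shared_norm:.3f}")
```

With a noiseless denominator the noise correlation simply reflects *ρ*_{N}. Adding denominator variability lowers the correlation when the normalization signals are independent and raises it when they are strongly shared, which is the qualitative dependence on the source of variability discussed above.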

An important limitation of our model is that the correlation parameters (*ρ*_{N}, *ρ*_{D}) are not identifiable (Fig 3): the model parametrization is such that multiple different parameter sets result in equivalent models (e.g., equivalent likelihoods and moments). This is a common issue with complex nonlinear models such as the one proposed here [102]; in our model, it is due to multiplicative interactions between model parameters. Improving parameter estimation will require better constraints on model parameters, alternative optimization algorithms, or different objective functions (see S6 Text for further discussion). Future extensions of the model to population-level interactions through latent variable models offer another avenue to improve parameter estimation: the variability of population activity is often low-dimensional, which could naturally impose parameter constraints. Nonetheless, we developed a method to calculate confidence intervals for the estimates of the *ρ* parameters, which can be used to select pairs for which the estimates’ uncertainty is less than a desired level. As an example application of this approach, we have demonstrated that the confidence intervals can be used to determine when the sign of those parameters, which is an important factor in controlling noise correlations (Fig 2), can be recovered accurately. We showed that we were able to recover the sign of the correlation accurately in synthetic datasets when the bootstrap confidence interval for the parameter of interest excluded 0. In the V1 dataset analyzed, we found that 1270/3920 and 192/3920 pairs meet this criterion for *ρ*_{N} and *ρ*_{D}, respectively, and that the vast majority of those pairs had positive *ρ*_{N}, *ρ*_{D}. Although this is a minority of cases, it demonstrates that typical datasets with existing recording technologies could nonetheless provide sufficient power for studies that focus on the *ρ* parameter values.
It will be important in future work to understand which experimental conditions would maximize the yield of pairs with accurate estimates of the *ρ* parameters.

We chose in this study to primarily analyze the normalized fluorescence traces (Δ*F*/*F*) rather than using deconvolution or spike inference methods (see [103–105] for a review). Deconvolution methods were developed in part due to the slow temporal dynamics of the calcium indicators relative to membrane potentials generating spiking activity [106, 107]. Deconvolution and other spike inference techniques attempt to mitigate this limitation for analyses that depend on more exact measures of spike timing, and developers note these methods should be avoided when temporal information is not relevant and the raw calcium traces provide “sufficient information” [104]. Because of the construction of the contrast detection task (see Data collection and processing) and the temporally invariant nature of contrast responses in V1 [108], the analysis of the dataset presented here does not require precise temporal information, so the use of normalized fluorescence traces was sufficient. Additionally, deconvolution changes the statistics of the data greatly, such as altering the distribution of noise correlations and increasing the sparsity of the fluorescence signal [109]. One recent work attempted to account for these differences by using more appropriate probabilistic models [110] but does not currently model noise correlations. On the other hand, calcium fluorescence is an indirect measure of neuronal communication and coding, being related to the underlying action potentials through a complex generative model [111, 112]. As such, it might be inappropriate or insufficient to apply an encoding model directly to the Δ*F*/*F* traces, as we have done here. To address this concern, we additionally analyzed deconvolved traces using two variants of the OASIS method [113]: unconstrained OASIS as found in suite2p [83], or OASIS with an *ℓ*_{1} sparsity constraint as in [114]. 
As expected, the deconvolution techniques significantly altered the distribution of noise correlations, but the results of our analysis of these deconvolved data were qualitatively in line with the results obtained on the raw calcium traces (see S4 Text).

The generality of the modeling framework presented here leaves room for future expansion. One such direction would be to increase the dimensionality to model correlations among a neural population. This would require more correlation parameters, which could make the model more difficult to fit to data. However, because population variability is low-dimensional [84, 115–119], this issue could likely be circumvented by applying dimensionality reduction techniques within the model or by sharing correlation parameters across a neural population. Another interesting application of this model would be to examine directly the effects of normalization on information transmission and representation. The relationship between noise correlations and the amount of information that can be represented by a neural population has been widely discussed [7–9, 11, 12, 120]. Moreover, some experimental and theoretical work has connected modulations of information in neural populations with computations that have been modeled with normalization models, such as surround suppression and attention [44, 51, 121, 122]. Our model could be modified to investigate this connection and further illuminate the effects of normalization on information transmission.

## Supporting information

### S1 Text. Derivation of moments for the generalized model.

Moments of the Ratio of Gaussians distribution for the general case of cross-correlations between numerator and denominator.

https://doi.org/10.1371/journal.pcbi.1011667.s001

(PDF)

### S2 Text. Derivation of negative log-likelihood for the model.

Details of the calculation of the negative-log likelihood for the Ratio of Gaussians distribution.

https://doi.org/10.1371/journal.pcbi.1011667.s002

(PDF)

### S3 Text. Negative log-posterior for inference of single-trial normalization strength.

Expands on Inference of single trial normalization from measured neural activity, Eq (14), showing the coefficient expressions.

https://doi.org/10.1371/journal.pcbi.1011667.s003

(PDF)

### S4 Text. Analysis of deconvolved imaging data.

Examining model performance on deconvolved fluorescence traces.

https://doi.org/10.1371/journal.pcbi.1011667.s004

(PDF)

### S5 Text. Derivation of relationship between mean normalization strength and noise correlations.

Mathematical derivations relating noise correlations and normalization in the model.

https://doi.org/10.1371/journal.pcbi.1011667.s005

(PDF)

### S6 Text. Further discussion of parameter identifiability.

Additional considerations for *ρ* parameter estimation from data.

https://doi.org/10.1371/journal.pcbi.1011667.s006

(PDF)

### S7 Text. Pairwise model outperforms the independent model in simulations and V1 data when noise correlations are large.

Comparison of the goodness of fit of the pairwise and independent Ratio of Gaussians models.

https://doi.org/10.1371/journal.pcbi.1011667.s007

(PDF)

### S1 Fig. Relationship between noise correlations and Ratio of Gaussians parameters.

Noise correlations in the model (Eq (7) in Methods subsection Generative model—pairwise Ratio of Gaussians (RoG)) can be modulated by stimulus strength (i.e., contrast), the correlation parameters of the model (*ρ*_{N}, *ρ*_{D}) and the parameters of the normalization model, in this case and (*ϵ*_{1}, *ϵ*_{2}) (see Fig 2). To understand these effects in isolation, we looked at how noise correlations (Eq (7)) changed with respect to each parameter, while keeping the other parameters constant. We illustrate with noise correlations that increase with contrast (A1-E1), and correlations that decrease with contrast (A2-E2). (A) Dependence of noise correlations on contrast. Three contrast levels that are fixed in the other panels are shown. (B) Dependence of noise correlations on *ρ*_{N}. (C) Dependence of noise correlations on *ρ*_{D}. (D) Dependence of noise correlations on , shown as a contour plot with the shade of color indicating noise correlation level. Different colors indicate different contrast levels as shown in the legend. (E) Dependence of noise correlations on (*ϵ*_{1}, *ϵ*_{2}).

(A1-E1) uses the following parameters (when not fixed): (A2-E2) uses the same parameters except with (*ρ*_{N}, *ρ*_{D}) = (0.5, 0). Contrast levels were {1,…,100}.

https://doi.org/10.1371/journal.pcbi.1011667.s008

(PDF)

### S2 Fig. Relationship between noise correlations and denominator strength.

Expands upon Fig 2 (see Results subsection Modulations of correlated variability depend on sharing of normalization) to include cases where (*ρ*_{N}, *ρ*_{D}) can be negative or have opposite signs. The figure was created using exactly the same method and synthetic dataset as Fig 2: see the caption in the main text for details.

https://doi.org/10.1371/journal.pcbi.1011667.s009

(PDF)

## Acknowledgments

We thank members of the Kohn and Coen-Cagli laboratories for feedback on the manuscript. We also thank Daniel Quintana for help with animal training and Kenny Ye for advice on statistical analyses.

## References

- 1. Shadlen MN, Newsome WT. The Variable Discharge of Cortical Neurons: Implications for Connectivity, Computation, and Information Coding. Journal of Neuroscience. 1998;18(10):3870–3896. pmid:9570816
- 2. Tolhurst DJ, Movshon JA, Dean AF. The Statistical Reliability of Signals in Single Neurons in Cat and Monkey Visual Cortex. Vision Research. 1983;23(8):775–785. pmid:6623937
- 3. Cohen MR, Kohn A. Measuring and Interpreting Neuronal Correlations. Nature Neuroscience. 2011;14(7):811–819. pmid:21709677
- 4. Rumyantsev OI, Lecoq JA, Hernandez O, Zhang Y, Savall J, Chrapkiewicz R, et al. Fundamental Bounds on the Fidelity of Sensory Cortical Coding. Nature. 2020;580(7801):100–105. pmid:32238928
- 5. Kafashan M, Jaffe AW, Chettih SN, Nogueira R, Arandia-Romero I, Harvey CD, et al. Scaling of Sensory Information in Large Neural Populations Shows Signatures of Information-Limiting Correlations. Nature Communications. 2021;12(1):473. pmid:33473113
- 6. Bartolo R, Saunders RC, Mitz AR, Averbeck BB. Information-Limiting Correlations in Large Neural Populations. The Journal of Neuroscience. 2020;40(8):1668–1678. pmid:31941667
- 7. Abbott LF, Dayan P. The Effect of Correlated Variability on the Accuracy of a Population Code. Neural Computation. 1999;11(1):91–101. pmid:9950724
- 8. Averbeck BB, Lee D. Effects of Noise Correlations on Information Encoding and Decoding. Journal of Neurophysiology. 2006;95(6):3633–3644. pmid:16554512
- 9. Kohn A, Coen-Cagli R, Kanitscheider I, Pouget A. Correlations and Neuronal Population Information. Annual Review of Neuroscience. 2016;39(1):237–256. pmid:27145916
- 10. Hu Y, Zylberberg J, Shea-Brown E. The Sign Rule and Beyond: Boundary Effects, Flexibility, and Noise Correlations in Neural Population Codes. PLoS Computational Biology. 2014;10(2):e1003469. pmid:24586128
- 11. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget A. Information-Limiting Correlations. Nature Neuroscience. 2014;17(10):1410–1417. pmid:25195105
- 12. Zohary E, Shadlen MN, Newsome WT. Correlated Neuronal Discharge Rate and Its Implications for Psychophysical Performance. Nature. 1994;370(6485):140–143. pmid:8022482
- 13. Kanitscheider I, Coen-Cagli R, Pouget A. Origin of Information-Limiting Noise Correlations. Proceedings of the National Academy of Sciences. 2015;112(50). pmid:26621747
- 14. Panzeri S, Moroni M, Safaai H, Harvey CD. The Structures and Functions of Correlations in Neural Population Codes. Nature Reviews Neuroscience. 2022;23(9):551–567. pmid:35732917
- 15. Bányai M, Lazar A, Klein L, Klon-Lipok J, Stippinger M, Singer W, et al. Stimulus Complexity Shapes Response Correlations in Primary Visual Cortex. Proceedings of the National Academy of Sciences. 2019;116(7):2723–2732. pmid:30692266
- 16. Berkes P, Orbán G, Lengyel M, Fiser J. Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment. Science (New York, NY). 2011;331(6013):83–87. pmid:21212356
- 17. Orbán G, Berkes P, Fiser J, Lengyel M. Neural Variability and Sampling-Based Probabilistic Representations in the Visual Cortex. Neuron. 2016;92(2):530–543. pmid:27764674
- 18. Bányai M, Orbán G. Noise Correlations and Perceptual Inference. Current Opinion in Neurobiology. 2019;58:209–217. pmid:31593872
- 19. Haefner RM, Berkes P, Fiser J. Perceptual Decision-Making as Probabilistic Inference by Neural Sampling. Neuron. 2016;90(3):649–660. pmid:27146267
- 20. Lange RD, Haefner RM. Characterizing and Interpreting the Influence of Internal Variables on Sensory Activity. Current Opinion in Neurobiology. 2017;46:84–89. pmid:28841439
- 21. Lange RD, Haefner RM. Task-Induced Neural Covariability as a Signature of Approximate Bayesian Learning and Inference. PLOS Computational Biology. 2022;18(3):e1009557. pmid:35259152
- 22. Bondy AG, Haefner RM, Cumming BG. Feedback Determines the Structure of Correlated Variability in Primary Visual Cortex. Nature Neuroscience. 2018;21(4):598–606. pmid:29483663
- 23. Doiron B, Litwin-Kumar A, Rosenbaum R, Ocker GK, Josić K. The Mechanics of State-Dependent Neural Correlations. Nature Neuroscience. 2016;19(3):383–393. pmid:26906505
- 24. Litwin-Kumar A, Doiron B. Slow Dynamics and High Variability in Balanced Cortical Networks with Clustered Connections. Nature Neuroscience. 2012;15(11):1498–1505. pmid:23001062
- 25. Hennequin G, Ahmadian Y, Rubin DB, Lengyel M, Miller KD. The Dynamical Regime of Sensory Cortex: Stable Dynamics around a Single Stimulus-Tuned Attractor Account for Patterns of Noise Variability. Neuron. 2018;98(4):846–860.e5. pmid:29772203
- 26. Heeger DJ, Zemlianova KO. A Recurrent Circuit Implements Normalization, Simulating the Dynamics of V1 Activity. Proceedings of the National Academy of Sciences. 2020;117(36):22494–22505. pmid:32843341
- 27. Pillow JW, Shlens J, Paninski L, Sher A, Litke AM, Chichilnisky EJ, et al. Spatio-Temporal Correlations and Visual Signalling in a Complete Neuronal Population. Nature. 2008;454(7207):995–999. pmid:18650810
- 28. Archer EW, Koster U, Pillow JW, Macke JH. Low-Dimensional Models of Neural Population Activity in Sensory Cortical Circuits. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, editors. Advances in Neural Information Processing Systems. vol. 27. Curran Associates, Inc.; 2014.
- 29. Gardella C, Marre O, Mora T. Modeling the Correlated Activity of Neural Populations: A Review. Neural Computation. 2019;31(2):233–269. pmid:30576613
- 30. Schneidman E, Berry MJ, Segev R, Bialek W. Weak Pairwise Correlations Imply Strongly Correlated Network States in a Neural Population. Nature. 2006;440(7087):1007–1012. pmid:16625187
- 31. Granot-Atedgi E, Tkačik G, Segev R, Schneidman E. Stimulus-Dependent Maximum Entropy Models of Neural Population Codes. PLoS Computational Biology. 2013;9(3):e1002922. pmid:23516339
- 32. Zhao Y, Park IM. Variational Latent Gaussian Process for Recovering Single-Trial Dynamics from Population Spike Trains. Neural Computation. 2017;29(5):1293–1316. pmid:28333587
- 33. Sokoloski S, Aschner A, Coen-Cagli R. Modelling the Neural Code in Large Populations of Correlated Neurons. eLife. 2021;10:e64615. pmid:34608865
- 34. Josić K, Shea-Brown E, Doiron B, De La Rocha J. Stimulus-Dependent Correlations and Population Codes. Neural Computation. 2009;21(10):2774–2804. pmid:19635014
- 35. Carandini M, Heeger DJ. Normalization as a Canonical Neural Computation. Nature Reviews Neuroscience. 2012;13(1):51–62.
- 36. Heeger DJ. Normalization of Cell Responses in Cat Striate Cortex. Visual Neuroscience. 1992;9(2):181–197. pmid:1504027
- 37. Albrecht DG, Geisler WS. Motion Selectivity and the Contrast-Response Function of Simple Cells in the Visual Cortex. Visual Neuroscience. 1991;7(6):531–546. pmid:1772804
- 38. Louie K, Khaw MW, Glimcher PW. Normalization Is a General Neural Mechanism for Context-Dependent Decision Making. Proceedings of the National Academy of Sciences. 2013;110(15):6139–6144. pmid:23530203
- 39. Ohshiro T, Angelaki DE, DeAngelis GC. A Normalization Model of Multisensory Integration. Nature Neuroscience. 2011;14(6):775–782. pmid:21552274
- 40. Olsen SR, Bhandawat V, Wilson RI. Divisive Normalization in Olfactory Population Codes. Neuron. 2010;66(2):287–299. pmid:20435004
- 41. Kohn A, Smith MA. Stimulus Dependence of Neuronal Correlation in Primary Visual Cortex of the Macaque. The Journal of Neuroscience. 2005;25(14):3661–3673. pmid:15814797
- 42. Liu LD, Haefner RM, Pack CC. A Neural Basis for the Spatial Suppression of Visual Motion Perception. eLife. 2016;5:e16167. pmid:27228283
- 43. Snyder AC, Morais MJ, Kohn A, Smith MA. Correlations in V1 Are Reduced by Stimulation Outside the Receptive Field. Journal of Neuroscience. 2014;34(34):11222–11227. pmid:25143603
- 44. Henry CA, Kohn A. Spatial Contextual Effects in Primary Visual Cortex Limit Feature Representation under Crowding. Nature Communications. 2020;11(1):1687. pmid:32245941
- 45. Cohen MR, Maunsell JHR. Attention Improves Performance Primarily by Reducing Interneuronal Correlations. Nature Neuroscience. 2009;12(12):1594–1600. pmid:19915566
- 46. Ruff DA, Cohen MR. Attention Can Either Increase or Decrease Spike Count Correlations in Visual Cortex. Nature Neuroscience. 2014;17(11):1591–1597. pmid:25306550
- 47. Mitchell JF, Sundberg KA, Reynolds JH. Spatial Attention Decorrelates Intrinsic Activity Fluctuations in Macaque Area V4. Neuron. 2009;63(6):879–888. pmid:19778515
- 48. Cavanaugh JR, Bair W, Movshon JA. Nature and Interaction of Signals From the Receptive Field Center and Surround in Macaque V1 Neurons. Journal of Neurophysiology. 2002;88(5):2530–2546. pmid:12424292
- 49. Reynolds JH, Heeger DJ. The Normalization Model of Attention. Neuron. 2009;61(2):168–185. pmid:19186161
- 50. Coen-Cagli R, Dayan P, Schwartz O. Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics. PLoS Computational Biology. 2012;8(3):e1002405. pmid:22396635
- 51. Tripp BP. Decorrelation of Spiking Variability and Improved Information Transfer Through Feedforward Divisive Normalization. Neural Computation. 2012;24(4):867–894. pmid:22168562
- 52. Verhoef BE, Maunsell JHR. Attention-Related Changes in Correlated Neuronal Activity Arise from Normalization Mechanisms. Nature Neuroscience. 2017;20(7):969–977. pmid:28553943
- 53. Coen-Cagli R, Solomon SS. Relating Divisive Normalization to Neuronal Response Variability. The Journal of Neuroscience. 2019;39(37):7344–7356. pmid:31387914
- 54. Sawada T, Petrov AA. The Divisive Normalization Model of V1 Neurons: A Comprehensive Comparison of Physiological Data and Model Predictions. Journal of Neurophysiology. 2017;118(6):3051–3091. pmid:28835531
- 55. Ruff DA, Cohen MR. Stimulus Dependence of Correlated Variability across Cortical Areas. Journal of Neuroscience. 2016;36(28):7546–7556. pmid:27413163
- 56. Pillow JW, Simoncelli EP. Dimensionality Reduction in Neural Models: An Information-Theoretic Generalization of Spike-Triggered Average and Covariance Analysis. Journal of Vision. 2006;6(4):9. pmid:16889478
- 57. Díaz-Francés E, Rubio FJ. On the Existence of a Normal Approximation to the Distribution of the Ratio of Two Independent Normal Random Variables. Statistical Papers. 2013;54(2):309–323.
- 58. Hayya J, Armstrong D, Gressis N. A Note on the Ratio of Two Normally Distributed Variables. Management Science. 1975;21(11):1338–1341.
- 59. Marsaglia G. Ratios of Normal Variables. Journal of Statistical Software. 2006;16(4).
- 60. Pham-Gia T, Turkkan N, Marchand E. Density of the Ratio of Two Normal Random Variables and Applications. Communications in Statistics—Theory and Methods. 2006;35(9):1569–1591.
- 61. Ver Hoef JM. Who Invented the Delta Method? The American Statistician. 2012;66(2):124–127.
- 62. Baxley RJ, Walkenhorst BT, Acosta-Marum G. Complex Gaussian ratio distribution with applications for error rate calculation in fading channels with imperfect CSI. In: 2010 IEEE Global Telecommunications Conference GLOBECOM 2010. IEEE; 2010. p. 1–5.
- 63. Li Y, He Q. On the ratio of two correlated complex Gaussian random variables. IEEE Communications Letters. 2019;23(12):2172–2176.
- 64. Kronmal RA. Spurious Correlation and the Fallacy of the Ratio Standard Revisited. Journal of the Royal Statistical Society Series A (Statistics in Society). 1993;156(3):379.
- 65. Albrecht DG, Hamilton DB. Striate Cortex of Monkey and Cat: Contrast Response Function. Journal of Neurophysiology. 1982;48(1):217–237. pmid:7119846
- 66. Clatworthy PL, Chirimuuta M, Lauritzen JS, Tolhurst DJ. Coding of the Contrasts in Natural Images by Populations of Neurons in Primary Visual Cortex (V1). Vision Research. 2003;43(18):1983–2001. pmid:12831760
- 67. Geisler WS, Albrecht DG. Cortical Neurons: Isolation of Contrast Gain Control. Vision Research. 1992;32(8):1409–1410. pmid:1455713
- 68. Carandini M, Heeger DJ, Movshon JA. Linearity and Normalization in Simple Cells of the Macaque Primary Visual Cortex. The Journal of Neuroscience. 1997;17(21):8621–8644. pmid:9334433
- 69. Goris RLT, Movshon JA, Simoncelli EP. Partitioning Neuronal Variability. Nature Neuroscience. 2014;17(6):858–865. pmid:24777419
- 70. Gur M, Beylin A, Snodderly DM. Response Variability of Neurons in Primary Visual Cortex (V1) of Alert Monkeys. The Journal of Neuroscience. 1997;17(8):2914–2920. pmid:9092612
- 71. Ponce-Alvarez A, Thiele A, Albright TD, Stoner GR, Deco G. Stimulus-Dependent Variability and Noise Correlations in Cortical MT Neurons. Proceedings of the National Academy of Sciences. 2013;110(32):13162–13167. pmid:23878209
- 72. Sadagopan S, Ferster D. Feedforward Origins of Response Variability Underlying Contrast Invariant Orientation Tuning in Cat Visual Cortex. Neuron. 2012;74(5):911–923. pmid:22681694
- 73. Joe H, Xu JJ. The Estimation Method of Inference Functions for Margins for Multivariate Models. Faculty Research and Publications; 1996. Available from: https://open.library.ubc.ca/collections/facultyresearchandpublications/52383/items/1.0225985.
- 74. Berkes P, Wood F, Pillow J. Characterizing Neural Dependencies with Copula Models. In: Koller D, Schuurmans D, Bengio Y, Bottou L, editors. Advances in Neural Information Processing Systems. vol. 21. Curran Associates, Inc.; 2008.
- 75. Coen-Cagli R, Kohn A, Schwartz O. Flexible Gating of Contextual Influences in Natural Vision. Nature Neuroscience. 2015;18(11):1648–1655. pmid:26436902
- 76. Aljadeff J, Lansdell BJ, Fairhall AL, Kleinfeld D. Analysis of Neuronal Spike Trains, Deconstructed. Neuron. 2016;91(2):221–259. pmid:27477016
- 77. Hénaff OJ, Boundy-Singer ZM, Meding K, Ziemba CM, Goris RLT. Representation of Visual Uncertainty through Neural Gain Variability. Nature Communications. 2020;11(1):2513. pmid:32427825
- 78. Sturmfels B. Solving Systems of Polynomial Equations. No. 97 in CBMS Regional Conference Series in Mathematics. Providence, R.I: Conference Board of the Mathematical Sciences; 2002.
- 79. Wekselblatt JB, Flister ED, Piscopo DM, Niell CM. Large-Scale Imaging of Cortical Dynamics during Sensory Perception and Behavior. Journal of Neurophysiology. 2016;115(6):2852–2866. pmid:26912600
- 80. Sridharan S, Gajowa MA, Ogando MB, Jagadisan UK, Abdeladim L, Sadahiro M, et al. High-Performance Microbial Opsins for Spatially and Temporally Precise Perturbations of Large Neuronal Networks. Neuron. 2022;110(7):1139–1155.e6. pmid:35120626
- 81. Bounds HA, Sadahiro M, Hendricks WD, Gajowa M, Gopakumar K, Quintana D, et al. Ultra-Precise All-Optical Manipulation of Neural Circuits with Multifunctional Cre-dependent Transgenic Mice. bioRxiv: the preprint server for biology. 2022.
- 82. Peirce J, Gray JR, Simpson S, MacAskill M, Höchenberger R, Sogo H, et al. PsychoPy2: Experiments in Behavior Made Easy. Behavior Research Methods. 2019;51(1):195–203. pmid:30734206
- 83. Pachitariu M, Stringer C, Dipoppa M, Schröder S, Rossi LF, Dalgleish H, et al. Suite2p: Beyond 10,000 Neurons with Standard Two-Photon Microscopy. bioRxiv: the preprint server for biology. 2017.
- 84. Lin IC, Okun M, Carandini M, Harris KD. The Nature of Shared Cortical Variability. Neuron. 2015;87(3):644–656. pmid:26212710
- 85. Rikhye RV, Sur M. Spatial Correlations in Natural Scenes Modulate Response Reliability in Mouse Visual Cortex. Journal of Neuroscience. 2015;35(43):14661–14680. pmid:26511254
- 86. Beck JM, Latham PE, Pouget A. Marginalization in Neural Circuits with Divisive Normalization. The Journal of Neuroscience. 2011;31(43):15310–15319. pmid:22031877
- 87. Dehaene GP, Coen-Cagli R, Pouget A. Investigating the Representation of Uncertainty in Neuronal Circuits. PLOS Computational Biology. 2021;17(2):e1008138. pmid:33577553
- 88. Festa D, Aschner A, Davila A, Kohn A, Coen-Cagli R. Neuronal Variability Reflects Probabilistic Inference Tuned to Natural Image Statistics. Nature Communications. 2021;12(1):3635. pmid:34131142
- 89. Hayashi T, Kato Y, Nozaki D. Divisively Normalized Integration of Multisensory Error Information Develops Motor Memories Specific to Vision and Proprioception. The Journal of Neuroscience. 2020;40(7):1560–1570. pmid:31924610
- 90. Ohshiro T, Angelaki DE, DeAngelis GC. A Neural Signature of Divisive Normalization at the Level of Multisensory Integration in Primate Cortex. Neuron. 2017;95(2):399–411.e8. pmid:28728025
- 91. Heeger DJ, Mackey WE. Oscillatory Recurrent Gated Neural Integrator Circuits (ORGaNICs), a Unifying Theoretical Framework for Neural Dynamics. Proceedings of the National Academy of Sciences. 2019;116(45):22783–22794. pmid:31636212
- 92. Echeveste R, Aitchison L, Hennequin G, Lengyel M. Cortical-like Dynamics in Recurrent Circuits Optimized for Sampling-Based Probabilistic Inference. Nature Neuroscience. 2020;23(9):1138–1149. pmid:32778794
- 93. Paninski L, Pillow J, Lewi J. Statistical Models for Neural Encoding, Decoding, and Optimal Stimulus Design. In: Cisek P, Drew T, Kalaska JF, editors. Computational Neuroscience: Theoretical Insights into Brain Function. vol. 165 of Progress in Brain Research. Elsevier; 2007. p. 493–507.
- 94. Yu BM, Cunningham JP, Santhanam G, Ryu SI, Shenoy KV, Sahani M. Gaussian-Process Factor Analysis for Low-Dimensional Single-Trial Analysis of Neural Population Activity. Journal of Neurophysiology. 2009;102(1):614–635. pmid:19357332
- 95. Ecker AS, Berens P, Cotton RJ, Subramaniyan M, Denfield GH, Cadwell CR, et al. State Dependence of Noise Correlations in Macaque Primary Visual Cortex. Neuron. 2014;82(1):235–248. pmid:24698278
- 96. Whiteway MR, Butts DA. The Quest for Interpretable Models of Neural Population Activity. Current Opinion in Neurobiology. 2019;58:86–93. pmid:31426024
- 97. Whiteway MR, Socha K, Bonin V, Butts DA. Characterizing the Nonlinear Structure of Shared Variability in Cortical Neuron Populations Using Latent Variable Models. Neurons, behavior, data analysis and theory. 2019;3(1). pmid:31592129
- 98. Kodandaramaiah SB, Flores FJ, Holst GL, Singer AC, Han X, Brown EN, et al. Multi-Neuron Intracellular Recording in Vivo via Interacting Autopatching Robots. eLife. 2018;7:e24656. pmid:29297466
- 99. Hurwitz C, Kudryashova N, Onken A, Hennig MH. Building Population Models for Large-Scale Neural Recordings: Opportunities and Pitfalls. Current Opinion in Neurobiology. 2021;70:64–73. pmid:34411907
- 100. Stevenson IH, Kording KP. How Advances in Neural Recording Affect Data Analysis. Nature Neuroscience. 2011;14(2):139–142. pmid:21270781
- 101. Ruff DA, Cohen MR. A Normalization Model Suggests That Attention Changes the Weighting of Inputs between Visual Areas. Proceedings of the National Academy of Sciences. 2017;114(20):E4085–E4094. pmid:28461501
- 102. Audoly S, Bellu G, D’Angio L, Saccomani MP, Cobelli C. Global Identifiability of Nonlinear Models of Biological Systems. IEEE Transactions on Biomedical Engineering. 2001;48(1):55–65. pmid:11235592
- 103. Pnevmatikakis EA. Analysis Pipelines for Calcium Imaging Data. Current Opinion in Neurobiology. 2019;55:15–21. pmid:30529147
- 104. Stringer C, Pachitariu M. Computational Processing of Neural Recordings from Calcium Imaging Data. Current Opinion in Neurobiology. 2019;55:22–31. pmid:30530255
- 105. Evans MH, Petersen RS, Humphries MD. On the Use of Calcium Deconvolution Algorithms in Practical Contexts. bioRxiv: the preprint server for biology. 2020
- 106. Yaksi E, Friedrich RW. Reconstruction of Firing Rate Changes across Neuronal Populations by Temporally Deconvolved Ca2+ Imaging. Nature Methods. 2006;3(5):377–383. pmid:16628208
- 107. Benisty H, Song A, Mishne G, Charles AS. Data Processing of Functional Optical Microscopy for Neuroscience; 2022.
- 108. Albrecht DG, Geisler WS, Frazor RA, Crane AM. Visual Cortex Neurons of Monkeys and Cats: Temporal Dynamics of the Contrast Response Function. Journal of Neurophysiology. 2002;88(2):888–913. pmid:12163540
- 109. Rupasinghe A, Francis N, Liu J, Bowen Z, Kanold PO, Babadi B. Direct Extraction of Signal and Noise Correlations from Two-Photon Calcium Imaging of Ensemble Neuronal Activity. eLife. 2021;10:e68046. pmid:34180397
- 110. Wei XX, Zhou D, Grosmark A, Ajabi Z, Sparks F, Zhou P, et al. A Zero-Inflated Gamma Model for Post-Deconvolved Calcium Imaging Traces. bioRxiv: the preprint server for biology. 2019
- 111. Vogelstein JT, Packer AM, Machado TA, Sippy T, Babadi B, Yuste R, et al. Fast Nonnegative Deconvolution for Spike Train Inference From Population Calcium Imaging. Journal of Neurophysiology. 2010;104(6):3691–3704. pmid:20554834
- 112. Triplett MA, Goodhill GJ. Probabilistic Encoding Models for Multivariate Neural Data. Frontiers in Neural Circuits. 2019;13:1. pmid:30745864
- 113. Friedrich J, Zhou P, Paninski L. Fast Online Deconvolution of Calcium Imaging Data. PLOS Computational Biology. 2017;13(3):1–26. pmid:28291787
- 114. Lyall EH, Mossing DP, Pluta SR, Chu YW, Dudai A, Adesnik H. Synthesis of a Comprehensive Population Code for Contextual Features in the Awake Sensory Cortex. eLife. 2021;10:e62687. pmid:34723796
- 115. Cunningham JP, Yu BM. Dimensionality Reduction for Large-Scale Neural Recordings. Nature Neuroscience. 2014;17(11):1500–1509. pmid:25151264
- 116. Huang C, Ruff DA, Pyle R, Rosenbaum R, Cohen MR, Doiron B. Circuit Models of Low-Dimensional Shared Variability in Cortical Networks. Neuron. 2019;101(2):337–348.e4. pmid:30581012
- 117. Rabinowitz NC, Goris RL, Cohen M, Simoncelli EP. Attention Stabilizes the Shared Gain of V4 Populations. eLife. 2015;4:e08998. pmid:26523390
- 118. Schölvinck ML, Saleem AB, Benucci A, Harris KD, Carandini M. Cortical State Determines Global Variability and Correlations in Visual Cortex. Journal of Neuroscience. 2015;35(1):170–178. pmid:25568112
- 119. Umakantha A, Morina R, Cowley BR, Snyder AC, Smith MA, Yu BM. Bridging Neuronal Correlations and Dimensionality Reduction. Neuron. 2021;109(17):2740–2754.e12. pmid:34293295
- 120. Averbeck BB, Latham PE, Pouget A. Neural Correlations, Population Coding and Computation. Nature Reviews Neuroscience. 2006;7(5):358–366. pmid:16760916
- 121. Kanashiro T, Ocker GK, Cohen MR, Doiron B. Attentional Modulation of Neuronal Variability in Circuit Models of Cortex. eLife. 2017;6:e23978. pmid:28590902
- 122. Ringach DL. Population Coding under Normalization. Vision Research. 2010;50(22):2223–2232. pmid:20034510