
Modeling the diverse effects of divisive normalization on noise correlations

  • Oren Weiss,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America

  • Hayley A. Bounds,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America

  • Hillel Adesnik,

    Roles Funding acquisition, Resources, Supervision

    Affiliations Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America, Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, United States of America

  • Ruben Coen-Cagli

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America, Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America, Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York, United States of America


Divisive normalization, a prominent descriptive model of neural activity, is employed by theories of neural coding across many different brain areas. Yet, the relationship between normalization and the statistics of neural responses beyond single neurons remains largely unexplored. Here we focus on noise correlations, a widely studied pairwise statistic, because its stimulus and state dependence plays a central role in neural coding. Existing models of covariability typically ignore normalization despite empirical evidence suggesting it affects correlation structure in neural populations. We therefore propose a pairwise stochastic divisive normalization model that accounts for the effects of normalization and other factors on covariability. We first show that normalization modulates noise correlations in qualitatively different ways depending on whether normalization is shared between neurons, and we discuss how to infer when normalization signals are shared. We then apply our model to calcium imaging data from mouse primary visual cortex (V1), and find that it accurately fits the data, often outperforming a popular alternative model of correlations. Our analysis indicates that normalization signals are often shared between V1 neurons in this dataset. Our model will enable quantifying the relation between normalization and covariability in a broad range of neural systems, which could provide new constraints on circuit mechanisms of normalization and their role in information transmission and representation.

Author summary

Cortical responses are often variable across identical experimental conditions, and this variability is shared between neurons (noise correlations). These noise correlations have been extensively studied to understand how they impact neural coding and what mechanisms determine their properties. Here we show how correlations relate to divisive normalization, a mathematical operation widely adopted to describe how the activity of a neuron is modulated by other neurons via divisive gain control. We introduce the first statistical model of this relation. We extensively validate the model and investigate parameter inference in synthetic data. We find that our model, when applied to data from mouse visual cortex, outperforms a popular model of noise correlations that does not include normalization, and it reveals diverse influences of normalization on correlations. Our work demonstrates a framework to measure the relation between noise correlations and the parameters of the normalization model, which could become an indispensable tool for quantitative investigations of noise correlations in the wide range of neural systems that exhibit normalization.


Neurons in the sensory cortices of the brain exhibit substantial response variability across identical experimental trials [1, 2]. These fluctuations in activity are often shared between pairs of simultaneously recorded neurons, a phenomenon known as noise correlations [3]. Because the presence of these correlations can constrain the amount of information encoded by neural populations and impact behavior [4–13], noise correlations have been widely studied. This work has also revealed that correlations are often modulated by stimulus and state variables [3, 14], and therefore can play an important role in computational theories of sensory coding. For instance, noise correlations could emerge from neurons performing Bayesian inference and reflect the statistics of sensory inputs [15–18] and prior expectations [19–21]. From a mechanistic point of view, such a statistical structure of noise correlations poses strong constraints on circuit models of cortical activity [22–26]. To better understand the functional impact of noise correlations on neural coding and behavior, as well as their underlying mechanisms, we need to be able to quantitatively characterize and interpret how noise correlations in neural populations are affected by experimental variables.

For this reason, successful descriptive models of neural activity have been developed to capture noise correlations [27–34]. However, none of those models considers divisive normalization [35–37], an operation observed in a wide range of neural systems [38–40] which has also been implicated in modulating the structure of noise correlations. Experimental phenomena that are accompanied by changes in noise correlations, including contrast saturation [41], surround suppression [42–44], and attentional modulations of neural activity [45–47] have been successfully modeled using divisive normalization [36, 48–50], although those models only captured average firing rates of individual neurons. Additionally, some numerical simulation studies have shown how normalization can affect noise correlations [51, 52]. These results indicate that it is important to quantify the relative contribution of normalization and other factors to modulation of noise correlations in experimental data.

We propose a stochastic normalization model, the pairwise Ratio of Gaussians (RoG), to capture the across-trial joint response statistics for pairs of simultaneously recorded neurons. This builds on our previous method that considered the relationship between normalization and single-neuron response variability (hence we refer to it as the independent RoG; [53]). In these RoG models, neural responses are described as the ratio of two random variables: the numerator, which represents excitatory input to the neuron, and the denominator (termed normalization signal), which represents the suppressive effect of summed input from a pool of neurons [35, 54]. Our pairwise RoG allows for the numerators and denominators that describe the individual responses to be correlated across pairs; in turn, these correlations induce correlations between the ratio variables (i.e., the model neurons’ activity; Fig 1A). In this paper, we derive and validate a bivariate Gaussian approximation to the joint distribution of pairwise responses, which greatly simplifies the problem of fitting the model and interpreting its behavior. The model provides a mathematical relationship between noise correlations and normalization, which predicts qualitatively different effects of normalization on noise correlations, depending on the relative strength and sign of the correlation between numerators and between denominators. This could explain the diversity of modulations of noise correlations observed in past work [46, 52, 55]. To provide practical guidance for data-analytic applications of our model, we investigate the accuracy and stability of parameter inference, and illustrate the conditions under which our pairwise RoG affords better estimates of single-trial normalization signals compared to the independent RoG. 
We then demonstrate that the model accurately fits pairwise responses recorded in the mouse primary visual cortex (V1), and often outperforms a popular alternative that ignores normalization. In our dataset, we find that when the correlation parameter between denominators is significantly different from zero, it is positive, indicating that those pairs share their normalization signals.

Fig 1. Definition and validation of the pairwise Ratio of Gaussians model.

(A) The pairwise RoG model describes pairs of neural responses (R1, R2), where each response is computed as the ratio of two stimulus-driven signals on a trial-by-trial basis: numerators (N1, N2), representing the driving inputs; and denominators (D1, D2), representing the suppressive signals. Across trials, the numerators and denominators are distributed according to bivariate Gaussian distributions with correlation coefficients (ρN, ρD), respectively. The resulting response distribution is approximately Gaussian with correlation coefficient ρNC. (B-E) Comparison of the normal approximation we derived for the pairwise RoG noise covariance (B and C) and noise correlation (D and E) and the true values (estimated across 1e6 simulated trials) for 1e4 experiments (i.e., simulated pairs of neural responses). Each experiment used different model parameters and each trial was randomly drawn from the corresponding distribution. (B and D) scatter plot; (C and E) histogram of the percent difference between the Taylor approximation and the true value. The red marker indicates the median percent difference. Model parameters were drawn uniformly from the following intervals: The ranges of the mean parameters were chosen to reproduce realistic firing rates of V1 cortical neurons, while the α, β parameters were chosen such that the variances of the N and D are relatively small and the probability that D ≤ 0 is negligible [57].

Our results highlight the importance of modeling the relation between normalization and covariability to interpret the rich phenomenology of noise correlations. Our model and code provide a data-analytic tool that will allow researchers to further investigate such a relationship, and to quantitatively evaluate predictions made by normative and mechanistic models regarding the role of correlated variability and normalization in neural coding and behavior.


Ethics statement

All experiments on animals were conducted with approval of the Animal Care and Use Committee of the University of California, Berkeley.

Generative model—pairwise Ratio of Gaussians (RoG)

Here we describe in detail the RoG model and derive the Gaussian approximation to the ratio variable. Note that our RoG is entirely different from those used in [56] and in [48], despite the same acronym: [56] refers to the distribution obtained from the ratio between two Gaussian distributions, whereas we refer to the random variable that results from the ratio of two Gaussian variables. [48] does not refer to probability distributions at all, but rather to a surround suppression model in which the center and surround mechanisms are characterized by Gaussian integration over their respective receptive fields, while the RoG considered here is a model of neural covariability in general.

We build from the standard normalization model [35, 54] which computes the across-trial average neural response (e.g., firing rate) as the ratio between a driving input to a neuron (N) and the summed input from a pool of nearby neurons (D):

$$R = \frac{N}{D} \equiv f_\div(N, D) \tag{1}$$

where f÷ is the division operator; this functional notation is convenient for later derivations in which we consider the derivative of division. Our goal is to model the joint activity of pairs of neurons, so we extend the normalization model by considering two model neurons R1, R2. Since we are interested in trial-to-trial variability, we assume that a pair of neural responses Rt = (R1, R2)t on a single trial t can be written as the element-wise ratio of two Gaussian random vectors, Nt = (N1, N2)t and Dt = (D1, D2)t, with additive Gaussian random noise ηt = (η1, η2)t to capture the residual (i.e., stimulus-independent) variability.

As detailed further below, the numerators of two neurons can be correlated, and similarly for the denominators. In general, there can be correlations between the numerators and denominators (e.g., (N1, D2) may be correlated), requiring us to consider the joint, four-dimensional Gaussian distribution for the vector (Nt, Dt). However, in this paper we consider the simpler model in which Nt and Dt are independent and are each distributed according to their respective two-dimensional Gaussian distributions. This assumption allows for simplified mathematical derivations and is supported by our previous work which found that including a parameter for the correlation between N and D caused over-fitting to single-neuron data [53]. However, we have also derived the equations for the case that numerators and denominators are correlated (see S1 Text), and implemented them in the associated code toolbox, so that interested researchers can test if their data warrant the inclusion of those additional free parameters.

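We therefore write the generative model for the pairwise RoG as:

$$N_t \sim \mathcal{N}(\mu_N, \Sigma_N), \quad D_t \sim \mathcal{N}(\mu_D, \Sigma_D), \quad \eta_t \sim \mathcal{N}(\mu_\eta, \Sigma_\eta), \qquad R_t = f_\div(N_t, D_t) + \eta_t \tag{2}$$

where f÷ is applied element-wise, μN, μD are the two-dimensional vectors of means of the numerator and denominator, respectively, ΣN, ΣD are the respective 2 × 2 covariance matrices, and (μη, Ση) is the mean and covariance matrix for the residual component of the model.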
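To make the generative model concrete, the following is a minimal simulation sketch, not the authors' toolbox code. The function names and all parameter values are hypothetical, and it assumes, as in the text, that numerators, denominators, and residual noise are mutually independent.

```python
import math
import random


def sample_correlated_pair(rng, mean, sd, rho):
    """Draw one sample from a bivariate Gaussian with the given means,
    standard deviations, and correlation coefficient."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mean[0] + sd[0] * z1
    x2 = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


def sample_pairwise_rog(rng, n_trials, mu_n, sd_n, rho_n, mu_d, sd_d, rho_d,
                        mu_eta=(0.0, 0.0), sd_eta=(0.0, 0.0), rho_eta=0.0):
    """Simulate R_t = N_t / D_t + eta_t (element-wise), with N_t, D_t and
    eta_t drawn independently of each other on every trial."""
    responses = []
    for _ in range(n_trials):
        n1, n2 = sample_correlated_pair(rng, mu_n, sd_n, rho_n)
        d1, d2 = sample_correlated_pair(rng, mu_d, sd_d, rho_d)
        e1, e2 = sample_correlated_pair(rng, mu_eta, sd_eta, rho_eta)
        responses.append((n1 / d1 + e1, n2 / d2 + e2))
    return responses


# Illustrative parameter values (hypothetical, not taken from the paper).
rng = random.Random(0)
responses = sample_pairwise_rog(
    rng, 20000,
    mu_n=(20.0, 30.0), sd_n=(2.0, 3.0), rho_n=0.5,
    mu_d=(2.0, 3.0), sd_d=(0.1, 0.15), rho_d=0.5,
)
mean_r1 = sum(r[0] for r in responses) / len(responses)
mean_r2 = sum(r[1] for r in responses) / len(responses)
```

With these values the denominators sit many standard deviations above zero, so the simulated mean responses should land close to the ratio of means, μN/μD = 10 for both neurons.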

For the independent RoG, the ratio variable in general follows a heavy-tailed ratio distribution (a Cauchy distribution when both Gaussians have zero mean) whose moments are not well defined. To fit the model, we used the result that when the denominator has negligible probability mass at values less than or equal to zero, the ratio distribution can be approximated by a Gaussian distribution with mean and variance that can be derived from a Taylor expansion [57–60]. This assumption is justified since the denominator is the sum of the non-negative responses from a pool of neurons [36] and is therefore unlikely to attain values less than or equal to zero.

For the pairwise extension, we can use the multivariate delta method (an application of a Taylor expansion) to compute the mean and covariance for the joint distribution of ratio variables [61] under the assumption that μD > 0. We note that the true distribution of the ratio of bivariate or multivariate Gaussian vectors is unknown (although there is some work on ratios of complex Gaussian variables [62, 63]) and has higher-order statistics (e.g., skewness, kurtosis) that are not well approximated by an equivalent Gaussian. In this paper, we are interested in modeling the noise covariance as this is the most widely studied statistic in the field, and we show that the approximations we derive are very accurate (see Fig 1). Future work could extend the model to account for these statistics by using higher-order terms in the Taylor expansion or a non-Gaussian copula.

To derive equations for the mean and covariance of the pairwise RoG, we use a Taylor expansion around the point (μN, μD): (3)

Using only the first order terms, we derive expressions for the mean and covariance matrix of the RoG: (4) (5)
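For reference, a sketch of what the first-order delta method yields under the assumptions stated above (N, D, and η mutually independent, μD > 0). The index notation with i, j ∈ {1, 2} is chosen here for illustration and may differ from the notation of Eqs (3)–(5).

```latex
\begin{aligned}
f_\div(N_i, D_i) &\approx \frac{\mu_{N_i}}{\mu_{D_i}}
  + \frac{1}{\mu_{D_i}}\,(N_i - \mu_{N_i})
  - \frac{\mu_{N_i}}{\mu_{D_i}^{2}}\,(D_i - \mu_{D_i}), \\[4pt]
\mathbb{E}[R_i] &\approx \frac{\mu_{N_i}}{\mu_{D_i}} + \mu_{\eta_i}, \\[4pt]
\operatorname{cov}(R_i, R_j) &\approx
  \frac{[\Sigma_N]_{ij}}{\mu_{D_i}\,\mu_{D_j}}
  + \frac{\mu_{N_i}\,\mu_{N_j}}{\mu_{D_i}^{2}\,\mu_{D_j}^{2}}\,[\Sigma_D]_{ij}
  + [\Sigma_\eta]_{ij}.
\end{aligned}
```

The two partial derivatives of the ratio, 1/μD and −μN/μD², carry the numerator and denominator covariances into the response covariance. Because the denominator derivative enters twice, its negative sign cancels, so positively correlated denominators contribute positively to the response covariance.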

Note that the variance of the denominator influences the mean of the ratio variable through a second-order term, hence it does not appear in Eq (4) (see [58] for the second-order Taylor expansion for the mean of a ratio variable). From Eq (5), we can obtain expressions for the variance of each neuron in the pair and their covariance and correlation. First, we adopt the following notation to simplify the equations: let and let . Then: (6) (7)

Eq (7) is commonly referred to as the formula for “spurious” correlation of ratios found when comparing ratios of experimental variables [64], and we further generalize this in S1 Text. To the extent that tuning similarity between neurons reflects similarity in the driving inputs, and that those driving inputs are variable, neurons with more similar tuning would have larger ρN, which in turn implies larger noise correlations according to Eq (7). This is consistent with the widespread empirical observation that signal correlations and noise correlations are correlated [3].
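The accuracy of this kind of approximation can be checked numerically, in the spirit of Fig 1. The sketch below is a hypothetical illustration rather than the authors' code: it writes the first-order correlation in terms of coefficients of variation (σ/μ), omits the additive noise η, and uses made-up parameter values.

```python
import math
import random


def approx_noise_correlation(mu_n, sd_n, rho_n, mu_d, sd_d, rho_d):
    """First-order (delta-method) correlation between N1/D1 and N2/D2,
    assuming N and D are independent and there is no additive noise."""
    cv_n = [sd_n[i] / mu_n[i] for i in range(2)]
    cv_d = [sd_d[i] / mu_d[i] for i in range(2)]
    shared = rho_n * cv_n[0] * cv_n[1] + rho_d * cv_d[0] * cv_d[1]
    total = math.sqrt((cv_n[0] ** 2 + cv_d[0] ** 2)
                      * (cv_n[1] ** 2 + cv_d[1] ** 2))
    return shared / total


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulated_noise_correlation(rng, n_trials, mu_n, sd_n, rho_n,
                                mu_d, sd_d, rho_d):
    """Monte Carlo estimate of the correlation between the two ratios."""
    r1, r2 = [], []
    for _ in range(n_trials):
        zn1, zn2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        zd1, zd2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        n1 = mu_n[0] + sd_n[0] * zn1
        n2 = mu_n[1] + sd_n[1] * (rho_n * zn1 + math.sqrt(1 - rho_n ** 2) * zn2)
        d1 = mu_d[0] + sd_d[0] * zd1
        d2 = mu_d[1] + sd_d[1] * (rho_d * zd1 + math.sqrt(1 - rho_d ** 2) * zd2)
        r1.append(n1 / d1)
        r2.append(n2 / d2)
    return pearson(r1, r2)


# Hypothetical parameters: correlated numerators, anti-correlated denominators.
params = dict(mu_n=(20.0, 30.0), sd_n=(2.0, 3.0), rho_n=0.5,
              mu_d=(2.0, 3.0), sd_d=(0.1, 0.15), rho_d=-0.5)
approx = approx_noise_correlation(**params)
simulated = simulated_noise_correlation(random.Random(1), 50000, **params)
```

In this example the numerators are more variable (relative to their means) than the denominators, so ρN dominates and the approximate noise correlation is 0.3, pulled below ρN = 0.5 by the anti-correlated denominators.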

Parametrization of the pairwise RoG for contrast responses

In the form described so far, the pairwise RoG has 10 stimulus-dependent parameters and 5 stimulus-independent parameters for the additive noise. For any stimulus condition, there are only five relevant measurements that can be derived from the neural data (the response means and variances for each neuron in a pair, and their correlation), so the model is over-parametrized. Therefore, to apply the RoG to neural data, we need to reduce the total number of parameters.

The generality of this model provides a procedure for converting a standard normalization model (i.e., a model for the mean response) into a RoG model that specifies both mean and (co)-variance. In this paper, we use the example of contrast-gain control, which has been widely used to model the response of single neurons and neural populations to visual input with varying contrast [36, 65–67]. By adapting such a model, we can reduce the stimulus dependence of the means of the numerator and denominator. In the contrast-gain control model, the neural response as a function of contrast c (0–100%) is computed as a “hyperbolic ratio” [36, 65]: (8) Where Rmax is the maximum response rate, ϵ is the semi-saturation constant (the contrast at which the evoked response reaches half of Rmax), which also prevents division by 0, and R0 is the spontaneous activity of the neuron (the response at 0% contrast). We can convert this standard model into an RoG by setting the mean of the numerator and denominator in the RoG to the numerator and denominator in this equation: (9)

By using this functional form, we can substitute the stimulus-dependent mean parameters of the RoG (the means of the numerator and denominator at each contrast) with the stimulus-independent parameters of the contrast-response function. Another model simplification is to assume that individual neural variability and mean neural response are related by a power function as has been observed in the visual cortex [68–70]: (10)

This parametrization allows the Fano factor (the ratio of the variance to the mean) to vary with stimulus input (as long as β ≠ 1) and for both over-dispersion (Fano factor >1) and under-dispersion (Fano factor <1). Moreover, as with the mean, the four stimulus-dependent variance parameters of the model (the variances of the numerator and denominator of each neuron) can be replaced with four pairs of stimulus-independent parameters ((αN, βN) and (αD, βD) for each neuron). Lastly, in principle, the parameters controlling correlation (ρN, ρD) can vary with stimulus conditions but for computational simplicity we assume that (ρN, ρD) are stimulus-independent. However, even with this assumption, our model can capture stimulus-dependent noise correlations (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1) as often observed in vivo [41, 71, 72].
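The following sketch illustrates why fixed (ρN, ρD) can still produce contrast-dependent noise correlations. It is an illustration under explicit assumptions, not the authors' implementation: the exponent of 2 in the contrast-response function, the power-law form variance = α·mean^β, the omission of additive noise, and all parameter values are hypothetical choices.

```python
import math


def rog_input_moments(contrast, rmax, eps, alpha_n, beta_n, alpha_d, beta_d,
                      exponent=2.0):
    """Means and variances of numerator and denominator at one contrast.
    The exponent is an assumed value; variances follow a power law of the mean."""
    mu_n = rmax * contrast ** exponent
    mu_d = eps ** exponent + contrast ** exponent
    var_n = alpha_n * mu_n ** beta_n
    var_d = alpha_d * mu_d ** beta_d
    return mu_n, mu_d, var_n, var_d


def rog_noise_correlation(contrast, neuron1, neuron2, rho_n, rho_d):
    """First-order noise correlation of the ratios (contrast must be > 0)."""
    mn1, md1, vn1, vd1 = rog_input_moments(contrast, **neuron1)
    mn2, md2, vn2, vd2 = rog_input_moments(contrast, **neuron2)
    cov = (rho_n * math.sqrt(vn1 * vn2) / (md1 * md2)
           + rho_d * math.sqrt(vd1 * vd2) * mn1 * mn2 / (md1 ** 2 * md2 ** 2))
    var1 = vn1 / md1 ** 2 + vd1 * mn1 ** 2 / md1 ** 4
    var2 = vn2 / md2 ** 2 + vd2 * mn2 ** 2 / md2 ** 4
    return cov / math.sqrt(var1 * var2)


# Hypothetical neurons; contrast is expressed in percent.
neuron1 = dict(rmax=50.0, eps=20.0, alpha_n=1.0, beta_n=1.5,
               alpha_d=0.1, beta_d=1.5)
neuron2 = dict(rmax=30.0, eps=40.0, alpha_n=1.0, beta_n=1.5,
               alpha_d=0.1, beta_d=1.5)
corr_low = rog_noise_correlation(10.0, neuron1, neuron2, rho_n=0.8, rho_d=-0.2)
corr_high = rog_noise_correlation(100.0, neuron1, neuron2, rho_n=0.8, rho_d=-0.2)
```

Because the relative variability of numerator and denominator changes with contrast, the weight given to ρN versus ρD changes too, so `corr_low` and `corr_high` differ even though the correlation parameters are held fixed.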

Fitting the RoG to data

We optimize the values of the parameters, given a dataset, by maximum likelihood estimation. In this paper, we validate various properties of the pairwise RoG using synthetic data produced from the generative model. We will demonstrate the applicability of this model to neural data analysis by fitting the pairwise RoG to calcium imaging data (see Data collection and processing).

Based on our previous discussion, we assume that the model parameters (collectively denoted Θ) are stimulus-independent. We consider our dataset {Rt(s)} where s is the stimulus and t indexes the trial. We assume that, for each stimulus, the responses are independent and identically distributed across trials according to the bivariate Gaussian approximation of the pairwise RoG with parameters Θ, and that data are independent across stimuli. We can therefore compute the negative log-likelihood of the data using the following equation (see S2 Text for derivation): (11) Where the data enter only through the empirical mean and covariance across trials.
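As a sketch of how such a likelihood can be evaluated from summary statistics alone, the function below uses the standard identity for i.i.d. Gaussian data. The function names are hypothetical, the empirical covariance is assumed to be normalized by the number of trials T (not T − 1), and the constant term is kept; Eq (11) may be written differently.

```python
import math


def gaussian_nll(sample_mean, sample_cov, n_trials, model_mean, model_cov):
    """Negative log-likelihood of n_trials i.i.d. bivariate Gaussian
    observations, expressed through their empirical mean and covariance."""
    (a, b), (c, d) = model_cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    diff = (sample_mean[0] - model_mean[0], sample_mean[1] - model_mean[1])
    # trace(inv @ sample_cov): spread of the data around its own mean
    trace = sum(inv[i][j] * sample_cov[j][i]
                for i in range(2) for j in range(2))
    # Mahalanobis distance between empirical and model means
    maha = sum(diff[i] * inv[i][j] * diff[j]
               for i in range(2) for j in range(2))
    return 0.5 * n_trials * (2.0 * math.log(2.0 * math.pi)
                             + math.log(det) + trace + maha)


def total_nll(conditions):
    """Sum over stimulus conditions, assuming independence across stimuli.
    Each condition is (sample_mean, sample_cov, n_trials, model_mean, model_cov)."""
    return sum(gaussian_nll(*condition) for condition in conditions)
```

Working with the empirical mean and covariance means the per-trial data need to be summarized only once, after which each likelihood evaluation during optimization has a cost independent of the number of trials.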

In practice, we have found that it is computationally faster to first optimize the parameters for each neuron in the pair separately (which is equivalent to fitting the independent RoG model), and then optimize the correlation parameters (i.e., the ρ parameters) with the single-neuron model parameters fixed. This two-step optimization process is referred to as the inference functions for marginals method in the copula literature, and is known to be mathematically equivalent to maximum likelihood estimation for Gaussian copulas [73], which is the case we consider here. This points to an extension or alternative to the pairwise RoG that considers the bivariate distribution to be some non-Gaussian copula with Gaussian marginals, which we leave for future work. We assumed that the pairwise distribution is Gaussian for computational simplicity, but others have used non-Gaussian copulas to model neural populations [74].

Cross-validated goodness of fit.

To measure the quality of model fit, we used a cross-validated pseudo-R2 measure [75], as follows. During fitting, we divided the recording trials for each pair and for each stimulus into training and test sets (for simulation studies, we used two- or ten-fold cross-validation; for the calcium analysis we used leave-one-out cross-validation). We then fit the parameters of the model for each training set and used the following equation to assess the model prediction on the held-out data: (12) Where LLfit is the negative log-likelihood (using Eq (11)) for the test data using the optimized parameters, LLnull is the negative log-likelihood of the data assuming that there is no modulation of the responses by stimulus contrast, and LLoracle is the negative log-likelihood of the data using the empirical mean, variance, and covariance of the training data per stimulus condition. The reported goodness of fit score is the median across all training and test splits of the computed score (Eq (12)). Because of this cross-validation, goodness of fit values can be <0 (the fit model is worse than the null model) or >1 (the fit model performance is better than the oracle).
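A normalized score with the properties described above can be sketched as follows. The exact algebraic form of Eq (12) is assumed here: the score is taken to be 0 when the fit matches the null model and 1 when it matches the oracle, which is one common way to write such a pseudo-R2. Function names are hypothetical.

```python
from statistics import median


def pseudo_r2(nll_fit, nll_null, nll_oracle):
    """Normalized goodness of fit from negative log-likelihoods on held-out
    data: 0 at the null model, 1 at the oracle, unbounded on both sides."""
    return (nll_null - nll_fit) / (nll_null - nll_oracle)


def cross_validated_score(split_nlls):
    """Median score across splits; each entry is (nll_fit, nll_null, nll_oracle)."""
    return median(pseudo_r2(*nlls) for nlls in split_nlls)
```

For example, `pseudo_r2(120.0, 150.0, 100.0)` gives 0.6: the fitted model recovers 60% of the likelihood gap between the null model and the oracle on that split.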

Quantifying the accuracy of the estimated correlation parameters.

As we are interested in interpreting the correlation model parameters (ρN, ρD), we need to assess the accuracy of the maximum likelihood estimator. For simulations, we directly compare the estimated ρ values to the true values used to generate the data. For real neural data, however, we do not have access to the true values: instead, we compute confidence intervals. To do so, we perform a bootstrap fit procedure: given a set of pairwise neural responses {Rt(s)} with T simultaneously recorded trials, we sample these trials with replacement T times and then fit the pairwise RoG using the resampled set of neural responses as our observations. Repeating this procedure for a large number of samples (in the analysis in sections Inference of correlation parameters and Pairwise Ratio of Gaussians model captures correlated variability in mouse V1 we used 1000 bootstrap samples) gives us sets of fit ρN, ρD, which we use to compute a 90% confidence interval. Using the synthetic data, we validate these confidence intervals by measuring the empirical coverage probability and comparing to the nominal confidence level. These confidence intervals allow one to quantify the accuracy of the ρ parameter estimates. We then demonstrate one possible use of the confidence intervals, with an application focused specifically on the sign of the ρ parameters.
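The bootstrap procedure can be sketched generically as below. This is a minimal percentile-interval illustration, not the authors' code: `estimator` stands in for a full pairwise RoG fit returning one ρ parameter, and here a plain Pearson correlation is used as a cheap placeholder. All names and data are hypothetical.

```python
import math
import random


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_interval(trials, estimator, n_boot=1000, level=0.90, rng=None):
    """Percentile bootstrap confidence interval: resample T trials with
    replacement, re-estimate, and take the central `level` mass."""
    rng = rng or random.Random()
    n = len(trials)
    estimates = sorted(
        estimator([trials[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    k = round(n_boot * (1.0 - level) / 2.0)  # number of estimates cut per tail
    return estimates[k], estimates[n_boot - 1 - k]


# Synthetic paired responses with true correlation 0.6 (illustrative only).
rng = random.Random(2)
trials = []
for _ in range(200):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    trials.append((z1, 0.6 * z1 + 0.8 * z2))

point_estimate = pearson(trials)
lower, upper = bootstrap_interval(trials, pearson, n_boot=500, rng=rng)
sign_is_positive = lower > 0.0  # interval excludes zero on the positive side
```

An interval that excludes zero, as checked in the last line, is the kind of criterion one could use to decide whether the sign of a correlation parameter is reliably determined.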

Model comparison

We compared the pairwise RoG to a modified version of the modulated Poisson model [69], using Gaussian noise instead of Poisson [76]. We call this model the modulated Gaussian (MG). The original model is a compound Poisson-Gamma distribution, in which the Poisson rate parameter is the product of the mean tuning curve and a random gain variable that is Gamma distributed. The parameters of the Gamma distribution depend on the mean tuning curve (f(s)) and the variance of the gain variable (σG). Additionally, there are two sources contributing to (tuned) covariability: the correlation between the multiplicative gains (ρG), and the correlation between the Poisson processes (ρP). For the modulated Gaussian model, we use a bivariate Gaussian distribution whose moments (i.e., mean, variance, and covariance) are parametrized according to the moments of the modulated Poisson model. We made this modification to the modulated Poisson model for two main reasons. First, because we are examining continuously valued fluorescence traces as opposed to discrete spike count data, a continuous distribution is more appropriate for analysis. Second, the original modulated Poisson, while including a parametrization of the noise covariance between neurons, has no simple closed form for the bivariate distribution, which complicates the comparison of goodness of fit between the two models. By using a bivariate Gaussian distribution, we can more directly compare this model to our proposed pairwise RoG.

More explicitly, the pairwise neural responses are distributed as: (13) where we assume the mean tuning curve is the contrast-response curve (Eq (8)), the gain parameter for neuron i is the standard deviation of its multiplicative gain, and ρG is the correlation between the multiplicative gains. ρP is no longer interpreted as the point process correlation; instead, ρP controls the portion of the tuned covariability that is independent of the shared gain. As with the RoG, we also model the untuned variability η as additive bivariate Gaussian noise. We then fit the model parameters to data by minimizing the negative log-likelihood (Eq (11), with the model mean and covariance defined in Eq (13)). As with the pairwise RoG, we use cross-validation to account for model complexity and compute the goodness of fit scores using Eq (12). An extension to this model was recently proposed that incorporates normalization by assuming the rate parameter is a ratio term in which the denominator is a Gaussian random variable, then deriving moments of the distribution for optimization [77]. However, this model does not currently account for noise correlations, so we chose to instead adapt the Poisson-Gamma model.

The most relevant difference between RoG and MG is in how each model accounts for the effect of normalization on (co)variability. In the RoG, normalization directly influences variability by division operating on random variables. This creates flexible dependencies between the mean firing rate, individual neuron variability and shared covariability. In the MG, normalization influences the gain of neurons through the interaction between the mean firing rate (i.e., the standard normalization model) and the gain parameter σG, which is assumed to be a slowly fluctuating source of variability that scales how the mean firing rate affects variability. In this way, the normalization signal for the MG is a deterministic factor. The MG is therefore a simpler model that can only account for overdispersion, whereas the RoG allows for both overdispersion and underdispersion, and for diverse patterns of covariability (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1) albeit at the cost of additional parameters.
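The dispersion argument can be illustrated with a short sketch. It assumes the commonly used moments of the modulated Poisson model (variance f + σG²f², covariance ρP√(f1f2) + ρGσG1σG2f1f2, with σG a standard deviation) as a stand-in for the tuned part of Eq (13), which may differ in detail. The RoG variance uses the first-order approximation without additive noise. Names and values are hypothetical.

```python
import math


def mg_moments(f1, f2, sigma_g1, sigma_g2, rho_g, rho_p):
    """Tuned variances and covariance under the assumed modulated-Poisson
    moments, for mean responses f1 and f2."""
    var1 = f1 + (sigma_g1 * f1) ** 2
    var2 = f2 + (sigma_g2 * f2) ** 2
    cov = rho_p * math.sqrt(f1 * f2) + rho_g * sigma_g1 * sigma_g2 * f1 * f2
    return var1, var2, cov


def mg_fano(f, sigma_g):
    """Variance-to-mean ratio of the MG: never below 1 under these moments."""
    return 1.0 + sigma_g ** 2 * f


def rog_fano(mu_n, sd_n, mu_d, sd_d):
    """Variance-to-mean ratio of a first-order RoG response (no additive noise)."""
    mean = mu_n / mu_d
    var = mean ** 2 * ((sd_n / mu_n) ** 2 + (sd_d / mu_d) ** 2)
    return var / mean


mg_values = [mg_fano(f, sigma_g=0.2) for f in (0.1, 1.0, 10.0, 50.0)]
rog_under = rog_fano(mu_n=20.0, sd_n=2.0, mu_d=2.0, sd_d=0.1)  # Fano factor < 1
rog_over = rog_fano(mu_n=20.0, sd_n=8.0, mu_d=2.0, sd_d=0.1)   # Fano factor > 1
```

Under these assumed moments the MG Fano factor is 1 plus a non-negative term, whereas the RoG value depends on the relative variability of numerator and denominator and can fall on either side of 1.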

Inference of single trial normalization from measured neural activity

Because of the probabilistic formulation of the RoG, we can use Bayes' theorem to compute the posterior probability of the normalization variable Dt in a single trial, given the observed neural responses Rt: (14) Where multiplication by |D1D2| occurs due to the change of variables formula for probability density functions. From this distribution, we can find the maximum a posteriori (MAP) estimates of the normalization strength in a single trial by differentiating the posterior distribution with respect to the denominator variables and finding the maxima by setting the partial derivatives to 0. For ease of computation, we solve the equivalent problem of finding the zeros of the partial derivatives of the negative logarithm of the posterior distribution. In our previous work [53], we found that, when subtracting the mean additive noise from the simulated activity, the MAP estimate remained unbiased. Thus, for simplicity, we assume that we can subtract off the mean spontaneous activity and consider instead the posterior p(D | R − μη). Setting the partial derivatives of the negative log posterior with respect to D1, D2 to zero leads to a two-dimensional system of bivariate quadratic equations: (15) Where the coefficients A1, A2, B1, B2, C are functions of the parameters of the model (see S3 Text for the derivation of Eq (15) and the full expressions of these coefficients).

A basic result from the algebra of polynomial systems (Bézout’s theorem) tells us that this system has four pairs of possibly complex valued solutions [78]. In fact, as solving this system amounts to solving a quartic equation in one variable, there exists an algebraic formula (with radicals) for solutions to this system as a function of the coefficients. This solution is too long to include here and uninformative but was found using the Symbolic Math toolbox from MATLAB and is included in our toolbox (Code and Data Availability).

Because all the variables involved are real-valued, we are only interested in the existence of real solutions to this two-dimensional system. However, there is no theoretical guarantee that there will be any real solutions. In practice we take the real part of the algebraic solution to this system and find which pair of solutions minimizes the negative log posterior. Alternatively, we can find the MAP by directly minimizing the negative logarithm of Eq (14) (see S3 Text) using numerical optimization. We have verified that, when real-valued solutions to Eq (15) exist, these coincide with the result of this numerical minimization. However, as the numerical optimization must be computed on a per-trial basis, it is far too time consuming to perform when there are many experimental trials, so we utilize the algebraic solution to Eq (15).
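As an illustration of the numerical route, the sketch below minimizes a negative log posterior by coordinate descent, restricted to positive denominators. It is not the closed-form solution used in the toolbox, and it rests on stated assumptions: the response has had the mean additive noise subtracted, the variance of η is ignored, N = R∘D enters through the numerator Gaussian with the Jacobian factor |D1D2|, and all names and values are hypothetical.

```python
import math


def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested tuples."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] / det, -m[0][1] / det), (-m[1][0] / det, m[0][0] / det))


def neg_log_posterior(d, r, mu_n, cov_n, mu_d, cov_d):
    """Negative log posterior of D given R (up to an additive constant)."""
    p, q = inv2(cov_n), inv2(cov_d)
    en = [r[i] * d[i] - mu_n[i] for i in range(2)]  # implied numerator error
    ed = [d[i] - mu_d[i] for i in range(2)]         # denominator error
    quad_n = sum(en[i] * p[i][j] * en[j] for i in range(2) for j in range(2))
    quad_d = sum(ed[i] * q[i][j] * ed[j] for i in range(2) for j in range(2))
    return 0.5 * quad_n + 0.5 * quad_d - math.log(d[0]) - math.log(d[1])


def map_denominators(r, mu_n, cov_n, mu_d, cov_d, n_sweeps=200):
    """Coordinate descent: with one denominator fixed, the stationarity
    condition for the other is a quadratic with a single positive root."""
    p, q = inv2(cov_n), inv2(cov_d)
    a = [r[i] ** 2 * p[i][i] + q[i][i] for i in range(2)]
    b = [-r[i] * (p[i][0] * mu_n[0] + p[i][1] * mu_n[1])
         - (q[i][0] * mu_d[0] + q[i][1] * mu_d[1]) for i in range(2)]
    c = r[0] * r[1] * p[0][1] + q[0][1]
    d = list(mu_d)  # start from the prior mean
    for _ in range(n_sweeps):
        for i in (0, 1):
            lin = b[i] + c * d[1 - i]
            # positive root of a*D^2 + lin*D - 1 = 0
            d[i] = (-lin + math.sqrt(lin ** 2 + 4.0 * a[i])) / (2.0 * a[i])
    return d


# Hypothetical model parameters and a single mean-subtracted response pair.
mu_n, cov_n = (20.0, 30.0), ((4.0, 3.0), (3.0, 9.0))
mu_d, cov_d = (2.0, 3.0), ((0.01, 0.0075), (0.0075, 0.0225))
r = (11.0, 9.5)
d_map = map_denominators(r, mu_n, cov_n, mu_d, cov_d)
```

On the positive quadrant this objective is strictly convex (two quadratic forms plus −log terms), which is why simple coordinate updates suffice here. The algebraic solution described in the text avoids such per-trial iteration altogether.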

Generating realistic pairwise neural activity from the model

To constrain our simulations to realistic parameter values for the contrast response function (Eq (8)), we took the single-neuron best-fit parameters to macaque V1 data analyzed in our previous work (for details see [53]) and created parameter pairs by considering all combinations (N = 11628 pairs) of these parameters. Using the generative model for the pairwise RoG (Eq (2)) and the contrast-response parametrization (Eq (8)), we can simulate single-trial neural activity from these parameter pairs and specific values for (ρN, ρD). These synthetic data allow us to explore properties of the pairwise model without having to exhaustively explore the full parameter space.

Data collection and processing

Animal preparation.

Data were collected from CaMKII-tTA;tetO-GCaMP6s mice [79], expressing GCaMP6s in cortical excitatory neurons. Mice were implanted with headplates and cranial windows over V1 [80]. Briefly, mice were anesthetized with 2% isoflurane and administered 2 mg/kg of dexamethasone and 0.5 mg/kg of buprenorphine. Animals were secured in a stereotaxic frame (Kopf) and warmed with a heating pad. The scalp was removed and the skull was lightly etched. A craniotomy was made over V1 using a 3.5 mm skin biopsy punch. A cranial window, consisting of two 3 mm diameter circular coverslips glued to a 5 mm diameter circular coverslip, was placed onto the craniotomy, and secured into place with Metabond (C&B). Then a custom-made titanium headplate was secured via Metabond (C&B) and the animals were allowed to recover in a heated cage.

Behavioral task and visual stimuli.

During imaging, mice were head-fixed in a tube and performed an operant visual detection task [81]. Briefly, mice were trained to withhold licking when no stimulus was present, and to lick within a response window after stimulus presentation. Mice were water-restricted and given a water reward for correct detection. Visual stimuli were drifting sinusoidal gratings (2 Hz, 0.08 cycles/degree) presented for 500 ms followed by a 1000 ms response window. Stimuli were generated and presented using PsychoPy2 [82]. Visual stimuli were presented using a gamma corrected LCD monitor (Podofo, 25 cm, 1024x600 pixels, 60 Hz refresh rate) located 10 cm from the right eye. Grating contrast was varied across 7 levels: {2, 8, 16, 32, 64, 80, 100}%, except for 2 recording sessions in which contrast level 80% was omitted. This did not alter any of the analysis, allowing sessions to be combined into a single dataset.

Calcium imaging.

Once they learned the task, mice started performing under the 2p microscope, and V1 was imaged via cranial window. Imaging was performed using a 2-photon microscope (Sutter MOM, Sutter Inc.), with a 20X magnification (1.0 NA) water-immersion objective (Olympus Corporation). Recordings were done in L2/3 in an 800 x 800 μm field of view, with 75–100 mW of 920 nm laser light (Chameleon; Coherent Inc). An electrically tunable lens (Optotune) was used to acquire 3 plane volumetric images at 6.36 Hz. Planes were 30 μm apart. Acquisition was controlled with ScanImage (Vidrio Technologies).

Calcium imaging data were motion-corrected and ROIs were extracted using suite2p [83], and all data were neuropil subtracted with a coefficient of 0.7 (we also analyzed data using neuropil coefficients of 0.4 and 1, and see S4 Text for additional analysis with deconvolved data; all the results presented in the main text were qualitatively similar across preprocessing methods).

Data processing.

Processing of calcium imaging data was performed using custom MATLAB code. Fluorescence traces for individual trials and cells (average of the neuropil subtracted fluorescence across a ROI) consisted of 24 frames: 4 frames of pre-stimulus blank, followed by 3 frames of stimulus presentation and 17 frames of post-stimulus blanks corresponding to the response window for the behavioral task. In our analyses, we considered one extra frame to account for onset delays and calcium dynamics. Baseline fluorescence (F0) was computed as the median across pre-stimulus frames (1–5), and the stimulus evoked fluorescence (ΔF/F) was computed as the mean of the normalized fluorescence per frame ((F(i) − F0)/F0 for F(i) the fluorescence of frame i) across frames corresponding to stimulus response. Spontaneous ΔF/F was computed as above during blank trials. Cells were included in further analysis if the evoked response at the highest contrast was at least 2 standard deviations above the spontaneous mean fluorescence. Across 9 recording sessions, 295/8810 neurons met this inclusion criterion. This small percentage is due to sessions recorded using gratings with fixed spatial frequency and orientation; thus, included neurons are visually responsive and selective for this combination.
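The ΔF/F computation and the inclusion criterion can be sketched as follows. This is an illustrative Python re-expression, not the custom MATLAB code used for the analysis. The function names `delta_f_over_f` and `is_responsive` are hypothetical, the frame windows are left as arguments because the exact indices depend on the extra-frame convention described above, and the threshold assumes that the 2 standard deviations refer to the spread of the spontaneous ΔF/F across blank trials.

```python
import numpy as np

def delta_f_over_f(traces, baseline_frames, response_frames):
    """Per-trial evoked dF/F for one cell.

    traces: array (n_trials, n_frames) of neuropil-subtracted fluorescence.
    baseline_frames, response_frames: slices or index arrays (caller-defined).
    """
    # F0: median over the pre-stimulus frames, computed separately per trial.
    f0 = np.median(traces[:, baseline_frames], axis=1, keepdims=True)
    # Normalized fluorescence per frame, (F(i) - F0) / F0.
    dff = (traces - f0) / f0
    # Mean over the frames assigned to the stimulus response.
    return dff[:, response_frames].mean(axis=1)

def is_responsive(evoked_high_contrast, spontaneous, n_sd=2.0):
    """Assumed criterion: mean evoked dF/F at the highest contrast is at least
    n_sd spontaneous SDs above the spontaneous mean."""
    threshold = spontaneous.mean() + n_sd * spontaneous.std(ddof=1)
    return bool(evoked_high_contrast.mean() >= threshold)

# Toy example: flat baseline of 100 with a step to 150 during the response.
toy = np.full((3, 24), 100.0)
toy[:, 4:8] = 150.0
toy_dff = delta_f_over_f(toy, slice(0, 4), slice(4, 8))
```

In the toy example the windows `slice(0, 4)` and `slice(4, 8)` are placeholders chosen for illustration only.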


We developed the pairwise Ratio of Gaussians model (RoG, Fig 1A) to quantify the relationship between normalization and response covariability (i.e., noise correlations) that has been suggested in empirical studies of neural activity in visual cortex [52, 55, 84, 85]. In the standard divisive normalization model (Eq (1)), the mean response is computed as the ratio of the excitatory drive (numerator N) to a normalization signal summarizing inhibitory drive (denominator D). Our pairwise RoG considers a pair of neurons where each individual neural response is well characterized by the standard normalization equation with corresponding numerators (N1, N2) and denominators (D1, D2). We then assume that the numerators and denominators are bivariate Gaussian random vectors—which allows the possibility for correlations to exist among the numerators (denoted ρN) and among the denominators (ρD). From this, we derived equations for the mean responses and covariance matrix of the pair as a function of the numerators and denominators (Eqs (4) and (5)). These equations depend on the Gaussian approximation to the ratio of two Gaussian random variables. We verified the validity of this approximation for the moments of interest (mean, variance, and covariance), by simulating the activity of pairs of neurons, and comparing the covariance (Fig 1B and 1C) and correlation (Fig 1D and 1E) of the true ratio distribution and of the approximate distribution (Eqs (6) and (7)). The mean and variance are identical to the independent RoG model (a special case of the pairwise RoG, with numerators and denominators independent between neurons) which we validated previously [53].

Modulations of correlated variability depend on sharing of normalization

Within the RoG modeling framework, there are two main sources of response (co)-variability: the numerator (excitatory drive) and the denominator (normalization). Depending on the value of the corresponding ρ parameters, each of these sources of variability can be independent (ρ = 0) or shared (ρ ≠ 0), and therefore contribute differently to noise correlations. Consequently, understanding modulations of noise correlations in the pairwise RoG requires understanding how normalization and external stimuli affect the relative amplitude of these sources, and how the effects depend on whether those sources are shared.

First, we studied the relationship between normalization and noise correlations for the lowest contrast stimuli (Fig 2, yellow symbols). We define normalization strength for a given neuron as the mean of the denominator; for a fixed stimulus contrast, this is determined by the semi-saturation constant ϵ in Eq (8). When the normalization signals are positively correlated (shared normalization, ρD = 0.5) but the excitatory drive is independent (ρN = 0), increasing normalization strength tends to decrease the magnitude of noise correlations (Fig 2A). Conversely, when the normalization signals are independent (ρD = 0) but there is shared driving input (ρN = 0.5), the magnitude of noise correlations tends to increase with increasing normalization (Fig 2B). Intuitively, this is due to how the model partitions neuronal covariability into two sources and how normalization separately affects the variability of these sources. As mentioned at the beginning of this section, these terms describe the correlations among the numerators and among the denominators. The correlations arising from the numerator are unaffected by normalization strength, while the correlations arising from the denominator tend to decrease with normalization strength. So, when ρN = 0, the noise correlations are solely due to the denominator cofluctuations and thus tend to decrease with normalization. However, when ρD = 0, the numerator covariability drives the response covariability, which is unchanged by normalization strength. The reason we see increased noise correlations in this scenario is that normalization decreases individual neuronal response variability [53], so the proportional contribution of the numerator term increases. This is derived more completely later in this section. Therefore, increasing normalization strength reduces a source of noise correlations when normalization is shared, whereas it reduces a source of independent noise when normalization is not shared.
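The two opposite trends can be reproduced in a small simulation. The sketch below is only illustrative and makes several assumptions: both neurons share identical parameters, the contrast response is written as μN = Rmax·c² and μD = ϵ² + c² as a stand-in for Eq (8), variances follow σ² = α·μ^β with illustrative exponents, numerator and denominator are independent within a neuron, and η = 0. The function name `simulate_noise_corr` is hypothetical.

```python
import numpy as np

def simulate_noise_corr(eps, rho_n, rho_d, c=10.0, r_max=50.0,
                        n_trials=20000, seed=0):
    """Noise correlation of a simulated pair at one normalization strength."""
    rng = np.random.default_rng(seed)
    mu_n = r_max * c**2          # mean excitatory drive (assumed form)
    mu_d = eps**2 + c**2         # mean normalization signal (assumed form)
    sd_n = np.sqrt(mu_n**1.5)    # alpha_N = 1, beta_N = 1.5 (illustrative)
    sd_d = np.sqrt(mu_d**1.0)    # alpha_D = 1, beta_D = 1.0 (illustrative)

    def draw(mu, sd, rho):
        # Bivariate Gaussian with equal marginals and correlation rho.
        cov = sd**2 * np.array([[1.0, rho], [rho, 1.0]])
        return rng.multivariate_normal([mu, mu], cov, size=n_trials)

    r = draw(mu_n, sd_n, rho_n) / draw(mu_d, sd_d, rho_d)
    return float(np.corrcoef(r[:, 0], r[:, 1])[0, 1])

# Increasing eps increases the mean of the denominator (normalization strength).
eps_values = [5.0, 10.0, 20.0, 40.0]
shared_norm = [simulate_noise_corr(e, rho_n=0.0, rho_d=0.5) for e in eps_values]
shared_drive = [simulate_noise_corr(e, rho_n=0.5, rho_d=0.0) for e in eps_values]
```

Under these assumptions, `shared_norm` falls toward zero as ϵ grows, while `shared_drive` rises toward ρN, consistent with the qualitative pattern described for Fig 2A and 2B.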

Fig 2. Relationship between noise correlations, normalization strength and contrast depends on the source of variability.

Each panel shows, for a combination of ρN, ρD specified in the panel title, the median noise correlation of all generated neural pairs binned according to ϵ1 × ϵ2, a contrast independent measure of the common normalization strength. Bins with fewer than 100 pairs were discarded. Neural responses were generated from the contrast-response parametrization (Eq (8)). Noise correlation strength was computed across 1e3 simulated trials drawn from the pairwise RoG model. For each contrast level and combination of ρN, ρD, 1e5 simulated experiments were created. See S1 Fig for a more systematic exploration of the factors influencing modulation of noise correlations for two simulated pairs that matches the large scale experiment considered here. Model parameters were drawn uniformly from the following intervals:

We found similar effects of normalization on noise correlations at higher contrast levels (Fig 2, orange and red symbols) although the slope of the relationship became shallower as contrast increased. This is due to the saturating effect of the contrast response function (Eq (8)): at high contrast, (co)fluctuations of the normalization signal across trials have relatively little effect on the responses of a neural pair, so the correlation in neural responses will be relatively unaffected by normalization strength. We also observed that the magnitude of noise correlations generally increased with contrast when the denominators were correlated and the numerators independent (Fig 2A and S2 Fig), whereas it decreased when the numerators were correlated (Fig 2B and 2C and S2 Fig). Importantly, for the analysis of normalization strength at fixed contrast, we used the contrast semi-saturation constants (Eq (8); i.e., a pure change in the denominator) as a measure of normalization strength. Conversely, increasing stimulus contrast increases both the numerator and denominator (Eq (8)). This explains our observation that, even though normalization is stronger at higher contrast, noise correlations can be modulated in different ways by contrast and by normalization, because changing stimulus contrast also affects the numerator term.

Indeed, these results can be derived from examining the equation for the correlation in the pairwise RoG (Eq (7)) rearranged as follows (see S5 Text for more details): (16) and by recognizing that the coefficient of variation of Di () is a decreasing function of normalization strength (μD or ϵ), whereas the ratio is often an increasing function of contrast.

We further analyze this equation, first in the case of changing normalization strength while keeping contrast fixed. The term proportional to ρN is an increasing function of the mean normalization strength since the denominator is a decreasing function of μD. Conversely, the term proportional to ρD decreases with normalization since the denominator increases with μD. In these two cases, the monotonic dependence of noise correlation on normalization strength is guaranteed by Eq (16) regardless of the specific parameter values. Similar patterns emerge when ρN, ρD < 0, except the signs of the noise correlations are reversed (S2 Fig). When the correlations of input and normalization signals are both different from zero, the relationship between noise correlation and normalization strength resembles a combination of the two previously described scenarios, and the specific parameter values determine which of the two terms in Eq (16) dominates. For instance, in our simulations with (ρN = 0.5, ρD = 0.5) (Fig 2C), the magnitude of noise correlations increased with normalization strength on average similar to (Fig 2B), indicating that the magnitude of the term proportional to ρN is usually larger than the magnitude of the ρD term, but this trend is not consistent, as evidenced by the increased spread of the scatter. When the input strength and normalization signal have opposite correlations (e.g., ρN = 0.5, ρD = −0.5), we obtained similar results; however, the magnitude of noise correlations was on average closer to 0 due to cancellation between the two sources of covariability (S2 Fig).

A similar analysis of Eq (16) shows that the effects of stimulus contrast are opposite to those of a pure change of normalization strength, because is often an increasing function of contrast but a decreasing function of normalization strength. Notice that we assumed there is no residual noise component (η ≡ 0), but all the analyses above remain valid when the amplitude of noise variance ση is relatively small compared to (δN, δD) (see S5 Text).

In summary, our analysis shows that normalization and stimulus contrast can have diverse effects on noise correlations, depending on whether neurons share their normalization signals and on the interplay between multiple sources of variability.

Inference of correlation parameters

The above analysis demonstrates that the relationship between noise correlations and normalization depends on how this correlated variability arises: either through cofluctuations in the excitatory drive or normalization signal (determined by ρN, ρD respectively). To employ these insights when fitting to data, we need to know how well we can infer these parameters from data. To do so, we generated synthetic neural data using realistic values for the single-neuron parameters (see Generating realistic pairwise neural activity from the model) and uniformly randomly sampled ρN, ρD parameters in the range [-0.9,0.9]. We then assessed the quality of the maximum likelihood estimate of the parameters by calculating bootstrapped confidence intervals (with N = 1000 bootstrap samples) and comparing the estimator and true values (see Quantifying the accuracy of the estimated correlation parameters).

First, we assessed the validity of the confidence interval by examining how well the empirical coverage probability matches the confidence level as constructed. To do so, we constructed 90% confidence intervals for the (ρN, ρD) parameters via bootstrap resampling. We then grouped by the ground truth ρ values using a sliding window and counted the proportion of cases in that bin for which the bootstrap confidence interval contains the true value. We found that for both ρN and ρD the coverage is near the nominal level: for ρN it is slightly lower (Fig 3A left) while for ρD it is nearly equivalent (Fig 3A right). This indicates that the confidence intervals constructed via the bootstrap are valid and can be used for further analysis.
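The coverage calculation can be sketched as below, assuming the bootstrap confidence bounds have already been computed for each synthetic pair. The function name `sliding_coverage` is hypothetical; the default window width (0.4), step (0.2), and range [−0.9, 0.9] follow the values given in the Fig 3 caption.

```python
import numpy as np

def sliding_coverage(true_rho, ci_low, ci_high, width=0.4, step=0.2,
                     lo=-0.9, hi=0.9):
    """Empirical coverage of confidence intervals in a moving window over
    the ground-truth value."""
    true_rho = np.asarray(true_rho, dtype=float)
    ci_low = np.asarray(ci_low, dtype=float)
    ci_high = np.asarray(ci_high, dtype=float)
    # A case is covered when its interval contains the ground truth.
    covered = (ci_low <= true_rho) & (true_rho <= ci_high)
    centers, coverage = [], []
    for start in np.arange(lo, hi - width + 1e-9, step):
        in_window = (true_rho >= start) & (true_rho <= start + width)
        if in_window.any():  # skip empty windows
            centers.append(start + width / 2)
            coverage.append(covered[in_window].mean())
    return np.array(centers), np.array(coverage)

# Toy check: intervals contain the truth only when the true value is positive.
rng = np.random.default_rng(0)
truth = rng.uniform(-0.9, 0.9, size=1000)
low = np.where(truth > 0, truth - 0.1, truth + 0.1)
high = low + 0.2
centers, coverage = sliding_coverage(truth, low, high)
```

For a valid 90% interval, the returned coverage values would be expected to lie close to 0.9 in every window.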

Fig 3. Accuracy of inference of ρ parameters.

Plots were generated with 11628 synthetic parameter pairs with uniformly randomly generated ρN, ρD ∈ [−0.9, 0.9], contrast levels {6.25, 12.5, 25, 50, 100}, 1000 synthetic trials and 1000 bootstrap resamples. The left column shows the results of the analysis for ρN, the right column for ρD. (A) Empirical coverage probability for the 90% confidence intervals as a function of the ground truth ρ values. The dotted line indicates the nominal confidence level. Coverage probability was computed as the proportion of cases with a specified range of ρ values for which the 90% confidence intervals contained the true value. We used a moving window with width 0.4 and a step size of 0.2. (B) Direct comparison between the true ρ value and the maximum likelihood estimator for the ρ value. The darker colors are the pairs for which the ρ parameters are significantly different from 0 (i.e., the 90% confidence interval excludes 0), whereas the lighter colors are not significant (i.e., the 90% confidence interval includes 0).

Next, we directly compared the true and inferred ρ values (Fig 3B). In general, the maximum likelihood estimators are largely unable to recover the true generating values (the overall Pearson correlation between the fit and true (ρN, ρD) = (0.48, 0.18), the mean squared error across pairs is (0.32, 0.57)), indicating that these parameters are not identifiable with the parametrization we considered (contrast tuning). Further, this analysis indicates that inference of ρD is in general more difficult than it is for ρN. The lack of identifiability of these parameters is likely due to the numerous multiplicative interactions between the parameters. For instance, by looking closer at Eq (7), we can see that the contribution of the ρ parameters to the noise correlation is multiplied by the respective standard deviations for the numerator or denominator variables. Such interactions may make it difficult to infer the exact value of the ρ parameters (see S6 Text for further discussion).

As we established the validity of the bootstrapped confidence intervals (Fig 3A), one could select pairs for which the parameter inference is accurate to the desired precision (i.e., selecting those pairs whose confidence intervals are a certain width). Additionally, one can also perform population-level analyses on the ρ parameters, as we demonstrate for experimental data (see Pairwise Ratio of Gaussians model captures correlated variability in mouse V1). Here, we introduce and validate another kind of analysis one can perform with the bootstrapped confidence intervals. Rather than the exact magnitude of the ρ parameters, we are often only interested in the sign of these parameters, as in deriving the relationship between normalization strength and noise correlations (Modulations of correlated variability depend on sharing of normalization). Moreover, because all the parameters in Eq (7) are a priori positive besides the ρ parameters, we reasoned that the accuracy of sign inference might be higher. Therefore, we considered the subset of cases where the parameters are significantly different from 0, which we define as cases where the 90% confidence interval does not include 0 and the pairwise goodness of fit is greater than the independent goodness of fit (see Cross-validated goodness of fit). First, for those pairs (for ρN, 4598/11628 pairs were significant, for ρD, 1625/11628), we see a much stronger relationship between the fit and true ρ values (Fig 3B, darker colors; Pearson correlation between significant fit and true (ρN, ρD) = (0.89, 0.57); mean squared error across pairs = (0.09,0.43)). 
Importantly, the plots also show that, for pairs with ρ parameters significantly different from 0, the sign of the inferred ρ parameter very frequently matches the sign of the true ρ parameter, and more often for ρN than for ρD (for ρN, 4509/4598 (98%) of pairs significantly different from 0 have the same sign; for ρD, 1309/1625 (81%)). From this, we conclude that, for these significant ρ parameters, the sign of the inferred ρ parameter is generally accurate.
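The sign comparison for significant pairs amounts to a simple count, sketched below. The function name `sign_agreement` and the optional `extra_mask` argument (intended for the additional goodness-of-fit condition) are hypothetical; the last two lines only recompute the proportions from the counts reported in the text.

```python
import numpy as np

def sign_agreement(true_rho, est_rho, ci_low, ci_high, extra_mask=None):
    """Count pairs whose interval excludes 0, and how many of those have an
    estimate with the same sign as the ground truth."""
    true_rho, est_rho = np.asarray(true_rho), np.asarray(est_rho)
    ci_low, ci_high = np.asarray(ci_low), np.asarray(ci_high)
    significant = (ci_low > 0) | (ci_high < 0)  # interval excludes 0
    if extra_mask is not None:                  # e.g. goodness-of-fit criterion
        significant &= np.asarray(extra_mask, dtype=bool)
    same = np.sign(est_rho[significant]) == np.sign(true_rho[significant])
    return int(significant.sum()), int(same.sum())

n_sig, n_same = sign_agreement(
    true_rho=[0.5, -0.3, 0.2, 0.1],
    est_rho=[0.4, 0.2, 0.3, 0.05],
    ci_low=[0.1, 0.05, 0.1, -0.2],
    ci_high=[0.7, 0.4, 0.5, 0.3],
)

# Proportions implied by the counts reported in the text.
frac_rho_n = 4509 / 4598
frac_rho_d = 1309 / 1625
```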

In summary, our analysis indicates that it is difficult to estimate the precise value of the ρ parameters in general, for the stimulus parametrization considered here (i.e., the classical normalization model for contrast tuning). However, we have provided a method to calculate bootstrapped confidence intervals for the maximum likelihood estimators of the ρ parameters and have shown that these confidence intervals accurately represent the uncertainty around those estimates. We then demonstrated one possible use-case for these confidence intervals: for ρ estimates that are significantly different from 0, the ρ estimators are accurately able to recover the sign of the ground truth ρ parameters.

Pairwise model improves single-trial inference of normalization strength even when noise correlations are small

In past work that connected normalization to modulation of noise correlations, stimulus and experimental manipulations (e.g., contrast, attention) are used as proxies for normalization strength [51, 52, 55] because normalization strength cannot be measured directly. However, these manipulations also affect other factors that drive neural responses (as we have illustrated in Modulations of correlated variability depend on sharing of normalization, Fig 2), which could confound these as measures of normalization signal. Therefore, quantitatively testing the relationship between noise correlations and normalization requires estimating the single-trial normalization strength for a pair of neurons. One of the advantages of our probabilistic, generative formulation of the pairwise RoG model (Eq (2)) is that it allows us to infer the single-trial normalization strength from measured neural activity (see Inference of single trial normalization from measured neural activity). The independent RoG model also provides an estimate for the single-trial normalization, which is known to be a valid estimator for the ground-truth normalization strength for data generated from the independent RoG [53]. We found similar results for the pairwise RoG, so we examined how the pairwise estimates for the single-trial normalization strength compare to estimates based on the independent model.

One possibility is that the estimate derived from the pairwise model would outperform the independent model as the magnitude of noise correlations increases. However, this is not necessarily the case. Fig 4A and 4B demonstrate this with two example synthetic neuron pairs. Because the single-trial normalization inference depends on the single-trial neural activity, correlations between neurons will induce correlations between the inferred normalization signals even for the independent RoG (Fig 4A). However, when noise correlations are small due to cancellation between ρN and ρD, the independent model will infer minimal correlation between normalization signals while the pairwise model will correctly infer that the single-trial normalization is correlated (Fig 4B).

Fig 4. Inference of Single-trial normalization depends on ρ parameters and contrast level.

(A and B) Two simulated experiments drawn from the pairwise RoG with different noise correlations arising from different underlying ρN, ρD values to compare the pairwise and independent estimators of single-trial normalization with the ground truth normalization signal. (A) has overall noise correlation of 0.21 across contrasts generated with ρN = 0, ρD = 0.3, (B) has overall noise correlation of -0.05 across contrasts generated with ρN = −0.3, ρD = 0.3. Z-scoring performed across trials. Random parameters drawn from Rmax ∈ [10, 100], ϵ ∈ [15, 25], αN, αD ∈ [0.1, 1], βN, βD ∈ [1, 1.5], η ≡ 0. Contrast levels were {6.25, 12.5, 25, 50, 100}. (C and D) Comparison of the single-trial normalization inference in the pairwise and independent RoG models for the mean-squared error (C) and the correlation between the estimate and the true value (D), as it depends on ρN, ρD and the contrast levels. Left: lowest contrast level (6.25); middle: intermediate contrast level (25); right: full contrast (100). Each bin corresponds to the median difference between the pairwise and independent models across simulated experiments. (C and D) used 11628 synthetic pairs (see Generating realistic pairwise neural activity from the model).

To demonstrate this principle, we computed the difference of the mean squared error (Fig 4C) and correlations (Fig 4D) between the pairwise RoG and independent RoG normalization inference (with respect to the ground truth values) for many simulated pairs (11628) with systematically varying ρN, ρD values. First, we found that the distinction between the two models depended on the contrast level: as contrast levels increase, the quality of the pairwise and independent estimators of the normalization signal becomes more distinct. Second, the improvement of the pairwise RoG estimate over the independent estimate increased when the magnitude of the ρD parameter increased. Consistent with the intuition provided by Fig 4B, the largest improvement occurred when the ρN parameter had a large value that was the opposite sign of ρD. The dependence on the ρD parameter reflects that the estimator for the pairwise model incorporates knowledge about correlation between the normalization signals.

In summary, this analysis shows that the pairwise model estimates of the single trial normalization can improve upon the independent model even when noise correlations are small. As the single-trial normalization estimator from the independent model was previously shown to be accurate [53], our results imply that the pairwise model estimate is also able to recover the ground-truth normalization strength. Additionally, we have outlined the conditions in which those estimates are preferable to those obtained with the independent model.

Pairwise Ratio of Gaussians model captures correlated variability in mouse V1

To test how well the model captures experimental data, we applied it to calcium imaging data recorded in V1 of mice responding to sinusoidal gratings of varying contrast levels (see Data collection and processing). We analyzed neurons that were strongly driven by the visual stimuli (N = 295 neurons, 5528 simultaneously recorded pairs; see Data collection and processing for inclusion criteria). We focused on stimulus contrast tuning (Eq (8)) because the formulation of the corresponding standard normalization model captures firing rate data well [36], and visual contrast affects both normalization strength and the strength of noise correlations [41].

Because the RoG framework has not been validated before on mouse V1 fluorescence data, we first applied the independent RoG and found that it provided excellent fits (average cross-validated goodness of fit 0.85, 95% c.i. [0.846,0.858], 5163/5528 pairs with goodness of fit >0.5) on par with that found in macaque V1 data recorded with electrode arrays [53], thus demonstrating that the RoG framework is flexible enough to capture datasets with different statistics. In both cases, the analysis was performed on visually responsive neurons, which therefore exhibited strong contrast tuning of the firing rate, partly explaining the high performance. However, the independent RoG could not capture correlated variability, which was prominent in the data (median noise correlation across all pairs and contrasts 0.117, c.i. [0.115, 0.120], 2781/5528 pairs had noise correlations significantly different from 0).
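For reference, a per-contrast noise correlation of the kind summarized above can be computed as in the following sketch. It assumes that noise correlation is taken as the Pearson correlation of the two neurons' single-trial responses across repeats of the same contrast; the exact estimator and summary statistic used for the reported values may differ. The function name `noise_correlations` and the toy data are hypothetical.

```python
import numpy as np

def noise_correlations(resp1, resp2, contrast):
    """Pearson correlation across trials, computed separately per contrast.

    resp1, resp2: single-trial responses of two simultaneously recorded cells.
    contrast: contrast label of each trial (same length as the responses).
    """
    resp1, resp2 = np.asarray(resp1), np.asarray(resp2)
    contrast = np.asarray(contrast)
    out = {}
    for c in np.unique(contrast):
        sel = contrast == c  # trials at this contrast only
        out[float(c)] = float(np.corrcoef(resp1[sel], resp2[sel])[0, 1])
    return out

# Toy data: contrast-dependent mean plus a shared trial-to-trial fluctuation.
rng = np.random.default_rng(1)
labels = np.repeat([8.0, 32.0, 100.0], 300)
shared = rng.normal(size=labels.size)
cell1 = labels / 10 + shared + 0.5 * rng.normal(size=labels.size)
cell2 = labels / 10 + shared + 0.5 * rng.normal(size=labels.size)
per_contrast = noise_correlations(cell1, cell2, labels)
```

Computing the correlation within each contrast avoids mixing stimulus-driven changes in the mean response into the estimate, which would happen if trials were pooled across contrasts without first removing the mean.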

Therefore, we tested if the pairwise RoG could capture correlated variability in the data. Fig 5A and 5B demonstrate that the model can capture contrast-dependent noise correlations, both for pairs with positive (example in Fig 5A; 4327/5528 pairs) and negative median noise correlations (example in Fig 5B; 1201/5528 pairs). Importantly, even though the ρN, ρD parameters were stimulus-independent, the pairwise model captured substantial changes in noise correlations with contrast for many of the pairs analyzed (2991/5528 had greater than 0.5 correlation between the observed noise correlations and model fit, across contrast levels). However, this ability to capture correlations comes at the cost of larger model complexity. To account for this, we next quantitatively compared the pairwise and independent models, using a cross-validated goodness of fit score (Eq (12)). The pairwise model slightly outperformed the independent model on average and for most pairs (median difference in goodness of fit = 0.0121, p < 0.001, 4123/5528 pairs with pairwise goodness of fit greater than independent), indicating that the additional free parameters are warranted. Furthermore, because the independent model is a special case of the pairwise model with noise correlations fixed at zero, we found as expected that the performance difference between the two models increased for pairs of neurons with larger noise correlations (see S7 Text).

Fig 5. Pairwise RoG captures contrast-dependent noise correlations in mouse V1.

(A and B) Pairwise neural responses for two example pairs of neurons in mouse V1 with positive (A) and negative (B) median noise correlations. From left to right: 1) empirical mean and covariance ellipses (∼1 standard deviation from the empirical mean) for pairwise responses at each contrast level; 2) the RoG model predicted means and covariance ellipses, the panel title includes the cross-validated goodness of fit score; 3) the modulated Gaussian (MG) predicted means and covariance ellipses; 4) compares the two model fit noise correlation values (continuous lines), with the empirical values as a function of contrast (error bars are 68% bootstrapped confidence interval). Neuronal pair in (A) had 93 repeats of each stimulus contrast, pair in (B) had 68 repeats. (C) Scatter plot across all pairs of the goodness of fit score for modulated Gaussian vs. the goodness of fit for the RoG. (D) Histogram of the difference between the scores in (C). Contrast levels are {2,8,16,32,64,80,100}.

To benchmark the RoG against a widely adopted alternative model, we considered the modulated Poisson model that was previously shown to capture noise correlations in macaque V1 [69]. For application to our imaging dataset and for a fair comparison with the RoG, we used Gaussian noise instead of Poisson [76] and termed this the modulated Gaussian (MG) model (see Model comparison). The example pairs demonstrate that, while in some pairs the MG can capture the modulation of noise correlations with contrast as well as the RoG (Fig 5A), it is not able to capture it in other pairs while the RoG can (Fig 5B). Across the dataset, for the majority of pairs (5238/5528), the pairwise RoG had a higher goodness of fit score (Eq (12)) than the MG (Fig 5C and 5D, median difference between goodness of fit for RoG and MG = 0.238, 95% c.i. [0.232,0.245]). These results were also largely independent of the specific preprocessing method applied to the calcium imaging data (see S4 Text). Moreover, although both models capture the tuning of noise correlations with contrast level by using stimulus-independent correlation parameters, the RoG model better predicts the trend in noise correlations with contrast than the MG (median difference in correlations between model fit noise correlations and empirical noise correlations = 0.0771, 95% c.i. [0.071, 0.084]; 1450/5528 pairs had statistically significant correlation between the pairwise RoG predictions and empirical noise correlations compared to 1008/5528 for the MG). In principle the MG model’s ability to capture the modulation of noise correlations with contrast could be improved by including contrast dependence in the correlation parameters explicitly, although this would increase model complexity.

These results demonstrate that the pairwise RoG captures a range of effects of stimulus contrast on noise correlations observed in experimental data and performs competitively against a popular alternative model that does not account for normalization explicitly.

Next, we analyzed the correlation parameters (ρN, ρD) in the model fit (Fig 6). We first selected only those pairs of neurons whose pairwise goodness of fit exceeded both 0.5 and the independent goodness of fit (3920/5528 total), and we computed the bootstrapped 90% confidence interval for the (ρN, ρD) parameters (see Quantifying the accuracy of the estimated correlation parameters).

Fig 6. Inferred ρN, ρD are positive in Mouse V1.

Histograms comparing the inferred ρN (A) and ρD (B) values for all neuronal pairs meeting our goodness of fit criteria (outlined) and the subset of those pairs significantly different from zero with 90% confidence (filled). The histograms for all pairs and pairs significantly different from 0 are normalized separately.

Examining the correlation parameters for all of the pairs meeting the goodness of fit criteria (Fig 6A and Fig 6B outlined), we see a significant bias towards positive values (median ρN, ρD parameter values = 0.84, 1). This is partially due to the large number of cases in which the fit parameters were exactly equal to ±1 (for ρN, ρD, 1442 and 2700 fit values were ±1). However, even when excluding these pairs, the trend within the population is still towards positive fit ρ values (median fit ρN, ρD values excluding extreme pairs is 0.29, 0.22). This analysis suggests that these signals are, on aggregate, shared among the population recorded; in particular, this suggests that normalization is typically shared between the pairs recorded.

As a complementary analysis, we then focused on the cases where the parameters were assessed to be significantly different from zero (1270/3920 for ρN, 192/3920 for ρD) (Fig 6A and 6B filled). The proportion of pairs for which the estimated ρ parameters are significantly different from 0 is similar to the synthetic data (see Inference of correlation parameters). For these pairs that were significant, we found that nearly all inferred ρN (1239) and ρD (191) parameters were positive (see Fig 6), suggesting that normalization signals are generally shared for these pairs of neurons.

In summary, our results demonstrate a new approach to quantify how strongly normalization signals are shared between neurons, and to explain the diverse effects of normalization on noise correlations.


We introduced a stochastic model of divisive normalization, the pairwise RoG, to characterize the trial-to-trial covariability between cortical neurons (i.e., noise correlations). The model provides excellent fits to calcium imaging recordings from mouse V1, capturing diverse effects of stimulus contrast and normalization strength on noise correlations (Figs 5 and 6). We demonstrated that the effect of normalization on noise correlations differs depending on the sources of the variability, and that the model can accommodate both increases and decreases in noise correlations with normalization (Fig 2) as past experiments had suggested. We then investigated the accuracy of inference of a key model parameter, which determines whether normalization is shared between neurons, and we provided a procedure for quantifying the uncertainty of this inference using bootstrapping (Fig 3). Lastly, we derived a Bayesian estimator for the single-trial normalization signals of simultaneously recorded pairs. Surprisingly, this estimator can be more accurate than the estimator based on the model that ignores noise correlations (the independent RoG) even when noise correlations are negligible (Fig 4).
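To make the generative picture concrete, the sketch below simulates a pair of ratio-of-Gaussians responses and compares the resulting noise correlation with a first-order (delta-method) approximation. It is an illustration under simplifying assumptions that are ours: numerators and denominators are independent of each other, any additive noise term is omitted, the denominator means lie many standard deviations above zero, and all parameter values and function names are hypothetical.

```python
import numpy as np


def _pair_cov(sd, rho):
    """2x2 covariance matrix from per-neuron standard deviations and a correlation."""
    sd = np.asarray(sd, dtype=float)
    return np.outer(sd, sd) * np.array([[1.0, rho], [rho, 1.0]])


def simulate_pairwise_rog(mu_n, sd_n, mu_d, sd_d, rho_n, rho_d, n_trials, rng):
    """Draw R_i = N_i / D_i for two neurons over n_trials trials."""
    num = rng.multivariate_normal(mu_n, _pair_cov(sd_n, rho_n), size=n_trials)
    den = rng.multivariate_normal(mu_d, _pair_cov(sd_d, rho_d), size=n_trials)
    return num / den


def first_order_noise_correlation(mu_n, sd_n, mu_d, sd_d, rho_n, rho_d):
    """Delta-method correlation of the two ratios (numerators independent of denominators)."""
    mu_n, sd_n, mu_d, sd_d = (
        np.asarray(x, dtype=float) for x in (mu_n, sd_n, mu_d, sd_d)
    )
    a = sd_n / mu_d               # numerator contribution to each response s.d.
    b = mu_n * sd_d / mu_d ** 2   # denominator contribution to each response s.d.
    cov = rho_n * a[0] * a[1] + rho_d * b[0] * b[1]
    return cov / np.sqrt((a[0] ** 2 + b[0] ** 2) * (a[1] ** 2 + b[1] ** 2))


# Hypothetical parameter values chosen so the denominators stay far from zero.
params = dict(mu_n=[10.0, 12.0], sd_n=[1.0, 1.2], mu_d=[5.0, 6.0], sd_d=[0.3, 0.3])
rng = np.random.default_rng(0)

shared = simulate_pairwise_rog(rho_n=0.2, rho_d=0.8, n_trials=200_000, rng=rng, **params)
empirical_shared = np.corrcoef(shared[:, 0], shared[:, 1])[0, 1]
approx_shared = first_order_noise_correlation(rho_n=0.2, rho_d=0.8, **params)
approx_unshared = first_order_noise_correlation(rho_n=0.2, rho_d=0.0, **params)
```

With these values, strongly shared normalization (ρD = 0.8) yields a larger first-order noise correlation than independent normalization (ρD = 0) at the same ρN; choosing a negative ρD would instead pull the correlation down.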

As a descriptive, data analytic tool, our modeling framework complements normative and mechanistic theories of neural population variability. For instance, normative probabilistic accounts of sensory processing have suggested that divisive normalization may play a role in the inference of perceptual variables by modulating neural representations of uncertainty [15, 18, 75, 77, 86–88]. Similarly, normalization could play a key role in multisensory cue combination [39, 89, 90]. However, the posited effect of normalization on covariability has not been tested quantitatively, as normalization signals are often not measurable. The pairwise RoG will allow researchers to test these hypotheses by providing a means with which to estimate normalization signals from neural data and relate these to measures of neural covariability. In circuit-based models of neural dynamics such as the stabilized supralinear network [25] and the ORGaNICs architecture [91], the normalization computation emerges naturally from the network dynamics [26] and shapes the structure of stimulus-dependent noise correlations [92]. By quantifying the parametric relation between normalization and covariability, our descriptive tool will enable mapping those parameters onto the different circuit motifs and cell types posited by these network models.

When comparing the RoG to the modulated Gaussian model (see Model comparison), we found that the RoG had better performance for the majority of pairs (Fig 5C). We chose to adapt the modulated Poisson model [69] as a comparison to the RoG because it was shown to successfully capture noise correlations in recordings from macaque V1. Moreover, this model belongs to the class of Generalized Linear Models, which are among the most widely used encoding models for neural activity [27, 93]. There are numerous alternative descriptive models of correlated neural population activity, among the most popular of these being latent variable models (LVMs), in which population-wide activity arises from interactions between a small set of unobserved variables [28, 33, 94–97]. This effectively partitions the population noise covariance into underlying causes (i.e., latents) that are responsible for coordinating neural responses, which resembles our attribution of noise correlations to either shared input drive or shared normalization pools (Fig 2). The RoG, on the other hand, is a pairwise model that seeks to explicitly characterize neural interactions through divisive normalization, which cannot be done with any existing LVMs; integrating normalization into the LVM framework is an important future extension of our model. One benefit of our current approach is that the RoG can be applied to any scenario in which two or more neurons are simultaneously recorded. LVMs can only be applied to relatively large populations of simultaneously recorded neurons to estimate the globally shared latent factors. This is not always feasible for regions of the brain that are difficult to record from or using techniques such as intracellular voltage recordings [98, 99]. The downside of a method such as the RoG is scalability to large populations, as the model parameters must be optimized for each recorded pair, which can be computationally expensive for modern datasets with thousands of neurons [100].
Nonetheless, we were able to fit the RoG to data across multiple different preprocessing methods (∼90000 pairs total) in a reasonable time (∼ 27 hours running in parallel on a 28-core server without GPU acceleration), suggesting that it is not entirely impractical to use the pairwise RoG on a large dataset.
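The reported figures give a rough sense of the per-pair cost and of how it grows with population size, since the number of pairs among n neurons is n(n − 1)/2. The arithmetic below is a back-of-the-envelope sketch that treats the approximate values quoted above as exact and assumes perfect parallel efficiency; the function names and the example population size of 1000 neurons are hypothetical.

```python
from math import comb


def n_pairs(n_neurons):
    """Number of distinct neuron pairs in a population."""
    return comb(n_neurons, 2)


def core_seconds_per_fit(wall_hours, n_cores, n_fits):
    """Average compute per fitted pair, assuming all cores are fully used."""
    return wall_hours * 3600.0 * n_cores / n_fits


def wall_hours_for_population(n_neurons, seconds_per_fit, n_cores):
    """Projected wall-clock time to fit every pair in a population."""
    return n_pairs(n_neurons) * seconds_per_fit / (3600.0 * n_cores)


# Approximate figures from the text: ~90000 fits in ~27 hours on 28 cores.
cost = core_seconds_per_fit(wall_hours=27.0, n_cores=28, n_fits=90_000)

# Hypothetical projection for a 1000-neuron recording on the same machine.
projected_hours = wall_hours_for_population(1000, cost, n_cores=28)
```

Because the pair count grows quadratically, doubling the number of neurons roughly quadruples the projected fitting time under these assumptions.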

Three models have directly studied the relationship between normalization and across-trial covariability [51, 52, 101]. Tripp’s [51] simulation work on velocity tuning in the middle temporal area (MT) consistently predicted that normalization would decorrelate neural responses. However, we found that noise correlations could also increase with normalization. This difference arises because Tripp modeled correlations as arising solely from tuning similarity between neurons. Conversely, in the RoG framework, noise correlations originate from input correlations ρN and correlations between normalization signals ρD. Our model therefore offers more flexibility than Tripp’s by allowing relationships between normalization and correlation to depend on the sources of correlations. Verhoef and Maunsell [52] investigated the effect of attention on noise correlations by using a recurrent network implementation of the normalization model of attention [49]. They describe multiple different patterns of the effect of normalization on noise correlations depending on tuning similarity between a pair of neurons and where attention is directed. Our model does not currently account for the effect of attention, but this would be possible by adapting the standard normalization model of attention, which would require an additional parameter for the attentional gain. These prior two models are also primarily simulation based, while our model is meant to be data analytic. Lastly, Ruff and Cohen [101] proposed a normalization model to explain how attention increases correlations between V1 and MT neurons [55]. They modeled the trial-averaged MT neural responses as a function of trial-averaged responses of pools of V1 neurons. After fitting the parameters, single-trial MT responses were predicted by feeding the pooled single-trial V1 responses into the equation.
By construction, variability in predicted MT neural responses only arises from variability in the V1 neural responses, which only occur in the numerator of their normalization model. Our model also allows for variability in the denominator of the normalization equation and therefore their model can be seen as a special case of the pairwise RoG.
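The contrast with these earlier models can be seen in a first-order sketch. Assume, purely for illustration, two statistically identical neurons whose responses are R_i = N_i/D_i, with numerators independent of denominators, numerator correlation ρN, denominator correlation ρD, and a denominator mean well above zero. This is a simplified derivation under those assumptions and is not meant to reproduce Eq (7).

```latex
\begin{align}
R_i &\approx \frac{\mu_N}{\mu_D} + \frac{N_i - \mu_N}{\mu_D}
      - \frac{\mu_N\,(D_i - \mu_D)}{\mu_D^{2}}, \\
\operatorname{Var}(R_i) &\approx a + b, \qquad
      a = \frac{\sigma_N^{2}}{\mu_D^{2}}, \quad
      b = \frac{\mu_N^{2}\,\sigma_D^{2}}{\mu_D^{4}}, \\
\operatorname{Cov}(R_1, R_2) &\approx a\,\rho_N + b\,\rho_D, \\
\operatorname{Corr}(R_1, R_2) &\approx \frac{a\,\rho_N + b\,\rho_D}{a + b}.
\end{align}
```

Under this approximation the noise correlation is a weighted average of ρN and ρD, so a manipulation that shifts weight between a and b can raise or lower correlations depending on which of the two is larger. Setting σD = 0 gives b = 0 and a correlation equal to ρN, which corresponds to the numerator-only variability described above.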

An important limitation of our model is that the correlation parameters (ρN, ρD) are not identifiable (Fig 3), meaning the model parametrization is such that multiple different parameter sets result in equivalent models (e.g., equivalent likelihoods and moments). This is a common issue when using complex nonlinear models such as the one proposed here [102]: in our model, this is due to multiplicative interactions between model parameters. Improving parameter estimation will require better constraints on model parameters, alternative optimization algorithms, or different objective functions (see S6 Text for further discussion). Future extensions of the model to population level interactions through latent variable models offer another avenue to improve parameter estimation: the variability of population activity is often low-dimensional, which could naturally impose parameter constraints. Nonetheless, we developed a method to calculate confidence intervals for the estimates of the ρ parameters, which can be used to select pairs for which the estimates’ uncertainty is less than a desired level. As an example application of this approach, we have demonstrated that the confidence intervals can be used to determine when the sign of those parameters, which is an important factor in controlling noise correlations (Fig 2), can be recovered accurately. We showed that we were able to accurately recover the sign of the correlation in synthetic datasets when the bootstrap confidence interval for the parameter of interest excluded 0. In the V1 dataset analyzed, we found that (1270, 192)/3920 pairs meet this criterion for ρN, ρD respectively, and that the vast majority of those pairs had positive ρN, ρD. Although this is a minority of cases, it demonstrates that typical datasets with existing recording technologies could nonetheless provide sufficient power for studies that focus on the ρ parameter values.
It will be important in future work to understand which experimental conditions would maximize the yield of pairs with accurate estimates of the ρ parameters.

We chose in this study to primarily analyze the normalized fluorescence traces (ΔF/F) rather than using deconvolution or spike inference methods (see [103–105] for a review). Deconvolution methods were developed in part due to the slow temporal dynamics of the calcium indicators relative to membrane potentials generating spiking activity [106, 107]. Deconvolution and other spike inference techniques attempt to mitigate this limitation for analyses that depend on more exact measures of spike timing, and developers note these methods should be avoided when temporal information is not relevant and the raw calcium traces provide “sufficient information” [104]. Because of the construction of the contrast detection task (see Data collection and processing) and the temporally invariant nature of contrast responses in V1 [108], the analysis of the dataset presented here does not require precise temporal information, so the use of normalized fluorescence traces was sufficient. Additionally, deconvolution changes the statistics of the data greatly, such as altering the distribution of noise correlations and increasing the sparsity of the fluorescence signal [109]. One recent work attempted to account for these differences by using more appropriate probabilistic models [110] but does not currently model noise correlations. On the other hand, calcium fluorescence is an indirect measure of neuronal communication and coding, being related to the underlying action potentials through a complex generative model [111, 112]. As such, it might be inappropriate or insufficient to apply an encoding model directly to the ΔF/F traces, as we have done here. To address this concern, we additionally analyzed deconvolved traces using two variants of the OASIS method [113]: unconstrained OASIS as found in suite2p [83], or OASIS with an ℓ1 sparsity constraint as in [114].
As expected, the deconvolution techniques significantly altered the distribution of noise correlations, but the results of our analysis of these deconvolved data were qualitatively in line with the results obtained on the raw calcium traces (see S4 Text).

The generality of the modeling framework presented here leaves room for future expansion. One such direction would be to increase the dimensionality to model correlations among a neural population. This would require more correlation parameters, which could make the model more difficult to fit to data. However, because population variability is low-dimensional [84, 115–119], it is likely that this issue could be circumvented by applying dimensionality reduction techniques within the model or by allowing the sharing of correlation parameters across a neural population. Another interesting application of this model would be to look directly at the effects of normalization on information transmission and representation. The relationship between noise correlations and the amount of information that can be represented by a neural population has been widely discussed [7–9, 11, 12, 120]. Moreover, some experimental and theoretical work has connected modulations of information in neural populations with computations that have been modeled with normalization models, such as surround suppression and attention [44, 51, 121, 122]. Our model could be modified to investigate this connection and further illuminate the effects of normalization on information transmission.

Supporting information

S1 Text. Derivation of moments for the generalized model.

Moments of the Ratio of Gaussians distribution for the general case of cross-correlations between numerator and denominator.


S2 Text. Derivation of negative log-likelihood for the model.

Details of the calculation of the negative-log likelihood for the Ratio of Gaussians distribution.


S3 Text. Negative log-posterior for inference of single-trial normalization strength.

Expands on Inference of single trial normalization from measured neural activity, Eq (14), showing the coefficient expressions.


S4 Text. Analysis of deconvolved imaging data.

Examining model performance on deconvolved fluorescence traces.


S5 Text. Derivation of relationship between mean normalization strength and noise correlations.

Mathematical derivations relating noise correlations and normalization in the model.


S6 Text. Further discussion of parameter identifiability.

Additional considerations for ρ parameter estimation from data.


S7 Text. Pairwise model outperforms the independent model in simulations and V1 data when noise correlations are large.

Comparison of pairwise and independent Ratio of Gaussians model goodness of fits.


S1 Fig. Relationship between noise correlations and Ratio of Gaussians parameters.

Noise correlations in the model (Eq (7) in Methods subsection Generative model—pairwise Ratio of Gaussians (RoG)) can be modulated by stimulus strength (i.e., contrast), the correlation parameters of the model (ρN, ρD) and the parameters of the normalization model, in this case and (ϵ1, ϵ2) (see Fig 2). To understand these effects in isolation, we looked at how noise correlations (Eq (7)) changed with respect to each parameter, while keeping the other parameters constant. We illustrate with noise correlations that increase with contrast (A1-E1), and correlations that decrease with contrast (A2-E2). (A) Dependence of noise correlations on contrast. Three contrast levels that are fixed in the other panels are shown. (B) Dependence of noise correlations on ρN. (C) Dependence of noise correlations on ρD. (D) Dependence of noise correlations on , shown as a contour plot with the shade of color indicating noise correlation level. Different colors indicate different contrast levels as shown in the legend. (E) Dependence of noise correlations on (ϵ1, ϵ2).

(A1-E1) uses the following parameters (when not fixed): (A2-E2) uses the same parameters except with (ρN, ρD) = (0.5, 0). Contrast levels were {1,…,100}.


S2 Fig. Relationship between noise correlations and denominator strength.

Expands upon Fig 2 (see Results subsection Modulations of correlated variability depend on sharing of normalization) to include cases where (ρN, ρD) can be negative or have opposite signs. Figure was created using the exact same method and synthetic dataset as Fig 2: see the caption in the main text for details.



We thank members of the Kohn and Coen-Cagli laboratories for feedback on the manuscript. We also thank Daniel Quintana for help with animal training and Kenny Ye for advice on statistical analyses.


  1. Shadlen MN, Newsome WT. The Variable Discharge of Cortical Neurons: Implications for Connectivity, Computation, and Information Coding. Journal of Neuroscience. 1998;18(10):3870–3896. pmid:9570816
  2. Tolhurst DJ, Movshon JA, Dean AF. The Statistical Reliability of Signals in Single Neurons in Cat and Monkey Visual Cortex. Vision Research. 1983;23(8):775–785. pmid:6623937
  3. Cohen MR, Kohn A. Measuring and Interpreting Neuronal Correlations. Nature Neuroscience. 2011;14(7):811–819. pmid:21709677
  4. Rumyantsev OI, Lecoq JA, Hernandez O, Zhang Y, Savall J, Chrapkiewicz R, et al. Fundamental Bounds on the Fidelity of Sensory Cortical Coding. Nature. 2020;580(7801):100–105. pmid:32238928
  5. Kafashan M, Jaffe AW, Chettih SN, Nogueira R, Arandia-Romero I, Harvey CD, et al. Scaling of Sensory Information in Large Neural Populations Shows Signatures of Information-Limiting Correlations. Nature Communications. 2021;12(1):473. pmid:33473113
  6. Bartolo R, Saunders RC, Mitz AR, Averbeck BB. Information-Limiting Correlations in Large Neural Populations. The Journal of Neuroscience. 2020;40(8):1668–1678. pmid:31941667
  7. Abbott LF, Dayan P. The Effect of Correlated Variability on the Accuracy of a Population Code. Neural Computation. 1999;11(1):91–101. pmid:9950724
  8. Averbeck BB, Lee D. Effects of Noise Correlations on Information Encoding and Decoding. Journal of Neurophysiology. 2006;95(6):3633–3644. pmid:16554512
  9. Kohn A, Coen-Cagli R, Kanitscheider I, Pouget A. Correlations and Neuronal Population Information. Annual Review of Neuroscience. 2016;39(1):237–256. pmid:27145916
  10. Hu Y, Zylberberg J, Shea-Brown E. The Sign Rule and Beyond: Boundary Effects, Flexibility, and Noise Correlations in Neural Population Codes. PLoS Computational Biology. 2014;10(2):e1003469. pmid:24586128
  11. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget A. Information-Limiting Correlations. Nature Neuroscience. 2014;17(10):1410–1417. pmid:25195105
  12. Zohary E, Shadlen MN, Newsome WT. Correlated Neuronal Discharge Rate and Its Implications for Psychophysical Performance. Nature. 1994;370(6485):140–143. pmid:8022482
  13. Kanitscheider I, Coen-Cagli R, Pouget A. Origin of Information-Limiting Noise Correlations. Proceedings of the National Academy of Sciences. 2015;112(50). pmid:26621747
  14. Panzeri S, Moroni M, Safaai H, Harvey CD. The Structures and Functions of Correlations in Neural Population Codes. Nature Reviews Neuroscience. 2022;23(9):551–567. pmid:35732917
  15. Bányai M, Lazar A, Klein L, Klon-Lipok J, Stippinger M, Singer W, et al. Stimulus Complexity Shapes Response Correlations in Primary Visual Cortex. Proceedings of the National Academy of Sciences. 2019;116(7):2723–2732. pmid:30692266
  16. Berkes P, Orbán G, Lengyel M, Fiser J. Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment. Science (New York, NY). 2011;331(6013):83–87. pmid:21212356
  17. Orbán G, Berkes P, Fiser J, Lengyel M. Neural Variability and Sampling-Based Probabilistic Representations in the Visual Cortex. Neuron. 2016;92(2):530–543. pmid:27764674
  18. Bányai M, Orbán G. Noise Correlations and Perceptual Inference. Current Opinion in Neurobiology. 2019;58:209–217. pmid:31593872
  19. Haefner RM, Berkes P, Fiser J. Perceptual Decision-Making as Probabilistic Inference by Neural Sampling. Neuron. 2016;90(3):649–660. pmid:27146267
  20. Lange RD, Haefner RM. Characterizing and Interpreting the Influence of Internal Variables on Sensory Activity. Current Opinion in Neurobiology. 2017;46:84–89. pmid:28841439
  21. Lange RD, Haefner RM. Task-Induced Neural Covariability as a Signature of Approximate Bayesian Learning and Inference. PLOS Computational Biology. 2022;18(3):e1009557. pmid:35259152
  22. Bondy AG, Haefner RM, Cumming BG. Feedback Determines the Structure of Correlated Variability in Primary Visual Cortex. Nature Neuroscience. 2018;21(4):598–606. pmid:29483663
  23. Doiron B, Litwin-Kumar A, Rosenbaum R, Ocker GK, Josić K. The Mechanics of State-Dependent Neural Correlations. Nature Neuroscience. 2016;19(3):383–393. pmid:26906505
  24. Litwin-Kumar A, Doiron B. Slow Dynamics and High Variability in Balanced Cortical Networks with Clustered Connections. Nature Neuroscience. 2012;15(11):1498–1505. pmid:23001062
  25. Hennequin G, Ahmadian Y, Rubin DB, Lengyel M, Miller KD. The Dynamical Regime of Sensory Cortex: Stable Dynamics around a Single Stimulus-Tuned Attractor Account for Patterns of Noise Variability. Neuron. 2018;98(4):846–860.e5. pmid:29772203
  26. Heeger DJ, Zemlianova KO. A Recurrent Circuit Implements Normalization, Simulating the Dynamics of V1 Activity. Proceedings of the National Academy of Sciences. 2020;117(36):22494–22505. pmid:32843341
  27. Pillow JW, Shlens J, Paninski L, Sher A, Litke AM, Chichilnisky EJ, et al. Spatio-Temporal Correlations and Visual Signalling in a Complete Neuronal Population. Nature. 2008;454(7207):995–999. pmid:18650810
  28. Archer EW, Koster U, Pillow JW, Macke JH. Low-Dimensional Models of Neural Population Activity in Sensory Cortical Circuits. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, editors. Advances in Neural Information Processing Systems. vol. 27. Curran Associates, Inc.; 2014.
  29. Gardella C, Marre O, Mora T. Modeling the Correlated Activity of Neural Populations: A Review. Neural Computation. 2019;31(2):233–269. pmid:30576613
  30. Schneidman E, Berry MJ, Segev R, Bialek W. Weak Pairwise Correlations Imply Strongly Correlated Network States in a Neural Population. Nature. 2006;440(7087):1007–1012. pmid:16625187
  31. Granot-Atedgi E, Tkačik G, Segev R, Schneidman E. Stimulus-Dependent Maximum Entropy Models of Neural Population Codes. PLoS Computational Biology. 2013;9(3):e1002922. pmid:23516339
  32. Zhao Y, Park IM. Variational Latent Gaussian Process for Recovering Single-Trial Dynamics from Population Spike Trains. Neural Computation. 2017;29(5):1293–1316. pmid:28333587
  33. Sokoloski S, Aschner A, Coen-Cagli R. Modelling the Neural Code in Large Populations of Correlated Neurons. eLife. 2021;10:e64615. pmid:34608865
  34. Josić K, Shea-Brown E, Doiron B, De La Rocha J. Stimulus-Dependent Correlations and Population Codes. Neural Computation. 2009;21(10):2774–2804. pmid:19635014
  35. Carandini M, Heeger DJ. Normalization as a Canonical Neural Computation. Nature Reviews Neuroscience. 2012;13(1):51–62.
  36. Heeger DJ. Normalization of Cell Responses in Cat Striate Cortex. Visual Neuroscience. 1992;9(2):181–197. pmid:1504027
  37. Albrecht DG, Geisler WS. Motion Selectivity and the Contrast-Response Function of Simple Cells in the Visual Cortex. Visual Neuroscience. 1991;7(6):531–546. pmid:1772804
  38. Louie K, Khaw MW, Glimcher PW. Normalization Is a General Neural Mechanism for Context-Dependent Decision Making. Proceedings of the National Academy of Sciences. 2013;110(15):6139–6144. pmid:23530203
  39. Ohshiro T, Angelaki DE, DeAngelis GC. A Normalization Model of Multisensory Integration. Nature Neuroscience. 2011;14(6):775–782. pmid:21552274
  40. Olsen SR, Bhandawat V, Wilson RI. Divisive Normalization in Olfactory Population Codes. Neuron. 2010;66(2):287–299. pmid:20435004
  41. Kohn A, Smith MA. Stimulus Dependence of Neuronal Correlation in Primary Visual Cortex of the Macaque. The Journal of Neuroscience. 2005;25(14):3661–3673. pmid:15814797
  42. Liu LD, Haefner RM, Pack CC. A Neural Basis for the Spatial Suppression of Visual Motion Perception. eLife. 2016;5:e16167. pmid:27228283
  43. Snyder AC, Morais MJ, Kohn A, Smith MA. Correlations in V1 Are Reduced by Stimulation Outside the Receptive Field. Journal of Neuroscience. 2014;34(34):11222–11227. pmid:25143603
  44. Henry CA, Kohn A. Spatial Contextual Effects in Primary Visual Cortex Limit Feature Representation under Crowding. Nature Communications. 2020;11(1):1687. pmid:32245941
  45. Cohen MR, Maunsell JHR. Attention Improves Performance Primarily by Reducing Interneuronal Correlations. Nature Neuroscience. 2009;12(12):1594–1600. pmid:19915566
  46. Ruff DA, Cohen MR. Attention Can Either Increase or Decrease Spike Count Correlations in Visual Cortex. Nature Neuroscience. 2014;17(11):1591–1597. pmid:25306550
  47. Mitchell JF, Sundberg KA, Reynolds JH. Spatial Attention Decorrelates Intrinsic Activity Fluctuations in Macaque Area V4. Neuron. 2009;63(6):879–888. pmid:19778515
  48. Cavanaugh JR, Bair W, Movshon JA. Nature and Interaction of Signals From the Receptive Field Center and Surround in Macaque V1 Neurons. Journal of Neurophysiology. 2002;88(5):2530–2546. pmid:12424292
  49. Reynolds JH, Heeger DJ. The Normalization Model of Attention. Neuron. 2009;61(2):168–185. pmid:19186161
  50. Coen-Cagli R, Dayan P, Schwartz O. Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics. PLoS Computational Biology. 2012;8(3):e1002405. pmid:22396635
  51. Tripp BP. Decorrelation of Spiking Variability and Improved Information Transfer Through Feedforward Divisive Normalization. Neural Computation. 2012;24(4):867–894. pmid:22168562
  52. Verhoef BE, Maunsell JHR. Attention-Related Changes in Correlated Neuronal Activity Arise from Normalization Mechanisms. Nature Neuroscience. 2017;20(7):969–977. pmid:28553943
  53. Coen-Cagli R, Solomon SS. Relating Divisive Normalization to Neuronal Response Variability. The Journal of Neuroscience. 2019;39(37):7344–7356. pmid:31387914
  54. Sawada T, Petrov AA. The Divisive Normalization Model of V1 Neurons: A Comprehensive Comparison of Physiological Data and Model Predictions. Journal of Neurophysiology. 2017;118(6):3051–3091. pmid:28835531
  55. Ruff DA, Cohen MR. Stimulus Dependence of Correlated Variability across Cortical Areas. Journal of Neuroscience. 2016;36(28):7546–7556. pmid:27413163
  56. Pillow JW, Simoncelli EP. Dimensionality Reduction in Neural Models: An Information-Theoretic Generalization of Spike-Triggered Average and Covariance Analysis. Journal of Vision. 2006;6(4):9. pmid:16889478
  57. Díaz-Francés E, Rubio FJ. On the Existence of a Normal Approximation to the Distribution of the Ratio of Two Independent Normal Random Variables. Statistical Papers. 2013;54(2):309–323.
  58. Hayya J, Armstrong D, Gressis N. A Note on the Ratio of Two Normally Distributed Variables. Management Science. 1975;21(11):1338–1341.
  59. Marsaglia G. Ratios of Normal Variables. Journal of Statistical Software. 2006;16(4).
  60. Pham-Gia T, Turkkan N, Marchand E. Density of the Ratio of Two Normal Random Variables and Applications. Communications in Statistics—Theory and Methods. 2006;35(9):1569–1591.
  61. Ver Hoef JM. Who Invented the Delta Method? The American Statistician. 2012;66(2):124–127.
  62. Baxley RJ, Walkenhorst BT, Acosta-Marum G. Complex Gaussian ratio distribution with applications for error rate calculation in fading channels with imperfect CSI. In: 2010 IEEE Global Telecommunications Conference GLOBECOM 2010. IEEE; 2010. p. 1–5.
  63. Li Y, He Q. On the ratio of two correlated complex Gaussian random variables. IEEE Communications Letters. 2019;23(12):2172–2176.
  64. Kronmal RA. Spurious Correlation and the Fallacy of the Ratio Standard Revisited. Journal of the Royal Statistical Society Series A (Statistics in Society). 1993;156(3):379.
  65. Albrecht DG, Hamilton DB. Striate Cortex of Monkey and Cat: Contrast Response Function. Journal of Neurophysiology. 1982;48(1):217–237. pmid:7119846
  66. Clatworthy PL, Chirimuuta M, Lauritzen JS, Tolhurst DJ. Coding of the Contrasts in Natural Images by Populations of Neurons in Primary Visual Cortex (V1). Vision Research. 2003;43(18):1983–2001. pmid:12831760
  67. Geisler WS, Albrecht DG. Cortical Neurons: Isolation of Contrast Gain Control. Vision Research. 1992;32(8):1409–1410. pmid:1455713
  68. Carandini M, Heeger DJ, Movshon JA. Linearity and Normalization in Simple Cells of the Macaque Primary Visual Cortex. The Journal of Neuroscience. 1997;17(21):8621–8644. pmid:9334433
  69. Goris RLT, Movshon JA, Simoncelli EP. Partitioning Neuronal Variability. Nature Neuroscience. 2014;17(6):858–865. pmid:24777419
  70. Gur M, Beylin A, Snodderly DM. Response Variability of Neurons in Primary Visual Cortex (V1) of Alert Monkeys. The Journal of Neuroscience. 1997;17(8):2914–2920. pmid:9092612
  71. Ponce-Alvarez A, Thiele A, Albright TD, Stoner GR, Deco G. Stimulus-Dependent Variability and Noise Correlations in Cortical MT Neurons. Proceedings of the National Academy of Sciences. 2013;110(32):13162–13167. pmid:23878209
  72. Sadagopan S, Ferster D. Feedforward Origins of Response Variability Underlying Contrast Invariant Orientation Tuning in Cat Visual Cortex. Neuron. 2012;74(5):911–923. pmid:22681694
  73. Joe H, Xu JJ. The Estimation Method of Inference Functions for Margins for Multivariate Models. Faculty Research and Publications; 1996. Available from:
  74. Berkes P, Wood F, Pillow J. Characterizing Neural Dependencies with Copula Models. In: Koller D, Schuurmans D, Bengio Y, Bottou L, editors. Advances in Neural Information Processing Systems. vol. 21. Curran Associates, Inc.; 2008.
  75. Coen-Cagli R, Kohn A, Schwartz O. Flexible Gating of Contextual Influences in Natural Vision. Nature Neuroscience. 2015;18(11):1648–1655. pmid:26436902
  76. Aljadeff J, Lansdell BJ, Fairhall AL, Kleinfeld D. Analysis of Neuronal Spike Trains, Deconstructed. Neuron. 2016;91(2):221–259. pmid:27477016
  77. Hénaff OJ, Boundy-Singer ZM, Meding K, Ziemba CM, Goris RLT. Representation of Visual Uncertainty through Neural Gain Variability. Nature Communications. 2020;11(1):2513. pmid:32427825
  78. Sturmfels B. Solving Systems of Polynomial Equations. No. 97 in CBMS Regional Conference Series in Mathematics. Providence, R.I: Conference Board of the Mathematical Sciences; 2002.
  79. Wekselblatt JB, Flister ED, Piscopo DM, Niell CM. Large-Scale Imaging of Cortical Dynamics during Sensory Perception and Behavior. Journal of Neurophysiology. 2016;115(6):2852–2866. pmid:26912600
  80. Sridharan S, Gajowa MA, Ogando MB, Jagadisan UK, Abdeladim L, Sadahiro M, et al. High-Performance Microbial Opsins for Spatially and Temporally Precise Perturbations of Large Neuronal Networks. Neuron. 2022;110(7):1139–1155.e6. pmid:35120626
  81. Bounds HA, Sadahiro M, Hendricks WD, Gajowa M, Gopakumar K, Quintana D, et al. Ultra-Precise All-Optical Manipulation of Neural Circuits with Multifunctional Cre-dependent Transgenic Mice. bioRxiv: the preprint server for biology. 2022
  82. Peirce J, Gray JR, Simpson S, MacAskill M, Höchenberger R, Sogo H, et al. PsychoPy2: Experiments in Behavior Made Easy. Behavior Research Methods. 2019;51(1):195–203. pmid:30734206
  83. Pachitariu M, Stringer C, Dipoppa M, Schröder S, Rossi LF, Dalgleish H, et al. Suite2p: Beyond 10,000 Neurons with Standard Two-Photon Microscopy. bioRxiv: the preprint server for biology. 2017
  84. Lin IC, Okun M, Carandini M, Harris KD. The Nature of Shared Cortical Variability. Neuron. 2015;87(3):644–656. pmid:26212710
  85. Rikhye RV, Sur M. Spatial Correlations in Natural Scenes Modulate Response Reliability in Mouse Visual Cortex. Journal of Neuroscience. 2015;35(43):14661–14680. pmid:26511254
  86. Beck JM, Latham PE, Pouget A. Marginalization in Neural Circuits with Divisive Normalization. The Journal of Neuroscience. 2011;31(43):15310–15319. pmid:22031877
  87. Dehaene GP, Coen-Cagli R, Pouget A. Investigating the Representation of Uncertainty in Neuronal Circuits. PLOS Computational Biology. 2021;17(2):e1008138. pmid:33577553
  88. Festa D, Aschner A, Davila A, Kohn A, Coen-Cagli R. Neuronal Variability Reflects Probabilistic Inference Tuned to Natural Image Statistics. Nature Communications. 2021;12(1):3635. pmid:34131142
  89. Hayashi T, Kato Y, Nozaki D. Divisively Normalized Integration of Multisensory Error Information Develops Motor Memories Specific to Vision and Proprioception. The Journal of Neuroscience. 2020;40(7):1560–1570. pmid:31924610
  90. Ohshiro T, Angelaki DE, DeAngelis GC. A Neural Signature of Divisive Normalization at the Level of Multisensory Integration in Primate Cortex. Neuron. 2017;95(2):399–411.e8. pmid:28728025
  91. Heeger DJ, Mackey WE. Oscillatory Recurrent Gated Neural Integrator Circuits (ORGaNICs), a Unifying Theoretical Framework for Neural Dynamics. Proceedings of the National Academy of Sciences. 2019;116(45):22783–22794. pmid:31636212
  92. Echeveste R, Aitchison L, Hennequin G, Lengyel M. Cortical-like Dynamics in Recurrent Circuits Optimized for Sampling-Based Probabilistic Inference. Nature Neuroscience. 2020;23(9):1138–1149. pmid:32778794
  93. Paninski L, Pillow J, Lewi J. Statistical Models for Neural Encoding, Decoding, and Optimal Stimulus Design. In: Cisek P, Drew T, Kalaska JF, editors. Computational Neuroscience: Theoretical Insights into Brain Function. vol. 165 of Progress in Brain Research. Elsevier; 2007. p. 493–507.
  94. Yu BM, Cunningham JP, Santhanam G, Ryu SI, Shenoy KV, Sahani M. Gaussian-Process Factor Analysis for Low-Dimensional Single-Trial Analysis of Neural Population Activity. Journal of Neurophysiology. 2009;102(1):614–635. pmid:19357332
  95. Ecker AS, Berens P, Cotton RJ, Subramaniyan M, Denfield GH, Cadwell CR, et al. State Dependence of Noise Correlations in Macaque Primary Visual Cortex. Neuron. 2014;82(1):235–248. pmid:24698278
  96. Whiteway MR, Butts DA. The Quest for Interpretable Models of Neural Population Activity. Current Opinion in Neurobiology. 2019;58:86–93. pmid:31426024
  97. Whiteway MR, Socha K, Bonin V, Butts DA. Characterizing the Nonlinear Structure of Shared Variability in Cortical Neuron Populations Using Latent Variable Models. Neurons, behavior, data analysis and theory. 2019;3(1). pmid:31592129
  98. Kodandaramaiah SB, Flores FJ, Holst GL, Singer AC, Han X, Brown EN, et al. Multi-Neuron Intracellular Recording in Vivo via Interacting Autopatching Robots. eLife. 2018;7:e24656. pmid:29297466
  99. 99. Hurwitz C, Kudryashova N, Onken A, Hennig MH. Building Population Models for Large-Scale Neural Recordings: Opportunities and Pitfalls. Current Opinion in Neurobiology. 2021;70:64–73. pmid:34411907
  100. 100. Stevenson IH, Kording KP. How Advances in Neural Recording Affect Data Analysis. Nature Neuroscience. 2011;14(2):139–142. pmid:21270781
  101. Ruff DA, Cohen MR. A Normalization Model Suggests That Attention Changes the Weighting of Inputs between Visual Areas. Proceedings of the National Academy of Sciences. 2017;114(20):E4085–E4094. pmid:28461501
  102. Audoly S, Bellu G, D’Angio L, Saccomani MP, Cobelli C. Global Identifiability of Nonlinear Models of Biological Systems. IEEE Transactions on Biomedical Engineering. 2001;48(1):55–65. pmid:11235592
  103. Pnevmatikakis EA. Analysis Pipelines for Calcium Imaging Data. Current Opinion in Neurobiology. 2019;55:15–21. pmid:30529147
  104. Stringer C, Pachitariu M. Computational Processing of Neural Recordings from Calcium Imaging Data. Current Opinion in Neurobiology. 2019;55:22–31. pmid:30530255
  105. Evans MH, Petersen RS, Humphries MD. On the Use of Calcium Deconvolution Algorithms in Practical Contexts. bioRxiv: the preprint server for biology. 2020.
  106. Yaksi E, Friedrich RW. Reconstruction of Firing Rate Changes across Neuronal Populations by Temporally Deconvolved Ca2+ Imaging. Nature Methods. 2006;3(5):377–383. pmid:16628208
  107. Benisty H, Song A, Mishne G, Charles AS. Data Processing of Functional Optical Microscopy for Neuroscience; 2022.
  108. Albrecht DG, Geisler WS, Frazor RA, Crane AM. Visual Cortex Neurons of Monkeys and Cats: Temporal Dynamics of the Contrast Response Function. Journal of Neurophysiology. 2002;88(2):888–913. pmid:12163540
  109. Rupasinghe A, Francis N, Liu J, Bowen Z, Kanold PO, Babadi B. Direct Extraction of Signal and Noise Correlations from Two-Photon Calcium Imaging of Ensemble Neuronal Activity. eLife. 2021;10:e68046. pmid:34180397
  110. Wei XX, Zhou D, Grosmark A, Ajabi Z, Sparks F, Zhou P, et al. A Zero-Inflated Gamma Model for Post-Deconvolved Calcium Imaging Traces. bioRxiv: the preprint server for biology. 2019.
  111. Vogelstein JT, Packer AM, Machado TA, Sippy T, Babadi B, Yuste R, et al. Fast Nonnegative Deconvolution for Spike Train Inference From Population Calcium Imaging. Journal of Neurophysiology. 2010;104(6):3691–3704. pmid:20554834
  112. Triplett MA, Goodhill GJ. Probabilistic Encoding Models for Multivariate Neural Data. Frontiers in Neural Circuits. 2019;13:1. pmid:30745864
  113. Friedrich J, Zhou P, Paninski L. Fast Online Deconvolution of Calcium Imaging Data. PLOS Computational Biology. 2017;13(3):1–26. pmid:28291787
  114. Lyall EH, Mossing DP, Pluta SR, Chu YW, Dudai A, Adesnik H. Synthesis of a Comprehensive Population Code for Contextual Features in the Awake Sensory Cortex. eLife. 2021;10:e62687. pmid:34723796
  115. Cunningham JP, Yu BM. Dimensionality Reduction for Large-Scale Neural Recordings. Nature Neuroscience. 2014;17(11):1500–1509. pmid:25151264
  116. Huang C, Ruff DA, Pyle R, Rosenbaum R, Cohen MR, Doiron B. Circuit Models of Low-Dimensional Shared Variability in Cortical Networks. Neuron. 2019;101(2):337–348.e4. pmid:30581012
  117. Rabinowitz NC, Goris RL, Cohen M, Simoncelli EP. Attention Stabilizes the Shared Gain of V4 Populations. eLife. 2015;4:e08998. pmid:26523390
  118. Schölvinck ML, Saleem AB, Benucci A, Harris KD, Carandini M. Cortical State Determines Global Variability and Correlations in Visual Cortex. Journal of Neuroscience. 2015;35(1):170–178. pmid:25568112
  119. Umakantha A, Morina R, Cowley BR, Snyder AC, Smith MA, Yu BM. Bridging Neuronal Correlations and Dimensionality Reduction. Neuron. 2021;109(17):2740–2754.e12. pmid:34293295
  120. Averbeck BB, Latham PE, Pouget A. Neural Correlations, Population Coding and Computation. Nature Reviews Neuroscience. 2006;7(5):358–366. pmid:16760916
  121. Kanashiro T, Ocker GK, Cohen MR, Doiron B. Attentional Modulation of Neuronal Variability in Circuit Models of Cortex. eLife. 2017;6:e23978. pmid:28590902
  122. Ringach DL. Population Coding under Normalization. Vision Research. 2010;50(22):2223–2232. pmid:20034510