Figure 1.
Response of a large population of ganglion cells to a 10 s long repeated visual stimulus.
(a) White noise uncorrelated Gaussian stimulus presented at and the spiking patterns of 3 cells to repeated presentations of the stimulus. (b) Spike-triggered averages of 110 simultaneously recorded cells; a subset of 100 cells was chosen for further analysis. (c) The histogram of pairwise correlation coefficients between cells for the repeated Gaussian white noise stimulus (green). For comparison, the statistics of the responses to a repeated natural pixel movie (red) and a non-repeated natural pixel movie (blue) are also shown, as documented in Ref. [35]. The significance cutoff for correlation coefficients is
, 95% of correlations are above this cutoff (see Methods). (d) Average pairwise correlation coefficient between cells as a function of the distance (mean and std are across pairs of cells at a given distance).
Figure 2.
Pairwise SDME (S2) model predicts the firing rate of single cells better than conditionally independent LN (S1) models.
(a) Example of the PSTH segment for one cell (green), the best prediction of the S1 model (blue) and of the S2 model (red). (b) Correlation coefficient between the true PSTH and S2 model prediction (vertical axis) vs. the correlation between the true PSTH and the S1 model prediction (horizontal axis); each plot symbol is a separate cell, dotted line shows equality. S2 significantly outperforms S1 (, paired two-sided Wilcoxon test). The neuron chosen in panel (a) is shown in orange.
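The per-cell comparison in panel (b) can be illustrated with a minimal sketch. It assumes the measured and predicted PSTHs are available as arrays of shape (cells, time bins); the function and argument names are hypothetical and do not come from the paper.

```python
import numpy as np

def psth_correlations(true_psth, model_psth):
    """Pearson correlation between measured and predicted PSTH, one value per cell.

    Both inputs are hypothetical arrays of shape (n_cells, n_time_bins).
    A cell with a constant PSTH has zero variance and yields NaN.
    """
    true_psth = np.asarray(true_psth, dtype=float)
    model_psth = np.asarray(model_psth, dtype=float)
    # Remove each cell's mean rate so only the time course is compared.
    t = true_psth - true_psth.mean(axis=1, keepdims=True)
    m = model_psth - model_psth.mean(axis=1, keepdims=True)
    numerator = (t * m).sum(axis=1)
    denominator = np.sqrt((t ** 2).sum(axis=1) * (m ** 2).sum(axis=1))
    return numerator / denominator
```

Calling this once with the S1 predictions and once with the S2 predictions gives the two coordinates of each plot symbol. A paired two-sided Wilcoxon test on the two resulting vectors could then be run with, for example, `scipy.stats.wilcoxon`.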
Figure 3.
An overview of maximum entropy encoding models.
The explicit dependence of single-neuron terms (, vertical axis, ‘T’ or ‘S’), and the absence or presence of pairwise terms (
, horizontal axis, ‘1’ or ‘2’), together define the type of the maximum entropy model (e.g. pairwise SDME is ‘S2’). For completeness, the first row of the table includes static maximum entropy models of population vocabulary,
, which have no explicit stimulus dependence. The full conditionally independent model (T1) reproduces exactly the instantaneous firing rate of every neuron, and thus fully captures the stimulus sensitivity, history effects, and adaptation at the single-neuron level; for experimentally recorded rasters with stimulus repeats, simulated T1 rasters are often generated by taking the original data and, at each time point and for every neuron, randomly permuting the responses recorded on different stimulus repeats. “Total correlation” is the pairwise correlation matrix of activities,
, averaged over all repetitions and all times in the experiment.
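The shuffling procedure described for T1 rasters can be sketched as follows. This is a minimal illustration, assuming the recorded responses are stored as a binary array of shape (repeats, time bins, neurons); the array layout and the function name are assumptions, not the authors' code.

```python
import numpy as np

def shuffle_across_repeats(rasters, seed=None):
    """Permute responses across repeats, independently per time bin and neuron.

    rasters: hypothetical array of shape (n_repeats, n_time_bins, n_neurons).
    Each (time bin, neuron) column keeps exactly the same set of responses,
    so every neuron's PSTH is preserved, while the pairing of neurons within
    a single repeat is destroyed.
    """
    rng = np.random.default_rng(seed)
    rasters = np.asarray(rasters)
    # Sorting independent random keys along the repeat axis yields an
    # independent random permutation for every (time bin, neuron) column.
    order = np.argsort(rng.random(rasters.shape), axis=0)
    return np.take_along_axis(rasters, order, axis=0)
```

Because the permutation is drawn separately for every neuron, the shuffle preserves each neuron's firing rate in every time bin but removes correlations between neurons at fixed stimulus.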
Figure 4.
Pairwise SDME (S2) model predicts population activity patterns for neurons better than conditionally independent LN (S1) models.
(a) The activity raster for 100 neurons across 626 repeats of the stimulus at a point in time where the retina is moderately active (). Dots represent individual spikes; training repeats denoted in black, test repeats in orange. (b) The diversity in retinal responses in (a). Shown are all distinct patterns; their number is comparable to the number of repeats. Neurons are re-sorted by their instantaneous firing rate (high rate = top, low rate = bottom). (c) S2 model fit on the training repeats predicts the reliably estimated correlation coefficients between pairs of neurons at various time points where the retina is active. We identify all correlation coefficients whose value can be estimated from data with less than 25% relative error across many splits of the repeats into two halves. The value of these correlation coefficients is estimated on the test set (horizontal axis) and compared to the model prediction (vertical axis). (d) The log-likelihood ratio of the population firing patterns under the S2 model and under the S1 model, shown as a function of time (violet dots, scale at left) for an example (test) stimulus repeat. For reference, the average population firing rate is shown in grey (scale at right). The arrow denotes the time bin displayed in (a) and (b).
Figure 5.
The performance of the SDME (S2) model relative to conditionally independent LN (S1) models.
The average log likelihood ratio between the S2 and the S1 models evaluated on the test set, as a function of the population size, (error bars = std over 10 randomly chosen groups of neurons at that
).
Figure 6.
The performance of various models in accounting for the total vocabulary of the population,
. The results for the S2 model are shown in (a), the results for the S1 model in (b), and the results for a full conditionally independent model (T1) in (c). The first row displays the log ratio of model to empirical probabilities for various codewords (dots), as a function of that codeword's empirical frequency in the recorded data. The model probabilities were estimated by generating Monte Carlo samples drawn from the corresponding model distributions; only patterns that were generated in the MC run as well as found in the recorded data are shown. GoF quantifies the deviation between true and predicted
of the non-silent codewords shown in the plot; smaller values indicate better agreement (see Methods). The second row summarizes this scatterplot by binning codewords according to their frequency, and showing the average log probability ratio in the bin (solid line), as well as the
std scatter across the codewords in the bin (shaded area). The highly probable all-silent state,
, is shown separately as a circle. The third row shows the overlap between the 500 most frequent patterns in the data and the 500 most likely patterns generated by the model (see text). Models were fit on training repeats; comparisons use only data from test repeats.
Figure 7.
Measured vs predicted noise correlations for the pairwise SDME (S2) model.
Noise correlation (see text) is estimated from recorded data for every pair of neurons, and plotted against the noise correlation predicted by the S2 model (each pair of neurons = one dot; shown are dots for
neurons; for significantly correlated pairs, the slope of the best fit line is
, with
). Conditionally independent models predict zero noise correlation for all pairs.
Figure 8.
Pairwise SDME (S2) model parameters.
(a) Average values of the LN-like driving term, , where
, across all cells
(error bars = std across cells), for each of the
adaptive bins for
(see Methods). (b) Pairwise interaction map
of the S2 model, between all
neurons in the experiment. (c) Histogram of pairwise interaction values from (b), and their average value as a function of the distance between cells (inset). (d) For each pair of cells
and
, we plot the value of
under the static maximum entropy model of Eq. (6) vs. the
from the S2 model of Eq. (4).
Figure 9.
Clustering of response patterns into basins of attraction centered on metastable patterns generalizes across repeats.
(a) Every response pattern from data is assigned to its corresponding metastable pattern
by descending on the energy landscape
defined by the S2 model of Eq (4) until the local minimum is reached (see text). Across all test repeats and at each point in time (horizontal axis), we find the metastable states that are visited more than 30 times, plot their energy
(vertical axis), and the number of repeats on which that metastable state is visited (shade of red). (b) Inset: for
(blue rectangle in a), we plot the frequency of visits to each metastable state (dots) in the training set (horizontal) against the frequency in the test set (vertical). Main panel: the same analysis across all time bins (different colors) superposed; the dashed line is equality.
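The assignment of a response pattern to a metastable state can be illustrated with a greedy single-neuron-flip descent. This is a sketch under stated assumptions: the exact descent rule is given in the paper's text and may differ; the energy is taken to have the generic pairwise form E = -sum_i h_i s_i - (1/2) sum_ij J_ij s_i s_j with binary s_i in {0, 1}; and `h` (the single-neuron terms at one fixed time bin) and `J` (symmetric, zero diagonal) are hypothetical names.

```python
import numpy as np

def energy(sigma, h, J):
    """Pairwise energy of a binary pattern (assumed form, see lead-in)."""
    return -h @ sigma - 0.5 * sigma @ J @ sigma

def descend_to_local_minimum(sigma, h, J):
    """Flip one neuron at a time whenever the flip lowers the energy.

    Sweeps over neurons repeat until no single flip decreases the energy,
    so the returned pattern is a local minimum under single-neuron flips.
    Assumes J is symmetric with zero diagonal.
    """
    sigma = np.array(sigma, dtype=int)
    changed = True
    while changed:
        changed = False
        for i in range(sigma.size):
            step = 1 - 2 * sigma[i]  # +1 turns the neuron on, -1 turns it off
            delta_e = -step * (h[i] + J[i] @ sigma)  # energy change of this flip
            if delta_e < 0:
                sigma[i] += step
                changed = True
    return sigma
```

Each accepted flip strictly lowers the energy and the number of patterns is finite, so the loop terminates. Patterns that end at the same local minimum belong to the same basin of attraction.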
Figure 10.
Surprise and information transmission estimated from the pairwise SDME (S2) model.
(a) Surprise rate (blue) is estimated from the static ME and S2 models assuming independence of codewords across time bins. The instantaneous information rate (red) is the difference between the surprise and the noise entropy rate, estimated from the S2 model (see text). The information transmission rate is the average of the instantaneous information across time. (b) Population firing rate as a function of time shows that bursts of spiking strongly correlate with the bursts of surprise and information transmission in the population. (c) The stimulus (normalized to zero mean and unit variance) is shown for reference as a function of time.