Predicting neuronal dynamics with a delayed gain control model

doi:10.1371/journal.pcbi.1007484

Fig 1.

Schematic of different temporal phenomena observed in neural response time courses.

For each phenomenon, we show a schematic with a stimulus time course (gray shading), a linear prediction (black dashed line), and a cartoon illustration of plausible neuronal responses consistent with prior findings (red line). The linear prediction is the result of convolving an impulse response (left) with a stimulus time course. A. For a sustained stimulus, the neuronal response declines after an initial transient, differing from the sustained linear prediction [e.g., 2, 3]. B. Neuronal responses sum sub-linearly in time: doubling the stimulus duration results in a total response that is less than double (less than the linear prediction) [3, 4]. C. For two presentations of a single image with a brief gap in between, the neuronal response to the second presentation is lower than the linear prediction [e.g., 2, 4, 5]. D. Compared to the linear prediction, the neuronal response to a low contrast stimulus is both lower in amplitude and delayed [4, 6, 7].

Fig 2.

The delayed normalization (DN) model.

(A) The input to the model is the contrast time course of a stimulus, S, which is 0 when the stimulus is absent and 1 when it is present. First, the model computes the linear neuronal response by convolving S with an impulse response function h₁ (parameterized by τ₁ and w). The linear output is then full-wave rectified and raised to the power n. We assumed n>1 in this paper. The exponentiated output is divisively normalized by a denominator that consists of two components: a semi-saturation constant (σ), and a causally low-pass filtered version of the driving signal (time constant τ₂). Both components are raised to the same power n. The predicted neuronal response (right) to the example input stimulus S (left) includes a transient followed by a lower-level, more sustained response. (B) The effects of varying each of the 5 parameters are shown. For example, a larger w means a more biphasic impulse response and therefore a larger transient response at stimulus offset (top row). In all simulations, the default parameters are w = 0, τ₁ = 0.05, τ₂ = 0.1, n = 2, σ = 1. For more details of model behavior see S1 Fig.
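
The sequence of operations in panel A can be sketched numerically. This is a minimal sketch under stated assumptions, not the authors' implementation: the caption fixes the order of operations and the default parameter values, whereas the gamma-shaped impulse response, the 1.5 × τ₁ time constant of the negative lobe, the exponential low-pass filter, the 1-ms time step, and the function names `gamma_irf` and `dn_model` are choices made here for illustration.

```python
import numpy as np

def gamma_irf(t, tau):
    """Unit-sum impulse response t * exp(-t / tau) (assumed shape)."""
    h = t * np.exp(-t / tau)
    return h / h.sum()

def dn_model(stim, dt=0.001, w=0.0, tau1=0.05, tau2=0.1, n=2.0, sigma=1.0):
    """Delayed normalization response to a contrast time course `stim`."""
    stim = np.asarray(stim, dtype=float)
    t = np.arange(stim.size) * dt
    # h1: positive lobe minus a weighted, slower negative lobe (assumed form).
    h1 = gamma_irf(t, tau1) - w * gamma_irf(t, 1.5 * tau1)
    # Causal low-pass filter for the normalization pool (assumed exponential).
    h2 = np.exp(-t / tau2)
    h2 /= h2.sum()
    # Linear response followed by full-wave rectification.
    drive = np.abs(np.convolve(stim, h1)[: stim.size])
    # Delayed copy of the driving signal used in the denominator.
    pool = np.convolve(drive, h2)[: stim.size]
    # Exponentiate, then divide by sigma^n plus the delayed signal^n.
    return drive**n / (sigma**n + pool**n)

# Example input: stimulus on for 500 ms, then off for 500 ms (1-ms steps).
stim = np.concatenate([np.ones(500), np.zeros(500)])
resp = dn_model(stim)
print("peak:", resp.max(), "level at stimulus offset:", resp[499])
```

With the default parameters, the output rises to a transient peak and then settles to a lower sustained level while the stimulus is on, because the normalization pool lags the driving signal.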

Fig 3.

The DN model captures the response reduction for prolonged stimuli at different cortical locations.

(A) The DN model fits (red) accurately describe the ECoG broadband time course (black) in multiple ROIs. Data were averaged across trials and electrodes within ROIs, and models were fit to the average time course. Each trial had a 500-ms stimulus (gray box) followed by a 500-ms blank. Plots show the mean and 50% CI for data (bootstrapped 100 times across electrodes within an ROI), and the model fit averaged across the 100 bootstraps. The number of electrodes per ROI and the 50% CI of model accuracy (r² per bootstrap) are indicated in each subplot. (B) The model fits for the 4 ROIs are plotted together, scaled to unit height. For this plot, the latency was assumed to be 0 for each ROI, so that the difference in time to peak reflects a difference in integration time rather than a difference in response latency. (C) Cross-validation over trials and over electrodes. 30-fold leave-one-out cross-validation over trials was performed on the 30 repeats. Red dots represent the median r² across trials, and black dots are the leave-one-out predictions for each trial. Leave-one-out cross-validation was also performed over electrodes. Details of the cross-validated fit are presented in S3 Fig.
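
The bootstrap summary described in panel A (100 resamples across electrodes, mean and 50% CI) can be illustrated as follows. This is a sketch only: the function name `bootstrap_mean_ci`, the percentile definition of the interval, and the random placeholder array standing in for electrode time courses are assumptions, and the per-bootstrap model fitting is not reproduced.

```python
import numpy as np

def bootstrap_mean_ci(data, n_boot=100, ci=50, seed=0):
    """Bootstrap across electrodes.

    data: array of shape (n_electrodes, n_timepoints).
    Returns the mean time course across bootstraps and the lower and
    upper bounds of the central `ci` percent interval.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_elec = data.shape[0]
    # Draw electrodes with replacement, n_boot times.
    idx = rng.integers(0, n_elec, size=(n_boot, n_elec))
    # Average over the resampled electrodes: shape (n_boot, n_timepoints).
    boot_means = data[idx].mean(axis=1)
    half = ci / 2
    lower, upper = np.percentile(boot_means, [50 - half, 50 + half], axis=0)
    return boot_means.mean(axis=0), lower, upper

# Placeholder input (random numbers, NOT ECoG data): 12 electrodes, 200 samples.
placeholder = np.random.default_rng(1).normal(size=(12, 200))
mean_tc, lo, hi = bootstrap_mean_ci(placeholder)
```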

Fig 4.

The DN model captures differences in temporal dynamics at different cortical locations.

(A) Temporal summation window length and the extent of gain control increase along the visual hierarchy. The model parameters fit to the data are shown on the right. The model fits were then summarized by two metrics. Tpeak is the duration from the onset of a sustained stimulus to the peak response, excluding the onset latency. Tpeak is longer for later ROIs, ranging from ~115 ms (V1) to ~145 ms (anterior ROIs). Rasymp is the level at which the response asymptotes for a sustained stimulus, as a fraction of the peak response. A smaller Rasymp indicates a greater extent of gain control. Rasymp is largest in V1 (~0.12) and declines in extrastriate areas. See S5 Fig for individual electrode results. (B) Offset response as a function of eccentricity. The lower plots show the time series and model fits to 3 example electrodes. The offset response increases from fovea to periphery. This pattern holds across all 3 ROIs, as shown in the dot plot. Each dot is the mean weight (w) on the negative lobe of the biphasic response. Larger values of w predict larger offset responses.
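
The two summary metrics in panel A can be computed from any predicted response to a sustained stimulus. The sketch below is illustrative: the function name `summary_metrics`, the use of the final sample as the asymptotic level, and the toy time course (an arbitrary curve, not model output or data) are assumptions.

```python
import numpy as np

def summary_metrics(response, dt, onset_index=0):
    """Return (t_peak, r_asymp) for a response to a sustained stimulus.

    onset_index marks the response onset, so t_peak excludes any
    onset latency that precedes it.
    """
    r = np.asarray(response, dtype=float)[onset_index:]
    peak_index = int(np.argmax(r))
    t_peak = peak_index * dt
    # Assumption: the final sample approximates the asymptotic level.
    r_asymp = r[-1] / r[peak_index]
    return t_peak, r_asymp

# Toy time course: a transient plus a small sustained component.
dt = 0.001
t = np.arange(0, 2.0, dt)
toy = t * np.exp(-t / 0.1) + 0.02 * (1 - np.exp(-t / 0.1))
t_peak, r_asymp = summary_metrics(toy, dt)
print(t_peak, r_asymp)
```

A value of `r_asymp` near 1 would indicate little decline from the peak, and a value near 0 would indicate a strong decline.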

Fig 5.

DN model captures sub-additive temporal summation and adaptation.

Two types of temporal profiles were used in the fMRI experiment: one-pulse stimuli with varying durations and two-pulse stimuli (134 ms each) with varying ISI. To generate DN model predictions for these stimuli, we used the median DN parameters fit to the V1 broadband time course measured in individual electrodes (S3 Fig). To convert the prediction to percent BOLD, we summed the predicted time course for each temporal profile and fit a single gain factor to minimize the difference between the predictions and the fMRI data. The DN model predictions (red) capture the BOLD data better than the linear prediction (green) (r²: 0.94 vs. 0.81).
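
The conversion to percent BOLD amounts to a one-parameter fit. The sketch below assumes that "minimize the difference" means least squares and that r² is the coefficient of determination; both are assumptions, as are the function names. The numbers are placeholders chosen so the result is easy to check, not values from the study.

```python
import numpy as np

def fit_gain(pred_sums, bold):
    """Gain g that minimizes sum((g * pred_sums - bold) ** 2)."""
    x = np.asarray(pred_sums, dtype=float)
    y = np.asarray(bold, dtype=float)
    # Closed-form least-squares solution for a single scale factor.
    return float(x @ y / (x @ x))

def r_squared(pred, data):
    """Coefficient of determination (assumed accuracy measure)."""
    pred = np.asarray(pred, dtype=float)
    data = np.asarray(data, dtype=float)
    ss_res = np.sum((data - pred) ** 2)
    ss_tot = np.sum((data - data.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Placeholder values: one summed prediction per temporal profile, and
# synthetic "BOLD" amplitudes constructed as exactly half of each sum.
pred_sums = np.array([1.0, 2.0, 4.0])
bold = 0.5 * pred_sums
gain = fit_gain(pred_sums, bold)
accuracy = r_squared(gain * pred_sums, bold)
```

Because a single gain is shared across all temporal profiles, the relative amplitudes across conditions are determined entirely by the model prediction.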

Fig 6.

DN model captures delayed responses at low contrast.

(A) The DN model was fit to single unit spike rate data from macaque V1, with stimulus contrasts ranging from 0% to 90%. The input time course for model fitting was scaled to the stimulus contrast. For each of the three cells, a single model was fit to all stimuli (10 time courses). Model fits are shown in the main plots and data in the insets. The model captures both the lower response amplitudes and slower temporal dynamics at low contrast. Data from [6], provided by W. Geisler. (B) Time to peak (ms) and peak amplitude (normalized spike rate) for single unit data as a function of contrast. The 3 cells are those plotted in (A). The data points are the cell responses and the curved lines are the DN model fits. The colors of the dots match the colors in (A), indicating stimulus contrast. (C) A set of model parameters that predict non-converging response levels at stimulus offset. (D) The top row is identical to the model predictions in panel A except that they are shown for an extended period (up to 200 ms). The predicted response to 70% contrast is highlighted, and was scaled and shifted to predict the other time courses, as shown in the bottom row.
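
Scaling the input time course by contrast is enough to produce both effects in a simulation. The sketch below uses the default parameters listed for Fig 2 rather than the parameters fit to the macaque cells, and it assumes a gamma-shaped impulse response with w = 0, an exponential low-pass filter, a 1-s sustained stimulus, and a 1-ms time step; it is not the authors' fitting code.

```python
import numpy as np

def dn_model(stim, dt=0.001, tau1=0.05, tau2=0.1, n=2.0, sigma=1.0):
    """Compact DN sketch with a monophasic impulse response (w = 0)."""
    t = np.arange(stim.size) * dt
    h1 = t * np.exp(-t / tau1)   # assumed gamma-shaped impulse response
    h1 /= h1.sum()
    h2 = np.exp(-t / tau2)       # assumed exponential low-pass filter
    h2 /= h2.sum()
    drive = np.abs(np.convolve(stim, h1)[: stim.size])
    pool = np.convolve(drive, h2)[: stim.size]
    return drive**n / (sigma**n + pool**n)

dt = 0.001
unit_step = np.ones(1000)  # sustained stimulus, 1 s long

# Map each contrast to (time to peak in seconds, peak amplitude).
peaks = {}
for contrast in (0.1, 0.9):
    resp = dn_model(contrast * unit_step, dt=dt)
    peaks[contrast] = (resp.argmax() * dt, resp.max())

for contrast, (t_peak, amp) in peaks.items():
    print(f"contrast {contrast:.1f}: t_peak = {t_peak:.3f} s, peak = {amp:.4f}")
```

At low contrast the normalization term is small relative to σ, so the response follows the slowly rising driving signal for longer before the delayed pool pulls it down, which lowers the peak and moves it later in time.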

Fig 7.

Comparison between temporal models.

A. We compared two groups of models by simulating their outputs to a set of stimulus time courses (different stimulus durations, ISIs, and different stimulus contrasts). One group of models (red) is inspired by RC-circuit models of normalization, and assumes gain control is implemented by changes in membrane conductance associated with spiking. The other group of models (blue) does not make this assumption. Within each group, there are some (but not all) models that capture the properties of the response time courses summarized in Fig 1. B. The two temporal channels model does not assume a gain control component. As a consequence, when stimuli are of lower contrast, the model captures the reduced response amplitude but not the slower dynamics. Within a model group, if the level of gain control depends on the instantaneous instead of the history-dependent response, the model does not reproduce the transient-decay response shape. C. The difference between the response time course from a striate versus an extra-striate visual area (earlier response peak and higher response level at stimulus offset for the striate response) can also be qualitatively captured by cascading the DN model: the first layer takes the stimulus time course as input, and the second layer takes the response output from the first layer as input.
