A Two-Stage Cascade Model of BOLD Responses in Human Visual Cortex

doi:10.1371/journal.pcbi.1003079

Figure 1.

Building general, predictive models of the visual system.

We seek to develop computational models that characterize how stimuli are encoded in responses measured in the visual system. These models consist of specific computations and may have parameters that are adjusted to fit the data. Importantly, the models should operate on a wide range of stimuli and predict responses beyond those to which the models are fit.

Figure 2.

Second-order contrast (SOC) model.

(A) Schematic of model. First, the stimulus is filtered with a set of Gabor filters at different positions, orientations, and phases; the outputs of quadrature-phase pairs are squared, summed, and square-rooted (V1 energy). Second, filter outputs are divided by local population activity (Divisive normalization). Third, filter outputs are summed across orientation, producing a map of local contrast-energy. Contrast-energy is then weighted and summed across space using a 2D Gaussian (Spatial summation). The summation is not linear; rather, the summation is performed using a variance-like nonlinearity in which average contrast-energy is subtracted before squaring and summing across space (Second-order contrast). Finally, the output of the summation is subjected to a compressive power-law function (Compressive nonlinearity), yielding the predicted response. (B) Computation of second-order contrast. Second-order contrast is computed as the variance of the contrast-energy distribution within the 2D Gaussian. In this example, there is high variation in contrast-energy and thus a high amount of second-order contrast. (C) Simplified versions of the model. To motivate the SOC model, we consider several simplified versions of the model. Each version incorporates a model component not present in the previous version.
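
The Second-order contrast and Compressive nonlinearity stages in panel A can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the contrast-energy map has already been computed. The function names, grid size, gain term, and the values of c and n are hypothetical choices for demonstration. Scaling the subtracted mean by c is an assumption based on the second-order parameter described in Figure 9.

```python
import numpy as np

def gaussian_weights(size, x0, y0, sigma):
    """2D Gaussian on a size-by-size pixel grid, normalized to sum to 1."""
    ys, xs = np.mgrid[0:size, 0:size]
    w = np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2.0 * sigma ** 2))
    return w / w.sum()

def soc_response(energy, weights, c=1.0, n=0.5, gain=1.0):
    """Variance-like spatial summation followed by a compressive power law.

    energy  : 2D map of local contrast-energy (assumed precomputed)
    weights : normalized 2D Gaussian (spatial summation)
    c       : strength of the mean subtraction (c = 0 gives plain summation of squares)
    n       : exponent of the compressive nonlinearity (n < 1 is compressive)
    gain    : hypothetical overall scale factor
    """
    mean_energy = np.sum(weights * energy)                 # weighted average
    second_order = np.sum(weights * (energy - c * mean_energy) ** 2)
    return gain * second_order ** n

size = 32
w = gaussian_weights(size, x0=15.5, y0=15.5, sigma=4.0)

# Homogeneous contrast-energy: no variation across space.
uniform = np.full((size, size), 0.5)

# Contrast-energy present in the left half only: high variation across space.
patchy = np.zeros((size, size))
patchy[:, :16] = 1.0

r_uniform = soc_response(uniform, w)
r_patchy = soc_response(patchy, w)
print(r_uniform, r_patchy)
```

With c = 1 the homogeneous map produces a response of essentially zero, whereas the half-covered map, which has the same average contrast-energy, produces a clearly nonzero response. This is the qualitative behavior illustrated in panel B.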

Figure 3.

Divisive normalization accounts for contrast saturation.

We measured responses to several types of grating stimuli varying in contrast. Responses of an example voxel are shown (subject 2, area V1, voxel 31150). The complex-cell energy (CC) model consists of V1 energy and spatial summation, and predicts that responses rise linearly with contrast. However, the actual responses exhibit saturation at low contrasts. To account for contrast saturation, we incorporated divisive normalization [10] into the model. The divisive normalization (DN) model fits the data accurately.
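
To see why dividing by local population activity produces saturation, consider the following sketch. It is not taken from the paper's code. It assumes a common form of divisive normalization in which each channel's energy is divided by a constant plus the energy averaged across orientation channels. The parameter names r and s, their values, and the four-channel toy stimulus are hypothetical.

```python
import numpy as np

def divisive_normalization(energy, r=1.0, s=0.05):
    """Normalize each channel by the pooled activity across orientations.

    energy : array of shape (n_orientations, n_conditions)
    r, s   : hypothetical exponent and semi-saturation constant
    """
    pooled = energy.mean(axis=0, keepdims=True)   # local population activity
    return energy ** r / (s ** r + pooled ** r)

contrasts = np.array([0.0, 0.1, 0.25, 0.5, 1.0])

# Toy grating: drives one of four orientation channels, linearly with contrast.
tuning = np.array([1.0, 0.0, 0.0, 0.0])
energy = tuning[:, None] * contrasts[None, :]     # shape (4, 5)

linear = energy.sum(axis=0)                       # CC-like prediction
normalized = divisive_normalization(energy).sum(axis=0)

print("linear:    ", linear / linear[-1])
print("normalized:", normalized / normalized[-1])
```

In this toy example the normalized response at 50% contrast is already most of the way to the response at 100% contrast, while the linear prediction is only halfway there.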

Figure 4.

Compressive nonlinearity accounts for spatial tolerance.

We measured responses to noise patterns covering different portions of the visual field. Responses of an example voxel are shown (subject 2, area V2, voxel 38512). The DN model underestimates responses to stimuli covering a small portion of the receptive field and overestimates responses to stimuli covering a large portion of the receptive field. To improve performance, we incorporated a compressive static nonlinearity into the model. The compressive nonlinearity is applied after spatial summation and provides increased tolerance for changes in the position and size of a stimulus [11]. The compressive spatial summation (CSS) model fits the data accurately.
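
The effect of a compressive power law applied after spatial summation can be illustrated with a small numerical example. This is a sketch under simplifying assumptions: the stimulus is reduced to a binary coverage map, the receptive field is a normalized 2D Gaussian, and the exponent values (1.0 for linear summation, 0.2 for compressive summation) are hypothetical.

```python
import numpy as np

def gaussian_weights(size, sigma):
    """Centered 2D Gaussian on a size-by-size grid, normalized to sum to 1."""
    ys, xs = np.mgrid[0:size, 0:size]
    center = (size - 1) / 2.0
    w = np.exp(-((xs - center) ** 2 + (ys - center) ** 2) / (2.0 * sigma ** 2))
    return w / w.sum()

def summation_response(energy, weights, n):
    """Weighted spatial summation followed by a power-law nonlinearity."""
    return np.sum(weights * energy) ** n

size = 32
weights = gaussian_weights(size, sigma=5.0)

full = np.ones((size, size))            # stimulus covers the whole receptive field
left_half = np.zeros((size, size))      # stimulus covers only the left half
left_half[:, : size // 2] = 1.0

# Response to the half-field stimulus relative to the full-field stimulus.
ratios = {
    n: summation_response(left_half, weights, n) / summation_response(full, weights, n)
    for n in (1.0, 0.2)
}
print(ratios)
```

With linear summation the half-field stimulus evokes half of the full-field response. With the compressive exponent it evokes roughly 87% of the full-field response, so the output changes much less as the stimulus changes in extent.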

Figure 5.

Second-order contrast accounts for weak responses to grating stimuli.

(A) Second-order contrast improves model fits. We fit the CSS model to the spatial stimuli (shown in Figure 4) and evaluated how well the model predicts responses to the grating stimuli (shown in Figure 3). Results for an example voxel are shown (subject 2, area V2, voxel 42608). The CSS model substantially overestimates the grating responses. To improve performance, we incorporated computation of second-order contrast into the model. The second-order contrast (SOC) model fits the data accurately. (B) Additional demonstration of second-order effect. We measured responses to noise patterns varying in the amount of separation between the contours composing the patterns. At low separation levels, the stimuli contain little variation in contrast-energy across space and evoke weak responses, as expected (same voxel as in panel A).

Figure 6.

SOC model has high cross-validation accuracy.

Five-fold cross-validation was used to quantify the accuracy of the CC, DN, CSS, and SOC models. Vertical bars indicate the median accuracy across voxels in a given visual field map. Solid horizontal lines indicate the maximum possible performance given the noise in the data, and dotted horizontal lines indicate the performance of a control model that simply predicts the same response for every stimulus. The numbers at the top indicate the median performance of the SOC model, expressed in terms of explainable variance (see Methods). Within each visual field map, all pairwise differences between models are statistically significant (p<0.05, two-tailed sign test) with the exception of DN vs. CSS in V2.
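
The general procedure of five-fold cross-validation with a constant-prediction control can be sketched as follows. This is a generic illustration on synthetic data, not the analysis code used for the figure. The linear toy model, the helper names, and the use of the standard coefficient of determination are assumptions. The exact accuracy metric is defined in Methods.

```python
import numpy as np

def r_squared(data, prediction):
    """Standard coefficient of determination (assumed metric for this sketch)."""
    residual = np.sum((data - prediction) ** 2)
    total = np.sum((data - data.mean()) ** 2)
    return 1.0 - residual / total

def cross_validate(x, y, fit, predict, n_folds=5, seed=0):
    """Return held-out predictions for every data point."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    prediction = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False                    # hold out this fold
        params = fit(x[train], y[train])           # fit on the remaining folds
        prediction[test_idx] = predict(params, x[test_idx])
    return prediction

# Synthetic "responses": a linear function of a stimulus property plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)

# Toy model: straight-line fit.
pred_linear = cross_validate(
    x, y, fit=lambda a, b: np.polyfit(a, b, 1), predict=lambda p, a: np.polyval(p, a)
)
# Control model: the same response (the training mean) for every stimulus.
pred_constant = cross_validate(
    x, y, fit=lambda a, b: b.mean(), predict=lambda p, a: np.full(len(a), p)
)

r2_linear = r_squared(y, pred_linear)
r2_constant = r_squared(y, pred_constant)
print(r2_linear, r2_constant)
```

Because every prediction is made for data that were held out during fitting, the constant control cannot score above zero under this metric, which gives a floor against which the other models can be compared.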

Figure 7.

Data and cross-validated model predictions.

Here we visualize the cross-validation results by averaging across voxels in a visual field map. Black bars indicate the median response across voxels, and colored curves indicate the median model prediction across voxels. The CC model captures qualitative features of the data but fails quantitatively. The DN and CSS models fare better than the CC model but systematically underestimate and overestimate certain responses. The SOC model does well at quantitatively predicting the full range of responses.

Figure 8.

Additional control models.

(A) Schematic of models. Six variants of the SOC model were tested. Text annotations indicate modifications to model components (see Methods for details). (B) Cross-validation accuracy. The format is the same as in Figure 6 except that accuracy is now expressed in terms of explainable variance (the CC model is omitted as it falls outside the visible range). No model outperforms the SOC model. The RM2 model, a variant of the SOC model that omits the Divisive normalization component, performs about as well as the SOC model. This can be explained by the fact that there is some degree of overlap in functionality between the Divisive normalization and Compressive nonlinearity components of the SOC model.

Figure 9.

Parameters of the SOC model vary systematically across visual field maps.

(A) Size parameter (σ). The top panel shows the estimated σ value at 2° eccentricity for each visual field map. To quantify receptive field size, we compute model responses to small white spots (0.25°×0.25°) and fit 2D Gaussians to the results. The bottom panel shows contours at ±2 s.d. of the fitted Gaussians. (B) Exponent parameter (n). The top panel shows the median n value for each visual field map. To demonstrate the effect of n, we compute model responses to full-field noise patterns varying in contrast (same patterns used for the spatial stimuli). The bottom panel shows the resulting contrast response functions, normalized such that the maximum response is 1. (C) Second-order parameter (c). The top panel shows the median c value for each visual field map. To interpret the effect of c, we compute model responses to a 20%-contrast plaid pattern covering the entire receptive field and the same plaid pattern covering half of the receptive field. The bottom panel shows responses to the full and half plaids, normalized such that the response to the full plaid is 1. For reference we also show results obtained when c is set to 0.

Figure 10.

SOC model exhibits surround suppression.

(A) Simulation results. Stimuli consisted of a horizontal grating presented within circles of different sizes. Using the typical parameter values found in V2 (see Figure 9), we simulated the response of an array of model units tiling the visual field. Responses are strongest for units positioned at the edge of the grating since responses are driven primarily by variation in contrast-energy. (B) Responses of one unit (marked by a white dot in panel A). With increasing stimulus size, the response rises and then falls, consistent with surround-suppression effects found in electrophysiology [e.g., 20, 48].
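
The rise and fall in panel B can be reproduced qualitatively with a simplified simulation of a single unit centered on the grating. This is a sketch under strong assumptions: the grating is reduced to a binary contrast-energy map (1 inside the circle, 0 outside), and the values of σ, c, and n are hypothetical placeholders rather than the V2 estimates used for the figure.

```python
import numpy as np

size, sigma = 64, 6.0
ys, xs = np.mgrid[0:size, 0:size]
center = (size - 1) / 2.0
dist = np.hypot(xs - center, ys - center)          # distance from the unit's center

# Normalized 2D Gaussian used for spatial summation.
weights = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
weights /= weights.sum()

def soc_response(energy, weights, c=0.9, n=0.5):
    """Variance-like summation (mean scaled by c) followed by a power law."""
    mean_energy = np.sum(weights * energy)
    return np.sum(weights * (energy - c * mean_energy) ** 2) ** n

# Grating confined to circles of increasing radius (in pixels).
radii = np.arange(1, 31)
responses = np.array(
    [soc_response((dist <= r).astype(float), weights) for r in radii]
)

peak = int(np.argmax(responses))
print("peak radius:", radii[peak])
print("response at largest radius / peak response:", responses[-1] / responses[peak])
```

The response peaks at an intermediate radius, where the stimulus edge falls within the Gaussian and contrast-energy varies most across it, and declines once the grating fills the summation region and contrast-energy becomes homogeneous.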

Figure 11.

Natural images have relatively large amounts of second-order contrast.

(A) Simulation results. We prepared a collection of band-pass filtered natural image patches and phase-scrambled versions of these patches. We then quantified the amount of second-order contrast in each patch by computing the response of the SOC model to the patch (model parameters were set to the typical values found in V2). The median and interquartile range of responses are shown. For comparison we show results obtained when the second-order parameter c is set to 0. The SOC model but not the control model exhibits larger responses to the natural image patches. (B) Example patches. The natural image patch exhibits spatial variation in contrast, whereas its phase-scrambled counterpart is relatively homogeneous in contrast across space. Natural images were obtained from the McGill Colour Image Database [73].
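
Phase scrambling can be illustrated with a short NumPy sketch. The exact procedure used for the figure is not stated in this caption, so the approach below is an assumption: it keeps the Fourier amplitude spectrum of a patch and replaces its phase spectrum with that of a white-noise image, which keeps the result real-valued. The synthetic patch (texture in the left half, blank in the right half) is a hypothetical stand-in for a natural image patch.

```python
import numpy as np

def phase_scramble(patch, rng):
    """Keep the amplitude spectrum of `patch` but randomize its phases."""
    amplitude = np.abs(np.fft.fft2(patch))
    # Phases of a real noise image are conjugate-symmetric, so the
    # inverse transform is real up to rounding error.
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(patch.shape)))
    scrambled = np.fft.ifft2(amplitude * np.exp(1j * noise_phase))
    return scrambled.real

def half_contrast(img):
    """Standard deviation of pixel values in the left and right halves."""
    half = img.shape[1] // 2
    return img[:, :half].std(), img[:, half:].std()

rng = np.random.default_rng(0)

# Hypothetical patch with strong spatial variation in contrast.
patch = rng.standard_normal((64, 64))
patch[:, 32:] = 0.0

scrambled = phase_scramble(patch, rng)

print("original  (left, right):", half_contrast(patch))
print("scrambled (left, right):", half_contrast(scrambled))
```

The scrambled patch has the same amplitude spectrum as the original, but its contrast is spread roughly evenly across the two halves, which is the sense in which phase-scrambled patches are relatively homogeneous in contrast across space.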
