Statistical regularities in natural scenes that support figure-ground segregation by neural populations

doi:10.1371/journal.pcbi.1013573

Fig 1.

Motion maps and distance maps with figure-ground annotations were generated for analysis.

A) A dataset of motion maps was created by applying motion estimation to grayscale movies (left), yielding estimates of motion speed (middle) and direction (right). The red outline indicates the central frame for which motion was estimated. B) Epochs of visual motion for analysis were sampled from each 4-minute-long movie by selecting two local maxima in the mean speed and then sampling the frames associated with the peak, onset, and offset of motion. C) RGB images (left) and distance maps (right) were obtained from the dataset described in [32]. D) Figure region (red) and ground region (blue) annotations of selected images from the distance and motion map datasets were performed by professional human annotators. Black pixels indicate excluded sky regions and figure-ground boundaries.

Fig 2.

Simulated receptive fields (sRFs) and fixation points were used to sample from both datasets.

This figure illustrates these sRFs and how the associated fixation point can affect retinal speed and disparity (it is not a real scene from our datasets). After an sRF (gray circles) was identified, we assigned a random fixation point based on the scene’s saliency map (green circles) at the appropriate eccentricity (white dashed lines). A,B) For the motion maps, the calculated motion vector at the point of fixation was subtracted from the motion at all other points to simulate retinal motion (inset). In this example, the annotated figure region is moving leftward and the rest of the scene is stationary. Panel A illustrates the average speeds in the resulting figure and ground regions (s_f and s_g) for a stationary fixation point (such that the figure region speed is faster) and panel B illustrates the same for a moving fixation point on the figure (such that the ground region speed is faster). C,D) For the distance map dataset, the calculated distance at the point of fixation was used to determine the binocular retinal disparity. In this example, the annotated figure region is closer than the rest of the scene. Panel C illustrates average disparities in the resulting figure and ground regions (d_f and d_g) for a far fixation point and panel D illustrates the same for a near fixation point on the figure. In both cases, the figure region has a more positive disparity than the ground. Images are modified from artwork obtained on Pixabay.
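The two fixation-dependent transformations described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy 4x4 scene, the 0.064 m interocular distance, and the small-angle disparity approximation are all assumptions made here for illustration. The sign convention (positive disparity means nearer than fixation) follows the usage in the Fig 6 caption.

```python
import numpy as np

def retinal_motion(flow, fixation_rc):
    """Subtract the motion vector at fixation from every pixel.

    flow: (H, W, 2) array of (vx, vy) motion vectors in deg/s.
    fixation_rc: (row, col) of the fixation point.
    """
    r, c = fixation_rc
    return flow - flow[r, c]

def mean_region_speed(flow, mask):
    """Average speed (vector magnitude) over the pixels selected by mask."""
    speed = np.linalg.norm(flow, axis=-1)
    return speed[mask].mean()

def disparity_deg(distance_m, fixation_m, ipd_m=0.064):
    """Small-angle approximation of binocular disparity in degrees.

    Assumed convention: positive values are nearer than fixation.
    ipd_m is an assumed interocular distance, not a value from the paper.
    """
    return np.degrees(ipd_m * (1.0 / distance_m - 1.0 / fixation_m))

# Toy scene (hypothetical): a figure moving leftward at 2 deg/s
# in front of a stationary ground.
figure = np.zeros((4, 4), dtype=bool)
figure[1:3, 1:3] = True
ground = ~figure
flow = np.zeros((4, 4, 2))
flow[figure] = (-2.0, 0.0)

# Panel A analogue: fixation on the stationary ground.
flow_a = retinal_motion(flow, (0, 0))
s_f_a = mean_region_speed(flow_a, figure)
s_g_a = mean_region_speed(flow_a, ground)

# Panel B analogue: fixation on the moving figure.
flow_b = retinal_motion(flow, (1, 1))
s_f_b = mean_region_speed(flow_b, figure)
s_g_b = mean_region_speed(flow_b, ground)

# Panels C, D analogue: figure at 2 m, ground at 10 m (assumed distances).
d_f_far = disparity_deg(2.0, fixation_m=10.0)
d_g_far = disparity_deg(10.0, fixation_m=10.0)
d_f_near = disparity_deg(2.0, fixation_m=2.0)
d_g_near = disparity_deg(10.0, fixation_m=2.0)
```

In this toy scene, moving fixation from the ground to the figure swaps which region has the faster retinal speed, whereas the figure keeps the more positive disparity for both fixation distances, as the caption describes.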

Fig 3.

Globally, figures tend to move faster, move more horizontally, and be nearer than their surroundings.

(A) The overall frequency distribution of all speeds (gray shading) is shown for the motion map dataset, excluding motion below 0.5 deg/s. Red and blue lines show the distributions separately for pixels labeled as figure and ground, respectively. (B, C) The overall frequency distributions of motion directions and scene distances, as well as the corresponding figure and ground distributions, are plotted in the same manner as in A.

Fig 4.

Within sRFs, figure regions tend to move faster than the ground regions.

(A) The frequency distribution of speed differences between figure (f) and ground (g) regions in log deg/s across all sRFs is plotted. Positive values indicate sRFs in which the average figure speed is faster than the average ground speed. (B) The average and 95% confidence intervals for these figure-ground speed differences are plotted separately for each sRF eccentricity. (C) For relative speed, we show the average frequency across all points in the sRFs (gray shading), and separately for the figure regions and ground regions (red and blue lines, respectively). (D) The mean figure and ground results from panel C are plotted as a probability ratio as a function of relative speed. Values greater than 1 (red background) indicate that points are more likely to be in figure regions and values less than 1 (blue background) indicate that points are more likely to be in ground regions. In C and D, dashed lines indicate 95% confidence intervals, but in some panels the intervals are barely wider than the line thickness.
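The probability ratio in panel D can be illustrated with a short sketch. It assumes that the figure and ground histograms are each normalized to sum to 1 before dividing; the caption does not state the normalization, so this is an assumption, and the function name and sample values are hypothetical.

```python
import numpy as np

def probability_ratio(figure_vals, ground_vals, bins):
    """Ratio of figure to ground frequency in each bin.

    Each histogram is normalized to sum to 1 (an assumed normalization).
    Bins with no ground samples are returned as NaN.
    """
    fig_counts, _ = np.histogram(figure_vals, bins=bins)
    gnd_counts, _ = np.histogram(ground_vals, bins=bins)
    p_fig = fig_counts / fig_counts.sum()
    p_gnd = gnd_counts / gnd_counts.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(p_gnd > 0, p_fig / p_gnd, np.nan)

# Hypothetical relative-speed samples, for illustration only.
figure_vals = np.array([1.0, 1.0, 2.0, 3.0])
ground_vals = np.array([1.0, 2.0, 2.0, 2.0])
bins = np.array([0.5, 1.5, 2.5, 3.5])

ratio = probability_ratio(figure_vals, ground_vals, bins)
```

Bins with a ratio above 1 are those in which a point is more likely to belong to a figure region, and bins with a ratio below 1 favor ground.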

Table 1.

Pairwise t-tests comparing figure-ground speed differences at each sRF eccentricity.

Fig 5.

Within sRFs, figure regions tend to move more coherently than the ground regions.

(A) The frequency distribution of differences in circular variance between figure and ground is plotted as a measure of coherence difference. Positive values indicate lower coherence in the figure region and negative values indicate higher coherence in the figure region. Note that the x axis is flipped. (B) The average and 95% confidence intervals for these figure-ground variance differences are plotted separately for each sRF eccentricity. (C) For relative direction, we show the frequency across all points in the sRFs (gray shading), and separately for the figure regions and ground regions (red and blue lines, respectively). 0 deg corresponds to the dominant motion in each sRF. (D) The mean figure and ground results from panel C are plotted as a probability ratio as a function of relative motion direction. Values greater than 1 (red background) indicate that points are more likely to be in figure regions and values less than 1 (blue background) indicate that points are more likely to be in ground regions. In C and D, dashed lines indicate 95% confidence intervals, but in some panels the intervals are barely wider than the line thickness.
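The coherence measure in panel A can be sketched as follows. The sketch uses the standard definition of circular variance, one minus the length of the mean resultant vector, and treats every point equally. Whether the original analysis weighted directions (for example, by speed) is not stated in the caption, so the unweighted form is an assumption, and the sample directions are hypothetical.

```python
import numpy as np

def circular_variance(directions_deg):
    """Circular variance: 1 minus the mean resultant vector length.

    Returns 0 for perfectly coherent directions and approaches 1
    when directions are spread uniformly around the circle.
    """
    theta = np.radians(np.asarray(directions_deg, dtype=float))
    resultant = np.hypot(np.cos(theta).mean(), np.sin(theta).mean())
    return 1.0 - resultant

# Hypothetical sRF: a coherent figure and an incoherent ground.
figure_dirs = [170.0, 180.0, 190.0]
ground_dirs = [0.0, 90.0, 180.0, 270.0]

var_f = circular_variance(figure_dirs)
var_g = circular_variance(ground_dirs)

# Figure minus ground; negative means the figure is more coherent.
variance_difference = var_f - var_g
```

With the sign convention in the caption, this example gives a negative difference, corresponding to higher coherence in the figure region.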

Table 2.

Pairwise t-tests comparing figure-ground circular variance differences at each sRF eccentricity.

Fig 6.

Within sRFs, figure regions tend to be nearer than the ground regions and have more positive (nearer) binocular disparity.

(A) The frequency distribution of relative disparity between figure and ground across all sRFs is plotted. Positive values indicate sRFs in which the figure region is, on average, nearer than the ground region. (B) The average and 95% confidence intervals for these figure-ground disparity differences are plotted separately for each sRF eccentricity. (C) The frequency of binocular disparities is presented for figure regions and ground regions (red and blue lines, respectively). Gray shading indicates the probability across all points in the sRFs. (D) The mean figure and ground results from panel C are plotted as a probability ratio as a function of binocular disparity. Values greater than 1 (red background) indicate that points are more likely to be in figure regions and values less than 1 (blue background) indicate that points are more likely to be in ground regions. In C and D, dashed lines indicate 95% confidence intervals, but in some panels the intervals are barely wider than the line thickness.

Table 3.

Pairwise t-tests comparing figure-ground disparity differences at each sRF eccentricity.

Fig 7.

Toy example of neural population responses to figure-ground borders with faster figure speeds.

A) Each line represents a neuron’s tuning curve within a population of neurons broadly tuned for stimulus speed. One example tuning curve is indicated with a thicker line. We assume the receptive fields of these neurons are spatially overlapping. Our model included 50 neurons with Gaussian tuning curves of standard deviation equal to 1 log[deg/s] and with means uniformly spaced in log speed. B) We consider a stimulus within this spatial location that contains a figure region (f) and a ground region (g). An illustrative histogram indicates that the average speed in the figure region (s_f) is faster than the average speed in the ground (s_g). C) An individual neuron in the neural population is shown. When stimulated by a figure-ground border, this neuron receives input consistent with both speeds (s_f and s_g). In isolation, these speeds would elicit response rates of r_f and r_g, respectively. D) We considered three possible strategies to determine the neural response to bi-speed input (R): prioritizing the faster speed such that the bi-speed response matches the response to the faster stimulus, averaging, and prioritizing the slower speed such that the bi-speed response matches the response to the slower stimulus. E) A simple decoding strategy was used to recover the stimulus speed from the population responses associated with each of these strategies: we computed the response-weighted average of each neuron’s preferred single stimulus speed. F) The decoded speed as a function of the speed of the figure region is plotted for eight example stimuli. For each stimulus, s_f was first selected and then s_g was set to be slower by a random scale factor ranging from 0.3 to 0.8. Line/marker colors correspond to the three strategies in panel D.
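The toy model in this caption can be sketched in a few lines. The population size (50), the Gaussian tuning width (1 log unit), the three combination strategies, and the response-weighted decoder come from the caption. The range of preferred speeds, the use of natural logarithms, decoding in log-speed units, and the example stimulus are assumptions made here for illustration.

```python
import numpy as np

N_NEURONS = 50   # from the caption
SIGMA = 1.0      # tuning width in log speed units, from the caption
# Assumed range of preferred log speeds (not given in the caption).
PREFERRED = np.linspace(-2.0, 6.0, N_NEURONS)

def tuning_response(log_speed):
    """Gaussian tuning-curve response of every neuron to one log speed."""
    return np.exp(-0.5 * ((log_speed - PREFERRED) / SIGMA) ** 2)

def bi_speed_response(log_s_f, log_s_g, strategy):
    """Population response R to a border containing two speeds."""
    r_fast = tuning_response(max(log_s_f, log_s_g))
    r_slow = tuning_response(min(log_s_f, log_s_g))
    if strategy == "faster":
        return r_fast
    if strategy == "slower":
        return r_slow
    if strategy == "average":
        return 0.5 * (r_fast + r_slow)
    raise ValueError(f"unknown strategy: {strategy}")

def decode(responses):
    """Response-weighted average of preferred log speeds."""
    return np.sum(responses * PREFERRED) / np.sum(responses)

# Example stimulus (hypothetical): figure at 8 deg/s, ground slower
# by a scale factor of 0.5, which lies inside the 0.3 to 0.8 range.
log_s_f = np.log(8.0)
log_s_g = np.log(0.5 * 8.0)

decoded = {
    name: decode(bi_speed_response(log_s_f, log_s_g, name))
    for name in ("faster", "average", "slower")
}
```

Under these assumptions, prioritizing the faster speed yields a decoded value close to the figure speed, averaging yields a value between the two speeds, and prioritizing the slower speed yields a value close to the ground speed.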
