Psychophysical Tests of the Hypothesis of a Bottom-Up Saliency Map in Primary Visual Cortex

doi:10.1371/journal.pcbi.0030062

Figure 1.

Prediction of Interference by Task-Irrelevant Features, and Its Psychophysical Test

(A–C) Schematics of texture stimuli (extending continuously in all directions beyond the portions shown), each followed by schematic illustrations of its V1 responses, in which the orientation and thickness of a bar denote the preferred orientation and response level, respectively, of the activated neuron. Each V1 response pattern is followed below by a saliency map, in which the size of a disk, denoting saliency, corresponds to the response of the most activated neuron at the texture element location. The orientation contrasts at the texture border in (A) and everywhere in (B) lead to less suppressed responses to the stimulus bars since these bars have fewer iso-orientation neighbours to evoke iso-orientation suppression. The composite stimulus (C), made by superposing (A) and (B), is predicted to be difficult to segment, since the task-irrelevant features from (B) interfere with the task-relevant features from (A), giving no saliency highlights to the texture border.
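The saliency rule described in this caption (saliency at a location equals the response of the most activated neuron there) can be sketched in a few lines of Python. The response values below are hypothetical numbers chosen for illustration, not measured data, and the sketch ignores any additional suppression that superposing the two textures might introduce.

```python
def saliency_map(responses):
    """Return, for each grid location, the highest response among its neurons.

    responses[row][col] is a list of response levels, one per neuron
    activated at that texture element location.
    """
    return [[max(cell) for cell in row] for row in responses]


# Hypothetical response levels (arbitrary units) for one row of six locations.
# Texture (A): task-relevant bars, less suppressed only at the border (columns 2-3).
relevant = [0.4, 0.4, 1.0, 1.0, 0.4, 0.4]
# Texture (B): task-irrelevant bars with orientation contrast everywhere.
irrelevant = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# Stimulus (A) alone: one neuron responds at each location.
map_a = saliency_map([[[r] for r in relevant]])
# Composite (C): a relevant and an irrelevant bar at every location.
map_c = saliency_map([[[r, i] for r, i in zip(relevant, irrelevant)]])

print(map_a[0])  # the border locations stand out
print(map_c[0])  # uniform saliency, so the border is not highlighted
```

Under these assumed numbers the maximum over neurons is the same at every location of the composite, which is the sense in which the border receives no saliency highlight.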

(D,E) RTs (differently colored data points denote different subjects) for texture segmentation and visual search tasks testing the prediction. For each subject, RT for the composite condition is significantly higher (p < 0.001). In all experiments in this paper, stimuli consist of 22 rows × 30 columns of items (of single or double bars) on a regular grid with unit distance 1.6° of visual angle.


Figure 2.

Further Illustrations To Understand Interference by Task-Irrelevant Features

(A–C) As in Figure 1, the schematics of texture stimuli of various feature contrasts in task-relevant and -irrelevant features.

(D) Like (A), except that each bar is 10° from vertical, reducing orientation contrast to 20°.

(F) Derived from (C) by replacing each texture element of two intersecting bars by one bar whose orientation is the average of the original two intersecting bars.

(G–I) Derived from (A–C) by reducing the orientation contrast (to 20°) in the interfering bars, each of which is 10° from horizontal.

(J–L) Derived from (G–I) by reducing the task-relevant contrast to 20°.

(E) Normalized RTs for three subjects, DY, EW, and TT, on stimuli (A,D,F,C,I,L) randomly interleaved within a session. Each normalized RT is obtained by dividing the actual RT by the same subject's RT for stimulus (A), which is 471, 490, and 528 ms for subjects DY, EW, and TT, respectively.

For each subject, RT for (C) is significantly (p < 0.001) higher than that for (A,D,F,I) by at least 95%, 56%, 59%, and 29%, respectively. Matched sample t-test across subjects shows no significant difference (p = 0.99) between RTs for stimuli (C) and (L).
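As a minimal sketch of the normalization used in (E), the snippet below divides a raw RT by the same subject's RT for stimulus (A) and expresses the difference as a percentage. The baseline values come from the caption; the raw RT for stimulus (C) is a hypothetical number invented for illustration, and the function names are not from the paper.

```python
# Baseline RTs (ms) for stimulus (A), as stated in the caption.
BASELINE_RT_A = {"DY": 471, "EW": 490, "TT": 528}


def normalized_rt(subject, rt_ms):
    """Divide a raw RT by the same subject's RT for stimulus (A)."""
    return rt_ms / BASELINE_RT_A[subject]


def percent_increase(rt_ms, reference_ms):
    """Express how much longer rt_ms is than reference_ms, in percent."""
    return 100.0 * (rt_ms - reference_ms) / reference_ms


# Hypothetical raw RT (ms) for subject DY on stimulus (C); not measured data.
hypothetical_rt_c = 942

print(normalized_rt("DY", hypothetical_rt_c))
print(percent_increase(hypothetical_rt_c, BASELINE_RT_A["DY"]))
```

On this reading, "higher by at least 95%" corresponds to a normalized RT of at least 1.95 relative to the comparison stimulus.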


Figure 3.

Interference between Orientation and Color, with Schematic Illustrations (Top [A,B]), and Stimuli/Data (Bottom [C–J])

(A) Orientation segmentation with irrelevant color.

(B) Color segmentation with irrelevant orientation.

(A,B) Larger patch sizes of irrelevant color give stronger interference, but larger patch sizes of irrelevant orientation do not make interference stronger.

(C–E) Small portions of the actual experimental stimuli for orientation segmentation, without color contrast (C) or with irrelevant color contrast in 1 × 1 (D) or 2 × 2 (E) blocks. All bars had color saturation s_uv = 1, and were ±5° from horizontal.

(F) Normalized RTs for (C–E) for four subjects (different colors indicate different subjects). The “no”, “1 × 1”, and “2 × 2” on the horizontal axis mark stimulus conditions for (C–E), i.e., with no or n × n blocks of irrelevant features. The RT for condition “2 × 2” is significantly longer (p < 0.05) than that for “no” in all subjects, and than that for “1 × 1” in three out of four subjects. By matched sample t-test across subjects, mean RTs are significantly longer in “2 × 2” than those in “no” (p = 0.008) and those in “1 × 1” (p = 0.042). Each RT is normalized by dividing by the subject's mean RT for the “no” condition, which for the four subjects (AP, FE, LZ, NG) is 1170, 975, 539, and 1107 ms, respectively.

(G–J) Color segmentation, analogous to (C–F), with stimulus bars oriented ±45° and of color saturation s_uv = 0.5. Matched sample t-test across subjects showed no significant difference between RTs in different conditions. Only two out of four subjects had significantly higher RTs (p < 0.05) in the interfering than in the non-interfering conditions. The un-normalized mean RTs of the four subjects (ASL, FE, LZ, NG) in the “no” condition are 650, 432, 430, and 446 ms, respectively.
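The matched sample t-tests across subjects reported in (F) and (G–J) compare two conditions using each subject's pair of RTs. A minimal sketch of the test statistic is given below, assuming the standard paired t-test; the RT values are hypothetical, the function name is not from the paper, and the p-value (which requires the t distribution, for instance via `scipy.stats.ttest_rel`) is not computed here.

```python
import math
import statistics


def paired_t_statistic(rts_x, rts_y):
    """Return (t, degrees of freedom) for a paired t-test of rts_x against rts_y.

    Both lists hold one value per subject, in the same subject order.
    """
    if len(rts_x) != len(rts_y) or len(rts_x) < 2:
        raise ValueError("need two equally long lists with at least two subjects")
    diffs = [x - y for x, y in zip(rts_x, rts_y)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    # This is zero, and the division fails, if all differences are identical.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical normalized RTs for four subjects; not measured data.
rt_no_blocks = [1.0, 1.0, 1.0, 1.0]
rt_2x2_blocks = [1.1, 1.2, 1.3, 1.4]

t_value, dof = paired_t_statistic(rt_2x2_blocks, rt_no_blocks)
print(t_value, dof)
```

Pairing by subject removes between-subject differences in overall speed, which matters here because baseline RTs differ widely across subjects.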


Figure 4.

Small Portions of Actual Stimuli and Data in the Test of the Predictions of Saliency Advantage in Color-Orientation Double Feature (Left, [A–D]) and the Lack of It in Orientation–Orientation Double Feature (Right [E–H])

(A–C) Texture segmentation stimuli by color contrast, or orientation contrast, or by double color–orientation contrast.

(D) Normalized RTs for the stimulus conditions (A–C). Normalization for each subject is by whichever is the shorter mean RT (which for the subjects AL, AB, RK, and ZS are, respectively, 651, 888, 821, and 634 ms) of the two single-feature contrast conditions. All stimulus bars had color saturation s_uv = 0.2, and were ±7.5° from horizontal. All subjects had their RT for the double-feature condition significantly shorter (p < 0.001) than those of both single-feature conditions.

(E–G) Texture-segmentation stimuli by single- or double-orientation contrast. Each oblique bar is ±20° from vertical in (E) and ±20° from horizontal in (F); (G) is made by superposing the task-relevant bars in (E) and (F).

(H) Normalized RTs for the stimulus conditions (E–G) (analogous to [D]). The shorter mean RTs among the two single-feature conditions are, for the four subjects (LZ, EW, LJ, KC), 493, 688, 549, and 998 ms, respectively. None of the subjects had an RT for (G) lower than the minimum of the RTs for (E) and (F). Averaged over the subjects, the mean normalized RT for the double-orientation feature in (G) is significantly longer (p < 0.01) than that for the color-orientation double feature in (C).
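The normalization in (D) and (H) divides the double-feature RT by the shorter of the two single-feature mean RTs, so a value below 1 indicates a double-feature advantage. The sketch below illustrates this under assumed inputs: 651 ms is the shorter single-feature mean RT quoted for subject AL, while the other RTs and the function name are hypothetical.

```python
def normalized_double_rt(rt_single_1, rt_single_2, rt_double):
    """Normalize the double-feature RT by the shorter single-feature RT."""
    return rt_double / min(rt_single_1, rt_single_2)


# 651 ms is quoted in the caption; 700 ms and 600 ms are invented for illustration.
ratio = normalized_double_rt(651, 700, 600)

# A ratio below 1 means the double-feature RT beats both single-feature RTs.
print(ratio, ratio < 1.0)
```

Normalizing by the shorter single-feature RT is the stricter comparison: the double-feature condition counts as an advantage only if it is faster than the better of the two single-feature conditions.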


Table 1.

RTs (ms) in Visual Search for Unique Color and/or Orientation, Corresponding to Those in Figures 3 and 4


Figure 5.

Demonstration and Testing the Predictions on Spatial Grouping

(A–G) Portions of different stimulus patterns used in the segmentation experiments. Each row starts with an original stimulus (left) without task-irrelevant bars, followed by stimuli when various task-irrelevant bars are superposed on the original.

(H) RT data when different stimulus conditions are randomly interleaved in experimental sessions. The un-normalized mean RTs for the four subjects (AP, FE, LZ, NG) in condition (A) are 493, 465, 363, and 351 ms, respectively. For each subject, it is statistically significant that RT_C > RT_A (p < 0.0005), RT_D > RT_B (p < 0.02), RT_A > RT_B (p < 0.05), RT_A < RT_E, RT_G (p < 0.0005), RT_D > RT_F, RT_C > RT_E, RT_G (p < 0.02). In three out of four subjects, RT_E < RT_G (p < 0.01), and in two out of four subjects, RT_B < RT_F (p < 0.0005). Meanwhile, by matched sample t-tests across subjects, the mean RT values between any two conditions are significantly different (p smaller than values ranging from 0.0001 to 0.04).

(I) Schematics of responses from relevant (red) and irrelevant (blue) neurons, with (solid curves) and without (dot-dashed curves) considering general suppressions, for situations in (E–G). Interference from the irrelevant features arises from the spatial peaks in their responses away from the texture border.


Table 2.

RTs (ms) for Visual Search for Unique Orientation, Corresponding to Data in Figure 5H


Figure 6.

Illustration of Pomerantz's Configuration Superiority Effect

The triangle is easier to detect among the three arrow shapes in the composite stimulus than the left-tilted bar among the right-tilted bars in the original stimulus. The identical shape of the target and distractor bars in the original stimulus could lead to confusion and longer RT.
