Fragmented ambiguous objects: Stimuli with stable low-level features for object recognition tasks

doi:10.1371/journal.pone.0215306

Fig 1.

The first (automated) stage of stimulus generation.

Top row: Line drawings, photographs, and full-color drawings of isolated objects were downloaded from existing object databases intended for vision research. Second row: Objects were scaled to occupy 88% of a 384x384 pixel square. An algorithm using a regular grid of Gabor wavelet filters (dots indicate centers of Gabor wavelets), tuned to a range of spatial frequencies and orientations, then converted local features to oriented line segments. Third row: The resulting array of line segments was convolved with a 16x16 pixel square, then subjected to a clustering algorithm that assigned a different integer to each disconnected region. “Objects” with 6 or more separate regions were eliminated from further consideration. Bottom row: Each collection of object “features” was then embedded in a background of line segments at a randomly selected orientation. Credits, from left: bluebell_ed.bmp from http://testbed.herts.ac.uk/HIT/hit_apply.asp, shared free of copyright as stated in {Adlington}; 159.gif and Accordion.jpg from {Rossion} were downloaded from https://figshare.com/articles/Snodgrass_Vanderwart_Like_Objects/3102781, used with permission from Michael Tarr, original copyright CC BY NC SA 3.0 2016.
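
The region-counting step in the third row can be illustrated with a minimal sketch. It assumes the convolved image has already been thresholded to a binary mask and that regions are 8-connected; the caption specifies neither, and the function names `label_regions` and `keep_object` are hypothetical. It is not the authors' implementation.

```python
from collections import deque

MAX_REGIONS = 5  # caption: objects with 6 or more separate regions were eliminated


def label_regions(mask):
    """Assign a different integer to each disconnected region of a binary mask.

    Assumes 8-connectivity (an assumption, not stated in the caption).
    Returns (labels, count); background cells keep the label 0.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            # New region found: flood-fill it with the next integer label.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def keep_object(mask):
    """Keep an object only if it has fewer than 6 separate regions."""
    _, count = label_regions(mask)
    return count <= MAX_REGIONS


# Toy mask with three disconnected regions.
toy = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]
labels, n_regions = label_regions(toy)
```

A library routine such as `scipy.ndimage.label` performs the same labeling on array data; the pure-Python version is used here only to keep the sketch dependency-free.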

Fig 2.

Histogram of number of wins for each of the 718 images ranked by authors.

On the right are exemplars of the one image that won all 16 possible comparisons (cat, far right), and one of the many images that won only 9 out of 16 times and therefore was eliminated (the accordion from Fig 1, middle panel). The number of images included for consideration after this stage was 217.

Fig 3.

After the stimulus set was created, rare lines at orientations other than 0°, 45°, 90° or 135° were replaced.

Left: in the majority of the images, fewer than 5% of the line segments were at 22.5°, 67.5°, 112.5° or 157.5°. These were therefore removed to simplify the final stimulus set. Although observer ratings were performed on the images before the rare orientations were removed (8 orientations, middle panel), the changes were minor enough that they would not be expected to change ratings by affecting perception of the object as a whole. See text for details.

Table 1.

Endorsement rates for groups participating in Stage 3.

Endorsement rate is the fraction of images to which a participant responded “Yes, there is a known object”. Group (N): mean (standard deviation).
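
As a worked illustration of the definition above, the sketch below computes per-participant endorsement rates and a group summary in the "Group (N): mean (standard deviation)" layout. The response data are invented for the example, the function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the caption does not say which form was used.

```python
from statistics import mean, stdev


def endorsement_rate(responses):
    """Fraction of images answered "Yes, there is a known object" (True)."""
    return sum(responses) / len(responses)


def summarize_group(participants):
    """Return (N, mean rate, sample SD) across participants in one group."""
    rates = [endorsement_rate(r) for r in participants]
    return len(rates), mean(rates), stdev(rates)


def format_row(name, participants):
    """Format one table row as 'Group (N): mean (standard deviation)'."""
    n, m, sd = summarize_group(participants)
    return f"{name} ({n}): {m:.2f} ({sd:.2f})"


# Invented toy responses: three participants, four images each.
demo_group = [
    [True, True, False, False],   # rate 0.50
    [True, False, False, False],  # rate 0.25
    [True, True, True, False],    # rate 0.75
]
row = format_row("Demo", demo_group)
```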

Fig 4.

Behavioral responses and line segment counts for all 217 images in a study with naïve participants.

A) The scatter plot shows the recognizability and stability scores for each of the 217 images, with the color of the dot indicating the dominant category for the image. The saturation of points representing images named fewer than 10 times is reduced. The black dot in the lower left marks an image that 14% of observers judged “recognizable” in the Yes/No task, but that was then consistently rejected (by refusal to label) in the Naming task. B) The number of line segments used to describe each object was associated with recognition rates in this image set. The shape of the symbol reflects the convexity score: the more crescent-shaped the symbol, the smaller the ratio of the area of the object to the area of a convex bounding region. The darkness of the symbol reflects the density of the object, with black indicating objects that were defined by texture and light gray indicating objects that were defined by contours.
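
The convexity score in panel B is a ratio of two areas, which can be sketched as follows. This sketch assumes the "convex bounding region" is the convex hull and that the object can be represented as a simple polygon outline; how the area of a line-segment object was actually measured is not given in the caption, and the function names are hypothetical.

```python
def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula; pts in vertex order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(pts):
    """Convex hull by Andrew's monotone chain; vertices counter-clockwise."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def convexity(outline):
    """Object area divided by convex-hull area; 1.0 for a convex outline."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# An L-shaped outline has area 3 inside a hull of area 3.5.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Under this definition a concave, crescent-like outline scores below 1, while any convex outline scores exactly 1.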

Fig 5.

Behavioral responses and line segment counts for the subset of 100 images retained to create the vetted stimulus set.

A) Scatter plot and marginal histograms as in Fig 4. Filled circles indicate stimuli included in the final set of 100 images; empty circles indicate stimuli present in Fig 4 but excluded from the final set. Median recognizability in the final set is 0.51 (mean = 0.51); median stability is 0.82 (mean = 0.77). In the final set of objects there are 40 living and 60 non-living (22 non-manipulable). B) Line segment number is not associated with recognition rate in the final set of 100 stimuli.

Table 2.

Mean recognizability, stability, density, and convexity, by category, for the 217 images that entered Stage 3 and the 100 images selected by Stage 3.

For each statistic, mean (standard deviation) is shown.

Fig 6.

Orientations of line segments in intermediate and final image sets.

A) Color indicates orientation: red, 0°; green, 45°; cyan, 90°; purple, 135°. Images were divided into quartiles based on the proportion of participants who indicated they contained a known object (“recognizability”). The orientations of line segments were not uniformly distributed but there was no correlation between the proportion of line segments at a given orientation and object recognition, r(98) = -0.11, -0.11, 0.092, and 0.10, respectively, for 0°, 45°, 90°, and 135°, all p’s > 0.2. B) Orientation difference between neighboring elements (red: collinear; yellow: parallel; green: acute; blue: orthogonal; purple: obtuse). There was also no correlation between the proportion of stimuli with a given pairwise orientation and object recognition for collinear, r(98) = 0.15, p = 0.14, parallel, r(98) = 0.12, p = 0.25, acute, r(98) = -0.06, p = 0.58, or obtuse, r(98) = -0.11, p = 0.29. The relationship between the proportion of orthogonal pairs and recognizability was the strongest, r(98) = -0.193, p = 0.055.
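
The notation r(98) gives the degrees of freedom (n - 2 for the 100 images in the final set). As a rough check on how a correlation maps to its p-value, the sketch below converts r to a t statistic and integrates the Student t density numerically. It assumes a standard two-tailed test of Pearson's r, which the caption does not state explicitly; the function names are hypothetical, and a statistics library would normally be used instead of hand-rolled integration.

```python
import math


def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3.0  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)


# Reported value for orthogonal pairs: r(98) = -0.193.
t_orth = t_from_r(-0.193, 98)
p_orth = two_tailed_p(t_orth, 98)
```

For the orthogonal-pair value this gives a t statistic near -1.95 and a p-value between 0.05 and 0.06, in line with the reported p = 0.055.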
