Conceived and designed the experiments: CCC CWT. Performed the experiments: CCC. Analyzed the data: CCC. Wrote the paper: CCC CWT.
The authors have declared that no competing interests exist.
Symmetry detection is an interesting probe of pattern processing because it requires the matching of novel patterns without the benefit of prior recognition. However, there is evidence that prior knowledge of the axis location plays an important role in symmetry detection. We investigated how prior information about the symmetry axis affects symmetry detection under noise-masking conditions. The target stimuli were random-dot displays structured to be symmetric about vertical, horizontal, or diagonal axes and viewed through eight apertures (1.2° diameter) evenly distributed around a 6° diameter circle. The information about axis orientation was manipulated by (1) cueing of axis orientation before the trial and (2) varying axis salience by including or excluding the axis region within the noise apertures. The percentage of correct detection of the symmetry was measured for a range of both target and masking noise densities. The threshold vs. noise density function was flat at low noise density and increased with a slope of 0.75–0.8 beyond a critical density. Axis cueing reduced the target threshold 2–4-fold at all noise densities, while axis salience had an effect only at high noise density. Our results are inconsistent with an ideal observer or signal-to-noise account of symmetry detection but can be explained by a multiple-channel model in which the response in each channel is the ratio between the nonlinear transform of the responses of sets of early symmetry detectors and the sum of external and intrinsic sources of noise.
One of the major functions of the visual system is to identify and localize objects in a visual scene. To achieve this, we can assume that the visual system is likely to have developed means of utilizing many kinds of useful information. Mirror symmetry is one of the important image features, and is present in a large proportion of the objects that we encounter. In the wild, for instance, many relevant aspects of the environment, such as potential predators, food sources or mates, tend to have mirror symmetry while the background elements, such as rocks, water, trees, and hillsides, are largely non-symmetric
While detecting mirror symmetry is easy for the human vision system
Currently, the spatial filtering approach is a popular framework for understanding symmetry perception
The purpose of our study is then to understand the effect of uncertainty about axis orientation in the framework of Signal Detection Theory
Here, we consider two possible hypotheses as to how the visual system determines the axis orientation for the detection of symmetry. The first hypothesis assumes that a higher-order symmetric detector receives the responses from lower-order mechanisms that are each sensitive around a symmetry axis of a particular orientation. When the axis orientation is unknown to the observer, according to the uncertainty theory
The second hypothesis suggests that the visual system may simply analyze the spatial relationships among individual image elements and determine an image to be symmetric if a sufficient proportion of the spatial locations of image elements support it. That is, symmetry detection would be based solely on the signal-to-noise ratio or “weight-of-evidence” in the image
In addition to the axis orientation, we also need to consider the salience of the symmetry axis, which is relevant to how the location of the symmetry axis is determined in some models. For instance, Rainville & Kingdom
The use of human participants was approved by the IRB of National Taiwan University Hospital and followed the guidelines of the Declaration of Helsinki. Written informed consent was obtained from each participant.
The stimuli were presented on a ViewSonic VA902 17″ LCD monitor controlled by an HP D325MT computer with an ATI Radeon 9800PRO graphics card. The spatial resolution was 1280 (H) × 1024 (V). At the viewing distance of 83.2 cm, a pixel subtended 1′ (H) × 1′ (V). The temporal refresh rate of the monitor was 60 Hz (non-interlaced). The gamma function of the monitor was calibrated with a LightMouse photometer
In our experiment, the information about the symmetry axis was manipulated in two ways: (1) cueing: whether there was a cue indicating the axis orientation before a trial; and (2) axial salience: whether the axis location fell within the apertures or between them.
A. The configuration consisted of an overall noise pattern with a single axis orientation visible through a mask of eight apertures. The axial salience was controlled by the position of the apertures, located so as to either include or exclude the region around the axis. B. Examples of different combinations of target and masker density.
On each trial, the stimuli consisted of a random-dot mask superimposed on either a symmetric target or a non-symmetric random-dot control. The purpose of the random-dot control was to balance the local statistics in the image. The stimuli were spatially masked with a uniform gray field (15 cd/m²) with eight apertures (1.2° diameter) evenly distributed around a 6° diameter circle. In the high axial salience condition, the centers of the apertures were located from 0° to 315° in 45° steps from the horizontal axis to include the symmetry axis in diametrically opposite pairs, regardless of which of the four orientations the axis took. In the low axial salience condition, the centers of the apertures were shifted clockwise by 22.5° to exclude the symmetry axis from all the apertures. In this configuration the blank region around each possible axis location was a minimum of 1.16°.
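The aperture layout described above can be illustrated with a short geometric sketch. It is not the stimulus-generation code used in the experiment. It assumes that the symmetry axis passes through the center of the aperture ring and that positions are expressed in degrees of visual angle; the function names `aperture_centers` and `apertures_on_axis` are hypothetical.

```python
import math

RING_RADIUS = 3.0       # deg; apertures lie on a 6 deg diameter circle
APERTURE_RADIUS = 0.6   # deg; each aperture is 1.2 deg in diameter
AXES_DEG = (0.0, 45.0, 90.0, 135.0)  # horizontal, diagonal, vertical, diagonal


def aperture_centers(offset_deg=0.0):
    """Centers of eight apertures spaced 45 deg apart, rotated by offset_deg."""
    centers = []
    for k in range(8):
        angle = math.radians(k * 45.0 + offset_deg)
        centers.append((RING_RADIUS * math.cos(angle),
                        RING_RADIUS * math.sin(angle)))
    return centers


def apertures_on_axis(axis_deg, centers):
    """Count apertures intersected by an axis through the display center."""
    a = math.radians(axis_deg)
    count = 0
    for x, y in centers:
        # Perpendicular distance from the aperture center to the axis line.
        distance = abs(x * math.sin(a) - y * math.cos(a))
        if distance < APERTURE_RADIUS:
            count += 1
    return count


high_salience = aperture_centers(0.0)    # centers at 0, 45, ..., 315 deg
low_salience = aperture_centers(-22.5)   # same layout shifted clockwise by 22.5 deg
```

Under these assumptions, each of the four axes passes through one diametrically opposite pair of apertures in the high salience layout and through none in the low salience layout.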
On each trial, observers determined whether a symmetric target or a non-symmetric control pattern was presented. The axis was randomly selected from one of the four orientations on each trial, but information about the axis was manipulated by (1) cueing and (2) axial salience. Thus, there were a total of 4 ( = 2×2) test conditions in the experiment. In the cue condition, a straight line with the same orientation and location as the symmetry axis flashed for 500 ms, followed by 15 ms of a uniform gray field, before the onset of the stimuli. In the non-cue condition, instead of the valid cue, a neutral cue of four lines that had the same orientations and location as the four possible symmetry axes was presented before the test stimuli. The test stimuli stayed on the screen until the observer made a response, after which the display was replaced by the uniform gray field. The salience and non-salience conditions were determined by the location of the apertures as discussed above.
The trials were blocked by test condition as well as by noise density, but axis orientation was randomized throughout each block. In each block, we used a constant stimulus paradigm to measure the psychometric functions of percentage correct responses for a range of 7–9 target densities in 0.15 log increments. The range of target densities depended on both test conditions and noise density and was determined by a pilot experiment (data not shown) in which one of the authors served as an observer. The sequence of target density and axis orientation within a block, and the sequence of noise density and test conditions between blocks, were all randomized.
Four observers participated in this study. One observer (CC) was one of the authors of this paper while the other three were paid observers who were naïve to the purpose of the experiment. All observers had corrected-to-normal (20/20) visual acuity. Observer PC left the study before making measurements with the low-density noise masks.
Each panel represents data from one observer. Blue denotes the TvD function for the cued high salience condition; magenta, the cued low salience condition; green, the non-cued high salience condition; and red, the non-cued low salience condition. The smooth curves are fits of the model discussed below. The error bars are the estimated standard error of measurement.
For all conditions, the target density threshold increased with noise density. At medium to high noise densities, the slope of the increment function reached an average of about 0.77 in log-log coordinates for all conditions and observers, significantly less than a slope of 1 (t(15) = 6.19, p<0.001). The asymptotic slope of the TvD functions varied with axial salience. Averaged across observers and cue conditions, the TvD functions for the low salience conditions had a slope (0.86) significantly greater (t(7) = 2.38, p = 0.048) than that for the high salience conditions (0.70). Within the same salience condition, there was little difference in slope between the TvD functions measured for different cueing conditions (t(7) = 0.65, p = 0.53). At the low noise densities, the slope of the increment function may be less because the density thresholds measured with no masking noise would be the same as those measured at noise densities between −3 and −4 log units (as predicted from the slope at the high noise densities).
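The qualitative shape of the TvD function described above, flat at low noise density and rising with a slope of about 0.77 in log-log coordinates beyond a critical density, can be summarized with a purely descriptive sketch. This is not the model fitted to the data; the parameters `flat_threshold` and `critical_density` are hypothetical placeholders, and only the default slope is taken from the text.

```python
def descriptive_tvd(noise_density, flat_threshold, critical_density, slope=0.77):
    """Hinge-shaped threshold-vs-density curve in linear units.

    Below critical_density the threshold stays at flat_threshold; above it,
    the threshold grows as a power function, which is a straight line with
    the given slope in log-log coordinates.
    """
    if noise_density <= critical_density:
        return flat_threshold
    return flat_threshold * (noise_density / critical_density) ** slope
```

With a slope below 1, a tenfold increase in noise density above the critical density raises the threshold by less than a factor of ten, which is the feature that distinguishes the data from a unity-slope prediction.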
Wenderoth
The dashed and dotted blue lines indicate the predictions of the uncertainty model and the signal-to-noise, or weight-of-evidence, model respectively.
The filled circles in
The experiment was designed to directly compare the effects of two aspects of knowledge about the symmetry axis on the detectability of symmetry as a function of masking noise density.
The axis orientation cue reduced the symmetry detection threshold 2–4-fold. The axial salience effect, by contrast, was pronounced at high but not low noise densities. The threshold reduction produced by the cue is inconsistent with what would be predicted by a simple signal-to-noise ratio or weight-of-evidence account of symmetry detection
The effect of the cue, however, cannot be explained by uncertainty reduction alone. We assume that the observer's performance in both conditions is determined by the channel with the greatest response. Gaussian Max Uncertainty Theory
The axial salience effect was pronounced at high but not at low noise densities. Indeed, when there was no external noise, the salience effect did not differ significantly from zero. Given that the gap between the neighboring apertures was 1.16° at their closest, the lack of a threshold difference between the high and low salience conditions suggests that the observer is not using the information close to the symmetry axis for symmetry detection. Since models of symmetry based on the image property at or near the symmetry axis
Our results showed that the increase of target density threshold with noise density had a slope of between 0.70 and 0.86 in log-log coordinates. This result is not consistent with the simple signal-to-noise-ratio
Our data do not fit with this picture.
The solid lines have a slope of 1, the dashed lines, a slope of 0.75. The lines with unity slope tend to overestimate thresholds at high masker density and underestimate them at low masker density.
Here, we present a model that can explain all aspects of our data. This model, in the framework of Signal Detection Theory
A. Without cues. B. With cues. See text for details.
The first step of the perception stage is a band of orientation-selective symmetry processors that are sensitive to symmetry in an image. Each processor is sensitive to the mirror symmetry about one axis. Note that these are not the traditional local filters but long-range pairs of local multiplicative contrast detectors that register a signal whenever there is a similar contrast at two locations in the field equidistant from a symmetry axis. The outputs of all such pairs of detectors relative to a given symmetry axis are linearly summed to form the symmetry signal relative to that location. It is important to emphasize that symmetry processing requires such axis selectivity, since any binary noise pattern has an infinite number of dot pairings at arbitrary locations and pairwise orientations. It is only when a number of them line up with respect to a particular symmetry axis or axes that we say that the pattern has symmetry.
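The pairwise multiplicative scheme described above can be sketched for a single vertical axis as follows. This is only an illustration under simplifying assumptions: the image is a list of rows of signed contrast values (for example +1 and −1 for light and dark dots, 0 for background), the axis lies on the vertical midline, and every mirror pair is weighted equally. The function name `symmetry_signal` is hypothetical.

```python
def symmetry_signal(image):
    """Sum the products of contrasts at mirror-paired locations.

    Each pair of locations equidistant from the vertical midline contributes
    the product of its two contrasts, so matching contrasts add to the
    signal and opposite contrasts subtract from it. The pair outputs are
    then summed linearly over the whole image.
    """
    total = 0.0
    for row in image:
        width = len(row)
        for x in range(width // 2):
            total += row[x] * row[width - 1 - x]
    return total


symmetric = [[1, 0, -1, -1, 0, 1],
             [0, -1, 0, 0, -1, 0]]
asymmetric = [[1, 0, -1, 1, 0, -1],
              [0, -1, 0, 0, 1, 0]]
```

A processor tuned to a different axis would pair locations about that axis instead; the pairing rule, not the local detectors, carries the axis selectivity.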
The image in the target+masker trial can be considered to consist of two components: the symmetric target and the noise masker while the image in the control+masker trial can be considered to consist of just one component with a density that is the sum of the control and the masker.
For sparse binary random-dot patterns, such as those in our experiment, the output of the j-th processor to the i-th image component, Ej,i, is
Eq. (2) should hold for spatial filtering approaches
The response of the perception stage of the model is the excitation of the j-th processor, Ej, raised to a power p and then divided by a divisive inhibition term Ij plus an additive constant z, where Ej = Σi Ej,i is the sum of excitations produced by all image components. That is,
The contribution of each channel to visual performance is limited by noise. There are two sources of noise in this model: the internal noise inherent in the system, and the external noise provided by the noise patterns. The variability of the internal noise, σa², is a constant for all processors in the model. The variability of the external noise, σe², is proportional to the square of the density of the random-dot patterns, Db; that is, σe² = v · Db², where v is a scalar constant. Pooled together, in each channel the standard deviation of the response distribution is
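The verbal description of the channel response can be turned into a minimal sketch. It follows only the wording above (excitation raised to the power p, divided by the inhibition term plus z) and treats the inhibition term Ij as an already computed number, because its exact form is given by a separate equation. The function name `channel_response` is hypothetical.

```python
def channel_response(excitations, inhibition, z, p):
    """Response of one channel following the verbal description.

    excitations: excitation contributed by each image component (Ej,i)
    inhibition:  divisive inhibition term Ij, assumed precomputed
    z:           additive constant in the denominator
    p:           exponent applied to the summed excitation
    """
    total_excitation = sum(excitations)   # Ej = sum over i of Ej,i
    return total_excitation ** p / (inhibition + z)
```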
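A minimal sketch of the noise pooling follows, assuming that the internal and external noise sources are independent so that their variances add. The function name `pooled_sd` and its argument names are hypothetical.

```python
import math


def pooled_sd(sigma_a, v, density_b):
    """Standard deviation of the response when two noise sources are pooled.

    sigma_a:   standard deviation of the internal noise
    v:         scalar constant relating squared density to external variance
    density_b: density of the random-dot noise pattern (Db)
    """
    external_variance = v * density_b ** 2   # sigma_e^2 = v * Db^2
    return math.sqrt(sigma_a ** 2 + external_variance)
```

Under this assumption the internal noise dominates at low density, where the pooled value is close to `sigma_a`, and the external noise dominates at high density, where the pooled value grows in proportion to the density.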
The output of the perception stage is sent to the decision stage. The decision stage monitors more channels than those that are relevant to the prescribed visual tasks
When there are m channels to be monitored, the maximum response of these channels can be described by a distribution whose mean approximates a fourth-power summation over these m channels
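The fourth-power summation mentioned above can be written as a short sketch. It assumes non-negative channel responses and is only the approximation named in the text, not an exact computation of the distribution of the maximum. The function name `fourth_power_sum` is hypothetical.

```python
def fourth_power_sum(responses, exponent=4.0):
    """Approximate the mean of the maximum over channels by power summation.

    Each response is raised to the given exponent, the results are summed,
    and the matching root of the sum is returned.
    """
    return sum(r ** exponent for r in responses) ** (1.0 / exponent)
```

With a single monitored channel (m = 1) the expression reduces to that channel's response, and with several channels it is dominated by the largest response.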
Here, we use the subscript b+c to emphasize that the noise pattern contained both the masking noise and a control pattern with the same number of dots as the corresponding symmetry target. Suppose that there are n channels responding to the symmetry image component in the stimuli when it is available. Then, the mean response in the decision stage becomes
The decision variable is the difference of the response to the image with the symmetry component and the response to the random-dot image of the same pattern divided by the standard deviation of the max distribution, σp. That is,
The threshold is defined when d' reaches unity. Note that the standard deviation of the max distribution of four independently and identically distributed samples is 0.71 times the standard deviation of the original distribution
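The quoted ratio for the maximum of four samples can be checked numerically. The sketch below assumes unit-variance Gaussian channel responses, which is our assumption for illustration, and estimates the standard deviation of the maximum by simulation. The function name `sd_of_max` is hypothetical.

```python
import random
import statistics


def sd_of_max(n_channels=4, n_trials=100_000, seed=0):
    """Estimate the SD of the maximum of n_channels standard normal samples.

    Because each sample has unit standard deviation, the returned value is
    also the ratio between the SD of the max distribution and the SD of the
    original distribution.
    """
    rng = random.Random(seed)
    maxima = [max(rng.gauss(0.0, 1.0) for _ in range(n_channels))
              for _ in range(n_trials)]
    return statistics.pstdev(maxima)
```

For four channels the estimate comes out at approximately 0.7, in line with the factor quoted above, whereas for a single channel it stays close to 1.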
In practice, if we use a typical value of 2 for the power for the divisive inhibition input q in Eq. (5)
Eqs. (4)′, (7) and (8)′ thus define the whole computation and all the parameters in the model. In general, the parameters in Eq. (4)′ were set the same for all conditions except as follows: we allowed the target-related sensitivity parameters Set and Sit to change with axial salience as images with different salience were physically different. As discussed above, uncertainty reduction alone cannot explain the whole cueing effect. Other parameters also need to be adjusted to model the cueing effect. From
Before describing the model fits, it is relevant to consider the inherent properties of the model. In particular, it has the property that the noise masking function can exhibit two “corners”, or locations where the slope of the TvD function increases, instead of one as commonly seen in the discrimination functions in the contrast or luminance domain
The black curve has all three model components in the denominators, with parameters chosen to produce a strong double-corner effect. Green curve: removing the internal noise reduces the threshold at low masker density. Blue curve: removing the divisive inhibition limits the masking effect at high masker density. (Removing the external noise results in a horizontal line, since it is the parameter of the x axis.)
The black curve contains all three components in the denominators of Eq (4)′. The parameters in this illustration were chosen to make both corners more pronounced. The green curve of
The model fits are shown as smooth curves in
| Parameter | Condition | CC | LY | TR | HP |
|---|---|---|---|---|---|
| Set | cued, high axial salience | 1000* | 1000* | 1000* | 1000* |
| | cued, low axial salience | 401 | 443 | 867 | 477 |
| | non-cued, high axial salience | 540 | 466 | 528 | 549 |
| | non-cued, low axial salience | 201 | 268 | 416 | 362 |
| Seb | | 1.68 | 270 | 0.10 | 247 |
| Sit | high axial salience | 890 | 850 | 400 | 1060 |
| | low axial salience | 60 | 750 | 260 | 780 |
| Sib | | 1196 | 2007 | 7132 | 15219 |
| z | | 0.15 | 0.03 | 3.54 | 2145 |
| p | | 2.17 | 2.23 | 2.60 | 2.91 |
| Other fixed parameters used in the model fits | | | | | |
| q | | 2* | 2* | 2* | 2* |
| m | cued | 1* | 1* | 1* | 1* |
| | non-cued | 4* | 4* | 4* | 4* |
| n | | 1* | 1* | 1* | 1* |
| γ | cued | 1* | 1* | 1* | 1* |
| | non-cued | 0.71* | 0.71* | 0.71* | 0.71* |

*Fixed value, not a free parameter.
The
The excitatory sensitivity to the low salience targets shows a 15–60% reduction compared with that to the high salience ones. This degree of reduction may be taken as an index of the relative contribution of the information at or near the symmetry axis. This result also suggests that 40–85% of symmetry sensitivity is from sources distant from the symmetry axis (by more than 0.58°). Notice that, since the distance between two neighboring apertures is 1.5° in our stimuli, such a contribution must be from a long-range interaction mechanism
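One simple way to compute such a reduction from the fitted Set values in the parameter table is sketched below. The dictionary layout and the function name `salience_reduction` are ours, and the calculation (one minus the ratio of the low salience to the high salience sensitivity, taken within each cueing condition) is an assumption about how the comparison can be made; it is not necessarily the exact calculation behind the range quoted above.

```python
# Fitted excitatory sensitivity to the target (Set) for each observer.
SET_VALUES = {
    "cued": {
        "high": {"CC": 1000, "LY": 1000, "TR": 1000, "HP": 1000},
        "low": {"CC": 401, "LY": 443, "TR": 867, "HP": 477},
    },
    "non-cued": {
        "high": {"CC": 540, "LY": 466, "TR": 528, "HP": 549},
        "low": {"CC": 201, "LY": 268, "TR": 416, "HP": 362},
    },
}


def salience_reduction(values):
    """Fractional drop in sensitivity from high to low axial salience."""
    reductions = {}
    for cue, by_salience in values.items():
        reductions[cue] = {
            observer: 1.0 - by_salience["low"][observer] / high
            for observer, high in by_salience["high"].items()
        }
    return reductions


reductions = salience_reduction(SET_VALUES)
```

Computed this way, the reductions differ considerably across observers, which gives a sense of the spread behind the quoted range.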
The target threshold vs. mask density function for symmetry detection was flat at low mask density and increased with a slope of 0.75–0.8 beyond a critical density. Axis cueing reduced the target threshold 2–4-fold at all masker densities. On the other hand, axis salience, that is, whether or not the paraxial dots were visible in the windows, had an effect only at high masker densities. These results are inconsistent with a signal-to-noise account of symmetry detection but can be explained by a multiple-channel model in which the response in each channel is limited by the nonlinear transform of early symmetry detectors combined with the sum of separate sources of external and intrinsic noise.
The combined design of the present study revealed that the near-axis region, which is often considered to be the sole determinant of symmetry detection, plays little role under noise-limited conditions, since masking it from view has only a small effect on detectability. Overall, the results are inconsistent with all published models of symmetry processing of which we are aware. The data require a more elaborated model of the form that we propose, consisting of a band of local-feature-selective symmetry processors configured as long-range pairs of local multiplicative contrast detectors that register similar contrasts at pairs of locations equidistant from a symmetry axis. The primary symmetry signal from all such pairs of detectors is linearly summed relative to a prescribed symmetry axis, subject to an inhibitory gain control based on the external noise level which is then sent to a decision stage that optimizes the response relative to the prior knowledge of the axis location. This model accounts for all the parametric variance in the data, including the minor individual differences among observers. We therefore regard the noise masking and axis salience properties as key variables in discriminating among symmetry models, and as providing strong evidence in favor of the current model structure for this form of mid-level processing for object recognition.