Fig 1.
Images and 40x40 patches from the shadow and occlusion databases.
(a) Top: Two representative images from our shadow database (shad) with shadow edges labeled by one observer. Bottom: Overlaid shadow boundaries obtained from four observers. (b) Top: Two representative images from oset-2, with occlusions labeled by one observer. Bottom: Overlaid occlusion boundaries from three observers. (c) Representative 40x40 image patches from each image set.
Fig 2.
Contrast statistics measured from shadow and occlusion edges (N = 2000 of each category).
Green curves/symbols indicate occlusions; magenta curves/symbols indicate shadows. (a) Distributions of Michelson contrast (left) and RMS contrast (right) for our shadow database and two sets of occlusions (oset-1: top, oset-2: bottom). Note that for both occlusion sets, the highest and lowest contrast edges are occlusions. (b) Two-dimensional scatterplot of contrast measurements for shadow and occlusion edges for both sets of occlusions. (c) Probability distribution of stimulus power πh in the high spatial frequency (>10 cycles/image) range for our shadow database (magenta curves) and both occlusion sets (green curves). (d) Rotational average of the amplitude spectrum for shadows (magenta curves) and both occlusion sets (green curves). (e) Scatterplots of πh and Michelson contrast cM for shadows (magenta symbols) and both occlusion sets (green symbols). (f) Same as (e) but for RMS contrast cRMS.
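The three measurements named in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a vertical edge through the patch center, takes Michelson contrast from the mean intensities of the two sides, takes RMS contrast as standard deviation over mean, and takes πh as the fraction of (mean-removed) power above 10 cycles/image. All function names are hypothetical.

```python
import numpy as np

def michelson_contrast(patch):
    # Assumption: the edge is vertical and splits the patch into left/right halves.
    half = patch.shape[1] // 2
    left = patch[:, :half].mean()
    right = patch[:, half:].mean()
    return abs(left - right) / (left + right)

def rms_contrast(patch):
    # Assumption: RMS contrast is the pixel standard deviation divided by the mean.
    return patch.std() / patch.mean()

def high_sf_power_fraction(patch, cutoff=10):
    # Assumption: pi_h is the share of total power above `cutoff` cycles/image,
    # computed after removing the mean (DC) component.
    centered = patch - patch.mean()
    power = np.abs(np.fft.fft2(centered)) ** 2
    # fftfreq returns cycles/pixel; multiplying by the size gives cycles/image.
    fy = np.fft.fftfreq(patch.shape[0]) * patch.shape[0]
    fx = np.fft.fftfreq(patch.shape[1]) * patch.shape[1]
    radius = np.hypot(fy[:, None], fx[None, :])
    total = power.sum()
    return power[radius > cutoff].sum() / total if total > 0 else 0.0

# A 40x40 step edge: left half at 0.25, right half at 0.75.
step = np.full((40, 40), 0.25)
step[:, 20:] = 0.75
c_m = michelson_contrast(step)
c_rms = rms_contrast(step)
pi_h = high_sf_power_fraction(step)
```

For a noiseless step edge the two contrast measures coincide; adding texture to either side raises RMS contrast and high-frequency power while leaving the side means, and hence this Michelson measure, nearly unchanged.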
Fig 3.
Example occlusion and shadow edges at varying percentiles of our contrast and spatial frequency measurements.
Each image patch is within 8 ranks of the indicated percentile (out of N = 2000 total image patches analyzed). (a) Occlusion edges at varying percentiles of each contrast measure (RMS: left; Michelson: center) and our spatial frequency measure (πh: right) for both sets of occlusions (oset-1: top, oset-2: bottom). (b) Same as (a), but for shadow edges from our database.
Fig 4.
Schematic illustration of Gabor Filter Bank (GFB) and Filter-Rectify-Filter (FRF) classifier models.
(a) GFB model. An image patch is analyzed by a bank of 72 oriented log-Gabor filters covering 3 spatial scales (8, 16, 32 pixels). The rectified filter outputs are downsampled and MAX pooled to obtain a set of regressors x1,…,xn. Binomial logistic regression is performed to find a set of weights that optimally predict the probability that the patch is an occlusion. (b) FRF model. This model is identical to the GFB model, except that the regressors x1,…,xn are then passed into a three-layer neural network with K = 4, 6, or 10 hidden units (gray circles) to determine the probability that an image patch is an occlusion edge.
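The two read-out stages described in this caption can be sketched as below. This is a rough sketch under stated assumptions, not the authors' implementation: the log-Gabor filtering itself is omitted (the functions take precomputed filter responses), rectification is assumed to be full-wave (absolute value), the pooling block size of 8 is arbitrary, and the hidden-unit nonlinearity (ReLU) is a guess. All names are hypothetical.

```python
import numpy as np

def max_pool(responses, block):
    # responses: (n_filters, height, width); pool over non-overlapping blocks.
    n_f, h, w = responses.shape
    h2, w2 = h // block, w // block
    trimmed = responses[:, :h2 * block, :w2 * block]
    blocks = trimmed.reshape(n_f, h2, block, w2, block)
    return blocks.max(axis=(2, 4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regressors(responses, block=8):
    # Assumption: full-wave rectification followed by MAX pooling.
    return max_pool(np.abs(responses), block).ravel()

def gfb_probability(responses, weights, bias, block=8):
    # Logistic regression directly on the pooled regressors x1..xn.
    return sigmoid(regressors(responses, block) @ weights + bias)

def frf_probability(responses, w_hidden, b_hidden, w_out, b_out, block=8):
    # Same regressors, passed through K hidden units before the logistic output.
    x = regressors(responses, block)
    hidden = np.maximum(0.0, w_hidden @ x + b_hidden)  # assumed ReLU
    return sigmoid(w_out @ hidden + b_out)

# Shapes for a 40x40 patch and 72 filters; the weights here are placeholders.
responses = np.zeros((72, 40, 40))
n = regressors(responses).size
p_gfb = gfb_probability(responses, np.zeros(n), 0.0)
p_frf = frf_probability(responses, np.zeros((4, n)), np.zeros(4), np.zeros(4), 0.0)
```

With all-zero placeholder weights both models return a probability of 0.5; in practice the weights would be fit to labeled shadow and occlusion patches.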
Fig 5.
Stimuli that maximize and minimize the occlusion probability predicted by the GFB and FRF models trained to distinguish shadows from occlusions (oset-1, oset-2).
(a) Left: Stimuli obtained via numerical optimization with highest predicted occlusion probability by the GFB model. We see consistent results for both training sets, over multiple optimizations. Right: Same as left column, but with highest predicted shadow probability by the GFB model (equivalently, lowest predicted occlusion probability). (b) Same as (a) but for FRF-4. (c) Same as (a) but for FRF-6. (d) Same as (a) but for FRF-10.
Fig 6.
Occlusion and shadow edges with the 25 highest predicted probabilities of occlusion and shadow category membership by the GFB and FRF models.
Actual category is indicated by the row, predicted category by the column. The upper left and lower right quadrants indicate correct classifications. (a) GFB model, (b) FRF model.
Fig 7.
Optimal stimuli for the hidden units in the FRF models with 4 and 6 hidden units trained with both sets of occlusions (oset-1, oset-2).
(a) FRF-4 hidden unit receptive fields. Sign of output weight is indicated with a (+) or (-). (b) Same as (a) but for FRF-6.
Fig 8.
Effects of edge blur on model predictions of edge probability.
(a) Test image patches with varying levels of blur. (b) Gabor Filter Bank (GFB) model (see Fig 4A) responses to patches with varying levels of blur. Different curves indicate different fixed levels of Michelson contrast (blue = 0.2, orange = 0.4, yellow = 0.6). (c) Same as (b) but for Filter-Rectify-Filter (FRF) models having varying numbers of hidden units (Fig 4B).
Fig 9.
Effects of removing texture information from image patches on responses of the GFB and FRF models.
(a) Left: Original occlusions. Middle: Averaging the pixel intensities on each side of the occlusion boundary removes texture information while leaving Michelson contrast unchanged (no tex). Right: Blurring the sharp edge created by texture removal (blur). (b) Log-odds ratio of being an occlusion for the original images (horizontal axis) versus the texture-removed images (vertical axis). Images for which the model predicts a high occlusion probability show a drastic decrease in occlusion probability when texture is removed. Top: oset-1, Bottom: oset-2. (c) Same as (b) but for blurred texture-removed patches.
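The texture-removal manipulation in panel (a) can be illustrated with a short sketch. It assumes the boundary is supplied as a boolean mask marking one side of the edge (the mask, the function name, and the random test patch are all hypothetical); each side is replaced by its own mean intensity, so any contrast measure based on the two side means is unchanged.

```python
import numpy as np

def remove_texture(patch, side_mask):
    # Replace every pixel with the mean intensity of its side of the boundary.
    out = np.empty_like(patch, dtype=float)
    out[side_mask] = patch[side_mask].mean()
    out[~side_mask] = patch[~side_mask].mean()
    return out

# Hypothetical textured patch with a vertical boundary at the center column.
rng = np.random.default_rng(0)
patch = rng.uniform(0.2, 0.4, size=(40, 40))
patch[:, 20:] += 0.4
mask = np.zeros((40, 40), dtype=bool)
mask[:, :20] = True

flat = remove_texture(patch, mask)
```

After this step each side of `flat` is uniform, leaving a sharp luminance step; the "blur" condition in the caption would then smooth that step.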
Fig 10.
Performance of human observers for each experiment for two different stimulus display methods.
(a) Schematic outline of a single trial for the psychophysical experiments. For each experiment, human observers classified image patches as either shadows (shown here) or occlusions. (b) Performance of all observers (N = 18) on each Qualtrics (QT) survey, as measured by proportion correct. Blue dots indicate individual observers. Blue circles indicate the median, and black diamonds indicate maximum performance. (c) Same as (b) but for lab (LB) experiments (N = 15).
Fig 11.
Human performance on individual images.
(a) Probability of being an occlusion edge predicted by the aggregate observer (AO; blue curve). Image patches are sorted by their probability. Actual category is indicated by a (+). (b) Occlusion probability predicted by the AO for each occlusion edge (green curve) and each shadow edge (magenta curve), sorted by occlusion probability. (c) Patches of each actual category that were most likely to be classified by human observers as the predicted category for QT-2.
Table 1.
AO confusion matrix for all QT experiments.
Table 2.
AO confusion matrix for all LB experiments.
Fig 12.
Log-odds ratios of human observers (AO, horizontal axes) and the GFB and FRF models (vertical axes) for each image for all three surveys.
Magenta symbols indicate image patches that are shadows; green symbols indicate occlusions. On the margins of each scatterplot we show the distributions of the log-odds ratios for each image category.
Table 3.
Spearman correlation between the log-odds ratio obtained from the aggregate observer (AO) and the machine classifiers over the N = 200 images in each survey.
All values are significant with p < 0.001.
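The comparison summarized in Table 3 can be sketched as follows. This is an illustrative sketch only: it assumes the AO probability for an image is the proportion of observers who responded "occlusion", clips probabilities away from 0 and 1 before taking log-odds, and computes Spearman correlation as the Pearson correlation of average ranks. The function names and the example numbers are made up for illustration and are not data from the study.

```python
import numpy as np

def log_odds(p, eps=1e-6):
    # Clip so that probabilities of exactly 0 or 1 give finite log-odds.
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

def average_ranks(values):
    # Rank from 1..n, assigning tied values the mean of their ranks.
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman correlation = Pearson correlation of the rank vectors.
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

# Made-up example: AO proportions and model probabilities for five images.
ao_prob = [0.1, 0.3, 0.5, 0.8, 0.95]
model_prob = [0.2, 0.25, 0.6, 0.7, 0.9]
rho = spearman(log_odds(ao_prob), log_odds(model_prob))
```

Because the log-odds transform is monotonic, the Spearman correlation is the same whether it is computed on probabilities or on log-odds, apart from ties introduced by clipping.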