Fig 1.
Rationale and experimental design.
(A) Rosch et al.’s stimulus images depicting the global shape of exemplars from four visually similar basic-level categories within four superordinate categories (rows; adapted from Fig 1, [4]). (B) Schematics illustrating how blurring (parameterized by sigma, σ, in units of pixels) was manipulated across three conditions in Experiment 1. In the non-blurred condition (top), sigma was fixed at a small value that results in no blurring throughout training, while in the two blur conditions, sigma started at 5 pixels and decreased, at different rates, over the first 50 epochs of training, corresponding to an increase in spatial acuity over the course of learning. See Methods for details. (C-D) Example ecoset images from the basic-level categories “dog”, “fish”, “car”, and “truck”, blurred at three example values of sigma. Note that larger sigma values give rise to blurrier images, while σ = 0 results in an intact, non-blurred image. Examples of the same images are shown for the (C) grayscale and (D) color conditions.
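For concreteness, the blur schedules described above can be sketched roughly as follows, assuming a per-epoch Gaussian blur applied with torchvision; the minimum sigma value, the kernel-size rule, and the exact logarithmic decay function are illustrative assumptions rather than the schedules actually used in the experiments.

import numpy as np
from torchvision.transforms import GaussianBlur

START_SIGMA = 5.0   # initial blur in pixels (Fig 1B)
BLUR_EPOCHS = 50    # epochs over which blur is reduced
MIN_SIGMA = 1e-3    # effectively no blurring (placeholder value; an assumption)

def sigma_at_epoch(epoch, schedule="linear"):
    # Return the Gaussian blur sigma (pixels) for the given training epoch.
    # "linear" decreases sigma linearly to ~0 over BLUR_EPOCHS; "nonlinear"
    # uses a logarithmic-style decay (the paper's exact function is not
    # reproduced here). Any other schedule returns the fixed small sigma
    # used in the non-blurred condition.
    if epoch >= BLUR_EPOCHS:
        return MIN_SIGMA
    frac = epoch / BLUR_EPOCHS
    if schedule == "linear":
        return max(START_SIGMA * (1.0 - frac), MIN_SIGMA)
    if schedule == "nonlinear":
        return max(START_SIGMA * (1.0 - np.log1p(frac * (np.e - 1.0))), MIN_SIGMA)
    return MIN_SIGMA

def blur_transform(epoch, schedule="linear"):
    # Build a torchvision transform applying the current epoch's blur level.
    sigma = sigma_at_epoch(epoch, schedule)
    kernel = int(2 * round(3 * sigma) + 1)   # odd kernel covering about 3 sigma
    return GaussianBlur(kernel_size=max(kernel, 1), sigma=max(sigma, 1e-6))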
Fig 2.
Experiment 1 model performance.
Performance is shown over training for models in six conditions defined by the factors of blur and color. All models used the ResNet-50 architecture and were trained to perform basic-level object categorization using the ecoset image dataset. Each plotted point is an average across 10 runs of an otherwise identical model with different random seeds. Shaded error bars reflect mean ± SEM across these 10 runs. Dots along the top of each plot indicate time points at which a linear mixed-effects model over a sliding temporal window revealed a significant effect of the specified pairwise condition comparison, in either direction (FDR corrected, α = 0.05). (A) Validation set accuracy. Gray corresponds to models trained with non-blurred images, blue to models trained with images whose blur decreases linearly over the first 50 epochs (linear-blur condition), and green to models trained with images whose blur decreases according to a logarithmic function over the first 50 epochs (nonlinear-blur condition). Left and right plots show averages for models trained using either grayscale images or color images, respectively. Accuracy was temporally smoothed to reduce noise. (B) Estimated learning rate, computed as the slope of accuracy over time. Colors correspond to models as in (A).
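As a rough sketch, the learning-rate estimate and the FDR correction of sliding-window comparisons could be computed along the following lines; the window length and the use of simple least-squares slopes in place of the full linear mixed-effects model (which accounts for the 10 random seeds) are illustrative assumptions.

import numpy as np
from statsmodels.stats.multitest import multipletests

def learning_rate_curve(acc, window=5):
    # Estimate the "learning rate" as the local slope of (smoothed) validation
    # accuracy over epochs, one slope per sliding window.
    slopes = []
    for start in range(len(acc) - window + 1):
        epochs = np.arange(start, start + window)
        slope, _ = np.polyfit(epochs, acc[start:start + window], deg=1)
        slopes.append(slope)
    return np.array(slopes)

def fdr_significant(pvals, alpha=0.05):
    # Benjamini-Hochberg FDR correction across the per-window p-values from a
    # pairwise condition comparison; returns a boolean mask of significant windows.
    reject, _, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return reject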
Fig 3.
Experiment 2 model performance.
Performance is shown for ecoset-trained models from Experiment 1 that were fine-tuned with ImageNet images for 1000-way object categorization. ImageNet contains labels at different category levels, with some images labeled at the basic level and others at the subordinate level. Fine-tuning was performed only with non-blurred ImageNet images. Each plotted point is an average across 10 runs of an otherwise identical model with different random seeds. Shaded error bars reflect mean ± SEM across these 10 runs. Dots along the top of each plot indicate time points at which a linear mixed-effects model over a sliding temporal window revealed a significant effect of the specified pairwise condition comparison, in either direction (FDR corrected, α = 0.05). (A) ImageNet validation set accuracy. Light gray corresponds to models trained with ecoset in the non-blurred condition, blue to models trained with ecoset in the linear-blur condition, green to models trained with ecoset in the nonlinear-blur condition, and dark gray to new models with no pre-training (that is, trained from scratch). Left and right plots show averages for models trained using either grayscale images or color images, respectively. Accuracy was temporally smoothed to reduce noise. (B) Estimated learning rate, computed as the slope of accuracy over time. Colors correspond to models as in (A). Only pairwise comparisons between different pre-trained models are shown; all pre-trained models performed significantly better than the no-pre-training models at all time points.
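A minimal sketch of the fine-tuning setup is given below, assuming a PyTorch/torchvision ResNet-50 whose ecoset classification head is replaced by a new 1000-way ImageNet head before all layers are trained further; the checkpoint path, the 565-category ecoset head, and the optimizer settings are illustrative assumptions rather than the exact training configuration.

import torch
import torch.nn as nn
from torchvision.models import resnet50

def build_finetune_model(ecoset_checkpoint_path, num_ecoset_classes=565):
    # Load an ecoset-pre-trained ResNet-50 (checkpoint path and class count are
    # assumptions) and swap its head for 1000-way ImageNet classification.
    model = resnet50(weights=None)
    model.fc = nn.Linear(model.fc.in_features, num_ecoset_classes)   # ecoset head
    state = torch.load(ecoset_checkpoint_path, map_location="cpu")
    model.load_state_dict(state)
    model.fc = nn.Linear(model.fc.in_features, 1000)                 # new ImageNet head
    return model

# Hypothetical usage: fine-tune all layers (only the head is re-initialized).
model = build_finetune_model("ecoset_linear_blur_gray.pt")
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()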
Fig 4.
Experiment 2 model performance split by category level.
Models pre-trained on grayscale ecoset images in the various blur conditions were fine-tuned for 1000-way image classification on ImageNet; the performance of these models is shown separately for the basic-level and subordinate-level category labels in ImageNet. Only results from models pre-trained with grayscale images are plotted because few differences were observed among models pre-trained with color images (see Fig 3). Validation set accuracy was computed using (A) all categories, (B) basic-level labeled categories only, and (C) subordinate-level labeled categories only. Bar heights and error bars indicate mean ± SEM across 10 runs of each model; light gray dots show accuracy for individual runs. Brackets above bars indicate the significance of pairwise comparisons between conditions, assessed using a two-tailed independent-samples t-test for each pairwise condition comparison (FDR corrected, α = 0.05), where * denotes a significant difference in either direction and n.s. denotes no significant difference.
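A rough sketch of the pairwise statistical comparison described here, using SciPy and statsmodels, is shown below; the condition names and accuracy values in the usage example are hypothetical.

from itertools import combinations
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

def pairwise_condition_tests(acc_by_condition, alpha=0.05):
    # Two-tailed independent-samples t-tests for every pair of conditions,
    # with Benjamini-Hochberg FDR correction across the pairwise comparisons.
    # acc_by_condition maps condition name -> array of per-run accuracies.
    pairs = list(combinations(acc_by_condition.keys(), 2))
    stats, pvals = [], []
    for a, b in pairs:
        t, p = ttest_ind(acc_by_condition[a], acc_by_condition[b])
        stats.append(t)
        pvals.append(p)
    reject, _, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return {pair: (t, sig) for pair, t, sig in zip(pairs, stats, reject)}

# Hypothetical accuracy values (10 runs per condition):
rng = np.random.default_rng(0)
results = pairwise_condition_tests({
    "non-blurred": 0.60 + 0.02 * rng.standard_normal(10),
    "linear-blur": 0.62 + 0.02 * rng.standard_normal(10),
    "nonlinear-blur": 0.61 + 0.02 * rng.standard_normal(10),
})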