Brain-like illusion produced by Skye’s Oblique Grating in deep neural networks

doi:10.1371/journal.pone.0299083

Fig 1.

Skye’s Oblique Grating Illusion.

Parallel horizontal bars appear tilted when black and white diamonds are added in alternating order (A). The diamonds can be positioned in two ways, and each produces the opposite tilt effect (B). The tilt strength is influenced by the width of the diamonds.

Fig 2.

Participant experiments in this study.

Participants reported the visual illusions they perceived while observing stimuli on a screen (A). The stimuli used 12 colors taken from an RGB color wheel (B). With six diamond widths (5–10 pixels) and two alternating diamond arrangements, this gave a total of 144 visual illusion stimuli (12 × 6 × 2).
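
The stimulus count follows from the three factors named in the caption: 12 colors, 6 diamond widths, and 2 diamond arrangements. The sketch below simply enumerates that design. The color labels and arrangement names are placeholders chosen for illustration, since the caption gives only the counts.

```python
from itertools import product

# Placeholder labels: the caption states the counts, not the names.
colors = [f"color_{i:02d}" for i in range(1, 13)]  # 12 colors on the RGB color wheel
widths_px = range(5, 11)                            # diamond widths 5..10 pixels
arrangements = ("order_a", "order_b")               # hypothetical names for the two orders

# Every (color, width, arrangement) triple is one stimulus condition.
stimuli = list(product(colors, widths_px, arrangements))
print(len(stimuli))  # 144
```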

Fig 3.

Mapping of DNNs and visual pathways in the brain.

The ventral visual pathway of the brain is generally involved in object perception and recognition. Its four regions, from V1 to IT, form perceptions (B). DNNs, used as learning models of the ventral visual pathway, have multiple modules corresponding to these four regions (A). Based on the Brain Scores of the DNNs, eight models with high overall scores were selected to simulate the visual pathway. The models are ordered by score, from highest to lowest.

Fig 4.

Visual illusion strength on human perceptual adjustment and illusion.

The visual illusion is divided into eight ranges with a threshold of 0.1 degrees, representing eight levels of illusion intensity. The visual illusion images on the right correspond to the human perceptual adjustment images on the left. Visual illusion images with illusion strength 1 to 4 are categorized as “No-illusion” (C1), while those with strength 5 to 8 are categorized as “With-illusion” (C2). The 144 visual illusion images are used as the test set. The human perceptual adjustment images on the left, which are tilted at an angle, serve as the training set for the network and comprise 24,000 images.
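
The binning described above can be sketched as follows. This is one possible reading, not the study's code: it assumes the eight ranges are consecutive 0.1-degree intervals of the absolute perceived tilt starting at 0, with everything above the last edge falling into level 8. The caption does not state the bin edges, and the function names are hypothetical.

```python
import math

BIN_WIDTH_DEG = 0.1  # threshold from the caption
N_LEVELS = 8         # number of intensity levels from the caption

def illusion_level(angle_deg: float) -> int:
    """Map a perceived tilt to a level 1..8 (assumed edges: 0, 0.1, ..., 0.7)."""
    # The small epsilon guards against floating-point error at bin edges.
    idx = math.floor(abs(angle_deg) / BIN_WIDTH_DEG + 1e-9)
    return min(idx, N_LEVELS - 1) + 1

def illusion_class(level: int) -> str:
    """Levels 1-4 are "No-illusion" (C1); levels 5-8 are "With-illusion" (C2)."""
    return "C1" if level <= 4 else "C2"

print(illusion_level(0.45), illusion_class(illusion_level(0.45)))  # 5 C2
```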

Fig 5.

The distribution of participants’ perceived angles for 12 colors with different diamond widths.

The horizontal axis shows the twelve colors of the RGB color wheel. For each color, the absolute average perceived angles are shown in sequence for diamond widths from 5 to 10 pixels (in steps of 1 pixel). The upper and lower limits in the graph correspond to the maximum and minimum angle values after adjustment, respectively.

Fig 6.

The distribution of 12 colors on C1/C2.

Each of the twelve colors has twelve stimulus combinations. The figure shows how many of these fall into the “No-illusion” (C1) category, with intensities 1 to 4, and how many fall into the “With-illusion” (C2) category, with intensities 5 to 8.

Fig 7.

Permutation test with 1,000 repetitions on 8 models.

Distribution of test accuracy on the real visual illusion images for the eight models after training with shuffled labels 1,000 times. The horizontal axis represents the test accuracy, while the vertical axis indicates the probability distribution. The orange dashed line denotes the test accuracy on visual illusions after training with the correct labels. Performance is evaluated against a criterion of p = 0.05. The probability density curve shows the estimated distribution of test performance.
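
A permutation p-value of this kind can be computed as sketched below. The function name is hypothetical, the null accuracies are synthetic stand-ins rather than the study's results, and the add-one correction is a common convention that the caption does not confirm was used.

```python
import random

def permutation_p_value(observed_acc, null_accs):
    """One-sided p-value: share of shuffled-label runs that match or beat the observed accuracy."""
    n_extreme = sum(acc >= observed_acc for acc in null_accs)
    # The +1 terms count the observed run itself, so p is never exactly 0.
    return (n_extreme + 1) / (len(null_accs) + 1)

# Synthetic stand-in for 1,000 shuffled-label test accuracies (not the paper's data).
rng = random.Random(0)
null_accs = [rng.gauss(0.5, 0.05) for _ in range(1000)]

p = permutation_p_value(0.9, null_accs)
print(p < 0.05)  # True: an accuracy of 0.9 lies far outside this synthetic null
```

A model whose correctly trained accuracy (the orange dashed line) yields p below 0.05 would be read as performing above the shuffled-label baseline.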

Fig 8.

Comparative evaluation of 8 models in illusion images testing.

The eight models are compared on illusion testing using three performance metrics: Accuracy (deep sky blue bars), Recall (lime green bars), and F1 Score (brown bars), which together evaluate the models’ predictive capabilities. The chart also overlays a line graph (in orange) representing each model’s complexity, measured by the number of parameters (in millions). This dual representation shows how model complexity relates to performance on each metric.
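
For reference, the three metrics can be computed from predicted and true C1/C2 labels as in the sketch below. Treating “With-illusion” (C2) as the positive class is an assumption made here for illustration, and the labels in the example are made up.

```python
def binary_metrics(y_true, y_pred, positive="C2"):
    """Return (accuracy, recall, f1) for a two-class problem; `positive` is an assumed choice."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, recall, f1

# Made-up labels, for illustration only.
y_true = ["C1", "C1", "C2", "C2", "C2"]
y_pred = ["C1", "C2", "C2", "C2", "C1"]
accuracy, recall, f1 = binary_metrics(y_true, y_pred)
print(round(accuracy, 3), round(recall, 3), round(f1, 3))  # 0.6 0.667 0.667
```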

Fig 9.

Comparative analysis of eight models: Performance across varied color and illusion intensities on Skye’s Oblique Grating.

Color Categories (Left Axis): The colored bars plotted against the left Y-axis represent the color categories. Each bar shows the recognition accuracy for the corresponding color in each model. Illusion Intensity Categories (Right Axis): The colored bars plotted against the right Y-axis represent the levels of illusion intensity. Each bar shows the recognition accuracy for the corresponding intensity level in each model.

Fig 10.

Feature heatmaps of C1/C2 illusion images for 8 models.

Feature preferences of eight models under No-Illusion (C1) and With-Illusion (C2) conditions. The stimulus features for C1 are displayed above each model, while those for C2 are shown below.

Fig 11.

The feature heatmap of ResNet101 on Skye’s Oblique Grating Illusion.

We back-propagated gradients to compute weights that visualize the feature bias of the DNN on two stimulus images, one from C1 and one from C2. ResNet101 yielded a total of four corresponding features for the no-illusion (C1) and with-illusion (C2) conditions.

Fig 12.

Comparative analysis of the variation in average L2 distance across network depths for different models.

Variations in total representational similarity (average L2 distance on human perceptual adjustment and illusion) across different network depths in DNNs. The legend represents different classification conditions. Red circles indicate correctly classified data points under condition C1, while spring green squares correspond to correct classifications under condition C2. Blue triangles mark incorrect classifications under C1, and purple crosses represent incorrect classifications under C2. The horizontal axis denotes the depth of the network.
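
An average L2 distance between two sets of layer activations can be sketched as below. The pairing of each illusion image with one perceptual adjustment image is an assumption, since the caption does not say how the two sets are matched, and the feature vectors are toy values.

```python
import math

def average_l2_distance(feats_a, feats_b):
    """Mean Euclidean (L2) distance between paired feature vectors."""
    if len(feats_a) != len(feats_b):
        raise ValueError("both sets need the same number of feature vectors")
    dists = [math.dist(a, b) for a, b in zip(feats_a, feats_b)]
    return sum(dists) / len(dists)

# Toy activations at one network depth (illustrative values only).
illusion_feats = [[0.0, 0.0], [1.0, 1.0]]
adjustment_feats = [[3.0, 4.0], [1.0, 1.0]]
print(average_l2_distance(illusion_feats, adjustment_feats))  # 2.5
```

Repeating the computation on activations taken at each depth gives one value per depth, which is the kind of curve the figure plots.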

Fig 13.

Representational dissimilarity matrix (RDM) at different model depths of ResNet101.

We tested the RDM on ResNet101, the model with the best illusion response. Each RDM corresponds to one network depth. The horizontal coordinate represents the stimulus images by illusion strengths 1–8, and the vertical coordinate represents the perceived angle images by illusion strengths 1–8. Network depth increases from left to right. The RDM group labeled Self-data Trained corresponds to training with tilted images, while the RDM group labeled Pretrained corresponds to loading the pretrained weights without further training.
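
The layout of such an RDM can be sketched as follows, with one row per perceived-angle level and one column per stimulus level. Euclidean distance is used here as an assumed dissimilarity measure, because the caption does not name the one used, and the features are toy values with two levels instead of eight.

```python
import math

def cross_rdm(stim_feats, percept_feats):
    """rdm[i][j] = distance between perceived-angle level i (row) and stimulus level j (column)."""
    return [[math.dist(p, s) for s in stim_feats] for p in percept_feats]

# Toy mean features per illusion-strength level (illustrative values only).
stim_feats = [[0.0, 0.0], [3.0, 4.0]]
percept_feats = [[0.0, 0.0], [6.0, 8.0]]

rdm = cross_rdm(stim_feats, percept_feats)
print(rdm)  # [[0.0, 5.0], [10.0, 5.0]]
```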

Fig 14.

Representational dissimilarity matrix (RDM) based on ventral pathway partitioning.

(A) A comparison of the RDMs for a single network architecture and the complete network architecture. (B) The potential vision of the visual pathway. The red dashed region in panel A represents the F1 module network, built from the shallow module of ResNet101 and trained on the same dataset. The blue dashed region represents the entire ResNet101 network. They correspond to V1 and the ventral pathway (V1 to IT) in panel B, respectively.
