Fig 1.
Overview of deep image reconstruction. The pixel values of the input image are optimized so that the DNN features of the image become similar to those decoded from fMRI activity. A deep generator network (DGN) can optionally be combined with the DNN to produce natural-looking images, in which case the optimization is performed in the input space of the DGN.
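The optimization described in this caption can be illustrated with a minimal sketch. It assumes a toy linear map `W` in place of the DNN and plain gradient descent as the optimizer; both are stand-ins chosen for brevity, not the networks or optimizer used in the study, and every name below is hypothetical.

```python
import random

def features(W, x):
    """Toy linear 'feature extractor' standing in for a DNN layer (assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss(W, x, target):
    """Squared distance between the image's features and the target features."""
    return sum((f - t) ** 2 for f, t in zip(features(W, x), target))

def reconstruct(W, target, n_pixels, steps=2000, lr=0.02):
    """Gradient descent on the pixel values x to reduce the feature loss."""
    x = [0.0] * n_pixels  # start from a blank image
    for _ in range(steps):
        residual = [f - t for f, t in zip(features(W, x), target)]
        # Gradient of the squared loss with respect to x is 2 * W^T * residual.
        grad = [2 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(n_pixels)]
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

rng = random.Random(0)
n_pixels, n_features = 4, 6
W = [[rng.uniform(-1, 1) for _ in range(n_pixels)] for _ in range(n_features)]
true_image = [rng.uniform(0, 1) for _ in range(n_pixels)]
decoded = features(W, true_image)  # pretend these features were decoded from fMRI
recon = reconstruct(W, decoded, n_pixels)

initial_loss = loss(W, [0.0] * n_pixels, decoded)
final_loss = loss(W, recon, decoded)
print(initial_loss, final_loss)
```

With a DGN, the same loop would update the generator's input vector instead of `x`, and the image would be the generator's output.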
Fig 2.
Seen natural image reconstructions.
The black and gray surrounding frames indicate presented and reconstructed images, respectively (reconstructed from VC activity using DNN1–8). The reconstructions of seen natural images were obtained through the optimization process and were constrained by the DGN.
Fig 3.
Effect of the deep generator network (DGN).
(A) Reconstructions with and without the DGN. The first, second, and third rows show presented images, reconstructions with the DGN, and reconstructions without the DGN, respectively (reconstructed from VC activity, DNN1–8). (B) Reconstruction quality of seen natural images (three subjects pooled, N = 150; chance level, 50%).
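A 50% chance level is what a two-alternative comparison yields. The sketch below assumes a pairwise identification scheme in which each reconstruction is compared with its true image and one distractor, using Pearson correlation of pixel values as the similarity measure. The function names and the tiny four-pixel "images" are illustrative assumptions, not the study's data or code.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length pixel vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def pairwise_identification(recons, originals):
    """Fraction of (true, distractor) pairs in which the reconstruction
    correlates more strongly with its true image than with the distractor."""
    wins = trials = 0
    for i, recon in enumerate(recons):
        r_true = pearson(recon, originals[i])
        for j, other in enumerate(originals):
            if j == i:
                continue
            trials += 1
            if r_true > pearson(recon, other):
                wins += 1
    return wins / trials

# Made-up four-pixel images and slightly noisy reconstructions of them.
originals = [[0, 1, 2, 3], [3, 2, 1, 0], [0, 3, 0, 3]]
recons = [[0.1, 0.9, 2.2, 2.8], [2.9, 2.1, 1.2, 0.1], [0.2, 2.7, 0.1, 3.1]]
accuracy = pairwise_identification(recons, originals)
print(accuracy)
```

A reconstruction unrelated to its target would win about half of such comparisons, which is where the 50% chance level comes from.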
Fig 4.
Effect of multi-level visual features.
(A) Reconstructions using different combinations of DNN layers (without the DGN). The black and gray surrounding frames indicate presented and reconstructed images, respectively (reconstructed from VC activity). (B) Objective and subjective assessments of reconstructions from different combinations of DNN layers (error bars, 95% confidence interval [C.I.] across samples, N = 50; see Materials and Methods: “Evaluation of reconstruction quality” for the procedure used to calculate the winning percentage).
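As an illustration of the error bars, a 95% confidence interval across samples can be approximated from the sample mean and standard error. The sketch below assumes a normal approximation (the factor 1.96); the caption does not state how the interval was computed, and the sample values are made up.

```python
from statistics import mean, stdev

def ci95(samples):
    """Normal-approximation 95% confidence interval for the mean (assumption)."""
    m = mean(samples)
    half_width = 1.96 * stdev(samples) / len(samples) ** 0.5
    return m - half_width, m + half_width

# Hypothetical per-sample winning percentages, expressed as fractions.
scores = [0.6, 0.7, 0.8, 0.9, 1.0]
low, high = ci95(scores)
print(low, high)
```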
Fig 5.
DNN feature decoding accuracy of raw and absolute features.
The analysis was performed with features from the conv1_1 layer of the VGG19 model using the test natural image dataset (error bar, 95% C.I. across subjects). (A) Mean feature decoding accuracy of all units. (B) Mean feature decoding accuracy for individual filters. The decoding accuracies of units belonging to the same filter were averaged, and the filters were sorted in ascending order of their mean raw feature decoding accuracy.
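The per-filter summary in panel B amounts to averaging unit accuracies within each filter and then ordering the filters by their mean raw-feature accuracy. A minimal sketch follows; the filter names and accuracy values are hypothetical placeholders, not results from the study.

```python
from statistics import mean

# Hypothetical per-unit decoding accuracies, grouped by filter.
raw = {"f0": [0.30, 0.34], "f1": [0.05, 0.07], "f2": [0.18, 0.22]}
absolute = {"f0": [0.32, 0.36], "f1": [0.25, 0.27], "f2": [0.28, 0.30]}

def filter_means(unit_accuracies):
    """Average the accuracies of all units that belong to the same filter."""
    return {name: mean(values) for name, values in unit_accuracies.items()}

raw_mean = filter_means(raw)
abs_mean = filter_means(absolute)

# Sort filters in ascending order of mean raw-feature accuracy,
# and show the absolute-feature accuracy in the same order.
order = sorted(raw_mean, key=raw_mean.get)
for name in order:
    print(name, round(raw_mean[name], 3), round(abs_mean[name], 3))
```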
Fig 6.
Seen artificial image reconstructions.
The black and gray surrounding frames indicate presented and reconstructed images, respectively (VC activity, DNN1–8, without the DGN). (A) Reconstructions for seen artificial shapes. (B) Reconstructions for seen alphabetical letters. The reconstructed letters were arranged to spell the word “NEURON”. (C) Reconstruction quality of artificial shapes and alphabetical letters (three subjects pooled, N = 120 and 30 for artificial shapes and alphabetical letters, respectively; chance level, 50%).
Fig 7.
Reconstructions of shape and color from multiple visual areas.
(A) Reconstructions of artificial shapes from multiple visual areas (DNN1–8, without the DGN). The black and gray surrounding frames indicate presented and reconstructed images, respectively. (B) Reconstruction quality of shape and color for different visual areas (three subjects pooled, N = 120; chance level, 50%).
Fig 8.
The black and gray surrounding frames indicate presented and reconstructed images, respectively (VC activity, DNN1–8, without the DGN). (A) Reconstructions of imagined artificial shapes with high human judgment accuracy, obtained through the optimization process. (B) Reconstructions of imagined artificial shapes with low human judgment accuracy. (C) Reconstructions of imagined natural images. (D) Reconstruction quality of imagined artificial shapes (three subjects pooled, N = 45; chance level, 50%). (E) Reconstruction quality of imagined artificial shapes evaluated separately for color and shape by human judgment (three subjects pooled, N = 45; chance level, 50%).