Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation

doi:10.1371/journal.pbio.3003293

Fig 3

Feature decoding analysis.

(A) DNN feature decoding and evaluation. Decoders are trained to predict DNN features from voxel patterns of fMRI responses. Decoding performance is assessed by evaluating the ability of the decoded features to identify perceived sounds from the test set. (B) Example of true and decoded features for a DNN feature unit. The graph displays the true and decoded values for a single DNN feature unit across 50 test stimuli. This unit (#21060) was from the Conv5 layer of the VGGish-ish model (ROI: AC). (C) Profile correlation of decoded auditory features. Each bar represents the mean profile correlation, with distinct colors indicating different subjects. Error bars denote the 95% confidence interval (CI). (D) Identification accuracy for decoded auditory features. Each bar represents the mean identification accuracy across 50 test stimuli, with error bars denoting the 95% CI. Although Pearson correlation was used as the primary evaluation metric, similar results were confirmed when applying Spearman correlation as an alternative measure for both profile correlation and identification accuracy. The data underlying this figure are provided in S1 and S2 Data.

doi: https://doi.org/10.1371/journal.pbio.3003293.g003
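
The two evaluation measures named in panels (C) and (D) can be illustrated with a minimal sketch. The caption does not spell out the exact procedure, so the following rests on assumptions: profile correlation is taken to be the Pearson correlation between true and decoded values of one feature unit across test stimuli (as plotted in panel B), and identification accuracy is taken to be a pairwise comparison in which the decoded feature pattern for a stimulus should correlate more strongly with the true pattern of that stimulus than with the true pattern of each other test stimulus. The function names, array shapes, and synthetic data are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def profile_correlation(true_feats, decoded_feats):
    """Pearson r for each feature unit, computed across stimuli.

    Both inputs have shape (n_stimuli, n_units); returns shape (n_units,).
    """
    t = true_feats - true_feats.mean(axis=0)
    d = decoded_feats - decoded_feats.mean(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (d ** 2).sum(axis=0))
    return (t * d).sum(axis=0) / denom


def pairwise_identification_accuracy(true_feats, decoded_feats):
    """Assumed pairwise identification; returns one accuracy per stimulus.

    For stimulus i, the accuracy is the fraction of the other stimuli j
    for which corr(decoded_i, true_i) > corr(decoded_i, true_j).
    """
    def unit_rows(x):
        # Center and L2-normalize each row so that a dot product
        # between two rows equals their Pearson correlation.
        x = x - x.mean(axis=1, keepdims=True)
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    n_stimuli = true_feats.shape[0]
    corr = unit_rows(decoded_feats) @ unit_rows(true_feats).T  # (n, n)
    correct = np.diag(corr)[:, None]
    # The diagonal entry is never strictly greater than itself,
    # so only the n - 1 other candidates are counted.
    wins = (correct > corr).sum(axis=1)
    return wins / (n_stimuli - 1)


# Synthetic stand-in data (assumption): 50 test stimuli, 200 feature units.
rng = np.random.default_rng(0)
true_feats = rng.standard_normal((50, 200))
decoded_feats = true_feats + rng.standard_normal((50, 200))  # noisy "decoding"

mean_profile_r = profile_correlation(true_feats, decoded_feats).mean()
mean_ident_acc = pairwise_identification_accuracy(true_feats, decoded_feats).mean()
print(f"mean profile correlation: {mean_profile_r:.3f}")
print(f"mean identification accuracy: {mean_ident_acc:.3f}")
```

Under this pairwise scheme the chance level is 0.5. A Spearman variant, which the caption mentions as an alternative measure, could be obtained by rank-transforming the features along the relevant axis before applying the same functions.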