Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem

doi:10.1371/journal.pcbi.1005617


Fig 2

Mapping from GMM–HMM triphone log likelihoods to phone model RDMs.

(a) Each 10 ms frame of audio is transformed into a vector of MFCCs. From these, a GMM estimates triphone log likelihoods, which are used in the phonetic HMMs. (b) To model dissimilarities between input words, we used the log likelihood estimates for each triphone variation of each phone, concatenated over a 60 ms sliding window. The correlation distance between the triphone likelihood vectors of each pair of words was entered into the corresponding phonetic model RDM. (c) These phone-specific model RDMs were computed at each position of the sliding window, yielding 40 time-varying model RDMs.

doi: https://doi.org/10.1371/journal.pcbi.1005617.g002
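
The computation described in panels (b) and (c) can be illustrated with a minimal sketch for a single phone. It is not the authors' implementation. The array layout (words × frames × triphone variants), the function name `phone_rdms`, and the random input data are assumptions made for illustration, as is the choice to step the window one frame at a time. The sketch takes correlation distance to be 1 minus the Pearson correlation.

```python
import numpy as np

FRAME_MS = 10                               # frame length stated in the caption
WINDOW_MS = 60                              # sliding-window length stated in the caption
FRAMES_PER_WINDOW = WINDOW_MS // FRAME_MS   # 6 frames per window


def phone_rdms(loglik, frames_per_window=FRAMES_PER_WINDOW):
    """Compute time-varying model RDMs for one phone (hypothetical helper).

    loglik: array of shape (n_words, n_frames, n_triphones) holding the
            triphone log likelihoods of that phone for every word.
            This layout is an assumption, not taken from the source.
    Returns an array of shape (n_windows, n_words, n_words).
    """
    n_words, n_frames, _ = loglik.shape
    n_windows = n_frames - frames_per_window + 1
    rdms = np.empty((n_windows, n_words, n_words))
    for t in range(n_windows):
        # Concatenate the frames in the window into one vector per word.
        window = loglik[:, t:t + frames_per_window, :].reshape(n_words, -1)
        # Correlation distance between every pair of word vectors.
        rdms[t] = 1.0 - np.corrcoef(window)
    return rdms


# Illustrative random data: 5 words, 20 frames, 8 triphone variants.
rng = np.random.default_rng(0)
loglik = rng.normal(size=(5, 20, 8))
rdms = phone_rdms(loglik)
print(rdms.shape)
```

Under these assumptions, calling the helper once per phone would give one time-varying RDM per phone, matching the 40 phone-specific RDMs mentioned in panel (c).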