Bayesian binding and fusion models explain illusion and enhancement effects in audiovisual speech perception

doi:10.1371/journal.pone.0246986

Fig 1.

The Joint Prior model of audiovisual speech perception.

Upper row: Example plots of prior, likelihood and posterior distributions. The horizontal axes represent the auditory dimension and the vertical axes represent the visual dimension. The prior is a Gaussian ridge along the A = V diagonal, and the likelihood is a Gaussian (here depicted with greater variance in the visual dimension). The posterior distribution is also Gaussian, pulled in the direction of the A = V diagonal. Lower row: the marginal distributions of the prior, likelihood and posterior in the auditory dimension. Response boundaries (vertical lines) are applied to the posterior distribution, and response probabilities are estimated as the probability mass (yellow area) delimited by the response boundaries.
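The computation described in this caption can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. It assumes a likelihood with independent auditory and visual noise, a ridge prior on A - V with variance `var_prior` (0 = full binding, large = no binding), and hypothetical response boundaries. All function and parameter names are invented for illustration. Under these assumptions the auditory marginal of the posterior is Gaussian, with a mean that is a weighted average of the auditory and visual likelihood means.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def auditory_posterior(x_a, x_v, var_a, var_v, var_prior):
    """Auditory marginal of the posterior under a ridge prior along A = V.

    x_a, x_v     : likelihood means (assumed inputs)
    var_a, var_v : likelihood variances (assumed independent)
    var_prior    : variance of the prior on A - V (assumed parameterisation)
    Returns (mean, variance, auditory weight).
    """
    w_a = (var_v + var_prior) / (var_a + var_v + var_prior)
    mean = w_a * x_a + (1.0 - w_a) * x_v
    var = w_a * var_a
    return mean, var, w_a


def response_probabilities(mean, var, boundaries):
    """Probability mass of the auditory marginal between response boundaries."""
    sd = math.sqrt(var)
    edges = [-math.inf] + sorted(boundaries) + [math.inf]
    cdf = [normal_cdf(e, mean, sd) for e in edges]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


if __name__ == "__main__":
    # Hypothetical incongruent stimulus: auditory cue at 0, visual cue at 1.
    mean, var, w_a = auditory_posterior(0.0, 1.0, var_a=1.0, var_v=2.0, var_prior=0.5)
    print("auditory weight:", w_a)
    print("response probabilities:", response_probabilities(mean, var, [0.25, 0.75]))
```

With `var_prior = 0` the weight reduces to the familiar reliability-weighted average of the two cues, and as `var_prior` grows the auditory marginal approaches the auditory likelihood alone.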


Fig 2.

Prior structures.

Illustration of the prior structure of each model compared in the study. A full derivation of the Joint Prior model of audiovisual speech perception is available in the supporting information.


Fig 3.

Behavioural responses and model predictions.

Mean behavioural responses (dark bars) and model predictions (light bars) to visual-only (top row), auditory-only (left column) and audiovisual stimuli (central panels) for 16 participants. Error bars represent the standard error of the mean. Visual stimuli are divided into G (left compartment) and B (right compartment) and are presented with descending SNR (left: high SNR to right: low SNR within each compartment). Auditory stimuli are divided into B (top compartment) and G (bottom compartment) and are presented with descending SNR (top: high SNR to bottom: low SNR within each compartment). Each audiovisual stimulus is a combination of the auditory and visual stimulus on the corresponding row and column, presented either in synchrony (blue bars) or out of sync (red bars). The model predictions displayed are cross-validation predictions from the Reduced Joint Prior model.


Fig 4.

Modelling results.

A) Prior parameters for synchronous and asynchronous stimuli: binding parameter (0 = full binding, infinity = no binding) for the Full Joint Prior model, and probability of separate causes (0 = full binding, 1 = no binding) for the Full BCI model. B) Auditory and C) visual precision parameters of the Reduced Joint Prior and BCI models for clear to noisy stimuli (left to right). The images depict the first author. Error bars represent SEM. D) Improvement in test error over baseline (the Maximum Likelihood model) for the Reduced and Full Bayesian model implementations. Error bars represent SEM. E) Auditory weight in the Reduced Joint Prior model, plotted by SNR and SOA.
