Fig 1.
Experimental design, example trial, and behavioural and predicted AV weights (wAV).
(A) Experimental design. In a 4 × 4 × 2 × 2 factorial design, the experiment manipulated (1) the location of the visual (‘V’) signal (−10°, −3.3°, 3.3°, and 10°), (2) the location of the auditory (‘A’) signal (−10°, −3.3°, 3.3°, and 10°), (3) the reliability of the visual signal (high VR+ versus low VR−, as defined by the spread of the visual cloud), and (4) task relevance (auditory versus visual report). In addition, we included unisensory auditory and visual VR+ and VR− trials. The greyscale codes the spatial disparity between the auditory and visual locations for each AV condition (i.e., darker greyscale = larger spatial disparity). (B) Time course of an example trial. (C) Behavioural AV weight index wAV computed from behavioural responses (left) and from the predictions of the Bayesian causal inference model (right; across-participants’ circular mean ± 68% CI and individual wAV represented by filled/empty circles, n = 13). The AV weight index wAV is shown as a function of (1) visual reliability: high [VR+] versus low [VR−]; (2) task relevance: auditory versus visual report; and (3) AV spatial disparity: small (≤6.6°; D−) versus large (>6.6°; D+). The data used to make this figure are available in file S1 Data. AV, audiovisual; D+, high disparity; D−, low disparity; VR+, high visual reliability; VR−, low visual reliability.
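Because wAV is an angular index, the group statistic reported above is a circular rather than an arithmetic mean. A minimal sketch of that statistic, using the standard four-quadrant-arctangent construction on hypothetical angles (the values below are illustrative, not from the study):

```python
import numpy as np

def circular_mean(angles_rad):
    """Circular mean of a set of angles (radians): the four-quadrant
    arctangent of the mean sine and mean cosine of the angles."""
    return np.arctan2(np.mean(np.sin(angles_rad)),
                      np.mean(np.cos(angles_rad)))

# Hypothetical wAV angles (radians) from five participants:
w_av = np.array([0.9, 1.1, 1.0, 0.8, 1.2])
group_mean = circular_mean(w_av)  # close to 1.0 rad for these values
```

Unlike the arithmetic mean, this handles angles that straddle the ±π wrap-around point correctly, which is why it is the appropriate summary for wAV.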
Fig 2.
Temporal generalisation matrices within and across auditory and visual senses.
Each temporal generalisation matrix shows the decoding accuracy for each training (y-axis) and testing (x-axis) time point. We factorially manipulated the training data (auditory versus visual stimulation) and testing data (auditory versus visual stimulation). Decoding accuracy is quantified by the Pearson correlation between the true and the decoded locations of the auditory (or visual) stimulus. The grey line along the diagonal indicates where the training time is equal to the testing time (i.e., the time-resolved decoding accuracies). Horizontal and vertical grey lines indicate the stimulus onset. The thin grey lines encircle clusters with decoding accuracies that were significantly better than chance at p < 0.05 corrected for multiple comparisons. The thick grey lines encircle the clusters with decoding accuracies that were significantly better than chance jointly for both (1) auditory-to-visual and (2) visual-to-auditory cross-temporal generalisation at p < 0.05 corrected for multiple comparisons. The data used to make this figure are available in file S1 Data.
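The study's decoder is an SVR trained on ERP activity patterns; purely as an illustration of the temporal generalisation logic (train a decoder at each training time point, test it at every testing time point, score with the Pearson correlation between true and decoded locations), the sketch below uses simulated data and a plain least-squares decoder. All variable names and numbers are hypothetical, not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: trials x channels x time points, plus true locations.
n_trials, n_chan, n_time = 200, 16, 10
locations = rng.choice([-10.0, -3.3, 3.3, 10.0], size=n_trials)
eeg = rng.normal(size=(n_trials, n_chan, n_time))
# Embed a location-dependent pattern from time index 3 onward.
code = rng.normal(size=n_chan)
eeg[:, :, 3:] += locations[:, None, None] * code[None, :, None] * 0.1

# Temporal generalisation: fit a decoder at each training time point,
# apply it at every testing time point, and quantify decoding accuracy
# as the Pearson correlation between true and decoded locations.
gen = np.zeros((n_time, n_time))
for t_train in range(n_time):
    w, *_ = np.linalg.lstsq(eeg[:, :, t_train], locations, rcond=None)
    for t_test in range(n_time):
        decoded = eeg[:, :, t_test] @ w
        gen[t_train, t_test] = np.corrcoef(locations, decoded)[0, 1]
```

The diagonal of `gen` corresponds to the time-resolved decoding accuracies; off-diagonal cells test whether a pattern learnt at one time generalises to another.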
Fig 3.
GLM-based wAV and Bayesian modelling analysis overview.
(A) The GLM-based wAV and Bayesian modelling analyses were performed on auditory (‘A’) and visual (‘V’) spatial estimates that were indicated by participants as behavioural localisation responses (left, ‘Behaviour’) or decoded from participants’ EEG activity patterns (right, ‘Neural’). The neural spatial estimates were obtained by training an SVR model on ERP activity patterns at each time point of the AV congruent trials to learn the mapping from EEG pattern to external spatial locations (black diagonal line). This learnt mapping was then used to decode the spatial location from the ERP activity patterns of the spatially congruent and incongruent AV conditions (coloured arrows). (B) Distributions of spatial localisation responses (left, Behaviour: SResp) and decoded spatial estimates (right, Neural: SDec) were computed for each of the 64 conditions of the 4 (visual stimulus location) × 4 (auditory stimulus location) × 2 (visual reliability) × 2 (task relevance) factorial design. (C) Left: In the GLM-based wAV analysis, the perceived (or decoded at each time point) spatial estimates were predicted by the true visual and auditory spatial locations (SV1..8, SA1..8) for each of the eight conditions in the 2 (visual reliability: high versus low) × 2 (task relevance: auditory versus visual report) × 2 (spatial disparity: ≤6.6° versus >6.6°) factorial design. As a summary index, we defined the relative audiovisual weight (wAV) as the four-quadrant inverse tangent of the visual (βV1..8) and auditory (βA1..8) parameter estimates for each of the eight conditions in each regression model. Right: In the Bayesian modelling analysis, we fitted the following models to observers’ behavioural and neural spatial estimates: SegA (green, for EEG only), SegV (red, for EEG only), SegV,A (light blue), ‘forced fusion’ (‘Fusion’, yellow), and BCI model (with model averaging, dark blue).
We performed Bayesian model selection at the group level and computed the protected exceedance probability that one model is better than any of the other candidate models above and beyond chance [25]. (D) Left: Based on previous studies [14,16], we hypothesised that the wAV profile with an interaction between task relevance (i.e., visual versus auditory report) and spatial disparity that is characteristic for BCI would emerge relatively late. Right: Likewise, we expected the different models to dominate the EEG activity patterns in an approximately sequential order: first the unisensory segregation models (SegV, SegA), followed by the forced-fusion model (‘Fusion’), and finally the BCI estimate. The fading of colours indicates that we did not have specific hypotheses for those times. AV, audiovisual; BCI, Bayesian causal inference; D+, high disparity; D−, low disparity; EEG, electroencephalography; ERP, event-related potential; GLM, general linear model; SDec, Spatial estimate decoded; SegA, unisensory auditory segregation; SegV, unisensory visual segregation; SegV,A, audiovisual full-segregation; SResp, spatial estimate responded; stim, stimulus; SVR, support vector regression; VR+, high visual reliability; VR−, low visual reliability.
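The GLM-based computation of wAV described in panel (C) can be sketched in a few lines: regress the spatial estimates on the true auditory and visual locations, then take the four-quadrant inverse tangent of the visual over the auditory parameter estimate. The simulated trial counts, weights, and noise level below are hypothetical, chosen only to make the construction concrete:

```python
import numpy as np

# Hypothetical single-condition data: true auditory and visual
# locations and the reported (or decoded) spatial estimates.
rng = np.random.default_rng(1)
s_a = rng.choice([-10.0, -3.3, 3.3, 10.0], size=100)
s_v = rng.choice([-10.0, -3.3, 3.3, 10.0], size=100)
# Simulate estimates that weight vision twice as strongly as audition.
s_est = 0.6 * s_v + 0.3 * s_a + rng.normal(scale=0.5, size=100)

# GLM: predict the spatial estimates from the true locations
# (design matrix with auditory and visual regressors plus intercept).
X = np.column_stack([s_a, s_v, np.ones_like(s_a)])
beta_a, beta_v, _ = np.linalg.lstsq(X, s_est, rcond=None)[0]

# Relative AV weight: four-quadrant inverse tangent of the visual and
# auditory parameter estimates (radians; pi/4 means equal weighting,
# larger values mean stronger reliance on vision).
w_av = np.arctan2(beta_v, beta_a)
```

For the simulated 2:1 visual weighting above, w_av lands near arctan(2) ≈ 1.1 rad, i.e., above the equal-weighting value of π/4.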
Table 1.
Model parameters (across-subjects’ mean ± SEM) of the computational models fitted to observers’ behavioural localisation reports.
Fig 4.
EEG results for GLM-based wAV and Bayesian modelling analysis.
The neural audiovisual weight index wAV (across-participants’ circular mean ± 68% CI; n = 13). Neural wAV as a function of time is shown for (A) visual reliability: VR+ versus VR−; (B) task relevance: auditory (‘A’) versus visual (‘V’) report; (C) audiovisual spatial disparity: small (≤6.6°; D−) versus large (>6.6°; D+); (D) the interaction between task relevance and disparity. Shaded grey areas indicate the time windows during which the main effect of (A) visual reliability, (B) task relevance, (C) audiovisual spatial disparity, or (D) the interaction between task relevance and disparity on wAV was statistically significant at p < 0.05 corrected for multiple comparisons across time. (E) Time course of the circular–circular correlation (across-participants’ mean after Fisher z-transformation ± 68% CI; n = 13) between the neural and the behavioural audiovisual weight index wAV. Shaded grey areas indicate significant correlation at p < 0.05 corrected for multiple comparisons across time. (F) Time course of the protected exceedance probabilities [25] of the five models of the Bayesian modelling analysis: SegA (green), SegV (red), SegV,A (light blue), ‘forced fusion’ (‘Fusion’, yellow), and BCI model (with model averaging, dark blue). The early time window until 55 ms (delimited by the black vertical line on all plots) is shaded in white, because the decoding accuracy was not greater than chance for audiovisual congruent trials; hence, the neural weight index wAV and Bayesian model fits are not interpretable in this window. The data used to make this figure are available in file S1 Data. BCI, Bayesian causal inference; D+, high disparity; D−, low disparity; EEG, electroencephalography; GLM, general linear model; SegA, unisensory auditory segregation; SegV, unisensory visual segregation; SegV,A, audiovisual full-segregation; VR+, high visual reliability; VR−, low visual reliability.
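Because both weight indices are angles, panel (E) requires a circular–circular (not Pearson) correlation. A minimal sketch, assuming the common Jammalamadaka–SenGupta form of the coefficient (the form implemented, e.g., by the CircStat toolbox's `circ_corrcc`; whether the authors used exactly this variant is an assumption here):

```python
import numpy as np

def circ_corrcc(a, b):
    """Circular-circular correlation between two sets of angles
    (radians), Jammalamadaka-SenGupta form: deviations from the
    circular means enter through their sines."""
    # Circular means via the four-quadrant arctangent of summed sines/cosines.
    a_bar = np.arctan2(np.sin(a).sum(), np.cos(a).sum())
    b_bar = np.arctan2(np.sin(b).sum(), np.cos(b).sum())
    num = np.sum(np.sin(a - a_bar) * np.sin(b - b_bar))
    den = np.sqrt(np.sum(np.sin(a - a_bar) ** 2) *
                  np.sum(np.sin(b - b_bar) ** 2))
    return num / den

# Identical angle sets correlate perfectly (r = 1); negated sets give r = -1.
angles = np.array([0.2, 0.5, 1.0, 1.4, 0.9])
r = circ_corrcc(angles, angles)
```

In the analysis above, such a coefficient would be computed per participant between the neural and behavioural wAV, Fisher z-transformed, and averaged across participants.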
Table 2.
Statistical significance of main, interaction, and simple main effects for the behavioural and neural audiovisual weight indices (wAV) (‘model-free’ approach).