Fig 1.
The between-subject experimental design of the avVSR task with three factors: display device (computer monitor vs. HMD), audiovisual angle incongruence (congruent vs. incongruent), and serial position (1–8) within the target digit sequence. In the congruent case, the ECA and the virtual sound source are co-located; in the incongruent case, there is an angular offset between the ECA and the virtual sound source.
Fig 2.
Graphical depiction of the trial phases Countdown, Stimulus presentation, Retention interval, and Recall phase over time. In the Recall phase, the cursor is visible as a red cross.
Fig 3.
The experiment was conducted with both (left) computer monitor and (right) HMD presentation. Left: computer monitor presentation, showing a participant wearing headphones fitted with OptiTrack tracking markers. Right: representation of the VR scene created in Unreal Engine. When the HMD is worn, the blue camera represents the participant’s point of view.
Fig 4.
Audiovisual angle incongruence in Experiment 1.
The female ECA is displayed at an azimuth angle of […] on the horizontal plane at a distance of d from the listener (blue). The possible virtual sound source positions (green) are at the azimuth angles […] and […] at the distance d.
Table 1.
Summary of pairwise comparisons between the audiovisual Angle offsets (i.e., between the auditory source and the visual representation of the ECA) and the display Device combinations with Accuracy as the dependent variable.
Fig 5.
Questionnaire results of Experiment 1.
The standard deviation (SD) difference between the HMD and the Monitor display conditions is shown for each question (on the y-axis). Error bars indicate the 95% credible intervals. All questions were rated on a scale of 1 to 7, except for Q_Inc, which was asked on a yes/no basis and is thus not displayed here.
Fig 6.
The between-subject experimental design of the avVSR task with three factors: display device (computer monitor vs. HMD), audiovisual voice incongruence (congruent vs. incongruent), and serial position (1–8) within the target digit sequence. In the congruent case, each ECA speaks with its own voice; in the incongruent case, the female ECA speaks with the male ECA’s voice and vice versa (here shown for the female ECA).
Fig 7.
Left: The female ECA was the same as in Experiment 1. Right: A male ECA was created for Experiment 2.
Table 2.
Summary of pairwise comparisons between the voice incongruence (Match) and display Device combinations with Accuracy as the dependent variable.
Fig 8.
Questionnaire results of Experiment 2.
The standard deviation (SD) difference between the HMD and the Monitor display conditions is shown for each question (on the y-axis). Error bars indicate the 95% credible intervals. All questions were rated on a scale of 1 to 7, except for Q_Inc, which was asked on a yes/no basis and is thus not displayed here.
Fig 9.
Standard deviation (SD) difference of questionnaire data between Experiments 1 and 2.
Results are shown for the questions Q_Nat, Q_Sen, Q_Spe, and Q_Imp (on the y-axis). Error bars indicate the 95% credible intervals. All questions (see section Questionnaire) were rated on a scale of 1 to 7.