Table 1.
Demographic and technical data of cochlear implanted individuals.
Figure 1.
Boxplots show the lower quartile, median, and upper quartile; whiskers represent 1.5 times the interquartile range (X = outliers). Speech reading performance (percentage of correctly repeated words) of 14 deaf individuals using (A) the same high-definition web camera (Logitech Pro9000) with different speakers (CD, medical student, 97 words/min; JB, actress, 161 words/min; SF, speech therapist, 178 words/min), (B) the same speaker (CD) with different communication modes, and (C) the same speaker (SF) with 3 different webcams: Logitech Pro9000, Logitech C600, and Logitech C500.
Figure 2.
Speech reading performance (mean +/−1 SD) by n = 14 deaf individuals for 4 different spatial resolutions (A) and 5 different frame rates (B).
In B, the maximum speech perception achieved at 30 fps is set to 100% (relative data). Mean speech perception scores remained above 80% down to a frame rate of 10 fps. Frame rates <10 fps were associated with a substantial reduction in speech reading performance, and a frame rate of 7 fps led to a 50% reduction relative to the performance at optimal video quality. Speech reading at 5 fps was almost impossible.
Figure 3.
Speech reading capability of cochlear implant users.
A. Comparison of speech perception scores in the absence of auditory input for n = 10 proficient (pCI) and n = 11 non-proficient (npCI) CI users for two visual communication modes (face-to-face without their implant activated vs. Skype™ video only). B. Boxplots showing speech reading scores for each condition and group.
Figure 4.
CI users and audio-visual gain for Skype™ transmission.
A. Speech perception scores of n = 10 proficient (pCI) and n = 11 non-proficient (npCI) CI users for auditory-only input vs. audio-visual input. B. Boxplots showing that non-proficient CI users and the two groups combined (all CI) had a statistically significant audio-visual gain. Proficient CI users showed a non-significant trend toward audio-visual gain.
Figure 5.
Bimodal mean speech perception (+/−1 SD) plotted against audio-visual delay (auditory signal precedes image) for n = 10 proficient (pCI) and n = 11 non-proficient (npCI) CI users. Fusion of incongruent auditory and visual stimuli was no longer possible at delays beyond 200 ms for npCI users and 300 ms for pCI users. Intelligibility improved again at long audio-visual delays because CI users no longer tried to fuse the two incongruent signals and instead relied on one of the stimuli.