Negative impacts from latency masked by noise in simulated beamforming

doi:10.1371/journal.pone.0254119

Fig 1.

Comb filters.

Comb filters resulting from three gain / latency pairs. Yellow: 17 dB, 64 ms, dip depth = 2.89 dB. Red: 12 dB, 32 ms, dip depth = 6.06 dB. Blue: 8 dB, 8 ms, dip depth = 13.82 dB.
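The comb-filter behavior described in this caption can be illustrated with a minimal sketch, assuming the simplest possible model: an undelayed direct path at unit amplitude summed with a single enhanced path that has extra gain and a pure delay. The function name `comb_filter_summary` and the two-path model are assumptions for illustration; the caption does not state how its dip depths were computed, so the peak-to-notch depth from this sketch should not be expected to reproduce those values.

```python
import math


def comb_filter_summary(gain_db, latency_ms):
    """Summarize the comb filter formed by 1 + g * exp(-j*2*pi*f*tau).

    Assumed two-path model (not taken from the source): the direct path has
    unit amplitude, the enhanced path has linear gain g and delay tau.
    """
    g = 10 ** (gain_db / 20.0)      # enhanced-path gain as an amplitude ratio
    tau = latency_ms / 1000.0       # delay in seconds

    peak = 1.0 + g                  # paths add in phase
    notch = abs(g - 1.0)            # paths add out of phase

    if notch == 0.0:
        depth_db = math.inf         # equal amplitudes cancel completely
    else:
        depth_db = 20.0 * math.log10(peak / notch)

    return {
        "first_notch_hz": 1.0 / (2.0 * tau),   # notches at (2k + 1) / (2 * tau)
        "notch_spacing_hz": 1.0 / tau,
        "peak_to_notch_db": depth_db,
    }


if __name__ == "__main__":
    for gain_db, latency_ms in [(17, 64), (12, 32), (8, 8)]:
        print(gain_db, latency_ms, comb_filter_summary(gain_db, latency_ms))
```

Under this model, longer latencies pack the notches more closely together (an 8 ms delay gives notches every 125 Hz), and gains closer to 0 dB give deeper notches, which agrees with the trend across the three pairs in the caption.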

Fig 2.

Filter latency vs. array gain.

Increasing the filter length yields larger array gain at the expense of increased latency. The reverberation time (RT60, the time for the sound level to decay by 60 dB) is given in the legend. Filter latency = group delay = 0.5 × filter length. Simulation for one desired source and two interfering sources.
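The latency relation in this caption can be turned into a small worked example. This is a sketch only: the function name `filter_latency_ms` and the 16 kHz sample rate are assumptions for illustration and do not come from the source. For a linear-phase FIR filter of N taps the exact group delay is (N - 1) / 2 samples, which the caption's 0.5 × filter length approximates for long filters.

```python
def filter_latency_ms(filter_length_samples, sample_rate_hz):
    """Latency in ms, using the caption's rule: group delay = 0.5 * filter length."""
    delay_samples = 0.5 * filter_length_samples
    return delay_samples / sample_rate_hz * 1000.0


if __name__ == "__main__":
    fs = 16000  # assumed sample rate in Hz, not stated in the source
    for taps in (256, 1024, 2048):
        print(taps, "taps ->", filter_latency_ms(taps, fs), "ms")
```

At the assumed 16 kHz rate, doubling the filter length doubles the latency, which is the trade-off the figure plots against array gain.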

Fig 3.

Simplified Simulink schematic.

a) Speech model containing two speech signals: the direct path signal is set at X dB, which varies according to the subject's previous response. The enhanced path signal contains the same speech signal with an additional gain of x_i dB and a latency of y_i ms (i = 1, 2, or 3; gain / latency pairs are described in the text). b) Noise model containing two speech-shaped noise signals: the direct path is fixed at 0 dB and the enhanced path is fixed at -6 dB, representing the worst-case scenario of a successful beamformer, with the same latency value, y_i, as in the speech model.
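The signal flow in this schematic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Simulink model: the helper names (`db_to_amp`, `scale`, `delay`, `mix_paths`), the use of plain Python lists for signals, and the sample rate argument are all hypothetical. The levels follow the caption: direct speech at X dB, enhanced speech at X + x_i dB delayed by y_i ms, direct noise at 0 dB, and enhanced noise at -6 dB with the same delay.

```python
def db_to_amp(level_db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (level_db / 20.0)


def scale(signal, level_db):
    """Apply a gain in dB to every sample."""
    a = db_to_amp(level_db)
    return [a * s for s in signal]


def delay(signal, n_samples):
    """Delay by n_samples, zero-padding the start and keeping the original length."""
    return ([0.0] * n_samples + list(signal))[: len(signal)]


def mix_paths(speech, noise, direct_speech_db, gain_db, latency_ms, sample_rate_hz):
    """Sum direct and enhanced paths for speech and noise (illustrative sketch)."""
    n = round(latency_ms * sample_rate_hz / 1000.0)

    direct_speech = scale(speech, direct_speech_db)               # X dB
    enhanced_speech = delay(scale(speech, direct_speech_db + gain_db), n)
    direct_noise = scale(noise, 0.0)                              # fixed at 0 dB
    enhanced_noise = delay(scale(noise, -6.0), n)                 # fixed at -6 dB

    return [
        ds + es + dn + en
        for ds, es, dn, en in zip(direct_speech, enhanced_speech,
                                  direct_noise, enhanced_noise)
    ]


if __name__ == "__main__":
    fs = 1000                      # assumed sample rate, chosen for readability
    impulse = [1.0] + [0.0] * 15   # unit impulse standing in for speech
    silence = [0.0] * 16
    print(mix_paths(impulse, silence, 0.0, 8.0, 8.0, fs))
```

Feeding an impulse through the sketch shows the two copies of the signal separated by the latency, which is the structure that produces the comb filtering shown in Fig 1.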

Fig 4.

Experiment’s graphical user interface.

Example of the MATLAB GUI for a single trial of the Modified Rhyme Test (MRT).

Fig 5.

Visual for permutation statistics.

The distribution of thresholds from the permutation statistics is shown in red. The gray line shows where the average from the experimental data falls within the distribution.

Table 1.

Experimental conditions and threshold results.

The three gain / latency combinations used in this experiment, taken from the RT60 = 0.6 s beamforming simulation in Fig 2, together with the average thresholds (SNR of enhanced signal to noise) and standard deviations across subjects for each of the three conditions. (dB = decibels; ms = milliseconds).

Fig 6.

Intelligibility metrics.

Left graphs show the HASPI and STOI (top and bottom, respectively) as a function of the direct path signal-to-noise ratio. Adding the various enhancements translates the curve by the corresponding amount of gain. Right graphs show the HASPI and STOI (top and bottom, respectively) as a function of the enhanced path signal-to-noise ratio. Regardless of the direct path signal level, the enhanced signal dominates the intelligibility calculation.

Fig 7.

Signal SNR across conditions.

The green line is the noise level across conditions. The gray region shows levels below the speech intelligibility threshold measured for the average listener. Yellow diamonds represent the threshold levels across conditions for the average listener. Purple circles indicate the level of the direct signal, which is unintelligible and in need of enhancement.
