Fig 1.

Example of two simulated microphone signals.

Visualization of the signals y1 (blue) and y2 (orange).

Fig 2.

Plot of the cross correlation of y1 and y2 for positive lags.

Fig 3.

Visualization of the signals y1 (blue) and y2 (orange) after applying a delay of 20 samples to y1.

Fig 4.

Cross correlation with different estimators.

The outputs of three cross-correlation estimators, GCC-PHAT (blue), frequency domain (orange), and time domain (green), are plotted for a 350 ms frame of speech.
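
For readers unfamiliar with the GCC-PHAT estimator named in this caption, the following is a minimal sketch of how such an estimator can recover an inter-microphone delay. It is an illustration under stated assumptions, not the authors' implementation: the function name `gcc_phat_lag`, the white-noise test signal, and the 20-sample delay are hypothetical choices made for the example.

```python
import numpy as np


def gcc_phat_lag(sig, ref):
    """Estimate how many samples `sig` lags behind `ref` using GCC-PHAT.

    Hypothetical helper for illustration; a positive result means `sig`
    is a delayed copy of `ref`.
    """
    # Zero-pad to the combined length so the correlation is linear,
    # not circular.
    n = len(sig) + len(ref)
    sig_spec = np.fft.rfft(sig, n=n)
    ref_spec = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, then the phase transform (PHAT): divide by
    # the magnitude so that only phase information remains.
    cross = sig_spec * np.conj(ref_spec)
    cross = cross / (np.abs(cross) + 1e-12)

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Reorder so that lags run from -max_lag to +max_lag.
    max_lag = n // 2
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


# Assumed test signal: white noise and a copy delayed by 20 samples.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(20), ref))
lag = gcc_phat_lag(sig, ref)
print(lag)
```

Dividing the cross-spectrum by its magnitude discards amplitude information, which tends to sharpen the correlation peak compared with an unweighted frequency-domain or time-domain estimate.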

Fig 5.

Geometry used to estimate DOA from two microphones (M1 and M2) that are separated by distance D.
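
A two-microphone geometry like the one in this caption is commonly reduced to the far-field relation θ = arcsin(c·τ / D), where τ is the time difference of arrival and c is the speed of sound. The sketch below assumes that relation, an angle measured from the broadside direction, and c = 343 m/s; none of these is confirmed by the caption, and the function name `doa_from_lag` and its parameters are hypothetical.

```python
import math


def doa_from_lag(lag_samples, sample_rate_hz, mic_distance_m, c=343.0):
    """Convert an inter-microphone lag into an angle in degrees.

    Assumes a far-field (plane-wave) source and an angle measured from
    broadside; the sign convention is an assumption of this sketch.
    """
    tau = lag_samples / sample_rate_hz      # delay in seconds
    ratio = c * tau / mic_distance_m        # sine of the arrival angle
    # Noise can push the ratio slightly outside [-1, 1]; clamp it so
    # asin stays defined.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# Example with assumed values: 16 kHz audio, microphones 0.1 m apart.
angle = doa_from_lag(lag_samples=2, sample_rate_hz=16000, mic_distance_m=0.1)
print(angle)
```

Because the lag is quantized to whole samples, the achievable angular resolution in this sketch depends on the sample rate and on D.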

Table 1.

Overview of the conditions recorded.

Fig 6.

Spectrogram of audio sample with human speech and REEM-C motions, represented in dB.

Three regions with energy that correspond to the sound from the REEM-C arm motors are highlighted.

Table 2.

Parameter spaces defined for both methods.

(min, max) = uniform distribution; (mean, std) = normal distribution.

Fig 7.

Brute force DOA performance against frame size (s) and step size (%).

The small white region corresponds to parameter combinations that resulted in no windows of speech being detected.

Fig 8.

Brute force classification performance against low and high thresholds.

The white region corresponds to parameter combinations that resulted in no windows of speech being detected.

Fig 9.

TPE DOA performance against frame size (s) and step size (%).

The white region corresponds to parameter combinations that were not tested by the TPE method.

Fig 10.

TPE classification performance visualized against frame size (s) and step size (%).

The white region corresponds to parameter combinations that were not tested by the TPE method.

Fig 11.

Joint objective loss vs. step size and frame size.

The white region corresponds to parameter combinations that were not tested by the TPE method.

Fig 12.

Joint regularized objective loss vs. step size and frame size.

The white region corresponds to parameter combinations that were not tested by the TPE method.

Fig 13.

Frame size vs. average joint objective loss.

Fig 14.

Frame size vs. average joint regularized objective loss.

Table 3.

Best parameters and results for each task across optimization methods.

Table 4.

Test set performance.

Fig 15.

Test 6 DOA results.

The green regions indicate intervals where speech was present. The blue dots indicate the estimated DOA using best study parameters. The actual DOA of the speaker is indicated by the dotted line.

Fig 16.

Test 28 DOA results.

The green regions indicate intervals where speech was present. The blue dots indicate the estimated DOA using best study parameters. The actual DOA of the speaker is indicated by the dotted line.

Fig 17.

Test 36 DOA results.

The green regions indicate intervals where speech was present. The blue dots indicate the estimated DOA using best study parameters. The actual DOA of the speaker is indicated by the dotted line.

Table 5.

Comparative results for voice and timing methods.

Fig 18.

Test 29 DOA results.

Estimates generated with GCC-PHAT (orange) and GCC-SCOT (blue) are compared. The green regions indicate intervals where speech was present. The actual DOA of the speaker is indicated by the dotted line.

Fig 19.

Visualization of 5 dataset folds used for cross validation.

Indices shaded white are used for training, and indices shaded black are used for testing.

Table 6.

Cross validation results.

Table 7.

Best overall parameters for both methods.

Fig 20.

Microphone setup on REEM-C.

Fig 21.

Average latencies of DOA estimate for three different frame lengths.

The error bars indicate one standard error.
