Fig 1.

Raw data as a waveform and its spectrogram, mel-spectrogram, and MFCC representations.

Fig 2.

The architecture of the proposed 1D-CNN model.

Table 1.

Layers and parameters of the proposed 1D-CNN model.

Fig 3.

The architecture of the 2DS-CNN model.

Table 2.

Layers and parameters of the 2DS-CNN model.

Fig 4.

The architecture of the 2DM-CNN model.

Table 3.

Layers and parameters of the 2DM-CNN model.

Fig 5.

Speech, spectrogram, mel-spectrogram, and MFCC images from the datasets used.

Table 4.

Hardware specifications used for training, testing, and analysis.

Fig 6.

Comparison of complexity and time parameters of all models (lower is better).

Table 5.

Comparison results on digital speech dataset (mean±std).

Table 6.

Comparison results on the spectrogram dataset.

Table 7.

Comparison results on the mel-spectrogram dataset.

Table 8.

Comparison results on the MFCC dataset.

Fig 7.

Results of the proposed algorithms (higher is better).

Fig 8.

The training process of the End2End and 2DM-CNN models in the first run.

Fig 9.

Confusion matrices of the End2End and 2DM-CNN models in the first run.

Fig 10.

First-run confusion matrices of the proposed neural networks on the mel-spectrogram dataset.

Fig 11.

Model ranking (lower is better).
