Towards the identification of Idiopathic Parkinson’s Disease from the speech. New articulatory kinetic biomarkers

doi:10.1371/journal.pone.0189583

Fig 1.

Categorization of the disturbances associated with the hypokinetic dysarthria of PD patients.

Fig 2.

Histogram of the UPDRS-III labels of the corpus of speakers.

Fig 3.

Speech waveform and spectrogram of a 35-year-old normophonic speaker uttering the /pa/-/ta/-/ka/ test.

Fig 4.

Detail of the speech corresponding to the /ka/ syllable of a 35-year-old normophonic speaker.

The syllable starts with a stop gap (silence), followed by a burst that precedes the periodic sound of the vowel. The structure depicted is typical of the plosive consonant-vowel combinations used in the DDK test.

Table 1.

Segments in the consonant-vowel combinations of the DDK test with their corresponding time and frequency characteristics.

Fig 5.

Speech trace and spectrograms of voiceless bilabial/alveolar/velar (left/center/right) stops uttered by five PD patients with different degrees of the disease according to the H&Y and UPDRS scales.

Fig 6.

Analogy of the movements of the articulators in PD patients.

Speakers do not move the articulators to their full extent, with the required acceleration, or for the required time.

Fig 7.

Recognition rate vs. kernel length used to calculate the velocity of the envelope.

The optimum is considered to be in the interval [30–65] ms.

Fig 8.

Recognition rate vs. kernel lengths used to calculate the velocity and acceleration of the envelope.

A 50 ms long kernel for the velocity corresponds with a 40 ms kernel for the acceleration.

Fig 9.

Speech trace with its envelope and estimates of the velocity and acceleration of the envelope for a young normophonic 35-year-old speaker (a), a control speaker (b), a parkinsonian patient with H&Y = 2 (c), and a parkinsonian patient with H&Y = 3 (d), all calculated using 50 ms and 40 ms long smoothing kernels for the velocity and acceleration, respectively.

The speech traces correspond to a single utterance of the /pa/-/ta/-/ka/ test. The amplitudes are normalized to the range [–1, 1] for each 1.37 s long frame of analysis. Note that the time scales differ between plots because the speech rates differ.
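
The processing behind traces of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rectify-and-smooth envelope, the rectangular smoothing kernels, the central-difference derivative, and the 20 ms envelope window (`env_ms`) are all assumptions. Only the 50 ms and 40 ms kernel lengths, the 1.37 s frame, and the [-1, 1] normalization come from the captions. All function names are hypothetical.

```python
import math

def moving_average(x, width):
    """Centered rectangular smoothing of about `width` samples, shortened at the edges."""
    half = width // 2
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out

def derivative(x, fs):
    """Central finite difference (one-sided at the ends), in units per second."""
    n = len(x)
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = (x[i + 1] - x[i - 1]) * fs / 2.0
    if n > 1:
        out[0] = (x[1] - x[0]) * fs
        out[-1] = (x[-1] - x[-2]) * fs
    return out

def normalize(x):
    """Scale so that the largest absolute value is 1, i.e. the range [-1, 1]."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x] if peak > 0 else list(x)

def envelope_kinematics(signal, fs, env_ms=20.0, vel_ms=50.0, acc_ms=40.0):
    """Return normalized (envelope, velocity, acceleration) for one frame."""
    def width(ms):
        return max(1, round(fs * ms / 1000.0))
    # Assumed envelope: rectified signal smoothed with a short window.
    env = moving_average([abs(s) for s in signal], width(env_ms))
    # Velocity and acceleration: derivative followed by kernel smoothing.
    vel = moving_average(derivative(env, fs), width(vel_ms))
    acc = moving_average(derivative(vel, fs), width(acc_ms))
    return normalize(env), normalize(vel), normalize(acc)

# Synthetic 1.37 s frame: a 100 Hz tone with a 5 Hz amplitude modulation.
fs = 1000
n = round(1.37 * fs)
signal = [math.sin(2 * math.pi * 100 * t / fs) * (0.5 + 0.5 * math.sin(2 * math.pi * 5 * t / fs))
          for t in range(n)]
env, vel, acc = envelope_kinematics(signal, fs)
```

The synthetic frame stands in for one utterance; with real recordings the same call would be applied per 1.37 s frame at the recording's sampling rate.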

Fig 10.

3D attractors of the envelope speed for a young normophonic 35-year-old speaker (a), a control speaker (b), a parkinsonian patient with H&Y = 2 (c), and a parkinsonian patient with H&Y = 3 (d), all calculated using a 50 ms long smoothing kernel for the speed and a time delay of 70 samples.
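
A 3D attractor of this kind is commonly obtained by time-delay embedding. The sketch below assumes that standard construction, with the delay of 70 samples taken from the caption; the function name `delay_embed` and the toy input series are hypothetical.

```python
def delay_embed(x, dim=3, tau=70):
    """Return the points (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n_points = len(x) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this dim and tau")
    return [tuple(x[i + k * tau] for k in range(dim)) for i in range(n_points)]

# Toy series: each embedded point pairs a sample with its 70- and 140-sample successors.
series = list(range(200))
points = delay_embed(series)
```

Each tuple is one point of the reconstructed trajectory; plotting the three coordinates against each other would give the 3D view.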

Fig 11.

Accuracy vs. window size for a GMM-UBM system trained with 128 Gaussians for two different parameterization approaches.

The best results are obtained with 10 ms windows.

Table 2.

Best results in terms of accuracy, area under the ROC curve, sensitivity and specificity for both configurations (GMM-UBM and iVectors) and parameterization approaches (MFCC and RASTA-PLP).

Fig 12.

DET curves for the GMM-UBM and iVectors approaches with the MFCC and RASTA-PLP parameterizations.
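
A DET curve plots the false negative rate against the false positive rate as the decision threshold varies. The sketch below computes those error-rate pairs only; it assumes that label 1 marks a PD speaker and that higher scores indicate PD, and it omits the normal-deviate axis scaling usually applied when drawing DET plots. The function name and the toy scores are hypothetical.

```python
def det_points(scores, labels):
    """Return (false_positive_rate, false_negative_rate) for each threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        # A speaker is classified as positive when its score reaches the threshold.
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        points.append((fp / n_neg, fn / n_pos))
    return points

# Toy example with two controls (label 0) and two patients (label 1).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
points = det_points(scores, labels)
```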

Fig 13.

Normalized histograms of the UPDRS-III labels of the misclassified speakers.

a) using GMM-UBM and RASTA-PLP; b) using GMM-UBM and MFCC; c) using iVectors and RASTA-PLP; d) using iVectors and MFCC.

Fig 14.

DET plot of the best baseline system and of the proposed method.

Table 3.

Best results in terms of accuracy ± confidence interval, area under the ROC curve, sensitivity and specificity for both configurations.
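
The four measures reported here can be computed from labels, hard decisions, and scores as sketched below. The caption does not say how the confidence interval was obtained, so the normal-approximation 95% interval used here is an assumption, as is the convention that label 1 denotes a PD speaker. The function names and the toy data are hypothetical.

```python
import math

def classification_report(labels, predictions, z=1.96):
    """Accuracy with an assumed normal-approximation interval, sensitivity, specificity."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    n = len(pairs)
    acc = (tp + tn) / n
    return {
        "accuracy": acc,
        "ci_half_width": z * math.sqrt(acc * (1.0 - acc) / n),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(labels, scores):
    """Area under the ROC curve as the fraction of correctly ranked positive/negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data: four patients (1) and four controls (0).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
report = classification_report(labels, predictions)
area = auc(labels, scores)
```

With so few samples the normal approximation is crude; it is shown only to make the "accuracy ± interval" format concrete.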

Fig 15.

Boxplots corresponding to the complexity measures extracted from the acceleration (top row) and velocity (bottom row) sequences.

Fig 16.

a) Normalized histogram of the UPDRS-III labels of the speakers misclassified by the proposed method, b) UPDRS-III level vs. score given by the proposed method.

Fig 17.

Example of the estimation of the time lag (a) and embedding dimension (b) for a 1.37 s long frame of the envelope velocity of a normophonic speaker during the /pa/-/ta/-/ka/ test.

In this example, the first minimum of the auto mutual information is found at a lag of 70 samples. For the embedding dimension, the plot of the E1 value used in Cao's method shows a kink at 6. The histograms in (c) and (d) show the time delays and embedding dimensions, respectively, obtained for all the frames extracted from the database.
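
The time-lag criterion can be illustrated with a simple histogram estimate of the auto mutual information. This is a sketch under assumptions: the caption does not describe the estimator, so the equal-width binning, the number of bins, and the local-minimum test are illustrative choices, and the function names are hypothetical. Cao's method for the embedding dimension is not covered here.

```python
import math

def auto_mutual_information(x, lag, bins=16):
    """Histogram estimate (in nats) of the mutual information between x[t] and x[t + lag]."""
    if lag >= len(x):
        raise ValueError("lag must be shorter than the series")
    lo, hi = min(x), max(x)
    span = (hi - lo) or 1.0
    def bin_of(v):
        return min(bins - 1, int((v - lo) / span * bins))
    a = [bin_of(v) for v in x[:len(x) - lag]]
    b = [bin_of(v) for v in x[lag:]]
    n = len(a)
    joint, pa, pb = {}, [0] * bins, [0] * bins
    for i, j in zip(a, b):
        joint[(i, j)] = joint.get((i, j), 0) + 1
        pa[i] += 1
        pb[j] += 1
    # Sum p(i, j) * log(p(i, j) / (p(i) * p(j))) over the occupied cells.
    return sum((c / n) * math.log(c * n / (pa[i] * pb[j])) for (i, j), c in joint.items())

def first_minimum_lag(x, max_lag, bins=16):
    """Return the first lag at which the estimate has a local minimum, or None."""
    ami = [auto_mutual_information(x, lag, bins) for lag in range(1, max_lag + 1)]
    for k in range(1, len(ami) - 1):
        if ami[k] < ami[k - 1] and ami[k] <= ami[k + 1]:
            return k + 1  # ami[k] belongs to lag k + 1
    return None

# Toy series standing in for one frame of the envelope velocity.
series = [math.sin(2 * math.pi * t / 40) for t in range(400)]
lag = first_minimum_lag(series, max_lag=30)
```

Applied frame by frame, the returned lags could be collected into a histogram like the one in panel (c).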
