Fig 1.
Muscles of the soft palate in posterior (left) and lateral (right) views.
Fig 2.
Mid-sagittal RT-MRI images of the vocal tract for several velum positions over time, showing the evolution from a raised velum to a lowered velum and back to the initial position.
The presented curve, used for analysis, was derived from the images.
Fig 3.
Example of the warped signal representing the nasal information extracted from RT-MRI (dashed line), superimposed on the speech recorded during the corresponding RT-MRI and EMG acquisitions, for the sentence [6~p6, p6~p6, p6~].
Fig 4.
Coronal-oblique RT-MRI images depicting the nasal cavity (in white), over time, and the curve derived for analysis purposes.
Fig 5.
EMG electrode positioning and the respective channels (1 to 5), plus the reference electrode (R).
EMG 1 and 2 use unipolar configurations and EMG 3, 4 and 5 use bipolar configurations.
Fig 6.
Example of the EMG signal segmentation into nasal and non-nasal zones based on the information extracted from the RT-MRI (dashed red line).
The square wave depicted with a black line represents the velum information split into two classes where 0 stands for non-nasal and 1 for nasal. The blue line is the average of the RT-MRI information (after normalization) and the green line is the average plus half of the standard deviation.
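The two-class split described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes min-max normalization (the caption only says "after normalization") and assumes that the green line (average plus half of the standard deviation) is the decision threshold. The function name `segment_nasal_zones` and the parameter `k` are hypothetical.

```python
import numpy as np

def segment_nasal_zones(velum, k=0.5):
    """Split a velum curve into nasal (1) and non-nasal (0) frames.

    Assumptions (not stated in the caption): min-max normalization,
    and a threshold equal to mean + k * standard deviation.
    """
    velum = np.asarray(velum, dtype=float)
    span = velum.max() - velum.min()
    # Min-max normalization to [0, 1]; a flat curve maps to all zeros.
    norm = (velum - velum.min()) / span if span > 0 else np.zeros_like(velum)
    # Threshold corresponding to "average plus half of the standard deviation".
    threshold = norm.mean() + k * norm.std()
    # Square wave: 1 where the curve exceeds the threshold, 0 elsewhere.
    labels = (norm > threshold).astype(int)
    return labels, threshold

labels, threshold = segment_nasal_zones([0, 0, 0, 1, 1, 0, 0, 0])
```

Consecutive runs of 1 in `labels` would then correspond to nasal zones, and runs of 0 to non-nasal zones.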
Fig 7.
Raw EMG signal and pre-processed EMG signal of channel 1 (top) and 3 (bottom) for the sentence [6~p6, p6~p6, p6~] from speaker 1.
The pre-processed signal has been normalized and filtered using a 12-point moving average filter.
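A 12-point moving average of this kind can be sketched as below. The caption does not specify the normalization, so mean removal followed by scaling to unit peak amplitude is an assumption made for illustration only; `preprocess_emg` is a hypothetical name.

```python
import numpy as np

def preprocess_emg(raw, window=12):
    """Normalize an EMG channel and smooth it with a moving average.

    Assumption: normalization is mean removal plus scaling by the
    peak absolute amplitude (the caption does not state the method).
    """
    raw = np.asarray(raw, dtype=float)
    centered = raw - raw.mean()
    peak = np.max(np.abs(centered))
    normalized = centered / peak if peak > 0 else centered
    # Each output sample is the mean of `window` neighbouring samples;
    # mode="same" zero-pads the edges, so the first and last few
    # samples are attenuated.
    kernel = np.ones(window) / window
    return np.convolve(normalized, kernel, mode="same")

rng = np.random.default_rng(0)
smoothed = preprocess_emg(rng.normal(size=200))
```

A longer window smooths more aggressively at the cost of temporal resolution.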
Fig 8.
Filtered EMG signal (12-point moving average filter) for the several channels (pink), the aligned RT-MRI information (blue) and the respective audio signal for the sentence [6~p6, p6~p6, p6~] from speaker 1.
An amplitude gain was applied to the RT-MRI information and to the EMG for better visualization of the superimposed signals.
Fig 9.
Filtered EMG signal (12-point moving average filter) for the several channels (pink), the aligned RT-MRI information (blue) and the respective audio signal for the sentence [i~p6, i~p6, pi~] from speaker 1.
An amplitude gain was applied to the RT-MRI information and to the EMG for better visualization of the superimposed signals.
Fig 10.
Portuguese vowels in an isolated context: pre-processed EMG signal for all EMG channels (pink), the aligned RT-MRI information (blue) and the respective audio signal (red) for [6~, e~, i~, o~, u~].
An amplitude gain was applied to the RT-MRI information and to the EMG for better visualization of the superimposed signals.
Fig 11.
Boxplot of the mutual information in the nasal zones between the RT-MRI information and the EMG signal of all speakers and for a single speaker.
Table 1.
Class distribution for all speakers for a single EMG channel by zones and frames (nasal and non-nasal).
Fig 12.
Classification results (mean value over the 10 folds for error rate, sensitivity and specificity) for all channels and all speakers.
Error bars show a 95% confidence interval.
Table 2.
Mean sensitivity and specificity measures (%) for each EMG channel with a 95% confidence interval.
Fig 13.
The graph on the left shows the mean error rate for each speaker clustered by EMG channel.
The graph on the right shows the mean of the error rates from each speaker also clustered by EMG channel. Error bars show a 95% confidence interval.
Fig 14.
Difference between the mean error rate of all channels and the respective result of each channel, for all speakers (left) and for each speaker (right).
Error bars show a 95% confidence interval.
Table 3.
Mean error rate grouped by nasal vowel.
Table 4.
Mean error rate using multiple channel combinations.
Table 5.
Results of the repeated-measures ANOVA for the EMG channel pairs that reached statistical significance.
Table 6.
Mean error rates using a classification technique based on the majority of nasal/non-nasal frames for each zone.
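A zone-level decision based on the majority of frame-level labels can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function name `majority_vote_by_zone`, the integer zone identifiers, and the tie-breaking rule (ties count as non-nasal) are all assumptions.

```python
import numpy as np

def majority_vote_by_zone(frame_preds, zone_ids):
    """Assign each zone the label predicted for most of its frames.

    frame_preds: per-frame predictions (1 = nasal, 0 = non-nasal).
    zone_ids: integer zone identifier for each frame (hypothetical).
    Assumption: a tie is resolved as non-nasal (0).
    """
    frame_preds = np.asarray(frame_preds)
    zone_ids = np.asarray(zone_ids)
    result = {}
    for zone in np.unique(zone_ids):
        votes = frame_preds[zone_ids == zone]
        # Nasal only if strictly more than half of the frames are nasal.
        result[int(zone)] = int(votes.sum() * 2 > votes.size)
    return result

zone_labels = majority_vote_by_zone(
    [1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 2, 2],
)
```

Comparing `zone_labels` with the reference zone classes would then give a zone-level error rate instead of a frame-level one.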
Table 7.
Mean error rates using a classification technique based on the majority of nasal/non-nasal frames for each nasal zone.
Fig 15.
Classification results (mean value over the 10 folds for error rate, sensitivity and specificity) for all channels of speaker 1.
These results are based on four additional sessions from this speaker recorded a posteriori. Error bars show a 95% confidence interval.
Table 8.
Mean sensitivity and specificity measures (%) with a 95% confidence interval for each EMG channel of speaker 1.