I Hear You Eat and Speak: Automatic Recognition of Eating Condition and Food Type, Use-Cases, and Impact on ASR Performance

doi:10.1371/journal.pone.0154486

Table 1.

Technical setup for the recording of the audio-visual streams.

Table 2.

Number of subjects having special health issues.

Table 3.

Chosen food classes and amount of food served to the subjects while recording each utterance.

Table 4.

Self-reported likability and eating difficulty of the food classes, as rated by all subjects.

Table 5.

Statistics of the iHEARu-EAT database.

Fig 1.

Exemplary subjects of the iHEARu-EAT database while recording an utterance without eating food (left), eating a banana (middle) and eating crisps (right).

Unusual configurations of the supra-glottal part of the vocal tract are clearly visible for the eating conditions.

Table 6.

ASR WERs [%] using 7-way acoustic model training on the iHEARu-EAT dataset.

Table 7.

ComParE acoustic feature set: 65 low-level descriptors (LLD).

Table 8.

ComParE acoustic feature set: Functionals applied to LLD contours (Table 7).

Table 9.

Binary classification of eating condition.

Table 10.

2-way and 7-way classification of eating condition.

Table 11.

Confusion matrix obtained by SVMs on the ComParE feature set in the 7-way classification of eating condition for both read and spontaneous speech production.

Fig 2.

Solutions of non-metric multidimensional scaling applied to class confusions (2-D (top), 1-D (bottom left)) or Euclidean class center distances (1-D (bottom right)) in the 7-way task, ComParE low-level acoustic features.

Fig 3.

Degree of high-frequency noise of the words ‘warmed up’ caused by eating.

Subjects (left: female, right: male) while recording an utterance eating a banana (top), without eating any food (middle), and eating crisps (bottom).

Table 12.

Regression-based recognition of eating condition.
