
Morphometric Differences of Vocal Tract Articulators in Different Loudness Conditions in Singing

  • Matthias Echternach,

    Affiliation Institute of Musicians’ Medicine, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany

  • Fabian Burk,

    Affiliation Institute of Musicians’ Medicine, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany

  • Michael Burdumy,

    Affiliations Institute of Musicians’ Medicine, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany, Department of Medical Physics, Radiology, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany

  • Louisa Traser,

    Affiliations Institute of Musicians’ Medicine, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany, Department of Otorhinolaryngology, Freiburg University Medical Center, Kilianstr. 5, 79106 Freiburg, Germany

  • Bernhard Richter

    Affiliation Institute of Musicians’ Medicine, Freiburg University Medical Center, Breisacher Str. 60, 79106 Freiburg, Germany




Abstract

Introduction

Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data addressing the extent to which articulation is dependent on loudness.

Material and Methods

Twelve professional singers of different voice classifications were analysed with respect to their vocal tract profiles, recorded with dynamic real-time MRI at 25 frames per second in different pitch and loudness conditions. The subjects were asked to sing ascending scales on the vowel /a/ in three loudness conditions (comfortable = mf, very soft = pp, very loud = ff). Furthermore, fundamental frequency and sound pressure level were analysed from the audio signal, which was recorded simultaneously with an optical microphone, after noise cancellation.


Results

The data show articulatory differences with respect to changes of both pitch and loudness: lip opening and pharynx width increased with rising pitch and with greater loudness. While the vertical larynx position rose with pitch, it was lower for greater loudness. In particular, lip opening and pharynx width were more strongly correlated with the sound pressure level than with pitch.


Conclusions

For the vowel /a/, loudness has an effect on articulation during singing, which should be considered when articulatory vocal tract data are interpreted.

I. Introduction

In recent years, interest in the analysis of the phonation apparatus using real-time dynamic MRI technology has increased in both voice and speech research [1–6]. With regard to singing voice physiology, it has been shown that the vertical larynx position can change during phonation [7]. Furthermore, vocal tract shape has been evaluated. Here, it was shown that the vocal tract might be modified with respect to vocal registers [8–11] or might be adjusted in order to match vocal tract resonances or formants with voice source partials [12,13]. According to Titze et al., the term vocal tract resonance is used if the transfer function of the vocal tract is characterized, whereas the term formant is used when the vocal tract is excited by a voice source [14]. These studies included the modification of different variables, such as pitch (perceptual term) and fundamental frequency (ƒo, acoustical parameter) [9,12,13,15], vowel conditions [10,13,16], register [8–10,17] or different singing styles [18–20]. Different loudness (perceptual term) or sound pressure level (SPL, acoustical parameter) conditions have been neglected in most studies so far: the loud noise produced by the scanner prevented detailed analyses of the SPL. However, it could be hypothesized that SPL might influence articulation independent of register, ƒo or vowel condition.

In principle, there are three general strategies of SPL control in the voice. First, the subglottic pressure could be considered the most important factor [21–24]. Here, increased subglottic pressure is associated with an increased SPL [23]. Secondly, the phonation type (such as breathy, flow phonation, normal, and pressed) and the associated grade of adduction of the vocal folds might contribute to the SPL [24–26]: flow phonation shows the greatest maximum flow declination rate, which is associated with the greatest SPL [27]. Finally, resonatory properties of the vocal tract might affect the sound pressure level, usually by promoting the voice source partial which will be the strongest partial of the radiated spectrum and therefore determines the SPL [27]. This is of importance especially for the lower resonances. It has been shown that the first vocal tract resonance is raised by a greater lip and jaw opening, which could influence SPL [27]. Furthermore, the tracking of voice source partials by vocal tract resonances (often denoted as formant or vocal tract resonance tuning [28–30]) has been found to be of importance, especially for soprano voices [28,30,31], since this technique is considered to increase SPL. This tuning is more relevant for female singers' voices but not for speech [32]. The employment of resonance tuning by male professional singers has recently been discussed with divergent conclusions in the literature [29,33,34]. Furthermore, especially in professional singers, modifications of the lower vocal tract might produce a clustering of the vocal tract resonances 3 to 5 [35,36]. As a result, the spectrum partials in this region are boosted. In the case of individual singers, such partials could be the strongest in the voice spectrum, with the consequence that the SPL is determined by these partials. These three control mechanisms are not thought to be independent [37]. However, in a recent study by Herbst et al. it was shown for the singing voice that vocal loudness could be controlled independently of glottal configuration [21].

As stated above the vocal tract could have a direct influence on SPL. However, indirect influences of the vocal tract on the SPL and loudness could also be possible: if there is a rise of subglottic pressure, changes of muscular activity concerning tension and/or adduction are expected at the glottal level [38,39]. The degree of glottal adduction is associated with the vertical larynx position, i.e. that a small abduction is found for a lower vertical laryngeal position [27]. For trained singers both x-ray studies [40] and photographical recordings [41] showed a lower vertical larynx position for greater loudness and effort conditions. A lowering of the vertical laryngeal position would have the effect of vocal tract elongation and therefore a decrease in vocal tract resonance frequencies, which is frequently denoted as covered voice, voix sombrée, gedecktes Singen. Such a lowering could be caused by an activation of external laryngeal strap muscles. The activation of these muscles, such as the sternothyroid muscle, could on the one hand lower the vertical laryngeal position and on the other hand facilitate a shortening and thickening of the vocal folds [42] which might influence vocal fold oscillatory patterns and therefore loudness.
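The relation between vocal tract elongation and lower resonance frequencies can be illustrated with a textbook idealization that is not part of this study: a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n − 1)·c / (4L). The following minimal sketch assumes a speed of sound of 350 m/s and two hypothetical tract lengths (17.5 cm and 18.5 cm). Real vocal tracts are not uniform tubes, so the numbers only indicate the direction of the effect.

```python
def tube_resonances(length_m, n_resonances=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_resonances + 1)]

# Hypothetical lengths: a neutral tract and one elongated by 1 cm
# (for example by a lowered larynx); both values are assumptions.
neutral = tube_resonances(0.175)
lowered = tube_resonances(0.185)

for f_neutral, f_lowered in zip(neutral, lowered):
    print(f"{f_neutral:7.1f} Hz -> {f_lowered:7.1f} Hz")
```

In this idealization every resonance falls when the tube becomes longer, which is the direction of change described for a lowered larynx.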

However, changes of vocal tract shape and associated resonances could also have another effect on the voice source. From non-linear dynamic theory, it could be expected that the vocal tract interacts with the flow pulse [43,44] and/or the vocal fold oscillations [44,45]. Since the SPL is related to the transglottal airflow depending on transglottal pressure difference [27,32], a reverse effect of the vocal tract configuration on sound pressure level could be expected.

Due to the increased noise exposure there is very little articulatory data of the vocal tract achieved by MRI which reflects different loudness or sound pressure conditions. In their early MRI studies, Neuschäfer-Rube et al. [46] were not able to observe great differences concerning the vertical laryngeal position in relation to different loudness conditions. However, the same research group found changes in the oro-pharyngeal part of the vocal tract [47]. Since the MRI noise was not cancelled in these early studies and the perceptual term loudness was not objectified by means of SPL evaluations, the accuracy of the task performed by the subjects could not be verified. Furthermore, at the time of their studies, no dynamic real-time MRI recordings were possible.

In conclusion, the role of the vocal tract in relation to loudness and SPL remains unclear. This study aims to analyse vocal tract shape differences associated with different loudness conditions in the singing voice of professional singers. It is hypothesized that there are articulatory differences for various loudness conditions, which would imply that professional singers modify vocal tract resonances to achieve loudness.

II. Material and Methods

After approval from the local ethics committee (Freiburg University Hospital Ethical committee, Nr. 206/09), 12 subjects were included in this study. All subjects gave their written informed consent. All subjects are professional, Western classically trained operatic singers. None of the subjects had voice complaints, and vocal fold pathologies were excluded by means of videostroboscopy and/or high-speed digital imaging. A list of the subjects with their respective voice classification and taxonomy is provided in Table 1.

Table 1. Subjects, voice classification and taxonomy according to Bunch and Chapman [61].

The subjects were asked to sing an ascending diatonic major scale on the vowel /a/ (baritones: G3–E4 (196–330 Hz), tenors: C4–A4 (262–440 Hz), mezzo-sopranos: G3–E4 (196–330 Hz), sopranos: A4–A5 (440–880 Hz)). The rather low pitch for the mezzo-soprano voice was chosen in order to make the articulatory data comparable to those of male professional altos analysed in a previous investigation [17]. Each pitch was to be sustained for approximately one second. This scale was repeated in three different loudness conditions, always in the same order: (1) comfortable loudness (mf), (2) soft loudness (pp), and (3) great loudness (ff). Although this fixed order might have affected the results, it was chosen, firstly, to present both the pp and the ff data relative to the comfortable loudness condition and, secondly, to exclude the possibility that soft phonation might be altered by heavy vocal loading after the ff task.
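The frequencies quoted for the scale boundaries are consistent with equal temperament at A4 = 440 Hz. The text does not state the tuning reference, so that is an assumption of the following minimal sketch, which converts natural note names to frequencies. The helper name `note_to_hz` is illustrative only.

```python
# Semitone offsets of the natural notes within an octave (C = 0)
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_hz(note, a4_hz=440.0):
    """Convert a natural note name such as 'G3' to Hz (equal temperament)."""
    name, octave = note[0], int(note[1:])
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]  # MIDI note number, A4 = 69
    return a4_hz * 2.0 ** ((midi - 69) / 12.0)

for note in ("G3", "C4", "E4", "A4", "A5"):
    print(note, round(note_to_hz(note)), "Hz")
```

Rounded to whole hertz, these values reproduce the boundaries given for the four voice classifications.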

Comparable to our previous studies [8–10,15,19], the images display the mid-sagittal plane. The recordings were performed using the 3.0 T TIM TRIO (Siemens, Germany) MRI device with the subject in the supine position at 25 frames per second [11,48]. Therefore, each pitch of the diatonic scale was represented by approximately 25 frames. The exact MRI parameters used are provided in the study by Burdumy et al. [48].

Also in analogy with our previous experiments the audio signal was recorded with an optical microphone system (CONFON HP-SI 01, MR confon GmbH, Magdeburg, Germany), that included two microphone recordings (one recording vocalizations and background MRI scanner noise and one recording scanner noise only). During the recording the subjects were provided with the audio signal over headphones as acoustic feedback. Two experts (both otolaryngologists and singers) checked if the subjects performed the desired task. Only sequences found acceptable by the experts as well as by the singers themselves were subsequently analysed.

In order to verify that the intended perceptual loudness condition was fulfilled by the subjects, the audio recordings of each task were analyzed with respect to the SPL after noise cancellation. The first step in cancelling the scanner noise was performed automatically using dedicated software (Digital Audio Presentation Center, CONFON DAP-center mkII+, MR confon GmbH, Magdeburg, Germany). After this first noise cancellation the signal of the scanner noise was still present in most audio signals. In order to cancel this scanner noise, a part of the audio recording was analyzed where only the scanner noise but not the voice was present. This part was marked as reference noise and cancelled throughout the entire audio file using the Adobe Audition Software (Adobe Systems, San Jose, CA, USA). All tasks (pp, mf, ff, respectively) were filtered using the exact same filter properties. Fig 1 shows long time average spectra and spectra for a single task recording after the first cancellation by the Confon system and after the second noise cancellation procedure using the Adobe software. Both related audio files (S1 Audio and S2 Audio) are also provided as supplementary material. The SPL measurement was performed using the Praat software (University of Amsterdam).
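For orientation, a sound pressure level is a logarithmic measure of the root-mean-square pressure relative to a reference. The minimal sketch below is not the Praat procedure used in the study. It assumes calibrated samples in pascals and the conventional 20 µPa reference, and it uses a synthetic tone in place of a recording.

```python
import math

P_REF = 20e-6  # conventional reference sound pressure in Pa (assumption)

def spl_db(pressure_samples):
    """Sound pressure level in dB re 20 µPa from samples given in pascals."""
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    return 10.0 * math.log10(mean_square / P_REF ** 2)

# Synthetic 220 Hz tone with 0.02 Pa amplitude: one second sampled at 8 kHz
tone = [0.02 * math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
louder_tone = [2.0 * p for p in tone]  # doubled pressure amplitude

print(round(spl_db(tone), 2), round(spl_db(louder_tone), 2))
```

Doubling the pressure amplitude raises the level by about 6 dB regardless of the absolute calibration, which is why a systematic filtering error affects absolute values more than differences between tasks.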

Fig 1. Waveform (upper row) and audio spectrum (middle row) for the file after the first noise cancellation (Confon, left panel) and after the second noise cancellation (Adobe, right panel).

The waveforms include all three loudness tasks (mf = mezzoforte, pp = pianissimo, ff = fortissimo, respectively). The lowest row shows Long Time Average Spectra (LTAS) for both files, after the first noise cancellation (blue) and the second noise cancellation (red), respectively.

In each frame of the MRI material a series of measures was taken (Fig 2), as described in previous studies [8–11,15]: lip and jaw opening, height of the tongue, jaw protrusion, pharynx width, uvula and larynx position. Mean values were calculated for each pitch (mean values across approximately 25 single values per pitch) and related to ƒo.
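The per-pitch averaging can be sketched as a simple grouping step. The data layout (one pitch label and one value in mm per frame) and the numbers are hypothetical. In the study roughly 25 frames contribute to each pitch.

```python
from collections import defaultdict
from statistics import mean

def mean_per_pitch(frames):
    """Average a per-frame measure (mm) over all frames that share a pitch."""
    grouped = defaultdict(list)
    for pitch, value_mm in frames:
        grouped[pitch].append(value_mm)
    return {pitch: mean(values) for pitch, values in grouped.items()}

# Hypothetical lip-opening values (mm), one tuple per MR frame
frames = [("G3", 14.0), ("G3", 15.0), ("G3", 16.0), ("A3", 17.0), ("A3", 19.0)]
per_pitch = mean_per_pitch(frames)
print(per_pitch)
```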

Fig 2. Measured distances in each frame of the MRI material (red arrows), as described previously in detail [8–11,15].

The auxiliary lines are shown in green. LO (Lip Opening), JO (Jaw Opening), JP (Jaw Protrusion), HPT (Highest Point Tongue), UE (Uvula Elevation), LP (vertical Larynx Position).

In order to estimate the error, repeated measurements were performed for the sequence of subject 8 in loudness condition mf. The subject was chosen randomly. Here, in order to estimate intra-rater agreement the same investigator evaluated the MRI data twice, with some time between the measurements. Furthermore, a second investigator measured the same articulatory parameters of the same sequence independent of the first investigator, in order to determine the inter-rater agreement. The results showed great consistency for all measured articulatory parameters both for the intra-rater and inter-rater-reliability (Fig 3). Table 2 provides the average absolute deviation from the mean (MAD) which is given in mm and in percent for both repeated measurements.
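One plausible reading of the average absolute deviation from the mean for two repeated measurements is sketched below. The exact computation used for Table 2 is not spelled out in the text, so the pairing per frame, the percentage relative to the pairwise mean, and the example values are all assumptions.

```python
def mad_repeated(first, second):
    """Average absolute deviation from the pairwise mean, in mm and in percent."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty rating series")
    dev_mm, dev_pct = [], []
    for a, b in zip(first, second):
        pair_mean = (a + b) / 2.0
        deviation = abs(a - pair_mean)  # identical to abs(b - pair_mean)
        dev_mm.append(deviation)
        # Percent deviation assumes a non-zero mean (distances are positive)
        dev_pct.append(100.0 * deviation / pair_mean)
    n = len(dev_mm)
    return sum(dev_mm) / n, sum(dev_pct) / n

first_rating = [20.0, 22.0, 25.0]   # hypothetical jaw openings (mm)
second_rating = [21.0, 22.0, 24.0]  # hypothetical repeated measurement (mm)
mad_mm, mad_pct = mad_repeated(first_rating, second_rating)
print(f"MAD = {mad_mm:.2f} mm ({mad_pct:.2f} %)")
```

The same function would apply to both comparisons: first versus second rating by one investigator (intra-rater) and first investigator versus second investigator (inter-rater).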

Fig 3. Scatter plots for all articulatory measures for subject 8, mf task.

The original first rating is shown on the x-axis and the repeated measurements for the same rater (blue) and a different second rater (red), respectively, on the y-axis. The grey line refers to equivalence.

Table 2. Average absolute deviation from the mean (MAD) in mm and % for the intra-rater and inter-rater differences.

LO (Lip Opening), JO (Jaw Opening), JP (Jaw Protrusion), HPT (Highest Point Tongue), UE (Uvula Elevation), LP (vertical Larynx Position). The intra-rater and inter-rater comparisons refer to the single subject 8, loudness mf.

Statistical analysis

Statistical analyses were performed using SPSS 22 (SPSS Inc., Armonk, NY, USA). Independent analyses of variance (ANOVA) were used to investigate the differences between the three loudness conditions with respect to the measured articulatory parameters. Post-hoc tests (LSD) were performed on significant ANOVA results. The level of significance was set to p = 0.05. Correlations were reported as Pearson’s r coefficients. For the relation of pitch or ƒo and articulatory data, the relative scale degree was used. The scale degree has the advantage that it is insensitive to the fact that the scale used during the task was different between the voice classifications. The correlation was tested using the Kendall-Tau-b test.
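The two correlation coefficients named above can be sketched in plain Python. This is an illustration of the standard definitions, not a reproduction of the SPSS analysis: no p-values are computed, and the example values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties in either variable."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                 # tied in both: excluded from both factors
            elif dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    # Undefined (division by zero) if one variable is constant
    denom = math.sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom

scale_degree = [1, 2, 3, 4, 5, 6]
lip_opening_mm = [10.0, 12.0, 12.0, 15.0, 14.0, 18.0]  # hypothetical values
tau = kendall_tau_b(scale_degree, lip_opening_mm)
r = pearson_r(scale_degree, lip_opening_mm)
print(f"tau-b = {tau:.3f}, r = {r:.3f}")
```

Because the scale degree is an ordinal rank that ignores the different absolute pitches of the voice classifications, a rank-based coefficient such as tau-b is a natural fit for it.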

III. Results

There were articulatory differences concerning the measurements associated with both a change in pitch and a difference in loudness. Fig 4 shows an example for the pitches G3 (196 Hz), C4 (262 Hz) and E4 (330 Hz) for all three loudness conditions (mf, pp, ff, respectively) in a baritone singer (subject 10). The entire MRI video sequence is provided in the supplementary video material (S1 Video).

Fig 4. Representative mid-sagittal vocal tract profiles from a baritone (subject 10) for the different loudness tasks (mf = mezzoforte, upper row, pp = pianissimo middle row and ff = fortissimo, lowest row) for the pitches G3 (196Hz, left), C4 (262Hz, middle) and E4 (330Hz, right), respectively.

With rising pitch, lip opening and pharynx width were increased. Fig 5 presents these articulatory data with respect to ƒo. Furthermore, with rising pitch there was an elevation of the vertical laryngeal position (Fig 5). As shown in Table 3, there were differences between the various voice classifications: While sopranos and tenors showed many correlations of articulatory data with the scale degree, there were almost no such correlations for the mezzosoprano voices.

Fig 5. Mean values for the lip opening, jaw opening, pharynx width and larynx position, respectively, with respect to fundamental frequency (Hz).

The green color refers to the sopranos, blue to mezzosopranos, red to tenors and grey to baritones. The differences in color intensity reflect the loudness condition: The darker the color, the louder the voice.

Table 3. Kendall-Tau-b test for the description of the correlation between the scale degree and the articulatory data.

The table shows data for the different voice classifications, soprano, mezzosoprano, tenor and baritone, separately.

The lip opening and jaw opening are not independent articulators: in general, when the jaw showed a greater opening, the lips showed the same tendency (Pearson correlation: r = .663, p < .001, see Table 4). However, these articulators also show some degree of independence. Fig 6 shows the jaw opening versus the lip opening. When the lip openings corresponding to a given jaw opening were compared between the different loudness conditions, they were greater in the loud phonation task.

Fig 6. Jaw opening versus lip opening for the mean values across each fundamental frequency.

The green color refers to the sopranos, blue to mezzo-sopranos, red to tenors and grey to baritones. The symbols reflect the loudness condition: circles = mf, triangles = pp, squares = ff.

Table 4. Pearson-correlation which indicates correlation between different articulatory variables and with SPL as well as statistical significances for all articulatory data.

With regard to different loudness conditions, the evaluation of the SPL showed that the perceptual loudness was associated with statistically significant differences of SPL, i.e. that pp showed the lowest, mf a medium, and ff showed the greatest SPL (mean values for pp = 53.3dB, mf = 63.6dB, ff = 71.3 dB, p<0.001) (Fig 7). The SPL was increased with pitch (Table 3).
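To give a sense of scale, level differences in dB can be converted to sound pressure ratios with ratio = 10^(ΔL/20). The short sketch below applies this standard relation to the three mean values reported above. Nothing beyond those means is taken from the data.

```python
mean_spl_db = {"pp": 53.3, "mf": 63.6, "ff": 71.3}  # mean values reported above

def pressure_ratio(delta_db):
    """Sound pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (delta_db / 20.0)

for softer, louder in (("pp", "mf"), ("mf", "ff"), ("pp", "ff")):
    delta = mean_spl_db[louder] - mean_spl_db[softer]
    print(f"{softer} -> {louder}: {delta:.1f} dB, "
          f"pressure ratio {pressure_ratio(delta):.2f}")
```

The 18 dB span between the pp and ff means corresponds to roughly an eightfold difference in sound pressure.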

Fig 7. Mean values for sound pressure level (dB) with respect to fundamental frequency (Hz).

The green color refers to the sopranos, blue to mezzo-sopranos, red to tenors and grey to baritones. The differences in color intensity reflect the loudness condition: the darker the color, the louder the voice.

The mean values of the articulatory data showed statistically different courses with regard to loudness for the lip opening (p < 0.001), the jaw opening (p < 0.01), the pharynx width (p < 0.01) and the vertical laryngeal position (p < 0.05). Here, the lip opening, the jaw opening and the pharynx width increased with loudness (Fig 5). Furthermore, the vertical laryngeal position was found to be lower for the louder phonation tasks (Fig 5). However, the changes in the jaw protrusion, uvula elevation, and highest point of the tongue failed to reach statistical significance. In fact, concerning the uvula it was found that three of the subjects (subjects 5, 8 and 10) changed the position of the uvula in such a way that they opened their nasal cavities to the pharynx in the course of the tasks.

Since there were articulatory changes for both pitch and SPL (Fig 5), it is of interest how strongly these variables are correlated with the articulatory data. Table 3 shows the relation of the articulatory data with the scale degree and Table 4 with SPL. The correlation of the lip opening and the pharynx width with SPL was greater than their correlation with the scale degree.

IV. Discussion

This study analyzes the effect of different loudness conditions on articulatory data concerning professional singers' singing voices. In general, articulatory changes were found for both increasing loudness and rising pitch.

Articulation is considered an important factor which changes with pitch [10–12,19,49]. The presented data are in agreement with previous MRI-based studies which showed an increased lip opening, pharynx width and elevated vertical larynx position for rising pitches and ƒo [8–10]. Since a higher pitch and ƒo are frequently associated with a greater SPL and loudness [27], it could be expected that some of the articulatory data would also show a dependency on the SPL and loudness. However, the presented data also showed differences for the articulatory data when the same pitch in different loudness conditions was analyzed. Here, an increase of the SPL was associated with an opening of the lip and the jaw. In particular, the lip opening and pharynx width revealed a stronger correlation with the SPL than with the scale degree. Furthermore, although the lip and the jaw opening were in general strongly correlated, the lip opening corresponding to a given jaw opening was greater for the loudest task. It seems that the lip opening is sometimes additionally modified for the intended loudness condition. The greater lip and jaw opening cause a “trumpet shaped” vocal tract configuration. Furthermore, the increased jaw opening should raise the first resonance [27]. However, the effect of the lip opening on the first vocal tract resonance might be counteracted by the increased pharynx width. A greater pharynx width is likely to lower the first while increasing the second vocal tract resonance frequency [27,50]. The increased pharynx width could be related to the tongue position; however, the presented data failed to show a uniform tendency concerning the tongue height. Therefore, it could be expected that individual morphologies and singing techniques might also contribute to the articulation in this respect: most sopranos have smaller vocal tract dimensions than tenors or baritones [51], which might have influenced the data. This was the reason why the correlations of the articulatory data were compared to the relative scale degree instead of the absolute value of ƒo. In order to compensate for anatomical differences, it could be worthwhile to analyze the articulatory differences in relation to an individual baseline in future investigations, such as the vocal tract length and width at the resting expiratory level. Lastly, while a vertical rise of the larynx was correlated with a rising pitch, the larynx was found at a lower vertical position in the louder dynamic conditions. This finding is in agreement with Luchsinger and Arnold [40] and Shipp [41] but in contrast to Neuschäfer-Rube et al. [46]. An elongation of the vocal tract will lower the vocal tract resonance frequencies. This phenomenon is often referred to as covered voice, voix sombrée or gedecktes Singen. A lowering of the larynx will decrease the ratio of the epilaryngeal tube to the pharynx cross section and will therefore also contribute to a clustering of resonances 3–5 [36]. The contrast of the presented data to Neuschäfer-Rube et al. [46] might be related to the fact that not all singers use formant/resonance clustering or that they accept timbre changes associated with an elongation of the vocal tract.

The control of the SPL and loudness is one of the most important parameters of voice production [27,32]. Since an increase of the subglottic pressure as a main factor for the increase of the SPL [27] is associated with an increase of the vocal fold impact stress and connected risk of vocal overuse [52], other strategies such as modification of the phonation type and vocal tract resonances might be of importance in order to increase the SPL and/or loudness. In this respect, the presented data show that the analyzed professional singers show differences in the vocal tract shape with regard to loudness, which might be due to resonatory strategies. In fact, much effort is given during singing voice education in order to learn such resonatory strategies. Therefore, the presented data might reflect such a training effect which might help the singers to produce a louder voice on stage without highly increasing the risk of overuse. In order to verify such a training effect, it seems relevant to analyze also untrained singing voices. However, up to now the MRI recording circumstances (supine position, singing in a tube, noisy surrounding etc.) still prevent a valid data acquisition and analysis for such untrained subjects. Furthermore, resonatory strategies might also be of importance for speech. In contrast to the singing voice, vowel production during speech is much faster and changing quickly due to prosody. Also, consonants are included in speech. As a consequence, it seems of interest for future investigations to analyze if the observed articulatory differences for the singing voice could also be detected in speech.

The presented data showed some differences concerning the vocal classifications. Compared to mezzos, particularly sopranos revealed stronger articulatory changes with rising pitch. In this respect, it should be noted here that these data are related to the vowel /a/ because it is considered as the vowel with the highest first formant [27]. Since articulatory changes are expected when ƒo reaches the region of the first formant [9,10,12,13] the vowel /a/ should be unproblematic for ƒos up to 700 Hz. As a consequence, in the presented experiment, the observed articulatory differences should be independent of the vowel condition for most voice classifications. It seems likely that other effects could be observed by using vowel conditions exhibiting lower first formant frequencies, such as /i/ or /u/. In the presented experiment only the sopranos were performing a task which reaches a ƒo greater than 700Hz. As a consequence, articulatory changes observed for these high pitches might also be related to the fact that singers tend to avoid the situation that ƒo crosses the first vocal tract resonance [28].

In previous studies it was found that register shifts from modal to falsetto were associated with only minor vocal tract shape changes, while the maintenance of the stage voice above the passaggio showed strong articulatory movements when reaching high pitches [8,10,15]. Since the falsetto is often associated with a lower SPL in comparison to the stage voice, part of the articulatory differences with respect to registers might also be related to loudness.

There are other limitations to the presented study. The subjects were asked to perform the task in different musically defined loudness conditions. Loudness is a subjective term. Although the examiners agreed that the different loudness conditions were fulfilled by the subjects, such perceptual evaluation should be considered problematic due to the noisy surroundings. Therefore, in the present experiment an attempt was made to measure the SPL after noise cancellation. Here, for every subject it was found that for every pitch pp was associated with the lowest, mf with medium, and ff with the greatest SPL, respectively. Since noise reduction was performed by means of dedicated software the noise cancellation process may also have changed the voice signal. Furthermore, since there are other approaches for noise cancellation [53,54] it cannot be excluded that other systems would show differences concerning the absolute values of the SPL. However, the same filter conditions were applied to all three of the tasks (pp, mf, and ff, respectively) and the error should only be of a systematic nature. Furthermore, as can be observed from the audio-files, the noise cancelled sound was considered perceptually quite satisfactory. There is another limitation concerning the presented evaluation of the SPL: the recordings were performed in a noisy environment which was only in part diminished by using ear-protective headphones. Although singers might be rather indifferent to noise by their professional training, it cannot be totally excluded that these classically trained subjects raised their sound pressure level with increasing noise, a phenomenon which is often denoted as the Lombard effect [55,56].

The recordings were performed in a supine position. Although Traser et al. [57] found nearly no articulatory differences concerning vocal tract shapes while singing in upright versus supine position in a study with professional singers, the authors found a higher vertical larynx position for the supine position. Therefore, it could be expected that the vertical laryngeal position could be even lower for singing in the (more natural) upright position.

The task for the mezzo sopranos was chosen at a rather low pitch in order to achieve comparability to male altos analyzed in a previous study [17]. Here, articulatory changes with respect to pitch showed comparable tendencies to male altos when these subjects maintained their modal register, avoiding changes to their stage falsetto. Only the pharynx width which was much increased for the male altos during rising pitch showed no such tendency for the mezzos. Due to the relatively low pitch it cannot be excluded that mezzo sopranos would show different articulatory behavior at higher pitches.

The presented data refer to classically trained singers. However, there are other professional singers singing in non-classical styles: it was shown in a pilot study that vocal tract shape might differ between a more classical way of singing and belting in Musical Theatre singing [19]. Frequently, belting is associated with a greater SPL [58–60]. How much articulatory data are influenced by loudness and SPL in this special style of phonation should be clarified by further research.

V. Conclusions

Loudness has an effect on articulation in professional singers' singing voices for the vowel /a/, which should be considered when articulatory vocal tract data are interpreted. Whether the observed effects are vowel-dependent and whether loudness also contributes to articulatory data in speech are matters for future research.

Supporting Information

S1 Audio. Audio before the Adobe Audition filtering process.


S2 Audio. Audio after the Adobe Audition filtering process.


S1 Video. Subject 10 singing the task in mf, pp, and ff.



Acknowledgments

The authors thank Horst Urbach, MD, and Hansjörg Mast for their help in performing the MRI recordings. Furthermore, the authors thank Manfred Nusseck, PhD, for help in statistical analysis, Peter Pabon for help regarding the noise cancellation and Helena Daffern, PhD, for native-speaker corrections.

Author Contributions

Conceived and designed the experiments: ME LT BR. Performed the experiments: ME FB LT MB BR. Analyzed the data: ME FB LT MB. Contributed reagents/materials/analysis tools: FB MB. Wrote the paper: ME FB MB LT BR.


References

  1. Echternach M, Markl M, Richter B (2012) Dynamic Real-Time Magnetic Resonance Imaging For The Analysis Of Voice Physiology. Curr Opin Otolaryngol Head Neck Surg 20: 450–457. pmid:23086261
  2. Fu M, Zhao B, Carignan C, Shosted RK, Perry JL, Kuehn DP et al. (2015) High-Resolution Dynamic Speech Imaging With Joint Low-Rank And Sparsity Constraints. Magn Reson Med 73: 1820–1832. pmid:24912452
  3. Xing F, Woo J, Murano EZ, Lee J, Stone M, Prince JL (2013) 3D Tongue Motion From Tagged And Cine MR Images. Med Image Comput Comput Assist Interv 16: 41–48. pmid:24505742
  4. Woo J, Stone M, Suo Y, Murano EZ, Prince JL (2014) Tissue-Point Motion Tracking In The Tongue From Cine MRI And Tagged MRI. J Speech Lang Hear Res 57: S626–S636. pmid:24686470
  5. Raol N, Sagar P, Nimkin K, Hartnick CJ (2015) New Technology: Use Of Cine MRI For Velopharyngeal Insufficiency. Adv Otorhinolaryngol 76: 27–32. pmid:25733229
  6. Nunthayanon K, Honda E, Shimazaki K, Ohmori H, Inoue-Arai MS, Kurabayashi T et al. (2015) Use Of An Advanced 3-T MRI Movie To Investigate Articulation. Oral Surg Oral Med Oral Pathol Oral Radiol 119: 684–694. pmid:25956219
  7. Ahmad M, Dargaud J, Morin A, Cotton F (2009) Dynamic MRI Of Larynx And Vocal Fold Vibrations In Normal Phonation. J Voice 23: 235–239. pmid:18082366
  8. Echternach M, Sundberg J, Arndt S, Breyer T, Markl M, Schumacher M et al. (2008) Vocal Tract And Register Changes Analysed By Real Time MRI In Male Professional Singers: A Pilot Study. Logoped Phoniatr Vocol 33: 67–73. pmid:18569645
  9. Echternach M, Sundberg J, Arndt S, Markl M, Schumacher M, Richter B (2010) Vocal Tract In Female Registers: A Dynamic Real-Time MRI Study. J Voice 24: 133–139. pmid:19185452
  10. Echternach M, Traser L, Richter B (2014) Vocal Tract Configurations In Tenors' Passaggio In Different Vowel Conditions: A Real-Time Magnetic Resonance Imaging Study. J Voice 28: 262.
  11. Echternach M, Birkholz P, Traser L, Flügge T, Kamberger R, Burk F et al. (2015) Articulation And Vocal Tract Acoustics At Soprano's High Fundamental Frequencies. J Acoust Soc Am 137: 2586–2595. pmid:25994691
  12. 12. Sundberg J (2009) Articulatory Configuration And Pitch In A Classically Trained Soprano Singer. J Voice 23: 546–551. pmid:18504111
  13. 13. Bresch E, Narayanan S (2010) Real-Time Magnetic Resonance Imaging Investigation Of Resonance Tuning In Soprano Singing. J Acoust Soc Am 128: EL335–EL341. pmid:21110548
  14. 14. Titze IR, Baken RJ, Bozeman KW, Granqvist S, Henrich N, Herbst CT et al. (2015) Toward A Consensus On Symbolic Notation Of Harmonics, Resonances, And Formants In Vocalization. J Acoust Soc Am 137: 3005–3007. pmid:25994732
  15. 15. Echternach M, Sundberg J, Markl M, Richter B (2010) Professional Opera Tenors`Vocal Tract Configurations In Registers. Folia Phoniatr Logop 62: 278–287. pmid:20588050
  16. 16. Niebergall A, Zhang S, Kunay E, Keydana G, Job M, Uecker M et al. (2013) Real-Time MRI Of Speaking At A Resolution Of 33 Ms: Undersampled Radial FLASH With Nonlinear Inverse Reconstruction. Magn Reson Med 69(2):477–85. pmid:22498911
  17. 17. Echternach M, Traser L, Markl M, Richter B (2011) Vocal Tract Configurations In Male Alto Register Functions. J Voice 25: 670–677. pmid:21402469
  18. 18. Proctor M, Bresch E, Byrd D, Nayak K, Narayanan S (2013) Paralinguistic Mechanisms Of Production In Human "Beatboxing": A Real-Time Magnetic Resonance Imaging Study. J Acoust Soc Am 133: 1043–1054. pmid:23363120
  19. 19. Echternach M, Popeil L, Traser L, Wienhausen S, Richter B (2014) Vocal Tract Shapes In Different Singing Functions Used In Musical Theater Singing-A Pilot Study. J Voice 28: 653. S0892-1997(14)00013-7 [Pii];
  20. 20. Echternach M, Markl M, Richter B (2011) Vocal Tract Configurations In Yodelling—Prospective Comparison Of Two Swiss Yodeller And Two Non-Yodeller Subjects. Logoped Phoniatr Vocol 36: 109–113. pmid:21417599
  21. 21. Herbst CT, Hess M, Muller F, Svec JG, Sundberg J (2015) Glottal Adduction And Subglottal Pressure In Singing. J Voice 29: 391–402. S0892-1997(14)00164-7 [pii]; pmid:25944295
  22. 22. Bjorklund S, Sundberg J (2015) Relationship Between Subglottal Pressure And Sound Pressure Level In Untrained Voices. J Voice. S0892-1997(15)00048-X [pii];
  23. 23. Bouhuys A, Mead J, Proctor D, Stevens K (1968) Pressure Flow Events During Singing. Annals Of The American Academy Of Science 155: 165–176.
  24. 24. Isshiki N (1964) Regulatory Mechanism Of Voice Intensity Variation. J Speech Hear Res 7: 17–29. pmid:14130741
  25. 25. Zanartu M, Galindo GE, Erath BD, Peterson SD, Wodicka GR, Hillman RE (2014) Modeling The Effects Of A Posterior Glottal Opening On Vocal Fold Dynamics With Implications For Vocal Hyperfunction. J Acoust Soc Am 136: 3262. pmid:25480072
  26. 26. Sundberg J, Gauffin J (1979) Waveform And Spectrum Of The Glottal Voice Source. In: Lindblom B, Öhman S, Editors. Frontiers Of Speech Communication Research. London: Academic Press. pp. 301–320.
  27. 27. Sundberg J. (1987) The Science Of The Singing Voice. Northern Illinois University Press.
  28. 28. Sundberg J (1975) Formant Technique In A Professional Female Singer. Acustica 32: 89–96.
  29. 29. Sundberg J, La FM, Gill BP (2013) Formant Tuning Strategies In Professional Male Opera Singers. J Voice 27: 278–288. S0892-1997(12)00209-3 [pii]; pmid:23453594
  30. 30. Joliveau E, Smith J, Wolfe J (2004) Acoustics: Tuning Of Vocal Tract Resonance By Sopranos. Nature 427: 116. pmid:14712266
  31. 31. Garnier M, Henrich N, Smith J, Wolfe J (2010) Vocal Tract Adjustments In The High Soprano Range. J Acoust Soc Am 127: 3771–3780. pmid:20550275
  32. 32. Titze I. R. (1994) Principles Of Voice Production. Prentice Hall, NJ.
  33. 33. Sundberg J, La FM, Gill BP (2011) Professional Male Singers' Formant Tuning Strategies For The Vowel /A/. Logoped Phoniatr Vocol.
  34. 34. Henrich BN, Smith J, Wolfe J (2014) Vocal Tract Resonances In Singing: Variation With Laryngeal Mechanism For Male Operatic Singers In Chest And Falsetto Registers. J Acoust Soc Am 135: 491–501. pmid:24437789
  35. 35. Mainka A, Poznyakovskiy A, Platzek I, Fleischer M, Sundberg J, Murbe D (2015) Lower Vocal Tract Morphologic Adjustments Are Relevant For Voice Timbre In Singing. Plos One 10: E0132241. PONE-D-14-56733 [pii]. pmid:26186691
  36. 36. Sundberg J (1974) Articulatory Interpretation Of The "Singing Formant". J Acoust Soc Am 55: 838–844. pmid:4833080
  37. 37. Stathopoulos ET, Sapienza C (1993) Respiratory And Laryngeal Function Of Women And Men During Vocal Intensity Variation. J Speech Hear Res 36: 64–75. pmid:8450666
  38. 38. Hirano M, Vennard W, Ohala J (1970) Regulation Of Register, Pitch And Intensity Of Voice. An Electromyographic Investigation Of Intrinsic Laryngeal Muscles. Folia Phoniatr (Basel) 22: 1–20.
  39. 39. Hirano M (1988) Vocal Mechanisms In Singing: Laryngological And Phoniatric Aspects. J Voice 2: 51–69.
  40. 40. Luchsinger R, Arnold G (1970) Methoden Der Stimmuntersuchung—Röntgenuntersuchungen. In: Luchsinger R, Arnold G, Editors. Handbuch Der Stimm- Und Sprachheilkunde. Wien, New York: Springer. pp. 125–136.
  41. 41. Shipp T (1984) Effects Of Vocal Frequency And Effort On Vertical Laryngeal Position. Journal Of Research In Singing 7: 5.
  42. 42. Sonninen A (1956) The Role Of The External Laryngeal Muscles In Length-Adjustment Of The Vocal Cords In Singing; Phoniatric, Roentgenologic And Experimental Studies Of The Mechanism Of Pitch Change In The Voice With Special Reference To The Function Of The Sternothyroid. Acta Otolaryngol Suppl 130: 1–102. pmid:13381445
  43. 43. Rothenberg M (1981) Acoustic Interaction Between The Glottal Source And The Vocal Tract. In: Stevens KN, Hirano M, Editors. Vocal Fold Physiology. Tokio. pp. 305–328.
  44. 44. Titze IR (2008) Nonlinear Source-Filter Coupling In Phonation: Theory. J Acoust Soc Am 123: 2733–2749. pmid:18529191
  45. 45. Zanartu M, Mehta DD, Ho JC, Wodicka GR, Hillman RE (2011) Observation And Analysis Of In Vivo Vocal Fold Tissue Instabilities Produced By Nonlinear Source-Filter Coupling: A Case Study. J Acoust Soc Am 129: 326–339. pmid:21303014
  46. 46. Neuschaefer-Rube C, Wein B, Angerstein W, Klajman S (1996) [MRI Examination Of Laryngeal Height During Vowel Singing]. Folia Phoniatr Logop 48: 201–209. pmid:8924947
  47. 47. Wein B, Neuschaefer-Rube C, Angerstein W, Klajman S, Gunther RW (1995) [Functionally Determined Area Changes In The Oro-Pharyngo-Laryngeal Vocal Tract Of Singers As Shown By Magnetic Resonance Tomography]. Rofo 162: 99–103. pmid:7881093
  48. 48. Burdumy M, Traser L, Richter B, Echternach M, Korvink JG, Hennig J et al. (2015) Acceleration Of MRI Of The Vocal Tract Provides Additional Insight Into Articulator Modifications. J Magn Reson Imaging.
  49. 49. Bresch E, Nielsen J, Nayak K, Narayanan S (2006) Synchronized And Noise-Robust Audio Recordings During Realtime Magnetic Resonance Imaging Scans. J Acoust Soc Am 120: 1791–1794. pmid:17069275
  50. 50. Fant, G. (1960) Acoustic Theory Of Speech Production. The Hague: Mouton. 51.
  51. 51. Roers F., Murbe D., & Sundberg J. (2009). Voice classification and vocal tract of singers: a study of x-ray images and morphology. J Acoust Soc Am 125: 503–512. pmid:19173435
  52. 52. Horacek J, Laukkanen AM, Sidlof P, Murphy P, Svec JG (2009) Comparison of accelerationand impact stress as possible loading factors in phonation: a computer modeling study. Folia Phoniatr Logop 61: 137–145. pmid:19571548
  53. 53. Inouye JM, Blemker SS, Inouye DI (2014) Towards Undistorted And Noise-Free Speech In An MRI Scanner: Correlation Subtraction Followed By Spectral Noise Gating. J Acoust Soc Am 135: 1019–1022. pmid:24606243
  54. 54. Zvyagintsev M, Klasen M, Mathiak KA, Weber R, Edgar JC, Mathiak K (2010) Real-Time Noise Cancellation For Speech Acquired In Interactive Functional Magnetic Resonance Imaging Studies. J Magn Reson Imaging 32: 705–713. pmid:20815071
  55. 55. Lombard E (1911) Le Signe De L'élévation De La Voix. Annales Des Maladies De L'oreille, Du Larynx, Du Nez Et Du Pharynx 37: 101–119.
  56. 56. Garnier M, Henrich N, Dubois D (2010) Influence Of Sound Immersion And Communicative Interaction On The Lombard Effect. J Speech Lang Hear Res 53: 588–608. 1092–4388_2009_08–0138 [pii]; pmid:20008681
  57. 57. Traser L, Burdumy M, Richter B, Vicari M, Echternach M (2013) The Effect Of Supine And Upright Position On Vocal Tract Configurations During Singing—A Comparative Study In Professional Tenors. J Voice 27: 141–148. S0892-1997(12)00194-4 [pii]; pmid:23380394
  58. 58. Osborne C (1979) The Broadway Voice, Part I: Just Singin`In The Pain. Hi-Fidelity 29: 57–65.
  59. 59. Estill J (1988) Belting And Classic Voice Quality: Some Physiological Differences. Med Probl Perform Art 3: 37–43.
  60. 60. Sundberg J, Gramming P, Lovetri J (1993) Comparisons Of Pharynx, Source, Formant, And Pressure Characteristics In Operatic And Musical Theatre Singing. J Voice 7: 301–310. pmid:8293062
  61. 61. Bunch M, Chapman J (2000) Taxonomy Of Singers Used As Subjects In Scientific Research. J Voice 14: 363–369. pmid:11021503