Laryngeal evidence for the first and second passaggio in professionally trained sopranos

Introduction Due to a lack of empirical data, the current understanding of the laryngeal mechanics in the passaggio regions (i.e., the fundamental frequency ranges where vocal registration events usually occur) of the female singing voice is still limited. Material and methods In this study the first and second passaggio regions of 10 professionally trained female classical soprano singers were analyzed. The sopranos performed pitch glides from A3 (ƒo = 220 Hz) to A4 (ƒo = 440 Hz) and from A4 (ƒo = 440 Hz) to A5 (ƒo = 880 Hz) on the vowel [iː]. Vocal fold vibration was assessed with trans-nasal high speed videoendoscopy at 20,000 fps, complemented by simultaneous electroglottographic (EGG) and acoustic recordings. Register breaks were perceptually rated by 12 voice experts. Voice stability was documented with the EGG-based sample entropy. Glottal opening and closing patterns during the passaggi were analyzed, supplemented with open quotient data extracted from the glottal area waveform. Results In both the first and the second passaggio, variations of vocal fold vibration patterns were found. Four distinct patterns emerged: smooth transitions with either increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact. Audible register transitions (in both the first and second passaggi) generally coincided with higher sample entropy values and higher open quotient variance through the respective passaggi. Conclusions Noteworthy vocal fold oscillatory registration events occur in both the first and the second passaggio even in professional sopranos. The respective transitions are hypothesized to be caused by either (a) a change of laryngeal biomechanical properties; or by (b) vocal tract resonance effects, constituting level 2 source-filter interactions.



Introduction
The frequency range of the singing voice is not a seamless continuous domain. Instead, at certain fundamental frequencies the voice quality may change abruptly. Vocal frequency regions with similar sound characteristics are commonly referred to as vocal registers [1], and abrupt shifts from one register to another are frequently called register breaks. Although vocal registers were described empirically and analyzed physiologically as early as 1835 [2], they are not yet fully understood. To date, there is still no complete consensus regarding the terminology of singing voice registers, particularly concerning their number and definition [1,3–15]. It is assumed that perceptual differences between registers could be related to differences in the activity of laryngeal muscles [2,10,13,16], differences in vocal tract resonances [12,17–20], interactions of the subglottal resonances with the voice source [21], or interactions of vocal tract resonances with the voice source [22–25]. Register changes are frequently accompanied by acoustic variations [12,26–28] and fundamental frequency jumps [29,30]. In these cases, the sudden change of vocal fold oscillation patterns in untrained voices is often assumed to be a consequence of non-linear biomechanical properties of the vocal folds and/or interactions with the supra- or subglottal vocal tracts [22,24,31]. In contrast to untrained voices, western classically trained singers are largely able to avoid such perceptual register differences [32]. However, the mechanisms that prevent biomechanical instabilities are still not understood in detail.
In male voices, most attention is directed towards the transition from modal (or chest) register to the falsetto register. In particular, it was shown for male voices that vocal fold oscillatory patterns might change in the fundamental frequency (ƒo) range where registration events typically occur [2,13,33], often denoted as the passaggio region. In addition to changes of vocal fold vibratory patterns, articulatory adjustments play an important role in the passaggio. For example, while the vocal tract shape remains nearly stable as singers advance from the modal register to the unmodified falsetto register, professional male singers introduce considerable changes into their vocal tract geometry when transitioning into their upper stage voice registers (stage falsetto for professional male altos and stage voice above the passaggio for tenors, respectively) [34–37]. This suggests that vocal tract changes could be used as a stabilizing factor for professional singing across the male passaggio.
The vocal registers of female voices are less well understood. Some authors suggest that female singers not only have (apart from the fry register) a first passaggio (also called primo passaggio) from modal or chest register to a middle or head register, but also a second passaggio (secondo passaggio) from middle or head register to an upper register [6,12,14,38–41]. Unfortunately, the terminology used to describe these registers is inconsistent and poorly defined. Furthermore, there might be subdivisions within registers. Based on electromyographic data, Kochis-Jennings et al. [10] propose that thyroarytenoid muscle activity might differ among what the authors denote as "chest", "chestmix", "headmix", and "head" register. Herbst et al. found that the degree of adduction of both the cartilaginous and membranous portions of the vocal folds could be controlled independently, and that such control could lead to production of different timbres within different registers [42]. As a consequence, the degree of adduction of the vocal process might also contribute to the differences observed in female voices [10]. Some singers have also been seen to produce a whistle register above 1000 Hz, suggesting the presence of a third passaggio [38,39,43,44]. The female first passaggio is often assumed to be caused by a change in vocal fold oscillation patterns [6,29,39,45]. It has been shown that both the transglottal flow pulse and the electroglottographic (EGG) signal change through the first passaggio, resulting in increased open quotients [29,45]. Furthermore, radiologic studies have revealed that the distance between the arytenoid and the thyroid cartilages changes within the first passaggio [41]. Also, the sound spectrum has been found to differ between the modal and middle register, exhibiting a stronger first harmonic in the middle register, as compared to the second harmonic [46].
The ƒo region where this register shift occurs is located only slightly above the region of the respective register shift from modal to falsetto register in males [6,12,40,41]. Although there is some evidence for changes of vocal fold oscillation patterns in this passaggio [6,47], there are only a few studies analysing vocal fold vibration. In 1960, Rubin and Hirt [48] found vocal fold oscillatory differences between what the authors denoted as chest and falsetto for their female singers. Svec et al. [39] analysed this passaggio in a single untrained subject and observed a decrease of arytenoid adduction for the middle or head register as compared to the modal register. Furthermore, it was found that the videokymographically derived closed quotient was decreasing from the middle to the head register [39].
In contrast to the first passaggio, empirical information on the second passaggio is still scarce and somewhat conflicting. While some authors suggest a resonatory phenomenon which occurs when ƒo reaches the first vocal tract resonance [38], other studies propose a vibratory phenomenon, suggesting that it is also possible that vocal fold oscillation patterns are altered in the second passaggio region [19,39].
Supporting the idea of resonatory phenomena, the ƒo region of the second passaggio often corresponds to the frequency region where vocal tract shape adjustments [40,49,50] and the resulting alterations of vocal tract resonances can often be observed (i.e., the so-called "formant tuning" [43,51,52]).
With reference to laryngeal adjustments, Garnier et al. [19] found a decrease of EGG amplitude and an increase of EGG open quotient between the pitches of G4 and D6. These changes in vocal fold vibration were rather gradual and not necessarily accompanied by acoustic changes induced by resonatory phenomena [19]. This important evidence notwithstanding, there is a lack of empirical data describing vocal fold oscillation patterns in the second passaggio. Garnier et al. [19] discuss only one single high speed videoendoscopy (HSV) recording acquired using rigid laryngoscopy at a limited frame rate of 2,000 fps in a single "non-expert" subject. They observed only minor variations of vocal fold vibration during the second passaggio. In contrast, using videokymography (i.e., assessment of vocal fold vibration along only one single line perpendicular to the glottal axis at around 8,000 fps [53]) Svec et al. [39] found an ƒo jump in their single untrained subject study at an ƒo of 650 Hz. Unfortunately, these noteworthy pilot examinations were limited both technically and in their number of subjects. Furthermore, both studies were conducted using rigid endoscopy, a method that forces the participants to introduce considerable changes into the configuration of their vocal tracts while phonating [19,39], potentially influencing the participants' habitual strategies throughout the passaggi.
Summarizing, the exact nature of the second passaggio is still not understood in detail. Furthermore, the vocal fold oscillation patterns through both passaggi have not been recorded nor analysed in detail, partly due to limitations of frame rates and spatial resolution in previous investigations utilizing endoscopy. It is thus the purpose of this study to analyse vocal fold oscillation patterns in a greater number of participants, using state of the art HSV equipment with a sufficiently high temporal and spatial resolution, employing flexible endoscopy in order to allow the habitual articulatory gestures of the participants. Because untrained singers might have problems in reaching higher pitches above the second passaggio, the study focuses on professional singers.

Participants and phonatory tasks
After approval from the Freiburg University Ethical Committee (nr. 380/12), ten professional female singers, all of them sopranos trained in classical singing, were included in this study. All subjects gave their written consent to participate in this study. The classification of the participants according to the Bunch and Chapman taxonomy [54] is shown in Table 1. In all of the participants, laryngoscopic examination prior to data acquisition revealed no signs of vocal fold pathology.
The participants were asked to perform two upward pitch glides, one from pitch A3 (ƒo = 220 Hz) to A4 (ƒo = 440 Hz) and another from A4 (ƒo = 440 Hz) to A5 (ƒo = 880 Hz) on the vowel [iː]. The two pitch glides cover the ƒo regions where the first and second passaggio, respectively, are typically found [6,12,19,29,39,40,55]. The vowel [iː] was chosen in order to ensure best visibility of the vocal folds, additionally preventing major gag reflexes due to increased pharynx width. The participants were asked to sing both pitch glides using their professional "stage voice" at comfortable loudness, theoretically avoiding major voice quality differences. Each glide was to be performed over a time period of approximately one second. The total number of acquired pitch glides was two per subject for ten subjects, a total of 20 glides.

Data acquisition
The data acquisition setup is described in detail in a previous publication [56]: Laryngeal endoscopy was performed trans-nasally using an ENF GP endoscope (Olympus, Hamburg, Germany) with a 38 mm C-mount adapter (Karl Storz, Tuttlingen, Germany) and a 300 W light source (Storz, Tuttlingen, Germany). Endoscopic laryngoscopy was recorded with a Fastcam SA-X2 high-speed video camera (Photron, Tokyo, Japan) operated at a frame rate of 20,000 frames per second and a spatial resolution of 386 x 320 pixels. No anaesthetic medication was given for the trans-nasal endoscopic approach.
Simultaneously with the HSV recording, the acoustic and EGG signals were recorded with an IMK SC 4061 microphone (DPA microphones, Alleroed, Denmark) and an EG2-PCX2 electroglottograph (Glottal Enterprises, Syracuse, NY, USA) using a data acquisition board (National Instruments, Austin, USA). The simultaneous recording of both HSV data and acoustic and EGG signals was performed using the PFV Viewer Software (Version 3660, Photron, Tokyo, Japan). As both the HSV camera and the data acquisition board were operated at a sampling frequency of 20,000 Hz, the PFV software allowed for time-synchronized acquisition of all signals. The accuracy of the synchronization was tested by simultaneous playback of a test signal consisting of TTL pulses to the data acquisition board and a blinking LED signal (to be acquired by the camera). Using this method, the accuracy was determined to be one frame, which is equivalent to 50 μs.

Perceptual rating
It could be expected that, in contrast to untrained voices, the professional singers participating in this study were able to avoid great sound quality differences during the pitch glides. In order to evaluate if a register transition was perceptually noticeable, the acoustic signals of both pitch glides per participant were played back in randomized order to 12 experts for a perceptual rating. These expert raters were either professors of singing (n = 2) or full time singing students at a German University of music with a minimum professional voice training of 4 years (n = 10). The experts were asked to rate the acoustic recordings (n = 10 subjects x 2 phonatory tasks = 20 ratings) on a scale from 1 (no perceivable register event) to 5 (maximum perceivable register event). For all raters, the rating was performed in the same room with the same headphones and the same loudness. In order to estimate the reliability of the ratings, the stimuli of one subject were provided twice in the set, and the Intra-class Correlation Coefficient (ICC [57]) of the raters was calculated. The averaged measured ICC was 0.85, indicating a good degree of rating consistency.

Data processing
All high-speed videos were subjected to three pre-processing steps, as described previously [56]: the honeycomb structure introduced by the optics of the flexible endoscope was removed using a frequency-selective filter in the Fourier domain; the acquired images were rotated to represent the glottal midline exactly vertically with respect to the image frame; last, the video was cropped to a region of interest containing the vibrating vocal folds. Then, glottal segmentation, i.e., semi-automatic extraction of the time-varying medio-lateral deflections of the vocal folds from the video footage, was performed using the Glottis Analysis Tools software (Denis Dubrovskiy and Michael Döllinger, FAU Erlangen-Nürnberg, Germany) [58]. The time-varying area of the glottis (i.e., the air space between the vibrating vocal folds, as seen from the top) was computed based on the glottal segmentation data, resulting in the glottal area waveform (GAW). The electroglottographic (EGG) signal is proportional to changes of the relative vocal fold contact area during vocal fold vibration [59]. It is thus well suited for documenting the vocal fold oscillatory effects of any potentially occurring register transitions or instabilities during the examined pitch glides. This was achieved by calculating the sample entropy of the cycle-separated EGG signal [60,61]. Sample entropy is defined [60,61] as "the negative natural logarithm of the conditional probability that two sequences similar for m points remain similar at the next point, where self-matches are not included in calculating the probability." The sample entropy was chosen over other irregularity measures because it is not sensitive to changes of ƒo when calculated using the cycle-separated (rather than the raw) EGG signal. This was verified by analysing synthesized stereotypical EGG signals representing the tasks of this study.
These synthesized EGG signals had durations of one second, and the ƒo was changed from 220 Hz to 440 Hz and from 440 Hz to 880 Hz, respectively, within an interval of 50 ms centred around a time offset of 0.5 s. There was no effect of the ƒo variation on the sample entropy. In this study, the cycle-based sample entropy was calculated based on the time series of the first two Discrete Fourier Transform components, termed "Fourier Descriptors", of the analyzed EGG signals (FDSEc). The respective calculations were performed with the algorithm developed by Selamtzis and Ternström [60], which is described in detail in the supplementary material S1 Text.
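The quoted definition of sample entropy can be illustrated with a minimal sketch. It implements the generic textbook form for a one-dimensional series and is not the Fourier-descriptor-based algorithm of Selamtzis and Ternström [60] used in this study. The function name, the template length m = 2 and the tolerance of 0.2 standard deviations are assumptions chosen for illustration only.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Generic sample entropy of a 1-D series (illustrative, not the FDSE algorithm).

    m: template length (assumed value); r: tolerance as a fraction of the
    standard deviation of the series (assumed value).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    tol = r * np.std(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths,
        # so that the two counts are directly comparable.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance to all later templates; self-matches are excluded.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)      # pairs of sequences similar for m points
    a = count_matches(m + 1)  # pairs that remain similar at the next point
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -np.log(a / b)
```

Under these assumptions a strictly periodic series yields a value near zero, whereas an irregular series yields a larger value, which is the property exploited when using sample entropy as an indicator of vibratory instability.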
In order to align all phonations of all participants at a temporal instant for each pitch glide signifying the moment of maximum change in the vocal fold vibration pattern, all analyzed EGG signals were divided into consecutive sequences of 25 ms, and the mean FD-based sample entropy was calculated for each of these segments. For the remainder of this manuscript, the term "window based FD sample entropy" (FDSEw) is used to denote this parameter. The EGG signals were chosen over the GAW signals since they had a better signal-to-noise ratio and did not suffer from potential effects of endoscope movement or rotation, thus being less susceptible to spurious sample entropy results. The segment with the maximum FDSEw value within each EGG signal for each pitch glide was determined. Centred at this segment, a total of 11 segments (denoted as windows -5 to 5; see below and figures), each having a duration of 25 ms, were considered for further analysis. For each of these analysis windows, the following parameters were computed: FDSEw, open quotient of the GAW signal (OQGAW, i.e., the relative, time-normalized duration of glottal opening per vocal fold vibratory cycle) and "between-window" variation thereof, and respective glottal opening and closing patterns.
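The windowing and alignment step described above can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the study: the function name, the use of per-cycle time stamps as input, and the handling of windows that would fall outside the recording are assumptions.

```python
import numpy as np

def centred_windows(times, values, win_dur=0.025, half_span=5):
    """Average a per-cycle measure in consecutive windows of win_dur seconds
    and select windows -half_span..+half_span around the window with the
    largest mean value.

    times:  time stamp of each glottal cycle in seconds (hypothetical input)
    values: per-cycle measure, e.g. a cycle-based sample entropy
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index of the window that each cycle falls into.
    idx = np.floor((times - times[0]) / win_dur).astype(int)
    n_win = idx.max() + 1
    means = np.full(n_win, np.nan)
    for w in range(n_win):
        in_win = idx == w
        if in_win.any():
            means[w] = values[in_win].mean()
    peak = int(np.nanargmax(means))  # window of maximum mean value
    selected = {}
    for offset in range(-half_span, half_span + 1):
        w = peak + offset
        if 0 <= w < n_win:  # windows beyond the recording are skipped
            selected[offset] = means[w]
    return peak, selected
```

The returned dictionary is keyed by the relative window number, so that window 0 always marks the segment of maximum mean value regardless of where it occurred within the glide.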
Glottal opening and closing patterns were determined with a novel custom algorithm devised and implemented in Python by author C.T.H. For each data point along the anterior-posterior glottal axis, the time-varying medio-lateral vocal fold displacement was assessed. The instants where the respective vocal fold trajectory diverged from and converged to zero, respectively (indicating beginning and termination of glottal opening at a certain offset along the anterior-posterior (A-P) glottal midline), were divided by the respective duration of each glottal cycle within each analysis window (-5 to 5) for each pitch glide. The resulting cycle-by-cycle data for each analysis window were averaged and plotted as a function of medio-lateral vocal fold displacement along the A-P axis and normalized intra-cycle time. The resulting graphs show the spatio-temporal opening and closing patterns of the vocal folds for all the eleven analysis windows. For each of these analysis windows, two contours leading from the anterior to the posterior ends of the vocal folds were plotted: The left contour (occurring earlier within the normalized intra-cycle temporal dimension) shows the pattern for glottal opening and the right contour that for glottal closing.
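The core idea of this analysis can be sketched for a single glottal cycle. The sketch below is not the authors' algorithm, whose details are not given here. It assumes that the segmentation yields a two-dimensional array of glottal width per A-P position and frame, and that a simple threshold marks the departure from and return to zero; the function name and the threshold are hypothetical.

```python
import numpy as np

def opening_closing_instants(width, threshold=0.0):
    """Normalized opening and closing instants along the A-P axis.

    width: array of shape (n_positions, n_frames) holding the glottal width
           over ONE glottal cycle, rows ordered from anterior to posterior
           (assumed data layout).
    Returns two arrays with values between 0 and 1; NaN marks positions
    where the glottis never opens during the cycle.
    """
    width = np.asarray(width, dtype=float)
    n_pos, n_frames = width.shape
    opening = np.full(n_pos, np.nan)
    closing = np.full(n_pos, np.nan)
    for p in range(n_pos):
        open_frames = np.flatnonzero(width[p] > threshold)
        if open_frames.size:
            # First frame at which the trajectory departs from zero.
            opening[p] = open_frames[0] / n_frames
            # First frame after the trajectory has returned to zero.
            closing[p] = (open_frames[-1] + 1) / n_frames
    return opening, closing
```

Averaging these per-cycle contours over all cycles of an analysis window would give one opening and one closing contour per window. The difference between the closing and the opening instant at a given position corresponds to a local open quotient.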

Results
Both pitch glides could be performed by all participants. Due to an equipment malfunction, the HSV data for the lower pitch glide of participant S6 had to be excluded from the analysis. Overall, the vibrating vocal folds were well visible in HSV for all recordings, allowing the segmentation of the glottis for all glides. The only exception was the upper pitch glide of participant S6, in which the visibility of the vocal folds was obstructed by a retracted epiglottis during the last portion of the recording.
Surprisingly, no common laryngeal behavior could be found for the participants' transitions through either the first or the second passaggio. When assessing the dEGG wavegrams and the vocal fold vibration patterns through analysis windows -5 to 5 (S1-S10 Figs), four main strategies emerged: 1. smooth transitions from the lower to the upper pitch, typically coinciding with a decrease of the relative duration of vocal fold closure, most prominently seen in participant S4, first passaggio (see Fig 1, left panels); 2. smooth transitions with an increase of vocal fold contact and closure duration, most prominently seen in participant S1, second passaggio (see Fig 1); 3. abrupt register transitions; and 4. intermediate loss of vocal fold contact.
Summary graphs of all phonations of all participants are included as supplementary material S1-S10 Figs. In these summary graphs, a spectrogram of the acoustic signal (window length 1024 frames, 65 dB dynamic range), the time-varying ƒo and its rate of change, a dEGG Wavegram [62] (see supplementary material S2 Text), both the FDSEc and the FDSEw, and a summary of glottal opening and closing patterns are shown for each phonation. Initial data assessment suggests inhomogeneous task execution by the participants. Pitch glides produced with continuous development of laryngeal dynamics (see Fig 1 for two stereotypic examples) were contrasted by pitch glides produced with abrupt changes in laryngeal oscillation patterns and instabilities of the fundamental frequency (see Fig 2) in both passaggi.
There was a good agreement between the perceptual rating and the maximum FDSEw (Fig 3). In both the lower (r² = 0.49) and upper pitch glide (r² = 0.74), a higher perceptual rating (indicating a perceptually more prominent registration event) had a tendency to coincide with greater maximum FDSEw (indicating greater alterations of the EGG waveform within an analysis window), and vice versa.
The mean GAW open quotient (OQGAW), averaged over all participants, increased in both phonation tasks during the course of the pitch glides, by about 10% (lower pitch glide) and 18% (upper pitch glide), respectively (Fig 4A and 4B). In other words, most participants had the tendency to phonate with a longer relative duration of glottal closure per glottal cycle at the beginning of the tasks at lower ƒo as compared to the end of the tasks at higher ƒo. In four out of the 20 analyzed phonations, however, the relative duration of vocal fold closure and contact was higher at the end of the respective phonation, as compared to the beginning: S1 (second passaggio), S3 (both first and second passaggio), S9 (first passaggio; see also dEGG wavegrams and vocal fold vibration patterns in the supplementary figures S1 Fig, S3 Fig and S9 Fig). This would explain why a greater variation of OQGAW values emerged towards the end of the tasks within analysis windows 0 to 5, suggesting that the participants utilized different laryngeal strategies for mastering the transitions through their passaggio regions, an impression that is also corroborated by inspection of the vibratory patterns (see Discussion). This is also supported by assessment of the OQGAW differences between consecutive analysis windows in each phonation of the individual participants (ΔOQGAW, Fig 4C and 4D), showing a non-uniform development over the individual analysis windows. Phonations with large "between-window" ΔOQGAW values (e.g., S5 and S8 at the lower pitch glide, or S4 and S10 at the upper pitch glide) had a tendency to coincide with a greater maximum FDSEw. This is further illustrated in Fig 5, where the maximum FDSEw is plotted against the standard deviations (across all analysis windows) of the ΔOQGAW parameter for all participants, showing good correlations between these two parameters in both phonation tasks (r² = 0.45 and r² = 0.52, respectively).
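The two derived quantities used in this comparison, the standard deviation of the between-window open quotient differences and the coefficient of determination, could be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, the population form of the standard deviation is assumed because the variant used in the study is not stated, and r² is taken as the squared Pearson correlation coefficient of a simple linear relation.

```python
import numpy as np

def delta_oq_sd(oq_per_window):
    """Standard deviation of the open quotient differences between
    consecutive analysis windows (population form, ddof=0, assumed)."""
    deltas = np.diff(np.asarray(oq_per_window, dtype=float))
    return float(np.std(deltas))

def r_squared(x, y):
    """Squared Pearson correlation coefficient between two variables,
    e.g. the per-phonation delta_oq_sd values and the maximum
    window-based sample entropy values."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)
```

A phonation with a constant open quotient across all windows gives a standard deviation of zero, whereas a phonation with an abrupt change in one window gives a larger value.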

Discussion
In this study transitions through the first and second passaggi in a larger number of female singers were analyzed. Despite the fact that all 10 participants were trained classical singers, the results of the perceptual rating suggest that not all phonations were executed without perceptual register transitions, in partial violation of the aesthetical requirement of western classical singing to inaudibly "blend the registers" and to avoid abrupt changes of voice timbre throughout the singing tessitura (i.e., the pitch range used on stage).
The strong correlations between the perceptual rating data, the maximum FDSEw, and the "between-window" rate of change of GAW open quotients (recall Figs 3 and 5) suggest that the severity of the perceptual register transition correlated with a vocal fold oscillatory effect: the more audible a register transition, the greater the variations of vocal fold vibration when singing through the passaggio region.
Analysis of the dEGG wavegrams and the vocal fold vibration patterns revealed four strategies for navigating the passaggi (see Results). Previous research [6,19,47] would support strategy I, where the relative duration of vocal fold contact and glottal closure would diminish (increasing OQGAW) when increasing ƒo, for the first passaggio. Thus, the appearance of strategy II in some of the phonations was unexpected. It could be speculated that the increase of relative vocal fold closure duration was induced by an increase of posterior glottal adduction, which was suggested by visual inspection of the respective HSV footage. Such an increased adduction might facilitate the entrainment [63,64] of the two vocal folds in the potentially unstable passaggio region, thus helping to stabilize vocal fold vibration.
Phonations utilizing strategy I or II were produced with gradual changes of vocal fold oscillations as ƒo increased. In the first passaggio, such gradual changes have already been described using electroglottography in both male [32] and female [62] voices, when performing a register shift from modal to falsetto (for male) or middle (for female) register. The appearance of such gradual changes contradicts the hypothesis [65] that a mixture or blending of the modal and falsetto/middle registers ("voix mixte") would not be possible when traversing register boundaries in the passaggio, i.e., that the voice would be either in modal register (also termed laryngeal mechanism 1 or M1 [65]) or in falsetto/middle register (M2), and that a register transition would always be a distinct and binary event. Instead, our data support van den Berg's idea of a "mixture" of modal and falsetto/middle register [13]. The absence of a clear register boundary, as found in our data, calls the definition of registers based on distinct laryngeal mechanisms [66] into question, at least for the professional singers analyzed in this study. Rather, the possibility of gradual adjustments of laryngeal mechanisms might be considered.
Strategy III, resulting in audible registration events and abrupt changes of vocal fold oscillatory patterns, is expected to occur in less proficient singers (see e.g. [62], Fig 8). It is therefore not surprising that the clearest emergence of strategy III was found in participant S8, who had a rather short period of training and one of the lowest ratings in the Bunch and Chapman taxonomy (see Table 1). Strategy IV involved intermediate loss of vocal fold contact and glottal closure. This could be associated with a slight and sudden abductory gesture of the arytenoids. Given the hypothesis that strategies I and II with a smooth transition could both be associated with gradual abduction or adduction, the contact losses of strategy IV could reveal a deficit in such coordination. On the other hand, due to limitations in data storage space of the HSV camera, all phonatory tasks in this study had to be performed within one second, with the major ƒo increase typically occurring within intervals of 50 ms to 250 ms. Typically, singers have more time for coordinating their passaggio for their performance on stage. Theoretically, the episodes of contact loss seen in strategy IV could therefore also be artifacts introduced by the data acquisition protocol.
The changes in oscillation patterns found in our data are to be expected for the lower pitch glides through the first passaggio, i.e. the register transition from modal register to middle register [48] (sometimes also termed M1 and M2, respectively [6,47]). Despite some cases of disagreement concerning changes of vocal fold closure (see above), for this transition, our data corroborate the general previously reported finding of laryngeal adjustments [6,12,47,48]. More surprising, however, is the finding that the transitions through the second passaggio, which occurred during the upper pitch glides, also caused considerable variations of vocal fold oscillation patterns. Preliminary evidence for this phenomenon, albeit with limited frame rate and spatial resolution, has been brought forward in two previous single-subject studies involving two untrained female singers [19,39]. Here we provide the first conclusive confirmation for such vocal fold vibration pattern adjustments through the second passaggio in the female voice, utilizing HSV recordings with sufficient temporal and spatial resolution. Our data clearly demonstrate that the female second passaggio is not affected by vocal tract resonances alone, that is, without changes of vocal fold oscillation patterns.
The reason for these laryngeal oscillatory changes in the second passaggio is, however, unclear. One possibility is that the laryngeal oscillation patterns were caused by changes of laryngeal muscle activity. A second hypothesis, mainly based on theoretical modeling, suggests that the supraglottal vocal tract can interact with vocal fold oscillation patterns (level 2 interactions according to Titze [67]) and that voice instabilities could be expected when ƒo or an integer multiple of it (i.e., a harmonic) is at or above the first vocal tract resonance (ƒR1) [67,68]. In the vowel [iː], used for the phonations analyzed in this study, ƒR1 is typically found around 350 Hz [8,69,70], at least in speech. In trained classical singing, ƒR1 is customarily raised together with ƒo when ƒo is close to ƒR1 [20,51,52], presumably in order to avoid a crossing of ƒo and ƒR1 and the expected voice instabilities associated with this situation. Theory predicts that whenever a resonance is close to a harmonic, non-linear interactions between the inert vocal tract and the voice source might occur, either desirable (when the respective harmonic is just below the resonance), or undesirable (when the harmonic is at or slightly above the resonance) [67]. In this light, the changes of vocal fold vibration seen in the second passaggio could very well be induced by non-linear interactions between the vocal tract and the sound source. However, as our experimental setup did not allow for measurement of the supraglottal vocal tract resonances, this hypothesis could neither be confirmed nor ruled out.
There are some important limitations of this study. The participant pool was limited to 10 professionally trained sopranos. It could thus not be ruled out that other singers would use different strategies for navigating through the passaggi. Also, the number of subjects might have been too small to verify if, and to what extent, the described four laryngeal strategies for traversing the passaggio zones are relevant. On the other hand, it is hardly feasible to include a greater number of professional singers for such an invasive experimental study. Secondly, the experiment only considers ascending glides. In experiments on male voices it has been shown that an ascending glide across the passaggio showed greater irregularities of vocal fold oscillations than descending glides and that the ƒo of the passaggio was found to be lower for the descending glide [71], resulting in a hysteresis. As a consequence, it might be possible that register transitions on descending glides would reveal different strategies. Thirdly, it cannot be excluded that some of the perceptually noticeable register transitions were also influenced by the artificial recording situation (endoscope in the nose). However, the flexible endoscope approach was preferred over a rigid endoscope since it allows for more natural phonation. Though some subjects might not have executed the tasks to the best of their ability (as they would on stage), the recorded phonations are not artifacts. Rather, they constitute valuable data in the sense that they are examples of how the larynx can behave during transitions through either passaggio. Lastly, the study is only concerned with western classical singers. A wide variety of other singing styles exists, such as musical theater singing, pop/rock singing or yodeling, in which different strategies for navigating the passaggi may be used. Analysis of such important groups of singers is left to future investigations.

Conclusions
This study provides evidence of vocal fold oscillatory changes during the first and second passaggi in a larger number of female singers. It is the first of its kind to utilize laryngeal imaging with sufficient temporal and spatial resolution. The findings suggest that noteworthy vocal fold oscillatory registration events occur even in professional (trained) singers. Four different laryngeal strategies were found for navigating the passaggio regions: smooth transitions with increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact, possibly accompanied by abductory gestures in the larynx. Audible register transitions (in both the first and second passaggio) were accompanied by noteworthy changes of vocal fold vibration patterns. This would suggest that either (a) the respective transitions were caused by the sound source through changing laryngeal biomechanical properties induced by intrinsic laryngeal muscles, or that (b) occurring vocal tract resonance effects had a strong influence on the sound source as described by Titze's level 2 interactions [67]. Further research is necessary to identify which of these two hypotheses is applicable.

Investigation: ME FB MK.
Methodology: ME CTH AS MB.
Project administration: ME CTH BR.
Resources: ME FB MK AS MD MB BR CTH.
Software: CTH MD AS ME.