Interdependencies between acoustic and high-speed videoendoscopy parameters

In voice research, uncovering relations between the oscillating vocal folds, which are the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). For each recording, 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms long sustained phonation data (vowel /i/) were provided. 26 PCCs in females (5.3%) and 8 in males (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating that non-linear dependencies between parameters are weak. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between the investigated acoustic and GAW parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D and 3D laryngeal dynamics and vocal tract parameters should be further investigated for potential correlations with the acoustic signal.


Introduction
Phonation begins with an airstream, rising from the lungs, setting the vocal folds located in the larynx in motion. The vocal folds subdivide this airstream into a series of flow pulses which are further modulated in the vocal tract until exiting through the mouth and being perceived as an acoustic signal [1,2]. It is logical to assume that relations between vocal fold oscillation characteristics and acoustic sound quality should exist. Uncovering such relations would greatly improve treatment possibilities for voice disorders, since this knowledge would guide physicians in deciding which specific oscillation characteristic needs to be addressed in order to improve certain acoustic quality features. Depending on the underlying disorder, the process of voice production can be impaired in a variety of ways. In this work, we divide voice disorders into two groups: organic dysphonias (OD) and functional dysphonias (FD) [3]. Whilst signs of ODs are always (visible) laryngeal anatomical changes, FD is a diagnosis of exclusion because no underlying anatomical/tissue related (visible) changes are ascertainable [4]. A voice disorder classified as FD may also have a purely psychological etiology [5]. It is important to note that some uncertainty surrounds the term FD. First, the boundary between ODs and FDs is not always clear-cut, since organic pathologies may eventually result in functional disorders [3], which are then called secondary functional dysphonia. Second, subcategories of FD are not entirely standardized and often reflect the clinician's supposition and bias in practice [6]. However, in this study the subjects with FD diagnosis had no organic pathologies at the time of recording, i.e. only the so-called primary functional dysphonia was considered.
In patients with voice disorders, the acoustic signal is altered. In many cases, this is due to impairments in the vocal fold oscillations [7,8]. It is assumed that there are three main vocal fold dynamical characteristics that foster healthy voice quality [9][10][11]: vocal fold oscillations are assumed to be (A) symmetric, (B) periodic and (C) exhibit a closed state during oscillations.
For instance, vocal fold asymmetry [12,13] and aperiodicity [14] have been linked to perceived audible roughness; incomplete glottis closure is associated with vocal fatigue and a breathy voice [7,8]. A better understanding of the relations between features of vocal fold oscillations and their effects on the acoustic signal could be of great benefit in clinical settings: if auditory-perceptual symptoms can be traced back to specific vocal fold disorders or specific patterns of vocal fold oscillations, this may lead to improvement in the patient's voice by directly treating the underlying cause. Hence, finding relations between acoustic signal quality and vocal fold oscillation characteristics would provide further insight into fundamental connections in voice production and would eventually allow treatments tailored to the individual patient's needs.
One powerful tool for investigating vocal fold oscillations is high-speed videoendoscopy (HSV) [15][16][17]. As illustrated in Fig 1, during rigid-endoscope HSV data collection, as performed in this study, an endoscope is inserted in the mouth of the subject to record the vocal fold oscillations. The oscillation frequency of the vocal folds lies between 80 and 400 Hz during normal phonation [3]. With HSV recording frame rates between 4,000 fps and 20,000 fps these oscillation frequencies are easily captured [7,18], leading to a thorough recording of oscillation characteristics during each glottis cycle.
From the resulting HSV data, different types of signals can be extracted, such as vocal fold trajectories [19], Phonovibrograms [20] and the "Glottal Area Waveform" (GAW) [21]. The GAW describes the changing area between the vocal folds, i.e. the glottal area, over time. The GAW reaches maxima during maximum opening of the glottis and minima during the closed phase. Also, synchronous recording of the acoustic signal is possible and often put into practice [22][23][24] as it was done in this work.
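As an illustration of how a GAW arises from segmented HSV frames, the following is a minimal sketch. It assumes that every frame has already been segmented into a binary mask in which 1 marks a glottis pixel; the function name `glottal_area_waveform` and the toy frames are hypothetical and are not taken from the software used in this study.

```python
def glottal_area_waveform(masks):
    """Return the glottal area (pixel count) for each segmented frame.

    `masks` is a sequence of frames; each frame is a list of rows holding
    0/1 values, where 1 marks a pixel inside the glottis (assumed format).
    """
    return [sum(sum(row) for row in frame) for frame in masks]


# Three toy 2x3 frames: closed glottis, partly open, fully open.
frames = [
    [[0, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[1, 1, 1], [1, 1, 1]],
]
gaw = glottal_area_waveform(frames)
print(gaw)
```

The waveform is smallest for the closed frame and largest for the fully open frame, which mirrors the minima and maxima described above.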
Based on the extracted acoustic and GAW signals various parameters can be calculated, describing different features of the signals reflecting different features of the voice production process. A great number of parameters have been introduced [9,25], but norm-values for many parameters are still missing due to a variety of reasons [26][27][28][29]. Widely used parameters such as Jitter and Shimmer describe period irregularity in fundamental frequency and amplitude in the signal. Increased values of these parameters are e.g. associated with hoarseness if they were calculated on acoustic signals [30]. However, given norm values for Jitter (in this case Jitter Percent) differ, with one study stating "healthy" values of around 0.25% for females and males while producing the vowel /a/ [31] whereas another study considers values as high as 0.53% for younger and 0.84% for older males phonating the vowel /a/ as healthy [32]. Such differences may be related to inadequate subject recruitment in these studies or other variabilities in the data collection process. Also in studies employing HSV data different factors appeared to be influencing these parameters such as recording frame rate [27], camera resolution [28] or sequence length [29]. Hence, norm value tables for HSV parameters to aid in objective separation of healthy and disordered voices are needed.
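To make the notion of period and amplitude perturbation concrete, the sketch below uses one common definition: the mean absolute difference between consecutive cycles, relative to the mean over all cycles, in percent. This definition is an assumption for illustration; as noted later in this article, exact parameter definitions differ between software tools, so the values need not match those of any specific tool. The helper name `perturbation_percent` and the example values are hypothetical.

```python
def perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Applied to cycle periods this gives a Jitter-like measure; applied to
    cycle peak amplitudes it gives a Shimmer-like measure (assumed definition).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical cycle periods in milliseconds and peak amplitudes (arbitrary units).
periods_ms = [5.0, 5.1, 4.9, 5.0]
amplitudes = [1.0, 1.0, 1.0, 1.0]
jitter = perturbation_percent(periods_ms)
shimmer = perturbation_percent(amplitudes)
print(round(jitter, 2), shimmer)
```

A perfectly regular signal yields 0%, and any cycle-to-cycle variation raises the value, which is why noise and cycle detection errors directly inflate such measures.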
To date, various works have investigated relations between vocal fold movements and resulting acoustics. However, often only linear relations were explored [33][34][35][36] or data from a small number of subjects (N ≤ 20) was used [33,35,36,37]. Some relations between vocal fold oscillations and the resulting acoustic signal are known, the most obvious one being the strong correlation between the fundamental frequency of the vocal fold oscillations and the fundamental frequency of the resulting acoustic signal in sustained phonation. Other examples include connections between insufficient closure of the vocal folds during phonation and perceived hoarseness in the acoustic signal, or between the "force" with which the vocal folds collide and the acoustic amplitude [7,8]. The fundamental frequency (F 0 ) at which the subject phonates is another factor that may influence acoustic and GAW parameters. For instance, period perturbation measurements in the GAW may be influenced by F 0 due to the lower sampling rate of GAW signals, and a changing F 0 may affect more complex parameters such as noise measurements [30].
This study investigated linear and non-linear relations between GAW and acoustic data for a large number of subjects and parameters. Female and male subjects with normal voices formed the healthy voice group, and subjects who had been diagnosed with FD formed the voice disordered group. The influence of F 0 on the other parameters considered in this work was of particular interest, since parameters that are strongly affected by F 0 may require a correction of this influence. Further, we used our collected data to provide preliminary norm values of parameters obtained from 250 ms long sustained phonation data (vowel /i/). The aims of this work are:
1. Create a set of norm-values for all investigated parameters that differentiate females and males with normal voices from subjects with diagnosis of FD for the given recording settings.
2. Find parameters that are influenced by F 0 .

Methods
HSV recordings (N: 351) with simultaneously recorded acoustic signal (time-synchronized) were used for data evaluation. This data (without the acoustic recordings) was already used in a previous study applying machine learning approaches for classification purposes [38]. All 351 acoustic recordings were unanimously rated by three experts on ordinal scales (0 to 2) for signal noise and background noise: 0 was chosen as the best rating (no signal noise / background noise) and 2 as the worst (strong signal noise / background noise). Only recordings for which both signal noise and background noise were rated 0 or 1 were used in further analysis, leading to a final set of 250 combined HSV-acoustic recordings from female and male subjects. The 250 combined HSV-acoustic recordings were divided into four groups depending on gender and health status (Table 1). All recordings were taken under clinical conditions using a Photron Fastcam MC2 camera (frame rate: 4000 fps, resolution: 512×256 pixels, 70° rigid endoscope). The acoustic signal was simultaneously recorded using a clip microphone (Pentax model #7175-6000, Lapel Microphone, Audio Technica ASP-0091, sampling rate: 40 kHz). All subjects phonated the vowel /i/ at their habitual pitch and loudness level (sustained phonation). From each combined HSV-acoustic recording, a section of 250 ms of sustained phonation was selected.
All disordered patients were diagnosed by our clinicians with FD and no concurrent OD during regular clinical routine (i.e. only primary functional dysphonia was considered). Healthy subjects were recruited separately but examined analogously to disordered subjects. Only healthy subjects who did not show signs of any voice disorder were included. This study was approved by the ethics committee of the Medical School at Friedrich-Alexander-University Erlangen-Nürnberg (no. 290_13B); written consent was obtained from all subjects.

Signal extraction and parameter calculation
High-speed video data were processed using a preliminary version of the in-house developed software Glottis Analysis Tools (GAT-2020), which is freely available upon request. It is the successor of GAT-2018 and includes several bug fixes and an improved cycle detection algorithm. The process of segmentation and parameter calculation is illustrated in Fig 2. For a detailed explanation of the segmentation process see [38]. GAWs describing the total glottal area (GAW T ) and the left and right half of this glottal area (GAW L and GAW R ) were extracted from HSV videos. The acoustic signal was synchronously recorded using a clip microphone. Maximum-based cycles (i.e. each cycle starts at a sufficiently distinct local maximum and ends before the next one) were detected in GAWs and acoustic signals. From all parameters featured in the GAT software, a set of relevant parameters was selected based on previous work [28,29,38,39]. Only parameters that were previously found to be robust against certain influencing factors (spatial resolution and sequence length) [28,29], mathematically sound [39] and not strongly redundant [38] were included: 35 GAW- and 14 acoustic-based parameters were considered [40][41][42][43][44][45][46][47][48][49][50][51][52][53][54].
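The idea of maximum-based cycles can be sketched as follows. This is not the cycle detection algorithm of GAT-2020, which is not described here; it is a simplified illustration that assumes "sufficiently distinct" means a local maximum lying above a fixed fraction of the signal range. The names `detect_cycle_starts`, `cycles_from_starts` and the parameter `min_rel_height` are hypothetical.

```python
import math


def detect_cycle_starts(signal, min_rel_height=0.5):
    """Return indices of local maxima above a relative height threshold.

    Assumption: a maximum counts as "sufficiently distinct" if it lies above
    min_rel_height of the signal range. Edge samples are never reported.
    """
    lo, hi = min(signal), max(signal)
    threshold = lo + min_rel_height * (hi - lo)
    starts = []
    for i in range(1, len(signal) - 1):
        is_peak = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_peak and signal[i] >= threshold:
            starts.append(i)
    return starts


def cycles_from_starts(starts):
    """Each cycle runs from one maximum to the sample before the next one."""
    return [(a, b - 1) for a, b in zip(starts, starts[1:])]


# Toy GAW: 200 Hz oscillation sampled at 4000 fps, i.e. 20 samples per cycle.
fps, f0 = 4000, 200
toy_gaw = [0.5 * (1 + math.cos(2 * math.pi * f0 * n / fps)) for n in range(100)]
starts = detect_cycle_starts(toy_gaw)
cycles = cycles_from_starts(starts)
print(starts, cycles)
```

On real GAWs and especially on acoustic signals, noise creates spurious local maxima, so a practical detector needs additional smoothing or a minimum distance between maxima.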
In Table 2 the parameters used in this study are summarized. "Signal" describes if the parameter was calculated exclusively for GAW or acoustic signal or for both signals. "Averaged" describes if only a single parameter value per signal was calculated or if multiple values were calculated (i.e. mean and standard deviation). Further, abbreviation, parameter unit and source are given. This means that a single row in this  calculated). In S1 Table a more detailed version of Table 2 [43]. Custom scripts in Python 3.7 were used to analyze the data and to prepare the figures.

HSV-acoustic correlations
Linear and non-linear relations between HSV and acoustic parameters were considered separately for females and males. For each gender, healthy and disordered groups were merged, since parameters are expected to scatter between healthy and disordered voice subjects.
To investigate the linear relations, Pearson correlation coefficients (PCC) and p-values were calculated between all HSV and acoustic parameters. For investigation of general relations, distance correlation coefficients (DCC) and p-values were calculated. Distance correlation is a measure of dependence between random vectors that is zero only when the vectors are independent and 1 when the vectors are identical. Therefore, DCC measures linear and non-linear associations between vectors and, contrary to PCC, cannot take negative values. For more information see the work by Székely, Rizzo and Bakirov [55]. The p-values calculated for the DCCs are, analogous to PCC p-values, the probability of a correlation being equal to or greater than the observed DCC if the null hypothesis (both parameters are uncorrelated) is true.
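The difference between the two coefficients can be illustrated with a small, dependency-free sketch. It implements the sample Pearson correlation and the sample distance correlation following the definition of Székely, Rizzo and Bakirov (double-centered pairwise distance matrices) for one-dimensional data. It is not the code used in this study, the function names are hypothetical, and p-values, which would require for example a permutation test, are deliberately left out.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _centered_distances(v):
    """Pairwise absolute distances, double-centered (row, column, grand mean)."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation; lies in [0, 1] for non-constant input."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvar2_x = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar2_y = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = math.sqrt(dvar2_x * dvar2_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


# A purely non-linear (quadratic) dependence: PCC vanishes, DCC does not.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]
pcc = pearson(x, y)
dcc = distance_correlation(x, y)
print(round(pcc, 3), round(dcc, 3))
```

For a strictly linear relation both coefficients have the same absolute value of 1, so a DCC that clearly exceeds |PCC| hints at a non-linear component.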
This approach yielded two sets of PCCs and two sets of DCCs with respective p-values. We controlled the false discovery rate (FDR), i.e. the expected percentage of false positive tests, at 5% using the Benjamini-Yekutieli procedure, since there may be unknown interdependencies between the tests [56]. The p-values were adjusted accordingly. The entire process is illustrated in Fig 3.
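The p-value adjustment can be sketched as follows. The Benjamini-Yekutieli procedure scales the p-value at rank i (in ascending order, out of m tests) by m · c(m) / i, where c(m) is the m-th harmonic number, and then enforces monotonicity from the largest p-value downwards. The function name and the example p-values are hypothetical, and this is an illustration rather than the script used in this study.

```python
def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the original order."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic number c(m)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest to the smallest p-value to keep values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four correlation tests.
raw = [0.001, 0.02, 0.04, 0.5]
adj = benjamini_yekutieli(raw)
significant = [p < 0.05 for p in adj]
print([round(p, 4) for p in adj], significant)
```

Compared with the Benjamini-Hochberg procedure, the additional factor c(m) makes the adjustment more conservative, which is the price for remaining valid under arbitrary dependence between tests.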

Results
Three main topics were of interest in this work: (A) Determining the ranges of values for healthy subjects (i.e. females and males with normal voices) and subjects with diagnosis of FD for the investigated parameters, (B) investigating influence of F 0 on other parameters and (C) detecting relations between parameters not related to the fundamental vocal fold oscillation frequency F 0 .

Ranges of values for healthy and FD subjects
Statistical values for all four groups (N F , FD F , N M , FD M ) are collected in S2 Table. This table contains minimum, maximum, mean and median values for these groups as well as the standard deviations, skewness and kurtosis. Further, below this table, distributions of parameter values for all parameters investigated in this study are plotted (similar to Fig 4). Parameter values scattered severely and outliers were common. In Fig 4, the distributions of two example parameters, acoustic-based CPP in females and GAW-based PQ [Std] in males, are depicted. Albeit some shift towards lower / higher values may be subjectively identifiable, no strong differences between healthy and FD groups are observable in low-order statistical measures like means and medians. However, for some parameters, like GGI [Mean], high-order statistical measures (skewness and kurtosis) deviate considerably (see S2 Table). Analogously, differences were either similarly small or undetectable in all other GAW- and acoustic-based parameters for both females and males.
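The statistical measures listed above can be computed as in the following sketch. Which estimators were used for S2 Table is not stated here, so the choices below are assumptions: sample standard deviation, moment-based (uncorrected) skewness, and excess kurtosis, which is 0 for a normal distribution. The function name `describe` and the example values are hypothetical.

```python
import statistics


def describe(values):
    """Low-order and high-order summary statistics of one parameter in one group."""
    n = len(values)
    mean = statistics.mean(values)
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "std": statistics.stdev(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical parameter values with one high outlier.
stats = describe([1, 2, 3, 4, 10])
print(stats)
```

In this toy example a single outlier barely moves the median but produces a clearly positive skewness, which illustrates why distribution shape can differ between groups even when means and medians are similar.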

Parameters influenced by F 0
We used the rule-of-thumb limits proposed by Mukaka [57] to rate the size of the correlation coefficient (i.e. absolute value of PCC or DCC): negligible (below 0.3), low (0.3 to 0.5), moderate (0.5 to 0.7), high (0.7 to 0.9) and very high (0.9 to 1.0). Mukaka only discussed linear relations; however, we also used these limits for distance correlation since it has (in absolute terms) the same value range as Pearson correlation. This also leads to better comparability between PCCs and DCCs.
Further, we imposed two conditions that had to be fulfilled to determine a PCC or DCC between two parameters as relevant. (A) The PCC or DCC had to be statistically significant after FDR correction. (B) The PCC or DCC had to be above the rule-of-thumb limit of negligibility for correlation coefficients; i.e. an absolute value greater than or equal to 0.3.
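The two conditions can be expressed compactly in code. The sketch below uses the rule-of-thumb bands referred to throughout this article (low: 0.3 to 0.5, moderate: 0.5 to 0.7, high: 0.7 to 0.9, very high: 0.9 and above); assigning a boundary value to the higher band and the function names are assumptions made for illustration.

```python
def rate_correlation(coefficient):
    """Rate the size of a PCC or DCC by its absolute value."""
    size = abs(coefficient)
    if size >= 0.9:
        return "very high"
    if size >= 0.7:
        return "high"
    if size >= 0.5:
        return "moderate"
    if size >= 0.3:
        return "low"
    return "negligible"


def is_relevant(coefficient, adjusted_p, alpha=0.05, limit=0.3):
    """Condition (A): significant after FDR correction; (B): not negligible."""
    return adjusted_p < alpha and abs(coefficient) >= limit


# PCC reported in this article for GAW-based PVI versus acoustic F0 in females.
rating = rate_correlation(0.663)
print(rating)
```

A coefficient can therefore be statistically significant and still be discarded as irrelevant, which matters in large samples where even negligible correlations reach significance.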
The following relevant correlations were observed: The only parameters with a very high correlation (≥ 0.9) were GAW- and acoustic-based F 0 [Mean], as depicted in Fig 5, for females and males. GAW-based but not acoustic-based F 0 [Std] was highly associated with F 0 [Mean]. Two parameters were moderately associated with F 0 [Mean] (PCC or DCC between 0.5 and 0.7). Four parameters showed low and moderate correlations. 15 parameters showed only low correlations (between 0.3 and 0.5). A list of parameters that were associated with F 0 [Mean], as well as relevant PCCs and DCCs, is provided in Table 3.
Differences between PCCs (linear correlation) and DCCs (general correlation including linear and non-linear) for the same comparisons were small; i.e. linear correlations are dominant and non-linear relations seem to be small to negligible. For pairings in which at least one of PCC or DCC was statistically significant and had an absolute value ≥ 0.3, the highest difference in females was 0.111 between GAW-based PhAI [Std] and acoustic-based F 0 [Mean]. In males the largest difference was 0.081 between GAW-based F 0 [Mean] and acoustic-based WMC Mean . In Fig 6, scatter plots for these parameter pairings are depicted, including a fitted regression line and a second-degree polynomial. Further, a regression line fitted with the random sample consensus (RANSAC) algorithm [58], which excludes outlier data points, is shown.

Fig 5. Correlation between GAW and acoustic F 0 [Mean] in (A) females and (B) males with fitted line (black).
https://doi.org/10.1371/journal.pone.0246136.g005

No notable differences in PCCs and DCCs between females and males were detected, with the exception of GAW-based PVI, which was moderately related (PCC = 0.663 and DCC = 0.661) to acoustic-based F 0 in females but not in males (no statistically significant PCC or DCC). In general, if for a certain parameter relation a statistically significant PCC or DCC was found in males, the same parameter relation was also statistically significant in females, but not vice versa.

Correlations excluding mean F 0
Correlations between GAW- and acoustic-based parameters (excluding F 0 [Mean]) were in most cases negligible. As shown in Table 4, only 17 low and one barely moderate PCCs or DCCs could be observed. The highest correlations were found between acoustic-based WMC Max and GAW-based F 0 [Std], which were both also correlated with F 0 [Mean] (Table 3). As for the correlations associated with F 0 [Mean], no distinct differences between PCCs and DCCs in females and males were observable. In S3 Table, all PCCs and DCCs and respective p-values (after FDR correction) calculated in this study are given.

Discussion
For none of the investigated parameters were healthy and disordered groups clearly separable by parameter values, as shown in Fig 4 for two example parameters. However, by inspecting high-order statistical measures like skewness and kurtosis, which describe the shape of the distribution of parameter values for groups N F , N M , FD F and FD M , several differences between subject groups become apparent (see S2 Table for a comparison of statistical values of all parameters). This is not surprising, since FD is an umbrella term for a variety of voice disorders [6]. Therefore, parameters that describe a certain feature of the phonation process may be expressive for certain subcategories of FD but not for others. This and high individual physiological variability [7] may lead to the observed outliers and high variability of parameter values in the data. Since the female and male FD groups consist of subjects with varying conditions, specific parameters may differ from normal values for only some of the FD subjects. This may then lead to changes in the shape of the parameter distribution in comparison to healthy subjects. In summary, single parameters are not suitable for differentiating healthy from FD subjects and multi-parametric approaches are needed, as suggested before [38,59,60]. However, if not FD in general but subcategories of FD (e.g. psychogenic dysphonia, conversion dysphonia or tension-fatigue syndrome [6,61]) are investigated, there could be single parameters or smaller sets of parameters that are able to differentiate these subcategories of FD from healthy voices. Therefore, the collected values for FD subjects, as provided in S2 Table, should be considered preliminary (see shortcomings). As expected [1,2], GAW and acoustic F 0 [Mean] are highly correlated. In addition, other parameters are also, to some degree, correlated with F 0 [Mean] (see Table 3).
Albeit most of these correlations were only low (0.3 to 0.5), this still implies that these parameters change to a small degree with changing F 0 . Exceptions are GAW-based F 0 [Std] and PhAI [Std], showing no PCC or DCC below the "high" (0.7 to 0.9) and "moderate" (0.5 to 0.7) level, respectively, in females and males (see Table 3).

Fig 6. Relevant parameter relations with highest difference between PCC and DCC in (A) females (acoustic-based F 0 [Mean] versus GAW-based PhAI [Std]) and (B) males (acoustic-based WMC Mean versus GAW-based F 0 [Mean]).
Only GAW-based F 0 [Std] but not acoustic-based F 0 [Std] showed the aforementioned strong correlation with F 0 [Mean]. F 0 [Std] is calculated from the inverse cycle lengths (1 / cycle length), which vary more in the acoustic signal than in the GAW due to noise and the more complex waveform shape of the acoustic signal, which complicates the determination of the exact beginning and ending position of cycles. This effect may mask a potentially existing correlation between acoustic-based F 0 [Std] and F 0 [Mean]. PhAI describes the relative phase shift between GAW L and GAW R in one vocal fold oscillation cycle, and PhAI [Std] is the standard deviation of this parameter, calculated over all oscillation cycles. Therefore, the comparatively high positive correlation of this parameter with F 0 is expected, since with shorter cycles (higher F 0 ), the deviation of PhAI relative to cycle length increases. Given such effects, it may be necessary to correct for the influence of F 0 during further use of the affected parameters.
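The calculation of F 0 [Mean] and F 0 [Std] from inverse cycle lengths can be sketched as follows. Cycle lengths are assumed to be given in samples, so that the per-cycle frequency in Hz is the sampling rate divided by the cycle length; the use of the sample standard deviation, the function name and the example values are assumptions made for illustration.

```python
import statistics


def f0_per_cycle(cycle_lengths, sampling_rate):
    """Per-cycle fundamental frequency in Hz from cycle lengths in samples."""
    return [sampling_rate / length for length in cycle_lengths]


# Hypothetical GAW cycle lengths in frames at 4000 fps.
cycle_lengths = [20, 20, 21, 19, 20]
f0 = f0_per_cycle(cycle_lengths, 4000)
f0_mean = statistics.mean(f0)
f0_std = statistics.stdev(f0)
print([round(v, 1) for v in f0], round(f0_mean, 1), round(f0_std, 1))
```

Here a deviation of a single frame in two of five cycles already produces a standard deviation of several Hz, so any uncertainty in locating cycle boundaries carries over directly into F 0 [Std].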
The small differences found between PCCs and DCCs indicate that non-linear relations between the investigated GAW and acoustic features are weak, since the "general association" between parameters measured by DCCs is almost completely explainable by the "linear association" measured by PCCs. As shown in Fig 6, in the parameter pairings with the highest difference between PCC and DCC, no obvious or only weak non-linear dependencies are observable.
Higher values of PCCs and DCCs and simultaneously a lower number of statistically significant PCCs and DCCs in males than in females may be attributable to the smaller number of available male subjects. The PCC and DCC between GAW-based PVI and acoustic F 0 [Mean] differ the most between females and males. This can be attributed to males phonating at lower fundamental frequencies than females [30] and to the fact that the higher the F 0 [Mean], the stronger the association between GAW-based PVI and F 0 [Mean].
This relation may to some degree be an artefact attributable to the sampling rate of the GAW, which is limited in comparison to the speed of vocal fold oscillations. At a recording frame rate of 4000 fps and vocal fold oscillation frequencies between 80 and 400 Hz [3], each cycle is represented by only 27 to 10 data points; i.e. a single data point shift results in a change of up to 10% of the cycle length. In female GAWs, fewer data points are contained in each cycle and hence period perturbation measures such as Jit(%) and PVI are artificially increased. MJit is an exception, since it is not normalized and hence would be expected to be higher in males; however, this effect and the one mentioned before cancel each other out.
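The sampling argument is simple arithmetic and can be checked with a short sketch: the number of frames per oscillation cycle is the frame rate divided by F 0 , and the relative effect of a one-frame shift is the inverse of that number. The function names and the chosen example frequencies are hypothetical.

```python
def samples_per_cycle(frame_rate, f0):
    """Number of GAW data points (frames) covering one oscillation cycle."""
    return frame_rate / f0


def one_sample_shift_percent(frame_rate, f0):
    """Relative change of the cycle length caused by a one-frame shift, in percent."""
    return 100.0 / samples_per_cycle(frame_rate, f0)


frame_rate = 4000  # fps, as used for the recordings in this study
for f0 in (150, 200, 400):
    n = samples_per_cycle(frame_rate, f0)
    shift = one_sample_shift_percent(frame_rate, f0)
    print(f"F0 = {f0} Hz: {n:.1f} frames per cycle, one-frame shift = {shift:.1f}%")
```

The relative error of a one-frame shift grows linearly with F 0 , which is consistent with normalized period perturbation measures being inflated more strongly at higher fundamental frequencies.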
Only 11 pairings of parameters in females and 1 pairing of parameters in males that did not include F 0 [Mean] had statistically significant correlations, and none of these correlations exceeded 0.5 (see Table 4). Therefore, the direct relation between the investigated features of the GAW and the acoustic signal, excluding F 0 , is low at best. However, there may still be some relations for subcategories of subjects that could not be detected. Further, the influences due to modulation of the airflow / acoustic signal in the vocal tract are not reflected by the GAW. Also, the actually 3-dimensional vocal fold oscillations are not entirely reflected by the one-dimensional GAW. This means that 2D and 3D oscillatory characteristics of the vocal folds may be better suited to reflect changes in the acoustic signal than 1D GAW features are [62,63]. This also aligns with previous findings that GAW-based parameters are less important for healthy / FD classification tasks than parameters based on a more complex signal describing the vocal fold oscillation pattern (i.e. Phonovibrogram-based parameters) [38].
To summarize, the main gains from this investigation are as follows:
1. Values of investigated parameters for healthy and FD subjects were not clearly separable. A table containing norm values (minimum, maximum, mean, median and standard deviation) for all parameters in all four investigated groups is provided (S2 Table). All parameters were obtained from 250 ms long sustained phonation data (vowel /i/).
2. In many cases parameters are correlated with F 0 , which may require a correction for the influence of F 0 on these parameters in future studies. We provide a comprehensive list of parameters statistically significantly associated with F 0 (Table 3).
3. Mostly, linear relations were found between GAW and acoustic parameters. Non-linear relations were only subjectively observable and weak. Further, no strong relations between GAW and acoustic signals, excluding F 0 , were found in females or males. This implies that no clear redundancy exists between both signals but also suggests that the GAW may be an overly simplified one-dimensional representation of the vocal fold oscillations.

Shortcomings
In this study more females than males were investigated, which influences the comparisons as explained in the discussion section. This imbalance was not avoidable without excluding many female subjects, since the vast majority of our clinical referrals are females, similar to other clinics [64]. Also, voice pathologies are more common in females than in males [65]. Further, subject age differed between healthy and disordered groups. Albeit we found no strong influence of subject age in a previous study [38], the influence of age on voice parameters is well documented in the literature [66][67][68] and may have influenced the results. FD is a diagnosis of exclusion and hence a broad term uniting a large number of different voice disorders that all have varying symptoms and causes [4,6]. This means that a table of norm values for FD subjects is only of limited utility, since many parameter values describing only certain features of the voice may also be in the normal range for most of the subcategories of FD. Only for specific subcategories of FD may certain parameter values deviate. However, since we only looked for more general relations between parameters and only limited data was available, distinguishing a large number of FD subcategories was not feasible. In addition, the analyzed phonatory condition was limited to sustained phonation on vowel /i/. Other paradigms, such as pitch raise or phonating other vowels, will have to be investigated in order to analyze if they are more suitable to differentiate between healthy and FD subjects.
The acoustic signal was recorded in a clinical setting using a clip microphone and hence was often noisy. We addressed this problem by rating all acoustic signals with regard to signal and background noise and only used data with acceptable noise levels.
The GAW is only a 1-dimensional representation of the vocal fold oscillation process and hence does not describe the whole information contained in the 2D-HSV recordings [20,69] or the 3D vocal fold oscillations [62]. For further investigations in vocal fold-acoustic relations, Phonovibrogram-based parameters could be also considered, since the Phonovibrogram is a more complex, 2-dimensional representation of the vocal fold oscillations [63].
More signals, parameters and alternating definitions of parameters exist [25] that were not investigated in this study. Also, exact parameter definitions may differ between software tools [70].

Conclusion
In this study healthy and FD subjects were not separable by single parameter values. Still, we presented S2 Table containing values for male and female, healthy and FD subjects obtained from 250 ms long sustained phonation data (vowel /i/). This table does not rest upon a sufficiently large and diverse number of subjects to be used as a reference for clinical parameter value ranges. However, it can be expanded and supplemented in future studies to eventually lay the fundamentals for the development of software tools that may allow for objective clinical voice assessment and assisting clinicians.
About half of all 49 investigated parameters were found to be correlated statistically significantly with acoustic or GAW-based F 0 [Mean]. Albeit most correlations were low (between 0.3 and 0.5) this still implies a measurable influence of F 0 on the affected parameters. We suggest that, if the parameters affected by F 0 are used in the future, it may be required to correct for the influence of F 0 , at least for the stronger affected parameters PhAI [Std] and F 0 [Std].
Only low (and in one case barely moderate) correlations between GAW- and acoustic-based parameters not related to F 0 were found in females and males. Although no strong relations between features of the GAW and the acoustic signal besides F 0 could be found in this work, these findings show the gain of synchronous HSV and acoustic recordings, since there is little redundancy between both signals. Also, based on these only weak relations between acoustic and GAW parameters, we conclude that other features besides the glottal area (i.e. specific vocal fold oscillation patterns or the vocal tract) may play a more prominent role in determining acoustic characteristics than the GAW.
Supporting information S1