
Measuring negative emotions and stress through acoustic correlates in speech: A systematic review

  • Lilien Schewski ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    lilien.schewski@unibe.ch

    Affiliations Department for Biomedical Research (DBMR), University of Bern, Bern, Switzerland, Department for Visceral Surgery and Medicine, Bern University Hospital, University of Bern, Bern, Switzerland, Graduate School for Health Sciences, University of Bern, Bern, Switzerland

  • Mathew Magimai Doss,

    Roles Formal analysis, Writing – review & editing

    Affiliation Idiap Research Institute, Martigny, Switzerland

  • Guido Beldi,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliations Department for Biomedical Research (DBMR), University of Bern, Bern, Switzerland, Department for Visceral Surgery and Medicine, Bern University Hospital, University of Bern, Bern, Switzerland

  • Sandra Keller

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Validation, Writing – review & editing

    Affiliations Department for Biomedical Research (DBMR), University of Bern, Bern, Switzerland, Department for Visceral Surgery and Medicine, Bern University Hospital, University of Bern, Bern, Switzerland

Abstract

Speech analysis offers a non-invasive method for assessing emotional and cognitive states through acoustic correlates, including spectral, prosodic, and voice quality features. Despite growing interest, research remains inconsistent in identifying reliable acoustic markers, providing limited guidance for researchers and practitioners in the field. This review identifies key acoustic correlates for detecting negative emotions, stress, and cognitive load in speech. A systematic search was conducted across four electronic databases: PubMed, PsycInfo, Web of Science, and Scopus. Peer-reviewed articles reporting studies conducted with healthy adult participants were included. Thirty-eight articles were reviewed, encompassing 39 studies, as one article reported on two studies. Among all features, prosodic features were the most investigated and showed the greatest accuracy in detecting negative emotions, stress, and cognitive load. Specifically, anger was associated with elevated fundamental frequency (F0), increased speech volume, and faster speech rate. Stress was associated with increased F0 and intensity, and reduced speech duration. Cognitive load was linked to increased F0 and intensity, although the results for F0 were overall less clear than those for negative emotions and stress. No consistent acoustic patterns were identified for fear or anxiety. The findings support speech analysis as a useful tool for researchers and practitioners aiming to assess negative emotions, stress, and cognitive load in experimental and field studies.

Introduction

While spoken communication conveys information through its content, it also reveals the speaker's emotional state through tone and other vocal characteristics. Negative emotions or stress in group or team communication can be contagious, escalate interpersonal tensions, or simply indicate the emotional state or stress level of one or more team members [1]. In an operating room, for example—where teamwork is an essential component of the work—negative emotions and stress can impair both technical performance [2] and nontechnical skills, such as communication (e.g., speaking-up behaviors) and decision-making [3,4].

Human emotions can be understood along two dimensions: valence, which describes how positive or negative an emotion is, and arousal, which reflects the intensity of the emotion [5]. Negative emotions are affective states characterized by negative valence and, in many cases, high arousal—such as anxiety, anger, and frustration. These emotional states are often accompanied by physiological activation and can trigger stress responses. Stress refers to the physiological and psychological responses to perceived threats or challenges that are appraised as exceeding an individual's available resources [6]. Stress can intensify the experience of negative emotions, complicating emotion regulation. Further, cognitive load—defined as the level of mental effort required to process a given amount of information [7]—can act as a stressor when the cognitive resources available are insufficient. Therefore, although negative emotions, stress, and cognitive load are distinct constructs, they are interrelated and may lead to overlapping acoustic patterns in speech. We thus decided to include the three constructs in the review and present the results in a way that allows disentangling similarities and differences in the acoustic correlates associated with each construct.

There is evidence that internal states, including emotions, stress, and cognitive load, affect how we speak. Speech patterns are influenced by physiological interactions between the central nervous system (CNS), the autonomic nervous system (ANS), and the vocal production system [8]. When the ANS is activated in response to stress, it triggers physiological responses such as increased heart rate, changes in respiratory rate, and muscle tension [9], which affect the vocal folds and alter the way we sound [10]. Humans can correctly identify different emotional states in speech, but human ratings are limited by inconsistent accuracy across emotional states [11]. In contrast, speech analysis offers a fast, non-invasive, and unobtrusive alternative by examining specific features of speech, known as acoustic correlates [12]. These correlates can be categorized into three main groups:

Prosodic features are elements of speech—such as intonation (the rise and fall of fundamental frequency), energy pattern (loudness), rhythm (the timing of speech), and duration—that relate to long segments of speech, such as sentences, words, syllables, and expressions [13]. These are commonly referred to as suprasegmental features. Typically, they are derived through short-term processing of the speech signal to extract acoustic correlates such as fundamental frequency and short-term energy across different time windows. These correlates are then parameterized at the turn or utterance level (i.e., over long segments of speech). For a detailed description of the acoustic correlates, refer to Table 1.

Table 1. Classification of acoustic correlates used in these studies.

https://doi.org/10.1371/journal.pone.0328833.t001
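To make the short-term processing described above concrete, the following sketch derives frame-wise fundamental frequency (via autocorrelation) and short-term energy from a raw signal using plain NumPy. Function names and parameter choices are illustrative only; the studies reviewed here typically rely on dedicated tools such as PRAAT or openSMILE.

```python
import numpy as np

def short_term_features(signal, sr, frame_len=0.03, hop=0.01):
    """Frame-wise F0 (autocorrelation estimate) and short-term energy.

    A minimal sketch of short-term processing: the signal is cut into
    overlapping frames, and per-frame acoustic correlates are computed
    that can later be parameterized at the utterance level.
    """
    n = int(frame_len * sr)   # samples per frame
    h = int(hop * sr)         # hop size between frames
    f0s, energies = [], []
    for start in range(0, len(signal) - n, h):
        frame = signal[start:start + n]
        energies.append(float(np.sum(frame ** 2)))
        # autocorrelation-based pitch estimate, searching 50-500 Hz
        ac = np.correlate(frame, frame, mode="full")[n - 1:]
        lo, hi = int(sr / 500), int(sr / 50)
        lag = lo + int(np.argmax(ac[lo:hi]))
        f0s.append(sr / lag)
    return np.array(f0s), np.array(energies)
```

Mean, variance, or range of the returned F0 and energy tracks then correspond to the utterance-level prosodic correlates (mean F0, intensity) discussed throughout the review.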

Voice quality features describe attributes of a voice's sound, such as breathiness and smoothness. Common features include jitter (small variations in pitch), shimmer (small variations in loudness), and the harmonics-to-noise ratio (HNR), which indicates how clear a voice sounds. Changes in these features can signal vocal strain or indicate certain disorders [33,34].
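As an illustration of how these voice quality measures are defined, local jitter and shimmer can be sketched as relative cycle-to-cycle variation. This is a simplified sketch with hypothetical function names; tools such as PRAAT first estimate the glottal periods and peak amplitudes from the signal before applying these formulas.

```python
import numpy as np

def jitter_local(periods):
    """Local jitter: mean absolute difference between consecutive
    glottal periods, relative to the mean period."""
    p = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_local(amplitudes):
    """Local shimmer: the same measure applied to the peak
    amplitudes of consecutive cycles."""
    a = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(a))) / np.mean(a)
```

A perfectly periodic voice yields zero jitter and shimmer; vocal strain or tension increases the cycle-to-cycle irregularity these measures capture.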

Spectral features relate to the frequency content of the sound signal. A common feature is the Mel Frequency Cepstral Coefficient (MFCC), which captures characteristics of speech sounds and is widely used in speech processing (e.g., speaker recognition) [35].
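A minimal sketch of the MFCC computation chain for a single frame (power spectrum, triangular mel filterbank, log compression, DCT-II) is given below. Real toolkits add pre-emphasis, windowing, and liftering; the function names here are illustrative, not a specific library's API.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=26, n_coeffs=13):
    """Simplified MFCCs for one frame: power spectrum -> mel
    filterbank -> log -> DCT-II."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    # triangular filters equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        left, center, right = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        up = (freqs - left) / (center - left)
        down = (right - freqs) / (right - center)
        fbank[i] = np.clip(np.minimum(up, down), 0.0, None)
    log_energy = np.log(fbank @ spec + 1e-10)
    # DCT-II decorrelates the log filterbank energies
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return dct @ log_energy
```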

These acoustic correlates map onto different emotions, as different emotions exhibit distinct acoustic profiles in speech [36,37]. Emotions with the same level of valence and arousal share similar acoustic profiles [38]. Thus, we expect similarities between the acoustic profiles of negative emotions, stress, and cognitive load. However, research on identifying accurate markers of these states in speech remains inconsistent and variable across studies. As a result, a consensus on the acoustic correlates of negative emotions, stress, and cognitive load in speech has yet to be established.

The objective of this systematic review is to identify key features of speech that indicate negative emotions, stress, and cognitive load. By establishing a comprehensive overview of the current state of research, this review contributes to a deeper understanding of how these states are externalized through speech. By doing so, we aim to provide guidance for practitioners and researchers by offering recommendations for measuring affective states in communication.

The question the present literature review aims to answer is: What are acoustic correlates of negative emotions, stress, and cognitive load in speech in healthy adults?

Methods

This systematic search is reported according to the PRISMA 2020 guidelines [39]. A protocol for this review was published in PROSPERO (CRD42024525922).

Eligibility criteria

Studies were included if they met the following criteria: (1) they were original, peer-reviewed journal articles; (2) they reported experiments or field studies; (3) they examined the acoustic correlates of negative emotions, stress or cognitive load in speech; and (4) they involved healthy adult participants.

Studies published from the inception of the databases until March 19, 2024, were included.

We excluded studies that focused on participants with (1) disabilities; (2) psychological disorders (e.g., schizophrenia, depression, and autism); (3) neurodegenerative disorders (e.g., Alzheimer’s disease, dementia, Huntington’s disease, Parkinson’s disease); or (4) speech disorders (e.g., aphasia, dysphonia). Studies conducted with children or animals were excluded.

Additionally, we excluded review articles and studies that (1) focused solely on methods for feature extraction for machine learning models, (2) were related to speaker or language identification, or (3) were conducted with actors or simulated emotions.

Search strategy

A systematic search was conducted in March 2024 across four electronic databases: PubMed, PsycInfo, Web of Science, and Scopus. The search strategy included terms related to acoustic correlates, tension (e.g., stress, negative emotion, cognitive load, frustration, negative affect, anger, aggression), and speech (e.g., oral communication, voice communication). For a detailed description of the search string, refer to S1 Appendix in the Supporting Information.

Two independent reviewers (LS, SK) performed the selection of articles for inclusion using Rayyan [40]. Studies were selected based on their title, keywords, and abstract. The reviewers were blinded to each other's decisions; the concordance rate was 97% for abstract screening, and disagreements were resolved by discussion.

Data extraction.

The extracted data included the following information: author(s), year of publication, study type, country of publication or study location, measured emotion or stress, measured acoustic correlates, type of measurement (human vs. automated rating), participants’ demographics (including gender, age, and language), and reported acoustic correlates.

In addition, we extracted information regarding the setting in which emotions were assessed, additional measurement methods employed to validate the emotion, and the presence of a control condition or group. For detailed information on the extracted data, refer to S1 Table in the Supporting Information.

Risk of bias assessment.

The methodological quality of the included studies was assessed using the latest version of the Mixed Methods Appraisal Tool (MMAT) [41]. The MMAT is designed for the quality appraisal of empirical studies based on a variety of methodologies: qualitative research, randomized controlled trials, non-randomized studies, quantitative descriptive studies, and mixed methods studies. It is widely used in systematic reviews that include mixed methods studies [42,43]. Studies were rated on five subquestions, with each positively answered subquestion contributing 20% to the total quality score. If a subquestion was rated as only partly positive, the study met 10% of the quality criteria for that item. The overall quality score is the sum of the percentages from all subquestions, so each study can achieve between 0% and 100%. In cases of missing data, studies received lower scores.
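The MMAT scoring scheme described above can be expressed as a small helper. The function name and the string labels are illustrative conveniences, not part of the MMAT itself.

```python
def mmat_score(ratings):
    """Overall MMAT quality score (0-100%) from five subquestion
    ratings: 'yes' contributes 20%, 'partly' 10%, 'no' 0%."""
    if len(ratings) != 5:
        raise ValueError("MMAT uses exactly five subquestions")
    points = {"yes": 20, "partly": 10, "no": 0}
    return sum(points[r] for r in ratings)
```

For example, a study with three clearly met criteria, one partly met, and one unmet scores 20 + 20 + 20 + 10 + 0 = 70%.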

Data synthesis.

For data synthesis, study characteristics were systematically extracted and tabulated in an Excel sheet. The following variables were recorded: author(s), year of publication, study location, study design (field study or experiment), emotion or stress measured, and acoustic correlates assessed. Additionally, details were recorded on the task or source used to assess or elicit negative emotions or stress, the type of speech samples analyzed (e.g., sentences, syllables, vowels), the main findings of each study, and the quality assessment scores.

To synthesize the findings, we grouped the results into three distinct categories:

  1. Stress-related
  2. Negative emotion-related
  3. Cognitive load-related

Within each category, studies were further classified according to the type of acoustic features analyzed, grouping them into spectral, quality, and prosodic features. These categories were defined based on common acoustic parameters used in the included studies. The findings are summarized descriptively, with particular focus on the key patterns observed in relation to the specific psychological state (negative emotion, stress, or cognitive load) being assessed. Differences in study outcomes were noted and discussed in relation to variations in study design and measurement techniques.

Results

Risk of bias assessment

The Mixed Methods Appraisal Tool (MMAT) was used to assess the quality of the studies included. The results of this assessment are shown in Table 2. Out of the 39 studies, 35.9% showed a high methodological and reporting quality (scoring >80%), 33.3% showed moderate quality (scoring between 60–79%), and the remaining 30.8% were of low quality (scoring less than 60%). The average MMAT quality score across the 39 studies was 65.4%, indicating moderate overall quality. The most common reasons for low and moderate scores in non-randomized quantitative studies were unclear or missing information about the methodology, particularly regarding inclusion and exclusion criteria and participant recruitment methods. Other contributing factors included the lack of consideration for confounding variables such as smoking, caffeine, and alcohol, and the absence of additional measures (e.g., physiological or subjective measures) or sufficient control conditions. For a detailed overview, refer to S2 Table of the Supporting Information.

Table 2. Characteristics of the studies included in the systematic review (N = 38).

https://doi.org/10.1371/journal.pone.0328833.t002

Fig 1 illustrates the results of the search strategy. After the removal of duplicates, screening of titles and abstracts, full-text screening, and risk of bias analysis, 36 articles were included for data extraction (as detailed in Table 1). After reviewing the reference lists of relevant studies, two additional articles were identified and included, bringing the total number of articles to 38. One article, a conference proceeding by Lee & Redford [44], was identified in the systematic search as a peer-reviewed work and, after discussion among the authors, met the inclusion criteria.

Fig 1. PRISMA 2020 flow diagram for new systematic reviews.

Adapted from Page et al. [39].

https://doi.org/10.1371/journal.pone.0328833.g001

Study characteristics

Among the 38 articles included, 78.9% (n=30) employed an experimental design, whereas 18.4% (n=7) were field studies conducted in areas such as aerospace, healthcare, broadcasting, and academic contexts. One of the articles [22] reported both a field study and a laboratory experimental design, bringing the total number of studies to n=39.

Most studies were conducted in the USA (n=17), followed by the UK (n=3), Finland (n=3), Belgium (n=3), and other European and Asian countries. The majority of studies targeted stress (n=20) and cognitive load (n=10). Studies on negative emotions (n=9) focused on anger (n=3), anxiety (n=3), fear/tension (n=2), and threat response (n=1).

Most studies (n=10) elicited stress or cognitive load using cognitive tasks, such as the Stroop Task and arithmetic tasks performed under time pressure. Eight studies used simulated scenarios in contexts such as air force, military, operating room, and driving settings. Additional contexts included: 1) passive mood induction procedures, such as follow-up consultations and viewing slide presentations or images eliciting negative emotions (n=4); 2) audio recordings, including calls to emergency services, recordings of real-life stressful conversations, and broadcast recordings (n=4); 3) social stressors, such as the Trier Social Stress Test (TSST), Cyberball, and cognitive tasks with negative evaluations (n=6); 4) oral examinations (n=3); and 5) delivering presentations or speeches (n=2).

In addition to acoustic analyses of negative emotions, stress, and cognitive load, some studies (n=20) used physiological or subjective measures to validate the presence of these states. These measures included heart rate (n=7), blood pressure (n=1), pulse rate (n=3), self-reports (n=6), skin conductance (n=4), palmar sweating (n=1), behavioral signals (n=3), cortisol levels (saliva) (n=5), and pupillary response (n=1). A detailed overview is provided in S3 Table in the Supporting Information.

Commonly used tools or methods to analyze the acoustic features were PRAAT, Multi-speech software tools from KayPENTAX™, the open-source toolkit OpenSmile, and human ratings. Regarding measured speech sequences, most studies focused their analyses on natural speech, read-aloud sentences, utterances, and vowels. A few studies also examined counting and speech produced during presentations. For further details, see Table 2.

All 39 studies examined prosodic features, with 14 also investigating voice quality features and 12 analyzing spectral features. Some studies examined multiple feature categories.

Prosodic features were examined 69 times, with 52 instances showing a correlation or relationship with negative emotions, stress, or cognitive load. Voice quality features were examined 37 times, detecting these states in only 18 cases. Of the 19 instances where spectral features were examined, 8 showed at least one relationship with negative emotions, stress, or cognitive load. Thus, compared to spectral and voice quality features, prosodic features showed overall better accuracy in detecting negative emotions, stress, and cognitive load.
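These detection rates can be computed directly from the counts reported above (a hypothetical helper for illustration, not part of the review's analysis):

```python
def hit_rate(detected, examined):
    """Share of examinations in which a feature category showed a
    relationship with the target state, as a percentage."""
    return 100.0 * detected / examined

# (detected, examined) counts reported in the review
counts = {
    "prosodic": (52, 69),
    "voice quality": (18, 37),
    "spectral": (8, 19),
}
rates = {k: hit_rate(d, e) for k, (d, e) in counts.items()}
```

This yields roughly 75% for prosodic features versus about 49% and 42% for voice quality and spectral features, reflecting the ordering described in the text.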

The following section presents the findings on acoustic correlates associated with negative emotions, stress, and cognitive load.

Negative emotions

Within the category of negative emotions, the studies investigated anxiety, anger, fear, and responses to perceived threat. The acoustic patterns associated with each are presented below.

Anxiety: In the category of spectral features, two studies reported changes in formant frequencies related to anxiety. Li and colleagues reported changes in the first formant (F1) and in the standard deviation of MFCC1 [45], while Fuller and colleagues reported changes in the second formant (F2) [46].

For prosodic features, findings on fundamental frequency (F0) were mixed. One study reported a correlation between F0 and self-reported anxiety [45], whereas two other studies found no significant changes [46,47]. Anxiety was also associated with decreased intensity and increased pause duration [47] and short-term energy (STE) [45].

Both studies examining the voice quality features jitter or shimmer reported associations with anxiety [46,47]. Fig 2 illustrates the percentage of studies reporting relationships between the acoustic features and anxiety, with larger bubbles indicating a greater number of studies.

Fig 2. Number of studies and percentage of studies reporting a relationship between acoustic features and anxiety visualized using SRplot [48].

The size of the balloons indicates the number of studies examining each acoustic correlate. The color represents the percentage of studies reporting a relationship, with darker colors indicating a higher proportion of findings supporting an association between the acoustic feature and anxiety.

https://doi.org/10.1371/journal.pone.0328833.g002

Anger: In the category of prosodic features, anger was associated with increased mean F0 and related features in two studies [49,50], whereas Sobin and Alpert [51] reported a decrease in mean F0 but an increase in F0 variance. Anger was also linked to higher volume [51] and greater maximum energy levels [49].

Results on temporal features were mixed. Sobin and Alpert [51] reported fewer and shorter pauses and reduced speech time, while Biassoni et al. [49] found no changes. Speech rate consistently increased for anger across two studies [50,51], whereas the voice quality features jitter and shimmer showed no association with anger [50]. Refer to Fig 3 for a visual summary of the results.

Fig 3. Number of studies and percentage of studies reporting a relationship between acoustic features and anger visualized using SRplot [48].

The size of the balloons indicates the number of studies examining each acoustic correlate. The color represents the percentage of studies reporting a relationship, with darker colors indicating a higher proportion of findings supporting an association between the acoustic feature and anger.

https://doi.org/10.1371/journal.pone.0328833.g003

Fear: For prosodic features, Sobin and Alpert [51] found that fear was associated with increases in both mean F0 and F0 variance. In addition, they reported a faster speaking rate, shorter speech duration, fewer and shorter pauses, and increased volume. In contrast, Bonner [26] observed changes in F0 and temporal parameters but found no consistent trends in the number and duration of pauses or syllable length. Refer to Fig 4 for a visual summary of the results.

Fig 4. Number of studies and percentage of studies reporting a relationship between acoustic features and fear visualized using SRplot [48].

The size of the balloons indicates the number of studies examining each acoustic correlate. The color represents the percentage of studies reporting a relationship, with darker colors indicating a higher proportion of findings supporting an association between the acoustic feature and fear.

https://doi.org/10.1371/journal.pone.0328833.g004

Threat response: Threat response was associated with increased mean F0 in one study [52].

Stress

Spectral Features: Three studies reported significant variations in formant frequencies under stress [22,53,54], while two studies found no significant changes [18,55]. Other spectral features, such as power spectra/spectral energy, and the Hammarberg Index, showed promising results [53,56]; however, the number of studies investigating these features is limited.

Prosodic Features: Several parameters related to fundamental frequency (F0) have been identified as correlates of stress. Notably, fifteen out of nineteen studies reported a significant increase in mean F0 [18,22,52–55,57–65]. However, four studies found no consistent trend or increase in F0 [56,66–68]. For example, Kappen and colleagues [66] observed an increase in F0 within the MIST stress paradigm, but not in the Cyberball paradigm. Streeter and colleagues [67] reported no consistent trend using automated speech analysis, yet human raters perceived an increase in F0 amplitude levels and greater variability in F0. In contrast, Taylor and colleagues [68] observed a decrease in F0 in a social stress experimental setting, and Tolkmitt and Scherer found no consistent significant changes in mean F0 across conditions [56].

Two studies found a significant increase in intensity or amplitude under stress [59,65]; Streeter and colleagues [67] found that human raters also perceived an increased intensity, but did not find a consistent trend using automated speech analysis.

The results regarding time or duration of speech and speech rate are heterogeneous. While some studies found a decrease in speech duration [55,59], another study found no effect [69]. Similarly, Pisanski and Sorokowski [55] observed an increase in speech rate under stress, while three other studies [61,67,70] found no effect. Kappen and colleagues [66] found an increase in voiced segments per second and in voiced segment length under stress. Buchanan et al. (2014) reported an increase in the number and duration of pauses.

Voice Quality Features: Three studies [55,64,66] found no significant change in Harmonics-to-Noise ratio (HNR) under stress, while two reported significant changes in HNR [47,64] but no consistent trend. While Kappen and colleagues [71] reported an increase in HNR, Tavi [53] reported a decrease in HNR under stress. Results for jitter and shimmer were inconsistent. Whereas two studies reported a decrease in shimmer [53,71] and a decrease in jitter [66], most of the studies found no association between shimmer and stress [53,55,64]. Similarly, the majority of the studies found no association between jitter and stress [53,55,64,71]. However, Kappen and colleagues [66] reported a reduction in jitter for the MIST stress paradigm. Refer to Fig 5 for details.

Fig 5. Number of studies and percentage of studies reporting a relationship between acoustic features and stress visualized using SRplot [48].

The size of the balloons indicates the number of studies examining each acoustic correlate. The color represents the percentage of studies reporting a relationship, with darker colors indicating a higher proportion of findings supporting an association between the acoustic feature and stress.

https://doi.org/10.1371/journal.pone.0328833.g005

Cognitive load

Spectral features: Three studies found no significant changes in formant frequencies [21,31,56]. Huttunen and colleagues [72] reported significant variations in formant frequencies under cognitive load, but the direction of the effect varied across different formants. Regarding other spectral features, a decrease in spectral tilt was observed by Lively and colleagues [21], whereas Boyer and colleagues [31] found no association.

Prosodic features: Most of the studies investigating F0 found a relationship with cognitive load. Six out of ten studies reported an increase in F0 [27,31,73–75], while Lively and colleagues found a decrease in the standard deviation of F0 [21]. Four studies, however, reported inconsistent changes [29,44,56,76].

All three studies investigating intensity or amplitude found an increase under cognitive load [21,27,74].

Findings on speech rate were mixed: Huttunen and colleagues [74] reported a decrease in articulation rate (syllables per second), while Lee and Redford [44] and Brenner and colleagues [27] reported increases in speech and articulation rates. Lively and colleagues [21] reported a decrease in phrase duration, and Lee and Redford [44] reported fewer prosodic breaks.

Voice quality features: Mendoza and Carballo [75] found no change in noise-to-harmonics ratio (NHR) under cognitive load. Regarding the low-to-high spectral energy ratio (L/H ratio), MacPherson and colleagues [76] reported a decrease under cognitive load. Abur and colleagues [29] found a decrease in L/H ratio in older adults but the results were not statistically significant.

Most studies (n = 3) found no significant changes in jitter and shimmer under cognitive load [27,31,73], although Mendoza and Carballo [75] reported decreases in both features. Additionally, few studies identified an impact of cognitive load on further voice quality features: Mendoza and Carballo [75] reported a decrease in Voice Turbulence Index (VTI) and an increase in high-frequency harmonic energy (SPI), while Boyer and colleagues [31] reported an increase in N and a decrease in DALT0 (see Fig 6 for details).

Fig 6. Number of studies and percentage of studies reporting a relationship between acoustic features and cognitive load visualized using SRplot [48].

The size of the balloons indicates the number of studies examining each acoustic correlate. The color represents the percentage of studies reporting a relationship, with darker colors indicating a higher proportion of findings supporting an association between the acoustic feature and cognitive load.

https://doi.org/10.1371/journal.pone.0328833.g006

Discussion

Our review identified a total of 28 different acoustic correlates investigated in association with negative emotions, stress, or cognitive load.

Results for negative emotions showed different patterns for specific emotions. Some studies reported changes in formant frequencies and MFCCs associated with anxiety. Findings for F0 were inconsistent, while a few studies found associations with intensity/amplitude, short-term energy, duration or number of pauses, and jitter and shimmer. Anger was associated with increases in F0, speech rate, volume, and intensity, but not with jitter and shimmer. Similarly, fear was associated with a faster speech rate and increased volume in one study; no consistent trends were found for F0 or time parameters.

For stress, a majority of studies reported significant increases in mean F0. Similarly, intensity/amplitude increased under stress. Results for speech rate and speech duration were heterogeneous with some studies reporting a decrease in speech duration and an increase in speech rate, while others found no changes. Voice quality features (such as HNR, jitter, and shimmer) showed no consistent trends.

For cognitive load, changes in F0 and intensity/amplitude were reported. However, a few studies found no changes, decreases or no consistent trend for F0. Speech rate findings were mixed, with both increases and decreases reported across studies. Changes in voice quality features were observed in a few studies. Jitter and shimmer showed no consistent patterns.

Across all conditions, increased F0 and increased intensity emerged as good indicators of anger, stress, and cognitive load. However, some studies on cognitive load yielded mixed results, indicating that F0 may be a slightly less effective marker of cognitive load than of negative emotions and stress. One possible reason is that cognitive load does not engage the autonomic nervous system in the same way as negative emotions and stress, leading to fewer, or none, of the high-intensity physiological responses that affect F0.

Across conditions, speaking rate and voice quality features showed inconsistent results. Interestingly, studies that reported no effect on jitter and shimmer focused on cognitive load, suggesting that further research is needed to explore their association with other emotional states. In addition, several voice quality features—Voice Turbulence Index (VTI), High-Frequency Harmonic Energy (SPI), N, and Digital Amplitude Length (DAL)—as well as spectral features such as spectral tilt, spectral energy, and the Hammarberg Index, showed a positive association with cognitive load and stress. Nonetheless, studies on these parameters are currently scarce, limiting our understanding of how they vary with different emotions and levels of cognitive load and stress. Specifically, the impact of negative emotions and stress on spectral features remains underexplored, with no studies focusing on how these features may be associated with these states.

The gap in research is compounded by the limited comparability across studies due to differences in experimental settings, variations in stress induction procedures, and challenges in quantifying the type and level of emotion or stress induced [75,77]. It is well known that simulations can elicit the same emotions as real-life situations [78]. In contrast, laboratory studies might cause participants to modify their emotional responses due to increased self-awareness. Furthermore, laboratory-induced stress often results in smaller stress responses compared to real-life situations [79], and certain acoustic correlates might be less effective in detecting subtle changes in negative emotions and stress. Another reason for inconsistency, as suggested in other research, could be the reliance on acted emotional databases, as acoustic variations in spontaneous speech are more subtle than in posed emotional expressions [80]. Moreover, different stress protocols stimulate qualitatively and quantitatively different stress responses [66,81], and thus inconsistent effects across studies might be explained by differences in stress induction methods [82]. The Trier Social Stress Test (TSST) is a widely recognized tool for inducing psychosocial stress in laboratory settings [83,84], whereas other methods might not evoke a stress response strong enough to elicit observable variation in speech, or could induce different stress response patterns. For example, while Kappen and colleagues [66] observed significant changes in acoustic parameters with the MIST stress paradigm (inducing cognitive stress), no significant changes were found for the Cyberball stress paradigm. This difference might arise because the MIST paradigm elicits a physiological neuroendocrine stress response, whereas the Cyberball paradigm elicits a psychological stress response.
In line with this, the direction of the effect can vary with the type of emotion or stress studied. Taylor et al. (2016) [68] found significantly lower F0 in a social stress task but marginally higher F0 in a problem-solving task. Future research should systematically compare acoustic features across different types of stressors and report the nature of the stress induction to improve interpretability and cross-study comparability.

It is also important to consider individual differences in stress responses: the same stressor can produce different responses in different individuals. In line with this, some individuals might not show a physiological stress response, and corresponding vocal changes, despite being in a stressful condition [55]. Speech changes can result from both involuntary physiological changes and voluntary effort. Individual coping styles may affect how stress responses manifest in speech, leading to variation in outcomes. Individual differences in vocal output could also be related to the degree of top-down regulation, which is shaped by a person’s role, position, and training [8]. Furthermore, individuals may strategically conceal their true emotional states, complicating the measurement of negative emotions and stress in real-life situations. Some speech patterns might therefore reflect a learned tendency to control the voice rather than a direct effect of autonomic arousal [85]. To minimize inconsistencies across studies, future research should focus on verified emotion and stress states and statistically control for individual differences in stress responses [63]. Using stress induction paradigms that reliably elicit strong stress reactions can also enhance the sensitivity of acoustic analyses to stress-related changes.

A key consideration when interpreting acoustic correlates is the type of measurement used. This review emphasizes associations between acoustic features and self-reports or task-defined conditions, which reflect experienced emotional states, stress, or cognitive load; these may differ from physiological indicators or ratings provided by external observers. A significant limitation of this review is the variability in the quality of the included studies (overall quality score of 65.4%), with some studies showing poor design and small sample sizes, limiting the external validity of the findings. The critical appraisal indicates that most of the reviewed studies are of moderate quality, primarily due to unclear information about the representativeness of participants, the failure to account for potential confounding variables such as smoking, alcohol consumption, and caffeine intake, and the absence of additional control measures. Because we included only studies of healthy adult participants, the findings may not fully generalize to populations with different characteristics (e.g., age, mental health conditions, or other health conditions).

None of the papers in this review included the Teager Energy Operator (TEO) as a feature for emotion recognition. TEO is a nonlinear speech feature used to analyze and classify different emotional states. It is sensitive to the interactions between different frequency components [13]. The advantage of using nonlinear speech features, such as TEO, lies in their ability to detect subtle nonlinear patterns, such as variations in airflow through the vocal tract, which may reflect emotional changes.
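As an illustration, the discrete-time TEO is commonly defined as Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). A minimal pure-Python sketch (variable names are ours) shows the computation and its key property:

```python
import math

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]**2 - x[n-1]*x[n+1].

    Applied to the interior samples of x; the two boundary samples are
    dropped because the operator needs one neighbour on each side.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure discrete sinusoid A*cos(w*n) the operator equals
# A**2 * sin(w)**2 at every sample, so it jointly tracks the
# amplitude and frequency of the underlying oscillation.
A, w = 2.0, 0.3
energy = teager_energy([A * math.cos(w * n) for n in range(200)])
```

Because the operator responds to the product of amplitude and frequency rather than to amplitude alone, deviations of the running TEO profile from this smooth baseline are what make it attractive for capturing nonlinear airflow effects in stressed speech.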

While negative emotions, stress, and cognitive load share partially overlapping acoustic profiles – particularly increases in F0 and intensity – this similarity does not preclude differentiation. Rather, it underscores the importance of combining multiple acoustic features and integrating them with contextual information to more accurately distinguish emotional, stressful, and cognitive states. Although single features may be nonspecific, combinations of features can help distinguish closely related conditions. We therefore recommend using multiple features to improve accuracy in identifying these states, and incorporating complementary measurements such as physiological data or subjective self-reports and assessments. This could lead to more precise emotion recognition tools that enhance real-time detection, which is particularly valuable in complex work environments, where negative emotions and stress can significantly impact teamwork and safety.
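To make the multi-feature recommendation concrete, the two most consistent markers in this review, F0 and intensity, can both be estimated from a short voiced frame. The sketch below is a deliberately simplified illustration (autocorrelation-based pitch tracking and RMS intensity; real pipelines would use validated tools such as Praat, openSMILE, or GeMAPS [19]), with all names our own:

```python
import math

def rms_intensity(x):
    """Root-mean-square amplitude: a simple intensity proxy."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def f0_autocorr(x, sr, fmin=75.0, fmax=400.0):
    """Crude F0 estimate: the lag with the largest autocorrelation
    within the plausible pitch-period range [sr/fmax, sr/fmin]."""
    lo, hi = int(sr / fmax), int(sr / fmin)
    best = max(range(lo, hi + 1),
               key=lambda lag: sum(x[n] * x[n - lag] for n in range(lag, len(x))))
    return sr / best

# A synthetic 220 Hz tone stands in for one voiced speech frame.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(2048)]
features = {"f0": f0_autocorr(tone, sr), "intensity": rms_intensity(tone)}
```

Stacking such per-frame features (together with voice quality and spectral descriptors) into one vector is the usual input representation for the classifiers that would then separate emotional, stressful, and cognitive states.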

To improve future comparability and reduce methodological heterogeneity, the field would benefit from the development and adoption of standardized experimental protocols. These should include clear definitions of the constructs investigated, standardized speech tasks, controlled recording conditions, and consistent inclusion of relevant control variables. Such standardization would enhance transparency, facilitate replication, and support cross-study comparisons. Additionally, the standardized reporting of participant demographics – including age, gender, and language background – and specification of the feature extraction pipeline (e.g., tools used, time window) could improve generalizability across contexts.

Conclusions

In summary, our systematic review examined the acoustic correlates of negative emotions, stress, and cognitive load in speech. It shows that some, but not all, acoustic features may serve as valid, non-invasive indicators for assessing these constructs. Nonetheless, variability in study design and quality likely contributes to the heterogeneity of results observed in the literature. Despite these differences, F0 and intensity, which are prosodic markers, show strong potential as reliable indicators of emotional arousal, stress, and cognitive load. Future research should focus on these acoustic correlates. Other acoustic correlates, especially spectral features, showed promising results in analyzing stress and cognitive load in speech but require further research. This review also highlights the opportunity to explore whether and how spectral features could serve as markers for negative emotions beyond cognitive load and stress. To date, studies conducted in real-world or workplace settings are scarce, making it difficult to capture the complexity of emotions arising naturally in everyday life. Therefore, more research conducted in real-life settings is needed.

Supporting information

S2 Table. Quality Assessment with the MMAT.

https://doi.org/10.1371/journal.pone.0328833.s003

(DOCX)

S3 Table. Characteristics of the studies included in the systematic review.

https://doi.org/10.1371/journal.pone.0328833.s004

(DOCX)

Acknowledgments

We thank David Gaviria for his help with the graphs and Stéphanie Perrodin for commenting on an earlier version of the manuscript draft.

References

  1. Elfenbein HA. The many faces of emotional contagion: An affective process theory of affective linkage. Organ Psychol Rev. 2014;4(4):326–62.
  2. Chrouser KL, Xu J, Hallbeck S, Weinger MB, Partin MR. The influence of stress responses on surgical performance and outcomes: Literature review and the development of the surgical stress effects (SSE) framework. Am J Surg. 2018;216(3):573–84. pmid:29525056
  3. Anton NE, Athanasiadis DI, Karipidis T, Keen AY, Karim A, Cha J, et al. Surgeon stress negatively affects their non-technical skills in the operating room. Am J Surg. 2021;222(6):1154–7. pmid:33549296
  4. Wetzel CM, Kneebone RL, Woloshynowych M, Nestel D, Moorthy K, Kidd J, et al. The effects of stress on surgical performance. Am J Surg. 2006;191(1):5–10. pmid:16399098
  5. Russell JA. A circumplex model of affect. J Pers Soc Psychol. 1980;39(6):1161–78.
  6. Lazarus RS, Folkman S. Stress, Appraisal, and Coping. New York: Springer; 1984.
  7. Cooper G. Cognitive load theory as an aid for instructional design. Australas J Educ Technol [Internet]. 1990 Dec 1 [cited 2025 Apr 1];6(2). Available from: http://ajet.org.au/index.php/AJET/article/view/2322
  8. Van Puyvelde M, Neyt X, McGlone F, Pattyn N. Voice stress analysis: a new framework for voice and effort in human performance. Front Psychol. 2018;9:1994.
  9. Ziegler MG. Psychological Stress and the Autonomic Nervous System. In: Primer on the Autonomic Nervous System [Internet]. Elsevier; 2012 [cited 2024 Nov 6]. p. 291–3. Available from: https://linkinghub.elsevier.com/retrieve/pii/B9780123865250000615
  10. Yao X, Jitsuhiro T, Miyajima C, Kitaoka N, Takeda K. Physical characteristics of vocal folds during speech under stress. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) [Internet]. Kyoto, Japan: IEEE; 2012 [cited 2024 Nov 6]. p. 4609–12. Available from: http://ieeexplore.ieee.org/document/6288945/
  11. Gill AJ, Gergle D, French RM, Oberlander J. Emotion rating from short blog texts. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems [Internet]. Florence, Italy: ACM; 2008 [cited 2024 Sep 4]. p. 1121–4. Available from: https://dl.acm.org/doi/10.1145/1357054.1357229
  12. Baird A, Triantafyllopoulos A, Zänkert S, Ottl S, Christ L, Stappen L, et al. An Evaluation of Speech-Based Recognition of Emotional and Physiological Markers of Stress. Front Comput Sci. 2021;3:750284.
  13. Hashem A, Arif M, Alghamdi M. Speech emotion recognition approaches: A systematic review. Speech Commun. 2023;154:102974.
  14. Giannakopoulos T, Pikrakis A. Audio Features. In: Introduction to Audio Analysis [Internet]. Elsevier; 2014 [cited 2024 Dec 31]. p. 59–103. Available from: https://linkinghub.elsevier.com/retrieve/pii/B9780080993881000042
  15. MathWorks Inc. Spectral Descriptors [Internet]. [cited 2024 Dec 31]. Available from: https://www.mathworks.com/help/audio/ug/spectral-descriptors.html#SpectralDescriptorsExample-4
  16. MathWorks Inc. MFCC [Internet]. [cited 2024 Dec 31]. Available from: https://www.mathworks.com/help/audio/ref/mfccblock.html
  17. Bäckström T, Räsänen O, Zewoudie A, Zarazaga PP, Koivusalo L, Das S, et al. Introduction to Speech Processing: 2nd Edition [Internet]. Zenodo; 2022 [cited 2024 Dec 31]. Available from: https://zenodo.org/record/6821775
  18. Hall A, Kawai K, Graber K, Spencer G, Roussin C, Weinstock P, et al. Acoustic analysis of surgeons’ voices to assess change in the stress response during surgical in situ simulation. BMJ Simul Technol Enhanc Learn. 2021;7(6):471–7. pmid:35520977
  19. Eyben F, Scherer KR, Schuller BW, Sundberg J, Andre E, Busso C, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing. IEEE Transactions on Affective Computing. 2016;7(2):190–202.
  20. Ruiz R, De Hugues PP, Legros C. Advanced voice analysis of pilots to detect fatigue and sleep inertia. Acta Acustica United Acustica. 2010;96(3):567–79.
  21. Lively SE, Pisoni DB, Van Summers W, Bernacki RH. Effects of cognitive workload on speech production: Acoustic analyses and perceptual consequences. J Acoust Soc Am. 1993;93(5):2962–73.
  22. Ruiz R, Absil E, Harmegnies B, Legros C, Poch D. Time- and spectrum-related variabilities in stressed speech under laboratory and real conditions. Speech Commun. 1996;20(1–2):111–29.
  23. Nondestructive Evaluation Physics: Sound. The Speed of Sound in Other Materials [Internet]. NDE-Ed.org; 2025. Available from: https://www.nde-ed.org/Physics/Sound/speedinmaterials.xhtml
  24. Wikipedia contributors. Sound pressure [Internet]. [cited 2025 Jan 3]. Available from: https://en.wikipedia.org/wiki/Sound_pressure
  25. Jacewicz E, Fox RA, Salmons J. Vowel change across three age groups of speakers in three regional varieties of American English. J Phon. 2011;39(4):683–93. pmid:22125350
  26. Bonner MR. Changes in the speech pattern under emotional tension. Am J Psychol. 1943;56:262–73.
  27. Brenner M, Doherty ET, Shipp T. Speech measures indicating workload demand. Aviat Space Environ Med. 1994;65(1):21–6. pmid:8117221
  28. Cummins N, Scherer S, Krajewski J, Schnieder S, Epps J, Quatieri TF. A review of depression and suicide risk assessment using speech analysis. Speech Commun. 2015;71:10–49.
  29. Abur D, MacPherson MK, Shembel AC, Stepp CE. Acoustic Measures of Voice and Physiologic Measures of Autonomic Arousal During Speech as a Function of Cognitive Load in Older Adults. J Voice. 2023;37(2):194–202. pmid:33509665
  30. Lee H, Woodward-Kron R, Merry A, Weller J. Emotions and team communication in the operating room: a scoping review. Med Educ Online. 2023;28(1):2194508. pmid:36995978
  31. Boyer S, Paubel P-V, Ruiz R, El Yagoubi R, Daurat A. Human Voice as a Measure of Mental Load Level. J Speech Lang Hear Res. 2018;61(11):2722–34. pmid:30383160
  32. Di Nicola V, Fiorella ML, Spinelli DA, Fiorella R. Acoustic analysis of voice in patients treated by reconstructive subtotal laryngectomy. Evaluation and critical review. Acta Otorhinolaryngol Ital. 2006;26(2):59–68. pmid:16886848
  33. Upadhya SS, Cheeran AN, Nirmal JH. Statistical comparison of Jitter and Shimmer voice features for healthy and Parkinson affected persons. In: 2017 Second International Conference on Electrical, Computer and Communication Technologies (ICECCT) [Internet]. Coimbatore: IEEE; 2017 [cited 2024 Sep 17]. p. 1–6. Available from: http://ieeexplore.ieee.org/document/8117853/
  34. Remacle A, Garnier M, Gerber S, David C, Petillon C. Vocal Change Patterns During a Teaching Day: Inter- and Intra-subject Variability. J Voice. 2018;32(1):57–63. pmid:28495327
  35. Varma VSN, A Majeed KK. Advancements in speaker recognition: exploring mel frequency cepstral coefficients (MFCC) for enhanced performance in speaker recognition. Int J Res Appl Sci Eng Technol. 2023;11(8):88–98.
  36. Banse R, Scherer KR. Acoustic profiles in vocal emotion expression. J Pers Soc Psychol. 1996;70(3):614–36. pmid:8851745
  37. Patel S, Scherer KR, Björkner E, Sundberg J. Mapping emotions into acoustic space: the role of voice production. Biol Psychol. 2011;87(1):93–8. pmid:21354259
  38. Yildirim S, Bulut M, Lee CM, Kazemzadeh A, Deng Z, Lee S, Narayanan S, Busso C. An acoustic study of emotions expressed in speech. In: Interspeech 2004 [Internet]. ISCA; 2004 [cited 2024 Aug 26]. p. 2193–6. Available from: https://www.isca-archive.org/interspeech_2004/yildirim04_interspeech.html
  39. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLoS Med. 2021;18(3):e1003583. pmid:33780438
  40. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan-a web and mobile app for systematic reviews. Syst Rev. 2016;5(1):210. pmid:27919275
  41. Hong KH, Kim HK, Kim YH. The role of the pars recta and pars oblique of cricothyroid muscle in speech production. J Voice. 2001;15(4):512–8. pmid:11792027
  42. Pace R, Pluye P, Bartlett G, Macaulay AC, Salsberg J, Jagosh J, et al. Testing the reliability and efficiency of the pilot Mixed Methods Appraisal Tool (MMAT) for systematic mixed studies review. Int J Nurs Stud. 2012;49(1):47–53. pmid:21835406
  43. Souto RQ, Khanassov V, Hong QN, Bush PL, Vedel I, Pluye P. Systematic mixed studies reviews: updating results on the reliability and efficiency of the Mixed Methods Appraisal Tool. Int J Nurs Stud. 2015;52(1):500–1. pmid:25241931
  44. Lee O, Redford MA. Verbal and spatial working memory load have similarly minimal effects on speech production. Proc Int Congr Phon Sci. 2015;18:0798.
  45. Li Y, Gao Z, Yang Q, Fu L, Xie Y, Ma X, et al. Nonverbal cues of anxiety in English class presentation: From self- and other-perspectives. Curr Psychol. 2023;42(10):8302–12.
  46. Fuller BF, Horii Y, Conner DA. Validity and reliability of nonverbal voice measures as indicators of stressor-provoked anxiety. Res Nurs Health. 1992;15(5):379–89. pmid:1529122
  47. Lebedeva SA, Shved DM. Study of cognitive performance and psychophysiological state of an operator in conditions of isolation and crowding. Meditsina Tr Promyshlennaya Ekol. 2022;62(4):225–31.
  48. Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, et al. SRplot: A free online platform for data visualization and graphing. PLoS One. 2023;18(11):e0294236. pmid:37943830
  49. Biassoni F, Balzarotti S, Giamporcaro M, Ciceri R. Hot or Cold Anger? Verbal and Vocal Expression of Anger While Driving in a Simulated Anger-Provoking Scenario. SAGE Open. 2016;6(3).
  50. Rochman D, Diamond GM, Amir O. Unresolved anger and sadness: Identifying vocal acoustical correlates. J Couns Psychol. 2008;55(4):505–17.
  51. Sobin C, Alpert M. Emotion in speech: the acoustic attributes of fear, anger, sadness, and joy. J Psycholinguist Res. 1999;28(4):347–65. pmid:10380660
  52. Hodgins HS, Weibust KS, Weinstein N, Shiffman S, Miller A, Coombs G, et al. The cost of self-protection: threat response and performance as a function of autonomous and controlled motivations. Pers Soc Psychol Bull. 2010;36(8):1101–14. pmid:20693387
  53. Tavi L. Acoustic correlates of female speech under stress based on /i/-vowel measurements. Int J Speech Lang Law. 2017;24(2):227–41.
  54. Sondhi S, Khan M, Vijay R, Salhan AK, Chouhan S. Acoustic analysis of speech under stress. Int J Bioinform Res Appl. 2015;11(5):417–32. pmid:26558301
  55. Pisanski K, Sorokowski P. Human Stress Detection: Cortisol Levels in Stressed Speakers Predict Voice-Based Judgments of Stress. Perception. 2021;50(1):80–7. pmid:33302780
  56. Tolkmitt FJ, Scherer KR. Effect of experimentally induced stress on vocal parameters. J Exp Psychol Hum Percept Perform. 1986;12(3):302–13. pmid:2943858
  57. Alvear RMB de, Barón-López FJ, Alguacil MD, Dawid-Milner MS. Interactions between voice fundamental frequency and cardiovascular parameters. Preliminary results and physiological mechanisms. Logoped Phoniatr Vocol. 2013;38(2):52–8. pmid:22741554
  58. Bulling LJ, Bertschi IC, Stadelmann CC, Niederer T, Bodenmann G. Messung der Stimmfrequenz im Paargespräch – Chancen für Diagnostik und Intervention in der Paartherapie [Measuring fundamental frequency in couples’ conversations – Opportunities for assessment and intervention in couple therapy]. Z Psychiatr Psychol Psychother. 2020;68(4):217–27.
  59. Griffin GR, Williams CE. The effects of different levels of task complexity on three vocal measures. Aviat Space Environ Med. 1987;58(12):1165–70. pmid:3426490
  60. Kandsberger J, Rogers SN, Zhou Y, Humphris G. Using fundamental frequency of cancer survivors’ speech to investigate emotional distress in out-patient visits. Patient Educ Couns. 2016;99(12):1971–7. pmid:27506580
  61. Kappen M, van der Donckt J, Vanhollebeke G, Allaert J, Degraeve V, Madhu N, et al. Acoustic speech features in social comparison: how stress impacts the way you sound. Sci Rep. 2022;12(1):22022. pmid:36539505
  62. Wittels P, Johannes B, Enne R, Kirsch K, Gunga H-C. Voice monitoring to measure emotional load during short-term stress. Eur J Appl Physiol. 2002;87(3):278–82. pmid:12111290
  63. Pisanski K, Nowak J, Sorokowski P. Individual differences in cortisol stress response predict increases in voice pitch during exam stress. Physiol Behav. 2016;163:234–8.
  64. Pisanski K, Kobylarek A, Jakubowska L, Nowak J, Walter A, Błaszczyński K, et al. Multimodal stress detection: Testing for covariation in vocal, hormonal and physiological responses to Trier Social Stress Test. Horm Behav. 2018;106:52–61. pmid:30189213
  65. Sabo R, Rajčáni J. Designing the database of speech under stress. Jazyk Cas. 2017;68(2):326–35.
  66. Kappen M, Vanhollebeke G, Van Der Donckt J, Van Hoecke S, Vanderhasselt MA. Acoustic and prosodic speech features reflect physiological stress but not isolated negative affect: a multi-paradigm study on psychosocial stressors. Scientific Reports. 2024;14(1).
  67. Streeter LA, Macdonald NH, Apple W, Krauss RM, Galotti KM. Acoustic and perceptual indicators of emotional stress. J Acoust Soc Am. 1983;73(4):1354–60. pmid:6853847
  68. Taylor CJ, Freeman L, Olguin DO, Kim T. Deviation in voice pitch as a measure of physiological stress response to group processes. Adv Group Process. 2016;33:211–42.
  69. Hecker MHL, Stevens KN, Von Bismarck G, Williams CE. Manifestations of Task-Induced Stress in the Acoustic Speech Signal. J Acoust Soc Am. 1968;44(4):993–1001.
  70. Buchanan TW, Laures-Gore JS, Duff MC. Acute stress reduces speech fluency. Biol Psychol. 2014;97:60–6. pmid:24555989
  71. Kappen M, Hoorelbeke K, Madhu N, Demuynck K, Vanderhasselt M-A. Speech as an indicator for psychosocial stress: A network analytic approach. Behav Res Methods. 2022;54(2):910–21. pmid:34357541
  72. Huttunen KH, Keränen HI, Pääkkönen RJ, Päivikki Eskelinen-Rönkä R, Leino TK. Effect of cognitive load on articulation rate and formant frequencies during simulator flights. J Acoust Soc Am. 2011;129(3):1580–93. pmid:21428521
  73. Congleton JJ, Jones WA, Shiflett SG, Mcsweeney KP, Huchingson RD. An evaluation of voice stress analysis techniques in a simulated AWACS environment. Int J Speech Technol. 1997;2(1):61–9.
  74. Huttunen K, Keränen H, Väyrynen E, Pääkkönen R, Leino T. Effect of cognitive load on speech prosody in aviation: Evidence from military simulator flights. Appl Ergon. 2011;42(2):348–57. pmid:20832770
  75. Mendoza E, Carballo G. Vocal tremor and psychological stress. J Voice. 1999;13(1):105–12. pmid:10223678
  76. MacPherson MK, Abur D, Stepp CE. Acoustic Measures of Voice and Physiologic Measures of Autonomic Arousal during Speech as a Function of Cognitive Load. J Voice. 2017;31(4):504.e1–504.e9. pmid:27939119
  77. Kirchhübel C, Howard DM, Stedmon AW. Acoustic correlates of speech when under stress: Research, methods and future directions. Int J Speech Lang Law. 2011;18(1):75–98.
  78. Behrens CC, Driessen EW, Dolmans DH, Gormley GJ. “A roller coaster of emotions”: a phenomenological study on medical students lived experiences of emotions in complex simulation. Adv Simul (Lond). 2021;6(1):24. pmid:34217370
  79. Zanstra YJ, Johnston DW. Cardiovascular reactivity in real life settings: measurement, mechanisms and meaning. Biol Psychol. 2011;86(2):98–105. pmid:20561941
  80. Laukka P, Thingujam NS, Iraki FK, Elfenbein HA, Rockstuhl T, Chui W, et al. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features. J Pers Soc Psychol. 2016;111(5):686–705.
  81. Skoluda N, Strahler J, Schlotz W, Niederberger L, Marques S, Fischer S, et al. Intra-individual psychological and physiological responses to acute laboratory stressors of different intensity. Psychoneuroendocrinology. 2015;51:227–36. pmid:25462896
  82. Van Lierde K, Van Heule S, De Ley S, Mertens E, Claeys S. Effect of psychological stress on female vocal quality. A multiparameter approach. Folia Phoniatr Logop. 2009;61(2):105–11. pmid:19299899
  83. Birkett MA. The Trier Social Stress Test Protocol for Inducing Psychological Stress. J Vis Exp. 2011;56:3238.
  84. Williams RA, Hagerty BM, Brooks G. Trier Social Stress Test: a method for use in nursing research. Nurs Res. 2004;53(4):277–80. pmid:15266167
  85. VanDercar DH, Greaner J, Hibler NS, Spielberger CD, Bloch S. A description and analysis of the operation and validity of the psychological stress evaluator. J Forensic Sci. 1980;25(1):174–88. pmid:7391775