
Using a standardized sound set to help characterize misophonia: The International Affective Digitized Sounds

  • Jacqueline Trumbull,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Validation, Writing – original draft, Writing – review & editing

    Affiliation Duke University, Durham, NC, United States of America

  • Noah Lanier,

    Roles Formal analysis, Writing – review & editing

    Affiliation Duke University, Durham, NC, United States of America

  • Katherine McMahon,

    Roles Formal analysis, Writing – review & editing

    Affiliation Duke University Medical Center, Durham, NC, United States of America

  • Rachel Guetta,

    Roles Formal analysis, Writing – review & editing

    Affiliation Duke University, Durham, NC, United States of America

  • M. Zachary Rosenthal

    Roles Conceptualization, Supervision, Writing – review & editing

    mark.rosenthal@duke.edu

Affiliations Duke University, Durham, NC, United States of America; Duke University Medical Center, Durham, NC, United States of America

Abstract

Misophonia is a condition characterized by negative affect, intolerance, and functional impairment in response to particular repetitive sounds usually made by others (e.g., chewing, sniffing, pen tapping) and associated stimuli. To date, researchers have largely studied misophonia using self-report measures. As the field is quickly expanding, assessment approaches need to advance to include more objective measures capable of differentiating those with and without misophonia. Although several studies have used sounds as experimental stimuli, few have used standardized stimulus sets with demonstrated reliability or validity. To conduct rigorous research in an effort to better understand misophonia, it is important to have an easily accessible, standardized set of acoustic stimuli for use across studies. Accordingly, in the present study, the International Affective Digitized Sounds (IADS-2), developed by Bradley and Lang (Bradley MM et al., 2007), were used to determine whether participants with misophonia responded to certain standardized sounds differently than a control group. Participants were 377 adults (132 participants with misophonia and 245 controls) recruited from an online platform to complete several questionnaires and respond to four probes (arousal, valence, similarity to personally relevant aversive sounds, and sound avoidance) in response to normed pleasant, unpleasant, and neutral IADS-2 sounds. Findings indicated that, compared to controls, participants with high misophonia symptoms rated pleasant and neutral sounds as significantly (a) more arousing and more similar to trigger sounds in their everyday lives, (b) more unpleasant, and (c) more likely to be avoided in everyday life. For future scientific and clinical innovation, we include a ranked list of IADS-2 stimuli differentiating responses in those with and without misophonia, which we call the IADS-M.

Introduction

Misophonia is a recently defined disorder characterized by intolerance to specific sounds and associated stimuli [1]. Aversive sounds (also sometimes called triggers) are most typically repetitive and produced by others orally or facially (e.g., chewing, throat-clearing, sniffing), though environmental noises also are common (e.g., pens clicking, clocks ticking) [2]. Misophonia is characterized by unusually strong multi-modal emotional responses to trigger sounds irrespective of the acoustic features of these stimuli (e.g., loudness or frequency), as well as acknowledgement that this reaction is disproportionate [3]. Though arguably under-recognized, it has recently gained attention in the popular press, and academic research has rapidly accelerated since Schröder, Vulink & Denys [4] in 2013 proposed that misophonia be considered a new psychiatric disorder and recommended diagnostic criteria.

Despite rapidly advancing research in the first 10 years of scientific studies examining misophonia, much remains unknown about its etiology and course, and about the underlying nature and features that differentiate it from other clinical presentations. Most of the early-stage research investigating misophonia has used questionnaires and other subjective measurement approaches. Responses to sounds presented in a standardized manner may help to assess misophonia more objectively and precisely than relying on self-report questionnaires alone. Once sound banks for misophonia are developed and validated, researchers could use them in conjunction with objective measures of biological, psychological, or social processes (e.g., psychophysiological measures, neuroimaging, etc.) to objectively characterize misophonic responses to trigger sounds in laboratory or treatment studies. Clinicians could use such sound banks to assist with clinical assessment and in certain interventions (e.g., inhibitory learning-based approaches) [5]. Further, using a more standardized measurement approach would help to more comprehensively characterize the nature and scope of trigger sounds and any shared underlying acoustic features.

In contrast, if a more standardized approach is not developed, research studies may rely only on asking participants to identify sounds most bothersome to them. This could result in the research literature remaining somewhat limited in discoveries about misophonia. More specifically, in the absence of standardized measurement using more objective methods, the knowledge base around misophonia may be constrained only to sounds that are most top of mind, or that participants report because the sounds are already widely known to be common in misophonia (e.g., chewing, eating), rather than sounds that are highly aversive but less commonly encountered or reported.

A number of studies have used recorded sounds to obtain a more objective measure of how individuals with misophonia respond to sounds in laboratory conditions. However, these studies largely have used unstandardized stimuli, with sound content chosen based on known triggers from the emerging research literature and recordings drawn from different sources, including YouTube and recordings created by researchers [6–13]. One recent study used a data-driven approach to create a sound bank composed of standardized sounds [14], though sounds were chosen a priori from multiple sources with unknown psychometric properties.

Although these studies used acoustic stimuli, their methodologies have notable limitations. For example, selection of experimental stimuli in most studies has been limited to study team expertise and a review of the literature. Accordingly, most sound sets used were limited by a priori assumptions about the best sounds to use as triggers. On the one hand, it is appropriate to consider existing research and clinical experience in considering which sounds to use experimentally. On the other, if researchers only use their clinical experiences and the currently available research, the full scope of sounds that differentiate those with and without misophonia may be limited.

An additional limitation of most previous studies examining responsivity to sounds is that comparison sounds were not created using the same method as experimental (i.e., misophonia trigger) sounds, precluding clear interpretations of study findings. These studies typically selected sounds from multiple sources; the sounds themselves are of various lengths, are not controlled for sound qualities such as volume or duration, and are largely not drawn from sound sets with standardized norms, reliability, or validity. Ultimately, these sound sets tend to have high face validity but are unstandardized and have undetermined psychometric properties, collectively rendering study interpretations somewhat inconclusive.

Additionally, previous studies using recorded sounds have not used standardized and psychometrically validated sounds derived from basic emotion research. The Free Open-Access Misophonia Stimuli (FOAMS) database advances the literature by providing standardized sounds that are highly accessible to researchers [15]. However, FOAMS differs from the current study in how sounds were chosen and rated. Sounds in the FOAMS dataset were initially chosen a priori and do not come from a sound set that is commonly used in emotion research. Ratings were derived by asking participants to identify the sound and rate how bothersome it was to them, rather than using more standardized ratings such as valence and arousal. Sounds standardized on emotional arousal and valence may be particularly useful for contextualizing misophonia in comparison to the literature on other auditory or psychiatric conditions involving sound sensitivities that have used valence and arousal ratings in studies. Using these standardized ratings can also provide greater understanding of misophonia, as the disorder is widely characterized as featuring physiological (e.g., sympathetic nervous system activation), behavioral (e.g., avoidance, escape, and verbal aggression), and subjective (e.g., certain internal or external attributions; affective states such as anger, anxiety, or moral disgust) emotional responses when anticipating or reacting to personally relevant aversive cues [1].

An additional consideration germane to misophonia is the degree to which participants perceive an experimental sound to be similar to their own personally relevant triggers. To this end, researchers could use sound stimuli that are created in a non-standardized way, tailored to each participant. An advantage of that approach is that the sounds are personally relevant to each participant. However, sound responses across patients may be confounded by individual differences in the acoustic properties of the sounds (e.g., volume, duration). In effect, that approach would sacrifice internal validity for limited external validity. One possible solution to this challenge is to identify standardized sounds that are also experienced as personally relevant. To do this, participants can rate sounds on their similarity to their own trigger sounds.

A final consideration when identifying standardized sounds for use in misophonia research is the degree to which these sounds elicit behavioral responses consistent with what would be expected in those with the condition. Avoidance and escape behaviors are widely characterized as primary responses to anticipated contexts or triggers [4], yet no studies have included ratings of how likely a participant would be to avoid a particular trigger sound.

Creating a misophonia-specific sound list from a standardized and widely accessible stimulus set could promote greater consistency in procedures across studies, as researchers would be using an easily accessible stimulus set with demonstrated reliability and validity [16]. The IADS-2 is available to any Ph.D.-holding faculty member who emails the creators a request form. Replication of results could be easier if more researchers used the same standardized sound list rather than introducing variability across studies by using stimulus sets with sounds drawn from multiple sources with inconsistent sound qualities and content. The more variable and psychometrically unvalidated the sounds used across studies investigating misophonia, the more difficult it may be to synthesize results and draw clear conclusions about the nature, features, and treatment of misophonia.

To address the limitations in this body of research, in the present study we used a stimulus set that has been previously validated for use in basic emotion research and standardized for length, volume, and source of audio clips. To extend research on misophonia, we examined multiple responses to standardized sounds, including affective valence, arousal, similarity to participants’ triggers, and estimated avoidance of sounds. This approach was used to empirically identify standardized sounds relevant to misophonia without choosing sounds a priori.

Current study

This study aimed to use a standardized set of sounds widely used in emotion research, the International Affective Digitized Sounds-2 (IADS-2) [16], to characterize how individuals with misophonia respond to certain sounds and their associated subjective qualities (e.g., valence, arousal). We examined whether there are any IADS-2 sounds that are differentially related to misophonia symptoms, with the aim of characterizing a sound bank of misophonia triggers that is free, accessible, standardized, and has been used in basic emotion research. Investigating whether existing sounds with established norms in emotion research are associated with misophonia symptoms will allow future researchers to use a standardized list of sounds in their research and thereby increase consistency across studies.

To this aim, we presented to individuals with and without misophonia a set of audio clips from the IADS-2 previously normed as neutral, unpleasant, and pleasant. We then asked all participants to rate each sound on valence (unpleasant/pleasant), arousal (relaxing/exciting), similarity to their own trigger sounds, and the likelihood they would avoid these sounds. We predicted that people with high self-reported misophonia would rate typical misophonic trigger sounds (chewing, sneezing, paper rustling) as less pleasant and more arousing than people without misophonia. We also hypothesized that people with misophonia would be more likely to rate typical misophonic trigger sounds as similar to their own trigger sounds and would report higher estimated avoidance of these sounds than people without misophonia. Because trigger sounds may be experienced as neutral or pleasant by those without misophonia, we also predicted that misophonic ratings of positive and neutral sounds would be lower in valence and higher in arousal, similarity, and avoidance.

Method

Participants

Participants (N = 2550) were recruited in May 2021 through CloudResearch in conjunction with Amazon’s Mechanical Turk (MTurk), a crowdsourcing platform that generates data of similarly high quality to other convenience samples (e.g., college students) [17–19]. Adhering to MTurk best practices [20], participants were required to be between 18 and 65 years of age, fluent in English, and currently residing in the United States. Participants who completed the study were compensated $7.25. Once a minimum of 225 controls were recruited, inclusion criteria were narrowed such that only participants who scored a minimum of 2 on the Misophonia Symptom Subscale, a minimum of 2 on the Misophonia Emotions and Behaviors Subscale, and a minimum of 7 on the Severity Subscale of the MQ [21] could continue in the study. These cut scores were chosen because they are suggestive of significant misophonia symptomatology [21]. Participants with these scores or higher were included in the misophonia group. Maximum scores on these sections are 4, 4, and 15, respectively. At the time of study design and recruitment, similarly designed studies in misophonia research were not available to adequately conduct a power analysis. Therefore, group sizes were chosen to be adequately large while considering expense. Participants were evenly split in terms of gender across the entire sample (49.3% female, 49.3% male, and 1.3% other), averaged 38.9 years of age, and were predominantly White (see Table 1 for demographics specific to each group). There were no significant differences in demographics between the misophonia and control groups except for gender (Chi-square test, p < .001). In total, 2550 participants were initially recruited, with the vast majority screened out by the narrowed MQ criteria after the control group was formed. A total of 377 participants remained after screening and passing attention checks, including a control group of 245 participants and a misophonia group of 132 participants. Participants provided implied consent (they continued onto the study procedure without signing), as data were analyzed anonymously. Participants were not warned of the potentially aversive nature of the sounds, but they were told they would be able to withdraw from the study at any time. Participants were told they would not be compensated unless the study was completed. Authors did not have access to information that could identify individuals during or after data collection. Before study procedures began, participants were given a written form describing study procedures and compensation. Participants were told that completing the study implied consent.
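For illustration only, the cut-score screening described above amounts to a simple conjunctive filter. Below is a minimal sketch in Python, assuming a pandas DataFrame of screener scores; the column names (mq_symptom, mq_emotion_behavior, mq_severity) are hypothetical and not taken from the study’s materials.

```python
import pandas as pd

# Hypothetical screener scores; column names are illustrative only.
screened = pd.DataFrame({
    "participant_id": [101, 102, 103],
    "mq_symptom": [2.5, 1.0, 3.0],           # Misophonia Symptom Subscale (max 4)
    "mq_emotion_behavior": [2.2, 0.5, 3.1],  # Emotions and Behaviors Subscale (max 4)
    "mq_severity": [9, 3, 12],               # Severity Subscale (1-15)
})

# Inclusion criteria for the misophonia group: >= 2, >= 2, and >= 7, respectively.
misophonia_group = screened[
    (screened["mq_symptom"] >= 2)
    & (screened["mq_emotion_behavior"] >= 2)
    & (screened["mq_severity"] >= 7)
]
print(misophonia_group["participant_id"].tolist())  # -> [101, 103]
```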

Data integrity check

Several measures were taken to protect data quality. MTurk workers were only included if they historically provided high-quality responses (i.e., had completed at least 1000 Human Intelligence Tasks (HITs) with an approval rate >= 99%). Participants were told in the consent form that if they clicked away from the survey without completing it, they would not receive compensation. Two attention check questions were administered during the survey, and only participants who responded accurately to both could continue the study and be included in data analyses, as accurate answers would indicate that the participants were paying attention. The first attention check ensured that participants were using browsers that supported the sound technology used in the study, and the second asked participants to demonstrate that they had heard a sample sound. This ensured that participants were both paying attention and could hear the sounds played. These measures are in accordance with data quality recommendations for MTurk-administered studies [20] and reflect measures taken in existing MTurk studies [22], as well as previous studies in our lab. Of the 2550 individuals initially recruited, 377 participants (14.8%) passed the data integrity checks and screener, completed the study, and were included in the analyses.

Measures and materials

International Affective Digitized Sounds (IADS-2) [16].

The IADS-2 is a standardized set of emotional sound stimuli used in research worldwide. It was validated with a normative sample and is commonly used in research on emotion and attention [23, 24]. The IADS-2 was intended to allow researchers better control over stimulus selection and to aid in research replication across labs. This sound set was developed by collecting ratings of valence, arousal, and “dominance/control” from a large sample of college students at the University of Florida (at least 100 students rated each sound). The final set of 167 sounds was chosen after three different rating studies [16].

Misophonia Questionnaire (MQ) [21].

The MQ includes 17 items in three subscales: the Misophonia Symptom Subscale, the Misophonia Emotions and Behaviors Subscale, and the Severity Subscale. The Severity Subscale is a single-item measure ranging from 1 to 15 that asks participants to indicate the extent to which sound sensitivity interferes with their lives. A score above six on the Severity Subscale indicates moderate or higher impairment related to sound sensitivities [21]. Initial validation of the MQ demonstrated good internal consistency (α = .86–.89) [21]. The original study also demonstrated strong convergent and discriminant validity. Strong internal consistency was again found in a replication of the original study [25]. Cronbach’s alpha for the present study was .87.
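The internal-consistency statistics reported for the measures in this section follow the standard Cronbach’s alpha formula. For reference, here is a minimal, self-contained sketch; the score matrix is toy data, not study data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 5 respondents x 4 items on a Likert scale.
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```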

Positive and Negative Affect Scale (PANAS) [26].

The PANAS is a widely used scale that measures state affect. It comprises 20 items, 10 measuring positive affect and 10 measuring negative affect. Each item presents a single affect word (e.g., “inspired”), and participants rate how much they currently experience that affect on a 5-point Likert scale. Cronbach’s alpha is reportedly .86–.90 for the Positive Affect scale and .84–.87 for the Negative Affect scale. Test-retest reliabilities were .47–.68 and .39–.71 for the positive and negative scales, respectively. Cronbach’s alpha in the present study was .92 for the Positive Affect scale and .91 for the Negative Affect scale.

Affect Intensity Measure (AIM) [27].

The AIM is a self-report questionnaire that uses a six-point Likert scale to measure trait-level positive and negative affect. Subscales in the full measure include Negative Intensity, Positive Intensity, Negative Reactivity, and Positive Affectivity. To examine constructs associated with misophonia, the current study used only the following subscales: (1) Negative Intensity, or how strong individuals’ negative emotions are in reaction to situations (6 items), and (2) Negative Reactivity, or how easily negative emotions can be triggered by situations (6 items). Larsen and Diener [28] indicated good internal reliability for validation samples (α = .90–.94). Cronbach’s alpha in the present study for the total scale was .90.

Procedure

All study procedures were approved by the Duke Health Institutional Review Board and all participants provided informed consent before beginning the study. Following screening (see above), three groups of participants were each given the following instructions regarding valence and arousal ratings: “We will ask you how each sound makes you feel on a scale from 1 to 9, with 1 being ’most negative’ and 9 being ’most positive.’ Feel free to make ratings between these extremes as well. We will also ask you to rate the arousal of each question: this means we want to know how relaxing, soothing, or boring the sound is versus how exciting, aggravating, or intense it is. This rating is independent of valence, meaning that it doesn’t matter if the arousal you feel is pleasant or unpleasant.”

Each participant was then given a randomized set of 56 sounds to rate (a random one-third of the entire IADS-2 set for each participant, to minimize the burden of listening to all IADS-2 sounds), as well as self-report measures. Sound types (positive, negative, and neutral) were also randomized for each participant. After each sound, participants were asked to rate arousal and valence on the Self-Assessment Manikin (SAM), a pictorial representation of a Likert scale ranging from 1–9, where “1” refers to the lowest possible arousal or valence and “9” refers to the highest. Following the SAM, participants were asked to rate (1) “To what extent does this sound resemble the sounds in your everyday environment that most bother you?” and (2) “To what extent would you avoid this sound in your everyday environment/ day to day life?” These questions were rated on a 1 (“not at all”) to 9 (“as much as possible”) Likert scale. In addition to these ratings, participants completed three measures: the MQ [21], the PANAS [26], and the AIM [27]. Participants could wait as long as they wanted after each six-second sound played before moving on to the rating questions but could not replay or rewind the sound. Once they moved on to the rating questions, they had five seconds to complete each sound rating. IADS-2 sounds are normed for volume; however, participants were able to adjust computer volume while listening.
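The per-participant stimulus assignment amounts to drawing a random subset of 56 of the 167 IADS-2 sounds. A minimal sketch follows, assuming placeholder sound IDs and a per-participant seed (both hypothetical; the study does not describe its survey software at this level).

```python
import random

ALL_IADS2_IDS = list(range(1, 168))  # placeholder IDs for the 167 IADS-2 sounds

def assign_sounds(participant_seed: int, n: int = 56) -> list:
    """Draw a random subset of n sounds (about one-third of the full set)."""
    rng = random.Random(participant_seed)
    return rng.sample(ALL_IADS2_IDS, n)  # random subset, in random order

print(assign_sounds(participant_seed=42)[:5])
```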

Data analytic plan

Group differences in sound ratings were examined using a two-way, mixed analysis of covariance (2 groups × 3 sound types, where “group” corresponds to participants with misophonia or controls, and “sound type” refers to positive, negative, or neutral sounds) on four dependent variables (ratings of valence, arousal, similarity, and avoidance). When statistically significant interactions were observed for sound type, pairwise comparisons were used to determine group differences on each dependent variable, as well as mean differences between sound types on each dependent variable.
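As a structural illustration of this 2 × 3 mixed design, here is a minimal sketch using the pingouin package on long-format data; the file and column names are hypothetical, and this simplified version omits the covariates used in the actual SPSS analyses (pingouin’s mixed_anova does not accept covariates).

```python
import pandas as pd
import pingouin as pg

# Long format: one row per participant x sound type; names are illustrative.
df = pd.read_csv("ratings_long.csv")  # columns: subject, group, sound_type, valence

aov = pg.mixed_anova(
    data=df,
    dv="valence",         # repeat for arousal, similarity, and avoidance
    within="sound_type",  # positive / negative / neutral
    between="group",      # misophonia / control
    subject="subject",
    correction=True,      # report sphericity-corrected (Greenhouse-Geisser) values
)
print(aov)
```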

All analyses were conducted using IBM SPSS [29] statistical software. The first step in the data analytic plan included cleaning and screening the dataset by: (a) inspecting all variables for data entry errors (none were observed), and (b) examining the normality of distributions across study variables. Next, bivariate correlations were explored to examine the relationships among variables and determine whether it would be appropriate to use any covariates. The Positive and Negative Affect Scale (PANAS), the Affect Intensity Measure (AIM), and gender were all found to correlate significantly with the dependent variables and were initially included as covariates. Skewness and kurtosis levels did not exceed acceptable ranges (skewness < 2, kurtosis < 4) [30].
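A minimal sketch of the normality screen described above, using scipy; the file and variable names are hypothetical.

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("study_variables.csv")  # hypothetical table of study variables

for col in df.select_dtypes("number").columns:
    skew = stats.skew(df[col], nan_policy="omit")
    kurt = stats.kurtosis(df[col], nan_policy="omit")  # excess kurtosis
    if abs(skew) >= 2 or abs(kurt) >= 4:
        print(f"{col}: skew={skew:.2f}, kurtosis={kurt:.2f} (outside acceptable range)")
```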

Finally, we ranked sounds according to a composite Z-score, calculated by summing the Z-scores of the mean group differences on each sound for each dependent variable, and we list the full ranked stimulus set as the “IADS-M” sounds (see Table 2). Sounds listed first (e.g., writing, whistling) reflect the sounds that most differentiate individuals with misophonia from controls. Researchers using the IADS-M can determine how many sounds to include in future studies.
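A minimal sketch of this composite ranking, assuming a table of per-sound mean group differences (misophonia minus control) with hypothetical column names; the valence column is sign-reversed before combining, as described in the Results.

```python
import pandas as pd

# Hypothetical per-sound mean group differences (misophonia minus control).
diffs = pd.read_csv("group_mean_differences.csv")
# columns: sound_id, valence_diff, arousal_diff, similarity_diff, avoidance_diff

dvs = ["valence_diff", "arousal_diff", "similarity_diff", "avoidance_diff"]
z = diffs[dvs].apply(lambda col: (col - col.mean()) / col.std(ddof=1))
z["valence_diff"] *= -1  # reverse: the misophonia group rates valence *lower*

diffs["composite_z"] = z.mean(axis=1)  # equal weight across the four variables
iads_m = diffs.sort_values("composite_z", ascending=False)
print(iads_m[["sound_id", "composite_z"]].head(10))  # most differentiating sounds
```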

Results

To determine whether any of the responses differed as a function of sound type across groups, we conducted four univariate ANCOVAs with response as the dependent variable, Group (controls, misophonia) and Sound Type (positive, negative, neutral) as fixed factors, and gender, state affect (positive and negative), and trait affect as covariates. Across all ANCOVAs, Mauchly’s Tests of Sphericity were significant (all ps < .001), and as such, we report the Greenhouse-Geisser corrected values for these analyses. Means for each dependent variable (valence, arousal, similarity, avoidance) as a function of Sound Type (positive, negative, neutral) and Group (controls, misophonia) are presented in Table 3.

Table 3. Descriptive properties for all ratings (N = 377).

https://doi.org/10.1371/journal.pone.0301105.t003

Valence

The first analysis, examining valence, yielded a significant main effect of Sound Type, F(1.58, 585.12) = 7.87, MSE = 0.54, ηp2 = .021, p = .001, and a significant main effect of Group, F(1, 371) = 10.07, MSE = 1.27, ηp2 = .03, p = .002. Finally, there was a significant Sound Type by Group interaction, F(1.58, 585.12) = 27.11, MSE = 0.54, ηp2 = .07, p < .001.

To follow up on this interaction, we conducted a post-hoc analysis with a Bonferroni correction (corrected alpha .05/3 = .017) which revealed that the misophonia group rated positive sounds at a significantly lower valence (i.e., more unpleasant) than the control group (p < .017), with a mean difference in valence ratings of -.66 (95% CI, -.89 to -.43). The mean group difference between valence scores on negative sounds was not significant (.205; 95% CI, -.00 to .41; p > .05). The misophonia group rated neutral sounds at a significantly lower valence than the control group, with a mean difference in valence ratings of -.34 (95% CI, -.54 to -.14; p < .017).
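The post-hoc contrasts in this and the following subsections share the same structure. As a simplified illustration (an unadjusted independent-samples comparison, not the covariate-adjusted contrasts actually reported), a pairwise test against the Bonferroni-corrected alpha might look like this in pingouin; the file and column names are hypothetical.

```python
import pandas as pd
import pingouin as pg

df = pd.read_csv("ratings_long.csv")  # columns: subject, group, sound_type, valence
alpha = 0.05 / 3  # Bonferroni correction across the three sound types

pos = df[df["sound_type"] == "positive"]
res = pg.ttest(
    pos.loc[pos["group"] == "misophonia", "valence"],
    pos.loc[pos["group"] == "control", "valence"],
)
print(res[["T", "dof", "p-val", "CI95%"]])
print("significant at corrected alpha:", bool(res["p-val"].iloc[0] < alpha))
```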

Arousal

The second analysis examining group differences in arousal ratings yielded a significant main effect of Sound Type, F(1.88, 697.14) = 13.156, MSE = 0.50, ηp2 = .03, p < .001, and a non-significant main effect of Group, F(1, 371) = 1.86, MSE = 3.83, ηp2 = .01, p = .173. Finally, there was a significant Sound Type X Group interaction, F(1.88, 697.14) = 12.71, MSE = 0.50, ηp2 = .03, p < .001.

To follow up on this interaction, we conducted a post-hoc analysis with a Bonferroni correction (corrected alpha .05/3 = .017) and found that the mean group difference between arousal scores on positive sounds was not significant (.02; 95% CI, -.29 to .34; p > .017). The mean group difference between arousal scores on negative sounds was also not significant (.01; 95% CI, -.32 to .34; p > .017). However, the misophonia group rated neutral sounds at a significantly higher arousal than the control group (.56; 95% CI, .25 to .87; p < .001).

Similarity to trigger sounds

The third analysis examining group differences in similarity ratings yielded a non-significant main effect of Sound Type, F(1.48, 547.7) = 1.65, MSE = 1.15, ηp2 = .00, p = .199, but a significant main effect of Group, F(1, 371) = 32.17, MSE = 4.82, ηp2 = .08, p < .001. Finally, there was a significant Sound Type X Group interaction, F(1.48, 547.72) = 15.13, MSE = 1.15, ηp2 = .04, p < .001.

To follow up on this interaction, we conducted a post-hoc analysis with a Bonferroni correction (corrected alpha .05/3 = .017) which indicated that the misophonia group rated positive sounds at a significantly higher similarity level than the control group (1.21; 95% CI, .89 to 1.52; p < .017). The mean group difference between similarity scores on negative sounds was not significant (.35; 95% CI, -.06 to .85; p > .017). The misophonia group rated neutral sounds at a significantly higher similarity level than the control group (1.18; 95% CI, .84 to 1.51; p < .001).

Estimated avoidance

The fourth analysis examining group differences in avoidance ratings yielded a significant main effect of Sound Type, F(1.71, 634.2) = 31.29, MSE = 0.89, ηp2 = .08, p < .001, and a significant main effect of Group, F(1, 371) = 34.04, MSE = 4.33, ηp2 = .08, p < .001. Finally, there was a significant Sound Type X Group interaction, F(1.71, 634.2) = 22.09, MSE = 0.89, ηp2 = .06, p < .001.

To follow up on this interaction, we conducted a post-hoc analysis with a Bonferroni correction (corrected alpha .05/3 = .017) which indicated that the misophonia group rated positive sounds at a significantly higher avoidance level than the control group (1.23; 95% CI, .91 to 1.56; p < .001). The mean group difference between avoidance scores on negative sounds was not significant (.29; 95% CI, -.10 to .684; p = .14). The misophonia group rated neutral sounds at a significantly higher avoidance level than the control group (1.18; 95% CI, .84 to 1.53; p < .001).

Finally, bivariate correlations to preliminarily examine the relationship between sound stimuli and misophonia symptoms (via the MQ) revealed that the MQ Symptom Subscale was significantly negatively correlated with mean valence ratings of negative sounds and positively correlated with arousal, similarity, and avoidance scores (rs = -.34, .34, .21, and .35, respectively, ps < .001). In addition, the MQ Severity Score was positively correlated with mean arousal ratings of negative sounds (r = .20, p = .02) and avoidance ratings (r = .19, p = .03). A summary of these results can be found in Table 4.

Once these results were obtained, we created a list of all sounds ranked by the group differences in ratings (see Table 2). To do this, we took the means of the differences between the two groups’ scores on each sound for all four dependent variables. We then created Z-scores for each mean (on each dependent variable). Because valence is the only scale on which participants with misophonia gave predictably lower ratings than controls, these Z-scores were reversed so that we could average the four Z-scores to create composite Z-scores. These composite Z-scores allowed us to rank each sound by how much it differentiated those with misophonia from controls across the four dependent variables, with each dependent variable equally weighted (i.e., arousal, valence, similarity to one’s own triggers, and estimated avoidance). Descriptive properties for the ratings are found in Table 3, while a correlation matrix of study variables is found in Table 5.

Discussion

The primary purpose of this study was to explore whether individuals with high misophonia symptom severity differentially respond to sounds from the IADS-2 compared to healthy controls. We also aimed to curate an easily accessible sound list to be used in future misophonia research using this previously standardized and widely used stimulus set.

Several studies examining misophonia have created sound sets for idiosyncratic study use, and our study replicated several of their methods. Like previous studies [31, 32], specific cut-off scores on the MQ were used to form our misophonia group. We also used valence and arousal ratings to determine reactivity to sounds and included a similarity rating resembling that of another recent study [33]. This study also closely replicated the procedures used to validate the IADS-2: along with using the IADS-2 as our sound set, we used the SAM as our rating scale, included valence and arousal scores, and used the same rating time and duration of time between sounds. However, we extended the literature on the IADS-2 by including group comparisons of responses to the IADS-2 stimuli in a control group and a misophonia group, and by obtaining responses to stimuli regarding perceived similarity and avoidance of sounds.

Results from the present study extend previous research [6–13] by identifying a set of standardized sounds in an accessible sound bank that differentiate those with and without high misophonia symptom severity and impairment. While the IADS-2 is not freely accessible in the way the FOAMS database is, it is available to any Ph.D.-holding faculty member who submits a request form. Unlike previous studies [6–13], the sounds in our study were standardized in terms of length, volume, duration, and other sound qualities. This reduced the possibility that inconsistencies in the acoustic properties of stimuli would confound results.

Based on these findings, we have generated a list of sounds that may be useful in future misophonia research. This stimulus set includes sounds that were not previously described in the empirical literature as the most common sounds that are aversive to those with misophonia. Accordingly, the results from this study help move the field of misophonia research closer to using standardized sounds to characterize the disorder in an empirical manner. In future studies, it will be beneficial to replicate and extend our findings by examining whether the IADS-M sounds add utility to the objective assessment and characterization of misophonia. While previous studies have demonstrated that oral-facial sounds and sounds commonly produced in offices (typing, pen clicking) are prominent triggers in misophonia [35], the top ten most differentiating sounds in the IADS-M included environmental noises, including those associated with eating or drinking (e.g., restaurant and party noises) and office sounds (e.g., paper crinkling and scribbling). This observation points to the importance of understanding the context in which triggering sounds occur, and of not restricting the study of misophonia to sounds lacking important contextual features. Indeed, several recent studies found that sounds in context, and not sounds alone, may elicit stronger aversive responses in misophonia [11, 34].

Finally, we hope that this study inspires clinicians to develop innovative approaches to ethically incorporating IADS-M sounds into treatments for people with misophonia. Although most clinicians may not be able to easily access the IADS-2, findings from this study may help clinicians assess a more heterogeneous set of possible cues beyond the oral-facial sounds consistently highlighted in the literature [35]. Sounds rarely mentioned in the literature (e.g., bird sounds) were found to be quite aversive in this study. These sounds would be unlikely to be used as trigger stimuli, as the research literature prioritizes oral-facial sounds. Again, many of the top 10 IADS-M sounds included contextual and environmental information. This may encourage clinicians to consider contextual information when assessing trigger sounds in their patients, as well as to understand how trigger sounds can begin to generalize into broader contexts (e.g., aversion to chewing sounds generalizes to aversion to restaurant sounds). This stimulus set could also be used directly in research. As one example, Frank and McKay [5] proposed the use of inhibitory learning approaches, whereby therapists use procedures to help patients inhibit unhelpful emotional behaviors in response to trigger sounds, rather than using exposures to induce habituation to sounds. Using a stimulus set as part of treatment to help develop skillful coping (cognitive, behavioral, emotional, physiological, etc.) responses to sounds may be a candidate treatment approach and aligns well with existing evidence-based transdiagnostic treatments such as the Unified Protocol [36].

We used the IADS-2 sounds to study differences in how people with high self-reported misophonia rated sounds compared to people without misophonia. We hypothesized that those with misophonia would rate neutral and positive sounds as less pleasant and more arousing than people without misophonia, because common misophonic triggers are not characterized by qualities normally aversive to the general population, such as loud volume or disturbing content [37]. Further, sounds within the IADS-2 that were face-valid misophonic triggers, such as chewing, scribbling, clocks ticking, and other environmental sounds, were all rated as positive or neutral in the original validation studies. In line with our hypotheses, participants with misophonia rated positive and neutral sounds at a lower valence than healthy controls. However, contrary to our hypothesis, participants with misophonia did not rate positive sounds as significantly more arousing than healthy controls, yet they did rate neutral sounds as more arousing. Our hypothesized pattern of results emerged for both the similarity and avoidance scales, as participants with misophonia rated positive and neutral sounds significantly higher on both measures. There were no significant group differences for ratings of generally negative sounds. This pattern of results was unsurprising, since people with misophonia tend to be triggered by sounds that are not experienced as highly aversive by the general public. Importantly, these results were obtained after covarying for trait positive or negative affect and state negative affect (as well as gender), suggesting that individuals with misophonia react differentially to these sounds regardless of mood or a propensity to experience negative emotions.

While participants with misophonia rated positive and neutral sounds in line with our hypotheses, it should not be concluded that individuals with misophonia generally like pleasant sounds less than controls. Rather, this is evidence that misophonic triggers are commonly reported as pleasant or neutral by the general population. Future studies may benefit from examining individual sounds related to misophonia irrespective of how they are categorized in the IADS-2 manual (e.g., positive, negative, neutral). Further, the discovery of trigger sounds that are normally rated as positive or neutral may also suggest that these sounds are less useful as control sounds in research unrelated to misophonia; there may be individuals with misophonia in future studies who experience these sounds as quite aversive.

This study has several limitations that should be considered and addressed in future research. A considerable limitation of this study was the reliance on self-report to define group conditions and assess dependent variables. Future studies could use clinical interviews [38], as well as more objective measures, including psychophysiological measures (e.g., galvanic skin response, neuroimaging, respiration or heart rate), to capture a more holistic picture of how individuals with misophonia respond to sounds. A multitrait-multimethod approach could strengthen and establish greater evidence of IADS-M validity, including demonstrating concurrent and discriminant validity [39]. Using psychophysiological measures could corroborate self-report ratings and demonstrate concurrent validity, or potentially illuminate gaps in awareness of how these individuals react to sounds. As an objective measure, our stimulus set has not demonstrated reliability and has demonstrated only preliminary construct validity in this study. Test-retest reliability is needed in future replication studies. Our results did correlate with the MQ, demonstrating concurrent validity, but we have not established predictive validity, since we have not demonstrated that our results predict any real-world behavior or concurrent criterion behavior of misophonia.

Our study used cut-off scores on the MQ [21] to form our misophonia group, which precluded a dimensional analysis of misophonia. Future studies might consider using dimensional assessments and analyses, so that a broader proportion of sufferers can be studied and so that we can more dimensionally understand the relationship between sound ratings and severity. Future studies should consider using a clinical control group with similar features to misophonia (e.g., high emotion dysregulation, hyperacusis, perfectionism, etc.) to determine whether features other than sound tolerance may be driving these results or whether this pattern of results is unique to misophonia. Dimensional studies would also help elucidate sound sensitivities that may exist in the control group, as sound sensitivity or aversion to specific sounds is not specific to misophonia. Alternatively, researchers using auditory stimuli may want to screen for misophonia, as these participants may provide unexpected results.

Finally, the diversity of participants in this study was not proportional to the racial breakdown of the United States, as the sample was primarily White. Therefore, we are missing data on how people from different races or cultures react to these sounds, and therefore may not have developed a fully generalizable sound set. Future studies should recruit a more diverse pool of participants to increase ecological validity and accurately represent the population of individuals with misophonia.

References

  1. Swedo SE, Baguley DM, Denys D, Dixon LJ, Erfanian M, Fioretti A, et al. Consensus definition of misophonia: a delphi study. Frontiers in neuroscience. 2022:224. pmid:35368272
  2. Brout JJ, Edelstein M, Erfanian M, Mannino M, Miller LJ, Rouw R, et al. Investigating misophonia: A review of the empirical literature, clinical implications, and a research agenda. Frontiers in neuroscience. 2018;12:36. pmid:29467604
  3. Jastreboff P, Jastreboff M. Treatments for decreased sound tolerance (hyperacusis and misophonia). Seminars in hearing. 2014;35(02):105–120.
  4. Schroder A, Vulink N, Denys D. Misophonia: Diagnostic Criteria for a New Psychiatric Disorder. PLoS ONE. 2013;8(1):e54706. pmid:23372758
  5. Frank B, McKay D. The suitability of an inhibitory learning approach in exposure when habituation fails: A clinical application to misophonia. Cognitive and behavioral practice. 2019;26(1):130–142.
  6. Kumar S, Dheerendra P, Erfanian M, Benzaquén E, Sedley W, Gander PE, et al. The motor basis for misophonia. Journal of neuroscience. 2021. pmid:34021042
  7. Dozier T, Grampp L, Lopez M. Misophonia: Evidence for an Elicited Initial Physical Response. Universal journal of psychology. 2020;8:27–35.
  8. Edelstein M, Brang D, Rouw R, Ramachandran VS. Misophonia: Physiological investigations and case descriptions. Frontiers in human neuroscience. 2013;7:296. pmid:23805089
  9. Cerliani L, Rouw R. Increased orbitofrontal connectivity in misophonia. bioRxiv. 2020.
  10. Samermit P, Saal J, Collins J, Davidenko N. Cross-modal attenuation of misophonic responses. Journal of vision. 2018;18(10):1144.
  11. Samermit P, Young M, Allen AK, Trillo H, Shankar S, Klein A, et al. Development and Evaluation of a Sound-Swapped Video (SSV) Database for Misophonia. Frontiers in psychology. 2022;13:4012.
  12. Heller LM, Smith JM. Identification of Everyday Sounds Affects Their Pleasantness. Frontiers in psychology. 2022;13. pmid:35936236
  13. Enzler F, Loriot C, Fournier P, et al. A psychoacoustic test for misophonia assessment. Scientific reports. 2021;11:11044. pmid:34040061
  14. Hansen HA, Leber AB, Saygin ZM. What sound sources trigger misophonia? Not just chewing and breathing. Journal of clinical psychology. 2021 Nov;77(11):2609–25. pmid:34115383
  15. Orloff DM, Benesch D, Hansen HA. Curation of FOAMS: a Free Open-Access Misophonia Stimuli Database. Journal of Open Psychology Data. 2023 Jan 1;11(1).
  16. Bradley MM, Lang PJ. The International Affective Digitized Sounds (2nd Edition; IADS-2): Affective ratings of sounds and instruction manual. Technical report B-3. University of Florida, Gainesville, FL; 2007.
  17. Hauser D, Paolacci G, Chandler JJ. Common concerns with MTurk as a participant pool: Evidence and solutions. In: Kardes FR, Herr PM, Schwarz N, editors. Handbook of research methods in consumer psychology. New York, NY: Routledge; 2019. p. 319–337.
  18. Kees J, Berry C, Burton S, Sheehan K. An analysis of data quality: Professional panels, student subject pools, and Amazon’s Mechanical Turk. Journal of advertising. 2017;46(1):141–155.
  19. Miller JD, Crowe M, Weiss B, Maples-Keller JL, Lynam DR. Using online, crowdsourcing platforms for data collection in personality disorder research: The Example of Amazon’s Mechanical Turk. Personality disorders: theory, research, and treatment. 2017;8(1):26–34. pmid:28045305
  20. Chandler J, Shapiro D. Conducting clinical research using crowdsourced convenience samples. Annual review of clinical psychology. 2016;12:53–81. pmid:26772208
  21. Wu MS, Lewin AB, Murphy TK, Storch EA. Misophonia: incidence, phenomenology, and clinical correlates in an undergraduate student sample. Journal of clinical psychology. 2014 Oct;70(10):994–1007. pmid:24752915
  22. Everaert J, Joormann J. Emotion regulation difficulties related to depression and anxiety: A network approach to model relations among symptoms, positive reappraisal, and repetitive negative thinking. Clinical psychological science. 2019;7(6):1304–1318.
  23. Partala T, Surakka V. Pupil size variation as an indication of affective processing. International journal of human-computer studies. 2003 Jul 1;59(1–2):185–98.
  24. Lee PM, Tsui WH, Hsiao TC. The influence of emotion on keyboard typing: An experimental study using auditory stimuli. PLoS ONE. 2015 Jun 11;10(6):e0129056. pmid:26065902
  25. Zhou X, Wu MS, Storch EA. Misophonia symptoms among Chinese university students: Incidence, associated impairment, and clinical correlates. Journal of obsessive-compulsive and related disorders. 2017 Jul 1;14:7–12.
  26. Watson D, Clark LA, Tellegen A. Development and validation of brief measures of positive and negative affect: the PANAS scales. Journal of personality and social psychology. 1988;54(6):1063. pmid:3397865
  27. Larsen RJ, Diener E, Emmons RA. Affect intensity and reactions to daily life events. Journal of personality and social psychology. 1986;51:803–814.
  28. Larsen RJ, Diener E. Affect intensity as an individual difference characteristic: A review. Journal of research in personality. 1987 Mar 1;21(1):1–39.
  29. IBM Corp. IBM SPSS Statistics for Windows. Armonk, NY: IBM Corp; 2020.
  30. Kline RB. Principles and practice of structural equation modeling. New York: Guilford; 1998.
  31. Frank B, Roszyk M, Hurley L, Drejaj L, McKay D. Inattention in misophonia: Difficulties achieving and maintaining alertness. Journal of clinical and experimental neuropsychology. 2020 Jan 2;42(1):66–75. pmid:31537171
  32. Simner J, Koursarou S, Rinaldi LJ, Ward J. Attention, flexibility, and imagery in misophonia: Does attention exacerbate everyday disliking of sound? Journal of clinical and experimental neuropsychology. 2021 Nov 26;43(10):1006–17. pmid:35331082
  33. Ferrer-Torres A, Giménez-Llort L. Sounds of Silence in Times of COVID-19: Distress and Loss of Cardiac Coherence in People With Misophonia Caused by Real, Imagined or Evoked Triggering Sounds. Frontiers in psychiatry. 2021;12. pmid:34276431
  34. Siepsiak M, Vrana SR, Rynkiewicz A, Rosenthal MZ, Dragan W. Does context matter in misophonia? A multi-method experimental investigation. Frontiers in neuroscience. 2023;16. pmid:36685219
  35. Rosenthal MZ, Anand D, Cassiello-Robbins C, Williams ZJ, Guetta RE, Trumbull J, et al. Development and initial validation of the Duke Misophonia Questionnaire. Frontiers in psychology. 2021 Sep 29;12:709928. pmid:34659024
  36. Barlow DH, Farchione TJ, Bullis JR, Gallagher MW, Murray-Latin H, Sauer-Zavala S, et al. The unified protocol for transdiagnostic treatment of emotional disorders compared with diagnosis-specific protocols for anxiety disorders: A randomized clinical trial. JAMA psychiatry. 2017;74(9):875–884. pmid:28768327
  37. Rouw R, Erfanian M. A large-scale study of misophonia. Journal of clinical psychology. 2018;74(3):453–479. pmid:28561277
  38. Guetta RE, Cassiello-Robbins C, Anand D, Rosenthal MZ. Development and psychometric exploration of a semi-structured clinical interview for Misophonia. Personality and individual differences. 2022 Mar 1;187:111416.
  39. Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological bulletin. 1959;56(2):81–105. pmid:13634291