Figures
Abstract
Schizophrenia has been linked to reality monitoring confusions, particularly misattributions of internal productions to external sources. We hypothesized that these misattributions may be related to deficits in processing acoustic cues that distinguish the subjective experience of a sound source inside or outside the head, i.e., sound externalization. This study aimed to investigate sound externalization in patients with schizophrenia, particularly for emotional sounds, as emotion influences auditory perception. In an externalization task, twenty-three patients with schizophrenia and twenty-five healthy controls were exposed to neutral and emotional sounds processed to be perceived as: internalized (diotic) or externalized (filtered with either an anechoic head-related transfer function -HRTF- or a binaural room impulse response -BRIR). Participants had to indicate whether the sound source was perceived inside or outside their head. Exploratory analyses also examined the relationships between externalization, reality monitoring, and symptom severity. Compared to controls, patients with schizophrenia rated the filtered sounds (HRTF, BRIR) as less externalized (corrected p < .001) and the diotic sounds as more externalized (corrected p < 0.001). Sounds with negative emotional (anger, fear) were more externalized (corrected p < .001) in both groups, but patients showed reduced externalization when compared to control for several emotions (anger, happiness, fear, sadness) (corrected p < .001). No significant correlation was found between externalization and reality monitoring. In patients, greater symptom severity was associated with reduced externalization of sounds simulated as originating outside the head. These findings suggest an abnormal perception of sound sources in patients with schizophrenia, who confuse sounds inside and outside the head to a greater extent with an influence of emotional contents. Further research is needed to elucidate the relationship between sound externalization and symptoms such as hallucinations.
Citation: Fivel L, Lavandier M, Grimault N, Perrin F, Mondino M, Haesebaert F (2026) Is it inside my head? Characterization of sound externalization in schizophrenia. PLoS One 21(3): e0345074. https://doi.org/10.1371/journal.pone.0345074
Editor: Giulia Prete, Gabriele d'Annunzio University of Chieti and Pescara: Universita degli Studi Gabriele d'Annunzio Chieti Pescara, ITALY
Received: February 13, 2025; Accepted: March 2, 2026; Published: March 16, 2026
Copyright: © 2026 Fivel et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This research was financially supported by the Scientific Council of CH Le Vinatier (#CSRM01). The first and one of the co-last authors (LF and FH) received a salary from Lyon 1 University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Schizophrenia, affecting approximately 1% of the population worldwide, is characterized by disruptions in perceptual, cognitive and social abilities that impair daily functioning [1]. A central cognitive disturbance in schizophrenia is the disruption of the self, marked by difficulties in attributing one’s thoughts, actions, emotions, and experiences to oneself, and a tendency to attribute them to others or to external sources. In particular, patients with schizophrenia have difficulties in reality monitoring, the ability to distinguish between their imagination and perceived events [2,3]. This impairment is more severe in patients with auditory hallucinations [4], who display higher misattribution of internal (imagined) stimuli as coming from an external source. The cognitive underpinnings of reality monitoring deficit remain incompletely understood, but they are thought to involve deficits in the sense of agency, which determines the subjective origin of productions (self or non-self), and perceptual deficits in spatialization, which determines their subjective source/localization (inner or outer space) [5]. The hypothesis of spatialization deficits is supported by studies reporting bidirectional confusions between internal and external spaces in patients with schizophrenia [6]. However, investigations into the distinctions between internal and external spaces have predominantly focused on self-generated productions. Further research is needed to clarify whether the spatialization deficit extends beyond self-generated productions to externally presented stimuli, reflecting broader perceptual deficits in schizophrenia.
Among perceptual deficits in schizophrenia, auditory perception impairments have been consistently reported [7,8], including difficulties in localizing sound sources [9]. Research in schizophrenia has primarily focused on the localization of sounds originating from sources located outside the head and received through open ears, in line with most sounds in the natural world. The perception that a sound source is outside the head, referred to as sound externalization (for a review, see [10]), is thought to rely on various factors such as reverberation and filtering by the body, particularly the head and pinnae [11,12]. In experimental settings, it is possible to manipulate sounds delivered through headphones to simulate their perception as being localized inside or outside the head. Indeed, sound sources can be perceived as if originating from inside the head, or internalized, when both the interaural time and level differences are minimal [13], for instance when sounds are presented diotically through headphones. In contrast, sound can be perceived as coming from outside the head, or externalized, despite the use of headphones by applying filters such as a head-related transfer function (HRTF) [14] with the addition of simulated reverberation further enhancing the perception of externalization [15]. Investigating how individuals with schizophrenia perceive sounds simulated as originating inside or outside the head, by manipulating their degree of externalization, could shed light on spatialization deficits and their role in reality monitoring impairments.
Additionally, the emotional content of auditory information influences both sound localization [16] and reality monitoring. While healthy participants are more able to discriminate perceived and imagined words with a negative valence than neutral words [17], patients with schizophrenia are more likely to misattribute an emotionally negative sentence spoken by themselves as coming from an external source [18], suggesting that emotions can lead to source misjudgements. Emotional content may thus influence the degree of perceived externalization.
The present study aimed to investigate how individuals with schizophrenia perceive the localization of sound sources depending on their simulated degree of externalization and emotional content, and how it relates to their reality monitoring abilities and symptoms. We hypothesized that patients with schizophrenia would display greater confusion in sound externalization compared to healthy controls, particularly for sounds conveying negative emotions. Additionally, we explored the relationship between externalization, reality monitoring, and symptom severity. We hypothesized that confusions in externalization would be correlated with misattributions of internal productions as coming from an external source, and with symptom severity. By investigating the externalization of sound sources and its interplay with symptoms, we aimed to advance our understanding of perceptual disturbances in schizophrenia.
Materials and methods
Participants
Twenty-five outpatients diagnosed with schizophrenia according to DSM-5.0 criteria were recruited from Le Vinatier Hospital (Bron, France). Twenty-five healthy controls matched with the patients for age, sex, and education level, were recruited via social networks, universities and hospital staff. Exclusion criteria included medical treatment, current or history of schizophrenia spectrum disorder or other psychiatric conditions and a first-degree relative with schizophrenia or bipolar disorder for healthy controls and, for all participants, history of auditory or neurological impairment, intellectual disability as assessed by Raven’s matrices, and alcohol or substance abuse. Individuals with regular practice of a musical instrument were excluded to prevent potential confounding effects of musical training on sound perception [19].
Ethics statement
The study received approval from a local ethics committee (Comité de Protection des Personnes Mediterranée sud I, France; ID RCB: 2020-A00542-37; approved on June 18, 2020). All participants provided written informed consent prior to taking part in the study. The study was preregistered on February 9, 2021 (NCT04768335; https://clinicaltrials.gov/study/NCT04768335). Recruitment of participants began on March 30, 2021 and ended on December 22, 2022.
Psychopathology assessment
Schizophrenia symptoms were assessed using the Scale for the Assessment of Positive Symptoms (SAPS, [20]) and the Positive and Negative Syndrome Scale (PANSS, [21]), which assesses the positive, negative, and general psychopathology symptoms.
Procedure
Participants completed perceptual and cognitive tasks on a computer in an isolated room with minimal environmental noise. Sounds were delivered through headphones (HD 250 linear II Sennheiser) at a comfortable level for the participants. Before the experimental trials, participants completed a block of practice trials to ensure they understood the instructions.
Externalization task
The externalization task, programmed using OpenSesame v.3.1., was based on Leclère et al. (2019) and used non-verbal vocalizations from the Montreal Affective Voices battery [22]. These included vowels/a/ pronounced by male and female actors with neutral or emotional (at the highest intensity level) intonation: anger (growl), fear (scream), disgust (grumble), happiness (laugh), sadness (cry).
The degree of perceived externalization was manipulated in Matlab R2020b by creating three versions of each sound using filters for sound convolution [15]: diotic (no filter, assumed to be perceived as originating from the middle the head – internalized), HRTF-filtered (creating an intermediate level of externalization by simulating the effect of specific body characteristics on the sound waves), and filtered by a binaural room impulse response (BRIR; enhancing the degree of perceived externalization by simulating a reverberant environment).
To generate binaural diotic sounds (i.e., stereo sounds), the original monaural vocalizations (sampled at 44100 Hz) were duplicated across the two channels (left, right). For binaural filtered sounds, the original sounds were convolved with a HRTF or BRIR measured by Hummersone et al. [23], using a large cinema-style lecture theatre for the BRIR (room C). To introduce variability in sound source localization in the horizontal plane compared to the diotic sounds (perceived in front/the center), we used HRTFs and BRIRs measured in two left directions relative to the listener (−90°, −60°). Right sources (60°, 90°) were simulated from the left sources by inverting the right and left channels. As azimuths were variability factors, they were not included in the statistical analysis as experimental factors. All sounds were equalized in RMS level by applying the same gain to their left and right channels so that the mean signal power averaged across the two channels was identical across all vocalizations in all conditions. A total of 288 sounds were created and presented, combining 16 sounds * 3 types of sound processing (BRIR, HRTF, diotic) * 6 emotions (neutral, anger, fear, disgust, happiness, sadness).
After each sound, participants were asked to make a binary choice using a keyboard as to the sound source externalization (inside or outside the head). They were given 3 seconds to respond, after which the next sound was presented. Externalization ratings were calculated for each condition as the proportion of trials where the source was perceived outside the head. Participants were expected to have high externalization ratings for BRIR sounds, low ratings for diotic sounds, and intermediate ratings for HRTF sounds [15].
Reality monitoring task
The reality monitoring task, programmed with PsychoPy3 v2020.2.10, was based on the Hear-Imagine task by Brunelin et al. [24]. The task consisted of a presentation phase immediately followed by a test phase. In the presentation phase, 24 neutral French words were visually presented one by one on a computer screen for 3 seconds, preceded by an instruction sentence. Participants were instructed to either listen to the words delivered through headphones or imagine hearing them. The words were pronounced by a neutral male voice. In the test phase, 36 words including the 24 previously presented words and 12 new words, were presented one by one on the computer screen. Participants had to determine whether each word had been presented or not, and if presented, whether it had been heard or imagined. The number of correct source attributions was measured for each source (range 0–12): imagined, heard, and new. Two types of source misattributions were computed: the number of imagined words incorrectly recognized as heard (range 0–12) and the number of heard words incorrectly recognized as imagined (range 0–12).
Analyses
The statistical analyses were conducted using Jamovi (version 2.3.28) and R (version 4.2.3), with a significance threshold set at p < 0.05. The sociodemographic data were compared between groups using Fisher’s exact tests for categorical variables and Mann-Whitney U tests for quantitative variables.
As data did not follow a normal distribution, generalized linear models (GLM) were used.
Externalization ratings are bounded proportions including zero values, there were analyzed as binomial outcomes (with glm and emmeans packages), with the number of trials perceived as external versus internal specified in a cbind formulation. A GLM with binomial family and logit link was fitted with the predictors group (patients vs. controls), sound type (diotic, HRTF, BRIR), and emotion (neutral, anger, disgust, fear, happiness, sadness). Model selection was performed using likelihood ratio tests (LRT) to compare the full model including the three-way interaction (group × sound type × emotion) to models containing only two-way interactions or main effects. Estimated marginal means (EMMs) and pairwise post-hoc comparisons with Bonferroni correction were calculated. Odds ratios (ORs) and 95% confidence intervals (CIs) were derived from the GLM coefficients to quantify the effect sizes.
Two GLMs with gamma family and identity link function were performed on reality monitoring performance to assess whether correct responses were associated with group and source (imagine, hear, new) and whether misattribution errors were associated with group and type of confusion (imagine-to-hear, hear-to-imagine). When a significant association was observed, post-hoc comparisons (contrasts) were conducted with Bonferroni correction.
The relationship between externalization ratings and reality monitoring performance (correct source attributions and source misattributions) was investigated using Spearman correlation. For exploratory analyses, Spearman correlation coefficients were used to assess the relationship with externalization ratings for each type of sound processing and patient’ symptomatology (PANSS negative subscale, PANSS positive subscale, and SAPS hallucination subscale). Bonferroni correction was applied. We additionally conducted exploratory analyses assessing the relationship between antipsychotic medication (in chlorpromazine equivalents) and externalization ratings across the different sound-processing conditions.
Results
Participant characteristics
The analyses were performed on 47 participants (23 patients with schizophrenia and 24 healthy controls), with data from 3 participants excluded due to missing responses on more than 50% of the trials in the externalization task (reaction times > 3 seconds). The participants’ sociodemographic and clinical characteristics are described in Table 1. No significant group differences were found for age, education, or sex.
Externalization ratings across the 3 types of sound processing
A binomial GLM was fitted to participants’ externalization ratings, with group, sound type, and emotion as predictors. The initial three-way interaction (group × sound type × emotion) did not significantly improve model fit compared to a model with only two-way interactions (LRT: χ²(10) = 12.98, p = 0.225), and was therefore excluded. All two-way interactions significantly improved model fit and were retained in the final model: Group × Sound type (χ²(2) = 401.32, p < 0.001); Group × Emotion (χ²(5) = 16.12, p = 0.007) and Sound type × Emotion (χ²(10) = 19.51, p = 0.034). Comparison with the main-effects-only model confirmed that including the significant two-way interactions substantially improved model fit (LRT: χ²(17) = 433.99, p < 0.001). Full model estimates are reported in Supplementary Table 1.
The group × sound type interaction showed that patients had higher odds of perceiving diotic sounds as external compared to controls (corrected p < .001; Table 2, Figure 1). Conversely, they were less likely to perceive HRTF and BRIR sounds as external compared to controls (corrected p < .001).
***p < .001; **p < 0.01.
The group × emotion interaction revealed that across sound types, patients were less likely than controls to perceive sounds as external for anger, fear, happiness, and sadness (all corrected p < .001; Table 3). No significant differences were reported for neutral and disgust after Bonferroni correction.
The sound type × emotion interaction showed that the effect of emotion on externalization varied depending on the sound type. For diotic sounds, anger and fear were more likely to be perceived as external than neutral (corrected p < .001; Table 4, Fig 2), whereas disgust, happiness, and sadness showed no significant differences. For HRTF sounds, anger, fear, and sadness were all more likely to be perceived as external than neutral (corrected p < .001). For BRIR sounds, anger, fear, happiness, and sadness were perceived as more external than neutral (corrected p < .01). The externalization ratings for each sound type and emotional content are displayed in Supplementary Figure 1 for the two participant groups.
Results are presented as mean in percent ± one standard deviation. *p < .05.
Reality monitoring task
The GLM on the correct source attributions indicated a significant group * source interaction (β = −2.67, 95% CI [−4.61, −0.73], p = 0.008; Fig 3, S2 in S1 Table). Patients with schizophrenia less accurately recognized the source of imagined words compared with healthy controls (corrected p = 0.04). No significant between-group differences were found for heard or new words (corrected p = 1.00).
Results are presented as mean ± standard deviation. *p < .05.
The GLM on the source misattributions revealed no significant interaction between group * type of source misattribution (S3 in S1 Table).
Relationship between externalization and reality monitoring
No correlation was found between externalization ratings and reality monitoring performance, neither within healthy controls and patients separately, nor across the two groups.
Relationship between externalization and symptomatology
A significant negative correlation was found between scores on the PANSS negative subscale and externalization ratings of BRIR sounds (r = −0.48, p = 0.021), whereas no significant correlations were found for diotic and HRTF sounds. In exploratory analyses, the relationships between the externalization of emotional sounds and symptoms were further examined (S4 Table and S2 Fig in S1 Table). The PANSS negative subscale was negatively correlated with the externalization ratings of BRIR (r = −0.44, p = 0.033) and HRTF (r = −0.44, p = 0.036) neutral sounds, and BRIR sounds conveying disgust (r = −0.50, p = 0.015). PANSS positive subscale was correlated positively with the externalization ratings of neutral (r = 0.45, p = 0.032) and happiness diotic sounds (r = 0.49, p = 0.018), and negatively with the externalization ratings of BRIR sounds conveying anger (r = −0.42, p = 0.046). Moreover, SAPS hallucination scores were negatively correlated with the externalization ratings of BRIR (r = −0.43, p = 0.042) and HRTF (r = −0.43, p = 0.038) sounds conveying fear. None of these correlations reach statistical significance after Bonferroni correction (threshold at α = 0.05/18 = 0.002).
Relationship between externalization and medication
No significant correlation was found between antipsychotic medication load and externalization ratings in patients with schizophrenia, for any of the sound-processing conditions (BRIR: r = 0.20, p = 0.37; HRTF: r = 0.07, p = 0.76; diotic: r = 0.04, p = 0.85).
Discussion
For the first time, we investigated the externalization of sound sources using neutral and emotional vocalizations in patients with schizophrenia and healthy controls. Our results highlighted significant differences in sound externalization between the two groups, but no correlation was found with reality monitoring abilities. Sounds conveying emotions of fear or anger were perceived as more externalized in both groups. Exploratory analyses suggested potential correlations between externalization and patients’ symptomatology, but none reached statistical significance after Bonferroni correction.
Emotional content of sound modulates the perception of externalization
To the best of our knowledge, research on the impact of emotion on sound localization is limited, with most studies focusing on reaction time rather than performance. The current study brings new data in the research field of externalization by suggesting that emotional valence modulates the degree of perceived externalization.
Sounds conveying negative emotions, such as anger and fear, were rated as more externalized than neutral sounds, regardless of the participant group and the sound type (diotic, HRTF, BRIR). These results may reflect a tendency to attribute danger to the external environment. Fearful or angry voices may signal threat and trigger a response to danger, leading to a bias in processing these emotional stimuli as it has been shown for visual stimuli [26], thereby predicting an external cause for these emotions.
Interestingly, previous studies revealed that auditory information conveying negative emotional valence reduces self-agency compared to neutral or positive information [27,28]. Participants are less likely to attribute action or sound manifestations to themselves when they convey fear or anger. Similarly, healthy controls make more external attributions to negative than to positive events [29]. In relation to our results, higher externalization of sources conveying fear or anger could serve as a proxy for self-agency disturbances in a lower-level processing.
Additionally, our analyses also revealed more fine‑grained interactions with emotional content and sound type. Happiness and sadness showed a more spatial‑cue–dependent profile, increasing externalization only when spatial information was available (HRTF and BRIR for sadness, BRIR for happiness).
These results may be interpreted, at least in part, in light of the salience code hypothesis, which proposes that high‑arousal vocalizations such as fear and anger naturally capture auditory attention—possibly for evolutionary reasons—thereby facilitating rapid defensive responses [30]. In contrast, low‑arousal emotions exert weaker salience effects and may influence perception only when supported by rich spatial cues.
Externalization of sound sources in schizophrenia
Healthy controls showed the expected pattern of externalization, with BRIR and HRTF sounds leading to sources perceived as more externalized, and diotic sounds leading to more internalized sources. In contrast, patients with schizophrenia showed lower externalization for BRIR and HRTF sounds compared to controls, and higher externalization for diotic sounds. These confusions could reflect at least partly, a deficit of sound localization in patients, as externalization and localization processes share common underlying neurological bases [31,32]. The bidirectional confusions observed in patients may arise from abnormal processing of binaural cues. Binaural cues play a critical role in externalization perception, with less temporal variations in interaural differences leading to reduced perceived externalization, the temporal variations being associated with the effect of reverberation [33]. Patients with schizophrenia have difficulty in using these cues to localize sounds in space [34], which may explain the observed confusions – namely, the slightly more externalized perception of diotic sounds and the more internalized perception of dichotic reverberant sounds. However, achieving full externalization might require more than binaural cues; it also involves the processing of body filtering (simulated by HRTF) and room reverberation (simulated by BRIR) [35], even if we cannot exclude the possibility that they influence externalization only by causing the temporal variations in binaural cues. While these results suggest perceptual alterations in the processing of binaural cues, the behavioral data do not indicate whether the underlying deficits arise from peripheral auditory mechanisms, cortical processing, or both. Further investigations combining behavioral, electrophysiological, or neuroimaging approaches are necessary to clarify the neural contributions to these externalization deficits. Examining these acoustical cues in schizophrenia may help to understand which individual factors or combinations contribute to these perceptual confusions and to some extent to associated cognitive processes such as spatial self-representation, since externalization appears indispensable to the ability to situate elements of one’s environment in relation to oneself [36].
Furthermore, higher externalization of diotic sounds in the context of schizophrenia could also be interpreted as a failure to internalize sounds of inner space, which is consistent with theories proposing a misattribution of internal events to external sources [37,38]. This finding also resonates with our reality‑monitoring results, which similarly suggest vulnerabilities in the attribution of internally generated information.
Our analyses also revealed group x emotions interactions. Patients showed reduced externalization for anger and happiness when compared to control subjects, with fear and sadness showing the same directional pattern although the corresponding group × emotion interactions did not reach significance in the binomial GLM.
In patients with schizophrenia, these emotional influences on perception may be attenuated due to documented deficits in auditory emotion recognition (AER), as highlighted in a recent meta‑analysis [39]. Additionally, dysconnectivity within specific neural systems may further diminish the emotional modulation of auditory salience. For example, Kantrowitz et al. reported auditory–insula dysconnectivity during emotional sound processing in schizophrenia [40], which could blunt salience attribution given the insula’s central role as a core hub of the salience network. These interpretations should be considered with caution, as our study did not directly assess attentional salience nor include neural connectivity measures that would allow us to characterize these mechanisms more explicitly.
Furthermore, given the cross-sectional nature of the present study, it remains unclear whether externalization deficits precede the onset of clinical symptoms or instead worsen with illness chronicity. Longitudinal investigations in clinical high-risk populations are needed to determine whether externalization impairments hold prognostic value.
Externalization of sound sources in relation to the severity of symptoms
The wide variability in externalization ratings in patients with schizophrenia leads us to hypothesize that symptom severity could account for this variability. Higher levels of negative symptoms were associated with lower externalization of neutral sources simulated outside the head (HRTF and BRIR sounds). This finding is consistent with a model suggesting that auditory processing deficits in schizophrenia contribute to poor social and cognitive functioning via negative symptom severity [41]. We could speculate that difficulty in perceiving external sources outside the head may contribute to social functioning deficits commonly seen in schizophrenia, such as weak bodily self-representation [42] and impaired social cognition [43]. However, the absence of a specific scale evaluating negative symptoms prevent us from investigating which subtype of negative symptoms would be most strongly related to externalization confusions. Moreover, we found the hint of an association between hallucination severity and lower externalization of fearful sounds simulated outside the head. Brain activity of patients with auditory hallucinations respond differently to emotional words compared to patients without hallucinations [44] and healthy controls [45], especially to negative words. Notably, abnormal increased amygdala activation in resting state and in response to emotional stimuli has been observed in patients with hallucinations. Therefore, auditory perception of fear may contribute to externalization confusions especially in patients with hallucinations. Nevertheless, these hypotheses remain highly speculative, as the observed correlations did not survive correction for multiple comparisons.
Relationship between externalization of sound sources and reality monitoring
Our exploratory investigation found no evidence of a relationship between externalization and reality monitoring. Stephane [5] suggested that reality monitoring involves two independent processes: self-agency and spatialization. The absence of a link in our study may indicate a more critical role for self-agency than spatialization in reality monitoring. This is consistent with a recent study reporting a significant relationship between self-agency and reality monitoring abilities, although spatialization was not assessed [46]. Our findings might also be explained by the fact that we did not find evidence of biases in patients’ reality monitoring performance (i.e., patients did not exhibit more ‘imagine-to-hear’ confusions than controls, which is typically observed in schizophrenia [47,48]) although we found that imagined words are less recognized in patients compared to healthy controls. This may be explained by our sample including patients with less severe symptoms than those in previous studies (total PANSS score of 62.8 versus 80.9 in [49]). In addition, our sample of patients also includes patients without hallucinations (7/23 with SAPS hallucination subscale score = 0), and the presence of hallucinations has been associated with greater reality monitoring confusions [48,50]. Importantly, stratifying patients according to hallucination status would be necessary to determine whether externalization contributes differently to reality-monitoring in hallucinating versus non-hallucinating patients. In addition, it would help disentangle whether variations in externalization ratings are specifically related to hallucination or to other symptom dimensions, or general cognitive functioning, given that our exploratory correlation analyses showed associations with both positive and negative symptoms. However, our sample size did not permit sufficiently powered subgroup analyses. Future studies with larger and stratified samples and including standardized cognitive measures will be required to address these questions.
Limitations and perspectives
Our study presents several limitations that should be acknowledged. First, our results contradict previous studies suggesting that reverberation added in BRIR sounds enhances externalization compared to HRTF [15]. The absence of evidence for this effect may indicate that HRTF alone were sufficient to create an external source experience in healthy controls (with > 84% of HRTF sounds perceived outside the head). Another explanation is the use of a binary forced-choice response (outside or inside the head) to evaluate externalization. While this approach, used in prior research [13,51,52], may better correspond to the percept being measured, it may reduce sensitivity and make it more challenging to distinguish true from guessed responses, despite explicit instructions emphasizing that there were no right or wrong answers, only individual perception. Some other studies have used continuous scales [15,53], to increase response variability; however, these are often based on perceived distance, which can introduce a reference bias and potentially confound the measurement of externalization [10]. Alternative approaches, such as combining binary ratings with confidence judgments to create a continuous confidence-weighted metric [52], may increase sensitivity while avoiding distance-related confounds. Such methods could be considered in future studies to better capture subtle intergroup differences and to reduce ceiling effects in healthy participants.
Another limitation concerns a potential confound from acoustic arousal. Although all sounds were RMS-equalized to ensure identical overall presentation level, emotional vocalizations may still differ in pitch, intensity dynamics, or temporal structure, which could influence externalization judgments independently of emotional valence, notably due to differences in perceived arousal level. Future studies could control for these features or include acoustic analyses to account for their potential effects.
Third, we did not control for attentional engagement, as reduced attention may impair performance, particularly in patients with schizophrenia [54]. However, we attempted to mitigate this by using a 3 second cut-off for response times, as such delays may indicate attentional lapses. Third, the small sample size limits statistical power, particularly for exploring the effects of emotion on externalization or the relationship between externalization and reality monitoring. Additionally, the absence of a comparative group of patients precludes conclusions about whether externalization confusions are specific to schizophrenia.
Finally, our study was not designed to investigate differences between patients with and without hallucinations, or the association with specific types of hallucinations. Auditory hallucinations in schizophrenia can be experienced inside or outside the head [55], with distinct neural correlates for each [56]. Interestingly, the nature of auditory verbal hallucinations, whether truly perceptual or reflective of beliefs about perception, is still debated [57–59]. While our findings cannot fully resolve this question, they contribute to the debate by showing externalization deficits in veridical sounds, suggesting that perceptual mechanisms underlying auditory spatial judgments may be disrupted in patients and could interact with hallucination experiences. Future studies should specifically recruit patients with and without hallucinations to clarify whether variations in externalization are related to hallucination presence, severity or nature, and to better understand the potential perceptual contributions to these experiences.
Despite these limitations, our findings may open new perspectives for therapeutic interventions in schizophrenia. Indeed, auditory sensory training has shown benefits in improving auditory perception, cognitive functioning, and reducing symptoms in patients [60]. Some evidence suggests that perceptual feedback-based training can improve HRTF- or BRIR-filtered sound localization in healthy listeners [61,62]. However, it remains unknown whether modifying sound externalization could yield functional benefits in patients with schizophrenia. In addition, a better characterization of the neural signatures that distinguish internalized from externalized percepts may eventually allow the development of targeted interventions, such as neurofeedback-based training, although this remains speculative. For example, neurofeedback could target activity in the planum temporale, which shows differential activation depending on externalization status [31,63]. Future studies should test whether training aimed at improving externalization can enhance symptoms and functional outcomes in patients.
Conclusions
This study highlights abnormal perception of sound sources in patients with schizophrenia, with increased internal/external confusions. It also provides new insights into externalization, highlighting the role of negative emotions such as fear and anger in increasing perceived externalization. Future research should further investigate sound externalization in schizophrenia and its relation to patients’ symptomatology and higher cognitive processes such as reality monitoring.
Supporting information
S1 Table. Results of the Generalized Linear Model (GLM) analyzing Externalization ratings by group, sound type, and emotion.
https://doi.org/10.1371/journal.pone.0345074.s001
(ZIP)
Acknowledgments
The authors thank Gabrielle CHESNOY for her help with the study design, participant recruitment and patient inclusion. We thank Asako TAKAMATSU for her help with the participant recruitment and data collection. We also thank all the participants of the study.
References
- 1. Jauhar S, Johnstone M, McKenna PJ. Schizophrenia. Lancet. 2022;399:473–86.
- 2. Brunelin J, d’Amato T, Brun P, Bediou B, Kallel L, Senn M, et al. Impaired verbal source monitoring in schizophrenia: an intermediate trait vulnerability marker?. Schizophr Res. 2007;89(1–3):287–92. pmid:17029909
- 3. Damiani S, Donadeo A, Bassetti N, Salazar-de-Pablo G, Guiot C, Politi P, et al. Understanding source monitoring subtypes and their relation to psychosis: a systematic review and meta-analysis. Psychiatry Clin Neurosci. 2022;76(5):162–71. pmid:35124869
- 4. Waters F, Woodward T, Allen P, Aleman A, Sommer I. Self-recognition deficits in schizophrenia patients with auditory hallucinations: a meta-analysis of the literature. Schizophr Bull. 2012;38(4):741–50. pmid:21147895
- 5. Stephane M. The self, agency and spatial externalizations of inner verbal thoughts, and auditory verbal hallucinations. Front Psychiatry. 2019;10:668. pmid:31607965
- 6. Stephane M, Kuskowski M, McClannahan K, Surerus C, Nelson K. Evaluation of inner-outer space distinction and verbal hallucinations in schizophrenia. Cogn Neuropsychiatry. 2010;15(5):441–50. pmid:20349369
- 7. Dondé C, Luck D, Grot S, Leitman DI, Brunelin J, Haesebaert F. Tone-matching ability in patients with schizophrenia: A systematic review and meta-analysis. Schizophr Res. 2017;181:94–9. pmid:27742161
- 8. Fivel L, Mondino M, Brunelin J, Haesebaert F. Basic auditory processing and its relationship with symptoms in patients with schizophrenia: A systematic review. Psychiatry Res. 2023;323:115144. pmid:36940586
- 9. Perrin MA, Butler PD, DiCostanzo J, Forchelli G, Silipo G, Javitt DC. Spatial localization deficits and auditory cortical dysfunction in schizophrenia. Schizophr Res. 2010;124(1–3):161–8. pmid:20619608
- 10. Best V, Baumgartner R, Lavandier M, Majdak P, Kopčo N. Sound externalization: A review of recent research. Trends in Hearing. 2020;24:2331216520948390.
- 11. Catic J, Santurette S, Buchholz JM, Gran F, Dau T. The effect of interaural-level-difference fluctuations on the externalization of sound. J Acoust Soc Am. 2013;134(2):1232–41. pmid:23927121
- 12. Risoud M, Hanson J-N, Gauvrit F, Renard C, Lemesre P-E, Bonne N-X, et al. Sound source localization. Eur Ann Otorhinolaryngol Head Neck Dis. 2018;135(4):259–64. pmid:29731298
- 13. Brimijoin WO, Boyd AW, Akeroyd MA. The contribution of head movement to the externalization and internalization of sounds. PLoS One. 2013;8(12):e83068. pmid:24312677
- 14. Wightman FL, Kistler DJ. Headphone simulation of free-field listening. II: Psychophysical validation. J Acoust Soc Am. 1989;85(2):868–78. pmid:2926001
- 15. Leclère T, Lavandier M, Perrin F. On the externalization of sound sources with headphones without reference to a real source. J Acoust Soc Am. 2019;146(4):2309. pmid:31671981
- 16. Kryklywy JH, Macpherson EA, Greening SG, Mitchell DGV. Emotion modulates activity in the “what” but not “where” auditory processing pathway. Neuroimage. 2013;82:295–305. pmid:23711533
- 17. Kensinger EA, Schacter DL. Reality monitoring and memory distortion: effects of negative, arousing content. Mem Cognit. 2006;34(2):251–60. pmid:16752589
- 18. Pinheiro AP, Rezaii N, Rauber A, Niznikiewicz M. Is this my voice or yours? The role of emotion and acoustic quality in self-other voice discrimination in schizophrenia. Cogn Neuropsychiatry. 2016;21(4):335–53. pmid:27454152
- 19. Micheyl C, Delhommeau K, Perrot X, Oxenham AJ. Influence of musical and psychoacoustical training on pitch discrimination. Hear Res. 2006;219(1–2):36–47. pmid:16839723
- 20.
Andreasen NC. The Scale for the Assessment of Positive Symptoms (SAPS). Iowa City, IA: The University of Iowa; 1984.
- 21. Kay SR, Fiszbein A, Opler LA. The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr Bull. 1987;13(2):261–76. pmid:3616518
- 22. Belin P, Fillion-Bilodeau S, Gosselin F. The montreal affective voices: a validated set of nonverbal affect bursts for research on auditory affective processing. Behav Res Methods. 2008;40(2):531–9. pmid:18522064
- 23. Hummersone C, Mason R, Brookes T. Dynamic precedence effect modeling for source separation in reverberant environments. IEEE Trans Audio Speech Lang Process. 2010;18(7):1867–71.
- 24. Brunelin J, Poulet E, Marsella S, Bediou B, Kallel L, Cochet A, et al. Un déficit de mémoire de la source spécifique chez les patients schizophrènes comparés à des volontaires sains et des patients présentant un épisode dépressif majeur. European Review of Applied Psychology. 2008;58(2):105–10.
- 25. Leucht S, Samara M, Heres S, Davis JM. Dose Equivalents for Antipsychotic Drugs: The DDD Method. Schizophr Bull. 2016;42(Suppl 1):S90-4. pmid:27460622
- 26. Flechsenhar A, Levine S, Bertsch K. Threat induction biases processing of emotional expressions. Frontiers in Psychology. 2022;13.
- 27. Christensen JF, Di Costa S, Beck B, Haggard P. I just lost it! Fear and anger reduce the sense of agency: a study using intentional binding. Exp Brain Res. 2019;237(5):1205–12. pmid:30826847
- 28. Yoshie M, Haggard P. Negative emotional outcomes attenuate sense of agency over voluntary actions. Curr Biol. 2013;23(20):2028–32. pmid:24094850
- 29. Lyon HM, Startup M, Bentall RP. Social cognition and the manic defense: attributions, selective attention, and self-schema in bipolar affective disorder. J Abnorm Psychol. 1999;108(2):273–82. pmid:10369037
- 30. Anikin A. The link between auditory salience and emotion intensity. Cogn Emot. 2020;34(6):1246–59. pmid:32126893
- 31. Callan A, Callan DE, Ando H. Neural correlates of sound externalization. Neuroimage. 2013;66:22–7. pmid:23108271
- 32. Hunter MD, Griffiths TD, Farrow TFD, Zheng Y, Wilkinson ID, Hegde N, et al. A neural basis for the perception of voices in external auditory space. Brain. 2003;126(Pt 1):161–9. pmid:12477703
- 33. Catic J, Santurette S, Dau T. The role of reverberation-related binaural cues in the externalization of speech. J Acoust Soc Am. 2015;138(2):1154–67. pmid:26328729
- 34. Matthews N, Todd J, Mannion DJ, Finnigan S, Catts S, Michie PT. Impaired processing of binaural temporal cues to auditory scene analysis in schizophrenia. Schizophr Res. 2013;146(1–3):344–8. pmid:23466188
- 35. Boyd AW, Whitmer WM, Soraghan JJ, Akeroyd MA. Auditory externalization in hearing-impaired listeners: the effect of pinna cues and number of talkers. J Acoust Soc Am. 2012;131(3):EL268-74. pmid:22423819
- 36. Heine L, Corneyllie A, Gobert F, Luauté J, Lavandier M, Perrin F. Virtually spatialized sounds enhance auditory processing in healthy participants and patients with a disorder of consciousness. Sci Rep. 2021;11(1):13702. pmid:34211035
- 37. Fletcher PC, Frith CD. Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nat Rev Neurosci. 2009;10(1):48–58. pmid:19050712
- 38.
Frith CD. The cognitive neuropsychology of schizophrenia. Hove, East Sussex: Psychology Press; 1992.
- 39. Gong B, Li Q, Zhao Y, Wu C. Auditory emotion recognition deficits in schizophrenia: A systematic review and meta-analysis. Asian J Psychiatr. 2021;65:102820. pmid:34482183
- 40. Kantrowitz JT, Hoptman MJ, Leitman DI, Moreno-Ortega M, Lehrfeld JM, Dias E, et al. Neural Substrates of Auditory Emotion Recognition Deficits in Schizophrenia. J Neurosci. 2015;35(44):14909–21. pmid:26538659
- 41. Thomas ML, Green MF, Hellemann G, Sugar CA, Tarasenko M, Calkins ME. Modeling deficits from early auditory information processing to psychosocial functioning in schizophrenia. JAMA Psychiatry. 2017;74:37–46.
- 42. Gallese V, Ferri F. Psychopathology of the bodily self and the brain: the case of schizophrenia. Psychopathology. 2014;47(6):357–64. pmid:25359279
- 43. Lin C-H, Huang C-L, Chang Y-C, Chen P-W, Lin C-Y, Tsai GE, et al. Clinical symptoms, mainly negative symptoms, mediate the influence of neurocognition and social cognition on functional outcome of schizophrenia. Schizophr Res. 2013;146(1–3):231–7. pmid:23478155
- 44. Escartí MJ, de la Iglesia-Vayá M, Martí-Bonmatí L, Robles M, Carbonell J, Lull JJ, et al. Increased amygdala and parahippocampal gyrus activation in schizophrenic patients with auditory hallucinations: an fMRI study using independent component analysis. Schizophr Res. 2010;117(1):31–41. pmid:20071145
- 45. Sanjuan J, Lull JJ, Aguilar EJ, Martí-Bonmatí L, Moratal D, Gonzalez JC, et al. Emotional words induce enhanced brain activity in schizophrenic patients with auditory hallucinations. Psychiatry Res. 2007;154(1):21–9. pmid:17184978
- 46. Subramaniam K, Kothare H, Mizuiri D, Nagarajan SS, Houde JF. Reality Monitoring and Feedback Control of Speech Production Are Related Through Self-Agency. Front Hum Neurosci. 2018;12:82. pmid:29559903
- 47. Brookwell ML, Bentall RP, Varese F. Externalizing biases and hallucinations in source-monitoring, self-monitoring and signal detection studies: a meta-analytic review. Psychol Med. 2013;43(12):2465–75. pmid:23282942
- 48. Brunelin J, Combris M, Poulet E, Kallel L, D’Amato T, Dalery J, et al. Source monitoring deficits in hallucinating compared to non-hallucinating patients with schizophrenia. Eur Psychiatry. 2006;21(4):259–61. pmid:16545546
- 49. Dondé C, Mondino M, Leitman DI, Javitt DC, Suaud-Chagny M-F, D’Amato T, et al. Are basic auditory processes involved in source-monitoring deficits in patients with schizophrenia?. Schizophr Res. 2019;210:135–42. pmid:31176535
- 50. Woodward TS, Menon M, Whitman JC. Source monitoring biases and auditory hallucinations. Cogn Neuropsychiatry. 2007;12(6):477–94. pmid:17978935
- 51. Ohl B, Laugesen S, Buchholz J, Dau T. Externalization versus internalization of sound in normal-hearing and hearing-impaired listeners. In: Fortschritte der Akustik. 2010. 14:25.
- 52. Lavandier M, Heine L, Perrin F. Comparing the auditory distance and externalization of virtual sound sources simulated using nonindividualized stimuli. Trends Hear. 2024;28:23312165241285695. pmid:39435499
- 53. Kates JM, Arehart KH, Muralimanohar RK, Sommerfeldt K. Externalization of remote microphone signals using a structural binaural model of the head and pinna. J Acoust Soc Am. 2018;143(5):2666. pmid:29857749
- 54. Carter JD, Bizzell J, Kim C, Bellion C, Carpenter KLH, Dichter G, et al. Attention deficits in schizophrenia--preliminary evidence of dissociable transient and sustained deficits. Schizophr Res. 2010;122(1–3):104–12. pmid:20554160
- 55. Copolov D, Trauer T, Mackinnon A. On the non-significance of internal versus external auditory hallucinations. Schizophr Res. 2004;69(1):1–6. pmid:15145464
- 56. Looijestijn J, Diederen KMJ, Goekoop R, Sommer IEC, Daalman K, Kahn RS, et al. The auditory dorsal stream plays a crucial role in projecting hallucinated voices into external space. Schizophr Res. 2013;146(1–3):314–9. pmid:23453584
- 57. Gill K, Percival C, Roes M, Arreaza L, Chinchani A, Sanford N, et al. Real-Time Symptom Capture of Hallucinations in Schizophrenia with fMRI: Absence of Duration-Dependent Activity. Schizophr Bull Open. 2022;3(1):sgac050. pmid:39144798
- 58. Humpston CS, Woodward TS. Soundless voices, silenced selves: are auditory verbal hallucinations in schizophrenia truly perceptual?. Lancet Psychiatry. 2024;11(8):658–64. pmid:38631367
- 59. Blom JD, Sharpless BA. On the perceptual characteristics of auditory verbal hallucinations. Lancet Psychiatry. 2024;11(12):959–60. pmid:39572115
- 60. Dondé C, Mondino M, Brunelin J, Haesebaert F. Sensory-targeted cognitive training for schizophrenia. Expert Rev Neurother. 2019;19(3):211–25. pmid:30741038
- 61.
Mitruț O, Moldoveanu A, Moldoveanu F, Negoi I. The role of perceptual feedback training on sound localization accuracy in audio experiments. 2015. https://doi.org/10.12753/2066-026X-15-074
- 62. Klein F, Werner S, Mayenfels T. Influences of Training on Externalization of Binaural Synthesis in Situations of Room Divergence. J Audio Eng Soc. 2017;65(3):178–87.
- 63. Fivel L, Brunelin J, Leroux G, Haesebaert F, Mondino M. How the Brain Distinguishes Internal and External Sounds: An fMRI Investigation of Auditory Sound Externalization. bioRxiv. 2025:10.14.679174.