The effects of aging and musicianship on the use of auditory streaming cues

  • Sarah A. Sauvé ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing

    sarah.sauve@mun.ca

    Affiliation Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

  • Jeremy Marozeau,

    Roles Formal analysis, Software, Visualization, Writing – review & editing

    Affiliation Department of Health Technology, Technical University of Denmark, Lyngby, Denmark

  • Benjamin Rich Zendel

    Roles Funding acquisition, Resources, Supervision, Writing – review & editing

    Affiliation Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada

Abstract

Auditory stream segregation, or separating sounds into their respective sources and tracking them over time, is a fundamental auditory ability. Previous research has separately explored the impacts of aging and musicianship on the ability to separate and follow auditory streams. The current study evaluated the simultaneous effects of age and musicianship on auditory streaming induced by three physical features: intensity, spectral envelope and temporal envelope. In the first study, older and younger musicians and non-musicians with normal hearing identified deviants in a four-note melody interleaved with distractors that were more or less similar to the melody in terms of intensity, spectral envelope and temporal envelope. In the second study, older and younger musicians and non-musicians participated in a dissimilarity rating paradigm with pairs of melodies that differed along the same three features. Results suggested that auditory streaming skills are maintained in older adults, but that older adults rely on intensity more than younger adults. Musicianship, meanwhile, is associated with increased sensitivity to spectral and temporal envelope, acoustic features that are typically less effective for stream segregation, particularly in older adults.

1 Introduction

In everyday life, our brain continuously analyses and organizes the omnipresent mixture of sounds reaching our eardrums. This organizational process is called auditory scene analysis (ASA) [1]. In his seminal book, Bregman [1] describes ASA as the process by which we integrate and segregate acoustic information to form perceptual objects/streams that accurately represent the acoustic environment. ASA combines physical features from the environment (i.e., bottom-up) and learned knowledge (i.e., top-down) to segregate and track sound sources over time. Many bottom-up sound cues affect how efficiently this analysis can be carried out. As a general rule, when physical features suggest the presence of multiple sound sources (e.g., differing onsets, spectral components whose frequencies are not integer multiples of a fundamental frequency (F0), differing inter-aural times/levels), it is more likely for the sounds to be perceptually segregated into separate perceptual streams. When physical features suggest that sounds come from the same source (e.g., similar onsets, spectral components whose frequencies are integer multiples of a F0, similar inter-aural times/levels), it is more likely that they will be perceptually integrated into a single perceptual stream. A combination of bottom-up sound features, such as frequency [1, 2], tempo [2, 3], location [4, 5], and temporal and spectral envelopes [6–9], can contribute to the quality and efficiency of auditory stream segregation. Top-down cognitive aspects such as attention [10], expectation [11] and musicianship [12, 13] also influence the accuracy and efficiency of stream segregation.

The separation of two sound sources into streams involves both simultaneous and sequential aspects. Evidence suggests that aging negatively affects concurrent stream segregation, or the ability to segregate simultaneous sounds [14–18]. On the other hand, sequential stream segregation, the segregation of sounds over time, is little affected by age [19–22], even for participants with hearing loss [23, 24]. This pattern suggests that older adults have more difficulty than younger adults separating sounds that occur simultaneously. Still, once the sounds are separated, older adults can track a sequence of sounds as well as younger adults.

Auditory stream segregation is important when listening to music. For example, there are sections of J.S. Bach’s Toccata and Fugue in D minor where sequential notes that differ in frequency are played rapidly to give the impression of two simultaneous melodies. In this situation, the listener hears a rapid change in F0 between the notes that suggests there must be two sound sources producing the music; the resulting percept is that of two auditory streams or two melodies. Rapidly alternating tones that differ in frequency are often used to explore auditory stream segregation in the lab. For example, Snyder and Alain [20] investigated potential age-related effects on sequential auditory streaming using the ABA_ paradigm, where ‘A’ and ‘B’ represent tones that differ in frequency and ‘_’ represents silence. When the A and B tones differ by a small amount in frequency, a ‘galloping’ rhythm is heard. As the frequency separation between the A and B tones increases, the likelihood of hearing two simultaneous streams increases. Snyder and Alain [20] presented young, middle-aged and older participants with ABA_ sequences where the A and B tones were 0, 4, 7 or 12 semitones apart (the A tone frequency was 500 Hz and the B tone frequencies were 500, 625, 750 or 1000 Hz) and asked them to indicate whether they perceived one stream (galloping percept) or two streams (isochronous percept). All participants had near-normal pure-tone average (PTA) thresholds at low frequencies (PTA0.5-1kHz < 30 dB HL), but at higher frequencies, the older adults had significantly higher thresholds than the younger adults (e.g., PTA at 8000 Hz in the left ear was 5 dB HL [SD = 7.5 dB] for the younger adults and 66 dB HL [SD = 20.5 dB] for the older adults). These PTA differences are generally considered a normal part of aging [20, though see 25]. Despite these PTA differences between older and younger adults, there was no difference in streaming perception between the groups. This may be because the stimuli were all below 1000 Hz, and both groups of participants had normal thresholds at those frequencies. All participants exhibited a similar tendency to perceive two simultaneous streams as the frequency separation between the A and B tones increased. Snyder and Alain [20] only assessed the effect of frequency separation on auditory streaming, but there are many other acoustic features that can give rise to stream segregation, including harmonic structure (timbre) and intensity [26].

The effects of aging on other acoustic features that give rise to sequential auditory streaming are not as well known. The processing of temporal aspects of acoustic stimuli is reduced in older adults [27, 28]. Behavioural evidence suggests that older adults have more difficulty than younger adults identifying the temporal order of sound sequences [29–32] and performing temporal discrimination tasks, such as silent gap detection [33–36]. Interestingly, hearing loss had little impact on these abilities, as performance on these tasks was similar for older adults with and without hearing loss [29, 30, 33–36]. Evidence from neuroscience suggests that the encoding precision of temporal information is poorer in older adults with normal hearing than in younger adults [37, 38].

Little is known about how aging affects the encoding of a sound’s spectral envelope. One study [39] tested melody and timbre recognition for a variety of stimulus manipulations, including low- and high-pass filters, vocoding and temporal envelope modulation. While younger adults were able to use both low- and high-pass modified stimuli to identify timbre, older adults with normal to near-normal hearing (PTA0.25-4kHz < 25 dB HL) were not able to use the higher frequency information. In another study, Grimault et al. [40] used an ABA task with complex tones in which the F0 of the A tone was either 88 Hz or 250 Hz and the F0 of the B tone varied between 88 Hz and 352 Hz, and asked younger and older adults with moderate or mild hearing loss to indicate whether they perceived one or two streams at the end of a 4 s segment. They found that older adults with moderate hearing loss (PTA0.5-2kHz between 32 and 72 dB HL) and with mild hearing loss (PTA0.5-2kHz between 30 and 35 dB HL) had lower d’ scores than younger adults when the A tone’s F0 was 250 Hz, but not 88 Hz, suggesting difficulty differentiating tones with higher frequencies, and therefore higher harmonics. It is possible that these higher harmonics were not resolved due to broader auditory filters [41], which are specifically associated with hearing loss rather than increased age [42]. Studies specifically investigating the effect of hearing loss on stream segregation have shown that streaming is impaired in older adults with hearing loss [43]. More specifically, auditory streaming induced by frequency [18, 23, 40, 44] and inter-aural time differences [45] is worse for older adults with hearing loss than for older adults with normal hearing. Overall, it is likely that aging and hearing loss affect the ability to discriminate the acoustic features needed to form auditory streams but do not affect the ability to stream sounds that are perceptually distinct.

While aging can negatively affect auditory stream segregation, musicianship seems to have a positive effect. One of the first studies to explore this musician advantage for stream segregation investigated the time decay of auditory stream biasing: a 10-second induction sequence of repeated A tones was followed by a silent interval of between 0 and 8 seconds and a short ABAB test sequence, after which participants indicated whether they heard one or two streams. Beauvois and Meddis [12] found that auditory streams persisted over longer silences for musicians than for non-musicians. More recently, Marozeau et al. [7] investigated the effect of musicianship on stream segregation. Musicians (mean age 31 years, SD = 7.2 years) and non-musicians (mean age 32.2 years, SD = 7.9 years) listened to a sequence of tones in which a four-note repeating target pattern was ‘hidden’ amongst distractor tones (see Fig 1). The distractor tones were manipulated in terms of intensity, spectral envelope and temporal envelope so that they varied in similarity to the target. Participants were asked to continuously rate how easily they could perceive the target pattern. In a dissimilarity paradigm, participants also performed perceptual similarity ratings for pairs of stimuli varying in intensity, spectral envelope and temporal envelope. These ratings were used to map the stimuli into one multi-dimensional space, allowing the effects of the three manipulated features to be compared directly on a common perceptual scale. Compared to non-musicians, the musicians needed less physical difference between the targets and the distractors to successfully segregate the two streams. In terms of acoustic cues, intensity led to the most robust stream segregation for musicians, while intensity and spectral envelope were similar in effectiveness for non-musicians. This study suggests that a participant’s experience and knowledge (i.e., top-down) can affect how acoustic cues (i.e., bottom-up) are utilized during auditory stream segregation.

Fig 1. Illustration of stimuli and feature manipulations.

(A) Sample stimuli, where black note heads depict the target pattern while grey note heads depict the distractor sequence. An example of a deviant target is given in the third measure, where the two notes marked by the bracket were reversed in order. (B) Temporal envelopes varying in impulsiveness from 60 ms FDHM (full duration at half-maximum) to 160 ms FDHM (the target notes). Only 4 of the 20 possible temporal envelopes are shown here. (C) Spectral envelopes varying from 3 dB of attenuation per harmonic (the target notes—circles) to 25 dB of attenuation per harmonic (triangles). Only 3 out of the 20 possible spectral envelopes are shown here. Fig adapted from Marozeau et al. (2013), with permission.

https://doi.org/10.1371/journal.pone.0274631.g001

The goal of the present study was to explore how the interaction of musicianship, aging, and audiometric thresholds affect the relative salience of intensity, spectral envelope and temporal envelope as auditory streaming cues, using a modified version of the target/distractor paradigm described above [46, 47].

Based on current evidence of the maintenance of sequential streaming abilities in older adults [19, 20] and the positive relationship between musicianship and (non-speech) stream segregation in younger adults [12, 48–51], we expected older musicians to require a similar level of dissimilarity between the target and the distractor to successfully segregate the two streams as young musicians [52–56], and less dissimilarity than older and younger non-musicians. Of particular interest here is the interaction between age and musicianship; it is unknown whether the positive relationship between musicianship and auditory streaming is consistent throughout the lifespan. This auditory streaming task can also be interpreted as an inhibition task, where the distractor tones must be successfully separated and ignored in order to perform the task. Though the inhibition theory of aging suggests a decline in inhibition [57–59], a recent meta-analysis of inhibition in aging found that the ability to ignore distracting information remains intact in older adults [60], supporting the hypothesis that older and younger adults will perform similarly on the current streaming task. However, given that temporal encoding and discrimination decline with increasing age [32, 34], streaming based on temporal envelope in the current study should be more difficult for older adults than for younger adults. Similarly, as age-related hearing loss is usually high-frequency hearing loss, and the manipulation of spectral envelope in this study involved varying attenuation of the higher frequencies of the distractor tones, loss of hearing in this range may lead to poorer timbral differentiation and therefore poorer streaming and task performance when spectral envelope is manipulated. Accordingly, both age and hearing loss will be important variables to consider independently.

To test these hypotheses, two experiments were conducted. In the first experiment, younger and older musicians and non-musicians identified deviations in the target pattern while distractor similarity to the target either increased or decreased over time. This modification provided a more objective measure of auditory streaming than a subjective perceptual rating or judgment [20], as identifying a deviant is only possible if the target is streamed from the distractor tones. In a second experiment, as in Marozeau et al. [7], younger and older musicians and non-musicians rated the similarity of pairs of melodies differing in intensity, spectral envelope and temporal envelope. Using multidimensional scaling (MDS), a common perceptual dissimilarity scale between the three manipulated features was extracted to allow a direct comparison of the effects of each of the acoustic features.

2 Experiment 1

2.1 Method

2.1.1 Participants.

Fifty-four participants were recruited and provided written informed consent in accordance with the Interdisciplinary Committee on Ethics in Human Research at Memorial University of Newfoundland (ICEHR#20192257). Participants were divided into two age groups: 28 younger adults (<38 years; age range 17–37; 16 female) and 26 older adults (>60 years; age range 61–82; 9 female). Musicianship was measured using the Goldsmiths Musical Sophistication Index’s musical training sub-scale ([63]; henceforth referred to as the Gold-MSI, see section 2.1.2). Participants varied widely in their musical training backgrounds. All participants self-reported being healthy and free of any cognitive deficit. Hearing abilities were assessed using pure-tone audiometry. The reported pure-tone average (PTA) is based on thresholds at 500, 1000, 2000 and 4000 Hz for the better-hearing ear. Recently updated WHO guidelines define normal hearing as a PTA (i.e., average across 0.5, 1, 2 and 4 kHz) below 20 dB HL, while mild hearing loss is defined as a PTA between 20 and 34 dB HL [61]. Previous WHO guidelines considered normal hearing to be below 26 dB HL [61], and participants in the current study were screened with this criterion in mind. Thus, our sample only includes people with, at most, very mild hearing loss (i.e., a PTA below 26 dB HL). In a sample of older adults, it is normal to find many participants with mild hearing loss: multiple large epidemiological studies show that around 70% of older adults experience at least mild hearing loss by their late 70s [62]. All participants received a $20 honorarium for their participation. Table 1 summarizes participant demographics.

2.1.2 Gold-MSI.

The Goldsmiths Musical Sophistication Index (Gold-MSI) [63] musical training sub-scale consisted of 7 questions, each answered on a 7-point Likert scale, for a total score out of 49. Two questions were subjective (e.g., ‘I would not consider myself a musician’) and five asked about the length in years of formal instrumental and music theory training and the length in years and hours per day of regular practice (e.g., ‘At the peak of my interest, I practiced __ hours a day on my primary instrument’). The sub-scale additionally asked participants to report which instrument they played best. This continuous factor derived from an individual’s music training history will be referred to as musical training. For visualization and summary purposes, musicians were defined as participants scoring above 50% on the Gold-MSI (i.e., a score > 25; 14 younger, 12 older) and non-musicians as those scoring below 50% (i.e., a score < 25; 14 younger, 13 older); no participant scored exactly 25. Though it is possible that differences in task performance between participants with varying musical training backgrounds may be attributed to differences other than formal musical training (e.g., genetics), in this study musical training history is the measured proxy for musicianship.

2.1.3 Stimuli.

Stimuli were identical to those of Marozeau et al. [7], Experiment 1. Fig 1 illustrates the stimuli, which consisted of two patterns: one four-note repeating melody (target; black note heads) with notes G4, C5, A5 and D5, and pseudorandom notes (distractor; grey note heads). The tones of the target melody had a temporal envelope consisting of a 30 ms raised-cosine onset, 140 ms of sustain, and a 10 ms offset, and a spectral envelope with 10 harmonics, successively attenuated by 3 dB. The F0s of the distractor notes were uniformly randomly selected from a range of an octave around the target melody (F4 to E5) and were manipulated along one feature at a time while the remaining features were kept constant. Loudness level, spectral envelope and temporal envelope were varied over 20 degrees of difficulty: at difficulty 1, they allowed easy segregation and at difficulty 20 they were the same as the target melody.

The stimuli of Marozeau et al. [7] were designed to be presented in free field at a loudness level of 65 phons (as loud as a 1 kHz tone at 65 dB SPL), using a loudness model [64], to ensure that the sensation of loudness remained constant even though the spectral and temporal envelopes were varied. The stimuli are described below using the specifications with which they were designed.

2.1.3.1 Intensity. Intensity was attenuated in 2-phon steps across the twenty degrees of difficulty, from a level equal to the target down to an attenuation of 38 phons. The starting difficulty of the distractor (no attenuation or 38 phons of attenuation) depended on the trial block (details in section 2.1.4 below).

2.1.3.2 Temporal envelope. Tone impulsiveness, defined as the full duration of the sound at half of the maximum amplitude (FDHM), was logarithmically spaced over twenty degrees of difficulty from 160 ms (equal to target) to 60 ms (Fig 1B), where the starting difficulty depended on the trial block.

2.1.3.3 Spectral envelope. The spectral envelope attenuation was manipulated in twenty logarithmically spaced steps from 3 dB (equal to target) to 25 dB attenuation per harmonic (Fig 1C). Note that at 25 dB attenuation per harmonic, the majority of harmonics were inaudible. The starting difficulty of the distractors’ spectral envelope depended on the trial block.
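For concreteness, the three difficulty ladders described above can be reconstructed as follows. This is a hedged sketch in R (the original stimuli were generated in Matlab); the 2-phon intensity steps and the logarithmic spacing of the temporal and spectral values follow the descriptions above, but the exact values of the original parameter tables are not guaranteed.

```r
# Illustrative reconstruction of the 20-step difficulty ladders (not the
# original Matlab stimulus code). Difficulty 1 is easiest to segregate;
# difficulty 20 matches the target along the manipulated feature.
n_levels <- 20

# Intensity: attenuation relative to the target, in 2-phon steps (38 ... 0).
intensity_atten_phon <- seq(from = 38, to = 0, by = -2)

# Temporal envelope: FDHM log-spaced from 60 ms (easy) to 160 ms (target).
fdhm_ms <- exp(seq(log(60), log(160), length.out = n_levels))

# Spectral envelope: attenuation per harmonic, log-spaced from 25 dB (easy)
# to 3 dB (target).
spectral_atten_db <- exp(seq(log(25), log(3), length.out = n_levels))

round(data.frame(difficulty = 1:n_levels,
                 intensity_atten_phon, fdhm_ms, spectral_atten_db), 2)
```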

The stimuli were constructed using Matlab 7.5 and sequences were generated using MAX/MSP 8. For further details, see Marozeau et al. [7], Experiment 1, Stimuli subsection. For practical reasons, in this experiment sounds were played through over-ear headphones (Sennheiser HDA 200) at a comfortable level set by the listeners and ranging from 25–60 dB SPL. This affects the absolute intensity at which the stimuli were presented, which may affect baseline performance; however, the manipulation of interest is the difference in intensity between the target and the distractors at which the deviant detection task (described below) becomes possible, not the absolute intensity.

2.1.4 Procedure.

The experiment consisted of six blocks: for each of the three acoustic features, one block with increasing difficulty and one with decreasing difficulty. In each block, the target melody was presented in a continuous loop while distractor tones were interleaved. Two of the target melody tones were reversed in order 25% of the time, creating a deviant melody. The participants’ task was to press the space bar when they detected such a deviant. Good performance is only possible if the target and distractor tones are perceived as separate streams. The distractor tones in each block varied along a single feature, and each feature was tested twice: once with increasing difficulty (from least similarity between target melody and distractor to an exact match) and once with decreasing difficulty (from an exact match to least similarity). Blocks changed difficulty according to the following adaptive rule: three consecutive hits (correct identifications of a deviant) or three consecutive misses (failures to identify a deviant) were necessary to move onto a new degree of difficulty. When difficulty was increasing, three consecutive hits or misses moved the participant to a higher degree of difficulty (more similarity between target melody and distractor), while a complete lack of hits across three successive difficulty levels led to the manual termination of the block. When difficulty was decreasing, three consecutive hits or misses moved the participant to a lower degree of difficulty (less similarity between target melody and distractor); in these decreasing blocks, participants heard all degrees of difficulty. Movement between degrees of difficulty only occurred in one direction, determined by whether the block’s difficulty was increasing or decreasing.
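A minimal sketch of this adaptive rule, assuming simple bookkeeping of consecutive hits and misses (the original task was implemented in MAX/MSP, and the manual-termination rule for increasing blocks is not modelled here):

```r
# Hedged sketch of the one-directional adaptive rule. `direction` is +1 for
# increasing-difficulty blocks and -1 for decreasing-difficulty blocks.
update_difficulty <- function(level, consecutive_hits, consecutive_misses,
                              direction, n_levels = 20) {
  if (consecutive_hits >= 3 || consecutive_misses >= 3) {
    level <- level + direction              # one step in the block's direction
    level <- min(max(level, 1), n_levels)   # stay within the 20 levels
  }
  level
}

# Example: a third consecutive hit at difficulty 5 of an increasing block
# moves the listener on to difficulty 6.
update_difficulty(level = 5, consecutive_hits = 3, consecutive_misses = 0,
                  direction = +1)
```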

A response within 1.5 seconds (six tones, target or distractor) of a deviant was considered a hit. Any other response was considered a false alarm. There was no limit on the number of false alarms, or the identification of a deviant where there was none, leading to a variable amount of time spent at each degree of difficulty. A practice block was always completed first, and included the first few degrees of difficulty for the increasing intensity manipulation, until the participant understood the task; data from this practice block were discarded. Based on piloting, in an effort to minimize false alarms, participants were instructed to only identify a deviant if they could clearly perceive the target melody. They were reassured that it was normal not to perceive the target melody at all in the early stages of a block manipulated with decreasing difficulty and to simply wait until the target melody could be distinguished from the distractors. Despite this, false alarms occurred often, as discussed in Section 2.3.1.
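The scoring rule can be made explicit with a short sketch. The function below is illustrative (the vectors of response and deviant onset times, in seconds, are hypothetical); it classifies a keypress as a hit if it falls within 1.5 s of a deviant and as a false alarm otherwise, and counts unanswered deviants as misses.

```r
# Hedged sketch of response scoring: hits, misses and false alarms.
score_responses <- function(response_times, deviant_times, window = 1.5) {
  # A deviant is hit if any response falls within `window` seconds after it.
  hit_deviant <- vapply(deviant_times, function(dt) {
    any(response_times >= dt & response_times <= dt + window)
  }, logical(1))
  # A response is a false alarm if it is not within `window` of any deviant.
  is_hit_response <- vapply(response_times, function(rt) {
    any(deviant_times <= rt & rt <= deviant_times + window)
  }, logical(1))
  list(n_hits = sum(hit_deviant),
       n_misses = sum(!hit_deviant),
       n_false_alarms = sum(!is_hit_response))
}

# Hypothetical example: deviants at 1.0, 15.0 and 30.0 s; responses at
# 2.1, 9.0 and 15.4 s -> 2 hits, 1 miss, 1 false alarm.
score_responses(response_times = c(2.1, 9.0, 15.4),
                deviant_times  = c(1.0, 15.0, 30.0))
```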

2.1.5 Analysis.

All data analysis was performed in R 3.3.2. Alpha was set at .01, with the conservative Bonferroni correction applied for multiple comparisons. Effect sizes are reported for all statistical tests. Transformed data and analysis scripts are available from the first author upon request. Summary statistics can be found in the S2 Appendix.

2.1.5.1 Data processing. Raw data were transformed into d’ scores for each degree of difficulty, feature and participant. One older non-musician participant’s data were removed as they were unable to perform the task after the first few degrees of difficulty on more than half the blocks. Effects of block order and intensity at which the stimuli were played were tested as potential confounding variables using mixed effects multiple linear regression models. Both were non-significant and consequently excluded from the omnibus test described below.
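For reference, the transformation from hit and false-alarm counts to d’ can be sketched as below. The correction applied to proportions of 0 or 1 is an assumption (the text does not specify which convention was used), and the counts shown are placeholders.

```r
# Hedged sketch: convert hit and false-alarm counts to d' for one
# participant x feature x difficulty cell.
dprime <- function(n_hits, n_targets, n_fa, n_nontargets) {
  hr <- n_hits / n_targets
  fr <- n_fa / n_nontargets
  # Clamp extreme proportions with the common 1/(2N) correction (assumed).
  hr <- pmin(pmax(hr, 1 / (2 * n_targets)), 1 - 1 / (2 * n_targets))
  fr <- pmin(pmax(fr, 1 / (2 * n_nontargets)), 1 - 1 / (2 * n_nontargets))
  qnorm(hr) - qnorm(fr)
}

dprime(n_hits = 8, n_targets = 10, n_fa = 3, n_nontargets = 30)  # ~2.12
```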

2.1.5.2 Omnibus test. Mixed effects multiple linear regression models, as implemented in the lme4 package [65], were used to analyse d’ values for each feature. In each model, fixed effects included degree of difficulty, age group, musical training and pure tone average (PTA), with random intercepts for participants. Degree of difficulty and age group were categorical variables coded as factors, with Difficulty 1 and younger as the reference levels against which all other factor levels were compared. Musical training and PTA were continuous variables. Each model was evaluated using Pearson’s correlation between the model’s predictions and the data, along with the correlation’s 95% confidence interval and effect size R2. The statistical significance of each predictor was determined using the t-tests and p-values reported for lmer models, which employ the Satterthwaite method [66]. All follow-up pairwise comparisons used between-samples t-tests with Bonferroni correction.
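A model of this form can be written with lme4’s formula interface as below. This is a hedged sketch: the data frame and column names are hypothetical, the interaction structure is assumed from the interactions reported in the Results, and the Satterthwaite p-values come from loading lmerTest, which wraps lme4::lmer.

```r
library(lmerTest)  # loads lme4 and adds Satterthwaite t-tests and p-values

# Hypothetical columns: dprime, difficulty (factor, reference level 1),
# age_group (factor, reference "younger"), musical_training (Gold-MSI score),
# pta (continuous), participant (ID). Interaction structure is assumed.
m_intensity <- lmer(
  dprime ~ difficulty * age_group * musical_training + pta + (1 | participant),
  data = intensity_data
)

summary(m_intensity)  # coefficients, SEs, Satterthwaite df and p-values

# Model evaluation as described above: Pearson's r between fitted and observed.
cor.test(fitted(m_intensity), intensity_data$dprime)
```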

2.1.5.3 Equivalence test. A two one-sided tests (TOST) procedure was applied using the TOSTER package [67] in cases where the age effect was null. An equivalence test assesses whether an effect is statistically different from zero and whether it is smaller than a pre-set smallest effect size of interest, or equivalence bound. In other words, in the presence of a null effect according to the omnibus test, the TOST procedure assesses whether the effect is non-zero and whether it is large enough to be considered interesting. Here, the equivalence bound, or smallest effect size of interest, was set to a Cohen’s d of 0.2.
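For illustration, a TOST of this kind can be run from group summary statistics. The sketch below uses the TOSTER package’s TOSTtwo() interface with placeholder means and SDs (not the study’s values), the ±0.2 Cohen’s d bounds, and the alpha of .01 described above; newer versions of TOSTER expose the same test under different function names.

```r
library(TOSTER)

# Hedged sketch with placeholder summary statistics (not the study's values).
TOSTtwo(m1 = 1.10, sd1 = 1.00, n1 = 28,   # e.g., younger adults
        m2 = 0.95, sd2 = 1.05, n2 = 26,   # e.g., older adults
        low_eqbound_d = -0.2, high_eqbound_d = 0.2,
        alpha = 0.01)
```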

2.2 Results

Task performance as measured by d’ is illustrated in Fig 2. The mixed effects multiple linear regression models are summarized in Tables 2 (intensity), 3 (spectral envelope) and 4 (temporal envelope), where only significant main effect predictors are included for brevity. Full model specifications can be found in the S1 Appendix.

Fig 2. Mean d’ scores.

Mean d’ score by degree of difficulty, grouped by age group and musical training for intensity (A), spectral envelope (B) and temporal envelope (C) features. Standard error is shown by grey shading around each line. While musical training was a continuous variable in statistical analysis, for visualization purposes, participants with scores over 25 were considered musicians, and participants with scores less than 25 were considered non-musicians.

https://doi.org/10.1371/journal.pone.0274631.g002

Table 2. Mixed effects multiple linear regression model for intensity feature.

Coefficients for each predictor (and each degree of difficulty, as relevant) along with standard error (SE) and predictor R2 are reported along with Pearson’s correlation, 95% CIs, p-value and R2 for full models in the note below.

https://doi.org/10.1371/journal.pone.0274631.t002

Table 3. Mixed effects multiple linear regression model for spectral envelope feature.

Coefficients for each predictor (and each degree of difficulty, as relevant) along with standard error (SE) and predictor R2 are reported along with Pearson’s correlation, 95% CIs, p-value and R2 for full models in the note below.

https://doi.org/10.1371/journal.pone.0274631.t003

Table 4. Mixed effects multiple linear regression model for temporal envelope feature.

Coefficients for each predictor (and each degree of difficulty, as relevant) along with standard error (SE) and predictor R2 are reported along with Pearson’s correlation, 95% CIs, p-value and R2 for full models in the note below.

https://doi.org/10.1371/journal.pone.0274631.t004

2.2.1 Intensity.

The model for intensity included significant main effects of degree of difficulty and musical training, with no significant interactions. Pairwise comparisons between degrees of difficulty indicated that d’ was significantly different (Bonferroni-corrected p < 5x10-5) between pairs of lower and higher degrees of difficulty, as illustrated in Fig 2A (details of all tests can be found in the S1 Appendix). Participants with higher musical training scores were better at detecting deviants overall. To confirm the null effect of age, a TOST was conducted comparing younger and older adults with lower and upper equivalence bounds of d’ = -0.40 and d’ = 0.40, equivalent to a Cohen’s d of 0.2. The equivalence test was not significant, t(2098) = 1.62, p > .01; the effect, Δd’ = -0.26, CIs [-0.46, -0.06], was different from zero, with older adults having higher d’ scores than younger adults, although this difference did not reach significance in the omnibus test.

2.2.2 Spectral envelope.

The model for spectral envelope included significant main effects of degree of difficulty and musical training, with no significant interactions. Pairwise comparisons between degrees of difficulty indicated that d’ was significantly different (Bonferroni-corrected p < 5x10-5) between pairs of lower and higher levels, as illustrated in Fig 2B (details of all tests can be found in the S1 Appendix). Participants with higher musical training scores were better at detecting deviants overall, including when the difference in spectral envelope between target and distractor was smaller. To confirm the null effect of age, a TOST was conducted comparing younger and older adults with lower and upper equivalence bounds of d’ = -0.42 and d’ = 0.42. The equivalence test was not significant, t(2109) = 0.21, p > .01; the effect, Δd’ = 0.44, CIs [0.23, 0.65], was non-zero and fell outside our equivalence bounds. Overall, an effect of age was not significant according to the omnibus test but was larger than the smallest interesting effect.

To investigate whether this ambiguity was due to an undetected interaction between age and musical training, two further TOSTs were applied. First, a TOST compared younger non-musicians and all older adults with lower and upper equivalence bounds of d’ = -0.40 and d’ = 0.40. The equivalence test was not significant, t(1185) = 2.26, p > .01. The effect, Δd’ = -0.16, CIs [-0.41, 0.08], fell within our equivalence bounds but its CI extended beyond them; however, the CI for this effect also included zero, indicating that the difference between younger non-musicians and older participants was inconclusive. Second, a TOST compared younger musicians and all older adults with lower and upper equivalence bounds of d’ = -0.42 and d’ = 0.42. The equivalence test was not significant, t(1113) = 5.09, p > .01, and the effect, Δd’ = 0.98, CIs [0.72, 1.24], fell outside our equivalence bounds. While the TOST comparing younger non-musicians to older adults was inconclusive, the TOST comparing younger musicians to older adults showed that this difference was non-zero.

2.2.3 Temporal envelope.

The model for temporal envelope included a significant main effect of musical training, with no significant interactions. Results are illustrated in Fig 2C; details of all tests can be found in the S1 Appendix. To confirm the null effect of age, a TOST was conducted comparing younger and older adults with lower and upper equivalence bounds of d’ = -0.34 and d’ = 0.34, equivalent to a Cohen’s d of 0.2. The equivalence test was not significant, t(2100) = 0.66, p > .01; the effect, Δd’ = 0.39, CIs [0.22, 0.56], was different from zero, with younger adults having higher d’ scores than older adults, although this difference did not reach significance in the omnibus test.

2.3 Discussion

2.3.1 Results summary.

This study aimed to investigate auditory streaming abilities in terms of age and its interaction with musicianship for three sound features. Younger and older musicians and non-musicians identified deviants randomly inserted in a repeated four-note melody and interleaved with distractor tones that were more or less similar to the melody tones in intensity, spectral envelope and temporal envelope. The ability to detect the deviants was affected by the manipulation of the intensity, spectral envelope and temporal envelope of the distractor tones. This supports the idea that the acoustic manipulations affected auditory stream segregation and replicates previous findings [7]. More importantly, people with higher musical training scores were better able to make use of all features of the distractor at all levels of difference from the target.

Critically, though some literature suggests that as little as 10–15 dB of hearing loss can affect auditory discrimination [68, 69], PTA was never a significant predictor, suggesting that stream segregation based on intensity, spectral or temporal envelope was not affected by the variation in hearing thresholds in this study. This is consistent with literature supporting the preservation of sequential auditory streaming based on frequency and interaural time differences for both normal-hearing and hearing-impaired older adults [19–21, 45]. Performance was also not significantly related to differences in the intensity at which stimuli were presented through the headphones, suggesting that there were no meaningful differences in baseline performance as a function of absolute intensity.

2.3.2 Effect of musicianship.

The positive effect of musical training on auditory streaming is consistent with existing literature [48–51] and replicates Marozeau et al.’s [7] findings. Our model coefficients show that the Gold-MSI musical training subtest score was positively correlated with d’, corresponding to a d’ advantage of roughly 0.6 between our musician and non-musician groups when intensity was manipulated and roughly 1 when spectral envelope was manipulated. When temporal envelope was manipulated, there was a significant main effect of musical training only, corresponding to a d’ advantage of roughly 1.2. However, the effect size was very small for all features.

2.3.3 Effect of aging.

The absence of a main effect of age for all features is consistent with previous work investigating auditory streaming in older adults with or without hearing loss [19–22]. The absence of a main effect of PTA, despite the older adults having significantly higher audiometric thresholds than the younger adults (see Table 1), suggests that age-related changes in hearing abilities do not negatively impact auditory streaming. These previous studies investigated auditory streaming based on frequency [19–21]; our results extend these findings to intensity and to two factors affecting timbre: spectral envelope and temporal envelope. We were especially interested in the interaction between age and musicianship, where a significant interaction would suggest that younger and older adults’ performance on this auditory streaming task is differently affected by musicianship. This interaction was not found for any feature, suggesting that younger and older adults’ performance is similarly affected by their musical background.

3 Experiment 2

To compare the effects of the three features on streaming directly, a dissimilarity rating experiment was conducted to establish a common perceptual scale ([7], Experiment 2).

3.1 Methods

3.1.1 Participants.

Twelve participants took part in this second experiment, 6 younger and 6 older; 10 also took part in Experiment 1, while the rest were lab members. Table 5 outlines this participant pool’s demographic information. Level of musical training happened to coincide with age group: all younger adults except one scored below 50% on the Gold-MSI sub-scale, while all older adults scored above 50%. This confound is not desirable but occurred because only those willing to stay for Experiment 2 were tested, which happened to produce these homogeneous groups. However, as demonstrated below, there was no difference in performance between the groups, and data were pooled for further analysis.

3.1.2 Stimuli.

Fifteen four-note melodies were used in this experiment, the same as in Marozeau et al. [7], Experiment 2; they contained the same F0s as the target melody in Experiment 1 (G4, C5, A5, D5), with all three acoustic features modified simultaneously. As the similarity results were analyzed using multidimensional scaling (MDS), it was important to ensure that the differences in each feature induced perceptual changes of comparable magnitude and that the stimuli were evenly distributed in a three-dimensional space. For each feature, five possible levels were selected, spanning approximately the upper half of each psychometric function found in Marozeau et al. [7], Experiment 1. Thus, stimuli were presented with overall attenuations ranging from 0 to 8 phons, with 1.69, 1.19, 0.75, 0.35 or 0 dB of attenuation per harmonic, and with FDHM (full duration at half-maximum) values of 100, 112, 126, 142 or 160 ms (see Fig 3). The first five stimuli were constructed by assigning a random permutation of the five levels for each of the three features; this was repeated two more times while ensuring that none of the 15 stimuli were identical. The stimuli were created using Matlab 7.5 and the experiment was implemented in MAX/MSP 8. Stimuli were presented through over-ear headphones (Sennheiser HDA 200) at a comfortable level.
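The stimulus-construction logic can be sketched as follows. This is a hedged illustration, not the original Matlab code: the loudness attenuation levels are assumed to be evenly spaced between 0 and 8 phons, and the uniqueness check simply regenerates the set if any two of the 15 stimuli coincide.

```r
# Hedged sketch: three blocks of five stimuli, each block assigning an
# independent random permutation of the five levels of each feature.
set.seed(1)
levels_atten_phon  <- c(0, 2, 4, 6, 8)               # assumed even spacing
levels_spectral_db <- c(1.69, 1.19, 0.75, 0.35, 0)   # dB per harmonic
levels_fdhm_ms     <- c(100, 112, 126, 142, 160)

repeat {
  stim <- do.call(rbind, lapply(1:3, function(block) {
    data.frame(atten_phon  = sample(levels_atten_phon),
               spectral_db = sample(levels_spectral_db),
               fdhm_ms     = sample(levels_fdhm_ms))
  }))
  if (nrow(unique(stim)) == 15) break   # ensure all 15 stimuli are distinct
}
stim
```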

Fig 3. Illustration of stimuli for Experiment 2.

Example of three possible pairs of stimuli. In the first line, the listener is asked to judge the dissimilarity of the standard melody (bar 1) with the same melody in which each note is reduced in loudness (bar 3). In the second line, the listener is asked to judge the dissimilarity of a melody in which each note is more impulsive (bar 5) with the standard melody (bar 7). In the third line, the listener is asked to judge the dissimilarity of a soft melody (bar 9) with a melody with a different spectral centroid (bar 11).

https://doi.org/10.1371/journal.pone.0274631.g003

3.1.3 Procedure.

This experiment was divided into two parts. In the first part, participants could listen to each of the 15 four-note melodies as many times as they wanted in order to acquaint themselves with the range of dissimilarity among the stimuli. In the second part, they rated pairs of melodies on a slider labelled “very dissimilar” at one end and “very similar” at the other, for a total of 105 pairs. Participants rated one pair of melodies at a time and could listen to that pair as many times as they wanted until they were satisfied with their rating (see Fig 3). They then clicked a button to submit their rating, which triggered the next trial. Participants were encouraged to use the full range of the scale. Each response was quantified as an integer ranging from 0 to 128, based on the slider’s position.

3.1.4 Analysis.

MDS analyses were performed in Matlab with custom made functions. Linear modelling was performed in R 3.3.2 using the lme4 package [65]. Alpha was set at .01, with the conservative Bonferroni correction applied for multiple comparisons, and effect sizes are reported for all statistical tests.

To test for an effect of age, two MDS bootstrap analyses [70] were performed, one on the data from the older adults and one on the data from the younger adults, allowing statistical comparison between the two solutions. In each analysis, 200 three-dimensional MDS spaces were created by randomly selecting six participants with replacement (so the same participant could be selected more than once). If all participants were always in close agreement, the 200 spaces should be very similar; if the participants disagreed, each of the 200 spaces would depend strongly on which participants were randomly selected. This technique therefore provides an estimate of the stability of each stimulus’s position given inter-participant variability. Two hundred random selections were used to be consistent with previous research using the same technique [7]. Each space was rotated towards the same reference space, created according to the physical properties of the stimuli. These 200 solutions defined a distribution of positions within the MDS space for each stimulus, making it possible to define a 95% confidence volume for each stimulus. To measure the similarity between the bootstrap solutions for the older and younger adults, a t-test was performed for each stimulus, comparing the absolute difference between its position in the space obtained from the younger participants and its position in the space obtained from the older participants. The standard deviation of each position was extracted from the bootstrap analysis, and the degrees of freedom were based on the number of participants in each group. Bonferroni correction for multiple comparisons was applied. No stimulus approached statistical significance; the spaces were therefore considered similar, and the data from the younger and older participants were combined to create a single space that was used for the rest of the analysis.
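A hedged R sketch of this bootstrap follows (the original analysis used custom Matlab functions). Here `diss_by_subj` is a hypothetical list of 15 × 15 dissimilarity matrices, one per participant in a group, and `ref_conf` a hypothetical 15 × 3 reference configuration built from the physical stimulus parameters; the smacof package supplies SMACOF-type MDS and a Procrustes rotation.

```r
library(smacof)  # SMACOF-type MDS and Procrustes rotation in R

# Hedged sketch of the bootstrap described above (hypothetical objects:
# `diss_by_subj`, a list of per-participant dissimilarity matrices, and
# `ref_conf`, a 15 x 3 reference configuration from the physical features).
set.seed(1)
n_boot <- 200
boot_confs <- vector("list", n_boot)

for (b in seq_len(n_boot)) {
  idx <- sample(seq_along(diss_by_subj), replace = TRUE)  # resample listeners
  mean_diss <- Reduce(`+`, diss_by_subj[idx]) / length(idx)
  conf <- mds(as.dist(mean_diss), ndim = 3, type = "interval")$conf
  boot_confs[[b]] <- Procrustes(ref_conf, conf)$Yhat      # rotate to reference
}

# Spread of each stimulus across the 200 rotated solutions indexes stability.
position_sd <- apply(simplify2array(boot_confs), c(1, 2), sd)
```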

The similarity ratings were averaged across participants and a three-dimensional space was extracted using the MDSCAL procedure, implemented with the SMACOF algorithm [71]. As the MDSCAL solution is rotationally undetermined, the solution was rotated with a Procrustes procedure that minimized the least-squares difference between the perceptual and physical spaces.

3.2 Results

To determine the amount of dissimilarity between each level of the physical feature, the slope of the regression line between each physical feature and the MDS dimension was plotted (Fig 4). As expected, all three MDS dissimilarity dimensions were correlated with the physical dimensions (Fig 4): the first dimension was correlated with the temporal envelope values, r (11) = 0.93, 95% CIs [0.81, 0.97], p < 0.01; the second dimension was correlated with the spectral envelope values, r (11) = 0.86, 95% CIs [0.63, 0.95], p < 0.01; and the third dimension was correlated with the intensity values, r (11) = 0.98, 95% CIs [0.95, 0.99], p < 0.01.

Fig 4. Scatterplot of the perceptual dimensions derived from the MDS analysis against the physical features.

The equation in each panel describes the regression line. Logarithmic functions are used for temporal and spectral envelope as in Marozeau et al. [7], Fig 8; they offer better correlation to the perceptual dimensions than linear functions.

https://doi.org/10.1371/journal.pone.0274631.g004

Based on these relationships, Fig 2 was redrawn with a new scale on the x-axis: Figs 5 and 6 illustrate performance for all three features on the same x-axis dissimilarity scale, for each participant group. The data were then re-analyzed in terms of dissimilarity instead of difficulty level. Dissimilarity was treated as a continuous variable and all other terms remained the same, including factors and their reference categories. Table 6 presents a summary of the model, including only predictors with R2 greater than .01 for brevity. Full model specifications can be found in the S1 Appendix. This model included significant main effects of dissimilarity, feature and musical training, but not of age group or PTA. Both dissimilarity and musical training had positive coefficients (1.33 and 0.07, respectively), indicating better performance with increased dissimilarity and musical training, as expected. There were several significant interactions: dissimilarity x feature, dissimilarity x musical training, dissimilarity x feature x musical training, feature x age group, feature x musical training, and feature x age group x musical training. The interaction between age group and musical training was not significant.

Fig 5. Mean d’ plotted on a common perceptual scale, younger adults.

Performance plotted against dissimilarity, allowing features to be directly compared for younger adults: non-musicians (A) and musicians (B). Standard errors are shown by grey shading around each line.

https://doi.org/10.1371/journal.pone.0274631.g005

Fig 6. Mean d’ plotted on a common perceptual scale, older adults.

Performance plotted against dissimilarity, allowing features to be directly compared for older adults: non-musicians (A) and musicians (B). Standard errors are shown by grey shading around each line.

https://doi.org/10.1371/journal.pone.0274631.g006

Table 6. Summary of mixed effects multiple linear regression model with common perceptual scale.

Coefficients for each predictor (and each level, as relevant) along with standard error (SE) and predictor R2 are reported along with Pearson’s correlation, 95% CIs, p-value and R2 for full models in the note below.

https://doi.org/10.1371/journal.pone.0274631.t006

Interactions involving dissimilarity indicate differences in slope, both between features (the dissimilarity x feature two-way interaction) and as a function of musical training (the dissimilarity x feature x musical training three-way interaction). According to the model coefficients, the slopes for both the spectral and temporal envelope manipulations were shallower than the slope for the intensity manipulation, with spectral envelope having the shallowest slope. This difference in slopes further interacted with musical training, where very small but positive coefficients indicate that higher Gold-MSI scores were associated with slightly steeper slopes.

To confirm the null effect of age, a TOST was conducted comparing younger and older adults with lower and upper equivalence bounds of d’ = -0.42 and d’ = 0.42, equivalent to a Cohen’s d of 0.2. The equivalence test was significant, t(6300) = -4.86, p < .01; the effect, Δd’ = 0.16, CIs [0.08, 0.25], was different from zero, with younger adults having higher d’ scores than older adults. However, the effect and its CIs fell within the equivalence bounds, meaning that the effect is also statistically equivalent to zero. In other words, the effect is not bigger than the smallest interesting effect size and can therefore be considered negligible.

3.3 Discussion

Redrawing the results of Experiment 1 on a dissimilarity scale in Figs 5 and 6 reveals that each feature spans a different range on the dissimilarity scale, from temporal envelope with the shortest span to spectral envelope with the widest. This suggests that the perceptual distance between 3 dB of harmonic attenuation and 25 dB of harmonic attenuation is four times larger than the perceptual distance between an FDHM of 160 ms and an FDHM of 60 ms, while the perceptual distance between 65 phons and 2 phons is about twice as large. Additionally, the maximally different spectral envelopes used in the study were perceived to be approximately twice as dissimilar as the maximally different intensities. Despite this, d’ was higher overall when intensity was manipulated, suggesting that loudness is the most effective of these three cues. This is consistent with Marozeau et al.’s [7] results.

3.3.1 Effect of musicianship.

Marozeau et al. [7] also reported an effect of musicianship, where intensity and spectral envelope required similar dissimilarity to segregate the target from the distractor in non-musicians, while intensity required less dissimilarity than both spectral and temporal envelope to successfully segregate the target from the distractor in musicians. Here, all participants required less dissimilarity to identify deviants when intensity was manipulated than when spectral envelope was manipulated, although a higher musical training score was associated with needing less dissimilarity for both features. Specifically, musical training had an overall positive coefficient, providing a performance advantage of approximately 1.2 dissimilarity units for musicians with the highest Gold-MSI scores compared to non-musicians with the lowest Gold-MSI scores. However, note once again that the effect size for musical training was small.

3.3.2 Effect of aging.

There were no main effects of age or PTA on the relative effectiveness of the three features; however, age interacted with feature, where older adults performed better than younger adults when intensity was manipulated but worse when spectral envelope was manipulated. In other words, older adults relied more heavily on loudness than spectral envelope, while younger adults used both cues. This decreased use of spectral envelope in older adults may be due to loss of high-frequency hearing, which would impede their ability to perceive the spectral envelope manipulations that attenuated the higher harmonics [39, 40] and lead them to rely more heavily on perceptual loudness cues. A more systematic variation of hearing loss as it interacts with age (i.e., younger and older adults with or without hearing loss) may have helped differentiate between the effects of age and hearing loss. However, the results of the current study, as designed, suggest that a different explanation must be sought, as PTA was not a significant predictor of performance. One such explanation may be loudness recruitment, where above a certain intensity, sounds are perceived as louder than they would be by a person with normal hearing [72]. However, we cannot test this explanation with these data, as the rate of increase of loudness as a function of increasing level (loudness recruitment) was not measured. Recall that Experiment 1 suggested no difference in performance between younger and older adults when spectral envelope was manipulated; Experiment 2 suggests that although raw performance was comparable, the relative weight of the feature differed between age groups. This highlights the importance of being able to compare the effects of features directly and suggests that there may be qualitative rather than quantitative age differences in auditory streaming skills. In other words, younger and older adults can perform the task equally well, but the strategies they use to accomplish it differ. Temporal envelope was the feature with the lowest performance for all age groups, suggesting that it is not a particularly useful streaming cue, especially if intensity and spectral envelope information are available.

4 General discussion

The experiments reported here provide evidence that sequential auditory streaming, when cued by intensity, spectral envelope or temporal envelope, is preserved in older adults who have normal hearing or mild hearing loss based on the World Health Organization (WHO) criteria [61]. This extends existing evidence for the preservation of sequential auditory streaming based on frequency and interaural time differences for both normal-hearing and hearing-impaired older adults [19–21, 45]. The dissimilarity rating paradigm allowed direct comparison of the effectiveness of the perceptual cues between the groups of participants. Interestingly, intensity was more effective for older adults than for younger adults, although the salience of intensity for older adults could be associated with elevated audiometric thresholds or loudness recruitment [72–75], and not aging per se. In the following sections, the effect of musical training is reviewed and the results are discussed in the context of theories of aging.

4.1 Effect of musicianship

Musicianship is associated with enhanced auditory processing [49, 50, 53, 76–78], including auditory streaming [12]. Enhanced auditory processing abilities are preserved in older musicians compared to older non-musicians [52, 54, 55], while auditory stream segregation based on F0 is preserved in older adults regardless of musicianship [19, 20]. Therefore, one critical question is how musicianship and aging interact during auditory stream segregation tasks based on intensity, spectral and temporal envelope cues. Significant interactions may suggest a pattern of differential preservation, if aging results in lower d’ for those with low musical training but not for those with high musical training. Main effects of age group and musicianship without an interaction would suggest preserved differentiation, where aging has a negative effect on d’ for all participants, regardless of their musical experience [52]. There were no significant interactions between age group and musicianship for any feature, nor when a common perceptual scale was applied. Furthermore, there was no main effect of age group, suggesting that while musical training has a positive effect on d’, aging has no negative effect. Combined with the results of Experiment 2, which show that intensity is the most effective perceptual cue for all participant groups, this pattern of results suggests that musicianship is associated with increased sensitivity to acoustic features that are typically less salient for stream segregation. An important caveat is the small effect size of musicianship, which suggests that the positive relationship between musical training and auditory streaming skills is negligible and may therefore not make a meaningful difference in day-to-day listening contexts. Indeed, Madsen et al. [79] found that while musical training was associated with better frequency discrimination, interaural time difference discrimination and attentive tracking, this advantage did not extend to speech-in-noise perception. At the same time, Madsen et al. [79] may be an outlier, as a recent meta-analysis revealed that musical training was associated with enhanced abilities to understand speech in noise [80]. This pattern of results suggests that musician benefits for understanding speech in noise may not be related to enhanced auditory stream segregation. Future meta-analyses of existing work on the effect of musicianship on auditory streaming would be useful for quantifying the size of this benefit.

4.2 Theories of aging

One way to consider the streaming task used in the current study is as an inhibition task: processing of the distractor tones must be inhibited so that the target tones can be integrated into an attended auditory stream. A recent meta-analysis investigated the impact of aging on three types of cognitive inhibition [60]. It found support for an age-related decline in the ability to inhibit a natural, habitual or dominant response in favour of a response appropriate to the goal of the task (e.g., the Stroop task). For example, in a stop-signal task in which participants inhibited or executed responses based on a visual signal, older adults were less able to inhibit a response to a non-target stimulus [81]. Interestingly, the same meta-analysis found no impact of aging on the ability to inhibit distracting perceptual information, nor on the ability to inhibit response interference. For example, Hsieh and Fang [82] found that older adults performed similarly to younger adults on a series of tasks testing participants’ ability to ignore distracting information. In their task, participants were asked to report either the direction of a target arrow on the screen (pro condition) or the opposite direction from the target arrow (anti condition), using the keyboard; the target arrow was surrounded by arrows pointing in the same or the opposite direction in congruent and incongruent conditions, or by squares in a neutral condition. In a picture-word Stroop (response inhibition) task, Bugg [83] found that older adults, like young adults, showed less interference for mostly incongruent items than for mostly congruent items, supporting evidence for intact and flexible reactive control. In the present context, an auditory stream segregation task is one that requires the inhibition of distracting perceptual information. Accordingly, the findings from the current study, which show no overall deficit in older adults, are consistent with this meta-analysis.

There is evidence that although behavioural performance is preserved with age, the underlying brain function may be altered; this is known as the compensatory theory of aging [84–86]. Typical changes include increased frontal lobe activation and reduced hemispheric asymmetry, such that activity that is lateralized to one hemisphere in younger adults is distributed more equally across both hemispheres in older adults. In both cases, older adults recruit more neural resources than younger adults to accomplish the same task. In music perception, such compensatory activity has been observed during the processing of F0 structure [87, 88] and F0-based sequential auditory streaming [20]. Given that younger and older adults show different sources of streaming-related neural activity for F0-based auditory stream segregation [20], similar mechanisms may have been at play in the current study. Future research should use brain imaging to examine potential compensatory activity during sequential auditory streaming based on acoustic cues other than F0.

In conclusion, this study investigated auditory stream segregation based on intensity, spectral envelope and temporal envelope cues in older and younger musicians and non-musicians. The findings confirm that sequential auditory streaming is generally preserved in older adults and enhanced in musicians. Furthermore, older adults relied on intensity more than younger adults did, and musicians were better able than non-musicians to make use of a variety of acoustic cues.

Supporting information

S1 Appendix. Full model and t-test details for all linear modelling.

https://doi.org/10.1371/journal.pone.0274631.s001

(DOCX)

S2 Appendix. Summary statistics (mean, standard deviation) of d’ and transformed d’ by level, age group, musicianship, and feature.

https://doi.org/10.1371/journal.pone.0274631.s002

(DOCX)

Acknowledgments

The authors acknowledge that this work was undertaken on Ktaqmkuk, the unceded, unsurrendered lands of the Mi’kmaq and the ancestral lands of the Beothuk. We would also like to acknowledge the Inuit of Nunatsiavut and NunatuKavut and the Innu of Nitassinan, and their ancestors, as the original peoples of Labrador. The authors thank Alex Cho and Liam Foley for their help with participant recruitment and data collection.

References

1. Bregman AS. Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press; 1990.
2. van Noorden L. Temporal Coherence in the Perception of Tone Sequences. Technical University Eindhoven. 1975.
3. Pressnitzer D, Hupé J-M. Temporal dynamics of auditory and visual bistability reveal common principles of perceptual organization. Curr Biol CB. 2006;16: 1351–1357. pmid:16824924
4. Cusack R, Deeks J, Aikman G, Carlyon RP. Effects of Location, Frequency Region, and Time Course of Selective Attention on Auditory Scene Analysis. J Exp Psychol Hum Percept Perform. 2004;30: 643–656. pmid:15301615
5. Shinn-Cunningham BG, Lee AKC, Oxenham AJ. A sound element gets lost in perceptual competition. Proc Natl Acad Sci U S A. 2007;104: 12223–12227. pmid:17615235
6. Cusack R, Roberts B. Effects of differences in timbre on sequential grouping. Percept Psychophys. 2000;62: 1112–1120. pmid:10997053
7. Marozeau J, Innes-Brown H, Blamey PJ. The Effect of Timbre and Loudness on Melody Segregation. Music Percept Interdiscip J. 2013;30: 259–274.
8. Roberts B, Glasberg BR, Moore BCJ. Primitive stream segregation of tone sequences without differences in fundamental frequency or passband. J Acoust Soc Am. 2002;112: 2074. pmid:12430819
9. Sauvé SA, Stewart L, Pearce MT. The Effect of Musical Training on Auditory Grouping. In: Kyoung Song M, editor. Proceedings of the ICMPC-APSCOM 2014 Joint Conference. Seoul, Korea; 2014. pp. 293–296.
10. Carlyon RP, Cusack R, Foxton JM, Robertson IH. Effects of attention and unilateral neglect on auditory stream segregation. J Exp Psychol Hum Percept Perform. 2001;27: 115–127. pmid:11248927
11. Andreou L-V, Kashino M, Chait M. The role of temporal regularity in auditory segregation. Hear Res. 2011;280: 228–235. pmid:21683778
12. Beauvois MW, Meddis R. Time decay of auditory stream biasing. Percept Psychophys. 1997;59: 81–86. pmid:9038410
13. Münte TF, Kohlmetz C, Nager W, Altenmüller E. Superior auditory spatial tuning in conductors. Nature. 2001;409: 580. pmid:11214309
14. Alain C, Arnott SR, Picton TW. Bottom–up and top–down influences on auditory scene analysis: Evidence from event-related brain potentials. J Exp Psychol Hum Percept Perform. 2001;27: 1072–1089. pmid:11642696
15. Getzmann S, Näätänen R. The mismatch negativity as a measure of auditory stream segregation in a simulated “cocktail-party” scenario: effect of age. Neurobiol Aging. 2015;36: 3029–3037. pmid:26254109
16. Mackersie CL, Prida TL, Stiles D. The role of sequential stream segregation and frequency selectivity in the perception of simultaneous sentences by listeners with sensorineural hearing loss. J Speech Lang Hear Res. 2001. pmid:11218102
17. Moore BC, Glasberg BR, Peters RW. Thresholds for hearing mistuned partials as separate tones in harmonic complexes. J Acoust Soc Am. 1986;80: 479–483. pmid:3745680
18. Summers V, Leek MR. F0 processing and the separation of competing speech signals by listeners with normal hearing and with hearing loss. J Speech Lang Hear Res. 1998;41: 1294–1306.
19. Alain C, Ogawa KH, Woods DL. Aging and the segregation of auditory stimulus sequences. J Gerontol B Psychol Sci Soc Sci. 1996;51: P91–P93. pmid:8785691
20. Snyder JS, Alain C. Sequential auditory scene analysis is preserved in normal aging adults. Cereb Cortex. 2007;17: 501–512. pmid:16581981
21. Stainsby TH, Moore BC, Glasberg BR. Auditory streaming based on temporal structure in hearing-impaired listeners. Hear Res. 2004;192: 119–130. pmid:15157970
22. Tejani VD, Schvartz-Leyzac KC, Chatterjee M. Sequential stream segregation in normally-hearing and cochlear-implant listeners. J Acoust Soc Am. 2017;141: 50–64. pmid:28147600
23. Rose MM, Moore BC. Perceptual grouping of tone sequences by normally hearing and hearing-impaired listeners. J Acoust Soc Am. 1997;102: 1768–1778. pmid:9301054
24. Valentine S, Lentz JJ. Broadband Auditory Stream Segregation by Hearing-Impaired and Normal-Hearing Listeners. J Speech Lang Hear Res. 2008;51: 1341–1352. pmid:18664686
25. Bergman M. Hearing and aging: Implications of recent research findings. Audiology. 1971;10: 164–171.
26. Moore BCJ, Gockel HE. Properties of auditory stream formation. Philos Trans R Soc B Biol Sci. 2012;367: 919–931. pmid:22371614
27. Fitzgibbons PJ, Gordon-Salant S. Auditory temporal processing in elderly listeners. J Am Acad Audiol. 1996;7: 183–189. pmid:8780991
28. Pichora-Fuller MK, Schneider BA, MacDonald E, Pass HE, Brown S. Temporal jitter disrupts speech intelligibility: A simulation of auditory aging. Hear Res. 2007;223: 114–121. pmid:17157462
29. Fitzgibbons PJ, Gordon-Salant S, Friedman SA. Effects of age and sequence presentation rate on temporal order recognition. J Acoust Soc Am. 2006;120: 991–999. pmid:16938986
30. Fitzgibbons PJ, Gordon-Salant S. Auditory temporal order perception in younger and older adults. J Speech Lang Hear Res. 1998;41: 1052–1060. pmid:9771628
31. Humes LE, Busey TA, Craig J, Kewley-Port D. Are age-related changes in cognitive function driven by age-related changes in sensory processing? Atten Percept Psychophys. 2013;75: 508–524. pmid:23254452
32. Trainor LJ, Trehub SE. Aging and auditory temporal sequencing: Ordering the elements of repeating tone patterns. Atten Percept Psychophys. 1989;45: 417–426. pmid:2726404
33. Fitzgibbons PJ, Gordon-Salant S, Barrett J. Age-related differences in discrimination of an interval separating onsets of successive tone bursts as a function of interval duration. J Acoust Soc Am. 2007;122: 458–466. pmid:17614503
34. Fitzgibbons PJ, Gordon-Salant S. Aging and temporal discrimination in auditory sequences. J Acoust Soc Am. 2001;109: 2955–2963. pmid:11425137
35. Snell KB. Age-related changes in temporal gap detection. J Acoust Soc Am. 1997;101: 2214–2220. pmid:9104023
36. Strouse A, Ashmead DH, Ohde RN, Grantham DW. Temporal processing in the aging auditory system. J Acoust Soc Am. 1998;104: 2385–2399. pmid:10491702
37. Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N. Aging Affects Neural Precision of Speech Encoding. J Neurosci. 2012;32: 14156–14164. pmid:23055485
38. Goossens T, Vercammen C, Wouters J, van Wieringen A. Aging affects neural synchronization to speech-related acoustic modulations. Front Aging Neurosci. 2016;8: 133. pmid:27378906
39. Arehart KH, Croghan NBH, Muralimanohar RK. Effects of Age on Melody and Timbre Perception in Simulations of Electro-Acoustic and Cochlear-Implant Hearing. Ear Hear. 2014;35: 195–202. pmid:24441739
40. Grimault N, Micheyl C, Carlyon RP, Arthaud P, Collet L. Perceptual auditory stream segregation of sequences of complex sounds in subjects with normal and impaired hearing. Br J Audiol. 2001;35: 173–182. pmid:11548044
41. Peters RW, Moore BCJ. Auditory filter shapes at low center frequencies in young and elderly hearing-impaired subjects. J Acoust Soc Am. 1992;91: 256–266. pmid:1737876
42. Sommers MS, Humes LE. Auditory filter shapes in normal-hearing, noise-masked normal, and elderly listeners. J Acoust Soc Am. 1993;93: 2903–2914. pmid:8315154
43. Oxenham AJ. Pitch perception and auditory stream segregation: implications for hearing loss and cochlear implants. Trends Amplif. 2008;12: 316–331. pmid:18974203
44. Rose MM, Moore BC. The relationship between stream segregation and frequency discrimination in normally hearing and hearing-impaired subjects. Hear Res. 2005;204: 16–28. pmid:15925188
45. Füllgrabe C, Moore BCJ. Effects of age and hearing loss on stream segregation based on interaural time differences. J Acoust Soc Am. 2014;136: EL185–EL191. pmid:25096145
46. Marozeau J, Innes-Brown H, Grayden DB, Burkitt AN, Blamey PJ. The effect of visual cues on auditory stream segregation in musicians and non-musicians. PLoS ONE. 2010;5: e11297. pmid:20585606
47. Slater KD, Marozeau J. The effect of tactile cues on auditory stream segregation ability of musicians and nonmusicians. Psychomusicology Music Mind Brain. 2016;26: 162–166.
48. François C, Jaillet F, Takerkart S, Schön D. Faster Sound Stream Segmentation in Musicians than in Nonmusicians. PLoS ONE. 2014;9: e101340. pmid:25014068
49. Fujioka T, Trainor LJ, Ross B, Kakigi R, Pantev C. Automatic encoding of polyphonic melodies in musicians and nonmusicians. J Cogn Neurosci. 2005;17: 1578–1592. pmid:16269098
50. Habibi A, Wirantana V, Starr A. Cortical Activity During Perception of Musical Pitch: Comparing Musicians and Nonmusicians. Music Percept Interdiscip J. 2013;30: 463–479.
51. Zendel BR, Alain C. Concurrent sound segregation is enhanced in musicians. J Cogn Neurosci. 2009;21: 1488–1498. pmid:18823227
52. Alain C, Zendel BR, Hutka S, Bidelman GM. Turning down the noise: The benefit of musical training on the aging auditory brain. Hear Res. 2014;308: 162–173. pmid:23831039
53. Parbery-Clark A, Anderson S, Hittner E, Kraus N. Musical experience strengthens the neural representation of sounds important for communication in middle-aged adults. Front Aging Neurosci. 2012;4: 30. pmid:23189051
54. Zendel BR, Alain C. Musicians experience less age-related decline in central auditory processing. Psychol Aging. 2012;27: 410–417. pmid:21910546
55. Zendel BR, Alain C. The influence of lifelong musicianship on neurophysiological measures of concurrent sound segregation. J Cogn Neurosci. 2013;25: 503–516. pmid:23163409
56. Zendel BR, Alain C. Enhanced attention-dependent activity in the auditory cortex of older musicians. Neurobiol Aging. 2014;35: 55–63. pmid:23910654
57. Caspary DM, Ling L, Turner JG, Hughes LF. Inhibitory neurotransmission, plasticity and aging in the mammalian central auditory system. J Exp Biol. 2008;211: 1781–1791. pmid:18490394
58. Lustig C, Hasher L, Zacks RT. Inhibitory deficit theory: Recent developments in a “new view.” Inhib Cogn. 2007;17: 145–162.
59. Salthouse TA, Meinz EJ. Aging, inhibition, working memory, and speed. J Gerontol B Psychol Sci Soc Sci. 1995;50: P297–P306. pmid:7583809
60. Rey-Mermet A, Gade M. Inhibition in aging: What is preserved? What declines? A meta-analysis. Psychon Bull Rev. 2018;25: 1695–1716. pmid:29019064
61. World report on hearing. Geneva: World Health Organization; 2021.
62. Yamasoba T, Lin FR, Someya S, Kashio A, Sakamoto T, Kondo K. Current concepts in age-related hearing loss: epidemiology and mechanistic pathways. Hear Res. 2013;303: 30–38. pmid:23422312
63. Müllensiefen D, Gingras B, Musil J, Stewart L. The Musicality of Non-Musicians: An Index for Assessing Musical Sophistication in the General Population. PLoS ONE. 2014;9: e89642. pmid:24586929
64. Glasberg BR, Moore BC. A model of loudness applicable to time-varying sounds. J Audio Eng Soc. 2002;50: 331–342.
65. Bates D, Mächler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. J Stat Softw. 2015;67.
66. Satterthwaite FE. An approximate distribution of estimates of variance components. Biom Bull. 1946;2: 110–114. pmid:20287815
67. Lakens D. Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses. Soc Psychol Personal Sci. 2017;8: 355–362. pmid:28736600
68. Bernstein LR, Trahiotis C. Behavioral manifestations of audiometrically-defined “slight” or “hidden” hearing loss revealed by measures of binaural detection. J Acoust Soc Am. 2016;140: 3540–3548. pmid:27908080
69. Smoorenburg GF. Speech reception in quiet and in noisy conditions by individuals with noise-induced hearing loss in relation to their tone audiogram. J Acoust Soc Am. 1992;91: 421–437. pmid:1737889
70. Bigand E, Vieillard S, Madurell F, Marozeau J, Dacquet A. Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts. Cogn Emot. 2005;19: 1113–1139.
71. Borg I, Groenen P. Modern multidimensional scaling: Theory and applications. J Educ Meas. 2003;40: 277–280.
72. Moore BC, Wojtczak M, Vickers DA. Effect of loudness recruitment on the perception of amplitude modulation. J Acoust Soc Am. 1996;100: 481–489.
73. Epstein M, Marozeau J. Loudness and intensity coding. Oxf Handb Audit Sci. 2010;3: 45–69.
74. Harris JD. A brief critical review of loudness recruitment. Psychol Bull. 1953;50: 190. pmid:13056097
75. Herrmann B, Maess B, Johnsrude IS. Aging affects adaptation to sound-level statistics in human auditory cortex. J Neurosci. 2018;38: 1989–1999. pmid:29358362
76. Meha-Bettison K, Sharma M, Ibrahim RK, Mandikal Vasuki PR. Enhanced speech perception in noise and cortical auditory evoked potentials in professional musicians. Int J Audiol. 2018;57: 40–52. pmid:28971719
77. Parbery-Clark A, Tierney A, Strait DL, Kraus N. Musicians have fine-tuned neural distinction of speech syllables. Neuroscience. 2012;219: 111–119. pmid:22634507
78. Rammsayer T, Altenmüller E. Temporal Information Processing in Musicians and Nonmusicians. Music Percept Interdiscip J. 2006;24: 37–48.
79. Madsen SMK, Marschall M, Dau T, Oxenham AJ. Speech perception is similar for musicians and non-musicians across a wide range of conditions. Sci Rep. 2019;9: 10404. pmid:31320656
80. Hennessy S, Mack WJ, Habibi A. Speech-in-noise perception in musicians and non-musicians: A multi-level meta-analysis. Hear Res. 2022;416: 108442. pmid:35078132
81. Kray J, Kipp KH, Karbach J. The development of selective inhibitory control: The influence of verbal labeling. Acta Psychol (Amst). 2009;130: 48–57. pmid:19084817
82. Hsieh S, Fang W. Elderly adults through compensatory responses can be just as capable as young adults in inhibiting the flanker influence. Biol Psychol. 2012;90: 113–126. pmid:22445781
83. Bugg JM. Evidence for the Sparing of Reactive Control With Age. Psychol Aging. 2014;29: 115–127. pmid:24378111
84. Cabeza R. Hemispheric asymmetry reduction in older adults: the HAROLD model. Psychol Aging. 2002;17: 85. pmid:11931290
85. Reuter-Lorenz PA, Cappell KA. Neurocognitive aging and the compensation hypothesis. Curr Dir Psychol Sci. 2008;17: 177–182.
86. Reuter-Lorenz PA, Lustig C. Brain aging: reorganizing discoveries about the aging mind. Curr Opin Neurobiol. 2005;15: 245–251. pmid:15831410
87. Halpern AR, Zioga I, Shankleman M, Lindsen J, Pearce MT, Bhattacharya J. That note sounds wrong! Age-related effects in processing of musical expectation. Brain Cogn. 2017;113: 1–9. pmid:28064077
88. Lagrois M-É, Peretz I, Zendel BR. Neurophysiological and Behavioral Differences between Older and Younger Adults When Processing Violations of Tonal Structure in Music. Front Neurosci. 2018;12: 54. pmid:29487498