Our sensory environment changes constantly. Accordingly, neural systems continually adapt to the concurrent stimulus statistics to remain sensitive over a wide range of conditions. Such dynamic range adaptation (DRA) is assumed to increase both the effectiveness of the neuronal code and perceptual sensitivity. However, direct demonstrations of DRA-based efficient neuronal processing that also produces perceptual benefits are lacking. Here, we investigated the impact of DRA on spatial coding in the rodent brain and the perception of human listeners. Complex spatial stimulation with dynamically changing source locations elicited prominent DRA already on the initial spatial processing stage, the Lateral Superior Olive (LSO) of gerbils. Surprisingly, on the level of individual neurons, DRA diminished spatial tuning because of large response variability across trials. However, when considering single-trial population averages of multiple neurons, DRA enhanced the coding efficiency specifically for the concurrently most probable source locations. Intrinsic LSO population imaging of energy consumption combined with pharmacology revealed that a slow-acting LSO gain-control mechanism distributes activity across a group of neurons during DRA, thereby enhancing population coding efficiency. Strikingly, such “efficient cooperative coding” also improved neuronal source separability specifically for the locations that were most likely to occur. These location-specific enhancements in neuronal coding were paralleled by human listeners exhibiting a selective improvement in spatial resolution. We conclude that, contrary to canonical models of sensory encoding, the primary motive of early spatial processing is efficiency optimization of neural populations for enhanced source separability in the concurrent environment.
Citation: Gleiss H, Encke J, Lingner A, Jennings TR, Brosel S, Kunz L, et al. (2019) Cooperative population coding facilitates efficient sound-source separability by adaptation to input statistics. PLoS Biol 17(7): e3000150. https://doi.org/10.1371/journal.pbio.3000150
Academic Editor: Jennifer K. Bizley, University College London, UNITED KINGDOM
Received: January 23, 2019; Accepted: July 11, 2019; Published: July 29, 2019
Copyright: © 2019 Gleiss et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: German Research Council (DFG) to BG (CRC 870/B02 and EXC 2145, SyNergy – ID 390857198), MP (CRC 870/B02 and SPP1608), and JE (HE 6731/1-2). Federal Ministry of Education and Research (BMBF) to BG (IFB-LMU, TRFII-18). Bavarian Academy of Sciences (to MP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: ACSF, artificial cerebrospinal fluid; ANF, auditory nerve fiber; CF, characteristic frequency; CGP, CGP 55845 hydrochloride; CN, Cochlear Nucleus; DM, double minima; DRA, dynamic range adaptation; GABA, gamma-aminobutyric acid; HPR, high-probability region; HRP, horseradish peroxidase; ILD, interaural level difference; IQR, interquartile range; IR, infrared; ITD, interaural time difference; JND, just-noticeable difference; LED, light-emitting diode; LSO, Lateral Superior Olive; MLE, Maximum Likelihood Estimation; MNTB, Medial Nucleus of the Trapezoid Body; NADH, nicotinamide adenine dinucleotide; PFA, paraformaldehyde; rmse, root mean-squared error; ROI, region of interest; sCMOS, scientific complementary metal-oxide semiconductor; SM, single minimum; sps, spikes per second
Our ability to distinguish individual objects in complex and dynamic environments is a fundamental brain function [1,2]. Conversely, the functional requirements of sensory systems are shaped by the physical properties of the outside world: only if the neural sensitivity matches the current statistics of the sensory inputs will the coding of relevant stimulus features be both informative and energetically efficient and consequently evolutionarily viable. Because realistic complex environments exhibit highly nonuniform occurrence probabilities of stimulus cues [3,4], sensory neurons adapt their action potential (“spike”) responses according to the probability of concurrent stimulus properties. This “dynamic range adaptation” (DRA) is thought to render neuronal firing maximally sensitive to changes in the stimulus range that is most likely to occur (Fig 1) [5–7] while keeping activity rates low. Consequently, DRA to stimulus statistics is believed to reflect a neuronal adjustment to optimize stimulus encoding efficacy while simultaneously mediating improved perceptual resolution in the relevant cue range. However, direct demonstrations of DRA-based neuronal coding that causes both increased neuronal efficiency and the resulting perceptual benefits are lacking.
(A) Upper panel: LSO neurons respond with increasingly higher spike rates to increasingly more ipsilateral sound-source positions because these positions generate more “positive” ILDs (ILDs that favor the excitatory ear). Lower panel: circuit diagram of the inputs to the LSO. LSO neurons receive excitatory input from the ipsilateral ear via the CN and inhibitory inputs from the contralateral ear via the MNTB. (B) Non-unimodal probability distributions of monaural stimulus intensity already cause pronounced DRA on the level of the ANFs. Thus, for large ILDs, the ANFs in the left and right ear will adapt to different intensity levels (as indicated by color-coded ANFs in A). Colored shaded areas illustrate respective HPRs of stimulus value occurrence. See also (C) and (D). (C) Illustration of the temporal sequence of ILDs in the two HPR conditions (centered on −20 dB ILD, shown in red, or +20 dB ILD, shown in blue) that were used to test the effect of complex stimulation on ILD coding. Note that the sequence changed for each of the 10 iterations of each HPR epoch. (D) Probability histogram of the sequences shown in (C). 80% of stimuli had ILDs centered on either −20 dB ILD ± 8 dB (red) or +20 dB ILD ± 8 dB (blue). (E) Representative example of DRA in response to the change in HPR condition in a single LSO neuron. The ILD response function for this neuron was substantially different between the −20 dB ILD ± 8 dB (red) and +20 dB ILD ± 8 dB (blue) conditions. Given are mean response rates (solid lines) and SEM (shaded area). (F) Threshold ILDs (minimal ILD that significantly differed from baseline; see Materials and methods) shift significantly by altering the HPR condition (P = 0.001, paired Wilcoxon signed-rank test, n = 13 neurons). Inset shows median shift (8 dB, white bar), with IQR given by the box edges. Whiskers extend to overall data range. Underlying data can be found in S1 Data. See also S1 Fig.
ANF, auditory nerve fiber; CN, Cochlear Nucleus; DRA, dynamic range adaptation; HPR, high-probability region; ILD, interaural level difference; IQR, interquartile range; LSO, Lateral Superior Olive; MNTB, Medial Nucleus of the Trapezoid Body; Thrsld, threshold.
In the auditory system, rapid (subsecond) DRA to stimulus statistics has been revealed on multiple processing levels from primary auditory cortex [8–11] to the midbrain [12–15] and even brainstem (Fig 1A). Specifically, DRA is prominently exhibited already by auditory nerve fibers (ANFs) [16,17] (Fig 1B), which consequently should affect the processing of all downstream centers but might potentially be most crucial for spatial computations.
To infer the location of a sound source, brainstem neurons of the Lateral Superior Olive (LSO) compare the difference in sound level at the two ears (interaural level difference [ILD]) that is generated by a location-specific sound-attenuating effect of the head. LSO neurons respond according to the relative strength of excitatory and inhibitory inputs from the ipsi- and contralateral ear, respectively (Fig 1A). The ensuing sigmoidal ILD response functions (average action potential rate as a function of ILD) are regarded as representing the neuronal basis of auditory space encoding based on intensity difference cues  (Fig 1A, top, in addition to timing cues not dealt with in the present study). Specifically, individual source locations are thought to be mapped onto a specific spiking activity pattern of precisely tuned neurons or neuronal populations [19–22]. Yet, the nature of this spatial code and its readout is still a matter of debate [23–26]. Historically, studies have argued in favor of a labeled-line coding strategy of auditory space (or a mix of strategies), in which small differences in the average spike-rate tuning of individual, identifiable neurons or subpopulation contribute to sound-source localization [27–29]. Yet, the majority of recent studies have concluded that sound-source locations are initially encoded by the specific relative spike rate of two oppositely tuned hemispheric populations of spatially sensitive neurons (for review, see ). This “two-channel hemispheric coding strategy” is motivated by the fact that the vast majority of neurons in each brainstem hemisphere are broadly and similarly tuned, thus providing redundant information about sound-source locations (reviewed in ). The reasons for such apparently inefficient coding of space are, however, unknown. 
In either case, conclusions about spatial coding were derived from examining average neuronal firing rates in response to multiple repetitions of a stimulus set with uniform probability distributions of spatial cues (e.g., each ILD was equally likely to occur). Consequently, these traditional approaches neglected that under more natural conditions, DRA (of ANFs or later stages) might crucially alter the nature of the neuronal code and/or its perceptual consequences.
First, the fact that sound-source positions far to the left or right will result in distinctly different sound levels at the two ears (i.e., a large ILD) consequently should evoke DRA to different (monaural) stimulus levels for the ANFs in the left and right ear. Yet, it remains to be tested how such differential monaural DRA impacts the detection and representation of ILDs in the LSO.
Interestingly, an earlier study from the midbrain had reported that the sensitivity to ILDs undergoes DRA as a function of the spatial statistics. These data thus suggest that spatial processing downstream of the LSO may adapt to accentuate relative differences in sound-source positions in complex environments. We previously also identified an activity-dependent LSO gain-control mechanism that might additionally influence the response to ILDs based on stimulation history. It follows that the extraction of ILDs, and consequently the primary representation of auditory space, already might not be as rigid as traditionally assumed. More fundamentally, since DRA to absolute sound level is already prominently exhibited by ANFs [16,17], it is unclear to what extent the observed adaptation to ILDs at later stages is the direct result of adaptation at either ear alone.
Second, the auditory pathway—like all sensory systems—must detect and code the relevant stimulus properties from only a single stimulus occurrence. The nature of response distributions to such a single instance of, e.g., an ILD in a neuron population might be very different compared to the response distribution of a single neuron to this ILD averaged across trials and consequently could result in different coding regimes. Since the extent of DRA can vary considerably across cells [10,16], such differences in response distributions might be particularly evident in complex acoustic environments.
Third, neuronal adaptations such as DRA to increase computational efficiency supposedly also entail a behavioral improvement within the concurrent environmental conditions. Hence, important insight into how the brain encodes auditory space under complex conditions might be gained by investigating the perceptual impact of DRA. Yet, while perceptual changes due to neuronal adaptation to stimulus statistics have been reported [15,33], demonstrations of how these changes are linked to improving neuronal efficiency are missing.
To answer these questions, we studied the effects of spatially complex stimulation on ILD processing in the LSO of gerbils and on the perception of human listeners. We extended a well-established monaural stimulus paradigm for studying DRA [10,12,16,17] by generating a binaural version of these stimuli to specifically test how spatial coding in the LSO is affected by DRA on its monaural inputs. Our stimulus paradigm resulted in rapidly changing ILDs that switched periodically between favoring either the left or right azimuthal space (Fig 1C and 1D), as can be experienced in noisy environments. In response to these spatially dynamic stimuli, we observed prominent DRA in LSO neurons, which demonstrates a lack of absolute encoding of space by average neuronal firing rate. Surprisingly, DRA in single neurons resulted in large response variability to a given ILD across trials. However, we find that when considering single-instance population coding, DRA maximized the efficiency of neuronal separability for specifically those ILDs that were most likely to occur in the concurrent statistical environment (high-probability region [HPR], Fig 1C and 1D). These enhancements in neuronal coding were paralleled by human listeners exhibiting a selective improvement in just-noticeable differences (JNDs) for ILDs in the hemisphere of the respective HPR. Intrinsic LSO population imaging of energy consumption and a simple LSO model further explained that a slow-acting gain-control mechanism enhances the population efficiency by distributing activity across a group of neurons during DRA. We conclude that already on the primary detector level, the processing of ILDs is not tuned towards a representation of locations in space but optimizes efficient sound-source separation in the concurrent acoustic environment by instantaneous population coding of ILDs.
LSO neurons exhibit DRA to stimulus statistics
To explore the role of DRA on spatial coding in complex environments, we designed a stimulus paradigm with constantly varying ILDs in the context of two related but statistically distinct listening conditions. We used continuous broadband noise (identical on the two ears) and changed the ILD every 50 ms, with ILD values drawn from one of two nonuniform distributions. The two distributions covered an identical range of ILDs but favored predominately (80% of time) either the ipsi- or contralateral ear (ILDs of +20 dB ± 8 dB and −20 dB ± 8 dB, named the +20 dB HPR and −20 dB HPR, respectively; Fig 1C and Materials and methods). This way, we simulated dynamic spatial environments with dominant sound sources located either left or right of midline (Fig 1D). The two conditions switched periodically (Fig 1C: 1 run consisted of 19 switches every approximately 6 s; the sequence of ILDs was different for each switch but identical across repetitions; 3 runs were recorded for each cell). To assess to what extent changes in stimulus statistics on the two ears alter the neuronal detection and encoding of ILDs, we first carried out extracellular recordings from single neurons in the LSO of anesthetized gerbils while presenting the stimuli via calibrated earphones (see Materials and methods).
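The stimulus statistics described above can be illustrated with a short sketch. It is not the stimulus-generation code of the study: the ILD grid (4-dB steps over ±36 dB) and the uniform sampling inside and outside the HPR are assumptions made for illustration, and the names `ild_grid`, `draw_epoch`, and `draw_run` are hypothetical. Only the 50-ms segment length, the roughly 6-s epoch length, the ±8-dB HPR width, and the 80% HPR probability are taken from the text.

```python
import random

ILD_STEP_DB = 4          # assumed grid spacing (not stated in this section)
ILD_RANGE_DB = 36        # assumed symmetric range limit (hypothetical)
HPR_HALF_WIDTH_DB = 8    # from the text: HPR = center +/- 8 dB
HPR_PROBABILITY = 0.8    # from the text: 80% of ILDs fall inside the HPR
SEGMENT_MS = 50          # from the text: the ILD changes every 50 ms
EPOCH_S = 6              # from the text: conditions switch about every 6 s


def ild_grid():
    """All ILD values that can occur (identical range for both conditions)."""
    return list(range(-ILD_RANGE_DB, ILD_RANGE_DB + 1, ILD_STEP_DB))


def draw_epoch(hpr_center_db, rng):
    """Draw one epoch of ILDs for an HPR centered on hpr_center_db."""
    grid = ild_grid()
    inside = [v for v in grid if abs(v - hpr_center_db) <= HPR_HALF_WIDTH_DB]
    outside = [v for v in grid if abs(v - hpr_center_db) > HPR_HALF_WIDTH_DB]
    n_segments = EPOCH_S * 1000 // SEGMENT_MS
    return [
        rng.choice(inside) if rng.random() < HPR_PROBABILITY else rng.choice(outside)
        for _ in range(n_segments)
    ]


def draw_run(n_epochs, rng):
    """Alternate between the -20 dB and +20 dB HPR conditions."""
    centers = [-20, +20]
    return [(centers[i % 2], draw_epoch(centers[i % 2], rng)) for i in range(n_epochs)]
```

Calling `draw_run(20, random.Random(0))` would yield 20 alternating epochs, i.e., 19 switches between conditions as in one run of the experiment.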
Following previous studies of DRA [12,14–16], we first assessed the average neuronal spike rates (calculated across occurrences of each ILD) separately for −20 dB and the +20 dB HPR conditions. We observed that the resulting ILD response functions differed between the two conditions (a single-neuron example is shown in Fig 1E). Specifically, a clear shift of the ILD-spike–rate functions was observable that entailed a change in the average spike rate in the respective HPRs (red and blue areas in Fig 1E and throughout). To quantify these shifts, which appeared highly reminiscent of DRA to accommodate the change in the range of overrepresented ILDs, we computed the minimal ILD that triggered significant spiking (“threshold ILD”; see Materials and methods) in the respective condition for each neuron. Threshold ILDs significantly increased when switching from the −20 dB to the +20 dB HPR (Fig 1F, n = 13 neurons, P = 0.001, paired Wilcoxon signed-rank test). For the population, the median shift in threshold ILD between the two conditions was 8 dB (interquartile range [IQR] 6 dB; Fig 1F inset). To further characterize the extent and specificity of the DRA, we also generated two additional ILD distributions (n = 18 neurons and n = 19 neurons), as well as a monaural condition (stimulation only on the ipsilateral ear, n = 11 neurons). These additional conditions confirmed that the observed shifts in threshold ILDs were dependent on the concurrent input statistics (S1 Fig). This presence of DRA-related shifts in ILD functions in the LSO directly demonstrates a lack of absolute encoding of sound-source locations by the average neuronal firing rate already on the level of cue detection.
DRA optimizes single-observation population coding
So far, we followed previous studies of DRA in the auditory system [12,14–16] and evaluated the spatial sensitivity of individual LSO neurons by their average spike rate given the repeated presentation of each ILD. However, in reality, processing must be able to compute the location of a sound source from observation of a single instance of the stimulus. Therefore, we next focused on the direct response by each neuron to each occurrence of a particular ILD. Examining individual spike counts for 75 recurrent instances of +20 dB and −20 dB ILDs in the respective HPR condition revealed two interesting findings (Fig 2A).
(A) While response probabilities were low overall, the variability in spike counts to repetitive instances of the same ILD in individual neurons was high. Shown are spike counts of all neurons in response to 75 instances of −20 dB ILD (lower right-hand panel) or +20 dB ILD (upper right-hand panel). Bottom line in each panel shows mean responses for each ILD instance. (B and C) Mean population spike-count probability density functions of all neurons were constructed based on the pooled response of all single-neuron responses (B) and pseudopopulation mean responses at each instance of ILD occurrence (C). Only the latter resulted in informative ILD tuning (see also Fig 3 and S2 Fig). Underlying data can be found in S1 Data. DRA, dynamic range adaptation; HPR, high-probability region; ILD, interaural level difference; Pop., population.
First, responses of most LSO neurons for ILDs from the concurrent HPR were very sparse (median spike count and IQR: −20 dB HPR, 0.92 and 0.46 spikes; +20 dB HPR, 3.15 and 0.85 spikes). Second, a high response variability, as indicated by the large IQR, was observable for all ILDs: spike counts varied considerably between repeated instances of the same ILD (trial-wise median Pearson’s correlation coefficient and IQR: 0.62 and 0.26, S2 Fig), and spike-triggered average analysis showed no systematic relationship between ILD sequences and their likelihood to trigger a spike (S2 Fig). Crucially, this lack of consistent responses to ILDs with complex probability statistics resulted in a very limited modulation of the average spiking probabilities in either HPR condition, i.e., the probability to observe a particular mean spike count across the sample population was very similar for all ILDs (Fig 2B).
In contrast, however, more specific ILD population tuning emerged from our data set when considering the pseudopopulation response for a single occurrence of a particular ILD (i.e., averaging across a column in Fig 2A and 2C; compare also bottom lines in right-hand panels of Fig 2A; note that neurons were recorded sequentially). To determine how these different population tunings of mean single-cell population responses and single-instance pseudopopulation responses impact the decoding accuracy of ILDs, we performed a Maximum Likelihood Estimation (MLE; see Materials and methods) for both methods. In short, MLE approximates which ILD is most likely to have occurred given the observation of a particular spike count. The estimates differed in two important ways when either considering the mean single-cell population responses (MLE[mean]) or the single-instance pseudopopulation response (MLE[pop]) (Fig 3A and 3B): First, the estimated deviations from the true ILD values were much larger for the MLE(mean) (minimal deviations: −20 dB HPR, 14.3 dB at −24 dB; +20 dB HPR, 16.8 dB at −16 dB) compared to MLE(pop) (minimal deviations: −20 dB HPR, 4.7 dB at −20 dB; +20 dB HPR, 10.1 dB at 0 dB); i.e., the accuracy of MLE(pop) was higher. Second, the generally assumed advantageous effect of DRA, i.e., that DRA explicitly enhances the coding in the respective HPR, was evident for MLE(pop) but not MLE(mean) (Fig 3A and 3B; improvement of 9 dB and 0 dB, respectively).
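The decoding idea can be sketched as follows. This is a minimal illustration of maximum likelihood decoding from spike counts, not the procedure from Materials and methods: the use of integer-binned counts, the additive smoothing (`pseudo`), and the function names are assumptions made for this example.

```python
from collections import Counter


def fit_likelihoods(responses, max_count, pseudo=1.0):
    """Estimate P(count | ILD) from observed responses.

    responses: dict mapping ILD (dB) -> list of integer spike counts.
    Counts above max_count are clipped; `pseudo` is an assumed additive
    smoothing term that avoids zero probabilities.
    """
    tables = {}
    for ild, counts in responses.items():
        hist = Counter(min(c, max_count) for c in counts)
        total = len(counts) + pseudo * (max_count + 1)
        tables[ild] = [(hist.get(k, 0) + pseudo) / total for k in range(max_count + 1)]
    return tables


def decode(count, tables):
    """Return the ILD with the highest likelihood for the observed count."""
    max_count = len(next(iter(tables.values()))) - 1
    k = min(count, max_count)
    return max(sorted(tables), key=lambda ild: tables[ild][k])


def mean_abs_error(test_responses, tables):
    """Mean absolute deviation (dB) between decoded and true ILDs."""
    errors = [
        abs(decode(c, tables) - ild)
        for ild, counts in test_responses.items()
        for c in counts
    ]
    return sum(errors) / len(errors)
```

In this framing, MLE(mean) and MLE(pop) would differ only in which responses are passed in: pooled single-neuron counts in one case and single-instance population averages (binned here to integers, as an assumption) in the other.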
(A) Average MLE(mean) exhibited no apparent advantage of DRA for decoding ILDs (improvement = 0%, right-hand panel), as the tuning function obtained from the −20 dB HPR condition was overall better in estimating the ILDs compared to those obtained from the +20 dB HPR condition. Solid lines show the mean deviations, and shaded area represents SEM. (B) Average MLE(pop) showed a 9-dB improvement (calculated as the difference between the red and blue functions within the HPRs) for decoding ILDs from the concurrent HPR. Conventions as in (A). (C) Average D(mean) across all cells (n = 13) decreased for the concurrent HPR, suggesting a relative worsening of the ability to distinguish adjacent ILDs by −15.5% (right-hand panel). Solid lines represent D, including the hypothetical second hemispheric response. Mean data from a single hemisphere are given by dashed lines (shaded area represents SEM). (D) Average D(pop) of the recorded LSO neurons (n = 13) increased for ILDs from the concurrent HPR by 27.1% (right-hand panel). Solid lines represent D(pop), including the hypothetical second hemispheric response. Mean data from a single hemisphere are given by dashed lines (shaded area represents SEM). (E) and (F) The effect of DRA to HPRs on the ILD JND of human subjects was measured with stimuli presented over headphones. Presentation of an adapter sound consisting of a 2-s snippet of one of the two HPR stimuli was followed by two noise probes (50 ms each) for JND measurement. The ILDs of the noise probes were centered on −20 dB ILD or +20 dB ILD and hence either matched or mismatched the HPR of the preceding adapter. (G) Single-subject example of the influence of the HPR on ILD JND. Colocation (i.e., matching) of the adapter HPR and probe sound position led to significant improvement of the ILD JND (P = 0.004, Friedman test, n = 12 trials each).
(H) The average improvement of colocation by adapter HPR and probe sound position across listeners was 52.6% (P = 0.002, unpaired Wilcoxon signed-rank test, n = 10 sets from 5 subjects, conventions as in inset in Fig 1F). Underlying data can be found in S1 Data. DRA, dynamic range adaptation; HPR, high-probability region; ILD, interaural level difference; JND, just-noticeable difference; MLE, Maximum Likelihood Estimation; pop, population.
Such relative enhancement in the neuronal precision of ILD estimation implies, but does not confirm, a relative improvement in the ability to resolve nearby sound locations. To quantify the impact of the differences in population tuning (Fig 2B and 2C) on resolution directly, we next determined the informational content of each neuron’s response towards the ability to distinguish adjacent ILDs. We followed previous studies on ILD coding [15,35] and calculated the standard separation (“D”), which quantifies the separability of adjacent ILDs based on the ratio of slope steepness and response variability. We first calculated D(mean), i.e., the mean D across neurons, which is derived from averaging the D-ILD functions of each single neuron. Since the monaural stimuli that we presented in the −20 dB and +20 dB HPR epochs were mirror-symmetric to each other, the responses that we recorded in the LSO of one hemisphere during each HPR epoch can be assumed to reflect the responses to the other HPR epoch in the LSO in the other brain hemisphere (compare [12,15]). In other words, the LSO on each side of the brain would provide complementary spatial information for each HPR condition towards D. We therefore summed the D-ILD functions of each condition with the mirror-image of the function of the other condition (Fig 3C; dashed lines indicate single-hemisphere data, solid line represents sum of both hemispheres). In remarkable contrast to previous midbrain studies [12,15], we found that the average neuronal separability was not enhanced by the DRA but actually considerably lower for the ILDs from the respective HPRs (Fig 3C; change for concurrent HPR: −15%, compare red and blue lines in respective HPRs in left panel). Thus, D(mean) would predict a worsening of ILD resolution by the observed DRA to spatial stimulus statistics.
In contrast, when D is calculated based on the mean spike count of all neurons to each instance of an ILD (D[pop]), a specificity of separability for the ILDs of the concurrent hemisphere, including the HPR, becomes evident (Fig 3D; change for concurrent HPR: +27%). Similarly, a distinct benefit of D(pop) over D(mean) was also observed for the two additional binaural stimulus paradigms we tested (S1 Fig).
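The difference between D(mean) and D(pop) lies in the order of averaging, which a small sketch can make explicit. The formula used here, D = |μa − μb| / √(σa·σb), is the commonly used definition of standard separation and is an assumption in this context, since the exact computation is given in the cited reference and Materials and methods rather than in this section. Function names and the toy numbers in the usage note are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def standard_separation(x, y):
    """Assumed definition: |mean(x) - mean(y)| / sqrt(sd(x) * sd(y))."""
    spread = sqrt(stdev(x) * stdev(y))
    diff = abs(mean(x) - mean(y))
    if spread == 0:
        # Convention chosen for this sketch only.
        return 0.0 if diff == 0 else float("inf")
    return diff / spread


def d_mean(resp_a, resp_b):
    """D per neuron (across instances), then averaged across neurons.

    resp_x[n][i] is the spike count of neuron n at instance i of ILD x.
    """
    return mean([standard_separation(a, b) for a, b in zip(resp_a, resp_b)])


def d_pop(resp_a, resp_b):
    """Average across neurons per instance first, then compute D once."""
    pop_a = [mean(column) for column in zip(*resp_a)]
    pop_b = [mean(column) for column in zip(*resp_b)]
    return standard_separation(pop_a, pop_b)
```

As an illustration, with `resp_a = [[0, 1, 2], [1, 2, 0], [2, 0, 2]]` and `resp_b = [[2, 3, 4], [3, 4, 2], [4, 2, 4]]`, each neuron varies strongly across instances while the instance-wise population average stays nearly constant, so `d_pop(resp_a, resp_b)` exceeds `d_mean(resp_a, resp_b)`.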
Human spatial resolution improves specifically for HPR ILDs
To directly test whether the increased performance as predicted by analyzing MLE(pop) and D(pop) also results in an improved ability to resolve sound-source locations, we performed a spatial separability test with human listeners via calibrated headphones (Fig 3E). The subjects (N = 5) were presented with a 2-s–long snippet of the same stimulus used in the electrophysiological experiments, taken alternatively from the +20 dB and −20 dB ILD HPR condition (only +20 dB is illustrated in Fig 3F). Shortly after (0.35 s) this adapting period, the listeners were presented with two probe ILDs (each consisting of 50 ms broadband noise, spaced apart by 100 ms) and were asked to indicate which of the two was perceived as more lateralized. Using an adaptive tracking paradigm (see Materials and methods), the difference in ILD between the two probe ILDs was systematically reduced to determine the JND in ILD for each subject. The probe ILDs were centered either on +20 dB ILD or −20 dB ILD (Fig 3F) to allow for deciphering the influence of matching and mismatching the adapter conditions. Specifically, the electrophysiological data suggested that JNDs should be enhanced for ILDs in the same hemisphere as the adapter HPR (Fig 3D). In agreement with this prediction made on the basis of D(pop), we observed a significant improvement in JND when probe center ILDs matched the hemispheric bias of the adapter (single-subject example in Fig 3G: −20 dB adapter and −20 dB probe, red, or +20 dB adapter and +20 dB probe, blue; P = 0.004, Friedman test). On average, JNDs of the five listeners improved by 52.6% (Fig 3H, IQR: 51.8%, P = 0.002, Wilcoxon signed-rank test). Thus, human JND performance is in close agreement with the neuronally derived MLE(pop) and D(pop), suggesting that DRA crucially affects both population coding and perception of ILDs in complex acoustic environments.
Slow gain control maximizes efficiency
How could single-instance population responses in the LSO be optimized for separability in the concurrent HPR? Moreover, what effect might link high response variability of single neurons and highly informative population ILD coding? To gain insight into potential underlying mechanisms, we first analyzed the time course of DRA in LSO neurons. In accordance with DRA studies using similar stimulus statistics in other centers of the auditory system [12,14,16], we observed an exponential time course of rate adaptation (Fig 4A). Yet, in contrast to previous reports, we found that adaptation kinetics were best described not by a single but by two time constants (Fig 4B). Addition of the second time constant resulted in lower root mean-squared errors (rmses) of fits (median rmse(double) = 0.149, median rmse(single) = 0.255, P = 0.001, Wilcoxon signed-rank test; Fig 4C), and double-exponential fitting was significantly superior to single-exponential fits even after compensating for the unspecific benefit of an additional fitting parameter (Fig 4D; median adjusted R2(double) = 0.45, median adjusted R2(single) = 0.025, P = 0.001, Wilcoxon signed-rank test). We also fitted the time course of DRA using a power-law fit and found that it was also superior to single-exponential fitting in explaining the observed time course (median rmse(power) = 0.151, P = 0.001, Wilcoxon signed-rank test; S3 Fig). Indeed, across cells, rmse(power) was comparable to, yet consistently slightly worse than, that of the double-exponential fitting (P = 0.02, Wilcoxon signed-rank test; S3 Fig). Moreover, double-exponential fitting had greater explanatory power towards the origin of the observed adaptation. Specifically, the shorter (first) time constants of the double-exponential fit (Fig 4B; median tau = 222.8 ms, IQR: 1.465 s) were similar to the kinetics reported for the auditory nerve, suggesting that monaural DRA upstream of the LSO contributed substantially to the ILD adaptations in the LSO.
The second time constants were considerably slower and in the range of a few seconds (Fig 4B; median tau = 2.2 s, IQR: 7.7 s).
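The comparison between single- and double-exponential descriptions can be sketched in a few lines. This is not the fitting routine used in the study: for a self-contained example, time constants are picked from a coarse grid and the amplitudes and offset are solved by linear least squares, whereas a real analysis would typically use a nonlinear optimizer. The adjusted R2 convention (number of parameters `p` excluding the offset) and all function names are assumptions.

```python
from itertools import combinations
from math import exp, sqrt


def solve(A, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def predict(t, y, taus):
    """Least-squares amplitudes and offset for fixed time constants."""
    basis = [[exp(-ti / tau) for tau in taus] + [1.0] for ti in t]
    m = len(taus) + 1
    A = [[sum(row[i] * row[j] for row in basis) for j in range(m)] for i in range(m)]
    b = [sum(row[i] * yi for row, yi in zip(basis, y)) for i in range(m)]
    coef = solve(A, b)
    return [sum(c * v for c, v in zip(coef, row)) for row in basis]


def rmse(y, pred):
    return sqrt(sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y))


def adjusted_r2(y, pred, p):
    """1 - (1 - R2)(n - 1)/(n - p - 1); p counts parameters except the offset."""
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def best_fit(t, y, tau_grid, n_exp):
    """Grid search over time constants; returns (rmse, taus, prediction)."""
    best = None
    for taus in combinations(tau_grid, n_exp):
        pred = predict(t, y, taus)
        err = rmse(y, pred)
        if best is None or err < best[0]:
            best = (err, taus, pred)
    return best
```

For a rate trace sampled every 50 ms over a 6-s epoch, `best_fit(t, y, grid, 1)` and `best_fit(t, y, grid, 2)` return the single- and double-exponential descriptions, whose quality can then be compared with `rmse` and with `adjusted_r2(y, pred, 2)` versus `adjusted_r2(y, pred, 4)`.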
(A) Evolution of rate adaptation is best explained by a double-exponential process. Shown is a single-neuron example (black) and double-exponential fit (magenta). Besides a rapid rate adaptation at the beginning of an epoch, a second, slow time constant was also present in gerbil LSO. (B) Average fast and slow time constants of DRA in the LSO from double-exponential fitting: median tau(first) = 0.22 s, IQR: 1.465 s; median tau(second) = 2.2 s, IQR: 7.7 s. (C and D) Inclusion of a second time constant was superior to single time constant fitting of rate adaptation: the rmses of fits decreased (C; P = 0.001, paired Student t test, n = 13 neurons) and adjusted R2-values increased (D; P = 0.001, paired Student t test, n = 13 neurons). (E) Efficiency of responses was measured by calculating D(pop)/spike, displaying high specificity for ILDs in the concurrent HPR. (F) A simple subtraction model of the LSO and DRA of its inputs can replicate the electrophysiological results only when including a binaural gain-control stage. Shown are adaptation time courses of the model (as for the neuron in A). Gray and black traces represent results excluding and including binaural negative feedback, respectively. (G, H, I) The model was able to qualitatively reproduce both the HPR-specific shifting effect of DRA in the LSO (G) and the HPR specificity of D (H) and D/spike (I). Model responses without and with slow negative feedback stage after binaural processing are shown by dashed and solid lines, respectively. Crucially, the presence of negative feedback resulted in stronger rate adaptation, which did not affect D but substantially increased efficiency by 55% (mean improvement within the two HPR regions). Underlying data can be found in S1 Data. DRA, dynamic range adaptation; HPR, high-probability region; ILD, interaural level difference; IQR, interquartile range; LSO, Lateral Superior Olive; rmse, root mean-squared error.
Such slow rate adaptation is consistent with previous reports of negative feedback loops in LSO neurons: the inhibitory transmitter gamma-aminobutyric acid (GABA) is released in an activity-dependent manner into the extracellular matrix and thereby suppresses presynaptic inputs in the vicinity via slow-acting GABA-B receptors. Hence, DRA in the LSO might be considerably influenced by a slow binaural gain control for ILD coding during complex stimulation. It has been suggested that such slow negative feedback serves to increase the efficiency of population coding [38,39]: because spiking is energetically costly, the efficiency of a neuronal representation depends on the informational content of a spiking response relative to the number of spikes that conveyed this information [40,41]. To quantify neuronal population efficiency directly, we calculated the average D transmitted per spike for the instantaneous hemispheric average (D[pop]/spike). This analysis revealed a hemispheric specificity of response efficiency for the concurrent spatial conditions (Fig 4E). Thus, the slow gain-control mechanism associated with the second time constant of DRA that we found might serve to maximize the efficiency of neuronal processing within the hemisphere of the HPR. To investigate this potential role of slow gain control on ILD coding in more detail, we generated a simple model of the LSO based on existing models of DRA. Specifically, we extended an existing auditory nerve model that included both threshold and gain adaptation by adding a binaural subtraction stage to reflect LSO processing (S3 Fig). As expected, this simple model exhibited clear DRA in response to the binaural HPR stimuli (Fig 4G, dotted lines, and S1 Fig), demonstrating that the HPR-specific shifts in ILD sensitivity are predominately caused by the oppositional monaural DRA of the LSO inputs.
Likewise, the model was also able to reproduce the nature and extent of DRA to the two additional binaural stimulus paradigms that we tested (S1 Fig and S3 Fig).
However, since this version of the model lacked a binaural gain-control stage, it captured only the fast time course of rate adaptation and quickly reached a steady-state spike rate (<1 s, Fig 4F, gray trace). To account for the second, slow adaptation component in the neuronal data, we included an additional slow negative feedback stage after binaural comparison in the model (S3 Fig). This modification resulted in a close match in the dynamics of rate adaptation between model and LSO neurons (Fig 4F, black trace) and led to lowered overall spike counts during DRA (Fig 4G, solid lines). Slow gain control had little effect on the overall amount of spatial information (Fig 4H), again indicating that the shift in ILD sensitivity in the LSO can be mostly explained by DRA in its monaural inputs (presumably already in the ANFs, compare also S3 Fig for analysis of additional stimulus paradigms). Nonetheless, addition of a slow gain control specifically increased D/spike of model responses for ILDs from the concurrent HPR (Fig 4I; compare dotted and solid lines). These modeling results thus suggest that the main function of slow gain control in the LSO is the optimization of coding efficiency (i.e., separability per unit of neuronal activity).
Intrinsic imaging reveals energetic benefits of slow GABAergic gain control
To directly test the model prediction that slow feedback signaling may minimize energy expenditure in subpopulations of LSO neurons, we took advantage of the intrinsic autofluorescence of a key intermediate nicotinamide adenine dinucleotide (NADH) during metabolic activity [42,43]. Specifically, our rationale was to examine energy consumption across large regions of the LSO during prolonged activity in an in vitro brain slice preparation using a self-designed imaging system for determining the changes in relative levels of NADH (see Materials and methods). This technique allowed us to study the temporal and spatial evolution of energy expenditure and to test for any energy-minimizing effect of gain-control mechanisms. To this end, we monitored the relative change in NADH levels with high spatial resolution in LSO brain slices (21 μm × 23 μm per region of interest [ROI], 1,200 ROIs per field of view; Fig 5A and 5B; see Materials and methods). Using 20-s–long fiber stimulation of the excitatory inputs to the LSO at 200 Hz, we determined the spatial distribution of energy consumption in the LSO (six brain slices). As expected, large parts of the imaged LSO area displayed a monotonic increase in energy consumption with a single minimum (SM) in response to the 20-s–long stimulation (Fig 5C, red region; Fig 5D, lowest trace). However, we also frequently observed areas in which energy consumption declined after a few seconds of stimulation before ultimately increasing again (double minima [DM]; Fig 5D and 5F). This nonmonotonic progression of energy consumption, combined with its apparent slow time course (4.58 s; IQR: 2.36 s; Fig 5E), is highly suggestive of the known GABA-B-receptor–mediated, activity-dependent gain-control mechanism. Accordingly, application of the specific antagonist CGP 55845 hydrochloride (CGP, 10 μM) to the bath revealed that DM largely disappeared during blockade of GABA-B signaling, resulting in considerably larger energy consumption (Fig 5G).
In accordance with the assumed gain-control function of GABA, on the population level (i.e., across all ROIs per slice), CGP had differential effects on the prevalence of observed DM. A spatial diversity in the effect of blocking GABA-B mediated inhibition was clearly observable within individual brain slices (Fig 5H). Specifically, DM were either more or less likely to appear during CGP dependent on the fraction of DM during control (Fig 5I). Application of CGP also had a striking effect on the overall energy consumption in the LSO: across the entire imaged area, the block of GABA-B signaling on average almost doubled the energy consumption (median CGP/control ratio: 1.6, IQR: 1.5; n = 6 slices; Fig 5J). Moreover, similar to the history-dependency observed for the DMs, the magnitude of change in the energy consumption during CGP application was highly correlated with the prior activity level during control conditions (Spearman correlation, P < 0.0001; Fig 5K), providing further corroboration for the activity dependency of the gain-control mechanism. Together, these data strongly suggest that the spatially variable, slow gain control mediated by GABA-B in the LSO serves for the efficient population coding of ILDs.
(A) Schematic view of the optical path for intrinsic fluorescence imaging of metabolic activity in the LSO. (B) Bright-field image of brainstem (oblique illumination with IR-LED). Dashed rectangle denotes imaged area shown in (C). (C) Heat map illustrating spatial distribution of maximal relative decrease of NADH autofluorescence in the LSO in response to 20 s fiber stimulation at 200 Hz. (D) Exemplary traces for the temporal evolution of changes of NADH levels (colors correspond to respective regions in C). Gray horizontal bar denotes duration of electrical stimulation. While only an SM was present in the bottom trace (red, corresponding to large NADH decrease), the top three traces with less overall NADH decrease exhibited DM. (E) Mean time from stimulation start to first minimum calculated for each slice from those ROIs that exhibited DM. Population median: 4.58 s. (F) Distribution of SM and DM in the recording shown in (C). (G) Exemplary CGP-induced change in NADH fluorescence of a single ROI. (H) Spatial distribution of CGP-induced changes of NADH response types (SM/DM) for the recording shown in (C). (I) CGP-induced changes in the fraction of ROIs showing DM (n = 6 slices). (J) The average CGP/control ratio of NADH levels at mean time point of first minima (independent of presence of first minimum in the respective ROIs) was 1.6. (K) CGP/control ratio of NADH levels as function of NADH level changes in the control measurement (both measured 20 s after onset, in 100 neighboring ROIs in C); red line corresponds to exponential fit (tau = 0.97%; Spearman correlation, r = 0.8176, P < 0.0001). Underlying data can be found in S1 Data. CGP, CGP 55845 hydrochloride; Con., control; DM, double minima; GABA, gamma-aminobutyric acid; IR, infrared; LED, light-emitting diode; LSO, Lateral Superior Olive; M, mirror; NADH, nicotinamide adenine dinucleotide; ROI, region of interest; sCMOS, scientific Complementary metal-oxide semiconductor; SM, single minimum; TL, tube lens.
Our findings advocate a novel, to our knowledge, concept for the neuronal detection and primary encoding of spatial cues. We observed that LSO neurons strongly adapted their ILD rate functions in response to changes in the input statistics. Consequently, ILD representation is dynamic and devoid of absolute mapping of sound-source locations already on the detector level. We further discovered that the average rate tuning of single LSO neurons conveys little spatial information during complex stimulation because of high response variability. However, if responses to individual instances of an ILD were averaged across neurons, DRA optimized the efficiency of responses, which resulted in improved separation of ILDs from the concurrent HPRs. Correspondingly, human listeners showed evidence of a focal improvement in ILD resolution specifically for HPR ILDs. Importantly, this study is, to our knowledge, the first to demonstrate stimulus-specific benefits of DRA both for the efficiency of neuronal coding and for human perception. Finally, a simple LSO model and intrinsic energy imaging indicated that the efficiency of the enhancement in spatial separability is facilitated by a slow gain-control mechanism involving GABAergic signaling downstream of binaural integration.
The established concept of spatial encoding assumes that specific average response rates of sensory neurons are mapped onto a particular physical cue to allow for a faithful encoding of the corresponding source location [19–21]. A recent study by Dahmen and colleagues  was instructive in suggesting that this assumption does not unrestrictedly hold because they found that already in the midbrain, ILD sensitivity was modulated by stimulus statistics, thus promoting a relative coding of sound-source positions. We expanded this concept and determined a prominent role of DRA for binaural processing that refutes the idea of an absolute representation of space already on the detector level. In this sense, our study represents an extension to earlier reports that had established the susceptibility of spatial tuning to adaptation by stimulus history [44–48].
However, since these earlier studies were conducted in downstream targets of the primary spatial cue detectors, the site of modulation was unresolved. We determined that the observed shifts in ILD sensitivity in the LSO are predominantly generated by the combinatory effect of upstream monaural adaptation to absolute sound level in the excitatory and inhibitory pathways (most likely the ANFs). This finding is not only crucial for the assessment of the functional role of DRA in the LSO (see next paragraph) but also bears significance for the interpretation of previous findings of adaptive ILD coding at downstream processing stages such as the midbrain. The fact that already in the LSO, DRA of single cells was not ILD-specific (i.e., not beneficial for the separability of adjacent ILDs within the HPRs) strongly indicates that monaural adaptational mechanisms are the dominant driver for shifts in ILD sensitivity at any processing stage. Thus, fundamentally, our data highlight the importance of considering monaural DRA to sound level (as control data) when assessing the specificity of adaptation in binaural neuronal sensitivity along the auditory pathway.
Our evaluation of the impact of DRA on neuronal information further suggests that the basic principle of LSO spatial coding is the preservation of ecologically relevant coding efficiency by providing high separability of nearby sound sources within the statistically predominant range of ILDs [5,40]. In accordance with this interpretation, recent studies reported modulatory effects of a preceding stimulus (adapter) on perceptual spatial resolution, both for ILDs and the other binaural cue, the interaural time difference (ITD) [49–51]. These findings corroborate the generality of stimulus-history–dependent effects on spatial perception. Importantly, it has been shown that the improvements in JND cannot be explained by unspecific attentional effects because acoustic priming of spatial attention to the location of the probe via the adapter by itself does not alter JNDs. Rather, it requires a congruence of both spectral content and the relevant binaural cue between adapter and probe to elicit an improvement in spatial resolution [49,51]. These findings clearly demonstrate that JND improvements are caused by adaptation in the spatial processing circuits in the respective frequency channels and not by attentional priming. Our present data and a prior study on ITD processing [33] provide a mechanistic explanation on the detector level for effects that have so far been linked to secondary processing at higher stages [13,15,49,52,53].
Because the LSO represents the initial binaural stage of ILD detection, our findings stand out for two more reasons: (I) Adaptive processing at the spatial cue detector should result in absolute localization errors because of a missing reference frame. This notion is supported by reports of human listeners producing significant absolute localization errors when presented with biased spatial statistics [15,33,50,52,54]. (II) While adaptation with the purpose to preserve a large dynamic coding range within the predominant stimulus range can be found across sensory systems , we showed that in the LSO, DRA is likely to be inherited to a large degree by adaptation to intensity statistics in the monaural inputs (e.g., the auditory nerves from each ear). This susceptibility to differences in adaptation between the two inputs due to different input statistics largely explained the observed DRA in the LSO (compare Fig 4 and S1 Fig) and may also explain findings of LSO sensitivity to overall intensity . We furthermore show that the major computational modification after binaural integration serves to optimize the efficiency of coding for the concurrent ILDs by further decreasing population spike rates. Such stimulus-statistic–specific processing to maximize the efficiency of information transmission (by redundancy reduction) has so far been associated with the midbrain and cortex, i.e., processing that is secondary to the initial detection of the respective feature [56–60]. In contrast, ILD detection and efficiency optimization are realized concurrently by the LSO (and subsequent negative feedback; see below). Interestingly, adaptation to binaural statistics to optimize spatial sensitivity has also been described for the detector neurons of the second important binaural cue, the ITDs [28,61]. However, in contrast to the short-term changes of the LSO, these adaptations take place over days during maturation and entail long-term morphological changes.
Different from prior studies on adaptation to spatial statistics in the midbrain [12,15], MLE(mean) declined for the concurrent HPR ILDs because of the high response variability of individual neurons. An informational gain was only revealed by applying a single-observation pseudopopulation coding concept in the form of MLE(pop). In this regard, our data provide physiological support for the framework of cooperative population decoding, which has been developed to explain the apparent noisiness of cortical processing. Specifically, the framework suggested that recurrent inhibition with a slow time constant can be utilized to maximize the efficiency of an average population code at the expense of increased response variability of individual neurons. Congruent with such a coding regime, individual LSO neurons responded sparsely (intermittently and with few spikes) and therefore decreased the redundancy of firing in the population for a given ILD. A potential limitation for such an interpretation of our data is that the neurons were not recorded at the same time (because of methodological limitations for brainstem recordings of highly stimulus-time–locked responses), so that the population advantage may be overestimated because of missing noise correlations. However, it is known that spiking in auditory brainstem nuclei occurs independently, and our in vitro recording of a large population of LSO neurons conclusively supports the single-neuron data. Accordingly, population analyses of single-neuron recordings are assumed to be a valid approximation and thus are commonly performed [12,13,15].
In conclusion, our findings suggest a new concept for spatial coding in the LSO: the detection and processing of ILDs is optimized for efficient sound-source separation in a given stimulus context by sparse population coding. This coding regime not only is energy efficient but also allows for detecting changes in the auditory periphery during high activity levels (noisy conditions) by maintaining high resolution. In ecologically plausible situations, the accompanying detriment of absolute localization accuracy might be compensated by an orienting head movement to bring the sound source into frontal space.
Materials and methods
All data underlying the presented quantitative observations can be found in S1 Data.
Animal experiments using ketamine/xylazine anesthesia were approved by the German animal welfare act (District Government of Upper Bavaria, reference number: 55.2-1-54-2531-105-10). Psychophysical testing with human subjects (data were analyzed anonymously) was approved by the Ethics Committee of the Medical Faculty of the LMU (59–16).
In vivo extracellular single-cell recordings were made from the LSO of young adult (postnatal age >80 days, n = 7 animals) Mongolian gerbils (Meriones unguiculatus) of both sexes.
To anesthetize the animals, a combination of ketamine (Ketavet, 100 mg/mL; Pfizer Inc., New York, NY, USA) and xylazine (Xylazin, 100 mg/mL; Sigma-Aldrich Chemie GmbH, Munich, Germany) was used. Physiological sodium chloride solution (NaCl, 0.9%; B. Braun Medicare GmbH, Melsungen, Germany) was mixed with 20% ketamine and 2% xylazine. After weighing the animals, they were anesthetized with an intraperitoneal injection (0.5 ml per 100 g body weight) of this anesthetic. After initial injection, the anesthetic was continuously provided by an automatic pump (801 Syringe Pump; Univentor High Precision Instruments Ltd., Zejtun, Malta) at a rate of 1.6 to 2.8 μl per minute depending on body weight and state of anesthesia. The anesthetic stage was periodically tested with the hind leg reflex. Constant body temperature of 37°C was ensured and checked by a thermostatically controlled heating pad the animals were placed on (Harvard Homeothermic Blanket Control Unit Model #50–7129; Harvard Apparatus Inc., Holliston, MA, USA). In order to ensure a sealed placement of the headphones on the acoustic meatus, the tragus was cut at two sides. The pericranium was anesthetized with lidocaine (Xylocain Pumpspray dental; AstraZeneca GmbH, Wedel, Germany).
A small cut of the skin was made across the rostrocaudal axis on the upper part of the skull, and a craniotomy and a durotomy (ca. 1.5 × 2.5 mm) approximately 1,800 μm lateral to the midline and 4,500 μm caudal to the bregmoid axis were performed. Ringer solution was periodically applied to the opening to prevent damage of the brain surface due to dehydration. The animals’ body functions were monitored through various devices. The heart rate and breathing cycle were monitored optically and acoustically through an electrocardiogram. The animals’ blood oxygen was measured through a pulse oximetry monitor (LifeSense Tabletop Capnography and Pulse Oximetry Monitor; Nonin Medical Inc., Plymouth, MN, USA). The animal was also typically provided carbogen through a custom-made mask. Recording sessions typically lasted between 10 and 12 hours.
The recording site was marked by iontophoretic application of the enzyme horseradish peroxidase (HRP). Experiments were then finalized by euthanizing the animals without awakening by an intraperitoneal injection of 1 ml of 20 mg/ml pentobarbital in Ringer solution. The animals were transcardially perfused with Ringer solution and 4% paraformaldehyde (PFA) for approximately 30 minutes. The brain was carefully removed from the cranium and put into 4% PFA at 4°C until further processing to determine the recording location. Only recordings from locations that were positively identified within the LSO were used for further data analysis.
Extracellular single-cell recordings were obtained using pulled glass micropipettes (1.5 mm OD × 0.86 mm ID, GC150F-10; Harvard Apparatus Ltd) filled with 1 M HRP in 1 M NaCl and with a resistance of 7 to 10 MΩ (measured with Ωmega-Tip Z; World Precision Instruments Inc., Sarasota, FL, USA). The electrode was mounted on a piezo drive (Inchworm controller 8200; Burleigh Products Group Inc., Victor, NY, USA), which was connected to a motorized manipulator (Digimatic series 164 type 161; Mitutoyo Deutschland GmbH, Neuss, Germany). The electrode signal was amplified (Electro 705, World Precision Instruments Inc. and Wide Band Amplifier, TOE 7607; Toellner GmbH, Herdecke, Germany) and fed to a computer via an A/D-converter (TDT RP2.1, System III; Tucker-Davis Technologies Inc., Alachua, FL, USA), where the signal was filtered. A notch filter was used to remove the 50-Hz power line hum, and the signal was bandpass filtered in the RP2.1 (high-pass at 300 Hz, low-pass at 5 kHz). Brainware (Jan Schnupp, University of Oxford, Oxford, UK, for Tucker-Davis Technologies Inc., USA) was used to visualize and analyze incoming spike trains. The spike times and raw traces were recorded and saved for subsequent analysis.
Stimulus generation and presentation
All the stimuli were digitally generated using MATLAB (The MathWorks Inc., Natick, MA, USA) and fed into TDT hardware using Brainware. The stimuli were D/A-converted in a TDT Multi-Function Processor (TDT RX6, System III; Tucker-Davis Technologies Inc.) and then attenuated with a TDT Programmable Attenuator (TDT PA5, System III; Tucker-Davis Technologies Inc.). The analog signal was delivered to the headphones. To cover the wide range of the LSO’s frequency spectrum, either Etymotic Research headphones ER-10B+ with ER 10D-T04 silicon ear tips (Etymotic Research, Inc., Elk Grove Village, IL, USA) or custom-built electrostatic headphones were used. The same silicon ear tips were fitted to both headphones to obtain a comparable seal to the animals’ ears. Custom-written calibration filters were used to achieve a flat spectrum over the entire range of the respective headphones.
When spikes of single cells were identifiable, the characteristic frequency (CF) and absolute threshold were determined using a pure-tone stimulus having the same length as the search stimulus. For further characterization of a neuron, a baseline ILD function was obtained, and broadband noise rate-level functions were recorded in response to 50-ms bursts presented on the ipsilateral excitatory ear only. These recordings were used for determining a cell’s latency (median latency = 4.2 ms, IQR = 3.5 ms, n = 25 cells).
To measure DRA in LSO neurons, a bimodal HPR stimulus was created. The intensity of continuous broadband noise was drawn from a pseudorandomized predefined distribution every 50 ms (see Fig 1). The range of monaural intensities spread from 20 to 80 dB SPL in 2 dB steps, similar to those used monaurally [16,17]. The predefined distributions comprised HPRs around center intensities of 50 dB ± 4 dB SPL and 70 dB ± 4 dB SPL, resulting in 5 values per HPR with a cumulative occurrence probability of 0.8. To generate ILDs, the stimulus intensities were mirrored at 60 dB for presentation on the other ear, resulting in HPR center regions of −20 dB ILD and +20 dB ILD, respectively.
A single condition epoch was 6.55 seconds long and was repeated 10 times with different pseudorandomizations each time (but identical cumulative probabilities). A 131-s–long stimulus was generated by alternating the two HPR conditions repeatedly, resulting in 10 HPR epochs per sweep for each condition. The stimulus sequence was identical across recordings. Acquisition of a complete set of stimuli typically lasted >2 h, thereby making the acquisition of a large sample size challenging.
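The level statistics described above can be illustrated with a minimal sketch. It assumes that the 0.8 probability mass is spread evenly over the 5 HPR levels, that the remaining 0.2 is spread evenly over all other levels, and that ILD is defined as ipsilateral minus contralateral level; none of these details is stated in the text, and all function names are hypothetical.

```python
import random

LEVELS = list(range(20, 81, 2))  # monaural levels, 20..80 dB SPL in 2-dB steps
BIN_S = 0.05                     # a new level is drawn every 50 ms
EPOCH_S = 6.55                   # duration of one HPR epoch

def hpr_weights(center_db, half_width_db=4, p_hpr=0.8):
    """Probability per level (assumption: uniform inside and outside the HPR)."""
    in_hpr = [abs(level - center_db) <= half_width_db for level in LEVELS]
    n_in = sum(in_hpr)
    n_out = len(LEVELS) - n_in
    return [p_hpr / n_in if inside else (1.0 - p_hpr) / n_out for inside in in_hpr]

def hpr_epoch(center_db, rng):
    """Draw one epoch of ipsilateral levels and mirror them at 60 dB."""
    n_bins = round(EPOCH_S / BIN_S)                 # 131 bins of 50 ms
    ipsi = rng.choices(LEVELS, weights=hpr_weights(center_db), k=n_bins)
    contra = [120 - level for level in ipsi]        # mirrored at 60 dB
    ild = [i - c for i, c in zip(ipsi, contra)]     # assumed sign: ipsi minus contra
    return ipsi, contra, ild

if __name__ == "__main__":
    ipsi, contra, ild = hpr_epoch(50, random.Random(1))
    print(len(ild), min(ild), max(ild))
```

With this sign convention, an ipsilateral HPR centered at 50 dB SPL corresponds to an ILD HPR centered at −20 dB, and one centered at 70 dB SPL to +20 dB.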
Neuronal data analysis
Recorded data files were analyzed offline using custom-written analysis routines in MATLAB and Python. First, the average latency of each cell was determined on the basis of its monaural rate-level functions to allow for subsequent spike-triggered analysis of responses to the HPR stimuli. To this end, spikes were assigned to 50-ms bins of the respective ILD that elicited the spikes (taking into account the latency of the cell). This resulted in a mean ILD response-rate function for each HPR condition.
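The binning step can be sketched as follows. This is an illustrative reimplementation under simple assumptions (spike times in seconds relative to stimulus onset, one ILD value per 50-ms bin), not the analysis code used in the study; the function name is hypothetical.

```python
from collections import defaultdict

def ild_rate_function(spike_times_s, ild_per_bin, latency_s, bin_s=0.05):
    """Assign spikes to the stimulus bin that elicited them and average per ILD.

    Returns the spike count per bin and the mean rate (spikes/s) per ILD value.
    """
    counts = [0] * len(ild_per_bin)
    for t in spike_times_s:
        # Shift each spike back by the cell's latency before binning.
        idx = int((t - latency_s) // bin_s)
        if 0 <= idx < len(counts):
            counts[idx] += 1
    by_ild = defaultdict(list)
    for ild, c in zip(ild_per_bin, counts):
        by_ild[ild].append(c)
    rates = {ild: (sum(cs) / len(cs)) / bin_s for ild, cs in by_ild.items()}
    return counts, rates

if __name__ == "__main__":
    counts, rates = ild_rate_function(
        [0.010, 0.012, 0.060, 0.120], [-20, 20, -20], latency_s=0.004
    )
    print(counts, rates)
```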
Spike-triggered stimulus averages were calculated to investigate whether a particular ILD sequence influenced the spiking of neurons. To this end, we selected all bins with a nonzero response and determined the ILD values presented during this time bin and during the nine previous bins. We then averaged the determined ILD values for each of the 10 bins. The data are plotted relative to the mean ILD of the bin that triggered the spike. The non-spike–triggered average was calculated the same way but based (i.e., triggered) on bins that did not show any response.
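A compact sketch of this triggered averaging is given below. It assumes per-bin spike counts and ILDs as inputs and subtracts the ILD of the triggering bin before averaging, which is equivalent to referencing the averages to the mean ILD of the triggering bin; the function name and the handling of the first nine bins (skipped) are assumptions.

```python
def triggered_ild_average(counts, ilds, n_back=9, spiking=True):
    """Average ILD history over the triggering bin and the n_back bins before it.

    spiking=True triggers on bins with at least one spike,
    spiking=False on bins without any response.
    """
    sums = [0.0] * (n_back + 1)
    n_triggers = 0
    for i in range(n_back, len(counts)):   # bins without a full history are skipped
        if (counts[i] > 0) != spiking:
            continue
        reference = ilds[i]
        for k, value in enumerate(ilds[i - n_back : i + 1]):
            sums[k] += value - reference   # ILD relative to the triggering bin
        n_triggers += 1
    if n_triggers == 0:
        return None
    return [s / n_triggers for s in sums]

if __name__ == "__main__":
    counts = [0, 0, 1, 0, 1]
    ilds = [10, 0, -10, 20, 0]
    print(triggered_ild_average(counts, ilds, n_back=2, spiking=True))
    print(triggered_ild_average(counts, ilds, n_back=2, spiking=False))
```

The last entry of each returned list is zero by construction, because it is the triggering bin itself.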
The standard separation D was calculated as previously described: $$D_n = \frac{|\mu_{n+1} - \mu_n|}{\sqrt{\sigma_{n+1}\,\sigma_n}}$$ where $\mu_{n+1}$ and $\mu_n$ are the mean values of the responses to two adjacent ILD values and $\sigma_{n+1}$ and $\sigma_n$ are their standard deviations. $D_n$ was subsequently smoothed using a 5-sample moving average filter.
In the case of the model, we calculated D based on the assumption of an underlying Poisson process in which the variance would equal the mean response.
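For illustration, the sketch below computes D between adjacent ILDs using the common definition of standard separation (difference of means divided by the geometric mean of the standard deviations), together with the Poisson variant used for the model and a 5-sample moving average. The use of the sample standard deviation and the shrinking window at the edges of the moving average are assumptions, as are all function names.

```python
import math
from statistics import mean, stdev

def standard_separation(responses_by_ild):
    """D between adjacent ILDs; input is one list of responses per ILD, ordered by ILD."""
    mu = [mean(r) for r in responses_by_ild]
    sd = [stdev(r) for r in responses_by_ild]   # sample SD (assumption)
    d = []
    for n in range(len(mu) - 1):
        denom = math.sqrt(sd[n + 1] * sd[n])
        d.append(abs(mu[n + 1] - mu[n]) / denom if denom > 0 else float("nan"))
    return d

def standard_separation_poisson(mean_counts):
    """Same measure for a Poisson process, where variance equals the mean."""
    d = []
    for n in range(len(mean_counts) - 1):
        denom = math.sqrt(math.sqrt(mean_counts[n + 1]) * math.sqrt(mean_counts[n]))
        d.append(abs(mean_counts[n + 1] - mean_counts[n]) / denom if denom > 0 else float("nan"))
    return d

def moving_average(values, width=5):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half) : i + half + 1]
        out.append(sum(window) / len(window))
    return out

if __name__ == "__main__":
    print(standard_separation([[1, 3], [4, 6], [4, 8]]))
    print(moving_average(standard_separation_poisson([4, 9, 16, 25, 36, 49])))
```

An efficiency measure such as D per spike would then divide each D value by the corresponding mean spike count; how exactly the spike count was normalized in the study is not specified here.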
MLEs were used to find the most probable ILD to result in a specific observed response Robs given all other observed responses R. For this, the joint probability density functions P(R, ILD) of the observed spike counts R and the presented ILDs were calculated for all responses of one neuron, excluding Robs. The ILD that maximizes P(R = Robs, ILD) was then used as the MLE for Robs.
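The leave-one-out estimate described above can be sketched with an unnormalized joint histogram, since normalization does not change which ILD maximizes the joint probability. The tie-breaking rule (smallest ILD) and the return value for spike counts that never occur in the remaining data (None) are assumptions, and the function name is hypothetical.

```python
from collections import Counter

def loo_mle(counts, ilds, i_obs):
    """Most probable ILD for the response at index i_obs, given all other responses."""
    joint = Counter()
    for i, (r, ild) in enumerate(zip(counts, ilds)):
        if i != i_obs:                      # exclude the observed response itself
            joint[(r, ild)] += 1
    r_obs = counts[i_obs]
    candidates = {ild: n for (r, ild), n in joint.items() if r == r_obs}
    if not candidates:
        return None                         # spike count not seen in the remaining data
    best = max(candidates.values())
    return min(ild for ild, n in candidates.items() if n == best)

if __name__ == "__main__":
    counts = [0, 0, 0, 3, 3, 3, 0]
    ilds = [-20, -20, -20, 20, 20, 20, 20]
    print([loo_mle(counts, ilds, i) for i in range(len(counts))])
```

In the toy data, the last response (zero spikes at +20 dB) is decoded as −20 dB, which illustrates how variable single-trial responses lead to decoding errors.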
To characterize shifts in ILD functions due to HPR statistics, the threshold ILD (defined as the ILD at which the firing rate deviates by more than 10% from baseline firing) was determined.
Time courses of adaptation were measured by fitting a single- or double-exponential function to the mean response rates averaged over all 30 repetitions of an HPR condition. Inbuilt functions in MATLAB for the rmse and the adjusted coefficient of determination (R2) were used to evaluate the goodness of fits.
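The two candidate models and the goodness-of-fit measures can be written out as below. This is a sketch of the definitions only: the fitting itself (nonlinear least squares) is not shown, the parameterization with a constant offset is an assumption, and the adjusted R2 uses the common form with n − m residual degrees of freedom for m fitted coefficients.

```python
import math

def single_exp(t, a, tau, c):
    """Single-exponential decay toward a steady-state rate c."""
    return a * math.exp(-t / tau) + c

def double_exp(t, a1, tau1, a2, tau2, c):
    """Sum of a fast and a slow exponential decay plus a steady-state rate c."""
    return a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2) + c

def rmse(y, y_hat):
    """Root mean-squared error between data and model prediction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def adjusted_r2(y, y_hat, n_coeffs):
    """Adjusted R2 with n_coeffs fitted coefficients (including the offset)."""
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - n_coeffs)

if __name__ == "__main__":
    t = [i * 0.05 for i in range(131)]
    # Synthetic data using the reported median time constants (0.22 s and 2.2 s);
    # amplitudes and offset are arbitrary illustration values.
    y = [double_exp(ti, 40.0, 0.22, 20.0, 2.2, 30.0) for ti in t]
    y_single = [single_exp(ti, 60.0, 0.22, 30.0) for ti in t]
    print(rmse(y, y_single), adjusted_r2(y, y_single, 3))
```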
Psychophysical measurements and data analysis
Five normal-hearing (within 20 dB of ISO/TR 389–5:1998) listeners (2 males and 3 females, mean age 26 ± 4 years, right-handed) participated in the measurement of just-noticeable ILD differences. The signals consisted of white noise that was generated in MATLAB at a sampling rate of 44.1 kHz. The signals were digital to analog converted (Audio 2 Dj; Native Instruments GmbH, Berlin, Germany) before being presented over circumaural headphones (HDA 200; Sennheiser Electronic GmbH & Co. KG., Wedemark, Germany), which were calibrated for a flat frequency response between 20 Hz and 20 kHz. The signals were presented at 60-dB SPL average diotic sound-pressure level, and ILDs were introduced by symmetrically amplifying and attenuating the right and left ear signals by half the desired ILD. Within the experiment, a 2-s–long adapter stimulus was followed, after 350 ms, by two 50-ms probe stimuli that were separated by 100 ms. Similar to the physiological experiments, the adapter consisted of concatenated diotic noise bursts, each 50 ms in duration, with ILDs that were randomly drawn from one of the two nonuniform HPR distributions.
ILD JNDs were determined at two reference ILDs (i.e., −20 dB ILD and +20 dB ILD). One of the two probe stimuli was randomly presented at one of the two reference ILDs, while the other probe stimulus was systematically varied using a transformed up–down procedure, following a one-up three-down rule, as implemented by the MATLAB AFC package . To determine the JND, listeners were asked to specify the perceived direction of the probe pair sounds, which allows deducing which of the two probe stimuli was perceived as more lateralized. Following the subject’s answer, the variable probe ILD was adjusted until reaching the termination criterion (6 reversals) of the one-up three-down rule. ILD JNDs for each listener, each probe position, and each listening condition (i.e., HPR) were calculated as median over six sessions (each session consisting of 3 measurements). For each subject, the effect of listening condition was expressed as normalized change in ILD JND because of colocation of probe position and preceding HPR.
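The adaptive rule can be illustrated with a generic one-up three-down staircase that stops after 6 reversals. This sketch is not the AFC package implementation; the starting value, step size, floor, and the simulated listener are hypothetical.

```python
def staircase_one_up_three_down(respond, start, step, n_reversals=6,
                                floor=0.0, max_trials=1000):
    """Run a one-up three-down staircase and return the levels at the reversals.

    respond(level) returns True for a correct answer at the given ILD difference.
    """
    level = start
    correct_run = 0
    last_direction = 0
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            correct_run += 1
            if correct_run < 3:
                continue              # level changes only after three correct in a row
            correct_run = 0
            direction = -1            # make the task harder
        else:
            correct_run = 0
            direction = +1            # make the task easier after one error
        if last_direction and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        level = max(floor, level + direction * step)
    return reversals

if __name__ == "__main__":
    # Hypothetical deterministic listener: correct whenever the difference is >= 2 dB.
    print(staircase_one_up_three_down(lambda x: x >= 2.0, start=5.0, step=1.0))
```

With a real listener, the track oscillates around the difference that yields roughly 79% correct responses, which is the convergence point of a one-up three-down rule.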
The LSO was modeled using a phenomenological rate model similar to the one used by Wen and colleagues to model adaptation in the ANF (see S3 Fig). The LSO is implemented as a subtraction stage with inputs from the ipsi- and contralateral ANFs, followed by a sigmoidal activation function (CN and MNTB were omitted to minimize model complexity). The firing rate R_LSO(t) of the LSO, in spikes per second (sps), is obtained by passing the weighted input difference through this sigmoid, where R_max and R_min are the maximum and the minimum firing rates, R_0 is the rate at zero input, and k is the steepness of the sigmoid. R_ipsi(t) and R_contra(t) are the firing rates of the ipsi- and the contralateral ANFs, and g is a gain factor that weights the relative strength of the excitatory and inhibitory inputs. The ANF inputs were each calculated using a dual adaptation model, which was fitted to the data shown in Fig 2 of . Because we only fitted the response of one ANF, we replaced the saturating nonlinearity used in the original model with a simple logistic function.
The parameters for the LSO model were determined by calculating the ILD rate function of the model and fitting it to the ILD rate function given by Fig 4 in  (resulting parameters: ). The slow LSO adaptation was implemented to resemble the second adaptation stage of . The adaptation parameters were adjusted so that the time course of adaptation was in agreement with the recorded data from the LSO. This resulted in an adaptation time constant of 4 ms, g1 = 0.3, g2 = 0.01, with an adaptation threshold of 35 sps (see  for details on the implementation). Running this model often resulted in zero firing rates for larger negative ILD values, which led to undefined D/spike values, so we introduced a minimal LSO firing rate of Rmin = 30 sps.
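The subtraction stage with sigmoidal activation can be sketched as follows. The exact functional form of the sigmoid is not reproduced here, so the code uses one possible logistic parameterization that satisfies the stated constraints (bounded by R_min and R_max, equal to R_0 at zero input, steepness k); this form, the function name, and all parameter values other than R_min = 30 sps are assumptions. The slow adaptation stage is not included.

```python
import math

def lso_rate(r_ipsi, r_contra, g, r_max, r_min, r0, k):
    """LSO output rate (sps) from excitatory and inhibitory input rates.

    Assumed form: logistic function of x = r_ipsi - g * r_contra, shifted so
    that the output equals r0 for x = 0. Requires r_min < r0 < r_max.
    """
    x = r_ipsi - g * r_contra                       # binaural subtraction stage
    x0 = math.log((r_max - r0) / (r0 - r_min)) / k  # offset that places r0 at x = 0
    return r_min + (r_max - r_min) / (1.0 + math.exp(-k * (x - x0)))

if __name__ == "__main__":
    # Hypothetical parameter values for illustration only.
    params = dict(g=1.0, r_max=200.0, r_min=30.0, r0=50.0, k=0.05)
    for r_ipsi, r_contra in [(0.0, 100.0), (50.0, 50.0), (100.0, 0.0)]:
        print(r_ipsi, r_contra, round(lso_rate(r_ipsi, r_contra, **params), 2))
```

Keeping a nonzero lower bound in the sigmoid mirrors the minimal firing rate introduced above to avoid undefined D/spike values.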
LSO intrinsic metabolic imaging
Changes in NADH levels in the LSO were monitored by imaging of NADH autofluorescence in acute brainstem slices as recently described [43,65]. The animals were anesthetized with isoflurane and decapitated. We removed the brains and cut 250-μm–thick transverse slices (VT1200S Vibratome; Leica Microsystems GmbH, Wetzlar, Germany). The slices were superfused at room temperature (22–25°C) in the recording chamber with gassed (95% O2 and 5% CO2) artificial cerebrospinal fluid (ACSF) solution containing (in mM): 23 sucrose, 125 NaCl, 25 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 1 MgCl2, 2 CaCl2, 2 glucose (Sigma-Aldrich). NADH was excited with a 365-nm LED, and fluorescence images (emission filter: 447 ± 30 nm) were recorded at 2 Hz (pco.edge 5.5; PCO AG, Kelheim, Germany).
LSO neurons were electrically excited by a 20-s stimulation train at 200 Hz with biphasic pulses of 1 ms duration and 5-V amplitude using a bipolar Tungsten electrode placed in the fiber tract targeting the LSO. NADH fluorescence intensity was measured in individual rectangular ROIs, corrected for photobleaching, and presented as ΔF/F0 (F0 = fluorescence level at stimulation onset; ΔF = change in fluorescence level relative to F0).
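The normalization of an ROI trace can be written as a short sketch. It assumes a trace that has already been corrected for photobleaching (the correction method is not specified here) and expresses the result in percent, in line with the amplitude criteria given in percent; the function name is hypothetical.

```python
def delta_f_over_f0(trace, onset_index):
    """Relative fluorescence change in percent, referenced to stimulation onset."""
    f0 = trace[onset_index]                 # fluorescence level at stimulation onset
    return [100.0 * (f - f0) / f0 for f in trace]

if __name__ == "__main__":
    # Toy ROI trace sampled at 2 Hz; values are arbitrary camera counts.
    print(delta_f_over_f0([100.0, 99.0, 98.5, 99.2, 101.0], onset_index=0))
```

A decrease in NADH autofluorescence (negative values) corresponds to increased energy consumption in the respective ROI.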
The occurrence of DM was automatically detected based on their amplitudes (>0.05–0.10%), the time difference between them (>6–10 s), and the time differences between the minima and the interjacent maximum (>1–2 s). These parameters were individually adjusted for each slice by analyzing a measurement with larger ROIs and by comparing the automated results with those of visual inspection of the individual traces. Specific blockade of GABA-B receptors was performed by application of 10 μM CGP [(2S)-3-[[(1S)-1-(3,4-Dichlorophenyl)ethyl]amino-2-hydroxypropyl](phenylmethyl)phosphinic acid hydrochloride] (Tocris Bioscience, Bristol, UK) to the ACSF for 20 minutes.
S1 Fig. Schematics of additional HPR stimulus paradigms.
(A, F, and K) The intensity distribution on the ipsilateral ear was identical to the main HPR paradigm (Fig 1) but altered on the contralateral ear (A and F) or not stimulated at all (ipsi only, K). For (A) and (F), the intensity distribution on the contralateral ear was fixed for an entire epoch at 70 dB or 50 dB and switched between HPR conditions. These intensities were either presented out of phase with the mean HPR intensity on the ipsilateral ear (contra paradigm 2, panel A) or in phase (contra paradigm 3, panel F). (B, G, and L) Mean ILD response functions of all LSO neurons tested with the respective paradigm. Conventions as in Fig 2C. (C, H, and M) Scatter plot of threshold ILDs illustrates significant changes with HPR condition in paradigm 2 (P = 0.0003, N = 18 neurons, paired Wilcoxon signed-rank test) and paradigm 3 (P = 0.0005, N = 19 neurons, paired Wilcoxon signed-rank test) but not for the ipsi-only paradigm (P = 0.4, N = 11 neurons, paired Wilcoxon signed-rank test). Conventions as in Fig 1. (D–E and I–J) Similar to the original stimulus paradigm (Fig 1 and Fig 3), the advantage of these shifts for ILD computation is displayed when calculating D(pop) but not for D(mean) analysis. In contrast to Fig 3D, the asymmetry of stimulus paradigms 2 and 3 does not allow calculating summed D-functions across both hemispheres. Note that the peaks of D(pop) are very close to the respective HPRs but do not perfectly align. In contrast, D(mean) takes minimal values within the respective HPRs in either condition. Underlying data can be found in S1 Data. HPR, high-probability region; ILD, interaural level difference; LSO, Lateral Superior Olive.
(A) Spike-count histograms for four representative LSO neurons (aligned in rows) to a full epoch (approximately 6.55 s) in each HPR condition (left column: −20 dB HPR condition, right column: +20 dB HPR condition). Bins (i.e., 50-ms snippets) during which the ILDs took the center value of the respective HPR in each condition (−20 dB and +20 dB, respectively) are color-coded, illustrating a high response variability. (B) Histogram of Pearson correlations of LSO responses to the full HPR stimulus. The average correlation of spike counts across three repetitions of the entire stimulus set (19 switches) was surprisingly low for the majority of neurons (median Pearson correlation coefficient = 0.62, IQR: 0.26). (C) A spike-triggered analysis of the responses of all neurons established that the likelihood of spiking to any ILD was not systematically associated with a prior occurrence of specific relative ILDs (upper panel; color code represents HPR conditions). Performing the same analysis but triggered by nonspiking to a presented ILD (lower panel) revealed a tendency toward nonresponsiveness when a more positive ILD had been presented shortly before. (D) The mean Pearson correlation coefficient of responses to the same ILDs was significantly higher when stimuli were presented with gaps of 300 ms in between (“Control”) compared to either HPR condition (paired-sample t test, N = 13 neurons). Underlying data can be found in S1 Data. HPR, high-probability region; ILD, interaural level difference; IQR, interquartile range; LSO, Lateral Superior Olive.
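The trial-to-trial reliability measure in panel (B) can be illustrated with a short Python sketch. It assumes, as one plausible reading of the caption, that each repetition of the stimulus yields one vector of binned spike counts per neuron and that the reported value is the mean Pearson correlation over all pairs of repetitions. The function names are hypothetical and the exact averaging used in the study may differ.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        # A constant spike-count vector has no defined correlation.
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


def mean_pairwise_correlation(repetitions):
    """Average Pearson correlation over all pairs of repetitions.

    repetitions: list of spike-count vectors, one per stimulus repetition,
    all binned identically (e.g., one count per 50-ms bin).
    """
    pairs = list(combinations(repetitions, 2))
    if not pairs:
        raise ValueError("need at least two repetitions")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

With three repetitions, as in the caption, the average runs over three pairs. A value near 1 indicates that the neuron responds almost identically on every repetition, whereas lower values reflect trial-to-trial variability.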
(A) Comparison of goodness of fit for power-law and exponential fits of the adaptation time course in LSO neurons. Rmses of the power-law fit were lower than those of a single-exponential fit (left, P = 0.001, Wilcoxon signed-rank test) but slightly higher than those of a double-exponential fit (right, P = 0.02, Wilcoxon signed-rank test). (B) Block diagram of the LSO rate model. The input to the model is given as a sequence of sound levels on the ipsilateral (L_ipsi) and contralateral (L_contra) ear. A dual adaptation model is used to calculate the ANF firing rates R_ipsi and R_contra. The LSO model is implemented as a subtraction stage, with the contralateral input weighted by a gain value, followed by a sigmoid that models the activation of the neuron. An optional adaptation stage that resembles the rate adaptation stage in the ANF model is used to account for the slow adaptation component present in the LSO measurements. (C) Model responses (left panel) and resulting D(pop) in response to contra paradigm 2 (see S1 Fig). The model closely captures both the magnitude of ILD tuning function shifts and corresponding changes in D(pop) (compare S1 Fig). (D) Same as in (C), but for contra paradigm 3 (compare S1 Fig). Underlying data can be found in S1 Data. ANF, auditory nerve fiber; ILD, interaural level difference; LSO, Lateral Superior Olive; rmse, root mean-squared error.
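The subtraction-plus-sigmoid stage in panel (B) can be sketched as follows. This is only an illustration of the block diagram as described in the caption: the ANF adaptation front end and the optional LSO adaptation stage are omitted, the logistic form of the sigmoid is an assumption, and all parameter names and values (gain, r_max, slope, offset) are hypothetical rather than the fitted values of the model.

```python
import math


def lso_rate(r_ipsi, r_contra, gain=1.0, r_max=100.0, slope=0.1, offset=0.0):
    """Static LSO stage: weighted subtraction followed by a sigmoid.

    r_ipsi, r_contra : ANF firing rates driving excitation and inhibition
    gain             : weight applied to the contralateral (inhibitory) input
    r_max            : saturation rate of the sigmoid (hypothetical value)
    slope, offset    : steepness and midpoint of the sigmoid (hypothetical)
    """
    drive = r_ipsi - gain * r_contra
    # Logistic sigmoid chosen for illustration; the caption only says "sigmoid".
    return r_max / (1.0 + math.exp(-slope * (drive - offset)))


def ild_response_function(rate_pairs, **params):
    """Evaluate the static stage for a list of (r_ipsi, r_contra) pairs."""
    return [lso_rate(ri, rc, **params) for ri, rc in rate_pairs]
```

In this sketch, raising the gain strengthens the contralateral inhibition and therefore lowers the output for the same pair of input rates, which is one simple way in which a gain change could shift an ILD response function.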
We thank C. Leibold for discussions and N. Lesica for comments on earlier versions of the manuscript.
- 1. Lewicki MS, Olshausen BA, Surlykke A, Moss CF. Scene analysis in the natural environment. Front Psychol. 2014;5:199. pmid:24744740
- 2. Bizley JK, Cohen YE. The what, where and how of auditory-object perception. Nat Rev Neurosci. 2013 Oct;14(10):693–707. pmid:24052177
- 3. Młynarski W. Efficient coding of spectrotemporal binaural sounds leads to emergence of the auditory space representation. Front Comput Neurosci. 2014;8:26. pmid:24639644
- 4. Młynarski W. The opponent channel population code of sound location is an efficient representation of natural binaural sounds. PLoS Comput Biol. 2015 May;11(5):e1004294. pmid:25996373
- 5. Barlow HB. Single units and sensation: A neuron doctrine for perceptual psychology? Perception. 1972;1(4):371–394. pmid:4377168
- 6. Wark B, Lundstrom BN, Fairhall A. Sensory adaptation. Curr Opin Neurobiol. 2007 Aug;17(4):423–9. pmid:17714934
- 7. Carandini M, Heeger DJ. Normalization as a canonical neural computation. Nat Rev Neurosci. 2012 Jan;13(1):51–62.
- 8. Rabinowitz NC, Willmore BDB, Schnupp JWH, King AJ. Contrast Gain Control in Auditory Cortex. Neuron. 2011 Jun;70(6):1178–91. pmid:21689603
- 9. Rabinowitz NC, Willmore BDB, King AJ, Schnupp JWH. Constructing noise-invariant representations of sound in the auditory pathway. PLoS Biol. 2013 Nov;11(11):e1001710. pmid:24265596
- 10. Watkins PV, Barbour DL. Specialized neuronal adaptation for preserving input sensitivity. Nat Neurosci. 2008 Nov;11(11):1259–61. pmid:18820690
- 11. Herrmann B, Schlichting N, Obleser J. Dynamic range adaptation to spectral stimulus statistics in human auditory cortex. J Neurosci Off J Soc Neurosci. 2014 Jan 1;34(1):327–31.
- 12. Dean I, Harper NS, McAlpine D. Neural population coding of sound level adapts to stimulus statistics. Nat Neurosci. 2005 Dec;8(12):1684–9. pmid:16286934
- 13. Maier JK, Hehrmann P, Harper NS, Klump GM, Pressnitzer D, McAlpine D. Adaptive coding is constrained to midline locations in a spatial listening task. J Neurophysiol. 2012 Oct;108(7):1856–68. pmid:22773777
- 14. Robinson BL, Harper NS, McAlpine D. Meta-adaptation in the auditory midbrain under cortical influence. Nat Commun. 2016 Nov 24;7:13442. pmid:27883088
- 15. Dahmen JC, Keating P, Nodal FR, Schulz AL, King AJ. Adaptation to stimulus statistics in the perception and neural representation of auditory space. Neuron. 2010 Jun 24;66(6):937–48. pmid:20620878
- 16. Wen B, Wang GI, Dean I, Delgutte B. Dynamic Range Adaptation to Sound Level Statistics in the Auditory Nerve. J Neurosci. 2009 Nov 4;29(44):13797–808. pmid:19889991
- 17. Wen B, Wang GI, Dean I, Delgutte B. Time course of dynamic range adaptation in the auditory nerve. J Neurophysiol. 2012 Jul;108(1):69–82. pmid:22457465
- 18. Grothe B, Pecka M, McAlpine D. Mechanisms of sound localization in mammals. Physiol Rev. 2010 Jul;90(3):983–1012. pmid:20664077
- 19. Bialek W, Rieke F, de Ruyter van Steveninck RR, Warland D. Reading a neural code. Science. 1991 Jun 28;252(5014):1854–7.
- 20. McAlpine D, Jiang D, Palmer AR. A neural code for low-frequency sound localization in mammals. Nat Neurosci. 2001 Apr;4(4):396–401. pmid:11276230
- 21. Pouget A, Dayan P, Zemel R. Information processing with population codes. Nat Rev Neurosci. 2000 Nov;1(2):125–32. pmid:11252775
- 22. Panzeri S, Harvey CD, Piasini E, Latham PE, Fellin T. Cracking the Neural Code for Sensory Perception by Combining Statistics, Intervention, and Behavior. Neuron. 2017 Feb 8;93(3):491–507.
- 23. McAlpine D, Grothe B. Sound localization and delay lines—do mammals fit the model? Trends Neurosci. 2003 Jul;26(7):347–50. pmid:12850430
- 24. Grothe B. New roles for synaptic inhibition in sound localization. Nat Rev Neurosci. 2003 Jul;4(7):540–50. pmid:12838329
- 25. Ashida G, Carr CE. Sound localization: Jeffress and beyond. Curr Opin Neurobiol. 2011 Oct;21(5):745–51. pmid:21646012
- 26. Carr CE, Christensen-Dalsgaard J. Sound Localization Strategies in Three Predators. Brain Behav Evol. 2015 Sep;86(1):17–27. pmid:26398572
- 27. Day ML, Delgutte B. Decoding sound source location and separation using neural population activity patterns. J Neurosci Off J Soc Neurosci. 2013 Oct 2;33(40):15837–47.
- 28. Harper NS, McAlpine D. Optimal neural population coding of an auditory spatial cue. Nature. 2004 Aug 5;430(7000):682–6. pmid:15295602
- 29. Harper NS, Scott BH, Semple MN, McAlpine D. The neural code for auditory space depends on sound frequency and head size in an optimal manner. PLoS ONE. 2014;9(11):e108154. pmid:25372405
- 30. Grothe B, Pecka M. The natural history of sound localization in mammals—a story of neuronal inhibition. Front Neural Circuits. 2014;8:116. pmid:25324726
- 31. Magnusson AK, Park TJ, Pecka M, Grothe B, Koch U. Retrograde GABA signaling adjusts sound localization by balancing excitation and inhibition in the brainstem. Neuron. 2008 Jul 10;59(1):125–37. pmid:18614034
- 32. Park TJ, Brand A, Koch U, Ikebuchi M, Grothe B. Dynamic changes in level influence spatial coding in the lateral superior olive. Hear Res. 2008 Apr;238(1–2):58–67. pmid:18162347
- 33. Stange A, Myoga MH, Lingner A, Ford MC, Alexandrova O, Felmy F, et al. Adaptation in sound localization: from GABAB receptor-mediated synaptic modulation to perception. Nat Neurosci. 2013 Dec;16(12):1840–7. pmid:24141311
- 34. Meffin H, Grothe B. Selective filtering to spurious localization cues in the mammalian auditory brainstem. J Acoust Soc Am. 2009 Nov;126(5):2437–54. pmid:19894825
- 35. Tollin DJ, Koka K, Tsai JJ. Interaural level difference discrimination thresholds for single neurons in the lateral superior olive. J Neurosci Off J Soc Neurosci. 2008 May 7;28(19):4848–60.
- 36. Sakitt B. Indices of discriminability. Nature. 1973 Jan 12;241(5385):133–4. pmid:4695543
- 37. Zilany MSA, Carney LH. Power-law dynamics in an auditory-nerve model can account for neural adaptation to sound-level statistics. J Neurosci Off J Soc Neurosci. 2010 Aug 4;30(31):10380–90.
- 38. Denève S, Machens CK. Efficient codes and balanced networks. Nat Neurosci. 2016 Mar;19(3):375–82. pmid:26906504
- 39. Deneve S, Chalk M. Efficiency turns the table on neural encoding, decoding and noise. Curr Opin Neurobiol. 2016 Apr;37:141–8. pmid:27065340
- 40. Laughlin SB. Energy as a constraint on the coding and processing of sensory information. Curr Opin Neurobiol. 2001 Aug;11(4):475–80. pmid:11502395
- 41. Willmore B, Tolhurst DJ. Characterizing the sparseness of neural codes. Netw Bristol Engl. 2001 Aug;12(3):255–70.
- 42. Shuttleworth CW. Use of NAD(P)H and flavoprotein autofluorescence transients to probe neuron and astrocyte responses to synaptic activation. Neurochem Int. 2010 Feb;56(3):379–86. pmid:20036704
- 43. Brosel S, Grothe B, Kunz L. An auditory brainstem nucleus as a model system for neuronal metabolic demands. Eur J Neurosci. 2018 Feb;47(3):222–35. pmid:29205598
- 44. Spitzer MW, Semple MN. Interaural phase coding in auditory midbrain: influence of dynamic stimulus features. Science. 1991 Nov 1;254(5032):721–4. pmid:1948053
- 45. Wang Y, Peña JL. Direction Selectivity Mediated by Adaptation in the Owl’s Inferior Colliculus. J Neurosci. 2013 Dec 4;33(49):19167–75. pmid:24305813
- 46. Lee C-C, Middlebrooks JC. Auditory cortex spatial sensitivity sharpens during task performance. Nat Neurosci. 2011 Jan;14(1):108–14. pmid:21151120
- 47. Malone BJ, Scott BH, Semple MN. Context-Dependent Adaptive Coding of Interaural Phase Disparity in the Auditory Cortex of Awake Macaques. J Neurosci. 2002 Jun 1;22(11):4625–38. pmid:12040069
- 48. Furukawa S, Maki K, Kashino M, Riquimaroux H. Dependency of the Interaural Phase Difference Sensitivities of Inferior Collicular Neurons on a Preceding Tone and Its Implications in Neural Population Coding. J Neurophysiol. 2005 Jun 1;93(6):3313–26. pmid:15703221
- 49. Getzmann S. Spatial discrimination of sound sources in the horizontal plane following an adapter sound. Hear Res. 2004 May;191(1–2):14–20. pmid:15109700
- 50. Lingner A, Pecka M, Leibold C, Grothe B. A novel concept for dynamic adjustment of auditory space. Sci Rep. 2018 May 29;8(1):8335. pmid:29844516
- 51. Maddox RK, Pospisil DA, Stecker GC, Lee AKC. Directing eye gaze enhances auditory spatial cue discrimination. Curr Biol CB. 2014 Mar 31;24(7):748–52. pmid:24631242
- 52. Phillips DP, Hall SE. Psychophysical evidence for adaptation of central auditory processors for interaural differences in time and level. Hear Res. 2005 Apr;202(1–2):188–99. pmid:15811711
- 53. Xiong XR, Liang F, Li H, Mesik L, Zhang KK, Polley DB, et al. Interaural Level Difference-Dependent Gain Control and Synaptic Scaling Underlying Binaural Computation. Neuron. 2013 Aug 21;79(4):738–53. pmid:23972599
- 54. Vigneault-MacLean BK, Hall SE, Phillips DP. The effects of lateralized adaptors on lateral position judgements of tones within and across frequency channels. Hear Res. 2007 Feb;224(1–2):93–100. pmid:17223297
- 55. Park TJ, Klug A, Holinstat M, Grothe B. Interaural level difference processing in the lateral superior olive and the inferior colliculus. J Neurophysiol. 2004 Jul;92(1):289–301. pmid:15056693
- 56. Lesica NA, Grothe B. Efficient temporal processing of naturalistic sounds. PLoS ONE. 2008;3(2):e1655. pmid:18301738
- 57. Rodríguez FA, Chen C, Read HL, Escabí MA. Neural modulation tuning characteristics scale to efficiently encode natural sound statistics. J Neurosci Off J Soc Neurosci. 2010 Nov 24;30(47):15969–80.
- 58. Pecka M, Han Y, Sader E, Mrsic-Flogel TD. Experience-dependent specialization of receptive field surround for selective coding of natural scenes. Neuron. 2014 Oct 22;84(2):457–69. pmid:25263755
- 59. Olshausen BA, Field DJ. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature. 1996 Jun 13;381(6583):607–9. pmid:8637596
- 60. Garcia-Lazaro JA, Ahmed B, Schnupp JWH. Tuning to natural stimulus dynamics in primary auditory cortex. Curr Biol CB. 2006 Feb 7;16(3):264–71. pmid:16461279
- 61. Kapfer C, Seidl AH, Schweizer H, Grothe B. Experience-dependent refinement of inhibitory inputs to auditory coincidence-detector neurons. Nat Neurosci. 2002 Mar;5(3):247–53. pmid:11850629
- 62. Garcia-Lazaro JA, Belliveau LAC, Lesica NA. Independent population coding of speech with sub-millisecond precision. J Neurosci Off J Soc Neurosci. 2013 Dec 4;33(49):19362–72.
- 63. Ewert, S. AFC—A modular framework for running psychoacoustic experiments and computational perception models [Internet]. 199.[cited 2017 Apr 21]. Available from: http://medi.uni-oldenburg.de/afc/index.htm.
- 64. Tollin DJ. The lateral superior olive: a functional role in sound source localization. Neurosci Rev J Bringing Neurobiol Neurol Psychiatry. 2003 Apr;9(2):127–43.
- 65. Direnberger S, Banchi R, Brosel S, Seebacher C, Laimgruber S, Uhl R, et al. Analysis of signal processing in vestibular circuits with a novel light-emitting diodes-based fluorescence microscope. Eur J Neurosci. 2015 May;41(10):1332–44. pmid:25847143