• Loading metrics

A circuit model of auditory cortex

  • Youngmin Park,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Otorhinolaryngology: HNS, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

  • Maria N. Geffen

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Department of Otorhinolaryngology: HNS, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Department of Neuroscience, Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

A circuit model of auditory cortex

  • Youngmin Park, 
  • Maria N. Geffen


The mammalian sensory cortex is composed of multiple types of inhibitory and excitatory neurons, which form sophisticated microcircuits for processing and transmitting sensory information. Despite rapid progress in understanding the function of distinct neuronal populations, the parameters of connectivity that are required for the function of these microcircuits remain unknown. Recent studies found that two most common inhibitory interneurons, parvalbumin- (PV) and somatostatin-(SST) positive interneurons control sound-evoked responses, temporal adaptation and network dynamics in the auditory cortex (AC). These studies can inform our understanding of parameters for the connectivity of excitatory-inhibitory cortical circuits. Specifically, we asked whether a common microcircuit can account for the disparate effects found in studies by different groups. By starting with a cortical rate model, we find that a simple current-compensating mechanism accounts for the experimental findings from multiple groups. They key mechanisms are two-fold. First, PVs compensate for reduced SST activity when thalamic inputs are strong with less compensation when thalamic inputs are weak. Second, SSTs are generally disinhibited by reduced PV activity regardless of thalamic input strength. These roles are augmented by plastic synapses. These roles reproduce the differential effects of PVs and SSTs in stimulus-specific adaptation, forward suppression and tuning-curve adaptation, as well as the influence of PVs on feedforward functional connectivity in the circuit. This circuit exhibits a balance of inhibitory and excitatory currents that persists on stimulation. This approach brings together multiple findings from different laboratories and identifies a circuit that can be used in future studies of upstream and downstream sensory processing.

Author summary

The mammalian auditory cortex is composed of multiple types of inhibitory and excitatory neurons, which form sophisticated microcircuits for processing and transmitting sensory information. Distinct inhibitory neuron subtypes play distinct functions in auditory processing, but it remains unknown what simple set of underlying mechanisms is responsible for inhibitory cortical function. Here, we built minimal rate and spiking models and identified a specific set of synaptic mechanisms that could best reproduce the broad set of experimental results in the auditory cortex. The simplicity of our model provides an understanding of inhibitory cortical processing at the circuit level, which explains results from different laboratories, and provides for a novel computational framework for future studies of cortical function.


Detecting sudden changes in the acoustic environment and extracting relevant acoustic features from noise are important computations for auditory navigation and scene analysis. The mammalian auditory cortex (AC) is a key region for processing temporally patterned sounds [1]. Neurons in AC exhibit adaptation to repeated tones, which may be selective for an overrepresented stimulus, such as in stimulus-specific adaptation, or SSA [1,2]. They furthermore exhibit forward suppression, in which a preceding stimulus masker tone drives a decrease in responses to the subsequent target tone [3,4]. How these computations are carried out by cortical circuits has been subject of extensive research.

The AC is composed of tightly coupled networks of excitatory and inhibitory neurons. Recent studies have identified the differential involvement of two distinct major classes of inhibitory neurons, parvalbumin-positive (PV) and somatostatin-positive (SST) neurons in these temporal paradigms. These neurons differ morphologically and physiologically [5,6], and recent studies found that they play differential functions in auditory processing. Specifically, SSTs, but not PVs facilitate stimulus-specific adaptation [2]. PVs and SSTs play distinct roles in adaptation to repeated tones along the frequency response function of the target neuron [7]. SSTs and PVs drive bi-directional effects on forward suppression [8]. In addition, PVs enhance feedforward connectivity in the auditory cortex [9]. These experimental results can be used to constrain currents and connections in an idealized auditory cortex model consisting of PVs, SSTs and Exc. Here, we tested whether these results can be accounted for by the same set of mechanisms.

In this paper, we build up from a simple dimensionless model consisting of one iso-frequency unit. We transition to a three-unit rate model to understand the mechanisms of inhibitory neural modulation on a gross tonotopy. We then build a detailed spiking model which incorporates the mechanisms discovered in the rate models. The rate models provide a qualitative intuition of the underlying mechanisms, and the spiking models incorporate these mechanisms in an accessible, open-source codebase for future work. We present the results on testing the three-unit rate model and the detailed spiking model in four distinct auditory paradigms.

The model accounted for observed experimental results including the differential role of SSTs and PVs in SSA [2], forward suppression [8], tuning-curve adaptation [7], and the effects of PV activation on feedforward functional connectivity [9]. We found that compensating currents between the two types of inhibitory neurons explain experimental findings of differential effects of their modulation on excitatory activity. Furthermore, the model is consistent with existing hypotheses regarding inhibitory and excitatory balance in the cortex. This framework can be used to build and test hypotheses for similar phenomena in other sensory modalities, and studies of upstream or downstream auditory processing in AC.

Materials and methods

We first built an augmented version of the Wilson-Cowan model, consisting of one iso-frequency unit of the auditory cortex. The model consisted of one excitatory neural population and two inhibitory neural subpopulations. Importantly, the single iso-frequency unit model served as the template for all other models in this paper. By using the results and parameters from this model, we extended our results to the substantially more complex three-unit rate model and three-unit spiking models. We constrained the parameters using experimental data from the literature.

All code used to generate figures (including model simulations, numerical methods, and analysis methods) are available on GitHub at under the MIT open source license.

In figures in this manuscript, we use blue or black lines to depict Exc activity in absence of optogenetic manipulation (called “Control”); magenta for SST activity; cyan solid for PV activity; orange for Exc activity under SST suppression or activation; green for Exc activity under PV suppression or activation.

Augmented Wilson-Cowan model

We modeled a single iso-frequency unit as an augmented version of the Wilson-Cowan model [10] by including an additional inhibitory subtype. We emphasize that we drew much of our understanding of adaptation throughout this paper using this single iso-frequency unit: (1) where u(t), p(t), and s(t) represent the normalized firing rate (scaled from 0 to 1) of the excitatory population, PV inhibitory subpopulation, and SST inhibitory subpopulation, respectively (Fig 1. Note that connection strengths in the rate model are given by wij, and the spiking model are given by conductances gij, which are distinct from the thalamic depression variable g(t). We will keep these distinctions consistent throughout the text, so there is no ambiguity). An important caveat of the rate equations is that the rescaled firing rates are a consequence of dimensionless equations. All weights, parameters, optogenetic inputs, and thalamic inputs are also dimensionless in the rate model, with the only exception being time.

Fig 1.

A. Model of the PV-SST-Exc circuit (spiking model, where connection strengths are given by conductances gij. Note that the conductances with subscripts gij are distinct from g(t), the adaptation variable for the thalamic input. The rate model uses connection strengths wij). B. Input and response profiles for the single-unit rate and spiking model to a 100 ms long tone. Top: Gray: thalamic depression variable g. Blue: excitatory (Exc) neuron activity. Cyan: PV. Magenta: SST. Bottom: Thalamic input (red). C. Responses to stimulus over the first 30 s after sound onset for the different paradigms modeled in the paper. We used a high and a low inhibition mode of synaptic weights to capture the different results. For SSA (black) and forward suppression (FWS, gray), the variable is higher than threshold , resulting in a set of low inhibition parameters. Paradigms for tuning-curve adaptation (purple) and PV activation (g) asymptote at below-threshold levels, resulting in a set of high inhibition parameters. D-G responses of neurons in the spiking model to a 50ms tone. Top: raster plot; Bottom: Firing rate of Exc (D), PVs (E), SSTs (F), and thalamo-cortical input (G).

The parameters IOpt,PV(t) and IOpt,SST(t) represent the strength of PV and SST activation or inactivation, respectively, and wij and τi are synaptic weights and time constants, respectively. All time constants are τu = τp = τs = 10ms, roughly in agreement with known data [2,11]. The function f is a threshold linear function defined as (2) where the function f roughly approximates a sigmoid that converges to zero for small or negative inputs and saturates to 1 for large inputs. The parameter r = 3 determines the gain of all firing-rate functions and was chosen roughly to be the same as other modeling studies [2,12]. We included a threshold by subtracting a constant uth from the input, i.e., f(xuth) for some input x, where uth is a positive number (typically in the range from 0 to 1). In all rate model simulations, we chose Exc, PV, and SST thresholds to be uth = 0.7, pth = 1, and sth = 1, respectively. The thresholds indicate the minimum activity required for a neural population to affect postsynaptic neural populations. Because the thresholds are greater than zero, sub-threshold activity does not affect the dynamics of the network.

The input function i(t) consists of blocks of inputs with stimulus duration and interval based on the experimental paradigm. We list the stimulus parameters in Table 1, and the stimulus duration and stimulus interval for each paradigm in Table 2. Details of each paradigm are explained in the text and figures where appropriate. When an auditory input arrives into the Exc and PV populations, the default temporal profile is taken to have an instantaneous rise with amplitude q and exponential decay (Fig 1B, bottom red curve) with time constant τq = 10ms, which roughly agrees with known values [13]. The instantaneous rise and exponential decay were chosen for simplicity. The input i(t) is further modulated by a slow synaptic depression term g satisfying the standard model of synaptic depression (3) where the time constants are for replenishment and for depletion (chosen close to reported values [2,11,14,15]). The synaptic depression variable g begins at a baseline value of g0 = 1 and when i(t)>0, i.e., when an input arrives into the Exc or PV populations, g(t) decreases on the timescale determined by . Because g(t) multiplies the input i(t) to u(t) and p(t) in Eq (1), g(t) serves to modulate the strength of auditory inputs to A1. In the absence of auditory input, g(t) recovers slowly on the order of seconds determined by .

Table 1. Parameter values of rate models across paradigms.

In the optogenetic parameters, numbers without parentheses show optogenetic strengths for use in model reproductions and numbers in parentheses show optogenetic strengths for use in model predictions. Positive optogenetic numbers correspond to activation and negative numbers correspond to inactivation.

Table 2. Auditory paradigm parameters.

SSA parameters from [2]. Forward suppression (FWS) parameters from [8,22]. Tuning-curve adaptation (TCA) parameters from [7]. PV activation parameters from [9]. Optogenetic inhibition was performed 100ms before tone onset and 100ms after tone offset in SSA and FWS. In TCA and PV activation, optogenetic inhibition was turned on at the beginning of the experiment and sustained through the trial. For optogenetic parameters, see Table 2.

We based the proportional strengths of connections in the single-unit model based on existing studies on AC [16]. The within-unit connectivity is equivalently represented by the matrix, (4)

All synaptic weights wij in the single-unit rate model are constant [17], with synaptic depression appearing in the feedforward thalamic inputs [18]. The inhibitory synaptic weights were roughly chosen to agree with known connection types and connection strengths [16,19], and the excitatory connections were tuned as free parameters. The constant synapses allowed us to fully understand the model dynamics before transitioning to the more complex three-unit model with depressing and facilitating synapses.

Three-unit Rate model

Using the single-unit rate model as a template, we arranged copies into three units with lateral cortical and thalamic connections (Fig 2A. Lateral inhibitory connections are hidden in Fig 2 for clarity). This arrangement endowed our model with a gross tonotopy, which we used to explore spectrally and temporally complex auditory inputs. While the three-unit rate model appears to be substantially more complex, the parameters were strongly constrained by the single-unit rate model. In particular, we aimed for each unit of the three-unit model to mimic the excitatory and inhibitory currents of the single-unit rate model. We found that maintaining currents explained many of the known optogenetic experiments in the literature. Before turning to the spiking model, we briefly describe technical details of the parameter values and functions. Each unit behaves according to the equations (5) where, (6) and (7)

The functions Ii(t) are defined as Ik(t) = gk(t)ik(t)+g2(t)i2(t)α, for k = 1,3, and I2(t) = (g1(t)i1(t)+g3(t)i3(t))α+g2(t)i2(t). We remark that any lower-case letter i that has a subscript and is a function of time, e.g., i1(t),i2(t),i3(t), represent thalamic inputs. The time-dependent notation for these functions will always be distinct from the index i. Note that each set of equations are almost identical to the single-unit case, but with the addition of lateral terms along with facilitating and depressing terms Fi(t) and Di(t). The lateral terms are between immediate neighbors and include lateral SST to Exc (facilitating), Exc to Exc, Exc to PV. The facilitating terms Fi(t) increase from 0 to nonzero values as unit i receives inputs, and the depressing terms Di(t) decrease from 1 to lower values as unit i receives inputs. Rather than adjusting weights integrated prior to facilitating synapses, we controlled the strength of facilitation by changing the time constants in Eq (8).

Fig 2. Input and response profiles of the three-unit model.

A: The three-unit rate model of the auditory cortex, with three preferred frequencies, f1, f*, and f2 (the spiking model follows the same motif). Non-excitatory lateral connections have been hidden for clarity. B: 50ms auditory inputs are applied at each frequency in sequence. C—E: Black traces show the excitatory cortical response of the first (u1), second (u2), and third (u3) rate units, respectively. Gray traces show the slow synaptic depression. F: The traces of the thalamic inputs: f1 (gray), f* (black), and f2 (red). Each iso-frequency unit contains lateral excitatory connections where the Exc population of a given unit synapses laterally onto the neighboring Exc, PVs, and SSTs.

We chose α = 0.65, i.e., 65% of the thalamic inputs to the left or right units reach the center unit. Likewise, 65% of thalamic inputs to the center unit reach the left and right units. The function f is threshold linear (Eq 2). The functions Ik(t) are time-dependent inputs with the strongest preference for unit k, and the profiles of i1(t), i2(t), and i3(t) are shown in Fig 2F (these profiles are the same as the profile in the single-unit model, Fig 1B, bottom). Parameters a, b control the strength of depression and facilitation and are chosen to be a = 0.5, b = 2. The parameter q controls the strength of all inputs. Each input Ij(t) is modulated by corresponding depression variables gk(t), where each gk(t) satisfies Eq 3 independently. The parameters τi are membrane time constants and chosen the same as the single-unit model, τu = τp = τs = 10ms [2,11]. The parameters wij are within-unit synaptic weights chosen according to Eq 4, while the parameters are lateral (between unit) synaptic weights. We chose = 0.667, , and to reflect the generally weaker lateral synaptic strengths in auditory cortex relative to the within-unit connections [20].

We added facilitating terms Fi(t) in the Exc to SST synapses, and depressing terms Di(t) in the PV to Exc synapses [21]. The parameters a and b control the degree of depression and facilitation, respectively, where we chose a = 0.5, b = 1 (these values were not taken from the literature). The depressing parameter a was chosen so that the term (wepaDi(t)) did not change sign across experimental paradigms. The facilitating variables Fi(t) satisfy (8) where and are as in Eq 3. The reuse of the depression time constants are an intentional and simplifying choice. By using the same time scales, we were able to use the inputs ij(t) as a proxy for the excitatory activity uj(t) and simulate the facilitation variable in terms of the depressing synaptic variable as Fj(t) = 1−gj(t). Similarly, the depression variables Di(t) satisfy (9) and used the thalamic input as a proxy for excitatory activity to simulate the depression variable as Di(t) = gi(t). All depressing and facilitating timescales were chosen close to reported values [11,14,15].

The activity of the model is shown in Fig 2, in response to three successive auditory stimuli are applied in order of the frequencies f1, f*, and f2, stimulating the left, center, and right units, respectively (Fig 2C–2E). The center unit (u2(t)) responded equally well to frequencies f1 and f2 (Fig 2D), a necessary response for SSA paradigms. For simplicity, activation of an adjacent unit did not affect the thalamic variable, i.e., g1(t) was left unaffected by i2(t), g2(t) was left unaffected by i1(t) and i3(t), and g3(t) was left unaffected by i2(t). We assumed that the frequency difference between f1 and f2 was great enough that auditory inputs at f1 (f2) did not affect units responsive to f2 (f1).

When matching the model outputs to experiments in the literature, we found that there were seemingly two different parameter sets that explained all phenomena and we were unable to tweak the rate model to unify the parameters. We unified the disjoint parameter sets by incorporating paradigm-dependent baseline states in the three-unit rate model. The model parameters switch between weak and strong baseline inhibition, where weak inhibition corresponds to high thalamic activity (which corresponds to the parameters in Eq (4)), and strong inhibition corresponds to relatively low thalamic activity (which corresponds to the parameters in Eq (11)). This idea was implemented using an additional facilitating variable, (10) where represents all thalamic inputs of any frequency, and the parameters are chosen to be , and . As we applied auditory inputs from particular experiments into the model, the function grew as a function of the excitatory input function I(t), and eventually saturated to different mean values based on the stimulus duration, stimulus interval, and inter-trial interval. Incidentally, we found that the facilitating variable saturated to greater values for forward suppression and SSA, and saturated to relatively lower values for PV activation and tuning-curve adaptation. The two putative disjoint parameter sets correspond exactly to these relatively high and low values of the variable . A simulation of Eq 10 is shown in Fig 1C for the various auditory paradigms, with a horizontal line shown where we chose to separate the saturation values.

If is above the threshold, which we chose to be Fth = 0.22, the system exhibits weak baseline inhibition, and the synapses take baseline values as shown in Eq 4. On the other hand, if , the synapses take the strong baseline inhibitory values (11) and the SST activity threshold, sth = 1, decreases to sth = 0. In our original model, we chose a smooth transition between these parameter sets, i.e., wep(t) = 2h(F(t))+3(1−h(F(t))), , and where (12) and r, the gain of the sigmoid h was chosen to be steep, e.g., r = 25. However, for simplicity, we replaced h with a Heaviside function and assumed that the system already reached either the weak baseline inhibition W1 (Eq 4), or the strong baseline inhibition W2 (Eq 11) based on the given experimental paradigm.

For each paradigm (with paradigm parameters shown in Table 2, we simulated Eq 10 and found that SSA and forward suppression belonged to the weak inhibitory regime ( integrated to values above threshold Fth), whereas tuning-curve adaptation and PV activation belonged to the strong inhibition regime ( integrated to values below threshold Fth). All rate models were simulated using the dynamical systems software XPP [23], called using PyXPPCALL and visualized using Python [24]. The equations were well-behaved enough that a very coarse time step of dt = 0.1 was sufficient for our purposes. We used the default integration method in XPP, the fourth-order Runge-Kutta.

Goodness of Fit and related metrics

We compared the results obtained with the model as result of optogenetic perturbations to published data qualitatively, by testing whether the model reproduces the increase/decrease of activity that was reported. The limit of large numbers yields theoretically tractable equations such as the Wilson-Cowan models we use in this paper. However, finite size effects may contribute to additional issues that the large-number limit does not address. How do we know that the observed changes would hold consistently with the inclusion of finite and noisy neural populations? To address this question, we built a spiking model constrained by the rate models above. Because the spiking model is derived directly from the simpler rate models, we generally expect that the spiking models will reproduce the rate models results for sufficiently large neural populations, however, we are able to establish that the results are consistent in the presence of noise. Moreover, we built the rate model using Python and brian2, which are both extensively documented and accessible open-source packages. Finally, we include the spiking model as a means for researchers to directly fit parameters from experimental data if needed. The rate model provides strong qualitative intuition, but does not explicitly account for single-cell interactions. Thus, the spiking model provides an additional, detailed framework for others to modify and extend beyond what the rate model may provide.

Spiking neuron dynamics

We use the single-unit rate model as a template for the spiking model, constraining the parameters while preserving the pattern of excitatory and inhibitory currents. All inhibitory neurons consisted of a single somatic compartment, while the excitatory neurons were modeled as two-compartment, “ball-and-stick” models. For each excitatory and inhibitory neuron, we modeled the somatic (ball) compartment as an adaptive exponential integrate-and-fire neuron [25,26]: (13) where the transmembrane currents are , where (14) and the sum ∑B in the synaptic current iterates over the presynaptic neurons, B∈{e,p,s}. If the presynaptic neuron B is excitatory or inhibitory, then EB =0mV or −67mV, respectively. If a synaptic connection existed from PV to Exc, we included a depression variable, (t), satisfying Eq 9, with , and : (15) where a = 1.7. As in the rate model, the parameter a was chosen such that the sign of did not change. The additional depression term was necessary to incorporate depression effects that operate well beyond the timescale of inhibitory conductances [15].

We added transmembrane noise in the form of a white noise process with zero mean and a standard deviation of 20mV to simulate intrinsic and extrinsic current fluctuations. All fixed parameters for each neuron type are shown in Table 3. The parameters that we varied manually were entirely contained in the time-dependent functions (thalamic inputs) and (optogenetic parameters). The thalamic input profile, , is determined by (16)

Where (17) where τI = 1ms, τD,Fast = 10ms, , and . The functions IA(t) (distinct from ) are square wave functions that are active for the duration of the auditory stimulus. Just as in the rate model, the thalamic input function only appears in Exc and PV neurons. The profile of the thalamic input is shown in Fig 1G. The optogenetic term, , only appears in the PV and SST equations.

Following a presynaptic spike from neuron j, the postsynaptic effect on neuron i appears as an instantaneous spike in the postsynaptic conductance gij(t)→gij(t)+gij,max/nX, where gij,max is given by Eq 18, and X stands for the presynaptic neuron type (Exc, PV, or SST). The magnitude of the conductances were chosen to have the same proportion as reported values [16], with the same type of connectivity structure as in the rate model.


In the absence of presynaptic spikes, the conductances gij decay exponentially to zero: (19) where τij = 1ms for all synapses except for the time constants from excitatory to PVs, τpe = 25ms, and excitatory to SSTs, τse = 15ms [27]. In the spiking model, we switched to the weak inhibitory regime by decreasing the inhibitory inputs into Exc from gep,max = 40 and ges,max = 20 to gep,max = 38 and ges,max = 19.

For excitatory neurons (A = e), the transmembrane currents are , where (20)

The term F(t) is a dimensionless slow timescale facilitation variable that depends on the thalamic drive, and satisfies Eq 8 (just as in depression, the additional slow timescale allows the model to operate on multiple timescales [15]). The parameter b = 3 modulates the facilitation strength, and , and . For simplicity, we allowed Fi(t) to vary continuously over time. The variable w represents spike-frequency adaptation and obeys (21)

The dynamics of the dendritic (stick) compartment obey (22) where the parameter κ = 0.3 is the ratio of somatic to total surface area [26].

For PV and SST interneurons, the equations are the same as Exc except that there is no dendritic component. Parameters differ marginally between PV and SST neurons (see Table 3). SSTs, unlike PVs, have no incoming synaptic connections from the thalamus and only receive excitatory inputs from Exc. PVs and SSTs include the optogenetic stimulus term , and as mentioned above, only Exc and PVs contain the thalamic input term . These connections reflect the choices made in the rate model.

Three-unit Spiking model

We introduced the gross tonotopy into the spiking model by copying the single unit spiking model into three units with lateral excitatory connections (Fig 2A). As in the 3-unit rate model, the thalamic inputs of the 3-unit spiking model have weaker lateral connections. For tone responses at frequency f1 and f2, the center unit receives an input of amplitude proportional to 0.85 that of the left and right units, respectively.

The spiking model contains 1600 Exc, 200 PVs, and 200 SSTs. For connection probabilities within units, we chose EE connections to have probability pEE = 0.1 and all other probabilities to be the same, pEE = pES = pPE = pPP = pPS = pSE = 0.6. For lateral connection probabilities, we chose p = 0.1.

The spiking model was constructed using Brian2 [28].


Differential effects of interneuron suppression in stimulus-specific adaptation

Almost all neurons (95%) in AC exhibit stimulus-specific adaptation, a phenomenon in which neurons reduce their response selectively to the inputs that is presented frequently in the stimulus (standard tone), while preserving the initial strong response to the less frequent input (deviant, or oddball tone) [1]. Previous studies found that following a presentation of the deviant tone, the excitatory neurons adapt over successive presentations of the standard [1,29]. Similar adaptation was observed in the songbird [30,31]. This phenomenon was largely attributed to feedforward thalamo-cortical depressing synapses [17,32], but such models could not account for the full range of the effects that were observed [12]. A recent study found that inhibitory neurons exhibit differential control over SSA [2]. Suppressing SSTs resulted in disinhibition of the excitatory neurons, such that disinhibition increased with successive presentations of the standard tone, but not the deviant. By contrast, PV inhibition drove equal amount of disinhibition of excitatory neurons in response to both the deviant and the standard. These results suggest that SST inhibition controls adaptation level of excitatory neurons.

In order to understand the roles of inhibitory interneurons in modulating SSA responses, we tested whether the single-unit model could reproduce the differential effects of suppressing PVs and SSTs in temporal adaptation. This paradigm involved the application of 8 successive tones to the single iso-frequency circuit with constant synapses and depressing thalamic inputs (Fig 3, Eq 1). We then tuned the parameters of the model in order to achieve the qualitative results for SSA [2]. We simulated optogenetic suppression of inhibitory neurons by defining the functions IOpt,PV(t) and IOpt,SST(t) to turn on 100ms before tone onset and turn off 100ms after tone onset, thereby inhibiting the activity of PV or SST neurons as performed in the experimental paradigms. The degree of PV and SST inhibition was chosen as a free parameter, and in the case of this single-unit rate model, had the dimensionless values of IOpt,PV(t) = −2 during PV suppression and IOpt,SST(t) = −1 during SST suppression. Through this suppression of inhibition, we sought to reproduce the constant disinhibition from PV inactivation, and increasing disinhibition from SST inactivation. Once the model reproduced this qualitative result, we discovered that the key mechanisms involve the temporal structure of the responses: PVs exhibit a temporally fast tone-evoked response and peak earlier than Exc and SSTs, while SSTs exhibit a temporally delayed and broad tone-evoked response (Fig 1B, Fig 3B top left plot), which both agreed with earlier studies [2]. The SST delay is not hard-coded, but the result of SSTs receiving indirect thalamic excitation through the Exc population [16].

Fig 3. The effect of optogenetic manipulations on adaptation to repeated tones.

A. Stimulus of repeated tones, with or without concurrent laser stimulation. B. Left column: Circuit diagram specifying the inactivation of populations. Top row: no stimulation. Responses of Exc (blue), PV (cyan) and SST (magenta) populations to the first (middle column) and last (right column) tones. Middle row: responses during PV suppression. Green: Responses of Exc under PV suppression, gray: responses of Exc in the control condition. Bottom row: responses during SST suppression. Orange: Responses of Exc under SST suppression, gray: responses of Exc in the control condition. C. Top: Mean response of the excitatory population to the repeated tones. Bottom: difference in excitatory responses with and without stimulation. No stimulation (blue), PV suppression (green); SST suppression (orange).

The simplicity of the model allowed us to understand the mechanism underlying these changes. One important observation is the faster temporal activation of PVs relative to SSTs. With this property, excitatory activity is immediately affected by changes to PV activity and less by changes to SST activity. With PV suppression (Fig 3C middle row), PVs are reduced in activity, leading to greater Exc activity. The SSTs were indeed disinhibited due to a lack of inhibition from PVs and the greater Exc activity, but the inhibition from SST to Exc was not strong enough to compensate for changes in PV activity. Thus, the Exc population received an overall decrease in inhibitory current across all successive tones, resulting in constant disinhibition prior to and following adaptation (Fig 3D green). This result suggests that the temporally faster PV inhibition of Exc neurons is necessary for modulating excitatory activity, and only a few milliseconds of earlier PV activity results in a substantially stronger inhibitory effect relative to SSTs.

With SST inactivation, Exc activity at the first tone was virtually unaffected because the reduced SST activity resulted in PV disinhibition, and the increased PV activity resulted in no net change to the total inhibitory current entering the Exc population (Fig 3C, bottom left). PVs compensated for the reduced SST activity. Following adaptation, the overall reduced excitatory activity in both thalamus and Exc resulted in reduced PV activity and a net loss of inhibitory current in Exc. The Exc population was therefore disinhibited at the last tone (Fig 3C, bottom right). Combined, these effects drove an increase in disinhibition over successive tones (Fig 3D, orange). These simple mechanisms of disinhibition and compensation can therefore explain the complementary roles of inhibitory interneurons in shaping cortical activity.

To understand the model dynamics in the presence of inputs at multiple frequencies in the oddball paradigm, we extended the model to a rate and spiking model with three iso-frequency units, in which each microcircuit received inputs of specific preferred frequencies. The three-unit circuitry was based on the single-unit model and the free parameters tuned to reproduce the inhibitory and excitatory currents. For example, an auditory input to the left unit caused lateral excitatory and inhibitory currents to enter the center unit. These currents to the center unit were designed to be similar to the currents entering the single-unit in response to auditory stimuli. We performed the same procedure for auditory inputs to the right unit: lateral excitatory and inhibitory currents from the right were designed to enter the center unit in a manner similar to the single-unit case. Using this procedure, we extended the differential SST and PV inhibition in the single-unit model to work in the case of a tonotopy without the need for exhaustive parameter fitting.

The oddball stimulus activated the neighboring units to the center one, whose activity we measured (Stimulus described in Table 2, Fig 4A). We observe thalamic depression in the thalamus which decreases responses to the repeated activation from one unit, and large responses to the unrepeated tone. We simulated PV and SST suppression with the functions IOpt,PV(t) and IOpt,SST(t). These functions were turned on 100ms before tone onset and turned off 100ms after tone onset. In the rate model, the inhibition was dimensionless and free parameters chosen to be IOpt,PV(t) = −4 during PV inhibition and IOpt,SST(t) = −2 during SST inhibition. In the spiking model, we chose IOpt,PV(t) = −0.2nA during PV inhibition and IOpt,SST(t) = −1nA during SST inhibition (Fig 4B–4G).

Fig 4. Summary of SSA in the rate and spiking model.

A: Oddball stimulus consisted of two tones: standard tones (gray) appear with 90% probability, whereas deviant tones (red) appear with 10% probability. B, C: Average response of the excitatory population to the deviant (red outline) and subsequent standards (gray outline) without stimulation; with PV suppression (green) and with SST suppression (orange). B. Rate model. C. Spiking model. D. Additive change in response of excitatory population due to PV suppression in the rate and spiking models to the deviant (red outline) and standards (gray outline). Left: from published data. Center: rate model. Right: Spiking model. E. Additive change in response of excitatory population due to SST suppression in the rate and spiking models to the deviant (red outline) and standards (gray outline). Left: from published data. Center: rate model. Right: Spiking model. F. Predictions for the responses to the oddball stimulus with and without interneuron activation in the rate model. Left: Mean responses of the excitatory population to the deviant and subsequent standards (red/gray outline: no activation; green: PV activation; orange: SST activation). Middle: Change in excitatory neuron responses due to PV activation, Right: Identical plot as the middle panel, but for SST activation. PV activation resulted in a near-uniform decrease in FRs, whereas SST resulted in an increase in adaptation.

In the rate and spiking model, the firing rates increased uniformly across all post-deviant tones (Fig 4D and 4E). In the rate and spiking model, the firing rates exhibited an increase in disinhibition as a function of post-deviant tone number ((Fig 4F and 4G). Both results agree with existing results in SSA [2]. In order to establish the robustness of these results, we varied several parameters and measured the Common-contrast SSA Index (CSI) [12], (23) where d(fi) is the deviant rate response and s(fi)) is the standard rate response to frequency fi. For full adaptation, when the standard responses are 0, CSI = 1, indicating a high degree of SSA. If the standard responses are equal to the deviant responses, then CSI = 0, indicating a low degree of SSA.

We performed a parameter sweep with four key parameters of circuit connectivity ((Fig 5): (1) recurrent excitation (wee, (Fig 5A, 5B, 5E and 5F), a key parameter considered in many studies [12], especially those related to inhibitory stabilized networks (ISNs) [11,26]; (2) timescale of thalamic depression (, Fig 5C, 5D, 5G and 5H) whose reported values vary over a large range, from 0.8s to 3s [2,12]; (3) the strength of PV activation or inactivation (Fig 5A, 5C, 5E and 5G); and (4) the strength of SST activation or inactivation (Fig 5B, 5D, 5F and 5H). In all cases, inactivating SSTs reduced the CSI relative to PV inactivation, reflecting the increasing disinhibition over post-deviant tones. The x-axis of each plot corresponds to the strength of the optogenetic laser. Negative values correspond to a decrease in inhibitory activity, and positive values correspond to an increase in inhibitory activity. These effects may be more clearly seen in the model equations (Eqs 1, 5 and 13), where we add the terms IOpt,PV(t) and IOpt,SST(t).

Fig 5.

Predicted effects of the key parameters on SSA index (CSI) for the rate model (A-D) and spiking model (E-H). Control parameters are denoted by white circles, PV and SST inactivation parameters are denoted by white triangles, and PV and SST activation parameters are denoted by white squares. The control parameter values, IOpt,PV = IOpt,SST = 0, are denoted by white circles. A, E: PV optogenetic parameter vs recurrent excitation (wee). B, F: SST optogenetic parameter vs recurrent excitation. C, G: PV optogenetic parameter vs thalamic depression time constant . D, H: SST optogenetic parameter vs thalamic depression time constant. White regions in all subfigures denote areas where the firing rate (FR) of the standard tone is too low (FR<0.1), or where the excitatory response saturates, making CSI measurements impossible.

This analysis reveals robustness in parameter ranges for given optogenetic modulation strengths. The CSI in the control case (white circle) changed little when the parameters wee and were varied for a given optogenetic strength. In other words, for a fixed value on the x-axis, changing positions in the vertical direction on each plot did not change the CSI index significantly for a nontrivial range of parameter values. This observation suggests that the cortical model can operate in a broad parameter regime, and precise parameter values may not be important for normal function. In extreme cases, decreasing recurrent excitation removed the decrease in CSI following SST inactivation (Fig 5B), suggesting that sufficient recurrent excitation is an important factor in generating responses in the SSA paradigm. Second, while increasing optogenetic inhibition had little effect on the CSI, increasing optogenetic activation showed an increase in CSI in all cases (CSI = 0.35 for PV activation and CSI = 0.31 for SST activation). Therefore, we predicted that optogenetic activation of PVs and SSTs will generally improve context-dependent cortical responses.

Like the rate model, the spiking model exhibited little sensitivity to changes in wee and . However, the spiking model showed almost no dependence on recurrent excitation wee in the case of SST inactivation (Fig 5E). This effect is likely due to the differences in connectivity between the rate and spiking models. In the rate model, lateral connections depend entirely on excitatory activity, thus SSA results in the rate model are more sensitive to changes in recurrent excitation. In the spiking model, recurrent excitation plays a less important role because the lateral connection probabilities are low (p = 0.1), whereas the connection probabilities within units are high (p = 0.6).

The three-unit model was developed to reproduce the compensating mechanisms of the single-unit model: PV suppression results in constant disinhibition for repeated tones, and SST suppression results in a compensating effect from PVs before adaptation that weakens as adaptation strengthens. These differential roles explain experimental data in the SSA paradigm to a remarkable degree. We then asked whether this simple mechanism is sufficient to reproduce additional optogenetic experiments. For the remainder of the paper, we use the three-unit rate model with no parameter modifications except for the changes in the inhibition modes and the auditory inputs that depend on the experimental paradigm (Table 2, Fig 1C, and Eq 10).

Differential effects of inhibitory neuron manipulation on cortical forward suppression

Context dependence of auditory responses has been revealed on many time scales. In a well-studies phenomenon termed “forward suppression”, the responses of AC neurons to a tone are suppressed if the tone is preceded by another tone, but the level of suppression depends on the frequency difference between the two tones (Fig 6). In the experiment, the first tone, called the masker, varies in frequency between trials, while the second tone, called the probe, remains fixed at the preferred frequency of the neuron. This phenomenon was explained by feedforward depression, but the inhibitory neurons were recently shown to also control forward suppression [8]. PV inactivation (orange) concurrent with the auditory stimulus resulted in a selective increase in forward suppression at the preferred frequency relative to the control case (blue), whereas SST inactivation (green) reduced forward suppression at the preferred frequency relative to the control case (blue) (Fig 4B second row) [8].

Fig 6. Forward suppression in the rate and spiking model.

A. The stimulus consisted of pairs of tones activating either neighboring or the same iso-frequency units. The laser was presented continuously throughout stimulation trials. B,C,D, E: Responses of excitatory neurons to the probe tone as a function of the frequency of the masker. Blue: control. B, C: Top row: Schematic of results from Phillips et al., 2017. Middle row: Results from the rate model. Bottom row: Results from the spiking model. B: Results of PV suppression (green). C: Results of SST suppression (orange). D,E: Rate model prediction for forward suppression during PV and SST activation. PV and SST activation resulted in enhanced forward suppression. D: Results from PV activation. E. Results from SST activation.

We used the same parameters for connectivity within the circuit as with SSA to reproduce the experimental findings, with only slight changes to the input strength (q = 1.3). The stimuli used in the forward suppression paradigm place the baseline state in the strong inhibitory regime (Fig 1C). Both the rate (Fig 6A middle, 6B middle) and spiking models (Fig 6B bottom 6B bottom) yielded the existence of experimentally observed differential effects for PV (Fig 6A) and SST inactivation (Fig 6B): PV inactivation drove a selective decrease in responses whereas SST inactivation drove a suppression of excitatory neuronal responses. We do not expect the scales between the rate and spiking models to match precisely, and we only sought to match the existence of the differential phenomena. Functions IOpt,PV(t) and IOpt,SST(t) were turned on for the duration of the simulation to match the experimental protocol. In the rate model, the inhibition was dimensionless and free parameters chosen to be IOpt,PV(t) = −4 during PV inhibition, IOpt,PV(t) = 0.5 during PV activation, IOpt,SST(t) = −2 during SST inhibition, and IOpt,SST(t) = 1.2 during SST activation. In the spiking model, we chose IOpt,PV(t) = −0.2nA during PV suppression, IOpt,PV(t) = during PV activation, IOpt,SST(t) = −1nA during SST inhibition, and IOpt,SST(t) = −1nA during SST activation.

At first glance, this result seems paradoxical given that PV suppression generally results in excitatory disinhibition as shown in the adaptation and SSA results (Figs 3 and 4), but the underlying mechanism is straightforward to understand. Following PV suppression, excitatory activity is indeed disinhibited, but the firing rate function of the excitatory population saturates. This behavior means that higher activity neurons are generally disinhibited less strongly than lower activity neurons. Thus, upon receiving the second tone, the input received by the excitatory population is weaker due to thalamic depression, but the disinhibition is greater relative to the disinhibition in the first tone. This phenomenon also requires a special property in the thalamic depression variable g(t), namely that it cannot depress too quickly during the first time or else the excitation in the second tone will be too weak, and although it recovers slowly, it recovers enough such that the input from the second tone is still somewhat strong. In the case of SST suppression, PVs compensate for the loss of inhibition in the first tone, but lose the ability for compensation in the second tone, so Exc are able to respond more strongly relative to the control case. Thus, forward suppression is weakened.

Next, we tested the effects of activating PVs or SSTs (as could be done with ChR2 experimentally) on model responses. The model predicted that both PV and SST activation will result in an increase of forward suppression across preferred and sideband frequencies (Fig 6D and 6E).

Differential adaptation to repeated tones along the frequency response function

Neurons in A1 adapt to repeated tones [2]. This adaptation is proportional to the strength of their tone-evoked responses: it is stronger in the center of the frequency response function, and weaker for the sidebands [7]. A recent study found that PVs and SSTs exert a differential effect on this form of adaptation: Suppressing PVs drives disinhibition selective to the sidebands in the adapted state, whereas suppressing SSTs drives disinhibition both in the center and at the sidebands of the frequency response function of excitatory neurons [7]. To understand how inhibitory neurons affect adaptation across different frequency-tuned inputs, we presented a sequence of 8 tones at each frequency to generate adapting tuning curves (Fig 7A), and repeated this stimulus with PV and SST suppression for the model circuit. We found that this auditory paradigm resulted in a below-threshold integration of , so the system switched to a state of strong baseline inhibition (and importantly, the model did not respond in precisely the same way as in SSA and forward suppression).

Fig 7. Adaptation to repeated tones along the frequency response function.

A. The stimulus consisted of a sequence of repeated tones, presented to each iso-frequency unit. On stimulation trials, the laser overlapped with the sound stimulus. B, D, F, H, J, L, N, P: The responses of excitatory units to the first (left) and last tone (right) as a function of the distance in frequency between the unit and the stimulus without (blue) and with (green: PV suppression; orange: SST suppression) stimulation. C, E, G, I, K, M, O, Q: The response of excitatory neurons to tone 1 (left) and tone 8 (right) on light on and light off trials. The control lines have unit slope because light on and light off in an experimental condition yields no changes to the firing rate. Thicker lines in Fig 7E, 7G, 7K and 7M represent the peak excitatory responses from the first and last simulations taken directly from the simulations, whereas thinner lines are linear extrapolations to assist the visual comparison to the control line (blue). B,C. Experimental results, PV suppression. D,E: Rate Model, PV suppression. F,G. Spiking model, PV suppression. H,I. Experimental results, SST suppression. J,K. Rate model, SST suppression. L,M. Spiking model, SST suppression. N,O. Rate model, PV activation. P,Q. Rate model, SST activation.

Our model reproduced the differential experimental effects of PV and SST suppression (Fig 7). In the rate model before adaptation, PV and SST inactivation resulted in sideband disinhibition with little to no disinhibition at the preferred frequency (Fig 7D and 7J eft). After adaptation, PV inactivation resulted in sideband disinhibition and no preferred frequency disinhibition (Fig 7D right), whereas SST inactivation resulted in disinhibition across all sideband and preferred frequencies (Fig 7J right). The spiking model closely mirrored these results (Fig 7F and 7L). The ratio of excitatory responses between light off and light on trials summarize the degree of sideband and preferred frequency disinhibition (Fig 7E, 7G, 7K and 7M).

The mechanisms behind these results involve synaptic facilitation and depression and the compensating mechanism discussed in the earlier sections for SSA and adaptation. In the case of PV suppression, SSTs were the only interneurons capable of contributing to Exc inhibition, so only Exc-SST interactions drove the observed effects. In particular, lateral SST to Exc synapses suppressed the center unit over each tone, and facilitation allowed this suppression to persist throughout adaptation. Note that this preferred-frequency effect was not observed in SSA because we never directly stimulated the center unit. Next, in the case of SST suppression, the increasing disinhibition with adaptation at the preferred frequency was a consequence of the same compensating mechanism as in SSA. The tuning curves shown in Fig 7B and 7H are representative cartoons of the changes in the tuning curves observed quantitatively [7], but we aimed to reproduce the light-on and light-off curves (Fig 7C and 7I) using our model. The comparison of the model tuning curves to the cartoons from the experiments appear to be significantly different at the sidebands, but in fact our model outputs closely match the desired lines in Fig 7C and 7I.

Turning to predictions, we found that before adaptation, PV activation resulted in a slight decrease at the preferred frequency, whereas SST inactivation reduced overall firing rates across all frequencies (Fig 7N and 7O). After adaptation, PV and SST activation resulted in a subtractive effect. General optogenetic activation and inactivation of PVs and SSTs modulated tuning-curves in combinations of additive, subtractive, multiplicative, and divisive effects [3]. Our model reproduced one of the key results in these studies as well, since PV and SST inactivation were found to have additive and divisive effects on the frequency response functions of excitatory neurons (Fig 7E, 7G, 7K, 7M, 7O and 7Q).

Similar to SSA, the functions IOpt,PV(t) and IOpt,SST(t) were turned on 100ms before tone onset and turned off 100ms after tone onset. In the rate model, the inhibition was dimensionless and free parameters chosen to be IOpt,PV(t) = −0.5 during PV inhibition, IOpt,PV(t) = 1.2 during PV activation, IOpt,SST(t) = −1 during SST inhibition, and IOpt,SST(t) = 0.1 during SST activation. In the spiking model, we chose IOpt,PV(t) = −1nA during PV inhibition and IOpt,SST(t) = −1nA during SST inhibition.

PVs enhance feedforward functional connectivity

Cortical neurons in AC receive inputs from the thalamic auditory nuclei. As the result, neuronal responses in the cortex are correlated with neuronal firing in the thalamus. These interactions can be captured using an Ising model to measure the connection from the thalamus to the cortex. When PVs were activated, the functional coupling between cortical and thalamic responses [9] became stronger. The specific mechanism underlying this change is unknown.

Using the three-unit model, we identified a candidate mechanism for the enhanced thalamo-cortical correlation following PV activation. We assumed that the feedforward functional connection from the thalamus to the cortex is the same as the anatomical connection, so thalamic inputs directly modulated cortical responses in our model. Following an increase in inhibition, cortical responses became sharper, thus aligning more closely with thalamic inputs and improving feedforward functional connectivity (Fig 8).

Fig 8. Activation of PVs enhanced feedforward connectivity in the model.

A. Stimulus was a single tone accompanied by a laser on stimulation trials presented at 0.1 s. B. Top: Cortical excitatory population responses to tones without (blue) and with PV stimulation (green) in the rate model. Bottom: Thalamic input (red). C. Top: Cortical excitatory population responses to tones without (blue) and with PV stimulation (green) in the spiking model. D: Rate model prediction for effects of PV inactivation.

PV activation (green) in the rate model resulted in an increase in the Pearson correlation between the control (blue) and thalamic inputs (red), from 0.77 and 0.83 (Fig 8B). Thus, whereas inhibitory activation decreased the overall firing rate, the response became more synchronized to the thalamic inputs, resulting in an increase in feedforward functional connectivity. In the spiking model, PV activation resulted in a delayed response of excitatory activity, but we were interested in tested whether PV-activated Exc response profile resembled the thalamic activity more than the control Exc response. To make this comparison, we shifted the PV trace so that the onset of PV-activated Exc activity (green) coincided with the onset of the control curve (blue) (Fig 8C. An equivalent approach would be to measure the peak value of the cross-correlation between excitatory and thalamic activity without shifting the data in time). We observed an increase in the Pearson correlation from 0.87 in the control Exc activity to 0.95 in the PV-activated Exc activity, thus demonstrating a sharpening of excitatory responses, and an increase in feedforward functional connectivity. As in the previous paradigms, we only aimed to show the existence of changes without requiring the magnitude of change to match the experimental data.

These results provide for a simple plausible mechanism for enhanced feedforward functional connectivity: as inhibition reduces the overall cortical inputs, cortical responses better synchronize to thalamic inputs, resulting in stronger correlated activity. We remark that the classic approach to the problem of establishing proper functional connectivity relies on sophisticated models such as Ising models [33], expectation-maximization [34], and subspace identification [35], because direct calculations of correlations can lead to false positives when the anatomical connections are not known [9]. Our model has explicit anatomical connections, which eliminates the problem of false-positive correlations. Thus, in this case, the use of the correlation serves as a reliable proxy for the Ising model.

Balanced networks

Multiple studies postulated that excitatory and inhibitory currents are matched in cortical circuits, contributing to stability of circuit function [36,37]. Therefore, we tested whether our model operated as a balanced network. To make this measurement, we took ratios of inhibitory and excitatory currents during suprathreshold activity, because our model ignores virtually all subthreshold activity. Interestingly, with no tuning, our models showed evidence of operating as a balanced network: as we varied the input strength to the rate and spiking models (Fig 9C and 9F), the ratio of excitatory to inhibitory inputs to the excitatory population (Fig 9B and 9E) remained constant. The rate model had an excitatory/inhibitory ratio of 0.37 (Fig 9B), and the spiking model had an excitatory/inhibitory ratio of 2.5 (Fig 9E). These results suggest that excitatory-inhibitory balance a robust, emergent feature of cortical networks. The large differences in scales are to be expected when comparing a dimensional and dimensionless model.

Fig 9.

Excitatory-inhibitory balance in the rate and spiking models A. Plot of incoming excitatory and inhibitory currents into the Exc population as a function of different input strengths (C). Darker currents correspond to stronger inputs B. A best-fit line (dashed) accurately captures the ratio of excitatory and inhibitory responses, implying excitatory-inhibitory balance. Equivalent results for the spiking model are shown in panels D, E, and F.


A wealth of recent studies provide evidence for distinct function of different types of cortical inhibitory neurons in temporal processing of auditory information. The studies demonstrate that different types of inhibitory neurons, SSTs and PVs, play a differential role in auditory processing, controlling adaptation at different time scales and contexts, and changes to feedforward functional connectivity. Our goal was to integrate the results of these studies to understand whether the observed effects were due to a small set of mechanisms.

We built an idealized rate and spiking model that reproduced multiple key results from studies that tested the function of specific inhibitory opsins in specific cells in the auditory cortex. In addition to including different baseline states that modulate the strength of PV-to-Exc and SST-to-Exc synapses, the key mechanisms underlying our models included the fast temporal activation of PVs, the delayed, broad temporal activation of SSTs, the ability for PVs to compensate for weakened SST activity, and dynamic synapses including SST-to-Exc facilitation. These interactions accounted for the differential modulation of cortical responses by interneuron subtypes and suggests that a simplified set of mechanisms can support experimental results.

To reproduce the differential function of SSTs and PVs in stimulus-specific adaptation, we built a model loosely based on multiple existing models for SSA and multiple configurations of spiking neuron populations, consisting of inhibitory and excitatory neurons. Previously, a two-layer rate model with synaptic depression was proposed to establish the relationship between the cortical response and the parameters in SSA experiments, such as stimulus frequency differences, probability of deviation, and tone presentation rate. Yarden et al. (2017) successfully used a multi-unit rate model arranged in a coarse tonotopy consisting of inhibitory and excitatory populations to reproduce general deviance detection, but model has not yet been adapted to explain differential interneuron modulation. Another existing model of SSA including differential inhibitory modulation demonstrating similar differential inhibitory effects as in our SSA result (Fig 4), but did not include a tonotopy [2]. These models only included one type of inhibitory neuron type or did not include tonotopy, and therefore could not account for the observed differential effects of suppression of SSTs and PVs on SSA across multiple frequencies. In the present study, we developed a simple rate and spiking model that accounted for multiple inhibitory cell types and which faithfully reproduced the differential effects of SST and PV inactivation in SSA (Fig 4). In addition, a parameter sweep revealed that both the rate and spiking models were robust to large changes in key parameters commonly explored in the literature, suggesting that SSA is a robust phenomenon [12].

Existing models that reproduce the enhanced forward suppression from PV inactivation and the reduced forward suppression from SST inactivation (Fig 6) include multiple layers that require both depression and facilitation [8], or rely on depressing recurrent excitation and do not distinguish between inhibitory subtypes [4]. We incorporated depression and facilitation in the model synapses and reproduced the former results with only a single layer, suggesting a surprisingly simple mechanism supporting forward suppression. Furthermore, the models in the present study reproduced tuning-curve adaptation effects previously observed experimentally but not computationally: SSTs exhibited strong preferred-frequency disinhibition following adaptation, while PV disinhibition is independent of the degree of adaptation (Fig 7) [7]. These results suggest that the underlying mechanism(s) of the model, namely the PV/SST compensation effect, combined with the facilitating SST-to-Exc synapse, may serve as a general mechanism of adaptation. In addition, our models reproduced changes in feedforward functional connectivity (Fig 8). By increasing PV activity in the models, excitatory activity decreased but became more time-locked to thalamic inputs. This effect agreed with observations in the cortex, where PV activation resulted in enhanced feedforward functional connectivity [9]. The effects of inhibition on sharpening cortical responses have been well-established, thus our models serve as plausible mechanisms for this change [3840]. Finally, our models were shown to operate as a balanced network, where inhibitory and excitatory currents entering neural populations were shown to maintain a consistent ratio across input strengths, suggesting a generality to the theory of balanced networks.

One drawback of the model is that it does not feature population spikes, which explain many fundamental cortical responses in AC [38]. In future work, we will seek to reconcile the differences between our models and the population spike model of SSA [4]. Establishing the importance of depression and facilitation in different synapses and extending our model to include population spikes warrants further study.

Although we do not explore simultaneous auditory stimuli in this study, it is worth mentioning the response properties of the network due to recent interest in supralinear network models (Rubin et al. 2016). Throughout this paper, neurons operate in a linear manner when above threshold: neurons add inputs linearly, until the maximum rate is reached in the rate models, or until the refractory period saturates spiking rates in the spiking model. The models do not use sub-threshold responses to modulate population activity.

Multiple studies from different laboratories revealed the differential effect of distinct inhibitory neurons in auditory processing. We show that a minimalistic model, built on simple mechanisms, is capable of reproducing disparate results in the literature. As inhibitory neurons form similar circuits throughout the mammalian cortex, this model can be readily adapted to test their function and generate predictions (with adjustments for local changes in connectivity) in different sensory modalities.


The authors thank the members of the Geffen laboratory and the Hearing Research Center at the University of Pennsylvania, Richard Gao, Auguste Schulz, Hedyeh Rezaei, Jason Kim and Lia Papadopoulos, for insightful comments on the manuscript and discussions.


  1. 1. Ulanovsky N, Las L, Nelken I. Processing of low-probability sounds by cortical neurons. Nat Neurosci. 2003;6: 391. pmid:12652303
  2. 2. Natan RG, Briguglio JJ, Mwilambwe-Tshilobo L, Jones SI, Aizenberg M, Goldberg EM, et al. Complementary control of sensory adaptation by two types of cortical interneurons. Elife. 2015;4.
  3. 3. Phillips EAK, Hasenstaub AR. Asymmetric effects of activating and inactivating cortical interneurons. Elife. 2016;5.
  4. 4. Loebel A, Nelken I, Tsodyks M. Processing of sounds by population spikes in a model of primary auditory cortex. Front Neurosci. 2007;1: 15.
  5. 5. Fino E, Packer AM, Yuste R. The logic of inhibitory connectivity in the neocortex. Neurosci. 2013;19: 228–237.
  6. 6. Isaacson JS, Scanziani M. How inhibition shapes cortical activity. Neuron. 2011;72: 231–243. pmid:22017986
  7. 7. Natan RG, Rao W, Geffen MN. Cortical interneurons ensure maintenance of frequency tuning following adaptation. bioRxiv. 2017; 172338.
  8. 8. Phillips EAK, Schreiner CE, Hasenstaub AR. Cortical interneurons differentially regulate the effects of acoustic context. Cell Rep. 2017;20: 771–778. pmid:28746863
  9. 9. Hamilton LS, Sohl-Dickstein J, Huth AG, Carels VM, Deisseroth K, Bao S. Optogenetic activation of an inhibitory network enhances feedforward functional connectivity in auditory cortex. Neuron. 2013;80: 1066–1076. pmid:24267655
  10. 10. Wilson HR, Cowan JD. Excitatory and inhibitory interactions in localized populations of model neurons. Biophys J. 1972;12: 1–24. pmid:4332108
  11. 11. Tsodyks M V, Skaggs WE, Sejnowski TJ, McNaughton BL. Paradoxical effects of external modulation of inhibitory interneurons. J Neurosci. 1997;17: 4382–4388. pmid:9151754
  12. 12. Yarden TS, Nelken I. Stimulus-specific adaptation in a recurrent network model of primary auditory cortex. PLoS Comput Biol. 2017;13: e1005437. pmid:28288158
  13. 13. Rankin J, Sussman E, Rinzel J. Neuromechanistic model of auditory bistability. PLoS Comput Biol. 2015;11: e1004555. pmid:26562507
  14. 14. Abbott LF, Varela JA, Sen K, Nelson SB. Synaptic Depression and Cortical Gain Control. Science (80-). 1997;275: 221–224. pmid:8985017
  15. 15. Wehr M, Zador AM. Synaptic mechanisms of forward suppression in rat auditory cortex. Neuron. 2005;47: 437–445. pmid:16055066
  16. 16. Pfeffer CK, Xue M, He M, Huang ZJ, Scanziani M. Inhibition of inhibition in visual cortex: the logic of connections between molecularly distinct interneurons. Nat Neurosci. 2013;16: 1068. pmid:23817549
  17. 17. Mill R, Coath M, Wennekers T, Denham SL. A neurocomputational model of stimulus-specific adaptation to oddball and Markov sequences. PLoS Comput Biol. 2011;7: e1002117. pmid:21876661
  18. 18. Lee CC, Sherman SM. Synaptic properties of thalamic and intracortical inputs to layer 4 of the first-and higher-order cortical areas in the auditory and somatosensory systems. J Neurophysiol. 2008;100: 317–326. pmid:18436628
  19. 19. Womelsdorf T, Valiante TA, Sahin NT, Miller KJ, Tiesinga P. Dynamic circuit motifs underlying rhythmic gain control, gating and integration. Nat Neurosci. 2014;17: 1031. pmid:25065440
  20. 20. Levy RB, Reyes AD. Spatial profile of excitatory and inhibitory synaptic connectivity in mouse primary auditory cortex. J Neurosci. 2012;32: 5609–5619. pmid:22514322
  21. 21. Beierlein M, Gibson JR, Connors BW. Two dynamically distinct inhibitory networks in layer 4 of the neocortex. J Neurophysiol. 2003;90: 2987–3000. pmid:12815025
  22. 22. Brosch M, Schreiner CE. Time course of forward masking tuning curves in cat primary auditory cortex. J Neurophysiol. 1997;77: 923–943. pmid:9065859
  23. 23. Ermentrout B. Simulating, analyzing, and animating dynamical systems: a guide to XPPAUT for researchers and students. SIAM; 2002.
  24. 24. Rossum G. Python Reference Manual. Amsterdam, The Netherlands, The Netherlands: CWI (Centre for Mathematics and Computer Science); 1995.
  25. 25. Brette R, Gerstner W. Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. J Neurophysiol. 2005;94: 3637–3642. pmid:16014787
  26. 26. Litwin-Kumar A, Rosenbaum R, Doiron B. Inhibitory stabilization and visual coding in cortical circuits with multiple interneuron subtypes. J Neurophysiol. 2016;115: 1399–1409. pmid:26740531
  27. 27. Markram H, Wang Y, Tsodyks M. Differential signaling via the same axon of neocortical pyramidal neurons. Proc Natl Acad Sci. 1998;95: 5323–5328. pmid:9560274
  28. 28. Stimberg M, Brette R, Goodman DFM. Brian 2: an intuitive and efficient neural simulator. bioRxiv. 2019.
  29. 29. von der Behrens W, Bäuerle P, Kössl M, Gaese BH. Correlating stimulus-specific adaptation of cortical neurons and local field potentials in the awake rat. J Neurosci. 2009;29: 13837–13849. pmid:19889995
  30. 30. Bell BA, Phan ML, Vicario DS. Neural responses in songbird forebrain reflect learning rates, acquired salience, and stimulus novelty after auditory discrimination training. J Neurophysiol. 2015. pmid:25475353
  31. 31. Lu K, Vicario DS. Familiar but unexpected: Effects of sound context statistics on auditory responses in the songbird forebrain. J Neurosci. 2017. pmid:29118103
  32. 32. Mill R, Coath M, Wennekers T, Denham SL. Characterising stimulus-specific adaptation using a multi-layer field model. Brain Res. 2012;1434: 178–188. pmid:21955728
  33. 33. Ganmor E, Segev R, Schneidman E. The architecture of functional interaction networks in the retina. J Neurosci. 2011;31: 3044–3054. pmid:21414925
  34. 34. Turaga S, Buesing L, Packer AM, Dalgleish H, Pettit N, Hausser M, et al. Inferring neural population dynamics from multiple partial recordings of the same neural circuit. Advances in Neural Information Processing Systems. 2013. pp. 539–547.
  35. 35. Nonnenmacher M, Turaga SC, Macke JH. Extracting low-dimensional dynamics from multiple large-scale neural population recordings by learning to predict correlations. Advances in Neural Information Processing Systems. 2017. pp. 5702–5712.
  36. 36. Denève S, Machens CK. Efficient codes and balanced networks. Nature Neuroscience. 2016. pmid:26906504
  37. 37. Zhou S, Yu Y. Synaptic E-I balance underlies efficient neural coding. Frontiers in Neuroscience. 2018. pmid:29456491
  38. 38. Wehr M, Zador AM. Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature. 2003;426: 442. pmid:14647382
  39. 39. Cardin JA, Carlén M, Meletis K, Knoblich U, Zhang F, Deisseroth K, et al. Driving fast-spiking cells induces gamma rhythm and controls sensory responses. Nature. 2009;459: 663. pmid:19396156
  40. 40. Sohal VS, Zhang F, Yizhar O, Deisseroth K. Parvalbumin neurons and gamma rhythms enhance cortical circuit performance. Nature. 2009;459: 698. pmid:19396159