A circuit model of auditory cortex

The mammalian sensory cortex is composed of multiple types of inhibitory and excitatory neurons, which form sophisticated microcircuits for processing and transmitting sensory information. Despite rapid progress in understanding the function of distinct neuronal populations, the parameters of connectivity that are required for the function of these microcircuits remain unknown. Recent studies found that two most common inhibitory interneurons, parvalbumin- (PV) and somatostatin-(SST) positive interneurons control sound-evoked responses, temporal adaptation and network dynamics in the auditory cortex (AC). These studies can inform our understanding of parameters for the connectivity of excitatory-inhibitory cortical circuits. Specifically, we asked whether a common microcircuit can account for the disparate effects found in studies by different groups. By starting with a cortical rate model, we find that a simple current-compensating mechanism accounts for the experimental findings from multiple groups. They key mechanisms are two-fold. First, PVs compensate for reduced SST activity when thalamic inputs are strong with less compensation when thalamic inputs are weak. Second, SSTs are generally disinhibited by reduced PV activity regardless of thalamic input strength. These roles are augmented by plastic synapses. These roles reproduce the differential effects of PVs and SSTs in stimulus-specific adaptation, forward suppression and tuning-curve adaptation, as well as the influence of PVs on feedforward functional connectivity in the circuit. This circuit exhibits a balance of inhibitory and excitatory currents that persists on stimulation. This approach brings together multiple findings from different laboratories and identifies a circuit that can be used in future studies of upstream and downstream sensory processing.


Introduction
Detecting sudden changes in the acoustic environment and extracting relevant acoustic features from noise are important computations for auditory navigation and scene analysis. The mammalian auditory cortex (AC) is a key region for processing temporally patterned sounds [1]. Neurons in AC exhibit adaptation to repeated tones, which may be selective for an overrepresented stimulus, such as in stimulus-specific adaptation, or SSA [1,2]. They furthermore exhibit forward suppression, in which a preceding stimulus masker tone drives a decrease in responses to the subsequent target tone [3,4]. How these computations are carried out by cortical circuits has been subject of extensive research.
The AC is composed of tightly coupled networks of excitatory and inhibitory neurons. Recent studies have identified the differential involvement of two distinct major classes of inhibitory neurons, parvalbumin-positive (PV) and somatostatin-positive (SST) neurons in these temporal paradigms. These neurons differ morphologically and physiologically [5,6], and recent studies found that they play differential functions in auditory processing. Specifically, SSTs, but not PVs facilitate stimulus-specific adaptation [2]. PVs and SSTs play distinct roles in adaptation to repeated tones along the frequency response function of the target neuron [7]. SSTs and PVs drive bi-directional effects on forward suppression [8]. In addition, PVs enhance feedforward connectivity in the auditory cortex [9]. These experimental results can be used to constrain currents and connections in an idealized auditory cortex model consisting of PVs, SSTs and Exc. Here, we tested whether these results can be accounted for by the same set of mechanisms.
In this paper, we build up from a simple dimensionless model consisting of one iso-frequency unit. We transition to a three-unit rate model to understand the mechanisms of inhibitory neural modulation on a gross tonotopy. We then build a detailed spiking model which incorporates the mechanisms discovered in the rate models. The rate models provide a qualitative intuition of the underlying mechanisms, and the spiking models incorporate these mechanisms in an accessible, open-source codebase for future work. We present the results on testing the three-unit rate model and the detailed spiking model in four distinct auditory paradigms.
The model accounted for observed experimental results including the differential role of SSTs and PVs in SSA [2], forward suppression [8], tuning-curve adaptation [7], and the effects of PV activation on feedforward functional connectivity [9]. We found that compensating currents between the two types of inhibitory neurons explain experimental findings of differential effects of their modulation on excitatory activity. Furthermore, the model is consistent with existing hypotheses regarding inhibitory and excitatory balance in the cortex. This framework can be used to build and test hypotheses for similar phenomena in other sensory modalities, and studies of upstream or downstream auditory processing in AC.

Materials and methods
We first built an augmented version of the Wilson-Cowan model, consisting of one iso-frequency unit of the auditory cortex. The model consisted of one excitatory neural population and two inhibitory neural subpopulations. Importantly, the single iso-frequency unit model served as the template for all other models in this paper. By using the results and parameters from this model, we extended our results to the substantially more complex three-unit rate model and three-unit spiking models. We constrained the parameters using experimental data from the literature.
All code used to generate figures (including model simulations, numerical methods, and analysis methods) are available on GitHub at https://github.com/geffenlab/park_geffen under the MIT open source license.
In figures in this manuscript, we use blue or black lines to depict Exc activity in absence of optogenetic manipulation (called "Control"); magenta for SST activity; cyan solid for PV activity; orange for Exc activity under SST suppression or activation; green for Exc activity under PV suppression or activation.

Augmented Wilson-Cowan model
We modeled a single iso-frequency unit as an augmented version of the Wilson-Cowan model [10] by including an additional inhibitory subtype. We emphasize that we drew much of our understanding of adaptation throughout this paper using this single iso-frequency unit: t u duðtÞ dt ¼ À uðtÞ þ f w ee uðtÞ À w ep pðtÞ À w es sðtÞ þ qgðtÞiðtÞÞ; � t p dpðtÞ dt ¼ À p t ð Þ þ f w pe uðtÞ À w pp pðtÞ À w ps sðtÞ þ I Opt;PV ðtÞ þ qgðtÞiðtÞ � � ; t s dsðtÞ dt ¼ À s t ð Þ þ f w se uðtÞ À w sp pðtÞ À w ss sðtÞ þ I Opt;SST ðtÞ where u(t), p(t), and s(t) represent the normalized firing rate (scaled from 0 to 1) of the excitatory population, PV inhibitory subpopulation, and SST inhibitory subpopulation, respectively (Fig 1. Note that connection strengths in the rate model are given by w ij , and the spiking model are given by conductances g ij , which are distinct from the thalamic depression variable g(t). We will keep these distinctions consistent throughout the text, so there is no ambiguity). An important caveat of the rate equations is that the rescaled firing rates are a consequence of dimensionless equations. All weights, parameters, optogenetic inputs, and thalamic inputs are also dimensionless in the rate model, with the only exception being time.
The parameters I Opt,PV (t) and I Opt,SST (t) represent the strength of PV and SST activation or inactivation, respectively, and w ij and τ i are synaptic weights and time constants, respectively. All time constants are τ u = τ p = τ s = 10ms, roughly in agreement with known data [2,11]. The function f is a threshold linear function defined as where the function f roughly approximates a sigmoid that converges to zero for small or negative inputs and saturates to 1 for large inputs. The parameter r = 3 determines the gain of all firing-rate functions and was chosen roughly to be the same as other modeling studies [2,12]. We included a threshold by subtracting a constant u th from the input, i.e., f(x−u th ) for some input x, where u th is a positive number (typically in the range from 0 to 1). In all rate model simulations, we chose Exc, PV, and SST thresholds to be u th = 0.7, p th = 1, and s th = 1, Note that the conductances with subscripts g ij are distinct from g(t), the adaptation variable for the thalamic input. The rate model uses connection strengths w ij ). B. Input and response profiles for the single-unit rate and spiking model to a 100 ms long tone. Top: Gray: thalamic depression variable g. Blue: excitatory (Exc) neuron activity. Cyan: PV. Magenta: respectively. The thresholds indicate the minimum activity required for a neural population to affect postsynaptic neural populations. Because the thresholds are greater than zero, subthreshold activity does not affect the dynamics of the network. The input function i(t) consists of blocks of inputs with stimulus duration and interval based on the experimental paradigm. We list the stimulus parameters in Table 1, and the stimulus duration and stimulus interval for each paradigm in Table 2. Details of each paradigm are explained in the text and figures where appropriate. When an auditory input arrives into the Exc and PV populations, the default temporal profile is taken to have an instantaneous rise with amplitude q and exponential decay ( Fig 1B, bottom red curve) with time constant τ q = 10ms, which roughly agrees with known values [13]. The instantaneous rise and exponential decay were chosen for simplicity. The input i(t) is further modulated by a slow synaptic depression term g satisfying the standard model of synaptic depression where the time constants are t d 1 ¼ 1500ms for replenishment and t d 2 ¼ 20ms for depletion (chosen close to reported values [2,11,14,15]). The synaptic depression variable g begins at a baseline value of g 0 = 1 and when i(t)>0, i.e., when an input arrives into the Exc or PV populations, g(t) decreases on the timescale determined by t d 2 . Because g(t) multiplies the input i(t) to u(t) and p(t) in Eq (1), g(t) serves to modulate the strength of auditory inputs to A1. In the absence of auditory input, g(t) recovers slowly on the order of seconds determined by t d 1 .
We based the proportional strengths of connections in the single-unit model based on existing studies on AC [16]. The within-unit connectivity is equivalently represented by the matrix, All synaptic weights w ij in the single-unit rate model are constant [17], with synaptic depression appearing in the feedforward thalamic inputs [18]. The inhibitory synaptic weights SST. Bottom: Thalamic input (red). C. Responses to stimulus over the first 30 s after sound onset for the different paradigms modeled in the paper. We used a high and a low inhibition mode of synaptic weights to capture the different results. For SSA (black) and forward suppression (FWS, gray), the variable � F � is higher than threshold � F � Th , resulting in a set of low inhibition parameters. Paradigms for tuning-curve adaptation (purple) and PV activation (g) asymptote at below-threshold levels, resulting in a set of high inhibition parameters. D-G responses of neurons in the spiking model to a 50ms tone. Top: raster plot; Bottom: Firing rate of Exc (D), PVs (E), SSTs (F), and thalamo-cortical input (G).

PLOS COMPUTATIONAL BIOLOGY
were roughly chosen to agree with known connection types and connection strengths [16,19], and the excitatory connections were tuned as free parameters. The constant synapses allowed us to fully understand the model dynamics before transitioning to the more complex threeunit model with depressing and facilitating synapses.

Three-unit Rate model
Using the single-unit rate model as a template, we arranged copies into three units with lateral cortical and thalamic connections ( This arrangement endowed our model with a gross tonotopy, which we used to explore spectrally and temporally complex auditory inputs. While the three-unit rate model appears to be substantially more complex, the parameters were strongly constrained by the single-unit rate model. In particular, we aimed for each unit of the three-unit model to mimic the excitatory and inhibitory currents of the single-unit rate model.   The functions I i (t) are defined as I k (t) = g k (t)i k (t)+g 2 (t)i 2 (t)α, for k = 1,3, and I 2 (t) = (g 1 (t) i 1 (t)+g 3 (t)i 3 (t))α+g 2 (t)i 2 (t). We remark that any lower-case letter i that has a subscript and is a Table 2. Auditory paradigm parameters. SSA parameters from [2]. Forward suppression (FWS) parameters from [8,22]. Tuning-curve adaptation (TCA) parameters from [7]. PV activation parameters from [9]. Optogenetic inhibition was performed 100ms before tone onset and 100ms after tone offset in SSA and FWS. In TCA and PV activation, optogenetic inhibition was turned on at the beginning of the experiment and sustained through the trial. For optogenetic parameters, see Table 2. function of time, e.g., i 1 (t),i 2 (t),i 3 (t), represent thalamic inputs. The time-dependent notation for these functions will always be distinct from the index i. Note that each set of equations are almost identical to the single-unit case, but with the addition of lateral terms along with

PLOS COMPUTATIONAL BIOLOGY
facilitating and depressing terms F i (t) and D i (t). The lateral terms are between immediate neighbors and include lateral SST to Exc (facilitating), Exc to Exc, Exc to PV. The facilitating terms F i (t) increase from 0 to nonzero values as unit i receives inputs, and the depressing terms D i (t) decrease from 1 to lower values as unit i receives inputs. Rather than adjusting weights integrated prior to facilitating synapses, we controlled the strength of facilitation by changing the time constants in Eq (8).
We chose α = 0.65, i.e., 65% of the thalamic inputs to the left or right units reach the center unit. Likewise, 65% of thalamic inputs to the center unit reach the left and right units. The function f is threshold linear (Eq 2). The functions I k (t) are time-dependent inputs with the strongest preference for unit k, and the profiles of i 1 (t), i 2 (t), and i 3 (t) are shown in Fig 2F  (these profiles are the same as the profile in the single-unit model, Fig 1B, bottom). Parameters a, b control the strength of depression and facilitation and are chosen to be a = 0.5, b = 2. The parameter q controls the strength of all inputs. Each input I j (t) is modulated by corresponding depression variables g k (t), where each g k (t) satisfies Eq 3 independently. The parameters τ i are membrane time constants and chosen the same as the single-unit model, τ u = τ p = τ s = 10ms [2,11]. The parameters w ij are within-unit synaptic weights chosen according to Eq 4, while the parameters w � ij are lateral (between unit) synaptic weights. We chose w � ee = 0.667, w � pe ¼ 1:25, and w � se ¼ 0:125 to reflect the generally weaker lateral synaptic strengths in auditory cortex relative to the within-unit connections [20].
We added facilitating terms F i (t) in the Exc to SST synapses, and depressing terms D i (t) in the PV to Exc synapses [21]. The parameters a and b control the degree of depression and facilitation, respectively, where we chose a = 0.5, b = 1 (these values were not taken from the literature). The depressing parameter a was chosen so that the term (w ep −aD i (t)) did not change sign across experimental paradigms. The facilitating variables F i (t) satisfy where t D 1 and t D 2 are as in Eq 3. The reuse of the depression time constants t D i are an intentional and simplifying choice. By using the same time scales, we were able to use the inputs i j (t) as a proxy for the excitatory activity u j (t) and simulate the facilitation variable in terms of the depressing synaptic variable as F j (t) = 1−g j (t). Similarly, the depression variables D i (t) satisfy and used the thalamic input as a proxy for excitatory activity to simulate the depression variable as D i (t) = g i (t). All depressing and facilitating timescales were chosen close to reported values [11,14,15].
The activity of the model is shown in Fig 2, in response to three successive auditory stimuli are applied in order of the frequencies f 1 , f � , and f 2 , stimulating the left, center, and right units, respectively (Fig 2C-2E). The center unit (u 2 (t)) responded equally well to frequencies f 1 and f 2 (Fig 2D), a necessary response for SSA paradigms. For simplicity, activation of an adjacent unit did not affect the thalamic variable, i.e., g 1 (t) was left unaffected by i 2 (t), g 2 (t) was left unaffected by i 1 (t) and i 3 (t), and g 3 (t) was left unaffected by i 2 (t). We assumed that the frequency difference between f 1 and f 2 was great enough that auditory inputs at f 1 (f 2 ) did not affect units responsive to f 2 (f 1 ).
When matching the model outputs to experiments in the literature, we found that there were seemingly two different parameter sets that explained all phenomena and we were unable to tweak the rate model to unify the parameters. We unified the disjoint parameter sets by incorporating paradigm-dependent baseline states in the three-unit rate model. The model parameters switch between weak and strong baseline inhibition, where weak inhibition corresponds to high thalamic activity (which corresponds to the parameters in Eq (4)), and strong inhibition corresponds to relatively low thalamic activity (which corresponds to the parameters in Eq (11)). This idea was implemented using an additional facilitating variable, where � IðtÞ represents all thalamic inputs of any frequency, and the parameters are chosen to be t F 1 ¼ 1500, and t F 2 ¼ 100. As we applied auditory inputs from particular experiments into the model, the function F � ðtÞ grew as a function of the excitatory input function I(t), and eventually saturated to different mean values based on the stimulus duration, stimulus interval, and inter-trial interval. Incidentally, we found that the facilitating variable F � ðtÞ saturated to greater values for forward suppression and SSA, and saturated to relatively lower values for PV activation and tuning-curve adaptation. The two putative disjoint parameter sets correspond exactly to these relatively high and low values of the variable F � ðtÞ. A simulation of Eq 10 is shown in Fig 1C for the various auditory paradigms, with a horizontal line shown where we chose to separate the saturation values. If F � ðtÞ is above the threshold, which we chose to be F th = 0.22, the system exhibits weak baseline inhibition, and the synapses take baseline values as shown in Eq 4. On the other hand, if F � ðtÞ < F th , the synapses take the strong baseline inhibitory values and the SST activity threshold, s th = 1, decreases to s th = 0. In our original model, we chose a smooth transition between these parameter sets, i.e., w ep (t) = 2h(F(t))+3(1−h(F(t))), and r, the gain of the sigmoid h was chosen to be steep, e.g., r = 25. However, for simplicity, we replaced h with a Heaviside function and assumed that the system already reached either the weak baseline inhibition W 1 (Eq 4), or the strong baseline inhibition W 2 (Eq 11) based on the given experimental paradigm.
For each paradigm (with paradigm parameters shown in Table 2, we simulated Eq 10 and found that SSA and forward suppression belonged to the weak inhibitory regime ( � F ðtÞ integrated to values above threshold F th ), whereas tuning-curve adaptation and PV activation belonged to the strong inhibition regime ( � F ðtÞ integrated to values below threshold F th ). All rate models were simulated using the dynamical systems software XPP [23], called using PyXPPCALL and visualized using Python [24]. The equations were well-behaved enough that a very coarse time step of dt = 0.1 was sufficient for our purposes. We used the default integration method in XPP, the fourth-order Runge-Kutta.

Goodness of Fit and related metrics
We compared the results obtained with the model as result of optogenetic perturbations to published data qualitatively, by testing whether the model reproduces the increase/decrease of activity that was reported. The limit of large numbers yields theoretically tractable equations such as the Wilson-Cowan models we use in this paper. However, finite size effects may contribute to additional issues that the large-number limit does not address. How do we know that the observed changes would hold consistently with the inclusion of finite and noisy neural populations? To address this question, we built a spiking model constrained by the rate models above. Because the spiking model is derived directly from the simpler rate models, we generally expect that the spiking models will reproduce the rate models results for sufficiently large neural populations, however, we are able to establish that the results are consistent in the presence of noise. Moreover, we built the rate model using Python and brian2, which are both extensively documented and accessible open-source packages. Finally, we include the spiking model as a means for researchers to directly fit parameters from experimental data if needed. The rate model provides strong qualitative intuition, but does not explicitly account for single-cell interactions. Thus, the spiking model provides an additional, detailed framework for others to modify and extend beyond what the rate model may provide.

Spiking neuron dynamics
We use the single-unit rate model as a template for the spiking model, constraining the parameters while preserving the pattern of excitatory and inhibitory currents. All inhibitory neurons consisted of a single somatic compartment, while the excitatory neurons were modeled as two-compartment, "ball-and-stick" models. For each excitatory and inhibitory neuron, we modeled the somatic (ball) compartment as an adaptive exponential integrate-and-fire neuron [25,26]: and the sum ∑ B in the synaptic current iterates over the presynaptic neurons, B2{e,p,s}. If the presynaptic neuron B is excitatory or inhibitory, then E B =0mV or −67mV, respectively. If a synaptic connection existed from PV to Exc, we included a depression variable, (t), satisfying Eq 9, with t D 1 ¼ 1000, and t D 2 ¼ 250: where a = 1.7. As in the rate model, the parameter a was chosen such that the sign of I E PV;i ðtÞ did not change. The additional depression term was necessary to incorporate depression effects that operate well beyond the timescale of inhibitory conductances [15].
We added transmembrane noise in the form of a white noise process with zero mean and a standard deviation of 20mV to simulate intrinsic and extrinsic current fluctuations. All fixed parameters for each neuron type are shown in Table 3. The parameters that we varied manually were entirely contained in the time-dependent functions I A Thal ðtÞ (thalamic inputs) and I A Opto ðtÞ (optogenetic parameters). The thalamic input profile, I A Thal ðtÞ, is determined by Where where τ I = 1ms, τ D,Fast = 10ms, t D 1 ¼ 1000ms, and t D 2 ¼ 250ms. The functions I A (t) (distinct from I A Thal ðtÞ) are square wave functions that are active for the duration of the auditory stimulus. Just as in the rate model, the thalamic input function I A Thal ðtÞ only appears in Exc and PV neurons. The profile of the thalamic input is shown in Fig 1G. The optogenetic term, I A Opto ðtÞ, only appears in the PV and SST equations.
Following a presynaptic spike from neuron j, the postsynaptic effect on neuron i appears as an instantaneous spike in the postsynaptic conductance g ij (t)!g ij (t)+g ij,max /n X , where g ij,max is given by Eq 18, and X stands for the presynaptic neuron type (Exc, PV, or SST). The magnitude of the conductances were chosen to have the same proportion as reported values [16], with the same type of connectivity structure as in the rate model.
In the absence of presynaptic spikes, the conductances g ij decay exponentially to zero: where τ ij = 1ms for all synapses except for the time constants from excitatory to PVs, τ pe = 25ms, and excitatory to SSTs, τ se = 15ms [27]. In the spiking model, we switched to the weak inhibitory regime by decreasing the inhibitory inputs into Exc from g ep,max = 40 and g es,max = 20 to g ep,max = 38 and g es,max = 19.
The term F(t) is a dimensionless slow timescale facilitation variable that depends on the thalamic drive, and satisfies Eq 8 (just as in depression, the additional slow timescale allows the model to operate on multiple timescales [15]). The parameter b = 3 modulates the facilitation strength, and t F 1 ¼ 1000, and t F 2 ¼ 250. For simplicity, we allowed F i (t) to vary continuously over time. The variable w represents spike-frequency adaptation and obeys The dynamics of the dendritic (stick) compartment obey where the parameter κ = 0.3 is the ratio of somatic to total surface area [26]. For PV and SST interneurons, the equations are the same as Exc except that there is no dendritic component. Parameters differ marginally between PV and SST neurons (see Table 3). SSTs, unlike PVs, have no incoming synaptic connections from the thalamus and only receive excitatory inputs from Exc. PVs and SSTs include the optogenetic stimulus term I A Opto ðtÞ, and as mentioned above, only Exc and PVs contain the thalamic input term I A Thal ðtÞ. These connections reflect the choices made in the rate model.

Three-unit Spiking model
We introduced the gross tonotopy into the spiking model by copying the single unit spiking model into three units with lateral excitatory connections (Fig 2A). As in the 3-unit rate model, the thalamic inputs of the 3-unit spiking model have weaker lateral connections. For tone responses at frequency f 1 and f 2 , the center unit receives an input of amplitude proportional to 0.85 that of the left and right units, respectively.
The spiking model contains 1600 Exc, 200 PVs, and 200 SSTs. For connection probabilities within units, we chose E E connections to have probability p EE = 0.1 and all other probabilities to be the same, p EE = p ES = p PE = p PP = p PS = p SE = 0.6. For lateral connection probabilities, we chose p = 0.1.

Differential effects of interneuron suppression in stimulus-specific adaptation
Almost all neurons (95%) in AC exhibit stimulus-specific adaptation, a phenomenon in which neurons reduce their response selectively to the inputs that is presented frequently in the stimulus (standard tone), while preserving the initial strong response to the less frequent input (deviant, or oddball tone) [1]. Previous studies found that following a presentation of the deviant tone, the excitatory neurons adapt over successive presentations of the standard [1,29]. Similar adaptation was observed in the songbird [30,31]. This phenomenon was largely attributed to feedforward thalamo-cortical depressing synapses [17,32], but such models could not account for the full range of the effects that were observed [12]. A recent study found that inhibitory neurons exhibit differential control over SSA [2]. Suppressing SSTs resulted in disinhibition of the excitatory neurons, such that disinhibition increased with successive presentations of the standard tone, but not the deviant. By contrast, PV inhibition drove equal amount of disinhibition of excitatory neurons in response to both the deviant and the standard. These results suggest that SST inhibition controls adaptation level of excitatory neurons. In order to understand the roles of inhibitory interneurons in modulating SSA responses, we tested whether the single-unit model could reproduce the differential effects of suppressing PVs and SSTs in temporal adaptation. This paradigm involved the application of 8 successive tones to the single iso-frequency circuit with constant synapses and depressing thalamic inputs (Fig 3, Eq 1). We then tuned the parameters of the model in order to achieve the qualitative results for SSA [2]. We simulated optogenetic suppression of inhibitory neurons by defining the functions I Opt,PV (t) and I Opt,SST (t) to turn on 100ms before tone onset and turn off 100ms after tone onset, thereby inhibiting the activity of PV or SST neurons as performed in the experimental paradigms. The degree of PV and SST inhibition was chosen as a free parameter, and in the case of this single-unit rate model, had the dimensionless values of I Opt,PV (t) = −2 during PV suppression and I Opt,SST (t) = −1 during SST suppression. Through this suppression

PLOS COMPUTATIONAL BIOLOGY
of inhibition, we sought to reproduce the constant disinhibition from PV inactivation, and increasing disinhibition from SST inactivation. Once the model reproduced this qualitative result, we discovered that the key mechanisms involve the temporal structure of the responses: PVs exhibit a temporally fast tone-evoked response and peak earlier than Exc and SSTs, while SSTs exhibit a temporally delayed and broad tone-evoked response (Fig 1B, Fig 3B top left  plot), which both agreed with earlier studies [2]. The SST delay is not hard-coded, but the result of SSTs receiving indirect thalamic excitation through the Exc population [16].
The simplicity of the model allowed us to understand the mechanism underlying these changes. One important observation is the faster temporal activation of PVs relative to SSTs. With this property, excitatory activity is immediately affected by changes to PV activity and less by changes to SST activity. With PV suppression (Fig 3C middle row), PVs are reduced in activity, leading to greater Exc activity. The SSTs were indeed disinhibited due to a lack of inhibition from PVs and the greater Exc activity, but the inhibition from SST to Exc was not strong enough to compensate for changes in PV activity. Thus, the Exc population received an overall decrease in inhibitory current across all successive tones, resulting in constant disinhibition prior to and following adaptation (Fig 3D green). This result suggests that the temporally faster PV inhibition of Exc neurons is necessary for modulating excitatory activity, and only a few milliseconds of earlier PV activity results in a substantially stronger inhibitory effect relative to SSTs.
With SST inactivation, Exc activity at the first tone was virtually unaffected because the reduced SST activity resulted in PV disinhibition, and the increased PV activity resulted in no net change to the total inhibitory current entering the Exc population (Fig 3C, bottom left). PVs compensated for the reduced SST activity. Following adaptation, the overall reduced excitatory activity in both thalamus and Exc resulted in reduced PV activity and a net loss of inhibitory current in Exc. The Exc population was therefore disinhibited at the last tone (Fig 3C, bottom  right). Combined, these effects drove an increase in disinhibition over successive tones (Fig 3D,  orange). These simple mechanisms of disinhibition and compensation can therefore explain the complementary roles of inhibitory interneurons in shaping cortical activity.
To understand the model dynamics in the presence of inputs at multiple frequencies in the oddball paradigm, we extended the model to a rate and spiking model with three iso-frequency units, in which each microcircuit received inputs of specific preferred frequencies. The threeunit circuitry was based on the single-unit model and the free parameters tuned to reproduce the inhibitory and excitatory currents. For example, an auditory input to the left unit caused lateral excitatory and inhibitory currents to enter the center unit. These currents to the center unit were designed to be similar to the currents entering the single-unit in response to auditory stimuli. We performed the same procedure for auditory inputs to the right unit: lateral excitatory and inhibitory currents from the right were designed to enter the center unit in a manner similar to the single-unit case. Using this procedure, we extended the differential SST and PV inhibition in the single-unit model to work in the case of a tonotopy without the need for exhaustive parameter fitting.
The oddball stimulus activated the neighboring units to the center one, whose activity we measured (Stimulus described in Table 2, Fig 4A). We observe thalamic depression in the thalamus which decreases responses to the repeated activation from one unit, and large responses to the unrepeated tone. We simulated PV and SST suppression with the functions I Opt,PV (t) and I Opt,SST (t). These functions were turned on 100ms before tone onset and turned off 100ms after tone onset. In the rate model, the inhibition was dimensionless and free parameters chosen to be I Opt,PV (t) = −4 during PV inhibition and I Opt,SST (t) = −2 during SST inhibition. In the spiking model, we chose I Opt,PV (t) = −0.2nA during PV inhibition and I Opt,SST (t) = −1nA during SST inhibition (Fig 4B-4G).
In the rate and spiking model, the firing rates increased uniformly across all post-deviant tones (Fig 4D and 4E). In the rate and spiking model, the firing rates exhibited an increase in disinhibition as a function of post-deviant tone number ( (Fig 4F and 4G). Both results agree with existing results in SSA [2]. In order to establish the robustness of these results, we varied several parameters and measured the Common-contrast SSA Index (CSI) [12], where d(f i ) is the deviant rate response and s(f i )) is the standard rate response to frequency f i . For full adaptation, when the standard responses are 0, CSI = 1, indicating a high degree of SSA. If the standard responses are equal to the deviant responses, then CSI = 0, indicating a low degree of SSA. We performed a parameter sweep with four key parameters of circuit connectivity (( Fig 5): (1) recurrent excitation (w ee , (Fig 5A, 5B, 5E and 5F), a key parameter considered in many studies [12], especially those related to inhibitory stabilized networks (ISNs) [11,26]; (2) timescale of thalamic depression (t d 1 , Fig 5C, 5D, 5G and 5H) whose reported values vary over a large range, from 0.8s to 3s [2,12]; (3) the strength of PV activation or inactivation (Fig 5A, 5C, 5E and 5G); and (4) the strength of SST activation or inactivation (Fig 5B, 5D, 5F and 5H). In all cases, inactivating SSTs reduced the CSI relative to PV inactivation, reflecting the increasing disinhibition over post-deviant tones. The x-axis of each plot corresponds to the strength of the optogenetic laser. Negative values correspond to a decrease in inhibitory activity, and positive values correspond to an increase in inhibitory activity. These effects may be more clearly seen in the model equations (Eqs 1, 5 and 13), where we add the terms I Opt,PV (t) and I Opt, This analysis reveals robustness in parameter ranges for given optogenetic modulation strengths. The CSI in the control case (white circle) changed little when the parameters w ee and t d 1 were varied for a given optogenetic strength. In other words, for a fixed value on the xaxis, changing positions in the vertical direction on each plot did not change the CSI index significantly for a nontrivial range of parameter values. This observation suggests that the cortical model can operate in a broad parameter regime, and precise parameter values may not be important for normal function. In extreme cases, decreasing recurrent excitation removed the decrease in CSI following SST inactivation (Fig 5B), suggesting that sufficient recurrent excitation is an important factor in generating responses in the SSA paradigm. Second, while increasing optogenetic inhibition had little effect on the CSI, increasing optogenetic activation showed an increase in CSI in all cases (CSI = 0.35 for PV activation and CSI = 0.31 for SST activation). Therefore, we predicted that optogenetic activation of PVs and SSTs will generally improve context-dependent cortical responses. Like the rate model, the spiking model exhibited little sensitivity to changes in w ee and t d 1 . However, the spiking model showed almost no dependence on recurrent excitation w ee in the PLOS COMPUTATIONAL BIOLOGY case of SST inactivation (Fig 5E). This effect is likely due to the differences in connectivity between the rate and spiking models. In the rate model, lateral connections depend entirely on excitatory activity, thus SSA results in the rate model are more sensitive to changes in recurrent excitation. In the spiking model, recurrent excitation plays a less important role because the lateral connection probabilities are low (p = 0.1), whereas the connection probabilities within units are high (p = 0.6).
The three-unit model was developed to reproduce the compensating mechanisms of the single-unit model: PV suppression results in constant disinhibition for repeated tones, and SST suppression results in a compensating effect from PVs before adaptation that weakens as adaptation strengthens. These differential roles explain experimental data in the SSA paradigm to a remarkable degree. We then asked whether this simple mechanism is sufficient to reproduce additional optogenetic experiments. For the remainder of the paper, we use the threeunit rate model with no parameter modifications except for the changes in the inhibition modes and the auditory inputs that depend on the experimental paradigm (Table 2, Fig 1C, and Eq 10).

Differential effects of inhibitory neuron manipulation on cortical forward suppression
Context dependence of auditory responses has been revealed on many time scales. In a wellstudies phenomenon termed "forward suppression", the responses of AC neurons to a tone are suppressed if the tone is preceded by another tone, but the level of suppression depends on the frequency difference between the two tones (Fig 6). In the experiment, the first tone, called the masker, varies in frequency between trials, while the second tone, called the probe, remains fixed at the preferred frequency of the neuron. This phenomenon was explained by feedforward depression, but the inhibitory neurons were recently shown to also control forward suppression [8]. PV inactivation (orange) concurrent with the auditory stimulus resulted in a selective increase in forward suppression at the preferred frequency relative to the control case (blue), whereas SST inactivation (green) reduced forward suppression at the preferred frequency relative to the control case (blue) (Fig 4B second row) [8].
We used the same parameters for connectivity within the circuit as with SSA to reproduce the experimental findings, with only slight changes to the input strength (q = 1.3). The stimuli used in the forward suppression paradigm place the baseline state in the strong inhibitory regime (Fig 1C). Both the rate (Fig 6A middle, 6B middle) and spiking models (Fig 6B bottom  6B bottom) yielded the existence of experimentally observed differential effects for PV ( Fig  6A) and SST inactivation (Fig 6B): PV inactivation drove a selective decrease in responses whereas SST inactivation drove a suppression of excitatory neuronal responses. We do not expect the scales between the rate and spiking models to match precisely, and we only sought to match the existence of the differential phenomena. Functions I Opt,PV (t) and I Opt,SST (t) were turned on for the duration of the simulation to match the experimental protocol. In the rate model, the inhibition was dimensionless and free parameters chosen to be I Opt,PV (t) = −4 during PV inhibition, I Opt,PV (t) = 0.5 during PV activation, I Opt,SST (t) = −2 during SST inhibition, and I Opt,SST (t) = 1.2 during SST activation. In the spiking model, we chose I Opt,PV (t) = −0.2nA At first glance, this result seems paradoxical given that PV suppression generally results in excitatory disinhibition as shown in the adaptation and SSA results (Figs 3 and 4), but the underlying mechanism is straightforward to understand. Following PV suppression, excitatory activity is indeed disinhibited, but the firing rate function of the excitatory population saturates. This behavior means that higher activity neurons are generally disinhibited less strongly than lower activity neurons. Thus, upon receiving the second tone, the input received by the excitatory population is weaker due to thalamic depression, but the disinhibition is greater relative to the disinhibition in the first tone. This phenomenon also requires a special property in the thalamic depression variable g(t), namely that it cannot depress too quickly during the first time or else the excitation in the second tone will be too weak, and although it recovers slowly, it recovers enough such that the input from the second tone is still somewhat strong. In the case of SST suppression, PVs compensate for the loss of inhibition in the first tone, but lose the ability for compensation in the second tone, so Exc are able to respond more strongly relative to the control case. Thus, forward suppression is weakened.
Next, we tested the effects of activating PVs or SSTs (as could be done with ChR2 experimentally) on model responses. The model predicted that both PV and SST activation will result in an increase of forward suppression across preferred and sideband frequencies (Fig 6D  and 6E).

Differential adaptation to repeated tones along the frequency response function
Neurons in A1 adapt to repeated tones [2]. This adaptation is proportional to the strength of their tone-evoked responses: it is stronger in the center of the frequency response function, and weaker for the sidebands [7]. A recent study found that PVs and SSTs exert a differential effect on this form of adaptation: Suppressing PVs drives disinhibition selective to the sidebands in the adapted state, whereas suppressing SSTs drives disinhibition both in the center and at the sidebands of the frequency response function of excitatory neurons [7]. To understand how inhibitory neurons affect adaptation across different frequency-tuned inputs, we presented a sequence of 8 tones at each frequency to generate adapting tuning curves (Fig 7A), and repeated this stimulus with PV and SST suppression for the model circuit. We found that this auditory paradigm resulted in a below-threshold integration of � F, so the system switched to a state of strong baseline inhibition (and importantly, the model did not respond in precisely the same way as in SSA and forward suppression).
Our model reproduced the differential experimental effects of PV and SST suppression (Fig  7). In the rate model before adaptation, PV and SST inactivation resulted in sideband disinhibition with little to no disinhibition at the preferred frequency (Fig 7D and 7J eft). After adaptation, PV inactivation resulted in sideband disinhibition and no preferred frequency disinhibition (Fig 7D right), whereas SST inactivation resulted in disinhibition across all sideband and preferred frequencies (Fig 7J right). The spiking model closely mirrored these results PLOS COMPUTATIONAL BIOLOGY (Fig 7F and 7L). The ratio of excitatory responses between light off and light on trials summarize the degree of sideband and preferred frequency disinhibition (Fig 7E, 7G, 7K and 7M).
The mechanisms behind these results involve synaptic facilitation and depression and the compensating mechanism discussed in the earlier sections for SSA and adaptation. In the case of PV suppression, SSTs were the only interneurons capable of contributing to Exc inhibition, so only Exc-SST interactions drove the observed effects. In particular, lateral SST to Exc synapses suppressed the center unit over each tone, and facilitation allowed this suppression to persist throughout adaptation. Note that this preferred-frequency effect was not observed in SSA because we never directly stimulated the center unit. Next, in the case of SST suppression, the increasing disinhibition with adaptation at the preferred frequency was a consequence of the same compensating mechanism as in SSA. The tuning curves shown in Fig 7B and 7H are representative cartoons of the changes in the tuning curves observed quantitatively [7], but we aimed to reproduce the light-on and light-off curves (Fig 7C and 7I) using our model. The comparison of the model tuning curves to the cartoons from the experiments appear to be significantly different at the sidebands, but in fact our model outputs closely match the desired lines in Fig 7C and 7I.
Turning to predictions, we found that before adaptation, PV activation resulted in a slight decrease at the preferred frequency, whereas SST inactivation reduced overall firing rates across all frequencies (Fig 7N and 7O). After adaptation, PV and SST activation resulted in a subtractive effect. General optogenetic activation and inactivation of PVs and SSTs modulated tuning-curves in combinations of additive, subtractive, multiplicative, and divisive effects [3]. Our model reproduced one of the key results in these studies as well, since PV and SST inactivation were found to have additive and divisive effects on the frequency response functions of excitatory neurons (Fig 7E, 7G, 7K, 7M, 7O and 7Q).
Similar to SSA, the functions I Opt,PV (t) and I Opt,SST (t) were turned on 100ms before tone onset and turned off 100ms after tone onset. In the rate model, the inhibition was dimensionless and free parameters chosen to be I Opt,PV (t) = −0.5 during PV inhibition, I Opt,PV (t) = 1.2 during PV activation, I Opt,SST (t) = −1 during SST inhibition, and I Opt,SST (t) = 0.1 during SST activation. In the spiking model, we chose I Opt,PV (t) = −1nA during PV inhibition and I Opt,

PVs enhance feedforward functional connectivity
Cortical neurons in AC receive inputs from the thalamic auditory nuclei. As the result, neuronal responses in the cortex are correlated with neuronal firing in the thalamus. These interactions can be captured using an Ising model to measure the connection from the thalamus to the cortex. When PVs were activated, the functional coupling between cortical and thalamic responses [9] became stronger. The specific mechanism underlying this change is unknown.
Using the three-unit model, we identified a candidate mechanism for the enhanced thalamo-cortical correlation following PV activation. We assumed that the feedforward functional connection from the thalamus to the cortex is the same as the anatomical connection, so

PLOS COMPUTATIONAL BIOLOGY
thalamic inputs directly modulated cortical responses in our model. Following an increase in inhibition, cortical responses became sharper, thus aligning more closely with thalamic inputs and improving feedforward functional connectivity (Fig 8).
PV activation (green) in the rate model resulted in an increase in the Pearson correlation between the control (blue) and thalamic inputs (red), from 0.77 and 0.83 (Fig 8B). Thus, whereas inhibitory activation decreased the overall firing rate, the response became more synchronized to the thalamic inputs, resulting in an increase in feedforward functional connectivity. In the spiking model, PV activation resulted in a delayed response of excitatory activity, but we were interested in tested whether PV-activated Exc response profile resembled the thalamic activity more than the control Exc response. To make this comparison, we shifted the PV trace so that the onset of PV-activated Exc activity (green) coincided with the onset of the control curve (blue) (Fig 8C. An equivalent approach would be to measure the peak value of the cross-correlation between excitatory and thalamic activity without shifting the data in time). We observed an increase in the Pearson correlation from 0.87 in the control Exc activity to 0.95 in the PV-activated Exc activity, thus demonstrating a sharpening of excitatory

PLOS COMPUTATIONAL BIOLOGY
responses, and an increase in feedforward functional connectivity. As in the previous paradigms, we only aimed to show the existence of changes without requiring the magnitude of change to match the experimental data.
These results provide for a simple plausible mechanism for enhanced feedforward functional connectivity: as inhibition reduces the overall cortical inputs, cortical responses better synchronize to thalamic inputs, resulting in stronger correlated activity. We remark that the classic approach to the problem of establishing proper functional connectivity relies on sophisticated models such as Ising models [33], expectation-maximization [34], and subspace identification [35], because direct calculations of correlations can lead to false positives when the anatomical connections are not known [9]. Our model has explicit anatomical connections, which eliminates the problem of false-positive correlations. Thus, in this case, the use of the correlation serves as a reliable proxy for the Ising model.

Balanced networks
Multiple studies postulated that excitatory and inhibitory currents are matched in cortical circuits, contributing to stability of circuit function [36,37]. Therefore, we tested whether our model operated as a balanced network. To make this measurement, we took ratios of inhibitory and excitatory currents during suprathreshold activity, because our model ignores virtually all subthreshold activity. Interestingly, with no tuning, our models showed evidence of operating as a balanced network: as we varied the input strength to the rate and spiking models (Fig 9C and 9F), the ratio of excitatory to inhibitory inputs to the excitatory population (Fig 9B  and 9E) remained constant. The rate model had an excitatory/inhibitory ratio of 0.37 (Fig 9B), and the spiking model had an excitatory/inhibitory ratio of 2.5 ( Fig 9E). These results suggest that excitatory-inhibitory balance a robust, emergent feature of cortical networks. The large differences in scales are to be expected when comparing a dimensional and dimensionless model.

Discussion
A wealth of recent studies provide evidence for distinct function of different types of cortical inhibitory neurons in temporal processing of auditory information. The studies demonstrate that different types of inhibitory neurons, SSTs and PVs, play a differential role in auditory processing, controlling adaptation at different time scales and contexts, and changes to feedforward functional connectivity. Our goal was to integrate the results of these studies to understand whether the observed effects were due to a small set of mechanisms.
We built an idealized rate and spiking model that reproduced multiple key results from studies that tested the function of specific inhibitory opsins in specific cells in the auditory cortex. In addition to including different baseline states that modulate the strength of PV-to-Exc and SST-to-Exc synapses, the key mechanisms underlying our models included the fast temporal activation of PVs, the delayed, broad temporal activation of SSTs, the ability for PVs to compensate for weakened SST activity, and dynamic synapses including SST-to-Exc facilitation. These interactions accounted for the differential modulation of cortical responses by interneuron subtypes and suggests that a simplified set of mechanisms can support experimental results.
To reproduce the differential function of SSTs and PVs in stimulus-specific adaptation, we built a model loosely based on multiple existing models for SSA and multiple configurations of spiking neuron populations, consisting of inhibitory and excitatory neurons. Previously, a two-layer rate model with synaptic depression was proposed to establish the relationship between the cortical response and the parameters in SSA experiments, such as stimulus frequency differences, probability of deviation, and tone presentation rate. Yarden et al. (2017) successfully used a multi-unit rate model arranged in a coarse tonotopy consisting of inhibitory and excitatory populations to reproduce general deviance detection, but model has not yet been adapted to explain differential interneuron modulation. Another existing model of SSA including differential inhibitory modulation demonstrating similar differential inhibitory effects as in our SSA result (Fig 4), but did not include a tonotopy [2]. These models only included one type of inhibitory neuron type or did not include tonotopy, and therefore could not account for the observed differential effects of suppression of SSTs and PVs on SSA across multiple frequencies. In the present study, we developed a simple rate and spiking model that accounted for multiple inhibitory cell types and which faithfully reproduced the differential effects of SST and PV inactivation in SSA (Fig 4). In addition, a parameter sweep revealed that both the rate and spiking models were robust to large changes in key parameters commonly explored in the literature, suggesting that SSA is a robust phenomenon [12].
Existing models that reproduce the enhanced forward suppression from PV inactivation and the reduced forward suppression from SST inactivation (Fig 6) include multiple layers that require both depression and facilitation [8], or rely on depressing recurrent excitation and do not distinguish between inhibitory subtypes [4]. We incorporated depression and facilitation in the model synapses and reproduced the former results with only a single layer, suggesting a surprisingly simple mechanism supporting forward suppression. Furthermore, the models in the present study reproduced tuning-curve adaptation effects previously observed experimentally but not computationally: SSTs exhibited strong preferred-frequency disinhibition following adaptation, while PV disinhibition is independent of the degree of adaptation (Fig 7) [7]. These results suggest that the underlying mechanism(s) of the model, namely the PV/SST compensation effect, combined with the facilitating SST-to-Exc synapse, may serve as a general mechanism of adaptation. In addition, our models reproduced changes in feedforward functional connectivity (Fig 8). By increasing PV activity in the models, excitatory activity decreased but became more time-locked to thalamic inputs. This effect agreed with observations in the cortex, where PV activation resulted in enhanced feedforward functional connectivity [9]. The effects of inhibition on sharpening cortical responses have been wellestablished, thus our models serve as plausible mechanisms for this change [38][39][40]. Finally, our models were shown to operate as a balanced network, where inhibitory and excitatory currents entering neural populations were shown to maintain a consistent ratio across input strengths, suggesting a generality to the theory of balanced networks.
One drawback of the model is that it does not feature population spikes, which explain many fundamental cortical responses in AC [38]. In future work, we will seek to reconcile the differences between our models and the population spike model of SSA [4]. Establishing the importance of depression and facilitation in different synapses and extending our model to include population spikes warrants further study.
Although we do not explore simultaneous auditory stimuli in this study, it is worth mentioning the response properties of the network due to recent interest in supralinear network models (Rubin et al. 2016). Throughout this paper, neurons operate in a linear manner when above threshold: neurons add inputs linearly, until the maximum rate is reached in the rate models, or until the refractory period saturates spiking rates in the spiking model. The models do not use sub-threshold responses to modulate population activity. Excitatory-inhibitory balance in the rate and spiking models A. Plot of incoming excitatory and inhibitory currents into the Exc population as a function of different input strengths (C). Darker currents correspond to stronger inputs B. A best-fit line (dashed) accurately captures the ratio of excitatory and inhibitory responses, implying excitatory-inhibitory balance. Equivalent results for the spiking model are shown in panels D, E, and F. https://doi.org/10.1371/journal.pcbi.1008016.g009

PLOS COMPUTATIONAL BIOLOGY
Multiple studies from different laboratories revealed the differential effect of distinct inhibitory neurons in auditory processing. We show that a minimalistic model, built on simple mechanisms, is capable of reproducing disparate results in the literature. As inhibitory neurons form similar circuits throughout the mammalian cortex, this model can be readily adapted to test their function and generate predictions (with adjustments for local changes in connectivity) in different sensory modalities.