A Unifying Mechanistic Model of the Auditory Cortex with Inhibitory Subtypes

Several recent studies implicate parvalbumin(PV) and somatostatin(SOM) positive interneurons in processing the temporal context of sounds in the auditory cortex. We built minimal rate and spiking models of AC in order to understand how these interneurons modulate cortical processing. Our models not only replicate findings from recent experiments involving optogenetic manipulation of PV or SOM activity, accounting for the differential effects of PVs and SOMs in stimulus-specific adaptation, forward suppression and tuning-curve adaptation, but also provided for a simple mechanism for changes to PV-modulated functional connectivity. The unifying mechanisms of our model include dynamic synapses from SOMs and PVs to pyramidal neurons, such as depressing and facilitating synapses from PVs and SOMs to pyramidal neurons, respectively. To reproduce experimental studies, we fine-tuned two key parameters: the strength of thalamic inputs and the strength of optogenetic inactivation. Our model will be useful in predicting the function of PVs and SOMs in sensory processing.


Introduction
Temporal auditory processing, such as habituating to expected sounds, detecting sudden changes in the acoustic environment, and extracting important acoustic features from noise, are important computations for auditory navigation and scene analysis. The mammalian auditory cortex (AC) is a key region for processing temporally complex sound, as evidenced by multiple temporal paradigms, such as stimulus-specific adaptation (SSA) and forward suppression [8,52,3].
Tightly-coupled networks of excitatory and inhibitory neurons comprise the circuits for processing auditory responses, and these networks continue to motivate the latest mechanistic models of AC [26,30,34,60]. Recent development of optogenetic tools allow for testing the function of specific neuronal subpopulations with sub-millisecond precision and cell-type specificity [10], thereby testing the predictions and refining the models of cortical processing of sounds.
In (AC), recent optogenetic studies implicated the two most common types of inhibitory neurons, parvalbumin-(PV) and somatostatin-(SOM) positive interneurons, in processing simple and complex sounds, identifying their specific roles in different auditory paradigms [15,2,32,36,33,38,6]. Yet the mechanisms that support these diverse effects remain poorly understood.
Existing large-scale mechanistic models of the auditory cortex apply to the scale of the whole brain [16,54], or apply to AC, but treat inhibitory neurons as one large group [30,26,60]. Only a handful of studies, restricted to specific results, include multiple inhibitory neuronal subtypes. For example, models of forward suppression in Phillips et al. [37,38] also account for differential modulation of tuning curves by inhibitory subtypes [36,45]. Yet it remains unclear whether these models generalize to other temporally-dependent paradigms such as SSA or predict other phenomena such as changes in functional connectivity. Likewise, another recent model of SSA with inhibitory subtypes [32] does not generalize to results in tuning-curve adaptation, nor does it apply to forward suppression. The Ising model has been used to describe enhanced functional connectivity as a function of PV activation [15], but the mechanisms and anatomical connections underlying this change are not known. Our goal is to build a unifying model for auditory cortical connectivity and test whether there exists a set of parameters that would account for the multitude of recent findings on the function of PVs and SOMs in temporal sound processing.
In the present study, we introduce a consistent mechanistic framework to unravel the role of interneurons in processing dynamic auditory stimuli. We identify the set of parameters that accounts for the observed results including SSA, forward suppression, tuning-curve adaptation, and changes in functional connectivity. We find that the key parameters of the model tuned to account for the results are the strength of the thalamic inputs, and the strength of the optogenetic inactivation and activation. These findings can be readily generalized to other cortical areas, and the framework that we develop can be used to build and test hypotheses for similar phenomena in vision [44,21,43,61,22], olfaction [17,48,19], taste perception [11] and spatial navigation [29,28].

Methods and Materials
We constructed two primary model types in this paper. We first built an augmented version of the Wilson-Cowan model, consisting of one or three isofrequency units of the auditory cortex, and included one excitatory neural population and two inhibitory neural subpopulations. We then built the spiking model analogue of the rate model, which also consisted of one or three isofrequency units.
All code used to generate figures (including model simulations, numerical methods, and analysis methods) are available on GitHub at https://github. com/geffenlab/park_geffen under the MIT open source license.

Augmented Wilson-Cowan Model
We modeled a single iso-frequency unit as an augmented version of the Wilson-Cowan model [57] by including an additional inhibitory subtype. We drew much of our understanding of adaptation in the auditory cortex using this single iso-frequency unit: τ u du(t) dt = −u(t) + f (w ee u − w ep p − w es s + qI(t)) , τ p dp(t) dt = −p(t) + f (w pe u − w pp p − w ps s + I Opt,PV + qI(t)) , τ s ds(t) dt = −s(t) + f (w se u − w sp p − w ss s + I Opt,SOM ), where u, p, and s represent the normalized firing rate (scaled from 0 to 1) of the excitatory population, PV inhibitory subpopulation, and SOM inhibitory subpopulation, respectively ( Figure 1). The parameters I Opt,PV (I Opt,SOM ) represent the strength of PV (SOM) activation or inactivation, and w ij and τ i are synaptic weights and time constants, respectively. All time constants are τ u = τ p = τ s = 10ms [50,32]. The function f is a threshold linear function defined as The parameter r = 3 determines the gain of all firing-rate functions [32,60]. We included a threshold as f (x − u th ), where u th is a real number. In all rate model simulations, we chose Pyr, PV, and SOM thresholds to be u th = 0.7, p th = 1, and s th = 1, respectively. The thresholds indicate the minimum activity required for a neural population to affect postsynaptic neural populations. These thresholds reflect the minimum activity required to activate postsynapic populations [25,26]. The input function I(t) consists of blocks of inputs with stimulus interval δms. When an auditory input arrives, the default temporal profile is taken to have an instantaneous rise with amplitude q and exponential decay with time constant τ q = 10ms [40]. The input I(t) is further modulated by a slow timecourse synaptic depression term g satisfying where the time constants are τ d1 = 1500ms for replenishment and τ d2 = 20ms for depletion (chosen close to reported values [1,50,56,32]). The variable g effectively modulates the amplitude of auditory inputs to A1 over long times. In Figure 2A, each unit shows the connectivity pattern based on existing studies on AC [35]. The within-unit connectivity is equivalently represented by the matrix, In the single-unit rate model, all synaptic weights w ij in the rate model are constant (as in Mill et al. [30]), with synaptic depression appearing in the feedforward thalamic inputs [20]. We choose constant synapses for the single-unit rate model to better understand the model dynamics before transitioning to the more complex three-unit model with depressing and facilitating synapses. The synaptic weights were chosen to agree with known connection types [58] and connection strengths [35].
Iso-frequency Unit C Figure 1: Input and response profiles of the single-unit rate model. A: Gray: thalamic depression variable g. Black: pyramidal (Pyr) neuron activity. Dashed orange: PV activity. Green: SOM activity. B: Thalamic inputs (red). The tone onset occurs at 300ms. The tone duration is 100ms, but fast timescale thalamic adaptation quickly depresses the thalamic input. q = 5. C: Isofrequency unit model. Pyrs form recurrent connections with SOMs and PVs. Connectivity structure and synaptic weights as in Equation (4). Iso-frequency auditory stimuli directly excite Pyr (E) neurons and PV (orange) interneurons.
Because both PVs and Pyrs receive direct thalamic inputs, their activations coincide, but PVs peak earlier, in agreement with existing studies [18]. Because synapses between excitatory and inhibitory neurons are static, this feature is a result of the connectivity pattern and not due to synaptic effects. The delay in SOM activation is caused by a lack of direct thalamic inputs. SOMs have a high threshold and receive excitatory input from the Pyr population, which occurs on a slower timescale than thalamic inputs.
An important feature of this model is the early, fast temporal activation of PVs and the late, broad temporal activation of SOMs. While the total (combined SOM and PV) inhibition is active both before and well after the peak activation of Pyrs, the differential inhibition of PVs and SOMs are fundamental to producing nontrivial changes to Pyr activity.

Three-unit Rate Model
Using the single-unit rate model as a template, we arranged copies into three units with lateral cortical and thalamic connections (Figure 2A). This arrangement endowed our model with a gross tonotopy, which we used to explore spectrally and temporally complex auditory inputs.
In contrast to the single-unit rate model, we added facilitating terms F i in the SOM to Pyr synapses, and depressing terms D i in the PV to Pyr synapses [4]. This addition was necessary for the reproduction of tuning-curve adaptation and did not affect the SSA results. The parameters a and b control the degree of depression and facilitation, respectively, and we chose a = 0.5, b = 2. The depressing parameter a was chosen carefully such that the term (w ep (t) − aD i ) did not change sign across experimental paradigms. The facilitating variables where τ D1 and τ D2 are as in Equation (3). We used the inputs I i (t) as a proxy for the excitatory activity u i (t) so we could simulate the facilitation variable in terms of the depression variable as F i = 1 − g i . Similarly, the depression variables D i satisfy and again using the thalamic input as a proxy for excitatory activity, we simulated the depression variable as D i = g i . All depressing and facilitating timescales were chosen according to reported values [1,50,56]. The functions I k (t) are time-dependent inputs with the strongest preference for unit k, and the profiles of i 1 , i 2 , and i 3 are shown in Figure 2E (these profiles are the same as the profile in the single-unit model, Figure 1B). The parameter q controls the strength of all inputs. Each input i j (t) is modulated by its own thalamic variable g i , where each g j satisfies Equation (3) independently.
Three successive auditory stimuli applied in order of the frequencies f 1 , f * , and f 2 , stimulate the left, center, and right units, respectively ( Figure 2B-D). The center unit (u 2 ) responded equally well to both f 1 and f 2 ( Figure 2C), which is a necessary response for SSA paradigms [52,60]. For simplicity, activation of .
an adjacent unit did not affect the thalamic variable, i.e., g 1 , g 2 , and g 3 were left unaffected by u 1 , u 2 , and u 3 , respectively. We assumed that the frequency difference between f 1 and f 2 was great enough that auditory inputs at f 1 (f 2 ) did not affect units responsive to f 2 (f 1 ). We incorporated paradigm-dependent baseline states in the three-unit rate model. The parameters switch between weak and strong baseline inhibition, where weak inhibition corresponds to high thalamic activity, and strong inhibition corresponds to relatively low thalamic activity. This idea is captured more precisely by the facilitating variable, where I(t) is the sum of all thalamic inputs (independent of the tonotopic ar-  rangement), τ F1 = 1500, and τ F2 = 100. As the experimental paradigm progresses,F grows and eventually saturates (over the course of approximately 15 seconds). A simulation of Equation (7) is shown in Figure 3 for the various auditory paradigms.
IfF is above the threshold F th = 0.22, the system exhibits weak baseline inhibition, and the synapses take baseline values as shown in Equation (4). On the other hand, ifF < F th , the synapses take the strong baseline inhibitory values and the SOM activity threshold, s th = 1, decreases to s th = 0. In our original model, we chose a smooth transition between these parameter sets, i.e., w ep (t) = 2g(F (t))+3(1−g(F (t))), w es (t) = g(F (t))+3(1−g(F (t))), and s th (t) = 1g(F (t)) where and r, the gain of the sigmoid g was chosen to be steep, e.g., r = 25. However, for simplicity, we replaced g with a Heaviside function and assumed that the system already reached either the weak baseline inhibition W 1 (Equation (4)), or the strong baseline inhibition W 2 (Equation (8)) based on the given experimental paradigm.
For each paradigm (with paradigm parameters shown in Table 1), we simulated Equation (7) and found that SSA and forward suppression belonged to the weak inhibitory regime (F integrated to values above threshold F th ), whereas tuning-curve adaptation and PV activation belonged to the strong inhibition regime (F integrated to values below threshold F th ). Table 1: Auditory paradigm parameters. SSA parameters from Natan et al. [32]. Forward suppression parameters from Phillips et al. [37] and Brosch and Schreiner [7]. Tuning-curve adaptation parameters from Natan et al. [33]. PV activation parameters from Hamilton et al. [15].

Spiking Neuron Dynamics
We additionally considered a spiking model equivalent of the rate model. All inhibitory neurons consisted of a single somatic compartment, while the excitatory neurons were modeled as two-compartment, "ball-and-stick" models. For each excitatory and inhibitory neuron, we modeled the somatic (ball) compartment as an adaptive exponential integrate-and-fire neuron [5,24]: where the transmembrane currents are and the sum B in the synaptic current iterates over the presynaptic neurons, B ∈ {e, p, s}. If the presynaptic neuron B is excitatory (inhibitory), then E B = 0mV (−67mV). If a synaptic connection existed from PV to Pyr, we included a depression variable, D [4], satisfying Equation (6), with τ D1 = 1000, and τ D2 = 250 [1,50]: where a = 1.7. As in the rate model, the parameter a was chosen such that the sign of I E PV,i did not change. The additional depression term was necessary to incorporate depression effects that operate well beyond the timescale of inhibitory conductances [56]. Noise in our model comes from a white noise process with zero mean and a standard deviation of 20mV, to simulate intrinsic and extrinsic membrane fluctuations. All fixed parameters for each neuron type are shown in Table  2. The parameters that we varied manually were entirely contained in the time-dependent functions I A Thal (t) (thalamic inputs) and I A Opto (t) (optogenetic parameters). The thalamic input profile, I A Thal (t), is determined by  where where τ I = 1ms, τ D,Fast = 10ms, τ D1 = 1000ms, and τ D2 = 250ms [1,50]. The functions I A (t) (distinct from I A Thal ) are square wave functions that are active for the duration of the auditory stimulus. Just as in the rate model, the thalamic input function I A Thal (t) only appears in Pyr and PV neurons. The profile of the thalamic input is shown in Figure 4G. The optogenetic term, I A Opto (t), only appears in the PV and SOM equations.
Following a presynaptic spike from neuron j, the postsynaptic effect on neuron i appears as an instantaneous spike in the postsynaptic conductance g ij → g ij + g ij,max /n X , where g ij,max is given by Equation (10), and X stands for the presynaptic neuron type (Pyr, PV, or SOM). The magnitude of the conductances were chosen to have the same proportion as reported values [35], with the same type of connectivity structure as in the rate model [58].
G max =   g ee,max g ep,max g es,,max g pe,max g pp,max g ps,max g se,max g sp,max g ss,max In the absence of presynaptic spikes, the conductances g ij decay exponentially to zero: where τ ij = 1ms for all synapses [13] except for the time constants from pyramidal to PVs, τ pe = 25ms, and pyramidal to SOMs, τ se = 15ms [27]. In the spiking model, we switched to the weak inhibitory regime by decreasing the inhibitory inputs into Pyrs from g ep,max = 40 and g es,max = 20 to g ep,max = 38 and g es,max = 19. For excitatory neurons (A = e), the transmembrane currents are The term F is a dimensionless slow timescale facilitation variable [4] that depends on the thalamic drive, and satisfies Equation (5) (just as in depression, the additional slow timescale allows the model to operate on multiple timescales [56]). The parameter b = 3 modulates the facilitation strength, and τ F1 = 1000, and τ F2 = 250 [50,27]. For simplicity, we allowed F i to vary continuously over time. The variable w represents spike-frequency adaptation and obeys The dynamics of the dendritic (stick) compartment obey where the parameter κ = 0.3 is the ratio of somatic to total surface area [24].
For PV and SOM interneurons, the equations are the same as Pyr except that there is no dendritic component, and parameters differ marginally (see Table 2). SOMs, unlike PVs, have no incoming synaptic connections from the thalamus, PVs and other SOMs [35] and only receives excitatory input from Pyrs. Both PVs and SOMs include the optogenetic term I A Opto (t), and as mentioned above, only Pyrs and PVs contain the thalamic input term I A Thal (t). These connections reflect the choices made in the rate model.

Three-unit Spiking Model
We introduced a gross tonotopy into the spiking model by copying the single unit spiking model into three units with lateral excitatory connections (Figure Table 2: Parameter values of spiking neurons. Pyr Iso-frequency Units For tone responses at frequency f 1 (f 2 ), the center unit receives an input of amplitude proportional to 0.85 that of the left (right) unit [40]. The spiking model contains 800 Pyrs, 100 PVs, and 100 SOMs [59,42]. For connection probabilities within units, we chose E ← E connections to have probability p EE = 0.1 and all other probabilities to be the same, p EE = p ES = p P E = p P P = p P S = p SE = 0.6 [24]. For lateral connection probabilities, we chose p = 0.1.
The spiking model was constructed in its entirety using Brian2 [47].

Adaptation
Neurons in the AC exhibit adaptation, manifested by a reduction in neural response to repeated stimuli, for both the spectral and temporal range of in- puts. We hypothesized that a single iso-frequency unit contains the necessary mechanisms to enable SSA and forward suppression. SSA relies on two inhibitory components [32]: the constant disinhibition of excitatory activity due to PV inactivation, and the increasing disinhibition of excitatory activity due to SOM inactivation. We started with a simple sequence of five single-frequency tones, and treated the first tone as a surrogate for the deviant tone, and treated each tone thereafter as post-deviant tones. A single-unit model with constant synapses and depressing thalamic inputs (Equation (1)) are sufficient to reproduce SSA-like results ( Figure 6). The two key features of the mechanism behind this result is the temporal structure of the responses. In the control case ( Figure  6A,B), for each tone, PVs exhibit a temporally fast response and peak earlier than Pyrs and SOMs, while SOMs exhibit a temporally delayed and broad response. SOM responses include a threshold such that they inhibit Pyrs and PVs above a certain population activity level (Equation (2)). The threshold results in a delay of SOM activity of several milliseconds, in agreement with existing results [32].
With PV inactivation ( Figure 6C,D), the Pyr activity was directly proportional to the thalamic inputs, resulting in constant disinhibition prior to ( Figure  6C) and following ( Figure 6D) adaptation. This constant disinhibition appeared over all successive tones ( Figure 6G,H, orange, hatched).
SOM inactivation at the first tone resulted in no net change in the excitatory response because PV inhibition compensated for the reduced SOM inhibition ( Figure 6E). Following adaptation, PV activity was relatively weak and did not compensate for the reduced SOM activity, resulting in Pyr disinhibition ( Figure 6F). This increasing disinhibition appeared over across all successive tones ( Figure 6G,H, green.
We then fine-tuned parameters in the three-unit rate model such that these single-unit adaptation results persisted in the standard SSA paradigm.

Stimulus-Specific Adaptation
Auditory SSA manifests as a decrease in cortical response to a repeating auditory stimulus (standard) that does not generalize to another rarely-occurring stimulus (deviant) [60]. SSA is a well-documented cortical phenomenon in various animal models including cats [52], rats [53], and mice [32,33].
We built on multiple existing models for SSA. Mill et al. [30] used multiple configurations of spiking neuron populations, consisting of inhibitory and excitatory neurons, to reproduce cortical responses to oddball sequences used in SSA experiments. A two-layer rate model with synaptic depression was built by Mill et al. [31] to quantify the relationship between the cortical response and the parameters in SSA experiments, such as stimulus frequency differences, probability of deviation, and tone presentation rate. Yarden et al. (2017) used a multi-unit rate model arranged in a coarse tonotopy consisting of inhibitory and excitatory populations [60] to reproduce general deviance detection. Each study advanced our understanding of the basic mechanisms underlying SSA, however, they did not account for multiple inhibitory subtypes and their differential roles as discovered in recent optogenetic experiments.
Because the model inputs are such that the majority of deviance detection occurs at the level of the thalamus, we used an equivalent and concise stimulus paradigm: In the rate model, we used a sequence of 5 single-frequency tones, in which the first tone was equivalent to the deviant responses, and the remaining tones were equivalent to post-deviant responses. In the spiking model, we allowed the model to adapt to the standard tones before applying a deviant tone and recording the post-deviant responses. SSA ( Figure 7A) is simulated with the three-unit rate model ( Figure 7B,C) and with the three-unit spiking model ( Figure 7D,E). Figure 7B shows the firing rate of the pyramidal neuron in three cases: control (black), PV inactivation (orange, hatched), and SOM inactivation (green). The firing rates are normalized to the adapted firing rate (4th postdeviant tone number). All neurons (Pyr, PV, and SOM) exhibit adaptation, in agreement with the literature [32] (data not shown).
In the rate and spiking model, the firing rates increased uniformly across all post-deviant tones ( Figure 7B, bottom, orange hatched and Figure 7C, bottom, orange hatched, respectively). In the rate and spiking model, the firing rates exhibited an increase in disinhibition as a function of post-deviant tone number ( Figure 7B, bottom, green and Figure 7C, bottom, green, respectively). Both results agree with existing results in SSA [32].   Figure 7D).
In order to establish the robustness of these results, we varied several parameters and measured the Common-contrast SSA Index (CSI) [60], where We performed a parameter sweep with four parameters (Figures 8, 9). For the first parameter, we chose recurrent excitation (w ee , Figures 9A,B, 8A,B), because it is a key parameter in many modeling studies, especially those related to inhibitory stabilized networks (ISNs) [51]. For the second parameter, we chose the timescale of thalamic depression (τ d1 , Figures 9C,D, 8C,D) because reported values vary over a large range, from 0.8s [60] to 3s [32]. Finally, we chose the remaining two parameters to be the strength of PV activation or inactivation (Figures 9A,C, 8A,C), and the strength of SOM activation or These plots reveal robustness in parameter ranges for given optogenetic modulation strengths. The CSI in the control case (white circle) changed little when the parameters w ee and τ d1 were varied (i.e., shifting the white circle up and down). This result suggests that the cortical model is capable of operating in a broad parameter regime, and precise parameter values may not always be important for normal function. In extreme cases, decreasing recurrent excitation removed the decrease in CSI following SOM inactivation ( Figure 8B), suggesting that sufficient recurrent excitation is an important factor in generating responses in the SSA paradigm. Second, while increasing optogenetic inhibition (i.e., shifting the white triangle right) had little effect on the CSI, increasing optogenetic activation (i.e., shifting the white square left) showed an increase in CSI in all cases (CSI= 0.35 for PV activation and CSI= 0.31 for SOM activation). Thus, we predicted that optogenetic activation of PVs and SOMs will generally improve context-dependent cortical responses.
Much like the rate model, the spiking model exhibited little sensitivity to changes in w ee and τ d1 , supporting the prediction that the cortex does not need precise parameters. However, in contrast to the rate model, the spiking model showed almost no dependence on recurrent excitation w ee in the case of SOM inactivation ( Figure 9B). This effect is likely due to the differences in connectivity between the rate and spiking models. In the rate model, lateral connections are much stronger and depend entirely on excitatory activity, thus SSA results in the rate model are more sensitive to changes in recurrent excitation. In the spiking model, recurrent excitation plays a less important role because the lateral connection probabilities are low (p = 0.1), whereas the connection probabilities within units are high (p = 0.6). Moreover, much more of the excitation in the spiking model comes from the thalamus.  The standard forward suppression procedure follows the firing rate of a chosen neuron (with a known preferred frequency) in response to two tones. The first tone, called the masker, varies in frequency between trials, while the second tone, called the probe, remains fixed at the preferred frequency of the neuron. When the masker frequency is different from the probe frequency, the neuron response to the probe tone remains unaffected. When the masker frequency is similar to the probe frequency, neural responses are diminished or abolished [37,38].

Forward Suppression
The stimuli used in the forward suppression paradigm place the baseline state in the strong inhibitory regime (Figure 3). PV inactivation (orange, dashed) resulted in an overall strengthening of forward suppression at the preferred frequency relative to the control case (black) (Figure 10A, top). SOM inactivation (green) reduced forward suppression effects at the preferred frequency relative to the control case (black) (Figure 10A, bottom). The spiking model yielded the same results for PV inactivation and SOM inactivation ( Figure 10B, top, bottom, respectively). Strikingly, we use the same parameters used to reproduce SSA, with only slight changes to the input strength, which we choose to be q = 1.3 in forward suppression.
Our model predicted that both PV and SOM activation will result in a persistent strengthening of forward suppression across preferred and sideband frequencies ( Figure 10C).

Tuning-Curve Adaptation
Recent work has turned to teasing apart the role of interneurons in modulating neural tuning curves [36,38] as well as the role of interneurons in temporal modulation of tuning curves over the course of adaptation [33] in what we call tuning-curve adaptation.
To reproduce the tuning-curve adaptation results, we applied a sequence of 5 tones at each frequency (there was negligible difference in the adapted Pyr activity after 5 and 8 tones) to generate adapting tuning curves, and repeated this process with PV and SOM inactivation. We found that this auditory paradigm resulted in a below-threshold integration ofF , so the system switched to a state of strong baseline inhibition.
Our model reproduced tuning-curve adaptation ( Figure 11A,B). The rate model, before adaptation, exhibited sideband disinhibition with little no on disinhibition at the preferred frequency following PV and SOM inactivation (( Figure 11A, top). After adaptation, PV inactivation resulted in sideband disinhibition and no preferred frequency disinhibition, whereas SOM inactivation resulted in disinhibition across all sideband and preferred frequencies ( Figure  11A, bottom). The spiking model closely mirrored these results ( Figure 11B). These observations are in agreement with Natan et al. [33], and the three-unit rate and spiking models are the first to reproduce these results.
The mechanisms responsible for these effects are the synaptic facilitation and depression effects. Following PV inactivation, only SOM interneurons contribute to Pyr inhibition, and therefore only SOMs can modulate Pyr activity. Thus, a facilitating SOM-to-Pyr synapse enables SOMs to directly modulate Pyr activity, resulting in a decreased effect at the preferred frequency. The increasing disinhibition with adaptation at the preferred frequency following SOM inactivation is a consequence of the same compensating mechanism as in SSA.
Our model predicted that before adaptation, PV activation resulted in a slight decrease at the preferred frequency, whereas SOM inactivation reduced overall firing rates across all frequencies ( Figure 11C, top). After adaptation, PV and SOM activation resulted in a subtractive effect ( Figure 11C, bottom).
General optogenetic activation and inactivation of PVs and SOMs have been shown to modulate tuning-curves in combinations of additive, subtractive, multiplicative, and divisive effects. The model reproduces one of the key results in Natan et al. [33], where PV and SOM inactivation were found to have additive and divisive on tuning curves ( Figure 11D,E).

Enhanced Functional Connectivity
Hamilton et al. [15] established a direct function connection from the thalamus to deep layer 3 in the cortex using the Ising model [14,49], and demonstrated that this functional connection becomes stronger following PV activation [15]. Because the Ising model does not establish anatomical connections, the exact mechanism underlying this change remains unknown.
Using our three-unit model, we demonstrate a plausible mechanism for this change. First, we assume that the functional connection from the thalamus to the cortex is the same as the anatomical connection, so thalamic inputs directly modulate cortical responses in our model. Next, we show that following an increase in inhibition, cortical responses become sharper, thus aligning more closely with thalamic inputs, and improving functional connectivity. These results are summarized in Figure 12.  Figure 12: Effects of activating PVs in the rate and spiking models. A: The control Pyr firing rate in the rate model (black) decreased after PV activation (orange). The thalamic input is shown in red. Pearson correlation between thalamic input and control (PV activation): 0.77 (0.83). B: The same plot as A, but with the spiking model. The control Pyr firing rate (black) decreased after PV activation (orange). PV activation delayed the Pyr activity by approximately 2.5ms, so we shifted the PV activated firing rates (orange dashed) to align with the onset of the control curve (black) in order to directly compare the similarity between the PV-activated Pyr activity to the thalamic input. The Pearson correlation between thalamic input and control (PV activation): 0.87 (0.9). C: Predicted effects following optogenetic inactivation of PVs. Correlation decreased from 0.77 to 0.71. Rate model q = 1.2, PV activation and inactivation strengths were I Opt,PV = 1, −1, respectively. Spiking model q = 0.7nA, PV activation and inactivation strengths were I Opt,PV = 0.2nA, −0.5nA, respectively.
PV activation in the rate model resulted in an increase in the Pearson correlation between the control (black) and thalamic inputs (red), from 0.77 and 0.83 ( Figure 12A). Thus, while inhibitory activation decreased the overall firing rate, the response became more synchronized to the thalamic inputs, resulting in an increase in functional connectivity.
PV activation in the spiking model resulted in a delayed response of excitatory activity ( Figure 12B). We were curious to see if the PV-activated Pyr response profile resembled the thalamic activity more than the control Pyr response. To aid in quantifying this similarity, we introduced a delay in PV activation of approximately 2.5ms so that the onset of PV-activated Pyr activity (orange dashed) coincided with the onset of the control curve (black), and used the Pearson correlation to quantify if the PV-activated Pyr response was more synchronized to the thalamic inputs. We observed an increase in correlation from 0.84 in the control Pyr activity to 0.89 in the PV-activated Pyr activity, thus demonstrating a sharpening of excitatory responses, and an increase in functional connectivity.
These results provide for a simple plausible mechanism for enhanced functional connectivity in the cortex: as inhibition reduces the overall cortical inputs, cortical responses better synchronize to thalamic inputs, resulting in stronger correlated activity. We remark that while correlations in general do not measure functional connections as the Ising model, our model has explicit anatomical connections, which enables one to directly remove false-positive correlations. Thus, in this case, the use of the correlation serves as a worthwhile proxy for the Ising model results in Hamilton et al. [15].

Discussion
The recent growth of optogenetic studies have revealed the intricate cortical circuitry underlying fundamental auditory processing. Accompanying these studies are a wide range of models with very different mechanisms, such as single-unit rate models [32], abstracted multi-unit rate models [37,38,36], and Ising models [15]. In the present study, we have introduced a unifying threeunit rate and three-unit spiking model that reproduces multiple key results from studies that have used optogenetics in the auditory cortex. In addition to including different baseline states that modulate the strength of PV-to-Pyr and SOM-to-Pyr synapses, the key mechanisms underlying our models are the fast temporal activation of PVs, the delayed, broad temporal activation of SOMs, SOM-to-Pyr facilitation, and PV-to-Pyr depression. These mechanisms explain the differential modulation of cortical responses by interneuron subtypes.
In the present study, the models faithfully reproduce the differential effects of SOM and PV inactivation in SSA (Figure 7). Our model features multiple inhibitory subtypes and a gross tonotopy, which is an improvement over existing models. A model by [60] relies on recurrent excitatory depression without inhibitory subtypes. An existing model of differential inhibitory modulation in SSA, which demonstrated similar differential inhibitory effects as in our SSA result (Figure 2), did not include a tonotopy [32]. A parameter sweep revealed that both the rate and spiking models are robust to large changes in key parameters, suggesting that SSA is generally a robust phenomenon [60].
Existing models that reproduce the enhanced forward suppression from PV inactivation and the reduced forward suppression from SOM inactivation (Figure 10) include multiple layers that require both depression and facilitation [37], or rely on depressing recurrent excitation and do not distinguish between inhibitory subtypes [26]. While we utilized synaptic depression and facilitation, we reproduced the former results with only a single layer, suggesting that the mechanism supporting forward suppression may be very simple.
The models in the present study are the first to reproduce tuning-curve adaptation effects: SOMs exhibit strong preferred-frequency disinhibition following adaptation, while PV disinhibition is independent of the degree of adaptation ( Figure 11) [33]. These results suggest that the underlying mechanism(s) of the model, namely the PV/SOM compensation effect, combined with the facilitating SOM-to-Pyr synapse and depressing PV-to-Pyr synapse, may serve as unifying mechanisms of adaptation.
Finally, our models reproduce changes in functional connectivity in the cortex ( Figure 12) [15]. By increasing PV activity in the models, excitatory activity decreases but becomes more synchronized with the shape of thalamic inputs. This effect agrees with observations in the cortex, where PV activation results in enhanced functional connectivity [15]. The effects of inhibition on sharpening cortical responses have been well-established, thus our models serve as plausible mechanisms for this change [55,9,46].
Our models currently do not feature population spikes, which explain many fundamental cortical responses in AC [25,26]. In future work, we will seek to reconcile the differences between our models and the population spike model of SSA by Yarden and Nelken [60]. Establishing the importance of depression and facilitation in different synapses and extending our model to include population spikes warrants further study.
The various optogenetic manipulations in AC considered in the present study were developed by independent labs to explore different aspects of auditory processing. The experimental results are diverse, thus one may reasonably expect that models with greatly differing mechanisms and parameters are required to reproduce these results. Strikingly, our minimalistic model, containing simple mechanisms, has helped us understand the importance of differential information processing by various subtypes of inhibitory neurons, and has unified these disparate studies. As inhibitory neurons form similar circuits throughout the mammalian cortex, this model can be readily adapted to test their function and generate predictions (with adjustments for local changes in connectivity) in different sensory modalities.

Competing interests
The authors declare no competing interests.