The authors have declared that no competing interests exist.
Probabilistic inference offers a principled framework for understanding both behaviour and cortical computation. However, two basic and ubiquitous properties of cortical responses seem difficult to reconcile with probabilistic inference: neural activity displays prominent oscillations in response to constant input, and large transient changes in response to stimulus onset. Indeed, cortical models of probabilistic inference have typically either concentrated on tuning curve or receptive field properties and remained agnostic as to the underlying circuit dynamics, or had simplistic dynamics that gave neither oscillations nor transients. Here we show that these dynamical behaviours may in fact be understood as hallmarks of the specific representation and algorithm that the cortex employs to perform probabilistic inference. We demonstrate that a particular family of probabilistic inference algorithms, Hamiltonian Monte Carlo (HMC), naturally maps onto the dynamics of excitatory-inhibitory neural networks. Specifically, we constructed a model of an excitatory-inhibitory circuit in primary visual cortex that performed HMC inference, and thus inherently gave rise to oscillations and transients. These oscillations were not mere epiphenomena but served an important functional role: speeding up inference by rapidly spanning a large volume of state space. Inference thus became an order of magnitude more efficient than in a non-oscillatory variant of the model. In addition, the network matched two specific properties of observed neural dynamics that would otherwise be difficult to account for using probabilistic inference. First, the frequency of oscillations as well as the magnitude of transients increased with the contrast of the image stimulus. Second, excitation and inhibition were balanced, and inhibition lagged excitation.
These results suggest a new functional role for the separation of cortical populations into excitatory and inhibitory neurons, and for the neural oscillations that emerge in such excitatory-inhibitory networks: enhancing the efficiency of cortical computations.
Our brain operates in the face of substantial uncertainty due to ambiguity in the inputs, and inherent unpredictability in the environment. Behavioural and neural evidence indicates that the brain often uses a close approximation of the optimal strategy, probabilistic inference, to interpret sensory inputs and make decisions under uncertainty. However, the circuit dynamics underlying such probabilistic computations are unknown. In particular, two fundamental properties of cortical responses, the presence of oscillations and transients, are difficult to reconcile with probabilistic inference. We show that excitatory-inhibitory neural networks are naturally suited to implement a particular inference algorithm, Hamiltonian Monte Carlo. Our network showed oscillations and transients like those found in the cortex and took advantage of these dynamical motifs to speed up inference by an order of magnitude. These results suggest a new functional role for the separation of cortical populations into excitatory and inhibitory neurons, and for the neural oscillations that emerge in such excitatory-inhibitory networks: enhancing the efficiency of cortical computations.
Uncertainty plagues neural computation. For instance, hearing the rustle of an animal at night, it may be impossible to ascertain the species, and thus whether or not it is dangerous. One approach in this scenario is to respond based on a point estimate, usually the single most probable explanation of our observations. However, this leads to a problem: if the probability of the animal being dangerous is below 50%, then the single most probable explanation is that the animal is harmless; and considering only this explanation, and thus failing to respond, could easily prove fatal. Instead, to respond appropriately, it is critical to take uncertainty into account by also considering the possibility of there being a dangerous animal, given the rustle and any other available clues.
The optimal way to perform computations and select actions under uncertainty is to represent a probability distribution that quantifies the probability with which each scenario may describe the actual state of the world, and update this probability distribution according to the laws of probability, i.e. by performing Bayesian inference. Human behaviour is consistent with Bayesian inference in many sensory [
The apparent success of probabilistic inference in accounting for a diverse set of experimental observations raises the question of how neural systems might represent and compute with uncertainty [
Here, we present an E-I neural network model of V1 that performs probabilistic inference such that it retains a computationally useful representation of uncertainty, and has rich, cortex-like dynamics, including oscillations and transients. In particular, our network uses a sampling-based representation of uncertainty [
HMC is based on the idea that it is possible to sample from a probability distribution by setting up a dynamical system whose dynamics is Hamiltonian (
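The core idea can be illustrated with a textbook HMC sampler on a one-dimensional Gaussian target. This is a generic sketch, not the network dynamics derived below; the step size `eps` and trajectory length `n_leap` are illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Target: a 1D standard Gaussian, log p(q) = -q^2/2 (up to a constant).
def logp(q):
    return -0.5 * q * q

def grad_logp(q):
    return -q

def hmc(n_samples, eps=0.2, n_leap=15):
    """Textbook HMC: resample a momentum, simulate Hamiltonian dynamics
    with a leapfrog integrator, then accept/reject to correct for
    discretisation error."""
    q, samples = 0.0, []
    for _ in range(n_samples):
        p = rng.standard_normal()                 # auxiliary momentum
        q_new, p_new = q, p
        p_new += 0.5 * eps * grad_logp(q_new)     # leapfrog half-step
        for _ in range(n_leap - 1):
            q_new += eps * p_new
            p_new += eps * grad_logp(q_new)
        q_new += eps * p_new
        p_new += 0.5 * eps * grad_logp(q_new)     # final half-step
        # Metropolis correction: dH = H_old - H_new
        dH = (logp(q_new) - 0.5 * p_new**2) - (logp(q) - 0.5 * p**2)
        if np.log(rng.random()) < dH:
            q = q_new
        samples.append(q)
    return np.array(samples)

s = hmc(5000)
```

Because each trajectory travels a long way through state space before the momentum is resampled, successive samples are far less correlated than those of a random walk.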
In the following, we first define the statistical model of natural visual scenes that served as the testbed for our simulations of V1 dynamics. We then describe the HMC-based neural network that implemented sampling under this statistical model. We demonstrate that our dynamics sample more rapidly than noisy gradient ascent (also known as Langevin dynamics), and therefore that the presence of oscillations and transients in our network speeds up inference. Next, we show by both theoretical analysis and simulation that our sampler reproduces three properties of experimentally observed cortical dynamics. First, our sampler has balanced excitation and inhibition, with inhibition lagging excitation [
In order to model the dynamics of V1 responses, we adopted a statistical model that has been widely used to capture the statistics of natural images and consequently to account for the
The Gaussian scale mixture (GSM) model is relatively simple, yet captures some fundamental higher-order statistical properties of natural image patches by introducing latent variables,
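A minimal generative sketch of a GSM of this kind follows, with a single global, positive 'contrast' variable `z` multiplicatively scaling a Gaussian vector of local feature activations `y`. The filter matrix, the priors, and all sizes are illustrative stand-ins (including the assumed gamma prior on `z`), not the paper's fitted parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pix, n_feat = 16, 8                             # illustrative patch / feature sizes
A = rng.standard_normal((n_pix, n_feat))          # stand-in for Gabor-like filters
C = np.eye(n_feat)                                # prior covariance of local features
sigma_x = 0.1                                     # observation noise scale (illustrative)

# GSM generative process: the contrast variable z scales a Gaussian vector of
# local feature activations y; independent pixel noise is added on top.
z = rng.gamma(shape=2.0, scale=1.0)               # assumed prior on the contrast latent
y = rng.multivariate_normal(np.zeros(n_feat), C)  # local feature activations
x = z * (A @ y) + sigma_x * rng.standard_normal(n_pix)  # observed image patch
```

The multiplicative interaction between `z` and `y` is what gives the GSM its characteristic higher-order statistics: feature activations that are uncorrelated but not independent, because they share a common contrast.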
Parameter  Value      Role
—          see …      prior covariance of the latent variables
—          see …      edge-detecting filters represented by model neurons
—          0.1        variance of observation noise
—          10 ms      membrane time constant
—          13 s^{−1}  rate at which stochastic vesicle release injects noise
—          see …      recurrent connection weights in the network
Crucially, assuming that V1 simple cell activities represent values of
To ensure efficient sampling from the posterior, we constructed network dynamics based on the core principles of HMC sampling. The efficiency of HMC stems from its ability to speed up inference by preventing the random walk behaviour plaguing other sampling-based inference schemes. In particular, it introduces auxiliary variables to complement the ‘principal’ variables whose value needs to be inferred (
We noted that the particular interaction between principal and auxiliary variables required by HMC dynamics is naturally implemented by the recurrently connected excitatory and inhibitory populations of cortical circuits. Thus, the dynamics of our two-population neural network that sampled from the GSM posterior were (
The network consists of two populations of neurons, excitatory neurons with membrane potential
Network dynamics consisted of three components. First, recurrent dynamics implementing HMC was specified by the first two terms in Eqs (
Second, there was an input current
Finally, the last term in Eqs (
When given an input image, our network exhibited oscillatory dynamics due to its intrinsic excitatory-inhibitory interactions (
The Langevin sampler was constructed by setting the recurrent weights in our network (
The oscillatory behaviour of our HMC sampler allowed it to explore a larger volume of state space in a fixed time interval than Langevin sampling (
The efficiency of HMC is typically attributed to the suppression of the random walk behaviour of Langevin dynamics [
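This suppression of random-walk behaviour can be quantified on a toy 1-D Gaussian by comparing the integrated autocorrelation time of a momentum-based (HMC-like) sampler against overdamped Langevin dynamics. This is a caricature, not the V1 network; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

h, gamma, n = 0.02, 0.3, 200_000   # step, friction (low => oscillatory), run length

def simulate(momentum):
    """Sample a 1D standard Gaussian, with or without an auxiliary momentum."""
    q = p = 0.0
    out = np.empty(n)
    for t in range(n):
        if momentum:   # underdamped (HMC-like): momentum carries the state along
            q += h * p
            p += h * (-q - gamma * p) + np.sqrt(2 * gamma * h) * rng.standard_normal()
        else:          # overdamped Langevin: noisy gradient ascent, a random walk
            q += h * (-q) + np.sqrt(2 * h) * rng.standard_normal()
        out[t] = q
    return out

def tau_int(x, max_lag=1500):
    """Integrated autocorrelation time (in simulated time units), via FFT."""
    x = x - x.mean()
    f = np.fft.rfft(x, 2 * len(x))
    acov = np.fft.irfft(f * np.conj(f))[: max_lag + 1]
    rho = acov / acov[0]
    return h * (0.5 + rho[1:].sum())

tau_hmc = tau_int(simulate(momentum=True))    # theory for this system: ~gamma = 0.3
tau_lang = tau_int(simulate(momentum=False))  # theory for this system: ~1.0
```

Because the oscillatory sampler's autocorrelation function has negative lobes, its integrated autocorrelation time can fall well below that of the monotonically decaying Langevin sampler, so fewer time units are needed per effectively independent sample.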
As we saw above, the advantage of HMC over Langevin dynamics could be attributed to the contribution of the recurrent connections, i.e. the
Oscillations are a ubiquitous property of cortical dynamics [
In order to extract an LFP from our model, in line with previous approaches (e.g. [
To further quantify this intuition, we simplified the dynamics of our network by incorporating the effects of inhibition directly into the equations describing the dynamics of the excitatory cells (see
Indeed, as predicted by these arguments, the network exhibited contrast-dependent oscillation frequencies both in its membrane potentials (
When we computed firing rates in the model by applying a threshold to membrane potentials (
In order to understand how transients emerged in the full Hamiltonian dynamics of our network, sampling
More formally, taking the 1D version of the simplified dynamics (
Simulating this simplified dynamical system did indeed yield large transients (
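A noise-free caricature of such a simplified dynamical system is a damped oscillation of the state around the posterior mean: stimulus onset shifts the mean, and the state, starting from the old equilibrium, overshoots before settling — an onset transient whose magnitude grows with the size of the shift. The frequency and damping values below are illustrative, not fitted:

```python
import numpy as np

def response(mu, omega=2 * np.pi * 5.0, gamma=8.0, dt=1e-4, T=1.0):
    """1D damped oscillator around a mean mu: stimulus onset at t=0 moves
    the mean from 0 to mu; the state overshoots, then settles."""
    u = v = 0.0
    traj = np.empty(int(T / dt))
    for t in range(len(traj)):
        u += dt * v
        v += dt * (-omega**2 * (u - mu) - gamma * v)
        traj[t] = u
    return traj

low = response(mu=0.5)    # small shift of the mean: small onset transient
high = response(mu=2.0)   # large shift of the mean: large onset transient
```

For a fixed damping ratio the overshoot is a fixed fraction of the step in the mean, so a stronger stimulus produces a proportionally larger transient.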
Previously proposed mechanisms by which the cortex could either represent and manipulate uncertainty or just find the most probable explanation for sensory data failed to explain the richness of cortical dynamics. In particular, these models either had no dynamics or only gradient-ascent-like dynamics, whereas neural activity displays oscillations in response to a fixed stimulus, and large transients in response to stimulus onset. Moreover, these models typically violated Dale’s law, by having neurons whose outputs were both excitatory and inhibitory. We demonstrated that it was, in fact, possible to perform probabilistic inference in an E-I network that displayed oscillations and transients. Moreover, having oscillations actually improved the network, in that it was able to perform inference faster than networks that did not have oscillations. Our model displayed four further dynamical properties that did not appear, at first, to be compatible with probabilistic inference: excitation and inhibition were balanced at the level of individual cells [
Our work suggests a new functional role for cortical oscillations, and for inhibitory neurons that are involved in their generation: speeding up inference. We have demonstrated this role in the specific context of V1, but our formalism is readily applicable to other cortical areas in which probabilistic inference is supposed to take place, and similar stimulus-controlled transients and oscillations can be observed [
While the statistical model of images underlying our network was able to capture some interesting properties of the statistics of natural images, it was nevertheless clearly simplified, in that, for example, it did not capture any notion of objects or occlusion. Once such higher-order features are incorporated into the model, we expect a variety of interesting new dynamical properties to emerge. For example, there should be strong statistical relationships between low-level variables describing a single object, and hence strong dynamical relationships, including synchronisation, between neurons representing different parts of the same object [
It will also be important to understand how local learning rules, modelling synaptic plasticity, may be able to set up the weight matrices that we found were necessary for implementing efficient Hamiltonian dynamics. For example, there might be two sets of learning rules operating in parallel, one set of rules which learns the statistical structure of the input, perhaps mainly through the plasticity of excitatory-to-excitatory connections [
Finally, while the type of linear membrane potential dynamics we used in our network could be implemented using firing rate nonlinearities in combination with synaptic and dendritic nonlinearities [
The sampler was derived by combining an HMC step and a Langevin step to add noise and ensure ergodicity. The most general equations describing HMC are given by
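In the standard textbook formulation (our notation; the paper's more general form may differ), Hamiltonian dynamics for sampling from a target distribution $\pi(\mathbf{q})$ read

$$
\dot{\mathbf{q}} = M^{-1}\mathbf{p}, \qquad \dot{\mathbf{p}} = \nabla_{\mathbf{q}} \log \pi(\mathbf{q}),
$$

which conserve the Hamiltonian $H(\mathbf{q},\mathbf{p}) = -\log \pi(\mathbf{q}) + \tfrac{1}{2}\mathbf{p}^{\top} M^{-1}\mathbf{p}$, so that the joint density $\propto e^{-H} = \pi(\mathbf{q})\,\mathcal{N}(\mathbf{p};\,\mathbf{0}, M)$ is left invariant; here $M$ is the covariance of the auxiliary variable.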
For the HMC step, there is freedom to specify the distribution of the auxiliary variable,
In order to add noise without perturbing the stationary distribution, we perform a Langevin step, that is, we simultaneously add noise and take a step along the gradient of the log-probability. Notably, this introduces a new time constant
The dynamics therefore become
Again, we can break up the
Now we compute these gradients and convert them into a neural network (see
We can thus write the dynamics of our neural network as
Finally, we substitute
The brain does not know
In particular, we simply extended the dynamics with an additional element for
By setting the weight matrices implementing HMC,
The GSM model has three parameters, the Gabor features,
We can set
We know that the average posterior equals the prior [
We make the ansatz that
In principle, we could find
Setting these expressions equal, substituting for
(Note that while this derivation is valid for the complete and undercomplete case, a more complex analysis would be necessary for the overcomplete case.)
With these choices, the dynamics only depend on the probabilistic model through the product (
For the dynamics to be correct, we need this matrix to be positive definite. While this is not guaranteed, we found that in practice the matrix turns out to satisfy this constraint. As
Next, we consider the observation noise level,
Based on the literature, we set the values of the relevant constants as
To obtain this range for
To choose values for
Finally, we estimated
However, estimating
To estimate the required ranges, we took values from the neuroscience literature. First, estimates of firing rates vary widely, from around 0.5 Hz [
These ranges give a central estimate of
One might worry that it is possible for
Differentiating again yields
Thus, for fixed
We simulated stimulus onset by first running the sampler until it reached equilibrium with no stimulus, then turning on the stimulus. To represent no stimulus we sampled
To make contact with experimental data, we also computed local field potentials (LFPs) and firing rates. There are many methods for computing LFPs; we chose the simplest, averaging the membrane potentials across neurons, as it gave similar results to the other methods without tuneable parameters. To compute firing rates, we used a rectified linear function of the membrane potential:
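These two read-outs can be sketched directly; the membrane-potential traces below are random stand-ins for the sampler's output, and the threshold `u0` and gain `g` are illustrative values:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy membrane-potential traces for N neurons over T time steps
# (stand-ins for the sampler's output; units arbitrary).
N, T = 50, 1000
u = 0.01 * rng.standard_normal((N, T)).cumsum(axis=1)

# LFP: the simple average of membrane potentials across neurons.
lfp = u.mean(axis=0)

# Firing rates: a rectified linear function of the membrane potential,
# with an assumed threshold u0 and gain g.
u0, g = 0.0, 10.0
rates = g * np.maximum(u - u0, 0.0)
```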
We thank G. Orbán for useful discussions and suggestions.