Achieving stable dynamics in neural circuits

  • Leo Kozachkov ,

    Roles Conceptualization, Formal analysis, Software, Writing – original draft

    ‡ LK and ML share first authorship on this work. JJS and EKM are joint senior principal investigators on this work.

    Affiliations The Picower Institute for Learning & Memory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Nonlinear Systems Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America

  • Mikael Lundqvist ,

    Roles Conceptualization, Formal analysis, Software

    Affiliations The Picower Institute for Learning & Memory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Department of Psychology, Stockholm University, Stockholm, Sweden

  • Jean-Jacques Slotine ,

    Roles Conceptualization, Supervision, Validation, Writing – review & editing

    Affiliations The Picower Institute for Learning & Memory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Nonlinear Systems Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America

  • Earl K. Miller

    Roles Conceptualization, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations The Picower Institute for Learning & Memory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America, Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America


The brain consists of many interconnected networks with time-varying, partially autonomous activity. There are multiple sources of noise and variation, yet activity has to eventually converge to a stable, reproducible state (or sequence of states) for its computations to make sense. We approached this problem from a control-theory perspective by applying contraction analysis to recurrent neural networks. This allowed us to find mechanisms for achieving stability in multiple connected networks with biologically realistic dynamics, including synaptic plasticity and time-varying inputs. These mechanisms included inhibitory Hebbian plasticity, excitatory anti-Hebbian plasticity, synaptic sparsity and excitatory-inhibitory balance. Our findings shed light on how stable computations might be achieved despite biological complexity. Crucially, our analysis is not limited to the stability of fixed geometric objects in state space (e.g., points, lines, planes), but extends to the stability of state trajectories, which may be complex and time-varying.

Author summary

Stability is essential for any complex system including, and perhaps especially, the brain. The brain’s neural networks are highly dynamic and noisy. Activity fluctuates from moment to moment and can be highly variable. Yet it is critical that these networks reach a consistent state (or sequence of states) for their computations to make sense. Failures in stability have consequences ranging from mild (e.g., incorrect decisions) to severe (disease states). In this paper we use tools from control theory and dynamical systems theory to find mechanisms which produce stability in recurrent neural networks (RNNs). We show that a kind of “unlearning” (inhibitory Hebbian and excitatory anti-Hebbian plasticity), balance of excitation and inhibition, and sparse anatomical connectivity all lead to stability. Crucially, we focus on the stability of neural trajectories. This is different from traditional studies of the stability of fixed points or planes. We do not assess what trajectories our networks will follow but, rather, when these trajectories will all converge towards each other to achieve stability.


Behavior emerges from complex neural dynamics unfolding over time in multi-area brain networks. Even in tightly controlled experimental settings, these neural dynamics often vary between identical trials [1,2]. This can be due to a variety of factors including variability in membrane potentials, inputs, plastic changes due to recent experience and so on. Yet, in spite of these fluctuations, brain networks must achieve computational stability: despite being “knocked around” by plasticity and noise, the behavioral output of the brain on two experimentally identical trials needs to be similar. How is this stability achieved?

Stability has played a central role in computational neuroscience since the 1980s, with the advent of models of associative memory that stored neural activation patterns as stable point attractors [3–7], although researchers have been thinking about the brain’s stability since as early as the 1950s [8]. The vast majority of this work is concerned with the stability of activity around points, lines, or planes in neural state space [9,10]. However, recent neurophysiological studies have revealed that in many cases, single-trial neural activity is highly dynamic, and therefore potentially inconsistent with a static attractor viewpoint [1,11]. Consequently, there have been a number of recent studies, both computational and experimental, which focus more broadly on the stability of neural trajectories [12,13], which may be complex and time-varying.

While these studies provide important empirical results and intuitions, they do not offer analytical insight into mechanisms for achieving stable trajectories in recurrent neural networks. Nor do they offer insights into achieving such stability in plastic (or multi-modal) networks. Here we focus on finding conditions that guarantee stable trajectories in recurrent neural networks and thus shed light onto how stable trajectories might be achieved in vivo.

To do so, we used contraction analysis, a concept developed in control theory [14]. Unlike a chaotic system where perturbations and distortions can be amplified over time, the population activity of a contracting network will converge towards the same trajectory, thus achieving stable dynamics (Fig 1). One way to understand contraction is to represent the state of a network at a given time as a point in the network’s ‘state-space’, for instance the space spanned by the possible firing rates of all the network’s neurons. This state-space has the same number of dimensions as the number of units n in the network. A particular pattern of neural firing rates corresponds to a point in this state-space. This point moves in the n dimensions as the firing rates change and traces out a trajectory over time.

Fig 1. Cartoon demonstrating the contraction property.

In a network with N neural units and S dynamic synaptic weights, the network activity can be described as a trajectory over time in an (N + S)-dimensional space. In a contracting system all such trajectories will converge exponentially in some metric towards each other over time, regardless of initial conditions. In other words, the distance between any two trajectories shrinks to zero, potentially after transient divergence (as shown).

In a contracting network, all such trajectories converge. These contracting dynamics have previously been used in several applications, including neural networks with winner-take-all dynamics [15,16], in a model of action-selection in the basal ganglia [17], and to explain how neural synchronization can protect from noise [18]. Here, we instead explore how contraction can be achieved generally in more complex recurrent neural networks (RNNs) including those with plastic weights. We used RNNs that received arbitrary time-varying inputs and had synapses that changed on biologically relevant timescales [19–21]. Our analysis reveals several novel classes of mechanisms that produced contraction including inhibitory Hebbian plasticity, excitatory anti-Hebbian plasticity, excitatory-inhibitory balance, and sparse connectivity. For the first two parts of the Results section, we focus on contraction of both neural activity and components of the weight matrix (Fig 1). For the remaining parts of the Results section, we hold the weights fixed (i.e., they become parameters, not variables) and focus on contraction of neural activity alone.


The main tool we used to characterize contraction was the logarithmic norm (also known as a matrix measure). The formal definition of the logarithmic norm is as follows (from [22] section 2.2.2): let $A$ be a matrix in $\mathbb{C}^{n \times n}$ and $\|\cdot\|_i$ be an induced matrix norm on $\mathbb{C}^{n \times n}$. Then the corresponding logarithmic norm is the function $\mu: \mathbb{C}^{n \times n} \to \mathbb{R}$ defined by

$$\mu(A) = \lim_{h \to 0^+} \frac{\|I + hA\|_i - 1}{h}$$

In the same way that different vector norms induce different matrix norms, different vector norms also induce different logarithmic norms. Two important logarithmic norms which we use throughout the paper are those induced by the vector 1-norm and the vector 2-norm:

$$\mu_1(A) = \max_j \Big[ \mathrm{Re}(a_{jj}) + \sum_{i \neq j} |a_{ij}| \Big], \qquad \mu_2(A) = \lambda_{\max}\Big( \frac{A + A^*}{2} \Big)$$

where $\lambda_{\max}$ denotes the largest eigenvalue. To study the contraction properties of RNNs, we applied the logarithmic norm to the RNN’s Jacobians. The Jacobian of a dynamical system is a matrix essentially describing the local ‘traffic laws’ of nearby trajectories of the system in its state space. More formally, it is the matrix of partial derivatives describing how a change in any system variable impacts the rate of change of every other variable in the system. It was shown in [14] that if the logarithmic norm of the Jacobian is negative then all nearby trajectories are funneled towards one another (see S1A Text Section 1.2 for technical review). This, in turn, implies that all trajectories are funneled towards one another at a rate called the contraction rate. The contraction rate and the logarithmic norm are related as follows: the contraction rate is the absolute value of the largest (least negative) value attained by the logarithmic norm of the Jacobian along the network’s trajectory. In other words, if the logarithmic norm of the Jacobian is upper bounded by some negative number $-c$, where $c > 0$, then the contraction rate is simply $c$.
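As a small illustration of the two logarithmic norms described above, the following sketch evaluates them with NumPy for a toy Jacobian. The function names and the example matrix are illustrative assumptions, not taken from the paper; the 1-norm version uses the standard column-sum formula.

```python
import numpy as np

def log_norm_1(A):
    """Logarithmic norm induced by the vector 1-norm (column-sum form)."""
    A = np.asarray(A, dtype=complex)
    # For each column j: sum of |a_ij| over rows i != j.
    off_diag = np.abs(A).sum(axis=0) - np.abs(np.diag(A))
    return float(np.max(np.diag(A).real + off_diag))

def log_norm_2(A):
    """Logarithmic norm induced by the vector 2-norm:
    largest eigenvalue of the Hermitian part of A."""
    A = np.asarray(A, dtype=complex)
    return float(np.linalg.eigvalsh((A + A.conj().T) / 2).max())

# Hypothetical constant Jacobian of a two-unit system.
J = np.array([[-2.0, 1.0],
              [0.0, -3.0]])

mu1 = log_norm_1(J)  # negative: contracting in the 1-norm
mu2 = log_norm_2(J)  # negative: contracting in the 2-norm
```

Both values are negative for this matrix, so under either norm the toy system would count as contracting, with contraction rate given by the absolute value of the corresponding logarithmic norm.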

Importantly, the above description can be generalized to different metrics. A metric is a symmetric, positive definite matrix which generalizes the notion of Euclidean distance. Every invertible coordinate transformation $y = \theta x$ yields a metric $M = \theta^T \theta$. To see this, consider the squared norm $\|y\|^2 = y^T y = x^T \theta^T \theta x = x^T M x$. Thus, the norm of $y$ is related to the norm of $x$ through the metric $M$. If one can find a metric in which the network is contracting (in the sense that its Jacobian has negative logarithmic norm), this implies contraction for all coordinate systems. This makes contraction analysis useful for analyzing systems where exponential convergence of trajectories is preceded by transient divergence (Fig 1), as in recent models of motor cortex [23,24]. In this case, it is usually possible to find a coordinate system in which the convergence of trajectories is ‘pure’. For example, linear stable systems were recently used in the motor control literature to find initial conditions which produce the most energetic neural response [23]. They are ‘purely’ contracting in a metric defined by the eigenvectors of the weight matrix (see Example 5.1 in [14]) but transiently diverging in the identity metric (i.e., $M = I$). Note that the identity metric corresponds to $\theta = I$, which is simply the original, untransformed coordinate system.
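The role of the metric can be made concrete with a toy stable linear system. The sketch below is an assumption-laden illustration (the matrix `A` is hypothetical, chosen to be strongly non-normal): the 2-norm logarithmic norm is positive in the original coordinates, so transient divergence is possible, yet it is negative after the change of coordinates $y = \theta x$ built from the eigenvectors.

```python
import numpy as np

def log_norm_2(A):
    """Largest eigenvalue of the Hermitian part of A."""
    return float(np.linalg.eigvalsh((A + A.conj().T) / 2).max())

# Hypothetical stable but non-normal linear system dx/dt = A x.
A = np.array([[-1.0, 10.0],
              [0.0, -2.0]])

eigvals, V = np.linalg.eig(A)   # columns of V are eigenvectors
theta = np.linalg.inv(V)        # coordinate change y = theta x
F = theta @ A @ V               # Jacobian expressed in y-coordinates
M = theta.T @ theta             # corresponding metric

mu_identity = log_norm_2(A)     # positive: not contracting with M = I
mu_metric = log_norm_2(F)       # negative: contracting in the metric M
```

Here `F` is (numerically) diagonal with the eigenvalues of `A` on its diagonal, which is why the convergence looks ‘pure’ in the transformed coordinates.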

Inhibitory Hebbian plasticity & excitatory anti-Hebbian plasticity produce contraction

It is known that certain forms of synaptic plasticity can quickly lead to extreme instabilities if left unchecked [9,25]. Thus, the same feature that can aid learning can also yield chaotic neural dynamics if not regulated. It is not known how the brain resolves this dilemma. A growing body of evidence, both experimental and computational, suggests that inhibitory plasticity (that is, the strengthening of inhibitory synapses) can stabilize neural dynamics while simultaneously allowing for learning/training in neural circuits [26–28]. By using the Jacobian analysis outlined above, we found that inhibitory Hebbian synaptic plasticity (as well as excitatory anti-Hebbian plasticity) indeed leads to stable dynamics in neural circuits. Specifically, we considered neural networks of the following common form:

$$\dot{x}_i = h(x_i) + \sum_j W_{ij} x_j + u_i(t) \tag{1}$$

where the term $x_i$ denotes the ‘activation’ of neuron $i$ as a function of time. Here we follow other authors [23] and interpret $x_i$ as the deviation from the baseline firing rate of neuron $i$. Note that this interpretation assumes that the baseline firing rates are positive (thus allowing for $x$ to be negative) and large enough so that baseline + $x$ > 0. The term $W_{ij}$ denotes the weight between neurons $i$ and $j$, and the term $h(x_i)$ captures the dynamics neuron $i$ would have in the absence of synaptic input, including self-feedback terms arising from the diagonal elements of the weight matrix. In other words, it captures the dynamics neuron $i$ would have if $W_{ij} = 0$ for all $i$ and $j$. The term being summed represents the weighted contribution of all the neurons in the network on the activity of neuron $i$. Finally, the term $u_i(t)$ represents external input into neuron $i$.

We did not constrain the inputs into the RNN (except that they were not infinite) and we did not specify the particular form of $h(x_i)$ except that it should be a leak term (i.e., has a negative derivative for all $x$, see S1A Text Section 2.2.4, e.g., $h(x_i) = -x_i$). Furthermore, we made no assumptions regarding the relative timescales of synaptic and neural activity. Synaptic dynamics were treated on an equal footing with neural dynamics. We considered synaptic plasticity of the following correlational form [29]:

$$\dot{W}_{ij} = -k_{ij} x_i x_j - \gamma(t) W_{ij} \tag{2}$$

where the term $k_{ij} > 0$ is the learning rate for each synapse and $\gamma(t) > 0$ is a decay factor for each synapse. For technical reasons outlined in the appendix (S1A Text Section 3), we restricted $K$, the matrix containing the learning rates $k_{ij}$, to be positive semi-definite, symmetric, and have positive entries. A particular example of $K$ satisfying these constraints is to have the learning rates of all synapses be equal (i.e., $k_{ij} = k > 0$).

Before we show that (2) leads to overall synaptic and neural contraction, it’s useful to spend some time interpreting this plasticity. Since $W_{ij}$ can be positive or negative (corresponding to excitatory and inhibitory synapses, respectively), and $x_i x_j$ can be positive or negative (corresponding to correlated and anticorrelated neurons, respectively), there are four cases to consider. We summarize these cases in Table 1 and discuss them in detail below. By Hebbian plasticity we refer to the increase of synaptic efficiency between correlated neurons [30]. In the context of simple neural networks with scalar weights, as we consider here, efficiency refers to the absolute value $|w|$ of a weight. Thus, for excitatory synapses, (2) in fact describes anti-Hebbian plasticity, because the positive synaptic weight becomes less positive (and thus less efficient) between correlated neurons and more positive (thus more efficient) for anticorrelated neurons. For inhibitory synapses, (2) describes Hebbian plasticity because the direction of synaptic weight change is negative between correlated neurons, and thus the synapse becomes more efficient [31,32], while for anticorrelated neurons the direction of synaptic weight change is positive, and thus the synapse becomes less efficient. Plasticity of this form produced contracting neural and synaptic dynamics regardless of the initial values of the weights and neural activity (Figs 2 and 3). The black trace of Fig 3A shows that this is not simply due to the weights decaying to 0. Thus, this plasticity is not only contraction preserving, it is contraction ensuring. Furthermore, we showed that the network is contracting in a non-identity metric (which we derive from the system parameters in $K$), opening up the possibility of transient divergent dynamics in the identity metric, as seen in the modelling of motor dynamics [23].
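The qualitative behavior described above can be explored with a minimal simulation sketch. It assumes the network has the form $\dot{x}_i = -x_i + \sum_j W_{ij} x_j + u_i(t)$ with plasticity $\dot{W}_{ij} = -k\,x_i x_j - \gamma W_{ij}$ (equal learning rates, constant decay), integrated with forward Euler. The network size, input, step size and parameter values are all illustrative assumptions and are not the settings used for the paper's figures.

```python
import numpy as np

def simulate_pair(n=5, k=1.0, gamma=1.0, dt=0.005, steps=4000, seed=0):
    """Run two copies of the plastic RNN from different initial conditions
    under the same input; return their distance in (x, W) space over time."""
    rng = np.random.default_rng(seed)
    xs = [rng.normal(size=n) for _ in range(2)]             # neural states
    Ws = [0.1 * rng.normal(size=(n, n)) for _ in range(2)]  # synaptic states
    phases = np.arange(n)
    dist = np.empty(steps + 1)
    for step in range(steps + 1):
        dist[step] = np.sqrt(np.sum((xs[0] - xs[1]) ** 2)
                             + np.sum((Ws[0] - Ws[1]) ** 2))
        u = np.sin(step * dt + phases)  # shared time-varying input (assumed)
        for m in range(2):
            x, W = xs[m], Ws[m]
            dx = -x + W @ x + u                       # leak + recurrence + input
            dW = -k * np.outer(x, x) - gamma * W      # anti-Hebbian term + decay
            xs[m] = x + dt * dx
            Ws[m] = W + dt * dW
    return dist

dist = simulate_pair()
shrink_factor = dist[-1] / dist[0]  # small value: trajectories have converged
```

In this toy setting the distance between the two runs shrinks by several orders of magnitude even though both the activity and the weights keep changing under the shared input.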

Table 1. Summary of the effect of the plasticity described in Eq (2) on excitatory and inhibitory synapses for correlated or anticorrelated pre- and postsynaptic neurons.

Fig 2. Contracting dynamics of neural and synaptic activity.

Euclidean distances between synaptic and neural trajectories demonstrate exponential shrinkage over time. The top row of panels shows the activation of a randomly selected neural unit (black) and synapse (blue) across two simulations (dotted and solid line). The bottom row shows the average Euclidean distance in state space for the whole population across simulations with distinct, randomized starting conditions. Leftmost Panel: Simulations of a contracting system where only starting conditions differ over simulations. Center Panel: the same as in Leftmost but with an additional random pulse perturbation in one of the two simulations indicated by a red background shading. Rightmost Panel: same as in Center Panel but with additional sustained noise, unique to each simulation.

Fig 3.

The anti-Hebbian plasticity pushes the weight matrix towards symmetry. (Left) Plotted are the spectral norms (largest singular value) of the overall weight matrix as well as the anti-symmetric part of that matrix. Since every square matrix can be uniquely decomposed as the sum of a symmetric and an anti-symmetric component, $\frac{1}{2}(W + W^T)$ and $\frac{1}{2}(W - W^T)$ respectively, the teal curve decaying to zero implies that the matrix becomes symmetric. The black trace shows the spectral norm of the overall weight matrix. If this quantity does not decay to zero, it implies that not all the weights have decayed to zero. (Right) We plot the largest eigenvalue of the symmetric part of W. A prerequisite for overall contraction of the network is that this quantity be less than or equal to the ‘leak-rate’ of the individual neurons. The dotted line shows our theoretical upper bound for this quantity, and the solid line shows the actual value taken from a simulation (see Methods).

To explain how inhibitory Hebbian plasticity and excitatory anti-Hebbian plasticity work to produce contraction across a whole network, we needed to deal with the network in a holistic fashion, not by analyzing the dynamics of single neurons. To do so, we conceptualized RNNs with dynamic synapses as a single system formed by combining two subsystems, a neural subsystem and a synaptic subsystem. We showed that the above plasticity rule led the neural and synaptic subsystems to be independently contracting. Thus contraction analysis of the overall system then boiled down to examining the interactions between these subsystems [33].

We found that this plasticity works like an interface between these systems. It produces two distinct effects that push networks toward contraction. First, it makes the synaptic weight matrix symmetric (Fig 3A, red trace). This means that the weight from neuron $i$ to $j$ is the same as that from $j$ to $i$. We showed this by using the fact that every matrix can be written as the sum of a purely symmetric matrix and a purely anti-symmetric matrix. An anti-symmetric matrix is one where the $ij$ element is the negative of the $ji$ element (i.e., $W_{ij} = -W_{ji}$) and all the diagonal elements are zero. We then showed that anti-Hebbian plasticity shrinks the anti-symmetric part of the weight matrix to zero, implying that the weight matrix becomes symmetric. The symmetry of the weight matrix ‘cancels out’ off-diagonals in the Jacobian matrix (see S1A Text Section 3) of the overall neural-synaptic system. Loosely speaking, off-diagonal terms in the Jacobian represent potentially destabilizing cross-talk between the two subsystems. Second, anti-Hebbian plasticity makes the weight matrix negative semi-definite. This means that all its eigenvalues are less than or equal to zero (Fig 3).
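Both effects can be sketched directly from the plasticity rule, assuming it takes the form $\dot{W}_{ij} = -k_{ij} x_i x_j - \gamma(t) W_{ij}$ with $K$ symmetric and positive semi-definite as stated earlier. This is only an informal outline under those assumptions (with $\circ$ denoting the elementwise product); the authors' full argument is in S1A Text Section 3.

```latex
\begin{aligned}
W &= W_s + W_a, \qquad W_s = \tfrac{1}{2}(W + W^T), \qquad W_a = \tfrac{1}{2}(W - W^T) \\
\dot{W} &= -K \circ (x x^T) - \gamma(t)\, W \\
\dot{W}_a &= \tfrac{1}{2}(\dot{W} - \dot{W}^T) = -\gamma(t)\, W_a
  \qquad \text{since } K \circ (x x^T) \text{ is symmetric} \\
W_a(t) &= \exp\!\Big(-\int_0^t \gamma(s)\, ds\Big)\, W_a(0) \\
\dot{W}_s &= -K \circ (x x^T) - \gamma(t)\, W_s
\end{aligned}
```

The anti-symmetric part therefore decays on its own whenever $\gamma$ stays bounded away from zero. For the symmetric part, $K \circ (x x^T)$ is positive semi-definite by the Schur product theorem, so the drive $-K \circ (x x^T)$ can only push $W_s$ in the negative semi-definite direction while the initial condition is forgotten.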

Sparse connectivity pushes networks toward contraction

Synaptic connectivity in the brain is extraordinarily sparse. The adult human brain contains at least $10^{11}$ neurons yet each neuron forms and receives on average only $10^3$–$10^4$ synaptic connections [34]. If the brain’s neurons were all-to-all connected this number would be on the order of $10^{11}$ synaptic connections per neuron. Even in local patches of cortex, such as we model here, connectivity is far from all-to-all; cortical circuits are sparse [35]. Our analyses revealed that sparse connectivity helps produce global network contraction for many types of synaptic plasticity.

To account for the possibility that some synapses may have much slower plasticity than others (and can thus be treated as synapses with fixed amplitude), we made a distinction between the total number of synapses and the total number of plastic synapses. These plastic synapses then changed on a similar time-scale as the neural firing rates. By neural dynamics, we mean the change in neural activity as a function of time. We analyzed RNNs with the structure:

$$\dot{x}_i = h_i(x_i) + \sum_j W_{ij}\, r(x_j) + u_i(t) \tag{3}$$

where $h_i(x_i)$ is a nonlinear leak term (see S1A Text Section 2.2.4), and $r(x_j)$ is a nonlinear activation function. The RNNs analyzed in this section are identical to those analyzed in the previous section, with the exception of the $r$ terms, which were constrained to be linear in the previous section. Under the assumption that the plastic synapses have a ‘forgetting term’, we show in the appendix (S1A Text Section 4) that if the following inequality is satisfied for every neuron, then the overall network is contracting:

$$p_i \left( g_{\max} w_{\max} + \alpha_i r_{\max} \right) < \beta_i \tag{4}$$

where $p_i$ denotes the total number of afferent synapses into neuron $i$ and $\alpha_i$ denotes the fraction of afferent plastic synapses into neuron $i$. The term $w_{\max}$ refers to the maximum possible absolute efficiency of any single synapse. That is, $w_{\max} = \max_{i,j} |w_{ij}|$. Similarly, the term $r_{\max}$ refers to the maximum possible absolute value of $r$. That is, $r_{\max} = \max_{i,t} |r_i(t)|$. The term $\beta_i$ denotes the contraction rate of the $i$th isolated neuron. That is, $\beta_i = -\sup_{x_i} \partial h_i / \partial x_i$. Recall from the introduction that the contraction rate measures how quickly the trajectories of a contracting system reconvene after perturbation. Finally, $g_{\max}$ refers to the maximum gain of any neuron in the network. That is, $g_{\max} = \max_{i,t} |\partial r / \partial x_i|$. Note that because $\beta_i$ is a positive number by assumption, it is always possible to decrease $p_i$ to the point where (4) is satisfied. Of course, it is possible that the only value of $p_i$ that satisfies (4) is the trivial solution $p_i = 0$, which corresponds to removing all interconnections between neurons. Since these neurons are assumed to be contracting in isolation, the network is then trivially contracting. However, if the term inside the parentheses of (4) is small enough, or $\beta_i$ is large enough, intermediate values of $p_i$ can be found which satisfy the inequality. Because increasing the sparsity of a network corresponds to decreasing $p_i$, we may conclude that increasing the sparsity of connections pushes the system in the direction of contraction. Note that (4) also implies that the faster the individual neurons are contracting (i.e., the larger $\beta_i$ is), the more densely they can be connected with other neurons while still preserving overall contraction.
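To make the trade-off between connection density and single-neuron contraction rate concrete, the sketch below assumes that condition (4) has the form $p_i (g_{\max} w_{\max} + \alpha_i r_{\max}) < \beta_i$, which is our reading of the surrounding definitions rather than a verbatim copy of the published inequality. The function names and parameter values are hypothetical.

```python
import math

def sparsity_condition_holds(p, alpha, beta, w_max, r_max, g_max):
    """Check the assumed form of condition (4) for one neuron."""
    return p * (g_max * w_max + alpha * r_max) < beta

def max_afferent_synapses(alpha, beta, w_max, r_max, g_max):
    """Largest integer in-degree p that still satisfies the assumed condition."""
    per_synapse = g_max * w_max + alpha * r_max
    if per_synapse == 0:
        return math.inf  # no coupling cost: any in-degree is admissible
    return max(math.ceil(beta / per_synapse) - 1, 0)

# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=0.5, w_max=0.125, r_max=0.25, g_max=1.0)
p_slow = max_afferent_synapses(beta=1.0, **params)  # slower isolated neuron
p_fast = max_afferent_synapses(beta=2.0, **params)  # faster isolated neuron
```

With these numbers the faster-contracting neuron tolerates more afferent synapses than the slower one, which is the qualitative point made in the text.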

Up to now we have focused our analysis on the case where synaptic weights vary on a timescale comparable to neurons, and must therefore be factored into the stability analysis. For the next two sections, we apply contraction analysis to neural networks in the case where the weights may be regarded as fixed relative to the neural dynamics (i.e., there is a separation of timescales).

E-I balance leads to contraction in static RNNs

Apart from making connections sparse, one way to ensure contraction is to make synaptic weights small. This can be seen for the case with static synapses by setting $\alpha_i = 0$ in the section above, where $w_{\max}$ now has to be small to ensure contraction. Intuitively, this is because very small weights mean that neurons cannot exert much influence on one another. If the neurons are stable before interconnection, they will remain so. Since strong synaptic weights are commonly observed in the brain, we were more interested in studying when contraction can arise irrespective of weight amplitude. Negative and positive synaptic currents are approximately balanced in biology [36–38]. We reasoned that such balance might allow much larger weight amplitudes while still preserving contraction, since most of the impact of such synapses cancels and the net effect is small. This was indeed the case. To show this, we studied the same RNN as in the section above, while assuming additionally that the weights are static. In particular, we show in the appendix (S1A Text Section 5) that contraction can be assessed by studying the eigenvalues of the symmetric part of $W$ (i.e., $\frac{1}{2}(W + W^T)$).

Before we discuss the above result in detail, it is useful here to quickly review some facts about the stability of nonlinear systems as compared to the stability of linear systems. In particular, the eigenvalues of $W$ are only informative for assessing contraction in regions where the dynamics may be regarded as linear. This is because in linear time-invariant (LTI) systems (i.e., $\dot{x} = Ax$) stability is completely characterized in terms of the eigenvalues of $A$. However, this is not true for nonlinear systems, even those of the linear time-varying form $\dot{x} = A(t)x$. To see this, consider the following counter-example (from [39], section 4.2.2):

$$\frac{d}{dt}\begin{bmatrix} x \\ y \end{bmatrix} = A(t) \begin{bmatrix} x \\ y \end{bmatrix}, \qquad A(t) = \begin{bmatrix} -1 & e^{2t} \\ 0 & -1 \end{bmatrix} \tag{5}$$

The eigenvalues of $A(t)$ are $(-1, -1)$ for all time; however, one can verify by direct evaluation that the solution of this system satisfies $y = y(0)e^{-t}$ and $\dot{x} + x = y(0)e^{t}$, which is unstable along $x$. However, it can be shown straightforwardly that if the eigenvalues of the symmetric part of $A(t)$ are all negative, then the system is stable [39]. This fact underlies our analysis, and highlights the reason why the eigenvalues of the symmetric part of $W$ are important for stability.
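The counter-example can be checked numerically. The sketch below assumes it is the standard textbook system $\dot{x} = -x + e^{2t} y$, $\dot{y} = -y$, whose closed-form solution is $y(t) = y(0)e^{-t}$ and $x(t) = (x(0) - y(0)/2)e^{-t} + (y(0)/2)e^{t}$. The sampled times and initial conditions are arbitrary choices.

```python
import numpy as np

def A(t):
    """System matrix of the assumed linear time-varying counter-example."""
    return np.array([[-1.0, np.exp(2.0 * t)],
                     [0.0, -1.0]])

def solution(t, x0, y0):
    """Closed-form solution of dx/dt = -x + exp(2t) y, dy/dt = -y."""
    y = y0 * np.exp(-t)
    x = (x0 - y0 / 2.0) * np.exp(-t) + (y0 / 2.0) * np.exp(t)
    return x, y

times = np.linspace(0.0, 5.0, 6)  # t = 0, 1, ..., 5

# Eigenvalues of A(t) stay at -1 for every sampled time.
eigs = np.array([np.linalg.eigvals(A(t)) for t in times])

# Largest eigenvalue of the symmetric part: -1 + exp(2t) / 2.
sym_max = np.array([np.linalg.eigvalsh((A(t) + A(t).T) / 2).max()
                    for t in times])

# x(t) grows without bound even though the eigenvalues of A(t) are negative.
x_vals = np.array([solution(t, 1.0, 1.0)[0] for t in times])
```

The eigenvalues of `A(t)` never leave $-1$, while the largest eigenvalue of the symmetric part becomes positive shortly after $t = 0$, which is consistent with the growth of `x_vals`.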

Returning to our results, we show that if excitatory to inhibitory connections are of equal amplitude (and opposite sign) as inhibitory to excitatory connections, they will not interfere negatively with stability—regardless of amplitude (see S1A Text Section 5). This is because connections between inhibitory and excitatory units will be in the off-diagonal of the overall weight matrix and get cancelled out when computing the symmetric part. As an intuitive example, consider a two-neuron circuit made of one excitatory neuron and one inhibitory neuron connected recurrently (as in [40], Fig 1A). Assume that the overall weight matrix has the following structure:
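One matrix consistent with this description is written out below as an illustration. The diagonal symbol $w$ follows the text; the off-diagonal amplitude $a$ is a hypothetical symbol introduced here, with the first index standing for the excitatory neuron and the second for the inhibitory neuron.

```latex
W = \begin{bmatrix} w & -a \\ a & w \end{bmatrix},
\qquad
\frac{W + W^T}{2} = \begin{bmatrix} w & 0 \\ 0 & w \end{bmatrix}
```

Under this assumed structure the symmetric part does not depend on $a$ at all, no matter how large the E-to-I and I-to-E weights are.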

When taking the symmetric part of this matrix, the off-diagonal elements cancel out, leaving only the diagonal elements to consider. Since the eigenvalues of a diagonal matrix are simply its diagonal elements, we can conclude that if the excitatory and inhibitory subpopulations are independently contracting (w is less than the contraction rate of an isolated neuron), then overall contraction is guaranteed. It is straightforward to generalize this simple two-neuron example to circuits achieving E-I balance through interacting populations (see S1A Text Section 5). It is also straightforward to generalize to the case where E-I and I-E connections do not cancel out exactly neuron by neuron, but rather they cancel out in a statistical sense where the mean amplitudes are matched. Another way to view this E-I balance is in the framework of combinations of contracting systems (Fig 4). It is known that combining independently contracting systems in negative feedback preserves contraction [14]. We show that E-I balance actually translates to this negative feedback and thus can preserve contraction.
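The two-neuron argument can be sketched numerically. The example assumes linear units with leak $h(x) = -x$ (so an isolated neuron has contraction rate 1), a self-weight `w = 0.5`, and hypothetical off-diagonal amplitudes. It compares balanced coupling (equal and opposite) against same-sign coupling of the same amplitude.

```python
import numpy as np

def log_norm_2(A):
    """Largest eigenvalue of the symmetric part of a real matrix."""
    return float(np.linalg.eigvalsh((A + A.T) / 2).max())

def jacobian(w, w_ei, w_ie, leak=1.0):
    """Jacobian of a linear two-neuron circuit: leak plus recurrent weights.
    w_ei is the E-to-I weight, w_ie the I-to-E weight (names are assumptions)."""
    W = np.array([[w, w_ie],
                  [w_ei, w]])
    return -leak * np.eye(2) + W

amplitudes = [0.1, 1.0, 10.0, 100.0]

# Balanced: E-to-I is +a, I-to-E is -a.
balanced = [log_norm_2(jacobian(0.5, a, -a)) for a in amplitudes]

# Unbalanced (positive feedback): both off-diagonal weights are +a.
unbalanced = [log_norm_2(jacobian(0.5, a, a)) for a in amplitudes]
```

In the balanced case the logarithmic norm stays at `w - leak` for every amplitude, whereas with same-sign coupling it grows with the amplitude and eventually becomes positive.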

Fig 4. Cartoon illustrating the combination properties of contracting systems.

A) Two isolated, contracting systems. The Jacobian of the overall system is block diagonal, with all zeros on the off-diagonal—corresponding to the fact that the systems are not connected. B) If one of the systems is connected to the other in a feedforward manner, the overall Jacobian is changed by the presence of non-zero terms on the bottom left block—corresponding to the connections going from the ‘top’ system to the ‘bottom’ system. This Jacobian may not be negative definite. However, it is known that a coordinate change exists which will make it negative definite. Thus, hierarchically connected contracting systems are contracting. C) If the systems are reciprocally connected, the system may lose its contracting properties (for example in the case of positive feedback). However, it is known that if the feedforward connections (blue) are ‘equal and opposite’ to the feedback connections (green) then the overall system is contracting. We use this property in the main text to prove that inhibitory Hebbian plasticity and excitatory anti-Hebbian plasticity lead to contracting neural circuits.

Relation to other models with fading memory

As can be seen in Fig 2, contracting systems have ‘fading memories’. This means that past events will affect the current state, but that the impact of a transient perturbation gradually decays over time. Consider the transient input in Fig 2 (red panel) presented on only one of the two trials to the network. Because the input is only present on one trial and not the other we call it a perturbation. When this perturbation occurs, the trajectories of the two trials become separated. However, after the disturbance is removed, the distance between the network’s trajectories starts shrinking back to zero again. Thus, the network does not hold onto the memory of the perturbation indefinitely—the memory fades away. A similar property has been used in Echo State Networks (ESNs) and liquid state machines (LSMs) to perform useful brain-inspired computations [41,42]. These networks are an alternative to classical attractor models in which neural computations are performed by entering stable states rather than by ‘fading memories’ of external inputs [43].

While there are several distinctions between the networks described above and ESNs (e.g. ESNs are typically discrete time dynamical systems, rather than continuous), we show in the appendix (S1A Text Section 5.1) that they are a special case of the networks considered here. We show this for ESNs as opposed to LSMs because LSMs are typically implemented on integrate and fire neurons which, because of the spike reset, have a sharp discontinuity in their dynamics—making them unamenable to contraction analysis.

By highlighting the link between contraction and ESNs, we demonstrate that the contracting neural networks considered here are in principle capable of performing useful and interesting neural computations. In other words, the strong stability properties of contracting neural networks do not automatically prohibit them from doing interesting computations. By working within the framework of contraction analysis we were able to study networks both with dynamic synapses and non-identity metrics—a much broader model space than allowed by the standard ESN framework.
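The fading-memory property can be illustrated with a standard discrete-time echo state network. This is a generic sketch, not the continuous-time construction from S1A Text: it assumes the update $x_{t+1} = \tanh(W x_t + W_{in} u_t)$ with the spectral norm of $W$ scaled to 0.9, a well-known sufficient condition for the echo state property. Sizes, seed and inputs are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 50, 200

W = rng.normal(size=(n, n))
W *= 0.9 / np.linalg.norm(W, 2)   # rescale so the largest singular value is 0.9
W_in = rng.normal(size=n)         # input weights (assumed scalar input)
inputs = rng.normal(size=T)       # shared input sequence

def run(x):
    """Drive the reservoir with the shared inputs from initial state x."""
    traj = [x]
    for u in inputs:
        x = np.tanh(W @ x + W_in * u)
        traj.append(x)
    return np.array(traj)

xa = run(rng.normal(size=n))
xb = run(rng.normal(size=n))

# Distance between the two runs at every time step.
dist = np.linalg.norm(xa - xb, axis=1)
```

Because `tanh` is 1-Lipschitz, each step shrinks the distance by at least the factor 0.9, so the two runs forget their different initial states while still tracking the common input.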


We studied a fundamental question in neuroscience: How do neural circuits maintain stable dynamics in the presence of disturbance, noisy inputs and plastic change? We approached this problem from the perspective of dynamical systems theory, in light of the recent successes of understanding neural circuits as dynamical systems [44]. We focused on contracting dynamical systems, which are as yet largely unexplored in neuroscience, as a solution to the problem outlined above. We did so for three reasons:

  1. Contracting systems can be input-driven. This is important because neural circuits are typically bombarded with time-varying inputs either from the environment or from other brain areas. Previous stability analyses have focused primarily on the stability of RNNs without time-varying input. These analyses are most insightful in situations where the input into a circuit can be approximated as either absent or constant. However, naturalistic stimuli tend to be highly time-varying and complex [45].
  2. Contracting systems are robust to noise and disturbances. Perturbations to a contracting system are forgotten at the rate of the contraction, and noise therefore does not accumulate over time. Importantly, the rate of forgetting (i.e., the contraction rate) does not change with the size of the perturbation. Thus dynamic stability can co-exist with high trial-to-trial variability in contracting neural networks, as observed in biology.
  3. Contracting systems can be combined with one another in ways that preserve contraction (Fig 4). This is not true of most dynamical systems which can easily ‘blow up’ when connected in feedback with one another [8]. This combination property is important as it is increasingly clear that cognitive functions such as working memory or attention are distributed in multiple cortical and sub-cortical regions [46,47]. In particular, prefrontal cortex has been suggested as a hub that can reconfigure the cortical effective network based on task demands [48]. Brain networks must therefore be able to effectively reconfigure themselves on a fast time-scale without loss of stability. Most attempts in modelling cognition, for instance working memory, tend to utilize single and often autonomous networks. Contracting networks display a combination of input-driven and autonomous dynamics, and thus have key features necessary for combining modules into flexible and distributed networks.
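
The combination property in the third point can be sketched for one simple case, a hierarchical (feedforward) connection between two modules, which is among the combinations known from contraction analysis to preserve contraction [14]. The sketch below is not the model of Eqs (1) and (2) and is not claimed to reproduce Fig 4. The stand-in rate dynamics, the coupling matrix C, the module sizes and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, steps = 10, 1e-2, 3000  # assumed module size, step size, and duration

def contracting_weights(scale=0.5):
    """Random weights rescaled so the largest singular value equals `scale` (< 1)."""
    W = rng.standard_normal((n, n))
    return W * scale / np.linalg.norm(W, 2)

W1, W2 = contracting_weights(), contracting_weights()
C = rng.standard_normal((n, n))  # assumed feedforward coupling, module 1 -> module 2

def step(x1, x2, u):
    """One forward-Euler step of the two-module hierarchy."""
    dx1 = -x1 + W1 @ np.tanh(x1) + u                # module 1 is driven by the input
    dx2 = -x2 + W2 @ np.tanh(x2) + C @ np.tanh(x1)  # module 2 is driven by module 1
    return x1 + dt * dx1, x2 + dt * dx2

# Two copies of the combined system, started from different initial conditions.
a = (rng.uniform(-1, 1, n), rng.uniform(-1, 1, n))
b = (rng.uniform(-1, 1, n), rng.uniform(-1, 1, n))

gap = []
for k in range(steps):
    u = np.cos(0.3 * k * dt) * np.ones(n)  # identical time-varying input to both copies
    a, b = step(*a, u), step(*b, u)
    gap.append(np.linalg.norm(np.concatenate(a) - np.concatenate(b)))

print(gap[0], gap[-1])
```

Both copies receive the same input, so a shrinking gap between them is what one would expect if the combined system remains contracting.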

To understand what mechanisms lead to contraction in neural circuits, we applied contraction analysis to RNNs. For RNNs with static weights, we found that the well-known Echo State Networks are a special case of contracting networks. Since realistic synapses are complex dynamical systems in their own right, we went one step further and asked when neural circuits with dynamic synapses would be contracting. We found that inhibitory Hebbian plasticity as well as excitatory anti-Hebbian plasticity and synaptic sparsity all lead to contraction in a broad class of RNNs.

Inhibitory plasticity has recently been the focus of many experimental and computational studies due to its stabilizing nature as well as its capacity for facilitating nontrivial computations in neural circuits [27,28,49]. It is known to give rise to excitatory-inhibitory balance and has been implicated as the mechanism behind many experimental findings such as sparse firing rates in cortex [28]. Similarly, anti-Hebbian plasticity exists across many brain areas and species, such as salamander and rabbit retina [31], rat hippocampus [50,51], electric fish electrosensory lobe [52] and mouse prefrontal cortex [53]. Anti-Hebbian dynamics can give rise to sparse neural codes which decrease correlations between neural activity and increase overall stimulus representation in the network [54]. Because of this on-line decorrelation property, anti-Hebbian plasticity has also been implicated in predictive coding [31,52]. Our findings suggest that it also increases the stability of networks.

For more general forms of synaptic dynamics, we showed that synaptic sparsity pushes RNNs towards being contracting. This aligns well with the experimental observation that synaptic connectivity is typically extremely sparse in the brain. Our results suggest that sparsity may be one factor pushing the brain towards dynamical stability. It is therefore interesting that synapses are regulated by homeostatic processes where synapses neighboring an upregulated synapse are immediately downregulated [55]. On the same note, we also observed that balancing the connections between excitatory and inhibitory populations leads to contraction. Balance between excitatory and inhibitory synaptic inputs is often observed in biology [36–38], and could thus serve contractive stability purposes. Related computational work on spiking networks has suggested that balanced synaptic currents lead to fast response properties, efficient coding and increased robustness of function, and can support complex dynamics related to movements [21,56–58].

A main advantage to our approach is that it provides provable certificates of global contractive stability for nonlinear, time-varying RNNs with synaptic plasticity. This distinguishes it from previous works where—while very interesting and useful—stability is experimentally observed, but not proven [12]. In some cases [23,24], linear stability around the origin is proven (which implies that there is a contraction region around the origin) but the size of this region is neither established nor sought after. Indeed, one future direction we are pursuing is the question of: given an RNN, can one provide a certificate of contractive stability in a region? An answer to this question would shed light on the stability properties of known RNN models in the literature (e.g. trained RNNs, biologically-detailed spiking models, etc.).

Experimental neuroscience is moving in the direction of studying many interacting neural circuits simultaneously. This is fueled by the expanding capabilities of recording from multiple areas simultaneously in vivo and studying their interactions. This increases the need for multi-modal cognitive models. We therefore anticipate that the presented work can provide a useful foundation for understanding cognition in noisy and distributed computational networks.

Materials and methods

In the interest of space and cohesion, we’ve placed all the detailed proofs of the main results in the appendix. The appendix was written to be self-contained, and thus also contains additional definitions of mathematical objects used throughout the text. Simulations (Figs 2 and 3) were performed in Python. Code to reproduce the figures is available at []. Numerical integration was performed using sdeint, an open-source collection of numerical algorithms for integrating stochastic ordinary differential equations.

Fig 2 details:

All parameters and time constants in Eqs (1) and (2) were set to one. The integration step-size, dt, was set to 1e-2.

Initial conditions for both neural and synaptic activation were drawn uniformly between -1 and 1. Inputs into the network were generated by drawing N frequencies uniformly between dt and 100dt, phases between 0 and 2π, amplitudes between 0 and 20 and generating an N x Time vector of sinusoids with the above parameters.
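
The input and initial-condition recipe above could be sketched as follows. The network size N, the number of time steps, the random seed and the interpretation of the frequencies as cycles per unit time are assumptions not stated in the text. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1e-2
N, n_steps = 50, 1000  # assumed network size and number of time steps

# One frequency, phase, and amplitude per unit, drawn uniformly from the stated ranges.
freqs = rng.uniform(dt, 100 * dt, size=N)
phases = rng.uniform(0.0, 2.0 * np.pi, size=N)
amps = rng.uniform(0.0, 20.0, size=N)

# N x Time array of sinusoids.
# Assumption: frequencies are in cycles per unit time, hence the factor 2*pi.
t = np.arange(n_steps) * dt
inputs = amps[:, None] * np.sin(2.0 * np.pi * freqs[:, None] * t[None, :] + phases[:, None])

# Initial conditions drawn uniformly between -1 and 1.
x0 = rng.uniform(-1.0, 1.0, size=N)

print(inputs.shape, x0.shape)
```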

The perturbation of the network was achieved by adding a vector of all 10s (i.e., an additive vector input into the network, with each element of the vector equal to 10) to the above input on one of the trials for 100 time steps in the middle of the simulation.
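
A minimal sketch of this perturbation, assuming the input is stored as an N x Time array. The helper name `add_perturbation`, the array sizes and the zero-valued stand-in input are hypothetical, and the exact placement of the "middle" window is an assumption.

```python
import numpy as np

def add_perturbation(inputs, magnitude=10.0, duration=100):
    """Return a copy of `inputs` with a constant offset added to every unit
    for `duration` time steps centred in the simulation (assumed placement)."""
    perturbed = inputs.copy()
    start = (inputs.shape[1] - duration) // 2
    perturbed[:, start:start + duration] += magnitude
    return perturbed, start

N, n_steps = 50, 1000             # assumed sizes
base = np.zeros((N, n_steps))     # stand-in for the sinusoidal input
perturbed, start = add_perturbation(base)

print(start, perturbed[:, start].mean())
```

Only the perturbed trial receives the offset. The unperturbed trial keeps using the original array, which the helper leaves unchanged.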

The noise was generated by driving each neural unit with an independent Wiener process (sigma = 0.2).
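
The noise model can be sketched with a hand-written Euler-Maruyama integrator. This is a swapped-in illustration and not the sdeint routine used for the published figures. The drift function is a stand-in rate network and not Eqs (1) and (2), and the network size, duration and names are assumptions.

```python
import numpy as np

def euler_maruyama(drift, x0, sigma, dt, steps, rng):
    """Integrate dx = drift(x) dt + sigma dW, with an independent Wiener process per unit."""
    x = np.array(x0, dtype=float)
    traj = np.empty((steps + 1, x.size))
    traj[0] = x
    for k in range(steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=x.size)  # Wiener increment over one step
        x = x + dt * drift(x) + sigma * dW
        traj[k + 1] = x
    return traj

rng = np.random.default_rng(0)
n, dt, sigma = 20, 1e-2, 0.2  # n is assumed; dt and sigma follow the text

# Assumed stand-in drift: a rate network with largest singular value of W below one.
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)
drift = lambda x: -x + W @ np.tanh(x)

# Two trials from the same initial condition, each with its own noise realisation.
x0 = rng.uniform(-1, 1, n)
trial_a = euler_maruyama(drift, x0, sigma, dt, 2000, rng)
trial_b = euler_maruyama(drift, x0, sigma, dt, 2000, rng)
spread = np.linalg.norm(trial_a - trial_b, axis=1)

print(spread.max())
```

In this sketch the distance between the two noisy trials fluctuates but stays bounded, consistent with the claim that noise does not accumulate in a contracting system.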

Fig 3 details:

The weight matrix used was the same as in Fig 2, leftmost panel (without perturbation, without noise).

Supporting information

S1 Text. The supplementary appendix file contains extensive mathematical proofs of the results stated above.

We kept the appendix self-contained by restating the basic results of contraction analysis and linear algebra which we used often in our proofs.



We thank Pawel Herman for comments on an earlier version of this manuscript. We thank Michael Happ and all members of the Miller Lab for helpful discussions and suggestions.


  1. Lundqvist M, Rose J, Herman P, Brincat SL, Buschman TJ, Miller EK. Gamma and Beta Bursts Underlie Working Memory. Neuron. 2016;90: 152–164. pmid:26996084
  2. Churchland MM, Yu BM, Cunningham JP, Sugrue LP, Cohen MR, Corrado GS, et al. Stimulus onset quenches neural variability: A widespread cortical phenomenon. Nat Neurosci. 2010;13: 369–378. pmid:20173745
  3. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci. 1982;79: 2554–2558. pmid:6953413
  4. Hirsch MW. Convergent activation dynamics in continuous time networks. Neural Networks. 1989. pp. 331–349.
  5. Cohen MA, Grossberg S. Absolute Stability of Global Pattern Formation and Parallel Memory Storage by Competitive Neural Networks. IEEE Trans Syst Man Cybern. 1983;SMC-13: 815–826.
  6. Lundqvist M, Herman P, Lansner A. Theta and gamma power increases and alpha/beta power decreases with memory load in an attractor network model. J Cogn Neurosci. 2011;23: 3008–3020. pmid:21452933
  7. Lansner A, Ekeberg O. Reliability and Speed of Recall in an Associative Network. IEEE Trans Pattern Anal Mach Intell. 1985;PAMI-7: 490–498. pmid:21869287
  8. Ashby W. Design for a brain: The origin of adaptive behaviour. Chapman & Hall Ltd; 1952.
  9. Dayan P, Abbott LF. Theoretical Neuroscience Computational Neuroscience. The MIT press. 2005. pmid:18995824
  10. Zhang H, Wang Z, Liu D. A comprehensive review of stability analysis of continuous-time recurrent neural networks. IEEE Trans Neural Networks Learn Syst. 2014;25: 1229–1262.
  11. Spaak E, Watanabe K, Funahashi S, Stokes MG. Stable and dynamic coding for working memory in primate prefrontal cortex. J Neurosci. 2017;37: 6503–6516. pmid:28559375
  12. Laje R, Buonomano DV. Robust timing and motor patterns by taming chaos in recurrent neural networks. Nat Neurosci. 2013;16: 925–933. pmid:23708144
  13. Chaisangmongkon W, Swaminathan SK, Freedman DJ, Wang XJ. Computing by Robust Transience: How the Fronto-Parietal Network Performs Sequential, Category-Based Decisions. Neuron. 2017;93: 1504–1517.e4. pmid:28334612
  14. Lohmiller W, Slotine J-JE. On Contraction Analysis for Non-linear Systems. Automatica. 1998;34: 683–696.
  15. Rutishauser U, Douglas RJ, Slotine J-J. Collective stability of networks of winner-take-all circuits*. 2018 [cited 31 Oct 2019].
  16. Rutishauser U, Slotine J-J, Douglas R. Computation in Dynamically Bounded Asymmetric Systems. PLoS Comput Biol. 2015;11: e1004039. pmid:25617645
  17. Girard B, Tabareau N, Pham QC, Berthoz A, Slotine J-J. Where neuroscience and dynamic system theory meet autonomous robotics: A contracting basal ganglia model for action selection. Neural Networks. 2008;21: 628–641. pmid:18495422
  18. Tabareau N, Slotine JJ, Pham QC. How synchronization protects from noise. PLoS Comput Biol. 2010;6: 1–9. pmid:20090826
  19. Orhan AE, Ma WJ. A diverse range of factors affect the nature of neural representations underlying short-term memory. Nat Neurosci. 2019;22: 275–283. pmid:30664767
  20. Mongillo G, Barak O, Tsodyks M. Synaptic Theory of Working Memory. Science. 2008;319: 1543. pmid:18339943
  21. Lundqvist M, Compte A, Lansner A. Bistable, Irregular Firing and Population Oscillations in a Modular Attractor Memory Network. Morrison A, editor. PLoS Comput Biol. 2010;6: e1000803. pmid:20532199
  22. Vidyasagar M. Nonlinear systems analysis. 2002.
  23. Hennequin G, Vogels TP, Gerstner W. Optimal control of transient dynamics in balanced networks supports generation of complex movements. Neuron. 2014;82: 1394–1406. pmid:24945778
  24. Stroud JP, Porter MA, Hennequin G, Vogels TP. Motor primitives in space and time via targeted gain modulation in cortical networks. Nat Neurosci. 2018;21: 1774–1783. pmid:30482949
  25. Zenke F, Gerstner W, Ganguli S. The temporal paradox of Hebbian learning and homeostatic plasticity. Current Opinion in Neurobiology. 2017. pp. 166–176. pmid:28431369
  26. Vogels TP, Froemke RC, Doyon N, Gilson M, Haas JS, Liu R, et al. Inhibitory synaptic plasticity: Spike timing-dependence and putative network function. Frontiers in Neural Circuits. 2013. pmid:23882186
  27. Hennequin G, Agnes EJ, Vogels TP. Inhibitory Plasticity: Balance, Control, and Codependence. Annu Rev Neurosci. 2017;40: 557–579. pmid:28598717
  28. Vogels TP, Sprekeler H, Zenke F, Clopath C, Gerstner W. Inhibitory Plasticity Balances Excitation and Inhibition in Sensory Pathways and Memory Networks. Science. 2011;334: 1569–1573. pmid:22075724
  29. Gerstner W, Kistler WM. Mathematical formulations of Hebbian learning. Biol Cybern. 2002;87: 404–415. pmid:12461630
  30. Gerstner W, Kistler WM. Mathematical formulations of Hebbian learning. Biol Cybern. 2002;87: 404–415. pmid:12461630
  31. Hosoya T, Baccus SA, Meister M. Dynamic predictive coding by the retina. Nature. 2005;436: 71. pmid:16001064
  32. Gerstner W, Kistler WM. Mathematical formulations of Hebbian learning. Biol Cybern. 2002;87: 404–415. pmid:12461630
  33. Slotine JJE. Modular stability tools for distributed computation and control. Int J Adapt Control Signal Process. 2003;17: 397–416.
  34. Kandel ER, Schwartz JH, Jessell TM, Siegelbaum S, Hudspeth AJ. Principles of neural science. McGraw-hill New York; 2000.
  35. Song S, Sjöström PJ, Reigl M, Nelson S, Chklovskii DB. Highly nonrandom features of synaptic connectivity in local cortical circuits. PLoS Biol. 2005;3: 0507–0519. pmid:15737062
  36. Mariño J, Schummers J, Lyon DC, Schwabe L, Beck O, Wiesing P, et al. Invariant computations in local cortical networks with balanced excitation and inhibition. Nat Neurosci. 2005;8: 194–201. pmid:15665876
  37. Wehr M, Zador AM. Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature. 2003;426: 442–446. pmid:14647382
  38. Shu Y, Hasenstaub A, McCormick DA. Turning on and off recurrent balanced cortical activity. Nature. 2003;423: 288–293. pmid:12748642
  39. Slotine J-JE, Li W. Applied nonlinear control. Prentice hall Englewood Cliffs, NJ; 1991.
  40. Murphy BK, Miller KD. Balanced Amplification: A New Mechanism of Selective Amplification of Neural Activity Patterns. Neuron. 2009;61: 635–648. pmid:19249282
  41. Jaeger H. The “echo state” approach to analysing and training recurrent neural networks-with an erratum note. Bonn, Ger Ger Natl Res Cent Inf Technol GMD Tech Rep. 2001;148: 13.
  42. Pascanu R, Jaeger H. A Neurodynamical Model for Working Memory.
  43. Buonomano DV, Maass W. State-dependent computations: spatiotemporal processing in cortical networks. 2009 [cited 11 Mar 2019]. pmid:19145235
  44. Sussillo D. Neural circuits as computational dynamical systems. Curr Opin Neurobiol. 2014;25: 156–163. pmid:24509098
  45. Van Steveninck RRDR, Lewen GD, Strong SP, Koberle R, Bialek W. Reproducibility and Variability in Neural Spike Trains. 1997;275.
  46. Chatham CH, Badre D. Multiple gates on working memory. Curr Opin Behav Sci. 2015;1: 23–31. pmid:26719851
  47. Halassa MM, Kastner S. Thalamic functions in distributed cognitive control. Nat Neurosci. 2017;20: 1669–1679. pmid:29184210
  48. Miller EK, Cohen JD. An Integrative Theory of Prefrontal Cortex Function. Annu Rev Neurosci. 2001;24: 167–202. pmid:11283309
  49. Vogels TP, Froemke RC, Doyon N, Gilson M, Haas JS, Liu R, et al. Inhibitory synaptic plasticity: Spike timing-dependence and putative network function. Front Neural Circuits. 2013;7: 1–11.
  50. Lisman J. A mechanism for the Hebb and the anti-Hebb processes underlying learning and memory. Proc Natl Acad Sci. 1989;86: 9574–9578. pmid:2556718
  51. Kullmann DM, Lamsa KP. Long-term synaptic plasticity in hippocampal interneurons. Nat Rev Neurosci. 2007;8: 687–699. pmid:17704811
  52. Enikolopov AG, Abbott L, Sawtell NB. Internally Generated Predictions Enhance Neural and Behavioral Detection of Sensory Stimuli in an Electric Fish. 2018 [cited 1 Mar 2019]. pmid:30001507
  53. Ruan H, Saur T, Yao W-D. Dopamine-enabled anti-Hebbian timing-dependent plasticity in prefrontal circuitry. Front Neural Circuits. 2014;8: 38. pmid:24795571
  54. Földiák P. Forming sparse representations by local anti-Hebbian learning. Biol Cybern. 1990;64: 165–170. pmid:2291903
  55. El-Boustani S, Ip JPK, Breton-Provencher V, Knott GW, Okuno H, Bito H, et al. Locally coordinated synaptic plasticity of visual cortex neurons in vivo. Science. 2018;360: 1349–1354. pmid:29930137
  56. Denève S, Machens CK. Efficient codes and balanced networks. Nat Neurosci. 2016;19: 375–382. pmid:26906504
  57. Hennequin G, Vogels TP, Gerstner W. Optimal control of transient dynamics in balanced networks supports generation of complex movements. Neuron. 2014;82: 1394–1406. pmid:24945778
  58. Brunel N. Dynamics of Sparsely Connected Networks of Excitatory and Inhibitory Spiking Neurons. J Comput Neurosci. 2000.