## Abstract

Following a stimulus, the neural response typically strongly varies in time and across neurons before settling to a steady-state. While classical population coding theory disregards the temporal dimension, recent works have argued that trajectories of transient activity can be particularly informative about stimulus identity and may form the basis of computations through dynamics. Yet the dynamical mechanisms needed to generate a population code based on transient trajectories have not been fully elucidated. Here we examine transient coding in a broad class of high-dimensional linear networks of recurrently connected units. We start by reviewing a well-known result that leads to a distinction between two classes of networks: networks in which all inputs lead to weak, decaying transients, and networks in which specific inputs elicit amplified transient responses and are mapped onto output states during the dynamics. These two classes are simply distinguished based on the spectrum of the symmetric part of the connectivity matrix. For the second class of networks, which is a sub-class of non-normal networks, we provide a procedure to identify transiently amplified inputs and the corresponding readouts. We first apply these results to standard randomly-connected and two-population networks. We then build minimal, low-rank networks that robustly implement trajectories mapping a specific input onto a specific orthogonal output state. Finally, we demonstrate that the capacity of the obtained networks increases proportionally with their size.

## Author summary

Classical theories of sensory coding consider the neural activity following a stimulus as constant in time. Recent works have however suggested that the temporal variations following the appearance and disappearance of a stimulus are strongly informative. Yet their dynamical origin remains little understood. Here we show that strong temporal variations in response to a stimulus can be generated by collective interactions within a network of neurons if the connectivity between neurons satisfies a simple mathematical criterion. We moreover determine the relationship between connectivity and the stimuli that are represented in the most informative manner by the variations of activity, and estimate the number of different stimuli a given network can encode using temporal variations of neural activity.

**Citation:** Bondanelli G, Ostojic S (2020) Coding with transient trajectories in recurrent neural networks. PLoS Comput Biol 16(2): e1007655. https://doi.org/10.1371/journal.pcbi.1007655

**Editor:** Kenneth D. Miller, Columbia University Medical School, UNITED STATES

**Received:** July 9, 2019; **Accepted:** January 14, 2020; **Published:** February 13, 2020

**Copyright:** © 2020 Bondanelli, Ostojic. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** All relevant data are within the manuscript and its Supporting Information files.

**Funding:** This work was funded by the Programme Emergences of City of Paris, Agence Nationale de la Recherche grant ANR-16-CE37-0016, and the program “Investissements d’Avenir” launched by the French Government and implemented by the ANR, with the references ANR-10-LABX-0087 IEC and ANR-11-IDEX-0001-02 PSL Research University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

The brain represents sensory stimuli in terms of the collective activity of thousands of neurons. Classical population coding theory describes the relation between stimuli and neural firing in terms of tuning curves, which assign a single number to each neuron in response to a stimulus [1–3]. The activity of a neuron following a stimulus presentation typically varies strongly in time and explores a range of values, but classical population coding leaves out such dynamics by considering either time-averaged or steady-state firing.

In contrast to this static picture, a number of recent works have argued that the temporal dynamics of population activity may play a key role in neural coding and computations [4–14]. As the temporal response to a stimulus is different for each neuron, an influential approach has been to represent population dynamics in terms of temporal trajectories in the neural state space, where each axis corresponds to the activity of one neuron [15–18]. Coding in this high-dimensional space is typically examined by combining linear decoding and dimensionality-reduction techniques [19–21], and the underlying network is often conceptualised in terms of a dynamical system [18, 22–30]. Such approaches have revealed that the discrimination between stimuli based on neural activity can be higher during the transient phases than at steady state [16], arguing for a coding scheme in terms of neural trajectories. A full theory of coding with transient trajectories is however currently lacking.

To produce useful transient coding, the trajectories of neural activity need to satisfy at least three requirements [4]. They need to be (i) stimulus-specific, (ii) robust to noise and (iii) non-monotonic, in the sense that the responses to different stimuli differ more during the transient dynamics than at steady-state. This third condition is crucial as otherwise coding with transients can be reduced to classical, steady-state population coding. Recent works have shown that recurrent networks with so-called non-normal connectivity can lead to amplified transients [28, 31–35], but general sufficient conditions for such amplification were not given. We start by reviewing a well-known result linking the norm of the transient activity to the spectrum of the symmetric part of the connectivity matrix. This result leads to a simple distinction between two classes of networks: networks in which all inputs lead to weak, decaying transients, and networks in which specific inputs elicit transiently amplified responses. We then characterize inputs that lead to non-monotonic trajectories, and show that they induce transient dynamics that map inputs onto orthogonal output directions. We first apply these analyses to standard two-population and randomly-connected networks. We then specifically exploit these results to build low-rank connectivity matrices that implement specific trajectories to transiently encode specified stimuli, and examine the noise-robustness and capacity of this setup.

## Results

We study linear networks of *N* randomly and recurrently coupled rate units with dynamics given by:

$$\dot{r}_i = -r_i + \sum_{j=1}^{N} J_{ij}\, r_j + I(t)\, r_{0,i} \tag{1}$$

Such networks can be interpreted as describing the linearized dynamics of a system around an equilibrium state. In this picture, the quantity *r*_{i} represents the deviation of the activity of the unit *i* from its equilibrium value. For simplicity, in the following we refer to the quantity *r*_{i} as the firing rate of unit *i*. Here *J*_{ij} denotes the effective strength of the connection from neuron *j* to neuron *i*. Unless otherwise specified, we consider an arbitrary connectivity matrix **J**. Along with the recurrent input, each unit *i* receives an external drive *I*(*t*)*r*_{0,i} in which the temporal component *I*(*t*) is equal for all neurons, and the vector **r**_{0} (normalized to unity) represents the relative amount of input to each neuron.

### Monotonic vs. amplified transient trajectories

We focus on the transient autonomous dynamics in the network following a brief input in time (*I*(*t*) = *δ*(*t*)) along the external input direction **r**_{0}, which is equivalent to setting the initial condition to **r**_{0}. The temporal activity of the network in response to this input can be represented as a trajectory **r**(*t*) in the high-dimensional space in which the *i*-th component is the firing rate of neuron *i* at time *t*. We assume the network is stable, so that the trajectory asymptotically decays to the equilibrium state that corresponds to *r*_{i} = 0. At intermediate times, depending on the connectivity matrix **J** and on the initial condition **r**_{0}, the trajectory can however exhibit two qualitatively different types of behavior: it can either monotonically decay towards the asymptotic state or transiently move away from it (Fig 1A and 1B). We call these two types of trajectories respectively monotonic and amplified.
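
The equivalence between a brief pulse and an initial condition can be checked numerically. The following is a minimal sketch, assuming rate dynamics of the form d**r**/d*t* = −**r** + **Jr** + *I*(*t*)**r**_{0} (consistent with the propagator exp(*t*(**J** − **I**)) used later in the text). The network size, pulse width, Euler step and random seed are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, g = 100, 0.5                                    # illustrative size and coupling strength
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))   # entries with variance g^2 / N
r0 = rng.normal(size=N)
r0 /= np.linalg.norm(r0)                           # input direction, normalized to unity

dt, T, width = 1e-3, 1.0, 0.01                     # Euler step, total time, pulse width
n_steps = int(round(T / dt))
n_pulse = int(round(width / dt))


def euler(r_init, pulse):
    """Forward-Euler integration of dr/dt = -r + J r + I(t) r0 up to time T."""
    r = r_init.copy()
    for step in range(n_steps):
        # A narrow rectangular pulse of unit area stands in for delta(t).
        I_t = 1.0 / width if (pulse and step < n_pulse) else 0.0
        r = r + dt * (-r + J @ r + I_t * r0)
    return r


r_pulse = euler(np.zeros(N), pulse=True)   # network at rest, driven by the brief pulse
r_init = euler(r0, pulse=False)            # no input, initial condition r(0) = r0

rel_err = np.linalg.norm(r_pulse - r_init) / np.linalg.norm(r_init)
print(f"relative difference between the two protocols at t = {T}: {rel_err:.4f}")
```

The residual difference comes from the finite pulse width and shrinks as the pulse becomes narrower.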

**Fig 1.** Dynamics of a linear recurrent network in response to a short external perturbation along a given input direction **r**_{0}. The left and right examples correspond to two different connectivity matrices, where the connection strengths are independently drawn from a Gaussian distribution with zero mean and variance equal to *g*^{2}/*N* (left: *g* = 0.5; right: *g* = 0.9). **A**. Firing rate dynamics of 10 individual units. **B**. Projections of the population activity onto the first two principal components of the dynamics. Yellow and red colors correspond respectively to *g* = 0.5 and *g* = 0.9. **C**. Temporal dynamics of the activity norm ∥**r**(*t*)∥. *Left*: in the case of weakly non-normal connectivity the activity norm displays monotonic decaying behaviour for any external input perturbation. *Right*: for strongly non-normal connectivity, specific stimuli generate a transient increase of the activity norm. *N* = 200 in simulations.

The two types of transient trajectories can be distinguished by looking at the Euclidean distance between the activity at time point *t* and the asymptotic equilibrium state, given by the activity norm ∥**r**(*t*)∥ = (**r**(*t*)^{T}**r**(*t*))^{1/2}. Focusing on the norm allows us to deal with a single scalar quantity instead of *N* firing rates. Monotonic and amplified transient trajectories respectively correspond to monotonically decaying and transiently increasing ∥**r**(*t*)∥ (Fig 1C). Note that a transiently increasing ∥**r**(*t*)∥ necessarily implies that the firing rate of at least one neuron shows a transient increase in its absolute value before decaying to zero.

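One approach to understanding how the connectivity matrix **J** determines the transient trajectory is to project the dynamics on the basis formed by the right-eigenvectors {**v**_{k}} of **J** [36]. The component along the *k*-th eigenmode decays exponentially and the activity norm can be expressed as:

$$\|\mathbf{r}(t)\| = \sqrt{\sum_{k=1}^{N} \tilde{r}_{0,k}^{2}\, e^{2(\lambda_k - 1)t} + \sum_{k \neq j} \tilde{r}_{0,k}\, \tilde{r}_{0,j}\, e^{(\lambda_k + \lambda_j - 2)t}\, \mathbf{v}_k \cdot \mathbf{v}_j} \tag{2}$$

where λ_{k} is the eigenvalue associated with the unit-norm eigenvector **v**_{k} (taken here to be real for simplicity) and $\tilde{r}_{0,k}$ is the component of the initial condition **r**_{0} along **v**_{k}.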
If all the eigenvectors **v**_{k} are mutually orthogonal, then the squared activity norm is a sum of squares of decaying exponentials, and therefore a monotonically decaying function. Connectivity matrices **J** with all orthogonal eigenvectors are called normal matrices, and they thus generate only monotonic transients. In particular, any symmetric matrix is normal. On the other hand, connectivity matrices for which some eigenvectors are not mutually orthogonal are called *non-normal* [37]. For such matrices, the second term under the square root in Eq (2) can have positive or negative sign, so that the norm cannot in general be written as the sum of decaying exponentials. It is well known that non-normal matrices can lead to non-monotonic transient trajectories [28, 31–35, 38].

Nonetheless, a non-normal connectivity matrix **J** is just a necessary, but not a sufficient condition for the existence of transiently amplified trajectories. As will be illustrated below, having non-orthogonal eigenvectors does not guarantee the existence of transiently amplified inputs. This raises the question of identifying the sufficient conditions on the connectivity matrix **J** and input **r**_{0} for the transient trajectory to be amplified. In the following, we point out a simple criterion on the connectivity matrix **J** for the existence of amplified trajectories, and show that it is possible to identify the input subspace giving rise to amplified trajectories and estimate its dimensionality.

### Two classes of non-normal connectivity

To distinguish between monotonic and amplified trajectories, we focus on the rate of change d∥**r**(*t*)∥/d*t* of the activity norm. For a monotonic trajectory, this rate of change is negative at all times, while for amplified trajectories it transiently takes positive values before becoming negative as the activity decays to the equilibrium value. Using this criterion, we can determine the conditions under which a network generates an amplified trajectory for at least one input **r**_{0}. Indeed, the rate of change of the activity norm satisfies (see [38, 39])

$$\frac{1}{\|\mathbf{r}\|}\frac{\mathrm{d}\|\mathbf{r}\|}{\mathrm{d}t} = \frac{\mathbf{r}^{T}(\mathbf{J}_{S} - \mathbf{I})\,\mathbf{r}}{\|\mathbf{r}\|^{2}} \tag{3}$$

Here the matrix **J**_{S} = (**J** + **J**^{T})/2 denotes the symmetric part of the connectivity matrix **J**. The right hand side of Eq (3) is a Rayleigh quotient [40]. It reaches its maximum value when **r**(*t*) is aligned with the eigenvector of **J**_{S} associated with its largest eigenvalue, λ_{max}(**J**_{S}), and the corresponding maximal relative rate of change of the activity norm is therefore λ_{max}(**J**_{S}) − 1.

Eq (3) directly implies that a necessary and sufficient condition for the existence of transiently amplified trajectories is that the largest eigenvalue of the symmetric part **J**_{S} be larger than unity, λ_{max}(**J**_{S}) > 1 [38]. If that is the case, choosing the initial condition along the eigenvector associated with λ_{max}(**J**_{S}) leads to a positive rate of change of the activity norm at time *t* = 0, and therefore generates a transient increase of the norm corresponding to an amplified trajectory, which shows the sufficiency of the criterion. Conversely, if a given input produces an amplified trajectory, at least one eigenvalue of **J**_{S} is necessarily larger than one. If that were not the case, the right hand side of the equation for the norm would take negative values for all vectors **r**(*t*), implying a monotonic decay of the norm. This demonstrates the necessity of the criterion.
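
The criterion can be illustrated numerically. The sketch below is an assumption-laden example rather than the authors' code: it takes **J**_{S} = (**J** + **J**^{T})/2, draws Gaussian connectivity for two values of *g*, uses the top eigenvector of **J**_{S} as the input, and integrates d**r**/d*t* = (**J** − **I**)**r** with a hypothetical helper `step` based on a truncated Taylor series. Sizes, time step and seed are arbitrary.

```python
import numpy as np


def step(J, r, dt, n_terms=20):
    """Advance r by dt under dr/dt = (J - I) r, using r(dt) = e^{-dt} exp(dt J) r."""
    term, out = r, r
    for k in range(1, n_terms):
        term = (dt / k) * (J @ term)   # k-th Taylor term of exp(dt J) applied to r
        out = out + term
    return np.exp(-dt) * out


def norm_trajectory(J, r0, dt=0.05, n_steps=200):
    """Activity norm ||r(t)|| sampled on the grid t = 0, dt, 2 dt, ..."""
    norms, r = [np.linalg.norm(r0)], r0
    for _ in range(n_steps):
        r = step(J, r, dt)
        norms.append(np.linalg.norm(r))
    return np.array(norms)


rng = np.random.default_rng(1)
N = 200
results = {}
for g in (0.5, 0.9):
    J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))
    J_S = 0.5 * (J + J.T)                        # symmetric part of the connectivity
    eigvals, eigvecs = np.linalg.eigh(J_S)       # eigenvalues in ascending order
    lam_max, r0 = eigvals[-1], eigvecs[:, -1]    # most favourable input direction
    norms = norm_trajectory(J, r0)
    results[g] = (lam_max, norms)
    print(f"g = {g}: lambda_max(J_S) = {lam_max:.3f}, "
          f"max norm = {norms.max():.3f}, "
          f"monotonic decay = {bool(np.all(np.diff(norms) < 0))}")
```

With these settings the *g* = 0.5 network should fall in the non-amplifying class and the *g* = 0.9 network in the amplifying class, matching the two columns of Fig 1.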

The criterion based on the symmetric part of the connectivity matrix allows us to distinguish two classes of connectivity matrices: if λ_{max}(**J**_{S}) < 1 all external inputs **r**_{0} lead to monotonically decaying trajectories (non-amplifying connectivity); if λ_{max}(**J**_{S}) > 1 specific input directions lead to a non-monotonic amplified activity norm (amplifying connectivity). The key point here is that for a non-normal connectivity matrix **J**, the symmetric part **J**_{S} is in general different from **J**. The condition for the stability of the system (max_{k} Re λ_{k}(**J**) < 1) and the condition for transient amplification (λ_{max}(**J**_{S}) > 1) are therefore not mutually exclusive. In contrast, the two conditions are mutually exclusive for normal networks, which include symmetric, anti-symmetric, orthogonal connectivity matrices and trivial one-dimensional dynamics.

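The simplest illustration of this result is a two-population network. In that case the relationship between the eigenvalues of **J** and **J**_{S} is straightforward. The eigenvalues of **J** and **J**_{S} are given by

$$\lambda_{\pm}(\mathbf{J}) = \frac{\mathrm{Tr}(\mathbf{J})}{2} \pm \sqrt{\frac{\mathrm{Tr}^{2}(\mathbf{J})}{4} - \mathrm{Det}(\mathbf{J})}, \qquad \lambda_{\pm}(\mathbf{J}_{S}) = \frac{\mathrm{Tr}(\mathbf{J})}{2} \pm \sqrt{\frac{\mathrm{Tr}^{2}(\mathbf{J})}{4} - \mathrm{Det}(\mathbf{J}) + \Delta^{2}} \tag{4}$$
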
where Tr(**J**) and Det(**J**) are the trace and determinant of the full connectivity matrix **J**, and 2Δ is the difference between the off-diagonal elements of **J**. Assuming for simplicity that the eigenvalues of **J** are real, Eq (4) shows that the maximal eigenvalue of **J**_{S} is in general larger than the maximal eigenvalue of **J**, and the difference between the two is controlled by the parameter Δ which quantifies how non-symmetric the matrix **J** is. If Δ is large enough, **J**_{S} will have an unstable eigenvalue, even if both eigenvalues of **J** are stable (Fig 2A). The value of Δ therefore allows us to distinguish between non-amplifying and amplifying connectivity. Furthermore, for amplifying connectivity, the parameter Δ directly controls the amount of amplification in the network (Fig 2B), defined as the maximum value of the norm ∥**r**(*t*)∥ over time and initial conditions **r**_{0} (see Methods). A specific example is a network consisting of an excitatory and an inhibitory population [32]. In that case our criterion states that the excitatory feedback needs to be (approximately) larger than unity in order to achieve transient amplification, when the relative strength of inhibition *k* (defined in Fig 2C) is larger than but close to one (Fig 2C and S6 Text).
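
The two-population relations can be verified on a concrete matrix. The sketch below assumes the closed forms λ_±(**J**) = Tr/2 ± (Tr²/4 − Det)^{1/2} and λ_±(**J**_{S}) = Tr/2 ± (Tr²/4 − Det + Δ²)^{1/2}, which follow from a direct 2 × 2 calculation, and applies them to the excitatory-inhibitory parametrization of Fig 2C. The function name `eig_formulas` and the values *w* = 2, *k* = 1.1 are illustrative assumptions.

```python
import numpy as np


def eig_formulas(J):
    """Closed-form eigenvalues of a 2x2 matrix J and of its symmetric part J_S."""
    tr, det = np.trace(J), np.linalg.det(J)
    delta = 0.5 * abs(J[0, 1] - J[1, 0])     # 2*Delta = difference of off-diagonal terms
    disc = tr**2 / 4 - det
    signs = np.array([1.0, -1.0])
    lam_J = tr / 2 + signs * np.sqrt(complex(disc))   # may be complex in general
    lam_JS = tr / 2 + signs * np.sqrt(disc + delta**2)  # always real
    return lam_J, lam_JS, delta


# Excitatory-inhibitory example: J_EE = J_IE = w, J_EI = J_II = -k w.
w, k = 2.0, 1.1
J = np.array([[w, -k * w],
              [w, -k * w]])
lam_J, lam_JS, delta = eig_formulas(J)

stable = lam_J.real.max() < 1.0      # stability of the linear dynamics
amplifying = lam_JS.max() > 1.0      # criterion on the symmetric part
print(f"Delta = {delta:.2f}")
print(f"eigenvalues of J   : {lam_J.real}")
print(f"eigenvalues of J_S : {lam_JS}")
print(f"stable = {stable}, amplifying = {amplifying}")
```

For this choice the eigenvalues of **J** are 0 and *w*(1 − *k*), so the network is stable, while the large asymmetry Δ = *w*(1 + *k*)/2 pushes the top eigenvalue of **J**_{S} above one.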

**Fig 2.** **A**. Relation between the eigenvalues of the connectivity matrix **J** (blue dots) and the eigenvalues of its symmetric part, **J**_{S} (red dots). Both pairs of eigenvalues are symmetrically centered around Tr(**J**)/2, but the eigenvalues of **J**_{S} lie further apart (Eq 4), and the maximal eigenvalue of **J**_{S} can cross unity if the difference 2Δ between the off-diagonal elements of the connectivity matrix is sufficiently large (bottom panel). **B**. Value of the maximum amplification of the system (quantified by the maximal singular value *σ*_{1}(**P**_{t*}) of the propagator, see Methods) as a function of the non-normal parameter Δ. Here we fix the two eigenvalues of **J**, the largest of which effectively determines the largest timescale of the dynamics, and vary Δ. Colored traces correspond to different values of the largest timescale of the system, *τ* = 1/(1 − λ_{max}(**J**)). For small values of Δ the maximum amplification is equal to one, and it increases approximately linearly when Δ is larger than the critical value. Each colored trace corresponds to a different choice of Tr(**J**) and Det(**J**). From top to bottom traces: Tr(**J**) = 0, −0.5, −2, −4 and Det(**J**) = Tr^{2}(**J**)/4 (for convenience), corresponding respectively to *τ* = 1, 0.8, 0.5, 0.33. The trace for *τ* = 1 corresponds to . **C**. Dynamical regimes for an excitatory-inhibitory two population model, as in [32]. Here *w* represents the weights of the excitatory connections (*J*_{EE} = *J*_{IE} = *w*) and −*kw* the weights of the inhibitory ones (*J*_{EI} = *J*_{II} = −*kw*). The inhibition-dominated regime corresponds to *k* > 1. The color code corresponds to the maximum amplification, as quantified by the maximal singular value *σ*_{1}(**P**_{t*}). The grey trace corresponds to the boundary between the monotonic and the amplified parameter regions. The red trace represents the stability boundary, with the unstable region hatched in red. In order to achieve transient amplification the excitatory weight *w* has to be approximately larger than unity, when *k* is larger than but close to one. Note that amplification can be obtained also for 0 < *k* < 1, in a parameter region limited by the stability boundary.

A second illustrative example is a network of *N* randomly connected neurons, where each connection strength is independently drawn from a Gaussian distribution with zero mean and variance equal to *g*^{2}/*N*. For such a network, the eigenvalues of **J** and **J**_{S} are random, but their distributions are known. The eigenvalues of **J** are uniformly distributed in the complex plane within a disk of radius *g* [41], so that the system is stable for *g* < 1 (Fig 3A). On the other hand, the eigenvalues of the symmetric part **J**_{S} are real and distributed according to the semicircle law with spectral radius √2 *g* [42, 43] (Fig 3B). The fact that the spectral radius of **J**_{S} is larger by a factor √2 than the spectral radius of **J** implies that if *g* is in the interval 1/√2 < *g* < 1 the network is stable but exhibits amplified transient activity (Fig 3C). Note that the connectivity is non-normal for any value of *g*, but the additional condition *g* > 1/√2 is needed for the existence of amplified trajectories. This in particular implies that for random connectivity transient amplification requires the network to be close to instability, so that the dynamics are slowed down as pointed out in [34].
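
The two spectral radii can be estimated on a single sample. This is a minimal sketch assuming the large-*N* values *g* for **J** and √2 *g* for **J**_{S} = (**J** + **J**^{T})/2; the size *N* = 500 and the seed are illustrative, and finite-size deviations of a few percent are expected.

```python
import numpy as np

rng = np.random.default_rng(2)
N, g = 500, 0.9
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))   # variance g^2 / N
J_S = 0.5 * (J + J.T)

radius_J = np.abs(np.linalg.eigvals(J)).max()      # spectral radius of J (complex spectrum)
lam_max_JS = np.linalg.eigvalsh(J_S).max()         # largest (real) eigenvalue of J_S

print(f"spectral radius of J   : {radius_J:.3f}  (large-N value g = {g})")
print(f"largest eigenvalue J_S : {lam_max_JS:.3f}  (large-N value sqrt(2) g = {np.sqrt(2) * g:.3f})")

# Stable but amplifying regime predicted for 1/sqrt(2) < g < 1.
in_amplifying_window = (1 / np.sqrt(2) < g < 1)
print(f"g inside (1/sqrt(2), 1): {in_amplifying_window}")
```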

**Fig 3.** Each entry of **J** is independently drawn from a Gaussian distribution with zero mean and variance *g*^{2}/*N*. **A**. The eigenvalues of **J** are complex, and, in the limit of large *N*, distributed uniformly within a circle of radius *R*(**J**) = *g* in the complex plane (Girko’s law, [41]). The system is stable if *g* < 1. Left: *g* = 0.5. Right: *g* = 0.9. **B**. The eigenvalues of the symmetric part **J**_{S} are real-valued, and are distributed in the large *N* limit according to the semicircle law, with the largest eigenvalue of **J**_{S} given by the spectral radius *R*(**J**_{S}) = √2 *g* [42, 43]. Since the spectral radius of **J**_{S} is larger than the spectral radius of **J**, for sufficiently large values of *g* some eigenvalues of **J**_{S} can be larger than unity (in red), while the network dynamics are stable (*g* < 1). **C**. Spectral radii of **J** and **J**_{S} as a function of the random strength *g*. The interval of values of *g* for which the system displays strong transient dynamics in response to specific inputs is given by 1/√2 < *g* < 1. *N* = 200 in simulations.

### Coding with amplified transients

For a connectivity matrix satisfying the amplification condition λ_{max}(**J**_{S}) > 1, only specific external inputs **r**_{0} are amplified by the recurrent circuitry, while others lead to monotonically decaying trajectories (Fig 4B). Which and how many orthogonal inputs are amplified? What is the resulting state of the network at the time of maximal amplification, and how can the inputs be decoded from that state?

**Fig 4.** Example corresponding to an *N*-dimensional Gaussian connectivity matrix with *g* = 0.9. **A**. Singular values of the propagator, *σ*_{k}(**P**_{t}), as a function of time (SV trajectories). Dark blue traces show the amplified singular values, defined as having positive slope at time *t* = 0; the dominant singular value corresponds to the dashed line. Light blue traces correspond to the non-amplified singular values, having negative slope at *t* = 0. **B**. Norm of the activity elicited by the first two amplified inputs, i.e. **R**_{1} and **R**_{2} (right singular vectors corresponding to the singular values *σ*_{1} and *σ*_{2} at the time *t** indicated by the dashed vertical line in panel A; purple and red traces), and by one non-amplified input (chosen as **R**_{100}, corresponding to *σ*_{100}; orange trace). **C**. Illustration of the dynamics elicited by the three inputs, **R**_{1}, **R**_{2} and **R**_{100} (shown in different rows), as in **B**. *Left*: Activity of 10 individual units. *Center*: Projections of the evoked trajectories onto the plane defined by the stimulus and the corresponding readout vector (in analogy with the amplified case, we chose the readout of the non-amplified dynamics to be the state of the system at time *t**, i.e. **L**_{100}). *Right*: population responses to the three stimuli projected on the readout vectors **L**_{1}, **L**_{2} and **L**_{100}. *N* = 1000 in simulations.

One approach to these questions is to examine the mapping from inputs to states at a given time *t* during the dynamics. Since we consider linear networks, the state reached at time *t* from the initial condition **r**_{0} is given by the linear mapping **r**(*t*) = **P**_{t} **r**_{0}, where for any time *t* > 0, **P**_{t} = exp(*t*(**J** − **I**)) is an *N* × *N* matrix called the propagator of the network. At a given time *t*, the singular value decomposition (SVD) of **P**_{t} defines a set of singular values {*σ*_{k}^{(t)}}, and two sets of orthonormal vectors, the right singular vectors {**R**_{k}^{(t)}} and the left singular vectors {**L**_{k}^{(t)}}, such that **P**_{t} maps **R**_{k}^{(t)} onto *σ*_{k}^{(t)}**L**_{k}^{(t)}. In other words, taking **R**_{k}^{(t)} as the initial condition leads the network to the state *σ*_{k}^{(t)}**L**_{k}^{(t)} at time *t*:

$$\mathbf{r}(t) = \mathbf{P}_{t}\, \mathbf{R}_{k}^{(t)} = \sigma_{k}^{(t)}\, \mathbf{L}_{k}^{(t)} \tag{5}$$

If *σ*_{k}^{(t)} > 1, the norm of the activity at time *t* is larger than unity, so that the initial condition **R**_{k}^{(t)} is amplified. In fact, the largest singular value of **P**_{t} determines the maximal possible amplification at time *t* (see Methods). Note that for a normal matrix, the left and right singular vectors **L**_{k}^{(t)} and **R**_{k}^{(t)} are identical, and the singular values are equal to the modulus of the eigenvalues, so that the stability of the dynamics implies an absence of amplification. Conversely, stable amplification implies that **L**_{k}^{(t)} and **R**_{k}^{(t)} are not identical, so that an amplified trajectory explores at least two dimensions corresponding to the plane spanned by **L**_{k}^{(t)} and **R**_{k}^{(t)}.

Since the propagator **P**_{t} depends on time, the singular vectors **L**_{k}^{(t)} and **R**_{k}^{(t)}, and the singular values *σ*_{k}^{(t)} depend on time. One can therefore look at the temporal trajectories *σ*_{k}^{(t)}, which by definition all start at one at *t* = 0 (Fig 4A). If the connectivity satisfies the condition for transient amplification, at least one singular value increases above unity, and reaches a maximum before asymptotically decreasing to zero. The number of singular values that simultaneously take values above unity (Fig 4A) defines the number of orthogonal initial conditions amplified by the dynamics. Choosing a time *t** at which *N*_{s} of the singular value trajectories lie above unity, we can indeed identify a set of *N*_{s} orthogonal, amplified inputs corresponding to the right singular vectors **R**_{k}^{(t*)} of the propagator at time *t**. According to Eq (5), each of these inputs is mapped in an amplified fashion to the corresponding left singular vector **L**_{k}^{(t*)} at time *t**, which also form an orthogonal set. Each amplified input can therefore be decoded by projecting the network activity on the corresponding left singular vector **L**_{k}^{(t*)} (Fig 4C). Since the vectors **L**_{k}^{(t*)} are mutually orthogonal, the different initial conditions lead to independent encoding channels. Again, as the dynamics are non-normal, the inputs **R**_{k} and the outputs **L**_{k} are not identical, so that the dynamics for each amplified input are at least two-dimensional (Fig 4C).
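
The procedure can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it builds the propagator exp(*t*(**J** − **I**)) with a hypothetical NumPy-only helper `expm` (scaling and squaring of a Taylor series), scans the dominant singular value on a time grid, and reads off amplified inputs (rows of `Rt`) and readouts (columns of `L`) from the SVD at the peak time. Network size, grid and seed are arbitrary.

```python
import numpy as np


def expm(A, n_terms=18):
    """Matrix exponential via scaling-and-squaring of a Taylor series (NumPy only)."""
    n_sq = max(0, int(np.ceil(np.log2(max(np.linalg.norm(A, 1), 1e-16)))) + 1)
    B = A / 2**n_sq
    term = np.eye(A.shape[0])
    out = term.copy()
    for k in range(1, n_terms):
        term = term @ B / k
        out = out + term
    for _ in range(n_sq):
        out = out @ out
    return out


rng = np.random.default_rng(3)
N, g, dt, n_steps = 200, 0.9, 0.1, 80
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))
P_dt = expm(dt * (J - np.eye(N)))            # propagator over one time step

# Scan sigma_1(P_t) on the grid t = dt, 2 dt, ... and keep the propagator at the peak.
P, best_sigma, P_star, t_star = np.eye(N), 0.0, None, 0.0
for i in range(1, n_steps + 1):
    P = P_dt @ P
    sigma1 = np.linalg.norm(P, 2)            # spectral norm = largest singular value
    if sigma1 > best_sigma:
        best_sigma, P_star, t_star = sigma1, P.copy(), i * dt

L, sigma, Rt = np.linalg.svd(P_star)         # P_star = L @ diag(sigma) @ Rt
n_amplified = int(np.sum(sigma > 1.0))       # orthogonal inputs amplified at t_star
R1, L1 = Rt[0], L[:, 0]                      # optimal input and its readout

print(f"t* = {t_star:.1f}, sigma_1 = {sigma[0]:.3f}, amplified inputs = {n_amplified}")
print(f"overlap between input R_1 and readout L_1: {abs(R1 @ L1):.3f}")
```

Projecting the state **P**_{t*}**R**_{j} on the readout **L**_{k} gives *σ*_{k} when *j* = *k* and zero otherwise, which is the sense in which the amplified inputs form independent channels.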

How many independent, orthogonal inputs can a network encode with amplified transients? To estimate this number, a central observation is that the slopes of the different singular value trajectories at *t* = 0 are given by the eigenvalues of the symmetric part of the connectivity **J**_{S}, shifted by −1. This follows from the fact that the singular values of the propagator **P**_{t} are the square root of the eigenvalues of **P**_{t}^{T}**P**_{t}, and at short times *δt* we have **P**_{δt}^{T}**P**_{δt} ≃ **I** + 2(**J**_{S} − **I**)*δt*. This implies that the number of singular values with positive slope at the initial time is equal to the number of eigenvalues of the symmetric part **J**_{S} larger than unity. To eliminate the trajectories with small initial slopes, one can further constrain the slopes to be larger than a margin *ϵ*, in which case the number of amplified trajectories *N*_{S}(*ϵ*) is given by the number of eigenvalues of **J**_{S} larger than 1 + *ϵ*. Note that *N*_{S}(*ϵ*) provides only a lower bound on the number of amplified inputs, as singular values with initial slope smaller than zero can increase at later times. It is straightforward to compute *N*_{S}(*ϵ*) when the connectivity matrix **J** is a random matrix with independent and identically distributed elements. In this case the probability distribution of the eigenvalues of its symmetric part **J**_{S} follows the semicircle law (Fig 3), and when the number of neurons *N* is large, the number *N*_{s} of amplified inputs scales linearly with *N*.
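
The counting argument can be made concrete for Gaussian connectivity. The sketch below assumes that the eigenvalues of **J**_{S} = (**J** + **J**^{T})/2 follow a semicircle law of radius √2 *g*, so that the fraction above a threshold *x* is (arccos *a* − *a*(1 − *a*²)^{1/2})/π with *a* = *x*/(√2 *g*). The helper name, the margin *ϵ* = 0.05, the size and the seed are illustrative.

```python
import numpy as np


def semicircle_fraction_above(x, radius):
    """Fraction of a semicircle distribution of given radius lying above x."""
    a = np.clip(x / radius, -1.0, 1.0)
    return (np.arccos(a) - a * np.sqrt(1.0 - a**2)) / np.pi


rng = np.random.default_rng(4)
N, g, eps = 1000, 0.9, 0.05
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))
eig_JS = np.linalg.eigvalsh(0.5 * (J + J.T))

n_S = int(np.sum(eig_JS > 1.0 + eps))                    # empirical N_S(eps)
n_S_theory = N * semicircle_fraction_above(1.0 + eps, np.sqrt(2) * g)

print(f"N_S(eps) counted    : {n_S}")
print(f"N_S(eps) semicircle : {n_S_theory:.1f}")
print(f"fraction of N       : {n_S / N:.3f}")
```

Because the semicircle fraction does not depend on *N*, the predicted count grows linearly with network size at fixed *g* and *ϵ*.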

To summarize, the amplified inputs and the corresponding encoding at peak amplification can be determined directly from the singular value decomposition of the propagator, given by the exponential of the connectivity matrix. For an arbitrary *N* × *N* matrix **J**, characterizing analytically the SVD of its exponential is in general a complex and to our knowledge open mathematical problem. For specific classes of matrices, the propagator and its SVD can however be explicitly computed, and in the following we will exploit this approach.

### Implementing specific transient trajectories

The approach outlined above holds for any arbitrary connectivity matrix, and allows us to identify the external inputs which are strongly amplified by the recurrent structure, along with the modes that get most activated during the elicited transients, and therefore encode the inputs. We now turn to the converse question: how to choose the network connectivity **J** such that it generates a pre-determined transient trajectory. Specifically, we focus on low-rank networks, a type of model ubiquitous in neuroscience [44, 45], and set out to determine the minimal-rank connectivity that transiently transforms a fixed, arbitrary input **r**_{0} into a fixed, arbitrary output **w** at the time of peak amplification, through two-dimensional dynamics.

To address this question, we consider a connectivity structure given by a unit-rank matrix **J** = Δ**uv**^{T} [45]. Here **u** and **v** are two vectors with unit norm and correlation *ρ* (〈**u**, **v**〉 = *ρ*), and Δ is an overall scaling parameter. We applied to this connectivity the general analysis outlined above (see Methods). The only non-zero eigenvalue of **J** is Δ*ρ*, and the corresponding linear system is stable for Δ*ρ* < 1. The largest eigenvalue of the symmetric part of the connectivity **J**_{S} is given by Δ(*ρ* + 1)/2, so that the network displays amplified transients if and only if Δ(*ρ* + 1)/2 > 1 (while Δ*ρ* < 1). Keeping the eigenvalue Δ*ρ* constant and increasing Δ will therefore lead to a transition from monotonically decaying to amplified transients (Fig 5A). If *ρ* = 0, the vectors **u** and **v** are orthogonal, and the condition for amplification is simply Δ > 2. Note that in this situation, amplification is obtained without slowing down the dynamics, in contrast to randomly coupled networks [34].
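
These spectral properties are easy to confirm numerically. The sketch below uses a hypothetical helper `unit_rank_connectivity` that, as a simplification, enforces unit norm and correlation *ρ* exactly (rather than on average), and sweeps Δ at fixed eigenvalue Δ*ρ* = 0.5. The specific values are illustrative.

```python
import numpy as np


def unit_rank_connectivity(N, delta, rho, rng):
    """Return J = delta * u v^T with |u| = |v| = 1 and u . v = rho (enforced exactly)."""
    u = rng.normal(size=N)
    u /= np.linalg.norm(u)
    x = rng.normal(size=N)
    x -= (x @ u) * u                     # component orthogonal to u
    x /= np.linalg.norm(x)
    v = rho * u + np.sqrt(1.0 - rho**2) * x
    return delta * np.outer(u, v), u, v


rng = np.random.default_rng(5)
N, lam = 50, 0.5                         # fixed eigenvalue lam = delta * rho < 1 (stable)
rows = []
for delta in (1.0, 1.4, 1.6, 4.0):
    rho = lam / delta
    J, u, v = unit_rank_connectivity(N, delta, rho, rng)
    lam_S = np.linalg.eigvalsh(0.5 * (J + J.T)).max()
    # u is the eigenvector of J with eigenvalue delta * rho
    is_eigvec = np.allclose(J @ u, delta * rho * u)
    amplified = lam_S > 1.0
    rows.append((delta, rho, lam_S, is_eigvec, amplified))
    print(f"Delta = {delta:.1f}, rho = {rho:.3f}: lambda_max(J_S) = {lam_S:.3f} "
          f"(formula {delta * (rho + 1) / 2:.3f}), amplified = {amplified}")
```

At fixed Δ*ρ* = 0.5 the transition to amplification is expected at Δ = 1.5, where Δ(*ρ* + 1)/2 crosses one.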

**Fig 5.** **A**. Dynamical regimes as a function of the structure vector correlation *ρ* = **u** ⋅ **v** and the scaling parameter of the connectivity matrix, Δ. Grey shaded areas correspond to parameter regions where the network activity is monotonic for all inputs; blue shaded areas indicate parameter regions where the network activity is amplified for specific inputs; for parameter values in the white area, activity is unstable. Samples of dynamics are shown in the bottom panels, for parameter values indicated by the colored dot in the phase diagram: Δ = 4 and *ρ* = 0. Dashed colored traces correspond to the parameter regions explored in panels **B** and **C**, defined by the equation λ = Δ*ρ*. **B**. Maximum amplification of the system, quantified by *σ*_{1}(**P**_{t*}), the first singular value of the propagator, as a function of the scaling parameter Δ. Here we fix the eigenvalue of the connectivity matrix λ = Δ*ρ* associated with the eigenvector **u**, and vary Δ. Colored traces correspond to different choices of the eigenvalue of the connectivity λ. **C**. Correlation between the optimally amplified input direction and the structure vector **v** as a function of the parameter Δ. Increasing the non-normal parameter Δ aligns the optimally amplified input with the structure vector **v**. In **B** and **C** mean and standard deviation over 50 realizations of the connectivity matrix are shown for each trace. The elements of the structure vectors are drawn from a Gaussian distribution, so that they have on average unit norm and correlation *ρ* (see Methods). **D**. Low-dimensional dynamics in the case of two stored patterns. Input **v**^{(1)} (resp. **v**^{(2)}) elicits a two-dimensional trajectory which brings the activity along the other structure vector **u**^{(1)} (resp. **u**^{(2)}), mapping stimulus **v**^{(1)} (resp. **v**^{(2)}) into its transient readout **u**^{(1)} (resp. **u**^{(2)}). Blue and red colors correspond to the two stored patterns. **E**. Firing rates of 10 individual units. **F**. Temporal evolution of the activity norm. **G**. Projection of the network response evoked by the input along **v**^{(1)} (resp. **v**^{(2)}) on the corresponding readout **u**^{(1)} (resp. **u**^{(2)}). The case of unit rank connectivity (one stored pattern) reduces to the first row of panels **D**–**G** (where the activity on **u**^{(2)} is equivalent to the activity on a readout orthogonal to **u**^{(1)}). *N* = 3000 in simulations.

For this unit rank connectivity matrix, the full propagator **P**_{t} = exp(*t*(**J** − **I**)) of the dynamics can be explicitly computed (see Methods). The non-trivial dynamics are two-dimensional, and lie in the plane spanned by the structure vectors **u** and **v** (Fig 5D), while all components orthogonal to this plane decay exponentially to zero. Determining the singular value decomposition of the propagator allows us to compute the amount of amplification of the system, as the value of *σ*_{1}(**P**_{t}) at the time of its maximum *t**. In the amplified regime (for Δ(*ρ* + 1)/2 > 1), the amount of amplification increases monotonically with Δ (Fig 5B). Since only one eigenvalue of **J**_{S} is larger than unity, only one input perturbation is able to generate amplified dynamics. For large values of Δ, this optimal input direction is strongly correlated with the structure vector **v**. Perturbing along the vector **v** elicits a two-dimensional trajectory which at its peak amplification is strongly correlated with the other structure vector **u** (Fig 5C). Choosing **v** = **r**_{0} and **u** = **w**, the unit-rank connectivity therefore directly implements a trajectory that maps the input **r**_{0} into the output **w**, identified as the transient readout vector for stimulus **r**_{0}.
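
For the special case *ρ* = 0 the propagator takes a particularly simple form, which gives a worked example of the input-to-readout mapping. Since (**uv**^{T})² = 0 when **u** and **v** are orthogonal, exp(*t***J**) = **I** + *t*Δ**uv**^{T}, so the response to the input **v** is **r**(*t*) = e^{−t}(**v** + *t*Δ**u**), with norm e^{−t}(1 + Δ²*t*²)^{1/2}. Setting the derivative of the norm to zero gives a peak at *t** = (1 + (1 − 4/Δ²)^{1/2})/2 for Δ > 2. This describes the response to the input **v** specifically, which need not be the optimally amplified input. The sketch below checks the closed form against direct integration; the helper `step` and all parameter values are illustrative assumptions.

```python
import numpy as np


def step(J, r, dt, n_terms=20):
    """Advance r by dt under dr/dt = (J - I) r, using r(dt) = e^{-dt} exp(dt J) r."""
    term, out = r, r
    for k in range(1, n_terms):
        term = (dt / k) * (J @ term)
        out = out + term
    return np.exp(-dt) * out


rng = np.random.default_rng(6)
N, delta = 50, 4.0
Q, _ = np.linalg.qr(rng.normal(size=(N, 2)))
u, v = Q[:, 0], Q[:, 1]                  # orthonormal structure vectors: rho = 0
J = delta * np.outer(u, v)


def response_closed_form(t):
    """Response to the input v for rho = 0: r(t) = e^{-t} (v + t * delta * u)."""
    return np.exp(-t) * (v + t * delta * u)


t_star = 0.5 * (1.0 + np.sqrt(1.0 - 4.0 / delta**2))   # peak time of the norm
r_star = response_closed_form(t_star)
peak_norm = np.linalg.norm(r_star)
overlap_u = (r_star @ u) / peak_norm                    # alignment with the readout u

# Independent check by direct integration of the dynamics from r(0) = v.
n_steps = 100
r = v.copy()
for _ in range(n_steps):
    r = step(J, r, t_star / n_steps)

print(f"t* = {t_star:.3f}, peak norm = {peak_norm:.3f}, overlap with u = {overlap_u:.3f}")
```

For Δ = 4 this gives *t** ≈ 0.93, a peak norm of about 1.52 and an overlap with **u** of about 0.97, so the state at the peak is almost aligned with the readout. The norm first dips below one before rising, because the input **v** itself is orthogonal to the most amplified direction of **J**_{S} at *t* = 0.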

Several orthogonal trajectories can be implemented by adding orthogonal unit-rank components. For instance, taking **J** = Δ**u**^{(1)}**v**^{(1)T} + Δ**u**^{(2)}**v**^{(2)T}, where the planes defined by the structure vectors in each term are mutually orthogonal, the input **v**^{(1)} evokes a trajectory which is confined to the plane defined by **u**^{(1)} and **v**^{(1)}, and which maps the input **v**^{(1)} into the output **u**^{(1)} at the time of peak amplification (Fig 5D–5G). Similarly, the input **v**^{(2)} is mapped into the output **u**^{(2)} during the evoked transient dynamics. Therefore, the rank-2 connectivity **J** implements two transient patterns, encoding the stimuli **v**^{(1)} and **v**^{(2)} into the readouts **u**^{(1)} and **u**^{(2)}. A natural question is how robust the scheme is and how many patterns can be implemented in a network of fixed size *N*.
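The following sketch illustrates the two-pattern case under a simplifying assumption: the four structure vectors are taken exactly orthonormal, so that **J**² = 0 and the propagator reduces to e^{−t}(**I** + *t***J**). Variable names and parameter values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, delta = 100, 4.0                      # illustrative values
# Four orthonormal vectors: (u1, v1) and (u2, v2) span mutually orthogonal planes.
q, _ = np.linalg.qr(rng.standard_normal((n, 4)))
u1, v1, u2, v2 = q.T
J = delta * (np.outer(u1, v1) + np.outer(u2, v2))

def response(r0, t):
    """r(t) = exp(t (J - I)) r0; exact here because J @ J = 0, so exp(tJ) = I + tJ."""
    return np.exp(-t) * (r0 + t * (J @ r0))

times = np.linspace(0.0, 5.0, 501)
traj = np.array([response(v1, t) for t in times])
on_u1 = traj @ u1                        # readout matched to the input v1
on_u2 = traj @ u2                        # readout of the other stored pattern
peak_time = times[np.argmax(on_u1)]      # time at which the matched readout peaks
```

In this idealized setting the activity evoked by **v**^{(1)} projects onto **u**^{(1)} as Δ*t* e^{−t} and has no component on **u**^{(2)}, so the two channels do not interact.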

### Robustness and capacity

To investigate the robustness of the transient coding scheme implemented with unit-rank terms, we first examined the effect of additional random components in the connectivity. Adding to each connection a random term of variance *g*^{2}/*N* introduces fluctuations of order $O(1/\sqrt{N})$ to the component of the activity on the plane defined by **u** and **v** (see Methods). Consequently, the projection of the trajectory on the readout **w** = **u** has fluctuations of the same order (Fig 6A–6C). A supplementary effect of random connectivity is to add to the dynamics a component orthogonal to **u** and **v**, proportional to Δ (see S8 Text), which however does not contribute to the readout along **w**. Thus, for large *N*, the randomness in the synaptic connectivity does not impair the decoding of the stimulus **r**_{0} from the activity along the corresponding readout **w**.

**(A-B-C)** Robustness of the readout activity for a single stored pattern **u-v** in presence of randomness in the connectivity with variance *g*^{2}/*N*. **A**. Projection of the population activity elicited by input **v** along the readout **u** (red trace) and along a readout orthogonal to **u** (blue trace) for *g* = 0.5. The elements of the orthogonal readout are drawn from a random distribution with mean zero and variance 1/*N* and are fixed over trials. The projection of the activity on **u** is also shown for the zero noise case (*g* = 0; black dashed line). **B**. Value of the activity along **u** (red dots) and along the orthogonal readout (blue dots) at the peak amplification (*t* = *t**), as a function of *g*. In **A** and **B**, *N* = 200; error bars correspond to the standard deviation over 100 realizations of the random connectivity. **C**. Standard deviation of the readout activity at the peak amplification as a function of the network size *N* for two values of *g*. The fluctuations decrease with the network size and scale as $1/\sqrt{N}$. Error bars correspond to the standard deviation of the mean over 100 realizations of the connectivity noise. **(D-E-F)** Robustness of the transient coding scheme in presence of multiple stored patterns. **D**. Projection of the population activity elicited by one arbitrary amplified input **v**^{(k)} along the corresponding readout **u**^{(k)} (red trace) and along a different arbitrary readout **u**^{(k′)} (blue trace) for *P*/*N* = 0.02. The readout **u**^{(k′)} was changed for every trial. The projection of the activity on **u**^{(k)} is also shown when only the pattern **u**^{(k)}-**v**^{(k)} is encoded (*P* = 1; black dashed line). **E**. Value of the activity along **u**^{(k)} (red dots) and along the readout **u**^{(k′)} (blue dots) at the peak amplification (*t* = *t**), as a function of *P*/*N*. In **D** and **E**, *N* = 200; error bars correspond to the standard deviation over 100 realizations of the connectivity matrix. **F**. Standard deviation of the readout activity (along **u**^{(k)}) at the peak amplification as a function of the network size *N* for two values of *P*/*N*. The fluctuations decrease with the network size and scale as $1/\sqrt{N}$. Error bars correspond to the standard deviation of the mean over 100 realizations of the connectivity noise.

The robustness of the readouts to random connectivity implies in particular that the unit-rank coding scheme is robust when an extensive number *P* of orthogonal transient trajectories are implemented by the connectivity **J**. To show this, we generalize the unit-rank approach and consider a rank-*P* connectivity matrix, given by the sum of *P* unit-rank matrices, $\mathbf{J} = \Delta \sum_{p=1}^{P} \mathbf{u}^{(p)} \mathbf{v}^{(p)T}$, where each term specifies an input-output pair. We focus on the case where the elements of all the vectors **u**^{(p)} and **v**^{(p)} are independently drawn from a random distribution (see Methods), implying that all input-output pairs are mutually orthogonal, i.e. uncorrelated, in the limit of large *N*. In this situation, the interaction between the dynamics evoked by one arbitrary input **v**^{(p)} and the additional *P* − 1 patterns is effectively described by a system with connectivity **J** = Δ**u**^{(p)} **v**^{(p)T} corrupted by a random component with zero mean and variance equal to Δ^{2} *P*/*N*^{2} (see Methods). From the previous results, it follows that the fluctuations of the activity of the readout **u**^{(p)} are of order $O(\sqrt{P}/N)$ (Fig 6D–6F). Thus, in high dimension, the readout activity is robust to the interactions between multiple encoded trajectories. When the number of encoded trajectories is extensive (*P* = *O*(*N*)), each stimulus **v**^{(p)} can therefore still be decoded from the projection of the activity on the corresponding readout **u**^{(p)}.

A natural upper bound on the number of trajectories that can be implemented by the connectivity **J** is derived from the stability constraints of the linear system. Indeed, the largest eigenvalue of **J** is given by $\Delta\sqrt{P/N}$ and it needs to be smaller than one for stability. Thus, the maximum number of trajectories that can be encoded in the connectivity **J** is given by *P*_{max} = *N*/Δ^{2} and defines the capacity of the network. Crucially, the capacity scales linearly with the size of the network *N*. The capacity also decreases for highly amplified systems, resulting in a trade-off between the separability of the neural activity evoked by different stimuli (quantified by Δ) and the number of stimuli that can be encoded in the connectivity (quantified by *P*_{max}).
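The capacity estimate can be checked numerically with a short sketch. It assumes that the spectral radius of the rank-*P* connectivity is approximately Δ√(*P*/*N*), which is the value consistent with *P*_{max} = *N*/Δ^{2}; the helper `capacity` and all parameter values are illustrative.

```python
import numpy as np

def capacity(n, delta):
    """P_max = N / Delta^2, from requiring Delta * sqrt(P / N) < 1."""
    return n / delta**2

rng = np.random.default_rng(2)
n, p, delta = 400, 100, 1.5              # illustrative values, with p below capacity
# Structure vectors with i.i.d. Gaussian entries of variance 1/N (unit norm on average).
U = rng.standard_normal((n, p)) / np.sqrt(n)
V = rng.standard_normal((n, p)) / np.sqrt(n)
J = delta * U @ V.T                      # sum of p unit-rank terms

radius = np.max(np.abs(np.linalg.eigvals(J)))   # empirical spectral radius
predicted = delta * np.sqrt(p / n)              # assumed large-N estimate
```

At finite *N* the empirical radius fluctuates around the prediction, so the stability boundary *P* = *P*_{max} should be read as a large-*N* estimate rather than a sharp threshold.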

## Discussion

We examined the conditions under which linear recurrent networks can implement an encoding of stimuli in terms of amplified transient trajectories. The fundamental mechanism underlying amplified transients relies on the non-normal properties of the connectivity matrix, i.e. the fact that the left- and right-eigenvectors of the connectivity matrix are not identical [37]. A number of recent studies in theoretical neuroscience have pointed out the interesting dynamical properties of networks with non-normal connectivity [28, 31–35, 46, 47]. Several of these works [28, 32, 34, 35] have examined the amplification of the norm of the activity vector, as we do here. However, it was not pointed out that the presence of amplification can be diagnosed by considering the eigenvalues of the symmetric part **J**_{S} of the connectivity matrix (rather than examining properties of the eigenvectors of the connectivity matrix **J**), leading to the distinction of two classes of recurrent networks. This general criterion appears to be well-known in the theory of linear systems (Theorem 17.1 in [37]). Here we applied it to standard models of recurrent networks used in computational neuroscience, and in particular to low-rank networks [45].

We have shown that the largest eigenvalue of the symmetric part of the connectivity defines the amplification properties of the system asymptotically at small times *t*. Yet, it does not provide a direct measure of the maximum amplification of the norm ∥**r**(*t*)∥ over all times *t* (see Maximum amplification of the system). The maximum amplification can be derived using the singular value decomposition (SVD) of the propagator **P**_{t}, which however can be computed analytically only for simple connectivity matrices. To quantify the maximum amplification other measures have been developed that rely on the so-called pseudospectra [37] of the connectivity matrix, a generalization of the eigenvalue spectra useful for the study of non-normal dynamics. While the spectrum of the symmetric part of the connectivity controls the amplification of the system at small times and the eigenvalue spectrum determines its large time dynamics, the transient dynamics at intermediate times is largely determined by the properties of the pseudospectra of the connectivity **J** (Chapter 4 of [37]). Notably, the result known as the Kreiss Matrix Theorem (Eq. 14.8 and 14.12 in [37]) provides a lower and upper bound for the maximum amplification of ∥**r**(*t*)∥ based on the pseudospectrum of the connectivity **J**.

Applying the criterion for transient amplification to classical randomly connected networks, we found that amplification occurs only in a narrow parameter region close to the instability, where the dynamics substantially slow down as previously shown [34]. To circumvent this issue, and produce strong transient amplification away from the instability, [28] introduced stability-optimized circuits (SOCs) in which inhibition is fine-tuned to closely balance excitation, and demonstrated that such dynamics can account for the experimental data recorded in the motor cortex [24]. We showed here that low-rank networks can achieve the same purpose, and exhibit strong, fast amplification in a large parameter region away from the instability. One difference with SOCs is that low-rank networks explicitly implement low-dimensional dynamics that transform a specified initial state into a specified, orthogonal output state. Several low-rank channels could be combined to reproduce higher-dimensional dynamics similar to those observed during the generation of complex movements [24].

In our framework we modeled the external stimulus as the initial condition of the network, and the amplified dynamics is autonomously generated by the recurrent interactions. Although this might appear as an oversimplifying assumption, it has nevertheless been proven useful to describe the transient population activity in motor and sensory areas. In motor and pre-motor cortex, the initial condition of the population dynamics during the execution of the movement may be set by the phase of preparatory neural activity that precedes the movement, and may determine to a large extent the time course of the movement-related dynamics [22–24]. A similar mechanism has been recently proposed to underlie the generation and population coding properties of strong sensory responses following stimulus offsets in auditory cortex. Here different auditory stimuli result in largely orthogonal initial conditions at the stimulus offset, thus generating orthogonal population offset responses across stimuli [48]. The assumption of autonomous dynamics does not hold when naturalistic (e.g. temporally structured) stimuli are considered [49]. Understanding how more complex external inputs are transformed by the non-normal amplified network dynamics constitutes a major direction of future work.

The study by Murphy and Miller [32] reported that the excitatory-inhibitory (EI) structure of cortical networks induces non-normal amplification between so-called sum and difference E-I modes. Interestingly, the specific networks they considered are of the low-rank type, with sum and difference modes corresponding to left- and right- vectors of the individual unit-rank terms [35]. This connectivity structure is therefore a particular instance of the low-rank implementation of amplified trajectories that we described here. Moreover, Murphy and Miller specifically focused on the inhibition-dominated regime [50], which as we show approximately corresponds to the class of unit-rank E-I networks that exhibit strong transient amplification (Fig 2 and Supp Info; note that these networks can exhibit amplification also for 0 < *k* ≤ 1, in a parameter region limited by the stability boundary). In the present study, we have not enforced a separation between excitatory and inhibitory neurons, but this can be done in a straightforward way by adding a unit-rank term in which all excitatory (resp. inhibitory) connections have the same weight, and these weights are chosen strong enough to make all excitatory (resp. inhibitory) synapses positive (resp. negative). This additional component would induce one more amplified channel that would correspond to the global E-I difference mode of Murphy and Miller.

Here our aim was to produce amplified, but not necessarily long-lasting transients. The timescale of the transients generated using the unit-rank implementation is in fact determined by the effective timescale of the network, set by the dominant eigenvalue of the connectivity matrix. As shown in previous studies that focused on implementing transient memory traces [31, 33, 46], longer transients can be obtained either by increasing recurrent feedback (i.e. the overlap between vectors in the unit-rank implementation), or by creating longer hidden feed-forward chains. For instance, an effective feed-forward chain of length *k* can be obtained from a rank *k* connectivity term of the type **J** = Δ**v**^{(k+1)}**v**^{(k)T} +…+ Δ**v**^{(3)}**v**^{(2)T} + Δ**v**^{(2)}**v**^{(1)T}, i.e. in which each term feeds into the next one [51]. This leads in general to a *k* + 1-dimensional transient with a timescale extended by a factor *k* [33]. Implementing this kind of higher-dimensional transients naturally comes at the cost of reducing the corresponding capacity of the network.
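As an illustration of the feed-forward construction, the sketch below builds a chain of length *k* from exactly orthonormal vectors (an assumption made for simplicity), for which **J**^{k+1} = 0 and the power series of the propagator terminates. Names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, delta = 60, 3, 4.0                 # illustrative values
# k + 1 orthonormal vectors v^(1), ..., v^(k+1), stored as the rows of `vs`.
q, _ = np.linalg.qr(rng.standard_normal((n, k + 1)))
vs = q.T
# J = sum_m delta * v^(m+1) v^(m)T: each pattern feeds into the next one.
J = delta * sum(np.outer(vs[m + 1], vs[m]) for m in range(k))

def response(r0, t):
    """exp(t (J - I)) r0; the series stops at order k because J^(k+1) = 0."""
    out, term = r0.copy(), r0.copy()
    for m in range(1, k + 1):
        term = t * (J @ term) / m        # (t J)^m r0 / m!
        out = out + term
    return np.exp(-t) * out

times = np.linspace(0.0, 12.0, 1201)
last_stage = np.array([response(vs[0], t) @ vs[k] for t in times])
peak_time = times[np.argmax(last_stage)]  # peak of the activity on v^(k+1)
```

In this idealized chain the activity on the last vector follows (Δ*t*)^{k} e^{−t}/*k*!, which peaks at *t* = *k* (in units of *τ*), consistent with a transient whose timescale grows with the chain length.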

The implementation of transient channels proposed here clearly bears a strong analogy with Hopfield networks [44]. The aim of Hopfield networks is to store patterns of activity in memory as fixed points of the dynamics, and this is achieved by adding to the connectivity matrix a unit-rank term **ξξ**^{T} for each pattern **ξ**. One key difference with the present network is that Hopfield networks rely on symmetric connectivity [52], while amplified transients are obtained by using strongly asymmetric terms in which the left- and right-vectors are possibly orthogonal. Another difference is that Hopfield networks rely on a non-linearity to generate fixed points for each pattern, while here we considered instead linear dynamics in the vicinity of a single fixed-point. The non-linearity of Hopfield networks endows them with error-correcting properties, in the sense that a noisy initial condition will always lead to the activation of a single memorized pattern. A weaker form of error-correction is also present in our linear, transient encoding, since any component along non-amplified directions will decay faster than the amplified pattern. However, if two amplified patterns are simultaneously activated, they will lead to the activation of both corresponding outputs. This absence of competition may not be undesirable, as it can allow for the simultaneous encoding, and possibly binding, of several complementary stimulus features.

The amplified dynamics map specific external inputs onto orthogonal patterns of activity with larger values of the norm ∥**r**(*t*)∥. These dynamics however, along with amplifying the amplitude of the signal, also amplify the external noise that is injected in the network. This external noise is maximally amplified along the readout dimensions corresponding to the amplified inputs (Eq (83)), implying that the signal-to-noise ratio at the peak amplification is comparable to the SNR at the initial state. Therefore, transient amplification may not favor stimulus decoding during the transient state with respect to the initial state in presence of noisy input, but may nonetheless be needed to keep a stable value of the SNR across the transient dynamics (see Robustness of the readout activity). Instead, the amplification of the norm constitutes an advantage when the synaptic connections to the readout neurons are themselves corrupted by noise. As the noise in the readout weights, or observational noise, is not directly fed into the network, it does not get amplified by the recurrent interactions. As a result, the detrimental effects of observational noise are overcome by amplifying the signal **r**(*t*) above the noise level, which can be directly implemented by the transient coding scheme illustrated here. Transient amplification of external inputs may therefore result in an increased ability to robustly decode the external input in presence of noisy readout synapses.

While we focused here on linear dynamics in the vicinity of a fixed point, strong non-linearities can give rise to different transient phenomena [12]. In particular, one prominent proposal is that robust transient coding can be implemented using stable heteroclinic channels, i.e. sequences of saddle points that feed into each other [4]. This mechanism has been exploited in specific models based on clustered networks [5]. A general theory for this type of transient coding is to our knowledge currently lacking, and constitutes an interesting avenue for future work.

## Methods

### The network model

We study a recurrent network of *N* randomly coupled rate units. Each unit *i* is described by the time-dependent variable *r*_{i}(*t*), representing its firing rate at time *t*. The transfer function of the individual units is linear, so that the equation governing the temporal dynamics of the network reads:
$$\tau \frac{dr_i}{dt} = -r_i + \sum_{j=1}^{N} J_{ij}\, r_j + I(t)\, r_{0,i} \tag{6}$$
where *τ* represents the membrane time constant (fixed to unity), and *J*_{ij} is the effective synaptic strength from neuron *j* to neuron *i*. In absence of external input, the system has only one fixed point corresponding to *r*_{i} = 0 for all *i*. To have stable dynamics, we require that the real parts of the eigenvalues of the connectivity matrix **J** be smaller than unity, i.e. $\mathrm{Re}\,\lambda_{\max}(\mathbf{J}) < 1$. We write the external input as the product between a common time-varying component *I*(*t*), and a term *r*_{0,i} which corresponds to the relative activation of each unit. The terms *r*_{0,i} can be arranged in an *N*-dimensional vector **r**_{0}, which we call the external input direction. Here we focus on very short external input durations (*I*(*t*) = *δ*(*t*)) and on input directions of unit norm (∥**r**_{0}∥ = 1). This type of input is equivalent to setting the initial condition to **r**(0) = **r**_{0}. Since we study a linear system, varying the norm of the input direction would result in a linear scaling of the dynamics.

### Dynamics of the network

We first outline the standard approach to the dynamics of the linear network defined by Eq (6) (see e.g. [36, 53]). The solution of the differential equation given by Eq (6) can be obtained by diagonalizing the linear system, i.e. by using a change of basis such that the connectivity matrix in the new basis **Λ** = **V**^{−1} **JV** is diagonal. The matrix **V** contains the eigenvectors **v**_{1}, **v**_{2}, …, **v**_{N} of the connectivity **J** as columns, while **Λ** has the corresponding eigenvalues λ_{i} on the diagonal. Therefore the variables $\tilde{\mathbf{r}} = \mathbf{V}^{-1}\mathbf{r}$ represent the components of the rate vector on the basis of eigenvectors of **J** (and similarly $\tilde{\mathbf{r}}_0 = \mathbf{V}^{-1}\mathbf{r}_0$ for the input direction). In this new basis the system of coupled equations in Eq (6) reduces to the set of uncoupled equations
$$\tau \frac{d\tilde{r}_i}{dt} = -(1 - \lambda_i)\,\tilde{r}_i + I(t)\,\tilde{r}_{0,i}, \qquad i = 1, \dots, N \tag{7}$$
The dynamics of the linear network given by Eq (6) can thus be written in terms of its components on the eigenvectors **v**_{i} as
$$\mathbf{r}(t) = \sum_{i=1}^{N} \tilde{r}_{0,i}\, e^{-(1-\lambda_i)\,t/\tau}\, \mathbf{v}_i \tag{8}$$
Equivalently, the solution of the linear system can be expressed as the product between a linear, time-dependent operator **P**_{t} and the initial condition **r**_{0} [54]:
$$\mathbf{r}(t) = \mathbf{P}_t\, \mathbf{r}_0 \tag{9}$$
The linear operator **P**_{t} is called the propagator of the system and it is defined through a matrix exponential of the connectivity matrix **J**, i.e. **P**_{t} = exp(*t*(**J** − **I**)/*τ*). By using the definition of matrix exponential in terms of power series, we can express the propagator as $\mathbf{P}_t = \sum_{k=0}^{\infty} \frac{1}{k!}\left(\frac{t}{\tau}(\mathbf{J}-\mathbf{I})\right)^{k}$. From Eq (9) we note that the propagator **P**_{t} at time *t* defines a mapping from the state of the system at time *t* = 0, i.e. the external input direction **r**_{0}, to the state **r**(*t*).
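The two routes to the solution (eigenvector expansion and propagator) can be compared numerically. The sketch below evaluates the power series of the propagator with scaling and squaring, a standard numerical device that is an implementation choice here rather than something specified in the text; the function name `propagator` and the parameter values are assumptions.

```python
import numpy as np

def propagator(J, t, tau=1.0, order=20):
    """P_t = exp(t (J - I) / tau) via a truncated power series with scaling and squaring."""
    n = J.shape[0]
    A = t * (J - np.eye(n)) / tau
    norm = np.linalg.norm(A, 1)
    # Halve A until its norm is at most 1/2 so that the truncated series is accurate.
    squarings = int(np.ceil(np.log2(norm))) + 1 if norm > 0.5 else 0
    A = A / 2**squarings
    term = np.eye(n)
    P = np.eye(n)
    for k in range(1, order + 1):
        term = term @ A / k              # A^k / k!
        P = P + term
    for _ in range(squarings):
        P = P @ P                        # exp(A) = exp(A / 2^s)^(2^s)
    return P

rng = np.random.default_rng(4)
n, g, t = 40, 0.5, 1.3                   # illustrative values
J = g * rng.standard_normal((n, n)) / np.sqrt(n)
r0 = rng.standard_normal(n)
r0 /= np.linalg.norm(r0)                 # unit-norm input direction

# Eigenvector expansion of the solution (tau = 1).
lam, V = np.linalg.eig(J)
r0_tilde = np.linalg.solve(V, r0)        # components of r0 on the eigenvectors of J
r_eig = (V @ (np.exp(-(1.0 - lam) * t) * r0_tilde)).real

# Propagator applied to the initial condition.
r_prop = propagator(J, t) @ r0
```

The eigenvector route can become numerically ill-conditioned for strongly non-normal matrices, which is one practical reason to work with the propagator directly.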

### Dynamics of the norm

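To study the amplification properties of the network, we follow [39] and focus on the temporal dynamics of the population activity norm ∥**r**(*t*)∥ [28]. The equation governing the dynamics of the norm can be derived by writing $\|\mathbf{r}\| = \sqrt{\mathbf{r}^T\mathbf{r}}$, so that the relative rate of change of the norm is given by [39]
$$\frac{1}{\|\mathbf{r}\|}\frac{d\|\mathbf{r}\|}{dt} = \frac{1}{\|\mathbf{r}\|^{2}}\,\mathbf{r}^T \frac{d\mathbf{r}}{dt} \tag{10}$$
By using Eq (6) we can write the right hand side of the previous equation as
$$\frac{1}{\|\mathbf{r}\|}\frac{d\|\mathbf{r}\|}{dt} = \frac{1}{\tau}\,\frac{\mathbf{r}^T(\mathbf{J} - \mathbf{I})\,\mathbf{r}}{\|\mathbf{r}\|^{2}} = \frac{1}{\tau}\,\frac{\mathbf{r}^T(\mathbf{J}_S - \mathbf{I})\,\mathbf{r}}{\|\mathbf{r}\|^{2}} \tag{11}$$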
where we introduced **J**_{S} = (**J** + **J**^{T})/2, the symmetric part of the connectivity matrix **J**.
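A quick numerical check of this identity, under the assumption *τ* = 1 and with arbitrary illustrative values: the anti-symmetric part of **J** does not contribute to the quadratic form, and the resulting rate matches a finite-difference estimate obtained from one small Euler step.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30                                   # illustrative size
J = 0.8 * rng.standard_normal((n, n)) / np.sqrt(n)
J_S = 0.5 * (J + J.T)                    # symmetric part of the connectivity
I = np.eye(n)
r = rng.standard_normal(n)               # arbitrary network state

# Relative rate of change of the norm from the symmetric part only.
rate_formula = r @ (J_S - I) @ r / (r @ r)

# Finite-difference estimate from one small Euler step of dr/dt = (J - I) r.
dt = 1e-6
r_next = r + dt * (J - I) @ r
rate_numeric = (np.linalg.norm(r_next) - np.linalg.norm(r)) / (dt * np.linalg.norm(r))

# Contribution of the anti-symmetric part to the quadratic form (should vanish).
antisym_term = r @ (J - J_S) @ r
```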

Both the eigenvalues and the eigenvectors of **J**_{S} provide information on the transient dynamics of the system. On one hand, we show in the main text that the activity norm can have non-monotonic behaviour if and only if at least one eigenvalue of the matrix **J**_{S} is larger than one. Therefore the eigenvalues of **J**_{S} determine the type of transient regime of the system. On the other hand, as **J**_{S} is symmetric, its set of eigenvectors is orthogonal and provides a useful orthonormal basis onto which we can project the dynamics. In this basis, the connectivity matrix is given by $\mathbf{J}' = \mathbf{V}_S^T\, \mathbf{J}\, \mathbf{V}_S$, where **V**_{S} contains the eigenvectors of **J**_{S} as columns. The matrix **J** can be uniquely decomposed as **J** = **J**_{S} + **J**_{A}, where **J**_{A} = (**J** − **J**^{T})/2 is the anti-symmetric part of **J**, so that
$$\mathbf{J}' = \mathbf{V}_S^T\, \mathbf{J}_S\, \mathbf{V}_S + \mathbf{V}_S^T\, \mathbf{J}_A\, \mathbf{V}_S \tag{12}$$
The first term on the right hand side is a diagonal matrix, while the second term is an anti-symmetric matrix. Since the latter has zero diagonal elements, the new connectivity matrix **J**′ displays the eigenvalues of **J**_{S} on the diagonal. The off-diagonal terms of **J**′ are given by the elements of $\mathbf{V}_S^T\, \mathbf{J}_A\, \mathbf{V}_S$ and represent the strength of the couplings between the eigenvectors of **J**_{S}. In the amplified regime, some of the eigenvalues of **J**_{S} are larger than one, so that without the coupling between the modes of **J**_{S}, the connectivity **J**′ would be unstable. However, in our case **J** and **J**′ are stable matrices, meaning that the coupling terms ensure the stability of the overall system. Moreover, varying the strengths of the coupling terms while keeping fixed the diagonal terms affects in a non-trivial way the maximum amplification of the system. Therefore, the decomposition in Eq (12) allows us to identify the set of key parameters that controls the maximum amplification of a specific system. In the following, we will systematically use this decomposition to analyze specific classes of matrices.
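The structure of the rotated connectivity can be verified on a small random example. This is a sketch with arbitrary illustrative values; `np.linalg.eigh` returns the orthonormal eigenvectors of the symmetric part as columns.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6                                    # illustrative size
J = rng.standard_normal((n, n)) / np.sqrt(n)
J_S = 0.5 * (J + J.T)                    # symmetric part
J_A = 0.5 * (J - J.T)                    # anti-symmetric part

eigvals_S, V_S = np.linalg.eigh(J_S)     # columns of V_S: orthonormal eigenvectors of J_S
J_prime = V_S.T @ J @ V_S                # connectivity in the basis of eigenvectors of J_S
coupling = V_S.T @ J_A @ V_S             # anti-symmetric coupling between the modes of J_S
```

Here the diagonal of `J_prime` carries the eigenvalues of **J**_{S}, while `coupling` holds the off-diagonal terms that couple the modes.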

### Amplification

To identify which inputs are amplified, we examine the dynamics of the activity norm ∥**r**(*t*)∥ for an arbitrary external input **r**_{0}. The one-dimensional Eq (11) alone is not enough to determine the time course of ∥**r**(*t*)∥, since the right hand side depends on the solution of the *N*-dimensional system Eq (6). Instead, for a specific input **r**_{0}, we can use Eq (9) and write the norm of the elicited trajectory as
$$\|\mathbf{r}(t)\| = \|\mathbf{P}_t\, \mathbf{r}_0\| = \sqrt{\mathbf{r}_0^T\, \mathbf{P}_t^T\, \mathbf{P}_t\, \mathbf{r}_0} \tag{13}$$

#### Input-output mapping between amplified inputs and readouts.

The dynamics elicited in response to an input along an arbitrary direction is in general complex. However, the singular value decomposition (SVD) of the propagator provides a useful way to understand the network dynamics during the transient phase. Any matrix **A** can be written as
$$\mathbf{A} = \mathbf{L}\, \boldsymbol{\Sigma}\, \mathbf{R}^T \tag{14}$$
where the matrix **Σ** contains the singular values *σ*_{i}(**A**) on the diagonal, while the columns of **L** (resp. **R**) are the left (resp. right) singular vectors of **A**, i.e. the eigenvectors of **AA**^{T} (resp. **A**^{T} **A**). The matrices **R** and **L** are orthogonal, meaning that the columns of each form an orthonormal set of vectors. Thus, we can write the SVD of the propagator as
$$\mathbf{P}_t = \mathbf{L}^{(t)}\, \boldsymbol{\Sigma}^{(t)}\, \mathbf{R}^{(t)T} = \sum_{k=1}^{N} \sigma_k(\mathbf{P}_t)\, \mathbf{L}_k^{(t)}\, \mathbf{R}_k^{(t)T} \tag{15}$$
From Eq (15) we see that, at a given time *t*, the propagator **P**_{t} maps each right singular vector $\mathbf{R}_k^{(t)}$ into the left singular vector $\mathbf{L}_k^{(t)}$, scaled by the singular value $\sigma_k(\mathbf{P}_t)$ (see Eq (5)). Note that for normal systems the singular value decomposition and the eigen-decomposition coincide. In this case the matrices **L** and **R** both contain the eigenvectors of **P**_{t} as columns, so that $\mathbf{R}_k^{(t)}$ and $\mathbf{L}_k^{(t)}$ lie on a single dimension. Instead, for a non-normal system the right and left singular vectors do not align along one direction, and the dynamics of the system in response to an input along $\mathbf{R}_k^{(t)}$ spans at least the two dimensions defined by the two vectors $\mathbf{R}_k^{(t)}$ and $\mathbf{L}_k^{(t)}$. The vectors $\mathbf{R}_k^{(t)}$ for which $\sigma_k(\mathbf{P}_t) > 1$ correspond to the amplified inputs at time *t*, while the outputs $\mathbf{L}_k^{(t)}$ are the corresponding readouts at time *t*.
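The input-to-readout mapping can be illustrated on a hypothetical two-unit feed-forward network, chosen because its propagator is available in closed form (**J**² = 0). The weight `w` and the time `t` are arbitrary illustrative values.

```python
import numpy as np

w, t = 8.0, 1.0                                # illustrative values
J = np.array([[0.0, w], [0.0, 0.0]])           # purely feed-forward, hence non-normal
# J @ J = 0, so exp(t (J - I)) = e^{-t} (I + t J) exactly (tau = 1).
P_t = np.exp(-t) * (np.eye(2) + t * J)

L, sigma, R_T = np.linalg.svd(P_t)             # P_t = L @ diag(sigma) @ R_T
R = R_T.T                                      # columns: right singular vectors (inputs)

# Each right singular vector is mapped onto the matching left singular vector.
mapped = P_t @ R                               # column k equals sigma[k] * L[:, k]

amplified = sigma > 1.0                        # inputs whose norm has grown at time t
alignment = abs(L[:, 0] @ R[:, 0])             # 1 for a normal system, below 1 here
```

In this example only one singular value exceeds unity, and the most amplified input is far from aligned with its readout, as expected for a non-normal system.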

#### Number of amplified inputs.

The number of amplified inputs at time *t* is given by the number of singular values larger than unity. To estimate this number, we examine the temporal dynamics of the singular values (SV trajectories). We observe that, for a system in the amplified regime (λ_{max}(**J**_{S}) > 1), at least one of the SV trajectories has non-monotonic dynamics, starting from one at *t* = 0 and then increasing before decaying to zero. In fact, the singular values of the propagator at small times *t* = *δt* are defined as the square roots of the eigenvalues of
$$\mathbf{P}_{\delta t}^T\, \mathbf{P}_{\delta t} = \mathbf{I} + 2\,\frac{\delta t}{\tau}\,(\mathbf{J}_S - \mathbf{I}) + O(\delta t^{2}) \tag{16}$$
From Eq (16) we can compute the singular values of **P**_{δt} as
$$\sigma_k(\mathbf{P}_{\delta t}) = 1 + \frac{\delta t}{\tau}\,\big(\lambda_k(\mathbf{J}_S) - 1\big) + O(\delta t^{2}) \tag{17}$$
so that the slope at time *t* = 0 of the *k*-th singular value of the propagator is
$$\left.\frac{d\sigma_k(\mathbf{P}_t)}{dt}\right|_{t=0} = \frac{\lambda_k(\mathbf{J}_S) - 1}{\tau} \tag{18}$$
Eq (18) shows that the number of singular values larger than unity at small times is given by the number of eigenvalues of **J**_{S} larger than unity, which we denote as *N*_{S}.
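The small-time behaviour of the singular values can be checked by finite differences. The sketch below assumes *τ* = 1, uses a second-order Taylor expansion of the propagator at a small time `dt`, and compares the initial slopes with the eigenvalues of **J**_{S}; sizes and scales are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n, dt = 8, 1e-5                          # illustrative values
J = 1.2 * rng.standard_normal((n, n)) / np.sqrt(n)
A = J - np.eye(n)
# Second-order Taylor expansion of the propagator at the small time dt (tau = 1).
P_dt = np.eye(n) + dt * A + 0.5 * dt**2 * (A @ A)

sigma = np.linalg.svd(P_dt, compute_uv=False)                 # decreasing order
eig_S = np.sort(np.linalg.eigvalsh(0.5 * (J + J.T)))[::-1]    # same ordering

slopes = (sigma - 1.0) / dt              # finite-difference slope of each singular value at t = 0
n_amplified = int(np.sum(sigma > 1.0))   # singular values above unity at time dt
n_S = int(np.sum(eig_S > 1.0))           # eigenvalues of J_S above unity
```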

#### Maximum amplification of the system.

From Eq (15) we see that the maximum over initial conditions of the amplification at time *t* corresponds to the dominant singular value of the propagator, $\sigma_1(\mathbf{P}_t)$. The associated amplified input and corresponding readout are respectively $\mathbf{R}_1^{(t)}$ and $\mathbf{L}_1^{(t)}$. To obtain the maximum amplification of the system over inputs and over time, we need to compute the time *t** at which $\sigma_1(\mathbf{P}_t)$ attains its maximum value. Therefore, the value $\sigma_1(\mathbf{P}_{t^*})$ quantifies the maximum amplification over inputs and over time, while $\mathbf{R}_1^{(t^*)}$ and $\mathbf{L}_1^{(t^*)}$ correspond respectively to the most amplified input direction and the associated readout.

Interestingly, it can be shown that the input $\mathbf{R}_1^{(t^*)}$ satisfies the equation (see S1 Text)
$$\mathbf{R}_1^{(t^*)T}\, (\mathbf{J}_S - \mathbf{I})\, \mathbf{R}_1^{(t^*)} = 0 \tag{19}$$
which depends only on the symmetric part of the connectivity matrix **J**_{S}. We will exploit this equation to identify the amplified initial condition in specific cases. Note that, except for *N* = 2, Eq (19) does not fully specify the maximally amplified input.

### Characterizing transient dynamics—Summary

Summarizing, our approach for characterizing the transient dynamics of a network can be divided into three main steps:

- Compute **J**_{S}, along with its eigenvalues and eigenvectors.
- Compute the propagator of the system **P**_{t}.
- Compute the Singular Value Decomposition (SVD) of the propagator.

These three steps can be in principle performed numerically for any connectivity matrix. For particular classes of connectivity matrices, we show below that some or all three steps are analytically tractable.
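The three steps can be sketched numerically as follows. This is one possible implementation under stated assumptions, not the procedure used for the figures: *τ* = 1, the propagator is advanced on a uniform time grid through **P**_{t+dt} = **P**_{dt}**P**_{t}, and the one-step propagator is obtained from a truncated power series, which requires `dt` times the norm of **J** − **I** to be small. The function name and its arguments are hypothetical.

```python
import numpy as np

def characterize_transients(J, t_max=10.0, dt=0.01, order=12):
    """Three-step characterization of the transient dynamics of dr/dt = (J - I) r."""
    n = J.shape[0]
    # Step 1: symmetric part, with its eigenvalues and eigenvectors.
    J_S = 0.5 * (J + J.T)
    eig_S, vec_S = np.linalg.eigh(J_S)
    # Step 2: one-step propagator P_dt from a truncated power series.
    A = dt * (J - np.eye(n))
    P_dt, term = np.eye(n), np.eye(n)
    for k in range(1, order + 1):
        term = term @ A / k
        P_dt = P_dt + term
    # Step 3: SVD of P_t on the time grid; keep the time of maximal sigma_1.
    P_t = np.eye(n)
    best = {"sigma1": 1.0, "t_star": 0.0, "input": None, "readout": None}
    for step in range(1, int(round(t_max / dt)) + 1):
        P_t = P_dt @ P_t
        L, sigma, R_T = np.linalg.svd(P_t)
        if sigma[0] > best["sigma1"]:
            best = {"sigma1": sigma[0], "t_star": step * dt,
                    "input": R_T[0], "readout": L[:, 0]}
    return {"eig_S": eig_S, "vec_S": vec_S,
            "n_S": int(np.sum(eig_S > 1.0)), **best}

# Usage on a hypothetical two-unit feed-forward network.
result = characterize_transients(np.array([[0.0, 8.0], [0.0, 0.0]]))
```

For large networks, computing a full SVD at every time step is costly; restricting the scan to a coarser time grid or to the leading singular value would be natural refinements.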

### Random Gaussian network

Here we consider a non-normal random connectivity matrix with synaptic strength independently drawn from a Gaussian distribution
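$$J_{ij} \sim \mathcal{N}\!\left(0, \frac{g^{2}}{N}\right) \tag{20}$$
The eigenvalues of **J** are complex and uniformly distributed in a circle of radius *g* [41]:
$$p(\lambda) = \begin{cases} \dfrac{1}{\pi g^{2}} & \text{if } |\lambda| \le g \\[4pt] 0 & \text{otherwise} \end{cases} \tag{21}$$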

For this class of matrices, we can analytically determine the condition for amplified transients, and estimate the number of amplified inputs. In the stable regime (*g* < 1), the symmetric part of the connectivity **J**_{S} can have unstable eigenvalues. In fact, the elements of the symmetric part are distributed according to
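$$(\mathbf{J}_S)_{ij} \sim \mathcal{N}\!\left(0, \frac{g^{2}}{2N}\right), \qquad i \neq j \tag{22}$$
From random matrix theory we know that the eigenvalues of the matrix given by Eq (22) are real and distributed according to the semicircle law [42, 43]:
$$p(\lambda) = \begin{cases} \dfrac{1}{\pi g^{2}}\sqrt{2g^{2} - \lambda^{2}} & \text{if } |\lambda| \le \sqrt{2}\,g \\[4pt] 0 & \text{otherwise} \end{cases} \tag{23}$$
In particular, the spectral radius of **J**_{S} is $\sqrt{2}\,g$, meaning that **J**_{S} has unstable eigenvalues if $g > 1/\sqrt{2}$.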

To estimate the number of amplified initial conditions, we compute the lower bound on their number *N*_{S}(*ϵ*), i.e. the number of eigenvalues of **J**_{S} larger than 1 + *ϵ*:
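$$N_S(\epsilon) = N \int_{1+\epsilon}^{\sqrt{2}\,g} p(\lambda)\, d\lambda \tag{24}$$
The number of eigenvalues of **J**_{S} larger than unity is maximum when *g* is close to (but smaller than) unity. In this case Eq (24) at the first order in *ϵ* translates to
$$N_S(\epsilon) \simeq N \left( \frac{1}{4} - \frac{1}{2\pi} - \frac{\epsilon}{\pi} \right) \tag{25}$$
The maximal capacity of a randomly-connected network is therefore around 10% of the network size *N*.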
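This estimate can be compared with a direct numerical diagonalization. The sketch below assumes that the eigenvalues of **J**_{S} follow a semicircle of radius √2 *g* and evaluates the corresponding fraction above a threshold in closed form; the helper `fraction_above` and the values of *N* and *g* are illustrative.

```python
import numpy as np

def fraction_above(threshold, g):
    """Fraction of a semicircle of radius sqrt(2) g lying above `threshold` (>= 0)."""
    R = np.sqrt(2.0) * g
    if threshold >= R:
        return 0.0
    a = threshold
    return 0.5 - a * np.sqrt(R**2 - a**2) / (np.pi * R**2) - np.arcsin(a / R) / np.pi

rng = np.random.default_rng(8)
n, g = 800, 0.95                         # illustrative values, g close to unity
J = g * rng.standard_normal((n, n)) / np.sqrt(n)
eig_S = np.linalg.eigvalsh(0.5 * (J + J.T))

empirical = np.mean(eig_S > 1.0)         # fraction of eigenvalues of J_S above unity
predicted = fraction_above(1.0, g)       # semicircle estimate of the same fraction
upper_bound = fraction_above(1.0, 1.0)   # limit g -> 1: 1/4 - 1/(2 pi), about 9%
```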

Computing the SVD of the exponential of an *N*-dimensional random matrix is to our knowledge an open mathematical problem. Therefore, for an arbitrary random connectivity matrix, the maximal amount of amplification and the amplified initial conditions are accessible only by numerically computing the SVD of exp(*t*(**J** − **I**)).

### Two-dimensional system

In this section we consider connectivity matrices describing networks composed of two interacting units of the form
$$\mathbf{J} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{26}$$
The eigenvalues of **J** determine the stability of the network and can be expressed in terms of its trace and determinant as follows:
$$\lambda^{\pm} = \frac{1}{2}\left( \mathrm{Tr}(\mathbf{J}) \pm \sqrt{\mathrm{Tr}(\mathbf{J})^{2} - 4\,\mathrm{Det}(\mathbf{J})} \right) \tag{27}$$
For the dynamics to be stable, the largest eigenvalue of **J** needs to satisfy $\mathrm{Re}\,\lambda^{+} < 1$, equivalent to the requirement that Tr(**J** − **I**) < 0 and Det(**J** − **I**) > 0. Note that if the two eigenvalues λ^{±} are real, they are symmetrically centered around Tr(**J**)/2 on the real axis; if they are complex conjugates they have real part equal to Tr(**J**)/2 and are symmetrically arranged on either side of the real axis.

#### Eigenvalues and eigenvectors of J_{S}.

The condition for transient amplification is determined by the two eigenvalues of **J**_{S}, which read:
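$$\lambda^{\pm}(\mathbf{J}_S) = \frac{1}{2}\left( \mathrm{Tr}(\mathbf{J}) \pm \sqrt{\mathrm{Tr}(\mathbf{J})^{2} - 4\,\mathrm{Det}(\mathbf{J}) + 4\Delta^{2}} \right) \tag{28}$$
where we introduced the parameter
$$\Delta = \frac{|b - c|}{2} \tag{29}$$
Δ represents half the absolute difference between the off-diagonal elements of **J**, and provides a measure of how far from symmetric the connectivity matrix is (Δ = 0 meaning symmetric connectivity). Note that the equation for the eigenvalues of **J**_{S} (Eq 28) differs from the one for the eigenvalues of **J** (Eq 27) by the additive term 4Δ^{2} under the square root. Under the assumption of a stable connectivity **J**, there exists a critical value for Δ, given by:
$$\Delta_c = \sqrt{\mathrm{Det}(\mathbf{J} - \mathbf{I})} \tag{30}$$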
above which the rightmost eigenvalue of **J**_{S} is larger than one, meaning that specific inputs are transiently amplified. Note that for a stable **J**, we have Det(**J** − **I**) > 0, implying that Δ_{c} is real. Thus, Δ is the crucial parameter which determines the dynamical regime of the system.
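The criterion can be made concrete with a small helper. In this sketch the entries of the 2 × 2 matrix are labelled (a, b; c, d), Δ is taken as half the absolute difference of the off-diagonal entries (the choice consistent with the 4Δ^{2} term noted above), and the critical value is taken as the square root of Det(**J** − **I**). The function name and the two example matrices are hypothetical.

```python
import numpy as np

def two_unit_regime(J):
    """Stability and amplification regime of a two-unit linear network."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    delta = abs(b - c) / 2.0                       # non-normal parameter
    root_J = np.sqrt(complex(tr**2 - 4.0 * det))   # may be imaginary
    lam_J = ((tr + root_J) / 2.0, (tr - root_J) / 2.0)
    # (a - d)^2 + (b + c)^2 equals Tr^2 - 4 Det + 4 Delta^2 and is never negative.
    lam_S_max = (tr + np.sqrt((a - d)**2 + (b + c)**2)) / 2.0
    stable = max(lam.real for lam in lam_J) < 1.0
    det_shift = det - tr + 1.0                     # Det(J - I)
    delta_c = np.sqrt(det_shift) if stable else float("nan")
    return {"stable": stable, "delta": delta, "delta_c": delta_c,
            "lam_S_max": lam_S_max, "amplified": stable and delta > delta_c}

strongly_nonnormal = two_unit_regime(np.array([[0.0, 8.0], [0.0, 0.0]]))
weakly_nonnormal = two_unit_regime(np.array([[0.2, 0.5], [-0.3, 0.1]]))
```

In both examples the condition Δ > Δ_{c} coincides with the largest eigenvalue of **J**_{S} exceeding one, as expected for a stable **J**.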

#### Decomposition on the modes of J_{S}.

To identify the parameters which determine the maximum amplification of a system, we project the network dynamics onto the orthonormal basis of eigenvectors of **J**_{S}. In the new basis the connectivity matrix is given by Eq (12). Interestingly, the non-normal parameter Δ directly appears in the expression of the anti-symmetric part **J**_{A}, so that we obtain
$$\mathbf{J}' = \begin{pmatrix} \lambda^{+}(\mathbf{J}_S) & \Delta \\ -\Delta & \lambda^{-}(\mathbf{J}_S) \end{pmatrix} \tag{31}$$
up to a sign of the off-diagonal elements. From Eq (31) we see that the non-normal parameter Δ, which determines the dynamical regime of the system, also represents the strength of the coupling between the modes of **J**_{S}. For Δ > Δ_{c} we have $\lambda^{+}(\mathbf{J}_S) > 1$. Thus, at small times, any component of the dynamics on the first mode of **J**_{S} is amplified at a rate $\lambda^{+}(\mathbf{J}_S) - 1$. However, at later times, because of the recurrent feedback of strength Δ between the modes of **J**_{S}, the system reaches a finite amount of amplification and relaxes back to the zero fixed point. In the following we examine how the value of Δ determines the amount of amplification of the system.

#### Propagator of the dynamics.

To examine the dependence of the maximum amplification of the system on the parameter Δ we compute the propagator **P**_{t} and its SVD. A convenient method to compute the exponential of a matrix is provided in [55] (see S2 Text), which we apply to **J**′ to obtain
(32)
where the time-dependent functions *x*_{0}(*t*) and *x*_{1}(*t*) are given by
(33a)
(33b)
Here λ^{+} and λ^{−} are the eigenvalues of **J** (Eq 27).

#### SVD of the propagator.

In order to compute the maximum amplification of the system we next compute the largest singular value of the propagator *σ*_{1}(**P**_{t}) (see S3 Text):
(34)
where
(35)

#### Maximum amplification of the system.

Here we compute the maximal amount of amplification by evaluating the maximum value in time of the amplification envelope *σ*_{1}(**P**_{t}) (Eq 34), and examine its dependence on the non-normal parameter Δ. In particular we find that, for large values of Δ, this dependence is linear.

To derive this relationship, we note that the combination depends on Δ, while does not. Therefore in Eq (35) only the functions *H*(*t*) and *F*(*t*) depend on Δ. In the amplified regime Δ ≫ Δ_{c}, we have that *H*(*t*) ≫ *E*(*t*) for times *t* ≫ 1/Δ (while for small times *δ* ≪ 1/Δ we have *E*(*δt*) = 1 + Tr(**J**)*δt*/2 ≫ Δ*δt* = *H*(*δt*)). In addition, for large values of Δ, we can write so that the singular value can be written as
(36)
To find the value of the maximum amplification we need to compute the time *t** of occurrence of the global maximum of *σ*_{1}(**P**_{t}) and the value *σ*_{1}(**P**_{t*}). The final result is given by
(37)
(38)

The two-dimensional model given by Eq (26) has four free parameters, namely the strengths of the four recurrent connections. In our analysis we fix the values of the trace Tr(**J**) and determinant Det(**J**) of the connectivity matrix, so that the dynamics are stable, and vary the parameter Δ. This implies fixing the eigenvalues λ^{±} and the corresponding timescales *τ*^{±} = 1/(1 − λ^{±}). This approach allows us to explore how different degrees of symmetry in the connectivity, as quantified by Δ, influence the dynamics while keeping the timescales constant. We find that, for Δ ≫ Δ_{c} and for fixed λ^{±}, the maximum amplification of the system scales linearly with the non-normal parameter Δ.
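The linear scaling can be checked numerically with a short sketch. The code below assumes the dynamics d**r**/d*t* = −**r** + **J** **r**, writes the connectivity in the eigenbasis of **J**_{S} as [[s_1, Δ], [−Δ, s_2]] with s_1 and s_2 chosen so that the trace and determinant stay fixed, and evaluates the propagator with the spectral formula for a 2 × 2 matrix with distinct real eigenvalues. The parameter values (trace 0.5, determinant 0) and all function names are hypothetical choices for illustration; the closed-form result of Eqs (37) and (38) is not reproduced here.

```python
import math

def propagator_2d(delta, trace, det, t):
    """exp(t (J' - I)) for J' = [[s1, delta], [-delta, s2]] in the eigenbasis of J_S.

    s1 and s2 are set so that Tr(J') = trace and Det(J') = det.
    Assumes real, distinct eigenvalues of J (trace**2 > 4 * det).
    """
    root_s = math.sqrt(trace ** 2 - 4.0 * det + 4.0 * delta ** 2)
    s1, s2 = (trace + root_s) / 2.0, (trace - root_s) / 2.0
    root = math.sqrt(trace ** 2 - 4.0 * det)
    mu1 = (trace + root) / 2.0 - 1.0   # eigenvalues of J' - I
    mu2 = (trace - root) / 2.0 - 1.0
    A = [[s1 - 1.0, delta], [-delta, s2 - 1.0]]
    c1 = math.exp(mu1 * t) / (mu1 - mu2)
    c2 = math.exp(mu2 * t) / (mu2 - mu1)
    # Spectral formula: exp(tA) = c1 (A - mu2 I) + c2 (A - mu1 I)
    return [[c1 * (A[i][j] - mu2 * (i == j)) + c2 * (A[i][j] - mu1 * (i == j))
             for j in range(2)] for i in range(2)]

def largest_singular_value(M):
    """Largest singular value of a real 2 x 2 matrix."""
    frob = sum(M[i][j] ** 2 for i in range(2) for j in range(2))
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return math.sqrt((frob + math.sqrt(max(frob ** 2 - 4.0 * det ** 2, 0.0))) / 2.0)

def max_amplification(delta, trace=0.5, det=0.0, t_max=10.0, dt=0.01):
    """Maximum over a time grid of the amplification envelope sigma_1(P_t)."""
    times = [k * dt for k in range(1, int(t_max / dt) + 1)]
    return max(largest_singular_value(propagator_2d(delta, trace, det, t)) for t in times)

# Doubling delta at fixed eigenvalues should roughly double the maximum amplification
ratio = max_amplification(40.0) / max_amplification(20.0)
print("amplification ratio for delta = 40 vs 20:", ratio)
```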

### Rank-1 connectivity

In this section we consider a unit-rank connectivity matrix defined by
**J** = Δ**uv**^{T} (41)
where the vectors **u** and **v** are two *N*-dimensional vectors generated as
where the vectors **x**_{1}, **x**_{2} and **y** are *N*-dimensional vectors with components drawn from a Gaussian distribution with mean zero and variance 1/*N* and *ρ* is a number between −1 and 1 [45]. The average norm and correlation are given by 〈**u** ⋅ **u**〉 = 〈**v** ⋅ **v**〉 = 1 and 〈**u** ⋅ **v**〉 = *ρ*, and Δ is an overall scaling parameter. We consider only positive values of Δ, since a minus sign can be absorbed in the correlation coefficient *ρ*. The matrix **J** has *N* − 1 eigenvalues equal to zero and one eigenvalue given by λ = Δ*ρ*, associated with the eigenvector **u**. In the two-dimensional plane spanned by **u** and **v**, the direction orthogonal to **v** specifies another eigenvector of **J** corresponding to one of the zero eigenvalues.

#### Eigenvalues and eigenvectors of J_{S}.

We first compute the eigenvalues and eigenvectors of the symmetric part of the connectivity
**J**_{S} = Δ(**uv**^{T} + **vu**^{T})/2 (42)
**J**_{S} is a rank-2 matrix, meaning that it has in general two non-zero eigenvalues given by
(43)
Here denotes the determinant of **J**_{S} restricted to the **uv**-plane, i.e. the determinant of the 2 × 2 matrix [**u**, **u**_{⊥}]^{T} **J**_{S}[**u**, **u**_{⊥}], where **u**_{⊥} is a vector perpendicular to **u** on the **uv**-plane (the determinant of the full matrix **J**_{S} is zero because of the zero eigenvalues of **J**_{S}). We find that the two non-zero eigenvalues of the symmetric part **J**_{S} are given by (see S4 Text)
λ_{±}(**J**_{S}) = (λ ± Δ)/2 = Δ(*ρ* ± 1)/2 (44)
Note that the eigenvalues of **J**_{S} are symmetrically centered around λ/2, and their displacement is controlled by the scaling parameter Δ. The condition for the system to be in the regime of transient amplification is therefore
(λ + Δ)/2 > 1, i.e. Δ > 2 − λ (45)

To compute the eigenvectors associated with the non-zero eigenvalues we have to solve the eigenvector equation
(46)
Since the two eigenvectors lie on the **uv**-plane, we can write them in the form **u** + *α* **v** and **u** + *β* **v**. Solving the eigenvector equation for *α* and *β* yields *α* = 1 and *β* = −1. The two normalized eigenvectors of **J**_{S} are thus given by
(**u** + **v**)/√(2(1 + *ρ*)) and (**u** − **v**)/√(2(1 − *ρ*)) (47)
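The eigen-structure described in this subsection can be verified on a small example. The sketch below assumes **J** = Δ**uv**^{T} and its symmetric part Δ(**uv**^{T} + **vu**^{T})/2, with structure vectors of exactly unit norm and overlap *ρ* (in the paper these properties hold on average for large *N*). It checks that **u** ± **v** are eigenvectors of **J**_{S} with eigenvalues Δ(*ρ* ± 1)/2, consistent with the value Δ(1 + *ρ*)/2 quoted later in the text. The three-dimensional vectors and helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def apply_rank_one(delta, u, v, x):
    """Return J x for J = delta * u v^T without building the matrix."""
    scale = delta * dot(v, x)
    return [scale * ui for ui in u]

def apply_symmetric_part(delta, u, v, x):
    """Return J_S x for J_S = delta * (u v^T + v u^T) / 2."""
    a, b = dot(v, x), dot(u, x)
    return [0.5 * delta * (a * ui + b * vi) for ui, vi in zip(u, v)]

# Hypothetical unit-norm structure vectors with overlap rho
rho, delta = 0.3, 4.0
u = [1.0, 0.0, 0.0]
v = [rho, math.sqrt(1.0 - rho ** 2), 0.0]

plus = [ui + vi for ui, vi in zip(u, v)]    # unnormalized eigenvector u + v
minus = [ui - vi for ui, vi in zip(u, v)]   # unnormalized eigenvector u - v
lam_plus = delta * (rho + 1.0) / 2.0
lam_minus = delta * (rho - 1.0) / 2.0

# Transient amplification requires the rightmost eigenvalue of J_S to exceed one
amplified = lam_plus > 1.0
print("eigenvalues of J_S:", lam_plus, lam_minus, "amplified:", amplified)
```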

#### Decomposition on the modes of J_{S}.

We can project the dynamics of the system on the basis of eigenvectors of **J**_{S}. Let **V**_{S} be the *N*-dimensional matrix containing the eigenvectors of **J**_{S} as columns:
(48)
where the *ξ*_{i}’s are *N* − 2 arbitrary vectors orthogonal to both **u** and **v**. The projection of the connectivity matrix **J** onto the modes of **J**_{S} yields the new connectivity **J**′:
(49)
From Eq (49) we see that the parameter Δ controls the strength of the coupling between the modes of **J**_{S} through the term . Thus, in the following analysis, we examine the amplification properties of the system as a function of the parameter Δ.

#### Propagator of the dynamics.

We explicitly compute the expression of the propagator for the unit-rank system. From the definition of the matrix exponential as an infinite sum of matrix powers we obtain
(50)
Therefore the final expression for the propagator is given by
(51)
where we introduced
(52)
Note that the non-trivial dynamics of the system are restricted to the plane spanned by **u** and **v**. In fact any component of the initial condition orthogonal to this plane decays to zero as *e*^{−t}, as does any component orthogonal to **v** in the **uv**-plane. From this it follows that non-monotonic transients occur only if the initial condition of the system has a non-zero component on the structure vector **v**.
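A short sketch can make this concrete. It assumes the dynamics d**r**/d*t* = −**r** + **J** **r** with **J** = Δ**uv**^{T}, and the closed form **P**_{t} = *e*^{−t}(**I** + Δ*α*(*t*)**uv**^{T}) with *α*(*t*) = (*e*^{λt} − 1)/λ, which follows from summing the exponential series using (**uv**^{T})^{k} = *ρ*^{k−1}**uv**^{T} and reduces to *α* = *t* for λ = 0. The closed form is compared with a forward-Euler integration; vectors, parameter values and function names are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def alpha(t, lam):
    """(exp(lam * t) - 1) / lam, with the limit t as lam -> 0."""
    return t if abs(lam) < 1e-12 else math.expm1(lam * t) / lam

def response_closed_form(delta, u, v, r0, t):
    """P_t r0 with P_t = exp(-t) * (I + delta * alpha(t) * u v^T)."""
    lam = delta * dot(u, v)
    gain = delta * alpha(t, lam) * dot(v, r0)
    return [math.exp(-t) * (ri + gain * ui) for ri, ui in zip(r0, u)]

def response_euler(delta, u, v, r0, t, dt=1e-4):
    """Forward-Euler integration of dr/dt = -r + delta * u (v . r)."""
    r = list(r0)
    for _ in range(int(round(t / dt))):
        drive = delta * dot(v, r)
        r = [ri + dt * (-ri + drive * ui) for ri, ui in zip(r, u)]
    return r

# Hypothetical orthonormal structure vectors (rho = 0)
u = [1.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
delta = 5.0

r_exact = response_closed_form(delta, u, v, v, 1.0)   # initial condition along v
r_num = response_euler(delta, u, v, v, 1.0)
norm_along_v = math.sqrt(dot(r_exact, r_exact))

# An initial condition orthogonal to the uv-plane simply decays as exp(-t)
r_orth = response_closed_form(delta, u, v, [0.0, 0.0, 1.0], 1.0)
print("norm at t = 1 starting from v:", norm_along_v)
```

Starting from **v**, the norm of the activity exceeds its initial value of one, whereas the orthogonal initial condition decays monotonically.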

#### SVD of the propagator.

To study how the maximum amplification depends on Δ we compute the amplification envelope *σ*_{1}(**P**_{t}). The singular values of the propagator **P**_{t} are given by the square roots of the eigenvalues of the matrix **P**_{t}^{T}**P**_{t}. From Eq (51) we can write
(53)
We obtain the expression for the singular values of the propagator *σ*_{1,2}(**P**_{t}) as a function of Δ and λ (see S5 Text):
(54)
The other *N* − 2 singular values of **P**_{t} are equal to *e*^{−t}.

#### Choice of the free parameters.

For the unit-rank system, two parameters out of Δ, λ and *ρ* can vary independently. Since we set Δ as a free parameter, we need to fix the second independent parameter. We explore three scenarios, which imply different scalings of λ or *ρ* with the parameter Δ:

- Keep the eigenvalue λ constant, so as to fix the timescale *τ* = 1/(1 − λ), and vary Δ. In this case the correlation *ρ* between **u** and **v** scales according to *ρ* = λ/Δ, meaning that increasing Δ makes the structure vectors more orthogonal to each other.
- Fix the correlation between the structure vectors, *ρ*, to a positive value and vary Δ. Increasing Δ increases the timescale of the system *τ* = 1/(1 − Δ*ρ*), up to the point where the system becomes unstable, i.e. for λ > 1, or equivalently Δ > 1/*ρ*.
- Keep *ρ* fixed to a negative value. In this case Δ can be increased without bound, and higher values of Δ decrease the timescale *τ*.

#### Maximum amplification of the system.

The singular values of the propagator given by Eq (54) depend in a complex manner on Δ and λ. To understand how the maximum amplification of the system depends on Δ, we study the limit of very large Δ, defined as
(55)
which we call the *strong amplification regime*. Note that in general the eigenvalue λ depends on Δ, according to λ(Δ) = Δ*ρ*. For fixed λ, Eq (55) is given by , while for a fixed value of *ρ*, Eq (55) translates into Δ ≫ 2(1−*ρ*) (with the additional constraint Δ < 1/*ρ* ensuring stability, in case *ρ* > 0). If the condition given by Eq (55) is met, we can approximate Eq (54) for times *t* ≫ 2/Δ as
(56)
For large Δ we can neglect the first two terms on the right hand side and write the largest singular value as
(57)
The maximum amplification of the system corresponds to the maximum value in time of *σ*_{1}(**P**_{t}). In the strong amplification regime (Eq 55) the time *t** at which the singular value attains its maximum is independent of Δ and reads:
(58)
Thus, the maximum amplification increases monotonically with Δ:
(59)
where *g*(λ) is a multiplicative factor which depends on the eigenvalue λ. Different choices of the free parameters imply different growths of the maximum amplification with Δ:

- For λ fixed and *ρ* = λ/Δ, the maximum amplification increases linearly with Δ.
- For *ρ* > 0 fixed and λ = Δ*ρ*, the maximum amplification increases monotonically with Δ, until it reaches a value equal to Δ for Δ = 1/*ρ* (or λ = 1).
- For *ρ* < 0 fixed and λ = Δ*ρ*, the amplification increases monotonically with Δ, but it saturates at a value given by 1/|*ρ*|. This follows from the fact that
  (60)

In the case *ρ* = 0 the maximum amplification grows linearly as Δ/*e*, since
(61)
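These scalings can be explored numerically. The sketch below assumes the propagator **P**_{t} = *e*^{−t}(**I** + Δ*α*(*t*)**uv**^{T}) with *α*(*t*) = (*e*^{λt} − 1)/λ and unit-norm structure vectors, so that in the orthonormal basis (**u**, **u**_{⊥}) of the **uv**-plane the propagator is an upper-triangular 2 × 2 matrix. It evaluates the largest singular value on a time grid rather than using Eq (54); the function names and parameter values are hypothetical.

```python
import math

def sigma1_rank_one(delta, rho, t):
    """Largest singular value of P_t restricted to the uv-plane.

    In the basis (u, u_perp) the restricted propagator is
    exp(-t) * [[exp(lam t), delta * alpha * sqrt(1 - rho^2)], [0, 1]].
    """
    lam = delta * rho
    alpha = t if abs(lam) < 1e-12 else math.expm1(lam * t) / lam
    a = math.exp(lam * t)
    b = delta * alpha * math.sqrt(1.0 - rho ** 2)
    frob = a * a + b * b + 1.0
    det = a  # determinant of the upper-triangular block
    s = math.sqrt((frob + math.sqrt(max(frob * frob - 4.0 * det * det, 0.0))) / 2.0)
    return math.exp(-t) * s

def max_amplification(delta, rho, t_max=20.0, dt=1e-3):
    """Maximum of the amplification envelope over a time grid."""
    return max(sigma1_rank_one(delta, rho, k * dt) for k in range(int(t_max / dt) + 1))

# rho = 0: the maximum should be close to delta / e
amp_rho0 = max_amplification(50.0, 0.0)

# Fixed negative rho: the maximum should stay below 1 / |rho|
amp_neg = max_amplification(200.0, -0.5)

# Fixed lambda = 0.5 (rho = lambda / delta): doubling delta roughly doubles the maximum
ratio_fixed_lam = max_amplification(40.0, 0.5 / 40.0) / max_amplification(20.0, 0.5 / 20.0)

print(amp_rho0, 50.0 / math.e)
print(amp_neg, 1.0 / 0.5)
print(ratio_fixed_lam)
```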

The third case illustrates the general observation that the largest eigenvalue of the symmetric part of **J** does not provide direct information about the maximum amount of amplification that the system can reach (the maximum value of ∥**r**(*t*)∥ over time). Here *ρ* < 0 and the largest eigenvalue of the symmetric part of the connectivity is given by λ_{max}(**J**_{S}) = Δ(1 + *ρ*)/2. Therefore, while λ_{max}(**J**_{S}) can grow indefinitely by increasing the value of Δ, the maximum amplification saturates at a finite level given by 1/|*ρ*|. This indicates that the value of the largest eigenvalue of **J**_{S} is in general inadequate to characterize the maximum amplification of the system, for which other measures may be considered (see Discussion).

#### Optimally amplified initial condition and optimal readout.

Using the result we found for the two dimensional case, Eqs (40) and (44), we can determine the angles and of the optimal initial condition and optimal readout with respect to the first mode of **J**_{S} as
(62)
where the + and − signs correspond respectively to and . The optimally amplified initial condition and optimal readout are thus given by
(63)
Here we examine and in the strong amplification regime (Eq 55). We summarize our results as follows.

- For fixed λ and *ρ* = λ/Δ, we have
  (64)
  up to the first order in Δ^{−1}. In the strong amplification regime the second term on the right hand side is much smaller than unity, so that we can compute and at the first order in Δ^{−1}. Denoting by and respectively the vectors orthogonal to **v** and **u** in the **uv**-plane, we can write
  (65)
  In the strong amplification regime the optimal initial condition is thus strongly aligned with **v** and the optimal readout with the vector **u**.
- For fixed *ρ* > 0 and λ = Δ*ρ*, we compute the value of tan *θ** for the largest value Δ can take before the system becomes unstable, i.e. Δ = 1/*ρ*. For this value we have
  (66)
  Thus we have
  (67)
- For fixed *ρ* < 0 and λ = Δ*ρ*, we can write
  (68)
  so that
  (69)

In conclusion we find that, in the strong amplification regime, the optimal input has a strong component on the structure vector **v**, while the optimal readout is strongly aligned with **u**. In cases (2) and (3), however, this requires the additional condition that *ρ* be small.
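The alignment of the optimal input with **v** and of the optimal readout with **u** can be illustrated with the singular vectors of the propagator. The sketch below treats the simplest case, *ρ* = 0, where (**u**, **v**) is an orthonormal basis of the plane and the propagator restricted to it is assumed to be *e*^{−t}[[1, Δ*t*], [0, 1]]; it is evaluated at *t** = 1, the peak time quoted for this case. The helper function and parameter values are hypothetical.

```python
import math

def top_singular_vectors(M):
    """Leading right and left singular vectors and singular value of a real 2 x 2 matrix."""
    # Gram matrix M^T M = [[a, b], [b, c]]
    a = M[0][0] ** 2 + M[1][0] ** 2
    b = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    c = M[0][1] ** 2 + M[1][1] ** 2
    top = (a + c) / 2.0 + math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    # Eigenvector of the Gram matrix associated with its largest eigenvalue
    if abs(b) > 1e-15:
        right = [b, top - a]
    else:
        right = [1.0, 0.0] if a >= c else [0.0, 1.0]
    n = math.hypot(right[0], right[1])
    right = [right[0] / n, right[1] / n]
    sigma = math.sqrt(top)
    left = [(M[0][0] * right[0] + M[0][1] * right[1]) / sigma,
            (M[1][0] * right[0] + M[1][1] * right[1]) / sigma]
    return right, left, sigma

# Coordinates are (component on u, component on v)
delta, t_star = 50.0, 1.0
decay = math.exp(-t_star)
P = [[decay, decay * delta * t_star],
     [0.0, decay]]

optimal_input, optimal_readout, amplification = top_singular_vectors(P)
print("optimal input (u, v):", optimal_input)
print("optimal readout (u, v):", optimal_readout)
print("amplification:", amplification)
```

For large Δ the optimal input has almost all of its weight on **v**, the optimal readout almost all of its weight on **u**, and the amplification is close to Δ/*e*.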

### Robustness of the readout to noise in the connectivity

In this section we study the dynamics of the system in the presence of noise in the synaptic connectivity. We consider the connectivity matrix given by Eq (41), which implements a single transient pattern, and we add uncorrelated noise of standard deviation *g*/√*N* to each weight Δ*u*_{i} *v*_{j}. The resulting connectivity matrix can be written as the sum of a structured unit-rank part and a Gaussian random matrix of the form [35]
**J** = Δ**uv**^{T} + *g* **χ** (70)
The elements of **χ** are independently drawn from a Gaussian distribution with zero mean and variance 1/*N* and are uncorrelated with the structured part. In the limit of large *N*, the matrix **J** has one eigenvalue equal to the eigenvalue of the unit-rank part, λ = Δ*ρ*, while the other *N* − 1 eigenvalues are uniformly distributed in a circle of radius *g*. This holds under the condition that the operator norm of the unit-rank part max_{**x**}∥Δ**uv**^{T}**x**∥ is *O*(1) [56]. Since the structure vectors **u** and **v** have unit norm, the operator norm of the unit-rank part is equal to Δ. Therefore, if Δ is *O*(1), the condition for the stability of the system is max{λ, *g*} < 1.

#### Eigenvalues of J_{S}.

To draw the phase diagram of the system, we compute the eigenvalues of the symmetric part of **J**
(71)
where **χ**_{S} denotes the symmetric part of **χ**. The entries of **χ**_{S} are distributed according to
(72)
We can express the eigenvalues of **J**_{S} as a function of *g* and of the eigenvalues of the symmetric part of the unit-rank matrix (see Eq 44) [57, 58]. In particular, the rightmost eigenvalue of **J**_{S} is given by
(73)
where corresponds to the spectral radius of **χ**_{S}. We distinguish two cases:

- If , is larger than one only if the two conditions
  (74)
  are satisfied. The first inequality is satisfied if or . Since for we have , the condition for the amplified regime becomes
  (75)
- If , the inequality (λ + Δ)/2 + *g*^{2}/(λ + Δ) > 1 is always satisfied for λ + Δ > 0, thus holding also for . From Eq (73) we conclude that, for , λ_{max}(**J**_{S}) is larger than one independently of the values of λ and Δ.

In the case , adding noise in the connectivity has a small effect on the phase diagram of the system. In fact, Eq (75) can be approximated as λ + Δ ≳ 2 − *g*^{2}, which leads to a correction of order *g*^{2} to the condition for the amplified regime in absence of noise (see S1 Fig).
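The phase boundary discussed here can be sketched in a few lines. The outlier expression (λ + Δ)/2 + *g*^{2}/(λ + Δ) is taken from the inequality quoted above. The bulk edge √2 *g* of the symmetrized random part, and the rule that the outlier exists only when (λ + Δ)/2 exceeds *g*/√2, are assumptions based on standard results for low-rank perturbations of symmetric Gaussian matrices and are not quoted from the paper. Function names are hypothetical.

```python
import math

def rightmost_eig_JS(lam, delta, g):
    """Rightmost eigenvalue of J_S in the large-N limit (assumed form).

    theta is the top eigenvalue of the symmetric part of the unit-rank term.
    If theta is large enough it produces an outlier; otherwise the rightmost
    eigenvalue sits at the assumed edge sqrt(2) * g of the random bulk.
    """
    theta = (lam + delta) / 2.0
    if theta > g / math.sqrt(2.0):
        return theta + g * g / (2.0 * theta)
    return math.sqrt(2.0) * g

def amplified_threshold(g):
    """Value of lam + delta at which the outlier crosses one, for g < 1/sqrt(2)."""
    return 1.0 + math.sqrt(1.0 - 2.0 * g * g)

for g in (0.0, 0.1, 0.5):
    exact = amplified_threshold(g)
    approx = 2.0 - g * g   # small-g approximation quoted in the text
    print("g =", g, "threshold =", exact, "approximation =", approx)
```

For *g* = 0 the threshold reduces to λ + Δ = 2, and for small *g* it stays close to 2 − *g*^{2}.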

#### Robustness of the readout activity.

Here we examine the magnitude of the fluctuations around the mean activity introduced by the random term in the connectivity given by Eq (70). In particular we assess the robustness of the readout projection of the response evoked by the optimal stimulus of the noiseless system, i.e. *g* = 0 (for a discussion on the effects of the connectivity noise on the activity orthogonal to the **uv**-plane see S8 Text). For simplicity, we assume that the correlation between the structure vectors, *ρ*, is close to zero, and that the condition for the strong amplification regime is satisfied (Eq (55)). Therefore, the optimal stimulus is strongly aligned with **v**, while the corresponding readout is **u**. We consider the system
(76)
Each neuron receives independent noise with mean zero, variance *σ*^{2} and autocorrelation function 〈*η*_{i}(*t*)*η*_{j}(*t*′)〉 = *δ*_{ij} *δ*(*t* − *t*′), where the angular brackets represent the average over the noise in the input and in the connectivity. In the limit of large *N*, the equation for the mean activity depends only on the structured part of the connectivity:
(77)
Thus, the mean activity in response to an input along **v** is given by (see Eq 51)
(78)
From Eq (77) we write the equation for the fluctuations of *r*_{i}(*t*) around the mean activity, *δr*_{i}(*t*) = *r*_{i}(*t*) − 〈*r*_{i}(*t*)〉, as
(79)
where in the third term on the right hand side we neglected the corrections to *r*_{i}(*t*) due to the random component of the connectivity and input noise, keeping only the 0-th order term in *g*, i.e. 〈*r*_{i}(*t*)〉. Using Eq (78) we can write the solution of Eq (79) as
(80)

The time-dependent correlation matrix **C**(*t*) = 〈*δ* **r**(*t*)*δ* **r**(*t*)^{T}〉 can be written as the sum of two terms, corresponding to the contributions of the noise in the connectivity (with variance proportional to *g*^{2}) and the noise in the input (with variance *σ*^{2}):
(81)
where in the first term in the right hand side we used 〈*χ*_{kl} *χ*_{mn}〉 = *δ*_{km} *δ*_{ln}/*N*.

We start by computing the first term in Eq (81). Since the elements of the matrix propagator and the mean activity are known (see Eqs 51 and 78), we can compute for a given realization of the structured part (see S7 Text). The variance of the activity along the direction of the readout **u** due to the noise in the connectivity is computed by projecting the matrix **C**^{g} onto **u**. In particular we compute the variance of *δr*_{u} and at the peak of the transient phase (*t** ≃ 1, see Eq (58)). As a result, the fluctuations of the readout activity at *t* = *t** due to the noise in the connectivity read:
(82)
and scale as (for large Δ).

Computing the variance of the activity along the readout **u** due to the input noise yields (see S7 Text)
(83)
From Eq (81), we can write the total amount of variability along the readout **u** at the peak amplification as
(84)
Note that the fluctuations along **u** due to the noise in the input do not depend on the size of the network *N*. Therefore, in the limit of large *N*, only the input noise affects the readout activity significantly. By computing the signal-to-noise ratio (*SNR*) of the readout activity, we can assess the reliability of the readout in presence of input noise. The signal of the readout is simply the amplification level at the peak of the transient phase. Since for orthogonal structure vectors (*ρ* ≃ 0) the amplification grows as Δ/*e*, we find
(85)
The readout is reliable if its signal-to-noise ratio is much larger than unity. Interestingly, for large values of Δ (see Eq 55), the SNR is independent of Δ, so that increasing the amplification does not improve the SNR significantly (see S2 Fig). In fact, for Δ ≫ 2, we can approximate Eq (85) as
(86)
In this regime, the critical value of *σ* above which the *SNR* becomes smaller than unity is:
(87)
We observe that the signal-to-noise ratio along the initial state (the vector **v**) is simply given by *SNR*_{0}(*σ*) = 1/*σ*, so that the critical value of *σ* above which *SNR*_{0} becomes smaller than unity is given by *σ*_{c,0} = 1. As a result, the maximum gain in *SNR* that strongly amplified networks can achieve is less than 20%. The signal-to-noise ratio along the transient readout **u** and along the initial state **v** are therefore comparable. However, since *SNR*(*σ*;Δ)<*SNR*_{0} for non amplified dynamics (Δ < 2), transient amplification is needed to keep a stable *SNR* across initial state and transient readout.

### Robustness to multiple stored patterns and capacity of the network

In this section we examine the robustness of the transient readouts when *P* transient trajectories are encoded in the connectivity **J**. We consider a connectivity matrix given by the sum of *P* unit-rank matrices
(88)
where the elements of the vectors **u**^{(p)} and **v**^{(p)} are randomly and independently distributed with zero mean and variance equal to 1/*N*. Therefore, for large *N* and for *P* ≤ *N*/2, these vectors are close to orthogonal to each other, meaning that the correlation between all the pairs of structure vectors, *ρ*, is close to zero. For simplicity, we assume that the non-normal parameter Δ is the same for all stored trajectories. We first study the case of two stored transient trajectories (*P* = 2), and then generalize to an extensive number of patterns, *P* = *O*(*N*).

#### Two encoded transient trajectories.

The connectivity matrix in this case is given by
(89)
Since the four structure vectors in Eq (89) are uncorrelated with each other, in the limit of large *N*, we can factorize the full propagator of the dynamics as the product of the propagators of the single unit-rank parts (see Eq (51)) and obtain (see S9 Text)
(90)
where *α*(*t*; λ = 0) = *t* (see Eq 52). From Eq (90) we see that, in high dimensionality, the two transient patterns do not interact. In fact, any initial condition defined on the plane spanned by **u**^{(1)} and **v**^{(1)} evokes a two-dimensional trajectory which remains confined on the same plane. The same holds for the dynamics on the plane defined by **u**^{(2)} and **v**^{(2)}.
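The absence of interaction between the two patterns can be illustrated in the idealized case of exactly orthonormal structure vectors. The sketch below assumes the dynamics d**r**/d*t* = −**r** + **J** **r** and uses four standard basis vectors as **u**^{(1)}, **v**^{(1)}, **u**^{(2)}, **v**^{(2)}; for random vectors in high dimension the same statement holds only approximately. All names and values are illustrative.

```python
def simulate(J, r0, t_end, dt=1e-3):
    """Forward-Euler integration of dr/dt = -r + J r."""
    r = list(r0)
    n = len(r)
    for _ in range(int(round(t_end / dt))):
        Jr = [sum(J[i][j] * r[j] for j in range(n)) for i in range(n)]
        r = [r[i] + dt * (-r[i] + Jr[i]) for i in range(n)]
    return r

def outer_sum(delta, pairs, n):
    """J = delta * sum_p u^(p) v^(p)T for a list of (u, v) pairs."""
    return [[delta * sum(u[i] * v[j] for u, v in pairs) for j in range(n)]
            for i in range(n)]

# Hypothetical exactly orthonormal structure vectors in four dimensions
u1, v1 = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]
u2, v2 = [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]
delta = 6.0
J = outer_sum(delta, [(u1, v1), (u2, v2)], 4)

# Start along v^(1): the activity should stay in the plane of u^(1) and v^(1)
r = simulate(J, v1, t_end=1.0)
print("activity at t = 1:", r)
```

The components along **u**^{(2)} and **v**^{(2)} remain zero, while the component along **u**^{(1)} is close to Δ/*e*.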

#### Extensive number of encoded trajectories and capacity of the network.

When the number of encoded trajectories *P* is of order *N*, we cannot factorize the propagator as in the case of two stored patterns, due to the stronger correlations between the 2*P* structure vectors **u**^{(p)} and **v**^{(p)}. However, the results for the case of one stored pattern with connectivity noise can be applied to this case if we write the connectivity matrix in Eq (88) as
(91)
Here we isolate the first term of the sum but, since all the *P* patterns are statistically equivalent, the choice of the first pattern is arbitrary. The vectors **u**^{(i)} and **v**^{(i)} are uncorrelated with each other, so that we can consider the second term on the right hand side of Eq (91) effectively as noise in the connectivity **J** = Δ**u**^{(1)}**v**^{(1)T}, with mean zero and variance Δ^{2} *P*/*N*^{2}. In fact, the mean and the variance of the effective noise are given respectively by
(92)
and
(93)
Applying the results from the previous sections with , we can state that the noise coming from the additional *P*−1 patterns adds fluctuations of the order to the projection of the activity on the readout **u**^{(1)} corresponding to the stimulus **v**^{(1)}. Since the number of encoded patterns *P* is extensive, the readout fluctuations scale as .

However, when a number *P* of trajectories are encoded in **J**, we are not guaranteed that the connectivity has stable eigenvalues. Indeed, the eigenvalues of the matrix are distributed in a circle of radius (yet the spectral density is not uniform, since Eq (88) can be written as the product of two rectangular Gaussian matrices) [59]. Thus, to ensure overall stability we need , resulting in a maximal number of patterns *P*_{max} that can be stored in the connectivity before the system becomes unstable. This number defines the capacity of the system and is given by
(94)
If the vectors **u**^{(p)} and **v**^{(p)} are exactly orthogonal to each other for all *p*, therefore forming an orthonormal basis in ℝ^{*N*}, Eq (94) reduces to *P*_{max} = *N*/2. From Eq (94) we see that, for fixed Δ, the number of transient trajectories that we can encode in the connectivity matrix scales linearly with the size of the system, *N*. The capacity of the system rapidly drops when Δ is increased, meaning that more strongly amplified systems can encode fewer stimuli. When the structure vectors are orthogonal to each other as in our case (*ρ* ≃ 0), the system is amplified for Δ > 2 (see Eq 45). Therefore, Eq (94) evaluated at Δ = 2 provides an upper bound on the capacity for an amplified system with uncorrelated structure vectors:
(95)
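The capacity argument can be sketched numerically. The code below assumes that the effective noise strength *g* is defined by matching *g*^{2}/*N* to the effective-noise variance Δ^{2}*P*/*N*^{2} stated above, and that stability requires *g* < 1, as in the single-pattern case with connectivity noise. Under these assumptions the number of patterns is bounded by *N*/Δ^{2}; this expression is a reconstruction for random structure vectors and is not quoted from Eq (94). Function names are hypothetical.

```python
import math

def effective_noise_strength(delta, P, N):
    """g such that g^2 / N equals the effective-noise variance delta^2 * P / N^2."""
    return delta * math.sqrt(P / N)

def capacity(N, delta):
    """Number of patterns at which the effective radius reaches one (assumed form)."""
    return N / delta ** 2

N = 1000
for delta in (2.0, 3.0, 4.0):
    p_max = capacity(N, delta)
    g_at_max = effective_noise_strength(delta, p_max, N)
    print("delta =", delta, "P_max =", p_max, "g at P_max =", g_at_max)
```

The bound grows linearly with *N* at fixed Δ and drops quickly as Δ increases, in line with the trends described above.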

## Supporting information

### S1 Fig. Phase diagram for the unit-rank network with connectivity noise.

**A**. . The red line indicates the boundary between the monotonic and amplified parameter regions for *g* = 0.5. The grey dashed line corresponds to the case *g* = 0. **B**. . The dynamics are amplified regardless of the values of the parameters Δ and *ρ*.

https://doi.org/10.1371/journal.pcbi.1007655.s010

(TIF)

### S2 Fig. Signal-to-noise ratio of the readout in presence of external input noise.

Signal-to-noise ratio of the readout as a function of the standard deviation of the input noise *σ* for two values of the non-normal parameter Δ. Non-amplified dynamics (Δ = 1) are less robust to noise than amplified dynamics (Δ = 4). Dashed lines correspond to the theoretical values (Eq 85). In simulations, *N* = 1000. Error bars represent the standard deviation of the mean over 200 realizations of the connectivity matrix.

https://doi.org/10.1371/journal.pcbi.1007655.s011

(TIF)

## Acknowledgments

We are grateful to Francesca Mastrogiuseppe and Manuel Beiran for discussions and feedback on the manuscript.

## References

- 1. Seung HS, Sompolinsky H. Simple models for reading neuronal population codes. Proceedings of the National Academy of Sciences. 1993;90(22):10749–10753.
- 2. Pouget A, Dayan P, Zemel R. Information processing with population codes. Nature Reviews Neuroscience. 2000;1(2):125–132. pmid:11252775
- 3. Pouget A, Dayan P, Zemel R. Inference and computation with population codes. Annual Review of Neuroscience. 2003;26(1):381–410. pmid:12704222
- 4. Rabinovich M, Huerta R, Laurent G. Transient dynamics for neural processing. Science. 2008;321(5885):48–50. pmid:18599763
- 5. Rabinovich MI, Huerta R, Varona P, Afraimovich VS. Transient cognitive dynamics, metastability, and decision making. PLOS Computational Biology. 2008;4(5):1–9.
- 6. Durstewitz D, Deco G. Computational significance of transient dynamics in cortical networks. European Journal of Neuroscience. 2008;27(1):217–227. pmid:18093174
- 7. Buonomano DV, Maass W. State-dependent computations: spatiotemporal processing in cortical networks. Nature Reviews Neuroscience. 2009;10(2):113–125. pmid:19145235
- 8. Brody CD, Hernández A, Zainos A, Romo R. Timing and neural encoding of somatosensory parametric working memory in macaque prefrontal cortex. Cerebral Cortex. 2003;13(11):1196–1207. pmid:14576211
- 9. Crowe DA, Averbeck BB, Chafee MV. Rapid Sequences of Population Activity Patterns Dynamically Encode Task-Critical Spatial Information in Parietal Cortex. Journal of Neuroscience. 2010;30(35):11640–11653. pmid:20810885
- 10. Jun JK, Miller P, Hernández A, Zainos A, Lemus L, Brody CD, et al. Heterogenous population coding of a short-term memory and decision task. Journal of Neuroscience. 2010;30(3):916–929. pmid:20089900
- 11. Shafi M, Zhou Y, Quintana J, Chow C, Fuster J, Bodner M. Variability in neuronal activity in primate cortex during working memory tasks. Neuroscience. 2007;146(3):1082–1108. pmid:17418956
- 12. Laje R, Buonomano DV. Robust timing and motor patterns by taming chaos in recurrent neural networks. Nature Neuroscience. 2013;16:925–933. pmid:23708144
- 13. Chaisangmongkon W, Swaminathan SK, Freedman DJ, Wang XJ. Computing by robust transience: how the fronto-parietal network performs sequential, category-based decisions. Neuron. 2017;93(6):1504–1517.e4. pmid:28334612
- 14. Goudar V, Buonomano DV. Encoding sensory and motor patterns as time-invariant trajectories in recurrent neural networks. eLife. 2018;7:e31134. pmid:29537963
- 15. Churchland MM, Shenoy KV. Temporal complexity and heterogeneity of single-neuron activity in premotor and motor cortex. Journal of Neurophysiology. 2007;97(6):4235–4257. pmid:17376854
- 16. Mazor O, Laurent G. Transient dynamics versus fixed points in odor representations by locust antennal lobe projection neurons. Neuron. 2005;48(4):661–673. pmid:16301181
- 17. Machens CK. Demixing population activity in higher cortical areas. Frontiers in Computational Neuroscience. 2010;4:126. pmid:21031029
- 18. Mante V, Sussillo D, Shenoy KV, Newsome WT. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature. 2013;503:78–84. pmid:24201281
- 19. Cunningham JP, Yu BM. Dimensionality reduction for large-scale neural recordings. Nature Neuroscience. 2014;17(11):1500–1509. pmid:25151264
- 20. Kobak D, Brendel W, Constantinidis C, Feierstein CE, Kepecs A, Mainen ZF, et al. Demixed principal component analysis of neural population data. eLife. 2016;5:e10989. pmid:27067378
- 21. Bagur S, Averseng M, Elgueda D, David S, Fritz J, Yin P, et al. Go/No-Go task engagement enhances population representation of target stimuli in primary auditory cortex. Nature Communications. 2018;9(1):2529. pmid:29955046
- 22. Shenoy KV, Sahani M, Churchland MM. Cortical control of arm movements: a dynamical systems perspective. Annual Review of Neuroscience. 2013;36(1):337–359. pmid:23725001
- 23. Churchland MM, Cunningham JP, Kaufman M, Ryu SI, Shenoy KV. Cortical preparatory activity: representation of movement or first cog in a dynamical machine? Neuron. 2010;68(3):387–400. pmid:21040842
- 24. Churchland MM, Cunningham JP, Kaufman M, Foster JD, Nuyujukian P, Ryu SI, et al. Neural population dynamics during reaching. Nature. 2012;487:51–56. pmid:22722855
- 25. Michaels JA, Dann B, Scherberger H. Neural population dynamics during reaching are better explained by a dynamical system than representational tuning. PLOS Computational Biology. 2016;12(11):1–22.
- 26. Wang J, Narain D, Hosseini EA, Jazayeri M. Flexible timing by temporal scaling of cortical responses. Nature Neuroscience. 2018;21:102–110. pmid:29203897
- 27. Remington ED, Narain D, Hosseini EA, Jazayeri M. Flexible sensorimotor computations through rapid reconfiguration of cortical dynamics. Neuron. 2018;98(5):1005–1019.e5. pmid:29879384
- 28. Hennequin G, Vogels TP, Gerstner W. Optimal control of transient dynamics in balanced networks supports generation of complex movements. Neuron. 2014;82(6):1394–1406. pmid:24945778
- 29. Carnevale F, de Lafuente V, Romo R, Barak O, Parga N. Dynamic control of response criterion in premotor cortex during perceptual detection under temporal uncertainty. Neuron. 2015;86(4):1067–1077. pmid:25959731
- 30. David S. Neural circuits as computational dynamical systems. Current Opinion in Neurobiology. 2014;25:156–163.
- 31. Ganguli S, Huh D, Sompolinsky H. Memory traces in dynamical systems. Proceedings of the National Academy of Sciences. 2008;105(48):18970–18975.
- 32. Murphy BK, Miller KD. Balanced amplification: a new mechanism of selective amplification of neural activity patterns. Neuron. 2009;61(4):635–648. pmid:19249282
- 33. Goldman MS. Memory without feedback in a neural network. Neuron. 2009;61(4):621–634. pmid:19249281
- 34. Hennequin G, Vogels TP, Gerstner W. Non-normal amplification in random balanced neuronal networks. Physical Review E. 2012;86:011909.
- 35. Ahmadian Y, Fumarola F, Miller KD. Properties of networks with partially structured and partially random connectivity. Physical Review E. 2015;91:012820.
- 36. Dayan P, Abbott LF. Theoretical Neuroscience, Computational and Mathematical Modeling of Neural Systems. The MIT Press; 2005.
- 37. Trefethen LN, Embree M. Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators. Princeton, NJ: Princeton University Press; 2005.
- 38. Trefethen LN, Trefethen AE, Reddy SC, Driscoll TA. Hydrodynamic stability without eigenvalues. Science. 1993;261(5121):578–584. pmid:17758167
- 39. Neubert MG, Caswell H. Alternatives to resilience for measuring the responses of ecological systems to perturbations. Ecology. 1997;78(3):653–665.
- 40. Horn RA, Johnson CR. Matrix Analysis. Cambridge University Press; 2012.
- 41. Girko VL. The circular law. Teoriya Veroyatnostei i ee Primeneniya. 1984;29(4):669–679.
- 42. Wigner EP. Characteristic vectors of bordered matrices with infinite dimensions. Annals of Mathematics. 1955;62(3):548–564.
- 43. Wigner EP. On the Distribution of the Roots of Certain Symmetric Matrices. Annals of Mathematics. 1958;67(2):325–327.
- 44. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences. 1982;79(8):2554–2558.
- 45. Mastrogiuseppe F, Ostojic S. Linking connectivity, dynamics, and computations in low-rank recurrent neural networks. Neuron. 2018;99(3):609–623.e29. pmid:30057201
- 46. White OL, Lee DD, Sompolinsky H. Short-term memory in orthogonal neural networks. Physical Review Letters. 2004;92:148102. pmid:15089576
- 47. Martí D, Brunel N, Ostojic S. Correlations between synapses in pairs of neurons slow down dynamics in randomly connected neural networks. Physical Review E. 2018;97:062314. pmid:30011528
- 48. Bondanelli G, Deneux T, Bathellier B, Ostojic S. Population coding and network dynamics during OFF responses in auditory cortex. bioRxiv. 2019.
- 49. Farrell BF, Ioannou PJ. Generalized Stability Theory. Part I: Autonomous Operators. Journal of the Atmospheric Sciences. 1996;53(14):2025–2040.
- 50. Ozeki H, Finn IM, Schaffer ES, Miller KD, Ferster D. Inhibitory stabilization of the cortical network underlies visual surround suppression. Neuron. 2009;62(4):578–592. pmid:19477158
- 51. Sompolinsky H, Kanter I. Temporal association in asymmetric neural networks. Physical Review Letters. 1986;57:2861–2864. pmid:10033885
- 52. Brunel N. Is cortical connectivity optimized for storing information? Nature Neuroscience. 2016;19:749–755. pmid:27065365
- 53. Strogatz SH. Nonlinear dynamics and chaos. With applications to Physics, Biology, Chemistry, and Engineering. Westview Press; 2015.
- 54. Arnold VI. Ordinary differential equations. The MIT Press; 1973.
- 55. Leonard IE. The matrix exponential. SIAM Review. 1996;38(3):507–512.
- 56. Tao T. Outliers in the spectrum of iid matrices with bounded rank perturbations. Probability Theory and Related Fields. 2013;155(1):231–263.
- 57. Benaych-Georges F, Rao RN. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics. 2011;227(1):494–521.
- 58. Benaych-Georges F, Rao RN. The singular values and vectors of low rank perturbations of large rectangular random matrices. J Multivar Anal. 2012;111:120–135.
- 59. Burda Z, Jarosz A, Livan G, Nowak MA, Swiech A. Eigenvalues and singular values of products of rectangular Gaussian random matrices. Physical Review E. 2010;82:061114.