The authors have declared that no competing interests exist.

Following a stimulus, the neural response typically varies strongly in time and across neurons before settling to a steady state. While classical population coding theory disregards the temporal dimension, recent works have argued that trajectories of transient activity can be particularly informative about stimulus identity and may form the basis of computations through dynamics. Yet the dynamical mechanisms needed to generate a population code based on transient trajectories have not been fully elucidated. Here we examine transient coding in a broad class of high-dimensional linear networks of recurrently connected units. We start by reviewing a well-known result that leads to a distinction between two classes of networks: networks in which all inputs lead to weak, decaying transients, and networks in which specific inputs elicit amplified transient responses and are mapped onto output states during the dynamics. These two classes are simply distinguished based on the spectrum of the symmetric part of the connectivity matrix. For the second class of networks, which is a sub-class of non-normal networks, we provide a procedure to identify transiently amplified inputs and the corresponding readouts. We first apply these results to standard randomly-connected and two-population networks. We then build minimal, low-rank networks that robustly implement trajectories mapping a specific input onto a specific orthogonal output state. Finally, we demonstrate that the capacity of the obtained networks increases proportionally with their size.

Classical theories of sensory coding consider the neural activity following a stimulus as constant in time. Recent works have however suggested that the temporal variations following the appearance and disappearance of a stimulus are strongly informative. Yet their dynamical origin remains poorly understood. Here we show that strong temporal variations in response to a stimulus can be generated by collective interactions within a network of neurons if the connectivity between neurons satisfies a simple mathematical criterion. We moreover determine the relationship between connectivity and the stimuli that are represented in the most informative manner by the variations of activity, and estimate the number of different stimuli a given network can encode using temporal variations of neural activity.

The brain represents sensory stimuli in terms of the collective activity of thousands of neurons. Classical population coding theory describes the relation between stimuli and neural firing in terms of tuning curves, which assign a single number to each neuron in response to a stimulus [

In contrast to this static picture, a number of recent works have argued that the temporal dynamics of population activity may play a key role in neural coding and computations [

To produce useful transient coding, the trajectories of neural activity need to satisfy at least three requirements [

We study linear networks of N recurrently connected units, in which the variable r_{i}(t) represents the deviation of the activity of unit i from its baseline, interpreted as the firing rate of unit i. The entry J_{ij} of the connectivity matrix denotes the effective strength of the connection from neuron j to neuron i. The external input to neuron i is of the form I(t) r_{0,i}, in which the temporal component I(t) is shared across neurons, while the vector r_{0} (normalized to unity) represents the relative amount of input to each neuron.

We focus on the transient autonomous dynamics in the network following a brief input pulse along the direction r_{0}, which is equivalent to setting the initial condition to r(0) = r_{0}. The temporal activity of the network in response to this input can be represented as a trajectory r(t) in the N-dimensional state space. Since the dynamics are stable, at long times the activity decays to the fixed point r_{i} = 0. At intermediate times, depending on the connectivity matrix J and the input direction r_{0}, the trajectory can however exhibit two qualitatively different types of behavior: it can either monotonically decay towards the asymptotic state or transiently move away from it.

Dynamics of a linear recurrent network in response to a short external perturbation along a given input direction r_{0}. The left and right examples correspond to two different connectivity matrices, where the connection strengths are independently drawn from a Gaussian distribution with zero mean and variance equal to g^{2}/N.

The two types of transient trajectories can be distinguished by looking at the Euclidean distance between the activity at time t and the asymptotic fixed point, i.e. the norm ∥r(t)∥ of the population activity vector.

One approach to understanding how the connectivity matrix J determines the type of transient dynamics relies on its eigenvalues {λ_{k}} and eigenvectors. If the eigenvectors of J are mutually orthogonal, then the squared activity norm is a sum of squares of decaying exponentials, and therefore a monotonically decaying function. Connectivity matrices with mutually orthogonal eigenvectors are called normal matrices.

Nonetheless, a non-normal connectivity matrix J is not by itself sufficient to generate amplified transients: additional conditions need to hold on both J and the input direction r_{0} for the transient trajectory to be amplified. In the following, we point out a simple criterion on the connectivity matrix J that guarantees the existence of amplified transients.

To distinguish between monotonic and amplified trajectories, we focus on the rate of change of the activity norm, d∥r∥/dt, at time t = 0, as a function of the initial condition r_{0}. Indeed, the rate of change of the activity norm satisfies (see Methods) d∥r∥/dt = (r^{T} J_{S} r − ∥r∥^{2})/∥r∥, where J_{S} = (J + J^{T})/2 denotes the symmetric part of the connectivity matrix J. For a unit-norm initial condition, the right-hand side is maximized when r_{0} is the eigenvector of J_{S} associated with its largest eigenvalue, λ_{max}(J_{S}), and the corresponding maximal rate of change of the activity norm is therefore λ_{max}(J_{S}) − 1.

Transient amplification therefore requires that the largest eigenvalue of J_{S} be larger than unity, λ_{max}(J_{S}) > 1. Indeed, an initial condition along the eigenvector associated with λ_{max}(J_{S}) leads to a positive rate of change of the activity norm at time t = 0, so that the corresponding trajectory is transiently amplified. Conversely, if the activity norm transiently increases, the largest eigenvalue of J_{S} is necessarily larger than one. If that were not the case, the right-hand side of the equation for the norm would take negative values for all vectors r, and the norm would decay monotonically for any initial condition.

The criterion based on the symmetric part of the connectivity matrix allows us to distinguish two classes of connectivity matrices: if λ_{max}(J_{S}) < 1, all external inputs r_{0} lead to monotonically decaying trajectories (non-amplifying connectivity); if λ_{max}(J_{S}) > 1, specific input directions lead to a non-monotonic, amplified activity norm (amplifying connectivity). The key point here is that for a non-normal connectivity matrix J, the spectrum of J_{S} is in general different from the spectrum of J. Stability of the dynamics and amplification (λ_{max}(J_{S}) > 1) are therefore not mutually exclusive. This is instead the case for normal networks, which include symmetric, anti-symmetric and orthogonal connectivity matrices, as well as trivial one-dimensional dynamics.
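This criterion is straightforward to check numerically. The following sketch (illustrative values, not taken from the paper) builds a stable but non-normal 2 × 2 matrix whose symmetric part has an eigenvalue above one, and verifies by forward integration of dr/dt = −r + J r that the activity norm transiently grows before decaying:

```python
import numpy as np

# Hypothetical example: a purely feedforward (hence non-normal) matrix that is
# stable (all eigenvalues of J below 1) yet amplifying (lam_max(J_S) > 1).
J = np.array([[0.0, 5.0],
              [0.0, 0.0]])
JS = (J + J.T) / 2
assert np.max(np.linalg.eigvals(J).real) < 1     # stable dynamics
lam_max = np.max(np.linalg.eigvalsh(JS))         # 2.5 > 1: amplifying

# Integrate dr/dt = -r + J r from the most amplified initial condition,
# the unit eigenvector of J_S associated with lam_max.
w, V = np.linalg.eigh(JS)
r = V[:, np.argmax(w)]
dt, norms = 0.01, []
for _ in range(1000):                            # integrate up to t = 10
    norms.append(np.linalg.norm(r))
    r = r + dt * (-r + J @ r)
peak = max(norms)                                # transiently exceeds 1
```

The initial condition is chosen along the leading eigenvector of J_{S}, which by the argument above yields the largest initial growth rate λ_{max}(J_{S}) − 1.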

The simplest illustration of this result is a two-population network. In that case the relationship between the eigenvalues of J and those of J_{S} is straightforward: both pairs of eigenvalues are centered around Tr(J)/2, but the maximal eigenvalue of J_{S} is in general larger than the maximal eigenvalue of J. In particular, when the two off-diagonal weights are sufficiently different, J_{S} will have an unstable eigenvalue even if both eigenvalues of J are stable, and the network then transiently amplifies specific inputs r_{0} (see Methods).

Relationship between the eigenvalues of the connectivity J (black dots) and of its symmetric part J_{S} (red dots). Both pairs of eigenvalues are symmetrically centered around Tr(J)/2, but the eigenvalues of J_{S} lie further apart, so that the largest eigenvalue of J_{S} can cross unity if the difference 2Δ between the off-diagonal elements of the connectivity matrix is sufficiently large (bottom panel). Amplification envelope s_{1}(P_{t*}) of the propagator (see Methods) as a function of the connectivity parameters, for a two-population excitatory-inhibitory parametrization of the weights. The grey trace corresponds to the boundary between the monotonic and the amplified parameter regions. The red trace represents the stability boundary, with the unstable region hatched in red. In order to achieve transient amplification the excitatory weight must be sufficiently strong.

A second illustrative example is a network of N randomly connected units, with connection strengths independently drawn from a Gaussian distribution with zero mean and variance g^{2}/N. In this case both the eigenvalues of J and of J_{S} are random, but their distributions are known. The eigenvalues of J are uniformly distributed in the complex plane within a circle of radius g, while the eigenvalues of J_{S} are real and distributed according to the semicircle law with spectral radius g√2. The spectral radius of J_{S} is therefore larger by a factor √2.

Each entry of J is drawn from a Gaussian distribution with zero mean and variance g^{2}/N. The eigenvalues of J_{S} are real-valued, and are distributed in the large-N limit according to the semicircle law, with spectral radius given by g√2. Since the spectral radius of J_{S} is larger than the spectral radius g of J, the largest eigenvalue of J_{S} can be larger than unity (in red) while the network dynamics are stable (g < 1). The largest eigenvalue of J_{S} is shown as a function of the random strength g.
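The factor √2 between the two spectral radii can be checked directly. A sketch with illustrative values N = 1000 and g = 0.8, above the amplification threshold g = 1/√2 but below the stability boundary g = 1:

```python
import numpy as np

rng = np.random.default_rng(0)
N, g = 1000, 0.8                       # illustrative size and coupling strength
J = rng.normal(0.0, g / np.sqrt(N), (N, N))
JS = (J + J.T) / 2

radius_J = np.max(np.abs(np.linalg.eigvals(J)))    # ~ g (circular law)
radius_JS = np.max(np.linalg.eigvalsh(JS))         # ~ g*sqrt(2) (semicircle law)
# The network is stable (radius_J < 1) yet amplifying (radius_JS > 1).
```

The variance of the off-diagonal entries of J_{S} is g^{2}/(2N), which yields the semicircle radius g√2 quoted in the text.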

For a connectivity matrix satisfying the amplification condition λ_{max}(J_{S}) > 1, only specific external inputs r_{0} are amplified by the recurrent circuitry, while others lead to monotonically decaying trajectories. Which inputs are amplified, and onto which patterns of activity are they mapped during the transient dynamics?

Example corresponding to a random connectivity matrix. Activity in response to inputs along the right singular vectors v_{1}, v_{2} and v_{100} of the propagator (shown in different rows), with the same format as in the previous figure.

One approach to these questions is to examine the mapping from inputs to states of activity at a given time t. For a linear system, the activity at time t in response to an initial condition r_{0} is given by the linear mapping r(t) = P_{t} r_{0}, where for any time t the propagator P_{t} = exp(t(J − I)) is an N × N matrix. The singular value decomposition of P_{t} defines a set of singular values together with orthonormal right and left singular vectors, such that P_{t} maps each right singular vector onto the corresponding left singular vector, scaled by the singular value.

If some singular values of P_{t} are larger than one, the corresponding inputs are amplified at time t; the largest singular value of P_{t} determines the maximal possible amplification at time t.

Since the propagator P_{t} depends on time, the singular values and singular vectors trace out trajectories in time. If N_{s} of the singular value trajectories lie above unity, we can indeed identify a set of N_{s} orthogonal, amplified inputs corresponding to the right singular vectors v_{k}, and the corresponding outputs given by the left singular vectors u_{k}. In general the inputs v_{k} and the outputs u_{k} are not identical, so that the dynamics for each amplified input are at least two-dimensional.
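The identification of amplified inputs and their readouts from the propagator SVD can be sketched as follows (an illustrative random network; the probe time `t` is an arbitrary choice, not a value from the paper):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
N, g = 200, 0.9                            # g > 1/sqrt(2): amplifying regime
J = rng.normal(0.0, g / np.sqrt(N), (N, N))

t = 0.5                                    # illustrative probe time
P = expm(t * (J - np.eye(N)))              # propagator of dr/dt = -r + J r
U, s, Vt = np.linalg.svd(P)

n_amp = int(np.sum(s > 1))                 # number of amplified orthogonal inputs
v0 = Vt[0]                                 # most amplified input direction at time t
r_t = P @ v0                               # response at time t
gain = np.linalg.norm(r_t)                 # equals the top singular value s[0]
overlap = abs(U[:, 0] @ r_t) / gain        # response aligns with the left vector
```

By construction P v_{k} = s_{k} u_{k}, so the response to the top right singular vector lands exactly on the top left singular vector, with gain s_{1}.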

How many independent, orthogonal inputs can a network encode with amplified transients? To estimate this number, a central observation is that the slopes of the different singular value trajectories at t = 0 are given by the eigenvalues of J_{S}. This follows from the fact that the singular values of the propagator are the square roots of the eigenvalues of P_{t}^{T} P_{t} ≈ I + 2t(J_{S} − I) at small times. The number of initially growing singular value trajectories is therefore equal to the number of eigenvalues of J_{S} larger than unity. To eliminate the trajectories with small initial slopes, one can further constrain the slopes to be larger than a margin β, i.e. count the number N_{S}(β) of eigenvalues of J_{S} larger than 1 + β. For random networks, N_{S}(β) can be computed from the density of eigenvalues of J_{S}, which follows the semicircle law. The obtained lower bound on the number N_{s} of amplified inputs scales linearly with the network size N.
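A quick numerical illustration of this linear scaling, with hypothetical parameter choices (the count of eigenvalues of J_{S} above 1 + β serves as the lower bound on the number of amplified inputs):

```python
import numpy as np

def n_amplified(N, g, beta=0.0, seed=0):
    """Number of eigenvalues of J_S above 1 + beta for a Gaussian random J."""
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, g / np.sqrt(N), (N, N))
    JS = (J + J.T) / 2
    return int(np.sum(np.linalg.eigvalsh(JS) > 1.0 + beta))

g = 0.9                                    # above the threshold 1/sqrt(2)
counts = {N: n_amplified(N, g) for N in (200, 400, 800)}
# The count grows roughly proportionally to N (doubling N doubles the count).
```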

To summarize, the amplified inputs and the corresponding encoding at peak amplification can be determined directly from the singular value decomposition of the propagator, given by the exponential of the connectivity matrix. For an arbitrary connectivity matrix, this procedure can be carried out numerically.

The approach outlined above holds for any arbitrary connectivity matrix, and allows us to identify the external inputs which are strongly amplified by the recurrent structure, along with the modes that are most activated during the elicited transients and therefore encode the inputs. We now turn to the converse question: how to choose the network connectivity J so that the transient dynamics map a specific input r_{0} into a fixed, arbitrary output.

To address this question, we consider a connectivity structure given by a unit-rank matrix J = Δ u v^{T}, where u and v are unit-norm structure vectors and Δ > 0 sets the overall strength of the connectivity. In this case the largest eigenvalue of J_{S} is given by Δ(1 + u^{T}v)/2, so that transient amplification occurs when Δ(1 + u^{T}v) > 2.

Amplification envelope s_{1}(P_{t*}), the first singular value of the propagator at the time of its maximum, as a function of the scaling parameter Δ. Here we fix the eigenvalue of the connectivity matrix, λ = Δ u^{T}v. An initial condition along v^{(1)} (resp. v^{(2)}) elicits a two-dimensional trajectory which brings the activity along the other structure vector u^{(1)} (resp. u^{(2)}), mapping stimulus v^{(1)} (resp. v^{(2)}) into its transient readout u^{(1)} (resp. u^{(2)}). Blue and red colors correspond to the two stored patterns. Projection of the activity elicited by v^{(1)} (resp. v^{(2)}) on the corresponding readout u^{(1)} (resp. u^{(2)}). The case of unit-rank connectivity (one stored pattern) reduces to the first row of panels (the activity elicited by v^{(2)} is equivalent to the activity on a readout orthogonal to u^{(1)}).

For this unit-rank connectivity matrix, the full propagator P_{t} = exp(t(J − I)) can be computed analytically (see Methods). The maximal amplification, quantified by the first singular value s_{1}(P_{t}) at the time of its maximum, grows with the scaling parameter Δ. Since only one eigenvalue of J_{S} is larger than unity, only one input perturbation is able to generate amplified dynamics. For large values of Δ, this optimal input direction is strongly correlated with the structure vector v, and the evoked trajectory transiently brings the activity along u, effectively mapping the input r_{0} into the output u during the dynamics elicited by r_{0}.

Several orthogonal trajectories can be implemented by adding orthogonal unit-rank components. For instance, taking J = Δ u^{(1)} v^{(1)T} + Δ u^{(2)} v^{(2)T}, where the planes defined by the structure vectors in each term are mutually orthogonal, the input v^{(1)} evokes a trajectory which is confined to the plane defined by v^{(1)} and u^{(1)}, and which maps the input v^{(1)} into the output u^{(1)} at the time of peak amplification. Similarly, the input v^{(2)} is mapped into the output u^{(2)} during the evoked transient dynamics. Therefore, the rank-2 connectivity implements two transient coding channels that map the inputs v^{(1)} and v^{(2)} into the readouts u^{(1)} and u^{(2)}. A natural question is how robust this scheme is and how many patterns can be implemented in a network of fixed size N.
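A minimal simulation of one such transient channel, assuming exactly orthogonal structure vectors. In that case the trajectory evoked by v₁ is available in closed form, r(t) = e^{−t}(v₁ + Δt u₁), peaking on u₁ at t = 1 with amplitude Δ/e, which the integration below reproduces (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, Delta = 400, 6.0
# Four mutually orthonormal vectors (via QR) play the roles of u1, v1, u2, v2
# in the rank-2 connectivity J = Delta*(u1 v1^T + u2 v2^T).
Q, _ = np.linalg.qr(rng.normal(size=(N, 4)))
u1, v1, u2, v2 = Q.T
J = Delta * (np.outer(u1, v1) + np.outer(u2, v2))

# Input along v1: integrate dr/dt = -r + J r up to t = 1 (the peak time).
r, dt = v1.copy(), 0.001
for _ in range(1000):
    r = r + dt * (-r + J @ r)
proj_u1 = u1 @ r          # ~ Delta/e: first channel activated
proj_u2 = u2 @ r          # ~ 0: second channel silent
```

The trajectory stays confined to the v₁–u₁ plane, so the second readout u₂ receives no signal, illustrating the independence of the two channels.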

To investigate the robustness of the transient coding scheme implemented with unit-rank terms, we first examined the effect of additional random components in the connectivity. Adding to each connection a random term of variance g^{2}/N leaves the transient channels essentially intact: the stimulus r_{0} can still be decoded from the activity along the corresponding readout u.

Network with rank-2 structure and an additional random component of variance g^{2}/N. Projection of the activity elicited by the input v^{(k)} along the corresponding readout u^{(k)} (red trace) and along a different arbitrary readout u^{(k′)} (blue trace); the readout u^{(k′)} was changed on every trial. The projection of the activity on u^{(k)} is also shown when only the pattern v^{(k)}-u^{(k)} is encoded. Projections of the activity along the readout u^{(k)} (red dots) and along the readout u^{(k′)} (blue dots) at the peak amplification, and signal-to-noise ratio of the readout at the peak amplification as a function of the network size N.

The robustness of the readouts to random connectivity implies in particular that the unit-rank coding scheme remains functional when an extensive number P of patterns is encoded. When the structure vectors u^{(p)} and v^{(p)} are independently drawn from a random distribution (see Methods), the transient channel for a given pattern p sees the additional P − 1 unit-rank terms as an effective random perturbation with zero mean and variance of order Δ^{2} P/N^{2} (see Methods). The resulting cross-talk contributions to the dynamics elicited by an input v^{(p)} are small for P of order N. Each stimulus, encoded by the input v^{(p)}, can therefore still be decoded from the projection of the activity on the corresponding readout u^{(p)}.

A natural upper bound on the number of trajectories that can be implemented by the connectivity J is set by the requirement that the dynamics remain stable. This bound scales as P_{max} = N/Δ^{2} and defines the capacity of the network. Crucially, the capacity scales linearly with the size of the network N.
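The stability limit can be probed numerically. In this sketch (illustrative values of N and Δ, not from the paper), random unit-rank terms are accumulated and the largest real part of the spectrum of J is monitored; it crosses one around P ≈ N/Δ²:

```python
import numpy as np

rng = np.random.default_rng(3)
N, Delta = 300, 2.5                        # illustrative size and pattern strength

def max_real_eig(P):
    """Largest real part of the spectrum of P superposed random unit-rank terms."""
    J = np.zeros((N, N))
    for _ in range(P):
        u = rng.normal(0.0, 1.0 / np.sqrt(N), N)
        v = rng.normal(0.0, 1.0 / np.sqrt(N), N)
        J += Delta * np.outer(u, v)
    return np.max(np.linalg.eigvals(J).real)

lam_low = max_real_eig(5)       # few patterns: stable (< 1)
lam_high = max_real_eig(200)    # well beyond N/Delta^2 = 48: unstable (> 1)
```

With independent Gaussian structure vectors the spectral radius of the superposition grows as Δ√(P/N), which gives the P_{max} ∝ N scaling of the capacity.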

We examined the conditions under which linear recurrent networks can implement an encoding of stimuli in terms of amplified transient trajectories. The fundamental mechanism underlying amplified transients relies on the non-normal properties of the connectivity matrix, i.e. the fact that the left- and right-eigenvectors of the connectivity matrix are not identical. Our results rest on a simple criterion expressed in terms of the spectrum of the symmetric part J_{S} of the connectivity matrix (rather than on properties of the eigenvectors of the connectivity matrix J itself).

We have shown that the largest eigenvalue of the symmetric part of the connectivity defines the amplification properties of the system asymptotically at small times. The full time course of the amplification is instead captured by the singular values of the propagator P_{t}, which however can be computed analytically only for simple connectivity matrices. To quantify the maximum amplification, other measures have been developed that rely on the so-called pseudospectra.

Applying the criterion for transient amplification to classical randomly connected networks, we found that amplification occurs only in a narrow parameter region close to the instability, where the dynamics substantially slow down as previously shown [

In our framework we modeled the external stimulus as the initial condition of the network, and the amplified dynamics are autonomously generated by the recurrent interactions. Although this might appear an oversimplifying assumption, it has nevertheless proven useful for describing the transient population activity in motor and sensory areas. In motor and pre-motor cortex, the initial condition of the population dynamics during the execution of the movement may be set by the preparatory neural activity that precedes the movement, and may determine to a large extent the time course of the movement-related dynamics.

The study by Murphy and Miller [

Here our aim was to produce amplified, but not necessarily long-lasting transients. The timescale of the transients generated using the unit-rank implementation is in fact determined by the effective timescale of the network, set by the dominant eigenvalue of the connectivity matrix. As shown in previous studies that focused on implementing transient memory traces, longer transients can be obtained with a connectivity of the form Δ u^{(k+1)} u^{(k)T} + … + Δ u^{(3)} u^{(2)T} + Δ u^{(2)} u^{(1)T}, i.e. in which each term feeds into the next one.

The implementation of transient channels proposed here clearly bears a strong analogy with Hopfield networks, in which each pattern is stored by adding a symmetric unit-rank term to the connectivity.

The amplified dynamics map specific external inputs onto orthogonal patterns of activity with transiently larger values of the activity norm ∥r(t)∥.

While we focused here on linear dynamics in the vicinity of a fixed point, strong non-linearities can give rise to different transient phenomena [

We study a recurrent network of N randomly coupled rate units, in which the variable r_{i}(t) represents the difference between the firing rate of neuron i and its baseline level, and J_{ij} is the effective synaptic strength from neuron j to neuron i. In absence of external input, the activity of all units decays to the fixed point r_{i} = 0 for all i. The external input is characterized by its components r_{0,i}, which correspond to the relative activation of each unit. The terms r_{0,i} can be arranged in a vector r_{0}, which we call the external input direction. Here we focus on very short external input durations and on input directions of unit norm (∥r_{0}∥ = 1). This type of input is equivalent to setting the initial condition to r(0) = r_{0}. Since we study a linear system, varying the norm of the input direction would simply result in a linear scaling of the dynamics.

We first outline the standard approach to the dynamics of the linear network defined above, based on the eigenvalue decomposition J = L Λ L^{−1}, where the columns of L are the right eigenvectors associated with the eigenvalues λ_{1}, λ_{2}, …, λ_{N} of the connectivity J, and Λ is the diagonal matrix with the λ_{i} on the diagonal. In this basis the dynamics decouple into N independent modes, each relaxing at a rate set by the corresponding eigenvalue.
The response of the network is fully determined by the propagator P_{t} and the initial condition r_{0}. The matrix P_{t} is called the propagator of the system, and it is defined as the matrix exponential of the full dynamics matrix, P_{t} = exp(t(J − I)). The propagator P_{t} at time t maps the initial condition, r_{0}, to the state r(t) = P_{t} r_{0}.
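The two routes to the network response, direct matrix exponentiation and the eigendecomposition, can be cross-checked numerically (a sketch with arbitrary illustrative parameters):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
N, t = 50, 1.5                               # illustrative size and time
J = rng.normal(0.0, 0.5 / np.sqrt(N), (N, N))

# Route 1: propagator directly as a matrix exponential.
P = expm(t * (J - np.eye(N)))

# Route 2: eigendecomposition J = L diag(lam) L^-1,
# which gives P_t = L diag(exp(t*(lam - 1))) L^-1.
lam, L = np.linalg.eig(J)
P_eig = (L * np.exp(t * (lam - 1.0))) @ np.linalg.inv(L)

err = np.max(np.abs(P - P_eig.real))         # the two routes agree
```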

To study the amplification properties of the network, we introduce J_{S} = (J + J^{T})/2, the symmetric part of the connectivity matrix J.

Both the eigenvalues and the eigenvectors of J_{S} provide information on the transient dynamics of the system. On one hand, we show in the main text that the activity norm can have non-monotonic behaviour if and only if at least one eigenvalue of the matrix J_{S} is larger than one. Therefore the eigenvalues of J_{S} determine the type of transient regime of the system. On the other hand, as J_{S} is symmetric, its set of eigenvectors is orthogonal and provides a useful orthonormal basis onto which we can project the dynamics. In this basis, the connectivity matrix is given by V_{S}^{T} J V_{S}, where V_{S} contains the eigenvectors of J_{S} as columns. The matrix J can be written as J = J_{S} + J_{A}, where J_{A} = (J − J^{T})/2 is the anti-symmetric part of J. In the new basis, the symmetric part contributes the eigenvalues of J_{S} on the diagonal. The off-diagonal terms stem from the anti-symmetric part and couple the different modes of J_{S}. In the amplified regime, some of the eigenvalues of J_{S} are larger than one, so that without the coupling between the modes of J_{S} the connectivity would generate unstable dynamics.

To identify which inputs are amplified, we examine the dynamics of the activity norm ∥r(t)∥ elicited by an initial condition along an arbitrary direction r_{0}.

The dynamics elicited in response to an input along an arbitrary direction are in general complex. However, the singular value decomposition (SVD) of the propagator provides a useful way to understand the network dynamics during the transient phase. Any matrix can be written as U S V^{T}, where S is the diagonal matrix of singular values s_{i}(t), and the columns of U (resp. V) are the orthonormal left (resp. right) singular vectors. The propagator P_{t} maps each right singular vector onto the corresponding left singular vector, scaled by the corresponding singular value.

The number of amplified inputs at time t is given by the number of singular values of P_{t} larger than unity. In the amplified regime (λ_{max}(J_{S}) > 1), at least one of the SV trajectories has non-monotonic dynamics, starting from unity at t = 0. This can be seen by expanding P_{δt} at small times as P_{δt} ≈ I + δt(J − I), so that P_{δt}^{T} P_{δt} ≈ I + 2δt(J_{S} − I). The initial slope of the i-th singular value trajectory is therefore λ_{i}(J_{S}) − 1, and the number of initially growing singular values equals the number of eigenvalues of J_{S} larger than unity, which we denote as N_{S}.

From

Interestingly, it can be shown that at small times the most amplified input coincides with the eigenvector associated with the largest eigenvalue of J_{S}. We will exploit this fact to identify the amplified initial conditions.

Summarizing, our approach for characterizing the transient dynamics of a given network can be divided into three main steps:

Compute the symmetric part J_{S}, along with its eigenvalues and eigenvectors.

Compute the propagator of the system, P_{t} = exp(t(J − I)).

Compute the Singular Value Decomposition (SVD) of the propagator.

These three steps can be in principle performed numerically for any connectivity matrix. For particular classes of connectivity matrices, we show below that some or all three steps are analytically tractable.
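A compact numerical version of the three-step recipe might look as follows (a sketch; the 2 × 2 example matrix is an illustrative choice, not taken from the paper):

```python
import numpy as np
from scipy.linalg import expm

def transient_analysis(J, t):
    """Three-step recipe: symmetric part, propagator at time t, SVD."""
    N = J.shape[0]
    JS = (J + J.T) / 2
    eig_S = np.linalg.eigvalsh(JS)             # step 1: spectrum of J_S
    P = expm(t * (J - np.eye(N)))              # step 2: propagator
    U, s, Vt = np.linalg.svd(P)                # step 3: SVD of the propagator
    return eig_S, P, (U, s, Vt)

# Illustrative 2x2 non-normal connectivity.
J = np.array([[0.0, 4.0],
              [0.0, 0.0]])
eig_S, P, (U, s, Vt) = transient_analysis(J, t=1.0)
# eig_S.max() = 2 > 1, so the network is amplifying; since J is nilpotent,
# P = (I + J)/e and the top singular value is s[0] = (2 + sqrt(5))/e > 1.
```

Amplified inputs are the right singular vectors (rows of `Vt`) whose singular value exceeds one; the corresponding readouts are the associated columns of `U`.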

Here we consider a non-normal random connectivity matrix with synaptic strengths independently drawn from a Gaussian distribution with zero mean and variance g^{2}/N.

For this class of matrices, we can analytically determine the condition for amplified transients, and estimate the number of amplified inputs. In the stable regime (g < 1), the symmetric part J_{S} can nonetheless have unstable eigenvalues. In fact, the off-diagonal elements of the symmetric part are distributed with zero mean and variance g^{2}/(2N), so that in the large-N limit the spectral radius of J_{S} is g√2. Therefore J_{S} has unstable eigenvalues if g > 1/√2.

To estimate the number of amplified initial conditions, we compute a lower bound on their number, given by the number N_{S}(β) of eigenvalues of J_{S} larger than 1 + β, where β > 0 imposes a minimal initial slope on the corresponding singular value trajectories. Using the semicircle density of the eigenvalues of J_{S}, the fraction of eigenvalues above 1 + β is independent of N, so that N_{S}(β) scales linearly with N and is maximal when g approaches the instability at g = 1.

Computing the SVD of the exponential of a

In this section we consider connectivity matrices describing networks composed of two interacting units of the form
When the eigenvalues λ^{±} of J are real, they are symmetrically centered around Tr(J)/2.

The condition for transient amplification is determined by the two eigenvalues of J_{S}. The eigenvalues of J_{S} are centered around the same value Tr(J)/2, but with a larger term under the square root. Under the assumption of a stable connectivity J, when the parameter Δ exceeds a critical value Δ_{c}, the largest eigenvalue of J_{S} is larger than one, meaning that specific inputs are transiently amplified. Note that for a stable J the critical value Δ_{c} is real. Thus, Δ is the crucial parameter which determines the dynamical regime of the system.
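For a concrete two-unit example (hypothetical weights chosen for illustration), one can verify directly that the dynamics are stable while the symmetric part is unstable:

```python
import numpy as np

# Hypothetical two-population motif: an excitatory unit (self-coupling +1)
# receives inhibition of weight 5 from an inhibitory unit it drives with weight 1.
J = np.array([[1.0, -5.0],
              [1.0, -1.0]])
JS = (J + J.T) / 2

eig_J = np.linalg.eigvals(J)       # +-2i: real parts below 1, stable dynamics
eig_JS = np.linalg.eigvalsh(JS)    # +-sqrt(5): largest exceeds 1, amplifying
```

The large difference between the two off-diagonal weights (here −5 versus +1) is what pushes the eigenvalues of J_{S} apart while leaving those of J stable.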

To identify the parameters which determine the maximum amplification of a system, we project the network dynamics onto the orthonormal basis of eigenvectors of _{S}. In the new basis the connectivity matrix is given by _{A}, so that we obtain
_{S}. For Δ > Δ_{c} we have _{S} is amplified by an amount proportional to _{S}, the system reaches a finite amount of amplification and relaxes back to the zero fixed point. In the following we examine how the value of Δ determines the amount of amplification of the system.

To examine the dependence of the maximum amplification of the system on the parameter Δ we compute the propagator _{t} and its SVD. A convenient method to compute the exponential of a matrix is provided in [_{0}(_{1}(^{+} and λ^{−} are the eigenvalues of

In order to compute the maximum amplification of the system we next compute the largest singular value of the propagator _{1}(_{t}) (see

Here we compute the maximal amount of amplification by evaluating the maximum value in time of the amplification envelope _{1}(_{t}) (

To derive this relationship, we note that the combination _{c}, we have that _{1}(_{t}) and the value _{1}(_{t*}). The final result is given by

The two-dimensional model given by ^{±} and the corresponding timescales _{c}, and for fixed λ^{±}, the maximum amplification of the system scales linearly with the non-normal parameter Δ.

Here we compute the optimal input direction for the two-population network, i.e. the eigenvector associated with the largest eigenvalue of J_{S}.

In this section we consider a unit-rank connectivity matrix defined by
_{1}, _{2} and

We first compute the eigenvalues and eigenvectors of the symmetric part of the connectivity, J_{S} = Δ(u v^{T} + v u^{T})/2.
J_{S} is a rank-2 matrix, meaning it has in general two non-zero eigenvalues. These can be computed from J_{S} restricted to the plane spanned by u and v, i.e. from the 2 × 2 matrix [u, u_{⊥}]^{T} J_{S} [u, u_{⊥}], where u_{⊥} is a unit vector in that plane perpendicular to u (the remainder of the spectrum of J_{S} is zero). We find that the two non-zero eigenvalues of the symmetric part J_{S} are given by Δ(u^{T}v ± 1)/2. The eigenvalues of J_{S} are symmetrically centered around λ/2, where λ = Δ u^{T}v is the non-zero eigenvalue of J, and their displacement ±Δ/2 is controlled by the scaling parameter Δ. The condition for the system to be in the regime of transient amplification is therefore Δ(1 + u^{T}v) > 2.

To compute the eigenvectors _{S} are thus given by

We can project the dynamics of the system on the basis of eigenvectors of _{S}. Let _{S} be the _{S} as columns:
_{i}’s are _{S} yields the new connectivity _{S} through the term

We explicitly compute the expression of the propagator for the unit-rank system. From the definition of the matrix exponential in terms of an infinite sum of matrix powers, and using J^{n} = λ^{n−1} J with λ = Δ v^{T}u, we obtain
P_{t} = e^{−t}(I + Δ (e^{λt} − 1)/λ · u v^{T}). Outside the plane spanned by u and v the activity simply decays as e^{−t}, as any component orthogonal to v is mapped to zero by the unit-rank connectivity.
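The closed-form propagator can be checked against a direct matrix exponential (a sketch; the structure vectors here are random unit vectors, so u and v are not orthogonal and λ = Δ v^{T}u is nonzero):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
N, Delta = 100, 4.0
u = rng.normal(size=N); u /= np.linalg.norm(u)
v = rng.normal(size=N); v /= np.linalg.norm(v)
J = Delta * np.outer(u, v)                 # unit-rank connectivity
lam = Delta * (v @ u)                      # its unique non-zero eigenvalue

t = 0.7
P_num = expm(t * (J - np.eye(N)))
# Closed form: J^n = lam^(n-1) J implies exp(tJ) = I + (e^(lam*t) - 1)/lam * J.
P_ana = np.exp(-t) * (np.eye(N) + (np.expm1(lam * t) / lam) * J)
err = np.max(np.abs(P_num - P_ana))        # the two expressions coincide
```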

To study how the maximum amplification depends on Δ we compute the amplification envelope s_{1}(P_{t}). The singular values of the propagator P_{t} are given by the square roots of the eigenvalues of the matrix P_{t}^{T} P_{t}. We compute the two non-trivial singular values s_{1,2}(P_{t}) as a function of Δ and λ; the remaining N − 2 singular values of P_{t} are equal to e^{−t}.

For the unit-rank system, two parameters out of Δ, λ and

keep the eigenvalue λ constant, so as to fix the timescale

Fix the correlation between the structure vectors,

Keep

The singular values of the propagator given by _{1}(_{t}). In the strong amplification regime (

for λ fixed and

For

For

In the case

The general observation that the largest eigenvalue of the symmetric part of the connectivity determines transient amplification holds only at small times: λ_{max}(J_{S}) is in general inadequate to characterize the maximum amplification of the system, for which other measures may be considered (see Discussion).

Using the result we found for the two dimensional case, Eqs _{S} as

For fixed λ and ^{−1}. In the strong amplification regime the second term on the right hand side is much smaller than unity, so that we can compute ^{−1}. Denoting by

In the strong amplification regime the optimal initial condition is thus strongly aligned with

For fixed

For fixed

In conclusion we find that, in the strong amplification regime, the optimal input has a strong component on the structure vector

In this section we study the dynamics of the system in presence of noise in the synaptic connectivity. We consider the connectivity matrix given by _{i} _{j}. The resulting connectivity matrix can be written as the sum of a structured unit-rank part and a Gaussian random matrix of the form [_{x}∥Δ^{T}

To draw the phase diagram of the system, we compute the eigenvalues of the symmetric part of _{S} denotes the symmetric part of _{S} are distributed according to
_{S} as a function of _{S} is given by
_{S}. We distinguish two cases:

if

If ^{2}/(λ + Δ) > 1 is always satisfied for λ + Δ > 0, thus holding also for _{max}(_{S}) is larger than one independently of the values of λ and Δ.

In the case ^{2}, which leads to a correction of order ^{2} to the condition for the amplified regime in absence of noise (see

Here we examine the magnitude of the fluctuations around the mean activity introduced by the random term in the connectivity given by ^{2} and autocorrelation function 〈_{i}(_{j}(_{ij} _{i}(_{i}(_{i}(_{i}(_{i}(_{i}(

The time-dependent correlation matrix ^{T}〉 can be written as the sum of two terms, corresponding to the contributions of the noise in the connectivity (with variance ^{2}) and the noise in the input (with variance ^{2}):
_{kl} _{mn}〉 = _{km} _{ln}/

We start by computing the first term in ^{g} onto _{u} and at the peak of the transient phase (

Computing the variance of the activity along the readout _{0}(_{0} becomes smaller than unity is given by _{c,0} = 1. As a result, the maximum gain in _{0} for non amplified dynamics (Δ < 2), transient amplification is needed to keep a stable

In this section we examine the robustness of the transient readouts when P patterns are stored, i.e. when the entries of the structure vectors u^{(p)} and v^{(p)} are randomly and independently distributed with zero mean and variance equal to 1/N.

The connectivity matrix in this case is given by
^{(1)} and ^{(1)} evokes a two-dimensional trajectory which remains confined on the same plane. The same holds for the dynamics on the plane defined by ^{(2)} and ^{(2)}.

When the number of encoded trajectories ^{(p)} and ^{(1)}. However, the results for the case of one stored pattern with connectivity noise can be applied to this case if we write the connectivity matrix in ^{(i)} and ^{(i)} are uncorrelated with each other, so that we can consider the second term on the right hand side of ^{(1)}^{(1)T}, with mean zero and variance Δ^{2} ^{2}. In fact, the mean and the variance of the effective noise are given respectively by
^{(1)} corresponding to the stimulus ^{(1)}. Since the number of encoded patterns

However, when a number _{max} that can be stored in the connectivity before the system becomes unstable. This number defines the capacity of the system and is given by
^{(p)} and ^{(p)} are exactly orthogonal to each other for all _{max} =


Signal-to-noise ratio of the readout as a function of the standard deviation of the input noise


We are grateful to Francesca Mastrogiuseppe and Manuel Beiran for discussions and feedback on the manuscript.

Dear Dr Bondanelli,

Thank you very much for submitting your manuscript, 'Coding with transient trajectories in recurrent neural networks', to PLOS Computational Biology. As with all papers submitted to the journal, yours was fully evaluated by the PLOS Computational Biology editorial team, and in this case, by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We would therefore like to ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer, and we encourage you to respond to the particular issues raised. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.

(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible, to show clearly where changes have been made to their manuscript, e.g. by highlighting text.

(3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology:

- Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).

- Supporting Information uploaded as separate files, titled 'Dataset', 'Figure', 'Table', 'Text', 'Protocol', 'Audio', or 'Video'.

- Funding information in the 'Financial Disclosure' box in the online system.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at

If you have any questions or concerns while you make these revisions, please let us know.

Sincerely,

Kenneth D. Miller

Guest Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact

[LINK]

The reviewers agree that this paper is a good and solid contribution but also raise many smaller issues that need to be addressed. Their reviews are clear; I would add just two very tiny comments to the reviewer comments. Reviewer 2, comment 1: if there are degenerate eigenvalues, then either the corresponding subspace must be missing an eigenvector, or that subspace of eigenvectors is normal (an orthonormal basis of eigenvectors can be chosen for that subspace). Reviewer 3, comment 3: more generally, for a normal matrix the singular values are the absolute values (or moduli, for complex eigenvalues) of the eigenvalues; singular values are nonnegative and real, while eigenvalues can be negative or complex.

Reviewer's Responses to Questions

Reviewer #1: This paper provides a number of useful analytical results characterizing the strength and structure

of non-normal transient amplification, which is of relevance to the study of dynamics in many biological networks,

including recurrent neural networks.

In the theoretical neuroscience literature, transient amplification has been previously proposed as a mechanism

underlying fast spontaneous cortical fluctuations and selective amplification of neural patterns, as well as in

networks with extensive working memory.

While non-normality (non-orthogonality of eigenvectors) is necessary for transient amplification in linear networks,

it is clear that any slight deviation from normality is not sufficient for creating amplification.

The authors therefore present sufficient conditions for the latter (criteria in terms of eigenvalues of the symmetrized

connectivity matrix, which have been known previously, at least in the math literature on transient amplification),

which they also specialize to classes of network structures that have been of interest in theoretical neuroscience.

They then provide a simple method for determining the spatial (i.e. in terms of neural population pattern) and temporal structure of transient amplification using a singular value decomposition of the linear network’s propagator.

They use these tools to study transient amplification in effectively low dimensional (i.e. low-rank) as well as high dimensional random systems, and to study the robustness of selective amplification in the presence of random structural and dynamic perturbations. Finally, they provide a proposal for constructing systems that can selectively (and transiently) amplify multiple specific patterns, and they quantify the corresponding capacity of such linear networks, i.e. how many on-average orthogonal patterns can be “learned” (i.e. included in the connectivity matrix) without disrupting the desired selective amplification.
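As a side note for readers, the propagator-SVD recipe summarised above is straightforward to sketch numerically. The following illustration is our own (not code from the manuscript), assuming leaky linear dynamics dr/dt = (-I + W) r with a toy feedforward connectivity W: the top right singular vector of the propagator at time t is the most amplified input pattern, and the top left singular vector is the output pattern it is mapped onto.

```python
import numpy as np

def propagator(A, t):
    # e^{tA} via eigendecomposition (A assumed diagonalizable; true here,
    # since A has distinct eigenvalues)
    vals, vecs = np.linalg.eig(A)
    return (vecs @ np.diag(np.exp(t * vals)) @ np.linalg.inv(vecs)).real

# Toy stable, non-normal network: leak plus one strong feedforward weight.
W = np.array([[0.0, 0.0],
              [8.0, -1.0]])
A = W - np.eye(2)          # dr/dt = A r, eigenvalues -1 and -2 (stable)

t = 1.0
P = propagator(A, t)
U, s, Vt = np.linalg.svd(P)

amplified_input = Vt[0]    # right singular vector: most amplified input pattern
output_pattern = U[:, 0]   # left singular vector: pattern it is mapped onto
print(s[0])                # top singular value > 1: transient amplification at time t
```

Scanning t and taking the maximum of the top singular value gives the peak amplification, and the corresponding right singular vector gives the optimal initial condition.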

This is a nice paper with several technical results that could be quite useful to the theoretical/computational neuroscience community. Even though many of these results are easy to derive, and a few have already been published in the math literature on transient amplification, as noted by the authors, I still believe this is a useful paper for the community, in that it presents all such old and new results together, specializes them to some network structures of interest in neuroscience, and also makes proposals about their possible computational utility.

The paper is also well-written, and discussions and expositions were clear and to-the-point. So I recommend its publication with minor corrections that I list below.

Comments:

1. Lines 93-94 (and also line 162): The language needs to be refined/made more precise. It is, strictly speaking, meaningless to talk about identifying “different inputs giving rise to amplified trajectories” or estimating “their numbers”, or asking “… how many inputs are amplified?”, since the input space (and its subspace with those properties) is a continuum, or more precisely a vector space with infinitely many members. You should rather say “identify the input subspace giving rise to amplified trajectories” and “estimate the dimensionality of this subspace”, and/or ask “how many orthogonal inputs [or input patterns] are amplified?”.

2. Lines 126-127: you say “… except in the case of one-dimensional dynamics or symmetric connectivity matrices.” The exception is not limited to symmetric matrices and should cover all normal J’s, so this phrase should be corrected. (E.g. consider anti-symmetric J’s, which have imaginary eigenvalues: in that case the symmetric part is 0 and all its eigenvalues are 0, so the second condition is 0 > 1, while, because the eigenvalues of J are imaginary, the first condition is 0 < 1, and the two are inconsistent.)

3. Line 210: you end the sentence with “… when the connectivity J is Gaussian.”, but the correct statement is “… when the connectivity matrix J is a random matrix with independent and identically distributed elements.” Gaussianity is not key (due to universality theorems for the circular and semicircular laws), but independence is: conversely, a (symmetric) Gaussian matrix with nontrivial covariances across elements need not give rise to the circular (semicircular) law.
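The universality point made here is easy to check numerically. The small sketch below is our own illustration (with arbitrary size and seed): an iid Rademacher matrix and an iid Gaussian matrix with matched variance both have spectral radius close to 1, as the circular law predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

# iid non-Gaussian entries (Rademacher, i.e. +/-1) with variance 1/N:
# by universality, the eigenvalues still fill the unit disk.
J_rad = rng.choice([-1.0, 1.0], size=(N, N)) / np.sqrt(N)
# iid Gaussian entries with the same variance, for comparison.
J_gau = rng.standard_normal((N, N)) / np.sqrt(N)

r_rad = np.abs(np.linalg.eigvals(J_rad)).max()
r_gau = np.abs(np.linalg.eigvals(J_gau)).max()
print(r_rad, r_gau)   # both spectral radii are close to 1
```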

4. Caption of Fig. 4: Change the caption for panel C to "C. Illustration of the dynamics elicited by the three inputs, R_1, R_2 and R_100 (shown in different rows), as in B."

5. Lines 228 and 231: you need to define what you mean by minimal connectivity. It seems that you mean a matrix with minimal rank. So try to justify why that notion of minimality is relevant to neuroscience.

6. Lines 280-292: the writing has to be made accurate regarding the “orthogonality” of different patterns. If the u_p/v_p vector pairs for different p’s are literally mutually orthogonal (as the current text suggests), then there will be exactly zero interference between different rank-one terms (i.e. different patterns), and fluctuations of the activity along a readout u_p caused by other patterns will be exactly 0, and not just O(\\sqrt{P}/N). In this case the “capacity” would be N/2. But reading the Methods section it becomes clear that the authors are implicitly talking about a connectivity of the rank-P form with the different u’s and v’s sampled from a random ensemble of vectors with independently distributed components, such that different vectors are only approximately orthogonal with high probability (when N is large), and not literally orthogonal as the main text in lines 280-290 suggests. So this should be corrected.

7. Lines 27-28: replace “… but sufficient conditions for such amplification were not given.” with “… but general sufficient conditions for such amplification were not given.”

8. Line 53: I would add the following at the end of this sentence (first sentence of “Monotonic vs Amplified Transient Trajectories”): “… in the autonomous network with I(t) = 0."

9. Lines 740, 749 and 763: fix the broken (appearing as ?) references.

10. The brackets (signifying averaging) around dr_i on the left hand side should be removed.

11. In equation 80 (lines 782-783): the sum over the index l is only needed for the first term and the second term (the one involving \\eta_k) should not be summed over l.

12. Line 817: “randomly distributed” should be replaced with “randomly and independently distributed”.

13. Line 343: “straightforward” doesn’t have a hyphen.

14. Figure 6: In panel E, the maximum readout value for P/N ~ 0.02 seems to be significantly higher than the maximum of the red curve in panel D. Why?

15. Figure 2: in panel B, it would be helpful to also include a line for \\tau>1, corresponding to max Re \\lambda(J) >0.

16. Figure 2: panel C would be more informative if it were turned into a heat-map of \\sigma_1^*, with the two phases (monotonic and amplified) and their boundary indicated/drawn on top of the heat-map.

17. In caption of Fig. 4, panel B: replace “… at time t_* in pannel [sic.] A” with “… at time t_* indicated by the dashed vertical line in panel A”.

18. Fig 4 C, last row, middle column: unlike in the top and middle row of panel C, the labels L_100^* and R_100^* are missing here.

19. Line 544: replace "symmetrically arranged along the imaginary dimension.” with "symmetrically arranged on either side of the real axis.”


Reviewer #2: The authors provide a clear and elegant explanation of the behavior of linear networks of neurons in terms of transient amplification. The paper establishes the basic tools needed to understand transient amplification (eigenvalues of the symmetric part of the connectivity matrix, SVD of the propagator…) and analyses in detail random and low-rank connectivity matrices, together allowing the reader to develop an intuitive understanding of the phenomenon. I also liked very much the structuring of the paper into 3 parts: simple take-home messages, methods, and appendices. I congratulate the authors. The study will be very useful for teaching.

I only have a couple of small comments:

1) I had the intuition that strong amplification required not only non-normality (lack of orthogonality of the eigenvectors of J), but also a **difference** between their corresponding eigenvalues. In this case, initial conditions loading on several non-orthogonal eigenvectors (i.e., those involving ‘cancellations’) align initially with the slowest decaying modes (this is the period of growth), and eventually decay. Is this intuition incorrect in general, or is this picture somehow hidden in the properties of J_s?

2) Transient amplification (TA) is usually studied in connection to specific initial conditions from which amplification happens. The ‘transient’ is then purely generated by the recurrent connectivity. From the point of view of neuroscience, I think it would be interesting to include some comments in the discussion about the kinds of experimental or real-life situations where this may be expected to occur or not. For instance, TA has been related to preparatory states before movement. How about onset sensory responses in cortex? How about time-varying inputs (which actually are the ethological norm)? The discussion on this would benefit from the recognition that, although linear models provide interesting approximations, the real networks have significant non-linearities, even at the firing rate level… (I say this because if everything is purely linear, there is perfect superposition in time and time-varying input is not particularly interesting...)

Reviewer #3: # Summary and main comments

This paper discusses non-normal mechanisms for transient amplification in linear recurrent networks. The authors study the distinction between two classes of networks: those that transiently amplify their inputs, and those that do not. They provide a necessary and sufficient condition on the connectivity matrix for a network to belong to the latter, and go on to explain i) how to find those inputs that are transiently amplified and ii) how to construct networks that can exploit such transient amplification effects to perform "coding".

I am very strongly supportive of publication, as this is both a timely and very well executed paper: easy to read and well illustrated. I only have a couple of suggestions:

1. Perhaps my main comment concerns the old debate on matrix non-normality, and how it is inherently difficult to characterise a concept that has to do with the geometry of a matrix's eigenbasis using a single scalar number (here, the numerical abscissa). In this paper, the distinction is made between networks in which the activity vector can grow transiently in response to an appropriately chosen input, and those that simply cannot, regardless of the input. The authors derive a diagnostic criterion based on the numerical abscissa; while this is entirely valid (given the definition of "amplifying vs non-amplifying" networks outlined above), it is nevertheless an asymptotic criterion which only concerns the slope of the norm $\\| r(t) \\|$ of the response at small times ($t=0^+$), and therefore _in principle_ provides little insight into how much $\\| r(t) \\|$ will grow and for how long. To be fair, the authors go on to discuss an analysis of the induced 2-norm of the propagator $e^{tA}$, which gives the full description of transients but, as they acknowledge, whose SVD can only be computed numerically in most cases. There is a whole chapter in Trefethen & Embree ("Spectra and pseudospectra"; chap. 4 on transients) which discusses alternative summary scalar measures that are much better indicators of expected transient magnitude/duration (i.e. they provide useful bounds on transient size and timing). The pseudospectrum and the associated Kreiss theorems should probably be mentioned.

So I suggest the authors discuss these issues briefly in the intro/discussion. Perhaps a useful place to start is the 2x2 example they have already studied in depth, for which it is not too difficult to find corner cases where the max amplification is bounded whereas the numerical abscissa isn't. E.g. the peak amplification for $W = [[1, -a], [1, -a]]$ is upper bounded by $\\sqrt{2}$ (it tends to $\\sqrt{2}$ as $a \\to \\infty$), yet the numerical abscissa (the spectral abscissa of $W + W^T - 2I$) grows linearly in $a$, as $\\approx (\\sqrt{2}-1)a$ (and the peak of amplification occurs earlier and earlier, too).
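This corner case is easy to verify numerically. The sketch below is our own reconstruction, assuming the dynamics dr/dt = (W - I) r with W = [[1, -a], [1, -a]]: the numerical abscissa of W - I grows with a, while the peak amplification max_t ||e^{t(W-I)}|| stays below sqrt(2).

```python
import numpy as np

def expm_t(A, t):
    # e^{tA} via eigendecomposition (valid here: A has distinct eigenvalues -1, -a)
    vals, vecs = np.linalg.eig(A)
    return (vecs @ np.diag(np.exp(t * vals)) @ np.linalg.inv(vecs)).real

a = 10.0
W = np.array([[1.0, -a], [1.0, -a]])
A = W - np.eye(2)                     # stable: eigenvalues -1 and -a

# Numerical abscissa: max eigenvalue of the symmetric part of A.  Grows with a.
omega = np.linalg.eigvalsh((A + A.T) / 2).max()

# Peak amplification over time: bounded, and below sqrt(2) for this family.
ts = np.linspace(0.0, 10.0, 2001)
peak = max(np.linalg.norm(expm_t(A, t), 2) for t in ts)
print(omega, peak)   # omega well above 0, yet peak < sqrt(2)
```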

2. Going back to the "coding" viewpoint adopted in the introduction: for the type of deterministic Markovian dynamics considered here, future states are entirely determined by the current one; the whole trajectory is entirely determined by the initial condition. So for deterministic dynamics there can be no advantage to eliciting fancy transients: there is just as much information about the stimulus at the peak of the transient as there was at $t=0$. The advantage of amplification for coding becomes apparent when noise is taken into consideration, but then it is important to distinguish between process noise (which tends to get amplified together with the signal) and observation noise (which doesn't). Non-normal transients don't help much with process noise (though that depends on its geometry), but can hugely increase the mutual information between stimulus and r(t) in cases where output (observation) noise is strong; having the signal rise above the noise is a big win. I would recommend discussing this briefly, too.

# Minor comments:

1. line 30:

> This results leads to a simple distinction between two classes of networks: networks in which all inputs lead to weak, decaying transients, and networks in which specific inputs elicit strongly amplified transient responses.

Cf main comment (1) above: this is a little too strong, as the class of networks of the latter type also contains networks that amplify very weakly.

2. line 71:

> necessarily implies that the firing rate of at least one neuron shows a transient increase before decaying to baseline.

increase, or decrease (but can always be turned into an increase by reversing the sign of the initial condition of course)

3. line 177:

> Note that for a normal matrix, the left and right singular vectors $R(t)_k$ and $L(t)_k$ are identical, and the singular values are equal to the eigenvalues [...]

corner case: e.g. a skew-symmetric (hence normal) matrix W has complex eigenvalues, whereas its singular values are real

4. regarding the margin:

> To eliminate the trajectories with very short amplification, one can further constrain the slopes to be larger than a margin $\\epsilon$

again, this asymptotic result concerns the slope at time t=0; even with a margin, this seems to give no formal guarantee on the amount of amplification that can follow

5. line 228:

> Specifically, we wish to determine the minimal connectivity that transiently transforms a fixed, arbitrary input $r_0$ into a fixed, arbitrary output $w$, through two-dimensional dynamics.

Considerations of time unclear here → transform $r_0$ into $w$ _at some point_ during the transient?

6. reference missing on line 405

7. reference to figure missing on line 808.

Best wishes,

Guillaume Hennequin

**********

Large-scale datasets should be made available via a public repository as described in the

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Guillaume Hennequin


Dear Dr Bondanelli,

Thank you very much for submitting your manuscript, 'Coding with transient trajectories in recurrent neural networks', to PLOS Computational Biology. As with all papers submitted to the journal, yours was fully evaluated by the PLOS Computational Biology editorial team, and in this case, by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We would therefore like to ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer, and we encourage you to respond to the particular issues raised. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.

(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible, to show clearly where changes have been made to their manuscript, e.g. by highlighting text.

(3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology:

- Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).

- Supporting Information uploaded as separate files, titled 'Dataset', 'Figure', 'Table', 'Text', 'Protocol', 'Audio', or 'Video'.

- Funding information in the 'Financial Disclosure' box in the online system.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at

If you have any questions or concerns while you make these revisions, please let us know.

Sincerely,

Kenneth D. Miller

Guest Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology


(From K Miller, guest editor)

The reviews supported publication of the paper and the authors have responded well to all the review comments. The paper is almost ready for publication. However in scanning through the paper looking at the corrections I noticed a number of very small points that should be revised in a final version:

Abstract: "and networks in which specific inputs elicit strongly amplified transient responses and are mapped onto orthogonal output states during the dynamics". Here you are talking about the whole class of networks that show initial amplification, which may not be strongly amplified and need not be mapped onto orthogonal states, so this should be changed, e.g. to "and networks in which specific inputs elicit amplified responses during the dynamics". You could, if you want, restore the orthogonal concept by inserting 'orthogonal' in "We then build minimal, low-rank networks ... mapping a specific input onto a specific *orthogonal* output state."

Line 62: monotonic decay "exploring essentially a single dimension" -- this isn't true: the monotonic decay could curve or spiral through an arbitrarily high-dimensional space. So I suggest cutting this phrase. 'or transiently move away from it by following a rotation': similarly, this need not be a rotation -- for example, if in 2D you have two eigenvectors both nearly vertical, and you start with a small horizontal initial condition, and the eigenvalues are real and of very different magnitudes, then the trajectory will go up toward the eigenvector with the larger eigenvalue and then decay down along that eigenvector, which is nearly an up-and-down motion with only a slight rotational component. So I'd suggest cutting "by following a rotation".

Line 72: "the firing rate of at least one neuron shows a transient increase (or decrease) before decaying to baseline" -- 'increase or decrease' includes pretty much everything, so this is more or less a tautology. I think what you want to state is that at least one r_i shows a transient increase in its absolute value.

You've also gotten yourself into a bit of trouble since you are calling the r_i's the deviation of the firing rate from the fixed point, so you cannot simply refer to them as 'the firing rates', i.e. you can't say 'at least one firing rate increases in its absolute value'. You could get yourself out of this by, under Eq. 1, saying that for simplicity you will refer to the r_i's as the firing rates, in which case you would want to say 'before decaying to zero' instead of 'before decaying to the baseline'.

Lines 144-147: "two interacting excitatory-inhibitory populations" is confusing, it sounds like 4 populations, two excitatory and two inhibitory. You could just say 'an excitatory and an inhibitory population'.

"our criterion states that the excitatory feedback needs to be (approximately) larger than unity in order to achieve transient amplification (Fig. 2 and Methods)". 'Methods' should instead be Appendix F. But also, this statement isn't correct -- it's only correct when the network is of the form {{w,-k w},{w,-k w}} with k=1+\\epsilon for \\epsilon small. It is certainly not true for the general case of one excitatory and one inhibitory population. The same problem is in the legend of Figure 2, which again states this same criterion, now the form of the matrix is specified but you haven't noted the restriction on k.

Lines 315 and 323: I think you want to cite citation [37], not [38]

Lines 378-380: "the inhibition-dominated regime, which as we show approximately corresponds to the class of unit-rank E-I networks satisfying the general criterion for transient amplification." This isn't true -- if you assume the form {{w,-k w},{w,-k w}}, you can get transient amplification in a stable network either with k<1 or k>1.

(Mathematica tells me: If k<1, the criterion is (2 (-1 + k))/(1 + k)^2 + 2 Sqrt[2] Sqrt[(1 + k^2)/(1 + k)^4] < w < -(1/(-1 + k)); while if k>1, the criterion is w > (2 (-1 + k))/(1 + k)^2 + 2 Sqrt[2] Sqrt[(1 + k^2)/(1 + k)^4].)

Methods, Eq. 30: You should note that, *for J stable*, Eq 30 is the criterion for Eq. 28 to be >1, and that the stability condition that Det(J-1)>0 is precisely the condition that the argument of the square root in (30) is positive, so that \\Delta_c is real. Without the condition that J is stable, the criterion for Eq 28 > 1 is more complicated.

Appendix F, line 1012 "recovers the results from (32), showing that the system is amplified if the excitatory strength w is (approximately) larger than one." In (32), we showed something different: that for an initial condition of r_E>0, r_I=0, r_E (not |r|) shows transient amplification if and only if w>1, for any value of k. What you are showing is that there will be some initial condition for which |r| will show transient amplification if k=1+\\epsilon and w>1-\\epsilon^2/4 for \\epsilon<<1. These are very different.

I'm sorry it took me a while to get to this after the revision came in, due to the holidays. If it comes back with these revisions I promise very quick turnaround.


Dear Dr Bondanelli,

We are pleased to inform you that your manuscript 'Coding with transient trajectories in recurrent neural networks' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at

One of the goals of PLOS is to make science accessible to educators and the public. PLOS staff issue occasional press releases and make early versions of PLOS Computational Biology articles available to science writers and journalists. PLOS staff also collaborate with Communication and Public Information Offices and would be happy to work with the relevant people at your institution or funding agency. If your institution or funding agency is interested in promoting your findings, please ask them to coordinate their releases with PLOS (contact

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Computational Biology.

Sincerely,

Kenneth D. Miller

Guest Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology

PCOMPBIOL-D-19-01144R2

Coding with transient trajectories in recurrent neural networks

Dear Dr Bondanelli,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom