Learning spatiotemporal signals using a recurrent spiking network that discretizes time

Learning to produce spatiotemporal sequences is a common task the brain has to solve. While many sequential behaviours differ superficially, the underlying organization of the computation might be similar. How the brain learns these tasks remains unknown, as current computational models typically do not use biologically plausible learning rules. Here, we propose a model in which a spiking recurrent network drives a read-out layer, with plastic synapses that follow common Hebbian learning rules. The dynamics of the recurrent network is constrained to encode time while the read-out neurons encode space; space is then linked with time through Hebbian learning. We demonstrate that the model is able to learn spatiotemporal dynamics on a timescale that is behaviorally relevant, and that the learned sequences are robustly replayed during a regime of spontaneous activity.


Introduction
Neuronal networks perform flexible computations on a wide range of time scales. While individual neurons operate on the millisecond time scale, behavior typically spans from a few milliseconds to hundreds of milliseconds and longer. Building functional models that bridge this time gap is of increasing interest [Abbott et al., 2016], especially now that the activity of many neurons can be recorded simultaneously [Jun et al., 2017, Maass, 2016]. Many tasks and behaviors essentially require producing spatiotemporal sequences. For example, songbirds produce their songs through a specialized circuit: neurons in the HVC nucleus burst sparsely at very precise times to drive the robust nucleus of the arcopallium, which in turn drives motor neurons [Hahnloser et al., 2002, Leonardo and Fee, 2005]. In addition, sequential neuronal activity has been recorded in various brain regions during different motor tasks [Pastalkova et al., 2008, Itskov et al., 2011, Harvey et al., 2012, Peters et al., 2014, Adler et al., 2019]. While the different tasks possibly involve different sets of muscles, the underlying computation on a more fundamental level might be similar [Rhodes et al., 2004].
Theoretical and computational studies have shown that the synaptic weights of recurrent networks can be set appropriately such that dynamics on a wide range of time scales is produced [Litwin-Kumar and Doiron, 2014, Zenke et al., 2015, Tully et al., 2016]. In general, these synaptic weights are engineered to generate a range of interesting dynamics: slow-switching dynamics [Schaub et al., 2015] and different types of sequential dynamics [Chenkov et al., 2017, Billeh and Schaub, 2018, Setareh et al., 2018]. However, it is unclear how the brain learns these dynamics, as most current approaches use biologically implausible ways to set or "train" the synaptic weights. For example, FORCE training [Sussillo and Abbott, 2009, Laje and Buonomano, 2013, Nicola and Clopath, 2017] and backpropagation through time use information that is non-local in either space or time.
Here, we propose to learn a task in a spiking recurrent network which drives a read-out layer. All synapses are plastic under typical Hebbian learning rules [Clopath et al., 2010, Litwin-Kumar and Doiron, 2014]. We train the recurrent network to generate a sequential activity that serves as a temporal backbone. This sequential activity is generated by clusters of neurons activated one after the other. As the clusters are highly recurrently connected, each cluster undergoes reverberating activity that outlasts the neural time scale. The sequential activation of the clusters therefore produces a sequence long enough to be behaviourally relevant, bridging the neural and the behavioural time scales. We then use Hebbian learning to learn the target spatiotemporal dynamics in the read-out neurons. Thus, the recurrent network encodes time and the read-out neurons encode 'space'. 'Space' can mean spatial position, but it can also mean frequency or a more abstract state space. Similar to the liquid state machine, we can learn different dynamics in parallel in different read-out populations [Jaeger et al., 2007]. We show that learning in the recurrent network is stable during spontaneous activity and that the model is robust to synaptic failure.

Results
In the following we will build our model step-by-step in a bottom-up manner.

Model architecture
The model consists of two separate parts (Fig 1A). The first part is a recurrent network that serves to encode time in a discretized manner. The second is a read-out layer, which learns a multivariate signal of time, t → φ(t). The neurons in the read-out layer encode the D different dimensions of the signal, φ(t) = [φ_1(t), φ_2(t), ..., φ_D(t)]. Time is discretized in the recurrent spiking network by C clusters of excitatory neurons, t ∈ [t_0, t_1, ..., t_C]. This means that at each time only a subset of neurons spike, i.e. the neurons belonging to the same cluster. Given that the first cluster is active during the time interval [t_0, t_1], cluster i is active during the time interval [t_{i−1}, t_i]. These excitatory clusters drive the read-out neurons through all-to-all feedforward connections. At time t ∈ [t_{i−1}, t_i], the read-out neurons are driven solely by the neurons in cluster i, plus some additional background noise. This discretization enables Hebbian plasticity to bind the read-out neurons to the neurons active in the relevant time bin. The read-out neurons are not interconnected and, during learning, receive input from a set of excitatory supervisor neurons. In previous models, the learning schemes are typically not biologically plausible because the plasticity depends on non-local information. Here, however, we use the voltage-based STDP plasticity rule at the excitatory-to-excitatory synapses (Fig 1B). This is paired with weight normalization in the recurrent network (Fig 1C) and weight-dependent potentiation in the read-out synapses (Fig 1D). Additionally, inhibitory plasticity [Vogels et al., 2011] prevents runaway dynamics.
A recurrent network that encodes discrete time Learning happens in a two-stage process. In the first stage, a feedforward weight structure is embedded into the recurrent network. The excitatory neurons are divided into disjoint clusters and neurons in the same cluster share the same input. The recurrent network is initialized as a balanced network with random connectivity. To embed a feedforward structure, the network is stimulated in a sequential manner (Fig 2A). Neurons in cluster i each receive external Poisson spike trains (rate of 18 kHz for 10 ms). This is followed by a 5 ms gap in which no cluster receives input, and then by the stimulation of cluster i+1. This continues until the last cluster is reached, which then links back to the first cluster (i.e. a circular boundary condition). During the stimulation, neurons in the same cluster fire spikes together, which strengthens the intra-cluster connections bidirectionally through the voltage-based STDP rule [Clopath et al., 2010, Ko et al., 2013]. Additionally, there is a pre/post pairing between adjacent clusters, so the weights from cluster i to cluster i+1 strengthen unidirectionally (Fig 2B). When the time gap between sequential stimulations is increased during the training phase, there is no longer a pre/post pairing between clusters. This leads to slow-switching dynamics instead of sequential dynamics [Litwin-Kumar and Doiron, 2014, Schaub et al., 2015] (Fig S2). After the weights have converged, the external sequential input is shut off and spontaneous dynamics is simulated, driven by external excitatory Poisson spike trains without spatial or temporal structure. During that time, the sequence of clusters reactivates spontaneously, ensuring that both the intra-cluster and the feedforward connections remain strong. The connectivity pattern is therefore stable (Fig S3). The feedforward weight embedding changes the spectrum of the recurrent weight matrix.
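The sequential stimulation protocol described above (10 ms of 18 kHz Poisson input per cluster, 5 ms gaps, circular order) can be sketched as follows; the cluster count, time step and helper names are illustrative choices of ours, not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

C = 30            # number of clusters (illustrative)
stim_ms = 10.0    # stimulation window per cluster (ms)
gap_ms = 5.0      # silent gap between cluster stimulations (ms)
rate_khz = 18.0   # external Poisson rate during stimulation (spikes per ms)

def cluster_active_at(t_ms):
    """Index of the cluster being stimulated at time t_ms, or None in a gap.
    The protocol is periodic: the last cluster links back to the first."""
    period = C * (stim_ms + gap_ms)
    phase = t_ms % period
    slot, offset = divmod(phase, stim_ms + gap_ms)
    return int(slot) if offset < stim_ms else None

def external_spike_counts(t_ms, dt_ms=0.1):
    """Poisson spike counts per cluster for one simulation step of the protocol."""
    counts = np.zeros(C, dtype=int)
    i = cluster_active_at(t_ms)
    if i is not None:
        counts[i] = rng.poisson(rate_khz * dt_ms)
    return counts
```

The gap between stimulations is what produces the pre/post pairing of adjacent clusters: lengthening it beyond the plasticity window removes the pairing and yields slow-switching dynamics instead.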
The dominant eigenvalues lie on a circle in the complex plane (Fig 2C). Analysis of a simplified linear rate model shows that the imaginary parts depend linearly on the strength of the feedforward embedding (see supplementary material, Fig S1). Intuitively, this means that the imaginary parts of the dominant eigenvalues determine the period of the sequential activity (Fig S3). Under spontaneous activity, each cluster is active for about 14 ms, due to the recurrent connectivity within the cluster. A large adaptation current counteracts the recurrent reverberating activity and reliably turns the activity off. As each cluster is spontaneously active in turn, the full sequence reaches behavioural time scales (Fig 2D). In summary, the feedforward block connectivity structure embedded in the recurrent network results in sequential dynamics that discretizes time.
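The eigenvalue circle can be checked on a toy cluster-level weight matrix; the intra-cluster weight a and feedforward weight b below are placeholders, not the paper's values:

```python
import numpy as np

C = 30        # clusters
a = 1.0       # intra-cluster (diagonal) weight, placeholder
b = 0.4       # feedforward (cluster i -> i+1) weight, placeholder

# Cluster-level connectivity: circulant matrix with a on the diagonal
# and b on the circular subdiagonal (ring of clusters).
W = a * np.eye(C) + b * np.roll(np.eye(C), 1, axis=0)

eig = np.linalg.eigvals(W)

# Eigenvalues of a circulant ring are a + b * exp(2*pi*i*k/C): a circle of
# radius b centred at a in the complex plane.
radii = np.abs(eig - a)
```

Because the circle's radius equals b, the largest imaginary part of the spectrum grows linearly with the feedforward strength, consistent with the linear rate analysis in the supplementary material.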

Learning a non-Markovian sequence
After the temporal backbone in the recurrent network is established, we now learn a simple yet non-Markovian sequence in the read-out neurons. During training, the read-out neurons receive additional input from supervisor neurons and from interneurons (Fig 3A). The supervisor neurons receive external Poisson input whose rate is modulated by the target sequence that needs to be learned (Fig 3B). The target sequence is composed of different states (A, B or C) that are activated in the order ABCBA. This is a non-Markovian state sequence because the transition from state B requires knowledge of the previous state. Each supervisor neuron encodes one state. This task is non-trivial, and it is unclear how to solve it in a generic recurrent spiking network with local plasticity rules. However, separating time and space solves it in a natural way. We use our recurrent network (Fig 2), where the first cluster is activated at time t. At the same time t, the supervisor is also activated; the underlying assumption is that a starting signal activates both the first cluster of the recurrent network and the external input to the supervisor neurons at the same time. The read-out neurons are activated by the supervisor neurons. Due to the voltage-based plasticity, synapses from recurrent-network neurons onto read-out neurons that fire at the same time are potentiated. After the training period, the interneurons and supervisor neurons stop firing (Fig 3C). The target sequence is now stored in the read-out weight matrix (Fig 3D). As the clusters in the recurrent network reactivate sequentially during spontaneous activity, they also reactivate the learned sequence in the read-out neurons. The spike sequence of the read-out neurons is a noisy version of the target signal (Fig 3E).
Learning the same target signal several times results in slightly different read-out spike sequences each time (Fig S4). The firing rates of the read-out neurons correspond to the target sequence (Fig 3F). In summary, our results show that the model is able to learn simple but non-trivial spike sequences.
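The binding of space to time can be illustrated with a heavily simplified rate-based sketch: clusters are one-hot in time, the supervisor imposes the ABCBA target on the read-out, and a single Hebbian outer-product step stores the target in the read-out weights. This abstracts away the spiking dynamics and the voltage-based rule; the bin counts and the unit learning rate are our own toy choices:

```python
import numpy as np

C = 10                     # time bins, one per cluster (toy value)
states = "ABC"             # one read-out neuron per state
target = "ABCBA"           # non-Markovian target sequence
bins_per_state = C // len(target)

# Supervisor-imposed activity of each read-out neuron in each time bin.
R = np.zeros((len(states), C))
for k, s in enumerate(target):
    R[states.index(s), k * bins_per_state:(k + 1) * bins_per_state] = 1.0

# Recurrent-network activity: cluster i is the only one active in bin i.
X = np.eye(C)

# Hebbian learning: potentiate synapses between co-active pre (cluster)
# and post (read-out) neurons; here a single outer-product step.
eta = 1.0
W_RE = eta * R @ X.T

# Replay: the sequential cluster activations alone now drive the read-out.
replay = W_RE @ X
decoded = "".join(states[np.argmax(replay[:, t])] for t in range(C))
```

Because the clusters are one-hot, the replayed read-out activity exactly recovers the supervisor pattern, including the revisits of B and A that make the sequence non-Markovian.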

Properties of the model
We then investigate the robustness of our model by silencing neurons in the recurrent network after learning. The first type of robustness we test is the robustness of the read-out: the presynaptic spike trains from the recurrent network are slightly perturbed. We compared two different networks: 1) a large network, where 2400 excitatory neurons are grouped in 30 clusters, and 2) a small network, where 600 excitatory neurons are grouped in 30 clusters. For flexibility, we did not simulate the recurrent network dynamics but simply forced neurons in cluster i to spike randomly in time bin i. As before, we learn the simple ABCBA sequence in the read-out neurons (Fig 4A). Note that the larger network learns more quickly than the smaller one. The read-out synapses from the smaller clusters need to be stronger to be able to drive the read-out neurons. After learning, 4 random neurons are silenced in each cluster at two different time points. While the effect of silencing these neurons is very small in the larger network, it is clearly visible in the smaller network: multiple spikes disappear at each read-out neuron. These results show that, not surprisingly, larger clusters drive read-out neurons more robustly.
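The intuition behind this result can be made explicit with a back-of-the-envelope calculation: if a cluster of n neurons must deliver a fixed total drive to a read-out neuron, each synapse carries roughly total/n, and silencing k neurons removes a fraction k/n of the drive. The helper below is our own illustration (equal per-synapse weights are an assumption), with numbers mirroring the silencing of 4 neurons per cluster:

```python
def remaining_drive(n, k, total_weight=1.0):
    """Fraction of a cluster's total drive that survives silencing k of its
    n neurons, assuming equal per-synapse weights total_weight / n."""
    return (n - k) / n * total_weight

# 80-neuron clusters (large network) vs 20-neuron clusters (small network):
big = remaining_drive(80, 4)     # 95% of the drive survives
small = remaining_drive(20, 4)   # only 80% survives
```

The same fixed loss of 4 neurons costs the small cluster a four times larger fraction of its drive, which is why the read-out spikes visibly disappear only in the smaller network.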
We then wanted to test whether time discretization is important in our model. To that end, we trained our recurrent network with clusters as small as one neuron. In that extreme case, the clusters and recurrent connections effectively disappear and our network becomes a synfire chain with a single neuron in every layer [Abeles, 1991]. Randomly removing a synapse in the network breaks the sequential dynamics (Fig 4B). Although a spatiotemporal signal can be learned in the read-out neurons, the signal is not stable under perturbation, as the synfire chain itself is not stable (Fig 4B). In summary, the choice of cluster size is a trade-off between network size on the one hand and robustness on the other. Large clusters require a large network but learn faster, are less prone to failure of the sequential dynamics and drive the read-out neurons more robustly.
The sequential activity in the recurrent network shows some variability in sequence duration: the neural activity does not traverse the periodic trajectory at exactly the same speed in each reactivation. We wondered how the variability in our network compares to Weber's law, according to which the standard deviation of responses in a timing task grows linearly with time [Gibbon, 1977, Hardy and Buonomano, 2018]. Since our model operates on a behaviorally relevant time scale, it is interesting to examine how the variability increases with time. Time is discretized by clusters of neurons that activate each other sequentially, so the variability accumulates naturally over time. As in a standard diffusion process, this increase is expected to grow with √t rather than linearly, and this is indeed what our network displays (Fig 4C). Here, we increased the period of the recurrent network by increasing the network size (80 excitatory clusters of 80 neurons each, see Methods for details), which allows us to measure how the standard deviation of cluster activation times grows with time.

Figure 4 caption (partial): After 200 seconds of learning, spontaneous dynamics is simulated. At t = 750 ms, a single synapse (connecting neuron 41 to neuron 40) is deleted. The dynamics breaks down and parts of the sequence are randomly activated by the external input; the learned read-out dynamics is consequently fractured into subsequences. Top: spike raster of the excitatory neurons of the recurrent network. Bottom: spike raster of the read-out neurons. The maximum read-out synaptic strength is increased to W_max^AE = 75 pF to allow large read-out weights. (C.) Temporal variability of the sequential activity in the recurrent network, measured over 79 trials. A histogram of the periods of the 79 sequences (top) shows an average period of about 615 ms and a standard deviation of around 5 ms. The standard deviation increases roughly with the square root of time (bottom). The least-squares fit of σ_t = a√μ_t, where μ_t is the mean activation time of a cluster, σ_t the standard deviation of the cluster activation times and a the fitted coefficient (a = 0.213), has a root mean squared error of 0.223 ms; a linear fit increases the root mean squared error to 0.824 ms. The lowest time constant in the plasticity of the read-out synapses is 5 ms.
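The diffusive growth of timing variability can be reproduced with a minimal model in which each cluster's dwell time jitters independently; the mean dwell time, jitter amplitude and trial count below are placeholders, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(1)

C = 80            # clusters in the sequence
mu = 7.5          # mean dwell time per cluster (ms), hypothetical
sigma = 1.0       # independent jitter of each dwell time (ms), hypothetical
trials = 2000

# Each trial: independent dwell-time jitter per cluster; the activation time
# of cluster i is the cumulative sum of the first i dwell times.
dwell = mu + sigma * rng.standard_normal((trials, C))
t = np.cumsum(dwell, axis=1)

mean_t = t.mean(axis=0)
std_t = t.std(axis=0)

# Least-squares coefficient of the diffusive fit sigma_t = a * sqrt(mu_t).
a = np.sum(std_t * np.sqrt(mean_t)) / np.sum(mean_t)
```

Because the activation time of cluster i is a sum of i independent dwell times, its variance grows linearly with i, so the standard deviation grows with √t; the square-root fit should therefore beat a linear fit, as reported for the full spiking network above.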

Learning a complex sequence
In the non-Markovian target sequence ABCBA, the states have the same duration and the same amplitude (Fig 3B). We wanted to test whether we could also learn more complicated sequences. In this second task, the model is trained on a spatiotemporal signal whose components vary in duration and amplitude.
As an example, we use a 600 ms meadowlark song (Fig 5A). The spectrogram of the sound is normalized and used as the time- and amplitude-varying rate of the external Poisson input to the supervisor neurons. Each read-out and supervisor neuron encodes a different frequency range. The song is longer than the period of the original recurrent network (Fig 2); therefore, we trained on the larger network of 6400 excitatory neurons (Fig 4C). After training, the read-out weight matrix reflects the structure of the target sequence (Fig 5B). Under spontaneous activity, the supervisor neurons and interneurons stop firing and the recurrent network drives song replays (Fig 5C). The learned spatiotemporal signal broadly follows the target sequence (Fig 5A). The model performs worse when the target dynamics has fast time variations, because these variations are faster than, or of the same order as, the time discretization in the recurrent network. Thus, we conclude that the model can learn interesting spiking dynamics; the performance is qualitatively better when the changes in the target dynamics are slower than the time discretization in the recurrent network.

Discussion
We proposed an architecture that consists of two parts. The first part is a temporal backbone, implemented by a recurrent network. Excitatory neurons in the recurrent network are organized into disjoint clusters, which are made sequentially active by embedding a feedforward connectivity structure into the weights of the network. We showed that ongoing spontaneous activity does not degrade this structure: the set of plasticity rules and the sequential dynamics reinforce each other, as was shown for slow-switching dynamics before [Litwin-Kumar and Doiron, 2014] (Fig S2). The stable sequential dynamics allows a downstream linear decoder to read out time at the behavioural timescale. The second part is a set of read-out neurons that encode the multiple dimensions of a signal, which we also call "space". Connecting the first part to the second binds space to time. The read-out neurons learn spike sequences in a supervised manner. More specifically, we investigated a simple state-transition sequence and a sequence with more complex dynamics. The supervisor sequence is encoded into the read-out weight matrix.
The presented model brings elements from different studies together. Recurrent network models that learn sequential dynamics are widely studied [Fiete et al., 2010, Rajan et al., 2016, Hardy and Buonomano, 2018]. Separately, models exist that aim to learn spatiotemporal dynamics [Brea et al., 2013, Nicola and Clopath, 2017, Gilra and Gerstner, 2017]. Combining the two, the dynamics of the recurrent network is exploited by the read-out neurons to perform the computational task. In this paper, we use local Hebbian learning, namely the voltage-based STDP rule [Clopath et al., 2010].
Importantly, the dynamics of the recurrent network spans three time scales. First, individual neurons fire on the fastest, millisecond time scale. Second, one level slower, clusters of neurons fire on a time scale that determines the time discretisation of our temporal backbone, or if you will, the "tick of the clock", τ_c. Third, the slowest time scale is that of the entire network, i.e. the period of the sequential activity, τ_p. The time scales τ_c and τ_p depend on the cluster and network size, the average connection strength within the clusters and the adaptation. Smaller cluster sizes lead to a smaller τ_c, and τ_p increases with network size when the cluster size is fixed.
The recurrent network is the "engine" that, once established, drives the read-out dynamics. Our model can learn different read-out synapses in parallel and is robust to synapse failure. This robustness is a consequence of the clustered organization of the recurrent network: the clusters provide a large drive for the read-out neurons while keeping the individual synaptic strengths reasonably small. Smaller clusters, on the other hand, require larger read-out synaptic strengths, making the dynamics more sensitive to small changes. In the limit where every cluster has exactly one neuron, the sequential dynamics is especially fragile. Learning is faster with more neurons per cluster, as relatively small changes in the synapses are then sufficient to learn the target. This is consistent with the intuitive idea that some redundancy in the network can increase learning speed [Raman et al., 2019].
Previous studies have discussed whether sequences are learned and executed serially or hierarchically [Lashley, 1951]. Our recurrent network has a serial organization: when the sequential activity breaks down halfway, the remaining clusters are not activated anymore. A hierarchical structure would avoid such complete breakdowns, at the cost of a more complicated structure to control the system. Sequences that are chunked into subsequences can be learned separately and chained together, and errors in early sub-sequences are then less likely to affect later sub-sequences. Evidence for hierarchical structures is found throughout the literature [Sakai et al., 2003, Glaze and Troyer, 2006, Okubo et al., 2015]. The basal ganglia, for example, are thought to play an important role in shaping and controlling action sequences [Tanji, 2001, Jin and Costa, 2015, Geddes et al., 2018]. Another reason why a hierarchical organization seems beneficial is inherent to the sequential dynamics: the time variability of the sequential activity grows approximately as √t. On a time scale of a few hundred milliseconds this does not yet pose a problem, but for longer target sequences this variability would reach and exceed the plasticity time constants. The presented model can serve as an elementary building block of a more complex hierarchy.
In summary, we have shown that a clustered network organization can be a powerful substrate for computations. Specifically, the model dissociates time and space and can therefore use Hebbian learning to learn spatiotemporal sequences. More generally, the backbone of clustered organization might encode any variable x and enable downstream read-out neurons to learn and compute any function of this variable, φ(x).

Methods
Excitatory neurons are modelled with the adaptive exponential integrate-and-fire model. A classical integrate-and-fire model is used for the inhibitory neurons. Excitatory-to-excitatory recurrent synapses are plastic under the voltage-based STDP rule [Clopath et al., 2010]. This enables the creation of neuronal clusters and a feedforward embedding. Normalization and weight bounds are used to introduce competition and keep the weights bounded. Inhibitory-to-excitatory synapses are plastic following [Vogels et al., 2011]; in general, this prevents runaway dynamics. The connections from the recurrent network to the read-out neurons are plastic under the same voltage-based STDP rule, except that potentiation of the read-out synapses depends linearly on the strength of the synapses. There is no normalization here, to allow a continuous weight distribution.

Network dynamics
Recurrent network A network with N_E excitatory (E) and N_I inhibitory (I) neurons is homogeneously recurrently connected with connection probability p. The weights are initialized such that the network is balanced.

Read-out neurons
The N_E excitatory neurons of the recurrent network are all-to-all connected to N_R excitatory read-out (R) neurons; this weight matrix is denoted by W_RE. During learning, the read-out neurons receive supervisory input from N_R excitatory supervisor (S) neurons. The connection from supervisor neurons to read-out neurons is one-to-one and fixed, w_RS. N_R interneurons (H) are one-to-one and recurrently connected to the read-out neurons with fixed connection strengths, w_RH and w_HR (see table 1 for the recurrent network and read-out parameters). The E to E, I to E and E to R connections are plastic.
Membrane potential dynamics The membrane potential of the excitatory neurons (V_E) in the recurrent network follows adaptive exponential integrate-and-fire dynamics, where τ_E is the membrane time constant, E_L^E is the leak reversal potential, Δ_T^E is the slope of the exponential, C is the capacitance, g_EE and g_EI are the synaptic inputs from excitatory and inhibitory neurons respectively, and E_E and E_I are the excitatory and inhibitory reversal potentials respectively. When the membrane potential diverges and exceeds 20 mV, the neuron fires a spike and the membrane potential is reset to V_r. This reset potential is the same for all neurons in the model. There is an absolute refractory period of τ_abs. The threshold V_T^E is adaptive for excitatory neurons: it is set to V_T + A_T after a spike and relaxes back to V_T with time constant τ_T. The adaptation current a_E of the recurrent excitatory neurons has time constant τ_a and slope α, and is increased by a constant β when the neuron spikes. The membrane potential of the read-out neurons (V_R) has no adaptation current; τ_E, E_L^E, Δ_T^E, E_E, E_I and C are as defined before. g_RE is the excitatory input from the recurrent network, g_RS is the excitatory input from the supervisor neuron (non-zero only during learning) and g_RH is the inhibitory input from the interneuron (non-zero only during learning). The absolute refractory period is τ_absR. The threshold V_T^R follows the same dynamics as V_T^E, with the same parameters. The membrane potential of the supervisor neurons (V_S) has no inhibitory input and no adaptation current; g_SE is the external excitatory input from the target sequence. The absolute refractory period is τ_absS. The threshold V_T^S again follows the same dynamics as V_T^E, with the same parameters.
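The equations for these membrane dynamics did not survive formatting. A plausible reconstruction of the excitatory membrane, threshold and adaptation equations, following the standard AdEx form of [Clopath et al., 2010, Litwin-Kumar and Doiron, 2014] (the placement of the capacitance C and the sign conventions are assumptions, not recovered from the original), is:

```latex
\tau_E \frac{dV_i^E}{dt} = (E_L^E - V_i^E)
  + \Delta_T^E \exp\!\left(\frac{V_i^E - V_{T,i}^E}{\Delta_T^E}\right)
  + \frac{g_i^{EE}(E_E - V_i^E) + g_i^{EI}(E_I - V_i^E) - a_i^E}{C}

\tau_T \frac{dV_{T,i}^E}{dt} = V_T - V_{T,i}^E,
  \qquad V_{T,i}^E \to V_T + A_T \ \text{after a spike}

\tau_a \frac{da_i^E}{dt} = \alpha\,(V_i^E - E_L^E) - a_i^E,
  \qquad a_i^E \to a_i^E + \beta \ \text{at a spike}
```

The read-out, supervisor and inhibitory equations then follow by dropping the adaptation current and/or the inhibitory conductance term, as described in the text.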
The membrane potential of the inhibitory neurons (V_I) in the recurrent network follows its own dynamics, where τ_I is the inhibitory membrane time constant, E_L^I is the inhibitory leak reversal potential and E_E, E_I are the excitatory and inhibitory reversal potentials respectively. g_EE and g_EI are the synaptic inputs from recurrent excitatory and inhibitory neurons respectively. Inhibitory neurons spike when the membrane potential crosses the threshold V_T, which is non-adaptive. After this, there is an absolute refractory period of τ_abs. There is no adaptation current. The membrane potential of the interneurons (V_H) follows the same dynamics with the same parameters, but without inhibitory input; the excitatory input g_HE comes both from the read-out neuron the interneuron is attached to and from external input. After the threshold V_T is crossed, the interneuron spikes and an absolute refractory period of τ_absH follows. The interneurons inhibit the read-out neurons more strongly when they receive strong input from the read-out neurons. This slows down the potentiation of the read-out synapses and keeps them from potentiating exponentially (see table 2 for the parameters of the membrane dynamics).
Synaptic dynamics The synaptic conductance of a neuron i is time dependent: it is a convolution of a kernel with the total input to neuron i, where the kernel has a decay time τ_d and a rise time τ_r that depend only on whether the neuron is excitatory or inhibitory. There is no external inhibitory input to the supervisor neurons and interneurons. During spontaneous activity, there is no external inhibitory input to the recurrent network and the external excitatory input has a fixed rate. The external input to the interneurons also has a fixed rate during learning. The external input to the supervisor neurons depends on the specific learning task. There is no external input to the read-out neurons. The externally incoming spike trains s_ext^X are generated from a Poisson process with rates r_ext^X and enter the network through synapses W_ext^X (see table 3 for the parameters of the synaptic dynamics).
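A difference-of-exponentials kernel is a common choice consistent with the rise and decay constants above; the sketch below (the kernel's peak normalization, the time constants and the synaptic weight are our assumptions) shows how a conductance trace is obtained by convolving the kernel with a spike train:

```python
import numpy as np

dt = 0.1                      # simulation step (ms)
tau_r, tau_d = 1.0, 6.0       # rise and decay time constants (ms), hypothetical

def kernel(t):
    """Difference-of-exponentials synaptic kernel (assumed form),
    normalized so that its peak equals 1."""
    k = np.exp(-t / tau_d) - np.exp(-t / tau_r)
    return np.where(t >= 0, k, 0.0) / k.max()

t = np.arange(0.0, 50.0, dt)
spikes = np.zeros_like(t)
spikes[[100, 200, 210]] = 1.0    # presynaptic spikes at 10, 20 and 21 ms

# Conductance = kernel convolved with the weighted input spike train.
w = 2.0                          # synaptic weight (pF), hypothetical
g = w * np.convolve(spikes, kernel(t))[:t.size]
```

Two spikes arriving close together (20 and 21 ms) sum their conductance transients, which is how coincident cluster activity delivers a large drive to a postsynaptic neuron.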

Plasticity
Excitatory plasticity The voltage-based STDP rule is used [Clopath et al., 2010]. The synaptic weight from excitatory neuron j to excitatory neuron i is changed according to a differential equation in which A_LTD and A_LTP are the amplitudes of depression and potentiation respectively, θ_LTD and θ_LTP are the voltage thresholds to recruit depression and potentiation respectively, and H(·) denotes the Heaviside function. V_i is the postsynaptic membrane potential; u_i and v_i are low-pass filtered versions of V_i, with time constants τ_u and τ_v respectively. s_j is the presynaptic spike train and x_j is the low-pass filtered version of s_j with time constant τ_x, where τ_x depends on whether learning happens inside (E to E) or outside (E to R) the recurrent network. s_j(t) = 1 if neuron j spikes at time t and zero otherwise. Competition between synapses in the recurrent network is created by a hard L1 normalization every 20 ms, keeping the sum of all weights onto a neuron constant: Σ_j W_ij = K. E to E weights have a lower and upper bound [W_min^EE, W_max^EE]. The minimum and maximum strengths are important parameters and determine the position of the dominant eigenvalues of W. Potentiation of the read-out synapses is weight dependent: assuming that stronger synapses are harder to potentiate [Debanne et al., 1999], A_LTP decreases linearly with W_RE. The maximum LTP amplitude A = 0.0008 pA mV⁻¹ is reached when W_RE = W_min^RE (see table 4 for the parameters of the excitatory plasticity rule). Inhibitory plasticity Inhibitory plasticity acts as a homeostatic mechanism, preventing runaway dynamics [Rhodes-Morrison et al., 2007, Vogels et al., 2011, Litwin-Kumar and Doiron, 2014, Zenke et al., 2015]. It is not necessary for the stability of the sequential dynamics, but it is necessary for the slow-switching dynamics (Fig S2). Excitatory neurons that fire at a higher frequency receive more inhibition.
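A minimal per-time-step sketch of the voltage-based STDP rule and the hard L1 normalization might look as follows; all parameter values are placeholders and the vectorized form is our own, not the paper's implementation:

```python
import numpy as np

dt = 0.1                          # ms
A_LTD, A_LTP = 0.0014, 0.0008     # depression / potentiation amplitudes (placeholders)
th_LTD, th_LTP = -70.0, -49.0     # voltage thresholds (mV, placeholders)

def relu_gate(x, th):
    """H(x - th) * (x - th): thresholded-linear voltage term of the rule."""
    return np.maximum(x - th, 0.0)

def stdp_step(W, V, u, v, x, s_pre):
    """One Euler step of the Clopath-style rule: depression is gated by the
    presynaptic spike s_j and the slow postsynaptic trace u_i; potentiation
    by the presynaptic trace x_j, the instantaneous voltage V_i and the
    trace v_i.  W[i, j] is the synapse from pre j to post i."""
    ltd = A_LTD * np.outer(relu_gate(u, th_LTD), s_pre)
    ltp = A_LTP * np.outer(relu_gate(V, th_LTP) * relu_gate(v, th_LTD), x)
    return W + dt * (ltp - ltd)

def normalize_rows(W, K):
    """Hard L1 normalization: keep the summed input onto each neuron at K."""
    return W * (K / W.sum(axis=1, keepdims=True))
```

In the full model, `stdp_step` would run every time step while `normalize_rows` is applied only every 20 ms and only to the recurrent E-to-E weights; the read-out synapses instead scale `A_LTP` down linearly with the current weight.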
The I to E weights are changed when the presynaptic inhibitory neuron or the postsynaptic excitatory neuron fires [Vogels et al., 2011], where r_0 is a constant target rate for the postsynaptic excitatory neuron. s_E and s_I are the spike trains of the postsynaptic E and presynaptic I neuron respectively. The spike trains are low-pass filtered with time constant τ_y to obtain y_E and y_I (as in equation 12).
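A sketch of this inhibitory rule, following the form of Vogels et al. (2011): potentiation on coincident pre/post activity, depression toward the target rate on isolated presynaptic spikes. The depression constant alpha = 2·r_0·τ_y and all numerical values below are assumptions:

```python
# Vogels-Sprekeler-style inhibitory (I -> E) plasticity sketch.
eta = 1e-3        # learning rate (placeholder)
tau_y = 20.0      # trace time constant (ms)
r0 = 0.005        # target postsynaptic rate (spikes/ms, placeholder)
alpha = 2.0 * r0 * tau_y

def inh_step(w, y_E, y_I, s_E, s_I, dt=0.1):
    """One time step: decay the pre/post traces, add new spikes, then update
    the I->E weight.  A pre (I) spike changes w by eta*(y_E - alpha): LTP if
    the post trace is high, LTD otherwise; a post (E) spike adds eta*y_I."""
    y_E = y_E - dt * y_E / tau_y + s_E
    y_I = y_I - dt * y_I / tau_y + s_I
    w = w + eta * (y_E - alpha) * s_I + eta * y_I * s_E
    return w, y_E, y_I
```

The fixed point of this rule is the target rate r_0: excitatory neurons firing above it accumulate inhibition until their rate comes back down, which is the homeostatic effect described above.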

Learning protocol
Learning happens in two stages. First a feedforward structure is embedded in the recurrent network that produces reliable sequential dynamics. Once this dynamics is established connections to read-out neurons can be learned. Read-out neurons are not interconnected and can learn in parallel.

Recurrent network
The network is divided into disjoint clusters of 80 neurons. The clusters are sequentially stimulated for 60 minutes by a large external current, where externally incoming spikes are drawn from a Poisson process with rate 18 kHz. This high input rate does not originate from a single external neuron but rather assumes a large external input population. Each cluster is stimulated for 10 ms and in between cluster stimulations there are 5 ms gaps. During excitatory stimulation of a cluster, all other clusters receive an external inhibitory input with rate 4.5 kHz and external input weight W_ext^I = 2.4 pF. There is a periodic boundary condition: the last cluster activates the first cluster again. After the sequential stimulation, the network is spontaneously active for 60 minutes, during which the connectivity stabilizes. The recurrent weight matrix of the large network (Fig 5) is learned using the same protocol; it reaches a stable structure after three hours of sequential stimulation followed by three hours of spontaneous dynamics. Parameters that change are summarized in table 6. For slow-switching dynamics, a spatially clustered but temporally uncorrelated stimulation protocol is used (Fig S2). The weight matrix used to plot the spectrum of the recurrent network in Figs 2 and S3 is W =

Read-out network During learning of the read-out synapses, external input drives the supervisor and interneurons. The rate of the external Poisson input to the supervisor neurons reflects the sequence that has to be learned and is normalized between 0 kHz and 10 kHz. During learning, W_RE changes. After learning, the external input to the supervisor and inter-neurons is turned off and both stop firing. The read-out neurons are now solely driven by the recurrent network. Plasticity is frozen in the read-out synapses after learning.
If plasticity remained on during spontaneous dynamics, the read-out synapses would continue to potentiate because of the coactivation of clusters in the recurrent network and the read-out neurons. All read-out synapses would then end up saturated at the upper weight bound.
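This saturation argument can be illustrated with a toy Hebbian update under an upper weight bound. The bound, learning rate, synapse count, and coactivation probability below are all illustrative, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

W_MAX = 21.0        # upper weight bound (hypothetical value)
ETA = 0.5           # Hebbian potentiation per coactivation (illustrative)
w = rng.uniform(0.0, 2.0, size=100)   # initial read-out synapses

# With plasticity left on, recurrent clusters and read-out neurons keep
# firing together during spontaneous replay, so every synapse keeps
# potentiating; the upper bound is the only stable value.
for _ in range(1000):
    coactive = rng.random(100) < 0.2   # pre/post coactivation events
    w = np.clip(w + ETA * coactive, 0.0, W_MAX)

print(np.all(w == W_MAX))   # True: all synapses saturate at the bound
```

Freezing plasticity after learning, as in the model, avoids exactly this collapse of the learned weight structure.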
Simulations
The code used for training and simulating the recurrent network is built on top of the code from [Litwin-Kumar and Doiron, 2014] in Julia. The code used for learning spatiotemporal sequences with read-out neurons is written in Matlab. Recurrent networks learned in Julia can be saved and loaded into the Matlab code, or networks with a manually set connectivity can be explored as in [Schaub et al., 2015]. A forward Euler discretization with a time step of 0.1 ms is used. The code is available for reviewers on ModelDB with access code XYZ and password XYZ, and will be made public on ModelDB after publication.
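For concreteness, a forward Euler integration with a 0.1 ms time step, applied to a single leaky integrate-and-fire neuron, looks as follows. The membrane parameters here are generic textbook values, not those of the paper's network:

```python
# Minimal forward Euler sketch; parameters are illustrative.
DT = 0.1e-3          # time step: 0.1 ms, as in the simulations
TAU_M = 20e-3        # membrane time constant (s)
V_REST = -70e-3      # resting potential (V)
V_THRESH = -50e-3    # spike threshold (V)
V_RESET = -65e-3     # reset potential (V)

def simulate_lif(i_ext, t_max=0.2):
    """Integrate dV/dt = (V_REST - V)/TAU_M + i_ext with forward Euler,
    returning the spike times in seconds."""
    n_steps = int(t_max / DT)
    v = V_REST
    spikes = []
    for step in range(n_steps):
        v += DT * ((V_REST - v) / TAU_M + i_ext)
        if v >= V_THRESH:
            spikes.append(step * DT)
            v = V_RESET
    return spikes

spikes = simulate_lif(i_ext=1.5)   # constant suprathreshold drive (V/s)
print(len(spikes))
```

The same explicit update, applied to every neuron and synaptic variable per time step, underlies the network simulations.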

Supporting information
Linear rate model: analysis of the spectrum
A linear rate model can give insight into the dynamics of a large nonlinear structured spiking network [Schaub et al., 2015]. Sequential dynamics in a linear rate network can emerge from the same feedforward structure as studied in this paper. The dynamics of the simplified rate model follows

τ dx/dt = −x + Ax + ξ,

where x is a multidimensional variable consisting of the rates of all excitatory and inhibitory groups, A is the weight matrix and ξ is noise. A minimal model for sequential firing consists of three groups of excitatory neurons and one group of inhibitory neurons (Fig S1). We are interested in a general connectivity structure:

Figure S2: Slow-switching dynamics. The recurrent network is stimulated with external input that is spatially clustered but temporally uncorrelated. Each cluster is stimulated for 50 ms, with 50 ms gaps between stimulations. The rate of external stimulation is 18 kHz during training, and there is no simultaneous inhibition of the other clusters. This is repeated for 20 minutes, after which the network stabilizes during 20 minutes of spontaneous activity. (A) A diagonal structure is embedded in the recurrent network. Since there are no temporal correlations in the external input, there is no off-diagonal structure. (B) The spectrum shows an eigenvalue gap, indicating the emergence of a slower time scale. The leading eigenvalues have no imaginary part, pointing at the absence of feedforward structure and thus of sequential dynamics. (C) Under a regime of spontaneous dynamics (i.e. uncorrelated Poisson inputs), the clusters are randomly reactivated. Because of the high adaptation, a neuron fires only once during each reactivation.

Figure S3: The feedforward embedding is stable. After 60 minutes of training (i.e. sequential stimulation as described in the Methods), the network stabilizes during spontaneous activity. During the first 30 minutes of spontaneous dynamics, the connectivity still changes: the imaginary parts of the leading eigenvalues increase, which leads to a higher switching frequency and hence a smaller period of the sequential activity. After around 30 minutes, a fixed point is reached. The first row shows the spectra of the weight matrix at different times; the second row shows the spike trains at those times, for one second of spontaneous activity.

Figure S4: The sequence ABCBA is relearned four times for 12 seconds each. Before each relearning, the read-out weight matrix W_RE was reset. When active, read-out neurons fire two spikes on average, plus or minus one spike. This variability is a consequence of the noisy learning process.
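The spectral signatures discussed above (a real eigenvalue gap for purely diagonal structure versus complex leading eigenvalues for a feedforward embedding) can be checked on toy group-level matrices. These matrices are illustrative stand-ins, not the paper's learned weight matrix:

```python
import numpy as np

N = 10  # number of clusters (illustrative)

# Diagonal (within-cluster) self-excitation only: slow switching, no sequence.
A_diag = np.eye(N)

# Add an off-diagonal cyclic feedforward term: cluster k drives cluster k+1,
# with the last cluster feeding back to the first (periodic boundary).
A_ff = np.eye(N) + 0.5 * np.roll(np.eye(N), 1, axis=0)

eig_diag = np.linalg.eigvals(A_diag)
eig_ff = np.linalg.eigvals(A_ff)

# The purely diagonal matrix has only real eigenvalues; the feedforward
# embedding gives the leading eigenvalues imaginary parts, the signature
# of sequential (rotational) dynamics.
print(np.allclose(eig_diag.imag, 0.0))      # True
print(np.max(np.abs(eig_ff.imag)) > 0.0)    # True
```

Larger imaginary parts correspond to a higher switching frequency, consistent with the shrinking sequence period observed in Fig S3 as the feedforward structure strengthens.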