Comparison of FORCE trained spiking and rate neural networks shows spiking networks learn slowly with noisy, cross-trial firing rates

doi:10.1371/journal.pcbi.1013224

Fig 1.

Leaky Integrate-and-Fire spiking and Equivalent rate networks.

In standard spiking networks, each neuron has a membrane voltage governed by linear dynamics. Once this voltage crosses a threshold, the neuron fires a spike, which is then filtered by a set of double exponential filter equations. The resulting filtered spike current, or post-synaptic current, is then used as the output of the neuron. In the firing rate network model, we compute the instantaneous theoretical firing rate of each neuron given its input current. This firing rate is then filtered by the same double exponential filter equations to produce the post-synaptic current as the neuron's output.
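The two neuron models described in this caption can be sketched for a single neuron driven by a constant current. This is a minimal illustration, not the authors' implementation: it assumes normalized voltage units (threshold 1, reset 0), forward-Euler integration, and illustrative time constants (the values of Table 1 are not reproduced here). The names `lif_rate` and `simulate` are hypothetical.

```python
import math

def lif_rate(I, tau_m=0.01, t_ref=0.002, v_th=1.0, v_reset=0.0):
    """Theoretical LIF firing rate (Hz) for a constant input current I.

    Assumes membrane dynamics tau_m * dv/dt = -v + I (normalized units).
    """
    if I <= v_th:
        return 0.0  # the voltage never reaches threshold
    return 1.0 / (t_ref + tau_m * math.log((I - v_reset) / (I - v_th)))

def simulate(I, T=2.0, dt=5e-5, tau_m=0.01, t_ref=0.002,
             v_th=1.0, v_reset=0.0, tau_r=0.002, tau_d=0.02):
    """Run one spiking LIF neuron and its rate counterpart side by side.

    Both drive the same double exponential filter:
        dr/dt = -r/tau_d + h
        dh/dt = -h/tau_r + input / (tau_r * tau_d)
    where `input` is the spike train (spiking) or lif_rate(I) (rate model).
    Returns (empirical spike rate, time-averaged filtered spikes,
             final filtered rate).
    """
    n = int(round(T / dt))
    rate = lif_rate(I, tau_m, t_ref, v_th, v_reset)
    v, refr = v_reset, 0.0      # membrane voltage, remaining refractory time
    h_s = r_s = 0.0             # filter state, spiking neuron
    h_r = r_r = 0.0             # filter state, rate neuron
    n_spikes, acc, n_acc = 0, 0.0, 0
    for step in range(n):
        spike = 0.0
        if refr > 0.0:
            refr -= dt          # hold the voltage at reset while refractory
        else:
            v += dt * (-v + I) / tau_m
            if v >= v_th:
                v, refr, spike = v_reset, t_ref, 1.0
                n_spikes += 1
        # Spiking model: each spike is an impulse into the filter.
        r_s += dt * (-r_s / tau_d + h_s)
        h_s += dt * (-h_s / tau_r) + spike / (tau_r * tau_d)
        # Rate model: the theoretical rate drives the same filter.
        r_r += dt * (-r_r / tau_d + h_r)
        h_r += dt * (-h_r / tau_r + rate / (tau_r * tau_d))
        if step >= n // 2:      # average after the initial transient
            acc += r_s
            n_acc += 1
    return n_spikes / T, acc / n_acc, r_r
```

Under these assumptions the filtered spike train fluctuates around the smooth filtered rate, so the two outputs agree on average while the spiking output carries additional ripple.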


Table 1.

Leaky Integrate-and-fire neuron model parameters.


Fig 2.

FORCE training spiking LIF and LIF-matched rate networks.

The single-layer recurrent neural network used in the FORCE method consists of a set of fixed reservoir weights, a set of fixed encoder weights, and a set of learned decoder weights. The reservoir network creates a chaotic pool of rich, mixed dynamics, which are linearly decoded to approximate the target supervisor. The decoder (S for spikes, R for rates) is learned online using the Recursive Least Squares (RLS) algorithm. The encoder weights then feed the decoded output back into the reservoir to stabilize the dynamics.
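The online RLS decoder update mentioned in this caption can be sketched as follows. This is a simplified, open-loop illustration under stated assumptions: the reservoir is replaced by fixed random tanh features of a 5 Hz phase, there is no feedback through encoder weights, and the initialization `P = I / alpha` with `alpha = 1` is an assumed convention rather than the paper's learning-rate parameterization. The names `rls_step` and `demo` are hypothetical.

```python
import numpy as np

def rls_step(phi, P, r, target):
    """One Recursive Least Squares update of a linear decoder.

    phi: decoder weights (N,), P: running inverse-correlation estimate (N, N),
    r: current filtered activity (N,), target: supervisor value at this step.
    """
    err = phi @ r - target            # error before the update
    Pr = P @ r
    k = Pr / (1.0 + r @ Pr)           # gain vector, equal to P_new @ r
    phi = phi - err * k               # decoder update
    P = P - np.outer(k, Pr)           # rank-one update of P
    return phi, P, err

def demo(n=100, alpha=1.0, dt=1e-3, t_train=2.0, seed=0):
    """Train a decoder on stand-in 'reservoir' features, then test it frozen."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n, 2))
    b = rng.normal(size=n)

    def features(t):
        # Stand-in for reservoir activity: random tanh features of the phase.
        phase = 2.0 * np.pi * 5.0 * t
        return np.tanh(W @ np.array([np.sin(phase), np.cos(phase)]) + b)

    def supervisor(t):
        return np.sin(2.0 * np.pi * 5.0 * t)

    phi = np.zeros(n)
    P = np.eye(n) / alpha             # assumed initialization
    for step in range(int(round(t_train / dt))):
        t = step * dt
        phi, P, _ = rls_step(phi, P, features(t), supervisor(t))

    # Post-learning phase: no further weight updates.
    t_test = t_train + dt * np.arange(1000)
    errs = [phi @ features(t) - supervisor(t) for t in t_test]
    return float(np.sqrt(np.mean(np.square(errs))))
```

In the full FORCE setting the decoded output would additionally be fed back into the recurrent network at every step, which is what couples the decoder to the network dynamics; the sketch omits that loop to stay short.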


Fig 3.

FORCE can train both spiking LIF and LIF-matched rate networks.

Spiking LIF and parameter-matched LIF-matched rate networks are trained on different tasks. The network outputs and sample neuron firing rates are overlaid. Spiking LIF values are plotted with a solid line, LIF-matched rate values are plotted with a dotted line, and supervisors with a thick grey line. A FORCE training can be broken down into three phases: pre-learning, learning, and post-learning. Before learning, the network dynamics are spontaneously chaotic. During learning, the network output is forced to match the target output, and the network dynamics are stabilized accordingly. After learning, if the training is successful, the network output and dynamics will continue to reproduce those stabilized during learning without any further weight updates. The green line indicates the change in the Euclidean norm of the decoder. Across all three stages of learning, the neural dynamics and network outputs of the spiking and rate networks are highly correlated. B Networks of 2000 neurons were trained to reproduce the random kick pitchfork system using 120s of training, with 27s of testing displayed. C Networks of 2000 neurons were chaotically initialized, then trained to reproduce the product of a 1Hz and 2Hz sine wave using 5s of training, with 5s of testing displayed. D–E Networks of 2000 neurons were chaotically initialized, then trained to reproduce the first bar of the song “Ode to Joy” by Beethoven. Each of the 5 notes in the first bar was converted to a component of a 5-dimensional target signal. Quarter notes were represented by the positive portion of a 2Hz sine wave, and half notes by the positive portion of a 1Hz sine wave. Training consisted of 80s (20 repetitions of the bar), while the testing displayed consists of 4s (1 repetition). F–G Networks of 5000 neurons were randomly initialized into chaos, then trained to reproduce the global dynamics of the Lorenz system at fixed parameter values. To train the networks, 200s of Lorenz target trajectory was used, and then 200s of testing output is displayed. Each of the 3 components of the Lorenz system was used to train a component of the 3-dimensional network output.


Fig 4.

FORCE training with slow learning rates leads to strongly correlated and swappable decoders across spiking LIF and LIF-matched rate networks.

A–C Networks of 2000 neurons were trained on different supervisors over a grid of points in the (Q,G) parameter space for both spiking LIF and LIF-matched rate networks. A separate learning rate was used for each of the pitchfork, Ode to Joy, and oscillator supervisors. From top to bottom, each set of heatmaps shows: the L₂ testing error for the spiking networks, the L₂ testing error for the firing rate networks, and the Pearson correlation between the learned decoders of the spiking and rate networks. The stars indicate the most correlated pair of networks with both networks' L₂ error below a fixed threshold; these pairs were used in the remaining panels. D–F Sample overlaid network outputs (black), sample neuron firing rates (blue), and target supervisor (grey) for both the spiking (solid) and rate (dotted) networks. G–I Sample output and neuron dynamics for swapped decoders. The top plots are the output and neuron dynamics for the LIF network with the trained firing rate decoder. The bottom plots are the output and neuron dynamics for the firing rate network with the trained LIF decoder. J–L Scatter plot of LIF decoder versus firing rate decoder with a linear fit.


Fig 5.

FORCE training with fast learning rates reduces decoder correlation and swappability.

A–C Row-balanced networks of 2000 neurons were trained on different supervisors over a grid of points in the (Q,G) parameter space for both LIF and LIF-matched rate networks. A separate learning rate was used for each of the pitchfork, Ode to Joy, and oscillator supervisors. From top to bottom, each set of heatmaps shows: the L₂ testing error for the LIF networks, the L₂ testing error for the rate networks, and the Pearson correlation between the learned decoders of the spiking and rate networks. The stars indicate the most correlated pair of networks with both networks' L₂ error below a fixed threshold; these pairs were used in the remaining panels. D–F Sample overlaid network outputs (black), sample neuron firing rates (blue), and target supervisor (grey) for both the spiking (solid) and rate (dotted) networks. G–I Sample output and neuron dynamics for swapped decoders. The top plots are the output and neuron dynamics for the LIF network with the trained firing rate decoder. The bottom plots are the output and neuron dynamics for the firing rate network with the trained LIF decoder. J–L Scatter plot of LIF decoder versus firing rate decoder with a linear fit.


Fig 6.

Fast learning improves performance of LIF-matched rate but not spiking LIF networks.

A Networks of 2000 neurons were trained on the Ode to Joy, Fourier basis, and sinusoidal tasks over a grid of points in the (Q,G) hyperparameter space with 4 different learning rates for both spiking and rate networks. Within each sub-panel, from left to right, we plot the testing error for the spiking network, the testing error for the rate network, and the cross-network decoder correlation. B For the 5 Hz sinusoidal oscillator task and (Q,G) hyperparameter point (20,0.125), we trained 21 repetitions of randomly initialized networks over a range of network sizes, with 4 different learning rates, for both the spiking (B.I) and rate (B.II) models. Each blue point represents a repetition and each black point the mean. The blue lines indicate the linear regression fit, with slopes and intercepts indicated. C Mean Pearson correlation coefficient of decoders across networks for the simulations in B. The shaded area indicates the corrected sample standard deviation.


Table 2.

Minimal error achieved over (Q,G) grid for LIF network on Fourier, Ode to Joy with HDTS, and 5 Hz sinusoidal oscillator.


Table 3.

Minimal error achieved over (Q,G) grid for LIF-matched rate network on Fourier, Ode to Joy with HDTS, and 5 Hz sinusoidal oscillator.


Fig 7.

Variance dominates LIF-spiking networks while bias dominates LIF-matched rate networks in mean squared error decomposition.

A For the 5 Hz sinusoidal oscillator task, we trained networks over a grid of points in the (Q,G) hyperparameter space with 4 different learning rates for both LIF and LIF-matched rate networks. For each point in the (Q,G) space, we simulated the trained network for 100 (20s) repetitions of the sinusoidal output, then computed the cross-trial bias and variance of the network's output. Columns of heatmaps within each sub-panel, from left to right, are: the time-averaged bias squared, the time-averaged variance, and the proportion of the variance to the bias squared. The left and right panels contain the plotted values for the spiking network simulations and rate networks, respectively. B–C For a selected point (indicated by the star in A) in the (Q,G) grid, the corresponding trained network was simulated with both LIF and rate neuron models for slow and fast learning rates for 100 (20s) repetitions of the 5 Hz sinusoidal task. For 5 randomly chosen neurons, the spike times (B) for the spiking networks and filtered postsynaptic currents (C) for both networks were recorded. To counteract output time-drift, each repetition of the network output was time-aligned to the first peak of the supervisor. B Each spike time is represented by a dot, where the colour indicates the order of the spike time within each repetition of the task, illustrating their high variability. C The filtered postsynaptic currents for both the LIF and rate networks, where the shaded areas indicate the corrected sample standard deviation for both. Note that the deviation for the rate network is also displayed.
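The cross-trial decomposition described in this caption can be sketched as below. This is a minimal illustration, assuming the network outputs are already time-aligned and stacked into an array of shape `(trials, time)`; the function name `bias_variance` and the synthetic data in the usage note are hypothetical. The population variance (`ddof=0`) is used by default so that the mean squared error equals bias squared plus variance exactly; pass `ddof=1` for the corrected sample variance.

```python
import numpy as np

def bias_variance(outputs, target, ddof=0):
    """Time-averaged cross-trial bias^2 and variance of network outputs.

    outputs: array (n_trials, n_time), one time-aligned repetition per row.
    target:  array (n_time,), the supervisor over one repetition.
    Returns (bias_sq, variance, ratio) with ratio = variance / bias_sq.
    """
    outputs = np.asarray(outputs, dtype=float)
    target = np.asarray(target, dtype=float)
    mean_out = outputs.mean(axis=0)                  # cross-trial mean at each time
    bias_sq = np.mean((mean_out - target) ** 2)      # time-averaged squared bias
    variance = np.mean(outputs.var(axis=0, ddof=ddof))  # time-averaged variance
    ratio = variance / bias_sq if bias_sq > 0 else np.inf
    return float(bias_sq), float(variance), float(ratio)
```

For example, with synthetic outputs built as `target + 0.1 + noise` (noise standard deviation 0.2) over 100 trials, the function returns a squared bias near 0.01 and a variance near 0.04, so variance dominates the error in that synthetic case.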


Table 4.

Minimal squared bias and variance achieved over (Q,G) grid for LIF and LIF-matched firing-rate network on 5 Hz sinusoidal oscillator. The minimal variance was only computed over (Q,G) points that had a corresponding bias squared less than 1e-2.


Fig 8.

FORCE trained spiking decoders do not stabilize for fast learning.

Time series data for spiking LIF and LIF-matched rate networks trained on a 5 Hz sinusoidal oscillator for 10 s with a fast learning rate. A Sampled decoder elements, B Euclidean norm of decoders, C Pearson correlation across network decoders, and D log instantaneous error.


Table 5.

Parameters for Fig 3.


Table 6.

Training parameters for Figs 4 and 5.
