## Figures

## Abstract

The spiking neural networks (SNNs) are the third generation of neural networks and perform remarkably well in cognitive tasks such as pattern recognition. The spike emitting and information processing mechanisms found in biological cognitive systems motivate the application of the hierarchical structure and temporal encoding mechanism in spiking neural networks, which have exhibited strong computational capability. However, the hierarchical structure and temporal encoding approach require neurons to process information serially in space and time respectively, which reduce the training efficiency significantly. For training the hierarchical SNNs, most existing methods are based on the traditional back-propagation algorithm, inheriting its drawbacks of the gradient diffusion and the sensitivity on parameters. To keep the powerful computation capability of the hierarchical structure and temporal encoding mechanism, but to overcome the low efficiency of the existing algorithms, a new training algorithm, the Normalized Spiking Error Back Propagation (NSEBP) is proposed in this paper. In the feedforward calculation, the output spike times are calculated by solving the quadratic function in the spike response model instead of detecting postsynaptic voltage states at all time points in traditional algorithms. Besides, in the feedback weight modification, the computational error is propagated to previous layers by the presynaptic spike jitter instead of the gradient decent rule, which realizes the layer-wised training. Furthermore, our algorithm investigates the mathematical relation between the weight variation and voltage error change, which makes the normalization in the weight modification applicable. Adopting these strategies, our algorithm outperforms the traditional SNN multi-layer algorithms in terms of learning efficiency and parameter sensitivity, that are also demonstrated by the comprehensive experimental results in this paper.

**Citation: **Xie X, Qu H, Liu G, Zhang M, Kurths J (2016) An Efficient Supervised Training Algorithm for Multilayer Spiking Neural Networks. PLoS ONE 11(4):
e0150329.
https://doi.org/10.1371/journal.pone.0150329

**Editor: **Gennady Cymbalyuk,
Georgia State University, UNITED STATES

**Received: **November 8, 2015; **Accepted: **February 11, 2016; **Published: ** April 4, 2016

**Copyright: ** © 2016 Xie et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **All relevant data not included in the paper are available from the UCI mechanism repository. The Iris data set and its description is available from http://archive.ics.uci.edu/ml/datasets/Iris. The Breast Cancer Wisconsin (Original) Data Set is available from http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29.

**Funding: **This work was partially supported by the National Science Foundation of China and the China Scholarship Council under Grants 61273308 and 61573081 (http://www.nsfc.gov.cn/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Increasing the level of realism in a neural simulation and improving the computational capability of artificial neural networks [1][2], the spiking neural networks employing temporal coding mechanism is introduced as the third generation of neural networks and has achieved great success in various artificial intelligence tasks [3]–[6]. Most traditional neural networks represent real-valued analog data by the firing rate of neurons, like the first generation neural networks pioneered by the McCulloch-Pitts model [7] and the second generation by the the perceptron model [8]. However, there are substantial evidences that in biological neural systems there exist fast computations that are very likely based on spike firing events [1][2][9]. To simulate these firing events, the third generation of neural networks, the SNNs transmit information by spike times instead of the firing rate, and have been proven computationally more powerful than networks with rate coding [10]–[13].

In the learning of SNNs with temporal encoding mechanism, the supervised training is an important biomimetic concept which could potentially improve the learning speed with the help of an instructor signal. Various supervised training algorithms of SNNs have been proposed by now, which can broadly be subdivided into two classes: training algorithms for single layer SNNs, and for multilayers.

The single layer training algorithms are introduced based on the gradient decent rule or learning windows. Regarding to the gradient decent rule, the Tempotron [14] is a classical algorithm employing the distance between the output neuron’s voltage and the firing threshold as the cost function and can complete training efficiently, however, it can only complete binary classification tasks. The Chronotron [15] and spike pattern association neuron algorithm [16] try to minimize the distance between the desired and actual output spike trains by the gradient descent rule, with the distance defined by the Victor and Purpura metric [17] and the van Rossum metric [18] respectively.

A lot of algorithms based on learning windows have been proposed [19] for single layer networks. Among which, the remote supervised learning method is a classical one employing both the Spike-Timing-Dependent Plasticity (STDP) window, and the anti-STDP learning window to complete training [20]. The perceptron-based spiking neuron learning rule adopts a learning window based on the postsynaptic voltage function to instruct training [21]. The Synaptic Weight Association Training (SWAT) utilizes the STDP learning window and the Bienenstock-Cooper-Munro learning rule [22] to drive learning and achieves convergence. The precise spike driven synaptic plasticity learning rule [23] combines the Windrow-Hoff rule and the learning window of postsynaptic potential. Further algorithms adopting learning windows are introduced in [24]. These training algorithms employing learning windows are often more efficient than those with the gradient descent rule. But these single layer algorithms cannot complete training when the network structure contains hidden layers. However, electrophysiology experiments on cat’s visual system and monkey striate cortex reveal that the information in biological neurons is processed hierarchically rather than by a single layer [25]–[27]. Then, training a hierarchical spiking neural network is by far the closest way to the biological system.

For multilayers learning of the SNNs, the Spike Propagation (SpikeProp) [28] is the pioneer method that defines the computational error by the distance between the actual and target firing time, and minimizes the error by the gradient descent rule. It achieves training accurately but inefficiently, and only the first spike of a neuron can be trained. Different variations of the SpikeProp, the Quick Propagation, Resilient Propagation [29] and the Multiple SpikeProp [30][31] are proposed to improve the SpikeProp’s learning performance. The Multi-layer Remote Supervised Learning Method(Multi-ReSuMe) [32] extends the ReSuMe [20] to multiple layers by the gradient decent rule, assuming that the relation between the input and output firing rates is linear. All of these existing algorithms can achieve learning, while the efficiency of them is much lower than that of the biological system [33][34], and does not meet the requirements of the real-time applications.

To solve the low efficiency problem in the multilayer training of SNNs, the Normalized Spiking Error Back-Propagation (NSEBP) is proposed in this paper, which is motivated by the selective attention mechanism of the primate visual system [35, 36] and its layer-wise feature representation method in hierarchical structures [25, 26]. Different from traditional algorithms, our algorithm only selects target spike times as attention areas and ignores the states of other times. Besides, the voltage difference is employed to evaluate training errors, and the relation between the weight variation and voltage error change is uncovered, which enables the NSEBP to adjust each synaptic efficacy accurately. Moreover, the computational error is back propagated to previous layers by presynaptic spike jitter instead of the traditional gradient decent rule, which realizes layer-wise training in our algorithm. In the feedforward calculation, the analytic solutions of the spike time are calculated in the spiking response model, instead of detecting the postsynaptic voltage states at all time points. Employing these strategies, our algorithm achieves a significant improvement in training efficiency compared with the traditional training methods.

## Learning Algorithm

In this section, a new algorithm for feed-froward multilayer spiking neural networks, the Normalized Spiking Error Back Propagation (NSEBP) is presented.

### Neuron Model

In our study, the simplified Spike Response Model (SRM_{0}) is employed because of its simplicity and effectiveness. In the SRM_{0} [37], once the *j*th spike is emitted, a fundamental voltage *ϵ*_{j} is inspired and transmitted to its postsynaptic neuron. Each postsynaptic neuron integrates the weighted sum of all presynaptic influence *ϵ*_{j} at time *t* as its voltage *u*(*t*), and emits a spike if its voltage *u*(*t*) reaches the threshold. The postsynaptic voltage *u*(*t*) is described in the following equations:
(1)
where
(2)
with the Heaviside step function
(3)
Specifically, denotes the last recent output spike of the postsynaptic neuron, *w*_{j} is the weight of the presynaptic neuron emitting the *j*th input spike, and is the refractory function to simulate the biological refractory period. Γ_{j} is a set containing the spike time emitted by all the presynaptic neurons, *u*^{ext} is the external voltage, , with denoting the *j*th firing time of the input spike train. *τ*_{1} and *τ*_{2} are constant parameters.

In our algorithm, the Post-Synaptic Potential (PSP) learning window is employed, which is represented in Eq (4), providing a relation of the weight modification and spike time deviation. Obviously, it only directs weight modification if the presynaptic neuron fires before the postsynaptic one.
(4)
where *A*_{1} is a constant set to be 1 in our study, and denotes the time distance between the current time *t* and the input time .

### The NSEBP Algorithm

In this section, the Normalized Spiking Error Back Propagation (NSEBP) is proposed. The feedforward calculation process is derived in the Theorem 1, and the feedback training process of this learning rule is described here for one postsynaptic neuron with several presynaptic neurons.

In the network with *n* layers employing the SRM_{0} model, an arbitrary postsynaptic neuron *o* has an input spike train denoting the ordered input spikes, and a target spike train . Inspired by the selective attention mechanism of the primate visual system, the NSEBP only detects and trains the voltage states at target time points for neuron *o*, and ignore states on other non-target time.

For each postsynaptic neuron *o*, instead of the traditional time error, the voltage distance between the threshold *ϑ* and the postsynaptic voltage *u*(*t*_{d}) at the target time *t*_{d} is employed as the network error in our algorithm described in Eq (5), which is trained to become zero in our algorithm:
(5)
To train the postsynaptic voltage to *ϑ*, two steps are applied to our algorithm, that are the presynaptic spike jitter to back propagate error and the weight modification to complete training of the current layer.

#### Presynaptic spike jitter.

The presynaptic spike jitter is employed to back propagate error instead of the traditional gradient descent rule. It can influence the postsynaptic neuron voltage *u*(*t*_{d}) and realize layer-wised training. To achieve the back-propagated layer-wised learning, neurons in hidden layers also require the target spike time and training error. Then, the error in Eq (5) is allocated to *n* layers by the normalized parameter *r* and back propagated by the presynaptic spike jitter. The error assigned to the current layer is
(6)
and to the previous *n* − 1 layers is
(7)
where *r* is set to 1/*n* in our algorithm. This error assignment is shown in Fig 1.

The error *err* is assigned to two parts, among which is assigned to the current layer for weight modification, and is propagated to previous (n − 1) layers.

The error is back propagated to previous *n* − 1 layers by shifting each influential presynaptic spikes (presynaptic spikes which have influence to the postsynaptic neuron *o* at the current target time *t*_{d}). The error assigned to the *j*th influential presynaptic spike is Δ*u*_{j}, which is calculated by
(8)
In which is an assign variable and calculated by
(9)
with *A*_{1} and *W*_{ind}(*s*_{j}) defined in Eq (4). *m*_{1} and *m*_{2} are the first and last indexes of the influential presynaptic spikes respectively. To achieve training of this Δ*u*_{j}, the time variation of the *j*th input spike is calculated by
(10)
with
(11) (12) (13)
where *w*_{j} is the corresponding weight of the *j*th presynaptic spike, *t*_{d} is the target spike time, *τ*_{1} and *τ*_{2} are model parameters defined in Eq (2), denotes the *j*th presynaptic spike time. If *a* ≠ 0 and *b*^{2} − 4*ac* ≥ 0 hold, is calculated by Eq (10) and the solution with the minimum absolute value is applied to our algorithm. The calculation is derived in the following theoretical derivation section.

#### Weight modification.

To train to 0, the weight modification in our algorithm for an arbitrary influential input spike *j* is calculated by
(14)
where is a parameter defined by the normalized learning window
(15)
with *ϵ*_{j}(*s*_{j}) calculated by Eq (2), and *W*_{ind}(*s*_{j}) by Eq (4).

To avoid the *ϵ*_{j}(*s*_{j}) in Eq (14) going to infinitesimal when *s*_{j} is too large, not all presynaptic spikes but only the spikes with voltage *ϵ*_{j}(*s*_{j}) > *ϑ*_{v} at *t*_{d} are trained, where *ϑ*_{v} is the voltage threshold. Solving the same mathematical equations as Theorem 2 in the following section, the time boundaries are and , as shown in Fig 2. Its corresponding spike index range is denoted by [*m*_{1}, *m*_{2}].

The voltage *ϵ*_{j}(*s*_{j}) caused by the input spike is above *ϑ*_{v} when . The voltage of the input is set to 0 at time *t* if this is not in the interval [*t*_{1}, *t*_{2}].

If there is no input spike in the range [*t*_{1}, *t*_{2}], *S* spikes in [*t*_{1}, *t*_{2}] are added randomly to the presynaptic hidden neurons with a probability. The probability *p*_{i} assigned spikes to neuron *i* is calculated by
(16)
in which, *m* is the number of presynaptic neurons, *n*_{i} is the number of spikes emitted by neuron *i*, and *n*_{i} = 0.5 if there is no spike emitted. Consequently, the fewer spikes emitted by neuron *i*, the higher probability *p*_{i} it possesses. This allocation approach not only solves the none input problem in the training, but also balances the spike distribution. These added spikes are regarded as the target time of previous hidden layers and trained in the previous layer.

## Mathematical Analysis

### Theoretical Derivation

In this section, the theoretical derivations in the feedforward and back propagation of our algorithm are presented in Theorem 1 and Theorem 2 respectively.

In traditional methods employing the temporal encoded SNNs, the output spikes of a neuron are detected serially in time, leading to an inefficient feed forward process. In our study, the output spike times are obtained by solving the voltage function instead of traversing all time points, which are derived in the following theorem.

Supposing that for a postsynaptic neuron *o*, the refractory period is set to with the constant *A*_{2} > 0, the external interference voltage *u*^{ext} = 0, and is the ordered presynaptic spike train of the *P*_{1} presynaptic input spikes, where denotes the *j*th presynaptic spikes with *w*_{j} representing its response synapse weight. *m*_{o} is the index of the first influencing spike to the postsynaptic neuron *o*, *ϑ* is the firing threshold, *τ*_{1} and *τ*_{2} are model parameters defined in Eq (2). With these definitions, the relation between the pre and postsynaptic spikes is obtained by solving the quadratic function, which is proved in the following theorem:

**Theorem 1** *In the* *SRM*_{0} *model, for each range*
*in* *T*_{pre} *with* 1 ≤ *j* ≤ *P*_{1} − 1, *assuming that* (17) (18) *If the following conditions hold*:
*the postsynaptic output spike* *t*_{out} *in the range* *is solved by* (19)

* Proof*: According to the SRM

_{0}model described in Eqs (1)–(3), for

*t*∈ [), the voltage of a postsynaptic neuron

*u*(

*t*) is (20) Since (21) at time

*t*

_{out}, we have (22) The postsynaptic neuron will fire once its voltage reaches the threshold

*ϑ*, then the postsynaptic output spike time

*t*

_{out}follows (23) According to Eq (2), (24) Thus, we have (25) Suppose and refer to Eqs (17), (18), (25) and (

*II*), (26) By (

*I*), the solutions of Eq (26) is (27) and for all presynaptic time, we have

*t*

_{i}> 0,

*z*> 0, then (28)

The result follows.

The Theorem 1 proves the relation between the pre and postsynaptic spikes, which is applied to our algorithm to improve the feedforward computation efficiency.

In the feedback process of our algorithm, the error is back propagated by the presynaptic spike jitter instead of the traditional gradient decent rule, by which, the layer-wised training is applicable to our algorithm and improves the learning efficiency of our algorithm significantly. The relation of the presynaptic spike jitter and the voltage change is investigated in the following theorem.

Supposing that Δ*u*_{j} is the voltage variation of the postsynaptic neuron *o* generated by the *j*th presynaptic spike, *t*_{d} is the current target spike time, and other variables are the same as that in Theorem 1, then the relation between the time jitter and Δ*u*_{j} is obtained by solving the quadratic function, which is proved in the following theorem:

**Theorem 2** *In the* *SRM*_{0} *model with* *τ*_{1} = 2*τ*_{2}, *if the voltage change* Δ*u*_{j} *follows* *and assuming that* (29) (30) (31) *the voltage variation* Δ*u*_{j} *of the postsynaptic neuron can be achieved by the presynaptic spike time jitter* (32)

* Proof*: Supposing that

*u*

_{j}is the postsynaptic voltage stimulated by the input spike at the target time

*t*

_{d}, and the presynaptic spike jitter makes the voltage change △

*u*

_{j}. By Eq (1), we have (33) and then (34) Let (35) and by Eqs (29)–(31) and (34) can be expressed by (36) Under the condition (

*I*), we have

*w*

_{j}≠ 0 ⇒

*a*≠ 0, and if

*w*

_{j}> 0, for condition (

*II*), (37) (38) Then the discriminant of Eq (36) is (39) Analogously, when

*w*

_{j}< 0, for condition (

*III*), (40) (41) (42) Then, under these conditions, Eq (36) has solutions (43) By the property of the logarithmic function, the spike jitter can be obtained by Eq (35) only if

*z*> 0. For

*w*

_{j}> 0, we have

*a*< 0,

*b*> 0, then (44) Under condition (

*II*), for

*w*

_{j}> 0, , we have , and then (45) (46) (47) Analogously, by condition (

*III*),

*w*

_{j}< 0, (48) Then, under these conditions, is solved by Eqs (35) and (43) with (49)

The result follows.

Specially, when Δ*u*_{j} exceeds the boundary of Theorem 2 (*II*) or (*III*) in our algorithm, it is set to the corresponding feasible boundary in the same direction of the condition.

### Convergence Analysis

In this section, the convergence of our algorithm is investigated. To guarantee the convergence of the traditional and our algorithms employing the SRM_{0} model, some conditions need to be met to select target time points. These conditions for traditional algorithms and our algorithm are studied in Theorem 3 (1) and Theorem 3 (2) respectively by analyzing the voltage function and the spiking firing conditions.

**Theorem 3** *In the network under* *n* *layers employing the* *SRM*_{0} *model described in* *Eqs* (1)–(3) *with* *τ*_{1} = 2*τ*_{2}, *we have*:

*(1) To guarantee the convergence of the traditional algorithms based on the precise spike time mechanism, a time point*
*is available as target time only if there exist input spikes in*
.

*(2) To guarantee the convergence of our algorithm, a time points*
*is available as target time only if there exist input spikes in*
. *When the strategy of* [*t*_{1}, *t*_{2}] *described in* Fig 2 *is applied to our algorithm, this scope is* .

* Proof*: (1) For an arbitrary

*j*th input spike , by Eq (2), (50) Taking the partial derivatives, (51) In the traditional precise time mechanism, it is when emitting spikes, then we have (52) (53) by

*τ*

_{1}= 2

*τ*

_{2}, (54) (55) Then in traditional algorithms, the input spike can only inspire output spikes in the scope for its postsynaptic neurons. Analogously, the output spikes caused by are in after

*n*layers. Consequently, traditional algorithms cannot get convergent at if there is no input spike in . Then, to guarantee the convergence of the traditional algorithms based on the precise spike time mechanism, a time point is available as target time only if there exist input spikes in .

(2) Our algorithm employs the primate selective attention mechanism instead of the precise spike time rule, then there is no requirement of . When the local influence shown in Fig 2 is not applied to our algorithm, all presynaptic spikes which have an influence on can be trained to complete learning. Consequently, the time scope for input spikes is .

If the local influence shown in Fig 2 is applied to our algorithm, the time scope of output spikes generated by the input after several layers is shown in Fig 3. It is obvious that in layer 1, the time scope is [*t*_{1}, *t*_{2}], and in layer 2, the earliest time to fire is *t*_{1}′ = *t*_{1} + *t*_{1}, and the latest firing time is *t*_{2}′ = *t*_{2} + *t*_{2}. Then, for layer *n* − 1, we have *t*_{d} satisfying Eq (56) to complete training:
(56)
Obviously, if there is no input in , this can not be trained convergently. Then, in this condition, the time points is available as target time only if there exist input spikes in .

The output spike scopes generated by the input spike to each layer.

The results follow.

Theorem 3 provides conditions for encoding target times, which guarantees the convergence of different algorithms. It also indicates that the convergent condition of our algorithm is relaxed compared with traditional algorithms. When target spikes are selected following these conditions, different algorithms have different convergence properties and speed, which depend on their training mechanisms. In our algorithm, all qualified target times can be trained successfully and efficiently.

At the *m*th target time , the convergence of our algorithm is proved in the Theorem 4 by analyzing the postsynaptic voltage. Since the interference from training other target spikes or patterns varies with different network status, the following theorem proves the convergence of our algorithm ignoring this interference. The convergent situations with various interference are investigated in the following simulations sections.

**Theorem 4** *In the* *SRM*_{0} *model described in* *Eqs* (1)–(3) *with* *τ*_{1} = 2*τ*_{2}, *the postsynaptic voltage of our algorithm at an arbitrary target time* *is convergent to the threshold* *ϑ* *if the condition in Theorem 3 (2) holds, ignoring the interference from training other target spikes or patterns*.

* Proof*: There are two cases in the training: (1) the presynaptic hidden neurons emit qualified spikes.

Since our algorithm are trained layer-wisely, each layer shares the same training process and becomes convergent in the same way. Then, the convergence of our algorithm is proved in one layer with a postsynaptic neuron and several presynaptic neurons. Suppose that the voltage of the postsynaptic neuron at is calculated by Eq (1), and the voltage variations generated by the presynaptic weight modification and spike jitter are Δ*u*_{w} and Δ*u*_{t} respectively. For an arbitrary postsynaptic neuron *o*, and to the *p*th presynaptic spike, the voltage variation generated by the weight modification of this spike is , which is calculated by
(57)
where Δ*w*_{p} is the variation of its weight *w*_{p}. Since ,
(58)
For all presynaptic inputs, the voltage variation generated by the weight modification is
(59)
in which is the *p*th presynaptic spike time, and *m*_{1} and *m*_{2} are the first and last indexes of these presynaptic spikes. By Eqs (5), (6) and (14),
(60)
By Eq (15), , then
(61)
If the conditions in Theorem 2 hold, according to Eq (8), the calculated by Eq (32) is
(62)
which makes the postsynaptic voltage variation generated by the presynaptic spike jitter become , with and defined in Eqs (7) and (9) respectively. Under these conditions, for all presynaptic spikes, the voltage variation inspired by the presynaptic spike jitter Δ*u*_{t} is
(63)
By Eq (9), , then according to Eqs (5) and (7),
(64)
where *r* is a parameter defined in Eq (7). Consequently, the whole postsynaptic voltage variation Δ*u* generated by both presynaptic spike jitter and weight modification is
(65)
and then
(66)
If the value of exceeds the boundary in Theorem 2, the corresponding feasible boundary value is set to , and the solution is obtained according to Eq (32). It is obvious that the boundary value has the same training direction with Δ*u*, which can make *err* close to 0. Then our algorithm will be convergent after several leaning epochs at .

(2) If there is no qualified input spike in the presynaptic hidden layer, our algorithm adds spikes randomly with probability calculated by Eq (16), after which all weight modifications and spike jitters are the same as case (1), and our algorithm can get convergence.

The layer-wise training is employed in our algorithm, by which each layer shares the same training process and becomes convergent in the same way. In this analysis, the interference of other target spike trains or patterns are ignored. With the influence, our algorithm requires several more epochs to offset this interference and complete training.

The results follow.

### Computational Complexity

In this section, the computational time complexity of our algorithm is studied and compared with two traditional algorithms, the SpikeProp [28] and Multi-ReSuMe [32]. Before this, the detailed pseudo-codes of the feedforward and feedback processes of our method are listed.

#### Feedforward calculation.

According to the previous description, the pseudo-code of the feedforward calculation of our method is listed below. Since in the multilayer networks, each layer shares the same feedforward calculation process, only the calculation of one layer is described in this pseudo-code.

**The Feedforward Calculation of Our Algorithm**

**Definition**:

*T*_{pre}: the set of presynaptic spikes, which contains spikes emitted by all presynaptic neurons . *T*_{pre} is sorted and has no duplicate numbers.

**Initialization**:

The weight matrix *W* is initialized randomly.

**Feedforward calculation**:

**For** each postsynaptic neuron:

**For** each presynaptic spike interval to with 1 < *j* < *P* − 1:

For all presynaptic spikes before , calculate parameters *a* and *b* by Eqs (17) and (18).

**If** *a* and *b* meet the conditions in Theorem 1:

Calculate the output spike time in this scope by Eq (19) and add it to the output spike train of the postsynaptic neuron.

**End If**

**End For**

**End For**

Supposing that there are *M* presynaptic neurons, *N* postsynaptic neurons, *P* input spikes of all these presynaptic neurons, and the time length is *T*. As described in the pseudo-code, our algorithm detects each input spike scope and calculates parameters *a* and *b* by all of these *P* input spikes in the worst case, then the time complexity of our algorithm for one layer is *O*(*NP*^{2}), which also reveals the number of operations in our method.

For traditional feedforward calculation method, all discrete time points in *T* are detected instead of the input spike scopes of *P*, then the second loop in the pseudo-code above is replaced by the time scope in *T* (supposing that the time interval is 1ms). For each time scope, it calculates the postsynaptic voltage by these *P* input spikes and determines whether the voltage is greater than the threshold. In this way, the time complexity of the traditional method in one layer is *O*(*NTP*). Since a neuron can only emits one spike in a time points, we have *P* ≤ *T*. Consequently, the time complexity of our method in the feedforward calculation is less than that of the traditional approach.

#### Feedback modification.

Similar to the feedforward calculation, the feedback weight modifications in each layer and each output neuron of our algorithm have the same training process. Then in this part, only the training process of one layer and one output neuron is listed in the following pseudo-code.

**The Feedback Modification of Our Algorithm**

**Definition**:

*T*_{pre}: the set of presynaptic spikes, which contains spikes emitted by all presynaptic neurons .

*T*_{d}: the set of target output spikes, which contains all target spikes of the postsynaptic neuron .

**Initialization**:

The weight matrix *W* is initialized randomly.

**Feedback modification**:

**For** each target spike time in *T*_{d}:

Calculate the weighted sum of all input spikes as the postsynaptic voltage *u*(*t*_{d}), and calculate the error *err* by Eq (5).

**If** *err* ≠ 0

**Step1**: Assign *err* to each layer by Eqs (6) and (7).

**Step2**: Calculate the presynaptic spike variation in the current layer by Eq (10).

**Step3**: Adjust all presynaptic weights in this layer by Eq (14).

**End If**

**End For**

Assuming that there are *M* presynaptic neurons, *N* postsynaptic neurons, and the number of target spikes for all postsynaptic neurons is *D*, which is equal to *D*_{1} + *D*_{2} + … + *D*_{N}, with *D*_{i} represents the number of target spikes of the *i*th postsynaptic neuron. The number of input spikes is *P*, and the time length is *T*. According to the pseudo-code for the feedback modification of our algorithm, the time complexity for one layer is *O*(*DP*), which also reveals the number of operations in our algorithm.

In traditional algorithms, like the SpikeProp and Multi-ReSuMe, the postsynaptic states at all time points of *T* instead of target intervals *T*_{d} are detected and their corresponding weights are modified by a given condition. Then the time complexity for most traditional algorithms like the SpikeProp and Mullti-ReSuMe in one layer is *O*(*TP*). Since *D* < *T*, the number of operations in our algorithm is less than traditional methods.

## Training Performance

In this section, the training performance of our algorithm is investigated and compared with two classical algorithms, the SpikeProp [28] and Multi-ReSuMe [32].

A spiking network structure employing an input layer with 50 neurons, a hidden layer with 100 spiking neurons, and an output neuron is devised in our simulations, which is shown in Fig 4. The training of the multilayer neural network consists of two steps, the feedforward calculation and feedback weight modification. The efficiency of the both two steps is studied in the following parts.

There are 50 input neurons, 100 hidden neurons, and one output neuron.

### Feedforward Calculation

The feedforward calculation is an important step before learning, it computes the output spikes from the input ones. In this subsection, two simulations are conducted to investigate the computational performance of our proposed method described in Theorem 1 compared with the traditional precise time calculation method. Specifically, the network structure is shown in Fig 4, and these input output spikes are generated by a homogeneous Poisson process.

The first simulation is carried out to explore the feedforward calculation performance in different time lengths from 200 ms to 2800 ms, and for each input neuron, one spike generated by a homogeneous Poisson process is emitted. To evaluate the similarity of the output spike trains calculated by our algorithm and traditional method quantitatively, the correlation-based measure *C* [38] is employed, with
where *v*_{1} **⋅** *v*_{2} is the inner product, and |*v*_{1}|, |*v*_{2}| are the Euclidean norms of *v*_{1} and *v*_{2} respectively. The *v*_{1} and *v*_{2} are vectors obtained by the convolution of the two spike trains using a Gaussian filter:
where *N*_{i} is the number of spikes in the test spike train, and is the *m*th spike in it. *σ* is the standard deviation of this Gaussian filter which is set to be 1 in our study. Generally, the measure *C* equals to 1 for identical spike trains and decrease towards zero for loosely correlated spike trains.

The simulation results are shown in Fig 5A, which indicate that our proposed method has the same output spike trains as the traditional method, but achieves higher efficiency than it, because our method detects only time intervals of the input spikes instead of all time points in the traditional method.

A: Simulation results on different time lengths ranging from 200 ms to 2800 ms. B: Simulation results on different input firing rates ranging from 10 Hz to 1100 Hz.

In the second simulation, the performance of our proposed method is tested with the input firing rate of a homogeneous Poisson process ranging from 100 Hz to 1100 Hz with time length fixed to 1000 ms. Obviously, the higher the input firing rate, the higher the input densities, and when the firing rate is higher than 1000 Hz, the density of input spikes and all time points is similar.

Simulation results shown in Fig 5B indicate that our method has the same output spike trains as the traditional ones, and its computational time is growing with the increase of the input spike density. However, our method is still a little more efficient than the traditional method even if the input spike density is similar to that of all time intervals with a rate above 1000 Hz.

### Feedback Weight Modification

In this section, the training performance of our algorithm is investigated and compared with the traditional classical multi-layer algorithms, the Spikeprop and Multi-ReSuMe. The first simulation is devised to study the learning efficiency at different time lengths of spike trains, in which there are 50 input neurons, 100 hidden neurons, and one output neuron with a network structure depicted in Fig 4. The input spike train of each input neuron is generated by a homogeneous Poisson process with *r* = 10 Hz, ranging the time length from 200 ms to 2800 ms. The output neuron is desired to emit only one spike in these time lengths because the SpikeProp cannot complete training for multiple target spikes. For fast convergence of traditional algorithms, the target time *t*_{d} is set to *t*_{o} + 5, where *t*_{o} is the output firing time of the first epoch. Similar to the previous simulation, *C* is employed here to measure the accuracy.

The comparison results are shown in Fig 6A, in which the upper sub-figure depicts the learning accuracy of these three algorithms. It illustrates that in this simulation, our algorithm has a similar accuracy as traditional ones. The middle sub-figure of Fig 6A shows the learning efficiency of the SpikeProp and the Multi-ReSuMe, and the efficiency of our algorithm is displayed independently in the below sub-figure because the magnitude of the learning epochs and learning time of our algorithm is not the same as that of the traditional algorithms. The comparison of these two sub-figure denotes that our algorithm requires less learning epochs and learning time than the SpikeProp and Multi-ReSuMe in various situations.

A: Simulation results on different time lengths fixing the input spike rate to 10 Hz. B: Simulation results on different input firing rates with the time length 500 ms.

The second simulation is conducted to test the learning efficiency of our algorithm under different firing rates, in which the input and output spike trains share the same time length of 500 ms, and the input spike train of each input neuron is generated by a homogeneous Poisson process ranging from 10 Hz to 100 Hz.

The simulation results are shown in Fig 6B, where the upper sub-figure shows the accuracy of these three algorithms, the middle and the below ones depict the learning efficiency of traditional algorithms and our algorithm respectively. Similar with the previous simulation, our algorithm achieves an approximate accuracy with the SpikeProp and Multi-ReSuMe, and improves the training efficiency significantly both in training epochs and training time.

To further explore the learning efficiency of these algorithms, the training time of one epoch is tested in various firing rates and time lengths. Firstly, the time length is fixed to 500 ms, and each input neuron emits a spike train generated by a homogeneous Poisson process with firing rate ranging from 10 Hz to 100 Hz, and the output neuron emits only one spike.

The simulation results shown in Table 1 indicate that the higher the firing rate, the more time required for these three algorithms to complete one training epoch, since more input spikes required to be trained. Besides, our algorithm consumes less time than traditional ones in various firing rates.

Secondly, in order to verify the effect of the time length on the training efficiency, another simulation is carried out where each input neuron emits two spikes and the output neuron emits one spike generated by a homogeneous Poisson process, and the maximum time length varies form 100 ms to 1000 ms.

The simulation results are shown in Table 2, which denotes that in various time lengths, our algorithm has the similar running time for training one epoch, while the time of the SpikeProp and Multi-ReSuMe increases obviously. This reveals the advantage of the selective attention mechanism which enables our NSEBP to concentrate attention on target contents and does not have to scan all time points as traditional algorithms do. It also indicates that the running time of NSEBP has no direct relation to the time length. These simulations in this section demonstrate that our algorithm achieves a significant improvement in efficiency compared with traditional algorithms.

## Non-linear Spike Pattern Classification

### The XOR Benchmark

In this section, we perform experiments with the NSEBP on a classical example of a non-linear problem, the XOR benchmark to investigate its classification capability and the influence of different parameters. The network architecture shown in Fig 4 is employed in this section, with 4 input neurons, 10 hidden neurons and one output neuron.

The encoded method and the generation of the input spike trains are shown in Fig 7. It depicts that the input 0 and 1 are encoded randomly, which is set to the spike time [1, 2] and [3, 4] respectively in this simulation. Then the four input patterns {0, 0}, {0, 1}, {1, 0}, {1, 1} are encoded by two segments *S*_{1} and *S*_{2} copying the encoded results of 0 and 1. The classification objects are the input spike patterns, among which the input spike trains corresponding to {0, 0} and {1, 1} are one class *C*_{1}, and the input spike trains corresponding to {0, 1}, {1, 0} are the other class *C*_{2}. The desired outputs of the output neuron corresponding to *C*_{1} and *C*_{2} are set to 10 ms and 15 ms respectively satisfying the convergent condition in Theorem 3.

Generation of the input spike trains for the classification task in the XOR problem.

Our algorithm is applied to the feed-forward network described above with *ϑ* = 1, *τ*_{1} = 5 and *r* = 0.5. Different from traditional multi-layer networks in [28, 32], our algorithm requires none sub-connections, which reduces the number of weight modification. With these parameters, our algorithm can complete training efficiently in 15 learning epochs and achieve accuracy 1 in various number of hidden neurons, as shown Fig 8. This training efficiency is higher than traditional algorithms that is at least 63 epochs in Multi-ReSuMe [32] and 250 cycles in SpikeProp [28]. In the following we systematically vary the parameters of our algorithm and investigate their influence.

The convergent process of our algorithm with the number of hidden neurons 5, 10, 15. All of these simulations achieve accuracy 1.

### The Parameters

In this part, we explore the influence of the parameter *τ*_{1}, the number of hidden neurons, the parameter *r* defined in Eq (6), and the threshold *ϑ* on the convergent epochs. 50 simulations are carried out and the average learning epoch is obtained.

Fig 9(a) shows the convergent epochs for different values of the threshold *ϑ*, with the number of hidden neurons fixed to 10, *r* = 0.5, and *τ*_{1} = 5. It suggests that the convergence of our algorithm is insensitive to the threshold, and it can complete training in 8 epochs in various situations.

Convergent epochs with various values of parameter *ϑ* (a), number of hidden neurons (b), r (c), and *τ*_{1} (d). All of these simulations achieve accuracy 1.

Fig 9(b) depicts the convergent epoch with different numbers of hidden neurons, which indicates that in the beginning, more neurons in the hidden layer lead to less learning epochs, while when it above 30, this change is not obvious. This is mainly because more hidden neurons make more sparse representation of the input patterns in the hidden layer, which are easier to be trained because there is less interference between different patterns. Different amount of information requires different numbers of hidden neurons for a sparse representation, and if the number of hidden neurons is large enough, the variation of the convergent epoch is not apparent.

Fig 9(c) displays the convergent epoch with different *r* defined in Eq (6), which determines the proportion of the error back-propagated to the previous layers. The simulation results demonstrate that the convergence of our algorithm has no noticeable relationship with *r*. To balance the load of each layer, we suggest *r* = 1/*n* when there are *n* layers required to be trained in our algorithm. In real world applications, *r* can be set to different values according to different requirements. Fig 9(d) shows the convergent epoch for different *τ*_{1}, which indicates that our algorithm can achieve rapid convergence in various cases, and different values of *τ*_{1} have some influence on the convergent speed, but not obvious.

Simulations in this section demonstrate that our algorithm can complete non-linear classification efficiently. Besides, simulation results shown in Fig 9 indicate that our algorithm is not sensitive to various parameters, which makes our algorithm more convenient to be applied to various applications.

## Classification on the UCI Datasets

In this section, we apply NSEBP to classify both the Iris and Breast Cancer Wisconsin (BCW) datasets of the UCI [39] to investigate the capability of our algorithm over classification tasks.

### Iris Dataset

The Iris dataset is firstly applied to benchmark our algorithm. It contains three classes, each with 50 samples and refers to a type of the iris plant: Iris Setosa (class 1), Iris Versicolour (class 2), and Iris Virginica (class 3) [40]. Each sample contains four attributes: sepal length (feature 1), sepal width (feature 2), petal length (feature 3), and petal width (feature 4).

To make the difference between the data apparent, the data of each feature are mapped into a high dimensional space using the population time encoding method [41]. There are 12 uniforming distributed Gaussian receptive fields in [0, 1], which distribute an input variable over 12 input neurons which are shown in Fig 10.

The input variable is normalized to [0, 1], and a non-firing zone is defined to avoid spikes in later time. Every no firing neuron has code −1. For instance, 0.35 is encoded to a spike train of 12 neurons: (−1, −1, 14, 200, −1, 119, 7, −1, −1, −1, −1, −1).

The network structure devised for this classification task is shown in Fig 11, in the iris data set, the input layer has 48 neurons with each feature 12 inputs. The hidden layer has 4 neurons, each possesses local connections with 12 neurons of the input layer that represent one feature. In the training period, only synaptic weights from input to hidden neurons are adjusted, and the output neuron has weight 1 which is applied to decision making.

Network structure consisting of 12 ∗ *F* input neurons, *F* hidden neurons and one output neuron, where *F* is the number of features.

In this application, the *i*th sample of class *c* has the input spike train for four features and a target spike train which is obtained by , with parameter *τ*_{2} defined in Eq (2). The local connections in our network enable each feature to train an independent sub-network with 12 synapses. Since samples of one class have similar spike trains, encoding results reveal that there are only few different target time trains for each feature. As in the previous simulations, our algorithm chooses the class with minor voltage error.

Table 3 compares the classification accuracy and convergent epoch of the NSEBP with four classical neural network classifier: Multi-ReSuMe [32], Spikeprop [28], MatlabBP, SWAT [22] for the Iris dataset on the training set and testing set. The classifier of the Multi-ReSuMe and Spikeprop are conducted with three layer SNN structures proposed in [32] and [28] respectively. The SWAT employs a SNN architecture of 13 input neurons with each connects to 16 neurons in middle layer, and the squared cosine encoding method is employed [22]. The simulation results are shown in Table 3.

The comparison results indicate that our algorithm achieves comparable or even higher accuracy compared with the traditional classifiers, while our algorithm only requires 18 epochs to complete training, instead of 2.6 ⋅ 10^{6} epochs for the MatlabBP, 1000 epochs for Spikeprop, 500 epochs for the SWAT, and 174 for multi-RuSuMe. The simulation results prove that our algorithm is the most efficient one, and outperforms the compared neural networks methods significantly.

### Breast Cancer Wisconsin Dataset

The two-class Breast Cancer Wisconsin (BCW) dataset is also applied to analyze our algorithm. This dataset contains 699 samples, while 16 samples are abandoned because of missing data. Each sample has nine features obtained from a digitized image of a fine needle aspirate (FNA) of a breast mass [42].

The same network structure and training settings as these in the Iris classification simulations are employed here. There are nine features instead of four, then there are 108 input neurons and 9 hidden neurons in the network structure depicted in Fig 11.

Table 4 compares the accuracy and efficiency of the NSEBP against the existing algorithms for the BCW dataset. It shows that the test data accuracy of the NSEBP is comparable to that of the other approaches, while the NSEBP only requires 16 epochs for complete training, instead of 1500 epochs for the Spikeprop, 500 epochs for SWAT, and 9.2 ⋅ 10^{6} epochs for MatlabBP. Then, the training efficiency of our algorithm is improved significantly compared with these classical neural network algorithms.

Simulation results in this section demonstrate that our algorithm achieves a higher efficiency and even a higher learning accuracy than the traditional neural network methods in the classification tasks. The selective mechanism and the presynaptic spike jitter adopted in our algorithm make the efficiency of the NSEBP to be independent with the length of the spike train, and overcome the drawbacks of low efficiency in traditional back-propagation methods. With this efficiency, the training of SNNs can meet the requirement of real world applications.

## Conclusion

In this paper, an efficient multi-layer supervised learning algorithm, the NSEBP, is proposed for spiking neural networks. The accurate feedforward calculation and weight modification employing the normalized PSP learning window enables our algorithm to achieve a rapid convergence. Besides, motivated by the selective attention mechanism of the primate visual system, our algorithm only focuses on the main contents in the target spike trains and ignores neuron states at the un-target ones, which makes our algorithm to achieve a significant improvement in efficiency of training one epoch. Simulation results demonstrate that our algorithm outperforms traditional learning algorithms in learning efficiency, and is not sensitive to parameters.

The classification results on the UCI data sets indicate that the generalization ability of our algorithm is a little better than the traditional backpropagation method, and similar to the SpikeProp, but lower than the SWAT. It means that our algorithm does not make great contribution to the over fitting problem. However, the traditional methods to improve the training generalization ability can also be applied to our algorithm, such as employing an optimized network structure and better decision-making method, or better sample validate methods, that will be studied in the future work.

Our algorithm is derived from the SRM_{0} model, but the same derivation process is feasible to other models when the voltage *u* can be expressed by an equation of time *t* and can be transformed to a quadratic function by the substitute method. Besides, the proposed feed forward calculation method can be applied to the existing algorithms to improve their learning performance. Consequently, employing these training strategies, the SNNs can be applied efficiently to various applications with multilayer network structure and arbitrary real-valued analog inputs.

## Supporting Information

### S1 Table. Experiment Data for Feedforward Calculation Performance.

https://doi.org/10.1371/journal.pone.0150329.s001

(XLSX)

### S2 Table. Experiment Data for Training Performance with Different Time Lengths.

https://doi.org/10.1371/journal.pone.0150329.s002

(XLSX)

### S3 Table. Experiment Data for Training Performance with Different Firing Rates.

https://doi.org/10.1371/journal.pone.0150329.s003

(XLSX)

### S4 Table. Experiment Data for Training time of one epoch.

https://doi.org/10.1371/journal.pone.0150329.s004

(XLSX)

### S5 Table. Experiment Data for The Convergent Process of Our Algorithm with Different Number of Hidden Neurons.

https://doi.org/10.1371/journal.pone.0150329.s005

(XLSX)

### S6 Table. Experiment Data for The Convergent Epochs with Different Parameters.

https://doi.org/10.1371/journal.pone.0150329.s006

(XLSX)

### S8 Table. Experiment Data for Breast Cancer Winsconsin Dataset.

https://doi.org/10.1371/journal.pone.0150329.s008

(XLSX)

## Acknowledgments

This work was partially supported by the National Science Foundation of China under Grants 61273308 and 61573081, and the China Scholarship Council.

## Author Contributions

Conceived and designed the experiments: XX HQ. Performed the experiments: XX MZ. Analyzed the data: XX GL. Contributed reagents/materials/analysis tools: XX MZ. Wrote the paper: XX JK.

## References

- 1. Theunissen F, Miller JP. Temporal encoding in nervous systems: a rigorous definition. Journal of computational neuroscience. 1995; 2: 149–162. pmid:8521284
- 2. Rullen R V, Guyonneau R, Thorpe SJ. Spike times make sense. Trends in Neurosciences. 2005; 28: 1–4.
- 3. Hu J, Tang H, Tan KC, Li H, Shi L. A Spike-Timing-Based Integrated Model for Pattern Recognition. Neural Computation. 2013; 25: 450–472. pmid:23148414
- 4. O’Brien MJ, Srinivasa NA. Spiking Neural Model for Stable Reinforcement of Synapses Based on Multiple Distal Rewards. Neural Computation. 2013; 25:123–156. pmid:23020112
- 5. Naveros F, Luque NR, Garrido JA. A Spiking Neural Simulator Integrating Event-Driven and Time-Driven Computation Schemes Using Parallel CPU-GPU Co-Processing: A Case Study. IEEE Transactions on Neural Networks and Learning Systems. 2015; 26: 1567–1574. pmid:25167556
- 6. Zhang Z, Wu QX. Wavelet transform and texture recognition based on spiking neural network for visual images. Neurocomputing. 2015; 151: 985–995.
- 7. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics. 1943; 5: 115–133.
- 8. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review. 1958; 65: 386. pmid:13602029
- 9. Tiesinga P, Fellous JM, Sejnowski TJ. Regulation of spike timing in visual cortical circuits. Nature reviews neuroscience. 2008; 9: 97–107. pmid:18200026
- 10. Mehta MR, Lee AK. Role of experience and oscillations in transforming a rate code into a temporal code. Nature. 2002; 417: 741–746. pmid:12066185
- 11. Rullen RV, Thorpe SJ. Rate coding versus temporal order coding: What the retinal ganglion cells tell the visual cortex. Neural Computation. 2001; 13: 1255–1283. pmid:11387046
- 12. Bohte S.M. The Evidence for Neural Information Processing with Precise Spike-times: A Survey. Natural Computation. 2004; 3: 195–206.
- 13. Benchenanel K, Peyrachel A, Khamassi M, Tierney PL, Gioanni Y, Battaglia FP, et al. Coherent theta oscillations and reorganization of spike timing in the hippocampal-prefrontal network upon learning. Neuron. 2010; 66: 921–936.
- 14. Gütig R, Sompolinsky H. The tempotron: a neuron that learns spike timing-based decisions. Nature neuroscience. 2006; 9: 420–428. pmid:16474393
- 15. Florian RV. The chronotron: a neuron that learns to fire temporally precise spike patterns. Plos one. 2012; 7: e40233. pmid:22879876
- 16. Mohemmed A, Schliebs S, Matsuda S, Kasabov N. Span: Spike pattern association neuron for learning spatio-temporal spike patterns. International Journal of Neural Systems. 2012; 22: 1250012 pmid:22830962
- 17. Victor JD, Purpura KP. Metric-space analysis of spike trains: theory, algorithms and application. Network: computation in neural systems. 1997; 8: 127–164
- 18. van Rossum MC. A novel spike distance. Neural Computation. 2001; 13: 751–763. pmid:11255567
- 19. Masquelier T, Guyonneau R, Thorpe SJ. Competitive STDP-based spike pattern learning. Neural computation. 2009; 21: 1259–1276. pmid:19718815
- 20. Ponulak F, Kasinski A. Supervised learning in spiking neural networks with ReSuMe: sequence learning, classification, and spike shifting. Neural Computation. 2010; 22: 467–510. pmid:19842989
- 21. Xu Y, Zeng X, Zhong S. A new supervised learning algorithm for spiking neurons. Neural computation. 2013; 25: 1472–1511. pmid:23517101
- 22. Wade JJ, McDaid LJ, Santos J, Sayers HM. SWAT: a spiking neural network training algorithm for classification problems. IEEE Transactions on Neural Networks. 2010; 21: 1817–1830. pmid:20876015
- 23. Yu Q, Tang H, Tan KC, Li H. Precise-spike-driven synaptic plasticity: Learning hetero-association of spatiotemporal spike patterns. Plos one. 2013; 8: e78318. pmid:24223789
- 24. Ponulak F, Kasiński A. Introduction to spiking neural networks: Information processing, learning and applications. Acta neurobiologiae experimentalis. 2010; 71: 409–433.
- 25. Hubel DH, Wiesel TN. Receptive fields and functional architecture of monkey striate cortex. The Journal of physiology. 1968; 195: 215–243. pmid:4966457
- 26. Hubel DH, Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of physiology. 1962; 160: 106–154 pmid:14449617
- 27. Song W, Kerr CC, Lytton WW, Francis JT. Cortical plasticity induced by spike-triggered microstimulation in primate somatosensory cortex. Plos one. 2013; 8(3): e57453. pmid:23472086
- 28. Bohte SM, Kok JN, La Poutre H. Error-backpropagation in temporally encoded networks of spiking neurons. Neurocomputing. 2002; 48: 17–37.
- 29. McKennoch S, Liu D, Bushnell LG. Fast Modifications of the SpikeProp Algorithm. IEEE International Joint Conference on Neural Networks, IEEE. 2006; 48: 3970–397
- 30. Ghosh-Dastidara S, Adeli H. A new supervised learning algorithm for multiple spiking neural networks with application in epilepsy and seizure detection. Neural Networks. 2009; 22: 1419–1431.
- 31. Xu Y, Zeng X, Han L, Yang J. A supervised multi-spike learning algorithm based on gradient descent for spiking neural networks. Neural Networks. 2013; 43: 99–113. pmid:23500504
- 32. Sporea I, Grüning A. Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation. 2013; 25: 473–509. pmid:23148411
- 33.
Thorpe SJ, Imbert M. Biological constraints on connectionist modelling. Connectionism in perspective. 1989; 63–92.
- 34.
Fink C G, Zochowski M, Booth V. Neural network modulation, dynamics, and plasticity. Global Conference on Signal and Information Processing (GlobalSIP), IEEE, 2013; 843–846. https://doi.org/10.1109/GlobalSIP.2013.6737023
- 35. Desimone R, Ungerleider LG. Neural mechanisms of visual processing in monkeys. Handbook of neuropsychology. 1989; 2: 267–299
- 36. Ungerleider SKALG. Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience. 2000; 23: 315–341. pmid:10845067
- 37.
Gerstner W, Kistler WM. Spiking Nerual Models: Single Neurons, Populations, Plasticity. 1st ed. Cambridge: Cambridge University Press. 2002.
- 38. Schreiber S, Fellous JM. A new correlation-based measure of spike timing reliability. Neurocomputing. 2003; 52:925–931. pmid:20740049
- 39.
Bache K, Lichman M. UCI Machine Learning Repository; 2013. Accessed: http://archive.ics.uci.edu/ml.
- 40. Fisher RA. The use of multiple measurements in taxonomic problems. Annals of eugenics. 1936; 7: 179–188
- 41. Snippe HP. Parameter extraction from population codes: A critical assessment. Neural Computation. 1996; 8: 511–529. pmid:8868565
- 42. Wolberg WH, Mangasarian OL. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the national academy of sciences. 1990; 87: 9193–9196.