A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback

doi:10.1371/journal.pcbi.1000180

Figure 1

Scheme of reward-modulated STDP according to Equations 1–4.

(A) Eligibility function f_c(t), which scales the contribution of a pre/post spike pair (with the second spike at time 0) to the eligibility trace c(t) at time t. (B) Contribution of a pre-before-post spike pair (in red) and a post-before-pre spike pair (in green) to the eligibility trace c(t) (in black), which is the sum of the red and green curves. According to Equation 1, the change of the synaptic weight w is proportional to the product of c(t) and the reward signal d(t).
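The scheme in the caption can be illustrated with a minimal sketch. Equations 1–4 are not reproduced here, so the shapes and values below are assumptions made only for illustration: an alpha-shaped eligibility function f_c, a standard exponential STDP window, and the parameter values `tau_e`, `a_plus`, `a_minus`, `tau_plus`, and `tau_minus`. All function names are hypothetical. Only the overall structure follows the caption: each spike pair contributes to c(t) starting at the time of its second spike, c(t) is the sum of those contributions, and the weight change accumulates the product c(t) d(t).

```python
import math


def stdp_window(dt, a_plus=1.0, a_minus=1.05, tau_plus=0.02, tau_minus=0.02):
    """Assumed exponential STDP window; dt = t_post - t_pre in seconds."""
    if dt > 0:   # pre-before-post: positive contribution
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:   # post-before-pre: negative contribution
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


def eligibility_function(t, tau_e=0.4):
    """Assumed alpha-shaped f_c(t): zero for t < 0, peak value 1 at t = tau_e."""
    if t < 0:
        return 0.0
    return (t / tau_e) * math.exp(1.0 - t / tau_e)


def pair_contribution(t, t_pre, t_post, tau_e=0.4):
    """Contribution of one pre/post pair to c(t), timed from the second spike."""
    t_second = max(t_pre, t_post)
    return stdp_window(t_post - t_pre) * eligibility_function(t - t_second, tau_e)


def eligibility_trace(t, pairs, tau_e=0.4):
    """c(t): sum of the contributions of all (t_pre, t_post) pairs."""
    return sum(pair_contribution(t, pre, post, tau_e) for pre, post in pairs)


def weight_change(pairs, reward, t_end, dt=0.001, tau_e=0.4):
    """Euler approximation of the integral of c(t) * d(t) over [0, t_end)."""
    steps = int(round(t_end / dt))
    return sum(
        eligibility_trace(k * dt, pairs, tau_e) * reward(k * dt) * dt
        for k in range(steps)
    )


# One pre-before-post pair and one post-before-pre pair, as in panel B.
pairs = [(0.000, 0.010), (0.110, 0.100)]
trace_at_half_second = eligibility_trace(0.5, pairs)
```

With a constant positive reward, for example `weight_change([(0.0, 0.01)], lambda t: 1.0, t_end=2.0)`, a single pre-before-post pair yields a positive weight change under these assumptions, while a post-before-pre pair yields a negative one.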

doi: https://doi.org/10.1371/journal.pcbi.1000180.g001