A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback

doi:10.1371/journal.pcbi.1000180

Figure 8

Test of the validity of the analytically derived conditions 13–15 on the relationship between parameters for successful learning with reward-modulated STDP.

Predicted average weight changes (black bars), calculated from Equation 22, match the actual average weight changes in computer simulations (gray bars) in both sign and magnitude, for six experiments with different parameter settings (see Table 1). (A) Weight changes for synapses with . (B) Weight changes for synapses with . Four cases where constraints 13–15 are not fulfilled are shaded in light gray. In all four of these cases, the weights move in the opposite direction, i.e., a direction that decreases rewards.

doi: https://doi.org/10.1371/journal.pcbi.1000180.g008