Uncertainty–guided learning with scaled prediction errors in the basal ganglia

doi:10.1371/journal.pcbi.1009816

Fig 2

Reward prediction performance of the RW, SPE and Kalman filter models.

A The first 200 trials of reward prediction for the RW learner (upper row, orange) and the SPE learner (lower row, blue). The true value (grey line), the observed rewards (grey dots) and the learner’s estimate (colored line) are shown as a function of trial number. Columns correspond to selected levels of observation noise (σ = 1, 5, 15). B Learning performance averaged over trials. We show the logarithm of the mean squared difference between the mean of the reward distribution and the learner’s prediction of that mean, as a function of the observation noise σ. Orange lines correspond to RW learners, the blue line corresponds to an SPE learner parametrized with α_m = 1 and α_s = 0.01, and the green line corresponds to a Kalman filter parametrized with the true underlying process and observation noise parameters. The different shades of orange correspond to different learning rates, as indicated by the color bar.

doi: https://doi.org/10.1371/journal.pcbi.1009816.g002
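
The performance measure in panel B can be illustrated for the RW learner with a minimal sketch. Several choices here are assumptions and are not taken from the caption: the reward mean is assumed to follow a Gaussian random walk (suggested by the mention of process noise, but not specified), the names `simulate_rewards`, `rw_predictions`, `log_mse` and the parameter `sigma_proc` are hypothetical, and the natural logarithm is used because the caption does not state the base. The SPE learner and the Kalman filter are not sketched.

```python
import numpy as np


def simulate_rewards(n_trials, sigma_obs, sigma_proc=1.0, seed=0):
    """Drifting reward mean plus noisy observations.

    ASSUMPTION: the mean follows a Gaussian random walk with step
    standard deviation sigma_proc; the caption does not specify this.
    """
    rng = np.random.default_rng(seed)
    true_mean = np.cumsum(rng.normal(0.0, sigma_proc, n_trials))
    rewards = true_mean + rng.normal(0.0, sigma_obs, n_trials)
    return true_mean, rewards


def rw_predictions(rewards, alpha, m0=0.0):
    """Rescorla-Wagner estimate: m <- m + alpha * (r - m)."""
    m = m0
    preds = np.empty(len(rewards))
    for t, r in enumerate(rewards):
        preds[t] = m              # prediction made before observing r
        m += alpha * (r - m)      # update with the unscaled prediction error
    return preds


def log_mse(true_mean, preds):
    """Log of the mean squared difference between true mean and prediction.

    ASSUMPTION: natural logarithm; the caption does not state the base.
    """
    return np.log(np.mean((np.asarray(true_mean) - np.asarray(preds)) ** 2))


if __name__ == "__main__":
    # Noise levels match the columns of panel A; learning rates are illustrative.
    for sigma in (1, 5, 15):
        true_mean, rewards = simulate_rewards(200, sigma_obs=sigma)
        for alpha in (0.1, 0.5, 0.9):
            score = log_mse(true_mean, rw_predictions(rewards, alpha))
            print(f"sigma={sigma:>2}  alpha={alpha}  log MSE={score:.2f}")
```

Sweeping `alpha` for each noise level reproduces the kind of comparison shown by the orange curves: a fixed learning rate that tracks well at low σ tends to overreact to individual rewards at high σ.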