Learning Reward Uncertainty in the Basal Ganglia

doi:10.1371/journal.pcbi.1005062

Fig 9

Changes in the variables of the OpAL model simulated in a two-alternative choice task as a function of trial number.

The rewards were sampled from a Gaussian distribution. Different rows correspond to simulations with different mean rewards μ_i (indicated above the panels), and different columns show the synaptic weights describing the tendency to select (G_i) and to inhibit (N_i) each of the two actions, and the value of the state V. The standard deviations of reward σ_i associated with the two actions are indicated above the corresponding panels. Here, both G and N were initialized at 0.1, and we set α = 0.1 and the parameters of the choice rule to a = b = 1. For each panel, the simulation was run 50 times, for 300 trials each.
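
A simulation of this kind can be sketched in a few lines of Python. The caption does not give the update equations or the choice rule, so the sketch below makes several assumptions: the updates follow the commonly cited OpAL form (a critic prediction error δ = r − V, with G and N changed in proportion to their current values), the critic uses the same learning rate α as the actor, V starts at 0, and the choice rule is a softmax over a·G_i − b·N_i. The function name `simulate_opal` and the reward means and standard deviations in the usage note are hypothetical, chosen only for illustration.

```python
import math
import random


def simulate_opal(means, sds, n_trials=300, alpha=0.1, a=1.0, b=1.0,
                  init_weight=0.1, seed=0):
    """Run one simulated session and return per-trial (G, N, V) snapshots.

    Assumed, not taken from the caption: the update equations, the softmax
    choice rule, V initialized at 0, and a shared learning rate alpha.
    """
    rng = random.Random(seed)
    n_actions = len(means)
    G = [init_weight] * n_actions  # weights promoting selection of each action
    N = [init_weight] * n_actions  # weights promoting inhibition of each action
    V = 0.0                        # critic's estimate of the state value
    history = []
    for _ in range(n_trials):
        # Assumed choice rule: softmax over a*G_i - b*N_i
        # (the maximum is subtracted for numerical stability).
        prefs = [a * g - b * n for g, n in zip(G, N)]
        top = max(prefs)
        weights = [math.exp(p - top) for p in prefs]
        i = rng.choices(range(n_actions), weights=weights)[0]

        # Reward for the chosen action, sampled from a Gaussian distribution.
        r = rng.gauss(means[i], sds[i])

        # The prediction error is computed before the critic is updated.
        delta = r - V
        V += alpha * delta
        G[i] += alpha * G[i] * delta     # grows after better-than-expected reward
        N[i] += alpha * N[i] * (-delta)  # grows after worse-than-expected reward
        history.append((list(G), list(N), V))
    return history
```

Calling `simulate_opal([1.0, 0.5], [0.5, 0.5])` runs 300 trials for one such session; repeating the call with 50 different seeds and averaging the snapshots trial by trial would give curves comparable in form to those described in the caption, under the assumptions listed above.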

doi: https://doi.org/10.1371/journal.pcbi.1005062.g009