Learning environment-specific learning rates

doi:10.1371/journal.pcbi.1011978

Fig 1.

Experimental design.

A: Experimental procedures. All three experiments started with a learning phase, in which participants alternated randomly, on a trial-by-trial basis, between the stable and the volatile casino. One casino was always presented at the top of the screen and had blue slot machines; the other was always presented at the bottom of the screen and had grey slot machines. The mapping between location and casino volatility was counterbalanced across participants. In Experiment 1, the learning phase was followed by a control phase, in which participants alternated randomly, on a trial-by-trial basis, between two stable casinos: one always presented on the left of the screen with green slot machines, the other always presented on the right with orange slot machines. In Experiment 3, the learning phase was followed, without participants being notified of any change, by a test phase in which participants alternated randomly between the same two casinos as in the learning phase, but with both casinos' contingencies now of intermediate volatility. On each trial, one of the two casinos was first presented at the relevant location for 750 ms. Next, two slot machines were presented in the relevant colour until the participant chose one of them, with a response deadline of 5 seconds. Finally, feedback was presented for 750 ms: the lights on the chosen slot machine flickered if reward was obtained and remained off if it was not. B: Reward rate simulation results. Each point represents the (smoothed) reward rate obtained by the simulated Rescorla-Wagner model in the relevant environment and with the relevant parameter settings.
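
The kind of reward rate simulation described in panel B can be illustrated with a minimal sketch. It is not the authors' code. The function name `simulate_reward_rate`, the reward probability `p_high`, the softmax choice rule with inverse temperature `beta`, the reversal schedule `reversal_every`, and the trial count are all assumptions chosen for illustration. Only the Rescorla-Wagner update itself (value moves toward the outcome by a fraction `alpha` of the prediction error) is the standard form of the model named in the caption.

```python
import math
import random


def simulate_reward_rate(alpha, beta, reversal_every, n_trials=2000,
                         p_high=0.8, seed=0):
    """Simulate a Rescorla-Wagner learner on a two-option task.

    Hypothetical settings: the better option pays off with probability
    p_high, the other with 1 - p_high, and the two swap roles every
    `reversal_every` trials (0 means a stable environment, no reversals).
    Returns the mean reward obtained per trial.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]   # value estimates for the two slot machines
    best = 0         # index of the currently better option
    total = 0
    for t in range(n_trials):
        if reversal_every and t > 0 and t % reversal_every == 0:
            best = 1 - best  # contingency reversal
        # Softmax (logistic) choice between the two options.
        p_choose_0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p_choose_0 else 1
        p_reward = p_high if choice == best else 1.0 - p_high
        reward = 1 if rng.random() < p_reward else 0
        # Rescorla-Wagner update: step toward the outcome by alpha * error.
        q[choice] += alpha * (reward - q[choice])
        total += reward
    return total / n_trials


stable = simulate_reward_rate(alpha=0.1, beta=5.0, reversal_every=0)
volatile = simulate_reward_rate(alpha=0.1, beta=5.0, reversal_every=20)
print(f"stable: {stable:.3f}, volatile: {volatile:.3f}")
```

Sweeping `alpha` over a grid for each environment and plotting the returned reward rates would produce a curve of the same general kind as the one the caption describes, under these assumed settings.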

Table 1.

Learning phase model comparison.

Fig 2.

Results Experiment 1.

A: Model estimation results for the learning phase. The density plots on the left side of each subfigure show the full posterior densities over the means of the group-level distributions of the relevant parameters. The scatter plots on the right side of each subfigure show the means of all individual-level posterior distributions of the relevant parameters. B: Reward rate simulation results for the dual learning rate model. Each point represents the (smoothed) reward rate obtained by the simulated dual learning rate model in the relevant environment and with the relevant parameter settings. Each black dot represents a participant’s estimated positive and negative learning rate. C: Model estimation results for the control phase.
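
As a rough illustration of what a dual learning rate model involves, the sketch below applies one learning rate when the outcome is better than expected and another when it is worse. It is a sketch under stated assumptions, not the authors' implementation. The function name `dual_rate_update` and the parameter names `alpha_pos` and `alpha_neg` are hypothetical, and tying the "positive" and "negative" learning rates to the sign of the prediction error is the common reading of such models rather than something the caption states.

```python
def dual_rate_update(q, reward, alpha_pos, alpha_neg):
    """Return the updated value estimate after one outcome.

    Assumption: alpha_pos applies to positive prediction errors and
    alpha_neg to negative (or zero) prediction errors.
    """
    delta = reward - q                       # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


# A rewarded trial moves the estimate up quickly, an unrewarded one
# moves it down slowly when alpha_pos > alpha_neg.
q_after_win = dual_rate_update(0.5, 1, alpha_pos=0.4, alpha_neg=0.1)
q_after_loss = dual_rate_update(0.5, 0, alpha_pos=0.4, alpha_neg=0.1)
print(q_after_win, q_after_loss)
```

Setting `alpha_pos` equal to `alpha_neg` reduces this update to the single-rate Rescorla-Wagner rule.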

Fig 3.

Results Experiment 2.

A: Model estimation results. The density plots on the left side of each subfigure show the full posterior densities over the means of the group-level distributions of the relevant parameters. The scatter plots on the right side of each subfigure show the means of all individual-level posterior distributions of the relevant parameters. B: Reward rate simulation results for the dual learning rate model. Each point represents the (smoothed) reward rate obtained by the simulated dual learning rate model in the relevant environment and with the relevant parameter settings. Each black dot represents a participant’s estimated positive and negative learning rate. C: Evolution of model parameters over blocks. Each subfigure shows the means and standard deviations of the posterior densities over the means of the group-level distributions of the relevant parameters in the relevant blocks.

Table 2.

Model comparison.

Fig 4.

Results Experiment 3.

A: Model estimation results for the learning phase. The density plots on the left side of each subfigure show the full posterior densities over the means of the group-level distributions of the relevant parameters. The scatter plots on the right side of each subfigure show the means of all individual-level posterior distributions of the relevant parameters. B: Reward rate simulation results for the dual learning rate model. Each point represents the (smoothed) reward rate obtained by the simulated dual learning rate model in the relevant environment and with the relevant parameter settings. Each black dot represents a participant’s estimated positive and negative learning rate. C: Evolution of model parameters over blocks. Each subfigure shows the means and standard deviations of the posterior densities over the means of the group-level distributions of the relevant parameters in the relevant blocks. D: Model estimation results for the test phase.

Table 3.

Learning phase model comparison.

Table 4.

Test phase model comparison.

Fig 5.

Results post-hoc analyses Experiment 3.

A: Model estimation results for the test phase up to the second (and last) reward contingency switch. The density plots on the left side of each subfigure show the full posterior densities over the means of the group-level distributions of the relevant parameters. The scatter plots on the right side of each subfigure show the means of all individual-level posterior distributions of the relevant parameters. B: Choice probability results. Shown are the proportions of participants choosing the correct slot machine on each of the first 10 trials after each of the two reward contingency switches, in both environments of the test phase. Error bars represent standard errors of the choice probabilities.

Table 5.

Model overview.

Table 6.

Model recovery simulation results.
