Combined model-free and model-sensitive reinforcement learning in non-human primates

doi:10.1371/journal.pcbi.1007944

Fig 3

The impact of both reward and transition information on first-stage choice reaction time.

(A) Session-averaged difference in z-scored first-stage reaction time (RT) between trials following a common transition and trials following a rare transition, as a function of reward on the previous trial (high z-scores indicate faster responses when the previous transition was rare). Error bars depict SEM. (B-C) Results of a multiple linear regression on first-stage reaction time, showing the contributions of the reward main effect (B) and the reward × transition interaction term (C) from each of the five previous trials. Dots represent the fixed-effects coefficients for each session (coloured red when p < 0.05 and grey otherwise). Bars and error bars correspond, respectively, to the mixed-effects coefficients and their SE. Dashed lines show the best-fitting exponential to the mean fixed-effects coefficients at each trial lag. ** and * denote significance at α = 0.01 and α = 0.05, respectively, in a two-tailed one-sample t-test against a null-hypothesis mean of zero.

doi: https://doi.org/10.1371/journal.pcbi.1007944.g003
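
The per-session regression summarised in panels B and C can be illustrated with a minimal sketch. This is not the authors' code: the function names `lagged_design` and `fit_session`, the ±1 coding of reward and transition, and the use of ordinary least squares are all assumptions made for illustration, and the sketch includes only the two predictor families named in the caption (the published model may contain further terms).

```python
import numpy as np

def lagged_design(reward, common, n_lags=5):
    """Design matrix with predictors from the previous n_lags trials.

    reward, common: per-trial arrays, assumed coded +1/-1
    (rewarded/unrewarded, common/rare transition).
    Columns: intercept, reward at lags 1..n_lags,
    reward x transition at lags 1..n_lags.
    """
    reward = np.asarray(reward, dtype=float)
    common = np.asarray(common, dtype=float)
    n = len(reward)
    cols = [np.ones(n - n_lags)]
    for lag in range(1, n_lags + 1):
        # value on trial t - lag, for every usable trial t >= n_lags
        cols.append(reward[n_lags - lag:n - lag])
    for lag in range(1, n_lags + 1):
        sl = slice(n_lags - lag, n - lag)
        cols.append(reward[sl] * common[sl])
    return np.column_stack(cols)

def fit_session(rt, reward, common, n_lags=5):
    """Ordinary least squares on z-scored RT for one session (assumed estimator)."""
    rt = np.asarray(rt, dtype=float)
    z = (rt - rt.mean()) / rt.std()
    X = lagged_design(reward, common, n_lags)
    beta, *_ = np.linalg.lstsq(X, z[n_lags:], rcond=None)
    reward_coefs = beta[1:1 + n_lags]        # analogue of panel B
    interaction_coefs = beta[1 + n_lags:]    # analogue of panel C
    return reward_coefs, interaction_coefs
```

Calling `fit_session` once per session would give one set of coefficients per session, comparable to the dots in panels B and C; the mixed-effects estimates and the exponential fit shown in the figure are not reproduced here.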