Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts
Fig 6
Action bias and hysteresis versus learning performance: 3-T Face/House version.
To compare the pure GRL model (“2”) with the final 2CE1 model adding three parameters for constant bias and exponential hysteresis, simulated data sets from each model were yoked to their respective empirical data sets. Posterior predictive checks were tested for the probability of a correct action, the probability of a right-hand action, or the probability of a repeated action independent of state. (a) If only examining accuracy in terms of correct choices for maximizing reward, the shortcomings of the reduced model without bias are not so obviously apparent at first. (b) Upon considering action bias, these right-handed individuals mostly had a tendency to select the right-hand action (p < 0.05). Whereas the 2CE1 model could account for this effect with a constant lateral bias (p < 0.05), the reduced model could not (p > 0.05). (c) Regarding the probability of repetition versus alternation, note that 100% accuracy would produce 66.7% alternation for the present experimental design, but 100% alternation would still produce 50% accuracy. The Good-learner group exhibited a tendency to alternate in the aggregate as expected (p < 0.05), whereas the Poor-learner and Nonlearner groups did not (p > 0.05). Only the 2CE1 model featuring exponential hysteresis could match this pattern with quantitative precision. (d-f) Independent of direction, absolute differences from the chance level of 50% reveal the full extent of the action-specific components of variance, which are as substantial as the effects of reward typically emphasized in active learning. For fitting the probability of a right-hand action or a repeated action, a margin of roughly 2% for pure GRL was insubstantial in comparison. Error bars indicate standard errors of the means.