Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task

doi:10.1371/journal.pcbi.1004648

Fig 3

Comparison of agents’ behaviour–reduced task.

Comparison of the behaviour of all agent types discussed in the paper on the reduced task. Far left panels: Stay probability plots. Centre left panels: Predictor loadings for a logistic regression model predicting whether the agent will repeat the same choice as a function of 4 predictors: Stay, a tendency to repeat the same choice irrespective of trial events; Outcome, a tendency to repeat the same choice following a rewarded trial; Transition, a tendency to repeat the same choice following common transitions; Transition x outcome interaction, a tendency to repeat the same choice dependent on the interaction between transition (common/rare) and outcome (rewarded/not). Centre right panels: Predictor loadings for a logistic regression analysis with an additional ‘correct’ predictor, which captures a tendency to repeat correct choices. Right panels: Predictor loadings for a lagged logistic regression model. The model uses a set of 4 predictors at each lag, each of which captures how a given combination of transition (common/rare) and outcome (rewarded/not) predicts whether the agent will repeat the choice a given number of trials in the future, e.g., the ‘rewarded, rare’ predictor at lag -2 captures the extent to which receiving a reward following a rare transition predicts that the agent will choose the same action two trials later. The legend for the right panels is at the bottom of the figure. Error bars in all plots show SEM across sessions. Agent types: (A-D) Q(1), (E-H) Model-based, (I-L) Q(0), (M-P) Reward-as-cue, (Q-T) Latent-state.
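
The predictors described in the caption can be illustrated with a minimal sketch of how the two design matrices might be assembled from trial-by-trial data. This is not the authors' code: the function names, the +/-0.5 predictor coding, the binary 0/1 choice coding, and the convention that the lagged predictors are signed by the earlier choice are all assumptions made for illustration. The resulting matrices could be passed to any standard logistic regression routine.

```python
# Illustrative sketch only; names and the +/-0.5 coding are assumptions.

def stay_design_matrix(choices, transitions, outcomes):
    """Predictors for the stay/switch regression (centre left panels).

    choices:     list of 0/1 first-step choices, one per trial
    transitions: list of bools, True = common transition, False = rare
    outcomes:    list of bools, True = rewarded, False = not rewarded
    Returns (X, y), where X[t] = [stay, outcome, transition, interaction]
    and y[t] = 1 if the choice on trial t+1 repeats the choice on trial t.
    """
    X, y = [], []
    for t in range(len(choices) - 1):
        out = 0.5 if outcomes[t] else -0.5
        trans = 0.5 if transitions[t] else -0.5
        # Positive for common & rewarded or rare & unrewarded trials.
        inter = 0.5 if transitions[t] == outcomes[t] else -0.5
        X.append([1.0, out, trans, inter])  # 1.0 = constant 'Stay' predictor
        y.append(1 if choices[t + 1] == choices[t] else 0)
    return X, y


# Order of the four per-lag predictors: (rewarded, common)
COMBOS = [(True, True), (True, False), (False, True), (False, False)]


def lagged_design_matrix(choices, transitions, outcomes, n_lags):
    """Predictors for the lagged regression (right panels).

    For each lag, one predictor per (outcome, transition) combination.
    A predictor is +0.5 / -0.5 if that combination occurred on the earlier
    trial and the earlier choice was 1 / 0, and 0.0 otherwise. The target
    is the current choice, so a positive loading means the earlier choice
    tends to be repeated.
    """
    X, y = [], []
    for t in range(n_lags, len(choices)):
        row = []
        for lag in range(1, n_lags + 1):
            past = t - lag
            sign = 0.5 if choices[past] == 1 else -0.5
            for rewarded, common in COMBOS:
                hit = outcomes[past] == rewarded and transitions[past] == common
                row.append(sign if hit else 0.0)
        X.append(row)
        y.append(choices[t])
    return X, y


if __name__ == "__main__":
    # Tiny made-up example, not data from the paper.
    choices = [0, 0, 1, 1]
    transitions = [True, False, True, True]
    outcomes = [True, True, False, True]
    print(stay_design_matrix(choices, transitions, outcomes))
    print(lagged_design_matrix(choices, transitions, outcomes, n_lags=2))
```

Under this coding, a positive loading on the ‘rewarded, rare’ column at lag 2 would correspond to the caption's example: a reward following a rare transition predicting a repeat of the same action two trials later.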

doi: https://doi.org/10.1371/journal.pcbi.1004648.g003