When Does Model-Based Control Pay Off?
Fig 1
Design of the Daw two-step task.
(A) State transition structure of the original two-step paradigm. Each first-stage choice has a high probability of transitioning to one of two states and a low probability of transitioning to the other. Each second-stage choice is associated with a probability of obtaining a binary reward. (B) To encourage learning, the second-stage reward probabilities change slowly over the course of the experiment.