Model-Based Reasoning in Humans Becomes Automatic with Training
(A) Subjects chose between a pair of fractals at each of two stages; a first-stage choice led to one of two second-stage pairs with a fixed probability, a transition structure that the player could exploit. The second-stage choice was followed by either a reward (gold coin) or no reward (0), according to independently fluctuating reward contingencies. On dual-task trials (displayed in the figure), two different numbers of different physical sizes were displayed above each fractal at the first stage. Following second-stage feedback, the word ‘SIZE’ or ‘VALUE’ was presented on the screen, requiring the player to indicate whether the number that was larger in physical size or in numerical value, respectively, had appeared on the left or right side of the screen. Correct responses were incentivized with a monetary gain; incorrect responses were unrewarded. (B) On days 1 and 2, the ‘high load group’ played alternating blocks of single-task (128) and dual-task (64) trials (for a total of 4 blocks), while the ‘low load group’ played 2 consecutive blocks of single-task (128) trials. On day 3, both groups played alternating blocks of single-task and dual-task trials (as per the ‘high load group’ on days 1–2).
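
The block schedule in (B) can be written out as a short Python sketch. This is an illustration only, not the authors' experiment code: the names `schedule` and `total_trials` are hypothetical, the numbers in parentheses are read as trials per block, and the alternation is assumed to start with a single-task block, neither of which the caption states explicitly.

```python
from typing import List, Tuple

Block = Tuple[str, int]  # (block type, trials in the block)

# Assumption: 128 and 64 are per-block trial counts.
SINGLE: Block = ("single-task", 128)
DUAL: Block = ("dual-task", 64)


def schedule(group: str, day: int) -> List[Block]:
    """Return the ordered blocks played by a group on a given day."""
    if group not in ("high load", "low load"):
        raise ValueError(f"unknown group: {group!r}")
    if day not in (1, 2, 3):
        raise ValueError(f"day must be 1, 2 or 3, got {day}")
    if group == "low load" and day in (1, 2):
        # Two consecutive single-task blocks on days 1 and 2.
        return [SINGLE, SINGLE]
    # High load group on every day, and both groups on day 3.
    # Assumption: the alternation starts with a single-task block.
    return [SINGLE, DUAL, SINGLE, DUAL]


def total_trials(blocks: List[Block]) -> int:
    """Sum the trial counts over a list of blocks."""
    return sum(n for _, n in blocks)


if __name__ == "__main__":
    for group in ("high load", "low load"):
        for day in (1, 2, 3):
            blocks = schedule(group, day)
            print(group, day, blocks, total_trials(blocks))
```

Under these assumptions, the two groups differ only in whether days 1 and 2 include dual-task blocks; the day 3 session is identical for both.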