Locus Coeruleus tracking of prediction errors optimises cognitive flexibility: An Active Inference model
Fig 7
The explore/exploit task simulated with fixed and flexible values of model decay.
(a) and (b) show the behavioural output from the explore/exploit task for agents with a fixed α parameter: α = 32 (slow model decay) or α = 2 (fast model decay). The α = 2 agent is hyperflexible and changes its strategy after a single failed trial. In contrast, the α = 32 agent is inflexible and persists in seeking reward in the same location despite multiple failed trials. (c) and (d) compare the fixed-α agents with an agent whose α is flexible and set by the state-action prediction error. Each simulation consisted of 150 trials in which the location of the high-probability arm changed either every 15 or every 50 trials, and each simulation was repeated 50 times. The panels show the average reward obtained in bins of 20 trials (shaded regions show the standard error of the mean), alongside the mean total reward gained by each agent (error bars show S.E.M.; *** P < 0.0001, one-way ANOVA followed by Tukey post hoc test). The less stable environment (changes every 15 trials) favours the α = 2 agent and the more stable environment (changes every 50 trials) favours the α = 32 agent; the flexible agent, however, performs as well as or better than the favoured fixed-α agent in both scenarios. In (e) the location of the reward changes after random intervals, and the flexible agent clearly outperforms both of the fixed-α agents.
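The trial schedule and the binned summary described in this caption can be sketched as follows. This is a minimal illustration and not the authors' simulation code: the function names `reversal_schedule` and `binned_mean_sem` are hypothetical, the number of arms (`n_arms=2`) is an assumption not stated in the caption, and the agents themselves are not modelled. Because 150 trials do not divide evenly into bins of 20, the sketch assumes the final bin is simply shorter.

```python
import math
import statistics


def reversal_schedule(n_trials=150, block_len=15, n_arms=2):
    """Index of the high-probability arm on each trial.

    The rewarded arm moves to the next arm every `block_len` trials.
    `n_arms` is an assumption made for illustration only.
    """
    return [(t // block_len) % n_arms for t in range(n_trials)]


def binned_mean_sem(rewards, bin_size=20):
    """Mean and S.E.M. of reward per bin, computed across repeated runs.

    `rewards` is a list of runs, each a list of per-trial rewards.
    The last bin may contain fewer than `bin_size` trials.
    """
    n_runs = len(rewards)
    n_trials = len(rewards[0])
    means, sems = [], []
    for start in range(0, n_trials, bin_size):
        # Average reward of each run within this bin
        per_run = [statistics.mean(run[start:start + bin_size]) for run in rewards]
        means.append(statistics.mean(per_run))
        # Sample standard deviation across runs divided by sqrt(number of runs)
        sems.append(statistics.stdev(per_run) / math.sqrt(n_runs))
    return means, sems
```

With `block_len=15` the schedule corresponds to the less stable environment and with `block_len=50` to the more stable one. Passing 50 runs of 150 trials to `binned_mean_sem` gives eight bins under the shorter-final-bin assumption.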