Predictive representations can link model-based reinforcement learning to model-free mechanisms

doi:10.1371/journal.pcbi.1005768

Fig 6

Comparison of SR-Dyna and Dyna-Q.

Median value function (grayscale) and implied policy after each algorithm (row) learns about the relevant change in each of the 3 tasks (column). Both SR-Dyna (a) and Dyna-Q (b) can solve all 3 tasks when a sufficient number of samples is backed up. c) Without a sufficient number of samples, SR-Dyna can still solve the latent learning task. d) Without a sufficient number of samples, Dyna-Q cannot solve any of the 3 tasks.
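The dependence of Dyna-Q on the number of backed-up samples can be illustrated with a minimal sketch. It is not the simulation behind this figure: the 10-state chain task, the function names (`build_model`, `dyna_q_replay`, `start_value`), and the parameter values (`alpha=1.0`, `gamma=0.9`) are all assumptions chosen for illustration. The sketch replays randomly chosen transitions from a learned one-step model and applies a Q-learning backup to each, so reward information reaches distant states only after enough backups have occurred in a suitable order.

```python
import random

ACTIONS = (-1, +1)   # hypothetical actions: step left or right on the chain
N_STATES = 10        # hypothetical task: states 0..9, state 9 is terminal and rewarded


def build_model():
    """Deterministic one-step model: (state, action) -> (reward, next_state, terminal)."""
    model = {}
    for s in range(N_STATES - 1):  # no transitions leave the terminal state
        for a in ACTIONS:
            s_next = min(max(s + a, 0), N_STATES - 1)
            terminal = s_next == N_STATES - 1
            model[(s, a)] = (1.0 if terminal else 0.0, s_next, terminal)
    return model


def dyna_q_replay(model, n_samples, alpha=1.0, gamma=0.9, seed=0):
    """Apply n_samples Q-learning backups to transitions sampled from the model."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    keys = sorted(model)
    for _ in range(n_samples):
        s, a = rng.choice(keys)                      # pick a remembered state-action pair
        r, s_next, terminal = model[(s, a)]
        bootstrap = 0.0 if terminal else max(q[(s_next, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * bootstrap - q[(s, a)])  # one-step backup
    return q


def start_value(q):
    """Value of the start state (state 0) under the greedy policy."""
    return max(q[(0, a)] for a in ACTIONS)


model = build_model()
print("few samples :", start_value(dyna_q_replay(model, n_samples=5)))
print("many samples:", start_value(dyna_q_replay(model, n_samples=5000)))
```

In this sketch, at least nine backups in the right order are needed before any reward information reaches state 0, so five replayed samples necessarily leave the start state's value at zero and its implied policy uninformative. SR-Dyna is not modeled here.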

doi: https://doi.org/10.1371/journal.pcbi.1005768.g006