Accounting for sensitivity of latent learning to behavioral statistics with successor representations

doi:10.1371/journal.pcbi.1014131

Fig 8

The structure of Deep SR accounts for the performance differences across the three pre-exposure paradigms.

A) Left: Well-learned SF for the direct learning agent in the gridworld. Left-center: After targeted pre-exposure, the SF structure is similar to what the direct learning agent learns from rewards, particularly in states near the goal (shown in inset). Right-center: After continuous pre-exposure, the structure is less similar to that of the direct learning agent. Right: After mistargeted pre-exposure, the structure is similar to that after targeted pre-exposure, but for the mistargeted location. B) Same as in A) for the Tolman maze, without the mistargeted pre-exposure simulations. C) For the gridworld, the cosine similarity between the SF vector of each latent learning agent at the end of the pre-exposure phase and that of the direct learning agent at the task’s conclusion. Targeted pre-exposure results in higher scores in states close to the goal (last state), while mistargeted pre-exposure shows lower scores in states close to the mistargeted state. Error bars are S.E.M. D) Same as in C) for the Tolman maze.

doi: https://doi.org/10.1371/journal.pcbi.1014131.g008
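
The per-state comparison in panels C and D can be illustrated with a minimal sketch. It is not the authors' code: the function names `cosine_similarity` and `per_state_similarity`, the toy SF values, and the choice to compute the S.E.M. across independent runs are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def per_state_similarity(latent_sf_runs, direct_sf):
    """Return (mean, S.E.M.) of the cosine similarity for each state.

    latent_sf_runs: one entry per run (assumed unit of replication),
        each a list holding one SF vector per state.
    direct_sf: reference SF vectors, one per state.
    """
    n_runs = len(latent_sf_runs)
    summary = []
    for state, reference in enumerate(direct_sf):
        sims = [cosine_similarity(run[state], reference)
                for run in latent_sf_runs]
        # S.E.M. = sample standard deviation / sqrt(number of runs)
        sem = stdev(sims) / math.sqrt(n_runs) if n_runs > 1 else 0.0
        summary.append((mean(sims), sem))
    return summary


# Hypothetical toy data: 2 states, 2-dimensional SF vectors, 2 runs.
direct_sf = [[1.0, 0.0], [0.0, 1.0]]
latent_sf_runs = [
    [[1.0, 0.0], [1.0, 0.0]],  # run 1: matches state 0, orthogonal in state 1
    [[2.0, 0.0], [0.0, 3.0]],  # run 2: same direction as reference in both
]

summary = per_state_similarity(latent_sf_runs, direct_sf)
for state, (avg, sem) in enumerate(summary):
    print(f"state {state}: mean={avg:.3f}, sem={sem:.3f}")
```

Because cosine similarity ignores vector magnitude, run 2 scores 1.0 in both states even though its SF vectors are scaled versions of the reference; the measure compares the structure of the representation, not its scale.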