Accounting for sensitivity of latent learning to behavioral statistics with successor representations
Fig 3
Agent learns local connectivity of the maze during latent learning.
Deep SR matrices derived from one-hot encodings of all states are shown for A) the gridworld for the right action and B) the Tolman maze for the up action, averaged over 30 simulations. Rows and columns correspond to the state indices. Each row of panels represents the successor transitions from a particular state to every state in the environment. Trials 0 and 50 correspond to before and after the learning phase, respectively. At the end of the learning phase, both agents exhibit similar transition patterns that reflect movement from the start location to the goal location in both mazes.
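For readers unfamiliar with how an SR matrix of this kind can be obtained and averaged, the following is a minimal sketch and not the deep SR network used here. It assumes a tabular, state-only SR (no action dependence), a hypothetical five-state corridor explored by a random walk, and illustrative values for the learning rate `alpha` and discount `gamma`; the function names `learn_sr` and `averaged_sr` are likewise illustrative.

```python
import numpy as np


def learn_sr(n_states=5, n_steps=2000, alpha=0.1, gamma=0.9, seed=0):
    """TD-learn a tabular SR matrix for a random walk on a 1-D corridor.

    Assumption: this stands in for the deep SR; states are one-hot vectors,
    so row s of M holds the expected discounted future occupancy of every
    state when starting from state s.
    """
    rng = np.random.default_rng(seed)
    onehot = np.eye(n_states)            # one-hot encoding of each state
    M = np.zeros((n_states, n_states))   # SR before learning (trial 0)
    s = 0
    for _ in range(n_steps):
        step = rng.choice([-1, 1])       # unbiased left/right exploration
        s_next = int(np.clip(s + step, 0, n_states - 1))
        # TD update toward onehot(s) + gamma * M[s_next]
        M[s] += alpha * (onehot[s] + gamma * M[s_next] - M[s])
        s = s_next
    return M


def averaged_sr(n_sims=30, **kwargs):
    """Average the learned SR matrix over independent simulations."""
    return np.mean([learn_sr(seed=i, **kwargs) for i in range(n_sims)], axis=0)


M_before = np.zeros((5, 5))  # analogous to trial 0
M_after = averaged_sr()      # analogous to the end of the learning phase
```

Plotting `M_before` and `M_after` as heat maps, with rows and columns indexed by state, gives a before/after comparison in the same layout as the panels. Under these assumptions, mass in `M_after` should concentrate near the diagonal, reflecting the local connectivity of the corridor.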