Accounting for sensitivity of latent learning to behavioral statistics with successor representations
Fig 4
Successor representations of individual states reveal predictive encoding.
Shown are visualizations of one row of the deep SR matrix (one-hot encoding, targeted pre-exposure), mapped onto the environment structure, for the state indicated by the orange cross in the A) gridworld and B) Tolman maze. Trial –50 corresponds to the random initialization. After pre-exposure to the environment (trial 0), the future states that the agent is expected to visit lie mostly near the current state, whereas by the end of learning (trial 50), the expected future occupancy has evolved into a path leading to the goal.
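
As a rough illustration of how one row of an SR matrix can be mapped onto an environment, the following minimal sketch may help. It is not the deep SR used for this figure: it assumes a tabular SR computed in closed form, M = (I − γT)⁻¹, for a uniform random walk on an open 5×5 grid with no walls or goal. The grid size, the discount factor, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np


def random_walk_transitions(height, width):
    """Row-stochastic transition matrix of a uniform random walk on an open grid
    (assumed stand-in for exploratory behavior; no walls, no goal)."""
    n = height * width
    T = np.zeros((n, n))
    for r in range(height):
        for c in range(width):
            s = r * width + c
            neighbors = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    neighbors.append(rr * width + cc)
            # Each reachable neighbor is chosen with equal probability.
            T[s, neighbors] = 1.0 / len(neighbors)
    return T


def successor_representation(T, gamma):
    """Closed-form tabular SR, M = (I - gamma * T)^{-1} (not the deep SR)."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)


def sr_row_as_map(M, state, height, width):
    """Reshape the SR row of one state into the grid layout for plotting."""
    return M[state].reshape(height, width)


height, width, gamma = 5, 5, 0.9      # hypothetical settings
T = random_walk_transitions(height, width)
M = successor_representation(T, gamma)
center = 2 * width + 2                # index of the state at row 2, column 2
sr_map = sr_row_as_map(M, center, height, width)
print(np.round(sr_map, 2))
```

The resulting `sr_map` can be passed to any heatmap routine to produce a panel of the kind described in the caption. Under a purely random policy the expected occupancy would be expected to cluster around the chosen state; a goal-directed path like the one at trial 50 would only appear once the transition matrix reflects a learned policy.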