Grid Cells, Place Cells, and Geodesic Generalization for Spatial Reinforcement Learning
Figure 5
Qualitative comparison of learned value functions using tabular, geodesic grid cell, and geodesic place cell bases.
In each panel (A–C), the column titles indicate the representation used to learn the value function, and each row corresponds to one gridworld configuration. White lines are walls, squares are discrete states, and the gray scale runs from dark (low value) to light (high value). To ease comparison between spatial representations within a given gridworld, image brightness was normalized with respect to the optimal value function. (A) Snapshot of value representation after 25 learning trials. (B) Snapshot of value representation after 25 learning trials. (C) Snapshot of value representation after 50 learning trials. In contrast to Euclidean bases, the geodesic representation does not smear value across walls but instead tracks around them.
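The difference between Euclidean and geodesic generalization can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the procedure used to produce the figure: the wall encoding (blocked moves between adjacent cells), the helper names `geodesic_distances` and `place_field`, the Gaussian form of the place field, and the 3×3 layout. The sketch computes shortest-path (geodesic) distance by breadth-first search and compares the activation of a place field at a state that lies just across a wall from the field's center.

```python
from collections import deque
import math


def geodesic_distances(rows, cols, walls, source):
    """Breadth-first search: shortest-path step counts from `source`.

    `walls` is a set of frozenset({cell_a, cell_b}) pairs marking blocked
    moves between 4-adjacent cells (a hypothetical encoding).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue  # outside the grid
            if frozenset(((r, c), nxt)) in walls or nxt in dist:
                continue  # blocked by a wall, or already reached
            dist[nxt] = dist[(r, c)] + 1
            queue.append(nxt)
    return dist


def place_field(distance, width=1.0):
    """Assumed Gaussian tuning curve over distance from the field center."""
    return math.exp(-distance ** 2 / (2 * width ** 2))


# Hypothetical 3x3 gridworld: a wall separates column 0 from column 1
# in rows 0 and 1, leaving a gap in row 2.
rows, cols = 3, 3
walls = {frozenset(((0, 0), (0, 1))), frozenset(((1, 0), (1, 1)))}

center = (0, 0)   # place-field center
target = (0, 1)   # state directly across the wall

geo = geodesic_distances(rows, cols, walls, center)
euclid = math.dist(center, target)

euclid_activation = place_field(euclid)
geo_activation = place_field(geo[target])

print(f"Euclidean distance {euclid:.1f} -> activation {euclid_activation:.4f}")
print(f"Geodesic distance  {geo[target]} -> activation {geo_activation:.6f}")
```

Under these assumptions, the Euclidean field is strongly active at the state across the wall, so value learned at the center would leak to it, whereas the geodesic field is nearly silent there because the shortest path has to go around the wall through the gap.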