Emergence of belief-like representations through reinforcement learning
Fig 6
Value RNNs with larger capacity had more belief-like representations.
A. Error between the RPEs of the Value RNN and Untrained RNN relative to the RPEs of the Belief model (“RPE MSE”; see Fig 3D) during Task 2, as a function of the number of units in the RNN. Each dot indicates the error for a single Value RNN. Circles indicate the median across the N = 12 Value RNNs (dark purple) and N = 12 Untrained RNNs (light purple) with the same number of units. Remaining panels use the same conventions. B. Total variance explained (R²) of beliefs on held-out trials (see Fig 4B). C. Cross-validated log-likelihood of the state decoder using each RNN’s activity to estimate the true state (see Fig 4C). D. Difference between each RNN’s odor memory and reward memory (see Fig 5E).
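As a rough illustration of how the summary quantities in panels A and B could be computed, the sketch below assumes that "RPE MSE" is the mean squared difference between an RNN's RPEs and the Belief model's RPEs, and that total R² pools residual and total variance across all belief dimensions. The function names (rpe_mse, total_r2, median_by_size) and the array layouts are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def rpe_mse(rpe_rnn, rpe_belief):
    """Mean squared error between an RNN's RPEs and the Belief model's RPEs.

    Both inputs are 1-D sequences of equal length (one RPE per time step).
    Assumed definition; the paper's exact normalization may differ.
    """
    rpe_rnn = np.asarray(rpe_rnn, dtype=float)
    rpe_belief = np.asarray(rpe_belief, dtype=float)
    return float(np.mean((rpe_rnn - rpe_belief) ** 2))

def total_r2(beliefs_true, beliefs_pred):
    """Total variance explained, pooled over belief dimensions.

    Inputs have shape (time steps, belief dimensions) and should come from
    held-out trials. Computes 1 - (residual sum of squares / total sum of squares).
    """
    beliefs_true = np.asarray(beliefs_true, dtype=float)
    beliefs_pred = np.asarray(beliefs_pred, dtype=float)
    resid = np.sum((beliefs_true - beliefs_pred) ** 2)
    total = np.sum((beliefs_true - beliefs_true.mean(axis=0)) ** 2)
    return float(1.0 - resid / total)

def median_by_size(n_units, scores):
    """Median score across RNNs that share the same number of units."""
    n_units = np.asarray(n_units)
    scores = np.asarray(scores, dtype=float)
    return {int(n): float(np.median(scores[n_units == n]))
            for n in np.unique(n_units)}
```

Under these assumptions, each dot in a panel would correspond to one call to rpe_mse or total_r2 for a single RNN, and each circle to one entry of median_by_size for the group of RNNs with that number of units.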