The successor representation subserves hierarchical abstraction for goal-directed behavior

doi:10.1371/journal.pcbi.1011312


Fig 7

Modularity of the successor representation model of response times.

(A,B) Relationship between candidate discount factor values and the "modularity" of the successor matrix computed from the state sequences experienced by two representative participants. Modularity is measured as the ratio of state prediction error for between-wing transitions to that for within-wing transitions (corrected for "preferred" transitions; see Methods). Note that A shows a "peak" (orange line) just before 1, whereas B increases linearly to a maximum at 1. These two patterns were ubiquitous in our dataset. Not all discount factors lead to modular successor representations (the horizontal grey line marks modularity = 1). For both of these participants, only discount factors above about 0.25 show an effect of community structure. (C) Posterior means of the recovered discount factor parameters for all participants (dots) and a kernel density estimate over them. (D) Modularity of the successor matrix for all participants (blue dots) under a null model and under the fitted successor representation model. (E) State prediction errors for all state-action transitions (including those not possible in the actual experiment), based on the estimated successor matrix for an example participant in our dataset (derived from the full posterior distribution; mean estimated discount factor of 0.978). Hotter colors indicate larger prediction errors. Both axes index state-action conjunctions, with states labeled 1–15 as in Fig 1A and actions labeled (i) or (ii) as in Fig 1B. The community structure is visible as "squares" of decreased prediction error for states (rooms) that are part of the same community (wing). (F) Posterior model probabilities of the successor representation response time model (x-axis) and the modularity measure (y-axis) for all participants (blue dots), with line of best fit (orange). (G) As in F, but with total accumulated reward bonus on the y-axis.
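To make the modularity measure in panels A, B and D concrete, the following is a minimal sketch and not the analysis code used in the paper. It assumes a toy graph of two fully connected "wings" joined by a single edge (not the 15-room task graph), a successor matrix over states only (not state-action conjunctions) learned with a standard temporal-difference update, and a state prediction error defined as the Euclidean norm of the temporal-difference error vector. The correction for "preferred" transitions is omitted. All function names and parameter values (including the learning rate `alpha`) are hypothetical.

```python
import math
import random


def build_two_community_graph(size=5):
    """Toy graph (assumption): two fully connected wings joined by one bridge edge."""
    n = 2 * size
    community = [0] * size + [1] * size
    neighbors = {s: [] for s in range(n)}
    for s in range(n):
        for t in range(n):
            if s != t and community[s] == community[t]:
                neighbors[s].append(t)
    a, b = size - 1, size  # the two states connected by the bridge
    neighbors[a].append(b)
    neighbors[b].append(a)
    return neighbors, community


def random_walk(neighbors, n_steps, seed=0):
    """Stand-in for a participant's experienced state sequence."""
    rng = random.Random(seed)
    state = 0
    sequence = [state]
    for _ in range(n_steps):
        state = rng.choice(neighbors[state])
        sequence.append(state)
    return sequence


def td_error(M, s, s_next, gamma):
    """TD error vector for the transition s -> s_next under successor matrix M."""
    return [(1.0 if j == s else 0.0) + gamma * M[s_next][j] - M[s][j]
            for j in range(len(M))]


def learn_sr(sequence, n_states, gamma, alpha=0.1):
    """Learn the successor matrix from a state sequence with TD updates."""
    M = [[1.0 if i == j else 0.0 for j in range(n_states)]
         for i in range(n_states)]
    for s, s_next in zip(sequence, sequence[1:]):
        delta = td_error(M, s, s_next, gamma)
        for j in range(n_states):
            M[s][j] += alpha * delta[j]
    return M


def modularity(M, neighbors, community, gamma):
    """Mean prediction error of between-wing edges over that of within-wing edges."""
    between, within = [], []
    for s, nbrs in neighbors.items():
        for t in nbrs:
            spe = math.sqrt(sum(d * d for d in td_error(M, s, t, gamma)))
            (within if community[s] == community[t] else between).append(spe)
    mean_within = sum(within) / len(within)
    if mean_within == 0.0:
        return float("nan")  # e.g. gamma = 0 with identity initialization
    return (sum(between) / len(between)) / mean_within


neighbors, community = build_two_community_graph()
sequence = random_walk(neighbors, n_steps=20000)
M = learn_sr(sequence, n_states=len(community), gamma=0.9)
ratio = modularity(M, neighbors, community, gamma=0.9)
print(ratio)
```

Under these assumptions, a ratio above 1 means that crossing between wings is more surprising to the learned successor matrix than moving within a wing. Repeating the last three steps over a grid of `gamma` values would give a curve of the kind plotted in panels A and B, although its shape on this toy graph need not match the participants' curves.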

doi: https://doi.org/10.1371/journal.pcbi.1011312.g007