Ventromedial prefrontal cortex recruitment during memory recall varies non-linearly as a function of remoteness

Systems-level consolidation refers to the time-dependent reorganisation of memory traces in the neocortex, a process in which the ventromedial prefrontal cortex (vmPFC) has been implicated. Capturing the precise temporal evolution of this crucial process in humans has long proved elusive. Here, we used multivariate methods and a longitudinal functional MRI design to detect, with high granularity, the extent to which autobiographical memories of different ages were represented in vmPFC and how this changed over time. We observed an unexpected biphasic involvement of vmPFC during retrieval, rising and falling around an initial peak of 8-12 months, before re-engaging for older two- and five-year-old memories. This pattern was replicated in two independent sets of memories. Moreover, it was further replicated in a follow-up study eight months later with the same participants and memories, where the individual memory representations had undergone their hypothesised strengthening or weakening over time. We conclude that the temporal engagement of vmPFC in memory retrieval seems to be non-linear, revealing a complex relationship between systems-level consolidation and prefrontal cortex recruitment that is unaccounted for by current theories.


Introduction
We possess a remarkable ability to retrieve, with ease, one single experience from a lifetime of memories. How these individual autobiographical memories are represented in the brain over time is a central question of memory neuroscience which remains unanswered.

Consolidation takes place on two levels which differ on both a spatial and temporal scale. On a cellular level, the stabilisation of new memory traces through modification of synaptic connectivity takes only a few hours (1), and is heavily dependent upon the hippocampus (2-5). On a much longer timescale, the neocortex integrates new memories, a form of consolidation termed "systems-level" (6). The precise timeframe of this process is unknown. A related long-standing debate which has contributed to this uncertainty is whether or not the hippocampus ever relinquishes its role in autobiographical memory retrieval. One theory asserts that the hippocampus is not involved in the retrieval of memories after they have become fully consolidated to the neocortex (7). Alternate views maintain that vivid, detailed autobiographical memories retain a permanent reliance on the […]

Consistent level of details recalled across memories
To complement the subjective ratings of memory characteristics with a more objective assessment of their content, transcripts of participants' memory interviews were scored using the Autobiographical Interview protocol [(56); Materials and methods]. In total for this first experiment, 10,187 details were scored. The mean (SD) number of internal details (bound to the specific 'episodic' spatiotemporal context of the event) and external details (arising from a general 'semantic' knowledge or references to unrelated events) are shown in Table S1B (see also Fig 2). They were then compared across time periods. In contrast to the subjective ratings of memory detail, there was no difference in the number of details recalled across memories from different time periods (F(4.54, 131.66) = 1.92, p = 0.101). As expected, the number of internal and external details differed (F(1, 29) = 206.03, p < 0.001), with more internal details recalled for every time period (all p < 0.001). No interaction between time period and type of detail was observed (F(7, 203) = 1.87, p = 0.077). While a more targeted contrast of the most recent (0.5M) and most remote (60M) memories did reveal that 0.5M events contained more internal details (t(29) = 3.40, p = 0.002), this is consistent with participants' subjective ratings, and implies that any observed strengthening of neural representations over time could not be attributable to greater detail at remote time-points. The number of external details recalled was remarkably consistent across all time periods, emphasising the episodic nature of recalled events irrespective of remoteness. Inter-rater reliabilities for the scoring (see Materials and methods) were high for both internal (ICC = 0.94) and […] (57).
This comprises areas implicated in memory consolidation (31, 54, 55), namely Brodmann Areas 14, 25, ventral parts of 24 and 32, the caudal part of 10 and the medial part of BA 11 (Fig 3A, and Materials and methods).

On each trial, the photograph and associated pre-selected cue phrase relating to each event were displayed on a screen for 3 seconds. Following removal of this cue, participants then closed their eyes and recalled the memory. After 12 seconds, the black screen flashed white twice, to cue the participant to open their eyes. The participant was then asked to rate how vivid the memory recall had been using a five-key button box, on a scale of 1-5, where 1 was not vivid at all, and 5 was highly vivid (Fig 3B).

bioRxiv preprint first posted online Oct. 13, 2017; doi: http://dx.doi.org/10.1101/202689. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.

We used Representational Similarity Analysis (RSA) to quantify the extent to which the strength of memory representations in the vmPFC differed as a function of memory age. This was achieved by contrasting the similarity of neural patterns when recalling the same memory with their similarity to other memories to yield a "neural representation" score for each memory (see Materials and methods, Fig 3C). As there were two memories recalled per time period, the neural representation scores were averaged to produce one value for that time period.

We anticipated an increase in the strength of memory representations at some point between 0.5M and 24M, in line with the results of Bonnici and Maguire (55).
This is what we observed, where the most recent 0.5M memories were undetectable (t(29) = 0.72, p = 0.477) in vmPFC, in contrast to the distinct neural signatures observed for 4M (t(29) = 2.85, p = 0.008), 8M (t(29) = 3.09, p = 0.004) and 12M (t(29) = 3.66, p < 0.001) old memories (Fig 4A). These changes in the strength of memory representations were significant across time periods (F(7, 203) = 2.22, p = 0.034), with an observed increase in vmPFC recruitment from 0.5M to 8M (t(29) = 2.07, p = 0.048) and 12M (t(29) = -2.20, p = 0.036).

However, what was observed for the following two time periods was unexpected: an apparent disengagement of the vmPFC over the next eight months, as we observed weak detectability of memory representations in vmPFC for 16M (t(29) = 1.85, p = 0.074) and 20M (t(29) = 1.03, p = 0.310) old memories. Neither 16M (t(29) = -1.06, p = 0.298) nor 20M memories (t(29) = -0.40, p = 0.691) were more strongly represented than the recent 0.5M old memories. In contrast, the more remote 24M (t(29) = 4.34, p < 0.001) and 60M (t(29) = 3.55, p = 0.001) memories were detectable in the vmPFC, and significantly more so than the most recent memories (24M vs 0.5M, t(29) = -2.93, p = 0.007; 60M vs 0.5M, t(29) = -2.54, p = 0.017) as well as the more temporally proximal 20M old memories (24M vs 20M, t(29) = -2.50, p = 0.018; 60M vs 20M, t(29) = -2.32, p = 0.028).

The experimental design afforded us the opportunity to verify this biphasic pattern. As we sampled two memories per time-point, this time-dependent pattern should be evident in both sets of memories. As shown in Fig 4B, the two sets of memories followed a similar time-course of changes in representation within vmPFC.
This is a compelling replication, given that the two memories from each time-period were unrelated in content as a prerequisite for selection, recalled in separate sessions in the scanner and analysed independently from each other.

The availability of two memories at each time-point also permitted the use of an alternative approach to calculating neural representation scores. Instead of using the similarity to memories from other time-points as a baseline, we could also assess if memories were similar to their temporally matched counterpart in the other set. As can be seen in Fig S3, the biphasic pattern is preserved even when just using one identically aged memory as a baseline. In other words, the distinguishable patterns are specific to each individual memory rather than attributable to general retrieval processes associated with any memory of the same age.
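The logic of the neural representation score described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the use of Pearson correlation as the similarity measure, the simple difference between mean within-memory and mean between-memory similarity, and the names `pearson`, `representation_score` and `one_sample_t` are all assumptions made for illustration (the actual procedure is given in Materials and methods). The last function shows how per-participant scores could be tested against zero, in the spirit of the t(29) statistics reported above.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def representation_score(patterns, memory):
    """Within-memory minus between-memory similarity (illustrative definition).

    patterns: dict mapping a memory label to a list of trial patterns,
              each pattern being a list of voxel values.
    """
    own = patterns[memory]
    # Similarity between repeated recalls of the same memory
    within = [pearson(a, b) for a, b in combinations(own, 2)]
    # Similarity between recalls of this memory and all other memories
    between = [
        pearson(a, b)
        for other, trials in patterns.items()
        if other != memory
        for a in own
        for b in trials
    ]
    return sum(within) / len(within) - sum(between) / len(between)


def one_sample_t(scores):
    """t statistic for testing whether the mean score differs from zero."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean / math.sqrt(var / n)


# Toy data: two memories, two recall trials each, four "voxels"
toy_patterns = {
    "A": [[1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 3.2, 3.9]],
    "B": [[4.0, 3.0, 2.0, 1.0], [3.9, 3.1, 2.2, 1.0]],
}
score_a = representation_score(toy_patterns, "A")
```

A positive score means that repeated recalls of a memory resemble each other more than they resemble recalls of other memories. Restricting the `between` comparison to the temporally matched memory from the other set would correspond to the alternative baseline described in the preceding paragraph.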

The observed temporal relationship is unique to vmPFC
Our main focus was the vmPFC, given previous work highlighting specifically this region's role in representing autobiographical memories over time (54, 55). We also scanned within a partial volume (to attain high spatial resolution with a reasonable TR), so were constrained in what other brain areas were available for testing (see Materials and methods). Nevertheless, we examined the same […] (Table S2).

Following scanning, participants completed three additional ratings. They were asked to indicate the extent to which the memories were changed by the 6 repetitions during scanning on a scale ranging from 1 (not at all) to 5 (completely). They reported that the memories were not changed very much by repetition (mean: 2.61, SD: 0.74). They were also asked how often during scanning they thought about the memory interview one week previously on a scale of 1 (not at all) to 5 (completely), with participants indicating they rarely thought about the interview (mean: 2.29, SD: 1.01). Finally, participants were asked the extent to which the recall of memories from each time period unfolded in a consistent manner over the course of the session. A difference was observed (F(7, 203) = 2.78, p = 0.009), with the most recent 0.5M old memories being rated as more consistently recalled than the most remote 60M memories (t(29) = 3.97, p = 0.012).

In addition to the ROI-based approach, a searchlight analysis was also conducted in MNI group normalised space to localise areas within the vmPFC where memories displayed high detectability across participants (see Materials and methods).
We discovered a significant bilateral cluster of 652 voxels (Fig S4A), and subsequently used RSA to quantify the strength of neural representations at each time-point within this area (Fig S4B). The results were highly similar to the whole-ROI analysis in native space, suggesting the main result may be driven by more spatially confined activity within the vmPFC. However, a searchlight approach is sub-optimal to answer the current research question, as it requires an a priori model RSM against which to compare the neural patterns at each searchlight sphere, whereas the ROI approach makes no such assumptions.

We also conducted a standard mass-univariate analysis on the whole volume with memory remoteness as a parametric regressor, and this did not reveal any significant results, consistent with the findings of Bonnici et al. (54). In a similar parametric analysis, we did not find evidence of the modulation of univariate activity by in-scanner vividness ratings as might be suggested by the findings of Sheldon and Levine (59); however, all memories chosen for the current study were highly vivid in nature.

Rationale and predictions for Experiment 2
The biphasic pattern we observed in the fMRI data did not manifest itself in the subjective or objective behavioural data. In fact, the only difference in those data was higher ratings for the most recent 0.5M old memories. However, these were paradoxically the most weakly represented memories in the vmPFC, meaning the neural patterns were not driven by memory quality. The objective scoring of the memories confirmed comparable levels of detail provided for all memories, without any significant drop in episodic detail or increase in the amount of semantic information provided as a function of time. Therefore, the amount or nature of the memory details were not contributing factors.

Nevertheless, to verify that the results genuinely represented the neural correlates of memory purely as a function of age, one would need to study the effects of the passage of time on the individual neural representations. Therefore we invited the participants to revisit eight months later to recall the same memories again both overtly and during scanning; 16 of the participants agreed to return. In order to generate specific predictions for the neural representations during Experiment 2, we took the actual data for the 16 subjects from Experiment 1 who returned eight months later ( […]

[…] participants from Experiment 1 (green line) who returned for Experiment 2. Light grey arrows indicate the hypotheses. * p < 0.05, ** p < 0.01.
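As a small worked illustration of the arithmetic behind the follow-up design, the sketch below shifts each Experiment 1 time-point by the eight-month delay. The variable names are hypothetical, and the code only computes memory ages; it does not reproduce the hypotheses themselves, which the text derives from the Experiment 1 data of the returning participants.

```python
# Experiment 1 time-points, in months before testing (from the text)
TIME_POINTS_MONTHS = [0.5, 4, 8, 12, 16, 20, 24, 60]
DELAY_MONTHS = 8  # interval between Experiment 1 and Experiment 2

# Age of each memory when recalled again in Experiment 2
followup_ages = {age: age + DELAY_MONTHS for age in TIME_POINTS_MONTHS}

# Memories whose follow-up age coincides exactly with an age
# that was already sampled in Experiment 1
matched = [age for age in TIME_POINTS_MONTHS
           if followup_ages[age] in TIME_POINTS_MONTHS]

for age in TIME_POINTS_MONTHS:
    print(f"{age}M in Experiment 1 -> {followup_ages[age]}M in Experiment 2")
```

Under this arithmetic, the memories that were 24M old in Experiment 1 are 32M old at follow-up, an age that was not sampled in Experiment 1.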
Experiment 2 (eight months later)
One week prior to the fMRI scan, with the assistance of the personal photographs and previously chosen phrases which were used as cues in Experiment 1, the participants verbally recalled and rated the characteristics of their autobiographical memories just as they had done eight months previously (see Materials and methods and Fig 6A).

A similar level of detail was recalled across experiments
As with Experiment 1, transcripts of participants' memory interviews during Experiment 2 were scored using the Autobiographical Interview protocol [(56); see Materials and methods]. A total of 6,444 details were scored (see Table S3B for means, SD). There was a difference in the number of details recalled across different time periods in Experiment 2 (F(7, 105) = 2.49, p = 0.021). However, this difference was only observed for external details (F(7, 105) = 3.25, p = 0.004), with more provided for 28M memories than 12M memories (t(15) = -4.68, p = 0.008). As with Experiment 1, the number of internal and external details differed (F(1, 15) = 72.57, p < 0.001), with more internal details recalled for […]

[…] when examining other brain areas within the partial volume in Experiment 2, in no case did we find a significant difference in memory detectability across time periods.

Following scanning in Experiment 2, participants completed three additional ratings. They were asked to indicate the extent to which the memories were changed by the 6 repetitions during scanning on a scale ranging from 1 (not at all) to 5 (completely). As in Experiment 1, they reported that the memories were not changed very much by repetition (mean: 2.56, SD: 0.81). They were also asked how often they thought of the experience of recalling the memories in Experiment 1 while performing the scanning task in Experiment 2 on a scale of 1 (not at all) to 5 (during every memory).

This study exploited the sensitivity of RSA to detect not only the extent to which memories of different ages were represented in the vmPFC, but how these representations changed over time. During Experiment 1, we observed detectability in vmPFC for memories at 4M to 12M of age, which was also evident at 24M and 60M. As expected, recent 0.5M old memories were poorly represented in vmPFC in comparison. Curiously, however, the same lack of detectability in vmPFC was observed for memories that were 16M to 20M old. This pattern persisted across separate sets of memories and was replicated in a follow-up study eight months later with the same participants and memories. Behavioural data failed to account for these time-dependent representational changes in either experiment, and other regions failed to show a significant change in memory representations over time. These findings are difficult to accommodate within existing theoretical frameworks of long-term memory consolidation (9, 12, 60-62), as neocortical recruitment is generally assumed to involve an ascending linear trajectory. Consolidation has been characterised as fluid and continuous (63), but the biphasic vmPFC engagement observed here suggests additional complexity in its temporal recruitment.

Processing of consolidated memories in the vmPFC
The observed weak representation of recent memories is consistent with the time-dependent nature of systems-level consolidation. Likewise, the progressive increase in vmPFC memory detectability from 4 to 12 months is indicative of memories being consolidated with the passage of time. However, the subsequent weakening of memory representations from 12 to 20 months, and a re-engagement for more remote memories, suggests the vmPFC performs separate operations on consolidated representations, one medium and one longer-term.

The initial period of detectability (4-12M) represents a congested memory space (64) compared to remote life periods. Therefore interference from related memories may be an issue. The vmPFC is involved in the inhibition of irrelevant memories (65, 66), which may be the medium-term cognitive process underlying the first phase of recruitment. Remote memories (>24M) are potentially less affected by interference, but may require more cognitive flexibility to retrieve due to sharing fewer features with the current spatiotemporal context. Such facilitation by the vmPFC may arise from the deployment of schema by this region (67), which refers to the abstraction of elements common to multiple experiences. Such schema could be used to rapidly and preconsciously confine memory search to a subset of temporally distant representations. This would likely become a necessary long-term strategy for all remote memories.
Memories of an interim age may simply not rely as much on the vmPFC for successful retrieval due to minimal interference from related memories which have decayed, and sufficient temporal proximity to render schema-mediated retrieval unnecessary.

It is worth noting, however, that the observed neural patterns are unlikely to reflect time-sensitive general processes which are independent of memory content. If that were the case, the two sets of memories would be indistinguishable from each other at every time period. However, neural patterns for individual memories were as distinct from their temporally matched counterparts as memories of different ages. The neural activation is therefore memory-specific, suggestive of a locally-stored representation at the very least, and more likely an additional associated executive process. The strengthening of neural connections over time may therefore be linear, but the extent to which the vmPFC utilises these representations during retrieval is not, and remains highly sensitive to memory age. Fully appreciating the role of the vmPFC in memory retrieval may involve a combination of both storage and processing in this region, and how they interact (68).

Using an autobiographical memory paradigm to study consolidation is preferable to laboratory-based episodic memory tests by virtue of its ecological validity, availability of temporally distant stimuli, clinical significance and context-dependent equivalence to animal tasks. However, studying autobiographical memory carries with it potential confounds which can affect interpretation of results. Below we consider why these factors cannot account for our observations.

Consistency of recall and forgetting
Older memories may yield a higher RSA score if they are more consistently recalled. Here, however, participants actually rated 0.5M memories as more consistently recalled than 60 month old memories. Older memories were not impoverished in detail when compared to the detail available for recent memories. Moreover, an inspection of interview transcripts across experiments revealed participants rarely offered new details for previous memories when retested, countering the suggestion that increased detectability of old memories may arise from the insertion of new episodic or semantic details (69). The consistency in recalled detail across experiments could be attributable to participants recalling in Experiment 2 what they had said during Experiment 1. However, whether or not participants remembered by proxy is irrelevant, as they still recalled the specific details of the original event, removing forgetting as a potential explanation of changes in neural patterns over time.

The influence of repetition
Retrieving a memory initiates reconsolidation, a transient state where memories are vulnerable to interference (70, 71). Therefore, repeated retrieval may cause this process to have an influence on neural representations. However, all memories were recalled one week before the fMRI scan, so if such an effect was present it would be matched across time-points. Retrieval at this stage may also accelerate consolidation (72), yet if this was a major influence, we would likely have found 0.5M memories to be more detectable than they were. Further repetition of memories within the scanner in Experiment 1 took place over a timescale that could not affect consolidation processes or interpretation of the initial neural data. Nevertheless, this could arguably affect vmPFC engagement over a longer period of time (73) and thus perturb the natural course of consolidation, influencing the results of Experiment 2. However, given that seven out of the eight specifically hypothesised temporally sensitive changes in neural representations were supported, an altered or accelerated consolidation time-course appears highly unlikely. Again, recall recency was matched in Experiment 2 by the memory interview, and recall frequency between experiments was low.

Taking a more general and parsimonious perspective, the ratings demonstrate that, naturally, all memories are recalled on an occasional basis (Table S1), therefore it seems highly unlikely that a mere six repetitions within a scanning session would significantly alter the time course of systems-level consolidation.
It should also be noted that successful detection of neural patterns relied on the specific content of each memory, rather than being due to generic time-related retrieval processes (Fig S3). One alternative to the current two-experiment longitudinal design to limit repetition across experiments would be to have a different group of participants with different memories for the second experiment. However, the strength of the current approach was the ability to track the transformation in neural patterns of the same memories over time.

The effect of selection
An alternative interpretation of the time-sensitive vmPFC engagement is a systematic bias in the content of selected memories, for example, annual events that coincide across all participants, such as a seasonal holiday. However, recruitment took place over a period of five months in an evenly spaced manner, ensuring that such events did not fall into the same temporal windows across participants. The occurrence of personal events such as birthdays was also random across participants. The use of personal photographs as memory cues also limited the reliance on time of year as a method for strategically retrieving memories. Furthermore, the nature of memory sampling was that unique, rather than generic, events were eligible, reducing the likelihood of events which were repeated annually being included. Memory detectability was high at 12 month intervals such as one, two and five years in this study, suggesting perhaps it is easier to recall events which have taken place at a similar time of year to the present. However, this should have been reflected in behavioural ratings, and equivalently strong neural representations for recent memories, but neither was observed. Most importantly, if content rather than time-related consolidation was the main influence on memory detectability, then we would not have observed any change in neural representation scores from Experiment 1 to Experiment 2, rather than the hypothesised shifts which emerged.

A related concern is that memories across time differ in nature because they differ in availability.
Successful memory search is biased towards recency, meaning there are more events to choose from in the last few weeks than in remote time periods. Here, this confound is circumvented by design, given that search was equivalently constrained and facilitated at each time-point by the frequency at which participants took photographs, which was not assumed to change in a major way over time. These enduring "snap-shots" of memory, located within tight temporal windows (see Materials and methods), meant that memory selection was not confounded by retrieval difficulty or availability. It could also be argued that selection of time-points for this study should have been biased towards recency given that most forgetting occurs in the weeks and months after learning. However, it is important to dissociate systems-level consolidation from forgetting, as they are separate processes which are assumed to follow different time-courses. Memory forgetting follows an exponential decay (74), whereas systems-level consolidation has generally been assumed, until now, to be gradual and linear (75). Our study was concerned only with vivid, unique memories which were likely to persist through the systems-level consolidation process.

Value
Given that the medial prefrontal cortex is often associated with value and emotional processing (76), could these factors have influenced the current findings? Humans display a bias towards consolidating positive memories (77), and remembered information is more likely to be valued than that which is forgotten (78). Activity in vmPFC during autobiographical memory recall has been found to be modulated by both the personal significance and emotional content of memories (79). However, in the current two experiments, memories were matched across time periods on these variables, and the selection of memories through photographs taken on a day-to-day basis also mitigated this effect. In the eight months between experiments, memories either remained unchanged or decreased slightly in their subjective ratings of significance and positivity, suggesting that these factors are an unlikely driving force behind the observed remote memory representations in vmPFC. For example, if recent memories in Experiment 1 were not well-represented in vmPFC because they were relatively insignificant, there is no reason to expect them to be more so eight months later, yet their neural representation strengthened over time nonetheless.

Relation to previous findings
A methodological discrepancy between this experiment and that conducted by Bonnici et al. (54) is the additional use of a photograph to assist in cueing memories. One possible interpretation of the neural representation scores is that they represent a role for the vmPFC in the maintenance of visual working memory following cue offset. However, the prefrontal cortex is unlikely to contribute to maintenance of visual information (80). Furthermore, were this to be the driving effect behind neural representations here, the effect would be equivalent across time-periods, yet it is not.

There is, however, an obvious inconsistency between the findings of the current study and that of Bonnici et al. (54). Unlike that study, we did not detect representations of 0.5M old memories in vmPFC. It could be that the support vector machine classification-based MVPA used by Bonnici et al. (54) is more sensitive to detection of memory representations than RSA; however, the current study was not optimised for such an analysis because it necessitated an increased ratio of conditions to trials. Nonetheless, the increase in memory representation scores from recent to remote memories was replicated and additionally refined in the current study with superior temporal precision. One observation which was consistent with the Bonnici findings was the detection of remote memories in the hippocampus, which also supports theories positing a perpetual role for this region in the vivid retrieval of autobiographical memories (10, 12). However, the weak detectability observed at more recent time points may reflect a limitation of the RSA approach employed here to detect sparsely encoded hippocampal patterns, which may be overcome by a more targeted subfield analysis (81).

In the light of our hypotheses, Experiment 2 generated one anomalous finding. Twenty-four month old memories from Experiment 1 were no longer well represented eight months later. Why memories around 32M of age are not as reliant on vmPFC is unclear, but unlike other time-periods, we cannot verify this finding in the current experiment, as we did not sample 32M memories during Experiment 1.

Summary
The current results revealed a two-stage process of autobiographical memory retrieval in the vmPFC over the course of systems-level consolidation, which was remarkably preserved across completely different sets of memories in one experiment, and closely replicated in a subsequent longitudinal experiment with the same participants and memories. These findings support the notion that the vmPFC becomes increasingly important over time for the retrieval of remote memories. Two particularly novel findings emerged. First, this process occurs relatively quickly, by four months following an experience. Second, vmPFC involvement after this time fluctuates in a highly consistent manner, depending on the precise age of the memory in question. Further work is clearly needed to explore the implications of these novel results. Overall, we conclude that our vmPFC findings may be explained by a dynamic interaction between the changing strength of a memory trace, the availability of temporally adjacent memories, and the concomitant differential strategies and schemas that are deployed to support the successful recollection of past experiences.

Thirty healthy, right-handed participants (23 female) took part (mean age 25.3, SD 3.5, range 21-32). All had normal or corrected-to-normal vision.

Memory interview and selection of autobiographical memories
Participants were instructed to select at least three photographs from each of eight time-points in their past (0.5M, 4M, 8M, 12M, 16M, 20M, 24M and 60M relative to the time of taking part in the experiment) which reminded them of vivid, unique and specific autobiographical events. The sampling was retrospective, in that the photographs were chosen from the participants' pre-existing photograph collections and not prospectively taken with the study in mind. Highly personal, emotionally negative or repetitive events were deemed unsuitable. An additional requirement was that memories from the same time period should be dissimilar in content. For the four most recent time periods (0.5M-12M), the memories should have taken place within two weeks either side of the specified date, yielding a window of one month; for the next three time points (16M-24M), within three weeks either side, yielding a window of six weeks; and for the most remote time point (60M), within one month either side, yielding a window of two months. This graded approach was adopted to balance temporal precision with the availability of suitable memories at more remote time-points.

Participants were asked to describe in as much detail as possible the specific autobiographical memory elicited by a photograph. General probes were given by the interviewer where appropriate (e.g., "what else can you remember about this event?"). Participants were also asked to identify the most memorable part of the event which took place within a narrow temporal window and unfolded in an event-like way.
They then created a short phrase pertaining to this episode, which was paired with the photograph to facilitate recall during the subsequent fMRI scan (Fig 1A). Participants were asked to rate each memory on a number of characteristics (see main text, Figs 1 and 6, Tables S1 and S3), and two memories from each time period which satisfied the criteria of high vividness and detail, and ease of recall were selected for recollection during the fMRI scan.

Behavioural analyses
The interview was recorded and transcribed to facilitate an objective analysis of the details, and the widely-used Autobiographical Interview method was employed for scoring (56). Details provided for each memory were scored as either "internal" (specific events, temporal references, places, perceptual observations and thoughts or emotions) or "external" (unrelated events, semantic knowledge, repetition of details or other more general statements). To assess inter-rater reliability, a subset of sixteen memories (n=2 per time period) was randomly selected across 16 different subjects and scored by another experimenter blind to the aims and conditions of the study. Intra-class correlation coefficient estimates were calculated using SPSS statistical package version 22 (SPSS Inc, Chicago, IL) based on a single measures, absolute-agreement, 2-way random-effects model.

As two memories per time period were selected for later recall in the scanner, behavioural ratings were averaged to produce one score per time period. Differences in subjective memory ratings across time periods were analysed using a one-way repeated measures ANOVA with Bonferroni-corrected paired t-tests. Differences in objective memory scores of internal and external details across time periods were analysed using a two-way repeated measures ANOVA with Bonferroni-corrected paired t-tests. A threshold of p < 0.05 was used throughout both experiments. All ANOVAs were subjected to Greenhouse-Geisser adjustment to the degrees of freedom if Mauchly's sphericity test identified that sphericity had been violated.

Task during fMRI scanning
Participants returned approximately one week later (mean 6.9 days, SD 1) to recall the memories while undergoing an fMRI scan. Prior to the scan, participants were trained to recall each of the 16 memories within a 12 second recall period [as in Bonnici et al. (54), Bonnici and Maguire (55)], when cued by the photograph alongside its associated cue phrase. There were two training trials per memory, and participants were asked to vividly and consistently recall a particular period of the original event which unfolded across a temporal window matching the recall period.

During scanning, participants recalled each memory six times (6 trials x 16 memories = 96 trials). The two memories from each time period were never recalled together in the same session, nor was any one memory repeated within a session, resulting in 12 separate short sessions with eight trials in each, an approach recommended for optimal detection of condition-related activity patterns using MVPA (82). Trials were presented in a random order within each session. On each trial, the photograph and associated pre-selected cue phrase relating to each event were displayed on screen for three seconds. Following removal of this cue, participants then closed their eyes and recalled the memory. After 12 seconds, the black screen flashed white twice, to cue the participant to open their eyes. The participant was then asked to rate how vivid the memory recall had been using a five-key button box, on a scale of 1-5, where 1 was not vivid at all, and 5 was highly vivid.
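The session structure described above can be illustrated with a minimal sketch. It assumes the two memories per time period are labelled as sets "A" and "B" and that the sets simply alternate across sessions; the text does not specify how memories were assigned to sessions, so the alternation, the function name `build_sessions` and the random seed are illustrative assumptions only.

```python
import random

TIME_PERIODS = ["0.5M", "4M", "8M", "12M", "16M", "20M", "24M", "60M"]
N_REPEATS = 6  # each memory is recalled six times


def build_sessions(seed=0):
    """Return 12 sessions of 8 trials; each trial is a (time_period, set) pair.

    Assumption: sets "A" and "B" alternate across sessions, so the two memories
    from one time period never appear in the same session.
    """
    rng = random.Random(seed)
    sessions = []
    for _ in range(N_REPEATS):
        for memory_set in ("A", "B"):
            trials = [(period, memory_set) for period in TIME_PERIODS]
            rng.shuffle(trials)  # random trial order within each session
            sessions.append(trials)
    return sessions


sessions = build_sessions()
n_trials = sum(len(session) for session in sessions)
print(len(sessions), n_trials)  # 12 sessions, 96 trials
```

Under this layout every memory occurs exactly once in six of the twelve sessions, which is what allows similarity to be computed only between trials from separate runs.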

MRI data acquisition
Structural and functional data were acquired using a 3T MRI system (Magnetom TIM Trio, Siemens Healthcare, Erlangen, Germany). Both types of scan were performed within a partial volume which incorporated the entire extent of the ventromedial prefrontal cortex (Fig 3A).

Structural images were collected using a single-slab 3D T2-weighted turbo spin echo sequence with variable flip angles (SPACE) (83) in combination with parallel imaging, to simultaneously achieve a high image resolution of ~500 μm, high sampling efficiency and short scan time while maintaining a sufficient signal-to-noise ratio (SNR). After excitation of a single axial slab the image was read out with the following parameters: resolution = 0.52 x 0.52 x 0.5 mm, matrix = 384 x 328, partitions = 104, partition thickness = 0.5 mm, partition oversampling = 15.4%, field of view = 200 x 171 mm², TE = 353 ms, TR = 3200 ms, GRAPPA x 2 in phase-encoding (PE) direction, bandwidth = 434 Hz/pixel, echo spacing = 4.98 ms, turbo factor in PE direction = 177, echo train duration = 881, averages = 1.9. For reduction of signal bias due to, for example, spatial variation in coil sensitivity profiles, the images were normalized using a prescan, and a weak intensity filter was applied as implemented by the scanner's manufacturer. To improve the SNR of the anatomical image, three scans were acquired for each participant, coregistered and averaged. Additionally, a whole brain 3D FLASH structural scan was acquired with a resolution of 1 x 1 x 1 mm.

Functional data were acquired using a 3D echo planar imaging (EPI) sequence which has been demonstrated to yield improved BOLD sensitivity compared to 2D EPI acquisitions (84). Image resolution was 1.5 mm³ and the field-of-view was 192 mm in-plane. Forty slices were acquired with 20% oversampling to avoid wrap-around artefacts due to imperfect slab excitation profile.
The echo time (TE) was 37.30 ms and the volume repetition time (TR) was 3.65 s. Parallel imaging with GRAPPA image reconstruction (85), with an acceleration factor of 2 along the phase-encoding direction, was used to minimize image distortions and yield optimal BOLD sensitivity. The dummy volumes necessary to reach steady state and the GRAPPA reconstruction kernel were acquired prior to the acquisition of the image data as described in Lutti et al. (84). Correction of the distortions in the EPI images was implemented using B0-field maps obtained from double-echo FLASH acquisitions (matrix size 64 x 64; 64 slices; spatial resolution 3 mm³; short TE = 10 ms; long TE = 12.46 ms; TR = 1020 ms) and processed using the FieldMap toolbox available in SPM (86).

MRI data preprocessing
fMRI data were analysed using SPM12 (www.fil.ion.ucl.ac.uk/spm). All images were first bias corrected to compensate for image inhomogeneity associated with the 32 channel head coil (87). Fieldmaps collected during the scan were used to generate voxel displacement maps. EPIs for each of the twelve sessions were then realigned to the first image and unwarped using the voxel displacement maps calculated above. The three high-resolution structural images were averaged to reduce noise, and co-registered to the whole brain structural scan. EPIs were also co-registered to the whole brain structural scan. Manual segmentation of the vmPFC was performed using ITK-SNAP on the group averaged structural scan normalised to MNI space.
The normalised group mask was warped back into each participant's native space using the inverse deformation field generated by individual participant structural scan segmentations. The overlapping voxels between this participant-specific vmPFC mask and the grey matter mask generated by the structural scan segmentation were used to create a native-space grey matter vmPFC mask for each individual participant.

Representational Similarity Analysis
Functional data were analysed at the single subject level without warping or smoothing. Each recall trial was modelled as a separate GLM, which comprised the 12 second period from the offset of the memory cue to just before the white flash which indicated to the participant they should open their eyes. Motion parameters were included as regressors of no interest. RSA (88) was performed using the RSA toolbox (http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/) and custom MATLAB (version R2014a) scripts. In order to account for the varying levels of noise across voxels which can affect the results of multivariate fMRI analyses, multivariate noise normalisation (89) was performed on the estimated pattern of neural activity separately for each trial. This approach normalises the estimated beta weight of each voxel using the residuals of the first-level GLM and the covariance structure of this noise. This results in the down-weighting of noisier voxels and a more accurate estimate of the task-related activity of each voxel.

The average number of voxels analysed in the vmPFC across the two sets of memories was 5252 (SD 1227). Whole ROI-based analysis was preferred to a searchlight approach which would involve comparing neural with model similarity matrices (90), as we did not have a strong a priori hypothesis about changes in neural representations over time against which to test the neural data, nor did we want to make assumptions regarding the spatial distribution of informative voxels in the vmPFC.
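The multivariate noise normalisation step described above can be sketched roughly as follows. This is not the RSA toolbox implementation: the function name `noise_normalise`, the fixed `shrinkage` weight used to regularise the covariance, and the simulated residuals are all assumptions made for illustration. The sketch only shows the general idea of whitening a beta pattern with the inverse square root of the residual covariance.

```python
import numpy as np


def noise_normalise(beta, residuals, shrinkage=0.4):
    """Whiten one trial's beta pattern by the voxel noise covariance.

    beta:      (n_voxels,) estimated activity pattern for one trial
    residuals: (n_timepoints, n_voxels) residuals of that trial's GLM
    shrinkage: assumed fixed weight pulling the covariance towards its diagonal
    """
    n_timepoints = residuals.shape[0]
    cov = residuals.T @ residuals / n_timepoints
    # Regularise: blend the full covariance with its diagonal so it is invertible.
    cov = (1 - shrinkage) * cov + shrinkage * np.diag(np.diag(cov))
    # Inverse matrix square root via eigendecomposition (cov is symmetric).
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitener = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return beta @ whitener


# Simulated example: two quiet voxels and two noisy voxels with equal betas.
rng = np.random.default_rng(0)
noise_sd = np.array([1.0, 1.0, 5.0, 5.0])
residuals = rng.standard_normal((200, 4)) * noise_sd
beta = np.ones(4)
normalised = noise_normalise(beta, residuals)
```

In this simulated case the two noisy voxels end up with smaller normalised values than the two quiet ones, which is the down-weighting effect the text refers to.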
As participants recalled two memories per time-point, the dataset was first split into two sets of eight time points, which were analysed separately using RSA. To characterise the strength of memory representations in the vmPFC, the similarity of neural patterns across recall trials of the same memory was first calculated using the Pearson product-moment correlation coefficient, resulting in a "within-memory" similarity score. Then the neural patterns of each memory were correlated with those of all other memories, yielding a "between-memory" similarity score. Both within- and between-memory correlations were performed on trials from separate runs. For each memory, the between-memory score was then subtracted from the within-memory score to provide a neural representation score (Fig 3C). This score was then averaged across the two memories at each time-point. Results for the left and the right hemispheres were highly similar, and therefore the data we report here are from the vmPFC bilaterally. A distinctive neural pattern associated with the recall of memories at each time period would yield a score significantly higher than zero, which was assessed using a one-sample t-test. Strengthening or weakening of memory representations as a function of remoteness would result in a significant difference in memory representation scores across time periods, and this was assessed using a one-way repeated measures ANOVA with post-hoc two-tailed paired t-tests.
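As a rough illustration of the neural representation score defined above, the following sketch computes within-memory minus between-memory Pearson similarity using only pairs of trials from different runs. The data layout (`patterns[memory][run]`), the function names and the toy four-voxel patterns are assumptions for illustration; they are not taken from the study's MATLAB scripts.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def representation_scores(patterns):
    """patterns[memory][run] -> voxel pattern (list of floats).

    Returns {memory: within-memory similarity minus between-memory similarity},
    using only pairs of trials taken from different runs.
    """
    scores = {}
    for mem, runs in patterns.items():
        within = [pearson(runs[i], runs[j]) for i, j in combinations(sorted(runs), 2)]
        between = [
            pearson(pattern, other_pattern)
            for run, pattern in runs.items()
            for other, other_runs in patterns.items() if other != mem
            for other_run, other_pattern in other_runs.items() if other_run != run
        ]
        scores[mem] = sum(within) / len(within) - sum(between) / len(between)
    return scores


# Toy patterns: two memories, three runs each, four voxels (invented values).
patterns = {
    "4M": {1: [1.0, 2.0, 3.0, 4.0], 2: [1.1, 2.0, 2.9, 4.2], 3: [0.9, 2.1, 3.0, 3.9]},
    "8M": {1: [4.0, 1.0, 3.0, 2.0], 2: [4.1, 0.9, 3.2, 2.0], 3: [3.8, 1.0, 3.0, 2.1]},
}
scores = representation_scores(patterns)
```

A memory whose pattern is reproduced consistently across runs, and differs from the patterns of other memories, receives a positive score, as both toy memories do here.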
Error bars on graphs displaying neural representation scores were normalised to reflect within- rather than between-subject variability in absolute values, using the method recommended by Cousineau (91) for within-subjects designs. The range of values that we observed is entirely consistent with those in other studies employing a similar RSA approach in a variety of learning, memory and navigation tasks in a wide range of brain regions (92-101).
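The within-subject normalisation of error bars mentioned above can be sketched as follows, assuming the commonly described form of the Cousineau procedure: each participant's scores are centred on that participant's own mean and the grand mean is added back before the standard error is computed per condition. The function names and the toy data are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def cousineau_normalise(data):
    """data[subject] -> list of scores, one per condition (e.g. time period)."""
    grand_mean = mean(v for row in data for v in row)
    # Remove each subject's overall level, then restore the grand mean.
    return [[v - mean(row) + grand_mean for v in row] for row in data]


def within_subject_sem(data):
    """Standard error per condition after removing between-subject offsets."""
    normalised = cousineau_normalise(data)
    n_subjects = len(normalised)
    return [stdev(column) / sqrt(n_subjects) for column in zip(*normalised)]


# Toy data: three subjects with very different baselines but the same
# condition effect (+0.2), so within-subject variability is essentially zero.
data = [[0.1, 0.3], [0.5, 0.7], [0.9, 1.1]]
sem = within_subject_sem(data)
between_sem = [stdev(column) / sqrt(len(data)) for column in zip(*data)]
```

In the toy data the conventional between-subject standard error is large because the baselines differ, whereas the normalised standard error is close to zero because every subject shows the same change across conditions.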

Searchlight analysis
An RSA searchlight analysis was conducted in normalised space, on multivariate noise-normalised data within the ROI. This approach selected every voxel within the ROI and, using a volumetric approach constrained by the shape of the ROI, expanded the region around that voxel until it contained 160 voxels. Within each of these searchlights, memories were correlated with themselves and with other memories, analogous to the standard ROI approach. The resulting neural RSM was then correlated, using Spearman's rank correlation coefficient, with a model RSM which consisted of ones along the diagonal and zeros on the off-diagonal. This model RSM was used to test whether individual memories were detectable across all time-points. For every voxel, the average correlation from every searchlight it participated in was calculated, to generate a more representative score of its informational content. Parametric assumptions regarding the spatial distribution of unsmoothed data may not hold. Therefore we used statistical nonparametric mapping (SnPM13) on the resulting searchlight images. We used 10,000 random permutations, a cluster-based significance threshold of t=3, and a family-wise-error corrected threshold of p<0.05 within the ROI.

Sixteen of the 30 participants who took part in Experiment 1 returned to take part in Experiment 2 (14 female, mean age 24.7, SD 3.1, range 21-33) approximately eight months later (8.4 months, SD 1.2).

MRI data acquisition
Structural and functional data were acquired using the same scanner and scanning sequences as Experiment 1. However, the prior acquisition of the partial volume structural MRI scans negated the need to include these in the protocol of Experiment 2.

MRI data preprocessing
fMRI data were preprocessed using the same pipeline as Experiment 1, with the additional step of co-registering the functional scans of Experiment 2 to the structural scans of Experiment 1, which enabled the use of the vmPFC masks from Experiment 1. First-level GLMs of each recall trial were constructed in an identical manner to Experiment 1.

Representational Similarity Analysis
RSA of the Experiment 2 fMRI data was conducted in an identical manner to Experiment 1. The average number of voxels analysed in the vmPFC across the two sets of memories for all participants was 5228 (SD 1765). To generate predicted changes in representations in the eight months from Experiment 1 to Experiment 2, the scores from Experiment 1 were shifted by two time-points, and a two-tailed paired t-test was performed on each memory's original neural representation score and its expected score eight months later (Fig 5). To ascertain whether the observed neural representation scores had changed between Experiments 1 and 2, a two-way (experiment x time period) repeated measures ANOVA was performed. To investigate if these changes mirrored the predictions generated by the original data, paired t-tests were performed between the actual neural representation scores for each memory from Experiment 1 and Experiment 2, one-tailed if there was a hypothesised increase or decrease.
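The "shift by two time-points" used to generate predictions can be sketched as below. Because the sampled ages are not evenly spaced (0.5M and 60M fall off the four-month grid), this sketch assumes a prediction is only available where the shifted time-point is roughly eight months older; how the remaining time-points were handled is not stated in the text. The function name and the score values are invented placeholders, not data from the study.

```python
TIME_PERIODS = ["0.5M", "4M", "8M", "12M", "16M", "20M", "24M", "60M"]
AGE_MONTHS = [0.5, 4, 8, 12, 16, 20, 24, 60]
SHIFT = 2  # two time-points of roughly four months each, i.e. about eight months


def predicted_scores(exp1_scores):
    """Map each Experiment 1 time period to its expected score eight months on.

    Assumption: a prediction exists only where the shifted time-point is about
    eight months older, so 20M, 24M and 60M memories get no prediction here.
    """
    predictions = {}
    for i, period in enumerate(TIME_PERIODS[:-SHIFT]):
        if AGE_MONTHS[i + SHIFT] - AGE_MONTHS[i] <= 8:
            predictions[period] = exp1_scores[TIME_PERIODS[i + SHIFT]]
    return predictions


# Arbitrary placeholder scores, invented purely to show the index shift.
exp1 = dict(zip(TIME_PERIODS, [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))
pred = predicted_scores(exp1)
print(sorted(pred, key=TIME_PERIODS.index))  # ['0.5M', '4M', '8M', '12M', '16M']
```

Each predicted value would then be paired with the same memory's original score for the paired t-test described above.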