Non-monotonic recruitment of ventromedial prefrontal cortex during remote memory recall

Daniel N. Barry, Martin J. Chadwick and Eleanor A. Maguire

Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3AR, UK

Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, 26 Bedford Way, London, WC1H 0AP, UK


Introduction
We possess a remarkable ability to retrieve, with ease, one single experience from a lifetime of memories. How these individual autobiographical memories are represented in the brain over time is a central question of memory neuroscience which remains unanswered.

Consolidation takes place on two levels which differ on both a spatial and temporal scale. On a cellular level, the stabilisation of new memory traces through modification of synaptic connectivity takes only a few hours [1], and is heavily dependent upon the hippocampus [2-5]. On a much longer timescale, the neocortex integrates new memories, a form of consolidation termed "systems-level" [6]. The precise timeframe of this process is unknown. A related long-standing debate which has contributed to this uncertainty is whether or not the hippocampus ever relinquishes its role in autobiographical memory retrieval. One theory asserts that the hippocampus is not involved in the retrieval of memories after they have become fully consolidated to the neocortex [7]. Alternate views maintain that vivid, detailed autobiographical memories retain a permanent reliance on the hippocampus for their expression [8-12].

subsequent recall in the scanner. This meant that there were two full sets of memories. Participants created a short phrase pertaining to each autobiographical memory, which was paired with the photograph to facilitate recall during the subsequent fMRI scan.

personal photograph, chose a phrase to help remind them of this memory during the subsequent scanner task, and rated its characteristics. (B-F) Subjective ratings (means +/- 1 SEM; see also Table A in S1 Table) of memory characteristics at each time period for Experiment 1, averaged across the two sets of memories. Ratings were on a scale of 1 to 5, where 1 was low and 5 was high. For emotional valence: 1-2 = negative, 3 = neutral, 4-5 = positive.
* p < 0.05, ** p < 0.01, *** p < 0.001.

Consistent level of details recalled across memories
To complement the subjective ratings of memory characteristics with a more objective assessment of their content, transcripts of participants' memory interviews were scored using the Autobiographical Interview protocol ([57]; Materials and methods). In total for this first experiment, 10,187 details were scored. The mean (SD) number of internal details (bound to the specific 'episodic' spatiotemporal context of the event) and external details (arising from a general 'semantic' knowledge or references to unrelated events) are shown in Table B in S1 Table (see

scan. (B) The timeline of an example trial from the scanning task. (C) Graphical illustration of the neural representation score calculation using RSA. The neural pattern similarity across trials recalling the same memory (orange) minus the mean pattern similarity between that memory and other memories (yellow) generates a "neural representation" score. A score significantly higher than zero indicates a neural pattern distinct to that memory is present in the vmPFC.

On each trial, the photograph and associated pre-selected cue phrase relating to each event were displayed on a screen for 3 seconds. Following removal of this cue, participants then closed their eyes and recalled the memory. After 12 seconds, the black screen flashed white twice, to cue the participant to open their eyes. The participant was then asked to rate how vivid the memory recall had been using a five-key button box, on a scale of 1-5, where 1 was not vivid at all, and 5 was highly vivid (Fig 3B).

We used RSA to quantify the extent to which the strength of memory representations in the vmPFC differed as a function of memory age.
This was achieved by contrasting the similarity of neural patterns when recalling the same memory with their similarity to other memories to yield a "neural representation" score for each memory (see Materials and methods, Fig 3C). As there were two memories recalled per time period, the neural representation scores were averaged to produce one value for that time period.

We anticipated an increase in the strength of memory representations at some point between 0.5M and 24M, in line with the results of Bonnici and Maguire [55]. This is what we observed, where the most recent 0.5M memories were undetectable (t(29) = 0.72, p = 0.477) in vmPFC, in contrast to the distinct neural signatures observed for 4M (t(29) = 2.85, p = 0.008), 8M (t(29) = 3.09, p = 0.004) and 12M (t(29) = 3.66, p < 0.001) old memories (Fig 4A). These changes in the strength of memory representations were significant across time periods (F(7,203) = 2.22, p = 0.034), with an observed increase in vmPFC recruitment from 0.5M to 8M (t(29) = 2.07, p = 0.048) and 12M (t(29)

However, what was observed for the following two time periods was unexpected - an apparent disengagement of the vmPFC over the next eight months as we observed weak detectability of memory representations in vmPFC for 16M (t(29) = 1.85, p = 0.074) and 20M (t(29) = 1.03, p = 0.310) old memories. Neither 16M (t(29) = -1.06, p = 0.298) nor 20M memories (t(29) = -0.40, p = 0.691) were more strongly represented than the recent 0.5M old memories. In contrast, the more remote 24M (t(29) = 4.34, p < 0.001) and 60M (t(29) = 3.55, p = 0.001) memories were detectable in the vmPFC, and significantly more so than the most recent memories (24M vs 0.5M, t(29) = -2.93, p = 0.007; 60M vs 0.5M, t(29) = -2.54, p = 0.017) as well as the more temporally proximal 20M old memories (24M vs 20M, t(29) = -2.50, p = 0.018; 60M vs 20M, t(29) = -2.32, p = 0.028).
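
The detectability and time-period comparisons reported above are one-sample and paired t-tests on per-participant neural representation scores. As an illustrative sketch only (the function names and the toy values are assumptions made for this example, not the authors' analysis code, and no p-value is computed), the t statistic and its degrees of freedom could be obtained as follows:

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu=0.0):
    """Return (t, df) for a one-sample t-test of `values` against `mu`."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(values) - mu) / standard_error, n - 1


def paired_t(a, b):
    """Return (t, df) for a paired t-test, computed on the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    return one_sample_t(diffs)


# Hypothetical per-participant scores for two time periods (toy data only).
toy_recent = [0.001, -0.002, 0.003, 0.000, 0.001]
toy_remote = [0.006, 0.004, 0.007, 0.005, 0.003]

t_detect, df_detect = one_sample_t(toy_remote)        # is the score above zero?
t_change, df_change = paired_t(toy_remote, toy_recent)  # does it differ by age?
```

With 30 participants, as in Experiment 1, either test has 29 degrees of freedom, which is the value attached to the t statistics in the text.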
The experimental design afforded us the opportunity to verify this non-monotonic pattern. As we sampled two memories per time-point, this time-dependent pattern should be evident in both sets of memories. As shown in Fig 4B, the two sets of memories followed a similar time course of changes in representation within vmPFC. This is a compelling replication, given that the two memories from each time-period were unrelated in content as a prerequisite for selection, recalled in separate sessions in the scanner and analysed independently from each other.

The availability of two memories at each time-point also permitted the use of an alternative approach to calculating neural representation scores. Instead of using the similarity to memories from other time-points as a baseline, we could also assess if memories were similar to their temporally matched counterpart in the other set. As can be seen in Fig 4C, the non-monotonic pattern is preserved even when just using one identically aged memory as a baseline. In other words, the distinguishable patterns are specific to each individual memory rather than attributable to general retrieval processes associated with any memory of the same age.

An alternative explanation for memory representation scores which decreased over time is that the neural patterns became increasingly similar to memories from other time-points, rather than less consistent across repetitions, perhaps again reflecting more general retrieval processes. However, as evident in S3 Fig, between-memory scores remained stable across all time-points, and did not differ in their statistical significance (F(5.24,152.02) = 1.72, p = 0.13). If anything, there was a slight trend for higher between-memory scores to accompany higher within-memory scores. Therefore, the detectability of neural representations appeared to be driven by consistent within-memory neural patterns.
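
The within-memory and between-memory scores discussed above can be made concrete with a small sketch. It follows the description in Materials and methods (Pearson correlation of voxel patterns, correlations restricted to trials from separate runs, and a neural representation score equal to within-memory minus between-memory similarity). The data layout, function names and toy patterns are assumptions made for illustration and are not the authors' implementation:

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def representation_scores(patterns):
    """Map each memory label to (within, between, within - between).

    `patterns` is an assumed layout: label -> list of (run, voxel_pattern).
    Only pairs of trials from different runs are correlated.
    """
    scores = {}
    for label, trials in patterns.items():
        within = [pearson(p, q)
                  for (r1, p), (r2, q) in combinations(trials, 2) if r1 != r2]
        between = [pearson(p, q)
                   for other, other_trials in patterns.items() if other != label
                   for (r1, p), (r2, q) in product(trials, other_trials)
                   if r1 != r2]
        w, b = mean(within), mean(between)
        scores[label] = (w, b, w - b)
    return scores


# Toy four-voxel patterns for two hypothetical memories, two runs each.
toy = {
    "4M": [(1, [1.0, 2.0, 3.0, 4.0]), (2, [1.1, 2.1, 2.9, 4.2])],
    "20M": [(1, [4.0, 1.0, 3.0, 2.0]), (2, [3.8, 1.2, 3.1, 2.1])],
}
scores = representation_scores(toy)
```

Returning the within and between components separately, rather than only their difference, mirrors the check described above: a falling score can then be attributed either to less consistent within-memory patterns or to rising between-memory similarity.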

The observed temporal relationship is unique to vmPFC
Our main focus was the vmPFC, given previous work highlighting specifically this region's role in representing autobiographical memories over time [54,55]. We also scanned within a partial volume (to attain high spatial resolution with a reasonable TR), so were constrained in what other brain areas were available for testing (see Materials and methods). Nevertheless, we examined the same brain areas as Bonnici et al.

memories which were undetectable in the vmPFC were still represented in other brain regions at these time points (see S2 Table). For example, 20-month-old memories which did not appear to recruit the vmPFC during retrieval were represented in the majority of other regions comprising the core autobiographical memory network (precuneus, lateral temporal cortex, parahippocampal cortex, and approaching significance in the retrosplenial cortex (t(29) = 1.83, p = 0.08)).

Following scanning, participants completed three additional ratings. They were asked to indicate the extent to which the memories were changed by the 6 repetitions during scanning on a scale ranging from 1 (not at all) to 5 (completely). They reported that the memories were not changed very much by repetition (mean: 2.61, SD: 0.74). They were also asked how often during scanning they thought about the memory interview one week previously on a scale of 1 (not at all) to 5 (completely), with participants indicating they rarely thought about the interview (mean: 2.29, SD: 1.01). Finally, participants were asked the extent to which the recall of memories from each time period unfolded in a consistent manner over the course of the session. A difference was observed (F(7,203) = 2.78, p = 0.009), with the most recent 0.5M old memories being rated as more consistently recalled than the most remote 60M memories (t(29) = 3.97, p = 0.012).
In addition to the region of interest (ROI)-based approach, a searchlight analysis was also conducted in MNI group normalised space to localise areas within the vmPFC where memories displayed high detectability across participants (see Materials and methods). We discovered a

results were highly similar to the whole-ROI analysis in native space, suggesting the main result may be driven by more spatially confined activity within the vmPFC. However, a searchlight approach is sub-optimal to answer the current research question, as it requires an a priori model RSM against which to compare the neural patterns at each searchlight sphere, whereas the ROI approach makes no such assumptions.

We also conducted a standard mass-univariate analysis on the whole volume with memory remoteness as a parametric regressor, and no area displayed either a significant increase or decrease in activity in accordance with memory age, consistent with the findings of Bonnici et al. [54]. In a similar parametric analysis, we did not find evidence of the modulation of univariate activity by in-scanner vividness ratings as might be suggested by the findings of Sheldon and Levine [60]; however, all memories chosen for the current study were highly vivid in nature.

One concern when studying covert cognitive processes such as autobiographical memory in the fMRI scanner is participant compliance, because performance is subjectively reported rather than objectively assessed. However, if participants were complying with task demands, there should be an association between in-scanner subjective ratings and the detectability of neural representations. When non-vivid trials were additionally incorporated into the RSA analysis, the mean memory representation score in the vmPFC for all participants averaged across time-points decreased from 0.0049 (SD 0.005) to 0.0044 (SD 0.005).
In fact, the deleterious effect of including these extra non-vivid trials was evident in 24 out of the 30 participants. Such a consistent relationship between participants' subjective ratings of their own memory performance and the sensitivity of the RSA analysis to detect memory representations strongly suggests participants were performing the task as instructed.

The non-monotonic pattern we observed in the fMRI data did not manifest itself in the subjective or objective behavioural data. In fact, the only difference in those data was higher ratings for the most recent 0.5M old memories. However, these were paradoxically the most weakly represented memories in the vmPFC, meaning the neural patterns were not driven by memory quality. The objective scoring of the memories confirmed comparable levels of detail provided for all memories, without any significant drop in episodic detail or increase in the amount of semantic information provided as a function of time. Therefore, neither the amount nor the nature of the memory details was a contributing factor.

Nevertheless, to verify that the results genuinely represented the neural correlates of memory purely as a function of age, one would need to study the effects of the passage of time on the individual neural representations. Therefore, we invited the participants to revisit eight months later to recall the same memories again both overtly and during scanning; 16 of the participants agreed to return.
In order to generate specific predictions for the neural representations during Experiment 2, we took the actual data for the 16 subjects from Experiment 1 who returned eight months later (

Table A in S1 Table and Table A in S3 Table) of memory characteristics at each time period for Experiment 1 (blue line, n=16 participants) and how the ratings of the same memories differed eight months later during Experiment 2 (red line, the same n=16 participants) averaged across the two sets of memories in both cases. Ratings were on a scale of 1 to 5, where 1 was low and 5 was high. For emotional valence: 1-2 = negative, 3 = neutral, 4-5 = positive. Asterisks indicate significant differences in memory ratings between Experiments 1 and 2; * p < 0.05, ** p < 0.01, *** p < 0.001.

A similar level of detail was recalled across experiments
As with Experiment 1, transcripts of participants' memory interviews during Experiment 2 were scored using the Autobiographical Interview protocol ([57]; see Materials and methods). A total of 6,444 details were scored (see Table B in S3 Table for

when examining other brain areas within the partial volume in Experiment 2, in no case did we find a significant difference in memory detectability across time periods.

Following scanning in Experiment 2, participants completed three additional ratings. They were asked to indicate the extent to which the memories were changed by the 6 repetitions during scanning on a scale ranging from 1 (not at all) to 5 (completely). As in Experiment 1, they reported that the memories were not changed very much by repetition (mean: 2.56, SD: 0.81). They were also asked how often they thought of the experience of recalling the memories in Experiment 1 while performing the scanning task in Experiment 2 on a scale of 1 (not at all) to 5 (during every memory).

Table A in S1 Table and Table A in S3 Table).

Discussion
This study exploited the sensitivity of RSA to detect not only the extent to which memories of different ages were represented in the vmPFC, but how these representations changed over time. During Experiment 1, we observed detectability in vmPFC for memories at 4M to 12M of age, which was also evident at 24M and 60M. As expected, recent 0.5M old memories were poorly represented in vmPFC in comparison. Curiously, however, the same lack of detectability in vmPFC was observed for memories that were 16M to 20M old. This pattern persisted across separate sets of memories and was replicated in a follow-up study eight months later with the same participants and memories. Behavioural data failed to account for these time-dependent representational changes in either experiment, and other brain regions failed to show a significant change in memory representations over time. These findings are difficult to accommodate within any single theoretical account of long-term memory consolidation [9, 12, 61-63], as neocortical recruitment is generally assumed to involve an ascending linear trajectory. Consolidation has been characterised as fluid and

The progressive vmPFC disengagement observed over the following eight months suggests the suppression of interfering memories becomes less of a necessity over this period. Forgetting is a key attribute of an optimally functioning memory system [72], and the number of autobiographical events individuals can recall has been shown to decrease substantially between one and two years, before levelling off [73]. Therefore, the reduction in availability of potentially interfering memories from this time period may relieve the vmPFC from its role in disambiguating them from memories which have persisted through the consolidation process.
For example, one may return from a vacation with many memories which contain multiple overlapping features, but this will inevitably be reduced to a few distinct experiences as time goes on.

Two years to five years: the emergence of schematic representations
If disambiguation ceases to be an issue for older memories, the robust re-engagement of vmPFC at

the results of Experiment 2. However, given that seven out of the eight specifically hypothesised temporally sensitive changes in neural representations were supported, an altered or accelerated consolidation time-course appears highly unlikely. Again, recall recency was matched in Experiment 2 by the memory interview, and recall frequency between experiments was low.

Taking a more general and parsimonious perspective, the ratings demonstrate that, naturally, all memories are recalled on an occasional basis (Table A in S1 Table); therefore, it seems highly unlikely that a mere six repetitions within a scanning session would significantly alter the time course of systems-level consolidation. It should also be noted that successful detection of neural patterns relied on the specific content of each memory, rather than being due to generic time-

year as a method for strategically retrieving memories. Furthermore, the nature of memory sampling was that unique, rather than generic, events were eligible, reducing the likelihood of events which were repeated annually being included. Memory detectability was high at 12 month intervals such as one, two and five years in this study, suggesting perhaps it is easier to recall events which have taken place at a similar time of year to the present. However, this should have been reflected in behavioural ratings, and equivalently strong neural representations for recent memories, but neither was observed. Most importantly, if content rather than time-related consolidation was the main influence on memory detectability, then we would not have observed any change in neural representation scores from Experiment 1 to Experiment 2, rather than the hypothesised shifts which emerged.
A related concern is that memories across time differ in nature because they differ in availability. Successful memory search is biased towards recency, meaning there are more events to choose from in the last few weeks than in remote time periods. Here, this confound is circumvented by design, given that search was equivalently constrained and facilitated at each time-point by the frequency at which participants took photographs, which was not assumed to change in a major way over time. These enduring "snap-shots" of memory, located within tight temporal windows (see Materials and methods), meant that memory selection was not confounded by retrieval difficulty or availability. It could also be argued that selection of time-points for this study should have been biased towards recency given that most forgetting occurs in the weeks and months after learning. However, it is important to dissociate systems-level consolidation from forgetting, as they are separate processes which are assumed to follow different time-courses. Memory forgetting follows an exponential decay [82], whereas systems-level consolidation has generally been assumed, until now, to be gradual and linear [83]. Our study was concerned only with vivid, unique memories which were likely to persist through the systems-level consolidation process.

A further potential concern regarding memory selection is that recent and remote memories which are comprised of equivalent levels of detail must be qualitatively different in some way.

here. Because memories were chosen only from available photographic cues, the salience of recent and remote events was determined at the time of taking the photograph, and not during experimentation. These photographs served as potent triggers of remote memories which were not necessarily more salient than recent memories, and which may not have otherwise come to mind using a free recall paradigm.
In addition, one would expect more salient remote memories to score higher than recent memories on subjective ratings of vividness, personal significance or valence, but this was not the case. Therefore, stronger neural representations at more remote time-points were likely due to consolidation-related processes rather than qualitative differences between recent and remote experiences at the time of encoding.

Value
Given that the medial prefrontal cortex is often associated with value and emotional processing [86], could these factors have influenced the current findings? Humans display a bias towards consolidating positive memories [87], and remembered information is more likely to be valued than that which is forgotten [88]. Activity in vmPFC during autobiographical memory recall has been found to be modulated by both the personal significance and emotional content of memories [89]. However, in the current two experiments, memories were matched across time periods on these variables, and the selection of memories through photographs taken on a day-to-day basis also mitigated this effect. In the eight months between experiments, memories either remained unchanged or decreased slightly in their subjective ratings of significance and positivity, suggesting that these factors are an unlikely driving force behind the observed remote memory representations in vmPFC. For example, if recent memories in Experiment 1 were not well-represented in vmPFC because they were relatively insignificant, there is no reason to expect them to be more so eight months later, yet their neural representation strengthened over time nonetheless.

Relation to previous findings
A methodological discrepancy between this experiment and that conducted by Bonnici et al. [54] is the additional use of a photograph to assist in cueing memories.
One possible interpretation of the neural representation scores is they represent a role for the vmPFC in the maintenance of visual working memory following cue offset. However, the prefrontal cortex is unlikely to contribute to maintenance of visual information [90]. Furthermore, if this was the driving force behind neural representations here, the effect would be equivalent across time-periods, yet it was not.

There is, however, an obvious inconsistency between the findings of the current study and that of Bonnici et al. [54]. Unlike that study, we did not detect representations of 0.5M old memories in vmPFC. It could be that the support vector machine classification-based MVPA used by Bonnici et al. [54] is more sensitive to detection of memory representations than RSA; however, the current study was not optimised for such an analysis because it necessitated an increased ratio of conditions to trials. Nonetheless, the increase in memory representation scores from recent to remote memories was replicated and additionally refined in the current study with superior temporal precision. One observation which was consistent with the Bonnici findings was the detection of remote memories in the hippocampus, which also supports theories positing a perpetual role for this region in the vivid retrieval of autobiographical memories [10, 12]. However, the weak detectability observed at more recent time points may reflect a limitation of the RSA approach employed here to detect sparsely encoded hippocampal patterns, which may be overcome by a more targeted subfield analysis [91].

There are, however, distinct advantages to the use of RSA over pattern classification MVPA. RSA is optimal for a condition-rich design as it allows for the relationships between many conditions to be observed.
For example, in the current experiment, a visual inspection of the group RSA matrix (S1 Fig) does not reveal an obvious clustering of recent or remote memories which would indicate content-independent neural patterns related to general retrieval processes. The approach employed by Bonnici et al. [54] assessed the distinctiveness of memories within each time-point from each other in order to detect memory representations. Should the neural patterns of a single memory become more consistent over time, yet also more similar to memories of the same age due to generic time-dependent mechanisms of retrieval, pattern classification would fail to detect a representation where one is present. In the current study, however, the two can be assessed separately, revealing memories at each time-point become distinct from both memories of all other ages (Fig 4A) and identically aged memories (Fig 4C). The machine learning approach employed by

The current results revealed that the recruitment of the vmPFC during the expression of autobiographical memories depends on the exact stage of systems-level consolidation, and that retrieval involves multiple sequential time-sensitive processes. These temporal patterns were remarkably preserved across completely different sets of memories in one experiment, and closely replicated in a subsequent longitudinal experiment with the same participants and memories. These findings support the notion that the vmPFC becomes increasingly important over time for the retrieval of remote memories. Two particularly novel findings emerged. First, this process occurs relatively quickly, by four months following an experience. Second, vmPFC involvement after this time fluctuates in a highly consistent manner, depending on the precise age of the memory in question. Further work is clearly needed to explore the implications of these novel results.
Overall, we conclude that our vmPFC findings may be explained by a dynamic interaction between the changing strength of a memory trace, the availability of temporally adjacent memories, and the concomitant differential strategies and schemas that are deployed to support the successful recollection of past experiences.

Participants were asked to describe in as much detail as possible the specific autobiographical memory elicited by a photograph. General probes were given by the interviewer where appropriate (e.g., "what else can you remember about this event?"). Participants were also asked to identify the most memorable part of the event which took place within a narrow temporal window and unfolded in an event-like way. They then created a short phrase pertaining to this episode, which was paired with the photograph to facilitate recall during the subsequent fMRI scan (Fig 1A). Participants were asked to rate each memory on a number of characteristics (see main text, Figs 1 and 6, S1 Table and S3 Table), and two memories from each time period which satisfied the criteria of high vividness and detail, and ease of recall were selected for recollection during the fMRI scan.

Behavioural analyses
The interview was recorded and transcribed to facilitate an objective analysis of the details, and the widely-used Autobiographical Interview method was employed for scoring [57]. Details provided for each memory were scored as either "internal" (specific events, temporal references, places, perceptual observations and thoughts or emotions) or "external" (unrelated events, semantic knowledge, repetition of details or other more general statements). To assess inter-rater reliability, a subset of sixteen memories (n=2 per time period) were randomly selected across 16 different subjects and scored by another experimenter blind to the aims and conditions of the study. Intra-class coefficient estimates were calculated using SPSS statistical package version 22 (SPSS Inc, Chicago, IL) based on a single measures, absolute-agreement, 2-way random-effects model.

As two memories per time period were selected for later recall in the scanner, behavioural ratings were averaged to produce one score per time period. Differences in subjective memory ratings across time periods were analysed using a one-way repeated measures ANOVA with Bonferroni-corrected paired t-tests. Differences in objective memory scores of internal and external details across time periods were analysed using a two-way repeated measures ANOVA with Bonferroni-corrected paired t-tests. A threshold of p < 0.05 was used throughout both experiments. All ANOVAs were subjected to Greenhouse-Geisser adjustment to the degrees of freedom if Mauchly's sphericity test identified that sphericity had been violated.

Structural and functional data were acquired using a 3T MRI system (Magnetom TIM Trio, Siemens Healthcare, Erlangen, Germany). Both types of scan were performed within a partial volume which incorporated the entire extent of the ventromedial prefrontal cortex (Fig 3A).
Structural images were collected using a single-slab 3D T2-weighted turbo spin echo sequence with variable flip angles (SPACE) [93] in combination with parallel imaging, to simultaneously achieve a high image resolution of ~500 μm, high sampling efficiency and short scan time while maintaining a sufficient signal-to-noise ratio (SNR). After excitation of a single axial slab the image was read out with the following parameters: resolution = 0.52 x 0.52 x 0.5 mm, matrix = 384 x 328, partitions = 104, partition thickness = 0.5 mm, partition oversampling = 15.4%, field of view = 200 x 171 mm², TE = 353 ms, TR = 3200 ms, GRAPPA x 2 in phase-encoding (PE) direction, bandwidth = 434 Hz/pixel, echo spacing = 4.98 ms, turbo factor in PE direction = 177, echo train duration = 881, averages = 1.9. For reduction of signal bias due to, for example, spatial variation in coil sensitivity profiles, the images were normalized using a prescan, and a weak intensity filter was applied as implemented by the scanner's manufacturer. To improve the SNR of the anatomical image, three scans were acquired for each participant, coregistered and averaged. Additionally, a whole brain 3D FLASH structural scan was acquired with a resolution of 1 x 1 x 1 mm.

Functional data were acquired using a 3D echo planar imaging (EPI) sequence which has been demonstrated to yield improved BOLD sensitivity compared to 2D EPI acquisitions [94]. Image resolution was 1.5 mm³ and the field-of-view was 192 mm in-plane. Forty slices were acquired with 20% oversampling to avoid wrap-around artefacts due to imperfect slab excitation profile. The echo time (TE) was 37.30 ms and the volume repetition time (TR) was 3.65 s. Parallel imaging with GRAPPA image reconstruction [95] acceleration factor 2 along the phase-encoding direction was used to minimize image distortions and yield optimal BOLD sensitivity.
The dummy volumes necessary to reach steady state and the GRAPPA reconstruction kernel were acquired prior to the acquisition of the image data as described in Lutti et al. [94]. Correction of the distortions in the EPI images was implemented using B0-field maps obtained from double-echo FLASH acquisitions (matrix size 64 x 64; 64 slices; spatial resolution 3 mm³; short TE = 10 ms; long TE = 12.46 ms; TR = 1020 ms) and processed using the FieldMap toolbox available in SPM [96].

MRI data preprocessing

fMRI data were analysed using SPM12 (www.fil.ion.ucl.ac.uk/spm). All images were first bias corrected to compensate for image inhomogeneity associated with the 32 channel head coil [97]. Fieldmaps collected during the scan were used to generate voxel displacement maps. EPIs for each of the twelve sessions were then realigned to the first image and unwarped using the voxel displacement maps calculated above. The three high-resolution structural images were averaged to reduce noise, and co-registered to the whole brain structural scan. EPIs were also co-registered to the whole brain structural scan. Manual segmentation of the vmPFC was performed using ITK-SNAP on the group averaged structural scan normalised to MNI space. The normalised group mask was warped back into each participant's native space using the inverse deformation field generated by

The average number of voxels analysed in the vmPFC across the two sets of memories was 5252 (SD 1227). Whole ROI-based analysis was preferred to a searchlight approach which would involve comparing neural with model similarity matrices [99], as we did not have a strong a priori hypothesis about changes in neural representations over time against which to test the neural data, nor did we want to make assumptions regarding the spatial distribution of informative voxels in the vmPFC.
As participants recalled two memories per time-point, the dataset was first split into two sets of eight time points, which were analysed separately using RSA. To characterise the strength of memory representations in the vmPFC, the similarity of neural patterns across recall trials of the same memory was first calculated using the Pearson product-moment correlation coefficient, resulting in a "within-memory" similarity score. Then the neural patterns of each memory were correlated with those of all other memories, yielding a "between-memory" similarity score. Both within- and between-memory correlations were performed on trials from separate runs. For each memory, the between-memory score was then subtracted from the within-memory score to provide a neural representation score (Fig 3C). This score was then averaged across the two memories at each time-point. Results for the left and the right hemispheres were highly similar, and therefore the data we report here are from the vmPFC bilaterally. A distinctive neural pattern associated with the recall of memories at each time period would yield a score significantly higher than zero, which was assessed using a one-sample t-test. Strengthening or weakening of memory representations as a function of remoteness would result in a significant difference in memory representation scores across time periods, and this was assessed using a one-way repeated measures ANOVA with post-hoc two-tailed paired t-tests. Error bars on graphs displaying neural representation scores were normalised to reflect within- rather than between-subject variability in absolute values, using the method recommended by Cousineau
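
The neural representation score described above (within-memory similarity minus between-memory similarity, both computed across separate runs) can be illustrated with a minimal sketch. This is not the authors' analysis code: the data layout `patterns[m][r]` (one voxel pattern per memory `m` and run `r`), the function names, and the choice to average over all cross-run pairs are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def representation_scores(patterns):
    """Return one neural representation score per memory.

    patterns[m][r] is the voxel pattern for memory m in run r
    (hypothetical layout). Only pairs of trials from different runs
    are correlated, for both within- and between-memory similarity.
    """
    n_mem = len(patterns)
    n_runs = len(patterns[0])
    run_pairs = list(combinations(range(n_runs), 2))
    scores = []
    for m in range(n_mem):
        # Similarity of the same memory across different runs.
        within = mean(
            pearson(patterns[m][r1], patterns[m][r2]) for r1, r2 in run_pairs
        )
        # Similarity of this memory to every other memory, across different runs.
        between = mean(
            pearson(patterns[m][r1], patterns[other][r2])
            for other in range(n_mem) if other != m
            for r1 in range(n_runs)
            for r2 in range(n_runs) if r1 != r2
        )
        scores.append(within - between)
    return scores
```

A score above zero for a given memory indicates that its pattern is more similar to itself across runs than to the patterns of other memories, which is the property tested against zero in the text.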

Searchlight analysis
An RSA searchlight analysis was conducted in normalised space, on multivariate noise-normalised data within the ROI. This approach selected every voxel within the ROI, and using a volumetric approach which is constrained by the shape of the ROI, expanded the area around that voxel until an area of 160 voxels was reached. Within each of these spheres, memories were correlated with themselves, and other memories, analogous to the standard ROI approach. Then the resulting neural RSM was correlated using Spearman's rank correlation coefficient with a model RSM which consisted of ones along the diagonal and zeros on the off-diagonal. This model RSM was used to detect if individual memories were detectable across all time-points. For every voxel, the average correlation from every sphere it participated in was calculated, to generate a more representative score of its informational content. Parametric assumptions regarding the spatial distribution of unsmoothed data may not hold. Therefore we used statistical nonparametric mapping (SnPM13) on the resulting searchlight images. We used 10,000 random permutations, a voxel-level significance threshold of t=3, and a family-wise-error corrected cluster-wise threshold of p<0.05 within an ROI.
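
The comparison of a neural RSM with the identity-like model RSM can be sketched as follows for a single searchlight sphere. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the neural RSM is assumed to be a square list of lists of cross-run correlations, and tied values (which the binary model RSM necessarily contains) are handled by average ranks, the usual convention for Spearman's coefficient.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def model_fit(neural_rsm):
    """Correlate a neural RSM with a model RSM of ones on the diagonal, zeros elsewhere."""
    n = len(neural_rsm)
    model = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    flat_neural = [v for row in neural_rsm for v in row]
    flat_model = [v for row in model for v in row]
    return spearman(flat_neural, flat_model)
```

A positive value from `model_fit` means the diagonal entries (each memory correlated with itself across runs) tend to rank above the off-diagonal entries within that sphere.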

Memory interview
Participants were presented with the 16 photographs and cue phrases associated with the autobiographical memories in Experiment 1 and were asked to describe in as much detail as possible the specific event which they had recalled previously. General probes were given by the interviewer where appropriate (e.g. "what else can you remember about this event?"). The interviewer availed of summarised transcripts from Experiment 1 to verify the same memory and details were being recalled. Participants then rated each memory on the same characteristics assessed in Experiment 1. The memory interview during Experiment 2 was also recorded and transcribed.

Behavioural analyses
The analysis of subjective and objective ratings for Experiment 2 followed exactly the same procedure as Experiment 1. The extent to which subjective ratings for the same memory had changed between Experiment 1 and Experiment 2 was assessed using a two-way (experiment x time period) repeated measures ANOVA with Bonferroni-corrected paired t-tests. Differences in objective memory ratings across experiments were analysed using a two (experiment) x two (detail) x eight (time period) repeated measures ANOVA with Bonferroni-corrected paired t-tests.
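
As a rough illustration of the Bonferroni-corrected paired comparisons mentioned above, the sketch below computes a paired t statistic and applies the Bonferroni adjustment to a set of p-values. The function names are hypothetical, and the text does not state how many comparisons were corrected for, so the number of comparisons here is simply the length of the list supplied. Converting t to a p-value requires the t distribution, which is left to a statistics package and is not shown.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, degrees of freedom) for a paired t-test on matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Under this adjustment a comparison is reported as significant only if its adjusted p-value remains below the 0.05 threshold used throughout both experiments.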

Task during fMRI scanning
Participants returned approximately one week later for the fMRI scan (mean 5.5 days, SD 3.7). Prior to scanning, only one reminder training trial per memory was deemed necessary given the prior experience of performing the task in Experiment 1. The scanning task remained unchanged from Experiment 1, aside from the re-randomisation of trials within each session. When the least vivid trials were excluded, the mean number of trials (/6) selected for analysis from each time period were as follows: