Optimal forgetting: Semantic compression of episodic memories
Fig 5
Rate-distortion trade-off in forgetting over time.
A,B: Using the text-VAE model, we modelled the dependence of memory representations on time by gradually increasing the compression rate, β. Recall probabilities of studied and lure words, averaged over multiple word lists, were measured on a text-VAE trained on Wikipedia entries (top) and on a synthetic vocabulary (bottom). Increasing the compression rate results in monotonically decreasing recall performance for studied words. In contrast, increased delay of recall leads to an increase in false memories: for critical non-studied (NS, lure) words, the recall probability initially increases with larger compression rates, but very high compression rates result in the loss of gist-like recall as well. Asymptotically, the performance on semantically related studied (S) words approaches the performance on random word lists, as less and less of the structure of the data is used. C,D: Difference between recall probabilities for lure words and studied words as a function of the delay between study and recall, for the model (top) and experiments (bottom). Both the Wikipedia-trained and the synthetic-vocabulary-trained models predict that false recall of non-studied lure words persists longer than recall of studied words, visible as an increase in the difference between lure and studied word recall rates as a function of time. For even longer delays, the gist information is progressively forgotten as well, and consequently recall rates for both lure and studied words approach zero. The same pattern, an increasing difference up to a delay of three weeks followed by a decrease, can be observed in the experimental data. Data are reproduced from Toglia et al. [45], Seamon et al. [46] and Thapar & McDermott [47].
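The role of β in the caption can be illustrated with a minimal sketch of a β-weighted rate-distortion objective. It assumes a diagonal Gaussian posterior and a standard normal prior, which is a common VAE choice but is not confirmed here as the text-VAE's exact form. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """Rate term: KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)


def beta_weighted_loss(distortion, mu, log_var, beta):
    """Distortion plus beta times rate; a larger beta penalises rate more heavily."""
    return distortion + beta * gaussian_kl(mu, log_var)


# Hypothetical posterior parameters and distortion (assumptions, not values from the paper).
mu = [1.0, -0.5]
log_var = [0.0, 0.0]
distortion = 2.0

# Sweeping beta upward stands in for an increasing delay between study and recall.
for beta in (0.5, 1.0, 4.0):
    print(beta, beta_weighted_loss(distortion, mu, log_var, beta))
```

In this sketch the same latent code becomes more costly as β grows, so an optimiser is pushed toward encodings that carry less information about the individual item. That is the sense in which a higher compression rate stands in for a longer retention interval.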