Learning of Chunking Sequences in Cognition and Behavior
Fig 6
The model was run 60 times, for 120 trials (Ny = 30), at different levels of noise. Each trial consisted of the presentation of one sequence, followed by a recall phase. (Top-Left) Sequence recall accuracy D, averaged over all runs. The recalled sequence was determined by the identity of the most active mode in the elementary layer. D was computed using the Levenshtein distance (equal to the number of additions and subtractions between two sequences). In the noiseless and low-noise cases, the distance between the presented sequence and the reproduced sequence reached about 0.05 (horizontal line), roughly corresponding to 1 addition/subtraction per sequence recall. The network was robust to noise: sequence recall accuracy degraded gracefully as the noise amplitude was increased. (Bottom-Left) Chunking rate CR, used to monitor chunking, in the noiseless case (blue curves). CR is defined as the number of transitions taking place in the chunking layer during the presentation of a pattern in the sequence. During an initial transient, CR decreases as learning proceeds, indicating the formation of chunks. (Right) Activity in the chunking layer for two representative runs, one with no noise, the other with no chunks, where learning of Qij and Rji was turned off. The identity of the chunks is color-coded. Interestingly, the boundaries of the chunks can change during training, and the chunks can undergo substantial reconfigurations at the beginning of the training phase. In the absence of learning in Qij and Rji, the chunking rate did not diminish over the course of learning, indicating the absence of chunks. S4 Fig displays the evolution of the individual weights for the run shown in the top-right panel (No Noise).
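The two measures in this caption can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions. The function names are hypothetical. The edit distance is the standard Levenshtein definition, which counts substitutions in addition to the additions and subtractions mentioned in the caption. Dividing by the length of the presented sequence is an assumed normalization, chosen only because it would make one edit in a 20-item sequence equal 0.05. The chunking layer is assumed to be summarized as a list holding the index of its most active unit at each time step of one pattern presentation.

```python
def levenshtein(a, b):
    """Standard Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(presented, recalled):
    """Edit distance divided by the presented length (assumed normalization)."""
    return levenshtein(presented, recalled) / len(presented)


def count_transitions(active_chunk):
    """Count how often the most active chunking unit changes identity."""
    return sum(1 for u, v in zip(active_chunk, active_chunk[1:]) if u != v)
```

Under these assumptions, `normalized_distance` would play the role of D and `count_transitions`, applied to the chunking-layer trace recorded during one pattern presentation, would play the role of CR. A trace that stays on a single unit gives a count of zero, which matches the caption's reading that a falling CR signals chunk formation.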