Stochastic attractor models of visual working memory

This paper investigates models of working memory in which memory traces evolve according to stochastic attractor dynamics. These models have previously been shown to account for response biases that are manifest across multiple trials of a visual working memory task. Here we adapt this approach by making the stable fixed points correspond to the multiple items to be remembered within a single trial, in accordance with standard dynamical perspectives of memory, and find evidence that this multi-item model can provide a better account of behavioural data from continuous-report tasks. Additionally, the multi-item model proposes a simple mechanism by which swap errors arise: memory traces diffuse away from their initial state and are captured by the attractors of other items. Swap-error curves reveal the evolution of this process as a continuous function of time throughout the maintenance interval and can be inferred from experimental data. Consistent with previous findings, we find that empirical memory performance is not well characterised by a purely diffusive process but rather by a stochastic process that also embodies error-correcting dynamics.


Local Attractors
Here we define a flow function, g(x), that creates local attractors in which the basins of attraction have width δ around memory items (δ being a parameter to be estimated). The flow function is constructed from windowed sinusoids.

Multiple Attribute Model
In the main text we have described a stochastic attractor model of a single attribute. Here we describe a generic approach in which each item to be remembered comprises a set of attributes; that is, y = {x_1, x_2, ..., x_P}. For example, for P = 3 we may have object location, size and identity. This model assumes that memory traces evolve independently for each attribute (see the final paragraph of this section for further comment), and that the density of the ith attribute evolves according to stochastic attractor dynamics, where x_ij is the encoded value of attribute i for the jth item, σ_ei is encoding noise, M_i reflects attractor dynamics, σ_ri is read-out noise, and τ is the delay length (as in Eq ?? in the main text). A new index z = j is assigned to each new stimulus array, as in e.g. Hippocampal indexing theory [1]. The recall density over x_i given a single cue x_k is computed as follows. First, the cue information is used to update the distribution over indexes. Second, the updated distribution over indexes is used to compute the recall density. The overall model structure is equivalent to a Mixture Model [2] but with multiple independent attributes (we would call this a mixture of stochastic attractors). The above equations follow from Bayes' rule, and the prior over indexes, p(z = j), could be taken as uniform or given some temporal structure [3]. Because the recall density depends on the cue, this should explain the empirical finding that swap errors can depend on cue similarity (as well as on similarity of the recalled attribute) [4,5]. In this view, these two types of swap error have different causes: diffusion in the cue attribute and diffusion in the recall attribute.
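The two-step recall computation above (update the index distribution from the cue, then mix the per-item recall densities) can be sketched as follows. This is a minimal illustration, not the paper's implementation: Gaussian densities stand in for the densities produced by the stochastic attractor dynamics, and all names (`index_posterior`, `recall_density`, the sigma parameters) are our own.

```python
import numpy as np

def gauss(x, mu, sigma):
    """Gaussian density; a simple stand-in for the attribute densities
    that the stochastic attractor dynamics would actually produce."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def index_posterior(cue, cue_means, cue_sigma, prior=None):
    """First step: p(z = j | cue) proportional to p(cue | z = j) p(z = j).
    cue_means[j] is the remembered cue-attribute value of item j;
    the prior over indexes defaults to uniform."""
    J = len(cue_means)
    prior = np.full(J, 1.0 / J) if prior is None else np.asarray(prior, float)
    lik = gauss(cue, np.asarray(cue_means, float), cue_sigma)
    post = lik * prior
    return post / post.sum()

def recall_density(x, cue, cue_means, recall_means, cue_sigma, recall_sigma):
    """Second step: p(x | cue) = sum_j p(z = j | cue) p(x | z = j),
    a mixture over items weighted by the index posterior."""
    post = index_posterior(cue, cue_means, cue_sigma)
    return sum(p * gauss(x, m, recall_sigma) for p, m in zip(post, recall_means))
```

Because the mixture weights depend on the cue, a cue that has diffused toward another item's cue value shifts probability mass onto that item's recall density, which is the swap-error mechanism described in the text.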
In terms of a potential mapping to neuroanatomy, we envisage that the attribute memory traces are instantiated in the relevant cortical areas and that these can be indexed via discrete latent variables (or "tokens/indexes") instantiated in e.g. the medial temporal lobe [1,6,7,8] or claustrum [9].
Equation 3 describes how cue information is used to update the distribution over indexes, and may reflect cortical-to-hippocampal signalling and hippocampal normalisation (e.g. via recurrent competition). Equation 4 shows how the index distribution is used to create the predictive density over the recall attribute, and may reflect top-down signalling from hippocampus to the relevant cortical area. In this view, "feature binding" is instantiated in both bottom-up and top-down signalling.
A reasonable critique of the independence assumption in Eq 2 is that not all cortical attributes evolve independently. For example, there is evidence that location and orientation are coded as a bivariate quantity, a low-level coding arising from orientation-selective neurons distributed across the visual field [10]. This could, however, be accommodated by treating location and orientation as a single bivariate attribute. For higher-level attributes, e.g. object identity, an assumption of independence between e.g. location and identity (given z) seems more plausible.
If there are multiple cues m_k = {x_k, ..., x_K}, the corresponding feature-binding computations are analogous.

Hierarchical Dynamics
This model incorporates two levels of dynamics: at the top level, the stable fixed points repel each other according to a deterministic flow function f_i, and at the bottom level, memory traces follow stochastic attractor dynamics with stable states μ_i (as in the current paper). The top-level flow function could be instantiated using piecewise sinusoids (as in the current paper) but with, e.g., unstable fixed points located midway between item values for repulsive dynamics. The strength of the two types of effect would be reflected in the parameter values α and β.
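The two-level scheme can be sketched with a simple Euler-Maruyama loop. This is an illustrative sketch under our own simplifying assumptions, not the paper's scheme: sign-based pairwise repulsion stands in for the top-level flow f_i, and a linear pull toward the nearest attractor stands in for the bottom-level piecewise-sinusoidal dynamics; α and β play their roles as the top- and bottom-level strengths.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_hierarchical(mu0, x0, alpha, beta, sigma, dt=0.01, steps=1000):
    """Two-level Euler-Maruyama sketch:
    top level: attractor locations mu_i repel each other (deterministic),
    bottom level: traces x_j drift toward the nearest mu_i with diffusion."""
    mu = np.array(mu0, float)
    x = np.array(x0, float)
    for _ in range(steps):
        # top level: sign-based pairwise repulsion between fixed points
        # (a stand-in for the repulsive flow function f_i)
        f = np.array([sum(np.sign(m - m2) for m2 in mu if m2 != m) for m in mu])
        mu = mu + alpha * f * dt
        # bottom level: each trace is pulled toward its nearest attractor
        # (a linear stand-in for the piecewise-sinusoidal attractor flow)
        nearest = mu[np.argmin(np.abs(x[:, None] - mu[None, :]), axis=1)]
        x = x + beta * (nearest - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return mu, x
```

With sigma = 0 the deterministic behaviour is visible directly: initially close attractors drift apart while each trace settles onto (and tracks) its nearest attractor.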

Integrated Model
In an integrated model we could test for both response-bias and multi-item dynamics:

dx = β_r g_r(x) dt + β_m g_m(x) dt + σ dw

We could also augment this to include other effects, for example of previous trials (as in [11]), giving

dx = Σ_i β_i g_i(x) dt + σ dw (9)

where g_i(x) is the flow function associated with the ith source of dynamical effects (response bias, multi-item, previous trial), and β_i is the corresponding strength.
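Eq 9 is straightforward to simulate: at each Euler-Maruyama step, the drift is the β-weighted sum of the flow functions. The sketch below is our own minimal illustration (the function name and the example sinusoidal flow are assumptions, not the paper's fitted components).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_integrated(x0, flows, betas, sigma, dt=0.01, steps=500):
    """Euler-Maruyama simulation of Eq 9: dx = sum_i beta_i g_i(x) dt + sigma dw.
    `flows` is a list of flow functions g_i (e.g. response bias, multi-item,
    previous trial) and `betas` holds the corresponding strengths beta_i."""
    x = float(x0)
    for _ in range(steps):
        drift = sum(b * g(x) for b, g in zip(betas, flows))
        x += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x
```

For example, with a single attractive flow g(x) = -sin(x) and no noise, a trace started at x = 1 relaxes toward the stable fixed point at 0; adding further flow functions to the list simply superimposes their drifts.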

w(x, c_j) = H(δ/2 − |x − c_j|)
s(x, c_j) = −sin(2π(x − c_j)/δ)

where c_j is the color of the jth item (or location/orientation, depending on the experimental paradigm), and H(a) is the Heaviside function (1 for a > 0, 0 otherwise). An example of such a flow function is shown in Fig 1.
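The windowed-sinusoid construction can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: g(x) is taken as the sum over items of w(x, c_j)·s(x, c_j), the feature space is treated as circular in radians, and the function names are hypothetical.

```python
import numpy as np

def circ_diff(x, c):
    """Wrapped difference x - c on the circle, mapped to [-pi, pi)."""
    return (x - c + np.pi) % (2 * np.pi) - np.pi

def local_attractor_flow(x, items, delta):
    """Local-attractor flow g(x): each remembered item c_j contributes
    w(x, c_j) * s(x, c_j), i.e. a sinusoid -sin(2*pi*(x - c_j)/delta)
    windowed to a basin of width delta; outside all basins g(x) = 0."""
    g = np.zeros_like(np.asarray(x, dtype=float))
    for c in items:
        d = circ_diff(x, c)
        window = np.abs(d) < delta / 2           # Heaviside window w(x, c_j)
        g = g + window * (-np.sin(2 * np.pi * d / delta))  # windowed sinusoid s(x, c_j)
    return g
```

Note that the flow is zero at each item value (a stable fixed point), pushes traces back toward the item within each basin, and vanishes continuously at the basin edges, leaving purely diffusive regions in between.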

Figure 1 :
Figure 1: Local Attractor Flow Function. The piecewise sinusoidal functions used in the current paper instantiate global attractors because memory traces always experience an attractive force (flow functions are zero only at the stable fixed points; see Fig 2 in the main paper). An alternative choice would be a local attractor, as shown in the figure, defined using windowed sinusoids. This contains regions of zero flow where traces are subject only to diffusive forces.