Prediction is the ability of the brain to quickly activate a target concept in response to a related stimulus (prime). Experiments point to the existence of an overlap between the populations of the neurons coding for different stimuli, and other experiments show that prime-target relations arise in the process of long-term memory formation. The classical modelling paradigm is that long-term memories correspond to stable steady states of a Hopfield network with Hebbian connectivity. Experiments show that short-term synaptic depression plays an important role in the processing of memories. This leads naturally to a computational model of priming, called latching dynamics: a stable state (prime) can become unstable and the system may converge to another transiently stable steady state (target). Hopfield network models of latching dynamics have been studied by means of numerical simulation; however, the conditions for the existence of this dynamics have not been elucidated. In this work we use a combination of analytic and numerical approaches to confirm that latching dynamics can exist in the context of a symmetric Hebbian learning rule, but that it lacks robustness and imposes a number of biologically unrealistic restrictions on the model. In particular, our work shows that the symmetry of the Hebbian rule is not an obstruction to the existence of latching dynamics, although fine tuning of the parameters of the model is needed.
Citation: Aguilar C, Chossat P, Krupa M, Lavigne F (2017) Latching dynamics in neural networks with synaptic depression. PLoS ONE 12(8): e0183710. https://doi.org/10.1371/journal.pone.0183710
Editor: Maurice J. Chacron, McGill University Department of Physiology, CANADA
Received: April 28, 2017; Accepted: August 9, 2017; Published: August 28, 2017
Copyright: © 2017 Aguilar et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: Our work was funded in part by European Research Council Advanced Grant NerVi number 227747, https://erc.europa.eu/advanced-grants. There was no additional external funding received for this study.
Competing interests: The authors have declared that no competing interests exist.
Prediction of changes in the environment is a fundamental adaptive property of the brain [1–5]. To this end, the neural mechanisms underlying prediction must activate in memory potential future stimuli on the basis of preceding ones. In nonhuman primates processing sequences of stimuli, neural activity shows two main dynamics triggered by the presentation of the first stimulus (prime) that precede the second stimulus (target). First, some neurons strongly respond to the first stimulus and exhibit a retrospective activity at an elevated firing rate after its offset [1, 2, 6]. Retrospective activity is considered a neural mechanism of short-term maintenance of the first stimulus in working memory [7–10]. Second, some neurons exhibit an elevated firing rate during the delay between the prime and target, i.e. before the onset of the target, and respond strongly to this target [1, 2, 11–18]. Prospective activity depends on previous learning of the pairs of prime and target stimuli [1, 2, 15, 16, 18–22] and is considered a mechanism of prediction of the second stimulus [23–25]. Further, prospective activity of neurons coding for a stimulus is related to response times to process this stimulus when it is presented [15, 26]. In humans processing sentences, the EEG signal correlates with the level of predictability of target words from preceding prime words [27–31] (see  on fMRI and  on MEG signals). The early stages of processing of a word are facilitated when this word is predictable [27, 34, 35], leading to a shorter processing time . This so-called priming of a target stimulus by a related preceding prime is reliably reported in both human [37–41] and nonhuman primates ; see  for a review. Further, experiments show that the magnitude of priming relies strongly on the relation between the two stimuli stored in memory [29, 42–45] and on the overlap between features of the stimuli in memory [46, 47]. 
In both human and non-human primates, the relation between two stimuli stored in memory depends on the learned sequences of stimuli [42, 48, 49].
Many neurophysiological studies have described learning at the synaptic level as combinations of long-term potentiation (LTP) and long-term depression (LTD) of synapses [50–53]. On this basis, synaptic efficacy is an essential parameter to code the relation between stimuli in memory (e.g., [54, 55]). Further, single cell recordings and local field potentials report that neurons in the macaque cortex respond to several different stimuli [56–58] and that a given stimulus is coded by the activity of a population of neurons [59–61]. As a result, the information about a specific stimulus is distributed across a pattern of activity of a neural population [62, 63]. Two different patterns of activity corresponding to two stimuli can therefore share some active neurons. Hence, such pattern overlap in the populations responsive to different stimuli can code a relation between these stimuli [64, 65].
Computational modelling studies of biologically inspired neural networks have been carried out in the context of the dynamics of neural activity in priming protocols used in human and nonhuman primates. Models show that retrospective activity of a stimulus is possible for high values of synaptic efficacy between neurons that are active to code for this stimulus [66, 67] and that prospective activity of a stimulus not yet presented is possible for high values of synaptic efficacy between neurons coding for the first stimulus and neurons coding for the second stimulus [68, 69]. On this basis, computational models have shown how a large spectrum of priming phenomena depends on the level of prospective activity of neurons coding for the second stimulus [25, 70, 71]. Taken as a whole, models have emphasised the essential role of the matrix of synaptic efficacies for the generation of specific levels of prospective activity generating specific levels of priming.
There have been a number of computational studies focussed on priming generated by the dynamics of populations of neurons with a distributed coding of the stimuli in attractor network models [72–76]. When presented with an external stimulus, these attractor networks converge to a stable steady state and do not activate a sequence of patterns. However, latching dynamics have been described as the internal activation of a sequence of patterns triggered by an initial stimulus  (see also [78–82]). In a model recently introduced [70, 71], several priming effects involved in prediction can be reproduced by latching dynamics that depend on the overlap between the patterns. This is made possible by units that do not maintain constant firing rates, allowing the network to change state instead of converging to a fixed-point attractor. Interestingly, latching dynamics relies on the specific neural mechanisms of neural noise and fast synaptic depression. Neural noise is a fundamental property of the brain [83–85]. One of its functions is to increase the probability of state transitions in attractor networks [86, 87]. However, noise alone does not allow regular sequences of state transitions according to pattern overlap (see ). Fast synaptic depression reported in cortical synapses  rapidly decreases the efficacy of synapses that transmit the activity of the pre-synaptic neuron. A consequence is that the network cannot sustain a stable regime of activity of the neurons in a given pattern and spontaneously changes state. Connectionist models have shown the effects of fast synaptic depression on semantic memory  and on priming [70, 71]. When a stimulus is presented to the network, neurons activated by the stimulus activate each other in a pattern, but fast synaptic depression contributes to their deactivation because they activate each other less and less. 
In the meantime, these neurons begin to activate neurons of a different but overlapping pattern, which, because they are less activated, exhibit less synaptic depression at their synapses. Before fast synaptic depression takes effect, the newly activated neurons can strongly activate their associates in the new pattern. The transition from the old to the new pattern is enabled by the synaptic noise. Hence the combination of neural noise and fast synaptic depression makes latching dynamics possible in attractor neural networks. However, the precise role of each of these mechanisms in changing the network state is still unclear. Further, the necessary and sufficient pattern overlap for latching dynamics and how it combines with synaptic depression and noise are still unknown. The aim of the present approach is to analyze the necessary and sufficient conditions on the combination of neural noise, fast synaptic depression and overlap for the existence of latching dynamics, using the framework of heteroclinic chains .
Before introducing heteroclinic chains we would like to point to a number of works where a combination of synaptic depression and noise is used to study switching between different memory states, that is, latching dynamics [92–95]. In our work we use a deterministic law for the evolution of synaptic depression, which results in a monotonic decrease of the synaptic variable. This approximation requires a mean field limit. The authors of [92–95] use a formulation close to the mean field limit yet allowing for non-monotonic dependence of the coupling function on the firing rate patterns coding for the concepts. As a consequence the resulting latching dynamics can be chaotic. In some other works not only synaptic depression but also facilitation is included in the model . The novelty of our approach is the use of the concept of heteroclinic chains and singular perturbation theory, allowing for accurate prediction of the time and direction of the switches.
The term heteroclinic chain refers to a sequence of steady states joined by connecting trajectories. Heteroclinic chains or cycles have been studied in various contexts, including fluid dynamics, population biology, game theory and neuroscience (see [97–99] for a review), in particular in a model of sequential working memory . Typically such chains involve states of saddle type, acting as a sink for some trajectories and a source for others. Latching dynamics, as investigated in the context of a Hopfield network [70, 71], strongly suggest a link to heteroclinic chains; similar dynamic behaviour has been found for networks of integrate and fire neurons with the same connectivity (learning) rule . Following the set-up of [70, 71] we use Hopfield networks as attractor network models; however, following , we make a small change in the equation defining the network, in order to ensure that heteroclinic chains can exist in a robust manner. Another difficulty is that latching dynamics does not fit into the classical context of heteroclinic chains, as the learned patterns that lose stability due to synaptic depression cannot be seen as states of saddle type. Hence we need to consider generalized heteroclinic chains given as a sequence of connecting trajectories joining attractors (learned patterns) which become unstable due to a slowly varying variable (synaptic strength). The context of heteroclinic chains has the simplicity which allows for the derivation of numerous algebraic conditions that need to be satisfied in order for such chains (and hence latching dynamics) to exist. Therefore our work leads to a better qualitative and quantitative understanding of latching dynamics, including the role of overlap, synaptic depression, noise and feedback inhibition.
The evolution of activity variables/firing rates
As in [70, 71] with somewhat different notations, the system describing the dynamics of N neurons (populations) is as follows (1) where uj is the activity variable (average membrane potential) of neuron j, I is a constant external input, xj is the firing rate of neuron j, the coefficients Jij express the strength of the excitatory connections from neuron j to neuron i and τ is the time constant, measured in milliseconds. The terms −λ∑xi and −I represent inhibition, discussed in more detail below. The firing rate is itself a monotonically increasing function of the activity variable with limiting values 0 and 1. This function is often taken as x = g(u) = (1 + e^(−u/μ))^(−1). In this work we will use an approximation of g, as shown below.
System (1) can be expressed in terms of firing rates by means of the transformation xi = g(ui). Learned patterns are steady state patterns of Eq (1) of the form (ξ1, …, ξN), ξj = 0 or 1. In order to apply the linearized stability principle we need to be able to evaluate partial derivatives of the right hand side of Eq (1) at the learned patterns. However, for such states the derivatives do not exist for the particular choice of g. This makes it impossible to apply the algebraic method of linearization (computation of eigenvalues of the linearized system) in the current models. We can remedy this using the approach introduced in  by replacing the function g^(−1)(x) = ln x − ln(1 − x) by its Taylor expansion fq(x) at x = 1/2 up to some arbitrary order q. When we let q tend to infinity, fq tends uniformly to g^(−1) on any closed interval contained in (0, 1). In the following, for simplicity, we take the expansion to first order, f1(x) = 4x − 2 (this corresponds to q = 1). A different choice of q would not significantly alter our results. After renaming the parameters we arrive at the equations (2). The system (2) has the following fundamental property: any vertex, edge, face or hyperface of the cube [0, 1]^N is flow-invariant, that is, trajectories with starting point in any one of these sets are entirely contained in it. This implies in particular that the vertices are equilibria, or steady-states, of Eq (2). The vertices have coordinates 0 (inactive unit) or 1 (active unit). Hence vertices correspond to patterns for the neural network and whenever a vertex is a stable equilibrium it represents a learned pattern.
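The invariance property can be checked numerically. The following is a minimal sketch, assuming that Eq (2) takes the form dxi/dt = xi(1 − xi)(−μxi − I − λ∑j xj + ∑j Jij xj), which is consistent with the terms discussed in this section but is an assumption made here for illustration. The function name `rhs`, the random matrix and all parameter values are hypothetical.

```python
import itertools
import numpy as np

def rhs(x, J, mu, I, lam):
    """Assumed form of Eq (2) for firing rates x in the cube [0, 1]^N."""
    drive = -mu * x - I - lam * x.sum() + J @ x
    return x * (1.0 - x) * drive

# Hypothetical parameters and a random non-negative connectivity matrix.
N = 4
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(N, N))
mu, I, lam = 0.2, 0.5, 0.4

# 1) Every vertex is an equilibrium: the factor x_i (1 - x_i) vanishes there.
max_vertex_residual = max(
    np.abs(rhs(np.array(v, dtype=float), J, mu, I, lam)).max()
    for v in itertools.product([0, 1], repeat=N)
)

# 2) Faces are invariant: on the set {x_1 = 0, x_4 = 1} the first and fourth
#    components of the vector field are zero, so the flow cannot leave it.
x_on_face = np.array([0.0, 0.3, 0.8, 1.0])
velocity = rhs(x_on_face, J, mu, I, lam)

print("largest |rhs| over all vertices:", max_vertex_residual)
print("velocity on the face:", velocity)
```

Under this assumed form the invariance holds for any choice of J, μ, I and λ, because it only relies on the prefactor xi(1 − xi).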
The modelling of synaptic depression
According to the synaptic depression assumption the coefficients Jij vary in time according to the rule: (3) where the evolution of the synaptic variable si is given as follows  (4) τr and U being the time constant of the recovery of the synapse and the maximal fraction of used synaptic resources.
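As an illustration of the depression variable, the sketch below assumes that Eq (4) has the form dsi/dt = (1 − si)/τr − U xi si. This form is an assumption, chosen because it has the steady state si = 1 for an inactive unit and si = (1 + τr U)^(−1) for a unit firing at rate 1. The function names and the values of τr and U are hypothetical.

```python
import math

def s_closed_form(t, s0, x, tau_r, U):
    """Solution of the assumed Eq (4) when the firing rate x is held constant."""
    rate = 1.0 / tau_r + U * x          # relaxation rate of the linear ODE
    s_inf = (1.0 / tau_r) / rate        # steady state, equal to 1 / (1 + tau_r * U * x)
    return s_inf + (s0 - s_inf) * math.exp(-rate * t)

def s_euler(t_end, s0, x, tau_r, U, dt=0.01):
    """Explicit Euler integration of the same equation, for comparison."""
    s = s0
    for _ in range(int(round(t_end / dt))):
        s += dt * ((1.0 - s) / tau_r - U * x * s)
    return s

tau_r, U = 300.0, 0.01                  # hypothetical values

s_limit_active = 1.0 / (1.0 + tau_r * U)                  # limit for an active unit
s_late = s_closed_form(1e5, 1.0, 1.0, tau_r, U)           # long-time value, x = 1
s_inactive = s_closed_form(50.0, 1.0, 0.0, tau_r, U)      # inactive unit stays at 1
s_exact = s_closed_form(100.0, 1.0, 1.0, tau_r, U)
s_numeric = s_euler(100.0, 1.0, 1.0, tau_r, U)

print(s_limit_active, s_late, s_inactive, s_exact, s_numeric)
```

Because the equation is linear in si for fixed xi, the closed form can be used wherever the firing rates are approximately constant, for example while the trajectory stays near a learned pattern.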
The modelling of excitatory connections
We assume that the matrix of excitatory connections (Jmax)i, j = 1, …n is derived from a set of learned patterns which must be stable steady states of the system. Following  we use the Hebbian learning rule for sparse matrices, as introduced in , see also . According to this rule the coefficients of the connectivity matrix (Jmax)i, j = 1, …n (without synaptic depression) satisfy (5) where ξ1, …, ξP are the learned patterns, N is the total number of neurons and p is the ratio of active to inactive units, measuring the sparsity of the matrix J. Note that the matrix given by Eq (5) is symmetric.
Let us set ν = (Np(1 − p))^(−1). We simplify the expression (5) by introducing a change of variables and parameters, see also . The rhs of Eq (2) can be rewritten as where now (6) Note that Jmax is symmetric, while J at t > 0 will not be so (as long as some and not all neurons are active, see Eq (3)). In other words, synaptic depression has the effect of breaking the symmetry of the connectivity matrix.
We rename parameters μ/ν as μ, etc., rescaling time by t = t′/ν and update the definition of Jij in Eq (2) by , with given by Eq (6). Further, we assume that the connectivity matrix is sparse, which is consistent with neurophysiological data , as well as with computational models showing that a sparse matrix allows maximal storage capacity [106–110]. In the following we shall assume that p ≪ 1 (sparse matrix) and replace p by 0 in Eq (6). This guarantees that the weights are positive, which is consistent with the assumption that they correspond to the excitatory connections. Moreover, given that ν is a constant between 0 and 1 and, due to the sparsity of , is not particularly close to 0, we set, for simplicity, ν = τ. This choice does not qualitatively alter our results. Hence the context of our study is system (2) with τ = 1 and the goal is to find latching dynamics between learned patterns with the connectivity matrix given by Eq (6). As an intermediate stage of our investigation we will consider systems of the form (2) with weights that do not satisfy Eq (6).
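The effect of the learning rule and of the symmetry breaking can be sketched as follows. The code assumes that, after the rescaling and with p replaced by 0, rule (6) reduces to Jmax_ij = ∑k ξk_i ξk_j, and that Eq (3) scales each coefficient by the synaptic variable of the presynaptic neuron, Jij = Jmax_ij sj. Both forms are assumptions made for illustration; the function name `hebbian_matrix`, the two patterns and the values of s are hypothetical.

```python
import numpy as np

def hebbian_matrix(patterns, p=0.0):
    """Sum over patterns of (xi_i - p)(xi_j - p); prefactor absorbed by rescaling."""
    xi = np.asarray(patterns, dtype=float)   # shape (P, N)
    centered = xi - p
    return centered.T @ centered

# Two hypothetical overlapping patterns on three neurons.
patterns = [[1, 1, 0],
            [0, 1, 1]]
J_max = hebbian_matrix(patterns)

# Assumed depression rule: column j is scaled by s_j (presynaptic neuron j).
s = np.array([0.6, 0.6, 1.0])            # neurons 1 and 2 depressed, neuron 3 rested
J_t = J_max * s[np.newaxis, :]

print(J_max)
print(J_t)
print("J_max symmetric:", np.array_equal(J_max, J_max.T))
print("J(t) symmetric:", np.allclose(J_t, J_t.T))
```

With p = 0 all coefficients are non-negative, and the depressed matrix is symmetric only when all synaptic variables are equal.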
The modelling of inhibition
The term −I in Eq (2) corresponds to constant (tonic) inhibition. Due to the presence of this term, the pattern consisting of all neurons inactive is stable.
The term − λ∑xi is the non-selective inhibition, depending on the activity of the specific neurons. This contribution should be thought of as feedback inhibition: a pyramidal neuron which is active excites some interneurons which contribute an inhibitory feedback. The choice of the dense inhibitory connectivity is supported by the experimentally known fact that interneurons are characterised by an extensive axonal arborisation, which allows each one of them to reach a large number of pyramidal cells in a local network .
The modelling of noise
Noise plays an important role in facilitating the transition from steady states that lose stability to the ones that follow in the sequence of latching dynamics. Therefore the noisy perturbation should not have the factor of x(1 − x), which would make it very small near the vertices, yet it must preserve the invariance of the cube [0, 1]^N. We construct the noise term starting with white noise and subsequently modify it so that it points towards the interior of the cube. This noise term can be thought of as a fluctuation of the firing rate due to random presence or suppression of spikes. The role of the adjustment brought to white noise is to ensure that negative firing rates or firing rates greater than 1 do not arise. In practice, in our simulations we add a noisy perturbation to the initial condition at regular intervals of time, making sure that the perturbations are positive for firing rates near 0 and negative for firing rates near 1.
Sparse connectivity and the non-selective inhibition imply that stable patterns contain only a few active neurons.
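One possible way to realise such a perturbation is sketched below. This is a hypothetical scheme and not necessarily the one used in the simulations: Gaussian noise is drawn, its sign is forced to be positive for rates within `margin` of 0 and negative for rates within `margin` of 1, and the result is clipped to [0, 1]. The function name `perturb_inward`, the margin and the amplitude are assumptions.

```python
import numpy as np

def perturb_inward(x, amplitude, rng, margin=0.05):
    """Add a random kick that points into the cube near its faces."""
    noise = amplitude * rng.standard_normal(x.shape)
    noise = np.where(x < margin, np.abs(noise), noise)          # push up near 0
    noise = np.where(x > 1.0 - margin, -np.abs(noise), noise)   # push down near 1
    return np.clip(x + noise, 0.0, 1.0)                         # stay inside [0, 1]

rng = np.random.default_rng(42)
x = np.array([0.0, 0.0, 1.0, 1.0, 0.5])
x_kicked = perturb_inward(x, amplitude=0.01, rng=rng)
print(x_kicked)
```

In a simulation, the function would be applied to the firing rates at regular time intervals between deterministic integration steps.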
One of the goals of this work is to show that latching dynamics is approximated by heteroclinic chains, which we introduce here. Given a sequence of steady state patterns ξ1, …, ξM, M < P, a heteroclinic chain consists of a sequence of connecting trajectories (dynamic transition patterns) between these patterns and of a sequence of time instances 0 < t1 < ⋯ < tM such that the transition from ξk to ξk+1 exists for the coefficients of the connectivity matrix Jij evaluated at tk, that is , where i, j = 1, 2, …, n.
We will show that latching dynamics is closely approximated by heteroclinic chains. Hence, finding latching dynamics in our model becomes equivalent to the problem of finding heteroclinic chains. This problem can therefore be formulated as follows:
Problem: let there be given a sequence of patterns ξ1, …, ξM, M < P, where ξk and ξk+1 share at least one active unit for all k = 1, …, M − 1. Under which conditions does there exist a sequence of connecting trajectories ξ1 → ⋯ → ξM, so that a heteroclinic chain is realized between these patterns?
In this section we argue that latching dynamics is closely approximated by heteroclinic chains and subsequently investigate the possibility and feasibility of the existence of heteroclinic chains in our model. As justified in more detail below, sparse connectivity of excitation and dense connectivity of feedback inhibition imply that latching dynamics involves only a few learned patterns, consisting of a small number of active neurons. Moreover, as we argue below, there is significant overlap between the patterns. Hence we expect that there exists a small subnetwork, weakly connected to the rest of the network, which supports a heteroclinic chain. The connectivity matrix restricted to this subnetwork is not necessarily obtained from the learning rule (6). Based on this argument we break up the problem into two parts:
- we consider a small network (the prototype of a subnetwork), designing the connectivity matrix so that a heteroclinic chain connecting a priori specified patterns exists,
- we construct a larger network whose connectivity matrix is derived from the learning rule (6) such that the small network is its subnetwork. Our construction leads naturally to a matrix with sparse connectivity. It is known that connectivity in the brain is only about 10% [105, 112–116], hence our construction is consistent with the biophysical data.
We carry out this procedure for a few examples illustrating the general principle.
The structure of Eq (2) makes the eigenvalues of the system linearized at each steady state pattern lying on a vertex of the hypercube [0, 1]^N easy to compute (diagonal Jacobian matrix). Let ξ = (ξ1, …, ξN) be a vertex (hence ξj = 0 or 1); then the eigenvalue at ξ along the coordinate axis xk has the form (7). The stability condition is now
(S) σk < 0 for all k = 1, 2, …, n.
Note that this algebraic method would not be available if we had not replaced the function f, equal to the inverse of the transfer function, by its Taylor polynomial.
The assumption of sparsity implies that for each k only a few of the coefficients Jkj can be non-zero. This means that in a stable pattern only a few ξj’s can be non-zero, otherwise the contribution of the non-selective inhibition would not allow the stability condition to hold.
Formula (7) and the stability condition (S) are the tools that will allow us to create conditions for the existence of heteroclinic chains and establish the role of the overlap between learned patterns. We show that such overlap is needed for the existence of a heteroclinic chain.
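To make the use of formula (7) concrete, the sketch below assumes that Eq (2) reads dxi/dt = xi(1 − xi)(−μxi − I − λ∑j xj + ∑j Jij xj), so that the eigenvalue at a vertex ξ along xk would be σk = (1 − 2ξk)(−μξk − I − λ∑j ξj + ∑j Jkj ξj). Both expressions are assumptions made for illustration. The code compares this formula with a finite-difference Jacobian and evaluates it at the all-inactive pattern; function names and parameter values are hypothetical.

```python
import numpy as np

def rhs(x, J, mu, I, lam):
    """Assumed form of Eq (2)."""
    return x * (1.0 - x) * (-mu * x - I - lam * x.sum() + J @ x)

def vertex_eigenvalues(xi, J, mu, I, lam):
    """Assumed form of formula (7): one eigenvalue per coordinate axis."""
    xi = np.asarray(xi, dtype=float)
    drive = -mu * xi - I - lam * xi.sum() + J @ xi
    return (1.0 - 2.0 * xi) * drive

N = 5
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(N, N))   # hypothetical connectivity
mu, I, lam = 0.2, 0.5, 0.4               # hypothetical parameters
xi = np.array([1.0, 1.0, 0.0, 0.0, 0.0])

# Central finite differences for the Jacobian of rhs at the vertex xi.
h = 1e-6
jac = np.empty((N, N))
for j in range(N):
    step = np.zeros(N)
    step[j] = h
    jac[:, j] = (rhs(xi + step, J, mu, I, lam) - rhs(xi - step, J, mu, I, lam)) / (2.0 * h)

sigma = vertex_eigenvalues(xi, J, mu, I, lam)
off_diagonal = jac - np.diag(np.diag(jac))
sigma_rest = vertex_eigenvalues(np.zeros(N), J, mu, I, lam)

print("formula (7):      ", sigma)
print("numerical diagonal:", np.diag(jac))
print("all-inactive state:", sigma_rest)
```

Under these assumptions the Jacobian at a vertex is diagonal, and the all-inactive pattern has all eigenvalues equal to −I, in line with the remark that tonic inhibition stabilises it.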
Sparse coding and the relation between latching dynamics and heteroclinic chains
Consider a pattern ξ = (ξ1, ξ2, …, ξN) and suppose ξ is a stable steady state of Eq (2) with sj = 1, j = 1, …, N. We will argue that only a few of the components ξj can be significantly larger than 0. Assume this is not the case. Then, either there is a large number of ξj’s satisfying 0 < ξj < 1 or there is a large number of ξj’s equal to 1. In the first case λ∑ξj must be larger than ∑Jij ξj, so that the pattern cannot be a steady state. If the second case arises then the pattern can be a steady state but it cannot be stable, since the eigenvalues corresponding to the entries equal to 1 must be positive, see Eq (7). Hence the number of components that are not very close to 0 must be small. This property is an expression of sparse coding (few neurons code for every concept), which, in the context of our model, is a consequence of sparse connectivity in the excitatory network and dense connectivity of feedback inhibition.
The above argument implies that latching dynamics must necessarily take place near the surface of [0, 1]^N, as the learned patterns have to be close to the surface and the transitions between them cannot take a very long time, otherwise synaptic depression would weaken the excitatory connections, making it unlikely for such transitions to occur. Hence the question is to identify dynamics on the boundary of the cube that could be a good approximation of latching dynamics. In fact the argument of sparse coding given above suggests that the dynamics is restricted to a subset of the boundary of the cube of relatively small dimension. In a heteroclinic chain we find a model of the dynamics in the edges of the cube which, when perturbed by noise, gives a faithful representation of latching dynamics.
Constructing heteroclinic chains
Due to the action of synaptic depression each of the learned patterns in a heteroclinic chain must lose stability due to one or more of the eigenvalues σk becoming positive. We assume that no two eigenvalues become positive at the same time, which implies that a noisy trajectory must follow the direction of the unstable eigenvalue. We will assume that in order to pass from one learned pattern to the next, the trajectory follows the edge corresponding to the unstable eigenvalue to the opposite vertex, which is a saddle point with a single unstable direction connecting to the next learned pattern in the chain. The chain will therefore be a sequence of elementary chains consisting of connections with three elements: a learned pattern ξi that becomes unstable due to synaptic depression (prime), a transition pattern of saddle type, and the next learned pattern ξi+1 (target). The active units in the transition pattern correspond to the overlap between the patterns ξi and ξi+1. The fact that the transition pattern should be unstable imposes another condition on the eigenvalues. These conditions will be presented in the next section.
We argue that an elementary chain is the most likely mechanism of transition from ξi to ξi+1. It is certainly the simplest case dynamically. Any more complicated dynamics would be likely to increase the passage time, so that the target pattern could lose stability due to synaptic depression before becoming active in the chain. Finally, more complicated dynamics would require the existence of additional unstable eigenvalues leading to additional constraints on the matrix .
Constraints on the connectivity matrix
We now state the algebraic constraints from the eigenvalue conditions (7) which define the parameter regions where heteroclinic chains could exist (see S1 File for a derivation of these conditions). These conditions are only necessary, in fact our numerics show that heteroclinic chains which follow a prescribed sequence of connections arise in a reliable manner in yet smaller parameter regions. The origin of these conditions is the requirement of stability of the steady states in the absence of synaptic depression combined with the requirement of the existence of a passage to the next steady state once the state currently attracting the dynamics loses stability due to synaptic depression. A cycle we consider joins a sequence of learned patterns ξ1 → ⋯ → ξp such that each of them has exactly m excited neurons (with entry 1) and the switching from one pattern to the next corresponds to switching the values in two entries. Possibly after re-arrangement of the indices it is no loss of generality to assume that In addition we have p − 1 transition patterns We make a simplifying assumption that the entries of Jmax are 0 outside of a band around the diagonal of width 2m − 1 (this is consistent with the requirement of the sparsity of the matrix). We introduce: (8) The requirement that the patterns ξ1, …ξp are stable in the absence of synaptic depression can be expressed, using Eq (7), by the condition (9) (10) Other types of constraints come from the fact that, in the time interval of transition from one pattern to the next, the dynamics must approach a transition state from the direction of the prime pattern and leave in the direction of the target pattern. It means that there is a time instance such that (11) which implies a weaker condition (12) The combination of Eqs (9), (10) and (12) places severe restrictions on the parameters λ, I and . 
Additional constraints can be derived from the fact all the other directions of have to be stable, in order to ensure the reliability of the cycle, but we did not explore these conditions here. For m = 2 we can use Eq (11) and the fact that there is only one synaptic variable pertaining to to obtain the inequality: . From the symmetry of the connectivity matrix we conclude that . In other words, the elements on the upper diagonal and the lower diagonal must be increasing. This, combined with Eqs (9), (10) and (12), gives, for m = 2 (13) The property of increasing diagonal elements, in practice, prevents the existence of long chains as the large coefficients will activate the corresponding neurons just due to the presence of noise.
Conditions based on slow/fast dynamics
To take advantage of the fact that the synaptic variables si are slow compared to the firing rates we write the equation for the si’s in the form (14) with In this formulation the model has a time scale separation and ρ is a regular parameter.
The transitions are governed by Eq (2), which implies that as ε → 0 each one of them lasts for an approximately constant positive amount of time. It follows that as ε → 0 the change in si in the transition period tends to 0, that is si remains close to a constant value, approximated by bifurcation points, corresponding to the loss of stability of ξi. These values can be derived in a recurrent manner as follows:
- At t = 0, s1 = s2 = 1. Until the loss of stability of ξ1 it holds that x1 ≈ 1 and x2 ≈ 1. Hence, in that period, we set x1 = x2 = 1 and Eq (14) becomes (15) where i = 1, 2. Since s1 and s2 have the same initial condition they remain equal. Using Eq (7) we derive that the time when ξ1 loses stability is defined by: Hence the bifurcation is given by the s1 and s2 values:
- Assuming that s2 does not change during the transition , at the beginning of the period when the trajectory is near ξ2, we have and s3 = 1. The evolution of (s2, s3) until the next stability loss is Eq (15) with i = 2, 3. Note that Eq (15) is a linear equation and hence can be solved: (16) By Eq (7) we must now choose tB so that Hence the values of (s2, s3) corresponding to the point of the loss of stability of ξ2 are and . To obtain and we solve the linear system (17) with . The values of and can now be found by solving this linear equation (the determinant of the matrix is easily shown to be non-zero).
- The process of the derivation of the bifurcation values is iterative. If is known then and are obtained by solving the linear equation
This calculation allows us to check for a given matrix J, if the specified heteroclinic chain exists for sufficiently small ε. The first set of conditions is
The second set of conditions is obtained based on the requirement that, for each i = 1, 2, …, P, at the ith bifurcation point the following properties hold:
- ξi+1 must be stable,
- must be a saddle with the direction of ei stable (ei is the vector with ith component 1 and the other components 0),
- ξi must be stable in the direction of ei+1.
Explicit conditions can be derived in an iterative manner using Eq (7); in the sequel we do this numerically in the context of specific examples. If these conditions are satisfied then a chain exists for sufficiently small ε and small noise. If the conditions are violated then the chain does not exist unless the noise is large and the transitions are driven exclusively by noise.
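The first step of such a numerical check can be sketched as follows. The code rests on several assumptions that are labelled here because they are not taken from the equations of this section: Eq (14) is assumed to read dsi/dt = ε(1 − (1 + ρxi)si), Eq (3) is assumed to be Jij = Jmax_ij sj, and formula (7) is assumed to be σk = (1 − 2ξk)(−μξk − I − λ∑j ξj + ∑j Jkj ξj). The firing rates are frozen at the current pattern and time is scanned on a grid until the first eigenvalue becomes positive. The matrix, pattern and parameter values are hypothetical.

```python
import numpy as np

def eigenvalues(xi, J_max, s, mu, I, lam):
    """Assumed formula (7) with depressed weights J_ij = Jmax_ij * s_j."""
    J = J_max * s[np.newaxis, :]
    drive = -mu * xi - I - lam * xi.sum() + J @ xi
    return (1.0 - 2.0 * xi) * drive

def s_at(t, s0, xi, eps, rho):
    """Closed-form solution of the assumed Eq (14) with firing rates frozen at xi."""
    s_inf = 1.0 / (1.0 + rho * xi)
    return s_inf + (s0 - s_inf) * np.exp(-eps * (1.0 + rho * xi) * t)

def first_stability_loss(xi, J_max, s0, mu, I, lam, eps, rho, t_max=1e4, n=100001):
    """Return (time, index of unstable direction, s) at the first loss of stability."""
    for t in np.linspace(0.0, t_max, n):
        s = s_at(t, s0, xi, eps, rho)
        sigma = eigenvalues(xi, J_max, s, mu, I, lam)
        if np.any(sigma > 0.0):
            return t, int(np.argmax(sigma)), s
    return None   # the pattern stays stable on the scanned time window

# Hypothetical three-neuron example with two overlapping patterns.
J_max = np.array([[1.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 1.0]])
xi1 = np.array([1.0, 1.0, 0.0])
mu, I, lam = 0.2, 0.6, 0.3
eps, rho = 1.0 / 300.0, 3.0

result = first_stability_loss(xi1, J_max, np.ones(3), mu, I, lam, eps, rho)
print(result)
```

The returned synaptic values can serve as initial conditions for the same computation at the next pattern of the chain, which mirrors the iterative structure of the derivation. The time resolution is set by the grid, so a root finder would be preferable if accurate bifurcation values are needed.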
Satisfying the learning rule
Our results show that, given a small network, the parameters have to be tuned quite precisely to obtain a heteroclinic chain. Adding the requirement that the matrix is obtained using the learning rule (6) gives an even more severe constraint. We have constructed examples of networks supporting heteroclinic chains with each neuron involved in some of the patterns forming the chain. In each case the connectivity coefficients we used had larger values than given by the patterns involved in the chain alone. To solve this problem we designed a method of defining a larger system, with the connectivity matrix of the form (20) and added learned patterns which do not participate in the chain but have an overlap with the patterns forming the chain, so that the matrix is obtained using the learning rule (6). The matrix A consists of many blocks with few non-zero coefficients that are small in comparison to the entries of . The matrix B is block diagonal, with the off-diagonal entries in each of the blocks equal to 1. The added learned patterns must satisfy the constraints (9) and (10) to ensure their stability. This way the matrix is sparse (about 25% non-zero elements in the example we constructed). There is no natural algorithm to construct , so we refrain from making any further specifications. We constructed for a specific example, see S2 File.
A simple example—An elementary chain
In this section we examine the simplest case of a network of three neurons and two learned patterns, ξ1 = (1, 1, 0) and ξ2 = (0, 1, 1) (21). This example is the prototype of an elementary chain: a sequence of connecting orbits (dynamic transition patterns) along edges of the cube [0, 1]³ that connect these patterns. This requires the presence of one intermediate equilibrium, (0, 1, 0), which by assumption is not a learned pattern and has an unstable direction along the coordinate which passes from 0 to 1 in the sequence ξ1 → (0, 1, 0) → ξ2 (22). Computing the connectivity matrix from the learning rule (6) with patterns ξ1 and ξ2 is straightforward and gives the matrix (23). Applying formula (7) with N = 3 and Jmax given by Eq (23), it is easily checked that the two learned patterns have negative eigenvalues, hence are stable, in the absence of synaptic depression, iff 1 < I + 2λ < 2 − μ. This is our first requirement.
In the remainder of this section, eigenvalues are labelled by the equilibrium at which they are evaluated (ξi or the intermediate state) and by the coordinate direction xk.
We now “switch on” synaptic depression. As time elapses, synaptic weights will be modified according to Eq (4). For a given pattern (a steady state of Eq (2) on a vertex of the cube [0, 1]³), the evolution of the synaptic variable si (i = 1, 2 or 3) can be of two types: (i) if xi = 0, the value si = 1 is a steady state of Eq (4); (ii) if xi = 1, then si decreases monotonically towards the limit value S = (1 + τU)⁻¹, which is the steady state of Eq (4) at xi = 1.
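The two regimes can be illustrated with a short numerical sketch. Eq (4) is not reproduced in this section, so the code assumes the standard depression form ds/dt = (1 − s)/τ − U x s, which is consistent with the two steady states quoted above (s = 1 at x = 0 and s = (1 + τU)⁻¹ at x = 1). The function name, the step size and the integration time are illustrative choices, not taken from the paper.

```python
def integrate_synapse(x, tau=100.0, U=0.004, s0=1.0, dt=0.1, t_end=2000.0):
    """Forward-Euler integration of ds/dt = (1 - s)/tau - U*x*s.

    This is an ASSUMED form of Eq (4); x is held fixed at a vertex value
    (0 or 1). Returns the list of successive values of s.
    """
    n_steps = int(round(t_end / dt))
    s = s0
    trace = [s]
    for _ in range(n_steps):
        s += dt * ((1.0 - s) / tau - U * x * s)
        trace.append(s)
    return trace


tau, U = 100.0, 0.004
S = 1.0 / (1.0 + tau * U)                               # limit value at x = 1
active = integrate_synapse(x=1.0, tau=tau, U=U)   # neuron firing at x = 1
silent = integrate_synapse(x=0.0, tau=tau, U=U)   # neuron at rest, x = 0
print(f"S = {S:.4f}")
print(f"s at end of run, x = 1: {active[-1]:.4f}")
print(f"s at end of run, x = 0: {silent[-1]:.4f}")
```

With τ = 100 and U = 0.004 the limit value is S = 1/1.4, so the synaptic variable of an active neuron relaxes from 1 towards roughly 0.71, while that of a silent neuron stays at 1.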
Let us describe in terms of the eigenvalues associated with each pattern, the scenario which would lead to the expected heteroclinic chain.
First, the weakening of the synaptic variable s1 should modify the stability of the learned pattern ξ1 in such a way that a trajectory appears connecting ξ1 to the intermediate equilibrium along the first coordinate x1. This entails that the eigenvalue at ξ1 along coordinate x1, which initially is negative, becomes positive after some time. This is a kind of dynamic bifurcation, in which time plays the role of a bifurcation parameter through the evolution of the synaptic variables. Simultaneously we want the eigenvalue at the intermediate equilibrium along x1 to become negative in finite time, so that this state is attracting along direction x1. Eq (7) yields a sufficient condition for this.
The second step is to check conditions which would allow for the existence of a trajectory connecting the intermediate equilibrium to ξ2 along the edge with coordinate x3. This requires that the eigenvalue at the intermediate equilibrium along x3 be positive after some time (possibly already at t = 0), which by Eq (7) gives a second condition. The two conditions we just derived are incompatible with the matrix of Eq (23), so we can conclude that this matrix does not admit the heteroclinic chain Eq (22).
From the above discussion we can infer a necessary condition for the existence of the chain Eq (22): since Jmax is symmetric, its upper diagonal must have strictly increasing coefficients. In addition we also require that: (i) ξ1 be “more” stable in the x2 direction than in the x1 direction at t = 0, so that a trajectory starting close to this equilibrium will first destabilize in the x1 direction (it may of course not destabilize at all); (ii) the x2 direction at the intermediate equilibrium be stable; (iii) ξ2 be stable when t is large enough. This imposes additional conditions on the coefficients of Jmax. Let us show that the matrix (24) satisfies all these conditions and hence admits the heteroclinic chain Eq (22). Of course this is an ad hoc construction; however, we shall show in S2 File that Eq (24) is a submatrix of the connectivity matrix of a subnetwork of a large sparse network under the learning rule (6) (see Eq (20)). For this matrix the conditions read:
- 2 < I + 2λ < 3 (stability of the learned patterns in the absence of synaptic depression),
- 3S < I + 2λ (the associated eigenvalue becomes > 0 in finite time),
- 1 < I + λ,
- I + λ < 2S (the associated eigenvalue becomes > 0 in finite time),
- I + 2λ < 2S + 2.
These conditions are all satisfied if τ and U are chosen such that 2/3 < S and the point (λ, I) lies in the triangle defined by the inequalities I > 0, 2 < I + 2λ and I + λ < 2S. The value of S = (1 + τU)⁻¹ being given, these conditions can be conveniently represented graphically. Fig 1 shows an example with τ = 100 and U = 0.004.
Let us illustrate this result numerically. We have integrated Eqs (2) and (4) with N = 3 and parameter values τ = 100, U = 0.004, I = 0.15 and λ = 1.2. Fig 2 shows a time series of xi(t) (upper panel) and si(t) (lower panel) with an initial condition close to ξ1. In order to observe the transitions of Eq (22) in reasonable time we have incorporated noise in the code, in the form of a small random deviation added to the xi variables at the start of each new integration step (simulation with Matlab).
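The inequalities listed above can be checked directly for the parameter values used in this simulation. The following sketch only evaluates the five listed conditions at a given point (I, λ, τ, U); it does not derive them from Eq (7), and the function name is a hypothetical helper introduced here for illustration.

```python
def chain_conditions(I, lam, tau, U):
    """Evaluate the five listed inequalities for the three-neuron example."""
    S = 1.0 / (1.0 + tau * U)      # limit value of the synaptic variable
    total = I + 2.0 * lam
    return {
        "2 < I + 2λ < 3": 2.0 < total < 3.0,
        "3S < I + 2λ": 3.0 * S < total,
        "1 < I + λ": 1.0 < I + lam,
        "I + λ < 2S": I + lam < 2.0 * S,
        "I + 2λ < 2S + 2": total < 2.0 * S + 2.0,
    }


# Parameter values quoted for the simulation of the elementary chain.
checks = chain_conditions(I=0.15, lam=1.2, tau=100.0, U=0.004)
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

For these values S ≈ 0.714, I + λ = 1.35 and I + 2λ = 2.55, so all five inequalities hold. Raising U to 0.01 gives S = 0.5 and violates I + λ < 2S, which illustrates how narrow the admissible parameter region is.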
A case with five neurons and the extended network of 61 neurons
We consider the matrix (25) with I = 0.3, λ = 3.4, μ = 3.1, τr = 400 and U = 0.01. This matrix and these parameter values meet all conditions for the existence of a heteroclinic chain joining the patterns ξ1 = (1, 1, 0, 0, 0), …, ξ4 = (0, 0, 0, 1, 1); however, Eq (25) is not derived from the learning rule (6). In Fig 3 we show a simulation of a chain of four states existing for the above parameters.
Panel (a) shows the evolution of the firing rate variables, panel (b) shows the evolutions of the synaptic variables.
We extended this network to a network of 61 neurons with the connectivity matrix satisfying the learning rule (6), using the approach described earlier. The details of the construction can be found in S2 File. Fig 4 shows a simulation for the required chain joining the states whose first five components are as in ξ1 = (1, 1, 0, 0, 0), …, ξ4 = (0, 0, 0, 1, 1) specified above and whose remaining 56 components are 0.
The parameter settings are the same as in the simulation of Fig 3. Panel (a) shows the evolution of the firing rate variables, panel (b) shows the evolution of the synaptic variables.
Sparsity of the extended network.
Electrophysiological studies suggest that connectivity in the brain is sparse with only approximately 10% of pairs of neurons connected [105, 112–116]. In the extended network obtained in this section and in S2 File the ‘emptiness’ of the matrix (fraction of zero-weight synapses) is above 75%, which is consistent with neurophysiological data as well as with computational models showing that a sparse matrix allows maximal storage capacity [106–110]. The present results show that sparsity is necessary not only to improve storage capacity (ensure the stability of learned patterns) but also to enable the sequential activation of patterns. Indeed, in the case of Hebbian learning considered here, heteroclinic chains involving patterns defined by the activity of neurons e.g. 1-6 are possible only if the synaptic matrix obeys conditions on the efficacies along the subdiagonal. These conditions depend in turn on the role of additional neurons (n) among a large number of ‘non-coding neurons’ taken into account in the learning equation. This is possible under conditions of sparse coding of the patterns.
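The ‘emptiness’ measure used here is simply the fraction of zero-weight synapses. A minimal sketch of this computation follows; the 4 × 4 matrix is a toy example and not the 61-neuron matrix of S2 File, and whether diagonal entries should be counted is an assumption left to the caller.

```python
def emptiness(J, include_diagonal=False):
    """Fraction of zero entries in a square connectivity matrix J.

    J is a list of rows. By default the diagonal (self-connections) is
    excluded; this convention is an assumption, not taken from the paper.
    """
    n = len(J)
    entries = [
        J[i][j]
        for i in range(n)
        for j in range(n)
        if include_diagonal or i != j
    ]
    return sum(1 for w in entries if w == 0) / len(entries)


# Toy symmetric matrix with two connected pairs of neurons.
J_toy = [
    [0, 1, 0, 0],
    [1, 0, 2, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 0],
]
print(emptiness(J_toy))                          # off-diagonal entries only
print(emptiness(J_toy, include_diagonal=True))   # all entries
```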
Reliability of the chain.
We have computed the reliability of the chain with the prescribed sequence (1, 1, 0, 0, 0) → (0, 1, 1, 0, 0) → (0, 0, 1, 1, 0) → (0, 0, 0, 1, 1) in the extended network of 61 neurons, with respect to the parameter U (maximal fraction of used synaptic resources). For parameter values distributed in the interval 0.0005 ≤ U ≤ 0.110 we performed 10 simulations with the same initial conditions for each of the chosen parameter values. Fig 5 shows the distribution, for different values of U, of the proportion of the simulations for which a given pattern was activated in the chain.
Panel A. Reliability of the chain of four patterns in the network of 61 neurons as a function of U. Panels B1-4. Activities of neurons coding for the patterns in the chain as a function of time for four representative values of U.
Panel A of Fig 5 shows that the activation of the full-length chain (all four successive patterns) is possible only for a limited range of values of U (from 0.0081 to 0.0093, see panel B3). Even within this range of values of U and with fixed parameter values, the chain does not fully develop in every simulation, due to the presence of noise. Further, for values of U below or above this range, the chain does not fully develop, for two different reasons. On the one hand, when U decreases, the length of the activated chain decreases because the network stays in the state corresponding to the first pattern (panel B1) or to the second pattern with slow transition time (panel B2). On the other hand, when U increases, the length of the activated chain decreases because the network leaves the state corresponding to the first pattern but does not activate the second pattern either, ending in a state where no neuron is activated (panel B4). Taken as a whole, these results show that the value of U determines the reliability of the chain in terms of the number of patterns activated, with the patterns occurring later in the chain less likely to be activated. Further, the value of U also determines the state of the network after the first pattern, ranging from this first pattern or the different following patterns (low values of U) to an absence of activity of the neurons (high values of U). For the optimal range of values of U, the reliability of the chain is maximal but not perfect, due to the presence of noise.
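The reliability measure described above (for each pattern, the proportion of simulations in which it was activated) can be sketched as follows. The detection rule, a component-wise tolerance `tol` around each pattern, and both function names are assumptions made for illustration; the trajectories are synthetic and restricted to the five coding neurons, they are not simulation output.

```python
def chain_length(trajectory, patterns, tol=0.1):
    """Number of chain patterns visited in the prescribed order.

    trajectory: sequence of state vectors x(t).
    patterns:   the ordered chain of target patterns.
    A pattern counts as visited when every component of x(t) lies
    within tol of it (an assumed detection rule).
    """
    reached = 0
    for x in trajectory:
        if reached == len(patterns):
            break
        target = patterns[reached]
        if all(abs(xi - pi) <= tol for xi, pi in zip(x, target)):
            reached += 1
    return reached


def reliability(runs, patterns, tol=0.1):
    """Proportion of runs in which each pattern of the chain was activated."""
    lengths = [chain_length(run, patterns, tol) for run in runs]
    return [
        sum(1 for n in lengths if n > k) / len(runs)
        for k in range(len(patterns))
    ]


chain = [(1, 1, 0, 0, 0), (0, 1, 1, 0, 0), (0, 0, 1, 1, 0), (0, 0, 0, 1, 1)]
# Synthetic run visiting the whole chain through the intermediate states.
full_run = [chain[0], (0, 1, 0, 0, 0), chain[1], (0, 0, 1, 0, 0),
            chain[2], (0, 0, 0, 1, 0), chain[3]]
# Synthetic run that stops after the second pattern and falls silent.
short_run = [chain[0], (0, 1, 0, 0, 0), chain[1], (0, 0, 0, 0, 0)]
proportions = reliability([full_run, short_run], chain)
print(proportions)
```

In this toy case the first two patterns are reached in both runs and the last two in only one, so later patterns have lower proportions, mirroring the qualitative trend reported for Fig 5.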
Slower synaptic depression leads to more reliable chains.
Here we show some simulations complementing our analysis of the effect of varying the speed of the evolution of the synaptic variables sj. Earlier we introduced new parameters ε and ρ = τrU and studied the effect of decreasing ε while keeping ρ fixed, which is equivalent to simultaneously increasing τr and decreasing U. We showed that this has the effect of decreasing the threshold of the noise amplitude needed for activation of a heteroclinic chain. This result is confirmed by simulations, as shown in Fig 6. In addition, our simulations show that the curve marking the upper limit of the noise window for which heteroclinic chains are activated also rises as ε decreases. Altogether, the window of noise for which the chains are activated is larger if the synaptic variable evolves more slowly. Our hypothesis, based on the simulations shown in Fig 6, is that time scale separation increases the reliability of the deterministic fast component of the dynamics, limiting the role of the stochastic component to the initiation of the transitions from one state to the next.
The top panel shows the simulations marked by dots in the (ε, σ) plane, colour coded according to the extent and reliability of the activation of the heteroclinic chain, as indicated by the colour bar on the right. Lower panels show sample time traces corresponding to a selection of the parameter points, as indicated.
A chain connecting six patterns with m = 2
In this section we present an example of a longer chain involving six neurons and five patterns. We do not construct the extension to a large network with a matrix derived from the learning rule (6). This would be possible using the approach outlined earlier and carried out in detail for the example of five neurons in S2 File. However, the matrix we consider has larger entries than the one of the preceding section (five neurons), which implies that a larger extended network would be needed.
We consider the connectivity matrix (26). This matrix satisfies the necessary conditions derived earlier, based on the learned patterns ξ1 = (1, 1, 0, 0, 0, 0), ξ2 = (0, 1, 1, 0, 0, 0), ξ3 = (0, 0, 1, 1, 0, 0), ξ4 = (0, 0, 0, 1, 1, 0) and ξ5 = (0, 0, 0, 0, 1, 1). For the choice of parameters I = 0.48, λ = 8, μ = 1.2, τr = 600 and U = 0.012, and for a range of noise amplitudes, this matrix gives the following heteroclinic chain/latching dynamics:
Starting with an initial condition close to ξ1, the dynamics successively visits ξ2, …, ξ5, the transition from ξi to ξi+1 passing through the intermediate (not learned) state with only one excited neuron at rank i + 1, see Fig 7. Observe that as long as a variable xj is “large” (close to 1) the corresponding synaptic variable sj decreases until xj comes close to 0. Then sj increases, according to the time evolution driven by Eq (4).
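The structure of this chain can be made explicit with a short sketch that lists the learned patterns and the intermediate states for a chain with two active neurons per pattern. The helper names are hypothetical; the code only enumerates the states and says nothing about whether a given connectivity matrix actually supports the transitions.

```python
def chain_patterns(n_neurons):
    """Patterns with neurons i and i+1 active, for i = 1, ..., n_neurons - 1."""
    return [
        tuple(1 if j in (i, i + 1) else 0 for j in range(n_neurons))
        for i in range(n_neurons - 1)
    ]


def intermediate_state(xi_a, xi_b):
    """Neurons active in both patterns: the state left once the neuron
    active only in xi_a has switched off."""
    return tuple(a & b for a, b in zip(xi_a, xi_b))


def chain_with_intermediates(patterns):
    """Full sequence of states visited: pattern, intermediate, pattern, ..."""
    sequence = [patterns[0]]
    for a, b in zip(patterns, patterns[1:]):
        sequence.append(intermediate_state(a, b))
        sequence.append(b)
    return sequence


patterns = chain_patterns(6)
for state in chain_with_intermediates(patterns):
    print(state)
```

For six neurons this produces the five patterns ξ1, …, ξ5 above, and each intermediate state has a single active neuron, at rank i + 1 for the transition from ξi to ξi+1.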
A case of a shared neuron with m = 3
This example gives a different option for the neural coding of items. Thus far the principle of our model has been that each item (e.g. the prime or target) is coded by a pattern of activity of all neurons in the network. As a consequence only one item can be ‘activated’ at a given time in a heteroclinic chain. The simplest case is when two patterns are activated in succession: the pattern coding for the prime followed by a pattern coding for the target [70, 72–74]. In this case either the prime or the target is ‘activated’ at a given time. Such a priming mechanism can account for neuronal activities recorded in nonhuman primates in priming protocols where a prime is related to a single target. In that case priming relies on the successive activation of the presented prime and of the predicted target [15, 69]. However, in human studies priming is reported not only for targets directly related to the prime (Step 1 targets), but also for targets indirectly related to the prime through a sequence of one (Step 2 targets) or two (Step 3 targets) intermediate associates of the prime that are activated after the prime and before the target (e.g.  for a review). Such indirect priming has been accounted for by network models in which Step 1, Step 2 and Step 3 associates to the primes were coded by neural populations that can be activated simultaneously. The present model of heteroclinic chains is of particular interest to account for the sequential activation of items involved in step priming. However, in priming studies the prime can still be reported by participants after processing of the target. This suggests that the prime must be available in working memory at the end of the activation of the sequence of Step associates, that is, neurons coding for the prime must be active at the end of the heteroclinic chain.
The possibility of activating the prime after several associates have been activated in a chain is not reproduced by models of priming based on latching dynamics [70, 71]. In the present model, a way for neurons coding for the prime to be activated at the end of the heteroclinic chain is simply to consider that a pattern (attractor state) does not correspond to a single item, but rather corresponds to several items, each corresponding to the activity of a subgroup of neurons. In that case the activity of a neuron would correspond to the average activity of a population of neurons coding for an item. Such population coding is consistent with recent models of priming in the cerebral cortex [25, 40, 118]. Intuitively, the first pattern in the chain codes for the prime only, while pattern 2 codes for the combination of the prime and the Step 1 target, pattern 3 codes for the prime, the Step 1 target and the Step 2 target, and so on. This way the population coding for the prime would be active throughout the entire computation.
In this section we present an example of a system of five neurons with three active neurons in each pattern and one neuron present in every pattern. For this we use the connectivity matrix (27) with τr = 400, xmax = 1 and U = 0.012. For this system, with the parameters I = 0.5, λ = 2.8 and μ = 1, we find by simulation a chain joining ξ1 = (1, 1, 1, 0, 0), ξ2 = (1, 0, 1, 1, 0), ξ3 = (1, 0, 0, 1, 1), and . The time series of the solution is shown in Fig 8.
Panel (a) shows the firing rates xj and panel (b) shows the synaptic variables sj. Color code: blue = x1, red = x2, black = x3, green = x4, cyan = x5. The same code is used for sj.
The above analysis of the network behavior shows that heteroclinic chains can develop in the case where a neuron (i.e. a population coding for the prime) is active for all the successive patterns in the chain. In other words, the overlap between populations coding for different items is such that a subgroup of neurons coding for an item (e.g. the prime) can remain activated while another subgroup is deactivated as the chain progresses. The network is able to keep previous stimuli activated (e.g. a prime) while at the same time activating a sequence of items (i.e. associates to the prime) that can be predicted on the basis of the prime. The compatibility between changing patterns in the chain and stable activity of a neuron or population of neurons makes it possible to account for two fundamental properties exhibited by the brain within a unified model. Due to the structure of the overlap in the coding of the memory items, this example combines the population coding used classically in models of priming in the cerebral cortex, in which a given item is coded by a given population of neurons, and the distributed coding used in Hopfield-type models of priming, in which a given item is coded by the pattern of activity of all neurons in the network. In this way the present model aims at unifying our understanding of the coding of items in memory and of priming processes between these items.
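The overlap structure of this example can be checked on the three patterns listed explicitly above. The sketch below identifies the neuron shared by all of them and the overlap between consecutive patterns; the helper names are illustrative and neuron indices are printed starting from 1 to match the text.

```python
def shared_neurons(patterns):
    """Indices (1-based) of neurons active in every pattern."""
    n = len(patterns[0])
    return [j + 1 for j in range(n) if all(p[j] == 1 for p in patterns)]


def overlap(xi_a, xi_b):
    """Number of neurons active in both patterns."""
    return sum(a * b for a, b in zip(xi_a, xi_b))


patterns = [(1, 1, 1, 0, 0), (1, 0, 1, 1, 0), (1, 0, 0, 1, 1)]
active_counts = [sum(p) for p in patterns]
consecutive_overlaps = [overlap(a, b) for a, b in zip(patterns, patterns[1:])]
print("shared neurons:", shared_neurons(patterns))
print("active neurons per pattern:", active_counts)
print("overlap of consecutive patterns:", consecutive_overlaps)
```

Neuron 1 is the only neuron active in all three patterns, each pattern has three active neurons, and consecutive patterns share two neurons, so each transition replaces exactly one active neuron while the shared neuron stays on.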
The present study provides the first analysis of the sufficient conditions for heteroclinic chains of overlapping patterns in the case of a symmetric Hebbian learning rule. Heteroclinic chains closely approximate latching dynamics, hence they are good candidates to account for priming processes reported in human and nonhuman primates. Here priming-based prediction is seen as the activation, by a pattern presented to the network, of a pattern not (yet) presented. Within this framework, heteroclinic chains account for the activation or inhibition of neurons so that the network codes for the ‘target’ pattern before its actual presentation, under conditions of overlap with the pattern coding for the ‘prime’ pattern. Heteroclinic chains account for different dynamics of activity of neurons reported in nonhuman primates during the delay between the prime and target: some neurons active for the prime and for the target remain active (pair coding neurons [1, 2]), some neurons active for the prime but not for the target are deactivated, and some neurons not active for the prime but active for the target exhibit an increased activity during the delay, which corresponds to prospective activity [11–15]. The model of latching dynamics has been adapted to allow for the existence of heteroclinic chains, by replacing the equation for the membrane potential by the equation for the firing rate, with the nonlinearity replaced by its polynomial approximation (to arbitrary order), so that the dynamics is well defined even when the firing rate takes its minimal (0) or maximal (1) values. In the modified model we were able to identify some of the restrictions imposed on the network by the requirement of the existence of heteroclinic chains. From a modelling perspective, this is a step in bridging the gap between Hopfield-type models of priming and cortical network models of priming.
The present model provides a mathematically tractable description of the reliability of sequences of patterns used to model priming in nonhuman primates and in humans. It exhibits latching dynamics reported to account for priming processes and their perturbations, and it calculates spike rates of neurons coding for items in terms of overlap between related populations. It could serve future applications and help to better understand perturbations of priming processes reported in pathologies such as Alzheimer's disease or schizophrenia, by analyzing the reliability of sequences as a function of network parameters usually considered to underlie perturbations of priming (noise, dopaminergic activity, synaptic connectivity) [25, 119–121].
It should be noted that recent work has shown that short-term memory can be induced as persistent activity in clustered networks without synaptic learning. This effect, called cluster reverberation, could be the main mechanism by which short-term memory (sensory or working memory) works in the brain. In this model learned patterns are by nature metastable and latching dynamics can arise as activation of sequences of patterns. We expect that our approach could also be applied to analyse the dynamics of clusters of neurons.
Predictions on neural activities in priming
The present model makes predictions regarding the possibility for the prime to remain activated (remembered) or not (forgotten) as the chain progresses. In the network, activating patterns coding for several Step 1, Step 2, etc. targets would make the simultaneous persistent activity of the prime difficult, due to retroactive interference based on inhibition generated by the ‘step’ targets. The corresponding experimental prediction that could be tested in priming experiments is that the activity of neurons coding for the prime would decrease when successive targets are predicted in memory even though they are not actually presented. This could be visible in nonhuman primates as a decrease of the retrospective activity of neurons coding for the prime when a series of ‘Step’ targets is predicted. The behavioral counterpart in humans would be a decrease in the reportability of the prime when the length of the sequence of targets to predict increases.
Asymmetry of priming and of synaptic efficacies
Brunel recently pointed out the possibility that the optimal synaptic matrix depends on the constraint imposed on the network, either storing patterns as stable states or storing patterns to be activated in sequences. The present results show that heteroclinic chains are possible with symmetric matrices built through Hebbian learning and specify the necessary conditions for sequences to arise in the network. Although asymmetric connections can improve the ability of the network to activate sequences of patterns, they are not a necessary condition. However, the present results also show that the conditions for heteroclinic chains impose strong constraints on the structure of the synaptic matrix, suggesting that although symmetric weights can be optimal for storage capacity, they are not an optimal solution for the activation of sequences. If the network codes for a given pattern 1 at a given time, the activation of the next pattern 2 in a chain requires that two given neurons i and j activate each other or not depending on their state within each pattern. For example, i but not j can be activated in 1, and the opposite in 2. Hence for an optimal sequence 1 → 2, i should activate j but j should not activate i. The present results show that the symmetry of the weights can be compensated by the sparsity of the network, at the expense of an increasing number of neurons necessary to code for the patterns. Even though asymmetric heteroclinic chains are possible with symmetric synaptic efficacies, further analysis of heteroclinic chains with asymmetric learning rules would bring new evidence on the specific role of asymmetric weights in the required level of sparsity and in the reliability of latching dynamics.
S1 File. Constraints on the connectivity matrix.
The authors would like to thank the anonymous referee for pointing out a large body of work they were not aware of, with a complementary view on the subject.
- 1. Miyashita Y. (1988) Neuronal correlate of visual associative long-term memory in the primate temporal cortex. Nature, 335: 817–820.
- 2. Miyashita Y., and Chang H. S. (1988) Neuronal correlate of pictorial short-term memory in the primate temporal cortex. Nature. 331: 68–70.
- 3. Miller E. K. (1999) The prefrontal cortex: complex neural properties for complex behavior. Neuron, 22: 15–17.
- 4. Bunge S. A, Kahn I., Wallis J. D, Miller E. K, Wagner A. D. (2003) Neural circuits subserving the retrieval and maintenance of abstract rules. J Neurophysiol. 90(5): 3419–28.
- 5. Muhammad R., Wallis J. D, Miller E. K. (2006) A comparison of abstract rules in the prefrontal cortex, premotor cortex, inferior temporal cortex, and striatum. J Cogn Neurosci., 18(6): 974–89.
- 6. Fuster J. M. and Alexander G. E. (1971). Neuron activity related to short-term memory. Science, 13;173(3997):652–4.
- 7. Amit D. J., Brunel N., and Tsodyks M. V. (1994) Correlations of cortical Hebbian reverberations: Theory versus experiment. J. Neurosci., 14: 6435–6445.
- 8. Goldman-Rakic P. S. (1995) Cellular basis of working memory. Neuron, 14(3): 477–85.
- 9. Wang X. J. (2002) Probabilistic decision making by slow reverberation in cortical circuits. Neuron, 36: 955–968.
- 10. Ranganath C. and D’Esposito M (2005). Directing the mind’s eye: prefrontal, inferior and medial temporal mechanisms for visual working memory. Curr. Opin. Neurobiol., 15: 175–182.
- 11. Naya Y., Yoshida M., and Miyashita Y (2001). Backward spreading of memory-retrieval signal in the primate temporal cortex. Science, 291: 661–664.
- 12. Naya Y., Yoshida M., and Miyashita Y (2003). Forward processing of long-term associative memory in monkey inferotemporal cortex. J. Neurosci., 23: 2861–2871.
- 13. Naya Y., Yoshida M., Takeda M., Fujimichi R., and Miyashita Y. (2003) Delay-period activities in two subdivisions of monkey inferotemporal cortex during pair association memory task. European J. Neurosci., 18: 2915–2918.
- 14. Yoshida M., Naya Y., and Miyashita Y. (2003) Anatomical organization of forward fiber projections from area TE to perirhinal neurons representing visual long-term memory in monkeys. Proc. Natl Acad Sci., 100: 4257–4262.
- 15. Erickson C. A., and Desimone R. (1999) Responses of macaque perirhinal neurons during and after visual stimulus association learning. J. Neurosci., 19: 10404–10416.
- 16. Rainer G., Rao S. C., and Miller E. K. (1999) Prospective coding for objects in primate prefrontal cortex. J. Neurosci., 19: 5493–5505.
- 17. Tomita H., Ohbayashi M., Nakahara K., Hasegawa I., and Miyashita Y. (1999) Top-down signal from prefrontal cortex in executive control of memory retrieval. Nature, 401: 699–703.
- 18. Sakai K., and Miyashita Y. (1991) Neural organization for the long-term memory of paired associates. Nature, 354: 152–155.
- 19. Gochin P. M., Colombo M., Dorfman G. A., Gerstein G. L., and Gross C. G. (1994). Neural ensemble coding in inferior temporal cortex. J. Neurophysiol., 71: 2325–2337.
- 20. Messinger A., Squire L. R., Zola S. M., and Albright T. D. (2001). Neuronal representations of stimulus associations develop in the temporal lobe during learning. Proc. Natl. Acad. Sci. USA, 98: 12239–12244.
- 21. Wirth S., Yanike M., Frank L. M., Smith A. C., Brown E. N., and Suzuki W. A. (2003). Single neurons in the monkey hippocampus and learning of new associations. Science, 300: 1578–1581.
- 22. Ison M. J., Quian Quiroga R, Fried I. (2015). Rapid Encoding of New Memories by Individual Neurons in the Human Brain. Neuron, 87(1):220–30.
- 23. Wallis J. D., Anderson K. C., Miller E. K. (2001) Single neurons in prefrontal cortex encode abstract rules. Nature, 411(6840), 953–6.
- 24. Wallis J. D., Miller E. K. (2003) From rule to response: neuronal processes in the premotor and prefrontal cortex. J Neurophysiol., 90(3): 1790–806.
- 25. Brunel N., and Lavigne F. (2009) Semantic priming in a cortical network model. J. Cog. Neurosci., 21: 2300–2319.
- 26. Roitman J. D. & Shadlen M. N. (2002) Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. Journal of Neuroscience, 22, 9475–9489.
- 27. DeLong K. A., Urbach T. P. & Kutas M. (2005) Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8(8): 1117–1121.
- 28. Kutas M., DeLong K. A. & Smith N. J. (2011) A Look around at What Lies Ahead: Prediction and Predictability in Language Processing. In Bar M. (Ed.) Predictions in the Brain: Using Our Past to Generate a Future: 190–207. Oxford Scholarship Online.
- 29. Brothers T, Swaab T. Y., Traxler M. J. (2015). Effects of prediction and contextual support on lexical processing: prediction takes precedence. Cognition. 136:135–49.
- 30. DeLong K. A., Troyer M. & Kutas M. (2014) Pre-Processing in Sentence Comprehension: Sensitivity to Likely Upcoming Meaning and Structure. Language and Linguistics Compass, 8(12): 631–645.
- 31. DeLong K. A, Quante L. & Kutas M. (2014) Predictability, plausibility, and two late ERP positivities during written sentence comprehension. Neuropsychologia, 61: 150–162.
- 32. Willems R. M., Frank S. L., Nijhof A. D., Hagoort P. & van den Bosch A. (2015) Prediction During Natural Language Comprehension. Cereb Cortex, bhv075 1–11.
- 33. Ding N., Melloni L., Zhang H, Tian X. & Poeppel D. (2015) Cortical tracking of hierarchical linguistic structures in connected speech. Nat Neurosci., 19(1): 158–64.
- 34. Lavigne F., Vitu F., & d’Ydewalle G. (2000) The influence of semantic context on initial eye landing sites in words. Acta Psychologica, 104(2): 191–214.
- 35. McDonald S. A., Shillcock R. C. (2003) Eye movements reveal the on-line computation of lexical probabilities during reading. Psychol Sci., 14(6): 648–52.
- 36. Hutchison KA, Heap SJ, Neely JH, Thomas MA. (2014) Attentional control and asymmetric associative priming. J Exp Psychol Learn Mem Cogn., 40(3): 844–56. Epub 2014 Feb 17.
- 37. Meyer D. E., & Schvaneveldt R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90: 227–234.
- 38. Neely J. H. (1991) Semantic priming effects in visual word recognition: A selective review of current findings and theories. In Besner D. & Humphreys G. W. (Eds.), Basic processes in reading: Visual word recognition. (pp. 264–336): Lawrence Erlbaum Associates, Inc.
- 39. Hutchison K. A. (2003) Is semantic priming due to association strength or feature overlap? A microanalytic review. Psychonomic Bulletin & Review, 10: 785–813.
- 40. Lavigne F., Dumercy L. & Darmon N. (2011) Determinants of Multiple Semantic Priming: A Meta-Analysis and Spike Frequency Adaptive Model of a Cortical Network. The Journal of Cognitive Neuroscience, 23(6): 1447–1474.
- 41. Meyer D. E. (2014) Semantic priming well established. Science, 1;345(6196): 523.
- 42. Van Petten C. (2014). Examining the N400 semantic context effect item-by-item: Relationship to corpus-based measures of word co-occurrence, International Journal of Psychophysiology, 94: 407–419.
- 43. Luka B. J., and Van Petten C. (2014). Prospective and retrospective semantic processing: Prediction, time, and relationship strength in event-related potentials. Brain and Language, 135: 115–129.
- 44. Lavigne F., Dumercy L., Chanquoy L., Mercier B. and Vitu-Thibault F. (2012). Dynamics of the Semantic Priming Shift: Behavioral Experiments and Cortical Network Model. Cog. Neurodynamics. 6(6): 467–483.
- 45. Lavigne F., Chanquoy L., Dumercy L. and Vitu F. (2013). Early Dynamics of the Semantic Priming Shift. Advances in Cog. Psychology. 9(1), 1–14.
- 46. Cree G. S., & McRae K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132: 163–201.
- 47. McRae K., Cree G. S., Seidenberg M. S., & McNorgan C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37: 547–559.
- 48. Spence D. P. & Owens K. C. (1990). Lexical co-occurrence and association strength, Journal of Psycholinguistic Research, 19: 317–330.
- 49. Landauer T. K., Foltz P. W. & Laham D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25: 259–284.
- 50. Hebb D. O. (1949) The Organization of Behavior: A Neuropsychological Theory. New York, NY: Wiley & Sons.
- 51. Bliss T.V. and Lomo T. (1973) Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. J. Physiol. 232: 331–356.
- 52. Bliss T. V. and Collingridge G. L. (1993). A synaptic model of memory: long-term potentiation in the hippocampus. Nature, 361: 31–39.
- 53. Kirkwood A. and Bear M. F. (1994). Homosynaptic long-term depression in the visual cortex. J. Neurosci., 14: 3404–3412.
- 54. Yakovlev V, Fusi S, Berman E, Zohary E. (1998). Inter-trial neuronal activity in inferior temporal cortex: a putative vehicle to generate long-term visual associations. Nat Neurosci. 1(4):310–7.
- 55. Weinberger N. M. (1998). Physiological memory in primary auditory cortex: characteristics and mechanisms. Neurobiol Learn Mem, 70(1-2):226–51.
- 56. Rolls E. T., Tovee M. J. (1995). Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J Neurophysiol. 73(2):713–26.
- 57. Tamura H., Tanaka K. (2001). Visual response properties of cells in the ventral and dorsal parts of the macaque inferotemporal cortex. Cereb. Cortex. 11: 384–399.
- 58. Tsao D. Y., Freiwald W. A., Tootell R. B., Livingstone M. (2006). A cortical region consisting entirely of face-selective cells. Science, 311: 670–674.
- 59. Hung C., Kreiman G., Poggio T., DiCarlo J. (2005). Fast read-out of object information in inferior temporal cortex. Science, 310: 863–866.
- 60. Young M., Yamane S. (1992). Sparse population coding of faces in the inferotemporal cortex. Science, 256: 1327–1331.
- 61. Kreiman G., Hung C. P., Kraskov A., Quian Quiroga R., Poggio T. and DiCarlo J. J. (2006). Object selectivity of local field potentials and spikes in the macaque inferior temporal cortex. Neuron, 49: 433–445.
- 62. Quian Quiroga R. & Kreiman G. (2010). Measuring sparseness in the brain: comment on Bowers (2009). Psychol. Rev., 117: 291–299.
- 63. Quian Quiroga R. (2016). Neuronal codes for visual perception and memory. Neuropsychologia, 83: 227–41.
- 64. Fujimichi R., Naya Y., Koyano K. W., Takeda M., Takeuchi D. and Miyashita Y. (2010). Unitized representation of paired objects in area 35 of the macaque perirhinal cortex. Eur J Neurosci., 32(4): 659–67.
- 65. Quian Quiroga R. (2012). Concept cells: the building blocks of declarative memory functions. Nat. Rev. Neurosci., 13: 587–597.
- 66. Amit D. J., & Brunel N. (1997). Model of global spontaneous activity and local structured activity during delay periods in the cerebral cortex. Cereb Cortex, 7(3): 237–252.
- 67. Amit D. J., Bernacchia A., & Yakovlev V. (2003). Multiple-object working memory–A model for behavioral performance. Cerebral Cortex, 13(5): 435–443.
- 68. Brunel N. (1996). Hebbian learning of context in recurrent neural networks. Neural Comput., 15(8):1677–710.
- 69. Mongillo G., Amit D. J., & Brunel N. (2003). Retrospective and prospective persistent activity induced by Hebbian learning in a recurrent cortical network. Eur J Neurosci, 18(7): 2011–2024.
- 70. Lerner I., Bentin S. and Shriki O. (2012a). Spreading activation in an attractor network with latching dynamics: automatic semantic priming revisited. Cogn. Sci., 36, 1339–1382.
- 71. Lerner I. and Shriki O. (2014). Internally and externally driven network transitions as a basis for automatic and strategic processes in semantic priming: theory and experimental validation. Front. Psychol. 5:314.
- 72. Masson M. E. J., Besner D., & Humphreys G. W. (1991). A distributed memory model of context effects in word identification. Hillsdale, NJ: Erlbaum.
- 73. Masson M. E. J. (1995) A distributed memory model of semantic priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 3–23.
- 74. Plaut, D. C. (1995) Semantic and associative priming in a distributed attractor network. In J. F. Lehman & J. D. Moore (Eds.), Proceedings of the 17th Annual Conference of the Cognitive Science Society (pp. 37-42). Hillsdale, NJ: Erlbaum.
- 75. Plaut D. C., & Booth J. R. (2000). Individual and developmental differences in semantic priming: Empirical and computational support for a single-mechanism account of lexical processing. Psychological Review, 107, 786–823.
- 76. Moss H. E., Hare M. L., Day P. & Tyler L. K. (1994) A distributed memory model of the associative boost in semantic priming. Connection Science, 6, 413–427.
- 77. Treves A. (2005). Frontal latching networks: a possible neural basis for infinite recursion. Cognitive Neuropsych. 22(3-4): 276–291.
- 78. Kawamoto A. H. and Anderson J. A. (1985). A neural network model of multistable perception. Acta Psychologica 59:35–65.
- 79. Horn D. and Usher M. (1989). Neural networks with dynamical thresholds. Physical Review A. 40:1036–1040.
- 80. Herrmann M., Ruppin E. and Usher M. (1993). A neural model of the dynamic activation of memory. Biological Cybernetics. 68: 455–463.
- 81. Kropff E., Treves A. (2007). The complexity of latching transitions in large scale cortical networks. Natural Computing. 6:169–185.
- 82. Russo E., Namboodiri V. M., Treves A. and Kropff E. (2008) Free association transitions in models of cortical latching dynamics. New Journal of Physics 10 (1), p.015008.
- 83. Softky W. R. & Koch C. (1993). The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J. Neurosci. 13, 334–350.
- 84. Shadlen M.N. & Newsome W.T. (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J. Neurosci. 18, 3870–3896.
- 85. Rolls, E. T. and Deco, G. (2010). The noisy brain: stochastic dynamics as a principle of brain function, OUP.
- 86. Miller P. & Wang X.-J. (2006). Stability of discrete memory states to stochastic fluctuations in neuronal systems. Chaos 16, 026109.
- 87. Fiete, I., Schwab, D.J. & Tran, N.M. (2014) A binary Hopfield network with 1/log(n) information rate and applications to grid cell decoding. Preprint at http://arxiv.org/abs/1407.6029.
- 88. Chaudhuri R. and Fiete I. (2016). Computational principles of memory. Nat Neurosci. 19(3):394–403.
- 89. Tsodyks M. V., Markram H. (1997) The neural code between neocortical pyramidal neurons depends on neurotransmitter release probability. Proceedings of the National Academy of Science, USA. 94:719–723.
- 90. Huber D.E., O’Reilly R. C. (2003) Persistence and accommodation in short-term priming and other perceptual paradigms: Temporal segregation through synaptic depression. Cognitive Science 27:403–430.
- 91. Chossat P., and Krupa M. (2016). Heteroclinic cycles in Hopfield networks. J. of Nonlinear Science 28 (2), 471–491.
- 92. Cortes J., Torres J. J., Marro J., Garrido P., and Kappen H. (2006). Effects of Fast Presynaptic Noise in Attractor Neural Networks. Neural Computation, 18, 614–633.
- 93. Marro J., Torres J. J., and Cortes J. M. (2007). Chaotic Hopping between Attractors in Neural Networks. Neural Networks, 20 (2), 230–235.
- 94. Torres J. J., Cortes J. M., Marro J., and Kappen H. (2008). Instabilities in Attractor Networks with Fast Synaptic Fluctuations and Partial Updating of the Neurons Activity, Neural Networks 21, 1272–1277.
- 95. Marro J., Torres J. J., and Cortes J. M. (2008). Complex behavior in a network with time-dependent connections and silent nodes, J. Stat. Mech. P02017.
- 96. Torres J. J., Cortes J. M., Marro J., and Kappen H. (2007) Competition between Synaptic Depression and Facilitation in Attractor Neural Networks, Neural Computation, 19(10), 2739–2755.
- 97. Hofbauer J. and Sigmund K. (1998). Evolutionary Games and Population Dynamics, Cambridge University Press.
- 98. Krupa M. (1997). Robust heteroclinic cycles. J. of Nonl. Sci. 7, 129–176.
- 99. Rabinovich M. I., Afraimovich V.S., Bick C. and Varona P. (2012) Information flow dynamics in the brain. Physics of Life Reviews 9 (1): 51–73.
- 100. Bick C. and Rabinovich M. (2009). Dynamical origin of the effective storage capacity in the brain’s working memory. Phys Rev Lett. 103(21): 218101.
- 101. Pantic L., Torres J. J., Kappen H. J., Gielen S. C. A. M. (2002). Associative Memory with Dynamic Synapses, Neural Computation 14, 2903–2923.
- 102. Tsodyks M., Pawelzik K. & Markram H. (1998). Neural networks with dynamic synapses. Neural Computation, 10, 821–835.
- 103. Buhmann J., Divko R., and Schulten K. (1989). Associative memory with high information content. Phys. Rev. A 39 (5), 2689–2692.
- 104. Tsodyks M. V. (1990). Hierarchical associative memory in neural networks with low activity level. Modern Physics Letters B, 4, 259–265.
- 105. Holmgren C., Harkany T., Svennenfors B. & Zilberter Y. (2003). Pyramidal cell communication within local networks in layer 2/3 of rat neocortex. J. Physiol. (Lond.) 551, 139–153.
- 106. Brunel N. (2016). Is cortical connectivity optimized for storing information? Nat Neurosci., 19(5): 749–55.
- 107. Clopath C., Nadal J. P. & Brunel N. (2012) Storage of correlated patterns in standard and bistable Purkinje cell models. PLoS Comput. Biol. 8, e1002448.
- 108. Brunel N., Hakim V., Isope P., Nadal J. P. & Barbour B. (2004) Optimal information storage and the distribution of synaptic weights: perceptron versus Purkinje cell. Neuron 43, 745–757.
- 109. Chapeton J., Fares T., LaSota D. & Stepanyants A. (2012) Efficient associative memory storage in cortical circuits of inhibitory and excitatory neurons. Proc. Natl. Acad. Sci. USA 109, E3614–E3622.
- 110. Clopath C. & Brunel N. (2013) Optimal properties of analog perceptrons with excitatory weights. PLoS Comput. Biol. 9, e1002919.
- 111. Sik A., Penttonen M., Ylinen A., & Buzsáki G. (1995). Hippocampal CA1 interneurons: an in vivo intracellular labeling study. The Journal of neuroscience, 15(10), 6651–6665.
- 112. Mason A., Nicoll A. & Stratford K. (1991) Synaptic transmission between individual pyramidal neurons of the rat visual cortex in vitro. J. Neurosci. 11, 72–84.
- 113. Markram H., Lübke J., Frotscher M., Roth A. & Sakmann B. (1997) Physiology and anatomy of synaptic connections between thick tufted pyramidal neurones in the developing rat neocortex. J. Physiol. (Lond.) 500, 409–440.
- 114. Sjöström P. J., Turrigiano G. G. & Nelson S. B. (2001) Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. Neuron 32, 1149–1164.
- 115. Thomson A. M. & Lamy C. (2007) Functional maps of neocortical local circuitry. Front. Neurosci. 1, 19–42.
- 116. Lefort S., Tomm C., Floyd Sarria J. C. & Petersen C. C. (2009) The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex. Neuron 61, 301–316.
- 117. Dark V. (1988). Semantic priming, prime reportability, and retroactive priming are interdependent. Memory & Cognition 16: 299.
- 118. Lavigne F., Avnaïm F. and Dumercy L. (2014) Inter-synaptic learning of combination rules in a cortical network model. Front. Psychol. 5, p. 842.
- 119. Lavigne F. & Darmon N. (2008). Dopaminergic Neuromodulation of Semantic Priming in a Cortical Network Model. Neuropsychologia 46, 3074–3087.
- 120. Lerner I., Bentin S., and Shriki O. (2012b). Excessive attractor instability accounts for semantic priming in schizophrenia. PLoS ONE 7: e40663.
- 121. Rolls E. T., Loh M., Deco G. and Winterer G. (2008). Computational models of schizophrenia and dopamine modulation in the prefrontal cortex. Nature Reviews Neuroscience, 9, 696.
- 122. Johnson S., Marro J., and Torres J. J. (2013). Robust short-term memory without synaptic learning, PLoS ONE 8(1): e50276.