Synaptic Scaling Enables Dynamically Distinct Short- and Long-Term Memory Formation

Memory storage in the brain relies on mechanisms acting on time scales from minutes, for long-term synaptic potentiation, to days, for memory consolidation. During such processes, neural circuits distinguish synapses relevant for long-term storage, which are consolidated, from synapses of short-term storage, which fade. How time-scale integration and synaptic differentiation are simultaneously achieved remains unclear. Here we show that synaptic scaling – a slow process usually associated with the maintenance of activity homeostasis – combined with synaptic plasticity may simultaneously achieve both, thereby providing a natural separation of short- from long-term storage. The interaction between plasticity and scaling also provides an explanation for an established paradox where memory consolidation critically depends on the exact order of learning and recall. These results indicate that scaling may be fundamental for stabilizing memories, providing a dynamic link between early and late memory formation processes.


Weight dynamics
First, we take Equation 3 and expand it by the sum over all neurons and the excitatory connectivity on both sides to obtain the differential equation for the mean field. The fact that inhibition reaches further than excitation allows separating the network into two (or more) independent ones (memory and control; see main text). Within such a (sub)network the activities of the neurons can be assumed to be equal to the average activity; thus, $F_i = \bar F$. Similarly, we assume that all weights within a (sub)network do not differ much from the average value.
Thus, $\overline{(w^+)^2} \approx (\bar w^+)^2$ and the differential equation turns into the mean-field equation A1. The fixed point ($\dot{\bar w}^+ = 0$) of this mean-field differential equation yields the weight-nullcline (Equation A2).

Activity dynamics
As above, we assume that the activities within a (sub)network (memory/control) are equal to the average activity. Thus, the differential equation turns into Equation A3, where we use $\bar N^+_\Psi$ and $\bar N^-_\Psi$ as average connectivity values, because in our experiments only parts of the complete network are activated (for example, one or two sub-populations of nine neurons). As a consequence, the effective connectivity is not homogeneous and neurons at the borders of the active populations have fewer active neurons to connect to. Thus, the connection numbers $N^+_\Psi$ and $N^-_\Psi$ are smaller for border- as compared to core-neurons. Therefore, we consider the average number of connections in our calculations, as indicated by the bars ($\bar N^+_\Psi$ and $\bar N^-_\Psi$). Setting Equation A3 equal to zero yields the dependency between weight and activity given in Equation A4.
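The mean-field reasoning above can be sketched numerically. This is a minimal illustration, not the authors' code: it assumes the plasticity-plus-scaling form $\dot{\bar w}^+ = \mu \bar F^2 + \gamma (F_T - \bar F)(\bar w^+)^2$ (correlation term plus quadratic scaling, matching the terms quoted later in this supplement), and the parameter values are invented.

```python
import numpy as np

# Illustrative values (assumptions, not the paper's fitted parameters)
MU = 0.01      # time scale of the Hebbian (correlation) term
GAMMA = 0.001  # time scale of synaptic scaling
F_T = 0.1      # homeostatic target activity

def mean_field_dwdt(w, F):
    """Mean-field weight dynamics: within a (sub)network all activities
    equal the average F, so the correlation term F_i * F_j becomes F**2."""
    return MU * F**2 + GAMMA * (F_T - F) * w**2

def weight_nullcline(F):
    """Weight with dw/dt = 0 for a given average activity F > F_T."""
    return np.sqrt(MU * F**2 / (GAMMA * (F - F_T)))
```

For $\bar F > F_T$ the scaling term is negative and balances the Hebbian growth exactly on the nullcline; e.g. `mean_field_dwdt(weight_nullcline(0.5), 0.5)` evaluates to zero up to rounding.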

Fixed Points and Bifurcation
The saddle-node bifurcation (Figure 3) for the synaptic weights given different input frequencies $F_I$ is obtained by calculating the intersection between the weight- and the activity-nullcline ($\bar w^+_{\dot w} = \bar w^+_{\dot F}$), which provides the fixed points of the system. For this, Equation A2 is transposed to $\bar F$. Additionally, $\bar u$ can be expressed as a function of $\bar F$. Both expressions have to be inserted into Equation A4 and transposed to a function $\bar w(F_I)$. No closed form exists for the resulting equation; thus, for Figure 3 A we solved it numerically.
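The numerical solution can be sketched as a root search. The rate function below is a toy stand-in (a sigmoid with invented gain, offset, and fan-in `N_E`), not Equations A2/A4 themselves; only the procedure — evaluating the activity-nullcline on the weight-nullcline and bracketing sign changes — mirrors the text.

```python
import numpy as np

# Toy constants (assumed for illustration; substitute the paper's Eqs. A2/A4
# and fitted parameters for quantitative results)
MU, GAMMA, F_T = 0.01, 0.001, 0.1   # plasticity/scaling rates, target rate
N_E, BETA, THETA = 4.0, 1.0, 10.0   # recurrent fan-in, sigmoid gain and offset

def sigmoid(u):
    """Stand-in rate function, saturating at 1."""
    return 1.0 / (1.0 + np.exp(-BETA * (u - THETA)))

def w_nullcline(F):
    """Weight-nullcline (dw/dt = 0): w = sqrt(mu F^2 / (gamma (F - F_T)))."""
    return np.sqrt(MU * F**2 / (GAMMA * (F - F_T)))

def residual(F, F_I):
    """Self-consistency of the activity-nullcline evaluated on the
    weight-nullcline; roots in F are fixed points of the full system."""
    return F - sigmoid(N_E * w_nullcline(F) * F + F_I)

def fixed_points(F_I, lo=0.105, hi=0.999, n=400):
    """Bracket sign changes on a grid, then refine each root by bisection."""
    Fs = np.linspace(lo, hi, n)
    roots = []
    for a, b in zip(Fs[:-1], Fs[1:]):
        if residual(a, F_I) * residual(b, F_I) < 0:
            x, y = a, b
            for _ in range(60):  # plain bisection keeps the sketch dependency-free
                m = 0.5 * (x + y)
                if residual(x, F_I) * residual(m, F_I) <= 0:
                    y = m
                else:
                    x = m
            roots.append(0.5 * (x + y))
    return roots
```

Scanning `fixed_points` over a range of `F_I` and recording the corresponding `w_nullcline` values traces out a bifurcation diagram in which branches appear and vanish where roots collide, i.e. the saddle-node structure of Figure 3 A.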

Bifurcation and consolidation under different conditions
In the following we will show that the bifurcation and consolidation phenomena (Figure 1 B,C) persist under different conditions, such as a different synaptic plasticity rule, a random topology, and parameter changes.

LTP and LTD
In the main text, we analysed the system for a synaptic plasticity rule consisting of a correlation term modelling long-term potentiation (LTP), thereby ignoring the mechanism of long-term depression (LTD). However, as expected from the analytical calculations, an additional LTD-term does not impair the learning and consolidation dynamics (compare Figure S1 A,B with main text Figure 1 B,C). Here, for the synaptic plasticity part, we used the BCM-rule [1], which is a mixture of LTP and LTD: $\dot w^{\text{syn.plast.}}_{ij} = \mu_{BCM}\, F_i F_j (F_i - \Theta)$. We set the parameter $\Theta$, differentiating LTP from LTD, to 10 Hz [2,3]. As the LTP-term in this rule is about $F_i$-times faster than in the main text, we adapted the time scale $\mu_{BCM}$ and the time scale ratio $\kappa_{BCM}$ by the maximum $F^{max}_i = \alpha$ ($\mu_{BCM} = \mu/\alpha$ and $\kappa_{BCM} = \kappa/\alpha$). All other parameters and inputs are chosen as for Figure 1 B,C.
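A sketch of this plasticity term, assuming the standard fixed-threshold BCM form with postsynaptic rate as the threshold variable (the sliding threshold of the full BCM theory is omitted, and $\mu$, $\alpha$ are placeholder values):

```python
MU = 0.01
ALPHA = 100.0         # assumed maximal rate F_max = alpha
MU_BCM = MU / ALPHA   # rescaled time scale, as described above
THETA = 10.0          # LTP/LTD threshold in Hz

def bcm_dwdt(f_pre, f_post):
    """BCM plasticity term: LTD for post-rates below THETA, LTP above."""
    return MU_BCM * f_pre * f_post * (f_post - THETA)
```

The sign flips at the threshold: a postsynaptic rate of 5 Hz depresses the synapse, while 20 Hz potentiates it.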

Random Topology
The analytical results suggest that only the average numbers of excitatory and inhibitory connections per unit influence learning and consolidation. Thus, the detailed topology does not influence the dynamics.
To show this, we used a circuit with randomly placed units on a finite 2-d plane with periodic boundary conditions. Each unit $i$ connects to unit $j$ with a probability that depends on the distance $d_{i,j}$. We used a narrower excitatory probability kernel than the inhibitory one. Figure S2 A shows, for instance, the connectivity of one unit (green) to other units (blue: inhibitory connections; red: excitatory connections). A random set of nine neighbouring neurons was externally stimulated (yellow). All parameters and inputs are the same as for Figure 1 B,C. The resulting dynamics of these random networks are qualitatively the same as for the grid network (compare Figure S2 with Figure 1 B,C).
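A sketch of such a random topology, using Gaussian probability kernels with an assumed narrower width for excitation; the network size and kernel widths are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 100, 1.0                       # toy network: N units on an L x L plane
SIGMA_EXC, SIGMA_INH = 0.05, 0.15     # excitation reaches less far (assumed widths)

pos = rng.random((N, 2)) * L          # random unit positions

def torus_distance(p, q):
    """Euclidean distance on a plane with periodic boundary conditions."""
    d = np.abs(p - q)
    d = np.minimum(d, L - d)          # wrap around each axis
    return np.hypot(d[0], d[1])

def connect(sigma):
    """Draw a connectivity matrix with Gaussian distance-dependent probability."""
    C = np.zeros((N, N), dtype=bool)
    for i in range(N):
        for j in range(N):
            if i != j:
                p = np.exp(-torus_distance(pos[i], pos[j])**2 / (2 * sigma**2))
                C[i, j] = rng.random() < p
    return C

C_exc, C_inh = connect(SIGMA_EXC), connect(SIGMA_INH)
```

With these widths each unit receives far more inhibitory than excitatory connections, reproducing the "inhibition reaches further than excitation" assumption used for the mean-field separation.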

Different Parameters
As shown in the main text (Figure 4), a change of parameters does not induce significant differences in the bifurcation diagram. In Figure S3 we show that this also holds true for the dynamics of the circuit. Here, for instance, we changed the desired target value $F_T$. However, the resulting dynamics are qualitatively the same (compare to Figure 1 B,C in the main text).

Passive Weight Decay
Without consolidation all weights decay (Figure 3 D). The curve of decay times can be calculated analytically from the mean-field equation of the weight dynamics (Eq. A1). During the decay phase the average activity $\bar F$ is low; thus, we can assume that the synaptic plasticity part equals zero, and the mean-field equation reduces to the scaling term alone. This differential equation can be solved by separation of variables with initial time point $t_0 = 0$ and weight $\bar w^+(t_0) = \bar w^+_0$, which yields the dynamics of the mean weight. As we want to assess the decay times of the weights, we transpose this solution to obtain the time $t$. We then insert as 'target' weight $\bar w^+(t)$ (which has to be reached after time $t$) the weight value of the controls. Simulations show (Figure 1) that the maximal control weight $w^+_{ctrl}$ is approximately 0.13, which we therefore use as the target here. This yields the time $T$ which weights need to decay from a given initial (learnt) weight $w^+_0$ to the control weight $w^+_{ctrl}$. As can be seen from the bifurcation diagram (Figure 3 A), each synaptic weight value in the STS-regime can be reached by learning. Thus, each initial weight $w^+_0$ has to be considered here and, therefore, the system has a broad distribution (from seconds to days) of decay times, or rather lifetimes, of memories (Figure 3 D). This distribution broadens to infinitely long lifetimes as soon as consolidation signals are given.
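The separation-of-variables calculation can be written out explicitly. This is a hedged reconstruction: it assumes the low-activity limit of Eq. A1 reduces to the quadratic relaxation $\gamma \bar w^2 \bar F$ mentioned in the consolidation-cycle section.

```latex
% Low-activity limit of Eq. A1 (plasticity term set to zero; assumed form):
\dot{\bar w}^+ \;=\; -\gamma\,\bar F\,(\bar w^+)^2 .
% Separation of variables with \bar w^+(0) = \bar w^+_0:
\int_{\bar w^+_0}^{\bar w^+(t)} \frac{d\bar w^+}{(\bar w^+)^2}
  \;=\; -\gamma\,\bar F \int_0^t dt'
\quad\Longrightarrow\quad
\bar w^+(t) \;=\; \frac{\bar w^+_0}{1 + \gamma\,\bar F\,\bar w^+_0\, t} .
% Transposing for t and inserting the control weight w^+_{ctrl} \approx 0.13:
T \;=\; \frac{1}{\gamma\,\bar F}\left(\frac{1}{w^+_{ctrl}} - \frac{1}{\bar w^+_0}\right).
```

As $\bar w^+_0 \to w^+_{ctrl}$, $T \to 0$, while large learnt weights approach the upper bound $1/(\gamma \bar F\, w^+_{ctrl})$; with slow scaling (small $\gamma$) this spans the broad lifetime range from seconds to days described above.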

Consolidation
Consolidation with different parameters
Figure S4 shows that different durations of consolidation stimuli have little effect on the resulting synaptic weight changes.

The impact of too late consolidation after learning and previous consolidation
In Figure 5 we show that a consolidation stimulus given too late induces a negative weight change. This is independent of whether the weights had previously been increased by learning or by a consolidation stimulus (Figure S5).

The consolidation cycle
Directly after local learning stimulation (green learning pulses in Figure 1 B,C), LTS- as well as STS-synapses begin to lose strength, because only weak background activation is present until a consolidation signal is delivered (yellow pulses in Figure 1 B,C). The corresponding activity-nullcline for low background activation is plotted in gray in Figure S6 (control). Both types of synapses (STS and LTS) drop from their initially obtained fixed points (Figure 3 B,C) to this nullcline (*-markers in Figure S6), and from there the weights relax back, approximately with $\gamma \bar w^2 \bar F$, towards the crossing with the blue weight-nullcline (fixed point in gray). If time passes without consolidation activation, all weights will indeed drop back to the (gray) fixed point near zero. Presentation of a strong enough consolidation stimulus shifts the activity-nullcline from the gray to the yellow one. Therefore, STS- as well as LTS-assemblies jump up to the yellow nullcline (the activity changes much faster than the weights) and follow it upwards with $\mu \bar F^2$ (towards the new [yellow] fixed point) as long as the stimulus lasts. Weight decay and recovery (after each consolidation stimulus) repeat cyclically, but one can see that STS-synapses lose more than they gain in every cycle ($\Delta w_{STS} < 0$), whereas LTS-synapses always recover to almost the same value ($\Delta w_{LTS} \approx 0$). This is due to the fact that the recovery depends quadratically on the activity (LTP-term) and LTS-assemblies are clearly more active than STS-assemblies (compare $\bar F_{LTS}$ to $\bar F_{STS}$).
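The cycle described above can be caricatured with the two rates quoted in the text — decay $\propto \gamma \bar w^2 \bar F$ and recovery $\propto \mu \bar F^2$ — using invented constants and simple Euler integration; only the qualitative asymmetry between a weakly active (STS) and a strongly active (LTS) assembly is meant to carry over.

```python
# Toy constants (assumptions for illustration, not fitted to the paper)
MU, GAMMA = 0.01, 0.02
F_BG = 0.1                       # weak background activity between stimuli
T_DECAY, T_STIM, DT = 50.0, 10.0, 0.01

def one_cycle(w, f_assembly):
    """One decay/recovery cycle: background decay, then a consolidation pulse."""
    for _ in range(int(T_DECAY / DT)):   # decay phase: ~ -gamma * F_bg * w^2
        w -= DT * GAMMA * F_BG * w**2
    for _ in range(int(T_STIM / DT)):    # recovery phase: ~ +mu * F^2,
        w += DT * MU * f_assembly**2     # quadratic in the assembly activity
    return w

w_sts = w_lts = 1.0
F_STS, F_LTS = 0.3, 1.0                  # LTS-assemblies are more active
for _ in range(5):
    w_sts = one_cycle(w_sts, F_STS)
    w_lts = one_cycle(w_lts, F_LTS)
```

After a few cycles the weakly active assembly has drifted downward (its per-cycle change is negative), while the strongly active one recovers almost fully, because the recovery term is quadratic in the activity.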

Memory destabilization
Destabilization by recall depends on the activity-dependence of synaptic plasticity and scaling
As mentioned in the main text, the destabilizing effect of a memory recall depends on the imbalance of neuronal activities. For instance, in Figure 2 one neuron in the striped area is externally stimulated and the other is not. This (together with the neighbour activation) induces two significantly different activities: $A$ for the stimulated neuron (yellow in Figure 2 C) and $a$ for the non-stimulated one (red; $A > a$). This imbalance results in a small synaptic plasticity term, which depends on the correlation (multiplication) of both activities (see below and Eq. 3). Furthermore, the synaptic scaling term for the weakly active neuron (incoming synapse) is small, too. However, as $A$ is very large, the synaptic scaling term of the strongly active neuron is significantly larger than the synaptic plasticity term.
Therefore, we can assume that the weight change is dominated by the negative drive of scaling, whereby the synapse shrinks. As this can happen at different positions in the memory-related cell assembly, the memory can be destabilized. Different input intensities yield different (lower) activations and, thus, change the rate of decay but not the effect itself.
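The sign argument can be made concrete with a one-synapse caricature (the rule form and numbers are assumptions; the point is only that the correlation term scales with the product $A \cdot a$ while the scaling term scales with $(F_T - A) w^2$):

```python
# Illustrative constants (assumptions)
MU, GAMMA, F_T = 0.01, 0.01, 0.1

def dwdt(f_pre, f_post, w):
    """Correlation-based plasticity plus activity-dependent scaling."""
    plasticity = MU * f_pre * f_post          # small if either rate is small
    scaling = GAMMA * (F_T - f_post) * w**2   # strongly negative for f_post >> F_T
    return plasticity + scaling

A, a = 1.0, 0.05   # stimulated vs. non-stimulated rate (A > a)
```

With imbalanced rates the scaling drive wins and the synapse shrinks (`dwdt(a, A, 1.0) < 0`), whereas a balanced, strongly active pair still grows (`dwdt(A, A, 1.0) > 0`); this is why full-assembly consolidation stabilizes while partial recall destabilizes.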

Detailed parameter analysis of memory destabilization by partial activation
The recall of a memory item changes the stability of the related cell assembly. Figure S7 provides a detailed parameter analysis to explain the unbalanced weight transitions from the LTS- to the STS-domain and vice versa shown in the main text (Figure 7). We analyze the impact of different recall parameters, such as stimulus duration and number of reactivated neurons, dependent on the initial weight strength of the cell assembly. Input-target synapses have first been stimulated to grow, reaching different initial values in the control-, STS-, and LTS-regimes. As the stimulated subset of neurons was chosen randomly, all experiments have been repeated ten times and the averaged results have been calculated. In Figure S7 A and C, 10% of the neurons have been stimulated. This mimics a recall as well as the activation of another, overlapping cell assembly.