
Conceived and designed the experiments: CC NB. Performed the experiments: CC NB. Analyzed the data: CC NB. Contributed reagents/materials/analysis tools: CC JPN NB. Wrote the paper: CC JPN NB.

The authors have declared that no competing interests exist.

The cerebellum has long been considered to undergo supervised learning, with climbing fibers acting as a ‘teaching’ or ‘error’ signal. Purkinje cells (PCs), the sole output of the cerebellar cortex, have been considered as analogs of perceptrons storing input/output associations. In support of this hypothesis, a recent study found that the distribution of synaptic weights of a perceptron at maximal capacity is in striking agreement with experimental data in adult rats. However, the calculation was performed using random uncorrelated inputs and outputs. This is a clearly unrealistic assumption, since sensory inputs and motor outputs carry a substantial degree of temporal correlations. In this paper, we consider a binary output neuron with a large number of inputs, which is required to store associations between temporally correlated sequences of binary inputs and outputs, modelled as Markov chains. Storage capacity is found to increase with both input and output correlations, and diverges in the limit where both go to unity. We also investigate the capacity of a bistable output unit, since PCs have been shown to be bistable under some experimental conditions. Bistability is shown to enhance storage capacity whenever the output correlation is stronger than the input correlation. The distribution of synaptic weights at maximal capacity is shown to be independent of correlations, and is also unaffected by the presence of bistability.

The cerebellum is one of the main brain structures involved in motor learning. Classical theories of cerebellar function assign a crucial role to Purkinje cells (PCs), which are assumed to operate as simple perceptrons. In these theories, PCs should learn to provide an appropriate motor output, given a particular input, encoded by the granule cell (GC) network. This learning is assumed to occur through modifications of

The cerebellum is heavily involved in learning tasks that require precise spatio-temporal sequences, such as grasping or precise eye movements. It has long been thought

A. Simplified sketch of the cerebellar cortex circuit. GC stands for Granule cell, PC for Purkinje cell, PF for Parallel fiber, CF for Climbing fiber. B. Perceptron model: the input layer is composed of GCs, the output unit is the PC. CF represents the error signal. C. Bistable output. If the previous output is 0, the input current needs to be larger than

On the theoretical side, a particularly well studied problem is that of learning random input-output associations by the perceptron. The maximal storage capacity (maximal number of random associations that can be learned per input synapse, in the large

The study of Brunel et al.

In this paper, we study the capacity and optimal connectivity in a perceptron network storing correlated input-output associations. More precisely, we study (i) a standard binary perceptron, whose task is to learn a sequence of associations with an arbitrary level of temporal correlations in the inputs and outputs; (ii) a bistable perceptron, again subjected to a correlated sequence of associations. We show that the capacity (maximal number of associations in a learnable sequence) is independent of the correlations in the output if the inputs are not correlated. If the inputs are temporally correlated, the capacity grows with output correlation. The capacity diverges in the limit when both correlations become close to unity. The weight distribution is shown to be independent of the degree of correlation, both in the input and output. It is also found that adding a bistability range increases capacity when the output correlation is larger than the input correlation. The optimal width of the bistability range increases with output correlation. Finally, we show that in order to reach maximal capacity, the error signal (CF) has to change the state of the output unit (PC) in addition to its synapses, consistent with experimental data

In this section, we investigate storage of associations between temporally correlated input and output sequences. The maximal capacity is defined as the maximal length of a sequence that can be learned per input synapse, or in other words the maximal number of associations composing the sequence. We study a simple Markov chain model for generating the sequences. The sequence to be learned is composed of
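The Markov-chain generation of correlated binary sequences described above can be sketched as follows. Since Equation 1 is not reproduced here, the transition probabilities below are an assumed parametrization chosen so that the chain has stationary coding level `f` (probability of a 1) and one-step correlation `c`; the function name and arguments are illustrative, not from the paper.

```python
import numpy as np

def markov_binary_sequence(length, f=0.5, c=0.0, rng=None):
    """Binary Markov chain with stationary coding level f and one-step
    correlation c (hypothetical parametrization of the paper's Eq. 1).
    With these transition probabilities the stationary mean is f and the
    lag-1 autocorrelation coefficient is c."""
    rng = np.random.default_rng(rng)
    p_from_1 = f + c * (1.0 - f)   # P(x_{t+1}=1 | x_t=1)
    p_from_0 = f * (1.0 - c)       # P(x_{t+1}=1 | x_t=0)
    x = np.empty(length, dtype=int)
    x[0] = rng.random() < f        # start from the stationary distribution
    for t in range(1, length):
        p = p_from_1 if x[t - 1] else p_from_0
        x[t] = rng.random() < p
    return x
```

Setting `c = 0` recovers independent patterns with coding level `f`; `c -> 1` makes the sequence increasingly persistent, the regime in which the text reports capacity growing.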

In the perceptron, the output is obtained through a comparison of a weighted sum of the inputs to a threshold,
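The threshold operation just described can be written in a few lines. This is a generic binary-perceptron readout, a minimal sketch with illustrative names (`w`, `x`, `theta`); the paper's exact threshold convention is not reproduced here.

```python
import numpy as np

def perceptron_output(w, x, theta):
    """Binary perceptron readout: output 1 iff the weighted input sum
    reaches the threshold theta (names are illustrative)."""
    return int(np.dot(w, x) >= theta)
```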

Correlations defined by Equation 1 make calculations using the replica method

For numerical simulations, we chose the variant of the perceptron algorithm used in Brunel et al.

This rule is guaranteed to converge to a solution, provided such a solution exists, and

A. Maximal capacity as a function of

Simulations (

We have so far focused on the case

A–B. Dependence on coding levels. A. Maximal capacity as a function of

Experimentally, the fraction of silent synapses was estimated to be about 80%

In

To investigate how the capacity depends on temporal correlations in the output, we consider sequences of patterns generated from a Markov chain as defined in the previous section, Equation 1.

The analytical calculation for correlated output and uncorrelated inputs (

A. Maximal capacity as a function of the bistability range

We then numerically confirm the theoretical results using a perceptron learning algorithm (
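The bistable output unit sketched in the figure (panel C) can be modelled with a hysteretic threshold. The parametrization below, with switching thresholds at `theta ± delta/2` around a central threshold and a bistable range of width `delta`, is an assumption for illustration; the paper's exact thresholds are not reproduced here.

```python
import numpy as np

def bistable_output(w, x, prev_out, theta, delta):
    """Bistable unit with hysteresis: switching 0 -> 1 requires the input
    current to exceed theta + delta/2, switching 1 -> 0 requires it to fall
    below theta - delta/2; inside the bistable range of width delta the
    previous output persists. (Thresholds theta +/- delta/2 are an assumed
    parametrization.)"""
    h = np.dot(w, x)
    if h >= theta + delta / 2.0:
        return 1
    if h < theta - delta / 2.0:
        return 0
    return prev_out
```

Setting `delta = 0` recovers the standard perceptron; increasing `delta` widens the range over which the unit's state depends on its history, which is the mechanism the text links to enhanced capacity when output correlations exceed input correlations.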

If the CF does not change the state of the PCs, the simulations do not reach maximal capacity (

In this section, we simulate numerically the bistable perceptron with correlated input and output (

A. Capacity as a function of

In this paper, we reconsidered the problem of learning random input-output associations in a perceptron with excitatory weights, considered as a model for cerebellar Purkinje cells. We computed the storage capacity, and distribution of synaptic weights, in two distinct models that are subjected to correlated input-output associations, described as Markov chains: a standard binary perceptron; and a bistable perceptron.

We find that the maximal capacity increases monotonically when both input and output correlations are increased. The capacity diverges in the limit when both go to unity. This divergence of the capacity is reminiscent of the divergence of the capacity of perceptrons storing uncorrelated input-output associations in the limit when the output coding level

Interestingly, Purkinje cells are known to exhibit bistability in vitro

We also found that the distribution of synaptic weights at maximal capacity is independent of the degree of correlations in the input and output, for both standard and bistable perceptrons. It is also independent of the input and output coding levels. This distribution is composed of a finite fraction of zero-weight (silent) synapses, and a truncated Gaussian distribution for positive weights. As shown in
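The weight distribution described above can be sampled directly: a point mass at zero for the silent synapses plus a Gaussian truncated to positive weights. For simplicity the sketch takes a zero-mean Gaussian (so the positive part is a half-Gaussian); the values of `silent_frac` and `sigma` are illustrative, with `silent_frac = 0.8` chosen to match the experimental estimate of silent synapses quoted in the text.

```python
import numpy as np

def sample_weight_distribution(n, silent_frac=0.8, sigma=1.0, rng=None):
    """Sample n synaptic weights from the capacity-limit form described in
    the text: a delta function at zero (silent synapses) with probability
    silent_frac, plus a Gaussian truncated to positive weights (taken here
    as zero-mean, i.e. a half-Gaussian, for simplicity)."""
    rng = np.random.default_rng(rng)
    w = np.abs(rng.normal(0.0, sigma, size=n))  # half-Gaussian positive part
    w[rng.random(n) < silent_frac] = 0.0        # silence a fraction of synapses
    return w
```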

The learning algorithm that we used is in good qualitative agreement with standard protocols used to induce plasticity in

We have focused on the GC

The conditions for storing associations can be expressed as,

The constraints on the weights are

One can write the perceptron algorithm with sign constraint as:

(0)

(1) pick a pattern

Go to (1)
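The steps (0)–(1) above can be sketched as a sign-constrained perceptron learning loop: error-driven updates followed by clipping of any weight that would become negative, so that the excitatory constraint is preserved. The learning rate `eta`, the epoch-based stopping rule, and the default threshold heuristic are illustrative choices, not details from the paper.

```python
import numpy as np

def train_sign_constrained(X, y, theta, eta=0.1, max_epochs=1000, rng=None):
    """Perceptron learning under a non-negativity (excitatory) constraint.
    X: (n_patterns, n_inputs) array of inputs; y: binary targets.
    A sketch of steps (0)-(1) in the text: on each error the weights are
    updated in the usual perceptron direction, then clipped at zero."""
    n_patterns, n_inputs = X.shape
    rng = np.random.default_rng(rng)
    w = rng.random(n_inputs)                      # (0) random non-negative init
    for _ in range(max_epochs):
        errors = 0
        for mu in rng.permutation(n_patterns):    # (1) pick a pattern
            out = int(X[mu] @ w >= theta)
            if out != y[mu]:
                w += eta * (y[mu] - out) * X[mu]  # error-driven update
                np.maximum(w, 0.0, out=w)         # enforce w >= 0
                errors += 1
        if errors == 0:                           # all associations stored
            return w
    return w
```

On a problem that is realizable with non-negative weights, this loop halts with all associations stored, consistent with the convergence proof sketched in the text.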

The principle of the proof of convergence is as follows. Let us suppose that there exists a solution to the learning task with positive weights. In other words, we assume there exists a set of weights

As in the standard case (with unconstrained weights), one computes the cosine of the angle between the weight vectors

We write

From the hypothesis that

One proceeds similarly for the norm:

To get a bound on the scalar product

Since
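The overall shape of the argument follows the classical perceptron convergence bound; the skeleton below uses assumed notation ($w^*$ a solution with margin $\kappa > 0$, $w(t)$ the weights after $t$ updates, $R$ a bound on the pattern norms) and omits the modifications needed to handle the clipping step, which are what distinguishes the proof in the text.

```latex
% Skeleton of the perceptron convergence bound (notation assumed).
\begin{align}
  w^* \cdot w(t) &\ge w^* \cdot w(0) + t\,\kappa
      && \text{(scalar product grows linearly in } t\text{)} \\
  \|w(t)\|^2 &\le \|w(0)\|^2 + t\,R^2
      && \text{(norm grows at most like } \sqrt{t}\text{)} \\
  1 \;\ge\; \cos\theta_t &= \frac{w^* \cdot w(t)}{\|w^*\|\,\|w(t)\|}
      \;\ge\; \frac{w^* \cdot w(0) + t\,\kappa}
                   {\|w^*\|\sqrt{\|w(0)\|^2 + t\,R^2}}
\end{align}
```

Since the right-hand side would exceed 1 for large $t$, the number of updates is bounded, $t \le O\!\left(R^2\|w^*\|^2/\kappa^2\right)$, so learning terminates whenever a solution exists.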

Note that this proof of convergence of the sign-constrained perceptron is distinct from that of Amit et al.

The capacity is defined as the maximal number of random associations that can be learned per input synapse. The capacity of a perceptron with bistable output, where the target output is correlated and the inputs are uncorrelated, can be computed analytically, using the replica method

The ‘typical’ volume of the subspace of weights satisfying Equations 24 can then be computed, as a function of

The calculation follows a standard procedure. One first introduces integral representations for the Heaviside functions, which allows one to average over the patterns. Then, one introduces order parameters

In the large

Finally, the equation for the distribution of synaptic weights for the bistable perceptron is identical to the one for the standard perceptron, i.e. at maximal capacity

We would like to thank Boris Barbour, Antonin Blot, Mariano Casado, Vincent Hakim and Clément Lena for fruitful discussions.