
The contribution of the basal ganglia and cerebellum to motor learning: A neuro-computational approach

  • Javier Baladron,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany, Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Santiago, Chile

  • Julien Vitay,

    Roles Software, Writing – review & editing

    Affiliation Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany

  • Torsten Fietzek,

    Roles Software, Writing – review & editing

    Affiliation Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany

  • Fred H. Hamker

    Roles Conceptualization, Funding acquisition, Project administration, Supervision, Writing – original draft, Writing – review & editing

    fred.hamker@informatik.tu-chemnitz.de

    Affiliation Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany

Correction

22 Jun 2023: Baladron J, Vitay J, Fietzek T, Hamker FH (2023) Correction: The contribution of the basal ganglia and cerebellum to motor learning: A neuro-computational approach. PLOS Computational Biology 19(6): e1011243. https://doi.org/10.1371/journal.pcbi.1011243

Abstract

Motor learning involves a widespread brain network including the basal ganglia, cerebellum, motor cortex, and brainstem. Despite its importance, little is known about how this network learns motor tasks and which role different parts of this network take. We designed a systems-level computational model of motor learning, including a cortex-basal ganglia motor loop and the cerebellum that both determine the response of central pattern generators in the brainstem. First, we demonstrate its ability to learn arm movements toward different motor goals. Second, we test the model in a motor adaptation task with cognitive control, where the model replicates human data. We conclude that the cortex-basal ganglia loop learns via a novelty-based motor prediction error to determine concrete actions given a desired outcome, and that the cerebellum minimizes the remaining aiming error.

Author summary

In this study, we aimed to better understand motor learning using a neuro-computational approach. While previous work emphasized different learning regimes of the various components, the main novelty of our study is the interplay of its components. Notably, we show that the model accounts well for motor adaptation data of experiments that involve a cognitive strategy.

Introduction

A commonly assumed role for the motor basal ganglia (BG) is action or motor program selection [1–6]. The basal ganglia integrate sensory evidence arguing for a particular decision and disinhibit the corresponding action plan. Such motor program selection involves a focal removal of tonic neural activity in the output nuclei of the BG to activate the desired movement while increasing other neuronal activity to avoid the execution of unwanted programs [7, 8]. However, how proper actions are discovered and represented is still unclear.

Although most common tasks addressed by computational models of the basal ganglia only require choosing a correct action among other actions, e.g. selecting a button as a response to sensory input [9–13], some models addressed the role of the BG in a broader context of the motor system. Magdoom et al. [14] proposed a computational model where the BG makes corrections to movements controlled by the motor cortex. The model was used to show how reduced dopamine signals in Parkinson’s disease can produce abnormal movements. In an extended version of this model, the BG amplified low amplitude input signals through stochastic resonance to produce movements [15]. Kim et al. [16] proposed a model where the BG selects a muscle activation pattern, demonstrated in a 2-dimensional reaching task. In the motor control framework proposed by Mannella and Baldassarre [17], the BG modulates the dynamics of a cortical reservoir that implements a movement. The authors reproduce three different periodic behaviors of a two-joint arm.

The cerebellum is crucial for maintaining accuracy across multiple movements [18, 19]. Individuals with cerebellar damage have deficits when required to adapt a well-known behavior to a sudden change in environmental conditions [20]. Its role in motor adaptation has been further confirmed in imaging studies [21, 22]. Cerebellar pathology has also been hypothesized to be related to Developmental Coordination Disorder, which in children manifests as reduced motor performance [23, 24]. The cerebellum may implement a forward model, an inverse model, or both. As with the basal ganglia, most models of the cerebellum focus on internal dynamics and are rarely applied to complex motor tasks. Those computational models of the cerebellum that are involved in motor tasks have mainly been developed in the context of neuro-robotics, often abstract much from biological detail, and typically implement a closed-loop motor control network [25–31].

Influential theories regarding the interaction between different motor systems emphasize that each system operates with a different type of learning mechanism, with the cerebellum implementing supervised learning, the basal ganglia reinforcement learning, and the cortex unsupervised learning [32]. An extended version of this idea, the super-learning hypothesis, proposes that the three learning mechanisms form an integrated system and act in synergy [33]. Results from one system may influence another through multiple neural pathways or neuromodulators.

Houk and colleagues [34, 35] proposed a conceptual framework that suggests distributed processing modules. It places the cerebral cortex at the center, and independent loops with the BG and cerebellum feed back to the cortex. The cortico-basal ganglia loops make an initial coarse selection that is then narrowed down (or refined) by the corresponding cortico-cerebellar loop. For example, in a reaching task, the cerebellum may use a prediction error to compute an online corrective movement. Thus, although the cerebellum, cortex, and basal ganglia may use different learning paradigms, they implement an interactive system capable of handling a diversity of tasks [36–38].

Shadmehr and Krakauer [39] proposed a theory based on the framework of optimal feedback control. Rather than action selection, the basal ganglia are given a higher-level involvement in planning with respect to the cost and reward structure of the task. The cerebellum implements a forward model to predict the sensory consequences of motor action [40]. Recently, Haar and Donchin [41] combined Houk’s approach [34] with Shadmehr and Krakauer’s optimal control theory. They emphasize the distributed nature of the cortical network. The cortex-cerebellum loops are assumed to implement a predictive error correction of the cortical activity. However, their theory assumes the concept of parallel and segregated cortex-basal ganglia loops and thus underestimates recent evidence for a hierarchical organization of cortex-basal ganglia loops [37, 42–44].

In addition to the rather theoretical frameworks discussed above, the interaction of brain areas relevant to motor tasks can be explored by means of computational models. Very few computational models have included both the cerebellum and the basal ganglia. An early approach [45], in which the cerebellar and basal ganglia circuitry was modeled by means of simple feedforward neural networks and combined with the DIRECT-model for motor reaching [46], aimed at explaining the behavioral difference between Parkinsonian patients and controls in a motor adaptation task. According to this model, when learning in the basal ganglia is deactivated to mimic the neurodegeneration of dopaminergic nigrostriatal neurons, continuous erratic movements occur. This compares well to data from patients who show only a crude adaptation. Recently, Caligiore et al. [36] designed a basal ganglia-cerebellar-thalamo-cortical system to explain the development of tics in Tourette syndrome. Although the model can recreate changes in the firing rates of cells in animal models of the disease, it does not implement a motor task. Capirchio et al. [47] used a system-level model to simulate a reaching task, which requires reaching three targets from a home position. In this model, the basal ganglia are represented by an actor-critic reinforcement learning account and the cerebellum as a feed-forward perceptron. Lesions to the cerebellar part showed effects observed in patients with cerebellar ataxia. Another recent model by Todorov et al. [48] focused on the role of the cerebellum and basal ganglia in motor adaptation. The basal ganglia implement action selection of a cortical motor program representing a movement trajectory. It is trained by the difference of successive reward prediction errors to support learning when performance improves and suppress the recent action when performance decreases. The cerebellum computes a small correction to the cortical motor program by means of a neural network trained with error backpropagation. In their model, any cerebellum-induced change in performance activates learning in the basal ganglia, creating a credit assignment problem about the source of a gain or decline in performance. They therefore propose the existence of a critic somewhere in the brain that determines when each component participates in learning.

Other structures heavily involved in motor execution are the central pattern generators (CPGs) in the brainstem and spinal cord [49–52], which are involved not only in locomotion but also in reaching [53–55]. In mice, stimulation of brainstem neurons in the lateral rostral medulla leads to complex forelimb reaching and grasping behavior, where different populations of neurons trigger different patterns of behavior [56]. The large diversity of specialized motor-related neurons in the brainstem integrates information from the cortex, thalamus, cerebellum, and basal ganglia [57]. CPGs became very popular in the research field of neurorobotics, leading to sophisticated demonstrations of complex motor actions [58–61]. However, CPGs need some form of higher-level control when recruited for goal-directed behavior.

We introduce here a systems-level computational model that includes the basal ganglia, motor cortex, cerebellum, and brainstem. The focus of our study is the potential division of labor and learning in motor coordination, particularly in reaching and motor adaptation tasks. However, we do not aim to develop a rigorous implementation of neuro-biological details for each subsystem, given the still relatively poor understanding of the neural circuits in these brain parts.

Results

Model design

The model was designed in an open-loop control framework (Fig 1) in order to study its potential and limitations. In an open-loop control framework, the CPGs already provide movement dynamics but need to be under top-down or feedback control [58]. Our plant is a robotic arm with four degrees of freedom. In the reaching task, the shoulder’s yaw, pitch and roll, and the elbow were each controlled by an independent CPG network following the implementation of Nassour et al. [58, 62]. Each CPG network is formed by three layers: a rhythm generation layer that can generate multiple activity patterns, a pattern formation layer that shapes the generated pattern, and a motor neuron layer that drives the joint. While we do not deny the existence of feedback pathways and closed-loop control, we start here with a model that does not include feedback except for learning. Thus, further upstream motor centers have to provide parameters that manipulate the movement dynamics of the CPG. Our model determines those parameters from two components. The motor cortex-basal ganglia interactions select concrete actions while the cerebellum fine-tunes those actions. The actual network in the brain is of course more complicated. For example, output neurons of the basal ganglia that project to the thalamus have collaterals that target different regions of the brainstem [56]. The term concrete action refers to the observation that movements can be decomposed into a finite set of elementary movements [63] and that activation of the motor cortex produces a limited set of muscle activations [64]. Action selection (BG) and action refinement (cerebellum) are learned through different biologically plausible mechanisms.

Fig 1. Design of the model.

A goal position, which may be determined by the pre-motor cortex-basal ganglia loop, has to be reached. This goal informs both a motor cortex-basal ganglia loop and the cerebellum. The motor cortex-basal ganglia loop selects a concrete action, which determines the parameters of the CPG in the brainstem. When an achieved hand position is novel, learning occurs through dopamine-modulated Hebbian plasticity that reinforces the association between the executed action and the reached hand position. The cerebellum produces small adjustments to the CPG parameters that reduce the distance between the goal and the achieved position in the current task. Learning occurs through perturbation-based learning using the distance between the goal and the reached position as an error signal.

https://doi.org/10.1371/journal.pcbi.1011024.g001

A recent hypothesis about the functional structure of the cerebellum is that the recurrent connectivity in the cerebellar cortex implements a reservoir of dynamic activities [65–67] instead of the classically hypothesized feedforward structure. Inputs from the cerebral cortex enter via the mossy fibers a strongly connected recurrent network formed by granule and Golgi cells in the cerebellar cortex, allowing complex patterns to evolve over time even after the inputs have stopped [68]. These spatio-temporal patterns in the reservoir can then be detected by the Purkinje cells to produce appropriate responses [69]. In order to benefit from this dynamical function of the cerebellum, we use the reward-modulated reservoir framework proposed by Miconi [70] as a model of the cerebellum. While the model of [70] is agnostic with respect to localizing the reservoir in any particular area of the brain, it has been used to control a musculoskeletal model of the human arm with four degrees of freedom and 16 muscles in a reaching task with two fixed targets. The reservoir learns by means of a perturbation learning rule, where random perturbations are individually applied to the neurons of the reservoir with varying amplitude and fixed frequency during a trial. At the end of a trial, the reached location is compared to the intended location to compute an aiming error signal. Depending on whether this error decreased or increased compared to the last similar trial, the weights inside the recurrent network are adapted depending on the occurrence of a perturbation (which is maintained by an eligibility trace) and the improvement or worsening of the aiming error. Perturbation learning is an alternative to error backpropagation and is considered more biologically plausible as all computations are local to the neurons.
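The perturbation rule described above can be illustrated with a minimal sketch. It is not the implementation of [70] or of our model: the network size, time constants, the per-step perturbation probability (standing in for the fixed perturbation frequency), the cubic accumulation of the trace, and the function names `run_trial` and `apply_update` are all assumptions made for illustration. The sketch only shows the two ingredients named in the text, an eligibility trace that records perturbation-driven fluctuations during a trial and a weight change whose sign depends on whether the aiming error improved.

```python
import math
import random


def run_trial(weights, inputs, rng, steps=40, perturb_prob=0.1, amplitude=0.5, tau=10.0):
    """Run one trial of a small rate network and collect an eligibility trace.

    All constants here are illustrative assumptions, not values from the paper.
    """
    n = len(weights)
    x = [rng.uniform(-0.1, 0.1) for _ in range(n)]   # small random initial state
    r = [math.tanh(v) for v in x]                    # firing rates
    x_avg = list(x)                                  # short running average of x
    elig = [[0.0] * n for _ in range(n)]             # one trace per connection
    for _ in range(steps):
        new_x = []
        for i in range(n):
            drive = sum(weights[i][j] * r[j] for j in range(n)) + inputs[i]
            xi = x[i] + (drive - x[i]) / tau
            if rng.random() < perturb_prob:          # occasional random perturbation
                xi += rng.uniform(-amplitude, amplitude)
            new_x.append(xi)
        for i in range(n):
            deviation = new_x[i] - x_avg[i]          # fluctuation around the recent average
            for j in range(n):
                elig[i][j] += (r[j] * deviation) ** 3    # assumed supralinear accumulation
            x_avg[i] = 0.75 * x_avg[i] + 0.25 * new_x[i]
        x = new_x
        r = [math.tanh(v) for v in x]
    return r, elig


def apply_update(weights, elig, error, previous_error, eta=0.01):
    """Shift weights along the trace if the error shrank, against it if it grew."""
    improvement = previous_error - error
    return [[w + eta * e * improvement for w, e in zip(w_row, e_row)]
            for w_row, e_row in zip(weights, elig)]
```

In a full simulation, `error` would be the distance between the goal and the reached hand position, and `previous_error` the value stored for the last trial with the same goal.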

Although the reservoir network of [70] is not related to the particular structure of the cerebellum, its neurons can be divided into two groups, depending on whether they are output cells or not. Following the interpretation of the cerebellum as a reservoir computing machine [66, 67], output neurons would correspond to the Purkinje cells and non-output neurons to the granular and Golgi cells. Cerebellar parallel fibers implement therefore the readout connections, and recurrent connections between granule and Golgi cells provide the necessary dynamic behavior. However, there is no explicit distinction between excitatory granule cells and inhibitory Golgi cells in the version of the model that we use.

The cortex-basal ganglia component is inspired by recent ideas regarding a hierarchical organization of the basal ganglia and cortex [42, 43]. Specifically, we proposed that the brain achieves goal-directed behavior through a cascade of decisions made by the multiple cortico-basal ganglia loops, each creating an intermediate objective at a different abstraction level [44]. Planning starts in the ventral or limbic loop with the desire for a particular internal or external reward known to be achievable given the current state. The dorsomedial or associative domain then determines the state needed to be reached in order to obtain the reward. The desired state is transformed into a motor goal by a further loop, e.g., by moving the hand to a particular location to satisfy the objective of reaching the object. Finally, the motor goal is transformed into a concrete action plan that may be executed by an open loop model, e.g. central pattern generators (CPGs). Let’s summarize the above concept with an example from everyday life: Our limbic system signals the need for water and we decide to reach for a glass of water, which in turn determines the motor goal in the form of the spatial coordinates x,y,z, or the corresponding joint angles. The motor cortex-basal ganglia loop will then select a concrete action that moves the arm to the motor goal. The advantage of our hierarchical approach is that the motor goal is task-independent. After a decision about the target object is made by the premotor loop, the reaching action does not need further information about the decisions made by the earlier loops. As we have already shown how such a set of decisions could be learned by dopamine-modulated plasticity [44, 71], we focus here on the motor loop only and how a motor goal is transformed into a concrete action and its final execution.

We have also recently demonstrated that learning in multiple cortex-basal ganglia loops cannot rely on a single prediction error signal being identical for all loops [44]. While a reward prediction error is well suited for the limbic loop, the motor loops should be trained by different signals to make them specific to the motor content, independent of the planning and motivational aspects of the task. We use here a dopamine response that indicates the novelty of the achieved movement [72].

A further implication of our framework is that the goal location coming from the pre-motor cortex has initially no meaning. The meaning of such internal signals must be first discovered by active exploration via the environmental act-and-sense loop. Learning occurs after the motor action by sensing its outcome, the reached location, in the premotor cortex. Thus, the outcome is linked to the action that leads to the outcome, providing meaning to the goal signals from the premotor cortex. In our motor loop, actions are initially randomly activated and a phasic increase of dopamine indicates the novelty of the achieved movement, modulating plasticity in the motor striatum to connect outcomes to concrete actions. Supported by the ideomotor theory [73–75], we assume that this active exploration via the environmental act-and-sense loop is a necessary step that takes place prior to goal-directed behavior (but may continue during the lifetime), as the brain has initially no representation of the body kinematics (and dynamics).

Reaching with the cerebellum alone

As a reference, we initially test the reservoir model from Miconi [70] to mimic cerebellar learning. Following the procedure introduced by Miconi, the activity of all cells in the reservoir is randomly initialized to a small value at the beginning of each trial, the corresponding input is set, and the network is simulated for 200 milliseconds. The input is then deactivated and the network relaxes its activity for 200 additional milliseconds. The mean activity of the reservoir’s output cells in the last 200 milliseconds is then transformed linearly into the six parameter values of each CPG network (4 joints, therefore 24 output values). Thus, the reservoir encodes the values for the full arm movement, i.e. all joints. The network has to learn reaching movements towards 8 different arbitrary targets within the arm’s workspace.
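The readout step of this procedure can be sketched as follows. This is an illustrative example only: the function names, the assumption that output activities lie in [-1, 1], and the parameter ranges passed in `ranges` are hypothetical, since the text states only that the mapping is linear and yields 6 values for each of the 4 joints.

```python
def mean_output_activity(history, output_indices, window):
    """Average each output cell's activity over the last `window` time steps.

    `history` is a list of time steps, each a list of cell activities.
    """
    recent = history[-window:]
    return [sum(step[i] for step in recent) / len(recent) for i in output_indices]


def to_cpg_parameters(mean_activity, ranges, n_joints=4, n_params=6):
    """Map activities in [-1, 1] linearly onto (low, high) ranges, grouped per joint.

    `ranges` holds one hypothetical (low, high) pair per CPG parameter.
    """
    if len(mean_activity) != n_joints * n_params:
        raise ValueError("expected one output value per joint and parameter")
    params = []
    for joint in range(n_joints):
        row = []
        for p in range(n_params):
            activity = mean_activity[joint * n_params + p]
            low, high = ranges[p]
            row.append(low + (activity + 1.0) / 2.0 * (high - low))
        params.append(row)
    return params
```

With a 1 ms time step, `window=200` would correspond to the final relaxation period; the step size is likewise an assumption here.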

The perturbation learning rule used in the reservoir depends strongly on three parameters: the learning rate (η) or step size, the perturbation frequency (f) which determines how often the activity of the cells is perturbed, and the perturbation amplitude (A) which determines the size of the perturbation. Therefore, f and A control the level of noise in the network. Models with a small learning rate or low noise parameters decrease the error only by a small amount (see Fig 2A). Models with intermediate levels of noise or learning rate are able to solve the task but converge to different error levels. Models with faster learning become unstable: the distance to the goal initially decreases, reaches an asymptotic value, and then increases again. The same network configuration does not become unstable in a simpler version of the task in which only 2 goals are required to be learned (see Fig 2B). Results of an exhaustive parameter variation are given in S1 Fig.

Fig 2. Reservoir’s performance.

Performance of the reservoir with different parameter configurations: eta is the learning rate, f is the frequency of the perturbation and A is the amplitude of the perturbation. For each configuration, 50 different simulations are run, each with a different random seed producing different initial conditions, goals, and noise values. On all plots, the Euclidean distance between the goal and reached location over all simulations and including different goals is shown. A: The reservoir sets the parameters of a CPG network controlling each joint. The system is expected to learn 8 goals. Slow-learning networks hardly reduce the error. Fast-learning networks are unstable: They initially appear to learn the task, but then tend to forget previous knowledge. B: The same network is used but asked to learn only 2 goals. Configurations that were unstable with 8 goals are stable in this simpler version of the task. C: The output of the reservoir is transformed directly into joint angles (no CPGs are used). The performance of this network is worse than when including the CPGs. The shaded area next to each curve shows the standard deviation of the mean.

https://doi.org/10.1371/journal.pcbi.1011024.g002

In a further control configuration, CPGs are removed and the activity of the reservoir’s output cells is directly linked to the change in the 4 joint angles. Those angles are transformed into a resulting hand position using a kinematic model. Networks with less noise are weaker than those including the CPGs (see Fig 2C). Fast networks become unstable, similarly to the model that includes the CPG. Thus, the CPG component is rather beneficial and does not account for the observed limitation of the reservoir when asking it to learn movements to a larger set of goal locations.

In summary, motor learning by the reservoir alone is sensitive to learning parameters, particularly when multiple target movements are required.

Reaching with the cerebellum and basal ganglia

In order to test if the division of labor between the basal ganglia and the cerebellum can avoid instabilities, we tested our full neuro-computational model (see Fig 3 for a more detailed view of the model), involving both components, on the same reaching task as before.

Fig 3. Detailed view on the computational model.

Arrows indicate excitatory synaptic connections between neurons. Red arrows indicate plastic connections. Lines ending with a circle indicate inhibitory connections. The closed motor cortex-basal ganglia loop has as many stripes as concrete actions. The direct pathway within the basal ganglia selects one of 120 possible concrete actions. This large number of actions ensures sufficient movement diversity within the reaching space of the arm. Each action is represented in a discrete channel connecting the corresponding cortical, striatal, substantia nigra pars reticulata (SNr), and thalamic cells. Each discrete action activates multiple sets of neurons representing possible CPG parameter values. Each CPG is formed by three layers: RG is the rhythm-generator layer, PF is the pattern formation layer and MT are the motor neurons. The 6 parameters per CPG being adapted are: the time constant τm, a shape parameter for the current–voltage curve of the fast current σf, the potassium conductance normalized to the leak conductance σs, and the injected current iinj of the rhythm generator neurons of the CPGs, as well as α0 and θ0, which are the slope of the sigmoid and the center of the curve of the pattern formation layer of the CPGs. The final parameter value associated with each action is computed by integrating the activity of parameter cells weighted by their preferred parameter value. The cerebellum receives as input an abstract representation of the current goal (no position), one cell per possible goal. In the brain, that position may be encoded within the thalamus of the premotor loop. 24 of the 400 cells in the reservoir project outside (6 parameter values x 4 CPGs) and their activity contributes to the final CPG parameters. Only a single set of neurons for just one CPG is shown in the figure.

https://doi.org/10.1371/journal.pcbi.1011024.g003

The possible concrete actions are encoded by a neural population called the motor cortex, which is part of a motor cortex-basal ganglia loop. Each cortical cell projects to a set of neurons that use a population code to represent the CPG parameter values (see Fig 3). Each cell in these parameter populations is assigned a preferred parameter value. The final parameter value is decoded by computing a sum over the preferred parameter values, weighted by the activity of the corresponding cell. The weights of the connections from the action encoding population to the parameter encoding populations are fixed and random.
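The decoding of a parameter value from its population code can be written as a short sketch. The function name `decode_parameter` is hypothetical, and dividing by the total activity is an assumption: the text specifies only an activity-weighted sum over preferred values, so the unnormalized variant is kept available through the `normalize` flag.

```python
def decode_parameter(activity, preferred, normalize=True):
    """Decode one CPG parameter from a population of parameter cells.

    activity  : firing rate of each parameter cell
    preferred : preferred parameter value assigned to each cell
    normalize : if True, divide by total activity (an assumption, see lead-in)
    """
    if len(activity) != len(preferred):
        raise ValueError("one preferred value per cell is required")
    weighted = sum(a * p for a, p in zip(activity, preferred))
    if not normalize:
        return weighted
    total = sum(activity)
    if total == 0:
        raise ValueError("no active cells to decode from")
    return weighted / total
```

For example, a population in which only the cell preferring 0.5 is active decodes to 0.5, while two equally active cells preferring 0.0 and 1.0 decode to their midpoint under normalization.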

The basal ganglia network is a simplified version of our previous model [44, 76, 77], including only a direct pathway (striatum → substantia nigra pars reticulata → thalamus → cortex). Selection occurs when the constant inhibition exerted by the substantia nigra on the thalamus is removed by a corresponding activation in the striatum, allowing a specific cell in the thalamus to get activated and increase the firing rate of the corresponding concrete action. Despite some agreement on the functional role of different basal ganglia pathways, there is nevertheless some variability, particularly with respect to the indirect and hyperdirect pathways [4]. For the purpose of our study, we only need an intact function of the direct pathway and thus keep the model simple to save computation time. However, more complex motor tasks may benefit from considering additional basal ganglia pathways.
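Selection by disinhibition through the direct pathway can be sketched with rectified rate units. This is a toy illustration, not the model's equations: the tonic SNr level, the thalamic drive, the rectified-linear form, and the function names are assumptions. It shows only the qualitative mechanism described above, in which striatal activity in one channel removes the SNr inhibition of the matching thalamic cell.

```python
def direct_pathway(striatum, snr_tonic=1.0, thalamic_drive=0.5):
    """Propagate striatal activity through SNr to the thalamus, channel by channel.

    snr_tonic and thalamic_drive are illustrative constants.
    """
    # Striatal activity inhibits the tonically active SNr cell of its channel.
    snr = [max(0.0, snr_tonic - s) for s in striatum]
    # A thalamic cell becomes active only when its SNr inhibition is low enough.
    thalamus = [max(0.0, thalamic_drive - inhibition) for inhibition in snr]
    return snr, thalamus


def selected_channel(thalamus):
    """Return the index of the most active thalamic cell, or None if all are silent."""
    best = max(range(len(thalamus)), key=thalamus.__getitem__)
    return best if thalamus[best] > 0.0 else None
```

With striatal activities `[0.1, 0.9, 0.2]`, only the second channel is released from inhibition, so `selected_channel` returns index 1; with no striatal activity, every thalamic cell stays silent and no action is selected.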

Dopamine-modulated Hebbian learning in the striatum links the input from the goal-encoding cells to the motor program. Novelty-based learning in the basal ganglia works as follows: After every movement, the input activity of the dopamine cell is increased from its baseline to 1, triggering plasticity in striatal neurons. The activity reached by the dopamine cells is however limited by a prediction obtained from the inhibition produced from the striatum, which is also subject to plasticity. The dopamine level reaches its maximum value only when an action is executed for the first time as the striatal inhibition increases after each movement. The same dopamine signal reaches all cells.
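The novelty signal can be illustrated with a minimal sketch in which a learned striatal inhibition predicts, and thereby cancels, the dopamine burst. The baseline of 0, the learning rate, the indexing of the inhibition by action, and the function name `novelty_dopamine` are assumptions for illustration; the model's actual rate equations are not reproduced here.

```python
def novelty_dopamine(inhibition, action, learning_rate=0.5):
    """Return the dopamine burst after executing `action` and update the prediction.

    inhibition : list of learned striatal inhibition strengths, one per action
                 (modified in place)
    """
    # The excitatory input is driven to 1 after every movement; the learned
    # striatal inhibition subtracts from it, and the result cannot go below 0.
    burst = max(0.0, 1.0 - inhibition[action])
    # The inhibition grows in proportion to the burst it failed to predict.
    inhibition[action] += learning_rate * burst
    return burst
```

Executing the same action repeatedly yields bursts of 1.0, 0.5, 0.25 and so on under these settings, whereas an action executed for the first time still produces the maximal burst.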

Unlike previous action-selection models of the BG, we only implement plasticity between the premotor cortex and the basal ganglia. It is common in computational models to assume that the BG implement a winner-take-all mechanism between input action channels [6]. In classical action-selection models, the main inputs to the BG loop are the available actions and the BG must select one of them, usually the most salient one. In those models, the BG does not implement any transformation of the input information; it only removes the less salient action channels. Plasticity is then implemented in the connections within the loop (motor cortex) to ensure a proper action selection. Based on our previous models [9, 44, 77, 78], we instead assume that each BG loop learns a goal-response map, which links objectives to appropriate actions. The input to the loop differs from that of action-selection models, as it results from the information processing in previous loops. For selecting concrete actions, plasticity is then required at the projections from the premotor cortex, not necessarily at the projections from the motor cortex.

The cerebellum is modeled as a pool of 400 randomly connected cells. The projections within the pool are plastic and follow a perturbation-based learning rule [70]. 24 of those 400 project outside (6 parameters per joint). The activity of these output cells is added to the parameter value encoded in the parameter cells before they are set in the CPGs.

The basal ganglia are trained prior to the task simulation until the model replicates a randomly selected outcome three times in a row. The main goal of this process is for the basal ganglia to create a map between outcomes (final hand positions) and concrete actions. During training, 120 actions are activated randomly, the outcome is observed and finally the association strength between the outcome and the action is increased. This creates a meaning for the pre-motor cortex neurons, which do not have one until activated by an observation. On each simulation a different set of 120 actions is defined, each associated with a random set of CPG parameters. Later, the outcome-action map is used to select an action based on a desired outcome (Fig 4). The BG are therefore not trained on the goals of the task, but develop knowledge about the possible actions to choose from. Activity of the BG during an example trial is shown in S2 Fig.
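The act-and-sense training can be sketched as a simple outcome-action map. This toy version makes several assumptions that are not part of the model: hand positions are 2-dimensional points in the unit square, outcomes are represented by hypothetical grid cells, each action has a fixed endpoint, and training runs for a fixed number of trials rather than until an outcome is replicated three times in a row.

```python
import random


def outcome_cell(position, bins=5):
    """Index of the (hypothetical) outcome cell that represents a hand position."""
    x, y = position
    col = min(int(x * bins), bins - 1)
    row = min(int(y * bins), bins - 1)
    return row * bins + col


def train_outcome_action_map(endpoints, trials, rng, bins=5, lr=0.5):
    """Randomly execute actions and strengthen the link outcome -> action."""
    n_actions = len(endpoints)
    weights = [[0.0] * n_actions for _ in range(bins * bins)]
    for _ in range(trials):
        action = rng.randrange(n_actions)               # act: random exploration
        cell = outcome_cell(endpoints[action], bins)    # sense: observe the outcome
        weights[cell][action] += lr * (1.0 - weights[cell][action])
    return weights


def select_action(weights, desired_cell):
    """Pick the action most strongly associated with the desired outcome."""
    row = weights[desired_cell]
    best = max(range(len(row)), key=row.__getitem__)
    return best if row[best] > 0.0 else None            # None: outcome never observed
```

After training, activating a desired outcome cell retrieves an action that was observed to reach that cell, which is the sense in which exploration gives meaning to the goal signal.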

Fig 4. Basal ganglia training.

The initial training of the basal ganglia is performed by randomly activating desired outcomes. A: Learned trajectories of 120 concrete actions of an example simulation. Each of the 120 lines in the plot represents the trajectory of the hand after selecting one action starting from the same position in one simulation. The basal ganglia can therefore select one among 120 trajectories. B: Result of learning in the basal ganglia by exploration via the environmental act-and-sense loop. At the beginning of every training trial, a random goal (desired hand position) is activated. Then, if no action cell has a strong enough firing rate, a random action is activated by setting its activity to 1. The basal ganglia learn to map the reached position to the activated action. Thus, learning associates the outcome with the action that leads to the outcome (act-and-sense). The plot shows that, over time, each intended outcome becomes associated with an action that closely reaches it. The blue line represents the mean distance over 50 simulations and the orange line is the average of the mean distance with a time window of 10 trials.

https://doi.org/10.1371/journal.pcbi.1011024.g004

We simulated the same reaching task with 2 and 8 goals. In the cerebellum, we used a learning rate η = 0.8 and noise parameters f = 9 and A = 20. These parameters correspond to a fast network, which produced unstable behavior when learning the task directly. Our simulations show that, with the full model, both tasks can be learned without any stability problem. The reason is that learning is simpler, as the BG introduce an initial solution through a concrete action and only small adjustments are produced by the cerebellum (see Fig 5). Not surprisingly, learning is also much faster than with the cerebellum alone.

Fig 5. Training the full model in a reaching task.

The full model includes a cortex-basal ganglia component that has been pretrained to allow the selection of a concrete action for a given arbitrary goal. The full model is then tested with either 2 or 8 random goal positions and required to learn to execute a movement to the given goal. The full model remains stable when the number of goals is increased from 2 to 8. The shaded area next to each curve shows the standard deviation.

https://doi.org/10.1371/journal.pcbi.1011024.g005

Visuomotor adaptation task

After demonstrating the model’s basic functionality, we now investigate its ability to explain observations in motor adaptation. Motor adaptation refers to a particular type of motor learning in which a well-known action is modified to maintain performance after a change in the environment or the body [79]. One common way to study adaptation in an experimental setting is to impose a visuomotor rotation [80]. In such experiments, participants are seated in front of a screen and are required to move a cursor toward a target location with a straight inward-outward movement [81]. The cursor is not visible throughout the whole trajectory. During the movement, the cursor remains initially at its starting position and then indicates the movement reversal point. Thus, subjects only obtain visual feedback about their movement outcome with respect to its endpoint. After several baseline trials, the cursor’s coordinate system is rotated with respect to the coordinate system of the hand movement space. As participants are not informed about the manipulation and only observe the outcome, they slowly alter their behavior to cope with this perturbation. Errors are reduced trial by trial, suggesting that adaptation is controlled by an implicit learning process. Once the perturbation is removed, an aftereffect is observed: the participants initially overcompensate and then slowly, trial by trial, return to normal movements [80]. However, when participants were informed about the nature of the perturbation and instructed to compensate for it, they immediately applied the strategy and had almost no error in the trial after the information had been given [80].

We confront our model with the visuomotor adaptation task used by Mazzoni and Krakauer [80]. After initial training on the baseline trials on two random goals, the coordinate system of the cursor is rotated by 45 degrees. As with the participants, we have three types of model simulations: in a first simulation, the model receives no information about the perturbation (rotation group); in a second simulation, the model is forced to adopt an explicit cognitive strategy by instructing it to direct the movement 45 degrees counterclockwise (rotation + strategy group); and in a third simulation the model is also instructed to direct the movement 45 degrees counterclockwise but the cursor is not perturbed (strategy group).

The perturbation is simulated in our model by rotating the final outcome of the hand movement by 45 degrees, as human subjects also have no visual feedback of their arm trajectory. Thus, after the rotation is introduced, the models make a 45 degree error (in Fig 6 at trial 100). The manipulation leads to an error signal in the cerebellum, which shows a strong increase once the rotation is introduced, but it does not induce novelty-based learning in the BG. In the strategy condition, the model is instructed to counter the perturbation, as with human subjects in the original experiment of Mazzoni and Krakauer. The instruction to counter the perturbation is given to our model as a change in the goal represented in the premotor cortex. The new goal corresponds to a position rotated by 45 degrees with respect to the initial one. The new input triggers the BG to select a different concrete action, one that moves the arm closer to the new goal direction. As with the participants, the instructed model immediately reduces the error close to zero (trial 103 in Fig 6). This rapid change in movement direction, similar to what was observed in humans, is produced in our model by action selection at the BG level, as the cerebellum outputs only gradual corrections and requires multiple repetitions to adapt. In the following trials, the new motor goal is maintained and therefore the basal ganglia continue selecting the same concrete action. The change in the motor goal due to the instruction also affects the error computed at the level of the cerebellum, as the observed position of the pointer is compared to the intended motor outcome (aiming error, not task error). Importantly, this explains why the model, like human subjects, shows increasingly large directional errors over the following trials, over-adapting to the perturbation.
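The difference between aiming error and task error can be illustrated with a deliberately simplified scalar sketch. It is not the reservoir model used in the simulations: the cerebellar contribution is reduced to a single angular correction updated by a delta rule, and the function name `simulate_strategy`, the learning rate `eta`, and the trial count are assumptions chosen only for illustration.

```python
def simulate_strategy(rotation=45.0, eta=0.1, n_trials=50):
    """Toy scalar model of the rotation + strategy condition (angles in degrees).

    The target sits at 0 degrees. The delta-rule update and all parameter
    values are illustrative assumptions, not the model described in the text.
    """
    target = 0.0
    motor_goal = target - rotation   # instructed strategy: aim opposite to the rotation
    correction = 0.0                 # cerebellar adjustment added to the selected action
    task_errors, aiming_errors = [], []
    for _ in range(n_trials):
        hand = motor_goal + correction       # selected action plus cerebellar correction
        cursor = hand + rotation             # perturbed visual feedback
        task_error = cursor - target         # what the task asks to minimize
        aiming_error = cursor - motor_goal   # what the cerebellum learns from
        task_errors.append(task_error)
        aiming_errors.append(aiming_error)
        correction -= eta * aiming_error     # gradual, trial-by-trial correction
    return task_errors, aiming_errors


task_errors, aiming_errors = simulate_strategy()
print(task_errors[0], round(task_errors[-1], 2))
```

In this sketch the task error is zero on the first instructed trial and then grows in magnitude, while the aiming error shrinks, which is the qualitative pattern of over-adaptation described above.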

Fig 6. Visuomotor adaptation.

Test of the model with a visuomotor rotation task [80]. After initial training on baseline trials, the coordinate system of the cursor is rotated by 45 degrees. Then, after 200 trials interacting in the perturbed environment, the conditions return to the baseline. The first row shows the performance of models that, after 2 trials in the perturbed experiments, are informed about the perturbation by changing the goal location and re-setting the goal location later on (ROTATION + STRATEGY). The second row shows models that are not informed about the perturbation (ROTATION). The third row shows models that are provided with the new goal, but the environment was not perturbed (STRATEGY).

https://doi.org/10.1371/journal.pcbi.1011024.g006

In the original experiment of [80], after over-adapting to the perturbation, participants were instructed to stop using the explicit strategy. We give our model this information by a change in the goal, setting it back to the initial position, changing therefore again the concrete action, and as a consequence the error at the cerebellum. The new concrete action produces an immediate change in the direction, as observed in humans (see the increase in the error in Fig 6 rotation+strategy group around trial 300). When the perturbation is finally removed (10 trials after the last instruction), models and subjects show an after-effect and the error slowly declines. During this last period there is no further change in the motor goal and the corrections are therefore only produced by the cerebellum.

Our simulations of the rotation group (no instruction) show no immediate direction change. Like the human subjects, the model slowly adapts to the perturbation reducing the error trial by trial. Once the perturbation is removed, an aftereffect is again observed: A change in the direction of the error and a slow return to zero.

The simulations of the group that was instructed, but not perturbed, show no slow change in the error and no aftereffect. The change in the concrete action moves the arm toward the new desired direction and only very small changes are introduced by the cerebellum, as errors are computed according to the new instructed motor goal (aiming error). Thus, no aftereffect occurs, similar to the data from human subjects. A comparison of the error signal in the cerebellum under the three conditions can be observed in S3 Fig.

When we remove the cerebellum such that it provides no contribution to the CPG, in the rotation+strategy condition the BG compensates for the perturbation and the over-adaptation observed in the full model does not occur (S4 Fig).

Concluding, our model can replicate the main properties of the data of [80]. However, we also spotted small differences: the model’s implicit learning process is slower than that of the participants. This could be because in the experiment of Mazzoni and Krakauer, the subjects were expected to make wrist movements of only 2.2cm, much shorter than in our setup.

Motor variability

Although motor variability has often been considered an undesired characteristic that should be avoided, it has been shown that task variability is a good predictor of individual learning ability [82–84]. Greater task-relevant variability predicts faster learning.

In our model, learning in the cerebellum depends on perturbations to the activity of the cells and requires appropriate noise levels. In the reservoir, noise is defined by two parameters: the frequency by which a perturbation is introduced into the activity of the cells and the amplitude of this perturbation.

We compare models with different frequencies and amplitudes in the same perturbation task used in the previous section. Models with higher noise amplitude adapt faster to the rotated environment (see Fig 7 top). Increasing the noise frequency also allows a faster adaptation (see Fig 7 bottom). However, changes in the learning speed saturate at sufficiently large values: the learning speed does not improve further when the frequency level is increased. This compares well with the observations of van der Vliet et al. [82].

Fig 7. Variability in the visuomotor adaptation task.

Higher levels of noise produce faster adaptation until a particular noise level is reached. The plot on the top shows the performance of models with different perturbation amplitudes. The plot below shows the performance of models with different perturbation frequencies.

https://doi.org/10.1371/journal.pcbi.1011024.g007

Discussion

Our computational model is meant to advance the ongoing discussion on the contribution of the basal ganglia and cerebellum to motor learning. In the 3D-reaching task, we demonstrate the benefit of the concrete action selection by the basal ganglia, compared to a cerebellum-only model. Combined with the basal ganglia, the cerebellum is now only required to fine-tune the motor parameters, but not to learn and store all parameters of the arm movement. This further agrees with the super-learning hypothesis [33], as both learning systems interact in a pipeline organization, with the cerebellum using the results of the BG. Simulations with the full network are able to reach a good performance with parameter values that produced unstable behavior in an isolated cerebellum model.

Of course, this advantage depends a lot on the assumed complexity of computation localized in the cerebellum and on the complexity of the control architecture. While we have used an open loop control and a target endpoint, models from the neuro-robotics community (e.g. [26, 85, 86]) typically use feedback control, which ensures that the desired endpoint will be reached, while a trajectory planner sets up the desired joint angles and the according velocities. In those approaches, models representing the cerebellum are embedded in the circuitry as forward and inverse models, and help to bring the actual trajectory closer to the desired trajectory. However, references to the basal ganglia in those studies are rather abstract and no explicit models of the basal ganglia have been used to solve robotic motor-control tasks. Demonstrating our model in the motor reaching task is meant as a proof of concept, but not to compete with state-of-the-art robotic solutions.

Adaptation tasks that include an additional cognitive strategy to counter the error [80] provide an interesting test scenario for our model. When human subjects are informed to use a strategy to overcome an error due to a rotational bias, they nevertheless continue adapting, leading to increased errors, although the strategy was effective and the task could have been done without error. In our model, the cognitive strategy affects the motor goal encoded in the premotor cortex and as a result, a different concrete action in the basal ganglia is selected to compensate for the rotational bias. However, although the cognitive strategy works fine for the task, the cursor endpoint is not consistent with the motor goal, which leads to continuous adaptation and to an increasingly bad performance on the task. This clearly shows that motor adaptation depends on an error signal that uses a motor goal (presumably defined in sensory space) but not a task goal. However, recent studies showed that under conditions where the sensory prediction error is non-zero, the task error can also have an influence [87] and both errors may interact with each other, presumably within the cerebellum [88].

The error used for cerebellar learning can be computed in different ways. It may be computed by comparing the predicted sensory consequences of the planned motor action with the outcome, i.e. sensory feedback, see also [20]. Alternatively, the motor goal [89] may already be defined in sensory space (cursor at an intended location) and the executed action is selected to reach this goal. Our approach follows this direct updating account without the need to use a forward model for computing a sensory prediction error. Recently, similar ideas have been put forward and the latter approach has been formulated as direct policy updating [90] and compared to the traditional framework according to which a forward model is updated and inverted for motor control. There is an ongoing debate about the need for motor-based forward models beyond the own body if error signals can be obtained by alternative action-outcome frameworks [91].

A critical assumption of our model is novelty-based learning in the BG. Traditionally, BG models use reward prediction errors as a model of dopaminergic signalling, where reward is linked to the task performance. However, there is evidence that dopamine neurons encode multiple signals and that different types of dopaminergic cells are connected with distinct brain networks [92]. Many cells fire to non-rewarding events [72]. Thus, motor learning may not be directly driven by a signal following task performance. Novelty signals allow the basal ganglia to acquire knowledge that is task-independent, reducing catastrophic forgetting. In our model, synaptic plasticity follows a 3-factor learning rule, with dopamine as the third factor. The size of a phasic increase in the dopamine signal depends on the prediction computed on the basis of the activation of striatal neurons. With repetitions of the same action, the prediction increases and thus the dopamine signal decreases. As the dopamine signal depends on an internal context, here the activation of striatal neurons, it allows, in principle, learning of different tasks independent of childhood experience. However, we consider our novelty-based learning to be a comparatively simple implementation within this interesting field of research.

Taylor and Ivry [93] designed a mathematical setpoint state space model to replicate the data of [80]. The model includes a learning equation to calculate the current internal estimate of the rotation. Different from previous approaches using similar techniques to model other adaptation protocols, their equations include a representation of an explicit strategy. Its biological implementation, however, is unclear and no reference to action selection or the basal ganglia has been made.

Motor adaptation, but not particularly the role of cognitive strategies, has also been modeled by Todorov and colleagues [48] using a model of the basal ganglia and cerebellum. In addition to several differences at the implementation level, there are noticeable differences at the conceptual level of the model design that shall be discussed. According to their model, both the cerebellum and basal ganglia aim to counteract the perturbation. The cerebellum uses the error between the movement endpoint and the target to compute a correction of the motor program. Different from our approach where the basal ganglia are trained by a novelty learning rule, their basal ganglia model is trained by a temporal difference of the movement error, indicating an increased or decreased success on the task. Due to conflicts in the adaptation process, they created a critic that implements an arbitrator which controls when adaptation should be led by the basal ganglia and when by the cerebellum.

In our model, the basal ganglia select a motor action that is under strategic control. For example, to move a cursor upwards, it can choose to move the hand in a different direction. We have proposed a cognitive-to-motor hierarchy that can convert a task goal into a motor goal and the choice of the particular action [44], while we here only modeled the motor selection part. At the motor level, learning in the basal ganglia should not follow a task-performance reinforcement signal, but rather a motor-performance signal. In the present study, based on the heterogeneity of the dopamine system [94], we decided to learn on basis of a novelty learning rule in the basal ganglia. If the achieved position after a cerebellar correction is similar to positions observed during the initial training, then no learning will occur in the basal ganglia and therefore no conflict between the basal ganglia and the cerebellum occurs. Further, even if the position is new, learning will occur according to the achieved position and not the current motor goal, producing no conflict in the following trials.

The adaptation experiment we simulated includes an explicit instruction which produces an immediate reduction in the error. We represented this as a change in the motor goal which allows the BG to select a new concrete action, changing instantaneously the simulated movement direction. In comparison, the BG in the motor adaptation model of Todorov and colleagues [48] learns by means of a temporal difference of the task-performance between the current and previous trials and thus, adapts slowly and requires an exploration period after the perturbation is introduced to find the appropriate correction. In order to simulate an explicit strategy, the model of Todorov et al. would need to include an additional mechanism. Further, forcing the BG to learn on task-performance will counteract the learning in the cerebellum, which rather predicts against an ongoing adaptation towards larger task errors as observed in human subjects in the strategy condition.

Limitations

Models for understanding motor behavior and motor learning can cover many different disciplines. They may include aspects of computational neuroscience, neurorobotics, artificial neural networks, learning rules, and control theory. From each particular viewpoint, present models have limitations, due to the complex nature of the research topic. We aimed for a systems-level design to study the share of labor of different parts performing a simple robotic task and an experimental task in motor adaptation. Of course, each of our model components abstracts a lot from the brain area it shall represent. Our model of the basal ganglia covers some aspects of computational neuroscience and has been previously studied a lot and compared to experimental data [9, 44, 71, 76, 77], although here we only considered the direct pathway of the basal ganglia. The model of the CPG is biologically well-motivated, but more directed at a functional level for neurorobotics [58, 62]. The model of the cerebellum is quite abstract from its biological counterpart and is modeled as a reservoir with perturbation learning, thus avoiding the backpropagation learning rule. It is now also known that basal ganglia and cerebellum are not largely independent of each other but interconnected [95]. Through such direct projections, adaptations learned by the cerebellum could be transmitted to the basal ganglia which could then guide a learning process that incorporates them into the concrete action. Here, we do not consider any direct connection between those structures but simply add their output before setting the parameters of the joints.

The model’s motor cortex is not well motivated on the basis of physiological data but is limited to the idea of representing compact actions. Further, our motor cortex only includes fixed connections. Plasticity is known to occur in the motor cortex and is critical for the development of complex behaviors [96, 97]. In our model, plasticity in the motor cortex could help to optimize the set of actions available to the basal ganglia. For example, parameter refinements learned by the cerebellum could be then incorporated into the cortical representations of the corresponding concrete action. It has been already suggested that sensorimotor knowledge could be exported from the cerebellum to the cortex [98, 99].

Our model does not add much to the field of control theory and to its already sophisticated models of closed-loop control, as we have taken an open-loop approach. However, our approach may be extended to test theories of intermittent control which aim to describe control tasks by serial ballistic movements [100, 101]. The motor tasks we modeled do not pose a challenge to the neurorobotics community. However, a better understanding of the potential contribution of different brain parts can be helpful for designing more sophisticated robots, particularly with respect to the division of labor between cortical areas, basal ganglia, and the cerebellum.

We have also related our model to data suggesting that noise is beneficial for learning [83]. As observed in behavioral experiments, higher variability leads to faster learning. However, we need to be careful with these observations, as planning noise needs to be differentiated from execution noise [82]. Our model only includes planning noise, which is represented by small perturbations of the activity of cerebellar cells, but does not include execution noise, which could be produced at the level of the muscles and independent of the high-level signal reaching the joints. In our simplified implementation, the same high-level signal will always produce the same movement, something that may not happen in a more realistic environment. The relation between planning and execution noise, and the linked credit assignment problem, are topics for future studies. Further, there is evidence that the nervous system can regulate variability according to the context [102]. Increasing reward probabilities can reduce movement variability, while decreasing reward probabilities produces the contrary effect [103].

Unlike previous approaches based on reinforcement learning [104], our model has not been compared to human kinematic data. All simulations shown here use random actions to highlight that the model can learn to use any type of movement.

We should emphasize here that at the present stage our results are limited to a proof of concept. In order to accept the hypothesis presented here, more experiments are required and a proper comparison to other models of the basal ganglia-cerebellum network is necessary. Further, for now only a qualitative comparison with experimental data is presented.

Conclusion

Brainstem circuits are highly specialized centers for motor control which are informed by more upstream centers such as the motor cortex, thalamus, basal ganglia, and cerebellum [57]. How central pattern generators (CPGs) are influenced by basal ganglia and cerebellar sub-systems has been the central aim of our model design. We propose that cortex-basal ganglia loops select concrete actions that can be fine-tuned by the cerebellum. While the traditional view links learning in the basal ganglia to reward-based learning, and in the cerebellum to supervised learning, our approach suggests that learning in the basal ganglia is not uniform, but rather depends on the origin of the cortex-basal ganglia loop [44]. While the limbic basal ganglia are well suited for learning about the success of the task, the motor basal ganglia shall rather consider aspects of motor execution, such as a novelty-based dopamine signal. This dissociation of labor allows us to explain the surprising observation that human subjects continue to adapt in motor adaptation tasks, although they perform the task without error. In our model, the basal ganglia can counteract the perturbation in motor adaptation by a cognitive strategy. However, as the cerebellum learns about the difference between the intended position and the final arm position, it further contributes to adaptation.

Materials and methods

Central pattern generator

Each CPG network is composed of three layers: rhythm-generation neurons, pattern formation neurons, and motor neurons. More details about its neurophysiological basis can be found in [62].

The rhythm-generator layer is composed of two cells that can generate self-rhythms. The membrane potential (V) of these cells is defined by: (1) where τm and τs are time constants, iinj is the injected current, q is the lumped slow current, σs is the potassium conductance normalized to the leak conductance, σf is a dimensionless shape parameter for the current–voltage curve of the fast current and Af is the width of the N shape of the fast current.

Pattern formation neurons are modulated by the rhythm-generator neurons and by sensory neurons encoding the current joint angles. The activation function is defined by: (2) where RG is the activation of the rhythm generator neurons, Wrg is the weight for the connection from the rhythm generator neuron, Sj is the activity of the sensory neurons and Wj the weight of the connections from the sensory neurons. αPF is a descending control signal that modulates the activity of pattern formation cells and θ0 is the center of the sigmoid function that controls the balance between the extensor and the flexor.

Motor neurons are defined by: (3)

The final joint angle (U) is obtained by combining the extensor and flexor motor commands: (4) where Amp is an amplification factor, MNF and MNE are the flexor and the extensor motor neurons activation. Uref is the joint reference angle.

The parameters τm, σf, σs, iinj of the rhythm generator neurons and the parameters α0, θ0 of the pattern formation neurons of all CPGs are set as a result of the BG and cerebellum interactions. The values of the fixed parameters are shown in Table 1.

Basal ganglia

The firing rate of neurons in the basal ganglia is defined by the following equation: (5) where mpj is the membrane potential, rj is the firing rate, τ is a time constant, wij is the weight between the presynaptic neuron i and postsynaptic neuron j, Ne is the group of cells that have an excitatory projection to neuron j, Ni is the group of cells that have an inhibitory projection to neuron j, B is a baseline value and ϵj is a noise term drawn from a uniform distribution. ()+ converts negative numbers to 0.

Plasticity in the cortico-striatal projection follows the learning rule: (6) where wij is the weight between cortical cell i and striatal cell j, fDA(DA(t) − BDA) is the dopamine modulation which depends on a phasic change between the current dopamine level (DA(t)) and the baseline dopamine level (BDA), Cij is the correlation between cortical cell i and striatal cell j and is a normalization term that limits the weight growth.

Based on biological findings [105, 106], a phasic increase in dopamine (DA(t) > BDA) strengthens the weights between active neurons while a phasic decrease (DA(t) < BDA) reduces their value. The function fDA(x) controls the rate of increase and decrease and takes values Kb for positive x and Kd for negative x.

The correlation term (Cij) is computed following the equation: (7) where ri and rj are the firing rates of cortical cell i and striatal cell j, rPRE is the mean firing rate of the cortical population and rPOST is the mean firing rate of the striatal population, γPRE and γPOST are thresholds.

The dopamine level DA(t) is computed following the activity of a cell whose activity is governed by: (8) where BDA is the baseline dopamine level, P(t) controls that dopamine changes are produced only after a movement is executed, being 1 after a movement and 0 otherwise. The dopamine level is inhibited through direct striatal connections with weights .

Projections from the striatum to the dopaminergic cell are plastic and governed by the following rule: (9)

All fixed parameter values are shown in Table 2.

Table 2. Values for the fixed parameters of the basal ganglia.

https://doi.org/10.1371/journal.pcbi.1011024.t002

Cerebellum

The cerebellum module follows the reservoir computing framework proposed by [70]. It is composed of 400 neurons with a firing rate ri(t) given by: (10) where Jij are plastic local weights, uk(t) is the activity of the goal encoding cells, which is 1 if goal k is currently active and 0 otherwise, and Bik are random weights drawn from a uniform distribution between -0.2 and 0.2.

At every time step the value of xi(t) is perturbed with a probability f. Perturbations are introduced by adding to x a random value drawn from a uniform distribution between −A and A.
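The perturbation step can be sketched as follows. This is a minimal illustration, assuming that each neuron is perturbed independently and that the noise frequency f has already been converted into a per-step probability (`prob`); the function name `perturb` and the example values are hypothetical.

```python
import random


def perturb(x, prob, amplitude, rng):
    """Return a copy of x in which each entry is perturbed with probability `prob`.

    A perturbed entry receives an additive value drawn uniformly from
    [-amplitude, amplitude]. Treating `prob` as a per-step, per-neuron
    probability is an assumption of this sketch.
    """
    out = []
    for value in x:
        if rng.random() < prob:
            value = value + rng.uniform(-amplitude, amplitude)
        out.append(value)
    return out


rng = random.Random(0)
x = [0.0] * 400  # one entry per reservoir neuron
x_new = perturb(x, prob=0.01, amplitude=20.0, rng=rng)
```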

The learning rule depends on an eligibility trace given by: (11)

The weight change (ΔJ) is then defined as: (12) where E is the error in the current trial and is the mean error. The initial values of the weights Jij are drawn from a normal distribution with a mean of 0 and a standard deviation of 0.05.

Kinematic model

The position of the wrist (x) given the output joint angles of the CPGs is computed by performing a set of matrix operations following the simple kinematics of the humanoid robot James [107, 108]. This provides us with a fast transformation from angles to hand position. where x is the position of the wrist, elbow is the angle of the elbow joint in radians, roll is the angle of the shoulder roll joint in radians, yaw is the angle of the shoulder yaw joint in radians, and pitch is the angle of the shoulder pitch joint also in radians.

Training and task simulation details

For the simulations in the reaching task, goals are selected by adding a random number of degrees to the initial arm configuration and then computing the hand position. This ensures that goals are reachable. Only goals that are at a minimum distance of 0.5 from the initial hand position are considered to avoid very short movements.

Every simulation starts with a basal ganglia training block. At the beginning of each trial of this block, the network is simulated with no inputs for enough time to allow it to return to its baseline activity. Then, a random goal is generated and the baseline of the cortical input cells changes according to a Gaussian function of the difference between the cell’s preferred position and the goal. The network is then simulated for 200ms and the activity in the motor cortex is observed. If the maximum activity in the motor cortex is less than 0.05, a random concrete action is selected and the activity of the corresponding action cell is set to 1. If the maximum activity is larger than 0.05, the activity of the most active concrete action is set to 1. Then, an additional 150ms are simulated to allow the parameter encoding cells to reach a stable activity pattern.
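The selection step of a training trial can be summarized in a short sketch. The threshold of 0.05 is taken from the text; the function name `select_action`, the list representation of the motor cortex activity, and the choice to set all other cells to 0 in the returned vector are assumptions made for illustration.

```python
import random


def select_action(activity, threshold=0.05, rng=random):
    """Pick a concrete action from the motor cortex activity.

    If no cell reaches `threshold`, a random action is chosen (exploration);
    otherwise the most active cell wins. The chosen cell is set to 1; zeroing
    the remaining entries is a simplification of this sketch.
    """
    if max(activity) < threshold:
        chosen = rng.randrange(len(activity))
    else:
        chosen = max(range(len(activity)), key=lambda i: activity[i])
    clamped = [0.0] * len(activity)
    clamped[chosen] = 1.0
    return chosen, clamped


# Example: the second cell is clearly active, so it is selected.
chosen, clamped = select_action([0.0, 0.3, 0.1])
```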

Parameter values are then computed by reading the activity of the parameter encoding cells. A sum over the activity of the cells is computed, weighted by the cells’ preferred parameter value. The values for σf and σs are limited between 5 and 10, iinj is limited between -4 and 4, τm is limited between 5 and 15, and α0 and θ0 are limited between 0.001 and 2.
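A minimal sketch of this readout is given below. The bounds are those stated above; the dictionary keys, the function name `decode_parameter`, and the plain (non-normalized) weighted sum are assumptions of the sketch.

```python
# Bounds taken from the text; the dictionary keys are illustrative names.
BOUNDS = {
    "sigma_f": (5.0, 10.0),
    "sigma_s": (5.0, 10.0),
    "i_inj": (-4.0, 4.0),
    "tau_m": (5.0, 15.0),
    "alpha_0": (0.001, 2.0),
    "theta_0": (0.001, 2.0),
}


def decode_parameter(name, activity, preferred):
    """Activity-weighted sum of preferred values, limited to the allowed range."""
    value = sum(r * p for r, p in zip(activity, preferred))
    low, high = BOUNDS[name]
    return min(max(value, low), high)


# Two cells with preferred values 2 and 4, both half active.
i_inj = decode_parameter("i_inj", [0.5, 0.5], [2.0, 4.0])
```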

A movement is executed by solving the CPG equations and transforming the final angles into a hand position using the kinematic model. The baseline of the input cortical cells is then changed according to this new position and the model is further simulated for 100ms. Finally, the baseline of the dopamine cell is increased to 1.0 to allow learning, and a final 100ms is simulated. The activity of the dopamine cells during this period is further restricted through striatal inhibition.

In simulations with 8 goals, the simulation speed is increased by computing the concrete action for each goal in advance. After the initial basal ganglia training, 8 additional trials are simulated, each with one of the goals that will be used later during the task simulation. The concrete action selected and the corresponding parameter values are saved for future use. Then, during the task simulation, the output values of the cerebellum are added to the saved concrete action values.

On every trial during the task simulation, the activity of the cerebellum cells is initially set to a uniform random value between -0.01 and 0.01. Then the corresponding input cell is activated and the network is simulated for 200ms. The input is then turned off and an additional 200ms is simulated. The mean of the activity of the output cell during this final period is considered as the output of the network and added to the parameters obtained through the concrete action. This process was used originally by Miconi [70].
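As a rough illustration of this readout, the sketch below initializes the reservoir state and averages a recorded output trace. The function names and the array layout (time steps by parameters) are assumptions; the reservoir dynamics themselves are not reproduced here.

```python
import numpy as np

def initial_reservoir_state(n_cells, rng, bound=0.01):
    """Uniform random initial activity between -bound and bound."""
    return rng.uniform(-bound, bound, size=n_cells)

def corrected_parameters(saved_parameters, output_trace):
    """Add the mean output over the final period to the parameters
    obtained through the concrete action.
    output_trace has shape (time_steps, n_parameters) and is assumed to
    hold the output activity recorded after the input is turned off."""
    correction = np.mean(output_trace, axis=0)
    return np.asarray(saved_parameters, dtype=float) + correction
```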

After executing the movement, the Euclidean distance between the goal and the achieved position is computed and used as an error function to train the reservoir. The mean error considered in the learning rule is computed independently for every goal.
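One way to keep a separate mean error for every goal is sketched below. The class name and the exponential moving average with its smoothing factor are assumptions chosen for illustration; the text only states that the mean is computed independently per goal.

```python
import numpy as np

class PerGoalMeanError:
    """Tracks a running mean of the reaching error separately per goal."""

    def __init__(self, n_goals, smoothing=0.75):
        # smoothing is a hypothetical value, not taken from the text.
        self.mean_error = np.zeros(n_goals)
        self.smoothing = smoothing

    def update(self, goal_index, goal, achieved):
        """Return (current error, mean error before this trial)."""
        error = float(np.linalg.norm(np.asarray(goal) - np.asarray(achieved)))
        previous_mean = float(self.mean_error[goal_index])
        self.mean_error[goal_index] = (
            self.smoothing * previous_mean + (1.0 - self.smoothing) * error
        )
        return error, previous_mean
```

A learning rule can then compare the current error with the previous mean of the same goal, so that a difficult goal does not distort the baseline used for an easy one.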

In visuomotor rotation paradigms, the arm is normally constrained so that only 2-dimensional movements on a plane are possible, and rotations are introduced within this plane. Because our model normally produces three-dimensional movements, we had to define the plane in which the position is rotated. To do so, we first train the model to reach 2 goals as in the reaching task. Then, during the perturbed period, the final hand position computed with the kinematic model is rotated by a fixed number of degrees around the axis given by the cross-product of the two goals used during training. Angular errors are computed by first projecting the initial and final hand positions onto the same plane and then determining the angle formed by the final position, the initial position, and the goal. Small values mean that the movement is made in the direction required to reach the goal.
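The geometry of the perturbation and of the angular error can be sketched with Rodrigues' rotation formula. The function names are illustrative, and two points are assumptions rather than statements of the text: the rotation axis is taken to pass through the coordinate origin, and the projection plane is taken to be the plane orthogonal to that axis.

```python
import numpy as np

def rotation_axis(goal_a, goal_b):
    """Unit vector along the cross-product of the two training goals."""
    axis = np.cross(goal_a, goal_b)
    return axis / np.linalg.norm(axis)

def rotate_about_axis(point, axis, degrees):
    """Rodrigues' formula: rotate point around a unit axis through the origin."""
    theta = np.radians(degrees)
    return (point * np.cos(theta)
            + np.cross(axis, point) * np.sin(theta)
            + axis * np.dot(axis, point) * (1.0 - np.cos(theta)))

def project_to_plane(point, normal):
    """Remove the component of point along the unit normal."""
    return point - np.dot(point, normal) * normal

def angular_error(initial, final, goal, normal):
    """Angle in degrees at the initial position between the directions
    to the final position and to the goal, after projection."""
    start = project_to_plane(initial, normal)
    to_final = project_to_plane(final, normal) - start
    to_goal = project_to_plane(goal, normal) - start
    cosine = np.dot(to_final, to_goal) / (
        np.linalg.norm(to_final) * np.linalg.norm(to_goal)
    )
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))
```

In this sketch, a hand position that lands exactly on the goal and is then rotated by 45 degrees yields an angular error of 45 degrees when the movement starts at the origin.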

When simulating the rotation and strategy group, we reduced computation time with a technique similar to the one used for the 8-goal reaching task. The parameters for each goal and for its 45-degree rotation are computed in advance, after the initial basal ganglia training, by simulating additional trials. Then, the output of the cerebellum is added to the stored values. Changes in the motor goal are then simulated by recalling a different value from memory. Simulations of the rotation-only group are made by solving the complete network.

All simulations were implemented using the neural simulator ANNarchy, a software tool designed for distributed rate-coded or spiking neural networks [109]. The code was written using ANNarchy’s Python interface; the simulator, however, generates parallel C++ code. Each simulation was run using 2 threads on a computing server with two AMD EPYC 7352 24-core processors and 256 GB of memory. Each simulation of the whole model takes around 12 hours. We ran 25 simulations in parallel on the same machine.

Supporting information

S1 Fig. Effect of learning speed and noise levels on the performance of the reservoir.

We ran multiple simulations with different values for the perturbation frequency (f), the perturbation amplitude (A) and the learning rate (eta). The color in each plot represents the distance between the achieved position and the goal position. Goals were selected randomly but always at a distance of at least 0.5 from the initial hand position. The plots show that low amplitudes impede learning, as the hand stays close to the initial position. When the amplitude is high enough to produce a strong movement, the network is sensitive to the values of the other parameters: high-error points are intermixed with low-error points. High-error points are more common when all three parameter values are high.

https://doi.org/10.1371/journal.pcbi.1011024.s001

(EPS)

S2 Fig. Activity of the basal ganglia during an example trial.

Activation of a goal position in the pre-motor cortex activates the basal ganglia loop, which selects one of the 120 available concrete actions. Each line in the figure corresponds to one action channel. In this example, the red action is selected. Selection starts with an activation of striatal D1 cells, which then inhibit the SNr. The constant inhibition that reaches the thalamus is thereby reduced, allowing the thalamus to activate. Driven by its thalamic inputs, the motor cortex activates. Finally, feedback connections to the striatum further enhance the selection.

https://doi.org/10.1371/journal.pcbi.1011024.s002

(EPS)

S3 Fig. Error signal in the cerebellum during the visuomotor adaptation task.

Each plot of the figure shows the error signal guiding learning in the model’s cerebellum during the adaptation task under one of the three conditions. The aiming error is the distance between the current motor goal and the achieved position. In the first two conditions, the error increases once a perturbation is introduced and is then reduced with learning. Removing the perturbation produces a second increase in the error, which is again slowly reduced trial by trial. In the STRATEGY condition, the change of the concrete action by the basal ganglia keeps the error in the cerebellum low and thereby prevents learning.

https://doi.org/10.1371/journal.pcbi.1011024.s003

(EPS)

S4 Fig. Visuomotor adaptation without the cerebellum.

We ran 50 simulations of the rotation + strategy condition in which the cerebellum’s corrections were removed after the initial training with two random goals. Once the perturbation is introduced, the model makes a large error, which is reduced after the model is instructed to counter the perturbation (trial 103). Unlike in the previous simulations with the full model, the error then stays flat until the model is instructed again. By the end of the simulation, no aftereffect is observed. The shaded area around the curve shows the standard deviation. The variability between simulations is explained by the fact that each simulation uses a different set of random concrete actions.

https://doi.org/10.1371/journal.pcbi.1011024.s004

(EPS)

References

  1. Véronneau-Veilleux F, Robaey P, Ursino M, Nekka F. An integrative model of Parkinson’s disease treatment including levodopa pharmacokinetics, dopamine kinetics, basal ganglia neurotransmission and motor action throughout disease progression. Journal of Pharmacokinetics and Pharmacodynamics. 2021;48:133–148. pmid:33084988
  2. Maith O, Escudero FV, Ülo Dinkelbach H, Baladron J, Horn A, Irmen F, et al. A computational model-based analysis of basal ganglia pathway changes in Parkinson’s disease inferred from resting-state fMRI. European Journal of Neuroscience. 2021;53:2278–2295. pmid:32558966
  3. Chang SE, Guenther FH. Involvement of the cortico-basal ganglia-thalamocortical loop in developmental stuttering. Frontiers in Psychology. 2020;10. pmid:32047456
  4. Schroll H, Hamker FH. Computational models of basal-ganglia pathway functions: focus on functional neuroanatomy. Frontiers in Systems Neuroscience. 2013; p. 1–18. pmid:24416002
  5. Gurney K, Prescott TJ, Redgrave P. A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biological Cybernetics. 2001;84:401–410. pmid:11417052
  6. Frank MJ. Computational models of motivated action selection in corticostriatal circuits. Current Opinion in Neurobiology. 2011;21:381–386. pmid:21498067
  7. Ursino M, Baston C. Aberrant learning in Parkinson’s disease: A neurocomputational study on bradykinesia. European Journal of Neuroscience. 2018;47(12):1563–1582. pmid:29786160
  8. Mink JW. The Basal Ganglia: focused selection and inhibition of competing motor programs. Progress in Neurobiology. 1996;50:381–425. pmid:9004351
  9. Baladron J, Hamker FH. A spiking neural network based on the basal ganglia functional anatomy. Neural Networks. 2015;67:1–13. pmid:25863288
  10. Humphries MD, Stewart RD, Gurney KN. A Physiologically Plausible Model of Action Selection and Oscillatory Activity in the Basal Ganglia. Journal of Neuroscience. 2006;26:12921–12942. pmid:17167083
  11. Gurney K, Prescott TJ, Redgrave P. A computational model of action selection in the basal ganglia. II. Analysis and simulation of behaviour. Biological Cybernetics. 2001;84:411–423. pmid:11417053
  12. Wiecki TV, Frank MJ. A Computational Model of Inhibitory Control in Frontal Cortex and Basal Ganglia. Psychological Review. 2013;120:329–355. pmid:23586447
  13. Frank MJ. Hold your horses: A dynamic computational role for the subthalamic nucleus in decision making. Neural Networks. 2006;19:1120–1136. pmid:16945502
  14. Magdoom KN, Subramanian D, Chakravarthy VS, Ravindran B, Amari SI, Meenakshisundaram N. Modeling basal ganglia for understanding Parkinsonian reaching movements. Neural Computation. 2011;23:477–516. pmid:21105828
  15. Chakravarthy VS. Do basal Ganglia amplify willed action by stochastic resonance? A model. PLOS ONE. 2013;8. pmid:24302984
  16. Kim T, Hamade KC, Todorov D, Barnett WH, Capps RA, Latash EM, et al. Reward Based Motor Adaptation Mediated by Basal Ganglia. Frontiers in Computational Neuroscience. 2017;11:19. pmid:28408878
  17. Mannella F, Baldassarre G. Selection of cortical dynamics for motor behaviour by the basal ganglia. Biological Cybernetics. 2015;109:575–595. pmid:26537483
  18. Schmahmann JD. Disorders of the Cerebellum: Ataxia, Dysmetria of Thought, and the Cerebellar Cognitive Affective Syndrome. The Journal of Neuropsychiatry and Clinical Neurosciences. 2004;16:367–378. pmid:15377747
  19. Zackowski KM, Thach W Jr, Bastian AJ. Cerebellar subjects show impaired coupling of reach and grasp movements. Experimental Brain Research. 2002;146:511–522. pmid:12355280
  20. Tzvi E, Loens S, Donchin O. Mini-review: The Role of the Cerebellum in Visuomotor Adaptation. The Cerebellum. 2022;21:306–313. pmid:34080132
  21. Rabe K, Livne O, Gizewski ER, Aurich V, Beck DTA, Donchin O. Adaptation to Visuomotor Rotation and Force Field Perturbation Is Correlated to Different Brain Areas in Patients With Cerebellar Degeneration. J Neurophysiol. 2009;101:1961–1971. pmid:19176608
  22. Donchin O, Rabe K, Diedrichsen J, Lally N, Schoch B, Gizewski ER, et al. Cerebellar regions involved in adaptation to force field and visuomotor perturbation. J Neurophysiol. 2012;107:134–147. pmid:21975446
  23. Mariën P, van Dun K, Verhoeven J. Cerebellum and Apraxia. Cerebellum. 2015;14:39–42. pmid:25382715
  24. Zwicker JG, Missiun C, Harris SR, Boyd LA. Developmental coordination disorder: A review and update. European Journal of Paediatric Neurology. 2012;16:573–581. pmid:22705270
  25. Naveros F, Luque NR, Ros E, Arleo A. VOR Adaptation on a Humanoid iCub Robot Using a Spiking Cerebellar Model. IEEE Transactions on Cybernetics. 2020;50:4744–4757.
  26. Ojeda IB, Tolu S, Pacheco M, Christensen DJ, Lund HH. A Combination of Machine Learning and Cerebellar-like Neural Networks for the Motor Control and Motor Learning of the Fable Modular Robot. Journal of Robotics, Networking and Artificial Life. 2017;4:62–66.
  27. Casellato C, Antonietti A, Garrido JA, Carrillo RR, Luque NR, Ros E, et al. Adaptive Robotic Control Driven by a Versatile Spiking Cerebellar Network. PLOS ONE. 2014;09. pmid:25390365
  28. Yamazaki T, Igarashi J. Realtime cerebellum: A large-scale spiking network model of the cerebellum that runs in realtime using a graphics processing unit. Neural Networks. 2013;47:103–111. pmid:23434303
  29. Carrillo RR, Ros E, Boucheny C, Coenen OJMD. A real-time spiking cerebellum model for learning robot control. BioSystems. 2008;94:18–27. pmid:18616974
  30. Kawato M. From ‘Understanding the Brain by Creating the Brain’ towards manipulative neuroscience. Philosophical Transactions of the Royal Society B. 2008;363:2201–2214. pmid:18375374
  31. Antonietti A, Martina D, Casellato C, D’Angelo E, Pedrocchi A. Control of a humanoid nao robot by an adaptive bioinspired cerebellar module in 3d motion tasks. Computational Intelligence and Neuroscience. 2019; p. 1–15. pmid:30833964
  32. Doya K. Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology. 2000;10:732–739. pmid:11240282
  33. Caligiore D, Arbib MA, Miall C, Baldassarre G. The super-learning hypothesis: Integrating learning processes across cortex, cerebellum and basal ganglia. Neuroscience & Biobehavioral Reviews. 2019;100:19–34. pmid:30790636
  34. Houk JC, Bastianen C, Fansler D, Fishbach A, Fraser D, Reber PJ, et al. Action selection and refinement in subcortical loops through basal ganglia and cerebellum. Phil Trans R Soc B. 2007;362:1573–1583. pmid:17428771
  35. Caligiore D, Pezzulo G, Baldassarre G, Bostan AC, Strick PL, Doya K, et al. Consensus Paper: Towards a Systems-Level View of Cerebellar Function: the Interplay Between Cerebellum, Basal Ganglia, and Cortex. Cerebellum. 2017;16:203–229. pmid:26873754
  36. Caligiore D, Mannella F, Arbib MA, Baldassarre G. Dysfunctions of the basal ganglia-cerebellar-thalamo-cortical system produce motor tics in Tourette syndrome. PLOS Computational Biology. 2017;13. pmid:28358814
  37. Merel J, Botvinick M, Wayne G. Hierarchical motor control in mammals and machines. Nature Communications. 2019;10:1–12. pmid:31792198
  38. Bostan AC, Strick PL. The basal ganglia and the cerebellum: nodes in an integrated network. Nature Reviews Neuroscience. 2018;19:338–350.
  39. Shadmehr R, Krakauer J. Computational neuroanatomy for motor control. Experimental Brain Research. 2008;185:359–381. pmid:18251019
  40. Izawa J, Criscimagna-Hemminger S, Shadmehr R. Cerebellar contributions to reach adaptation and learning sensory consequences of action. Journal of Neuroscience. 2012;21:4230–4239. pmid:22442085
  41. Haar S, Donchin O. A Revised Computational Neuroanatomy for Motor Control. Journal of Cognitive Neuroscience. 2020;32:1823–1836. pmid:32644882
  42. Rusu SI, Pennartz CMA. Learning, memory and consolidation mechanisms for behavioral control in hierarchically organized cortico-basal ganglia systems. Hippocampus. 2020;30:73–98. pmid:31617622
  43. Dezfouli A, Balleine BW. sd. PLOS Computational Biology. 2013;12.
  44. Baladron J, Hamker FH. Habit learning in hierarchical cortex-basal ganglia loops. European Journal of Neuroscience. 2020;52:4613–4638. pmid:32237250
  45. Grosse-Wentrup M, Contreras-Vidal JL. The role of the striatum in adaptation learning: a computational model. Biological Cybernetics. 2007;96:377–388. pmid:17364182
  46. Bullock D, Grossberg S, Guenther F. A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm. Journal of Cognitive Neuroscience. 1993;5:408–435. pmid:23964916
  47. Capirchio A, Ponte C, Baldassarre G, Mannella F, Pelosin E, Caligiore D. Interactions between supervised and reinforcement learning processes in a neurorobotic model. bioRxiv. 2022;.
  48. Todorov DI, Capps RA, Barnett WH, Latash EM, ID TK, Hamade KC, et al. The interplay between cerebellum and basal ganglia in motor adaptation: A modeling study. PLOS ONE. 2019; p. 1–36. pmid:30978216
  49. Rowat P, Selverston A. Oscillatory Mechanisms in Pairs of Neurons Connected with Fast Inhibitory Synapses. Journal of Computational Neuroscience. 1997;4:103–127. pmid:9154518
  50. Orlovsky GN, Deliagina T, Grillner S. Neuronal control of locomotion: from Mollusc to Man. Oxford University Press; 1999.
  51. Dimitrijevic M, Gerasimenko Y, Pinter M. Evidence for a spinal central pattern generator in humans. Annals of the New York Academy of Sciences. 1998;860:360–376. pmid:9928325
  52. McCrea D, Rybak I. Organization of mammalian locomotor rhythm and pattern generation. Brain Research Reviews. 2008;57:134–146. pmid:17936363
  53. Yakovenko S, Trevor D. Similar motor cortical control mechanisms for precise limb control during reaching and locomotion. The Journal of Neuroscience. 2015;35:14476–14490. pmid:26511240
  54. Georgopoulos A, Grillner S. Visuomotor coordination in reaching and locomotion. Science. 1989;245:1209–1210. pmid:2675307
  55. Sternad D, Dean WJ, Schaal S. Interaction of rhythmic and discrete pattern generators in single-joint movements. Human Movement Science. 2000;19:627–664.
  56. Arber S, Costa RM. Networking brainstem and basal ganglia circuits for movement. Nature Reviews Neuroscience. 2022; p. 1–19. pmid:35422525
  57. Ruder L, Arber S. Brainstem Circuits Controlling Action Diversification. Annu Rev Neurosci. 2019;42:485–504. pmid:31283898
  58. Nassour J, Hoa TD, Atoofi P, Hamker F. Concrete Action Representation Model: from Neuroscience to Robotics. IEEE Transactions on Cognitive and Developmental Systems. 2019;12:272–284.
  59. Degallier S, Righetti L, Gay S, Ijspeert A. Toward simple control for complex, autonomous robotic applications: combining discrete and rhythmic motor primitives. Autonomous Robots. 2011;31:155–181.
  60. Ijspeert A. Biorobotics: using robots to emulate and investigate agile locomotion. Science. 2014;346:196–203. pmid:25301621
  61. Ijspeert A, Crespi A, Ryczko D, Cabelguen J. From swimming to walking with a salamander robot driven by a spinal cord model. Science. 2007;315:1416–1420. pmid:17347441
  62. Nassour J, Hénaff P, Benouezdou F, Cheng G. Multi-layered multi-pattern CPG for adaptive locomotion of humanoid robots. Biol Cybern. 2014;108(3):291–303. pmid:24570353
  63. Flash T, Hochner B. Motor primitives in vertebrates and invertebrates. Current Opinion in Neurobiology. 2005;15:660–666. pmid:16275056
  64. Overduin SA, d’Avella A, Carmena JM, Bizzi E. Microstimulation Activates a Handful of Muscle Synergies. Neuron. 2012;76:1071–1077. pmid:23259944
  65. Tanaka G, Yamane T, Héroux JB, Nakane R, Kanazawa N, Takeda S, et al. Recent advances in physical reservoir computing: A review. Neural Networks. 2019;115:100–123. pmid:30981085
  66. Yamazaki T, Tanaka S. The cerebellum as a liquid state machine. Neural Networks. 2007;20:290–297. pmid:17517494
  67. Tokuda K, Fujiwara N, Sudo A, Katori Y. Chaos may enhance expressivity in cerebellar granular layer. Neural Networks. 2021;136:72–86. pmid:33450654
  68. Rössert C, Dean P, Porrill J. At the Edge of Chaos: How Cerebellar Granular Layer Network Dynamics Can Provide the Basis for Temporal Filters. PLOS Computational Biology. 2015;11(10). pmid:26484859
  69. Schmid K, Vitay J, Hamker FH. Forward Models in the Cerebellum Using Reservoirs and Perturbation Learning. In: 2019 Conference on Cognitive Computational Neuroscience. Berlin, Germany: Cognitive Computational Neuroscience; 2019.
  70. Miconi T. Biologically plausible learning in recurrent neural networks reproduces neural dynamics observed during cognitive tasks. eLife. 2017; p. 1–24. pmid:28230528
  71. Scholl C, Baladron J, Vitay J, Hamker F. Enhanced habit formation in Tourette patients explained by shortcut modulation in a hierarchical cortico-basal ganglia model. Brain Structure & Function. 2022;227:1031–1050.
  72. Horvitz JC. Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience. 2000;96:651–656. pmid:10727783
  73. Hommel B, Elsner B. Acquisition, representation, and control of action. In: Morsella E, Bargh JA, Gollwitzer PM, editors. Oxford Handbook of Human Action. New York, NY: Oxford University Press; 2009. p. 371–398.
  74. Verschoor SA, Weidema M, Biro S, Hommel B. Where do action goals come from? Evidence for spontaneous action effect binding in infants. Frontiers in Psychology. 2010;. pmid:21738512
  75. Hommel B. GOALIATH: a theory of goal-directed behavior. Psychological Research. 2022;86:1054–1077. pmid:34324040
  76. Villagrasa F, Baladron J, Vitay J, Schroll H, Antzoulatos EG, Miller EK, et al. On the Role of Cortex-Basal Ganglia Interactions for Category Learning: A Neurocomputational Approach. Journal of Neuroscience. 2018;31:9551–9562. pmid:30228231
  77. Schroll H, Vitay J, Hamker FH. Dysfunctional and compensatory synaptic plasticity in Parkinson’s disease. European Journal of Neuroscience. 2014;39:688–702. pmid:24313650
  78. Baladron J, Nambu A, Hamker FH. The subthalamic nucleus—external globus pallidus loop biases exploratory decisions towards known alternatives: A neuro-computational study. European Journal of Neuroscience. 2019;49:754–767. pmid:28833676
  79. Krakauer JW, Hadjiosif AM, Xu J, Wong AL, Haith AM. Motor Learning. Comprehensive Physiology. 2019;9:613–663. pmid:30873583
  80. Mazzoni P, Krakauer JW. An Implicit Plan Overrides an Explicit Strategy during Visuomotor Adaptation. The Journal of Neuroscience. 2006;26:3642–3645. pmid:16597717
  81. Honda T, Nagao S, Hashimoto Y, Ishikawa K, Yokota T, Mizusawa H, et al. Tandem internal models execute motor learning in the cerebellum. Proceedings of the National Academy of Sciences. 2018;115:7428–7433. pmid:29941578
  82. van der Vliet R, Frens MA, de Vreede L, Jonker ZD, Ribbers GM, Selles RW, et al. Individual Differences in Motor Noise and Adaptation Rate Are Optimally Related. eNeuro. 2018;5:1–14. pmid:30073197
  83. Wu HG, Miyamoto YR, Castro LNG, Ölveczky BP, Smith MA. Temporal structure of motor variability is dynamically regulated and predicts motor learning ability. Nature Neuroscience. 2014;17:312–321. pmid:24413700
  84. Herzfeld DJ, Shadmehr R. Motor variability is not noise, but grist for the learning mill. Nature Neuroscience. 2014;17:149–150. pmid:24473260
  85. Zahra O, Navarro-Alarcon D, Tolu S. A fully spiking neural control system based on cerebellar predictive learning for sensor-guided robots. In: 2021 IEEE International Conference on Robotics and Automation (ICRA); 2021.
  86. Zahra O, Navarro-Alarcon D, Tolu S. A neurorobotic embodiment for exploring the dynamical interactions of a spiking cerebellar model and a robot arm during vision-based manipulation tasks. International Journal of Neural Systems. 2021;32. pmid:34003083
  87. Tsay J, Haith A, Ivry R, Kim H. Interactions between sensory prediction error and task error during implicit motor learning. PLoS Comput Biol. 2022;23. pmid:35320276
  88. Albert ST, Jang J, Modchalingam S, ‘t Hart BM, Henriques D, Lerner G, et al. Competition between parallel sensorimotor learning systems. PLoS Comput Biol. 2022;. pmid:35225229
  89. Day K, Roemmich R, Taylor J, Bastian A. Visuomotor Learning Generalizes Around the Intended Movement. eNeuro. 2016;29. pmid:27280151
  90. Hadjiosif AM, Krakauer JW, Haith AM. Did We Get Sensorimotor Adaptation Wrong? Implicit Adaptation as Direct Policy Updating Rather than Forward-Model-Based Learning. The Journal of Neuroscience. 2021;24:2747–2761. pmid:33558432
  91. Dogge M, Custers R, Aarts H. Moving Forward: On the Limits of Motor-Based Forward Models. Trends in Cognitive Sciences. 2019;23:743–753. pmid:31371239
  92. Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 2010;68:815–834. pmid:21144997
  93. Taylor JA, Ivry RB. Flexible Cognitive Strategies during Motor Learning. PLOS Computational Biology. 2011;7:1–13. pmid:21390266
  94. Collins AL, Saunders BT. Heterogeneity in striatal dopamine circuits: Form and function in dynamic reward seeking. Journal of Neuroscience Research. 2020; p. 1046–1069. pmid:32056298
  95. Bostan AC, Strick PL. The basal ganglia and the cerebellum: nodes in an integrated network. Nature Reviews Neuroscience. 2018;19:338–350.
  96. Adkins DL, Boychuk J, Remple MS, Kleim JA. Motor training induces experience-specific patterns of plasticity across motor cortex and spinal cord. Journal of Applied Physiology. 2006;101:1776–1782. pmid:16959909
  97. Monfils MH, Plautz EJ, Kleim JA. In Search of the Motor Engram: Motor Map Plasticity as a Mechanism for Encoding Motor Experience. The Neuroscientist. 2005;11:471–483. pmid:16151047
  98. Hua SE, Houk JC. Cerebellar Guidance of Premotor Network Development and Sensorimotor Learning. Learning & Memory. 1997;4:63–76. pmid:10456054
  99. Penhune VB, Doyon J. Cerebellum and M1 interaction during early learning of timed motor sequences. NeuroImage. 2005;26:801–812. pmid:15955490
  100. Gawthrop P, Loram I, Lakie M, Gollee H. Intermittent control: a computational theory of human control. Biological Cybernetics. 2011;104:31–51. pmid:21327829
  101. Loram I, van de Kamp C, Lakie M, Gollee H, Gawthrop PJ. Does the motor system need intermittent control? Exercise and Sport Sciences Reviews. 2014;42:117–125. pmid:24819544
  102. Dhawale AK, Smith MA, Ölveczky BP. The Role of Variability in Motor Learning. Annual Review of Neuroscience. 2017;5:479–498. pmid:28489490
  103. Pekny SE, Izawa J, Shadmehr R. Reward-Dependent Modulation of Movement Variability. Journal of Neuroscience. 2015;35:4015–4024. pmid:25740529
  104. Caligiore D, Parisi D, Baldassarre G. Integrating reinforcement learning, equilibrium points, and minimum variance to understand the development of reaching: a computational model. Psychological Review. 2014;121:389–421. pmid:25090425
  105. Shen W, Flajolet M, Greengard P, Surmeier DJ. Dichotomous dopaminergic control of striatal synaptic plasticity. Science. 2008;321:848–851. pmid:18687967
  106. Shindou T, Shindou M, Watanabe S, Wickens J. A silent eligibility trace enables dopamine-dependent synaptic plasticity for reinforcement learning in the mouse striatum. European Journal of Neuroscience. 2019;49:726–736. pmid:29603470
  107. Jamone L, Metta G, Nori F, Sandini G. James: A Humanoid Robot Acting over an Unstructured World. IEEE-RAS International Conference on Humanoid Robots. 2006;.
  108. Natale L, Nori F, Sandini G, Metta G. Learning precise 3D reaching in a humanoid robot. IEEE 6th International Conference on Development and Learning. 2007;.
  109. Vitay J, Dinkelbach HÜ, Hamker FH. ANNarchy: A Code Generation Approach to Neural Simulations on Parallel Hardware. Frontiers in Neuroinformatics. 2015;9(19). pmid:26283957