Conceived and designed the experiments: CEH MAA NS. Performed the experiments: CEH. Analyzed the data: CEH NS. Contributed reagents/materials/analysis tools: CEH. Wrote the paper: CEH NS.
The authors have declared that no competing interests exist.
Motor training with the upper limb affected by stroke partially reverses the loss of cortical representation after lesion and has been proposed to increase spontaneous arm use. Moreover, repeated attempts to use the affected hand in daily activities create a form of practice that can potentially lead to further improvement in motor performance. We thus hypothesized that if motor retraining after stroke increases spontaneous arm use sufficiently, then the patient will enter a virtuous circle in which spontaneous arm use and motor performance reinforce each other. In contrast, if the dose of therapy is not sufficient to bring spontaneous use above threshold, then performance will not increase and the patient will further develop compensatory strategies with the less affected hand. To refine this hypothesis, we developed a computational model of bilateral hand use in arm reaching to study the interactions between adaptive decision making and motor relearning after motor cortex lesion. The model contains a left and a right motor cortex, each controlling the opposite arm, and a single action choice module. The action choice module learns, via reinforcement learning, the value of using each arm for reaching in specific directions. Each motor cortex uses a neural population code to specify the initial direction along which the contralateral hand moves towards a target. The motor cortex learns to minimize directional errors and to maximize neuronal activity for each movement. The derived learning rule accounts for the reversal of the loss of cortical representation after rehabilitation and the increase of this loss after stroke with insufficient rehabilitation. Further, our model exhibits nonlinear and bistable behavior: if natural recovery, motor training, or both, brings performance above a certain threshold, then training can be stopped, as the repeated spontaneous arm use provides a form of motor learning that further bootstraps performance and spontaneous use. 
Below this threshold, motor training is “in vain”: there is little spontaneous arm use after training, the model exhibits learned nonuse, and compensatory movements with the less affected hand are reinforced. By exploring the nonlinear dynamics of stroke recovery using a biologically plausible neural model that accounts for reversal of the loss of motor cortex representation following rehabilitation or the lack thereof, respectively, we can explain previously hard to reconcile data on spontaneous arm use in stroke recovery. Further, our threshold prediction could be tested with an adaptive train–wait–train paradigm: if spontaneous arm use has increased in the “wait” period, then the threshold has been reached, and rehabilitation can be stopped. If spontaneous arm use is still low or has decreased, then another bout of rehabilitation is to be provided.
Stroke often leaves patients with predominantly unilateral functional limitations of the arm and hand. Although recovery of function after stroke is often achieved by compensatory use of the less affected limb, improving use of the more affected limb has been associated with increased quality of life. Here, we developed a biologically plausible model of bilateral reaching movements to investigate the mechanisms and conditions leading to effective rehabilitation. Our motor cortex model accounts for the experimental observation that motor training can reverse the loss of cortical representation due to lesion. Further, our model predicts that if spontaneous arm use is above a certain threshold, then training can be stopped, as the repeated spontaneous use provides a form of motor learning that further improves performance and spontaneous use. Below this threshold, training is “in vain,” and compensatory movements with the less affected hand are reinforced. Our model is a first step in the development of adaptive and cost-effective rehabilitation methods tailored to individuals poststroke.
Stroke is the leading cause of disability in the US, and about 65% of stroke survivors experience long-term upper extremity functional limitations
There is now definite evidence, however, that physical therapy interventions targeted at the more affected arm can improve both the amount of spontaneous arm use and arm and hand function after stroke
The neural correlates of motor training after stroke have been investigated in animals with motor cortex lesions
Contrasting with the increase in performance due to spontaneous recovery, a concurrent
In summary, increase in performance after stroke due to spontaneous recovery, rehabilitation, or both does not appear to correlate simply with spontaneous arm use, and a yet-to-be clarified nonlinear mechanism seems to be at play. Here, we focus on rehabilitation in the control of reaching poststroke, a prerequisite for successful manipulation. We developed a biologically plausible model of bilateral control of reaching movements to investigate the mechanisms and conditions leading to such positive or negative changes in spontaneous choice of which arm to use. Our central hypothesis, based on the above observations, is the existence of a threshold in spontaneous arm use: if retraining after brain lesion (or spontaneous recovery) increases spontaneous arm use above this threshold, performance will keep increasing, as each attempt to use the affected arm will act as a form of motor relearning. The patient will then enter a virtuous circle of improved performance and spontaneous use of the affected arm, and therapy can be terminated. In contrast, if spontaneous use of the arm does not reach this threshold after either natural recovery or rehabilitation, or both, performance will not improve after stroke, and compensatory strategies with greater reliance on the less affected arm will either remain or even develop further.
To model spontaneous use of one arm or the other, and changes in motor performance, we simulated horizontal reaching movements towards targets distributed along a circle centered on the initial (overlapping) positions of the two arms (
(A) Experimental setup. (B) Model structure. Solid line: information signal; dashed line: activation signal; dotted line: reward-based (reinforcement) learning; double dotted line: error-based (supervised) learning.
To simulate stroke, we partly lesion one hemisphere (i.e., remove a set of simulated neurons from the simulation). We first simulate a spontaneous recovery period in which the action choice module determines the choice of arm, and the state of motor cortex determines error in reaching, with consequent changes in synaptic weights. We then mimic CIT with a forced use condition in which only the use of the affected arm (i.e., that contralateral to the lesioned cortex) was allowed. We study in simulations the conditions that lead to successful recovery, that is, to high levels of spontaneous use and performance with the affected arm in appropriate regions of space, and low reliance on compensatory movements with the less affected arm.
Our model has two distributed interacting and adaptive systems: the motor cortex for motor execution and the action choice module for decision-making.
We made two assumptions to model the motor cortex with a left and a right module for control of the contralateral arm:
The motor cortex contains neurons coding direction of hand movement
The activation rule of each motor neuron is given by a truncated cosine function
Summation of individual neuron vectors (with each vector length given by Equation 1, and the vector direction given by the preferred direction) yields a population vector that has been shown to be well aligned with the initial actual (executed) hand direction
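The population code described above can be illustrated with a minimal sketch. It assumes a rectified cosine with unit gain and no baseline as the "truncated cosine" (the exact form of Equation 1 is not reproduced here), and evenly spaced preferred directions; the function names are hypothetical.

```python
import math

def activation(preferred, desired):
    """Assumed truncated cosine tuning: cosine of the angular difference, floored at zero."""
    return max(0.0, math.cos(desired - preferred))

def population_vector(preferred_dirs, desired):
    """Sum each neuron's vector: direction = preferred direction, length = activation."""
    x = sum(activation(p, desired) * math.cos(p) for p in preferred_dirs)
    y = sum(activation(p, desired) * math.sin(p) for p in preferred_dirs)
    return x, y

# 500 neurons with evenly spaced preferred directions (an illustrative layout).
n = 500
dirs = [2 * math.pi * i / n for i in range(n)]
desired = math.pi / 4
pv_x, pv_y = population_vector(dirs, desired)
pv_angle = math.atan2(pv_y, pv_x)      # direction of the population vector
pv_length = math.hypot(pv_x, pv_y)     # magnitude of the population vector
```

With a uniform population the population vector points along the desired direction; removing neurons from a range of preferred directions would bias it away from targets in that range and shorten it.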
The motor system learns to generate reaching movements by minimizing error bias and by recruiting more neurons for frequently used movement, in effect minimizing directional variance
The cost can, with some approximation, be decreased by applying the following motor cortex learning rule (
In reinforcement learning, actions that maximize outcomes are selected based on estimates of future cumulative rewards, or “values”
Here, we use a total (internal) reward
The action choice module selects one of the arms for movement execution by comparing the action values
After each movement, the action value of each arm is updated with the reward prediction error, that is, the difference
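A prediction-error update of this kind can be sketched as a simple delta rule. The sketch below is an assumption-laden illustration: it stores one value per arm and per direction bin, updates only the arm that moved, and uses a hypothetical learning rate of 0.1; none of these choices is taken from the model's actual equations.

```python
def update_value(values, arm, direction_bin, reward, rate=0.1):
    """Shift the stored action value toward the received reward.

    The reward prediction error is the received reward minus the current
    value estimate; `rate` (hypothetical) scales the size of the update.
    """
    prediction_error = reward - values[arm][direction_bin]
    values[arm][direction_bin] += rate * prediction_error
    return prediction_error

# Eight direction bins per arm, all values starting at zero (illustrative).
values = {"left": [0.0] * 8, "right": [0.0] * 8}
for _ in range(50):
    update_value(values, "right", 2, reward=1.0)
```

Repeated rewarded reaches with one arm in one direction drive that arm's value toward the reward, while values for unused arm and direction pairs stay unchanged.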
Based on the action values, the module probabilistically selects which motor cortex will be used to execute a movement according to the softmax function
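The softmax choice between the two arms can be sketched as follows. The temperature value and the function names are assumptions for illustration; the model's own parameter values are not reproduced here.

```python
import math
import random

def softmax(values, temperature=0.1):
    """Convert action values into selection probabilities.

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable without changing the result.
    """
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def choose_arm(value_left, value_right, temperature=0.1, rng=random):
    """Sample which arm to use according to the softmax probabilities."""
    p_left, _ = softmax([value_left, value_right], temperature)
    return "left" if rng.random() < p_left else "right"

probs_equal = softmax([0.5, 0.5])
probs_biased = softmax([1.0, 0.0])
pick = choose_arm(0.2, 0.8, rng=random.Random(0))
```

A lower temperature makes the choice more deterministic in favor of the higher-valued arm; a higher temperature makes the two arms closer to equally likely.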
Strokes seem to affect only a certain range of movement directions. Outside this range, reaching is relatively spared
(A) Neuronal population coding. (B) Spontaneous use. For (A) and (B): (1) Before stroke, (2) after stroke, (3) after 3,000 free choice trials, and (4) after 1,000 forced use trials followed by 2,000 free choice trials. In (A), each population vector figure shows the desired reach directions (thin black arrows), the neuron activation levels along their preferred directions (thin gray lines), and the resulting population vector (thick black arrows). Note that there are no “votes” for directions corresponding to the lesioned directions in (A.2) and (A.3) but that in (A.4), many neurons have become retuned to yield votes in the lesioned directions. In (B), the pie plots show the probability of using the unaffected right arm to reach to targets arrayed on a circle around the central position. In (B.2) and (B.3), the less affected arm reaches into the lesioned quadrant, but this effect is reversed with therapy (B.4).
We used two measures of motor performance:
The absolute value of the directional error between the intended reach direction and the population vector direction.
The magnitude of the population vector, normalized by the magnitude of the population vector before stroke.
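The two measures listed above can be computed as in the sketch below. It assumes angles in radians and wraps the angular difference so that the error lies between 0 and π; the function names are hypothetical.

```python
import math

def directional_error(desired, pv_x, pv_y):
    """Absolute angle between the intended direction and the population vector."""
    actual = math.atan2(pv_y, pv_x)
    # Wrap the difference into [-pi, pi) so errors across the +/-pi seam stay small.
    diff = (actual - desired + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

def normalized_pv(pv_x, pv_y, prestroke_magnitude):
    """Population vector magnitude relative to its magnitude before stroke."""
    return math.hypot(pv_x, pv_y) / prestroke_magnitude

err = directional_error(0.0, 0.0, 1.0)   # vector pointing at 90 degrees, target at 0
ratio = normalized_pv(3.0, 4.0, 10.0)    # magnitude 5 against a prestroke magnitude of 10
```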
We chose these two performance measures in our model because they can be linked to actual patient performance measures. Initial directional error has been used in characterizing reaching in stroke patients (e.g.,
The changes in performance and spontaneous arm use of the affected arm were recorded in four consecutive phases: (i) an acquisition phase of normal bilateral reaching behavior in 2,000 free choice trials (partially shown), (ii) an acute stroke phase of 500 free choice trials, (iii) a rehabilitation phase in a forced use condition (variable number of trials), and (iv) a chronic stroke phase consisting of 3,000 free choice trials. Values of performance and spontaneous use just after rehabilitation are called “immediate;” their long-term values at the end of the chronic phase are called “follow-up.”
In all phases, targets were randomly generated at the start of each trial, distributed uniformly across all possible angles. Unless otherwise stated, we used the following parameters: Each motor cortex had 500 neurons, with initial preferred directions
The first (prelesion) phase provided a normal baseline for reaching behavior. For each desired direction, learning achieved zero mean directional error (
Just after stroke, however, the population vectors showed directional errors in and around the affected range (
At the end of the “acute stroke” period, the less affected arm largely compensated for the more affected arm in the affected range (
We then studied the time courses of motor performance measures and spontaneous arm use (
(A) Directional error, (B) normalized population vector (PV), and (C) spontaneous arm choice. Five different durations of therapy were used (0, 200, 400, 800, or 3,000 trials). The spontaneous arm use is an average selection probability from 10 uniformly distributed desired directions on the affected range. The threshold of effective rehabilitation for this stroke size is shown by the horizontal dotted line in (C). If rehabilitation brings spontaneous arm use above this threshold, then a virtuous circle between spontaneous arm use and performance will take place, and performance will continue to improve without the need for further rehabilitation.
The model thus exhibits a threshold for the intensity of rehabilitation. To precisely quantify the threshold, we computed the change in spontaneous arm use following rehabilitation by fitting a simple linear model with trials post stroke as predictor; the number of trials corresponding to a null slope corresponds to this threshold. As shown in
We plotted the average slope of spontaneous arm use in the 1,000 trials following rehabilitation as a function of the intensity of therapy. Above 420 trials (with the default parameter set), spontaneous arm use increases after therapy. Below this number of trials, it decreases.
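The slope-based estimate of the threshold can be sketched as a least-squares fit followed by a zero-crossing search. The linear interpolation between tested therapy durations is an assumption of this sketch, and the numbers in the usage lines are toy inputs, not simulation results.

```python
def slope(ys):
    """Ordinary least-squares slope of ys against trial index 0..n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(ys))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

def null_slope_dose(doses, slopes):
    """Interpolate the therapy dose at which the post-therapy slope crosses zero.

    Returns None when the slope never changes from negative to non-negative.
    """
    pairs = list(zip(doses, slopes))
    for (d0, s0), (d1, s1) in zip(pairs, pairs[1:]):
        if s0 < 0 <= s1:
            return d0 + (d1 - d0) * (0 - s0) / (s1 - s0)
    return None

# Toy inputs for illustration only.
toy_slope = slope([0.0, 2.0, 4.0, 6.0])
toy_threshold = null_slope_dose([200, 400, 800], [-2.0, -1.0, 1.0])
```

A negative slope after a given dose means spontaneous use is decaying after therapy; the dose at which the slope reaches zero marks the boundary between the two regimes.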
Directional error (A), normalized population vector (PV) (B), and spontaneous arm use (C) in the immediate and follow-up tests. After too few rehabilitation trials, directional error worsens once therapy ends. In contrast, after sufficient rehabilitation trials, directional error keeps improving even after therapy ends. Similar bistable patterns are seen for the normalized population vector and spontaneous use in (B) and (C).
As expected, the minimal intensity of effective therapy depends on lesion size (
(A) Number of rehabilitation trials required to reach the effective rehabilitation threshold, as a function of lesion sizes. (B) Normalized population vector (PV) as a function of lesion size in the follow-up test after 800 rehabilitation trials.
Motor performance can be judged according to two different criteria: accuracy (low bias of error) and precision (low variance of error).
(A) Contralateral (affected) arm and (B) ipsilateral (nonaffected) arm. In each panel, the thick solid line corresponds to the changes occurring from just before stroke to the 500th free choice trial following stroke onset. The thin solid line represents additional changes in a no therapy condition (3,000 free choice trials). The dotted line represents additional changes in a therapy condition (1,000 therapy trials followed by 2,000 free choice trials). After stroke, accuracy and variability of the contralateral arm worsened. Following therapy, accuracy improved but with little change in variability. With no therapy, behavioral compensation with the nonaffected arm further developed, resulting in improved accuracy for this arm (B).
We then studied the organization and reorganization of the cells' preferred directions in each hemisphere before lesion, after lesion, and after therapy. Using pie histograms (
Reorganization of the affected (left) hemisphere (A) and nonaffected (right) hemisphere (B) after stroke followed by therapy or no therapy. In each panel, histograms of the cells' preferred directions are shown (1) before stroke, (2) after stroke with 500 free choice trials, and (3) after 3,000 free choice trials or (4) after 1,000 forced use training trials and subsequent 2,000 free choice trials. The gray area in (A.2) shows the lesion site. Before the lesion, the left hemisphere contains more neurons with preferred directions in the right workspace, and the right hemisphere contains more neurons for the left workspace because of the bias for workspace preference. Just after lesion, the left hemisphere is affected. If no therapy follows, the size of the affected range increases, and the number of neurons for the fourth quadrant increases in the affected hemisphere (maladaptation) and in the first quadrant in the nonaffected hemisphere (A.3). On the contrary, the number of neurons for the first quadrant in the right hemisphere increases due to compensation. After therapy (1,000 forced use trials followed by 2,000 free choice trials), however, the distributions of directions are similar to the prelesion distribution in both hemispheres.
Motor training with the affected arm has a profound effect on reorganization in the affected hemisphere. After sufficient therapy, the distribution of the surviving cells' preferred directions is similar to the prelesion distribution, with, however, fewer cells coding each direction, because the total number of cells is reduced (
Two patterns of reorganization are noteworthy in the affected hemisphere. First, the size of the affected range increased compared to just after the lesion; second, a large number of cells now code for movements in the fourth quadrant. If no therapy or insufficient therapy is provided, the directional error of the affected arm does not decrease (
Without the unsupervised learning term, reorganization follows different patterns: Therapy has less of an effect on reorganization, and lack of therapy does not lead to overrepresentation of compensatory movements in the affected hemisphere or in the nonaffected hemisphere (
Reorganization of the affected (left) hemisphere (A) and nonaffected (right) hemisphere (B) after stroke followed by therapy or no therapy. In each panel, histograms of the cells' preferred directions are shown (1) before stroke, (2) after stroke with 500 free choice trials, and (3) after 3,000 free choice trials or (4) after 1,000 forced use training trials and subsequent 2,000 free choice trials. The gray area in (A.2) shows the lesion site.
To better understand the respective roles of the supervised, unsupervised, and reinforcement learning rates on behavior, we then performed a sensitivity analysis for these three parameters on directional error for different durations of therapy (200, 400, and 800 therapy trials) followed by 3,000 free choice trials. As shown in
Effect of the supervised learning rate (A), the unsupervised learning rate (B), and the reinforcement learning rate (C) on directional error after different durations of therapy (200, 400, and 800 therapy trials) followed by 3,000 free choice trials. The default parameters used in simulations are shown with the gray vertical lines.
We further studied the conditions under which the threshold appears by setting each of the three rates to 0 and keeping the other two to the default values. With such learning rate settings, we plotted the directional error, normalized population vector, and spontaneous hand use (
We proposed a novel model of bilateral reaching that links different levels of analysis, as it combines a simplified but biologically plausible neural model of the motor cortex, a biologically plausible (but nonneural) model of reward-based decision-making, and physical therapy intervention at the behavioral level. Because our model is based on sound theoretical principles and neural mechanisms, it allows us to explore the nonlinear interactions between performance and spontaneous use in stroke recovery.
Our motor cortex model, by learning to minimize both directional errors and variability, accounts for the reversal of the loss of cortical representation after rehabilitation, and the increase of this loss together with the increase of the representation of neighboring areas without rehabilitation
In the lesioned cortex, during therapy, the supervised learning rule ensures that underrepresented directions are “repopulated,” decreasing average reaching errors. However, because there are fewer surviving neurons overall after stroke, stroke leads to a decrease in population vector magnitude (
During therapy, the unsupervised learning rule is “adaptive” as its effect reinforces that of the supervised learning rule (compare
To our knowledge, the present computational neural model is the first developed to make specific behavioral and neuronal predictions on the efficacy of physical therapy interventions. Two previous models have been developed to account for behavior after stroke
Our model is in accord with the most recent understanding and comprehensive view of the basal ganglia function in adaptive selection of alternative actions
In a recent motor cortex model
Because of its simplicity, our model provides clear insights into a range of factors affecting recovery of arm use after stroke. However, our model does suffer from a number of limitations:
The simplistic coding of the reach movements by the motor cortex neurons does not account for how activity of motor cortex neurons also correlates with joint torque and muscle activity
A related limitation is the lack of proximal and distal representation in our motor cortex model. In the biological motor cortex, individual joints are controlled by somewhat overlapping neural groupings forming somatotopically organized and plastic motor cortical maps. Empirical results of map reorganization after lesion have focused on remapping of the hand region
A third limitation is our simplistic model of stroke, akin to that used in animal models of stroke. These ignore the motor impairments due to diffuse lesions to a number of brain areas and tracts, and not just to the motor cortex. In particular, our model cannot study the differential effect of cortical, subcortical and combined cortical–subcortical strokes and thus cannot account for differential response to rehabilitation for different stroke locations (e.g.,
To resolve the limitations, in the future we will expand our model by adding arm and muscle models controlled by neurons grouped in adaptive motor cortical maps. We plan to investigate the tradeoff between proximal and distal regions, with cortical motor maps that change during training on tasks that require more skilled use of the hand itself. Moreover, the notion that the action choice model may correspond to the basal ganglia opens up promising lines of investigation.
In summary, despite our considerable simplifications of movement representation in the motor cortex and of the simulated lesions, our results show that our proposed mechanism of motor learning and plasticity, and the ensuing results (recovery, threshold, and neural reorganization) are general and not particular to the specifics of our model.
Our model makes the following testable behavioral and neural predictions.
If spontaneous use of the affected arm is above a threshold level after therapy, repeated spontaneous attempts to use the affected arm lead to further improvements in motor performance, which in turn increase the “value” of using the arm (
If spontaneous arm use is below this threshold after therapy, compensatory movements are reinforced. Consequently, spontaneous use and motor performance of the affected limb decrease (
The dose of task practice necessary to reach the threshold depends on stroke severity, and no amount of rehabilitation will be sufficient to reach this threshold for most strokes that are classified as severe (
Unless the stroke impairment is too severe, the dose of rehabilitation can be adjusted for each patient such that spontaneous arm use reaches this critical threshold after rehabilitation. If the stroke is too severe, however, motor retraining is “in vain” (
After effective motor retraining, movement accuracy can return close to its prestroke levels, but movement variability will be higher than prestroke (
After noneffective retraining, compensatory movements, either with the same limb or the other limb or both, will become less variable (
The hemisphere contralateral to the lesion undergoes reorganization of preferred reach directions along with the development of compensatory reach movements in the affected range (
Both supervised learning-like (error driven) and unsupervised learning-like (use driven) plastic phenomena drive reorganization in the motor cortex during skill learning in the normal brain and after stroke (
In our model, neural reorganization generates bistability at the behavioral level: after therapy, spontaneous arm use will stabilize at either a low or a high value, depending on the amount of therapy. Specifically, therapy is effective and could be stopped if spontaneous arm use reaches a certain threshold, as the repeated spontaneous arm use following therapy provides a form of motor learning that further “bootstraps” performance. Below this threshold, however, motor retraining is “in vain”: there is little or no long-term spontaneous arm use after training, and the model exhibits “learned nonuse,” as has been proposed in patients with brain lesions
We thus predict that a measure of spontaneous arm use may be a good indicator for determining the optimal duration of therapy. In current rehabilitation practice, all rehabilitation is concentrated in the weeks following stroke. Our model suggests that rehabilitation protocols instead adopt a spaced and adaptive train–Test A–wait–Test B–train paradigm: short bouts of training (train) are followed by a spontaneous arm use test (Test A), no training for several weeks (wait), and another spontaneous arm use test (Test B). If spontaneous arm use measured on Test B has increased since Test A, the threshold has been reached, and rehabilitation can be terminated. If spontaneous arm use is still low or has decreased since Test A, another bout of rehabilitation is called for. This pattern is repeated until the threshold is reached. Note that such a training paradigm will have the additional benefit of making use of the “spacing effect,” in which spaced training leads to superior retention of learned skills
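The adaptive train–test–wait–test loop can be sketched as a simple stopping rule. Everything about the toy patient below (starting use, gain per training bout, drift during the wait period, the 0.5 tipping point) is a hypothetical stand-in chosen only to exercise the loop; it is not derived from the model or from patient data.

```python
def run_protocol(train, measure_use, wait, max_bouts=10):
    """Repeat training bouts until use rises during the wait period.

    Returns the number of bouts delivered when Test B exceeds Test A,
    or None if that never happens within max_bouts.
    """
    for bout in range(1, max_bouts + 1):
        train()
        test_a = measure_use()
        wait()
        test_b = measure_use()
        if test_b > test_a:   # use kept rising without training: stop therapy
            return bout
    return None

class ToyPatient:
    """Hypothetical bistable patient: use drifts up above 0.5, down below it."""
    def __init__(self, use=0.2):
        self.use = use
    def train(self):
        self.use += 0.15
    def wait(self):
        self.use += 0.05 if self.use > 0.5 else -0.05
    def measure_use(self):
        return self.use

patient = ToyPatient()
bouts = run_protocol(patient.train, patient.measure_use, patient.wait)
```

The rule needs only two measurements of spontaneous use per cycle, so it does not require knowing the threshold value in advance.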
Supplemental materials: Learning rule derivation.
(0.40 MB DOC)
Effect of supervised learning. (A) Directional error, (B) normalized population vector (PV), and (C) spontaneous arm use after different durations of therapy followed by 0 free choice trials (immediate) and 3,000 free choice trials (follow-up) without supervised learning. Unlike in the full model (see
(0.17 MB TIF)
Effect of unsupervised learning. (A) Directional error, (B) normalized population vector (PV), and (C) spontaneous arm use after different durations of therapy followed by 0 free choice trials (immediate) and 3,000 free choice trials (follow-up) without unsupervised learning. Unlike in the full model (see
(0.18 MB TIF)
Effect of reinforcement learning. (A) Directional error, (B) normalized population vector (PV), and (C) spontaneous arm use after different durations of therapy followed by 0 free choice trials (immediate) and 3,000 free choice trials (follow-up) without reinforcement learning. Unlike in the full model (see
(0.17 MB TIF)
We thank Carolee Winstein, Julie Tilson, James Gordon, Erhan Oztop, James Bonaiuto, and Jill Stewart for their helpful comments on a previous draft of this manuscript.