Robust Brain-Machine Interface Design Using Optimal Feedback Control Modeling and Adaptive Point Process Filtering

doi:10.1371/journal.pcbi.1004730

Fig 1.

Adaptive OFC-PPF BMI architecture.

(A) Monkey performing the self-paced delayed center-out movement task in brain control. During brain control, the subject’s arms were confined within a primate chair. (B) Timeline of the center-out task (see Experimental Procedures for details). (C) Adaptive OFC-PPF converts the spiking activity into a discrete time-series of 0’s and 1’s by binning the spikes in small intervals containing at most one spike; this binary time-series is modeled as a point process. We thus perform the decoding and parameter adaptation with every binary spike event. (D) Adaptive OFC-PPF architecture. The architecture models the brain in closed-loop BMI control as an infinite-horizon optimal feedback-controller in order to infer the brain’s intended velocity during adaptation. The inputs to the infinite-horizon optimal feedback-controller model are the visual feedback of the decoded cursor kinematics (that the monkey observes) and the instructed target position (i.e., task goal). The inferred intended velocity is input to a point process filter for each neuron, which estimates the neuron’s parameters with every 0 and 1 spike event. These estimated parameters are used in the kinematic PPF decoder that decodes the kinematics with every 0 and 1 spike event. Initially, the architecture can provide assisted training to the subject by decoding the kinematics using a target-directed PPF decoder (see Methods). After assisted training is complete, a random-walk PPF is used to decode the kinematics. Once performance converges, adaptation stops and the trained random-walk PPF is used by the monkey to perform various BMI tasks, such as the center-out or the target-jump tasks.
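The binning step described in panel (C) can be illustrated with a minimal sketch. The function name `bin_spikes`, the use of milliseconds, and the 5 ms bin width in the usage line are assumptions made for illustration; the caption only states that the bins are small enough to contain at most one spike.

```python
def bin_spikes(spike_times_ms, t_start_ms, t_end_ms, bin_width_ms):
    """Convert spike times into a 0/1 sequence with one entry per bin.

    Hypothetical helper: raises if a bin would hold more than one spike,
    which signals that the chosen bin width is too coarse for a
    binary point-process representation.
    """
    n_bins = int(round((t_end_ms - t_start_ms) / bin_width_ms))
    binary = [0] * n_bins
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // bin_width_ms)
        if 0 <= idx < n_bins:
            if binary[idx] == 1:
                raise ValueError("bin width too large: two spikes in one bin")
            binary[idx] = 1
    return binary


# Three spikes in a 20 ms stretch, binned at an assumed 5 ms resolution.
series = bin_spikes([1.2, 7.1, 18.3], 0.0, 20.0, 5.0)
```

Each entry of `series` corresponds to one "0 or 1 spike event" on which decoding and parameter adaptation would be performed.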


Fig 2.

Adaptive OFC-PPF flow-chart.

(A) The architecture proceeds with decoding as follows. The decoder parameters are first initialized as desired. The architecture then starts a period of combined assisted training and spike-event-based closed-loop adaptation. Once non-assisted performance in a test period exceeds a desired threshold, assistance stops and a random-walk PPF (termed here RW-PPF) is used for control. Spike-event-based adaptation continues until performance saturates. At that point, adaptation stops and the random-walk PPF with the converged parameters is used by the subject for volitional control of movements in various tasks. The green parts of the flow-chart show the dynamic assisted training procedure. (B) The architecture during the different stages of adaptation and assisted training.
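The stage transitions in panel (A) can be summarized as a small state machine. This is only a sketch of the control flow stated in the caption: the names `Stage` and `next_stage` are hypothetical, and how saturation of performance is detected is left as a boolean input because the caption does not specify it.

```python
from enum import Enum


class Stage(Enum):
    ASSISTED_ADAPTATION = 1    # assisted training plus spike-event-based adaptation
    UNASSISTED_ADAPTATION = 2  # RW-PPF control, adaptation still running
    FIXED_DECODER = 3          # RW-PPF with converged parameters, no adaptation


def next_stage(stage, test_rate, threshold, performance_saturated):
    """Return the stage that follows `stage` under the caption's rules."""
    if stage is Stage.ASSISTED_ADAPTATION and test_rate > threshold:
        # Non-assisted performance in the test period exceeded the threshold.
        return Stage.UNASSISTED_ADAPTATION
    if stage is Stage.UNASSISTED_ADAPTATION and performance_saturated:
        # Performance saturated, so adaptation stops.
        return Stage.FIXED_DECODER
    return stage
```

Calling `next_stage` once per test period reproduces the order of stages in the flow-chart: assistance is dropped first, and adaptation is stopped only afterwards.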


Fig 3.

Performance over the process of adaptive OFC-PPF.

(A) Success rate as a function of time from the start of the experiment in one session. Success rate is calculated in sliding 2 min windows. The decoder was initialized using a visual feedback seed. Adaptive OFC-PPF was then run as described in the flow-chart in Fig 2. The architecture started by providing assisted training and adaptation. After the first assist period, which consisted of 3 discrete assist levels, performance in the test period exceeded the desired threshold of 5 trials/min and hence the architecture stopped the assisted training (vertical dashed black line). We stopped the adaptation at the vertical dashed blue line, after which the trained point process model was used in a random-walk PPF to control the cursor. Green lines show the 99% upper bound on the chance level performance during the assisted training. Assisted performance is above the 99% chance level. The horizontal dashed line shows the mean manual task performance with the arm on that day. (B) Randomly selected center-out trials on this day after adaptation stopped.
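As an illustration of the sliding-window success rate in panel (A), the sketch below counts successful trials in a 2 min window and expresses the result in trials/min, the unit used for the threshold. The function name and the choice of a trailing window (rather than a centered one) are assumptions; the caption does not say how the window is aligned.

```python
def sliding_success_rate(success_times_s, t_eval_s, window_s=120.0):
    """Successful trials per minute in the window (t_eval_s - window_s, t_eval_s]."""
    count = sum(
        1 for t in success_times_s
        if t_eval_s - window_s < t <= t_eval_s
    )
    return count / (window_s / 60.0)


# Hypothetical times (in seconds) at which trials were completed successfully.
successes = [10, 30, 50, 70, 90, 110, 130]
rate_at_2min = sliding_success_rate(successes, 120.0)
```

Evaluating the function on a grid of `t_eval_s` values would give a curve of the kind plotted against time from the start of the experiment.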


Fig 4.

OFC intention estimation results in higher PPF performance.

(A, B) Sample decoded trajectories (black) and decoded velocities (orange), together with the intended velocities inferred by the existing CursorGoal method of intention estimation (red) [20] on the left and by the N-OFC model (magenta) and the instant-OFC model (blue) on the right. The CursorGoal method [20] obtains the intended velocity by rotating the decoded velocity vectors towards the target while keeping their speed unchanged (the speed is set to zero at the target). Since the instant-OFC model outperformed the N-OFC model, we used the instant-OFC model for intention estimation. (C) Steady-state performance of the PPF decoder trained using the OFC method of intention estimation (blue) vs. the CursorGoal method of intention estimation (red). Bars indicate average values and error bars indicate s.e.m.
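The CursorGoal rule described in this caption (rotate the decoded velocity towards the target, keep its speed, and set the speed to zero at the target) can be sketched in two dimensions as follows. The function name and the `target_radius` parameter used to decide when the cursor is "at the target" are assumptions for illustration, not details given in the caption.

```python
import math


def cursor_goal_velocity(decoded_velocity, cursor_pos, target_pos, target_radius):
    """Rotate the decoded velocity to point at the target, preserving speed."""
    vx, vy = decoded_velocity
    dx = target_pos[0] - cursor_pos[0]
    dy = target_pos[1] - cursor_pos[1]
    dist = math.hypot(dx, dy)
    if dist <= target_radius:
        # Cursor is treated as being at the target: intended speed is zero.
        return (0.0, 0.0)
    speed = math.hypot(vx, vy)
    # Unit vector towards the target, scaled by the decoded speed.
    return (speed * dx / dist, speed * dy / dist)


# Decoded velocity (3, 4) has speed 5; the target lies along the +x axis.
intended = cursor_goal_velocity((3.0, 4.0), (0.0, 0.0), (10.0, 0.0), 1.0)
```

Note that this rule changes only the direction of the decoded velocity, whereas the OFC-based methods in the caption infer the intended velocity from a feedback-control model.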


Fig 5.

Spike-event-based adaptation enables faster convergence.

(A) Performance over time for adaptive OFC-PPF (solid) and SmoothBatch OFC-PPF (dashed) run on two sets (red and blue) of two consecutive days that started from the same initial parameters. Vertical lines show the time point where assistance stopped as the subject’s non-assisted success rate in the test period at that point exceeded the desired minimum threshold of 5 trials/min. Success rate is calculated in sliding 2 min windows. (B, C) Average success rate across sessions as a function of time into the adaptive session for SmoothBatch OFC-PPF in (B) and Adaptive OFC-PPF in (C). Blue curves show the mean success rate over 12 days of experiments for each decoder and shading reflects the standard deviation across these days. The red bar shows the time range in which the BMI architecture stopped the assisted training across days. Spike-event-based adaptation resulted in faster convergence and less variability compared with SmoothBatch adaptation that updated the decoder parameters on a slower adaptation time-scale, i.e., once every 90 seconds.


Fig 6.

Adaptive OFC-PPF is robust to initialization.

(A) Performance over time for adaptive OFC-PPF that was initialized once using a visual feedback seed and once using a permuted visual feedback seed on the same day. Vertical dashed lines show the time point at which the architecture stopped the assisted training. Regardless of the initial seed, performance converges to similar values in these two sessions. Note that initial performance with both the visual feedback and the permuted visual feedback seeds was poor, and hence assistance was used to allow the subject to perform the task initially while the parameters were adapting. (B–D) Convergence of the point process parameters for an example neuron as a function of time, when starting from the two different seeds. The baseline firing rate is shown in (B) and α^c for the velocity components in the two dimensions are shown in (C) and (D) (see Eq (5)).
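To make the roles of the plotted parameters concrete, the sketch below assumes a log-linear velocity tuning model in which neuron c fires at rate exp(β^c + α^c · v). Eq (5) is not reproduced in this listing, so this functional form, the symbol β^c, and the function names are assumptions; only the existence of a baseline firing rate and of α^c for the two velocity components is taken from the caption.

```python
import math


def firing_rate_hz(beta, alpha, velocity):
    """Assumed log-linear tuning: rate = exp(beta + alpha . velocity)."""
    return math.exp(beta + alpha[0] * velocity[0] + alpha[1] * velocity[1])


def spike_probability(rate_hz, bin_width_s):
    """Approximate probability of a spike in one small bin (rate times width)."""
    return rate_hz * bin_width_s


# Under this assumed model, the baseline rate is the rate at zero velocity.
baseline = firing_rate_hz(math.log(10.0), (0.2, -0.1), (0.0, 0.0))
p_spike = spike_probability(baseline, 0.005)  # 5 ms bin width is an assumption
```

Under this assumed form, convergence of the baseline rate in (B) and of the two components of α^c in (C) and (D) would together determine the neuron's modeled response to any velocity.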


Fig 7.

Adaptive OFC-PPF extends to tasks beyond those used for CLDA training.

(A) Randomly selected sample trajectories in the target-jump task. The gray circle shows the initial target and the cyan circle shows the eventual target after the jump occurred. The unfilled circle on the trajectory shows the time at which the jump occurred. The monkey used a random-walk PPF trained on the center-out task to perform this target-jump task. (B) Randomly selected sample trajectories in the target-to-target task. Each trial type consisted of a start target and an end target. Instead of going from the center to one of eight peripheral targets as in the center-out task, here the monkey had to move the cursor from one target to another target (whose locations could also differ from those in the center-out task).
