Optimal Control Predicts Human Performance on Objects with Internal Degrees of Freedom

On a daily basis, humans interact with a vast range of objects and tools. A class of tasks that poses a serious challenge to our motor skills are those involving the manipulation of objects with internal degrees of freedom, such as folding laundry or using a lasso. Here, we use the framework of optimal feedback control to predict how humans should interact with such objects. We confirm the predictions experimentally in a two-dimensional object manipulation task in which subjects learned to control six different objects with complex dynamics. We show that the non-intuitive behavior observed when controlling objects with internal degrees of freedom can be accounted for by a simple cost function representing a trade-off between effort and accuracy. In addition to a simple linear, point-mass optimal control model, we also used an optimal control model that takes into account the non-linear dynamics of the human arm. We find that the more realistic optimal control model captures aspects of the data that cannot be accounted for by the linear model or by previous theories of motor control. The results suggest that our everyday interactions with objects can be understood by optimality principles, and they advocate the use of more realistic optimal control models in the study of human motor neuroscience.


Linear, point-mass optimal control model
Let m_h be the mass of the hand, and let τ_1 and τ_2 be the time constants of the second-order linear muscle filter; together these define the continuous-time state-space equation of the system. For computational reasons the problem needs to be discretized, and the discretization was performed using a matrix exponential with a time step of 0.01 s.
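As a concrete illustration, the point-mass dynamics and their matrix-exponential discretization can be sketched as follows. This is a minimal sketch, not the authors' exact implementation: the state ordering, the arrangement of the second-order muscle filter as two first-order filters in series, and the parameter values are assumptions.

```python
import numpy as np
from scipy.linalg import expm

# Assumed state: [position, velocity, muscle force f, filter state g]
m_h = 1.0                 # hand mass (kg); illustrative value
tau1, tau2 = 0.04, 0.04   # muscle filter time constants (s); illustrative
dt = 0.01                 # discretization time step (s), as in the text

# Continuous-time dynamics x_dot = A x + B u, with the second-order
# muscle filter written as two first-order low-pass filters in series.
A = np.array([[0.0, 1.0, 0.0,          0.0],
              [0.0, 0.0, 1.0 / m_h,    0.0],
              [0.0, 0.0, -1.0 / tau2,  1.0 / tau2],
              [0.0, 0.0, 0.0,         -1.0 / tau1]])
B = np.array([[0.0], [0.0], [0.0], [1.0 / tau1]])

# Exact discretization via the matrix exponential of the augmented system:
# expm([[A, B], [0, 0]] * dt) = [[A_d, B_d], [0, I]].
M = np.zeros((5, 5))
M[:4, :4] = A
M[:4, 4:] = B
Md = expm(M * dt)
A_d, B_d = Md[:4, :4], Md[:4, 4:]
```

The augmented-matrix trick yields the zero-order-hold discretization in one call, avoiding a separate numerical integral for B_d.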

Non-linear, two-link arm optimal control model
We used the algorithm developed by Todorov & Li (2005) [1], which is available from http://www.cs.washington.edu/homes/todorov/. Their dynamics model of the arm was left unchanged, and the following state update matrix for the object was added to the existing code:
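For concreteness, an object state update of the kind described, appended to the arm state, might look as follows. This is purely illustrative: the actual objects, their parameters, and the coupling used by the authors are not reproduced here; the sketch assumes a point-mass object attached to the hand by a spring-damper.

```python
import numpy as np

# Hypothetical object: point mass m_o coupled to the hand by a
# spring (stiffness k) and a damper (coefficient b). Values illustrative.
m_o, k, b = 1.0, 100.0, 20.0
dt = 0.01

def object_step(p_o, v_o, p_h, v_h):
    """One Euler step of the object state given the current hand state."""
    f = k * (p_h - p_o) + b * (v_h - v_o)  # spring-damper coupling force
    v_o_next = v_o + dt * f / m_o
    p_o_next = p_o + dt * v_o
    return p_o_next, v_o_next

# With the hand held still, the object should settle at the hand position.
p_o, v_o = 0.0, 0.0
for _ in range(2000):
    p_o, v_o = object_step(p_o, v_o, p_h=0.1, v_h=0.0)
```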

LQR with incomplete state observation and sensorimotor delay
To investigate possible effects of a sensorimotor delay on the simulation results, we adapted the linear optimal control model above in accordance with [2]. This was done by changing the model from one of complete state observation to one of incomplete state observation, where H is the observation matrix and ω(t) is a sensory noise term with mean 0 and covariance matrix Ω_ω. This formulation already implies a time delay of one time step. A sensorimotor delay of a total of 10 time steps (i.e. 100 ms, which is roughly the time needed to respond to a visual perturbation [3][4][5]) was implemented using an augmented state: an augmented observation matrix H̃ extracts the component Hx(t − 9) of x̃(t), and an augmented dynamics matrix Ã removes Hx(t − 9), shifts the remaining sensory readings, and includes Hx(t) in the next state x̃(t + 1). The sensory noise terms were set to 0, except for the hand and object positions and the hand and object velocities, which were set to 0.01 m and 0.1 m s^−1 respectively. All other parameter settings were kept the same as in the model without delay. The results of the simulations are displayed in Figure S1. The predictions of the delayed linear optimal controller are quantitatively only slightly better (they explain 1%–5% more of the variance than the non-delayed version of the model) but are qualitatively very much the same. This is because there are no unanticipated perturbations in our task, and because the trials we analyse are at the end of learning, when subjects already have a very good idea of the dynamics of the objects. Subjects are therefore likely to have used mostly feed-forward control rather than correcting online using the visual and haptic feedback provided by the virtual reality setup.
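The augmented-state construction for the delay can be sketched as follows. This is a minimal sketch: the block layout is one standard way of implementing the description above, not necessarily the authors' exact code.

```python
import numpy as np

def augment_for_delay(A, H, extra_steps=9):
    """Augment dynamics A and observation H so that the observation
    available at time t is H x(t - extra_steps)."""
    n, m = A.shape[0], H.shape[0]
    N = n + extra_steps * m
    A_aug = np.zeros((N, N))
    A_aug[:n, :n] = A                   # original dynamics
    A_aug[n:n + m, :n] = H              # include H x(t) in the next state
    for i in range(1, extra_steps):     # shift the stored sensory readings
        r = n + i * m
        A_aug[r:r + m, r - m:r] = np.eye(m)
    H_aug = np.zeros((m, N))
    H_aug[:, N - m:] = np.eye(m)        # extract the oldest reading
    return A_aug, H_aug

# Check on a scalar system x(t+1) = 0.9 x(t): the augmented observation
# should lag the state by extra_steps time steps (one further step of
# delay comes from the observation equation itself, giving 10 in total).
A_aug, H_aug = augment_for_delay(np.array([[0.9]]), np.array([[1.0]]))
xa = np.zeros(10)
xa[0] = 1.0
ys = []
for t in range(15):
    ys.append(float(H_aug @ xa))
    xa = A_aug @ xa
```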

Sensitivity Analysis
To analyse how sensitive both optimal control models are to the particular values of w_v, w_e and w_o chosen for the fits in the main article, we performed a sensitivity analysis on the data for condition B-low.
Dependency of R^2 on the parameters w_v, w_e and w_o (Figure S8). We varied all three parameters between one tenth and ten times their initial fitted values and computed R^2 values for all resulting settings. Both models (Figure S8A) are very robust to increases in w_v and w_o or decreases in w_e, which do not affect the model predictions in the range of values investigated. In contrast, increases in w_e worsen the fits slightly, as the controller becomes greedier and the hand moves more directly towards the target. Similarly, decreasing w_o down to one tenth of its initial value reduces the goodness of fit only slightly, as the hand path becomes straighter and the object starts missing the target. Decreases in w_v again worsen the fits slightly, as the hand path becomes more curved and loses its loop mid-way (see below for more details).
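The sweep itself is straightforward to sketch. Illustrative only: simulate_model below is a hypothetical stand-in for running either optimal control model; only the weight names w_v, w_e, w_o and the one-tenth-to-ten-times range are taken from the text.

```python
import numpy as np

def r_squared(data, pred):
    """Coefficient of determination between observed and predicted paths."""
    ss_res = np.sum((data - pred) ** 2)
    ss_tot = np.sum((data - np.mean(data)) ** 2)
    return 1.0 - ss_res / ss_tot

def simulate_model(weights):
    """Hypothetical stand-in for simulating the optimal control model
    with cost weights w_v, w_e, w_o; returns a predicted hand path."""
    t = np.linspace(0.0, 1.0, 100)
    return np.sin(np.pi * t) / weights["w_e"]  # placeholder dynamics

fitted = {"w_v": 1.0, "w_e": 1.0, "w_o": 1.0}
data = simulate_model(fitted)            # stand-in for the measured paths
multipliers = np.logspace(-1.0, 1.0, 9)  # one tenth to ten times the fit

scores = {}
for name in fitted:
    scores[name] = []
    for mult in multipliers:
        w = dict(fitted)
        w[name] = fitted[name] * mult
        scores[name].append(r_squared(data, simulate_model(w)))
```

Plotting each list in `scores` against `multipliers` on a log axis reproduces the layout of a sensitivity figure such as S8.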
Effect of w_e on model predictions (Figure S9). Results of varying the effort weight w_e are shown in Figure S9. Effect of w_o on model predictions (Figures S10 and S11). Results are shown for the linear model in Figure S10 and for the non-linear model in Figure S11. When the object weight w_o gets smaller, the hand path becomes straighter and the object starts missing the target. Conversely, when the object weight w_o gets larger, the object path becomes straighter and eventually the hand starts missing the target (non-linear model: w_o = 10^3).
Effect of w_v on model predictions (Figures S12 and S13). The effort weight w_e and the object weight w_o were held at their fitted values; results are shown for the linear model in Figure S12 and for the non-linear model in Figure S13. When the velocity weight w_v gets smaller, the hand and object paths become more curved and the hand path loses its loop mid-way. Increasing the velocity weight did not result in any significant changes over the range of values investigated.

LQR with model uncertainty and incomplete learning
To investigate the effects of model uncertainty and incomplete learning, we adapted the linear optimal control model above in accordance with [6]. Incomplete learning of the internal model was implemented by multiplying all entries in the A matrix relating to the object dynamics by a scaling factor α = 0.8 (α = 1 would correspond to fully learned object dynamics, as before). In addition, model uncertainty was introduced by adding state-dependent noise to the dynamics, where γ_t is a Gaussian scalar random variable with mean 0 and standard deviation 1, and V is a scaling matrix that sets the variance of the model parameter uncertainty. For the simulations, σ was set to 0.2. The implementation is based on incomplete state observation, and the sensory noise terms were set to 0, except for the hand and object positions (0.01 m) and the hand and object velocities (0.1 m s^−1). All other parameter settings were kept the same as in the model without delay.
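A single simulation step of this scheme can be sketched as follows. This is a minimal sketch under stated assumptions: the toy A matrix, the mask marking the object-dynamics entries, and the choice V = σ times the learned object entries are all illustrative, not the authors' exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, sigma = 0.8, 0.2   # incomplete learning and uncertainty, as in text
dt = 0.01

# Toy A matrix: top-left block is hand dynamics; the bottom-right block
# (marked by obj_mask) stands in for the object-dynamics entries.
A = np.array([[1.0, dt,  0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, dt],
              [0.0, 0.0, -1.0, 0.9]])
obj_mask = np.zeros_like(A, dtype=bool)
obj_mask[2:, 2:] = True

A_learned = A.copy()
A_learned[obj_mask] *= alpha                # incompletely learned dynamics
V = sigma * np.abs(A_learned) * obj_mask    # scales parameter uncertainty

def step(x):
    """x(t+1) = (A_learned + gamma_t * V) x(t): state-dependent noise."""
    gamma = rng.standard_normal()           # scalar, mean 0, std 1
    return (A_learned + gamma * V) @ x
```

Because the noise multiplies the state, its effect grows with the size of the object's motion, which is what distinguishes model uncertainty from additive sensory or motor noise.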