Novel Hybrid Adaptive Controller for Manipulation in Complex Perturbation Environments

In this paper we present a hybrid control scheme, combining the advantages of task-space and joint-space control. The controller is based on a human-like adaptive design, which minimises both control effort and tracking error. Our novel hybrid adaptive controller has been tested in extensive simulations, in a scenario where a Baxter robot manipulator is affected by external disturbances in the form of interaction with the environment and tool-like end-effector perturbations. The results demonstrated improved performance in the hybrid controller over both of its component parts. In addition, we introduce a novel method for online adaptation of learning parameters, using the fuzzy control formalism to utilise expert knowledge from the experimenter. This mechanism of meta-learning induces further improvement in performance and avoids the need for tuning through trial testing.


Introduction
Modern robots are expected to interact extensively with the environment and with humans [1,2]. This interaction with dynamic and unknown environments requires a control method that maintains stability and task effectiveness despite disturbances. One of the first schemes proposed to control interaction with an unknown environment is impedance control [3]. The environment is modeled as an admittance and the manipulator as an impedance, so that interactive control is achieved through the exchange of energy. Impedance control can be designed on top of adaptive control, which compensates parametric uncertainties [4][5][6]. Adaptive impedance control methods, developed in [7][8][9], have improved the operational performance of a traditional impedance controller. In particular, the work in [9] shows how stability and successful performance can be gradually acquired despite the initial interaction instability typical of tool use such as drilling or carving [10].
Parallel to these developments, studies have shown that the human nervous system can adapt mechanical impedance (e.g. the resistance to perturbations) to succeed in performing tasks in stable and unstable environments [11,12]. This is achieved through co-contraction of agonist/antagonist muscle groups, as demonstrated in Fig 1(a). The nervous system adapts motor commands to stabilise interactions through independent control of impedance and exerted force; the adaptation automatically selects suitable muscle activations to compensate for the interaction force and instability. At the same time, metabolic cost is minimised through the natural relaxation of muscle groups when error is sufficiently small. A model for this learning was introduced in [13,14], which gave rise to a novel kind of non-linear adaptive controller that has been successfully demonstrated on robots [15]. The adaptation of impedance in this biomimetic controller follows a "v-shaped" algorithm, as shown in Fig 1(b). Conventional adaptive control designs typically focus on the estimation of uncertain parameters under stable motion [16]; in comparison, the biomimetic control design is able to acquire stability in unstable dynamics as well as minimise control effort, through adaptation of force and impedance [9]. Similar to muscle relaxation, under stable interaction the controller also demonstrates compliance, which has received much attention in recent research on robotic manipulation [17][18].
The present paper extends this novel adaptive controller in two aspects. The first contribution is hybrid task-space/joint-space control. Controllers are typically implemented either in joint space (corresponding to the actuators) or in Cartesian space (in which case the inverse kinematics must be solved). Both of these control methods have advantages and disadvantages:
• In contrast to joint-space controllers, Cartesian controllers allow for intuitive trajectories in the world space. Objects placed in the workspace typically have a Cartesian representation, e.g. a box placed 0.1 metres in front of the robot.
• On the other hand, robots typically require inputs in joint space, i.e. torques rather than forces and moments. Therefore, joint-space control is less computationally expensive than Cartesian-space control, as it avoids the inverse kinematic problem. This is especially true for under-actuated or redundant robots like the Baxter manipulator.
• Telepresence tasks may be more intuitive in joint space, when an anthropomorphic robot is imitating a human operator.
More specifically to this work:
• Joint control can make the manipulator robust against disturbances along any part of the arm by monitoring joint-space errors.
• Cartesian control is sensitive to task-specific disturbances occurring at the end-effector.
Therefore, a hybrid joint-Cartesian space control scheme is developed and investigated in this paper to take advantage of both control approaches. The Cartesian task we study is that of carrying an object along a given trajectory while disturbances are applied either on the endpoint or along the arm (or both), similar to noise rejection when holding a glass of champagne in a crowded room [19]. This extends developments found in [20] and [21].
Another aspect of adaptive control that has received little attention is the setting of learning parameters. These parameters are typically tuned by the user, in order to complete the task and improve performance, e.g. by minimising the tracking error. Automating the selection of learning parameters is not an easy task. Real-world manipulator systems have complex and unknown dynamics due to interaction with the environment, which are difficult (or in some cases impossible) to model. The neural network-based approach of [22,23] may be used to estimate uncertainties in order to avoid some of these problems. Alternatively, fuzzy logic can be used to transfer expertise from a human operator in order to make rational decisions in the face of imprecise data [24][25][26]. Fuzzy logic has been successfully introduced into control systems to improve performance [27], and recently has been used in non-linear control systems [28] and robot manipulation [29]. This paper thus develops a method based on fuzzy logic to set the learning parameters.
The concepts of this paper will be simulated and tested on one arm of the Baxter robot (Fig 2). Baxter is a bimanual, low cost robot, designed for introductory industrial applications from Rethink Robotics©, which has recently become available in a research version for use in academia.

Control problem
Baxter is required to move along a given trajectory under the influence of a high frequency, low amplitude vibration at the end-effector, simulating the type of disturbance a tool might produce. In addition, a high amplitude and low frequency perturbation is applied to a point on the arm away from the end-effector, to simulate collision with an operator or with the environment. For reference, nomenclature is provided in Table 1.

Robot Dynamics
The robot arm dynamics are given as:
\[ M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \tau_u + \tau_{dist} \tag{1} \]
where $q$ denotes the vector of joint angles, $M(q) \in \mathbb{R}^{n \times n}$ is the symmetric, bounded, positive definite inertia matrix, and $n$ is the number of degrees of freedom (DoF) of the robot arm; $C(q,\dot{q})\dot{q} \in \mathbb{R}^{n}$ denotes the Coriolis and centrifugal force; $G(q) \in \mathbb{R}^{n}$ is the gravitational force; $\tau_u \in \mathbb{R}^{n}$ is the vector of control input torque; and $\tau_{dist} \in \mathbb{R}^{n}$ is the disturbance torque caused by friction, environmental disturbances or loads as described in the next section. The control torques $\tau_u$ are generated by the designed controllers in order to achieve desired performance in terms of motion tracking and disturbance rejection.

Disturbances
We assume that the disturbance torque $\tau_{dist}$ can be broken down into two components, to simulate both a task disturbance at the end-effector, described here as $F_{task}$, and an environmental disturbance $F_{envt}$ applied on the arm. The task disturbance $F_{task}$ of Eq (2) is applied on the endpoint, where $0 < A_p \le 20$ is the amplitude and $100 < \omega_p \le 1000$ the frequency of oscillation in Hertz. In joint space, the torque applied is then $J^T(q)F_{task}$ (Eq (3)), where the Jacobian $J(q)$ is defined through $\dot{x} \equiv J(q)\dot{q}$. The environmental disturbance $F_{envt}$ is given by Eq (4), where $20\,\mathrm{N} < A_r \le 100\,\mathrm{N}$ is the perturbation amplitude, similar to average limits of human push/pull strength [30], and $0.1 < \omega_r \le 1$ the frequency in Hertz, which provides a slowly changing disturbance. To simulate the environmental force $F_{envt}$ being applied at a point on the arm, e.g. at the elbow, the Jacobian matrix $J$ is reduced by a matrix $Z$, defined in Eq (5), where $z$ is the number of joints from the base to the contact point; e.g. if the force is applied on the elbow, $z = 4$. The corresponding joint torque is then derived in Eq (6). The disturbance torque $\tau_{dist}$ in Eq (1) is a combination of the terms in Eqs (3) and (6).
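Table 1. Nomenclature (symbol: description).
$\dot{q} \in \mathbb{R}^{n}$: Joint angular velocity
$\ddot{q} \in \mathbb{R}^{n}$: Joint angular acceleration
$\dot{X} \in \mathbb{R}^{6}$: Cartesian/task-space velocity
$\ddot{X} \in \mathbb{R}^{6}$: Cartesian/task-space acceleration
$q^*$, $X^*$: Desired joint position, Cartesian position
Internal, external force, respectively
$M \in \mathbb{R}^{n \times n}$: Inertia matrix
$C \in \mathbb{R}^{n}$: Coriolis and centrifugal force
$G \in \mathbb{R}^{n}$: Force due to gravity
$e$, $e_x$: Joint, task-space position error, respectively
$\dot{e}$, $\dot{e}_x$: Joint, task-space velocity error, respectively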
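To make the disturbance mapping of this section concrete, the following minimal sketch computes the joint torques produced by an endpoint force and by a force applied part-way along the arm. It is an illustration under stated assumptions, not the simulation code used in the paper: the planar 2×3 Jacobian is made up, both forces are assumed to act along x, and the reduction by $Z$ is assumed to amount to zeroing the Jacobian columns of the joints beyond the contact point. The amplitudes and frequencies are those quoted later in the Hybrid Control section.

```python
import math

def sinusoid(amplitude, freq_hz, t):
    """Sinusoidal perturbation A*sin(2*pi*f*t)."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t)

def jacobian_transpose_times(J, F):
    """Return J^T F for a Jacobian stored as a list of rows (m x n)."""
    n = len(J[0])
    return [sum(J[i][k] * F[i] for i in range(len(J))) for k in range(n)]

def reduce_jacobian(J, z):
    """ASSUMED effect of Z: keep the first z joint columns, zero the rest."""
    return [[val if k < z else 0.0 for k, val in enumerate(row)] for row in J]

# Hypothetical planar Jacobian (rows: x, y; columns: three joints).
J = [[-0.5, -0.3, -0.1],
     [0.8, 0.4, 0.1]]

t = 0.005  # a quarter period of the 50 Hz task disturbance
F_task = [sinusoid(20.0, 50.0, t), 0.0]      # endpoint force, assumed along x
F_envt = [sinusoid(100.0, 0.1042, t), 0.0]   # force on the arm, assumed along x

tau_task = jacobian_transpose_times(J, F_task)
tau_envt = jacobian_transpose_times(reduce_jacobian(J, 2), F_envt)  # contact at joint 2
tau_dist = [a + b for a, b in zip(tau_task, tau_envt)]
```

Because the contact point is at the second joint in this example, the third joint receives no torque from the environmental force, while the endpoint force loads all three joints.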

Adaptive Control

Feedforward controller
Given the dynamics of a manipulator in Eq (1), we employ the controller of Eq (7) as the initial torque input $\tau_r(t)$, where $L(t)\varepsilon(t)$ corresponds to a desired stability margin [9] which produces minimal feedback (similar to the passive impedance effect of muscles and tendons), and the first three terms are feed-forward compensation for the manipulator's dynamics. As in sliding mode control, we use the tracking error $\varepsilon(t)$ of Eq (8), which is formed from $e(t)$ and $\dot{e}(t)$, the joint angle and angular velocity errors of Eq (9), respectively. In addition to the above control input $\tau_r(t)$, we develop two adaptive controllers, in joint space and in task space, as follows.
Joint space adaptive control. The human-like adaptive law for tuning the feed-forward and feedback components of the control torque $\tau_u$ from [9] is applied both in joint and task spaces. The adaptation here is continuous during movement, rather than trial after trial on repeated movements, so that tracking error and effort are continuously minimised. Let us define
\[ \tau_j(t) = -\tau(t) - K(t)e(t) - D(t)\dot{e}(t) \tag{10} \]
where $-\tau(t)$ is the learned feed-forward torque, and $-K(t)e(t)$ and $-D(t)\dot{e}(t)$ are feedback torque terms due to stiffness and damping, respectively. The adaptive laws introduced in [9] for a trajectory of period $T$ are given as:
\[ \begin{aligned} \delta\tau(t) &\equiv \tau(t) - \tau(t-T) = Q_\tau\left(\varepsilon(t) - \gamma(t)\tau(t)\right), \\ \delta K(t) &\equiv K(t) - K(t-T) = Q_K\left(\varepsilon(t)e^T(t) - \gamma(t)K(t)\right), \\ \delta D(t) &\equiv D(t) - D(t-T) = Q_D\left(\varepsilon(t)\dot{e}^T(t) - \gamma(t)D(t)\right). \end{aligned} \tag{11} \]
In the present paper we decouple the forgetting factor $\gamma(t)$ from the gain matrices $Q_{(\cdot)}$ in order to avoid high frequency oscillation, which can occur when both $\gamma$ and $Q_{(\cdot)}$ are large. As mentioned above, we consider the adaptation in continuous time, rather than by iteration over consecutive trials, yielding the joint space adaptation laws:
\[ \begin{aligned} \delta\tau(t) &\equiv \tau(t) - \tau(t-\delta t) = Q_\tau\,\varepsilon_j(t) - \gamma_j(t)\,\tau(t), \\ \delta K_j(t) &\equiv K_j(t) - K_j(t-\delta t) = Q_{K_j}\,\varepsilon_j(t)e_j^T(t) - \gamma_j(t)\,K_j(t), \\ \delta D_j(t) &\equiv D_j(t) - D_j(t-\delta t) = Q_{D_j}\,\varepsilon_j(t)\dot{e}_j^T(t) - \gamma_j(t)\,D_j(t) \end{aligned} \tag{12} \]
where $\delta t$ is the sampling time, $K_j(0) = 0_{[n \times n]}$ and $D_j(0) = 0_{[n \times n]}$. $Q_\tau, Q_{K_j}, Q_{D_j} \in \mathbb{R}^{n \times n}$ are diagonal positive-definite gain matrices. Furthermore, in [9], $\gamma(t) \in \mathbb{R}^{n \times n}$ was diagonal with the form of Eq (13), which requires two tuning variables, $a$ and $b$. To simplify parameter selection, $\gamma$ is redefined as in Eq (14), which requires only one variable, $\alpha_j$, to describe the shape (as shown in Fig 3) while maintaining the same functionality. This also has the advantage of allowing simple application of a fuzzy inference engine, as described in a later section.
Task space adaptive control. Task-space control is designed in a similar manner to joint space.
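Because the joint-space and task-space laws share the same structure, a single scalar update step is enough to illustrate them. The sketch below is a one-degree-of-freedom illustration under stated assumptions, not the authors' implementation: the tracking error is assumed to take the sliding-mode form $\dot{e} + \kappa e$ with a hypothetical gain `kappa`, the forgetting factor is a hypothetical bump function chosen only to be taller and narrower for larger $\alpha$, and the update is evaluated explicitly from the previous parameter values.

```python
import math

def forgetting_factor(eps, alpha):
    """ASSUMED shape (not the paper's Eq (14)): peak alpha at eps = 0,
    taller and narrower as alpha grows."""
    return alpha * math.exp(-(alpha * eps) ** 2)

def adapt_step(tau_ff, K, D, e, e_dot, gains, alpha, kappa=10.0):
    """One scalar step of the decoupled adaptation law, then the
    adaptive torque tau_j = -tau_ff - K*e - D*e_dot."""
    q_tau, q_k, q_d = gains
    eps = e_dot + kappa * e                   # ASSUMED sliding-mode style error
    gamma = forgetting_factor(eps, alpha)
    # Explicit form: the forgetting term uses the previous parameter values.
    tau_ff = tau_ff + q_tau * eps - gamma * tau_ff
    K = K + q_k * eps * e - gamma * K
    D = D + q_d * eps * e_dot - gamma * D
    tau_j = -tau_ff - K * e - D * e_dot
    return tau_ff, K, D, tau_j

gains = (0.5, 0.5, 0.5)  # hypothetical learning gains
# One step with a position error, then one step with perfect tracking.
state_err = adapt_step(0.0, 0.0, 0.0, e=0.1, e_dot=0.0, gains=gains, alpha=0.2)
state_rest = adapt_step(*state_err[:3], e=0.0, e_dot=0.0, gains=gains, alpha=0.2)
```

In this example the stiffness rises from 0 to 0.05 while an error is present and relaxes to 0.04 once the error vanishes, which mirrors the effort-minimising relaxation described in the text.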
First, we define the error term in Cartesian space, $\varepsilon_x$, as in Eq (15). This leads to a change in the feed-forward and feedback terms described in Eq (12), giving the task-space adaptation laws of Eq (16) and the task-space control torque $\tau_x$ of Eq (17); the task-space forgetting factor is defined similarly to Eq (14), in Eq (18).
Fig 3. How the magnitude of α affects the forgetting factor γ. Higher values of α have a high narrow shape, so that when tracking performance is good the control effort is reduced maximally. When tracking performance is poor, the forgetting factor is small, increasing applied feedback torque.
Hybrid Controller. The combination of the basic controller of Eq (7), the joint space controller of Eq (10) and the task space controller of Eq (17) yields the hybrid controller, and therefore the input torque
\[ \tau_u(t) = \tau_r(t) + \tau_x(t) + O\,\tau_j(t) \tag{19} \]
where $O \in \mathbb{R}^{n \times n}$ is a weighting matrix, designed such that the joint torque feedback is limited to certain joints, dependent on the required task. Assuming an accurate dynamic model of the robot is available, the torques due to disturbance $\tau_{dist}$ are given as
\[ \tau_{dist} = M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) - \tau_u \tag{20} \]
i.e. the modeled system torques minus the input torque. By normalising this vector of torques to the maximum element, the weighting matrix $O$ of Eq (21) can be formed, which is then applied to Eq (19), so that joint-space control torque is applied primarily to those joints which are under the influence of large disturbance forces, and less to those which are not; this avoids applying control effort unnecessarily and so reduces the overall control effort.
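The weighting step can be sketched in a few lines. The example below is illustrative only: it assumes that the normalisation uses the magnitudes of the disturbance torques and that $O$ is diagonal, and it returns zero weights when no disturbance is detected, which is a design choice made here and not one taken from the paper. All numerical values are made up.

```python
def disturbance_torque(model_torque, tau_u):
    """Modelled system torque minus applied input torque, per joint."""
    return [m - u for m, u in zip(model_torque, tau_u)]

def weighting_diagonal(tau_dist):
    """Diagonal of O: |tau_dist| divided by its largest element
    (ASSUMPTION: magnitudes are used for the normalisation)."""
    peak = max(abs(t) for t in tau_dist)
    if peak == 0.0:
        return [0.0] * len(tau_dist)  # no disturbance: no joint-space feedback
    return [abs(t) / peak for t in tau_dist]

def hybrid_torque(tau_r, tau_x, tau_j, weights):
    """tau_u = tau_r + tau_x + O tau_j for a diagonal O."""
    return [r + x + w * j for r, x, w, j in zip(tau_r, tau_x, weights, tau_j)]

# Made-up three-joint example.
tau_dist = disturbance_torque([5.0, 2.0, 1.0], [1.0, 1.0, 1.0])
weights = weighting_diagonal(tau_dist)
tau_u_new = hybrid_torque([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [-2.0, -2.0, -2.0], weights)
```

Here the first joint, which carries the largest disturbance, receives the full joint-space torque, the second receives a quarter of it, and the undisturbed third joint receives none.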

Fuzzy Inference of Control Gains
Traditionally, the user sets the learning parameters $Q_{(\cdot)}$ and $\alpha_{(\cdot)}$ based on experience of how the system responds at run-time, in order to ensure good control performance. Here, expert knowledge of the system is distilled into a fuzzy inference engine to tune the gains online, so that no prior user experience is required. An improvement in performance is also expected, as the system will pick appropriate gain values depending on the system response to unpredictable disturbances. Inferences are made according to the magnitudes of the tracking error and control effort, which we want to minimise and which also give a good indication of the overall performance of the controller. There are several steps required for fuzzy inference of an output $Y$. First, fuzzification maps a real scalar value (for example, temperature) into fuzzy space; this is achieved using membership functions. Let $X$ be a space of points, with elements $x \in X$ [31]. A fuzzy set $A$ in $X$ is described by a membership function $\mu_A(x)$ associating a grade of membership $\mu_A(x_i)$ in the interval [0, 1] to each point $x$ in $X$.
In this paper we use simple triangular membership functions, which have low sensitivity to change in input and are computationally inexpensive [32]. Additionally, from [32], all membership functions are set so that the completeness of all fuzzy sets is 0.5; this reduces uncertainty by eliminating areas in the universe of discourse with low degrees of truth, and also ensures reasonable overshoot, as described in [33].
Several definitions are required. A union, which corresponds to the connective OR, of two sets $A$ and $B$ is a fuzzy set $C$
\[ \mu_C(x) = \max\left[\mu_A(x), \mu_B(x)\right], \quad x \in X. \tag{22} \]
An intersection, which corresponds to the connective AND, can similarly be described:
\[ \mu_C(x) = \min\left[\mu_A(x), \mu_B(x)\right], \quad x \in X. \tag{23} \]
The Cartesian product can be used to describe a relation between two or more fuzzy sets; let $A$ be a set in universe $X$ and $B$ a set in universe $Y$ [34]. The Cartesian product of $A$ and $B$ will result in a relation
\[ R = A \times B \subset X \times Y \tag{24} \]
where the fuzzy relation $R$ has a membership function
\[ \mu_R(x, y) = \min\left[\mu_A(x), \mu_B(y)\right]. \tag{25} \]
This is used in the Mamdani min-implication, to relate an input set to an output set, i.e. IF x is A THEN y is B. A rule set is then used to implicate the output, which is max-aggregated for all rules [25]. Defuzzification is then performed, using the common centroid method [35]. The defuzzified value $y^*$ is calculated using
\[ y^* = \frac{\int \mu(y)\, y\, dy}{\int \mu(y)\, dy} \tag{26} \]
which computes the centre of mass of the aggregated output membership function $\mu(y)$, and relates the $\mu$ value back to a crisp output.
The raw inputs to our fuzzy systems are the joint-space tracking error and effort, $\varepsilon_j$, $\tau_u$, and similarly, in task space, $\varepsilon_x$, $F_u$. Before fuzzification can be performed, the inputs must be normalised so that the inference engine is generic and does not depend on the input magnitude. Baseline averages of the tracking errors $\bar{\varepsilon}_j \in \mathbb{R}^{n}$, $\bar{\varepsilon}_x \in \mathbb{R}^{6}$, input torque $\bar{\tau}_u \in \mathbb{R}^{n}$ and input force $\bar{F}_u \in \mathbb{R}^{6}$ are calculated for each degree of freedom over the $t_f/\delta t$ time steps of the total simulation time, as in Eq (27). These are then used in Eq (28) to calculate the inputs to the fuzzy system, i.e. values which give an indication of performance compared to the previous iteration. For all inputs to our fuzzy systems, a value less than $\sigma$ indicates an improvement and a value greater than $\sigma$ indicates that performance is worse. Here we set $\sigma = 0.5$, so that the input range is roughly between 0 and 1. There is no upper limit to the variables generated in Eq (28), so any input above unity returns a maximum truth value in the 'high' classification.
This allows a generic set of input membership functions to be applied to all systems. These normalised variables are then used in the adaptive laws of Eqs (12), (14), (16) and (18) as $Q_\tau \equiv Q_\tau(\hat{\varepsilon}_j, \hat{\tau}_j)$, $Q_{K_j} \equiv Q_{K_j}(\hat{\varepsilon}_j, \hat{\tau}_j)$, $Q_{D_j} \equiv Q_{D_j}(\hat{\varepsilon}_j, \hat{\tau}_j)$, $\alpha_j \equiv \alpha_j(\hat{\varepsilon}_j, \hat{\tau}_j)$ for the joint-space controller, and correspondingly for the task-space controller.
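One way to realise the normalisation described above is sketched below. The exact scaling is an assumption made for illustration: the baseline is taken as the mean absolute value over all recorded time steps, and the current value is scaled so that matching the baseline maps to $\sigma$. Clipping at 1 is equivalent to letting inputs above unity saturate in the 'high' class. The function names are hypothetical.

```python
def baseline_average(samples):
    """Mean absolute value of a recorded signal over all time steps
    (ASSUMED form of the baseline average)."""
    return sum(abs(s) for s in samples) / len(samples)

def normalised_input(value, baseline, sigma=0.5):
    """ASSUMED scaling: value == baseline maps to sigma; smaller values
    indicate improvement; values above 1 saturate as 'high'."""
    if baseline == 0.0:
        return 1.0 if value != 0.0 else 0.0
    return min(1.0, sigma * abs(value) / baseline)

previous_errors = [1.0, -1.0, 2.0, -2.0]   # made-up record from an earlier run
baseline = baseline_average(previous_errors)
same_as_before = normalised_input(1.5, baseline)
improved = normalised_input(0.75, baseline)
much_worse = normalised_input(6.0, baseline)
```

With these made-up numbers, an error equal to the baseline gives 0.5, half the baseline gives 0.25, and four times the baseline saturates at 1.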
The rules for fuzzy inference of the control gains are set using expert knowledge. In general: IF control effort is too high THEN gain is set low; IF tracking error is poor THEN gain is set high, as shown in Table 2 for $Q_{(\cdot)}$. The truth table for the forgetting factor gain (Table 3) is slightly different, in that $\alpha$ is required to be larger when tracking error is improved. Note that $Q_{(\cdot)}$ and $\alpha$, the outputs of the fuzzy inference system, are bounded as in Eq (29), where the maximum values are set according to previous trials performed without application of the fuzzy system. How changes in control effort and tracking error affect the $Q_{(\cdot)}$ gains is shown in Fig 4(a). It can be seen that in general the gain increases when tracking error is high and control effort is low, and minimal gain occurs when tracking error is low and control effort is high. The surface of fuzzy inference of $\alpha$ is shown in Fig 4(b), where it can be seen that the forgetting factor will be at its greatest when tracking error is low and control effort is high.
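The inference chain described in this section (triangular membership functions, Mamdani min-implication, max-aggregation and centroid defuzzification) can be sketched as follows. This is a minimal illustration, not the authors' rule base: it uses only the two general rules quoted above, two hypothetical classes ('low' and 'high') per variable that cross at a membership of 0.5, and a hypothetical upper bound `gain_max`.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def infer_gain(error, effort, gain_max, steps=200):
    """Mamdani inference of a learning gain from normalised error and effort."""
    error = min(max(error, 0.0), 1.0)
    effort = min(max(effort, 0.0), 1.0)
    # Rule firing strengths (hypothetical two-rule base taken from the text):
    w_gain_low = tri(effort, 0.0, 1.0, 2.0)   # IF effort is high THEN gain is low
    w_gain_high = tri(error, 0.0, 1.0, 2.0)   # IF error is high THEN gain is high
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        y = i / steps  # normalised output universe [0, 1]
        # Min-implication per rule, max-aggregation across rules.
        mu = max(min(w_gain_low, tri(y, -1.0, 0.0, 1.0)),
                 min(w_gain_high, tri(y, 0.0, 1.0, 2.0)))
        num += mu * y
        den += mu
    if den == 0.0:
        return 0.5 * gain_max  # no rule fired: fall back to mid-range (a choice made here)
    return gain_max * num / den  # centroid defuzzification, scaled to the gain bound

gain_poor_tracking = infer_gain(error=1.0, effort=0.0, gain_max=10.0)
gain_high_effort = infer_gain(error=0.0, effort=1.0, gain_max=10.0)
gain_both_high = infer_gain(error=1.0, effort=1.0, gain_max=10.0)
```

The inferred gain is pushed towards the upper bound when tracking is poor and effort is low, towards zero when effort is high and tracking is good, and sits near the middle when both rules fire equally.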

Stability
The stability of the controller in joint space and convergence to a small bounded set were shown in [9], and the proof for the Cartesian space controller is similar. However, here the diagonal adaptation gain matrices $Q_{(\cdot)}$ are time varying, which must be taken into account. From [9] Appendix C, the difference in energy of the system $\delta V(t) = \delta V_p(t) + \delta V_c(t)$ is shown to converge to zero. No change to the derivation of the first part $\delta V_p(t)$ is needed here, so that section of the proof still holds. A change is made in comparison to [9], equations (39-41), where $Q^{-1}_{(\cdot)}$ is replaced with $Q^{-1}_{(\cdot)}(s)$ so that
\[ \begin{aligned} \delta V_c(t) = \frac{1}{2}\int_{t-\delta t}^{t} \Big\{ & \mathrm{tr}\big(\tilde{K}^T(s)Q_K^{-1}(s)\tilde{K}(s) - \tilde{K}^T(s-\delta t)Q_K^{-1}(s-\delta t)\tilde{K}(s-\delta t)\big) \\ & + \mathrm{tr}\big(\tilde{D}^T(s)Q_D^{-1}(s)\tilde{D}(s) - \tilde{D}^T(s-\delta t)Q_D^{-1}(s-\delta t)\tilde{D}(s-\delta t)\big) \\ & + \tilde{\tau}^T(s)Q_\tau^{-1}(s)\tilde{\tau}(s) - \tilde{\tau}^T(s-\delta t)Q_\tau^{-1}(s-\delta t)\tilde{\tau}(s-\delta t) \Big\}\, ds. \end{aligned} \tag{30} \]
Defining a new variable $\delta Q \equiv \mathrm{diag}[I \otimes \delta Q_K, I \otimes \delta Q_D, I \otimes \delta Q_\tau]$ (where $\otimes$ is the Kronecker product) allows us to add another term to the end of [9](44), producing Eq (31). The term inside the last integrand can be described by $\varepsilon_Q \tilde{\Phi}^T\tilde{\Phi}$, as detailed in Eq (32), where $\varepsilon_{K,D,\tau}$ are defined as the minimum eigenvalues of $\Phi^{-1}_{K,D,\tau}$. This can then be added to the condition in [9](46), which gives the inequality
\[ \delta V \ge \lambda_L\|\varepsilon\|^2 + \gamma_{max}\|\tilde{\Phi}\|^2 - \gamma'\|\tilde{\Phi}\|\|\Phi^*\| \ge \lambda_L\|\varepsilon\|^2 + \bar{\gamma}\|\tilde{\Phi}\|^2 - \gamma'\|\tilde{\Phi}\|\|\Phi^*\| \ge 0 \tag{33} \]
where $\gamma' = Q^{-1}\gamma$, and $\bar{\gamma} = \gamma' + \varepsilon_Q$. This is a sufficient condition to prove stability, following the details in Appendix C of [9], and given that $Q(t)$ is bounded by the output of fuzzy inference stipulated in Eq (29).
Table 2. Truth table for inference of output $Q_{(\cdot)}$ based on fuzzy memberships of $\varepsilon_{j,i}$, $\varepsilon_{x,i}$, $\tau_{u,i}$, $F_{u,i}$.
Table 3. Truth tables for inference of output $\alpha_{(\cdot)}$ based on fuzzy memberships of $\varepsilon_{j,i}$, $\varepsilon_{x,i}$, $\tau_{u,i}$, $F_{u,i}$.

Simulations
The task consisted of tracking a smooth minimal jerk trajectory $y^*(t)$ along the y coordinate, defined in Eq (34), where $T$ is the movement duration. Joint-space angular velocity is computed using the pseudo-inverse $J^\dagger(q) \equiv J^T(JJ^T)^{-1}$ of the Jacobian, through
\[ \dot{q}^*(t) = J^\dagger(q)\,[0, \dot{y}^*(t), 0, 0, 0, 0]^T, \tag{35} \]
from which the position and acceleration can be found respectively using Eq (36). Simulations of the proposed task and controller were performed using MATLAB with a kinematic and dynamic Baxter robot rigid joint model, implemented using Peter Corke's Robotics Toolbox [36,37]. To test the controller under different conditions within one continuous run, the two disturbance forces $F_{envt}$ and $F_{task}$ were introduced in different phases:
• Phase I: No disturbance;
• Phase II: $F_{task}$ only;
• Phase III: $F_{envt}$ only;
• Phase IV: $F_{envt}$ and $F_{task}$.
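For readers unfamiliar with minimal jerk profiles, the sketch below evaluates the standard fifth-order minimum-jerk polynomial for a point-to-point movement and its velocity. It is offered as an assumption about the general shape of such a reference, not as a reproduction of Eq (34); the 0.2 m travel distance and 4.8 s duration are borrowed from the Hybrid Control section purely as example values.

```python
def min_jerk_position(t, T, y0, yT):
    """Standard minimum-jerk position profile from y0 to yT over duration T."""
    s = min(max(t / T, 0.0), 1.0)  # normalised time, clamped to [0, 1]
    return y0 + (yT - y0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

def min_jerk_velocity(t, T, y0, yT):
    """Time derivative of the profile above (zero at both ends)."""
    s = min(max(t / T, 0.0), 1.0)
    return (yT - y0) * (30 * s**2 - 60 * s**3 + 30 * s**4) / T

T = 4.8          # example duration in seconds
distance = 0.2   # example travel distance in metres
y_mid = min_jerk_position(T / 2, T, 0.0, distance)
v_mid = min_jerk_velocity(T / 2, T, 0.0, distance)
v_end = min_jerk_velocity(T, T, 0.0, distance)
```

The profile starts and ends at rest and reaches its peak velocity at the midpoint, which is what makes it a smooth reference for the tracking task.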
Performance was analysed in each phase, to observe the controller's reaction to different perturbations. It was expected that joint-space control would improve rejection of $F_{envt}$, and that task-space control would better reject the disturbance caused by $F_{task}$; the order of phases was set so that the adaptation progress would be easier for readers to understand. A performance index, $\eta$, was calculated from the integral of the product of input force $F_u$ and task-space tracking error $\varepsilon_x$, as given in Eq (37), where $Q, R \in \mathbb{R}^{6 \times 6}$ are positive diagonal scaling matrices, and $t_s$ and $t_f$ were set to obtain $\eta$ for each phase of the simulation. A small performance index $\eta$ corresponds to small tracking error and control effort, and thus indicates good performance.
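A possible numerical evaluation of such an index is sketched below. The integrand is an assumption made for illustration: it multiplies the norm of the scaled input force by the norm of the scaled tracking error and integrates with the trapezoidal rule between $t_s$ and $t_f$. A sum of quadratic terms would be an equally plausible reading of two scaling matrices. Two-element vectors and unit scaling are used only to keep the example short.

```python
import math

def weighted_norm(vec, diag):
    """Euclidean norm of a vector scaled element-wise by a diagonal matrix."""
    return math.sqrt(sum((d * v) ** 2 for d, v in zip(diag, vec)))

def performance_index(times, forces, errors, q_diag, r_diag):
    """ASSUMED form: trapezoidal integral of ||Q F_u|| * ||R eps_x||."""
    integrand = [weighted_norm(f, q_diag) * weighted_norm(e, r_diag)
                 for f, e in zip(forces, errors)]
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (integrand[k] + integrand[k - 1]) * (times[k] - times[k - 1])
    return total

times = [0.0, 1.0, 2.0]                  # samples spanning one phase, t_s to t_f
forces = [[3.0, 4.0]] * 3                # made-up constant input force
eta_large_error = performance_index(times, forces, [[0.0, 1.0]] * 3, [1.0, 1.0], [1.0, 1.0])
eta_small_error = performance_index(times, forces, [[0.0, 0.5]] * 3, [1.0, 1.0], [1.0, 1.0])
```

Halving the tracking error at the same effort halves the index in this example, consistent with a smaller $\eta$ indicating better performance.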

Hybrid Control
Performance of the hybrid controller $\tau_u(t) = \tau_r(t) + \tau_x(t) + O\tau_j(t)$ was compared against the controller in joint space only, where $\tau_u(t) = \tau_r(t) + \tau_j(t)$, and in task space only, where $\tau_u(t) = \tau_r(t) + \tau_x(t)$. Disturbance parameters remain the same in each case; for $F_{task}(t)$ defined in Eq (2), $p = 20\sin(2\pi\,50\,t)$, and for $F_{envt}(t)$ from Eq (4) the parameters are $r = 100\sin(2\pi\,0.1042\,t)$. The trajectory period and travel distance were set to 4.8 s and 0.2 m respectively. Each simulation phase corresponds to one completion of the trajectory of Eq (34). The Cartesian tracking error $\varepsilon_x$ in Fig 5(a) for all three control schemes shows how task-space control performs better when a tool-type disturbance is applied, but suffers when a large disturbance is applied away from the end-effector. In this case, joint-space control was able to reduce tracking error more effectively. When combined in the hybrid controller, tracking error was reduced further. From Fig 5(b) it can be noted that there was little difference in the overall amount of control effort being applied between the three methods. The measures of tracking error and control effort were combined to form the performance index $\eta$ for each phase, shown in Fig 5(c). A clear difference could be seen in the performances of the task-space and joint-space controllers between phases II and III, where the disturbance type was switched from $F_{task}$ to $F_{envt}$; task-space control was better at handling the former, and joint-space the latter. The hybrid controller showed a slight improvement over joint-space control in phase II, and exhibited an improvement over both of its component parts in phases III and IV. Considering that $\|\tau_u\|$ was similar for all three, as seen in Fig 5(b), this suggests that the hybrid controller was applying control in a more targeted fashion, i.e. only applying additional feedback to the joints which require it.
Fig 5. (a): In phase II task-space control has the lowest error, and joint-space control the highest, with the hybrid control in between, as expected due to the disturbance type. In the next two phases ($9.6 < t < 19.2$) task-space control produces the highest error, while the hybrid controller shows a much lower tracking error than its component parts. (b): Examining the input torques $\tau_u$, little difference can be seen between the three control schemes. (c): The performance index $\eta$ in each phase demonstrates the limitations of each control type under different disturbance conditions. In particular, task-space control performance is degraded in phases III and IV, where joint-space control is superior. Hybrid control shows improved performance over both.
By examining the evolution of feed-forward torque in Fig 6(a) we see how in phases III and IV large increases were made to compensate for the low frequency F envt disturbance, predominantly in the first joint (the rotation of which is aligned with the x-y plane). Comparing the magnitude of feed-forward torque between controllers it is clear that joint-space control generated much higher torques, while hybrid control torques were much lower and less weighted towards joint 1.
Cartesian stiffness ellipses are shown in Fig 6(b). In task-space and hybrid control, it can be observed how the stiffness changed from a slight orientation in the y-direction (due to the trajectory moving along this axis) to a much larger ellipse predominantly along the x-axis, aligned with the direction of disturbance. Joint-space control, however, produced ellipses less aligned with the direction of disturbance. This shows that feedback torque is being applied inefficiently in this case.

Fuzzy Inference of Control Gains
The effectiveness of the fuzzy inference of control gains $Q_{(\cdot)}$ and $\alpha$ was tested through implementation on the hybrid controller, and compared against results obtained in the previous section (where control gains are fixed). Baseline averages described in Eq (28) and upper limits of adaptation gains were calculated from data collected running the hybrid controller in the previous experiment, which were then used as the input to the fuzzy engines affecting the adaptive laws.
By examining Fig 7(a) we can see that there was an improvement in tracking error in phase II, but not so much in other phases, where it is similar to previous results. However, by comparing the results with Fig 7(b) we can see that although control torque was not reduced in the first two phases, there was a significant reduction in the last two; this demonstrates that the online tuning is able not only to reduce tracking error when control effort is already minimal, but also to reduce the control effort required to maintain good tracking. This is reflected in Fig 7(c), which shows that in all disturbance phases the aggregate performance index score was improved by tuning the learning parameters online.
In Fig 8(a) and 8(b) the feed-forward torques of the proximal joints are compared. We can see that the fuzzy tuning had a much higher response amplitude, although the shape has remained the same. In Fig 8(c) and 8(d), the stiffness ellipse displays a reduced magnitude with fuzzy tuning. This suggests that the online-tuned controller increased feed-forward torque while sacrificing stiffness to reduce the control effort observed in Fig 5(b), although the geometry of the ellipse was maintained in the direction of disturbance.
Fig 8. (a), (b): The shape of evolution through time is similar between the two controllers; however, the fuzzy hybrid controller applies a larger feed-forward torque. (c), (d): Ellipses for the hybrid controller have a higher magnitude than those of the same controller with fuzzy parameter tuning; note that the ellipses in the fuzzy tuning case are drawn at ×0.02 scale. Ellipses in the second phase are elongated in the direction of disturbance.

Conclusions
This paper investigated the ideas of combining joint-space and task-space feedback control to create a hybrid controller, and of online fuzzy tuning of learning parameters.
The controller was based on a bio-inspired design, which has been shown to acquire stable and successful performance with minimal effort. The controller was implemented on a dynamic model of the redundant Baxter robot arm. The results show that the hybrid controller reduces tracking error by around 26% and 16% on average compared with the task-space and joint-space controllers respectively, with only a 6% maximum increase in control effort. This demonstrates that the hybrid controller is able to benefit from both joint-space and Cartesian-based control, providing robustness against disturbances occurring at the end-effector or at any point along the arm.
The results further show how fuzzy inference can be used to set the learning parameters automatically, instead of the normal practice of setting them manually. The simulation results demonstrate an average 24% reduction in control effort and a 15% improvement in overall performance with this fuzzy meta-learning compared with fixed learning parameters, while also avoiding the need for trial testing to select optimum values for the adaptation gains. We also note that the method used to normalise inputs to the fuzzy system may enable iterative performance improvement, as the performance of the current iteration is compared against the previous one, and the fuzzy system seeks to reduce tracking error and control effort as much as possible.