Design of Optimal Hybrid Position/force Controller for a Robot Manipulator Using Neural Networks

The application of quadratic optimization and sliding-mode approach is considered for hybrid position and force control of a robot manipulator. The dynamic model of the manipulator is transformed into a state-space model to contain two sets of state variables, where one describes the constrained motion and the other describes the unconstrained motion. The optimal feedback control law is derived solving matrix differential Riccati equation, which is obtained using Hamilton Jacobi Bellman optimization. The optimal feedback control law is shown to be globally exponentially stable using Lyapunov function approach. The dynamic model uncertainties are compensated with a feedforward neural network. The neural network requires no preliminary offline training and is trained with online weight tuning algorithms that guarantee small errors and bounded control signals. The application of the derived control law is demonstrated through simulation with a 4-DOF robot manipulator to track an elliptical planar constrained surface while applying the desired force on the surface. under the Creative Commons Attribution License, which permits unrestricted use, distribution , and reproduction in any medium, provided the original work is properly cited.


Introduction
In many robotic applications such as assembly, fine polishing, grasping, grinding, and deburring, the robot comes in extensive contact with its environment.These tasks are better dealt with by directly controlling the forces of interaction between the robot and its environment.The task is to exert a desired profile of force in the constrained degrees of freedom while following the reference trajectory in the unconstrained degrees of freedom.This problem is generally referred to as compliant motion control or impedance control or hybrid position/force control problem.
At present, many control algorithms have been proposed for hybrid position/force control.The basic hybrid position/force control scheme was originally proposed by Craig and Raibert [1].It neglected the dynamic coupling among each of the robot joints and developed the control scheme within the framework of robot joint control systems.Khatib and Burdick [2] subsequently remedied coupling problem by considering the control problem within the framework of operational space.Exact decoupling of motion and force equations and system linearization is realized by McClamroch [3] and Yoshikawa [4].Generally, constrained robotic systems are modeled as differential equations subject to the algebraic constraints.McClamroch proposed the dynamic models as singular system of differential equations and developed these models for several robot configurations and presented feedback control schemes for these configurations.Yoshikawa formulated the constraints on the end effector of the manipulator by the hypersurfaces in the effector coordinate frame and computed the joint driving force as the sum of the two forces: the joint driving force for achieving the desired trajectory of the effector position and that for achieving the desired trajectory of the force.Hogan [5][6][7] considered the design of compliant motion which is important when gripping fragile objects or in interacting between cooperating robots.In [8], Su et al. presented the control algorithm using slidingmode approach.Roy and Whitcomb proposed an adaptive force control algorithm for velocity/position-controlled robot arms in contact with surfaces of unknown linear compliance with exact dynamic considerations [9].Su along Yury Stepanenko presented an adaptive variable structure control algorithm for constrained motion of robots.It was shown that the objective could be achieved without exact knowledge of robot dynamics and on-line calculation of nonlinear dynamic functions.The other works relating to force control are also reported in literature [10][11][12][13].
The quadratic optimization of robotic motion is well addressed by Johansson [14].Johansson and Spong [15] considered the case of quadratic optimization for impedance control of robot manipulators.They showed to minimize an appropriate control effort and used Hamilton-Jacobi formulation to find optimal control.Recently, the application of neural networks in closed-loop control has become an intensive area of research.Many works on closed-loop application of neural networks are reported in the literature.Lewis et al. have worked extensively in the closed-loop application of neural networks for robot motion control [16][17][18][19].They derived successfully control algorithms with stable tracking while approximating the unknown dynamics of the manipulators with neural networks that are universally considered to be fine approximators and trained them online, removing any preliminary offline training.The works relating to application of neural networks in case of impedance control are reported in [20][21][22].In [20,22], neural networks are used for adaptive compensation of the structured and unstructured uncertainties while Saadia et al. [21] tried to globally treat the problem of the adaptation of robot behavior to various classes of tasks.Chen and Chang [23] modified the dynamic model to contain two sets of state variables and used sliding-mode approach to present the control scheme.In this paper, we also transform the dynamic model to a model with two sets of variables, one describing the constrained motion and the other describing the unconstrained motion.Using Hamilton-Jacobi formulation, an optimal sliding-mode controller is designed.The uncertainties of the model are learned with feedforward neural networks.
The stability analysis of the resulting system is also carried out.The conventional robust controllers such as model-based adaptive controllers require the knowledge of the system dynamics in detail (e.g., in the form of linear in parameters model).Also the fuzzy-rule learning scheme has stability problems.The use of optimal sliding-mode neural network based scheme motivates to overcome the shortcomings of existing adaptive schemes.The proposed control scheme has salient features: (1) it guarantees the stability of the controlled system, (2) it provides an optimal feedback solution to the problem, and (3) the neural network is able to learn the completely unknown system dynamics.
The paper is organized as follows.In Section 2, through appropriate transformation of coordinates, the dynamic model for analyzing the constrained robot system is presented.Section 3 presents the optimal sliding-mode controller design.A review of feedforward neural networks is given in Section 4. The NN controller design is presented in Section 5. Numerical simulation results are included in Section 6. Section 7 gives concluding remarks.

Dynamic model of constrained robot
Based on the Euler-Lagrangian formulation, in the absence of friction, the motion equation of an n-link rigid, nonredundant constrained robot can be expressed in joint space as [24] M(q) q + C m (q, q) q + G(q) = τ + J T (q)λ, ( where q ∈ R n is the joint displacement vector, M(q) ∈ R n×n is the inertia matrix, C m (q, q) ∈ R n is the vector characterizing Coriolis and centrifugal forces, G(q) ∈ R n is the gravitational force, τ ∈ R n is the joint space torque, J T (q) ∈ R n×m is a Jacobian matrix associated with the contact surface geometry, and λ ∈ R m (the so-called "Lagrange multiplier") is a vector of contact forces exerted normal to surface, described in coordinates relative to the surface.m is the times of constraints.Due to the stiffness of the environment, it is assumed that where S represents the environment stiffness and is a constant positive definite matrix.Ω : R 1 → R m is a function.The following properties should be noted about the dynamic structure in (2.1).
Property 2.1.M(q) is a symmetric positive definite matrix and bounded above and below, that is, there exists positive constants α M and β M such that Property 2.2.The matrix Ṁ(q) − 2C m (q, q) is skew symmetric.
It is assumed that the generalized contact surface with which the manipulator comes into contact can be thought of as an intersection of m mutually independent hypersurfaces.The algebraic equation of the surface can be written as where the mapping Φ : R n → R m is twice continuously differentiable.Differentiating (2.4) with respect to time leads to Φ(q) = ∂Φ(q) ∂q q = J(q) q, (2.5) where J(q) = ∂Φ(q)/∂q is the Jacobian matrix associated with the contact surface geometry.
Property 2.3.M(q) is a symmetric positive definite matrix and bounded above and below, that is, there exists positive constants α M and β M such that (2.9) The proof is given in Appendix A.
Property 2.4.The matrix Ṁ(q) − 2C(q, q) is skew symmetric.The proof is given in Appendix B.

Optimal sliding-mode controller design
In this section, the optimal control algorithm is designed to obtain the desired contact force λ d and the position trajectories x d , where x d = [ψ 1d ,...,ψ (n−m)d ] T .To adopt the sliding-mode technique, two sets of sliding functions, s u ∈ R n−m for the unconstrained motion and s c ∈ R m for the constrained motion, are defined.For the unconstrained motion, the sliding function is chosen as where e u = x d − x u and ėu = ẋd − ẋu .
The sliding functions for the constrained motion are chosen to be where Taking the first derivatives of (3.3) yields To analyze the stability of (3.4), the following Lyapunov function is chosen: From (3.4) and (3.5), we have Therefore, from Lyapunov stability and LaSalle's theorem [24], we conclude that λ(t) → 0 as t → ∞.This implies that the force λ will equal the desired value Then error dynamics and robot dynamics can be written, respectively, as follows: where Λ = Λu 0 0 Λc and for instance , where e f = λ d − λ.The nonlinear function h(y), capturing the unknown dynamics the robot manipulator is Now, define a new control-input vector as The new control effort u(t) ∈ R n is to be minimized later.The closed-loop robot dynamics becomes With (3.7) and (3.11), the following augmented system can be obtained: with A(q, q) ∈ R 2n×2n , B(q) ∈ R 2n×2n , and Our control objective is to find the control input u(t) so that the following quadratic performance index ζ(u) is minimized: with the Lagrangian Hamilton-Jacobi-Bellman optimization.Let u o (t) denote the required optimal control.A necessary and sufficient condition for u o (t) to minimize (3.15) subject to (3.12) is that ∃ a function V = V ( x,t) satisfying Hamilton-Jacobi-Bellman (HJB) equation [25] ∂V ( x,t) ∂t where the Hamiltonian of optimization is defined as (3.17) The function V ( x,t) satisfies the following partial differential equation: V. Panwar and N. Sukavanam 7 The minimum is attained for the optimal control u(t) = u o (t) and the Hamiltonian is then given by The following function V satisfies the Hamilton-Jacobi-Bellman equation x, (3.20)where K = K T ∈ R n×n is a positive symmetric matrix.The matrices K and Λ in (3.12) can be found by solving the matrix differential Riccati equation Now, consider the following relations: With the choice of these relations, the matrix P(q) given by (3.20) satisfies the Riccati equation (3.21).Then, the optimal control u o (t) that minimizes (3.14) subject to (3.13) is given by See Appendix C for proof.
With the optimal-feedback control law u o (t) calculated using (3.23), the torques τ(t) to apply to the robotic system are computed according to the following control input: Stability analysis.Suppose that the matrices K and Λ exist that satisfy the hypotheses of Lemma 3.1 and the matrix P(q) is bounded over (t 0 ,∞).Then using the optimal-feedback control law in (3.13) and (3.23), the controlled nonlinear system becomes This is globally exponentially stable (GES) with respect to the origin in R 2n .
Proof.The quadratic function V ( x,t) is chosen to be the candidate Lyapunov function as it is positive radially, increasing with x .It is continuous and has a unique minimum at the origin.Now, it remains to show that dV /dt < 0 for all x = 0. From the solution of the HJB equation (3.16), we have (3.27) The time derivative of the Lyapunov function is negative definite and from Lyapunov second theorem it follows that the nonlinear control system (3.25) is globally exponentially stable with respect to the origin in R 2n .

Feedforward neural networks
A two-layer feedforward neural network (FFNN) with n input units, m output units, and N units in the hidden layer, is shown in Figure 4.1.The output vector y is determined in terms of the input vector x by the formula where σ(•) are the activation functions of the neurons of the hidden layer.The inputs-tohidden-layer interconnection weights are denoted by v jk and the hidden-layer-to-outputs interconnection weights by w i j .The bias weights are denoted by θ v j , θ wi .There are many classes of activation functions, for example, sigmoid, hyperbolic tangent, and Gaussian.
The sigmoid activation function used in our work is given by By collecting all the NN weights v jk , w i j into matrices of weights V T , W T , we can write the NN equation in terms of vectors as with the vector of activation functions defined by The bias weights are included as the first column of the weight matrices.To accommodate bias weights, the vectors x and σ(•) need to be augmented by replacing 1 as their first element, for example, Function approximation property.Let f (x) be a smooth function from R n to R m .Let U x ⊆ R n then for x ∈ U x there exists some number of hidden layer neurons N and weights W and V such that [26]  The value of ε is called the NN functional approximation error.In fact, for any choice of a positive number ε N , one can find an NN such that ε < ε N in U x .For a specified value of ε N , the ideal approximating NN weights exist.The, an estimate of f (x) can be given by where W and V are estimates of the ideal NN weights that are provided by some on-line weight-tuning algorithms.
Error backpropagation algorithm.This is a common weight tuning algorithm that is based on gradient descent algorithm.If the NN is training offline to match specified exemplar pairs (x d , y d ), with x d the ideal NN input that yields the desired NN output y d , then the continuous-time version of the backpropagation algorithm for the two-layer NN is given by where F, G are positive definite design learning parameter matrices.The backpropagated error E is selected as the desired NN output minus the actual NN output E = y d − y.For the scalar sigmoid activation function (4.2), the hidden-layer output gradient σ is where I denotes the identity matrix, and diag{z} means a diagonal matrix whose diagonal elements are the components of the vector z.In the next section, design of NN controller is presented.

NN controller design
We will use a feedforward neural network to approximate the nonlinear function given in (3.9) The functional approximation error can be made arbitrarily small by selecting appropriate weights of the network.Then a functional estimate h(y) of h(y) can be written as where W and V are estimated NN weights.Then the external torque is given by where v(t) is the robustifying term that is given by where Then (3.11) becomes (5.6) In order to proceed further, we need the following definitions [17].
Definition 5.1.The solution of a nonlinear system with state y(t) ∈ R n is uniformly ultimately bounded (UUB) if there exists a compact set U y ⊂ R n such that for all y(t 0 ) = y 0 ∈ U y , there exists a δ > 0 and a number T(δ, y 0 ) such that y(t) < δ for all t ≥ t 0 + T.

Definition 5.3. Given
i j with tr{•} as the trace operator.The associated inner product is A,B F = tr{A T B}.The Frobenius norm is compatible with 2-norm so that Ay 2 ≤ A F y 2 .
Definition 5.4.For notational convenience, define the matrix of all NN weights as Z ≡ diag{W,V } and the weight estimation errors as W = W − W, V = V − V , and Z = Z − Z.The ideal NN weights bounded so that Z F ≤ Z M with known Z M .Also define the hidden-layer output error for a given y as σ = σ − σ = σ(V T y) − σ( V T y).
V. Panwar and N. Sukavanam 11 Adding and subtracting W T σ( V T y) in (5.6), we get (5.7) Again adding and subtracting W T σ in (5.7), we get The Taylor series expansion of σ(V T y) about given V T y gives us with σ ( z) = (dσ(z)/dz)| z= z the Jacobian matrix and O(z) 2 denoting terms of second order.Denoting σ = σ ( V T y), we have σ = σ V T y + O( V T y) 2 .Replacing for σ in (5.8), we get where the disturbance terms are Finally, adding and subtracting W T σ V T y (5.10) becomes where the disturbance terms are The state-space description of (5.13) can be given by Inserting the optimal feedback control law (3.23)into (5.14),we obtain NN weights update law.With positive definite design parameters F, G, and κ > 0, the adaptive NN weight update law is given by (5.16) Then the errors state vector x and W, V are uniformly ultimately bounded and the errors x can be made arbitrary small by adjusting the weights.
Proof.Consider the following Lyapunov function candidate: (5.17) The time derivative L of the Lyapunov function becomes (5.18) Evaluating (5.18) along (5.15), we get From Riccati equation, we have or we can transform it into the following form: (5.21) Using x T PA x = (1/2) x T {PA + A T P} x and using (5.21), we obtain V and applying adaptive learning rule (5.16), we have V. Panwar and N. Sukavanam 13 or + tr κ W T B T P x W + tr κ V T B T P x V . (5.24) Since B T P = [0 I ], PB = [0 I ] T , and assuming v = 0, w = 0 (though v and w can be shown to be bounded), we get (5.25) The following inequality is used in the derivation of (5.25), Using Cauchy-Schwartz inequality, we have Z,Z F ≤ Z F Z M , and then (5.25) is derived.Completing the square term, we get ( The expression for L given by (5.27) remains negative as long as the quantity in the bracket is positive, that is, either (5.28) or (5.29) hold where C x and C Z are the convergence regions.According to Lyapunov theory and LaSalle extension, the UUB of x and W, V is proved.

Simulation results
The simulation has been performed for a 4-DOF robotic manipulator (see Figure 6.1) in contact with a planar environment moving along an ellipse.The mathematical model of the manipulator is expressed as where the mass matrix and gravity terms are given as follows: , where c i = cos q i , c 23 = cos q 2 + q 3 , c 234 = cos q 2 + q 3 + q 4 , c 34 = cos q 3 + q 4 , s i = sin q i , s 23 = sin q 2 + q 3 , s 234 = sin q 2 + q 3 + q 4 , s 34 = sin q 3 + q 4 , i = 1,2, The parameter values for the manipulator model are set to be θ = q 2 + q 3 + q 4 , φ 1 q 1 , q 2 , q 3 , q 4 ≡ b 2 c 2 1 + a 2 s 2 1 a 2 c 2 + a 3 c 23 + a 4 c 234 2 − 1 = 0, φ 2 q 1 , q 2 , q 3 , q 4 ≡ a 1 + a 2 s 2 + a 3 s 23 + a 4 s 234 − c = 0. (6.5) Define x u = [ψ 1 ,ψ 2 ] T and x c = [φ 1 ,φ 2 ] T , where ψ 1 = q 1 , ψ 2 = q 2 + q 3 + q 4 .The weighting matrices used in defining the performance index are as follows: Solving for the matrices K and Λ, we get Also we have set Γ = 1.The control objective is to let the end-effector track the elliptical trajectory on the planar surface while maintaining a desired orientation along the surface.The joint q 1 is desired to track the desired trajectory q 1d (t) and desired orientation is θ d .The Lagrangian multiplier λ is set to the desired constant vector [λd1 λd2 ] T .In numerical simulation, we have chosen With ω = 0.1 and ρ = .001for the robustifying term, Figures 6.2-6.4 depict the performance of the designed optimal sliding-mode controller with perfectly known model parameter values.In case of unknown parameter values, the FFNN-based controller is used to learn the unknown robot dynamics.The architecture of the FFNN is composed of 18 input units and 1 bias unit, 24 hidden sigmoidal units and 1 bias unit and 4 output units.The learning rate in the weight-tuning algorithm is 19 , and κ = .0005.The whole system is simulated for 10 seconds.Figures 6.5-6.8 show the ability of the FFNN controller to learn the unknown dynamical behavior of the system and produce desired response.

Conclusion
In this paper, the design of an optimal hybrid motion and force control scheme is presented for a constrained robotic manipulator with unknown dynamics.The optimal control law is derived using HJB optimization.An online adaptive neural network controller, using a two-layer feedforward neural network, is proposed to compensate for the dynamic model of the constrained robot.It has been shown that the entire system depends on the user-specified performance index matrices Q and R. The stability of the overall system V. Panwar and N. Sukavanam 17 is proved using Lyapunov function that is generated by weighting matrices.Simulation of a 4-DOF robot manipulator has been used to illustrate the control methodology.The simulation results show that the feedforward neural network with the on-line updating law can compensate the robot dynamics efficiently.

A.
Proof of Property 2.3.M(q) is symmetric and positive definite.Thus M T (q) = T T (q)M(q)T(q) T = T T (q)M T (q) T T (q) T = T T (q)M(q)T(q).Therefore M(q) is symmetric and positive definite.Also M(q) = T T (q)M(q)T(q) ≥ α M λ min T T (q)T(q) I n = α M I n , ( A . 2 ) 20 Mathematical Problems in Engineering where λ min (T T (q)T(q)) is the minimum eigenvalue of the matrix T T (q)T(q) and α M = α M λ min T T (q)T(q) , M(q) = T T (q)M(q)T(q) ≤ β M λ max T T (q)T(q) I n = β M I n , (A.3) where λ max (T T (q)T(q)) is the maximum eigenvalue of the matrix T T (q)T(q) and β M = β M λ max T T (q)T(q) .(A.4) Thus the matrix M(q) is bounded above and below.

B.
Proof of Property 2.4.
we get the following relations: (C.12) Using the robot Property 2.4, we can set R −1 = Q 22 .Thus the proof is completed.

Definition 5 . 2 .
Define the norm of a vector y ∈ R n as y = y 2 1 + y 2 2 + ••• + y 2 n and the norm of a matrix A ∈ R m×n as A = λ max [A T A], where λ max [•] and λ min [•] are the largest and smallest eigenvalues of a matrix.To be specific, denote the p-norm by • p and the absolute value as | • |.