Embedded Optimal Control of Robot Manipulators with Passive Joints

This paper studies the optimal control problem for planar underactuated robot manipulators with two revolute joints and brakes at the unactuated joints in the presence of gravity. The presence of a brake at an unactuated joint gives rise to two operating modes for that joint: free and braked. As a consequence, there exist two operating modes for a robot manipulator with one unactuated joint and four operating modes for a manipulator with two unactuated joints. Since these systems can change dynamics, they can be regarded as switched dynamical systems.The optimal control problem for these systems is solved using the so-called embedding approach. The main advantages of this technique are that assumptions about the number of switches are not necessary, integer or binary variables do not have to be introduced to model switching decisions between modes, and the optimal switching times betweenmodes are not unknowns of the optimal control problem. As a consequence, the resulting problem is a classical continuous optimal control problem. In this way, a general method for the solution of optimal control problems for switched dynamical systems is obtained. It is shown in this paper that it can deal with nonintegrable differential constraints.


Introduction
In underactuated manipulators the number of available control inputs is strictly less than the number of the degrees of freedom of the robot.However, the control problem for different underactuated manipulators may have different levels of difficulty.
This paper studies the optimal control problem for planar underactuated robot manipulators with two revolute joints that move in a vertical plane and, therefore, are subject to the gravity force.The presence of two revolute joints in the mechanical system will be denoted by .Both possible models of planar underactuated  robot manipulators under gravity are considered, namely, the model in which only the shoulder joint is actuated, denoted by , which is usually called Pendubot, and the model in which only the elbow joint is actuated, denoted by , which is usually called Acrobot.The mechanical system obtained by removing both motors from the  robot manipulator under gravity is also considered, which is actually a double pendulum and will be denoted by  .
Underactuated robots are mechanical systems with second order nonholonomic constraints, because the dynamic equation of the unactuated part of the mechanical system is a second order differential constraint which, in general, is nonintegrable.In the absence of gravity it is not possible to integrate even partially this second order differential constraint in the dynamic model of the  robot manipulator.However, in the presence of this second order nonholonomic constraint the system is controllable.On the contrary, in the dynamic model of the  robot manipulator without gravity this differential constraint is completely integrable.It can be converted into a holonomic constraint that makes the system not controllable.As a consequence, the trajectory planning problem has a solution only for particular initial and final states.In the presence of gravity, the dynamic equation of the unactuated part of the mechanical system is a second order nonintegrable differential constraint for both the Pendubot and the Acrobot models and, therefore, both systems are controllable.
A comprehensive review of the control properties of underactuated robot manipulators can be found in [1], [2,Chapter 42].
In [3] the chaotic features of the double pendulum are studied by means of numerical methods.It is shown that, depending on the initial conditions, this system has a periodic, quasiperiodic, and chaotic behaviour, in which the motion of the system is unpredictable and small differences in initial conditions induce very different trajectories.
For both the Pendubot and the Acrobot models, the control objective is, in general, to drive the manipulator away from the stable equilibrium state and steer it at an unstable equilibrium state.Since both the  and the  underactuated robot manipulators are linearly controllable, it is not difficult to control them around an equilibrium point.However, due to the gravitational drift, the region of the state space where the robot can be kept in equilibrium is reduced and consists of two disjoint manifolds.Moving between these two manifolds requires appropriate swingup maneuvers, whose synthesis has been tackled by several control [4] and optimal control [5] techniques.
The underactuated robot manipulators studied in this paper are supposed to be equipped with brakes at the unactuated joints.The Pendubot and the Acrobot with brakes at the unactuated joints will be denoted by   and   , respectively, whereas the   mechanical system with brakes at both unactuated joints will be denoted by     .This mechanical system can be regarded as a brake-actuated mechanical system and will be referred to as Brakebot.Thus, since each passive joint has two operating modes, unbraked and braked, there are two operating modes for the Acrobot and the Pendubot and four operating modes for the Brakebot.
Since the Brakebot is a passive system it can only move from higher to lower energy states and, therefore, the control objective is, in general, to drive the manipulator away from an unstable equilibrium state and steer it at a stable equilibrium state with appropriate swing-down maneuvers eliminating periodic, quasiperiodic, or chaotic behaviour.For the synthesis of these maneuvers, optimal control techniques can be employed.The cost functional to be optimized could be, for instance, the duration of the maneuver, the number of switches of the dynamical system between unbraked and braked modes, or the total duration of the evolution of the system in braked mode.
The problem studied in this paper can be stated as follows: given an initial state and a final state, find the sequences of modes, the corresponding trajectory, and available control inputs that satisfy the dynamic equation of the robot manipulator and steer the system between the initial and the final states optimizing a cost functional during the motion.Since the optimal sequence of modes has to be determined, this problem is actually an optimal control problem of a switched dynamical system.
Switched systems are particular hybrid systems [6] whose continuous state does not exhibit jumps at the switching instants.Solving the optimal control problem for a switched system implies finding the optimal sequence of switching instants, the optimal sequence of discrete modes, and the optimal value for the continuous control input.
The optimal control problem of switched dynamical systems can be formulated using integer or binary variables to model the choice of the optimal mode sequence.Optimal control problems involving binary or integer variables are called Mixed-Integer Optimal Control (Mioc) problems [7].In [8], recent progress in theoretical bounds, reformulations, and algorithms for this class of problems have been reviewed and a comprehensive algorithm, based on the solution of a sequence of purely continuous problems and simulations, has been proposed.A hybrid optimal control technique based on Mioc has been presented in [9] and applied to several mechanical systems including a Pendubot with a brake at the unactuated joint.
Hybrid control techniques have been applied to both the Pendubot and the Acrobat models in [10] and [11], respectively.
In [12], the optimal control problem of switched systems has been solved via embedding into a continuous optimal control problem.It has been shown that, for quite a general class of optimal control problems for switched systems, the computational complexity of the Embedded Optimal Control (Eoc) method is no greater than that of smooth optimal control problems.With this technique the switched system is embedded into a larger set of systems and the optimal control problem is formulated in the latter.The main advantages of this method are that assumptions about the number of switches are not necessary, integer or binary variables do not have to be introduced to model switching decisions between modes, and the optimal switching times between modes are not unknowns of the optimal control problem.As a consequence, the resulting problem is a classical continuous optimal control problem.This greatly reduces the computation time to find a solution with respect to the Mioc technique.In [13] the Eoc approach introduced in [12] has been extended to incorporate hybrid behaviour due to autonomous switches and to systems with an arbitrary number of modes.In [14] the Eoc method has been compared with other optimal control methods for switched systems over several dynamic models with autonomous and controlled switches.A comprehensive treatment of all the theoretical results obtained with the Eoc approach to the optimal control problem of switched mechanical systems can be found in [15,Chapter 31].
In this paper, the optimal control of the Pendubot, Acrobot, and Brakebot has been studied using the Eoc approach.The obtained results have been compared with those obtained using a Multiphase Mixed-Integer Optimal Control (Mmioc) method.Due to the nature of the problem, only controlled switches have been considered.The results have shown that the Eoc approach is effective in solving optimal control problems for these underactuated mechanical systems, which are highly nonlinear and have nonholonomic constraints.
This paper is organized as follows.In Section 2 the dynamic models of the Pendubot, Acrobot, and Brakebot are described and in Section 3 their control properties are discussed.The optimal control problem for these dynamic systems is stated in Section 4 and in Section 5 the Eoc approach is introduced.In Section 6 the results of the application of this method to several instances of optimal control problem are reported.Finally, Section 7 contains the conclusions.

Dynamic Model of Underactuated Manipulators
The general dynamic model of a robot manipulator is described by the following second order differential equation: where the first term of this equation, () θ , represents the inertial forces due to acceleration at the joints and the second term, (, θ ) θ , represents the Coriolis and centrifugal forces.The third term represents both the effects of friction at the joints and the effects of friction induced by electromagnetic friction brakes at the unactuated joints.The term () represents the potential forces such as elasticity and gravity.The matrix () on the right-hand side maps the external forces/torques  to forces/torques at the joints.Finally,  represent the forces/torques at the joints, which are the control variables of the system.Explicit time dependence is omitted in this section to simplify the notation.
The links are supposed to be rigid, as well as the transmission elements.No external forces are supposed to be acting on the mechanical system.Under these hypotheses, the dynamic model of the robotic system reduces to Only the effects of friction induced by the electromagnetic friction brakes at the unactuated joints will be taken into account.This means that ℎ represents the braking torques induced by electromagnetic friction brakes at the unactuated joints.
A planar vertical  manipulator is composed of two homogeneous links and two revolute joints moving in a vertical plane {, }, as shown in Figure 1, where   is the length of link ,   is the distance between joint  and the mass center of link ,   is the mass of link , and    is the barycentric inertia with respect to a vertical axis, , of link , for  = 1, 2. In this case the two matrices () and (, θ ) and the vector () have the form where  = ( 1 ,  2 )  is the vector of configuration variables,  1 being the angular position of link 1 with respect to the axis of the reference frame {, } and  2 the angular position of link 2 with respect to link 1 as illustrated in Figure 1.
The vector θ = ( θ 1 , θ 2 )  is the vector of angular velocities and θ = ( θ 1 , θ 2 )  is the vector of angular accelerations.The vector  = ( 1 ,  2 )  is the vector of control inputs, in which  1 is the torque applied by the actuator at Joint A robot manipulator is said to be underactuated when the number of actuators is less than the degree of freedom of the mechanical system.Thus, the dynamic model ( 6) can In passive joints equipped with brakes, the braking torque ℎ  ,  ∈ {1, 2}, can be described as a friction-velocity map with load dependency and lockup behavior [16]: ( Here  is the viscous friction coefficient,  is the loadindependent Coulomb friction torque,  is the friction load dependency, and ℎ   is the external torque at joint . cl is the brake clamp force, and ℎ  is the load-independent static friction torque, which are supposed to have the same values at the two brakes.The parameter  defines a small zero velocity bound in accordance with Karnopp's model for zero velocity detection.A continuously differentiable version of this friction model [17] has been used for the experiments.Equations ( 7) and ( 8) with ℎ 2 = 0 and with ℎ 1 = 0 represent the unbraked operational modes of the Pendubot and the Acrobot, respectively.In this case, the second equation in (7) and the first equation in (8) describe the dynamics of the unactuated part of the mechanical system, which are second order differential constraints without input variables.Activating brakes at the passive joints of the Pendubot and the Acrobot gives rise to the braked operational modes of these two mechanical systems which thus have two operational modes, namely, (1) unactuated joint free and (2) unactuated joint braked.
Since all the joints of the Brakebot are unactuated, it has four operational modes depending on which joints are braked and which are free, namely, (1) both joints free, (2) the first joint free and the second joint braked, (3) the first joint braked and the second joint free, and (4) both joints braked.The corresponding dynamic equations can be obtained from (10) by setting one or both braking torques to zero.

Control Properties of Underactuated 𝑅𝑅 Robot Manipulators
In this section, the main control properties of planar underactuated  robot manipulators in the presence of gravity will be described.For a more general description of the control properties of underactuated robot manipulators see [1].
Optimal control approaches to trajectory planning assume that there exists a control input that steers the system between two given states.Thus, controllability is the most important aspect to be checked before studying the optimal control problem of a dynamic system.If in the trajectory planning problem the duration of the motion  is not assigned, the existence of a finite-time solution for any state (  , θ  ) in a neighborhood of (  , θ  ) is equivalent for the robotic system to the property of local controllability at (  , θ  ).If local controllability holds at any state, then the system is controllable and the trajectory planning problem is solvable for any pair of initial and final states.
For underactuated  robot manipulators, controllability is related to integrability of the second order nonholonomic constraints.The second order differential constraint in ( 7) and ( 8) may either be partially integrable to a first order differential equation or completely integrable to a holonomic equation.Necessary and sufficient integrability conditions are given in [18].If the second order differential constraint in (7) or ( 8) is not partially integrable, it is possible to steer the corresponding system between equilibrium points.This occurs for planar underactuated  robot manipulators in the absence of gravity and for both the  and  in the presence of gravity.If the second order differential constraint in (7) or ( 8) is completely integrable to a holonomic constraint, the motion of the corresponding mechanical system is restricted to a 1-dimensional submanifold of the configuration space which depends on the initial configuration.This occurs for the planar underactuated  robot manipulators without the effects of gravity [18].For this robot model, the trajectory planning problem has solution only for particular initial and final states.
Thus, when the second order differential constraint in (7) or ( 8) is not partially or completely integrable, the corresponding mechanical system is controllable.However, several aspects of controllability can be studied which characterize planar underactuated  robot manipulators.
Both planar underactuated  and  robot manipulators in the presence of gravity are linearly controllable [1].Linear controllability means that the linear approximation of the system around an equilibrium point is controllable.Both planar underactuated  and  robot manipulators in the presence of gravity are neither small-time locally controllable nor small-time locally configuration controllable [19,20].Finally, both planar underactuated  and  robot manipulators in the presence of gravity are not kinematically controllable [19][20][21].The lack of kinematic controllability implies that in general a configuration is not reachable by means of a sequence of kinematic motions, that is, feasible paths in the configuration space which may be followed with an arbitrary timing law.

The Optimal Control Problem
In this section, following [13], the optimal control problem for switched dynamical systems will be described.
In switched dynamical systems, the vector fields that describe the evolution of the system undergo discontinuous jumps depending on the state and the input.These switches are called autonomous.Discontinuous jumps in the evolution of switched dynamical systems can also be induced by a control input.For this reason these switches are called controlled.
The evolution of a switched dynamical system subject to autonomous and controlled switches can be described by the following four quantities: (1) the discrete state or mode () ∈   = {1, 2, . . .,   }, (2) the continuous state () ∈ R  , (3) the switching control input V() ∈  V = {1, 2, . . .,  V }, which can be regarded as a discrete control input, and (4) the continuous control input () ∈ R  .In this work, we only consider memoryless systems for which the autonomous switches depend on the continuous state () and the input (), and not on the current discrete state ().The evolution of the discrete state of the memoryless system is described by a piecewise continuous function  : R  × R  →   , which selects the discrete state  of the system based on the continuous state () and the continuous control input (); that is, Let  (,) ,  ∈   ,  ∈  V , be a collection of  1 vector fields The evolution of the continuous state () is described by This means that, at each  ≥  0 and for each discrete state () ∈   , the switching control input V() ∈  V selects the vector field that governs the evolution of the continuous state.Note that in (15) the switching control input V() does not affect the autonomous switches.Systems described by (15) are called systems with decoupled switches.The continuous control input () is assumed to belong to a convex and compact set Ω ⊆ R  , and () and V() are supposed to be measurable functions.Since ( 14) completely determines the discrete state () based on () and (), a piecewise  1 vector field   ((), ()) =  (((),()),) ((), ()) can be defined for each  ∈  V and ( 15) can be rewritten as follows: We are interested in computing optimal control laws for systems described by (16).This entails finding (1) the optimal sequence of switching control inputs, that is, the optimal sequence of values of the function V(), (2) the optimal switching instants,  1 ,  2 , . . .,   , including their number, , and (3) the optimal continuous control over each interval [  ,  +1 ),  = 0, 1, 2, . . ., .Mmioc techniques can be employed to find the optimal solution of this problem.However, the computational complexity of the resulting problem limits the application of these Mathematical Problems in Engineering techniques in practice.It has been shown in [12] that for quite a general class of optimal control problems of switched dynamical systems, a technique called embedding can be applied, whose computational complexity is no greater than that of continuous optimal control problems.This technique will be described in Section 5.
Consider the system described by (16), where V() and () are control variables of the optimal control problem.The cost functional of the optimal control problem, defined over the interval [ 0 ,   ], can be expressed in the following form: where Thus, the optimal control problem for the above defined class of switched systems, called Switched Optimal Control Problem (Socp), can be stated as follows: subject to (1) ( 16), (2) boundary conditions ( 0 , ( 0 ),   , (  )) ∈ B, and (3) () ∈ Ω ⊆ R  , and

Embedding Approach
The approach described in [12] to solve optimal control problems of switched dynamical systems consists in embedding system (16) into a larger set of systems.Since V() Let   be the control input for each vector field   ,  ∈  V in (15).Define a new system and the associated cost functional The Socp is converted into the following Embedded Optimal Control Problem (Eocp): minimize the functional ( 22) over all functions V  and   subject to the following constraints: (1) ( 21), (2) boundary conditions ( 0 , ( 0 ),   , (  )) ∈ B, (3) for each  ∈ [ 0 ,   ] and each  ∈  V , V  () ∈ [0, 1] and   () ∈ Ω, and (4) for each  ∈ [ 0 ,   ], ∑ This is an optimal control problem without integer or binary variables.Therefore, classical necessary and sufficient conditions of optimal control theory can be applied to solve it.However, to guarantee the existence of a solution some additional hypotheses are needed.We assume that the functions   ((),   ()) are affine in the continuous control variable, that is,   ((),   ()) =   () +   ()  , and that the functions  0  ((),   ()) are convex in   for each (, ).It has been shown that, once a solution of the Eocp has been obtained, either the solution is of the switched type, that is, exactly one of the V  's being 1 and all the others being 0, or suboptimal trajectories of the Socp can be constructed that can approach the value of the cost for the Eocp arbitrarily closely and satisfy the boundary conditions within , with arbitrary  > 0. A thorough discussion about the relationship between Socp's and Eocp's solutions can be found in [12].The numerical method employed to transcribe the Eocp into Nlp problem is described in [22].

Implementation and Results
Several numerical experiments have been carried out for the   ,   , and     planar underactuated robot manipulators in the presence of gravity.The corresponding dynamical systems are described by vector fields that are affine in the continuous control variables, thus fulfilling the sufficient conditions for the existence of a solution stated in the previous section.
The results obtained using the Eoc approach have been compared to those obtained using a Mmioc technique [7].In the Mmioc approach, to find the optimal sequence of modes an iterative process has been used.First, the maximum number of switches between modes has been estimated.This number determines the maximum number of phases that are considered.Then, an optimal control problem has been formulated, in which the unknown switching times are part of the state, and binary variables have been introduced to represent the choice of the active mode in each phase.Finally, a third-degree Gauss-Lobatto direct collocation method has been used to tackle the dynamic equations.The resulting Minlp problem has been solved using a nonlinear programming based branch-and-bound algorithm specifically tailored to the problem.Details about the implementation of this method can be found in [23].As pointed out in [8], special care has to be taken to treat the cases in which phase lengths diminish during the optimization procedure.Therefore, the phases of zero duration have been eliminated.
6.1.Pendubot with Brake.In this section, the optimal control problem for a Pendubot with a brake at the unactuated joint is studied.In this robot model, the second joint is not actuated; thus  = ( 1 , 0)  .In this case, it is not possible to integrate partially or completely the nonholonomic constraint because the conditions for partial integrability [18] are not fulfilled.Hence, the system is controllable.The numerical results of the application of the Eoc method for optimal control to two different boundary value problems for this system will be described.In both cases the functional to be minimized is the energy consumption during the motion.
The Pendubot has two operational modes, depending on whether the unactuated joint is braked or not, namely, (i) Mode 1: Joint 2 free, (ii) Mode 2: Joint 2 braked.From ( 7), the dynamic model of a Pendubot with a brake at the unactuated joint is In these equations  1 is the first component of the input vector  = ( 1 ,  2 )  .When the Pendubot is operating in Mode 1, that is, with Joint 2 free, ℎ 2 = 0. Otherwise, it takes the form (9). To convert this second order differential model into the first order model ( 16) the following change of variables is introduced: with the following additional differential relations: among the components of state variable  = ( 1 ,  2 ,  3 ,  4 )  .The result is Since the Pendubot has two dynamical modes,  V = {1, 2}, 2 vector fields,  1 = ( 1,1 ,  1,2 ,  1,3 ,  1,4 )  and  2 = ( 2,1 ,  2,2 ,  2,3 ,  2,4 )  , must be considered.The dynamic models of the Pendubot operating in Mode 1 and Mode 2 are the following: In these equations,  1,1 is the first component of vector  1 = ( 1,1 ,  1,2 )  , the input vector for vector field  1 , and  2,1 is the first component of vector  2 = ( 2,1 ,  2,2 )  , the input vector for vector field  2 .Moreover, two new variables, , must be introduced.Thus, ( 21) takes the form and the cost functional (22) becomes This objective functional fulfills the sufficient conditions for the existence of a solution stated in Section 5. Inequality constraints on the state variables have been also considered.In particular, variable  2 has been bounded to take into account possible range limitations of the mechanical system.Thus, the inequality constraints − [rad] ≤  2 ≤  [rad] have been introduced in the model.Inequality constraints on the available control variable have also been considered.In particular, the constraints −25 [Nm] ≤  1 ≤ 25 [Nm] have been introduced in the model to take into account physical limitations of the actuators.
The following parameters of the dynamic model have been used:

Swing to the Up Configuration.
In this numerical experiment, a boundary value problem has been solved with the following initial and final conditions: and θ The initial and final times have been   = 0 and   = 3.6 [s], respectively.
The energy consumption, the resolution time, the number of phases, the sequence of modes, and the corresponding switching times between them obtained with both the Eoc and Mmioc methods are reported in Table 1. Figure 2 shows the sequence of configurations of the robot at times  = (3.6/64), = 0, 1, . . ., 64, corresponding to the optimal solution of the boundary value problem obtained with a discretization of the interval [  ,   ] into 128 subintervals.Configurations that correspond to the braked mode of the mechanical systems are represented in red.It is easy to see that the braked mode is used in the solution just before swinging and when the system is reaching its final state.Figure 3 shows the corresponding control and state variables obtained with the Eoc method (in red) and the Mmioc approach (in black).In the Eoc solution, six phases can be distinguished, although two of them have a very short duration.The optimal sequence of modes obtained has been (0, 1, 0, 1, 0, 1) which corresponds to braking the unactuated joint in phases 2, 4, and 6 and not braking in phases 1, 3, and 5.In the Mmioc solution, 4 phases can be distinguished.The optimal sequence of modes obtained has been (0, 1, 0, 1) which corresponds to braking the unactuated joint in phases 2 and 4 and not braking in phases 1 and 3.It is easy to see from Figure 3(f) that the optimal sequence of modes obtained with the Eoc method coincides with that obtained with the Mmioc approach except for the fifth short phase that does not exist in the Mmioc solution.As a consequence, the continuous solutions almost coincide.The value of the objective functional for the optimal solution obtained with the Eoc method has been 192.71[J], whereas the value obtained with the Mmioc approach has been 195.532[J], higher than the previous one.The computational time to find a solution with the Eoc method has been 2.84221 [s] whereas it has been 26.984307[s] with the Mmioc approach, higher than the previous one.

Swing to the Down Configuration.
In this numerical experiment, a boundary value problem has been solved with the following initial and final conditions:  The results obtained with both the Eoc and Mmioc methods are reported in Table 2, Figures 4 and 5 with the same interpretation as in Section 6.1.1.It is easy to see from Figure 5(f) that the optimal sequence of modes obtained with the Eoc method coincides with that obtained with the Mmioc approach but the time location of the corresponding phases does not.As a consequence, the continuous solutions do not coincide.Nevertheless, the value of the objective functional for the optimal solution obtained with the Eoc method has been 21.5397 [J] whereas the value obtained with the Mmioc approach has been 21.5739 [J], very similar to the previous one.The computational time to find a solution with the Eoc method has been 1.57952 [s], shorter than computational time to find a solution with the Mmioc approach, which has been 28.65894 [s].

Acrobot with Brake.
In this section, the optimal control problem of an Acrobot with a brake at the unactuated joint is studied.In this robot model the first joint is not actuated; thus  = (0,  2 )  .In this case, it is not possible to integrate partially or completely the nonholonomic constraint because the conditions for partial integrability [18] are not fulfilled.Hence, the system is controllable.The numerical results of the application of our method for optimal control to two different boundary value problems for this system will be described.In both cases the functional to be minimized is the energy consumption during the motion.
The Acrobot has two operational modes, depending on whether the unactuated joint is braked or not, namely, (i) Mode 1: Joint 1 free, (ii) Mode 2: Joint 1 braked.
From (8), the dynamic model of an Acrobot with a brake at the unactuated joint is In these equations  2 is the second component of the input vector  = ( 1 ,  2 )  .When the Acrobot is operating in Mode 1, that is, with Joint 1 free, ℎ 1 = 0. Otherwise, it takes the form (9). To convert this second order differential model into the first order model ( 16) the change of variables (24) and the additional differential relations (25) are introduced.The result is Since the Acrobot has two dynamical modes,  V = {1, 2}, two vector fields,  1 and  2 , must be considered.The dynamic models of the Acrobot operating in Mode 1 and Mode 2 are the following: In these equations,  1,2 is the second component of vector  1 = ( 1,1 ,  1,2 )  , the input vector for vector field  1 , and  2,2 is the second component of vector  2 = ( 2,1 ,  2,2 )  , the input vector for vector field  2 .Moreover, two new variables, , must be introduced.Thus, (21) takes the form and the cost functional (22) becomes This objective functional fulfills the sufficient conditions for the existence of a solution stated in Section 5.
Variable  2 has been bounded, to take into account possible range limitations of the mechanical system.Thus, the inequality constraints − [rad] ≤  2 ≤ [rad] have been introduced in the model together with the constraints −25 [Nm] ≤  2 ≤ 25 [Nm] to take into account physical limitations of the actuators.The same parameters of the dynamic model of Section 6.1 have been used.

Swing to the Up Configuration.
In this numerical experiment, a boundary value problem has been solved with the following initial and final conditions: and θ The initial and final times have been   = 0 and   = 3.4 [s], respectively.
The results obtained with both the Eoc and Mmioc methods are reported in Table 3, Figures 6 and 7 with the same interpretation as in the previous sections.It is easy to see from Figure 7(f) that the optimal sequence of modes obtained with Eoc method does not coincide with that obtained with the Mmioc approach, since the first unbraked phase does not exist in the Mmioc solution.As a consequence, the continuous solutions do not coincide.The value of the objective functional for the optimal solution obtained with the Eoc method has been 113.968[J] whereas the value obtained with the Mmioc approach has been 113.027[J], a bit lower than the previous one.The computational time to find a solution with the Eoc method has been 20.2848 [s], whereas it has been 32.137628 [s] with the Mmioc approach, higher than the previous one.

Swing to the Down Configuration.
In this numerical experiment, a boundary value problem has been solved with the following initial and final conditions: Figure 4: Sequence of configurations of the optimal solution for the swing-down maneuver for the Pendubot with brake.The corresponding control and state variables are represented in Figure 5.
and θ The initial and final times have been   = 0 and   = 3.8 [s], respectively.
The results obtained with both the Eoc and Mmioc methods are reported in Table 4, Figures 8 and 9 with the same interpretation as the previous sections.It is easy to see from Figure 9(f) that the optimal sequence of modes obtained with Eoc method coincides with that obtained with the Mmioc approach except for the first short phase that does not exist in the Mmioc solution.Nevertheless, the continuous solutions almost coincide.The value of the objective functional for the optimal solution obtained with the Eoc method has been 5.71916 [J] whereas the value obtained with the Mmioc approach has been 5.73743 [J], slightly higher than the previous one.The computational time to find a solution with the Eoc method has been 4.7636 [s], whereas it has been 19.428886[s] with the Mmioc approach, higher than the previous one.

Brakebot.
In this section, the optimal control problem for a Brakebot is studied.In this robot model both joints are not actuated; thus  = (0, 0)  .In this case the functional to be minimized is the duration of the motion.
The Brakebot has four operational modes depending on which joints are braked and which can rotate freely, namely, (i) Mode 1: Joint 1 free, Joint 2 free, (ii) Mode 2: Joint 1 free, Joint 2 braked, (iii) Mode 3: Joint 1 braked, Joint 2 free, (iv) Mode 4: Joint 1 braked, Joint 2 braked.From (10), the model of the Brakebot is This general dynamic model can be easily particularized to describe the four operating modes of the Brakebot.For example, when the Brakebot is operating in Mode 1, that is, with both Joint 1 and Joint 2 free, ℎ 1 = ℎ 2 = 0.When the Brakebot is operating in Mode 2, that is, with Joint 1 free and Joint 2 braked, ℎ 1 = 0 and ℎ 2 takes the form (9), and so on.To convert this second order differential model into the first order model ( 16) the change of variables (24) and the additional differential relations (25) Figure 8: Sequence of configurations of the optimal solution for the swing-down maneuver for the Acrobot with brake.The corresponding control and state variables are represented in Figure 7.
Moreover, four new variables, , must be introduced.Thus, (21) takes the form and the cost functional (22) is This objective functional fulfills the sufficient conditions for the existence of a solution stated in Section 5. Inequality constraints on the state variables have been also considered.In particular, variable  2 has been bounded to take into account possible range limitations of the mechanical system.Thus, the inequality constraints − [rad] ≤  2 ≤  [rad] have been introduced in the model.This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swingdown maneuver for the Acrobot with brake.The corresponding sequence of configurations of the robot is represented in Figure 8.The same figure shows (in black) the optimal solution of the same boundary value problem obtained with a Mmioc approach.

Swing to the Down Configuration.
In this numerical experiment the minimum time problem has been solved with the following initial and final conditions: and θ The initial time has been   = 0.
The results obtained with both the Eoc and Mmioc methods are reported in Table 5, Figures 10 and 12. Mode 1 has been represented in green, Mode 2 in blue, Mode 3 in orange, and Mode 4 in red.It is easy to see from Figure 11 that the optimal sequence of modes obtained with the Eoc method does not coincide with that obtained with the Mmioc approach.As a consequence, the continuous solutions do not coincide, as well.The value of the objective functional for the optimal solution obtained with the Eoc method has been 4.25354 [s] whereas the value obtained with the Mmioc approach has been 5.00115 [J], higher than the previous one.The computational time to find a solution with the Eoc method has been 9.29855 [s], whereas it has      been 1270.4665[s] with the Mmioc approach, much higher than the previous one.
With both approaches an initial guess of the optimal solution had to be generated.However, with the Mmioc method this generation is much more difficult since, besides the optimal continuous part of the solution, the optimal number of phases, the optimal switching times between them, and the optimal sequence of modes have to be estimated.The initial guess of the optimal switching times has been calculated assuming phases of equal duration.The Mmioc method has shown high sensitivity to the initial guess of the optimal sequence of modes.Additionally, phases of zero length had to be eliminated.Therefore an iterative procedure has been implemented to select the best initial guess and to reduce the number of phases whenever a phase of zero duration appeared in the solution.The numerical experiments have been carried out on a 2.8 GHz Intel Core i7 computer with the Mac OS X Version 10.10.1 operating system and 16 Gb RAM.The Eoc approach has always found solutions with lower or similar performance index than the Mmioc method, and in shorter computational time, especially in the case with four operating modes.Indeed, when the number of modes increases, the Mmioc method becomes slower since it suffers from drawbacks due to the inherent combinatorial nature of the problem, whereas the Eoc approach does not.

Conclusions
In this paper, the embedding approach has been used to efficiently solve optimal control problems for planar underactuated robot manipulators with two revolute joints and brakes at the unactuated joints, and in the presence of gravity.The control problem of these switched dynamical systems is particularly difficult to solve due to the combinatorial nature of the problem and to the presence of nonholonomic differential constraints.It has been shown that this method does not require using integer variables or computing the optimal switching times.It requires neither guessing the number of modes nor the initial sequence of modes.The results of the numerical experiments have shown the effectiveness of the embedding approach in solving optimal control problems of underactuated mechanical systems.

Figure 2 :
Figure 2: Sequence of configurations of the optimal solution for the swing-up maneuver for the Pendubot with brake.The configurations in which the Pendubot operates in Mode 1 and Mode 2 are represented in black and red, respectively.The corresponding control and state variables are represented in Figure 3.

Figure 3 :
Figure3: This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swing-up maneuver of the Pendubot with brake.The corresponding sequence of configurations of the robot manipulator is represented in Figure2.The same figure shows (in black) the optimal solution of the same boundary value problem obtained with a Mmioc approach.The dashed vertical lines represent switches between phases.

Figure 5 :
Figure5: This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swing-down maneuver for the Pendubot with brake.The corresponding sequence of configurations of the robot manipulator is represented in Figure4.The same figure shows (in black) the optimal solution of the same boundary value problem obtained with a Mmioc approach.

Figure 6 :
Figure6: Sequence of configurations of the optimal solution for the swing-up maneuver for the Acrobot with brake.The corresponding control and state variables are represented in Figure7.

Figure 7 :
Figure 7:  This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swing-up maneuver for the Acrobot with brake.The corresponding sequence of configurations of the robot manipulator is represented in Figure6.The same figure shows (in black) the optimal solution of the same boundary value problem obtained with a Mmioc approach. ẋ

Figure 9 :
Figure9: This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swingdown maneuver for the Acrobot with brake.The corresponding sequence of configurations of the robot is represented in Figure8.The same figure shows (in black) the optimal solution of the same boundary value problem obtained with a Mmioc approach.

Figure 10 :
Figure 10: Sequence of configurations of the optimal solution for the swing-down maneuver for the Brakebot.The configurations in which the Brakebot operates in Mode 1, Mode 2, Mode 3, and Mode 4 are represented in green, blue, orange, and red, respectively.The corresponding control and state variables are represented in Figure 7.

Figure 11 :
Figure 11:  This figure shows in different colors, namely, Mode 1 in green, Mode 2 in blue, Mode 3 in orange, and Mode 4 in red, the operating modes of the optimal solution obtained with the Eoc approach for the swing-down maneuver for the Brakebot.The corresponding sequence of configurations of the robot manipulator is represented in Figure10.The same figure shows (in black) the operating modes of the optimal solution obtained with a Mmioc approach.

Figure 12 :
Figure 12:  This figure shows (in red) the control and state variables of the optimal solution obtained with the Eoc approach for the swingdown maneuver for the Brakebot.The corresponding sequence of configurations of the robot manipulator is represented in Figure10.The same figure shows (in black) the optimal solution of the same problem obtained with a Mmioc approach.
1, and  2 is the torque applied by the actuator at Joint 2.
and T 0 , B 0 , T  , and B  are the boundary sets for  0 , ( 0 ),   , (  ). is a real-valued function defined in the neighbourhood B = T 0 × B 0 × T  × B  , which is assumed to be a compact set, and  0 (,) :   → R,  ∈   ,  ∈  V , are  1 functions.

Table 1 :
Results for the swing-up maneuver for the Pendubot with brake:   = 3.6 [s].

Table 2 :
Results for the swing-down maneuver for the Pendubot with brake:   = 3.6 [s].

Table 3 :
Results for the swing-up maneuver for the Acrobot with brake:  = 3.4 [s].