Multiphase Trajectory Generation for Planar Biped Robot Using Direct Collocation Method

Stability and energy efficiency are the main focuses in the bipedal robot field. In this paper, we apply a multiphase gait, which is different from the widely used two-phase gait, to improve the stability at the moment, when a biped robot transfers from the double support phase to the single support phase. *en, we create dynamic equations with contact forces in each phase using Lagrangian formulation. Furthermore, the direct collocation method is utilized to generate the optimal trajectory toward both stability and energy efficiency. Finally, the comparison between multiphase gait and two-phase gait is performed with numerical simulations.*e results prove that multiphase gait increases the stability margin in the cost of slightly decreasing energy efficiency. Besides, both gaits show a similar human-like characteristic in hip height variation during walking.


Introduction
In the field of bipedal robot research, both stability and energy efficiency are always the main goals. At present, the bipedal robot still experiences low stability and high energy cost problem. is paper aims to dig toward this orientation further.
Raibert developed a successful running bipedal robot with one-foot gait, which distinguishes the state of the leg into two phases, support or stance phase and flight phase [1]. e support phase is when the leg provides support, and the flight phase is when the leg is in the air. e leg switches two phases in strict alternation. As for the walk of the bipedal robot, most researchers suppose it consists of two successive phases, single support phase (SSP) and double support phase (DSP). ese two phases are easily understood according to the above definition of support phase: SSP is a period when only one leg is in the support phase, and DSP is when both legs are in the support phase. Actually, the studies of human walking gait find that there are two subphases of DSP and SSP each [2]. en, these two foot rotation subphases are applied to biped robot [3][4][5]. e degree of freedom (DoF) of the biped robot, which is used in this paper, varies in different phases. In SSP, the robot is fully actuated, and when considering the heel lift subphase, it is underactuated; in DSP, it is overactuated [6]. In this paper, we add the foot rotation subphase in DSP based on the two-phase gait, so there are only fully actuated SSP and overactuated DSP. e theoretical benefit of this gait is illustrated in Section 2.2. e finding of the pros and cons of this multiphase gait is our focus in this paper. is gait is close to that used in paper [7], except we use different methods to generate trajectory and give out the comparison with two-phase gait to analyze the benefit of this multiphase gait.
Energy efficiency and stability have long been objectives of trajectory generation. For stability, the zero moment point (ZMP) [8] has been used since the last century. e ZMP is the point on the ground, where the aggregation of moments about this point is zero. If ZMP is within the convex sets consisting of regions where feet contact the ground, the robot feet would not rotate around ZMP, and the robot is considered stable. is stability criterion has been widely applied in biped like ASIMO, NAO, and HRP-2. ere is a different method called hybrid zero dynamics (HZD) [9], which can be used to analyze the stability of periodic orbits.
is method is good at solving underactuated systems, but comparing to the ZMP method, this method is relatively complex and hard to understand. Besides, we do not consider the underactuated phase here; thus, we choose the ZMP method in this paper.
For optimal trajectory generation, the most frequently used method is parameter optimization, which firstly represents the joint trajectories as functions of time decided by a vector of real numbers and then takes these numbers as the decision variables of the optimization problem. Different gradient-free optimization methods, which do not need gradients to determine the search direction, are utilized to find these decision variables. Farzdpour et al. [7] used the genetic algorithm (GA) to obtain the key parameters in trajectory generation for a seven-link planar bipedal robot with multiphase. Elhosseini et al. [10] used A-C parametric whale optimization algorithm (WOA) to find optimal parameters of the hip trajectory making ZMP close to the middle of the support polygon in two-phase gait. is method is an efficient and tractable way of generating suboptimal trajectories. However, since this method reduced discrete optimization variables to a finite number, path constraints, which are along the whole trajectory period, like dynamic equation constraints, are difficult to satisfy. Besides, polynomial functions may arise undesirable oscillations of optimized functions or jerky variations at connecting points [11].
Some researchers applied the differential dynamic programming (DDP) method on trajectory generation. Feng et al. [12] achieved online trajectory optimization by using the DDP method to derive the trajectory of the center of mass of the robot as a high-level trajectory. Budhiraja et al. [13] used DDP with Karush-Kuhn-Tucker constraint to generate the whole-body trajectory while tracking the centroidal trajectory. is method usually combines with a simplified model and can generate the trajectory in milliseconds so it can be applied online. However, the DDP approach needs the second-order derivatives of the system dynamics, which leads to the issue that is unsuitable for large problems, and does not have a unified solution for the problem with general inequality constraints [14].
Except for the above parametric optimization and DDP method, the direct collocation method [15][16][17] is one promising alternative to effectively solve trajectory optimization problems [18]. It is especially suitable for highly complex problems. Besides, it can be solved with present nonlinear programming (NLP) solvers, which can converge with poor initial guesses and are extremely computationally efficient [19]. Most importantly, it overcomes some limitations of the above methods since it can satisfy a bunch of path constraints and do not need the second-order derivatives of the system dynamics, which is hard to derive for the complex full-dimensional dynamic equations.
ere are many works about the direct collocation method applied in the legged robot area. In 1999, Hardt et al. [20] implemented direct collocation method using software DIRCOL to generate minimum energy symmetric, periodic gaits for a five-link bipedal robot with SSP, and instantaneous DSP gait. Xi et al. [21] used both direct collocation and multiple shooting methods to figure out how increasing speed affects the choice of gait and also inverse, under the objective of minimizing the cost of transport for both a planar biped and planar quadruped. Multiple shooting method is one of the trajectory optimization methods. Compared to the direct collocation method, this method has benefits like simplicity and straightforward, but it suffers slow convergence speed and is sensitive to initial guesses [22]. Ma et al. [23] used nonlinear programming toolbox FROST [24] to generate walking gait on the slippery surface combining with the hybrid zero dynamics control method to achieve stable walking for the AMBER-3M planar robot with point feet. Li et al. [25] utilized a direct collocation method with the closed-form of centroidal momentum (CM), which is CM generated by ground reaction forces (GRFs) that should match with CM calculated from the robot's generalized coordinates and velocities, to generate the trajectory for a five-link legged robot with the reduced numerical error.
is paper aims to testify the benefits of three phases (SSP, DSP subphase1, and DSP subphase2) walking style and to generate the optimal trajectory for our seven-link planar biped robot toward energy efficiency and stability for this walking style. In this paper, we use the direct collocation method to generate the optimal trajectory for this walking style, which can increase the ZMP stability margin when biped robot transfers from DSP to SSP. e holonomic dynamic equations with contact forces in each phase are created by the Lagrangian method, while the contact impulse is not considered in our paper to keep the velocity continuity. Various constraints are considered in our optimization, like unilateral contact, friction cone, and given speed. We compare the generated two trajectories (three-phase gait and two-phase gait) in energy efficiency and stability and find that three-phase gait increases the stability although it is in the cost of slight energy cost, and direct collocation method brings about the human-like hip height trajectory during SSP, which is the cycloid curve. e rest of the paper is organized as follows. Section 2 illustrates the differences between the two-phase walking style and three-phase walking style. e procedure for deriving the multiphase dynamic model is also presented. Section 3 is about the implementation details of direct collocation for generating optimal trajectory. e objectives and constraints of optimization are provided. Section 4 presents the generated three-phase and two-phase gaits. e comparison between these two gaits is conducted. In the end, conclusions and discussion are presented in Section 5.

Model Symbol Definition.
e bipedal robot in this study is a seven-link planar biped robot, which owns six actuators. Every two adjacent links are connected by an actuated revolute joint. Figure 1 presents the schematic biped robot. e symbols' meanings and values are given in Table 1, and they are consistent in the paper. Most of the parameters match with our biped robot [26] except the parameters of torso since our robot now does not have the upper body. We add it here because the upper body can equilibrate the angular momentum of the swing leg. e parameters of the upper body, mainly the mass and the length, are calculated according to the ratio between the upper body and lower body in paper [27].
Forefoot is the length from toe to ankle. Inertia is about its center of mass.
ere are other data in Figure 1 needed to be illustrated. e fixed coordinate is set in the heel of the support leg (in red). e direction of the X-axis is along the walk direction, and the direction of the Y-axis is vertical to the ground and point to the top. q � [q 1 , q 2 , q 3 , q 4 , q 5 , q 6 , q 7 ] are the generalized coordinates used for depicting the configuration of robot. ey are measured from the horizontal line through the revolute point to the link, which is called absolute angle usually, and the positive direction is counterclockwise same with convention. P 0 , P 1 , P 2 , etc. are the position of revolute joints, which can be derived by q and robot parameters through kinematics. G 1 , G 2 , G 3 , etc. are points of center of mass (CoM) of each link. Usually, we set CoM in the geometry middle point of the link except for the foot we put it in the middle of the sole.

e Biped Robot Dynamic Model.
Before we start to derive the dynamic model, we need to illustrate "multiphase" in the heading. In this paper, we only consider periodic gait. A gait is defined from one foot off the ground until the other foot leaves the ground.
In this paper, we consider two different walking styles shown in Figure 2. e upper one includes two phases: SSP (right half) and DSP (left half ), and this walking style is widely used. To make it more clear, we picture the middle state in SSP and DSP. e single support phase is when one leg (swing leg) of the robot swings, and the other leg (support leg) is on the ground. e double support phase is the state when both legs are touching the ground. When using ZMP as a stability criterion, we need ZMP to stay within the support polygon (SP), which is the convex sets consisting of regions where feet contact with the ground, and the minimum distance from ZMP to the boundary of SP is called the stability margin. e more the stability margin, the more stable the robot. Since we limit our research to planar walk, the support polygon is a line along the X-axis and ZMP is a point in the X-axis.
For the upper walking style, in DSP, the SP is from x A to x B , where x A is the horizontal position of swing leg toe and x B is the position of support leg heel. In SSP, the SP is from x B to x C , where x C is the horizontal position of support leg toe. So, when the robot switches from DSP to SSP, ZMP must go through the intersection point i.e., the SP is only a point; thus, at this moment, the stability margin is zero, a small deviation will cause the robot to fall. To prevent this situation, the below walking style is given in Figure 2.
is walking style divided the double   Figure 1: Biped robot model. support phase into two subphases (subphase1 and sub-phase2). In subphase1, the support foot rotates about its heel to flat, while the swing foot rotates about its toe to an intermediary angle, and its SP is from x A to x B . In subphase2, the support foot keeps flat and the swing foot continues rotating until toe-off, which is the initial state of SSP and its SP is from x A to x C . erefore, at the moment, when robot switches from DSP subphase2 to SSP, the stability margin is a line between Comparing to the first walking style, the stability margin at transferring moment increases.
Note when the robot transfers from SSP to DSP, the swing leg (in blue) becomes the support leg (in red), likewise in original support leg. e following is the procedure of deriving the equation of motion (EoM) for each phase. To simplify our procedure, we make the following modeling assumptions: e support leg foot is in full contact with the ground during SSP, and there is no slippage. During DSP, the support leg foot and swing foot rotate around a pin point, which does not slip. e instantaneous impact event, when swing foot contacts with ground, is avoided by making the contact velocity zero.

Single Support Phase. Multibody dynamics with contact usually formulate as
where M(q) ∈ R n q ×n q is the generalized mass matrix; n q is the dimension of q; q, _ q, € q ∈ R n q are the generalized position, velocity, and acceleration, respectively; b(q, _ q) ∈ R n q ×n q is the Coriolis and centrifugal matrix; g(q) ∈ R n q is the gravitational term; τ ∈ R n q is the generalized actuator forces; J ∈ R n c ×n q is the Jacobian matrix of contact points; n c is the number of contact points; and λ ∈ R n c is the Cartesian contact forces. en, we use the Euler-Lagrange formalism to derive the EoM of the single support phase. Since, in SSP, the link 1 does not move, i.e., can be regarded as the ground, we can eliminate link 1 and its corresponding q 1 . In the end, the EoM of single support phase can be written as where u � [u1; u2; u3; u4; u5; u6] is the vector of actuator torques and B 1 is the distribution matrix, which maps the actuator torques to generalized actuator forces. e actual dimension of matrices is illustrated in the bracket.

Double Support Subphase1.
In DSP, the biped robot is overactuated because there are two contacts (ground with support leg and swing leg). In kinematics, the robot is the closed chain, and contacts can be handled as kinematics constraints. In subphase1, the heel of support leg P 0 and toe of swing leg P 7 are both in contact with the ground. Suppose the distance between P 0 and P 7 is D s . We use P 0x , P 7x and P 0y , P 7y represent the X coordinate and Y coordinate of each point. en, the holonomic kinematic constraints are as follows:   Mathematical Problems in Engineering e Jacobian of holonomic constraints is Hence, the EoM of the robot in subphase1 formulates as follows: λ 1 is unknown here, so we could not solve this equation now. One way to solve it is eliminating the λ 1 . First, we assume J c1 is the full rank, so there is the orthogonal complement matrix of it, J c1null , satisfying J c1 J c1null � J T c1null J T c1 � 0. Now we transfer to derive J c1null . We divide the generalized coordinates into two categories: independent (q in ) and dependent coordinates (q de ). In this phase, we can choose them as Taking the derivative of equation (3), we gain en, we combine equations (6) and (7): where J c1in � (zΩ 1 /zq in ) and J c1de � (zΩ 1 /zq de ). en, we shift _ q de to the left side yield: us, _ q can be expressed as Replacing _ q in equation (7), Given that _ q in ≠ 0, we can derive Now J c1null is derived. us, we can multiply both sides of equation (5) with it: Taking the second derivative of equation (3), we get Combining equations (13) and (14) together, the final EoM of subphase1 can be written as where

Double Support
Subphase2. e procedure of deriving the dynamic model of subphase2 is similar to subphase1, except we need to add one more constraint for q 1 : en, the same procedure of subphase1 is followed to compute EoM, but with caution when choosing the dependent (q de2 ) and independent (q in2 ) coordinates. If we choose like in subphase1, corresponding J c2de will be rank deficient; i.e., no inverse matrix exists; thus, we need to adjust q de2 and q in2 into e transition from DSP1 to DSP2 (an abbreviation of DSP subphase2) and DSP2 to SSP, the generalized coordinates, and velocities are continuous.

Trajectory Optimization
In the prior section, we get the dynamic models of the biped robot in all phases. In this section, we use the direct collocation method to generate the optimal trajectory. e direct collocation method converts the trajectory optimization problem into a nonlinear programming problem by approximating all the continuous functions in the original problem as polynomial splines. In this paper, we utilize the standard collocation method [22] with the Hermite-Simpson method. Our problem has three phases, so we need to use the multiphase method [18,25]. It is like solving multiple single-phase problems simultaneously. e difference is there are linkage constraints between any two connected phases, hence combining into an integrated trajectory. e following are the details of the optimal trajectory problem.

Optimization Objective.
In this paper, we aim to generate low energy costs and high stability trajectory. For the energy cost, one common criterion is the cost of transport (CoT), which is the energy cost of the average distance moved by the robot, but CoT is a hard cost function to optimize over since it usually derives discontinuous solutions [27]. erefore, we use a simple but effective function-torque squared cost function: where T is the time of trajectory. is cost function usually leads to smooth solutions. e smooth solution has three advantages. e first one is that it is easy to approximate for piecewise polynomial spline and thus decreases the error because of approximation. e second one is that it is tractable to apply to the real robot. e last one is that it can prevent the torque from becoming too large.
As for stability, we use ZMP stability criterion. e ZMP point can be derived by the following equation: where € x i , € y i are the horizontal and vertical accelerations of G i , n is the number of links, and g is the gravity acceleration. en, we limit the actual ZMP point to our preplanned ZMP curve. Note we cannot directly use |ZMP − ZMP des | to measure the deviation from actual ZMP to the desired ZMP since the absolute function is not continuous. erefore, we use the below hyperbolic tangent function to approximate absolute function: where ZMP des is the desired ZMP and α is a parameter adjusting the smooth of the function. Combining these two objectives, the final objective is where k zmp is the coefficient to balance the effects of ZMP and energy cost since the actual L ZMP is much smaller than L u . By chosen suitable k zmp such that two costs are at the same order of magnitude; otherwise, the effects of ZMP would not arise.

Constraints.
A parade of constraints is required to generate a feasible trajectory. For different phases, different parts are constrained to produce an ideal solution. Usually, the constraints can be divided into path constraints and boundary constraints. Path constraints are the restriction along the trajectory, while boundary constraints only take effect at the beginning and the end. e constraints applied in this paper are explained as follows (path constraints first): (1) One of the basic constraints is avoiding the hyperextension of the knee joint and preventing the foot contact with the shin: (2) Contact force constraints. Since we assume there is no slippage, the contact force must be within the friction cone. Besides, the contact is unilateral. ese constraints correspond to the first two modeling assumptions: 6 Mathematical Problems in Engineering where F x and F y denote forces in the support leg, which can be solved by Newton's second law. λ x and λ y are forces in the swing leg, which occur in the DSP. μ is the friction coefficient. (3) During SSP, the swing leg should be constrained to above the ground, and setting the ground clearance can make the trajectory more desirable. Two options to do this: one way is to keep the swing foot above continuous-time function f(t), and the other way is to prescribe continuous state function f(x). Here, we choose time-based foot clearance: where T SSP means the duration of SSP, h is a set parameter of the highest height, and P 71 represents the position of swing foot heel. (4) At the beginning of SSP (toe-off), the Y-axis velocity of swing foot toe should be positive: (5) At the beginning of SSP, swing foot toe should be at the ground, and at the end of SSP, swing foot heel should be at the ground: where 0 means the time is zero, i.e., beginning of SSP. T SSP means the duration time of SSP, i.e., end of SSP. (6) We set the robot walk at a given speed (1 m/s). e ankle horizontal displacement in a step is constrained to 0.8 m, and the cycle time is prescribed as T � T DSP + T SSP � 0.8 s. However, in this paper instead of the constraint of ankle movement, we constrain the distance from swing toe to support heel at the beginning of SSP and support toe to swing heel at the end of SSP. Besides, according to the knowledge that the percentage of DSP is about 20% of the time and the other 80% is SSP [2], the T DSP and T SSP are determined: where P 00 represents the position of support foot toe. (7) Since no impact, we constrain the swing heel velocity is zero just before contact. is constraint corresponds to the third modeling assumption: (8) Phase transition (linkage constraints): these constraints are corresponding to the transitions part of the last section. SSP to DSP1, there is the reset map (Equation (18)); DSP1 to DSP2, keep continuous; and DSP2 to SSP, keep periodic. e parameters used in optimization are shown in Table 2.
e value of limits of T DSP1 is determined by trial and error.

Results
We solve trajectory optimization using ICLOCS2 [28] software with IPOPT [29] and linear solver MUMPS. We use random values between the lower limit and upper limit as the initial guess. For the two kinds of walking styles, we both solve 10 times from 10 random initial guesses. All optimizations are solved on a laptop computer (processor: 1.8 GHz quad-core Intel i7-8550U) with MATLAB R2019a. e average iterations and computation time are 593 iterations and 181 seconds for the first kind walking style. As for the second one, it takes much longer, and the average iterations and computation time are 1782 iterations and 546 seconds. Note here that we do not use any analytic information, and we only use IPOPT's numerical approximation. Stability [30] is not considered here. We give out the details of the generated trajectory for each walking style and the comparison of these two walking styles.

Results of the First Walking Style.
e curves of each joint angle and joint velocity are given in Figures 3 and 4. e middle vertical line depicted in all figures of this section represents the dividing line, the left part is DSP, which also be divided into DSP1 and DSP2 by a middle vertical line in the second walking style, and the right is SSP. From these two figures, we can see the curves are smooth, and state switching from DSP to SSP is continuous. Since the ranges of q 1 and q 7 (angle of the foot) are different from others, they are above the others. Also, _ q 5 , _ q 6 , and _ q 7 all experience large fluctuation during SSP since they swing from back to fore. We observe that the swing foot experiences the most fluctuation like a roller coaster, then the swing knee, and the last one is the swing hip. From the view of minimizing energy, it is reasonable first to adjust the minimum energy cost part.
is gives us a hint of control method, and we should Mathematical Problems in Engineering prioritize the control of the minimum energy cost part. e stick diagram is shown in Figure 5. During the medium of SSP, the knee joint is at the singularity position, and this leads to the cycloid curve of hip height, which is shown straightly in the next figure. e swing foot toe and heel are all above the ground clearance curve (green and yellow, respectively) during SSP. e hip height curve is shown in Figure 6. It depicts a more straightforward view of the cycloid curve during SSP.

Results of the Second Walking
Style. e optimal value of T DSP1 is 0.1161 s, which is about 58% of the whole DSP period.
e joint angle and velocity trajectories of this walking style are plotted in Figures 7 and 8, respectively. ey are also smooth and continuous when switching between different phases. Figure 9 illustrates the stick diagram. e hip height trajectory is depicted in Figure 10. e height of the hip joint is also a cycloid curve. Figures 3 and 7, the joint angle trajectories are basically the same. As for the joint velocity (Figures 4 and 8), comparing to the first walking style, the scope of fluctuation in the second walking style when transferring from DSP to SSP is slightly bigger and at the end of SSP is much smaller. e energy cost of these two walking styles is different. Since both walking styles experience the same distance and time, we can directly use the equation below without dividing the distance to compare:

Comparison of Two Gaits. Comparing
where N is the number of collocation points (20 in this paper) and τ is the generalized actuator torques in equation (1). e energy cost of gait2 is 1040 joule, and it is slightly more than gait1 (915 joule). Although the second gait increases the cost of energy a little, it raises the stability. Figure 11 illustrates the trajectory of ZMP, the left one is gait1, and the right one is gait2. Both ZMP trajectories follow the desired ZMP and within the support polygon, which is the region between the black dotted line. In gait1, at the transferring moment from DSP to SSP, ZMP (depicts as the red circle) is at the ZMP upper limit of DSP, which leads to the trend of biped robot falling forward; also, it is at the ZMP lower limit of SSP, which leads to the trend of biped robot falling backward. In contrast, for gait2, ZMP at transferring moment is 0.053, which is one-third the distance between the ZMP lower limit and upper limit of SSP, so the stability margin is increased from 0 to 0.053 m, which prevents the falling down caused by small deviation, as we mentioned the benefit of this walking style at Section 2.2.

Effect of Weight between Energy Cost and Stability.
We also investigate the influence of the weight between energy cost and stability on the result. We add a weight w to our objective function (Equation (21)): We set five different weight values w ∈ 0, 0.25, 0.5, { 0.75, 1} and find that the energy cost decreases with the increase in weight in both gaits ( Figure 12). Besides, when w < 0.5, the energy cost of two-phase gait is more than three-    phase gait, but when w ≥ 0.5, it is the opposite. Note that when w � 1, the two-phase gait could not get the optimal solution, so this point is not in Figure 12.
For the effect of weight on ZMP (Figure 13), when w ≤ 0.5, although actual ZMP is slightly getting worse with the increase in weight, it still can follow the desired ZMP, but when w > 0.5, it is getting very bad. Note that when w � 1, the two-phase gait could not get the optimal solution, so there is only the ZMP curve of three-phase gait. Although when w < 0.5, the energy cost of gait1 is more than gait2, and it seems opposite to our conclusion, but at this time, the energy cost is more than w � 0.5, so our conclusion is still correct when considering energy cost and stability equally (w � 0.5).

Conclusions and Discussion
Aiming to increase the stability of the biped robot when transferring from DSP to SSP, this paper divides DSP into two subphases. en, we apply the direct collocation method, which is suitable for our problem and overcomes some limitations of parametric optimization and the DDP method, to generate the optimal multiphase trajectory toward stability and energy efficiency. In the end, we compare the three-phase trajectory with the traditional two phases. e comparison confirms that multiphase gait can increase stability although it consumes slightly more energy. Besides, both trajectories show the human-like cycloid curve, which makes the gait look more natural. In the future, we will use more accurate and effective direct collocation method, like orthogonal collocation methods [31] instead of the standard collocation methods, which we use here. Besides, we will apply this trajectory to a physical robot and try to reduce computation time through a less complex dynamic model like centroidal momentum model. We will generate various  trajectories with different velocities and different foot ground clearances to build a library, which can make the robot more adaptive to different situations.

Data Availability
e data (MATLAB file) used to support the findings of this study are available from Shuai Heng upon request.

Conflicts of Interest
e authors declare that there are no conflicts of interest.