Universal Walking Control Framework of Biped Robot Based on Dynamic Model and Quadratic Programming

Biped robot research has always been a research focus in the ﬁeld of robot research. Among them, the motion control system, as the core content of the biped robot research, directly determines the stability of the robot walking. Traditional biped robot control methods suﬀer from low model accuracy, poor dynamic characteristics of motion controllers, and poor motion robustness. In order to improve the walking robustness of the biped robot, this paper solves the problem from three aspects: planning method, mathematical model, and control method, forming a robot motion control framework based on the whole-body dynamics model and quadratic planning. The robot uses divergent component of motion for trajectory planning and introduces the friction cone contact model into the control frame to improve the accuracy of the model. A complete constraint equation system can ensure that the solution of the controller meets the dynamic characteristics of the biped robot. An optimal controller is designed based on the control framework, and starting from the Lyapunov function, the convergence of the optimal controller is proved. Finally, the experimental results show that the method is robust and has certain anti-interference ability.


Introduction
Humanoid robots are expected to perform various tasks in the near future. Humanoid robots have the characteristics of multiple joints and multiple degrees of freedom, which makes them have the potential to satisfy multitasking. e humanoid robot's motion control often faces some difficult situations, such as the number of degrees of freedom required by the robot to perform a variety of expected tasks is higher than the number of degrees of freedom the robot has. For a legged robot with six underactuated degrees of freedom as a whole, the most important tasks of motion control are maintaining balance and completing the expected trajectory. Generally speaking, the robot's feet are used to complete these motion trajectories, and then the corresponding grasping tasks are performed by the arms. However, with the development of motion control technology, arm movements can also be used to enhance the overall balance performance of the robot. Arm movement was originally used for dynamic balance during walking, but it can also be used to assist in tasks such as kicking [1]. erefore, a full-body motion control framework needs to be established to find the best control output, so that the robot can complete various expected tasks under the constraint conditions such as maintaining balance [2,3]. e selection of the control variables of the robot's whole-body motion control frame has a great influence on the control performance that the robot can achieve. Examples of well-known whole-body motion control include virtual model control [4,5], passivity-based whole-body controllers [6][7][8], and inverse dynamics-based approaches such as those presented in [9,10]. Most methods solve local minima, i.e., optimize only the state of the current control frame, and then use the centroid dynamics model for predictive control. e method of using the whole-body dynamics model for predictive control has a higher model complexity and a larger amount of calculation [11]. anks to the improvement of the computing power of the controller, it can be applied to the gait control of biped robots [12].
ere are too many control algorithms for biped robots, and it is difficult to describe all control systems with one structure. e controller is usually divided into walking pattern generator and stabilizer. e former is used to generate a series of joint trajectories to act on the actual robot. When the robot is about to fall, the latter will modify the previously generated trajectory based on various sensor readings (contact force sensor, gyroscope, accelerometer, etc.). e ASIMO robot is the most outstanding representative of this method, usually based on the zero moment point (ZMP), the divergent motion component (DCM), or their deformation [13]. Although it is impossible to unify all methods, the structure of the controller can be divided into upper-level motion plan (advanced motion plan) and lower-level tracking (low-level tracking) [14]. e upperlevel motion plan usually uses a simplified dynamic model to generate the center of gravity and foot trajectory, while the lower level below considers using a more complex model to generate the position or torque of each joint. e generation of high-level motion trajectories usually requires model-based prediction of the future.
is is the basic idea of model predictive control (MPC). e mainstream method is still to form a trajectory optimization problem and then use MPC to solve the optimization problem and implement a real-time solution [15]. e simplified model is usually used because the calculation amount of the complex model is too large, and there is currently no method for performing real-time calculation. e specific degree of simplification is related to the movement to be achieved, the form of trajectory optimization, and the computing power of the existing controller. It should be noted that even in the case where the upperlayer trajectory is based on a simplified model, since the lower layer considers a more complex model, this makes the resulting joint trajectory conform to the dynamic characteristics of the actual robot and can follow the upperlayer trajectory very well. Usually, the lower level needs to solve inverse kinematics (IK) or inverse dynamics (ID) [16]. e mainstream method is to integrate IK or ID into one or more quadratic programming (QP) and then solve these QP problems in real time to generate the final joint trajectory [17,18].
is paper proposes a whole-body motion control method based on inverse dynamics, integrating motion control into an optimization framework based on quadratic programming and using joint torque and linear contact force as optimization variables for quadratic programming problems. Considering the complexity of the algorithm and the real-time nature of control, this paper selects the existing technology and finally forms the most complex control framework. e second section introduces the DCM trajectory planning; the third section introduces the control framework used for trajectory tracking; and the fourth section introduces the construction of the secondary planning controller. Section 5 verifies the control framework through experiments.

Biped Gait Trajectory Planning
Based on DCM DCM points in three-dimensional space can be defined as T are the position and velocity of the center of mass (CoM) point, respectively. e formula above shows that the CoM point can naturally and steadily follow the movement of the DCM point [19,20]. After differentiation, we obtain It shows that the external force F can directly affect the dynamics of DCM. rough analysis, the combined force F received by the system is In formula (3), m is the system mass, g is the gravity coefficient, x is the robot coordinate, r is the VRP, which is the space vector, and Δz vrp is the scalar, which represents the height of the robot's torso.
In order to make the actual DCM point run on the planned trajectory, the closed-loop control method is used to track the expected DCM trajectory: Adopt proportional (P) control rate to achieve stable tracking of DCM points. Among them, when k ξ > 0, the DCM error can converge gradually and steadily. For a given virtual repellent point (VRP) r vrp and the corresponding centroidal moment pivot point (CMP), r ecm position, it can converge to the expected value. Expected DCM tracking control rate is Among them, formula (5) only stabilizes the unstable part of DCM and does not affect the natural and stable CoM dynamic changes. e closed-loop dynamic situation is Among them, I and 0 represent the unit matrix and zero matrix of 3 × 3. For k ξ > 0, the eigenvectors of the system matrix are stable, and the feed-forward terms ξ d and _ ξ d are 2 Complexity both bounded and then can meet bounded input bounded output stable (BIBO) standards.

Dynamic Control Framework for Biped Robot
A universal control method for biped robots is to control rigid joint tracking position through IK. is method only requires kinematic models and is widely used in actual robots; the other is based on ID. e method can provide a compliant motion trajectory and is robust to external disturbances, but its performance depends largely on the quality of the dynamic model, which is difficult to obtain by measuring actual robots. At present, the better method is to use the inverse kinematics IK as a complement to the controller based on the inverse dynamics ID to compensate the modeling error to improve the anti-interference ability. Many biped robot control methods can be decoupled into two layers: a planner that outputs the position of the center of mass and a lowlevel controller responsible for joint layer control. However, in order to expand the application space of the biped robot and make the robot more robust to external disturbances, the underlying controller also needs to consider full-body kinematics and dynamics. e controller architecture needs to solve for whole-body ID and IK in each control cycle to track higher-level targets. e control model is divided into two parts: IK and ID. Both are formulated as two independent QP problems. Problems have their own goals and constraints.
e servo control command of the joint layer is calculated by the following formula: In formula (7), q d , _ q d , and τ d are the expected joint position, speed, and torque, and q, _ q, and τ are measured quantities. Use ID to calculate τ d for force control and IK to calculate q d and _ q d . For robot gait, the robot trajectory is generated at the user level, such as the expected trajectory of feet, hands, and CoM in the Cartesian coordinate system. e full-body robot controller takes it as input and calculates the control amount of each individual joint, such as the joint position, speed, and torque, and then uses these control variables as reference inputs for the joint layer servo controller. e robot's joint position, speed, joint acceleration, and torque are calculated separately. IK and ID are described by QP problems: In the formula, unknown state variables χ and constraints C E , c E , C I , and c I are all related to specific control problems. Two QP problems are solved once in each control cycle. e cost function can be optimized as So, G � A T A and g � −AT b . A and b can be broken down into w i is the weight needed for the cost function.

Joint Force Control Based on Inverse
Dynamics. e motion equation and constraint equation of a biped robot can be described as In formula (11), (q, _ q) is the complete state of the system, including 6 degrees of freedom on the underactuated torso, M(q) is the inertia matrix, and h(q, _ q) is the sum of gravity, centrifugal force, and Coriolis force, S is a selection matrix, where the first 6 rows of the 6 degree of freedom (DOF) corresponding to the underactuated torso are zero, and the rest is a unit matrix, τ is the joint moment vector, J T (q) is the Jacobian matrix of all contact points, F is the vector of all contact forces in the world coordinate system, and x is the vector of contact position and direction in the Cartesian space. e number of rows and columns of F and J T depends on the number of contact points. In order to describe the contact force of the biped robot with the ground more clearly, the equation of motion (11) was rewritten as follows: In formula (12), N c is the number of foot contact friction cones, J T k (q) is the Jacobian matrix of each friction cone, and f k is the contact force.

Cartesian Space Acceleration.
Using the acceleration deviation in the Cartesian space as the evaluation function, we obtain Calculate the input € x * by formula (13), where x * d , _ x * d , and € x * d are specified by the higher-level controller. Calculate the sum based on the current robot state through forward kinematics, and track the CoM, hand, foot, and torso directions. Acceleration can be calculated by equation (13). For different calculation targets, the Jacobian rows without constraints need to be omitted accordingly. In order to enhance the stability of calculations during contact, contact is not regarded as a hard constraint, but a penalty function Complexity 3 with a higher weight is used to represent the contact to speed up the solution of the quadratic programming problem. e cost function at the time of contact x * d , _ x * d can be ignored and set € x * � 0. tAt the same time, considering the influence of CoM trajectory, torso posture, foot center trajectory of swinging leg, and foot center trajectory of standing leg on robot control, the cost function formula (10) is extended to (14), where w is the weighting coefficient and A com is the centroid tracking cost in the end acceleration tracking cost: Constraint equation (15) corresponds to the CoM trajectory: Similarly, the swing leg and standing leg trajectory constraint equations are e body posture constraint equation is

Center of Pressure (CoP).
Given the contact force and contact moment b M, b F in the foot coordinate system, the position of the pressure center in the foot coordinate system is Cost function of pressure center deviation is In formula (19), p is the contact force amplitude vector, which is consistent with the general β ij meaning of the constraint equation, * represents the reference value, and p * x , p * y is the expected pressure center position in the foot coordinate system given by the high-level controller. In formula (20), R is the transformation matrix from the world coordinate system to the foot coordinate system.

Weight Distribution.
In the two-leg support, specify the desired weight distribution w * � F zl /(F zl + F zr ), and add this to the cost function: In formula (20), S weight is a row vector with zeros, except for S weight (3) � 1 − w * and S weight (9) � −w * , the rest are zero.

Track State Variables and Regularization Directly.
Use the expected value directly to evaluate χ; then formula (21) is obtained, where A is the selection matrix, b is target vector, q is joint position, τ is torque, F is end contact force, and * represents the reference value: If no target value is specified in the formula, 0 is used. is item can be used to directly control specific joints or forces, which can directly adjust the value of the state variable χ, making QP problems easier to solve.

Torque Change.
In order to avoid joint torque oscillation, the speed of change of τ needs to be considered:

Complexity
In formula (22), τ prev is the output of the previous period.

Friction Cone Contact Model.
In the control of a biped robot, it is difficult to solve the contact force under different contact conditions. In order to simplify the solution process, the contact surface of the foot is approximated by multiple contact points, and the contact force of these contact points is solved. For the feet of a robot, a friction cone constraint can be used to determine whether the point will slide. e point will not slide only if the contact force vector is inside the friction cone. As shown in Figure 1, the apex of the friction cone is located at the contact point, the normal line is perpendicular to the ground, and the slope is determined by the friction coefficient of the contact between the sole and the point. However, this constraint method represented by a friction cone is nonlinear, although the solver can handle conic constraints. However, the calculation of the cone constraint is heavy. In order to speed up the solution, this paper uses a convex polygon as shown in Figure 1 to approximate the cone constraint of the contact point. In the contact force model represented by the convex polygon model, the feasible contact force and contact moment of the foot are a linear combination of the contact force f i of each point. f i is the product of a unit vector 0 uf i representing the direction of the contact force in space and a scalar ρ i representing the magnitude of the contact force, as shown in formula (25): erefore, at a given contact point, the contact forces in all friction cones can be represented by the contact forces at a total of nρ contact points. e unit vector 0 uf i of the feasible contact force direction is determined by the contact point. e amplitude scalar ρ i of the contact force is limited to a positive value. All nρ contact force amplitude scalars are combined into a contact force amplitude vector ρ; then, Contact force vector ρ can be mapped to six-dimensional space contact force and moment: In formula (25), k represents the kth part of the robot that is in contact with the external environment. Each 0 A ρ,k maps the contact force amplitude vector ρ on the limb to the contact force and contact torque to the k limb.
Each foot of a biped robot is approximated by four contact points located on the edge of the foot, and the friction cone of each contact point is approximated by 4 contact force unit vectors, so the total number of contact force variables per leg n p � 16. In formula (25), the force ρ is mapped to the foot contact torque through the matrix A ρ,k .
It should be noted that the friction force obtained is reasonable only when the measured friction coefficient, the contact state of the foot, and the ground normal vector of the contact point are approximately accurate; otherwise, the system will be seriously unstable.

Constraints.
Considering the constraints such as the output torque of each joint of the actual robot, the motion equations need to be used as equation constraints, and various inequality constraints must be added to ensure that the solution results do not violate relevant physical conditions. Inequality constraints include positive contact forces, joint moments within a certain range, etc.; the optimization problem is formulated as min € q,β,f In formula (26), they are dynamic model constraint, friction cone constraint, contact force amplitude vector constraint, and joint torque constraint. e cost items are CoP tracking cost, end acceleration tracking cost, and torque change cost, τ is the joint torque, contact force vector cost, β ij is the contact force amplitude vector, f i represents the friction between the sole of the foot and the ground, and τ max is the limit value of the output torque of each joint. e coefficients of each cost term are ω 1 , ω 2 , ε 1 , and ε 2 .
At present, the friction coefficient f is obtained through preliminary experiments, and the friction coefficient of different roads is different. e function of the friction cone is to limit the force of the robot on the ground when the friction coefficient is small, to prevent sliding friction during Complexity walking, which will cause the robot to lose stability. In the future, under unknown walking conditions, sliding friction during walking will be detected and corrected.
Finally, the robot control framework is shown in Figure 2.
ere are many parameters in the control framework that need to be adjusted. e parameters that need to be adjusted include the PD controller feedback coefficient when calculating the end acceleration cost, the weight coefficient of each end point cost in the end acceleration cost, and the weight coefficient of each cost in the secondary programming problem, such as ω 1 , ω 2 , ε 1 , and ε 2 .
Parameter debugging process is as follows: (1) e weight coefficients w 1 and w 2 in the end acceleration cost are set to a larger value, about 1-100.
Debug ω 1 and ω 2 until better CoP tracking and end acceleration tracking are achieved. Although there are many total parameters, it is easier to determine the appropriate control parameters when debugging in stages.

Optimal Controller Based on PD Control and
Quadratic Programming e trajectory planning algorithm and the IK and ID algorithms can be used to obtain the desired angle of each joint of the robot, but the actual output torque of each joint is limited. erefore, it is necessary to design an optimal controller to limit the joint torque of the biped robot.
rough optimized force control, the robot's motion posture can better track the expected posture.
e biped robot dynamics model is Output dynamics are Differentiating formula (28), we obtain Substitute € q into equation (29) to obtain equation (31), where Sorting formula (31), we obtain Multiplying formula (32) left by D y � (J y D −1 J T y ) −1 , we obtain Among them, Formula (34) describes the relationship between the auxiliary variable, and when the controller F is used, the corresponding control law can be substituted into the equation to find. Consider the form of F: Put it into formula (27); then, (37) For the above system, consider the following Lyapunov function: Deriving from the above formula, we obtain 6 Complexity Converting the above formula into a matrix, we obtain In order to ensure the positive definiteness of the matrix in formula (40), k p > ((k d D T y D y )/2), a smaller value α is often selected, but it is necessary to ensure that formula (41) is a positive definite matrix: For an actual biped robot system, the physical meaning of D in the actual system is an inertia matrix, which is positively definite and symmetrical. erefore, D y can be obtained as a symmetric matrix. e selection conditions for k p can be modified to k p > ((k d D T y D y )/2); the corresponding matrix to be verified can also be simplified to e construction space of the biped robot system includes the joint angles q � (q 1 , q 2 , . . . , q 12 ) and the corresponding underactuated degrees of freedom. e mass data and the moment of inertia data of each part of the robot body are obtained from the CAD model. Considering trajectory following, its control problem is of the form: Among them, u is the actual output torque value of each joint, and U lim is the maximum allowable torque of each joint, according to the actual torque limit of each joint. By reasonable selection of the coefficients K p and K d in the controller, the influence of system nonlinearity on system stability can be avoided to the greatest extent. e matrix B represents the selection matrix in the underdrive system. e first 6 rows of the corresponding underdrive 6 DOFs are 0, and the rest are the unit matrix.    8 Complexity In order to quickly solve the optimal control problem, the open-source OSQP solver is selected to solve the abovementioned quadratic programming problem. e solver uses an improved first-order alternating direction multiplier method (ADMM) to solve the quadratic programming problem [21]. By decomposing the matrix at the initial stage, the calculation amount in the subsequent process is greatly reduced. e optimal control algorithm is robust; its optimization results are not sensitive to the initial value, and good solution results can be obtained by using different initial values. e control framework integrates methods that can improve the robustness of the system. e framework uses DCM for trajectory planning, and DCM itself has anti-interference ability; uses friction cone and whole-body dynamics model to improve the accuracy of the model and reduce the control error caused by the rough model; considers responsibility for the actual situation to obtain a variety  of constraints and use QP; realizes the optimal controller; and further improves the robustness of robot walking.

Experimental Verification
Under the robot control framework, force control and position control are applied to the biped robot, respectively, and the comparison effect is examined. e experimental conditions are that the biped robot moves forward 5 steps, the robot stride is 30 cm, and the period is 1.2 seconds. e weight of the physical robot is about 60 kg, as shown in Figure 3. e robot's inverted pendulum time constant ω is selected according to the interest rate difference.        Complexity trajectories under force control. It can be seen that based on the position control, each joint performs position control according to the trajectory plan calculated in advance, so the joint angle planning curve and the actual trajectory curve are basically coincident. In the case of force control, the joint output force is tracked and controlled, and the joint angle planning curve and actual curve are no longer consistent. Although the position control ensures the accuracy of position output, the control effect of the CoP trajectory is not as good as that of the force control method. Under the force control method, the CoP trajectory curve of the robot walking will be smoother, and the fluctuation is smaller than that of the position control method.
Under the position control and force control methods, the six-dimensional force (SDF) data of the ankle joint is shown in Figures 6(a) and 6(b). e force control method can effectively reduce the impact force of the robot's foot and the ground. Under position control, the robot's feet are in hard contact with the ground, and the ground reaction force is greater than the force control method, which affects the robot's walking stability. e robot goes through several walking cycles to balance the impact. Compared with the position control method, the force control method can better absorb the external impact and ground reaction force and restore the balance in a shorter time. It can be seen that based on position control, ZMP trajectory control is not as good as force control. During walking, ZMP can only be proved to be stable if it is within the stable area of the foot, but from the perspective of ZMP trajectory, the actual ZMP is calculated and fluctuates greatly, so ZMP trajectory only shows the stability of walking to a certain extent. Figures 8(a) and 8(b) show the trajectory of the joint position after the robot's mass increases by 2 kg and the trajectory effect of the CP point and the CoP. It can be seen that after the mass of the robot increases, its joint trajectory remains almost unchanged, and there is a certain deviation between the actual CP trajectory and the reference trajectory, but due to the robustness of the optimal controller, the tracking error of the robot is small. e CP trajectory will remain within a certain range and will not cause the robot to fall, and the tracking of the CoP trajectory will be better.
Figures 9(a) and 9(b) show the robot joint position trajectory after the robot's center of mass is moved forward by 4 cm and the CP point and CoP trajectory effects. ere is a certain inherent deviation between the actual CP trajectory and the reference trajectory of the robot, but due to the robustness of the optimal controller, the deviation value of the CP trajectory will eventually tend to a more stable value, which will not cause the robot to be unstable. In other words, CoP trajectory tracking is better. Figures 10(a) and 10(b) show that the trunk of the robot is subjected to a continuous thrust of 30 N at 1.5-1.6 s, and the thrust is along the positive direction of the x-axis. It can be seen that when the robot is subjected to thrust, the CP trajectory along the x-axis direction has a significant deviation, but it will not cause the CP tracking divergence. After the thrust stops, due to the role of the optimal controller, the CP tracking error gradually decreases until converged to approximately 0; relatively speaking, the CoP tracking is good, and the corresponding joint running trajectory is less affected by thrust, which is almost the same as the effect when it is not subjected to thrust.

Conclusion
is paper presents the technique of the robot motion control based on the dynamic model and quadratic programming. In order to improve the walking robustness of the biped robot, this article starts with the construction of the robot motion control framework from three aspects: planning method, mathematical model, and control method. Firstly, the biped gait trajectory planning is introduced based on the divergent component of motion. en, the dynamic control framework is discussed for biped robot, including the joint force control, friction cone contact model, and various constraints. is paper integrates these methods into the control framework and verifies the effectiveness of the control framework on physical robots. Finally, the experimental results show that the method is robust and has certain anti-interference ability.
Further research work will focus on improving the accuracy of the dynamics model, the robustness of the control method, and the verification experiments in more complex scenarios, for example, complex walking scenes (such as sand or outdoors) and greater walking disturbances, etc. e ultimate goal is for the robot to walk stably in a human environment.

Nomenclature
CoM: Center of mass, the trajectory curve of the center of mass during the robot walking CoP: Center of pressure, the trajectory curve of the pressure center of the foot of the robot during the robot walking CP: Capture point, robot's walking trajectory based on DCM planning DCM: Divergent motion component, a walking strategy of biped robot ID: Inverse dynamics, calculate the torque required for the joints given the motion of the robot IK: Inverse kinematics, calculate the required joint angle for a given foot position QP: Quadratic programming, convex quadratic programming problem ZMP: Zero moment point, a criterion of walking stability of biped robot.

Data Availability
No data were used to support this study.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper. 12 Complexity