Real-Time Footprint Planning and Model Predictive Control Based Method for Stable Biped Walking

In order to walk in a physical environment, the biped will encounter various external disturbances, and walking under persistent conditions is still challenging. This paper tries to improve the push recovery performance based on capture point (CP) and model predictive control. The trajectory of zero moment point (ZMP) and center of mass are solved and predicted in a limited time horizon. Online footprint generator is combined with MPC walking pattern generation, which can keep biped stable in the next few steps, and projection of ZMP is used to calculate the next footprint and reach the target CP in an incremental way. Verification of the proposed stable biped walking method is conducted by simulation and experiments.


Introduction
tBiped robot is one kind of mobile robot and belongs to the field of leg robot. In the world of robotics, biped walking is one of the fields with the most interesting research [1]. Robots need to interact and collide intermittently with the environment to complete specific actions and tasks [2]. Compared with other forms of robots, biped robot has more advantages: the biped with a humanoid structure can adapt to various environments, and it can replace humans in many dangerous jobs and offer services for humans, such as artificial limb and medical rehabilitation [3]. e biped robot model is nonlinear dynamic with multiple degrees of freedom and constraints, and it is important to interpret and predict the long-term behavioral trends of the dynamic system for various control tasks. A simplified biped walking linear inverted pendulum model (LIPM) is the baseline of humanoid walking pattern controller [4,5], and passive walking and under-actuated robots have certain advantages in energy-saving walking. But passive walking has poor terrain adaptability and can only walk downhill or on flat ground; while under-actuated walking robots require special structural design and rely on complex walking control strategies [6]. Zhang et al. [7] proposed a walking pattern generator for omnidirectional walking on a slope and uneven plane and realized the oblique walk and heading for any direction, but the performance of push recovery under external persistent force is not analyzed. Oliver et al. proposed an improved model named the flexible linear inverted pendulum model (FLIPM) which integrate damped spring and second center of mass to LIP [8], which realized a stable walk on a servo-driven biped robot. However, the performance is only improved in the z direction, which is not included by FLIPM and causes a rotation of the root alongside the x-axis towards the ankle joint lifting up from the plane. Biped robot has dozens of degrees of freedom and no fixed base, which leads to its dynamic system is very complex. Juang et al. proposed the fully connected recurrent neural network optimized by continuous multiobjective ant colony optimization to form CpG to solve the multiobjective gait generation problem of Nao robot. An open-source multiple DOF servo-driven robot Robotis op [9] is used to test their method performance, but the coupled oscillator model is not able to keep stability under an unpredictable perturbation.
In the process of walking, the biped robot will inevitably be affected by various uncertain factors, such as the uncertainty of mathematical model based on different assumptions [10], the uncertainty of robot system parameters caused by the structural size, material properties, manufacturing and assembly errors, and the random uncertainty of driving torque caused by actuator noise and joint friction, and the external environment disturbance and plane condition uncertainty in the process of robot walking [11]. Pratt et al. [12] introduced the concepts of capture point and capture area by LIPM and flywheel model, and CP is the key metric for push recovery of the biped which shows the target direction to maintaining body stability by keeping center of pressure (CoP) inside the foot support area. But how the capture region adapts with other biped models is not discussed. Majid et al. generate walking patterns for the biped by two stages: computes the best step location and duration and adapts these values using divergent component of motion (DCM) measurement. Englsberger [13] designed CP following and CP step-end controller and has proved the antipush stability of CP control principle, but real-time footprint position adjustment is not included to enhance the robustness of the walking pattern against external perturbations. Since the large number of joints, it is very complex to use CPG model to describe the joint angle for the 3D humanoid robot. Q-learning-based CPG model [14,15] can keep the body recover from horizon perturbation, but it need different expertise to redesign the controller. In addition, the motion obtained by this gait control method is generally not optimal, and the anti-interference in vertical direction ability is poor.
Humanoid walking pattern generation can be solved by model predictive control (MPC) with efficient constraint handling [16]. e optimization-based method is utilized so as to generate, at each iteration, the optimal gait trajectory of the system satisfying the given constraints. However, it is well known that the action of a bounded persistent disturbance can destabilize a predictive controller which has been designed to be stabilizing for the nominal case [17]. ese controllers can be used with a cost function that at the specific problem of generating trajectory by quadratic programming in linear system control, a fast optimization algorithm is proposed. However, several papers like [18][19][20] used restrictions on model state at the end of the prediction horizon. Since the state of the system is completely dependent on the beginning and the end of the horizon, motions lack the flexibility to perform secondary control objectives. Amos et al. [21] proposed the strategy architecture designed for end-to-end training, the robot learned to combine high-level planning strategy with low-level motion controller to realize autonomous navigation on a curved path, which is less computationally and memory intensive compared to traditional optimization solutions, but the model used in that article is still linear which is hard to extend to complex systems like the biped walking. Scianca et al. [22] introduced an IS-MPC framework for gait generation which extend LIPM with ZMP trajectory input, and recursive feasibility of internal stability is realized for biped dynamics, but the cost of control and response speed of method are not analyzed for position control based humanoid robot [23]. e gait trajectory planning method based on LIP does not consider the external interference force, so it cannot walk in a contact-rich environment. e gait pattern generator can only calculate the trajectory position at the next time according to the state transition equation but cannot predict the trajectory in the future.
is paper proposes footprint planning and MPC method for stable biped walking considering the capture point principle. e framework of the method is summarized in Figure 1. LIPM is used to linearize the dynamic model of biped walking, and MPC is used as a gait pattern generator to generate a smooth trajectory of CoM. Online footstep generator based on capture point feedback is combined with the MPC controller, and the biped can take one or more steps to reach the stable state and realize omnidirectional walking on the horizon plane. And a projection function maps the target ZMP into support polygon which makes inverse kinematics is solvable.

Approximation of Walking Model.
Approximation acts effective method to deal with complex systems. Most humanoid robots use the model-based walking planning method to get the gait pattern and abstract the biped dynamics equation from the model by approximating the center of mass (CoM).
e classical inverted pendulum model is widely used for humanoid walking pattern generation. Some conditions are mentioned first as follows: (1) Whole-body mass of the robot is concentrated on the CoM. Under the joint action of the supporting moment and the force along the stretching direction of the rod, the dynamic equation of LIPM is established, and the trajectory of the CoM is obtained by solving the dynamic equation. Using the LIPM, let p denote the position of the center of pressure (CoP) on the plane, and the horizontal dynamics differential equation of CoM is where w 0 � ������ g/h CoM and g is gravity force, and h CoM is the constant height of CoM along z-axis. x denotes the position of CoM along x-axis, and € x denotes body acceleration. is approximation (1) decouples the sagittal and coronal motions of the biped robot, so we will focus on the x-axis motion throughout this paper, and y direction motion is totally identical.

Computational Intelligence and Neuroscience
Considering the discretization at a minimum interval time T, CoM and CoP are discretized. Let input control signal u(t) � ẋ t which is the jerk of CoM position in statespace equation, then the system state transition at time t � kT, k � 1, 2, ... with notation From (2), we can get the ZMP position For the system state x k , the discretized system of LIPM is In the process of bipedal walking, the position of ZMP p k always falls into the support polygon of the foot, so the range of ZMP can be limited within e range of ZMP [p min k , p max k ] depends on foot contact condition with the horizontal plane at time kT. According to the state transition (2), the next state at time k + 1 is decided by the current state x k and input control u k . e core idea of MPC makes the actual ZMP p k most likely close to reference ZMP trajectory p ref k at the minimum control cost at time kT. en, the optimal control sequence u k , u k+1,... can be get by solving the quadratic problem (QP).
where α/β(α > 0, β > 0) are factors to balance the system response ratio and control cost. If α/β rises, the tacking speed of system and control costs increase. And if α/β is lower, the tracking speed of the system and control costs decrease.

Capture Point
where ξ is the x or y component of ICP position on plane.
Due to the limitation of robot dynamics, when the capture point is not captured, the CP will be away from CoP in a straight line until it falls down that is noted by Figure 2. e features of the capture point can be described as follows: (1) Due to the regular dynamics of the robot, if the point is not captured, with time, the ICP moves in a straight line joining the CoP and the CP in a direction away from the foot, as shown in Figure 2. (2) Zero-step capturability: If the ICP falls inside the support polygon, then the robot can balance itself with the application of a good control like MPC; however, if the ICP falls outside the support polygon, the robot will have to take at least one step to protect itself from falling down. (3) One-step capturability: e model can come to stability within a step if the ICP falls within the reachable range. e reachable region is bounded by the maximum step length.
According to equation (1) and equation (7), the dynamic relationship between CP and ZMP is Since the pole of transition function is ω 0 , equation (7) is an unstable system without external additional input. e idea of CP control is to generate the ZMP trajectory so that the current CP reaches the target CP within a given time interval, and then, the CoM follows the planning CP curve. e solution of equation (7) in the time domain is Footprint Generation Computational Intelligence and Neuroscience where ξ x,0 is the initial position of CP. It can be seen that when ZMP p x is constant, CP will change exponentially. Let ξ x,d , ξ x represent the target CP and current CP, respectively, and time from ξ x,d to ξ x is dT, and then, (9) can be discretized as e goal of end-of-step CP controller: ZMP trajectory is generated by ξ x,d and variable time span dT. Let ξ x,eos represent the end of step at the ending of each foothold, and ξ x,d � ξ x,eos . Suppose the center of the support foot is approximately regarded as the ZMP point p i at step i, and the initial CP is where t step is the duration of each step. Iterative relation of CP state is ξ eos,i−1 � ξ init,i . When the robot is subjected to external thrust, CP changes suddenly, and the biped needs to replan the landing footprint so that the CoM is adjusted to ξ eos . According to the iterative relationship (11), the CP position in each support leg cycle can be calculated. en, the system reaches a stable state following the new reference ZMP sequence generated by the CP controller. ZMP reference trajectory is a continuous curve and constrained by robot dynamics of support polygon. When the system is disturbed by a large external force, CP varies fast out of feet support polygon on the plane. If p x is calculated by (11), p d might locate out of support polygon. And the incremental progressive method can be adopted for ZMP, and CoM is controlled to move along the target CP direction within the current support polygon. Formula (9) is differentiated as follows: Δξ eos � dξ eos dt.
A method to implement the incremental integration (12) is the ZMP projection as shown in Figure 3. If the projection is not applied, p d from (11) could beyond the support polygon, and the CoM trajectory generated by p d will cause falling on the ground.
As Figure 3 shows, we found a new projection ZMP p d ′ in support polygon to make CoM follow the target direction of CP. e projection rules can be summarized as follows: (i) If p d is located in support polygon, the target ZMP p d is in the reachable region of the biped, then projection ZMP p d ′ � p d . (ii) If p d is located outside the support polygon, p ' d is the closest point on the inner boundary of the support polygon to (ξ d − ξ).
In practical application, the number of reverse iterative footprint calculation F is limited. Based on the current foothold p i and initial CP ξ init,i , the position of ZMP can be solved on a limited horizon F. CoM mapping is used to make the CoM approach the target position. But as 3 shows, there is a discontinuity between current p and p ' d � p d which leads to discontinuous input to the walking pattern generator. To compensate for this problem, the MPC controller is introduced in the next section to track the CP trajectory on the one hand and export smooth CoM trajectory, so that the biped will not lose stability due to the sudden change of dynamic parameters in the process of walking.

Footprint Generator with Variable Velocity.
e footprint generator module and MPC controller module are implemented synchronously. e input of the footprint generator is reference velocity v x , v y and angular velocity w around the z-axis. e output of this generator is the next   footprint position, direction, and time. With the change of input reference velocity, the cycle and step length of biped walking will be affected, and the footprint generator calculates the next foothold position by solving the quadratic optimization problem. e time and orientation of the foothold of the output of the footprint generator will be used as the reference input of the MPC controller in the next step.
As shown in Figure 1, the input of the footprint generator is v x , v y , w from time t k to t k+P � t k + T p , where T p is the duration of the footstep. e output of the footprint generator is the sequence ( where F is the number of next footprints, and (x Assume that the biped is within the j-th support phase switching duration T j s , v is average velocity, T s is step period duration, and L s is stride length, so v � L s /T s . Walking velocity is limited by the biped dynamics such as CoM height, degree of freedom, and leg length. v is determined by both T j s and L s . Δv is velocity small variation in Δt, and then, where ΔL s � cΔT s , and (14) can be expressed as Let v � 0.01m/s, T s � 0.8s, and the relationship between velocity and duration can be shown in Figure 4.
During a small time interval δt, velocity can be approximately regarded as a linear variation. erefore, the orientation angle of the robot can be approximately constant, and angular velocity is ignored in (14).
Considering time sequence t 1 2 , t 2 2 , ..., t F 2 iterative relation, where t 0 s is the end time of the last support leg duration, and the iteration is over when j > k + P. en, a time sequence e idea of the footprint generator is to allocate the future foothold under the condition of considering the constraints of kinematics and dynamics parameters of the biped in time T k s � T 1 s , ..., T F s . When the body root is subjected to push forces, according to (7), _ x will change suddenly, which will affect the planned trajectory of x. e velocity and acceleration change suddenly, which affect the planning trajectory of CoM. e position of the foothold need be adjusted to make the CoM as close as possible near new CP so that the CoM can keep a gradually stable state. e footprint planning can be transformed into two QP optimization problems as follows: where θ max is the maximum steering angle allowed for two consecutive footprints according to the physics dynamics of the biped. And the second QP problem is where the supporting foot position is (x 0 f , y 0 f ) at the starting time t k . e output of the footprint generator (X k f , Y k f , Θ k f ) is the input of the MPC controller, and the final ZMP and CoM trajectory is calculated by MPC. Figure 5 shows an example of footprint generation where the orientation of the robot coincides with the tangent of the motion path. e swing foot trajectory planning is simple, and in order to avoid sudden changes in the velocity of end point, we use Bezier curves to generate swing foot trajectory and then solve the joints rotation by inverse kinematics. Computational Intelligence and Neuroscience

MPC Controller.
Predictive control approximates the robot to a three-dimensional linear inverted pendulum, introduces the state space method, optimizes the performance index by the linear quadratic regulator, and limits the movement of the CoM in the horizontal plane with constant height. e state feedback gain and predictive gain of the predictive controller are generated by solving discrete algebraic Riccati equations. Due to the consideration of state feedback, error feedback, and future target information, a stable walking mode can be generated with less computation. e system only performs prediction of MPC controller at kT, then measures the actual state x, _ x of the system as feedback by LIPM, and calculates the optimal control by solving QP. N represents the prediction horizon of the MPC controller, and equation (6) can be rewritten within e above equation (20) is solved by solving complex algebraic Riccati equation in [5], but this method has low computational efficiency. Within MPC prediction horizon, equation (4) can be transformed to matrix operation. e state transition equation is rewritten with length N as And (21) can be abbreviated as According to (20), the QP equation can be represented as According to the definition of MPC, u k at the current moment is the first prediction of MPC prediction trajectory; that is, where e T � [1, 0, ..., 0] T , and according to the state formula of LIPM (4), the trajectory of x and output position of ZMP at the next moment are calculated as Within the predictive horizon NT, a balance between response speed and control cost of system can be found by adjusting α/β(α > 0, β > 0), and in this article, α/β � 10 − 5 .
When the system and environment models are well known, NT can approach infinity. But LIPM model itself is a linear approximation to the walking dynamics of biped robot with systematic model errors, and environment cannot be accurately modeled. So prediction horizon could not be infinite. In fact, when N increases, the gain of the system is close to 0, which means that the influence of the predicted future value on the stability of the current system is reduced, but it will significantly decrease the computational efficiency of QP calculation. erefore, this paper sets the MPC control horizon to NT � 1s.

Results
e performance of the proposed dynamic balance control framework in Figure 1 is proved by simulation experiment, we present simulation results using the LIPM and MPC simulation in bullet environment, the input velocity is constant, the single leg support phase time is constant t sup � 0.5s, supporting leg switching time t sw � 0.33s, MPC horizon N � 30, and the minimum control frequency is 100 Hz. e external push force f ext � 0, and the robot is not subject to external interference, so that walks along a straight line with N step � 6, and according to the foothold generator, the robot's foothold position is shown in Figure 6.
Since the LIPM is decoupled from each other in X and Y directions, body CoM and ZMP are solved separately by the MPC controller, as is shown in Figure 7. Although there is slight noise jitter in the output ZMP trajectory, the CoM maintains smooth transition, and there is no big speed mutation when the support legs are exchanged. erefore, the MPC method has good tracking performance.
For the omnidirectional walking, Figure 5 shows velocity direction of walking can be changed in real time by the footprint generator, and θ max � π/4. e robot walks around the center of the circle along with a radius of 2m. e system receives high-level control input v x , v y , w, and the optimization problems (18)and (19) is solved at each step. By solving the inverse kinematics, the next step foothold rotation is realized by the hip joint rotation; thus, the omnidirectional walking on the horizon plane of the biped is implemented. e performance of the proposed method is measured in case of external disturbance in x-y plane, as shown in Figures 8-11. An instantaneous external thrust f ext � 10N is added to the CoM of the robot between 4s and 5s, that duration is 0.01 s. It can be seen that the system can adjust the foothold position in 1 step in both directions and be stabilized.
Since the interference of external thrust, the biped will deviate from the predetermined trajectory, but the velocity and direction of biped walking can be changed in real time, and the system has the ability to adjust the input parameters to make the biped walking towards the target position.

Conclusions
At the first, a simplified LIPM is used to linearize the dynamic model of biped walking, MPC is used as a gait pattern generator to keep dynamics and kinematics feasibility, and CoM and ZMP trajectory of the biped is solved and predicted in a limited horizon. Since the bipedal system is an unstable system, it is easy to be disturbed by external thrust, resulting in system instability and falling down. erefore, this paper designs a bipedal balance control method considering capture point with MPC controller, which realizes the footprints planning with variable speed, so that the robot can walk in omnidirectional directions on the plane. At the same time, the capture point parameters of the system are introduced as feedback. When the system is disturbed by external thrust, the acquisition point will change suddenly. By controlling the ZMP point trajectory, the centroid can track the direction of the capture point, and the system will walk to keep it stable. However, CP may fall outside the support polygon. We use a projection method to make the next landing footprint always fall in the range of the biped reachable area and reach the target CP in an incremental manner. For solving the problem of ZMP input point discontinuity caused by projection, the system uses the MPC method to predict the smooth trajectory curve of centroid in the future according to the new foothold position. e position of joint of the full body is calculated by inverse kinematics and drives the biped realized stable walking.
We only focus on the dynamic walking of bipedal on the horizontal plane. It is necessary to analyze the dynamic walking of bipedal on the nonstationary ground and extend the planning of foothold in a two-dimensional plane to threedimensional space. On the other hand, the gait pattern generator is mainly a linear inverted pendulum model, which is an approximation of bipedal walking and can not plan complex action behavior. And there will be some improvements in the system approximation model, and a more accurate dynamic and environmental model will be used to predict and analyze the robot's future actions and behaviors.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest. Computational Intelligence and Neuroscience 9