Optimization of Central Pattern Generator-Based Torque-Stiffness-Controlled Dynamic Bipedal Walking

In this paper, we propose a central pattern generator-based model to control the walking motion of a biped robot. *e model independently controls the joint torque and joint stiffness in real time. Instead of the phase-dependent neural model used by Huang in 2014, we adopt the same structure for all the walking phases, reducing the number of connections between neurons.*is reduction enables the employment of the particle swarm algorithm to find the optimal values of these parameters which lead to different solutions with different performance criteria. *e simulation of the proposed method on a seven-link bipedal walking model gave a good performance in the range of walking speeds, which is referred to as versatility, and in walking pattern transition. *e achieved walking gaits are 1-period cyclic motions for all the input control signals except for few gaits. Besides, these 1-period cyclic motions have a good local and global stability. Finally, we expanded our neural model by adding connections that work only when the robot walks on uneven terrains, which improved the robot’s performance against this kind of perturbation.


Introduction
e concept of central pattern generators (CPGs) is commonly used in biology. CPGs are neuronal circuits found in the spinal cord of vertebrates and in the segmental ganglia in invertebrates [1]. ese circuits have the ability to generate rhythmic outputs suitable for periodic motions such as moving legs to perform walking motion in quadrilateral legs. e level of activity of these neurons is determined by one signal from the upper nervous system. Several models were designed to mathematically represent these generators [2][3][4], and their rhythmic outputs were used to control several biomimetic robots [5][6][7]. e significant reduction in the number of control signals, the transitional ability between different motion patterns using a limited number of control signals [1], and the robustness against disturbances [2] encouraged the use of these models in passivity-based biped robots which are based on the concept of passive dynamic walking [8]. e output of CPGs was used to generate periodic signals to control the joints torques [2], angles [9], angular velocity [10], or the Cartesian coordinates of robot's link [11]. Huang et al. [12] presented a CPG-based control system to control joint torque and stiffness of a seven-link passivity-based biped robot. CPG-based torque-stiffness-controlled dynamic walking exhibits a good performance in walking pattern transition, versatility, disturbance rejection, and energy efficiency.
Despite the simplicity of the control and the reduction in the input signals, the number of parameters needed to generate stable gaits is significant. Several methods were proposed to find the appropriate values for these parameters, based on evolutionary algorithms such as genetic algorithm [10,13] and particle swarm optimization (PSO) algorithm [13] and reinforcement learning [14].
In this work, we propose a simpler version of the CPGbased model provided by Huang et al. [12] to control a 7-link 6-DoF biped robot. is model is a phase-independent model, and thus, it considers constant parameters for all walking phases instead of using different parameters for each phase. e optimized parameters values are determined using the particle swarm optimization algorithm. We also discuss the importance of adding connections to the neural model to increase the robot's ability to walk on uneven terrain. is paper is organized as follows: biped model and walking phases are described in Section 2. CPG-based control system is presented in Section 3. Experimental results are depicted in Section 4, and the conclusion is in Section 5.

Biped Model and Walking Dynamics
2.1. Biped Model. We adopted a model with seven links, as shown in Figure 1: an upper body and two legs; each leg consists of a thigh, a shank, and a flat foot. e model contains six joints: two hip joints, two knees, and two ankles. e stiffness and equilibrium angle of each joint is controllable except for the knee joints which are considered with a constant stiffness. e friction between the ground and the foot is assumed to be sufficient so that no slippage occurs during the movement, and the collision with the ground is a fully inelastic instantaneous collision that does not result in any slip or bounce.
During walking, the robot's leg passes through different phases, which differ in the number of ground contact points and in the number of degrees of freedom. e robot's leg is called "stance leg" when it has at least one contact point with the ground. In contrast, the leg with no ground contact is called a "swing leg." Depending on the phases of its legs, the robot has different phases. e phase with two legs on the ground is called a double stance phase, while the phase with only one leg touching the ground is called the swing phase. When the walking phases are properly sequenced, the robot generates stable motions called "gait." Walking motions have several gaits depending on the followed scenarios. Adopted gaits have been studied in detail in [15]. Our study focuses on two gaits which are closer to human walking with a moderate speed [15]. e phases of these two gaits are as follows: (1) push-off phase: the robot rotates around two contact points (the heel of the leading leg and the toe of the trailing leg). is phase ends when the trailing leg leaves the ground. (2) Swing phase with free knee: during this phase, the knee constraint in the swing leg is released. e swing leg moves forward until the shank becomes straight with the thigh, and then, the knee collision occurs. (3) Knee collision: after this collision, the knee joint is locked until the next swing phase of this leg. (4) Swing phase with a locked knee: during this phase, the thigh and the shank of the swing leg moves together until the heel of this leg strikes the ground. e heel of the stance leg might leave the ground before the swing foot hits the ground, and this rise is called "premature heel rise." [16] Premature heel rise moves the robot to the second gait, while the absence of this rise holds the robot in the first gait. (5) Heel collision: the heel of the swing leg strikes the ground. (6) Double stance phase: the heel of the leading leg with the toe or all the foot of the trailing leg is in contact with the ground.
is phase continues until the toe of the leading leg touches the ground. (7) Foot collision: the whole foot of the leading leg strikes the ground. Repeating phases according to the previous scenario guarantee stable robot motion.

Walking Dynamics.
Here, we tackle the equations of motion (EoM) considering the mechanical parameters adopted in [12]. e model is described using the Euclidean coordinates, where the x-axis is parallel to the ground and heading forward, while the y-axis is vertical to the ground upwards. Each link is described by the coordinates of its center of mass (CoM) and its angle with the y-axis for the thighs and the shanks or its angle with the x-axis for the feet. us, the model's Euclidean coordinates can be written as follows: x � x h , y h , x b , y b , θ b , x t1 , y t1 , θ t1 , x s1 , y s1 , θ s1 , x f1 , y f1 , θ f1 , x t2 , y t2 , θ t2 , x s2 , y s2 , θ s2 , x f2 , y f2 , θ f2 .
(1) e model is also described by the generalized coordinates: (2) e coordinates change according to the walking phases. For example, when the knee is locked, the angle and the angular velocity of the thigh and of the shank are equal.
Euclidean coordinates are related with the generalized coordinates by where J is the Jacobian matrix. We define the constraints vector ξ(q) which contains the constraints with the ground and the constraints of the locked knee. e previous constraints is guaranteed by the equality: ξ(q) � 0.
Lagrange's equations of the first kind could be used to get the EoM [12,15]: where T � (zξ(q)/zq), M q is the mass matrix written in the generalized coordinates, F q is the external active forces acted on the robot written in the generalized coordinates, and F f is the constraint force vector.
Rearranging equation (5) as follows: and from equations (4) and (6), we get All the walking dynamics phases share the same dynamics form as equation (7), while the component of each matrix may be different during different phases. e collision is supposed to be a fully inelastic instantaneous collision. us, the model achieves the constraints vector after the collision, and we write 2 Journal of Robotics e equation of strike moment can be obtained by integrating equation (4): where _ q − and _ q + are the velocities in the generalized coordinates before and after the collision, respectively. I f is the impulse force acted on the robot at the collision moment.
From equations (8) and (9), we write the collision equations as

Model Structure.
Previous studies [17][18][19] demonstrated the importance of changing the internal dynamic of a biped robot by using adaptable compliance actuators in walking gaits transitions, control of speed, and energy efficiency. Huang et al. [12] adopted the biologically inspired control method based on CPG in such a way that the joint stiffness and the joint torque could be independently controlled in real time. is method helped in improving the smoothness and the time of gait transitions, and it increased the disturbance rejection with good energy efficiency. is is why we adopted this strategy; therefore, the torque of each joint can be defined by the following equation: where K is is the joint stiffness nominal value, K i is the output of the neural oscillator responsible for controlling the joint stiffness, θ e i is the output of the neural oscillator responsible for controlling the joint torque by controlling the joint equilibrium angle, _ θ i is the angular velocity of the ith joint, and d i is the damping coefficient. e oscillator output is not applied directly to the dynamic model, and instead the response of the motors which control the joint stiffness and the equilibrium angle is taken into account. is response is modeled by a first-order response function with a time constant equal to τ a � 0.1 sec. e neural model receives information from the dynamic model including the state of the leg (stance or swing) and the relative joint angle and its relative angular velocity. Figure 2 shows the schematic diagram of the proposed coupled neural oscillators.
Our model is based on 9 neurons in total. Five neurons control the swing leg with free knee, and four neurons control the stance leg. Here, we consider that the torque and the stiffness of each joint are controllable except for the knee joint, which is considered as a constant-stiffness joint. e effect of the knee's neuron is eliminated when the knee is locked at the end of the swing phase. Furthermore, the foot of the swing leg needs to be fixed with the shank so that the equilibrium joint for this joint is kept constant during the swing phase with free knee, and then it is diminished after the knee collision until the heel of this leg strikes the ground. e neural model receives two input signals: u e controls the joint's equilibrium position and thus the joint's torque, and Journal of Robotics u s controls the joint's stiffness. Interlimb coordination is established by adding inhibitory connections between the unit oscillators of the hip joints in each leg; the knee and foot neurons in each leg are affected by connections from the hip joint in the same leg. No connection was added between the unit oscillators for the torque control and those for the stiffness control in order to keep their changes independent [12]. Figure 3 shows the proposed control system. We used the model submitted by Matsuoka [20] to mathematically represent the unit oscillators. erefore, the equations of the torque oscillators are where the index i represents the joint number shown in Figure 3, θ j is the torque oscillator output (the equilibrium position), u e i is the input control signal, τ e i and τ ′ e i are time constants, β e is the coefficient of adaptation effect for equilibrium position control, c e i is the weight of the input signal that affects the oscillator i, ω e ij is the connections weight, and Feed e i (q, _ q) is a vector with all the feedback signals from the dynamic system which includes information about the relative angle and the relative angular velocity of the affected joint. e equations of the stiffness oscillators are All the parameters in the previous equations have the same meaning as in equation (12). We restricted the output of the stiffness neurons in the range [− 10, 10] because these values represent the offsets from the nominal values of the joint stiffness.     Journal of Robotics Figure 3, equation (12) can be rewritten as follows:

Model's Parameters. From
In the same way, equation (13) can be rewritten as follows: Here, the time constants are assumed to be equal for all the stiffness neurons.
However, we may notice that the number of parameters needed to evaluate stable gaits is significant which makes reaching the appropriate parameters by trial and error an uneasy task. erefore, we employed the PSO algorithm to find the optimal values. e PSO algorithm is an iterative statistical technique for nonlinear function optimization. It does not rely on the gradient, and instead it uses random values to avoid convergence to local minima. e algorithm was first introduced in [21], and for our problem, we adopt the proposed amendments in [22,23]. e algorithm starts by generating a set of particles within the working space, assigning initial velocity values to those particles and calculating their respective costs. e positions and velocities of particles are modified during the successive iterations based on the value of a predefined cost function of their neighbors from the previous iteration and based on the best value of these particles during all iterations.

Experimental Results
In this section, we present the results of the simulations that were carried out in MATLAB 2017 environment based on the EoM mentioned in Section 2.2. We adopted the mechanical parameters of the biped model used in [12,15].

Model's Parameter Optimization Using PSO Algorithm.
e parameters of our neural model include, as shown from equations (14) and (15), the neurons interconnection weights (9 parameters), the self-parameters of each neuron with the two input signals (u e , u s ) (13 parameters), the sensory feedback weights (10 parameters), the nominal values of the joint stiffness (3 parameters), and the constant foot equilibrium joint during the free swing phase. In addition, we need five parameters related to imposed conditions on the initial state of the walker. ese conditions are used to guarantee that the motion begins at the moment just after the robot rises his trailing leg from the ground, the upper body at this moment is in midway between the two thighs, and the leading leg is in full contact with the ground. e parameters include the hip angle of the rear leg θ t2 and its angular velocity _ θ t2 , the rear foot angle θ f2 and its angular velocity _ θ f2 , and the leading hip angular velocity _ θ t1 . erefore, a total of 39 parameters need to be optimized. Table 1 presents these parameters with the upper and lower bounds. e parameters range was inspired by a logical approach and some initial tests.
e cost function is defined as follows: where Nb of steps is the number of steps achieved by the robot; this number is limited by a predefined maximum. e 1 � a 1 .(v < v 0 ).|v − v 0 | guarantees that the walking velocity v exceeds a certain minimum limit v 0 . e 2 � a 2 · div(θ b ) is related to the standard deviation of the body angle during walking. e 3 � a 3 . max clearance eliminates the foot scuffing of the swing leg with the ground. e coefficients a i , 1 ≤ i ≤ 3, are the weights which determine the importance of each factor in our optimization problem. e self-parameters of the PSO algorithm are shown in Table 2. By applying the PSO algorithm several times, different solutions have been obtained according to the optimized factors e i . ese solutions differ in the gait patterns, in the versatility, and in the robustness against disturbances. Table 1 shows the optimal values of the best solution, and Figure 4 shows the stick diagram of a stable gait generated for u e � 2.4 and u s � 22. Table 1, the range of walking speed was studied as a function of the two control signals. Figure 5 shows the variations of walking velocities as a function of the control signals. We notice a forward acceleration by increasing u e and u s , which can be justified by the increase in the system's input energy. e velocity ranges from 0.34 m.s − 1 to 1.2 m.s − 1 which corresponds to a Froude number (Fr) in the range [0.12, 0.43].

Walking Patterns. Based on the parameter values shown in
Fr is defined as V/ ��� � g · l where V is the robot speed, g is the gravitational acceleration, and l is the leg length [12]. e speed contours in Figure 5 show that similar speed values could be obtained by taking several input values, where the difference between these gaits is in the step length and frequency, which can be also noticed on Figures 6 and 7. Figure 6 shows that u e is mainly responsible for the changes in the step length especially in the high values of u s , while Figure 7 shows that the changes in the step frequency depend mainly on u e for the low values of u e , while it depends on the value of u s for the moderate value of u e and u s . For the large value of u s , the frequency is almost constant, and this is because the changes in the stiffness were restricted to the range [− 10, 10], which limited the effect of u s .

Walking Speed Control.
We studied the transitional ability of the simulated walker between different patterns    Journal of Robotics by changing the control signals depending on former exploration of the walking space of the robot. To this end, a speed profile was defined, and the ability of the walker to change its velocity following this profile by changing the input signals at the beginning of the push-up phase was studied. We observed in Figure 8 the flexibility and the ability of the robot in following the speed profile. e overshoot was less than 5% in all the transitions, and it took three to four steps before the robot settled to its new speed value.

Stability Analysis.
e stability of the walker is analyzed by observing its motion on a step-to-step basis. One step is considered as a function called "stride function f(u)" [16] which maps the robot's state u m (q, _ q) at a defined point of a step m with its state at the same point in the next step m + 1. Stable walking is achieved when the stride function converges after a few steps to a single point called "fixed point," or when it alternates between n points. e trajectory of the robot in the state space is called the n -period limit cycle, where n is the number of the previous points. e cyclic stability of this motion is determined by linearizing the stride function at its fixed points [16,24]. Assume small deviation Δu n Δu n+1 � u * + JΔu n , (17) where J is the Jacobian matrix calculated using the Newton-Raphson method [24]. When the eigenvalues of the Jacobian matrix are within the unit circle in the complex plane, the cyclic motion of the robot is stable and it can handle small perturbations around its nominal trajectory [16]. e local stability increases when the magnitude of the maximum eigenvalue λ m is closer to zero. Figure 9 shows the control signals that generate 1period cyclic motions and the relative |λ m | at each point. We notice that the motion is stable for the most input control signals.
In addition to the local stability, the global stability of the walker with the CPG-based control system was also considered. e global stability is determined by the basin of attraction (BoA) which is defined by all the initial conditions that lead to stable motions of the walker. We have mentioned before that the moment when the trailing leg leaves the ground is considered as the beginning of the motion, and the Step number Figure 8: Following a predefined speed profile. e overshoot is less than 5% in all the transitions, and it takes three to four steps before the robot settles to its new speed.  Journal of Robotics robot's initial condition is determined by the angle and the angular velocity of the trailing leg (θ t2 , _ θ t2 ), the angle and the angular velocity of the trailing foot (θ f2 , _ θ f2 ), the angular velocity of the leading leg _ θ t1 , and the angle and angular velocity of the body (θ b , _ θ b ). We restricted our study to the situations where the angular velocity of the body is zero and the trailing foot angle and angular velocity equal to those of the trailing leg.
us, the BoA is determined with the remaining DoF using the cell mapping method [15,16].   Journal of Robotics u s � 22. We can see that the stable regions of the walker are relatively large which proves the effectiveness of the proposed control model.

Disturbance Rejection.
We studied the ability of the proposed model to walk on uneven terrain. A random trajectory with 50 step length has been generated, and we added a random change in the ground height in the range [− e, e] for each step. e process was repeated 100 times, and the percentage of the successful trajectory passing was calculated. Figure 11 shows the results as a function of the increasing amplitude of the maximum change e.
To increase the robustness, we added new connections to the neural model. ese connections are dependent on the disturbance and work only at some phases on some neurons as follows: (1) Neurons related to the stance foot during the swing phase. (2) Neurons related to the trailing foot during the push-up phase. (3) Neurons related to the hip joints during the swing phase. ese connections are inspired by the human behavior to overcome a sudden ground height variation during walking. We also considered the study on a limit cycle walking, which indicates the importance of controlling the swing leg and the stance foot to increase the stability against disturbances [16]. e previous connections work directly after the robot senses a change in the ground height. is disturbance e is calculated depending on the difference between the two feet's height, and it can be added to our model as follows: where i is the joint index, c i se and c i ee are the connections weight, and u i e and u i s are the nominal control signals and e in meters. To calculate the appropriate values of the connections weight, we employed the PSO algorithm again with a cost function defined by the average number of success steps on a distributed ground with random height variation in the range [− 10, 10] mm. e same PSO parameters as in Table 2 are used with a swarm size of 150 particles. e obtained optimal weights are shown in Table 3. Figure 11 shows the increase in the robustness after adding the new connections with the optimized values. e walker can pass a trajectory with a random variation with amplitude up to 2.5% of its leg length with about 50% success rate.
We studied the maximum single step that could be passed by our robot with the added connections for all the control signals (u e , u s ). Figure 12 shows the results. e largest ground height variations the walker can overcome is 0.048 m at u e � 3.1 and u s � 67.

Discussion
e proposed phase-independent CPG-based control model exhibits good results depending on several performance criteria. e Froude number of the robot ranges from 0.12 to 0.43. is range is close to the speed achieved in [12] which ranged from 0.15 to 0.43 and comparable with previous passivity-based bipedal walkers (e.g., the Fr of "Veronica" ranges from 0.07 to 0.16 [17] and the Fr of "Runbot" ranges from 0.25 to 0.5 [25]).
Besides the wide speed range reachable by the two input signals, the model has a great ability to change its speed by only changing the control signals at the beginning of the push-off phase. However, the transition between working points is not possible for every two points. We observed that some points are not reachable from other points, and the walker have to pass through at least one intermediate point to complete the transition correctly. is is because those former points are not in the basin of attraction of the later points.
Finally, adding new connections to our neural model increased the robot stability and enabled our biped to walk on uneven terrain with a maximum height change up to 20 mm with about 50% of success rate. ese new connections could be considered as a new layer added to the neural model, and they only work when the robot discovers a change in the height of the ground. We can generalize this idea by changing the weight of those connections or by using different connections for different environments or situations, while the basic structure of our model is kept fixed. Our robot can handle a change up to 4.8 cm, which is about 6% of its leg length, whereas the Huang model was able to handle a ground disturbance of maximum 5.25% of leg length [12]. e passive-based walking robot "Meta" deals with ground height changes of 5% of leg length with a control strategy combining the effect of ankle stiffness, local stance ankle control, and ankle push-off control [16]. Existing studies reported that a simulated biped model with reinforcement learning could overcome ground disturbances of height of only 1.67% of leg length [26].
To conclude, our model is simpler than the model proposed by Huang et al. [12] which also controls the external torque and internal dynamics of the walker. Despite the simplicity, our model achieved a close performance to the performance of Huang's model, and it showed a good local and global stability on all the possible values of the two input signals. Besides, our model is able to change the walking velocity between most of the allowable speed values in three to four steps.

Conclusion
In this paper, a walking robot model was presented. e model, based on the central pattern generators, generates periodic signals that control the torque and stiffness of the compliance joints. is neural model is controlled using only two control signals: one for controlling joints torque and the other for controlling joints stiffness. We used the proposed model to control a 2D 7-link biped robot, and the procedure proved good performance in terms of versatility and walking pattern transitions. e robot walks on the 1-period limit cycle and has a good local stability around the fixed point in most working space defined by the values of the control signals which generate stable gaits.
Finally, we proposed a method which helped to increase the robot stability against disturbance in the height of the ground by adding connections to the control model related to the disturbance, and through experiments on the simulated model, we observed a good improvement in the disturbance rejection.
As a future work, we suggest to explore the properties of Hindmarsh-Rose (HR) neural model [27] on the aspect of CPG and reconstruct our CPG network based on the HR model.

Data Availability
e software code used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.