Pareto Design of State Feedback Tracking Control of a Biped Robot via Multiobjective PSO in Comparison with Sigma Method and Genetic Algorithms: Modified NSGAII and MATLAB's Toolbox

An optimal robust state feedback tracking controller is introduced to control a biped robot. In the literature, the parameters of such controllers are usually determined by a tedious trial and error process. To eliminate this process and design the parameters of the proposed controller, multiobjective evolutionary algorithms, that is, the proposed method, modified NSGAII, the Sigma method, and MATLAB's Toolbox MOGA, are employed in this study. Among these evolutionary optimization algorithms, the proposed method operates best for designing the controller since it provides ample opportunities for designers to choose the most appropriate point based upon the design criteria. Three points are chosen from the nondominated solutions of the obtained Pareto front based on two conflicting objective functions, that is, the normalized summation of angle errors and the normalized summation of control effort. The obtained results elucidate the efficiency of the proposed controller for controlling a biped robot.


Introduction
Biped robots are one of the crucial kinds of mobile robots since they are the robots most similar to humans and have human-like capabilities, such as walking, speaking, and communicating [1][2][3]. Undeniably, they will be utilized in industry in the near future as an alternative to the skilled workforce performing high-risk activities. To control robots for a variety of challenging tasks, researchers have used efficient robust controllers, such as practical velocity tracking control [4], dynamic state feedback control [5], fuzzy PD control [6], and neural network control [7]. Specifically, Solís-Perales and Peón-Escalante [8] used robust adaptive tracking control for a class of robot manipulators having model uncertainties. Indeed, they utilized a linearizing-like control feedback and a high-gain estimator for a model with four unknown parameters, that is, system parameters, nonlinear terms, external perturbations, and the friction effects in each robot joint. Akbari et al. [9] employed a fuzzy TSK controller to control a rotary flexible joint manipulator modeled by the use of a solenoid nonlinear spring. It has been illustrated that state feedback controllers are effectual in the aspect of having acceptable tracking error and control effort. In particular, Montagner and Ribas [10] used state feedback control for tracking sinusoidal references and rejecting disturbances affecting the plant. They utilized three techniques, the linear quadratic regulator, pole placement, and H∞ control, to control uninterruptible power supply systems. Chang and Fu [11] proposed dynamic state feedback formation control for realizing a multirobot formation system while simultaneously addressing the dilation of a formation shape and the stabilization issue in a nonholonomic system. Oliveira et al. [12] proposed a state feedback technique based on the modification of the state transition matrix and used a Genetic Algorithm (GA) to eliminate the trial and error process. Solihin et al. [13] utilized state feedback control for tracking control of a flexible link manipulator. They also employed a particle swarm optimization algorithm to omit the tedious trial and error process of determining heuristic parameters in state feedback control.
Particle swarm optimization (PSO) is regarded as a smart evolutionary and simulation-based optimization algorithm introduced by Kennedy and Eberhart [14] with outstanding qualities, such as a high convergence rate and the capability of solving complicated optimization problems. PSO was derived from human and animal social behavior, and it is easy to implement owing to having few parameters to adjust and the special characteristic of memory [15]. In that respect, it has been successfully promoted by a number of researchers and applied in a wide range of scientific fields, to name a few, control, electronics, robotics, and economics [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35]. Despite these excellent qualities, PSO suffers from premature convergence due to loss of diversity. To address premature convergence, Wang et al. [36] proposed a hybrid PSO algorithm utilizing a diversity-improving mechanism and neighborhood search strategies to reach a trade-off between exploration and exploitation abilities. Zhou et al. [37] employed the concept of mutation by using random factors to augment the global search ability of particle swarm optimization. In order to enhance the search capabilities of PSO, it has also been combined with other optimization algorithms. Idoumghar et al. [38] utilized particle swarm optimization combined with simulated annealing to avoid local optimal solutions and premature convergence of PSO. Qian et al. [39] combined particle swarm optimization with the simplex method to adopt a hierarchical and cooperative regime of global search and local search to optimize the objective function. Liu and Yang [40] utilized particle swarm optimization combined with the Nelder-Mead simplex method to enhance its potential for rapid convergence.
Undeniably, the objective functions of practical engineering problems conflict with each other. Hence, designers prefer to use multiobjective optimization algorithms to regard all objective functions based on the design criteria. To this end, several approaches, such as dynamic neighborhood PSO [41], the dominated tree [42], the Sigma method [43], vector evaluated PSO [44], dynamic multiple swarms [45], and dynamic population size with adaptive local archives [46], have been proposed to extend the PSO algorithm to multiobjective optimization problems. By comparing the abovementioned techniques, it can be concluded that the main difference among these approaches is the leader selection technique. When all particles are updated in the current iteration, some of the nondominated solutions are similar in the objective function space. Keeping all of them in the archive requires a great deal of space. Moreover, it also precludes a uniform diversity of the nondominated solutions. To overcome these drawbacks, a fuzzy approach is utilized in this study.
Recently, Hassanzadeh and Mobayen [47] utilized the genetic algorithm, particle swarm optimization, and ant colony optimization to balance a pendulum in the rotational inverted position. They demonstrated the efficiency of the proposed controller with respect to parameter variations, noise effects, and load disturbances. Wang and Guan [48] successfully utilized optimal control based on particle swarm optimization for a parallel hybrid hydraulic excavator. Gao et al. [49] used a typical fractional order PID control strategy based upon an improved multiobjective differential evolutionary algorithm for a gun control system. This investigation significantly extends the authors' previous work [50, 51] as follows: Taherkhorsandi et al. [50] used a linear quadratic tracking controller to control a biped robot stepping on a flat surface, whereas a nonlinear state feedback tracking controller is used here to control the biped robot walking on a slope. Optimal state feedback tracking control using a multiobjective particle swarm optimization algorithm, in comparison with three prominent optimization algorithms, modified NSGAII, the Sigma method, and MATLAB's Toolbox MOGA, is used here to design the parameters of the proposed controller, whereas sliding mode control based upon a particle swarm optimization algorithm is utilized in [51] to control the biped robot.

The Dynamics of the Biped Robot
In the present study, a biped robot walking in the lateral plane on a slope is considered [50]. To model this robot, a three-link planar model is used according to Figure 1. The first link represents the stance leg on the ground, the second link signifies the head, arms, and trunk, and the third link is the swing leg. In fact, these links move freely in the lateral plane. The parameters of the biped robot are obtained from Table 1 for a humanoid robot having 171 (cm) height and 74 (kg) weight [53]. The distance between the two legs of the model equals 32.7 (cm).
To obtain the dynamic equations of the biped robot, the Newton-Euler method is employed. θ1, θ2, and θ3 are the angles between the first, second, and third links and the assumed vertical line through these links, correspondingly. Hence, the equations of motion of the model for θ1, θ2, and θ3 are obtained as (1). In this study, the biped robot passes through two phases, the double support phase (DSP) and the single support phase (SSP). In DSP, both feet are on the ground; however, in SSP, the biped robot has one contact surface with the floor. The duration of DSP is regarded as 20 percent of the whole gait cycle [54]. Moreover, the swing foot trajectory having first-order continuity is generated so that it maintains the zero moment point inside the support polygon. Then, inverse kinematics is employed to acquire the desired trajectories of the joints. The desired trajectories should have first-order and second-order continuity. First-order derivative continuity guarantees the smoothness of the joint velocities, while second-order continuity guarantees the smoothness of the accelerations and hence of the torques on the joints.
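The continuity requirements above can be sketched with a quintic (fifth-order) polynomial segment, a common choice that enforces zero velocity and acceleration at both segment ends, so that chained segments remain first- and second-order continuous. The boundary values below are illustrative stand-ins, not the paper's actual gait data.

```python
import numpy as np

def quintic_trajectory(q0, qf, T, n=101):
    """Quintic blend from (q0, 0, 0) to (qf, 0, 0) over [0, T].

    Zero boundary velocity and acceleration give first- and
    second-order continuity when segments are chained."""
    t = np.linspace(0.0, T, n)
    s = t / T
    # q(s) = q0 + (qf - q0)(10 s^3 - 15 s^4 + 6 s^5)
    q = q0 + (qf - q0) * (10 * s**3 - 15 * s**4 + 6 * s**5)
    dq = (qf - q0) * (30 * s**2 - 60 * s**3 + 30 * s**4) / T
    ddq = (qf - q0) * (60 * s - 180 * s**2 + 120 * s**3) / T**2
    return t, q, dq, ddq

# Example: one swing-leg joint moving 0.3 rad in 1 s (values illustrative).
t, q, dq, ddq = quintic_trajectory(0.0, 0.3, 1.0)
```

Velocity and acceleration both vanish at the endpoints, which is exactly the smoothness property the desired joint trajectories must satisfy.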

Particle Swarm Optimization.
Particle swarm optimization is a smart evolutionary and simulation-based algorithm motivated by the simulation of social behavior rather than survival of the fittest [14]. At first, PSO was proposed to tune the weights of neural networks [55]; however, it is now utilized as an effectual optimization algorithm where the decision variables are real numbers [56, 57]. The candidate solutions are named particles, and their positions change based on each particle's own experience and that of its neighbors (through the velocity). Indeed, each candidate solution is associated with a velocity [58]. The governing equations for the particles are as follows:

v⃗_i(t + 1) = w v⃗_i(t) + c1 r1 (x⃗_pbest,i − x⃗_i(t)) + c2 r2 (x⃗_gbest − x⃗_i(t)),
x⃗_i(t + 1) = x⃗_i(t) + v⃗_i(t + 1),

where x⃗_i(t) is the position of particle i and v⃗_i(t) is the velocity of particle i at time step t.
w is the inertia weight utilized to control the impact of the previous history of velocities on the current velocity of a given particle. c1 is the cognitive learning factor, illustrating the attraction a particle has toward its own success. c2 is the social learning factor, representing the attraction a particle has toward the success of the entire swarm. r1, r2 ∈ [0, 1] are random values. x⃗_gbest is the position of the best particle of the entire swarm, and x⃗_pbest,i is the best personal position of particle i. The inertia weight is used to tune the global and local search ability, and it has qualities reminiscent of the temperature parameter in simulated annealing [58]. It is crucial to note that a large inertia weight facilitates a global search, whereas a small inertia weight facilitates a local search. In that respect, by changing the inertia weight dynamically, the search ability is adjusted dynamically. Eberhart and Kennedy [59] illustrated that decreasing the inertia weight linearly over the iterations enhances the performance of PSO. Particles are permitted to move around their best personal positions (x⃗_pbest) by using a large value of c1 and a small value of c2. Moreover, particles converge to the best particle of the whole swarm (x⃗_gbest) by utilizing a small value of c1 and a large value of c2. By regarding the abovementioned results, it was observed that the best solutions were acquired when c1 is decreased linearly and c2 is increased linearly over the iterations [57]. In this study, linear formulations for the inertia weight and the learning factors are employed as follows:

w = w1 − (w1 − w2) (iter / iter_max),
c1 = c1,i − (c1,i − c1,f) (iter / iter_max),
c2 = c2,i + (c2,f − c2,i) (iter / iter_max),

where w1 and w2 are the initial and final values of the inertia weight, respectively, c1,i and c2,i are the initial values of the learning factors c1 and c2, respectively, c1,f and c2,f are the final values of the learning factors c1 and c2, respectively, iter is the current iteration number, and iter_max is the maximum number of allowable iterations.
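The update rule together with these linear schedules can be sketched as follows. The schedule endpoints match the settings reported later in the paper (w: 0.9 to 0.4, c1: 2.5 to 0.5, c2: 0.5 to 2.5); the sphere test function and the remaining details are illustrative assumptions, not the robot objective.

```python
import numpy as np

def schedules(it, it_max, w1=0.9, w2=0.4, c1_i=2.5, c1_f=0.5, c2_i=0.5, c2_f=2.5):
    """Linear schedules: w and c1 decrease, c2 increases over the run."""
    frac = it / it_max
    return (w1 - (w1 - w2) * frac,
            c1_i - (c1_i - c1_f) * frac,
            c2_i + (c2_f - c2_i) * frac)

def pso_minimize(f, x_min, x_max, n_particles=10, it_max=200, seed=0):
    """Basic PSO with time-varying inertia weight and learning factors."""
    rng = np.random.default_rng(seed)
    dim = len(x_min)
    x = rng.uniform(x_min, x_max, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    pbest, pbest_f = x.copy(), np.array([f(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for it in range(it_max):
        w, c1, c2 = schedules(it, it_max)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, x_min, x_max)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, float(pbest_f.min())

sphere = lambda p: float(np.sum(p ** 2))
best_x, best_f = pso_minimize(sphere, np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
```

Early iterations (large w, large c1) explore broadly; late iterations (small w, large c2) contract the swarm around the global best, mirroring the global-to-local shift described above.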

Multiobjective Particle Swarm Optimization.
Multiobjective optimization seeks a vector of decision variables satisfying the constraints while giving acceptable values to all objective functions [60]. It involves the vector of design variables and the vector of objective functions. Multiobjective minimization based on the Pareto technique can be conducted using some definitions [61]. The definition of Pareto optimality: a point x⃗* ∈ Ω (Ω is a feasible region in R^n) is Pareto optimal (minimal) if and only if there is no x⃗ ∈ Ω which is dominant over x⃗*. Alternatively, it can be readily restated as ∀ x⃗ ∈ Ω, x⃗ ≠ x⃗*, ∃ i ∈ {1, 2, . . . , k} : f_i(x⃗*) < f_i(x⃗).
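The dominance relation underlying this definition can be stated directly in code (minimization assumed; function names are illustrative):

```python
def dominates(fa, fb):
    """True if objective vector fa dominates fb under minimization:
    fa is no worse in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def nondominated(front):
    """Keep only the points of `front` that no other point dominates."""
    return [f for f in front if not any(dominates(g, f) for g in front if g != f)]

# Example with two objectives: (2, 4) is dominated by (1, 3) and drops out.
pareto = nondominated([(1, 3), (2, 2), (3, 1), (2, 4)])
```

The surviving points form the nondominated set from which the archive, and ultimately the Pareto front, is built.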
The definition of the Pareto set: for a given multiobjective optimization problem, the Pareto set P* is the set in the decision variable space consisting of all the Pareto optimal vectors, P* = {x⃗ ∈ Ω | ∄ x⃗′ ∈ Ω : F⃗(x⃗′) ≺ F⃗(x⃗)}. Indeed, the Pareto front PT* is the set of the vectors of objective functions mapped from P*. In multiobjective particle swarm optimization, a set of different leaders is devoted to each particle, and one of the leaders is chosen to update the position of the particle; in contrast, a single leader is utilized to update the positions of the particles in single-objective optimization problems. In elaboration, one leader should be selected as the best in order to update the position of each particle and enhance the convergence and diversity of solutions. To this end, a leader selection approach based on density measures is utilized. A neighborhood radius R_neighborhood is defined for all nondominated solutions. Two nondominated solutions are neighbors if the Euclidean distance between them, measured in the objective domain, is smaller than R_neighborhood. Hence, the particle having fewer neighbors is preferred as a leader. On the other hand, by choosing an appropriate approach to find x⃗_pbest for the ith particle, the diversity within the swarm is maintained. Here, the Sigma method is utilized to find the best personal positions of the particles; originally, this method was proposed to find the best local guides [43]. When σ values are devoted to each particle in the population and archive, respectively, the particle in the archive whose σ is closest to that of particle i is chosen as the best personal position of particle i. For a two-objective space, the parameter σ is defined as follows:

σ = (f1² − f2²) / (f1² + f2²).

In the present study, a turbulence operator is also employed to find more appropriate positions and avoid being trapped in a local minimum.
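For the two-objective case, σ reduces to a single scalar per point, and the archive member whose σ is closest to a particle's σ serves as its guide. A minimal sketch (function names are illustrative; degenerate all-zero objective vectors are assigned σ = 0 as our own convention):

```python
def sigma(f):
    """Sigma value of a two-objective vector f = (f1, f2)."""
    f1, f2 = f
    denom = f1 ** 2 + f2 ** 2
    return 0.0 if denom == 0 else (f1 ** 2 - f2 ** 2) / denom

def closest_by_sigma(particle_f, archive_fs):
    """Index of the archive member whose sigma is nearest the particle's."""
    sp = sigma(particle_f)
    return min(range(len(archive_fs)),
               key=lambda j: abs(sigma(archive_fs[j]) - sp))

# A particle on the diagonal (sigma = 0) is guided by the archive
# member closest to the diagonal.
guide = closest_by_sigma((2.0, 2.0), [(1.0, 0.0), (0.0, 1.0), (3.0, 3.0)])
```

Since σ is constant along rays through the origin of the objective space, matching σ values pairs each particle with an archive solution lying in roughly the same direction, which preserves diversity across the front.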
m particles in the population are randomly chosen, and the turbulence factor is added to their position vectors:

x⃗_i = x⃗_i + rand · (x⃗_max − x⃗_min),

where rand is a random number generated uniformly in the interval [0, 1], and x⃗_max and x⃗_min are the upper and lower bounds of the search space. In this paper, m = p × (number of particles), where p is the probability of the turbulence operator and is set at 5/n.
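A minimal sketch of such a turbulence step follows. Clipping the perturbed positions back into the search bounds is our assumption, added to keep particles feasible; the text itself does not state how out-of-range positions are handled.

```python
import numpy as np

def apply_turbulence(positions, x_min, x_max, p, rng):
    """Perturb roughly a fraction p of randomly chosen particles.

    Each chosen particle gets a uniform random offset scaled by the
    search-space range, then is clipped back into [x_min, x_max]
    (the clipping is an assumption, not stated in the paper)."""
    n = positions.shape[0]
    m = max(1, int(round(p * n)))
    idx = rng.choice(n, size=m, replace=False)
    for i in idx:
        positions[i] = positions[i] + rng.random(positions.shape[1]) * (x_max - x_min)
        positions[i] = np.clip(positions[i], x_min, x_max)
    return positions

rng = np.random.default_rng(0)
swarm = np.zeros((10, 2))
bounds_lo, bounds_hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
swarm = apply_turbulence(swarm, bounds_lo, bounds_hi, p=0.3, rng=rng)
```

The occasional random kick lets stagnating particles jump out of a local minimum without disturbing the rest of the swarm.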

The Pareto Design of State Feedback Control
The stages of state feedback control are designed and constructed step by step as follows. To control the system, the state variable vector is chosen as the vector of the joint angles and joint velocities, and the control effort is then obtained from the state feedback law. The proposed method is used to find the proper state feedback parameters and remove the tedious and repetitive trial and error process. Furthermore, the results are compared with three prominent algorithms. The performance of a controlled closed-loop system is usually assessed by a variety of goals [62]. In this study, the normalized summation of angle errors and the normalized summation of control effort are regarded as the objective functions, and they are minimized simultaneously. The feasibility and efficiency of the proposed multiobjective algorithm are assessed in comparison with the Sigma method [23], modified NSGAII [41], and MATLAB's Toolbox MOGA. The Pareto front of this multiobjective problem is shown in Figure 2. The swarm size is 10 and the maximum number of iterations equals 500. The velocity v⃗_i(t) is limited to the range [−v_ave, +v_ave], in which v_ave = (x_max − x_min)/2; when the velocity violates this range, it is multiplied by a random number in [0, 1]. The archive size constant and R_neighborhood are set at 25 and 0.02, respectively. Over the iterations, the inertia weight is linearly decreased from w1 = 0.9 to w2 = 0.4, c1 is linearly decreased from 2.5 to 0.5, and c2 is linearly increased from 0.5 to 2.5.
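To illustrate how a Pareto point trades tracking error against control effort, consider a toy one-joint model q̈ = u under a feedback law u = −K(x − x_d). The model, gain values, reference, and integration scheme here are illustrative stand-ins, not the biped dynamics or the paper's designed parameters.

```python
import numpy as np

def simulate(K, dt=0.01, T=2.0):
    """Toy one-joint model: qddot = u, state x = [q, qdot].

    Tracking law u = -K (x - x_d) toward a constant reference
    q_d = 0.3 rad; returns the two (unnormalized) objectives:
    integrated |angle error| and integrated |control effort|."""
    x = np.array([0.0, 0.0])
    xd = np.array([0.3, 0.0])
    err_sum = u_sum = 0.0
    for _ in range(int(T / dt)):
        u = -float(K @ (x - xd))
        x = x + dt * np.array([x[1], u])   # explicit Euler step
        err_sum += abs(x[0] - xd[0]) * dt
        u_sum += abs(u) * dt
    return err_sum, u_sum

# Hypothetical high-gain point (A-like) vs. low-effort point (C-like).
e_hi, u_hi = simulate(np.array([40.0, 12.0]))
e_lo, u_lo = simulate(np.array([5.0, 4.0]))
```

The high-gain design drives the error integral down at the cost of a larger effort integral, while the low-gain design does the opposite, reproducing the conflict between the two objective functions that shapes the Pareto front.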
By regarding Figure 2, all the optimal points in the Pareto front are nondominated and could be selected to design the controller. However, it is crucial to note that selecting a better point with respect to one objective function leads to a worse value of the other, so the final point must be chosen based on the design criteria; the chosen optimum design points are listed in Table 2. The real tracking trajectories and phase planes of the optimum design points A, B, and C are shown in Figures 3, 4, and 5.

Conclusions
In this study, an optimal robust state feedback controller is used to control biped robots walking in the lateral plane on a slope. To this end, a biped robot is regarded and modeled in that plane. State feedback control is employed as a robust controller to handle the heavily nonlinear dynamic equations of the robot. Moreover, a multiobjective particle swarm optimization algorithm is used to design the parameters of the proposed controller. In the proposed algorithm, effectual techniques are used, such as a fuzzy-based approach which prunes the archive, the turbulence operator which helps particles escape local minima, and the Sigma method which finds the best personal positions of the particles. The results elucidate that the proposed method performs effectively in designing the parameters of the controller in comparison to three well-known algorithms: modified NSGAII, the Sigma method, and MATLAB's Toolbox MOGA. Indeed, the proposed approach can be regarded as a promising approach to control various similar nonlinear systems, especially biped robots. Furthermore, the normalized summation of angle errors and the normalized summation of control effort are two conflicting objective functions. By using three points of the obtained Pareto front, six parameters of the controller are designed. The first point chosen from the Pareto front has the minimum normalized summation of angle errors, the third point has the minimum normalized summation of control effort, and the second point is the trade-off point considering both objective functions simultaneously.