Controller Design for Rotary Inverted Pendulum System Using Evolutionary Algorithms

This paper presents evolutionary approaches for designing a rotational inverted pendulum (RIP) controller, including genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO) methods. The goal is to balance the pendulum in the inverted position. Simulation and experimental results demonstrate the robustness and effectiveness of the proposed controllers with regard to parameter variations, noise effects, and load disturbances. The proposed methods can be considered promising ways to control various similar nonlinear systems.


Introduction
During the past decades, many modern control methodologies such as nonlinear control, optimal control, adaptive control, and variable structure control have been widely proposed [1-3]. However, these methods are theoretically complex and difficult to implement. The proportional-integral-derivative (PID) controller, in contrast, is a well-known method for industrial control processes. The latter approach has been broadly employed in industry because of its simple structure and robust performance over a wide range of operating conditions [3]. Unfortunately, it has been difficult to tune PID controller gains accurately because many industrial plants are very complex, involving issues such as higher order, time delays, and nonlinearities [4, 5]. Ziegler and Nichols proposed the first method utilizing classical tuning rules. However, it is hard in general to determine optimal PID controller parameters with the Ziegler-Nichols formula [1, 3].
To overcome these difficulties, various methods have been proposed. The ability of numerical methods to characterize the quality of a particular design efficiently and accurately has encouraged control engineers to apply stochastic global optimizers. Over the past years, several evolutionary algorithms have been proposed to search for optimal PID controllers. Among them, GA has received great attention and PSO has been successfully applied to various fields [6-10]. ACO is a relatively recent approach for solving optimization problems by modeling the behavior of ant colonies, whose ants are known to find the shortest path from their nest to a food source [11]. The ACO method proposed in our paper has the following advantages: applicability to any kind of optimization problem, combinatorial or continuous; easy implementation; a high rate of successful optimizations; and low run time.
In this paper, we compare the efficiency of three intelligent algorithms, that is, the GA, PSO, and ACO methods. These evolutionary algorithms are used to adjust the PID controller parameters in order to ensure adequate servo and regulatory behavior of the closed-loop system. We also formulate the design of PID controllers as an optimization problem over five performance indexes, that is, maximum overshoot, rise time, settling time, and steady-state error of the response, and system control energy.

Rotational Inverted Pendulum System
The rotational inverted pendulum system is a well-known test platform for evaluating various control algorithms. It also has some significant real-life applications such as position control, aerospace vehicle control, and robotics [12, 13]. The system consists of a rotational arm and a pendulum, where the rotational arm is actuated by a motor with the objective of balancing the pendulum in the inverted position. A schematic diagram of the RIP system is shown in Figure 1, where $u$, $l_p$, $m_p$, $\alpha$, $r$, $\theta$, and $J_b$ are the motor input, the pendulum length, the pendulum mass, the pendulum angle, the arm length, the arm angle, and the effective mass moment of inertia, respectively.
The plane of the pendulum is orthogonal to the radial arm. Figure 2 shows the RIP system built in the robotics research lab in our department, and the block diagram of the whole system is shown in Figure 3.
In this section, the dynamic equations of the RIP system considering backlash and friction effects are presented. The RIP dynamics are governed by [12, 14]

2.3
The parameters of the nonlinear model of the system are listed in Table 1. Using (2.1)-(2.3), the RIP system is easily simulated using Simulink and Matlab. The Simulink block diagram of the RIP system is shown in Figure 4, and Figure 5 illustrates the step response without controllers, indicating that the uncontrolled system is unstable. The controller parameters generated by the heuristic algorithms are employed iteratively in the relevant simulation blocks, and the cost function is calculated in the manner presented in the next section.

GA, PSO, and ACO: An Overview
In the following, the principles of GA, PSO, and ACO are briefly reviewed.

GA
According to Darwin's original ideas, life in all its diverse forms evolves through natural selection and adaptation processes governed by the survival of the fittest. GA is an evolutionary optimizer that takes a sample of possible individuals and employs selection, crossover, and mutation as the primary operators for optimization [15, 16].
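The three operators named above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function name `genetic_minimize`, the real-valued encoding, binary tournament selection, uniform crossover, and uniform-resampling mutation are all assumptions made for the sketch. The default population size, crossover rate, mutation rate, and number of generations are the values reported later in the paper.

```python
import random

def genetic_minimize(cost, bounds, pop_size=50, crossover_rate=0.5,
                     mutation_rate=0.01, generations=20, seed=0):
    """Minimize cost(x) over box bounds with a simple real-coded GA (sketch)."""
    rng = random.Random(seed)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=cost)

    def select():
        # Selection (assumed binary tournament): the cheaper of two wins.
        a, b = rng.sample(pop, 2)
        return a if cost(a) < cost(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            child = list(p1)
            if rng.random() < crossover_rate:
                # Crossover (assumed uniform): each gene from either parent.
                child = [g1 if rng.random() < 0.5 else g2
                         for g1, g2 in zip(p1, p2)]
            # Mutation: occasionally resample a gene inside its bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
        candidate = min(pop, key=cost)
        if cost(candidate) < cost(best):
            best = candidate  # keep the best individual seen so far
    return best
```

For example, `genetic_minimize(lambda x: sum(g * g for g in x), [(-5, 5)] * 3)` searches a three-parameter box for a low-cost point; in the controller design setting, the cost would instead come from a closed-loop simulation.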

PSO
The concept of particle swarm optimization (PSO) was developed from the social behavior of swarms of fish, bees, and other animals. PSO is a robust stochastic evolutionary computation method based on the movement of swarms looking for the most fertile feeding location [16].
From the above statements, it is clear that the theoretical bases of the two optimization methods rest upon completely different structures. GA is based on genetic encoding and natural selection, whereas PSO is based on social swarm behavior. PSO rests on the principle that all solutions can be represented as particles in a swarm. Each particle has a position vector and a velocity vector, and each position coordinate represents a parameter value. Similar to GA, PSO requires a fitness evaluation function that takes the particle's position and assigns a fitness value to it. Let $X_i$ and $V_i$ denote the position and velocity of the $i$th particle, and let $X_{PB}$ and $X_{GB}$ be the personal best ($P_{best}$) position of the $i$th particle and the global best ($G_{best}$) position. Each particle is initialized with a random position and velocity. The velocity of each particle is accelerated toward the global best and its own personal best based on the following equation:

$$V_i(t+1) = w\,V_i(t) + c_1\,\mathrm{rand}\,\bigl(X_{PB} - X_i(t)\bigr) + c_2\,\mathrm{Rand}\,\bigl(X_{GB} - X_i(t)\bigr). \tag{3.1}$$

Here rand and Rand are two random numbers in the range [0, 1], $c_1$ and $c_2$ are the acceleration constants, and $w$ is the inertia weight factor. The parameter $w$ helps the particles converge to $G_{best}$ rather than oscillating around it. A suitable selection of $w$ provides a balance between global and local exploration. In general, $w$ is set according to the following equation [16, 17]:

$$w = 0.5\,\bigl(1 + \mathrm{rand}(0, 1)\bigr). \tag{3.2}$$

The positions are updated based on their movement over a discrete time interval $\Delta t$ as follows, with $\Delta t$ usually set to 1:

$$X_i(t+1) = X_i(t) + V_i(t+1)\,\Delta t. \tag{3.3}$$

Then the fitness at each position is reevaluated. If any fitness is greater than that of $G_{best}$, the new position becomes $G_{best}$ and the particles are accelerated toward that point. If a particle's fitness value is greater than that of its $P_{best}$, then $P_{best}$ is replaced by the current position.
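The PSO update described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in this work: the function name `pso_minimize`, the initial velocity range, and the clipping of positions to the search bounds are choices made for the example. Because the design problem minimizes a cost, a "greater fitness" is taken here to mean a lower cost.

```python
import random

def pso_minimize(cost, bounds, n_particles=50, c1=1.5, c2=1.5,
                 iterations=20, seed=0):
    """Minimize cost(x) over box bounds with a basic PSO (sketch)."""
    rng = random.Random(seed)
    # Random initial positions, then random initial velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-(hi - lo), hi - lo) for lo, hi in bounds]
         for _ in range(n_particles)]
    p_best = [list(p) for p in x]          # personal best positions
    p_cost = [cost(p) for p in x]
    g = min(range(n_particles), key=lambda i: p_cost[i])
    g_best, g_cost = list(p_best[g]), p_cost[g]   # global best

    for _ in range(iterations):
        for i in range(n_particles):
            w = 0.5 * (1.0 + rng.random())  # inertia weight w = 0.5(1 + rand)
            for d, (lo, hi) in enumerate(bounds):
                # Accelerate toward the personal best and the global best.
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (p_best[i][d] - x[i][d])
                           + c2 * rng.random() * (g_best[d] - x[i][d]))
                # Position update with time step 1; clipping is an assumption.
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            c = cost(x[i])
            if c < p_cost[i]:
                p_best[i], p_cost[i] = list(x[i]), c
                if c < g_cost:
                    g_best, g_cost = list(x[i]), c
    return g_best, g_cost
```

The defaults (50 particles, acceleration constants of 1.5, 20 iterations) mirror the settings reported in the paper; drawing a fresh inertia weight per particle per iteration is one possible reading of the inertia weight rule.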

ACO
ACO is a relatively recent approach for solving optimization problems by modeling the behavior of ant colonies, whose ants are known to find the shortest path from their nest to a food source [18]. It has been recognized as effective for combinatorial optimization problems.
All ants start their tours at the source node and end them at the destination node. At each node, an ant chooses its path probabilistically, and the probability of choosing an edge is proportional to the pheromone on that edge, that is, roulette wheel selection.
All edges have an initial amount of pheromone, $\tau_0$. After completion of all tours, the pheromone values on all edges are first lowered, reflecting evaporation:

$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij},$$

where $\tau_{ij}$ is the pheromone on edge $(i, j)$ and $\rho \in (0, 1]$ is the pheromone evaporation rate. Then all ants deposit pheromone on all edges they have crossed in their tours:

$$\tau_{ij} \leftarrow \tau_{ij} + \frac{c_0}{\mathrm{cost}_k}, \quad \forall (i, j) \in T_k,$$

where $T_k$ is the tour of the $k$th ant, $\mathrm{cost}_k$ is its cost, and $c_0$ is a positive constant that adjusts the maximum pheromone deposit.
The algorithm parameters have been chosen by trial and error as follows. For the GA method: population size = 50, crossover rate = 0.5, mutation rate = 0.01, and maximum generations = 20. For the PSO algorithm: number of particles = 50, acceleration constants $c_1 = c_2 = 1.5$, and maximum iterations = 20. For the ACO algorithm: $\rho = 0.4$, $\tau_0 = \tau_{max} = 10$, $\tau_{min} = 1$, and $c_0 = 0.5$. Each algorithm is implemented in Matlab. All of the programs are run on a 2.1 GHz Core 2 Duo processor with 2 GB of memory. Each of the optimization methods is tested in 50 independent runs involving 50 different initial trial solutions.
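The roulette wheel selection and the two-stage pheromone update can be illustrated with a short Python sketch. Several details here are assumptions made for the example: the function names, the dictionary-of-edges representation, the deposit amount $c_0/\mathrm{cost}_k$ (the common Ant System form), and the final clamping of pheromone to the interval between $\tau_{min}$ and $\tau_{max}$.

```python
import random

def roulette_select(weights, rng):
    """Pick an index with probability proportional to its weight."""
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off

def update_pheromone(tau, tours, costs, rho=0.4, c0=0.5,
                     tau_min=1.0, tau_max=10.0):
    """Evaporate, deposit on visited edges, then clamp (clamping assumed)."""
    # Evaporation: every edge loses a fraction rho of its pheromone.
    for edge in tau:
        tau[edge] *= (1.0 - rho)
    # Deposit: each ant adds c0 / cost on the edges of its own tour
    # (assumed deposit rule; cheaper tours deposit more).
    for tour, cost_k in zip(tours, costs):
        for edge in tour:
            tau[edge] += c0 / cost_k
    # Keep pheromone inside [tau_min, tau_max].
    for edge in tau:
        tau[edge] = min(max(tau[edge], tau_min), tau_max)
    return tau
```

With two edges at pheromone 10 and one ant of cost 2 crossing only the first, the first edge ends at 10 × 0.6 + 0.5 / 2 = 6.25 and the second at 6.0, so the edge on the cheaper tour becomes more likely to be chosen in the next round.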

Problem Formulation and Controller Design
The considered performance indexes include the overshoot $M_P$, rise time $T_r$, settling time $T_s$, steady-state error $E_{ss}$, and control energy $E_u$. We seek the controller parameters that minimize these performance indexes. The proposed cost function is formulated following [3, 19], where β is the weighting factor. In this paper, β is set to 1 and 1.5 to investigate different possible solutions.
The stages of the algorithms for searching for proper PID controller parameters are as follows. First, the lower and upper bounds of the controller parameters are specified and the particles of the population are initialized randomly. Each particle, that is, a set $K$ of controller parameters, is sent to Matlab Simulink. Then, the values of the performance criteria in the time domain, namely, $M_P$, $T_r$, $T_s$, $E_{ss}$, and $E_u$, are calculated iteratively. After that, the cost function is evaluated for each particle according to these performance criteria. If the cost of the local best solution is less than the cost of the current global best solution, the global solution is replaced with the local solution. At the end of each iteration, the program checks the stop criterion. If the number of iterations reaches the maximum designated by the user, the latest global best solution is recorded and the algorithm terminates.
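To make the time-domain criteria concrete, the sketch below computes overshoot, rise time, settling time, and steady-state error from a sampled step response. It is only an illustration: the function name `step_metrics`, the 10% to 90% rise-time definition, the 2% settling band, and the requirement that the reference differ from the initial output are assumptions, and the control energy and the weighted cost function itself are not reproduced here.

```python
def step_metrics(t, y, y_ref, band=0.02):
    """Return (overshoot, rise_time, settling_time, steady_state_error).

    t, y  : equally long lists of sample times and outputs
    y_ref : reference value, assumed different from the initial output y[0]
    band  : settling band as a fraction of the step size (assumed 2%)
    """
    y0 = y[0]
    span = y_ref - y0
    z = [(yi - y0) / span for yi in y]        # normalized response, 0 -> 1
    overshoot = max(0.0, max(z) - 1.0)         # fraction of the step size

    inf = float("inf")
    # Rise time (assumed 10%-90% definition).
    t10 = next((ti for ti, zi in zip(t, z) if zi >= 0.1), inf)
    t90 = next((ti for ti, zi in zip(t, z) if zi >= 0.9), inf)
    rise_time = t90 - t10 if t90 != inf else inf

    # Settling time: first sample after which the response stays in the band.
    outside = [i for i, zi in enumerate(z) if abs(zi - 1.0) > band]
    if not outside:
        settling_time = t[0]
    elif outside[-1] + 1 < len(t):
        settling_time = t[outside[-1] + 1]
    else:
        settling_time = inf                    # never settles in the record

    steady_state_error = abs(y_ref - y[-1])
    return overshoot, rise_time, settling_time, steady_state_error
```

In an optimization loop of the kind described above, each candidate set of controller gains would be simulated, its response passed to a function like this one, and the resulting values combined into the scalar cost.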

Simulation Results
The lower and upper bounds of the three controller parameters are shown in Table 2.
In order to examine the dynamic behavior and convergence characteristics of the proposed methods, two statistical indexes are used, namely, the mean value μ and the standard deviation σ of the cost values of all individuals during the computation processes. The mean value reflects the accuracy of the algorithm, and the standard deviation measures the convergence speed of the algorithm. These values are calculated as follows, respectively [19, 20]:

$$\mu = \frac{1}{n} \sum_{i=1}^{n} \mathrm{cost}(P_i), \qquad \sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \bigl(\mathrm{cost}(P_i) - \mu\bigr)^2}, \tag{5.1}$$

where $\mathrm{cost}(P_i)$ is the cost value of the $i$th individual and $n$ is the population size.
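As a small worked example, the two indexes can be computed for the cost values of one population as shown below. The function name `cost_statistics` is hypothetical, and the population form of the standard deviation (division by n rather than n - 1) is an assumption.

```python
import math

def cost_statistics(costs):
    """Mean and population standard deviation of a list of cost values."""
    n = len(costs)
    mean = sum(costs) / n
    # Population form (divide by n); this normalization is an assumption.
    std = math.sqrt(sum((c - mean) ** 2 for c in costs) / n)
    return mean, std
```

For the costs 2, 4, 4, 4, 5, 5, 7, 9 this gives a mean of 5 and a standard deviation of 2. A standard deviation that shrinks quickly across iterations means the individuals are clustering around a common cost value.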
Several simulations are performed to investigate and compare the controllers' convergence characteristics. As can be seen in Figures 6 and 7, although all controllers reach a stable mean cost value under the same cost function and simulation conditions, the ACO-PID controller has a better cost value and mean value, showing that it achieves better accuracy. Figures 8 and 9 also show that the standard deviation of the cost values converges much faster for the ACO-PID controller than for the others, which indicates that the ACO method has better convergence efficiency. In terms of run time, our simulation results show that the ACO method is faster than GA and slower than PSO: 20 iterations take 2408.17 sec for ACO, compared with 1805.42 sec for PSO and 3865.903 sec for GA.

Servo Behavior
In the following, the optimization procedure is applied to the RIP system in reference tracking (servo behavior). To analyze this behavior, the reference signal $y_r$ for the pendulum angle is first set to zero and then set to π. Figures 10 and 11 show the best results for the arm and pendulum angles and the control energy for different values of β. As can be seen, the ACO-based controller produces good responses, indicating its superiority over the other controllers. The simulation results of the best solution for the various values of β in 50 runs are summarized in Table 3. As can be observed from Table 3, the overshoot is decreased using ACO-PID in comparison with GA-PID and PSO-PID. This improvement is 25.693% (β = 1) and 19.335% (β = 1.5) in comparison with PSO-PID, and 55.071% (β = 1) and 37.936% (β = 1.5) in comparison with GA-PID. The rise time of the step response is quite similar in all simulations, and the settling time of the step response using GA is better than that of the other controllers. In terms of total cost value, the ACO-PID controlled systems have the lowest costs. Next, the servo performance is considered for the reference signal $y_r$ of the pendulum angle equal to π. Figures 12 and 13 show the pendulum angle servo response for $y_r = \pi$ and different values of β.

Regulatory Behavior
Control systems are always subject to external disturbances and internal noise, which affect the system dynamics. If the nature of a disturbance is identified, it can be modeled mathematically. In practice, however, the nature of the disturbances is often unclear and they may not be easy to simulate. In the RIP system, disturbance and noise effects can be applied by adding an additional load to the end of the pendulum and by adding noise to the position sensor, respectively. The simulations are carried out subject to band-limited white noise (noise power 0.000523 and sampling time 0.1 sec) and 10% changes in parameter values. The simulation results shown in Figures 14 and 15 illustrate the robustness and effectiveness of the proposed controllers subject to noise and disturbance. The servo and regulatory results motivate considering the proposed procedure as a suitable tool for controller parameter design and encourage further research on the design and development of other practical control systems.
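A sensor noise sequence of this kind can be approximated in a few lines of Python. The sketch assumes the convention commonly used for discrete band-limited white noise sources, in which each held sample is Gaussian with variance equal to the noise power divided by the sampling time; the function name is hypothetical, and the noise power is read as 0.000523.

```python
import math
import random

def band_limited_white_noise(noise_power, sample_time, n_samples, seed=0):
    """Gaussian samples with variance noise_power / sample_time (assumed)."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_power / sample_time)
    return [rng.gauss(0.0, sigma) for _ in range(n_samples)]

# Noise for a 60 s run at the 0.1 s sampling time quoted in the text.
noise = band_limited_white_noise(0.000523, 0.1, 600)
```

Under this assumption the standard deviation is about 0.072 in the units of the position signal, and each sample would be added to the measured angle before it reaches the controller.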

Experimental Results
We have performed experiments on the RIP system set up in the robotics research lab at the University of Tabriz. The interface card used in this project is the PCI-6602, which connects the computer to the system and has A/D and D/A converters. The arm and pendulum link angles are measured using two Autonics E40S encoders. The experimental results of the proposed methods on the RIP system are shown in Figures 16 and 17. In these figures, the time interval [0, 1.2], which belongs to the swing-up period, is omitted in order to focus on the controllers' performance in the balance mode. The experimental results are very consistent with the simulation results shown in Figures 10 and 11, which not only supports the performance of the proposed methods but also verifies the validity of the system model. The fact that the simulated and real controlled responses are practically identical validates the identified system model.
In order to study the stability of the control system designed using the ACO algorithm subject to parameter variations, we perform the following experiment. In this experiment, the effect of adding a body with a mass of 25 g and a length of 13 cm, in the presence of disturbances, is examined. Figure 18 shows the results of this practical test. The results indicate that the proposed method is a very powerful technique for eliminating the effects of the disturbances and providing satisfactory servo behavior.

Conclusion
In this paper, we present three evolutionary algorithms for designing intelligent controllers for the RIP system. Each of the algorithms is tested in 50 independent runs involving 50 different initial solutions. The rotational inverted pendulum system is considered as a case study. The simulation results show that the proposed controllers search efficiently for proper PID parameters. To evaluate controller performance, we tested the ability of the closed-loop system to follow set point changes (servo behavior) and its ability to reject disturbances (regulatory behavior). The work demonstrates that all methods can search for and tune the controller parameters efficiently. The proposed methods could be considered promising approaches for nonlinear control systems in general. One important feature of the system is the use of the xPC-Target toolbox and an input-output card in the Simulink environment, which provides hardware-in-the-loop (HIL), telelab implementation, and fast-prototyping capabilities. Our future research will employ other cognitive methods in order to achieve better results in controller design and to improve real-time performance. The implementation of heuristic algorithms for designing adaptive controllers will also be a challenging future task. Furthermore, teleoperation control of the RIP system using a haptic device would be another challenge.

Figure 1: Schematic view of the RIP system.

Figure 4: Block diagram of the RIP system.

Figure 6: Convergence tendency of mean values of cost function with β = 1.

Figure 8: Convergence tendency of standard deviation values of cost function with β = 1.

Figure 9: Convergence tendency of standard deviation values of cost function with β = 1.5.

Figure 10: System response: (a) pendulum angle, (b) arm angle, and (c) energy signal, with β = 1 for $y_r = 0$.

Figure 11: System response: (a) pendulum angle, (b) arm angle, and (c) energy signal, with β = 1.5 for $y_r = 0$.

Figure 12: Pendulum angle servo response of the RIP control system with β = 1 for $y_r = \pi$.

Figure 14: System response: (a) pendulum angle, (b) arm angle, and (c) energy signal, with β = 1, subjected to band-limited white noise (noise power 0.000523 and sampling time 0.1 sec) and 10% disturbance.

Figure 15: System response: (a) pendulum angle, (b) arm angle, and (c) energy signal, with β = 1.5, subjected to band-limited white noise (noise power 0.000523 and sampling time 0.1 sec) and 10% disturbance.

Figure 16: Experimental results of system responses with β = 1: (a) pendulum angles, (b) arm angles, and (c) energy signals.

Figure 17: Experimental results of system responses with β = 1.5: (a) pendulum angles, (b) arm angles, and (c) energy signals.

Figure 18: Experimental results of system responses using ACO-PID when a body mass is added to the end of the pendulum and in the presence of disturbances: (a) pendulum angle, (b) arm angle, and (c) energy signal.

Table 1: Parameters of the RIP system.

Table 2: Range of the three controller parameters.

Table 3: Best results of PID controllers with different β values obtained by the PSO, GA, and ACO algorithms in 50 runs.