Improved Artificial Bee Colony Algorithm and Its Application in LQR Controller Optimization

To obtain optimal controller performance and improve design efficiency, this paper applies the artificial bee colony (ABC) algorithm, a metaheuristic inspired by the collective foraging behavior of honey bee swarms, to the design of an optimal linear quadratic regulator (LQR). To accelerate convergence and enhance the population diversity of the traditional ABC algorithm, an improved solution search approach is proposed. In the employed bee phase, the approach borrows the differential mutation procedure of the differential evolution (DE) algorithm and produces uniformly distributed food sources to avoid local optima. In the onlooker bee phase, where the search area has already been narrowed by the employed bees, new solutions are generated around the solution with the higher fitness value so that the fitness values increase monotonically. The improved ABC algorithm is applied to the optimization of the LQR controller for a circular-rail double inverted pendulum system, and the simulation results demonstrate its effectiveness on this optimization problem.


Introduction
The linear quadratic regulator, as a representative optimal control method, has been widely used for complex system control. The objective of the LQR controller is to minimize a quadratic cost function whose weighting matrices are selected by the engineer. Controller synthesis is therefore often an iterative process: the engineer judges the produced "optimal" controller through simulation and then adjusts the weighting matrices to obtain a controller more in line with the specified design goals. This tedious process limits the application of LQR-based controller synthesis [1].
Fortunately, computational intelligence (CI), a set of nature-inspired computational methodologies and approaches including artificial neural networks, genetic algorithms, swarm intelligence, and artificial immune algorithms, has become a rapidly developing research area because of its capacity for intelligent reasoning and decision making. Related research has addressed complex problems in diverse fields, including LQR controller optimization [2][3][4][5].
As one of the most recent swarm intelligence approaches, the artificial bee colony (ABC) algorithm, inspired by the foraging behavior of honey bees, was presented by Karaboga [6] in 2005. It is easy to implement and has few parameters, and numerical comparisons have demonstrated that its performance is competitive with other CI algorithms [7,8]. However, like other swarm-based optimization algorithms, the ABC algorithm suffers from getting trapped in local optima and from slow convergence, and remedies for these two problems tend to conflict with each other [9]. To solve these problems, various improved ABC algorithms have been proposed [9][10][11][12][13]. Most studies so far focus on improving exploration, and only a few [14][15][16] have considered the balance between exploration and exploitation, which is the goal in ABC research. Sharma et al. [14] improved the movement of onlooker bees using the golden section search method to obtain more efficient food locations. Gao et al. [15] formed an orthogonal learning strategy by orthogonal experimental design to construct more promising and efficient candidate solutions. Gao and Liu [16] proposed two improved solution search equations for global optimization inspired by differential evolution (DE) and employed chaotic systems and the opposition-based learning method to enhance the global convergence speed.
In this paper, the ABC algorithm is introduced to optimize the performance of the LQR controller designed for balance control of a rotary double inverted pendulum system. To prevent convergence to local optima and accelerate convergence at the same time, the solution search phase, which is the key part of the algorithm, is improved without using the complex algorithms mentioned above. On the one hand, the differential mutation of the differential evolution (DE) algorithm is adopted in the employed bee search phase to enhance the diversity of the population. On the other hand, in the onlooker bee search phase, a comparison of fitness values between the current and the neighboring solutions is introduced to keep the fitness values increasing monotonically and to accelerate convergence. The major contributions of this paper are as follows: (i) in contrast to the traditional ABC algorithm described in [5], an improved ABC algorithm that incorporates the ideas of DE and fitness comparison is proposed, aiming to accelerate convergence while enhancing the diversity of the population; (ii) the effectiveness of the improved ABC algorithm for optimizing the LQR controller of a rotary double inverted pendulum system is demonstrated.
The rest of the paper is arranged as follows. In Section 2, the theory of the LQR controller is described briefly. In Section 3, the basic ABC algorithm is introduced. In Section 4, the improved ABC algorithm is proposed and related problems are discussed. In Section 5, the simulation and results are given to confirm the proposed method. The conclusions are presented in Section 6.

LQR Controller
LQR is a MIMO design theory with observation-based control which is concerned with operating a dynamic system at minimum cost. The structure of the LQR control system is shown in Figure 1.
For a linear, time-invariant system
$$\dot{x} = Ax + Bu, \tag{1}$$
the LQR controller $K$ can be obtained as $K = R^{-1}B^{T}P$. With the control law $u = -Kx$, the system is stabilized and the quadratic cost function $J$ in (2) is minimized:
$$J = \int_{0}^{\infty} \left( x^{T}Qx + u^{T}Ru \right) dt. \tag{2}$$
Here, the matrix $P$ is obtained by solving the continuous-time Riccati equation (3) for the selected $Q$ and $R$:
$$A^{T}P + PA - PBR^{-1}B^{T}P + Q = 0. \tag{3}$$
Obviously, the cost function, and with it the control performance of the LQR controller, is affected by the weighting matrices $Q$ and $R$. The engineer needs to repeatedly examine the produced "optimal" controllers through simulation and adjust the weighting matrices to get a controller with better performance.
To simplify the LQR design procedure while still obtaining optimal performance, the artificial bee colony (ABC) algorithm is used to select the weighting matrices.
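As a small worked illustration of how the weights shape the controller, consider a hypothetical scalar plant x' = a x + b u (not the pendulum model) with cost equal to the integral of q x^2 + r u^2. In the scalar case the Riccati equation reduces to 2ap - b^2 p^2 / r + q = 0, and its positive root p gives the gain k = bp/r. The function name `scalar_lqr` and all numeric values below are assumptions chosen only for illustration.

```python
import math

def scalar_lqr(a, b, q, r):
    """Return (p, k) for the scalar plant x' = a*x + b*u.

    p is the positive root of the scalar Riccati equation
    2*a*p - (b*p)**2 / r + q = 0, and k = b*p/r is the gain in u = -k*x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0, r > 0")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k

# Hypothetical unstable plant (a > 0): a heavier state weight q yields a larger gain.
a, b, r = 1.0, 1.0, 1.0
for q in (1.0, 10.0, 100.0):
    p, k = scalar_lqr(a, b, q, r)
    print(f"q={q:6.1f}  p={p:.4f}  k={k:.4f}  closed-loop pole={a - b * k:.4f}")
```

Even in this one-dimensional case the gain, and therefore the closed-loop response, depends entirely on the chosen weights, which is what makes the weight selection a natural target for an optimizer.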

The Artificial Bee Colony Algorithm
The ABC algorithm is inspired by the collective foraging behavior of honey bee swarms. As in the real world, the artificial bee colony can be separated into three groups: employed bees, onlooker bees, and scout bees. The employed bees are responsible for exploiting the food sources; they bring loads of nectar from the food source to the hive and share information about the food source with the onlooker bees through the waggle dance. "Onlookers" are those bees that wait in the hive for the information shared by the employed bees, such as distance, direction, and profitability; they then search further around a food source selected according to a probability. "Scouts" are those bees which are currently searching for new food sources in the vicinity of the hive [5,12].
In the ABC algorithm, the food sources represent the possible solutions, and the number of food sources SN is equal to the number of employed bees.

Employed Bees Stage.
Each employed bee searches in its vicinity and generates a new solution, whose $j$th parameter $v_{ij}$ in the $i$th solution is given by
$$v_{ij} = x_{ij} + \varphi_{ij}\,(x_{ij} - x_{kj}), \tag{4}$$
where $k$ is a randomly selected index in $\{1, 2, \ldots, SN\}$, $j$ is the index for the dimension of the optimization problem, and $\varphi_{ij}$ is a random number between $-1$ and $1$, which affects the perturbation range of $x_{ij}$. The nectar amount of the food source, that is, the quality of each solution, is represented by the fitness, which is calculated by
$$\mathrm{fit}_i = \begin{cases} \dfrac{1}{1 + f_i}, & f_i \ge 0, \\ 1 + |f_i|, & f_i < 0. \end{cases} \tag{5}$$
Here, $f_i$, corresponding to the $i$th solution, is the value of the objective function. Since the goal of LQR controller design is to minimize the performance function $J$, which is equal to $f_i$, the fitness corresponding to a solution can be expressed as $1/(1+J)$, and a "better" solution therefore means one with larger fitness.
Figure 2: The new generated solution in ABC algorithm.
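The employed bee move and the fitness mapping can be sketched in Python as follows. This is a minimal sketch of the standard ABC neighbor search, v_ij = x_ij + phi_ij (x_ij - x_kj), together with the fitness 1/(1+f) for a nonnegative objective value. The function names, the requirement that k differs from i, and the choice of a single random dimension j per candidate follow the usual ABC formulation and are assumptions here rather than details stated above.

```python
import random

def fitness(f):
    """Fitness of an objective value f: 1/(1+f) for f >= 0, 1+|f| otherwise."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def employed_candidate(foods, i, rng):
    """Return a copy of foods[i] with one dimension j perturbed:
    v_ij = x_ij + phi * (x_ij - x_kj), with k != i and phi in [-1, 1]."""
    sn, dim = len(foods), len(foods[i])
    k = rng.choice([n for n in range(sn) if n != i])  # assumed: neighbor differs from i
    j = rng.randrange(dim)                            # assumed: one random dimension
    phi = rng.uniform(-1.0, 1.0)
    v = list(foods[i])
    v[j] = foods[i][j] + phi * (foods[i][j] - foods[k][j])
    return v

# Hypothetical two-dimensional food sources.
rng = random.Random(0)
foods = [[1.0, 2.0], [3.0, 5.0], [0.5, 0.5]]
print(employed_candidate(foods, 0, rng), fitness(1.0))
```

Because only one coordinate changes per candidate, the perturbation stays local to the current source, which is the behavior the improved search operations later modify.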

Onlooker Bees Phase.
After all employed bees finish the searching process, they share the fitness of each food source with the onlookers, each of whom selects a food source according to the probability in (6), which is proportional to the nectar amount of the food source:
$$p_i = \frac{\mathrm{fit}_i}{\sum_{n=1}^{SN} \mathrm{fit}_n}. \tag{6}$$
Then, a better food source around the chosen food source is searched for randomly, based on the fitness.
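The fitness-proportional selection used by the onlookers can be sketched as a roulette wheel. The helper names below are hypothetical, and the fitness values in the usage lines are made up for illustration.

```python
import random

def selection_probabilities(fits):
    """Probability of each source: its fitness divided by the total fitness."""
    total = sum(fits)
    return [f / total for f in fits]

def roulette_select(probs, rng):
    """Return an index i drawn with probability probs[i]."""
    u = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off

# Hypothetical fitness values for three food sources.
rng = random.Random(1)
probs = selection_probabilities([0.1, 0.3, 0.6])
picks = [roulette_select(probs, rng) for _ in range(10)]
print(probs, picks)
```

Sources with larger fitness attract more onlookers but weaker sources keep a nonzero chance, so the selection pressure is soft rather than strictly greedy.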

Scout Bees Phase.
If a solution does not improve for a number of iterations, the food source is abandoned and the associated employed bee becomes a scout bee. A random search is then performed and a new solution is generated as
$$x_{ij} = x_j^{\min} + \mathrm{rand}(0, 1)\,\left(x_j^{\max} - x_j^{\min}\right), \tag{7}$$
where $x_j^{\min}$ and $x_j^{\max}$ are the bounds of the $j$th dimension. The flow chart of the ABC algorithm described above is presented in Figure 3.
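A scout step might look like the sketch below. The stagnation counter `trials` and the threshold `limit` are the usual ABC bookkeeping and are assumptions here, since the text only speaks of "a number of iterations". Replacing every stagnant source in one pass is a simplification. The bounds in the usage lines are the ones quoted later in the simulation section.

```python
import random

def scout_solution(lower, upper, rng):
    """Draw each dimension uniformly between its lower and upper bound."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]

def replace_abandoned(foods, trials, limit, lower, upper, rng):
    """Replace each source whose stagnation counter exceeds `limit`.

    Modifies foods and trials in place and returns the replaced indices.
    """
    replaced = []
    for i, t in enumerate(trials):
        if t > limit:
            foods[i] = scout_solution(lower, upper, rng)
            trials[i] = 0
            replaced.append(i)
    return replaced

lower = [0.6] * 6
upper = [1000.0, 1000.0, 1000.0, 50.0, 50.0, 50.0]
rng = random.Random(0)
foods = [scout_solution(lower, upper, rng) for _ in range(3)]
trials = [0, 25, 3]  # hypothetical counters; only source 1 exceeds the limit
print(replace_abandoned(foods, trials, limit=20, lower=lower, upper=upper, rng=rng))  # prints [1]
```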

The Improved ABC Algorithm
To address the slow convergence speed and premature convergence of the standard ABC algorithm, an improved ABC algorithm is proposed whose changes mainly concern the search mechanism.

New Search Operation Based on Differential Evolution.
DE [17] is a population-based evolutionary algorithm; the crucial idea behind DE is a scheme for generating solutions in the mutation process. Three different population vectors are selected as parents, and one of these three individuals is selected as the main parent. DE generates a new vector by adding a weighted difference between two members to the main parent, as shown in
$$v_i = x_{r_1} + F\,(x_{r_2} - x_{r_3}). \tag{8}$$
The integers $r_1$, $r_2$, and $r_3$ are chosen randomly, are mutually different, and are also different from the running index $i$. $F$ is a real and constant factor which determines the distribution of the generated solution.
Figure 3: The flow chart of ABC algorithm.
As a result of the vector operation, distance and direction information is extracted and taken into account when generating the new vector; thus, a broader solution space is covered and the new vector can fall in a wider range than in Figure 2. A uniformly distributed solution is generated, which can fully express the characteristics of the solution space and enhance the diversity of the population [18].
To maintain population diversity, especially in the employed bee phase, which is the first stage of the ABC algorithm, the $j$th parameter $v_{ij}$ of the $i$th solution is calculated, motivated by DE and based on the properties of the ABC algorithm, with the help of three randomly selected sources as
$$v_{ij} = x_{r_1 j} + F\,(x_{r_2 j} - x_{r_3 j}), \tag{9}$$
where $r_1$, $r_2$, and $r_3$ are mutually different random integer indices selected from $[1, SN]$, and $F$, as in DE, is a positive real number that controls the rate of evolution. To avoid an adverse impact on the exploitation performed by the employed or onlooker bees, the value of $F$ is usually chosen in $[0.01, 0.2]$. Therefore, as Figure 4 shows, the new solution is generated randomly using position information of the three randomly selected sources, such as distance and direction, and both the population diversity and the capability for global optimization are improved.
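A minimal Python sketch of this DE-style employed bee move is given below. It assumes that only one randomly chosen dimension j is replaced while the other coordinates are kept from the current source, which is how standard ABC builds candidates; the text above does not settle this point. The name `de_employed_candidate` and the parameter name `scale` (the positive factor in the range 0.01 to 0.2) are hypothetical.

```python
import random

def de_employed_candidate(foods, i, scale, rng):
    """Return a copy of foods[i] whose dimension j is rebuilt from three
    mutually different random sources a, b, c:
    v_ij = x_aj + scale * (x_bj - x_cj)."""
    sn, dim = len(foods), len(foods[i])
    a, b, c = rng.sample(range(sn), 3)  # mutually different indices; needs sn >= 3
    j = rng.randrange(dim)              # assumed: one random dimension per candidate
    v = list(foods[i])
    v[j] = foods[a][j] + scale * (foods[b][j] - foods[c][j])
    return v

# Hypothetical population of five two-dimensional sources.
rng = random.Random(0)
foods = [[float(n), float(10 * n)] for n in range(5)]
print(de_employed_candidate(foods, 0, 0.14, rng))
```

Because the base point is a random source rather than the current one, the candidate can land anywhere the population already reaches, which is the diversity effect described above. The small scale factor keeps the added difference term modest.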

New Search Operation Based on Fitness Comparison.
As mentioned above, the improved search mechanism raises the probability of reaching the global optimal solution; on the other hand, the new solutions are selected randomly, and the fitness of a new candidate food source $v_{ij}$ may be higher or lower than that of the current source. Thus, the optimization process needs more iterations and is less efficient. In this part, another new source search mechanism is proposed, aiming to keep the fitness values increasing monotonically and to accelerate convergence.
The new candidate food source is determined by comparing the fitness of the current and neighboring food sources, as shown in (10), where $\mathrm{Fit}(x_i)$ stands for the fitness value of the current $i$th food source, $\mathrm{Fit}(x_k)$ represents the fitness value of the neighboring $k$th food source, and $\varphi_{ij}$ is a random number in $[0, C]$; the parameter $C$ is a positive constant that ensures that the newly searched solution is close to the one with the higher fitness value.
From (10), the value of $v_{ij}$ depends not only on the difference between the two neighboring positions but also on the fitness values of the two related food sources. By comparing the fitness values of the current food source and the candidate food source, the newly generated source always appears around the source with the larger fitness value, as the red circle in Figure 5 shows. Thus, the search area is concentrated effectively and the optimization process is accelerated.
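The exact form of (10) is not reproduced in the text above, so the following Python sketch shows only one update rule that is consistent with the description, not necessarily the authors' equation. It anchors the candidate at whichever of the two sources has the larger fitness and steps away from the less fit one by a random coefficient in [0, c_max]. The function name, the parameter name `c_max` for the positive constant, and the single perturbed dimension are all assumptions.

```python
import random

def onlooker_candidate(foods, fits, i, c_max, rng):
    """Candidate for source i built around the fitter of sources i and k.

    Assumed rule: v_j = best_j + phi * (best_j - other_j), phi in [0, c_max],
    where `best` is the source with the larger fitness.
    """
    sn, dim = len(foods), len(foods[i])
    k = rng.choice([n for n in range(sn) if n != i])
    if fits[i] >= fits[k]:
        best, other = foods[i], foods[k]
    else:
        best, other = foods[k], foods[i]
    j = rng.randrange(dim)
    phi = rng.uniform(0.0, c_max)
    v = list(foods[i])
    v[j] = best[j] + phi * (best[j] - other[j])
    return v

# Hypothetical one-dimensional example: source 1 is fitter, so the candidate
# for source 0 is placed at or beyond source 1, on the side away from source 0.
rng = random.Random(0)
print(onlooker_candidate([[0.0], [1.0]], [0.5, 0.9], 0, 1.5, rng))
```

Whatever the precise form of (10), the point the sketch illustrates is that the fitness comparison decides which source the candidate is built around, so the search concentrates near the better of the two.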

The Improved ABC Algorithm.
As analysed above, the solution search mechanism described in (9) is known to be robust but less efficient in terms of convergence speed, while the solution search operation in (10), which determines the search direction from the result of the fitness comparison, can improve the convergence performance but may lead to premature convergence. Hence, to take advantage of their complementary properties and avoid their shortcomings, the two solution search operations are hybridized within the traditional ABC algorithm. On the one hand, to enhance the diversity of the population, especially in the employed bee phase, which is the first stage of the ABC algorithm, the search scope should be made as large as possible; thus, the search operation based on differential evolution is adopted there. On the other hand, to accelerate convergence, the search operation based on the comparison of the fitness values of solutions is introduced in the onlooker bee phase, where the preliminary search has already been done by the employed bees. The main steps of the improved ABC algorithm can thus be summarized as follows.
Figure 4: New solution generation as described in (9).
Figure 5: New solution generation as described in (10).
Step 1. Initialize the population of solutions and define the parameters, including SN and MCN (maximum cycle number), as well as the boundary of the solution. Note the selection of the parameters $F$ in (9) and $C$ in (10) used by the two proposed search operations: their values represent the amount of perturbation added to the "main" source and affect the optimized performance.
Step 2. Generate new solutions $v_{ij}$ by the employed bees using (9).
Step 3. Calculate the fitness value of each new candidate food source.
Step 4. Obtain the probabilities $p_i$ of all food sources from their fitness values as shown in (6).
Step 5. Produce the new solutions $v_{ij}$ as shown in (10) by the onlookers.
Step 6. Determine the abandoned solution; if one exists, replace it with a new randomly produced solution $x_i$ generated by a scout bee using (7).
Step 7. Record the best food source position achieved so far.
Step 8. Check whether the termination condition (Cycle = MCN) has been met; if so, output the optimal solution; otherwise, go back to Step 2.
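The steps above can be sketched end to end in Python. This is an illustrative sketch under several assumptions, not the authors' implementation: greedy replacement of a source by a fitter candidate, a stagnation threshold `limit`, clipping to the bounds, one perturbed dimension per candidate, and an onlooker update anchored at the fitter of two sources (one reading of the fitness-comparison rule). The defaults for `sn` and `mcn` are the values used in the simulation section; `scale` and `c_max` mirror the best values reported in the parameter study there. The sphere function in the usage lines is a hypothetical stand-in for the LQR cost.

```python
import random

def fitness(f):
    """Map an objective value to ABC fitness: 1/(1+f) for f >= 0, 1+|f| otherwise."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def improved_abc(objective, lower, upper, sn=10, mcn=300,
                 scale=0.14, c_max=1.5, limit=20, seed=0):
    """Minimize `objective` over the box [lower, upper]; return (best_x, best_fitness)."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        return [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]

    def clip(v):
        return [min(max(x, lo), hi) for x, lo, hi in zip(v, lower, upper)]

    # Step 1: initialize the population and the bookkeeping lists.
    foods = [random_source() for _ in range(sn)]
    fits = [fitness(objective(x)) for x in foods]
    trials = [0] * sn
    b = max(range(sn), key=lambda n: fits[n])
    best_x, best_fit = list(foods[b]), fits[b]

    def greedy_replace(i, v):
        # Assumed greedy selection: keep the candidate only if it is fitter.
        v = clip(v)
        fv = fitness(objective(v))
        if fv > fits[i]:
            foods[i], fits[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(mcn):
        # Steps 2-3: employed bees use the DE-style move from three random sources.
        for i in range(sn):
            r1, r2, r3 = rng.sample(range(sn), 3)
            j = rng.randrange(dim)
            v = list(foods[i])
            v[j] = foods[r1][j] + scale * (foods[r2][j] - foods[r3][j])
            greedy_replace(i, v)
        # Steps 4-5: onlookers pick sources with probability proportional to fitness.
        for _ in range(sn):
            i = rng.choices(range(sn), weights=fits)[0]
            k = rng.choice([n for n in range(sn) if n != i])
            hi_i, lo_i = (i, k) if fits[i] >= fits[k] else (k, i)
            j = rng.randrange(dim)
            v = list(foods[i])
            v[j] = foods[hi_i][j] + rng.uniform(0.0, c_max) * (foods[hi_i][j] - foods[lo_i][j])
            greedy_replace(i, v)
        # Step 6: a scout replaces the most stagnant source once it exceeds `limit`.
        w = max(range(sn), key=lambda n: trials[n])
        if trials[w] > limit:
            foods[w] = random_source()
            fits[w] = fitness(objective(foods[w]))
            trials[w] = 0
        # Step 7: remember the best source found so far.
        b = max(range(sn), key=lambda n: fits[n])
        if fits[b] > best_fit:
            best_x, best_fit = list(foods[b]), fits[b]
    # Step 8: the loop ends when the cycle count reaches MCN.
    return best_x, best_fit

def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(t * t for t in x)

best_x, best_fit = improved_abc(sphere, [-5.0] * 3, [5.0] * 3, mcn=100)
print(best_x, best_fit)
```

For the LQR problem, `objective` would map a candidate weight vector to the resulting cost, for example by solving the Riccati equation for those weights and evaluating the cost on a simulated response.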

Simulation and Results Analysis
In this section, to evaluate the optimization performance, both the improved ABC algorithm and the traditional ABC algorithm are applied to LQR controller optimization for the same circular-rail double inverted pendulum system, and the results are compared and analyzed.

Control Plant.
As a typical underactuated system, the circular-rail inverted pendulum system consists of a horizontal driving arm driven by a torque motor and two-stage pendulum rods, as shown in Figure 6. The control objective is to keep the two-stage pendulum rod balanced in the vertical position through the motion of the driving arm in the horizontal plane.
Based on the Lagrange equation, the state-space equation of the inverted pendulum system could be deduced as (11), with the assumption that the values of  1 and  2 in the vertical balance position are zero.In the state-space equation, the state variables are , the output variables are  = [ 1 ,  2 ,  3 ]  , and the input variable is θ 1 [19].

The Raise of Optimization Problem.
Since the inverted pendulum is an unstable but controllable and observable system, closed-loop LQR control can be introduced to improve its performance. As mentioned above, the purpose of LQR control is to design the state feedback controller $u = -Kx$ that stabilizes the system and minimizes the cost function $J = \int_{0}^{\infty} [x^{T}Qx + u^{T}Ru]\,dt$. For the control of the inverted pendulum system with one input, three outputs, and six state variables, the matrices $Q$ and $R$ are selected as in (12), and the corresponding cost function is expressed as in (13). Here, the matrices $Q$ and $R$, which affect the control performance, should be optimized to minimize $J$ and thereby better balance the two-stage pendulum rod in the vertical position.

LQR Controller Optimization.
Based on the optimization objective, the improved ABC algorithm is introduced. Following the procedure described above and considering both search quality and computational effort, the algorithm parameters SN (the population size) and MCN are fixed to 10 and 300, respectively. The upper and lower boundaries of the solution are set to [1000, 1000, 1000, 50, 50, 50] and [0.6, 0.6, 0.6, 0.6, 0.6, 0.6].
To determine the values of the parameter $C$ in (10) and the factor $F$ in (9), which affect the optimization performance, the relationship between the parameter values and the fitness of the optimized solutions should be obtained. Several groups of different $C$ and $F$ have been chosen and applied to the LQR optimization problem, and the corresponding fitness has been calculated. As can be seen from Figure 7, the relationship between the fitness and the parameter values is neither linear nor monotonic; the fitness values remain almost constant when the values of $C$ and $F$ are larger than 1.8 and 0.17, and the maximum fitness value of 0.566 is obtained for $C = 1.5$ and $F = 0.14$.
With the same initial parameter settings, the optimization procedure has been carried out with both the traditional ABC algorithm and the improved ABC algorithm. As shown in Figure 8, the optimization procedure with the improved ABC algorithm terminates after 98 cycles, and the fitness value corresponding to the optimized solution is 0.566. Compared with the traditional ABC algorithm, both the convergence speed and the fitness values of the solutions are improved. To verify the effectiveness of the improved ABC algorithm and the optimized LQR controller, simulations have been carried out on the circular-rail double inverted pendulum system. A step input with an amplitude of 0.01 rad is applied to the horizontal rod, and the other two inputs are set to zero. As shown in Figure 9, compared with the controller optimized by the traditional ABC algorithm, the step response obtained with the improved ABC algorithm shows better tracking performance: the settling time decreases from about 6 seconds to 3 seconds, while the overshoot of the lower pendulum increases slightly, on the order of $10^{-4}$, which has a very small effect on the whole system.
Furthermore, to verify the correctness of the improved ABC algorithm, Figure 10 shows the relationship between the fitness values obtained for different parameter values and the corresponding control performance of the lower pendulum. It is clear that the settling time and the noise suppression capability of the control plant are proportional to the fitness values, whereas the overshoot, which is small enough, is inversely proportional to the fitness values. It can thus be concluded that the control performance is determined by the fitness value: the higher the fitness value, the better the performance of the control system, as mentioned above.
The foregoing results indicate that the improved ABC algorithm is a very effective method for parameter optimization in LQR controller design and is suitable for the control of the rotary double inverted pendulum system.

Conclusion
This paper focuses on LQR controller optimization for the control of a rotary double inverted pendulum system. To accelerate convergence while enhancing the diversity of the population, an improved ABC algorithm which incorporates the ideas of DE and fitness comparison has been introduced and applied to the optimization problem. Simulations have been carried out, and the results show that the improved ABC algorithm outperforms the traditional ABC algorithm in optimizing the parameters of the LQR controller for the inverted pendulum system.

Figure 1: The structure of LQR control system.

Figure 8: The procedure of optimization.

Figure 9: (a) The step response of horizontal rod. (b) The step response of lower pendulum. (c) The step response of upper pendulum.

Figure 10: (a) The relationship between fitness values and the amplitude of the output with noise. (b) The relationship between the fitness and settling time of lower pendulum. (c) The relationship between the fitness and overshoot of lower pendulum.