A Novel Optimal Control Method for Impulsive-Correction Projectile Based on Particle Swarm Optimization

This paper presents a new parametric optimization approach, based on a modified particle swarm optimization (PSO), for designing a class of impulsive-correction projectiles with discrete, flexible-time-interval, finite-energy control. In terms of optimal control theory, the task is formulated as minimizing both the working number of impulses and the control error, which involves reference model linearization, boundary conditions, and a discontinuous objective function. These features make it difficult to find the global optimum by directly applying other optimization approaches, for example, the hp-adaptive pseudospectral method. Consequently, the PSO mechanism is employed for the optimal setting of the impulsive control by taking the time intervals between two neighboring lateral impulses as design variables, which simplifies the optimization process. The basic PSO algorithm is modified by linearly decreasing the inertia weight to improve the convergence speed. In addition, a suboptimal control law and a guidance law based on the PSO technique are put forward to meet the real-time requirements of online design in practice. Finally, a simulation case coupled with a nonlinear flight dynamic model is used to validate the modified PSO control algorithm. The comparative results illustrate that the proposed algorithm obtains the optimal control efficiently and accurately and provides a reference approach for handling such impulsive-correction problems.


Introduction
Impulsive-correction projectiles, a type of precision-guided munition, produce a directional force for trajectory control by means of small impulsive rockets mounted around the cross section of the airframe, which enables quick maneuvering and precise correction. They have attracted significant research interest owing to their short response time, high effectiveness-to-consumption ratio, and simple guidance [1-5]. Considerable work has been done on the design of the impulse thrust magnitude and its position arrangement [4], real-time computational algorithms [6], the flight stability of impulsive-correction trajectories [7], and so on. For instance, Liu and Willms [8] provided a novel approach to obtaining necessary and sufficient conditions for impulsive controllability of continuous linear dynamics under discrete-time actions, with application to spacecraft maneuvers. Rempala and Zabczyk [9] developed a simple and direct proof of a version of Blaquiere's maximum principle for deterministic fixed-time impulsive control problems. Prado [10] considered optimal maneuvers for inserting a satellite into a constellation by applying two impulses to the satellite, with the objective of minimizing fuel consumption. Nevertheless, because impulsive control is nondifferentiable and discontinuous and the working number of impulses is finite, that is, the correction is discrete and limited to a finite number of firings, it is impossible to implement continuous control of the airframe as with conventional aerodynamic fins.
Numerical optimal control methods currently fall into two classes, each with its own advantages and characteristics: direct methods, based on mathematical programming and parameterization of the state and control histories, and indirect methods, based on solving a two-point boundary value problem (TPBVP) derived from optimal control principles [11-13]. Either class could in principle be applied to obtain an optimal solution to the present problem. In general, direct methods are more widely used than indirect methods because analytical solutions of the indirect formulation are difficult to obtain for complex nonlinear systems. The hp-adaptive pseudospectral method, one of the most popular and efficient direct methods, combines the Legendre pseudospectral method [14,15] with the hp-adaptive method [16] and discretizes the state and control variables on a series of Legendre-Gauss-Lobatto (LGL) points. Its application to impulsive systems is limited because the impulsive control variable is nondifferentiable and acts at flexible time intervals, so the problem does not satisfy the differentiability required by the Karush-Kuhn-Tucker (KKT) conditions.
To tackle such problems effectively, swarm intelligence (SI) based methods, together with other evolutionary algorithms [17-20] such as genetic algorithms (GAs), simulated annealing (SA), and ant-colony optimization, are becoming more popular owing to their speed and accuracy. They are inspired by natural phenomena, for example, the behavior of groups of birds, ant colonies, herds of animals, and even social connections between human beings [19]. As one of the SI-based methods, particle swarm optimization (PSO), first introduced in 1995 by Eberhart and Kennedy [21] and then extended by other researchers [22,23], has performed well in optimizing discontinuous problems because it is simple in concept, easy to implement, and computationally efficient. Several mathematical approaches have also contributed to PSO. For example, Couceiro and Sivasundaram provided a modified PSO algorithm that overcomes drawbacks of the traditional PSO algorithm by using a fractional calculus approach [24]. Pires et al. proposed a novel method for controlling the convergence rate of the PSO algorithm using fractional calculus concepts and observed the relationship between the fractional-order velocity and the convergence of the algorithm [25]. Reference [26] interpreted PSO as a finite difference scheme for solving a system of stochastic ordinary differential equations (SODE) and proposed a class of modified PSO iteration methods based on local attractors of the SODE, which behaved differently for different problems. Unlike traditional optimization techniques, PSO does not rely on strict mathematical properties (continuity, differentiability) of the optimization problem or of its constraints. Reference [27] developed a PSO approach with a penalty function to design static parameters of an impulsive-correction projectile, such as the working number of impulses, the magnitude and axial eccentricity of each impulse thrust, and the fin cant angle. Yang et al. [28] presented a new approach to a fuel-optimal impulsive control problem for a guided projectile using an improved PSO technique; however, this method did not consider the optimal working modes and flexible time intervals in detail. Reference [29] proposed a new method for solving an optimal control problem for a spacecraft reentry trajectory by using PSO and thereby avoiding the calculations needed in common analytical approaches.
Therefore, the aim of this paper is to present a new method, based on a modified PSO, for solving an optimal impulsive control problem with discrete, flexible-time-interval, finite-count correction, a problem that the hp-adaptive pseudospectral method has not fully solved. The remainder of this paper is organized as follows. Section 2 derives the mathematical model of the impulsive-correction projectile system and states the optimal impulsive control problem. Section 3 presents in detail a modification of the basic PSO for the impulsive optimal control design, together with the structure and parameter design of the controller, which solve the optimal setting of the impulsive control for the impulsive-correction projectile. Moreover, a suboptimal control law and a guidance law based on the PSO technique are developed to meet the real-time requirements of online design in practice. Section 4 gives a simulation case that demonstrates the modified PSO algorithm on an impulsive-correction projectile. To validate the performance of the impulsive control design, a specific nonlinear flight dynamic model is also simulated under different conditions, including the optimal and suboptimal algorithms. Finally, conclusions are presented in Section 5.

Problem Statement
For convenience of discussion, the motion of the impulsive-correction projectile in the longitudinal plane is considered in this paper, under the assumption that the Earth is flat and motionless. The flight dynamic equations of the impulsive-correction projectile are given by (1), where the state variables are the true airspeed of the projectile V, the trajectory inclination angle, the pitch angle, the pitch angular rate, the horizontal position, the height, and the mass of the projectile. The remaining quantities are the magnitude and axial eccentricity of the impulsive thrust, the pitching moment, the moment of inertia about the pitch axis, the mass flow rate, and the angle of attack. The drag and lift forces are both functions of the dynamic pressure, the reference area, and the drag and lift coefficients, as shown in (2).
To account for wind disturbance during the flight of the projectile, the governing equations are supplemented by additional relations in which a dedicated subscript marks quantities that describe the motion of the projectile relative to the air stream. The horizontal wind velocity during the flight is taken as positive for downwind and negative for upwind, and the Mach number Ma is a function of the relative airspeed and the speed of sound.
Compared with the time of flight of the entire trajectory, the working time of each impulse is so short that it can be treated as an instantaneous jump. Meanwhile, for the sake of low-cost design, the impulsive rocket is open loop and cannot change the magnitude of its thrust; that is, the impulsive control takes only two states, as given in (3): a force of constant magnitude (see Figure 1) when the flag of the impulse with a given order number is positive, meaning that the impulse is actuated, and no force when the flag is negative, meaning that it is not actuated. The order number runs from 0 up to the total number of impulses equipped.
Figure 2 shows the control and guidance process of the impulsive-correction projectile. The task of this paper can be described as follows: (1) In terms of the reference model, find an optimal solution for the impulsive control u(e(t)) that minimizes the consumption of impulsive control and the control error, subject to the flight dynamic equations (1)-(2), the relevant terminal states, and the control constraints. (2) Based on the optimal numerical solution, find an online control structure and algorithm, presented in Section 3, that yields a suboptimal control suitable for practical realization.

Optimal Control Design
3.1. Reference Model. For the optimal flight control problem defined by (1) and Figure 2, a mathematical model based on linearized disturbance motion is taken as the controlled reference model. Its parameters are the transfer coefficient, the damping, and the time constant of the projectile.

3.2. Objective Function.
Because the impulsive-correction projectile is a low-cost munition, the purpose of optimizing the impulsive control is to satisfy both the cost and the precision requirements of the system, that is, to minimize the working number of impulses and the control error. The performance index therefore combines a control energy term and a control error term. In it, the working number of actuated impulses is scaled by a weighting coefficient, and the control error e(t) is the difference between the controlled state and its reference value, evaluated between the initial time and the end time.

3.3. Flight Constraints.
For the given control model, the constraints for the flight optimization are as follows. The controlled state is bounded between a lower and an upper limit, indicated by the subscripts lower and upper, respectively.

[Figure: block diagram of the control and guidance loop, with guidance law, control law u(e(t)), and reference model blocks.]

Given the limited total number of impulses, the working number of actuated impulses cannot exceed the number of impulses equipped. Similarly, owing to the discontinuity of the impulsive control, the time interval between neighboring impulses (see Figure 1) is subject to the bound in (9). With the above formulation, the optimization problem is to find an effective algorithm that schedules the impulses so as to minimize both the number of impulses used and the control error. Because the impulsive control in (3) and Figure 1 is discrete, unlike conventional continuous control, it is convenient to select the time intervals between each pair of neighboring impulse forces as the design variables.
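The choice of time intervals as design variables can be illustrated with a minimal Python sketch. It assumes that the first interval is measured from the initial time and that each interval has a simple lower bound `min_interval`; the function names and that bound are hypothetical, since the exact form of the interval constraint is not reproduced here.

```python
from itertools import accumulate

def firing_times(intervals, t0=0.0):
    """Turn inter-impulse intervals into absolute firing times (cumulative sum)."""
    return [t0 + elapsed for elapsed in accumulate(intervals)]

def is_feasible(intervals, n_total, t0, tf, min_interval):
    """Check count, spacing, and end-time limits for one candidate schedule."""
    times = firing_times(intervals, t0)
    within_count = len(intervals) <= n_total                   # working number <= equipped number
    well_spaced = all(dt >= min_interval for dt in intervals)  # assumed lower bound on each interval
    before_end = not times or times[-1] <= tf                  # last impulse fires no later than tf
    return within_count and well_spaced and before_end
```

A candidate such as `[1.0, 0.5, 2.0]` maps to firing times of 1.0 s, 1.5 s, and 3.5 s. Encoding the schedule this way keeps the firing times ordered by construction, so the optimizer only has to respect bounds on individual intervals.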

3.4. Optimization Design Based on Modified PSO.
Starting from the control and guidance schematic of the impulsive-correction projectile system outlined in Figure 2, the control structure can be described in detail as in Figure 3, where u(e(t)) is the impulsive controller to be designed in terms of the reference model. The design is completed in two steps. (1) Obtain the numerical solution of the optimal impulsive control, which is derived in this section using a numerical optimization approach. (2) Determine the suboptimal online controller by establishing the relationship between the error e(t) and the control, as described in Section 3.6.
PSO, which belongs to the category of SI methods, mimics the unpredictable motion of a group of birds searching for food and takes advantage of the information-sharing mechanism that shapes the overall behavior of the swarm.
The main strength of PSO is its fast convergence, which results from the cooperation of all individuals in finding the best solution. The initial population that composes the swarm is generated randomly at the first iteration. Considering an n-dimensional search space, the swarm maintains a population of m particles. Each particle i in the swarm is associated with three vectors: a position vector X_i = {x_i1, x_i2, ..., x_in}, which represents a possible solution to the problem and corresponds to a specific value of the objective (fitness) function; a memory of its previous best position P_i = {p_i1, p_i2, ..., p_in}, which records its best value so far (pbest) and the corresponding position in the search space; and a velocity vector V_i = {v_i1, v_i2, ..., v_in}, which denotes the rate of change of the current solution in the search space.
Moreover, each particle knows where the best value of the fitness function has occurred so far in the whole group (gbest). At the end of the process, the best particle in the swarm (corresponding to the best solution with respect to the fitness function) is selected. The velocity and position update equations for a single iteration follow [13], where i = 1, 2, ..., m indexes the particles; k is the iteration counter; and r1, r2 are random numbers generated uniformly in the range [0, 1], which impart randomness to the flight of the swarm. The acceleration coefficients c1 and c2 are nonnegative real constants. The inertia weight w is also nonnegative; to improve the local convergence of the particles as the error decreases, it is reduced linearly from its start value to its end value over the iterations.
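For reference, the widely used textbook form of the PSO update with a linearly decreasing inertia weight is written out below. The symbols (particle index i, dimension index j, iteration k, global best g, and the limits w_start, w_end, k_max) follow common PSO notation and are assumptions here; the source's own typeset equations may differ in notation.

```latex
\begin{align}
v_{ij}^{k+1} &= w^{k}\, v_{ij}^{k}
  + c_1 r_1 \left(p_{ij}^{k} - x_{ij}^{k}\right)
  + c_2 r_2 \left(g_{j}^{k} - x_{ij}^{k}\right), \\
x_{ij}^{k+1} &= x_{ij}^{k} + v_{ij}^{k+1}, \\
w^{k} &= w_{\text{start}} - \left(w_{\text{start}} - w_{\text{end}}\right)\frac{k}{k_{\max}},
  \qquad k = 0, 1, \ldots, k_{\max}.
\end{align}
```

A large inertia weight early on favors exploration of the search space, while the smaller weight late in the run damps the velocities so that the swarm settles around the best position found.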

3.5. Modified PSO Algorithm Process.
With the above modification, Figure 4 describes the whole computational process of the algorithm, which consists of the following steps.
(1) Initialize the PSO settings and the population of particles in the solution space: swarm size, neighborhood size, number of iterations, inertia weight, acceleration coefficients, and random swarm positions and velocities.
(2) Calculate the fitness value of each particle from the fitness function, for use in updating the individual best fitness pbest and the swarm best fitness gbest.
(3) For each particle, compare the fitness value of its current location with that of its individual best pbest. If the current location has a better fitness value (that is, a lower one, since the objective is minimized), replace pbest with the current location. Otherwise, keep the previous pbest.
(4) Similarly, for each particle along its path, compare the fitness value of its current location with that of the global best gbest. If the current location has a better fitness value, replace gbest with the current location.
(5) Update the velocity of each particle according to (10) and its position according to (11).
(6) Check the stopping criteria, that is, whether a newly generated particle has reached the target minimum fitness value or whether the number of iterations has reached the predefined maximum. If either condition is satisfied, the global best of the swarm is taken as the optimum solution of the problem and the iteration stops. Otherwise, return to step (2) for the next iteration.
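The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pso_minimize`, the box `bounds`, the clamping of positions to those bounds, the default swarm size, and the global (rather than neighborhood) best are all choices made for this sketch. The defaults for the acceleration coefficients, the inertia-weight range, and the iteration limit mirror the values quoted in Section 4.1.

```python
import random

def pso_minimize(fitness, bounds, n_particles=30, max_iter=150,
                 c1=1.8, c2=1.8, w_start=1.0, w_end=0.4,
                 target=None, seed=0):
    """Minimize `fitness` over box `bounds` with a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step (1): random initial positions and velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[rng.uniform(lo - hi, hi - lo) for lo, hi in bounds] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [float("inf")] * n_particles
    gbest, gbest_val = pos[0][:], float("inf")

    for k in range(max_iter):
        # Steps (2)-(4): evaluate fitness, then update individual and global bests.
        for i in range(n_particles):
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
            if val < gbest_val:
                gbest, gbest_val = pos[i][:], val
        # Step (6): stop on the last iteration or once the target fitness is reached.
        if k == max_iter - 1 or (target is not None and gbest_val <= target):
            break
        # Step (5): velocity and position update; w falls linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * k / max(max_iter - 1, 1)
        for i in range(n_particles):
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Clamping to the bounds is an assumption of this sketch.
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
    return gbest, gbest_val
```

For the impulsive-control problem, `fitness` would evaluate the performance index for a vector of inter-impulse intervals; here any callable that maps a list of floats to a scalar can be used, for example `pso_minimize(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 3)`.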

3.6. Suboptimal Online Controller Design.
The optimization process above takes too long to compute for real-time control. Consequently, to obtain an online control, the controller u(e(t)) is taken in the form of an inertial (first-order lag) element with two gains, subject to the following constraints.
(1) Owing to the finite number of impulses, the time interval between impulses must satisfy (9).
(2) In accordance with the discrete nature of the impulsive control, the controller output is compared with a threshold value that decides whether a single impulse fires.
The controller parameters, namely the threshold value and the two gains, are selected so that the control effect approaches that of the PSO method as closely as possible. Accordingly, the error between the control outputs of the two methods is taken as the objective function, and the controller parameters are optimized using the PSO method described previously.
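One possible reading of this online controller is sketched below in Python. It is an illustration under several assumptions that the text does not confirm: the inertial element is modeled as a first-order lag T dy/dt + y = K e(t) integrated with an explicit Euler step, an impulse fires when the magnitude of the lag output reaches the threshold, the impulse direction follows the sign of the output, and a hypothetical `min_interval` stands in for the interval constraint. All names are hypothetical.

```python
def online_impulse_schedule(errors, dt, gain, time_const, threshold,
                            n_total, min_interval):
    """Return (time, direction) pairs for impulses triggered by a lagged error signal."""
    y = 0.0            # output of the assumed first-order lag
    last_fire = None   # time of the most recent impulse
    fired = []
    for step, e in enumerate(errors):
        t = step * dt
        # Explicit Euler step of  time_const * dy/dt + y = gain * e.
        y += dt * (gain * e - y) / time_const
        ready = last_fire is None or t - last_fire >= min_interval
        # Fire only if the threshold is reached, the spacing is respected,
        # and unused impulses remain.
        if abs(y) >= threshold and ready and len(fired) < n_total:
            fired.append((t, 1 if y > 0 else -1))
            last_fire = t
    return fired
```

In this reading, the two gains and the threshold are the three parameters that a PSO run would tune so that the resulting schedule tracks the offline optimal one.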

3.7. Guidance Law Design.
For the projectile to hit the target precisely, an appropriate guidance law is also needed; that is, the control command must be formulated from the relative motion between the projectile and the target. Traditionally, the guidance law is designed to drive both the distance and the line-of-sight (LOS) rate between the projectile and the target toward zero. For instance, under the classic proportional navigation (PN) law, the projectile guidance command is proportional to the LOS rate.
Given the finite energy of the impulsive projectile, the guided trajectory should stay as close as possible to the natural trajectory under standard conditions, which saves impulses that are not strictly necessary. A gravity compensation term is therefore added to the classic PN method in the guidance law (14), which involves the guidance command of the trajectory inclination angle, the LOS rate, and the PN and gravity compensation gains. The PN gain is usually chosen between 2 and 6. According to (14), the guidance law can also be written in integral form.
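A gravity-compensated PN command of this kind can be sketched in Python as follows. The LOS-rate formula is the standard planar expression obtained by differentiating the LOS angle atan2(y_T - y, x_T - x). The gravity term, taken here as the natural gravity-turn rate -g cos(theta)/V scaled by a gain, is an assumption of this sketch, as are the function and parameter names; the source's equation (14) may use a different form.

```python
import math

def los_rate(x, y, vx, vy, xt, yt, vxt=0.0, vyt=0.0):
    """Planar LOS rate in rad/s from relative position and relative velocity."""
    rx, ry = xt - x, yt - y          # relative position (target minus projectile)
    rvx, rvy = vxt - vx, vyt - vy    # relative velocity
    return (rx * rvy - ry * rvx) / (rx * rx + ry * ry)

def theta_cmd_rate(q_dot, theta, speed, k_pn=3.0, k_g=1.0, g=9.81):
    """Commanded inclination-angle rate: PN term plus ASSUMED gravity compensation."""
    return k_pn * q_dot - k_g * g * math.cos(theta) / speed

def integrate_command(theta0, rates, dt):
    """Integral form: accumulate the commanded rate with a rectangle rule."""
    theta_c = theta0
    for rate in rates:
        theta_c += rate * dt
    return theta_c
```

With `k_g = 1` and a zero LOS rate, the command reduces to the gravity-turn rate of an unguided point mass, so no correction is demanded while the projectile follows its natural trajectory toward the target.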

Simulation Example and Analysis
4.1. Performance of the PSO Online Controller. In this section, a simulation example using the above method is given for the impulsive control of a particular impulsive-correction projectile system. To validate the performance of the optimal controller, the conditions for the control optimization are set as follows. The transfer coefficient, damping, and time constant of the reference model are set to 0.00048, 0.156, and 0.0455, respectively. The initial and end times are 0 s and 10 s, respectively. The weighting coefficient in the performance index is 0.2, and the total number of impulses is 32. The reference input is chosen as a unit step, with the lower and upper state limits set to −20 deg and +20 deg. The PSO parameters are set to 30 and 3, with both acceleration coefficients equal to 1.8. The inertia weight is reduced linearly from 1.0 to 0.4, and the search terminates when the fitness value falls below 0.4 or the number of iterations reaches 150. The two controller gains are 3500 and 0.68, and the threshold value is 10000 N.
Figures 5-7 give the optimization results and the dynamic response of the reference impulsive system. Figures 5 and 6 show that the dynamic response of the suboptimal controller is similar to that of the PSO approach. Meanwhile, the criterion in Figure 7, which tracks the fitness value over the iterations, stays at the same value over a long run of iterations. This plateau indicates that the particles have converged, which is taken as evidence that the global optimum in the feasible region has been reached. Although PSO is computationally slow, the suboptimal controller compensates for this weakness and makes the approach feasible for online design and application.

4.2. Trajectory Simulation for Validation.
A trajectory simulation case is also presented to validate the nonlinear impulsive control system in detail, in which the trajectory model is integrated numerically with the fourth-order Runge-Kutta method. The impulsive force is 4000 N, and each impulse acts for 0.015 s. The initial state of the projectile system is as follows: time 0 s, mass 40 kg, airspeed 650 m/s, horizontal position 0 m, height 0 m, pitch angular rate 0 rad/s, trajectory inclination angle and pitch angle both 45 deg, and angle of attack 0 deg. The axial eccentricity of the impulsive thrust is 0.015 m, and the horizontal wind velocity is −5 m/s (upwind). The impulsive control starts at an altitude of 3000 m during the descending phase of the trajectory.
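The integration scheme named above can be sketched in Python as follows. The step function is the classical fourth-order Runge-Kutta method; the `vacuum_dynamics` right-hand side is only a placeholder (a drag-free point mass under constant gravity), not the nonlinear flight model of Section 2, and all names are hypothetical.

```python
import math

def rk4_step(f, t, y, h):
    """Advance the state list y by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

G = 9.81  # m/s^2, assumed constant gravity

def vacuum_dynamics(t, state):
    """Placeholder dynamics: drag-free point mass, state = [x, y, vx, vy]."""
    x, y, vx, vy = state
    return [vx, vy, 0.0, -G]

def simulate(state, t_end, h):
    """Integrate the placeholder dynamics from t = 0 to t_end with fixed step h."""
    t = 0.0
    for _ in range(round(t_end / h)):
        state = rk4_step(vacuum_dynamics, t, state, h)
        t += h
    return state

# Launch at 650 m/s and 45 deg, matching the initial airspeed and angle quoted above.
v0, theta0 = 650.0, math.radians(45.0)
final_state = simulate([0.0, 0.0, v0 * math.cos(theta0), v0 * math.sin(theta0)], 1.0, 0.01)
```

Replacing `vacuum_dynamics` with the full right-hand side, including drag, lift, wind, and the impulsive thrust, leaves `rk4_step` unchanged. Because each impulse lasts only 0.015 s, the step size would need to be no larger than that to resolve the thrust pulse.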
The simulation results obtained with the above PSO method for the impulsive-correction projectile are illustrated in the accompanying figures. In the figures, the standard trajectory is the natural free-flight trajectory without disturbance or impulsive control, and the disturbed trajectory is obtained with wind disturbance but without any impulsive control. The optimal trajectory under the PSO algorithm and the suboptimal trajectory under the online algorithm are shown alongside them for comparison.