Training ANFIS Model with an Improved Quantum-Behaved Particle Swarm Optimization Algorithm

This paper proposes a novel method for training the parameters of the adaptive-network-based fuzzy inference system (ANFIS). In contrast to previous works, which emphasized the gradient descent (GD) method, we train the parameters of ANFIS with an improved version of quantum-behaved particle swarm optimization (QPSO). This variant of QPSO employs an adaptive dynamical control method for the contraction-expansion (CE) coefficient, which is the most influential algorithmic parameter for the performance of the QPSO algorithm. The ANFIS trained by the proposed QPSO with adaptive dynamical CE coefficient (QPSO-ADCEC) is applied to five example systems. The simulation results show that the ANFIS-QPSO-ADCEC method achieves smaller training and test errors than the original ANFIS, ANFIS-PSO, and ANFIS-QPSO methods.


Introduction
Fuzzy systems (FSs) have been successfully applied in many areas, such as system modeling and control. To ease the design and improve system performance, many neural or statistical learning approaches that automatically generate fuzzy rules have been proposed [1]. A fuzzy inference system employing fuzzy if-then rules can model the qualitative aspects of human knowledge and reasoning processes without using precise quantitative analyses. This fuzzy modeling or fuzzy identification, first explored systematically by Sugeno and Kang [2], has found numerous practical applications in control [3,4], prediction, and inference [5,6]. However, some basic aspects of this approach are in need of better understanding. First, there are no standard methods for transforming human knowledge or experience into the rule base and database of a fuzzy inference system. Second, there is a need for effective methods for tuning the membership functions (MFs) so as to minimize the output error measure or maximize a performance index. This paper concentrates on the so-called adaptive-network-based fuzzy inference system (ANFIS), which can serve as a basis for constructing a set of fuzzy if-then rules with appropriate membership functions to generate the stipulated input-output pairs [7]. The ANFIS model is representative of the adaptive fuzzy systems that are generated through training processes and are known as evolving fuzzy systems. It has been an active research topic in fuzzy systems in recent years.
The TSK model [2] is a fuzzy system with crisp consequent functions and has been found to be efficient in complex applications [4]. It has been proved that, with a proper number of rules, a TSK system is able to approximate any plant. As such, TSK systems are widely used in ANFIS and have the advantage of good applicability, since they can be interpreted as local linearization models to which conventional linear techniques for state estimation and control can be applied.
The ANFIS combines the advantages of neural networks and fuzzy systems. However, training the parameters of the ANFIS model is one of the main issues encountered when the model is applied to real-world problems. Most of the training methods for the ANFIS are based on gradient descent (GD) approaches, in which the gradient must be calculated at each step through the chain rule and the resulting search may be trapped in one of the many local minima of the problem. Gradient methods are local search approaches whose performance generally depends on the initial values of the parameters, so it is difficult for them to find the globally optimal model parameters. Since the design of FSs can be reduced to an optimization problem, many researchers have proposed to design FSs by employing metaheuristics such as genetic algorithms (GAs) [8][9][10] and particle swarm optimization (PSO) [11][12][13]. However, GAs are often criticized for their slow convergence speed, while PSO may encounter premature convergence at the later stage of the search process and is sensitive to the neighborhood topology.
This paper explores the applicability of a variant of PSO, quantum-behaved particle swarm optimization (QPSO), to the training of the ANFIS model. The QPSO algorithm was inspired by quantum mechanics [14][15][16][17]. Its iterative equation is very different from that of PSO and makes QPSO globally convergent [18][19][20]. Besides, unlike PSO, QPSO needs no velocity vectors for particles and has fewer parameters to adjust, making it easier to implement. Many empirical studies show that QPSO has a stronger global search ability than PSO when solving a wide range of continuous optimization problems [21][22][23][24][25][26][27]. To further improve QPSO, in this paper we propose an adaptive dynamical control method for the contraction-expansion (CE) coefficient of the algorithm, which is the most influential algorithmic parameter and can be tuned to adjust the balance between the local search and the global search of the particle. The improved QPSO algorithm is applied to ANFIS training and is tested on several example systems, with a performance comparison between the improved QPSO, the original QPSO, and PSO algorithms.
The rest of the paper is organized as follows. Section 2 presents a review of fuzzy systems and the ANFIS model. Section 3 presents the principle of QPSO. The improved QPSO approach is proposed in Section 4. Application of QPSO and its improved version to ANFIS model training is described in Section 5. Section 6 presents the simulation results for five example systems using the ANFIS model trained by the optimization algorithms. Finally, the paper is concluded in Section 7.

Fuzzy Systems and ANFIS Model
2.1. Fuzzy Systems. This subsection describes the fuzzy systems to be optimized through training in this study and their mathematical expressions. The Takagi-Sugeno-Kang (TSK) fuzzy model was originally proposed by Sugeno and Kang in an effort to formalize a systematic approach to generating fuzzy rules from an input-output data set [2]. A TSK fuzzy system can be of zero order or first order. The $i$th rule, denoted $R_i$, in a TSK fuzzy system is represented in the following form:
$$R_i:\ \text{If } x_1(k) \text{ is } A_{i1} \text{ and } \cdots \text{ and } x_n(k) \text{ is } A_{in}, \text{ then } y(k) \text{ is } f_i,$$
where $k$ is the time step, $x_1(k), \ldots, x_n(k)$ are the input variables, $y(k)$ is the output variable of the system, $A_{ij}$ is a fuzzy set, and $f_i$ is the consequent part function. The fuzzy set $A_{ij}$ uses a bell-shaped membership function given by
$$\mu_{A_{ij}}(x_j) = \frac{1}{1 + \left| \dfrac{x_j - c_{ij}}{a_{ij}} \right|^{2 b_{ij}}},$$
where $a_{ij}$, $b_{ij}$, and $c_{ij}$ form the parameter set of the fuzzy set $A_{ij}$. Each of these parameters has a physical meaning: $c_{ij}$ determines the center of the corresponding membership function, $a_{ij}$ is the half width, and $b_{ij}$ (together with $a_{ij}$) controls the slopes at the crossover points. In the inference engine, the fuzzy AND operation is implemented by the algebraic product. Thus, given an input vector $x = (x_1, \ldots, x_n)$, the firing strength $\phi_i(x)$ of rule $i$ is calculated by
$$\phi_i(x) = \prod_{j=1}^{n} \mu_{A_{ij}}(x_j).$$
For zero-order and first-order TSK-type fuzzy systems, the consequent function $f_i$ is set to a real value $p_{i0}$ and to a linear function of the input variables, $p_{i0} + \sum_{j=1}^{n} p_{ij} x_j$, for each $i$, respectively. If there are $r$ rules in a fuzzy system, the output of the system, which is calculated by the weighted-average defuzzification method, is given by
$$y = \frac{\sum_{i=1}^{r} \phi_i(x)\, f_i}{\sum_{i=1}^{r} \phi_i(x)}.$$
In this work, the proposed improved QPSO algorithm and the other optimization algorithms are used to optimize the free parameters $a_{ij}$, $b_{ij}$, and $c_{ij}$ in each rule.
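The inference described in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bell-shaped membership function is taken in its usual generalized form $1/(1 + |(x - c)/a|^{2b})$, and the function names, the data layout, and the two-rule example values are hypothetical rather than taken from the paper.

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|^(2b)) (assumed form)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def tsk_output(x, premise, consequent):
    """First-order TSK output by weighted-average defuzzification.

    x          : list of n input values
    premise    : premise[i][j] = (a, b, c) of the fuzzy set for rule i, input j
    consequent : consequent[i] = [p0, p1, ..., pn] of rule i
    """
    strengths, values = [], []
    for rule_mfs, coeffs in zip(premise, consequent):
        phi = 1.0
        for xj, (a, b, c) in zip(x, rule_mfs):
            phi *= bell_mf(xj, a, b, c)  # fuzzy AND as algebraic product
        strengths.append(phi)
        values.append(coeffs[0] + sum(p * xj for p, xj in zip(coeffs[1:], x)))
    total = sum(strengths)
    return sum(phi * f for phi, f in zip(strengths, values)) / total


# Hypothetical one-input system with two rules centered at -1 and +1.
premise = [[(1.0, 2.0, -1.0)], [(1.0, 2.0, 1.0)]]
consequent = [[0.0, 1.0], [2.0, -1.0]]
y_mid = tsk_output([0.0], premise, consequent)
```

At the midpoint between the two centers both rules fire with equal strength, so the output is the plain average of the two consequent values.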

The Architecture of ANFIS.
This subsection presents the architecture of the ANFIS network. A detailed coverage of ANFIS can be found in [7]. The ANFIS network is a neurofuzzy network that was proposed by Jang in 1993 [6]. Since the ANFIS is an adaptive network, some of its nodes are adaptive, which means that their outputs depend on the parameters belonging to these nodes. Two kinds of learning algorithms have been proposed to tune these parameters so as to optimize the approximation performance during the training period. For simplicity, the system is assumed to have two inputs and one output, and its rule base contains two fuzzy if-then rules of the TSK fuzzy model. The two rules of a typical TSK fuzzy model can be stated as
$$\text{Rule 1: If } x \text{ is } A_1 \text{ and } y \text{ is } B_1, \text{ then } f_1 = p_1 x + q_1 y + r_1,$$
$$\text{Rule 2: If } x \text{ is } A_2 \text{ and } y \text{ is } B_2, \text{ then } f_2 = p_2 x + q_2 y + r_2,$$
where $x$ and $y$ are the inputs of the ANFIS, $A_i$ and $B_i$ are the fuzzy sets, and $f_i$ ($i = 1, 2$) is a first-order polynomial that represents the output of the first-order TSK fuzzy inference system. In the above rules, $p_i$, $q_i$, and $r_i$ ($i = 1, 2$) form the parameter set, referred to as the consequent parameters.
The architecture of ANFIS is shown in Figure 1, and the node function in each layer is described below.
Layer 1. This layer is the layer of membership functions. It contains adaptive nodes with node functions described as
$$O_{1,i} = \mu_{A_i}(x),\quad i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(y),\quad i = 3, 4,$$
where $x$ and $y$ are the input nodes, $A_i$ and $B_i$ are the linguistic labels, and $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$ are the membership functions, which usually adopt a bell shape with maximum and minimum values equal to 1 and 0, respectively:
$$\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}},$$
where $a_i$, $b_i$, and $c_i$ form the parameter set. When the values of these parameters change, the bell-shaped functions vary accordingly, exhibiting various forms of membership functions on the linguistic labels $A_i$ and $B_i$. In fact, any continuous and piecewise differentiable functions, such as the commonly used trapezoidal or triangular membership functions, are also qualified candidates for node functions in this layer.
Parameters in this layer are referred to as premise parameters.
Layer 2. Every node in this layer is a fixed node, marked by a circle and labeled $\Pi$, whose output is the product of its incoming signals:
$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2.$$
The output $w_i$ represents the firing strength of a rule. In fact, other T-norm operators that perform the generalized AND can be used as the node function in this layer.
Layer 3. Every node in this layer is a fixed node, marked by a circle and labeled $N$, whose node function normalizes the firing strength by calculating the ratio of the $i$th rule's firing strength to the sum of all rules' firing strengths:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2.$$
For convenience, the outputs of this layer are referred to as normalized firing strengths.
Layer 4. Every node in this layer is an adaptive node, marked by a square, with node function given by
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i), \quad i = 1, 2,$$
where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set. The parameters in this layer are referred to as consequent parameters.
Layer 5. The single node in this layer is a fixed node that sums all incoming signals, so the overall output can be expressed as a linear combination of the consequent parameters:
$$O_5 = \sum_{i} \bar{w}_i f_i = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + \bar{w}_1 r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + \bar{w}_2 r_2.$$
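The five layers can be traced with a short Python sketch of the two-input, two-rule network. It is a minimal illustration, not the authors' implementation: the bell function is assumed to have the generalized form $1/(1 + |(x - c)/a|^{2b})$, and all names and parameter values are hypothetical. Besides the output, the sketch returns the row of regressors that multiplies the consequent parameters, which makes the linearity of the output in $\{p_i, q_i, r_i\}$ explicit.

```python
def bell(x, a, b, c):
    """Generalized bell membership function (assumed form)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def anfis_forward(x, y, prem_a, prem_b, conseq):
    """Forward pass of a two-input ANFIS.

    prem_a[i], prem_b[i] : (a, b, c) of the fuzzy sets A_i and B_i
    conseq[i]            : (p, q, r) of rule i
    """
    # Layer 1: membership grades of the two inputs
    mu_a = [bell(x, *prm) for prm in prem_a]
    mu_b = [bell(y, *prm) for prm in prem_b]
    # Layer 2: firing strengths (product T-norm)
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents
    layer4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, conseq)]
    # Layer 5: overall output
    out = sum(layer4)
    # Regressors multiplying (p_1, q_1, r_1, p_2, q_2, r_2)
    row = []
    for wb in w_bar:
        row.extend([wb * x, wb * y, wb])
    return out, w_bar, row


# Hypothetical parameter values for illustration only.
prem_a = [(2.0, 2.0, -1.0), (2.0, 2.0, 1.0)]
prem_b = [(2.0, 2.0, -1.0), (2.0, 2.0, 1.0)]
conseq = [(1.0, 0.5, 0.0), (-1.0, 2.0, 1.0)]
out, w_bar, row = anfis_forward(0.3, -0.7, prem_a, prem_b, conseq)
theta = [v for rule in conseq for v in rule]
```

The dot product of `row` and `theta` reproduces `out`, which is the property the least squares step of the hybrid learning procedure relies on.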

Hybrid Learning Algorithm for ANFIS Model.
It can be seen that there are two modifiable parameter sets, $\{a_i, b_i, c_i\}$, labeled as premise parameters, and $\{p_i, q_i, r_i\}$, labeled as consequent parameters. The aim of the training procedure for this architecture is to tune these two parameter sets so that the ANFIS output fits the training data. Each epoch of the hybrid learning procedure is composed of two passes: a forward pass and a backward pass. In the forward pass, the premise parameters are fixed, and least squares estimation (LSE) is applied to identify the consequent parameters. Once the optimal consequent parameters are found, the backward pass starts with the consequent parameters fixed: the error rate of the output node is propagated back from the output end toward the input end, and the premise parameters are updated by the gradient descent (GD) method. Methods that update the premise parameters by GD or Kalman filtering appear to be prone to being trapped in local optima. In this paper, we propose a QPSO with adaptive dynamical contraction-expansion coefficient (QPSO-ADCEC) and employ this algorithm to train the parameters of ANFIS for the purpose of obtaining the globally optimal solution.
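The forward pass can be written out as a small worked derivation for the two-rule network. The batch form below is a standard way of stating the least squares step and is given here as an illustration; the symbols $K$, $d_k$, $a_k$, $A$, and $\theta$ are introduced for this sketch and are not taken from the paper, and the closed form assumes that $A^{\mathsf T} A$ is invertible (otherwise a pseudo-inverse would be used).

```latex
% K training samples (x_k, y_k) with desired outputs d_k; premise parameters held fixed
\begin{aligned}
f_k &= \bar{w}_{1,k}\,(p_1 x_k + q_1 y_k + r_1) + \bar{w}_{2,k}\,(p_2 x_k + q_2 y_k + r_2)
     = a_k^{\mathsf T}\theta, \\
a_k &= \bigl[\bar{w}_{1,k} x_k,\ \bar{w}_{1,k} y_k,\ \bar{w}_{1,k},\
              \bar{w}_{2,k} x_k,\ \bar{w}_{2,k} y_k,\ \bar{w}_{2,k}\bigr]^{\mathsf T}, \qquad
\theta = [p_1,\ q_1,\ r_1,\ p_2,\ q_2,\ r_2]^{\mathsf T}, \\
A &= [a_1,\ a_2,\ \ldots,\ a_K]^{\mathsf T}, \qquad d = [d_1,\ d_2,\ \ldots,\ d_K]^{\mathsf T}, \\
\theta^{*} &= \arg\min_{\theta}\ \lVert A\theta - d \rVert^{2}
            = (A^{\mathsf T} A)^{-1} A^{\mathsf T} d .
\end{aligned}
```

Because the premise parameters enter only through the normalized firing strengths $\bar{w}_{i,k}$, fixing them turns the identification of the consequent parameters into an ordinary linear regression.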

Quantum-Behaved Particle Swarm Optimization
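In the PSO with $M$ individuals, each individual is treated as a volume-less particle in the $N$-dimensional space, with the current position vector and velocity vector of particle $i$ at the $n$th iteration represented as $X_{i,n} = (X_{i,n}^1, X_{i,n}^2, \ldots, X_{i,n}^N)$ and $V_{i,n} = (V_{i,n}^1, V_{i,n}^2, \ldots, V_{i,n}^N)$ [28, 29]. The particle moves according to the following equations:
$$V_{i,n+1}^j = w V_{i,n}^j + c_1 r_{i,n}^j \left(P_{i,n}^j - X_{i,n}^j\right) + c_2 R_{i,n}^j \left(G_n^j - X_{i,n}^j\right), \tag{10}$$
$$X_{i,n+1}^j = X_{i,n}^j + V_{i,n+1}^j, \tag{11}$$
for $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$, where $c_1$ and $c_2$ are known as acceleration coefficients, $w$ is the inertia weight, which can be adjusted to balance the explorative search and the exploitive search of the particle, and $r_{i,n}^j$ and $R_{i,n}^j$ are random numbers uniformly distributed over (0, 1). Vector $P_{i,n} = (P_{i,n}^1, P_{i,n}^2, \ldots, P_{i,n}^N)$ is the best previous position (the position giving the best objective function value or fitness value) of particle $i$, called the personal best (pbest) position, and vector $G_n = (G_n^1, G_n^2, \ldots, G_n^N)$ is the position of the best particle among all the particles in the population, called the global best (gbest) position. Without loss of generality, we consider the following minimization problem:
$$\min f(X), \quad \text{s.t. } X \in S \subseteq \mathbb{R}^N, \tag{12}$$
where $f(X)$ is an objective function continuous almost everywhere and $S$ is the feasible space. Accordingly, $P_{i,n}$ can be updated by
$$P_{i,n} = \begin{cases} X_{i,n}, & \text{if } f(X_{i,n}) < f(P_{i,n-1}), \\ P_{i,n-1}, & \text{otherwise}, \end{cases} \tag{13}$$
and $G_n$ is the personal best position with the smallest objective function value. Trajectory analysis in [30] showed that convergence of the PSO algorithm may be achieved if each particle converges to its local attractor, $p_{i,n} = (p_{i,n}^1, p_{i,n}^2, \ldots, p_{i,n}^N)$, defined at the coordinates
$$p_{i,n}^j = \frac{c_1 r_{i,n}^j P_{i,n}^j + c_2 R_{i,n}^j G_n^j}{c_1 r_{i,n}^j + c_2 R_{i,n}^j}, \tag{14}$$
or
$$p_{i,n}^j = \varphi_{i,n}^j P_{i,n}^j + \left(1 - \varphi_{i,n}^j\right) G_n^j, \tag{15}$$
where $\varphi_{i,n}^j = c_1 r_{i,n}^j / (c_1 r_{i,n}^j + c_2 R_{i,n}^j)$, with $r_{i,n}^j$ and $R_{i,n}^j$ being the random numbers in (10). In PSO, the acceleration coefficients $c_1$ and $c_2$ are generally set to be equal, that is, $c_1 = c_2$, and $\varphi_{i,n}^j$ is then commonly treated as a sequence of random numbers uniformly distributed over (0, 1). As a result, (15) can be restated as
$$p_{i,n}^j = \varphi_{i,n}^j P_{i,n}^j + \left(1 - \varphi_{i,n}^j\right) G_n^j, \quad \varphi_{i,n}^j \sim U(0, 1). \tag{16}$$
In QPSO, each single particle is treated as a spin-less one moving in quantum space. Thus the state of the particle is characterized by a wave function $\psi$, where $|\psi|^2$ is the probability density function of its position. Inspired by the convergence analysis of the particle in PSO [30], it is assumed that, at the $n$th iteration, particle $i$ flies in the $N$-dimensional space with a $\delta$ potential well centered at $p_{i,n}^j$ on the $j$th dimension ($1 \le j \le N$). The wave function at iteration $n + 1$ is then
$$\psi\left(X_{i,n+1}^j\right) = \frac{1}{\sqrt{L_{i,n}^j}} \exp\left(-\frac{\left|X_{i,n+1}^j - p_{i,n}^j\right|}{L_{i,n}^j}\right), \tag{17}$$
where $L_{i,n}^j$ is the characteristic length of the wave function. The probability density function is therefore
$$Q\left(X_{i,n+1}^j\right) = \left|\psi\left(X_{i,n+1}^j\right)\right|^2 = \frac{1}{L_{i,n}^j} \exp\left(-\frac{2\left|X_{i,n+1}^j - p_{i,n}^j\right|}{L_{i,n}^j}\right), \tag{18}$$
and thus the probability distribution function is
$$F\left(X_{i,n+1}^j\right) = 1 - \exp\left(-\frac{2\left|X_{i,n+1}^j - p_{i,n}^j\right|}{L_{i,n}^j}\right). \tag{19}$$
Using the Monte Carlo method, we can measure the $j$th component of the position of particle $i$ at the $(n + 1)$th iteration by
$$X_{i,n+1}^j = p_{i,n}^j \pm \frac{L_{i,n}^j}{2} \ln\left(\frac{1}{u_{i,n+1}^j}\right), \quad u_{i,n+1}^j \sim U(0, 1), \tag{20}$$
where $u_{i,n+1}^j$ is a sequence of random numbers uniformly distributed over (0, 1). The value of $L_{i,n}^j$ is determined by
$$L_{i,n}^j = 2\alpha \left|C_n^j - X_{i,n}^j\right|, \tag{21}$$
where $C_n = (C_n^1, C_n^2, \ldots, C_n^N)$ is called the mean best (mbest) position, defined as the average of the pbest positions of all particles; namely,
$$C_n^j = \frac{1}{M} \sum_{i=1}^{M} P_{i,n}^j. \tag{22}$$
Thus the position of the particle is updated according to the following equation:
$$X_{i,n+1}^j = p_{i,n}^j \pm \alpha \left|C_n^j - X_{i,n}^j\right| \ln\left(\frac{1}{u_{i,n+1}^j}\right). \tag{23}$$
The parameter $\alpha$ in (21) and (23) is called the contraction-expansion (CE) coefficient, which can be adjusted to balance the local search and the global search of the algorithm during the optimization process. The current position of the particle in QPSO is thus updated according to (16) and (23).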
The QPSO algorithm starts with the initialization of the particles' current positions and their pbest positions (setting $P_{i,0} = X_{i,0}$), followed by the iterative update of the particle swarm. At each iteration, the mbest position of the particle swarm is computed and the current position of each particle is updated according to (16) and (23). Before each particle updates its current position, its fitness value is evaluated, and then its pbest position and the current gbest position are updated. In (23), the probability of using either operation "+" or operation "−" is equal to 0.5. The search procedure continues until the termination condition is met.
We outline the procedure of the QPSO algorithm as follows.

Procedure of the QPSO
Step 1. Initialize the population; that is, initialize the current position and personal best position of each particle.
Step 2. Execute the following steps.
Step 3. Compute the mean best position $C_n$ of the swarm according to (22).
Step 4. Properly select the value of $\alpha$.
Step 5. For each particle in the population, execute from Step 6 to Step 8.
Step 6. Evaluate the objective function value of the current position of the particle, that is, $f(X_{i,n})$.
Step 7. Update the personal best position $P_{i,n}$ of the particle and the global best position $G_n$.
Step 8. Update each component of the particle's position according to (16) and (23).
Step 9. While the termination condition is not met, return to Step 2.
Step 10. Output the results.
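The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the local attractor is drawn with a uniform random weight between the pbest and gbest positions, the CE coefficient decreases linearly from 1.0 to 0.5 (the setting reported for plain QPSO in the simulation section), positions are not clamped to the search range, and the function name, defaults, and the sphere test function are hypothetical.

```python
import math
import random


def qpso(f, dim, lower, upper, swarm_size=20, max_iter=200,
         alpha_start=1.0, alpha_end=0.5, seed=0):
    """Minimize f with a basic QPSO; returns (gbest position, gbest value)."""
    rng = random.Random(seed)
    # Step 1: initialize current and personal best positions
    X = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(swarm_size)]
    P = [x[:] for x in X]
    fP = [f(p) for p in P]
    g = min(range(swarm_size), key=lambda i: fP[i])
    G, fG = P[g][:], fP[g]
    for n in range(max_iter):
        # Step 3: mean best position, the average of all pbest positions
        C = [sum(P[i][j] for i in range(swarm_size)) / swarm_size for j in range(dim)]
        # Step 4: linearly decreasing CE coefficient (assumed schedule)
        alpha = alpha_start + (alpha_end - alpha_start) * n / max(1, max_iter - 1)
        for i in range(swarm_size):
            # Steps 6 and 7: evaluate, then update pbest and gbest
            fx = f(X[i])
            if fx < fP[i]:
                P[i], fP[i] = X[i][:], fx
                if fx < fG:
                    G, fG = X[i][:], fx
            # Step 8: sample the new position around the local attractor
            for j in range(dim):
                phi = rng.random()
                p = phi * P[i][j] + (1.0 - phi) * G[j]
                u = 1.0 - rng.random()  # lies in (0, 1], so the log is finite
                step = alpha * abs(C[j] - X[i][j]) * math.log(1.0 / u)
                X[i][j] = p + step if rng.random() < 0.5 else p - step
    return G, fG


def sphere(x):
    """Simple convex test function with minimum 0 at the origin."""
    return sum(v * v for v in x)


best, best_val = qpso(sphere, dim=2, lower=-5.0, upper=5.0)
```

Note that no velocity vector is kept: each new position is sampled directly, with the spread of the sample governed by the distance between the particle and the mean best position.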

The QPSO with Adaptively Dynamical CE Coefficient (QPSO-ADCEC)
The CE coefficient is the most influential algorithmic parameter for the search performance of the QPSO. In [19], the influence of the CE coefficient on the particles' dynamical behaviour and on the algorithmic performance was analyzed theoretically and empirically, and two control methods for the CE coefficient, the linearly decreasing and fixed value methods, were investigated in depth on a suite of well-known benchmark functions. Here, we propose a novel control method for this parameter and thus a modified QPSO. The motivation is to control the CE coefficient in an adaptive and dynamical way according to the evolution process and the diversity of the particle swarm during the search. It is known from experience that the convergence of the QPSO depends on the evolution speed of the fitness values of the particles. Therefore, in order to handle complex and nonlinear optimization processes, it is helpful to include this factor in the design of the CE coefficient. In this paper, we propose a dynamically varying CE coefficient for the QPSO involving two factors, namely, the evolution speed factor and the aggregation degree factor. The evolution speed factor of the $i$th particle at the $n$th iteration is given by
$$h_{i,n} = \frac{\min\left(f(P_{i,n-1}), f(P_{i,n})\right)}{\max\left(f(P_{i,n-1}), f(P_{i,n})\right)},$$
where $f(P_{i,n})$ is the fitness value of the personal best position of particle $i$ at iteration $n$ and $n$ is the current iteration number. Note that the evolution speed factor lies in (0, 1] when the fitness values are positive, as is the case for the RMSE used in this paper. This factor reflects the run time history of the algorithm and the evolution speed of the particle. For the minimization problem described in (12), $f(P_{i,n}) \le f(P_{i,n-1})$ for each $n > 0$.
Thus $h_{i,n}$ can be simply expressed as $h_{i,n} = f(P_{i,n}) / f(P_{i,n-1})$. The smaller the value of $h_{i,n}$, the faster the decrease of $f(P_{i,n})$ and the faster the evolution speed of particle $i$. After a certain number of iterations, $h_{i,n}$ attains its maximum value of 1, indicating that $f(P_{i,n})$ no longer improves and that the algorithm has stagnated or found the optimal solution. The aggregation degree factor $s_{i,n}$ of particle $i$ at the $n$th iteration is defined by
$$s_{i,n} = \frac{\min\left(f(B_n), \bar{f}_n\right)}{\max\left(f(B_n), \bar{f}_n\right)},$$
where $\bar{f}_n$ is the average fitness value of all the particles at the $n$th iteration, that is,
$$\bar{f}_n = \frac{1}{M} \sum_{i=1}^{M} f(X_{i,n}),$$
where $M$ is the swarm size, and $f(B_n)$ represents the optimal value that the swarm found in the $n$th iteration, with $B_n$ denoting the best current position in that iteration. Note that $f(B_n)$ cannot be replaced by $f(G_n)$, since $f(G_n)$ denotes the optimal value that the entire swarm has found up to the $n$th iteration. The value of $s_{i,n}$ falls in (0, 1]. The aggregation degree factor reflects not only the aggregation degree but also the diversity of the swarm in terms of the fitness values of the current positions of the particles. The larger the value of $s_{i,n}$, the smaller the difference between $f(B_n)$ and $\bar{f}_n$, and thus the greater the aggregation or the smaller the swarm diversity.
In the case when all the $s_{i,n}$ are equal to 1, all the particles are identical to each other in terms of the fitness values of their current positions. The new CE coefficient for each particle can be written as
$$\alpha_{i,n} = \alpha_{\text{initial}} - (1 - h_{i,n})\,\beta + s_{i,n}\,\gamma,$$
where $\alpha_{\text{initial}}$ is the initial value of the CE coefficient and the weights $\beta$ and $\gamma$ typically lie in the range [0, 1]. The value of the CE coefficient therefore lies between $\alpha_{\text{initial}} - \beta$ and $\alpha_{\text{initial}} + \gamma$. With this adaptively dynamical CE coefficient, the modified QPSO algorithm, called QPSO with adaptively dynamical CE coefficient (QPSO-ADCEC), has a stronger ability to escape from local optima than the original QPSO with a linearly decreasing or fixed CE coefficient.
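The two factors and the resulting coefficient can be computed as in the following Python sketch. It rests on assumptions that should be read with care: the CE coefficient is taken to have the form $\alpha_{i,n} = \alpha_{\text{initial}} - (1 - h_{i,n})\beta + s_{i,n}\gamma$, which is how the extraction-damaged equation is read here; the weight names `beta` and `gamma`, the function names, and the example fitness values are hypothetical; and fitness values are assumed to be non-negative, as with an RMSE.

```python
def ratio_min_max(u, v):
    """min(u, v) / max(u, v) for non-negative values; 1.0 when both are zero."""
    lo, hi = min(u, v), max(u, v)
    return 1.0 if hi == 0.0 else lo / hi


def evolution_speed(f_pbest_prev, f_pbest_now):
    """Evolution speed factor h from two consecutive pbest fitness values."""
    return ratio_min_max(f_pbest_prev, f_pbest_now)


def aggregation_degree(f_current):
    """Aggregation degree factor s from the current fitness values of the swarm."""
    f_best = min(f_current)                  # best value found in this iteration
    f_mean = sum(f_current) / len(f_current)
    return ratio_min_max(f_best, f_mean)


def adaptive_ce(alpha_initial, h, s, beta, gamma):
    """Assumed form of the adaptive CE coefficient."""
    return alpha_initial - (1.0 - h) * beta + s * gamma


# Hypothetical values: pbest improved from 0.5 to 0.4; three particles in the swarm.
h = evolution_speed(0.5, 0.4)
s = aggregation_degree([0.4, 0.6, 0.8])
alpha = adaptive_ce(0.5, h, s, beta=0.6, gamma=0.4)
```

A fast-improving particle (small `h`) receives a smaller coefficient and searches more locally, while a tightly aggregated swarm (large `s`) pushes the coefficient up and widens the search.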

Learning ANFIS Network by QPSO-ADCEC
This section presents how to employ the QPSO-ADCEC algorithm for learning the ANFIS parameters. The ANFIS has two types of parameters that need to be trained, that is, the premise parameters $\{a_i, b_i, c_i\}$ and the consequent parameters $\{p_i, q_i, r_i\}$. The premise parameters are estimated by the QPSO-ADCEC algorithm, and the consequent parameters are identified by least squares estimation (LSE). Thus, the position of each particle in the QPSO-ADCEC represents a set of premise parameters. The fitness is defined as the root mean squared error (RMSE) between the actual output and the desired output, which can be expressed by
$$\text{RMSE} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left(d_k - \hat{y}_k\right)^2},$$
where $K$ is the number of training samples, $d_k$ is the desired output, and $\hat{y}_k$ is the actual output of the ANFIS for the $k$th sample. The coupling of the ANFIS with the QPSO-ADCEC algorithm is outlined below.
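Step 1. Initialize the swarm of particles such that the positions of the particles are uniformly distributed within the search scope, and set the iteration number $n = 0$.
Step 2. Set the position of each particle $X_{i,n}$ as the premise parameters $\{a_i, b_i, c_i\}$, and identify the consequent parameters $\{p_i, q_i, r_i\}$ with LSE. Then calculate the fitness value of each particle. For $n = 0$, set each particle's personal best position as $P_{i,0} = X_{i,0}$; for $n > 0$, update the personal best and global best positions.
Step 3. Find the mean value $C_n$ of all particles' personal best positions by using (22).
Step 4. For each particle in the population, execute from Step 5 to Step 8.
Step 5. Calculate the evolution speed factor $h_{i,n}$ of the particle.
Step 6. Calculate the aggregation degree factor $s_{i,n}$.
Step 7. Calculate the CE coefficient $\alpha_{i,n}$ of the particle from $h_{i,n}$ and $s_{i,n}$.
Step 8. Update the position of each particle $X_{i,n}$ by using (16) and (23), where $\alpha$ is replaced by $\alpha_{i,n}$.
Step 9. If the termination condition is met, exit; otherwise set $n = n + 1$ and go to Step 2.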
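The fitness evaluation in Step 2 can be sketched for a single-input, first-order TSK system with Gaussian membership functions, the configuration used in Example 1. Everything specific in the sketch is an assumption made for illustration: the particle layout `[c_1, sigma_1, c_2, sigma_2, ...]`, the Gaussian form `exp(-((x - c) / sigma)^2)`, the solution of the least squares problem through the normal equations with a tiny ridge term for numerical stability, and all function names. The paper itself states only that LSE is used.

```python
import math


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def fitness(particle, xs, ds, ridge=1e-8):
    """RMSE of the system whose premise parameters are given by `particle`.

    particle : [c_1, sigma_1, c_2, sigma_2, ...] (assumed layout, sigma != 0)
    xs, ds   : training inputs and desired outputs
    Returns (rmse, consequent parameters [p_1, r_1, p_2, r_2, ...]).
    """
    rules = [(particle[k], particle[k + 1]) for k in range(0, len(particle), 2)]
    rows = []
    for x in xs:
        w = [math.exp(-((x - c) / sigma) ** 2) for c, sigma in rules]
        total = sum(w) or 1.0  # guard against all strengths underflowing to 0
        row = []
        for wi in w:
            row.extend([wi / total * x, wi / total])
        rows.append(row)
    m = len(rows[0])
    # Normal equations (A^T A + ridge I) theta = A^T d
    AtA = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    Atd = [sum(r[i] * d for r, d in zip(rows, ds)) for i in range(m)]
    theta = solve_linear(AtA, Atd)
    sq = sum((sum(a * t for a, t in zip(r, theta)) - d) ** 2
             for r, d in zip(rows, ds))
    return math.sqrt(sq / len(ds)), theta


# Hypothetical check: a linear target is representable exactly by any such system.
xs = [-1.0 + 0.1 * k for k in range(21)]
ds = [2.0 * x + 1.0 for x in xs]
rmse, theta = fitness([-0.5, 0.5, 0.5, 0.5], xs, ds)
```

In an optimizer loop, `fitness(particle, xs, ds)[0]` would serve as the objective value of a particle, so the swarm searches only over the premise parameters while the consequent parameters are refitted at every evaluation.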

Simulation Results
To investigate the efficiency of the proposed method, five example systems were tested. The first two examples are nonlinear system identification problems. For the third example, ANFIS-QPSO-ADCEC was used to identify a nonlinear component in a discrete control system. In the fourth example, we used the proposed method to predict a chaotic time series. The fifth example uses the proposed ANFIS-QPSO-ADCEC method to predict the carbon dioxide concentration in the Box-Jenkins furnace. For all the examples, besides the ANFIS-QPSO-ADCEC method, the ANFIS model trained with the BP algorithm (ANFIS-BP), ANFIS with PSO (ANFIS-PSO), and ANFIS with QPSO (ANFIS-QPSO) were also tested and compared with the proposed method. For the PSO, the inertia weight was set to 0.73, and the acceleration coefficients were set as $c_1 = c_2 = 1.49$. For the QPSO algorithm, the CE coefficient was decreased linearly from 1.0 to 0.5 over the course of the search. For the QPSO-ADCEC, the initial CE coefficient was $\alpha_{\text{initial}} = 0.5$, and the values of $\beta$ and $\gamma$ were set to 0.6 and 0.4, respectively. These parameters were shown to result in good algorithmic performance in our preliminary experiments. For each of the PSO variants, the swarm size was set to 50, and the maximum number of iterations was 2000.
Example 1 (nonlinear SISO system modeling [1]). In this example, the nonlinear plant to be identified by a first-order TSK-type fuzzy system is described as
$$y = \sin(\pi x) + 0.8 \sin(3\pi x) + 0.2 \sin(5\pi x),$$
where $x$ is the only input. The fuzzy system had five rules, with the membership function of each rule being a Gaussian function. Each membership function had two parameters, so there were 10 premise parameters and 10 consequent parameters in the fuzzy system. Thus the dimensionality of the position of each particle is 10. We used 100 input data randomly generated between −1 and 1, 50 of which were used as training data and the remainder as testing data. Table 1 lists the training error and test error of each method, from which we can see that the training errors (trnRMSE) and test errors (testRMSE) of the ANFIS network trained by the QPSO-ADCEC, QPSO, and PSO are smaller than those of the ANFIS trained by the BP algorithm. Among all the PSO variants, the QPSO-ADCEC trained the ANFIS model with the smallest training and test errors. Figure 2 visualizes the results of using the ANFIS-QPSO-ADCEC for identification of the system and shows that the ANFIS model learned by the QPSO-ADCEC fits the nonlinear system very well.
Equation (30) is a three-input nonlinear function. Each variable had two membership functions, so that the fuzzy system had eight fuzzy rules [1]. Each membership function was Gaussian with 2 parameters, and the fuzzy system had 12 premise parameters and 32 consequent parameters. Thus, the dimensionality of the position of each particle in QPSO-ADCEC was 12. From the grid points of the range within the input space of the above equation, 500 data pairs were obtained, of which 250 were used for training and the remaining 250 for testing. The results for this example are presented in Table 2, where the training and test errors show that the proposed ANFIS method outperformed the other methods on this example system and that ANFIS-BP performed worst among all the methods. Figure 3 also shows that the ANFIS model learned by the QPSO-ADCEC fitted the system fairly well.
We chose $y(k)$ and $u(k)$ as the inputs and $y(k + 1)$ as the output. The input to the plant and the model was a sinusoid $u(k) = \sin(2\pi k/250)$, and the adaptation started at $k = 1$ and stopped at $k = 250$. As shown in Figure 4, the output of the model follows the output of the plant almost immediately, even after the adaptation stopped at $k = 250$ and after $u(k)$ was changed to $u(k) = 0.5 \sin(2\pi k/250) + 0.5 \sin(2\pi k/25)$ at $k = 500$. Figure 4 also illustrates the results of using the ANFIS-QPSO-ADCEC for identification. The training error and test error provided in Table 3 show the superiority of the ANFIS-QPSO-ADCEC over its competitors.
Example 4 (prediction of future values of a chaotic time series [31]). The simulation results for Examples 1 to 3 show that the ANFIS-QPSO-ADCEC can be used to model highly nonlinear functions effectively. In this example, we demonstrate how the proposed ANFIS-QPSO-ADCEC can be employed to predict future values of a chaotic time series. The time series used in our simulation is generated by the chaotic Mackey-Glass differential delay equation [22] defined by
$$\frac{dx(t)}{dt} = \frac{0.2\, x(t - \tau)}{1 + x^{10}(t - \tau)} - 0.1\, x(t).$$
The prediction of future values of this time series is a benchmark problem which has been considered by a number of connectionist researchers (Lapedes and Farber [32], Moody [33], Jones et al. [34], Crowder [35], and Sanger [36]).
The initial condition $x(0)$ and the delay $\tau$ are set to 1.2 and 17, respectively. We use four past values for this prediction, and the data pairs for the fuzzy system have the form
$$[x(t - 18),\ x(t - 12),\ x(t - 6),\ x(t);\ x(t + 6)]. \tag{33}$$
With $t$ varying from 118 to 1117, we generated 1000 data pairs and used the first 500 data pairs for training and the remaining 500 data pairs for prediction. Table 4 provides the training error and test error of each method, showing again that the proposed QPSO-ADCEC algorithm performed the best in training the ANFIS model for this example. Figure 5 shows the errors between the actual output and the desired output when using ANFIS-QPSO-ADCEC.
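The construction of the data pairs in (33) can be sketched as follows. The sketch assumes the commonly used form of the Mackey-Glass equation, $dx/dt = 0.2\,x(t - \tau)/(1 + x^{10}(t - \tau)) - 0.1\,x(t)$, with $x(0) = 1.2$ and $\tau = 17$ as stated above. The integration scheme (forward Euler with step 0.1) and the zero history for $t < 0$ are assumptions made for this illustration and are not stated in the paper; the function and variable names are hypothetical.

```python
def mackey_glass(n_steps, tau=17.0, x0=1.2, dt=0.1):
    """Euler integration of the Mackey-Glass equation.

    Returns the samples x(0), x(1), ..., x(n_steps) at integer times.
    """
    per_unit = int(round(1.0 / dt))   # integration steps per time unit
    delay = int(round(tau / dt))      # delay expressed in integration steps
    xs = [x0]
    for k in range(n_steps * per_unit):
        x_tau = xs[k - delay] if k >= delay else 0.0  # assumed zero history
        x = xs[k]
        xs.append(x + dt * (0.2 * x_tau / (1.0 + x_tau ** 10) - 0.1 * x))
    return xs[::per_unit]


# The latest sample needed is x(1117 + 6) = x(1123).
series = mackey_glass(1123)
pairs = [
    ([series[t - 18], series[t - 12], series[t - 6], series[t]], series[t + 6])
    for t in range(118, 1118)
]
train, test = pairs[:500], pairs[500:]
```

Each element of `pairs` holds the four lagged inputs and the value six steps ahead, so `train` and `test` correspond to the first and last 500 data pairs described above.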
Example 5 (prediction of carbon dioxide in the Box-Jenkins furnace [33]). The Box-Jenkins furnace data set contains 296 pairs of input and output data, where the input $u(t)$ is the flow of methane and the output $y(t)$ is the concentration of carbon dioxide discharged from the furnace. In our experiment, we chose $u(t)$, $u(t - 1)$, $u(t - 2)$, $y(t - 1)$, and $y(t - 2)$ to be the input data and $y(t)$ to be the output of the system [37], thus obtaining 294 pairs of input and output data. The first 244 pairs were used to train the ANFIS model and the last 50 pairs were employed to test the trained system. Table 5 presents the training error and test error of each method, showing that the QPSO-ADCEC algorithm performed the best in training the ANFIS model and that the trained system had the smallest test error. Figure 6 plots the desired output and the actual output of the ANFIS model trained by the QPSO-ADCEC. It can be observed that the obtained system fitted the data very well.
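The lagged regressors for this example can be built as in the short sketch below. The variable names and the exact choice of lags, $u(t)$, $u(t - 1)$, $u(t - 2)$, $y(t - 1)$, $y(t - 2)$ as inputs and $y(t)$ as output, are a reading of the extraction-damaged text and should be treated as an assumption; the synthetic series only stand in for the real furnace measurements, which are not reproduced here.

```python
def lagged_pairs(u, y):
    """Build ([u(t), u(t-1), u(t-2), y(t-1), y(t-2)], y(t)) for t = 2, 3, ..."""
    pairs = []
    for t in range(2, len(y)):
        inputs = [u[t], u[t - 1], u[t - 2], y[t - 1], y[t - 2]]
        pairs.append((inputs, y[t]))
    return pairs


# Placeholder series of the same length as the data set (296 samples).
u = [0.01 * k for k in range(296)]
y = [50.0 + 0.02 * k for k in range(296)]
pairs = lagged_pairs(u, y)
train, test = pairs[:244], pairs[244:]
```

Two samples are lost to the lags, which is why 296 measurements yield 294 data pairs, split here into 244 training pairs and 50 test pairs.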

Conclusion
In this paper, we proposed a novel variant of the QPSO algorithm for training the parameters of ANFIS. This improved QPSO, called QPSO-ADCEC, employs an adaptively and dynamically varying CE coefficient, which gives the QPSO a better global search ability. The proposed QPSO-ADCEC is applied to the estimation of the premise parameters of the ANFIS model, and the LSE approach is used for optimizing the consequent parameters of the fuzzy system.
The effectiveness of the proposed ANFIS-QPSO-ADCEC method was verified by applying it to the identification of nonlinear systems and the prediction of a chaotic system. The simulation results show that the proposed ANFIS-QPSO-ADCEC method performed better than the ANFIS-PSO, ANFIS-QPSO, and the ANFIS trained with the gradient descent method, owing to the stronger global search ability of the QPSO-ADCEC algorithm.

Figure 5: (a) The actual output and desired output; (b) errors between actual output and desired output by using the ANFIS-QPSO-ADCEC for Example 4.
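Mathematical Problems in Engineering
In PSO, the current position vector and velocity vector of particle $i$ at the $n$th iteration are represented as $X_{i,n} = (X_{i,n}^1, X_{i,n}^2, \ldots, X_{i,n}^N)$ and $V_{i,n} = (V_{i,n}^1, V_{i,n}^2, \ldots, V_{i,n}^N)$ [28, 29]. The particle moves according to the following equations:
$$V_{i,n+1}^j = w V_{i,n}^j + c_1 r_{i,n}^j \left(P_{i,n}^j - X_{i,n}^j\right) + c_2 R_{i,n}^j \left(G_n^j - X_{i,n}^j\right),$$
$$X_{i,n+1}^j = X_{i,n}^j + V_{i,n+1}^j,$$
for $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$, where $c_1$ and $c_2$ are known as acceleration coefficients and $r_{i,n}^j$ and $R_{i,n}^j$ are random numbers uniformly distributed over (0, 1). The parameter $w$ is known as the inertia weight, which can be adjusted to balance the explorative search and the exploitive search of the particle. Vector $P_{i,n} = (P_{i,n}^1, P_{i,n}^2, \ldots, P_{i,n}^N)$ is the best previous position (the position giving the best objective function value or fitness value) of particle $i$ and is called the personal best (pbest) position, and vector $G_n = (G_n^1, G_n^2, \ldots, G_n^N)$ is the position of the best particle among all the particles in the population and is called the global best (gbest) position. Without loss of generality, a minimization problem is considered.
In the adaptive CE coefficient of the QPSO-ADCEC, $\alpha_{\text{initial}}$ is the initial value of the CE coefficient, and the weights $\beta$ and $\gamma$ of the evolution speed factor and the aggregation degree factor typically lie in the range [0, 1]. The value of the CE coefficient therefore lies between $\alpha_{\text{initial}} - \beta$ and $\alpha_{\text{initial}} + \gamma$. With this adaptively dynamical CE coefficient, the modified QPSO algorithm, called QPSO with adaptively dynamical CE coefficient (QPSO-ADCEC), has a stronger ability to escape from local optima than the original QPSO with a linearly decreasing or fixed CE coefficient.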