Extracting T–S Fuzzy Models Using the Cuckoo Search Algorithm

A recent metaheuristic, cuckoo search (CS), is used to extract and learn the Takagi–Sugeno (T–S) fuzzy model. In the proposed method, each particle (cuckoo) of CS encodes the rule structure, that is, the number of rules and which rules are selected, together with the antecedent and consequent parameters of the T–S fuzzy model. All of these are learned simultaneously. The optimized T–S fuzzy model is validated on three examples: a nonlinear plant modelling problem, the Box–Jenkins nonlinear system identification problem, and the identification of a nonlinear system. The results are compared with those reported for other methods. The proposed CS method gives an optimal T–S fuzzy model with fewer rules.


Introduction
To control a system, an accurate model of it is needed; in many cases, however, there is not enough information to derive an acceptable mathematical model, and modelling techniques based on input-output data are required.
Fuzzy models are used because they perform well in modelling nonlinear systems and are easy to implement. A fuzzy model is constructed from a rule base formed from the inputs and outputs of a system [1].
The Takagi-Sugeno (T-S) fuzzy model is a type of fuzzy model that gives a local linear representation of a nonlinear system. Such models can approximate a wide class of nonlinear systems and are considered powerful tools for modelling and controlling complex dynamic systems.
A T-S fuzzy model is most useful when it achieves high accuracy with a small number of rules, but the majority of works in the literature produce models with a large number of rules.
Many works on T-S fuzzy models, especially of the discrete-time type, have been reported in the literature, such as [2,3]. Optimizing a T-S fuzzy model means determining the structure and the parameters of the model. The methods used to tune the antecedent and consequent parameters include clustering algorithms [4] and linear least squares [5][6][7]. The design of a fuzzy model is a search problem in which each point of the search space represents a possible fuzzy model with a different structure and parameters [1,8]. To obtain optimal fuzzy models, many evolutionary algorithms (EAs) such as genetic algorithms (GAs) [9], genetic programming [10], evolutionary programming [11], evolution strategy [12], and differential evolution (DE) [13] have been used [14]. These methods tune the parameters while the structure is predefined.
However, the structure and the parameters of the model are linked and should be optimized simultaneously. Accordingly, the authors of [1] presented an optimization of the rule structure in which all the information is encoded into a chromosome.
Particle swarm optimization (PSO) is a metaheuristic algorithm used recently in many domains [15]. PSO has been used to elicit fuzzy models, as in [16], where PSO optimizes the structure, the number of membership functions, and the singleton consequent parameters. In [17], the results of PSO and GA are compared on the same simulation example for the method given in [16], with a fixed number of rules and membership functions.

Computational Intelligence and Neuroscience
The structure of the fuzzy model is identified using an online fuzzy clustering method and the parameters are optimized by PSO [20]. In [21], the fuzzy model is extracted using PSO with the recursive least-squares method. Ant colony optimization and PSO were used to obtain a T-S fuzzy model in [22]. Lin [23] used immune-based symbiotic PSO to obtain T-S models for skin color detection. In [24], Niu et al. used multiswarm PSO to tune the parameters of fuzzy systems. In [25], subtractive clustering is utilized to extract a set of fuzzy rules, and a variant of PSO called CPSO is used to find the optimal membership functions and the consequent parameters. In [19], CRPSO is employed to tune all parameters of the fuzzy models. In [26], GA is used for learning the T-S fuzzy model from data with a new encoding scheme.
DE and quantum-inspired differential evolution are utilized to learn the T-S fuzzy model in [27]. T-S fuzzy models are also developed for modelling industrial systems such as the moving grate biomass furnace in [28].
Another metaheuristic, hunting search (HUS), is used in [18] to determine the parameters of the T-S fuzzy model.
A recent metaheuristic algorithm named cuckoo search (CS) was proposed in 2009 by Yang and Deb. The CS algorithm has been reported to give better results than other metaheuristics such as PSO and GA. CS has been used to train neural networks [29] and in reliability optimization [30]. In [31], CS is used to tune the parameters of a Sugeno-type fuzzy logic controller for liquid level control. A fuzzy controller is optimized using CS in the case study of computer numerical control of a steam condenser in [32]. In [33], a prediction of the academic performance of students based on CS is proposed. In [34,35], the cuckoo search is used in the reduction of high order. In [36], a Hammerstein model trained by the cuckoo search algorithm is proposed to identify nonlinear systems. In [37], CS is used in structural damage identification.
The goal and motivation of this contribution is to obtain a T-S model with a minimum number of rules by using the CS method. The objective is a model of low complexity that can be easily implemented in embedded systems and that has small errors, which demonstrates its efficiency and precision. This paper describes the use of CS to obtain simultaneously the optimal structure, in terms of the number of rules, and the premise and consequent parameters of the T-S model, in order to exploit the advantages of CS, which has performed better than other metaheuristics on many examples in the literature. The extracted optimal T-S fuzzy models are compared with other methods on the same examples in terms of MSE and number of rules.
This paper is organized as follows. Section 2 describes the structure of the T-S fuzzy system. The CS algorithm is introduced in Section 3. Section 4 explains the encoding scheme used for the T-S model. Section 5 presents the different metaheuristic algorithms. Section 6 presents the application examples with results and discussions. Finally, conclusions are given in Section 7.

T-S Fuzzy Model
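The T-S fuzzy model presented in [38] is given by the following rule base.
Rule $i$: If $x_1$ is $A_{i1}$ and $\ldots$ and $x_n$ is $A_{in}$, then
$$y_i = a_{i0} + a_{i1} x_1 + \cdots + a_{in} x_n,$$
where $i = 1, \ldots, r$, $r$ is the number of rules, $x = [x_1, \ldots, x_n]$ represents the input, $n$ is the size of the input, $a_{i0}, a_{i1}, \ldots, a_{in}$ are the consequent parameters, $y_i$ is the $i$th fuzzy rule output, and $A_{ij}$ is a fuzzy subset. The output of the model $y$ is obtained as follows:
$$y = \frac{\sum_{i=1}^{r} w_i y_i}{\sum_{i=1}^{r} w_i},$$
where the firing strength $w_i$ of the $i$th rule is computed as
$$w_i = \prod_{j=1}^{n} \mu_{A_{ij}}(x_j),$$
with $\mu_{A_{ij}}(x_j)$ being the membership function's grade of $x_j$, characterized by a Gaussian function as
$$\mu_{A_{ij}}(x_j) = \exp\left(-\frac{(x_j - m_{ij})^2}{2\sigma_{ij}^2}\right),$$
where $m_{ij}$ and $\sigma_{ij}$ are, respectively, the mean and the deviation of the MF. The premise parameters $m_{ij}$ and $\sigma_{ij}$ are adjusted [20].

Cuckoo Search Algorithm
CS was developed by Yang and Deb in 2009 [39][40][41]. CS imitates the brood parasitism of cuckoos and uses Lévy Flights, which are better than random walks [32].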
CS is based on the behaviour of cuckoos that lay their eggs in the nests of host birds. Some cuckoos can mimic the properties of the host eggs. As a result, fewer of their eggs are abandoned and their reproductivity increases [42].
CS has been applied to many optimization problems. In [40,43], it is demonstrated that Lévy Flights perform better than simple random walks in the CS algorithm.
In the CS algorithm, each egg in a nest represents a solution, and a cuckoo egg represents a new solution. The goal is to use the new solutions (cuckoos) to replace the worst solutions in the nests. In the classical algorithm, each nest has one egg; however, the algorithm can be extended to more complex problems [40,43].
In the cuckoo search, the rules are as follows:
(i) Every cuckoo lays a single egg and dumps it in a randomly chosen nest.
(ii) The best nests or solutions are carried over to the next generations.
(iii) The number of host nests is predefined, and a host can detect a foreign egg with probability $p_a \in [0, 1]$. In that event, the host bird either throws the egg away or abandons the nest and builds a new nest in another place [43].
The CS algorithm is described in Pseudocode 1 [43].
In our work, $f(x)$ is the MSE and nests are possible T-S fuzzy models, with cuckoos being the parameters of the T-S fuzzy model. This algorithm uses a combination of a local random walk and a global random walk, controlled by the switching parameter $p_a$. The local random walk can be written as
$$x_i^{t+1} = x_i^t + \alpha s \otimes H(p_a - \epsilon) \otimes (x_j^t - x_k^t),$$
where $x_j^t$ and $x_k^t$ are two different solutions, $H(u)$ is a Heaviside function, $\epsilon$ is a random number drawn from a uniform distribution, and $s$ is the step size. The global random walk is carried out by using Lévy Flights:
$$x_i^{t+1} = x_i^t + \alpha L(s, \lambda),$$
where
$$L(s, \lambda) = \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad s \gg s_0 > 0.$$
Here, $\alpha > 0$ is the step size scaling factor, which should be related to the scales of the problem of interest [44]. In fact, we use the Lévy Flights to obtain other T-S fuzzy models in the next generation, each of which can be a solution.

Figure 1: The particle structure of encoding a fuzzy rule base.
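The two random walks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the heavy-tailed steps are drawn with Mantegna's algorithm, a common way to approximate Lévy Flights that the text does not prescribe, and the function names (`levy_steps`, `global_walk`, `local_walk`) and default values are hypothetical.

```python
import math
import numpy as np

def levy_steps(size, lam=1.5, rng=None):
    """Draw heavy-tailed steps with Mantegna's algorithm (an assumption here)."""
    rng = np.random.default_rng() if rng is None else rng
    # Standard deviation of the numerator Gaussian in Mantegna's ratio u / |v|^(1/lam).
    sigma_u = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
               / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / lam)

def global_walk(x, alpha=0.01, lam=1.5, rng=None):
    """Global random walk: x + alpha * (Levy-distributed step)."""
    return x + alpha * levy_steps(x.shape, lam, rng)

def local_walk(x_i, x_j, x_k, p_a=0.25, alpha=1.0, rng=None):
    """Local random walk: x_i + alpha * s * H(p_a - eps) * (x_j - x_k)."""
    rng = np.random.default_rng() if rng is None else rng
    s = rng.random(x_i.shape)                     # step size s, uniform in [0, 1)
    eps = rng.random(x_i.shape)                   # uniform random number eps
    heaviside = (p_a - eps > 0).astype(float)     # H(p_a - eps), elementwise
    return x_i + alpha * s * heaviside * (x_j - x_k)
```

With `p_a = 0` the Heaviside factor is always zero and the local walk leaves the solution unchanged, which is one quick way to check the switching behaviour.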

Encoding Scheme for T-S Fuzzy Model
In this paper, the fuzzy system is represented by a particle formed by the premise parameters, the consequent parameters, and the labels used to choose the rules that construct the fuzzy system [20]. The fuzzy model's particle is presented in Figure 1.
In Figure 1, each rule is formed by its premise parameters, its consequent parameters, and its label. Figure 2 shows that the particle is a vector composed of the premise parameters $m_{i1}, \sigma_{i1}, \ldots, m_{in}, \sigma_{in}$, the consequent parameters $a_{i0}, a_{i1}, \ldots, a_{in}$, and the label $\mathrm{Label}_i$ of all rules.
The fuzzy rules are selected using the labels: if $\mathrm{Label}_i > 0$, then rule $i$ is selected, where $i = 1, \ldots, N_{\max}$ is the index of the rule. The T-S fuzzy system is composed of the active rules.
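A minimal sketch of how such a particle could be decoded and evaluated is given below. The flat layout (per rule: mean and deviation for each input, then the consequent coefficients, then the label), the Gaussian written with $2\sigma^2$ in the denominator, the zero output for a model with no active rule, and the names `decode_particle` and `ts_output` are all assumptions made for illustration.

```python
import numpy as np

def decode_particle(particle, n_inputs, n_max):
    """Split a flat particle into the parameters of the active rules.

    Assumed layout per rule: [m_1, s_1, ..., m_n, s_n, a_0, ..., a_n, label].
    """
    rules = np.asarray(particle, dtype=float).reshape(n_max, 3 * n_inputs + 2)
    premise = rules[:, :2 * n_inputs].reshape(n_max, n_inputs, 2)
    means, sigmas = premise[:, :, 0], premise[:, :, 1]
    consequents = rules[:, 2 * n_inputs:-1]       # a_0, a_1, ..., a_n for each rule
    active = rules[:, -1] > 0                     # a rule is kept when its label is > 0
    return means[active], sigmas[active], consequents[active]

def ts_output(x, means, sigmas, consequents, eps=1e-12):
    """Weighted-average output of a T-S model with Gaussian membership functions."""
    x = np.asarray(x, dtype=float)
    if means.shape[0] == 0:
        return 0.0                                # assumption: no active rule gives 0
    mu = np.exp(-((x - means) ** 2) / (2.0 * sigmas ** 2 + eps))
    w = mu.prod(axis=1)                           # firing strength of each rule
    y_rules = consequents[:, 0] + consequents[:, 1:] @ x
    return float(w @ y_rules / (w.sum() + eps))

# Two candidate rules, two inputs; only the first rule has a positive label.
particle = [0, 1, 0, 1, 1, 2, 3, 1,
            1, 1, 1, 1, 5, 5, 5, -1]
m, s, a = decode_particle(particle, n_inputs=2, n_max=2)
y = ts_output([0.5, -0.5], m, s, a)
```

Because only one rule is active in this usage example, the normalized output reduces to that rule's linear consequent.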
The CS algorithm used to elicit the T-S fuzzy model proceeds as follows:
(1) Encode all the premise and consequent parameters of all rules, with a predefined maximum number of rules $N_{\max}$.
(2) Define a fitness function and the bounds of the parameters.
(3) Randomly generate an initial swarm of nests: initialize all the parameters of the particles representing fuzzy models within the chosen lower and upper bounds. Every nest is a fuzzy model with a different structure and parameters.
(4) Calculate the fitness of the initial nests, which is the MSE given by
$$\mathrm{MSE} = \frac{1}{N} \sum_{k=1}^{N} \left(y_{\mathrm{ref}}(k) - y(k)\right)^2,$$
where $N$ is the number of input prototypes, $y_{\mathrm{ref}}(k)$ is the desired output, and $y(k)$ is the model output.
(5) Use the CS to search for the optimal T-S fuzzy model:
Step 1. Generate a new fuzzy model (cuckoo) $\mathrm{FM}_1$ from a randomly chosen nest of the swarm (the randomly generated T-S fuzzy models) by using Lévy Flights, calculate its fitness, and select another fuzzy model (nest) $\mathrm{FM}_2$ randomly among the $n$ nests of the swarm.
Step 2. If $\mathrm{MSE}(\mathrm{FM}_1) < \mathrm{MSE}(\mathrm{FM}_2)$, replace $\mathrm{FM}_2$ by the new solution $\mathrm{FM}_1$; otherwise pass to the next step.
Step 3. A fraction $p_a$ of the worst fuzzy models or nests (each nest is a fuzzy model in our work) is abandoned and new ones are built.
Step 4. Test the stopping criterion; if it is satisfied, keep the best fuzzy model (the optimal premise parameters, consequent parameters, and number of rules); otherwise return to Step 1.
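The steps above can be sketched as a compact minimization loop. This is a simplified illustration, not the authors' implementation: the Lévy steps use Mantegna's algorithm, abandoned nests are rebuilt uniformly at random within the bounds, the stopping criterion is a fixed number of generations, and the names `mse`, `levy`, and `cuckoo_search` with their default values are hypothetical. In the setting of this paper, `fitness` would decode a particle into a T-S model and return its MSE on the training data.

```python
import math
import numpy as np

def mse(y_ref, y):
    """Mean squared error between desired and model outputs."""
    return float(np.mean((np.asarray(y_ref) - np.asarray(y)) ** 2))

def levy(shape, rng, lam=1.5):
    """Heavy-tailed steps via Mantegna's algorithm (assumed, not from the paper)."""
    sigma_u = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
               / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    return rng.normal(0, sigma_u, shape) / np.abs(rng.normal(0, 1, shape)) ** (1 / lam)

def cuckoo_search(fitness, lower, upper, n_nests=20, n_gen=500,
                  p_a=0.25, alpha=0.01, seed=0):
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    nests = rng.uniform(lower, upper, (n_nests, lower.size))   # (3) random initial nests
    cost = np.array([fitness(n) for n in nests])               # (4) initial fitness
    n_drop = int(p_a * n_nests)
    for _ in range(n_gen):                                     # Step 4: fixed budget
        # Step 1: new cuckoo by a Levy flight from a random nest, then pick a nest j.
        i = rng.integers(n_nests)
        step = alpha * (upper - lower) * levy(lower.size, rng)
        cuckoo = np.clip(nests[i] + step, lower, upper)
        c = fitness(cuckoo)
        j = rng.integers(n_nests)
        # Step 2: keep the cuckoo if it has a lower cost than nest j.
        if c < cost[j]:
            nests[j], cost[j] = cuckoo, c
        # Step 3: abandon the worst fraction p_a of nests and build new ones.
        worst = np.argsort(cost)[n_nests - n_drop:]
        nests[worst] = rng.uniform(lower, upper, (n_drop, lower.size))
        cost[worst] = [fitness(n) for n in nests[worst]]
    best = int(np.argmin(cost))
    return nests[best], float(cost[best])
```

Since only the worst nests are rebuilt, the best nest found so far is never discarded, which matches the rule that the best solutions carry over to the next generation.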

Metaheuristic Algorithms
There are many metaheuristic algorithms in the literature, such as particle swarm optimization (PSO), the cooperative random learning PSO (CRPSO), hunting search (HUS), genetic algorithms (GA), and differential evolution (DE).

The Particle Swarm Optimization (PSO).
Particle swarm optimization (PSO) imitates the movement of flocking birds or schooling fish looking for food [19]. The search for the optimal solution is governed by two equations:
$$v_i(t+1) = w v_i(t) + c_1 r_1 (p_i - x_i(t)) + c_2 r_2 (p_g - x_i(t)),$$
$$x_i(t+1) = x_i(t) + v_i(t+1),$$
where $x_i$ is the position of a particle, $v_i$ is the velocity, $w$ is the inertia weight, $c_1$ and $c_2$ are constants, $r_1$ and $r_2$ are random numbers between 0 and 1, $p_i$ is the best position of the particle, and $p_g$ is the best position of all particles in the swarm.
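As an illustration, one PSO update for a single particle can be written as follows. The function name `pso_step` and the default values of the inertia weight and acceleration constants are assumptions chosen for the sketch, not values taken from the compared works.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One velocity and position update of standard PSO for a single particle."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)                      # random numbers in [0, 1)
    r2 = rng.random(x.shape)
    # Inertia term + pull toward the particle's best + pull toward the swarm's best.
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v_new, v_new
```

When a particle already sits at both its own best and the swarm best, the two attraction terms vanish and only the inertia term `w * v` moves it.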
Another version of PSO is the cooperative random learning PSO (CRPSO), which uses subswarms; the velocity equation is given as follows:
$$v_i^j(t+1) = w v_i^j(t) + c_1 r_1 (p_i^j - x_i^j(t)) + c_2 r_2 (p_g^j - x_i^j(t)) + c_3 r_3 (p_g^r - x_i^j(t)),$$
where $c_3$ is a constant, $r_3$ is a random number, $j$ is the index of the subswarm, $r$ is a number between 1 and the number of subswarms, and $p_g^r$ is the global best position of all subswarms [19].

The Hunting Search (HUS).
The hunting search (HUS) algorithm imitates the group hunting behavior of animals such as wolves when catching prey. The algorithm is based on approaching the leader, which has the best position in the group, and on reorganizing the group if the hunters are close to each other but still cannot find the optimum solution [18].
The search for a new solution obeys this equation:
$$x_i' = x_i + \mathrm{rand} \times \mathrm{NML} \times (x_i^L - x_i),$$
where rand is a random number between 0 and 1, NML is the maximum number of movements toward the leader, and $x_i^L$ is the position of the leader in the $i$th variable. Another step is the reorganization of the hunters, which is given by this equation:
$$x_i' = x_i^L \pm \mathrm{rand} \times (\max(x_i) - \min(x_i)) \times \alpha \exp(-\beta \times \mathrm{EN}),$$
where $\max(x_i)$ and $\min(x_i)$ are the maximum and minimum possible values of the $i$th variable, EN is the number of past reorganizations, and $\alpha$ and $\beta$ are positive constants. The aim of this step is to avoid falling into a local minimum and to obtain a globally optimal solution.

The Genetic Algorithm (GA).
The genetic algorithm (GA) is a random search technique that imitates natural evolution with a Darwinian survival-of-the-fittest approach. In this algorithm, the variables are represented as genes in a chromosome, and the chromosomes are evaluated according to their fitness values. The chromosomes with better fitness are found through the three basic operations of GA: selection, crossover, and mutation [46]. The GA proceeds as follows:
(1) Initialization of the population of chromosomes.
(2) Evaluation of each element in the population by calculating its fitness function.
(3) Selection of chromosomes according to their fitness values.
(4) Generation of new chromosomes from the selected chromosomes using the GA operations of crossover and mutation.
(5) Test of the stopping criterion: if it is satisfied, then the parameters are kept; otherwise return to Step (2).
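The GA steps can be sketched as a small real-coded loop. The specific operators below (binary tournament selection, uniform crossover, Gaussian mutation, and keeping one elite chromosome) are assumptions chosen for brevity; the text does not specify which variants were used, and the name `ga_minimize` and its defaults are hypothetical.

```python
import numpy as np

def ga_minimize(fitness, lower, upper, pop_size=20, n_gen=100, p_mut=0.1, seed=0):
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    pop = rng.uniform(lower, upper, (pop_size, dim))            # (1) initialization
    for _ in range(n_gen):                                      # (5) fixed budget
        cost = np.array([fitness(c) for c in pop])              # (2) evaluation
        elite = pop[np.argmin(cost)].copy()
        # (3) selection: binary tournament, the lower-cost chromosome wins.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((cost[a] < cost[b])[:, None], pop[a], pop[b])
        # (4) uniform crossover with shuffled partners, then Gaussian mutation.
        partners = parents[rng.permutation(pop_size)]
        mask = rng.random((pop_size, dim)) < 0.5
        children = np.where(mask, parents, partners)
        mutate = rng.random((pop_size, dim)) < p_mut
        noise = rng.normal(0.0, 0.1 * (upper - lower), (pop_size, dim))
        pop = np.clip(children + mutate * noise, lower, upper)
        pop[0] = elite                                          # keep the best so far
    cost = np.array([fitness(c) for c in pop])
    best = int(np.argmin(cost))
    return pop[best], float(cost[best])
```

Keeping the elite chromosome means the best cost never gets worse from one generation to the next.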

The Differential Evolution (DE).
Differential evolution (DE) is a search algorithm similar to GA; it deals with a real-coded population and devises its own crossover and mutation in the real space [13]. DE creates $x_i'$, a mutated form of any individual $x_i$, using the vector difference of two randomly picked individuals $x_a$ and $x_b$:
$$x_i' = x_i + F (x_a - x_b),$$
where $F$ is a scaling factor between 0 and 2. Then, the crossover is applied between any individual member of the population and the mutated vector $x_i'$, and the best element is kept in the last iteration.
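One DE generation along these lines might look as follows. The binomial crossover, the greedy replacement of a parent by a trial vector that is no worse, and the name `de_step` with its default values are assumptions added to make the sketch complete; the population needs at least three individuals.

```python
import numpy as np

def de_step(pop, cost, fitness, F=0.5, cr=0.9, rng=None):
    """One generation of a simple DE variant for minimization."""
    rng = np.random.default_rng() if rng is None else rng
    n, dim = pop.shape
    new_pop, new_cost = pop.copy(), cost.copy()
    for i in range(n):
        # Pick two distinct individuals different from i.
        others = [k for k in range(n) if k != i]
        a, b = rng.choice(others, size=2, replace=False)
        mutant = pop[i] + F * (pop[a] - pop[b])       # mutation by vector difference
        cross = rng.random(dim) < cr                  # binomial crossover mask
        cross[rng.integers(dim)] = True               # take at least one mutant gene
        trial = np.where(cross, mutant, pop[i])
        c = fitness(trial)
        if c <= cost[i]:                              # keep the trial if it is no worse
            new_pop[i], new_cost[i] = trial, c
    return new_pop, new_cost
```

Calling `de_step` repeatedly and taking the individual with the lowest cost after the last generation gives the final solution.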

Application Examples and Discussions
In this part, the optimized T-S models are used to identify three systems: a nonlinear plant modelling problem, the Box-Jenkins gas furnace benchmark, and the identification of a nonlinear system. The performance of CS is compared with that of other metaheuristic algorithms. The parameters used in the examples are presented in Table 1.
According to Table 2, the CS method gives the best results compared with the other methods: a mean number of rules lower than that of the HUS method in [18], and the lowest mean MSE and standard deviation (Std) in both the training and testing stages. The CS method also achieves good performance with a smaller number of evaluations than in [19]: in [19], 1000 generations and 20 particles are used with the CRPSO algorithm and 2000 generations and 30 particles with PSO, GA, and DE, whereas in our work we use 500 generations and 20 particles.
The optimal fuzzy model gives an MSE of $8 \times 10^{-4}$ in training and $4 \times 10^{-4}$ in testing. Figure 3 shows the target and model outputs in the training and testing stages for the optimal model, and the errors between them can be seen in Figure 4. As shown in Figure 3, the CS method reproduces the output with small errors.

Application to Box-Jenkins Gas Furnace Data.
The Box-Jenkins gas furnace data [1,11,34,35,36] was recorded from a combustion process of a methane-air mixture [44]. The data set originally consists of 296 data points $[u(t), y(t)]$. The input $u(t)$ is the gas flow rate and the output $y(t)$ is the carbon dioxide (CO2) concentration. The aim is to elicit a model to predict $y(t)$ using this data. The first step is to determine the appropriate inputs to be used. All the simulations are executed 50 times. The mean number of rules and the mean and standard deviation of the MSE are listed in Table 3.
From Table 3, we conclude that CS has a lower mean MSE than PSO, HUS, and DE and a smaller standard deviation of the MSE than PSO, CRPSO, HUS, and DE. The mean number of rules of CS is much smaller than that of HUS, PSO, CRPSO, GA, and DE. In conclusion, the CS-based method can give a fuzzy model with fewer rules. The optimal fuzzy model found by CS during 50 runs has an MSE of 0.139 and 3 rules. Figure 5 shows the target and the model outputs and Figure 6 gives the errors between them. The optimal fuzzy model can identify the output with small errors. Table 4 gives the parameters of the optimal fuzzy model.

Identification of Nonlinear System.
The third example used for identification, given by Narendra and Parthasarathy [47], is described by a difference equation in the output $y(k)$ and the input $u(k)$; the input $u(k)$ is given by its own equation.
The output of this system depends nonlinearly on both its past values and the input. The 200 training input patterns are randomly generated in the interval $[-1, 1]$ by using (17). The aim of this application is to predict $y(k)$ by using this approach when the inputs chosen are $y(k - 1)$ and $u(k - 1)$.
All the coefficients of the premise and consequent parameters are limited to [−5, 5]. The maximum number of rules is chosen as 5. Table 5 gives the best results of 50 experimental trials.
For all methods, the number of generations is fixed to 500 and the number of particles is fixed to 20.
As shown in Table 5, CS gives the smallest number of rules, a lower mean MSE, and a lower standard deviation of the MSE compared to PSO, HUS, and GA. The optimal fuzzy model found by CS during 50 runs has an MSE of 0.0231 and 3 rules. Figure 7 shows the target and the model outputs and Figure 8 gives the errors between them. The optimal fuzzy model can predict the output with small errors.

Conclusions
In this paper, the extraction of a T-S fuzzy model using the CS algorithm is described. In the T-S fuzzy model tuned by CS, the rule structure and both the premise and consequent parameters are optimized. The optimal T-S fuzzy model has fewer rules and a smaller MSE, both in mean and in standard deviation. The T-S model obtained with the CS algorithm is validated by comparing its performance with other methods on three benchmarks: the nonlinear plant modelling problem, the Box-Jenkins problem, and the identification of a nonlinear system. The comparison shows that, on these benchmarks, the CS algorithm gives better accuracy in modelling nonlinear systems, yielding a model with a minimum number of rules and smaller errors than the other metaheuristics.