Identification of Fuzzy Inference Systems by Means of a Multiobjective Opposition-Based Space Search Algorithm

We introduce a new category of fuzzy inference systems designed with the aid of a multiobjective opposition-based space search algorithm (MOSSA). The proposed MOSSA is essentially a multiobjective space search algorithm enhanced by opposition-based learning, which employs a so-called opposite-numbers mechanism to speed up the convergence of the optimization algorithm. In the identification of fuzzy inference systems, the MOSSA is exploited to carry out the parametric identification of the fuzzy model as well as to realize its structural identification. Experimental results demonstrate the effectiveness of the proposed fuzzy models.


Introduction
Fuzzy modeling has been utilized in many fields of engineering, medical engineering, and even social science. Numerous approaches to fuzzy modeling have been proposed in the past decades. Pioneering work by Pedrycz [1], Tong [2], Xu and Zailu [3], Sugeno and Yasukawa [4], Oh and Pedrycz [5], Chung et al. [6], and others [7] has studied different approaches to constructing fuzzy models. All of these methods are based on information granulation and optimization algorithms, yet investigations of the multiobjective identification of fuzzy models remain scarce.
There have been a suite of studies focusing on multiple objectives when designing fuzzy models. In the design of fuzzy models, two main and conflicting objectives are commonly considered: accuracy and complexity. In the 1990s, the emphasis of modeling was placed on accuracy maximization. As powerful optimization tools in many fields of science [8], various approaches such as Genetic Algorithms (GAs) and Particle Swarm Optimization (PSO) [9,10] have been proposed to improve the accuracy of fuzzy models. As a result of accuracy maximization, however, the complexity of the model increases. Some researchers attempted to optimize the accuracy and the complexity of fuzzy models simultaneously [11,12]. Since it is impossible to optimize these objectives simultaneously, owing to the accuracy-complexity tradeoff, accuracy maximization and complexity minimization have often been cast in the setting of multiobjective optimization. A number of evolutionary algorithms (EAs) have been developed to solve multiobjective optimization problems, such as micro-GA [13] and NSGA-II [14-16]. As a result, multiobjective optimization (MOO) techniques have been applied to the design of fuzzy models exhibiting high accuracy and significant interpretability [17,18]. Nevertheless, when dealing with fuzzy models, previous studies lack an optimization vehicle that considers not only the solution space being explored but also the techniques of MOO. In our previous study [19], we proposed a design of fuzzy models based on a multiobjective space search algorithm (MSSA). That work provided some enhancements of fuzzy modeling. Nevertheless, the resulting fuzzy model has two notable limitations: (1) the conventional MSSA is essentially a stochastic search technique, and its random mechanism leads to slow convergence; and (2) the flexibility and predictive ability of the model are limited because a single type of polynomial is used in the consequent part of all fuzzy rules.
In this study, we present a multiobjective opposition-based space search algorithm (MOSSA) and introduce a design of fuzzy models by means of the MOSSA and the weighted least squares method (WLSM). The resulting fuzzy models address the two limitations mentioned above. First, the proposed MOSSA, used as a vehicle to maximize the accuracy of the fuzzy inference system, comes with a more rapid convergence speed in comparison with the conventional MSSA. Second, instead of the ordinary least squares method (LSM), the WLSM is used to estimate the coefficients of the consequent polynomials. With the use of the WLSM, the fuzzy model can exhibit different types of polynomials, which may vary from one rule to another.

A Design of Fuzzy Inference Systems
Figure 1 depicts an overall scheme of fuzzy rule-based modeling. The identification procedure for fuzzy models is split into two parts, namely, the premise part and the consequence part of the fuzzy rules.

Identification of Premise Part.
The identification completed at the premise level consists of two main steps. First, we select the input variables x_1, x_2, ..., x_k of the rules. Second, we form fuzzy partitions (by specifying fuzzy sets of well-defined semantics, e.g., Low, High, etc.) of the spaces over which these individual variables are defined.
The identification of the premise part is completed in the following manner.
Step 1. Arrange a set of data U into a data set X composed of the corresponding input and output data.
Step 2. Run the C-Means clustering algorithm to determine the centers (prototypes) v_k of the data set X.
Step 2.1. Arrange the data set X into c clusters (in essence, this is the information granulation process).
Step 2.2. Calculate the center v_k of each cluster as the mean of its members:

v_k = (1 / |X_k|) Σ_{x_i ∈ X_k} x_i,

where X_k denotes the subset of data belonging to the k-th cluster.
Step 3. Partition the corresponding input space using the prototypes v_k of the clusters. Associate each cluster with some meaning (semantics), say Small, Large, and so forth.
Step 4. Set the initial apexes of the membership functions using the prototypes v_k.

Identification of Consequence Part.
The identification of the consequence part of the rules embraces two phases, namely, (1) a selection of the consequence variables of the fuzzy rules (identification of the consequence structure) and (2) determination of the parameters of the consequence (identification of the consequence parameters).

Identification of Consequence Structure.
The identification of the conclusion parts of the rules deals with a selection of their structure (Types 1, 2, 3, and 4), followed by the determination of the respective parameters of the local functions occurring there. The consequence part of the rule, which is an extended form of a typical fuzzy rule in the TSK (Takagi-Sugeno-Kang) fuzzy model, has the form

R^j: If x_1 is A_{j1} and ... and x_k is A_{jk}, then y_j − M_j = f_j(x_1 − v_{j1}, ..., x_k − v_{jk}),

with the following types of local functions f_j:

Type 1 (simplified inference): f_j = a_{j0}.
Type 2 (linear inference): f_j = a_{j0} + Σ_{i=1}^{k} a_{ji}(x_i − v_{ji}).
Type 3 (quadratic inference): f_j = a_{j0} + Σ_{i=1}^{k} a_{ji}(x_i − v_{ji}) + Σ_{i=1}^{k} a_{j(k+i)}(x_i − v_{ji})^2.
Type 4 (modified quadratic inference): f_j = a_{j0} + Σ_{i=1}^{k} a_{ji}(x_i − v_{ji}) + Σ_{i<l} a_{jil}(x_i − v_{ji})(x_l − v_{jl}).

Here R^j is the j-th fuzzy rule, x_i represents the input variables, A_{ji} is a membership function of a fuzzy set, a_{ji} is a constant coefficient, v_{ji} and M_j are center values of the input and output data, respectively, and n is the number of fuzzy rules.
The calculation of the numeric output of the model, based on the activation (matching) levels of the rules, relies on the expression

y* = (Σ_{j=1}^{n} w_j y_j) / (Σ_{j=1}^{n} w_j),

where y* is the inferred output value and w_j is the premise level of matching (activation level) of the j-th rule.
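The weighted-average inference can be sketched as below. The Gaussian membership shape is an illustrative assumption (the paper fixes the apexes at the cluster prototypes but the shape is not specified here), and all names are ours; each rule output is y_j = M_j + f_j(x − v_j), following the rule format above.

```python
import numpy as np

def infer(x, apexes, widths, local_models, out_centers):
    """Weighted-average inference y* = sum_j w_j * y_j / sum_j w_j,
    with w_j the activation level of rule j."""
    x = np.asarray(x, dtype=float)
    # activation levels w_j from (assumed) Gaussian memberships
    w = np.array([np.exp(-((x - v) ** 2).sum() / (2.0 * s ** 2))
                  for v, s in zip(apexes, widths)])
    # local rule outputs y_j = M_j + f_j(x - v_j)
    y = np.array([m + f(x - v)
                  for f, v, m in zip(local_models, apexes, out_centers)])
    return float((w * y).sum() / w.sum())
```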

Identification of Consequence Parameters.
The identification of the consequence parameters is completed with the aid of the WLSM, which determines the coefficients of the model through the minimization of an objective function J_j. The main difference between the WLSM and the classical LSM is the weighting scheme, which comes as a part of the WLSM and makes it focused on the corresponding local model:

J_j = Σ_{i=1}^{m} w_{ji} (y_i − f_j(x_i))^2.

The performance index J_j can be rearranged as

J_j = (Y_j − X_j a_j)^T W_j (Y_j − X_j a_j),

where a_j is the vector of coefficients of the j-th consequent polynomial (local model), W_j is the diagonal matrix (weighting factor matrix) involving the activation levels, Y_j is the vector of output data shifted by the output center M_j, and X_j is a matrix that includes the input data shifted by the locations of the information granules (more specifically, the centers of the clusters). In case the consequent polynomial is of Type 2 (linear, i.e., a first-order polynomial), X_j and a_j read as follows: the i-th row of X_j is [1, x_{1i} − v_{j1}, ..., x_{ki} − v_{jk}] and a_j = [a_{j0}, a_{j1}, ..., a_{jk}]^T. For the local learning algorithm, the objective function is defined as a linear combination of the squared errors, being the differences between the data and the corresponding output of each fuzzy rule, based on a weighting factor matrix. The weighting factor matrix W_j captures the activation levels of the input data in the j-th subspace. In this sense, we can consider the weighting factor matrix to form a discrete version of the fuzzy linguistic representation of the corresponding subspace.
The coefficients of the consequent polynomial of the j-th fuzzy rule can then be determined in the usual manner, namely,

a_j = (X_j^T W_j X_j)^{−1} X_j^T W_j Y_j.

Notice that the coefficients of the consequent polynomial of each fuzzy rule are computed independently, using a certain subset of the training data. These computations can be implemented in parallel, in which case the overall computing load becomes unaffected by the total number of rules.
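The per-rule weighted least squares estimate can be sketched as follows; the function name and argument layout are ours.

```python
import numpy as np

def wlsm_coefficients(X_j, w_j, y_j):
    """Weighted least squares for one rule's consequent polynomial:
    a_j = (X_j^T W_j X_j)^{-1} X_j^T W_j y_j, where W_j = diag(w_j)
    holds the activation levels of the j-th rule.  For a Type 2
    (linear) consequent, rows of X_j are [1, x_1 - v_j1, ..., x_k - v_jk]."""
    X_j = np.asarray(X_j, dtype=float)
    W = np.diag(np.asarray(w_j, dtype=float))
    # solve the weighted normal equations instead of forming an inverse
    return np.linalg.solve(X_j.T @ W @ X_j, X_j.T @ W @ y_j)
```

Because each rule's system is independent, calls to this routine for different rules can run in parallel, matching the remark above.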

Multiobjective Optimization of Fuzzy Inference Systems
Many optimization problems come with multiple objectives, which not only interact but may be in conflict. The MSSA [20] is one such multiobjective optimization algorithm and has been successfully used in the design of fuzzy models. In this study, we develop a multiobjective optimization algorithm based on an opposition-based space search algorithm as the optimization vehicle of the FIS.

Multiobjective Opposition-Based Space Search Algorithm (MOSSA).
We first introduce the single-objective space search algorithm (SSA), an adaptive heuristic optimization algorithm whose search method comes with an analysis of the solution space [21]. Let us recall the space search mechanism used to update the current solutions. The role of space search is to generate new solutions from the old ones. The search method is based on the space search operator, which comprises two basic steps: it generates a new subspace (local area) and realizes a search in this new space. The search in the new space is realized by randomly generating a new solution (individual) located in this space. Regarding the generation of the new space, we consider two cases: (a) space search based on m selected solutions (denoted here as Case I), and (b) space search based on the current best solution (Case II).
To illustrate the operator in detail, we consider an optimization problem of minimizing an objective f(x). Here a feasible solution can be represented as a vector (x_1, x_2, ..., x_N).
We consider two scenarios.
(a) Space search based on m selected solutions: in this case, m solutions are randomly selected from the current population. The role of this operator is to update the current solutions by new solutions approaching the optimum. The adjacent space based on the m selected solutions s_1, ..., s_m is given in the form

S = { x | x = Σ_{i=1}^{m} λ_i s_i, Σ_{i=1}^{m} λ_i = 1, λ_i > 0, i = 1, 2, ..., m }.
(b) Space search based on the current best solution: in this case, the given solution is the best solution b in the current population. The role of this operator is to adjust the best solution by searching an adjacent space of the form

S = { x | b_j − ε d_j ≤ x_j ≤ b_j + ε d_j, j = 1, 2, ..., N },

where d_j is a reference range along the j-th coordinate. Here ε is a proportion coefficient used to adjust the size of the adjacent space. In this study, ε is set to 1.
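The two space search operators can be sketched as below. These are illustrative readings of Cases I and II under the reconstructed set definitions above (random convex combination for Case I, uniform sampling in an adjacent hypercube for Case II); the exact sampling rules of the original algorithm may differ.

```python
import numpy as np

def case1_search(selected, rng):
    """Case I: generate a new solution inside the space spanned by the
    m selected solutions via a random convex combination
    (lambda_i > 0, sum lambda_i = 1)."""
    S = np.asarray(selected, dtype=float)
    lam = rng.random(len(S))
    lam /= lam.sum()          # normalize so the weights sum to 1
    return lam @ S

def case2_search(best, d, rng, eps=1.0):
    """Case II: sample uniformly from an adjacent hypercube around the
    current best solution; eps is the proportion coefficient scaling
    the size of the adjacent space (set to 1 in the paper)."""
    best = np.asarray(best, dtype=float)
    d = np.asarray(d, dtype=float)
    return best + eps * d * (2.0 * rng.random(best.shape) - 1.0)
```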
To speed up the convergence of the SSA, we develop an opposition-based space search algorithm (OSSA), realized by using the mechanism of opposition-based learning (OBL) [22]. OBL has been shown to be an effective concept for enhancing various optimization approaches. Let us recall the basic concept: for a point x ∈ [a, b], its opposite is defined as x̆ = a + b − x; evaluating a candidate solution together with its opposite and retaining the better of the two tends to accelerate convergence.
Based on this concept, we can develop the opposition-based space search operator. Assume that the current solution set (population) is P, where h is the size of the solution set, h_o is the size of the opposition solution set, and N is the dimension of a solution. The opposition-based space search can be summarized as shown in Pseudocode 1.
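A generic opposition step can be sketched as follows. This is a minimal OBL scheme under the opposite-point definition recalled above, not a transcription of the paper's Pseudocode 1; all names are ours.

```python
import numpy as np

def opposite(x, lower, upper):
    """Opposite point per OBL: for x in [a, b] (componentwise),
    the opposite is a + b - x."""
    return np.asarray(lower) + np.asarray(upper) - np.asarray(x, dtype=float)

def obl_step(population, lower, upper, fitness):
    """Merge the population with its opposites and keep the best h
    solutions under minimization."""
    pop = np.asarray(population, dtype=float)
    merged = np.vstack([pop, opposite(pop, lower, upper)])
    order = np.argsort([fitness(x) for x in merged], kind="stable")
    return merged[order[: len(pop)]]
```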
With an understanding of the OSSA, we can develop the MOSSA. In order to improve convergence to the Pareto front as well as to produce a well-distributed Pareto front, the technique of nondominated sorting with the aid of the crowding distance [23] is used in the MOSSA. The details are presented in Pseudocode 2. The nondominated sorting is realized with the aid of an estimation of the crowding distance among solutions in the current solution set. The termination condition of the MOSSA is that all solutions in the current population have the same fitness, or that a fixed number of generations has elapsed.
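The two building blocks cited here, Pareto dominance and the NSGA-II-style crowding distance [23], can be sketched as follows; this is a standard formulation, not the paper's exact Pseudocode 2.

```python
import numpy as np

def dominates(u, v):
    """Pareto dominance (minimization): u is no worse than v in every
    objective and strictly better in at least one."""
    u, v = np.asarray(u), np.asarray(v)
    return bool(np.all(u <= v) and np.any(u < v))

def crowding_distance(front):
    """Crowding distance over one nondominated front: boundary
    solutions get infinity; interior ones accumulate the normalized
    gap between their neighbours per objective."""
    F = np.asarray(front, dtype=float)
    n, m = F.shape
    dist = np.zeros(n)
    for j in range(m):
        order = F[:, j].argsort()
        span = F[order[-1], j] - F[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf
        if span > 0:
            for i in range(1, n - 1):
                dist[order[i]] += (F[order[i + 1], j] - F[order[i - 1], j]) / span
    return dist
```

Solutions with larger crowding distance sit in sparser regions of the front, which is what keeps the Pareto front well distributed.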

Arrangement of Solutions in the MOSSA.
The standard gradient-based optimization techniques might not be effective in the context of rule-based systems, given their nonlinear character (in particular, the form of the membership functions) and the modularity of the systems. This suggests exploring other optimization techniques. Figure 2 depicts the arrangement of solutions present in the MOSSA-based method. The first part, supporting structural optimization, is separated from the second part, used for parametric optimization. The size of the solutions for the structural optimization of the fuzzy model is determined according to the number of all input variables of the system. The size of the solutions for the parametric optimization depends on the structurally optimized fuzzy inference system. When running the optimization method, we use an improved tuning method. Figure 3 illustrates the comparison of "conventional" tuning and the proposed tuning. In the "conventional" tuning, the structural and the parametric optimization are carried out sequentially. First, the structural optimization is completed over p generations, and then we proceed with the parametric phase over q generations, where p and q are given numbers. The structural optimization of the fuzzy model is carried out assuming that the apexes of the membership functions are kept fixed. The fixed apexes of the membership functions are taken as the center values produced by the C-Means algorithm, while the parametric optimization is applied to the fuzzy model derived through the structural optimization. In a nutshell, from the viewpoint of structure identification, only one fixed parameter set, namely, the assigned apexes of the membership functions obtained by C-Means clustering, is considered when carrying out the overall structural optimization of the fuzzy model. From the viewpoint of parameter identification, only the one structurally optimized model obtained during the structure identification is involved in the overall parametric optimization. In order to construct the optimized fuzzy model, the range of the search space for the structural as well as the parametric optimization is clearly restricted in the sequential tuning method. To alleviate this problem, we present a MOSSA-based improved tuning method. In this approach, we realize the structural optimization with the aid of s generations of parametric optimization, where s is a given number. These several generations of parametric optimization help to determine the optimal structure of the fuzzy model.

Objective Functions of the FIS.
Three objective functions are used to evaluate the accuracy and the complexity of an FIS: a performance index, the partition criterion, and the total number of coefficients of the polynomials to be estimated. Once the input variables of the premise part have been specified, the optimal consequence parameters minimizing the assumed performance index can be determined. We consider two performance indexes, namely, the standard root mean squared error (RMSE) and the mean squared error (MSE):

RMSE: PI = sqrt( (1/m) Σ_{i=1}^{m} (y_i − y_i*)^2 ),
MSE: PI = (1/m) Σ_{i=1}^{m} (y_i − y_i*)^2,

where y_i* is the output of the fuzzy model, m is the total number of data, and i is the data index. The accuracy criterion f1 includes both the training data and the testing data and comes as a convex combination of the two components:

f1 = θ · PI + (1 − θ) · E_PI.

Here, PI and E_PI (V_PI) denote the performance index for the training data and the testing data (validation data), respectively, and θ is a weighting factor that allows us to strike a sound balance between the performance of the model on the training and testing data. Depending upon the value of the weighting factor, several specific cases of the objective function are worth distinguishing.
(i) If θ = 1, then the model is optimized based on the training data only; no testing data is taken into consideration.
(ii) If θ = 0.5, then both the training and the testing data are taken into account; moreover, they are assumed to exert the same impact on the performance of the model.
(iii) The intermediate case θ ∈ [0, 1] embraces both of the cases stated above. The choice of θ establishes a certain tradeoff between the approximation and generalization aspects of the fuzzy model.
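The accuracy criterion is a one-liner; the sketch below uses `theta` for the paper's weighting factor (whose original symbol was lost in extraction).

```python
def accuracy_objective(pi_train, pi_test, theta=0.5):
    """Accuracy criterion f1 = theta * PI + (1 - theta) * E_PI, a convex
    combination of the training and testing performance indexes;
    theta = 1 ignores the testing data, theta = 0.5 weighs both equally."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * pi_train + (1.0 - theta) * pi_test
```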
As a measure for evaluating the structural complexity of a model, we consider the partition criterion

f2 = Σ_{i=1}^{k} n_i,

where k is the total number of selected input variables and n_i is the number of membership functions for the i-th input variable.
As a simplicity criterion, we consider the consequence part of the local models, computed as

f3 = Σ_{j=1}^{n} c_j,

where c_j is the number of coefficients of the j-th consequent polynomial (itself determined by the polynomial type and the number k of input variables).
In a nutshell, we find the Pareto optimal sets and the Pareto front by minimizing {f1, f2, f3} by means of the MOSSA. This leads to easily interpretable, simple, and accurate fuzzy models.

Experimental Studies
This section reports on comprehensive numeric studies illustrating the design of the fuzzy model and quantifying its performance. We use three well-known data sets. Each data set is divided into two parts of the same size. PI denotes the performance index for the training data, V_PI represents the validation data, and E_PI stands for the testing data. In all experiments, the weighting factor θ was set to 0.5.
The parameters of the MOSSA are summarized in Table 1 (the choice of these specific values of the parameters is a result of intensive experimentations; as a matter of fact, those values are in line with those reported in the literature).

NOx Emission Process Data of Gas Turbine Power Plant.
The NOx emission process is modeled using data from gas turbine power plants. A NOx emission process of a GE gas turbine power plant located in Virginia, USA, is chosen for this experiment. The input variables include AT (ambient temperature at site), CS (compressor speed), LPTS (low pressure turbine speed), CDP (compressor discharge pressure), and TET (turbine exhaust temperature). The output variable is NOx. The performance index is the MSE defined by (17). First, the NOx emission process data is split into two separate data sets. The first 50% of the data set (consisting of 130 pairs) is used in the design of the fuzzy model. The remaining 50% (consisting of 130 pairs) helps quantify the predictive quality of the model. Table 2 summarizes the values of the objective functions (f1, f2, f3) for the optimal solutions (individuals) of the FIS. Generally, as could have been expected, increasing the number of coefficients or rules improves the accuracy of the FIS. Second, the NOx emission process data is divided into three parts: 130 of the 260 input-output pairs are used for training, 78 pairs are utilized for validation, and the remaining 52 pairs serve as a testing set. The results of the proposed model are shown in Table 3.
Table 4 presents a comparative analysis of the proposed model when contrasted with other models. The selected values of the performance indexes of the FIS are included in Tables 2 and 3, respectively. We note that the proposed model outperforms several previous fuzzy models known in the literature.

Automobile Miles Per Gallon (MPG) Data.
We consider the automobile MPG data (http://archive.ics.uci.edu/ml/datasets/Auto+MPG) with the output being the automobile's fuel consumption expressed in miles per gallon. The data set includes 392 input-output pairs (after removing incomplete instances), where the input space involves 8 input variables. To come up with a quantitative evaluation of the fuzzy model, we use the standard RMSE performance index, as described by (17).
The automobile MPG data is partitioned into two separate parts. The first 235 data pairs are used as the training data set for the FIS, while the remaining 157 pairs form the testing data set for assessing the predictive performance. Table 5 summarizes the values of the objective functions (f1, f2, f3) for the optimal solutions (individuals) of the FIS. Next, we divide the automobile MPG data into three separate parts. The first part (consisting of 196 pairs) is used for training. The second part (consisting of 118 pairs) is utilized for validation. The remaining part (consisting of 78 pairs) serves as a testing set. The values of the performance index are presented in Table 6.
The identification error of the proposed model is compared with the performance of some other models in Table 7. The selected values of the performance indexes of the FIS are marked in Tables 5 and 6, respectively. The performance of the proposed model is better in the sense of both its approximation and prediction abilities.

Boston Housing Data.
Here we experiment with the Boston housing data set [29]. This data set concerns a description of real estate in the Boston area, where houses are characterized by features such as crime rate, size of lots, number of rooms, and age of houses, together with their median price. The data set consists of 506 14-dimensional data points. The performance index is the RMSE, as given by (17).
The Boston housing data set is first split into two separate parts. The construction of the fuzzy model is completed on 253 data points regarded as the training set. The rest of the data set (i.e., 253 data points) is retained for testing purposes. The values of the performance index are summarized in Table 8.
Next, the Boston housing data set is partitioned into three separate parts: 253 of the 506 input-output pairs are used for training, 152 pairs are utilized for validation, and the remaining 101 pairs serve as a testing set. Table 9 summarizes the results obtained for the optimized structure and the performance index of the parameters optimized by the MOSSA.
Table 10 offers a comparative analysis involving some existing models. The selected values of the performance indexes of the FIS are marked in Tables 8 and 9, respectively. It is evident that the proposed model compares favorably both in terms of accuracy and prediction capabilities.

Concluding Remarks
This paper contributes to the research area of the hybrid optimization of fuzzy inference systems in two important aspects: (1) we proposed a multiobjective opposition-based space search algorithm; and (2) we introduced the identification of fuzzy inference systems based on the MOSSA and the WLSM. Numerical experiments using three well-known data sets show that the model constructed with the aid of the MOSSA exhibits better performance in comparison with the fuzzy models reported in the literature.

Figure 1: An overall scheme of fuzzy rule-based modeling.

Table 1: List of parameters of the MOSSA.

Table 2: Optimal solutions based on training data and testing data (NOx).

Table 3: Optimal solutions based on training data, validation data, and testing data (NOx).

Table 4: Comparative analysis of selected models (NOx).

Table 5: Optimal solutions based on training data and testing data (MPG).

Table 6: Optimal solutions based on training data, validation data, and testing data (MPG).

Table 7: Comparative analysis of selected models (MPG).

Table 8: Optimal solutions based on training data and testing data (Housing).

Table 9: Optimal solutions based on training data, validation data, and testing data (Housing).

Table 10: Comparative analysis of selected models (Boston Housing).