Design of Polynomial Fuzzy Radial Basis Function Neural Networks Based on Nonsymmetric Fuzzy Clustering and Parallel Optimization

We first propose a Parallel Space Search Algorithm (PSSA) and then introduce a design of Polynomial Fuzzy Radial Basis Function Neural Networks (PFRBFNN) based on the Nonsymmetric Fuzzy Clustering Method (NSFCM) and PSSA. The PSSA is a parallel optimization algorithm realized by using the Hierarchical Fair Competition (HFC) strategy. NSFCM is essentially an improved fuzzy clustering method whose good performance in the design of “conventional” Radial Basis Function Neural Networks (RBFNN) has been proven. In the design of PFRBFNN, NSFCM is used to design the premise part, while the consequence part is realized by means of the weighted least squares (WLS) method. Furthermore, the HFC-based PSSA (HFC-PSSA) is exploited to optimize the proposed neural network. Experimental results demonstrate that the proposed neural network performs better than some existing neurofuzzy models reported in the literature.


Introduction
Owing to their learning and generalization abilities, Fuzzy Radial Basis Function Neural Networks (FRBFNN) have been utilized in numerous fields such as engineering, medical engineering, and social science [1,2]. They are developed by integrating the principles of Radial Basis Function Neural Networks (RBFNN) and invoking the mechanisms of information granulation [3]. In the design of classical FRBFNN, information granulation realized by Fuzzy C-Means (FCM) is applied to the premise part, while the consequence part (output) is treated as a linear combination of zero-order polynomials; with the use of FCM, the centers of the clusters are determined and the membership functions of the granules can be formed. The visible advantage of the FRBFNN is that it does not suffer from the curse of dimensionality, which is prominent in other networks based on grid partitioning [3]. When dealing with the FRBFNN, the parameters have to be estimated from a vast amount of data. Various evolutionary algorithms, which are powerful optimization tools in many fields of science [4][5][6], have also been proposed to improve the accuracy of the models.
Recently, Polynomial Fuzzy Radial Basis Function Neural Networks (PFRBFNN) [3] were proposed. The PFRBFNN adopts four polynomial types, which overcomes the restriction of the FRBFNN to zero-order polynomials. In the PFRBFNN, the output of the conventional FRBFNN is considered as a linear combination of four types of polynomials. Despite the successful construction of the PFRBFNN, two limitations remain: (1) the FCM selects the hidden node centers by partitioning the input space into an equal number of fuzzy sets for each input variable, and a modification of the original method that takes into account nonsymmetric fuzzy partitions of the input space is not considered [7,8]; and (2) most of the optimization algorithms used in the design of FRBFNN are not parallel.
To alleviate these limitations, in this study we propose a parallel space search algorithm (PSSA) and introduce a design of PFRBFNN with the aid of HFC-PSSA and the Nonsymmetric Fuzzy Clustering Method (NSFCM). On the one hand, with the use of nonsymmetric fuzzy partitions of the input space, information granulation [24][25][26][27] realized by NSFCM may overcome the first limitation. On the other hand, with the use of the hierarchical fair competition strategy, the proposed PSSA may help to overcome the second limitation. The overall methodology of PFRBFNN is as follows: first, NSFCM is utilized to determine the premise part of PFRBFNN, while the coefficients of the consequence polynomials are estimated by the Weighted Least Squares (WLS) method; second, HFC-PSSA is exploited to optimize the proposed PFRBFNN.
The paper is organized as follows. Section 2 presents the HFC-PSSA. Section 3 describes the architecture and learning methods of the PFRBFNN, including its optimization. Section 4 reports the experimental results. Finally, some conclusions are drawn in Section 5.

Hierarchical Fair Competition-Based Parallel Space Search Algorithm
First, let us recall the space search algorithm (SSA), an adaptive heuristic optimization algorithm whose search method is based on the analysis of the solution space [24]. The SSA generates new solutions from old ones by using so-called space search operators, which proceed in two successive steps: first a new subspace (local area) is generated, and then a search is carried out in this new subspace. The search in the new space is realized by randomly generating a new solution (individual) located in this space, while the generation of the new space comprises two cases [24]: (a) space search based on $M$ selected solutions (denoted here as Case I) and (b) space search based on the current best solution (Case II).
For convenience, we consider the following optimization problem: min (or max) $f(x_1, x_2, \ldots, x_n)$. Here a feasible solution is denoted by the vector $X = (x_1, x_2, \ldots, x_n)$. Two scenarios are considered as follows [24].
(a) Space search based on $M$ selected solutions: in this case, $M$ solutions are randomly selected from the current population. The role of this operator is to update the current solutions with new solutions that approach the optimum. The form of the adjacent space based on the $M$ solutions is given in [24].
(b) Space search based on the current best solution: in this case, the given solution is the best solution in the current population. The expression that generates a new solution is given in [24].
In the HFC-PSSA, a migration mechanism that is executed at regular generation intervals is included in the evolutionary process. To explain the details of the migration operation, let us consider the following sequence of steps [28][29][30].
Step 1. Normalize the fitness of individuals. We generate several subpopulations according to the idea of hierarchical fair competition and then normalize the fitness of the individuals in each subpopulation using the expression $nf_{i,j} = (f_{i,j} - f_{\min}) / (f_{\max} - f_{\min})$, where $f_{i,j}$ is the fitness of the $j$th individual in the $i$th subpopulation and $f_{\max}$ and $f_{\min}$ are the maximum and minimum values of fitness, respectively.
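Step 2. Calculate the admission threshold ($A_{\mathrm{th}}^{i}$). The admission threshold for the $i$th subpopulation is determined by the average of the normalized fitness: $A_{\mathrm{th}}^{i} = \frac{1}{N_i} \sum_{j=1}^{N_i} nf_{i,j}$, where $nf_{i,j}$ is the normalized fitness of the $j$th individual in the $i$th subpopulation obtained in Step 1 and $N_i$ is the size of the $i$th subpopulation.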
Step 3. Create an admission buffer located at each admission threshold level.
Step 4. Migrate the qualified individuals. Here the individuals are migrated from the admission buffer to the corresponding subpopulation. Algorithm 1 summarizes the flow of the HFC-PSSA. The algorithm terminates when all the solutions in the current population have the same fitness or after a fixed number of generations.
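The four migration steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, fitness is min-max normalized over all subpopulations, larger fitness is taken to be better, subpopulations are ordered from the lowest to the highest level, and an individual moves to the highest level whose threshold it reaches.

```python
import numpy as np

def normalize_fitness(subpops_fitness):
    """Step 1: min-max normalize fitness over all subpopulations (assumed)."""
    arrays = [np.asarray(f, dtype=float) for f in subpops_fitness]
    all_f = np.concatenate(arrays)
    f_min, f_max = all_f.min(), all_f.max()
    span = f_max - f_min
    if span == 0:  # all individuals equally fit
        return [np.zeros_like(f) for f in arrays]
    return [(f - f_min) / span for f in arrays]

def admission_thresholds(normalized):
    """Step 2: threshold of each subpopulation = mean normalized fitness."""
    return [float(nf.mean()) for nf in normalized]

def migrate(subpops, subpops_fitness):
    """Steps 3-4: fill one admission buffer per level, then return them
    as the new subpopulations (migration rule assumed for illustration)."""
    normalized = normalize_fitness(subpops_fitness)
    thresholds = admission_thresholds(normalized)
    buffers = [[] for _ in subpops]  # Step 3: one buffer per level
    for level, (pop, nf) in enumerate(zip(subpops, normalized)):
        for individual, score in zip(pop, nf):
            target = level  # stay put unless a higher threshold is reached
            for higher in range(level + 1, len(subpops)):
                if score >= thresholds[higher]:
                    target = higher
            buffers[target].append(individual)  # Step 4: migrate
    return buffers, thresholds
```

In a full optimizer this routine would be called at regular generation intervals, between applications of the space search operators.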

A New Design of the PFRBFNN
In the design of PFRBFNN, fuzzy clustering may be used to determine the number of RBFs, as well as the positions of their centers and the values of their widths [31]. The gradient method or the least squares algorithm is used to realize parametric learning of the conclusions of the rules [32,33]. In contrast to the conventional PFRBFNN, the proposed PFRBFNN is designed as shown in Figure 1. There are two main new points in the design of PFRBFNN. First, in the construction of PFRBFNN, NSFCM is used instead of FCM to realize the radial basis functions. Second, in the optimization of PFRBFNN, we use the HFC-PSSA as the optimization vehicle.

Realization of Radial Basis Functions Using Nonsymmetric Fuzzy Means.
In the PFRBFNN, the receptive fields of the radial basis functions are formed by nonsymmetric fuzzy means (NSFM) [7,8], whose main idea is as follows.
Consider a system with $N$ normalized input variables $x_i$, where $i = 1, 2, \ldots, N$. The domain of each input variable is partitioned into a number $c$ of one-dimensional triangular fuzzy sets; each fuzzy set $A_{i,j}$ is characterized by its center element $a_{i,j}$ and by $\delta a$, half of its width. This partitioning technique creates a total of $c^N$ multidimensional fuzzy subspaces $A^l$, where $l = 1, 2, \ldots, c^N$. For each fuzzy subspace one can then define the center vector $a^l$, whose $i$th entry is the center element of the one-dimensional fuzzy set that has been assigned to input $i$, and the side vector $\delta a$, which collects the corresponding half-widths.
In some sense, the conventional fuzzy means method can be regarded as "symmetric" fuzzy means [7]; it relies on the relative Euclidean distance $d_r^l(x(k))$ between the center vector $a^l$ and the input data vector $x(k)$. In the nonsymmetric fuzzy means method, this relative distance is modified to take nonsymmetric partitions of the input space into account. For the details of NSFM, see [8].
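The partitioning idea can be illustrated with a short Python sketch. It is not the NSFM algorithm of [7,8] and does not implement the relative distance; it only shows how one-dimensional triangular fuzzy sets give rise to multidimensional subspace centers, and how allowing a different number of fuzzy sets per input (a nonsymmetric partition) changes the number of subspaces. The function names, the evenly spaced centers on [0, 1], and the choice of half-width equal to the center spacing are assumptions made for illustration.

```python
import itertools
import numpy as np

def partition_centers(n_sets):
    """Centers and half-width of n_sets (>= 2) triangular fuzzy sets on [0, 1]."""
    centers = np.linspace(0.0, 1.0, n_sets)
    half_width = 1.0 / (n_sets - 1)  # assumed: equal to the center spacing
    return centers, half_width

def triangular_membership(x, center, half_width):
    """Triangular membership: 1 at the center, 0 beyond center +/- half_width."""
    return max(0.0, 1.0 - abs(x - center) / half_width)

def subspace_centers(sets_per_input):
    """Center vectors of all fuzzy subspaces.

    sets_per_input[i] is the number of fuzzy sets for input i; equal entries
    give a symmetric partition, unequal entries a nonsymmetric one.
    """
    axes = [partition_centers(s)[0] for s in sets_per_input]
    return np.array(list(itertools.product(*axes)))
```

With three fuzzy sets on each of two inputs, `subspace_centers([3, 3])` returns 9 center vectors; with `[2, 3, 4]` it returns 24, one per combination of one-dimensional fuzzy sets.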

Construction of Consequence Polynomials Using Information Granulation.
The PFRBFNN [3] based on information granulation [25][26][27] can be represented in the form of "if-then" fuzzy rules $R^j$, where $R^j$ denotes the $j$th fuzzy rule and $j$ runs from 1 to the number of fuzzy rules.
In the design of PFRBFNN, four types of polynomials are considered as the consequent part of the fuzzy rules. One of the four types is selected for each subspace as the result of the optimization, which will be described later in this study. It is noted that the PFRBFNN using IG does not suffer from the curse of dimensionality (as all variables are considered en bloc) [3]; moreover, more accurate and compact models with a small number of fuzzy rules may be constructed by using high-order polynomials. The four types of consequent polynomials are as follows [24][25][26][27].
Type 1: zero-order polynomial (constant type).
Type 2: first-order polynomial (linear type).
Type 3: second-order polynomial (quadratic type).
Type 4: modified second-order polynomial (modified quadratic type).
The numeric output of the model is determined on the basis of the activation levels of the rules.
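To make the four polynomial types concrete, the following Python sketch builds the regressor vector of one local model. The exact forms are assumptions, chosen to match what is commonly used in polynomial fuzzy models: inputs are shifted by the cluster center, Type 3 contains all squares and cross products, and Type 4 keeps the cross products but drops the squares. The function names and the activation-weighted sum used for the model output are likewise assumptions.

```python
import itertools
import numpy as np

def consequent_features(x, v, poly_type):
    """Regressor vector of one local model for input x and cluster center v."""
    z = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    features = [1.0]                      # Type 1: constant term only
    if poly_type >= 2:
        features.extend(z)                # Types 2-4: linear terms
    if poly_type == 3:
        # quadratic (assumed): squares and cross products
        pairs = itertools.combinations_with_replacement(range(len(z)), 2)
        features.extend(z[i] * z[k] for i, k in pairs)
    elif poly_type == 4:
        # modified quadratic (assumed): cross products only
        pairs = itertools.combinations(range(len(z)), 2)
        features.extend(z[i] * z[k] for i, k in pairs)
    return np.array(features, dtype=float)

def model_output(x, centers, coeffs, poly_types, activations):
    """Model output as the activation-weighted sum of local models (assumed)."""
    return sum(u * (consequent_features(x, v, t) @ np.asarray(a, dtype=float))
               for u, v, a, t in zip(activations, centers, coeffs, poly_types))
```

Under these assumed forms, a local model with four inputs has 1, 5, 15, and 11 coefficients for Types 1 to 4, respectively, which illustrates why quadratic local models increase the number of coefficients to be estimated.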

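Learning of the Consequent Part Using Weighted Least Squares.
To determine the coefficients of the model, we use the weighted least squares (WLS) method. The objective function $J_L$ to be minimized is $J_L = \sum_{j} \sum_{k=1}^{m} w_{jk} (y_k - f_j(x_k - v_j))^2$, where $w_{jk}$ is the activation level of the $k$th data point in the $j$th rule, $f_j$ is the $j$th local model, $v_j$ is the center of the $j$th cluster, and $m$ is the number of data. It is clear that $J_L$ can be rearranged as [34] $J_L = \sum_{j} (Y - X_j a_j)^T W_j (Y - X_j a_j)$, where $Y$ is the vector of output data, $a_j$ is the vector of coefficients of the $j$th consequent polynomial (local model), $W_j$ is the diagonal matrix (weighting factor matrix) which involves the activation levels, and $X_j$ is a matrix which includes the input data shifted by the locations of the information granules (more specifically, the centers of the clusters). For example, if the consequent polynomial is of type 2 (linear or first-order polynomial) and there are $N$ input variables, we have $W_j = \mathrm{diag}(w_{j1}, \ldots, w_{jm})$, the $k$th row of $X_j$ is $[1, (x_{1k} - v_{j1}), \ldots, (x_{Nk} - v_{jN})]$, and $a_j = [a_{j0}, a_{j1}, \ldots, a_{jN}]^T$.
For the local learning algorithm [34], the objective function is defined as a linear combination of the squared errors, that is, the differences between the data and the corresponding output of each fuzzy rule, based on a weighting factor matrix. The weighting factor matrix $W_j$ captures the activation levels of the input data with respect to the $j$th subspace. In this sense the weighting factor matrix can be considered to form a discrete version of the fuzzy linguistic representation for the corresponding subspace.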
For the $j$th fuzzy rule, the coefficients of the polynomial in the consequent part can be written in the usual manner, namely, $a_j = (X_j^T W_j X_j)^{-1} X_j^T W_j Y$, where $Y$ is the vector of output data. Notice that the coefficients of the polynomial in the consequent part of each fuzzy rule are calculated independently by means of a certain subset of training data.
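The local WLS estimate for a single rule can be sketched with NumPy as follows. The function name is hypothetical, `X` stands for the shifted design matrix of the rule, and `w` for the activation levels of the training data in that rule. Solving the problem through a row-scaled least-squares call instead of forming the matrix inverse explicitly is a numerical choice of this sketch, not something prescribed by the text.

```python
import numpy as np

def wls_coefficients(X, y, w):
    """Minimize (y - X a)^T W (y - X a) with W = diag(w), w >= 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling each row by sqrt(w) turns WLS into ordinary least squares
    a, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return a
```

Because each rule has its own weights and design matrix, the function would be called once per fuzzy rule, which mirrors the statement that the coefficients of each rule are calculated independently.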

Optimization of the PFRBFNN Using HFC-PSSA.
In the HFC-PSSA, a solution is represented as a vector comprising the fuzzification coefficient, the number of input variables, the input variables to be selected, the number of fuzzy rules, and the polynomial type. The length of the solution vector corresponds to the maximal number of fuzzy rules to be considered in the optimization. Figure 2 offers an interpretation of the content of a solution when the upper bound of the search space for the number of fuzzy rules is set to 6. As the number of rules and the polynomial orders (in the consequent part) have to be integers, we round these values to the nearest integer. In this example, the fuzzification coefficient is equal to 1.03, the number of selected input variables is four, and the number of rules is six. The first local model is of linear type while the other three local models are linear and quadratic.
Here we consider two performance indexes, the standard root mean squared error (RMSE) and the mean squared error (MSE) [24]: PI (or EPI) $= \sqrt{\frac{1}{m} \sum_{k=1}^{m} (y_k - y_k^*)^2}$ (RMSE) or PI (or EPI) $= \frac{1}{m} \sum_{k=1}^{m} (y_k - y_k^*)^2$ (MSE), (20) where $y^*$ is the output of the fuzzy model, $m$ is the total number of data, and $k$ is the data index. The accuracy criterion MPI [24] includes both the training data and the testing data and comes as a convex combination of the two components: MPI $= \theta \times \mathrm{PI} + (1 - \theta) \times \mathrm{EPI}$, where PI and EPI denote the performance index for the training data and testing data, respectively, and $\theta$ is a weighting factor that allows us to strike a sound balance between the performance of the model on the training and testing data.
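The performance indexes and their convex combination are simple to compute; a minimal Python sketch follows. The function names and the argument name `theta` for the weighting factor are chosen here for illustration.

```python
import numpy as np

def mse(y, y_model):
    """Mean squared error between the data y and the model output."""
    y = np.asarray(y, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return float(np.mean((y - y_model) ** 2))

def rmse(y, y_model):
    """Root mean squared error."""
    return float(np.sqrt(mse(y, y_model)))

def mpi(pi, epi, theta=0.5):
    """Convex combination of training (PI) and testing (EPI) indexes."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * pi + (1.0 - theta) * epi
```

A weighting factor of 0.5 gives equal weight to training and testing accuracy, while a value of 1 would ignore the testing data altogether.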

Experimental Study
To demonstrate the effectiveness of the proposed approach, we carried out experiments on several numerical examples. In all the experiments, we set the weighting factor $\theta = 0.5$. The proposed HFC-PSSA serves as the optimization vehicle in the design of PFRBFNN. Table 1 summarizes the list of parameters and boundaries of the HFC-PSSA.

Sewage Treatment Process (STP).
The first well-known dataset comes from a sewage treatment plant in Seoul, Republic of Korea. The proposed PFRBFNN is applied to the sewage treatment process data [11], which consist of 52 input-output pairs with four input variables (MLSS, WSR, RRSP, and DOSP). This dataset has been intensively studied in the previous literature [9][10][11][12][13][14]. Here the dataset is partitioned into two parts. The first 60% of the dataset is selected as training data used for the construction of the fuzzy model. The remaining 40% of the dataset is considered as the testing dataset, which is used to quantify the predictive quality of the model. The performance index is specified as the RMSE given by (20).
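The 60%/40% partition can be sketched in Python as follows. The function name is hypothetical, and rounding the training size down is an assumption; it is, however, consistent with the abalone experiment reported later in this section, where 4,177 samples are split into 2,506 training and 1,671 testing samples. Any shuffling applied before the split is left to the caller.

```python
def split_train_test(data, train_percent=60):
    """Return (training part, testing part); the training part is the first
    train_percent percent of the data, rounded down (assumed rounding rule)."""
    n_train = (len(data) * train_percent) // 100  # integer arithmetic, no float error
    return data[:n_train], data[n_train:]
```

For example, `split_train_test(list(range(4177)))` yields parts of length 2506 and 1671.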
The optimal network, which consists of four fuzzy rules with type 2 polynomials, is obtained after 200 generations of PSSA. The polynomials in the consequence part of the four rules are as follows:
Figure 3 illustrates the resulting values of the performance index when running the PFRBFNN based on the HFC-PSSA. As could be expected, the accuracy of the PFRBFNN improves as the total number of generations increases.
The performance of the proposed model is compared with that of some other models available in the literature; refer to Table 2. The local models of the other models all share the same type of fuzzy rule, such as the constant or linear form, whereas the proposed model can have different types of local models. In this comparison, the proposed model, having a small number of rules, shows better accuracy, although it has the disadvantage of a large number of local-model coefficients when the quadratic form is selected.

Medical Imaging System (MIS).
The second dataset is a medical imaging system dataset which involves 390 software modules written in Pascal and FORTRAN; each module is described by 11 input variables [15][16][17][18][19]. Following the proposed design methodology, the given dataset is randomly partitioned to produce two datasets [15][16][17]: the first 60% of the dataset is used for training the models, and the remaining 40%, the testing dataset, is used to quantify the predictive quality (generalization ability) of the fitted models. We consider the RMSE (20) as the performance index.
After running 200 generations of PSSA, we obtain the optimal network, which consists of four fuzzy rules with type 2 polynomials. The polynomials in the consequence part of these four fuzzy rules are as follows:
Figure 4 illustrates the resulting values of the performance index when running the PFRBFNN based on the HFC-PSSA. The proposed model is also contrasted with some previously developed fuzzy models, as shown in Table 3. The proposed model performs better in terms of both its approximation and prediction abilities.

Abalone Data (ABA).
Finally, we experiment with the abalone dataset [20][21][22][23], a larger dataset consisting of 4,177 input-output pairs, in which the age of abalone is predicted on the basis of seven input variables (including length, weight, diameter, etc.). The dataset is split into two separate parts [20,21]: the fuzzy model is constructed from 2,506 data points regarded as the training set, and the rest of the dataset (i.e., 1,671 data points) is retained for testing purposes. RMSE is considered as the performance index. After running 200 generations of PSSA, we obtain the optimal network with eight fuzzy rules with type 3 polynomials, whose polynomials of the consequence part are as follows:
Figure 5 shows the performance indexes generated by means of the HFC-PSSA for the training data and testing data. Table 4 summarizes the results of the comparative analysis of the PFRBFNN using HFC-PSSA against other models. The proposed model outperforms several previous fuzzy models reported in the literature.

Conclusions
In this study, we have proposed an HFC-PSSA and have introduced a design of PFRBFNN based on the HFC-PSSA. A suite of comparative studies demonstrated that the proposed

Figure 3: Trace curves of the performance indexes produced by HFC-PSSA (STP).

Figure 4: Trace curves of the performance indexes produced by HFC-PSSA (MIS).

Table 1: List of the parameters of the HFC-PSSA and boundaries of the search space. *If the number of input variables in the dataset is smaller than 10, then we use the number of all input variables.

Table 2: Comparative analysis of selected models (STP).