Prediction of Surface Roughness in End Milling Process Using Intelligent Systems: A Comparative Study

A study is presented to model surface roughness in the end milling process. Three types of intelligent networks are considered: (i) radial basis function neural networks (RBFNs), (ii) adaptive neurofuzzy inference systems (ANFISs), and (iii) genetically evolved fuzzy inference systems (G-FISs). The machining parameters, namely, the spindle speed, feed rate, and depth of cut, are used as inputs to model the workpiece surface roughness. The goal is to get the best prediction accuracy. The procedure is illustrated using experimental data from end milling of 6061 aluminum alloy. The three networks are trained using experimental training data and then examined using a separate set of validation data. Results are compared with previously published results. It is concluded that ANFIS networks may suffer from the local minima problem, and that genetic tuning of fuzzy networks cannot ensure optimality unless suitable parameter settings (population size, number of generations, etc.) and tuning ranges for the FIS parameters are used, which can hardly be satisfied. It is shown that the RBFN model has the best performance (prediction accuracy) in this particular case.


Introduction
End milling is one of the most common metal removal operations encountered in industrial processes. It is widely used in a variety of manufacturing industries, including the aerospace and automotive sectors, where quality is an important factor in the production of slots, pockets, precision molds, and dies. The quality of the surface plays a very important role in the performance of milled parts, as a good-quality milled surface significantly improves fatigue strength, corrosion resistance, and creep life. Surface roughness also affects several functional attributes of parts, such as contact causing surface friction, wear, light reflection, heat transmission, the ability to distribute and hold a lubricant, coating, and fatigue resistance.
Conventionally, the setup parameters for the end milling operation are selected with the aid of trial cutting experiments, which are both time consuming and costly. The mechanism behind the formation of surface finish is very complicated and process dependent; therefore, it is very difficult to calculate its value through an analytical formula.
Moreover, the surface finish of the product depends on the experience of the operator and the machining environment. Therefore, there is a need for a simulation system that is capable of predicting the surface finish of a workpiece and optimizing cutting conditions.
Modeling techniques for the prediction of surface roughness can be classified into three groups: experimental models, analytical models, and artificial-intelligence-based models. Experimental and analytical models can be developed by using conventional approaches such as statistical regression techniques, which are usually called the response surface method (RSM) [1][2][3]. On the other hand, artificial-intelligence-based models are developed using nonconventional approaches such as artificial neural networks [4][5][6][7][8], fuzzy logic, genetic algorithms [9], and hybrid systems [10][11][12][13]. References [8,14] present reviews of the various methodologies and practices that are being employed for the prediction of surface roughness.
Function approximation for a set of input-output pairs has numerous scientific and engineering applications such as signal processing, image restoration, pattern recognition, control systems, and system identification [15,16]. Function approximation means modeling a desired function or an input-output relation from a set of input-output sample data. Recently, artificial-intelligence-based models have become the preferred trend, and they are applied by most researchers to develop models for near-optimal conditions in machining. Both fuzzy inference systems (FISs) and neural networks are universal function approximators. They can achieve good performance for nonlinear functions, provided that there are sufficient rules in the FIS or hidden neurons in the neural network [17].
Karayel [6] has implemented an artificial neural network (ANN) for the prediction and control of surface roughness in a CNC lathe. A feedforward multilayered neural network was developed and trained using the scaled conjugate gradient algorithm, which is a type of backpropagation. Training was stopped at 8,000 iterations, a number chosen by trial and error. Using some selected data from the experimental results, the average absolute prediction error was 0.0229. Topal [7] has studied the prediction of surface roughness in flat end milling. He used a 3-layered feedforward multilayer perceptron network trained with the backpropagation (BP) technique. The inputs are the cutting speed, feed rate, depth of cut, and the stepover ratio. The hidden layer has 10 neurons. The achieved average root mean squared prediction error (RMSE) is 0.04 after 65,000 iterations. Oktem et al. [11] have proposed an ANN model coupled with a genetic algorithm (GA). The prediction error is less than 0.0534. However, obtaining a successful ANN model depends heavily on trial and error over several design factors. Until now, there are no clear rules that could serve as a basis for producing the perfect ANN.
Roy [10] has designed an expert system using a fuzzy inference system (FIS) and a genetic algorithm (GA) so that the surface roughness in ultraprecision turning of metal matrix composite can be modeled for a set of given cutting parameters, namely, spindle speed, feed rate, and depth of cut. The maximum absolute value of the prediction error for several case studies is less than 0.053. Colak et al. [9] have proposed a gene expression programming method for predicting the surface roughness of milled surfaces in relation to the cutting parameters in CNC milling machines. Cutting speed, feed rate, and depth of cut of end milling operations were collected for predicting surface roughness. Their study does not quantify the prediction error, although some figures are presented.
Few studies have been done on the use of radial basis function networks (RBFNs) in the prediction of surface roughness in end milling. Lu [8] has used an RBFN in the prediction of surface roughness in a turning operation. The least mean squared error reached is 0.0439 in the training phase. The prediction error, however, is not clear from the contribution.
Radial basis function networks (RBFNs), as a special class of single hidden-layer feedforward neural networks, have been proved to be universal approximators [18][19][20]. One advantage of RBFNs compared with multilayer perceptrons is that the linearly weighted structure of RBFNs, where the parameters in the units of the hidden layer can often be prefixed, can be trained quickly without involving nonlinear optimization. Another advantage of RBFNs, compared with other basis function networks, is that each basis function in the hidden units is a nonlinear mapping which maps a multivariable input to a scalar value, and thus the total number of candidate basis functions involved in an RBFN model is not very large and does not increase when the number of input variables increases. With these attractive properties, RBFNs are an important and popular network model for function approximation [21].
In addition, various neurofuzzy inference systems (FISs) have also been used to determine the surface roughness in machining operations. An important issue in the application of FISs to prediction problems is to extract the structure and type of fuzzy if-then rules from the available input-output data. Given an FIS whose number and structure of fuzzy rules are known, the optimization techniques of ANNs and genetic programming can be used to tune the shape of the membership functions of the fuzzy variables and the other parameters of the fuzzy rule base.
Lo [12] has studied the implementation of an adaptive-network-based fuzzy inference system (ANFIS) to predict the workpiece surface roughness after the end milling process. Two different membership functions, triangular and trapezoidal, were adopted during the training process of the ANFISs in order to compare the prediction accuracy of surface roughness by the two networks. When a triangular membership function was adopted, the prediction accuracy of the ANFIS reached as high as 96.5%. However, he has studied only the first-order (FO) Sugeno fuzzy inference system, and the learning mechanism used in his work is not clear.
More recently, Ho et al. [13] have proposed a genetic fuzzy inference system (G-FIS). The premise and consequent parameters of the fuzzy system have been optimally determined using genetic algorithms. They used the root mean squared error as the optimality criterion. In their network, they used Gaussian membership functions to represent the input variables to the network. The trained fuzzy system can be considered a first-order (FO) Sugeno fuzzy inference system (FIS). The achieved prediction accuracy, measured by the RMSE (root mean squared error), is 3.32%.
There are various machining parameters that can affect surface roughness. Besides feed rate, spindle speed, and depth of cut, there are many other parameters such as tool geometry, vibration, workpiece hardness, surface temperature, the material being processed, cutting time, cutting forces, chip width and thickness, and even the machine with which the experiments are performed. More details can be found in [14]. In the present work, an attempt has been made to design approximating networks so that the surface finish in end milling can be modeled for a set of input cutting parameters, namely, spindle speed, feed rate, and depth of cut. They are listed in Table 1. Several neural, neurofuzzy, and genetic fuzzy networks are developed and compared to each other in order to determine the most effective method for predicting the surface roughness in the end milling process. The aim is to determine a network which can best capture the nonlinear mapping between the cutting parameters and the resulting surface roughness. Throughout this paper, the root mean square error (RMSE) is used as a measure of the prediction accuracy.
This paper proceeds as follows. In Section 2, the RBFN used in this work is developed. Section 3 introduces the ANFIS structure and its implementation to construct four different fuzzy networks. Section 4 illustrates the use of genetic algorithms to build two other optimized fuzzy systems. The experimental data sets used in this and previous studies are given in Section 5. Section 6 demonstrates the results of comparing the performance of the seven networks and previously published results. Section 7 offers our concluding remarks.

Radial Basis Function Network (RBFN) Approach
Artificial neural networks used in practice include backpropagation neural networks (BPNNs), counter propagation neural networks (CPNNs), radial basis function neural networks (RBFNs), and so forth. Even though BPNNs are widely used for a variety of systems, especially in the field of surface roughness prediction, as appears in recent articles [22][23][24][25][26], they suffer from a number of drawbacks. First, they are very slow to converge because of the use of sigmoid nonlinear transformation functions. Second, it is not always simple or straightforward to design a network topology that would accurately represent the system being modeled. The RBFN has been chosen in this work because it has been shown to be a very efficient network when function approximation is needed [5]. The following four points give its main characteristics.
(i) An RBFN is an ANN which uses radial basis functions as activation functions instead of sigmoid functions.
(ii) A radial basis function is a function whose value depends only on the distance from a center point c.
(iii) The ANN output is a linear combination of radial basis functions.
(iv) RBFNs are used in many applications like time series predictions and function approximation.
The theory of multivariable interpolation in high-dimensional space has a long history [19]. The interpolation (approximation) problem, in its strict sense, may be stated as follows. Given a set of $n$ different points $\{x_i \in \mathbb{R}^m \mid i = 1, 2, \ldots, n\}$ and a corresponding set of $n$ real numbers $\{d_i \in \mathbb{R}^1 \mid i = 1, 2, \ldots, n\}$, find a function $F : \mathbb{R}^m \to \mathbb{R}^1$ that satisfies the interpolation condition

$$F(x_i) = d_i, \quad i = 1, 2, \ldots, n. \tag{2}$$

As mentioned earlier, the three end milling parameters considered in this study are spindle speed ($S_m$), feed rate ($F_m$), and depth of cut ($D_m$). They are contained in the vector $x_i$, that is, $m = 3$. The output is the surface roughness $R_m$ contained in $d_i$, that is, $d_i = R_m$. For strict interpolation as specified here, the interpolating surface (i.e., the function $F$) is constrained to pass through all the training data points.
An RBFN is a multidimensional nonlinear function mapping that depends on the distance between the input vector and the center vector. An RBFN with an $m$-dimensional input $x \in \mathbb{R}^m$ and a single output in $\mathbb{R}^1$ can be represented, as shown in Figure 1, by the weighted summation of a finite number of radial basis functions as follows [20].
The radial basis function (RBF) technique consists of choosing a function $F$ that has the following form:

$$F(x) = \sum_{i=1}^{n} w_i \, \varphi(\lVert x - x_i \rVert), \tag{3}$$

where $\{\varphi(\lVert x - x_i \rVert) \mid i = 1, 2, \ldots, n\}$ is a set of $n$ arbitrary functions (generally nonlinear) known as radial basis functions, and $\lVert \cdot \rVert$ denotes a norm that is usually Euclidean. The known data points $\{x_i \in \mathbb{R}^m \mid i = 1, 2, \ldots, n\}$ are taken to be the centers $c_i \in \mathbb{R}^m$ of the radial basis functions, and the $w_i$ are the weight parameters to be determined. Inserting the interpolation conditions of (2) in (3), we obtain the following set of simultaneous linear equations for the unknown coefficients (weights) of the expansion $w_i$:

$$\begin{bmatrix} \varphi_{11} & \varphi_{12} & \cdots & \varphi_{1n} \\ \varphi_{21} & \varphi_{22} & \cdots & \varphi_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_{n1} & \varphi_{n2} & \cdots & \varphi_{nn} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}, \tag{4}$$

where

$$\varphi_{ji} = \varphi(\lVert x_j - x_i \rVert), \quad j, i = 1, 2, \ldots, n. \tag{5}$$

Let

$$d = [d_1, d_2, \ldots, d_n]^T, \qquad w = [w_1, w_2, \ldots, w_n]^T.$$

The $n$-by-1 vectors $d$ and $w$ represent the desired response vector and the linear weight vector, respectively, where $n$ is the size of the training sample. Let $\Phi$ denote the $n$-by-$n$ matrix with elements $\varphi_{ji}$:

$$\Phi = \{\varphi_{ji} \mid j, i = 1, 2, \ldots, n\}.$$

This matrix is called the interpolation matrix [19]. We may then rewrite (4) in the compact form

$$\Phi w = d. \tag{8}$$

Assuming that $\Phi$ is nonsingular, and therefore that the inverse matrix $\Phi^{-1}$ exists, we may solve (8) for the weight vector $w$:

$$w = \Phi^{-1} d. \tag{9}$$

The vital question is how we can be sure that the interpolation matrix $\Phi$ is nonsingular. Fortunately, previous results due to Micchelli [19,20] have shown that, for $n$ distinct points $x_1, x_2, \ldots, x_n \in \mathbb{R}^m$, a large class of radial basis functions guarantees the nonsingularity of $\Phi$. The term radial basis function is derived from the fact that these functions are radially symmetric; that is, each node produces an identical output for inputs that lie at a fixed radial distance from the center. Among those radial basis functions are the Gaussian, multiquadric, inverse multiquadric, thin plate splines, cubic splines, and linear splines. The latter is used in this work.
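The strict interpolation procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes NumPy, the linear spline basis $\varphi(r) = r$, and a handful of made-up normalized data points; the function names `interpolation_matrix`, `fit_rbfn`, and `predict_rbfn` are hypothetical.

```python
import numpy as np

def linear_rbf(r):
    # Linear spline radial basis function: phi(r) = r
    return r

def interpolation_matrix(x, centers, phi=linear_rbf):
    # Entry (j, i) is phi(||x_j - c_i||) with the Euclidean norm
    diff = x[:, None, :] - centers[None, :, :]
    return phi(np.linalg.norm(diff, axis=2))

def fit_rbfn(x_train, d_train, phi=linear_rbf):
    # Strict interpolation: the training points are the centers,
    # and the weights solve the linear system Phi w = d
    Phi = interpolation_matrix(x_train, x_train, phi)
    return np.linalg.solve(Phi, d_train)

def predict_rbfn(x_new, x_train, w, phi=linear_rbf):
    # Rectangular matrix: rows are new inputs, columns are training centers
    return interpolation_matrix(x_new, x_train, phi) @ w

# Hypothetical normalized inputs (speed, feed, depth) and roughness values
x_train = np.array([[0.50, 0.25, 0.2],
                    [0.50, 0.75, 0.6],
                    [1.00, 0.25, 1.0],
                    [1.00, 1.00, 0.2],
                    [0.67, 0.50, 0.6]])
d_train = np.array([0.42, 0.71, 0.38, 0.95, 0.55])

w = fit_rbfn(x_train, d_train)
train_pred = predict_rbfn(x_train, x_train, w)
train_rmse = np.sqrt(np.mean((train_pred - d_train) ** 2))

# Prediction for one unseen (hypothetical) cutting condition
y_test = predict_rbfn(np.array([[0.8, 0.6, 0.4]]), x_train, w)
```

Because the fitted surface passes through every training point, `train_rmse` is zero up to floating-point rounding; the quality of the model can only be judged on data that were not used as centers.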
The above RBFN, used in this study, is not unique. Several other RBFNs can be found in the literature, such as the regularization network [19] and the generalized RBFN [19,27]. However, with the above RBFN, no trial-and-error procedure is required, and the computational aspects are simple and straightforward. It is also more suitable for this study since the number of data pairs involved in the training phase is not very large. Furthermore, in the training phase, one can expect zero training error, which is usually difficult (if not impossible) to achieve, at least by the other networks discussed in this work, that is, ANFIS and G-FIS.

ANFIS-Based Fuzzy Networks
Fuzzy inference systems (FISs) are usually used as mathematical tools for approximating ill-defined nonlinear functions.
They can capture qualitative aspects of human knowledge and reasoning processes from data sets, without employing precise quantitative analysis, using the following five functional components, as shown in Figure 2 [28].
(i) A rule base containing a number of fuzzy if-then rules.
(ii) A database defining the membership functions of the fuzzy sets.
(iii) A decision-making unit as the inference engine.
(iv) A fuzzification interface which transforms crisp inputs to linguistic variables.
(v) A defuzzification interface converting fuzzy outputs to crisp output.
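As mentioned above, fuzzy inference systems (FISs) are composed of a set of if-then rules. A typical first-order (FO) Sugeno fuzzy model for the problem under consideration has the following rule set:

$$R_l: \text{IF } S_m \text{ is } A_i \text{ AND } F_m \text{ is } B_j \text{ AND } D_m \text{ is } C_k \text{ THEN } y_l = p_l S_m + q_l F_m + r_l D_m + t_l, \tag{10}$$

where $R_l$ ($l = 1, 2, \ldots, N$) denotes the $l$th rule, $A_i$, $B_j$, and $C_k$ are the linguistic terms of the inputs, and $p_l$, $q_l$, $r_l$, and $t_l$ are the consequent parameters. Similar to the work of [12,13] (for the sake of comparison), we choose $i, j, k = 1, 2, 3$, so that the number of rules $N$ is 27.
The overall output $y$ of this first-order (FO) Sugeno fuzzy system (10) is

$$y = \frac{\sum_{l=1}^{N} \tau_l \left(p_l S_m + q_l F_m + r_l D_m + t_l\right)}{\sum_{l=1}^{N} \tau_l}, \tag{11}$$

where $\tau_l$ is the firing strength of $R_l$, which is defined as

$$\tau_l = \mu_{A_i}(S_m)\, \mu_{B_j}(F_m)\, \mu_{C_k}(D_m), \tag{12}$$

where $i, j, k = 1, 2, 3$.
A zero-order (ZO) Sugeno fuzzy model has the following form:

$$R_l: \text{IF } S_m \text{ is } A_i \text{ AND } F_m \text{ is } B_j \text{ AND } D_m \text{ is } C_k \text{ THEN } y_l = u_l, \tag{13}$$

where $u_l$, $l = 1, 2, \ldots, N$, is constant. The overall output of this zero-order model is

$$y = \frac{\sum_{l=1}^{N} \tau_l u_l}{\sum_{l=1}^{N} \tau_l}. \tag{14}$$

In this and the coming sections, Gaussian membership functions are used to define the membership grade of the input variables. A Gaussian membership function is defined by

$$\mu_j(x_i) = \exp\left(-\frac{(x_i - c_j)^2}{2\sigma_j^2}\right), \tag{15}$$

where $x_i$, $i = 1, 2, 3$, are the input variables, that is, $S_m$, $F_m$, and $D_m$, and $c_j$ and $\sigma_j$ are, respectively, the center and spread (width) of the $j$th membership function, $j = 1, 2, 3$.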
With the above structure of the FO-FIS, the number of parameters in the premise part is 18 and in the consequent part is 4 × 27, so the total number of parameters is 126.With respect to ZO-FIS, the number of parameters in the premise part is also 18 and in the consequent part is 1 × 27, so the total number of parameters is 45.In this work, the two FISs are tuned and optimized by two learning algorithms; the ANFIS as discussed in this section and the genetic algorithms in Section 4.
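The 27-rule first-order Sugeno model and the parameter counts given above can be checked with a short Python sketch. It is only an illustration under stated assumptions: the Gaussian membership function is taken in the common form exp(-(x - c)^2 / (2 sigma^2)), each input has its own three (center, spread) pairs, and the names `gaussian_mf` and `sugeno_fo` as well as all numeric values are hypothetical.

```python
import math
from itertools import product

def gaussian_mf(x, c, sigma):
    # Assumed Gaussian form: exp(-(x - c)^2 / (2 sigma^2))
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def sugeno_fo(inputs, premise, consequent):
    """First-order Sugeno output for three inputs with three MFs each.

    premise[v][j] = (center, spread) of the j-th MF of input v.
    consequent[l] = (p, q, r, t) of rule l (27 rules in grid order).
    """
    s, f, d = inputs
    num = den = 0.0
    for l, (i, j, k) in enumerate(product(range(3), repeat=3)):
        # Firing strength: product of the three membership grades
        tau = (gaussian_mf(s, *premise[0][i]) *
               gaussian_mf(f, *premise[1][j]) *
               gaussian_mf(d, *premise[2][k]))
        p, q, r, t = consequent[l]
        num += tau * (p * s + q * f + r * d + t)
        den += tau
    return num / den  # weighted average of the rule outputs

# Hypothetical premise parameters: three evenly spaced MFs per input
premise = [[(0.0, 0.3), (0.5, 0.3), (1.0, 0.3)] for _ in range(3)]
# With p = q = r = 0 the model reduces to a zero-order system
consequent = [(0.0, 0.0, 0.0, 0.5)] * 27
y = sugeno_fo((0.4, 0.6, 0.2), premise, consequent)

# Parameter counts quoted in the text
n_premise = sum(2 * len(mfs) for mfs in premise)   # 3 inputs x 3 MFs x 2
n_fo = n_premise + 4 * 27
n_zo = n_premise + 1 * 27
```

Setting every consequent to the same constant makes the weighted average return that constant, which is a convenient sanity check of the normalization.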
Fundamentally, ANFIS takes an initial fuzzy inference system (FIS) and tunes it with a backpropagation algorithm based on a collection of input-output data. By tuning, we mean the optimal selection of the parameters of the input membership functions and the parameters of the consequent part of the FIS. ANFIS may use the backpropagation (BP) or the hybrid learning (HL) algorithm to identify the FIS parameters. In the hybrid learning (HL) method, a combination of least squares and backpropagation gradient descent methods is used for training the FIS membership function parameters to model a given set of input/output data. The coming two subsections discuss the architecture and the learning algorithms of ANFIS.

ANFIS Architecture.
For simplicity, we assume that the fuzzy inference system under consideration has two inputs $x$ and $y$ and one output $z$. For a first-order (FO) Sugeno fuzzy model [17], a common rule set with two fuzzy if-then rules is the following:

$$\text{Rule 1: If } x \text{ is } A_1 \text{ and } y \text{ is } B_1, \text{ then } z_1 = p_1 x + q_1 y + r_1,$$
$$\text{Rule 2: If } x \text{ is } A_2 \text{ and } y \text{ is } B_2, \text{ then } z_2 = p_2 x + q_2 y + r_2.$$

Figure 3 illustrates the reasoning mechanism for this Sugeno model; the corresponding equivalent ANFIS architecture is shown in Figure 4, where nodes of the same layer have similar functions, as described next. Here, we denote the output of the $i$th node in layer $l$ as $O_{l,i}$.
Layer 1. Every node $i$ in this layer is an adaptive node with a node function

$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2, \qquad O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4,$$

where $x$ (or $y$) is the input to node $i$ and $A_i$ (or $B_{i-2}$) is a linguistic label (such as "small" or "large") associated with this node. In other words, $O_{1,i}$ is the membership grade of a fuzzy set ($\mu_{A_i}$ or $\mu_{B_{i-2}}$), and it specifies the degree to which the given input $x$ (or $y$) satisfies the quantifier $A_i$ (or $B_{i-2}$). Here, the membership function for the inputs $x$ and $y$ can be any appropriate parameterized membership function, such as Gaussian, triangular, or trapezoidal. Parameters in this layer are referred to as premise parameters.
Layer 2. Every node in this layer is a fixed node labeled Π, whose output is the product of all the incoming signals:

$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2.$$

Each node output represents the firing strength of a rule. In general, any other T-norm operator that performs fuzzy AND can be used as the node function in this layer.
Layer 3. Every node in this layer is a fixed node labeled S. The $i$th node calculates the ratio of the $i$th rule's firing strength to the sum of all rules' firing strengths:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2.$$

The outputs of this layer are called normalized firing strengths.
Layer 4. Every node $i$ in this layer is an adaptive node with a node function

$$O_{4,i} = \bar{w}_i z_i = \bar{w}_i \left(p_i x + q_i y + r_i\right),$$

where $\bar{w}_i$ is a normalized firing strength from layer 3, and $\{p_i, q_i, r_i\}$ is the parameter set of this node. Parameters in this layer are referred to as the consequent parameters.
Layer 5. The single node in this layer is a fixed node labeled Σ, which computes the overall output as the summation of all incoming signals:

$$O_{5,1} = \sum_i \bar{w}_i z_i = \frac{\sum_i w_i z_i}{\sum_i w_i}.$$

The above statements conclude the ANFIS architecture, which is equivalent to a Sugeno FO fuzzy model. If the consequent part contains only $r_i$ ($p_i$ and $q_i$ are set to zero), $i = 1, 2$, then we get the ZO Sugeno fuzzy model. For comparison purposes, the study examines the two models.

The ANFIS Learning Algorithms.
The task of the learning algorithm is to modify all the modifiable parameters of the adaptive layers. In this study, two learning algorithms are considered: the backpropagation (BP) and the hybrid learning (HL) algorithms. In both cases, the initial FIS is generated from the training data set, using a grid partition on the data (no clustering). The central part of BP concerns how to recursively obtain a gradient vector in which each element is defined as the derivative of an error measure with respect to a parameter. This is done by means of the chain rule, a basic formula for differentiating composite functions that is covered in every textbook on elementary calculus. Once the gradient is obtained, a number of derivative-based optimization and regression techniques are available for updating the parameters, such as gradient methods, steepest descent, Newton's methods, conjugate gradient methods, and nonlinear least squares. In particular, if we use the gradient vector in a simple steepest descent method, the resulting learning paradigm is referred to as the BP learning rule [17]. The BP is used to modify the consequent parameters with the forward pass training method. The training method optimizes the consequent parameters with the premise parameters fixed. The estimation is done using the following formula:

$$E = \frac{1}{n} \sum_{k=1}^{n} \left(R_{m,k} - F_k\right)^2, \tag{23}$$

where, for the $k$th of the $n$ training pairs, $R_{m,k}$ is the expected surface roughness, $F_k$ is the ANFIS output, and $E$ is the mean square error value. When $E$ reaches the convergence condition, it will produce the inference results. Otherwise, the consequent parameters are fixed and the premise parameters are modified with the BP method.
The second training method considered in this study is based on the hybrid learning (HL) algorithm proposed by Jang [29]. The algorithm consists of a combination of the least squares estimator (LSE) and the gradient descent (GD) method. More specifically, in the forward pass of the hybrid learning algorithm, node outputs go forward until Layer 4 and the consequent parameters are identified by the least squares method. In the backward pass, the error signals propagate backward and the premise parameters are updated by the GD method. The LSE is used to modify the consequent parameters with the forward pass training method. The training method optimizes the consequent parameters with the premise parameters fixed. When E in (23) reaches the convergence condition, it will produce the inference results. Otherwise, the consequent parameters are fixed and the premise parameters are modified with the GD method. The GD method is a backward pass training method which adjusts the premise parameters toward their optimum. These premise parameters are modified corresponding to the fuzzy sets in the input domain. After the new parameters of the premise part are obtained, the output of ANFIS is calculated again by employing the consequent parameters found by the forward pass training method. Table 2 summarizes the activities in each pass.
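The forward pass of the hybrid algorithm relies on the fact that, with the premise parameters fixed, the network output is linear in the consequent parameters. The Python sketch below illustrates only this least squares step for the two-input, two-rule architecture of the previous subsection. It is not Jang's implementation: it assumes NumPy, Gaussian membership functions in the form exp(-(x - c)^2 / (2 sigma^2)), a batch solve with `np.linalg.lstsq` instead of a recursive estimator, and synthetic data; all function names are hypothetical.

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    # Assumed Gaussian form: exp(-(x - c)^2 / (2 sigma^2))
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def normalized_firing(x, y, premise):
    # premise holds one tuple (cA, sA, cB, sB) per rule
    w = np.stack([gaussian_mf(x, cA, sA) * gaussian_mf(y, cB, sB)
                  for cA, sA, cB, sB in premise], axis=1)
    return w / w.sum(axis=1, keepdims=True)      # Layer 3 outputs

def design_matrix(x, y, premise):
    # Columns per rule i: wbar_i * x, wbar_i * y, wbar_i
    wbar = normalized_firing(x, y, premise)
    base = np.column_stack([x, y, np.ones_like(x)])
    return np.hstack([wbar[:, [i]] * base for i in range(wbar.shape[1])])

def lse_consequents(x, y, target, premise):
    # Forward pass: premise fixed, consequents found by least squares
    A = design_matrix(x, y, premise)
    theta, *_ = np.linalg.lstsq(A, target, rcond=None)
    return theta.reshape(-1, 3)                   # rows are (p_i, q_i, r_i)

# Synthetic data generated from known (hypothetical) consequent parameters
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 40)
y = rng.uniform(0.0, 1.0, 40)
premise = [(0.0, 0.5, 0.0, 0.5), (1.0, 0.5, 1.0, 0.5)]
true_theta = np.array([[1.0, -0.5, 0.2], [0.3, 0.8, -0.1]])
A = design_matrix(x, y, premise)
target = A @ true_theta.ravel()

theta = lse_consequents(x, y, target, premise)
residual = np.max(np.abs(A @ theta.ravel() - target))
```

In the full hybrid algorithm this step would alternate with a gradient descent update of the premise parameters, which is not shown here.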
The hybrid learning (HL) algorithm causes the error E to converge to the convergence condition. The reader is referred to [29] for further mathematical derivation of this HL algorithm. Previous results have shown that this hybrid learning (HL) algorithm is highly efficient for optimally tuning Sugeno FISs [12,17,29].
Here, we have examined four ANFIS networks. They are the FO and ZO Sugeno FISs as defined in (11) and (14), respectively; each of them was trained by the BP and HL algorithms. The four ANFIS networks have been initialized using a grid partition of the input variables; that is, in the antecedent part, the membership functions are equally spaced inside the range of each input variable. The consequent parameters have been initialized with zeros. The coming section discusses tuning the FO and ZO Sugeno FISs using an alternative method, genetic algorithms (GAs).

Genetically Evolved Fuzzy Inference Systems (G-FISs)
Genetic algorithms (GAs) are derivative-free stochastic optimization methods based loosely on the concepts of natural selection and evolutionary processes. Their popularity can be attributed to their freedom from dependence on functional derivatives, and they are less likely to get trapped in local minima, which inevitably are present in any practical optimization application (including ANFIS). Consequently, GAs can be used to determine the optimal parameters of a fuzzy system given some optimality criteria.
The solution of an optimization problem with GAs begins with a set of potential solutions (FISs) or chromosomes (usually in the form of bit strings) that are randomly selected. The entire set of these chromosomes comprises a population. The chromosomes evolve during several iterations or generations. New generations (offspring) are generated utilizing the crossover, mutation, and elitism techniques. Crossover involves splitting two chromosomes and then combining one half of each chromosome with the other pair. Mutation involves flipping a single bit of a chromosome. Elitism is a policy of always keeping a certain number of best members when each new population is generated. The chromosomes are then evaluated employing a certain fitness criterion, and the best ones are kept, while the others are discarded. This process repeats until one chromosome has the best fitness and is taken as the optimum solution of the problem. Figure 5 is a schematic diagram illustrating how a fuzzy system can be trained using GAs. A comprehensive review of GAs can be found in [30].
In this section, the values of the premise and consequent parameters of the FO and ZO FISs are learned by minimizing the root mean squared error (RMSE) defined by

$$J = \sqrt{\frac{1}{\alpha} \sum_{i=1}^{\alpha} \left(R_{m,i} - F_i\right)^2}, \tag{24}$$

where $\alpha$ denotes the number of training data, $R_{m,i}$ is the actual experimental surface roughness (training data sets), and $F_i$ denotes the predicted surface roughness, which is the output of the FIS. This performance index has also been adopted in [13].
Because the GA endeavors to maximize the fitness function, the fitness of each chromosome is calculated as follows:

$$\text{fitness} = \frac{1}{1 + J}, \tag{25}$$

where $J$ is the performance index defined in (24) and 1 is introduced in the denominator to prevent the fitness function from becoming infinitely large.
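As a small worked illustration of the performance index and the fitness function, the following Python sketch computes the RMSE and maps it to a fitness value in (0, 1]. The function names and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
import math

def rmse(actual, predicted):
    # Performance index J: root mean squared error over the data set
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def fitness(actual, predicted):
    # Fitness 1 / (1 + J): equals 1 for a perfect fit, decreases as J grows
    return 1.0 / (1.0 + rmse(actual, predicted))

# Hypothetical normalized roughness values
measured = [0.40, 0.55, 0.70, 0.90]
perfect = fitness(measured, measured)
biased = fitness(measured, [v + 0.1 for v in measured])
```

A constant offset of 0.1 on every prediction gives J = 0.1, so the fitness drops from 1 to 1/1.1.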
Based on the aforementioned concepts, a genetic algorithm for maximization problems can be described as follows [17].
Step 1. Initialize a population with randomly generated individuals and evaluate the fitness value of each individual using (25).
Step 2. Generate the next generation as follows.
(a) Select two members from the population with probabilities proportional to their fitness values.
(b) Apply crossover with a probability equal to the crossover rate.
(c) Apply mutation with a probability equal to the mutation rate.
(d) Repeat from (a) to (c) until enough members are generated to form the next generation.
Step 3. Evaluate each member of the new generation using the fitness function (25).
Step 4. Repeat steps 2 and 3 until a stopping criterion is met.
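The steps above can be sketched as a compact Python routine. This is an illustrative sketch rather than the algorithm configuration used in the study: it assumes real-valued genes bounded by a prespecified range (instead of bit strings), one-point crossover, mutation by resampling a gene, elitism, and a fixed number of generations as the stopping criterion. The function `run_ga`, its default settings, and the toy line-fitting cost are all hypothetical.

```python
import random

def run_ga(cost, n_genes, bounds, pop_size=30, generations=60,
           crossover_rate=0.8, mutation_rate=0.1, n_elite=2, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def fitness(ind):
        return 1.0 / (1.0 + cost(ind))

    # Step 1: random initial population inside the tuning range
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):                 # Step 4: stopping criterion
        scores = [fitness(ind) for ind in pop]   # Steps 1 and 3: evaluation
        history.append(max(scores))
        ranked = [ind for _, ind in
                  sorted(zip(scores, pop), key=lambda t: t[0], reverse=True)]
        nxt = [ind[:] for ind in ranked[:n_elite]]   # elitism
        while len(nxt) < pop_size:               # Step 2(d)
            # (a) fitness-proportional (roulette wheel) selection
            a, b = rng.choices(pop, weights=scores, k=2)
            child = a[:]
            # (b) one-point crossover with probability crossover_rate
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                child = a[:cut] + b[cut:]
            # (c) mutation: resample a gene with probability mutation_rate
            for g in range(n_genes):
                if rng.random() < mutation_rate:
                    child[g] = rng.uniform(lo, hi)
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return best, fitness(best), history

# Toy problem: recover slope and intercept of a line by minimizing the RMSE
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.2 + 0.5 * x for x in xs]

def cost(ind):
    a, b = ind
    return (sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

best, best_fit, history = run_ga(cost, n_genes=2, bounds=(-1.0, 1.0))
```

Because the elite members are copied unchanged into each new generation, the best fitness recorded in `history` never decreases; how close it gets to 1 depends on the population size, rates, number of generations, and tuning range.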
Here, the GA performs only parameter learning of the fuzzy model. The structure of the genetic fuzzy inference system (G-FIS) is completely determined in advance by fixing the number of membership functions of each input variable and choosing whether the consequent part is ZO or FO. That is, the number of rules is the product of the numbers of membership functions of the input variables (full interconnection between Layers 2 and 3, Figure 4). Gaussian membership functions (15) have been utilized for the input variables. As is the case with ANFIS, the number of parameters to be determined (optimally tuned) is 126 for the FO-FIS and 45 for the ZO-FIS. Similar to the work of Ho et al. [13], the database of the antecedent and consequent parameters has been randomly initialized.
The GA is used to tune the membership functions in the premise part of the fuzzy rules and the consequent part (whether it is ZO or FO) within prespecified ranges. These ranges determine the search space of the optimization problem. Choosing large ranges increases the search space, and the optimal solution may not be reachable. Choosing small ranges restricts the search space, which may lead to an underdetermined optimization problem. A compromise has to be found.
In addition, the critical parameters of the GA are the size of the population, the crossover rate, the mutation rate, the number of iterations, that is, the number of generations (the stopping criterion used in this work), and so forth. These parameters are problem dependent [31]. A parametric study is introduced in Section 6.3 to determine the best possible numeric values of these parameters. The obtained values depend heavily on trial and error.

Experimental Data Sets
In this study, we use the experimental data published in [12,13]. A high-speed steel (HSS) four-flute end milling cutter with a diameter of 3/4 was used to machine 6061 aluminum alloy. Spindle speed, feed rate, and depth of cut were selected as the machining parameters to analyze their effect on surface roughness. A total of 48 sets were utilized as the training data for all the above algorithms. Among them, the settings of spindle speed include 750, 1000, 1250, and 1500 rpm; those of the feed rate include 6, 12, 18, and 24 ipm; and the depth of cut is set at 0.01, 0.03, and 0.05 in. They are listed in Table 3. The testing (validation) data sets are listed in Table 4. They are 24 sets which implement different feed rate settings of 9, 15, and 21 ipm. The settings for the other parameters are the same as those of the training sets.
The experimental training and testing data have been normalized in order to make them suitable for the training and validation processes [5]. This was done by mapping each term to a value between 0 and 1 by simply dividing each column in Tables 3 and 4 by the corresponding maximum value. This approach avoids the complications of other normalization criteria, which can be found in [4], and the predicted surface roughness value can be easily transformed back to its true value.
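The normalization step can be illustrated with a few lines of Python. The sketch divides each column by its maximum and shows how a normalized prediction is mapped back. The cutting parameter values are taken from the settings quoted above, whereas the roughness values in the last column are hypothetical placeholders, as are the function names.

```python
def normalize_columns(rows):
    # Divide every column by its maximum so that all values fall in (0, 1]
    maxima = [max(col) for col in zip(*rows)]
    scaled = [[v / m for v, m in zip(row, maxima)] for row in rows]
    return scaled, maxima

def denormalize(value, maximum):
    # Map a normalized prediction back to its physical value
    return value * maximum

# Columns: spindle speed (rpm), feed rate (ipm), depth of cut (in),
# surface roughness (hypothetical values)
rows = [[750.0, 6.0, 0.01, 65.0],
        [1000.0, 12.0, 0.03, 90.0],
        [1500.0, 24.0, 0.05, 140.0]]

scaled, maxima = normalize_columns(rows)
roughness_back = denormalize(scaled[0][3], maxima[3])
```

The same column maxima would have to be reused when scaling new inputs, so that training and testing data share one scale.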

Results and Discussion
In this section, results are demonstrated and discussed. The training and testing RMSE of the seven examined networks are summarized in Table 5. Findings show that the RBFN has achieved the least training error (RMSE = 0.0) and the least testing error (RMSE = 0.0295). The ANFIS network of type ZO Sugeno fuzzy model trained with BP exhibited the worst prediction results (RMSE = 0.1069). The two G-FIS networks give mixed signals about their prediction accuracy.
Table 6 gives more details about the prediction accuracy of three selected networks: the RBFN, the FO Sugeno fuzzy model trained by ANFIS using the HL algorithm, and the ZO Sugeno fuzzy model trained by GA. The coming subsections demonstrate the performances of the three types of networks implemented in this study.

Performance of the RBFN.
The RBFN described in Section 2 has been trained using the experimental data sets listed in Table 3. As mentioned earlier, we have used the linear splines radial basis function, Figure 6. This type of radial basis function does not require any trial-and-error procedure, which is necessary in a Gaussian RBFN to determine the suitable slopes of the Gaussian functions [19,20]. Since the number of training data sets is 48, the resulting interpolation matrix Φ defined in (5) is a 48 × 48 symmetric matrix, which is nonsingular by the result of Micchelli cited in Section 2. Elements of the main diagonal are zeros.
When we examined the inverse problem, we got J = 0. Afterward, the network has been examined using the testing data of Table 4. The new interpolation matrix Φ has been computed using the testing data as the input vectors. The training input data sets are used as the centers. The resulting Φ is 24 × 48. This matrix and the weight vector in Table 7 have been used to compute the predicted surface roughness, which has then been compared with its testing counterpart. We obtained RMSE = 0.0295. Errors of the 24 sets of testing data after training are plotted in Figure 7, and the scatter diagram is given in Figure 8. The latter shows that the predicted data are distributed in a narrow range around the 45° line. This means that the proposed RBFN can capture the nature of the experimental data with accuracy close to 97%.
Referring to Tables 5 and 6, results show that the prediction error of the RBFN is less than that of the two G-FIS networks (FO and ZO), despite the fact that genetic algorithms are often seen as global optimizers. This may be explained as follows. The G-FISs are optimal under the conditions of using certain parameter settings (population size, number of generations, etc.) and within the specified ranges for the fuzzy system parameters (slope, width of the Gaussian membership functions, and parameters of the consequent part). For other parameter settings and tuning ranges, other FISs are obtained and better results may be achieved. However, a trial-and-error procedure should be followed to select them. For instance, determining the suitable range for each parameter of the FIS (126 parameters for FO and 45 parameters for ZO) is a tedious and time-consuming task.
Also, the RBFN has shown better results than the FO Sugeno fuzzy model tuned by ANFIS using the HL algorithm, despite the power of this algorithm as discussed in earlier works [12,17,29]. In general, this may be attributed to the local minima problems of derivative-based optimization schemes.
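The training and testing procedure of the RBFN can be illustrated with the following sketch. It rests on several assumptions: the linear spline basis is taken as φ(r) = r, no bias term is used, and the weights are found by solving the square system directly. The input grid reproduces the 48 parameter settings quoted earlier, but the roughness values are synthetic stand-ins, not the measurements of Table 3, and only three hypothetical test points are shown.

```python
import itertools
import numpy as np

def linear_spline_matrix(points, centers):
    """Phi[i, j] = ||points[i] - centers[j]||, i.e. phi(r) = r (assumed form)."""
    diff = points[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=2)

# The 4 x 4 x 3 = 48 training settings (rpm, ipm, in), scaled by column maxima.
speeds = [750, 1000, 1250, 1500]
feeds = [6, 12, 18, 24]
depths = [0.01, 0.03, 0.05]
grid = np.array(list(itertools.product(speeds, feeds, depths)), dtype=float)
col_max = grid.max(axis=0)
x_train = grid / col_max

# Synthetic roughness values for illustration only (NOT experimental data).
y_train = 0.2 + 0.6 * x_train[:, 1] - 0.1 * x_train[:, 0]

# Training: solve Phi w = y; the training inputs serve as the centers.
phi_train = linear_spline_matrix(x_train, x_train)
weights = np.linalg.solve(phi_train, y_train)
train_residual = phi_train @ weights - y_train

# Testing: rows are new inputs, columns are still the 48 training centers.
x_new = np.array([[750, 9, 0.01], [1000, 15, 0.03], [1500, 21, 0.05]]) / col_max
phi_new = linear_spline_matrix(x_new, x_train)
y_new = phi_new @ weights
```

Since the number of centers equals the number of training sets, the network interpolates the training data exactly, which is why the training error vanishes. Generalization is then judged only on the testing data.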

Performance of ANFIS Networks.
In this subsection, we discuss the performance of four ANFIS networks. They are the FO and ZO Sugeno fuzzy models trained with the BP and HL algorithms. As mentioned in Section 3, they have been initialized using grid partition of the input variables, that is, equally spaced membership functions in the antecedent part. The consequent parameters have been initialized with zeros. In training the four ANFIS networks, the training data sets in Table 3 were used to conduct 500 cycles of learning. The prediction RMSE of the training and testing is shown in Table 5. The ZO Sugeno fuzzy model trained with the BP algorithm has demonstrated the worst performance; RMSE = 0.1067. Better results with the BP algorithm have been obtained when the FO Sugeno fuzzy model has been used; RMSE = 0.0449.
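The initialization described above can be sketched as follows for a ZO Sugeno model. Several details are assumptions rather than statements of this work: three Gaussian membership functions per input (inferred from 27 rules with three inputs), two parameters per membership function, neighboring functions crossing at a membership grade of 0.5, product inference, and weighted-average defuzzification. All names are hypothetical.

```python
import itertools
import numpy as np

N_INPUTS = 3  # spindle speed, feed rate, depth of cut
N_MFS = 3     # assumed: 3 ** 3 = 27 rules

def grid_partition(n_mfs=N_MFS, low=0.0, high=1.0):
    """Equally spaced Gaussian centers; width chosen so neighbors cross at 0.5."""
    centers = np.linspace(low, high, n_mfs)
    spacing = centers[1] - centers[0]
    sigma = spacing / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return centers, np.full(n_mfs, sigma)

def gaussian_mf(x, center, sigma):
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def zo_sugeno_output(x, centers, sigmas, consequents):
    """Weighted average of constant rule outputs (assumed product inference)."""
    x = np.asarray(x, dtype=float)
    grades = gaussian_mf(x[:, None], centers[None, :], sigmas[None, :])
    rules = itertools.product(range(len(centers)), repeat=len(x))
    firing = np.array([np.prod([grades[i, m] for i, m in enumerate(rule)])
                       for rule in rules])
    return float(firing @ consequents / firing.sum())

centers, sigmas = grid_partition()
consequents = np.zeros(N_MFS ** N_INPUTS)  # zero-initialized consequents
initial_output = zo_sugeno_output([0.5, 0.25, 0.6], centers, sigmas, consequents)

# Parameter counts implied by this structure.
premise_params = N_INPUTS * N_MFS * 2                       # 18
zo_params = premise_params + N_MFS ** N_INPUTS              # 18 + 27
fo_params = premise_params + N_MFS ** N_INPUTS * (N_INPUTS + 1)  # 18 + 27 * 4
```

Under these assumptions, the parameter counts come out as 45 for the ZO model and 126 for the FO model, which agrees with the numbers of FIS parameters quoted in this section.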
Here, in order to save space, we only present the learning results of the ANFIS network which has achieved the best prediction performance. This network is the FO Sugeno model trained using the HL algorithm; RMSE = 0.0339. The evolution of the RMSE during the learning phase is shown in Figure 9(a). As can be noticed, this error has reached a steady-state value of 0.0016 after nearly 300 epochs. Figure 9(b) shows the resulting error of the testing data. The numerical values of the antecedent part are given in Table 8, and Table 9 gives the values of the parameters of the consequent part.
Figure 10 shows the membership functions before and after training. In this figure, the membership functions of the spindle speed and feed rate have experienced relatively large changes between their initial and final forms in comparison with the membership functions of the depth of cut. This indicates that the depth of cut has the least impact on the surface roughness of the end milling process. In Lo's work [12], the same experimental data in Tables 3 and 4 has been used to train and test two ANFIS networks. The two networks are FO Sugeno models trained with the HL algorithm, and similar to this work, the number of rules is 27. He utilized triangular membership functions in the first network and trapezoidal membership functions in the second. The average error has been used to compare the prediction accuracy of the two networks. The network which used triangular membership functions has performed better. The author of this paper has used Table 3 in Lo's work [12] to compute the prediction RMSE when triangular membership functions are used. The RMSE there is 0.0347, which is very close to that of the FO Sugeno model with Gaussian membership functions trained by HL in this work (RMSE = 0.0339). It means that the Gaussian and triangular membership functions can equally capture the nature of the experimental data.
The conclusion that can be drawn from the above results is that the FO Sugeno model trained with ANFIS using the HL algorithm can achieve an accuracy of around 96.6% whether triangular or Gaussian membership functions are used.
However, this conclusion cannot be generalized. The author of reference [32] uses ANFIS to predict the surface roughness in a turning process and compares his results with a proposed response surface method (RSM). Similar to this work, the input parameters are spindle speed, feed rate, and depth of cut. The achieved prediction accuracy of the ANFIS network is better than that of the proposed RSM. He examined two kinds of membership functions, triangular and Gaussian. According to his results, better results have been achieved with triangular membership functions.
A more recent article is presented in [33]. This reference uses the same experimental data (training and testing data) as used in this work. The authors implement an ANFIS network with Gaussian membership functions for the inputs, that is, spindle speed, feed rate, and depth of cut. The training algorithm, however, is different from the one presented here. It is called the leave-one-out cross-validation algorithm and is used to obtain an optimal ANFIS network. Then, they use a "top-down" rule reduction approach to decrease the number of  [12], the proposed RBFN in this work still performs better; see Table 5.

Performance of Genetic Fuzzy Inference Systems (G-FIS).
As the performance of a GA depends on its parameters, a parametric study has been carried out to determine the optimal set of GA parameters. These parameters are the population size, the number of generations, the number of bits of each variable, the crossover probability, and the mutation probability. They are problem dependent and should be selected carefully in order to achieve good results. We have started the parametric study from the parameter set used in [13]; that is, the population size is 200, the number of generations is 200, the crossover rate is 0.9, and the mutation rate is 0.1. The number of bits which represent each variable is not mentioned there. Here, the parametric study consists of five stages and has been performed only on the ZO Sugeno fuzzy model. In the first stage, the crossover probability is varied from 0.8 to 0.99, keeping the other parameters, namely, mutation probability, population size, number of bits, and maximum number of generations, fixed to 0.1, 200, 16, and 200, respectively. Like the works of [10,13], the best result is observed with a crossover probability of 0.90. In the second stage, the mutation probability is changed in the range of 0.001 to 0.1, keeping crossover probability, population size, number of bits, and maximum number of generations fixed to 0.90, 200, 16, and 200, respectively. The best result is found with the mutation probability of 0.016. In the third stage, crossover probability, mutation probability, number of bits, and population size are kept fixed to 0.90, 0.016, 16, and 200, respectively, and the maximum number of generations is varied from 200 to 600. The best result takes place when the number of generations is 500. After this number of generations, no improvement has been noticed in the fitness function (25). In the fourth stage, the number of bits has been changed in the range of 16-64. The crossover probability, mutation probability, population size, and the maximum number of generations are kept at 0.90, 0.016, 200, and 500, respectively. It has been noticed that the number of bits has little or no impact on the results. So, we have selected the number of bits to be 32. In the last stage of the parametric study, the population size has been changed from 50 to 200, keeping the other parameters, namely, crossover probability, mutation probability, number of bits, and maximum number of generations, fixed to 0.90, 0.016, 32, and 500, respectively. The best result is observed when the population size is 80. Thus, the following GA parameters are found to give the best results during the GA-based training of the ZO Sugeno fuzzy inference system: (i) single point crossover with a probability of 0.90, (ii) bitwise mutation with a probability of 0.016, (iii) a maximum number of generations of 500, (iv) 32 bits to represent each variable of the FIS in the chromosome, (v) a population size of 80.
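The two genetic operators named in the parameter list can be sketched as below. This is a generic illustration of single point crossover and bitwise mutation with the selected probabilities, not the implementation used in this work. The function names, the random generator, and the example parents are hypothetical, and the selection scheme is not shown.

```python
import numpy as np

P_CROSSOVER = 0.90   # single point crossover probability
P_MUTATION = 0.016   # bitwise mutation probability

def single_point_crossover(parent_a, parent_b, rng, p_crossover=P_CROSSOVER):
    """Swap the tails of two bit strings at one random cut point."""
    if rng.random() >= p_crossover:
        return parent_a.copy(), parent_b.copy()
    point = rng.integers(1, len(parent_a))  # cut strictly inside the string
    child_a = np.concatenate([parent_a[:point], parent_b[point:]])
    child_b = np.concatenate([parent_b[:point], parent_a[point:]])
    return child_a, child_b

def bitwise_mutation(chromosome, rng, p_mutation=P_MUTATION):
    """Flip each bit independently with probability p_mutation."""
    flips = rng.random(len(chromosome)) < p_mutation
    return np.where(flips, 1 - chromosome, chromosome)

rng = np.random.default_rng(0)
parent_a = np.zeros(1440, dtype=int)  # ZO chromosome length, 32 x 45 bits
parent_b = np.ones(1440, dtype=int)
child_a, child_b = single_point_crossover(parent_a, parent_b, rng, p_crossover=1.0)
mutated = bitwise_mutation(parent_a, rng)
unchanged = bitwise_mutation(parent_a, rng, p_mutation=0.0)
```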
In this work, both the ZO and FO Sugeno fuzzy models have been trained using these GA parameters. So, the number of bits which constitute the chromosome (gene or a possible solution) is 32 × 126 = 4032 bits for the FO Sugeno fuzzy model. With respect to the ZO Sugeno fuzzy model, the chromosome length is 32 × 45 = 1440 bits. For the two models, the number of examined solutions is the number of generations multiplied by the population size, that is, 500 × 80 = 40,000 possible solutions. Of course, the optimal solution is the best one of these 40,000 solutions.
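The arithmetic above, together with one common way of decoding a binary-coded variable, is sketched below. The linear mapping from a 32-bit string onto a tuning range is an assumption, since the decoding rule is not stated here, and the function names are hypothetical. The range of −1 to 1 in the example is the one used for the ZO consequent parameters.

```python
BITS_PER_VARIABLE = 32

def chromosome_length(n_parameters, bits=BITS_PER_VARIABLE):
    """Total number of bits needed to encode all FIS parameters."""
    return bits * n_parameters

def decode_variable(bits, low, high):
    """Assumed decoding: map a bit sequence (MSB first) linearly onto [low, high]."""
    as_int = int("".join(str(int(b)) for b in bits), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)

fo_length = chromosome_length(126)   # FO Sugeno model
zo_length = chromosome_length(45)    # ZO Sugeno model
examined_solutions = 500 * 80        # generations x population size

lowest = decode_variable([0] * 32, -1.0, 1.0)
highest = decode_variable([1] * 32, -1.0, 1.0)
```

With 32 bits per variable, the resolution of each decoded parameter is far finer than the measurement precision, which is consistent with the observation that the number of bits has little or no impact on the results.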
In order to complete the definition of the GA optimization problem, a range of variation should be specified for each parameter of the FIS. Ranges of variation for the FO Sugeno fuzzy model are listed in Table 10. They are similar to those used in [13] to optimize a FO Sugeno model. With respect to the ZO Sugeno model, the same premise parameter ranges have been used. The consequent parameters of the ZO Sugeno fuzzy model, u_i, i = 1, 2, ..., 27, have also been trained between −1 and 1. These ranges have been selected intentionally in order to simulate an optimization problem similar to that found in [13], where the authors have used the same experimental data listed in Tables 3 and 4 for training and testing, respectively. However, different parameter settings for the GA are used here. As can be noticed from Table 5, the ZO Sugeno model (RMSE = 0.0395) has achieved better performance than the FO Sugeno model (RMSE = 0.0488). Surprisingly, the ZO G-FIS has achieved a lower testing error (RMSE = 0.0395) than training error (RMSE = 0.0485). This has been noticed in several simulation experiments for both FO and ZO (but not consistently). It should be noted here that, for the same simulation experiment, several runs do not always result in exactly the same findings. This can be attributed to the stochastic nature of GAs. Results presented here are the best obtained results.
Parameters of the optimally tuned ZO Sugeno fuzzy model are shown in Table 11. In order to save space, parameters of the optimally tuned FO Sugeno fuzzy model are not listed here.
Membership functions of the two Sugeno fuzzy models after tuning are given in Figure 11. They show large changes in the membership functions of all three end milling parameters. This is notable for the FO model relative to the corresponding one which has been trained with ANFIS using the HL algorithm (Figure 10). This may be attributed to the nature of the tuning process of GAs. They are derivative-free algorithms and, more importantly, tuning the membership functions is done randomly. Nevertheless, referring to Figure 11, the least changes have taken place in the membership functions of the depth of cut. This gives the impression that the depth of cut has the least influence on the surface roughness. It was also a concluding remark in the works of [12,13].
As mentioned earlier, in the work of Ho et al. [13], the authors used GAs to optimally tune a FO-FIS which, similar to this work, uses Gaussian membership functions for the three inputs. The number of rules (27 rules), the fitness function (24), and the tuning ranges (Table 10) are also similar to this work. The optimal FIS there has produced a prediction RMSE of 0.0332 (computed by the author of this work from Table 5 in [13]). The only difference between this work and the work of Ho et al. [13] is that we used different parameter settings for the GA. Accordingly, a different prediction error has resulted for the same data, RMSE = 0.0488. These findings reveal that the parameter settings of the GA (number of generations, population size, etc.) have a considerable influence on the resulting optimal solution and the prediction accuracy.

Conclusions
The ability to predict the surface roughness of the end milling process without carrying out actual experiments will help to develop automatic manufacturing systems. In this work, three algorithms have been examined. The aim is to determine the most effective method for the prediction of surface roughness. From the results obtained in this work, a number of concluding remarks can be summarized as follows.
(i) The RBFN has been found to be the most successful technique for surface roughness prediction, with an RMSE of 2.95%. In comparison with the other prediction algorithms examined in this work, it is the simplest and the fastest method for the problem under consideration. This kind of artificial neural network has proved to be the most effective means of capturing the nature of the training data, and the best results have been achieved when it was examined with the testing data. Unlike other types of RBFN, like the regularization and generalized RBFN, the RBFN implemented in this work does not require a trial-and-error procedure.
(ii) The prediction error achieved by the RBFN is lower than the errors reported in previous works. In [12], the RMSE using ANFIS with triangular membership functions is 3.47%, and in [13], where the FO Sugeno model is tuned by G-FIS, the RMSE is 3.32%. Only a small improvement has been achieved by the training algorithm presented in [33], where the RMSE is 3.19%.
(iii) To the best of the knowledge of the author of this work, the proposed RBFN has not been examined before in relation to the problem of surface roughness prediction. The results presented here may open the door for other applications.
(iv) With regard to ANFIS networks, the results of this work and previous results show that the type of membership function plays an important role in the prediction accuracy. Triangular and Gaussian membership functions result in similar prediction performance. Using different experimental data as in [32] resulted in different results; that is, the triangular membership functions perform better. Lower prediction accuracy has been obtained when trapezoidal membership functions are used [12]. This reveals the complexity of fuzzy systems as universal function approximators and the need for more mathematical rigor to determine the approximation nature of fuzzy systems. Nevertheless, this kind of optimized network suffers from the problem of local minima.
(v) Results of G-FIS (genetic-based fuzzy inference systems) show that ANFIS networks trained by the HL algorithm have performed better. This reinforces our conclusion that using GAs in optimization cannot ensure that a perfectly optimal solution is obtained, especially in complex systems, unless suitable parameter settings and tuning ranges are known in advance, which is difficult, if not impossible, to satisfy. Determining the suitable range for all the parameters of a FIS using a trial-and-error procedure is a tedious and time-consuming task. This is despite the fact that the obtained solution is the best one of the examined solutions, that is, the number of generations multiplied by the number of chromosomes in the population. It means that the obtained solution is the optimal one under some predefined conditions, but not necessarily the most effective solution.
(a) it is very fast in comparison to backpropagation; (b) it has the ability of representing nonlinear functions; (c) it does not experience local minima problems of back-propagation.

Figure 1: Block diagram representation of a radial basis function network (RBFN) with input x ∈ R^m and output F(x).

Figure 2: The structure of the fuzzy inference system used in this work.

Figure 3: A two-input first-order Sugeno fuzzy model with two rules.

Figure 7: The prediction error of the testing data, RBFN.

Figure 8: Scatter diagram of the testing and predicted data, RBFN.

Figure 9: Evolution of the RMSE during the learning phase of FO Sugeno FIS using ANFIS trained with HL algorithm (a) and the resulting prediction error of testing data (b).

Figure 10: Membership functions of FO Sugeno model before training (a) and after training by ANFIS using HL algorithm (b).

Figure 11: Membership functions of FO (a) and ZO (b) Sugeno fuzzy models after training with GA.

Table 1: End milling parameters used in the study.

Table 2: Two passes in the hybrid learning procedure for ANFIS.

Table 5: A summary of the training and testing results.

Table 6: A comparison of measured and predicted surface roughness of the test data.

Table 7: Values of the members of the weight vector.

Table 8: The optimal premise parameters of ANFIS FO Sugeno model trained with HL algorithm.

Table 9: The consequent parameters of ANFIS FO Sugeno model trained with HL algorithm.

Table 10: Ranges of the premise and consequent parameters of FO Sugeno fuzzy model.

Table 11: Optimal premise and consequent parameters of ZO Sugeno model tuned by GA.