A Robust Intelligent Framework for Multiple Response Statistical Optimization Problems Based on Artificial Neural Network and Taguchi Method

An important problem encountered in product or process design is the setting of process variables to meet a required specification of quality characteristics (response variables), called a multiple response optimization (MRO) problem. Common optimization approaches often begin with estimating the relationship between the response variables and the process variables. Among these methods, response surface methodology (RSM), due to its simplicity, has attracted the most attention in recent years. However, in many manufacturing cases, on one hand, the relationship between the response variables and the process variables is far too complex to be efficiently estimated; on the other hand, solving such an optimization problem with exact techniques is difficult. The alternative approach presented in this paper is to use an artificial neural network to estimate the response functions and a metaheuristic algorithm to optimize the process. In addition, the proposed approach uses the Taguchi robust parameter design to overcome the common limitation of the existing multiple response approaches, which typically ignore the dispersion effect of the responses. The paper presents a case study to illustrate the effectiveness of the proposed intelligent framework for tackling multiple response optimization problems.


Introduction
Setting the controllable input variables of an industrial process to achieve proper operating conditions is one of the common problems in quality control. The Taguchi method [1][2][3] is a widely accepted technique among industrial engineers and quality control practitioners for producing high quality products at low cost. In this regard, Ko et al. [4] employed the Taguchi method and an artificial neural network to perform design in multistage metal forming processes considering workability limited by ductile fracture. Su et al. [5] proposed a new circuit design optimization method in which a genetic algorithm (GA) is combined with the Taguchi method. Lo and Tsao [6] modified an analytical linkage-spring model based on neural network analysis and the Taguchi method to determine the design rules for reducing the loop height and the sagging altitude of the gold wire-bonding process of the integrated circuit (IC) package. In Taguchi's design method, both the control variables (factors that can be controlled by the analyst) and the noise variables (factors that cannot be controlled by the analyst) are considered influential on product quality. Therefore, the aim of the Taguchi method is to choose the levels of the control variables so as to reduce the effects of the noise variables. That is, the control variable settings should be determined so that the quality characteristic (response variable) has minimum variation while its mean is close to the desired target. Nevertheless, the Taguchi method in its original form can only be used for a single response problem; it cannot be used to optimize a multiple response problem. In most industrial problems, however, there is more than one response variable, and improving them simultaneously is very important. A common difficulty in the simultaneous optimization of response variables is that their optimality directions are different and sometimes contradictory.
International Journal of Quality, Statistics, and Reliability
Thus, optimizing the manufacturing process for one response variable may lead to nonoptimal values of the other responses. So, when dealing with multiresponse problems, the response variables have to be optimized separately (by the Taguchi method), and the optimum combination of design variables is finally determined according to the judgment of the process engineer. Therefore, it is very important to design a method that optimizes the responses simultaneously. Another important point in the optimization of the responses is estimating the relationship between the response and control variables. In many cases, regression models cannot properly estimate this relationship, and the large mean square error (MSE) of the regression models reveals the poor quality of these descriptions [7]. In most cases, this problem occurs for two reasons: (i) violation of the independence assumption of the input variables; (ii) a complex relationship between the response and control variables. In these cases, intelligent approaches (based on neural networks or on fuzzy logic) are an appropriate alternative for achieving a good estimate. In this regard, [8] proposed an approach based on neural networks to solve the quality optimization problem in Taguchi's dynamic experiment. However, this method is applicable only when there is a single response variable.
Reference [9] proposed combining neural networks with data envelopment analysis (DEA) [10] to efficiently optimize the multiple response problem in the Taguchi method. With the neural network, the signal-to-noise (SN) ratios of the responses are estimated from the known experimental data for each control variable combination, which is also called a decision making unit (DMU). Then, DEA is used to find the relative efficiency of each DMU, so that the optimal control variable combination can be identified as the one with a relative efficiency of 100%. A three-step approach presented by [11] consists of (1) using neural networks to estimate the mean square deviation (MSD) of the responses for all possible combinations of control variable levels, (2) using DEA to compute the relative efficiency of all of those combinations and selecting those that are efficient, and (3) using DEA again to select, among the efficient combinations, the one that leads to the most robust quality loss penalization. A four-step procedure to resolve the parameter design problem involving multiple responses is proposed by [12]. In this method, multiple signal-to-noise ratios are mapped into a single performance index called multiple response statistics (MRS) through a neurofuzzy based model to identify the optimal level settings for each control variable. Analysis of variance is finally performed to identify the control variables significant to the process. The above methods consider only the control variable values used in the experimental trials; therefore, they cannot find the globally optimal control variable settings over all continuous control variable values within the corresponding bounds.
Reference [13] presented an approach for solving multiresponse surface problems using neural networks. In this approach, two neural networks are used, one for discovering the optimal control factor vector and the other for estimating the responses. Although the optimal parameter setting can be obtained, the effect of the control variables on the responses cannot. A similar method based on artificial neural network (ANN) is presented by [14]. This method can be employed regardless of whether the control variables are given as levels or as real values. At the same time, the effect of the control variables on the multiple responses can also be obtained. Reference [15] proposed to use an artificial neural network to estimate the quantitative and qualitative response functions. In the optimization phase, a genetic algorithm (GA) in conjunction with a desirability function (DF) is used to determine the optimal control variable settings. Reference [16] presented a data mining approach to the dynamic multiple response problem consisting of four stages, which apply the methodologies of ANN, exponential desirability function (EDF), and simulated annealing (SA). First, an ANN is employed to construct the response model of a dynamic multiple response system by applying the experimental data to train the network. The response model is then employed to predict the corresponding quality responses by inputting specific control variable combinations. Second, each of the responses is evaluated by using EDF. Third, the EDFs are integrated into an overall performance index (OPI) for evaluating a specific control variable combination. Finally, SA is performed to obtain the optimal control variable combination within the experimental region. Another dynamic multiresponse approach is presented in [17]. This method is similar to Chang's work [16], except that the optimization phase is performed by GA, whereas in [16] it is performed by SA.
Reference [18] focused on an optimization problem that involves multiple qualitative and quantitative responses in the thin quad flat pack (TQFP) molding process. A fuzzy quality loss function is first applied to the qualitative responses. A neural network is then used to estimate the nonlinear relationship between control and response variables. A GA together with EDF is applied to determine the optimal setting. Reference [19] presented the use of fuzzy-rule base reasoning and the SN ratio for the optimization of multiple responses. The idea is to combine multiple SN ratios into a single performance index called the multiple performance statistic (MPS) output, from which the optimum level settings of the control variables can be obtained by maximizing MPS. A similar approach to [19] for optimizing the electrical discharge machining process with multiple performance characteristics has been reported by [20]. In this approach, several fuzzy rules are derived based on the performance requirements of the process. Next, the inference engine performs fuzzy reasoning on the fuzzy rules to generate a fuzzy value. Finally, the defuzzifier converts the fuzzy value into a single performance index, and the optimal combination of the machining parameter levels is determined by maximizing this index. Reference [21] formulated the MRO problem as a multiobjective decision making problem and followed the basic idea of Zimmermann's [22] method. This approach first models the responses through a multiple adaptive neurofuzzy inference system (MANFIS); then, according to the maximin approach, overall satisfaction is obtained by compromising among all the responses through membership functions. Finally, a GA is applied to search for the optimal solution on the response surfaces modeled by MANFIS.
With respect to the aforementioned approaches, it can be concluded that the major focus of these methods is on the location effect only, ignoring the dispersion effect of the responses. In other words, they assume that the variance for the responses is constant over the experimental space.
Reference [23] presented an integrated technique for the experimental design of processes with multiple correlated responses, composed of three stages: (1) use an expert system, designed for choosing an orthogonal array, to design an actual experiment; (2) use the Taguchi quality loss function to represent the relative significance of the responses, principal component analysis (PCA) to decorrelate the responses, and gray relational analysis (GRA) to synthesize the components into a single performance measure; (3) use neural networks to construct the response function model and genetic algorithms to optimize the control variable design. An artificial intelligence technique that combines PCA, GRA, and GA with ANN and uses data collected from a full factorial experimental design for the optimization of Nd:YAG laser drilling of Ni-based superalloy sheets was proposed by [24]. We note that since principal components are linear combinations of the original response variables, when PCA is conducted on quality loss values, their optimization directions might be lost. Regardless of this issue, the aforementioned methods maximize the component values. In other words, they do not correctly consider the location effect of the responses. To overcome this problem, Salmasnia et al. [25] suggested a systematic procedure via PCA and the desirability function that imposes specification limits on the responses to be achieved. Also, an AI tool, namely, ANFIS, is used to estimate the complicated relation between the inputs (design variables) and the outputs (responses), but this approach does not consider the relative importance of the responses in process optimization.
The purpose of this study is to develop a new intelligent approach that accommodates the location and dispersion effects as well as the relative importance of the responses in a single framework. The approach does not depend on the type of relationship between response and control variables, which makes it applicable in cases where these relations are unknown. Another advantage of the proposed method, in contrast to many other approaches that search only discrete regions for the optimal solution, is that it searches the experimental region continuously. We compare the characteristics of the different intelligent multiresponse approaches presented in the literature to the proposed method in Table 1.
(i) Type of solution problem (TSP).
(vii) Type of search in the experimental region (TS).
The rest of the paper is organized in the following order. Section 2 describes the proposed general intelligent approach for the design of a multiple response process that uses the Taguchi signal-to-noise ratio function, ANN and GA. In Section 3, the application of the proposed model on a case study from literature is illustrated. Finally, conclusions are reported in Section 4.

The Proposed Method
This study proposes a robust intelligent optimization procedure, based on the signal-to-noise ratio and artificial neural networks, for multiple response problems with complex relationships between response and process variables. There are various methods to optimize multiple responses, but most of them employ regression models to estimate the relationship between response and process variables. Furthermore, they neglect the dispersion effect of the responses and assume that the response variances are constant over the experimental space. This research proposes a new methodology that considers the dispersion effect as well as the location effect. In addition, an artificial neural network (ANN) is used in the model building phase to resolve the shortcomings of the abovementioned regression models and to capture nonlinearity in the relationship.
To develop the methodology, we first define the parameters and the variables used in the proposed approach. Then, the new methodology is described in detail.

The Parameters and Variables.
The parameters and the variables used throughout this paper are defined as follows:

Model Development.
The proposed method consists of three phases: (i) data gathering, (ii) response estimation, and (iii) optimization. In the first phase, by employing a proper experimental design, the significant factors are identified and then the required data are gathered. Next, in order to reduce the response variation and bring the response means close to the target values, the signal-to-noise ratios and their normalized values are calculated in each experimental run. In the response estimation phase, an estimate of the responses with respect to the design variables is calculated. To do this, an artificial neural network is used as an estimator. Finally, the third phase consists of optimizing the process using GA and finding the best solution. Figure 1 illustrates the conceptual framework of the proposed method.
Table 1: Characteristics of the intelligent multiresponse approaches in the literature and of the proposed method.
Method | Type of problem | Type of search | Estimation tool | Aggregation
[8] | Single response | Continuous | Neural network | -
Ko et al. [4] | Single response | Continuous | Neural network | -
Lo and Tsao [6] | Single response | Discrete | Neural network | -
Hsieh and Tong [13] | Multiple response | Continuous | Neural network | -
Hsieh [14] | Multiple response | Continuous | Neural network | -
Liao [9] | Multiple response | Discrete | Neural network | DEA
Chiang and Su [18] | Multiple response | Continuous | Neural network | EDF
Antony et al. [12] | Multiple response | Discrete | Neuro fuzzy | MRS
Cheng et al. [21] | Multiple response | Continuous | MANFIS | -
Lin et al. [20] | Multiple response | Discrete | Fuzzy rule base | MPS
Tarng et al. [26] | Multiple response | Discrete | Fuzzy rule base | MPS
Lu and Antony [19] | Multiple response | Discrete | Fuzzy rule base | MPS
Noorossana et al. [15] | Multiple response | Continuous | Neural network | DF
Chang and Chen [17] | Multiple response | Continuous | Neural network | EDF
Gutiérrez and Lozano [11] | Multiple response | Discrete | Neural network | DEA
Chatsirirungruang [27] | Multiple response | Continuous | Linear regression | LF
Sibalija and Majstorovic [23] | Multiple response | Continuous | Neural network | GRA
Salmasnia et al. [25] | Multiple response | Continuous | ANFIS | DF
The proposed method | Multiple response | Continuous | Neural network | WSN
Phase 1 (data gathering). This phase aims to gather the required data for training neural networks. This phase includes four steps that are described in the following.
Step 1 (identifying the significant control variables). The first step is to identify the process control variables that may influence the response(s) of interest. This can be done by experts who are familiar with the system under consideration.
Step 2 (selecting a proper design of experiment). An experiment can be defined as a test or a set of tests in which purposeful changes are made on the control variables to identify the pattern of changes that may be observed in the response variables.
Step 3 (calculating the SN ratio for responses in each experimental run). Reference [1] introduced a family of performance measures called signal-to-noise (SN) ratios. The major aim of these criteria is to simultaneously reduce the response variation and bring the response means close to the target values. According to the Taguchi method, there are three types of responses. Responses with a fixed target are called the nominal-the-best (NTB) case. In addition, the cases in which the responses have a smaller-the-better target or a larger-the-better target are called STB and LTB, respectively. For these cases, the SN ratios are defined for (i) nominal-the-best, (ii) larger-the-better, and (iii) smaller-the-better responses.
Step 4 (normalizing the SN ratio for responses in each experimental run). The normalized SN ratio (NSN) values can be computed using (4). The idea behind the normalization of the SN ratio values is to convert them into dimensionless numbers, simply because each response has different units of measurement. The NSN varies from a minimum of zero to a maximum of one (i.e., 0 ≤ NSN_ij ≤ 1).
Phase 2 (response estimation). In this phase, a neural network is built for each quality characteristic. Each of these neural networks is trained with the data of the actual experiments. Each input pattern corresponds to a control variable combination, while the output is its associated SN ratio. The two main reasons for using neural networks for this task instead of classical estimation methods (e.g., regression) are their nonparametric character and their generalization capability. Thus, on one hand, neural networks can approximate, without making any a priori assumption, any existing linear or nonlinear mapping between the control variables and the SN ratios. On the other hand, well-trained neural networks are able to estimate, with acceptable error levels, the output values for any control variable combination, not just the ones experimentally tested. This phase consists of three steps as follows.
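The SN ratio and normalization calculations of Steps 3 and 4 of the data gathering phase can be sketched as follows. The three SN functions use the standard Taguchi definitions (NTB: 10 log10 of squared mean over sample variance; LTB: −10 log10 of the mean of 1/y²; STB: −10 log10 of the mean of y²). The normalization is assumed here to be a min-max scaling over the experimental runs, which is one form consistent with 0 ≤ NSN ≤ 1; the exact expression of (4) is not reproduced in the text. The function names and the replicate data are hypothetical.

```python
import math
from statistics import mean, variance


def sn_nominal_the_best(y):
    """NTB: 10*log10(mean^2 / sample variance) over the replicates of one run."""
    return 10.0 * math.log10(mean(y) ** 2 / variance(y))


def sn_larger_the_better(y):
    """LTB: -10*log10(mean of 1/y^2)."""
    return -10.0 * math.log10(mean([1.0 / v ** 2 for v in y]))


def sn_smaller_the_better(y):
    """STB: -10*log10(mean of y^2)."""
    return -10.0 * math.log10(mean([v ** 2 for v in y]))


def normalize_sn(sn_values):
    """Min-max scaling of one response's SN ratios to [0, 1] (assumed form of Eq. (4))."""
    lo, hi = min(sn_values), max(sn_values)
    if hi == lo:
        raise ValueError("all SN ratios are identical; normalization is undefined")
    return [(s - lo) / (hi - lo) for s in sn_values]


# Hypothetical replicates of a smaller-the-better response in three runs.
replicates = [[2.1, 2.4, 2.0], [1.2, 1.1, 1.4], [3.0, 3.3, 2.9]]
sn = [sn_smaller_the_better(run) for run in replicates]
nsn = normalize_sn(sn)  # the run with the largest SN ratio receives NSN = 1
```

Under this assumed scaling, the run with the best SN ratio of a response maps to 1 and the worst to 0, so the NSN values of responses with different units become comparable.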
Step 1 (selection of the training and the testing data sets). Usually, about one-fifth of the total data is randomly selected as the test data, and the remaining data are used as the training data [30].
Step 2 (determining the topology of the neural network). Conventional supervised learning neural networks include the perceptron, the back propagation neural network (BPNN), learning vector quantization (LVQ), and the counter propagation network (CPN). In this study, the BPNN model is employed to model the relationship between response and control variables, because of its ability to achieve effective solutions in various industrial applications and the power of neural networks in modeling nonlinear and complex relationships between system inputs and outputs.
At this step, a neural network is trained for each response to estimate its relation with the control variables. Thus, the number of input neurons equals the number of control variables, and the output layer has one neuron corresponding to an NSN. The transfer function for all neurons in the hidden layer(s) is the hyperbolic tangent activation function. By definition, the NSN varies from zero to one; hence, the transfer function for the output neuron is the sigmoid function, whose output also lies between zero and one. The topology of the BP neural network with a single hidden layer used as the process model in the proposed approach is illustrated in Figure 2.
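A minimal sketch of such a network is given below: one hidden layer with hyperbolic tangent neurons, one sigmoid output neuron, and a trial-and-error search over the number of hidden neurons that keeps the network with the lowest test MSE. It is an illustration under stated assumptions, not the implementation used in the study: training uses plain online gradient descent rather than the Levenberg-Marquardt algorithm, the data are generated from a hypothetical target function, and all function names are invented for the example.

```python
import math
import random


def init_network(n_inputs, n_hidden, rng):
    """Small random weights; the last entry of each weight row is the bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    """Return (hidden activations, predicted NSN in the open interval (0, 1))."""
    hidden, output = net
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in hidden]
    z = sum(w * v for w, v in zip(output, h)) + output[-1]
    return h, 1.0 / (1.0 + math.exp(-z))


def train(net, data, epochs=500, lr=0.5):
    """Online back propagation on the squared error (assumed training rule)."""
    hidden, output = net
    for _ in range(epochs):
        for x, y in data:
            h, y_hat = forward(net, x)
            d_out = (y_hat - y) * y_hat * (1.0 - y_hat)      # sigmoid derivative
            d_hid = [d_out * output[j] * (1.0 - h[j] ** 2)   # tanh derivative
                     for j in range(len(hidden))]
            for j, hj in enumerate(h):
                output[j] -= lr * d_out * hj
            output[-1] -= lr * d_out
            for j, row in enumerate(hidden):
                for i, xi in enumerate(x):
                    row[i] -= lr * d_hid[j] * xi
                row[-1] -= lr * d_hid[j]
    return net


def mse(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def select_network(train_set, test_set, hidden_sizes, seed=0):
    """Trial and error over hidden layer sizes; keep the lowest test MSE."""
    best = None
    for n_hidden in hidden_sizes:
        net = init_network(len(train_set[0][0]), n_hidden, random.Random(seed))
        train(net, train_set)
        err = mse(net, test_set)
        if best is None or err < best[0]:
            best = (err, n_hidden, net)
    return best


# Hypothetical data: 18 runs, two control variables coded in [-1, 1].
rng = random.Random(42)
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(18)]
data = [(x, 1.0 / (1.0 + math.exp(-(1.5 * x[0] - x[1])))) for x in points]
train_set, test_set = data[:14], data[14:]   # 4 of 18 runs held out for testing
baseline_mse = mse(init_network(2, 3, random.Random(0)), train_set)
best_err, best_hidden, best_net = select_network(train_set, test_set, [2, 3, 4])
```

One such network would be fitted per response, so that each trained network returns the estimated NSN of its response for any control variable combination inside the experimental region.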
Step 3 (designing the most appropriate network architecture to estimate each quality characteristic). The numbers of neurons in the input and output layers are selected based on the dimensions of the input and output vectors, and the appropriate number of hidden layer neurons is usually set by trial and error based on indicators such as mean square error (MSE) or root mean square error (RMSE). Different back propagation networks are evaluated to discover the appropriate network. Then, for each network, the network output for the test data and the training data is compared with the observations from the experiments. Finally, the network with the lowest MSE is selected as the optimal network.
[Figure 2: Topology of the BP neural network with a single hidden layer: input layer (control variables), hidden layer, and output layer.]
Phase 3 (optimization). Once the BPNNs have been properly trained and validated, they can be used to estimate the SN ratios for all possible control variable combinations. The next step is then to optimize the process via GA. A GA is selected to perform the optimization for two important reasons. (1) Gradient-based optimization methods, such as GRG, require an explicit response surface to calculate the gradient and the direction of improvement, whereas in this method the neural network supplies estimated values rather than an explicit response surface.
(2) GA is known as a powerful heuristic search approach for the optimization of complex and highly nonlinear functions. In the rest of this phase, a robust parameter setting approach is suggested first. Then, a brief introduction to GA and its implementation steps for finding the optimal solution, shown in Figure 3, are given.

The Suggested Parameter Tuning Approach.
Metaheuristics have a major drawback: they need parameter tuning that is not easy to perform thoroughly. These parameters are not only numerical values but may also involve the choice of search components. Usually, metaheuristic designers tune one parameter at a time, and its optimal value is determined empirically. In this case, no interaction between parameters is studied. This sequential optimization strategy (i.e., one parameter at a time) does not guarantee finding the optimal setting, even if an exact optimization is performed for each parameter.
To overcome this problem, a robust parameter tuning approach based on design of experiment, the desirability function, and the Taguchi signal-to-noise ratio is suggested. The proposed method consists of three steps: (1) design of experiment, (2) aggregation of objective functions, and (3) selection of optimal setting.
Step 1 (design of experiment). In this step, the effective parameters, such as mutation and crossover probabilities, and the search operators, such as the type of selection strategy in evolutionary algorithms, are identified. Next, a proper experimental design is selected according to the number of effective parameters.
Step 2 (aggregation of objective functions). The performance of metaheuristics may be analyzed with respect to different criteria such as search time, quality of solutions, and robustness in terms of the instances. These criteria usually have different scales. Hence, they should be transformed into a scale-free value. For LTB type criteria, u_j and l_j are the desired upper and lower levels for the jth criterion, and d_ijk is the dimensionless value corresponding to the observed value of the jth criterion under the ith experimental run in the kth replication, which is called the desirability value. It assigns values from 0 to 1 to the possible values of each objective function, where a number closer to 1 is more desirable.
To aggregate several individual desirability values, the overall desirability (D) can be defined by taking the geometric mean of the individual desirability values. Therefore, the overall desirability lies between the lowest and the highest individual desirability values. If this value is 0, one or more criteria are unacceptable. The most important feature of this approach is that an obtained optimal solution does not include any objective that lies outside the acceptable limits.
Step 3 (selection of optimal setting). In order to simultaneously reduce the quality variation and bring the mean criteria close to the corresponding target values, the signal-to-noise ratio should be computed on the overall desirability value. Next, the main effects on the signal-to-noise ratios are determined. Thus, the corresponding diagram plots the factor effects on SN. The optimal factor/level combination is the one that produces the maximum SN value.
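The three tuning steps can be illustrated with the following sketch. Several points are assumptions made for the example: the desirability of an LTB criterion is taken as the linear form (y − l_j)/(u_j − l_j) clipped to [0, 1], the SN ratio applied to the overall desirability is the larger-the-better type, and the design, the criteria, and all numerical values are hypothetical.

```python
import math
from statistics import mean


def desirability_ltb(y, lower, upper):
    """Linear larger-the-better desirability clipped to [0, 1] (assumed form)."""
    if y <= lower:
        return 0.0
    if y >= upper:
        return 1.0
    return (y - lower) / (upper - lower)


def overall_desirability(ds):
    """Geometric mean of the individual desirabilities; 0 if any of them is 0."""
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(mean([math.log(d) for d in ds]))


def sn_ltb(values):
    """Larger-the-better SN ratio over the replications of one run."""
    if any(v <= 0.0 for v in values):
        raise ValueError("SN ratio requires strictly positive desirability values")
    return -10.0 * math.log10(mean([1.0 / v ** 2 for v in values]))


def best_levels(design, sn):
    """Main effects: for each factor, the level with the highest mean SN ratio."""
    choice = []
    for f in range(len(design[0])):
        by_level = {}
        for run, s in zip(design, sn):
            by_level.setdefault(run[f], []).append(s)
        choice.append(max(by_level, key=lambda lvl: mean(by_level[lvl])))
    return choice


# Hypothetical full factorial design: (mutation probability, crossover probability).
design = [(0.05, 0.6), (0.05, 0.9), (0.20, 0.6), (0.20, 0.9)]
# Hypothetical observations per run and replication: (solution quality, success rate).
observations = [
    [(0.78, 0.50), (0.74, 0.45)],
    [(0.85, 0.60), (0.83, 0.70)],
    [(0.88, 0.65), (0.80, 0.55)],
    [(0.93, 0.85), (0.94, 0.80)],
]
bounds = [(0.5, 1.0), (0.0, 1.0)]  # hypothetical (l_j, u_j) for each criterion

sn_per_run = []
for run_obs in observations:
    d_values = [overall_desirability([desirability_ltb(y, lo, hi)
                                      for y, (lo, hi) in zip(rep, bounds)])
                for rep in run_obs]
    sn_per_run.append(sn_ltb(d_values))
tuned = best_levels(design, sn_per_run)
```

Because the SN ratio is computed over the replications of each run, a parameter setting is rewarded both for a high overall desirability and for a small variation of that desirability between replications.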

Genetic Algorithm for Solution Searching.
The optimal solution is the set of control variable values that maximizes the weighted NSN (WNSN). After estimating the NSNs as functions of the control variables, a method is needed for the optimization stage. To this end, we implement a genetic algorithm. The genetic algorithm was first introduced, based on the Darwinian theory, by [31]. It is one of the powerful stochastic search approaches and is widely employed for solving complex problems. In this method, a random initial population is created and probabilistic operations are used for evolving the subsequent generations. Through crossover and mutation operations, the algorithm directs the population towards the optimal solution. The quality of each individual is assessed with a fitness function that reflects the objective function of the problem at hand. A chromosome with a better level of fitness has a higher chance of generating offspring. Through the evolution procedure, the quality of the offspring is enhanced until a predefined stopping criterion is met. The major components of a genetic algorithm are as follows: (1) initialization, including parameter calibration, (2) determining a way to encode the solutions, (3) generation of the initial population, (4) defining the operations that should be applied to the parents to generate the next populations, and (5) a way to determine the fitness function that returns the quality of the solutions found. We now apply the genetic algorithm to solve the MRO problem.
Solution Encoding. Each solution is represented by a string of real numbers in the interval [−1, 1] whose length equals the number of control variables and in which each gene indicates the value allocated to the corresponding control variable.
Initial Solution. The initial population provides the main algorithm with a starting point, which can be created by some tailored heuristics. Here, the initial population is generated randomly.
Evaluation. In this section, the control variable combinations found by GA are evaluated. To do this, the NSN values are estimated by inputting the control variable combination to the trained BPNNs. Then, (6) is applied to synthesize the obtained NSNs into a single performance measure, which is used as the fitness function. This value is returned to the main algorithm.
Selection. The selection operator chooses two individuals from the population as parents to produce the offspring by the crossover and mutation operators. The mechanism used in this section is based on the value of the chromosome fitness. A chromosome with better fitness has a greater chance of being chosen as a parent.
Crossover and Mutation Operators. Crossover is a GA operator that creates new chromosomes by exchanging some parts of the parents, so that the children have characteristics from both parents. In this study, we carry out uniform crossover, in which a probability vector is produced. The size of the vector is equal to the number of control variables. If the value of the probability is less than 0.5, the gene from the first parent is moved into the child; otherwise, the corresponding gene from the second parent is selected. In order to increase the diversity of the population, the mutation operator is used to make random variations in the chromosomes. In this study, random mutation is adopted, where a random gene is selected from the chromosome and its current value is replaced with a random number between 0 and 1. See example for random mutation as follows.
Termination Criterion. After a predefined number of iterations, the algorithm terminates.
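The GA components described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the program used in the study: the WNSN is assumed to be the weighted sum of the predicted NSNs (the expression of (6) is not reproduced in the text), selection is fitness-proportional (roulette wheel), and the mutated gene is redrawn from the encoding interval [−1, 1], whereas the text states a random number between 0 and 1. The two predictor functions are hypothetical stand-ins for the trained BPNNs, and all names and parameter values are invented for the example.

```python
import random


def wnsn(x, predictors, weights):
    """Weighted sum of the predicted NSN values (assumed form of Eq. (6))."""
    return sum(w * f(x) for w, f in zip(weights, predictors))


def roulette_select(population, fitness, rng):
    """Fitness-proportional selection; better chromosomes are chosen more often."""
    total = sum(fitness)
    if total <= 0.0:
        return rng.choice(population)
    pick, acc = rng.uniform(0.0, total), 0.0
    for individual, fit in zip(population, fitness):
        acc += fit
        if acc >= pick:
            return individual
    return population[-1]


def uniform_crossover(parent1, parent2, rng):
    """Take each gene from parent1 if its random probability is below 0.5."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent1, parent2)]


def random_mutation(chromosome, rng, low=-1.0, high=1.0):
    """Replace one randomly chosen gene with a random value in [low, high]."""
    mutated = list(chromosome)
    mutated[rng.randrange(len(mutated))] = rng.uniform(low, high)
    return mutated


def run_ga(predictors, weights, n_vars, pop_size=30, iterations=200,
           p_mutation=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_vars)]
                  for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(iterations):  # termination: fixed number of iterations
        fitness = [wnsn(ind, predictors, weights) for ind in population]
        for ind, fit in zip(population, fitness):
            if fit > best_fit:  # remember the best chromosome seen so far
                best, best_fit = list(ind), fit
        offspring = []
        while len(offspring) < pop_size:
            child = uniform_crossover(roulette_select(population, fitness, rng),
                                      roulette_select(population, fitness, rng), rng)
            if rng.random() < p_mutation:
                child = random_mutation(child, rng)
            offspring.append(child)
        population = offspring
    return best, best_fit


# Hypothetical stand-ins for two trained networks; both return values in [0, 1].
def nsn_1(x):
    return 1.0 - sum((v - 0.5) ** 2 for v in x) / (2.25 * len(x))


def nsn_2(x):
    return 1.0 - sum((v + 0.2) ** 2 for v in x) / (1.44 * len(x))


weights = [0.5, 0.5]  # equal relative importance of the two responses
best, best_fit = run_ga([nsn_1, nsn_2], weights, n_vars=3)
```

Changing the weight vector changes the fitness landscape, so the returned control variable combination is expected to shift toward the optimum of the response with the larger weight.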

Numerical Illustration
In order to demonstrate the application of the proposed approach, in this section, a simulation study is carried out on the example given in [28]. In this example, there are two response variables (y1, y2) and five control variables (x1, x2, x3, x4, x5). It is assumed that y1 and y2 have the same relative importance and are smaller-the-better and larger-the-better, respectively. Five control variables, each with three levels, are allocated sequentially to an L18 orthogonal array. The experiments are conducted in random order.
The experimental data were analyzed by strictly following the proposed method. Table 2 shows the experimental observations. Table 3 displays the SN ratios and NSN values for each response, computed with the formulas of the data gathering phase.
According to the proposed method, the next step is the estimation of the NSNs of the different characteristics as functions of the control variables using neural networks. Since neural networks with one or two hidden layers are able to describe any nonlinear relationship between inputs and outputs and, on the other hand, increasing the number of layers leads to a rapid growth in the number of network parameters and thus complicates the identification of a suitable neural network [32], we limit our study to two-layer networks. In order to discover the appropriate neural networks, feed forward back propagation networks with different parameters were tested for each response variable. The appropriate networks with the lowest MSE values are presented in Table 6. In both networks, the middle layers use the hyperbolic tangent activation function and the output layers use the sigmoid activation function. The training algorithm in both networks is Levenberg-Marquardt, and the ratio of the test data to the whole data is 22.22% for both networks.
The regression models considered for simulating the process and generating the data, fitted using MINITAB 15 software, are given in Tables 4 and 5.
The MSE of the two regression models is computed and presented in Table 7. As can be seen, the MSE of the regression models is high, which indicates a poor fit of the models. In contrast, the two neural networks produce considerably lower MSE values. Therefore, neural networks can estimate the process function more accurately.
At the final stage, a GA optimization algorithm is run on the Matlab platform. A GA program is usually time consuming and needs many iterations to converge. However, for the present experiment, a relatively good solution was almost always obtained within 1000 iterations, or in approximately 6 minutes. The GA program is executed over 20 runs to determine the optimal control variable setting, and the best solution obtained is (x1, x2, x3, x4, x5) = (0.9, 1, −1, −0.9, 0.88).
As mentioned before, the relative importance (RI) of the responses is a main issue in MRO, but it has received little attention in the intelligent approaches in the literature. In order to illustrate its effect on process optimization, we solve the problem again with different weight vectors; the results are given in Table 8. As expected, the optimal values of the factors and the WNSN vary with the weight vector. Next, a comparative study between the proposed method and some major studies is presented. It shows the effectiveness of the proposed method against popular approaches in the literature. The comparison is conducted on the WNSN, for which a higher value indicates a more desirable result. The results are summarized in Table 9.
As mentioned before, the approaches of Noorossana et al. [15] and Chang and Chen [17] emphasize only the location effect of the responses. Consequently, these approaches perform poorly in reducing the variances of the responses and also yield a poor WNSN value.
Although Lin et al. [20], Tong et al. [28], and Tong et al. [29] have improved their results by considering variance in their methods, they only consider discrete level combinations.
Results of the numerical example support the claim that the proposed method, in contrast to other methods, considers both the mean and the variance of the responses and searches the experimental region continuously.

Conclusion
To overcome the weakness of polynomial regression models in estimating the relationships between control and response variables in complex processes, and the difficulties of exact optimization methods in solving such problems, a new approach based on neural networks and a genetic algorithm was presented. In addition to covering the weaknesses mentioned, the approach provides four other merits: (1) it reduces the uncertainty in the process; (2) estimating the relationship between control and response variables using traditional statistical approaches requires some statistical assumptions, while the proposed approach is able to estimate such relationships without any assumptions; (3) unlike most existing approaches, this general method solves optimization problems with multiple response surfaces while considering the relative importance of the response variables; (4) the proposed approach involves no undesirable mathematical complexity. As future research, qualitative variables can be considered as well as quantitative ones. Furthermore, it could be interesting to incorporate the correlation among responses and also the variance of the predicted responses into the proposed approach.