Novel Two-Stage Method for Low-Order Polynomial Model

One of the most popular statistical models is the low-order polynomial response surface model, i.e., a polynomial of first or second order. Such polynomials can be used as global metamodels in weakly nonlinear simulation to approximate the global tendency of the response and as local metamodels in response surface methodology (RSM), which has been studied in various engineering design and analysis applications. The order of the selected polynomial determines the number of sampling points (input combinations) and the resulting accuracy (validity, adequacy). This paper derives a novel method that obtains an accurate high-order polynomial while requiring fewer sampling points. The method uses a two-stage procedure in which the second stage modifies the low-order polynomial estimated in the first stage; this second stage does not require new points. We evaluate the performance of the method numerically using several test functions. These numerical results show that the proposed method can provide more accurate predictions than the traditional method.


Introduction
Metamodels are essentially simple approximation functions of simulation models of real systems. The currently most common use of metamodels is to reduce computational cost significantly by substituting for time-consuming evaluations in computationally intensive tasks [1]. A second use of metamodels is to smooth objective-function surfaces and deal with noisy data, which can facilitate the use of gradient-based methods in optimization problems [2,3]. Another increasingly popular use of metamodels is to deal with missing data and to gain insight into the contribution of each input variable to the associated output variables [4]. Metamodels can also be used to reduce numerical instability [5]. Moreover, metamodels can serve as calibration methods for low-fidelity simulations of limited accuracy [6]. Because of these benefits, metamodels have been extensively researched and employed in various applications, including engineering design, analysis, and optimization [7][8][9][10].
Up to now, various types of metamodeling techniques have been proposed, such as radial basis functions (RBF) [11], polynomial response surfaces [12], Kriging [13,14], support vector regression (SVR) [15,16], multivariate adaptive regression splines (MARS) [17,18], and artificial neural networks (ANN) [19]. One of the most popular metamodeling techniques is the polynomial response surface, which was first proposed by Box and Wilson [20] and later discussed in detail by Myers, Montgomery, and Anderson-Cook [12]. Polynomials have great advantages in efficiency of model construction as well as in transparency, meaning that the functional relationships between the input variables and the associated output variables can be obtained easily [21]. These polynomials can be used not only as local metamodels in response surface methodology (RSM) but also as global metamodels in weakly nonlinear simulation to approximate the global tendency. Therefore, they have been widely utilized in both academic research and engineering applications [22][23][24][25].
Generally speaking, the order of the selected polynomial strongly influences the number of sampling points as well as the resulting accuracy (validity, adequacy). As the order of the polynomial increases, the polynomial response surface becomes more accurate in approximating highly nonlinear problems. However, the number of sampling points needs to increase sharply, which may be impractical for high-fidelity simulations. We should note that the mean squared error (MSE) may also increase.
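To make this growth concrete (our illustration, not the paper's): a full polynomial of order k in d variables has C(d + k, k) coefficients, so a least-squares fit needs at least that many sampling points. A short sketch:

```python
from math import comb

def n_coeffs(d: int, k: int) -> int:
    """Number of coefficients in a full polynomial of order k in d variables:
    C(d + k, k). A least-squares fit needs at least this many sampling points."""
    return comb(d + k, k)

# Growth with order for a 10-variable problem:
for k in (1, 2, 3):
    print(f"order {k}: {n_coeffs(10, k)} coefficients")
```

For d = 10 this gives 11, 66, and 286 coefficients for orders 1, 2, and 3, which matches the formulas (d + 1), (d + 1)(d + 2)/2, and (d + 1)(d + 2)(d + 3)/6 used later in the paper.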
This paper derives a novel method to obtain an accurate high-order polynomial while requiring fewer sampling points. The proposed approach is based on a two-stage procedure. The second stage modifies the low-order polynomial constructed in the first stage by utilizing its feedback. It is noted that the second stage does not require new sampling points.
The remaining sections of this paper are organized as follows. Firstly, we analyze the basic theory and characteristics of the polynomial response surface. Secondly, we introduce the modeling process of the improved polynomial response surface. Thirdly, we present the detailed scheme of the numerical experiments. Then, we analyze the results and discuss the performance of the proposed method. Finally, we conclude our paper with a summary and suggestions for future work.

Polynomial Response Surface
The polynomial response surface is mainly used to develop an approximate functional relationship between a number of input variables and an associated response. The relationship can usually be written as

$$y(\mathbf{x}) = \hat{y}(\mathbf{x}) + \varepsilon = \mathbf{z}^T \boldsymbol{\beta} + \varepsilon, \qquad (1)$$

where y denotes the true response, x = (x_1, x_2, ..., x_d)^T denotes the vector of input variables, ŷ(x) denotes the approximate response, ε denotes a random error, β = (β_0, β_1, ..., β_{p−1})^T denotes a vector of p constant coefficients, and z = (1, z_1, z_2, ..., z_{p−1})^T denotes a polynomial basis-function vector with p elements that consists of the constant term, the terms x_1, x_2, ..., x_d, and cross-products of these terms up to a certain order.
The key step of the polynomial response surface is to estimate β by the least-squares method. In detail, an appropriate design of experiment (DOE) should first be chosen, from which a series of n (≥ p) samples is obtained. The totality of these samples can be represented by an n × d design matrix D:

$$\mathbf{D} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)^T, \qquad (2)$$

where x_i = (x_{i,1}, x_{i,2}, ..., x_{i,d})^T (i = 1, 2, ..., n) denotes the i-th sampling point.
Second, the polynomial basis-function vectors z_1, z_2, ..., z_n corresponding to the sampling points x_1, x_2, ..., x_n should be calculated. The totality of these vectors can be represented by an n × p matrix Z:

$$\mathbf{Z} = (\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_n)^T. \qquad (3)$$

Third, the true response y_i should be observed or measured for each sampling point x_i by conducting numerical simulations or physical experiments. The totality of these responses can be represented by an n × 1 vector y:

$$\mathbf{y} = (y_1, y_2, \ldots, y_n)^T. \qquad (4)$$

Then, from (1), we have

$$y_i = \mathbf{z}_i^T \boldsymbol{\beta} + \varepsilon_i, \quad i = 1, 2, \ldots, n, \qquad (5)$$

where ε_i denotes the random error at the i-th sampling point. Equation (5) can be expressed in matrix form as

$$\mathbf{y} = \mathbf{Z} \boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad (6)$$

where ε = (ε_1, ε_2, ..., ε_n)^T. Next, the sum of squared residuals (SSR) E is calculated:

$$E = \boldsymbol{\varepsilon}^T \boldsymbol{\varepsilon} = (\mathbf{y} - \mathbf{Z}\boldsymbol{\beta})^T (\mathbf{y} - \mathbf{Z}\boldsymbol{\beta}). \qquad (7)$$
Besides, E needs to be minimized; therefore the following condition must be satisfied:

$$\frac{\partial E}{\partial \boldsymbol{\beta}} = -2 \mathbf{Z}^T (\mathbf{y} - \mathbf{Z}\boldsymbol{\beta}) = \mathbf{0}. \qquad (8)$$

Finally, β can be estimated from (8):

$$\hat{\boldsymbol{\beta}} = (\mathbf{Z}^T \mathbf{Z})^{-1} \mathbf{Z}^T \mathbf{y}. \qquad (9)$$
In this way, the polynomial response surface has been constructed.
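The construction above can be sketched in a few lines of Python/NumPy (the helper names are ours; for numerical stability we use `lstsq` rather than forming (Z^T Z)^{-1} explicitly):

```python
import numpy as np

def basis_2nd(X):
    """Second-order polynomial basis Z for an n x d design matrix X:
    constant, linear, squared, and cross-product terms."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_beta(X, y):
    """Least-squares estimate beta_hat = (Z^T Z)^{-1} Z^T y, solved via lstsq."""
    Z = basis_2nd(X)
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

def predict(X, beta):
    """Evaluate the fitted response surface y_hat = z(x)^T beta_hat."""
    return basis_2nd(X) @ beta
```

If the true response is itself a second-order polynomial, this fit reproduces it exactly (up to floating-point error), which is the behavior the paper later observes for QP-3.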

Improved Polynomial Response Surface
The first-order and second-order polynomial models are the two most popular polynomial response surfaces. Although polynomials of order higher than two can also be employed, the number of sampling points needs to increase sharply as the order increases. To overcome this difficulty, we propose an improved method. The core idea is to start with a low-order polynomial and refit it into a high-order polynomial in a second, successive fitting by using the feedback of the initial simple fitting. No new sampling points are needed in the second fitting.
In detail, the improved method involves the following steps: (1) Choose an appropriate DOE and generate a series of samples, namely, the design matrix D. Further information about DOE can be found in a large number of references [12,[26][27][28][29][30]].
(2) Conduct numerical simulations or physical experiments to observe or measure the true response vector y for all the sampling points obtained from step (1).
(3) Construct the initial low-order polynomial response surface ŷ. The first-order and second-order polynomial models are selected in this paper and denoted by 1RS and 2RS, respectively.
(4) Choose an appropriate method to modify the initial polynomial response surface ŷ and construct the improved model ŷ′. The improved polynomial response surfaces corresponding to the initial models 1RS and 2RS are denoted by 1IRS and 2IRS, respectively.
(5) Use the improved model ŷ′ to predict the response.

Model Construction.
This paper proposes a method to correct the initial low-order polynomial and construct the corresponding high-order polynomial. The method can be expressed as

$$\hat{y}'(\mathbf{x}) = \hat{y}(\mathbf{x})\,(\mathbf{c}^T \boldsymbol{\alpha}) + \mathbf{c}^T \boldsymbol{\gamma}, \qquad (10)$$

where α and γ are two vectors of d + 1 constant coefficients and c is a first-order polynomial basis-function vector with d + 1 elements. They can be written as

$$\boldsymbol{\alpha} = (\alpha_0, \alpha_1, \ldots, \alpha_d)^T, \quad \boldsymbol{\gamma} = (\gamma_0, \gamma_1, \ldots, \gamma_d)^T, \quad \mathbf{c} = (1, x_1, \ldots, x_d)^T. \qquad (11)$$

The idea of the method is to treat the response of the low-order polynomial as feedback, multiply it by a linear regression model, and add another, different linear regression model. Essentially, a second-order polynomial is obtained when the method is applied to 1RS, and a third-order polynomial when it is applied to 2RS. Note that 2RS has (d + 1)(d + 2)/2 different coefficients, whereas the correction method has 2(d + 1) different coefficients. Therefore, at least (d + 1)(d + 2)/2 sampling points are needed to obtain a second-order polynomial with the traditional method, while at least (d + 1)(d + 2)/2 sampling points (when d > 1) are needed to obtain a third-order polynomial with the improved method. In contrast, at least (d + 1)(d + 2)(d + 3)/6 sampling points are needed to obtain a third-order polynomial with the traditional method. Obviously, the improved method can obtain a high-order polynomial with fewer sampling points.
We now construct the improved model ŷ′. It should be noted first that there is no need to generate new sampling points, because the design matrix D expressed in (2) as well as the true response vector y expressed in (4) can be used to construct not only the initial model ŷ but also the improved model ŷ′.
From (1) and (10), we get

$$y_i = \hat{y}'_i + \varepsilon'_i = \hat{y}_i\,(\mathbf{c}_i^T \boldsymbol{\alpha}) + \mathbf{c}_i^T \boldsymbol{\gamma} + \varepsilon'_i, \quad i = 1, 2, \ldots, n, \qquad (12)$$

where ŷ′_i denotes the response of the improved model at the i-th sampling point, ε′_i denotes the error term of the improved model at the i-th sampling point, and ŷ_i denotes the response of the initial model at the i-th sampling point. Equation (12) can be transformed into

$$y_i = \mathbf{u}_i^T \boldsymbol{\theta} + \varepsilon'_i, \qquad (13)$$

where

$$\mathbf{u}_i = \left(\hat{y}_i \mathbf{c}_i^T,\; \mathbf{c}_i^T\right)^T, \quad \boldsymbol{\theta} = \left(\boldsymbol{\alpha}^T, \boldsymbol{\gamma}^T\right)^T. \qquad (14)$$

The totality of the equations represented by (13) can also be expressed in matrix form as

$$\mathbf{y} = \mathbf{U} \boldsymbol{\theta} + \boldsymbol{\varepsilon}', \qquad (15)$$

where

$$\mathbf{U} = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n)^T, \quad \boldsymbol{\varepsilon}' = (\varepsilon'_1, \varepsilon'_2, \ldots, \varepsilon'_n)^T. \qquad (16)$$

The least-squares method is employed to estimate θ. Similar to (9), we get

$$\hat{\boldsymbol{\theta}} = (\mathbf{U}^T \mathbf{U})^{-1} \mathbf{U}^T \mathbf{y}. \qquad (17)$$

In this way, the improved model has been constructed. It can be expressed as

$$\hat{y}'(\mathbf{x}) = \hat{y}(\mathbf{x})\,(\mathbf{c}^T \hat{\boldsymbol{\alpha}}) + \mathbf{c}^T \hat{\boldsymbol{\gamma}}. \qquad (18)$$
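A minimal sketch of the two-stage fit under our reading of the derivation above (helper names are ours, not the paper's): stage 1 fits the low-order model, and stage 2 reuses the same points to regress y on the stacked regressors [ŷ·c, c].

```python
import numpy as np

def first_order_basis(X):
    """c_i = (1, x_1, ..., x_d)^T for each row of the n x d design matrix X."""
    return np.column_stack([np.ones(len(X)), X])

def fit_improved(Z, y, C):
    """Two-stage fit. Z: n x p low-order basis matrix, C: n x (d+1) first-order
    basis matrix. Stage 1 estimates beta by least squares; stage 2 reuses the
    SAME points, regressing y on U = [yhat * c, c] to get theta = (alpha, gamma)."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)   # stage 1: initial model
    yhat = Z @ beta                                # feedback of the initial fit
    U = np.hstack([yhat[:, None] * C, C])          # stacked second-stage regressors
    theta, *_ = np.linalg.lstsq(U, y, rcond=None)  # stage 2: correction
    k = C.shape[1]
    return beta, theta[:k], theta[k:]              # beta_hat, alpha_hat, gamma_hat

def predict_improved(Z_new, C_new, beta, alpha, gamma):
    """Improved prediction y' = yhat * (c^T alpha) + c^T gamma at new points."""
    yhat = Z_new @ beta
    return yhat * (C_new @ alpha) + C_new @ gamma
```

Since α = (1, 0, ..., 0)^T and γ = 0 reproduce the initial model, the second-stage least-squares fit can never increase the training-set sum of squared residuals.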

Benchmark Problems for Global Performance.
To test the global performance of the proposed method, we employ nine benchmark problems that are often used in the relevant literature. The dimensions of these problems range from 2 to 20.
To facilitate the description, we use short labels for these benchmark problems. In detail, the 2-variable Goldstein-Price function is denoted by GP-2; the 2-variable Branin-Hoo function is denoted by BH-2; the 3-variable Perm function is denoted by PM-3; the 3

Benchmark Problems for Local Performance.
To test the local performance of the proposed method, we employ three benchmark problems, which are polynomials of second order, third order, and fourth order.
(1) Quadratic-Polynomial Function. This benchmark problem, denoted by QP-3, is a full second-order polynomial in three variables. (2) Cubic-Polynomial Function. This benchmark problem is the same as CP-3, which is described in (22).

Numerical Procedure.
When low-order polynomials are used as local metamodels in response surface methodology (RSM), resolution-III (R-III) designs, central composite designs (CCDs), and Box-Behnken designs are considered the most suitable DOE techniques [10]. When low-order polynomials are used as global metamodels in weakly nonlinear simulation to approximate the global tendency, the Latin hypercube sampling (LHS) technique is one of the most popular choices in both scientific research and engineering problems [21,31,33,34]. LHS with the maximin criterion maximizes the minimum distance between the sampling points to obtain uniform designs; moreover, its projections onto each variable axis give uniform points. This paper first discusses the global performance of the improved method. Therefore, we first utilize the MATLAB (2011) routine "lhsdesign" with the "maximin" criterion to select the locations of the sampling points for all the benchmark problems discussed above.
The performance of metamodels may vary from one DOE to another. To reduce this random effect, we select 1000 training sets and 1000 corresponding test sets for each benchmark problem. For a given training set, the number of points is chosen as triple the number of coefficients in a second-order polynomial model (namely, 3(d + 1)(d + 2)/2, where d denotes the dimension of the benchmark problem), following Jin, Chen, and Simpson [21]. Meanwhile, 1000 test points are selected for each test set by using LHS. The performance of the metamodels on each benchmark problem is estimated by the mean over the 1000 replicates.
In summary, detailed information about the training and test data used for each benchmark problem is listed in Table 1.
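The sampling scheme above can be sketched in Python with scipy (an approximation of MATLAB's lhsdesign with the 'maximin' criterion; scipy's LatinHypercube has no built-in maximin option, so the restart heuristic and the function name are our own):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import qmc

def maximin_lhs(n, d, n_restarts=50, seed=0):
    """Among several Latin hypercube draws, keep the one that maximizes the
    minimum pairwise distance -- a simple stand-in for lhsdesign's 'maximin'."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_restarts):
        X = qmc.LatinHypercube(d=d, seed=rng).random(n)
        score = pdist(X).min()
        if score > best_score:
            best, best_score = X, score
    return best

d = 4                                # e.g., a 4-variable benchmark problem
n_trn = 3 * (d + 1) * (d + 2) // 2   # training-set size used in the paper
X_trn = maximin_lhs(n_trn, d)        # points in [0, 1]^d; scale to the design space
```

By construction, the projection of the design onto each variable axis places exactly one point in each of the n equal strata, which is the LHS uniformity property mentioned above.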

Global Performance of the Improved Method.
Reviewing the relevant literature [35][36][37][38][39][40], we select the root mean squared error at the test points (RMSE_tst) as the validation metric to measure the performance of the improved method on the benchmark problems. RMSE_tst is defined as

$$\mathrm{RMSE}_{\mathrm{tst}} = \sqrt{\frac{1}{n_{\mathrm{tst}}} \sum_{i=1}^{n_{\mathrm{tst}}} \left(y_i - \hat{y}_i\right)^2},$$

where n_tst denotes the number of test points.
Our main concern is the comparison between each traditional model and its corresponding improved model, namely, 1RS versus 1IRS and 2RS versus 2IRS. Although we do not think it is necessary to compare 1RS with 2RS or 1IRS with 2RS, we still note that 2RS may outperform 1RS on all the benchmark problems, while the performance of 1IRS may be close to that of 2RS on some particular problems.
The RMSE_tst of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) for the different benchmark problems is shown in Figure 1. The boxplots give a graphical depiction of how the prediction errors vary over the 1000 training and test sets. Each plot is composed of a box, upper and lower whiskers, and outliers. The top edge of the box represents the 75th percentile, the interior line the median, and the bottom edge the 25th percentile. The whiskers extend from the edges of the box to the most extreme data points that are not considered outliers. The outliers are data points beyond the whiskers and are plotted with "+" symbols.
From Figure 1 we can see the following. (1) Considering the median value over the 1000 test sets, 1IRS performs better than 1RS on seven benchmark problems. For the other two problems (HM-4 and DP-15), 1IRS gives results similar to 1RS. In this paper, results are called similar if the difference is between -1% and 1%.
(2) 2IRS is more accurate than 2RS on seven benchmark problems. For the other two problems (DP-15 and WE-20), 2IRS gives results similar to 2RS. (3) 2RS performs better than 1RS on all the benchmark problems, yet the accuracy of 1IRS is close to that of 2RS for GP-2 and PM-3. In particular, the accuracy of 1IRS is even better than that of 2RS for CP-3 and ZH-10. (4) For BH-2, PM-3, and HM-4, 1IRS has larger outliers (or longer tails) than 1RS, and 2IRS has larger outliers than 2RS. Therefore, the improved models have larger variance than the traditional models. The detailed mean and COV (coefficient of variation) values of RMSE_tst over the 1000 test sets are given in Appendix A.

Why Does 2IRS Perform Better Than 2RS, and 1IRS Better Than 1RS?
We think the reason is that the improved method can obtain highly nonlinear terms with fewer sampling points than the traditional method. For example, when the number of sampling points is less than (d + 1)(d + 2)(d + 3)/6, it is impossible to obtain a third-order polynomial with the traditional method. With the improved method, however, a third-order polynomial can be obtained as long as the number of sampling points is at least (d + 1)(d + 2)/2. We should note that (d + 1)(d + 2)/2 is less than (d + 1)(d + 2)(d + 3)/6 when d > 1.

Effect of Validation Metrics.
The choice of validation metric may influence the results. To reduce this source of uncertainty as much as possible, we select another four commonly used validation metrics: the root mean squared error at the training points (RMSE_trn), the correlation coefficient at the test points (R), the average absolute error at the test points (AAE), and the max absolute error at the test points (MAE). The detailed definitions and results for these metrics are given in Appendices B-E. Here we gather all five validation metrics for all nine benchmark problems in Table 2.
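Using their standard definitions (see Appendices B-E), the four error metrics and the correlation coefficient can be computed as follows (a sketch; the function name is ours):

```python
import numpy as np

def metrics(y, yhat):
    """RMSE, R, AAE, MAE between observed y and predicted yhat.
    RMSE_trn and RMSE_tst use the same formula on training/test points."""
    e = y - yhat
    rmse = np.sqrt(np.mean(e ** 2))   # root mean squared error
    r = np.corrcoef(y, yhat)[0, 1]    # correlation coefficient
    aae = np.mean(np.abs(e))          # average absolute error
    mae = np.max(np.abs(e))           # max absolute error
    return rmse, r, aae, mae
```

Note that RMSE, AAE, and MAE are smaller-is-better, while R is larger-is-better, which matters when reading the comparisons below.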
From Table 2 we can see the following. (1) For GP-2, BH-2, PM-3, PS-4, and ZH-10, all five metrics show that 1IRS is more accurate than 1RS and 2IRS is more accurate than 2RS. (2) For CP-3, all five metrics show that 1IRS is more accurate than 1RS; meanwhile, four metrics indicate that 2IRS is more accurate than 2RS. (3) For HM-4, two metrics show that 1RS is more accurate than 1IRS; meanwhile, just one metric indicates that 2RS is more accurate than 2IRS. (4) For DP-15, only one metric shows that 1RS is more accurate than 1IRS; meanwhile, just one metric indicates that 2RS is more accurate than 2IRS. (5) For WE-20, all five metrics show that 1IRS is more accurate than 1RS; meanwhile, all five metrics show that 2IRS has accuracy similar to 2RS. (6) Among all forty-five comparisons, thirty-nine show that 1IRS is more accurate than 1RS and three show that 1IRS has accuracy similar to 1RS; meanwhile, thirty-four show that 2IRS is more accurate than 2RS and nine show that 2IRS has accuracy similar to 2RS. In summary, the choice of validation metric can slightly influence the results, but the conclusions obtained with the five metrics remain unchanged: the improved method performs better than the traditional method. In particular, the least-squares method used to estimate the polynomial coefficients implies that the most relevant metric is the RMSE, and the test-set value (RMSE_tst) is more relevant than the training-set value (RMSE_trn).

Effect of the Number of Sampling Points.
The accuracy of metamodels may depend on the DOE and vary from one DOE to another. To reduce the random effect caused by the DOE, we selected 1000 different training sets for each benchmark problem. However, within each training set the number of sampling points remains unchanged. Therefore, we still need to examine the effect of the number of sampling points on the performance of the improved method. Considering the length of our paper, we select only ZH-10 as the example problem, choose RMSE_tst, R, and AAE as validation metrics, and compare 2RS with 2IRS.
Figure 2 shows the results with the number of sampling points varying between (d + 1)(d + 2)/2 and (d + 1)(d + 2)(d + 3)/6. From it we obtain the following findings: (1) As the number of sampling points increases, both RMSE_tst and AAE of 2RS and 2IRS decrease; that is, the accuracy of 2RS and 2IRS increases continuously with the number of sampling points. (2) RMSE_tst and AAE of 2IRS are always smaller than those of 2RS, while R of 2IRS is always larger than that of 2RS; that is, 2IRS shows an obvious accuracy improvement over 2RS regardless of the number of sampling points.

Significance of Results.
The results above have proven the effectiveness of the improved method to some extent.
In order for the method to be better used by engineers, we compare its performance with some other popular metamodels: Kriging with a first-order polynomial regression function (KRG1), Kriging with a second-order polynomial regression function (KRG2), radial basis functions with a Gaussian basis (RBFG), and radial basis functions with a multiquadric basis (RBFM). Since Jin, Chen, and Simpson [21] concluded that 2RS has special advantages in efficiency and transparency and some disadvantages in accuracy, we mainly focus on the accuracy improvement of the proposed method. Figure 3 shows RMSE_tst of 2RS, 2IRS, KRG1, KRG2, RBFG, and RBFM for the different benchmark problems. The detailed results are given in Appendix F. Considering how often each metamodel attains each accuracy rank over the nine benchmark problems, we can see the following. (1) 2RS ranks 1st or 2nd only once, while 2IRS does so three times, KRG1 twice, KRG2 eight times, RBFG once, and RBFM three times. (2) 2RS ranks 5th or 6th five times, while 2IRS does so once, KRG1 three times, KRG2 never, RBFG eight times, and RBFM once. (3) Obviously, 2RS performs worse than KRG1, yet 2IRS performs better than KRG1.
(4) 2RS performs worse than RBFM, yet 2IRS has performance similar to RBFM. (5) Both 2RS and 2IRS perform better than RBFG. (6) 2RS performs worse than KRG2 on all nine benchmark problems, yet 2IRS performs better than KRG2 for PO-3 and ZH-10. In summary, compared with the traditional method, the improved method retains the advantages in efficiency and transparency and brings a significant accuracy improvement.

Local Performance of the Improved Method.
All the results above mainly test the global performance of the improved method; therefore, LHS was utilized to generate the sampling points. To test the performance of the improved method used for local metamodels in RSM, we use simple benchmark problems (i.e., QP-3, CP-3, and FP-3) and select CCDs to generate the corresponding training points. CCDs are considered one of the most suitable DOE techniques in response surface methodology [10]. Considering the length of our paper, we select only RMSE_tst, R, and AAE as validation metrics.
Figure 4 shows the results of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) for QP-3, CP-3, and FP-3. From it we can see the following. (1) For QP-3, 1IRS performs better than 1RS on all the validation metrics. In particular, the errors of 1IRS, 2RS, and 2IRS are zero, because the function QP-3 can be fitted exactly by these three models. (2) For CP-3 and FP-3, 1IRS performs better than 1RS on all the validation metrics, while 2IRS also performs better than 2RS on all the validation metrics.
In summary, when the polynomials are used for local metamodels in RSM, the improved method still performs better than the traditional method.

Conclusions
In this paper, we proposed a new method to obtain an accurate high-order polynomial while requiring fewer sampling points. The core idea of the method is to start with a low-order polynomial and refit it into a high-order polynomial in a second, successive fitting by using the feedback of the initial simple fitting.
To test the global performance of the improved method, we employed nine example problems that are widely used as benchmarks in the relevant literature. As expected, the accuracy of the improved method is better than that of the traditional method. We attribute this better performance to the fact that the improved method can obtain highly nonlinear terms with fewer sampling points than the traditional method.
To obtain general conclusions, we investigated the effects of the validation metrics and the number of sampling points on the performance of the improved method. We found that these choices can slightly influence the results, but the conclusions remain unchanged.
In order for the improved method to be better used, we compared its performance with KRG1, KRG2, RBFG, and RBFM. The results showed that the improved method retains the advantages in efficiency and transparency and brings a significant accuracy improvement compared with the traditional polynomial response surface.
Moreover, we examined the performance of the improved method used for local metamodels in RSM. The results showed that the proposed method still performs better than the traditional method.
However, there is no single outstanding metamodel that works best for all tasks. Therefore, finding more accurate metamodels remains our future work.

Figure 4: Comparison of the local performance of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS). (a) QP-3 using RMSE_tst as validation metric, (b) QP-3 using R as validation metric, (c) QP-3 using AAE as validation metric, (d) CP-3 using RMSE_tst as validation metric, (e) CP-3 using R as validation metric, (f) CP-3 using AAE as validation metric, (g) FP-3 using RMSE_tst as validation metric, (h) FP-3 using R as validation metric, and (i) FP-3 using AAE as validation metric.

B. RMSE_trn
RMSE_trn is defined as

$$\mathrm{RMSE}_{\mathrm{trn}} = \sqrt{\frac{1}{n_{\mathrm{trn}}} \sum_{i=1}^{n_{\mathrm{trn}}} \left(y_i - \hat{y}_i\right)^2}, \qquad (B.1)$$

where n_trn denotes the number of training points. Table 4 shows the mean values of RMSE_trn of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 training sets for the nine benchmark problems.

C. R
R is defined as

$$R = \frac{\sum_{i=1}^{n_{\mathrm{tst}}} \left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n_{\mathrm{tst}}} \left(y_i - \bar{y}\right)^2 \sum_{i=1}^{n_{\mathrm{tst}}} \left(\hat{y}_i - \bar{\hat{y}}\right)^2}}, \qquad (C.1)$$

where ȳ and the corresponding mean of ŷ denote the averages of the observed and predicted responses at the test points, respectively. Table 5 shows the mean values of R of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

D. AAE
AAE is defined as

$$\mathrm{AAE} = \frac{1}{n_{\mathrm{tst}}} \sum_{i=1}^{n_{\mathrm{tst}}} \left| y_i - \hat{y}_i \right|. \qquad (D.1)$$

Table 6 shows the mean values of AAE of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

E. MAE
MAE is defined as

$$\mathrm{MAE} = \max_i \left| y_i - \hat{y}_i \right|, \quad i = 1, 2, \ldots, n_{\mathrm{tst}}. \qquad (E.1)$$

Table 7 shows the mean values of MAE of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

F. Significance
Table 8 shows the mean values of RMSE_tst of 2RS, 2IRS, KRG1, KRG2, RBFG, and RBFM over the 1000 test sets for the nine benchmark problems.

Figure 2 :
Figure 2: Effect of the number of sampling points on the performance of 2RS and 2IRS for the 10-variable Zakharov function (ZH-10). (a) RMSE_tst is selected as validation metric, (b) R is selected as validation metric, and (c) AAE is selected as validation metric.

Table 1 :
Detailed information about the training and test data used for each benchmark problem.

Table 2 :
Comparison of the performance of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) across all five validation metrics and nine benchmark problems.

Table 3 :
Mean and COV values of RMSE_tst of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

Table 4 :
Mean values of RMSE_trn of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 training sets for the nine benchmark problems.

Table 5 :
Mean values of R of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

Table 6 :
Mean values of AAE of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.

Table 7 :
Mean values of MAE of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) over the 1000 test sets for the nine benchmark problems.