
One of the most popular statistical models is a low-order polynomial response surface model, i.e., a polynomial of first or second order. Such polynomials can serve as global metamodels that approximate the overall tendency of weakly nonlinear simulations, and as local metamodels in response surface methodology (RSM), which has been studied in various applications in engineering design and analysis. The order of the selected polynomial determines the number of sampling points (input combinations) required and the resulting accuracy (validity, adequacy). This paper derives a novel method to obtain an accurate high-order polynomial while requiring fewer sampling points. The method uses a two-stage procedure in which the second stage modifies the low-order polynomial estimated in the first stage; this second stage requires no new points. We evaluate the performance of the method numerically on several test functions. The results show that the proposed method can provide more accurate predictions than the traditional method.

Metamodels are essentially simple approximating functions of simulation models of real systems. Currently, the most common use of metamodels is to reduce computational cost significantly by substituting cheap metamodel evaluations for time-consuming simulation runs in computationally intensive tasks [

Up to now, various types of metamodeling techniques have been proposed, such as radial basis function (RBF) [

Generally speaking, the order of the selected polynomial has a significant influence on the number of sampling points as well as on the resulting accuracy (validity, adequacy). Namely, as the order of the polynomial increases, the polynomial response surface becomes more accurate in approximating highly nonlinear problems. However, the number of sampling points needs to increase sharply, which may be impractical for high-fidelity simulations. We should note that the mean squared error (MSE) may also increase.

This paper derives a novel method to obtain an accurate high-order polynomial while requiring fewer sampling points. The proposed approach is based on a two-stage procedure. The second stage modifies the low-order polynomial constructed in the first stage by utilizing its feedback. It is noted that the second stage does not require new sampling points.

The remaining sections of this paper are organized as follows. Firstly, we analyze the basic theory and characteristics of the polynomial response surface. Secondly, we introduce the modeling process of the improved polynomial response surface. Thirdly, we present the detailed scheme of the numerical experiments. Then, we analyze the results and discuss the performance of the proposed method. Finally, we conclude our paper with a summary and suggestions for future work.

The polynomial response surface is mainly used to develop an approximate functional relationship between a number of input variables and an associated response. The relationship can usually be written as follows:

The key step of the polynomial response surface is to estimate

Second, the values of the polynomial basis-function vectors

Third, the true response

Then, from (

Next, the sum of squared residuals (SSR)

Besides,

Finally,

In this way, the polynomial response surface has been constructed.
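As a concrete illustration of the construction above (our own sketch, not the authors' code; the basis function and design-matrix names are our notation), a second-order polynomial response surface can be fitted by ordinary least squares:

```python
import numpy as np

def quadratic_basis(x):
    """Second-order polynomial basis vector for one point x:
    [1, x_1, ..., x_n, x_i * x_j for i <= j]."""
    x = np.asarray(x, dtype=float)
    terms = [1.0] + list(x)
    n = len(x)
    for i in range(n):
        for j in range(i, n):
            terms.append(x[i] * x[j])
    return np.array(terms)

def fit_polynomial_rs(X, y):
    """Estimate the coefficient vector beta by least squares,
    minimizing the sum of squared residuals ||F beta - y||^2."""
    F = np.array([quadratic_basis(x) for x in X])   # design matrix of basis values
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return beta

def predict(beta, X):
    F = np.array([quadratic_basis(x) for x in X])
    return F @ beta
```

For n input variables this basis has (n + 1)(n + 2)/2 terms, so at least that many sampling points are needed for a unique least-squares fit.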

The first-order and second-order polynomial models are the two most popular polynomial response surfaces. Although polynomials of order higher than two can also be employed, the number of sampling points needs to increase sharply as the order increases. To overcome this difficulty, we propose an improved method. The core idea is to start with a low-order polynomial and refit it into a high-order polynomial in a second, successive fitting that uses the feedback of the initial simple fit. No new sampling points are needed in the second fitting.

In detail, the improved method involves the following steps:

Choose an appropriate DOE and generate a series of samples, namely, the design matrix

Conduct numerical simulations or physical experiments to observe or measure the true response vector

Construct the initial low-order polynomial response surface

Choose an appropriate method to modify the initial polynomial response surface

Use the improved model

This paper proposes a method to correct the initial low-order polynomial and construct the corresponding high-order polynomial. The method can be expressed as follows:

The idea of the method is to treat the response of the low-order polynomial as feedback, multiply it by one linear (first-order) regression model, and add another, different linear regression model. Essentially, a second-order polynomial is obtained when the method is applied to 1RS, and a third-order polynomial when it is applied to 2RS. We can see that 2RS has

We begin to construct the improved model

From (

Equation (

The totality of these equations represented by (

The least-squares method is employed to estimate the values of

In this way, the improved model has been constructed. It can be expressed as follows:
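The two-stage construction can be sketched as follows (our own illustration under the stated idea: the first-stage prediction is multiplied by one linear model and another linear model is added, with both sets of coefficients estimated by least squares on the same sampling points; all function names are ours):

```python
import numpy as np

def linear_basis(x):
    """First-order basis g(x) = [1, x_1, ..., x_n]."""
    return np.concatenate(([1.0], np.asarray(x, dtype=float)))

def fit_improved(X, y, predict_low):
    """Second-stage fit: y(x) ~ yhat_low(x) * (c . g(x)) + d . g(x).
    Uses the same points X, y as the first stage; no new samples needed."""
    G = np.array([linear_basis(x) for x in X])       # (m, n+1)
    ylow = np.array([predict_low(x) for x in X])     # first-stage feedback
    A = np.hstack([G * ylow[:, None], G])            # multiply-and-add design matrix
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    n1 = G.shape[1]
    c, d = theta[:n1], theta[n1:]

    def predict_improved(x):
        g = linear_basis(x)
        return predict_low(x) * (c @ g) + d @ g

    return predict_improved
```

Because the first-stage prediction is multiplied by a first-order model, applying this to a first-order polynomial (1RS) yields second-order terms, and applying it to a second-order polynomial (2RS) yields third-order terms, without any additional sampling points.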

To test the global performance of the proposed method, we employ nine benchmark problems which are often used in relevant literature. The dimensions of these problems range from 2 to 20.

To facilitate the description, we use simple labels for these benchmark problems. In detail, the 2-variable Goldstein-Price function is denoted by GP-2; the 2-variable Branin-Hoo function is denoted by BH-2; the 3-variable Perm function is denoted by PM-3; the 3-variable Cubic-Polynomial function is denoted by CP-3; the 4-variable Power-Sum function is denoted by PS-4; the 4-variable Hartmann function is denoted by HM-4; the 10-variable Zakharov function is denoted by ZH-10; the 15-variable Dixon-Price function is denoted by DP-15; and the 20-variable Welch et al. (1992) function is denoted by WE-20.

To test the local performance of the proposed method, we employ three benchmark problems, which are polynomials of second, third, and fourth order.

When the low-order polynomials are used for local metamodels in response surface methodology (RSM), the resolution-III (R-III) designs, central composite designs (CCDs), and Box-Behnken designs are considered to be the most suitable DOE techniques [

The performance of metamodels may vary from DOE to DOE. To reduce the random effect, we select 1000 training sets and 1000 corresponding test sets for each benchmark problem. In particular, for a specified training set, the number of points is chosen as triple the number of coefficients in a second-order polynomial model (namely, 3(n + 1)(n + 2)/2 points for a problem with n input variables).

In summary, the detailed information about the training and test data used for each benchmark problem is listed in Table

Detailed information about the training and test data used for each benchmark problem.

Benchmark problems | Number of variables | Number of training points |
---|---|---|

2-variable Goldstein-Price | 2 | 18 |

2-variable Branin-Hoo | 2 | 18 |

3-variable Perm | 3 | 30 |

3-variable Cubic-Polynomial | 3 | 30 |

4-variable Power-Sum | 4 | 45 |

4-variable Hartmann | 4 | 45 |

10-variable Zakharov | 10 | 198 |

15-variable Dixon-Price | 15 | 408 |

20-variable Welch et al. (1992) | 20 | 693 |
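The training-set sizes in the table are consistent with the rule of three times the number of coefficients of a second-order polynomial in n variables, i.e. 3(n + 1)(n + 2)/2; a quick check:

```python
def n_coeffs_quadratic(n):
    # number of monomials of total degree <= 2 in n variables
    return (n + 1) * (n + 2) // 2

# (number of variables, training points) as listed in the table
table = {2: 18, 3: 30, 4: 45, 10: 198, 15: 408, 20: 693}
for n, pts in table.items():
    assert pts == 3 * n_coeffs_quadratic(n)
```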

Reviewing the relevant literature [

Our main concern is the comparison between each traditional model and its corresponding improved model, namely, the comparison between 1RS and 1IRS, as well as that between 2RS and 2IRS. Although we do not think it necessary to compare 1RS with 2RS or 1IRS with 2RS, we still note that 2RS appears to outperform 1RS on all the benchmark problems, while the performance of 1IRS may approach that of 2RS on some particular problems.

The

From Figure

The choice of different validation metrics may influence the results. To reduce the source of uncertainty in the results as much as possible, we select another four commonly used validation metrics. They are root mean squared error at training points (

Comparison of the performance of the traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS) across all five validation metrics and nine benchmark problems.

| GP-2 | BH-2 | PM-3 | CP-3 | PS-4 | HM-4 | ZH-10 | DP-15 | WE-20 | Total |
---|---|---|---|---|---|---|---|---|---|---|

1IRS > 1RS | 5 | 5 | 5 | 5 | 5 | 2 | 5 | 2 | 5 | 39 |

1IRS = 1RS | | | | | | 1 | | 2 | | 3 |

1IRS < 1RS | | | | | | 2 | | 1 | | 3 |

| ||||||||||

2IRS > 2RS | 5 | 5 | 5 | 4 | 5 | 4 | 5 | 1 | | 34 |

2IRS = 2RS | | | | 1 | | | | 3 | 5 | 9 |

2IRS < 2RS | | | | | | 1 | | 1 | | 2 |

From Table

In summary, the choice of validation metric can slightly influence the results, but the conclusions obtained from the five metrics remain unchanged: the improved method performs better than the traditional method. In particular, since the least-squares method is used to estimate the polynomial coefficients, the most relevant metric is

The accuracy of metamodels may depend on the DOE and vary from DOE to DOE. To reduce the random effect caused by the DOE, we have selected 1000 different training sets for each benchmark problem. However, for each training set, the number of sampling points remains unchanged. Therefore, we still need to examine the effect of the number of sampling points on the performance of the improved method. Considering the length of our paper, we select only ZH-10 as the example problem, choose

Figure

With the number of sampling points increasing, both

Effect of the number of sampling points on the performance of 2RS and 2IRS for 10-variable Zakharov function (ZH-10). (a)

The results above have demonstrated the effectiveness of the improved method to some extent. To help engineers make better use of the method, we compare its performance with several other popular metamodels: Kriging with a first-order polynomial regression function (KRG1), Kriging with a second-order polynomial regression function (KRG2), radial basis functions with a Gaussian basis (RBFG), and radial basis functions with a multiquadric basis (RBFM). Considering Jin, Chen, and Simpson [

Figure

In summary, compared to the traditional method, the improved method retains the advantages in efficiency and transparency and possesses significant accuracy improvement.

All the results above mainly test the global performance of the improved method; therefore, LHS is utilized to generate the sampling points. To test the performance of the improved method for local metamodels in RSM, we use simple benchmark problems (i.e., QP-3, CP-3, and FP-3) and select CCDs to generate the corresponding training points. CCDs are considered one of the most suitable DOE techniques in response surface methodology [
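For reference, a CCD combines factorial corners, axial points, and a center point. The sketch below generates a face-centered CCD in coded variables (our own illustration; the axial distance alpha is a design choice, and the paper's exact variant is not specified in our copy):

```python
import numpy as np
from itertools import product

def central_composite_design(n, alpha=1.0):
    """Central composite design for n coded variables in [-1, 1]:
    2^n factorial corners, 2n axial points at +/-alpha on each axis,
    and one center point. alpha=1 gives the face-centered variant."""
    corners = np.array(list(product([-1.0, 1.0], repeat=n)))
    axial = []
    for i in range(n):
        for s in (-alpha, alpha):
            pt = np.zeros(n)
            pt[i] = s
            axial.append(pt)
    center = np.zeros((1, n))
    return np.vstack([corners, np.array(axial), center])
```

For a 3-variable problem this yields 8 + 6 + 1 = 15 points, enough to fit the 10 coefficients of a second-order polynomial.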

Figure

Comparison of the local performances between traditional models (1RS and 2RS) and their corresponding improved models (1IRS and 2IRS). (a) QP-3 using

In summary, when the polynomials are used for local metamodels in RSM, the improved method still performs better than the traditional method.

In this paper, we proposed a new method to obtain an accurate high-order polynomial while requiring fewer sampling points. The core idea of the method is to start with a low-order polynomial and refit it to obtain a high-order polynomial in a second, successive fitting that uses the feedback of the initial simple fit.

To test the global performance of the improved method, we employed nine example problems which are widely used as benchmark problems in relevant literature. As expected, the accuracy of the improved method is better than that of the traditional method. Analyzing the principle, we think the reason for the better performance of the improved method is that it can obtain highly nonlinear terms with fewer sampling points when compared with the traditional method.

To obtain general conclusions, we investigated the effects of validation metrics and the number of sampling points on the performance of the improved method. We found that the choice of the validation metrics and the number of sampling points can slightly influence the results, but the conclusions remain unchanged.

In order for the improved method to be better used, we compared its performance with KRG1, KRG2, RBFG, and RBFM. The results showed that the improved method retains the advantages in efficiency and transparency and possesses significant accuracy improvement when compared with the traditional polynomial response surface.

Moreover, we researched the performance of the improved method used for local metamodels in RSM. The results showed that the proposed method still performs better than the traditional method.

However, there is no single outstanding metamodel which works best for all tasks. Therefore, finding more accurate metamodels is still our future work.

Table

Mean and COV values of

1RS | 1IRS | 2RS | 2IRS | |
---|---|---|---|---|

GP-2 | 1.162E+05(0.086) | 8.271E+04(0.177) | 8.180E+04(0.160) | 7.172E+04(0.225) |

BH-2 | 6.611E+01(0.066) | 6.352E+01(0.147) | 5.862E+01(0.114) | 5.106E+01(0.265) |

PM-3 | 1.576E+01(0.060) | 7.883E+00(0.416) | 6.086E+00(0.118) | 5.056E+00(0.175) |

CP-3 | 4.097E+01(0.058) | 7.056E+00(0.138) | 7.322E+00(0.148) | 1.113E+00(0.613) |

PS-4 | 4.986E+04(0.056) | 4.158E+04(0.075) | 2.876E+04(0.106) | 2.527E+04(0.123) |

HM-4 | 8.414E-01(0.033) | 8.479E-01(0.133) | 6.623E-01(0.091) | 6.484E-01(0.123) |

ZH-10 | 1.048E+08(0.091) | 5.103E+07(0.128) | 5.655E+07(0.128) | 2.022E+07(0.166) |

DP-15 | 3.838E+05(0.022) | 3.849E+05(0.025) | 1.363E+05(0.035) | 1.373E+05(0.035) |

WE-20 | 1.365E+00(0.022) | 1.297E+00(0.024) | 9.506E-01(0.026) | 9.505E-01(0.026) |

The definition of

Table

Mean values of

1RS | 1IRS | Difference | 2RS | 2IRS | Difference | |
---|---|---|---|---|---|---|

GP-2 | 9.273E+04 | 5.052E+04 | -45.5% | 4.824E+04 | 2.928E+04 | -39.3% |

BH-2 | 5.491E+01 | 4.358E+01 | -20.6% | 3.651E+01 | 2.772E+01 | -24.1% |

PM-3 | 1.380E+01 | 6.063E+00 | -56.1% | 3.748E+00 | 2.612E+00 | -30.3% |

CP-3 | 3.580E+01 | 4.259E+00 | -88.1% | 4.073E+00 | 7.076E-01 | -82.6% |

PS-4 | 4.415E+04 | 3.098E+04 | -29.8% | 1.764E+04 | 1.433E+04 | -18.8% |

HM-4 | 7.802E-01 | 7.274E-01 | -6.8% | 4.509E-01 | 4.070E-01 | -9.7% |

ZH-10 | 9.675E+07 | 3.771E+07 | -61.0% | 3.375E+07 | 9.383E+06 | -72.2% |

DP-15 | 3.703E+05 | 3.561E+05 | -3.8% | 8.920E+04 | 8.734E+04 | -2.1% |

WE-20 | 1.326E+00 | 1.221E+00 | -7.9% | 6.240E-01 | 6.205E-01 | -0.6% |

The definition of

Table

Mean values of

1RS | 1IRS | Difference | 2RS | 2IRS | Difference | |
---|---|---|---|---|---|---|

GP-2 | 4.505E-01 | 7.970E-01 | 76.9% | 8.130E-01 | 8.564E-01 | 5.3% |

BH-2 | 9.404E-02 | 3.448E-01 | 266.7% | 5.141E-01 | 6.407E-01 | 24.6% |

PM-3 | 2.100E-01 | 8.403E-01 | 300.2% | 9.288E-01 | 9.479E-01 | 2.1% |

CP-3 | 9.302E-01 | 9.981E-01 | 7.3% | 9.979E-01 | 9.999E-01 | 0.2% |

PS-4 | 7.574E-01 | 8.386E-01 | 10.7% | 9.272E-01 | 9.438E-01 | 1.8% |

HM-4 | 5.610E-01 | 5.750E-01 | 2.5% | 7.811E-01 | 7.937E-01 | 1.6% |

ZH-10 | 7.371E-01 | 9.463E-01 | 28.4% | 9.339E-01 | 9.918E-01 | 6.2% |

DP-15 | 2.467E-02 | 1.163E-01 | 371.4% | 9.337E-01 | 9.327E-01 | -0.1% |

WE-20 | 7.660E-01 | 7.919E-01 | 3.4% | 8.967E-01 | 8.967E-01 | 0.0% |

The definition of

Table

Mean values of

1RS | 1IRS | Difference | 2RS | 2IRS | Difference | |
---|---|---|---|---|---|---|

GP-2 | 7.171E+04 | 4.979E+04 | -30.6% | 4.918E+04 | 3.980E+04 | -19.1% |

BH-2 | 4.895E+01 | 4.529E+01 | -7.5% | 4.144E+01 | 3.498E+01 | -15.6% |

PM-3 | 1.147E+01 | 5.799E+00 | -49.4% | 4.212E+00 | 3.454E+00 | -18.0% |

CP-3 | 2.999E+01 | 4.512E+00 | -85.0% | 4.762E+00 | 8.330E-01 | -82.5% |

PS-4 | 3.595E+04 | 2.879E+04 | -19.9% | 2.014E+04 | 1.678E+04 | -16.7% |

HM-4 | 6.576E-01 | 6.712E-01 | 2.1% | 5.144E-01 | 4.918E-01 | -4.4% |

ZH-10 | 6.765E+07 | 3.113E+07 | -54.0% | 3.701E+07 | 1.156E+07 | -68.8% |

DP-15 | 3.076E+05 | 3.079E+05 | 0.1% | 1.081E+05 | 1.089E+05 | 0.7% |

WE-20 | 1.070E+00 | 1.012E+00 | -5.4% | 7.688E-01 | 7.686E-01 | 0.0% |

The definition of

Table

Mean values of

1RS | 1IRS | Difference | 2RS | 2IRS | Difference | |
---|---|---|---|---|---|---|

GP-2 | 7.668E+05 | 5.407E+05 | -29.5% | 5.302E+05 | 4.735E+05 | -10.7% |

BH-2 | 3.584E+02 | 3.194E+02 | -10.9% | 2.780E+02 | 2.535E+02 | -8.8% |

PM-3 | 1.038E+02 | 5.065E+01 | -51.2% | 4.363E+01 | 3.352E+01 | -23.2% |

CP-3 | 2.319E+02 | 5.583E+01 | -75.9% | 5.588E+01 | 4.901E+00 | -91.2% |

PS-4 | 3.591E+05 | 2.688E+05 | -25.2% | 2.116E+05 | 1.823E+05 | -13.8% |

HM-4 | 2.800E+00 | 2.833E+00 | 1.2% | 2.663E+00 | 2.928E+00 | 9.9% |

ZH-10 | 1.093E+09 | 5.805E+08 | -46.9% | 5.760E+08 | 2.380E+08 | -58.7% |

DP-15 | 1.463E+06 | 1.483E+06 | 1.3% | 4.993E+05 | 5.101E+05 | 2.2% |

WE-20 | 5.205E+00 | 5.009E+00 | -3.8% | 3.304E+00 | 3.303E+00 | 0.0% |

Table

Mean values of

2RS | 2IRS | KRG1 | KRG2 | RBFG | RBFM |
---|---|---|---|---|---|---|

GP-2 | 8.180E+04 | 7.172E+04 | 7.143E+04 | 6.551E+04 | 7.566E+04 | 6.165E+04 |

BH-2 | 5.862E+01 | 5.106E+01 | 3.168E+01 | 3.207E+01 | 8.063E+01 | 3.451E+01 |

PM-3 | 6.086E+00 | 5.056E+00 | 4.326E+00 | 3.730E+00 | 3.613E+00 | 2.666E+00 |

CP-3 | 7.322E+00 | 1.113E+00 | 6.729E+00 | 3.879E+00 | 4.927E+01 | 1.131E+01 |

PS-4 | 2.876E+04 | 2.527E+04 | 2.669E+04 | 2.411E+04 | 7.125E+04 | 2.533E+04 |

HM-4 | 6.623E-01 | 6.484E-01 | 4.182E-01 | 5.465E-01 | 7.118E-01 | 5.572E-01 |

ZH-10 | 5.655E+07 | 2.022E+07 | 8.331E+07 | 5.384E+07 | 1.738E+08 | 6.453E+07 |

DP-15 | 1.363E+05 | 1.373E+05 | 3.377E+05 | 1.361E+05 | 1.028E+06 | 1.804E+05 |

WE-20 | 9.506E-01 | 9.505E-01 | 1.335E+00 | 9.471E-01 | 1.206E+00 | 9.417E-01 |

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Cheng Yan and Xiuli Shen conceived and designed the experiments; Fushui Guo performed the experiments; Cheng Yan analyzed the data; Fushui Guo contributed analysis tools; Cheng Yan wrote the paper.