In recent years, kriging has become one of the most popular methods in computer simulation and machine learning. Kriging models have been used successfully in many engineering applications to approximate expensive simulation models. When many input variables are used, kriging becomes inefficient, mainly because of the exorbitant computational time required to build the model. To handle high-dimensional problems (100+ input variables), a method was recently proposed that combines kriging with the Partial Least Squares technique, the so-called KPLS model. This method has shown interesting results in terms of reducing the CPU time required to build the model while maintaining sufficient accuracy, on both academic and industrial problems. However, KPLS has shown poor accuracy compared to conventional kriging on multimodal functions. To handle this issue, this paper proposes adding a new step during the construction of KPLS to improve its accuracy for multimodal functions. When exponential covariance functions are used, this step is based on a simple identification between the covariance functions of KPLS and kriging. The developed method is validated, in particular, on a multimodal academic function known in the literature as the Griewank function, and we show the gain in terms of accuracy and computation time by comparing it with KPLS and kriging.
1. Introduction
In recent years, the kriging model [1–4], also referred to as the Gaussian process model [5], has become one of the most popular methods in computer simulation and machine learning. It is used as a substitute for high-fidelity codes representing physical phenomena and aims to reduce the computational time of a particular process. For instance, the kriging model has been used successfully in several optimization problems [6–11]. Kriging is not well suited to high-dimensional problems, principally because of the large matrix inversions involved. In fact, the kriging model becomes very time consuming when a large number of input variables are used, since a large number of sampling points are then required. Indeed, it is recommended in [12] to use $10d$ sampling points, with $d$ the number of dimensions, to obtain good accuracy of the kriging model. As a result, the size of the kriging covariance matrix increases, and the matrix becomes computationally very expensive to invert. Moreover, this inversion problem makes the classical estimation of the hyperparameters through maximization of the likelihood function difficult.
A recent method, called KPLS [13], was developed to reduce this computational time by using the dimension reduction method “Partial Least Squares” (PLS) during the construction of the kriging model. This method reduces the number of hyperparameters of a kriging model so that it equals the number of principal components retained by the PLS method. The KPLS method is thus able to rapidly build a kriging model for high-dimensional problems (100+ input variables) while maintaining good accuracy. However, it has been shown in [13] that the KPLS model is less accurate than the kriging model in many cases, in particular for multimodal functions.
In this paper, we propose an extra step that supplements [13] in order to improve its accuracy. Under the hypothesis that the kernels used for building the KPLS model are of exponential type with the same form (all Gaussian kernels, e.g.), we choose the hyperparameters found by the KPLS model as an initial point to optimize the likelihood function of a conventional kriging model. This approach is carried out by identifying the covariance function of the KPLS model as a covariance function of a kriging model. Considering the identified kriging model, instead of the KPLS model, extends the search space in which the hyperparameters are defined and thus makes the resulting model more flexible than the KPLS model.
This paper is organized in 3 main sections. In Section 2, we present a review of the KPLS model. In Section 3, we discuss our new approach under the hypothesis needed for its applicability. Finally, numerical results are shown to confirm the efficiency of our method followed by a summary of what we have achieved.
2. Construction of KPLS
In this section, we introduce the notation and describe the theory behind the construction of the KPLS model. Assume that we have evaluated a deterministic cost function at $n$ points $x^{(i)}$ ($i=1,\dots,n$) with $x^{(i)}=[x_1^{(i)},\dots,x_d^{(i)}]\in B\subset\mathbb{R}^d$, and we denote by $X$ the matrix $[x^{(1)t},\dots,x^{(n)t}]^t$. For simplicity, $B$ is considered to be a hypercube expressed as the product of the intervals in each direction; that is, $B=\prod_{j=1}^{d}[a_j,b_j]$, where $a_j,b_j\in\mathbb{R}$ with $a_j\le b_j$ for $j=1,\dots,d$. Simulating these $n$ inputs gives the outputs $y=[y^{(1)},\dots,y^{(n)}]^t$ with $y^{(i)}=y(x^{(i)})$, for $i=1,\dots,n$.
2.1. Construction of the Kriging Model
To build the kriging model, we assume that the deterministic response $y(x)$ is a realization of a stochastic process [14–17]:
$$Y(x)=\beta_0+Z(x). \tag{1}$$
The presented formula, with $\beta_0$ an unknown constant, corresponds to ordinary kriging [8], which is a particular case of universal kriging [15]. The stochastic term $Z(x)$ is considered as a realization of a stationary Gaussian process with $E[Z(x)]=0$ and a covariance function, also called kernel function, given by
$$\mathrm{Cov}\left(Z(x),Z(x')\right)=k(x,x')=\sigma^2 r(x,x')=\sigma^2 r_{xx'},\quad\forall x,x'\in B, \tag{2}$$
where $\sigma^2$ is the process variance and $r_{xx'}$ is the correlation function between $x$ and $x'$. The correlation function $r$ depends on hyperparameters $\theta$, which are considered to be known. We also denote the $n\times 1$ vector as $r_{xX}=[r_{xx^{(1)}},\dots,r_{xx^{(n)}}]^t$ and the $n\times n$ correlation matrix as $R=[r_{x^{(1)}X},\dots,r_{x^{(n)}X}]$. We use $\hat{y}(x)$ to denote the prediction of the true function $y(x)$. Under the hypothesis above, the best linear unbiased predictor for $y(x)$, given the observations $y$, is
$$\hat{y}(x)=\hat{\beta}_0+r_{xX}^{t}R^{-1}\left(y-\hat{\beta}_0\mathbf{1}\right), \tag{3}$$
where $\mathbf{1}$ denotes an $n$-vector of ones and
$$\hat{\beta}_0=\left(\mathbf{1}^{t}R^{-1}\mathbf{1}\right)^{-1}\mathbf{1}^{t}R^{-1}y. \tag{4}$$
In addition, the estimation of $\sigma^2$ is given by
$$\hat{\sigma}^2=\frac{1}{n}\left(y-\mathbf{1}\hat{\beta}_0\right)^{t}R^{-1}\left(y-\mathbf{1}\hat{\beta}_0\right). \tag{5}$$
Moreover, ordinary kriging provides an estimate of the variance of the prediction, which is given by
$$s^2(x)=\hat{\sigma}^2\left(1-r_{xX}^{t}R^{-1}r_{xX}\right). \tag{6}$$
Note that the assumption of a known covariance function with known parameters $\theta$ is unrealistic in practice; these parameters are usually unknown. For this reason, the covariance function is typically chosen from among a parametric family of kernels. In this work, only covariance functions of exponential type are considered, in particular the Gaussian kernel. The Gaussian kernel, which is the most popular kernel in kriging metamodels of simulation models, is given by
$$k(x,x')=\sigma^2\prod_{i=1}^{d}\exp\left(-\theta_i\left(x_i-x_i'\right)^2\right),\quad\forall\theta_i\in\mathbb{R}^+. \tag{7}$$
We note that the parameters $\theta_i$, for $i=1,\dots,d$, can be interpreted as measuring how strongly the input variables $x_1,\dots,x_d$, respectively, affect the output $y$. If $\theta_i$ is very large, the kernel $k(x,x')$ given by (7) tends to zero and thus leads to a low correlation. In fact, we see in Figure 1 how the correlation curve varies rapidly from one point to another when $\theta=10$.
Figure 1: The smoothness parameter $\theta$ can be tuned to adapt the spatial influence to the problem. The magnitude of $\theta$ dictates how quickly the squared exponential function varies.
However, estimating the kriging parameters ($\hat{\beta}_0$, $\hat{\sigma}^2$, and $\theta_1,\dots,\theta_d$) makes the kriging predictor, given by (3), nonlinear and makes the estimated predictor variance, given by (6), biased. We note that the vector $r$ and the matrix $R$ should also carry hats, but this is ignored in practice [18].
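As an illustration of (3)–(6), the following is a minimal NumPy sketch of the ordinary kriging predictor with a Gaussian correlation function and a fixed, known $\theta$. The function names (`gaussian_corr`, `fit_ordinary_kriging`, `predict`) and the toy data set are assumptions made for this example only; they are not taken from the toolbox used in the paper.

```python
import numpy as np


def gaussian_corr(A, B, theta):
    """Gaussian correlation r(x, x') = prod_i exp(-theta_i * (x_i - x'_i)**2)."""
    diff = A[:, None, :] - B[None, :, :]            # shape (n_A, n_B, d)
    return np.exp(-np.sum(theta * diff**2, axis=2))


def fit_ordinary_kriging(X, y, theta):
    """Estimate beta0 (4) and sigma^2 (5) for a fixed, known theta."""
    n = X.shape[0]
    R = gaussian_corr(X, X, theta)                  # n x n correlation matrix
    ones = np.ones(n)
    beta0 = (ones @ np.linalg.solve(R, y)) / (ones @ np.linalg.solve(R, ones))
    resid = y - beta0 * ones
    sigma2 = resid @ np.linalg.solve(R, resid) / n
    return {"X": X, "theta": theta, "R": R, "beta0": beta0,
            "sigma2": sigma2, "resid": resid}


def predict(model, X_new):
    """Return the predictor (3) and the prediction variance (6) at X_new."""
    r = gaussian_corr(X_new, model["X"], model["theta"])   # shape (m, n)
    mean = model["beta0"] + r @ np.linalg.solve(model["R"], model["resid"])
    R_inv_r = np.linalg.solve(model["R"], r.T)             # shape (n, m)
    var = model["sigma2"] * (1.0 - np.sum(r.T * R_inv_r, axis=0))
    return mean, var


# Toy 1D data set (hypothetical, chosen so that R is well conditioned).
X = np.linspace(0.0, 1.0, 6).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
model = fit_ordinary_kriging(X, y, theta=np.array([50.0]))

mean_train, var_train = predict(model, X)                  # at the training points
mean_new, var_new = predict(model, np.array([[0.5]]))      # at an unobserved point
```

At the training points the predictor reproduces the observations and the estimated variance vanishes, whereas at an unobserved point the variance is strictly positive.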
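The effect of the magnitude of $\theta$ on the correlation can be checked numerically. The short sketch below evaluates the one-dimensional Gaussian correlation at a fixed distance for three values of $\theta$; the distance 0.5 and the values of $\theta$ are arbitrary choices made for illustration.

```python
import numpy as np


def gaussian_corr_1d(delta, theta):
    """One-dimensional Gaussian correlation exp(-theta * delta**2)."""
    return np.exp(-theta * delta**2)


delta = 0.5                                   # distance |x - x'| (illustrative)
corr = {theta: gaussian_corr_1d(delta, theta) for theta in (0.1, 1.0, 10.0)}
for theta, r in corr.items():
    print(f"theta = {theta:5.1f}   correlation = {r:.4f}")
```

The correlation drops from roughly 0.98 for $\theta=0.1$ to roughly 0.78 for $\theta=1$ and roughly 0.08 for $\theta=10$, which is the rapid decay described above.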
2.2. Partial Least Squares
The PLS method is a statistical method that searches for the best multidimensional direction in the $X$ space that explains the characteristics of the output $y$. It finds a linear relationship between the input variables and the output variable by projecting the input variables onto principal components, also called latent variables. The PLS technique reduces dimension and reveals how inputs depend on output. In the following, we use $h$ to denote the number of principal components retained, which is much lower than $d$ ($h\ll d$); in practice, $h$ does not generally exceed 4. In addition, the principal components can be computed sequentially. In fact, the principal component $t^{(l)}$, for $l=1,\dots,h$, is computed by seeking the best direction $w^{(l)}$ which maximizes the squared covariance between $t^{(l)}=X^{(l-1)}w^{(l)}$ and $y^{(l-1)}$:
$$w^{(l)}=\arg\max_{w^{(l)}}\; w^{(l)t}X^{(l-1)t}y^{(l-1)}y^{(l-1)t}X^{(l-1)}w^{(l)}\quad\text{such that } w^{(l)t}w^{(l)}=1, \tag{8}$$
where $X=X^{(0)}$, $y=y^{(0)}$, and, for $l=1,\dots,h$, $X^{(l)}$ and $y^{(l)}$ are the residuals from the local regression of $X^{(l-1)}$ onto the principal component $t^{(l)}$ and from the local regression of $y^{(l-1)}$ onto the principal component $t^{(l)}$, respectively, such that
$$X^{(l)}=X^{(l-1)}-t^{(l)}p^{(l)},\qquad y^{(l)}=y^{(l-1)}-c_l t^{(l)}, \tag{9}$$
where $p^{(l)}$ (a $1\times d$ vector) and $c_l$ (a coefficient) contain the regression coefficients. For more details on how the PLS method works, see [19–21].
The principal components represent the new coordinate system obtained upon rotating the original system with axes $x_1,\dots,x_d$ [21]. For $l=1,\dots,h$, $t^{(l)}$ can be written as
$$t^{(l)}=X^{(l-1)}w^{(l)}=Xw_*^{(l)}. \tag{10}$$
This important relationship is mainly used for developing the KPLS model, which is detailed in Section 2.3. The vectors $w_*^{(l)}$, for $l=1,\dots,h$, are the columns of the matrix $W_*=[w_*^{(1)},\dots,w_*^{(h)}]$, which is obtained by (for more details, see [22])
$$W_*=W\left(P^{t}W\right)^{-1}, \tag{11}$$
where $W=[w^{(1)},\dots,w^{(h)}]$ and $P=[p^{(1)t},\dots,p^{(h)t}]$.
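The sequential computation in (8)–(11) can be sketched as follows for a single output. This is a minimal illustration, not the implementation used in the paper: the function name `pls1`, the centering of $X$ and $y$, and the synthetic data are assumptions. For a single output, the maximizer of (8) under the unit-norm constraint is the normalized vector $X^{(l-1)t}y^{(l-1)}$, which is what the code uses.

```python
import numpy as np


def pls1(X, y, h):
    """Sequential PLS for one output; returns W, P, c, T, W_star and centered X."""
    X0 = X - X.mean(axis=0)        # centering is an assumption of this sketch
    Xl, yl = X0.copy(), y - y.mean()
    n, d = X.shape
    W, P = np.zeros((d, h)), np.zeros((d, h))
    T, c = np.zeros((n, h)), np.zeros(h)
    for l in range(h):
        w = Xl.T @ yl
        w /= np.linalg.norm(w)     # solves (8): unit vector along X^t y
        t = Xl @ w                 # principal component t^(l)
        p = Xl.T @ t / (t @ t)     # regression coefficients of X^(l-1) on t^(l)
        cl = yl @ t / (t @ t)      # regression coefficient of y^(l-1) on t^(l)
        Xl = Xl - np.outer(t, p)   # deflation step (9)
        yl = yl - cl * t
        W[:, l], P[:, l], T[:, l], c[l] = w, p, t, cl
    W_star = W @ np.linalg.inv(P.T @ W)   # relation (11)
    return W, P, c, T, W_star, X0


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + np.sin(X[:, 2]) + 0.1 * X[:, 3] ** 2
W, P, c, T, W_star, X0 = pls1(X, y, h=3)

# Relation (10): every component is a linear map of the (centered) inputs.
max_gap = np.max(np.abs(X0 @ W_star - T))
```

The quantity `max_gap` is at the level of rounding error, which confirms numerically that each $t^{(l)}$ can be written directly in terms of the original inputs through $w_*^{(l)}$.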
2.3. Construction of the KPLS Model
The hyperparameters $\theta=\{\theta_i\}$, for $i=1,\dots,d$, given by (7) are found by the maximum likelihood estimation (MLE) method. Their estimation becomes more and more expensive as $d$ increases. The vector $\theta$ can be interpreted as measuring how strongly the variables $x_1,\dots,x_d$, respectively, affect the output $y$. For building KPLS, the coefficients given by the vectors $w_*^{(l)}$ are considered as a measure of the influence of the input variables $x_1,\dots,x_d$ on the output $y$. By some elementary operations on the kernel functions, we define the KPLS kernel by
$$k_{\mathrm{KPLS}1:h}(x,x')=\prod_{l=1}^{h}k_l\left(F_l(x),F_l(x')\right), \tag{12}$$
where $k_l:B\times B\to\mathbb{R}$ is an isotropic stationary kernel and
$$F_l:B\longrightarrow B,\qquad x\longmapsto\left[w_{*1}^{(l)}x_1,\dots,w_{*d}^{(l)}x_d\right]. \tag{13}$$
More details of such construction are given in [13]. Considering the example of the Gaussian kernel given by (7), we obtain
$$k(x,x')=\sigma^2\prod_{l=1}^{h}\prod_{i=1}^{d}\exp\left[-\theta_l\left(w_{*i}^{(l)}x_i-w_{*i}^{(l)}x_i'\right)^2\right],\quad\forall\theta_l\in\mathbb{R}^+. \tag{14}$$
Since only a small number of principal components are retained, estimating the hyperparameters $\theta_1,\dots,\theta_h$ is faster than estimating the hyperparameters $\theta_1,\dots,\theta_d$ given by (7), where $d$ is very high (100+).
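A direct transcription of the Gaussian KPLS kernel (14) into NumPy may help make the roles of $\theta_l$ and $w_{*i}^{(l)}$ concrete. In this sketch the function name `kpls_gaussian_kernel` is hypothetical, and the matrix `W_star` is filled with random placeholder values standing in for the output of a PLS step.

```python
import numpy as np


def kpls_gaussian_kernel(x, xp, theta, W_star, sigma2=1.0):
    """KPLS Gaussian kernel (14); theta has h entries, W_star has shape (d, h)."""
    # Entry (i, l) of diff is w*_i^(l) x_i - w*_i^(l) x'_i.
    diff = W_star * x[:, None] - W_star * xp[:, None]
    return sigma2 * np.exp(-np.sum(theta * diff**2))


rng = np.random.default_rng(0)
d, h = 100, 2
W_star = 0.1 * rng.normal(size=(d, h))      # placeholder for the PLS output
theta = np.array([0.5, 2.0])                # h hyperparameters instead of d
x, xp = rng.uniform(0.0, 1.0, size=d), rng.uniform(0.0, 1.0, size=d)

k_same = kpls_gaussian_kernel(x, x, theta, W_star)    # equals sigma2 at x = x'
k_xy = kpls_gaussian_kernel(x, xp, theta, W_star)
k_yx = kpls_gaussian_kernel(xp, x, theta, W_star)     # symmetric in its arguments
```

Only the $h=2$ entries of `theta` would have to be estimated by maximum likelihood here, although the inputs live in a 100-dimensional space.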
3. Transition from the KPLS Model to the Kriging Model Using the Exponential Covariance Functions
In this section, we show that if all kernels $k_l$, for $l=1,\dots,h$, used in (12) are of the exponential type with the same form (all Gaussian kernels, e.g.), then the kernel $k_{\mathrm{KPLS}1:h}$ given by (12) will be of the exponential type with the same form as $k_l$ (Gaussian if all $k_l$ are Gaussian).
3.1. Proof of the Equivalence between the Kernels of the KPLS Model and the Kriging Model
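Let us define, for $i=1,\dots,d$, $\eta_i=\sum_{l=1}^{h}\theta_l\left(w_{*i}^{(l)}\right)^2$; we have
$$\begin{aligned}k_{1:h}(x,x')&=\prod_{l=1}^{h}\prod_{i=1}^{d}\exp\left(-\theta_l\left(w_{*i}^{(l)}\right)^2\left(x_i-x_i'\right)^2\right)\\&=\exp\left(\sum_{i=1}^{d}\sum_{l=1}^{h}-\theta_l\left(w_{*i}^{(l)}\right)^2\left(x_i-x_i'\right)^2\right)\\&=\exp\left(\sum_{i=1}^{d}-\eta_i\left(x_i-x_i'\right)^2\right)\\&=\prod_{i=1}^{d}\exp\left(-\eta_i\left(x_i-x_i'\right)^2\right).\end{aligned} \tag{15}$$
In the same way, we can show this equivalence for the other exponential kernels where $p_1=\dots=p_h$:
$$k_{1:h}(x,x')=\sigma^2\prod_{l=1}^{h}\prod_{i=1}^{d}\exp\left(-\theta_l\left|w_{*i}^{(l)}\left(x_i-x_i'\right)\right|^{p_l}\right). \tag{16}$$
However, we must caution that the above proof shows equivalence between the covariance functions of KPLS and kriging only on a subspace. More precisely, the KPLS covariance function is defined on a subspace of $(\mathbb{R}^+)^d$, whereas the kriging covariance function is defined on the complete domain $(\mathbb{R}^+)^d$. Thus, our original idea is to extend the space where the KPLS covariance function is defined to the complete space $(\mathbb{R}^+)^d$.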
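For the general exponential case, the identification can be written out explicitly. The following is a sketch assuming a common exponent $p=p_1=\dots=p_h$ and the absolute-value form of the exponential kernel; the symbol $p$ is introduced here only for illustration.

```latex
\prod_{l=1}^{h}\prod_{i=1}^{d}
  \exp\left(-\theta_l \left| w_{*i}^{(l)} (x_i - x_i') \right|^{p}\right)
= \exp\left(-\sum_{i=1}^{d}\Big(\sum_{l=1}^{h}\theta_l \left| w_{*i}^{(l)} \right|^{p}\Big)
  \left| x_i - x_i' \right|^{p}\right)
= \prod_{i=1}^{d}\exp\left(-\eta_i \left| x_i - x_i' \right|^{p}\right),
\qquad
\eta_i = \sum_{l=1}^{h}\theta_l \left| w_{*i}^{(l)} \right|^{p}.
```

For $p=2$ this reduces to the definition of $\eta_i$ used in the Gaussian case.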
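The identification in the Gaussian case, and the restriction to a subspace, can both be checked numerically. In the sketch below, `W_star` and `theta` are random placeholders standing in for the outputs of a KPLS fit; the check itself only relies on the definition of $\eta_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h = 20, 3
W_star = 0.3 * rng.normal(size=(d, h))   # placeholder for the PLS output
theta = rng.uniform(0.1, 2.0, size=h)    # placeholder KPLS hyperparameters
x, xp = rng.uniform(0.0, 1.0, size=d), rng.uniform(0.0, 1.0, size=d)

# KPLS form: double product over components l and directions i.
k_kpls = np.prod(np.exp(-theta * (W_star * (x - xp)[:, None]) ** 2))

# Kriging form: eta_i = sum_l theta_l * (w*_i^(l))**2, single product over i.
eta = (W_star**2) @ theta
k_kriging = np.prod(np.exp(-eta * (x - xp) ** 2))

# eta is a nonnegative combination of the h columns of W_star**2, so KPLS can
# only reach a cone of dimension at most h inside the d-dimensional space.
rank_reachable = np.linalg.matrix_rank(W_star**2)
```

Both kernel values agree up to rounding error, while `rank_reachable` equals $h$ rather than $d$: whatever the values of $\theta_1,\dots,\theta_h$, the vector $\eta$ obtained this way stays in a low-dimensional part of the space available to a conventional kriging model.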
3.2. A New Step during the Construction of the KPLS Model: KPLS+K
Building on the equivalence shown in the previous section, we propose adding a new step to the construction of the KPLS model. This step occurs just after the estimation of $\theta_l$, for $l=1,\dots,h$. It consists of a local optimization of the likelihood function of the kriging model equivalent to the KPLS model. We use $\eta_i=\sum_{l=1}^{h}\theta_l\left(w_{*i}^{(l)}\right)^2$, for $i=1,\dots,d$, as the starting point of the local optimization, where $\theta_l$, for $l=1,\dots,h$, is the solution found by the KPLS method. This optimization is thus carried out in the complete space, where the vector $\eta=\{\eta_i\}\in(\mathbb{R}^+)^d$.
This approach, called KPLS+K, aims to improve the MLE of the kriging model equivalent to the associated KPLS model. The local optimization of the equivalent kriging model offers more possibilities for improving the MLE, because it considers a wider search space, and it is thus able to correct the estimation of many directions. Each such direction is represented by $\eta_i$, for the $i$th direction that is badly estimated by the KPLS method. Because estimating the equivalent kriging hyperparameters can be time consuming, especially when $d$ is large, we improve the MLE by a local optimization only, at the cost of a slight increase in computational time.
Figure 2 recalls the principal stages of building a KPLS+K model.
Figure 2: Principal stages for building a KPLS+K model.
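The additional KPLS+K step can be sketched as a bounded local search started from $\eta$. The code below is an illustration under several assumptions that are not stated in the paper: the likelihood is written in its standard concentrated form, $\frac{n}{2}\ln\hat{\sigma}^2+\frac{1}{2}\ln|R|$ up to constants; the search is done on $\log_{10}\eta$ with SciPy's L-BFGS-B optimizer; a small nugget is added to $R$ for numerical stability; and `theta_kpls` and `W_star` are placeholders for the result of a KPLS fit.

```python
import numpy as np
from scipy.optimize import minimize


def neg_log_likelihood(log10_eta, X, y, nugget=1e-8):
    """Negative concentrated log-likelihood of ordinary kriging (Gaussian kernel)."""
    n = X.shape[0]
    eta = 10.0 ** np.asarray(log10_eta)
    diff = X[:, None, :] - X[None, :, :]
    R = np.exp(-np.sum(eta * diff**2, axis=2)) + nugget * np.eye(n)
    try:
        L = np.linalg.cholesky(R)
    except np.linalg.LinAlgError:
        return 1e25                                  # penalize ill-conditioned R

    def r_inv(b):                                    # R^{-1} b via the Cholesky factor
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    beta0 = (ones @ r_inv(y)) / (ones @ r_inv(ones))
    resid = y - beta0
    sigma2 = resid @ r_inv(resid) / n
    return 0.5 * (n * np.log(sigma2) + 2.0 * np.sum(np.log(np.diag(L))))


def kpls_plus_k(theta_kpls, W_star, X, y, bounds=(-6.0, 2.0)):
    """Local likelihood optimization in the full space, started from the KPLS solution."""
    eta0 = (W_star**2) @ theta_kpls                  # starting point eta
    x0 = np.clip(np.log10(eta0), *bounds)
    nll0 = neg_log_likelihood(x0, X, y)
    res = minimize(neg_log_likelihood, x0, args=(X, y), method="L-BFGS-B",
                   bounds=[bounds] * X.shape[1])
    if res.fun > nll0:                               # keep the KPLS start if no gain
        return 10.0 ** x0, nll0, nll0
    return 10.0 ** res.x, nll0, res.fun


rng = np.random.default_rng(0)
n, d = 30, 5
X = rng.uniform(0.0, 1.0, size=(n, d))
y = np.sin(4.0 * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 4]
W_star = rng.normal(size=(d, 1))                     # placeholder PLS weights (h = 1)
theta_kpls = np.array([1.0])                         # placeholder KPLS estimate
eta_opt, nll_start, nll_opt = kpls_plus_k(theta_kpls, W_star, X, y)
```

Because the search starts from the KPLS solution and only moves to points with a lower negative log-likelihood, the returned likelihood is never worse than that of the equivalent KPLS model; the price is one local optimization over $d$ variables instead of $h$.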
4. Numerical Simulations
We now focus on the performance of KPLS+K by comparing it with the KPLS model and the ordinary kriging model. For this purpose, we use the academic Griewank function over the interval $[-5,5]$, which is studied in [13]; 20 and 60 dimensions are considered for this function. In addition, an engineering example, carried out at Snecma for a multidisciplinary optimization, is used. This engineering case is chosen because it was shown in [13] that KPLS is less accurate than ordinary kriging on it. The Gaussian kernel is used for all surrogate models herein, that is, ordinary kriging, KPLS, and KPLS+K. KPLS and KPLS+K using $h$ principal components, for $h\le d$, are denoted by KPLSh and KPLSh+K, respectively, and $h$ is varied from 1 to 3. The Python toolbox Scikit-learn v0.14 [23] is used to carry out the proposed numerical tests, except for ordinary kriging on the industrial case, where the Optimus version is used. The training and validation points used in [13] are reused in the following.
4.1. Griewank Function over the Interval [−5, 5]
The Griewank function [13, 24] is defined by
$$y_{\mathrm{Griewank}}(x)=\sum_{i=1}^{d}\frac{x_i^2}{4000}-\prod_{i=1}^{d}\cos\left(\frac{x_i}{\sqrt{i}}\right)+1,\quad -5\le x_i\le 5,\ \text{for } i=1,\dots,d. \tag{17}$$
Figure 3 shows the degree of complexity of this function, which is very multimodal. As in [13], we consider $d=20$ and $d=60$ input variables. For each problem, ten experiments based on the random Latin-hypercube design are built with $n$ (number of sampling points) equal to 50, 100, 200, and 300. To better visualize the results, boxplots are used to show the CPU time and the relative error (RE) given by
$$\mathrm{Error}=\frac{\lVert\hat{Y}-Y\rVert_2}{\lVert Y\rVert_2}\,100, \tag{18}$$
where $\lVert\cdot\rVert_2$ represents the usual $L^2$ norm and $\hat{Y}$ and $Y$ are the vectors containing the predictions and the real values at 5000 randomly selected validation points for each case. The mean and the standard deviation are given in Tables 2 and 3 of the Appendix for 20 and 60 dimensions, respectively. The results of the ordinary kriging model and the KPLS model are taken from [13].
Figure 3: A 2D Griewank function over the interval $[-5,5]$.
For 20 input variables and 50 sampling points, the KPLS models always give a more accurate solution than ordinary kriging and KPLS+K, as shown in Figure 4(a). Indeed, the best result is given by KPLS3 with a mean RE of 0.51%. However, the KPLS+K models are more accurate than ordinary kriging in this case (0.58% for KPLS2+K and KPLS3+K versus 0.62% for ordinary kriging). For the KPLS model, the rate of improvement with respect to the number of sampling points is lower than for ordinary kriging and KPLS+K (see Figures 4(b)–4(d)). As a result, KPLSh+K, for $h=1,\dots,3$, and ordinary kriging give almost the same accuracy (≈0.16%) when 300 sampling points are used (as shown in Figure 4(d)), whereas the KPLS models give an RE of 0.35% as their best result, when $h=3$.
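The test function (17), the error measure (18), and a random Latin-hypercube design can be written in a few lines of NumPy. This is a minimal sketch for illustration: the helper names are assumptions, the Latin-hypercube sampler is a basic stratified construction rather than the one used for the experiments, and the "surrogate" evaluated at the end is only a constant baseline, not a kriging model.

```python
import numpy as np


def griewank(X):
    """Griewank function (17) evaluated row-wise on an (m, d) array."""
    X = np.atleast_2d(X)
    i = np.arange(1, X.shape[1] + 1)
    return np.sum(X**2, axis=1) / 4000.0 - np.prod(np.cos(X / np.sqrt(i)), axis=1) + 1.0


def relative_error(y_pred, y_true):
    """Relative error (18), in percent."""
    return 100.0 * np.linalg.norm(y_pred - y_true) / np.linalg.norm(y_true)


def latin_hypercube(n, d, low, high, rng):
    """Random Latin hypercube: one point per stratum in every direction."""
    strata = np.argsort(rng.uniform(size=(n, d)), axis=0)   # a permutation per column
    u = (strata + rng.uniform(size=(n, d))) / n
    return low + (high - low) * u


rng = np.random.default_rng(0)
X_train = latin_hypercube(50, 20, -5.0, 5.0, rng)    # n = 50 points, d = 20
y_train = griewank(X_train)

X_valid = rng.uniform(-5.0, 5.0, size=(5000, 20))    # 5000 random validation points
y_valid = griewank(X_valid)

# Baseline "surrogate": predict the training mean everywhere.
re_baseline = relative_error(np.full_like(y_valid, y_train.mean()), y_valid)
```

Replacing the constant baseline by the predictions of a fitted surrogate gives the RE values of the kind reported in this section.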
Figure 4: RE of the Griewank function in 20D over the interval $[-5,5]$. The experiments are based on ten Latin-hypercube designs.
(a) RE (%) for 20 input variables and 50 sampling points
(b) RE (%) for 20 input variables and 100 sampling points
(c) RE (%) for 20 input variables and 200 sampling points
(d) RE (%) for 20 input variables and 300 sampling points
Nevertheless, the results shown in Figure 5 indicate that the KPLS+K models lead to a substantial reduction in CPU time compared to ordinary kriging for all numbers of sampling points considered. For instance, 20.49 s are required for building KPLS3+K when 300 training points are used, whereas ordinary kriging is built in 94.31 s; in this case, KPLS3+K is thus more than 4 times cheaper than the ordinary kriging model. Moreover, the computational time required for building KPLS+K is more stable than the computational time for building ordinary kriging; standard deviations of approximately 3 s for KPLS+K and 22 s for the ordinary kriging model are observed.
Figure 5: CPU time of the Griewank function in 20D over the interval $[-5,5]$. The experiments are based on ten Latin-hypercube designs.
(a) CPU time for 20 input variables and 50 sampling points
(b) CPU time for 20 input variables and 100 sampling points
(c) CPU time for 20 input variables and 200 sampling points
(d) CPU time for 20 input variables and 300 sampling points
For 60 input variables and 50 sampling points, the results differ slightly from the 20 input variables case (Figure 6(a)). Indeed, the KPLS models always remain better than KPLS+K and ordinary kriging, with a mean RE approximately equal to 0.92%. However, the KPLS+K models give more accurate results than ordinary kriging, with an accuracy close to that of KPLS (≈0.99% versus 1.39%). As the number of sampling points increases, ordinary kriging becomes more accurate than the KPLS models, but it remains less accurate than the KPLSh+K models, for $h=2$ or 3. For instance, we obtain a mean RE of 0.60% for KPLS2+K against 0.65% for ordinary kriging (see Figure 6(d)), when 300 sampling points are used.
Figure 6: RE of the Griewank function in 60D over the interval $[-5,5]$. The experiments are based on ten Latin-hypercube designs.
(a) RE (%) for 60 input variables and 50 sampling points
(b) RE (%) for 60 input variables and 100 sampling points
(c) RE (%) for 60 input variables and 200 sampling points
(d) RE (%) for 60 input variables and 300 sampling points
As we can observe from Figure 7(d), a very substantial reduction in computational time is obtained. Indeed, a mean time of 2894.56 s is required for building ordinary kriging, whereas KPLS2+K is built in 23.03 s; KPLS2+K is approximately 125 times cheaper than ordinary kriging in this case. In addition, the computational time for building KPLS+K is more stable than for ordinary kriging, except in the KPLS3+K case; a standard deviation of approximately 0.30 s for KPLS1+K and KPLS2+K is observed, against 728.48 s for ordinary kriging. The relatively large standard deviation of KPLS3+K (26.59 s) is probably due to the dispersion caused by KPLS3 (26.59 s). It nevertheless remains far lower than the standard deviation of the ordinary kriging model.
Figure 7: CPU time of the Griewank function in 60D over the interval $[-5,5]$. The experiments are based on ten Latin-hypercube designs.
(a) CPU time for 60 input variables and 50 sampling points
(b) CPU time for 60 input variables and 100 sampling points
(c) CPU time for 60 input variables and 200 sampling points
(d) CPU time for 60 input variables and 300 sampling points
For the Griewank function over the interval [-5,5], the KPLS+K models are slightly more time consuming than the KPLS models, but they are more accurate, in particular when the number of observations is greater than the dimension d, as is implied by the rule-of-thumb n=10d in [12]. They seem to perform well compared to the ordinary kriging model with an important gain in terms of saving CPU time.
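As a worked check, the CPU-time gains quoted for 300 sampling points follow directly from the mean build times reported for ordinary kriging, KPLS3+K (20D), and KPLS2+K (60D):

```latex
\frac{94.31\ \mathrm{s}}{20.49\ \mathrm{s}} \approx 4.6 \quad (d = 20,\ \text{KPLS3+K}),
\qquad
\frac{2894.56\ \mathrm{s}}{23.03\ \mathrm{s}} \approx 125.7 \quad (d = 60,\ \text{KPLS2+K}).
```

The gain thus grows markedly with the dimension for this test function.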
4.2. Engineering Case
In this section, let us consider the third output, $y_3$, from the tab1 problem studied in [13]. This test case is chosen because the KPLS models, from 1 to 3 principal components, do not perform well (see Table 1). We recall that this problem contains 24 input variables. 99 training points and 52 validation points are used and the relative error (RE) given by (18) is considered.
Table 1: Results for the tab1 experiment data (24 input variables, output variable $y_3$) obtained by using 99 training points, 52 validation points, and the error given by (18). “Kriging” refers to the ordinary kriging Optimus solution and “KPLSh” and “KPLSh+K” refer to KPLS and KPLS+K with h principal components, respectively. Best results of the relative error are highlighted in bold type.

| Problem | Surrogate model | RE (%) | CPU time |
|---|---|---|---|
| tab1 | Kriging | 8.97 | 8.17 s |
| | KPLS1 | 10.35 | 0.18 s |
| | KPLS2 | 10.33 | 0.42 s |
| | KPLS3 | 10.41 | 1.14 s |
| | KPLS1+K | 8.77 | 2.15 s |
| | KPLS2+K | **8.72** | 4.22 s |
| | KPLS3+K | 8.73 | 4.53 s |
Table 2: Results of the Griewank function in 20D over the interval $[-5,5]$. Ten trials are done for each test (50, 100, 200, and 300 training points). Best results of the relative error are highlighted in bold type for each case.

| Surrogate | Statistic | Error (%), 50 points | CPU time, 50 points | Error (%), 100 points | CPU time, 100 points |
|---|---|---|---|---|---|
| Kriging | Mean | 0.62 | 30.43 s | 0.43 | 40.09 s |
| | std | 0.03 | 9.03 s | 0.04 | 11.96 s |
| KPLS1 | Mean | 0.54 | 0.05 s | 0.53 | 0.12 s |
| | std | 0.03 | 0.007 s | 0.03 | 0.02 s |
| KPLS2 | Mean | 0.52 | 0.11 s | 0.48 | 1.04 s |
| | std | 0.03 | 0.05 s | 0.04 | 0.97 s |
| KPLS3 | Mean | **0.51** | 1.27 s | 0.46 | 3.09 s |
| | std | 0.03 | 1.29 s | 0.06 | 3.93 s |
| KPLS1+K | Mean | 0.59 | 1.20 s | 0.45 | 2.42 s |
| | std | 0.04 | 0.16 s | 0.07 | 0.44 s |
| KPLS2+K | Mean | 0.58 | 1.28 s | 0.42 | 3.38 s |
| | std | 0.04 | 0.15 s | 0.05 | 1.06 s |
| KPLS3+K | Mean | 0.58 | 2.45 s | **0.41** | 5.61 s |
| | std | 0.03 | 1.32 s | 0.05 | 3.99 s |

| Surrogate | Statistic | Error (%), 200 points | CPU time, 200 points | Error (%), 300 points | CPU time, 300 points |
|---|---|---|---|---|---|
| Kriging | Mean | **0.15** | 120.74 s | **0.16** | 94.31 s |
| | std | 0.02 | 27.49 s | 0.06 | 21.92 s |
| KPLS1 | Mean | 0.48 | 0.43 s | 0.45 | 0.89 s |
| | std | 0.03 | 0.08 s | 0.03 | 0.02 s |
| KPLS2 | Mean | 0.42 | 1.14 s | 0.38 | 2.45 s |
| | std | 0.04 | 0.92 s | 0.04 | 1 s |
| KPLS3 | Mean | 0.37 | 3.56 s | 0.35 | 3.52 s |
| | std | 0.03 | 2.75 s | 0.06 | 1.38 s |
| KPLS1+K | Mean | 0.20 | 8.00 s | 0.17 | 19.07 s |
| | std | 0.04 | 1.51 s | 0.07 | 3.19 s |
| KPLS2+K | Mean | 0.18 | 9.71 s | **0.16** | 19.89 s |
| | std | 0.02 | 1.29 s | 0.05 | 2.67 s |
| KPLS3+K | Mean | 0.16 | 11.67 s | **0.16** | 20.49 s |
| | std | 0.02 | 3.88 s | 0.05 | 3.46 s |
Table 3: Results of the Griewank function in 60D over the interval $[-5,5]$. Ten trials are done for each test (50, 100, 200, and 300 training points). Best results of the relative error are highlighted in bold type for each case.

| Surrogate | Statistic | Error (%), 50 points | CPU time, 50 points | Error (%), 100 points | CPU time, 100 points |
|---|---|---|---|---|---|
| Kriging | Mean | 1.39 | 560.19 s | 1.04 | 920.41 s |
| | std | 0.15 | 200.27 s | 0.05 | 231.34 s |
| KPLS1 | Mean | 0.92 | 0.07 s | 0.87 | 0.10 s |
| | std | 0.02 | 0.02 s | 0.02 | 0.007 s |
| KPLS2 | Mean | **0.91** | 0.43 s | 0.87 | 0.66 s |
| | std | 0.03 | 0.54 s | 0.02 | 1.06 s |
| KPLS3 | Mean | 0.92 | 1.57 s | **0.86** | 3.87 s |
| | std | 0.04 | 1.98 s | 0.02 | 5.34 s |
| KPLS1+K | Mean | 0.99 | 2.14 s | 0.90 | 2.90 s |
| | std | 0.03 | 0.72 s | 0.03 | 0.03 s |
| KPLS2+K | Mean | 0.98 | 2.44 s | 0.88 | 3.44 s |
| | std | 0.04 | 0.63 s | 0.02 | 1.06 s |
| KPLS3+K | Mean | 0.99 | 3.82 s | 0.88 | 6.68 s |
| | std | 0.05 | 2.33 s | 0.03 | 5.34 s |

| Surrogate | Statistic | Error (%), 200 points | CPU time, 200 points | Error (%), 300 points | CPU time, 300 points |
|---|---|---|---|---|---|
| Kriging | Mean | 0.83 | 2015.39 s | 0.65 | 2894.56 s |
| | std | 0.04 | 239.11 s | 0.03 | 728.48 s |
| KPLS1 | Mean | 0.82 | 0.37 s | 0.79 | 0.86 s |
| | std | 0.02 | 0.02 s | 0.03 | 0.04 s |
| KPLS2 | Mean | 0.78 | 2.92 s | 0.74 | 1.85 s |
| | std | 0.02 | 2.57 s | 0.03 | 0.51 s |
| KPLS3 | Mean | 0.78 | 6.73 s | 0.70 | 20.01 s |
| | std | 0.02 | 10.94 s | 0.03 | 26.59 s |
| KPLS1+K | Mean | 0.76 | 9.88 s | 0.66 | 22.00 s |
| | std | 0.03 | 0.06 s | 0.02 | 0.15 s |
| KPLS2+K | Mean | 0.75 | 12.38 s | **0.60** | 23.03 s |
| | std | 0.03 | 2.56 s | 0.03 | 0.50 s |
| KPLS3+K | Mean | **0.74** | 16.18 s | 0.61 | 41.13 s |
| | std | 0.03 | 10.95 s | 0.03 | 26.59 s |
As we see in Table 1, we improve the accuracy of KPLS by adding the step for building KPLS+K. This improvement holds whatever the number of principal components used (1, 2, or 3). For these three models, the accuracy is even better than that of the ordinary kriging model. The computational time required to build KPLS+K is roughly half, or less, of the time required for ordinary kriging.
5. Conclusions
Motivated by the need to approximate high-fidelity codes both accurately and rapidly, we develop a new technique for building the kriging model faster than the classical techniques used in the literature. The key idea of this construction relies on the choice of the starting point for optimizing the likelihood function of the kriging model. For this purpose, we first prove the equivalence between KPLS and kriging when an exponential covariance function is used. After optimizing the hyperparameters of KPLS, we then choose the solution obtained as an initial point to find the MLE of the equivalent kriging model. This approach is applicable only if the kernels used for building KPLS are of the exponential type with the same form.
The performance of KPLS+K is verified on the Griewank function over the interval $[-5,5]$ with 20 and 60 dimensions and on an industrial case from Snecma, where the KPLS models do not perform well in terms of accuracy. The results of KPLS+K show a significant improvement in accuracy compared to the results of KPLS, at the cost of a slight increase in computational time. We have also seen that, in some cases, the accuracy of KPLS+K is even better than the accuracy given by the ordinary kriging model.
Appendix: Results of the Griewank Function in 20D and 60D over the Interval [−5, 5]
In Tables 2 and 3, the mean and the standard deviation (std) of the numerical experiments with the Griewank function are given for 20 and 60 dimensions, respectively.
Symbols and Notation (Matrices and Vectors Are in Bold Type)
$|\cdot|$: Absolute value
$\mathbb{R}$: Set of real numbers
$\mathbb{R}^+$: Set of positive real numbers
$n$: Number of sampling points
$d$: Dimension
$h$: Number of principal components retained
$x$: A $1\times d$ vector
$x_i$: The $i$th element of a vector $x$
$X$: An $n\times d$ matrix containing sampling points
$y(x)$: The true function $y$ performed on the vector $x$
$y$: $n\times 1$ vector containing simulation of $X$
$\hat{y}(x)$: The prediction of the true function $y(x)$
$Y(x)$: A stochastic process
$x^{(i)}$: The $i$th training point for $i=1,\dots,n$ (a $1\times d$ vector)
$w^{(l)}$: A $d\times 1$ vector containing $X$-weights given by the $l$th PLS-iteration for $l=1,\dots,h$
$X^{(0)}$: $X$
$X^{(l-1)}$: Matrix containing residual of the inner regression of the $(l-1)$th PLS-iteration for $l=1,\dots,h$
$k(\cdot,\cdot)$: A covariance function
$x^{t}$: Superscript $t$ denoting the transpose operation of the vector $x$
$\approx$: Approximately sign.
Competing Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors extend their grateful thanks to A. Chiplunkar from ISAE-SUPAERO, Toulouse, for his careful correction of the paper.
References
[1] Krige, D. A statistical approach to some basic mine valuation problems on the Witwatersrand.
[2] Matheron, G. Principles of geostatistics.
[3] Cressie, N. Spatial prediction and ordinary kriging.
[4] Sacks, J., Schiller, S. B., Welch, W. J. Designs for computer experiments.
[5] Rasmussen, C. E., Williams, C. K.
[6] Jones, D. R., Schonlau, M., Welch, W. J. Efficient global optimization of expensive black-box functions.
[7] Sakata, S., Ashida, F., Zako, M. Structural optimization using Kriging approximation.
[8] Forrester, A., Sobester, A., Keane, A.
[9] Laurenceau, J.
[10] Kleijnen, J. P. C., van Beers, W., van Nieuwenhuyse, I. Constrained optimization in expensive simulation: novel approach.
[11] Kleijnen, J. P., van Beers, W., van Nieuwenhuyse, I. Expected improvement in efficient global optimization through bootstrapped kriging.
[12] Loeppky, J. L., Sacks, J., Welch, W. J. Choosing the sample size of a computer experiment: a practical guide.
[13] Bouhlel, M. A., Bartoli, N., Otsmane, A., Morlier, J. Improving kriging surrogates of high-dimensional design models by partial least squares dimension reduction.
[14] Sacks, J., Welch, W. J., Mitchell, T. J., Wynn, H. P. Design and analysis of computer experiments.
[15] Sasena, M.
[16] Picheny, V., Ginsbourger, D., Richet, Y., Caplin, G. Quantile-based optimization of noisy computer experiments with tunable precision.
[17] Roustant, O., Ginsbourger, D., Deville, Y. DiceKriging, DiceOptim: two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization.
[18] Kleijnen, J. P.
[19] Helland, I. S. On the structure of partial least squares regression.
[20] Frank, I. E., Friedman, J. H. A statistical view of some chemometrics regression tools.
[21] Pérez, R. A., González-Farias, G. Partial least squares regression on symmetric positive-definite matrices.
[22] Manne, R. Analysis of two partial-least-squares algorithms for multivariate calibration.
[23] Pedregosa, F., Varoquaux, G., Gramfort, A. Scikit-learn: machine learning in Python.
[24] Regis, R. G., Shoemaker, C. A. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization.