A Frame-Based Conjugate Gradients Direct Search Method with Radial Basis Function Interpolation Model

In this paper, we propose a new hybrid direct search method in which a frame-based PRP conjugate gradients direct search algorithm is combined with a radial basis function interpolation model. In addition, a rotational minimal positive basis is used to reduce the computational work at each iteration. Numerical results for solving the CUTEr test problems show that the proposed method is promising.


Introduction
In this paper, we consider the following problem:

min_{x ∈ R^n} f(x),  (1)

where the function f is assumed to be continuously differentiable from R^n into R and the derivative information is unavailable or untrustworthy, for example, because of noise or the use of finite differences. Problem (1) has numerous applications in engineering, such as helicopter rotor blade design [1, 2], aeroacoustic shape design [3], groundwater community problems [4], and medical image registration problems [5].
There are two main classes of methods for solving (1). The first class consists of model-based methods, which are constructed by means of multivariate interpolation, including underdetermined and overdetermined variants. These methods were introduced by Powell [6] and Winfield [7] and were developed in [8-11]. The second class consists of direct search methods, which are based on comparisons of objective function values. These methods were pioneered by Hooke and Jeeves [12]. The convergence theory was established by Torczon [13, 14]. Audet and Dennis [15] proposed a general framework for direct search methods. Coope and Price [16] extended the PRP conjugate gradients method [17, 18] to solve (1) and presented a frame-based conjugate gradients direct search algorithm (Max-PRP for short). In each iteration, Max-PRP employs a fixed maximal positive basis to estimate the current and previous gradients; the search direction is then determined by the PRP formula. Numerical tests showed that Max-PRP is effective on a wide variety of unconstrained optimization problems. In addition, some classical and modern direct search methods are surveyed by Kolda et al. [19].
Generally, model-based methods are more efficient than direct search methods in that they are able to exploit structure inherent in the problem, but direct search methods are simpler to code and to parallelize. It is therefore natural to try to combine both approaches. In 2010, Custódio et al. [20] proposed a hybrid method integrating minimum Frobenius norm quadratic interpolation models into a direct search framework, and numerical results showed that the addition of quadratic interpolation models improved the performance of the direct search method. In 2013, Conn and Le Digabel [21] showed that the use of quadratic interpolation models can improve the efficiency of the mesh adaptive direct search method.
The above hybrid algorithms were based on quadratic interpolation models. In 2008, Wild et al. [22] presented a new derivative-free algorithm (ORBIT for short), which employs radial basis function (RBF) interpolation models.
The RBF interpolation models allow ORBIT to interpolate nonlinear functions using fewer function evaluations than quadratic interpolation models. In 2013, Wild and Shoemaker [23] proved the global convergence of ORBIT under some mild assumptions. Numerical results showed that the method using RBF interpolation models outperformed methods using quadratic interpolation models.
Motivated by the efficiency of ORBIT, we propose a new hybrid direct search method, which combines frame-based conjugate gradients strategies with RBF interpolation models. In each iteration, a minimal positive basis is used to construct the frame. With a maximal positive basis, 2n function values are computed per iteration, whereas a minimal positive basis requires only n + 1 function evaluations, so the computational work of the new hybrid method is reduced. In addition, when the trial point of the RBF interpolation model does not satisfy the decrease condition, we employ the PRP formula to obtain the search direction, similarly to Max-PRP. Furthermore, we rotate the minimal positive basis according to the local topography of the objective function, which makes our method more effective in practice. Convergence is established under some mild conditions, and numerical results show that the proposed method is promising.
This paper is organized as follows. In Section 2, we present some basic notions of positive bases and frames and describe our method. In Section 3, we prove the convergence of the proposed method. In Section 4, numerical results show the efficiency of the method derived in this paper compared to Max-PRP [16]. Concluding remarks are given in Section 5. The default norm used in this paper is the Euclidean norm.

The New Hybrid Direct Search Method
We first state the definition of a positive basis, which can be found in [24].

Definition 1. A positive basis V in R^n is a set of vectors with the following two properties:

(i) Every vector in R^n is a nonnegative linear combination of the members of V.

(ii) No proper subset of V satisfies (i).

It is easy to see that the cardinality of any positive basis V satisfies n + 1 ≤ |V| ≤ 2n. Two well-known and simple examples of positive bases are

V_min = {v_1, ..., v_n, −∑_{i=1}^{n} v_i},  V_max = {v_1, ..., v_n, −v_1, ..., −v_n},

where {v_1, ..., v_n} is a basis for R^n, V_min represents the minimal positive basis, and V_max represents the maximal positive basis.
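For illustration, both bases can be generated from the canonical unit vectors; the following sketch (Python with NumPy, not part of the original presentation) builds V_min and V_max as column matrices:

```python
import numpy as np

def minimal_positive_basis(n):
    """V_min: the n unit vectors e_1, ..., e_n plus -sum_i e_i (n + 1 vectors)."""
    I = np.eye(n)
    return np.column_stack([I, -I.sum(axis=1)])

def maximal_positive_basis(n):
    """V_max: the n unit vectors and their negatives (2n vectors)."""
    I = np.eye(n)
    return np.column_stack([I, -I])

print(minimal_positive_basis(3).shape)  # (3, 4): n + 1 = 4 vectors
print(maximal_positive_basis(3).shape)  # (3, 6): 2n = 6 vectors
```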
In addition, we give some concepts about frames, which were proposed by Coope and Price [25, 26].

Definition 2. A frame is a set of the form

Φ(x, h, V) = {x + h v : v ∈ V},

where x ∈ R^n is the central point of the frame, h > 0 is the frame size, and V is a positive basis in R^n.

Definition 3. A frame Φ(x, h, V) is called minimal if f(x) ≤ f(x + h v) for every v ∈ V.

Definition 4. A frame Φ(x, h, V) is called quasi-minimal if f(x) < f(x + h v) + ε for every v ∈ V, where ε = h^{1+μ} and μ is a positive constant; the corresponding central point x is called a quasi-minimal point.
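The quasi-minimality test of Definition 4 is straightforward to state in code. In the sketch below the frame vectors are the columns of V, and μ = 1 is an illustrative choice of the positive constant, not a value prescribed by the paper:

```python
import numpy as np

def is_quasi_minimal(f, x, h, V, mu=1.0):
    """Definition 4: the frame {x + h v : v in V} is quasi-minimal when
    f(x) < f(x + h v) + eps holds for every v in V, with eps = h**(1 + mu)."""
    eps = h ** (1.0 + mu)
    return all(f(x) < f(x + h * v) + eps for v in V.T)

# The origin is quasi-minimal for f(x) = ||x||^2 with the minimal positive basis:
f = lambda x: float(x @ x)
V = np.column_stack([np.eye(2), -np.ones(2)])
print(is_quasi_minimal(f, np.zeros(2), 0.1, V))  # True
```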
Let x_k be the kth iterate. We discuss the strategy of the RBF interpolation model, the search direction, and the rotation of the positive basis in detail below.

RBF Interpolation Model. Choose a positive basis V_k = {v_k^1, ..., v_k^{n+1}} and form the frame points x_k + h_k v_k^i (i = 1, ..., n + 1), where h_k is the frame size; these points enter the interpolation set Y = {y^1, ..., y^q}, and the other points of the set Y are chosen from the subset of previously evaluated points.
The RBF interpolation model is a popular model for optimization, and some theory and implementations can be found in [27]. Corresponding to the set of interpolation data points Y, we obtain the following RBF interpolation model:

m_k(x) = ∑_{i=1}^{q} λ_i φ(‖x − y^i‖) + ∑_{j=1}^{ℓ} γ_j π_j(x),  (8)

where φ: R_+ → R is a radial basis function, λ_1, ..., λ_q, γ_1, ..., γ_ℓ ∈ R are parameters to be determined, and π_1, ..., π_ℓ are the polynomial tails used in the context of RBF interpolation models, which are most frequently linear.
Then, we minimize the RBF interpolation model by solving the following trust-region subproblem:

min m_k(x)  s.t. ‖x − x_k‖ ≤ Δ_k,

where Δ_k = β h_k max{‖v_k^1‖, ..., ‖v_k^{n+1}‖} and β is the radius factor parameter.
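The following sketch illustrates one way to realize this step: it fits a cubic RBF with a linear tail by solving the standard saddle-point system and then minimizes the model inside the ball of radius Δ_k with a generic solver. The function names and the use of SciPy's SLSQP method are illustrative choices, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def fit_cubic_rbf(Y, fvals):
    """Fit m(x) = sum_i lam_i ||x - y^i||^3 + gam_0 + gam^T x through the
    q >= n + 1 points in Y (rows) with values fvals, via the standard
    saddle-point system for RBF interpolation with a linear tail."""
    q, n = Y.shape
    Phi = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2) ** 3
    P = np.hstack([np.ones((q, 1)), Y])
    A = np.block([[Phi, P], [P.T, np.zeros((n + 1, n + 1))]])
    sol = np.linalg.solve(A, np.concatenate([fvals, np.zeros(n + 1)]))
    lam, gam = sol[:q], sol[q:]
    return lambda x: lam @ np.linalg.norm(x - Y, axis=1) ** 3 + gam[0] + gam[1:] @ x

def minimize_model(model, xk, Delta):
    """Approximately solve min m(x) subject to ||x - xk|| <= Delta."""
    ball = {"type": "ineq", "fun": lambda x: Delta - np.linalg.norm(x - xk)}
    return minimize(model, xk, method="SLSQP", constraints=[ball]).x
```

The saddle-point system is nonsingular when the interpolation points are in general position, which the point-selection strategy of Section 4 is designed to encourage.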

PRP Direction.
Consider the following linear model:

l_k(x) = f(x_k) + g_k^T (x − x_k),

where g_k ∈ R^n. The coefficients can be determined by the n + 1 regression interpolation conditions

l_k(x_k + h_k v_k^i) = f(x_k + h_k v_k^i),  i = 1, ..., n + 1.

This system can be solved by the method of least squares. For example, if we choose the positive basis V_k as V_min and v_k^i = e_i (i = 1, ..., n), where e_i is the ith unit vector, then the ith element of g_k is calculated according to the following formula:

(g_k)_i = (δ_i − σ/(n + 1)) / h_k,  δ_i = f(x_k + h_k v_k^i) − f(x_k),

where σ = ∑_{j=1}^{n+1} δ_j. The PRP direction is obtained by

d_k = −g_k + β_k d_{k−1},  β_k = g_k^T (g_k − g_{k−1}) / ‖g_{k−1}‖².
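A compact way to realize the regression gradient and the PRP direction is sketched below; when V is the minimal positive basis built from unit vectors, the least-squares solve reproduces the closed-form expression above. The helper names are illustrative:

```python
import numpy as np

def regression_gradient(f, xk, h, V):
    """Least-squares fit of l(x) = f(xk) + g^T (x - xk) to the frame values:
    solve min_g || h V^T g - delta ||, delta_i = f(xk + h v_i) - f(xk)."""
    delta = np.array([f(xk + h * v) - f(xk) for v in V.T])
    g, *_ = np.linalg.lstsq(h * V.T, delta, rcond=None)
    return g

def prp_direction(g, g_prev, d_prev):
    """PRP conjugate gradients direction: d = -g + beta * d_prev."""
    beta = g @ (g - g_prev) / (g_prev @ g_prev)
    return -g + beta * d_prev
```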

Rotation of the Positive Basis.
In order to modify the positive basis so that at least one of the new directions conforms more closely to the local behavior of the function, we rotate the positive basis at each step. This idea is similar to that in [28]. Suppose that

D_k = (d_k^1, ..., d_k^n),

where d_k^i ∈ R (i = 1, ..., n) describes the movement performed along the vector v_k^i (i = 1, ..., n) in previous iterations.

We obtain the positive basis V_{k+1} by rotating V_k. Firstly, we obtain n linearly independent vectors from V_k:

a_k^i = ∑_{j=i}^{n} d_k^j v_k^j,  i = 1, ..., n,

where a_k^i represents the sum of all the movements made in the directions v_k^j for j = i, ..., n. Lemma 8.5.4 of [29] shows that {a_k^1, ..., a_k^n} is linearly independent. Secondly, we use the Gram-Schmidt orthogonalization method to obtain an orthonormal set {v_{k+1}^1, ..., v_{k+1}^n}. Finally, we obtain V_{k+1}, where the remaining vectors {v_{k+1}^{n+1}, ..., v_{k+1}^{|V_k|}} are combined from {v_{k+1}^1, ..., v_{k+1}^n} according to the same combination principle as in V_k (see the sketch below).
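The rotation can be sketched as follows, using a QR factorization as a numerically convenient stand-in for classical Gram-Schmidt (the two agree up to column signs); the re-appended last column assumes the minimal positive basis combination rule given in the example that follows:

```python
import numpy as np

def rotate_basis(V, d):
    """Rotate the n independent columns of V with the movement amounts
    d = (d_1, ..., d_n): form a_i = sum_{j >= i} d_j v_j (independent when
    every d_j != 0), orthonormalize, and re-append the -sum vector."""
    n = V.shape[0]
    A = np.column_stack([V[:, i:] @ d[i:] for i in range(n)])
    Q, _ = np.linalg.qr(A)  # QR acts as Gram-Schmidt up to column signs
    return np.column_stack([Q, -Q.sum(axis=1)])
```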
For instance, if |V_k| = n + 1, we obtain v_{k+1}^{n+1} = −∑_{i=1}^{n} v_{k+1}^i. Supposing that {x_k} is the sequence of quasi-minimal iteration points, the above process can be summarized as the following algorithm (Algorithm 5).
Step 1 (checking the stopping condition). If the stopping condition is not met, then go to Step 2; otherwise output the lowest known point and stop.
Step 2 (determining the frame). Create a frame Φ_k at the iterate x_k according to the positive basis V_k and step length h_k, and calculate the corresponding function values.

Convergence Analysis
Now we establish the following convergence property of Algorithm 5.
Theorem 7. Suppose that the sequence of function values {f(x_k)} is bounded. Then the sequence {x_k} is infinite.
Proof. Assume that {x_k} is finite, and let x_K be the final quasi-minimal point, with h_K and V_K the frame size and positive basis corresponding to the iterate x_K. Since Φ_K is the frame corresponding to the final quasi-minimal iterate x_K, every subsequent frame Φ_{K+m} (m ≥ 1) is not quasi-minimal. From Definition 4, it follows that for each such frame there exists at least one vector v ∈ V_{K+m} with

f(x_{K+m} + h_{K+m} v) ≤ f(x_{K+m}) − ε_{K+m},  ε_{K+m} = h_{K+m}^{1+μ}.

From Steps 3, 4, and 5 of Algorithm 5, the next iterate satisfies f(x_{K+m+1}) ≤ f(x_{K+m} + h_{K+m} v), and hence

f(x_{K+m+1}) ≤ f(x_{K+m}) − ε_{K+m},  m = 1, 2, ....

Because the frame Φ_K is the final quasi-minimal frame, by Step 6 of Algorithm 5 the frame size h_k is a positive constant h̄ for k > K; that is, ε_k = h̄^{1+μ} > 0. Summing the above inequalities gives

f(x_{K+m}) ≤ f(x_{K+1}) − (m − 1) h̄^{1+μ}.

If we ignore the stopping condition and let m → +∞, then f(x_{K+m}) → −∞, which contradicts the condition that {f(x_k)} is bounded. The proof of this theorem is complete.
Theorem 8. Assume that the following conditions are satisfied:

(A1) f is continuously differentiable.

(A2) ‖v_k^i‖ ≤ M for i = 1, ..., n + 1 and k = 0, 1, ..., where M is a positive constant and v_k^i is the ith vector in V_k.

Then each cluster point of {x_k} is a stationary point of f.
Proof. Let x_∞ be an arbitrary cluster point of {x_k}, and let the subsequence {x_k}_{k∈K} converge to x_∞, where K is an infinite subset of the natural numbers. For x_m with m ∈ K, let h_m and V_m be the frame size and positive basis corresponding to the iteration point x_m, and let v_m^i be the ith vector of V_m. According to the Taylor expansion and (A1), we have

f(x_m + h_m v_m^i) = f(x_m) + h_m ∇f(x_m)^T v_m^i + o(h_m ‖v_m^i‖)

for all v_m^i ∈ V_m. From Definition 4, we have

f(x_m + h_m v_m^i) > f(x_m) − ε_m,  ε_m = h_m^{1+μ}.

Combining these two relations with (A2), we obtain

∇f(x_m)^T v_m^i > −h_m^{μ} − o(1).

According to Step 6 of Algorithm 5, we have h_m → 0. Combining this with (A1) and taking a further subsequence if necessary so that V_m converges to a positive basis V_∞ (which is possible by (A2)), we have

∇f(x_∞)^T v_∞^i ≥ 0,  i = 1, ..., n + 1.

Let the vectors of V_∞ be v_∞^1, ..., v_∞^{n+1}; since V_∞ is a positive basis, there exist n + 1 nonnegative coefficients λ_i (i = 1, ..., n + 1) such that

−∇f(x_∞) = ∑_{i=1}^{n+1} λ_i v_∞^i.

Combining the last two relations, we have

‖∇f(x_∞)‖² = −∑_{i=1}^{n+1} λ_i ∇f(x_∞)^T v_∞^i ≤ 0,

which yields ∇f(x_∞) = 0. The proof of this theorem is complete.
Remark 9. Although Theorem 8 requires assumption (A1), in practice we do not solve derivative-free problems to that accuracy, so it suffices to assume that f is continuously differentiable near the stationary point.

Numerical Experiments
In this section, we report numerical results for Algorithm 5. Our tests were performed on a PC with an Intel Core i5-3470 CPU (3.20 GHz, up to 3.60 GHz) and 8 GB RAM, using MATLAB 7.12.0. To compare our algorithm with Max-PRP, we work with the performance profiles [30] and data profiles [31] for derivative-free optimization. The performance profile of a solver s is the fraction

ρ_s(α) = (1/|P|) |{p ∈ P : r_{p,s} ≤ α}|,

where r_{p,s} is the ratio of the number of function evaluations solver s needs on problem p to the lowest such number among all solvers, and the data profile is

d_s(κ) = (1/|P|) |{p ∈ P : t_{p,s} ≤ κ (n_p + 1)}|,

where t_{p,s} is the number of function evaluations solver s needs on problem p and n_p is the number of variables in p ∈ P. We use the following convergence condition:

f(x) ≤ f_L + τ (f(x_0) − f_L),

where x_0 is the initial point for the test problem, τ > 0 is a tolerance, f_L is the best function value achieved by any solver within μ_f function evaluations, and μ_f is a positive integer. The benchmark problem set P in our experiments is proposed in [32, 33] and the CUTEr test problem set [34]. The set P includes 78 nonlinear least squares problems (P_1) and 60 general nonlinear programming problems (P_2). Tables 1 and 2 show some information about the test problems, where n_p is the number of variables and m_p is the number of components. The problems of Table 1 are of the least squares form

f(x) = ∑_{k=1}^{m_p} f_k(x)²,

and the problems of Table 2 are general unconstrained problems min f(x). In all problems, we have 2 ≤ n_p ≤ 200, p = 1, ..., 138.
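For reference, both profiles can be computed from a table of evaluation counts as in the following sketch; the layout T[p, s] (the number of evaluations solver s needs to satisfy the convergence condition on problem p, ∞ on failure) is an illustrative convention, not the paper's code:

```python
import numpy as np

def performance_profile(T, alphas):
    """rho_s(alpha): fraction of problems whose ratio r_{p,s} = T[p,s] / min_s T[p,s]
    is at most alpha (assumes every problem is solved by at least one solver)."""
    R = T / T.min(axis=1, keepdims=True)
    return np.array([[np.mean(R[:, s] <= a) for a in alphas]
                     for s in range(T.shape[1])])

def data_profile(T, nvars, kappas):
    """d_s(kappa): fraction of problems solved within kappa * (n_p + 1)
    evaluations, i.e. within a budget of kappa simplex gradients."""
    return np.array([[np.mean(T[:, s] <= k * (nvars + 1)) for k in kappas]
                     for s in range(T.shape[1])])
```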
In the RBF interpolation model (8), we take π_1, ..., π_ℓ as linear polynomial tails and set φ(‖x − y^i‖) = ‖x − y^i‖³ (the cubic RBF). In addition, we set p_max = 3n as the maximum number of points considered in the interpolation data set Y = {y^1, y^2, ..., y^{p_max}}. All previously evaluated points are used to compute the RBF interpolation model when their number is lower than p_max. Similarly to [20], whenever there are more previously evaluated points than p_max for building the RBF interpolation model, 80% of the desired points are selected as the ones nearest to the current iterate and the remaining 20% are chosen as the ones farthest from the current iterate. This strategy is adopted in order to preserve the geometry of the sample set and to diversify the information used in the RBF interpolation model. In Figure 1, we show the performance profiles of Algorithm 5 and Max-PRP. As can be seen, Algorithm 5 outperforms Max-PRP when τ = 10⁻³, and the difference is especially large for small values of the performance ratio α. In addition, Algorithm 5 achieves better results than Max-PRP when τ = 10⁻⁵. For example, with performance ratio α = 16, Algorithm 5 solves about 90% of the test problems, while Max-PRP solves no more than 85%.
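The 80%/20% selection rule can be sketched as follows; the function name and the history layout (one previously evaluated point per row) are illustrative:

```python
import numpy as np

def select_interpolation_points(history, xk, p_max):
    """Keep at most p_max previously evaluated points: the 80% nearest
    to xk plus the 20% farthest away, to preserve geometry and diversity."""
    X = np.asarray(history)
    if len(X) <= p_max:
        return X
    order = np.argsort(np.linalg.norm(X - xk, axis=1))  # near -> far
    n_near = int(round(0.8 * p_max))
    return np.vstack([X[order[:n_near]], X[order[len(X) - (p_max - n_near):]]])
```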
The data profiles of Algorithm 5 and Max-PRP are reported in Figure 2. When the number of simplex gradients κ is larger than 40, Algorithm 5 performs better than Max-PRP, as it solves a higher percentage of problems. For example, with a budget of 400 simplex gradients and τ = 10⁻⁵, Algorithm 5 solves almost 90% of the problems, while Max-PRP solves roughly 85% of the problems.

Conclusion
The computational results presented in this paper show that Algorithm 5 is quite competitive. The performance profiles and data profiles of the numerical results indicate that Algorithm 5 often reduces the number of function evaluations required to reach a stationary point and is superior to Max-PRP.

Table 1: The information about the benchmark problem set P_1.

Table 2: The information about the benchmark problem set P_2.