ON CONVERGENCE OF A GLOBAL SEARCH STRATEGY FOR REVERSE CONVEX PROBLEMS

Based on global optimality conditions for reverse convex problems (RCP), we develop a global search strategy (GSS) and focus on the global convergence of GSS, giving a variant of the proof.


Introduction
Nowadays, specialists in optimization observe persistent demand from the world of applications for an effective apparatus for finding a global solution to nonconvex problems, in which there may exist local solutions located very far from a global one, even with respect to the values of the goal function.
As is well known, the conspicuous limitation of convex optimization methods applied to nonconvex problems is their tendency to be trapped at a local extremum, or even at a critical point, depending on the starting point. In other words, the classical apparatus proves inoperative for new problems arising from practice.
That is why the development of nonconvex optimization followed the path of tools initially generated in, and for, discrete optimization (DO). So DO gave some apparatus to continuous optimization (CO). Gradually, ideas such as branch and bound, cuts, and so forth became very popular in the nonconvex area of CO, although in some cases they turned out to be too sensitive, for instance, with respect to changes in the size of a problem.
In [8,9,10,11,12] another approach to d.c. programming problems was proposed, based on global optimality conditions (GOC), which has proved effective for numerous continuous (even dynamical) and discrete optimization problems [11,13,14,15]. underline the value of new notion or the theoretical side of investigations and its resolving impact on the convergence proof of GSS.
The paper is organized as follows. First, we recall the GOC for the simplest case of reverse convex problems ($x \in \mathbb{R}^n$):

$$(P): \quad f(x) \to \min_x, \quad x \in S, \quad g(x) \ge 0,$$

and give the necessary and sufficient conditions for a sequence $\{z^k\}$ of feasible points to be minimizing for (P). After this, we discuss some features of the global search strategy (GSS) for (P), and finally we study the convergence of GSS independently of the choice of starting point.

Global optimality conditions and minimizing sequences
In the sequel we assume that the goal function of problem (P) is bounded from below on the feasible set, that is,

$$f_* := \inf\{f(x) \mid x \in S,\ g(x) \ge 0\} > -\infty.$$

In addition, suppose that the function $g(\cdot)$ is differentiable over some convex open set $\Omega$ containing $S$.
Besides, the following assumption seems natural for the reverse convex problem (P) [1,5,6,7,16,17]: (G) there does not exist any global minimum point $x^*$ such that $g(x^*) > 0$. In other words,

$$\mathrm{Sol}(P) \subset \{x \in S \mid g(x) = 0\},$$

where Sol(P) := Argmin(P). Actually, if condition (G) is violated, problem (P) ceases to be reverse convex and might be solved by a suitable convex optimization method, say, in the case of convexity of $f(\cdot)$ and $S$.
Theorem 2.1 [9,10]. Let assumption (G) be fulfilled and a point $z$ be a solution to (P) ($z \in \mathrm{Sol}(P)$). Then the following condition holds:

$$(\mathcal{E}_0): \quad \forall y:\ g(y) = 0, \qquad \langle \nabla g(y), x - y \rangle \le 0 \quad \forall x \in S:\ f(x) \le f(z). \tag{2.3}$$

At first sight, condition $(\mathcal{E}_0)$ has no relation to classical extremum theory. But if one sets $y = z$, $S = \mathbb{R}^n$, it follows from $(\mathcal{E}_0)$ that $z$ is a solution to the problem

$$\langle \nabla g(z), x \rangle \to \max_x, \quad f(x) \le f(z), \tag{2.4}$$

and that is why the Lagrange rule takes place:

$$\lambda_0 \nabla f(z) + \lambda \nabla g(z) = 0, \tag{2.5}$$

with $\lambda_0 \ge 0$, $\lambda \le 0$, $\lambda_0 + |\lambda| > 0$. Clearly, (2.5) is the well-known result from nonlinear programming [2,3,4,18] for problem (P) with $S = \mathbb{R}^n$. Hence (2.3) is connected with optimization theory.

On the other hand, optimality condition $(\mathcal{E}_0)$ possesses the so-called algorithmic property (AP), as do all classical optimality conditions. This means that if the optimality condition is violated at a point $z$ under study, then there exists a procedure for constructing a feasible point that is better than $z$. Indeed, if there is a pair $(v,u)$ such that $g(v) = 0$, $u \in S$, $f(u) \le f(z)$ and $\langle \nabla g(v), u - v \rangle > 0$, then due to convexity of $g(\cdot)$ one has $g(u) - g(v) \ge \langle \nabla g(v), u - v \rangle > 0$, that is, $g(u) > 0 = g(v)$. Thus, one has the feasible point $u \in S$, $g(u) > 0$, $f(u) \le f(z)$. Therefore, by virtue of assumption (G), it is possible to decrease the value of the objective function by descending to the constraint $g(x) = 0$, that is, to construct a better feasible point $\hat z \in S$ such that $g(\hat z) = 0$, $f(\hat z) < f(u) \le f(z)$. This can be carried out by one of the classical optimization methods, since the constraint $g \ge 0$ is not active at the point $u \in S$, $f(u) \le f(z)$.
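The algorithmic property discussed after Theorem 2.1 can be illustrated numerically. The following Python sketch is only an illustration under assumptions that are not taken from the paper: the toy data $S = \mathbb{R}^2$, $g(x) = \|x\|^2 - 1$, $f(x) = \|x - a\|^2$ with $a = (0.5, 0)$, and the helper names `slope` and `violates_condition` are all hypothetical. It checks whether a given pair $(v,u)$ violates the optimality condition at $z$, and confirms the convexity inequality that places $u$ strictly inside the region $g > 0$.

```python
def g(x):
    # Assumed convex function of the reverse convex constraint g(x) >= 0.
    return x[0] ** 2 + x[1] ** 2 - 1.0

def grad_g(x):
    return (2.0 * x[0], 2.0 * x[1])

def f(x):
    # Assumed toy objective: squared distance to a = (0.5, 0).
    return (x[0] - 0.5) ** 2 + x[1] ** 2

def slope(v, u):
    # Value of <grad g(v), u - v>.
    gv = grad_g(v)
    return gv[0] * (u[0] - v[0]) + gv[1] * (u[1] - v[1])

def violates_condition(z, v, u, tol=1e-12):
    # True if g(v) = 0, f(u) <= f(z) and <grad g(v), u - v> > 0,
    # i.e. the pair (v, u) certifies that z is not a global solution.
    return abs(g(v)) <= tol and f(u) <= f(z) + tol and slope(v, u) > tol

z, v, u = (-1.0, 0.0), (1.0, 0.0), (2.0, 0.0)
print(violates_condition(z, v, u))      # True
print(g(u) - g(v) >= slope(v, u) > 0)   # True: convexity gives g(u) > 0
```

In this toy instance the point $z = (-1, 0)$ satisfies $g(z) = 0$ but is not a global solution; the pair $(v,u)$ exposes this, and a descent from $u$ towards the surface $g = 0$ can then produce a strictly better feasible point.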
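Theorem 2.2 [9,10]. Suppose that in problem (P) the set $S$ is convex and the following assumptions hold:

$$\exists v \in \mathbb{R}^n:\ g(v) < 0, \tag{2.6}$$

$$(H): \quad \forall y \in S,\ g(y) = 0, \quad \exists h = h(y) \in S:\ \langle \nabla g(y), h - y \rangle > 0. \tag{2.7}$$

Then condition $(\mathcal{E}_0)$, where $y \in S$, becomes sufficient for $z$ to be a global solution to (P).

Remark 2.3. Assumption (2.6) seems natural, since it means that there exist points in $\mathbb{R}^n$ that are inadmissible with respect to the constraint $g(x) \ge 0$. Otherwise the constraint would be senseless.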
Remark 2.4. Assumption (2.7) means that if we drop the constraint $f(x) \le f(z)$, it immediately becomes possible to violate the basic inequality $\langle \nabla g(y), x - y \rangle \le 0$ in condition $(\mathcal{E}_0)$. So assumption (H) turns out to be similar to the regularity (or normality) conditions in the Lagrange multiplier rule, when, say, the Slater condition guarantees nontriviality ($\lambda_0 \ne 0$) of the multiplier corresponding to the objective function. If $\lambda_0 = 0$, the multiplier rule becomes senseless, because it expresses only some property of the constraints, for instance, the linear dependence of the constraint gradients, while the goal function is not involved in the optimality condition. Clearly, the objective function must enter any optimality condition in some influential form. In other words, an optimality condition that did not take the goal function into account would be senseless. Fortunately, in our case this is not so.

Alexander Strekalovsky 153
Proof. Suppose that for some sequence $\{z^k\} \in \mathcal{M}$ there does not exist any sequence $\{\zeta_k\}$ verifying the conditions above. This means that for every $k = 1,2,\dots$ there exists $x^k \in \mathbb{R}^n$ such that Then, due to Lemma 2.7, it follows from (2.17) and (2.18) On the other hand, the latter inequalities contradict condition (G1).
(3) Now, due to (2.27), for some number one has Hence, there exists $y^m \in \mathbb{R}^n$ such that $g(y^m) = g(z^m) =: g_m$, Besides, So, the KKT theorem takes place, that is, So, $\lambda > 0$ and $g(y^m) = g_m := g(z^m)$. In this case it follows from (2.37) Taking into account ( 1) we get On the other hand, Unifying the latter inequalities with (2.40), one finally has Whence, due to $(\mathcal{E})$, it follows that $\lim y^m = u$. Then, because of continuity of $g(\cdot)$ and (2.27), we obtain

$$0 \ge \lim g(z^m) = \lim g(y^m) = g(u) > 0, \tag{2.43}$$

which is impossible.

Global search strategy
In this section, we briefly recall the basic points of the conceptual global search algorithm advanced in [15]. Theorems 2.1-2.10 suggest considering the following problem: which is rather tight. That is why we decompose it into two consecutive problems: where u is an approximate solution to (3.2). Supposing these problems solvable, in [15] we advanced the following.
Step 1. Starting from the point $x^k \in S$, $g(x^k) \ge 0$, obtain a $\tau_k$-stationary point $z^k$ by means of a local search method (LSM).

Step 2. Construct an approximation of the level surface $g(y) = 0$.
Remark 3.1. To make the description of GSS more substantiated, we assume that (HL) for all $\delta > 0$, for all $z \in S$, $g(z) \ge 0$, for all $v$ : (HU) for all $\delta > 0$, for all $u \in S$ one can find a point $w$ :

Remark 3.2. It can easily be seen that the sequence $\{z^k\}$ generated by the GS strategy is a sequence of $\tau_k$-stationary (critical) points, according to the description of Step 1.

Remark 3.3.
When $\eta_k > 0$ (Step 6), due to convexity of $g(\cdot)$, one has

$$0 < \langle \nabla g(w^j), u^j - w^j \rangle \le g(u^j) - g(w^j) = g(u^j), \tag{3.10}$$

whence $g(x^{k+1}) \ge \eta_k > 0 = g(z^k) = g(w^i)$. On the other hand, from the description of Step 3 it follows that

$$f(x^{k+1}) \le \zeta_k := f(z^k), \quad x^{k+1} \in S, \quad g(x^{k+1}) > 0. \tag{3.11}$$

Thus, one has constructed a feasible point $x^{k+1} \in S$ that is not worse than $z^k$ (since $f(x^{k+1}) \le \zeta_k := f(z^k)$) and at which the constraint $g \ge 0$ is not active, $g(x^{k+1}) > 0$. In this case, starting a new local search at $x^{k+1}$ under an assumption of type (G), (G1), or when the constraint $g \ge 0$ is essential [1,7,16,17], we will get $z^{k+1}$, $g(z^{k+1}) = 0$, with the property This observation will have an important impact on the convergence of the GS strategy. Thus, the strategy above becomes relaxing, that is, it decreases the value of $f(\cdot)$ at every iteration when $\eta_k > 0$. Note that we suppose the use of some minimization method (say, Newton's method, an interior point method, and so on) in order to descend to the surface $g = 0$ with obligatory strict improvement of the goal function and without taking into account the constraint $g \ge 0$ (so, a descent free from the constraint $g = 0$). Therefore, we call this LSM the free descent procedure. More strictly, it can be reformulated as follows: (FD) there exists $\mu > 0$ such that for all $x \in S$, $g(x) > 0$, for all $\tau > 0$ one can find a $\tau$-critical point $\hat z = \hat z(x)$ such that It is easy to see that (3.13) is equivalent to

Remark 3.4. Returning to the description of the global search strategy, note that it cannot be viewed as an algorithm, because on Step 1 it is not specified what kind of LS algorithm has to be used, and on Step 3 one is free to choose any suitable method for solving the linearized problem, which must, however, be very fast, since one has to repeat it several times at every iteration.
Nevertheless, the GS strategy allows us not to lose sight of the basic moments of global search, whereas if we begin to specify some point, for instance, the solution of the linearized problem, we risk getting lost in particulars.
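To make the conceptual loop more tangible, the following Python sketch runs it on a toy instance. It is a minimal sketch under stated assumptions, not the strategy of [15] itself: the data ($S = \mathbb{R}^2$, $g(x) = \|x\|^2 - 1$, $f(x) = \|x - a\|^2$, $a = (0.5, 0)$), the function names, the uniform grid of level-surface points, the closed-form solution of the linearized problem, and the segment descent used as a free descent procedure are all hypothetical choices made for illustration. Only the overall pattern follows the description in this section: local search to a point with $g = 0$, an approximation of the level surface, the linearized problems, and the test on the sign of $\eta_k$.

```python
import math

A = (0.5, 0.0)  # hypothetical centre of the toy objective, g(A) < 0

def f(x):
    return (x[0] - A[0]) ** 2 + (x[1] - A[1]) ** 2

def g(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0

def free_descent(x):
    # Move from x (g(x) >= 0) straight towards A until g = 0 is reached;
    # f decreases along this segment, so the result is not worse than x.
    d = (A[0] - x[0], A[1] - x[1])
    qa = d[0] ** 2 + d[1] ** 2
    qb = 2.0 * (x[0] * d[0] + x[1] * d[1])
    qc = g(x)
    t = (-qb - math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return (x[0] + t * d[0], x[1] + t * d[1])

def level_points(n):
    # Assumed approximation of the level surface g = 0: n points on the circle.
    return [(math.cos(2.0 * math.pi * i / n), math.sin(2.0 * math.pi * i / n))
            for i in range(n)]

def linearized_max(w, zeta):
    # Maximise <grad g(w), x - w> over {x : f(x) <= zeta}; for this toy
    # instance the maximiser is A + sqrt(zeta) * w because |w| = 1.
    r = math.sqrt(zeta)
    u = (A[0] + r * w[0], A[1] + r * w[1])
    eta = 2.0 * (w[0] * (u[0] - w[0]) + w[1] * (u[1] - w[1]))
    return eta, u

def global_search(x0, n=8, tol=1e-9, max_iter=50):
    z = free_descent(x0)                      # local search
    for _ in range(max_iter):
        zeta = f(z)
        eta, u = max((linearized_max(w, zeta) for w in level_points(n)),
                     key=lambda pair: pair[0])
        if eta <= tol:                        # no improving pair found: stop
            return z
        z = free_descent(u)                   # eta > 0: restart from u
    return z

print(global_search((-2.0, 0.0)))  # approximately (1.0, 0.0)
```

Starting from $x^0 = (-2, 0)$, the first local search stops at $(-1, 0)$, which is a critical point of the toy problem but not a global solution; the level-surface point $w = (1, 0)$ then yields $\eta > 0$ and the loop escapes to $(1, 0)$. For a general problem, a finite grid of this kind gives no guarantee, which is exactly why the quality of the approximation on Step 2 matters.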

Convergence of global search strategy
As pointed out above, verifying whether a feasible point z is a global solution or not can be reduced to solving problem (3.1), which in turn can be partially performed by the global search strategy.
Clearly, the choice of methods for solving problems (3.2) and (3.3), as well as of the local search method on Step 1 of GSS, must be skillful but is nevertheless standard, in a sense, and "already seen" [2,3,4,5,6,7,18]. At the same time, constructing a "good" approximation $\mathcal{A}_k$ of the level surface g(x) = 0 on Step 2 turns out to be of paramount importance from the viewpoint of real global search, as numerous computational experiments show [8,11,13,14,15].
If one is able to construct a "good" approximation on Step 2 of GSS, one can escape from any stationary point and finally conclude that the obtained point z is an approximate global solution [11,13,14,15].
Let us look at the situation from the theoretical point of view and show the decisive impact of a "good" approximation on the convergence of the GS strategy.
To do so, set $\zeta := f(z)$ and consider an approximation Suppose some points $u^i \in S$, $f(u^i) \le \zeta$, $w^i \in \mathbb{R}^n$, $g(w^i) = 0$ verify the inequalities (3.6) and (3.7) according to assumptions (HL) and (HU), respectively, with $\delta_k := \delta$.
Let us consider, together with problem (P), the dual (according to Tuy [5,7,16,17]) problem of convex maximization: Recall that, due to Tuy [5, Proposition 10, page 166], the point $z \in S$ is a solution to (P) if and only if where $V(\beta)$ is the optimal value function of problem $(Q_\beta)$. Assume that the optimal value function $V(\beta)$ is upper Lipschitzian at $f_*$, that is, there exist two constants $\bar\beta$ and $M > 0$ such that for every $\beta$ : $f_* \le \beta < \bar\beta$ the following estimate holds Whence, by virtue of (4.6), it follows Note that estimate (4.8), as is well known [2,3,4], is rather natural for problem $(Q_\beta)$, since for a Lipschitzian function (in $(Q_\beta)$ it is convex, hence Lipschitzian), a bounded set $S$ and a continuous function $f(\cdot)$, the stability property in a neighborhood of zero takes place for the optimal value function $V(\cdot)$. See, for instance, [2], Corollaries 1 and 2 of Theorem 6.3.2, as well as Sections 6.4 and 6.5.
Recall also that any normality condition for problem (Q β ), for instance, the Slater or Mangasarian-Fromovitz regularity conditions, guarantees the Lipschitz property of the optimal value function, or the stability of problem (Q β ) with respect to the simplest perturbation of the inequality constraint [2]. Therefore, one can say that assumption (4.8) by no means restricts the generality of the consideration of problems (P) and (Q β ).
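The upper Lipschitz property can be checked by hand on a small example. The Python sketch below is illustrative only and rests on two assumptions about formulas whose precise form is not fixed here: that problem (Q β ) maximizes $g$ over $\{x \in S : f(x) \le \beta\}$, and that the estimate reads $V(\beta) - V(f_*) \le M(\beta - f_*)$. The toy data ($S = \mathbb{R}^2$, $g(x) = \|x\|^2 - 1$, $f(x) = \|x - a\|^2$ with $\|a\| = 0.5$, so that $f_* = 0.25$) and the function names are hypothetical. For this instance the maximum of $g$ over the ball $\{x : f(x) \le \beta\}$ is available in closed form.

```python
import math

A_NORM = 0.5   # |a| for the assumed toy objective f(x) = |x - a|^2
F_STAR = 0.25  # optimal value of the toy reverse convex problem

def V(beta):
    # Assumed dual value function: max of g(x) = |x|^2 - 1 over the ball
    # {x : |x - a|^2 <= beta}, attained at distance |a| + sqrt(beta) from 0.
    return (A_NORM + math.sqrt(beta)) ** 2 - 1.0

def upper_lipschitz_holds(M, beta_bar, samples=1000):
    # Check V(beta) - V(f*) <= M * (beta - f*) on a grid of [f*, beta_bar).
    for i in range(samples):
        beta = F_STAR + (beta_bar - F_STAR) * i / samples
        if V(beta) - V(F_STAR) > M * (beta - F_STAR) + 1e-12:
            return False
    return True

print(V(F_STAR))                        # 0.0
print(upper_lipschitz_holds(2.0, 1.0))  # True
print(upper_lipschitz_holds(1.0, 1.0))  # False
```

Here $V$ vanishes at $f_*$, in agreement with the duality statement recalled above, and $V$ is concave in $\beta$ with slope 2 at $f_*$, so $M = 2$ works on the whole interval while $M = 1$ already fails close to $f_*$.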
In the sequel the following result will be rather useful.

Proof. First, from the definition of the function ϕ(·) and due to convexity of g(·), it follows Whence, with the help of (4.9), one has If now inequality (4.5) is violated, then On the other hand, in this case, according to the definition of a strictly resolving set, we have Then from (4.12) it follows which contradicts (4.4).
Clearly, in the case of a merely resolving set and nonstrict inequalities the proof is similar.
Notes. First, Lemma 4.2 gives the concordance condition for the computation parameters ∆, ε and Θ : ε ≥ MΘ∆, where ∆ is the accuracy of solving problem (P), ε stands for the exactness of the basic inequality (4.5), and Θ is the share of solving the auxiliary problem (3.1).
Second, according to this lemma, when applying the global search strategy one can look only at the number $\eta_k = \eta(\zeta_k)$, $\zeta_k := f(z^k)$, neglecting (4.5). This completely corresponds to the description of GSS, in which inequality (4.5) is not used. The latter will be satisfied without fail if $\eta_k > 0$ ($\eta_k \ge 0$), when a strictly (merely) resolving set $\mathcal{R}(\zeta_k)$ is used at every iteration of GSS on Step 2.
Therefore, for applying the GS strategy (or conceptual algorithm) to problem (P), the following assumption seems natural.
Note that when using merely resolving sets, we have to replace in the description of GSS the strict inequality $\eta_k > 0$ (on Step 6) by the nonstrict one, and the inequality $\eta_k \le 0$ (on Step 7) by the strict one, correspondingly.
Since the notion of a resolving approximation plays the crucial role in the proof of convergence of the GS strategy, at each iteration of which a resolving set is constructed on Step 2, we will call such a conceptual algorithm, for short, the $\mathcal{R}$-strategy.
Further, let us consider the following assumptions: where a convex function g(·) is differentiable over Ω.
Then the sequence $\{z^k\}$ generated by the $\mathcal{R}$-strategy turns out to be minimizing for problem (P).
Moreover, every cluster point $z$ of the sequence $\{z^k\}$ yields the infimum of $f(\cdot)$ over the feasible set of problem (P).
In the case of a closed set $S$ this cluster point turns out to be a global solution to (P).
Proof (for strictly resolving sets). For the case of merely resolving sets the proof is similar, with the corresponding change of inequalities from strict to nonstrict and vice versa.
(a) For the case $\eta_k = \eta(\zeta_k) \le 0$ for all $k \ge k_0 \ge 0$, from the definition of a resolving set it follows By construction, $z^{k+1}$ is a $\tau_{k+1}$-critical point, $g(z^{k+1}) = 0$, obtained by some method of local search starting from $x^{k+1} := u^{j_k} \in S$, $f(x^{k+1}) \le f(z^k)$. Besides, where $g(w^i) = g(w^{j_k}) = 0$, $i = 1,\dots,N_k$, $k = 0,1,2,\dots$. Therefore, due to convexity of $g(\cdot)$, one has

$$0 < \eta_k \le g(u^{j_k}) - g(w^{j_k}) = g(x^{k+1}). \tag{4.23}$$

In addition, by virtue of condition (FD) Because $f(\cdot)$ is bounded from below over the feasible set of problem (P), the sequence $\{f(z^k)\}$ converges. Due to (4.26), this implies that $\{f(x^k)\}$ also converges to the same limit. Therefore, it follows from (4.25) that $g(x^k) \downarrow 0$. In turn, by virtue of (4.23), this implies On the other hand, since the approximation $\mathcal{R}_k = \mathcal{R}(\zeta_k)$ is strictly $(\Delta_k,\delta_k,\varepsilon_k,\theta_k)$-resolving, according to Lemma 4.2 it follows from the inequality $\eta_k > 0$ (4.28) From the latter inequality, due to (4.17) and ( is also decomposed into two subsequences $\{z^{k_s}\}$ and $\{z^{k_t}\}$. Moreover, both subsequences are minimizing according, respectively, to parts (a) and (b) of the proof. Therefore, the whole sequence $\{z^k\}$ proves to be minimizing for problem (P).
To conclude, let us say a few words about the existence of the resolving set, the crucial role of which we observed while proving the convergence of GSS.
For some simple problems, as presented for instance in [8,12], we are able to construct a weakly resolving set, that is, one for which inequality (4.3) implies only (4.4) (without (4.5)).
At the same time, the existence of a weakly resolving set is shown in the sufficiency proof of Theorem 2.2 [8,9,10,12]. There, supposing that there exists a feasible element $u \in S$, $g(u) \ge 0$, which is better than the point under study, $f(u) < f(z)$, one finds a point $y$, $g(y) = 0$, such that $\langle \nabla g(y), u - y \rangle > 0$. (Theorem 2.10 was proven in the same manner.) This means that the set $W = \{y\}$, consisting of only one point, is a weakly resolving approximation for $z$.
Moreover, we are able to prove that if $z$ is not a critical point in the sense, say, of the condition (see (2.5))

$$\langle \nabla f(z) - \lambda \nabla g(z), x - z \rangle \ge 0 \quad \forall x \in S, \tag{4.30}$$

where $g(z) = 0$, then the set $W = \{z\}$, consisting of only one point, turns out to be weakly resolving. But this is a topic for subsequent papers (see [12]).
As a consequence, we can replace assumption (HR) by a weaker one concerning the construction of only a weakly resolving approximation. How this can be done for convex maximization problems, reverse convex problems, and d.c. minimization problems can be seen in [8,10,11,12,13,14,15].

Conclusion
In this paper, for a minimization problem with one reverse convex constraint, (i) the optimality conditions for a global solution were recalled; (ii) necessary and sufficient conditions for a sequence of feasible points to be minimizing were given; (iii) a global search strategy for finding a global solution was proposed; (iv) the new notion of a resolving set was advanced and some of its features were studied; (v) the convergence of the proposed global search strategy was proved.