1. Introduction

AAA

Abstract and Applied Analysis

1687-0409 1085-3375

Hindawi Publishing Corporation

376403

10.1155/2013/376403

376403

Research Article

A Decomposition Method with Redistributed Subroutine for Constrained Nonconvex Optimization

Yuan

¹ Wang

Wei

² Pang

Li-Ping

³ Li

Dan

³ Combes

Jean M.

School of Sciences

Shenyang University

Shenyang 110044

China

syu.edu.cn

School of Mathematical

Liaoning Normal University

Dalian 116029

China

lnnu.edu.cn

School of Mathematical Sciences

Dalian University of Technology

Dalian 116024

China

dlut.edu.cn

2013

26 2 2013

2013 06 09 2012 08 12 2012 13 12 2012

2013

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A class of constrained nonsmooth nonconvex optimization problems, that is, piecewise C2 objectives with smooth inequality constraints are discussed in this paper. Based on the 𝒱𝒰-theory, a superlinear convergent 𝒱𝒰-algorithm, which uses a nonconvex redistributed proximal bundle subroutine, is designed to solve these optimization problems. An illustrative example is given to show how this convergent method works on a Second-Order Cone programming problem.

1. Introduction

Consider the following constrained nonsmooth convex program: (1)min f(x)s.t. gj(x)≤0, j∈J=m+1,…,l, where f is convex and piecewise C2 and gj, j∈J are convex of class C2.

Many approaches are proposed for solving this program. For example, we have converted it into an unconstrained nonsmooth convex program via the exact penalty function in [1]. And we have showed that the objective function of this unconstrained optimization problem is a particular case of function with a primal-dual gradient structure, a notion related to the 𝒱𝒰-space decomposition. Based on the 𝒱𝒰-theory, we have designed an algorithm frame which converges with local superlinear rate.

Yet, very little systematic research has been performed on extending this convex program to a nonconvex framework. The purpose of this paper is to study the following nonconvex program: (2)min f(x)s.t. gj(x)≤0, j∈J=m+1,…,l, where f is piecewise C2 and gj,j∈J are of class C2. Based on the 𝒱𝒰-decomposition theory, which is first introduced in [2] for convex functions, and further studied in [3–13]. We give a 𝒱𝒰-algorithm using a redistributed proximal bundle subroutine to generate a sequence of approximate proximal points. When a primal-dual track exists, these points approximate the primal track points and give the algorithm's 𝒱-steps. And this subroutine also approximates dual track points that are 𝒰-gradients needed for the algorithm's 𝒰-Newton steps. The interest in devising 𝒱𝒰-algorithm for (2) lies on the “smoothing” effect of 𝒰-subspace and its potential to speed up the algorithm's convergence under certain conditions.

The rest of the paper is organized as follows. Section 2 breaks into two subsections. In the first part, the nonconvex program (2) is transformed into an unconstrained problem by means of the exact penalty function. Based on the Clarke subdifferential of the objective function of this unconstrained problem, we obtain the 𝒱𝒰-space decomposition. The second part of Section 2 is devoted to deal with the primal-dual function and its second-order properties. Section 3 designs a conceptual Algorithm 10 and gives its convergence theorem. When a primal-dual track exists, we substitute the 𝒱-step in Algorithm 10 with the redistributed proximal bundle subroutine. In the final section, this algorithm is applied to the Second-Order Cone programming problem to emphasis the theoretical findings.

2. The <inline-formula><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" id="M26"><mml:mi>𝒱</mml:mi><mml:mi>𝒰</mml:mi></mml:math></inline-formula>-Decomposition Results 2.1. The <inline-formula><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" id="M27"><mml:mi>𝒱</mml:mi><mml:mi>𝒰</mml:mi></mml:math></inline-formula>-Space Decomposition

In program (2), f is piecewise C2. Specifically, for all x∈Rn, f is continuous and there exists a finite collection of C2 functions fi:Rn→R, i∈I such that (3)f(x)∈{fi(x)∣i∈I={0,…,m}}. We refer to the function fi, i∈I, as structure functions.

The Clarke subdifferential of f at a point x∈Rn, denoted by ∂-f(x), can be computed in terms of the gradients of the structure functions that are active at x; see [14, Lemma 1]. More precisely, (4)∂-f(x)= conv{g∈Rn∣g=∑i∈I(x)αi∇fi(x),α∈Δ|I(x)|}, where (5)I(x)={i∈I∣f(x)=fi(x)} is the set of active indices at x and (6)Δs={α∈Rs∣αi≥0,∑i=1s‍αi=1}.

Let x-∈Rn be a solution of (2). By continuity of the structure functions, there exists a ball Bε(x-)⊆Rn such that (7)∀x∈Bε(x-), I(x)⊆I(x-). For convenience, we assume that the cardinality of I(x-) is m1+1 and reorder the structure functions, so that (8)I(x-)={0,…,m1}. From now on, we consider that (9)∀x∈Bε(x-), f(x)∈{fi(x)∣i∈I(x-)}.

Let F(x,ρ) denote the exact penalty function of (2) with g0(x)=0 and ∇g0(x)=0, where ρ>0 is a penalty parameter. More precisely, (10)F(x,ρ)=f(x)+ρG(x), where (11)G(x)=max{g0(x),gm+1(x),…,gl(x)}.

Call (12)J(x)={j∈J∣F(x,ρ)=f(x)+ρgj(x)} the set of indices realizing the max at x.

The following assumptions and definitions will be used in the rest of this paper.

Assumption 1.

The set (13){∇fi(x-)-∇f0(x-)}0≠i∈I(x-)∪{∇gj(x-)}j∈J(x-) is linearly independent.

Assumption 2.

Given x0∈RN and M0≥0 there exists an open bounded set 𝒪 and a function H such that, ℒ0:={x∈Rn∣f(x)≤f(x0)+M0}⊂𝒪, H is lower-C2 on 𝒪 satisfying H≡f on ℒ0.

Definition 1 (see [<xref ref-type="bibr" rid="B15">15</xref>, Definition 10.29]).

The function f is lower-𝒞2 on an open set V if for each x-∈V there is a neighbourhood V of x- upon which a representation f(x)=maxt∈Tft(x) holds, where T is a compact set and the functions ft are of class 𝒞2 on V such that ft, ∇ft, and ∇2ft depend continuously not just on x∈V but jointly on (t,x)∈T×V.

Lemma 2 (see [<xref ref-type="bibr" rid="B16">19</xref>, Proposition 1]).

If Assumption 2 holds, then f is bounded below and prox-bounded.

Definition 3 (see [<xref ref-type="bibr" rid="B17">16</xref>, Definition 1]).

Given a lower semicontinuous function f, a point x-∈Rn where f(x-) is finite and ∂-f(x-) is nonempty, and an arbitrary subgradient g∈∂-f(x-), the orthogonal subspaces (14)𝒱:=lin(∂-f(x-)-g), 𝒰:=𝒱⊥ define the 𝒱𝒰-space decomposition, and Rn=𝒰⊕𝒱, where ⊕ is the direct sum of space decomposition.

Theorem 4.

Suppose Assumption 1 holds. Then one has the following results at x-:(i)

the Clarke subdifferential of F(x-,ρ) has the following expression: (15)∂-F(x-,ρ)=∑i∈I(x-)‍αi∇fi(x-)+∑j∈J(x-)‍βj∇gj(x-), where α∈Δ|I(x-)|; βj≥0, j∈J(x-) and ∑j∈J(x-)‍βj≤ρ;

(ii)

let 𝒱 denote the subspace generated by the Clarke subdifferential ∂-F(x-,ρ). Then (16)𝒱=lin{{∇fi(x-)-∇f0(x-)}0≠i∈I(x-)∪{∇gj(x-)}j∈J(x-)},𝒰={∇gj j∈Jd∈Rn∣〈d,∇fi(x-)-∇f0(x-)〉∇=〈d,∇gj(x-)〉=0, 0≠i∈I(x-), j∈J(x-)}.

Proof.

Since f(x) defined in (2) belongs to the PDG-structured family and by Lemma 2.1 in [16] the Clarke subdifferential of F(x,ρ) at x- can be formulated by (17)∂-F(x-,ρ)=∂-f(x-)+ρ∂-G(x-)=∂-f(x-)+ρ conv{ ∇gj(x-)∣j∈J(x-)∪{0}}=∑i∈I(x-)‍αi∇fi(x-)+ρ∑j∈J(x-)∪{0}‍λj∇gj(x-), where α∈Δ|I(x-)|; λj≥0, j∈J(x-)∪{0}, and ∑j∈J(x-)∪{0}‍λj=1.

Together with ∇g0(x-)=0, there exists (18)∂-F(x-,ρ)=∑i∈I(x-)‍αi∇fi(x-)+ρ[λ0·0+∑j∈J(x-)‍λj∇gj(x-)]=∑i∈I(x-)‍αi∇fi(x-)+∑j∈J(x-)‍βj∇gj(x-), where βj=ρλj≥0, j∈J(x-)∪{0} and ∑j∈J(x-)‍βj=ρ-β0≤ρ.

Letting α0=1; αi=0,0≠i∈I(x-) and β0=ρ; βj=0, j∈J(x-), we have ∇f0(x-)∈∂-F(x-,ρ). Then it follows from the definition of space 𝒱 in Definition 3(19)𝒱= lin(∂-F(x-,ρ)-∇f0(x-))= lin{{∇fi(x-)-∇f0(x-)}0≠i∈I(x-)∪{∇gj(x-)}j∈J(x-)}, and 𝒰=𝒱⊥ means that the second formula holds.

Remark 5.

(i) Since the subspaces 𝒰 and 𝒱 generate the whole space Rn, every vector can be decomposed along its 𝒱𝒰-components at x-. In particular, any x∈Rn can be expressed as (20)Rn∋x=x-+u⊕v=x-+U-u+V-v, where V-=[{∇fi(x-)-∇f0(x-)}0≠i∈I(x-)∪{∇gj(x-)}j∈J(x-)] and U-=V-⊥.

(ii) For any s-∈∂-F(x-,ρ), we have (21)s-=s-𝒰⊕s-𝒱=U-Ts-+V-Ts-. From Theorem 4(ii), the 𝒰-component of a subgradients s∈∂-F(x-,ρ) is the same as that of any other subgradient at x-, that is, s-𝒰=U-Ts.

2.2. Primal-Dual Function and Its Second-Order Properties

In order to obtain a fast algorithm for (2), we will define an intermediate function. This function is called primal-dual function which is 𝒞2 about u∈𝒰.

Definition 6 (see [<xref ref-type="bibr" rid="B8">8</xref>, Definition 1]).

We say that (χ(u),γ(u)) is a primal-dual track leading to (x-,0), a minimizer of f and zero subgradient pair, if for all u∈Rdim𝒰 small enough(22)the primal track χ(u)=x-+u⊕v(u),the dual track γ(u)=arg min{|g|2:g∈∂f(χ(u))} satisfy the following:(i)

v:Rdim𝒰↦Rdim𝒱 is a 𝒞2 function satisfying V-v(u)∈W𝒰(u;g-𝒱) for all g-∈ri∂f(x-),

(ii)

the Jacobian Jχ(u) is a basis matrix for 𝒱(χ(u))⊥,

(iii)

the particular 𝒰-Lagrangian L𝒰(u;0) is a 𝒞2-function.

When we write v(u) we implicitly assume that dim𝒰≥1. If dim𝒰=0 we define the primal-dual track to be the point (x-,0). If dim𝒰=n then (χ(u),γ(u))=(x-+u,∇f(x-+u)) for all u in a ball about 0∈Rn.

Theorem 7.

Suppose the Assumption 1 holds. Then for all u small enough, the following hold:(i)

the nonlinear system, with variable v and the parameter u, (23)fi(x-+U-u+V-v)-f0(x-+U-u+V-v)=0, 0≠i∈I(x-),gj(x-+U-u+V-v)=0, j∈J(x-),

has a unique solution v=v(u) and v:Rdim𝒰↦Rdim𝒱 is a 𝒞2 function;

(ii)

primal track χ(u):=x-+u⊕v(u) is C2, with (24)Jχ(u)=U-+V-Jv(u)

and v(u) in (i) is C2, with (25)Jv(u)=-(V(u)TV-)-1V(u)TU-,

where (26)V(u)=[{∇fi(x)-∇f0(x)}0≠i∈I(x)∪{∇gj(x)}j∈J(x)].

In particular, χ(0)=x-, Jv(0)=0, and Jχ(0)=U-;

(iii)

f(χ(u))=fi(χ(u)), i∈I(x-) and G(χ(u))=0.

Proof.

Items (i) and (ii) follow from the assumption that fi, gj are C2 along the lines of [5, Theorem 5.1] and applying a Second-Order Implicit Function Theorem; see [17, Theorem 2.1]. The conclusion of (iii) can be obtained in terms of (i) and the definitions of G(x) and χ(u).

Lemma 8 (see [<xref ref-type="bibr" rid="B7">7</xref>, Theorem 4.5]).

Given g-∈∂-F(x-,ρ), the system with {αi(u)}i∈I(x-),{βj(u)}j∈J(x-)∪{0}(27)V-T[∑i∈I(x-)‍αi(u)∇fi(χ(u))+∑j∈J(x-)‍βj(u)∇gj(χ(u))-g-]=0,∑i∈I(x-)‍αi(u)=1,∑j∈J(x-)∪{0}‍βj(u)=ρ, has a unique solution. In particular, αi(0)=α-i, i∈I(x-) and βj(0)=β-j, j∈J(x-)∪{0}.

The following theorem gives the definition and properties of primal-dual function.

Theorem 9.

Given g-∈∂-F(x-,ρ) and supposing Assumption 1 holds, consider the primal-dual function: (28)LI(u,g-𝒱):=F(χ(u),ρ)-〈g-𝒱,v(u)〉𝒱.

Then for u small enough, the following assertions are true:(i)

LI is a C2 function of u;

(ii)

the gradient of LI is given by (29)∇LI(u;g-𝒱)=U-Tg(u), where (30)g(u)=∑i∈I(x-)‍αi(u)∇fi(χ(u))+∑j∈J(x-)‍βj(u)∇gj(χ(u)). In particular, when u=0, one has (31)∇LI(0;g-𝒱)=U-Tg(0)=U-Tg-, where (32)g(0)=∑i∈I(x-)‍α-i∇fi(x-)+∑j∈J(x-)‍β-j∇gj(x-);

(iii)

the 𝒰-Hessian of F is given by (33)∇2LI(u;g-𝒱)=Jχ(u)TM(u)Jχ(u), where (34)M(u)=∑i∈I(x-)‍αi(u)∇2fi(χ(u))+∑j∈J(x-)‍βj(u)∇2gj(χ(u)). In particular, when u=0, one has (35)∇2LI(0;g-𝒱)=U-TM(0)U-, where (36)M(0)=∑i∈I(x-)‍α-i∇2fi(x-)+∑j∈J(x-)‍β-j∇2gj(x-).

Proof.

(i) From Theorem 7(iii), we have (37)LI(u;g-𝒱)=F(χ(u),ρ)-〈g-𝒱,v(u)〉𝒱=fi(χ(u))-〈g-𝒱,v(u)〉𝒱. Since fi and v(u) are C2, (i) holds.

(ii) In view of the chain rule, differentiating the following system with respect to u: (38)LI(u;g-𝒱)=fi(χ(u))-〈 g-𝒱,v(u)〉𝒱,gj(χ(u))=0, j∈J(x-), we have (39)∇LI(u;g-𝒱)=Jχ(u)T∇fi(χ(u))-Jv(u)TV-Tg-,Jχ(u)T∇gj(χ(u))=0, j∈J(x-). Multiplying each equation by the appropriate αi(u) and βj(u), respectively, summing the results, and using the fact that ∑i∈I(x-)‍αi(u)=1 yields (40)∇LI(u;g-𝒱)=Jχ(u)Tg(u)-Jv(u)TV-Tg-, where (41)g(u)=∑i∈I(x-)‍αi(u)∇fi(χ(u))+∑j∈J(x-)‍βj(u)∇gj(χ(u)). Using the transpose of the expression of Jχ(u), we get (42)∇LI(u;g-𝒱)=U-Tg(u)+Jv(u)TV-T(g(u)-g-), which together with (6.11) in [5] yields the desired result.

In particular, if u=0, then v(0)=0 and χ(0)=x-. It follows from Remark 5(ii) that (43)∇LI(0;g-𝒱)=U-Tg(0)=U-Tg-, where (44)g(0)=∑i∈I(x-)‍α-i∇fi(x-)+∑j∈J(x-)‍β-j∇gj(x-).

(iii) Differentiating (ii) with respect to u, we obtain (45)∇2LI(u;g-𝒱)=U-TM(u)Jχ(u) +U-T[∑j∈J(x-)‍∑ji∈I(x-)‍αi(u)∇fi(χ(u))Jαi(u) =+U-Ts+∑j∈J(x-)‍βj(u)∇gj(χ(u))Jβj(u)], where (46)M(u)=∑i∈I(x-)‍αi(u)∇2fi(χ(u))+∑j∈J(x-)‍βj(u)∇2gj(χ(u)).

According to the proof of Theorem 6.3 in [5], we get (47)∑i∈I(x-)‍αi(u)∇fi(χ(u))Jαi(u)+∑j∈J(x-)‍βj(u)∇gj(χ(u))Jβj(u) =-V(u)(V-TV(u))-1V-TM(u)Jχ(u). Then (48)∇2LI(u;g-𝒱)=U-TM(u)Jχ(u)-U-TV(u)(V-TV(u))-1V-TM(u)Jχ(u)=U-TM(u)Jχ(u)+Jv(u)TV-TM(u)Jχ(u)=[U-T+Jv(u)TV-T]M(u)Jχ(u)=Jχ(u)TM(u)Jχ(u), when u=0, (49)∇2LI(0;g-𝒱)=U-TM(0)U-,where M(0)=∑i∈I(x-)‍α-i(u)∇2fi(x-)+∑j∈J(x-)‍β-j(u)∇2gj(x-).

3. Algorithm and Convergence Analysis

Supposing 0∈∂-F(x-,ρ), we give an algorithm frame which can solve (2). This algorithm makes a step in the 𝒱-subspace, followed by a 𝒰-Newton step in order to obtain superlinear convergence rate.

Algorithm 10 (algorithm frame).

Step 0 (Initialization). Given ε>0, choose a starting point x(0) close to x- enough, and a Clarke subgradient g~(0)∈∂-F(x(0),ρ), set k=0.

Step 1. Stop if (50)∥g~(k)∥≤ε.

Step 2. Find the active index set I(x-) and J(x-).

Step 3. Construct 𝒱𝒰-decomposition at x-, that is, Rn=𝒱⊕𝒰. Compute (51)∇2LI(0;0)=U-TM(0)U-, where (52)M(0)=∑i∈I(x-)‍α-i∇2fi(x-)+∑j∈J(x-)‍β-j∇2gj(x-).

Step 4. Perform 𝒱-step. Compute δ𝒱(k) which denotes v(u) in (23) and set x~(k)=x(k)+0⊕δ𝒱(k).

Step 5. Perform 𝒰-step. Compute δ𝒰(k) from the system (53)U-TM(0)U-δ𝒰+U-Tg~(k)=0, where (54)∑i∈I(x-)‍αi(u)∇fi(x~(k))+∑j∈J(x-)‍βj(u)∇gj(x~(k)) =g~(k)∈∂-F(x~(k),ρ), is such that V-Tg~(k)=0. Compute x(k+1)=x~(k)+δ𝒰(k)⊕0=x(k)+δ𝒰(k)⊕δ𝒱(k).

Step 6 (update). Set k=k+1, and return to Step 1.

Theorem 11.

Suppose the starting point x(0) close to x- enough and 0∈ri∂-F(x-,ρ), ∇2LI(0;0)≻0. Then the iteration points {x(k)}k=1∞ generated by the algorithm converge and satisfy (55)∥x(k+1)-x-∥=o(∥x(k)-x-∥).

Proof.

Let u(k)=(x(k)-x-)𝒰, v(k)=(x(k)-x-)𝒱+δ𝒱(k). It follows from Theorem 7(i) that (56)∥(x(k+1)-x-)𝒱∥=∥(x~(k)-x-)𝒱∥ =o∥(x(k)-x-)𝒰∥=o∥(x(k)-x-)∥. Since ∇2LI(0;0) exists and ∇LI(0;0)=0, we have from the definition of 𝒰-Hessian matrix that (57)∇LI(u(k);0)=U-Tg~(k)=0+∇2LI(0;0)u(k)+o(∥u(k)∥𝒰). By virtue of (53), we have ∇2LI(0;0)(u(k)+δ𝒰(k))=o(∥u(k)∥𝒰). It follows from the hypothesis ∇2LI(0;0)≻0 that ∇2LI(0;0) is invertible and hence ∥u(k)+δ𝒰(k)∥=o(∥u(k)∥𝒰). In consequence, one has (58)(x(k+1)-x-)𝒰=(x(k+1)-x~(k))𝒰+(x~(k)-x(k))𝒰+(x(k)-x-)𝒰=u(k)+δ𝒰(k)=o(∥u(k)∥𝒰)=o(∥x(k)-x-∥). The proof is completed by combining (56) and (58).

Since Algorithm 10 relies on knowing the subspaces 𝒰 and 𝒱 and converges only locally, it needs significant modification for implemental. Our 𝒱𝒰-algorithm defined below finds 𝒱-step by approximating equivalent proximal points.

Given a positive scalar parameter λ, the proximal point function depending on f is defined by (59)pλ(x):=arg minp∈Rn{f(p)+λ2∥p-x∥2} for x∈Rn. If Assumption 2 holds, then the proximal point pλ is single-valued; see [18, Theorem 1].

Corresponding to the primal track, the dual track is defined by (60)γ(u)= arg min{∥g∥2∣g∈∂-F(χ(u),ρ)}. For its properties, one can refer to [16].

The next theorem shows that 𝒱-steps in Algorithm 10 can be replaced by proximal steps, at least in the locality of a minimizer, if Assumptions 1 and 2 hold.

Theorem 12.

Suppose that Assumptions 1 and 2 hold, and that 0∈ri∂-G(x-). Then for all λ>0 sufficiently large and for any sequence xk→x-, one has (61)pλ(xk)=χ(uk),γk=arg min{∥g∥2∣g∈∂-F(pλ(xk),ρ)}, for all large k, where uk:=(pλ(xk)-x-)𝒰.

Proof.

Since gj are C2, j=m+1,…,l and gj are lower-C2. Functions defined by sums, maximums are lower-C2 [15, Example 10.35]; therefore F is lower-C2. From Lemma 2 and [15, Proposition 13.33], we have F is prox-bounded and -regular. Appling the definition of F and the fact f, gj are C2, respectively, we have that F is subdifferentially regular. So f is a function with pdg structure satisfying strong transversality and prox-regular at x-, and 0∈ri∂-G(x-), 0∈ri∂-F(x-,ρ), by [16, Theorem 5.3] we get the result.

In order to define a nonconvex 𝒱𝒰-algorithm for (2) problem, we will use a nonconvex bundle method to approximate proximal points. Many practically nonconvex bundle algorithms are modifications of some convex forerunner, with a fixed model function. Basically, such fixes consist in redefining linearization errors to enforce nonnegativity. However, a redistributed proximal bundle method for nonconvex optimization [19] based on [18] is a different picture. This work proposes an approach based on generating cutting-planes models, not of the objective function as most bundle methods do, but of a local convexification of the objective function. They deal with the augmented functions at x: (62)Fηnx(·,ρ):=F(·,ρ)+12ηn∥·-x∥2, where ηn denotes convexification parameter; in the following μn is model prox-parameter and λn strands for the prox-parameter, which satisfies λn=ηn+μn.

Bundle subroutine accumulates information from past points xi in the form (63)⋃i∈ℬ‍(ei,di,Δi,gi), where ℬ is some index set containing an index i such that xi=x, ei=F(x,ρ)-(F(xi,ρ)+〈gi,x-xi〉), di=(1/2)∥ xi-x∥2, Δi=xi-x, and gi∈∂F(xi,ρ). This information is used at each iteration to define a V-model underestimating Fηnx via the cutting-plane function. (64)φn(y)=F(x,ρ)+maxi∈ℬ{gi-(ei+ηndi) +〈(gi+ηnΔi),y-x〉}. To approximate a proximal point we solve a first quadratic programming subproblem χ-QP, which has the following form and properties.

The problem χ-QP(65)min r+12μn∥ p-x∥2s.t. r≥F(x,ρ)-(ei+ηndi)s.t. r≥+〈(gi+ηnΔi),p-x〉, i∈ℬ has a dual (66)min 12μn ∥∑i∈ℬ‍αi(gi+ηnΔi)∥2+∑i∈ℬ‍αi(ei+ηndi)s.t. αi≥0, i∈ℬ, ∑i∈ℬ‍αi=1. Their respective solutions, denoted by (r^,p^) and α^=(α^1,…,α^|ℬ|), satisfy (67)r^=φn(p^), p^=x-1μng^, where g^:=∑i∈ℬ‍α^i(gi+ηnΔi). In addition, α^i=0 for all i∈ℬ such that (68)r^>F(x,ρ)-(ei+ηndi)+〈(gi+ηnΔi),p^-x〉,φn(p^)=F(x,ρ)-∑i∈ℬ‍α^i(ei+ηndi)+〈 ∑i∈ℬ‍α^i(gi+ηnΔi),p^-x〉=F(x,ρ)-∑i∈ℬ‍α^i(ei+ηndi)-1μn∥g^∥2. For convenience, in the sequel we denote the output of these calculations by (69)(p^,r^)=χ-QP(μn,ηn,{(ei+ηndi,gi+ηnΔi)}i∈ℬ). The vector p^ is an estimate of a proximal point and, hence, approximates a primal track point when the latter exists. To proceed further we define new data, corresponding to a new index i+, by letting xi+:=p^ and computing (70)ε^:=F(p^,ρ)+ηn2∥p^-x∥2-φn(p^). An approximate dual track point, denoted by s^, is constructed by solving a second quadratic problem, which depends on a new index set: (71)ℬ^:={gi(gi+ηnΔi)i∈ℬ:r^=F(x,ρ)-(ei+ηndi)s+〈(gi+ηnΔi),p^-x〉}∪{i+}. The second quadratic programming problem, denoted by γ-QP, (72)min r+12∥p-x∥2s.t. r≥〈(gi+ηnΔi),p-x〉, i∈ℬ^, has a dual problem similar to (66), (73)min 12∥∑i∈ℬ^‍αi(gi+ηnΔi)∥2s.t. αi≥0, i∈ℬ^, ∑i∈ℬ^‍αi=1. Similar to (67), the respective solutions, denoted by (r-,p-) and α-, satisfy (74)p--x=-s^ where s^:=∑i∈ℬ^‍α-i(gi+ηnΔi). Let an active index set be defined by (75)ℬ^act:={i∈ℬ^:r-=(gi+ηnΔi)T(p--x)}. Then, from (74), r-=-(gi+ηnΔi)Ts^,j∈ℬ^act, so (76)[(gi+ηnΔi)-(gl+ηnΔl)]Ts^=0, for all such i and for a fixed l∈ℬ^act. Define a full column rank matrix V^ by choosing the largest number of indices i satisfying (76) such that the corresponding vectors (gi+ηnΔi)-(gl+ηnΔl) are linearly independent and by letting these vectors be the columns of V^. Then let U^ be a matrix whose columns form an orthonormal basis for the null-space of V^T. And let U^=I if V^ is vacuous.

For convenience, in the sequel we denote the output from these calculation by (77)(s^,U^)=γ-QP({gi+ηnΔi}i∈ℬ^). The bundle subprocedure is terminated and p^ is declared to be approximation of pλ(x) if (78)ε^≤∥s^∥2. Otherwise, ℬ above is replaced by ℬ^ and new iterate data are computed by solving the updated two quadratic programming problems above.

Now we consider a heuristic algorithm depending on the 𝒱𝒰-theory and the primal-dual track point approximations above.

Algorithm 13 (nonconvex <inline-formula><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" id="M388"><mml:mi>𝒱</mml:mi><mml:mi>𝒰</mml:mi></mml:math></inline-formula>-Algorithm for (<xref ref-type="disp-formula" rid="EEq2">2</xref>)).

Step 0. Select initial starting point p0 and positive parameter M0, λ0, ξ, ρ, a convexification growth parameter Γ>1. Compute the oracle values F(p0,ρ) and g0∈∂F(p0,ρ), and the additional bundle information (e0,d0,Δ0):=(0,0,0), with (μ0,η0)=(λ0,0). Also, let U0 be a matrix with orthonormal n-dimensional columns estimating an optimal 𝒰-basis. Set s0=g0 and k:=0.

Step 1. Stop if ∥sk∥2≤ξ.

Step 2. Choose an nk×nk positive definite matrix Hk, where nk is the number of columns of Uk.

Step 3. Compute an 𝒰-Newton step by solving the linear system (79)Hkδk=-UkTsk. Set xk+1′:=pk+Ukδk.

Step 4. Initialize ℬ and run the bundle subprocedure with x=xk+1′. Compute recursively, (80)(p^,r^)=χ-QP(x,{(ei+ηndi,gi+ηnΔi)}i∈ℬ),ε^=F(p^,ρ)+ηn2∥p^-x∥2-φn(p^), ℬ^ given by (71),(s^,U^)=γ-QP({gi+ηnΔi}i∈ℬ^) until satisfaction of (78). Then set (εk+1′,pk+1′,sk+1′,Uk+1′):=(ε^,p^,s^,U^).

Step 5. If (81)F(pk+1′,ρ)-F(pk,ρ)≤-∥sk+1′∥2, then set (82)(xk+1,εk+1,pk+1,sk+1,Uk+1) :=(xk+1′,εk+1′,pk+1′,sk+1′,Uk+1′). And apply rule (83)ηn+1:=ηn if ηn+1min≤ηn,ηn+1:=Γηn+1min, λn:=μn+ηn+1 if ηn+1min>ηn, where (84)ηn+1min:=maxi∈ℬdi>0-eidi. Otherwise, execute a line search on the line determined by pk and pk+1′ to find xk+1 thereon satisfying f(xk+1)≤f(pk); reinitialize ℬ and restart the bundle subroutine with x=xk+1, and set η0:=η, μ0:=Γμn, λ0:=η0+μ0, (e0,d0,Δ0):=(0,0,0) to find new values for (ε^,p^,s^,U^); then set (εk+1,pk+1,sk+1,Uk+1)=(ε^,p^,s^,U^).

Step 6. Replace k by k+1 and go to Step 1.

4. An Illustration Numerical Example

Now we report numerical result to illustrate Algorithm 13. Our numerical experiment is carried out in Matlab 7.8.0 running on a PC Intel Core 2 Duo CPU 2.93 GHz and 2.00 GB memory.

We consider the following Second-Order Cone programming problem (SOCP): (85)min ∑j=1p‍12xjTDjxjs.t. xj∈𝒦nj, j=1,…,p, where Dj∈Rnj×Rnj is nj×nj symmetric infinite matrix and xj=(x0j,x-j) with x-j=(x1j,…,xnj-1j).

This (SOCP) can be formulated in the form: (86)min ∑j=1p‍12xjTDjxjs.t. x0j≥∥x-j∥, j=1,…,p, equivalently, (87)min ∑j=1p‍12xjTDjxjs.t. ∑i=1nj-1‍(xij)2-(x0j)2≤0, j=1,…,ps. t. -x0j≤0, j=1,…,p, Let (88)x~j={∑i=1nj-1‍(xij)2-(x0j)2≤0,j=1,…,p-x0j-p≤0,j=p+1,…,2p. Then (SOCP) problem is equivalent to the nonlinear programming problem: (89)min ∑j=1p‍12xjTDjxjs.t. x~j≤0, j=1,…,2p.

Let D=diag(D1,…,Dp), x=(x1T,…,xpT)T, then the exact penalty function of this nonlinear programming problem is (90)min F (x,ρ)=12xTDx+ρmax{x~0,x~1,…,x~2p}, with x~0=0.

In the implementation, the initial starting point is chosen arbitrarily, and the parameters have values M0=10, λ0=10,ξ=1.0×10-5 and Γ=2. Optimality is declared when stopping criterion is satisfied.

Numerical results are summarized in Table 1 in which n denotes the number of variables, #f/g denotes the number of function and one subgradient evaluation.

Table 1

Numerical results.

n	#f/g	∥ x k - x - ∥	∥ x 0 - x - ∥
40	5	1.0084 × 10⁻¹²	3.3503 × 10³
100	10	5.0028 × 10⁻¹²	5.8699 × 10³
200	15	6.7345 × 10⁻¹²	8.1666 × 10³
500	20	1.2895 × 10⁻¹¹	1.3440 × 10⁴
1000	25	1.6550 × 10⁻¹¹	1.8329 × 10⁴

Acknowledgments

This paper is supported by the National Natural Science Foundation of China under Projects nos. 11226230, 11171138 and 11171049, 11226238 and General Project of the Education Department of Liaoning Province no. L2012427.

Pang

L.-P.

Guo

F.-F.

Xia

Z.-Q.

A superlinear space decomposition algorithm for constrained nonsmooth convex program

Journal of Computational and Applied Mathematics 2010 234 1 224 232

10.1016/j.cam.2009.12.018

MR2601296

ZBL1226.65056

Lemaréchal

Oustry

Sagastizábal

The U-Lagrangian of a convex function

Transactions of the American Mathematical Society 2000 352 2 711 729

10.1090/S0002-9947-99-02243-6

MR1487623

Mifflin

Sagastizábal

Tichatschke

Théra

M. A.

V U -decomposition derivatives for convex max-functions

Ill-Posed Variational Problems and Regularization Techniques 1999 477

Berlin, Germany

Springer

167 186 Lecture Notes in Economics and Mathematical Systems

10.1007/978-3-642-45780-7_11

MR1737319

ZBL0944.65069

Lemaréchal

Sagastizábal

More than first-order developments of convex functions: primal-dual relations

Journal of Convex Analysis 1996 3 2 255 268

MR1448055

ZBL0868.49014

Mifflin

Sagastizábal

On VU-theory for functions with primal-dual gradient structure

SIAM Journal on Optimization 2000 11 2 547 571

10.1137/S1052623499350967

MR1789552

Mifflin

Sagastizábal

Pillo

Giannessi

Functions with primal-dual gradient structure and U-Hessians

Nonlinear Optimization and Related Topics 2000 36

Kluwer Academic Publishers

219 233 Applied Optimization

MR1777922

Mifflin

Sagastizábal

Primal-dual gradient structured functions: second-order results; links to epi-derivatives and partly smooth functions

SIAM Journal on Optimization 2003 13 4 1174 1194

10.1137/S1052623402412441

MR2005923

ZBL1036.90067

Mifflin

Sagastizábal

A VU-algorithm for convex minimization

Mathematical Programming B 2005 104 2-3 583 608

10.1007/s10107-005-0630-3

MR2179252

Shan

Pang

L.-P.

Zhu

L.-M.

Xia

Z.-Q.

A UV-decomposed method for solving an MPEC problem

Applied Mathematics and Mechanics 2008 29 4 535 540

10.1007/s10483-008-0412-y

MR2405142

Pang

L.-P.

Shen

Liang

X.-J.

A decomposition algorithm for convex nondifferentiable minimization with errors

Journal of Applied Mathematics 2012 2012 15

215160

10.1155/2012/215160

MR2861933

ZBL1235.65066

Daniilidis

Sagastizábal

Solodov

Identifying structure of nonsmooth convex functions by the bundle technique

SIAM Journal on Optimization 2009 20 2 820 840

10.1137/080729864

MR2515798

ZBL1191.90066

Hare

W. L.

A proximal method for identifying active manifolds

Computational Optimization and Applications 2009 43 2 295 306

10.1007/s10589-007-9139-4

MR2506254

ZBL1170.90460

Hare

W. L.

Functions and sets of smooth substructure: relationships and examples

Computational Optimization and Applications 2006 33 2-3 249 270

10.1007/s10589-005-3059-4

MR2208819

ZBL1103.90100

Mifflin

Sun

Properties of the Moreau-Yosida regularization of a piecewise C2 convex function

Mathematical Programming A 1999 84 2 269 281

MR1690013

ZBL0971.90082

Rockafellar

R. T.

Wets

R. J.-B.

Variational Analysis 1998 317

Berlin, Germany

Springer

Fundamental Principles of Mathematical Sciences

Mifflin

Sagastizábal

V U -smoothness and proximal point results for some nonconvex functions

Optimization Methods & Software 2004 19 5 463 478

10.1080/10556780410001704902

MR2095347

ZBL1097.90059

Lang

Real and Functional Analysis 1993 3rd

New York, NY, USA

Springer

10.1007/978-1-4612-0897-6

MR1216137

Hare

Sagastizábal

Computing proximal points of nonconvex functions

Mathematical Programming B 2009 116 1-2 221 258

10.1007/s10107-007-0124-6

MR2421280

ZBL1168.90010

Hare

W. L.

Sagastizábal

A redistributed proximal bundle method for nonconvex optimization

SIAM Journal on Optimization 2010 20 5 2442 2473

10.1137/090754595

MR2678400

ZBL1211.90183