Based on the scaled conjugate gradient (SCALCG) method presented by Andrei (2007) and the projection method presented by Solodov and Svaiter, we propose a SCALCG method for solving monotone nonlinear equations with convex constraints. The SCALCG method can be regarded as a combination of the conjugate gradient method and a Newton-type method for solving unconstrained optimization problems, so it enjoys the advantages of both methods and is suitable for large-scale problems. It can therefore be applied to large-scale monotone nonlinear equations with convex constraints. Under reasonable conditions, we prove its global convergence. Numerical experiments show that the proposed method is efficient and promising.
1. Introduction
In this paper, we consider the following convex constrained monotone equations:
$$F(x) = 0, \qquad x \in \Omega, \tag{1}$$
where F : R^n → R^n is a continuous and monotone function and the feasible region Ω is a nonempty closed convex set. Monotone means that
$$\langle F(x) - F(y),\, x - y\rangle \ge 0, \qquad \forall x, y \in \mathbb{R}^n. \tag{2}$$
Algorithms for solving monotone nonlinear equations F(x)=0 are closely related to algorithms for solving optimization problems. It is known that a differentiable function f is strictly convex if and only if its gradient ∇f is strictly monotone, that is, (∇f(x)-∇f(y))^T(x-y) > 0 for all x ≠ y, which parallels the definition of monotonicity in (2). A strictly convex function has at most one minimum point, and a minimum point is a stationary point of the function, namely, a point at which the gradient satisfies ∇f(x)=0. Thus, whenever the monotone mapping F can be viewed as the gradient of some convex function f, that is, ∇f(x)=F(x), solving min f(x) is equivalent to solving F(x)=0.
Nonlinear monotone equations arise in a wide variety of applications, for example, as subproblems in the generalized proximal algorithms with Bregman distances [1]. In power engineering, the operations of a power system are described by a system of nonlinear equations, called the power flow equations, which are constrained by some operating constraints.
Unconstrained nonlinear monotone equations have received much attention [2–5]. Solodov and Svaiter [2] proposed a Newton-type method; a good property of the method is that the whole sequence of iterates converges to a solution of the system without any regularity assumptions. Under some weaker conditions, Zhou and Toh [4] showed that Solodov and Svaiter's method converges superlinearly. Zhou and Li [5, 6] extended Solodov and Svaiter's projection technique to the BFGS method and the limited memory BFGS method. Combining the spectral gradient method with the projection method of Solodov and Svaiter, Zhang and Zhou [3] proposed a spectral gradient projection method. Wang et al. [7] extended Solodov and Svaiter's projection method to monotone equations with convex constraints. Yu et al. [8] proposed a spectral gradient projection algorithm for monotone nonlinear equations with convex constraints by combining a modified spectral gradient method with the projection method; a good property of this algorithm is that no linear system has to be solved at any iteration. Xiao and Zhu [9] extended CG_DESCENT, combined with the projection method of Solodov and Svaiter, to solve large-scale nonlinear convex constrained monotone equations arising in compressive sensing; their method needs neither the Jacobian information nor the storage of any matrix at each iteration.
This paper is organized as follows. In Section 2, we propose a SCALCG method for solving monotone nonlinear equations with convex constraints. Under reasonable conditions, we prove its global convergence in Section 3. In Section 4, we report numerical experiments showing that our method is efficient and promising.
2. The Method
In this section, we propose our method. First, we briefly review the SCALCG method presented by Andrei [10] for the unconstrained optimization problem
$$\min f(x), \qquad x \in \mathbb{R}^n, \tag{3}$$
where f : R^n → R is a continuously differentiable function and g_k denotes its gradient at the point x_k.
The method of Andrei generates a sequence {x_k} of approximations to the minimum x* of f, in which
$$x_{k+1} = x_k + \alpha_k d_k, \tag{4}$$
$$d_{k+1} = -\theta_{k+1} g_{k+1} + \theta_{k+1}\,\frac{g_{k+1}^T s_k}{y_k^T s_k}\, y_k - \left[\left(1 + \theta_{k+1}\frac{y_k^T y_k}{y_k^T s_k}\right)\frac{g_{k+1}^T s_k}{y_k^T s_k} - \theta_{k+1}\frac{g_{k+1}^T y_k}{y_k^T s_k}\right] s_k, \tag{5}$$
where θ_{k+1} = s_k^T s_k / (y_k^T s_k).
Based on the SCALCG method, we now introduce our method for solving (1). Inspired by (5), we define d_k as
$$d_k = \begin{cases} -F_0, & k = 0,\\[6pt] -\theta_k F_k + \theta_k\,\dfrac{F_k^T s_{k-1}}{y_{k-1}^T s_{k-1}}\, y_{k-1} - \left[\left(1 + \theta_k\dfrac{y_{k-1}^T y_{k-1}}{y_{k-1}^T s_{k-1}}\right)\dfrac{F_k^T s_{k-1}}{y_{k-1}^T s_{k-1}} - \theta_k\dfrac{F_k^T y_{k-1}}{y_{k-1}^T s_{k-1}}\right] s_{k-1}, & k \ge 1, \end{cases} \tag{6}$$
where y_k = γ_k + λ_k t_k ∥F_k∥ d_k, γ_k = F_{k+1} - F_k, λ_k = 1 + ∥F_k∥^{-1} max{0, -⟨γ_k, t_k d_k⟩/∥t_k d_k∥²}, s_k = z_k - x_k = t_k d_k, and t_k is a step length which will be defined later. The definition of y_k is similar to the one in [9].
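To make the construction concrete, here is a minimal Python sketch of the direction (6) and the modified y_k; the helper names direction and make_y are ours (the paper's own code is in Matlab and is not shown), and all vectors are assumed to be 1-D NumPy arrays.

```python
import numpy as np

def direction(Fk, s_prev, y_prev):
    """Search direction d_k from (6) for k >= 1; for k = 0 use -F(x_0)."""
    ys = y_prev @ s_prev                 # y_{k-1}^T s_{k-1}, positive by (12)
    theta = (s_prev @ s_prev) / ys       # scaling parameter theta_k
    Fs = Fk @ s_prev                     # F_k^T s_{k-1}
    Fy = Fk @ y_prev                     # F_k^T y_{k-1}
    yy = y_prev @ y_prev                 # y_{k-1}^T y_{k-1}
    return (-theta * Fk + theta * (Fs / ys) * y_prev
            - ((1.0 + theta * yy / ys) * Fs / ys - theta * Fy / ys) * s_prev)

def make_y(F_next, Fk, tk, dk):
    """y_k = gamma_k + lambda_k t_k ||F_k|| d_k, with gamma_k = F_{k+1} - F_k."""
    gamma = F_next - Fk
    td = tk * dk                         # t_k d_k = s_k
    lam = 1.0 + max(0.0, -(gamma @ td) / (td @ td)) / np.linalg.norm(Fk)
    return gamma + lam * tk * np.linalg.norm(Fk) * dk
```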
Lemma 1.
Let {d_k} be generated by (6). Then, for any k, we have
$$F_k^T d_k < 0. \tag{7}$$
Proof.
If k = 0, we have F_0^T d_0 = -∥F_0∥² < 0.
If k ≥ 1, we obtain
$$
\begin{aligned}
F_k^T d_k ={}& -\theta_k\|F_k\|^2 + \theta_k\frac{F_k^T s_{k-1}}{y_{k-1}^T s_{k-1}}\,F_k^T y_{k-1} - \left[\left(1+\theta_k\frac{\|y_{k-1}\|^2}{y_{k-1}^T s_{k-1}}\right)\frac{F_k^T s_{k-1}}{y_{k-1}^T s_{k-1}} - \theta_k\frac{F_k^T y_{k-1}}{y_{k-1}^T s_{k-1}}\right] F_k^T s_{k-1}\\
={}& \frac{1}{(y_{k-1}^T s_{k-1})^2}\Big[-\theta_k\|F_k\|^2 (y_{k-1}^T s_{k-1})^2 - \theta_k\|y_{k-1}\|^2 (F_k^T s_{k-1})^2\\
&\qquad + 2\theta_k (F_k^T s_{k-1})(F_k^T y_{k-1})(y_{k-1}^T s_{k-1}) - (F_k^T s_{k-1})^2\, y_{k-1}^T s_{k-1}\Big],
\end{aligned}
\tag{8}
$$
where, by the inequality 2a^T b ≤ ∥a∥² + ∥b∥² applied with a = (y_{k-1}^T s_{k-1}) F_k and b = (F_k^T s_{k-1}) y_{k-1},
$$2 (F_k^T s_{k-1})(F_k^T y_{k-1})(y_{k-1}^T s_{k-1}) \le \big\|(y_{k-1}^T s_{k-1}) F_k\big\|^2 + \big\|(F_k^T s_{k-1})\, y_{k-1}\big\|^2 = (y_{k-1}^T s_{k-1})^2\|F_k\|^2 + (F_k^T s_{k-1})^2\|y_{k-1}\|^2. \tag{9}$$
Applying (9) to (8), we have
$$F_k^T d_k \le -\frac{(F_k^T s_{k-1})^2}{y_{k-1}^T s_{k-1}}. \tag{10}$$
By the definition of λ_k, the following inequality holds:
$$\lambda_k \ge 1 - \|F_k\|^{-1}\,\frac{\langle\gamma_k, t_k d_k\rangle}{\|t_k d_k\|^2}. \tag{11}$$
So, since s_k = t_k d_k, we obtain
$$y_k^T s_k = \langle\gamma_k + \lambda_k\|F_k\| s_k,\, s_k\rangle = \langle\gamma_k, s_k\rangle + \lambda_k\|F_k\|\|s_k\|^2 \ge \langle\gamma_k, s_k\rangle + \|F_k\|\|s_k\|^2 - \langle\gamma_k, s_k\rangle = \|F_k\|\|s_k\|^2 > 0. \tag{12}$$
It then follows from (10) and (12) that
$$F_k^T d_k \le -\frac{(F_k^T s_{k-1})^2}{y_{k-1}^T s_{k-1}} < 0. \tag{13}$$
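As a quick numerical sanity check of Lemma 1 (and of inequality (13)), one can evaluate F_k^T d_k for random data with y_{k-1}^T s_{k-1} > 0 using the hypothetical direction helper sketched above; the tolerance term only absorbs floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    Fk, s, y = (rng.normal(size=50) for _ in range(3))
    if y @ s <= 1.0:                     # require y_{k-1}^T s_{k-1} > 0, cf. (12)
        continue
    d = direction(Fk, s, y)
    bound = -(Fk @ s) ** 2 / (y @ s)     # right-hand side of (13)
    assert Fk @ d <= bound + 1e-8 * (1.0 + abs(bound))
```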
The steps of our method are stated as follows.
Algorithm 2.
Consider the following steps.
Step 0. Choose an initial point x_0 ∈ Ω and constants ε > 0, σ ∈ (0,1), ρ ∈ (0,1), ξ > 0. Set k := 0.
Step 1. Stop if ∥F_k∥ ≤ ε. Otherwise, compute d_k by (6).
Step 2. Let t_k = max{ξρ^i : i = 0, 1, 2, …} which satisfies
$$-\langle F(x_k + t_k d_k),\, d_k\rangle \ge \sigma t_k\|d_k\|^2. \tag{14}$$
Let z_k = x_k + t_k d_k.
Step 3. Compute
$$x_{k+1} = P_\Omega\big[x_k - \alpha_k F(z_k)\big], \tag{15}$$
where
$$\alpha_k = \frac{\langle F(z_k),\, x_k - z_k\rangle}{\|F(z_k)\|^2}. \tag{16}$$
Step 4. Let k:=k+1. Go to Step 1.
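For readers who wish to experiment, the following is a minimal Python sketch of Algorithm 2, built on the direction and make_y helpers from above; the function name scalcg_projection is ours, the line search simply shrinks t by the factor ρ until (14) holds, and proj must implement the projection P_Ω for the chosen feasible set (e.g., the componentwise maximum with 0 for Ω = R^n_+).

```python
import numpy as np

def scalcg_projection(F, proj, x0, eps=1e-5, sigma=1e-4, rho=0.1, xi=1.0,
                      max_iter=10000):
    """Sketch of Algorithm 2: F is the monotone mapping, proj projects onto Omega."""
    x = proj(np.asarray(x0, dtype=float))
    Fx = F(x)
    s_prev = y_prev = None
    for k in range(max_iter):
        if np.linalg.norm(Fx) <= eps:                         # Step 1: stopping test
            break
        d = -Fx if k == 0 else direction(Fx, s_prev, y_prev)  # d_k by (6)
        t = xi                                                # Step 2: line search (14)
        while -(F(x + t * d) @ d) < sigma * t * (d @ d):
            t *= rho
        z = x + t * d                                         # z_k = x_k + t_k d_k
        Fz = F(z)
        alpha = (Fz @ (x - z)) / (Fz @ Fz)                    # alpha_k from (16)
        x_next = proj(x - alpha * Fz)                         # Step 3: projection (15)
        F_next = F(x_next)
        s_prev, y_prev = z - x, make_y(F_next, Fx, t, d)      # s_k and y_k updates
        x, Fx = x_next, F_next
    return x
```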
3. Convergence Analysis
In this section, we establish the global convergence of Algorithm 2. For our purpose, we assume that F satisfies the following assumptions.
Condition A. Consider the following.
The mapping F is Lipschitz continuous; that is, there exists a constant L > 0 such that
$$\|F(x) - F(y)\| \le L\|x - y\|, \qquad \forall x, y \in \Omega. \tag{17}$$
The solution set of (1), denoted by S, is nonempty.
Lemma 3.
Algorithm 2 is well defined.
Proof.
We only need to prove that Step 2 is well defined. Taking the limit on both sides of (14) as t_k → 0 and using the continuity of F, we have
$$\lim_{t_k\to 0} -\langle F(x_k + t_k d_k),\, d_k\rangle = -F_k^T d_k > 0, \qquad \lim_{t_k\to 0}\sigma t_k\|d_k\|^2 = 0, \tag{18}$$
where the strict inequality follows from (7). Hence (14) holds for all sufficiently small t_k, and Algorithm 2 is well defined.
Lemma 4.
Suppose Condition A holds. Then the step length t_k satisfies
$$t_k \ge \min\left\{\xi,\ \frac{\rho\delta}{L+\sigma}\,\frac{\|F_k\|^2}{\|d_k\|^2}\right\}, \tag{19}$$
where δ > 0 is the constant introduced in the proof below.
Proof.
If the algorithm stops at some iteration k, then ∥F_k∥ = 0, so that x_k is a solution of (1). From now on, we assume that F_k ≠ 0 for any k. By (7), it is easy to see that d_k ≠ 0.
If t_k ≠ ξ, then by the line search process, t_k' = ρ^{-1} t_k does not satisfy (14); that is,
$$-\langle F(z_k'),\, d_k\rangle < \sigma t_k'\|d_k\|^2, \tag{20}$$
where z_k' = x_k + t_k' d_k.
From (7), we know
$$F_k^T d_k < 0, \tag{21}$$
so, for any k ≥ 0, there exists a positive number δ > 0 such that F_k^T d_k ≤ -δ∥F_k∥².
From the Lipschitz continuity (17) and inequality (20), we have
$$\delta\|F_k\|^2 \le -F_k^T d_k = \langle -F(x_k),\, d_k\rangle = \langle F(z_k') - F(x_k),\, d_k\rangle - \langle F(z_k'),\, d_k\rangle \le L\|z_k' - x_k\|\|d_k\| + \sigma t_k'\|d_k\|^2 = (L+\sigma)\, t_k'\|d_k\|^2 = \rho^{-1}(L+\sigma)\, t_k\|d_k\|^2. \tag{22}$$
So we get
$$t_k \ge \frac{\rho\delta}{L+\sigma}\,\frac{\|F_k\|^2}{\|d_k\|^2}, \qquad\text{and hence}\qquad t_k \ge \min\left\{\xi,\ \frac{\rho\delta}{L+\sigma}\,\frac{\|F_k\|^2}{\|d_k\|^2}\right\}. \tag{23}$$
Lemma 5.
Suppose Condition A holds, let x̄ ∈ S, and let the sequence {x_k} be generated by Algorithm 2. Then the sequence {∥F_k∥} is bounded; that is, there exists a constant M > 0 such that, for all k ≥ 0,
$$\|F(x_k)\| \le M. \tag{24}$$
Proof.
From the monotonicity (2) and F(x̄) = 0, we have
$$\langle F(z_k),\, x_k - \bar{x}\rangle = \langle F(z_k),\, x_k - z_k\rangle + \langle F(z_k),\, z_k - \bar{x}\rangle = \langle F(z_k),\, x_k - z_k\rangle + \langle F(z_k) - F(\bar{x}),\, z_k - \bar{x}\rangle \ge \langle F(z_k),\, x_k - z_k\rangle. \tag{25}$$
From the nonexpansiveness of the projection operator, it holds that
$$
\begin{aligned}
\|x_{k+1} - \bar{x}\|^2 &= \big\|P_\Omega[x_k - \alpha_k F(z_k)] - P_\Omega(\bar{x})\big\|^2 \le \|x_k - \alpha_k F(z_k) - \bar{x}\|^2\\
&= \|x_k - \bar{x}\|^2 - 2\alpha_k\langle F(z_k),\, x_k - \bar{x}\rangle + \alpha_k^2\|F(z_k)\|^2\\
&\le \|x_k - \bar{x}\|^2 - 2\alpha_k\langle F(z_k),\, x_k - z_k\rangle + \alpha_k^2\|F(z_k)\|^2\\
&= \|x_k - \bar{x}\|^2 - \frac{\langle F(z_k),\, x_k - z_k\rangle^2}{\|F(z_k)\|^2} \le \|x_k - \bar{x}\|^2,
\end{aligned}
\tag{26}
$$
where the last equality uses the definition (16) of α_k.
It is easy to see that
$$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 \le \|x_{k-1} - \bar{x}\|^2 \le \cdots \le \|x_0 - \bar{x}\|^2. \tag{27}$$
Since F is Lipschitz continuous, we get
$$\|F(x_k)\| = \|F(x_k) - F(\bar{x})\| \le L\|x_k - \bar{x}\| \le L\|x_0 - \bar{x}\|. \tag{28}$$
Let M = L∥x_0 - x̄∥; then (24) is established.
Lemma 6.
Suppose Condition A holds, and let the sequences {x_k} and {z_k} be generated by Algorithm 2. Then -F(z_k) is a descent direction of the function (1/2)∥x - x̄∥² at the point x_k, where x̄ ∈ S.
Proof.
The gradient of the function (1/2)∥x - x̄∥² at the point x_k is g_k = x_k - x̄.
From the monotonicity (2) and the line search (14), it can be seen that
$$\langle F(z_k),\, x_k - \bar{x}\rangle = \langle F(z_k),\, x_k - z_k\rangle + \langle F(z_k) - F(\bar{x}),\, z_k - \bar{x}\rangle > 0, \tag{29}$$
since the first term is positive by (14) and the second is nonnegative by (2).
So, we obtain
$$\langle -F(z_k),\, x_k - \bar{x}\rangle < 0. \tag{30}$$
Lemma 7.
Suppose Condition A holds, and let the sequences {x_k} and {z_k} be generated by Algorithm 2. Then the following hold:
(1) {x_k} and {z_k} are bounded;
(2) lim_{k→∞}(x_k - z_k) = 0, and in particular
$$\lim_{k\to\infty} t_k\|d_k\| = 0; \tag{31}$$
(3) lim_{k→∞}(x_k - x_{k+1}) = 0.
Proof.
(1) From (26), we have
$$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 \le \|x_{k-1} - \bar{x}\|^2 \le \cdots \le \|x_0 - \bar{x}\|^2. \tag{32}$$
So the sequence {xk} is bounded.
From (2), (14), and (24), we get
$$
\begin{aligned}
\langle F(z_k),\, x_k - z_k\rangle &= -t_k\langle F(z_k),\, d_k\rangle \ge \sigma t_k^2\|d_k\|^2 = \sigma\|x_k - z_k\|^2,\\
\langle F(z_k),\, x_k - z_k\rangle &= \langle F(z_k) - F(x_k),\, x_k - z_k\rangle + \langle F(x_k),\, x_k - z_k\rangle \le \|F(x_k)\|\|x_k - z_k\| \le M\|x_k - z_k\|.
\end{aligned}
\tag{33}
$$
So, the following inequality holds:
$$\sigma\|x_k - z_k\|^2 \le \|F(x_k)\|\|x_k - z_k\| \le M\|x_k - z_k\|; \tag{34}$$
that is,
$$\|x_k - z_k\| \le \frac{M}{\sigma}. \tag{35}$$
So, the sequence {zk} is bounded.
(2) From (26) and (33), we obtain
$$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - \frac{\langle F(z_k),\, x_k - z_k\rangle^2}{\|F(z_k)\|^2} \le \|x_k - \bar{x}\|^2 - \frac{\sigma^2\|x_k - z_k\|^4}{\|F(z_k)\|^2}. \tag{36}$$
Since F is continuous and the sequence {z_k} is bounded, the sequence {∥F(z_k)∥} is bounded; that is, there exists a constant M_1 > 0 such that ∥F(z_k)∥ ≤ M_1 for all k ≥ 0. Then we get
$$\|x_k - z_k\|^4 \le \frac{M_1^2}{\sigma^2}\big(\|x_k - \bar{x}\|^2 - \|x_{k+1} - \bar{x}\|^2\big), \qquad \sum_{k=0}^{\infty}\|x_k - z_k\|^4 \le \frac{M_1^2}{\sigma^2}\sum_{k=0}^{\infty}\big(\|x_k - \bar{x}\|^2 - \|x_{k+1} - \bar{x}\|^2\big) < +\infty. \tag{37}$$
So, we have
$$\lim_{k\to\infty}(x_k - z_k) = 0, \tag{38}$$
and, in particular,
$$\lim_{k\to\infty} t_k\|d_k\| = \lim_{k\to\infty}\|x_k - z_k\| = 0. \tag{39}$$
(3) From the nonexpansiveness of the projection operator and x_k = P_Ω(x_k), it holds that
$$\|x_k - x_{k+1}\| = \big\|P_\Omega(x_k) - P_\Omega\big(x_k - \alpha_k F(z_k)\big)\big\| \le \|\alpha_k F(z_k)\| = \frac{\langle F(z_k),\, x_k - z_k\rangle}{\|F(z_k)\|^2}\,\|F(z_k)\| \le \|x_k - z_k\|. \tag{40}$$
So, we obtain
$$\lim_{k\to\infty}\|x_k - x_{k+1}\| = 0. \tag{41}$$
Theorem 8.
Suppose Condition A holds, and let the sequence {x_k} be generated by Algorithm 2. Then we have
$$\liminf_{k\to\infty}\|F_k\| = 0. \tag{42}$$
Proof.
Suppose, on the contrary, that (42) does not hold. Then there exists ε > 0 such that, for all k ≥ 0,
$$\|F_k\| \ge \varepsilon. \tag{43}$$
From the nonexpansiveness of the projection operator, it holds that
$$\|x_{k+1} - x_k\| = \big\|P_\Omega[x_k - \alpha_k F(z_k)] - x_k\big\| \le \big\|\big(x_k - \alpha_k F(z_k)\big) - x_k\big\| = \alpha_k\|F(z_k)\|. \tag{44}$$
By the definition of α_k and the Cauchy-Schwarz inequality, we have
$$\|x_{k+1} - x_k\| \le \frac{\langle F(z_k),\, x_k - z_k\rangle}{\|F(z_k)\|^2}\,\|F(z_k)\| \le \frac{\|F(z_k)\|\|x_k - z_k\|}{\|F(z_k)\|^2}\,\|F(z_k)\| = \|x_k - z_k\| = t_k\|d_k\|. \tag{45}$$
By the definition of y_k, the Lipschitz continuity (17), (24), and (45), we obtain
$$
\begin{aligned}
\|y_k\| &= \big\|\gamma_k + \lambda_k t_k\|F_k\| d_k\big\| \le \|\gamma_k\| + \lambda_k\|F_k\|\|t_k d_k\|\\
&= \|\gamma_k\| + \left(1 + \|F_k\|^{-1}\max\left\{0,\ \frac{-\langle\gamma_k, t_k d_k\rangle}{\|t_k d_k\|^2}\right\}\right)\|F_k\|\|t_k d_k\|\\
&\le \|\gamma_k\| + \left(1 + \|F_k\|^{-1}\frac{|\langle\gamma_k, t_k d_k\rangle|}{\|t_k d_k\|^2}\right)\|F_k\|\|t_k d_k\|\\
&\le \|\gamma_k\| + \left(1 + \|F_k\|^{-1}\frac{\|\gamma_k\|\|t_k d_k\|}{\|t_k d_k\|^2}\right)\|F_k\|\|t_k d_k\|\\
&= 2\|\gamma_k\| + \|F_k\|\|t_k d_k\| = 2\|F(x_{k+1}) - F(x_k)\| + \|F_k\|\|t_k d_k\|\\
&\le 2L\|x_{k+1} - x_k\| + \|F_k\|\, t_k\|d_k\| \le 2L\, t_k\|d_k\| + M\, t_k\|d_k\| = (2L+M)\, t_k\|d_k\|.
\end{aligned}
\tag{46}
$$
From (12) and (43), we get
$$y_{k-1}^T s_{k-1} \ge \|F_{k-1}\|\|s_{k-1}\|^2 \ge \varepsilon\|s_{k-1}\|^2. \tag{47}$$
From (46) and (47), we have
$$\|y_{k-1}\| \le (2L+M)\|s_{k-1}\|, \qquad \theta_k = \frac{s_{k-1}^T s_{k-1}}{y_{k-1}^T s_{k-1}} \le \frac{\|s_{k-1}\|^2}{\varepsilon\|s_{k-1}\|^2} = \frac{1}{\varepsilon}. \tag{48}$$
From the proof of Lemma 4, we get
$$F_k^T d_k \le -\delta\|F_k\|^2. \tag{49}$$
So, by the Cauchy-Schwarz inequality, we obtain
$$\delta\|F_k\|^2 \le -F_k^T d_k \le \|F_k\|\|d_k\|; \tag{50}$$
that is, using (43),
$$\|d_k\| \ge \delta\|F_k\| \ge \delta\varepsilon. \tag{51}$$
From (6), (24), (43), (47), and (48), we have
$$
\begin{aligned}
\|d_k\| &\le \theta_k\|F_k\| + \theta_k\,\frac{\|F_k\|\|s_{k-1}\|\|y_{k-1}\|}{y_{k-1}^T s_{k-1}} + \left(1+\theta_k\frac{\|y_{k-1}\|^2}{y_{k-1}^T s_{k-1}}\right)\frac{\|F_k\|\|s_{k-1}\|^2}{y_{k-1}^T s_{k-1}} + \theta_k\,\frac{\|F_k\|\|y_{k-1}\|\|s_{k-1}\|}{y_{k-1}^T s_{k-1}}\\
&\le \frac{1}{\varepsilon}\|F_k\| + \frac{2L+M}{\varepsilon^2}\|F_k\| + \left(1+\frac{(2L+M)^2}{\varepsilon^2}\right)\frac{1}{\varepsilon}\|F_k\| + \frac{2L+M}{\varepsilon^2}\|F_k\|\\
&= \left(\frac{2}{\varepsilon} + \frac{2(2L+M)}{\varepsilon^2} + \frac{(2L+M)^2}{\varepsilon^3}\right)\|F_k\| \le \left(\frac{2}{\varepsilon} + \frac{2(2L+M)}{\varepsilon^2} + \frac{(2L+M)^2}{\varepsilon^3}\right) M.
\end{aligned}
\tag{52}
$$
Let C = (2/ε + 2(2L+M)/ε² + (2L+M)²/ε³)M; then, for all k ≥ 0, we have
$$\|d_k\| \le C. \tag{53}$$
From (19), (43), (51), and (53), it can be seen that
$$t_k\|d_k\| \ge \min\left\{\xi,\ \frac{\rho\delta}{L+\sigma}\frac{\|F_k\|^2}{\|d_k\|^2}\right\}\|d_k\| = \min\left\{\xi\|d_k\|,\ \frac{\rho\delta}{L+\sigma}\frac{\|F_k\|^2}{\|d_k\|}\right\} \ge \min\left\{\xi\delta\varepsilon,\ \frac{\rho\delta\varepsilon^2}{C(L+\sigma)}\right\} > 0. \tag{54}$$
This contradicts (31), so (42) holds.
4. Numerical Experiments
In this section, we report numerical experiments testing the performance of Algorithm 2 on the following two problems. The algorithm was coded in Matlab and run on a personal computer with a 2.3 GHz CPU, 2 GB of memory, and the Windows XP operating system.
For each test problem, the termination condition is
$$\|F(x_k)\| \le 10^{-5}. \tag{55}$$
We set ξ = 1, ρ = 0.1, and σ = 0.0001. We test both problems with the number of variables n = 100, 500, 1000, 2000, and 5000 and start from different initial points. The meaning of the columns in Tables 1 and 2 is as follows: "Dim" is the dimension of the problem, "Init" is the initial point, "Iter" is the number of iterations, "Time" is the CPU time in seconds, and "Fn" is the final norm of the equations.
Table 1: Test results for Problem 9 with given initial points.

Init                 Dim    Iter   Time       Fn
(1,1,…,1)^T          100    53     0.041288   9.490010e-6
(1,1,…,1)^T          500    117    0.168041   9.291395e-6
(1,1,…,1)^T          1000   166    0.709354   9.629278e-6
(1,1,…,1)^T          2000   237    4.222668   9.743963e-6
(1,1,…,1)^T          5000   382    42.01209   9.623651e-6
(2,2,…,2)^T          100    60     0.046265   9.294530e-6
(2,2,…,2)^T          500    130    0.253245   9.861726e-6
(2,2,…,2)^T          1000   184    0.663908   9.984013e-6
(2,2,…,2)^T          2000   262    2.370548   9.975847e-6
(2,2,…,2)^T          5000   421    21.02392   9.696351e-6
(10,10,…,10)^T       100    77     0.060589   7.784485e-6
(10,10,…,10)^T       500    155    0.279761   9.519559e-6
(10,10,…,10)^T       1000   216    0.813048   9.260490e-6
(10,10,…,10)^T       2000   303    2.733373   9.558623e-6
(10,10,…,10)^T       5000   479    23.90998   9.962504e-6
(1,0,1,0,…,1,0)^T    100    55     0.042016   7.946500e-6
(1,0,1,0,…,1,0)^T    500    120    0.149927   9.872650e-6
(1,0,1,0,…,1,0)^T    1000   171    0.594531   9.730717e-6
(1,0,1,0,…,1,0)^T    2000   244    2.273943   9.991830e-6
(1,0,1,0,…,1,0)^T    5000   393    19.72807   9.983895e-6
Table 2: Test results for Problem 10 with given initial points.

Init                 Dim    Iter   Time        Fn
(1,1,…,1)^T          100    26     0.026912    6.603881e-6
(1,1,…,1)^T          500    51     0.088748    7.539315e-6
(1,1,…,1)^T          1000   69     0.2554104   9.920450e-6
(1,1,…,1)^T          2000   96     0.986898    9.891781e-6
(1,1,…,1)^T          5000   151    13.81959    9.365707e-6
(2,2,…,2)^T          100    32     0.025124    6.915301e-6
(2,2,…,2)^T          500    63     0.149986    8.376139e-6
(2,2,…,2)^T          1000   86     0.356989    9.752235e-6
(2,2,…,2)^T          2000   120    1.278378    8.932150e-6
(2,2,…,2)^T          5000   187    20.48943    9.698312e-6
(10,10,…,10)^T       100    68     0.054123    8.040204e-6
(10,10,…,10)^T       500    140    0.241559    9.126295e-6
(10,10,…,10)^T       1000   194    0.712503    9.667015e-6
(10,10,…,10)^T       2000   271    2.927190    9.540461e-6
(10,10,…,10)^T       5000   425    46.31353    9.117053e-6
(1,0,1,0,…,1,0)^T    100    24     0.018994    7.446192e-6
(1,0,1,0,…,1,0)^T    500    47     0.128392    9.168601e-6
(1,0,1,0,…,1,0)^T    1000   65     0.282163    8.583682e-6
(1,0,1,0,…,1,0)^T    2000   90     0.889466    9.452969e-6
(1,0,1,0,…,1,0)^T    5000   141    7.183196    9.881571e-6
Problem 9.
The mapping F is taken as F(x) = (f_1(x), f_2(x), …, f_n(x))^T, where
$$f_i(x) = e^{x_i} - 2, \quad i = 1, 2, \ldots, n, \qquad \Omega = \mathbb{R}^n_+. \tag{56}$$
Problem 10.
The mapping F is taken as F(x) = (f_1(x), f_2(x), …, f_n(x))^T, where
$$f_i(x) = 2x_i - \sin(|x_i - 1|), \quad i = 1, 2, \ldots, n, \qquad \Omega = \mathbb{R}^n_+. \tag{57}$$
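As a usage illustration, the two test mappings translate directly into NumPy and can be fed to the scalcg_projection sketch from Section 2 with P_Ω(x) = max(x, 0) for Ω = R^n_+; the parameter values follow the setup above, although iteration counts and times will not exactly reproduce Tables 1 and 2 (different hardware, and Matlab rather than Python was used in the paper).

```python
import numpy as np

def F_problem9(x):
    return np.exp(x) - 2.0                      # f_i(x) = e^{x_i} - 2

def F_problem10(x):
    return 2.0 * x - np.sin(np.abs(x - 1.0))    # f_i(x) = 2 x_i - sin(|x_i - 1|)

if __name__ == "__main__":
    n = 1000
    proj = lambda v: np.maximum(v, 0.0)         # P_Omega for Omega = R^n_+
    for name, F in (("Problem 9", F_problem9), ("Problem 10", F_problem10)):
        x = scalcg_projection(F, proj, x0=np.ones(n),
                              eps=1e-5, sigma=1e-4, rho=0.1, xi=1.0)
        print(name, "final residual:", np.linalg.norm(F(x)))
```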
Tables 1 and 2 show that our method is efficient and suitable for solving large-scale monotone equations with convex constraints.
5. Conclusions
In this paper, we have proposed a SCALCG method for solving nonlinear monotone equations with convex constraints. Under some mild conditions, we proved its global convergence.
Preliminary numerical experiments have illustrated that the proposed method works well for Problems 9 and 10.
Acknowledgment
This work has been supported by Scientific Research Fund of Hunan Provincial Education Department [12C0664].
References

[1] M. V. Solodov and A. N. Iusem, "Newton-type methods with generalized distances for constrained optimization," Optimization, vol. 41, no. 3, pp. 257-278, 1997. doi:10.1080/02331939708844339.
[2] M. V. Solodov and B. F. Svaiter, "A globally convergent inexact Newton method for systems of monotone equations," pp. 355-369, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1998.
[3] L. Zhang and W. Zhou, "Spectral gradient projection method for solving nonlinear monotone equations," Journal of Computational and Applied Mathematics, vol. 196, no. 2, pp. 478-484, 2006. doi:10.1016/j.cam.2005.10.002.
[4] G. Zhou and K. C. Toh, "Superlinear convergence of a Newton-type algorithm for monotone equations," Journal of Optimization Theory and Applications, vol. 125, no. 1, pp. 205-221, 2005. doi:10.1007/s10957-004-1721-7.
[5] W. Zhou and D. Li, "A globally convergent BFGS method for nonlinear monotone equations without any merit functions," Mathematics of Computation, vol. 77, no. 264, pp. 2231-2240, 2008. doi:10.1090/S0025-5718-08-02121-2.
[6] W. Zhou and D. Li, "Limited memory BFGS method for nonlinear monotone equations," vol. 25, no. 1, pp. 89-96, 2007.
[7] C. Wang, Y. Wang, and C. Xu, "A projection method for a system of nonlinear monotone equations with convex constraints," Mathematical Methods of Operations Research, vol. 66, no. 1, pp. 33-46, 2007. doi:10.1007/s00186-006-0140-y.
[8] Z. Yu, J. Lin, J. Sun, Y. Xiao, L. Liu, and Z. Li, "Spectral gradient projection method for monotone nonlinear equations with convex constraints," Applied Numerical Mathematics, vol. 59, no. 10, pp. 2416-2423, 2009. doi:10.1016/j.apnum.2009.04.004.
[9] Y. H. Xiao and H. Zhu, "A conjugate gradient method to solve convex constrained monotone equations with applications in compressive sensing," Journal of Mathematical Analysis and Applications, vol. 405, no. 1, pp. 310-319, 2013. doi:10.1016/j.jmaa.2013.04.017.
[10] N. Andrei, "A scaled BFGS preconditioned conjugate gradient algorithm for unconstrained optimization," Applied Mathematics Letters, vol. 20, no. 6, pp. 645-650, 2007. doi:10.1016/j.aml.2006.06.015.