A Conjugate Gradient Type Method for the Nonnegative Constraints Optimization Problems

Can Li

College of Mathematics, Honghe University, Mengzi 661199, China (uoh.edu.cn)

Research Article. Journal of Applied Mathematics (Hindawi Publishing Corporation), ISSN 1110-757X, 1687-0042, Volume 2013, Article ID 986317, doi:10.1155/2013/986317.

Received 16 December 2012; Accepted 20 March 2013; Published 10 April 2013

Academic Editor: Theodore E. Simos

Copyright © 2013 Can Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We are concerned with nonnegative constraints optimization problems. It is well known that conjugate gradient methods are efficient for solving large-scale unconstrained optimization problems due to their simplicity and low storage. Combining the modified Polak-Ribière-Polyak method proposed by Zhang, Zhou, and Li with the Zoutendijk feasible direction method, we propose a conjugate gradient type method for solving nonnegative constraints optimization problems. If the current iterate is a feasible point, the direction generated by the proposed method is always a feasible descent direction at the current iterate. Under appropriate conditions, we show that the proposed method is globally convergent. We also present some numerical results to show the efficiency of the proposed method.

1. Introduction

Due to their simplicity and low memory requirement, conjugate gradient methods play a very important role in solving unconstrained optimization problems, especially large-scale ones. Over the years, many variants of the conjugate gradient method have been proposed, and some are widely used in practice. The key features of conjugate gradient methods are that they require no matrix storage and are faster than the steepest descent method.

The linear conjugate gradient method was proposed by Hestenes and Stiefel [1] in the 1950s as an iterative method for solving linear systems

$$Ax = b, \quad x \in \mathbb{R}^n, \tag{1}$$

where $A$ is an $n \times n$ symmetric positive definite matrix. Problem (1) can be stated equivalently as the following minimization problem:

$$\min \tfrac{1}{2} x^T A x - b^T x, \quad x \in \mathbb{R}^n. \tag{2}$$

This equivalence allows us to interpret the linear conjugate gradient method either as an algorithm for solving linear systems or as a technique for minimizing convex quadratic functions. For any initial point $x_0 \in \mathbb{R}^n$, the sequence $\{x_k\}$ generated by the linear conjugate gradient method converges to the solution $x^*$ of the linear system (1) in at most $n$ steps.
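The finite termination property can be illustrated with a minimal sketch of the linear conjugate gradient iteration. The function name `conjugate_gradient`, the tolerance `tol`, and the 2×2 test system are assumptions chosen for illustration; they do not come from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, x0, tol=1e-12):
    """Linear CG for A x = b, with A symmetric positive definite (assumed)."""
    n = len(b)
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # residual b - A x
    d = list(r)                                       # first direction: steepest descent
    steps = 0
    while steps < n and dot(r, r) > tol:
        Ad = matvec(A, d)
        alpha = dot(r, r) / dot(d, Ad)                # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r_new = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r_new, r_new) / dot(r, r)
        d = [rn + beta * di for rn, di in zip(r_new, d)]
        r = r_new
        steps += 1
    return x, steps

# Hypothetical 2x2 system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, steps = conjugate_gradient(A, b, [2.0, 1.0])
```

For this system the loop performs at most $n = 2$ steps, in line with the statement above.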

The first nonlinear conjugate gradient method was introduced by Fletcher and Reeves [2] in the 1960s. It is one of the earliest known techniques for solving large-scale nonlinear optimization problems

$$\min f(x), \quad x \in \mathbb{R}^n, \tag{3}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. The nonlinear conjugate gradient methods for solving (3) have the following form:

$$x_{k+1} = x_k + \alpha_k d_k, \qquad d_k = \begin{cases} -\nabla f(x_k), & k = 0, \\ -\nabla f(x_k) + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{4}$$

where $\alpha_k$ is a steplength obtained by a line search and $\beta_k$ is a scalar which determines the different conjugate gradient methods. If we choose $f$ to be a strongly convex quadratic and $\alpha_k$ to be the exact minimizer, the nonlinear conjugate gradient method reduces to the linear conjugate gradient method. Several famous formulae for $\beta_k$ are the Hestenes-Stiefel (HS) [1], Fletcher-Reeves (FR) [2], Polak-Ribière-Polyak (PRP) [3, 4], Conjugate-Descent (CD) [5], Liu-Storey (LS) [6], and Dai-Yuan (DY) [7] formulae, which are given by

$$\beta_k^{HS} = \frac{\nabla f(x_k)^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{FR} = \frac{\|\nabla f(x_k)\|^2}{\|\nabla f(x_{k-1})\|^2}, \tag{5}$$

$$\beta_k^{PRP} = \frac{\nabla f(x_k)^T y_{k-1}}{\|\nabla f(x_{k-1})\|^2}, \qquad \beta_k^{CD} = -\frac{\|\nabla f(x_k)\|^2}{d_{k-1}^T \nabla f(x_{k-1})}, \tag{6}$$

$$\beta_k^{LS} = -\frac{\nabla f(x_k)^T y_{k-1}}{d_{k-1}^T \nabla f(x_{k-1})}, \qquad \beta_k^{DY} = \frac{\|\nabla f(x_k)\|^2}{d_{k-1}^T y_{k-1}}, \tag{7}$$

where $y_{k-1} = \nabla f(x_k) - \nabla f(x_{k-1})$ and $\|\cdot\|$ stands for the Euclidean norm of vectors.

In this paper, we focus our attention on the Polak-Ribière-Polyak (PRP) method. The study of the PRP method has received much attention and has made good progress. The global convergence of the PRP method with exact line search has been proved in  under a strong convexity assumption on $f$. However, for general nonlinear functions, an example given by Powell [8] shows that the PRP method may fail to be globally convergent even if the exact line search is used. Inspired by Powell's work, Gilbert and Nocedal [9] conducted an elegant analysis and showed that the PRP method is globally convergent if $\beta_k^{PRP}$ is restricted to be nonnegative and $\alpha_k$ is determined by a line search satisfying the sufficient descent condition $g_k^T d_k \le -c\|g_k\|^2$ in addition to the standard Wolfe conditions. Other conjugate gradient methods and their global convergence can be found in [10–15] and so forth.
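As a small illustration of formulae (5)–(7), the sketch below evaluates the six scalars from a new gradient, the previous gradient, and the previous direction. The function name `cg_betas` and the sample data are assumptions made for this example. The data come from one exact line search step of steepest descent on a hypothetical strongly convex quadratic, the setting in which all six formulae are known to coincide.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg_betas(g_new, g_old, d_old):
    """Evaluate the beta_k formulae (5)-(7) from gradients and the previous direction."""
    y = [a - b for a, b in zip(g_new, g_old)]          # y_{k-1}
    return {
        "HS": dot(g_new, y) / dot(d_old, y),
        "FR": dot(g_new, g_new) / dot(g_old, g_old),
        "PRP": dot(g_new, y) / dot(g_old, g_old),
        "CD": -dot(g_new, g_new) / dot(d_old, g_old),
        "LS": -dot(g_new, y) / dot(d_old, g_old),
        "DY": dot(g_new, g_new) / dot(d_old, y),
    }

# Hypothetical data: f(x) = 0.5 x^T A x - b^T x with A = [[4, 1], [1, 3]],
# b = (1, 2), x_0 = (2, 1), d_0 = -g_0, and the exact steplength 73/331.
g_old = [8.0, 3.0]
d_old = [-8.0, -3.0]
alpha = 73.0 / 331.0
g_new = [8.0 - alpha * 35.0, 3.0 - alpha * 17.0]      # g_old + alpha * A d_old
betas = cg_betas(g_new, g_old, d_old)
```

On general nonlinear functions, or with inexact line searches, the six values differ, which is what distinguishes the methods.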

Recently, Li and Wang [16] extended the modified Fletcher-Reeves (MFR) method proposed by Zhang et al. [17] for solving unconstrained optimization to the nonlinear equations

$$F(x) = 0, \tag{8}$$

where $F: \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable, and proposed a descent derivative-free method for solving symmetric nonlinear equations. The direction generated by the method is a descent direction for the residual function. Under appropriate conditions, the method is globally convergent when a backtracking line search technique is used.

In this paper, we further study the conjugate gradient method. We focus our attention on the modified Polak-Ribière-Polyak (MPRP) method proposed by Zhang et al. [18]. The direction generated by the MPRP method is given by

$$d_k = \begin{cases} -g(x_k), & k = 0, \\ -g(x_k) + \beta_k^{PRP} d_{k-1} - \theta_k y_{k-1}, & k > 0, \end{cases} \tag{9}$$

where $g(x_k) = \nabla f(x_k)$, $\beta_k^{PRP} = g(x_k)^T y_{k-1} / \|g(x_{k-1})\|^2$, $\theta_k = g(x_k)^T d_{k-1} / \|g(x_{k-1})\|^2$, and $y_{k-1} = g(x_k) - g(x_{k-1})$. The MPRP method not only preserves the good properties of the PRP method but also possesses another nice property: it always generates descent directions for the objective function. This property is independent of the line search used. Under suitable conditions, the MPRP method with an Armijo-type line search is also globally convergent. The purpose of this paper is to develop an MPRP type method for nonnegative constraints optimization problems. Combining the Zoutendijk feasible direction method with the MPRP method, we propose a conjugate gradient type method for solving nonnegative constraints optimization problems. If the initial point is feasible, the method generates a feasible point sequence. We also carry out numerical experiments to test the proposed method and compare its performance with that of the Zoutendijk feasible direction method. The numerical results show that the proposed method outperforms the Zoutendijk feasible direction method.
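The descent property of direction (9) follows because the $\beta_k^{PRP}$ and $\theta_k$ terms cancel in $g(x_k)^T d_k$, leaving $g(x_k)^T d_k = -\|g(x_k)\|^2$. The sketch below checks this numerically. The function name `mprp_direction` and the sample vectors are arbitrary assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mprp_direction(g_new, g_old=None, d_old=None):
    """MPRP direction (9); falls back to steepest descent when k = 0."""
    if g_old is None or d_old is None:
        return [-gi for gi in g_new]
    y = [a - b for a, b in zip(g_new, g_old)]       # y_{k-1}
    denom = dot(g_old, g_old)                       # ||g(x_{k-1})||^2
    beta = dot(g_new, y) / denom                    # beta_k^{PRP}
    theta = dot(g_new, d_old) / denom               # theta_k
    return [-g + beta * d - theta * yi for g, d, yi in zip(g_new, d_old, y)]

# Arbitrary (hypothetical) data: d_old need not come from any line search.
g_old = [1.0, -2.0, 0.5]
d_old = [0.3, 1.0, -2.0]
g_new = [-0.7, 0.4, 1.5]
d_new = mprp_direction(g_new, g_old, d_old)
slope = dot(g_new, d_new)                           # equals -||g_new||^2 up to rounding
```

Since `d_old` is arbitrary here, the check reflects that the property does not depend on the line search.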

2. Algorithm

Consider the following nonnegative constraints optimization problem:

$$\min f(x) \quad \text{s.t.} \quad x \ge 0, \tag{10}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Let $x_k \ge 0$ be the current iterate. Define the index sets

$$I_k = I(x_k) = \{\, i \mid x_k^{(i)} = 0 \,\}, \qquad J_k = \{1, 2, \ldots, n\} \setminus I_k, \tag{11}$$

where $x_k^{(i)}$ is the $i$th component of $x_k$. In fact, the index set $I_k$ is the active set of problem (10) at $x_k$.
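A minimal sketch of the index sets in (11) is given below. The function name `split_indices` is hypothetical, indices are 0-based (the text uses 1-based indices), and the point is assumed to be feasible.

```python
def split_indices(x):
    """Return (I, J): active indices with x_i = 0 and the remaining indices with x_i > 0."""
    active = [i for i, xi in enumerate(x) if xi == 0.0]
    free = [i for i, xi in enumerate(x) if xi > 0.0]
    return active, free

active, free = split_indices([0.0, 1.5, 0.0, 2.0])
```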

The purpose of this paper is to develop a conjugate gradient type method for problem (10). Since the iterative sequence is a feasible point sequence, the search directions should be feasible descent directions. Let $x_k \ge 0$ be the current iterate. By the definition of a feasible direction, $d \in \mathbb{R}^n$ is a feasible direction of (10) at $x_k$ if and only if $d_{I_k} \ge 0$. Similar to the Zoutendijk feasible direction method, we consider the following problem:

$$\min \nabla f(x_k)^T d \quad \text{s.t.} \quad d_{I_k} \ge 0, \ \|d\| \le 1. \tag{12}$$

Next, we show that, if $x_k$ is not a KKT point of (10), the solution of problem (12) is a feasible descent direction of $f$ at $x_k$.

Lemma 1.

Let $x_k \ge 0$ and let $\bar{d}$ be a solution of problem (12); then $\nabla f(x_k)^T \bar{d} \le 0$. Moreover, $\nabla f(x_k)^T \bar{d} = 0$ if and only if $x_k$ is a KKT point of problem (10).

Proof.

Since $d = 0$ is a feasible point of problem (12), there must be $\nabla f(x_k)^T \bar{d} \le 0$. Consequently, if $\nabla f(x_k)^T \bar{d} \ne 0$, there must be $\nabla f(x_k)^T \bar{d} < 0$. This implies that the direction $\bar{d}$ is a feasible descent direction of $f$ at $x_k$.

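We suppose that $\nabla f(x_k)^T \bar{d} = 0$. Problem (12) is equivalent to the following problem:

$$\min \nabla f(x_k)^T d \quad \text{s.t.} \quad d_{I_k} \ge 0, \ \|d\|^2 \le 1. \tag{13}$$

Then there exist $\lambda_{I_k}$ and $\mu$ such that the following KKT condition holds:

$$\nabla f(x_k) - \begin{pmatrix} \lambda_{I_k} \\ 0 \end{pmatrix} + 2\mu \bar{d} = 0, \quad \lambda_{I_k} \ge 0, \quad \bar{d}_{I_k} \ge 0, \quad \lambda_{I_k}^T \bar{d}_{I_k} = 0, \quad \mu \ge 0, \quad \|\bar{d}\| \le 1, \quad \mu\left(\|\bar{d}\|^2 - 1\right) = 0. \tag{14}$$

Multiplying the first of these expressions by $\bar{d}^T$, we obtain

$$\nabla f(x_k)^T \bar{d} - \lambda^T \bar{d} + 2\mu \|\bar{d}\|^2 = 0, \tag{15}$$

where $\lambda = \begin{pmatrix} \lambda_{I_k} \\ 0 \end{pmatrix}$. By combining the assumption $\nabla f(x_k)^T \bar{d} = 0$ with the conditions on $\lambda_{I_k}$ and $\bar{d}_{I_k}$ in (14), which give $\lambda^T \bar{d} = \lambda_{I_k}^T \bar{d}_{I_k} = 0$, equation (15) reduces to $\mu \|\bar{d}\|^2 = 0$. Together with $\mu(\|\bar{d}\|^2 - 1) = 0$, this yields $\mu = 0$. Substituting it into the first expression of (14), we obtain

$$\nabla f_{I_k}(x_k) - \lambda_{I_k} = 0, \qquad \nabla f_{J_k}(x_k) = 0. \tag{16}$$

Let $\lambda_i = 0$, $i \in J_k$; then $\lambda_i \ge 0$, $i \in I_k \cup J_k$. Moreover, we have

$$\nabla f(x_k) - \begin{pmatrix} \lambda_{I_k} \\ \lambda_{J_k} \end{pmatrix} = 0, \qquad \lambda_i \ge 0, \quad x_k^{(i)} \ge 0, \quad \lambda_i x_k^{(i)} = 0, \quad i \in I_k \cup J_k. \tag{17}$$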

This implies that xk is a KKT point of problem (10).

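On the other hand, we suppose that $x_k$ is a KKT point of problem (10). Then there exist $\lambda_i$, $i \in I_k \cup J_k$, such that the following KKT condition holds:

$$\nabla f(x_k) - \begin{pmatrix} \lambda_{I_k} \\ \lambda_{J_k} \end{pmatrix} = 0, \qquad \lambda_i \ge 0, \quad x_k^{(i)} \ge 0, \quad \lambda_i x_k^{(i)} = 0, \quad i \in I_k \cup J_k. \tag{18}$$

From the second of these expressions, we get $\lambda_{J_k} = 0$. Substituting it into the first of these expressions, we have $\nabla f_{I_k}(x_k) = \lambda_{I_k} \ge 0$ and $\nabla f_{J_k}(x_k) = 0$, so that $\nabla f(x_k)^T \bar{d} = \nabla f_{I_k}(x_k)^T \bar{d}_{I_k} = \lambda_{I_k}^T \bar{d}_{I_k} \ge 0$. However, we have shown that $\nabla f(x_k)^T \bar{d} \le 0$, so $\nabla f(x_k)^T \bar{d} = 0$.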

By the proof of Lemma 1, we find that $\nabla f_{I_k}(x_k) \ge 0$ and $\nabla f_{J_k}(x_k) = 0$ are necessary conditions for $x_k$ to be a KKT point of problem (10). We summarize these observations as the following result.

Lemma 2.

Let $x_k \ge 0$; then $x_k$ is a KKT point of problem (10) if and only if $\nabla f_{I_k}(x_k) \ge 0$ and $\nabla f_{J_k}(x_k) = 0$.

Proof.

Firstly, we suppose that $x_k$ is a KKT point of problem (10). Similar to the proof of Lemma 1, it is easy to get that $\nabla f_{I_k}(x_k) \ge 0$ and $\nabla f_{J_k}(x_k) = 0$.

Secondly, we suppose that $\nabla f_{I_k}(x_k) \ge 0$ and $\nabla f_{J_k}(x_k) = 0$. Let $\lambda_{I_k} = \nabla f_{I_k}(x_k) \ge 0$ and $\lambda_{J_k} = 0$; then the KKT condition (18) holds, so that $x_k$ is a KKT point of problem (10).
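Lemma 2 gives a componentwise test that is easy to code. The sketch below is one possible reading of it. The function name `is_kkt_point` and the tolerance `tol` are assumptions (the lemma itself is stated in exact arithmetic), and the quadratic used in the example is hypothetical.

```python
def is_kkt_point(x, grad, tol=1e-8):
    """Check the conditions of Lemma 2 at a feasible point x, up to a tolerance."""
    for xi, gi in zip(x, grad):
        if xi <= tol:            # index treated as active (i in I_k)
            if gi < -tol:        # need grad_i >= 0 on the active set
                return False
        elif abs(gi) > tol:      # need grad_i = 0 on J_k
            return False
    return True

# Hypothetical example: f(x) = (x1 - 1)^2 + (x2 + 1)^2, gradient (2(x1 - 1), 2(x2 + 1)).
at_solution = is_kkt_point([1.0, 0.0], [0.0, 2.0])
at_origin = is_kkt_point([0.0, 0.0], [-2.0, 2.0])
```

In this example the point $(1, 0)$ passes the test, while the origin fails because the first gradient component is negative on the active set.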

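Based on the above discussion, we propose a conjugate gradient type method for solving problem (10) as follows. Let the feasible point $x_k$ be the current iterate. For the components on the boundary of the feasible region, $x_{k,I_k} = 0$, we take

$$d_k^i = \begin{cases} 0, & g_i(x_k) > 0, \\ -g_i(x_k), & g_i(x_k) \le 0, \end{cases} \qquad i \in I_k, \tag{19}$$

where $g_i(x_k) = \nabla f_i(x_k)$. For the components in the interior of the feasible region, $x_{k,J_k} > 0$, similar to the direction $d_k$ in the MPRP method, we define $d_{k,J_k}$ by the following formula:

$$d_{k,J_k}^{MPRP} = \begin{cases} -g_{J_k}(x_k), & k = 0, \\ -g_{J_k}(x_k) + \beta_k^{PRP} d_{k-1,J_k} - \theta_k^{MPRP} y_{k-1}, & k > 0, \end{cases} \tag{20}$$

where $g_{J_k}(x_k) = \nabla f_{J_k}(x_k)$, $\beta_k^{PRP} = g_{J_k}(x_k)^T y_{k-1} / \|g(x_{k-1})\|^2$, $\theta_k^{MPRP} = g_{J_k}(x_k)^T d_{k-1,J_k} / \|g(x_{k-1})\|^2$, and $y_{k-1} = g_{J_k}(x_k) - g_{J_k}(x_{k-1})$.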

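It is easy to see from (19) and (20) that

$$-\|g_{I_k}(x_k)\|^2 \le g_{I_k}(x_k)^T d_{k,I_k} \le 0, \qquad g_{J_k}(x_k)^T d_{k,J_k} = -\|g_{J_k}(x_k)\|^2. \tag{21}$$

The above relations indicate that

$$g(x_k)^T d_k = g_{I_k}(x_k)^T d_{k,I_k} + g_{J_k}(x_k)^T d_{k,J_k} \le -\|g_{J_k}(x_k)\|^2, \tag{22}$$

$$g(x_k)^T d_k \ge -\|g_{I_k}(x_k)\|^2 - \|g_{J_k}(x_k)\|^2 = -\|g(x_k)\|^2, \tag{23}$$

where $g(x_k) = \nabla f(x_k)$.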
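Relations (21)–(23) can be checked numerically with a short sketch of the direction defined by (19) and (20). The function name `feasible_direction` and all sample vectors are assumptions for illustration. In particular, `g_prev` and `d_prev` are arbitrary stand-ins for the previous gradient and direction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def feasible_direction(x, g, g_prev=None, d_prev=None):
    """Direction from (19) on I_k and (20) on J_k; x is assumed feasible (x >= 0)."""
    n = len(x)
    I = [i for i in range(n) if x[i] == 0.0]
    J = [i for i in range(n) if x[i] > 0.0]
    d = [0.0] * n
    for i in I:                                   # (19): never step out of the orthant
        d[i] = 0.0 if g[i] > 0.0 else -g[i]
    if g_prev is None or d_prev is None:          # k = 0 in (20)
        for i in J:
            d[i] = -g[i]
        return d, I, J
    denom = dot(g_prev, g_prev)                   # ||g(x_{k-1})||^2 (full gradient)
    beta = sum(g[i] * (g[i] - g_prev[i]) for i in J) / denom
    theta = sum(g[i] * d_prev[i] for i in J) / denom
    for i in J:                                   # k > 0 in (20)
        d[i] = -g[i] + beta * d_prev[i] - theta * (g[i] - g_prev[i])
    return d, I, J

# Hypothetical data with two active components and two free components.
x = [0.0, 0.0, 1.0, 2.0]
g = [1.0, -2.0, 0.5, -1.0]
g_prev = [0.5, -1.0, 1.0, 2.0]
d_prev = [0.0, 1.0, -1.0, -2.0]
d, I, J = feasible_direction(x, g, g_prev, d_prev)
slope = dot(g, d)
norm_gJ_sq = sum(g[i] ** 2 for i in J)
```

For these numbers the active part contributes $-4$ and the free part $-1.25 = -\|g_{J_k}\|^2$, so the slope is $-5.25$, which lies between $-\|g\|^2 = -6.25$ and $-\|g_{J_k}\|^2 = -1.25$ as (22) and (23) require.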

Theorem 3.

Let $x_k \ge 0$ and let $d_k$ be defined by (19) and (20); then

$$g(x_k)^T d_k \le 0. \tag{24}$$

Moreover, $x_k$ is a KKT point of problem (10) if and only if $g(x_k)^T d_k = 0$.

Proof.

Clearly, inequality (22) implies that

$$g(x_k)^T d_k \le 0. \tag{25}$$

If $x_k$ is a KKT point of problem (10), similar to the proof of Lemma 1, we also get that $g(x_k)^T d_k = 0$.

If $g(x_k)^T d_k = 0$, by (22), we can get that

$$g_{I_k}(x_k)^T d_{k,I_k} = 0, \qquad g_{J_k}(x_k)^T d_{k,J_k} = -\|g_{J_k}(x_k)\|^2 = 0. \tag{26}$$

The equality $g_{I_k}(x_k)^T d_{k,I_k} = 0$ and the definition (19) of $d_{k,I_k}$ imply that

$$g_{I_k}(x_k) \ge 0. \tag{27}$$

Let $\lambda_{I_k} = g_{I_k}(x_k) \ge 0$ and $\lambda_{J_k} = 0$; then the KKT condition (18) also holds, so that $x_k$ is a KKT point of problem (10).

By combining (22) with Theorem 3, we conclude that $d_k$ defined by (19) and (20) provides a feasible descent direction of $f$ at $x_k$ if $x_k$ is not a KKT point of problem (10).

Based on the above process, we propose an MPRP type method for solving (10) as follows.

Algorithm 4 (MPRP type method).

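Step 0. Given constants $\rho \in (0, 1)$, $\delta > 0$, $\epsilon > 0$. Choose the initial point $x_0 \ge 0$. Let $k := 0$.

Step 1. Compute $d_k = (d_{k,I_k}, d_{k,J_k})$ by (19) and (20). If $|g(x_k)^T d_k| \le \epsilon$, then stop. Otherwise, go to the next step.

Step 2. Determine $\alpha_k = \max\{\rho^j, \ j = 0, 1, 2, \ldots\}$ satisfying $x_k + \alpha_k d_k \ge 0$ and

$$f(x_k + \alpha_k d_k) \le f(x_k) - \delta \alpha_k^2 \|d_k\|^2. \tag{28}$$

Step 3. Let the next iterate be $x_{k+1} = x_k + \alpha_k d_k$.

Step 4. Let $k := k + 1$ and go to Step 1.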
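The steps of Algorithm 4 can be sketched in plain Python as follows. This is an illustrative reading of Steps 0 to 4, not the author's Matlab code. The names `mprp_nonneg`, `direction`, and `max_iter`, and the small quadratic test problem, are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def direction(x, g, g_prev, d_prev):
    """Search direction from (19) on the active set and (20) on the free set."""
    n = len(x)
    J = [i for i in range(n) if x[i] > 0.0]
    d = [0.0] * n
    for i in range(n):
        if x[i] == 0.0 and g[i] <= 0.0:       # (19): leave the boundary only if g_i <= 0
            d[i] = -g[i]
    if g_prev is None:                        # k = 0: steepest descent on J_k
        for i in J:
            d[i] = -g[i]
        return d
    denom = dot(g_prev, g_prev)
    beta = sum(g[i] * (g[i] - g_prev[i]) for i in J) / denom
    theta = sum(g[i] * d_prev[i] for i in J) / denom
    for i in J:
        d[i] = -g[i] + beta * d_prev[i] - theta * (g[i] - g_prev[i])
    return d

def mprp_nonneg(f, grad, x0, rho=0.5, delta=0.1, eps=1e-4, max_iter=10000):
    x, g_prev, d_prev = list(x0), None, None
    for k in range(max_iter):
        g = grad(x)
        d = direction(x, g, g_prev, d_prev)   # Step 1
        if abs(dot(g, d)) <= eps:
            return x, k
        alpha, fx, dd = 1.0, f(x), dot(d, d)
        while True:                           # Step 2: largest rho**j feasible and satisfying (28)
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            if min(trial) >= 0.0 and f(trial) <= fx - delta * alpha ** 2 * dd:
                break
            alpha *= rho
        x, g_prev, d_prev = trial, g, d       # Steps 3 and 4
    return x, max_iter

# Hypothetical test problem: f(x) = (x1 - 1)^2 + (x2 + 1)^2 on x >= 0, solution (1, 0).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 1.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 1.0)]
x_star, iters = mprp_nonneg(f, grad, [3.0, 0.0])
```

In this literal form, a component joins the active set only when a trial step lands exactly on the boundary, so the example starts with the binding component $x_2$ already at zero. From $(3, 0)$ the trial step $\alpha = 1$ is infeasible, $\alpha = 1/2$ is accepted and reaches $(1, 0)$, and the stopping test then holds at $k = 1$.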

It is easy to see that the sequence $\{x_k\}$ generated by Algorithm 4 is a feasible point sequence. Moreover, it follows from (28) that the function value sequence $\{f(x_k)\}$ is decreasing. In addition, if $f(x)$ is bounded from below, we have from (28) that

$$\sum_{k=0}^{\infty} \alpha_k^2 \|d_k\|^2 < \infty. \tag{29}$$

In particular, we have

$$\lim_{k \to \infty} \alpha_k \|d_k\| = 0. \tag{30}$$
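The step from (28) to (29) is a telescoping sum. A short derivation is sketched below, where $f_*$ denotes any lower bound of $f$ (a symbol introduced here for illustration only).

```latex
% Sum the line search condition (28) over k = 0, ..., m.
\begin{align*}
\delta \sum_{k=0}^{m} \alpha_k^2 \|d_k\|^2
  &\le \sum_{k=0}^{m} \bigl( f(x_k) - f(x_{k+1}) \bigr) \\
  &= f(x_0) - f(x_{m+1}) \\
  &\le f(x_0) - f_* < \infty .
\end{align*}
```

Letting $m \to \infty$ gives (29), and the terms of a convergent series tend to zero, which is (30).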

Next, we prove the global convergence of Algorithm 4 under the following assumptions.

Assumption A.

(1) The level set $\omega = \{x \in \mathbb{R}^n \mid f(x) \le f(x_0)\}$ is bounded.

(2) In some neighborhood $N$ of $\omega$, $f$ is continuously differentiable, and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that

$$\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|, \quad \forall x, y \in N. \tag{31}$$

Clearly, Assumption A implies that there exists a constant $\gamma_1$ such that

$$\|\nabla f(x)\| \le \gamma_1, \quad \forall x \in N. \tag{32}$$

Lemma 5.

Suppose that the conditions in Assumption A hold, and let $\{x_k\}$ and $\{d_k\}$ be the iterative sequence and the direction sequence generated by Algorithm 4. If there exists a constant $\epsilon > 0$ such that

$$\|g(x_k)\| \ge \epsilon, \quad \forall k, \tag{33}$$

then there exists a constant $M > 0$ such that

$$\|d_k\| \le M, \quad \forall k. \tag{34}$$

Proof.

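By combining (19), (20), and (33) with Assumption A, we deduce that

$$\|d_k\| \le \|d_{k,I_k}\| + \|d_{k,J_k}^{MPRP}\| \le \gamma_1 + \|g_{J_k}(x_k)\| + \frac{2 \|g_{J_k}(x_k)\| \, \|y_{k-1}\| \, \|d_{k-1,J_k}^{MPRP}\|}{\|g(x_{k-1})\|^2} \le 2\gamma_1 + \frac{2 \gamma_1 L \alpha_{k-1} \|d_{k-1,J_k}^{MPRP}\|}{\epsilon^2} \|d_{k-1,J_k}^{MPRP}\|. \tag{35}$$

By (30), there exist a constant $\gamma \in (0, 1)$ and an integer $k_0$ such that the following inequality holds for all $k \ge k_0$:

$$\frac{2 L \gamma_1}{\epsilon^2} \alpha_{k-1} \|d_{k-1,J_k}^{MPRP}\| \le \gamma. \tag{36}$$

Hence, we have for any $k \ge k_0$

$$\|d_k\| \le 2\gamma_1 + \gamma \|d_{k-1}\| \le 2\gamma_1 \left(1 + \gamma + \gamma^2 + \cdots + \gamma^{k-k_0-1}\right) + \gamma^{k-k_0} \|d_{k_0}\| \le \frac{2\gamma_1}{1 - \gamma} + \|d_{k_0}\|. \tag{37}$$

Let

$$M = \max\left\{ \|d_1\|, \|d_2\|, \ldots, \|d_{k_0}\|, \frac{2\gamma_1}{1 - \gamma} + \|d_{k_0}\| \right\}. \tag{38}$$

Then

$$\|d_k\| \le M, \quad \forall k. \tag{39}$$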

Theorem 6.

Suppose that the conditions in Assumption A hold. Let $\{x_k\}$ and $\{d_k\}$ be the iterative sequence and the direction sequence generated by Algorithm 4. Then

$$\liminf_{k \to \infty} |g(x_k)^T d_k| = 0. \tag{40}$$

Proof.

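We prove the result of this theorem by contradiction. Assume that the theorem is not true; then there exists a constant $\epsilon > 0$ such that

$$|g(x_k)^T d_k| \ge \epsilon, \quad \forall k. \tag{41}$$

Since (22) and (23) give $|g(x_k)^T d_k| \le \|g(x_k)\|^2$, (41) implies $\|g(x_k)\| \ge \sqrt{\epsilon}$ for all $k$, so (33) holds with the constant $\sqrt{\epsilon}$.

If $\liminf_{k \to \infty} \alpha_k > 0$, we get from (30) that $\|d_k\| \to 0$. Since $\|g(x_k)\| \le \gamma_1$ by (32), it follows that $\lim_{k \to \infty} |g(x_k)^T d_k| = 0$. This contradicts assumption (41).

If $\liminf_{k \to \infty} \alpha_k = 0$, there is an infinite index set $K$ such that

$$\lim_{k \in K, \, k \to \infty} \alpha_k = 0. \tag{42}$$

It follows from Step 2 of Algorithm 4 that, when $k \in K$ is sufficiently large, $\rho^{-1} \alpha_k$ does not satisfy $f(x_k + \alpha_k d_k) \le f(x_k) - \delta \alpha_k^2 \|d_k\|^2$; that is,

$$f(x_k + \rho^{-1} \alpha_k d_k) - f(x_k) > -\delta \rho^{-2} \alpha_k^2 \|d_k\|^2. \tag{43}$$

By the mean-value theorem, Lemma 1, and Assumption A, there is $h_k \in (0, 1)$ such that

$$\begin{aligned} f(x_k + \rho^{-1} \alpha_k d_k) - f(x_k) &= \rho^{-1} \alpha_k \, g(x_k + h_k \rho^{-1} \alpha_k d_k)^T d_k \\ &= \rho^{-1} \alpha_k \, g(x_k)^T d_k + \rho^{-1} \alpha_k \left( g(x_k + h_k \rho^{-1} \alpha_k d_k) - g(x_k) \right)^T d_k \\ &\le \rho^{-1} \alpha_k \, g(x_k)^T d_k + L \rho^{-2} \alpha_k^2 \|d_k\|^2. \end{aligned} \tag{44}$$

Substituting the last inequality into (43), we get for all $k \in K$ sufficiently large

$$0 \le -g(x_k)^T d_k \le \rho^{-1} (L + \delta) \alpha_k \|d_k\|^2. \tag{45}$$

Taking the limit on both sides of the inequality, using $\|d_k\| \le M$, and recalling $\lim_{k \in K, \, k \to \infty} \alpha_k = 0$, we obtain that $\lim_{k \in K, \, k \to \infty} |g(x_k)^T d_k| = 0$. This also yields a contradiction.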
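The passage from (43) and (44) to (45) is a one-line rearrangement. It is written out below as a sketch, using only the symbols already defined in the proof.

```latex
% Chain (43) with the upper bound (44), then multiply by rho / alpha_k > 0.
\begin{align*}
-\delta \rho^{-2} \alpha_k^2 \|d_k\|^2
  &< f(x_k + \rho^{-1}\alpha_k d_k) - f(x_k)
   \le \rho^{-1}\alpha_k \, g(x_k)^T d_k + L \rho^{-2} \alpha_k^2 \|d_k\|^2 \\
\Longrightarrow \quad
-g(x_k)^T d_k
  &< \rho^{-1} (L + \delta) \, \alpha_k \|d_k\|^2
   \le \rho^{-1} (L + \delta) \, M^2 \alpha_k .
\end{align*}
```

The last bound uses $\|d_k\| \le M$ from Lemma 5, so the right-hand side tends to zero along $K$.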

3. Numerical Experiments

In this section, we report some numerical experiments. We test the performance of Algorithm 4 and compare it with the Zoutendijk method.

The code was written in Matlab, and the program was run on a PC with 2.20 GHz CPU and 1.00 GB memory. The parameters in the method are specified as follows. We set $\rho = 1/2$ and $\delta = 1/10$. We stop the iteration if $|\nabla f(x_k)^T d_k| \le 0.0001$ or the iteration number exceeds 10000.

We first test Algorithm 4 on small and medium size problems and compare it with the Zoutendijk method in terms of the total number of iterations and the CPU time used. The test problems are from the CUTE library . The numerical results of Algorithm 4 and the Zoutendijk method are listed in Table 1. The columns have the following meanings.

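Table 1: The numerical results. A dash means that no result is reported for that method.

| P(i) | Dim | Algorithm 4 Iter | Algorithm 4 Time | Zoutendijk Iter | Zoutendijk Time |
|---|---|---|---|---|---|
| 3 | 2 | 1973 | 1.5710 | - | - |
| 4 | 2 | 201 | 0.2290 | - | - |
| 6 | 2 | 30 | 0.0160 | - | - |
|  | 3 | 35 | 0.0160 | - | - |
|  | 4 | 39 | 0.0470 | - | - |
|  | 10 | 124 | 0.1210 | - | - |
|  | 50 | 220 | 0.5370 | - | - |
| 8 | 3 | 44 | 0.0150 | 40 | 0.2188 |
| 11 | 3 | 3 | 0.0000 | 4 | 0.1094 |
| 15 | 4 | 10 | 0.0160 | 20 | 0.1563 |
| 18 | 6 | 322 | 0.0690 | 1936 | 12.0938 |
| 19 | 11 | 438 | 0.5440 | 8338 | 72.4219 |
| 23 | 50 | 12 | 0.0300 | 4 | 0.5000 |
| 24 | 100 | 142 | 0.3750 | - | - |
| 25 | 100 | 38 | 0.0810 | 6 | 0.3438 |
| 26 | 100 | 8 | 0.0470 | 6 | 0.1250 |
|  | 1000 | 4 | 47.9060 | 4 | 190.1406 |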

P(i) is the number of the test problem, Dim is the dimension of the test problem, Iter is the number of iterations, and Time is the CPU time in seconds.

We can see from Table 1 that Algorithm 4 successfully solved 12 test problems, while the Zoutendijk method successfully solved 8 test problems. In terms of the number of iterations, Algorithm 4 gives the better result in 12 tests. In terms of computation time, Algorithm 4 performs much better than the Zoutendijk method. We then test Algorithm 4 and the Zoutendijk method on two problems with a larger dimension. The VARDIM problem comes from , and the following problem comes from . The results are listed in Tables 2 and 3.

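Table 2: Test results for VARDIM with various dimensions. A dash means that no result is reported for that method.

| Problem | Dim | Algorithm 4 Iter | Algorithm 4 Time | Zoutendijk Iter | Zoutendijk Time |
|---|---|---|---|---|---|
| VARDIM | 1000 | 46 | 13.4485 | - | - |
|  | 2000 | 55 | 49.0090 | - | - |
|  | 3000 | 65 | 97.1020 | - | - |
|  | 4000 | 78 | 164.6213 | - | - |
|  | 5000 | 90 | 271.0340 | - | - |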

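Table 3: Test results for Problem 1 with various dimensions. A dash means that no result is reported for that method.

| Problem | Dim | Algorithm 4 Iter | Algorithm 4 Time | Zoutendijk Iter | Zoutendijk Time |
|---|---|---|---|---|---|
| Problem 1 | 1000 | 17 | 0.1400 | 8 | 110.2578 |
|  | 2000 | 26 | 16.8604 | 8 | 263.2660 |
|  | 3000 | 39 | 39.6561 | 11 | 554.0310 |
|  | 4000 | 51 | 68.1729 | 30 | 910.1090 |
|  | 5000 | 55 | 110.5660 | - | - |

Problem 1.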

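The nonnegative constraints optimization problem

$$\min f(x) \quad \text{s.t.} \quad x \ge 0, \tag{46}$$

with the Engval function $f: \mathbb{R}^n \to \mathbb{R}$ defined by

$$f(x) = \sum_{i=2}^{n} \left\{ \left(x_{i-1}^2 + x_i^2\right)^2 - 4 x_{i-1} + 3 \right\}. \tag{47}$$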
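For readers who want to reproduce Problem 1, the objective (47) and its gradient can be coded directly. The sketch below uses 0-based indexing, and the function names `engval` and `engval_grad` are assumptions. The gradient is obtained by differentiating each summand of (47) with respect to $x_{i-1}$ and $x_i$.

```python
def engval(x):
    """Engval objective (47): sum over consecutive pairs (x_{i-1}, x_i)."""
    return sum((x[i - 1] ** 2 + x[i] ** 2) ** 2 - 4.0 * x[i - 1] + 3.0
               for i in range(1, len(x)))

def engval_grad(x):
    """Gradient of (47), accumulated term by term."""
    g = [0.0] * len(x)
    for i in range(1, len(x)):
        s = x[i - 1] ** 2 + x[i] ** 2
        g[i - 1] += 4.0 * x[i - 1] * s - 4.0   # derivative with respect to x_{i-1}
        g[i] += 4.0 * x[i] * s                 # derivative with respect to x_i
    return g

value = engval([1.0, 1.0, 1.0])
gradient = engval_grad([1.0, 1.0, 1.0])
```

At the point $(1, 1, 1)$ each of the two summands equals $3$, and the contributions to the middle component add up as $8 + 4$.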

We can see from Table 2 that Algorithm 4 successfully solved the VARDIM problem for dimensions ranging from 1000 to 5000, whereas the Zoutendijk method fails to solve the VARDIM problem at these larger dimensions. From Table 3, although Algorithm 4 needs more iterations than the Zoutendijk method, its computation time is smaller, and this advantage becomes more evident as the dimension of the test problem increases.

In summary, the results from Tables 1–3 show that Algorithm 4 is more efficient than the Zoutendijk method and provides an efficient method for solving nonnegative constraints optimization problems.

Acknowledgment

This research is supported by the NSF (11161020) of China.

References

[1] M. R. Hestenes and E. Stiefel, "Methods of conjugate gradients for solving linear systems," Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952. MR0060307, Zbl 0048.09901.

[2] R. Fletcher and C. M. Reeves, "Function minimization by conjugate gradients," The Computer Journal, vol. 7, pp. 149–154, 1964. doi:10.1093/comjnl/7.2.149, MR0187375, Zbl 0132.11701.

[3] E. Polak and G. Ribière, "Note sur la convergence de directions conjuguées," Revue Française d'Informatique et de Recherche Opérationnelle, vol. 16, pp. 35–43, 1969.

[4] B. T. Polyak, "The conjugate gradient method in extremal problems," USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969. Scopus 2-s2.0-0001931644.

[5] R. Fletcher, Practical Methods of Optimization, 2nd ed., John Wiley & Sons Ltd., Chichester, UK, 1987, xiv+436 pp. MR955799.

[6] Y. Liu and C. Storey, "Efficient generalized conjugate gradient algorithms. I. Theory," Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991. doi:10.1007/BF00940464, MR1104590, Zbl 0702.90077.

[7] Y. H. Dai and Y. Yuan, "A nonlinear conjugate gradient method with a strong global convergence property," SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999. doi:10.1137/S1052623497318992, MR1740963, Zbl 0957.65061.

[8] M. J. D. Powell, "Convergence properties of algorithms for nonlinear optimization," SIAM Review, vol. 28, no. 4, pp. 487–500, 1986. doi:10.1137/1028154, MR867680, Zbl 0624.90091.

[9] J. C. Gilbert and J. Nocedal, "Global convergence properties of conjugate gradient methods for optimization," SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992. doi:10.1137/0802003, MR1147881, Zbl 0767.90082.

[10] R. Pytlak, "On the convergence of conjugate gradient algorithms," IMA Journal of Numerical Analysis, vol. 14, no. 3, pp. 443–460, 1994. doi:10.1093/imanum/14.3.443, MR1283946, Zbl 0830.65052.

[11] G. Li, C. Tang, and Z. Wei, "New conjugacy condition and related new conjugate gradient methods for unconstrained optimization," Journal of Computational and Applied Mathematics, vol. 202, no. 2, pp. 523–539, 2007. doi:10.1016/j.cam.2006.03.005, MR2319974, Zbl 1116.65069.

[12] X. Li and X. Zhao, "A hybrid conjugate gradient method for optimization problems," Natural Science, vol. 3, no. 1, pp. 85–90, 2011. doi:10.4236/ns.2011.31012.

[13] Y. H. Dai and Y. Yuan, "An efficient hybrid conjugate gradient method for unconstrained optimization," Annals of Operations Research, vol. 103, pp. 33–47, 2001. doi:10.1023/A:1012930416777, MR1868442, Zbl 1007.90065.

[14] W. W. Hager and H. Zhang, "A new conjugate gradient method with guaranteed descent and an efficient line search," SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005. doi:10.1137/030601880, MR2177774, Zbl 1093.90085.

[15] D.-H. Li, Y.-Y. Nie, J.-P. Zeng, and Q.-N. Li, "Conjugate gradient method for the linear complementarity problem with S-matrix," Mathematical and Computer Modelling, vol. 48, no. 5-6, pp. 918–928, 2008. doi:10.1016/j.mcm.2007.10.017, MR2451124.

[16] D.-H. Li and X.-L. Wang, "A modified Fletcher-Reeves-type derivative-free method for symmetric nonlinear equations," Numerical Algebra, Control and Optimization, vol. 1, no. 1, pp. 71–82, 2011. doi:10.3934/naco.2011.1.71, MR2806294.

[17] L. Zhang, W. Zhou, and D. Li, "Global convergence of a modified Fletcher-Reeves conjugate gradient method with Armijo-type line search," Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006. doi:10.1007/s00211-006-0028-z, MR2249678, Zbl 1103.65074.

[18] L. Zhang, W. Zhou, and D.-H. Li, "A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence," IMA Journal of Numerical Analysis, vol. 26, no. 4, pp. 629–640, 2006. doi:10.1093/imanum/drl016, MR2263891, Zbl 1106.65056.

[19] D. H. Li and X. J. Tong, Numerical Optimization, Science Press, Beijing, China, 2005.

[20] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, "Testing unconstrained optimization software," ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981. doi:10.1145/355934.355936, MR607350, Zbl 0454.65049.