Journal of Applied Mathematics, Volume 2013, Article ID 875935. doi:10.1155/2013/875935

Research Article

An Improved Diagonal Jacobian Approximation via a New Quasi-Cauchy Condition for Solving Large-Scale Systems of Nonlinear Equations

Mohammed Yusuf Waziri (1, 2) and Zanariah Abdul Majid (1, 3)

1 Department of Mathematics, Faculty of Science, Universiti Putra Malaysia, 43400 Serdang, Selangor, Malaysia
2 Department of Mathematical Sciences, Faculty of Science, Bayero University, Kano, PMB 3011, Nigeria
3 Institute for Mathematical Research, Universiti Putra Malaysia, 43400 Serdang, Selangor, Malaysia

Received 8 August 2012; Revised 14 December 2012; Accepted 15 December 2012

Copyright © 2013 Mohammed Yusuf Waziri and Zanariah Abdul Majid. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We present a new diagonal quasi-Newton update with an improved diagonal Jacobian approximation for solving large-scale systems of nonlinear equations. In this approach, the Jacobian approximation is derived from the quasi-Cauchy condition. We aim to further improve the performance of diagonal updating by modifying the quasi-Cauchy relation so that it carries some additional information from the functions. The effectiveness of our proposed scheme is appraised through numerical comparison with some well-known Newton-like methods.

1. Introduction

Let us consider the system of nonlinear equations
(1) $F(x) = 0$,
where $F:\mathbb{R}^n \to \mathbb{R}^n$ is a nonlinear mapping. Often, the mapping $F$ is assumed to satisfy the following assumptions:

there exists an $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$;

F is a continuously differentiable mapping in a neighborhood of x*;

$F'(x^*)$ is invertible.

The well-known method for finding the solution to (1) is the classical Newton's method, which generates a sequence of iterates $\{x_k\}$ from a given initial point $x_0$ via
(2) $x_{k+1} = x_k - (F'(x_k))^{-1} F(x_k)$, $k = 0, 1, 2, \ldots$.
The attractive features of this method are its rapid convergence and ease of implementation. Nevertheless, Newton's method requires computing the Jacobian matrix, whose entries are the first-order derivatives of the system. In practice, computing some of these derivatives is quite costly, and sometimes they are not available or cannot be obtained precisely. In such cases, Newton's method cannot be applied directly.
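Iteration (2) can be sketched in a few lines; the two-variable test system and its Jacobian below are illustrative choices of ours, not drawn from the paper:

```python
import numpy as np

def newton(F, J, x0, tol=1e-8, max_iter=100):
    """Classical Newton iteration x_{k+1} = x_k - J(x_k)^{-1} F(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            break
        # Solve the linear system J(x) d = -F(x) rather than forming the inverse.
        x = x + np.linalg.solve(J(x), -Fx)
    return x

# Illustrative system: F(x) = (x_1^2 - 1, x_2^2 - 4), with a root at (1, 2).
F = lambda x: np.array([x[0]**2 - 1.0, x[1]**2 - 4.0])
J = lambda x: np.diag([2.0 * x[0], 2.0 * x[1]])
root = newton(F, J, x0=[3.0, 3.0])
```

Note that each Newton step requires forming and factoring the $n \times n$ Jacobian, which is precisely the cost the diagonal updating scheme of this paper avoids.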

Substantial efforts have been made by numerous researchers to eliminate the well-known shortcomings of Newton's method for solving systems of nonlinear equations, particularly large-scale systems (see, e.g., [1, 2]). Notwithstanding, most of these modifications still share some of the shortcomings of their Newton counterpart. For example, Broyden's method and the Chord Newton method each need to store an $n \times n$ matrix, and their floating-point cost per iteration is $O(n^2)$.

To tackle these disadvantages, a diagonal quasi-Newton method was suggested by Leong et al. [3], who showed that their updating formula is significantly cheaper than Newton's method and some of its variants. Motivated by this, we present an approach that further improves the diagonal Jacobian approximation while reducing the computational cost, floating-point operations, and number of iterations. This is the idea behind this paper: to further improve the performance of diagonal updating by modifying the quasi-Cauchy relation so that it carries some additional information from the functions. We organize the paper as follows. In the next section, we present the details of the proposed method. Convergence results are presented in Section 3. Some numerical results are reported in Section 4. Finally, conclusions are drawn in Section 5.

2. Derivation Process

This section presents a new diagonal quasi-Newton-like method for solving large-scale systems of nonlinear equations. The quasi-Newton method is an iterative method that generates a sequence of points $\{x_k\}$ from a given initial guess $x_0$ via
(3) $x_{k+1} = x_k - \alpha_k B_k F(x_k)$, $k = 0, 1, 2, \ldots$,
where $\alpha_k$ is a step length and $B_k$ is an approximation to the Jacobian inverse, updated at each iteration. With $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$, the updated matrix $B_{k+1}$ is chosen in such a way that it satisfies the secant equation, that is,
(4) $B_{k+1} s_k = y_k$.

It is clear that the only Jacobian information we have is $y_k$, and this is only approximate information. To this end, we incorporate more information from $s_k$ and $F_k$ into $y_k$ in order to obtain a better approximation to the Jacobian matrix. We consider the modification of $y_k$ presented by Li and Fukushima [4]:
(5) $\breve{y}_k = y_k + v_k \|F_k\| s_k$,
where $v_k = 1 + \max\{-s_k^T y_k / \|s_k\|^2,\, 0\}$.

Our aim here is to build a square matrix, say $B$, using a diagonal updating scheme that approximates the Jacobian inverse, and we let $B_{k+1}$ satisfy the quasi-Cauchy equation, that is,
(6) $\breve{y}_k^T s_k = \breve{y}_k^T B_{k+1} \breve{y}_k$.

In addition, the deviation between $B_{k+1}$ and $B_k$ is minimized in the Frobenius norm; hence, in the following theorem, we state the resulting update formula for $B_k$.

Theorem 1.

Let $B_{k+1}$ be the diagonal update of a diagonal matrix $B_k$, and denote the deviation between $B_k$ and $B_{k+1}$ by $\Psi_k = B_{k+1} - B_k$. Suppose that $\breve{y}_k \neq 0$, where $\breve{y}_k$ is defined by (5). Consider the following problem:
(7) $\min \frac{1}{2} \|\Psi_k\|_F^2$ subject to $\breve{y}_k^T (B_k + \Psi_k) \breve{y}_k = \breve{y}_k^T s_k$,
where $\|\cdot\|_F$ denotes the Frobenius norm. Then the optimal solution of (7) is given by
(8) $\Psi_k = \dfrac{\breve{y}_k^T s_k - \breve{y}_k^T B_k \breve{y}_k}{\operatorname{tr}(V_k^2)} V_k$,
where $V_k = \operatorname{diag}\big((\breve{y}_k^{(1)})^2, (\breve{y}_k^{(2)})^2, \ldots, (\breve{y}_k^{(n)})^2\big)$, $\sum_{i=1}^n (\breve{y}_k^{(i)})^4 = \operatorname{tr}(V_k^2)$, and $\operatorname{tr}$ is the trace operator.

Proof.

Consider the Lagrangian function of (7):
(9) $L(\Psi_k, \alpha) = \frac{1}{2} \|\Psi_k\|_F^2 + \alpha \big(\breve{y}_k^T \Psi_k \breve{y}_k - \breve{y}_k^T s_k + \breve{y}_k^T B_k \breve{y}_k\big)$,
where $\alpha$ is the corresponding Lagrange multiplier. Differentiating $L$ with respect to each diagonal element $\Psi_k^{(1)}, \Psi_k^{(2)}, \ldots, \Psi_k^{(n)}$ and setting the derivatives equal to zero, we obtain
(10) $\Psi_k^{(i)} = -\alpha (\breve{y}_k^{(i)})^2$, $i = 1, 2, \ldots, n$.

Multiplying both sides of (10) by $(\breve{y}_k^{(i)})^2$ and summing over $i$ gives
(11) $\sum_{i=1}^n (\breve{y}_k^{(i)})^2 \Psi_k^{(i)} = -\alpha \sum_{i=1}^n (\breve{y}_k^{(i)})^4$.

Differentiating $L$ with respect to $\alpha$, and since $\breve{y}_k^T \Psi_k \breve{y}_k = \sum_{i=1}^n (\breve{y}_k^{(i)})^2 \Psi_k^{(i)}$, we have
(12) $\sum_{i=1}^n (\breve{y}_k^{(i)})^2 \Psi_k^{(i)} = \breve{y}_k^T s_k - \breve{y}_k^T B_k \breve{y}_k$.

Equating (11) and (12) and substituting the resulting multiplier into (10), we finally have
(13) $\Psi_k^{(i)} = \dfrac{\breve{y}_k^T s_k - \breve{y}_k^T B_k \breve{y}_k}{\sum_{i=1}^n (\breve{y}_k^{(i)})^4} (\breve{y}_k^{(i)})^2$, $i = 1, 2, \ldots, n$.

Since $B_k^{(i)}$ is a diagonal component of $B_k$ and $\breve{y}_k^{(i)}$ is the $i$th component of the vector $\breve{y}_k$, we have $V_k = \operatorname{diag}\big((\breve{y}_k^{(1)})^2, (\breve{y}_k^{(2)})^2, \ldots, (\breve{y}_k^{(n)})^2\big)$ and $\sum_{i=1}^n (\breve{y}_k^{(i)})^4 = \operatorname{tr}(V_k^2)$. We can therefore rewrite (13) as
(14) $\Psi_k = \dfrac{\breve{y}_k^T s_k - \breve{y}_k^T B_k \breve{y}_k}{\operatorname{tr}(V_k^2)} V_k$,
which completes the proof.

Hence, the best possible updating formula for the diagonal matrix $B_{k+1}$ is given by
(15) $B_{k+1} = B_k + \dfrac{\breve{y}_k^T s_k - \breve{y}_k^T B_k \breve{y}_k}{\operatorname{tr}(V_k^2)} V_k$.
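As a sketch of how (5) and (15) combine in practice, the update can be applied to the stored diagonal of $B_k$ in $O(n)$ time; the function below and its safeguard threshold are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def idja_update(b, s, y, Fk_norm, eps=1e-4):
    """One diagonal update B_{k+1} = B_k + ((y'^T s - y'^T B y') / tr(V^2)) V,
    where y' is the modified difference vector from (5).
    `b` holds the diagonal of B_k, so storage and cost stay O(n)."""
    v = 1.0 + max(-np.dot(s, y) / np.dot(s, s), 0.0)
    yb = y + v * Fk_norm * s            # modified y' from (5)
    V = yb**2                           # diagonal entries of V_k
    trV2 = np.sum(yb**4)                # tr(V_k^2)
    if trV2 < eps:                      # safeguard: skip a near-singular update
        return b
    return b + (np.dot(yb, s) - np.dot(yb, b * yb)) / trV2 * V
```

By construction the updated diagonal satisfies the quasi-Cauchy condition (6), $\breve{y}_k^T s_k = \breve{y}_k^T B_{k+1} \breve{y}_k$, whenever the safeguard does not trigger.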

Now, we can describe the algorithm for our proposed method as follows.

Algorithm IDJA

Step 1.

Choose an initial guess $x_0$, $\sigma \in (0, 1)$, $\gamma > 1$, $B_0 = I_n$, $\alpha_0 > 0$, and let $k := 0$.

Step 2.

Compute $F(x_k)$; if $\|F(x_k)\| \le 10^{-8}$, stop.

Step 3.

Compute $d_k = -B_k F(x_k)$.

Step 4.

If $\|F(x_k + \alpha_k d_k)\| \le \sigma \|F(x_k)\|$, retain $\alpha_k$ and go to Step 5. Otherwise set $\alpha_k := \alpha_k / 2$ and repeat Step 4.

Step 5.

If $\|F(x_k + \alpha_k d_k)\| - \|F(x_k)\| \le \|F(x_k + \alpha_k d_k) - F(x_k)\|$, retain $\alpha_k$ and go to Step 6. Otherwise set $\alpha_k := \alpha_k \times \gamma$ and repeat Step 5.

Step 6.

Let $x_{k+1} = x_k + \alpha_k d_k$.

Step 7.

If $\|x_{k+1} - x_k\|_2 + \|F(x_k)\|_2 \le 10^{-8}$, stop. Otherwise go to Step 8.

Step 8.

If $\|\Delta F_k\|_2 \ge \epsilon_1$, where $\epsilon_1 = 10^{-4}$, compute $B_{k+1}$ as defined by (15); if not, set $B_{k+1} = B_k$.

Step 9.

Set k:=k+1 and go to Step 2.
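A compact sketch of Algorithm IDJA follows; Steps 4 and 5 are condensed to plain backtracking, and the parameter values and safeguard are illustrative choices of ours rather than the paper's tuned settings. The driver at the bottom applies it to a Problem 1-style system $f_i(x) = x_i^2 - 1$:

```python
import numpy as np

def idja_solve(F, x0, sigma=0.5, eps1=1e-4, tol=1e-8, max_iter=200):
    """Sketch of Algorithm IDJA: B_k is stored as its diagonal (vector b),
    so each iteration costs O(n). Steps 4-5 are condensed to simple halving."""
    x = np.asarray(x0, dtype=float)
    b = np.ones_like(x)                       # Step 1: B_0 = I
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:         # Step 2
            break
        d = -b * Fx                           # Step 3: d_k = -B_k F(x_k)
        alpha = 1.0                           # Step 4: backtrack until decrease
        while np.linalg.norm(F(x + alpha * d)) > sigma * np.linalg.norm(Fx) \
                and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d                 # Step 6
        s, y = x_new - x, F(x_new) - Fx
        if np.linalg.norm(y) >= eps1:         # Step 8 safeguard
            v = 1.0 + max(-np.dot(s, y) / np.dot(s, s), 0.0)
            yb = y + v * np.linalg.norm(Fx) * s
            b = b + (np.dot(yb, s) - np.dot(yb, b * yb)) / np.sum(yb**4) * yb**2
        x = x_new                             # Step 9
    return x

# Problem 1 style system: f_i(x) = x_i^2 - 1, x0 = (5, ..., 5).
F = lambda x: x**2 - 1.0
sol = idja_solve(F, 5.0 * np.ones(4))
```

Note that only vectors of length $n$ are ever stored, which is what keeps the memory requirement of the method at $O(n)$.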

3. Convergence Result

This section presents local convergence results for the IDJA method. To analyze its convergence, we make the following assumptions on the nonlinear system $F$.

Assumption 2.

(i) $F$ is differentiable in an open convex set $E$ in $\mathbb{R}^n$.

(ii) There exists $x^* \in E$ such that $F(x^*) = 0$, and $F'(x)$ is continuous for all $x$.

(iii) $F'(x)$ satisfies a Lipschitz condition of order one; that is, there exists a positive constant $\mu$ such that
(16) $\|F'(x) - F'(y)\| \le \mu \|x - y\|$
for all $x, y \in \mathbb{R}^n$.

(iv) There exist constants $c_1 \le c_2$ such that $c_1 \|\omega\|^2 \le \omega^T F'(x) \omega \le c_2 \|\omega\|^2$ for all $x \in E$ and $\omega \in \mathbb{R}^n$.

We can state the following result on the boundedness of $\{\|\Psi_k\|_F\}$ by assuming, without loss of generality, that the updating formula (15) is always used.

Theorem 3.

Suppose that $\{x_k\}$ is generated by Algorithm IDJA, where $B_k$ is defined by (15), and assume that Assumption 2 holds. Then there exist $\beta > 0$, $\delta > 0$, $\alpha > 0$, and $\gamma > 0$ such that if $x_0 \in E$ and $B_0$ satisfies $\|I - B_0 F'(x^*)\|_F < \delta$, then for all $x_k \in E$
(17) $\|I - B_k F'(x^*)\|_F < \delta_k$
for some constants $\delta_k > 0$, $k \ge 0$.

Proof.

Since $B_{k+1} = B_k + \Psi_k$, it follows that
(18) $\|B_{k+1}\|_F \le \|B_k\|_F + \|\Psi_k\|_F$.

For $k = 0$, assuming $B_0 = I$, we have
(19) $|\Psi_0^{(i)}| = \left| \dfrac{\breve{y}_0^T s_0 - \breve{y}_0^T B_0 \breve{y}_0}{\operatorname{tr}(V_0^2)} (\breve{y}_0^{(i)})^2 \right| \le \dfrac{|\breve{y}_0^T s_0 - \breve{y}_0^T B_0 \breve{y}_0|}{\operatorname{tr}(V_0^2)} (\breve{y}_0^{(\max)})^2$,
where $(\breve{y}_0^{(\max)})^2$ is the largest element among $(\breve{y}_0^{(i)})^2$, $i = 1, 2, \ldots, n$.

After multiplying (19) by $(\breve{y}_0^{(\max)})^2 / (\breve{y}_0^{(\max)})^2$ and substituting $\operatorname{tr}(V_0^2) = \sum_{i=1}^n (\breve{y}_0^{(i)})^4$, we have
(20) $|\Psi_0^{(i)}| \le \dfrac{|\breve{y}_0^T s_0 - \breve{y}_0^T B_0 \breve{y}_0|}{(\breve{y}_0^{(\max)})^2} \cdot \dfrac{(\breve{y}_0^{(\max)})^4}{\sum_{i=1}^n (\breve{y}_0^{(i)})^4}$.

Since $(\breve{y}_0^{(\max)})^4 / \sum_{i=1}^n (\breve{y}_0^{(i)})^4 \le 1$, (20) turns into
(21) $|\Psi_0^{(i)}| \le \dfrac{|\breve{y}_0^T F'(x) \breve{y}_0 - \breve{y}_0^T B_0 \breve{y}_0|}{(\breve{y}_0^{(\max)})^2}$.

From Assumption 2 and $B_0 = I$, (21) becomes
(22) $|\Psi_0^{(i)}| \le \dfrac{|c - 1| (\breve{y}_0^T \breve{y}_0)}{(\breve{y}_0^{(\max)})^2}$,
where $c = \max\{|c_1|, |c_2|\}$.

Since $(\breve{y}_0^{(i)})^2 \le (\breve{y}_0^{(\max)})^2$ for $i = 1, \ldots, n$, it follows that
(23) $|\Psi_0^{(i)}| \le \dfrac{n |c - 1| (\breve{y}_0^{(\max)})^2}{(\breve{y}_0^{(\max)})^2} = n |c - 1|$.

Hence, we obtain
(24) $\|\Psi_0\|_F \le n^{3/2} |c - 1|$.

Setting $\alpha = n^{3/2} |c - 1|$, we have
(25) $\|\Psi_0\|_F \le \alpha$.

From the fact that $\|B_0\|_F = \sqrt{n}$, it follows that
(26) $\|B_1\|_F \le \beta$,
where $\beta = \sqrt{n} + \alpha > 0$.

Therefore, if we assume that $\|I - B_0 F'(x^*)\|_F < \delta$, then
(27) $\|I - B_1 F'(x^*)\|_F = \|I - (B_0 + \Psi_0) F'(x^*)\|_F \le \|I - B_0 F'(x^*)\|_F + \|\Psi_0 F'(x^*)\|_F \le \|I - B_0 F'(x^*)\|_F + \|\Psi_0\|_F \|F'(x^*)\|_F$;
therefore, $\|I - B_1 F'(x^*)\|_F < \delta + \alpha \phi = \delta_1$, where $\phi = \|F'(x^*)\|_F$.

Hence, by induction, $\|I - B_k F'(x^*)\|_F < \delta_k$ for all $k$.

4. Numerical Results

In this section, the performance of the IDJA method is presented and compared with Broyden's method (BM), the Chord Newton method (CN), Newton's method (NM), and the diagonal quasi-Newton method (DQNM) proposed by Leong et al. [3]. The codes are written in MATLAB 7.4 in double precision; the stopping condition used is
(28) $\|s_k\| + \|F(x_k)\| \le 10^{-8}$.

The identity matrix has been chosen as an initial approximate Jacobian inverse.

We further design the codes to terminate whenever one of the following happens:

the number of iterations reaches 200 without a point $x_k$ satisfying (28);

the CPU time in seconds reaches 200;

there is insufficient memory to initialize the run.

The performance of these methods is compared in terms of the number of iterations and CPU time in seconds. In the following, some details on the benchmark test problems are presented.

Problem 1.

Sparse 1 function of Shin et al. [5]:
(29) $f_i(x) = x_i^2 - 1$, $i = 1, 2, \ldots, n$, $x_0 = (5, 5, \ldots, 5)$.

Problem 2.

Trigonometric function of Spedicato [6]:
(30) $f_i(x) = n - \sum_{j=1}^n \cos x_j + i(1 - \cos x_i) - \sin x_i$, $i = 1, \ldots, n$, $x_0 = \left(\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n}\right)$.

Problem 3.

System of $n$ nonlinear equations:
(31) $f_i(x) = \sin(1 - x_i) \sum_{i=1}^n x_i^2 + 2x_{n-1} - 3x_{n-2} - \frac{1}{2} x_{n-4} + \frac{1}{2} x_{n-5} - x_i \ln(9 + x_i) - \frac{9}{2} \exp(1 - x_n) + 2$, $i = 1, 2, \ldots, n$, $x_0 = (0, 0, \ldots, 0)^T$.

Problem 4.

System of $n$ nonlinear equations:
(32) $f_i(x) = x_i^2 - 4\exp(\sin(4 - x_i^2)) + \sin(4 - x_i)^2 + i(x_n - x_i)^2 + 2n - \sum_{i=1}^n x_i \cos x_i$, $i = 1, \ldots, n$, $x_0 = (2.8, 2.8, \ldots, 2.8)$.

Problem 5.

System of $n$ nonlinear equations:
(33) $f_i(x) = \left(\sum_{i=1}^n x_i\right)(x_i - 2) + (\cos x_i - 2) - 1$, $i = 1, \ldots, n$, $x_0 = (1, 1, \ldots, 1)$.

Problem 6.

System of $n$ nonlinear equations:
(34) $f_i(x) = \sum_{i=1}^n x_i^2 - \left(\sin(x_i) - x_i^4 + \sin x_i^2\right)$, $i = 1, \ldots, n$, $x_0 = (0.5, 0.5, \ldots, 0.5)$.

Problem 7.

System of $n$ nonlinear equations:
(35) $f_j(x) = \left(\sum_{i=1}^n x_i^2 - 1\right)(x_j - 1) + x_j \left(\sum_{i=1}^n (x_i - 1)\right) - n + 1$, $j = 2, \ldots, n-1$,
$f_n(x) = \left(\sum_{i=1}^n x_i^2 - 1\right)(x_n - 1) + (\cos x_n - 1) - 1$,
$x_0 = (0.5, 0.5, 0.5, \ldots)$.

Problem 8.

System of $n$ nonlinear equations:
(36) $f_i(x) = (1 - x_i^2) + x_i + x_i^2 x_n - 2x_{n-1} x_{n-2}$, $i = 1, 2, \ldots, n$, $x_0 = (0.5, 0.5, \ldots, 0.5)$.
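The simpler benchmark systems above translate directly into code; the vectorized forms below for Problems 1 and 2 are our own sketches of the stated formulas:

```python
import numpy as np

def problem1(x):
    # Problem 1: f_i(x) = x_i^2 - 1, so any vector of +/-1 entries is a root.
    return x**2 - 1.0

def problem2(x):
    # Problem 2 (trigonometric): f_i(x) = n - sum_j cos x_j + i(1 - cos x_i) - sin x_i.
    n = x.size
    i = np.arange(1, n + 1)
    return n - np.sum(np.cos(x)) + i * (1.0 - np.cos(x)) - np.sin(x)
```

For instance, `problem1` vanishes at the vector of ones, and `problem2` vanishes at the origin, which makes both convenient smoke tests for a solver implementation.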

The numerical results presented in Tables 1, 2, 3, 4, and 5 demonstrate clearly that the proposed method (IDJA) yields good improvements when compared with NM, CN, BM, and DQNM, respectively. In addition, it is worth mentioning that the IDJA method does not require more storage locations than classic diagonal quasi-Newton methods. One can observe from the tables that the proposed method (IDJA) is faster than the DQNM method and requires less time to solve the problems than the other Newton-like methods, while keeping the memory requirement and per-iteration cost to only $O(n)$.

Table 1. Numerical results of NM, CN, BM, DQNM, and IDJA methods ($n = 50$).

prob Dim NM CN BM DQNM IDJA
NI CPU NI CPU NI CPU NI CPU NI CPU
1 50 7 0.046 55 0.031 15 0.031 14 0.016 2 0.011
2 50 9 0.078 344 0.062 15 0.031 15 0.031 13 0.031
3 50 10 0.062 20 0.016 10 0.016
4 50 19 0.031 9 0.031
5 50 12 0.078 42 0.031 16 0.016 8 0.015
6 50 8 0.064 16 0.032 14 0.031 7 0.014
7 50 8 0.094 25 0.031 14 0.010
8 50 11 0.064 11 0.0312 11 0.016 9 0.016

Table 2. Numerical results of NM, CN, BM, DQNM, and IDJA methods ($n = 100$).

prob Dim NM CN BM DQNM IDJA
NI CPU NI CPU NI CPU NI CPU NI CPU
1 100 7 0.156 98 0.094 15 0.043 14 0.016 2 0.011
2 100 10 0.187 18 0.062 16 0.032 13 0.032
3 100 7 0.203 24 0.140 15 0.031 7 0.015
4 100 13 0.031 10 0.030
5 100 13 0.265 53 0.109 17 0.031 12 0.031
6 100 8 0.203 16 0.047 14 0.031 7 0.017
7 100 8 0.185 26 0.031 16 0.030
8 100 11 0.234 11 0.094 11 0.032 10 0.016

Table 3. Numerical results of NM, CN, BM, DQNM, and IDJA methods ($n = 250$).

prob Dim NM CN BM DQNM IDJA
NI CPU NI CPU NI CPU NI CPU NI CPU
1 250 7 0.359 100 0.109 15 0.101 14 0.034 2 0.032
2 250 11 0.640 21 0.218 18 0.032 8 0.031
3 250 8 0.499 29 0.250 16 0.016 9 0.016
4 250 15 0.031 10 0.032
5 250 14 0.827 19 0.031 8 0.016
6 250 8 0.686 24 0.250 14 0.031 10 0.031
7 250 8 0.499 27 0.031 14 0.031
8 250 11 0.484 11 0.125 11 0.031 10 0.016

Table 4. Numerical results of NM, CN, BM, DQNM, and IDJA methods ($n = 500$).

prob Dim NM CN BM DQNM IDJA
NI CPU NI CPU NI CPU NI CPU NI CPU
1 500 7 0.796 101 0.702 15 0.671 14 0.016 2 0.011
2 500 13 1.997 23 0.972 19 0.031 9 0.032
3 500 7 1.4352 17 0.031 9 0.031
4 500 12 0.030 10 0.031
5 500 15 2.449 21 0.031 9 0.031
6 500 8 2.184 23 0.998 14 0.032 10 0.045
7 500 8 1.498 32 0.047 15 0.047
8 500 11 1.451 11 0.515 11 0.031 9 0.031

Table 5. Numerical results of NM, CN, BM, DQNM, and IDJA methods ($n = 1000$).

prob Dim NM CN BM DQNM IDJA
NI CPU NI CPU NI CPU NI CPU NI CPU
1 1000 7 2.730 103 3.167 38 9.438 14 0.016 2 0.011
2 1000 31 7.722 20 0.032 8 0.043
3 1000 9 5.819 17 0.031 9 0.031
4 1000 11 0.064 10 0.064
5 1000 16 8.705 22 0.031 10 0.031
6 1000 8 6.474 14 0.062 11 0.061
7 1000 8 4.321 38 0.062 31 0.047
8 1000 11 4.882 11 2.418 11 0.032 10 0.031
5. Conclusions

In this paper, we present an improved diagonal quasi-Newton update via a new quasi-Cauchy condition for solving large-scale systems of nonlinear equations (IDJA). The Jacobian inverse approximation is derived from the quasi-Cauchy condition. The aim has been to further improve the diagonal Jacobian approximation by modifying the quasi-Cauchy relation so that it carries some additional information from the functions. It is also worth mentioning that the method is capable of significantly reducing the execution time (CPU time) as compared to the NM, CN, BM, and DQNM methods, while maintaining good accuracy of the numerical solution. Another fact that makes the IDJA method appealing is that, throughout the numerical experiments, it never failed to converge. Hence, we can claim that our method (IDJA) is a good alternative to Newton-type methods for solving large-scale systems of nonlinear equations.

References

[1] J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice Hall, Englewood Cliffs, NJ, USA, 1983.
[2] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, PA, USA, 1995.
[3] W. J. Leong, M. A. Hassan, and M. Y. Waziri, "A matrix-free quasi-Newton method for solving large-scale nonlinear systems," Computers & Mathematics with Applications, vol. 62, no. 5, pp. 2354–2363, 2011.
[4] D.-H. Li and M. Fukushima, "A modified BFGS method and its global convergence in nonconvex minimization," Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001.
[5] B.-C. Shin, M. T. Darvishi, and C.-H. Kim, "A comparison of the Newton-Krylov method with high order Newton-like methods to solve nonlinear systems," Applied Mathematics and Computation, vol. 217, no. 7, pp. 3190–3198, 2010.
[6] E. Spedicato, "Computational experience with quasi-Newton algorithms for minimization problems of moderately large size," Report CISE-N-175, 1975.