A Class of Steffensen-Type Iterative Methods for Nonlinear Systems

A class of iterativemethodswithout restriction on the computation of Fréchet derivatives includingmultisteps for solving systems of nonlinear equations is presented. By considering a frozen Jacobian, we provide a class ofm-stepmethods with order of convergence m + 1. A new method named as Steffensen-Schulz scheme is also contributed. Numerical tests and comparisons with the existing methods are included.

wherein each function   maps a vector x = ( 1 ,  2 , . . .,   )  of the -dimensional space R  into the real line R.This system contains  nonlinear equations with  unknowns and could be expressed in a more simplified form by defining a function  mapping (x) = ( 1 (x),  2 (x), . . .,   (x))  .Hence, the nonlinear system (1) can be written in the form (x) = 0, where the functions  1 (x),  2 (x), . . .,   (x) are the coordinate functions of ; see for more details [1] and also for application issues the work [2].
We assume that (x) is a smooth function of x in the open convex set  ⊆ R  .There is a strong incentive to use derivative information as well as function values in order to solve traditionally the system (1).The most famous solver for such a problem is Newton's iteration [1], which is defined as x (+1) = x () −   (x () ) −1  (x () ) ,  = 0, 1, 2, . . . .(2) In this iterative method, we use the  ×  Jacobian matrix, that is,   (x), with entries   (x)  =      (x).To be more precise, there are plenty of solvers to tackle the problem (1) or its scalar case, such as those in [3][4][5][6].Among such methods, the third-order iterative methods like the Halley and Chebyshev methods [7] are considered less practically from a computational point of view because they need to compute the expensive second-order Fréchet derivatives in contrast to the quadratically convergent method (2), in which the first-order Fréchet derivative needs to be calculated.As a matter of fact, for the considered problem (1), the first Fréchet derivative is a matrix with  2 entries, while the secondorder Fréchet derivative has  3 entries (without considering the symmetry).On this account, it needs a large amount of operations in order to evaluate the second derivative for each iteration.
To avoid using the Jacobian whose computation is time-consuming for large scale systems, in Chapter 7 of [8], authors presented the definition of divided difference in -dimensional space as an  ×  matrix with elements [, ; ] , = (  ( 1 , . . .,   ,  +1 , . . .,   ) to prevent computing the first-order Fréchet derivative.Note that ( 3) is a component-to-component definition.
The primary aim of the present study is to achieve high rate of convergence in order to solve (1) using (5).Hence, we first propose a new iterative method with fourth-order convergence to find both real and complex solutions.The new method does not even need the evaluation of one firstorder Fréchet derivative, let alone the higher-order ones.Next, by considering a frozen Jacobian matrix, we suggest a general -step class of iterative methods with arbitrary order of convergence and higher computational efficiency index.
The rest of this paper is prepared as follows.In Section 2, the construction of a scheme is offered.It also includes the analysis of convergence and shows that the suggested method has fourth order.In Section 3, we will extend the new method to present a multistep class of iterations.Section 4 contains a discussion about the implementation and the efficiency of the iterative methods.Section 5 contains a contribution of this study by introducing Steffensen-Schulz iterative method for the first time.This is followed by Section 6 where some numerical tests will be furnished to illustrate the accuracy and efficiency of the proposed approach.Section 7 ends the paper where short conclusions of the study by pointing out future research aspects are given.
The following theorem will be demonstrated by means of the -dimensional Taylor expansion of the functions using the estimation (8) in (7).We here include some of the basic notions which are important in the proof.Let  :  ⊆ R  → R  be sufficiently Fréchet differentiable in .By using the notation introduced in [11], the th derivative of . Thus, we have the following: (1) Hence, we take into account It is well known that, for x * + ℎ ∈ R  lying in a neighborhood of a solution x * of the nonlinear system (x) = 0, Taylor's expansion might be written as follows (assuming that the Jacobian matrix   (x * ) is nonsingular): In addition, we can express ), wherein  is the identity matrix of the same order to the Jacobian matrix.Therefore,   ℎ −1 ∈ L(R  ).We also in the sequel denote  () = x () − x * as the error in the th iteration.The equation where , is called the error equation and  is the order of convergence.
Our most important class of methods is to derive a class of iterations free from derivatives using the estimation (5).For example, the method (7) using ( 9) results in Note that it could be shown, in a similar way to the previous theorem, that (17) possesses fourth order of convergence.The implementation of (17) depends on the involved linear algebra problems.An interesting point in the new method (17) is that the LU decomposition of  needs to be done only once, and it could effectively be used three times per computing step to increase the rate of convergence without imposing much computational burden.

An 𝑚-Step Class
This section presents a general class of multistep iteration methods.In fact, the new scheme (17) can simply be improved by considering the Jacobian matrix (x () ,  () ) to be frozen.In such a way, we are able to propose a general -step multipoint class of iterative methods in the following structure: wherein  ()  is used in the linear system (x () ,  () ) ()  = ( ()  −1 ),  = 1, . . ., .We remark that, in this structure, the LU factorization of the Jacobian matrix would be computed only once.This reduces the computational load of the linear algebra problems in implementing (18).
In the iterative process ( 18) each added step will impose one more -dimensional function, whose cost is  while the convergence order will be improved to 1 + ( − 1), wherein ( − 1) is the order of the previous substeps.Considering the well-known mathematical induction, it would be easy to deduce the following Theorem for (18).Theorem 3. Using the same conditions as in Theorem 2, the step iterative process (18) has the local convergence order  + 1 using  + 1 evaluations of the function  and one first-order divided difference operator per full iterations.
Proof.The proof of this theorem is based on mathematical induction and is straightforward.

Complexity
In the iterative method (17), one may solve one linear system of equations per computing step, with three righthand side vectors (x () ), (y () ), and (z () ), and a similar procedure must be applied for (19).In such a case, one could compute a factorization of the matrix and use it repeatedly.
It is known that the cost (number of products/quotients) of solving the associated linear system by LU decomposition is (1/3) 3 +  2 − (1/3) (including the LU factorization and two triangular systems), where  is the size of the system.Moreover, if one has  systems with the same matrix, then the final cost is (1/3) 3 +  2 − (1/3).The computational cost for ( 17) is as follows:  evaluations of scalar functions for (x),  evaluations of scalar functions (y), and  evaluations of scalar functions for (w) and  2 −  (since we have computed above (x) and (w) before) for the estimation (5).
We provide a comparison of efficiency indices for the methods (2) denoted by NM, (4) denoted by SM, (6) denoted by ZM, and the new methods (17) denoted by PM4 and (19) denoted by PM6 based on the computational efficiency index which is also known as operational-efficiency index in the literature [11].
The log plot of the efficiencies according to the definition of this index for an iterative method, which is given by  =  1/ , where  is the order of convergence and  stands for the total computational cost per iteration in terms of the number of functional evaluations and the number of products/quotients in finding the LU decomposition and two triangular systems, is given in Figure 1.It is clearly obvious that the new method (19) for higher  has dominance with respect to the other well-known methods.
Although the new scheme does not have the drawbacks of NM or ZM in terms of computing Fréchet derivatives or three different divided difference operators, a significant focus should be allocated for large scale nonlinear systems.In fact, for large scale sparse nonlinear systems of equations, we have two main shortcomings, first putting numeric values into the Jacobian matrix for Newton-type methods and second the LU decomposition, which is time-consuming.The suggested method in this work does not need the computations of the Fréchet derivatives and, due to this, it resolves the first problem.For solving the second problem, the new scheme should be coded in the inexact form (see, e.g., [12,13]) using very efficient solvers for the solution of linear systems of equations such as GMRES.

A Multiplication-Rich Scheme
A contribution of this study lies in a simple trick that could be used in this section to produce derivative-free schemes which are inversion-free as well.As a matter of fact, the general class ( 18) is free from Fréchet derivative per computing step and this makes it nice but we could use a simple trick to make the process inversion-free.That is to say, the computation of the inverse of  in ( 18) is tough since it could be a dense matrix.Although we use the LU decomposition to proceed per cycle, we could apply an iterative inverse finder to avoid the computation of inverse (or solving a system) by imposing further matrix-matrix multiplications.Such a procedure could be done using the well-known Schulz-type methods (see, e.g., [14]).Here, we apply the Schulz inverse finder for this purpose, for one step of the matrix iterative method (18) as follows: wherein   = (x () ,  () ) = ((x () +  ()  1 ) − (x () ), . . ., (x () +  ()   ) − (x () )) () −1 and  1 ,   are the largest and smallest singular values of .Unfortunately, convergence of this iterative scheme happens for initial matrices so close to the zeros and it might not be quadratic.In fact, the Steffensen method (SM) and its variants obtained by the class ( 18), for instance, the approach (19), should become inversion-free easily and efficiently.We now illustrate the basins of attraction for SM2 and PM4 in the complex square of [−4, 4] × [−4, 4], when the stopping criterion is | +1 −   | ≤ 10 −6 .Hopefully, the convergence radius for tackling nonlinear problems can become broadened by introducing the free nonzero parameter , which would be a constant array in the multivariate case [9].Introducing  yields the following form of Steffensen's method: Figure 2(a) shows the SM without imposing small values for , that is, with  = 1, and Figure 2(b) is the basins of attraction of PM4. Figure 3 shows the basins of attraction of SM and PM4, but with small values of .
We name this multiplication-rich method as the Steffensen-Schulz iteration since it is derivative-free and inversion-free at the same time.It only requires one matrix inversion in the whole process of computing x (1) .

Numerical Testing
We employ here the second-order method of Newton (2), the second-order scheme of Steffensen (4), the fourth-order scheme of Zheng et al. (6), and the proposed fourth-order derivative-free method (17) to compare the numerical results obtained from these methods in solving test nonlinear systems.We also put (22) into test for Test Problems 1 and 4.
We report the numerical results for solving Tests 1-3 in Tables 1-3 based on the initial values.Note that an important aspect in implementing iterative method for solving nonlinear systems is to find a robust initial guess to guarantee the convergence.Some discussions regarding this are given in [16,17].
The residual norm along with the number of iterations and computational time using the command Timing in Mathematica 8 [18] is reported in Tables 1-3.The computer specifications are Microsoft Windows XP Intel(R), Pentium(R) 4 CPU, and 3.20 GHz with 4 GB of RAM.
An efficient way to observe the behavior of the order of convergence is to use the local computational order of convergence (COC) that can be defined by  ≈ ln(‖ (x (+1) ) ‖ / ‖ (x () ) ‖)/ ln(‖ (x () ) ‖ / ‖ (x (−1) ) ‖), in the -dimensional case.We have used this index in the numerical comparisons listed in Tables 1-3 for each different method to illustrate their numerical orders.
In numerical comparisons, we have chosen the fixed point arithmetic to be 200 using the command SetAccuarcy [expr,200] in the written codes.Note that, for some iterative methods, their residual norm at the specified iteration will exceed the bound of 10 −200 ; thus, we consider such approximations as the exact solution and denoted the residual norm by 0 in some cells of the tables.
Results in Table 1 show that the new scheme can be considered for complex solutions of hard nonlinear systems.
In this test, due to the fact that the dimension of the nonlinear system is low, we have used the LU decomposition to prevent the computation of 3 linear systems.Computational time reported in Table 1 verifies the fact that derivative-free methods with few number of divided difference operators are reliable.Note that Figure 4(a) reveals the residual fall of (22) for solving Test 1.It also reveals a quadratic convergence in the obtained residuals per full cycle.
In order to tackle large scale nonlinear systems, we have included Tests 2 and 3 in this work.As could be seen from Tables 2 and 3, the cases for 99 × 99 and 200 × 200 are considered, respectively.It is obvious that for large scale nonlinear systems we have two difficulties in implementation; first the LU decomposition takes time to be attained and second finding the numeric values of the Jacobian matrix in the Newton-type methods is costly.However, to have a fair comparison, we have not defined the pattern of the Jacobian in the computations and used LU decomposition in solving the considered linear systems.

Figure 2 :
Figure 2: Attraction basins of the polynomial  3 − 1 = 0 in the complex plane (shaded according to the number of iterations).

Figure 3 :
Figure 3: Attraction basins of the polynomial  3 − 1 = 0 in the complex plane (shaded according to the number of iterations).