A Method to Accelerate the Convergence of the Secant Algorithm

We present an acceleration technique for the Secant method. The Secant method is a root-searching algorithm for a general function f. We exploit the fact that the combination of two Secant steps leads to an improved, so-called first-order approximant of the root. The original Secant algorithm can be modified to a first-order accelerated algorithm which generates a sequence of first-order approximants. This process can be repeated: two nth order approximants can be combined into an (n + 1)th order approximant, and the algorithm can be modified to an (n + 1)th order accelerated algorithm which generates a sequence of such approximants. We show that the sequence of nth order approximants converges to the root with the same order as methods using polynomial fits of f of degree n.


1. Introduction
The Secant algorithm is a textbook algorithm to find a numerical approximation of the root of a function f(x). A root α is a solution of the equation f(α) = 0. Other such algorithms are, for example, the bisection algorithm, inverse quadratic interpolation, the regula falsi algorithm, Muller's method, the Newton-Raphson algorithm, Steffensen's method, the Brent algorithm, and many more. These methods are discussed in many books and articles; see, for example, [1][2][3][4][5][6][7][8][9][10][11]. All the algorithms mentioned are intended for a general function f. They all take one, two, or more initial estimates of α as input and iteratively generate a sequence {x_k} of approximants of α. The sequence converges to the root α for suitably chosen initial estimates and a function f meeting particular regularity requirements at and around α. The exact requirements differ from method to method. Root-finding plays a role in many problems, also when this is not immediately apparent. An example is the problem of solving a set of linear equations [12].
The Secant algorithm has the characteristics that (a) it is "derivative-free," that is, it does not require the evaluation of a derivative of f, and (b) it requires only one evaluation of f per iteration. The generated sequence {x_k} converges superlinearly with order φ_0 = (1 + √5)/2 ≈ 1.6180 for a large class of functions f.
It is important to stress that only one evaluation of f per iteration is needed. Situations which require an efficient root-finding algorithm are typically situations in which the execution time of the algorithm is dominated by the time needed to calculate the value of f. In these situations it is important that as few evaluations of f as possible are needed to estimate the root with a certain accuracy. An algorithm which requires m evaluations of f per iteration is therefore only competitive with the Secant algorithm if one iteration produces a better estimate than m subsequent Secant iterations. In other words, it must converge with an order larger than φ_0^m. To the best of our knowledge, there are no derivative-free algorithms which achieve this, except for the generalizations of the Secant algorithm discussed below.
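The comparison with m Secant steps can be made explicit by a short asymptotic calculation (a standard sketch using the Secant order relation, not taken from the paper):

```latex
|e_{k+1}| \approx A\,|e_k|^{\varphi_0}
\quad\Longrightarrow\quad
|e_{k+m}| \approx A^{\,1+\varphi_0+\cdots+\varphi_0^{m-1}}\,|e_k|^{\varphi_0^{m}},
\qquad \varphi_0 = \frac{1+\sqrt{5}}{2}.
```

Thus m consecutive Secant steps behave like a single step of order φ_0^m, which is why a method spending m evaluations of f per iteration must exceed this order to compete.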
In this paper we derive a generalization of the Secant method with the following properties: (a) it is derivative-free, (b) it requires one evaluation of f per iteration, and (c) it achieves an order of convergence arbitrarily close to 2 for analytic functions f. The first two properties are the same as for the Secant algorithm. The last property shows that the method presented here will converge faster than the Secant method if f is sufficiently regular.
Other generalizations of the Secant algorithm with the same three properties are the method of inverse interpolation [2] and Sidi's method [13]. These two methods are based on polynomial fits to either the inverse of f (in the case of the method of inverse interpolation) or to f itself (in the case of Sidi's method). Whereas the Secant method is based on straight-line fits to f, the polynomial fits of these methods can be of an arbitrary degree d. The resulting order of convergence is φ_{d−1} for both methods. Hence the order is φ_0 for the Secant method and φ_1 ≈ 1.8393 when a polynomial of degree 2 is used. By taking the degree of the fitting polynomial large enough, the order of convergence becomes asymptotically quadratic if the function is sufficiently regular.
Another method which satisfies the three properties is the method of direct interpolation [2]. However, this method requires that the root(s) of a polynomial of degree d be calculated in every iteration. This is not an attractive scheme, except possibly for the case d = 2, which is known as Muller's method [1].
Our method is not based on polynomial fits. It was noted in [14,15] that the results of two Secant steps can be combined into a better approximant of the root α in a way reminiscent of Aitken's delta-squared method [16] or Shanks' transformation [17]. We take this idea further. We show that the process of combining approximants can be repeated. If we call the result of a Secant step an approximant of order zero, we demonstrate that two approximants of order n can be combined into an approximant of order n + 1. We devise an algorithm which generates these approximants. The nth order version of the algorithm generates a sequence of nth order approximants. We show that this sequence converges with order φ_n to the root α if f is sufficiently regular.
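To give a flavour of this idea, the following sketch combines two Secant results by eliminating the leading error term, using the standard Secant error relation e_{k+1} ≈ C e_k e_{k−1}. It illustrates the spirit of the construction in [14,15]; it is not the paper's exact definition of the first-order approximant, and the test function is an assumption for illustration.

```python
def secant_step(f, x_prev, x_curr):
    """One Secant step: the root of the chord through the two points."""
    f_prev, f_curr = f(x_prev), f(x_curr)
    return x_curr - f_curr * (x_curr - x_prev) / (f_curr - f_prev)

def combine(a, b, c):
    """Combine two Secant results b, c with the earlier iterate a.

    Assuming e1 = C*e0*e_prev and e2 = C*e1*e0 for the errors, the
    ratios satisfy (c - r)/(b - r) = (b - r)/(a - r) at the root r;
    solving for r gives this Aitken-like expression."""
    return (b * b - a * c) / (2.0 * b - a - c)

f = lambda x: x * x - 2.0            # sample function, root at sqrt(2)
x_prev, x0 = 1.0, 2.0                # initial estimates
x1 = secant_step(f, x_prev, x0)      # first Secant result
x2 = secant_step(f, x0, x1)          # second Secant result
better = combine(x_prev, x1, x2)     # combined (accelerated) approximant
```

For these start values the combined approximant is roughly five times closer to √2 than the second Secant result, at no extra evaluation of f.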
Although our algorithm offers no specific advantage over the method of inverse interpolation or Sidi's method (all are derivative-free, require one evaluation of f per iteration, and achieve orders of convergence φ_n), we think it is noteworthy that the Secant method can be sped up to higher orders of convergence without the use of polynomial fits. We suspect that the same acceleration technique can be applied to a broader set of iterative algorithms. We also see a possibility that our technique can lead to a parallel root-solving algorithm. These avenues are, however, not explored in this paper.
The paper is organized as follows. We discuss preliminaries and recall the basic properties of the Secant sequence in Sections 2 and 3. We introduce the approximants in Section 4. The algorithm which generates these approximants is given in Section 5, and its convergence properties are derived in Section 6. We end with conclusions in Section 7.

2. Preliminaries
2.1. Order of Convergence. When a sequence {x_k} converges to a limit x̄ and the sequence has the property

lim_{k→∞} |x_{k+1} − x̄| / |x_k − x̄|^p = A

with A ≠ 0, then p is called the "order of convergence" or "order" of the sequence. The condition A ≠ 0 is necessary to define the order of convergence uniquely. A is called the "asymptotic error constant" [2,18,19]. The larger the order of convergence, the faster the sequence converges.

2.2. Divided Differences. The divided differences of a function f are defined recursively by

f[x_1, . . ., x_m] = (f[x_2, . . ., x_m] − f[x_1, . . ., x_{m−1}]) / (x_m − x_1),

with f[x] = f(x) to terminate the recursion. We use the following two properties of divided differences.
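The recursive definition can be sketched in a few lines. The example function x³ is an assumption for illustration; for it, any third divided difference equals f'''(ξ)/3! = 6/6 = 1 (the mean-value property of divided differences), which gives a simple check.

```python
def divided_difference(f, xs):
    """f[x_1, ..., x_m] via the recursive definition; f[x] = f(x)."""
    if len(xs) == 1:
        return f(xs[0])
    head = divided_difference(f, xs[:-1])   # f[x_1, ..., x_{m-1}]
    tail = divided_difference(f, xs[1:])    # f[x_2, ..., x_m]
    return (tail - head) / (xs[-1] - xs[0])

cube = lambda x: x ** 3
# third divided difference of x^3 is f'''(xi)/3! = 1 for any distinct nodes
dd = divided_difference(cube, [0.5, 1.0, 2.0, 3.0])
```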
(ii) If f ∈ C^{m−1}(I), then f[x_1, . . ., x_m] = f^{(m−1)}(ξ)/(m − 1)! for some ξ ∈ (min_i x_i, max_i x_i). This property is cited in many textbooks [2,5,18,20]. It follows, for example, from the previous property in combination with the mean value theorem.

3. The Secant Algorithm
Suppose we have an open interval I of real values and a function f : I → R. Suppose α ∈ I and f(α) = 0. A Secant step S is defined as

S(x, y) = x − f(x) (x − y) / (f(x) − f(y)).

The Secant algorithm generates a sequence {x_k}_{k=−1}^∞ which starts with two initial values x_{−1}, x_0 ∈ I and develops as x_{k+1} = S(x_k, x_{k−1}). We can develop the sequence as long as x_k ∈ I. It can be shown [2,5] that

x_{k+1} − α = (x_k − α)(x_{k−1} − α) f[x_{k−1}, x_k, α] / f[x_{k−1}, x_k].

It can also be shown [2,5] that if f ∈ C^2(I) and the first derivative f^{(1)}(α) and second derivative f^{(2)}(α) are nonzero, then

lim_{k→∞} |x_{k+1} − α| / |x_k − α|^{φ_0} = |f^{(2)}(α) / (2 f^{(1)}(α))|^{φ_0 − 1},

where φ_0 = (1 + √5)/2. This means that the sequence converges with order φ_0 under the conditions stated. It can be expected [10] that the sequence converges with a higher order if f^{(2)}(α) = 0. In the case f^{(1)}(α) = 0 the sequence still converges but no longer superlinearly [21].
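A minimal, self-contained sketch of the Secant iteration, together with an empirical estimate of the order of convergence (the test function x² − 2 and the start values are assumptions for illustration):

```python
import math

def secant(f, x_prev, x_curr, steps):
    """Run `steps` Secant iterations x_{k+1} = S(x_k, x_{k-1})."""
    xs = [x_prev, x_curr]
    for _ in range(steps):
        f_prev, f_curr = f(xs[-2]), f(xs[-1])
        xs.append(xs[-1] - f_curr * (xs[-1] - xs[-2]) / (f_curr - f_prev))
    return xs

f = lambda x: x * x - 2.0                 # sample function, root at sqrt(2)
xs = secant(f, 1.0, 2.0, 6)
errs = [abs(x - math.sqrt(2.0)) for x in xs]
# empirical order p from three consecutive errors:
# |e_{k+1}| ~ A |e_k|^p  =>  p ~ log(e_{k+1}/e_k) / log(e_k/e_{k-1})
order = math.log(errs[6] / errs[5]) / math.log(errs[5] / errs[4])
```

For this example the empirical order settles near φ_0 ≈ 1.618 before rounding errors take over.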

4. General Order Approximant
In this section we define what we call an approximant of general order of the root α of a function f. This definition is recursive. To study this approximant, we express the approximant directly in terms of f in Section 4.1. Two expressions are obtained: one involving f and polynomials in Section 4.1.1 and one involving divided differences in Section 4.1.2. These forms allow us to cast the approximant in a form which exposes its properties when we are close to α in Lemmas 3 and 4.
We define an nth order approximant A_n as follows.
The reason why we call this an approximant will become clear shortly.

4.2. The Approximant Near the Root. Suppose the function f in the definition of A_n has a root at α ∈ I: f(α) = 0. We study A_n in the case that all its arguments x_1, . . ., x_{n+2} are in the neighbourhood of α. We write x = α + y and g(y) = f(α + y). The function g is the function f in the coordinate frame y. The root is at y = 0 in this coordinate frame: g(0) = 0. We can express f(x) in the second divided difference of g as f(x) = y g[0, y].
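The identity above is easy to check numerically. The sketch below assumes an example function (sin, with root at α = π) and verifies both the identity f(α + y) = y·g[0, y] and the fact that g[0, y] tends to g^{(1)}(0) as y → 0:

```python
import math

f = math.sin                    # sample function with a root at alpha = pi
alpha = math.pi

def g(y):
    """f expressed in the shifted coordinate frame y = x - alpha."""
    return f(alpha + y)

y = 1e-4
dd = (g(y) - g(0.0)) / y        # the divided difference g[0, y]
value = y * dd                  # reproduces f(alpha + y) = g(y)
```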
Substituting x_i = α + y_i in (13), we obtain the approximant in the coordinate frame y. The corresponding approximant A_n in the original coordinate frame is obtained by adding α. If we can show that the function E_n is bounded for small values of the y_i, we see from (20) that A_n is a good approximation of α. Establishing the boundedness of E_n is therefore the major task of the remainder of this section.
Proof. We have already shown the form of A_n in (20). All that remains to be done is to show that E_n is bounded in an (n + 2)-dimensional hypercube around the point (y_1, . . ., y_{n+2}) = (0, . . ., 0). Denote this point by ⃗0. From the properties of divided differences we know that the numerator in (23) is bounded in a hypercube around ⃗0 if the function 1/(g[0, y]) is n-times continuously differentiable around y = 0 and the nth derivative is Lipschitz continuous.
This is also a sufficient condition for all divided differences that appear in (26) to be bounded. Therefore all terms in (26) can be made arbitrarily small by choosing the y_i small enough, except for the last term. The last term has the limiting value (−1)^{n+3}/g^{(1)}(0) ≠ 0 if all y_i become equal to zero; cf. (35). This shows that there is a lower bound b such that 0 < b < |D| on a hypercube around ⃗0. Combining this with the bound on the numerator proves that E_n is bounded.
It remains to be shown that 1/(g[0, y]) is n-times continuously differentiable with the nth order derivative Lipschitz continuous around y = 0.
First we show that 1/(g[0, y]) is n-times continuously differentiable on I_0. We use the expression for the derivatives of a divided difference [18], which is defined if g is m-times differentiable in the point y and y ≠ 0. If we allow y = 0 as well, then g must be (m + 1)-times differentiable in y. Hence we find that 1/(g[0, y]) is continuously differentiable if y ∈ I_0 (which includes y = 0) and g ∈ C^2(I_0), and m-times continuously differentiable if y ∈ I_0 and g ∈ C^{m+1}(I_0). Next we show that the mth derivative of g[0, y] is Lipschitz continuous on I_0 for m = 1, . . ., n + 1. To show this, take y_1, y_2 ∈ I_0 and consider the difference of the divided difference between the two points: by the mean value theorem this difference equals the derivative at some ξ ∈ (min(y_1, y_2), max(y_1, y_2)) times y_2 − y_1; see (33).

Proof. Taking the limit for the numerator of (23) and for the denominator D in (26), and dividing the result for the numerator by the result for the denominator, yields the proof; cf. (35).

5. The Algorithm
We construct an algorithm which generates a sequence of nth order approximants x_{k,n}. The algorithm starts with two initial approximants x_{−1} and x_0 of the root α. In the first iteration we simply carry out a Secant step: x_{1,0} = S(x_0, x_{−1}). The second iteration also starts with a Secant step, x_{2,0} = S(x_{1,0}, x_0), but next we combine the two Secant steps in a first-order approximant x_{2,1} using the iterative definition of the nth order approximants. The third iteration first carries out a Secant step x_{3,0}, then combines this Secant step with the previous step x_{2,0} in a first-order approximant x_{3,1}, and finally combines x_{2,1} and x_{3,1} in a second-order approximant x_{3,2}. We continue this way with the fourth and the following iterations, generating a scheme in which iteration k produces the entries x_{k,0}, x_{k,1}, . . . . Since we aim at generating a sequence of nth order approximants, we calculate at most n + 1 columns in an iteration. The first iteration in which we calculate all n + 1 columns is the (n + 1)th iteration.
If we parametrize x_{−1} as x_{−1,0} and x_0 as x_{0,0}, we have parametrized all values in our scheme as x_{k,j}, with k running over the values k = −1, 0, 1, . . . and j running over the values j = 0, . . ., j_max(k), with j_max(k) given by (41). For simplicity we will denote x_{k, j_max(k)} by x_{k, max}. This means that, for example, x_{−1, max} must be read as x_{−1, j_max(−1)}. The choice of n in (41) sets the order of the algorithm. Choosing n = 0 results in the Secant algorithm, choosing n = 1 results in the first-order accelerated Secant algorithm, n = 2 results in the second-order accelerated Secant algorithm, and so forth. An nth order accelerated Secant algorithm generates a sequence of nth order approximants.
The algorithm described above can be formulated as (42), with start values x_{−1,0} and x_{0,0}. Note that, starting from x_{k,0} = S(x_{k−1, max}, x_{k−2, max}), recursion readily yields (44) for j = 2, . . ., j_max(k). Each Secant step after the first Secant step x_{1,0} = S(x_0, x_{−1}) requires exactly one evaluation of f. Namely, the calculation of x_{k,0} requires the calculation of f(x_{k−1, max}), while f(x_{k−2, max}) has already been calculated when we evaluated x_{k−1,0}. The calculation of x_{k,j} for j > 0 does not require a calculation of f. Hence one iteration of the algorithm requires one evaluation of f, except for the first iteration, which requires the evaluation of f(x_{−1}) and f(x_0).
According to Brezinski and Redivo-Zaglia [16], the second line in (42) should preferably be calculated in a mathematically equivalent form that is less susceptible to round-off errors, for j = 1, . . ., j_max(k). Alternative forms in which the leading term is x_{k−1,j−1}, x_{k−1, max}, or x_{k−2, max} are also readily derived.
A pseudocode for the accelerated Secant algorithm is provided in Appendix B. Examples of sequences generated by this algorithm are given in the tables in Appendix C. Let f^{(n+1)} be Lipschitz continuous on I, let α ∈ I with f(α) = 0 and f^{(1)}(α) ≠ 0. Then there exists a δ > 0 such that the sequence {x_{k,n}}_{k=n+1}^∞ generated by the nth order accelerated Secant algorithm converges to α if the start values x_{−1} and x_0 are within a distance δ of α.

6. Convergence Properties
Proof. We develop our proof in the coordinate frame y. With y_{k,j} = x_{k,j} − α, (44) reads in the coordinate frame y as (46). We have to prove that the sequence {y_{k,n}}_{k=n+1}^∞ converges to zero if y_{−1} and y_0 are chosen close enough to zero.
Putting j = j_max in (46) we obtain a closed recursion for y_{k, max}. For k ≥ n + 1 we have j_max(k) = n and hence (49) holds. Starting at k = n + 1, (49) generates a sequence from a set S of start values. It must first be noted that all values in this set can be made arbitrarily small by choosing y_{−1} = y_{−1,0} and y_0 = y_{0,0} close enough to zero. We can see this for y_{1,0} because y_{1,0} = y_{−1} y_0 E_1(y_{−1}, y_0) and |E_1| is bounded in a 2-dimensional volume around (0, 0) according to Lemma 3. In the same way we can see this for y_{2,1} because y_{2,1} = y_{−1} y_0 y_{1,0} E_2(y_{−1}, y_0, y_{1,0}) and |E_2| is bounded in a 3-dimensional volume around (0, 0, 0). Continuing the argument we find that all values in the set S can be made arbitrarily small. Lemma 3 states that there is an ε > 0 such that

a faster converging algorithm in the case where two calculation cores are available. It remains to be investigated what the order of convergence is (assuming an order is defined) and whether or not this can be generalized to an arbitrary number of cores.
We have only studied the convergent behaviour of the subsequence {x_{k,n}}_{k=n+1}^∞ of the nth order algorithm. One may wonder about the subsequences {x_{k,j}}_{k=n+1}^∞ for j < n. A study of the first- and second-order accelerated versions of the algorithm [29] revealed that they converge with the same order as {x_{k,n}}_{k=n+1}^∞ but with a different asymptotic error term. This is likely the case for all orders of the algorithm.
A different approach to judging the efficiency of the algorithm is to estimate its average computational cost by statistical means [30]. In this approach one averages the cost over a set of functions with a suitable probability measure. Although interesting, such a study is outside the scope of the current article.

B. Pseudocode
The nth order accelerated Secant algorithm looks as follows in pseudocode. The algorithm calculates the approximant x_{k,n} of the root of a function f. The initial estimates are x_{−1} and x_0.
We define j_max(k) as in (41) (see Algorithm 1), and we define x_{k,j} as in (42) (see Algorithm 2).

C. Numerical Example
A numerical example for the accelerated Secant method is given in Tables 1 and 2. Both tables have been calculated with Algorithms 1 and 2.

6.1. Basic Convergence. The following lemma establishes sufficient conditions under which the algorithm generates a convergent sequence.

Lemma 5. Let I ⊂ R be an open interval of real values and f a function f : I → R with f ∈ C^{n+1}(I) and f^{(n+1)} Lipschitz continuous on I. Let α ∈ I, f(α) = 0, and f^{(1)}(α) ≠ 0. Then there exists a δ > 0 such that the sequence {x_{k,n}}_{k=n+1}^∞ generated by the nth order accelerated Secant algorithm converges to α if the start values x_{−1} and x_0 are within a distance δ of α.