SEMILOCAL ANALYSIS OF EQUATIONS WITH SMOOTH OPERATORS


theorem, Newton method. 1980 MATHEMATICS SUBJECT CLASSIFICATION CODES. 65H10, 65J15, 47H17.

1. INTRODUCTION.

In the terminology of Ortega-Rheinboldt [15], a semilocal analysis for a given equation Fx = 0 establishes the existence of a local solution x* by showing that a sequence of approximate solutions x_n converges to x*, and it also yields computable bounds for the errors ||x* - x_n||. The operator F is generally assumed to be Fréchet differentiable, and the basic idea is to take each x_n as the solution of an approximating linear operator equation. The central result of this kind is the Kantorovich theorem for Newton's method. Most other semilocal results are related to that famous theorem, since they involve linearization processes based on approximate derivatives. The theory can be used to establish the existence and uniqueness of solutions for specific equations without finding the solutions.
Unfortunately, its application to real computation is fraught with difficulties.
Our purpose here is to describe research which attempts to bring semilocal theory a bit closer to the computer. We will do this in two ways. First, we will state and prove a refined version of the Kantorovich theorem, which includes new error bounds. Secondly, we will give a brief and informal survey of related topics, with a view to bringing out the benefits and drawbacks of semilocal analyses.
Then i) F'(x) is invertible for every x ∈ S(x0, t*).
ii) The iterates x_{n+1} = x_n - F'(x_n)^{-1} F x_n remain in S(x0, t*) and converge to a root x* of F.
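As a minimal concrete sketch of the iteration in (ii), consider the scalar case; the example equation F(x) = x^2 - 2 = 0, the starting point, and the stopping tolerance below are illustrative assumptions, not taken from the text.

```python
def newton(F, Fprime, x0, tol=1e-12, max_iter=50):
    # Newton iteration x_{n+1} = x_n - F'(x_n)^{-1} F(x_n); each step solves
    # the linearized equation F'(x_n) s = -F(x_n) for the correction s.
    x = x0
    for _ in range(max_iter):
        step = F(x) / Fprime(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# Illustrative equation F(x) = x^2 - 2 with root sqrt(2) = 1.41421356...
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.5)
```

The quadratic convergence guaranteed by the theorem is visible numerically: the step lengths roughly square at each iteration.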
We discuss features of the above version of the Kantorovich theorem: 1) The theorem is affine invariant and the transformation (2.1) is an optimal scaling [4]. The parameter h may be considered a measure of the nonlinearity of F.
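As a small illustration of h as a nonlinearity measure, it can be computed for a scalar example. The formulas below (h = K*eta with eta = |F'(x0)^{-1} F x0| and K a Lipschitz constant for the scaled derivative F'(x0)^{-1} F') are the standard affine-invariant Kantorovich quantities; they are assumptions here, since the transformation (2.1) itself is not reproduced in this excerpt.

```python
# Illustrative computation of the Kantorovich quantity h for the scalar
# equation F(x) = x^2 - 2 at x0 = 1.5. The definitions of eta and K below
# are the standard affine-invariant choices, assumed for this sketch.

x0 = 1.5
F = lambda x: x * x - 2.0
Fp = lambda x: 2.0 * x            # F'(x); F'' is the constant 2
eta = abs(F(x0) / Fp(x0))         # length of the first Newton step: 0.25/3
K = 2.0 / Fp(x0)                  # Lipschitz constant of F'(x0)^{-1} F'
h = K * eta                       # = 1/18, well below the critical value 1/2
```

A small h (here 1/18) signals that F is nearly linear around x0, and the convergence condition h <= 1/2 holds with a wide margin.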
2) The estimates given in (iv) appear to be new. They show that the majorizing sequence yields not only second r-order convergence, but the stronger second q-order as well. Indeed, if h < 1 then (2.4) shows that lim a_{n+1}/a_n^2 < ∞.
3) The upper bounds in (v) were derived by Miel [11]. The bound with ||x_n - x_{n-1}|| is usually considerably better than the one with ||x_1 - x_0||.
Both bounds are monotone decreasing functions of λ. The well known upper bounds of Gragg-Tapia [5] correspond to the case λ = 1. The bound with ||x_n - x_{n-1}||^2 is sharper. We prove below that it is also sharper than the upper bound of Potra-Pták [17].

4) Statement (vi) gives an improvement of the lower bound of Gragg-Tapia [5], since the latter can be shown to be equivalent to the left-most expression.
5) The bounds are expressed in terms of the majorizing sequence, but since ... (0, ) ...

The following lemma, whose proof can be found in [17], is a special case of the Induction Theorem of Pták [20].
LEMMA. Assume that there exist 1) a map ω: T → T such that the series σ(r) = r + ω(r) + ω(ω(r)) + ... + ω^n(r) + ... converges for every r ∈ T, 2) a family of sets Z(r) ⊂ X, r ∈ T, such that x_0 ∈ Z(r_0) for some r_0 ∈ T, and F(x) ∈ S(x, r) ∩ Z(ω(r)) whenever r ∈ T and x ∈ Z(r).
With the notation here, the Kantorovich theorem can be established by taking ... Pták [21] thus derived the a priori upper bounds of Gragg-Tapia with the use of (2.7), and recently, Potra-Pták [17] ... For the inequality, use that (s,t) is monotone decreasing in t.
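A small numeric illustration of the series condition in the lemma: for the simple rate of convergence ω(r) = r^2 on T = (0, 1) (an illustrative choice, not the rate Pták uses for Newton's method), the partial sums of σ(r) = r + ω(r) + ω(ω(r)) + ... converge rapidly.

```python
def sigma(omega, r, n_terms=60):
    # Partial sum of the series r + omega(r) + omega(omega(r)) + ...
    total = 0.0
    for _ in range(n_terms):
        total += r
        r = omega(r)
    return total

# With omega(r) = r^2 the n-th term is r^(2^n), which decays doubly
# exponentially, so 60 terms are far more than enough.
s = sigma(lambda r: r * r, 0.5)   # 0.5 + 0.25 + 0.0625 + ... ~= 0.81642
```

The doubly exponential decay of the terms r^(2^n) is exactly the mechanism by which nondiscrete induction captures quadratic convergence.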

3. INFORMAL DISCUSSION.
In this section, we give a short list of references and we consider benefits and drawbacks of semilocal analyses.

3.1 SOME REFERENCES.
A history of the Kantorovich theorem and some of its relatives is given in Kusakin; see [8,Bibliography] for the references.
A refinement of the majorant technique was used by Ortega [14] to present an elegant proof of the Kantorovich theorem, and more generally, by Rheinboldt [25], to establish a general semilocal theory for iterations of the form

x_{n+1} = x_n - D(x_n)^{-1} F x_n,   (3.1)

where D(x) is an approximate derivative of F satisfying certain conditions. The corresponding majorizing sequence is generated by

t_0 = 0,   t_{n+1} = t_n - d(t_n)^{-1} f(t_n),   (3.2)

where d and f are respectively a linear and a quadratic polynomial. These scalar functions satisfy the convergence conditions for a subclass of methods, including Newton's method when D(x) = F'(x), and (3.2) then becomes a special case of (3.1) with F = f, D = d, x_0 = t_0. Miel [10,11] showed that under the hypotheses of Rheinboldt's theorem, the majorizing sequence yields

||x* - x_n|| ≤ [(t* - t_n)/(t_n - t_{n-1})^λ] ||x_n - x_{n-1}||^λ ≤ [(t* - t_n)/t_1^λ] ||x_1 - x_0||^λ,   0 < λ ≤ 1.   (3.3)

These error bounds are clearly optimal for the proper subclass of methods. For the Newton method, as shown in the previous section, the stronger statement with 0 < λ ≤ 2 is valid.
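To make (3.2) concrete, the sketch below generates the majorizing sequence for the Newton case, taking f(t) = (K/2)t^2 - t + eta and d(t) = f'(t) = Kt - 1. These are the standard Kantorovich choices and are assumptions here, since the paper's own definitions of d and f are not reproduced in this excerpt; the numerical values of K and eta are illustrative.

```python
import math

def majorizing_sequence(K, eta, n_steps=10):
    # t_0 = 0, t_{n+1} = t_n - d(t_n)^{-1} f(t_n), the Newton case of (3.2),
    # with f(t) = (K/2) t^2 - t + eta quadratic and d(t) = f'(t) = K t - 1 linear.
    f = lambda t: 0.5 * K * t * t - t + eta
    d = lambda t: K * t - 1.0
    ts = [0.0]
    for _ in range(n_steps):
        t = ts[-1]
        ts.append(t - f(t) / d(t))
    return ts

K, eta = 1.0, 0.2                 # illustrative constants; h = K * eta = 0.2 <= 1/2
ts = majorizing_sequence(K, eta)
# Smallest root of f, the limit t* of the sequence:
t_star = (1.0 - math.sqrt(1.0 - 2.0 * K * eta)) / K
# ts increases monotonically toward t_star, and t* - t_n majorizes ||x* - x_n||.
```

Since the scalar iteration is just Newton's method applied to the quadratic f, the differences t* - t_n themselves shrink quadratically, which is what makes the bounds (3.3) tight.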
Rheinboldt's hypotheses on D(x) in the semilocal analysis of (3.1) turn out to be restrictive; Dennis [1] used a majorizing sequence to extend the result to methods which have approximate derivatives of bounded deterioration, and thus include certain generalized secant algorithms.
Consideration of these algorithms led to the research on so-called quasi-Newton methods, surveyed in [2,3]. Potra-Pták [17,18,19] used nondiscrete induction to obtain convergence and error bounds for the Newton, multistep Newton, and generalized regula falsi methods. We proved that their upper bound for Newton's method is related to, but not as sharp as, the finer bound in (3.3) with λ = 2.

3.2 ADVANTAGES.
Benefits gained from the Kantorovich theorem and related semilocal theorems are summarized below: 1) One can establish domains of existence and uniqueness for a solution of a nonlinear operator equation, with no actual knowledge of the solution.
2) A constructive method for approximating such a solution is provided, consisting of a convergent sequence of solutions of linearized operator equations.
3) A domain of attraction S is established with the property that if the iterates reach S then they will stay in S and converge to a solution.
4) Error bounds are available provided that one can evaluate the constants involved in the hypotheses of the theorem. 5) Newton's method is self-corrective: x_{n+1} depends only on F and x_n, so that errors from previous iterates do not propagate.
Property (5) is an advantage of Newton's method which is not shared by quasi-Newton methods. We include (4) as an advantage despite the warnings in [1,2] against the use of majorizing sequences for getting error bounds. The reasons cited were the apparent r-order of convergence, the coarseness of the bounds, and the difficulty in calculating the required constants. Because of results which partially overcome these objections, the views against majorizing sequences should perhaps be re-evaluated. It was shown in the last section that the majorizing sequence for Newton's method does imply q-quadratic convergence and that the bound with ||x_n - x_{n-1}||^2 is sharper than the usual ones with ||x_n - x_{n-1}|| and ||x_1 - x_0||. The problems associated with the local nature of the estimates and the verification of hypotheses, however, do remain. In this connection, we point to research on computer verification of semilocal conditions by interval analysis [12,13,23].

3.3 DISADVANTAGES.
From a practical aspect, statement (2) above is no panacea for the numerical analyst: each linear operator equation in the constructive process must still be reduced to a computable form. This brings us to the first drawback in our list.
1) The theory does not provide a means of discretizing an operator equation into a corresponding finite system of equations.
2) Stringent hypotheses require that the iterates be in the vicinity of a root before a theorem will guarantee convergence and provide error bounds.
3) The computation of the constants in these hypotheses, especially the Lipschitz constants, is difficult. 4) Newton's method requires a new Fréchet derivative F'(x_n) at each step n.
5) A system of linear equations must be solved at each step. For Newton's method in dimension N, this requires O(N^3) arithmetic operations, a considerable cost.
With respect to (4), it should be noted that derivatives have been compiled by suitable software as easily as functions [7,24]. The research on quasi-Newton methods is motivated by (4) and (5). These methods use ingenious approximate Jacobians to avoid the evaluation of F'(x) and to reduce from O(N^3) to O(N^2) the cost of solving the linear systems. The price paid is a reduction from second order to superlinear convergence. Local analysis of quasi-Newton algorithms has made two fundamental contributions to the theory of iterative methods: the notion of bounded deterioration of approximate derivatives and a characterization of q-superlinear convergence. The algorithms have been studied extensively for optimization problems [2,3].
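In one dimension the quasi-Newton idea reduces to the classical secant method, which replaces F'(x_n) by the difference quotient (F x_n - F x_{n-1})/(x_n - x_{n-1}). The sketch below (the test equation F(x) = cos x - x and the tolerances are illustrative assumptions) uses no derivative evaluations, at the price of superlinear rather than quadratic convergence.

```python
import math

def secant(F, x0, x1, tol=1e-12, max_iter=60):
    f0, f1 = F(x0), F(x1)
    for _ in range(max_iter):
        denom = f1 - f0
        if denom == 0.0:                  # flat difference quotient: stop
            break
        # The difference quotient replaces the derivative F'(x_n):
        step = f1 * (x1 - x0) / denom
        x0, f0 = x1, f1
        x1 = x1 - step
        f1 = F(x1)
        if abs(step) < tol:
            break
    return x1

# Illustrative equation F(x) = cos(x) - x, root 0.7390851332...
root = secant(lambda x: math.cos(x) - x, 0.5, 1.0)
```

Each step costs only one new evaluation of F and no derivative, which is the one-dimensional analogue of replacing the O(N^3) Jacobian factorization by an O(N^2) update.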
ACKNOWLEDGEMENT. This paper was presented at the V-th International Conference on Operator Theory held June 1980 in Timisoara, Romania.