On a Newton-Type Method for Differential-Algebraic Equations

This paper deals with the approximation of systems of di ﬀ erential-algebraic equations based on a certain error functional naturally associated with the system. In seeking to minimize the error, by using standard descent schemes, the procedure can never get stuck in local minima but will always and steadily decrease the error until getting to the solution sought. Starting with an initial approximation to the solution, we improve it by adding the solution of some associated linear problems, in such a way that the error is signiﬁcantly decreased. Some numerical examples are presented to illustrate the main theoretical conclusions. We should mention that we have already explored, in some previous papers (cid:3) Amat et al., in press, Amat and Pedregal, 2009, and Pedregal, 2010 (cid:4) , this point of view for regular problems. However, the main hypotheses in these papers ask for some requirements that essentially rule out the application to singular problems. We are also preparing a much more ambitious perspective for the theoretical analysis of nonlinear DAEs based on this same approach.


Introduction
Differential-algebraic equations are becoming increasingly important in a lot of technical areas. They are currently the standard modeling concept in many applications such as circuit simulation, multibody dynamics, and chemical process engineering; see for instance [1][2][3][4][5] with no attempt to be exhaustive.
A basic concept in the analysis of differential-algebraic equations is the index. The notion of index is used in the theory of DAEs for measuring the distance from a DAE to its related ODE. The higher the index of a DAE is, the more difficulties are for its numerical solution. There are different index definitions, but for simple problems they are identical. On more complicated nonlinear and fully implicit systems they can be different see 5 and the references therein. where M is a given, eventually singular, matrix depending on t. More general situations can be allowed. This type of equations arises, for instance, in the functional analytic formulation of the initial value problem for the Stokes as well as for the linearized Navier-Stokes or Oseen equations 6 . For the approximation of these equations collocation-type methods are usually used. These methods are implicit, and we need to solve a nonlinear system of equations in each iteration using a Newton's type method. Given different coefficients c i , 1 ≤ i ≤ s, there is a unique for h sufficiently small polynomial of collocation q t of degree less than or equal to s such that The collocation methods are defined by an approximation y t q t and are equivalent to implicit RK methods of s stages The coefficients c i play the role of the nodes of the quadrature formula, and the associated coefficients b i are analogous to the weights. From 1.4 , we can find implicit RK methods called Gauss of order 2s, Radau IA and Radau IIA of order 2s − 1, and Lobatto IIIA of order 2s − 2. Also we can consider perturbed collocation methods like Lobato IIIC. See 4 for more details .
A number of convergence results have been derived for these methods introducing the so-called B-convergence theory. In 7, 8 the authors extend the B-convergence theory to be valid for a class of nonautonomous weakly nonlinear stiff systems, in particular, including the linear case. As pointed out by the same authors, it is not clear if it is possible to cover, in a satisfactory way, highly nonlinear stiff problems, that is, problems where also the nonlinear terms are affected by large parameters. Moreover, any result should assume that, in each step, the associated nonlinear system is well approximated 5 . In particular, we should be able to start with a good initial guess for the iterative scheme. This might be very restrictive for many stiff problems.
On the other hand, iterative methods are the typical tool to solve nonlinear systems of equations. In these schemes we compute a sequence of approximations solving associated linear problems. In this paper, we would like to introduce a new variational approach for the treatment of DAEs where we linearize the original equations obtaining an iterative scheme. Our ideas are based on the analysis of a certain error functional of the form to be minimized among the absolutely continuous paths x : 0, T → R N with x 0 x 0 . Note that if E x is finite for one such path x, then automatically Mx is square integrable. This error functional is associated, in a natural way, with the Cauchy problem 1.1 . Indeed, the existence of solutions for 1.1 is equivalent to the existence of minimizers for E with vanishing minimum value. This is elementary.
In this initial contribution, we want to concentrate on the approximation issue through this perspective. We will place ourselves under the appropriate hypotheses so that there are indeed solutions for 1.1 , that is, there are minimizers for the error with vanishing minimum value. In addition, we would like to guarantee that the main ingredients for the iterative approximating scheme to work are valid. More explicitly, our approach for the numerical approximation of such problems relies on three main analytical hypotheses that we take for granted here. always has a unique solution, and moreover, for some constant L > 0 depending on M, f, x and its derivatives, The only solution of the problem is z ≡ 0, for every feasible, absolutely continuous, path x t with x 0 x 0 .
Here the superscript indicates transpose.

Journal of Applied Mathematics
These requirements depend on the index of the equation and on some regularity on the pair M, ∇f x t . They should be more restrictive for equations with high index. More details can be found, for example, in 9, Theorem 3.9 , where the authors consider DAEs transferable into standard canonical form. More precise information is outside of the scope of this paper. In any case, the equations verifying our hypotheses are, in general, a subclass of all analytically solvable systems.
In addition to the basic facts just stated on existence and uniqueness of solutions for our problems, the analysis of the approximation scheme, based on a minimization of the error functional E, requires one main basic assumption on the nonlinearity f : R N → R N . It must be smooth, so that ∇f : R N → R N×N is continuous and globally Lipschitz with constant K > 0 |∇f| ≤ K . Moreover, the main result of this paper demands a further special property on the map f: for every positive C > 0 and small > 0, there is D C, > 0 so that This regularity is somehow not surprising as our approach here is based on regularity and optimality. On the other hand, that regularity holds for most of the important problems in applications. It certainly does in all numerical tests performed in this work. Our goal here is placed on the fact that this optimization strategy may be utilized to set up approximation schemes based on the minimization of the error functional. Indeed, we provide a solid basis for this approximation procedure. One very important and appealing property of our approach states that typical minimization schemes like steepest descent methods will work fine as they can never get stuck in local minima and converge steadily to the solution of the problem, no matter what the initialization is. We should mention that we have already explored, in some previous papers, this point of view. Since the initial contribution 10 , we have also treated the reverse mechanism of using first discretization and then optimality 11 . The perspective of going through optimality and then discretization has already been indicated and studied in 12 , though only for the steepest descent method, and without going through any further analytical foundation for the numerical procedure. However, the main hypotheses in these papers ask for some requirements that essentially rule out the application to singular problems. We will however address shortly 13 a complete treatment of DAEs with no a priori assumptions on existence and uniqueness. Rather, we will be interested in showing existence and uniqueness from scratch by examining the fundamental properties of the error functional E.
On the other hand, variational methods have been used also before in the context of ODEs. See 14, 15 , where numerical integration algorithms for finite-dimensional mechanical systems that are based on discrete variational principles are proposed and studied. This is one approach to deriving and studying symplectic integrators. The starting point is Hamilton's principle and its direct discretization. In those references, some fundamental numerical methods are presented from that variational viewpoint where the model plays a prominent role.
The rest of the paper is divided in three sections. In Section 2 we introduce our variational approach and develop a convergence analysis. Section 3 introduces the numerical procedure. We analyze some numerical results in Section 4. Finally, we present the main conclusions in Section 5.
Journal of Applied Mathematics 5

A Main Descent Procedure
We start with a fundamental fact for our approach. Proof. The proof is elementary. Based on the smoothness and bounds assumed on the mapping f, we conclude that if x ≡ x is a critical point for the error E, then x ought to be a solution of the problem The vector-valued map y t Mx t − f x t is then a solution of the linear, nondegenerate problem The only solution of this problem, by our initial conditions on uniqueness of linearizations, is y ≡ 0, and so x is the solution of our Cauchy problem.
On the other hand, suppose we start with an initial crude approximation x 0 to the solution of our basic problem 1.1 . We could take x 0 x 0 for all t or x 0 t x 0 tf x 0 . We would like to improve this approximation in such a way that the error is significantly decreased. We have already pointed out that descent methods can never get stuck on anything but the solution of the problem, under global lipschitzianity hypotheses.
It is straightforward to find the Gâteaux derivative of E at a given feasible x in the direction y with y 0 0. Namely This expression suggests a main possibility to select y from. Choose y such that In this way, it is clear that E x y −2E x , and so the local decrease of the error is of the size E x . Finding y requires solving the above linear problem which is assumed to have a unique solution by our main hypotheses in the introduction. In some sense, this is like a Newton method with global convergence.

Journal of Applied Mathematics
Suppose x 0 is a feasible path in the interval 0, T so that x 0 0 x 0 , M x 0 is square integrable, |x 0 t | ≤ C for a fixed constant C, all t ∈ 0, T , and the quantity measures how far such x 0 is from being a solution of our problem.

Theorem 2.2.
For T sufficiently small, the iterative procedure x j x j−1 y j , starting from the above feasible x 0 and defining y j as the solution of the linear problem converges strongly in L ∞ 0, T to the solution of 1.1 .
Proof. Choose > 0 and 0 < α < 1 so that and pretend to update x 0 to x 0 y 0 in such a way that the error for x 0 y 0 be less than the error for the current iteration x 0 . Note that where we have used the differential equation satisfied by y 0 and the definition of E x . By our assumption on f above, provided that |y 0 t | ≤ . Since we know that y 0 is the solution of a certain linear problem, by the upper bound assumed in Section 1 on the size of these solutions, Journal of Applied Mathematics 7 Assume that we select T > 0 so small that and then |y 0 t | ≤ for all t ∈ 0, T . By 2.9 , 2.10 , and 2.11 , we can write If, in addition, we demand, by making T smaller if necessary, 2.14 then E x 0 y 0 ≤ αE x 0 . Moreover, for all t ∈ 0, T , All these calculations form the basis of a typical induction argument, verifying

2.16
It is therefore clear that the sum Since the various ingredients of the problem do not depend on T , we can proceed to have a global approximation in a big interval by successively performing this analysis in intervals of appropriate small size. For instance, we can always divide a global interval 0, T into a certain number n of subintervals of small length h T nh with according to 2.14 .

Numerical Procedure
Since our optimization approach is really constructive, iterative numerical procedures are easily implementable.
1 Start with an initial approximation x 0 t compatible with the initial conditions e.g. x 0 t x 0 tf x 0 .
2 Assume we know the approximation x j t in 0, T .
3 Compute its derivative M x j t . by making use of a numerical scheme for DAEs with dense output like collocation methods .
5 Change x j to x j 1 by using the update formula In particular, one can implement, in a very easy way, this numerical procedure using a problem-solving environment like MATLAB 16 .

Some Experiments
In this section, we approximate some problems well known in the literature for a different index 5, 17, 18 . High-order accuracy and stability are major areas of interest in this type of simulations. We do not perform an analysis of the convergence conditions imposed in the above section. We are only interested to test numerically our approach.
In our approach we only need to approximate, with at least order one, the associated linear system for y j , in order to obtain the convergence of our scheme see Theorem 2.2 . The stability can be ensured by the fact that we approximate a linear problem using specific implicit methods 4 . This is not the case with a general nonlinear problem 19 , where we need to approximate well with a Newton-type iterative method the nonlinear system related to the implicitness of the scheme see the above section . This approximation should be a difficult task due to the local nonglobal convergence of any iterative scheme for nonlinear problems.
In this section, we consider the convergent Lobatto IIIC method 18 valid for indexes 1-3, in order to approximate the associated linear problem for y j in each iteration. This method can be considered as a perturbation collocation method. The final error depends only on the stopping criterium. In the following examples, we stop the algorithm when

4.2
The solution of this problem is sin x π/4 , cos x π/4 .

4.4
The solution of this problem is e t , e −2t , e 2t .
iii Index 3 18 y 1 t 2y 1 t y 2 t z 1 t z 2 t , y 2 t −y 1 t y 2 t z 2 t 2 , z 1 t y 1 t y 2 t z 1 t z 2 t u t , z 2 t −y 1 t y 2 t 2 z 2 t 2 u t , y 1 t y 2 t 2 1, y 1 0 y 2 0 1,

4.5
The solution of this problem is e 2t , e −t , e 2t , e −t , e t .
In Figures 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10, we compare the solution of the corresponding three problems with the approximations given by our approach. The results are very satisfactory in all cases, obtaining always the convergence to the true solution. In a first look, the exact and computed solutions are indistinguishable, since after convergence the error is smaller than the tolerance 10 −6 used in the stopping criterium. A more systematic and careful analysis of the numerical possibilities of the method will be pursued in the future.

Conclusions
A new variational approach to the analysis and numerical implementation of regular ODEs has been recently introduced in 10, 20 . Because of its flexibility and simplicity, it can easily be extended to treat other types of ODEs like differential-algebraic equations DAEs . This has been precisely the main motivation for this paper: to explore how well those ideas can be adapted to this framework. In particular, extending to this context some of the analytical results and performing various numerical tests that confirm that indeed the variational perspective is worth pursuing. One remarkable feature is that this point of view only requires to count on good numerical schemes for linear problems, and this is the reason why it fits so well in other scenarios. Because of the many good qualities of this viewpoint, it can be considered and implemented in essentially all fields where differential equations are relevant. There is, then, a long way to go.