This paper deals with the approximation of systems of differential-algebraic equations based on a certain error functional naturally associated with the system. In seeking to minimize the error by standard descent schemes, the procedure can never get stuck in local minima; it steadily decreases the error until it reaches the sought solution. Starting from an initial approximation to the solution, we improve it by adding the solutions of some associated linear problems, in such a way that the error is significantly decreased. Some numerical examples are presented to illustrate the main theoretical conclusions. We should mention that we have already explored this point of view for regular problems in some previous papers (Amat et al., in press, Amat and Pedregal, 2009, and Pedregal, 2010). However, the main hypotheses in those papers impose requirements that essentially rule out the application to singular problems. We are also preparing a much more ambitious perspective for the theoretical analysis of nonlinear DAEs based on this same approach.

1. Introduction

Differential-algebraic equations are becoming increasingly important in many technical areas. They are currently the standard modeling concept in many applications, such as circuit simulation, multibody dynamics, and chemical process engineering; see, for instance,  with no attempt to be exhaustive.

A basic concept in the analysis of differential-algebraic equations is the index. The notion of index is used in the theory of DAEs to measure the distance from a DAE to its related ODE. The higher the index of a DAE, the more difficult its numerical solution. There are different index definitions; for simple problems they coincide, but for more complicated nonlinear and fully implicit systems they can differ (see  and the references therein).

For simplicity, we focus our attention on problems of the form
(1.1) $Mx'(t) = f(x(t))$ in $(0,T)$, $x(0) = x_0$,
where $M$ is a given, possibly singular, matrix, possibly depending on $t$. More general situations can be allowed. This type of equation arises, for instance, in the functional analytic formulation of the initial value problem for the Stokes as well as for the linearized Navier-Stokes or Oseen equations .

For the approximation of these equations, collocation-type methods are usually used. These methods are implicit, so a nonlinear system of equations must be solved in each step using a Newton-type method. Given distinct coefficients $c_i$, $1 \le i \le s$, there is a (unique, for $h$ sufficiently small) collocation polynomial $q(t)$ of degree less than or equal to $s$ such that
(1.2) $q(t_0) = y_0$, $q'(t_0 + c_i h) = f(t_0 + c_i h, q(t_0 + c_i h))$, $1 \le i \le s$.
The collocation methods are defined by the approximation $y(t) \approx q(t)$ and are equivalent to implicit RK methods with $s$ stages,
(1.3) $k_i = f\big(t_0 + c_i h,\, y_0 + h\sum_{j=1}^{s} a_{i,j} k_j\big)$, $\quad y_1 = y_0 + h\sum_{i=1}^{s} b_i k_i$,
with coefficients
(1.4) $a_{i,j} = \int_0^{c_i} \prod_{l \ne j} \frac{u - c_l}{c_j - c_l}\, du$, $\quad b_i = \int_0^{1} \prod_{l \ne i} \frac{u - c_l}{c_i - c_l}\, du$.
The coefficients $c_i$ play the role of the nodes of a quadrature formula, and the associated coefficients $b_i$ are analogous to the weights. From (1.4), we obtain the implicit RK methods known as Gauss, of order $2s$, Radau IA and Radau IIA, of order $2s-1$, and Lobatto IIIA, of order $2s-2$. We can also consider perturbed collocation methods such as Lobatto IIIC (see  for more details).
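To make (1.4) concrete, the coefficients can be generated by integrating the Lagrange basis polynomials on the nodes $c_i$ exactly. The following plain-Python sketch (our own illustration; the function names are ours, not from the paper) builds the Butcher tableau of the collocation method from its nodes.

```python
def lagrange_poly(c, j):
    """Coefficients (ascending powers of u) of the Lagrange basis polynomial
    l_j(u) = prod_{l != j} (u - c_l) / (c_j - c_l) on the nodes c."""
    coeffs = [1.0]
    for l, cl in enumerate(c):
        if l == j:
            continue
        denom = c[j] - cl
        new = [0.0] * (len(coeffs) + 1)
        for k, a in enumerate(coeffs):  # multiply current polynomial by (u - cl)/denom
            new[k + 1] += a / denom
            new[k] -= cl * a / denom
        coeffs = new
    return coeffs

def integral(coeffs, upper):
    """Exact integral of the polynomial over [0, upper], term by term."""
    return sum(a * upper ** (k + 1) / (k + 1) for k, a in enumerate(coeffs))

def rk_from_collocation(c):
    """Butcher tableau (A, b) of the collocation method with nodes c, as in (1.4)."""
    s = len(c)
    A = [[integral(lagrange_poly(c, j), c[i]) for j in range(s)] for i in range(s)]
    b = [integral(lagrange_poly(c, j), 1.0) for j in range(s)]
    return A, b
```

For example, `rk_from_collocation([1/3, 1])` reproduces the two-stage Radau IIA tableau $A = \begin{pmatrix} 5/12 & -1/12 \\ 3/4 & 1/4 \end{pmatrix}$, $b = (3/4, 1/4)$, while `rk_from_collocation([0.5])` gives the implicit midpoint rule.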

A number of convergence results have been derived for these methods through the so-called B-convergence theory. In [7, 8] the authors extend the B-convergence theory to a class of nonautonomous weakly nonlinear stiff systems, including in particular the linear case. As the same authors point out, it is not clear whether highly nonlinear stiff problems, that is, problems where the nonlinear terms are also affected by large parameters, can be covered in a satisfactory way. Moreover, any such result must assume that, at each step, the associated nonlinear system is well approximated . In particular, we should be able to start with a good initial guess for the iterative scheme. This might be very restrictive for many stiff problems.

On the other hand, iterative methods are the typical tool to solve nonlinear systems of equations. In these schemes, we compute a sequence of approximations by solving associated linear problems. In this paper, we introduce a new variational approach for the treatment of DAEs in which we linearize the original equations, obtaining an iterative scheme. Our ideas are based on the analysis of a certain error functional of the form
(1.5) $E(x) = \frac{1}{2}\int_0^T |Mx'(t) - f(x(t))|^2\, dt$,
to be minimized among the absolutely continuous paths $x : (0,T) \to \mathbb{R}^N$ with $x(0) = x_0$. Note that if $E(x)$ is finite for one such path $x$, then automatically $Mx'$ is square integrable. This error functional is associated, in a natural way, with the Cauchy problem (1.1). Indeed, the existence of solutions for (1.1) is equivalent to the existence of minimizers for $E$ with vanishing minimum value. This is elementary.
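Although the analysis is carried out in the continuous setting, the functional (1.5) is straightforward to evaluate numerically for a trial path. The sketch below (our own illustration, not part of the original method) discretizes (1.5) with a midpoint rule and a difference quotient for $x'$, using the index-1 example of Section 4.

```python
import math

def error_functional(path, M, f, T, n=2000):
    """E(x) ~ (1/2) int_0^T |M x'(t) - f(x(t))|^2 dt, midpoint rule on a
    uniform grid, with x' replaced by a difference quotient. M may be
    singular, as in (1.1)."""
    h = T / n
    total = 0.0
    for i in range(n):
        x0, x1 = path(i * h), path((i + 1) * h)
        xm = [(a + b) / 2 for a, b in zip(x0, x1)]   # midpoint value
        dx = [(b - a) / h for a, b in zip(x0, x1)]   # difference quotient
        Mdx = [sum(M[r][c] * dx[c] for c in range(len(dx)))
               for r in range(len(M))]
        res = [p - q for p, q in zip(Mdx, f(xm))]    # residual M x' - f(x)
        total += sum(r * r for r in res) * h
    return 0.5 * total

# Index-1 example from Section 4: y' = z, y^2 + z^2 = 1, i.e. M = diag(1, 0)
M = [[1.0, 0.0], [0.0, 0.0]]
f = lambda x: [x[1], x[0] ** 2 + x[1] ** 2 - 1.0]
exact = lambda t: [math.sin(t + math.pi / 4), math.cos(t + math.pi / 4)]
```

On $(0, 2\pi)$, the exact solution gives an error of essentially zero, while the constant path $x(t) \equiv x_0$ gives $E = \pi/2$, quantifying how far it is from solving the system.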

In this initial contribution, we want to concentrate on the approximation issue through this perspective. We will place ourselves under the appropriate hypotheses so that there are indeed solutions for (1.1), that is, there are minimizers for the error with vanishing minimum value. In addition, we would like to guarantee that the main ingredients for the iterative approximating scheme to work are valid. More explicitly, our approach for the numerical approximation of such problems relies on three main analytical hypotheses that we take for granted here.

The Cauchy problem (1.1) admits a unique solution for every feasible initial condition x0 (the definition of feasible path should depend on the index of the equation).

The linearization around any feasible, absolutely continuous path $x(t)$ with $x(0) = x_0$,
(1.6) $My'(t) - \nabla f(x(t))\, y(t) = f(x(t)) - Mx'(t)$ in $(0,T)$, $y(0) = 0$,
always has a unique solution, and moreover, for some constant $L > 0$ depending on $M$, $f$, $x$ and its derivatives,
(1.7) $\|y\|_{L^\infty(0,T)}^2 \le T L\, \|f(x(t)) - Mx'(t)\|_{L^2(0,T)}^2$.

The only solution of the problem
(1.8) $\frac{d}{dt}\big(M^\top z(t)\big) + \nabla f(x(t))^\top z(t) = 0$ in $(0,T)$, $M^\top z(T) = 0$,
is $z \equiv 0$, for every feasible, absolutely continuous path $x(t)$ with $x(0) = x_0$.

Here the superscript $\top$ indicates transpose.

These requirements depend on the index of the equation and on some regularity on the pair (M,f(x(t))). They should be more restrictive for equations with high index. More details can be found, for example, in [9, Theorem 3.9], where the authors consider DAEs transferable into standard canonical form. More precise information is outside of the scope of this paper. In any case, the equations verifying our hypotheses are, in general, a subclass of all analytically solvable systems.

In addition to the basic facts just stated on existence and uniqueness of solutions for our problems, the analysis of the approximation scheme, based on the minimization of the error functional $E$, requires one main basic assumption on the nonlinearity $f : \mathbb{R}^N \to \mathbb{R}^N$. It must be smooth, with $\nabla f : \mathbb{R}^N \to \mathbb{R}^{N \times N}$ continuous and globally Lipschitz with constant $K > 0$ ($|\nabla f| \le K$). Moreover, the main result of this paper demands a further special property of the map $f$: for every $C > 0$ and small $\epsilon > 0$, there is $D_{C,\epsilon} > 0$ so that
(1.9) $|f(x+y) - f(x) - \nabla f(x)\, y| \le D_{C,\epsilon}\, |y|^2$ whenever $|x| \le C$ and $|y| \le \epsilon$.
This regularity is not surprising, as our approach here is based on regularity and optimality. On the other hand, such regularity holds for most of the important problems in applications. It certainly does in all numerical tests performed in this work. Our focus here is on the fact that this optimization strategy may be utilized to set up approximation schemes based on the minimization of the error functional. Indeed, we provide a solid basis for this approximation procedure. One very important and appealing property of our approach is that typical minimization schemes, like (steepest) descent methods, will work fine, as they can never get stuck in local minima and converge steadily to the solution of the problem, no matter what the initialization is.
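Hypothesis (1.9) is a uniform second-order Taylor bound. As a sanity check (our own illustration, not from the paper), for the scalar map $f(x) = \sin x$ one may take $D_{C,\epsilon} = 1/2$ for any $C$ and $\epsilon$, since $|f''| \le 1$:

```python
import math

# For f(x) = sin(x) we have |f''| <= 1, so the Taylor remainder satisfies
# |f(x + y) - f(x) - f'(x) y| <= |y|^2 / 2 uniformly in x, i.e. (1.9) holds
# with D_{C,eps} = 1/2 for every C and eps.
def remainder(x, y):
    return abs(math.sin(x + y) - math.sin(x) - math.cos(x) * y)

# worst-case slack of the bound over a grid of (x, y); should never be positive
worst = max(
    remainder(x, y) - 0.5 * y * y
    for x in [k * 0.1 for k in range(-50, 51)]
    for y in [k * 0.01 for k in range(-10, 11)]
)
```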

We should mention that we have already explored, in some previous papers, this point of view. Since the initial contribution , we have also treated the reverse mechanism of using first discretization and then optimality . The perspective of going through optimality and then discretization has already been indicated and studied in , though only for the steepest descent method, and without any further analytical foundation for the numerical procedure. However, the main hypotheses in these papers impose requirements that essentially rule out the application to singular problems. We will shortly address  a complete treatment of DAEs with no a priori assumptions on existence and uniqueness. Rather, we will be interested in establishing existence and uniqueness from scratch by examining the fundamental properties of the error functional $E$.

On the other hand, variational methods have also been used before in the context of ODEs. See [14, 15], where numerical integration algorithms for finite-dimensional mechanical systems, based on discrete variational principles, are proposed and studied. This is one approach to deriving and studying symplectic integrators. The starting point is Hamilton's principle and its direct discretization. In those references, some fundamental numerical methods are presented from that variational viewpoint, where the model plays a prominent role.

The rest of the paper is divided into four sections. In Section 2 we introduce our variational approach and develop a convergence analysis. Section 3 introduces the numerical procedure. We analyze some numerical results in Section 4. Finally, we present the main conclusions in Section 5.

2. A Main Descent Procedure

Proposition 2.1.

Let x¯ be a critical point for the error E. Then x¯ is the solution of the Cauchy problem (1.1).

Proof.

The proof is elementary. Based on the smoothness and bounds assumed on the mapping $f$, we conclude that if $x \equiv \bar x$ is a critical point for the error $E$, then $x$ ought to be a solution of the problem
(2.1) $-\frac{d}{dt}\Big(M^\top\big(Mx'(t) - f(x(t))\big)\Big) - \nabla f(x(t))^\top \big(Mx'(t) - f(x(t))\big) = 0$ in $(0,T)$, $x(0) = x_0$, $M^\top\big(Mx'(T) - f(x(T))\big) = 0$.
The vector-valued map $y(t) = Mx'(t) - f(x(t))$ is then a solution of the linear, nondegenerate problem
(2.2) $\frac{d}{dt}\big(M^\top y(t)\big) + \nabla f(x(t))^\top y(t) = 0$ in $(0,T)$, $M^\top y(T) = 0$.
The only solution of this problem, by our initial hypothesis on the uniqueness of solutions of (1.8), is $y \equiv 0$, and so $x$ is the solution of our Cauchy problem.

On the other hand, suppose we start with an initial crude approximation $x^{(0)}$ to the solution of our basic problem (1.1). We could take $x^{(0)}(t) \equiv x_0$ for all $t$, or $x^{(0)}(t) = x_0 + t f(x_0)$. We would like to improve this approximation in such a way that the error is significantly decreased. We have already pointed out that, under the global Lipschitz hypotheses, descent methods can never get stuck on anything but the solution of the problem.

It is straightforward to find the Gâteaux derivative of $E$ at a given feasible $x$ in the direction $y$ with $y(0) = 0$. Namely,
(2.3) $E'(x)y = \int_0^T \big(Mx'(t) - f(x(t))\big) \cdot \big(My'(t) - \nabla f(x(t))\, y(t)\big)\, dt$.
This expression suggests a natural way to select $y$. Choose $y$ such that
(2.4) $My'(t) - \nabla f(x(t))\, y(t) = f(x(t)) - Mx'(t)$ in $(0,T)$, $y(0) = 0$.
In this way, it is clear that $E'(x)y = -2E(x)$, and so the (local) decrease of the error is of the size of $E(x)$. Finding $y$ requires solving the above linear problem, which has a unique solution by our main hypotheses in the Introduction. In some sense, this is like a Newton method with global convergence.
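The identity $E'(x)y = -2E(x)$ for the choice (2.4) can be checked numerically. The sketch below (our own illustration) does so for the scalar regular case $M = 1$, $f(x) = \cos x$, comparing a central difference quotient of the discretized functional with $-2E(x)$.

```python
import math

# Discrete check of E'(x) y = -2 E(x) for the scalar ODE x' = cos(x) (M = 1).
# Path and direction are node values on a uniform grid; derivatives are
# difference quotients on interval midpoints, so the identity holds up to
# the finite-difference step in epsilon.
n, T = 400, 1.0
h = T / n

def energy(path):
    # E(x) ~ (1/2) sum_i h |x'_m - f(x_m)|^2 over interval midpoints
    tot = 0.0
    for i in range(n):
        dm = (path[i + 1] - path[i]) / h
        xm = 0.5 * (path[i] + path[i + 1])
        tot += (dm - math.cos(xm)) ** 2 * h
    return 0.5 * tot

x = [0.3] * (n + 1)            # crude constant initial guess, x(0) = 0.3
# solve the linearized problem (2.4): y' - f'(x) y = f(x) - x', y(0) = 0
y = [0.0] * (n + 1)
for i in range(n):
    xm = 0.5 * (x[i] + x[i + 1])
    fp = -math.sin(xm)          # f'(x) for f = cos
    rhs = math.cos(xm) - (x[i + 1] - x[i]) / h
    y[i + 1] = (rhs + y[i] * (1 / h + fp / 2)) / (1 / h - fp / 2)

E0 = energy(x)
eps = 1e-5
dd = (energy([a + eps * b for a, b in zip(x, y)])
      - energy([a - eps * b for a, b in zip(x, y)])) / (2 * eps)
# dd should agree with -2 * E0
```

Because the same discrete operators are used for the derivative and the integral, the agreement is limited only by the finite-difference step `eps`.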

Suppose $x^{(0)}$ is a feasible path in the interval $(0,T)$ such that $x^{(0)}(0) = x_0$, $M(x^{(0)})'$ is square integrable, and $|x^{(0)}(t)| \le C$ for a fixed constant $C$ and all $t \in (0,T)$. The quantity
(2.5) $E(x^{(0)}) = \frac{1}{2}\int_0^T \big|M(x^{(0)})'(t) - f(x^{(0)}(t))\big|^2\, dt$
measures how far such an $x^{(0)}$ is from being a solution of our problem.

Theorem 2.2.

For $T$ sufficiently small, the iterative procedure $x^{(j)} = x^{(j-1)} + y^{(j)}$, starting from the above feasible $x^{(0)}$ and defining $y^{(j)}$ as the solution of the linear problem
(2.6) $M(y^{(j)})'(t) - \nabla f(x^{(j-1)}(t))\, y^{(j)}(t) = f(x^{(j-1)}(t)) - M(x^{(j-1)})'(t)$ in $(0,T)$, $y^{(j)}(0) = 0$,
converges strongly in $L^\infty(0,T)$ to the solution of (1.1).

Proof.

Choose $\epsilon > 0$ and $0 < \alpha < 1$ so that
(2.7) $\frac{\epsilon}{1-\alpha} \le C$, $\quad |f(z+y) - f(z) - \nabla f(z)\, y| \le D\, |y|^2$ for $|y| \le \epsilon$, $|z| \le 2C$,
for some constant $D > 0$ (see the main hypotheses in Section 1). We then solve for $y^{(0)}$, the solution of the nonautonomous linear problem
(2.8) $My'(t) - \nabla f(x^{(0)}(t))\, y(t) = f(x^{(0)}(t)) - M(x^{(0)})'(t)$ in $(0,T)$, $y(0) = 0$,
and update $x^{(0)}$ to $x^{(0)} + y^{(0)}$, in such a way that the error for $x^{(0)} + y^{(0)}$ is less than the error for the current iterate $x^{(0)}$. Note that
(2.9) $E(x^{(0)} + y^{(0)}) = \frac{1}{2}\int_0^T \big|f(x^{(0)}(t) + y^{(0)}(t)) - f(x^{(0)}(t)) - \nabla f(x^{(0)}(t))\, y^{(0)}(t)\big|^2\, dt$,
where we have used the differential equation satisfied by $y^{(0)}$ and the definition of $E(x)$. By our assumption on $f$ above,
(2.10) $\big|f(x^{(0)}(t) + y^{(0)}(t)) - f(x^{(0)}(t)) - \nabla f(x^{(0)}(t))\, y^{(0)}(t)\big| \le D\, |y^{(0)}(t)|^2$, $t \in (0,T)$,
provided that $|y^{(0)}(t)| \le \epsilon$. Since $y^{(0)}$ is the solution of a certain linear problem, by the upper bound assumed in Section 1 on the size of these solutions,
(2.11) $|y^{(0)}(t)| \le \sqrt{2TL\, E(x^{(0)})}$, $\quad t \in [0,T]$.
Assume that we select $T > 0$ so small that
(2.12) $E_{0,T}(x^{(0)}) \equiv E(x^{(0)}) \le \frac{\epsilon^2}{2TL}$,
and then $|y^{(0)}(t)| \le \epsilon$ for all $t \in [0,T]$. By (2.9), (2.10), and (2.11), we can write
(2.13) $E(x^{(0)} + y^{(0)}) \le \frac{D^2}{2}\int_0^T |y^{(0)}(t)|^4\, dt \le 2D^2 L^2 T^3 E(x^{(0)})^2$.
If, in addition, we demand, by making $T$ smaller if necessary,
(2.14) $E(x^{(0)}) \le \frac{\alpha}{2D^2 T^3 L^2}$,
then $E(x^{(0)} + y^{(0)}) \le \alpha E(x^{(0)})$. Moreover, for all $t \in (0,T)$,
(2.15) $|x^{(0)}(t) + y^{(0)}(t)| \le C + \epsilon \le 2C$.
All these calculations form the basis of a typical induction argument, verifying
(2.16) $\Big|\sum_{i=0}^{j-1} x^{(i)}(t)\Big| \le C + \epsilon \sum_{i=0}^{j-2} \alpha^i \le 2C$, $\quad |x^{(j-1)}(t)| \le \epsilon\, \alpha^{j-2}$, $\quad t \in [0,T]$, $\quad E\Big(\sum_{i=0}^{j-1} x^{(i)}\Big) \le \alpha^{j-1} E(x^{(0)}) \le E(x^{(0)})$.

It is therefore clear that the sum
(2.17) $\sum_{i=0}^{\infty} x^{(i)}(t)$
converges strongly in $L^\infty(0,T)$ to the solution of our initial Cauchy problem in a small interval $(0,T)$.

Since the various ingredients of the problem do not depend on $T$, we can proceed to a global approximation in a big interval by successively performing this analysis in intervals of appropriate small size. For instance, we can always divide a global interval $(0,T)$ into a certain number $n$ of subintervals of small length $h$ ($T = nh$) with
(2.18) $E_{0,T}(x^{(0)}) \le \frac{\alpha}{2D^2 L^2}\, \frac{1}{h^3}$,
according to (2.14).

3. Numerical Procedure

Since our optimization approach is really constructive, iterative numerical procedures are easily implementable.

(1) Start with an initial approximation $x^{(0)}(t)$ compatible with the initial conditions (e.g., $x^{(0)}(t) = x_0 + t f(x_0)$).

(2) Assume we know the approximation $x^{(j)}(t)$ in $[0,T]$.

(3) Compute its derivative $M(x^{(j)})'(t)$.

(4) Compute the auxiliary function $y^{(j)}(t)$ as the numerical solution of the problem
(3.1) $My'(t) - \nabla f(x^{(j)}(t))\, y(t) = f(x^{(j)}(t)) - M(x^{(j)})'(t)$ in $(0,T)$, $y(0) = 0$,
by making use of a numerical scheme for DAEs with dense output (such as collocation methods).

(5) Update $x^{(j)}$ to $x^{(j+1)}$ by the formula
(3.2) $x^{(j+1)}(t) = x^{(j)}(t) + y^{(j)}(t)$.

Iterate (3), (4), and (5) until numerical convergence.

In practice, we use the stopping criterion
(3.3) $\max\big\{\|y^{(j)}\|_\infty,\, \sqrt{2E(x^{(j)})}\big\} \le \mathrm{TOL}$.

In particular, this numerical procedure can be implemented very easily in a problem-solving environment like MATLAB .
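For readers without MATLAB, the whole procedure also fits in a few lines of plain Python. The sketch below (our own minimal illustration) applies the steps above to the index-1 problem of Section 4, replacing the Lobatto IIIC solver used there by implicit Euler, which has the first-order accuracy required by Theorem 2.2, and stopping when the correction $y^{(j)}$ is small.

```python
import math

# Iterative procedure for the index-1 problem y' = z, y^2 + z^2 = 1,
# i.e. (1.1) with M = diag(1, 0) and f(y, z) = (z, y^2 + z^2 - 1).
T, n = 0.25, 100
h = T / n

def linearized_sweep(yp, zp):
    """Implicit Euler sweep for the linearized DAE (3.1):
    u1' - u2 = r1,  -2y u1 - 2z u2 = r2,  u(0) = 0."""
    u1 = [0.0] * (n + 1)
    u2 = [0.0] * (n + 1)
    for k in range(n):
        y1, z1 = yp[k + 1], zp[k + 1]
        r1 = z1 - (yp[k + 1] - yp[k]) / h     # residual of y' = z
        r2 = y1 * y1 + z1 * z1 - 1.0          # residual of the constraint
        a, b, c, d = 1.0 / h, -1.0, -2.0 * y1, -2.0 * z1
        rhs1 = r1 + u1[k] / h
        det = a * d - b * c
        u1[k + 1] = (rhs1 * d - b * r2) / det  # 2x2 solve per time step
        u2[k + 1] = (a * r2 - c * rhs1) / det
    return u1, u2

y0 = z0 = math.sqrt(2.0) / 2.0
ypath = [y0 + k * h * z0 for k in range(n + 1)]  # x^(0)(t) = x0 + t f(x0)
zpath = [z0] * (n + 1)
last_update = float("inf")
for _ in range(50):
    u1, u2 = linearized_sweep(ypath, zpath)
    ypath = [a + b for a, b in zip(ypath, u1)]
    zpath = [a + b for a, b in zip(zpath, u2)]
    last_update = max(max(map(abs, u1)), max(map(abs, u2)))
    if last_update < 1e-10:
        break
```

After a handful of iterations the corrections drop to roundoff level, and the computed path agrees with the exact solution $(\sin(t + \pi/4), \cos(t + \pi/4))$ to the accuracy of the underlying implicit Euler discretization.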

4. Some Experiments

In this section, we approximate some problems, well known in the literature, with different indexes [5, 17, 18]. High-order accuracy and stability are major areas of interest in this type of simulation. We do not verify the convergence conditions imposed in the section above; we are only interested in testing our approach numerically.

In our approach, we only need to approximate the associated linear system for $y^{(j)}$ with at least order one in order to obtain convergence of our scheme (see Theorem 2.2). Stability can be ensured by the fact that we approximate a linear problem using specific implicit methods . This is not the case for a general nonlinear problem , where we need to approximate well (with a Newton-type iterative method) the nonlinear system related to the implicitness of the scheme (see the section above). This approximation can be a difficult task due to the local (nonglobal) convergence of any iterative scheme for nonlinear problems.

In this section, we consider the convergent Lobatto IIIC method , valid for indexes 1–3, to approximate the associated linear problem for $y^{(j)}$ in each iteration. This method can be considered a perturbed collocation method. The final error depends only on the stopping criterion. In the following examples, we stop the algorithm when
(4.1) $\max\big\{\|y^{(j)}\|_\infty,\, \sqrt{2E(x^{(j)})}\big\} \le 10^{-6}$
and plot the solution and the approximation given by our approach.

Index 1.
(4.2) $y'(t) = z(t)$, $\quad y(t)^2 + z(t)^2 = 1$, $\quad y(0) = z(0) = \frac{\sqrt{2}}{2}$.
The solution of this problem is $(\sin(t + \pi/4), \cos(t + \pi/4))$.

Index 2.
(4.3) $y_1'(t) = \sum_{i=1}^{2} f_i(y_1(t), y_2(t), z(t))$, $\quad y_2'(t) = \sum_{i=1}^{2} g_i(y_1(t), y_2(t), z(t))$, $\quad y_1(t)^2 y_2(t) = 1$, $\quad y_1(0) = y_2(0) = 1$, $z(0) = 1$,
where
(4.4) $f_1(y_1(t), y_2(t), z(t)) = y_2(t) - 2y_1(t)^2 y_2(t) + y_1(t) y_2(t)^2 z(t)^2 + 2y_1(t) y_2(t)^2 - 2e^{-2t} y_1(t) y_2(t)$,
$f_2(y_2(t), z(t)) = -y_2(t)^2 z(t) + 2y_2(t)^2 z(t)^2$,
$g_1(y_1(t), y_2(t)) = -y_1(t)^2 + y_1(t)^2 y_2(t)^2$,
$g_2(y_1(t), y_2(t), z(t)) = -y_1(t) + e^{-t} z(t) - 3y_2(t)^2 z(t) + z(t)$.
The solution of this problem is $(e^t, e^{-2t}, e^{2t})$.

Index 3.
(4.5) $y_1'(t) = 2y_1(t) y_2(t) z_1(t) z_2(t)$, $\quad y_2'(t) = -y_1(t) y_2(t) z_2(t)^2$, $\quad z_1'(t) = \big(y_1(t) y_2(t) + z_1(t) z_2(t)\big) u(t)$, $\quad z_2'(t) = -y_1(t) y_2(t)^2 z_2(t)^2 u(t)$, $\quad y_1(t) y_2(t)^2 = 1$, $\quad y_1(0) = y_2(0) = 1$, $z_1(0) = z_2(0) = 1$, $u(0) = 1$.
The solution of this problem is $(e^{2t}, e^{-t}, e^{2t}, e^{-t}, e^t)$.
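The stated solution can be checked directly against (4.5); the following quick verification (our own, in plain Python) confirms that all four differential relations and the constraint hold along it.

```python
import math

def residuals(t):
    # stated solution of the index-3 system: (y1, y2, z1, z2, u)
    y1, y2 = math.exp(2 * t), math.exp(-t)
    z1, z2, u = math.exp(2 * t), math.exp(-t), math.exp(t)
    # analytic derivatives of the stated solution
    dy1, dy2, dz1, dz2 = 2 * y1, -y2, 2 * z1, -z2
    return [
        dy1 - 2 * y1 * y2 * z1 * z2,           # y1' = 2 y1 y2 z1 z2
        dy2 + y1 * y2 * z2 ** 2,               # y2' = -y1 y2 z2^2
        dz1 - (y1 * y2 + z1 * z2) * u,         # z1' = (y1 y2 + z1 z2) u
        dz2 + y1 * y2 ** 2 * z2 ** 2 * u,      # z2' = -y1 y2^2 z2^2 u
        y1 * y2 ** 2 - 1.0,                    # constraint y1 y2^2 = 1
    ]

# all residuals vanish (up to roundoff) at sampled times
worst = max(abs(r) for t in [0.0, 0.5, 1.0, 1.5, 2.0] for r in residuals(t))
```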

In Figures 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10, we compare the solutions of the corresponding three problems with the approximations given by our approach. The results are very satisfactory in all cases, always obtaining convergence to the true solution. At first glance, the exact and computed solutions are indistinguishable, since after convergence the error is smaller than the tolerance ($10^{-6}$) used in the stopping criterion. A more systematic and careful analysis of the numerical possibilities of the method will be pursued in the future.

Index 1, T=2π, the y-coordinate, “o”-original, “+”-approximation.

Index 1, T=2π, the z-coordinate, “o”-original, “+”-approximation.

Index 2, T=2, the y1-coordinate, “o”-original, “+”-approximation.

Index 2, T=2, the y2-coordinate, “o”-original, “+”-approximation.

Index 2, T=2, the z-coordinate, “o”-original, “+”-approximation.

Index 3, T=2, the y1-coordinate, “o”-original, “+”-approximation.

Index 3, T=2, the y2-coordinate, “o”-original, “+”-approximation.

Index 3, T=2, the z1-coordinate, “o”-original, “+”-approximation.

Index 3, T=2, the z2-coordinate, “o”-original, “+”-approximation.

Index 3, T=2, the u-coordinate, “o”-original, “+”-approximation.

5. Conclusions

A new variational approach to the analysis and numerical implementation of regular ODEs has recently been introduced in [10, 20]. Because of its flexibility and simplicity, it can easily be extended to treat other types of equations, like differential-algebraic equations (DAEs). This has been precisely the main motivation for this paper: to explore how well those ideas can be adapted to this framework. In particular, we have extended some of the analytical results to this context and performed various numerical tests confirming that the variational perspective is indeed worth pursuing. One remarkable feature is that this point of view only requires good numerical schemes for linear problems, which is why it fits so well in other scenarios. Because of the many good qualities of this viewpoint, it can be considered and implemented in essentially all fields where differential equations are relevant. There is, then, a long way to go.

Acknowledgments

S. Amat and M. J. Légaz were supported by MINECO-FEDER MTM2010-17508 (Spain) and by 08662/PI/08 (Murcia). P. Pedregal was supported by MINECO-FEDER MTM2010-19739 (Spain).

[1] Brenan, K. E., Campbell, S. L., Petzold, L. R., Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations, SIAM, Philadelphia, Pa, USA, 1996.
[2] Brenan, K. E., Petzold, L. R., The numerical solution of higher index differential/algebraic equations by implicit methods, SIAM Journal on Numerical Analysis 26 (4) (1989) 976–996. doi:10.1137/0726054.
[3] Campbell, S. L., Griepentrog, E., Solvability of general differential algebraic equations, SIAM Journal on Scientific Computing 16 (2) (1995) 257–270. doi:10.1137/0916017.
[4] Hairer, E., Wanner, G., Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer, Berlin, Germany, 1991.
[5] Riaza, R., Differential-Algebraic Systems, World Scientific, Singapore, 2008.
[6] Emmrich, E., Mehrmann, V., Analysis of Operator Differential-Algebraic Equations Arising in Fluid Dynamics. Part I. The Finite Dimensional Case, 2010.
[7] Auzinger, W., Frank, R., Kirlinger, G., An extension of B-convergence for Runge-Kutta methods, Applied Numerical Mathematics 9 (2) (1992) 91–109. doi:10.1016/0168-9274(92)90008-2.
[8] Auzinger, W., Frank, R., Kirlinger, G., Modern convergence theory for stiff initial value problems, Journal of Computational and Applied Mathematics 45 (1-2) (1993) 5–16. doi:10.1016/0377-0427(93)90260-I.
[9] Berger, T., Ilchmann, A., On the standard canonical form of time-varying linear DAEs, Quarterly of Applied Mathematics, in press.
[10] Amat, S., Pedregal, P., A variational approach to implicit ODEs and differential inclusions, ESAIM: Control, Optimisation and Calculus of Variations 15 (1) (2009) 139–148. doi:10.1051/cocv:2008020.
[11] Amat, S., López, D. J., Pedregal, P., An optimization approach for the numerical approximation of differential equations, Optimization, in press. doi:10.1080/02331934.2011.649283.
[12] Pedregal, P., A variational approach to dynamical systems, and its numerical simulation, Numerical Functional Analysis and Optimization 31 (7) (2010).
[13] Amat, S., Pedregal, P., A constructive existence theorem for non-linear DAEs through a variational strategy, in preparation.
[14] Lew, A., Marsden, J. E., Ortiz, M., West, M., Variational time integrators, International Journal for Numerical Methods in Engineering 60 (1) (2004) 153–212. doi:10.1002/nme.958.
[15] Marsden, J. E., West, M., Discrete mechanics and variational integrators, Acta Numerica 10 (2001) 357–514. doi:10.1017/S096249290100006X.
[16] The MathWorks, Inc., MATLAB and SIMULINK, Natick, Mass, USA.
[17] Jay, L. O., Solution of index 2 implicit differential-algebraic equations by Lobatto Runge-Kutta methods, BIT Numerical Mathematics 43 (1) (2003) 93–106. doi:10.1023/A:1023696822355.
[18] Jay, L. O., Convergence of Runge-Kutta methods for differential-algebraic systems of index 3, Applied Numerical Mathematics 17 (2) (1995) 97–118. doi:10.1016/0168-9274(95)00013-K.
[19] Lambert, J. D., Numerical Methods for Ordinary Differential Systems, John Wiley & Sons, 1991.
[20] Amat, S., Pedregal, P., On an alternative approach for the analysis and numerical simulation of stiff ODEs, Discrete and Continuous Dynamical Systems A 33 (4) (2012) 1275–1291.