A simple alternative to the conjugate gradient (CG) method is presented; this method is developed as a special case of the more general iterated Ritz method (IRM) for solving a system of linear equations. The novel algorithm is not based on conjugacy; i.e., it is not necessary to maintain overall orthogonality among vectors from distant steps. The method is more stable than CG, and restarting techniques are not required. As in CG, only one matrix-vector multiplication is required per step with appropriate transformations. The algorithm is easily explained by energy considerations without appealing to A-orthogonality in n-dimensional space. Finally, a relaxation factor and preconditioning-like techniques can be adopted easily.
1. Introduction
Let

$$Ax = b, \tag{1}$$

be a real linear system with a symmetric positive definite (SPD) matrix $A$ of order $n$. By IRM, the solution is sought through successive minimisation of the corresponding energy functional, or the quadratic form,

$$f(x) = \tfrac{1}{2}\,x^T A x - x^T b, \tag{2}$$

inside a small subspace formed at each iteration step [1]. After the convergence criterion is reached, a solution close to the unique minimiser of $f(x)$ is found. Geometrically, this is the point close to the centre of the hyperellipsoids $f(x) = c$, where $c$ are arbitrary real constants.
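As a small numerical check, the following sketch (with a hypothetical 2×2 SPD system of our choosing) confirms that the solution of (1) minimises the energy functional (2):

```python
import numpy as np

# Hypothetical SPD example: the solution of Ax = b minimises f.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def f(x):
    """Energy functional f(x) = 1/2 x^T A x - x^T b."""
    return 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)     # exact minimiser

# Every perturbed point has higher energy:
# f(x* + d) - f(x*) = 1/2 d^T A d > 0 for SPD A and d != 0.
rng = np.random.default_rng(0)
for _ in range(100):
    d = rng.standard_normal(2)
    assert f(x_star + d) > f(x_star)
```

The identity $f(x^\ast + d) - f(x^\ast) = \tfrac{1}{2}d^T A d$ is why positive definiteness guarantees a unique minimiser.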
2. Briefly about IRM
The main idea here is to represent the solution increment by the discretised Ritz method:

$$p_i = \Phi_i a_i, \tag{3}$$

where $\Phi_i = [\phi_{1,i}, \phi_{2,i}, \ldots, \phi_{m,i}]$ is a matrix of linearly independent coordinate vectors and $a_i$ is the vector of corresponding coefficients. The energy decrement associated with (3) can also be expressed as the quadratic function

$$\Delta f(a_i) = \tfrac{1}{2}\,a_i^T \bar{A}_i a_i - a_i^T \bar{r}_i, \tag{4}$$

where $\bar{A}_i = \Phi_i^T A \Phi_i$ and $\bar{r}_i = \Phi_i^T r_i$ are the SPD generalised (Ritz) matrix and the generalised residual vector, respectively, and both terms are of order $m$. After minimising (4), we obtain the system of equations that should be solved at each step:

$$\bar{A}_i a_i = \bar{r}_i. \tag{5}$$
The solution is used to find the increment in (3), and $x_{i+1} = x_i + p_i$ is updated afterwards. The residual is defined in the standard manner as $r_{i+1} = b - A x_{i+1}$.
Obviously, IRM represents an iterative procedure in which a discrete Ritz method is applied at each step and a suitable set of coordinate vectors spanning a subspace is generated. A local energy minimum is sought within that subspace (therefore, equation (5) should be solved at each step), thereby decreasing the total energy of the system, which eventually converges to the required minimum. The subspace dimension, or the size of (5), is not limited. Rather, it aims to be small, much smaller than the number of unknowns ($m \ll n$), because every iteration must be as fast as possible. Such a small system (though $\bar{A}_i$ is usually full) can be solved by any direct method. Simple pseudocode, with input data and a sequence of instructions common to iterative solution methods, is given by Algorithm 1.
Algorithm 1: Basic IRM algorithm.
Require: A, b, x_0, ε, n_max {usually x_0 ← 0}
Ensure: x_{i+1} {close to x}
1: i ← 0 {initialisation: steepest descent}
2: r_0 ← b − A x_0
3: q ← (r_0^T r_0)/(r_0^T A r_0)
4: p_0 ← q r_0
5: while ‖r_i‖_2 > ε ‖r_0‖_2 ∧ i ≤ n_max do {iterated Ritz method}
6:  x_{i+1} ← x_i + p_i
7:  r_{i+1} ← b − A x_{i+1}
8:  generate ϕ_{1,i}, ϕ_{2,i}, …, ϕ_{m,i}
9:  Ā_i ← [ϕ_{1,i} ϕ_{2,i} … ϕ_{m,i}]^T A [ϕ_{1,i} ϕ_{2,i} … ϕ_{m,i}]
10:  r̄_i ← [ϕ_{1,i} ϕ_{2,i} … ϕ_{m,i}]^T r_{i+1}
11:  a_i ← Ā_i^{−1} r̄_i
12:  p_{i+1} ← [ϕ_{1,i} ϕ_{2,i} … ϕ_{m,i}] a_i
13:  i ← i + 1
14: end while {end iterated Ritz method}
The central problem involves quickly generating a small and efficient subspace, such that the energy reduction per step is as large as possible and the number of steps is greatly reduced. Usually, one coordinate vector is $p_i$, and the others are generated as $P_j r_{i+1}$, where $P_j$ is a fast approximation of $A^{-1}$. Nonresidual-based generation ideas are also possible [2]. Because many strategies to construct $P_j$ (or, generally, $\phi_{j,i}$) exist, the algorithm also allows any new subspace-generating routine (e.g., those suggested by other independent researchers) to be implemented easily (line 8, Algorithm 1). Potentially, this may make the method even faster.
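The basic loop of Algorithm 1, with a pluggable subspace generator, can be sketched as follows. This is a minimal dense-matrix sketch: the generator shown (residual, Jacobi-scaled residual, and previous increment) is one admissible choice of ours, not a prescription of the method.

```python
import numpy as np

def irm(A, b, generate, eps=1e-10, n_max=1000):
    """Basic IRM loop mirroring Algorithm 1: `generate(r, p)` returns the
    coordinate vectors as the columns of a matrix (the generation step)."""
    x = np.zeros(len(b))
    r = b - A @ x
    r0_norm = np.linalg.norm(r)
    q = (r @ r) / (r @ A @ r)              # steepest-descent initialisation
    p = q * r
    for _ in range(n_max):
        x = x + p
        r = b - A @ x
        if np.linalg.norm(r) <= eps * r0_norm:
            break
        Phi = generate(r, p)               # coordinate vectors phi_{j,i}
        A_bar = Phi.T @ A @ Phi            # generalised (Ritz) matrix
        r_bar = Phi.T @ r                  # generalised residual
        # lstsq tolerates (almost) linearly dependent coordinate vectors
        a, *_ = np.linalg.lstsq(A_bar, r_bar, rcond=1e-12)
        p = Phi @ a                        # new increment
    return x

def jacobi_generator(A):
    """One possible generator (our choice, for illustration): the residual,
    a Jacobi-scaled residual D^{-1} r, and the previous increment."""
    d = np.diag(A)
    return lambda r, p: np.column_stack([r, r / d, p])
```

Any other routine with the same `generate(r, p)` signature can be swapped in without touching the loop, which is exactly the flexibility described above.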
It should be noted that the conjugacy property is not explicitly taken into account in IRM, and coordinate vectors may become (almost) linearly dependent. Therefore, routines for subspace generation that prevent such a scenario are preferred and may even change between steps. Nevertheless, if this dependence arises, some pivots approach zero during the decomposition of $\bar{A}_i$, which can be recognised and used to discard the corresponding equations from the small system. The subspace dimension is reduced in such cases, but $\bar{A}_i$ becomes more regular and better conditioned. This strategy is faster than orthogonalisation, rejection, or replacement of the dependent vectors [3].
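The pivot-based discarding can be sketched as follows (a minimal dense implementation of our own; the relative threshold `tol` is an assumed value):

```python
import numpy as np

def solve_with_pivot_dropping(A_bar, r_bar, tol=1e-12):
    """Solve the small Ritz system, discarding equations whose Cholesky
    pivot (the Schur-complement diagonal) nearly vanishes, i.e. whose
    coordinate vectors are (almost) linearly dependent on earlier ones."""
    m = A_bar.shape[0]
    keep = []
    for j in range(m):
        idx = keep + [j]
        try:
            L = np.linalg.cholesky(A_bar[np.ix_(idx, idx)])
        except np.linalg.LinAlgError:
            continue                      # dependent vector: drop equation j
        if L[-1, -1] ** 2 <= tol * A_bar[j, j]:
            continue                      # pivot ~ 0: drop equation j
        keep.append(j)
    a = np.zeros(m)
    a[keep] = np.linalg.solve(A_bar[np.ix_(keep, keep)], r_bar[keep])
    return a
```

For an exactly duplicated coordinate vector the dropped equation is automatically consistent with the kept ones, so the reduced solution still satisfies the full small system.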
IRM can also be considered as a generalisation of some iterative methods [1, 2]. Depending on the choice of coordinate vectors, some solvers can be represented or interpreted as special cases of this approach. Furthermore, it is possible to combine good properties of several methods simultaneously. If appropriate vectors are selected, convergence should proceed faster than using any single method considered. Here, an improved CG algorithm (IRM-CG) is presented due to the popularity of SPD systems.
3. IRM-CG as a Special Case of IRM
The algorithm presented here also starts with a steepest descent (SD) step. The other steps are executed using a CG-like algorithm simulated by IRM with two coordinate vectors. The first vector is the current residual $r_{i+1}$, and the second is the previous solution increment $p_i$. These vectors span a two-dimensional subspace. At each step, a system of two equations is solved and a new energy minimum within that plane is found (Algorithm 2).
Algorithm 2: Basic IRM-CG algorithm.
Require: A, b, x_0, ε, n_max {usually x_0 ← 0}
Ensure: x_{i+1} {close to x}
1: i ← 0 {initialisation: steepest descent}
2: r_0 ← b − A x_0
3: q ← (r_0^T r_0)/(r_0^T A r_0)
4: p_0 ← q r_0
5: while ‖r_i‖_2 > ε ‖r_0‖_2 ∧ i ≤ n_max do {IRM-CG method}
6:  x_{i+1} ← x_i + p_i
7:  r_{i+1} ← b − A x_{i+1}
8:  Ā_i ← [r_{i+1} p_i]^T A [r_{i+1} p_i]
9:  r̄_i ← [r_{i+1}^T r_{i+1}  0]^T {second term is zero because r_{i+1}^T p_i = 0}
10:  a_i ← Ā_i^{−1} r̄_i
11:  p_{i+1} ← [r_{i+1} p_i] a_i
12:  i ← i + 1
13: end while {end IRM-CG method}
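The two-vector step of Algorithm 2 can be sketched as a straightforward dense implementation (tolerances and the NumPy solver are our choices):

```python
import numpy as np

def irm_cg(A, b, eps=1e-10, n_max=1000):
    """IRM-CG sketch: after a steepest-descent start, every step minimises
    the energy in the plane spanned by the current residual and the
    previous solution increment."""
    x = np.zeros(len(b))
    r = b - A @ x
    r0_norm = np.linalg.norm(r)
    q = (r @ r) / (r @ A @ r)          # steepest-descent initialisation
    p = q * r
    for _ in range(n_max):
        x = x + p
        r = b - A @ x
        if np.linalg.norm(r) <= eps * r0_norm:
            break
        Phi = np.column_stack([r, p])
        A_bar = Phi.T @ A @ Phi            # 2 x 2 Ritz matrix
        r_bar = np.array([r @ r, 0.0])     # second term: r^T p = 0
        a = np.linalg.solve(A_bar, r_bar)
        p = Phi @ a                        # new increment
    return x
```

Note that this direct transcription spends three matrix-vector products per step; the recursions derived next reduce that to one.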
This approach has three matrix-vector multiplications per step: one in line 7 and two in line 8. Applying two "induced" recursive relations ("inherent" A-orthogonalisation is not exploited here), only one such multiplication remains (as in CG). If line 11 (Algorithm 2) is multiplied by A, then

$$A p_{i+1} = [A r_{i+1} \;\; A p_i]\, a_i. \tag{6}$$
Substituting $\alpha_i = A r_{i+1}$ and $\beta_i = A p_i$ into the first recursion yields

$$\beta_{i+1} = [\alpha_i \;\; \beta_i]\, a_i. \tag{7}$$
Second, the frequently used residual recursion $r_{i+1} = r_i - A p_i$ becomes

$$r_{i+1} = r_i - \beta_i. \tag{8}$$
Now, after line 4 (Algorithm 2), the new initialisation

$$\beta_0 \leftarrow A p_0 \tag{9}$$

should be inserted, and the pseudocode inside the while loop becomes as follows:

(10)
x_{i+1} ← x_i + p_i
r_{i+1} ← r_i − β_i
α_i ← A r_{i+1} {sole matrix-vector multiplication}
Ā_i ← [r_{i+1} p_i]^T [α_i β_i] {Ā_i is symmetric: r_{i+1}^T β_i = p_i^T α_i}
lines 9–11 remain unchanged
β_{i+1} ← [α_i β_i] a_i
i ← i + 1
Due to roundoff errors, as in CG, the residual is periodically (after i_max steps) updated from the equilibrium equation; i.e., the following should be used instead of the second line of (10):

(11)
if i mod i_max ≠ 0 then
 r_{i+1} ← r_i − β_i
else
 r_{i+1} ← b − A x_{i+1}
end if
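Combining recursions (6)–(9) with the periodic refresh (11) gives the one-multiplication variant, sketched below (the refresh interval `i_max = 50` is an assumed value):

```python
import numpy as np

def irm_cg_fast(A, b, eps=1e-10, n_max=1000, i_max=50):
    """IRM-CG with one matrix-vector product per step, using the recursions
    beta_{i+1} = [alpha_i beta_i] a_i and r_{i+1} = r_i - beta_i, plus a
    periodic residual refresh from the equilibrium equation."""
    x = np.zeros(len(b))
    r = b - A @ x
    r0_norm = np.linalg.norm(r)
    q = (r @ r) / (r @ A @ r)          # steepest-descent initialisation
    p = q * r
    beta = A @ p                       # beta_0 <- A p_0
    for i in range(1, n_max + 1):
        x = x + p
        # recursive residual, periodically refreshed
        r = r - beta if i % i_max != 0 else b - A @ x
        if np.linalg.norm(r) <= eps * r0_norm:
            break
        alpha = A @ r                  # sole matrix-vector multiplication
        Phi = np.column_stack([r, p])
        W = np.column_stack([alpha, beta])     # [A r, A p]
        A_bar = Phi.T @ W                      # symmetric 2 x 2 Ritz matrix
        a = np.linalg.solve(A_bar, np.array([r @ r, 0.0]))
        p = Phi @ a
        beta = W @ a                   # beta_{i+1} <- [alpha_i beta_i] a_i
    return x
```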
4. Equivalence between CG and IRM-CG
The proof of equivalence between CG and IRM-CG is very simple, so it will be discussed only briefly. Initialisation is practically identical for both methods. In the other steps, the minimum of the energy function inside the plane spanned by $r_{i+1}$ and $p_i$ is determined. In CG, this is realised by A-orthogonalisation; in IRM-CG, by solving a linear system of two equations.
If exact arithmetic is considered, IRM-CG and CG produce an identical sequence of intermediate results. The exact solution is obtained after $m$ steps, where $m$ is the number of distinct "active" eigenvalues. If $b$ is represented as a sum of eigenvectors $\varphi_j$, i.e., $b = \sum_j a_j \varphi_j$, the eigenvectors (and their corresponding eigenvalues) with $a_j \neq 0$ may be called "active" (or "inactive" otherwise). Of course, $m$ can be found only if all $n$ eigenpairs are detected. Multiple eigenvalues should be counted as one, and "inactive" eigenvalues are not counted at all. This comment is only important for theoretical considerations, as IRM-CG is interesting as an iterative, not a direct, solution method.
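The "active" eigenvalue count can be illustrated numerically. In the sketch below (our example), $A$ has three distinct eigenvalues, but $b$ has no component along the eigenvalue-3 eigenvector, so only $m = 2$ are active and the method terminates after two steps, up to roundoff:

```python
import numpy as np

# A = diag(1, 1, 2, 2, 3): three distinct eigenvalues, but the last entry
# of b is zero, so the eigenvalue-3 eigenvector is "inactive" and m = 2.
A = np.diag([1.0, 1.0, 2.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0, 4.0, 0.0])

x = np.zeros(5)
r = b.copy()
d = r.copy()                          # first CG direction
for _ in range(2):                    # m = 2 steps suffice
    Ad = A @ d
    alpha = (r @ r) / (d @ Ad)
    x = x + alpha * d
    r_new = r - alpha * Ad
    d = r_new + ((r_new @ r_new) / (r @ r)) * d
    r = r_new

residual = np.linalg.norm(b - A @ x)  # ~ machine precision after 2 steps
```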
5. Advantages of IRM-CG over CG
During real calculations (with roundoff errors), IRM-CG is more stable and behaves better than CG. First, restarting of the algorithm is not needed because A-orthogonality is not exploited. Namely, an error in A-orthogonality also exists when the IRM-CG formulation is used, but it does not accumulate during the calculation. Therefore, errors inherited from the loss of A-orthogonality decay, although nonexact arithmetic (as in every numerical process) causes new errors and affects convergence.
Consider a simple example with diagonal $A$ ($a_{1,1} = 1$ and $a_{2,2} = \kappa$, where $\kappa$ is the condition number of $A$). If $b = [1\;\;1]^T$, the exact solution $x = [1\;\;1/\kappa]^T$ is obtained in two steps by both methods using rational arithmetic. To check the stability of the methods, after the initialisation phase (for $i = 0$), a small disturbance $\delta$ is added to the second term of $p_1$ (Figure 1(a)). The second step of IRM-CG still gives the exact result, but CG gives only a perturbed approximate solution as a function of $\delta$ and $\kappa$:

$$\tilde{x}_1 = \frac{2}{1+\kappa} + \frac{2(\kappa-1)\bigl(2-\delta(\kappa-1)\bigr)}{4(\kappa+1) - 4\delta(\kappa-1) + \delta^2(\kappa-1)^2}, \qquad \tilde{x}_2 = \frac{2}{1+\kappa} - \frac{(\kappa-1)\bigl(\delta(\kappa-1)-2\bigr)^2}{\kappa\bigl[4(\kappa+1) - 4\delta(\kappa-1) + \delta^2(\kappa-1)^2\bigr]}. \tag{12}$$
Figure 1: Stability of the CG method: (a) interpretation of the disturbance δ, (b) the function ‖x̃ − x‖ = f(δ), and (c) the surface ‖x̃ − x‖ = f(δ, κ).
Notice the complexity of the CG solution, even for a diagonal matrix of order two. To better explain the expressions, the functions ‖x̃ − x‖ = f(δ) for three values of κ over the domain δ ∈ [−10⁻², 10⁻²], and the surface ‖x̃ − x‖ = f(δ, κ) for κ = 10^a (a ∈ [0, 4]), are given in Figures 1(b) and 1(c). Only for δ = 0 is the exact solution recovered from (12). Also, the functions are nonsymmetric, so CG behaves differently for +δ and −δ.
When large equation systems are considered, δ accumulates primarily through the loss of A-orthogonality, which is inherited recursively. The main reason is the approximate CG solution at each step (in the current plane), in practice more complicated than (12). By contrast, IRM-CG finds the numerically "exact" solution in each plane, because the system of two equations is solved anew at every step. Roughly, if δ is split into two components at each step, one inside the plane and the other orthogonal to it, the first component is "exactly" resolved and produces no inherited error. In CG, both components cause error propagation. In other words, if δ lies in the plane, IRM-CG behaves as if δ = 0; only the orthogonal component disturbs the solution.
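The flavour of the disturbance experiment can be reproduced numerically. In the sketch below, κ = 100 and δ = 10⁻² are our choices, and the perturbed step uses standard Fletcher-Reeves CG formulas as a stand-in for the exact variant analysed above; the IRM-CG step solves the 2×2 Ritz system honestly and, since the example is two-dimensional, recovers the exact solution:

```python
import numpy as np

kappa, delta = 100.0, 1e-2
A = np.diag([1.0, kappa])
b = np.array([1.0, 1.0])
x_exact = np.array([1.0, 1.0 / kappa])

# Steepest-descent initialisation.
r0 = b.copy()
q = (r0 @ r0) / (r0 @ A @ r0)
x1 = q * r0
r1 = b - A @ x1

# CG search direction, disturbed in its second component.
d1 = r1 + ((r1 @ r1) / (r0 @ r0)) * r0 + np.array([0.0, delta])
a1 = (r1 @ r1) / (d1 @ A @ d1)
x2 = x1 + a1 * d1                      # shared perturbed iterate
r2 = b - A @ x2

# One further CG step (A-orthogonality-based update formulas).
beta = (r2 @ r2) / (r1 @ r1)
d2 = r2 + beta * d1
a2 = (r2 @ r2) / (d2 @ A @ d2)
x_cg = x2 + a2 * d2

# One further IRM-CG step (exact minimisation in span{r2, p1}).
Phi = np.column_stack([r2, a1 * d1])
coeff = np.linalg.solve(Phi.T @ A @ Phi, Phi.T @ r2)
x_irm = x2 + Phi @ coeff

err_cg = np.linalg.norm(x_cg - x_exact)    # remains O(delta)
err_irm = np.linalg.norm(x_irm - x_exact)  # roundoff level
```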
It is possible to interchange the methods because the two approaches are equivalent. Each step may be executed by CG or IRM-CG, no matter how the earlier steps were performed. If CG is used as the solution method, it is suggested that one equivalent IRM-CG step be executed after a number of steps, but before the orthogonality error becomes too large. This may be called a "refresh" rather than the traditional "restart."
The second advantage of this formulation is the natural adoption of a relaxation factor ω ∈ (0, 2), known from the successive overrelaxation method [4], which may even change at each step. Line 6 (Algorithm 2) simply becomes

$$x_{i+1} \leftarrow x_i + \omega_i p_i, \tag{13}$$

and the second term of $\bar{r}_i$ (line 9, Algorithm 2) is no longer zero; it is rather $\omega_i r_{i+1}^T p_i$. Also, the recurrence relation in (11) becomes $r_{i+1} \leftarrow r_i - \omega_i \beta_i$. Obviously, for ω ≠ 1, A-orthogonality is lost, but convergence is improved in many cases [5]. The third advantage is the very natural adoption of (multi)preconditioning-like techniques [6]. In such cases, only line 8 (Algorithm 2) is reformulated as

$$\bar{A}_i \leftarrow [p_i \;\; \phi_{2,i} \;\; \phi_{3,i} \;\; \ldots]^T A\, [p_i \;\; \phi_{2,i} \;\; \phi_{3,i} \;\; \ldots]. \tag{14}$$
The coordinate vectors are

$$\phi_{j,i} = M_j^{-1} r_{i+1}, \quad 2 \le j \le m, \tag{15}$$

where $M_j$ is a matrix chosen according to the standard approach [7], used to produce a better-conditioned system $M_j^{-1} A x = M_j^{-1} b$ equivalent to (1). However, the transformations of the CG algorithm required for such strategies are not needed here. From the IRM viewpoint, this is just another way of generating coordinate vector(s). During the solution process, these vectors can also become (exactly or approximately) linearly dependent, in which case one or several of them should be excluded.
Many possibilities for rapidly constructing $M_j^{-1}$ exist, such as (not always robust) incomplete Cholesky factorisations with different fill-ins [8], algebraic multigrid methods [9, 10], and sparse approximate inverses [11]. It is even possible to use methods that are useless as standalone solvers because they are neither convergent nor numerically stable. For smoothing purposes, forward and backward sweeps or any other promising ordering of the unknowns may be useful.
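A multipreconditioned step in the sense of (14) and (15) can be sketched as follows. The two choices of $M_j$ here, Jacobi (diagonal) and a forward Gauss-Seidel sweep, are our picks for illustration; neither is prescribed by the method:

```python
import numpy as np

def irm_multi(A, b, eps=1e-10, n_max=500):
    """IRM with preconditioning-like coordinate vectors: besides the
    previous increment p, the subspace holds M_j^{-1} r for two simple
    splittings -- Jacobi (D^{-1}) and forward Gauss-Seidel (L^{-1})."""
    d = np.diag(A)
    L = np.tril(A)                     # lower triangle incl. diagonal
    x = np.zeros(len(b))
    r = b - A @ x
    r0_norm = np.linalg.norm(r)
    q = (r @ r) / (r @ A @ r)          # steepest-descent initialisation
    p = q * r
    for _ in range(n_max):
        x = x + p
        r = b - A @ x
        if np.linalg.norm(r) <= eps * r0_norm:
            break
        phi2 = r / d                       # Jacobi: M_2^{-1} r = D^{-1} r
        phi3 = np.linalg.solve(L, r)       # Gauss-Seidel sweep: L^{-1} r
        Phi = np.column_stack([p, phi2, phi3])
        # lstsq discards (almost) dependent vectors instead of failing
        a, *_ = np.linalg.lstsq(Phi.T @ A @ Phi, Phi.T @ r, rcond=1e-12)
        p = Phi @ a
    return x
```

No transformation of the loop itself is required to add or remove a preconditioner: only the construction of `Phi` changes, which is the point made above.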
6. Illustrative Example
Consider a simple linear FEM benchmark: a cube discretised by 8-node solid elements, supported at the corners by springs of stiffness k, and loaded with a vertical force at the top. The model has 3993 unknowns (Figure 2). The condition number, calculated as the ratio of the extreme eigenvalues, is κ ≈ 8·10⁴. For such a well-conditioned system, CG and IRM-CG behave almost identically (the curves practically coincide). If the spring stiffness is greatly reduced, to 10⁻¹⁰k, then κ ≈ 3·10¹³ and IRM-CG behaves much better than CG. The processes are terminated once ε = 10⁻¹⁰ is reached. Of course, CG may be improved by preconditioning and restarting techniques. However, IRM-CG may likewise be enhanced with additional coordinate vectors, while restarting strategies are not needed at all, as previously mentioned.
Behaviour of CG and IRM-CG for well-posed and ill-posed problems.
7. Conclusion
Although general theorems and proofs about the convergence rate and stability of the algorithm are not given here, according to the results of numerical experiments with exact and floating-point arithmetic, IRM-CG should be an interesting replacement for standard or preconditioned CG. Recursive A-orthogonalisation, restarting recommendations, and the transformations necessary for preconditioning are not required; hence, the method should be very useful for solving ill-conditioned problems. Finally, the property of conjugacy, which underlies many iterative procedures and is valid only for linear systems, is not strictly necessary here. Therefore, IRM-CG can be successfully applied to nonlinear problems (including optimisation), where conjugacy is not even defined. This is vitally important, as iterative methods are used exclusively in such cases [12].
Data Availability
The Wolfram Mathematica data file used to support the findings of this study is available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was fully supported by the Croatian Science Foundation under the project IP-2014-09-2899.
References

[1] J. Dvornik, "Generalization of the CG method applied to linear and nonlinear problems," 1979, vol. 10, no. 1-2, pp. 217-223. doi: 10.1016/0045-7949(79)90089-0.
[2] J. Dvornik and D. Lazarevic, "Iterated Ritz method for solving systems of linear algebraic equations," 2017, vol. 69, no. 7, pp. 521-535.
[3] J. Dvornik and D. Lazarević, "The iterated Ritz method: basis, implementation and further development," 2018, vol. 6, no. 7, pp. 755-774.
[4] Y.-Q. Bai, Y.-P. Xiao, and W.-Y. Ma, "The preconditioned SOR iterative method for positive definite matrices," 2013, vol. 2013, Article ID 732834. doi: 10.1155/2013/732834.
[5] H. Makhduomi, B. Keshtegar, and M. Shahraki, "A comparative study of first-order reliability method-based steepest descent search directions for reliability analysis of steel structures," 2017, vol. 2017, Article ID 8643801. doi: 10.1155/2017/8643801.
[6] E. Bertolazzi and M. Frego, "Preconditioning complex symmetric linear systems," 2015, vol. 2015, Article ID 548609. doi: 10.1155/2015/548609.
[7] J. W. Demmel, Applied Numerical Linear Algebra, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, USA, 1997.
[8] E. van 't Wout, M. B. van Gijzen, A. Ditzel, A. van der Ploeg, and C. Vuik, "The deflated relaxed incomplete Cholesky CG method for use in a real-time ship simulator," 2010, vol. 1, no. 1, pp. 249-257. doi: 10.1016/j.procs.2010.04.028.
[9] C. Iwamura, F. S. Costa, I. Sbarski, A. Easton, and N. Li, "An efficient algebraic multigrid preconditioned conjugate gradient solver," 2003, vol. 192, no. 20-21, pp. 2299-2318. doi: 10.1016/s0045-7825(02)00378-x.
[10] F. H. Pereira, S. L. Lopes Verardi, and S. I. Nabeta, "A fast algebraic multigrid preconditioned conjugate gradient solver," 2006, vol. 179, no. 1, pp. 344-351. doi: 10.1016/j.amc.2005.11.115.
[11] M. Benzi, C. D. Meyer, and M. Tůma, "A sparse approximate inverse preconditioner for the conjugate gradient method," 1996, vol. 17, no. 5, pp. 1135-1149. doi: 10.1137/s1064827594271421.
[12] P. Praks and D. Brkić, "Advanced iterative procedures for solving the implicit Colebrook equation for fluid flow friction," 2018, vol. 2018, Article ID 5451034. doi: 10.1155/2018/5451034.