Residual-Based Simpler Block GMRES for Nonsymmetric Linear Systems with Multiple Right-Hand Sides

We propose in this paper a residual-based simpler block GMRES method for solving a system of linear algebraic equations with multiple right-hand sides.We show that this method ismathematically equivalent to the block GMRESmethod and thus equivalent to the simpler block GMRES method. Moreover, it is shown that the residual-based method is numerically more stable than the simpler block GMRES method. Based on the deflation strategy proposed by Calandra et al. (2013), we derive a deflation strategy to detect the possible linear dependence of the residuals and a near rank deficiency occurring in the block Arnoldi procedure. Numerical experiments are conducted to illustrate the performance of the new method.


Introduction
In this paper, we consider iterative methods for solving a system of linear algebraic equations: where  is a nonsingular matrix of order  and  = [ 1 , . . .,   ] and  = [ 1 , . . .,   ] are rectangular matrices of dimension × with  ≤ .For solving such systems, the block GMRES [1] and its variants are very popular.Block GMRES is based on the block Arnoldi process and is formally fully analogous to the ordinary GMRES algorithm by Saad and Schultz [2].
The following notation is used throughout the paper.Subscripts denote the iteration index and superscripts distinguish between individual columns in a block.We denote by ‖ ⋅ ‖ the Euclidean vector norm and the induced matrix norm and by ‖‖  the Frobenius norm.Moreover, for  ∈ R × ( > ) of rank , () =  1 ()/  () is the spectral condition number, where  1 () >   () > 0 are the extremal singular values of .
Given an initial approximation  0 ∈ R × to the solution of (1), let and then in analogy to the unblocked case, we build a sequence of iterates   ∈  0 +   (,  0 ) such that           =      −       = min where   (,  0 ) = span{ 0 ,  0 , . . .,  −1  0 }.Equation ( 3) is equivalent to minimizing every column of   , that is,       ()        2 = min subject to  ()  −  () 0 ∈   (,  0 ) , and also to the orthogonality condition  ()  ⊥   (,  0 ) ,  = 1, . . ., , where ⊥ is the orthogonality relation induced by the Euclidean inner product.Assume that   is of the form , where Y  = [ 1 ,  2 , . . .,   ] is a basis of   (,  0 ).Then, we obtain the th residual matrix Central to the usual implementations of block GMRES is the block Arnoldi process [1], which can be used to construct the orthonormal basis of   (,  0 ).In practice, the possible linear dependence of the residuals of the  systems requires an explicit reduction of the number of right-hand sides.In [3], this was called deflation.If the block residual is nearly rank deficient, block GMRES should be implemented with deflation and there are various sophisticated rank-revealing QR factorizations.For details, see [3] and the references therein.We can write the nondeflated block Arnoldi process as shown in Algorithm 1.
From Algorithm 1, we obtain formally the ordinary Arnoldi relation where the ( + 1) ×  matrix is In the block Arnoldi algorithm,      =  holds due to the QR factorizations, and      = 0 when  ̸ = , where  is a unit matrix and 0 is a zero matrix of order .This indicates that the whole process is equivalent to the one in which the block vectors are generated column by column using an ordinary modified Gram-Schmidt process.
From ( 6) and (7), we obtain the fundamental block GMRES relation where  1 is the first  column of the ( + 1) × ( + 1) unit matrix (the size changes with ),  0 is an upper triangular matrix obtained in Arnoldi's initialization step, and   is the "block coordinates" of   −  0 with respect to the block Arnoldi basis.Using (9), the least squares problem (3) is solved by recursive QR factorization of H  , updated by applying Givens rotations.Once the norm of the residual is small enough, the triangular system with the computed -factor is solved, and the approximate solution   is computed.The detailed algorithm of block GMRES can be found in [3][4][5].
The block GMRES method with deflation at each iteration was proposed in [6].And a deflation strategy was investigated to detect when a linear combination of approximate solutions is already known; for details, see [7].In this paper, we deal with a different approach and compare the situation with deflation and without deflation.Let Z  be a block basis of   (,  0 ).Instead of building a block orthogonal basis of   (,  0 ), we look for a block orthogonal basis V  = [ 1 , . . .,   ] of   (,  0 ).As a special case, simpler block GMRES (SBGMRES) was proposed by Liu and Zhong [8], where In this paper, we will consider the basis We call this case the residual-based simpler block GMRES (RB-SBGMRES).
The paper is organized as follows.In Section 2, the RB-SBGMRES and SBGMRES algorithms are described (without deflation).In Section 3, some comparison between RB-SBGMRES and SBGMRES is established.In Section 4, the RB-SBGMRES method with deflation and the corresponding algorithm RB-SBGMRES-D (Table 2) are derived.In Section 5, these algorithms are compared using test matrices taken from the Matrix Market [9].Conclusions are included in the last section.

Residual-Based Simpler Block GMRES
Suppose that Z  ≡ [ 1 , . . .,   ] is a basis of   (,  0 ).The orthogonal basis V  of   (,  0 ) is thus obtained from the QR factorization of Z  ; that is, where U  is an upper triangular matrix with order .Due to the orthogonality property   ⊥   (,  0 ), the th residual matrix   can be computed recursively as where   =     −1 .Define   = [  1 , . . .,    ]  .Since the columns of Z  form a basis of   (,  0 ), we can represent   in the form where   ∈  × is the "block coordinates" of   −  0 with respect to the block basis Z  .Due to   =  0 − Z    , Z  = V  U  , and   ⊥   (,  0 ), it follows that Hence, once the residual norm is small enough, we can solve this upper triangular system (13) and then compute the approximate solution   .
We now present the RB-SBGMRES method without deflation as shown in Algorithms 2 and 3 and Table 1.
For comparison, we also present the SBGMRES method proposed in [8].

Comparison with SBGMRES
In [8], an equivalence between SBGMRES and classical block GMRES had been established.Algorithm 2 indicates that RB-SBGMRES is equivalent to block GMRES; that is, search a solution   ∈   (,  0 ), such that   ⊥   (,  0 ).
On the other hand, for single right-hand side, it has been observed in [10] that the gap between the true residual   =  −   and the updated residual   can be strongly influenced by the conditioning of Z  , which is the basis of   (,  0 ), and the choice of the basis Z  has an effect on the conditioning of the matrix U  [10].Since we compute Table 1: Computational cost of a cycle of RB-SBGMRES (or SBGMRES). Step

Computational cost Computation of 𝑅
Computation of    the coordinates of the correction   −  0 in the basis Z  by (13), the approximate solution   becomes inaccurate as the conditioning of U  grows.Simpler GMRES [11] is, in general, less accurate than GMRES and is inherently unstable due to the choice of the basis [ 0 ,  1 , . . .,  −1 ].It is easy to formulate an analogous conclusion in the block case.Step

Computational cost Computation of 𝑅
There is a theorem about condition number of where  is the number of right-hand sides.
We have where Moreover, we get On the other hand, Thus, using the triangle inequality and the fact that the 1norm is bounded from above by the 2-norm, we have The norm of  −1  can be expressed using (11) as With The proof then follows from ‖  ‖ 2 ≤ ‖  ‖  = √.
Theorem 1 indicates that the conditioning of [ 0 /‖ 0 ‖  , V −1 ] is inversely proportional to the actual relative norm of the residual.Once residuals become small, this will lead to the ill-conditioning of the matrices Z  and the matrices U  , and SBGMRES will behave unstably after some initial residual reduction.However, from Theorem 2, the conditioning of [ 0 /‖ 0 ‖  , . . .,  −1 /‖ −1 ‖  ] is related to the intermediate decrease of the residual norms, not to the residual decrease with respect to the initial residual.For single right-hand sides, it has been observed that [ 0 /‖ 0 ‖  , . . .,  −1 /‖ −1 ‖  ] remains well conditioned while [ 0 /‖ 0 ‖  , V −1 ] becomes ill-conditioned [10].

RB-SBGMRES with Deflation
When block Krylov subspace methods are used for the solution of linear systems of equations with multiple righthand sides, the linear dependence of the residual of the  systems may occur, and this is called deflation.Deflation may be possible at startup or in a later step.Sometimes, we need to incorporate a strategy for detecting when a linear combination of systems has approximately converged.Recently, Calandra et al. derived a deflation strategy to detect a near rank deficiency occurring in the block Arnoldi procedure in [7].We provide a brief overview of the method.
Assume that the QR factorization of  0 has been performed as with Ŷ1 ∈  × having orthonormal columns and ρ0 ∈  × .
To circumvent deflation at startup, the subspace decomposition at the beginning of the cycle is derived by finding a unitary matrix F 1 ∈  × , such that with Y 1 ∈  × 1 ,  0 ∈  × 1 , and  1 +  1 = .The unitary matrix F 1 is determined by the singular value decomposition (SVD) of ρ0 = Σ  and set F 1 = .Choose a relative deflation threshold  and select  1 singular values of ρ0 such that ( ρ0 ) > .
Define  0 = 0 and   =  −1 +   , for [Y  ,  −1 ] ∈  ×(  +  ) with orthonormal columns, and assume that the following block Arnoldi relation holds at the beginning of the th iteration: with The th iteration of the deflated block Arnoldi procedure produces matrices Ŷ+1 ∈  ×  and Ĥ ∈  (  +)×  which satisfy with Ĥ having the following block structure: Defining Ŷ+1 = [Y  ,  −1 , Ŷ+1 ], (28) can be reformulated as The subspace decomposition is performed by finding a unitary matrix F +1 such that Hence, we obtain Defining H  ∈  (  +)×  as H  = F +1 Ĥ , this leads to which is the block Arnoldi relation required at the beginning of the next iteration.From (31), the unitary matrix F +1 has the following matrix structure: Defining ρ ∈  (  +)× as ρ = ( F   ρ−1 0   × ) and R = ρ − Ĥ   with R0 = ρ0 , the unitary matrix F +1 is determined by the following steps (for details, see [7]): (1) SVD of R = Σ  , with  = ( (2) QR decomposition of  ×  matrix [ +   −  ].In the case of simpler block GMRES method, the relationship   = Ŷ+1 R , which is an important ingredient for the block GMRES method, cannot be established.We must find another strategy to perform the subspace decomposition.
Assume that the QR factorization of  0 has been performed as We compute G 0 analogously as F 1 in (26), which leads to the following formulas: with  0 = G  0 ρ0 ,  1 ∈  × 1 and  0 ∈  × 1 ,  1 +  1 = .To build a block orthogonal basis of   (,  0 ), we compute the QR factorization of  1 : Assume that the block Arnoldi relation holds at the beginning of the ( + 1)th iteration of the deflated block Arnoldi procedure, with where   ,   , and   are defined as follows: Set   =  −1 −     , with   ∈  ×  , and define   ∈    × as   =     −1 .We generate  +1 by calculating the QR factorization of   , which leads to the following formulas: ρ , and G  is a unitary matrix, which is determined similarly to G 0 .
Denote by  , and  , the number of Krylov directions keeping at the th iteration of the th cycle; Algorithm 4 indicates that  +1, <  , and  +1, <  , .

Numerical Experiments
In this section, RB-SBGMRES is tested and compared with SBGMRES.The test matrices were taken from the Matrix Market [9].All computations were carried out using Matlab on a PC with the usual double precision, where the floating point relative accuracy is 2.22 × 10 −16 .In the following examples, we take  0 as the zero matrix; thus, the initial residual matrix is  0 =  −  0 = , and we set ℎ = /, where  is the order of the matrix .We plot the relative true norm of residual and condition number of U  , respectively, in Figures 1, 2 In Figures 1 and 2, we show the relative true norm of residual ‖ −   ‖  /‖‖  and condition number of U  for the SBGMRES and RB-SBGMRES, respectively.It is clearly seen from Figure 1 that the relative true norm of residual of SBGMRES may stagnate at a significantly higher level than that of RB-SBGMRES.The reason for this is that the condition number of the matrix U  of SBGMRES increases significantly faster than that of RB-SBGMRES as Figure 2 shows.
In Figure 3, we show the relative true norm of residual ‖ −   ‖  /‖‖  for the RB-SBGMRES-D and the BFGMRES-S proposed in [7], with  = 10 −9 .It is clearly seen from the figure that RB-SBGMRES-D can compete with BFGMRES-S.It is clear from Figure 4 that the condition number of the matrix U  of SBGMRES increases faster than that of RB-SBGMRES.Since the number of right-hand sides  = 5, the maximum iteration number is 48.From Figure 5, we can observe that the numerical performance of RB-SBGMRES is better than that of SBGMRES.
Figure 6 shows the relative true norm of residual ‖ −   ‖  /‖‖  for the RB-SBGMRES-D and BFGMRES-S, with  = 10 −9 .It is clearly seen that the performances of RB-SBGMRES-D and BFGMRES-S are almost same.It is clear from Figure 7 that the numerical performance of RB-SBGMRES is better than that of SBGMRES, and from Figure 8, we see that the condition number of the matrix U  of SBGMRES increases fast while that of RB-SBGMRES remains at a significantly low level.Figure 9 shows     It is also obvious from Figure 10 that the RB-SBGMRES method is slightly more accurate than SBGMRES in this example, and Figure 11 also shows that the condition number of the matrix U  of SBGMRES increases faster than that of RB-SBGMRES.It is clear from Figure 12 that the performances of RB-SBGMRES-D and BFGMRES-S are almost the same for Example 4.
In order to further verify that the condition number of the matrix U  of SBGMRES increases significantly faster than that of RB-SBGMRES, we compared a broad selection of different matrices from the Matrix Market and presented a comparison of the overall condition number trend.We select matrices, randomly, and set the same convergence threshold for two methods.We compare the iterations required to converge for two methods.It is easy to see from Table 3 that the condition number of the matrix U  of RB-SBGMRES is significantly smaller than that of SBGMRES and the number of iterations for the SBGMRES is slightly larger than that of RB-SBGMRES for most matrices.

Conclusion
In this paper, we have proposed a minimum residual method mathematically equivalent to the block GMRES method for solving systems of linear equations with multiple right-hand sides.Numerical experiments show that, after some initial reduction, the relative true norms of residual SBGMRES may stagnate at a significantly higher level than that of RB-SBGMRES.This difference is clearly caused by the choice of the basis Z  , which has an effect on the condition number of the matrix U  .Numerical experiments indicate that U  of RB-SBGMRES remains better-conditioned than U  of simpler block GMRES, which may become a very ill-conditioned triangular matrix.Since the coordinates of the correction   −  0 in the basis Z  are computed from (13), its error starts to diverge as (U  ) grows and   will become inaccurate.We see that the choice Z  = [ 0 /‖ 0 ‖  , . . .,  −1 /‖ −1 ‖  ] has a better numerical performance.In comparison with the case of deflation, we consider a deflation strategy to detect the possible linear dependence of the residuals of the  systems and a near rank deficiency occurring in the block Arnoldi procedure for RB-SBGMRES method, which was later called RB-SBGMRES-D.Numerical experiments show that the performances of RB-SBGMRES-D and BFGMRES-S are almost the same.

Example 2 .Figure 1 :
Figure 1: Example 1, the relative true residual norms versus the number of iterations.

Figure 2 :Figure 3 :
Figure 2: Example 1, the condition numbers versus the number of iterations.

Figure 4 :
Figure 4: Example 2, the relative true residual norms versus the number of iterations.

Figure 5 :
Figure 5: Example 2, the condition numbers versus the number of iterations.

Figure 6 :Example 3 .
Figure 6: Example 2, the relative true residual norms versus the number of iterations.

Figure 7 :
Figure 7: Example 3, the relative true residual norms versus the number of iterations.

Figure 8 :
Figure 8: Example 3, the condition numbers versus the number of iterations.

Example 4 .Figure 9 :
Figure 9: Example 3, the relative true residual norms versus the number of iterations.

Figure 10 :
Figure 10: Example 4, the relative true residual norms versus the number of iterations.

Figure 11 :Figure 12 :
Figure 11: Example 4, the condition numbers versus the number of iterations.