An Alternating Direction Implicit Method for Solving Projected Generalized Continuous-Time Sylvester Equations

In this paper we propose a low-rank alternating direction implicit (ADI) method to solve projected generalized continuous-time Sylvester equations with low-rank right-hand sides. Such equations arise in control theory, including the computation of inner products and H_2 norms and the model reduction of descriptor systems based on balanced truncation. The requirements of this method are moderate with respect to both computational cost and memory. Numerical experiments presented in this paper show the effectiveness of the proposed method.


Introduction
In this paper we consider the projected generalized continuous-time Sylvester equation of the form

E X B + A X F + P_{l,1} G P_{r,2} = 0,
X = P_{r,1} X P_{l,2},                                                    (1)

where A, E ∈ R^{n×n}, B, F ∈ R^{m×m}, G ∈ R^{n×m}, and X ∈ R^{n×m} is the sought-after solution. Here, P_{r,1} and P_{r,2} are the spectral projectors onto the right deflating subspaces corresponding to the finite eigenvalues of the pencils λE − A and λF − B, respectively; Q_{r,1} = I − P_{r,1} and Q_{r,2} = I − P_{r,2} are the spectral projectors onto the right deflating subspaces corresponding to the eigenvalue at infinity. The spectral projectors onto the left deflating subspaces corresponding to the finite eigenvalues of λE − A and λF − B are denoted by P_{l,1} and P_{l,2}, respectively, while Q_{l,1} = I − P_{l,1} and Q_{l,2} = I − P_{l,2} are the spectral projectors onto the left deflating subspaces corresponding to the eigenvalue at infinity. We assume that the matrices E and F are singular, but the pencils λE − A and λF − B are regular; that is, det(λE − A) and det(λF − B) are not identically zero. Under this assumption, the pencils λE − A and λF − B have Weierstrass canonical forms: there exist nonsingular n × n matrices W_1, T_1 and m × m matrices W_2, T_2 such that

E = W_1 diag(I, N^(1)) T_1,   A = W_1 diag(J^(1), I) T_1,                 (2)
F = W_2 diag(I, N^(2)) T_2,   B = W_2 diag(J^(2), I) T_2,                 (3)

where J^(1), J^(2), N^(1), and N^(2) are block diagonal matrices with each diagonal block being a Jordan block. The eigenvalues of J^(1) and J^(2) are the finite eigenvalues of the pencils λE − A and λF − B, respectively.
The nilpotent matrices N^(1) and N^(2) correspond to the eigenvalue at infinity. The index ν of nilpotency of N^(1) is called the index of the pencil λE − A. Using (2)-(3), the projectors P_{r,1}, P_{r,2}, P_{l,1}, and P_{l,2} can be expressed as

P_{r,j} = T_j^{-1} diag(I, 0) T_j,   P_{l,j} = W_j diag(I, 0) W_j^{-1},   j = 1, 2.

A number of numerical solution methods have been proposed for the standard/generalized Lyapunov and Sylvester equations. Two classical direct methods are the Bartels-Stewart method [1] and the Hammarling method [2]. These methods need to compute the real Schur forms/generalized real Schur forms of the underlying matrices/matrix pencils by means of the QR/QZ algorithm [3]. Besides direct methods, we mention, among several iterative methods, the Smith method [4], the alternating direction implicit (ADI) method [5, 6], the Smith(l) method [7], the low-rank Smith methods [7-9], the Cholesky factor-alternating direction implicit method [10-13], and the (generalized) matrix sign function method [14-25].
If m = n, B = A^T, and F = E^T, then (1) is referred to as the projected generalized continuous-time Lyapunov equation, which arises in stability analysis and control design problems for descriptor systems, including the characterization of controllability and observability properties, balanced truncation model order reduction, determining minimal and balanced realizations, and computing H_2 and Hankel norms; see [26-31] and the references therein. If the pencil λE − A is c-stable, that is, all its finite eigenvalues have negative real part, then the projected generalized Lyapunov equation has a unique solution X for each G, and if, additionally, G is symmetric and positive semidefinite, then the solution X is symmetric and positive semidefinite; see, for example, [32] for details. Recently, several numerical methods have been proposed in the literature for solving the projected generalized Lyapunov equation. In [33], two direct methods, the generalized Bartels-Stewart method and the generalized Hammarling method, were proposed for the projected generalized Lyapunov equation. The generalized Hammarling method is designed to obtain the Cholesky factor of the solution. These two methods are based on the generalized real Schur form of the pencil λE − A and require O(n^3) flops and O(n^2) memory. Iterative methods to solve the projected generalized Lyapunov equation have also been proposed. Stykel [31] extended the ADI method and the Smith method to the projected equation. Moreover, their low-rank versions were also presented, which can be used to compute low-rank approximations to the solution. These methods are especially suitable for large sparse equations with low-rank G. Another iterative method for the projected generalized Lyapunov equation is the modified generalized matrix sign function method [25]. Unlike the classical generalized matrix sign function method, this variant converges quadratically independently of the index of the underlying matrix pencil; see [25] for more details.
The projected generalized Sylvester equation (1) also plays an important role in control theory for descriptor systems.
Let G(λ) = C(λE − A)^{-1}D and H(λ) = L(λF − B)^{-1}K be two c-stable systems. As shown in [34], G(λ) and H(λ) can be decomposed into strictly proper and polynomial parts,

G(λ) = G_sp(λ) + P_1(λ),   H(λ) = H_sp(λ) + P_2(λ).

Here, G_sp(λ) and H_sp(λ) are called the strictly proper parts of G(λ) and H(λ), while P_1(λ) and P_2(λ) are called the polynomial parts of G(λ) and H(λ), respectively. Define the HL_2 inner product of G(λ) and H(λ) by

⟨G(λ), H(λ)⟩_{HL_2} = (1/2π) ∫_{−∞}^{+∞} trace(G(iω) H(−iω)) dω,

where the notation trace(·) denotes the trace of a matrix. Then, the HL_2 norm of G(λ) induced by the HL_2 inner product is

‖G(λ)‖_{HL_2} = ⟨G(λ), G(λ)⟩_{HL_2}^{1/2}.

For D ∈ R^{n×p}, C ∈ R^{p×n}, K ∈ R^{m×p}, and L ∈ R^{p×m}, it can be shown that ⟨G_sp(λ), H_sp(λ)⟩_{H_2} can be expressed as

⟨G_sp(λ), H_sp(λ)⟩_{H_2} = trace(C X K),

where X is the solution of the projected generalized continuous-time Sylvester equation (1) with G = D L. Therefore, the norm ‖G_sp(λ)‖_{H_2} can be computed by

‖G_sp(λ)‖_{H_2} = (trace(C X D))^{1/2},

where X is the solution of the following projected generalized continuous-time Sylvester equation:

E X A + A X E + P_{l,1} D C P_{r,1} = 0,
X = P_{r,1} X P_{l,1}.                                                    (10)

We now present the application of the projected generalized continuous-time Sylvester equation in model reduction of c-stable descriptor systems. Let G_pc and G_po denote the proper controllability and observability Gramians of the continuous-time descriptor system G(λ) = C(λE − A)^{-1}D, respectively. For the definitions of these Gramians and their applications in model reduction of G(λ), the reader is referred to [30]. The proper Gramians G_pc and G_po are the unique symmetric, positive semidefinite solutions of the projected generalized continuous-time controllability and observability Lyapunov equations

E G_pc A^T + A G_pc E^T + P_{l,1} D D^T P_{l,1}^T = 0,   G_pc = P_{r,1} G_pc P_{r,1}^T,
E^T G_po A + A^T G_po E + P_{r,1}^T C^T C P_{r,1} = 0,   G_po = P_{l,1}^T G_po P_{l,1},

associated with the system G(λ), respectively.
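For a standard state-space system (E = I, so the polynomial part vanishes and G = G_sp), the H_2 trace formulas above reduce to the familiar Gramian identities. The following minimal sketch (the helper `solve_sylvester_dense` and the random test matrices are our illustrative assumptions, not the paper's data) checks that the controllability- and observability-based expressions for the squared H_2 norm agree:

```python
import numpy as np

def solve_sylvester_dense(M, N, C):
    """Solve M X + X N = C via the Kronecker formulation
    vec(M X + X N) = (I (x) M + N^T (x) I) vec(X); fine for tiny examples."""
    n, m = C.shape
    K = np.kron(np.eye(m), M) + np.kron(N.T, np.eye(n))
    x = np.linalg.solve(K, C.reshape(-1, order="F"))
    return x.reshape((n, m), order="F")

rng = np.random.default_rng(0)
n, p = 6, 2
R = rng.standard_normal((n, n))
A = -(R @ R.T / n + np.eye(n))          # symmetric negative definite, hence stable
B = rng.standard_normal((n, p))         # input matrix
C = rng.standard_normal((p, n))         # output matrix

# controllability Gramian: A P + P A^T + B B^T = 0
P = solve_sylvester_dense(A, A.T, -B @ B.T)
# observability Gramian:   A^T Q + Q A + C^T C = 0
Q = solve_sylvester_dense(A.T, A, -C.T @ C)

h2_sq_c = np.trace(C @ P @ C.T)         # squared H_2 norm via P
h2_sq_o = np.trace(B.T @ Q @ B)         # squared H_2 norm via Q; the two agree
```

For large sparse problems one would of course not form the Kronecker product; the dense solve only stands in for equation (1).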
There exists another kind of Gramian, called the cross-Gramian; for a standard square system, see [35]. We now extend the definition to square descriptor systems. The proper cross-Gramian of a c-stable square descriptor system G(λ) = C(λE − A)^{-1}D is the solution X of the projected generalized continuous-time Sylvester equation (10). It can be shown that the proper cross-Gramian X satisfies

X^2 = G_pc G_po.

By making use of the relation above and an approach similar to that proposed in [35], we can design a cross-Gramian-based model reduction method for the square descriptor system G(λ).
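For a standard SISO state-space system, the cross-Gramian relation specializes to the classical identity W^2 = P Q of [35]. The sketch below (the helper `lyap_like` and the random test system are our illustrative assumptions) verifies it numerically:

```python
import numpy as np

def lyap_like(M, N, C):
    # Solve M X + X N = C via Kronecker products (small dense sketch).
    n, m = C.shape
    K = np.kron(np.eye(m), M) + np.kron(N.T, np.eye(n))
    return np.linalg.solve(K, C.reshape(-1, order="F")).reshape((n, m), order="F")

rng = np.random.default_rng(1)
n = 6
R = rng.standard_normal((n, n))
A = -(R @ R.T / n + np.eye(n))     # stable state matrix
b = rng.standard_normal((n, 1))    # single input
c = rng.standard_normal((1, n))    # single output

W = lyap_like(A, A, -b @ c)        # cross-Gramian:   A W + W A   + b c   = 0
P = lyap_like(A, A.T, -b @ b.T)    # controllability: A P + P A^T + b b^T = 0
Q = lyap_like(A.T, A, -c.T @ c)    # observability:   A^T Q + Q A + c^T c = 0
# For SISO systems, W @ W equals P @ Q
```

This is the identity that makes cross-Gramian-based model reduction possible without solving for P and Q separately.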
In this paper, we propose two iterative methods for solving the projected generalized Sylvester equation (1). The work presented here is an extension of [31]. Numerical experiments show the effectiveness of the proposed methods.
We note that a perturbation analysis for (1) has been presented in [36]. The following result, found, for example, in [37], gives sufficient conditions for the existence and uniqueness of the solution of the projected generalized continuous-time Sylvester equation (1), together with an analytic formula for it: if λ + η ≠ 0 for every finite eigenvalue λ of λE − A and every finite eigenvalue η of λF − B, then (1) has a unique solution for each G.
Moreover, if λE − A and λF − B are c-stable, that is, all their finite eigenvalues have negative real part, then X can be expressed as

X = ∫_0^∞ F_1(t) G F_2(t) dt,   F_j(t) = T_j^{-1} diag(e^{t J^(j)}, 0) W_j^{-1},   j = 1, 2.

Throughout this paper, we adopt the following notation. The identity and zero matrices are denoted by I and 0, respectively. The space of n × m real matrices is denoted by R^{n×m}. The 2-norm and the Frobenius matrix norm are denoted by ‖·‖_2 and ‖·‖_F, respectively. The superscript "·^T" denotes the transpose of a vector or a matrix. ρ(A) is the spectral radius of a square matrix A, and κ(A) = ‖A‖_2 ‖A^{-1}‖_2 is the spectral condition number of A. The open left and right half-planes are denoted by C^- and C^+, respectively. We will also adopt the MATLAB-like convention to access the entries of vectors and matrices: for a matrix A, A_(i,j) is A's (i,j)th entry, and A_(i:j,k:l) is the submatrix formed by the intersection of rows i to j and columns k to l.
The remainder of the paper is organized as follows.In Section 2, we propose the generalized low-rank alternating direction implicit method for the solution of (1).In Section 3, we discuss the choice of shift parameters.In Section 4, we show how to solve the projected Sylvester equations by the low-rank cyclic Smith method.Section 5 is devoted to some numerical tests.Some concluding remarks are given in the last section.

Alternating Direction Implicit Method
The alternating direction implicit (ADI) method was first introduced by Peaceman and Rachford [38] to solve linear systems arising from the discretization of elliptic boundary value problems and then used in [5-7, 12, 13, 31] to solve Lyapunov or Sylvester matrix equations.In this section, we consider how to generalize the ADI method for iteratively solving (1).
We always assume that the pencils λE − A and λF − B are c-stable; that is, all their finite eigenvalues have negative real part. Hence, the matrices A and B are nonsingular. Multiplying the first equation in (1) on the left by A^{-1} and on the right by B^{-1}, we get the following projected standard Sylvester equation:

A^{-1}E X + X F B^{-1} + A^{-1} P_{l,1} G P_{r,2} B^{-1} = 0,
X = P_{r,1} X P_{l,2}.                                                    (13)

The iterates X_k of the ADI iteration for (13) are usually generated by the alternating solution of two linear systems with multiple right-hand sides:

(A^{-1}E − ν_k I) X_{k−1/2} = −X_{k−1} (F B^{-1} + ν_k I) − A^{-1} P_{l,1} G P_{r,2} B^{-1},
X_k (F B^{-1} + μ_k I) = −(A^{-1}E − μ_k I) X_{k−1/2} − A^{-1} P_{l,1} G P_{r,2} B^{-1},   (14)

where X_0 = 0 and the shift parameters {μ_k}_{k=1}^{q} and {ν_k}_{k=1}^{q} are elements of C^- and C^+, respectively. These two equations are equivalent to the following single iteration step:

X_k = (A^{-1}E − μ_k I)(A^{-1}E − ν_k I)^{-1} X_{k−1} (F B^{-1} + ν_k I)(F B^{-1} + μ_k I)^{-1}
      + (ν_k − μ_k)(A^{-1}E − ν_k I)^{-1} A^{-1} P_{l,1} G P_{r,2} B^{-1} (F B^{-1} + μ_k I)^{-1}.   (15)

It follows that X_k = P_{r,1} X_k P_{l,2}; that is, the second equation in (1) and (13) is satisfied exactly. For implementation, we can rewrite iteration (15) with the constant term first:

X_k = (ν_k − μ_k)(A^{-1}E − ν_k I)^{-1} A^{-1} P_{l,1} G P_{r,2} B^{-1} (F B^{-1} + μ_k I)^{-1}
      + (A^{-1}E − μ_k I)(A^{-1}E − ν_k I)^{-1} X_{k−1} (F B^{-1} + ν_k I)(F B^{-1} + μ_k I)^{-1}.   (17)

Let X denote the exact solution of (1). Then it is easily verified that the error matrix X − X_k obeys the recursion

X − X_k = A_k (X − X_0) B_k,                                              (18)

where

A_k = ∏_{i=1}^{k} (A^{-1}E − μ_i I)(A^{-1}E − ν_i I)^{-1},                (19)
B_k = ∏_{i=1}^{k} (F B^{-1} + ν_i I)(F B^{-1} + μ_i I)^{-1}.              (20)

By using the Weierstrass canonical forms (2)-(3), we have A^{-1}E = T_1^{-1} diag((J^(1))^{-1}, N^(1)) T_1 and F B^{-1} = W_2 diag((J^(2))^{-1}, N^(2)) W_2^{-1}, so that

A_k = T_1^{-1} diag( ∏_{i=1}^{k} ((J^(1))^{-1} − μ_i I)((J^(1))^{-1} − ν_i I)^{-1},  ∏_{i=1}^{k} (N^(1) − μ_i I)(N^(1) − ν_i I)^{-1} ) T_1,

with the analogous representation for B_k. This implies that if {μ_i}_{i=1}^{k} contains all nonzero eigenvalues of A^{-1}E (multiple eigenvalues counted by their algebraic multiplicities) and {ν_i}_{i=1}^{k} contains all nonzero eigenvalues of −F B^{-1}, then X − X_k ≡ 0. This is due to the Cayley-Hamilton theorem [39], which states that p(A) ≡ 0 for the characteristic polynomial p(λ) = det(λI − A) of a matrix A. From (18), we can see that the parameters {μ_i} and {ν_i} should be chosen to achieve min_{μ_i, ν_i} ‖A_k‖_2 · ‖B_k‖_2; we will discuss the details of the choice of parameters in the next section.
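The alternating half-steps above can be sketched in a simplified setting. The code below (our names and assumptions: a standard Sylvester equation M X + X N = C, both spectra in the open left half-plane, and a single real negative shift per sweep instead of the paper's two complex shift sequences) shows the structure of one ADI pass:

```python
import numpy as np

def adi_sylvester(M, N, C, shifts, iters=24):
    """ADI sketch for M X + X N = C with spec(M), spec(N) in the open left
    half-plane and real negative shifts reused cyclically; a simplification
    of the projected, two-pencil iteration in the text."""
    n, m = C.shape
    In, Im = np.eye(n), np.eye(m)
    X = np.zeros((n, m))
    for j in range(iters):
        p = shifts[j % len(shifts)]
        # first half-step:  (M + p I) X_half = C - X (N - p I)
        Xh = np.linalg.solve(M + p * In, C - X @ (N - p * Im))
        # second half-step: X_new (N + p I) = C - (M - p I) X_half
        X = np.linalg.solve((N + p * Im).T, (C - (M - p * In) @ Xh).T).T
    return X
```

Each sweep multiplies the error by Cayley-type factors (M − pI)(M + pI)^{-1} and (N − pI)(N + pI)^{-1}, which are contractions for stable spectra and negative shifts.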
For the kth approximate solution X_k of the ADI method, by using (18)-(20), we have the following estimate.
Note that the solution X is explicitly formed by the ADI iteration (17), so the storage requirement is O(nm). One should notice that in many cases the storage requirement, rather than the amount of computation, is the limiting factor. We note that low-rank schemes are the only existing methods that can effectively solve large-scale Lyapunov/Sylvester equations. Assume that the low-rank right-hand side G has the factored form G = D C with D ∈ R^{n×r} and C ∈ R^{r×m}. Instead of explicitly forming the solution X, the low-rank method computes and stores approximate solutions in low-rank factored form. If the numerical rank r of G is much smaller than min{n, m}, that is, r ≪ min{n, m}, then the storage is reduced from O(nm) to O(nr) + O(mr).
The key idea of the low-rank version of the ADI iteration is to rewrite the iterate X_k in (17) as an outer product

X_k = Z_k D_k Y_k^T,

which is always possible since the iteration starts with the initial guess X_0 = 0_{n×m}. The low-rank alternating direction implicit (LR-ADI) method is based on (17). Replacing X_{k−1} with Z_{k−1} D_{k−1} Y_{k−1}^T, (17) can be reformulated in terms of the low-rank factors, with the new blocks of Z_k and Y_k built from the factors D and C of the right-hand side. Since Z_0, D_0, and Y_0 are all empty matrices, it can be seen that Z_k is n × kr, D_k is kr × kr, and Y_k is m × kr. Thus the rank of X_k is no more than kr. Since the order of the ADI parameters {μ_i} and {ν_i} is not important, their ordering can be reversed. As shown in [12], this yields an iterative scheme of the form

Z_k = [Z^(1) Z^(2) ⋯ Z^(k)],   Y_k = [Y^(1) Y^(2) ⋯ Y^(k)],

in which each pair of blocks Z^(i), Y^(i) is obtained from the previous pair by one shifted linear solve per side. The algorithm is summarized as Algorithm 1.
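The factored-iterate idea is easiest to see on the simplest instance, the standard Lyapunov equation. Below is a sketch in the style of the Li-White low-rank ADI (our simplification: real negative shifts, no pencils or projectors, unlike the paper's Algorithm 1); it accumulates a tall factor Z with X ≈ Z Z^T, so only n × kr numbers are ever stored:

```python
import numpy as np

def lr_adi_lyapunov(A, B, shifts, steps=18):
    """Li-White style low-rank ADI sketch for A X + X A^T = -B B^T with
    real negative shifts: returns Z such that X ~= Z Z^T."""
    n = A.shape[0]
    I = np.eye(n)
    p = shifts[0]
    V = np.linalg.solve(A + p * I, B)          # V_1 = (A + p_1 I)^{-1} B
    cols = [np.sqrt(-2.0 * p) * V]
    for i in range(1, steps):
        p_prev, p = p, shifts[i % len(shifts)]
        # V_i = [I - (p_i + p_{i-1})(A + p_i I)^{-1}] V_{i-1}
        V = V - (p + p_prev) * np.linalg.solve(A + p * I, V)
        cols.append(np.sqrt(-2.0 * p) * V)
    return np.hstack(cols)                     # low-rank factor, n x (steps * r)
```

Note that each step costs one shifted solve with r right-hand sides, and the iterate is never formed as a dense n × n matrix.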
We make a few comments on Algorithm 1.
(1) Since the computation of the Weierstrass canonical form is sensitive to small perturbations, we should instead make use of the generalized real Schur factorization to compute the spectral projectors; see, for example, [33]. For large-scale problems, the computation of the spectral projectors by the generalized real Schur factorization may be very expensive. However, in some applications the spectral projectors can be expressed in explicit form by exploiting the special block structure of the matrices E, A, F, and B; see the numerical examples in this paper or [31].
(2) If the number of shift parameters  is smaller than the number of iterations required to obtain a prescribed tolerance, then we reuse these parameters in a cyclic manner.

Choice of Shift Parameters
The selection of good shift parameters is vitally important for the successful application of the generalized LR-ADI iteration. The rate of convergence is dominated by the spectral radii of the matrices A_k and B_k given by (19)-(20), which leads to a rational min-max problem over the spectra of A^{-1}E and −F B^{-1}. To compute suboptimal ADI shift parameters for the standard Lyapunov equation with B = A^T and E = F = I, a heuristic algorithm has been proposed in [7]. It chooses suboptimal ADI parameters from a candidate set taken to be the union of the Ritz values of A and the reciprocals of the Ritz values of A^{-1}, obtained by two Arnoldi processes, one with A and one with A^{-1}. As shown in [7], the Ritz values obtained by the Arnoldi process with A tend to be located near the "outer" eigenvalues, that is, the eigenvalues near the convex hull of the spectrum. In particular, the eigenvalues of large magnitude are usually approximated well. In contrast, they are generally poor approximations to the eigenvalues near the origin. Therefore, one computes the reciprocals of the Ritz values obtained by the Arnoldi process with A^{-1} to approximate the eigenvalues near the origin.
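The greedy selection step of such a heuristic can be sketched as follows (the function name and the use of a precomputed candidate set are our assumptions; in practice the candidates would be Ritz values delivered by the Arnoldi processes just described):

```python
import numpy as np

def heuristic_shifts(cand, k):
    """Greedy Penzl-style heuristic: from a set of negative real Ritz
    values `cand`, pick k ADI shifts that approximately minimise
    max_{t in cand} prod_i |t - p_i| / |t + p_i|."""
    cand = np.asarray(cand, dtype=float)

    def s(ps, t):
        r = np.ones_like(t)
        for p in ps:
            r *= np.abs(t - p) / np.abs(t + p)
        return r

    # first shift: the candidate with the best worst-case one-shift factor
    shifts = [min(cand, key=lambda p: s([p], cand).max())]
    while len(shifts) < k:
        # next shift: the candidate where the current rational function is
        # largest, i.e. where convergence is currently slowest
        shifts.append(cand[np.argmax(s(shifts, cand))])
    return shifts
```

Since every factor |t − p| / |t + p| is below one when t and p lie in the same open half-plane, each added shift can only decrease the worst-case convergence factor over the candidate set.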
In [12], this idea has been extended to Sylvester equations. A heuristic procedure which is easy to implement has been proposed in [12]. It can also be naturally extended to the generalized problem (30). Here, we need to compute the largest and smallest (in modulus) nonzero approximate eigenvalues of A^{-1}E and −F B^{-1}, respectively.
where the candidate sets are updated at step i by deleting the previously selected values t_1, ..., t_{i−1}. End For. Algorithm 2: ADI parameters by Ritz values.
Noting that E and F are assumed to be singular, the inverses of A^{-1}E and −F B^{-1} do not exist. In [31], Stykel proposed a strategy to overcome this difficulty. Define

Â_1 = T_1^{-1} diag(J^(1), 0) W_1^{-1},   B̂_2 = T_2^{-1} diag(J^(2), 0) W_2^{-1},

where W_1, W_2, T_1, and T_2 are the transformation matrices as in (2)-(3). A simple calculation gives

Â_1 E = T_1^{-1} diag(J^(1), 0) T_1,   F B̂_2 = W_2 diag(J^(2), 0) W_2^{-1}.

Then it is clear that the reciprocals of the smallest (in modulus) nonzero eigenvalues of A^{-1}E and −F B^{-1} are the largest eigenvalues of Â_1 E and −F B̂_2, respectively. Thus, we can run two Arnoldi processes with the matrices Â_1 E and −F B̂_2 to compute approximations to the smallest nonzero eigenvalues of A^{-1}E and −F B^{-1}, respectively. In some applications, similar to the projectors P_{r,1}, P_{l,1}, P_{r,2}, and P_{l,2}, the matrices Â_1 and B̂_2 can also be obtained in explicit form by exploiting the special block structure of the matrices E, A, F, and B. The algorithm for choosing {μ_i} and {ν_i} is summarized as Algorithm 2. For more details about Algorithm 2, the interested reader is referred to [12].

Smith Method
The Smith method [4] is derived from the projected discrete-time (Stein) Sylvester equation

X = Ã X B̃ + G̃,   X = P_{r,1} X P_{l,2},

with

Ã = (A^{-1}E − μI)(A^{-1}E − νI)^{-1},
B̃ = (F B^{-1} + νI)(F B^{-1} + μI)^{-1},
G̃ = (ν − μ)(A^{-1}E − νI)^{-1} A^{-1} P_{l,1} G P_{r,2} B^{-1} (F B^{-1} + μI)^{-1},

which is equivalent to (1) for any two parameters μ ∈ C^- and ν ∈ C^+. Under this assumption, the sequence {X_k}_{k=0}^{∞} generated by the iteration X_{k+1} = Ã X_k B̃ + G̃, X_0 = 0, converges to the solution X. If all of the shifts in the generalized ADI iteration (17) are constant, that is, μ_k = μ and ν_k = ν (k = 1, 2, ...), then the generalized ADI iteration reduces to the Smith method. In [7], Penzl illustrated that the ADI iteration with a single shift (the Smith method) converges very slowly, while a moderate increase in the number of shifts l accelerates the convergence nicely. However, it is also observed that the speed of convergence is hardly improved by a further increase of l; see Table 2.1 in [7]. These observations lead to the idea of the cyclic Smith(l) iteration, a special case of ADI where l different shifts are used in a cyclic manner. End For. Algorithm 3: the generalized LR-Smith(l) method for the projected generalized Sylvester equation (1) with G = DC.
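The continuous-to-discrete equivalence can be illustrated on a standard Sylvester equation: a single Cayley shift turns M X + X N = C into a Stein equation, which the squared variant of the Smith iteration then solves, doubling the number of accumulated series terms per pass. This is a sketch under our simplifying assumptions (real shift p > 0, both spectra in C^-), not the paper's projected Algorithm 3:

```python
import numpy as np

def smith_sylvester(M, N, C, p=1.0, steps=12):
    """Solve M X + X N = C with spec(M), spec(N) in the open left
    half-plane via a Cayley transform to the Stein equation
    X = Ad X Bd + Cd, followed by the squared Smith iteration."""
    n, m = C.shape
    In, Im = np.eye(n), np.eye(m)
    Ad = np.linalg.solve(M - p * In, M + p * In)   # (M - pI)^{-1}(M + pI)
    NmInv = np.linalg.inv(N - p * Im)
    Bd = (N + p * Im) @ NmInv                      # (N + pI)(N - pI)^{-1}
    Cd = -2.0 * p * np.linalg.solve(M - p * In, C) @ NmInv
    X = Cd.copy()
    for _ in range(steps):
        # each pass doubles the partial sum of X = sum_k Ad^k Cd Bd^k
        X = X + Ad @ X @ Bd
        Ad = Ad @ Ad
        Bd = Bd @ Bd
    return X
```

Because the Cayley factors have spectral radius below one, the series converges; squaring accumulates 2^steps terms in only `steps` passes, which is the appeal of the (squared) Smith scheme.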
The low-rank scheme based on the Smith(l) iteration was also introduced in [7]. The method is called the cyclic low-rank Smith method (LR-Smith(l)) and is a special case of LR-ADI in which l shifts are reused in a cyclic manner. This idea can be generalized to (1). The generalized LR-Smith(l) method consists of two steps. First, the matrices Z_l, D_l, and Y_l are obtained by an l-step generalized LR-ADI iteration. Then one solves the discrete-time Sylvester equation

X = A_l X B_l + Z_l D_l Y_l^T,

where A_l and B_l are as in (19)-(20) with k = l. The generalized LR-Smith(l) method for solving the projected generalized Sylvester equation (1) is described as Algorithm 3.

Numerical Experiments
In this section, we present some numerical examples to illustrate the performance of the generalized LR-ADI method for the projected generalized Sylvester equation (1). Let LR-ADI and LR-Smith(l) denote Algorithms 1 and 3, respectively. In the following examples, we compare the numerical behavior of LR-ADI and LR-Smith(l) with respect to the number of iterations (Iter for short), the CPU time (CPU for short), and the relative residual ‖R_k‖_F/‖R_0‖_F (Res for short). All numerical experiments are performed on a PC in the usual double precision, where the floating point relative accuracy is 2.22 × 10^{-16}. Let mvp denote the number of matrix-vector products and lss the number of linear system solves at every iteration for the two methods.
Example 1. The matrix coefficients in (38) are sparse and have a special block structure. Using this structure, the projectors P_{l,1} and P_{r,1} onto the left and right deflating subspaces of the pencil λE − A can be computed in explicit form, where Π = I − A_12 (A_21 A_12)^{-1} A_21 and C ∈ R^{1×1975}. See Figure 1 for a graph of the convergence. The results in Table 2 clearly indicate that the LR-ADI method is more efficient than LR-Smith(3) for this example.
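The projector Π used in this example can be checked numerically. The sketch below (block sizes and the random full-rank blocks A12, A21 are illustrative assumptions, not the example's actual data) verifies that Π is idempotent, annihilates the image of A12, and is annihilated by A21:

```python
import numpy as np

rng = np.random.default_rng(3)
nf, nc = 7, 3                        # illustrative sizes, not the paper's
A12 = rng.standard_normal((nf, nc))  # assumed full column rank
A21 = rng.standard_normal((nc, nf))  # assumed full row rank

# Pi = I - A12 (A21 A12)^{-1} A21 projects onto ker(A21) along im(A12)
Pi = np.eye(nf) - A12 @ np.linalg.solve(A21 @ A12, A21)
```

Such explicit projectors avoid any Schur-form computation, which is what makes the large sparse examples tractable.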
Example 2. For the second experiment, we consider a holonomically constrained damped mass-spring system with g masses as in [25]. The ith mass is connected to the (i + 1)th mass by a spring and a damper and also to the ground by another spring and damper. Moreover, the first mass is connected to the last one by a rigid bar, and it can be influenced by a control. The vibration of this system is described by the descriptor system (38), where M_1 ∈ R^{g×g} is the symmetric positive definite mass matrix, K_1 ∈ R^{g×g} is the stiffness matrix, D_1 ∈ R^{g×g} is the damping matrix, and G_1 is the matrix of constraints. If G_1 ∈ R^{1×g} has full row rank, the pencil λE − A is of index 3, and the spectral projectors P_{l,1} and P_{r,1} can be computed in explicit form. Here Π = I − M_1^{-1} G_1^T (G_1 M_1^{-1} G_1^T)^{-1} G_1 is a projector onto the kernel of G_1 along the image of M_1^{-1} G_1^T; see [41] for details. The matrix D has the form D = [e_1, e_2, e_{g−1}] ∈ R^{n×3}, where e_i denotes the ith column of the identity matrix I_{2g+1}. Analogously, we can get the projectors P_{r,2} and P_{l,2} of the pencil λF − B. In this experiment the state space dimensions of the problems are n = 1261 and m = 1161, respectively. The matrix G in (1) is G = DC with D ∈ R^{1261×3} and C ∈ R^{3×1161}.
The computational results are reported in Figure 2 and Table 3. We note that the LR-Smith(6) method needs 21 iteration steps while LR-ADI needs 42, but the CPU time of the LR-Smith(6) method is much greater than that of the LR-ADI method. This is because the computational cost per iteration of the two methods is very different.

Conclusions
In this paper, we have proposed the generalized low-rank alternating direction implicit method and the generalized low-rank cyclic Smith method to solve the projected generalized Sylvester equations. Numerical experiments presented in this paper show the effectiveness of the proposed methods.

Table 1: Rough count of the major expenses of LR-ADI and LR-Smith(l) at every iteration. Only the major expenses are considered. From Table 1, we can compare mvp and lss at every iteration step of the two methods.

Π = I − A_12 (A_21 A_12)^{-1} A_21 is the orthogonal projector onto the kernel of A_21 along the image of A_12; see [40]. The matrices A_12 and A_21 have full rank, and the pencil λE − A is of index 2. Analogously, we can get the projectors P_{r,2} and P_{l,2} of the pencil λF − B. In our experiment the state space dimensions of the problems are n = 2295 and m = 1975, respectively. The matrix G in (1) is G = DC with D ∈ R^{2295×1} and C ∈ R^{1×1975}.

Table 3: LR-ADI versus LR-Smith(6).