A Parallel Preconditioned Modified Conjugate Gradient Method for Large Sylvester Matrix Equation

Solving large-scale Sylvester equations AX + XB + F = O is a frequent computational bottleneck in many complex control problems. In this work, a parallel preconditioned algorithm for solving such equations is proposed, based on the combination of a parameter iterative preconditioning method and a modified form of the conjugate gradient (MCG) method. Furthermore, Schur's inequality and the modified conjugate gradient method are employed to overcome the main difficulties involved, namely the determination of the parameter and the calculation of inverse matrices. Several numerical results show that the proposed parallel algorithm achieves high performance in both convergence rate and parallel efficiency.


Introduction
Sylvester equations play vital roles in a number of applications such as control theory [1], matrix eigendecompositions [2], model reduction [3][4][5], and image processing [6]. In particular, when parabolic and elliptic PDEs are solved by discretization, the problem of solving a large-scale Sylvester matrix equation is often encountered. At present there are two types of methods for solving these matrix equations. The first common one is the standard direct method proposed by Bartels and Stewart [7]; however, it is not applicable in large-scale settings. To overcome this limitation, iterative methods have been discussed, such as the simple iterative schemes proposed by Ding and Chen [8]. The ADI method was first introduced by Peaceman and Rachford [9] for solving parabolic and elliptic PDEs and was later adapted to the Sylvester equation by Wachspress [10]. Although the ADI method requires optimal parameters, these are too difficult to find for a large-scale problem. Therefore, the parameter iterative method was first proposed in [11], where a practical choice of the parameter was also given for large-scale problems. Recently, effective methods based on Krylov subspaces [12][13][14] have been discussed. In particular, the conjugate gradient (CG) method [15] is an essential one, which has the advantages of low storage, low computational cost, and suitability for parallelization. However, it applies only when the coefficient matrix is symmetric positive definite, so in other cases we can choose the modified conjugate gradient (MCG) method [16, 17]. The convergence speed of MCG is closely related to the singular values of the coefficient matrix: the more concentrated the singular values are, the faster MCG converges. To enhance the speed of convergence, the idea of preconditioning [18] is introduced, which concentrates the singular values of the coefficient matrix. All of these considerations bring about a need for efficient parallel algorithms, since the computational time is otherwise prohibitive, and this motivates us to propose a parallel preconditioned algorithm for solving large Sylvester matrix equations.
The remainder of the paper is organized as follows. Section 2 introduces the MCG algorithm for solving the large matrix equation AX + XB + F = O. Our proposed preconditioned MCG algorithm is presented in Section 3, followed by the implementation of the parallel algorithm in Section 4. The convergence analysis is given in Section 5. Finally, some numerical examples are discussed with analysis and comparison of the results, followed by concluding comments.
Throughout this paper we use the following notation: A and B denote square matrices, A^T denotes the transpose of A, and λ(A) denotes the set of eigenvalues of A.

Introduction of the MCG Algorithm for Solving AX + XB + F = O

An m × n Sylvester matrix equation takes the form AX + XB + F = O, where A ∈ R^(m×m), B ∈ R^(n×n), and F ∈ R^(m×n) are given matrices and X ∈ R^(m×n) is the unknown matrix. The MCG algorithm [15, 19, 20] can be expressed as follows.
Theorem 2. Assume that AX + XB + F = O is consistent; then, for any initial matrix X^(0), a solution of AX + XB + F = O can be obtained within finitely many iteration steps in the absence of round-off errors.
It is important to accelerate the convergence. From the matrix equation AX + XB + F = O, we know that the speed of convergence depends on how concentrated the singular values of the coefficient matrix (A ⊗ I + I ⊗ B^T) are. In practice, however, it is difficult to compute these singular values. In order to enhance the speed of convergence more easily, we propose the preconditioned MCG algorithm presented in the following section.
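As a concrete illustration, the MCG iteration for AX + XB + F = O can be sketched in NumPy as follows. This is a minimal serial sketch rather than the paper's parallel implementation; the function name is our own, and the step formulas follow the usual MCG scheme of [16, 17] (step size ‖R‖²/‖Z‖² and direction update via the adjoint operator A^T R + R B^T).

```python
import numpy as np

def mcg_sylvester(A, B, F, tol=1e-10, maxit=5000):
    """MCG sketch for A X + X B + F = 0 with operator
    L(X) = A X + X B and adjoint L*(R) = A^T R + R B^T
    (Frobenius inner product)."""
    m, n = F.shape
    X = np.zeros((m, n))                     # initial guess X^(0) = 0
    R = -(A @ X + X @ B + F)                 # residual R^(0)
    Z = A.T @ R + R @ B.T                    # search direction Z^(0)
    for _ in range(maxit):
        if np.linalg.norm(R) < tol or np.linalg.norm(Z) < tol:
            break
        alpha = np.sum(R * R) / np.sum(Z * Z)   # [R,R]/[Z,Z]
        X = X + alpha * Z
        R_new = R - alpha * (A @ Z + Z @ B)
        beta = np.sum(R_new * R_new) / np.sum(R * R)
        Z = A.T @ R_new + R_new @ B.T + beta * Z
        R = R_new
    return X
```

Only matrix products, additions, and inner products appear in the loop, which is why the method parallelizes well.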

Proposed Preconditioned MCG Algorithm
In order to choose a preconditioner for MCG, we introduce the parameter iterative method of [11] for solving (1), which transforms AX + XB + F = O into the new matrix equation ÃXB̃ − X + F̃ = O, where Ã = (A − αI)^(−1)(A + αI), B̃ = (B − αI)^(−1)(B + αI), F̃ = 2α(A − αI)^(−1)F(B − αI)^(−1), and α is a parameter. The next subsections give the methods for determining the parameter α and for computing the matrices Ã, B̃, and F̃, respectively. For the optimal parameter we have either α = 1/√(λ₁λₘ) or α = 1/√(μ₁μₙ); on the basis of Schur's inequality in [21], we determine an approximate value of α.
Through Algorithm 4, we can obtain the matrix (A − αI)^(−1). In a similar manner, the inverse of B − αI can be obtained. Therefore, the matrices Ã, B̃, and F̃ can be computed.
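The transformation itself can be sketched as follows. This is a minimal NumPy sketch with names of our own choosing; the inverses are obtained here with a direct solve, whereas the paper obtains them iteratively via Algorithm 4. The transformation rests on the identity (A + αI)X(B + αI) − (A − αI)X(B − αI) = 2α(AX + XB).

```python
import numpy as np

def precondition(A, B, F, alpha):
    """Form A~, B~, F~ of the parameter iterative transformation:
    A~ = (A - aI)^-1 (A + aI),  B~ = (B + aI)(B - aI)^-1,
    F~ = 2a (A - aI)^-1 F (B - aI)^-1, so that A X + X B + F = 0
    becomes A~ X B~ - X + F~ = 0."""
    m, n = F.shape
    Y1 = np.linalg.solve(A - alpha * np.eye(m), np.eye(m))  # (A - aI)^-1
    Y2 = np.linalg.solve(B - alpha * np.eye(n), np.eye(n))  # (B - aI)^-1
    A_t = Y1 @ (A + alpha * np.eye(m))
    B_t = (B + alpha * np.eye(n)) @ Y2       # commutes with (B - aI)^-1
    F_t = 2 * alpha * (Y1 @ F @ Y2)
    return A_t, B_t, F_t
```

Any X solving the original equation also solves the transformed one, since the transformation only multiplies by invertible factors.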

Algorithm for Solving the New Matrix Equation ÃXB̃ − X + F̃ = O

We still use the MCG algorithm to solve the matrix equation ÃXB̃ − X + F̃ = O, which is obtained from the parameter iterative method introduced above for solving (5). Thereby the preconditioned MCG is developed; it can be expressed as follows.
(1) Give an initial matrix X^(0); compute R^(0) = −(ÃX^(0)B̃ − X^(0) + F̃) and Z^(0) = Ã^T R^(0) B̃^T − R^(0).
(2) For k = 0, 1, . . ., calculate
Step 1: if ‖R^(k)‖ < ε, or ‖R^(k)‖ ≥ ε but ‖Z^(k)‖ < ε, stop; otherwise, turn to Step 2;
Step 2: α_k = [R^(k), R^(k)] / [Z^(k), Z^(k)];
Step 3: X^(k+1) = X^(k) + α_k Z^(k);
Step 4: R^(k+1) = R^(k) − α_k (ÃZ^(k)B̃ − Z^(k));
Step 5: β_k = [R^(k+1), R^(k+1)] / [R^(k), R^(k)];
Step 6: Z^(k+1) = Ã^T R^(k+1) B̃^T − R^(k+1) + β_k Z^(k); set k := k + 1 and return to Step 1.
From every step of the above algorithm, we can see that only matrix multiplication, matrix addition, and inner products are required, which shows that the algorithm has very good parallelism. Furthermore, at each iteration, the amount of computation of the preconditioned modified conjugate gradient method is the same as that of the modified conjugate gradient method; the only extra cost is that of obtaining the two inverse matrices. The numerical results will also confirm its advantage in both convergence rate and parallel efficiency.
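Putting the pieces together, the preconditioned MCG can be sketched serially as follows. The function name `pmcg` is our own, a single parameter α is assumed for both coefficient matrices, and the inverses are again taken directly for the sketch.

```python
import numpy as np

def pmcg(A, B, F, alpha, tol=1e-10, maxit=5000):
    """Preconditioned MCG sketch: transform A X + X B + F = 0 into
    A~ X B~ - X + F~ = 0, then run MCG with operator
    L(X) = A~ X B~ - X and adjoint L*(R) = A~^T R B~^T - R."""
    m, n = F.shape
    Im, In = np.eye(m), np.eye(n)
    Y1 = np.linalg.solve(A - alpha * Im, Im)   # (A - aI)^-1
    Y2 = np.linalg.solve(B - alpha * In, In)   # (B - aI)^-1
    At = Im + 2 * alpha * Y1                   # A~ = I + 2a (A - aI)^-1
    Bt = In + 2 * alpha * Y2                   # B~ = I + 2a (B - aI)^-1
    Ft = 2 * alpha * (Y1 @ F @ Y2)
    X = np.zeros((m, n))
    R = -(At @ X @ Bt - X + Ft)                # residual of new equation
    Z = At.T @ R @ Bt.T - R                    # direction Z^(0)
    for _ in range(maxit):
        if np.linalg.norm(R) < tol or np.linalg.norm(Z) < tol:
            break
        a_k = np.sum(R * R) / np.sum(Z * Z)            # Step 2
        X = X + a_k * Z                                # Step 3
        R_new = R - a_k * (At @ Z @ Bt - Z)            # Step 4
        b_k = np.sum(R_new * R_new) / np.sum(R * R)    # Step 5
        Z = At.T @ R_new @ Bt.T - R_new + b_k * Z      # Step 6
        R = R_new
    return X
```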

Parallel Implementation of Proposed Algorithm
For convenience of description, let the number of processors be p and n = ps, where s is an integer; P_i (i = 1, 2, . . ., p) denotes the ith processor. Partition the matrices so that A_i, Ã_i, B_i, B̃_i, I_i, D_i, E_i, F_i, X^(k)_i, R^(k)_i, and Z^(k)_i are s × n subblock matrices and C_i is an n × s subblock matrix (i = 1, 2, . . ., p). Because the subblocks Ã_i and C_i of Ã need to be stored on P_i (i = 1, 2, . . ., p), the data of Ã will be redistributed during the preconditioning process; the details are given there. In addition, since the storage manner directly influences how matrix multiplication is parallelized, two parallel matrix-multiplication algorithms proposed in [22] are used in this paper: Multiplication 1, matrix multiplication with block row-row storage; Multiplication 2, matrix multiplication with block row-column storage.
Its detailed implementation can be expressed as follows.
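To illustrate the storage scheme, the following single-process NumPy sketch simulates Multiplication 1 (block row-row storage): processor P_i holds the row blocks A_i and B_i, and C_i = A_i B is accumulated as Σ_j A_ij B_j, where A_ij is the jth column slice of A_i. In a real MPI implementation each row block B_j would be communicated (e.g., all-gathered) between processors; the details of [22] are not reproduced here, so this is only a sketch of the data layout, assuming p divides n.

```python
import numpy as np

def rowrow_multiply(A, B, p):
    """Simulate Multiplication 1: A and B both stored by row blocks
    over p processors; C_i = sum_j A_{ij} B_j, with A_{ij} the j-th
    column slice of row block A_i."""
    n = A.shape[0]
    s = n // p                                        # block size, n = p*s
    A_blocks = [A[i*s:(i+1)*s, :] for i in range(p)]  # row block on P_i
    B_blocks = [B[j*s:(j+1)*s, :] for j in range(p)]  # row block on P_j
    C_blocks = []
    for i in range(p):
        Ci = np.zeros((s, B.shape[1]))
        for j in range(p):                  # B_j "received" from P_j
            Ci += A_blocks[i][:, j*s:(j+1)*s] @ B_blocks[j]
        C_blocks.append(Ci)
    return np.vstack(C_blocks)
```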

Convergent Analysis
Let the matrices A and B both be symmetric positive definite, and let λ_i (i = 1, 2, . . ., m) and μ_j (j = 1, 2, . . ., n) be the eigenvalues of A and B, respectively. For the matrix equation AX + XB + F = O, the following comparison indicates that our algorithm is more effective.

Numerical Examples and Results Analysis
In this section, three methods (the parameter iterative method, the MCG algorithm, and the proposed algorithm, PMCG) are compared on two examples. The parallel platform is a Lenovo Shenteng 1800 cluster. In the following test cases, X^(0) = O and the termination tolerance is ε = 10^(−10).
Example 7. Consider an elliptic partial differential equation. Let the step size be h = 1/1001, and use the five-point difference scheme to discretize the equation, which finally transforms it into a Sylvester matrix equation of the form (1). Since the eigenvalues of A and B are all greater than zero, we take α = −√m/‖A‖_F = −0.3015 and α = −√n/‖B‖_F = −0.4083, respectively. The numerical results are shown in Table 1.

In the second example, the eigenvalues of A and B include both negative and positive real numbers, so there are four choices for α: α = −√m/‖A‖_F = −0.0621, α = √m/‖A‖_F = 0.0621, α = −√n/‖B‖_F = −0.0314, and α = √n/‖B‖_F = 0.0314. The numerical results are shown in Table 2.
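The setup of such a test case can be sketched as follows. A small m replaces 1000 here, and the parameter formula α = −√m/‖A‖_F is our reading of the text (motivated by Schur's inequality, Σ λ_i² ≤ ‖A‖_F², so a typical eigenvalue magnitude is about ‖A‖_F/√m); it should be treated as an assumption.

```python
import numpy as np

def second_difference(m, h):
    """Tridiagonal second-difference matrix for -u'' on a uniform grid;
    the 2D five-point scheme for u_xx + u_yy gives A X + X B = F with
    A and B of this form."""
    T = np.zeros((m, m))
    for i in range(m):
        T[i, i] = 2.0 / h**2
        if i > 0:
            T[i, i-1] = -1.0 / h**2
        if i < m - 1:
            T[i, i+1] = -1.0 / h**2
    return T

m = 10
h = 1.0 / (m + 1)
A = second_difference(m, h)       # eigenvalues are all positive
B = second_difference(m, h)
# Assumed parameter estimate via Schur's inequality:
alpha_A = -np.sqrt(m) / np.linalg.norm(A, 'fro')
alpha_B = -np.sqrt(m) / np.linalg.norm(B, 'fro')
```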

Results Analysis.
From the results shown in the above tables, we observed the following.
(i) Compared with the modified conjugate gradient method and the parameter iterative method, the iteration count and the computational time of our algorithm are significantly reduced, which shows that preconditioning enhances its validity and convergence rate. Obviously, better parallel efficiency is also obtained with our algorithm.

(ii) When the eigenvalues of A and B are all positive real numbers (as in Example 7), the preconditioner takes the form Ã = (A − αI)^(−1)(A + αI) = I + 2α(A − αI)^(−1), B̃ = (B − αI)^(−1)(B + αI) = I + 2α(B − αI)^(−1), F̃ = 2α(A − αI)^(−1)F(B − αI)^(−1), (6) where α is a parameter. The speed of convergence depends on the singular-value distribution of (Ã ⊗ B̃)^T, since Ã = (A − αI)^(−1)(A + αI) and B̃ = (B − αI)^(−1)(B + αI). It is then simple to obtain the best convergence effect once an optimal value of α has been determined. However, the two inverses of A − αI and B − αI must be computed to obtain the matrices Ã and B̃. Letting Y₁ = (A − αI)^(−1) and Y₂ = (B − αI)^(−1), we can find Y₁ and Y₂ by solving the matrix equations (A − αI)Y₁ = I and (B − αI)Y₂ = I.
Determination of the Parameter α and the Inverse of the Matrix (A − αI). If the negative real numbers λ₁ ≥ λ₂ ≥ ⋅⋅⋅ ≥ λₘ are the eigenvalues of A and the negative real numbers μ₁ ≥ μ₂ ≥ ⋅⋅⋅ ≥ μₙ are the eigenvalues of B, then α is chosen based on the optimal parameter measurement in [11]. To obtain (A − αI)^(−1), we solve (A − αI)Y₁ = I; we use the MCG algorithm to solve the matrix equations (A − αI)Y₁ = I and (B − αI)Y₂ = I. The specific algorithm is given as follows. Algorithm 4. (1) Give an initial matrix Y₁^(0).
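A minimal sketch of Algorithm 4 follows: the matrix equation MY = I with M = A − αI is solved by the same MCG iteration as before (operator L(Y) = MY, adjoint M^T R), which yields Y = M^(−1) without a direct inversion. The function name is our own.

```python
import numpy as np

def mcg_inverse(M, tol=1e-10, maxit=5000):
    """Algorithm 4 sketch: solve M Y = I by MCG, with M = A - alpha*I,
    so that Y approximates M^{-1}."""
    m = M.shape[0]
    Y = np.zeros((m, m))                 # initial matrix Y^(0)
    R = np.eye(m) - M @ Y                # residual R^(0) = I - M Y
    Z = M.T @ R                          # direction Z^(0)
    for _ in range(maxit):
        if np.linalg.norm(R) < tol or np.linalg.norm(Z) < tol:
            break
        a = np.sum(R * R) / np.sum(Z * Z)
        Y = Y + a * Z
        R_new = R - a * (M @ Z)
        b = np.sum(R_new * R_new) / np.sum(R * R)
        Z = M.T @ R_new + b * Z
        R = R_new
    return Y
```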

Table 1 :
Numerical results of Example 7.
From the previously mentioned theorem, we can observe that the proposed preconditioned MCG is better than the traditional MCG when λ₁/λₘ and μ₁/μₙ are smaller, that is, when the singular values of A and B are more concentrated.