In this paper, we develop a modified gradient based algorithm for solving the matrix equation AXB + CX^T D = F. In contrast to the gradient based method introduced by Xie et al. (2010), the information generated in the first half-iterative step is fully exploited to construct the approximate solution. Theoretical analysis shows that the new method converges under certain assumptions, and numerical results verify its efficiency.

1. Introduction

Consider a linear matrix equation of the following form:
(1) AXB + CX^T D = F,
where A ∈ ℛ^{r×m}, B ∈ ℛ^{n×s}, C ∈ ℛ^{r×n}, D ∈ ℛ^{m×s}, and F ∈ ℛ^{r×s} are given constant matrices and X ∈ ℛ^{m×n} is the unknown matrix to be determined. The Sylvester-type equation AX + X^T B = F is the special case of (1) with r = s = n and B = C = I_n (with D renamed B), where I_n denotes the n×n identity matrix. Such problems arise frequently in control and system theory [1], stability of linear systems [2], analysis of bilinear systems [3], power systems [4], signal and image processing [5], and so forth.

The exact solutions of matrix equations, such as Lyapunov and Sylvester matrix equations, can be obtained by matrix inversion using the Kronecker product. The drawback of this approach is its considerable computational cost and storage requirement, so it is applicable only to small-sized problems. Some direct methods have also been proposed in [6–9]; these are based on transforming the coefficient matrices into Schur or Hessenberg form, after which the original equation can be solved by backward substitution.

In the numerical linear algebra community, iterative methods are becoming more and more popular. Several iterative schemes for Sylvester equations have been proposed; see, for example, [10–15]. Recently, some efficient gradient based and least squares based iterative algorithms for solving generalized Sylvester equations and coupled (general coupled) Sylvester equations have been presented [16–28]. The basic idea of these approaches is a hierarchical identification principle [16–18], which regards the unknown matrix as the system parameter matrix to be identified and then constructs a recursive formula to approximate the unknown solution. In particular, for general linear matrix equations of form (1), it is shown in [3, 4] that the unknown matrix can be computed by a gradient based iterative algorithm, and the convergence properties of the method are investigated in [3]. In this paper, a modified gradient based iterative algorithm is proposed for solving linear matrix equations of form (1). In the modified method, the information generated in the first half-iterative step is fully exploited to construct the approximate solution. The convergence condition of the method is analyzed, and its numerical performance is compared with the algorithms in [3, 4]. Numerical results show that the new method is efficient and robust.

The paper is organized as follows. In Section 2, the gradient based iterative method is recalled, and the modified gradient based method is introduced and analyzed in Section 3. In Section 4, performance of the modified gradient based method is compared with the original one. Finally, we conclude the paper in Section 5.

2. A Brief Review of the Gradient Based Iterative Method

We first recall an iterative method proposed by Xie et al. [3] for solving (1). The basic idea is to regard (1) as the following two linear matrix equations:
(2) AXB = F − CX^T D,   CX^T D = F − AXB.
Then, define two recursive sequences as follows:
(3) X_k^{(1)} = X_{k−1} + μ A^T (F − A X_{k−1} B − C X_{k−1}^T D) B^T,
(4) X_k^{(2)} = X_{k−1} + μ D (F − A X_{k−1} B − C X_{k−1}^T D)^T C,
where μ is the iterative step size. These can be regarded as two separate iterative procedures for solving the two matrix equations in (2).

With X_k^{(1)} and X_k^{(2)} at hand, the kth approximate solution X_k can then be defined as the average of the two approximate solutions; that is,
(5) X_k = (X_k^{(1)} + X_k^{(2)})/2.
By selecting an appropriate initial approximation X_0 and substituting X_{k−1} for X_{k−1}^{(1)} in (3) and for X_{k−1}^{(2)} in (4), formulas (3)–(5) constitute the gradient based iterative method proposed in [3]. It is shown in [3] that the gradient based iterative algorithm converges as long as
(6) 0 < μ ≤ 2 / [λ_max(AA^T) λ_max(B^T B) + λ_max(CC^T) λ_max(D^T D)],
where λ_max(AA^T) denotes the largest eigenvalue of AA^T.

Numerical experiments indicate that the GBI algorithm is computationally efficient. However, it has some limitations: its convergence rate is slow, and stagnation can occur for ill-conditioned problems. Moreover, the authors of [3] pointed out that how to choose a best convergence factor is a subject that deserves further research. In this paper, we give an explicit convergence factor and then propose a modified algorithm for solving the linear matrix equation (1).

3. A Modified Gradient Based Iterative Algorithm

The above GBI process can be accomplished by the following algorithm.

Algorithm 1 (see [3]).

The gradient based iterative algorithm (GBI algorithm).

Give two initial approximate solutions X_0^{(1)} and X_0^{(2)}.

for k = 1, 2, …, until convergence

    X_{k−1} = (X_{k−1}^{(1)} + X_{k−1}^{(2)})/2

    X_k^{(1)} = X_{k−1} + μ A^T (F − A X_{k−1} B − C X_{k−1}^T D) B^T

    X_k^{(2)} = X_{k−1} + μ D (F − A X_{k−1} B − C X_{k−1}^T D)^T C

end.
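For concreteness, the GBI loop can be sketched in a few lines of NumPy (a minimal sketch, not the authors' code; the zero initial guesses and the toy problem 2X + X^T = F, i.e. A = 2I and B = C = D = I_2, are our own choices):

```python
import numpy as np

def gbi(A, B, C, D, F, mu, iters):
    """GBI iteration of Algorithm 1 for A X B + C X^T D = F."""
    m, n = A.shape[1], B.shape[0]
    X1 = np.zeros((m, n))                    # X_k^(1)
    X2 = np.zeros((m, n))                    # X_k^(2)
    for _ in range(iters):
        Xk = (X1 + X2) / 2                   # average the two half-iterates
        R = F - A @ Xk @ B - C @ Xk.T @ D    # common residual
        X1 = Xk + mu * (A.T @ R @ B.T)       # half-step for A X B = F - C X^T D
        X2 = Xk + mu * (D @ R.T @ C)         # half-step for C X^T D = F - A X B
    return (X1 + X2) / 2

# Toy problem (our own test data): 2X + X^T = F with known solution.
X_true = np.array([[1., 2.], [3., 4.]])
I2 = np.eye(2)
F = 2 * X_true + X_true.T
X = gbi(2 * I2, I2, I2, I2, F, mu=0.3, iters=300)
```

For this toy problem the bound (6) reads 0 < μ ≤ 2/(4·1 + 1·1) = 0.4, so μ = 0.3 is admissible.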

At the step where X_k^{(2)} is computed, the newer approximation X_k^{(1)} is already available. Hence, we can use the information in X_k^{(1)} to update X_{k−1}, which yields the following modification of the GBI algorithm.

Algorithm 2.

The modified gradient based iterative algorithm (MGBI algorithm).

Give two initial approximate solutions X_0^{(1)} and X_0^{(2)}.

for k = 1, 2, …, until convergence

    X_{k−1} = (X_{k−1}^{(1)} + X_{k−1}^{(2)})/2

    X_k^{(1)} = X_{k−1} + μ A^T (F − A X_{k−1} B − C X_{k−1}^T D) B^T

    X_{k−1} = (X_k^{(1)} + X_{k−1}^{(2)})/2

    X_k^{(2)} = X_{k−1} + μ D (F − A X_{k−1} B − C X_{k−1}^T D)^T C

end.
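The modification amounts to one extra averaging and residual evaluation per sweep. A minimal NumPy sketch (again our own illustration, on the same toy problem 2X + X^T = F as for the GBI sketch):

```python
import numpy as np

def mgbi(A, B, C, D, F, mu, iters):
    """MGBI iteration of Algorithm 2: re-average with the fresh X_k^(1)."""
    m, n = A.shape[1], B.shape[0]
    X1 = np.zeros((m, n))
    X2 = np.zeros((m, n))
    for _ in range(iters):
        Xk = (X1 + X2) / 2
        R = F - A @ Xk @ B - C @ Xk.T @ D
        X1 = Xk + mu * (A.T @ R @ B.T)
        Xk = (X1 + X2) / 2                   # the modification: reuse the new X1
        R = F - A @ Xk @ B - C @ Xk.T @ D    # residual at the updated point
        X2 = Xk + mu * (D @ R.T @ C)
    return (X1 + X2) / 2

# Same toy problem as before (our own test data): 2X + X^T = F.
X_true = np.array([[1., 2.], [3., 4.]])
I2 = np.eye(2)
F = 2 * X_true + X_true.T
X = mgbi(2 * I2, I2, I2, I2, F, mu=0.3, iters=300)
```

Here μ = 0.3 satisfies (9), since min{2/(4·1), 2/(1·1)} = 0.5 for this toy problem.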

Let B ∈ ℛ^{n×s}, D = [d_1, d_2, …, d_s] with d_i ∈ ℛ^m, X = [x_1, x_2, …, x_n], vec(X) = [x_1^T, …, x_n^T]^T, and
(7) S = B^T ⊗ A + [C ⊗ d_1^T; C ⊗ d_2^T; ⋮; C ⊗ d_s^T] ∈ ℛ^{rs×mn}.

Lemma 3 (see [3]).

The matrix equation (1) has a unique solution if and only if rank[S, vec(F)] = rank[S] = mn; in this case, the unique solution is given by
(8) vec(X) = (S^T S)^{−1} S^T vec(F),
and the corresponding homogeneous equation AXB + CX^T D = 0 has only the zero solution X = 0.
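Lemma 3 can be checked numerically. The sketch below assembles S block by block as in (7) and then evaluates (8); vec(·) stacks columns (NumPy order='F'), matching the definition above. The 2×2 test data are our own:

```python
import numpy as np

def build_S(A, B, C, D):
    """Assemble S = B^T ⊗ A + [C ⊗ d_1^T; ...; C ⊗ d_s^T] as in (7)."""
    r = A.shape[0]
    S = np.kron(B.T, A)
    for j in range(D.shape[1]):              # one r-row block per column d_j of D
        S[j * r:(j + 1) * r, :] += np.kron(C, D[:, j][None, :])
    return S

def direct_solution(A, B, C, D, F):
    """vec(X) = (S^T S)^{-1} S^T vec(F), as in (8); vec stacks columns."""
    S = build_S(A, B, C, D)
    x = np.linalg.solve(S.T @ S, S.T @ F.flatten(order="F"))
    return x.reshape((A.shape[1], B.shape[0]), order="F")

# Toy data (ours): 2X + X^T = F, i.e. A = 2I and B = C = D = I_2.
X_true = np.array([[1., 2.], [3., 4.]])
I2 = np.eye(2)
X_sol = direct_solution(2 * I2, I2, I2, I2, 2 * X_true + X_true.T)
```

For this toy problem S = 2I_4 + K, where K is the commutation matrix with eigenvalues ±1, so S is nonsingular and (8) recovers the exact solution.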

The following result gives the convergence conditions of Algorithm 2.

Theorem 4.

If the linear matrix equation (1) has a unique solution X and
(9) 0 < μ < min{ 2/[λ_max(AA^T) λ_max(B^T B)], 2/[λ_max(CC^T) λ_max(D^T D)] },
then the iterative sequence X_k generated by Algorithm 2 converges to X, that is, lim_{k→∞} X_k = X; equivalently, the error E_k = X_k − X converges to zero for any initial value X_0.

Proof.

Throughout the proof, ‖·‖ denotes the Frobenius norm. For clarity, we write X̂_{k−1} for the re-averaged iterate X_{k−1} produced in the fifth line of Algorithm 2. Define the error matrices
(10) E_k = X_k − X,  E_k^{(1)} = X_k^{(1)} − X,  E_k^{(2)} = X_k^{(2)} − X,  Ê_k = X̂_k − X,
ξ_k = A E_{k−1} B,  η_k = C E_{k−1}^T D,  δ_k = A Ê_{k−1} B,  ζ_k = C Ê_{k−1}^T D.
By using (1) and (10), it is straightforward to verify that
(11) E_k^{(1)} = E_{k−1} + μ A^T(−ξ_k − η_k) B^T,  E_k^{(2)} = Ê_{k−1} + μ D(−δ_k^T − ζ_k^T) C.
Taking the Frobenius norm on both sides of (11), it follows that
(12) ‖E_k^{(1)}‖² = tr[(E_k^{(1)})^T E_k^{(1)}]
= ‖E_{k−1}‖² + 2μ tr{E_{k−1}^T A^T[−ξ_k − η_k] B^T} + μ² ‖A^T[−ξ_k − η_k] B^T‖²
≤ ‖E_{k−1}‖² + 2μ tr{ξ_k^T[−ξ_k − η_k]} + μ² λ_max(AA^T) λ_max(B^T B) ‖ξ_k + η_k‖²,
‖E_k^{(2)}‖² = tr[(E_k^{(2)})^T E_k^{(2)}]
≤ ‖Ê_{k−1}‖² + 2μ tr{[−δ_k^T − ζ_k^T] ζ_k} + μ² λ_max(CC^T) λ_max(D^T D) ‖δ_k + ζ_k‖².
From E_k = [E_k^{(1)} + E_k^{(2)}]/2, we have
(13) ‖E_k‖² ≤ [‖E_k^{(1)}‖² + ‖E_k^{(2)}‖²]/2
≤ ‖E_{k−1}‖²/2 + μ tr{ξ_k^T[−ξ_k − η_k]} + (μ²/2) λ_max(AA^T) λ_max(B^T B) ‖ξ_k + η_k‖²
+ ‖Ê_{k−1}‖²/2 + μ tr{[−δ_k^T − ζ_k^T] ζ_k} + (μ²/2) λ_max(CC^T) λ_max(D^T D) ‖δ_k + ζ_k‖²
≤ ‖E_{k−1}‖²/2 − μ[1 − (μ/2) λ_max(AA^T) λ_max(B^T B)] ‖ξ_k + η_k‖²
+ ‖Ê_{k−1}‖²/2 − μ[1 − (μ/2) λ_max(CC^T) λ_max(D^T D)] ‖δ_k + ζ_k‖²
≤ ‖E_0‖²/2^k − μ[1 − (μ/2) λ_max(AA^T) λ_max(B^T B)] Σ_{i=1}^{k} ‖ξ_i + η_i‖²
+ Σ_{i=1}^{k} ‖Ê_{k−i}‖²/2^i − μ[1 − (μ/2) λ_max(CC^T) λ_max(D^T D)] Σ_{i=1}^{k} ‖δ_i + ζ_i‖².
Obviously, Σ_{i=1}^{k} ‖Ê_{k−i}‖²/2^i < ∞. In fact, the iterative sequence X̂_k, k = 0, 1, …, generated by Algorithm 2 can also be viewed as a sequence generated by the double-side iteration in [3], so lim_{k→∞} Ê_k = 0. Since 0 < μ < min{2/[λ_max(AA^T) λ_max(B^T B)], 2/[λ_max(CC^T) λ_max(D^T D)]}, we have
(14) Σ_{i=1}^{∞} ‖ξ_i + η_i‖² < ∞,  Σ_{i=1}^{∞} ‖δ_i + ζ_i‖² < ∞.
It follows that
(15) ξ_k + η_k → 0 as k → ∞,
or
(16) A E_{k−1} B + C E_{k−1}^T D → 0 as k → ∞.
According to Lemma 3, we have Ek-1→0 as k→∞.

4. Numerical Experiments

Example 1.

Consider the matrix equation AXB+CXTD=F with
(17) A = [2 5; 4 −7],  B = [6 −3; 1 2],  C = [1 2; −1 3],  D = [4 3; 2 1],  F = [317 9; 41 27],
where rows are separated by semicolons.
From (8), the exact solution is
(18) X = [7 5; 4 3].

The coefficient matrices in this example are taken from [3]. Taking X_0 = 10^{−6} I_2, we apply the GBI and MGBI algorithms to compute X_k; the convergence factor μ is set to 2/[λ_max(AA^T) λ_max(BB^T) + λ_max(CC^T) λ_max(DD^T)] = 1/1983.1 in the GBI algorithm and to min{2/[λ_max(AA^T) λ_max(BB^T)], 2/[λ_max(CC^T) λ_max(DD^T)]} = 1/1787.6 in the MGBI algorithm. The relative error δ := ‖X_k − X‖/‖X‖ is recorded and plotted in Figure 1 using the MATLAB command semilogy. The figure shows that the MGBI algorithm converges faster than the GBI algorithm.

Figure 1: Comparison of convergence curves.
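A reader can reproduce this experiment along the following lines (a NumPy sketch, not the authors' MATLAB code; the matrices are those of (17)–(18), and the step sizes are computed exactly as quoted above):

```python
import numpy as np

def gbi_step(A, B, C, D, F, Xk, mu):
    """One pair of half-updates (3)-(4) evaluated at the current iterate Xk."""
    R = F - A @ Xk @ B - C @ Xk.T @ D
    return Xk + mu * (A.T @ R @ B.T), Xk + mu * (D @ R.T @ C)

A = np.array([[2., 5.], [4., -7.]]); B = np.array([[6., -3.], [1., 2.]])
C = np.array([[1., 2.], [-1., 3.]]); D = np.array([[4., 3.], [2., 1.]])
F = np.array([[317., 9.], [41., 27.]]); X_true = np.array([[7., 5.], [4., 3.]])

lam = lambda M: np.linalg.eigvalsh(M @ M.T).max()            # lambda_max(M M^T)
mu_gbi = 2 / (lam(A) * lam(B) + lam(C) * lam(D))             # approx 1/1983.1
mu_mgbi = min(2 / (lam(A) * lam(B)), 2 / (lam(C) * lam(D)))  # approx 1/1787.6

X1 = X2 = 1e-6 * np.eye(2)        # GBI iterates,  X_0 = 10^{-6} I_2
Y1 = Y2 = 1e-6 * np.eye(2)        # MGBI iterates, X_0 = 10^{-6} I_2
for _ in range(5000):
    Xk = (X1 + X2) / 2                                   # GBI: one averaging
    X1, X2 = gbi_step(A, B, C, D, F, Xk, mu_gbi)
    Yk = (Y1 + Y2) / 2                                   # MGBI: first half-step
    Y1, _ = gbi_step(A, B, C, D, F, Yk, mu_mgbi)
    Yk = (Y1 + Y2) / 2                                   # re-average with fresh Y1
    _, Y2 = gbi_step(A, B, C, D, F, Yk, mu_mgbi)

err_gbi = np.linalg.norm((X1 + X2) / 2 - X_true) / np.linalg.norm(X_true)
err_mgbi = np.linalg.norm((Y1 + Y2) / 2 - X_true) / np.linalg.norm(X_true)
```

The two computed step sizes reproduce the values 1/1983.1 and 1/1787.6 quoted in the text.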

Remark. The choice of the convergence factor μ is an important issue, and we study its influence on the convergence experimentally. The effect of varying μ for the MGBI algorithm in Example 1 is illustrated in Figure 2. We see that
(19) μ = min{2/[λ_max(AA^T) λ_max(BB^T)], 2/[λ_max(CC^T) λ_max(DD^T)]} = 1/1787.6
is a better convergence factor. However, the convergence factor is problem dependent, so seeking a best convergence factor remains a difficult task.

Figure 2: Comparison of convergence curves.

Example 2.

Suppose that AX + X^T B = F, where
(20) A = [1 1; 2 −1],  B = [1 −1; 1 1],  F = [8 8; 5 2].
Then the solution obtained from (8) is
(21) X = [1 2; 3 4].
The coefficient matrices in this example are taken from [4]. Taking X_0 = 10^{−6} I_2, we apply the GBI and MGBI algorithms to compute X_k; the convergence factor μ is set to 2/[λ_max(AA^T) + λ_max(BB^T)] = 0.27 in the GBI algorithm and to min{2/λ_max(AA^T), 2/λ_max(BB^T)} = 0.377 in the MGBI algorithm. The relative error δ := ‖X_k − X‖/‖X‖ is recorded in Figure 3. Again, the MGBI algorithm converges faster than the GBI algorithm.

Figure 3: Comparison of convergence curves.
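The data and step sizes of this example are easy to check directly (a quick NumPy verification, with the matrices as given in (20)–(21)):

```python
import numpy as np

A = np.array([[1., 1.], [2., -1.]])
B = np.array([[1., -1.], [1., 1.]])
F = np.array([[8., 8.], [5., 2.]])
X = np.array([[1., 2.], [3., 4.]])

# X solves A X + X^T B = F, i.e. (1) with the middle coefficients equal to I_2.
assert np.allclose(A @ X + X.T @ B, F)

lam_A = np.linalg.eigvalsh(A @ A.T).max()   # lambda_max(A A^T) = (7 + sqrt(13))/2
lam_B = np.linalg.eigvalsh(B @ B.T).max()   # lambda_max(B B^T) = 2
mu_gbi = 2 / (lam_A + lam_B)                # approx 0.27, as quoted for GBI
mu_mgbi = min(2 / lam_A, 2 / lam_B)         # approx 0.377, as quoted for MGBI
```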

The effect of varying the convergence factor μ for the MGBI algorithm in Example 2 is illustrated in Figure 4.

Figure 4: Comparison of convergence curves.

5. Conclusions

In this paper, a modified gradient based iterative (MGBI) method has been proposed for linear matrix equations of the form (1), and its convergence has been analyzed. The choice of the parameter μ is an important issue, and its influence has been studied experimentally. The principal idea of this paper can be extended to more general settings, such as generalized (coupled) Sylvester matrix equations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] R. Bhatia and P. Rosenthal, "How and why to solve the operator equation AX − XB = Y."
[2] B. N. Datta.
[3] L. Xie, Y. Liu, and H. Yang, "Gradient based and least squares based iterative algorithms for matrix equations AXB + CX^T D = F."
[4] L. Xie, J. Ding, and F. Ding, "Gradient based iterative solutions for general linear matrix equations."
[5] J. Ding, Y. Liu, and F. Ding, "Iterative solutions to matrix equations of the form A_i X B_i = F_i."
[6] G. H. Golub, S. Nash, and C. Van Loan, "A Hessenberg-Schur method for the problem AX + XB = C."
[7] R. H. Bartels and G. W. Stewart, "Algorithm 432: solution of the matrix equation AX + XB = C."
[8] D. C. Sorensen and Y. Zhou, "Direct methods for matrix Sylvester and Lyapunov equations."
[9] W. H. Enright, "Improving the efficiency of matrix operations in the numerical solution of stiff ordinary differential equations."
[10] D. Calvetti and L. Reichel, "Application of ADI iterative methods to the restoration of noisy images."
[11] D. Y. Hu and L. Reichel, "Krylov-subspace methods for the Sylvester equation."
[12] P. Kirrinnis, "Fast algorithms for the Sylvester equation AX − XB^T = C."
[13] G. Starke and W. Niethammer, "SOR for AX − XB = C."
[14] Q. Niu, X. Wang, and L.-Z. Lu, "A relaxed gradient based algorithm for solving Sylvester equations."
[15] Z.-Z. Bai, "On Hermitian and skew-Hermitian splitting iteration methods for continuous Sylvester equations."
[16] F. Ding and T. Chen, "On iterative solutions of general coupled matrix equations."
[17] F. Ding and T. Chen, "Iterative least-squares solutions of coupled Sylvester matrix equations."
[18] F. Ding and T. Chen, "Gradient based iterative algorithms for solving a class of matrix equations."
[19] F. Ding, P. X. Liu, and J. Ding, "Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle."
[20] F. Ding, Y. Shi, and T. Chen, "Gradient-based identification methods for Hammerstein nonlinear ARMAX models."
[21] F. Ding, P. X. Liu, and G. Liu, "Gradient based and least-squares based iterative identification methods for OE and OEMA systems."
[22] Y. Liu, J. Sheng, and R. Ding, "Convergence of stochastic gradient estimation algorithm for multivariable ARX-like systems."
[23] Y. Liu, Y. Xiao, and X. Zhao, "Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model."
[24] F. Ding, G. Liu, and X. P. Liu, "Parameter estimation with scarce measurements."
[25] F. Ding and T. Chen, "Performance analysis of multi-innovation gradient type identification methods."
[26] F. Ding, Y. J. Liu, and B. Bao, "Gradient based and least squares based iterative estimation algorithms for multi-input multi-output systems."
[27] X. Wang, L. Dai, and D. Liao, "A modified gradient based algorithm for solving Sylvester equations."
[28] J. Zhou, R. Wang, and Q. Niu, "A preconditioned iteration method for solving Sylvester equations."