Two new algorithms are proposed to compute the nonsingular square root of a matrix A. Convergence theorems and a stability analysis for these new algorithms are given. Numerical results show that the new algorithms are feasible and effective.
1. Introduction
Consider the following nonlinear matrix equation:
(1) $F(X) = X^2 - A = 0$,
where A is an n×n nonsingular complex matrix. A solution X of (1) is called a square root of A. Matrix square roots have many applications, for example, in boundary value problems [1] and in the computation of the matrix logarithm [2, 3].
In the last few years there has been a constantly increasing interest in developing the theory and numerical methods for matrix square roots. Results on the existence and uniqueness of the matrix square root can be found in [4–6]. It is worth pointing out that any nonsingular matrix has a square root, and the square root is also nonsingular [4]. A number of methods have been proposed for computing the square root of a matrix [5, 7–16]. These methods can generally be separated into two classes. The first class consists of the so-called direct methods, for example, the Schur algorithm developed by Björck and Hammarling [7]. The second class consists of iterative methods. Matrix iterations X_{k+1} = f(X_k), where f is a polynomial or a rational function, are attractive alternatives for computing square roots [9, 11–13, 15, 17]. A well-known iterative method for computing the matrix square root is Newton's method, which has nice numerical behavior, for example, quadratic convergence. Newton's method for solving (1) was proposed in [18]. Later, some simplified Newton methods were developed in [11, 19, 20]. Unfortunately, these simplified Newton methods have poor numerical stability.
In this paper, we propose two new algorithms with good numerical stability for computing the nonsingular square root of a matrix. In particular, to the best of our knowledge, we are the first to apply the Samanskii technique, proposed in [21], to the computation of the matrix square root. Convergence theorems and a stability analysis for these new algorithms are given in Sections 3 and 4. In Section 5, we use numerical examples to show that the new algorithms are more effective than the known ones in some aspects. Conclusions are given in Section 6.
2. Two New Algorithms
To compute the square root of the matrix A, a natural approach is to apply Newton's method to (1); this can be stated as follows.
Algorithm 1 (Newton's method for (1); see [11, 19]).
Step 0. Given X_0 and ε, set k = 0.
Step 1. Compute Res(X_k) = ‖X_k^2 − A‖/‖A‖. If Res(X_k) < ε, stop.
Step 2. Solve for H_k in the Sylvester equation
(2) $X_k H_k + H_k X_k = -F(X_k)$.
Step 3. Update X_{k+1} = X_k + H_k, set k = k + 1, and go to Step 1.
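As an illustration, a minimal Python sketch of Algorithm 1 might look as follows, using SciPy's standard Sylvester solver for Step 2; the function name, the use of the Frobenius norm, and the iteration cap are our own choices, not part of the original algorithm:

```python
import numpy as np
from scipy.linalg import solve_sylvester

def newton_sqrt(A, X0, eps=1e-15, maxit=50):
    """Sketch of Algorithm 1: Newton's method for F(X) = X^2 - A = 0."""
    X = X0.copy()
    for _ in range(maxit):
        R = X @ X - A
        if np.linalg.norm(R) / np.linalg.norm(A) < eps:  # Step 1: Res(X_k) < eps
            break
        # Step 2: solve the Sylvester equation X H + H X = -(X^2 - A)
        H = solve_sylvester(X, X, -R)
        X = X + H                                        # Step 3
    return X
```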
Applying the standard local convergence theorem [19, P. 148] to Algorithm 1, we deduce that the sequence {X_k} generated by Algorithm 1 converges quadratically to a square root X* of A provided that the starting matrix X_0 is sufficiently close to X*.
In this paper, we propose two new algorithms to compute the nonsingular square root of the matrix A. Our idea can be stated as follows. If (1) has a nonsingular solution X, then we can transform (1) into an equivalent nonlinear matrix equation:
(3) $G(X) = X - AX^{-1} = 0$.
Then we apply Newton’s method to (3) for computing the nonsingular square root of A.
By the definition of F-differentiability and some simple calculations, we obtain that if the matrix X is nonsingular, then the mapping G is F-differentiable at X with
(4) $G'_X(H) = H + AX^{-1}HX^{-1}$.
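For completeness, here is a short sketch of the calculation behind (4), using the first-order Neumann expansion of the inverse:

```latex
% Using (X+H)^{-1} = X^{-1} - X^{-1} H X^{-1} + O(\|H\|^2) for nonsingular X:
\begin{aligned}
G(X+H) &= X + H - A(X+H)^{-1} \\
       &= X + H - A\left(X^{-1} - X^{-1}HX^{-1}\right) + O(\|H\|^{2}) \\
       &= G(X) + \underbrace{\left(H + AX^{-1}HX^{-1}\right)}_{=\,G'_X(H)} + O(\|H\|^{2}).
\end{aligned}
```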
Thus Newton’s method for (3) can be written as
(5) Given $X_0$, $X_{k+1} = X_k - (G'_{X_k})^{-1}(G(X_k))$, $k = 0, 1, 2, \ldots$.
Combining (5) with (4), the iteration can be restated as follows.
Algorithm 2 (Newton's method for (3)).
Step 0. Given X_0 and ε, set k = 0.
Step 1. Compute Res(X_k) = ‖X_k^2 − A‖/‖A‖. If Res(X_k) < ε, stop.
Step 2. Solve for H_k in the generalized Sylvester equation
(6) $AX_k^{-1} H_k X_k^{-1} + H_k = -G(X_k)$.
Step 3. Update X_{k+1} = X_k + H_k, set k = k + 1, and go to Step 1.
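A minimal Python sketch of Algorithm 2 follows. Lacking a dedicated library routine for the generalized Sylvester equation (6), it falls back on the Kronecker-product (vec) formulation, which forms an n² × n² system and is meant only for small illustrative problems; the names are hypothetical:

```python
import numpy as np

def newton_sqrt_inv(A, X0, eps=1e-15, maxit=50):
    """Sketch of Algorithm 2: Newton's method for G(X) = X - A X^{-1} = 0."""
    n = A.shape[0]
    X = X0.copy()
    for _ in range(maxit):
        if np.linalg.norm(X @ X - A) / np.linalg.norm(A) < eps:  # Step 1
            break
        Xinv = np.linalg.inv(X)
        G = X - A @ Xinv
        # Step 2: vec(A X^{-1} H X^{-1} + H) = (X^{-T} (x) A X^{-1} + I) vec(H)
        M = np.kron(Xinv.T, A @ Xinv) + np.eye(n * n)
        vecH = np.linalg.solve(M, -G.reshape(-1, order="F"))
        X = X + vecH.reshape(n, n, order="F")                    # Step 3
    return X
```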
By applying the Samanskii technique [21] to Newton's method (5), we get the following algorithm.
Algorithm 3 (Newton's method for (3) with the Samanskii technique).
Step 0. Given X_0, m, and ε, set k = 0.
Step 1. Compute Res(X_k) = ‖X_k^2 − A‖/‖A‖. If Res(X_k) < ε, stop.
Step 2. Set X_{k,0} = X_k and i = 1.
Step 3. If i > m, go to Step 6.
Step 4. Solve for H_{k,i−1} in the generalized Sylvester equation
(7) $AX_k^{-1} H_{k,i-1} X_k^{-1} + H_{k,i-1} = -G(X_{k,i-1})$.
Step 5. Update X_{k,i} = X_{k,i−1} + H_{k,i−1}, set i = i + 1, and go to Step 3.
Step 6. Update X_{k+1} = X_{k,m}, set k = k + 1, and go to Step 1.
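Note that all m inner corrections in Steps 2–5 use the coefficient operator frozen at X_k, so in a Kronecker-based sketch the n² × n² matrix need only be formed once per outer step. A hedged Python sketch, with hypothetical names:

```python
import numpy as np

def newton_sqrt_samanskii(A, X0, m=2, eps=1e-15, maxit=50):
    """Sketch of Algorithm 3; m = 1 reduces to Algorithm 2."""
    n = A.shape[0]
    X = X0.copy()
    for _ in range(maxit):
        if np.linalg.norm(X @ X - A) / np.linalg.norm(A) < eps:  # Step 1
            break
        Xinv = np.linalg.inv(X)
        M = np.kron(Xinv.T, A @ Xinv) + np.eye(n * n)  # operator frozen at X_k
        Y = X
        for _ in range(m):                             # Steps 2-5
            G = Y - A @ np.linalg.inv(Y)
            vecH = np.linalg.solve(M, -G.reshape(-1, order="F"))
            Y = Y + vecH.reshape(n, n, order="F")
        X = Y                                          # Step 6
    return X
```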
Remark 4.
In this paper, we only consider the case m = 2. If m = 1, then Algorithm 3 reduces to Algorithm 2.
Remark 5.
Iteration (5) is more suitable for theoretical analysis, such as the convergence theorems and stability analysis in Sections 3 and 4, while Algorithms 2 and 3 are more convenient for the numerical computations in Section 5. In actual computations, the generalized Sylvester equation CXD + EXF = G may be solved by the algorithms developed in [22]; a naive dense fallback is sketched below.
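The sketch below reduces CXD + EXF = G to a single n² × n² linear system via the identity vec(CXD) = (Dᵀ ⊗ C)vec(X); it is an illustration only, not the algorithm of [22], which avoids forming the large matrix explicitly:

```python
import numpy as np

def solve_gen_sylvester(C, D, E, F, G):
    """Kronecker-based solver for C X D + E X F = G (square matrices only)."""
    n = G.shape[0]
    M = np.kron(D.T, C) + np.kron(F.T, E)  # vec(CXD) = (D^T (x) C) vec(X)
    vecX = np.linalg.solve(M, G.reshape(-1, order="F"))
    return vecX.reshape(n, n, order="F")
```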
Although Algorithms 2 and 3 are also Newton-type methods, they are more effective than Algorithm 1. In particular, Algorithm 3 with m = 2 has a cubic convergence rate.
3. Convergence Theorems
In this section, we establish local convergence theorems for Algorithms 2 and 3. We begin with some lemmas.
Lemma 6 (see [23, P. 21]).
Let T be a (nonlinear) operator from a Banach space E into itself and let x* ∈ E be a solution of x = Tx. If T is Fréchet differentiable at x* with ρ(T′_{x*}) < 1, then the iteration x_{n+1} = Tx_n, n = 0, 1, 2, …, converges to x*, provided that x_0 is sufficiently close to x*.
Lemma 7 (see [17, P. 45]).
Let A, B ∈ C^{n×n} and assume that A is invertible with ‖A^{-1}‖ ≤ α. If ‖A − B‖ ≤ β and αβ < 1, then B is also invertible, and
(8) $\|B^{-1}\| \le \frac{\alpha}{1-\alpha\beta}$.
Lemma 8.
If the matrix X̂ ∈ C^{n×n} is nonsingular, then there exist γ > 0 and L > 0 such that, for all X, Y ∈ B(X̂, γ),
(9) $\|G'_X - G'_Y\| \le L\|X - Y\|$,
where B(X̂, γ) = {X : ‖X − X̂‖ < γ} and G′_X, G′_Y are the F-derivatives (4) of the mapping G at X and Y.
Proof.
Let α = ‖X̂^{-1}‖ and select 0 < γ < α^{-1}.
From Lemma 7 it follows that every X ∈ B(X̂, γ) is nonsingular with ‖X^{-1}‖ ≤ α/(1 − αγ); hence G′_X is well defined, and so is G′_Y for Y ∈ B(X̂, γ). According to (4), and using X^{-1} − Y^{-1} = Y^{-1}(Y − X)X^{-1}, we have
(10)
$$\begin{aligned}
\|G'_X(H)-G'_Y(H)\| &= \|(H+AX^{-1}HX^{-1})-(H+AY^{-1}HY^{-1})\|\\
&= \|A(X^{-1}HX^{-1}-Y^{-1}HY^{-1})\|\\
&= \|A[X^{-1}H(X^{-1}-Y^{-1})+(X^{-1}-Y^{-1})HY^{-1}]\|\\
&= \|A[X^{-1}HY^{-1}(Y-X)X^{-1}+Y^{-1}(Y-X)X^{-1}HY^{-1}]\|\\
&\le \|A\|\,\|X^{-1}\|\,\|Y^{-1}\|\bigl(\|X^{-1}\|+\|Y^{-1}\|\bigr)\|X-Y\|\,\|H\|\\
&\le 2\Bigl(\frac{\alpha}{1-\alpha\gamma}\Bigr)^{3}\|A\|\,\|X-Y\|\,\|H\| = L\|X-Y\|\,\|H\|,
\end{aligned}$$
where L = 2(α/(1 − αγ))³‖A‖. Hence
(11) $\|G'_X - G'_Y\| \le L\|X - Y\|$.
Theorem 9.
If (3) has a nonsingular solution X* and the mapping G′_{X*}: C^{n×n} → C^{n×n} is invertible, then there exists a closed ball S = B(X*, δ) such that, for all X_0 ∈ S, the sequence {X_k} generated by Algorithm 2 converges at least quadratically to the solution X*.
Proof.
Let φ(X) = X − (G′_X)^{-1}(G(X)). By the Taylor formula in Banach space [24, P. 67] and the fact that G(X*) = 0, we have
(12)
$$\begin{aligned}
\lim_{\|H\|\to 0}\frac{\|\varphi(X^*+H)-\varphi(X^*)\|}{\|H\|}
&=\lim_{\|H\|\to 0}\frac{\bigl\|[X^*+H-(G'_{X^*+H})^{-1}(G(X^*+H))]-[X^*-(G'_{X^*})^{-1}(G(X^*))]\bigr\|}{\|H\|}\\
&=\lim_{\|H\|\to 0}\frac{\bigl\|H+(G'_{X^*})^{-1}(G(X^*))-(G'_{X^*+H})^{-1}(G(X^*+H))\bigr\|}{\|H\|}\\
&=\lim_{\|H\|\to 0}\frac{\bigl\|H+(G'_{X^*})^{-1}(G(X^*))-(G'_{X^*+H})^{-1}\bigl[G(X^*)+G'_{X^*}(H)+\tfrac{1}{2}G''_{X^*}(H^2)+\cdots\bigr]\bigr\|}{\|H\|}\\
&=\lim_{\|H\|\to 0}\frac{\bigl\|H-(G'_{X^*+H})^{-1}(G'_{X^*}(H))-\tfrac{1}{2}(G'_{X^*+H})^{-1}(G''_{X^*}(H^2))-\cdots\bigr\|}{\|H\|}\\
&=0,
\end{aligned}$$
where the last step uses Lemma 8, which gives ‖H − (G′_{X*+H})^{-1}(G′_{X*}(H))‖ = ‖(G′_{X*+H})^{-1}((G′_{X*+H} − G′_{X*})(H))‖ = O(‖H‖²). Hence the F-derivative of φ at X* is 0, and by Lemma 6 the sequence {X_k} generated by the iteration (5), that is, by Algorithm 2, converges to X* provided that X_0 is sufficiently close to X*.
Let ‖(G′_{X*})^{-1}‖ = β. Since X_k → X* as k → ∞, Lemma 8 gives ‖G′_{X_k} − G′_{X*}‖ ≤ L‖X_k − X*‖ ≤ 1/(2β) for all sufficiently large k, and hence, by Lemma 7,
(13) $\|(G'_{X_k})^{-1}\| \le \frac{\beta}{1-\beta\,(1/(2\beta))} = 2\beta$.
By Lemma 8, we have
(14) $\|G'_{X_k}(X_k-X^*)-G'_{X^*}(X_k-X^*)\| \le L\|X_k-X^*\|^2$.
By making use of the Taylor formula once again, we have
(15)
$$\begin{aligned}
\|G(X_k)-G(X^*)-G'_{X^*}(X_k-X^*)\|
&=\Bigl\|\int_0^1\bigl(G'_{X_k+t(X^*-X_k)}-G'_{X^*}\bigr)(X_k-X^*)\,dt\Bigr\|\\
&\le\int_0^1\bigl\|G'_{X_k+t(X^*-X_k)}-G'_{X^*}\bigr\|\,dt\;\|X_k-X^*\|\\
&\le\int_0^1 L(1-t)\|X_k-X^*\|\,dt\;\|X_k-X^*\|
=\frac{L}{2}\|X_k-X^*\|^2\le L\|X_k-X^*\|^2.
\end{aligned}$$
Hence, using G(X*) = 0,
(16)
$$\begin{aligned}
\|X_{k+1}-X^*\|&=\|X_k-(G'_{X_k})^{-1}(G(X_k))-X^*\|\\
&=\|(G'_{X_k})^{-1}[G'_{X_k}(X_k-X^*)-G(X_k)]\|\\
&=\|(G'_{X_k})^{-1}[(G'_{X_k}(X_k-X^*)-G'_{X^*}(X_k-X^*))-(G(X_k)-G(X^*)-G'_{X^*}(X_k-X^*))]\|\\
&\le\|(G'_{X_k})^{-1}\|\bigl[\|G'_{X_k}(X_k-X^*)-G'_{X^*}(X_k-X^*)\|+\|G(X_k)-G(X^*)-G'_{X^*}(X_k-X^*)\|\bigr].
\end{aligned}$$
Combining (13)–(16), we have
(17) $\|X_{k+1}-X^*\| \le 2\beta L\|X_k-X^*\|^2 + 2\beta L\|X_k-X^*\|^2 = 4\beta L\|X_k-X^*\|^2$,
which implies that the sequence {Xk} generated by Algorithm 2 converges at least quadratically to the solution X*.
Theorem 10.
If (3) has a nonsingular solution X* and the mapping G′_{X*}: C^{n×n} → C^{n×n} is invertible, then there exists a closed ball S = B(X*, δ) such that, for all X_0 ∈ S, the sequence {X_k} generated by Algorithm 3 converges at least cubically to the solution X*.
Proof.
As in the proof of Theorem 9, the mapping φ(X) = X − (G′_X)^{-1}(G(X)) has F-derivative 0 at X* (see (12)), so by Lemma 6 the sequence {X_k} generated by Algorithm 3 converges to X* provided that X_0 is sufficiently close to X*.
Let ‖(G′_{X*})^{-1}‖ = β. As in (13), for all sufficiently large k we have
(19) $\|(G'_{X_k})^{-1}\| \le 2\beta$.
By Lemma 8, we have
(20) $\|G'_{X_k}(X_{k,1}-X^*)-G'_{X^*}(X_{k,1}-X^*)\| \le L\|X_k-X^*\|\,\|X_{k,1}-X^*\|$.
By making use of the Taylor formula once again, we have
(21)
$$\begin{aligned}
\|G(X_{k,1})-G(X^*)-G'_{X^*}(X_{k,1}-X^*)\|
&=\Bigl\|\int_0^1\bigl(G'_{X_{k,1}+t(X^*-X_{k,1})}-G'_{X^*}\bigr)(X_{k,1}-X^*)\,dt\Bigr\|\\
&\le\int_0^1\bigl\|G'_{X_{k,1}+t(X^*-X_{k,1})}-G'_{X^*}\bigr\|\,dt\;\|X_{k,1}-X^*\|\\
&\le\int_0^1 L(1-t)\|X_{k,1}-X^*\|\,dt\;\|X_{k,1}-X^*\|
=\frac{L}{2}\|X_{k,1}-X^*\|^2\le L\|X_{k,1}-X^*\|^2.
\end{aligned}$$
Since m = 2, we have X_{k+1} = X_{k,2} = X_{k,1} − (G′_{X_k})^{-1}(G(X_{k,1})). Hence, using G(X*) = 0,
(22)
$$\begin{aligned}
\|X_{k+1}-X^*\|&=\|X_{k,1}-(G'_{X_k})^{-1}(G(X_{k,1}))-X^*\|\\
&=\|(G'_{X_k})^{-1}[G'_{X_k}(X_{k,1}-X^*)-G(X_{k,1})]\|\\
&=\|(G'_{X_k})^{-1}[(G'_{X_k}(X_{k,1}-X^*)-G'_{X^*}(X_{k,1}-X^*))-(G(X_{k,1})-G(X^*)-G'_{X^*}(X_{k,1}-X^*))]\|\\
&\le\|(G'_{X_k})^{-1}\|\bigl[\|G'_{X_k}(X_{k,1}-X^*)-G'_{X^*}(X_{k,1}-X^*)\|+\|G(X_{k,1})-G(X^*)-G'_{X^*}(X_{k,1}-X^*)\|\bigr].
\end{aligned}$$
Since X_{k,1} is precisely the Algorithm 2 update of X_k, the bound (17) gives ‖X_{k,1} − X*‖ ≤ 4βL‖X_k − X*‖². Combining this with (19)–(22), we have
(23)
$$\begin{aligned}
\|X_{k+1}-X^*\| &\le 2\beta\bigl[L\|X_k-X^*\|\,\|X_{k,1}-X^*\|+L\|X_{k,1}-X^*\|^2\bigr]\\
&\le 2\beta L\bigl[4\beta L\|X_k-X^*\|^3+16\beta^2L^2\|X_k-X^*\|^4\bigr]\\
&=\bigl(8\beta^2L^2+32\beta^3L^3\|X_k-X^*\|\bigr)\|X_k-X^*\|^3\\
&\le\bigl(8\beta^2L^2+32\beta^3L^3\delta\bigr)\|X_k-X^*\|^3=M\|X_k-X^*\|^3,
\end{aligned}$$
where M = 8β²L² + 32β³L³δ. Therefore, the sequence {X_k} generated by Algorithm 3 converges at least cubically to the solution X*.
4. Stability Analysis
In accordance with [2], we define an iteration X_{k+1} = f(X_k) to be stable in a neighborhood of a solution X* = f(X*) if the error matrices E_k = X_k − X* satisfy
(24) $E_{k+1} = L(E_k) + O(\|E_k\|^2)$,
where L is a linear operator with bounded powers; that is, there exists a constant c > 0 such that ‖L^n(E)‖ < c for all n > 0 and all E of unit norm. This means that a small perturbation introduced at a certain step will not be amplified in subsequent iterations.
Note that this definition of stability is an asymptotic property and differs from the usual concept of numerical stability, which concerns the global error propagation and aims to bound the minimum relative error over the computed iterates.
Now we consider the iteration (5) and define the error matrix E_k = X_k − X*; that is,
(25) $X_k = E_k + X^*$.
For the sake of simplicity, we perform a first order error analysis; that is, we omit all the terms that are quadratic in the errors. Equality up to second order terms is denoted with the symbol ≐.
Substituting (25) into (5), we get
(26) $E_{k+1}+X^* = E_k+X^* - (G'_{E_k+X^*})^{-1}(G(E_k+X^*))$;
applying G′_{E_k+X*} to the correction term, using (4), and multiplying on the right by E_k + X*, we obtain
(27) $(E_{k+1}+X^*)(E_k+X^*) + A(E_k+X^*)^{-1}(E_{k+1}+X^*) = 2A$,
which, by the expansion (E_k + X*)^{-1} = X*^{-1} − X*^{-1}E_kX*^{-1} + O(E_k²), implies that
(28) $E_{k+1}E_k + E_{k+1}X^* + X^*E_k + (X^*)^2 + A\bigl(X^{*-1} - X^{*-1}E_kX^{*-1} + O(E_k^2)\bigr)(E_{k+1}+X^*) = 2A$.
Omitting all terms that are quadratic in the errors, we have
(29) $E_{k+1}X^* + X^*E_k + (X^*)^2 + AX^{*-1}E_{k+1} + A - AX^{*-1}E_k \doteq 2A$.
Using AX*^{-1} = X* and (X*)² = A, we have
(30) $E_{k+1}X^* + X^*E_k + A + X^*E_{k+1} + A - X^*E_k \doteq 2A$;
that is,
(31) $E_{k+1}X^* + X^*E_{k+1} \doteq 0$,
which means that iteration (5) is self-correcting: to first order, the error E_k in the kth iteration does not propagate to the (k+1)st iteration. In particular, when X* and −X* have no eigenvalue in common, the matrix equation EX* + X*E = 0 has the unique solution E = 0 [17, P. 194]. Therefore, under the condition that X* and −X* have no eigenvalue in common, the iteration (5) has optimal stability; that is, the operator L defined in (24) coincides with the null operator.
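A quick numerical illustration of this stability property can be obtained by perturbing a converged iterate and taking one further step of (5). The test matrix, the seed, and the reuse of the hypothetical `newton_sqrt_inv` sketch from Section 2 are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.eye(n)                # assumed SPD test matrix
X = newton_sqrt_inv(A, X0=np.eye(n))       # converged square root
E = 1e-8 * rng.standard_normal((n, n))     # injected perturbation
Y = newton_sqrt_inv(A, X0=X + E, maxit=1)  # one further Newton step from X + E
# Per (31), the first-order part of E is annihilated, so the distance should
# drop to roughly machine precision rather than remain of order 1e-8.
print(np.linalg.norm(Y - X))
```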
5. Numerical Examples
In this section, we compare our algorithms with the following known iterations (Algorithms 11–14). Consider
(34)
$$\begin{aligned}
&Y_0=A,\qquad Z_0=I,\\
&Y_{k+1}=\frac{1}{p}\,Y_k\sum_{i=1}^{p}\frac{1}{\xi_i}\bigl(Z_kY_k+\alpha_i^2I\bigr)^{-1},\\
&Z_{k+1}=\frac{1}{p}\,Z_k\sum_{i=1}^{p}\frac{1}{\xi_i}\bigl(Y_kZ_k+\alpha_i^2I\bigr)^{-1},\qquad k=0,1,2,\ldots,
\end{aligned}$$
where p ≥ 1 is a chosen integer and
(35) $\xi_i=\frac{1}{2}\Bigl(1+\cos\frac{(2i-1)\pi}{2p}\Bigr),\qquad \alpha_i^2=\frac{1}{\xi_i}-1,\qquad i=1,2,\ldots,p.$
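For reference, a direct transcription of (34)-(35) into Python might read as follows; since the definitions of the other comparison methods (and equations (32), (33), (36)) are not reproduced above, treat this as an assumption about one comparison iteration only:

```python
import numpy as np

def coupled_iteration(A, p=2, eps=1e-15, maxit=100):
    """Sketch of the coupled iteration (34)-(35); Y_k approximates a square root of A."""
    n = A.shape[0]
    i = np.arange(1, p + 1)
    xi = 0.5 * (1 + np.cos((2 * i - 1) * np.pi / (2 * p)))  # (35)
    alpha2 = 1.0 / xi - 1.0
    Y, Z = A.copy(), np.eye(n)
    for _ in range(maxit):
        if np.linalg.norm(Y @ Y - A) / np.linalg.norm(A) < eps:
            break
        S = sum(np.linalg.inv(Z @ Y + a * np.eye(n)) / x for x, a in zip(xi, alpha2))
        T = sum(np.linalg.inv(Y @ Z + a * np.eye(n)) / x for x, a in zip(xi, alpha2))
        Y, Z = (Y @ S) / p, (Z @ T) / p                     # (34), simultaneous update
    return Y, Z
```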
All tests were performed in MATLAB 7.1 on a personal computer (Pentium IV, 2.4 GHz) with machine precision 2.2×10^-16. The stopping criterion for all algorithms is the relative residual error
(37) $\mathrm{Res} = \frac{\|X_k^2 - A\|}{\|A\|} < 10^{-15}$,
where X_k is the kth iterate.
Example 1.
Consider the matrix
(38) $A=(a_{ij})_{10\times 10},\qquad a_{ij}=\begin{cases}\dfrac{j}{20}, & i=j,\\[4pt] \dfrac{i+j}{1000}, & i\ne j.\end{cases}$
We use Algorithms 1, 2, and 3 with X_0 = 0.3I, together with Algorithms 11–14, to compute the nonsingular square root of A. The iteration steps (denoted by "IT"), the CPU time in seconds (denoted by "CPU"), and the relative residual error (denoted by "ERR") are listed in Table 1.
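For instance, the experiment could be reproduced along these lines, reusing the hypothetical sketches from Section 2 (our own driver, not the authors' MATLAB code):

```python
import numpy as np

n = 10
i, j = np.indices((n, n)) + 1                     # 1-based row/column indices
A = np.where(i == j, j / 20.0, (i + j) / 1000.0)  # the matrix (38)

X0 = 0.3 * np.eye(n)
for method in (newton_sqrt, newton_sqrt_inv, newton_sqrt_samanskii):
    X = method(A, X0)
    print(method.__name__, np.linalg.norm(X @ X - A) / np.linalg.norm(A))
```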
Table 1: Numerical results for Example 1.

Method                  IT    CPU (s)   ERR
Algorithm 1             7     0.0086    2.17×10^-16
Algorithm 2             7     0.0080    2.04×10^-16
Algorithm 3             5     0.0103    1.96×10^-16
Algorithm 11            9     0.0172    1.99×10^-16
Algorithm 12            6     0.0136    2.03×10^-16
Algorithm 13 with p=1   11    0.0094    2.71×10^-16
Algorithm 13 with p=2   9     0.0101    2.68×10^-16
Algorithm 14 with p=1   9     0.0127    1.97×10^-16
Algorithm 14 with p=2   6     0.0108    3.61×10^-16
Example 2.
Consider the matrix
(39) $A=(a_{ij})_{200\times 200},\qquad a_{ij}=\begin{cases}1, & i=j,\\ \dfrac{1}{i+j-1}, & i\ne j.\end{cases}$
We use Algorithms 1, 2, and 3 with the starting matrix X_0 = 0.9I, together with Algorithms 11–14, to compute the nonsingular square root of A. The numerical results are listed in Table 2.
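At n = 200 the Kronecker-based inner solves in our earlier sketches are impractical (the linear systems are n² × n²), so only the matrix construction and a library sanity check are sketched here; `scipy.linalg.sqrtm` is a general-purpose routine, not one of the algorithms compared in Table 2:

```python
import numpy as np
from scipy.linalg import sqrtm

n = 200
i, j = np.indices((n, n)) + 1
A = np.where(i == j, 1.0, 1.0 / (i + j - 1))  # the matrix (39)

X = sqrtm(A)  # reference square root for comparison
print(np.linalg.norm(X @ X - A) / np.linalg.norm(A))
```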
Table 2: Numerical results for Example 2.

Method                  IT    CPU (s)   ERR
Algorithm 1             6     7.6310    5.72×10^-16
Algorithm 2             6     8.7200    3.61×10^-16
Algorithm 3             4     9.0258    2.60×10^-16
Algorithm 11            8     13.2301   3.87×10^-16
Algorithm 12            7     11.6758   2.98×10^-16
Algorithm 13 with p=1   10    8.8936    9.36×10^-16
Algorithm 13 with p=2   6     9.4387    5.78×10^-16
Algorithm 14 with p=1   9     10.3571   2.89×10^-16
Algorithm 14 with p=2   5     8.1043    3.87×10^-16
From Tables 1 and 2, we can see that Algorithm 3 requires the fewest iterations of all the methods tested in both examples, and that Algorithms 2 and 3 compare favorably with Algorithms 1 and 11–14 in iteration steps and approximation accuracy. Therefore, our algorithms are more effective than the known ones in some aspects.
6. Conclusion
In this paper, we propose two new algorithms for computing the nonsingular square root of a matrix A by applying Newton's method to the nonlinear matrix equation G(X) = X − AX^{-1} = 0. Convergence theorems and a stability analysis for these new algorithms are given. Numerical examples show that our methods are more effective than the known ones in some aspects.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors wish to thank the editor and anonymous referees for providing very useful suggestions as well as Professor Xuefeng Duan for his insightful and beneficial discussion and suggestions. This work was supported by National Natural Science Fund of China (nos. 11101100, 11301107, and 11261014) and Guangxi Provincial Natural Science Foundation (nos. 2012GXNSFBA053006, 2013GXNSFBA019009).
References
[1] B. A. Schmitt, "On algebraic approximation for the matrix exponential in singularly perturbed boundary value problems," vol. 57, pp. 51–66, 2010.
[2] S. H. Cheng, N. J. Higham, C. S. Kenney, and A. J. Laub, "Approximating the logarithm of a matrix to specified accuracy," SIAM Journal on Matrix Analysis and Applications, vol. 22, no. 4, pp. 1112–1125, 2001.
[3] L. Dieci, B. Morini, and A. Papini, "Computational techniques for real logarithms of matrices," SIAM Journal on Matrix Analysis and Applications, vol. 17, no. 3, pp. 570–593, 1996.
[4] G. W. Cross and P. Lancaster, "Square roots of complex matrices," Linear and Multilinear Algebra, vol. 1, pp. 289–293, 1974.
[5] W. D. Hoskins and D. J. Walton, "A faster method of computing square roots of a matrix," vol. 23, no. 3, pp. 494–495, 1978.
[6] C. R. Johnson and K. Okubo, "Uniqueness of matrix square roots under a numerical range condition," Linear Algebra and Its Applications, vol. 341, no. 1–3, pp. 195–199, 2002.
[7] Å. Björck and S. Hammarling, "A Schur method for the square root of a matrix," Linear Algebra and Its Applications, vol. 52/53, pp. 127–140, 1983.
[8] S. G. Chen and P. Y. Hsieh, "Fast computation of the nth root," Computers & Mathematics with Applications, vol. 17, no. 10, pp. 1423–1427, 1989.
[9] E. D. Denman and A. N. Beavers Jr., "The matrix sign function and computations in systems," Applied Mathematics and Computation, vol. 2, no. 1, pp. 63–94, 1976.
[10] L. P. Franca, "An algorithm to compute the square root of a 3×3 positive definite matrix," Computers & Mathematics with Applications, vol. 18, no. 5, pp. 459–466, 1989.
[11] N. J. Higham, "Newton's method for the matrix square root," Mathematics of Computation, vol. 46, no. 174, pp. 537–549, 1986.
[12] M. A. Hasan, "A power method for computing square roots of complex matrices," Journal of Mathematical Analysis and Applications, vol. 213, no. 2, pp. 393–405, 1997.
[13] N. J. Higham, "Stable iterations for the matrix square root," Numerical Algorithms, vol. 15, no. 2, pp. 227–242, 1997.
[14] Z. Liu, Y. Zhang, and R. Ralha, "Computing the square roots of matrices with central symmetry," Applied Mathematics and Computation, vol. 186, no. 1, pp. 715–726, 2007.
[15] Y. N. Zhang, Y. W. Yang, B. H. Cai, and D. S. Guo, "Zhang neural network and its application to Newton iteration for matrix square root estimation," Neural Computing and Applications, vol. 21, no. 3, pp. 453–460, 2012.
[16] Z. Liu, H. Chen, and H. Cao, "The computation of the principal square roots of centrosymmetric H-matrices," Applied Mathematics and Computation, vol. 175, no. 1, pp. 319–329, 2006.
[17] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, SIAM, Philadelphia, Pa, USA, 2008.
[18] P. Laasonen, "On the iterative solution of the matrix equation AX² − I = 0," Mathematical Tables and Other Aids to Computation, vol. 12, pp. 109–116, 1958.
[19] J. M. Ortega, 2nd edition, Academic Press, New York, NY, USA, 1972.
[20] B. Meini, "The matrix square root from a new functional perspective: theoretical results and computational issues," SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 2, pp. 362–376, 2004.
[21] V. Samanskii, "On a modification of the Newton's method," vol. 19, pp. 133–138, 1967.
[22] J. D. Gardiner, A. J. Laub, J. J. Amato, and C. B. Moler, "Solution of the Sylvester matrix equation AXBᵀ + CXDᵀ = E," ACM Transactions on Mathematical Software, vol. 18, no. 2, pp. 223–231, 1992.
[23] M. A. Krasnoselskii, G. M. Vainikko, P. P. Zabreiko, Y. B. Rutitskii, and V. Y. Stetsenko, Approximate Solution of Operator Equations, Wolters-Noordhoff, Groningen, The Netherlands, 1972.
[24] D. J. Guo, Shandong Science Press, Shandong, China, 2009.