We propose an iterative method for computing the matrix sign function. It is shown that the scheme is globally convergent with a cubic rate of convergence. Examples are included to show the applicability and efficiency of the proposed scheme and its reciprocal.
1. Introduction
It is known that the sign function in the scalar case is defined for any $z\in\mathbb{C}$ not on the imaginary axis by
$$\operatorname{sign}(z)=\begin{cases}1, & \operatorname{Re}(z)>0,\\ -1, & \operatorname{Re}(z)<0.\end{cases}\tag{1}$$
An extension of (1) to the matrix case was first given by Roberts in [1]. This extended matrix function is important in several applications (see, e.g., [2] and the references therein).
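Assume that $A\in\mathbb{C}^{n\times n}$ is a matrix with no eigenvalues on the imaginary axis. To define this matrix function formally, let
$$A=TJT^{-1}\tag{2}$$
be a Jordan canonical form arranged so that $J=\operatorname{diag}(J_1,J_2)$, where the eigenvalues of $J_1\in\mathbb{C}^{p\times p}$ lie in the open left half-plane and those of $J_2\in\mathbb{C}^{q\times q}$ lie in the open right half-plane; then
$$S=\operatorname{sign}(A)=T\begin{pmatrix}-I_p & 0\\ 0 & I_q\end{pmatrix}T^{-1},\tag{3}$$
where $p+q=n$. A simplified definition of the matrix sign function for the Hermitian case (eigenvalues are all real) is
$$S=U\operatorname{diag}\left(\operatorname{sign}(\lambda_1),\ldots,\operatorname{sign}(\lambda_n)\right)U^{*},\tag{4}$$
where
$$U^{*}AU=\operatorname{diag}(\lambda_1,\ldots,\lambda_n)\tag{5}$$
is a diagonalization of $A$.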
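As a small illustration of definition (4)-(5), the following Mathematica sketch computes the sign of a numeric Hermitian matrix from its eigendecomposition. The helper name `hermitianSign` and the 2×2 test matrix are assumptions made for this example only; the sketch relies on `Eigensystem` returning orthonormal eigenvectors (as rows) for a numeric Hermitian input.

```mathematica
(* Hypothetical helper: sign of a numeric Hermitian matrix via (4)-(5). *)
hermitianSign[a_?MatrixQ] := Module[{vals, vecs},
  (* Eigenvectors are returned as rows; U = Transpose[vecs]. *)
  {vals, vecs} = Eigensystem[N[a]];
  (* S = U . diag(sign(lambda_i)) . U^*, with U^* = Conjugate[vecs]. *)
  Transpose[vecs] . DiagonalMatrix[Sign[vals]] . Conjugate[vecs]
]

a = {{2, 1}, {1, -3}};             (* assumed test matrix with one eigenvalue of each sign *)
s = hermitianSign[a];
Chop[s . s - IdentityMatrix[2]]    (* should vanish up to rounding: S is a square root of I *)
Chop[a . s - s . a]                (* should vanish up to rounding: S commutes with A *)
```

Because the test matrix has eigenvalues of both signs, the computed S is neither I nor -I.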
The importance of computing S is also due to the fact that the sign function plays a fundamental role in iterative methods for matrix roots and the polar decomposition [3].
Note that although sign(A) is a square root of the identity matrix, it is not equal to I or -I unless the spectrum of A lies entirely in the open right half-plane or open left half-plane, respectively. Hence, in general, sign(A) is a nonprimary square root of I.
In this paper, we focus on iterative methods for finding $S$. Such methods are Newton-type schemes, which are in essence fixed-point iterations: starting from a suitable initial matrix, they produce a sequence of matrices that converges to $S$.
The most famous method of this class is the quadratic Newton method defined by
$$X_{k+1}=\frac{1}{2}\left(X_k+X_k^{-1}\right).\tag{6}$$
It should be remarked that iterative methods such as (6), the Newton-Schulz iteration
$$X_{k+1}=\frac{1}{2}X_k\left(3I-X_k^2\right),\tag{7}$$
and the cubically convergent Halley method
$$X_{k+1}=\left[I+3X_k^2\right]\left[X_k\left(3I+X_k^2\right)\right]^{-1},\tag{8}$$
are all special cases of the Padé family proposed originally in [4]. Padé approximation belongs to the broader category of rational approximation. Coincidentally, the best uniform approximation of the sign function on a pair of symmetric but disjoint intervals can be expressed as a rational function.
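The three classical iterations can be written as one-step maps in a few lines. The sketch below is only illustrative: the function names and the 2×2 test matrix are assumptions, and (7) is defined but not run from $X_0=A$, since it converges only locally.

```mathematica
(* One step of each classical iteration; the function names are illustrative. *)
newtonStep[x_] := (x + Inverse[x])/2                                  (* (6) *)
schulzStep[x_] := x . (3 IdentityMatrix[Length[x]] - x . x)/2         (* (7), local convergence only *)
halleyStep[x_] := Module[{id = IdentityMatrix[Length[x]], x2 = x . x},
  (id + 3 x2) . Inverse[x . (3 id + x2)]]                             (* (8) *)

(* Residual ||X^2 - I||_inf, used to monitor convergence. *)
residual[x_] := Norm[x . x - IdentityMatrix[Length[x]], Infinity]

a = N[{{4, 1}, {2, -3}}];   (* assumed test matrix with eigenvalues in both half-planes *)
newtonResiduals = residual /@ NestList[newtonStep, a, 8]
halleyResiduals = residual /@ NestList[halleyStep, a, 6]
```

Both residual lists should decay toward zero, the Halley list in fewer steps.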
Note that (7) is not globally convergent. Nevertheless, on state-of-the-art parallel computer architectures, matrix inversions scale less satisfactorily than matrix multiplications do, so the inversion-free iteration (7) is useful in some problems. Because its convergence is only local, it is excluded from our numerical examples in this work.
The rest of this paper is organized as follows. In Section 2, we discuss how to construct a new iterative method for finding (3). It is also shown that the constructed method converges with a cubic rate and that the reciprocal iteration obtained from our main method is convergent as well. Numerical examples illustrating the accuracy of the constructed solvers are given in Section 3. The paper ends in Section 4 with some concluding comments.
2. A New Method
The connection of matrix iteration methods with the sign function is not immediately obvious, but in fact such methods can be derived by applying a suitable root-finding method to the nonlinear matrix equation
$$X^2=I,\tag{9}$$
of which $\operatorname{sign}(A)$ is one solution (see [5] for more).
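As a worked instance of this derivation, applying the scalar Newton method to $f(x)=x^2-1$ gives

$$x_{k+1}=x_k-\frac{f(x_k)}{f'(x_k)}=x_k-\frac{x_k^2-1}{2x_k}=\frac{x_k^2+1}{2x_k}=\frac{1}{2}\left(x_k+x_k^{-1}\right),$$

which is the scalar form of the Newton iteration (6). Replacing $x_k$ by a matrix $X_k$ and the reciprocal by a matrix inverse yields (6) itself.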
Here, we consider the following root-solver:
$$x_{k+1}=x_k-\frac{10-4L(x_k)}{10-9L(x_k)}\,\frac{f(x_k)}{f'(x_k)},\tag{10}$$
with $L(x_k)=f''(x_k)f(x_k)/f'(x_k)^2$. In what follows, we observe that (10) possesses third order of convergence.
Theorem 1.
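Let $\alpha\in D$ be a simple zero of a sufficiently differentiable function $f:D\subseteq\mathbb{C}\to\mathbb{C}$, and let $D$ contain the initial approximation $x_0$. Then the iterative expression (10) satisfies
$$e_{k+1}=\left(\frac{c_2^2}{5}-c_3\right)e_k^3+O(e_k^4),\tag{11}$$
where $c_j=f^{(j)}(\alpha)/\left(j!\,f'(\alpha)\right)$ and $e_k=x_k-\alpha$.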
Proof.
The proof is similar to the proofs given in [6].
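The error equation (11) can also be checked numerically. The sketch below applies (10) to an assumed test function $f(x)=x^3-2$ in 200-digit arithmetic and compares the ratios $e_{k+1}/e_k^3$ with the constant $c_2^2/5-c_3$; the function name `pmScalarStep`, the test function, and the starting value are illustrative choices, not taken from the paper.

```mathematica
(* Scalar step of (10) for a generic f; pmScalarStep is an illustrative name. *)
pmScalarStep[f_, x_] := Module[{fx = f[x], d1 = f'[x], d2 = f''[x], ell},
  ell = d2 fx/d1^2;                        (* L(x) = f''(x) f(x) / f'(x)^2 *)
  x - (10 - 4 ell)/(10 - 9 ell) fx/d1]

f = #^3 - 2 &;                             (* assumed test function, simple zero at 2^(1/3) *)
alpha = 2^(1/3);
xs = NestList[pmScalarStep[f, #] &, N[3/2, 200], 4];
errs = xs - alpha;                         (* errors e_k = x_k - alpha *)

(* Asymptotic error constant predicted by (11). *)
c2 = f''[alpha]/(2 f'[alpha]);
c3 = f'''[alpha]/(6 f'[alpha]);
ratios = Rest[errs]/Most[errs]^3;          (* observed e_{k+1}/e_k^3 *)
{N[ratios, 10], N[c2^2/5 - c3, 10]}
```

The later entries of `ratios` should settle near the predicted constant, which is consistent with third-order convergence.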
Applying (10) to the matrix equation (9) results in the following new fixed-point-type matrix iteration for finding (3):
$$X_{k+1}=\left(2I+15X_k^2+3X_k^4\right)\left[9X_k+11X_k^3\right]^{-1},\tag{12}$$
where $X_0=A$. This iteration is named PM1 from now on.
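The scalar computation behind (12) can be written out as follows. For $f(x)=x^2-1$ we have $f'(x)=2x$ and $f''(x)=2$, so that

$$L(x)=\frac{x^2-1}{2x^2},\qquad \frac{10-4L(x)}{10-9L(x)}=\frac{(8x^2+2)/x^2}{(11x^2+9)/(2x^2)}=\frac{16x^2+4}{11x^2+9}.$$

Substituting into (10) gives

$$x_{k+1}=x_k-\frac{(16x_k^2+4)(x_k^2-1)}{2x_k(11x_k^2+9)}=\frac{x_k^2(11x_k^2+9)-(8x_k^2+2)(x_k^2-1)}{x_k(11x_k^2+9)}=\frac{2+15x_k^2+3x_k^4}{9x_k+11x_k^3},$$

which is the scalar counterpart of the matrix iteration (12).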
The proposed scheme (12) is not a member of the Padé family [4]. Furthermore, applying (10) to the scalar equation $g(x)=x^2-1=0$ yields global convergence in the complex plane (except for points lying on the imaginary axis). This global behavior, which carries over to the matrix case, is illustrated by basins of attraction: Figure 1 shows those of (6) and (8), and Figure 2 shows those of (7) (local convergence) and (12) (global convergence).
Figure 1: Attraction basins for (6) (a) and (8) (b) for the polynomial $g(x)=x^2-1$.
Figure 2: Attraction basins of (7) (a) and (12) (b) for the polynomial $g(x)=x^2-1$.
Theorem 2.
Let $A\in\mathbb{C}^{n\times n}$ have no pure imaginary eigenvalues. Then the matrix sequence $\{X_k\}_{k=0}^{\infty}$ defined by (12) with $X_0=A$ converges to $S$.
Proof.
We remark that all matrices, whether they are diagonalizable or not, have a Jordan normal form $A=TJT^{-1}$, where the matrix $J$ consists of Jordan blocks. For this reason, let $A$ have a Jordan canonical form arranged as
$$T^{-1}AT=\Lambda=\begin{pmatrix}C & 0\\ 0 & N\end{pmatrix},\tag{13}$$
where $T$ is a nonsingular matrix and $C$, $N$ are square Jordan blocks corresponding to eigenvalues lying in $\mathbb{C}^{-}$ and $\mathbb{C}^{+}$, respectively. We have
$$\operatorname{sign}(\Lambda)=\operatorname{sign}\left(T^{-1}AT\right)=T^{-1}\operatorname{sign}(A)\,T=\operatorname{diag}\left(\operatorname{sign}(\lambda_1),\ldots,\operatorname{sign}(\lambda_p),\operatorname{sign}(\lambda_{p+1}),\ldots,\operatorname{sign}(\lambda_n)\right).\tag{14}$$
If we define $D_k=T^{-1}X_kT$, then, from the method (12), we obtain
$$D_{k+1}=\left(2I+15D_k^2+3D_k^4\right)\left[9D_k+11D_k^3\right]^{-1}.\tag{15}$$
Note that if $D_0$ is a diagonal matrix, then, by induction, all successive $D_k$ are diagonal too. From (15), it is enough to show that $\{D_k\}$ converges to $\operatorname{sign}(\Lambda)$. The case in which $D_0$ is not diagonal is discussed later in the proof.
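In the meantime, we can write (15) as $n$ uncoupled scalar iterations to solve $g(x)=x^2-1=0$, given by
$$d_{k+1}^{i}=\left(2+15\left(d_k^{i}\right)^2+3\left(d_k^{i}\right)^4\right)\left[9d_k^{i}+11\left(d_k^{i}\right)^3\right]^{-1},\tag{16}$$
where $d_k^{i}=(D_k)_{i,i}$ and $1\le i\le n$. From (15) and (16), it is enough to study the convergence of $\{d_k^{i}\}$ to $\operatorname{sign}(\lambda_i)$.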
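It is known that $\operatorname{sign}(\lambda_i)=s_i=\pm1$. Consider first $s_i=1$, that is, $\operatorname{Re}(\lambda_i)>0$. From (16) we attain
$$\frac{d_{k+1}^{i}-1}{d_{k+1}^{i}+1}=\frac{\left(-1+d_k^{i}\right)^3\left(-2+3d_k^{i}\right)}{\left(1+d_k^{i}\right)^3\left(2+3d_k^{i}\right)}.\tag{17}$$
Since $d_0^{i}=\lambda_i$ lies in the open right half-plane, both $|(d_0^{i}-1)/(d_0^{i}+1)|$ and $|(3d_0^{i}-2)/(3d_0^{i}+2)|$ are less than one. By (17), the modulus of the left-hand side is then bounded by $|(d_k^{i}-1)/(d_k^{i}+1)|^3<1$, so every $d_k^{i}$ stays in the open right half-plane and we have
$$\lim_{k\to\infty}\frac{d_{k+1}^{i}-1}{d_{k+1}^{i}+1}=0,\tag{18}$$
and $\lim_{k\to\infty}d_k^{i}=1=\operatorname{sign}(\lambda_i)$. The case $s_i=-1$ follows in the same way, because the map (16) is odd. This shows that $\{d_k^{i}\}$ is convergent.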
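The factorization behind (17) can be verified symbolically. The following short sketch treats `d` as a formal symbol; the variable names are illustrative.

```mathematica
(* Numerator and denominator of the scalar map (16); d is a formal symbol. *)
num = 2 + 15 d^2 + 3 d^4;
den = 9 d + 11 d^3;

Factor[num - den]   (* mathematically equal to (d - 1)^3 (3 d - 2) *)
Factor[num + den]   (* mathematically equal to (d + 1)^3 (3 d + 2) *)

(* Difference between the left- and right-hand sides of (17); should simplify to 0. *)
check = Simplify[(num/den - 1)/(num/den + 1) -
    (d - 1)^3 (3 d - 2)/((d + 1)^3 (3 d + 2))]
```

The cubed factor $(d-1)^3$ is what produces both the contraction used above and the third order of convergence.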
In the convergence proof, $D_0$ may not be diagonal, because the Jordan canonical form of some matrices is not diagonal; in that case one cannot write (15) as $n$ uncoupled scalar iterations (16). Our method is convergent in this case as well. To show this, we follow the scalar relationship among the eigenvalues of the iterates for the studied rational matrix iteration.
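In this case, the eigenvalues of $X_k$ are mapped from iterate $k$ to iterate $k+1$ by the following relation:
$$\lambda_{k+1}^{i}=\left(2+15\left(\lambda_k^{i}\right)^2+3\left(\lambda_k^{i}\right)^4\right)\left[9\lambda_k^{i}+11\left(\lambda_k^{i}\right)^3\right]^{-1}.\tag{19}$$
So, (19) shows that the eigenvalues in the general case converge to $s_i=\operatorname{sign}(\lambda_i)=\pm1$; that is to say,
$$\lim_{k\to\infty}\frac{\lambda_{k+1}^{i}-s_i}{\lambda_{k+1}^{i}+s_i}=0.\tag{20}$$
Consequently, we have
$$\lim_{k\to\infty}X_k=T\left(\lim_{k\to\infty}D_k\right)T^{-1}=T\operatorname{sign}(\Lambda)\,T^{-1}=\operatorname{sign}(A).\tag{21}$$
The proof is ended.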
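The behavior of the eigenvalue map (19) can be observed directly on a few scalars. In this sketch the starting values are arbitrary assumed samples off the imaginary axis, two in each half-plane, and `eigMap` is an illustrative name.

```mathematica
(* Scalar map (19); eigMap is an illustrative name. *)
eigMap[z_] := (2 + 15 z^2 + 3 z^4)/(9 z + 11 z^3)

(* Assumed sample starting values, two in each open half-plane. *)
starts = {0.5 + 5. I, 100. - 3. I, -0.2 + 0.1 I, -50. - 40. I};
limits = Nest[eigMap, #, 30] & /@ starts;

Chop[limits]        (* expected to be close to {1, 1, -1, -1} *)
Sign[Re[starts]]    (* half-plane of each starting value, for comparison *)
```

Each starting value is driven to the sign of its real part, in line with (20).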
Theorem 3.
Let A∈Cn×n have no pure imaginary eigenvalues. Then the proposed method (12) converges cubically to the sign matrix S.
Proof.
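Clearly, the $X_k$ are rational functions of $A$ and hence, like $A$, commute with $S$. On the other hand, we know that $S^2=I$, $S^{-1}=S$, $S^{2j}=I$, and $S^{2j+1}=S$ for $j\ge1$. Using the replacement $B_k=9X_k+11X_k^3$, we have
$$\begin{aligned}X_{k+1}-S&=\left(2I+15X_k^2+3X_k^4\right)B_k^{-1}-S\\&=\left(2I+15X_k^2+3X_k^4-SB_k\right)B_k^{-1}\\&=\left(2I+15X_k^2+3X_k^4-9SX_k-11SX_k^3\right)B_k^{-1}\\&=-\left(-2S-15SX_k^2-3SX_k^4+9X_k+11X_k^3\right)S^{-1}B_k^{-1}\\&=-\left(X_k-S\right)^3\left(2I-3SX_k\right)S^{-1}B_k^{-1}.\end{aligned}\tag{22}$$
Now, taking any consistent matrix norm of both sides of (22), we attain
$$\left\|X_{k+1}-S\right\|\le\left\|B_k^{-1}\right\|\left\|S^{-1}\right\|\left\|2I-3SX_k\right\|\left\|X_k-S\right\|^3.\tag{23}$$
This reveals the cubic rate of convergence of the new method (12). The proof is complete.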
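The last step of (22) rests on a polynomial identity that can be expanded by hand. Since $X_k$ commutes with $S$ and $S^2=I$,

$$\left(X_k-S\right)^3=X_k^3-3SX_k^2+3X_k-S,$$

and therefore

$$\left(X_k-S\right)^3\left(2I-3SX_k\right)=-2S-15SX_k^2-3SX_k^4+9X_k+11X_k^3,$$

which is exactly the bracketed expression in the fourth line of (22). The minus sign in front of that bracket therefore carries over to the factored form, and it has no effect on the bound (23) because norms are insensitive to sign.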
It should be remarked that the reciprocal iteration obtained from (12) also converges to the sign matrix (3):
$$X_{k+1}=\left(9X_k+11X_k^3\right)\left[2I+15X_k^2+3X_k^4\right]^{-1},\tag{24}$$
where $X_0=A$. This iteration is named PM2. Convergence results similar to those given in Theorems 2 and 3 hold for (24).
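A minimal sketch of PM1 and PM2 as one-step matrix maps is given below. The function names, the fixed number of ten steps, and the 2×2 test matrix are assumptions chosen for illustration.

```mathematica
(* One step of PM1 (12) and of its reciprocal PM2 (24); names are illustrative. *)
pm1Step[x_] := Module[{id = IdentityMatrix[Length[x]], x2 = x . x},
  (2 id + 15 x2 + 3 x2 . x2) . Inverse[9 x + 11 x . x2]]
pm2Step[x_] := Module[{id = IdentityMatrix[Length[x]], x2 = x . x},
  (9 x + 11 x . x2) . Inverse[2 id + 15 x2 + 3 x2 . x2]]

a = N[{{4, 1}, {2, -3}}];      (* assumed test matrix, eigenvalues on both sides of the axis *)
s1 = Nest[pm1Step, a, 10];     (* PM1 started from X0 = A *)
s2 = Nest[pm2Step, a, 10];     (* PM2 started from X0 = A *)

Norm[s1 - s2, Infinity]                       (* both limits should agree *)
Norm[s1 . s1 - IdentityMatrix[2], Infinity]   (* residual of X^2 = I *)
```

Each step costs four matrix products and one inversion, which is the count used later in the efficiency comparison.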
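A scaling approach that accelerates the initial phase of convergence is normally necessary, since the asymptotic convergence rate cannot be observed in the initial iterates. Such an idea was discussed fully in [7] for Newton's method. An effective way to enhance the initial speed of convergence is to scale the iterates prior to each iteration; that is, $X_k$ is replaced by $\mu_kX_k$. Subsequently, we can present the accelerated forms of our proposed methods as follows:
$$\begin{aligned}X_0&=A,\\ \mu_k&\ \text{is the scaling parameter computed by (27)},\\ X_{k+1}&=\left(2I+15\mu_k^2X_k^2+3\mu_k^4X_k^4\right)\left[9\mu_kX_k+11\mu_k^3X_k^3\right]^{-1},\end{aligned}\tag{25}$$
or
$$\begin{aligned}X_0&=A,\\ \mu_k&\ \text{is the scaling parameter computed by (27)},\\ X_{k+1}&=\left(9\mu_kX_k+11\mu_k^3X_k^3\right)\left[2I+15\mu_k^2X_k^2+3\mu_k^4X_k^4\right]^{-1},\end{aligned}\tag{26}$$
$$\mu_k=\begin{cases}\sqrt{\dfrac{\left\|X_k^{-1}\right\|}{\left\|X_k\right\|}}, & \text{(norm scaling)},\\[2ex] \sqrt{\dfrac{\rho\left(X_k^{-1}\right)}{\rho\left(X_k\right)}}, & \text{(spectral scaling)},\\[2ex] \left|\det\left(X_k\right)\right|^{-1/n}, & \text{(determinantal scaling)},\end{cases}\tag{27}$$
where $\lim_{k\to\infty}\mu_k=1$ and $\lim_{k\to\infty}X_k=S$. The scaling factors $\mu_k$ in (27) are borrowed from Newton's method. For this reason it is important to examine the behavior of the accelerated methods (25)-(26), and this is done in the next section.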
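The scaling factors and one scaled PM1 step can be sketched as follows. The function names and the test matrix are assumptions, and the infinity norm is used for norm scaling only as an illustrative choice, since the text does not fix a particular norm.

```mathematica
(* Scaling factors of (27) for a numeric nonsingular matrix x; names are illustrative. *)
normScaling[x_] := Sqrt[Norm[Inverse[x], Infinity]/Norm[x, Infinity]]   (* assumed norm: infinity norm *)
spectralScaling[x_] := Sqrt[Max[Abs[Eigenvalues[Inverse[x]]]]/Max[Abs[Eigenvalues[x]]]]
detScaling[x_] := Abs[Det[x]]^(-1/Length[x])

(* One scaled PM1 step as in (25): X_k is replaced by mu_k X_k before applying (12). *)
scaledPM1Step[x_, scaling_] := Module[{y = scaling[x] x, id, y2},
  id = IdentityMatrix[Length[x]]; y2 = y . y;
  (2 id + 15 y2 + 3 y2 . y2) . Inverse[9 y + 11 y . y2]]

a = N[{{40, 10}, {20, -30}}];   (* assumed test matrix whose eigenvalues are far from +1 and -1 *)
{normScaling[a], spectralScaling[a], detScaling[a]}
sScaled = Nest[scaledPM1Step[#, detScaling] &, a, 6]
```

With determinantal scaling, the scaled matrix $\mu_kX_k$ has a determinant of modulus one, so $\mu_k$ tends to 1 as the iterates approach $S$.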
3. Numerical Examples
In this section, we report comparisons of various matrix iterations in terms of the number of iterations and the residual norms. We compare PM1 and PM2 with (6), denoted by NM, and (8), denoted by HM. The programming package Mathematica [8] is used throughout this section. In Tables 1 and 2, IT stands for the number of iterations.
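Table 1: Results of comparisons for Example 5 using $X_0=A$.

| Methods | NM | HM | PM1 | PM2 |
|---|---|---|---|---|
| IT | 14 | 9 | 8 | 8 |
| $R_{k+1}$ | $1.41584\times10^{-249}$ | $1.0266\times10^{-299}$ | $2.5679\times10^{-298}$ | $1.45091\times10^{-337}$ |
| $\rho$ | 1.99077 | 3 | 3 | 3 |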
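Table 2: Results of comparisons for Example 6 using $X_0=A$.

| Methods | NM | HM | PM1 | PM2 |
|---|---|---|---|---|
| IT | 10 | 7 | 6 | 6 |
| $R_{k+1}$ | $5.7266\times10^{-155}$ | $5.80819\times10^{-203}$ | $8.38265\times10^{-153}$ | $1.55387\times10^{-143}$ |
| $\rho$ | 2.00228 | 3.00001 | 3.00015 | 3 |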
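Note that the computational order of convergence for matrix iterations in finding $S$ can be estimated by [9]
$$\rho=\frac{\log\left(\left\|X_{k+1}^2-I\right\|/\left\|X_k^2-I\right\|\right)}{\log\left(\left\|X_k^2-I\right\|/\left\|X_{k-1}^2-I\right\|\right)},\tag{28}$$
where $X_{k-1}$, $X_k$, and $X_{k+1}$ are the last three approximations.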
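The estimate (28) is straightforward to evaluate from three consecutive iterates. The sketch below uses the infinity norm and, as an assumed demonstration, Newton's iteration (6) in 100-digit arithmetic on a small test matrix; the names `residual` and `coc` are illustrative.

```mathematica
(* Residual R = ||X^2 - I||_inf and the estimate (28); names are illustrative. *)
residual[x_] := Norm[x . x - IdentityMatrix[Length[x]], Infinity]
coc[xPrev_, x_, xNext_] :=
  Log[residual[xNext]/residual[x]]/Log[residual[x]/residual[xPrev]]

(* Assumed demonstration: Newton's iteration (6) on a small matrix, 100-digit arithmetic. *)
newtonStep[x_] := (x + Inverse[x])/2
a = N[{{4, 1}, {2, -3}}, 100];
xs = NestList[newtonStep, a, 6];

rho = coc @@ xs[[-3 ;;]];   (* uses the last three approximations *)
N[rho, 6]                   (* should be close to 2 for a quadratically convergent method *)
```

Extended precision is used so that the residuals of the last iterates are not dominated by rounding errors.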
Example 4.
In this example, we compare the methods for the following 500×500 complex matrix:
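n = 500; SeedRandom[123];  (* fix the seed so that the test matrix is reproducible *)
A = RandomComplex[{-100 - I, 100 + I}, {n, n}];  (* entries drawn from the rectangle with corners -100 - I and 100 + I *)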
We apply double precision arithmetic here with the stopping criterion $R_{k+1}=\left\|X_{k+1}^2-I\right\|_\infty\le10^{-5}$. The results are given in Figure 3.
Figure 3: Convergence history versus number of iterations for different methods in Example 4.
Example 5 (academic test).
We compute the matrix sign for the following complex test problem:
$$A=\begin{pmatrix}0 & 10 & i & 7+i\\ 7 & -5 & 6 & -5\\ 0 & 60 & -2 & 9\\ 0 & 5 & 9 & i\end{pmatrix},\tag{29}$$
where
$$S=\begin{pmatrix}0.882671+0.0118589i & 0.461061-0.0519363i & -0.167387+0.0215728i & 0.168184-0.0194164i\\ 0.219355+0.00464485i & 0.136809-0.00840032i & 0.313995-0.00196855i & -0.314977-0.00219388i\\ -0.566306-0.0184534i & 2.22878+0.0471091i & 0.189109-0.00416224i & 0.813305+0.0149399i\\ 0.145285+0.00157401i & -0.57165+0.000347003i & 0.207909-0.00345322i & 0.791412+0.000703638i\end{pmatrix}.\tag{30}$$
We apply 600-digit fixed point arithmetic in our calculations with the stopping criterion $R_{k+1}=\left\|X_{k+1}^2-I\right\|_\infty\le10^{-150}$. The results for this example are given in Table 1. The computational orders of convergence (COCs) are reported in the $l_\infty$ norm.
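A run of this kind could be set up along the following lines. This is a sketch under stated assumptions rather than the code used for Table 1: the helper names are illustrative, `SetPrecision` is used here to obtain 600-digit numbers, the cap of 50 steps is an added safeguard, and the way steps are counted may differ from the convention behind the IT column.

```mathematica
(* Test matrix (29) in 600-digit arithmetic; helper names are illustrative. *)
a = SetPrecision[{{0, 10, I, 7 + I}, {7, -5, 6, -5}, {0, 60, -2, 9}, {0, 5, 9, I}}, 600];
id = IdentityMatrix[4];
residual[x_] := Norm[x . x - id, Infinity]
pm1Step[x_] := Module[{x2 = x . x},
  (2 id + 15 x2 + 3 x2 . x2) . Inverse[9 x + 11 x . x2]]

(* Iterate while R = ||X^2 - I||_inf > 10^-150, with at most 50 steps as a safeguard. *)
iterates = NestWhileList[pm1Step, a, residual[#] > 10^-150 &, 1, 50];

Length[iterates] - 1      (* number of PM1 steps taken *)
N[Last[iterates], 6]      (* approximation of S, to be compared with (30) *)
```

Replacing `pm1Step` by a Newton, Halley, or PM2 step gives the corresponding runs for the other columns.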
The iterative schemes PM1 and PM2 compare favorably with the other methods, since they require fewer iterations while reaching acceptable accuracy. Hence, the proposed methods with a properly chosen initial matrix $X_0$ can be helpful in finding the sign of a nonsingular complex matrix with no pure imaginary eigenvalues.
Example 6.
Here we rerun Example 5 using the scaling approaches (27) with the stopping criterion $R_{k+1}=\left\|X_{k+1}^2-I\right\|_\infty\le10^{-100}$. The results for this example are given in Table 2. We used determinantal scaling for all compared methods. The numerical results uphold the theoretical discussions of Section 2.
A price paid for high-order convergence is the increased number of matrix multiplications and inversions per iteration. This is a typical consequence. However, the most important advantage of the presented methods over methods of the same order, such as (8), is their larger attraction basins. This advantage allows the new methods to reach a required tolerance in one iteration fewer than other methods of the same order. Hence, a thorough study of the computational efficiency index of the proposed methods may not be an easy task, and it must be pursued experimentally. In an experimental manner, if the costs of one matrix-matrix product and one matrix inversion are 1 and 1.5 units, respectively, then we have the following efficiency indices for the different methods: $E_{(6)}=2^{1/(14(1)+14(1.5))}\simeq1.020$, $E_{(8)}=3^{1/(9(3)+9(1.5))}\simeq1.027$, and $E_{(12)}=3^{1/(8(4)+8(1.5))}\simeq1.025$. Note that for Newton's method we count one matrix-matrix product per cycle, which is due to the computation of the stopping criterion. Similar computations of the efficiency indices for other examples show the same behavior.
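The three efficiency indices quoted above can be reproduced with a one-line helper. The name `effIndex` and its argument order are illustrative; the iteration counts are those of Table 1 and the operation counts are those stated in the text.

```mathematica
(* Efficiency index p^(1/cost), where p is the convergence order and
   cost = iterations * (products + 1.5 * inversions) per the cost model in the text. *)
effIndex[p_, its_, prods_, invs_] := p^(1/(its (prods + 1.5 invs)))

eNewton = effIndex[2, 14, 1, 1]   (* Newton (6): 2^(1/35)   *)
eHalley = effIndex[3, 9, 3, 1]    (* Halley (8): 3^(1/40.5) *)
ePM1    = effIndex[3, 8, 4, 1]    (* PM1 (12):   3^(1/44)   *)
```

Changing the assumed cost ratio of 1.5 between an inversion and a product shifts these indices, so the ranking depends on the cost model.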
4. Summary
Matrix functions are used in many areas of linear algebra and arise in numerous applications in science and engineering. The function of a matrix can be defined in several ways, of which the following three are generally the most useful: Jordan canonical form, polynomial interpolation, and finally Cauchy integral.
In this paper, we have focused on iterative methods for this purpose. A third-order nonlinear equation solver has been employed to construct a new method for $S$. It was shown, via attraction basins in the complex plane, that the convergence is global, and that the rate of convergence is cubic. Furthermore, PM2, the reciprocal of the method PM1 with the same convergence properties, was proposed. The acceleration of PM1 and PM2 via scaling was also illustrated.
Finally, some numerical examples in both double and multiple precision were performed to show the efficiency of PM1 and PM2. Future studies could extend the obtained iterations to the computation of polar decompositions.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors would like to thank the referees for their helpful corrections and suggestions.
References
[1] J. D. Roberts, "Linear model reduction and solution of the algebraic Riccati equation by use of the sign function," vol. 32, no. 4, pp. 677-687, 1980. doi:10.1080/00207178008922881. MR595962.
[2] C. S. Kenney and A. J. Laub, "The matrix sign function," vol. 40, no. 8, pp. 1330-1348, 1995. doi:10.1109/9.402226. MR1343800.
[3] N. J. Higham, Society for Industrial and Applied Mathematics, Philadelphia, Pa, USA, 2008. doi:10.1137/1.9780898717778. MR2396439.
[4] C. Kenney and A. J. Laub, "Rational iterative methods for the matrix sign function," vol. 12, no. 2, pp. 273-291, 1991. doi:10.1137/0612020. MR1089159.
[5] F. Soleymani, P. S. Stanimirović, S. Shateyi, and F. K. Haghani, "Approximating the matrix sign function using a novel iterative method," vol. 2014, Article ID 105301, 9 pages, 2014. doi:10.1155/2014/105301. MR3240521.
[6] F. Soleymani, "Some high-order iterative methods for finding all the real zeros," vol. 12, no. 2, pp. 313-327, 2014. MR3217342.
[7] C. Kenney and A. J. Laub, "On scaling Newton's method for polar decomposition and the matrix sign function," vol. 13, no. 3, pp. 698-706, 1992. doi:10.1137/0613044. MR1168017.
[8] S. Wagon, 3rd edition, Springer, New York, NY, USA, 2010.
[9] F. Soleymani, E. Tohidi, S. Shateyi, and F. Haghani, "Some matrix iterations for computing matrix sign function," vol. 2014, Article ID 425654, 9 pages, 2014. doi:10.1155/2014/425654. MR3232916.