Approximating the Matrix Sign Function Using a Novel Iterative Method

Table 1: Iteration functions for finding the sign (in the scalar form) from the Padé family (4), for m = 0, 1, 2, 3, 4.


Introduction
The generic matrix function f(A) of a given matrix A ∈ C^(n×n) is defined formally by the integral representation

f(A) = (1/(2πi)) ∮_Γ f(z) (zI − A)^(-1) dz,  (1)

where f : Ω → C is an analytic function, Ω ⊆ C, and Γ is a closed curve which encircles all eigenvalues of A (it should be contained in the domain of analyticity of f). The integral representation (1) is known as the Cauchy integral formula [1]. The integral of a matrix should be understood as the matrix whose entries are the integrals of the corresponding entries. However, this mathematically appealing formula for computing matrix functions is complicated and requires complex analysis to be fully understood. Hence, several other important strategies for computing matrix functions have been proposed and investigated, such as the Jordan canonical form and iterative methods for applied numerical problems (see, e.g., [1-3]).
In 1971, Roberts [4] introduced the matrix sign function as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations. He defined the sign function as a Cauchy integral and obtained the following integral representation:

sign(A) = (2/π) A ∫_0^∞ (t^2 I + A^2)^(-1) dt.  (2)

The matrix sign function is widely exploited in numerical linear algebra, especially in the computation of invariant subspaces and solutions of Riccati equations [5-7]. Note that this function enables a matrix to be decomposed into two components whose spectra lie on opposite sides of the imaginary axis. The matrix sign function is also a valuable tool for the numerical solution of Sylvester and Lyapunov matrix equations (see, e.g., [8]). An application of a generalization of the Newton iteration for the matrix sign function to the solution of the generalized algebraic Bernoulli equations was considered in [9].
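In the scalar case, representation (2) reduces to sign(a) = (2/π) a ∫_0^∞ (t^2 + a^2)^(-1) dt, which is easy to check numerically. A minimal Python sketch (the midpoint rule and the substitution t = tan θ are illustrative choices, not part of the original text):

```python
import math

def sign_integral(a, panels=4000):
    """Evaluate (2/pi) * a * integral_0^inf (t^2 + a^2)^(-1) dt numerically,
    using the substitution t = tan(theta) to map [0, inf) onto [0, pi/2)."""
    h = (math.pi / 2) / panels
    total = 0.0
    for i in range(panels):
        theta = (i + 0.5) * h                      # midpoint rule
        t = math.tan(theta)
        sec2 = 1.0 / math.cos(theta) ** 2          # dt = sec^2(theta) dtheta
        total += sec2 / (t * t + a * a) * h
    return (2.0 / math.pi) * a * total

print(sign_integral(3.7))    # close to +1
print(sign_integral(-0.25))  # close to -1
```

Since ∫_0^∞ (t^2 + a^2)^(-1) dt = π/(2|a|), the prefactor (2/π)a collapses the integral to a/|a| = sign(a), which the quadrature reproduces to a few digits.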
Another application of this matrix function, as a simple and direct method to derive some fundamental results in the theory of surface waves in anisotropic materials, was presented in [10]. The authors of [11] investigated some practical iterations for the matrix sector function, which is a generalization of the matrix sign function.
Due to the wide applicability of the matrix sign function, along with the difficulty of evaluating the representation (2), stable iterative methods have become the viable choice.
The most important general family of matrix iterations for finding the matrix sign function S = sign(A) was introduced in [12] using Padé approximants to f(ξ) = (1 − ξ)^(-1/2) and the characterization

sign(z) = z/(z^2)^(1/2) = z/(1 − ξ)^(1/2),  (3)

where ξ = 1 − z^2 and ξ is less than 1 in magnitude. Let the (ℓ, m)-Padé approximant to f(ξ) be P_{ℓm}(ξ)/Q_{ℓm}(ξ), with ℓ + m ≥ 1. The resulting iteration

x_{k+1} = x_k P_{ℓm}(1 − x_k^2)/Q_{ℓm}(1 − x_k^2)  (4)

has been proved to converge to ±1 with order ℓ + m + 1. Note that Iannazzo [13] pointed out that these iterations can be obtained from the general König family (which goes back to Schröder [14, 15]) applied to the equation x^2 − 1 = 0. For a recent method in this area, one may refer to [16].
Many known methods can be extracted from the Padé family (4). For example, Newton's iteration can be deduced as the reciprocal of the iteration corresponding to the case ℓ = 0 and m = 1 in Table 1:

x_{k+1} = (1/2)(x_k + x_k^(-1)).  (5)

Choosing ℓ = m = 2 yields the following fifth-order method:

x_{k+1} = x_k (5 + 10x_k^2 + x_k^4)/(1 + 10x_k^2 + 5x_k^4).  (6)

Similarly, the options ℓ = 1, m = 3; ℓ = 3, m = 1; and ℓ = 0, m = 4 result in the fifth-order methods (7), (8), and (9), respectively. The remaining sections of this work are organized as follows. Section 2 presents the construction and derivation of a new matrix iteration for finding S, using a new nonlinear equation solver (10), a three-step scheme (for the scalar case) that combines Jarratt's method [17] with a secant approach [18]; here f[x_k, y_k] = (f(x_k) − f(y_k))/(x_k − y_k) denotes the usual divided difference. Section 2 also studies the stability of the method and verifies its asymptotic stability. In Section 3, we discuss some further aspects of the new method that are useful in implementations; there, we derive a new inversion-free method as well as a scaled method for finding S. Section 4 is devoted to numerical examples illustrating the convergence behavior of the new method against the existing ones. We emphasize that our constructed solver possesses the same global convergence rate as (6), but the numerical examples will reveal equal or faster behavior on many randomly generated matrices; this is a clear advantage of our solver over (6). Finally, concluding remarks are drawn in Section 5.
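The scalar members (5) and (6) can be verified directly. The following plain-Python sketch iterates both rules and counts iterations to convergence; the tolerances and starting values are illustrative choices:

```python
def newton_sign(x, tol=1e-14, max_iter=100):
    """Newton's rule (5): x <- (x + 1/x)/2, second-order convergence."""
    for k in range(1, max_iter + 1):
        x_new = 0.5 * (x + 1.0 / x)
        if abs(x_new - x) <= tol * abs(x_new):
            return x_new, k
        x = x_new
    return x, max_iter

def pade22_sign(x, tol=1e-14, max_iter=100):
    """The (2,2) rule (6): x <- x(5 + 10x^2 + x^4)/(1 + 10x^2 + 5x^4).
    Fifth order: its error satisfies f(x) - 1 = (x - 1)^5 / (1 + 10x^2 + 5x^4)."""
    for k in range(1, max_iter + 1):
        x2 = x * x
        x_new = x * (5 + 10 * x2 + x2 * x2) / (1 + 10 * x2 + 5 * x2 * x2)
        if abs(x_new - x) <= tol * abs(x_new):
            return x_new, k
        x = x_new
    return x, max_iter

s_newton, k_newton = newton_sign(7.0)
s_pade, k_pade = pade22_sign(7.0)   # same limit +1, fewer iterations
```

Both runs converge to +1 from x_0 = 7, and the fifth-order rule needs roughly half the iterations of Newton's rule.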

A Novel Iterative Method
Throughout this work it is assumed that A ∈ C^(n×n) has no eigenvalues on the imaginary axis, so that sign(A) is defined. Note that this assumption implies that A is nonsingular.
As discussed in the fundamental 1995 article of Kenney and Laub [19], the construction of new matrix methods for the matrix sign function is related to iterative methods for solving nonlinear scalar equations. Let us apply (10) to the nonlinear matrix equation

F(X) := X^2 − I = 0,  (11)

in which I is the identity matrix. In fact, S = sign(A) is a solution of the matrix equation (11). After application of (10) and simplification of its reciprocal, we obtain

X_{k+1} = (7X_k + 30X_k^3 + 11X_k^5)(I + 20X_k^2 + 25X_k^4 + 2X_k^6)^(-1),  (12)

where X_0 = A.
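In scalar form, (12) is the rational map x ↦ x(7 + 30x^2 + 11x^4)/(1 + 20x^2 + 25x^4 + 2x^6), and the algebraic identity f(x) − 1 = (x − 1)^5 (1 − 2x)/(1 + 20x^2 + 25x^4 + 2x^6) exhibits its fifth-order contraction near the fixed point. A minimal Python sketch of the scalar iteration (the coefficients follow the pattern that reappears in Algorithm 1 and in the scaled variant (33); treat the sketch as illustrative):

```python
def new_method_scalar(x, tol=1e-14, max_iter=100):
    """Scalar form of the fifth-order iteration (12):
    x <- x(7 + 30x^2 + 11x^4) / (1 + 20x^2 + 25x^4 + 2x^6).
    The coefficients mirror those of Algorithm 1 and (33); read them as an
    illustration of the construction."""
    for k in range(1, max_iter + 1):
        x2 = x * x
        num = x * (7 + 30 * x2 + 11 * x2 * x2)
        den = 1 + 20 * x2 + 25 * x2 * x2 + 2 * x2 * x2 * x2
        x_new = num / den
        if abs(x_new - x) <= tol * abs(x_new):
            return x_new, k
        x = x_new
    return x, max_iter

# Fifth-order contraction near the fixed point +1:
x = 1.01
x2 = x * x
f = x * (7 + 30 * x2 + 11 * x2 * x2) / (1 + 20 * x2 + 25 * x2 * x2 + 2 * x2 ** 3)
# |f - 1| is of order |x - 1|^5 = 1e-10, up to the bounded factor (1 - 2x)/den
```

From any real x_0 ≠ 0, including very large values such as x_0 = 1000, the iterates are drawn into the quintic contraction region within a handful of steps, mirroring the global behavior observed in the basins of attraction below.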
Iannazzo [20] showed that the convergence of a matrix iteration of this kind is governed by the convergence of the corresponding scalar iteration. Since the method (10) satisfies an error equation of the form e_{k+1} = C e_k^5 + O(e_k^6), wherein the constant C depends on c_j = f^(j)(α)/(j! f'(α)) and e_k = x_k − α, the corresponding matrix method (12) converges with fifth order
of convergence. But the question is whether this convergence is local or global. To answer it, we draw the basins of attraction of the new scheme along with those of the existing methods of various orders.
It is shown in Figures 1-3 that methods (5), (6), and (12) are globally convergent, while methods (7), (8), and (9) are only locally convergent: if one chooses a matrix A (in Figure 2 (left)) with one eigenvalue that has negative real part but lies in a green petal, then the matrix iteration will not converge to S.
The higher order of convergence of (12) makes its basins larger and lighter than those of (5). Hence, the new method is of interest due to its global fifth order of convergence for finding S.
Now, an important claim, which must be proven, is the stability of the new method (12) for computing the matrix sign function. This will be done formally in what follows.
Definition 1 (stability; see [1]). Consider an iteration X_{k+1} = g(X_k) with a fixed point X. Assume that g is Fréchet differentiable at X. The iteration is stable in a neighborhood of X if the Fréchet derivative dg(X) has bounded powers; that is, there exists a constant c such that ||(dg(X))^p|| ≤ c for all p > 0.
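Definition 1 is easy to exercise on the classical Newton iteration g(X) = (X + X^(-1))/2, used here only as a simpler stand-in for (12). Its Fréchet derivative at the fixed point S is the map E ↦ (E − SES)/2 (a standard computation), and since S^2 = I this map is idempotent, hence power-bounded. A small Python check with hypothetical 2 × 2 data:

```python
def mm(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical data: an involution S (S @ S = I) playing the role of sign(A),
# and an arbitrary perturbation E.
S = [[1.0, 2.8], [0.0, -1.0]]
E = [[0.3, -1.2], [0.7, 0.5]]

def dg(E):
    """Frechet derivative (E - S E S)/2 of the Newton iteration at S."""
    SES = mm(S, mm(E, S))
    return [[0.5 * (E[i][j] - SES[i][j]) for j in range(2)] for i in range(2)]

LE = dg(E)
LLE = dg(LE)   # idempotency: dg(dg(E)) == dg(E), so all powers are bounded
```

Idempotency follows from S^2 = I alone: S dg(E) S = (SES − E)/2 = −dg(E), so dg(dg(E)) = (dg(E) + dg(E))/2 = dg(E).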
We investigate the stability of (12) for finding S in a neighborhood of the solution of (11). In fact, we analyze how a small perturbation at the k-th iterate is amplified or damped along the subsequent iterates. Stability concerns behavior close to convergence and so is an asymptotic property.

Lemma 2. The sequence {X_k}_{k=0}^∞ generated by (12) is asymptotically stable.
Proof. If X_0 is a function of A, then the iterates generated by (12) are all functions of A and hence commute with A. Commutativity properties of this kind are frequently used when deriving a matrix iteration for finding S.
In this study, we restrict the analysis to asymptotically small perturbations; that is, we use a first-order (differential) error analysis.
Let Δ_k be the numerical perturbation introduced at the k-th iterate of (12), and write X̃_k = X_k + Δ_k. Here we perform a first-order error analysis; that is, we formally use the approximation (Δ_k)^j ≈ 0 for j ≥ 2, since these powers are close to the zero matrix. This formal manipulation is meaningful if Δ_k is sufficiently small. Expanding X̃_{k+1} with the aid of the identity

(B + C)^(-1) ≈ B^(-1) − B^(-1) C B^(-1),

valid for any nonsingular matrix B and any perturbation C of sufficiently small norm, and using Δ_{k+1} = X̃_{k+1} − X_{k+1}, we find after some algebraic manipulations (assuming X_k ≈ sign(A) = S for large enough k) that the perturbation at the iterate k + 1 is bounded in terms of ||Δ_k||; here we used the facts S^2 = I and S^(-1) = S on the matrix sign function. Therefore, the sequence {X_k}_{k=0}^∞ generated by (12) is asymptotically stable. This ends the proof.

Theorem 3. Let A ∈ C^(n×n) have no pure imaginary eigenvalues. Then the matrix sequence {X_k}_{k=0}^∞ defined by (12) converges to the matrix sign S.
Proof. Let A have a Jordan canonical form arranged as

T^(-1) A T = Λ = diag(C, N),

where T is a nonsingular matrix and C, N are square Jordan blocks corresponding to eigenvalues lying in C^- (the open left half of the complex plane) and C^+ (the open right half of the complex plane), respectively. Denote by λ_1, ..., λ_p and λ_{p+1}, ..., λ_n the values lying on the main diagonals of the blocks C and N, respectively. Since A is invertible, we know that sign(A) is diagonalizable and [1]

sign(A) = T diag(-I_p, I_{n-p}) T^(-1).

On the other hand, if we define D_k = T^(-1) X_k T, then from the equation (12) we obtain the same iteration for the matrices D_k (22). Notice that if D_0 is a diagonal matrix, then all successive D_k are diagonal too. From (22), it is enough to prove that {D_k} converges to sign(Λ) in order to ensure the convergence of the sequence generated by (12) to sign(A). We can write (22) as n uncoupled scalar iterations to solve f(x) = x^2 − 1 = 0, given by the scalar version of (12),

d_{k+1}^i = (7d_k^i + 30(d_k^i)^3 + 11(d_k^i)^5)/(1 + 20(d_k^i)^2 + 25(d_k^i)^4 + 2(d_k^i)^6),  (23)

where d_k^i = (D_k)_{ii} and 1 ≤ i ≤ n. From (22) and (23), it is enough to study the convergence of {d_k^i} to sign(λ_i) for all 1 ≤ i ≤ n.
Recalling D_k = T^(-1) X_k T, we have lim_{k→∞} X_k = T (lim_{k→∞} D_k) T^(-1) = T sign(Λ) T^(-1) = sign(A), and subsequently the convergence is established. The proof is complete.
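The decoupling argument of the proof can be watched numerically: for a diagonalizable A = TΛT^(-1), the transformed iterates D_k = T^(-1) X_k T stay diagonal and carry exactly the scalar iterates of the eigenvalues. A plain-Python sketch with 2 × 2 helpers (Newton's iteration (5) is used as a simpler stand-in for (12); T and Λ are hypothetical illustrative data):

```python
def mm(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    """2x2 matrix inverse via the adjugate formula."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

# Hypothetical data: eigenvector matrix T and Lambda = diag(2, -3).
T = [[1.0, 1.0], [0.0, 1.0]]
Ti = inv2(T)
A = mm(T, mm([[2.0, 0.0], [0.0, -3.0]], Ti))   # A = T Lambda T^{-1}

X = [row[:] for row in A]     # matrix Newton iterates
d = [2.0, -3.0]               # scalar Newton iterates on the eigenvalues
for _ in range(4):
    Xi = inv2(X)
    X = [[0.5 * (X[i][j] + Xi[i][j]) for j in range(2)] for i in range(2)]
    d = [0.5 * (v + 1.0 / v) for v in d]

D = mm(Ti, mm(X, T))   # D_k = T^{-1} X_k T: diagonal, entries = scalar iterates
```

Up to rounding, the off-diagonal entries of D vanish and its diagonal matches the scalar sequence, exactly as (22) and (23) predict.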
Theorem 4. Under the conditions of Lemma 2 and Theorem 3, the proposed method (12) converges to the sign matrix S with fifth order.
Proof. Clearly, the iterates X_k are rational functions of A and hence, like A, commute with S. On the other hand, we know that S^2 = I, S^(-1) = S, and consequently S^(2j) = I and S^(2j+1) = S for j ≥ 1. Using the abbreviation B_k = I + 20X_k^2 + 25X_k^4 + 2X_k^6 (for the sake of simplicity), we obtain a factorization of X_{k+1} − S in which the factor (X_k − S)^5 appears (27). Now, taking any matrix norm of both sides of (27) gives a bound of the form ||X_{k+1} − S|| ≤ ||B_k^(-1)|| ||E_k|| ||X_k − S||^5 with a bounded factor E_k. This reveals the fifth order of convergence of the new method (12). The proof is complete.

Multiplication-Rich and Scaled Variants of the New Method
In general, reducing a problem in numerical linear algebra to a matrix inversion is not an advisable technique. The iterations (5)-(9), as well as (12), require the explicit computation of a matrix inverse at each iterative step. Since explicit use of the inverse matrix is relatively rare in numerical analysis, there is a natural desire to approximate the inverse. Such discussions and variants of (12) are given in the following subsections.

Solve a Matrix Equation Instead of the Matrix Inverse.
One way to avoid explicit use of the inverse matrix is to solve a corresponding system of linear matrix equations instead of forming the matrix inverse. This is done in Algorithm 1.
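The same idea in miniature, using Newton's iteration (5) as a simpler stand-in for (12): one step X_{k+1} = (X_k + X_k^(-1))/2 is equivalent to solving the linear matrix equation X_k X_{k+1} = (X_k^2 + I)/2, so an LU-type solve can replace the explicit inverse. A 2 × 2 Python sketch (Cramer's rule stands in for the production linear solver):

```python
def mm(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    """2x2 matrix inverse via the adjugate formula."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def solve2(A, b):
    """Solve the 2x2 system A y = b by Cramer's rule (a stand-in for the
    LU-based solve a production code would use)."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / d,
            (A[0][0] * b[1] - b[0] * A[1][0]) / d]

def newton_step_inv(X):
    """One step X -> (X + X^{-1})/2 via an explicit inverse."""
    Xi = inv2(X)
    return [[0.5 * (X[i][j] + Xi[i][j]) for j in range(2)] for i in range(2)]

def newton_step_solve(X):
    """The same step, obtained by solving X Y = (X^2 + I)/2 column by
    column, never forming X^{-1} explicitly."""
    X2 = mm(X, X)
    B = [[0.5 * (X2[i][j] + (1.0 if i == j else 0.0)) for j in range(2)]
         for i in range(2)]
    cols = [solve2(X, [B[0][j], B[1][j]]) for j in range(2)]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

X0 = [[2.0, -5.0], [0.0, -3.0]]   # both variants produce identical steps
```

Since X Y = (X^2 + I)/2 gives Y = X^(-1)(X^2 + I)/2 = (X + X^(-1))/2, both routines agree to machine precision; Algorithm 1 applies the same trick to the fifth-order iteration.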

Use Approximation of the Matrix Inverse.
Another tendency is to use matrix multiplication-rich iterations which retain the convergence rate of the method. For example, the inverse X_k^(-1) in (5) can be replaced by one step of the Schulz method for the matrix inverse, which has the form X_k(2I − X_k^2). This replacement produces the Newton-Schulz iteration X_{k+1} = (1/2) X_k (3I − X_k^2),

(1) Given a suitable X_0 ∈ C^(n×n)
(2) for k = 0, 1, ... until convergence do
(3) M_k = I + 20X_k^2 + 25X_k^4 + 2X_k^6
(4) solve M_k X_{k+1} = 7X_k + 30X_k^3 + 11X_k^5 for X_{k+1}
(5) end for
Algorithm 1: The new method for computing the matrix sign.
This iteration is multiplication-rich and retains the quadratic convergence of Newton's method. However, it is only locally convergent, with convergence guaranteed for ||I − A^2|| < 1 (see [1]). We apply the same idea to (12): replacing the matrix inverse in (12) by one step of the Schulz method yields the following inverse-free iterative rule (30) for computing the matrix sign function. Formula (30) is multiplication-rich, with convergence guaranteed for ||I − A^2|| < 1.
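The local nature of the Newton-Schulz iteration is already visible in the scalar case x ← x(3 − x^2)/2, which needs |1 − x^2| < 1. A short Python sketch (starting values are illustrative):

```python
def newton_schulz_sign(x, tol=1e-14, max_iter=200):
    """Inverse-free Newton-Schulz iteration x <- x(3 - x^2)/2 (scalar form).
    Only locally convergent: it needs |1 - x^2| < 1."""
    for _ in range(max_iter):
        x_new = 0.5 * x * (3.0 - x * x)
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

print(newton_schulz_sign(1.3))    # inside the basin: |1 - 1.69| < 1
print(newton_schulz_sign(-0.6))   # inside the basin: |1 - 0.36| < 1
```

Starting values that violate the condition can fail badly: x_0 = 2 is mapped straight to −1, the wrong sign, and still larger values diverge, which is why the inverse-free variant is reserved for the final phase of the computation.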
Note that inverse-free algorithms are well suited to implementation on vector and parallel computers. The iterative scheme (30) requires 9 matrix multiplications per step, while (12) requires 6 matrix multiplications and one matrix inversion per step to achieve fifth order of convergence.

A Hybrid Method.

An efficient way to find the sign while avoiding most matrix inversions is to use (12) or (33) in the initial iterations, until X_k^2 is close enough to I, and to apply (30) in the subsequent stage. Such a switching approach was originally proposed in [19]. A similar idea is exploited in Algorithm 2.
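A scalar sketch of the switching idea (Newton's rule stands in for (12)/(33), Newton-Schulz for (30); the threshold 0.5 is an illustrative choice):

```python
def hybrid_sign(x, tol=1e-14, switch=0.5, max_iter=200):
    """Hybrid scheme (scalar sketch): inverse-based Newton steps while the
    residual |x^2 - 1| is large, inverse-free Newton-Schulz steps once
    |x^2 - 1| < switch, i.e. once we are safely inside the local basin."""
    for _ in range(max_iter):
        if abs(x * x - 1.0) < switch:
            x_new = 0.5 * x * (3.0 - x * x)   # multiplication-rich phase
        else:
            x_new = 0.5 * (x + 1.0 / x)       # globally convergent phase
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

print(hybrid_sign(1000.0))   # global phase brings 1000 down, local phase finishes
```

The globally convergent phase contracts even far-away starting values toward ±1; once the residual test passes, the remaining steps are inversion-free.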

Scaling Method.
For lower order methods, such as (5), the convergence is slow at the beginning. In fact, once the error is sufficiently small (in practice, less than, say, 0.5), successive errors decrease rapidly, each being approximately the square of the previous one. However, the initial convergence can be slow if the iterate X_k has a large eigenvalue, that is, in the case ||X_k|| ≫ 1. Hence, a scaling approach that accelerates this beginning phase is necessary; for Newton's method it reads [7]

X_0 = A,  X_{k+1} = (1/2)(μ_k X_k + (μ_k X_k)^(-1)),  (31)

where μ_k is the scaling parameter computed by (32); a standard choice is the norm scaling μ_k = (||X_k^(-1)||/||X_k||)^(1/2). Such an approach refines the initial matrix and provides a much more robust iterate, so that the convergence phase is reached rapidly.
The new iteration (12) is quite fast and reliable, as discussed in Sections 2 and 3. However, the way is open for speeding up its initial phase of convergence via the concept of scaling.
An effective way to enhance the initial speed of convergence is to scale the iterates prior to each iteration; that is, X_k is replaced by μ_k X_k. For the new method this reads

X_0 = A, with μ_k the scaling parameter computed by (32),
X_{k+1} = (7μ_k X_k + 30μ_k^3 X_k^3 + 11μ_k^5 X_k^5)(I + 20μ_k^2 X_k^2 + 25μ_k^4 X_k^4 + 2μ_k^6 X_k^6)^(-1),  (33)

where lim_{k→∞} μ_k = 1 and lim_{k→∞} X_k = S. Algorithm 3 illustrates an efficient method based on (33) for finding S.
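The effect of scaling is easiest to see on a diagonal matrix, where the iteration acts entrywise; the sketch below uses Newton's rule (again a stand-in for (33)) and assumes the norm scaling μ_k = (||X_k^(-1)||/||X_k||)^(1/2):

```python
def newton_diag_sign(diag, scaled, tol=1e-12, max_iter=200):
    """Newton sign iteration on a diagonal matrix stored as its diagonal,
    optionally with norm scaling mu_k = sqrt(||X_k^{-1}|| / ||X_k||).
    For a diagonal matrix, ||X|| is just max |d_i| and ||X^{-1}|| is
    max 1/|d_i|."""
    d = list(diag)
    for k in range(1, max_iter + 1):
        mu = 1.0
        if scaled:
            mu = (max(1.0 / abs(v) for v in d) / max(abs(v) for v in d)) ** 0.5
        d_new = [0.5 * (mu * v + 1.0 / (mu * v)) for v in d]
        if max(abs(a - b) for a, b in zip(d_new, d)) <= tol:
            return d_new, k
        d = d_new
    return d, max_iter

d_plain, k_plain = newton_diag_sign([1000.0, -2.0], scaled=False)
d_scaled, k_scaled = newton_diag_sign([1000.0, -2.0], scaled=True)
# Scaling cuts the slow initial phase: k_scaled is much smaller than k_plain.
```

For diag(1000, −2), the scaled run balances the spectrum immediately and reaches the convergence phase in a couple of steps, while the unscaled run spends over a dozen iterations merely halving the large eigenvalue.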

Experimental Results
There are basically two general ways to terminate a matrix iteration for computing S: stop when X_k has relative error below a tolerance, or stop when some suitable residual (such as ||X_k^2 − I||) is below the tolerance. The relevant choice is in general problem dependent; nevertheless, the stopping criterion is important in matrix methods.
The stopping criterion considered in this section is the safe strategy introduced in [1], based on the relative change ||X_{k+1} − X_k||/||X_{k+1}|| ≤ ε (34). For the comparisons, we have used the globally convergent matrix methods (5), (6), and (12), implemented in Mathematica 8 with built-in precision [21]. The results of the comparisons, in terms of the number of iterations, are reported in Table 2. We remark that the closer the eigenvalues of a matrix lie to the imaginary axis, the slower the convergence of the different methods becomes, and the greater the risk of encountering singular matrices X_k whose inverses cannot be computed.
In this example, we used the stopping criterion (34) with ε = 10^(-10), the matrix 1-norm, and X_0 = A as the initial matrix.
Example 6. In order to confirm the acceleration due to scaling, we have rerun Example 5 with norm scaling. The results are summarized in Table 3. The numerics reverify the effectiveness of the new method (12) in finding the matrix sign.

Summary
Interest in the sign function grew steadily from the 1970s and 1980s onward, initially among engineers and later among numerical analysts. Following this trend, in this study we proposed a new fifth-order iterative method for computing the matrix sign S.
By applying basins of attraction in the complex plane, based on the results from [20], we concluded that the introduced method (12) is globally convergent. We then proved theoretically that it is asymptotically stable. Numerical experiments were presented in order to show its faster convergence, in the sense of a smaller number of iterations.
Several modifications of the introduced method have been established, such as a new inversion-free method, a hybrid method, and a scaled version.

Example 5. We used the Mathematica function Inverse[] to compute the required matrix inverses; its implementation is based on efficient numerical row reduction (Gaussian elimination). In this test, we examine the behavior of the different iterative methods for computing the matrix sign function of 10 randomly generated complex 70 × 70 matrices, constructed as follows:

n = 70; number = 10;
SeedRandom[12345];
Table[A[l] = RandomComplex[{-5 - I, 5 + I}, {n, n}];, {l, number}];

Table 2: Results of comparisons for Example 5.

Table 3: Results of comparisons for Example 6 with norm scaling.