An Algorithm for Computing Geometric Mean of Two Hermitian Positive Definite Matrices via Matrix Sign

Using the relation between a principal matrix square root and its inverse with the geometric mean, we present a fast algorithm for computing the geometric mean of two Hermitian positive definite matrices. The algorithm is stable and possesses a high convergence order. Some experiments are included to support the proposed computational algorithm


Introduction
This paper tries to provide a fast algorithm for finding the geometric mean of two Hermitian positive definite matrices via the application of matrix sign function.It is known that the scalar arithmetic-geometric mean agm(, ) of two (nonnegative) numbers  and  is defined by starting with  0 =  and  0 =  and then iterating +1 = √    ,  = 0, 1, 2, . . ., until   =   to the desired precision.The scalar sequences of {  }, {  } converge to each other.Note that in (1), √    is the geometric mean of two positive numbers   and   per computing cycle.Although this iterative formula is quite easy and reliable for two scalars, its extension for general square and nonsingular matrices is not an easy task.In this work, we are concerned with the matrix case of two square Hermitian positive definite matrices.
A right definition of the matrix geometric mean GM(, ) of two positive definite matrices  and  can be expressed by GM (, ) := ( −1 ) where given a square matrix  having no nonpositive real eigenvalues,  1/2 denotes the unique solution of the quadratic matrix equation whose eigenvalues lie in the right half plane.The definition (2) was given in the seventies by Pusz and Woronowicz [1].
There are some other important definitions for computing the matrix geometric mean of two matrices; see for example, the work of Lawson-Lim [2].Note that when just two matrices are involved the theory is well developed, but in case of finding the matrix geometric mean of more than two matrices, the formulation is kind of hard; see for more [3].
A variant formulation for the scalar case of geometric mean could be expressed by gm (, ) = √  = (   ) As could be seen in ( 4), the computation of scalar geometric mean is fully related to the square root of the positive scalar  and its inverse.A similar and significant definition for a matrix geometric mean has been developed and suggested by Bhatia in [4] as follows: for two Hermitian positive definite (HPD) matrices  and .
We use the notation # to show the geometric mean of two HPD matrices.Note that the formulas (2) and ( 5) coincide and are equal.For more, refer to [5].
The rest of this paper is organized as follows.In Section 2 we combine the root-finding Newton's method and a Chebyshev-Halley-type method to devise a fast iterative formula for solving nonlinear equations.In Section 3 we provide the link between nonlinear equation solvers and the calculation of some special matrix functions.This will illustrate how the new algorithms could be constructed and implemented.An implementation of the proposed iterative formula in symbolic software Mathematica [6] along a discussion about the stability of the scheme will be provided therein.Note that the idea of computing the geometric mean using the sign function can also be found in [7].In Section 4 we show the numerical results and highlight the benefit of the proposed technique.Conclusions are drawn in Section 5.

Construction of a New Method
It is known that a common way for improving the convergence speed of iterative methods for solving nonlinear scalar smooth equations is to combine the already developed schemes.Hence, let us first combine the well-known method of Newton into a special method of Chebyshev-Halley-type scheme [8] to derive a new iterative scheme as follows: wherein   =   (  )(  )/  (  ) 2 .For obtaining a background about nonlinear equations, one may consult [9,10].
Theorem 1.Let  ∈  be a simple zero of a sufficiently differentiable function  :  ⊆ R → R for an open interval , which contains  0 as an initial approximation of .Then, the iterative expression (6) without memory satisfies the error equation below wherein Proof.The proof of this theorem is based on Taylor's series expansion of the iterative method ( 6) around the solution in the th iterate.To save the space and not to be distracted from the main topic, we here exclude the proof.The steps of the proof are similar to those taken in [11].
The iterative method (6) reaches sixth-order convergence using five functional evaluations and thus achieves the efficiency index 6 1/5 ≈ 1.430, which is higher than that of Newton; that is, 2 1/2 ≈ 1.414.Furthermore, applying (6) for solving the polynomial equation () ≡  2 − 1 = 0 has global convergence (except the points lying on the imaginary axis).This global behavior which could easily be shown by drawing the basins of attraction (6) for () is useful in practical matrix problems so as to allow us deal with all kinds of HPD matrices with any spectra.
Remark 2. Note that since the global behavior can be observed from the basins of attraction, then we do not pursue this fact by theoretical analysis for (6).
A recent discussion about the relation between matrix sign and nonlinear equation solvers is given in [12,13].We state that all of such extensions in the scalar case and studying their orders are of symbolic computational nature.Such an extension to the matrix environment will also be done in the next section using symbolic computations.

Construction of an Algorithm for Geometric Mean
The relation between Section 2 and our main aim for computing matrix geometric mean is not clear at the first sight.
To illustrate this, we recall that many of the important matrix functions such as matrix square root, that is, the solution to the matrix equation (3), and matrix sign function which is the solution to the following matrix equation: can be calculated approximately by matrix iteration functions.Such fixed-point type methods are convergent under special conditions to the aimed matrix function and are basically originated from root-finding methods [7].
On the other hand, the important definition of the matrix geometric mean (5) requires the computation of the matrix square root of  and its inverse.Hence, in order to design an efficient algorithm for (5), we wish to construct an iterative expression in which we compute both  1/2 and  −1/2 at the same time.
Toward this aim, we apply the new nonlinear equation solver (6) for solving the matrix equation (8).This application would yield the following matrix iteration in its reciprocal form: with a proper  0 for finding the sign matrix.
Remark 3. The iteration ( 9) is not a member of the Padé family of iterations introduced in [14] for matrix sign.
In order to connect (9) with our aim, we remind an important identity as follows [7]: which indicates an important relationship between principal matrix square root  1/2 and the matrix sign function.
The identity (10) has an advantage which is the computation of the principal inverse matrix square root along with the principal matrix square root at the same time.Let us in what follows first study the stability behavior of (9).Proof.Let Δ  be a numerical perturbation introduced at the th iterate of (9).We obtain All terms that are quadratic in their errors are removed from our analysis.This formal manipulation is meaningful if Δ  is sufficiently small.We have Note that the commutativity between   and Δ  is not used throughout this lemma.To simplify this, the following identity will be used (for any nonsingular matrix  and the matrix ): Simplifying yields where  2 = ,  −1 = , and for enough large , we assumed   ≈ .After some algebraic manipulations and using Δ +1 = X+1 −  +1 ≈ X+1 − , we conclude that Applying ( 15) recursively, we have From ( 16), we can conclude that the perturbation at the iterate +1 is bounded.This allows to conclude that a perturbation at a given step is bounded at the subsequent steps.Therefore, the sequence {  } generated by ( 9) is asymptotically stable.
Note that Lemma 4 just shows the asymptotical stability.In general, the fixed-point iteration  +1 = (  ) is stable in a neighborhood of a fixed point  if Fréchet derivative   has bounded powers [7].Furthermore, if  +1 = (  ) is any super-linearly convergent iteration for  = sign( 0 ), then   () =   () = (1/2)( − ), where   is the Fréchet derivative of the matrix sign function at . Hence   is idempotent (  ∘   =   ) and the iteration is stable, thus all sign iterations, such as (9), are automatically stable.
It is now easy to deduce the following convergence theorem for (9).
Proof.The convergence of rational iterations can be analyzed in terms of the convergence of the eigenvalues of the matrices   .The reason for this is that if  has a Jordan decomposition  =  −1 , then () = () −1 .Let  have a Jordan canonical form arranged as  −1  = Λ, where  is a nonsingular matrix.It is also known that [7] sign If we define   =  −1   , then from the method (9), we obtain Notice that if  0 is a diagonal matrix then, based on an inductive proof, all successive   are diagonal too.From (18), it is enough to prove that {  } converges to sign(Λ), in order to ensure the convergence of the sequence generated by (9).We can write (18) as  uncoupled scalar iterations to solve () =  2 − 1 = 0, given by where    = (  ) , and 1 ≤  ≤ .From (18) and (19), it is enough to study the convergence of {   } to sign(  ), for all 1 ≤  ≤ .
From (19) and since the eigenvalues of  are not pure imaginary, we have that sign(  ) =   = ±1.Thus, we attain and lim  → ∞ |   | = 1 = |sign(  )|.This shows that {   } is convergent.Now, it could be easy to conclude that lim  → ∞   = sign(Λ).Finally, we have lim The proof is complete.
The iteration (9) requires one matrix inversion before computing step and obtains both  1/2 and  −1/2 which are of interest in (5).The implementation of (9) for computing principal square roots requires a sharp attention so as to save much effort.Since the intermediate matrices are all sparse (at least half of the entries are zero), then one could simply use sparse approximation techniques to save up the memory and time.
An implementation of ( 9) to compute # for two HPD matrices in the programming package Mathematica is brought forward as in (Algorithm 1).
In Algorithm 1, the four-argument function MGM computes # by entering the four arguments "matrix , " "matrix , " "the maximum number of iterations that ( 9) is allowed to take, " and the "tolerance" of the stopping termination in the infinity norm ‖ +1 −   ‖ ∞ /‖ +1 ‖ ∞ ≤ tolerance.
Note that for computing the principal matrix square root of ( −1/2  −1/2 ) 1/2 , we have used the Jordan Canonical approach, which has been provided in the general form for computing matrix functions in the code FunM.
The proposed approach converges to the solution matrix in 2 iterations, which shows a completely fast convergence.
Example 7. We now consider the two HPD matrices as follows: We compare the behavior of different methods and report the numerical results using  ∞ for all norms involved with the stopping criterion ‖ +1 −   ‖ ∞ /‖ +1 ‖ ∞ ≤  = 10 −5 in Figure 1.As could be seen, the numerical results are in harmony with the theoretical aspects of Section 3 and show a fast convergence for the proposed method (9) instead of DB, NM, and CHM.

Conclusions
Based on the quadratical Newton's scheme and the cubical method of Chebyshev-Halley, we have developed an iterative method with sixth order of convergence for solving nonlinear scalar smooth equations.The computational efficiency showed its superiority in contrast to NM.Then, the method with global behavior for finding matrix sign function has been extended for computing the principal matrix square root and its inverse.This procedure was followed so as to build a fast algorithm for finding the matrix geometric mean of two HPD matrices.We also have studied the asymptotical stability for the proposed technique.
To illustrate the new technique some numerical examples were presented.Computational results have justified robust and efficient convergence behavior of the proposed method.Similar numerical experimentations, carried out for a number of problems of different types, confirmed the above conclusions to a great extent.
We conclude the paper with the remark that in many numerical applications high precision in computations is required.The results of numerical experiments show that the high order efficient methods such as (9) associated with a multiple-precision arithmetic are very useful, because they yield a clear reduction in the number of iterates.