A Fast Newton-Shamanskii Iteration for a Matrix Equation Arising from M/G/1-Type Markov Chains

For the nonlinear matrix equations arising in the analysis ofM/G/1-type and GI/M/1-typeMarkov chains, the minimal nonnegative solution G or R can be found by Newton-like methods. We prove monotone convergence results for the Newton-Shamanskii iteration for this class of equations. Starting with zero initial guess or some other suitable initial guess, the Newton-Shamanskii iteration provides amonotonically increasing sequence of nonnegativematrices converging to theminimal nonnegative solution. A Schur decompositionmethod is used to accelerate theNewton-Shamanskii iteration.Numerical examples illustrate the effectiveness of the Newton-Shamanskii iteration.


Introduction
Some necessary notation for this article is as follows.For any matrix  = [  ] ∈ R × ,  ≥ 0 ( > 0) if   ≥ 0 (  > 0) for all , ; for any matrices ,  ∈ R × ,  ≥  ( > ) if   ≥   (  >   ) for all , ; the vector with all entries equal to one is denoted by ; that is,  = (1, 1, . . ., 1)  ; and the identity matrix is denoted by .An M/G/1-type Markov Chain (MC) is defined by a transition probability matrix of the form while the transition probability matrix of a GI/M/1-type MC is as follows: where  0 ∈ R  0 × 0 and  1 ∈ R × , respectively. is the smallest index  such that   , for  > , is (numerically) zero.The steady-state probability vector of an M/G/1-type MC, if it exists, can be expressed in terms of a matrix  that is the element-wise minimal nonnegative solution to the nonlinear matrix equation [1] Similarly, for the GI/M/1-type MC, a matrix  is of practical interest, which is the element-wise minimal nonnegative solution to the nonlinear matrix equation [2] For a large number of stochastic models, most steady-state and certain transient characteristics of the process can be expressed in terms of the minimal nonnegative solution described above.This problem arises in the analysis of queues with phase type service times, as well as in queues that can be represented as quasi-birth-death processes [3].It is known that any M/G/1-type MC can be transformed into a GI/M/1type MC and vice versa through either the Ramaswami [4] or Bright [5] dual, and the () matrix can be obtained directly in terms of the () matrix of the dual chain.The drift of the chain is defined by where  is the stationary probability vector of the irreducible stochastic matrix  = ∑  =0   and  = ∑  =1   .The MC is positive and recurrent if  < 1, null and recurrent if  = 1, and transient if  > 1; and throughout this article it is assumed that  ̸ = 1.Available algorithms for finding the minimal nonnegative solution to (3) include functional iterations [1], pointwise cyclic reduction (CR) [6], the invariant subspace (IS) approach [7], the Ramaswami reduction (RR) [8], and the Newton iteration (NI) [3,[9][10][11].For a detailed comparison of these algorithms, we refer the readers to [11] and the references therein.In [11], Newton's iteration is revisited and accelerated.From numerical experience, the fast Newton's iteration in [11] is a very competitive algorithm.
In this paper, we consider the Newton-Shamanskii iteration for (3).It is shown that, starting with a suitable initial guess, the sequence generated by the Newton-Shamanskii iteration is monotonically increasing and converges to the minimal nonnegative solution of (3).Similar to Newton's iteration, equation involved in the Newton-Shamanskii iteration step is a linear equation of the form ∑ −1 =0     = , which can be solved fast by a Schur decomposition method.The Newton-Shamanskii iteration differs from Newton's iteration as the Fréchet derivative is not updated at each iteration; therefore the special coefficient matrix structure form can be reused.
The paper is organized as follows.The Newton-Shamanskii iteration and its accelerated iterative procedure using a Schur decomposition method are given in Section 2. Then M/G/1-type MCs with low-rank downward transitions and low-rank local and upward transitions are considered in Sections 3 and 4, respectively.Numerical results in Section 6 show that the fast Newton-Shamanskii iteration can be more efficient than the fast Newton's iteration proposed in [11].Final conclusions are presented in Section 6.

Newton-Shamanskii Iteration
In this section, we present the Newton-Shamanskii iteration for (3).First we rewrite (3) as The function G is a mapping from R × into itself and the Fréchet derivative of G at  is a linear map G   : R × → R × given by The second derivative at , G   : R × → R × , is given by For a given initial guess  0,0 , the Newton-Shamanskii iteration for the solution of G() = 0 is as follows.
For  = 0, 1, . .., ,−1 is the solution to which, after rearranging the terms, can be rewritten as Following the notation of [11], we define  , = ∑  =    −  ; then the above equation is which is a linear equation of the same form ∑ −1 =0     =  as the Newton's iteration step.It can be solved fast by applying a Schur decomposition on the matrix , which is the  ×  matrix   here, and then solving  linear systems with  unknowns and equations.For the detailed description for solving ∑ −1 =0     = , we refer the reader to [11,12].We stress that, for Newton-Shamanskii iteration, the coefficient matrices are updated once after every   iteration step and the special coefficient structure can be reused, so the cost per iteration step is reduced significantly.

The Case of Low-Rank Downward Transitions
When the matrix  0 is of rank , meaning that it can be decomposed as  0 = Â0 Γ with Â0 ∈ R × and Γ ∈ R × , we refer to the MC as having low-rank downward transitions.If Newton-Shamanskii iteration is applied to this case, all the matrices  ,−1 can be written as X,−1 Γ.This can be shown by making induction on the index . 0,0 can be written as X0,0 Γ and we assume that it is true for all  ,−1 for  = 0, . . .,  and  = 1, . . .,  − 1.Hence,  ,−1 can be written as Ĝ,−1 Γ, since 12) can be rewritten as therefore  ,−1 can be decomposed as the product of an × matrix X,−1 and an  ×  matrix Γ.The inverse on the righthand side exists since is strictly less than one [13].Therefore we will concentrate on finding X,−1 as the solution to which can be rewritten as We can use the Schur decomposition method in [11,12] to solve the above equation.Different from Newton's iteration in [11], the special coefficient structure can be reused here, thus saving the overall computational cost.We will report the numerical performance of the Newton-Shamanskii iteration in Section 6.

The Case of Low-Rank Local and Upward Transitions
In this section, the case of low-rank local and upward transitions is considered, where the × matrices {  , 1 ≤  ≤ } can be decomposed as   = Γ Â with Γ ∈ R × and Â ∈ R × .To exploit low-rank local and upward transitions, we introduce the matrix , which is the generator of the censored Markov chain on level , starting from level , before the first transition on level  − 1.The following equality holds based on a level crossing argument: For the case of low-rank local and upward transitions, we can rewrite  as which means that  is of rank , while  = ( − ) −1  0 is generally of rank .Therefore, we find  as the solution to and get  from  = ( − ) −1  0 [11,14].The Newton-Shamanskii iteration step for ( 19) is as follows.

Convergence Analysis
There is monotone convergence when the Newton-Shamanskii method is applied to (3).

Preliminary.
Let us first recall that a real square matrix  is a -matrix if all its off-diagonal elements are nonpositive and can be written as  −  with  ≥ 0.Moreover, a -matrix  is called an -matrix if  ≥ (), where (⋅) is the spectral radius; it is a singular -matrix if  = () and a nonsingular -matrix if  > ().The following result from [15] is to be exploited.
Lemma 1.For a -matrix , the following statements are equivalent: (a)  is a nonsingular -matrix.
(d) All eigenvalues of  have positive real parts.
The following result is also well known [15].

Lemma 2.
Let  be a nonsingular -matrix.
The minimal nonnegative solution  for (3) may also be recalled-see [9] for details.Theorem 3. If the rate  defined by (5) satisfies  ̸ = 1, then the matrix is a nonsingular -matrix.

Monotone Convergence.
The following lemma displays the monotone convergence properties of the Newton iteration for (3).
Lemma 4. Consider a matrix  such that Then the matrix is well defined, and Proof.G   is invertible and the matrix  is well defined from (iii) and Lemma 1.Since from (iii) and Lemma 1 and G() ≥ 0, we get that vec() ≥ vec() and thus  ≥ .From ( 27) and the Taylor formula, there exists a number  1 , 0 <  1 < 1, such that where 0 <  2 < 1, we have where the last inequality is from − ≥ 0 by (ii).It is notable that is a nonsingular -matrix, so vec( − ) ≥ 0 from Lemma 1; that is,  −  ≥ 0. Now  ≥ , so (b) follows.Next we prove (c).From 0 ≤  ≤ , we have and we know that A generalization of Lemma 4 provides the theoretical basis for the monotone convergence of the Newton-Shamanskii method for (3).Lemma 5. Consider a matrix  such that Then, for any matrix , where 0 ≤  ≤ , the matrix Proof.Since 0 ≤  ≤ , we have From (iii) and Lemma 2, G   is invertible and the matrix  is well defined such that 0 ≤  ≤ .Let Up to now, we do not have theoretical results about when this modified Newton method can outperform the Newton iteration.This is not an easy question.We leave it as a goal for future research.