Mathematical Problems in Engineering, Hindawi, Volume 2017, Article ID 4018239, https://doi.org/10.1155/2017/4018239

Research Article

A Fast Newton-Shamanskii Iteration for a Matrix Equation Arising from M/G/1-Type Markov Chains

Pei-Chang Guo (ORCID: 0000-0003-4127-8036), School of Science, China University of Geosciences, Beijing 100083, China

Academic Editor: Nunzio Salerno

Received 16 June 2017; Revised 18 September 2017; Accepted 28 September 2017; Published 19 October 2017

Copyright © 2017 Pei-Chang Guo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

For the nonlinear matrix equations arising in the analysis of M/G/1-type and GI/M/1-type Markov chains, the minimal nonnegative solution G or R can be found by Newton-like methods. We prove monotone convergence results for the Newton-Shamanskii iteration for this class of equations. Starting with the zero matrix or some other suitable initial guess, the Newton-Shamanskii iteration provides a monotonically increasing sequence of nonnegative matrices converging to the minimal nonnegative solution. A Schur decomposition method is used to accelerate the Newton-Shamanskii iteration. Numerical examples illustrate the effectiveness of the Newton-Shamanskii iteration.

1. Introduction

Some necessary notation for this article is as follows. For any matrix B = [b_{ij}] \in R^{n \times n}, we write B \ge 0 (B > 0) if b_{ij} \ge 0 (b_{ij} > 0) for all i, j; for any matrices A, B \in R^{n \times n}, A \ge B (A > B) if a_{ij} \ge b_{ij} (a_{ij} > b_{ij}) for all i, j; the vector with all entries equal to one is denoted by e, that is, e = (1, 1, \dots, 1)^T; and the identity matrix is denoted by I. An M/G/1-type Markov chain (MC) is defined by a transition probability matrix of the form

(1) P = \begin{pmatrix} B_0 & B_1 & B_2 & B_3 & \cdots \\ C & A_1 & A_2 & A_3 & \cdots \\ & A_0 & A_1 & A_2 & \cdots \\ & & A_0 & A_1 & \cdots \\ & & & \ddots & \ddots \end{pmatrix},

while the transition probability matrix of a GI/M/1-type MC is as follows:

(2) P = \begin{pmatrix} B_0 & C & & & \\ B_1 & A_1 & A_0 & & \\ B_2 & A_2 & A_1 & A_0 & \\ B_3 & A_3 & A_2 & A_1 & \ddots \\ \vdots & \vdots & \vdots & \ddots & \ddots \end{pmatrix},

where B_0 \in R^{m_0 \times m_0} and A_i \in R^{m \times m}. Here N is the smallest index such that A_i is (numerically) zero for all i > N. The steady-state probability vector of an M/G/1-type MC, if it exists, can be expressed in terms of a matrix G that is the elementwise minimal nonnegative solution to the nonlinear matrix equation

(3) G = \sum_{i=0}^{N} A_i G^i.

Similarly, for a GI/M/1-type MC, the matrix of practical interest is R, the elementwise minimal nonnegative solution to the nonlinear matrix equation

(4) R = \sum_{i=0}^{N} R^i A_i.

For a large number of stochastic models, most steady-state and certain transient characteristics of the process can be expressed in terms of the minimal nonnegative solution described above. This problem arises in the analysis of queues with phase-type service times, as well as in queues that can be represented as quasi-birth-death processes. It is known that any M/G/1-type MC can be transformed into a GI/M/1-type MC and vice versa through either the Ramaswami or the Bright dual, and the G (R) matrix can be obtained directly in terms of the R (G) matrix of the dual chain. The drift of the chain is defined by

(5) \rho = p^T \beta,

where p is the stationary probability vector of the irreducible stochastic matrix A = \sum_{i=0}^{N} A_i and \beta = \sum_{i=1}^{N} i A_i e. The MC is positive recurrent if \rho < 1, null recurrent if \rho = 1, and transient if \rho > 1; throughout this article it is assumed that \rho < 1.
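To make the drift computation (5) concrete, the following Python sketch (toy 2x2 data of our own choosing, not from the paper; the paper's experiments use MATLAB) evaluates \rho for a chain with N = 2:

```python
import numpy as np

# Toy chain with N = 2: A = A0 + A1 + A2 is an irreducible stochastic
# matrix, beta = A1*e + 2*A2*e, and the drift is rho = p^T beta.
S = np.array([[0.5, 0.5],
              [0.2, 0.8]])
A0, A1, A2 = 0.6 * S, 0.3 * S, 0.1 * S
A = A0 + A1 + A2                       # equals S, row stochastic

# stationary vector p: p^T A = p^T, normalized so that p^T e = 1
w, V = np.linalg.eig(A.T)
p = np.real(V[:, np.argmin(np.abs(w - 1.0))])
p = p / p.sum()

e = np.ones(2)
beta = A1 @ e + 2 * A2 @ e
rho = float(p @ beta)
print(rho)                             # 0.5 < 1: positive recurrent
```

Since S is stochastic, \beta = 0.5 e here and the drift is exactly 0.5, independent of p.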

Available algorithms for finding the minimal nonnegative solution to (3) include functional iterations, pointwise cyclic reduction (CR), the invariant subspace (IS) approach, the Ramaswami reduction (RR), and the Newton iteration (NI) [3, 9-11]. For a detailed comparison of these algorithms, we refer the reader to [11] and the references therein. In [11], Newton's iteration is revisited and accelerated. From numerical experience, the fast Newton's iteration in [11] is a very competitive algorithm.

In this paper, we consider the Newton-Shamanskii iteration for (3). It is shown that, starting with a suitable initial guess, the sequence generated by the Newton-Shamanskii iteration is monotonically increasing and converges to the minimal nonnegative solution of (3). As with Newton's iteration, the equation to be solved at each Newton-Shamanskii step is a linear equation of the form \sum_{j=0}^{N-1} B_j X C_j = E, which can be solved efficiently by a Schur decomposition method. The Newton-Shamanskii iteration differs from Newton's iteration in that the Fréchet derivative is not updated at every step; therefore the special structure of the coefficient matrices can be reused.

The paper is organized as follows. The Newton-Shamanskii iteration and its accelerated version based on a Schur decomposition method are given in Section 2. M/G/1-type MCs with low-rank downward transitions and with low-rank local and upward transitions are considered in Sections 3 and 4, respectively. The convergence analysis is presented in Section 5. Numerical results in Section 6 show that the fast Newton-Shamanskii iteration can be more efficient than the fast Newton's iteration proposed in [11]; final conclusions are also presented in Section 6.

2. Newton-Shamanskii Iteration

In this section, we present the Newton-Shamanskii iteration for (3). First we rewrite (3) as

(6) \mathcal{G}(X) = \sum_{v=0}^{N} A_v X^v - X = 0.

The function \mathcal{G} is a mapping from R^{m \times m} into itself, and the Fréchet derivative of \mathcal{G} at X is the linear map \mathcal{G}'_X : R^{m \times m} \to R^{m \times m} given by

(7) \mathcal{G}'_X(Z) = \sum_{v=1}^{N} \sum_{j=0}^{v-1} A_v X^j Z X^{v-1-j} - Z.

The second derivative at X, \mathcal{G}''_X : R^{m \times m} \times R^{m \times m} \to R^{m \times m}, is given by

(8) \mathcal{G}''_X(Z_1, Z_2) = \sum_{v=2}^{N} \sum_{j=0}^{v-1} A_v \Big( \sum_{i=0}^{j-1} X^i Z_2 X^{j-1-i} \Big) Z_1 X^{v-1-j} + \sum_{v=2}^{N} \sum_{j=0}^{v-2} A_v X^j Z_1 \Big( \sum_{i=0}^{v-2-j} X^i Z_2 X^{v-2-j-i} \Big).
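The derivative formula (7) can be sanity-checked numerically. The Python sketch below (randomly generated illustrative data, not from the paper) compares \mathcal{G}'_X(Z) against a finite-difference quotient of \mathcal{G}:

```python
import numpy as np

def G_fun(X, A):
    # residual map (6): sum_v A_v X^v - X
    m = X.shape[0]
    P = np.eye(m)
    S = -X.copy()
    for Av in A:
        S = S + Av @ P
        P = P @ X
    return S

def G_prime(X, Z, A):
    # Frechet derivative (7): sum_{v,j} A_v X^j Z X^{v-1-j} - Z
    N = len(A) - 1
    pw = [np.linalg.matrix_power(X, t) for t in range(N)]
    out = -Z.copy()
    for v in range(1, N + 1):
        for j in range(v):
            out = out + A[v] @ pw[j] @ Z @ pw[v - 1 - j]
    return out

rng = np.random.default_rng(0)
A = [0.1 * rng.random((3, 3)) for _ in range(3)]   # N = 2, arbitrary data
X = 0.1 * rng.random((3, 3))
Z = rng.random((3, 3))
h = 1e-7
fd = (G_fun(X + h * Z, A) - G_fun(X, A)) / h
print(np.max(np.abs(fd - G_prime(X, Z, A))))       # O(h) agreement
```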

For a given initial guess G_{0,0} = G_0, the Newton-Shamanskii iteration for the solution of \mathcal{G}(X) = 0 is as follows.

For k = 0, 1, \dots,

(9) \mathcal{G}'_{G_{k,0}}(X_{k,s-1}) = -\mathcal{G}(G_{k,s-1}), \quad G_{k,s} = G_{k,s-1} + X_{k,s-1}, \quad s = 1, 2, \dots, n_k,

(10) G_{k+1} = G_{k+1,0} = G_{k,n_k}.

Here X_{k,s-1} is the solution to

(11) X_{k,s-1} - \sum_{v=1}^{N} \sum_{j=0}^{v-1} A_v G_k^j X_{k,s-1} G_k^{v-1-j} = \sum_{v=0}^{N} A_v G_{k,s-1}^v - G_{k,s-1},

which, after rearranging the terms, can be rewritten as

(12) X_{k,s-1} - \sum_{j=0}^{N-1} \Big( \sum_{v=j+1}^{N} A_v G_k^{v-1-j} \Big) X_{k,s-1} G_k^j = \sum_{v=0}^{N} A_v G_{k,s-1}^v - G_{k,s-1}.

Following the notation of [11], we define S_{k,i} = \sum_{j=i}^{N} A_j G_k^{j-i}; then the above equation is

(13) (S_{k,1} - I) X_{k,s-1} + \sum_{j=1}^{N-1} S_{k,j+1} X_{k,s-1} G_k^j = G_{k,s-1} - \sum_{v=0}^{N} A_v G_{k,s-1}^v,

which is a linear equation of the same form \sum_{j=0}^{N-1} B_j X C_j = E as in the Newton iteration step. It can be solved efficiently by applying a Schur decomposition to the matrix C, which here is the m \times m matrix G_k, and then solving m linear systems with m unknowns and m equations. For a detailed description of solving \sum_{j=0}^{N-1} B_j X C_j = E, we refer the reader to [11, 12]. We stress that, for the Newton-Shamanskii iteration, the coefficient matrices are updated only once every n_k steps and the special coefficient structure can be reused, so the cost per iteration step is reduced significantly.
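To show the structure of the scheme (9)-(10), here is a compact Python sketch on a toy problem (our own 2x2 data). For clarity it solves the correction equation in vec form with a dense Kronecker matrix, rather than the Schur decomposition method of [11, 12], so it only illustrates the iteration, not the fast linear solver:

```python
import numpy as np

def G_fun(X, A):
    """Residual map (6): G(X) = sum_{v=0}^N A_v X^v - X."""
    m = X.shape[0]
    P = np.eye(m)
    S = -X.copy()
    for Av in A:
        S = S + Av @ P
        P = P @ X
    return S

def newton_shamanskii(A, nk=2, tol=1e-13, maxit=100):
    """Iteration (9)-(10) from the zero initial guess; the vec-form
    derivative is rebuilt once per outer sweep and reused nk times."""
    N = len(A) - 1
    m = A[0].shape[0]
    Gk = np.zeros((m, m))
    for _ in range(maxit):
        pw = [np.linalg.matrix_power(Gk, t) for t in range(N)]
        M = np.eye(m * m)
        for v in range(1, N + 1):
            for j in range(v):
                # vec(A_v Gk^j Z Gk^{v-1-j}) = ((Gk^{v-1-j})^T kron A_v Gk^j) vec(Z)
                M -= np.kron(pw[v - 1 - j].T, A[v] @ pw[j])
        for _ in range(nk):          # nk corrections with frozen derivative
            R = G_fun(Gk, A)
            if np.max(np.abs(R)) < tol:
                return Gk
            x = np.linalg.solve(M, R.flatten(order='F'))
            Gk = Gk + x.reshape((m, m), order='F')
    return Gk

# toy positive-recurrent data (rho = 0.5)
S = np.array([[0.5, 0.5], [0.2, 0.8]])
A = [0.6 * S, 0.3 * S, 0.1 * S]
G = newton_shamanskii(A)
print(np.max(np.abs(G_fun(G, A))))   # residual near machine precision
print(G.sum(axis=1))                 # rows sum to 1: G is stochastic here
```

Since \rho < 1 in this example, the computed minimal solution G is stochastic, which provides a convenient correctness check.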

3. The Case of Low-Rank Downward Transitions

When the matrix A_0 is of rank r, meaning that it can be decomposed as A_0 = \hat{A}_0 \Gamma with \hat{A}_0 \in R^{m \times r} and \Gamma \in R^{r \times m}, we say that the MC has low-rank downward transitions. If the Newton-Shamanskii iteration is applied in this case, all the matrices X_{k,s-1} can be written as \hat{X}_{k,s-1} \Gamma. This can be shown by induction on the indices k and s. X_{0,0} can be written as \hat{X}_{0,0} \Gamma, and we assume that this holds for all X_{l,j-1} with l = 0, \dots, k and j = 1, \dots, s-1. Hence G_{k,s-1} can be written as \hat{G}_{k,s-1} \Gamma, since G_{k,s-1} = \sum_{l=0}^{k-1} \sum_{j=1}^{n_l} X_{l,j-1} + \sum_{j=1}^{s-1} X_{k,j-1} = (\sum_{l=0}^{k-1} \sum_{j=1}^{n_l} \hat{X}_{l,j-1} + \sum_{j=1}^{s-1} \hat{X}_{k,j-1}) \Gamma. Then (12) can be rewritten as

(14) X_{k,s-1} = \hat{A}_0 \Gamma + \sum_{j=1}^{N} A_j G_{k,s-1}^{j-1} \hat{G}_{k,s-1} \Gamma - \hat{G}_{k,s-1} \Gamma + \sum_{v=1}^{N} A_v G_k^{v-1} X_{k,s-1} + \sum_{j=1}^{N-1} \Big( \sum_{v=j+1}^{N} A_v G_k^{v-1-j} \Big) X_{k,s-1} G_k^{j-1} \hat{G}_k \Gamma = \Big( I - \sum_{v=1}^{N} A_v G_k^{v-1} \Big)^{-1} \Big[ \hat{A}_0 + \Big( \sum_{j=1}^{N} A_j G_{k,s-1}^{j-1} - I \Big) \hat{G}_{k,s-1} + \sum_{j=1}^{N-1} \Big( \sum_{v=j+1}^{N} A_v G_k^{v-1-j} \Big) X_{k,s-1} G_k^{j-1} \hat{G}_k \Big] \Gamma;

therefore X_{k,s-1} can be decomposed as the product of an m \times r matrix \hat{X}_{k,s-1} and the r \times m matrix \Gamma. The inverse on the right-hand side exists, since 0 \le \sum_{v=1}^{N} A_v G_k^{v-1} \le \sum_{v=1}^{N} A_v G^{v-1} and the spectral radius of \sum_{v=1}^{N} A_v G^{v-1} is strictly less than one. Therefore we concentrate on finding \hat{X}_{k,s-1} as the solution to

(15) \hat{X}_{k,s-1} = \hat{A}_0 + \Big( \sum_{j=1}^{N} A_j G_{k,s-1}^{j-1} - I \Big) \hat{G}_{k,s-1} + \sum_{v=1}^{N} A_v G_k^{v-1} \hat{X}_{k,s-1} + \sum_{j=1}^{N-1} \Big( \sum_{v=j+1}^{N} A_v G_k^{v-1-j} \Big) \hat{X}_{k,s-1} \Gamma G_k^{j-1} \hat{G}_k = \hat{A}_0 + \Big( \sum_{j=1}^{N} A_j G_{k,s-1}^{j-1} - I \Big) \hat{G}_{k,s-1} + \sum_{j=0}^{N-1} S_{k,j+1} \hat{X}_{k,s-1} (\Gamma \hat{G}_k)^j,

which can be rewritten as

(16) (S_{k,1} - I) \hat{X}_{k,s-1} + \sum_{j=1}^{N-1} S_{k,j+1} \hat{X}_{k,s-1} (\Gamma \hat{G}_k)^j = \Big( I - \sum_{j=1}^{N} A_j G_{k,s-1}^{j-1} \Big) \hat{G}_{k,s-1} - \hat{A}_0.

We can use the Schur decomposition method in [11, 12] to solve the above equation. In contrast to Newton's iteration in [11], the special coefficient structure can be reused here, thus saving overall computational cost. We report the numerical performance of the Newton-Shamanskii iteration in Section 6.
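The factorization X_{k,s-1} = \hat{X}_{k,s-1} \Gamma can be observed numerically. The following sketch (randomly generated illustrative data with r = 1 and N = 2, and a dense vec-form solve instead of the Schur method) runs a few Newton-Shamanskii sweeps on a problem with rank-one A_0 and checks the rank of every correction:

```python
import numpy as np

def G_fun(X, A):
    # residual map (6)
    m = X.shape[0]
    P = np.eye(m)
    S = -X.copy()
    for Av in A:
        S = S + Av @ P
        P = P @ X
    return S

rng = np.random.default_rng(1)
m, r = 4, 1
Gamma = rng.random((r, m))
A = [0.1 * rng.random((m, r)) @ Gamma,      # A0 = A0hat * Gamma, rank r
     0.05 * rng.random((m, m)),
     0.05 * rng.random((m, m))]             # strictly substochastic data

Gk = np.zeros((m, m))
ranks = []
for _ in range(6):                          # outer sweeps
    pw = [np.linalg.matrix_power(Gk, t) for t in range(2)]
    M = np.eye(m * m)
    for v in (1, 2):
        for j in range(v):
            M -= np.kron(pw[v - 1 - j].T, A[v] @ pw[j])
    for _ in range(2):                      # nk = 2 with frozen derivative
        R = G_fun(Gk, A)
        X = np.linalg.solve(M, R.flatten(order='F')).reshape((m, m), order='F')
        ranks.append(np.linalg.matrix_rank(X, tol=1e-10))
        Gk = Gk + X
print(ranks)    # every correction has rank <= r = 1
```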

4. The Case of Low-Rank Local and Upward Transitions

In this section, the case of low-rank local and upward transitions is considered, where the m \times m matrices \{A_i, 1 \le i \le N\} can be decomposed as A_i = \Gamma \hat{A}_i with \Gamma \in R^{m \times r} and \hat{A}_i \in R^{r \times m}. To exploit low-rank local and upward transitions, we introduce the matrix U, the generator of the censored Markov chain on level i, starting from level i, before the first transition to level i-1. The following equality holds, based on a level-crossing argument:

(17) U = \sum_{i=1}^{N} A_i G^{i-1} = \sum_{i=1}^{N} A_i ((I-U)^{-1} A_0)^{i-1}.

For the case of low-rank local and upward transitions, we can rewrite U as

(18) U = \sum_{i=1}^{N} A_i ((I-U)^{-1} A_0)^{i-1} = \Gamma \sum_{i=1}^{N} \hat{A}_i ((I-U)^{-1} A_0)^{i-1} = \Gamma \hat{U},

which means that U is of rank (at most) r, while G = (I-U)^{-1} A_0 is generally of rank m.
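Relation (17) can be checked numerically. The sketch below (toy 2x2 data, N = 2) computes U by a simple functional iteration and verifies that G = (I-U)^{-1} A_0 then solves (3); this only illustrates the identity, not the Newton-Shamanskii scheme itself:

```python
import numpy as np

# Iterate U <- A1 + A2 (I-U)^{-1} A0  (the N = 2 instance of (17))
# from U = 0, then recover G = (I-U)^{-1} A0 and check equation (3).
S = np.array([[0.5, 0.5], [0.2, 0.8]])
A0, A1, A2 = 0.6 * S, 0.3 * S, 0.1 * S
I = np.eye(2)

U = np.zeros((2, 2))
for _ in range(200):
    U = A1 + A2 @ np.linalg.solve(I - U, A0)
G = np.linalg.solve(I - U, A0)
res = np.max(np.abs(A0 + A1 @ G + A2 @ G @ G - G))
print(res)                 # near machine precision
print(U.sum(axis=1))       # row sums 0.4 < 1: U is substochastic
```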

Therefore, we find U as the solution to

(19) \mathcal{F}(X) = X - \sum_{i=1}^{N} A_i ((I-X)^{-1} A_0)^{i-1} = 0

and obtain G from G = (I-U)^{-1} A_0 [11, 14]. The Newton-Shamanskii iteration step for (19) is as follows.

For k = 0, 1, \dots,

(20) \mathcal{F}'_{U_k}(Y_{k,s-1}) = -\mathcal{F}(U_{k,s-1}), \quad U_{k,s} = U_{k,s-1} + Y_{k,s-1}, \quad s = 1, 2, \dots, n_k, \quad U_{k+1} = U_{k+1,0} = U_{k,n_k}.

Here Y_{k,s-1} is the solution to

(21) Y_{k,s-1} - \sum_{i=2}^{N} A_i \sum_{j=1}^{i-1} ((I-U_k)^{-1} A_0)^{j-1} (I-U_k)^{-1} Y_{k,s-1} ((I-U_k)^{-1} A_0)^{i-j} = \sum_{i=1}^{N} A_i ((I-U_{k,s-1})^{-1} A_0)^{i-1} - U_{k,s-1}.

If we define R_{k,j} = \sum_{i=j+1}^{N} A_i ((I-U_k)^{-1} A_0)^{i-1-j} (I-U_k)^{-1} and rearrange the terms, (21) can be rewritten as

(22) Y_{k,s-1} - \sum_{j=1}^{N-1} R_{k,j} Y_{k,s-1} ((I-U_k)^{-1} A_0)^j = \sum_{i=1}^{N} A_i ((I-U_{k,s-1})^{-1} A_0)^{i-1} - U_{k,s-1},

which is of the form \sum_{j=0}^{N-1} B_j X C_j = E. This iteration enables us to exploit low-rank local and upward transitions. The iterates U_{k,s} = U_{k,s-1} + Y_{k,s-1}, where Y_{k,s-1} solves (21), can be written as U_{k,s} = \Gamma \hat{U}_{k,s}. This can be shown by induction on the index s. It obviously holds for U_{0,0}. Assuming that U_{k,s-1} = \Gamma \hat{U}_{k,s-1}, from (21) we get

(23) Y_{k,s-1} = \Gamma \Big[ \sum_{i=2}^{N} \hat{A}_i \sum_{j=1}^{i-1} ((I-U_k)^{-1} A_0)^{j-1} (I-U_k)^{-1} Y_{k,s-1} ((I-U_k)^{-1} A_0)^{i-j} + \sum_{i=1}^{N} \hat{A}_i ((I-U_{k,s-1})^{-1} A_0)^{i-1} - \hat{U}_{k,s-1} \Big],

which tells us that Y_{k,s-1} can be decomposed as \Gamma \hat{Y}_{k,s-1}, and the same holds for U_{k,s} = U_{k,s-1} + Y_{k,s-1}. Therefore, from (21), we focus on finding \hat{Y}_{k,s-1} as the solution to

(24) \hat{Y}_{k,s-1} - \sum_{i=2}^{N} \hat{A}_i \sum_{j=1}^{i-1} ((I-U_k)^{-1} A_0)^{j-1} (I-U_k)^{-1} Y_{k,s-1} ((I-U_k)^{-1} A_0)^{i-j} = \sum_{i=1}^{N} \hat{A}_i ((I-U_{k,s-1})^{-1} A_0)^{i-1} - \hat{U}_{k,s-1}.

Defining \hat{R}_{k,j} = \sum_{i=j+1}^{N} \hat{A}_i ((I-U_k)^{-1} A_0)^{i-1-j} (I-U_k)^{-1} \Gamma, we can rewrite the above equation as

(25) \hat{Y}_{k,s-1} - \sum_{j=1}^{N-1} \hat{R}_{k,j} \hat{Y}_{k,s-1} ((I-U_k)^{-1} A_0)^j = \sum_{i=1}^{N} \hat{A}_i ((I-U_{k,s-1})^{-1} A_0)^{i-1} - \hat{U}_{k,s-1},

which is of the form \sum_{j=0}^{N-1} B_j X C_j = E.
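A linear equation of this form can be reduced to m small systems once C is brought to triangular (Schur) form. The simplified Python sketch below (randomly generated data; it uses a diagonalization of C instead of the real Schur form of [11, 12], so it assumes C is diagonalizable) shows the decoupling idea for \sum_j B_j X C^j = E:

```python
import numpy as np

def solve_bxc(B, C, E):
    """Solve sum_j B[j] X C^j = E via C = V diag(d) V^{-1}.
    With Y = X V and F = E V, column k decouples:
    (sum_j d_k^j B[j]) y_k = f_k."""
    d, V = np.linalg.eig(C)
    F = E.astype(complex) @ V
    m = C.shape[0]
    Y = np.zeros((m, m), dtype=complex)
    for k in range(m):
        Mk = sum((d[k] ** j) * B[j] for j in range(len(B)))
        Y[:, k] = np.linalg.solve(Mk, F[:, k])
    return np.real(Y @ np.linalg.inv(V))

rng = np.random.default_rng(2)
m = 4
C = 0.2 * rng.random((m, m))
B = [np.eye(m), 0.1 * rng.random((m, m)), 0.1 * rng.random((m, m))]
Xt = rng.random((m, m))                      # manufactured "true" solution
E = sum(B[j] @ Xt @ np.linalg.matrix_power(C, j) for j in range(3))
X = solve_bxc(B, C, E)
print(np.max(np.abs(X - Xt)))                # near machine precision
```

The real Schur form used in [11, 12] achieves the same reduction without assuming diagonalizability, at the cost of a back-substitution across the (quasi-)triangular columns.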

5. Convergence Analysis

In this section, we establish the monotone convergence of the Newton-Shamanskii method applied to (3).

5.1. Preliminaries

Let us first recall that a real square matrix A is a Z-matrix if all its off-diagonal elements are nonpositive; such a matrix can be written as A = sI - B with B \ge 0. Moreover, a Z-matrix A is called an M-matrix if s \ge \rho(B), where \rho(\cdot) denotes the spectral radius; it is a singular M-matrix if s = \rho(B) and a nonsingular M-matrix if s > \rho(B). The following well-known result will be exploited.

Lemma 1.

For a Z-matrix A, the following statements are equivalent:

A is a nonsingular M-matrix.

A^{-1} \ge 0.

Av>0 for some vector v>0.

All eigenvalues of A have positive real parts.
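These equivalent conditions are easy to check numerically on a small example (data chosen arbitrarily for illustration):

```python
import numpy as np

# A Z-matrix A = sI - B with B >= 0 and s > rho(B) = 0.5,
# so A is a nonsingular M-matrix; verify the Lemma 1 conditions.
B = np.array([[0.2, 0.3],
              [0.4, 0.1]])
s = 2.0
A = s * np.eye(2) - B

print(np.all(np.linalg.inv(A) >= 0))            # A^{-1} >= 0
print(np.all((A @ np.ones(2)) > 0))             # Av > 0 with v = e > 0
print(np.all(np.linalg.eigvals(A).real > 0))    # eigenvalues in right half-plane
```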

The following result is also well known.

Lemma 2.

Let A be a nonsingular M-matrix. If B \ge A is a Z-matrix, then B is also a nonsingular M-matrix. Moreover, B^{-1} \le A^{-1}.

The following result on the minimal nonnegative solution G of (3) will also be needed.

Theorem 3.

If the drift \rho defined by (5) satisfies \rho < 1, then the matrix

(26) I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G^{v-1-j})^T \otimes A_v G^j

is a nonsingular M-matrix.
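Theorem 3 can be observed numerically: the sketch below (toy data with \rho = 0.5; an approximation of G computed by plain functional iteration, which is not the method of this paper) forms the matrix (26) and checks the nonsingular M-matrix criteria of Lemma 1:

```python
import numpy as np

S = np.array([[0.5, 0.5], [0.2, 0.8]])
A = [0.6 * S, 0.3 * S, 0.1 * S]          # rho = 0.5 < 1

# approximate minimal solution G by G <- A0 + A1 G + A2 G^2 from zero
G = np.zeros((2, 2))
for _ in range(2000):
    G = A[0] + A[1] @ G + A[2] @ G @ G

# vec-form matrix (26): I - sum_{v,j} (G^{v-1-j})^T kron (A_v G^j)
pw = [np.eye(2), G]
M = np.eye(4)
for v in (1, 2):
    for j in range(v):
        M -= np.kron(pw[v - 1 - j].T, A[v] @ pw[j])

print(np.min(np.linalg.inv(M)))            # inverse (numerically) nonnegative
print(np.min(np.linalg.eigvals(M).real))   # eigenvalues in right half-plane
```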

5.2. Monotone Convergence

The following lemma displays the monotone convergence properties of the Newton iteration for (3).

Lemma 4.

Consider a matrix X such that

\mathcal{G}(X) \ge 0,

0 \le X \le G,

I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (X^{v-1-j})^T \otimes A_v X^j is a nonsingular M-matrix.

Then the matrix

(27) Y = X - (\mathcal{G}'_X)^{-1} \mathcal{G}(X)

is well defined, and

\mathcal{G}(Y) \ge 0,

0 \le X \le Y \le G,

I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Y^{v-1-j})^T \otimes A_v Y^j is a nonsingular M-matrix.

Proof.

\mathcal{G}'_X is invertible and the matrix Y is well defined, by (iii) and Lemma 1. Since

(28) \Big[ I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (X^{v-1-j})^T \otimes A_v X^j \Big]^{-1} \ge 0

from (iii) and Lemma 1, and \mathcal{G}(X) \ge 0, we get vec(Y) \ge vec(X) and thus Y \ge X. From (27) and the Taylor formula, there exists a number \theta_1, 0 < \theta_1 < 1, such that

(29) \mathcal{G}(Y) = \mathcal{G}(X) + \mathcal{G}'_X(Y-X) + \frac{1}{2} \mathcal{G}''_{X+\theta_1(Y-X)}(Y-X, Y-X) = \frac{1}{2} \mathcal{G}''_{X+\theta_1(Y-X)}(Y-X, Y-X) \ge 0,

so (a) is proven, where the last inequality is obtained from Y - X \ge 0 and (8). (b) may be proven as follows. From

(30) 0 = \mathcal{G}(G) = \mathcal{G}(X) + \mathcal{G}'_X(G-X) + \frac{1}{2} \mathcal{G}''_{X+\theta_2(G-X)}(G-X, G-X),

where 0 < \theta_2 < 1, we have

(31) -\mathcal{G}'_X(G-Y) = \mathcal{G}'_X(Y-X) - \mathcal{G}'_X(G-X) = -\mathcal{G}(X) - \mathcal{G}'_X(G-X) = \frac{1}{2} \mathcal{G}''_{X+\theta_2(G-X)}(G-X, G-X) \ge 0,

where the last inequality follows from G - X \ge 0, by (ii). Since

(32) I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (X^{v-1-j})^T \otimes A_v X^j

is a nonsingular M-matrix, vec(G-Y) \ge 0 from Lemma 1; that is, G - Y \ge 0. Together with Y \ge X, (b) follows. Next we prove (c). From 0 \le Y \le G, we have

(33) I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Y^{v-1-j})^T \otimes A_v Y^j \ge I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G^{v-1-j})^T \otimes A_v G^j,

and we know from Theorem 3 that I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G^{v-1-j})^T \otimes A_v G^j is a nonsingular M-matrix. Consequently, from Lemma 2, I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Y^{v-1-j})^T \otimes A_v Y^j is a nonsingular M-matrix.
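The key positivity fact used in (29) — that \mathcal{G}''_X(Z_1, Z_2) \ge 0 whenever the A_v, X, Z_1, Z_2 are all nonnegative — can be spot-checked directly from (8) (random nonnegative illustrative data):

```python
import numpy as np

rng = np.random.default_rng(3)
A = [rng.random((3, 3)) for _ in range(4)]       # N = 3, A_v >= 0
X, Z1, Z2 = (rng.random((3, 3)) for _ in range(3))

N = len(A) - 1
P = lambda t: np.linalg.matrix_power(X, t)
D2 = np.zeros((3, 3))
# first double sum of (8)
for v in range(2, N + 1):
    for j in range(v):
        for i in range(j):
            D2 += A[v] @ P(i) @ Z2 @ P(j - 1 - i) @ Z1 @ P(v - 1 - j)
# second double sum of (8)
for v in range(2, N + 1):
    for j in range(v - 1):
        for i in range(v - 1 - j):
            D2 += A[v] @ P(j) @ Z1 @ P(i) @ Z2 @ P(v - 2 - j - i)
print(np.all(D2 >= 0))   # every term is a product of nonnegative matrices
```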

A generalization of Lemma 4 provides the theoretical basis for the monotone convergence of the Newton-Shamanskii method for (3).

Lemma 5.

Consider a matrix X such that

\mathcal{G}(X) \ge 0,

0 \le X \le G,

I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (X^{v-1-j})^T \otimes A_v X^j is a nonsingular M-matrix.

Then, for any matrix Z with 0 \le Z \le X, the matrix

(34) Y = X - (\mathcal{G}'_Z)^{-1} \mathcal{G}(X)

is well defined, and

\mathcal{G}(Y) \ge 0,

0 \le X \le Y \le G,

I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Y^{v-1-j})^T \otimes A_v Y^j is a nonsingular M-matrix.

Proof.

Since 0 \le Z \le X, we have

(35) I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Z^{v-1-j})^T \otimes A_v Z^j \ge I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (X^{v-1-j})^T \otimes A_v X^j.

From (iii) and Lemma 2, \mathcal{G}'_Z is invertible and the matrix Y is well defined, with 0 \le X \le Y. Let

(36) \hat{Y} = X - (\mathcal{G}'_X)^{-1} \mathcal{G}(X);

then \hat{Y} \ge Y from Lemma 2. Because \hat{Y} \le G from Lemma 4, (b) follows. Moreover,

(37) I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (\hat{Y}^{v-1-j})^T \otimes A_v \hat{Y}^j

is a nonsingular M-matrix from Lemma 4, and Y \le \hat{Y}; therefore I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (Y^{v-1-j})^T \otimes A_v Y^j is a nonsingular M-matrix from Lemma 2. Next we show that (a) is true. From the Taylor formula, there exist two numbers \theta_3 and \theta_4, with 0 < \theta_3, \theta_4 < 1, such that

(38) \mathcal{G}(Y) = \mathcal{G}(X) + \mathcal{G}'_X(Y-X) + \frac{1}{2} \mathcal{G}''_{X+\theta_3(Y-X)}(Y-X, Y-X) = \mathcal{G}(X) + \mathcal{G}'_Z(Y-X) + (\mathcal{G}'_X - \mathcal{G}'_Z)(Y-X) + \frac{1}{2} \mathcal{G}''_{X+\theta_3(Y-X)}(Y-X, Y-X) = \mathcal{G}''_{Z+\theta_4(X-Z)}(Y-X, X-Z) + \frac{1}{2} \mathcal{G}''_{X+\theta_3(Y-X)}(Y-X, Y-X) \ge 0,

where the last inequality holds since X - Z \ge 0 and Y - X \ge 0.

The monotone convergence result for the Newton-Shamanskii method applied to (3) follows.

Theorem 6.

Suppose that a matrix G0 is such that

\mathcal{G}(G_0) \ge 0,

0 \le G_0 \le G,

I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G_0^{v-1-j})^T \otimes A_v G_0^j is a nonsingular M-matrix.

Then the Newton-Shamanskii algorithm (9)-(10) generates a sequence \{G_k\} such that G_k \le G_{k+1} \le G for all k \ge 0 and \lim_{k \to \infty} G_k = G.

Proof.

The proof is by mathematical induction. From Lemma 5,

(39) G_0 = G_{0,0} \le G_{0,1} \le \dots \le G_{0,n_0} = G_1 \le G, \quad \mathcal{G}(G_1) \ge 0,

and I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G_1^{v-1-j})^T \otimes A_v G_1^j is a nonsingular M-matrix. Assuming that

(40) \mathcal{G}(G_i) \ge 0, \quad G_0 = G_{0,0} \le \dots \le G_{0,n_0} = G_1 \le \dots \le G_{i-1,n_{i-1}} = G_i \le G,

and that I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G_i^{v-1-j})^T \otimes A_v G_i^j is a nonsingular M-matrix, we obtain from Lemma 5 that

(41) \mathcal{G}(G_{i+1}) \ge 0, \quad G_i = G_{i,0} \le \dots \le G_{i,n_i} = G_{i+1} \le G,

and that I - \sum_{v=1}^{N} \sum_{j=0}^{v-1} (G_{i+1}^{v-1-j})^T \otimes A_v G_{i+1}^j is a nonsingular M-matrix. By induction, the sequence \{G_k\} is therefore monotonically increasing and bounded above by G, so it has a limit G_* with G_* \le G. Letting i \to \infty in G_{i+1} \ge G_{i,1} = G_i - (\mathcal{G}'_{G_i})^{-1} \mathcal{G}(G_i) \ge G_i, it follows that \mathcal{G}(G_*) = 0. Consequently, G_* = G, since G_* \le G and G is the minimal nonnegative solution of (3).

6. Numerical Experiments

The Newton-Shamanskii iteration differs from Newton’s method in that the evaluation of the Fréchet derivative is not done at every iteration step. So, while more iterations will be needed than for Newton’s method, the overall cost of the modified Newton method could be less. Our numerical experiments confirm the feasibility of the Newton-Shamanskii iteration for (6).

We have no theoretical results yet on how to choose the optimal scalars n_i in the Newton-like algorithm (see (9) and (10)); this is a goal for our future research. In our numerical experiments, we update the Fréchet derivative every two iteration steps; that is, we choose n_i = 2 for all i in (9).

The elapsed CPU time in seconds (denoted "time") is used to measure the efficiency of the new method. In our numerical experiments, we use the zero matrix as the initial iterate and the stopping criterion

(42) \| G_{new} - G_{old} \|_\infty < 10^{-14},

where G_{new} and G_{old} denote the iterates after and before one iteration step, respectively, and \| \cdot \|_\infty denotes the infinity norm of a matrix. The numerical tests were performed on a laptop (2.4 GHz CPU, 2 GB of memory) with MATLAB R2013b. They show that the modified Newton method can be more efficient than the Newton iteration proposed in [11]. We present the numerical results for a randomly generated problem in Figure 1: we fix the number of coefficient matrices (N+1), vary the problem size n, and plot the CPU time of the two algorithms in seconds for the different values of n.

The MATLAB code used for the problem construction is reported as follows; it generates the N+1 coefficient matrices A_i \in R^{n \times n}, i = 0, 1, 2, \dots, N, for (6):

function [A,er,rho]=problem(n,N)
rand('state',0);
A=rand(n,n*(N+1));            % A = [A_0, A_1, ..., A_N] as one block row
e=ones(n*(N+1),1);
s=A*e;
e=ones(n,1);
for i=1:n
  A(i,:)=A(i,:)/s(i);         % scale rows so that sum_i A_i is stochastic
end
B=zeros(n);
beita=0;
for i=1:(N+1)
  B=B+A(:,((i-1)*n+1):(i*n));                   % B = sum_i A_i
  beita=beita+(i-1)*A(:,((i-1)*n+1):(i*n))*e;   % beta = sum_i i*A_i*e
end
[V,D]=eig(B');
pai=V(:,1);                   % eigenvector for the eigenvalue 1
s=sum(pai);
pai=pai/s;                    % stationary vector of B
er=max(B'*pai-pai);           % residual of the stationary equation
rho=pai'*beita;               % drift (5)
end

We do not yet have theoretical results on when this modified Newton method outperforms the Newton iteration. This is not an easy question, and we leave it as a goal for future research.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research is supported by the Fundamental Research Funds for the Central Universities in China University of Geosciences, Beijing (2652017140).

References

1. M. F. Neuts, Structured Stochastic Matrices of M/G/1 Type and Their Applications, Marcel Dekker, New York, NY, USA, 1989.
2. M. F. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach, The Johns Hopkins University Press, Baltimore, MD, USA, 1981.
3. V. Ramaswami, "Nonlinear matrix equations in applied probability—solution techniques and open problems," SIAM Review, vol. 30, no. 2, pp. 256-263, 1988. doi:10.1137/1030046
4. V. Ramaswami, "A duality theorem for the matrix paradigms in queueing theory," Communications in Statistics. Stochastic Models, vol. 6, no. 1, pp. 151-161, 1990.
5. P. G. Taylor and B. Van Houdt, "On the dual relationship between Markov chains of GI/M/1 and M/G/1 type," Advances in Applied Probability, vol. 42, no. 1, pp. 210-225, 2010. doi:10.1239/aap/1269611150
6. D. Bini and B. Meini, "On the solution of a nonlinear matrix equation arising in queueing problems," SIAM Journal on Matrix Analysis and Applications, vol. 17, no. 4, pp. 906-926, 1996. doi:10.1137/S0895479895284804
7. N. Akar and K. Sohraby, "An invariant subspace approach in M/G/1 and G/M/1 type Markov chains," Communications in Statistics. Stochastic Models, vol. 13, no. 3, pp. 381-416, 1997. doi:10.1080/15326349708807433
8. D. Bini, B. Meini, and V. Ramaswami, "Analyzing M/G/1 paradigms through QBDs: the role of the block structure in computing the matrix G," in G. Latouche and P. Taylor (eds.), Advances in Algorithmic Methods for Stochastic Models, pp. 73-86, Notable Publications, Neshanic Station, NJ, USA, 2000.
9. G. Latouche, "Newton's iteration for non-linear equations in Markov chains," IMA Journal of Numerical Analysis, vol. 14, no. 4, pp. 583-598, 1994. doi:10.1093/imanum/14.4.583
10. M. F. Neuts, "Moment formulas for the Markov renewal branching process," Advances in Applied Probability, vol. 8, no. 4, pp. 690-711, 1976. doi:10.2307/1425930
11. J. F. Pérez, M. Telek, and B. Van Houdt, "A fast Newton's iteration for M/G/1-type and GI/M/1-type Markov chains," Stochastic Models, vol. 28, no. 4, pp. 557-583, 2012. doi:10.1080/15326349.2012.726038
12. J. F. Pérez and B. Van Houdt, "The M/G/1-type Markov chain with restricted transitions and its application to queues with batch arrivals," Probability in the Engineering and Informational Sciences, vol. 25, no. 4, pp. 487-517, 2011. doi:10.1017/S0269964811000155
13. D. A. Bini, G. Latouche, and B. Meini, Numerical Methods for Structured Markov Chains, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, NY, USA, 2005.
14. G. Latouche and V. Ramaswami, Introduction to Matrix Analytic Methods in Stochastic Modeling, ASA-SIAM Series on Statistics and Applied Probability, SIAM, Philadelphia, PA, USA, 1999.
15. R. S. Varga, Matrix Iterative Analysis, Prentice Hall, Englewood Cliffs, NJ, USA, 1962.