A New GMRES( m ) Method for Markov Chains

This paper presents a class of new accelerated restarted GMRES method for calculating the stationary probability vector of an irreducible Markov chain. We focus on the mechanism of this new hybrid method by showing how to periodically combine the GMRES and vector extrapolation method into a much efficient one for improving the convergence rate in Markov chain problems. Numerical experiments are carried out to demonstrate the efficiency of our new algorithm on several typical Markov chain problems.


Introduction
The Markov chain is a very robust tool for studying stochastic systems overtime and is in a wide range of applications including queueing systems [1,2], computer and communication systems [3], information retrieval, and Web ranking [4][5][6][7].For both discrete and continuous Markov chains, it has always been one of the central work to get their numerical solutions by adopting appropriate methods.
In this paper, we consider a class of Krylov subspace methods, restarted GMRES (GMRES()) method, for the numerical solutions of the stationary probability vector of an irreducible Markov chain.Let  = (  ) ∈ R × be a column stochastic matrix; that is, 0 ≤   ≤ 1 (∀, ) and    =   , with  being the column vector of all ones.We seek the stationary vector  ∈ R  that satisfies  = ,   ≥ 0 (∀) , ‖‖ = 1. ( By the Perron-Frobenius theorem [8,9], if  is irreducible, then there exists a unique solution  to (1), which is strictly positive (  > 0, ∀).Equation ( 1) is equivalent to the following singular linear problem: where  =  − , with  being the identity matrix, and  is a singular -matrix with the diagonal elements being negative column sums of its off-diagonal elements.Now there are many numerical methods for solving (2), such as the traditional iterative methods, the power method or weighted Jacobi method [5,[10][11][12], the aggregation/disaggregation algorithm [13][14][15][16], and the Krylov subspace methods like the Arnoldi method and the GMRES [3,[17][18][19][20][21].However, these iteration methods for calculating  may converge very slowly when the subdominant eigenvalue of  satisfies  2 ∼ 1 [3,11].Thus, it becomes quite required to accelerate the calculation in Markov chain problems.
In this paper, we consider the restarted GMRES method and propose a new way to accelerate its numerical calculation by use of the polynomial-type vector extrapolation methods.In fact, seeking the vector extrapolation method as the accelerator is very popular; see [5,12,22] for details.In discussing this issue, we start from the mechanisms of the GMRES() method and vector extrapolation method and then illustrate how to periodically knit these two methods.In numerical experiments, the proposed extrapolation-accelerated GMRES() methods are tested on several Markov chain problems, and the experimental results show its effectiveness.
This paper is organized as follows.Section 2 briefly provides the mechanism of the GMRES methods and some acceleration ones.In Section 3, we first review the vector extrapolation method and then consider how to periodically combine the extrapolation method with the GMRES method for numerical calculation of Markov chains.Section 4 provides experimental evidence of the effectiveness of our approach.Section 5 summarizes the thesis and points out the direction of our future research.

Background
In this section, we briefly introduce the mechanism of the GMRES methods and describe some existing modifications to the standard GMRES aimed at accelerating its convergence.
The GMRES method, proposed by Saad and Schultz in [20], is a popular method for the iterative solution of sparse linear systems with an unsymmetrical matrix, If  0 is an initial guess for the true solution of (3) and  0 =  −  0 is the initial residual, we have the equivalent system: where  =  0 + .Let K  be the Krylov subspace: GMRES is to find an approximate solution Here, ‖ ⋅ ‖ 2 denotes the Euclidean norm on R  , as well as the associated-induced matrix norm on R × .GMRES is often referred to as an "optimal" method in the sense that it finds the approximate solution in the Krylov subspace that minimizes the residual [20].However, the amount of storage and computational work grow quadratically with the number of steps.So the restarted version of the algorithm is often used as suggested in [20].In restarted GMRES (GMRES()), the method is restarted after each cycle of  iteration steps, and the current approximate solution becomes the new initial guess for the next  iterates.The mechanism of the GMRES() for Markov chains as system (2) is described as follows.For more details, see [20,23].
(1) Start: choose  0 , , tol (2) For  = 1, 2, . . ., , do (3) Form the approximate solution:   =  0 +     , where   minimizes ‖ 1 −   ‖ 2 ,  ∈   .(4) Restart: while ‖ +1 ‖ 2 > tol, do In general, the convergence rate depends on the restart parameter [24,25].Even in the situation in which the appropriate restart parameter results in the satisfactory convergence, the convergence behavior may not be the optimal; since an iterate is restarted, the current approximation space is discarded at each restart, and the orthogonality between approximation spaces generated at successive restarts is not preserved at each restart.For that reason, slow convergence can occur.Therefore, it is necessary to develop more efficient algorithms for enhancing the robustness of restarted GMRES, and thereof now its improvements and modifications continue to receive considerable attentions.Augmented methods are a class of acceleration techniques which seek to avoid stalling by improving information in GMRES at the time of the restart.These acceleration methods are presented by Morgan [19,26], Saad [27] and Chapman and Saad [28].The precondition technique is also often used to accelerate the numerical calculation of GMRES; see [17,23,[29][30][31][32] for more details.Meanwhile, this paper focuses on undertaking the similar works, that is, speeding up the process and improving the robustness of the restarted GMRES.

The Main Algorithm and Practical Implementations
Now let us first present the rationale and theory behind vector extrapolation method.When considering the solutions of systems of linear or nonlinear equations by an iterative method, a sequence of vectors (approximation solutions) is yielded.As the classical iteration process may converge slowly, extrapolation strategy can often be applied to enhance its convergence rates.A detailed review of extrapolation methods can be found in the works of Smith et al. [22], Sidi [12], and Jbilou and Sadok [33].Now there are mainly two classes of vector extrapolation methods: (1) polynomialtype methods, namely, the minimal polynomial extrapolation (MPE) of Cabay and Jackson [34], the reduced rank extrapolation (RRE) of Eddy [35] and Mešina [36], and the modified minimal polynomial extrapolation (MMPE) method of Sidi et al. [37], Brezinski [38], and Pugachev [39]; and (2) epsilon algorithms, namely, the topological epsilon algorithm of Brezinski [38] and the scalar and vector epsilon algorithms of Wynn [40].Numerical experiences have illustrated that the polynomial type vector extrapolation methods are in general more economical than the epsilon ones, with respect to the computing time and storage requirements.And for the special touch, Kamvar et al. have recently proposed a new extrapolation, quadratic extrapolation, for computing the PageRank in [5], and Sidi [12] has generalized this method (the generalization one is marked by GQE) and proved that the resulting generalization is very closely related to MPE.
Therefore, we focus on the two polynomial extrapolation methods, namely, RRE and GQE, in this paper.Now let us display the implementations of these two algorithms.For more details, we refer the reader to the paper [12] by Sidi.
Vector extrapolation algorithms are derived by considering vector sequences   , generated from a fixed-point iterative method of the following form: where  is a fixed  ×  matrix,  is a fixed -dimensional vector, and  0 is an initial vector.Suppose that we have produced a sequence of iterates {  } ∞ =1 , where   ≥ 0. Then at the th outer iteration, let be a matrix consisting of the last  iterates with   being the newest, where  is called the window size in usual.It is evident that  has the following properties: The problem to be solved is transformed into obtaining a vector  satisfying ∑  =1   = 1, and thus we have an updated probability vector a linear combination of the last  iterates.Assume () < 1, which results in the unique solution  of the linear system As illustrated in [12], the efficient implementations of the vector extrapolation methods (RRE and GQE) are presented as follows.
( Clearly, Algorithms 2 and 3 share the common feature: they both contain a  factorization in step 2 for   =     , where   ∈ R ×(+1) is unitary and   ∈ R (+1)×(+1) is an upper triangular matrix with positive diagonal elements.
In addition, the -factorization is always carried out inexpensively by applying the modified Gram-Schmidt process (MGS) to the vectors  0 ,  1 , . . .,   ; see [12] for more details.
The MGS algorithm is given as follows.
( As previously mentioned, the restarting of GMRES may entail the slow convergence since the current approximation space is discarded at each restart cycle.Meanwhile, the change of the restart parameter  will affect the number of iterations and also the execution time in the run time of GMRES.Furthermore, the increasing  will decrease the number of iteration needed to converge while increasing the computing work and storage amount required per iteration, especially for large systems of equations, like large-scale Markov chains, and PageRank computation in information retrieval.So it is very necessary to enhance the robustness of GMRES().And our work falls into the category of accelerating techniques.Next, we will discuss the idea and implementation of the new method.
Recall that in restarted GMRES, once the Krylov subspace reaches dimension , the current approximate solution becomes the new initial guess for the next  iterations.The motivation of our new method comes from the successive restarts every  iterations.To be specific, we will implement the current approximation, after  iterations in a restart cycle, as the initial vector for the extrapolation procedure.And furthermore, the resulting vector by extrapolation method, in turn, can be the new improved starting vector for the next  iterations by GMRES().In summary, the mechanism of our new restarted GMRES method can be characterized as follows.(1) Choose  0 , , , tol, and set  = 1.
(4) Compute x from  by applying Algorithm 2 (RRE) and Algorithm 3 (GQE), respectively. ( Otherwise, set  ←  + 1 and go to step 2. Note that, in step 1 tol is the user described tolerance for residual vector 1-norms by GMRES(),  is the restart number in GMRES, and  is the window size for extrapolation.From Algorithms 2 and 3, both the RRE and GQE methods require  + 2 vectors  0 ,  1 , . . .,  +1 as inputs.For instance, when  = 2, four approximate vectors are needed as input in the extrapolation procedure.
Step 5 is to check the efficiency of extrapolation strategy in current iterations.
Step 6 is devised to determine when to flip flop between the extrapolation procedure and GMRES procedure.The performances of our new algorithm and comparisons with the other original algorithms will be discussed in detail below.

Numerical Results in Markov Chain Problems
In this section, we will report the numerical experiments to examine the efficiency of our new accelerated algorithm in numerical solution of stationary probability vector for Markov chains.Three typical Markov chain problems have been used in our experiments.All the numerical results are obtained with a MATLAB R2010a implementation on Windows 8 with 2.5 GZ i5 processor and 4 GB memory.
For the sake of justice, the same starting vector  0 = 1/ × , with  being the column vector of all ones, is used for all the algorithms.All the iterations are terminated when ‖ x − x ‖ 1 /‖ x ‖ 1 < tol, where x are the approximations obtained by the current iteration.For convenience, in all the tables and figures below, we have abbreviated the RRE-GMRES algorithm and the GQE-GMRES algorithm as RGM-RES and GGMRES, respectively.We denote by "rest" the restart number in GMRES, "CPU" the CPU time used in seconds, "it" the iteration counts, and "Res" the residual 1norms.Example 1.In this example, we compare the GMRES() algorithm, the RGMRES algorithm, and the GGMRES algorithm for the Markov chain problem.This test problem is a one-dimensional (1D) Markov chains often resulting from a //1 queueing system.The simplest graph of this problem is displayed in Figure 1, where the transition rates are identical.Numerical results are presented in Table 1.
It is easy to see from Table 1 that the accelerated GMRES is more effective than the unaccelerated one, both in terms of CPU time and the number of iteration and regardless of the restart number size for GMRES.In particular, the GMRES accelerated by GQE (GGMRES) performs best in most cases.For instance, when the restart number is 20, in comparison with the results of unaccelerated GMRES, the CPU time has been saved by 74% for FGMRES and 85% for GGMRES, and the iteration count has been reduced by 27% for RGMRES and 54% for GGMRES.In Figure 2, we compare the converging speed among the three algorithms.It is clear that the convergence rate has been enhanced greatly when using the accelerating technique in GMRES, and especially that the GGMRES has faster convergence speed than the other algorithms for meeting the convergence precision quickly.
Example 2. This test problem is displayed in Figure 3, which is a 1D chain with uniform weights, except for two weak links with weight  in the middle of the chain.This is a typical nearly completely decomposable (NCD) Markov chain problem, and it has been discussed in [41][42][43].We run the GMRES algorithm, the accelerated GMRES algorithm by vector extrapolation method, namely, RGMRES and GGM-RES, on this problem, when the window size in extrapolation takes different values.All the algorithms will be stopped as soon as the residual 2-norms are below tol = 10 −7 .Let  = 1 − 3 in the numerical simulation, and the experimental results are listed in Table 2.
It is seen from Table 2 that the accelerated GMRES methods by vector extrapolation method perform better than the unaccelerated GMRES method both in terms of iteration counts and CPU time, regardless of their window  size.Obviously the GMRES method accelerated by GQE outperforms the other two methods.For instance, when the widow size is 4, the iteration number in GGMRES is only 46% of that in RGMRES and less than 28% of that in GMRES.Especially from the view point of CPU time, the GGMRES method has more obvious advantages than the GMRES method and the RGMRES method, for it reduces the convergence time by 55% compared with RGMRES method and by nearly 86% compared with GMRES method.Example 3.This problem is a 2D lattice (grid) with uniform weights, which has been discussed in [43,44].Aggregates of size 3 × 3 are used and the gauge variables are defined on the lattice edges and are scalars valued as one, as shown in Figure 4.
In this problem, we run the GMRES, RGMRES and GGMRES, and compare their convergence rates while changing the problem size, with the restart number being 10 and the window size being 2. Numerical results are presented in Table 3.The numerical simulation given in Table 3 for the 2D lattice problem clearly demonstrates the effectiveness of acceleration by vector extrapolation methods.It is seen that the RGMRES algorithm converges faster than the GMRES method, while the GGMRES algorithm performs the best.

Conclusions
In this paper, we have presented a new GMRES method accelerated by vector extrapolation techniques to get the numerical solutions of the stationary probability vector of an irreducible Markov chain, using vector extrapolation techniques.Experimental results on several typical Markov chain problems demonstrate that the extrapolation method is an attractive option for accelerating the convergence of calculation of Markov chain, especially with the GQE (proposed by Sidi in [12]) being the accelerator.We note that, as mentioned previously, preconditioning technique may also be an appropriate strategy for improving convergence rate in Markov chains and would be one of our research topics of this field in future.

Figure 1 :
Figure 1: Graph for one-dimensional Markov chain with identical transition rates.

Figure 4 :
Figure 4: Graph for a 2D lattice with uniform weights.

Table 1 :
Numerical results of the three algorithms for one-dimensional Markov chain with identical transition rates.

Table 2 :
Numeric comparisons of three algorithms for the uniform chain with two weak links in the middle ( = 1 − 3).

Table 3 :
Numerical results of the three algorithms with various problem sizes for uniform 2D lattice.