A Cost-Effective Smoothed Multigrid with Modified Neighborhood-Based Aggregation for Markov Chains

A smoothed aggregation multigrid method is considered for computing stationary distributions of Markov chains. A judgment mechanism which determines whether to implement the whole aggregation procedure is proposed. Through this strategy, a large amount of time in the aggregation procedure is saved without affecting the convergence behavior. Besides this, we explain the shortcomings and unreasonable aspects of the Neighborhood-Based aggregation which is commonly used in multigrid methods, and a modified version is presented to remedy and improve it. Numerical experiments on some typical Markov chain problems are reported to illustrate the performance of these methods.


Introduction
Markov chains have a large number of applications in scientific research and technological optimization in many areas, including queuing systems and statistical automata networks, web page ranking [1,2] and gene ranking [3], risk management, and customer relationship analysis. Markov chains also have important applications in noise and signal analysis; the processing of voice, image, and screen data; the modeling of communications networks and transport phenomena; economic calculation methods; and so forth. For the needs of these practical applications, different types of Markov chain models have arisen, such as the hidden Markov chain [4], the multivariate Markov chain, and the high-order Markov chain [5]. For Markov models, computing the stationary distribution is an important issue.
The transition matrix of a homogeneous irreducible Markov chain with n states can be denoted as B. B ∈ R^{n×n} is a stochastic matrix and is commonly defined as a column-stochastic matrix; that is, 0 ≤ b_{ij} ≤ 1 for every 1 ≤ i, j ≤ n, and each column of B sums to one.
Then the stationary distribution problem is mathematically given by

B x = x, x ≥ 0, ‖x‖₁ = 1. (2)

In this study, we only consider the situation where B is also an irreducible matrix, which means that, in its directed graph, there exists a directed path from any vertex i to any other vertex j. If B is irreducible, according to the Perron-Frobenius theorem [6], there exists a unique solution x to the linear system (2).
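As a concrete illustration, the stationary vector of a small column-stochastic matrix can be computed by simple power iteration. The following Python sketch uses an invented 3-state matrix B, not one taken from the paper:

```python
import numpy as np

# Invented 3-state column-stochastic matrix B (each column sums to one).
B = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.3],
              [0.2, 0.3, 0.4]])

# Power iteration: repeatedly apply B until B x = x to machine precision.
x = np.full(3, 1.0 / 3.0)
for _ in range(1000):
    x = B @ x
    x = x / x.sum()                    # keep ||x||_1 = 1

residual = np.linalg.norm(B @ x - x, 1)
```

For this well-mixing toy chain the iteration converges very quickly; the point of the multigrid machinery below is precisely the chains for which it does not.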
This linear problem (2) is often reformulated as

A x = 0, (3)

with

A = I − B, (4)

where I is an identity matrix. As Sterck et al. showed in their work [7], this formulation has two specific advantages. First, the solution x is not only the right eigenvector of A corresponding to eigenvalue 0, but also the right singular vector of A corresponding to singular value 0. The second advantage of formulation (3) is that the singular M-matrix structure of A is amenable to algebraic multigrid solvers designed for nonsingular linear problems [8,9]. Hence, our interest is to solve the linear system (3), corresponding to a homogeneous irreducible Markov chain. In many applications, the size of the Markov chain is very large, so that direct solvers are impractical. Hence, iterative procedures are widely used to calculate the stationary probability of Markov chains. However, traditional one-level iterative methods, for example, the Power method, the (weighted) Jacobi method, and the (weighted) Gauss-Seidel method, converge very slowly for calculating the solution x in situations where the subdominant eigenvalue λ₂ of B satisfies |λ₂| ≈ 1 [10,11]. The Markov chain in this case is of the slowly mixing type.
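The reformulation and the role of the subdominant eigenvalue can be checked numerically on a tiny invented example (a sketch; the 2-state matrix below is chosen so that |λ₂| = 0.98, i.e., a slowly mixing chain):

```python
import numpy as np

# Invented 2-state slowly mixing chain: subdominant eigenvalue 0.98.
B = np.array([[0.99, 0.01],
              [0.01, 0.99]])
A = np.eye(2) - B                      # singular M-matrix formulation A x = 0

eigvals = np.sort(np.abs(np.linalg.eigvals(B)))[::-1]
# eigvals[0] = 1 (Perron root); eigvals[1] = |lambda_2| governs the
# convergence speed of one-level iterations such as the Power method.
x_star = np.array([0.5, 0.5])          # stationary vector of this symmetric B
```

A one-level method must damp the error component associated with λ₂ = 0.98, contracting it by only a factor 0.98 per sweep, which is why such chains motivate multilevel acceleration.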
Recently, multigrid iterative methods aiming to accelerate the convergence rate for these types of Markov chains have received much attention. Multigrid methods for Markov chains have been developed and applied very successfully since the earliest work of Takahashi's two-level iterative aggregation/disaggregation method for Markov chains [12]. A direct extension from two levels to multiple levels was first explored in [13]. Then, adaptive algebraic multigrid methods, whose aggregates are formed algebraically based on the strength of connection in the problem matrix column-scaled by the current iterate, were developed in [14,15], and for Markov chains in [16]. This approach shows good potential. Following this idea, there have arisen some techniques including smoothing the interpolation and restriction operators [17], using acceleration on all recursive levels [18] or on the top level [2,19], the on-the-fly interactive strategy [20], and so forth.
The numerical experiments in [18] show that the aggregation strategy has a certain effect on the performance of these adaptive algebraic multigrid methods. Employing a reasonable and efficient aggregation strategy can improve the performance and applicability of these multigrid methods. However, in some cases, we found that the Neighborhood-Based aggregation commonly used by the above-mentioned methods generates aggregates with many states that are only very weakly connected to each other. These unreasonable aggregates often lead to a poor convergence rate of the algebraic multigrid methods.
To remedy these problems and improve the applicability of Neighborhood-Based aggregation, this paper proposes a modified version which changes some of its rules.
Besides the effect on the convergence rate, the computational cost of the aggregation procedure accounts for a large proportion of the whole cost of these adaptive algebraic multigrid methods [20]. We study the aggregation methods which are based on connection strength and propose a judgment mechanism to detect in advance the situations where the aggregates will be the same as in the previous cycle. In those situations there is no need to implement the whole aggregation method. This strategy keeps the convergence rate the same while reducing the computing time.
The remainder of this paper is organized as follows. In Section 2, the smoothed aggregation multigrid (SAM) method for Markov chains from [17] is reviewed. In Section 3, we analyse and demonstrate the drawback of the Neighborhood-Based aggregation method and present the modified version. The judgment mechanism and the cost-effective SAM are presented in Section 4. Numerical tests are presented in Section 5. Finally, conclusions are reported in Section 6.

Smoothed Aggregation Multigrid (SAM) Method for Markov Chains
First, we briefly review the framework of the SAM method for Markov chains from [17], where the hierarchy of coarse grids is developed based on the structure of the scaled problem matrix A diag(x), instead of relying on the geometry of the original problem. Here, we follow the presentation of [17].
Multilevel methods aim to reduce the algebraically smooth error left by the classical iterative methods. So classical iterative methods such as the Power method, the (weighted) Jacobi method, and the (weighted) Gauss-Seidel method are often used as relaxation techniques for multilevel methods.
In this study, the weighted Jacobi iterative method is used as relaxation for (3), and it is given by

x ← (I − ω D^{-1} A) x, (5)

where D is the diagonal matrix of A and ω is the relaxation weight. System (3) can be rewritten into the following form:

A diag(x) e = 0, (6)

where x is the approximate solution obtained from the last multilevel cycle and the current relaxations, and e is the multiplicative error vector, which should be all ones when convergence is achieved.
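A minimal sketch of this relaxation, assuming a small invented column-stochastic B and the weight ω = 0.7 that is used later in the experiments:

```python
import numpy as np

def weighted_jacobi(A, x, omega=0.7, sweeps=1):
    """Weighted Jacobi sweeps for A x = 0; x is renormalized in the 1-norm."""
    D = np.diag(A)                     # diagonal of A (as a vector)
    for _ in range(sweeps):
        x = x - omega * (A @ x) / D    # x <- (I - omega D^{-1} A) x
        x = x / np.abs(x).sum()
    return x

# Invented 3-state column-stochastic chain.
B = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])
A = np.eye(3) - B
x = weighted_jacobi(A, np.full(3, 1.0 / 3.0), sweeps=500)
```

For this small, well-mixing example the relaxation alone converges; in the slowly mixing case it merely smooths the error, which is the role it plays inside the multilevel cycle.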
Then the aggregation matrix Q ∈ R^{n×m} needs to be constructed, where q_{ij} = 1 if node i belongs to aggregate j and q_{ij} = 0 otherwise. Each column of Q represents an aggregate, and each row has exactly one component equal to 1 while the others are 0. For example, if the nodes are ordered according to the aggregates they belong to, the matrix Q has a block-diagonal form, each column consisting of a contiguous block of ones.
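Constructing Q from an aggregate assignment can be sketched as follows (the assignment vector is a hypothetical example):

```python
import numpy as np

def aggregation_matrix(assignment, n_agg):
    """Build Q in {0,1}^{n x m}: q_ij = 1 iff node i is in aggregate j."""
    n = len(assignment)
    Q = np.zeros((n, n_agg))
    for i, j in enumerate(assignment):
        Q[i, j] = 1.0
    return Q

# Hypothetical assignment: nodes 0,1 -> aggregate 0; nodes 2,3,4 -> aggregate 1.
Q = aggregation_matrix([0, 0, 1, 1, 1], 2)
```

Each row of Q has exactly one nonzero, so Q is stored with only n nonzero entries in practice.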
The aggregation algorithm for constructing  is discussed in Section 3.
When Q is constructed, the coarse-level equation of (6) is given by

Qᵀ A diag(x) Q e_c = 0, (8)

where e_c is the coarse-level multiplicative error corresponding to e. Then, the restriction operator R and the prolongation operator P are defined by R = Qᵀ and P = diag(x) Q. Equation (8) is rewritten as

R A P e_c = 0, (9)

and the coarse-level operator A_c can be defined as

A_c = R A P. (10)

If e_c is known, then an improved coarse-level approximation can be obtained by

x_c = diag(Qᵀ x) e_c. (11)

With (11), (9) is rewritten as the coarse-level probability equation:

A_c diag(Qᵀ x)^{-1} x_c = 0. (12)

Finally, (12) is used to accelerate the classical one-level iterative methods such as the Power method, the (weighted) Jacobi method, or the (weighted) Gauss-Seidel method for (3). We take the two-level case, for example. A two-level aggregation/disaggregation method is applied after some relaxations on (3) at the fine level. Through constructing the aggregation matrix Q, the coarse-level equation (12) is formed, and we solve (12) to get the coarse-level approximation x_c. Then, a coarse-level correction is implemented, followed by some relaxations again, to obtain the improved approximation at the fine level. Two-level iterative aggregation/disaggregation can be naturally extended to multiple levels by recursively applying the two-level method to solve the coarse-level probability equation (12). There are many types of recursive cycles, such as the V cycle, which solves the coarse-level probability equation (12) once at each intermediate level of a cycle, and the W cycle, which solves it twice. Here we only describe the V cycle; it is illustrated by Figure 1.
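The two-level scheme above can be sketched as follows, assuming dense NumPy matrices and using an SVD null-space solve in place of the direct coarse solver of Algorithm 2; the 4-state chain and the aggregation are invented for illustration:

```python
import numpy as np

def two_level_cycle(A, x, Q, omega=0.7, nu=2):
    """One unsmoothed two-level aggregation/disaggregation cycle for A x = 0."""
    D = np.diag(A)                            # diagonal of A (as a vector)
    for _ in range(nu):                       # pre-relaxation (weighted Jacobi)
        x = x - omega * (A @ x) / D
        x = x / x.sum()
    P = np.diag(x) @ Q                        # prolongation  P = diag(x) Q
    R = Q.T                                   # restriction   R = Q^T
    Ac = R @ A @ P                            # coarse-level operator A_c = R A P
    _, _, Vt = np.linalg.svd(Ac)              # null vector of the singular A_c
    ec = np.abs(Vt[-1])                       # coarse multiplicative error e_c
    x = P @ ec                                # coarse-level correction x <- P e_c
    x = x / x.sum()
    for _ in range(nu):                       # post-relaxation
        x = x - omega * (A @ x) / D
        x = x / x.sum()
    return x

# Invented 4-state doubly stochastic chain with two natural aggregates.
B = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.3, 0.5, 0.1, 0.1],
              [0.1, 0.1, 0.5, 0.3],
              [0.1, 0.1, 0.3, 0.5]])
A = np.eye(4) - B
Q = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])
x = np.array([0.4, 0.3, 0.2, 0.1])
for _ in range(10):
    x = two_level_cycle(A, x, Q)
```

The recursive multilevel V cycle replaces the coarse null-space solve by another application of the same two-level method on (12).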
Performance of this aggregation multigrid method for Markov chains is often unsatisfactory. Inspired by smoothed aggregation (SA) multigrid methods for linear systems, smoothing the interpolation and restriction operators, P and R, was proposed to improve the convergence behavior of the aggregation multigrid method [17].
All columns of the interpolation operator P are smoothed by the (weighted) Jacobi method with weight ω:

P_s = (I − ω D^{-1} A) P.

For the restriction operator R, all rows are smoothed as

R_s = R (I − ω A D^{-1}).

The smoothed operators P_s and R_s still hold the essential properties of the original operators. When P_s and R_s are generated, the smoothed coarse-level operator is naturally defined as

A_s = R_s A P_s.

Then the coarse-level probability equation is given analogously to (12), with A_s in place of A_c. Note that because the smoothed coarse-level operator A_s may lose the properties of an irreducible singular M-matrix, a lumping technique is necessary to modify A_s to keep these properties. This technique is implemented before the coarse-level probability equation is solved; refer to [17] for details. This smoothed aggregation multigrid (SAM) method for Markov chains can be formulated as Algorithm 1.
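The operator smoothing steps can be sketched as follows (a dense-matrix illustration with an invented 3-state chain and a hypothetical iterate x; in practice the operators are sparse):

```python
import numpy as np

def smooth_operators(A, P, R, omega=0.7):
    """One weighted-Jacobi smoothing step applied to the transfer operators:
    P_s = (I - omega D^{-1} A) P  and  R_s = R (I - omega A D^{-1})."""
    d_inv = 1.0 / np.diag(A)
    P_s = P - omega * d_inv[:, None] * (A @ P)   # row-scale A P by D^{-1}
    R_s = R - omega * (R @ A) * d_inv[None, :]   # column-scale R A by D^{-1}
    return P_s, R_s

# Invented 3-state chain with a 2-aggregate Q; x is a hypothetical iterate.
B = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])
A = np.eye(3) - B
x = np.array([0.3, 0.4, 0.3])
Q = np.array([[1., 0.], [1., 0.], [0., 1.]])
P = np.diag(x) @ Q                     # unsmoothed prolongation
R = Q.T                                # unsmoothed restriction
P_s, R_s = smooth_operators(A, P, R)
```

Note that smoothing fills in the sparsity of P and R, which is why the lumping repair of A_s = R_s A P_s described above becomes necessary.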
It is necessary to apply a direct solver for the coarsest level and the direct solver used in Algorithm 1 is based on the following theorem.
Theorem 1 (Theorem 4.16 in [21]). If A is an irreducible singular M-matrix, then each of its principal submatrices other than A itself is a nonsingular M-matrix.
Let A_c denote the coarsest-level operator; then we use the direct method presented in the coarsest-level Algorithm 2 to solve the coarsest-level equation A_c x_c = 0.
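A sketch of such a direct solver, assuming the coarsest-level operator is an irreducible singular M-matrix with zero column sums (so that one equation is redundant); the 3-state operator below is an invented example:

```python
import numpy as np

def coarsest_solve(Ac):
    """Direct solve of Ac x = 0, ||x||_1 = 1, x > 0: fix the last entry to 1
    and solve the leading principal submatrix system, which is nonsingular by
    Theorem 1; the dropped equation is redundant (zero column sums)."""
    n = Ac.shape[0]
    y = np.linalg.solve(Ac[:n-1, :n-1], -Ac[:n-1, n-1])
    x = np.append(y, 1.0)
    return x / x.sum()

# Invented 3-state irreducible chain serving as the coarsest-level operator.
B = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.3],
              [0.2, 0.3, 0.4]])
Ac = np.eye(3) - B
x = coarsest_solve(Ac)
```

Positivity of the result follows from the nonnegativity of the inverse of a nonsingular M-matrix.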

Neighborhood-Based Aggregation and Our Modified Version
How to construct the aggregation matrix Q is a crucial issue in algebraic multigrid methods. Here, we consider the aggregation methods which determine aggregates based on the strength of connection in matrices. There exist some corresponding aggregation methods such as distance-one aggregation, distance-two aggregation [16,17], pairwise aggregation, double pairwise aggregation [22], Neighborhood-Based aggregation [23], bottom-up aggregation [24], and some other types of aggregation. Serving as the aggregation method in [7,18,19], Neighborhood-Based aggregation has some advantages over the others. It usually generates well-balanced aggregates of approximately equal size and reduces the number of unknowns quickly in the coarsening process. Coarse-level stencil sizes tend to be rather uniform and do not grow too quickly [18]. Here, the aggregation procedure is based on the matrix Ā = A diag(x), and a node i is said to be strongly connected to node j in the graph of Ā if

−ā_{ij} ≥ θ · max_{k≠i} {−ā_{ik}}, (19)

where θ is a strength of connection parameter. The strong neighborhood N_i of node i is the set of all points which are strongly connected to i in the graph of Ā, including node i itself. The Neighborhood-Based aggregation can be represented as Algorithm 3.
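The strength test (19) and the two passes of Algorithm 3 can be sketched as follows; the threshold handling and the tiny 4-node matrix are illustrative assumptions, not a verbatim transcription of the algorithm:

```python
import numpy as np

def strong_neighborhoods(A_bar, theta=0.25):
    """N_i = {i} plus every j with -a_ij >= theta * max_{k != i} (-a_ik)."""
    n = A_bar.shape[0]
    nbhd = []
    for i in range(n):
        off = -A_bar[i].copy()
        off[i] = -np.inf                      # exclude the diagonal from the max
        thresh = theta * off.max()
        Ni = {j for j in range(n) if j != i and -A_bar[i, j] >= thresh}
        Ni.add(i)
        nbhd.append(Ni)
    return nbhd

def neighborhood_aggregation(nbhd, order):
    """Two-pass Neighborhood-Based aggregation (a sketch of Algorithm 3)."""
    n = len(nbhd)
    agg = [-1] * n                            # aggregate index of each node
    aggregates = []
    for i in order:                           # 1st process: whole neighborhoods
        if all(agg[j] == -1 for j in nbhd[i]):
            for j in nbhd[i]:
                agg[j] = len(aggregates)
            aggregates.append(set(nbhd[i]))
    for i in order:                           # 2nd process: leftover nodes join
        if agg[i] == -1:                      # the "most connected" aggregate
            best = max(range(len(aggregates)),
                       key=lambda k: len(aggregates[k] & nbhd[i]))
            aggregates[best].add(i)
            agg[i] = best
    return agg, aggregates

# Invented scaled matrix with two strongly connected pairs: {0,1} and {2,3}.
A_bar = np.array([[ 1.0,  -1.0,  -0.01, -0.01],
                  [-1.0,   1.0,  -0.01, -0.01],
                  [-0.01, -0.01,  1.0,  -1.0 ],
                  [-0.01, -0.01, -1.0,   1.0 ]])
nbhd = strong_neighborhoods(A_bar)
agg, aggregates = neighborhood_aggregation(nbhd, order=range(4))
```

On this well-separated example the natural-order pass already finds the two intended aggregates; the problematic cases discussed next arise when one node is weakly connected to almost everything.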
Though the Neighborhood-Based aggregation has many advantages, it is somewhat unreasonable in certain details.
In Algorithm 3, it picks i ∈ {1, 2, …, n} in a natural order and defines a new aggregate, the (J+1)th aggregate, as the strong neighborhood of node i, if such a strong neighborhood is contained in the remaining set W \ ⋃_{j=1}^{J} Q_j. As for a strong neighborhood N_i of some node i, from (19) we can see that if max_{k≠i} {−ā_{ik}} is relatively small, then the strength of connection from the other nodes in N_i to node i will be weak. Even worse, the nodes in N_i except i may be weakly connected with many others universally; that is to say, −ā_{jk} is relatively small for many j, k ∈ N_i.
And if such a node i is near the beginning of {1, 2, …, n}, it will be aggregated earlier than most other nodes, and then it is very likely that an aggregate is constructed using such a neighborhood N_i as described above.
This aggregate constructed through N_i is unreasonable. First, it contains many nodes which are neither strongly connected with i nor strongly connected with each other. Secondly, because it is formed early, it may separate some nodes which are truly strongly connected.
Here, we take a toy example to illustrate.
However, let us check which aggregates of this matrix will be constructed by the Neighborhood-Based algorithm. We define the strong connection matrix S as follows: for i ≠ j,

s_{ij} = 1 if −ā_{ij} ≥ θ · max_{k≠i} {−ā_{ik}}, and s_{ij} = 0 otherwise, (22)

and for i = j, s_{ii} = 1. So, s_{ij} = 1 means that node j is strongly connected with node i and is included in N_i. S is a symmetric matrix, and the ith row or ith column represents the strong neighborhood N_i. In this example, S is

S = [1 1 1 1 1 1 1 1
     1 1 1 1 1 0 0 0
     1 1 1 1 1 0 0 0
     1 1 1 1 1 0 0 0
     1 1 1 1 1 0 0 0
     1 0 0 0 0 1 1 1
     1 0 0 0 0 1 1 1
     1 0 0 0 0 1 1 1]. (24)

In the Neighborhood-Based Algorithm 3, nodes are picked in a natural order, so the 1st node is picked first and its strong neighborhood N_1 is used as the 1st aggregate. It is seen from S in (24) that all the other nodes are strongly connected to node 1. Adding the 1st node itself, the strong neighborhood N_1 contains all the nodes. The 1st aggregate is constructed containing all the nodes, and the algorithm ends. This is too aggressive and unreasonable.
We consider that picking nodes in a natural order to construct aggregates is unreasonable. Here we consider picking nodes in a different order.
We define a measurement size_nei(i) of node i as

size_nei(i) := length(N_i), (25)

which is the number of nodes in the strong neighborhood N_i of node i.
We can suppose that the more nodes the strong neighborhood N_i contains, the more likely the transition probabilities are to be distributed evenly and to be relatively small. So, we preferentially pick the nodes with smaller size_nei values to construct aggregates. We apply this strategy by reordering W by size_nei value so that the reordered set is increasing in size_nei. This strategy can avoid generating the unreasonable aggregate in the 1st process (generating the initial aggregates), as the example shows.
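The reordering of the first process is a one-liner (the neighborhoods below are a hypothetical example in which node 0 is weakly but widely connected):

```python
# Modified 1st process: pick nodes in increasing neighborhood size, so that
# weakly-but-widely connected nodes are aggregated last.
def size_nei_order(nbhd):
    return sorted(range(len(nbhd)), key=lambda i: len(nbhd[i]))

# Hypothetical neighborhoods: node 0 touches everyone (large, weak N_0).
nbhd = [{0, 1, 2, 3}, {0, 1}, {0, 2}, {0, 3}]
order = size_nei_order(nbhd)           # node 0 is now picked last
```

Since Python's sort is stable, ties in size_nei preserve the natural order, which matches the O(n log₂ n) reordering cost noted below.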
However, the Neighborhood-Based aggregation also has some defects in the 2nd process that may lead to bad aggregates.
As seen from the 2nd process of Algorithm 3, it picks the nodes remaining after the 1st process and adds them to the "most connected" aggregates. The reason a node i is left over is that its strong neighborhood N_i has some common nodes with at least one of the existing aggregates, and the "most connected" aggregate is defined as the aggregate which has the most common nodes with N_i.
However, when the algorithm considers which aggregate i should be added to, it ignores N_i itself. Consider a situation where N_i has a small number of common nodes with the existing aggregates compared to the size of N_i. If we still add i to the "most connected" aggregate, as Algorithm 3 does, it may produce bad aggregates containing many nodes which should be separated because of the weak connection between each other.
We use Example 2 to illustrate again. We change the order of picking nodes to avoid the problem caused by the 1st process discussed above. First, the 6th node should be picked to construct an aggregate in the 1st process of Algorithm 3; because N_6 contains the 1st node and the nodes from the 6th to the 8th, the 1st aggregate is constructed as {1, 6, 7, 8}. Then we pick the nodes from the 2nd to the 5th. It can be seen from S in (24) that their strong neighborhoods are the same and contain the nodes from the 1st to the 5th. These strong neighborhoods have a common node with the 1st aggregate, the 1st node, so the nodes from the 2nd to the 5th are all remaining nodes and should be added to the "most connected" aggregate. According to the 2nd process of Algorithm 3, each node should be added to the aggregate which has the most common nodes with its strong neighborhood. Thus, the result of the algorithm is again a single aggregate with all the nodes, still too aggressive and unreasonable.
Here we propose an improved scheme for the 2nd process in Algorithm 3. Let N_i^r denote the remaining strong neighborhood of the remaining node i:

N_i^r := N_i \ ⋃_{j=1}^{J} Q_j. (26)

Note that the main extra work of our modified Neighborhood-Based Algorithm 4 compared to the old version consists of reordering the set W, which can be done in O(n log₂ n) work in the first process, and the loss of the parallelization of the second process.
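A sketch of the modified second process, applied to the 8-node example above (0-indexed); the tie-breaking rule used here (joining an existing aggregate only when its overlap with N_i is at least |N_i^r|) is our reading of Algorithm 4, not a verbatim transcription:

```python
def modified_second_pass(nbhd, agg, aggregates):
    """Modified 2nd process: a leftover node i joins an existing aggregate only
    if that aggregate shares at least as many nodes with N_i as the remaining
    strong neighborhood N_i^r contains; otherwise N_i^r becomes a new aggregate."""
    for i in range(len(nbhd)):
        if agg[i] != -1:
            continue
        assigned = set().union(*aggregates) if aggregates else set()
        rem = nbhd[i] - assigned              # N_i^r: drop already-aggregated nodes
        overlaps = [len(a & nbhd[i]) for a in aggregates]
        if overlaps and max(overlaps) >= len(rem):
            k = overlaps.index(max(overlaps))
            aggregates[k].add(i)              # join the "most connected" aggregate
            agg[i] = k
        else:                                 # N_i^r (which contains i) is new
            for j in rem:
                agg[j] = len(aggregates)
            aggregates.append(set(rem))
    return agg, aggregates

# The 8-node example (0-indexed): node 0 is strongly connected to everyone,
# nodes 1-4 form one natural group and nodes 5-7 (plus node 0) the other.
nbhd = [set(range(8))] + [{0, 1, 2, 3, 4}] * 4 + [{0, 5, 6, 7}] * 3
# State after the reordered 1st process: aggregate 0 = {0, 5, 6, 7}.
agg = [0, -1, -1, -1, -1, 0, 0, 0]
aggregates = [{0, 5, 6, 7}]
agg, aggregates = modified_second_pass(nbhd, agg, aggregates)
```

On this example the leftover nodes 1-4 overlap the existing aggregate in only one node, so they correctly form a second aggregate instead of being absorbed into the first.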

Cost-Effective SAM for Markov Chains
The smoothed aggregation multigrid for Markov chains described in Algorithm 1 is one type of adaptive algebraic multigrid algorithm. It constructs the aggregation matrix Q based on the strength of connection in the scaled problem matrix Ā = A diag(x), without any need for advance knowledge of the topology of the Markov chain. The aggregation process is fully adaptive, with aggregates constructed adaptively in each iteration and at all recursive levels. This type of multigrid method is powerful and scalable for many Markov chain problems compared to the classical AMG, which uses the operators without any update. But it is relatively more expensive because it constructs the whole set of multigrid operators in every cycle, and unfortunately the computational cost of constructing operators accounts for a large proportion of the whole cost of a cycle. In [24], it is noted that, empirically, more than 50% of the computation time of each cycle is spent on coarse matrix construction. And through our observation, the most costly procedure is the aggregation procedure which constructs the matrix Q.
In the numerical experiments, there are some cases where the aggregation matrix Q at a certain level of the current cycle is the same as the corresponding one in the previous cycle, so implementing the aggregation method to update Q is a complete waste. The Neighborhood-Based aggregation and our modified version depend on the strong neighborhood N_i of each node i and on the order in which nodes are picked. And because the strong neighborhood N_i depends only on the strong connection matrix S in (22), and so does the reordering in our modified version, we can conclude that, with the same strong connection matrix S, the results of Neighborhood-Based aggregation or of our modified version will be the same.
Actually, in Neighborhood-Based aggregation, it is necessary to construct the strong neighborhood N_i of each node, and that is equivalent to constructing the strong connection matrix S, because N_i = S(i, :) according to the definitions of N_i and S. So there is no extra work to construct S, and there is only a need to add a judgment step. If the current S is the same as the corresponding one in the previous cycle, there is no need to implement the remaining part of the Neighborhood-Based aggregation, and the corresponding aggregation matrix Q can still be used, leading to much time being saved in this procedure. Meanwhile, the storage requirement increases, because we also need to store the strong connection matrix S and the aggregation matrix Q at each level of the previous cycle. The aggregation matrix Q is very sparse with only n nonzeros, where n is the size of the current level. The strong connection matrix S is symmetric, so only the upper (lower) triangular part and the diagonal elements need to be stored. The whole algorithm is presented in Algorithm 5.
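The judgment step can be sketched as a small cache keyed on the strong connection matrix S; the builder callback below stands in for the remaining aggregation work of Algorithm 3 or 4:

```python
import numpy as np

class AggregationCache:
    """Judgment step of the cost-effective SAM: reuse the previous cycle's
    aggregation result when the strong connection matrix S is unchanged."""
    def __init__(self):
        self.S_pre = None
        self.Q_pre = None

    def get_aggregation(self, S, build_Q):
        if self.S_pre is not None and np.array_equal(S, self.S_pre):
            return self.Q_pre                 # same S  ->  same aggregates
        Q = build_Q(S)                        # otherwise run the full aggregation
        self.S_pre, self.Q_pre = S.copy(), Q
        return Q

calls = []
def builder(S):                               # stands in for Algorithm 3 / 4
    calls.append(1)
    return S.sum()                            # placeholder "aggregation result"

cache = AggregationCache()
S1 = np.array([[1, 1], [1, 1]])
cache.get_aggregation(S1, builder)
cache.get_aggregation(S1.copy(), builder)     # S unchanged: cache hit
n_after_repeat = len(calls)
cache.get_aggregation(np.eye(2), builder)     # S changed: rebuild
n_after_change = len(calls)
```

In Algorithm 5 one such cache is kept per level, trading the extra storage of S and Q for the skipped aggregation work.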
Note that the aggregation methods based on connection strength, including Neighborhood-Based aggregation, our modified Neighborhood-Based aggregation, and distance-one (distance-two) aggregation, are all suitable for Algorithm 5. And this cost-effective SAM (Algorithm 5) is mathematically equivalent to SAM (Algorithm 1).

Numerical Experiments
In this section, we report numerical results obtained using a Matlab R2014a implementation on 64-bit Windows 7 with a Core i5-3470 processor and 8 GB of RAM. One purpose of our numerical tests is to examine whether our modified Neighborhood-Based aggregation can generate aggregates in a more reasonable way and bring an accuracy improvement compared to the original version, and to see whether it is applicable to some cases where the original Neighborhood-Based aggregation performs badly. Another purpose is to test the efficiency of our cost-effective SAM (referred to as CESAM) for Markov chains. We also compare the CESAM method with two standard methods, the Power method and the Gmres(50) method, to demonstrate its efficiency. Four Markov chain problems studied in [7], including 2 structured problems and 2 random walks on unstructured planar graphs, two Markov chain examples from queueing models, and a typical Markov chain constructed by ourselves like Example 2 in Section 3 have been employed in our experiments. In our experiments, we use the V cycle only. For convenience, we let CESAM O and SAM O denote the applications of CESAM and SAM with the original Neighborhood-Based aggregation, respectively, and CESAM M and SAM M denote the applications of CESAM and SAM with our modified Neighborhood-Based aggregation, respectively.
Without loss of generality, specific sets of parameters are employed, taken from [7]. As mentioned earlier, the weighted Jacobi method is used for the pre- and post-smoothing in SAM (Algorithm 1) and CESAM (Algorithm 5). Let ν₁ = ν₂ = 1 and set the relaxation parameter ω = 0.7 in the experiments. Note that, even though these parameter values work well for all tests considered here, they are likely to be problem-dependent. The coarsest-level solution is computed by the coarsest-level Algorithm 2 when the size is below 50; the strength of connection parameter is chosen as θ = 0.25. The initial guess is generated by random sampling from a uniform (0, 1) distribution and normalized to one in the one-norm for the first four Markov chains, and has all entries equal to 1/n (n is the dimension of the Markov chain) for the last Markov chain. Iterations are terminated when res = ‖A x_i‖₁ / ‖A x_0‖₁ ≤ 10^{-12}, with x_i the current approximate solution and x_0 the initial guess. The maximum number of iterations is set as 200 for the SAM and CESAM methods, 10^6 for the Power method, and 5000 for the Gmres(50) method.
Numerical results are reported in the tables, where "n" is the problem size, "it" denotes the iteration count, and "CPU_t" denotes the total computing time in seconds. "C_op" is the operator complexity of the last cycle, which is defined as the sum of the numbers of nonzero entries in the operators A_l on all levels divided by the number of nonzero entries in the fine-level operator A_1; that is, C_op = ∑_{l=1}^{L} nnz(A_l) / nnz(A_1), with nnz(A_l) the number of nonzero entries in A_l.
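For example (a sketch with two invented operator levels):

```python
import numpy as np

def operator_complexity(ops):
    """C_op = sum of nnz(A_l) over all levels, divided by nnz(A_1)."""
    nnz = [np.count_nonzero(A) for A in ops]
    return sum(nnz) / nnz[0]

# Invented two-level hierarchy: 9 fine-level nonzeros, 2 coarse-level nonzeros.
c_op = operator_complexity([np.ones((3, 3)), np.eye(2)])
```

Values of C_op close to 1 indicate a cheap hierarchy; aggressive aggregation lowers C_op at the possible expense of convergence.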

Uniform 2D Lattice.
The first test problem is a Markov chain on a 2D lattice with uniform weights. The matrix A is essentially a scaled graph-Laplacian on a 2D uniform quadrilateral lattice with a 5-point stencil (see Figure 2). It is a typical structured problem. Tables 1 and 2 give the numerical results.
Table 1 shows that, for this simple and highly structured problem, there is little difference in iterations between the Neighborhood-Based aggregation and our modified version. The operator complexity "C_op" of SAM M is a little larger than that of SAM O, owing to the fact that our modified Neighborhood-Based aggregation is less aggressive than the original version. This larger operator complexity and the additional work of our modified Neighborhood-Based aggregation lead to a larger computing time per cycle. Tandem queuing networks exist in some practical applications, for example, in wireless networks [25] and blood screening procedures [26].
Following the settings and description in [17], we consider the situation where there are two finite queues with single servers; customers arrive according to a Poisson distribution with rate μ, and the service time distributions at the two single-server stations are Poisson with rates μ₁ and μ₂. In our numerical simulation, we set the number of customers in the queues as n = 63, 127, 255, and set μ = 10, μ₁ = 11, and μ₂ = 10.
The states of this system can be represented by tuples (n₁, n₂), with n₁ denoting the number of customers waiting in the first queue and n₂ the number in the second queue. Then the total number of states is given by (n + 1)². It is also a structured problem and is illustrated in Figure 3. The numerical results for this problem are presented in Tables 3 and 4.
As seen from Table 3, as in the previous example, the "C_op" of SAM with our modified Neighborhood-Based aggregation is still larger than that of SAM with the original Neighborhood-Based aggregation. This is mainly caused by the difference in the 2nd process of these two aggregation methods, because these two test problems are very regular, which means the strong connections are distributed regularly. If the saving in aggregation calls is almost the same, CESAM M with fewer iterations will be more efficient than CESAM O; this can be verified by the "CPU_t" of these two methods. The CESAM M method still shows better performance than the Power method and the Gmres(50) method, as can be seen in Table 4.

M/H2/1
Queues. The next test problem is M/H2/1 queues from [10]. M/H2/1 queues are used in mobile communications [27]. The M/H2/1 queue is illustrated in Figure 4. Arrivals are generated according to a Poisson distribution at rate λ, while the service is represented by a two-phase hyperexponential distribution. With probability p a customer entering service receives service at rate μ₁, while with probability 1 − p this customer receives service at rate μ₂. The numerical results for this problem are presented in Tables 5 and 6.
Table 5 shows that the iterations needed for SAM M are fewer than those needed for SAM O. This means the modified Neighborhood-Based aggregation can generate more suitable aggregates than the original version. Fewer iterations also result in a smaller computing time of SAM M and CESAM M compared to SAM O and CESAM O, respectively. As for the efficiency of the cost-effective strategy, we can see that CESAM converges in a much smaller time than SAM, with a 74.3%-75.2% proportion of time saved by CESAM O compared to SAM O and a 75.2%-82.2% proportion of time saved by CESAM M compared to SAM M. Finally, as Table 6 shows, the CESAM M method still obtains much better performance for this problem than the Power and Gmres(50) methods.

H2/E3/1
Queues.The final structured test problem is H2/E3/1 queues from [10].H2/E3/1 queues are used in network engineering including computer communications networks [28].The arrival process of the H2/E3/1 queue is a two-phase hyperexponential distribution while the service process is an Erlang-3 distribution.This queueing system can be shown graphically in Figure 5.For more details, see [10].The numerical results for this problem are presented in Tables 7 and 8.
As Table 8 shows, the CESAM M method still obtains much better performance for this problem than the Power and Gmres(50) methods. For these four structured test problems, SAM M shows higher numerical scalability than SAM O, since the iterations needed remain almost the same as the sizes of the problems increase.

Random Planar Graph (Undirected).
Random walks on graphs play important roles in several fields, including information retrieval and statistical physics. For example, many ranking problems, for example, PageRank [2] and GeneRank [3], corresponding to directed and undirected graphs, respectively, can be modeled by random walks on graphs which represent the relations between the items. First, we consider an unstructured planar (undirected) graph and calculate the stationary probability distribution of the random walk on the graph. We randomly distribute n nodes in (0, 1)². Then a planar graph connecting these nodes is constructed by using Delaunay triangulation, and each pair of nodes that shares an edge within the triangulation is connected by bidirectional links. The probability of transition from node i to node j is given by the reciprocal of the number of outward links from node i. See Figure 6(a) for a small version of this example. Table 9 shows that, for this unstructured problem, many iterations are saved using our modified Neighborhood-Based aggregation.

Random Planar Graph (Directed)
Then, we consider directed unstructured problems. The unstructured planar graphs from the previous example are used to form a similar problem with a nonsymmetric sparsity structure. Based on the graphs described in the previous example, we select a subset of triangles from the triangulation such that no two triangles in the set are neighbors. This is done by randomly selecting a triangle, marking it with "+", and marking all of its three neighbors with "−". We then repeat this process for the next unmarked triangle until all triangles are marked.
Then we randomly delete one of the six directed arcs that connect the three nodes in each "+" triangle to destroy the symmetry. This process ensures that the resulting Markov chain is still irreducible. The probability of transition from node i to node j is given by the reciprocal of the number of outward links from node i. See Figure 6(b) for a small version of this example with the "+" triangles marked. As seen from Table 11, for this nonsymmetric unstructured case, the behavior of these methods is very similar to that in the previous example. Again, many iterations are saved using our modified Neighborhood-Based aggregation, and the difference between the operator complexities "C_op" of SAM O and SAM M is still large. For this problem, with size of 423 × 600.
Although this situation seems extreme, it represents similar cases which may occur locally in the matrices of Markov chains. So we hope to evaluate the rationality and universality of the original Neighborhood-Based aggregation and our modified version through this simple test matrix. The results of the numerical experiments are presented in Table 13. Since we focus on the convergence rate as affected by the original and the modified Neighborhood-Based aggregation, and there is no difference between CESAM and SAM in this respect, only CESAM with these two aggregation methods is implemented for this example.
From Table 13, we can observe that the computing time and the iteration counts are reduced tremendously when our modified Neighborhood-Based aggregation is used. The performance of the original Neighborhood-Based aggregation on this problem is unacceptable, which means it is not applicable for this problem, but our modified Neighborhood-Based aggregation can handle it efficiently. The "coarsest size" in Table 13 shows that the original Neighborhood-Based aggregation generates 1 aggregate containing all the nodes for this test matrix, while our modified Neighborhood-Based aggregation generates 2 aggregates. As seen from (27), this test matrix should be divided into 2 aggregates, as our modified Neighborhood-Based aggregation did. Because many nodes with weak connections are gathered together

Figure 1 :
Figure 1: Multilevel V cycle: the finest-level operations are represented at the top, the coarsest-level operations are at the bottom, and intermediate-level operations are in between.The black dots represent relaxation operations on corresponding levels and the open dots represent direct solvers.

Figure 6 :
Figure 6: Graphs for small versions of the random planar graph (a) and the random planar graph, nonsymmetric (b). Black dots represent nodes, and light gray arrows represent bidirectional links. For (b), black arrows represent unidirectional links, and triangles with "+" inside have a single link that was made unidirectional. For easier visualization, the graphs shown here have a more regular distribution of points than the actual points used to build the Markov chains.
P ← smooth(P̂),   R ← smooth(R̂),   (6)

A_c ← lumping(R A P),   x_c ← 1,   (7)

which means removing from N_i those nodes that already belong to existing aggregates and taking the remaining set as a candidate aggregate. Then, as before, we find the aggregate that has the most nodes in common with N_i. If the aggregate found is one of the existing aggregates, we add node i to that aggregate. Otherwise, we take the remaining strong neighborhood together with node i as a new aggregate, the (J + 1)th aggregate (J being the number of existing aggregates). The whole algorithm is presented in Algorithm 4.
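The leftover-node step just described can be sketched as below. This is an illustrative reading of the text, not the paper's pseudocode: `strong_nbhd` stands for the strong neighborhood N_i, overlap is measured against the full neighborhood, and a tie is resolved here in favor of creating a new aggregate.

```python
def assign_leftover(i, strong_nbhd, aggregates, membership):
    """Assign an unaggregated node i per the modified Neighborhood-Based rule.

    strong_nbhd : set of nodes strongly connected to i
    aggregates  : list of sets, the existing aggregates (mutated in place)
    membership  : dict mapping aggregated nodes to their aggregate index
    """
    remaining = {n for n in strong_nbhd if n not in membership}
    # The remaining neighborhood is the incumbent candidate; an existing
    # aggregate wins only if it shares strictly more nodes with N_i.
    best_j, best_overlap = -1, len(remaining)
    for j, agg in enumerate(aggregates):
        overlap = len(agg & strong_nbhd)
        if overlap > best_overlap:
            best_j, best_overlap = j, overlap
    if best_j >= 0:                        # join an existing aggregate
        aggregates[best_j].add(i)
        membership[i] = best_j
    else:                                  # start the (J + 1)th aggregate
        new_agg = remaining | {i}
        aggregates.append(new_agg)
        for n in new_agg:
            membership[n] = len(aggregates) - 1

# Demo: node 4's leftover strong neighbors spawn a new aggregate; node 7,
# whose strong neighbors all sit in aggregate 0, simply joins it.
aggregates = [{0, 1}, {2, 3}]
membership = {0: 0, 1: 0, 2: 1, 3: 1}
assign_leftover(4, {1, 5, 6}, aggregates, membership)
assign_leftover(7, {0, 1}, aggregates, membership)
```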

Table 1: Numerical results for 2D lattice with uniform weights.

Table 2: Numerical results for 2D lattice with uniform weights.
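A note on the tables' headers: the "op" column presumably abbreviates the operator complexity C_op. Assuming the standard AMG definition (total nonzeros of the operators on all levels divided by the nonzeros of the finest operator, which we have not confirmed against the paper), it can be computed as:

```python
import numpy as np

def operator_complexity(level_operators):
    """C_op = (sum of nnz over all level operators) / (nnz of the finest one).
    Standard AMG definition; assumed, not confirmed, to match the tables."""
    nnz = [np.count_nonzero(A) for A in level_operators]
    return sum(nnz) / nnz[0]
```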
our CESAM, it has the same C_op (not shown in the table) and the same iteration counts as SAM, but converges in much less time. The CESAM can avoid much of the work in the aggregation procedure, and this saving is problem-dependent. For example, for the size of 4096, CESAM O implements the remaining work of the aggregation procedure 35 times, while SAM O does so 96 times. In addition, most of these savings occur on the finer levels, resulting in a 63.8% reduction in computing time for CESAM O compared to SAM O. The situation is similar for the other sizes, and for CESAM M compared to SAM M. Seen from Table 2, compared with the Power and Gmres(50) methods, CESAM M shows better performance in both iterations and computing time.

5.1.2. Tandem Queuing Network. The next test problem is an open tandem queuing network from

Table 3: Numerical results for tandem queuing network.

Table 4: Numerical results for tandem queuing network.

Table 5: Numerical results for M/H2/1 queues.

As shown in Table 7, the iterations needed for SAM M are still fewer than those needed for SAM O, which results in smaller computing times for SAM M and CESAM M compared to SAM O and CESAM O, respectively. For the

Table 7: Numerical results for H2/E3/1 queues.

Table 9: Numerical results for random planar graph (undirected).

Neighborhood-Based aggregation within SAM. Similar to the previous examples, there is a considerable saving in computing time by our CESAM, and this saving is more remarkable when the modified Neighborhood-Based aggregation is used. For the size of 65536, the computing time of SAM M is 69.7% as large as that of SAM O, but the computing time of CESAM M is 58.4% as large as that

Table 12: Numerical results for random planar graph (directed).

A 4%-44.2% proportion of computing time is reduced by our modified Neighborhood-Based aggregation compared to the original Neighborhood-Based aggregation within SAM. When CESAM is used, computing time is reduced by 51.3%-65.0%, and this proportion increases with the problem size. As shown in Table 12, when the problem size is as small as 1024, CESAM M gives the worst performance compared with the Power and Gmres(50) methods. When the size reaches 4096 and 16384, CESAM M outperforms the Power and Gmres(50) methods, and this advantage increases with the problem size since CESAM M is much more scalable.

5.3. Extreme Example. The last test problem is a Markov chain matrix constructed by ourselves. It is similar to Example 2 in Section 3, but we expand the size to 1024. It is a typical NCD (nearly completely decomposable) Markov chain, which consists of groups of nodes that are strongly connected with each other and very weakly connected to the nodes in other groups. Here we suppose that there exists a node which is very weakly connected to all the other nodes, and that the transition probabilities between it and the other nodes are distributed evenly. For convenience, we constructed this matrix as
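Since the matrix definition itself is cut off above, the construction below is only a plausible illustration of such a chain, not the paper's exact matrix: strongly coupled diagonal blocks, a weak background coupling `eps` between blocks, and one extra state whose transitions to and from all other states are even and weak. All names and parameter values here are assumptions.

```python
import numpy as np

def ncd_with_weak_node(n_blocks=4, block_size=4, eps=1e-5, seed=0):
    """Column-stochastic NCD-like matrix plus one evenly, weakly linked state."""
    rng = np.random.default_rng(seed)
    n = n_blocks * block_size + 1
    B = np.full((n, n), eps)                     # weak inter-group coupling
    for k in range(n_blocks):                    # strong intra-group coupling
        s = slice(k * block_size, (k + 1) * block_size)
        B[s, s] = rng.uniform(0.5, 1.0, (block_size, block_size))
    B[-1, :] = eps                               # weak transitions INTO the extra state
    B[:, -1] = eps                               # even, weak transitions OUT of it
    return B / B.sum(axis=0)                     # normalize columns to sum to 1

B = ncd_with_weak_node()
```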

Table 13: Numerical results for this extreme example.