Regeneration and General Markov Chains

Ergodicity, continuity, finite approximations and rare visits of general Markov chains are investigated. The results obtained permit further quantitative analysis of characteristics such as rates of convergence, continuity (measured as a distance between perturbed and non-perturbed characteristics), deviations between Markov chains, accuracy of approximations, and bounds on the distribution function of the first visit time to a chosen subset. The underlying techniques embed the general Markov chain into a wide sense regenerative process by means of a splitting construction.


Introduction
In paying tribute to Lajos Takács, one must remember that he is one of the few outstanding mathematicians who paid much attention to transient behavior in queueing systems, which he investigated by analytic methods in his classic book Introduction to the Theory of Queues (Oxford University Press, New York, 1962). The problem of transient behavior turned out to be extremely difficult even for simple queueing models. Because of this, it is often reasonable not to obtain explicit formulas for different characteristics but to approximate these characteristics or to investigate general properties of the underlying processes.

This paper deals with general Markov chains. The theory of such chains has undergone dramatic changes recently. These changes were caused by the discovery of an embedded renewal process and, therefore, the feasibility of employing a recurrent events technique (as in Feller [3]), which has been employed successfully for denumerable chains. We refer to Nummelin [14] and Meyn and Tweedie [13] for further details.

Though a general Markov chain can be considered as a regenerative process, there are at least two features that distinguish the general case from the denumerable or finite case. First, a general Markov chain (even Harris-recurrent) has no "proper atom" (in Nummelin's terminology, this is an analog of a recurrent state), and so the traditional choice of recurrence times to a specific state as regenerative epochs is inadmissible in general. This was the reason for E. Nummelin to suggest a so-called "splitting," which enables one to construct an enlarged Markov chain comprising the initial chain as a component and possessing a proper atom.

We write $P_x(\cdot) = P(\cdot \mid X_0 = x)$.
Let $\mathcal{M}^+$ denote the collection of $\sigma$-finite nonnegative measures $\varphi$ that are defined on $(\mathcal{X}, \mathcal{B})$ and satisfy the inequality $\varphi(\mathcal{X}) > 0$. The following definitions are standard and can be found in Nummelin [14].
Definition 1: We say that state $x \in \mathcal{X}$ leads to $B \in \mathcal{B}$, and denote this by $x \to B$, if $P(x, n; B) > 0$ for some $n > 0$.
Definition 2: If $x \to B$ for any $x \in \mathcal{X}$ and any $B \in \mathcal{B}$ with $\varphi(B) > 0$, the Markov chain $X$ is called $\varphi$-irreducible.
A Markov chain $X$ is called irreducible if it is $\varphi$-irreducible for at least one $\varphi \in \mathcal{M}^+$.

Proposition 1 (Nummelin [14]): Given an irreducible Markov chain $X$, there exists a maximal irreducibility measure $\psi \in \mathcal{M}^+$ for which
(i) $X$ is $\psi$-irreducible;
(ii) if $X$ is $\varphi$-irreducible, then $\varphi \ll \psi$ (this means $\psi(B) = 0 \Rightarrow \varphi(B) = 0$, $B \in \mathcal{B}$);
(iii) if $\psi(B) = 0$ and $B' = \{x : x \to B\}$, then $\psi(B') = 0$.

Proposition 2 (Nummelin [14]): Given an irreducible Markov chain $X$, there exist a subset $C \in \mathcal{B}$, an integer $m \ge 1$, a positive $\beta$, and a measure $\nu \in \mathcal{M}^+$ such that
$$P(x, m; B) \ge \beta\, 1_C(x)\, \nu(B), \quad B \in \mathcal{B}, \tag{1}$$
where $1_C(x)$ is the indicator of the set $C$. Moreover, if $C_0$ is such that $\psi(C_0) > 0$, then there exist $C$, $m$, $\beta$, and $\nu$ such that $C \subset C_0$ and (1) holds.
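For a chain with finitely many states, the minorization condition (1) can be checked directly. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical three-state transition matrix `P` and a hypothetical set `C`, writes the constant and the measure in (1) as `beta` and `nu`, and obtains them by taking the columnwise minimum of the m-step transition probabilities over `C`. All function names are illustrative.

```python
from fractions import Fraction as F


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def m_step(p, m):
    """m-step transition matrix P(x, m; .) for m >= 1."""
    out = p
    for _ in range(m - 1):
        out = mat_mul(out, p)
    return out


def minorize(p, c_set, m=1):
    """Return (beta, nu) such that P(x, m; {j}) >= beta * nu[j] for all x in C."""
    pm = m_step(p, m)
    # Largest common component of the rows indexed by C.
    floor = [min(pm[x][j] for x in c_set) for j in range(len(p))]
    beta = sum(floor)
    if beta == 0:
        raise ValueError("no common component: try another C or a larger m")
    nu = [v / beta for v in floor]  # normalized to a probability measure
    return beta, nu


def holds(p, c_set, m, beta, nu):
    """Check (1) on singletons; for x outside C the right side is zero."""
    pm = m_step(p, m)
    return all(pm[x][j] >= beta * nu[j] for x in c_set for j in range(len(p)))


# Hypothetical example: a birth-death chain on {0, 1, 2} with C = {0, 1}.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
C = {0, 1}
BETA, NU = minorize(P, C, m=1)
```

In this example the measure `NU` is concentrated on `C`, which is the normalization used later for Harris-recurrent chains. On a general state space no such mechanical recipe is available, and the choice of the set and the measure is part of the analysis.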
If $d = 1$, then $X$ is called aperiodic.
Proposition 3 (Orey [15]): A Markov chain $X$ is Harris-recurrent if and only if there exist a subset $C \in \mathcal{B}$, an integer $m \ge 1$, a positive $\beta$, and a probability measure $\nu$, $\nu(C) = 1$, such that (1) and (2) hold.

Throughout this paper, we consider (by default) only Harris-recurrent Markov chains for which condition (1) holds, even though $C$, $\nu$, $\beta$, and $m$ can differ for different chains.
Since we will use notions concerning discrete time regenerative processes, we give necessary definitions due mainly to Thorisson [18]; see also Kalashnikov [8].
Let $Z = (Z_0, Z_1, \dots)$ be a sequence of random variables (r.v.'s) taking values in a complete separable metric space $\mathcal{Z}$, and let $S = (S_0, S_1, \dots)$, $S_0 \le S_1 \le \cdots$, be a sequence of nonnegative integer r.v.'s. For the random pair $(Z, S)$, let $\theta_n(Z, S)$ denote its shift by $n \ge 0$ time units.

Definition 6: A random pair $(Z, S)$ is called a classic sense regenerative process if for any $i \ge 0$ all shifts $\theta_{S_i}(Z, S)$ are identically distributed and do not depend on the "prehistory" $(Z_0, \dots, Z_{S_i - 1}, S_0, \dots, S_i)$. The sequence $S$ is called the renewal process embedded in $(Z, S)$.

Definition 7: A random pair $(Z, S)$ is called a wide sense regenerative process if for any $i \ge 0$ the shifts $\theta_{S_i}(Z, S)$ are identically distributed and do not depend on $(S_0, \dots, S_i)$.
Obviously, any classic sense regenerative process is regenerative in the wide sense. Definition 7 still implies that the sequence $S$ is a renewal process, which means that all inter-regeneration times $W_i = S_i - S_{i-1}$, $i \ge 1$, are i.i.d. r.v.'s.

Definition 8: A wide sense regenerative process $(Z, S)$ is called stationary if all shifts $\theta_n(Z, S)$, $n \ge 0$, are identically distributed and hence $\theta_n(Z, S) \stackrel{d}{=} (Z, S)$, where $\stackrel{d}{=}$ means identity in distribution.

Definition 9: A wide sense regenerative process $(Z', S')$ is a version of a process $(Z, S)$ if $\theta_{S'_0}(Z', S') \stackrel{d}{=} \theta_{S_0}(Z, S)$.

Proposition 4 (Thorisson [18]): In order for a stationary version $(Z', S')$ of a wide sense regenerative process $(Z, S)$ to exist, it is necessary and sufficient that $E(S_1 - S_0) < \infty$. In this case, the probability distribution of $(Z', S')$ is defined uniquely by the equality
$$P((Z', S') \in \cdot\,) = \frac{1}{E(S_1 - S_0)}\, E \sum_{n = S_0}^{S_1 - 1} I(\theta_n(Z, S) \in \cdot\,),$$
where $I(\cdot)$ is the indicator of the event $(\cdot)$.
Consider now a splitting construction (see Nummelin [14]). In essence, it consists of embedding the initial Markov chain $X$ into a wide sense discrete time regenerative process $(Z, S)$.
Let $X$ be a Markov chain. Embed it into a regenerative process $(Z, S)$ which is constructed as follows. Let $Z_n = (Y_n, \delta_n)$, where $Y_n$ takes values in $(\mathcal{X}, \mathcal{B})$ and $\delta_n$ is a binary r.v. taking values 0 and 1. We will call $\delta_n$ the bell variable after Lindvall [12] (if $\delta_n = 1$ then the bell rings).
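From the construction below it will follow that
$$Y = (Y_0, Y_1, \dots) \stackrel{d}{=} X. \tag{3}$$
Define random times
$$T_0 = \min\{n \ge 0 : Y_n \in C\}, \quad T_{k+1} = \min\{n \ge T_k + m : Y_n \in C\}, \quad k \ge 0. \tag{4}$$
Let $\delta_n$, $n \ge 0$, be a set of i.i.d. r.v.'s, $P(\delta_n = 1) = \beta$. Let $S_0 < S_1 < \cdots$ be the successive elements of the set
$$\{T_k + m : \delta_{T_k} = 1,\ k \ge 0\}, \tag{5}$$
i.e. a time is declared to be a regeneration epoch if, $m$ steps earlier, one of the instants (4) occurred and the bell rang at that time.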
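If $Y_n = y$ and $T_k + m \le n < T_{k+1}$ (hence, $y \notin C$), then we define $Y_{n+1}$ as a r.v. which depends only on $y$ and has the distribution $P(y; \cdot)$. If $Y_n = y$ and $n = T_k$ (hence, $y \in C$), then two cases are possible according to whether or not the bell rang at time $n$. We combine these two cases, denoting
$$Q_n(\cdot) = \begin{cases} \nu(\cdot), & \text{if } \delta_n = 1, \\ (1 - \beta)^{-1}\big(P(y, m; \cdot) - \beta\, \nu(\cdot)\big), & \text{if } \delta_n = 0. \end{cases} \tag{6}$$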
Let $Y_{n+m}$ be a r.v. with the distribution $Q_n(\cdot)$ and therefore possibly dependent on $y$. Define the collection $Y_{n+1}, \dots, Y_{n+m-1}$ by defining their joint distribution:
$$P(Y_{n+1} \in B_1, \dots, Y_{n+m-1} \in B_{m-1} \mid Y_n = y,\ Y_{n+m} = x) = P(X_{n+1} \in B_1, \dots, X_{n+m-1} \in B_{m-1} \mid X_n = y,\ X_{n+m} = x). \tag{7}$$
Note that the probability in the right hand side of (7) is defined uniquely by the transition function of the chain $X$.
Though the process $Z$ is not Markov in general (it is Markov for $m = 1$), the relation (3) is satisfied. It is obvious that the constructed process $(Z, S)$ is classic sense regenerative if $m = 1$ and wide sense regenerative if $m > 1$. In all cases
$$P(Y_{S_n} \in \cdot\,) = \nu(\cdot), \quad n \ge 0. \tag{8}$$
The construction above is not unique. It is quite similar to the splitting construction suggested in Nummelin [14] for $m = 1$. Other variants can be found in Kalashnikov [8], where the bell variable not only marks virtual regeneration epochs but also counts residual times until these epochs in order to prove the Markov property for $Z$.
Let us agree that the notation $(Z, S)$ always refers to a regenerative process constructed according to (4)-(7), which includes the initial Markov chain as a component.
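For $m = 1$ and a finite state space, the splitting construction can be sketched as a short simulation. This is an illustration under stated assumptions, not the construction used in the proofs: the transition matrix `P`, the set `C`, the constant `BETA` and the measure `NU` are hypothetical and chosen so that the minorization condition (1) holds with `BETA < 1`. As a simplification, the bell variable is drawn only when the chain sits in `C`, since its value is not used at other times.

```python
import random


def residual_kernel(p_row, beta, nu):
    """Next-state law after a visit to C when the bell does not ring."""
    return [(pj - beta * vj) / (1.0 - beta) for pj, vj in zip(p_row, nu)]


def sample(dist, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    u, acc = rng.random(), 0.0
    for j, w in enumerate(dist):
        acc += w
        if u < acc:
            return j
    return len(dist) - 1  # guard against rounding in the cumulative sum


def simulate_split_chain(p, c_set, beta, nu, x0, steps, rng):
    """Return the path y, the bell values, and the regeneration epochs (m = 1)."""
    y, bells, regen = [x0], [], []
    for n in range(steps):
        state = y[-1]
        if state in c_set:
            bell = 1 if rng.random() < beta else 0
            dist = nu if bell else residual_kernel(p[state], beta, nu)
        else:
            bell = 0  # simplification: the bell is irrelevant outside C
            dist = p[state]
        bells.append(bell)
        y.append(sample(dist, rng))
        if bell:
            regen.append(n + 1)  # one step after a visit to C with a ringing bell
    return y, bells, regen


# Hypothetical example: P(x, .) >= BETA * NU(.) for every x in C.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
C = {0, 1}
BETA = 0.5
NU = [1.0 / 3.0, 2.0 / 3.0, 0.0]

y, bells, regen = simulate_split_chain(P, C, BETA, NU, 0, 2000, random.Random(0))
```

Mixing the two branches with weights `BETA` and `1 - BETA` reproduces the original transition law from each state of `C`, which is the finite-state counterpart of relation (3).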
Denote by
$$W_0 = S_0, \quad W_k = S_k - S_{k-1}, \quad k \ge 1, \tag{9}$$
the successive inter-regeneration times for the process $(Z, S)$. By the construction, all $W_k$, $k \ge 1$, are i.i.d. r.v.'s. Recall that we consider here only Harris-recurrent Markov chains. Therefore, all r.v.'s $W_k$, $k \ge 0$, are finite a.s. The following two assertions are direct consequences of the construction above. Their proofs are straightforward and can be found in Kalashnikov [8].
Proposition 5: Given an irreducible aperiodic Markov chain $X$, the r.v. $W_1$ is aperiodic in the sense that
$$\mathrm{GCD}\{j : P(W_1 = j) > 0\} = 1, \tag{10}$$
where GCD stands for the greatest common divisor.

Let $\tau_C = \min\{n : X_n \in C,\ n > 0\}$.
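Condition (10) is easy to check once the distribution of the inter-regeneration time is known. The sketch below is only an illustration: it assumes the distribution is available as a hypothetical dictionary mapping values to probabilities, and the function name `span` is not taken from the text.

```python
from functools import reduce
from math import gcd


def span(pmf):
    """GCD of the support {j : P(W = j) > 0} of a pmf given as {value: prob}."""
    support = [j for j, prob in pmf.items() if prob > 0]
    return reduce(gcd, support)


APERIODIC = span({2: 0.5, 3: 0.5})  # support {2, 3}
PERIODIC = span({2: 0.5, 4: 0.5})   # support {2, 4}
```

A span equal to 1 corresponds to the aperiodic case (10); a larger span means that regenerations can occur only on a sublattice of the time axis.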
We will use the notation $c$ (possibly with indices) for different constants appearing in different relations. We will also introduce a class $\Theta$ of functions $G(n)$, which is useful for characterizing uniformly integrable r.v.'s; see Kalashnikov [7].
Proposition 6: For a Harris-recurrent Markov chain $X$, the following implications are true.
The constants $c_i$, $i = 1, 2, 3$, can be evaluated in closed form in terms of the involved parameters.
to a zero-delayed regenerative process $(Z, S)$.

Ergodic Theorems
Let us call a Markov chain $X'$ a stationary version of another Markov chain $X$ if the two chains have the same transition functions and, in addition, $X'$ is stationary, i.e. $X'_k \stackrel{d}{=} X'_0$, $k \ge 0$.
Theorem 1: If the Markov chain $X$ is positive recurrent, that is,
$$\int_{\mathcal{X}} E_x \tau_C\, \nu(dx) < \infty, \tag{11}$$
then there exists a stationary version $X'$.
Proof: By Proposition 6, the inequality $EW_1 < \infty$ holds for the wide sense regenerative process $(Z, S)$. In turn, this inequality is necessary and sufficient for the existence of a stationary version $(Z', S')$. Consider the first component $Y' = (Y'_0, Y'_1, \dots)$ of the process $Z'$. Evidently, it is a stationary sequence. Let us prove that it is a Markov chain. Suppose first that the chain $X$ is aperiodic. Denote $\pi(\cdot) = P(Y'_0 \in \cdot\,)$. Since $EW_1 < \infty$ and $W_1$ is an aperiodic r.v. by Proposition 5,
$$\mathrm{Var}(X_n, Y'_n) = \mathrm{Var}(X_n, Y'_0) \to 0 \quad \text{as } n \to \infty,$$
where Var stands for the total variation metric (see Kalashnikov [4], Thorisson [17]). Hence, $\lim_{n \to \infty} P(x, n; \cdot) = \pi(\cdot)$ for any $x \in \mathcal{X}$. It follows that
$$\pi(\cdot) = \int_{\mathcal{X}} P(x; \cdot)\, \pi(dx). \tag{12}$$
Introduce now a Markov chain $X'$ with the transition function $P(x; \cdot)$ and the initial distribution $\pi(\cdot)$. Equation (12) yields that $X'$ is stationary. By Proposition 4, $X' \stackrel{d}{=} Y'$, which completes the proof in the aperiodic case. The periodic case can be treated by standard arguments, reducing it to the aperiodic case by considering the Markov chain at every $d$-th step. $\Box$

The following statement discloses the rate of convergence in the ergodic theorem.
The proof follows from Propositions 5 and 6 and from the rates of convergence to the steady state for regenerative processes contained in Kalashnikov [4, 7], Kalashnikov and Rachev [10], and Thorisson [17].
In order to evaluate the constant $c$ in equation (13), one can use estimates of $c$ from Proposition 6 as well as bounds on moments of the r.v. $\tau_C$, which can be obtained, for instance, by the test functions method; see Kalashnikov [4, 7, 8] and Meyn and Tweedie [13].
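Convergence to the stationary distribution in total variation can be observed numerically for a finite chain. The sketch below is an illustration only and does not reproduce the bounds of the theorem: the transition matrix `P` is hypothetical, the stationary distribution is found by simple iteration, and total variation is taken as half the sum of absolute differences, which may differ from the normalization of Var used in the references by a constant factor.

```python
def step(dist, p):
    """One step of the chain applied to a row distribution: dist * P."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


def total_variation(a, b):
    """Half the L1 distance between two distributions on a finite set."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def stationary(p, iters=200):
    """Approximate the stationary distribution by iterating from uniform."""
    dist = [1.0 / len(p)] * len(p)
    for _ in range(iters):
        dist = step(dist, p)
    return dist


def tv_profile(p, x0, horizon):
    """Distances between P(x0, n; .) and the stationary law for n = 0..horizon."""
    pi = stationary(p)
    dist = [1.0 if j == x0 else 0.0 for j in range(len(p))]
    profile = []
    for _ in range(horizon + 1):
        profile.append(total_variation(dist, pi))
        dist = step(dist, p)
    return profile


# Hypothetical aperiodic, irreducible chain on {0, 1, 2}.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
PI = stationary(P)
PROFILE = tv_profile(P, 0, 30)
```

For this particular matrix the distance halves at every step after the first, so the convergence is geometric; slower (for example, power) rates require chains with heavier-tailed inter-regeneration times.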

Continuity
Very often, we cannot obtain the necessary characteristics of a Markov chain, or we do not know its transition function exactly. In such cases we need to investigate a continuity property of Markov chains, namely, to learn whether "small deviations" of transition functions lead to "small deviations" of non-stationary or stationary distributions of the chains. To this end, we introduce the notion of a Feller chain. Let $f: \mathcal{X} \to \mathbb{R}^1$ be a real function defined on the state space of a chain $X$.
Define the operator
$$Pf(x) = \int_{\mathcal{X}} f(y)\, P(x; dy), \tag{14}$$
when the integral in the right hand side of (14) converges.

Definition 10 (Meyn and Tweedie [13]): A Markov chain $X$ is called a weak Feller chain if the function $Pf(x)$ is continuous and bounded provided that $f$ is continuous and bounded. If the mapping $Pf$ is continuous and bounded for any measurable bounded function $f$, then $X$ is called a strong Feller chain.
The following assertion is elementary, but we will need it in the sequel.
Lemma 1: Let $X^{(n)}$, $n \ge 0$, be a sequence of weak Feller chains with the transition functions $P^{(n)}$. If
$$P^{(n)} f(x) \to P^{(0)} f(x), \quad \forall x \in \mathcal{X}, \tag{15}$$
for any continuous and bounded $f$, then
$$\int_{\mathcal{X}} f(y)\, P^{(n)}(x, k; dy) \to \int_{\mathcal{X}} f(y)\, P^{(0)}(x, k; dy) \tag{16}$$
for any $k \ge 0$. If $X^{(n)}$, $n \ge 0$, is a sequence of strong Feller chains, then the statement above holds for any bounded measurable $f$.

According to Theorem 6.5.3 from Kalashnikov [4] (see also Corollary 1 to Theorem 3.5.1 in Kalashnikov [8]), the inequality (20) is a consequence of relations (24) and (25). What is more, it is possible to obtain a quantitative bound, given that one can estimate the corresponding quantities for $j < k$; see Kalashnikov [8].
The requirement that the $X^{(n)}$ are strong Feller chains is rather restrictive (though, in return, we get a continuity property in terms of the total variation metric). We now relax this, requiring that all $X^{(n)}$ are weak Feller chains. In this situation, the relation (17) does not yield (18), in general, and the limiting relation (21) can be violated. So, additional restrictions, appearing in Theorem 3 below, are imposed to overcome the undesirable consequences of such violations. In addition, we will need a "weak" metric BL that metrizes weak convergence of $X^{(n)}$ to $X^{(0)}$ (see Kalashnikov and Rachev [10]); in its definition, $f: \mathcal{X} \to \mathbb{R}^1$, $h$ is a metric in the space $\mathcal{X}$, and $X$ and $Y$ are r.v.'s with values in the space $\mathcal{X}$.

Theorem 3: Let all $X^{(n)}$, $n \ge 0$, be weak Feller chains satisfying the "common" minorization condition (18), and let $P^{(n)} f(x) \to P^{(0)} f(x)$ for any bounded continuous $f$ and any $x \in \mathcal{X}$. Let, in addition, the chain $X^{(0)}$ be irreducible and aperiodic, and let the family of distributions $(b_k^{(n)})_{k > 0}$, $n \ge 0$, be uniformly integrable and uniformly aperiodic, i.e. the relation (22) holds. Then
$$\lim_{n \to \infty} \sup_{k \ge 0} \mathrm{BL}(X_k^{(n)}, X_k^{(0)}) = 0. \tag{26}$$
The proof completely repeats that of Theorem 2.
The statements of Theorems 2 and 3 admit a quantification which follows from results obtained in Kalashnikov [4] and [8]. We display only a bound which is valid in the "power case" $G(n) = n^s$, $s > 1$. Let the probability metric $d$ be defined as $d = \mathrm{Var}$ in the case of Theorem 2 and $d = \mathrm{BL}$ in the case of Theorem 3. Then, in both cases,
$$\sup_k d(X_k^{(n)}, X_k^{(0)}) \le \inf_T \Big\{ \max_{k \le T} d(X_k^{(n)}, X_k^{(0)}) + \dots \Big\}.$$
In particular, if the finite-time continuity inequality (27) holds, then
$$\sup_k d(X_k^{(n)}, X_k^{(0)}) \le c\, (\varepsilon(n))^{(s-1)/s}.$$
Relation (27) arises naturally in queueing theory; see Kalashnikov [7]. For example, single-server, multi-server and multi-phase models can be described by general Markov chains satisfying the finite-time continuity inequality (27).

Approximations
Let us turn to a problem which is complementary to that considered in Section 4. Specifically, we now consider an approximation problem which can be outlined as follows. Let $X = X^{(0)}$ be a general Markov chain. The problem consists of the construction of a sequence of chains $X^{(n)}$ belonging to a prescribed class (e.g., having a finite number of states) such that (26) is true.
First, let us assume that the chain $X^{(0)}$ with the transition probabilities $P^{(0)} = P(x; \cdot)$ satisfies the minorization condition (1). In addition, let $\Gamma_n$, $n \ge 0$, be a collection of nested compacts such that $C \subset \Gamma_0 \subset \Gamma_1 \subset \Gamma_2 \subset \cdots$; $\bigcup_n \Gamma_n = \mathcal{X}$. For each $n \ge 1$, define the chain $X^{(n)}$ as a restriction of the chain $X^{(0)}$ to the compact $\Gamma_n$. It has the following transition probabilities:
$$P^{(n)}(x; B) = P(x; B) + P(x; \mathcal{X} \setminus \Gamma_n)\, \nu(B), \quad B \subset \Gamma_n.$$
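For a finite state space, the restricted transition probabilities can be written out explicitly: the mass that the original chain sends outside the retained set is redistributed according to the measure from the minorization condition. The sketch below is an illustration with hypothetical data; the matrix `P`, the retained set `GAMMA`, and the probability vector `NU` (concentrated inside `GAMMA`) are assumptions, as is the function name `restrict`.

```python
def restrict(p, gamma, nu):
    """Restricted kernel: P_n(x, j) = P(x, j) + P(x, outside gamma) * nu[j]."""
    inside = set(gamma)
    out = {}
    for x in sorted(inside):
        # Probability of leaving the retained set in one step from x.
        escape = sum(p[x][j] for j in range(len(p)) if j not in inside)
        out[x] = {j: p[x][j] + escape * nu[j] for j in sorted(inside)}
    return out


# Hypothetical birth-death chain on {0, ..., 4}, restricted to {0, 1, 2}.
P = [[0.5, 0.5, 0.0, 0.0, 0.0],
     [0.25, 0.5, 0.25, 0.0, 0.0],
     [0.0, 0.25, 0.5, 0.25, 0.0],
     [0.0, 0.0, 0.25, 0.5, 0.25],
     [0.0, 0.0, 0.0, 0.5, 0.5]]
GAMMA = {0, 1, 2}
NU = [1.0 / 3.0, 2.0 / 3.0, 0.0, 0.0, 0.0]  # must vanish outside GAMMA

P_RESTRICTED = restrict(P, GAMMA, NU)
```

Each row of the restricted kernel is again a probability distribution precisely because `NU` has total mass one inside the retained set; an exit from the set is replaced by an immediate restart from `NU`.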
Without loss of generality, we can (and will) assume that, in each subset $\Gamma_n$, the mean residual time until the nearest regeneration time of the process $(Z^{(n)}, S^{(n)})$ is bounded from above by $n$, given $EW_1 < \infty$. Indeed, if $EW_1 < \infty$, then $EW_1^{(n)} \le EW_1 < \infty$ and one can redefine $\Gamma_n$ (if necessary) using $R_x^{(n)}$, the forward recurrence time for the renewal process $S^{(n)}$ given that $X^{(n)}$ starts from the state $x$.
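Lemmas 2 and 3 show how to approximate a general Markov chain $X^{(0)}$ by another chain $X^{(n)}$ having a compact state space. We now move on to conditions ensuring the possibility of approximating a Markov chain $X^{(n)}$ with a compact state space $\Gamma_n$ by a finite chain. Take an arbitrary $\varepsilon > 0$, which can be treated as the accuracy of approximation. Let $a_0, \dots, a_N$ be an $\varepsilon$-net in $\Gamma_n$ such that $a_0 \in C$ (evidently, $N$ may depend on $\varepsilon$). Divide $\Gamma_n$ into $N + 1$ subsets $\Gamma_n^{(0)}, \dots, \Gamma_n^{(N)}$ as follows:
$$\Gamma_n^{(0)} = \{x : h(x, a_0) \le \varepsilon,\ x \in \Gamma_n\};$$
$$\Gamma_n^{(1)} = \{x : h(x, a_1) \le \varepsilon,\ x \in \Gamma_n \setminus \Gamma_n^{(0)}\};$$
$$\dots$$
$$\Gamma_n^{(N)} = \{x : h(x, a_N) \le \varepsilon,\ x \in \Gamma_n \setminus (\Gamma_n^{(0)} \cup \dots \cup \Gamma_n^{(N-1)})\}.$$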
Recall that h stands for the metric that the complete separable metric space X is equipped with.
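The division of a compact set into cells around the points of an $\varepsilon$-net amounts to assigning each point to the first net point within distance $\varepsilon$. The sketch below illustrates this on a finite sample of points; the sample, the net, and the metric are hypothetical, and the function name `partition_by_net` is not taken from the text.

```python
def partition_by_net(points, centers, eps, h):
    """Assign each point to the first center within distance eps.

    Cell j collects the points within eps of centers[j] that were not
    already captured by cells 0..j-1, so the cells are disjoint.
    """
    cells = [[] for _ in centers]
    for x in points:
        for j, a in enumerate(centers):
            if h(x, a) <= eps:
                cells[j].append(x)
                break
        else:
            raise ValueError("the centers do not form an eps-net for the points")
    return cells


def h_abs(x, y):
    """Metric on the real line (an assumption of this example)."""
    return abs(x - y)


# Hypothetical sample of [0, 1] and a three-point net with eps = 0.25.
POINTS = [i / 10 for i in range(11)]
CENTERS = [0.0, 0.5, 1.0]
CELLS = partition_by_net(POINTS, CENTERS, 0.25, h_abs)
```

Because earlier cells take priority, the order of the net points matters; placing the point that belongs to $C$ first mirrors the requirement on $a_0$.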
By the remark to Lemma 3, we can assume (without loss of generality) that
$$\sup_{x \in \Gamma_n} V(x) \le n.$$
Let $A^{(n)}$ be a generating operator of the Markov chain $X^{(n)}$. Then equation (34) yields (35). By the strong Feller property of $X$, the chain $X^{(n)}$ is strong Feller too, and, hence, $V(x)$ is continuous in $\Gamma_n$. Therefore,
$$\sup_{1 \le j \le N} A^{(n')} V(a_j) \le \Delta(\varepsilon) < 0, \tag{37}$$
where $A^{(n')}$ is the generating operator of the finite chain $X^{(n')}$ and $\Delta(\varepsilon) \to -1$ as $\varepsilon \to 0$. The function $V(x)$ is bounded in $\Gamma_n$; see (35). This fact, together with relations (36) and (37), yields that all conditions of Corollary 3 to Theorem 1.1.5 in Kalashnikov [8] (see also Kalashnikov and Rachev [10], Theorem 2, Appendix 5) hold, which, in turn, imply that
$$\sup_{x \in C} E_x \big(\tau_C^{(n')}\big)^s \le c$$
for any $s > 1$ (for example, for $s = 2$). Therefore, by Proposition 5, the family of r.v.'s $(W_1^{(n')})_{n' > 0}$ is uniformly integrable. The strong Feller property of $X^{(n)}$ and the construction (32) yield the corresponding convergence to 0 for any fixed $k$, where $k = 0$ is a simultaneous regeneration time for both $X^{(n)}$ and $X^{(n')}$. This means the relation (33) is a consequence of Theorem 4.

Theorem 4: Let $X$ be an aperiodic Harris-recurrent, positive, strong Feller chain. Then there exists a sequence of finite chains $X^{(N)}$ such that
$$\lim_{N \to \infty} \sup_n \mathrm{BL}(X_n, X_n^{(N)}) = 0. \tag{39}$$
The proof follows from the statements of Lemmas 2 and 4 if one takes $X^{(N)} = X^{(n')}$.
Having examined the proof of Lemma 4, we can notice that the strong Feller property is required only for the continuity of $V(x)$. But the proof is still valid if one supposes that there exists a bounded continuous function $V(x)$ such that
$$AV(x) \le \Delta < 0, \quad x \notin C. \tag{40}$$
Therefore, using Lemma 3 and repeating the main arguments of the proof of Theorem 4, we arrive at a weaker version of the above statement.
Theorem 5: Let $X$ be an aperiodic Harris-recurrent, positive, weak Feller chain, and let there exist a continuous function $V(x)$ such that relation (40) holds. Then there exists a sequence of finite chains $X^{(N)}$ such that the relation (39) is still true.
Let us note the following two circumstances. First, for many queueing models (single- and multi-server, multi-phase and others), test functions $V(x)$ satisfying the conditions of Theorem 5 are known; see Kalashnikov [7, 8]. Second, the limiting relation (39) can be quantified just as was done for the continuity problem in Section 4.

Rarity and Exponentiality
Suppose now that a general Markov chain $X$ visits some subset $Q \subset \mathcal{X}$ infrequently. Then it can be expected that the passage time from some specific initial state (random, in general) to $Q$ is exponentially distributed to a good approximation. Similar problems have been investigated earlier, mainly by analytic tools; see Keilson [11]. We will now show how to solve them by probabilistic methods.
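The heuristic that passage times to a rarely visited set are nearly exponential can be checked exactly on a small example. The sketch below is an illustration only and does not reproduce the bounds derived in this section: it assumes a hypothetical chain with two ordinary states and an absorbing rare set, described by the matrix `TABOO` of transitions that avoid the rare set, and compares the exact tail of the passage time with an exponential tail having the same mean.

```python
import math


def survival(taboo, start, horizon):
    """Exact tail P(tau > n), n = 0..horizon, for the first visit to the rare set.

    `taboo` holds the one-step probabilities between states outside the rare
    set; the mass missing from each row is the probability of entering it.
    """
    size = len(taboo)
    dist = [1.0 if j == start else 0.0 for j in range(size)]
    tail = [1.0]
    for _ in range(horizon):
        dist = [sum(dist[i] * taboo[i][j] for i in range(size))
                for j in range(size)]
        tail.append(sum(dist))
    return tail


# Hypothetical chain: from state 0 stay with prob. 0.9 or move to state 1;
# from state 1 return to 0 with prob. 0.9 or enter the rare set with prob. 0.1.
TABOO = [[0.9, 0.1],
         [0.9, 0.0]]

TAIL = survival(TABOO, 0, 5000)
MEAN = sum(TAIL)  # E tau = sum over n >= 0 of P(tau > n), truncated at 5000
GAP = max(abs(s - math.exp(-n / MEAN)) for n, s in enumerate(TAIL))
```

In this example the uniform distance `GAP` between the two tails is below 0.02, and the mean passage time is 110 steps; making visits to the rare set less likely shrinks the gap further.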
Let us suppose that $X$ satisfies (1) with $m = 1$. It follows that, in this case, $X$ can be embedded into a classic sense regenerative process $(Z, S)$. But we would like to emphasize that the restriction $m = 1$ is not of exceptional importance and can be relaxed. Suppose, in addition, that $(Z, S)$ is a zero-delayed process, i.e. $S_0 = 0$, which implies that the initial distribution of $X$ coincides with $\nu(B)$, $B \in \mathcal{B}$. Formalize the supposition about "rare visits" to $Q$ as follows. Let $\Gamma_j$, $j \ge 0$, be a sequence of nested subsets of $\mathcal{X}$ such that $\Gamma_j \subset \Gamma_{j+1}$, $j \ge 0$, $\bigcup_j \Gamma_j = \mathcal{X}$.