Convergence Rates and Limit Theorems for the Dual Markov Branching Process

This paper studies aspects of the Siegmund dual of the Markov branching process. The principal results are optimal convergence rates of its transition function and limit theorems in the case that it is not positive recurrent. Additional discussion is given about specifications of the Markov branching process and its dual. The dualising Markov branching processes need not be regular or even conservative.

This process is the focus of the present paper. Definition 1 comes from [2], which also gives two equivalent definitions, one based on the $q$-matrix $P'(0)$ and one based on a so-called dual branching property ((22) below). We say a little about these definitions in Section 3. It is shown in [2] that this $q$-matrix is regular and hence that $P(t)$ is minimal and honest. That paper also gives criteria for recurrence and transience.
In particular the DMBP is positive recurrent if and only if the MBP is supercritical and the zero state is accessible, in which case the limiting-stationary law of the DMBP is geometric. A curious property is that $P(t)$ is strongly ergodic if and only if $F(t)$ is dishonest.
There are two principal contributions of the present paper. The first is a study of the convergence properties of the transition function $P(t)$. If the zero state is inaccessible for the MBP, then it is absorbing and accessible for the DMBP. In this case there is a limiting conditional law (Theorem 3). In the positive recurrent case, Theorem 4 gives optimal convergence rates for the variation distance between $P(t)$ and its limiting geometric law. A by-product is an endogenous proof of the curious property mentioned above. Theorem 5 gives the exact rate at which $p_{ij}(t) \to 0$ in the case that the MBP is subcritical, and Theorems 6 and 8 deal with the critical case. These results follow from convergence rate results about $F(t)$, the latter being fairly well known, at least when $F(t)$ is honest.
The second contribution is an account of limit theorems for the DMBP when S is not positive recurrent. Theorem 9 shows that if the MBP is critical or subcritical, then there is a family of constants $c_t \to \infty$ as $t \to \infty$ such that $(X_t/c_t)$ converges in law. The limit is a standard exponential law in the critical case and a finite mixture of Erlang laws in the subcritical case. In the subcritical case, convergence in law can be strengthened to almost sure convergence (Theorem 10). Theorems 11 and 12 are central limit analogues for this almost sure convergence. Theorems 9-11 are parallel to known results for the dual version of the simple branching process in [3]. The motivation for the model studied in that reference is the fact that the simple branching process in which immigration can occur is stochastically monotone; see (3) in [3]. Although it is not explicit in this reference, the model there is the Siegmund dual of a nonconservative or killed version of the simple branching process. This killed process is studied in [4]. The proof of almost sure convergence in Section 5 for the DMBP is quite different to that for the discrete-time version in [3]. That proof employs a general result about Markov chains for which no continuous-time analogue is known. We remark that the dual MBP with immigration will be discussed in another paper.
Section 2 is devoted to consolidating the scattered literature concerning construction of the MBP and uniqueness of solutions of its Kolmogorov differential equation systems. In addition, notation used in the sequel is established. In Section 3 we supplement the discussion of the definitions of the DMBP in [2]. In particular, Proposition 2 gives a very direct proof that $P(t)$ is a minimal transition function. This is in contrast to the observation in [2] that this result follows from a more specialised criterion in [5].
The presentation in this paper stresses the relation with branching processes, whereas that in [2] is closer to the analytical approach used more commonly in literature on Markov construction theory.Our approach results in a more intrinsic development and, we believe, the various results have greater intuitive appeal when expressed in terms used in the branching process literature.

Defining and Constructing the MBP
Perusing the standard monograph accounts of branching processes reveals a variety of definitions of the MBP, all of which are motivated by the idea of the MBP as a model of a population of reproducing individuals. The branching property (1) expresses the notion of independence of separate lines of descent, and we begin by summarizing its consequences. Define the probability generating functions $F_i(s,t) = \sum_{j \ge 0} f_{ij}(t) s^j$, and let $F(s,t) = F_1(s,t)$. The branching property is equivalent to the relation $F_i(s,t) = (F(s,t))^i$. This identity implies that state 0 is absorbing. If $0 < s < 1$ then $f_{ij}(t) \le s^{-j}(F(s,t))^i \to 0$ as $i \to \infty$; that is, $F(t)$ is Feller. This property has the important consequence that all states are stable, meaning that the jump rates $q_{ij} := f'_{ij}(0)$ are all finite; see p. 43 in [6] for the definition and its consequences. Let $\rho = -q_{11}$, assumed positive, and let $p_j = q_{1j}/\rho$ for $j \ne 1$, so that $q_{ij} = i\rho\,p_{j-i+1}$ for $j \ge i-1$, $j \ne i$, $q_{ii} = -i\rho$, and $q_{ij} = 0$ otherwise (4). Assume $p_0 < 1$ and $0 \le \sum_{j \ge 0} p_j = 1 - \delta$, where $0 \le \delta \le 1$, and $\delta = 0$ if and only if Q is conservative. Let $f(s) = \sum_{j \ge 0} p_j s^j$ and $u(s) = \rho(f(s) - s)$. We need the following notation for the sequel.
Often the form (4) of the $q$-matrix is chosen to model a population process where individuals live and reproduce independently with $\exp(\rho)$ lifetime law and, at the time of death, they are replaced by $j$ individuals with probability $p_j$ ($j = 0, 2, \ldots$) or by infinitely many with probability $\delta$. Thus $m$ is the mean per capita number of offspring, and we delineate the subcritical, critical, and supercritical cases, according to $m < 1$, $m = 1$, or $m > 1$, respectively. Then $Z_t$ is the population size at time $t$. If $\delta > 0$ then the state-space is extended to S ∪ {∞}, where ∞ is an absorbing state. In this case sample paths either hit the zero state or hit ∞ via a single infinite jump at the first occurrence of an infinite litter. In all cases, the former occurs with probability $q^i$ if $Z_0 = i$.
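The classification by the offspring mean can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the offspring law is finite and conservative (no infinite litters), the extinction probability is taken to be the smallest root in $[0,1]$ of $f(s) = s$ (the standard characterization), and the function names are hypothetical rather than taken from the paper.

```python
def offspring_mean(p):
    """Mean number of offspring m = sum_j j * p_j for a law given as {j: p_j}."""
    return sum(j * pj for j, pj in p.items())


def classify(p, tol=1e-12):
    """Return 'subcritical', 'critical' or 'supercritical' according to m."""
    m = offspring_mean(p)
    if abs(m - 1.0) <= tol:
        return "critical"
    return "subcritical" if m < 1.0 else "supercritical"


def extinction_probability(p, max_iter=100_000, tol=1e-14):
    """Smallest root of f(s) = s in [0, 1], found by iterating s -> f(s) from 0.

    Assumption: conservative offspring law (the p_j sum to 1).
    """
    def f(s):
        return sum(pj * s ** j for j, pj in p.items())

    s = 0.0
    for _ in range(max_iter):
        nxt = f(s)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s


# Example laws (illustrative values only).
supercritical_law = {0: 0.25, 2: 0.75}   # m = 1.5
subcritical_law = {0: 0.75, 2: 0.25}     # m = 0.5
```

For the first law $f(s) = 0.25 + 0.75 s^2$, whose fixed points are $1/3$ and $1$, so the iteration settles at $1/3$; for the second law the only fixed point in $[0,1]$ is $1$.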
The population image is captured by the minimal transition function corresponding to Q. This satisfies the backward and the forward Kolmogorov equations. General theory shows that $F(t)$ is minimal because it possesses the Feller property; see p. 81 in [6]. We show this below without appealing to the Feller property. The forward equation system can be wrapped up into the single linear first-order partial differential equation $\partial \Phi_i(s,t)/\partial t = u(s)\,\partial \Phi_i(s,t)/\partial s$ (5), where $\Phi_i(s,t)$ ($i = 1, 2, \ldots$) represents the probability generating functions for any Q-function. This equation has a unique solution in the class of functions which are holomorphic in the open unit disc; see p. 119 in [7]. The proof in this reference is based on a general uniqueness theorem for linear partial differential equations and although it assumes that Q is conservative, the proof is valid in general. Alternatively, a unique solution of the forward system within the class of transition functions follows from Reuter's criterion that such uniqueness is equivalent to the assertion that the only sequence $\{\nu_j : j \in S\} \in \ell_1^+$ such that $\sum_{i \in S} \nu_i q_{ij} = \lambda \nu_j$ ($j \in S$) for some (and hence all) $\lambda > 0$ is $\nu_j \equiv 0$. The explicit form of this equation is $(\lambda + j\rho)\nu_j = \rho \sum_{i=1}^{j+1} i\,p_{j-i+1}\nu_i$ ($j \ge 0$) (7). If $p_0 = 0$ then setting $j = 0$ shows that $\nu_0 = 0$, so by recursion, $\nu_j \equiv 0$. If $p_0 > 0$ it is clear that, given a value of $\nu_0$, this system can be solved recursively for $\nu_1, \nu_2, \ldots$. So a nonnegative solution certainly exists. Estimates showing that $\sum_j \nu_j = \infty$ are derived in [8].
The same end is achieved in [6] (p. 114) using a generating function approach. The proof there rather obscures the fact that (7) does have a solution. A tidier version is to formally define the generating function $N(s) = \sum_{j \ge 0} \nu_j s^j$ and observe that (7) yields the differential equation $u(s) N'(s) = \lambda N(s)$ with solution $N(s) = \nu_0 \exp\left(\lambda \int_0^s dv/u(v)\right)$.
If $p_0 > 0$ and $s < q$ then the integrand has a power series expansion with nonnegative coefficients. It follows that $N(s)$ is holomorphic in the disc $|s| < q$ and, hence, if $\nu_0 > 0$, its coefficients solve (7). However, $N(q-) = \infty$, so $\{\nu_j\} \notin \ell_1^+$. Consequently, the minimal Q-function is the only transition function solution of the forward system. But if (5) holds with $i = 1$, then it is solved for general $i$ by $(\Phi_1(s,t))^i$. It follows immediately from Harris' version of the uniqueness theorem that this can be the only solution holomorphic in the unit disc, and hence the minimal Q-function has the branching property.
It does not seem possible to so directly draw this conclusion from the uniqueness given by Reuter's criterion. However, the branching property implies that the Chapman-Kolmogorov equations for $F(t)$ can be rendered as the functional equation $F(s, t + \tau) = F(F(s,\tau), t)$ and hence that $\partial F(s, t+\tau)/\partial \tau = \left(\partial F(s,\tau)/\partial \tau\right) F'(F(s,\tau), t)$, where the prime denotes partial differentiation with respect to $s$. Setting $\tau = 0$, it follows from (4) that the first factor on the right-hand side is $u(s)$, and hence $F(s,t)$ solves (5) with $s < 1$. It follows that $F(t)$ solves the forward system, and we conclude that $F(t)$ is the minimal transition function.
The backward system is uniquely satisfied by $F(t)$ if it is honest. If Q is conservative and $F(t)$ is dishonest, then the backward system has uncountably many transition function solutions. The single entrance solutions are investigated in [9]. Of course, none of these has the branching property.
Surprisingly, if Q is not conservative then the backward system is uniquely solved by the minimal transition function. This follows from another Reuter criterion which asserts that uniqueness within the class of Q-functions is equivalent to the condition that the only solution $\{x(i) : i \in S\} \in \ell_\infty^+$ of the system $\sum_{j \in S} q_{ij} x(j) = \lambda x(i)$ ($i \in S$) for some (and hence all) $\lambda > 0$ is $x(i) \equiv 0$. This criterion is shown to hold in [8] (pp. 224-226). The following proof is simpler, though preserving the spirit of [8].
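So, assume that there is a nontrivial solution $\{x(i)\}$, as above. Note first that $x(0) = 0$ because $q_{0j} \equiv 0$. If $i \ge 1$, then the explicit form of the above system can be written as $(\lambda + i\rho)\,x(i) = i\rho \sum_{j \ge i-1} p_{j-i+1}\,x(j)$. Suppose there exists $i^* \ge 1$ such that $x(i^*) \ge x(i)$ for all $i$.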
Although $F(t)$ is the unique solution of the backward and the forward systems if Q is not conservative, it is not the unique Q-function. Uniqueness in this case is equivalent to two conditions. The first is the above Reuter condition for uniqueness of the forward system, and the second is that, for any $t > 0$, there is a constant $\eta_t > 0$ such that the inequality $\sum_{j \in S} f_{ij}(t) \ge \eta_t$ holds for all $i$; see p. 150 in [6]. However, the sum equals $(a(t))^i$ where we define $a(t) = F(1,t)$, and since $a(t) < 1$, we see that the left-hand side tends to zero as $i \to \infty$. Thus the criterion for uniqueness is violated, meaning there exists a Q-function which is not a solution of either the backward or forward systems.
The MBP is defined in [7] to be a Markov process whose transition function solves the forward system obtained from Q. The branching property is derived from this (Harris) definition. The MBP definition we adopt in Section 1 is given in [10] but not used there. Instead, (4) is motivated from population growth considerations and then used in a description of the Feller construction of the minimal process. The specification in [11] proceeds from a construction involving Galton-Watson trees with randomized split times. Also described in [11] is the Athreya-Karlin construction [12] which, in essence, is the Markov process whose jump chain is a left-continuous random walk and holding times in state $i$ have the exponential $\exp(i\rho)$ law. This construction is described too in [10]. The monograph [8], on Markov construction theory, uses our definition. This is the only treatment which, to our knowledge, gives any attention to the nonconservative case. Finally, we mention that, in [13] (see Section 3 there), the author clearly distinguishes between what he calls the probabilistic definition, that is, a Markov process having the branching property, and the analytic definition, that is, the above Harris definition. The proof in [13] of the equivalence of these definitions uses facts already mentioned that the branching property implies the Feller property, and this implies minimality.
The backward system for the minimal Q-function is expressed in terms of the probability generating function $F(s,t)$ as $\partial F(s,t)/\partial t = u(F(s,t))$ ($0 \le s \le 1$), or as in its integrated form. The total mass function This implies the known criterion that $F(t)$ is honest if and only if where  <   < 1. This appears in [7] where it is attributed to E. B. Dynkin. See the introduction of [14] for remarks on these attributions. Note that  < ∞ only if $m > 1$ and certainly if Q is not conservative. It follows from (16) that if  < ∞ then $q < a(t) < 1$, and $a(t) \downarrow q$ as $t \uparrow \infty$. Suppose that Q is not regular. It is not at first apparent why the constructions of transition functions in [9] need Q to be conservative. The transition functions considered in that paper (see (2.11) there) have the form of the sum of $f_{ij}(t)$ plus a convolution integral. A computation shows that the derivative of this added term is proportional to $a'(0)$. Since $a(0) = 1$, we see from (14) with $s = 1$ that $a'(0) = u(1) = -\rho\delta$, and hence the transition functions constructed in [9] are Q-functions only if Q is conservative. On the other hand, the limiting conditional theorem in [9] (p. 743) holds for the nonconservative MBP. Finally, we mention that several of the results in [15] for the nonconservative linear birth and death process carry over to the MBP.
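The backward equation in its generating-function form, $\partial F/\partial t = u(F)$ with $F(s,0) = s$, lends itself to a quick numerical check. The Python sketch below is an illustration only: it integrates the equation with a classical Runge-Kutta scheme, compares the result with the closed form for pure binary splitting ($p_2 = 1$, the Yule process, whose law at time $t$ is geometric with parameter $e^{-\rho t}$), and checks the functional equation $F(s, t+\tau) = F(F(s,\tau), t)$. The function names, step count, and parameter values are assumptions made for the example.

```python
import math


def u(s, rho, p):
    """u(s) = rho * (f(s) - s) for an offspring law given as {j: p_j}."""
    f = sum(pj * s ** j for j, pj in p.items())
    return rho * (f - s)


def F(s, t, rho, p, steps=2000):
    """Approximate F(s, t) by solving dF/dt = u(F), F(s, 0) = s, with RK4."""
    h = t / steps
    y = s
    for _ in range(steps):
        k1 = u(y, rho, p)
        k2 = u(y + 0.5 * h * k1, rho, p)
        k3 = u(y + 0.5 * h * k2, rho, p)
        k4 = u(y + h * k3, rho, p)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def yule_pgf(s, t, rho):
    """Closed form of F(s, t) when p_2 = 1 (pure binary splitting)."""
    e = math.exp(-rho * t)
    return s * e / (1.0 - s * (1.0 - e))


# Illustrative parameter values.
rho, s, t, tau = 1.5, 0.4, 0.7, 0.3
yule_law = {2: 1.0}
mixed_law = {0: 0.3, 2: 0.7}

numeric = F(s, t, rho, yule_law)
exact = yule_pgf(s, t, rho)
lhs = F(s, t + tau, rho, mixed_law)
rhs = F(F(s, tau, rho, mixed_law), t, rho, mixed_law)
```

Because the equation is autonomous, the functional equation is just the flow property of the ordinary differential equation, so `lhs` and `rhs` agree up to discretization error.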

On Definitions of the Dual MBP
As we mentioned in the introduction, three equivalent definitions of the DMBP are presented in [2]. Definition (i) is just Definition 1 above. Definition (ii) asserts that the DMBP is the Markov process whose $q$-matrix has the form where {  :  ≥ −1} is a sequence satisfying In particular   =  0 − ( 1 −  0 ). So  1 ≥  0 , and if equality holds, then  0 = 0. Hence we always assume that  1 −  0 > 0 to avoid a trivial situation. In this case the parameters {  } and {;   } are related through Note that lim →∞   = 0 if and only if Q is conservative. In addition  1 −  0 = , and the inverse relation is implying that  0 = 1 unless there is at least one strict inequality in the chain (19). Definition (iii) is that the DMBP is a Markov process whose transition function has the dual branching property and $p_{-1,0}(t) \equiv 1$. The proof in [2] is a mostly analytic demonstration that (ii) ⇒ (i) ⇒ (iii) ⇒ (ii). The following proof that (i) ⇔ (iii) supplements the treatment in that reference.
We note first that, in the case that Q is conservative, the paper [16] gives an analytic proof that $F(t)$ is stochastically monotone. A direct and completely general demonstration follows by expressing (1) as where the processes $\{(Z_{k,t}) : k = 1, 2, \ldots\}$ are independent copies of $(Z_t)$ started with $Z_0 = 1$. It follows that and since the summands are nonnegative it is evident that $F(t)$ is SM. This is valid whether Q is regular, not regular, or not even conservative. Hence Siegmund's theorem is applicable in each of these circumstances because, if $F(t)$ is dishonest, then the added boundary state ∞ is absorbing. Hence there is a Markov process $(X_t)$ satisfying Definition (i).
Next, define the generating function from which we obtain the fundamental identity Summing over all  yields This implies that $P_i(X_t < \infty) \equiv 1$; that is, $P(t)$ is always an honest transition function. We show below that it is a minimal transition function.
We now show that (1) implies the identity (22). If  ≥ 1 then (26) implies that Equating the coefficients of   on each side of this identity together with a little manipulation yields (22).
It follows that the $f_{ij}(t)$ we have constructed comprise a standard transition function $F(t)$ which possesses the branching property. Hence, as we have seen, it is a minimal transition function. Consequently there is a MBP whose transition function is that which we have constructed and which is linked to $P(t)$ via (18). But this identity is equivalent to (2), and hence Definitions (i) and (iii) are equivalent.
We have seen that the transition function $P(t)$ is always honest. A sufficient condition for it to be minimal is the Feller property $p_{\infty j}(t) := \lim_{i \to \infty} p_{ij}(t) = 0$ for all $j, t \ge 0$ (p. 43 in [6]). To check this property, note that (2) implies that $P(t)$ is SM, and this implies the existence of the limit $p_{\infty j}(t)$. This limit is evaluated by applying an Abelian theorem for power series to (18) to obtain and this is zero if and only if $a(t) \equiv 1$, that is, if and only if $F(t)$ is honest.
The fact that $P(t)$ is always minimal is a corollary of a simple general result which we state as follows.
Proposition 2. Suppose $F(t)$ is any SM transition function whose $q$-matrix elements $q_{ij}$ satisfy $q_{0j} \equiv 0$ and which admits a dual transition function $P(t)$ in the sense of (2).
(1) The dual $q$-matrix is specified by and it is conservative.
The jump chain of the MBP is a random walk which is skip-free to the left, and hence we can take  = 1 in Assertion (2) of Proposition 2. In addition, it follows from (33) and (4) that and hence (34) is satisfied. It follows that the DMBP transition function is honest and minimal. Some manipulation with (4) and (33) will show that Q has the form (19). At this point we mention two extreme cases. If $p_0 = 1$ then the generating MBP is the linear death process and it is easy to show from (18) that It follows that the DMBP is a linear birth process with an independent immigration component in which individuals arrive at the event times of a Poisson process having rate $\rho$.
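The first extreme case can be checked numerically. The Python sketch below is hedged in two ways. First, it assumes the Siegmund duality takes the shifted form $P_i(X_t \le j) = P_{j+1}(Z_t \ge i+1)$; this is an assumption made here for illustration and is not quoted from (2) or (18). Second, it assumes that both the per capita birth rate and the immigration rate of the dual equal $\rho$, in which case $X_t + 1$ is a Yule process started from $i + 1$. The function names are hypothetical.

```python
from math import comb, exp


def death_tail(start, level, rho, t):
    """P(Z_t >= level | Z_0 = start) for the linear death process.

    Each of the `start` individuals survives to time t independently with
    probability exp(-rho * t), so Z_t is binomial.
    """
    p = exp(-rho * t)
    return sum(
        comb(start, k) * p ** k * (1.0 - p) ** (start - k)
        for k in range(level, start + 1)
    )


def birth_immigration_cdf(i, j, rho, t):
    """P(X_t <= j | X_0 = i) for linear birth plus immigration, both at rate rho.

    Assumption: X_t + 1 is a Yule process from i + 1 ancestors, so X_t - i is
    negative binomial with parameters (i + 1, exp(-rho * t)).
    """
    p = exp(-rho * t)
    return sum(
        comb(i + n, n) * p ** (i + 1) * (1.0 - p) ** n
        for n in range(0, j - i + 1)
    )


def max_duality_gap(rho, t, size=6):
    """Largest discrepancy between the two sides of the assumed duality."""
    return max(
        abs(birth_immigration_cdf(i, j, rho, t) - death_tail(j + 1, i + 1, rho, t))
        for i in range(size)
        for j in range(size)
    )
```

Calling `max_duality_gap(0.8, 1.3)` compares the two sides over a small grid of states; under the stated assumptions the gap should be at the level of rounding error.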
If $\delta = 1$ and $Z_0 = i \ge 1$, then the sample path has a single jump to ∞ at a time which has the $\exp(i\rho)$ law. In terms of the population model, the first reproduction event results in an infinite number of offspring. The corresponding DMBP can be regarded as a uniform catastrophe process: if $X_t = i \ge 1$, then the next jump occurs after a time having the $\exp(i\rho)$ law, and it is into a state $j = 0, \ldots, i - 1$ with probability $i^{-1}$. The zero state is absorbing.
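The uniform catastrophe description translates directly into a simulation. The following Python sketch assumes the holding time in state $i$ is exponential with rate $i\rho$ and that each jump lands uniformly on $\{0, \ldots, i-1\}$; the function names are hypothetical. A first-step recursion for the mean absorption time is included as a deterministic companion to the simulation.

```python
import random


def simulate_catastrophe(i0, rho, rng):
    """Simulate one path until absorption at 0.

    Returns a list of (jump_time, state) pairs, starting with (0.0, i0).
    """
    t, state = 0.0, i0
    path = [(t, state)]
    while state > 0:
        t += rng.expovariate(state * rho)   # holding time with rate i * rho
        state = rng.randrange(state)        # uniform on {0, ..., state - 1}
        path.append((t, state))
    return path


def mean_absorption_time(i, rho):
    """Mean hitting time of 0 from state i via first-step analysis:

    T_k = 1 / (k * rho) + (T_0 + ... + T_{k-1}) / k, with T_0 = 0.
    """
    T = [0.0]
    for k in range(1, i + 1):
        T.append(1.0 / (k * rho) + sum(T) / k)
    return T[i]


rng = random.Random(0)
path = simulate_catastrophe(10, 1.0, rng)
```

Under these assumed rates the recursion gives $1/\rho$ for every starting state $i \ge 1$, as can be checked by induction.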
Let $T$ be the hitting time of 0 by $(X_t)$. Since 0 is accessible from all $i \ge 1$, it follows that $P_i(T < \infty) = 1$.
Further calculation shows that the speed of convergence in (41) is characterized by Next we let $p_0 > 0$ and determine the rate at which the variation distance between $P(t)$ and the geometric limiting-stationary law approaches zero. This is equivalent to estimating the speed at which $V_i(t) := \sum_{j \ge 0} |p_{ij}(t) - \pi_j| \to 0$. A more stringent measure of the speed of convergence is to require that this limit relation holds uniformly in $i$, that is, that $V(t) := \sup_{i \ge 0} V_i(t) \to 0$. If this holds then $P(t)$ is said to be strongly ergodic. As observed in [2], $P(t)$ is strongly ergodic if and only if $F(t)$ is dishonest. This follows from Theorem 2.2 in [18] asserting that an ergodic and SM transition function is strongly ergodic if and only if it is not Feller, and we have seen above that $P(t)$ is Feller if and only if $F(t)$ is honest.
The following result gives the rate of convergence to zero of $V_i(t)$ and $V(t)$. As a by-product, it gives a proof of the result just discussed without using Theorem 2.2 in [18]. Theorem 4. Suppose $p_0 > 0$ and $m > 1$. (a) For all $i \ge 0$, V  () = ( − ).
Proof. We obtain exact element-wise convergence rates as follows. It follows from (26) that (46) This implies that Note that ℓ 00 > 0.
We now consider the case $m \le 1$, in which case $p_0 > 0$. So setting  = 0 in (26) yields  0 () = ∑ >   () > 0 for all , and hence 0 is accessible from N. In addition  ,+1 > 0 for all , and hence S is irreducible. Taking  = 0 in (26) yields  00 () = 1 − () → 0, since () → 1, so S is not positive recurrent. It follows that S is transient if the mean time to extinction of the MBP   = ∫ ∞ 0 (1 − ()) < ∞ and null-recurrent if   = ∞. Let  = () in the integral and observe that (14) where  = (1 − ) denotes the Malthusian parameter of the MBP. This result shows that the uniform measure on S is -invariant for $P(t)$. See Section 5.2 in [6] for this notion. If $m < 1$ then  > 0, and it is known that where () is slowly varying at infinity and 0 < (∞) < ∞ if ∑ ≥1    log  < ∞, and (∞) = 0 otherwise. In addition where M() is a nondefective probability generating function and   > 0 for all  ≥ 1. See p. 121 in [11] for these facts. These   's comprise a -invariant measure for $F(t)$ (restricted to N), which may be expressed in terms of generating functions as The generating function of the tail masses Theorem 5. Let $m < 1$. Then {  } is a -invariant function for $P(t)$ and, as $t \to \infty$, Proof. Using (26) we compute by virtue of (26) with  = 0 and (64), and the -invariance assertion follows. The asymptotic relation follows because (26) and (63) imply that lim →∞   (, ) and the extended continuity theorem then yields the assertion.
Turning to the case $m = 1$, we have the following result similar to Theorem 5.
This is quite well known, and it can be shown easily as follows.
Then (, ) = ( + ) = ((), ), and hence the limit (69) is since We can obtain a second-order correction to this result by using the following further facts about the MBP. Recall that $u(s) = \rho(f(s) - s)$, and observe that, since $(1 - f(s))/(1 - s)$ is a probability generating function if $m = 1$, the function $1/(f(s) - s)$ has a Maclaurin expansion with nonnegative coefficients. Hence the function Setting  = 0 shows that () is the inverse function of  and that (, ) = ( + ()). It follows too from (71) that $\sum_{i \ge 1} V_i f_{ij}(t) = V_j$; that is, $\{V_j : j \ge 1\}$ is an invariant measure for $F(t)$ restricted to N. It is known to be unique up to multiplication by constants. The following proposition expresses in terms of () the precise rate at which (, ) − () tends to zero.
If at least one extreme side of this inequality is bounded away from () as  → ∞ through some sequence, then (73) will be violated.It follows that the limit as  → ∞ of each bound exists and equals ().Setting  = (1), then since ((1), ) = ( + 1), we conclude that lim and the assertion follows.
Let  0 = 0 and Proof. The generating function of the term in the numerator is Dividing by ( + 1) − () and letting  → ∞, it follows from Proposition 7 that the first term in the right-hand side numerator contributes the limit ()/(1−). The second term in that numerator is asymptotically equal to (1 − (, )) 2 .

Limit Theorems
In this section we assume that $m \le 1$ and, if $m = 1$, then we shall understand (63) to hold with M() ≡ 0. Our first result shows that   fl (1 − ())  converges in law. We denote this mode of convergence by Assertion (a) follows from the extended continuity theorem for probability generating functions and $(1 + \theta)^{-1}$ is the Laplace-Stieltjes transform of the standard exponential law.
If  < 1 then the right-hand side of (81) can be expanded in the form But where   is the one-step transition probability of the Markov chain which is the (modified) Siegmund dual of the simple branching process whose offspring probability generating function is M().This offspring law is supercritical, and, indeed, since M(0) = 0, it follows that   = 0 if  >  ≥ 0. See [3] for properties of this dual.We conclude from (81) that This completes the proof.
We show next that the mode of convergence in Theorem 9(b) can be strengthened to almost sure convergence. Our proof exploits the upward skip-free nature of DMBP sample paths. It follows quite closely the progression through Theorems 2.2 to 2.4 in [19], which in turn is an application of basic methodology developed in [20].
It is shown in [20] that this system has the stated solution and it is unique up to constant factors. Indeed, (0; ) ≡ 1, and starting with  = 0, 1, . .., the system can be solved recursively for (1; ), (2; ), . ... Formally define the generating function () = ∑ ≥0 (; )  . Recalling that $u(s) = \rho(f(s) - s)$, it follows from (4) and (33) that which together with (87) yields the differential equation Integration yields the solution where, we recall, () = ∫  0 /(). The right-hand side has a Maclaurin expansion whose coefficients are positive and which satisfies system (87). The function M() defined at (63) is related to () by where () is slowly varying at infinity; see p. 122 in [11] for the right-hand side representation. It follows that Since   is almost surely nondecreasing in , then () is nondecreasing and hence a Tauberian theorem for power series yields the asymptotic equivalence in other words, as  → ∞ we have the asymptotic equivalence The right-hand side is the (bilateral) Laplace-Stieltjes transform of a random variable $S$ which has the Gumbel type distribution function $\exp(-e^{-x})$. We have thus shown that   − () converges in law to $S$. But the left-hand side is a sum of independent random variables, so it follows from the equivalence theorem for convergent random series of independent summands that   − () $\xrightarrow{\text{a.s.}} S$.
Recalling that   = ∑  =+1   , it follows that,   -almost surely, where  =   +   and   and   are independent. Hence the Laplace-Stieltjes transform of the limit is Next, let is a positive-valued Lévy martingale, and   $\xrightarrow{\text{a.s.}}$  −  . On the other hand, where we have used the strong Markov property for the second equality, (97) and the dominated convergence theorem for the penultimate equality and then (98). Letting  = 1 in this result, (93) and (98) show that if  → ∞, then almost surely that is, −(  ) →   ,   -almost surely. This conclusion can be expressed as the asymptotic equivalence (  ) ∼  −  . Since () is regularly varying with index  −1 , its inverse () is regularly varying with index , and hence The assertion follows after some algebra to check that the definitions of , , and  imply that (  ) = 1/(1 − ()).
Assuming this, it follows that Consequently, the norming in Theorem 9 has the asymptotic form 1 − () ∼  −− .In addition, () =  −1 (log  − ) + (1) as  → ∞, and hence the norming function emerging in the final stages of the previous proof satisfies (  ) ∼  + .These equivalences show that the limit laws in Theorems 9 and 10 agree, and we have the identification  =  −   almost surely.We will state and prove two results concerning the rate at which   converges to .First we recollect some known results concerning M() defined by (63).If we let  = 0 in (64), then differentiating with respect to  and recalling (14) yield Integration leads to the identity It follows that the mean of the limiting conditional law for the MBP is  =   .We will assume that   (1) is finite, in which case differentiating (105) leading to as asserted.
and we see that the right-hand side tends to zero if and only if $a(t) < 1$, that is, if and only if  < ∞. Hence $P(t)$ is not strongly ergodic if  = ∞. ≥0        () −        =  ( − ) .
implies that  = /(() − ). It follows that   = / where It is well known that there are examples where  = ∞ and  < ∞. More detail about the asymptotic behaviour of $P(t)$ can be extracted from known results about $F(t)$. A clue as to what can be expected is exposed by letting  → 1 in (26). This yields and hence S is transient if  < ∞ and it is null-recurrent if  = ∞. This is equivalent to Theorem 3.2 in [2]. Since $f(s) - s = (1 - m)(1 - s)(1 + o(1))$ as $s \to 1-$, it is obvious that  is finite if $m < 1$; that is, S is transient. The typical case for $m = 1$ is that the variance 2 fl   (1) is finite. In this case () −  ∼ (1 − ) 2 , and we see that  = ∞; that is, S is null-recurrent.