Success Run Waiting Times and Fuss-Catalan Numbers

We present power series expressions for all the roots of the auxiliary equation of the recurrence relation for the distribution of the waiting time for the first run of k consecutive successes in a sequence of independent Bernoulli trials, that is, the geometric distribution of order k. We show that the series coefficients are Fuss-Catalan numbers and write the roots in terms of the generating function of the Fuss-Catalan numbers. Our main result is a new exact expression for the distribution, which is more concise than previously published formulas. Our work extends the analysis by Feller, who gave asymptotic results. We obtain quantitative improvements of the error estimates obtained by Feller.


Introduction
For a sequence of independent Bernoulli trials with probability p of success, consider the waiting time for the first run of k consecutive successes. This waiting time is said to have the geometric distribution of order k. See, for example, the texts by Balakrishnan and Koutras [1] and by Johnson et al. [2]. An expression for the probability mass function of the waiting time was derived by Philippou et al. [3] in terms of multinomial sums, following earlier work by Philippou and Muwafi [4]. By a clever counting argument, Burr and Cane [5] obtained an expression for the probability that the waiting time exceeds n as a sum of O(n^2/k) terms involving products of two binomial coefficients (rederived by Godbole [6] using a different method). A somewhat different expression for the same tail probability, involving multinomial coefficients, was obtained by Philippou and Makri [7]. Perhaps the simplest expression to date is the one by Muselli [8, eq. 16], which requires a sum over 1 + ⌊(n + 1)/(k + 1)⌋ binomial coefficients.
In his classic text [9, pp. 322-326], Feller took a different approach to the distribution of the waiting time by setting up a suitable recurrence relation. The auxiliary equation has degree k. Feller showed that the equation has a unique largest (in absolute value) "principal root," which is real and positive, and hence obtained an asymptotic formula for the distribution in terms of the principal root. (More precisely, we work with a complex variable (see Section 2), whereas Feller employed its reciprocal; hence in Feller's formalism the principal root has the smallest absolute value.) Feller also bounded the error from neglecting the contribution from the other roots and formulated an iteration scheme for approximating the principal root numerically. Concerning the formula for the exact solution given by the recurrence relation method, Feller commented that it was "primarily of theoretical interest" because "the labor involved in computing all the roots is usually prohibitive" [9, p. 276]. Feller's work dates from 1968, and computing power for numerical analysis has increased greatly since then. Note, however, that our derivation below is purely analytical.
We extend Feller's analysis [9] of the recurrence relation for the waiting-time distribution by finding power series expressions for all the roots of the auxiliary equation. We show that the series coefficients are Fuss-Catalan numbers (see the text by Graham et al. [10]) and that the roots are given by suitable values of the generating function of the Fuss-Catalan numbers. This permits the roots to be written in terms of known "elementary functions." This leads to our main result (21), which is a new exact expression for the probability that the waiting time exceeds n. This formula differs from the other results mentioned above in the important respect that it has only k summands (independently of n). We also obtain quantitative improvements of the error estimates obtained by Feller and derive numerous properties of the roots of the auxiliary equation. For example, in many results below the value p = k/(k + 1) is a special case, and the ranges 0 < p < k/(k + 1) and k/(k + 1) < p < 1 require separate treatments.
Journal of Probability and Statistics
Our analysis below focuses only on the original problem treated by Feller. More recent papers study variants of the problem; for example, Eryilmaz [11] treats the geometric distribution of order k with a reward, where each time a success occurs a random reward is received, while Shmerling [12] studies a generalization of the geometric distribution of order k for Markov processes.

Auxiliary Equation
We present our basic notation and definitions below. Write f(n) for the probability mass function of the waiting time. It satisfies the recurrence relation, for n > k,
\[ f(n) = q f(n-1) + q p f(n-2) + \cdots + q p^{k-1} f(n-k). \tag{1} \]
Here and below we define q = 1 − p. A derivation of (1) was given by Barry and Lo Bello [13]. We employ the initial conditions f(n) = 0 for n = 1, ..., k − 1 and f(k) = p^k. Next we define the auxiliary polynomial
\[ A_k(z) = z^k - q z^{k-1} - q p z^{k-2} - \cdots - q p^{k-1}. \tag{2} \]
The auxiliary equation is A_k(z) = 0. We shall drop the subscript "k" unless necessary. To establish contact with Feller's formalism [9], note that he derived the following expression for the probability generating function [9, eq. (7.6)]:
\[ P(s) = \frac{p^k s^k (1 - p s)}{1 - s + q p^k s^{k+1}} = \frac{p^k s^k}{1 - q s (1 + p s + \cdots + p^{k-1} s^{k-1})}. \tag{3} \]
Feller then considered the roots of the polynomial in the denominator of (3). Setting s = 1/z, this is equivalent to our auxiliary polynomial (2):
\[ 1 - q s (1 + p s + \cdots + p^{k-1} s^{k-1}) = z^{-k} A(z), \qquad s = 1/z. \tag{4} \]
We shall employ z and (2) below, bearing in mind throughout that Feller worked with s = 1/z. Feller [9] proved that the roots of the auxiliary equation are distinct. We denote the roots by λ_j(p, k), j = 0, 1, ..., k − 1. Unless required, we shall omit the arguments p and k. It is useful to multiply A(z) by (z − p) to obtain the polynomial
\[ B(z) = (z - p) A(z) = z^{k+1} - z^k + q p^k. \tag{5} \]
It was shown by Feller [9] that there is exactly one real root in the interval (0, 1). This real positive root will feature sufficiently prominently in our analysis below that we designate it by the symbol λ and call it the "principal root." The other roots will be termed "secondary roots." We define λ_0 = λ. For brevity in various calculations below, we also define p_* = k/(k + 1) and q_* = 1/(k + 1).
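The recurrence and initial conditions described above translate directly into a short computation. The following Python sketch assumes the standard order-k form of the recurrence, f(n) = q f(n − 1) + q p f(n − 2) + ... + q p^(k − 1) f(n − k) for n > k, with f(n) = 0 for n < k and f(k) = p^k. The function name `waiting_time_pmf` is illustrative and is not taken from the paper.

```python
from fractions import Fraction


def waiting_time_pmf(p, k, n_max):
    """Return [f(0), ..., f(n_max)] for the first run of k successes.

    Assumed recurrence (order k): f(n) = sum_{j=1..k} q * p**(j-1) * f(n-j)
    for n > k, with f(n) = 0 for n < k and f(k) = p**k.
    """
    q = 1 - p
    f = [0 * p] * (n_max + 1)          # zero of the same numeric type as p
    if n_max >= k:
        f[k] = p ** k                  # k successes in the first k trials
    for n in range(k + 1, n_max + 1):
        f[n] = sum(q * p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    return f


# Exact arithmetic for a fair coin and runs of length 2.
pmf = waiting_time_pmf(Fraction(1, 2), 2, 6)
```

For a fair coin and k = 2, the values f(2), ..., f(6) are 1/4, 1/8, 1/8, 3/32, 5/64; written over the denominator 2^n, the numerators 1, 1, 2, 3, 5 follow the Fibonacci sequence. Passing a `Fraction` keeps the arithmetic exact, while a float works equally well for large n.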

Main Results
Theorem 1 (roots of auxiliary equation). For ≥ 1, let For ∈ (0, 1), the secondary roots are given by The principal root is given by The derivation of the above expressions will be given in Section 5. Feller [9] proved that there is a unique real positive root and that its magnitude exceeds that of all the other roots.
The numbers are well defined provided ν + ≠ 0. Hence the coefficients defined in (6) are The generating function of the Fuss-Catalan numbers is ν ( ) and [10, p. 363] Hence for all 0 < p < 1, the secondary roots are given by For k/(k + 1) < p < 1, the above expression also applies to j = 0. For 0 < p < k/(k + 1), note that Hence for 0 < p < k/(k + 1), This establishes the connection of the roots of the auxiliary equation to the generating functions of the Fuss-Catalan numbers.
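The connection between the roots and the Fuss-Catalan numbers can be checked numerically. The sketch below rests on two assumptions that are not taken verbatim from the paper: the Fuss-Catalan numbers are taken in the form A_m(ν, r) = r/(mν + r) · binom(mν + r, m) used by Graham et al. [10], and the root equation is rearranged as z^k (1 − z) = p^k q, so that u = 1 − z satisfies u = 1 + x u^ν with ν = −1/k and x = −w, where w = p q^(1/k) e^(2πi·idx/k). The paper's own normalization of the coefficients may differ, and the names `fuss_catalan` and `root_from_series` are hypothetical.

```python
import cmath


def fuss_catalan(m, nu, r):
    """A_m(nu, r) = r / (m*nu + r) * binom(m*nu + r, m) for real nu and r."""
    a = m * nu + r                     # must be nonzero for A_m to be defined
    binom = 1.0
    for i in range(m):                 # generalized binomial coefficient binom(a, m)
        binom *= (a - i) / (i + 1)
    return r / a * binom


def root_from_series(p, k, idx, terms=200):
    """Partial sum of z = w * sum_m A_m(-1/k, -1/k) * (-w)**m (assumed form)."""
    q = 1.0 - p
    w = p * q ** (1.0 / k) * cmath.exp(2j * cmath.pi * idx / k)
    x = -w
    total, power = 0.0, 1.0
    for m in range(terms):
        total += fuss_catalan(m, -1.0 / k, -1.0 / k) * power
        power *= x                     # power now holds x**(m + 1)
    return w * total


# Secondary root for k = 2, p = 0.3 (idx = 1 selects the root of unity -1).
z_secondary = root_from_series(0.3, 2, 1)
```

For k = 2 the auxiliary equation is a quadratic, so the result can be compared with the closed-form roots. With idx = 0 the same sum converges to the principal root when p > k/(k + 1), and to p itself (the extra root of B, not a root of A) when p < k/(k + 1), in line with the separate treatment of the two ranges described above.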
Theorem 3 (probability mass function). For > , the probability mass function is given by The proof will be given in Section 6.

Corollary 4 (asymptotic solution). For sufficiently large n, the contribution to the probability mass function is dominated by the principal root, which yields the asymptotic expression. The expression for p ≠ k/(k + 1) was derived by Feller [9]. We formulate the notion of "asymptotic" more precisely as follows. For fixed ε > 0, we demand that the magnitude of the contribution to the probability mass function in (15) from all the secondary roots is less than ε times the contribution from the principal root. This is achieved for all n at or above the threshold given in (17) and (18).
The derivations of the expressions for ( , ), , and will be given in Section 7.

Properties of Roots
Section 4.1 presents results which are essential to prove the main results of our paper in Sections 5, 6, and 7. Section 4.2 contains additional results, which can be omitted by the reader who wishes to proceed directly to Section 5. We rewrite the equation B(z) = 0 in the form
\[ z^k (1 - z) = p^k (1 - p). \tag{24} \]
We assume p ∈ (0, 1) below; expressions where p = 0 or 1 will be indicated as appropriate.

Properties of Roots I
Remark 7. Feller [9] proved that all the roots of the auxiliary equation are distinct. We include a summarized proof for completeness (see also Barry and Lo Bello [13]). Recall that B(z) has all the roots of A(z) and an extra root z = p. Now
\[ B'(z) = (k+1) z^k - k z^{k-1} = z^{k-1} \bigl[ (k+1) z - k \bigr]. \]
Then B'(z_*) = 0 when z_* = 0, which is not a root of B(z), or else z_* = k/(k + 1). But z_* = k/(k + 1) is a root of B(z) if and only if p = k/(k + 1); that is, z_* = p = k/(k + 1). So B(z) can have a repeated root (of order 2) only when p = k/(k + 1). The repeated roots are at z_* = p and we know that one of those two roots is not a root of A(z). Hence A(z) has no repeated roots.
Proof. It was proved by Feller [9], who employed s = 1/z, that the auxiliary equation has a unique positive real root. In Feller's analysis, the root had a magnitude larger than unity, so in our case the root lies in (0, 1). The mutually exclusive statements (i), (ii), and (iii) follow immediately from an examination of the level sets of z^k (1 − z) for z ∈ (0, 1), bearing in mind that (24) has a repeated root of order 2 when p = k/(k + 1).
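The level-set argument can be turned into a simple numerical procedure. The following Python sketch assumes that the root equation has the form z^k (1 − z) = p^k (1 − p), whose left side increases on (0, k/(k + 1)) and decreases on (k/(k + 1), 1). Under that assumption the equation has two solutions in (0, 1), namely p and the principal root, lying on opposite sides of k/(k + 1), so bisection on the side not containing p isolates the principal root. The function name `principal_root` and the tolerance are illustrative choices.

```python
import math


def principal_root(p, k, tol=1e-14):
    """Bisection for the solution of z**k * (1 - z) = p**k * (1 - p) other than z = p."""
    z_star = k / (k + 1.0)             # location of the maximum of z**k * (1 - z)
    if math.isclose(p, z_star, rel_tol=1e-9):
        return z_star                  # repeated root: the principal root equals p
    target = p ** k * (1.0 - p)

    def g(z):
        return z ** k * (1.0 - z) - target

    # Search on the side of z_star that does not contain p.
    lo, hi = (z_star, 1.0) if p < z_star else (0.0, z_star)
    sign_lo = g(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


root_fair_coin = principal_root(0.5, 2)
```

For k = 2 and p = 1/2 the result agrees with the closed form (1 + √5)/4 obtained from the quadratic auxiliary equation. Bisection is used here only because it is easy to verify; it is not the iteration scheme formulated by Feller.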

Properties of Roots II
Proposition 12. For 0 < p < 1, the roots are continuously differentiable functions of p.
Hence if λ_1 and λ_2 are two roots with unequal amplitudes, then |λ_1| < |λ_2| implies Re(λ_1) < Re(λ_2) and vice versa. (c) For even k, let λ_2 in (b) be real and negative. Suppose a root λ_1 exists such that Re(λ_1) < Re(λ_2); then necessarily |λ_1| > |λ_2|, which contradicts the result in (b). Hence the negative real root has the most negative real part and the smallest amplitude of all the roots.
We proved in (32) that |λ_j| < min{p, λ} for all the secondary roots. We also know that all the secondary roots vanish, that is, attain their minimum amplitudes, at p = 0 or 1.
Remark 18 (moment generating function). An expression for the probability generating function P(s) was derived by Feller [9] (see (3)) and rederived by Philippou et al. [3]. An equivalent expression for the moment generating function M(t) = P(e^t), the expectation of e^t raised to the power of the waiting time, was derived by Barry and Lo Bello [13]. Feller stated that the domain of convergence of P(s) is |s| < 1/λ, where λ is the principal root, whereas Philippou et al. stated that P(s) exists for |s| ≤ 1. Barry and Lo Bello denoted the roots by r_i, i = 1, ..., k, and defined δ = min{−ln |r_1|, −ln |r_2|, ..., −ln |r_k|}. They proved that all the roots have modulus less than unity and stated that M(t) exists on the interval t ∈ (−∞, δ). We confirm the correctness of Feller's statement that the domain of convergence is determined by the principal root; hence |s| < 1/λ yields the most precise value for the domain of convergence.
The following expressions for the mean μ and variance σ² of the waiting time for the first run of k successes were derived by Feller [9] (and rederived by Philippou et al. [3]):
\[ \mu = \frac{1 - p^k}{q p^k}, \qquad \sigma^2 = \frac{1}{(q p^k)^2} - \frac{2k + 1}{q p^k} - \frac{p}{q^2}. \]
Chaves and de Souza [14] obtained the above expression for the mean but a different expression for the variance. We confirm the correctness of Feller's expressions. The mean and variance are polynomials in 1/p, and from their explicit forms we see that both the mean and variance decrease strictly and continuously with p. In particular μ = k and σ² = 0 at p = 1, as expected. Both μ and σ² diverge as p → 0.
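Feller's closed forms for the mean and variance can be compared against moments computed directly from the distribution. The sketch below assumes the standard order-k recurrence for the probability mass function and the usual statement of Feller's formulas, μ = (1 − p^k)/(q p^k) and σ² = 1/(q p^k)² − (2k + 1)/(q p^k) − p/q². The function names and the truncation point `n_max` are illustrative choices; the truncation is harmless only when the tail beyond `n_max` is negligible.

```python
def moments_from_recurrence(p, k, n_max=2000):
    """Mean and variance from the pmf, truncated at n_max (assumed recurrence)."""
    q = 1.0 - p
    f = [0.0] * (n_max + 1)
    f[k] = p ** k
    for n in range(k + 1, n_max + 1):
        f[n] = q * sum(p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    mean = sum(n * f[n] for n in range(n_max + 1))
    second = sum(n * n * f[n] for n in range(n_max + 1))
    return mean, second - mean ** 2


def feller_moments(p, k):
    """Closed-form mean and variance as usually attributed to Feller."""
    q = 1.0 - p
    a = q * p ** k
    mean = (1.0 - p ** k) / a
    variance = 1.0 / a ** 2 - (2 * k + 1) / a - p / q ** 2
    return mean, variance


numeric = moments_from_recurrence(0.5, 2)
closed_form = feller_moments(0.5, 2)
```

For a fair coin and k = 2 both routes give a mean of 6 and a variance of 22, the familiar values for the waiting time until two consecutive heads.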
Corollary 22. For 0 < < /( + 1), the principal root is given by Proof. The sum of the roots is ; hence the secondary roots sum up to − . Using (1 − ) 1/ = 1/ , it follows that ( 1/ ) = min{ , } for all ∈ (0, 1). Hence ( 1/ ) = for ∈ (0, /( + 1)) and so In the second line, the sum over the roots of unity vanishes unless is a multiple of . The interchange of the orders of summation is permissible because the series converge absolutely within the domain of convergence.
The derivation of the asymptotic expressions in Corollary 4 goes as follows. Because the principal root has a strictly larger magnitude than all the other roots, for sufficiently large it dominates the contribution to ( ). From (15) for ̸ = /( + 1), In the last line we used = (1 − ). The expression in (69) agrees with that by Feller [9]. Next for = /( + 1), using = , This expression is not in [9].
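The dominance of the principal root can be observed numerically: if the probability mass function is a sum of powers of the roots, the ratio of successive values should approach the principal root as n grows. The sketch below assumes the standard order-k recurrence and, for the comparison, restricts to k = 2, where the auxiliary equation z² − q z − q p = 0 is a quadratic with positive root (q + √(q² + 4pq))/2. The function names are hypothetical.

```python
import math


def pmf_ratio(p, k, n):
    """Return f(n + 1) / f(n) from the assumed order-k recurrence (requires n >= k)."""
    q = 1.0 - p
    f = [0.0] * (n + 2)
    f[k] = p ** k
    for m in range(k + 1, n + 2):
        f[m] = q * sum(p ** (j - 1) * f[m - j] for j in range(1, k + 1))
    return f[n + 1] / f[n]


def principal_root_k2(p):
    """Positive root of z**2 - q*z - q*p = 0, valid for k = 2 only."""
    q = 1.0 - p
    return 0.5 * (q + math.sqrt(q * q + 4.0 * p * q))


early = pmf_ratio(0.5, 2, 3)    # far from the asymptotic regime
late = pmf_ratio(0.5, 2, 60)    # secondary-root contribution is negligible here
```

For p = 1/2 and k = 2 the ratio at n = 3 equals 1, whereas at n = 60 it agrees with the principal root (1 + √5)/4 to within floating-point accuracy. The approach is geometric, governed by the ratio of the secondary root to the principal root, which is about 0.382 in this example.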

Asymptotics
We derive (17) and (18). For given fixed > 0, we want ( , ) such that, for all ≥ , the magnitude of the summed contribution from all the secondary roots in (15) is less than times the contribution from the principal root. From (15), one must have Now suppose that we have an upper bound ( , ) such that | / | < < 1 for all the secondary roots. Note that depends on and but not on . The contribution from the secondary roots is bounded by .
The task then is to determine . We proved that | | < for all the secondary roots and > . Hence a possible bound is This is not a tight upper bound; it is meaningful only if < 1− ; that is, < 1/2. Next if 0 < < /( +1), so /( +1) < < 1; we see that < ( + 1) .
The last line follows because ≤ /( + 1). Thus we can take as defined in (82).
The bound = 1 − using (82) is also applicable for ∈ (0, /( +1)); to see this use min{ , } in the derivation above. Then for all ∈ (0, 1), we select the most stringent bound at each value of . This yields the various cases in (18).