Stochastic methodology for the study of an epidemic decay phase, based on a branching model

We present a stochastic methodology to study the decay phase of an epidemic. It is based on a general stochastic epidemic process with memory, suitable to model the spread in a large open population with births of any rare transmissible disease with a random incubation period and a Reed-Frost type infection. This model, which belongs to the class of multitype branching processes in discrete time, enables us to predict the incidences of cases and to derive the probability distributions of the extinction time and of the future epidemic size. We also study the epidemic evolution in the worst-case scenario of a very late extinction time, making use of the Q -process. We provide in addition an estimator of the key parameter of the epidemic model quantifying the infection, and ﬁnally illustrate this methodology with the study of the Bovine Spongiform Encephalopathy epidemic in Great Britain after the 1988 feed ban law.


Introduction
Outbreaks of infectious diseases of animals or humans are subject, when possible, to control measures aiming at curbing their spread. Effective measures should force the epidemic to enter its decay phase and to reach extinction. The decay phase can then be simply detected by a decrease of the number of cases, when this decrease is obvious. However this is not always the case, and this rough qualitative information might not be sufficient to evaluate accurately the effectiveness of the proposed measures to reduce the final size and duration of the outbreak. The goal of this article is to present a stochastic methodology in discrete time to study more accurately the decay phase of an epidemic. Our framework is the spread, in a large open population, of a rare transmissible disease such that the infection process may be assumed to follow a Reed-Frost type model, with a probability for a susceptible to become infected by a given dose of pathogens inversely proportional to the total population size. Moreover the latent period (during which an individual is infected but not yet infectious) may be random and long compared to the generation time. Questions about the decay phase include: which quantitative criteria can ensure that the disease has entered an extinction phase? What is the probability distribution of the epidemic extinction time, of the epidemic final size, of the incidence of infected individuals? Finally, what would be the evolution of the epidemic in the event of a very late extinction of the disease?
From a practical point of view, it is generally impossible to observe all infections. Susceptible and infected but not yet infectious individuals are most often not distinguishable, being both apparently healthy. This leads to the fact that the only available observations correspond to the incidence of individuals with clinical symptoms. One way to deal with this lack of information was proposed in [20] by Panaretos, who used a model taking into account two types of infected individuals, the observed and the unobserved. In order to answer the previous questions, we choose here a different approach, considering a stochastic model depending on the sole incidences {X n } n of infectives at each time. We assume that an infective can transmit the disease during one given time unit at most. Therefore the incidence of infectives corresponds to the incidence of cases. The process then describes in a recursive way how one single infective can indirectly generate new infectives (so-called "secondary cases") k time units later, where 1 k d. We assume that this number of secondary cases follows a Poisson probability distribution with parameter Ψ k > 0. The recursive formula defining {X n } n is then the following, where the variable Y n−k,n,i is the incidence of secondary cases produced at time n with a delay k (latent period) by individual i infectious at time n− k. The {Y n−k,n,i } i,k are assumed independent given F n−1 := σ ({X n−k } k 1 ), and the {Y n−k,n,i } i are assumed i.i.d. (identically and independently distributed) given F n−1 , with a common Poisson distribution with parameter Ψ k . This model is therefore time-homogeneous, and is in this sense less general than the one introduced in [24], which describes the spread of infectious animal diseases in a varying environment. However since we focus on the extinction phase only, the assumption of a constant environment with no new control measure is well-founded, and enables us to describe more accurately the decay phase. This process is the generalisation of the well-known single-type BGW (Bienaymé-Galton-Watson) branching process, limit, as the total population size tends to infinity, of the process describing the spread in a closed population of an infectious disease with a negligible latent period and a probability to become infected following a Reed-Frost model (see e.g. [2,3] and citations therein). The core of the article lies in Section 2, where the whole methodology is presented. We first formulate the epidemic model {X n } n as a multitype branching processes with Poissonian transitions, the types representing the memory of the process. This formulation provides useful analytical results such as an extinction criteria, and the distributions of the extinction time and of the epidemic size (Subsections 2.1-2.3). Then, in order to investigate the worst-case scenario of an extreme late extinction of the epidemic, we introduce in Subsection 2.4 the Q-process {X * n } n , obtained by conditioning {X n } n on a very late extinction. Using this process, we focus the study on the early behaviour of the decay phase in the worst-case scenario, rather that on its long range behaviour, which would have little meaning in our setting. Motivated by practical applications to real epidemics, for which we want to predict the processes {X n } n and {X * n } n , as well as the derived distributions above, we need to know the values of the parameters {Ψ k } k . We may write Ψ k = am−k a=1 θ a P inc,a (k) P age (a + k), where θ a is the mean number of individuals infected at age a by an infective by direct or indirect transmission, a m is the largest survival age, P inc,a (k) is the probability for the individual aged a at infection to have a latent period equal to k (given his survival), and P age (a + k) is the probability to be aged a + k at the end of the latent period. Parameters {θ a } a are the key quantities for the spread of the disease and can be subject to changes due to control measures during the epidemic. We assume here that θ a = θ 0 + p a , where p a is constant over time, while θ 0 may change with control measures. A typical example is when θ 0 is the mean number of individuals infected by an infective by horizontal route at age a, assumed independent of a 2, and p 1 represents the maternal transmission probability p mat . In this case Ψ k = θ 0 am a=k+1 P inc,a−k (k) P age (a) + p mat P inc,1 (k) P age (k + 1) . (1.2) So we assume here that, except for θ 0 , the other parameters of the {Ψ k } k are constant over time and are known (generally estimated) from previous experiences or from the study of the whole epidemic evolution, in particular its growth phase. We moreover assume that each Ψ k depends affinely on θ 0 (see Subsection 3.1). In Subsections 2.5 and 2.6, we provide optimal WCLSE (Weighted Conditional Least Squares Estimators) of θ 0 in the decay phase, in the frame of {X n } n as well as in the frame of the associated conditioned process {X * n } n , and prove the strong consistency and the asymptotic normality of these estimators.
The final Section 3 is devoted to the application of this method to real epidemics. We first present in Subsection 3.1 some general conditions under which the spread of a SEIR disease (Susceptible, Exposed (latent), Infectious, Removed) can be approximated by our epidemic process defined by (1.1), and give an explicit derivation of the parameters {Ψ k } k . We then illustrate the methodology in Subsection 3.2 with the decay phase of the BSE (Bovine Spongiform Encephalopathy) epidemic in Great-Britain. According to the available data [19], the epidemic is obviously fading out. We assume that the {Ψ k } k satisfy (1.2). Then thanks to the stochastic tools developed here, we provide in addition to this rough information, short-and long-term predictions about the future spread of the disease as well as an estimation of a potentially remaining horizontal infection route after the 1988 feed ban law.
2 Methodology for the study of an epidemic decay phase In this section we present a general methodology to study the decay phase of a SEIR disease in a large population, modeled by the process (1.1) defined in Section 1. Our main goal is to provide analytical tools to evaluate the efficiency of the last control measures taken prior to the considered time period. Most of our results are derived from the fact that this epidemic model can be seen as a multitype branching process. Indeed, {X n } n defined by (1.1) is a Markovian process of order d. Consequently, the d-dimensional process {X n } n defined by is Markovian of order 1, and it stems directly from (1.1) that {X n } n is a multitype Bienaymé-Galton-Watson (BGW) process with d types (see e.g. [1]). Note that the d types in this branching process do not correspond to any attribute of the individuals in the population, which is usually the case in mathematical biology (see e.g. [12]), but simply correspond to the memory of the process {X n } n . The information provided by the d-dimensional Markovian process {X n } n is therefore the same as the one given by the 1-dimensional d-Markovian process {X n } n , but the multitype branching process setting gives us powerful mathematical tools and results stemming from the branching processes theory [1]). The first basic tool is the generating function of the offspring distribution of {X n } n , f : where e i := (0, . . . , 1, . . . , 0) denotes the ith basis vector of N d and u v : The second basic tool is the mean matrix M defined by E(X n |F n−1 ) = X n−1 M, which is here Let us notice that, since Ψ k > 0 for each k = 1 . . . d, then {X n } n is nonsingular, positive regular (see [1]) and satisfies the X log X condition, where . denotes the sup norm in R d .

Extinction of the epidemic
Almost sure extinction. Since the single-type process {X n } n has a memory of size d, it becomes extinct when it is null at d successive times, or equivalently as soon as the d-dimensional process {X n } n reaches the d-dimensional null vector 0. According to the theory of multitype positive regular and nonsingular BGW processes ( [1]), the extinction of the process {X n } n occurs almost surely (a.s.), if and only if ρ 1, where ρ is the dominant eigenvalue (also called the Perron's root) of the mean matrix M. Thus ρ is solution of d k=1 Ψ k ρ −k = 1. In general for d > 1, ρ has no explicit expression. However, d k=1 Ψ k ρ −k = 1 leads directly to the following explicit threshold criteria.
Proposition 2.1. The epidemic becomes extinct almost surely if and only if R 0 1, where R 0 := d k=1 Ψ k is the total mean number of secondary cases generated by one infective in a SEIR disease. We call R 0 the basic reproduction number.
Moreover, when ρ 1, then R 0 ρ with equality if and only if either ρ = 1 or d = 1, and when ρ > 1, then R 0 ρ with equality if and only if d = 1.
Note that when d > 1, R 0 only provides information about the threshold level, whereas ρ provides an additional information about the speed of extinction of the process, as shown in the next two paragraphs.
Speed of extinction. Thanks to well-known results in the literature about multitype branching processes and more particularly to the Perron-Frobenius theorem (see e.g. [1]), we can deduce the expected incidence of infectives in the population at time n, for n large. Denoting by u and v the right and left eigenvectors of M associated to the Perron's root ρ, that is, Mu T = ρu T and vM = ρv, with the normalization convention u · 1 = u · v = 1, where u · v stands for the usual scalar product in R d and where the superscript T denotes the transposition, then E (X n |X 0 ) = X 0 M n ∼ n→∞ ρ n X 0 u T v. The first coordinate in the latter formula becomes for the epidemic process Computing explicitly u and v, we obtain that for all i = 1 . . . d, which leads to the following asymptotic result Hence if ρ < 1, the mean number of infectives decreases exponentially at the rate ρ. In the following section, we provide a much finer result on the estimation of the disease extinction time in the population.
Extinction time of the epidemic. The extinction time distribution can be derived as a function of the offspring generating function. As usual in stochastic processes, this quantity is calculated conditionally on the initial value X 0 = (X 0 , X −1 , . . . , X −d+1 ), but for the sake of simplicity we do not let it appear in the notations. Note that since we are building tools for the prediction of the spread of the disease, the time origin 0 corresponds here to the time of the last available data (generally the current date). Let T := inf {n 1, X n = 0} denote the extinction time of the process {X n } n , and let f n := f • f n−1 be the nth iterate of the generating function f given by (2.2). We denote f n := (f n,1 , . . . , f n,d ). Then, by the branching property of the process ( [1]), the probability of extinction of the epidemic before time n is immediately given by P (T n) = P (X n = 0) = f n (0) X0 , that is to say, It can be immediately deduced from convergence results for f n (0) as n → ∞ ( [17]), that if ρ = 1, P (T n) ∼ 1 − (nη) −1 X 0 · u, while if ρ < 1, P (T n) ∼ 1 − ρ n γX 0 · u, for some constants η, γ > 0. As a consequence, the closer ρ is to unity, the longer the time to extinction will be in most realizations. More specifically, formula (2.7) enables the exact computation (resp. estimation) of P (T n) for any n by the iterative computation of f n , X 0 being given, when the parameters Ψ k of (1.1) are known (resp. estimated). Moreover, since for ρ 1 the epidemic becomes extinct in an a.s. finite time and P (T n) ր n→∞ 1, then for any given probability p ∈ [0, 1[ there exists n ∈ N such that P (T n) p. So in practice, for any p ∈ [0, 1[, (2.7) enables us to compute the p-quantile n T p of the extinction time, n T p := min{n 1 : P (T n) p}. (2.8)

Total size of the epidemic
Under the assumption ρ 1 and the independence of the {Y n−l,n,i } i,l,n (we previously assumed the independence of the {Y n−l,n,i } i,l , for each n), we derive the distribution of the future total size N := T n=1 X n of the epidemic until its extinction, i.e. the future total number of infectives until the extinction of the disease. It can be shown ( [16]) that, given the initial value X 0 , the time origin being the same as in Subsection 2.1, the probability distribution of N is where D = denotes the equality in distribution, an empty sum is by convention 0, the that is, for each n 1, P (N k,i,j = n) = e −n d l=1 Ψ l (n d l=1 Ψ l ) n−1 (n!) −1 . Consequently, under the convention that an empty product is 1, the probability distribution of N is, for any n ∈ N, which may be calculated (resp. estimated), replacing the Ψ k by their values (resp. estimations). In practice, for any p ∈ [0, 1[, (2.11) enables to compute the p-quantile n N p of the total epidemic size, n N p := min{n 1 : P (N n) p}. (2.12) We obtain moreover an explicit formula for the mean value and variance of the size of the epidemic, (2.13)

Exposed population
Depending on the disease, it might also be crucial to study and predict the evolution of the incidence of exposed individuals in the population, which is generally unobservable. We assume that this information is given by the process {Z n } n defined by the conditional distribution Z n |X n D = Poiss (Ψ 0 X n ), where Ψ 0 is the mean number of individuals infected at time n by an infective of this time (see Subsection 3.1). This property enables on the one hand to reconstruct the whole past epidemic (i.e. the incidence of infectives as well as of exposed individuals) thanks to the observable data. On the other hand, it allows to simulate the evolution of the incidence of exposed individuals in the future, based on predictions of the evolution of the epidemic process {X n } n .

Worst-case scenario: very late extinction of the epidemic
Even in the case when the epidemic dies out almost surely (ρ 1), and although one can provide the p-quantile n T p of the extinction time with the probability p as large as wanted (see (2.8)), the epidemic might become extinct after this time with a small but non null probability of order 1 − p. This raises the following question: how would the incidences of infectious and exposed individuals evolve in the (unlikely) case of a very late extinction? In terms of risk analysis, this issue appears to be crucial to evaluate the risks associated with this worst-case scenario. The tools developed in the previous subsections allow to evaluate the probability of all possible outcomes. But since the worst ones, typically a very late extinction, have a negligible probability, these tools do not bring any information in these worst cases, and in particular do not inform on the evolution at each time-step of the spread of the disease (would it decrease extremely slowly, stay at a constant rate for a very long time, present several peaks in its evolution etc.). In order to study the propagation of the epidemic in the decay phase, assuming that extinction occurs very late, we are interested in the distribution of the process {X n } n conditionally on the event that the epidemic has not become extinct at time k, where k is very large. We therefore consider for any n 1 , n 2 , . . . ∈ N and any i 0 , i 1 , i 2 , . . . ∈ N d the conditioned probability P i0 (X n1 = i 1 , . . . , X nr = i r | X k = 0), where the subscript i 0 denotes the initial value. If k is finite this distribution cannot be easily handled due to its time-inhomogeneity. However, when ρ 1, it is known ( [7]) that this conditioned distribution converges, as k → ∞, to the distribution of a d-dimensional Markov process {X * n } n : lim (2.14) We shall further discuss in Proposition 2.5 the relevancy of approximating the conditioned probability for k fixed by the limiting object (2.14). The conditioned process {X * n } n defined by (2.14) is known in the literature as the Q-process associated with {X n } n , also described as the process conditioned on "not being extinct in the distant future". It has the following transition probability ( [7]): for every n 1, i, j ∈ N d , i = 0, where u is the normalized right eigenvector of M associated to the Perron's root ρ as introduced in Subsection 2.1, and computed explicitly in (2.5). In the same way as for the process {X n } n , we define the 1-dimensional process X * n := X * n,1 . By construction we then have X * n,i = X * n−i+1 , for each n and each i = 1 . . . d.
Proposition 2.2. The stochastic process {X * n } n is, conditionally on its past, distributed as the sum of two independent Poisson and Bernoulli random variables: where Ψ := (Ψ 1 , . . . , Ψ d ), * is the convolution product symbol, and p X * n−1 := Proof. Applying (1.1) and (2.15), we obtain that for all j ∈ N, Remark 2.3. Note that if one compares (2.16) with the transition probability of the unconditioned process X n |X n−1 D = Poiss (X n−1 · Ψ), it appears that {X * n } n behaves at each time step like {X n } n , according to a Poisson distribution, except that it has the possibility at each time step to add one unit or not, according to a Bernoulli random variable. Moreover, if X * n−1 = . . . = X * n−(d−1) = 0, then according to (2.17), p(X * n−1 ) = 1, which implies that at time n, the probability to add one unit is equal to one, thus preventing the extinction of the process.
Proposition 2.4. The process {X * n } n admits a stationary probability measure π with finite first and second-order moments.
Proof. Since the multitype branching process {X n } n satisfies property (2.4), it is known ( [7]) that the Q-process {X * n } n is positive recurrent with a stationary probability measure π given by, where ν is the Yaglom distribution of the process {X n } n , uniquely defined by the following property: In the literature, this stationary measure for the conditioned process {X * n } n is also referred to as the doubly-limiting conditional probability. Moreover, by Proposition 2.2, We consequently obtain by means of Fatou's lemma that, for every i = 1 . . . d, We similarly prove that π has finite second-order moments by writing Let us discuss the relevancy of approximating the epidemic process {X n } n conditioned on nonextinction at some finite time k, for k large, by the Q-process {X * n } n obtained by letting k → ∞. When considering the case of late extinction, one works under an hypothetical assumption based on the unknown future, hence in practice one does not focus on a specific value k for the survival of the disease in the population. We therefore might consider that k is chosen large enough such that the approximation of the process {X n } n conditioned on the event {X k = 0} by the process {X * n } n is valid. Of course, the order of magnitude of such k will depend on the rate of convergence of the conditioned process to {X * n } n .
the λ i being the other eigenvalues of M. In case |λ| = |λ 3 |, we stipulate that the multiplicity s of λ is at least as great as the multiplicity of λ 3 .
Proof. Thanks to (2.15) and to the Markov property together with the fact that The right term of (2.20) is known to converge to 0, as k → ∞, thanks to the property that lim k a k = γu, for some γ > 0, where a k := ρ −k (1 − f k (0)) (see [17]). This stems from two convergences, namely It thus appears that the rate of convergence of a k to γu is of the same order of magnitude as the one of a k b −1 k to u. Let us determine this rate in an accurate way. We use the following inequality produced by Joffe and Spitzer in [17]: for each k n 1, where we are going to replace δ n and α n by some explicit formulae function of ρ and n. For this purpose, we use a detailed asymptotic behavior of M k , as k → ∞, presented for instance in [8]: For the sake of clarity the symbol O (.) will denote either a scalar or a matrix with all the entries satisfying the associated property. This implies the existence of some constant a > 0 such that, for all k ∈ N, We thus have provided an explicit formula for the sequences (δ k ) k and (α k ) k introduced by Joffe and Spitzer in [17]. Finally let us apply (2.22) to n = ⌊ k 2 ⌋ and replace in this inequality δ n and α n by their explicit expressions that we got. We obtain that, for all k ∈ N, Consequently, the right member of (2.20) will decrease with max{k s−1 (|λ|ρ −1 ) Hence the concept of the Q-process will have most practical relevance to approximate the very late extinction case if ρ is near to zero and if |λ| is small compared with ρ. Note however that the very late extinction scenario is more likely to happen if ρ is near to unity because the time to extinction in most realisations will then be long (see Subsection 2.1).

Estimation of the infection parameter
We assume for this subsection that the parameters Ψ k of the epidemic model (1.1)-(2.1) are not entirely known. More precisely, we assume that the Ψ k are of the form, for all k = 1 . . . d, where a k > 0 and b k 0 are constants, and θ 0 is an unknown real parameter. We will write in what follows This general assumption corresponds in particular to the case where the Ψ k are of the form (1.2).
The WCLSE is based on the normalized process Y n := X n / a · X n−1 and is defined by We easily derive the following explicit form The normalization of the process X n by a · X n−1 appears to be the most natural and suitable for the following reasons. First, this normalization generalizes the normalization X n / aX n−1 in the monotype case, which is the one leading to the Harris estimator m of m 0 = aθ 0 + b since we have, for d = 1, a θ X0 + b = m. It also corresponds, in the linear case b = 0, to the maximum likelihood estimator of θ 0 . In addition, defining for any vector x, x := min i x i and x := max i x i , we have is invariant under multiplication of the whole process, and bounded respectively to {X n } n , leading to the quasi-optimality of θ |X0| at finite |X 0 | and n, in the sense of [11]. Let us provide asymptotic results for the estimator θ |X0| defined by (2.26), as the initial population size |X 0 | = X 0 + X −1 + . . . + X −d+1 tends to infinity. We denote by m a.s.
= θ 0 , and is asymptotically normally distributed: Proof. Let us first prove that, for each k = 1 . . . n and each i = 1 . . . d, Using the branching property of the process {X n } n ([1]) we write where, for all l = 1 . . . d and j = 1 . . . X 0,l , X (l) k,i,j is the i-th coordinate of a d-type branching process at time k initialized by a single particle of type l. For k, i and l fixed the random variables {X li (θ 0 ). According to the strong law of large numbers and under the theorem assumption, we have, for every l = 1 . . . d such that X 0,l = 0, which together with the theorem assumption leads to (2.30).
To prove the consistency of θ |X0| we apply (2.30) to (2.26), using the fact that X k = X k,1 and X k−i = X k−1,i , and obtain hence (2.31) immediately leads to the strong consistency.
We are now interested in the asymptotic distribution of θ |X0| − θ 0 . We derive from (2.26) that (2.32) By (1.1), Applying a central limit theorem for the sum of a random number of independent random variables (see e.g. [5]), we obtain that, for all i = 1 . . . d, We have used the fact that |X 0 | is a real positive sequence growing to infinity, and n k=1 X k−i a sequence of integer-valued random variables such that n k=1 X k−i / |X 0 | converges in probability to a finite random variable. In our case the limit is actually deterministic, since we have shown in

Estimation of the infection parameter in the worst-case scenario
In order to make predictions of the evolution of the epidemic in case of a very late extinction, i.e. in order to make predictions of the behavior of the conditioned process {X * n } n introduced in Subsection 2.4, we need to estimate the parameter θ 0 in the setting of this conditioned process. We point out that θ 0 does not play the same role in the conditioned process {X * n } n and in the unconditioned process {X n } n , since, as shown in Proposition 2.2, this parameter interferes not only in the Poisson random variable but also in the Bernoulli one. It would thus be irrelevant to estimate θ 0 with an estimator aimed for the unconditioned process, such as θ |X0| . Let us notice that, according to (2.16), the process {X * n } n could be written as a multitype branching process with state-dependent immigration. Because of this state-dependence, and since the parameter θ 0 acts in a nonlinear way in the immigration, the methods developed in estimation theory for branching processes with immigration (see e.g. [22]) cannot be directly applied here. Similarly as in Subsection 2.5 we consider the WCLSE based on the process Y * n := X * n / a · X * n−1 , namely where Θ is defined in Subsection 2.5, and where be the error term between the normalized process and its conditional expectation. We obtain that In what follows, we denote by f ′ the derivative of f with respect to θ, and similarly for the other quantities depending on θ.
= θ 0 , and has the following asymptotic distribution, Remark 2.8. Note that (2.42) involves the function f ′ , and thus requires the knowledge of the derivative of the function u j given in (2.5), and which is not an explicit function of θ since ρ is not either. However, j is known as soon as ρ can be computed. Consequently, denoting ρ by ρ(θ) when θ is the parameter of the model, (2.42) can be used as soon as ρ( θ * obs n ) is known: for this purpose one can for instance numerically approximate the largest solution ρ of d k=1 Ψ k ( θ * obs n )ρ −k = 1.
Proof. The proof heavily relies on a strong law of large numbers for homogeneous irreducible positive recurrent Markov chains applied to the conditioned process {X * n } n and its stationary distribution π θ0 (given by Proposition 2.4), which states ( [4]) that, for every π θ0 -integrable function h : Note that the Perron's root ρ(θ) of M(θ) as well as the associated right normalized eigenvector u(θ) are C ∞ -functions of θ.
Let us now consider the asymptotic distribution of θ * n − θ 0 . For this purpose, we follow the steps of the proof of Proposition 6.1 in [14]. Writing the Taylor expansion of S ′ n ( θ * n ) at θ 0 we obtain that θ * 2 , it appears that (2.49) is true, as soon as the following holds:  = ∞ (as an immediate consequence of the stronger result (2.47)), and since f ′′ (., X * k−1 ) fulfils the required Lipschitz condition. Indeed we proved earlier that u ′′′ i (θ) is continuous on the compact set Θ and is thus bounded on Θ, which implies that f ′′′ (., X * k−1 ) = p ′′′ ., X * k−1 (a · X * k−1 ) −1/2 is bounded by a F * k−1 -measurable function. In view of (2.51), we consider the function f (θ, j) 2 and its derivative 2f ′ (θ, j)f ′′ (θ, j). Similarly as for (2.44), one can show that there exists a constant C 2 > 0 such that for all θ ∈ Θ, and all j = 0, |p ′′ (θ, j) | C 2 . This implies Consequently, which by (2.47) and the strong consistency of θ * n almost surely tends to 0. Writing this implies (2.51). It now remains to prove (2.52). We write , which thanks to (2.47) and the strong consistency of θ * n implies (2.52). In view of (2.46), we finally want to prove that n k=1 ε * k f ′ (θ 0 , X * k−1 )/ √ n converges in distribution. For this purpose we make use of a central limit theorem for martingale difference arrays ( [23,21] k } k n is a sequence of square integrable martingales. Moreover, using inequalities (2.41) and (2.48), we obtain by (2.19), So, by means of (2.43), Third, using Cauchy-Schwarz and Bienaymé-Chebyshev inequalities, We can show that the 4th central moment of the independent sum of a Poisson and a Bernoulli random variables equals µ 4 + 6µ 2 γ 2 + γ 4 , where µ i and γ i denote the ith central moment of the Poisson and of the Bernoulli variable. If these variables have parameter λ and p, respectively, then Since the highest power of X * n involved in (2.57) is 3/2, and since by Proposition 2.4 the stationary distribution π θ0 has finite second-moments, we can apply (2.43) to (2.56) and obtain that It then ensues from the central limit theorem mentioned above that It now remains to prove the asymptotic distribution (2.42). Thanks to what precedes, this result is immediate as soon as we prove that and show that (f ′ (., j)) 2 g (., j) has a bounded derivative and is thus Lipschitz. We have indeed for some constant C 3 > 0. This enables to write

Application to real epidemics 3.1 Explicit epidemic model
Although the epidemic model (1.1) has a clear interpretation in terms of the disease propagation, it can be also proved ( [16]) that it is the limit process, as the initial population size tends to infinity, of the incidence of infectives described by a thorough process in a large branching population, taking into account all the health states of the disease and describing in detail the infection and latent processes for each individual in the population. This detailed process is a multitype branching process with age and population size dependence, taking into account the variability of many individual factors such as the reproduction, the survival, the transmission of the disease and the latent time. We assume a disease of the general SEIR type. We also assume a random latent period and, for the sake of simplicity, a duration of the state I of one time unit only. Hence the incidence of infectives exactly corresponds to the incidence of cases.
The main assumptions for the construction of the limit model {X n } n are, concerning the disease: (i) at the initial time the disease is rare and the total population size is large, (ii) the infection via horizontal route is of Reed-Frost type, with the probability for a susceptible individual to become infected by a given dose of pathogens being inversely proportional to the total population size, (iii) the individual survival law is the same for E and S individuals and is independent of the population and of the time, (iv) the latent time law given the individual survival is independent of the time, the individual age, and of the infectives population during the latent period. This last property is possible if we assume that overinfection during the incubation has a negligible effect on the latent time. We moreover assume that the whole healthy population size is relatively stable over time.
The limit process (1.1) is then obtained in an inductive way as the limit in distribution of {I n } n as the initial total population size |N 0 | → ∞, where I n denotes the number of new infectives at time n. More precisely, denoting by E n the incidence of exposed individuals at time n and by a m the largest individual survival age, we obtain that {X n , Z n } n D = lim N0→∞ {I n , E n } n , with X n | (X n−1 , . . . , X n−am+1 ) and where for all k = 1 . . . a m − 1, Ψ k satisfies (1.2) in Section 1, the latent period distribution given survival, P inc,a , assumed independent of a, being denoted by P inc . Thus Ψ k = θ 0 P inc (k) am a=k+1 P age (a) + p mat P inc (k) P age (k + 1) , and similarly, Moreover, when assuming a relatively stable population size and denoting by S (a) the individual survival probability until age a at least, then P age (a) is equal to S (a) ( am a ′ =1 S (a ′ )) −1 .

Illustration: the BSE epidemic in Great-Britain
In this section we provide an illustration of the methodology developed in Section 2 by studying the decay phase of the BSE epidemic in Great-Britain, based on the observations of the yearly diagnosed cases from 1989 until 2011, 1988 being the date of the main control measure. The disease that was first officially identified in 1986 ( [25]), reached its peak in 1992 (36682 cases) and is obviously now in its decay phase. Since only a very few cases were recently reported ( [19]), namely 11 in 2010 and 5 in 2011, the spread of the disease should a priori "soon" come to an end.
It has been accepted that the epidemic is fading out, with a very low level of risk for cattle and humans. However it might be interesting to have a more precise idea of the extinction phase of the disease, for instance of its length and of its intensity.
Choice of the epidemic model. BSE is a transmissible disease through ingestion of prions (horizontal route) and through maternal route. It is commonly accepted (see e.g. [18]) that the infectious phase (including the clinical state) lasts at most one year, while the latent state can be of several years. If one compares the epidemic data with the total population of around 9 million cattle in Great-Britain, it is reasonable to assume that, even at the peak time, BSE may be considered as a rare disease in a large population. We moreover assume that the probability for an animal to become infected by a horizontal route follows a Reed-Frost type model as in (ii) of Subsection 3.1, and that the probability for a calf to become infected by its dam is nonnull and constant over time. Moreover, for the sake of simplicity and parsimony, we do not take into account potential heterogeneity factors such as the different regulations from 1989 (the main regulation was the feed ban of July 1988, and was shown to be quite efficient [15]), the different types of breeding, of races, the age for animals older than one year, the evolution of the surveillance system and of diagnosis tests. So the process {X n } n defined by (3.1)-(3.4) appears relevant to model the spread of this fatal disease after the 1988 ban, where the infectious state I corresponds to the end of the incubation period and the clinical state, and R corresponds to the death (assumed to be either by routine slaughtering for E and S cattle, and control slaughtering for I cattle). Choosing a time step of one year, X n then represents the yearly incidence of notified cases, which corresponds to the available data provided by the World Organisation for Animal Health ( [19]).
Choice of the model parameters. In order to predict the future epidemic evolution, we need to evaluate all the parameters involved in (3.3). These parameters were already studied in previous works [6,9,10,15] using different models, amounts of observations, and kinds of estimators. Since it might have a crucial role in the decay phase of the disease, we focus here on θ 0 the infection parameter from 1989 via a horizontal route of transmission (i.e. the mean number per infective and per year of newly infected). Until the 1988 feed ban regulation, the main routes of transmission of BSE were horizontal via protein supplements (Meat and Bone Meal, milk replacers), and maternal from a dam to its calf. Since a previous statistical study [15] concluded to the full efficiency of the 1988 ban and since most of cattle are slaughtered before the age of 10 years, the fact that cases of BSE are still observed more than 20 years later could suggest the existence of a remaining source of infection, either via a maternal transmission route, or via a horizontal one, for example via the ingestion of excreted prions from alive infected animals, or from prions left in the environment. The prediction of the future disease spread thus strongly relies on the intensity of this infection, quantified by the parameter θ 0 . We provide an estimation of θ 0 together with a confidence interval, which will be used for prediction estimations. We provide in addition a sensitivity analysis to the other parameters of the model, which are chosen as follows. First, the maximal age of the individuals in the population a m is set to a m = 10, which corresponds to the largest reported survival age with a non-negligible probability. Next, we set p mat = 0.1, which is the largest order of magnitude commonly accepted for the maternal infection probability ( [6,10]). Following a previous work, [15], we assume here a discretized Weibull distribution with parameters α, β for the distribution of the latent period, that is to say for each k 1, P inc (k) := e − α−1 αβ α (k−1) α − e − α−1 αβ α k α , where α is a shape parameter, and β is the mode of the probability density of the corresponding continuous Weibull distribution. The parameters α and β have a very bad identifiability with the infection parameters on a sole monotonous phase, and are consequently estimated in [15] on the whole epidemic series (growth and decay), by Bayesian maximum a posteriori (MAP) estimations. We set α = α MAP = 3.84 and β = β MAP = 7.46. Since the death of susceptible and exposed animals is mostly due to routine slaughtering, the survival probability S (k) (probability for an apparently healthy animal to survive at least until k years) corresponds to the probability of being slaughtered after the age of k years. It is derived from existing literature [9]. Estimation of the infection parameter. Our study is based on the yearly number of cases of BSE reported in Great Britain from 1989 until 2011 (Figure 1) provided by the World Organisation for Animal Health [19], denoted by X obs n for each year n. We set d = a m − 1 = 9 and denote by X obs n the d dimensional vector (X obs n , . . . , X obs n−d+1 ). We choose the first time n = 1 of the epidemic model such that the model with its initial values covers the period starting from 1989, which we consider here as a time-homogeneous period. Hence X −d+1 = X obs 1989 , and n = 1 corresponds to the year 1989 + d = 1998. We estimate θ 0 by the WCLSE (2.26) in model (1.1)-(3.3), where {p mat , α, β, {S(k)} k } are given by the previous values, and where X 0 = X obs 1997 . According to [19], |X 0 | = |X obs 1997 | = 167977. We are thus close to the asymptotic |X 0 | → ∞. The number of observations is n = 14. The estimator (2.26) provides the estimation θ obs |X0| = 2.4324. We point out that this estimation is of the same order of magnitude as the maximum a posteriori Bayesian estimation θ MAP = 2.43 based on the whole epidemic until 2007, assuming a uniform prior probability ( [15]). Using (2.28) we obtain the following confidence interval [ θ min , θ max ] with asymptotic probability 95%, where θ min := θ |X0| − 1.96 c −1 1 , θ max := θ |X0| + 1.96 c −1 1 , c 1 := ( n k=1 a·X k−1 /σ 2 ( θ |X0| )) 1/2 . Assuming α i = X obs 1997−i+1 |X obs 1997 | −1 , we get c obs 1 = 40.7343 (observed value of c 1 ), and P θ 0 ∈ θ min , θ max ≃ 95%, θ obs min = 2.3842, θ obs max = 2.4805. (3.5) Although this confidence interval is an asymptotic one, as |X 0 | → ∞, it is a very good approximation of the true confidence interval for a finite |X 0 |, since |X 0 | is here very large. Since the estimation of θ 0 relies on the values given to the other model parameters {p mat , α, β}, we evaluate in addition its sensitivity to the values of these parameters. For this purpose, we compute the estimation of θ 0 and the associated confidence interval with asymptotic probability 95%, for different values of (p mat , α, β) ( Table 1). The first line of the table corresponds to the parameters chosen for the model. In each of the four following lines, we fix two coordinates and choose an extremal (unrealistic) value for the third one. It appears that the estimation of θ 0 is almost independent of the value of the maternal infection parameter. However, the estimation seems more strongly dependent on the parameters of the latent period distribution. Nevertheless, even for very unrealistic values (α, β), all the estimations of θ 0 remain in the same order of magnitude of several units. This is really small compared to estimations obtained for the infection via Meat   and Bone Meal or lactoreplacers (before 1989) which are of the order of 1000 ( [15]). However, although these estimations are all very small, θ 0 seems non null. This could suggest the existence of a minor but non null infection source which is not of maternal type.
Extinction of the epidemic. We know thanks to Proposition 2.1 that {X n } n becomes extinct almost surely if and only if R 0 = d k=1 Ψ k (θ 0 ) 1. The estimated basic reproduction is here R 0 ( θ obs |X0| ) = 0.1072. Moreover, solving with a computing program the equation d k=1 Ψ k ( θ obs |X0| )ρ −k = 1, we obtain the following value for the Perron's root ρ( θ obs |X0| ) = 0.6665, which provides the speed of decay of the expected yearly incidence of cases (see (2.6)): from a certain time, the expected number of new cases will decrease from around 33% every year.
Prediction of the incidences of cases and incidences of infected cattle. Let us predict the spread of the disease from 2012 by means of simulations of {X n } n , where θ 0 is replaced by its previous estimation θ obs |X0| = 2.4324, and where the initial time of the model is 2011, that is X 0 = X obs 2011 . The simulations are done recursively using the transition law (3.1). We point out that the model initialized by X 0 = X obs 1997 provides quite realistic simulations on the period 1998-2011 compared to the real observations on the same period, as illustrated in Figure 2.1. The epidemic process {X n } n thus seems to provide a satisfying prediction of the overall evolution of the real epidemic. In order to predict the incidences of futures cases, we simulate 1000 trajectories of {X n } n initialized by the observed values X obs 2011 , with the estimated infection parameter θ obs |X0| = 2.4324. We illustrate in Figure 2.3, for each year from 2012, the maximum, minimum, median, 2.5% and 97.5% quantiles associated with these 1000 realizations. It is also relevant to study and predict the evolution of the incidence of infected cattle in the population, which represents the hidden face of the epidemic. The incidence Z n of infected cattle at time n, conditionally on the number X n of cases at that time, is given by the Poisson distribution (3.2). For every n 2012 and for each of the 1000 previously simulated values X n , we generate one realization of Z n . We then illustrate in Figure 2.4, the yearly maximum, minimum, median, 2.5% and 97.5% quantiles associated with the 1000 realizations.
Prediction of the year of extinction. Let T := 2011 + inf {n 1, X n = 0} denote the extinction year of the epidemic process {X n } n . According to (2.7) and denoting by f θ0 the offspring generating function of {X n } n defined in (2.2) (from now on we let the dependence in θ 0 appear in the notation), we have P θ0 (T 2011 + n) = f θ0,n (0)    (3.5). The values in bold character correspond to the p-quantiles n T p for p = 0.5, 0.95 and 0.99, and to the asymptotic confidence intervals of P θ0 T n T p based on (3.6).
f θ0 can be computed explicitly. We obtain in particular, for the estimated value θ obs |X0| = 2.4324, the following p-quantiles for the extinction time (see (2.8)): n T 0.5 = 2028, n T 0.95 = 2035 and n T 0.99 = 2039. Keeping in mind that T corresponds to the complete extinction of the epidemic (i.e. d = 9 consecutive years without any case), these results actually mean that for the infection parameter θ 0 = 2.4324, with probability larger than 50% (resp. 95% and 99%), no case will arise in the population from year 2020 (resp. 2027 and 2031). Moreover, in order to take into account the uncertainty around the estimation θ obs |X0| of the infection parameter θ 0 , we make use of the asymptotic confidence interval (3.5) of θ 0 and of the fact that θ → P θ (T n) is a decreasing function of θ, which implies that for every n 2011, P P θ0 (T n) ∈ P θmax (T n) , P θmin (T n) ≃ 95%. (3.6) We collect in Table 2 the observed interval [P θ obs max (T n) , P θ obs min (T n)], for each n 2020 (if n < 2019 we have, conditionally on the initial value X obs 2011 , P (X n = 0) = 0 because of the memory which is not equal to 0). Note that these intervals are very narrow, leading to an accurate estimation of P θ0 (T n).
Prediction of the epidemic size. Let N := T −2011 n=1 X n be the total size of the future epidemic from 2012 (total number of cases from 2012 until the extinction of the epidemic). We compute the distribution of N using (2.11), conditionally on the event {X 0 = X obs 2011 }. We obtain in particular, for the estimated value θ obs |X0| = 2.4324, the following p-quantiles for the total epidemic size (see (2.12)): n N 0.5 = 22, n N 0.95 = 31 and n N 0.99 = 35. From (2.13) we deduce that the mean value and variance of N for the parameter θ obs min (resp. θ obs max ) are 21 and 27 (resp. 22 and 28). Moreover, we obtain similarly as for the extinction time and thanks to (3.5) a confidence interval [P θmax (N n), P θmin (N n)] of P θ0 (N n), for every n ∈ N, with asymptotic probability 95%, collected in Table 3.
Study of the very late extinction case. In order to predict the behavior of the "most dangerous" evolution of the epidemic, we first need to compute the estimation θ * obs n (2.38) based on the data in Great Britain. The number of available observations is only n = 14, so we are far from the asymptotic setting n → ∞ of Theorem 2.7. However, the large value of |X 0 | can make us hope for a good accuracy. We point out that, by making use of the estimator θ * n on the real data, we make an unverifiable assumption on the future of the epidemic: we consider  Table 3: Cumulative distribution function of the total size of the epidemic for the infection parameters θ obs min = 2.3842 and θ obs max = 2.4805 defined by (3.5). The values in bold character correspond to the p-quantiles n N p for p = 0.5, 0.95 and 0.99, and to the asymptotic confidence intervals of P θ0 N n N p .
the data as if they were the beginning of a trajectory with very late extinction. This should have the following consequence: the estimation provided by θ * obs n should a priori be a bit smaller than the value 2.4324 provided by θ obs |X0| . Indeed we obtain θ * obs n = 2.4305. Using this value, we deduce from (2.42) the confidence interval [ θ * min , θ * max ] of θ 0 with asymptotic probability 95%, where θ * min := θ * n − 1.96 c −1 2 , θ * max := θ * n + 1.96 c −1 2 , and c 2 := n k=0 (f ′ ( θ * n , X * k )) 2 n k=0 (f ′ ( θ * n , X * k )) 2 g( θ * n , X * k ) Therefore, c obs 2 = 40.6988, and P θ 0 ∈ θ * min , θ * max ≃ 95%, θ * obs min = 2.3823, θ * obs max = 2.4787, (3.7) which is of the same magnitude order as the confidence interval [2.3842, 2.4805] obtained with the unconditioned process (see (3.5)). Let us predict the most dangerous evolution thanks to the transition law of the conditioned process given by Proposition 2.2. Note that if one computes the eigenvalues ρ and λ introduced in Proposition 2.5, one obtains ρ( θ * obs ) = 0.6664 and |λ( θ * obs )| = 0.5570. It thus appears that the convergence of the epidemic process conditioned on non-extinction at time k to the Q-process as k → ∞ is not very fast. As a consequence the study of the Q-process only provides information about the behaviour of the disease spread in the case of an extremely late extinction. First, we see thanks to Figure 3.1 that the simulations provided by the conditioned process initialized by X 0 = X obs 1997 , and where θ 0 is estimated by θ * obs n = 2.4305, are true to the real observations on the period 1999-2011. Figure 3.2 is an example of one simulation on the period 2012-2040 of the conditioned process, for X * 0 = X obs 2011 . It appears that the values of this simulated trajectory are rapidly very small, and of course are never equal to 0 for d = 9 consecutive times. For a finer prediction, we simulate 1000 realizations of this process from 2012 until 2050, with X * 0 = X obs 2011 . Moreover, for every n 2012 and for each of the 1000 simulated values X * n , we make one realization of the incidence Z n of infected cattle at time n, according to the law given by (3.2). Figures 3.3 and 3.4 represent the yearly maximum, minimum, median, 2.5% and 97.5% quantiles associated with the 1000 realizations of respectively, the incidence of cases and infected cattle, in case of an extreme late extinction. It appears thanks to Figures 3.3 and 3.4 that the supposedly "most dangerous" trajectories nevertheless do not reach high values and do not present a new peak of epidemic. Conclusion. The methodology developed in Section 2 thus applies well here and enables not only to confirm mathematically what is commonly accepted, namely that BSE is fading out, but also to predict, with a very large probability, that the last BSE case will occur before 2027, and that until the complete extinction of the epidemic in the population, there will be less than 31 cases to come. We obtain moreover from Figure 2.4 the order of magnitude of the number of infected cattle in the population. In addition, the estimation of the infection parameter concludes to the possible existence of a minor but non null infection source which is not of maternal type, and which is very small (only around 3 newly infected animal per year and per infective) compared to the main source of horizontal infection until 1988 due to protein supplements. Finally, the study of the worst-case scenario shows that even in the case of an extreme late extinction of the disease in the population, the incidence of cases will decrease quite rapidly to 0, with afterwards only 1 or 2 yearly cases occurring regularly but sparsely, with no appearance of a new peak of epidemic.
We have shown with this example that the methodology developed in Section 2 provides accurate tools to study the decay phase of an epidemic under the current sanitary measures, which would help to make new policy decisions. This evaluation is all the more relevant since it is obtained not by simply computing what should most probably happen, but also by taking into account the variability of many factors (infection, incubation, survival), and by studying the potentially most dangerous evolution.