A Bayesian Markov chain Monte Carlo method is used to infer parameters for an open stochastic epidemiological model: the Markovian susceptible-infected-recovered (SIR) model, which is suitable for modeling and simulating recurrent epidemics. This allows two major problems of inference that appear in many mechanistic population models to be explored. First, trajectories of these processes are often only partly observed. For example, during an epidemic the transmission process is only partly observable: infection times cannot be recorded, so only cases (infections) are available as observations. As a result, the number of individuals in the susceptible class must be imputed or reconstructed. Second, official reports of observations (cases in epidemiology) are typically not released as the cases are recorded but are aggregated over some temporal interval. To address these issues, this paper investigates the following problems. Parameter inference for a perfectly sampled open Markovian SIR model is considered first. Next, inference for an imperfectly observed sample path of the system is studied. Although this second problem has been solved for closed epidemics, it has proven quite difficult for open recurrent epidemics. Lastly, the statistical theory is applied to measles and pertussis epidemic time series data from 60 UK cities.
The linking of ecological theory with data is currently a major scientific challenge. Modern methods of data collection and storage are rapidly improving at all levels, from the detailed study of individuals in populations to the distribution of populations and communities over vast landscapes. Despite the ease with which it is possible to develop statistical theory and Bayesian Markov chain Monte Carlo (MCMC) computational statistics for many ecological problems [
In fact, many of these computational difficulties can be discussed using simple stochastic epidemiological models. Epidemiological processes serve as excellent prototypes for exhibiting two major problems of inference that appear in many mechanistic dynamic models. First, the transmission process during an epidemic is only partly observed. As a result, one records only cases and rarely observes the infection time precisely. Second, official reports of observations (cases in epidemiology) are typically not released as the cases are recorded but are aggregated over some temporal interval. Although these problems have largely been solved for closed epidemics, they have proven quite difficult for open populations that produce recurrent epidemics (endemic diseases) over many generations in continuous time. The difficulty is that simulated paths must be consistent with the data: within each of many recorded intervals, one must sample a path conditional on the number of infectives at the beginning and at the end of the interval. Inference has therefore generally proven easier for short-duration epidemics, where the computational burden of data augmentation remains manageable. As the number of recorded intervals increases, computing the data likelihood rapidly becomes intractable or impossible.
In this paper a data augmentation strategy is implemented that addresses these problems, is reasonably straightforward to implement, and is fast and accurate for the problem at hand. The method is based on a Bayesian MCMC algorithm recently proposed by Wilkinson [
Most previous work on the SIR using likelihood [
The inference method is applied to time series data for two endemic childhood diseases, pertussis and measles. It is shown how to reconstruct stochastic oscillations using simulations and model checking with respect to observed cases. Finally, the hypothesis of coherence resonance is investigated, and it is shown how it may account for some of the empirically observed patterns of stochastic oscillatory dynamics of the two endemic diseases.
In this paper a stochastic version of the Kermack-McKendrick susceptible-infectious-recovered (SIR) model [
Next consider an event-driven model of state change. Define
Structural representation of SI state-event transitions.
Event path | Parameter | Transition | Flow in node | Flow out node | Flow difference | Change | |||
0 | 1 | 1 | 0 | −1 | 1 | ||||
1 | 1 | 0 | 0 | 0 | 1 | 0 | |||
0 | 0 | 1 | 0 | −1 | 0 | ||||
0 | 0 | 0 | 1 | 0 | −1 | ||||
0 | 0 | 0 | 1 | 0 | −1 | ||||
1 | 0 | 1 | 0 | 0 | 0 | 1 | |||
0 | 0 | 0 | 0 | 0 | 0 |
Directed network for Markovian SIR dynamics. Each event pathway,
Figure
A two-hundred-and-fifty-week time series of the number of infected cases from the SIR immigration model discussed in this paper, simulated using the stochastic simulation algorithm (SSA) defined in the supplementary material.
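A time series of this kind can be illustrated with a generic Gillespie-type SSA. The following is only a minimal sketch, not the algorithm defined in the supplementary material: the event set (birth, contact infection, immigration of an infective, recovery, and death in each class), the rate forms, and all names (`simulate_sir_ssa`, `beta`, `gamma`, `mu`, `nu`) are assumptions made for illustration.

```python
import math
import random

# Assumed state changes (dS, dI, dR) for each event type; illustrative only.
EVENTS = [
    (+1, 0, 0),   # 0: birth of a susceptible
    (-1, +1, 0),  # 1: infection by contact
    (0, +1, 0),   # 2: immigration of an infective
    (0, -1, +1),  # 3: recovery
    (-1, 0, 0),   # 4: death of a susceptible
    (0, -1, 0),   # 5: death of an infective
    (0, 0, -1),   # 6: death of a recovered individual
]
NEW_INFECTIVE_EVENTS = (1, 2)  # events that add one infective


def simulate_sir_ssa(s0, i0, r0, beta, gamma, mu, nu, t_end, seed=0):
    """Simulate an open SIR jump process; time is measured in weeks.

    Returns the full event path [(t, S, I, R), ...] and the number of
    new infectives aggregated by week.
    """
    rng = random.Random(seed)
    t, s, i, r = 0.0, s0, i0, r0
    path = [(t, s, i, r)]
    weekly_cases = [0] * int(math.ceil(t_end))
    while True:
        n = s + i + r
        # Assumed rate forms, in the same order as EVENTS.
        rates = [
            mu * n,
            beta * s * i / n if n > 0 else 0.0,
            nu,
            gamma * i,
            mu * s,
            mu * i,
            mu * r,
        ]
        total = sum(rates)
        if total <= 0.0:
            break
        # Exponential waiting time to the next event.
        t += rng.expovariate(total)
        if t >= t_end:
            break
        # Choose the event with probability proportional to its rate.
        u = rng.random() * total
        k, acc = 0, rates[0]
        while u >= acc and k < len(rates) - 1:
            k += 1
            acc += rates[k]
        ds, di, dr = EVENTS[k]
        s, i, r = s + ds, i + di, r + dr
        path.append((t, s, i, r))
        if k in NEW_INFECTIVE_EVENTS:
            weekly_cases[int(t)] += 1
    return path, weekly_cases
```

Calling, for example, `simulate_sir_ssa(900, 10, 90, beta=3.7, gamma=0.25, mu=0.001, nu=0.1, t_end=20.0)` returns both the perfectly observed sample path and the weekly aggregated counts, which correspond to the two observation regimes contrasted in this paper. The numerical values in this call are placeholders rather than fitted parameters.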
It can also be shown using a factored form of the event function that one can sum over all transitions in the jump chain resulting from the Kolmogorov forward equation (KFE; see the appendix (Section 1.2)) to obtain,
The previous section dealt with the case of availability of perfect information for an observed sample path. In this section the case of imperfect information, such as when sample paths consist of data obtained on fixed recorded intervals, is considered using the output of the vector
For simplicity of notation, it is now assumed that the “true” sample path
Using the ratio of likelihoods,
Using a Poisson approximation yields a simple and very fast stochastic simulation algorithm (much faster than the standard SSA) in which probability functions are applied to deterministic flow rates. This essentially corresponds to computing Euler increments for the
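The idea of applying Poisson draws to deterministic flow rates can be sketched as a fixed-step (tau-leaping style) simulator. This is a hedged illustration under stated assumptions, not the paper's implementation: the event set, the rate forms, the capping of draws to keep counts non-negative, and the names `poisson_draw` and `poisson_step_sir` are all hypothetical.

```python
import math
import random


def _knuth_poisson(rng, lam):
    """Knuth's multiplication method; adequate for small means."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def poisson_draw(rng, lam, chunk=30.0):
    """Poisson variate with mean lam.

    Large means are split into chunks, using the fact that a sum of
    independent Poisson variates is Poisson with the summed mean.
    """
    total = 0
    while lam > chunk:
        total += _knuth_poisson(rng, chunk)
        lam -= chunk
    return total + _knuth_poisson(rng, lam)


def poisson_step_sir(s0, i0, r0, beta, gamma, mu, nu, dt, n_steps, seed=0):
    """Advance an open SIR model by Poisson-distributed Euler increments."""
    rng = random.Random(seed)
    s, i, r = s0, i0, r0
    out = [(0.0, s, i, r)]
    for step in range(1, n_steps + 1):
        n = s + i + r
        # Each increment is Poisson with mean (deterministic flow rate) * dt.
        force = beta * s * i / n if n > 0 else 0.0
        births = poisson_draw(rng, mu * n * dt)
        imports = poisson_draw(rng, nu * dt)
        # Outflows are capped so that no class becomes negative (an assumption).
        infections = min(poisson_draw(rng, force * dt), s)
        deaths_s = min(poisson_draw(rng, mu * s * dt), s - infections)
        recoveries = min(poisson_draw(rng, gamma * i * dt), i)
        deaths_i = min(poisson_draw(rng, mu * i * dt), i - recoveries)
        deaths_r = min(poisson_draw(rng, mu * r * dt), r)
        s += births - infections - deaths_s
        i += infections + imports - recoveries - deaths_i
        r += recoveries - deaths_r
        out.append((step * dt, s, i, r))
    return out
```

Because the cost per step does not grow with the number of individual events inside the step, a scheme of this kind is typically much cheaper than an exact SSA for large populations, at the price of a discretization error controlled by `dt`.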
MCMC implementation using this framework is reasonably straightforward (see [
Because numbers of susceptible cases are not available from direct observation they must be reconstructed from the epidemic data. For both the simulation and empirical estimation studies a simple reconstruction method [
It has long been a challenge in mathematical epidemiology to understand the recurrence of epidemic outbreaks and establish an appropriate model that allows studying this phenomenon [
Parameters were estimated for a time series simulated using the stochastic SIR immigration model described previously in this paper. The parameter vector used for
Baseline disease parameters.
Parameter | Disease 1 Value | Disease 2 Value |
---|---|---|
 | 50000 | 100000 |
 | 1000 wk | 1000 wk |
 | 4 wk | 2 wk |
 | 10 wk | 10 wk |
Using both infected and susceptible cases time series obtained from the simulations the parameters shown in Table
Posterior estimates of stochastic SIR model.
10000 weeks of observations: Disease 1

Target | Disease value | Posterior mean | Posterior standard deviation |
---|---|---|---|
 | 3.70 | 3.69 | .042 |
 | .25 | .250 | .001 |
 | .10 | .107 | .021 |
 | .001 | .001000 | .000004 |

1000 weeks of observations: Disease 2

Target | Disease value | Posterior mean | Posterior standard deviation |
---|---|---|---|
 | 14.7 | 14.70 | .015 |
 | .5 | .50 | .004 |
 | .10 | .170 | .136 |
 | .001 | .00100 | |

10000 weeks of observations: Disease 2

Target | Disease value | Posterior mean | Posterior standard deviation |
---|---|---|---|
 | 14.7 | 14.72 | .20 |
 | .5 | .50 | .007 |
 | .10 | .11 | .032 |
 | .001 | .00100 | |
Figure
Finally, it should be pointed out that nearly unbiased estimation of the SIR parameters
In this section parameters are estimated using time series data for 60 UK cities. Pertussis and measles data were obtained from case notification records of the UK Registrar General for England and Wales. Pertussis cases were reported weekly and measles cases biweekly. For both diseases, cases reported during the period 1944–1967 were analyzed. City sizes ranged from 10530 (Teignmouth) to 3249440 (London). Reported cases for three UK cities are shown in Figure
Reconstructed susceptible cases (based upon the method described in Section
Simulation of measles and pertussis as stochastic oscillators with comparison between exact (known) susceptible time series and reconstructed susceptible time series. Parameters used in the simulations are given in Table
Markov chain traces for 500 weeks of observations of Disease 2. The color panel shows how estimation of the migration rate,
Kernel density estimates for 500 weeks of observations of Disease 2. The kernel density estimate for migration rate,
Measles and pertussis reconstructed susceptible time series. Cases are on
(a) Measles and pertussis cases (
Figure
Measles and pertussis estimates of transmission (
Coherence resonance occurs when noise is amplified in an otherwise quiescent system by interaction of the underlying stochasticity of the dynamics with the oscillatory transients of the deterministic dynamics. What has been lacking thus far is a rigorous statistical approach that allows quantifying the theoretical expectations that drive this process using observed time series data. The method developed in this paper is now used to infer endemic sustained oscillations for noisy measles and pertussis epidemics via the mechanism of coherence resonance.
Kuske et al. [
Kuske et al. [
Assume that
The key results are as follows. Figure
Estimated variance of the stationary process from measles and pertussis time series. Plotted on the
Plot of estimated per year rate of infection (labeled gamma on the
Figure
This paper uses a parameter estimation method developed for mechanistic modeling of biochemical systems [
The application of computational and mathematical techniques from what has been called algorithmic systems biology [
Recent breakthroughs in automated estimation of rare event probabilities in biochemical systems [
In this paper a straightforward Bayesian MCMC methodology for inferring parameters of open SIR models using stochastic simulation is applied to both simulated and observed epidemic time series data. The methods described are general enough to be extended to more complex epidemiological scenarios, which is the goal of future work. This is useful because the efficient integration of complex likelihoods for population models is an object of intense ongoing research. Analysis of the data for the methodology developed in this paper is accomplished using standard Bayesian data analysis [
The results obtained in this paper show how pertussis and measles epidemics behave in the presence of demographic noise. Time series for 60 UK cities were used to estimate epidemiological parameters for these pathogens. A coherence resonance model was fit to the data to infer the role of multiscale effects in producing the period and amplitude of the epidemics. It was found that measles appears to fit the model rather well. Pertussis, however, does not seem to fit the model, and it is predicted that the separation between slow and fast time scales is not as strong for pertussis as it appears to be for measles. Therefore, one expects less coherent and less structured oscillations for pertussis, and more coherent and more structured oscillations for measles. The statistical theory developed in this paper was used to investigate coherence resonance of epidemics [
This work was partly funded by the University of Missouri and Duke University. The author would like to thank Helen Wearing and Pej Rohani for comments and suggestions on earlier versions of this paper.