
Epidemic percolation networks (EPNs) are directed random networks that can be used to analyze stochastic “Susceptible-Infectious-Removed” (SIR) and “Susceptible-Exposed-Infectious-Removed” (SEIR) epidemic models, unifying and generalizing previous uses of networks and branching processes to analyze mass-action and network-based S(E)IR models. This paper explains the fundamental concepts underlying the definition and use of EPNs, using them to build intuition about the final outcomes of epidemics. We then show how EPNs provide a novel and useful perspective on the design of vaccination strategies.

With the continual improvement of computing power, individual-based models of infectious disease spread have become more popular. These models allow us to incorporate stochastic effects and individual-scale detail in ways that cannot be captured in more traditional models. In this paper, we review a framework based on directed random networks that unifies a range of individual-based models in closed populations, simplifying their analysis. We then show how this framework provides a new and potentially useful perspective on the design of vaccination strategies.

Directed random networks that we call

For simplicity, we assume the entire population is susceptible to infection at the beginning of an epidemic. When one or more persons are infected, there are two possible outcomes. In a

The idea of using networks to represent the final outcomes of stochastic epidemic models developed separately for

Informally, a single realization of the EPN is generated by considering each individual

no edge between

a directed edge from

a directed edge from

an undirected edge between

An outgoing or undirected edge from

In the rest of this section, we formally define a general stochastic SEIR model, define its EPN, and describe the epidemic threshold in terms of the emergence of giant components in the EPN.

Consider a closed population of

An epidemic begins with one or more persons infected from outside the population, which we call

The general stochastic SEIR model can be turned into almost any standard epidemic model by choosing appropriate

In the stochastic Kermack-McKendrick SIR model for a population of size

In the network-based analogue of the Kermack-McKendrick model, infection is transmitted across the edges of a contact network. It has the same infectious period distribution as the mass-action model but a constant hazard

The general stochastic epidemic model is

First, we can sample “on the fly” for each new infection

Second, we can sample

Sampling on the fly is more efficient if the goal is to produce just a single epidemic realization, but sampling

For each individual

sample a latent period

sample an infectious period

for each

For each pair of individuals

if

if

if

if
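The steps above can be sketched for one concrete special case. The following is a minimal illustration, not the authors' implementation. It assumes an exponential infectious period with rate `gamma` and a constant hazard of infectious contact `beta / (n - 1)` between each ordered pair. It omits latent periods, which under this assumed model shift the timing of contacts but not which contacts occur. The function names `generate_epn` and `edge_type` and all parameter values are hypothetical.

```python
import math
import random


def generate_epn(n, beta, gamma, seed=None):
    """Return one EPN realization as a dict: node -> set of out-neighbors.

    Assumed model (an illustration, not taken from the text): infectious
    periods are exponential with rate gamma, and an infectious node makes
    infectious contact with each other node at hazard beta / (n - 1).
    """
    rng = random.Random(seed)
    out_edges = {i: set() for i in range(n)}
    for i in range(n):
        r_i = rng.expovariate(gamma)  # infectious period of node i
        # Probability that i makes infectious contact with a given j
        # at some point during its infectious period.
        p_i = 1.0 - math.exp(-beta * r_i / (n - 1))
        for j in range(n):
            if j != i and rng.random() < p_i:
                out_edges[i].add(j)
    return out_edges


def edge_type(out_edges, i, j):
    """Classify the pair {i, j} into the four cases for a pair of nodes."""
    i_to_j = j in out_edges[i]
    j_to_i = i in out_edges[j]
    if i_to_j and j_to_i:
        return "undirected"
    if i_to_j:
        return "i->j"
    if j_to_i:
        return "j->i"
    return "none"


epn = generate_epn(n=200, beta=2.0, gamma=1.0, seed=1)
mean_out = sum(len(nbrs) for nbrs in epn.values()) / len(epn)
print(f"mean out-degree: {mean_out:.2f}")
```

Storing an undirected edge as a pair of opposite directed edges keeps the data structure simple, and component calculations treat the two representations identically. The double loop costs O(n²) time, so this sketch is suitable only for small populations.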

The time homogeneity assumption guarantees that

The most important properties of the EPN are its degree distribution and its component size distributions. The

This property of the EPN has several useful consequences. The epidemic threshold of an S(E)IR model corresponds to the emergence of

Schematic diagram of the giant components of an EPN. Note that the GIN and GOUT both include the GSCC. Tendrils are directed paths out of the GIN or into the GOUT that do not enter or leave the GSCC; a tube is a tendril that goes from the GIN to the GOUT. An initial infection in the GIN will lead to the infection of the entire GOUT (including the GSCC). If the initial infection is outside the GSCC, it will also infect a few tendrils or tubes and a few nodes in the GIN outside the GSCC. Since these are small components, their existence does not affect the calculation of the asymptotic probability and attack rate of a major epidemic. Adapted from [
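The bow-tie decomposition in this diagram can be computed directly from one EPN realization. The sketch below is illustrative only. It assumes the EPN is stored as a dict mapping every node to its set of out-neighbors, with undirected edges stored as two opposite directed edges. It finds the largest strongly connected component with Kosaraju's two-pass algorithm, chosen here for brevity. The helper names and the toy graph are hypothetical.

```python
def reachable(adj, sources):
    """All nodes reachable from `sources` along directed edges of `adj`."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def reverse(adj):
    """Graph with every edge direction flipped."""
    rev = {u: set() for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev[v].add(u)
    return rev


def largest_scc(adj):
    """Largest strongly connected component (iterative Kosaraju)."""
    # Pass 1: list nodes in order of depth-first finishing time.
    order, seen = [], set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(adj[root]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(adj[v])))
                    break
            else:  # all out-neighbors of u explored
                order.append(u)
                stack.pop()
    # Pass 2: sweep the reversed graph in decreasing finishing time.
    rev, assigned, best = reverse(adj), set(), set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        comp, stack = {root}, [root]
        while stack:
            u = stack.pop()
            for v in rev[u]:
                if v not in assigned:
                    assigned.add(v)
                    comp.add(v)
                    stack.append(v)
        if len(comp) > len(best):
            best = comp
    return best


def bow_tie(adj):
    """Return (GIN, GSCC, GOUT); GIN and GOUT both contain the GSCC."""
    gscc = largest_scc(adj)
    gout = reachable(adj, gscc)           # GSCC plus everything downstream
    gin = reachable(reverse(adj), gscc)   # GSCC plus everything upstream
    return gin, gscc, gout


# Hypothetical toy graph: 0 feeds the cycle 1 -> 2 -> 3 -> 1, which feeds 4.
toy = {0: {1}, 1: {2}, 2: {3}, 3: {1, 4}, 4: set(), 5: set()}
gin, gscc, gout = bow_tie(toy)
print("GIN fraction: ", len(gin) / len(toy))
print("GOUT fraction:", len(gout) / len(toy))
```

In a graph this small the word "giant" has no meaning, so the toy only checks the bookkeeping. The GIN and GOUT fractions are interpretable as a major epidemic probability and attack rate only for a large EPN above the epidemic threshold.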

In the limit of large

To demonstrate the accuracy of the EPN framework, we compare theoretical predictions of the probability and attack rate of a major epidemic based on EPNs with observations from a series of simulations of mass-action and network-based models. These simulations were implemented in Python (

In mass-action SEIR models like Example

For mass-action models with independent infectiousness and susceptibility, the outbreak size distribution and the probability and attack rate of a major epidemic can be predicted using branching process approximations that become exact in the limit of large

Figure

Major epidemic probabilities and attack rates in the mass-action models from Section

In a network-based SEIR model, infection is transmitted across the edges of a contact network. For network-based models, analysis via EPNs can be seen as a generalization of analysis via bond percolation models, first used to calculate the attack rate of a major epidemic [

To illustrate this approach and its limitations, we generalize the network-based Kermack-McKendrick model from Example

Consider a network-based Kermack-McKendrick model that has an exponential infectious period with mean one. The probability that a single initial infection with 2 neighbors fails to transmit infection is

In this class of models, the bond percolation model overestimates the probability of a major epidemic whenever there is a variable infectious period [

For a given

Major epidemic probabilities and attack rates in network-based models from Section

In these examples and the models considered in [

Most stochastic simulations of epidemic spread provide dynamic information about the spread of an epidemic. In contrast, a realization of an EPN is a static object, so many more mathematical tools are available to analyze it. To calculate the probability of a major epidemic, it suffices to calculate the proportion of nodes in the GIN. To calculate the attack rate, it suffices to calculate the proportion in the GOUT. In the infinite population limit, this is equivalent to calculating the probability that the EPN has an infinite path directed into or out of a randomly chosen node. This justifies the branching process approximation for mass-action models, and a similar approach is appropriate in networks without short cycles. These lead to analyses based on probability generating functions for the bond percolation model [

More generally, however, we cannot use probability generating functions when the branches of the initial spread of infection intersect with asymptotically nonzero probability. Thus, we need different approaches to calculate the size and probability of an epidemic on a network with short cycles. This can be done numerically with EPNs as described below, but we can also use EPNs to make rigorous statements about the disease spread.

For example, if we want to analyze the impact a single individual has on an epidemic, a standard stochastic model would require many simulations. With an EPN approach, we are able to generate a realization of the EPN, including edges between all other nodes, and then consider the impact of each possible edge involving the targeted individual. This approach was used in [

EPNs are a powerful numerical tool for the simulation and analysis of epidemics. Traditionally, the probability and attack rate of a major epidemic in an S(E)IR model are estimated by running the model repeatedly. For each run, we record whether a major epidemic occurred and, if so, its attack rate. Whether a major epidemic occurs is a Bernoulli trial: each run of the model is like a single coin flip. When a major epidemic does occur, its size varies little from run to run. Thus, repeated simulation yields an accurate estimate of the attack rate with far fewer runs than it needs for an accurate estimate of the probability. In a model with a sufficiently large population, the probability and attack rate of a major epidemic can be calculated with equal precision from a single realization of the EPN. Tarjan's algorithm [

In Figure

A comparison of the predictions from a single EPN with 50 simulations for three different epidemic processes on an Erdős-Rényi network of 50,000 nodes with average degree 5. The EPN results (large symbols) closely match the calculated predictions in the asymptotic limit. The simulated results (small symbols) compare well for size but poorly for probability. Generating an EPN is a much more efficient numerical method for estimating the probability of a major epidemic than simulation.
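The poor performance of 50 simulations for the probability is what binomial sampling error would suggest. As a rough worked estimate, assume the runs are independent and take a hypothetical true probability $p = 0.5$ (an illustrative value, not one reported here). The standard error of the estimated probability $\hat{p}$ from $m$ runs is then:

```latex
\operatorname{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{m}}
  = \sqrt{\frac{0.5 \times 0.5}{50}}
  = \sqrt{0.005}
  \approx 0.071
```

Under these assumptions, an approximate 95% interval spans about $\pm 0.14$ around the estimate, and halving that width would take four times as many runs.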

Although our attention has focused on static quantities such as the probability or size of major epidemics, EPNs can also be used to calculate the dynamic spread of an epidemic. Returning to the generation algorithm described in Section

In this section, we show how EPNs provide a useful guide to the design of efficient vaccination strategies in mass-action and network-based SEIR models. For simplicity, we assume that we have a perfect vaccine that makes its recipients immune to infection. The effect of the vaccine can be represented by erasing all incoming, outgoing, and undirected edges incident to each vaccinated node in the EPN. Since a major epidemic is possible if and only if there is a GSCC, we hypothesized that vaccination should be targeted to nodes with a high probability of inclusion in the GSCC and a high number of connections to nodes in the GSCC.
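The edge-erasing representation of a perfect vaccine is straightforward to express in code. The sketch below is a minimal illustration. It assumes the EPN is stored as a dict mapping each node to its set of out-neighbors, with undirected edges stored as two opposite directed edges. The function name `vaccinate` and the toy network are hypothetical.

```python
def vaccinate(out_edges, vaccinated):
    """Return a copy of the EPN with every edge that touches a
    vaccinated node erased. Vaccinated nodes stay in the dict as
    isolated nodes so that population fractions keep the same
    denominator."""
    vaccinated = set(vaccinated)
    return {
        u: set() if u in vaccinated
        else {v for v in nbrs if v not in vaccinated}
        for u, nbrs in out_edges.items()
    }


# Hypothetical toy EPN: the cycle 0 -> 1 -> 2 -> 0, plus the edge 2 -> 3.
toy_epn = {0: {1}, 1: {2}, 2: {0, 3}, 3: set()}
after = vaccinate(toy_epn, {2})
print(after)
```

Vaccinating node 2 removes its outgoing edges and the incoming edge from node 1, which breaks the only cycle in the toy network. Recomputing the giant components of the returned network gives the post-vaccination probability and attack rate for that realization.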

To test the effect of this targeting strategy in a mass-action model, we created a mass-action model with three subpopulations, A, B, and C, of equal size. Subpopulation A had high infectiousness but low susceptibility, subpopulation B had average infectiousness and susceptibility, and subpopulation C had low infectiousness but high susceptibility. Within subpopulation A, each node had a relative susceptibility that was exponentially distributed with mean one. Within subpopulation C, each node had a relative infectiousness that was exponentially distributed with mean one. Subpopulation A had the highest probability of being in the GIN, subpopulation B had the highest probability of being in the GSCC, and subpopulation C had the highest probability of being in the GOUT. With no vaccination,

Summary of the mass-action model from Section

Subpopulation | A | B | C |
---|---|---|---|
Mean outdegree (infectiousness) | 5 | 2.5 | 1.25 |
Mean indegree (susceptibility) | 1.25 | 2.5 | 5 |
Pr(causes epidemic) | 0.951 | 0.779 | 0.430 |
Pr(infected in epidemic) | 0.430 | 0.779 | 0.951 |
Pr(in GSCC) | 0.409 | 0.607 | 0.409 |
Mean degree within GSCC | 0.835 | 0.942 | 0.835 |

The results of the three vaccination strategies are shown in Figures

Effects of vaccination on the probability and attack rate of a major epidemic in the mass-action model from Section

Effects of vaccination on

The standard approach to vaccination targeting in network-based models is to target nodes with high degree in the contact network [

To compare the two vaccination strategies, we estimated the probability and attack rate of a major epidemic as a function of the vaccination fraction

When all nodes have the same infectiousness and susceptibility, we expect to see no difference between the two strategies because degree in the contact network is the only determinant of a node's probability of being in the GSCC. To represent the effects of variation in susceptibility and infectiousness, we allowed the transmission probability from node

The results of the comparison for a model with independent infectiousness and susceptibility are shown in Figure

Comparison of targeting by contact network degree (lines) and targeting the GSCC (circles) in the network-based model from Section

Comparison of targeting by contact network degree (lines) and targeting the GSCC (circles) in the network-based models from Section

EPNs provide a very useful intuitive point of view when thinking about the behavior of stochastic SEIR epidemic models. The “bow-tie” diagram in Figure

The ultimate outcome of an epidemic does not depend on where it starts.

The probability and attack rate of an epidemic must be both zero or both positive, so there is a single epidemic threshold.

In general, vaccinating the highly infectious (i.e., those likely to be in the GIN) will reduce the probability of a major epidemic, and vaccinating the highly susceptible (i.e., those likely to be in the GOUT) will reduce its attack rate. Vaccinating those likely to be in the GSCC will reduce both.

The ideal vaccination targets are not necessarily the most infectious or the most susceptible individuals. Instead, they are those individuals with the right combination of infectiousness and susceptibility to be effective receivers and transmitters of infection. It is precisely these nodes that hold together the GSCC of the EPN. In Section

The primary limitation of EPNs is that they are defined only for time-homogeneous SEIR models. They cannot accurately represent the final outcomes of complex, time-dependent SEIR models and interventions. For example, they cannot accurately represent seasonality, the effects of changing behavior or demographics, or the effects of an intervention that is implemented only when a certain prevalence of infection is reached. The vaccination strategies in Section

Nonetheless, EPNs generalize earlier approaches to the analysis of mass-action and network-based models, providing a simple unified framework for the analysis and implementation of time-homogeneous S(E)IR models. They are powerful theoretical and practical tools, and they represent an important application of networks in infectious disease epidemiology.

The authors thank Aric Hagberg and Leon Arriola for their comments on the vaccination strategy research, which was done primarily at the Los Alamos Mathematical Modeling and Analysis Summer Institute in 2007 at Los Alamos National Laboratory (LANL). E. Kenah was supported by National Institutes of Health (NIH) cooperative agreement 5U01GM076497 at the Harvard School of Public Health and National Institute of General Medical Sciences (NIGMS) grant F32GM085945 at the University of Washington. Office space and administrative support for E. Kenah was provided by the Fred Hutchinson Cancer Research Center. J. C. Miller was supported by the Department of Energy (DOE) at LANL under contract DE-AC52-06NA25397 and the DOE office of the ASCR programme in Applied Mathematical Sciences, by the RAPIDD programme of the Science and Technology Directorate, Department of Homeland Security and the Fogarty International Center, National Institutes of Health, and by the Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard School of Public Health under NIGMS award Number U54GM088558. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS or NIH.