Improved Estimation of the Initial Number of Susceptible Individuals in the General Stochastic Epidemic Model Using Penalized Likelihood

The initial size of a completely susceptible population in a group of individuals plays a key role in drawing inferences for epidemic models. However, this can be difficult to obtain in practice because, in any population, there might be individuals who may not transmit the disease during the epidemic. This short note describes how to improve the maximum likelihood estimators of the infection rate and the initial number of susceptible individuals and provides their approximate Hessian matrix for the general stochastic epidemic model by using the concept of the penalized likelihood function. The simulations of major epidemics show significant improvements in performance in averages and coverage ratios for the suggested estimator of the initial number in comparison to existing methods. We applied the proposed method to the Abakaliki smallpox data.


Introduction
In any epidemic in a group of individuals, there is a subgroup of individuals who are not susceptible to a disease, that is, those immune to the disease naturally or by vaccination, as well as those not exposed to the disease owing to physical separation or other reasons. Therefore, estimation of the size of the initially susceptible population in the group might be pivotal; for example, see [1]. For the general stochastic epidemic model, [2][3][4] have dealt with estimating the initial number.
For the case where an epidemic is observed fully over a given time interval such that all infection and removal times are known, [2] provided martingale estimating equations to propose an estimator called the M-estimator of the initial number of susceptible individuals and its approximate variance. However, the M-estimator does not have the property of consistent coverage ratios for confidence intervals of the initial number of susceptible individuals. For the same conditions, [4] derived a likelihood function using the counting process theory after [5] to yield the maximum likelihood estimator called the k-MLE and its approximate variance.
However, this likelihood function does not coincide with that given by [5] but with that obtained by using the survival function method of two earlier studies [6,7]. The k-MLE better improves coverage ratios for confidence intervals than the M-estimator, but the problem of inconsistent coverage ratios remains. [3] extended the martingale procedure of [2] when only the removal times are observed. See [8] for a summary of the likelihoods of the completely observed data given parameters under the various model setups for the general stochastic epidemic model adopted by many researchers such as [5,6,9,10].
Here, the first approach to improving the estimator of the initial number of susceptible individuals was to employ the likelihood function of [5]. A system of equations was derived from the log-likelihood function to find the MLEs of the infection rate and the initial number of susceptible individuals, and a normal limiting distribution was assumed to propose a corresponding approximate Hessian matrix. However, because simulations for the MLE give unstable results such as infinite values for the estimate of the initial number and low coverage ratios for confidence intervals such as the M-estimator, a method of penalized likelihood 2 The Scientific World Journal function is proposed. See [11] for an example of estimation using a penalized likelihood function.
Simulations were conducted to compare the proposed maximum penalized likelihood estimator called the p-MLE with the k-MLE and the M-estimator of the initial number of susceptible individuals. Then, the proposed method was applied to the Abakaliki smallpox data from Nigeria to compare results with the findings of [2,4].
The rest of this paper is organized as follows: Section 2 presents the notations and the general stochastic epidemic model. Section 3 describes the estimation methods. Section 4 presents the simulation results. Section 5 considers a numerical example, and Section 6 concludes with a discussion and concluding remarks.

Notations and the General Stochastic Epidemic Model
Notations very similar to those of [2,4] were adopted for the spread of a susceptible-infected-removed infectious disease in a population whose individuals are mixing homogeneously. Suppose that the epidemic is observed over the time interval [0, ] in the population, whose size is ] . Note that when an individual becomes infected, the individual is assumed to be immediately infectious. Given ( ) and ( ), assume that the probability of a susceptible individual becoming infected and that of an infectious individual being removed within a small time interval ( , + ℎ] are given by /] ( ) ( )ℎ + (ℎ) and ( )ℎ + (ℎ), respectively, such that the transition probability is The correction term (ℎ) becomes negligible when ℎ is small; that is, (ℎ)/ℎ → 0. The process ( ), 0 ≤ ≤ , is assumed not to be observed, such that ] is not observable. However, the process ( ), 0 ≤ ≤ , is fully observable, such that the times at which individuals become infected are observable. The observation here includes the infection time for the infected individual and his or her removal time. Let = ( 1 , 2 , . . . , ) be the ordered successive infection times observed over (0, ]. As indicated in [2], the number of individuals infected in [0, ] who are still susceptible at time can be observed; that is, ( ) = ( ) − ( −). Note that although ( ) depends only on infection and removal times, ( ) depends on ], as well as infection and removal times.

Derivation of the Estimators
First, consider the likelihood function of the parameters and ] according to [5] where ( − ) and ( − ) denote the situation just prior to time . The likelihood function (2) differs from that given by [4] ( which is obtained under the same conditions as (2), except that the epidemic process is observed until all infectious individuals are removed, such that the likelihood function can be interpreted as obtained by observing the process over the interval ( 1 , ∞) for and ]. Note that the likelihood functions (2) and (3) are both derived using the definition of a likelihood function in statistical physics based on the counting process theory.
Here, inferences are drawn for ] when the infection process ( ) is observed over a fixed time interval (0, ], which is a relaxation of the condition in [2,4], where the epidemic is observed until it ceases. The log-likelihood function is given by taking the logarithm of (2): Here, we use the relation ( − ) = ( )−1. Note that the total number of susceptible individuals at time is just ( ) plus the number of susceptible individuals not infected by : where ( ) = ( ) − ( − ) denotes the number of individuals infected in (0, ] and still susceptible at time . The Scientific World Journal 3 With (5) substituted into (4) to use information on those infected at the infection time, the following is obtained: where Let the first partial derivatives of the log-likelihood function (4) with respect to and ] be 0; thus, the system of two nonlinear equations is where Because there are no terms of in (9), (9) can be solved with respect to ] to obtain the MLE] of ]. Then, the MLE] of ] can be plugged into (8) to get the maximum likelihood estimator of :̂=] which is the same as the M-estimator of in [2]. The two nonlinear equations are solved here separately to get solutions, whereas [4] maximizes his log-likelihood function of variables ] and .
The Hessian matrix can be approximated by assuming that the limiting distribution of̂and] follows a normal distribution. Equation (9) does not have a finite solution for some values of the observations of . To show this, we can first rewrite ℓ 2 (]) as by using the relationships ( ) = − + 1 and ( ) = + . It is clear that when 1 / 2 > − 1, ℓ 2 (]) becomes positive such that the solution of (9) should be infinity. Here, we assumed ] > without loss of generality.
When the number of the initial susceptible individuals is not large enough, the number of simulated epidemics for which the estimates of ] that did not exist cannot be ignored (Table 1). Therefore a method for improving the maximum likelihood estimator] is proposed by considering a penalized likelihood of (6): where pen (] | ) = 2 (log + 1 (]) − log 1 (])) + log 2 (], 1) , for + 1 (]) = 1 + 2 ( + ] − ( ) + 1), which modifies (9) to Note that + 1 (]) is a modification of 1 (]) to make ℓ 2 (]) slightly bigger, and 1/ 2 (V, 1) is subtracted from ℓ 2 (]) to make it slightly smaller. The penalty function pen(] | ) is heuristically chosen to penalize a large value of the estimate of ] for the log-likelihood (6). It can be shown that the denominator of the first derivative of pen(] | ) with respect to ] is positive and that the numerator is given in the quadratic equation form ] 2 + 1 ] + 2 for constants 1 and 2 such that pen(] | ) increases as ] increases with ] > ] 0 for some finite value ] 0 > 0. See [11] for a discussion in choosing a penalty function. Let the estimators of ] and obtained by solving (8) and (16) be] and̂, respectively, and be called the p-MLE.

4
The Scientific World Journal

Simulations
A simulation study very similar to that of [2,4] was conducted to compare the efficiency of the p-MLE relative to the k-MLE and the M-estimate. Here, populations of ] = 100, 250, 1000, and 5000 susceptible individuals and = 5 initial infectious individuals are considered. In this simulation, = 1 and = 1.5 and 1.3 were taken. Results were conditional on a major epidemic, and, therefore, following [2], only simulated epidemics with more than 20% of infected individuals were considered. For = 1.3, epidemics with more than 40% of infected individuals were considered. The value of was set to ∞ to compare simulation results with the findings of [2,4]. For each combination of parameters, 1000 epidemics were simulated.
Among the 1000 simulations, there were no cases in which the estimate] was infinite; this held true for the k-MLE as well, as in [4]. In the comparison with the k-MLE and the Mestimator, there were substantial improvements in coverage ratios and averages of estimates of] . The coverage ratios for the p-MLE were quite stable in comparison to those for the k-MLE and the M-estimator. An increase in the value of ] increased the coverage ratio such that all coverage ratios for ] = 5000 were close to the true confidence coefficient 0.95.
The average estimates of] were closer to the true value than those of] , the k-MLE, for each combination of parameters. The standard deviations sd(] ) and the average of estimated approximate standard errors av(ŝe(] )) of 1000 estimates of] were quite close to each other in all cases. However, this was not the case for the k-MLE and the Mestimator.
In addition, the averages of 1000 estimates of̂tended to increase to the true value with an increase in ], and the standard deviations of estimates sd(̂) and the average of estimated approximate standard errors av(ŝe(̂)) were also quite close to each other in all cases.

Application
The proposed method was applied to the Abakaliki smallpox data from Nigeria. [12] provided 29 infection times for infected individuals and the number of infectious individuals on each day of the epidemic in Abakaliki. The infectious period is assumed to be fixed at 7 days for every individual, and the latent period fixed at 13 days. The estimateŝand ] , as well as corresponding standard errors, were obtained ( Table 3). The function nleqslv in was used to solve (16) for ] . The results indicated that both the estimate and estimated approximate standard error for] were less than those for ] and] . The estimate (33.88) and estimated approximate standard error for] (4.13) were close to those for] (35.27 and 6.70, resp.) but lower than those for] (42.12 and 37.15, resp.). The estimate of initial susceptible individuals was much lower than 120, the population size. Because of the assumption of homogeneous mixing between individuals, as in earlier studies [2,4,12,13], this estimate was interpreted assuming that a number of individuals were not susceptible to the disease owing to natural immunity, vaccination, or isolation. Figure 1 shows the shape of the log-likelihood function of and ], which is similar to that in [4], and Figure 2 presents the gradient of the likelihood function of ] for the smallpox data.

Discussion and Conclusions
In any epidemic, the estimation of the initial number of susceptible individuals is of great interest. Although [5] derived a log-likelihood function for the initial number and the infection rate when the epidemic is fully observed over a given time interval, no study has investigated the properties of the MLEs. [4] considered the log-likelihood function to The Scientific World Journal 5   be obtained when the epidemic process is observed from the time of the first infection to the time when all infectious individuals are removed and derived the MLE of parameters of interest. [2] used a martingale framework to propose an estimator. The present study used the log-likelihood function of [5] and the relationship between and of [2] to derive a system of equations for parameters and solved the system to obtain MLEs of parameters, instead of finding them as [4] did, by maximizing the log-likelihood function. An approximate Hessian matrix of estimators] and̂was derived based on the assumption that the limiting distribution of] andf ollows a normal distribution. The derivation of the limiting distribution of̂and] can be a challenge.
Our modification of (9) can be considered the same as penalizing the likelihood function (4) with a suitable penalizing factor. Because there are various penalizing methods, another penalized likelihood function can be attempted for better estimators in a future study.
The simulations for the p-MLE provide a more stable result than the M-estimator and the k-MLE for unbiasedness, standard errors, and coverage ratios, and therefore, the proposed method can be used as a more reliable tool for estimating the initial number of susceptible individuals in a population.

Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.