A Spatial-Temporal ARMA Model of the Incidence of Hand , Foot , and Mouth Disease in Wenzhou , China

and Applied Analysis 3


Introduction
Hand, foot, and mouth disease (HFMD) is a common infectious disease which usually affects children, particularly those less than 5 years old and younger.It is characterized by a distinct clinical presentation of fever or vesicular exanthema on their hands, feet, mouths, or buttocks [1][2][3][4][5].The transmission of HFMD occurs from person to person through direct contact with saliva, faeces, vesicular fluid, or respiratory droplets of an infected person and indirectly by contaminated articles [1].After a susceptible individual is infected he firstly enters the incubation period of HFMD, which is about 3 to 7 days.After the incubation period, the infected will show some clinical symptoms, such as having a fever, poor appetite, malaise, and sore throat, and few people may develop dehydration, febrile seizures, encephalitis, meningitis, cardiomyopathy, and so forth.And the infected people will fully recover after 7 to 10 days [1].At present, there are still no available effective vaccines or drugs against HFMD human use, but such vaccines are being developed [6].In 2012, for instance, an epidemic in mainland China involved 2,168,737 cases and 567 deaths [2].Particularly, in 2012, there were 147,941 HFMD cases and 17 deaths in Zhejiang province, and it ranks the first in the "Ten legal infectious disease" [7].HFMD has become an emerging public health concern in the affected countries and a focus of increasing amounts of research [4].Therefore, it is important to use mathematical models to improve our understanding of infectious disease dynamics of HFMD and to help us develop preventive measures to control infection spread qualitatively and quantitatively.
There are several types of analytical models that are valuable to understand and predict the transmission of HFMD.One is compartmental differential equation model [8][9][10][11][12][13][14][15], which is important to understand the spread dynamics of HFMD among the susceptible populations and to enable policy makers to take effective measures to curb the disease spread and reduce the adverse impact of the disease [9,10].The other is statistical models which can help us find novel information concerning pathogen detection and some probable coinfection factors in HFMD and have been applied to understand HFMD's spatiotemporal transmission and discover the relationship between HFMD occurrence and climate [3,[16][17][18][19][20][21][22][23][24][25][26][27][28][29].Of them, Hu et al. [4] explored the spatial association of HFMD incidence with several potential determinants and found that child population density and climate factors are potential determinants of the HFMD incidence in most areas in China.Liu et al. [5] conducted 2 Abstract and Applied Analysis a spatial and space-time scan statistics analysis in Shandong province to explore the distribution characteristics and detect spatial and spatial-temporal clusters (hotspots) of HFMD cases and found that the incidence seasonal peak was between April and July.Deng et al. [28] analyzed the epidemic characteristic of HFMD in Guangdong province and detected spatial-temporal clusters and found that climate factors and demographic changes might be contributors affecting the epidemic situation of HFMD.All these studies increase our understanding of the distribution and severity of the disease.However, potential factors influencing the incidence of HFMD remain little understood.
Wenzhou is a prefecture-level city in southeastern Zhejiang province in China.At the time of the 2010 Chinese census, 9,122,100 people lived in Wenzhou [30].It includes 3 municipal districts (Lucheng, Longwan, and Ouhai) and 8 counties (Cangnan, Dongtou, Pingyang, Ruian, Taishun, Wencheng, Yongjia, and Yueqing) with a total land area of 11,784 square kilometers (Figure 1).Since Wenzhou has a humid subtropical climate zone with an annual average 18.08 ∘ C (64.5 ∘ F), it is of particular public health significance to update molecular epidemiology of HFMD in Wenzhou.It is reported that, in 2012, there were 41,438 HFMD cases and 17 deaths in Wenzhou [7].
In [15], the authors established an SEIQRS epidemic model with periodic transmission rate to investigate the spread of seasonal HFMD in Wenzhou and found that the HFMD becomes an endemic disease in Wenzhou and for controlling the HFMD spread, it is beneficial to increase the quarantined rate or decrease the treatment cycle.However, spatial effect has not been considered in [15].And in [4,5,28], the authors utilized spatial models to determine the risk factors of the incidence of HFMD and found that monthly average temperature, relative humidity, and total sunshine were important factors to affect the incidence of HFMD.However, their models have low power of prediction because future incidence rate is dependent on future risk factors.
The goal of this paper is to explore a model that could describe spatial-temporal effects and that has the ability to forecast.Unlike [4,5,28], we do not take the risk factors into account.Based on the monthly observed data of HFMD in each county of Wenzhou [7], we present a spatial-temporal autoregressive moving average (ARMA) model and give a general Bayesian framework for parameter estimation.
The paper is organized as follows.In Section 2, we give a method to preprocess the original data and propose a spatialtemporal ARMA model.In Section 3, the HFMD data from the Wenzhou Center for Disease Control and Prevention is analyzed based on descriptive statistics and spatial-temporal ARMA model.Finally, we give a brief discussion.

Data Collection and Preprocessing. Data of HFMD in
Wenzhou during 2010 to 2012 were collected by Wenzhou Center for Disease Control and Prevention, including the basic demographic characteristics of HFMD cases, the pathogen type (EV71, CoxA16, and other EV) of some HFMD cases, and incident child cases in each county per month.According to the records obtained, all the cases were children aged between 0 years and 14 years, with county-specific case numbers varying from 0 to 1852 incident child cases.
The HFMD dataset lists the number of incident cases in each county per month.It reflects the occurrence of the disease in different regions.However, it cannot reflect the risk of infecting the disease because of the different population sizes among counties or municipal districts.To reduce the influence of population size, cumulative incidence (CI) is utilized to reflect the risks of infecting HFMD in each county.It measures the disease frequency during a period of time [31].We denote   ,   , and   as the CI of HFMD, the number of cases, and the child population size at the time  in the th county, respectively.Then   =   /  .In our case,  = 1, 2, . . ., 36,  = 1, 2, . . ., 11.However,   cannot be used to compare the disease risk between different counties directly because of random effects.Some counties reported 0 HFMD cases in some months; however, it could not say that there is no risk of HFMD in these counties.In fact, they had HFMD cases in other months.Thus, we use a hierarchical Bayes model to adjust CI [32].The spirit of the idea to improve accuracy in the model is smoothing strength among counties.The model is as follow: where   is the overall level of the disease risk at time , ]  is the uncorrelated heterogeneity following normal distribution with mean 0 and variance  2 ] , and   is the component to describe spatial correlation, which is defined by the intrinsic Gaussian autoregression distribution [33,34].That is, where  = (  ) is the spatial adjacent matrix defining the connectivity between counties.  = 1 if the th county and the th county are adjacent; otherwise   = 0. To perform Bayesian analysis, we assign gamma distributions with a large variance as the priors for the parameters  2 ] and  2  , and the trick is mostly used in Bayesian spatial analysis [34].Thus, the priors for  2 ] and  2  are given by The estimation of the parameters can be obtained by MCMC, and we use R = α + μ as the adjusted CI, where α and μ are the posterior mean of the parameters   and μ .Since 0 < R < 1, modeling such data is unstable and the fitted values may exceed the interval (0,1).We make the transformation For better expressing the model, the backshift operator, denoted , is needed. operates on the time index of a series and shifts time back one time unit to form a new series.In particular,   =  (−1) .Thus, the model can be formulated as where {  ,  = 1, 2, . . ., 36} is a white noise process with variance  2  .However, model (5) does not introduce spatial correlation among different regions.Thus, we modify the model as follows: where   is defined in (2).Model ( 6) is much different from [35,36] ).Then the parameters in the model ( 6) are  = (, ,  2 ).Denote f  = ( 1 , . . .,  36 ),  1 = ( 11 ,  12 , . . .,  111 ), f = (f 1 , . . ., f 11 ), and  = ( 1 ,  2 , . . .,  36 ).We denote that the likelihood function of f given  and  is (f | , ).(f | , ) can be obtained by several methods.One of the most used methods is Kalman filter.The overall strategy for computing the likelihood for a given set of parameter values is to use the Kalman filter equations to generate recursively the prediction errors and their variances and then use the prediction error decomposition of the likelihood function [37].
Prior distribution of  can be elicited from the experts' opinions or historical data.When the historical data is available, power prior is a good choice, which has emerged as a useful class of informative priors for historical data [38].Assume that the joint prior distribution of  is is following the intrinsic Gaussian autoregression distribution.Then the density function of   is defined as where   is a neighborhood of the th county.Thus the joint density of  is Based on the likelihood function (f | , ), (7), and ( 9), the posterior distribution of  is With specific priors of , , and  2 , we can use the corresponding method to generate the posterior samples of .In the case study, we use the normal distributions as the priors for  and , and the Gamma distributions as the prior for  2 .Then the full conditional posterior distributions of , , and  2 are all log-concave densities.Thus adaptive rejection sampling can be used for posterior sampling [39].In general, Metropolis-Hastings algorithm can be used to implement the Markov chain sampling.After running Markov chain sampling procedure  times and discarding the initial  burn-in iterations, then we have ( − ) iterations kept.Since the generated sample is not independent, we need to monitor the autocorrelations of the generated values and select a sampling lag  > 1 after which the corresponding autocorrelation is low; that is, the length of the thinning interval is .Considering the length of the thinning interval, the final number of iterations kept is   = ( − )/, and these independent samples will be used for posterior analysis.Assume the generated sample is ( (1) ,  (2) , . . .,  (  ) ) .
Then for any function () of the parameters of interest  we can (1) obtain a sample of the desired parameters () by simply considering ( ( (1) ) ,  ( (2) ) , . . .,  ( (2) estimate the posterior mean of () by () = (1/   ) ∑   =1 ( ()   .Other measures of interest might be the posterior median or quantiles (e.g., 2.5% and 97.5% percentiles will provide a 95% credible interval).1 lists the demographic characteristics of HFMD cases and the pathogen types of some cases from 2010 to 2012.There are a total of 103,671 HFMD cases reported in Wenzhou.The incidence rates are 2,111.1 per 100,000 (18 years old or younger) in 2010, 1,872.0 per 100,000 (18 years old or younger) in 2011, and 2,702.3 per 100,000 (18 years old or younger) in 2012.The majority of the reported HFMD cases are less than 5 years old (about 89.4%), which means that younger children are more vulnerable to HFMD.

Prevalence of HFMD. Table
Most of HFMD cases (approximately 62.3%) were boys and the male-to-female incidence ratio was 1.638 : 1 in 2010, 1.691 : 1 in 2011, and 1.633 : 1 in 2012, respectively.The  values of testing the difference between incidence in male and that in female were smaller than 0.001.Thus, the incidence in male was higher than that in female.Among 2,401 pathogen study between 2010 and 2012, CoxA16, EV71, and other EV accounted for 10.08%, 73.26%, and 16.66%, respectively.Clearly, the majority of HFMD pathogens were EV71.
Figure 2 shows monthly reported cases of HFMD in Wenzhou from 2010 to 2012.From Figure 2, we can see that incidence of HFMD is periodic and that a peak attains between May and July, because the temperature will affect the incidence of HFMD as studied in [4,5,28].

Spatial
Autocorrelation of HFMD Cases. Figure 3 shows the annual incidence rate per county accounting for the spatial variability of population size.It clearly indicates that the distribution of HFMD is heterogeneous at county level, and Ouhai municipal district is the most severe region.difference is taken and the adjusted Dickey-Fuller test shows that the time series is stationary when the significant level is 0.1 (Table 3).Thus   =   −  (−1) , where  1 = 0.As shown in Figure 2, we can specify the seasonal period  = 12.Then That is,  = 3,  = 2,  = 2, and  = 1.We compared all the submodels by AIC criteria [40] and chose the best model with the smallest AIC.The optimal model with the smallest AIC is  = 0,  = 0,  = 1, and  = 0. Thus the model used in the data analysis is where   ∼ (0,  2  ).Table 4 lists the estimates of   and   in each county, where the numbers in the parentheses are the posterior standard deviations of the parameters.Figure 4 shows the comparison between the data and the fitted values, where "O" means the observed data and "F" is the fitted values.

Conclusions and Discussions
In our study, we observed that children ≤5 years old accounted for most of the HFMD cases during the study period in Wenzhou, with an average male-to-female sex ratio 1.65 and the incidence single peak season was between May and July (see Table 1 and Figure 2), which was similar to other studies [3-5, 26, 28].The dominant pathogen was EV71 (73.26%), which was significantly higher than other districts of China.The difference might be partly attributed to climatic, geographic, social factors, and so forth [24,41,42].
In Wenzhou, HFMD had positive spatial autocorrelation at county level with high Morans I with  value < 0.001.The highest incidence area is Ouhai, which is located in the middle of Wenzhou, and the second highest group areas are Lucheng, Longwan, and Ruian that are adjacent to Ouhai (see Figure 3).To investigate the spatial variability, we present a spatial-temporal ARMA model, in which we describe the spatial effects by introducing correlated random effects.The model provides an adequate fit to the data of HFMD in Wenzhou.Other models can also be considered, for example,   () Φ  ()   =   () Θ  ()   ,  = 1, 2, . . ., 11, (14) where {  ,  = 1, 2, . . ., 11} is intrinsic Gaussian autoregression distribution.Model (14) introduces the spatial effects by {  } directly.However, it will make the parameter estimation much more difficult, since the number of parameters increases.How to analyze model ( 14) needs further study.

Figure 4 :
Figure 4: Comparison between data and the fitted values based on model (13).

Table 1 :
Demographic characteristics of HFMD cases and the pathogen types of some cases in Wenzhou, 2010-2012.

Table 2 :
The results of the spatial autocorrelation test on HFMD cases in Wenzhou, 2010-2012.

Table 3 :
Adjusted Dickey-Fuller test of the monthly incidence rate of each county in Wenzhou, 2010-2012.Temporal ARMA Model of HFMD.For each county, we test the stationarity of {  ,  = 1, 2, . . ., 36} and find that all the series are nonstationary.Then first-order