Estimation of Disease Transmission in Multimodal Transportation Networks

Mathematical models are important tools for estimating the epidemiological patterns of diseases and predicting the consequences of their spread. Investigating the risk factors of transportation modes and controlling transportation exposures helps prevent disease transmission in the transportation system and protects people's health. In this paper, a multimodal traffic distribution model is established to estimate the spread of a virus. The analysis is based on empirical evidence from the real transportation network that connects Wuhan with other cities. We consider five mainstream travel modes, namely, auto, high-speed railway, common railway, coach, and flight. A logit model from economics is used to predict the distribution of trips and the corresponding disease cases. The effectiveness of the model is verified with big data on the distribution of COVID-19 cases. We also conduct model-based tests to analyze the effect of lockdown on different travel modes. Furthermore, a sensitivity analysis is carried out, the results of which assist policy-making for containing infection transmission through traffic.


Introduction
Despite tremendous efforts to reduce and control infectious diseases, infections continue to be a threat to public health worldwide. Understanding virus propagation is essential for implementing antivirus measures. While antivirus policy has been studied extensively, propagation along transportation modes has received relatively little attention. Considering the risk factors of transportation modes and controlling transportation exposures will help prevent disease transmission in the transportation system and protect people's health. When a case of an infectious disease occurs at a location, investigators need to understand the mechanisms of disease propagation in the transportation network.
On December 31, 2019, the outbreak of the novel coronavirus was first reported in China. The global outbreak of COVID-19 was mainly caused by transmission through different transportation modes. To prevent the spread of the virus, all outbound transportation from Wuhan was shut down on the morning of January 23, 2020. On January 30, 2020, the WHO (World Health Organization) declared a global emergency. On March 11, 2020, the WHO declared the COVID-19 outbreak to be a global pandemic. For weeks after the first reports of a mysterious new virus, millions of people poured out of the central Chinese city, cramming onto buses, trains, and planes as the first wave of China's great Lunar New Year migration broke across the nation, and some of them were virus carriers. The travel patterns broadly track the early spread of the virus. The majority of confirmed cases and deaths occurred in China, within Hubei province, followed by high numbers of cases in central China, with pockets of infections in Chongqing, Shanghai, and Beijing as well. The initial flow of travelers went to provinces in central China with large pools of migrant workers. There might be a "high correlation" between the early spread of coronavirus cases and the distribution of travel destinations. The air inside vehicles is enclosed, which makes it easy for the virus to spread. The transmission speed also differs between travel modes because air circulation differs between vehicle types.
Mathematical models have become important tools in epidemiology for understanding the epidemiological patterns of diseases and predicting the consequences of public health interventions introduced to control their spread. There are two lines of study in epidemic spreading. The first is differential-equation spreading models, and the second is complex network theory. In the literature, three spreading models are widely used to model virus transmission, namely, the SIR model, the SIS model, and the SI model (acronyms such as M, S, E, I, and R are often used for the epidemiological classes. The class M represents individuals with passive immunity. The class S represents susceptible individuals who can become infected. The class E represents exposed individuals in the latent period, who are infected but not yet infectious. The class I represents infective individuals, who are infectious in the sense that they are capable of transmitting the infection. The class R represents recovered individuals with permanent infection-acquired immunity. The choice of which epidemiological classes to include in a model depends on the characteristics of the particular disease being modeled and the purpose of the model) [1-4]. To solve the models, three kinds of algorithms have been developed based on percolation theory [5,6], mean field theory [7,8], and Markov chain theory [9,10].
Researchers have also developed models to investigate the propagation of different types of viruses, including nonbiological viruses such as the computer virus, the flash disk virus, the Bluetooth phone virus, and the email virus. Otero-Muras et al. presented a systematic approach to biochemical network dynamic analysis and control based on both thermodynamic and control theoretic tools [11]. Based on a biological control strategy in pest management, Pang and Chen constructed a pest-epidemic model with impulsive control, i.e., periodically spraying microbial pesticide and releasing infected pests at different fixed moments [12]. Jin and Wang developed a new dynamic propagation model, FD-SEIR (flash disk virus susceptible-exposed-infectious-recovered), which introduces the FD state and a new propagation rate [13]. Huang et al. developed an epidemic model of the Bluetooth phone virus [14]. Li et al. formulated a novel deterministic SEIS model for the transmission of email viruses in growing communication networks [15]. Jackson and Chen-Charpentier presented two plant virus propagation models, one with no delays and the other with two delays [16]. Jia and Lv established a stochastic rumor propagation model and examined sufficient conditions for extinction and persistence in the mean of the rumor [17]. Zhang et al. established a spreading model based on contact strength and the SI model, and a weighted network with community structure based on a network model proposed by Barrat et al. [18].
In the following, a multimodal traffic distribution model is established to estimate the spread of a virus. The analysis is based on empirical evidence from the real transportation network that connects Wuhan with other cities. Five travel modes are considered, namely, auto, high-speed railway, common railway, coach, and flight. A logit model from economics is used to predict the distribution of trips and the corresponding disease cases. The effectiveness of the model is verified with big data on the distribution of COVID-19 cases. The main contributions of the paper are fourfold. First, we propose a multimodal traffic distribution model using data from the real transportation system. Second, we study the relation between the state of disease transmission and the distribution of traffic flows, based on the numerical results of the proposed model and the big data on the distribution of COVID-19 cases. Third, we use the model to predict the effect of lockdown on different transport means and analyze its impact on disease transmission. Fourth, we present a sensitivity analysis of the proposed model and derive various transportation improvement policies to control large-scale transportation exposure. The remainder of this paper is organized as follows. Section 2 establishes a multimodal traffic distribution model to estimate the spread of a virus. The proposed model is validated in Section 3, using the real traffic distribution from Wuhan to other regions of China during the outbreak of COVID-19. Conclusions are drawn in Section 4.

Multimode Travel Cost Functions.
The multimode travel cost functions are based on empirical evidence from the real transportation network connecting Wuhan to other cities. We consider five mainstream travel modes, namely, auto, high-speed railway, common railway, coach, and flight (Tables 1 and 2).

(1) Auto mode: ε_auto represents the cost of gasoline consumed per kilometer and T^d_auto denotes the auto travel time. The cost function c^d_auto consists of three terms. The first is the monetary cost of travel time, given by the product of the value of in-vehicle travel time vot_auto and the travel time T^d_auto; the second is the cost of gasoline consumed by the trip; the third is the highway toll, given by the product of the toll charged per kilometer and the total highway length. The average vehicle occupancy n is the average number of occupants in a vehicle. We set n in the auto utility function taking account of the actual traffic situation. The Transport Bureau of ...

Table fragment (column headings not preserved): destination city, province, and eight mode-specific values.

City       Province        Values
...        Gansu           400   654   1200  190    1200  430  135  1330
Xi'ning    Qinghai         -     -     -     -      -     -    130  1300
Taibei     Taiwan          -     -     -     -      -     -    155  1400
Beijing    -               270   520   720   152.5  900   320  120  2200
Tianjin    -               300   525   840   156.5  960   300  115  1150
Shanghai   -               300   336   900   140    720   250  95   1880
Chongqing  -               390   279   540   140    780   280  95   1650
Hohhot     Inner Mongolia  -     -     1828  229    -     -    130  1050
Nanning    Guangxi         450   478   840   170    1050  320  120  1180
Lhasa      Tibet           -     -     -     -      -     -    230  1070
Yinchuan   Ningxia         -     -     1560  198    -     -    135  1300
Urumqi     Xinjiang        -     -     2310  345    -     -    260  2000
Hong Kong  -               280   679   -     -      -     -    130  1574
Macao      -               -     -     -     -      -     -    100  1380

Note that, in this part, the unit of measurement is kilometers for distance, minutes for time, and CNY for all kinds of tolls and fees.

Journal of Advanced Transportation
(2) High-speed railway mode: T^d_{high-speed rail} is the high-speed railway travel time and vot_{high-speed rail} is the value of time spent on the high-speed railway. The travel cost c^d_{high-speed rail} consists of two terms: the first is the cost of the travel time and the second is the high-speed railway ticket price.
(3) Common railway mode: T^d_rail is the common railway travel time and vot_rail is the value of time spent on the common railway. The travel cost c^d_rail consists of two terms: the first is the cost of the travel time and the second is the common railway ticket price.
(4) Coach mode: T^d_coach is the coach travel time and vot_coach is the value of time spent in a coach. The travel cost c^d_coach consists of two terms: the first is the cost of the travel time and the second is the coach ticket price.
(5) Flight mode: T^d_flight is the travel time by plane and vot_flight is the value of time spent on a flight. The travel cost c^d_flight consists of two terms: the first is the cost of the travel time and the second is the flight ticket price.
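The displayed cost functions are not reproduced in the text above. One form consistent with the verbal descriptions is sketched below. The symbols l^d (road distance), τ (highway toll per kilometer), l^d_hw (highway length), and p^d_m (ticket price of mode m) are assumed names, and sharing the gasoline and toll terms among the n occupants is one plausible reading of the occupancy remark, not a confirmed one.

```latex
\begin{align*}
c^{d}_{\text{auto}} &= \mathit{vot}_{\text{auto}}\, T^{d}_{\text{auto}}
  + \frac{1}{n}\left(\varepsilon_{\text{auto}}\, l^{d} + \tau\, l^{d}_{\text{hw}}\right)
  && \text{(assumed sharing among } n \text{ occupants)} \\
c^{d}_{\text{high-speed rail}} &= \mathit{vot}_{\text{high-speed rail}}\, T^{d}_{\text{high-speed rail}} + p^{d}_{\text{high-speed rail}} \\
c^{d}_{\text{rail}} &= \mathit{vot}_{\text{rail}}\, T^{d}_{\text{rail}} + p^{d}_{\text{rail}} \\
c^{d}_{\text{coach}} &= \mathit{vot}_{\text{coach}}\, T^{d}_{\text{coach}} + p^{d}_{\text{coach}} \\
c^{d}_{\text{flight}} &= \mathit{vot}_{\text{flight}}\, T^{d}_{\text{flight}} + p^{d}_{\text{flight}}
\end{align*}
```

In every case the first term converts travel time into money through the value of time, so all five costs are expressed in CNY and can be compared directly in the mode choice step.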

Multimodal User Equilibrium Model.
To cater for both mode choice and destination choice, we propose a multimodal network user equilibrium model, program (2a)-(2e). In this model, θ_d is the impedance parameter associated with the travel mode choice to the destination region d, δ is the impedance parameter associated with destination choice, β^d_m is the exogenous attractiveness of the travel mode m to the destination d, and α_d is the exogenous attractiveness of the destination region d. q^d_m denotes the travel demand by travel mode m from Wuhan to the destination region d, and q_d denotes the travel demand from Wuhan to the destination region d.

As to α_d, we develop a weighted destination attractiveness measure to quantify the attractiveness of each destination to Wuhan. We denote the original data of the historical demand distribution ratio, population, and travel distance of a destination region d as h_d, p_d, and l_d. As h_d, p_d, and l_d are incommensurable, that is, measured in different units, they cannot be added directly and need to be normalized before the weighted-sum method is applied. To do so, we define the normalized quantities H_d, P_d, and L_d, in which min(·) returns the minimum item of a list (for example, min(h_d) is the minimum item in the list h_d) and max(·) returns the maximum item of a list (for example, max(l_d) is the maximum item in the list l_d). According to equation (2i), 0 ≤ L_d ≤ 1, from which we can infer that the value of the weighted distance parameter (L_d)^{k_d} ranges from 0 to 1 for any k_d > 0. Besides, experience shows that passengers are more sensitive to travel cost on a short trip than on a long trip. This phenomenon has been studied in the area of stochastic traffic flow distribution [19].
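The normalization step can be illustrated with a short sketch. The exact definitions of H_d, P_d, and L_d are not reproduced above, so the code assumes plain min-max scaling, (x - min) / (max - min), and uses made-up distances; only the use of min(·) and max(·) and the bound 0 ≤ (L_d)^{k_d} ≤ 1 come from the text.

```python
def min_max(values):
    """Scale a list to [0, 1]. Assumed min-max form, not the paper's exact equation."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical travel distances l_d in kilometers (illustrative only).
l = [80.0, 350.0, 1200.0, 2300.0]
L = min_max(l)

# For 0 <= L_d <= 1 and k_d > 0, the weighted term L_d ** k_d stays in [0, 1],
# and a smaller exponent k_d gives a larger (or equal) value.
k_small, k_large = 0.5, 2.0
weighted_small = [x ** k_small for x in L]
weighted_large = [x ** k_large for x in L]
```

Because every normalized value lies between 0 and 1, raising it to a positive power keeps it in that interval, which is what allows the exponent to act as a weight.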
In this work, passengers traveling within the province of Hubei are more sensitive to travel cost than those traveling out of the province because of the shorter travel distance. That is to say, the magnitude of travel cost has a greater effect on the attractiveness of a destination region inside Hubei than on one outside Hubei, meaning that the value of (L_d)^{k_d} should be greater for d ∈ D_in than for d ∈ D_out. Together with the condition 0 ≤ L_d ≤ 1, this requires the travel distance-related weighting parameter k_d to take a smaller value for d ∈ D_in than for d ∈ D_out. Furthermore, as the provincial capital of Hubei, Wuhan attracts many migrant workers and students each year with its abundant employment opportunities and diverse educational resources. The migrant population constitutes the majority of travel demand in Wuhan during the Spring Festival travel season. Without unexpected disruption, the historical traffic distribution of Wuhan in recent Spring Festival travel seasons provides high-quality evidence for predicting the traffic distribution of this year. To reflect the significant impact of the historical traffic distribution on the assessment of a destination's attractiveness, we suggest that the historical traffic distribution-related weighting parameter a take a larger value than the population-related weighting parameter b. The detailed values of the parameters defined in this part can be found in Table 3.

The objective function (2a) is a two-level nested logit choice model that deals with the interrelated decisions in a multimodal network. The first level focuses on destination choice and the second level on mode choice. Equation (2b) ensures that the flows assigned from Wuhan to the different destination regions sum to the total travel demand Q, which in this work amounts to 5,000,000. Equation (2c) represents the mode flow conservation constraint.
Equations (2d) and (2e) are the nonnegativity conditions for destination demands and mode flows, respectively.
By deriving the first-order optimality conditions of the proposed program, we obtain the nested logit model for destination choice and mode choice, equations (2j) and (2k), respectively. Here u_d is the users' perceived generalized cost of traveling from the origin city Wuhan to the destination d, which is computed as a "log-sum" of the travel costs of all modes, equation (2l). To solve the nested logit model-based problem, one can first compute the generalized cost u_d (∀d ∈ D) according to equation (2l) and then carry out the multiproportional traffic assignment (2j)-(2k) to obtain the combined destination distribution and modal split, i.e., q^d_m and q_d.
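The two-step solution procedure can be sketched in a few lines of Python. Equations (2j)-(2l) are not reproduced above, so the sketch uses a textbook nested logit form and should be read as an assumption: mode utilities are taken as β^d_m - θ_d c^d_m, the log-sum cost as u_d = -(1/θ_d) ln Σ_m exp(β^d_m - θ_d c^d_m), and destination utilities as α_d - δ u_d. The function name `nested_logit`, the destination labels, and all numeric inputs except Q are hypothetical.

```python
import math


def nested_logit(costs, theta, delta, alpha, beta, Q):
    """Split total demand Q over destinations and modes (assumed textbook form).

    costs[d][m] is the generalized cost of mode m to destination d;
    theta[d] and alpha[d] are per-destination parameters;
    beta[d][m] is the attractiveness of mode m to destination d.
    Returns (q_d, q_dm).
    """
    u, mode_share = {}, {}
    for d, c_d in costs.items():
        # Mode-level utilities for destination d.
        v = {m: beta[d][m] - theta[d] * c for m, c in c_d.items()}
        v_max = max(v.values())  # subtract the maximum for numerical stability
        denom = sum(math.exp(x - v_max) for x in v.values())
        mode_share[d] = {m: math.exp(x - v_max) / denom for m, x in v.items()}
        # Step 1: log-sum generalized cost u_d.
        u[d] = -(v_max + math.log(denom)) / theta[d]

    # Step 2: destination shares from u_d, then the modal split.
    w = {d: alpha[d] - delta * u[d] for d in costs}
    w_max = max(w.values())
    w_denom = sum(math.exp(x - w_max) for x in w.values())
    q_d = {d: Q * math.exp(w[d] - w_max) / w_denom for d in costs}
    q_dm = {d: {m: q_d[d] * s for m, s in mode_share[d].items()} for d in costs}
    return q_d, q_dm


# Hypothetical inputs: destination "A" inside Hubei, "B" outside.
costs = {
    "A": {"auto": 100.0, "rail": 80.0},
    "B": {"auto": 400.0, "rail": 300.0, "flight": 350.0},
}
theta = {"A": 0.05, "B": 0.01}  # inside-Hubei value is 5 times the outside value
alpha = {"A": 0.0, "B": 0.0}
beta = {d: {m: 0.0 for m in c} for d, c in costs.items()}
q_d, q_dm = nested_logit(costs, theta, 0.01, alpha, beta, 5_000_000)
```

By construction the destination flows sum to Q and the mode flows of each destination sum to its q_d, mirroring constraints (2b) and (2c); all flows are nonnegative, as required by (2d) and (2e).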

Case Study
The outbreak of COVID-19, which started in December 2019, took Wuhan as its center and soon spread to all regions of China (including Hong Kong, Macao, and Taiwan). In the early morning of January 22, the province of Hubei launched a level II emergency response to public health emergencies, and cities in Hubei then successively stopped public transportation. As of 11:00 on January 24, public transportation in 12 cities in Hubei had been shut down, including Wuhan, E'zhou, Xiantao, Zhijiang, Qianjiang, Huanggang, Chibi, Jingmen, Xianning, Huangshi, Dangyang, and Enshi. Wuhan, a transport hub of more than 10 million people, temporarily closed its airports, rail stations, and all main roads out of town and suspended public buses and subways. The government announced that citizens should not leave Wuhan without special reasons and that the lifting of the lockdown would be announced separately. On January 26, the Information Office of the People's Government of Hubei held a press conference, pointing out that from the beginning of the Spring Festival to the closure of Wuhan, more than 5 million people had left Wuhan and more than 9 million remained in the city.
In this section, we will use the transportation model proposed in Section 2 to analyze the traffic flow distribution for the 5 million people outbound from Wuhan and then estimate the epidemic situation based on the demand distribution results. We are mainly concerned about the distribution of people within the province of Hubei as well as outside the province of Hubei. Figure 1 shows the map of the province of Hubei and 35 other regions of China, and Figure 2 shows the map of Wuhan and 16 other cities in the province of Hubei.
To facilitate the computation of the travel utility to a destination province outside Hubei, instead of calculating the travel utility to each city in the destination province, we only calculate the travel utility to the provincial capital. For the calculation of the normalized historical demand distribution ratio parameter H_d in equation (2f), we collect the data on migration from Wuhan to other destination regions in 2017 from the Tencent social network's Spring Festival geographic positioning data platform. The data show that, except for Henan, Hunan, Anhui, Jiangsu, and Guangdong, the majority of the traffic out of Wuhan to each province flowed into its provincial capital. As a result, we replace the population of a province by the population of its provincial capital when computing the parameter P_d in equation (2f). For the other five provinces, i.e., Henan, Hunan, Anhui, Jiangsu, and Guangdong, we use the sum of the populations of the cities that received the largest inflows from Wuhan in 2017 instead of the population of the province. Besides, to obtain the travel distance parameter L_d in equation (2f), we use the road length from Wuhan to each destination region to measure the travel distance. The data on historical demand distribution, the population of cities and provinces, and the road distance from Wuhan to other destination regions can be found in Table 4. The parameters for the computation of the nested logit model-based traffic assignment are listed in Table 3. Based on real traveling experience, the parameters are set as follows: 0 < vot_flight < vot_{high-speed rail} < vot_auto < vot_rail < vot_coach. Furthermore, in this study, we consider the different travel cost sensitivities of passengers whose trips differ in length.
The related research results [19] reveal that passengers on short trips are more sensitive to travel distance or travel cost than those on long trips, which is why the value of θ^d_m for ∀d ∈ D_in is set to 5 times that for ∀d ∈ D_out.

Demand Assignment.
Based on the model proposed in Section 2, we calculate the traffic flows from Wuhan to 48 other destination regions, which include 15 cities within Hubei and 33 destination regions outside Hubei. Note that we exclude several regions, namely the Shennongjia Forest District in Hubei, the Diaoyu Islands, and the South China Sea Islands, from the calculation of the destination demand distribution because the traffic flows to these regions are very small. The demand distribution in real condition is collected from the Baidu Migration Big Data Platform. The error ratio of estimation is defined as the ratio of the estimation error to the result in real condition. The estimated demand distribution, the demand distribution in real condition, and the error of estimation are listed in Tables 5 and 6 for cities within the province of Hubei and destination regions outside Hubei, respectively. Note that, in this part, the unit of measurement is kilometers for distance, minutes for time, and CNY for all kinds of tolls and fees. Table 7 provides the aggregated demand distribution ratio, which shows that, in both the real and estimation conditions, the traffic flows within the province of Hubei account for the largest part (about 70%) of the total demand. Table 7 also shows that the aggregated error ratio of demand estimation for cities inside Hubei (6.99%) is smaller than that for destinations outside Hubei (15.73%), meaning that the model performs better in demand estimation within the province of Hubei, and that the aggregated error ratio of demand estimation for all destination regions is 18.60%. The demand distribution results, sorted in decreasing order, are shown in Figure 3. A destination name marked with an asterisk denotes a city within the province of Hubei. Note that in Figure 3, we group the demand distribution results of Hong Kong, Macao, and Taiwan into one item named "others" for brevity.
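The per-destination error ratio defined above can be computed as in the following sketch. Taking the estimation error as an absolute difference is an assumption, since the text only states that the ratio is the error divided by the real value; the destination labels and flows are hypothetical.

```python
def error_ratio(estimated, real):
    """Error ratio of estimation: |estimated - real| / real (absolute error assumed)."""
    if real == 0:
        raise ValueError("the real value must be nonzero")
    return abs(estimated - real) / real


# Hypothetical estimated and real outbound flows for two destinations.
estimated = {"X": 110_000.0, "Y": 45_000.0}
real = {"X": 100_000.0, "Y": 50_000.0}
ratios = {d: error_ratio(estimated[d], real[d]) for d in real}
```

With this definition an overestimate and an underestimate of the same relative size give the same ratio, so the sign of the error has to be read from the tables themselves.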
According to the results in Figure 3, in both the real and estimation conditions, ...

The average incidence rate c is computed as c = f / Q, where f is the number of confirmed cases nationwide apart from Wuhan and Q is the total travel demand. With f = 31296, we have c = 0.6259%. We then estimate the number of incidence cases in each destination region, which is equal to q_d * c, ∀d ∈ D.
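The arithmetic behind the incidence rate can be checked directly. The sketch assumes that c is the ratio of f to the total outbound demand Q = 5,000,000, which reproduces the reported 0.6259%; the destination labels and demands are hypothetical.

```python
Q = 5_000_000  # total outbound travel demand from Wuhan (from the text)
f = 31_296     # confirmed cases nationwide apart from Wuhan (from the text)
c = f / Q      # average incidence rate, assumed to be f / Q


def estimated_cases(q_d, rate=c):
    """Estimated incidence cases per destination: q_d * c."""
    return {d: q * rate for d, q in q_d.items()}


# Hypothetical destination demands q_d.
q_d = {"X": 600_000, "Y": 50_000}
cases = estimated_cases(q_d)
```

Because a single rate c is applied to every destination, the estimated case distribution is proportional to the estimated traffic flow distribution.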
The real number of incidence cases, the estimated number of incidence cases, the error of estimation, and the error ratio of estimation are listed in Tables 8 and 9 for destinations within and outside Hubei, respectively. It can be observed from Figure 4 that our estimation overestimates the number of incidence cases in most cities within Hubei, as well as in two provinces, i.e., Henan and Hunan. According to the results in Section 3.1, these cities and provinces are the destination regions with the largest traffic flows. The fact that the destination regions with the largest inflows from Wuhan have an incidence rate lower than the average incidence rate implies that the measures adopted by the Chinese government played an effective role in preventing a more serious situation from developing. The prevention and control measures include the lockdown of public transportation in 12 cities in Hubei, a two-week mandatory self-quarantine for people arriving from Wuhan, residential community-based management, the construction of temporary treatment centers, the centralized dispatch of medical staff and supplies to areas of scarcity, and so on. At the same time, the results in Figure 4 also reveal that the model-based results underestimate the incidence rate of several destination regions, including Guangdong, Chongqing, Sichuan, Shandong, Zhejiang, Beijing, Shanghai, Heilongjiang, and Hong Kong. This is because frequent commercial activities that involve face-to-face or close contact with other people lead to higher incidence rates in economically developed regions such as Guangdong, Zhejiang, Beijing, Shanghai, and Hong Kong. It is also interesting to see that the province of Heilongjiang, which is far away from Wuhan and not as commercially active as the regions mentioned above, has a high incidence rate. The high incidence rate of Heilongjiang may be attributed to mass contact transmission of the virus at gatherings. According to news reported in Heilongjiang, as of February 7 there had been 48 family gatherings that were the source of 194 cases of cluster infection.

Figure 4: Comparison of the estimated number and the real number of incidence cases.
In Table 11, we compute the aggregated error ratio of estimation for the number of incidence cases, which is equal to 39.52%, roughly twice the aggregated error ratio of estimation for the traffic flow distribution (18.60%), indicating that the spread of the epidemic is not linear with respect to the model-based traffic flow distribution. For cities within Hubei, the aggregated incidence case distribution ratio in real condition (56.40%) is much lower than the estimate (74.08%), while for provinces outside Hubei, the aggregated incidence case distribution ratio in real condition (43.60%) is much higher than the estimate (25.92%). This is because the spread of the disease within Hubei was well controlled by means of transport restriction, medical assistance, and other effective methods, while the high frequency of economic activity and the frequent occurrence of mass gatherings in some provinces outside Hubei potentially increased the incidence rate outside Hubei.

Mode Flow Distribution.
Public transport, as the main mode of transportation in big cities, carries the highest risk of transmission of infection for a number of reasons. The high density of passengers confined in relatively small spaces is the primary cause. Besides, in-vehicle air conditioning systems with low ventilation rates make it easy for the virus to spread. Indirect infection from contaminated public facilities in vehicles is another major source of danger. Furthermore, long trips often involve multiple public transportation transfers, which potentially increases the incidence rate. In contrast, self-driving or riding in a privately owned vehicle has several advantages over public transport in containing the transmission of infection. First, passengers are separated by vehicles. The spatial isolation reduces the risk of cross infection. Second, in the self-driving travel mode, passengers drive to their destinations directly, without any transfer most of the time. Third, people who are friends or familiar with each other often travel together in a privately owned vehicle. It is easy for them to learn each other's health condition, which helps raise their awareness of health security and, as a result, mitigates the risk of infection. Comparing the impacts of different transportation means on the spread of the virus reveals that it is important to strengthen epidemic prevention through public transport control.
In this section, we first calculate the mode flow distribution based on the proposed model. The mode flow distributions of each destination region are listed in Tables 10 and 12. The aggregated mode flow ratios for destinations inside and outside Hubei are shown in Figure 5. It can be seen from Figure 5 that, for destinations both inside and outside Hubei, public transport is the mainstream transportation means, accounting for about 80% of the total demand. Besides, the most popular public transport mode is the high-speed railway for trips both inside and outside Hubei, which indicates that enhanced measures, such as disinfection and disease detection, should be adopted by the high-speed railway transportation system. Furthermore, the aggregated mode flow ratio of the common railway inside Hubei (29.63%) is much higher than that outside Hubei (13.55%), indicating that, for trips from Wuhan to cities inside Hubei, extra effort should also be devoted to epidemic control in the common railway transportation system.
To contain the COVID-19 outbreak, many countries have implemented flight restrictions on China. At the same time, China itself has imposed a lockdown on the transportation system of Wuhan as well as the entire Hubei province. In this context, it is reasonable to investigate how the mode flow distribution changes under different outbound transport restrictions in Wuhan. In the following, we use the proposed nested logit model to analyze the effect of a lockdown on each transport means. Table 13 shows the aggregated demand ratio inside Hubei when a lockdown is applied to different travel modes. It reveals that a lockdown on any travel mode leads to an increase in the aggregated demand ratio inside Hubei, among which shutting down the high-speed railway causes the largest rise, from 73.87% to 78.15%. This indicates that a lockdown on any travel mode will not make a big difference to the aggregated demand distribution between destinations inside and outside Hubei. We then check the effects of transport restriction on the mode flow distribution; the results are listed in Tables 14 and 15 for destinations inside Hubei and outside Hubei, respectively. From Tables 14 and 15, it can be seen that, for destinations both inside and outside Hubei, a lockdown on a certain transportation means leads to growth in the traffic flows of the other travel modes. In particular, a lockdown on the high-speed railway has the most prominent impact on the traffic flow increment of the other travel modes, indicating that, in the case of a lockdown on the high-speed railway, control of transportation exposures should be enforced for all the other public transport systems. The results also reveal that a lockdown on a certain travel mode may cause different degrees of aggregated mode flow increment in the other travel modes.
For example, the common railway restriction has the most significant impact on the increase of the aggregated mode flow of automobile (63.65%) for destinations inside Hubei. The automobile restriction leads to a higher aggregated mode flow growth of coach (42.98%) for destinations inside Hubei than any other aggregated mode flow increment. This indicates that it is important to measure the magnitude of correlation between a lockdown on a certain travel mode and the traffic flow increase of other travel modes.

... spread of the crisis is not purely dependent on the transportation situation, but is also affected, on the one hand, by the control methods applied by the public authorities and, on the other hand, by the frequency of local economic activities as well as the number of mass gatherings. The analysis of the effect of lockdown on different travel modes shows that a lockdown on the high-speed railway has the most prominent impact on the traffic flow increment of other travel modes, and that a lockdown on a certain travel mode causes different degrees of aggregated mode flow increment in the other travel modes. It is important to measure the magnitude of correlation between a lockdown on a certain travel mode and the traffic flow increase of other travel modes. A public transport mode that is highly correlated with the lockdown policy needs intensified management to prevent the virus from spreading through transportation. Furthermore, a sensitivity analysis is carried out in this study, and based on its results we work out a compromise solution for stimulating the traffic flow of automobile while reducing the demand to target regions with high incidence rates.

Conflicts of Interest
The authors declare that they have no conflicts of interest.