Long-Time Analysis of a Time-Dependent SUC Epidemic Model for the COVID-19 Pandemic

In this study, we propose a time-dependent susceptible-unidentified infected-confirmed (tSUC) epidemic mathematical model for the COVID-19 pandemic, which has a time-dependent transmission parameter. Using the tSUC model with real confirmed data, we can estimate the number of unidentified infected cases. Because the transmission parameter is time-dependent, we can perform a long-time epidemic analysis from the beginning of the COVID-19 pandemic to the present. To verify the performance of the proposed model, we present several numerical experiments. The computational test results confirm the usefulness of the proposed model in the analysis of the COVID-19 pandemic.


Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by a recently discovered coronavirus that had not been previously identified. Many people infected with the coronavirus experience mild to moderate respiratory illness and recover without special treatment. However, older people and patients with underlying medical conditions such as diabetes, cardiovascular disease, and chronic respiratory disease are more likely to develop serious complications [1]. A serious problem with COVID-19 is that some infections are asymptomatic (symptoms are very mild or absent), and the time from infection to symptom onset is 5-6 days on average and ranges from 1 to 14 days. Because the most common symptom of COVID-19 is fever [2], body temperature measurement is used as a means of detecting infection. Therefore, if an infected person is asymptomatic or has not yet developed symptoms, the infection is difficult to detect, and in this situation, the rate of spread of COVID-19 can increase significantly. The authors of [3] use a susceptible-infected-recovered (SIR) model and machine learning to simulate the spread of COVID-19 in various scenarios. To effectively reduce the scale of the epidemic, it is essential to find and isolate the infected as soon as possible.

Figure 1 shows the global infected population, summed from the country-specific data published by the World Health Organization (WHO) [4]. We can observe the rapid increase in the number of confirmed infections worldwide. If the number of COVID-19 infections continues to rise, many people will die, and the disease will cause enormous economic damage. Therefore, to predict the number of future COVID-19 infections and prepare a prevention and intervention plan in advance, it is important to estimate the number of unidentified infected cases. If we can estimate the number of unidentified infections using the method proposed in [5], it could help reduce the number of infected cases by evaluating various countries' COVID-19 intervention strategies and adopting effective national intervention strategies.
The SIR model is one of the simplest and most robust models of infectivity. In the traditional SIR model, β and 1/c represent the average number of random contacts that an individual has per unit time and the average time for an infected individual to recover, respectively. The traditional SIR model uses constant β and c and therefore represents only simple characteristics of infectious diseases [6][7][8]. A model with constant parameters does not reflect external factors that change sharply, such as the criteria for confirming patients or the national prevention policy. Recently, to resolve this problem, the SIR model was studied with β and c as time-dependent parameters, and the results showed more accurate epidemic predictions than before [6,7].

The SIR model has been modified in various ways to analyze infectious diseases. Some researchers analyzed the spread of COVID-19 using a new, nonmonotonous SIR model rather than a monotonous SIR model, in which all susceptible populations are infected and then recovered [9]. In addition, because a fractional-order epidemic model has a memory effect that benefits epidemiologic modeling, researchers developed a fractional-order susceptible-exposed-infected-recovered-deaths (SEIRD) model [10] and a susceptible-infected-recovered-deaths (SIRD) model including multiple fractional features [11]. Other researchers analyzed epidemics and suggested solutions to end them. They compared data from multiple countries using a simple SIRD model to show that cultural factors have a great influence on the infection rate in each country and used a modified SIRD model to analyze the epidemics in each country and suggest solutions [12]. Recently, as research on machine learning has become more active, epidemic prediction and analysis using machine learning are also being actively studied. Researchers used the SIR model and machine learning to develop an epidemic model that provides smart healthcare for the prediction and prevention of COVID-19 [3]. A nonlinear neural network for predicting COVID-19 cases has also been developed [13]. More researchers continue to modify or develop epidemic models to end COVID-19.

The main purpose of this paper is to propose a modified susceptible-unidentified infected-confirmed (SUC) model for long-time analysis of infectious diseases, such as COVID-19, where unconfirmed infections must be considered. The outline of this paper is as follows. Section 2 proposes a time-dependent susceptible-unidentified infected-confirmed (tSUC) model. In Section 3, the computational solution algorithm is presented. In Section 4, the computational experiments are performed. Discussion of various infectious disease models and methods of confirming infection can be found in Section 5. Conclusions are given in Section 6. In addition, the MATLAB source code is given in the Appendix for interested readers.

Proposed tSUC Epidemic Model
In this paper, we present the tSUC epidemic model for the COVID-19 pandemic, which has a time-dependent transmission parameter. Let S(t) be the susceptible; U(t) be the unidentified infected; C(t) be the confirmed; β(t) (≥ 0) be a transmission variable; and c (≥ 0) be the confirmation rate, so that 1/c is the average number of days taken before the unidentified infected are confirmed. Then, U(t) is the population that has been infected with COVID-19 but not yet confirmed, and C(t) is the population that has been confirmed as infected and no longer spreads the disease.

Therefore, S(t) decreases through infection at the rate β(t)S(t)U(t)/N, U(t) increases at the rate β(t)S(t)U(t)/N, C(t) increases by cU(t) as infections are confirmed, and U(t) decreases by cU(t). Therefore, the derivative of each variable with respect to time is

\[
\frac{dS(t)}{dt} = -\frac{\beta(t) S(t) U(t)}{N}, \tag{1}
\]
\[
\frac{dU(t)}{dt} = \frac{\beta(t) S(t) U(t)}{N} - c\,U(t), \tag{2}
\]
\[
\frac{dC(t)}{dt} = c\,U(t). \tag{3}
\]

The unidentified infected population can spread the disease and has not yet been confirmed. The parameter β(t) is a time-dependent transmission variable, but 1/c is constant as the average number of days taken before the unidentified infected are confirmed. We assume the total population N is constant. We note that if β(t) is constant, then the tSUC model becomes the SUC model [14]. Figure 2 shows schematic illustrations of the differently classified groups of the standard SIR and proposed tSUC models. Individuals belonging to the S, I, and R groups in the SIR model are susceptible, infected, and recovered, respectively, as shown in the top row of Figure 2. We can subclassify I as UI and CI, which are unconfirmed-infected and confirmed-infected, respectively; see the middle row of Figure 2. In the tSUC model, U is UI and C is CI ∪ R (see the bottom row of Figure 2).
In the standard SIR model, $c_{\mathrm{SIR}}$ is the reciprocal of the period during which an infected individual acquires antibodies and recovers. However, in the proposed tSUC model, $c_{\mathrm{SUC}}$ is the reciprocal of the period during which an unidentified infected individual can spread an infectious disease until the infection is confirmed. Generally, $1/c_{\mathrm{SIR}}$ is larger than $1/c_{\mathrm{SUC}}$ (see Figure 3).

Numerical Solution Algorithm
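The tSUC model can be solved by a fourth-order Runge-Kutta (RK4) method. First, let us rewrite equations (1)-(3) as follows:

\[
\frac{dS}{dt} = f(\beta, S, U) = -\frac{\beta S U}{N}, \quad
\frac{dU}{dt} = g(c, \beta, S, U) = \frac{\beta S U}{N} - cU, \quad
\frac{dC}{dt} = h(c, U) = cU. \tag{4}
\]

Second, let $S^n = S(n\Delta t)$, $U^n = U(n\Delta t)$, and $C^n = C(n\Delta t)$, where $\Delta t$ is a time step. For $n = 0, 1, 2, \ldots$, we have the following discrete equations:

\[
S^{n+1} = S^n + \frac{\Delta t}{6}\left(k_{11} + 2k_{12} + 2k_{13} + k_{14}\right), \tag{5}
\]
\[
U^{n+1} = U^n + \frac{\Delta t}{6}\left(k_{21} + 2k_{22} + 2k_{23} + k_{24}\right), \tag{6}
\]
\[
C^{n+1} = C^n + \frac{\Delta t}{6}\left(k_{31} + 2k_{32} + 2k_{33} + k_{34}\right), \tag{7}
\]

where the first-stage slopes are $k_{11} = f(\beta^n, S^n, U^n)$, $k_{21} = g(c, \beta^n, S^n, U^n)$, and $k_{31} = h(c, U^n)$, and the remaining stages $k_{i2}$, $k_{i3}$, and $k_{i4}$ ($i = 1, 2, 3$) follow the standard RK4 pattern. Here, $c$, $\beta^n$, and $U^0$ are the unknown parameters. To solve the discrete system of equations (5)-(7), we need to know these parameter values. However, in the real-world population, β, c, and the number of unidentified infected cases U are unknown; only the number of cumulative confirmed cases C is known. To estimate the unknown unidentified infected cases U, we use the tSUC model and the fitting function lsqcurvefit in MATLAB R2021a, which is a nonlinear curve-fitting solver in a least-squares sense [15].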
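The RK4 update for the tSUC system can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code: the names `tsuc_rhs`, `rk4_step`, and `solve_tsuc` are hypothetical, and evaluating β at the half step t + Δt/2 in the two middle stages is an assumption (the usual RK4 treatment of a time-dependent coefficient).

```python
from typing import Callable, List, Tuple

State = Tuple[float, float, float]  # (S, U, C)


def tsuc_rhs(beta: float, c: float, N: float, y: State) -> State:
    """Right-hand sides f, g, h of the tSUC system for one value of beta."""
    S, U, _C = y
    infection = beta * S * U / N
    return (-infection, infection - c * U, c * U)


def rk4_step(beta_fn: Callable[[float], float], c: float, N: float,
             t: float, y: State, dt: float) -> State:
    """Advance (S, U, C) from time t to t + dt with classical RK4."""
    def shifted(k: State, h: float) -> State:
        return tuple(a + h * b for a, b in zip(y, k))

    # Assumption: beta is sampled at t, t + dt/2, t + dt/2, and t + dt.
    k1 = tsuc_rhs(beta_fn(t), c, N, y)
    k2 = tsuc_rhs(beta_fn(t + dt / 2), c, N, shifted(k1, dt / 2))
    k3 = tsuc_rhs(beta_fn(t + dt / 2), c, N, shifted(k2, dt / 2))
    k4 = tsuc_rhs(beta_fn(t + dt), c, N, shifted(k3, dt))
    return tuple(a + dt / 6 * (p + 2 * q + 2 * r + s)
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))


def solve_tsuc(beta_fn: Callable[[float], float], c: float, N: float,
               y0: State, dt: float, n_steps: int) -> List[State]:
    """Return the list of states at t = 0, dt, 2*dt, ..., n_steps*dt."""
    states = [y0]
    for n in range(n_steps):
        states.append(rk4_step(beta_fn, c, N, n * dt, states[-1], dt))
    return states
```

Because the three right-hand sides sum to zero, this update keeps S + U + C equal to N up to rounding error, which gives a quick sanity check on any implementation.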

Data Smoothing.
As a preprocessing step for the epidemic data, we take the 7-day average because the number of COVID-19 tests differs from day to day. First, the number of new confirmed cases (ΔC) is calculated from the number of cumulative confirmed cases as follows:

\[
\Delta C_i = C_{i+1} - C_i, \quad i = 1, 2, \ldots, p - 1,
\]

where $p$ is the number of the given real cumulative confirmed cases $C_i$ ($i = 1, 2, \ldots, p$). Second, we calculate the 7-day simple moving average of the number of new confirmed cases:

\[
\Delta^{\mathrm{ave}} C_i = \frac{\Delta C_i + \Delta C_{i+1} + \cdots + \Delta C_{i+6}}{7}, \quad i = 1, 2, \ldots, p - 7.
\]

Finally, the smoothed cumulative confirmed case data $C^{\mathrm{ref}}$ are generated by cumulatively summing the 7-day simple moving averages of new confirmed cases.
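The smoothing steps can be sketched in Python as follows. The function name `smooth_cumulative` is hypothetical, and the starting value of the smoothed series is an assumption: the sketch starts from the first raw cumulative value and returns p − 7 points, which is one plausible reading of the description and not necessarily the authors' exact formula.

```python
from typing import List, Sequence


def smooth_cumulative(cumulative: Sequence[float], window: int = 7) -> List[float]:
    """Smooth cumulative counts via a moving average of daily new cases."""
    p = len(cumulative)
    if p <= window:
        raise ValueError("need more data points than the averaging window")

    # Step 1: daily new cases, Delta C_i = C_{i+1} - C_i (p - 1 values).
    new_cases = [cumulative[i + 1] - cumulative[i] for i in range(p - 1)]

    # Step 2: simple moving average over `window` days (p - window values).
    averaged = [sum(new_cases[i:i + window]) / window
                for i in range(len(new_cases) - window + 1)]

    # Step 3 (assumption): start at the first raw value and accumulate the
    # averaged increments, keeping p - window smoothed points in total.
    smoothed = [float(cumulative[0])]
    for increment in averaged[:-1]:
        smoothed.append(smoothed[-1] + increment)
    return smoothed
```

For data that grow by the same amount every day, the moving average equals that amount, so the smoothed series coincides with the raw one; only day-to-day fluctuations are damped.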

Estimating Parameters.
Let $\boldsymbol{\beta} = [\beta^{n_1}, \beta^{n_2}, \ldots, \beta^{n_L}]$ be the vector of sample transmission values at the sample points $\mathbf{t} = [t_{n_1}, t_{n_1} + q, t_{n_1} + 2q, \ldots, t_{n_L}]$, where $q$ is a sampling interval, $t_{n_1}$ is the starting day, and $t_{n_L}$ is the last check day, which is less than or equal to the end day of the case data.
Using piecewise interpolation, we obtain the values of β between $t_{n_1}$ and $t_{n_L}$. We seek the optimal parameters $c$, $\boldsymbol{\beta}$, and $U^0$ that minimize a least-squares cost function measuring the difference between the confirmed case data and $C^{n_i}$ ($i = 1, 2, \ldots, p - 7$), where $C^{n_i}$ are the numerical solutions from equations (5)-(7) at the corresponding times. We compute the optimal parameter values with lsqcurvefit, where $[c, \boldsymbol{\beta}, U^0]$ are the optimized parameters, $[c_0, \boldsymbol{\beta}_0, U^0_0]$ are the initial guesses of the parameters for the function tSUCmodel, $C_{\mathrm{data}}$ is the real cumulative confirmed case data at times $T_{\mathrm{data}}$, lb is the lower bound, and ub is the upper bound.
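The two ingredients of the fit can be illustrated with a small Python sketch. It rests on assumptions: the interpolation is taken to be piecewise linear and held constant outside the sampled range, the cost is taken to be a plain sum of squared residuals, and the names `beta_interp` and `squared_error` are hypothetical. A nonlinear least-squares solver (lsqcurvefit in the paper) would then adjust c, the β samples, and U^0 to reduce this cost.

```python
from bisect import bisect_right
from typing import Sequence


def beta_interp(sample_days: Sequence[float],
                sample_betas: Sequence[float], t: float) -> float:
    """Piecewise linear beta(t) through the samples (assumed scheme)."""
    if t <= sample_days[0]:
        return sample_betas[0]
    if t >= sample_days[-1]:
        return sample_betas[-1]
    j = bisect_right(sample_days, t) - 1  # sample_days[j] <= t < sample_days[j + 1]
    t0, t1 = sample_days[j], sample_days[j + 1]
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * sample_betas[j] + w * sample_betas[j + 1]


def squared_error(model_values: Sequence[float],
                  data_values: Sequence[float]) -> float:
    """Sum of squared residuals between model and data (assumed cost)."""
    if len(model_values) != len(data_values):
        raise ValueError("model and data must have the same length")
    return sum((m - d) ** 2 for m, d in zip(model_values, data_values))
```

With a sampling interval q, `sample_days` would hold the check days spaced q apart, so a smaller q means more free β values for the solver to fit.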

Convergence and Stability Tests.
In this section, to verify the accuracy of the proposed algorithm, we perform a convergence test. We generate a reference solution using the following initial data: $N = 300000$, $C^0 = 20000$, $U^0 = 2000$, $c = 1/4$, and $\beta(t) = 0.3 + 0.1t$. Table 1 shows that the proposed method has fourth-order accuracy. Here, $\Delta t = 2^{-2}$ and the final time $T = 4$ are used.

Next, we numerically test the stability of the tSUC model. Figures 4(a)-4(c) show the computational results for S, U, and C, respectively. The results are obtained using $\Delta t = 0.1, 1, 10$ and $\beta(t) = 0.3 + 0.1t/T$ up to time $T = 30$. The results show that the proposed algorithm yields nonnegative solutions even with large time steps.
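A convergence test of this kind can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: the initial susceptible population is taken as N − U^0 − C^0, the reference solution uses a much smaller step (2^{-10}), the error is the largest absolute difference in S, U, and C at the final time, and the names `rhs`, `rk4_solve`, and `observed_orders` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

State = Tuple[float, float, float]  # (S, U, C)


def rhs(t: float, y: State, c: float, N: float) -> State:
    """tSUC right-hand side with beta(t) = 0.3 + 0.1 t."""
    S, U, _C = y
    infection = (0.3 + 0.1 * t) * S * U / N
    return (-infection, infection - c * U, c * U)


def rk4_solve(y: State, c: float, N: float, dt: float, T: float) -> State:
    """Integrate from t = 0 to t = T with classical RK4; return the final state."""
    for n in range(round(T / dt)):
        t = n * dt
        k1 = rhs(t, y, c, N)
        k2 = rhs(t + dt / 2, tuple(a + dt / 2 * k for a, k in zip(y, k1)), c, N)
        k3 = rhs(t + dt / 2, tuple(a + dt / 2 * k for a, k in zip(y, k2)), c, N)
        k4 = rhs(t + dt, tuple(a + dt * k for a, k in zip(y, k3)), c, N)
        y = tuple(a + dt / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(y, k1, k2, k3, k4))
    return y


def observed_orders(dts: Sequence[float], ref_dt: float = 2.0 ** -10) -> List[float]:
    """Estimate the convergence order from errors at successive step sizes."""
    N, C0, U0, c, T = 300000.0, 20000.0, 2000.0, 0.25, 4.0
    y0 = (N - U0 - C0, U0, C0)  # assumption: S^0 = N - U^0 - C^0
    ref = rk4_solve(y0, c, N, ref_dt, T)
    errors = [max(abs(a - b) for a, b in zip(rk4_solve(y0, c, N, dt, T), ref))
              for dt in dts]
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]


orders = observed_orders([2.0 ** -2, 2.0 ** -3, 2.0 ** -4])
```

Each entry of `orders` compares the error at one step size with the error at half that step size; values near 4 are what a fourth-order method should produce.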

Simulation on Real Data.
For all simulations, it is assumed that the time step size is $\Delta t = 0.1$, that β and $U^0$ are positive real numbers, and that the upper and lower bounds of c are 1 and 1/14, respectively. Because c is the reciprocal of the average time until an unidentified infected person is confirmed, it can be inferred from the period of symptom onset and epidemiological investigation. In this section, simulations are performed to confirm that the proposed method can estimate, from actual confirmed cases, the optimal parameters that describe the change in the number of unidentified infected cases over a long period of time and the change in the number of new confirmed cases. First, data smoothing is performed on the actual confirmed cases for parameter estimation. The actual confirmed case data and the smoothed confirmed case data are called $C^{\mathrm{raw}}$ and $C^{\mathrm{ref}}$, respectively. Figure 5 shows $C^{\mathrm{raw}}$ and $C^{\mathrm{ref}}$ from March 2, 2020, to July 23, 2021, in the Republic of Korea [4].
Next, we estimate the parameters β and c using the smoothed $C^{\mathrm{ref}}_i$ ($i = 1, 2, \ldots, 502$) and the following initial conditions: .71. Figure 6 shows the time-dependent β(t) when $q = 60$. In equation (2), the rate of change of U depends on the value of β(t)S/N − c. If β(t)S/N − c is positive, then U increases, and if β(t)S/N − c is negative, then U decreases. Figure 7 shows how the estimated values change with q. In Figure 7(c), we can observe that the estimated value of β(t)S/N − c differs with q. When q is 2, the number of checkpoints is very large, resulting in overfitting. When q is 30, the number of checkpoints is relatively small, so only simple characteristics are captured. Therefore, an appropriate q must be used.
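The role of β(t)S/N − c follows directly from equation (2) by factoring out U. As a short worked step, assuming U(t) > 0 and S(t) > 0:

```latex
\[
\frac{dU}{dt}
  = \frac{\beta(t) S(t) U(t)}{N} - c\,U(t)
  = \left( \frac{\beta(t) S(t)}{N} - c \right) U(t),
\qquad
\frac{dU}{dt} > 0 \iff \beta(t) > \frac{c N}{S(t)} .
\]
```

The sign of the bracketed factor therefore decides whether the unidentified infected population is growing or shrinking at time t, and the threshold on β(t) rises as S(t) is depleted.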

Journal of Healthcare Engineering
We use $q = 7$ to characterize the epidemic in detail without overfitting. Figure 8 shows the calculated results. It can be observed that the estimated parameters represent the characteristics of the epidemic in detail, and the β(t)S/N − c values reflect the changes in the new confirmed cases.

Discussion
In this section, we discuss the advantages and disadvantages of the proposed method and future work. We proposed the tSUC model, which enables long-time analysis; thus, it is possible to estimate the transmission over time and the changes in the number of unidentified infected cases over a long period, both in the present and in the past. If we can estimate the number of unidentified infected cases and its long-time trend, then we can plan the number of COVID-19 testing stations, quarantine policies suited to the transmission level, and incentives for COVID-19 testing and vaccination. However, the proposed model cannot represent changes in detailed factors such as vaccines or cultural factors, and it does not consider birth and death.

The proposed model uses the least-squares method because it is simple and easy to use. However, if a fitting function such as a deep learning neural network is used, more effective results may be obtained for predicting the future [13]. A fractional-order epidemic model has a memory effect because it uses past history; thus, we think a fractional-order version of the proposed model would be more effective for epidemics with an incubation period, such as COVID-19 [10,11]. The proposed model is affected by the period it takes for the infection of an unidentified infected person to be confirmed. Confirming COVID-19 infection from X-ray images with a hybrid deep learning model can confirm the infection quickly, simply, and accurately [16].

Therefore, we consider the following future work. First, we will use deep learning neural networks to solve the tSUC model in order to analyze infectious diseases such as COVID-19 and predict the future. Second, we will propose a model specialized for infectious diseases with an incubation period by extending the proposed model to fractional order. Finally, we will propose a modified tSUC model that considers birth and death.

Conclusion
We proposed a long-time analysis of a tSUC model for the COVID-19 pandemic. The parameters of an epidemic model are important indicators of the characteristics of an epidemic, and they change over time due to several factors. Therefore, estimating epidemic parameters and fixing each as a single value can represent only simple characteristics; it is difficult to express detailed characteristics or to perform long-time analysis. To address the problem of time-varying epidemic parameters, one of the parameters can be made time-dependent when estimating the optimal parameters for long-time epidemic analysis, allowing detailed characterization of epidemics over time.
We demonstrated the effectiveness of the proposed method by estimating epidemic parameters from real data and performing several tests to confirm that the estimated parameters reflect the characteristics of the real data. The numerical results confirm that the estimated parameters are suitable for long-time analysis of the epidemic and that more detailed data analysis can be performed over the long term than with the methods used in existing SUC model studies.