The New Novel Discrete Distribution with Application on COVID-19 Mortality Numbers in Kingdom of Saudi Arabia and Latvia

This paper introduces a new discrete statistical model for the coronavirus disease 2019 (COVID-19) mortality numbers in Saudi Arabia and Latvia, with the aim of providing a better fit for the death counts caused by COVID-19 infections. The model has three parameters and is formulated by combining the exponential distribution with the extended odd Weibull family, giving the discrete extended odd Weibull exponential (DEOWE) distribution. We derive some statistical properties of the new distribution, such as its linear representation and quantile function. The maximum likelihood estimation (MLE) method is applied to estimate the unknown parameters of the DEOWE distribution. Three datasets of COVID-19 mortality in Saudi Arabia and Latvia are used as applications; these real data examples illustrate the usefulness of the distribution for fitting and modeling this kind of discrete data. We also provide graphical plots of the data to support the results.


Introduction
Modeling pandemics is important because it helps researchers understand how each virus spreads and how it affects humanity. A new virus has recently come to the fore, Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), which causes COVID-19. This virus has attracted the interest of many researchers, who have made many attempts to model daily deaths worldwide caused by COVID-19 infection. As an example of these studies, Al-Babtain et al. [1] introduced a natural discrete Lindley distribution and studied the mortality numbers in Egypt from 8 March to 30 April 2020. Hasab et al. [2] studied the COVID-19 mortality numbers by using the susceptible infected recovered (SIR) epidemic dynamics of the COVID-19 pandemic to model COVID-19 infections in Egypt. Algarni et al. [3] discussed the type-I half-logistic Burr XG family with an application to COVID-19 data. Almetwally [4] discussed the odd Weibull inverse Topp-Leone distribution with applications to COVID-19 data. Almetwally et al. [5] discussed a new distribution with applications to the COVID-19 mortality rate in two different countries. El-Morshedy et al. [6] studied a new discrete distribution, called discrete generalized Lindley, to analyze the counts of daily COVID-19 cases in Hong Kong and daily new deaths in Iran. Maleki et al. [7] used an autoregressive time-series model based on the two-scale mixture normal distribution to predict the recovered and reported COVID-19 cases. Nesteruk [8] forecast the daily new COVID-19 cases in China by using the SIR mathematical model. Batista [9] used a logistic growth regression model to estimate the final size and the peak time of the COVID-19 epidemic. Muse et al. [10] discussed modeling the COVID-19 mortality rate with a new versatile modification of the log-logistic distribution. Liu et al. [11] presented a new statistical model called the arcsine-modified Weibull distribution for modeling COVID-19 patients' data.
Afify and Mohamed [12] developed the extended odd Weibull exponential (EOWE) distribution for data modeling in many sciences such as architecture, medicine, and reliability. The EOWE distribution is a flexible model whose density function can be left-skewed, symmetrical, right-skewed, or reversed-J shaped; see the work of Alshenawy et al. [13]. Its hazard rate function (HRF) can be decreasing, constant, increasing, upside-down bathtub, or J-shaped, and it can also take the bathtub and modified bathtub shapes that are quite important in reliability and durability studies. For more details, see the work of Alshenawy et al. [13]. Generally speaking, most distributions used to model such data need four or five parameters to achieve these hazard rate shapes. The DEOWE distribution has only three parameters, and it can be used to analyze censored data because its HRF and cumulative distribution function (CDF) have simple closed forms. The CDF and probability density function (PDF) of the EOWE distribution are given, respectively, by $F(x; \alpha, \beta, \lambda) = 1 - \{1 + \beta [\exp(\lambda x) - 1]^{\alpha}\}^{-1/\beta}$ and $f(x; \alpha, \beta, \lambda) = \alpha \lambda \exp(\alpha \lambda x) [1 - \exp(-\lambda x)]^{\alpha - 1} \{1 + \beta [\exp(\lambda x) - 1]^{\alpha}\}^{-1/\beta - 1}$, where $x > 0$ and α, β, and λ are positive parameters.
Why we need discrete distributions is a question that any researcher would ask. The reason is that most current continuous distributions do not provide reliable results when modeling the COVID-19 scenarios, because death counts and daily new cases display extreme dispersion.
Many authors have introduced discrete distributions to overcome the deficiencies of continuous distributions in modeling mortality numbers. Para and Jan [14] introduced the discrete Burr-type XII and discrete Lomax distributions; the discrete Lomax (DL) distribution exhibits heavy tails and can be helpful in medical science and other fields. Other examples are the discrete Burr (DB) distribution presented by Krishna and Pundir [15], the discrete Lindley (DL) distribution introduced by Gómez-Déniz and Calderín-Ojeda [16], the discrete generalized exponential (DGEx) distribution presented by Nekoukhou et al. [17], the natural discrete Lindley (NDL) distribution introduced by Al-Babtain et al. [1], and the discrete Gompertz exponential (DGzEx) distribution presented by El-Morshedy et al. [6]. Gillariose et al. [18] introduced the discrete Weibull Marshall-Olkin family of distributions with properties, characterizations, and applications. The discrete Marshall-Olkin generalized exponential distribution was presented by Almetwally et al. [19]. Al-Babtain et al. [20] discussed the estimation of the parameters of two discrete models, the discrete Poisson-Lindley and discrete Lindley distributions, with some applications.
To convert a continuous distribution to a discrete one, a variety of methods are possible. The survival discretization approach is the most widely used technique for generating discrete distributions. It requires the existence of a CDF, a continuous and nonnegative survival function, and time divided into unit intervals. In Roy [21], the probability mass function (PMF) of a discrete distribution is described as $P(X = x) = S(x; \Theta) - S(x + 1; \Theta)$, $x = 0, 1, 2, \ldots$, where $S(x; \Theta) = 1 - F(x; \Theta)$, $F(x; \Theta)$ is the CDF of the continuous distribution, and Θ is a vector of parameters. The random variable X is said to have the discrete distribution if its CDF is given by $P(X \le x) = F(x + 1; \Theta)$. The hazard rate is given by $hr(x) = P(X = x)/S(x)$. The reversed failure rate of the discrete distribution is given as $rfr(x) = P(X = x)/[1 - S(x + 1)]$.
The novelty and motivation of this paper is to find a statistical model that fits the COVID-19 mortality numbers in Saudi Arabia and Latvia well by introducing a new discrete model, namely, the DEOWE distribution.
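The survival discretization step can be illustrated with a minimal Python sketch. The helper names `discretize` and `exponential_survival` are hypothetical, and the exponential baseline is chosen only because its discretized version is the familiar geometric distribution, which makes the result easy to check.

```python
import math

def discretize(survival, x):
    """PMF at integer x >= 0 by survival discretization: S(x) - S(x + 1)."""
    return survival(x) - survival(x + 1)

def exponential_survival(rate):
    """Survival function S(t) = exp(-rate * t) of an exponential baseline."""
    return lambda t: math.exp(-rate * t)

# Illustrative baseline: exponential with rate 0.5 (not a value from the paper).
S = exponential_survival(0.5)
pmf = [discretize(S, x) for x in range(200)]

# Discretizing an exponential gives a geometric PMF (1 - q) * q**x with q = exp(-rate).
q = math.exp(-0.5)
geometric = [(1 - q) * q ** x for x in range(200)]

# Discrete hazard rate hr(x) = P(X = x) / S(x); constant for this baseline.
hazard = [pmf[x] / S(x) for x in range(200)]
```

Any continuous survival function on the nonnegative half-line can be passed to `discretize` in the same way.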
The point estimation of the unknown parameters is discussed using the MLE method. We also obtain expected values for the daily mortality numbers. The remainder of this article is organized as follows. In Section 2, we define the DEOWE distribution. The linear representation of its PMF is obtained in Section 3, along with some of its statistical properties. The MLE method is used for parameter estimation in Section 4. In Section 5, a simulation study assesses how well the estimates recover the true values of the parameters by evaluating the relative bias (Rbias) and mean square error (MSE) of the estimation method. Three real datasets of mortality numbers are used as applications in Section 6. These three applications show the efficiency of the DEOWE distribution relative to other distributions by evaluating the information criteria, the chi-square values, and the P values for all distributions. Finally, conclusions and the major findings are given in Section 7.

DEOWE Distribution
In this section, we introduce the DEOWE distribution and obtain its PMF and CDF. Figures 1 and 2 show the PMF and HRF of the distribution for different values of the parameters. The DEOWE distribution is obtained by the survival discretization method. Let $S(x; \vartheta) = 1 - F(x; \vartheta)$ denote the survival function of a baseline model with parameter vector ϑ; the CDF of the DEOWE distribution is then given by $F(x; \alpha, \beta, \lambda) = 1 - \{1 + \beta [\exp(\lambda (x + 1)) - 1]^{\alpha}\}^{-1/\beta}$, $x = 0, 1, 2, \ldots$ (4). The corresponding PMF of (4) is defined by $P(X = x) = \{1 + \beta [\exp(\lambda x) - 1]^{\alpha}\}^{-1/\beta} - \{1 + \beta [\exp(\lambda (x + 1)) - 1]^{\alpha}\}^{-1/\beta}$ (5), where α, β, and λ are positive parameters. A random variable with PMF (5) is denoted by X ∼ DEOWE(α, β, λ). The corresponding HRF of the DEOWE distribution is defined by $hr(x) = 1 - \{[1 + \beta (\exp(\lambda (x + 1)) - 1)^{\alpha}] / [1 + \beta (\exp(\lambda x) - 1)^{\alpha}]\}^{-1/\beta}$.
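A minimal Python sketch of the DEOWE functions follows. It assumes that the EOWE baseline has survival function S(x) = {1 + β[exp(λx) − 1]^α}^(−1/β), which is the usual form of the extended odd Weibull family with an exponential baseline; the function names and the parameter values are illustrative and are not taken from the paper.

```python
import math

def eowe_survival(x, alpha, beta, lam):
    """Assumed EOWE survival: {1 + beta * (exp(lam * x) - 1)**alpha}**(-1 / beta)."""
    return (1.0 + beta * math.expm1(lam * x) ** alpha) ** (-1.0 / beta)

def deowe_pmf(x, alpha, beta, lam):
    """P(X = x) = S(x) - S(x + 1) for x = 0, 1, 2, ..."""
    return eowe_survival(x, alpha, beta, lam) - eowe_survival(x + 1, alpha, beta, lam)

def deowe_cdf(x, alpha, beta, lam):
    """P(X <= x) = 1 - S(x + 1)."""
    return 1.0 - eowe_survival(x + 1, alpha, beta, lam)

def deowe_hrf(x, alpha, beta, lam):
    """Discrete hazard rate hr(x) = P(X = x) / S(x)."""
    return deowe_pmf(x, alpha, beta, lam) / eowe_survival(x, alpha, beta, lam)

# Illustrative parameter values (not estimates from the paper).
alpha, beta, lam = 1.5, 1.5, 0.05
support = range(100)
probabilities = [deowe_pmf(x, alpha, beta, lam) for x in support]
hazards = [deowe_hrf(x, alpha, beta, lam) for x in support]
```

Because the PMF is a difference of survival values, its partial sums telescope to the CDF, which gives a quick numerical check of an implementation.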

Mathematical Properties
This section of the paper introduces the linear representation of the DEOWE distribution with its quantile function.

Linear Representation.
In this section, we derive a linear representation for the PMF of the proposed distribution. Such a representation is normally used to derive different statistical properties of a model. Unfortunately, the resulting form does not correspond to any known statistical model, and it is mathematically difficult to use for deriving further statistical properties. For the proposed distribution, there are three different cases for this linear representation.

Quantile Function.
The quantile function (QF) of the DEOWE distribution is the inverse function of the CDF, and it is given as follows: $Q(u) = \frac{1}{\lambda} \ln\{1 + [((1 - u)^{-\beta} - 1)/\beta]^{1/\alpha}\} - 1$, $0 < u < 1$. The three quartiles (Q) of the DEOWE distribution can be obtained by setting u = 0.25, 0.5, and 0.75 in equation (11).
Bowley's skewness (BS) and Moors' kurtosis (MK) can be calculated from the QF, respectively, as follows: $BS = [Q(3/4) - 2Q(1/2) + Q(1/4)] / [Q(3/4) - Q(1/4)]$ and $MK = [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]$. Table 1 shows the numerical mean, variance, BS, and MK of the distribution for different parameter values. These values are consistent with the plots in Figure 1.
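The quantile-based measures can be sketched in Python as follows. The sketch assumes the DEOWE CDF F(x) = 1 − {1 + β[exp(λ(x + 1)) − 1]^α}^(−1/β) and inverts it algebraically; the function names and parameter values are illustrative. The integer-valued quantile takes the smallest count whose CDF reaches u, which is one common convention for discrete distributions.

```python
import math

def deowe_cdf(x, alpha, beta, lam):
    """Assumed DEOWE CDF: 1 - {1 + beta * (exp(lam * (x + 1)) - 1)**alpha}**(-1 / beta)."""
    return 1.0 - (1.0 + beta * math.expm1(lam * (x + 1)) ** alpha) ** (-1.0 / beta)

def deowe_quantile_real(u, alpha, beta, lam):
    """Real-valued solution x of deowe_cdf(x) = u, for 0 < u < 1."""
    inner = ((1.0 - u) ** (-beta) - 1.0) / beta
    return math.log1p(inner ** (1.0 / alpha)) / lam - 1.0

def deowe_quantile(u, alpha, beta, lam):
    """Smallest integer x >= 0 with deowe_cdf(x) >= u."""
    return max(0, math.ceil(deowe_quantile_real(u, alpha, beta, lam)))

def bowley_skewness(q):
    """Quartile-based skewness for a quantile function q."""
    return (q(0.75) - 2.0 * q(0.5) + q(0.25)) / (q(0.75) - q(0.25))

def moors_kurtosis(q):
    """Octile-based kurtosis for a quantile function q."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

# Illustrative parameter values (not estimates from the paper).
alpha, beta, lam = 1.5, 1.5, 0.05
q = lambda u: deowe_quantile_real(u, alpha, beta, lam)
bs = bowley_skewness(q)
mk = moors_kurtosis(q)
median_count = deowe_quantile(0.5, alpha, beta, lam)
```

Both measures depend only on quantiles, so they exist even when moments are hard to obtain in closed form.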

Parameter Estimation
In this section, we use the MLE method to estimate the unknown parameters of the DEOWE distribution. Assume that $x_1, \ldots, x_n$ is a random sample of counts from the DEOWE distribution with parameters α, β, and λ. The log-likelihood function then has the following form: $\ell(\Omega) = \sum_{i=1}^{n} \ln( \{1 + \beta [\exp(\lambda x_i) - 1]^{\alpha}\}^{-1/\beta} - \{1 + \beta [\exp(\lambda (x_i + 1)) - 1]^{\alpha}\}^{-1/\beta} )$, where Ω = (α, β, λ) is the vector of the DEOWE parameters. The MLEs are obtained by solving the normal equations $\partial \ell / \partial \alpha = 0$, $\partial \ell / \partial \beta = 0$, and $\partial \ell / \partial \lambda = 0$. These equations cannot be solved explicitly. Hence, a nonlinear optimization algorithm such as the Newton-Raphson method is used.
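As a rough illustration of numerical maximization, the sketch below evaluates the DEOWE log-likelihood and improves it with a crude derivative-free coordinate search on the log scale of each parameter. This is a stand-in chosen to keep the example dependency-free; it is not the Newton-Raphson method named in the text, and the function names, starting values, and step sizes are assumptions. The sample is the Latvia series listed in Application 3, and the assumed survival function is S(x) = {1 + β[exp(λx) − 1]^α}^(−1/β).

```python
import math

def eowe_survival(x, alpha, beta, lam):
    """Assumed EOWE survival: {1 + beta * (exp(lam * x) - 1)**alpha}**(-1 / beta)."""
    return (1.0 + beta * math.expm1(lam * x) ** alpha) ** (-1.0 / beta)

def log_likelihood(data, alpha, beta, lam):
    """Sum of log P(X = x_i); returns -inf where the PMF is not computable."""
    total = 0.0
    try:
        for x in data:
            p = eowe_survival(x, alpha, beta, lam) - eowe_survival(x + 1, alpha, beta, lam)
            if p <= 0.0:
                return -math.inf
            total += math.log(p)
    except OverflowError:
        return -math.inf
    return total

def fit_coordinate_search(data, start=(1.0, 1.0, 0.1), rounds=200):
    """Multiply one parameter at a time by exp(+/- step); halve the step when stuck."""
    theta = list(start)
    best = log_likelihood(data, *theta)
    step = 0.5
    for _ in range(rounds):
        improved = False
        for j in range(3):
            for factor in (math.exp(step), math.exp(-step)):
                trial = theta[:]
                trial[j] *= factor  # multiplicative moves keep parameters positive
                value = log_likelihood(data, *trial)
                if value > best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2.0
            if step < 1e-6:
                break
    return tuple(theta), best

# Daily deaths in Latvia as listed in Application 3.
latvia = [11, 9, 11, 10, 2, 8, 12, 12, 10, 10, 5, 2, 12, 11, 13, 3, 5, 6, 5, 10,
          6, 14, 9, 1, 8, 3, 3, 9, 17, 18, 5, 0, 4]
start_loglik = log_likelihood(latvia, 1.0, 1.0, 0.1)
fitted, fitted_loglik = fit_coordinate_search(latvia)
```

A search of this kind only finds a local improvement over the starting point, so its output should not be read as the estimates reported in the paper's tables.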

Simulation Studies
This part of the paper is devoted to the Monte Carlo simulation procedure. The simulation study is performed for the classical estimation method, MLE, for estimating the parameters of the DEOWE distribution, using the R language. The Monte Carlo experiments are based on 10,000 random samples generated from the DEOWE distribution, where X has a DEOWE lifetime, for different true values of the parameters and different sample sizes n = 20, 40, 70, and 100.
In every table, we evaluate the Rbias and MSE of the estimators. Tables 2-4 summarize the simulation results of the point estimation method in this paper. We use the Rbias and MSE values to compare different parameter values and their effect on the point estimates.
In every table, we fix the value of β and increase the values of both λ and α, and then we study the effect of increasing and decreasing these values. Concluding remarks at the end of this section illustrate the impact of increasing and decreasing the parameter values.
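The mechanics of such a Monte Carlo experiment can be sketched in Python under explicit assumptions. Samples are drawn by taking the integer part of an EOWE variate generated by inverse transform, assuming the survival function S(x) = {1 + β[exp(λx) − 1]^α}^(−1/β); Rbias and MSE use their usual definitions, the averages of (estimate − truth)/truth and (estimate − truth)². To stay short, the estimator studied here is the sample proportion of days with at most 9 deaths rather than the MLE, and all names, the seed, and the parameter values are illustrative.

```python
import math
import random

def eowe_survival(x, alpha, beta, lam):
    """Assumed EOWE survival: {1 + beta * (exp(lam * x) - 1)**alpha}**(-1 / beta)."""
    return (1.0 + beta * math.expm1(lam * x) ** alpha) ** (-1.0 / beta)

def sample_deowe(rng, n, alpha, beta, lam):
    """Floor of an inverse-transform EOWE draw, so that P(X = x) = S(x) - S(x + 1)."""
    draws = []
    for _ in range(n):
        u = rng.random()
        inner = ((1.0 - u) ** (-beta) - 1.0) / beta
        t = math.log1p(inner ** (1.0 / alpha)) / lam
        draws.append(int(math.floor(t)))
    return draws

def relative_bias(estimates, truth):
    """Average of (estimate - truth) / truth over the replications."""
    return sum((e - truth) / truth for e in estimates) / len(estimates)

def mean_square_error(estimates, truth):
    """Average of (estimate - truth)**2 over the replications."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)

rng = random.Random(2021)            # fixed seed for reproducibility
alpha, beta, lam = 1.5, 1.5, 0.05    # illustrative true values
n, replications = 50, 2000
truth = 1.0 - eowe_survival(10, alpha, beta, lam)   # P(X <= 9)

estimates = []
for _ in range(replications):
    sample = sample_deowe(rng, n, alpha, beta, lam)
    estimates.append(sum(1 for x in sample if x <= 9) / n)

rbias = relative_bias(estimates, truth)
mse = mean_square_error(estimates, truth)
```

Replacing the proportion estimator with a numerical MLE routine and looping over several sample sizes gives the layout of a table of Rbias and MSE values.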

Concluding Remarks on Simulation Results.
In this section of the paper, we present the major findings deduced from the simulation tables: the effect of increasing the sample size and the effect of increasing the true values of the parameters used in the simulation study. We also discuss the effect of fixing two of the parameters and increasing the third one. The following points can be noted from Tables 2-4:
(1) As the sample size increases, the results in Tables 2-4 confirm the consistency property of the MLEs.
(3) In Table 3, fixing β = 1.5 and, for each fixed value of α = 0.5, 1.5, 5, increasing λ from 0.05 to 0.5, we deduce that the MSE and Rbias of the parameters increase in most cases.
(4) Increasing the value of β from 1.5 to 5, as in Table 4, while keeping the sample size fixed for both values of β, we deduce that the MSE and Rbias of the parameters increase in most cases.

Applications to COVID-19 Data
In this section of the paper, we present two real data applications on the COVID-19 mortality numbers in Saudi Arabia and a third application on the mortality numbers in Latvia. One Saudi Arabia sample is taken from the first wave and the other from the second wave. The first application covers the period from 26 December to 17 February 2021 for the infections in Saudi Arabia. We used this period because the recording of the infection numbers was accurate, as it was the peak of the second wave in Saudi Arabia; in the earlier months of infection, the recording of the number of deaths was not accurate. The second dataset covers the period from 30 May 2020 to 20 August 2020. We chose this period because it was the start of the COVID-19 outbreak in Saudi Arabia, when the mortality numbers also began to increase. This period is considered the peak of the first wave in Saudi Arabia, which makes it very important to model. We also evaluated the information criteria to show how the proposed distribution compares with its competitors.
A comparison between distributions must be based on agreed criteria. We use the Akaike information criterion (AIC), see [25]; the Bayesian information criterion (BIC), see [26]; the Hannan-Quinn information criterion (HQIC), see [27]; and the consistent Akaike information criterion (CAIC), see [28]. All these criteria were used to compare the goodness of fit of the proposed model with that of the competing distributions. The AIC is given by $AIC = 2k - 2\ell$. The BIC is calculated as follows: $BIC = k \ln(n) - 2\ell$. The HQIC is $HQIC = 2k \ln(\ln(n)) - 2\ell$, where k is the number of model parameters, n is the sample size, and ℓ refers to the log-likelihood function evaluated at the MLEs. Table 6 provides the values of AIC, BIC, CAIC, HQIC, and chi-square (χ²) with its degrees of freedom and P value for all models fitted to the real dataset of Saudi Arabia. Figure 3 compares these distributions to identify the best one; Figures 3-5 show graphical plots of the data and the PMF of the DEOWE distribution together with the competing distributions with various numbers of parameters. Figure 6 shows the CDF of the distributions for the random variable X, Figure 7 shows the quantile function as a function of x, where x is the number of deaths per day, and Figure 8 shows graphical plots of the data and the PMF of the DEOWE distribution.
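The three criteria whose formulas are stated above can be computed with a few lines of Python. The log-likelihood values and model labels below are hypothetical placeholders, not results from the paper; CAIC is left out because its formula is not given in the text.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * loglik."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * loglik."""
    return k * math.log(n) - 2 * loglik

def hqic(loglik, k, n):
    """Hannan-Quinn information criterion: 2k * ln(ln(n)) - 2 * loglik."""
    return 2 * k * math.log(math.log(n)) - 2 * loglik

# Hypothetical maximized log-likelihoods and parameter counts for n = 33 observations.
n = 33
candidates = {
    "three-parameter model": (-95.0, 3),
    "one-parameter model": (-99.5, 1),
}
table = {
    name: (aic(loglik, k), bic(loglik, k, n), hqic(loglik, k, n))
    for name, (loglik, k) in candidates.items()
}
best_by_aic = min(table, key=lambda name: table[name][0])
```

Lower values indicate a better trade-off between fit and the number of parameters, so the preferred model is the one that minimizes each criterion.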

Application 2.
In this section, the DEOWE distribution is fitted to another set of COVID-19 mortality numbers in Saudi Arabia, covering 83 days of infection recorded from 30 May 2020 to 20 August 2020 (see Table 7). We compare the fitting results with those of the discrete generalized exponential (DGE) distribution, see the work of Nekoukhou et al. [17], the discrete Marshall-Olkin generalized exponential (DMOGE) distribution introduced by Almetwally et al. [19], and the exponentiated discrete Weibull (EDW) distribution introduced by Nekoukhou et al. [29].

Application 3.
In this section, the DEOWE distribution is fitted to another set of COVID-19 mortality numbers in Latvia, covering 33 days of infection recorded from 12 May 2021 to 13 April 2021. We chose this period specifically because it was the peak of the second wave of the COVID-19 infection in Latvia. Table 9 contains descriptive statistics for these data, Table 10 contains the dataset used in this application together with the frequency and probability of each death number, and Table 11 contains the MLEs of the parameters, the chi-square values and P values, and the information criteria for each distribution. The data are as follows: 11, 9, 11, 10, 2, 8, 12, 12, 10, 10, 5, 2, 12, 11, 13, 3, 5, 6, 5, 10, 6, 14, 9, 1, 8, 3, 3, 9, 17, 18, 5, 0, 4. These data were collected from the World Health Organization, and the numbers represent the number of deaths per day. For more information, see the following link: https://covid19.who.int/. We compare the fitting results with those of the discrete generalized exponential (DGE) distribution, see the work of Nekoukhou et al. [17], the discrete Marshall-Olkin generalized exponential (DMOGE) distribution introduced by Almetwally et al. [19], and the exponentiated discrete Weibull (EDW) distribution introduced by Nekoukhou et al. [29].
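A short Python sketch can summarize the Latvia series listed above: it builds the frequency of each death count and computes the sample mean, the sample variance, and their ratio (the index of dispersion). The variable names are illustrative; only the data values come from the text.

```python
from collections import Counter

# Daily deaths in Latvia as listed in the text (33 days).
latvia = [11, 9, 11, 10, 2, 8, 12, 12, 10, 10, 5, 2, 12, 11, 13, 3, 5, 6, 5, 10,
          6, 14, 9, 1, 8, 3, 3, 9, 17, 18, 5, 0, 4]

n = len(latvia)
frequency = Counter(latvia)                       # observed frequency of each count
relative_frequency = {x: c / n for x, c in sorted(frequency.items())}

mean = sum(latvia) / n
variance = sum((x - mean) ** 2 for x in latvia) / (n - 1)   # unbiased sample variance
dispersion_index = variance / mean                # values above 1 indicate overdispersion
```

For these data the variance exceeds the mean, which is the overdispersion that motivates a flexible discrete model over simpler one-parameter alternatives.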

Concluding Remarks on the Real Data
(1) From the goodness-of-fit values in Tables 6 and 12, we deduce that the DEOWE distribution has the lowest chi-square, AIC, and CAIC values among all distributions for the three applications.
(2) From the goodness-of-fit values in Tables 6 and 12, we deduce that the DEOWE distribution has the highest P value among all of its competitors for the three applications.
(3) For application one, Figures 3 and 4 show that the one- and two-parameter distributions fit the data poorly. In contrast, the three-parameter DEOWE distribution in Figure 5 fits the data better than all its competitors.
(4) For application two, Figure 9 shows that the three-parameter DEOWE distribution fits the data better than all its competitors.
(5) For application two, Figure 10 shows the CDF of the distributions for the random variable X, while Figure 11 shows the quantile function as a function of x, where x is the number of deaths per day. Figure 12 shows graphical plots of the data and the PMF of the DEOWE distribution.
(6) For application three, Figure 13 shows that the three-parameter DEOWE distribution fits the data better than all its competitors. For more information about the PMFs of the other distributions, see the Appendix.
(7) For application three, Figure 14 shows the CDF of the distributions for the random variable X, while Figure 15 shows the quantile function as a function of x, where x is the number of deaths per day. Figure 16 shows graphical plots of the data and the PMF of the DEOWE distribution.

Conclusion
In this paper, we introduced a new distribution called the DEOWE distribution; the motivation for this work was the lack of flexibility in other distributions. We studied its statistical properties and obtained a linear representation for its PMF and the associated quantile function. We used the MLE method to estimate the distribution parameters α, β, and λ. A real dataset of the mortality numbers in the Kingdom of Saudi Arabia (KSA) was also considered to assess the performance of the DEOWE distribution. The fit to the real dataset was compared with that of its competitors. From the goodness-of-fit values, we deduce that the DEOWE distribution has the lowest chi-square, AIC, and CAIC for the first dataset and, for the second dataset, the lowest chi-square, AIC, CAIC, BIC, and HQIC and the highest P value among all of its competitors.
This result indicates that the DEOWE distribution fits the mortality numbers better than the other competing distributions. We also plotted the data with the DEOWE distribution and the competing distributions, and the plots support the results of the goodness-of-fit measurements.