Discrete Generalized Inverted Exponential Distribution: Case Study Color Image Segmentation

We present in this paper a discrete analogue of the continuous generalized inverted exponential distribution, denoted the discrete generalized inverted exponential (DGIE) distribution. In reliability analysis, it is often cumbersome or difficult to measure a large number of observations on a continuous scale. A number of discrete distributions exist in the literature; however, these distributions have difficulty properly fitting large amounts of data in a variety of fields. The presented DGIE(β, θ) fits data more efficiently than some existing distributions. In this study, some basic distributional properties, moments, the probability function, reliability indices, the characteristic function, and the order statistics of the new DGIE are discussed. Estimation of the parameters is illustrated using the method of moments as well as the maximum likelihood method. Simulations are used to show the performance of the estimated parameters. The model is also examined with two real data sets. In addition, the developed DGIE is applied to color image segmentation, which aims to cluster the pixels into their groups. To evaluate the performance of DGIE, a set of six color images is used, and the method is compared with other image segmentation methods, including the Gaussian mixture model, K-means, and fuzzy subspace clustering. The DGIE provides higher performance than the other competitive methods.


Introduction
In the field of reliability analysis, it is often inconvenient or difficult to measure a large number of observations on a continuous scale. For example, in many practical situations [1][2][3][4][5][6][7][8][9], reliability data are measured in terms of the number of cases, runs, or the number of days remaining for patients with a deadly disease since therapy. For more examples in reliability and lifetime applications, see Meeker and Escobar [10]. There are recognized ways to build up a discrete distribution [11].
Indeed, this technique has been widely applied to generate new discrete distributions; see, for example, [12][13][14][15][16][17][18] and the references cited therein. Abouammoh and Alshingiti [19] introduced a shape parameter into the inverted exponential distribution to obtain the generalized inverted exponential (GIE) distribution. The GIE distribution is derived from the exponentiated Frechet distribution [20]. The hazard rate of the GIE distribution can be decreasing or increasing, depending on its shape parameter. The GIE distribution is effective in modeling many kinds of data and can be applied in several areas, such as horse racing, life testing, queues, and wind speeds [20]. Abouammoh and Alshingiti [19] show that the GIE distribution provides a better fit than the Weibull, gamma, and generalized exponential distributions. The GIE can be widely used in many fields; see, for example, [21,22]. Although a number of discrete distributions exist in the literature, such as the geometric, discrete Lindley, and discrete logistic distributions, they have limitations in effectively fitting large amounts of data in many areas. There is still a need to develop new discretized distributions suitable for applications such as image segmentation, and this motivated us to present a new distribution. The presented distribution, the discrete generalized inverted exponential (DGIE), is constructed from the generalized inverted exponential distribution. Its parameters are estimated using two methods, namely the method of moments and maximum likelihood, and the consistency of the estimated parameters is illustrated using simulation. Based on two data sets, the proposed distribution is more convenient for analyzing the given data than competitive distributions. The proposed distribution is applied to color segmentation, which helps in clustering the pixels into their groups, and it provides higher performance than other competitive methods.
The main contributions of the current study can be summarized as follows:
(1) Present a new distribution, the discrete generalized inverted exponential (DGIE) distribution, to avoid the limitations of other distributions.
(2) Compute the basic distributional properties, moments, probability function, reliability indices, characteristic function, and order statistics of DGIE.
(3) Evaluate the applicability of DGIE by using it to improve color segmentation.
The paper is organized as follows: in Section 2, we introduce the DGIE(β, θ) distribution and present statistical properties such as the failure function and the survival function. In addition, we list further properties of the proposed distribution, such as the moment generating function, moments, quantile, entropy, stress-strength reliability, mean residual lifetime, and order statistics. We analyse the DGIE(β, θ) distribution using two real data sets in Section 3. Finally, the conclusion is given in Section 4.

Discrete Generalized Inverted Exponential Distribution
Definition 1. A random variable X is said to have a discrete generalized inverted exponential distribution with parameters β (β > 0) and θ = e^(−λ), 0 < θ < 1, if its probability mass function (PMF) has the form given in equation (1). We denote this distribution by DGIE(β, θ). Figure 1 illustrates several examples of the probability mass function of the DGIE(β, θ) distribution for various values of β and θ.
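The displayed PMF in equation (1) can be made concrete with a small numerical sketch. The function below assumes the standard survival-function discretization of the continuous GIE distribution, P(X = x) = S(x) − S(x + 1) with S(t) = (1 − θ^(1/t))^β and S(0) = 1; this form is our reconstruction under that assumption, not a formula quoted from the source.

```python
def dgie_pmf(x, beta, theta):
    """PMF of DGIE(beta, theta) via survival discretization (assumed form):
    P(X = x) = S(x) - S(x + 1), with S(t) = (1 - theta**(1/t))**beta and
    S(0) = 1 (the limit t -> 0+).  Support: x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    def surv(t):
        # S(t) = P(X >= t)
        return 1.0 if t == 0 else (1.0 - theta ** (1.0 / t)) ** beta
    return surv(x) - surv(x + 1)
```

Under this discretization the probabilities telescope, so they sum to S(0) = 1 over the support.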

Cumulative Distribution Function.
The cumulative distribution function (CDF) of DGIE(β, θ) follows directly from the PMF, where β (β > 0) and θ = e^(−λ), 0 < θ < 1. Regarding monotonicity, one finds that the ratio p(x + 1)/p(x) is a decreasing function of x, which implies that the distribution is log-concave. Based on this log-concavity (Mark, 1996), the DGIE(β, θ) distribution is unimodal, has an increasing failure rate, and possesses all of its moments.

Statistical Properties.
The r-th moment μ′_r of the discrete generalized inverted exponential distribution DGIE(β, θ) about the origin is obtained from the PMF, as are the moment generating function (MGF) M_X(t), the mean (μ), the second moment, and hence the variance (σ²) of the DGIE(β, θ) distribution. The 3rd and 4th moments follow in the same manner, and from them the measure of skewness α₃ and the measure of kurtosis α₄ of the DGIE(β, θ) distribution are obtained. The probability generating function (PGF), G(t), of the DGIE(β, θ) distribution is computed numerically for simplicity, with the r-th factorial moment obtained from it.
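Because the moments of DGIE(β, θ) do not admit simple closed forms, they are naturally evaluated by truncated summation. The sketch below does this for the mean and variance under the same assumed survival-discretization PMF described earlier (our reconstruction); light-tailed parameter choices (larger β) converge quickly, while heavy-tailed ones may not.

```python
def dgie_mean_var(beta, theta, tol=1e-12):
    """Mean and variance of DGIE(beta, theta) by truncated summation,
    assuming the survival-discretization PMF P(X = x) = S(x) - S(x + 1)
    with S(t) = (1 - theta**(1/t))**beta and S(0) = 1 (a reconstruction).
    The sum stops once S(x) drops below `tol` (capped for safety)."""
    def surv(t):
        return 1.0 if t == 0 else (1.0 - theta ** (1.0 / t)) ** beta
    m1 = m2 = 0.0
    x = 0
    while surv(x) > tol and x < 10 ** 6:
        p = surv(x) - surv(x + 1)
        m1 += x * p
        m2 += x * x * p
        x += 1
    return m1, m2 - m1 * m1
```

Tabulating this function over a grid of (β, θ) values reproduces the kind of numerical mean/variance tables reported in the paper.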

Mathematical Problems in Engineering
Characteristic function: the characteristic function (CF) ϕ_X(w) of the DGIE(β, θ) distribution is obtained similarly. Because the moments do not have closed forms, the mean and variance can only be calculated numerically. We computed the mean and variance for various values of β and θ in Tables 1 and 2, respectively.

Order Statistics.
Order statistics have deep significance in both theoretical and practical aspects of statistics, notably in statistical inference and nonparametric statistics. Let X₁, X₂, …, Xₙ be a random sample from a DGIE(β, θ) distribution, and let X₁:ₙ, X₂:ₙ, …, Xₙ:ₙ be the corresponding order statistics. The CDF of the i-th order statistic, the PMF of the k-th order statistic, and the q-th moment of X_{i:n} then follow from the DGIE CDF in the usual way.
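The order-statistic formulas can be evaluated numerically for any discrete CDF. The sketch below implements the standard identity F_{k:n}(x) = Σ_{j=k}^{n} C(n, j) F(x)^j (1 − F(x))^(n−j) and differences it to get the PMF; the DGIE CDF used in the example is our reconstruction F(x) = 1 − (1 − θ^(1/(x+1)))^β, not a formula quoted from the source.

```python
from math import comb

def order_stat_pmf(x, k, n, cdf):
    """PMF of the k-th order statistic of an i.i.d. sample of size n from a
    discrete distribution with CDF `cdf`, via
    F_{k:n}(x) = sum_{j=k..n} C(n, j) F(x)**j (1 - F(x))**(n - j)."""
    def F_kn(t):
        if t < 0:
            return 0.0
        F = cdf(t)
        return sum(comb(n, j) * F ** j * (1 - F) ** (n - j)
                   for j in range(k, n + 1))
    return F_kn(x) - F_kn(x - 1)

# usage: the minimum of a sample of size 3 from DGIE(5, 0.5) (reconstructed CDF)
dgie_cdf = lambda t: 1.0 - (1.0 - 0.5 ** (1.0 / (t + 1))) ** 5
total = sum(order_stat_pmf(x, 1, 3, dgie_cdf) for x in range(0, 400))
```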

Renyi Entropy.
Renyi entropy plays a vital role in information theory. The Renyi entropy of a random variable X is defined for c > 0 and c ≠ 1 (Renyi, 1961); for the DGIE(β, θ) distribution we compute it when c is an integer.

Maximum Likelihood Estimation.
The likelihood equations are obtained from the log-likelihood of the sample. We can obtain the solution of these equations numerically; then we compute Fisher's information matrix from the second partial derivatives. One can verify that the DGIE(β, θ) distribution satisfies the regularity conditions [23]. Then, the MLE vector (β̂, θ̂)ᵀ is asymptotically normal and consistent. Fisher's information matrix can be approximated at the MLEs β̂ and θ̂ of β and θ [24], and the elements of the Hessian matrix are obtained from the second derivatives of the log-likelihood.
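The numerical solution of the likelihood equations can be sketched as a direct minimization of the negative log-likelihood. The code below is a minimal illustration under the reconstructed survival-discretization PMF; it is our own sketch, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def dgie_negloglik(params, data):
    """Negative log-likelihood under the reconstructed DGIE PMF
    P(X = x) = S(x) - S(x + 1), S(t) = (1 - theta**(1/t))**beta, S(0) = 1."""
    beta, theta = params
    if beta <= 0 or not (0 < theta < 1):
        return np.inf
    x = np.asarray(data, dtype=float)
    # np.maximum avoids 1/0 at x = 0; np.where then forces S(0) = 1
    s_x = np.where(x == 0, 1.0, (1 - theta ** (1 / np.maximum(x, 1))) ** beta)
    s_x1 = (1 - theta ** (1 / (x + 1))) ** beta
    p = s_x - s_x1
    if np.any(p <= 0):
        return np.inf
    return -np.sum(np.log(p))

def dgie_mle(data, start=(1.0, 0.5)):
    """Maximize the likelihood numerically (Nelder-Mead on the negative
    log-likelihood); `start` is an arbitrary interior starting point."""
    res = minimize(dgie_negloglik, start, args=(data,), method="Nelder-Mead")
    return res.x  # (beta_hat, theta_hat)
```

The observed information (Hessian) at the optimum can then be obtained by finite differences of this same objective.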

Method of Moments Estimation.
We can find the method-of-moments estimates (MMEs) of (β, θ) by solving the equations that equate the first and second theoretical moments to μ′₁ and μ′₂, the first and second sample moments.
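Since the theoretical moments have no closed form, the moment equations must themselves be solved numerically. A minimal sketch, again assuming the reconstructed survival-discretization PMF, is to minimize the squared gap between theoretical and sample moments:

```python
import numpy as np
from scipy.optimize import minimize

def dgie_raw_moments(beta, theta, xmax=20000):
    """First two raw moments by truncated summation over the assumed
    survival-discretization PMF (truncation at `xmax` is adequate only
    for moderately light tails)."""
    x = np.arange(xmax + 1, dtype=float)
    s = np.empty(xmax + 1)
    s[0] = 1.0
    s[1:] = (1.0 - theta ** (1.0 / x[1:])) ** beta
    p = s[:-1] - s[1:]          # P(X = 0), ..., P(X = xmax - 1)
    xs = x[:-1]
    return float(np.sum(xs * p)), float(np.sum(xs * xs * p))

def dgie_mme(data, start=(1.0, 0.5)):
    """Method-of-moments sketch: choose (beta, theta) so that the first two
    theoretical moments match the first two sample moments."""
    s1, s2 = float(np.mean(data)), float(np.mean(np.square(data)))
    def gap(params):
        beta, theta = params
        if beta <= 0 or not (0 < theta < 1):
            return np.inf
        m1, m2 = dgie_raw_moments(beta, theta)
        return (m1 - s1) ** 2 + (m2 - s2) ** 2
    return minimize(gap, start, method="Nelder-Mead").x
```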

A Simulation Study.
In this section, we assess the performance of the maximum likelihood estimates with respect to the sample size n. The assessment is based on a simulation study: generate 10,000 samples of size n from equation (1). The inversion method is used to generate the samples; that is, variates of the discrete generalized inverted exponential distribution are generated by applying the quantile transform to U ∼ U(0, 1), a uniform variable on the unit interval. The empirical results are given in Table 3, from which the following observations can be noted: the magnitude of the bias always decreases to zero as n ⟶ ∞, and the MSEs always decrease to zero as n ⟶ ∞. This shows the consistency of the estimators.
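For a discrete distribution, the inversion step amounts to returning the smallest support point whose CDF value reaches the uniform draw. The sketch below does this with our reconstructed DGIE CDF F(x) = 1 − (1 − θ^(1/(x+1)))^β (an assumption, not the source's displayed formula):

```python
import random

def dgie_sample(beta, theta, rng=random):
    """One DGIE variate by inversion: the smallest x with F(x) >= U, where
    F(x) = 1 - (1 - theta**(1/(x + 1)))**beta (reconstructed CDF)."""
    u = rng.random()
    x = 0
    while 1.0 - (1.0 - theta ** (1.0 / (x + 1))) ** beta < u:
        x += 1
    return x
```

For a reproducible stream, pass an explicit `random.Random(seed)` instance as `rng`.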

Data Application.
We point out here the superiority of the discrete generalized inverted exponential distribution over the geometric, discrete logistic, and discrete Lindley distributions. Two real data sets are applied. The first data set, in Table 4, consists of 30 failure times of the air conditioning system of an airplane; these data are taken from [25]. The MLEs of (β, θ) have been computed in all these cases.
The Kolmogorov-Smirnov (K-S) statistic and the associated p value are computed in each case. The results are given in Table 5.
The Akaike information criterion (AIC), corrected Akaike information criterion (CAIC), and Bayesian information criterion (BIC) values for the models have been computed and are reported in Table 6. The Akaike measures indicate that the DGIE distribution fits this data set better than the existing distributions considered. The data set given in Table 7 consists of uncensored data from [23]: 100 observations on the breaking stress of carbon fibers (in GPa). The MLEs of (β, θ) have been computed in all these cases, along with the Kolmogorov-Smirnov (K-S) statistic and the associated p value; the results are given in Table 8. A comparison between the observed and fitted distributions is shown in Figures 4 and 5.
The Akaike measures indicate that the DGIE distribution fits this data set better than the existing distributions considered, as shown in Table 9. For both data sets, the discrete generalized inverted exponential distribution shows the most convenient p values, and the distribution plots suggest that it offers the best fit among the competitor distributions. On the basis of the tabulated results, we infer that the discrete generalized inverted exponential distribution provides the best fit compared with its submodels. Some summary statistics of data sets 1 and 2 are listed in Table 10.
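The information criteria used in the model comparison have standard definitions; as a quick reference, one common convention (the one assumed here) is:

```python
import math

def info_criteria(loglik, k, n):
    """AIC, corrected AIC (often written CAIC or AICc), and BIC for a model
    with k parameters fitted to n observations, under the usual definitions:
    AIC  = 2k - 2 log L
    CAIC = AIC + 2k(k + 1) / (n - k - 1)
    BIC  = k log(n) - 2 log L
    Smaller values indicate a better fit/complexity trade-off."""
    aic = 2 * k - 2 * loglik
    caic = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2 * loglik
    return aic, caic, bic
```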

Image Segmentation.
In this section, we assess the ability of the DGIE distribution to improve image segmentation performance. This is done by treating it as the basis of a clustering method.

Clustering Problem Formulation for Image Segmentation.
In this part, we introduce the mathematical formulation of the automatic clustering (AC)-based image segmentation problem. In general, the main aim of AC is to split a given image I into a set of K_max groups. To perform this task, the between-cluster variation must be maximized while the within-cluster variation is minimized. Therefore, AC can be represented mathematically as dividing the image into K_max clusters (i.e., C₁, C₂, …, C_{K_max}) satisfying the following criteria:

∪_{l=1}^{K_max} C_l = I,  C_l ≠ ∅,  l = 1, …, K_max,
C_l ∩ C_{l1} = ∅,  l, l1 = 1, 2, …, K_max,  l ≠ l1.  (35)
Gaussian mixture models (GMM): this is one of the most popular clustering techniques, and it has been used as an image segmentation method in different applications, for example, image retrieval [26], the chemical and physical properties of Italian wines [27,28], and others [29]. The mathematical formulation of the GMM can be given by considering the image I as consisting of a set of pixels X represented as a random variable. So, the GMM density can be defined as

f(x) = Σ_{i=1}^{K} w_i N(x | μ_i, σ²_i).  (36)

In equation (36), K represents the number of components and the w_i > 0 are the weights, with Σ_{i=1}^{K} w_i = 1. In addition, N(x | μ_i, σ²_i) is the normal density.
Table 5: The results of data set 1.

Here, N(x | μ_i, σ²_i) = (1/√(2πσ_i²)) exp(−(x − μ_i)²/(2σ_i²)), where μ_i and σ_i are the mean and standard deviation of class i. For an image X, the parameters θ = (w₁, …, w_K, μ₁, …, μ_K, σ²₁, …, σ²_K) must be determined, and to achieve this estimation the Expectation-Maximization (EM) method is used; its steps are summarized in Algorithm 1. However, the traditional GMM has some limitations that influence its performance, such as inefficiency in modeling all data types, including discrete data, in applications such as machine learning [30]. To avoid these limitations, we use the new distribution, the discrete generalized inverted exponential distribution. In general, DGIE has the ability to tackle the inaccuracy of using general continuous distributions, such as the Gaussian mixture, for discrete data.
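The EM iteration for a mixture of discrete components can be sketched as follows. This is our own illustration, not the paper's Algorithm 1: it uses the reconstructed survival-discretization DGIE PMF, a closed-form weight update, and a numerical weighted-likelihood M-step for each component's (β, θ).

```python
import numpy as np
from scipy.optimize import minimize

def dgie_pmf(x, beta, theta):
    """Vectorized reconstructed DGIE PMF (survival discretization)."""
    x = np.asarray(x, dtype=float)
    s = np.where(x == 0, 1.0, (1 - theta ** (1 / np.maximum(x, 1))) ** beta)
    s1 = (1 - theta ** (1 / (x + 1))) ** beta
    return s - s1

def em_dgie_mixture(data, K=2, iters=20, seed=0):
    """EM sketch for a K-component DGIE mixture over pixel values:
    the E-step computes responsibilities, and the M-step updates the
    weights in closed form and each (beta, theta) numerically."""
    rng = np.random.default_rng(seed)
    x = np.asarray(data, dtype=float)
    w = np.full(K, 1.0 / K)
    params = [(rng.uniform(0.5, 3.0), rng.uniform(0.2, 0.8)) for _ in range(K)]
    for _ in range(iters):
        # E-step: posterior probability of each component for each pixel
        dens = np.stack([w[k] * dgie_pmf(x, *params[k]) for k in range(K)])
        resp = dens / np.maximum(dens.sum(axis=0), 1e-300)
        # M-step: closed-form weights, numerical component parameters
        w = resp.sum(axis=1) / len(x)
        for k in range(K):
            def nll(p, r=resp[k]):
                b, t = p
                if b <= 0 or not (0 < t < 1):
                    return np.inf
                pm = np.maximum(dgie_pmf(x, b, t), 1e-300)
                return -np.sum(r * np.log(pm))
            params[k] = tuple(minimize(nll, params[k], method="Nelder-Mead").x)
    return w, params
```

Segmentation labels are then obtained by assigning each pixel to the component with the highest responsibility.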

Dataset Description.
In this study, the performance of the developed clustering-based color image segmentation using the DGIE mixture model (DGIEMM) is evaluated using a set of six color images (shown in Figures 6(a)-6(f)) [31]. In addition, we compare the results of DGIEMM with those of GMM, K-means, and fuzzy subspace clustering (FSC).

Performance Measures.
To evaluate the efficacy of the developed image segmentation, a set of performance measures is used: accuracy, adjusted Rand index, Hubert statistic, and normalized mutual information. The details of these measures are as follows. Accuracy: a measure of the ability of the method to assign each pixel to the optimal cluster, formulated in terms of TP, FP, TN, and FN, the true positives, false positives, true negatives, and false negatives. Adjusted Rand index (AR): a measure of the similarity between two groupings, defined from the contingency table, where n_ij denotes the number of objects in common between classes and a_i and b_j are the row and column sums of the contingency table, respectively. Hubert: a measure of the correlation coefficient between classes, where σ_X and σ_Y are the standard deviations of cluster X and cluster Y, respectively. Normalized mutual information (NMI): defined as a normalization of the mutual information, where C_T and C are the true class labels and the cluster labels, respectively, and H denotes the entropy.
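The adjusted Rand index and NMI described above are available in scikit-learn; the toy example below illustrates the key property that both scores are invariant to how the cluster labels happen to be numbered (the label vectors are made up for illustration):

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# a perfect clustering whose labels are merely permuted relative to the truth
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [1, 1, 0, 0, 2, 2]
ari = adjusted_rand_score(true_labels, pred_labels)
nmi = normalized_mutual_info_score(true_labels, pred_labels)
# both measures are invariant to label permutation, so both equal 1 here
```

Accuracy, by contrast, requires first matching predicted cluster labels to true class labels (e.g., via an optimal assignment) before counting correct pixels.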

Results and Discussion.
The comparison between the developed color image segmentation method (i.e., DGIEMM) and the other methods is given in Table 11. These results show the high ability of the developed method to cluster the images into their objects relative to the other methods. For example, according to the results in terms of accuracy, the DGIEMM has a high ability to assign each pixel to its true label (i.e., the object that contains it). The FSC and GMM provide better results than K-means, and this observation can be seen in Figure 7(a), which shows the average over the six tested images. In terms of AR, the DGIEMM still provides better results than the other methods. The same observations hold for the other three measures (i.e., RI, NMI, and Hubert); Figures 7(b)-7(d) also show the superiority of DGIEMM.
To justify the superiority of DGIEMM, the nonparametric Friedman test is used. In general, this test is applied to decide whether the difference between DGIEMM and the other methods is significant. There are two hypotheses: the null hypothesis assumes that there is no difference between the tested methods, while the alternative hypothesis assumes that there is a difference. We accept the alternative hypothesis when the obtained p value is less than the significance level of 0.05. Table 12 shows the mean ranks obtained using the Friedman test in terms of the performance measures (i.e., accuracy, AR, RI, Hubert, and NMI). From these values, it can be seen that the developed color image segmentation method has the highest mean rank across the performance measures. In addition, FSC takes the second mean rank, followed by K-means, which provides better results than the traditional GMM. Finally, Figure 8 shows an example of a segmented image produced by the competing algorithms.
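The Friedman test itself is available in SciPy. The example below shows the mechanics on hypothetical per-image accuracy scores (illustrative values only, not those of Table 12), with one score list per method over the six images:

```python
from scipy.stats import friedmanchisquare

# hypothetical per-image accuracy scores for four methods (illustrative only)
dgiemm = [0.91, 0.88, 0.93, 0.90, 0.89, 0.92]
fsc    = [0.85, 0.84, 0.88, 0.86, 0.83, 0.87]
kmeans = [0.82, 0.80, 0.85, 0.81, 0.79, 0.84]
gmm    = [0.80, 0.79, 0.83, 0.80, 0.78, 0.82]
stat, p = friedmanchisquare(dgiemm, fsc, kmeans, gmm)
# reject the null hypothesis of "no difference" when p < 0.05
```

The test ranks the methods within each image and then checks whether the mean ranks differ more than chance would allow.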

Concluding Remarks
In this study, a new two-parameter distribution for modeling count-type observations arising in nature has been presented. It is constructed from the continuous generalized inverted exponential distribution and is therefore called the discrete generalized inverted exponential (DGIE) distribution. Some important probabilistic properties of this distribution have been studied. Using two methods, namely the method of moments and the maximum likelihood technique, the parameters of the DGIE(β, θ) distribution have been estimated. To evaluate the quality of DGIE(β, θ), a series of experiments has been conducted using synthetic and real data. The results show the efficiency of DGIE(β, θ) in fitting data better than some existing distributions. In addition, the developed DGIE has been applied to clustering-based image segmentation, which aims to avoid the limitations of the traditional Gaussian mixture model (GMM).
This is achieved by using the DGIE distribution in place of the Gaussian distribution. The developed image segmentation method has demonstrated its performance on a set of color images, providing better results than GMM, K-means, and fuzzy subspace clustering (FSC).
According to these properties and results, DGIE can be applied to a wide range of applications, including reliability, physics, and machine learning techniques.

Data Availability
The data used to support the findings of this study are available from the authors upon request.

Conflicts of Interest
The authors declare no conflicts of interest.