A New Probability Heavy-Tail Model for Stochastic Modeling under Engineering Data

Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
Department of Statistics and Operation Research, College of Science, Qassim University, P.O. Box 6644, Buraydah 51482, Saudi Arabia
Department of Statistics and Computer Science, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
Department of Statistics, Mathematics and Insurance, Benha University, Benha 13518, Egypt


Introduction
The classic extreme value family contains the Gumbel (G) distribution (extreme value type I distribution), the reciprocal Weibull (RW) distribution (extreme value type II distribution), and the Weibull (W) distribution (extreme value type III distribution). Extreme value (EV) theory focuses on the behavior of block maxima or minima (see [1] for more details). The RW model, originally proposed by [1], is one of the most important distributions for modeling extreme values. The EV family has many applications, including accelerated life testing, floods, wind speeds, earthquakes, horse racing, rainfall, queues in supermarkets, and sea waves.
More details about the RW model can be found in the literature: for an application of the RW model to wind speed data, see [2]; for the beta-RW (B-RW) model, see [3]; for the Marshall-Olkin RW (MORW) model, see [4]; other extensions include the Weibull RW (W-RW) model, the Weibull reciprocal Rayleigh (W-RR) model, and the extended odd RW (EO-RW) family, among others. A random variable (RV) $Y$ is said to have the RW distribution if its probability density function (PDF) and cumulative distribution function (CDF) are given by

$$g_{a_3}(y) = a_3\, y^{-(a_3+1)} \exp\left(-y^{-a_3}\right), \quad y \ge 0,$$

and

$$G_{a_3}(y) = \exp\left(-y^{-a_3}\right), \quad y \ge 0,$$

respectively, where $a_3 > 0$ is the shape parameter. For $a_3 = 2$, we obtain the reciprocal Rayleigh (RR) model. For $a_3 = 1$, we have the reciprocal exponential (RE) model.
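As a minimal sketch, the baseline RW density and distribution function can be coded directly from the expressions above. The function names `rw_pdf` and `rw_cdf` are illustrative choices, not notation from the paper.

```python
import math

def rw_cdf(y, a3):
    """CDF of the RW model: G(y) = exp(-y**(-a3)) for y > 0."""
    if y <= 0:
        return 0.0
    return math.exp(-y ** (-a3))

def rw_pdf(y, a3):
    """PDF of the RW model: g(y) = a3 * y**(-(a3 + 1)) * exp(-y**(-a3))."""
    if y <= 0:
        return 0.0
    return a3 * y ** (-(a3 + 1.0)) * math.exp(-y ** (-a3))

# a3 = 1 is the reciprocal exponential (RE) case, a3 = 2 the reciprocal Rayleigh (RR) case.
for a3 in (1.0, 2.0):
    print(a3, rw_pdf(1.5, a3), rw_cdf(1.5, a3))
```

At $y = 1$ the CDF equals $e^{-1}$ for every shape value, which gives a quick sanity check of an implementation.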
In modeling real EV data sets, the GOLLRW model may be considered in the following cases: modeling "asymmetric heavy-tailed right skewed" data (see Section 5 and Section 6), and modeling such data when the EV data are being modeled for the first time. In reliability analysis and the medical sciences, the GOLLRW model can be applied to stress data, fiber data, relief time data, and lymphoma data (see Section 5 and Section 6). For the breaking stress data, the Cramér-von Mises method performs well. For the glass fiber data, the weighted least squares method is the best among all classical estimation methods. Using the validation approach of Bagdonavicius and Nikulin under right-censored data, the modified chi-square goodness-of-fit (GOF) test indicates that the proposed GOLLRW model fits the lymphoma data.

Table 1 (see the Appendix) provides some submodels of the GOLLRW model. As illustrated in Table 1, the new model generalizes eleven submodels, five of which are quite new. Some plots of the GOLLRW PDF and hazard rate function (HRF) are given in Figure 1 to illustrate some of its characteristics (see the Appendix). For simulating from the new model, we obtain the quantile function (QF) of $Y$ by inverting (4), say $y_u = Q(u) = F^{-1}(u)$, as

$$y_u = \left\{ -\frac{1}{a_2} \ln\left[ \frac{\left(u/(1-u)\right)^{1/a_1}}{1 + \left(u/(1-u)\right)^{1/a_1}} \right] \right\}^{-1/a_3}, \quad 0 < u < 1.$$

Equation (6) is used for simulating the new model. The novelty of this work and the research gap it addresses are as follows:
(i) The novel density in (5) can be a unimodal, asymmetric, right-skewed function with various shapes.
(ii) The HRF of the new model can be upside-down (reversed-U shaped). This gives the GOLLRW model an advantage for analyzing uncensored data sets whose HRF is upside-down.
(iii) The proposed GOLLRW model is recommended for modeling the uncensored breaking stress data (which have many extreme values), the uncensored glass fiber data (which have some extreme values), and the uncensored relief time data (which have no extreme values).
(iv) It is also recommended for modeling data sets whose nonparametric kernel density estimate is bimodal and positively skewed with a very heavy tail, bimodal and positively skewed with a lighter tail, or semisymmetric.
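The quantile-based simulation mentioned above can be sketched as follows. The CDF used here is an assumption: it takes the generalized odd log-logistic form $F = G^{a_1 a_2} / [G^{a_1 a_2} + (1 - G^{a_2})^{a_1}]$ with the RW baseline $G(y) = \exp(-y^{-a_3})$, since the defining equation is not reproduced in this text. The function names are illustrative.

```python
import math
import random

def gollrw_cdf(y, a1, a2, a3):
    # ASSUMED CDF form (see lead-in): F = t^a1 / (t^a1 + (1 - t)^a1), with t = G(y)^a2.
    if y <= 0:
        return 0.0
    t = math.exp(-a2 * y ** (-a3))
    num = t ** a1
    return num / (num + (1.0 - t) ** a1)

def gollrw_quantile(u, a1, a2, a3):
    # Invert the assumed CDF: t / (1 - t) = (u / (1 - u))^(1/a1), so -ln t = ln(1 + 1/w).
    w = (u / (1.0 - u)) ** (1.0 / a1)
    neg_log_t = math.log1p(1.0 / w)
    return (neg_log_t / a2) ** (-1.0 / a3)

def gollrw_sample(n, a1, a2, a3, seed=0):
    # Inverse-transform sampling: push uniform draws through the quantile function.
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if 0.0 < u < 1.0:
            out.append(gollrw_quantile(u, a1, a2, a3))
    return out

print(gollrw_sample(5, a1=2.0, a2=0.6, a3=0.8))
```

Writing $-\ln t$ as `log1p(1 / w)` avoids taking the logarithm of a value that rounds to 1 when $u$ is close to 1, which matters for heavy-tailed draws.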

Useful Representations.
Based on [5], the PDF in (5) can be expressed as a mixture (7) of terms $\pi_{(1+\kappa_4)}(y; a_3)$, where $\pi_{(1+\kappa_4)}(y; a_3)$ is the PDF of the baseline RW model with scale parameter $\sqrt[a_3]{1+\kappa_4}$ and shape parameter $a_3$. Thus, the new density can be re-expressed as a mixture of RW PDFs. Integrating (7) gives the corresponding expansion of the CDF in terms of the CDF of the RW distribution with scale parameter $\sqrt[a_3]{\kappa_4}$ and shape parameter $a_3$.

Moments and Incomplete Moments.
The $p$th ordinary moment of $Y$ is obtained from the mixture representation of the PDF. Numerical results are reported in Table 2 (for the GOLLRW) and Table 3 (for the RW) for some selected parameter values (see the Appendix). Based on Tables 2 and 3, we note that $\mathrm{Skew}_{\mathrm{GOLLRW}}(Y) \in (-26.23, 2.531)$, whereas $\mathrm{Skew}_{\mathrm{RW}}(Y) \in (1.199, 5.565)$. Further, $\mathrm{Kur}_{\mathrm{GOLLRW}}(Y)$ ranges from nearly 1.00 to nearly 10810.2, whereas $\mathrm{Kur}_{\mathrm{RW}}(Y)$ only varies from 5.699 to 5436.5 for the above parameter values. Thus, the GOLLRW skewness can be negative or positive, whereas the RW skewness can only be positive. The $p$th incomplete moment of $Y$, say $I_{p,Y}(t)$, can be derived from (9) and is expressed in terms of the incomplete gamma function $\gamma(\nabla, p)$.
The first incomplete moment is given by (11) with $p = 1$.
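For the baseline RW model only (not the GOLLRW), the moments have the standard closed form $E[Y^p] = \Gamma(1 - p/a_3)$ for $p < a_3$. The sketch below uses it to compute the RW skewness and kurtosis, which illustrates why the RW skewness stays positive. The function names are illustrative, and the kurtosis is the non-excess version, which is an assumption about the convention used in the tables.

```python
import math

def rw_raw_moment(p, a3):
    """E[Y^p] = Gamma(1 - p/a3) for the unit-scale RW model; exists only for p < a3."""
    if p >= a3:
        raise ValueError("the p-th moment exists only for p < a3")
    return math.gamma(1.0 - p / a3)

def rw_skew_kurt(a3):
    """Skewness and non-excess kurtosis of the RW model (requires a3 > 4)."""
    m1, m2, m3, m4 = (rw_raw_moment(p, a3) for p in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m2 * m1 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4) / var ** 2
    return skew, kurt

for a3 in (5.0, 10.0, 50.0):
    print(a3, rw_skew_kurt(a3))
```

Both quantities grow quickly as $a_3$ decreases toward 4, reflecting the heavier right tail.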

Moment Generating Function (MGF).
The MGF of $Y$ can be derived from equation (7) in terms of the MGFs of RW models with shape parameter $a_3$.

Residual Life (RLf) and Reversed Residual Life (RRLf) Functions and Their Moments.
The $m$th moment of the RLf can be obtained by using $l_m$; therefore, the RLf can be written in terms of it.

The MLE.
Let $y_1, y_2, \ldots, y_q$ be an observed random sample (ORS) of size $q$ from the GOLLRW distribution with parameters $a_1$, $a_2$, and $a_3$. Let $V = (a_1, a_2, a_3)^{\top}$ be the $3 \times 1$ vector of parameters. The MLEs of $a_1$, $a_2$, and $a_3$ are obtained by maximizing the log-likelihood function (LLF) with respect to $V$.
The score vector is available if needed and can be computed numerically.
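A minimal sketch of the log-likelihood is given below. It assumes the generalized odd log-logistic form of the CDF, $F = G^{a_1 a_2} / [G^{a_1 a_2} + (1 - G^{a_2})^{a_1}]$, and the density obtained by differentiating it; neither expression is reproduced in this text, so both are assumptions. The data values are made up for illustration.

```python
import math

def gollrw_cdf(y, a1, a2, a3):
    # ASSUMED CDF: F = t^a1 / (t^a1 + (1 - t)^a1), with t = G(y)^a2 and G the RW CDF.
    t = math.exp(-a2 * y ** (-a3))
    return t ** a1 / (t ** a1 + (1.0 - t) ** a1)

def gollrw_logpdf(y, a1, a2, a3):
    # ASSUMED density (derivative of the CDF above):
    # f = a1*a2*g*G^(a1*a2-1)*(1-G^a2)^(a1-1) / [G^(a1*a2) + (1-G^a2)^a1]^2
    log_G = -y ** (-a3)                                   # log of the RW CDF
    log_g = math.log(a3) - (a3 + 1.0) * math.log(y) + log_G
    t = math.exp(a2 * log_G)                              # G^a2
    denom = t ** a1 + (1.0 - t) ** a1
    return (math.log(a1) + math.log(a2) + log_g
            + (a1 * a2 - 1.0) * log_G
            + (a1 - 1.0) * math.log1p(-t)
            - 2.0 * math.log(denom))

def gollrw_loglik(params, data):
    # Log-likelihood of V = (a1, a2, a3); returns -inf outside the parameter space.
    a1, a2, a3 = params
    if min(a1, a2, a3) <= 0:
        return -math.inf
    return sum(gollrw_logpdf(y, a1, a2, a3) for y in data)

data = [0.8, 1.1, 1.7, 2.4, 3.9]   # made-up values for illustration only
print(gollrw_loglik((2.0, 0.6, 0.8), data))
```

The MLEs would then be found by passing the negative of `gollrw_loglik` to any general-purpose numerical optimizer.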

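The CVME.
The CVMEs of the parameters $a_1$, $a_2$, and $a_3$ are obtained by minimizing the function

$$\mathrm{CVM}(V) = \frac{1}{12q} + \sum_{i=1}^{q} \left[ F_V(y_i) - \varepsilon^{[1]}_{(i,q)} \right]^2$$

with respect to (WRT) the parameters $a_1$, $a_2$, and $a_3$, where $\varepsilon^{[1]}_{(i,q)} = (2i-1)/(2q)$. The CVMEs are then derived by solving the nonlinear system $\partial\, \mathrm{CVM}(V)/\partial a_k = 0$ for $k = 1, 2, 3$.

The OLSE Method.
Let $F_V(y_i)$ denote the CDF of the GOLLRW model and let $y_1 < y_2 < \cdots < y_q$ be the $q$ ordered ORS. The OLSEs are derived by minimizing

$$\mathrm{OLSE}(V) = \sum_{i=1}^{q} \left[ F_V(y_i) - \varepsilon^{[2]}_{(i,q)} \right]^2,$$

where $\varepsilon^{[2]}_{(i,q)} = i/(q+1)$. The OLSEs are derived by solving $\partial\, \mathrm{OLSE}(V)/\partial a_k = 0$ for $k = 1, 2, 3$.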
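The Cramér-von Mises and ordinary least squares criteria can be sketched as objective functions of $V$. The CDF below assumes the generalized odd log-logistic form with RW baseline (not reproduced in this text), the CVM criterion uses the standard form with the $1/(12q)$ term, and all function names are illustrative.

```python
import math

def gollrw_cdf(y, a1, a2, a3):
    # ASSUMED CDF: F = t^a1 / (t^a1 + (1 - t)^a1), with t = exp(-a2 * y^(-a3)).
    t = math.exp(-a2 * y ** (-a3))
    return t ** a1 / (t ** a1 + (1.0 - t) ** a1)

def gollrw_quantile(u, a1, a2, a3):
    # Inverse of the assumed CDF, used here only to build test data.
    w = (u / (1.0 - u)) ** (1.0 / a1)
    return (math.log1p(1.0 / w) / a2) ** (-1.0 / a3)

def cvm_objective(params, data):
    # CVM(V) = 1/(12q) + sum_i [F_V(y_(i)) - (2i - 1)/(2q)]^2
    y, q = sorted(data), len(data)
    return 1.0 / (12 * q) + sum(
        (gollrw_cdf(y[i - 1], *params) - (2 * i - 1) / (2 * q)) ** 2
        for i in range(1, q + 1))

def ols_objective(params, data):
    # OLSE(V) = sum_i [F_V(y_(i)) - i/(q + 1)]^2
    y, q = sorted(data), len(data)
    return sum((gollrw_cdf(y[i - 1], *params) - i / (q + 1)) ** 2
               for i in range(1, q + 1))

true_v = (2.0, 0.6, 0.8)
q = 20
ideal = [gollrw_quantile((2 * i - 1) / (2 * q), *true_v) for i in range(1, q + 1)]
print(cvm_objective(true_v, ideal), cvm_objective((1.0, 1.0, 1.0), ideal))
```

On data placed exactly at the plotting positions, the CVM criterion attains its lower bound $1/(12q)$ at the generating parameters, and any other parameter vector gives a larger value.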

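The WLSE Method.
The WLSEs are obtained by minimizing $\mathrm{WLSE}(V)$ WRT $a_1$, $a_2$, and $a_3$, where

$$\mathrm{WLSE}(V) = \sum_{i=1}^{q} \varepsilon^{[3]}_{(i,q)} \left[ F_V(y_i) - \varepsilon^{[2]}_{(i,q)} \right]^2$$

and $\varepsilon^{[3]}_{(i,q)}$ are the weights. The WLSEs are obtained by solving the nonlinear system $\partial\, \mathrm{WLSE}(V)/\partial a_k = 0$ for $k = 1, 2, 3$.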
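A sketch of the weighted least squares criterion follows. The weight $(q+1)^2 (q+2) / [i (q - i + 1)]$ is the usual choice for this estimator and is an assumption here, since the weight definition is not legible in this text. The CDF form is likewise assumed, and the function names are illustrative.

```python
import math

def gollrw_cdf(y, a1, a2, a3):
    # ASSUMED CDF: F = t^a1 / (t^a1 + (1 - t)^a1), with t = exp(-a2 * y^(-a3)).
    t = math.exp(-a2 * y ** (-a3))
    return t ** a1 / (t ** a1 + (1.0 - t) ** a1)

def gollrw_quantile(u, a1, a2, a3):
    # Inverse of the assumed CDF, used here only to build test data.
    w = (u / (1.0 - u)) ** (1.0 / a1)
    return (math.log1p(1.0 / w) / a2) ** (-1.0 / a3)

def wls_weight(i, q):
    # ASSUMED standard weight: the reciprocal variance of the i-th uniform order statistic.
    return (q + 1) ** 2 * (q + 2) / (i * (q - i + 1))

def wls_objective(params, data):
    # WLSE(V) = sum_i w_i * [F_V(y_(i)) - i/(q + 1)]^2
    y, q = sorted(data), len(data)
    return sum(wls_weight(i, q) * (gollrw_cdf(y[i - 1], *params) - i / (q + 1)) ** 2
               for i in range(1, q + 1))

true_v = (2.0, 0.6, 0.8)
q = 20
ideal = [gollrw_quantile(i / (q + 1), *true_v) for i in range(1, q + 1)]
print(wls_objective(true_v, ideal), wls_objective((1.0, 1.0, 1.0), ideal))
```

The weights are largest at the two ends of the ordered sample, so tail discrepancies are penalized more heavily than under the unweighted criterion.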

The ADE Method.
The ADEs $a_{1(\mathrm{ADE})}$, $a_{2(\mathrm{ADE})}$, and $a_{3(\mathrm{ADE})}$ are derived by minimizing the Anderson-Darling criterion $\mathrm{ADE}(V)$; i.e., the parameter estimates follow by solving the nonlinear system $\partial\, \mathrm{ADE}(V)/\partial a_k = 0$ for $k = 1, 2, 3$.
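The Anderson-Darling criterion can be sketched for an arbitrary CDF as below. The form $-q - q^{-1} \sum_i (2i-1)[\ln F(y_{(i)}) + \ln(1 - F(y_{(q+1-i)}))]$ is the standard one and is assumed to be what is meant here. The GOLLRW CDF form is also an assumption, and the data values are made up.

```python
import math

def ad_objective(cdf, data):
    # Standard Anderson-Darling criterion for a fully specified CDF (assumed form).
    y, q = sorted(data), len(data)
    total = 0.0
    for i in range(1, q + 1):
        total += (2 * i - 1) * (math.log(cdf(y[i - 1]))
                                + math.log(1.0 - cdf(y[q - i])))
    return -q - total / q

def gollrw_cdf(y, a1, a2, a3):
    # ASSUMED CDF: F = t^a1 / (t^a1 + (1 - t)^a1), with t = exp(-a2 * y^(-a3)).
    t = math.exp(-a2 * y ** (-a3))
    return t ** a1 / (t ** a1 + (1.0 - t) ** a1)

def gollrw_ad_objective(params, data):
    # Criterion as a function of V = (a1, a2, a3), ready for a numerical minimizer.
    return ad_objective(lambda y: gollrw_cdf(y, *params), data)

data = [0.8, 1.1, 1.7, 2.4, 3.9]   # made-up values for illustration only
print(gollrw_ad_objective((2.0, 0.6, 0.8), data))
```

Keeping `ad_objective` generic in the CDF makes it easy to check against a known case, such as a single uniform observation at 0.5, for which the criterion equals $2\ln 2 - 1$.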

Uncensored Simulations for Comparing the Classical Methods
Simulation experiments are performed to assess and compare the classical methods. The assessment is based on $N = 1000$ data sets generated from the new model with $n = 50$, 100, 300, and 500. Figure 2 gives the density functions for the three scenarios (see the Appendix). From Figure 2, it is seen that the PDFs for all scenarios are asymmetric and right heavy-tailed.
The estimates of all methods are compared in terms of the bias (BIAS(V)); the root mean squared error (RMSE(V)); the mean of the absolute difference between the theoretical values and the estimates (AAD-abs); and the maximum absolute difference between the true parameters and the estimates (AAD-max). From Tables 4-6 (see the Appendix), we note the following: (i) BIAS(V) tends to 0 as $n$ increases, which indicates that all estimators are asymptotically unbiased. (ii) RMSE(V) tends to 0 as $n$ increases, which indicates consistency. Generally, the MLE method provides the best estimation, with a smaller RMSE than the other classical methods for all sample sizes.
(1) For $a_1 = 2.0$, $a_2 = 0.6$, and $a_3 = 0.8$, see Table 4. (2) For $a_1 = 0.5$, $a_2 = 1.5$, and $a_3 = 2.0$, see Table 5. It is not easy to determine the worst classical estimation method since all estimation methods perform well, especially as $n \to \infty$.
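The bias and RMSE summaries used in such a simulation study can be sketched as follows. RMSE is taken here as the root mean squared error, which is an assumption about the intended definition, and the estimates in the example are made-up numbers rather than results from the paper.

```python
import math

def bias_and_rmse(estimates, true_value):
    # Bias: mean of (estimate - truth); RMSE: square root of the mean squared deviation.
    n = len(estimates)
    bias = sum(e - true_value for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, rmse

# Hypothetical replicated estimates of a1 when the true value is 2.0.
replicates = [1.9, 2.2, 2.05, 1.95, 2.1]
print(bias_and_rmse(replicates, 2.0))
```

In a full study, one such pair would be computed per parameter, per estimation method, and per sample size $n$.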

Real Data Modeling
In this section, we present three real data applications for comparing competitive models; two of them are also used for comparing estimation methods.

Comparing Competitive Models under Uncensored Data.
To illustrate the wide applicability of the new GOLLRW model, we consider the following statistics: the Cramér-von Mises statistic ($C_1$), the Anderson-Darling statistic ($C_2$), and the Kolmogorov-Smirnov test (KS-test) with its corresponding p value (P.V).
The new model is compared with several common competitive models; Table 7 lists the competitive models and their corresponding abbreviations.
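As one concrete example of these goodness-of-fit measures, the Kolmogorov-Smirnov statistic for a fitted CDF can be sketched as below. The function takes any CDF as a callable; the name `ks_statistic` and the toy inputs are illustrative and not taken from the paper.

```python
def ks_statistic(cdf, data):
    # D = max over i of the larger of (i/n - F(y_(i))) and (F(y_(i)) - (i - 1)/n).
    y, n = sorted(data), len(data)
    d_plus = max(i / n - cdf(y[i - 1]) for i in range(1, n + 1))
    d_minus = max(cdf(y[i - 1]) - (i - 1) / n for i in range(1, n + 1))
    return max(d_plus, d_minus)

# Toy check against the uniform CDF on (0, 1).
print(ks_statistic(lambda x: x, [0.25, 0.75]))
```

A smaller value means the fitted CDF tracks the empirical CDF more closely, which is the sense in which competing models are ranked.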

Breaking Stress Data.
The 1st data set is an uncensored data set consisting of 100 observations on the breaking stress of carbon fibers (in GPa) given by [8]. Figure 3(a) gives the total time on test (TTT) plot (see [9]) for the stress data set. It indicates that the empirical HRF of data set I is increasing. Figure 3(b) gives the box plot for detecting outliers, Figure 3(c) gives the quantile-quantile (Q-Q) plot for checking normality, and Figure 3(d) gives the nonparametric kernel density estimate (KDE) for exploring the density of the raw data. Figure 4 gives the P-P plot, estimated PDF (E-PDF), estimated CDF (E-CDF), and estimated HRF (E-HRF) for the stress data set. The statistics $C_1$, $C_2$, KS-test, and P.V for all fitted models are presented in Table 8.
The MLEs and corresponding standard errors (SEs) are given in Table 9. From Table 8, the GOLLRW model gives the smallest values of $C_1$, $C_2$, and the KS-test statistic and the largest P.V among the compared RW models; therefore, the GOLLRW can be chosen as the best model.
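The scaled TTT transform behind the TTT plot can be sketched as follows. The coordinates are $T(i/n) = [\sum_{j \le i} y_{(j)} + (n - i)\, y_{(i)}] / \sum_j y_{(j)}$; the function name and the toy data are illustrative.

```python
def scaled_ttt(data):
    # Returns the points (i/n, T(i/n)) of the scaled total-time-on-test transform.
    y, n = sorted(data), len(data)
    total = sum(y)
    points, running = [], 0.0
    for i in range(1, n + 1):
        running += y[i - 1]
        points.append((i / n, (running + (n - i) * y[i - 1]) / total))
    return points

print(scaled_ttt([1.0, 2.0, 3.0]))
```

A curve that is concave and lies above the diagonal is usually read as evidence of an increasing HRF, which is the interpretation applied to the data sets here.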

Glass Fiber Data.
The 2nd data set consists of data generated to simulate the strengths of glass fibers, given by [10]. Figure 5(a) gives the TTT plot for the fiber data set. It indicates that the empirical HRF of the fiber data set is increasing. Figure 5(b) gives the box plot for detecting outliers, Figure 5(c) gives the Q-Q plot for checking normality, and Figure 5(d) gives the nonparametric kernel density estimate (KDE) for exploring the density of the raw data. Figure 6 gives the P-P plot, E-PDF, E-CDF, and E-HRF for the glass fiber data. The $C_1$, $C_2$, KS-test, and P.V values are listed in Table 10.
The MLEs and SEs are given in Table 11. From Table 10, the GOLLRW model gives the lowest values of $C_1$, $C_2$, and the KS-test statistic and the largest P.V; therefore, the GOLLRW can be chosen as the best model.

Relief Time Data.
The 3rd data set (called the Wingo data, see [11]) is a complete observed sample from a clinical trial describing relief times (in hours) for 50 arthritic patients. Figure 7(a) gives the TTT plot for the relief time data set. It indicates that the empirical HRF of the relief time data set is increasing. Figure 7(b) gives the box plot for detecting outliers, Figure 7(c) gives the Q-Q plot for checking normality, and Figure 7(d) gives the nonparametric kernel density estimate (KDE) for exploring the density of the raw data. Figure 8 gives the P-P plot, E-PDF, E-CDF, and E-HRF for the relief time data set. The $C_1$, $C_2$, KS-test, and P.V values are listed in Table 12. The MLEs and SEs are given in Table 13. From Table 12, the GOLLRW model gives the lowest values of $C_1$, $C_2$, and the KS-test statistic and the largest P.V; therefore, the GOLLRW can be chosen as the best model.

Comparing Estimation Methods under Uncensored Data.
To compare the estimation methods on real data, we present two examples. The comparison is based on $C_1$ and $C_2$. Tables 14 and 15 give the results for the breaking stress data and the glass fiber data, respectively.

Example 1: The Breaking Stress Data.
In this subsection, the estimation methods are compared using the breaking stress of carbon fibers (in GPa). Table 14 lists the comparison results for the breaking stress data. From Table 14, we note that the CVME method is the best among all methods, with $C_1 = 0.05864$ and $C_2 = 0.473$; however, all other methods also performed well.

Example 2: The Glass Fiber Data.
In this subsection, the estimation methods are compared using the glass fiber data. Table 15 lists the comparison results for the glass fiber data. From Table 15, it is noted that the WLSE method is the best among all methods; however, all other methods also performed well.

Censored Validation and Real Data Analysis
The test statistic $T_n^2$ is defined in terms of $e_j$, the number of expected failures (NEF) in the grouping intervals, and $U_j$, the number of observed failures (NOF) in the grouping intervals, where $j = 1, 2, \ldots, r$, $i = 1, 2, \ldots, n$, and $L, L' = 1, 2, \ldots, s$.
The elements of $C$ are defined through $H_V(y_i)$, the cumulative HRF (CHRF) of the GOLLRW distribution. The modified test statistic can be written as a quadratic form in which the matrices $W$, $C$, and $I$ are the estimated information matrices; for more details, see [12]. We have analyzed a lymphoma data set consisting of times (in months) from diagnosis to death for 31 individuals with advanced non-Hodgkin's lymphoma clinical symptoms by using our model. For a significance level of $\varepsilon = 5\%$, the critical value is $\chi^2_{0.05}(4) = 9.4877$, which is larger than $T_n^2 = 6.858$, so we can say that the proposed GOLLRW model fits these data.
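The quoted critical value can be checked numerically. The value 9.4877 coincides with the upper 5% point of a chi-square distribution with 4 degrees of freedom; the choice of 4 is made here because it reproduces the quoted number and is an observation rather than something stated in the text. The sketch uses the closed-form survival function available for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    # For even df, P(X > x) = exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))

critical = 9.4877
statistic = 6.858
print(chi2_sf_even_df(critical, 4))    # tail probability at the quoted critical value
print(chi2_sf_even_df(statistic, 4))   # tail probability at the observed statistic
print(statistic < critical)
```

Under this assumed number of degrees of freedom, the tail probability at the observed statistic is well above 0.05, in line with the conclusion that the model is not rejected.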

Concluding Remarks
In this paper, we introduced a new extension of the well-known reciprocal Weibull (RW) model, called the generalized odd log-logistic reciprocal Weibull (GOLLRW) model, which is used for modeling extreme values. The new model generalizes eleven other RW extensions, five of which are quite new. Some important mathematical properties of the new model are derived.
$\mathrm{Skew}_{\mathrm{GOLLRW}}(Y) \in (-26.23, 2.531)$, whereas $\mathrm{Skew}_{\mathrm{RW}}(Y) \in (1.199, 5.565)$. Further, $\mathrm{Kur}_{\mathrm{GOLLRW}}(Y)$ ranges from nearly 1.00 to nearly 10810.2, whereas $\mathrm{Kur}_{\mathrm{RW}}(Y)$ only varies from 5.699 to 5436.5. The GOLLRW skewness can be negative or positive, whereas the RW skewness can only be positive. We assessed the performance of seven estimation methods via simulation experiments.
Three real data sets are presented to demonstrate the importance and flexibility of the new model and to compare the competitive models under the uncensored scheme.
The new model is better than some other important competitive models in modeling the breaking stress data, glass fiber data, and relief time data. The estimation methods are compared using two real data sets. For the breaking stress of carbon fibers, the Cramér-von Mises method is the best among all methods; however, all other methods also performed well. For the strengths of glass fibers, the weighted least squares method is the best; however, all other methods also performed well. Finally, a modified Bagdonavicius-Nikulin GOF test is presented and applied for validation under censoring. Future work could study new related extensions from other aspects. The current study can also be extended using neutrosophic statistics (see [15-21]).

Data Availability
The 1st data set is an uncensored data set consisting of 100 observations on the breaking stress of carbon fibers (in GPa) given by [14]. The 2nd data set is generated to simulate the strengths of glass fibers and was given by [22]. The 3rd data set (called the Wingo data) is a complete observed sample from a clinical trial describing relief times (in hours) for 50 arthritic patients (see [23]). The 4th data set, used for censored validation, is given in Section 6.