A New Family of Distributions to Encounter the Extrapolating Issues

Traditionally, models with infinite domains have been fitted to finite datasets, which extrapolates the data and results in inadequate model fitting and predictions. To overcome this problem, we develop a new family of truncated distributions by introducing a new generator. In this article, a truncated random variable X tr (the "transformer" or input) is used to transform another random variable T (the "transformed" or generator), which yields a new T − X tr family of distributions. Several characteristics of the T − X tr family of distributions are provided, which are equally useful in engineering and the biological sciences. For application purposes, a type-2 Gumbel-truncated exponential distribution is generated by using the proposed method, along with its statistical properties. The efficacy of the new model is demonstrated by applying it to ecophysiology data and comparing the resulting outputs with those from the baseline models.

Relevance of the Work. Probability models typically have infinite domains, but they are applied to finite real datasets, which may lead to exaggerated inferences and predictions. This problem can be solved by developing models that have finite domains. We propose finite models to analyze finite data proficiently, provide reliable inferences, and save time.


Introduction
Modeling finite datasets with functions that have infinite domains has been a common practice, particularly in the engineering and medical fields. Variables in biology, ecology, engineering, and hydrology are observed over finite ranges but are fitted with classical models defined over infinite ranges. For instance, the gamma-exponentiated exponential (GEE) distribution was applied by Ristić and Balakrishnan [1] to study the number of successive failures of the air conditioning system of a fleet of 13 jet airliners and the survival times of guinea pigs receiving a dose of tubercle bacilli. In engineering science, Sanhueza et al. [2] employed the generalized Birnbaum-Saunders distribution [3] and the classical Birnbaum-Saunders distribution [4] to model the dataset corresponding to the cycles (×10^−3) of aluminium articles. Similarly, the power Lindley distribution was fitted by Ghitany et al. [5] to model the tensile strength of 69 carbon fibers tested under tension at gauge lengths of 20 mm. All the abovementioned probability functions have a domain from 0 to ∞, whereas the datasets have a finite range.
The idea of generating univariate continuous distributions arose from the well-known differential equations developed by Pearson [6]. Afterwards, Johnson [7] developed continuous distributions by using the method of translation. Tukey [8] established a method of generating univariate continuous distributions based on quantile functions. Lee et al. [9] highlighted the fact that after 1980, the majority of the families of distributions generated were based on compounding two existing distributions or embedding extra parameters into an existing distribution. For example, Amini et al. [10] proposed log-gamma-generated families of distributions, and Cordeiro et al. [11] constructed the exponentiated half-logistic family of distributions.

Eugene et al. [12] proposed a family of beta-generated distributions by utilizing the probability density function (PDF) of a beta random variable and the cumulative distribution function (CDF) of any distribution. The CDF of the random variable Y is as follows:
$$G(y) = \int_{0}^{F(y)} r(t)\,dt,$$
where r(t) is the PDF of the Beta distribution and F(y) is the CDF of any distribution having a PDF f(y). Nadarajah and Kotz [13,14] developed the beta-Gumbel and beta-exponential distributions, and Akinsete et al. [15] generated a beta-Pareto distribution by utilizing Eugene's idea. Cordeiro and De Castro [16] and Jones [17] extended Eugene's idea to generate the Kumaraswamy family of distributions by substituting the Beta distribution with the Kumaraswamy distribution [18]. The PDF of the generalized Kumaraswamy distribution is given by
$$g(x) = \lambda\,\varphi\, f(x)\, F(x)^{\lambda-1}\,\bigl[1 - F(x)^{\lambda}\bigr]^{\varphi-1},$$
where λ > 0 and φ > 0 are shape parameters and f(x) is the PDF corresponding to the CDF F(x).

The method of generating a family of distributions used by Eugene et al., Jones, and Cordeiro and De Castro was limited to generator distributions having domain (0, 1). Another common limitation of the abovementioned families of distributions was that the baseline distribution is fixed.
Alzaatreh et al. [19] improved the idea of Eugene et al. by replacing the Beta PDF with the PDF r(t) of any continuous random variable T, defined on [a, b], as a generator. The CDF of the proposed family of distributions is expressed as follows:
$$G(x) = \int_{a}^{W(F(x))} r(t)\,dt,$$
having PDF
$$g(x) = \left\{\frac{d}{dx} W(F(x))\right\} r\{W(F(x))\},$$
and is known as the T-X or Transformed-Transformer family. According to Alzaatreh et al. [19], W(·) is a function of the CDF F(x) of any random variable X, and it should satisfy the following criteria:
(i) W(F(x)) ∈ [a, b]
(ii) W(F(x)) is differentiable and monotonically nondecreasing
(iii) W(F(x)) → a as x → −∞ and W(F(x)) → b as x → ∞.
Tahir et al. [20] defined the Logistic-X family of probability densities using the methodology of Alzaatreh et al. [19].
Generally, probability distributions are defined on an infinite domain, e.g., 0 to ∞, but there are situations where a reduced or finite domain is needed. The infinite domain of a probability distribution is reduced to a finite domain by truncation, and the resulting distribution is called a truncated distribution. Truncated distributions are routinely used in fields such as astronomy, medicine, epidemiology, biometry, engineering, hydrology, and economics. For example, in survival analysis, we may wish to study how long people survive after receiving treatment for a cardiovascular attack; the time of the attack is the beginning of the study. In engineering, measurements are taken by a detector that registers only signals above a specific limit, and weaker signals are not taken into account.

The first attempt to estimate the parameters of a truncated Gamma distribution was made by Chapman [21]. Broeder and Gerard [22] and Hegde and Dahiya [23] proposed different methods to estimate the parameters of truncated distributions. Jawitz [24] emphasized the significance of truncated distributions using hydrology data by comparing truncated moment expressions (TMEs) with complete moments for six standard distributions. The author pointed out that complete moments are a special case of general TMEs (see [24] for details).

The main motivation for this article is to model finite real data by using a truncated distribution in order to avoid extrapolation issues. Distributions developed by using different generators provide additional parameters, which not only help to model real-life data effectively but also widen the range of data to which the model can be fitted. This research work is distinctive in that the new truncated transformed distribution introduces an additional tail-weight (kurtosis) parameter, along with the usual location, scale, and/or shape parameters that control the shape (skewness). Moreover, the proposed family of distributions is parsimonious in terms of estimating the model parameters and improves the goodness-of-fit for skewed data compared with the baseline distributions. The distribution function of the T − X tr family of distributions is in closed form, which makes it easier to work with than families whose CDFs are not in closed form, for instance, the Mc-G and beta-G distributions. The proposed generator is also easier to use than the traditional techniques, as it is more adaptable to the actual nature of the data.
The paper is arranged as follows. In Section 2, a new method is proposed to generate a family of truncated distributions (T − X tr), and some general distributional characteristics, for example, the hazard function, the cumulative hazard function, the mean, the p th raw moment, the Shannon entropy, and the order statistics, are derived. In Section 3, a type-2 Gumbel-truncated exponential distribution (G-TEXPD) is generated using the proposed technique, and its statistical properties are presented. In Section 4, the model parameters of the G-TEXPD are estimated using the maximum likelihood method. In Section 5, a Monte Carlo simulation study is carried out to examine the performance of the proposed model for different choices of the model parameters. In Section 6, the feasibility of the proposed model is studied by fitting it to a real dataset and comparing it with some baseline models. Some concluding remarks are made in Section 7.

Method for Generating Family of Truncated (T − X tr ) Distributions
Let X tr be a nonnegative random variable truncated on the left, having pdf f(x tr) and cdf F(x tr) on the domain [ξ, ∞), and let T be a random variable with pdf r(t) and cdf R(t) on the interval [ξ, b]. If W(F(x tr)) is a function of the cdf F(x tr) of any nonnegative random variable, then W(F(x tr)) satisfies the following conditions:
(i) W(F(x tr)) ∈ [ξ, b]
(ii) W(F(x tr)) is a differentiable and monotonically increasing function
(iii) W(F(x tr)) ⟶ ξ as x tr ⟶ 0 and W(F(x tr)) ⟶ b as x tr ⟶ ∞.
We define the cdf of the T − X tr family as follows:
$$G(x_{tr}) = R\{H(x_{tr})\},$$
where R(t) is the cdf of the random variable T and $H(x_{tr}) = -\log[1 - F(x_{tr})]$ is the cumulative hazard function of the random variable X tr. The corresponding pdf of the T − X tr family of distributions is as follows:
$$g(x_{tr}) = h(x_{tr})\, r\{H(x_{tr})\}, \qquad (11)$$
where h(x tr) is the hazard function of the random variable X tr.
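As a minimal sketch of this construction, the following Python snippet composes the cdf R and pdf r of T with the cumulative hazard H and hazard h of X tr. The function name `make_t_xtr` and the illustrative choices (T standard exponential, X tr left-truncated exponential with hypothetical values ξ = 1 and θ = 2) are assumptions made for demonstration; they are not taken from the article.

```python
import math


def make_t_xtr(R, r, H, h):
    """Compose a generator T with a transformer X_tr.

    Returns (cdf, pdf) with G(x) = R(H(x)) and g(x) = h(x) * r(H(x)).
    """
    def cdf(x):
        return R(H(x))

    def pdf(x):
        return h(x) * r(H(x))

    return cdf, pdf


# Hypothetical illustration values (assumptions, not from the article).
XI, THETA = 1.0, 2.0


def H(x):
    """Cumulative hazard of a left-truncated exponential, valid for x >= XI."""
    return THETA * (x - XI)


def h(x):
    """Hazard of the left-truncated exponential (constant)."""
    return THETA


def R(t):
    """cdf of a standard exponential generator T."""
    return 1.0 - math.exp(-t)


def r(t):
    """pdf of a standard exponential generator T."""
    return math.exp(-t)


cdf, pdf = make_t_xtr(R, r, H, h)
```

With a standard exponential T, G(x) = 1 − exp(−H(x)) = F(x), so the composition returns the baseline cdf of X tr. This makes a convenient sanity check before swapping in another generator.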

Remarks
(i) The cdf and pdf of the T − X tr family are written in terms of the cumulative hazard function and the hazard function of the random variable X tr, with G(x tr) the cdf of the truncated distribution. Thus, the T − X tr distribution can be considered a family of distributions arising from a weighted hazard function.
(ii) The pdf r(t) in (6) is transformed into a new cdf G(x tr) through the function W(F(x tr)), which acts as a "transformer." Hence, we shall refer to the distribution g(x tr) in (13) as the "Transformed-Truncated Transformer" or T − X tr distribution.
(iii) The random variable X tr may be discrete, in which case G(x tr) is the cdf of a family of discrete distributions.

Important Characteristics of T − X tr Family of Distributions
(i) The hazard function of the T − X tr family of distributions is expressed as follows:
(ii) The cumulative hazard function of the T − X tr family of distributions is expressed as follows:
(iii) The p th percentile is used to compute the median and other quantile-based statistics, and to generate random numbers as well. The p th percentile of the T − X tr family of distributions is computed as follows:
(iv) The first raw moment of the T − X tr family of distributions is computed as follows:
(v) The moment generating function of the T − X tr distribution is obtained as follows:
(vi) We can generate a family of discrete distributions by taking the random variable X to be discrete. It is defined as follows:

Theorem 1. If a random variable X tr follows the truncated

Mathematical Problems in Engineering
where μ T and η T are, respectively, the mean and the Shannon entropy of the random variable T having pdf r(t).
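The characteristics listed above can be sketched directly from the definitions, assuming the family is given by G(x tr) = R{H(x tr)} and g(x tr) = h(x tr) r{H(x tr)} with H(x tr) = −log[1 − F(x tr)]. The expressions below follow from those assumptions by standard identities and may differ in notation from the article's own displays; the symbols h_G, H_G, and Q are introduced here for illustration.

```latex
% Assumes G(x) = R(H(x)), g(x) = h(x) r(H(x)), H(x) = -\log[1 - F(x)].
\begin{align*}
h_G(x_{tr}) &= \frac{g(x_{tr})}{1 - G(x_{tr})}
             = \frac{h(x_{tr})\, r\{H(x_{tr})\}}{1 - R\{H(x_{tr})\}}, \\
H_G(x_{tr}) &= -\log\bigl[1 - R\{H(x_{tr})\}\bigr], \\
Q(p)        &= F^{-1}\bigl(1 - e^{-R^{-1}(p)}\bigr), \qquad 0 < p < 1, \\
\eta_{X_{tr}} &= -E\bigl[\log f\bigl(F^{-1}(1 - e^{-T})\bigr)\bigr] - \mu_T + \eta_T .
\end{align*}
```

The quantile follows from solving R{H(x)} = p for x. The entropy expression uses the fact that H(X tr) has the same distribution as T, which is consistent with the appearance of μ T and η T in Theorem 1.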

Theorem 2.
The s th order statistic g s;k (x tr ) in a random sample of size k from the T − X tr distribution is given by

Proof. By definition,

T-Truncated Exponential Distribution (T-TEXPD) with Different T Distributions
In this section, we present the truncated exponential distribution combined with different T random variables. Some properties of each case are also discussed. Let X tr be a random variable following the truncated exponential model, having pdf given as follows: and cdf as follows: where ξ is the truncation point. The generalized form of the Shannon entropy for the T-TEXPD is given by where η T is the Shannon entropy of the random variable T.

Type-2 Gumbel-Truncated Exponential Distribution (G-TEXPD).
Let T follow the type-2 Gumbel distribution with two parameters α and β, having pdf as follows: and cdf as follows: Using (24), the pdf of the resulting Gumbel-truncated exponential distribution (G-TEXPD) is as follows: The cdf of the G-TEXPD is as follows: Figure 1(c) shows that the shape of the G-TEXPD tends to become mesokurtic as the value of θ increases. Furthermore, Figure 1(d) is sketched for different values of the parameters.
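The following Python sketch shows one way the G-TEXPD could be evaluated. It assumes that T has the standard type-2 Gumbel cdf R(t) = exp(−β t^(−α)) for t > 0, and that X tr is a left-truncated exponential with cumulative hazard θ(x − ξ). Under those assumptions G(x) = exp{−β [θ(x − ξ)]^(−α)}. The function names and this parameterization are assumptions; the article's own displays may differ.

```python
import math


def _beta_t_pow(t, alpha, beta):
    """Return beta * t**(-alpha), or inf when the power would overflow."""
    exponent = -alpha * math.log(t)
    return beta * math.exp(exponent) if exponent < 700.0 else math.inf


def gtexpd_cdf(x, alpha, beta, theta, xi):
    """Assumed G-TEXPD cdf: exp(-beta * (theta * (x - xi))**(-alpha)) for x > xi."""
    if x <= xi:
        return 0.0
    z = _beta_t_pow(theta * (x - xi), alpha, beta)
    return math.exp(-z)


def gtexpd_pdf(x, alpha, beta, theta, xi):
    """Assumed G-TEXPD pdf, i.e. h(x) * r(H(x)) with h(x) = theta."""
    if x <= xi:
        return 0.0
    t = theta * (x - xi)
    z = _beta_t_pow(t, alpha, beta)
    if math.isinf(z):
        return 0.0  # density vanishes as x approaches xi from the right
    # alpha * beta * t**(-alpha - 1) equals alpha * z / t
    return theta * alpha * (z / t) * math.exp(-z)
```

For example, `gtexpd_cdf(2.0, 3.0, 1.5, 0.8, 1.0)` evaluates the assumed cdf at x = 2 with hypothetical parameter values α = 3, β = 1.5, θ = 0.8, and ξ = 1.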

Distributional Properties of the G-TEXPD
(i) The hazard function of the G-TEXPD is as follows: The hazard is the instantaneous rate at which the event occurs at a given time, conditional on survival up to that time. Figure 2 shows that the G-TEXPD is useful for modeling lifetime data having an upside-down bathtub, concave, or decreasing failure rate function.
(ii) The cumulative hazard function of the G-TEXPD is as follows:
(iii) The p th percentile of the G-TEXPD is as follows:

Theorem 3. Let X tr be a random variable that follows the G-TEXPD; then the Shannon entropy is given by

Proof. We know that using (29) in the above equation, we get □

Theorem 4. The p th raw moment of the G-TEXPD is as follows:

Proof. By definition Using (29) and (30) in (40), we get
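The percentile in item (iii) is what makes random number generation by inversion possible. The sketch below assumes the cdf G(x) = exp{−β [θ(x − ξ)]^(−α)}, which results from combining a standard type-2 Gumbel generator with a left-truncated exponential transformer. Solving G(x) = p gives x = ξ + (1/θ) (β / (−log p))^(1/α). The parameterization and function names are assumptions, not the article's own expressions.

```python
import math
import random


def gtexpd_quantile(p, alpha, beta, theta, xi):
    """Quantile of the assumed G-TEXPD cdf, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return xi + (beta / (-math.log(p))) ** (1.0 / alpha) / theta


def gtexpd_sample(n, alpha, beta, theta, xi, seed=None):
    """Draw n variates by inversion: apply the quantile to uniform draws."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # skip the endpoint 0.0, which has no finite quantile
            draws.append(gtexpd_quantile(u, alpha, beta, theta, xi))
    return draws
```

Setting p = 0.5 gives the median under the assumed form, and `gtexpd_sample` can feed a simulation study such as the one in the next section.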

Estimation of Model Parameters by Using the Maximum Likelihood (ML) Method
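In this section, we estimate the unknown parameters of the G-TEXPD by using the maximum likelihood estimation method as proposed by Johnson et al. [25]. The pdf of the G-TEXPD is as follows: The log-likelihood function of the G-TEXPD is given by Now, computing the first partial derivatives of (43) with respect to a, α, β, and θ and equating the results to zero, we have
$$\frac{\partial \log L(a, \alpha, \beta, \theta; x_{tr})}{\partial a}: \quad \hat{a} = \min_{j} x_{tr,j}, \qquad j = 0, 1, 2, \ldots, n,$$
together with the score equations obtained from $\partial \log L(a, \alpha, \beta, \theta; x_{tr})/\partial \alpha$, $\partial \log L(a, \alpha, \beta, \theta; x_{tr})/\partial \beta$, and $\partial \log L(a, \alpha, \beta, \theta; x_{tr})/\partial \theta$.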

Since equations (46)-(48) are not in closed form, we use a well-known iterative method, the Newton-Raphson method, to obtain approximate ML estimates of the parameters θ, α, and β.
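A Newton-Raphson iteration of this kind can be sketched on a reduced problem. The sketch assumes the cdf G(x) = exp{−β [θ(x − ξ)]^(−α)} with the truncation point ξ known. In this assumed parameterization β and θ enter only through β θ^(−α), so θ is fixed at 1 here; the article's own parameterization may differ. For a given α the score equation for β has the closed form β = n / Σ t_i^(−α) with t_i = x_i − ξ, and Newton-Raphson is applied to the resulting profile score in α. The function name `profile_newton` is hypothetical.

```python
import math


def profile_newton(ts, alpha0=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson for alpha on the profile log-likelihood.

    ts: positive values t_i = x_i - xi (theta fixed at 1, an assumption).
    Returns (alpha_hat, beta_hat).
    """
    n = len(ts)
    logs = [math.log(t) for t in ts]
    sum_log = sum(logs)
    alpha = alpha0
    for _ in range(max_iter):
        w = [math.exp(-alpha * lt) for lt in logs]          # t_i ** (-alpha)
        s0 = sum(w)
        s1 = sum(wi * lt for wi, lt in zip(w, logs))
        s2 = sum(wi * lt * lt for wi, lt in zip(w, logs))
        # First and second derivatives of the profile log-likelihood in alpha.
        score = n / alpha + n * s1 / s0 - sum_log
        hess = -n / alpha ** 2 - n * (s2 / s0 - (s1 / s0) ** 2)
        step = score / hess
        new_alpha = alpha - step
        while new_alpha <= 0.0:      # halve the step to stay in alpha > 0
            step /= 2.0
            new_alpha = alpha - step
        alpha = new_alpha
        if abs(step) < tol:
            break
    beta = n / sum(math.exp(-alpha * lt) for lt in logs)
    return alpha, beta
```

The second derivative is strictly negative for every α > 0, so the profile log-likelihood is concave and the iteration has a single target. A full fit of all parameters would need a multivariate version of the same update.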

Asymptotic Confidence Bounds.
The ML estimates of the unknown parameters α, β, and θ of the G-TEXPD are not available in closed form. In this situation, we compute the asymptotic confidence bounds of the G-TEXPD based on the asymptotic distribution of the MLE. The Fisher information matrix can be used for interval estimation and hypothesis testing. For the G-TEXPD, the information matrix is obtained by computing the second partial derivatives from equations (46)-(48) as follows: The entries of the Fisher information matrix of the G-TEXPD are as follows: The asymptotic confidence intervals are obtained by using either the approximate normal distribution or the approximate log-normal distribution of the ML estimates. The estimated standard errors of $\hat{\alpha}$, $\hat{\beta}$, and $\hat{\theta}$ are expressed as follows: For instance, the confidence intervals for α based on the approximate normal distribution and the approximate log-normal distribution are as follows: respectively, where $\delta_{1-\xi/2}$ is the $(1 - \xi/2)$ percentile of the standard normal distribution. The log-normal approximation works well if the standard error of a parameter is greater than half of its point estimate.
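As a hedged illustration of the two interval types, the sketch below uses the textbook forms: estimate ± z × SE for the normal approximation, and estimate × exp(± z × SE / estimate) for the log-normal approximation. These forms and the function names are assumptions, since the article's own expressions are not reproduced here. The standard error is taken as given, for example as the square root of the corresponding diagonal entry of the inverse information matrix.

```python
import math
from statistics import NormalDist


def wald_ci(estimate, se, level=0.99):
    """Two-sided normal-approximation interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return estimate - z * se, estimate + z * se


def lognormal_ci(estimate, se, level=0.99):
    """Two-sided log-normal-approximation interval for a positive parameter."""
    if estimate <= 0.0:
        raise ValueError("the log-normal interval needs a positive estimate")
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    factor = math.exp(z * se / estimate)
    return estimate / factor, estimate * factor
```

The log-normal interval always stays positive, which is why it is the natural choice when the standard error is large relative to the estimate and the normal interval would cross zero.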

Simulation Study
In this section, a Monte Carlo simulation is carried out to study the stability of the model parameters. The Monte Carlo simulation is run 5000 times for four different combinations of the parameters, drawing random samples of size n each from the G-TEXPD (α, β, and θ). The model parameters are estimated by the maximum likelihood (ML) method. Table 2 presents the average maximum likelihood estimates (MLEs) of the three parameters with standard errors (SEs), biases, mean square errors (MSEs), and the corresponding 99% coverage probabilities for the approximate confidence intervals.
(3) Similarly, the two-sided asymptotic 100(1 − ε)% CI for the parameter α is computed by using where Z ε represents the 100(1 − ε)% percentile of the standard normal distribution.
(4) The simulated coverage probabilities for the two-sided approximate 99% confidence intervals, $\Pr(\alpha \in \hat{I})$ (where the parameter of interest α is estimated by $\hat{\alpha}$ and the confidence interval by $\hat{I}$), based on the normal approximation are computed. The same algorithm is repeated for the other two population parameters, β and θ.

Table 2 contains the results of the simulation study for different values of the G-TEXPD parameters. Table 2 shows that the standard error (SE) of the additional parameter θ is systematically lower than those of α and β. Moreover, the simulation results indicate that the 99% coverage probabilities for the approximate confidence intervals of the true parameters based on the MLEs are satisfactory. It is interesting to note that the estimate of β tends to the true parameter value provided β is large. Based on the coverage probability results, we observe that the estimates of α, β, and θ can be statistically meaningful even in rather small samples.
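The summary quantities reported in such a study (average estimate, bias, MSE, and coverage probability) can be sketched as follows. To keep the example short and self-contained, a plain exponential model with a closed-form MLE is swapped in for the G-TEXPD. The function name `simulate` and all numerical settings are hypothetical; this is not the article's simulation.

```python
import math
import random
from statistics import NormalDist, mean


def simulate(rate, n, reps, level=0.99, seed=0):
    """Monte Carlo summary for the MLE of an exponential rate (stand-in model)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    estimates, covered = [], 0
    for _ in range(reps):
        xs = [rng.expovariate(rate) for _ in range(n)]
        est = n / sum(xs)              # MLE of the rate
        se = est / math.sqrt(n)        # asymptotic SE from information n / rate**2
        estimates.append(est)
        if est - z * se <= rate <= est + z * se:
            covered += 1               # interval contains the true value
    avg = mean(estimates)
    return {
        "mean": avg,
        "bias": avg - rate,
        "mse": mean((e - rate) ** 2 for e in estimates),
        "coverage": covered / reps,
    }


summary = simulate(rate=2.0, n=50, reps=2000, seed=1)
```

Replacing the data generator and the estimator with those of the model under study gives the same table columns; the coverage entry is the fraction of replicates whose interval contains the true parameter.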
The descriptive statistics of the CO2 uptake rate (μL/L) given in Table 3 provide an initial analysis, which suggests that the data are skewed and right-tailed. Table 4 provides the model parameters estimated for the CO2 uptake rate (μL/L) by the ML method, along with a comparison of the G-TEXPD with the Gumbel-exponential (G-EXPD) distribution obtained using the methodology of Alzaatreh et al. [19], and with the truncated exponential (TEXPD), Weibull, Gamma, and exponential distributions. The negative log-likelihood ($\hat{l}$), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) are computed to compare the models. The model with the highest negative log-likelihood and the lowest AIC and BIC is preferred. The results are given in Table 4, where the proposed model has the highest value of negative log-likelihood and the lowest values of AIC and BIC for the given data. Table 5 shows the values of different goodness-of-fit statistics for the distributions under comparison. The distribution having the smallest values of the test statistics fits best. Table 5 indicates that the G-TEXPD provides statistically the best fit to the data. Figure 3 shows that the pdf of the G-TEXPD fits the histogram of the CO2 uptake rate (μL/L) more adequately than the other models. In addition, the observed probabilities are plotted against the predicted probabilities to check the goodness of fit. The G-TEXPD follows the diagonal line more closely than the other models.
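The comparison criteria can be computed with a few lines of Python. The sketch uses the standard definitions AIC = 2k − 2 log L and BIC = k log n − 2 log L, where k is the number of estimated parameters and n the sample size, together with the Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf. The model names and log-likelihood values in the example are hypothetical placeholders, not results from Tables 4 and 5.

```python
import math


def aic(loglik, k):
    """Akaike information criterion from a maximized log-likelihood."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion from a maximized log-likelihood."""
    return k * math.log(n) - 2.0 * loglik


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
fits = {"model_A": (-100.0, 3), "model_B": (-104.0, 1)}
best = min(fits, key=lambda name: aic(*fits[name]))
```

For a fixed number of parameters, a larger maximized log-likelihood always gives smaller AIC and BIC, so the criteria differ from a raw likelihood comparison only through the penalty on k.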

Concluding Remarks
Conventionally, distributions with infinite domains have been used to model finite datasets, which results in extrapolating the data. Extrapolation may introduce a greater element of uncertainty into estimation, prediction, and interpretation. We have attempted to address this issue by proposing a new T − X tr family of truncated models. Different characteristics of the proposed T − X tr family, such as the hazard function, cumulative hazard function, quantile function, Shannon entropy, and order statistics, are derived to study the behavior of the distribution. We generated the type-2 Gumbel-truncated exponential distribution (G-TEXPD), along with different properties, by applying the proposed methodology. A Monte Carlo simulation of the G-TEXPD is conducted for different combinations of the parameters and different sample sizes to study the average ML estimates, standard errors, biases, mean square errors, and corresponding coverage probabilities of the 99% confidence intervals. The simulation results indicate that the 99% coverage probabilities for the approximate confidence intervals of the true parameters based on the MLEs are satisfactory. The coverage probability results also show that the estimates of α, β, and θ can be statistically meaningful even in rather small samples. A real-life example from ecophysiology is presented to compare the performance of our model with truncated and untruncated contemporary models. Several statistical criteria, i.e., the negative log-likelihood, AIC, BIC, Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling statistics, and the P-P plot, are used to collect evidence for the better performance of the G-TEXPD.

Figure 3: The fitted pdfs of the Gumbel-TEXPD, Gumbel-exponential, truncated-exponential, Weibull, gamma, and exponential models on the histogram of CO2 uptake rate (μL/L) along with their probability plots. (a) Histogram and theoretical densities. (b) P-P plot.
The use of truncated models for finite datasets helps to obtain more precise estimation, prediction, and interpretation. Several directions remain open for further research. This methodology can be utilized to generate a new family of T − X tr distributions by truncating the random variable X tr on the right or on both sides. New families of exponentiated, transmuted, and weighted distributions can be derived using the proposed function. This study can be extended to compare the estimators for censored data. In addition to the existing estimation algorithms, bootstrapping and Bayesian techniques can be implemented for distributions generated from the T − X tr family. Furthermore, bivariate and multivariate versions of truncated distributions can be developed following the proposed technique.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.