A Flexible Reduced Logarithmic-X Family of Distributions with Biomedical Analysis

Statistical distributions play a prominent role in applied sciences, particularly in biomedical sciences. Medical data sets are generally skewed to the right, and skewed distributions can be used quite effectively to model such data sets. In the present study, therefore, we propose a new family of distributions to model right-skewed medical data sets. The proposed family may be named the flexible reduced logarithmic-X family. It can be obtained by reparameterizing the exponentiated Kumaraswamy G-logarithmic family and the alpha logarithmic family of distributions. A special submodel of the proposed family, called the flexible reduced logarithmic-Weibull distribution, is discussed in detail. Some mathematical properties of the proposed family and certain related characterization results are presented. The maximum likelihood estimators of the model parameters are obtained. A brief Monte Carlo simulation study is carried out to evaluate the performance of these estimators. Finally, for illustrative purposes, three applications from biomedical sciences are analyzed, and the goodness of fit of the proposed distribution is compared with that of some well-known competitors.


Introduction
The statistical analysis and modeling of lifetime phenomena are essential in almost all areas of applied sciences, particularly in biomedical sciences. A number of parametric continuous distributions for modeling lifetime data sets have been proposed in the literature, including the exponential, Rayleigh, gamma, lognormal, and Weibull distributions, among others. The exponential, Rayleigh, and Weibull distributions are more popular than the gamma and lognormal distributions, since the survival functions of the gamma and lognormal distributions cannot be expressed in closed form and hence both require numerical integration to arrive at the mathematical properties. The exponential and Rayleigh distributions are commonly used in lifetime analysis. These distributions, however, are not flexible enough to counter complex forms of data. For example, the exponential distribution is capable of modeling data with a constant failure rate function only, whereas the Rayleigh distribution offers data modeling with only an increasing failure rate function. The Weibull distribution, also known as the super exponential distribution, is more flexible than the aforementioned distributions. It offers the characteristics of both the exponential and Rayleigh distributions and is capable of modeling data with a monotonic (increasing, decreasing, or constant) hazard rate function. Unfortunately, the Weibull distribution is not capable of modeling data with a nonmonotonic (unimodal, modified unimodal, or bathtub-shaped) failure rate function. In some medical situations, for example, neck cancer, bladder cancer, and breast cancer, the hazard rate is shown to have a unimodal or modified unimodal shape. In particular, the hazard rates for neck, bladder, and breast cancer recurrence after surgical removal have been observed to have a unimodal shape.
In the very initial phase, the hazard rate for cancer recurrence begins at a low level, increases gradually over a finite period of time after the surgical removal until it reaches a peak, and then decreases. Another example of the unimodal shape is the hazard of infection with some new viruses, which increases in the early stages from a low level until it reaches a peak and then decreases; for details, see [1]. In view of the importance of the unimodal failure rate function in biomedical sciences, a series of papers have appeared proposing new distributions capable of modeling medical data with a unimodal failure rate function [2][3][4][5][6][7][8]. In recent years, researchers have shown a trend toward proposing new families of distributions to obtain more flexible models. In this regard, [9] introduced the Marshall-Olkin generated (MOG) family by introducing an extra parameter to the Weibull distribution. The cumulative distribution function (cdf) of the MOG family is given by

$$G(x; \sigma, \xi) = \frac{F(x; \xi)}{1 - (1 - \sigma)\,[1 - F(x; \xi)]}, \quad \sigma > 0, \tag{1}$$

where σ is an additional parameter and F(x; ξ) is the cdf of the baseline model, which may depend on the parameter vector ξ. [10] proposed another method of constructing new lifetime distributions, known as the alpha power transformation approach, via the cdf

$$G(x; \alpha, \xi) = \frac{\alpha^{F(x; \xi)} - 1}{\alpha - 1}, \quad \alpha > 0,\ \alpha \neq 1. \tag{2}$$

Using (2), [10] and [11] introduced the alpha power exponential (APE) and alpha power transformed Weibull (APTW) distributions, respectively. We carry this branch of distribution theory further and introduce a new flexible class of distributions which can be used in modeling unimodal medical care data sets. Tahir and Cordeiro [12] proposed the exponentiated Kumaraswamy G-logarithmic (EKuG-L) class of distributions, defined by a cdf, labelled (3), with parameters a, b, θ > 0 and p ∈ (0, 1). For the EKuG-L family of distributions, the parametric space of p is restricted to (0, 1). Due to this restriction, the EKuG-L family may not be flexible enough to counter complex forms of data. Furthermore, the EKuG-L family has four additional parameters.
Note that the expression (3) is not true for p = 1. Furthermore, due to the higher number of parameters, the estimation of the parameters as well as the computation of many distributional characteristics becomes very difficult. Therefore, in this paper, an attempt has been made to propose a more flexible class of distributions, called the flexible reduced logarithmic-X (FRL-X) family, by reparameterizing (3). The new family is introduced for a = b = θ = 1 (to reduce the number of parameters and so avoid the difficulties in the computation of mathematical properties) and by reparameterizing p = 1 + σ (to relax the upper limit of the parametric space of p), where σ > 0. In view of the unrestricted upper bound, the proposed distribution would be quite flexible in modeling complex forms of data. Thus, the motivation for proposing the FRL-X family is to reduce the number of parameters as well as to relax the boundary conditions of the parametric values, bringing more flexibility to the shape of the hazard rate function than the classical monotone behavior. In addition, the extra parameter enriches the description of the data and gives more information about the behavior of the hazard rate function in the tail. A random variable X is said to have the FRL-X distribution if its cdf is given by

$$G(x; \sigma, \xi) = 1 - \frac{\log\{1 + \sigma - \sigma F(x; \xi)\}}{\log(1 + \sigma)}, \quad \sigma > 0,\ x \in \mathbb{R}, \tag{4}$$

where F(x; ξ) is the cdf of the baseline random variable depending on the parameter ξ, and σ is an additional parameter. The expression (4) is also true for σ = 1. The probability density function (pdf) corresponding to (4) is given by

$$g(x; \sigma, \xi) = \frac{\sigma f(x; \xi)}{\log(1 + \sigma)\,\{1 + \sigma - \sigma F(x; \xi)\}}. \tag{5}$$

The new pdf is most tractable when F(x; ξ) and f(x; ξ) have simple analytical expressions. Henceforth, a random variable X with pdf (5) is denoted by X ∼ FRL-X(x; σ, ξ). Furthermore, for the sake of simplicity, the dependence on the vector of parameters is omitted and G(x) = G(x; σ, ξ) will be used.
Moreover, the key motivations for using the FRL-X family in practice are
(1) A very simple and convenient method of adding an additional parameter to modify the existing distributions
(2) To improve the characteristics and flexibility of the existing distributions
(3) To introduce the extended version of the baseline distribution having closed forms for the cdf, sf, and hrf
(4) To provide better fits than the competing modified models
(5) To introduce new distributions having nonmonotonic shaped hazard rate functions
(6) To provide the best fit to unimodal medical care data sets

The FRL-X family can also be obtained by reparameterizing the alpha logarithmic family (ALF) proposed by [13]. The cdf of the ALF family is given by

$$G(x; \alpha, \xi) = 1 - \frac{\log\{\alpha - (\alpha - 1) F(x; \xi)\}}{\log \alpha}, \quad \alpha > 0,\ \alpha \neq 1. \tag{6}$$

The problem with the ALF family is that it requires α ≠ 1, and consequently the parametric space of α is restricted. The FRL-X family addresses this problem by reparameterizing α as α = σ + 1. The advantage of the FRL-X family over the ALF is that σ = 1 is acceptable, and its parametric space is not restricted. Furthermore, for σ = 1, the FRL-X family reduces to the logarithmic transformed family of [14] given by

$$G(x; \xi) = 1 - \frac{\log\{2 - F(x; \xi)\}}{\log 2}. \tag{7}$$

The survival function (sf) and hazard function of the FRL-X family are given, respectively, by

$$S(x; \sigma, \xi) = \frac{\log\{1 + \sigma - \sigma F(x; \xi)\}}{\log(1 + \sigma)}, \qquad h(x; \sigma, \xi) = \frac{\sigma f(x; \xi)}{\{1 + \sigma - \sigma F(x; \xi)\}\,\log\{1 + \sigma - \sigma F(x; \xi)\}}. \tag{8}$$

The rest of this article is organized as follows. In Section 2, a special submodel of the proposed family is discussed. Some mathematical properties are obtained in Section 3. The characterization results are presented in Section 4. Maximum likelihood estimates of the model parameters are obtained in Section 5. A comprehensive Monte Carlo simulation study is conducted in Section 6. Section 7 is devoted to analyzing three real-life applications. A future framework is discussed in Section 8. Finally, concluding remarks are provided in the last section.
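Then, for the Weibull baseline F(x; ξ) = 1 − e^{−cx^α} with ξ = (α, c), the cdf of the FRL-W distribution has the following expression:

$$G(x; \sigma, \xi) = 1 - \frac{\log\{1 + \sigma e^{-c x^{\alpha}}\}}{\log(1 + \sigma)}, \quad x \geq 0,\ \sigma, \alpha, c > 0. \tag{9}$$

The density function corresponding to (9) is given by

$$g(x; \sigma, \xi) = \frac{\sigma \alpha c\, x^{\alpha - 1} e^{-c x^{\alpha}}}{\log(1 + \sigma)\,\{1 + \sigma e^{-c x^{\alpha}}\}}. \tag{10}$$

Plots of the pdf of the FRL-W distribution are sketched in Figure 1 for selected values of the model parameters.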
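As a minimal sketch of how the FRL-W functions might be evaluated numerically, the snippet below assumes the FRL-X form G = 1 − log(1 + σ − σF)/log(1 + σ), which is what the FRLE-W cdf (39) gives at θ = 1, together with the Weibull baseline F(x) = 1 − e^{−cx^α}. The function names are illustrative and are not taken from the paper.

```python
import math


def frlw_sf(x, sigma, alpha, c):
    """Survival function S(x) = log(1 + sigma*exp(-c*x**alpha)) / log(1 + sigma)."""
    if x <= 0.0:
        return 1.0
    tail = math.exp(-c * x ** alpha)  # Weibull survival 1 - F(x)
    return math.log1p(sigma * tail) / math.log1p(sigma)


def frlw_cdf(x, sigma, alpha, c):
    """Cdf G(x) = 1 - S(x) under the assumed FRL-W form."""
    return 1.0 - frlw_sf(x, sigma, alpha, c)


def frlw_pdf(x, sigma, alpha, c):
    """Density for x > 0, obtained by differentiating the assumed cdf."""
    tail = math.exp(-c * x ** alpha)
    weibull_pdf = alpha * c * x ** (alpha - 1.0) * tail
    return sigma * weibull_pdf / (math.log1p(sigma) * (1.0 + sigma * tail))


def frlw_hazard(x, sigma, alpha, c):
    """Hazard h(x) = g(x) / S(x) for x > 0."""
    tail = math.exp(-c * x ** alpha)
    weibull_pdf = alpha * c * x ** (alpha - 1.0) * tail
    return sigma * weibull_pdf / ((1.0 + sigma * tail) * math.log1p(sigma * tail))
```

Evaluating `frlw_hazard` on a grid of x values for a few (σ, α, c) combinations is one simple way to inspect which hazard shapes the submodel can produce.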

Basic Mathematical Properties
In this section, some statistical properties of the FRL-X family are derived.

Quantile Function.
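Let X be the FRL-X random variable with cdf (4). The quantile function of X, say Q(u), is given by

$$Q(u) = G^{-1}(u) = F^{-1}\!\left(\frac{1 + \sigma - (1 + \sigma)^{1 - u}}{\sigma}\right), \tag{11}$$

where u ∈ (0, 1). From the expression (11), it is clear that the FRL-X family has a closed-form quantile function whenever the baseline quantile function F^{-1} is available in closed form, which makes it easy to generate random numbers.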

Moments.
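Moments are very important and play an essential role in statistical analysis, especially in applications. They help to capture the important features and characteristics of a distribution (e.g., central tendency, dispersion, skewness, and kurtosis). The rth moment of the FRL-X family is derived as

$$\mu'_r = \int_{-\infty}^{\infty} x^{r}\, g(x; \sigma, \xi)\, dx. \tag{12}$$

Using (5) in (12), we have

$$\mu'_r = \frac{\sigma}{\log(1 + \sigma)} \int_{-\infty}^{\infty} \frac{x^{r} f(x; \xi)}{1 + \sigma - \sigma F(x; \xi)}\, dx. \tag{13}$$

Using the series representation (14), we arrive at (15), and using (15) in (13), we obtain the rth moment in series form. Furthermore, a general expression for the moment generating function (mgf) of the FRL-X family, whenever it exists, is given by

$$M_X(t) = \sum_{r=0}^{\infty} \frac{t^{r}}{r!}\, \mu'_r.$$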
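When a closed-form series is not needed, the moments can also be approximated numerically through the identity E[X^r] = ∫₀¹ Q(u)^r du. The sketch below does this with a midpoint rule for the FRL-W case, assuming the FRL-X cdf G = 1 − log(1 + σ − σF)/log(1 + σ) and a Weibull baseline F(x) = 1 − e^{−cx^α}. The function names and the grid size are illustrative choices, not part of the paper.

```python
import math


def frlw_quantile(u, sigma, alpha, c):
    """Quantile of the assumed FRL-W model for 0 <= u < 1."""
    surv = math.expm1((1.0 - u) * math.log1p(sigma)) / sigma  # baseline survival
    surv = min(surv, 1.0)
    return (max(0.0, -math.log(surv)) / c) ** (1.0 / alpha)


def frlw_moment(r, sigma, alpha, c, n_grid=20000):
    """Approximate E[X**r] by a midpoint rule applied to Q(u)**r on (0, 1)."""
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        total += frlw_quantile((i + 0.5) * h, sigma, alpha, c) ** r
    return h * total
```

As σ approaches 0 the assumed cdf tends to the baseline cdf, so for very small σ the result should be close to the corresponding Weibull moment, which gives a simple sanity check.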

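Residual and Reverse Residual Life.
The residual life offers wide applications in reliability theory and risk management. The residual lifetime of the FRL-X random variable, denoted by R(t) = X − t given X > t, has the survival function

$$S_{R(t)}(x) = \frac{S(x + t)}{S(t)} = \frac{\log\{1 + \sigma - \sigma F(x + t; \xi)\}}{\log\{1 + \sigma - \sigma F(t; \xi)\}}, \quad x \geq 0.$$

Additionally, the reverse residual life of the FRL-X random variable, denoted by $\bar{R}(t)$ = t − X given X ≤ t, has the survival function

$$S_{\bar{R}(t)}(x) = \frac{G(t - x)}{G(t)} = \frac{\log(1 + \sigma) - \log\{1 + \sigma - \sigma F(t - x; \xi)\}}{\log(1 + \sigma) - \log\{1 + \sigma - \sigma F(t; \xi)\}}, \quad 0 \leq x \leq t.$$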

Characterization Results
This section is devoted to the characterizations of the FRL-X distribution based on a simple relationship between two truncated moments. It should be mentioned that for this characterization the cdf is not required to have a closed form. The first characterization result employs a theorem due to [15]; see Theorem 1 below. Note that the result holds also when the interval H is not closed. Moreover, as shown in [23], this characterization is stable in the sense of weak convergence.

Theorem 1. Let (Ω, Ƒ, P) be a given probability space and let H = [d, e] be an interval for some d < e (d = −∞, e = ∞ might as well be allowed). Let X: Ω ⟶ H be a continuous random variable with the distribution function G and let $q_1$ and $q_2$ be two real functions defined on H such that

$$E[q_2(X) \mid X \geq x] = E[q_1(X) \mid X \geq x]\, \eta(x), \quad x \in H,$$

is defined with some real function η. Assume that $q_1, q_2 \in C^1(H)$, $\eta \in C^2(H)$, and G is a twice continuously differentiable and strictly monotone function on the set H. Finally, assume that the equation $\eta q_1 = q_2$ has no real solution in the interior of H. Then G is uniquely determined by the functions $q_1$, $q_2$, and η, particularly

$$G(x) = \int_{d}^{x} C \left| \frac{\eta'(u)}{\eta(u) q_1(u) - q_2(u)} \right| \exp(-s(u))\, du,$$

where the function s is a solution of the differential equation

$$s' = \frac{\eta' q_1}{\eta q_1 - q_2}$$

and C is the normalization constant such that $\int_H dG = 1$.

Remark. The goal in Theorem 1 is to have η as simple as possible.

(5) if and only if the function η defined in Theorem 1 is of the form

Proof. Let X be a random variable with pdf (5), then and finally Conversely, if η is given as above, then and hence Now, in view of Theorem 1, X has density (5).
The general solution of the differential equation in Corollary 1 is where D is a constant. We would like to point out that one set of functions satisfying the above differential equation is given in Proposition 1 with D = 1/2. Clearly, there are other triplets ($q_1$, $q_2$, η) which satisfy the conditions of Theorem 1.

Estimation
In this section, the method of maximum likelihood estimation is used to estimate the model parameters. Furthermore, the robustness is also discussed.

Maximum Likelihood Estimation.
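In this subsection, the maximum likelihood estimators (MLEs) of the parameters σ and ξ of the FRL-X family from complete samples are derived. Let $X_1, X_2, \ldots, X_k$ be a simple random sample from the FRL-X family with observed values $x_1, x_2, \ldots, x_k$. The log-likelihood function for this sample is

$$\log L(x; \sigma, \xi) = k \log \sigma - k \log\{\log(1 + \sigma)\} + \sum_{i=1}^{k} \log f(x_i; \xi) - \sum_{i=1}^{k} \log\{1 + \sigma - \sigma F(x_i; \xi)\}. \tag{29}$$

Obtaining the partial derivatives of (29), we have

$$\frac{\partial}{\partial \sigma} \log L(x; \sigma, \xi) = \frac{k}{\sigma} - \frac{k}{(1 + \sigma)\log(1 + \sigma)} - \sum_{i=1}^{k} \frac{1 - F(x_i; \xi)}{1 + \sigma - \sigma F(x_i; \xi)},$$

$$\frac{\partial}{\partial \xi} \log L(x; \sigma, \xi) = \sum_{i=1}^{k} \frac{\partial f(x_i; \xi)/\partial \xi}{f(x_i; \xi)} + \sigma \sum_{i=1}^{k} \frac{\partial F(x_i; \xi)/\partial \xi}{1 + \sigma - \sigma F(x_i; \xi)}.$$

Setting (∂/∂σ) log L(x; σ, ξ) and (∂/∂ξ) log L(x; σ, ξ) equal to zero and solving these expressions numerically and simultaneously yields the MLEs of (σ, ξ).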
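The log-likelihood and its derivative with respect to σ can be sketched for the FRL-W case as follows. The code assumes the density g = σf/[log(1 + σ)(1 + σ − σF)] with a Weibull baseline F(x) = 1 − e^{−cx^α}, and strictly positive observations. The function names are hypothetical, and the paper does not prescribe a particular implementation.

```python
import math


def frlw_loglik(data, sigma, alpha, c):
    """Log-likelihood of the assumed FRL-W model for positive observations."""
    k = len(data)
    total = k * math.log(sigma) - k * math.log(math.log1p(sigma))
    for x in data:
        tail = math.exp(-c * x ** alpha)  # 1 - F(x) for the Weibull baseline
        log_f = (math.log(alpha) + math.log(c)
                 + (alpha - 1.0) * math.log(x) - c * x ** alpha)
        total += log_f - math.log1p(sigma * tail)  # 1 + sigma - sigma*F = 1 + sigma*tail
    return total


def frlw_score_sigma(data, sigma, alpha, c):
    """Partial derivative of the log-likelihood with respect to sigma."""
    k = len(data)
    score = k / sigma - k / ((1.0 + sigma) * math.log1p(sigma))
    for x in data:
        tail = math.exp(-c * x ** alpha)
        score -= tail / (1.0 + sigma * tail)
    return score
```

In practice the negative of `frlw_loglik` would typically be handed to a general-purpose numerical optimizer (for example `scipy.optimize.minimize`), with the analytic score serving as a check on the numerical gradient.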

M-Estimator as a Robust Estimation.
Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially nonnormal distributions. Robust statistical approaches have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from the parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, nonrobust methods like a t-test work poorly. Historically, several approaches to robust estimation were proposed, including R-estimators and L-estimators. However, M-estimators now appear to dominate the field as a result of their generality, high breakdown point, and efficiency. M-estimators are a generalization of the maximum likelihood estimators (MLEs). What we try to do with MLEs is to maximize $\prod_{i=1}^{n} f(x_i)$ or, equivalently, minimize $\sum_{i=1}^{n} -\log f(x_i)$. [16] proposed to generalize this to the minimization of $\sum_{i=1}^{n} \rho(x_i)$, where ρ is some function. MLEs are therefore a special case of M-estimators. Minimizing $\sum_{i=1}^{n} \rho(x_i)$ can often be done by differentiating ρ and solving $\sum_{i=1}^{n} \varphi(x_i) = 0$, where φ(x) = ∂ρ(x)/∂x; for further detail, we refer the interested readers to [17,18].
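To make the idea concrete, the sketch below computes an M-estimate of location using Huber's ρ function, solved by iteratively reweighted means. The choice of Huber's ρ, the tuning constant k = 1.345, and the MAD-based scale are assumptions made for illustration, since the text does not specify a particular ρ, and the function name is hypothetical.

```python
import statistics


def huber_location(data, k=1.345, tol=1e-9, max_iter=200):
    """Huber M-estimate of location via iteratively reweighted means."""
    values = list(data)
    mu = statistics.median(values)
    # Scale is held fixed at the normalised median absolute deviation (MAD).
    mad = statistics.median([abs(x - mu) for x in values])
    scale = mad / 0.6745 if mad > 0 else 1.0
    for _ in range(max_iter):
        weights = []
        for x in values:
            r = abs(x - mu) / scale
            # Weight 1 inside the threshold, k/r outside (Huber's psi(r)/r).
            weights.append(1.0 if r <= k else k / r)
        new_mu = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu
```

With a gross outlier in the sample, the estimate stays near the bulk of the data, whereas the arithmetic mean is pulled toward the outlier.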

Monte Carlo Simulation Study
This section offers a comprehensive simulation study to assess the behavior of the MLEs. The FRL-X family is easily simulated by inverting (4). The expression (4) can be used to simulate any special submodel of the FRL-X family. Here, we consider the FRL-W distribution to assess the behavior of the MLEs of the proposed method. We simulate the FRL-W distribution for two sets of parameters (Set 1: α = 0.7, σ = 1. For w = α, c, β, we consider the sample sizes n = 25, 50, 100, 200, 400, 600, 800, 900, and 1000. The empirical results are given in Tables 1 and 2. Corresponding to Tables 1 and 2, the simulation results are graphically displayed in Figures 2-5. From the simulation results, we conclude that
(i) Biases for all parameters are positive
(ii) The parameters tend to be stabilized
(iii) Estimated biases decrease when the sample size n increases
(iv) Estimated MSEs decay toward zero when the sample size n increases

Comparative Study
In this section, we illustrate the flexibility of the proposed model via three biomedical data sets. We also compare the proposed model with other well-known models, namely the Weibull, Marshall-Olkin Weibull, and alpha power transformed Weibull distributions. To determine the optimum model, we compute the Anderson-Darling (AD) and Cramér-von Mises (CM) test statistics,

$$AD = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \log G(x_i) + \log\{1 - G(x_{n-i+1})\} \right],$$

$$CM = \frac{1}{12n} + \sum_{i=1}^{n} \left[ \frac{2i - 1}{2n} - G(x_i) \right]^{2},$$

where n is the sample size and $x_i$ is the ith sample value when the data are sorted in ascending order, together with the Kolmogorov-Smirnov (KS) statistic,

$$KS = \sup_{x} \left| G_n(x) - G(x) \right|,$$

where $G_n(x)$ is the empirical cdf and $\sup_x$ is the supremum of the set of distances. A distribution with lower values of these measures is considered a good candidate model among the applied distributions for the underlying data sets.

The first data set concerns bladder cancer patients; see [19]. The FRL-W and the considered distributions are applied to this data set. The maximum likelihood estimates of the models for the analyzed data are presented in Table 3, whereas the goodness of fit measures of the proposed and other competitive models are provided in Table 4. From Table 4, it is clear that the proposed distribution has lower values of these measures than the other models applied in the comparison. The box plot and Time Scale TTT plot of the first data set are presented in Figure 6. The fitted pdf and cdf of the proposed model are plotted in Figure 7, whereas the PP and Kaplan-Meier survival plots of the proposed model for the first data set are sketched in Figure 8. From the Time Scale TTT plot (Figure 6), we can see that the first data set possesses unimodal behavior. Also, from the box plot in Figure 6, we can easily observe that the bladder cancer patient data set is positively skewed. From Figure 7, the proposed model fits the estimated pdf and cdf very closely. From Figure 8, we can easily see that the proposed model closely follows the PP plot, which is an empirical tool for finding the best candidate model.
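The goodness-of-fit measures used in this section can be sketched in a few lines. The snippet below uses the standard textbook definitions of the Anderson-Darling, Cramér-von Mises, and Kolmogorov-Smirnov statistics for a fully specified fitted cdf. The function name and its signature are illustrative assumptions, and p values are not computed.

```python
import math


def gof_statistics(data, cdf):
    """Return (AD, CM, KS) for a sample and a fitted cdf with values in (0, 1)."""
    xs = sorted(data)
    n = len(xs)
    g = [cdf(x) for x in xs]  # fitted cdf at the order statistics
    ad = -n - sum(
        (2 * i - 1) * (math.log(g[i - 1]) + math.log(1.0 - g[n - i]))
        for i in range(1, n + 1)
    ) / n
    cm = 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2.0 * n) - g[i - 1]) ** 2 for i in range(1, n + 1)
    )
    # The supremum over x is attained at a jump of the empirical cdf.
    ks = max(
        max(i / n - g[i - 1], g[i - 1] - (i - 1) / n) for i in range(1, n + 1)
    )
    return ad, cm, ks
```

For example, `gof_statistics(sample, lambda x: fitted_cdf(x))` would be called once per candidate model, and the model with the smallest values preferred.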

Data 2: The Survival Times of Neck Cancer Patient Data.
The second data set consists of 44 observations taken from [20] and represents the survival times of a group of patients suffering from head and neck cancer and treated using a combination of radiotherapy. This data set was also used by [21]. We also applied the FRL-W and the other selected distributions to the second data set. Again, we observe that the proposed model outclasses the other competitors. Corresponding to data 2, the values of the model parameters are presented in Table 5. The analytical measures of the proposed and other competitive models are provided in Table 6. The box plot of the second data set and the corresponding Time Scale TTT plot of the FRL-W are presented in Figure 9. The estimated pdf and cdf are sketched in Figure 10, which shows that the proposed distribution fits the estimated pdf and cdf plots very closely, whereas the PP and Kaplan-Meier survival plots are presented in Figure 11. From the Time Scale TTT plot (Figure 9), we can see that the second data set possesses unimodal behavior. Also, from the box plot in Figure 9, we can easily observe that the neck cancer data set is positively skewed. The proposed model also provides the best fit to the neck cancer data (see Table 6), and the proposed distribution fits the estimated pdf, cdf, and Kaplan-Meier survival plots very closely.

Data 3: The Guinea Pigs Infected Data.
The third data set consists of 72 observations taken from [22] representing the guinea pigs infected with virulent tubercle bacilli. Again, the FRL-W and other competitors are applied to this data set. Analyzing the third data set, we observe that the proposed model provides a better fit than the other competitors. Corresponding to data 3, the values of the model parameters are presented in Table 7.
The analytical measures of the proposed and other competitive models are provided in Table 8. The box plot of the third data set and the corresponding Time Scale TTT plot of the FRL-W are presented in Figure 12. The estimated pdf and cdf are sketched in Figure 13, whereas the PP and Kaplan-Meier survival plots are provided in Figure 14. Figures 12-14 reveal that the FRL-W distribution provides a superior fit to the guinea pigs infected data.

Discussion and Future Frame Work
Statistical decision theory addresses the state of uncertainty and provides a rational framework for dealing with the problems of medical decision-making. Medical data sets are generally skewed to the right, and positively skewed distributions are reasonably competitive when describing unimodal medical data. The traditional distributions are not flexible enough to counter complex forms of data, such as medical sciences data having a nonmonotonic failure rate function. In view of the importance of statistical distributions in applied sciences, a number of papers have appeared in the literature aiming to improve the characteristics of the existing distributions. Unfortunately, the number of parameters has increased, and the estimation of the parameters and the derivation of mathematical properties have become complicated. Furthermore, due to the restricted parametric space, some distributions may not be flexible enough to provide an adequate fit to many real data sets. To provide a better description of medical sciences data, in this study, an attempt has been made to introduce a new family of statistical distributions by reducing the number of parameters and reparameterizing the existing distributions to relax the boundary conditions of the additional parameter. A special submodel of the proposed family offers the best fit in modeling data with a nonmonotonic hazard rate function. The maximum likelihood method is adopted to estimate the model parameters. The first example considers the bladder cancer patient data set. The second data set represents the neck cancer data, and the third data set represents the guinea pigs infection data. The analysis of these three real-life examples showed that the proposed model performs much better than the other competitive distributions. From the above discussion, it is evident that researchers are always in search of new flexible distributions.
Therefore, to bring further flexibility to the proposed model, we suggest introducing its extended versions. The proposed method can be extended by introducing a shape parameter to the model.
(i) A random variable X is said to follow the extended version of the FRL-X family if its cdf is given by

$$G(x; \theta, \sigma, \xi) = 1 - \frac{\log\{1 + \sigma - \sigma F(x; \xi)^{\theta}\}}{\log(1 + \sigma)}, \tag{38}$$

where θ is the additional shape parameter. For θ = 1, the expression (38) reduces to (4). The new proposal may be named the flexible reduced logarithmic exponentiated-X (FRLE-X) family. For illustrative purposes, one may consider its special case, which may be named the flexible reduced logarithmic exponentiated-Weibull (FRLE-W) distribution, defined by the cdf

$$G(x; \theta, \sigma, \xi) = 1 - \frac{\log\{1 + \sigma - \sigma (1 - e^{-c x^{\alpha}})^{\theta}\}}{\log(1 + \sigma)}, \quad x \geq 0,\ \theta, \sigma, \xi > 0. \tag{39}$$
Due to the introduction of the additional shape parameter, the suggested extension may be more flexible in modeling data in medical sciences and other related fields.
(ii) Another extension of the FRL-X family is given by the cdf (40), where η is the additional shape parameter. For η = 1, the expression (40) reduces to (4). The model defined in (40) may be named the extended flexible reduced logarithmic-X (EFRL-X) family.

Concluding Remarks
In this study, we introduced a new family of continuous distributions called the flexible reduced logarithmic-X family. Some mathematical properties of the proposed family are obtained. The maximum likelihood method is used to estimate the unknown model parameters. Three applications to real-life medical data sets are given to illustrate empirically the flexibility of the proposed model. The proposed model is compared with some well-known lifetime distributions, namely the Weibull, Marshall-Olkin Weibull, and alpha power transformed Weibull distributions. The comparison is made on the basis of well-known goodness of fit measures, including the Cramér-von Mises test statistic, the Anderson-Darling test statistic, and the Kolmogorov-Smirnov test statistic with corresponding p values. Empirical findings indicate that the proposed model provides better fits than the other well-known competitive models.

Data Availability
This work is mainly a methodological development and has been applied to secondary data related to cancer patients; if required, the data will be provided.

Conflicts of Interest
The authors declare that they have no conflicts of interest.