Bayesian Inference of the Weibull Model Based on Interval-Censored Survival Data

Interval-censored data consist of adjacent inspection times that surround an unknown failure time. In this paper we review the classical maximum likelihood approach to estimating the Weibull parameters with interval-censored data. We also consider the Bayesian approach to estimating the Weibull parameters with interval-censored data under three loss functions. This study was motivated by the limited discussion in the literature, if any, of Bayesian estimation of the Weibull parameters with interval-censored data. A simulation study is carried out to compare the performance of the methods, and an application to real data is illustrated. The study indicates that the Bayesian estimator is preferred to the classical maximum likelihood estimator for both the scale and shape parameters.


Introduction
One of the features of survival data is censoring. The most common form is right censoring, and the literature on it is well established; see, among others, Al-Aboud [1], Al-Athari [2], Syuan-Rong and Shuo-Jye [3], Guure and Ibrahim [4], Pandey et al. [5], Soliman et al. [6], Abdel-Wahid and Winterbottom [7], and Guure et al. [8]. The focus of this study is on interval censoring, which is presumably more demanding than right censoring; as a result, the approaches developed for right censoring do not generally apply.
Interval censoring arises when a study subject of interest is not under regular observation. As a result, it is not always possible to observe the failure or survival time of the subject. With interval censoring, one only knows a range, that is, an interval, inside of which the survival event has occurred. Left- or right-censored failure times are special cases of interval-censored failure times. As stated by Turnbull [9], one could define an interval-censored observation as a union of several nonoverlapping windows or intervals.
According to Jianguo [10], interval-censored failure time data occur in many areas including demographical, epidemiological, financial, medical, sociological, and engineering studies. A typical example of interval-censored data occurs in medical or health studies that entail periodic follow-ups, and many clinical trials and longitudinal studies fall into this category. In such situations, interval-censored data may arise in several ways. For instance, an individual may miss one or more observation times that have been scheduled to clinically observe possible changes in disease status and then return with a changed status, as stated by Jianguo [10].
Consider individuals who visit clinical centres at times convenient to them rather than the predetermined observation time; in this type of situation, the data obtained are interval-censored. Should all study subjects or units follow the predetermined observation schedule time exactly, it is still not possible to observe the exact time of the occurrence of the change, even when we assume it is a continuous variable. In cases of this nature, one has grouped failure time data, that is, interval-censored data of which the observation for each subject is a member of a collection of non-overlapping intervals.
Researchers who have discussed interval-censored data from the classical point of view include Lawless [11], Flygare and Buckwalter [12], Lindsey [13], Scallan [14], and Odell et al. [15]. To the best of our knowledge, no work in the literature has so far treated interval-censored data using the Bayesian estimation approach for the Weibull distribution, which is the essence of this study.
One of the primary advantages of Weibull analysis is its ability to provide reasonably accurate analysis and forecasts with extremely small samples (Abernethy [16]). Small samples keep studies cost-effective.
When the Weibull distribution shape parameter is, say, $\beta < 1$, it indicates that the hazard rate decreases over time, implying infant mortality. When $\beta = 1$, it indicates a constant hazard rate over time, and $\beta > 1$ implies that the hazard rate increases with time as a result of ageing. In summary, the Weibull shape parameter gives the physics behind the death of a biological system, while the scale parameter determines the duration of the disease on a biological system. The scale parameter is also referred to as the characteristic life of the distribution.
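The three hazard regimes described above can be checked numerically. The following is a minimal sketch, assuming the usual two-parameter Weibull hazard $h(t) = (\text{shape}/\text{scale})\,(t/\text{scale})^{\text{shape}-1}$; the function name `weibull_hazard`, the evaluation times, and the scale value 2.0 are illustrative assumptions, not taken from the paper.

```python
import math


def weibull_hazard(t, scale, shape):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


# Hypothetical evaluation times and scale value, chosen only for illustration.
times = [0.5, 1.0, 2.0, 4.0]
for shape in (0.8, 1.0, 1.2):
    hazards = [weibull_hazard(t, scale=2.0, shape=shape) for t in times]
    print(f"shape={shape}: {[round(h, 4) for h in hazards]}")
```

With a shape below one the printed hazards fall as time increases, with a shape of one they stay constant, and with a shape above one they rise.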
Even though nonparametric estimation is more widely used in analysing survival data, it is still necessary to discuss parametric estimation. In parametric estimation, the distribution of the survival data is most often assumed to be known. Distributions that are often used in survival analysis are the Weibull, exponential, log-logistic, and log-normal. As discussed above, we have assumed that the survival data follow a Weibull distribution. The approach used in this paper can be extended to other lifetime distributions. The paper is structured as follows: Section 2 contains the derivation of the maximum likelihood estimators, and Section 3 covers the Bayesian inference. The simulation study is in Section 4, followed by results and discussion in Section 5. Section 6 is the real data analysis, and a conclusion is provided in Section 7.

Maximum Likelihood Estimation
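Let $t_1, \ldots, t_n$ be the lifetimes from a random sample of size $n$, where the probability density function (pdf) is represented by $f(t; \alpha, \beta)$, the cumulative distribution function (cdf) is $F(t; \alpha, \beta)$, and the survival function is $S(t; \alpha, \beta)$. For the two-parameter Weibull distribution we have, respectively,

$$f(t; \alpha, \beta) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta - 1}\exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right], \tag{1}$$

$$F(t; \alpha, \beta) = 1 - \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right], \tag{2}$$

$$S(t; \alpha, \beta) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right], \tag{3}$$

where $\beta$ represents the shape parameter and $\alpha$ the scale parameter. Let $[L_i, R_i]$ denote the interval-censored data and let $T_i$ represent the unknown failure time, that is, $L_i \le T_i \le R_i$, where $L_i$ is the last inspection time and $R_i$ the state end time.
If censoring occurs noninformatively and if the law governing $L_i$ and $R_i$ does not involve any of the parameters of interest, we can base our inferences on the likelihood function, as stated by Gómez et al. [17], which is given by

$$L(L_i, R_i \mid \alpha, \beta) = \prod_{i=1}^{n}\left[S(L_i; \alpha, \beta) - S(R_i; \alpha, \beta)\right]. \tag{4}$$

By using (3), we have

$$L(L_i, R_i \mid \alpha, \beta) = \prod_{i=1}^{n}\left\{\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right] - \exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right]\right\}. \tag{5}$$

Taking the natural log of (5), we have

$$\ell = \sum_{i=1}^{n}\ln\left\{\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right] - \exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right]\right\}. \tag{6}$$

To find the values of $\alpha$ and $\beta$ that maximize (6), we differentiate (6) with respect to $\alpha$ and $\beta$ and set the resulting equations to zero:

$$\frac{\partial \ell}{\partial \alpha} = \frac{\beta}{\alpha}\sum_{i=1}^{n}\frac{\left(\frac{L_i}{\alpha}\right)^{\beta}\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right] - \left(\frac{R_i}{\alpha}\right)^{\beta}\exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right]}{\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right] - \exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right]} = 0,$$

$$\frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n}\frac{\left(\frac{R_i}{\alpha}\right)^{\beta}\ln\left(\frac{R_i}{\alpha}\right)\exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right] - \left(\frac{L_i}{\alpha}\right)^{\beta}\ln\left(\frac{L_i}{\alpha}\right)\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right]}{\exp\left[-\left(\frac{L_i}{\alpha}\right)^{\beta}\right] - \exp\left[-\left(\frac{R_i}{\alpha}\right)^{\beta}\right]} = 0. \tag{7}$$

These equations have no closed-form solution. With a simple numerical approximation, the maximum likelihood estimates of $\alpha$ and $\beta$ can be obtained.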
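As an illustration of the numerical step, the sketch below maximises the interval-censored Weibull log-likelihood directly. It is a minimal example under stated assumptions: the optimiser is a plain compass (pattern) search on the log-parameters, chosen here for self-containment and not taken from the paper; the function names and the artificial data set (Weibull quantiles with scale 2 and shape 1.2, observed on a hypothetical inspection grid of width 0.5) are likewise illustrative. A right-censored observation could be encoded as `(left, math.inf)`.

```python
import math


def survival(t, scale, shape):
    """Weibull survival S(t) = exp(-(t / scale) ** shape); S(inf) evaluates to 0."""
    try:
        return math.exp(-((t / scale) ** shape))
    except OverflowError:
        return 0.0  # (t / scale) ** shape overflowed, so S(t) is effectively 0


def log_likelihood(scale, shape, intervals):
    """Sum of log[S(left) - S(right)] over the observed intervals."""
    total = 0.0
    for left, right in intervals:
        p = survival(left, scale, shape) - survival(right, scale, shape)
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total


def fit_weibull_interval(intervals, start=(1.0, 1.0), step=0.5, tol=1e-6):
    """Maximise the log-likelihood by compass search on (log scale, log shape)."""
    x = [math.log(start[0]), math.log(start[1])]
    best = log_likelihood(math.exp(x[0]), math.exp(x[1]), intervals)
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = log_likelihood(math.exp(trial[0]), math.exp(trial[1]), intervals)
                if value > best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # no direction improved: refine the search step
    return math.exp(x[0]), math.exp(x[1]), best


# Hypothetical data: Weibull(scale=2, shape=1.2) quantiles seen on a 0.5-wide inspection grid.
n, width = 100, 0.5
data = []
for i in range(n):
    t = 2.0 * (-math.log(1.0 - (i + 0.5) / n)) ** (1.0 / 1.2)
    left = math.floor(t / width) * width
    data.append((left, left + width))

scale_hat, shape_hat, ll_hat = fit_weibull_interval(data)
print(scale_hat, shape_hat, ll_hat)
```

Searching on the log scale keeps both parameters positive without explicit constraints; a gradient-based routine could be substituted where one is available.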
Computational and Mathematical Methods in Medicine 3

Bayesian Inference
Bayesian inference is an approach that employs Bayes' rule to update the probability estimate of a hypothesis as new evidence becomes available. Bayesian updating is one of the essential techniques of modern statistics, particularly mathematical statistics, and is especially important in analysing data that accumulate progressively. Bayesian inference can be applied in many other fields such as engineering, medicine, and accounting. The Bayes approach makes use of our prior beliefs about the parameters, expressed through the prior distribution, which is the distribution of the parameters before any data are observed. It also takes into consideration the observed data through the likelihood function. The Bayes estimator is considered under three loss functions that are important in Bayesian estimation: the asymmetric LINEX and general entropy loss functions and the symmetric squared error loss function.

Prior.
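A prior distribution for the unknown parameters needs to be assumed for the Bayesian inference. As discussed by Berger and Sun [18] and subsequently used by Banerjee and Kundu [19], we let $\alpha$ take on a Gamma prior with hyperparameters $a > 0$ and $b > 0$. We assume that the prior of $\beta$ is independent of the prior of $\alpha$ and has support on $(0, \infty)$. Let $v(\beta)$ represent the prior of $\beta$, and let the Gamma prior of $\alpha$ have density

$$\pi(\alpha \mid a, b) = \frac{b^{a}}{\Gamma(a)}\,\alpha^{a - 1}\exp(-b\alpha). \tag{8}$$

A joint density function of the data, $\alpha$, and $\beta$ for interval-censored data can be obtained as

$$f(\text{data}, \alpha, \beta) = L(L_i, R_i \mid \alpha, \beta)\,\pi(\alpha \mid a, b)\,v(\beta). \tag{9}$$

Bayesian inference is based on the posterior distribution, which is simply the ratio of the joint density function to the marginal distribution function.
The posterior density function of $\alpha$ and $\beta$ given the data is

$$\pi^{*}(\alpha, \beta \mid \text{data}) = \frac{L(L_i, R_i \mid \alpha, \beta)\,\pi(\alpha \mid a, b)\,v(\beta)}{\int_{0}^{\infty}\int_{0}^{\infty} L(L_i, R_i \mid \alpha, \beta)\,\pi(\alpha \mid a, b)\,v(\beta)\,d\alpha\,d\beta}. \tag{10}$$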

The Loss Functions.
Having taken our prior information into account, the Bayes estimator is considered under three loss functions. The first is the squared error loss function (SELF), which is symmetric. Many estimation problems involving one or more parameters are treated using a symmetric loss function. This loss function gives equal weight to estimation errors of the same magnitude, regardless of whether the parameter has been overestimated or underestimated. In some problems, however, estimation errors in one direction are preferable to errors in the other. In a univariate problem, for example, an overestimation may be more serious than an underestimation, or vice versa, as also stressed by Hamada et al. [20]. Varian [21] was motivated to use asymmetric loss functions in estimation problems arising in real estate assessment, where overestimation of a property's value might cause it to remain on the market unsold for an extended period, ultimately costing the seller superfluous expenses. For these reasons, we consider two asymmetric loss functions: the linear exponential (LINEX) loss function and the general entropy loss function (GELF).
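(1) The squared error loss function is given as

$$L(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^{2}, \tag{11}$$

where $L(\hat{\theta}, \theta)$ is the loss incurred by adopting action $\hat{\theta}$ when the true state of nature is $\theta$.
(2) The LINEX loss function is given as

$$L(\Delta) \propto \exp(c\Delta) - c\Delta - 1, \tag{12}$$

where $\Delta = \hat{\theta} - \theta$, with $\hat{\theta}$ being the estimate of $\theta$ and $c \neq 0$ the loss parameter. According to Zellner [22], the posterior expectation of the LINEX loss function is

$$E_{\theta}\left[L(\hat{\theta} - \theta)\right] \propto \exp(c\hat{\theta})\,E_{\theta}\left[\exp(-c\theta)\right] - c\left(\hat{\theta} - E_{\theta}(\theta)\right) - 1. \tag{13}$$

The value of $\hat{\theta}$ that minimizes the above equation is

$$\hat{\theta}_{BL} = -\frac{1}{c}\ln E_{\theta}\left[\exp(-c\theta)\right], \tag{14}$$

provided $E_{\theta}(\cdot)$ exists and is finite.
(3) Another useful asymmetric loss function is the general entropy loss function (GELF), which is a generalization of the entropy loss and is given as

$$L(\hat{\theta}, \theta) \propto \left(\frac{\hat{\theta}}{\theta}\right)^{k} - k\ln\left(\frac{\hat{\theta}}{\theta}\right) - 1, \tag{15}$$

with loss parameter $k \neq 0$. The Bayes estimator $\hat{\theta}_{BG}$ of $\theta$ under the general entropy loss is

$$\hat{\theta}_{BG} = \left[E_{\theta}\left(\theta^{-k}\right)\right]^{-1/k}, \tag{16}$$

provided $E_{\theta}(\cdot)$ exists and is finite.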
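To make the three estimators concrete, the sketch below evaluates them on a small set of posterior draws. It assumes the standard closed forms: the posterior mean for squared error loss, $-\frac{1}{c}\ln E[\exp(-c\theta)]$ for LINEX, and $[E(\theta^{-k})]^{-1/k}$ for general entropy. The symbols `c` and `k` for the loss parameters, the function names, and the draws themselves are hypothetical and serve only as an illustration.

```python
import math


def bayes_squared_error(draws):
    """Bayes estimate under squared error loss: the posterior mean."""
    return sum(draws) / len(draws)


def bayes_linex(draws, c):
    """Bayes estimate under LINEX loss: -(1/c) * ln E[exp(-c * theta)], c != 0."""
    mean_exp = sum(math.exp(-c * d) for d in draws) / len(draws)
    return -math.log(mean_exp) / c


def bayes_general_entropy(draws, k):
    """Bayes estimate under general entropy loss: (E[theta ** -k]) ** (-1/k), k != 0."""
    mean_pow = sum(d ** (-k) for d in draws) / len(draws)
    return mean_pow ** (-1.0 / k)


# Hypothetical posterior draws of a positive parameter (illustration only).
draws = [1.6, 1.8, 2.0, 2.2, 2.4]
print(bayes_squared_error(draws))
print(bayes_linex(draws, c=0.7), bayes_linex(draws, c=-0.7))
print(bayes_general_entropy(draws, k=1.0), bayes_general_entropy(draws, k=-1.0))
```

By Jensen's inequality, a positive `c` pulls the LINEX estimate below the posterior mean and a negative `c` pushes it above; with `k = -1` the general entropy estimate coincides with the posterior mean.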
It may be noted that (10) contains double integrals which cannot be solved analytically; this is due to the complex form of the likelihood function given in (5). Therefore, we propose to use the Lindley approximation method to evaluate the integrals involved.
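For a small data set, the double integrals can also be evaluated by brute force, which gives a useful cross-check on any approximation. The sketch below is not the Lindley method: it swaps in a plain grid quadrature over the scale and shape parameters, assuming independent Gamma priors on both. The hyperparameter values, the grid limits, the toy intervals, and all function names are hypothetical.

```python
import math


def log_likelihood(scale, shape, intervals):
    """Interval-censored Weibull log-likelihood: sum of log[S(left) - S(right)]."""
    total = 0.0
    for left, right in intervals:
        p = math.exp(-((left / scale) ** shape)) - math.exp(-((right / scale) ** shape))
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total


def log_gamma_kernel(x, a, b):
    """Log of the Gamma(a, b) density up to a constant: (a - 1) ln x - b x."""
    return (a - 1.0) * math.log(x) - b * x


def grid_posterior_means(intervals, scale_grid, shape_grid, prior_scale, prior_shape):
    """Posterior means of (scale, shape) by normalising the posterior on a grid."""
    cells = []
    for s in scale_grid:
        for h in shape_grid:
            lp = (log_likelihood(s, h, intervals)
                  + log_gamma_kernel(s, *prior_scale)
                  + log_gamma_kernel(h, *prior_shape))
            cells.append((s, h, lp))
    peak = max(lp for _, _, lp in cells)  # subtract the peak to avoid underflow
    weights = [(s, h, math.exp(lp - peak)) for s, h, lp in cells]
    total = sum(w for _, _, w in weights)
    mean_scale = sum(s * w for s, _, w in weights) / total
    mean_shape = sum(h * w for _, h, w in weights) / total
    return mean_scale, mean_shape


# Hypothetical toy intervals; the last one is right censored.
toy = [(0.0, 1.0), (0.0, 1.0), (1.0, 2.0), (1.0, 2.0), (1.0, 2.0),
       (2.0, 3.0), (2.0, 3.0), (2.0, 3.0), (3.0, 4.0), (4.0, math.inf)]
scale_grid = [0.2 + 0.1 * i for i in range(79)]   # 0.2 to 8.0
shape_grid = [0.2 + 0.1 * i for i in range(49)]   # 0.2 to 5.0
mean_scale, mean_shape = grid_posterior_means(
    toy, scale_grid, shape_grid, prior_scale=(1.0, 0.1), prior_shape=(1.0, 0.1))
print(mean_scale, mean_shape)
```

Grid quadrature scales poorly beyond two parameters and truncates the posterior at the grid limits, which is one reason an analytic approximation such as Lindley's is attractive inside a large simulation.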

Lindley Approximation.
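A prior for $\beta$ needs to be specified here so as to calculate the approximate Bayes estimates of $\alpha$ and $\beta$. Having specified a prior for $\alpha$ as Gamma($a$, $b$), it is similarly assumed that $v(\beta)$ also takes on a Gamma prior with its own pair of hyperparameters.
According to Lye et al. [23], the posterior Bayes estimator of an arbitrary function $u(\theta)$ given by Lindley is

$$E\left[u(\theta) \mid \text{data}\right] = \frac{\int w(\theta)\exp\left[\ell(\theta)\right]d\theta}{\int v(\theta)\exp\left[\ell(\theta)\right]d\theta}, \tag{17}$$

where $\ell(\theta)$ is the log likelihood and $w(\theta)$, $v(\theta)$ are arbitrary functions of $\theta$. We assume that $v(\theta)$ is the prior distribution for $\theta$ and $w(\theta) = u(\theta)v(\theta)$, with $u(\theta)$ being some function of interest. Equation (17) can be approximated asymptotically by the Lindley expansion (19), where $\ell$ is the log-likelihood function in (6).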

Linear Exponential Loss
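The Bayes estimate under the LINEX loss is obtained by using the same Lindley procedure in (19) with, for loss parameter $c$,

$$u = \exp(-c\alpha), \quad u_{1} = -c\exp(-c\alpha), \quad u_{11} = c^{2}\exp(-c\alpha),$$

$$u = \exp(-c\beta), \quad u_{2} = -c\exp(-c\beta), \quad u_{22} = c^{2}\exp(-c\beta).$$

A similar Lindley approach is used for the general entropy loss function as in the squared error loss, but here, for the Lindley approximation procedure stated in (19), $u_{1}$, $u_{11}$ and $u_{2}$, $u_{22}$, the first and second derivatives for $\alpha$ and $\beta$, respectively, are given, for loss parameter $k$, as

$$u = \alpha^{-k}, \quad u_{1} = -k\alpha^{-k-1}, \quad u_{11} = k(k+1)\alpha^{-k-2},$$

$$u = \beta^{-k}, \quad u_{2} = -k\beta^{-k-1}, \quad u_{22} = k(k+1)\beta^{-k-2}.$$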

Simulation Study
A simulation study is carried out to determine the best estimator for the two parameters of the Weibull distribution with interval censoring. Each data set contains 25, 50, or 100 interval-censored observations. Some observations were left censored, but we have not distinguished left-censored data from interval-censored data. Generating the interval-censored data involved the following steps.
We assume that the true survival time follows a Weibull distribution.
(a) Generate $t$ from the Weibull distribution, of size $n$ = 25, 50, and 100 with $\alpha$ = 2 and 4, and $\beta$ = 0.8 and 1.2, to represent the true survival time with shape parameters corresponding to decreasing and increasing hazard rates.
(b) Generate a vector of clinic visit times. Assume there are 5 clinic visits for the Weibull distribution; the first visit time is generated on a range starting at 0, and the second visit time is generated on a range starting at the first visit time.
The coding and the analysis were performed using the R programming language, which is freely available. The parameters were estimated with maximum likelihood and Bayesian methods. To compute the Bayes estimates, $\alpha$ and $\beta$ are each assumed to take a Gamma prior. We set the hyperparameters to zero in order to obtain noninformative priors. Note that at this point the priors become improper, but the results do not differ significantly from those obtained with proper priors, as also stated by Banerjee and Kundu [19]. Specific values were fixed for the loss parameters of the LINEX and general entropy loss functions; a detailed discussion on the choice of the loss parameter can be found in Calabria and Pulcini [24]. We repeated the simulation process $R$ = 1000 times to obtain the estimates of the parameters. The mean squared errors and absolute biases are determined and presented below for the purpose of comparison.
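The data-generation steps (a) and (b) can be sketched as follows. Because the exact visit-time scheme is not fully recoverable from the text, this example makes its own assumption: five visits whose successive gaps are uniform on (0, `max_gap`), with failures after the last visit treated as right censored. The function name, `max_gap`, and the seed are hypothetical; `random.weibullvariate(alpha, beta)` from the Python standard library takes the scale first and the shape second.

```python
import math
import random


def simulate_interval_censored(n, scale, shape, n_visits=5, max_gap=1.5, rng=None):
    """Return n tuples (true_time, left, right) with left <= true_time <= right."""
    rng = rng or random.Random()
    data = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # true (unobserved) survival time
        # Assumed visit scheme: cumulative sums of uniform gaps (not from the paper).
        visits, current = [], 0.0
        for _ in range(n_visits):
            current += rng.uniform(0.0, max_gap)
            visits.append(current)
        left, right = 0.0, math.inf  # right stays infinite if t exceeds every visit
        for v in visits:
            if v < t:
                left = v       # event not yet seen at this visit
            else:
                right = v      # first visit at which the event has occurred
                break
        data.append((t, left, right))
    return data


sample = simulate_interval_censored(25, scale=2.0, shape=0.8, rng=random.Random(1))
for t, left, right in sample[:3]:
    print(round(t, 3), round(left, 3), right)
```

Only the `(left, right)` pairs would be passed to an estimator; the true time is returned here so that the bracketing can be checked.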
To obtain the MSEs and absolute biases, the squared error and the absolute error are calculated for each of the one thousand estimated values of the scale parameter and the shape parameter, that is, for $r$ = 1 to 1000, and then averaged. Our aim is to find out how close the estimated values are to the true values at each level of the simulation, taking into account that, for a parameter $\theta$ with estimate $\hat{\theta}_{r}$ in replication $r$,

$$\mathrm{MSE}(\hat{\theta}) = \frac{1}{R}\sum_{r=1}^{R}\left(\hat{\theta}_{r} - \theta\right)^{2}, \qquad \mathrm{Abs.\ bias}(\hat{\theta}) = \frac{1}{R}\sum_{r=1}^{R}\left|\hat{\theta}_{r} - \theta\right|.$$

For the scale parameter, Bayes under the LINEX loss function performs better in terms of mean squared error than under the general entropy loss function, but both are better than Bayes using the squared error loss and the maximum likelihood estimator. We observed that the MLE, compared with the Bayesian estimators, has the smallest mean squared error only at $n$ = 25 and 50 with $\beta$ = 0.8, where this value of $\beta$ represents infant mortality. In Table 2, both LINEX and GELF have equal minimum absolute biases for the scale parameter. Considering the standard errors of the estimators given in Table 5, both the LINEX and general entropy loss functions perform quite well, since they have almost equal smallest standard errors for the scale parameter. The LINEX loss function tends to overestimate the scale parameter.
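The two summary measures described above reduce to simple averages over the replications. The following sketch assumes the definitions stated in the text (the average squared error and the average absolute error over the replicated estimates); the function names and the three example estimates are hypothetical.

```python
def mean_squared_error(estimates, true_value):
    """Average of (estimate - true_value) ** 2 over all replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def absolute_bias(estimates, true_value):
    """Average of |estimate - true_value| over all replications."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)


# Hypothetical estimates of a scale parameter whose true value is 2.0.
estimates = [1.8, 2.0, 2.2]
print(mean_squared_error(estimates, 2.0))
print(absolute_bias(estimates, 2.0))
```

In a full study these functions would be applied to the 1000 estimates of each parameter under each estimator and sample size.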

Results and Discussion
Considering Tables 3 and 4, we noticed that the Bayesian estimator with the assumed informative prior performed considerably better under the general entropy and the linear exponential loss functions than the maximum likelihood estimator and Bayes under the squared error loss function. The LINEX loss function performed very well with small sample sizes, while the general entropy loss function gives the smallest mean squared error with relatively large sample sizes. The minimum biases for the shape parameter occur predominantly with the general entropy loss function, followed by the linear exponential loss function.
What we observed here is that, in estimating the shape parameter of the Weibull model with interval censoring, the Bayesian estimator with the general entropy loss function may be preferred to the others when we consider the mean squared errors and minimal biases of the estimators. This is followed by the Bayesian estimator with the LINEX loss function. The general entropy and the LINEX loss functions underestimate the shape parameter; this is because, for the general entropy loss, all the smallest mean squared errors occur at a value of the loss parameter that is less than zero, while for the LINEX loss they occur at a single value of its own loss parameter. Considering the standard errors of the estimators given in Table 6, the LINEX loss function predominantly performs better than the others, having obtained the smallest standard errors for the shape parameter.

Real Data Analysis
We analyse a data set in this section for illustration and comparison purposes. The data come from a retrospective study obtained from Lawless [11]. It was carried out to compare the cosmetic effects of radiotherapy versus radiotherapy and adjuvant chemotherapy on women with early breast cancer.
To compare the two treatments, a retrospective study of 46 radiation-only and 48 radiation plus chemotherapy patients was conducted. Patients were observed initially every 4-6 months, but, as their recovery progressed, the interval between visits lengthened. The event of interest was the time to the first appearance of moderate or severe breast retraction. As the patients were observed only at some random times, the exact time of breast retraction is known only to fall within the interval between visits. Patients with no moderate or severe breast retraction until the last visit were classified as right censored; the end point of their intervals was assumed to be ∞, and the left end point was taken as the time from the beginning of the study to the last visit. Here, for the purpose of illustration, our primary concern is with women in a single group, that is, radiotherapy and adjuvant chemotherapy. The data are presented in Table 7.
As presented in Table 8, the estimator which gives the smallest standard error is the Bayesian estimator under the linear exponential loss, followed by the general entropy loss function. This is so because both loss functions take into consideration overestimation and underestimation, even though they have different approaches. Again, we observe that both the linear exponential and general entropy loss functions overestimate the scale and shape parameters of the Weibull distribution. From a practical perspective, the use of a symmetric loss function is based on the assumption that the loss is the same in either direction; that is, only the magnitude of the loss matters. Therefore, if one is not interested in knowing whether an estimate of the duration or the mortality state of the patients being investigated lies above or below the actual duration or life expectancy of the patient, then this loss function could be used. However, this assumption may not be valid in many practical situations, and the use of the symmetric loss function may be inappropriate. For instance, underestimating the blood pressure of a patient can have more serious consequences for the patient's treatment than overestimating it, or vice versa. The situation is gravest when one knows that the patient has a problem with the blood pressure level but does not know whether it is high or low, and so cannot recommend a proper treatment.
Again from Table 8, it is evident that, at the 95% level, the estimator with the narrower confidence or credible intervals is the Bayesian estimator, first with the linear exponential loss function and then with the general entropy loss function. Apart from the fact that the Bayesian estimator gives narrower credible intervals than the classical maximum likelihood estimator, there is a further advantage in using Bayesian credible intervals over classical confidence intervals. Briefly, a frequentist 95% confidence interval of, say, (2, …) means that, with a large number of repeated new samples, 95% of the calculated confidence intervals would include the true value of the parameter. This is in contrast to the Bayesian credible interval, which states that, given the single observed sample and without appeal to repeated sampling, the parameter lies in the interval with probability 0.95. In the Bayesian approach the parameter is regarded as a random quantity to be inferred.
The values in bold indicate the smallest standard errors and the narrower confidence or credible intervals of the preferred estimator.

Conclusion
In this study, we have considered point estimation of the Weibull distribution parameters based on interval censoring through simulation. MLE and Bayes estimators are used to estimate the scale and shape parameters of this lifetime distribution. The Bayes estimators are obtained using the linear exponential, general entropy, and squared error loss functions. We also employed the Bayesian noninformative prior approach in estimating the scale and shape parameters after having considered an informative (Gamma) prior. Because the integrals in the posterior distribution cannot be obtained explicitly in closed form, we employed the Lindley approximation procedure to calculate the Bayes estimators.
With respect to the mean squared errors, the Bayesian estimator under the LINEX loss function performed better in estimating the scale parameter than the other loss functions and the classical maximum likelihood estimator. For the shape parameter, the Bayesian estimator with the general entropy loss function performed better than the other estimators. This is due to the fact that the loss parameter under general entropy has minimal influence on the posterior distribution, thereby exhibiting little bias compared with the LINEX loss function. With respect to the real data, Bayes under the LINEX loss function gives the smallest standard errors and narrower credible intervals compared with the others but overestimates both parameters. This is followed closely by the general entropy loss function. The standard errors for the simulation study indicate that the Bayes estimator via LINEX is better than the others for the scale parameter, but for the shape parameter both LINEX and general entropy performed quite well.