Interval Estimation for Extreme Value Parameter with Censored Data

The Weibull distribution is widely used in the parametric analysis of lifetime data. In place of the Weibull distribution, it is often more convenient to work with the equivalent extreme value distribution, which is the distribution of the logarithm of a Weibull random variable. The main advantage in working with the extreme value distribution is that, unlike the Weibull distribution, it has location and scale parameters. This paper is devoted to a discussion of statistical inference for the extreme value distribution with censored data. Numerical simulations are performed to examine the finite-sample behavior of the estimators of the parameters. These procedures are then applied to real-world data.


Introduction
In medical research, data documenting the time until the occurrence of a particular event, such as the death of a patient, are frequently encountered. Such data are called time-to-event data, also referred to as lifetime, survival time, or failure time data, and in general have a right-skewed distribution. For this reason, the Weibull distribution is widely used. In place of the Weibull distribution, it is often more convenient to work with the equivalent extreme value distribution, in which the data are the logarithms of those taken from the Weibull distribution (Lawless [1]). Specifically, if $Y$ has a Weibull distribution with
$$f(y) = \lambda\beta(\lambda y)^{\beta-1}\exp\{-(\lambda y)^{\beta}\}, \quad y > 0, \qquad (1.1)$$
where $\lambda > 0$ and $\beta > 0$ are parameters, then $T = \log Y$ has an extreme value distribution with $b = 1/\beta$ and $u = -\log\lambda$. The main convenience in working with the extreme value distribution is that, unlike the Weibull distribution, it has location and scale parameters. An excellent review of extreme value distributions can be found in Coles [2] and Kotz and Nadarajah [3].
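The relation between the Weibull parameters $(\lambda, \beta)$ and the extreme value parameters $(u, b)$ can be checked numerically. The following is a minimal sketch, not part of the original study: the parameter values, sample size, and seed are arbitrary illustrative choices. It uses two known properties of the extreme value distribution, namely that its survival function at $t = u$ equals $e^{-1}$ and that its mean is $u - \gamma b$, where $\gamma$ is Euler's constant.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def log_weibull_sample(lam, beta, n, seed=0):
    """Draw n Weibull variates with density (1.1) and return their logarithms."""
    rng = random.Random(seed)
    # random.weibullvariate(scale, shape): scale 1/lam matches the density (1.1).
    return [math.log(rng.weibullvariate(1.0 / lam, beta)) for _ in range(n)]


# Illustrative (hypothetical) parameter values.
lam, beta = 2.0, 1.5
u, b = -math.log(lam), 1.0 / beta

t = log_weibull_sample(lam, beta, n=200_000)
frac_above_u = sum(x > u for x in t) / len(t)   # should be close to exp(-1)
sample_mean = sum(t) / len(t)                   # should be close to u - gamma * b

print(frac_above_u, math.exp(-1.0))
print(sample_mean, u - EULER_GAMMA * b)
```

Both printed pairs should agree to about two decimal places if the stated relation between the two parameterizations holds.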
A common feature of lifetime data is that the data points are possibly censored. For example, the event of interest may not have happened to all patients. A patient undergoing cancer therapy might die in a road accident. In this case, the observation period is cut off before the event occurs. The data are then said to be censored, and it would be incorrect to treat the time-to-death as the lifetime. When data are censored (as in the case of the cancer patient who dies in a road accident), conventional statistical methods cannot be directly applied to analyze the data. Instead, special statistical methods are necessary to handle such data. Censored data have been studied by many authors. Kaplan and Meier [4] proposed the so-called product-limit estimate of the distribution function, Cox [5] introduced the still commonly used proportional hazards model, Buckley and James [6] investigated the censored regression model, and Prentice [7] studied two-sample censored data. More recent works include, among others, Tsiatis [8], Wei et al. [9], Wei and Gail [10], and Lee and Yang [11].
Our objective in this paper is to systematically study the inference procedures for the extreme value distribution with censored data. Based on the criteria of the empirical coverage probability and the confidence interval length in the numerical studies, we observe that the log transformation of the estimate improves on the results of the usual normal approximation, and that the likelihood ratio method is effective for small sample sizes when censoring is heavy. Note that, ideally, we expect the empirical coverage probabilities to be close to the theoretical coverage probability and the empirical mean lengths to be relatively short. Similar work on maximum likelihood estimation in the Weibull distribution with censored data can be found in Cohen [12].
This paper is organized as follows. Section 2 describes the maximum likelihood estimates and their confidence intervals when the lifetime data include some censored observations. Numerical results and a graphical method for checking the model are presented in Section 3. This section also illustrates the procedures using a vaginal cancer data set for rats. Section 4 states the conclusion.

Maximum Likelihood Estimation
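The probability density function for the extreme value distribution considered here is
$$f(t) = \frac{1}{b}\exp\left\{\frac{t-u}{b} - \exp\left(\frac{t-u}{b}\right)\right\}, \quad -\infty < t < \infty,$$
where $-\infty < u < \infty$ is the location parameter and $b > 0$ is the scale parameter.
For an observation $(t_i, \delta_i)$, where $\delta_i = 1$ if $t_i$ is an observed lifetime and $\delta_i = 0$ if it is a censoring time, let $S(t)$ denote the survival function of the lifetime, and let $g(t)$ and $G(t)$ denote the density and the survival function of the censoring time. Then
$$P(t_i = t, \delta_i = 0) = g(t)S(t), \qquad P(t_i = t, \delta_i = 1) = f(t)G(t). \qquad (2.3)$$
The above probabilities can be combined into the single expression
$$P(t_i = t, \delta_i) = \{f(t)G(t)\}^{\delta_i}\{g(t)S(t)\}^{1-\delta_i}.$$
This yields the sampling distribution of $(t_i, \delta_i)$, $i = 1, \ldots, n$,
$$\prod_{i=1}^{n}\{f(t_i)G(t_i)\}^{\delta_i}\{g(t_i)S(t_i)\}^{1-\delta_i}.$$
Knowing that $g(t)$ and $G(t)$ do not contain any parameter of interest, we have the likelihood function defined as
$$L(u, b) = \prod_{i=1}^{n} f(t_i)^{\delta_i} S(t_i)^{1-\delta_i}.$$
It can be easily shown that for the extreme value distribution, the survival function is
$$S(t) = \exp\left\{-\exp\left(\frac{t-u}{b}\right)\right\}.$$
Hence, the above likelihood function can be written as
$$L(u, b) = \prod_{i=1}^{n}\left(\frac{1}{b}\right)^{\delta_i}\exp\left\{\delta_i z_i - e^{z_i}\right\}, \qquad (2.8)$$
where $z_i = (t_i - u)/b$. Taking logarithms and letting $r = \sum_{i=1}^{n}\delta_i$, the number of uncensored observations, gives the log likelihood function
$$\log L(u, b) = -r\log b + \sum_{i=1}^{n}\delta_i z_i - \sum_{i=1}^{n} e^{z_i}. \qquad (2.10)$$
Differentiating (2.10) with respect to $u$ and $b$ in turn and equating to zero, we obtain the estimating equations
$$\frac{\partial \log L}{\partial u} = -\frac{r}{b} + \frac{1}{b}\sum_{i=1}^{n} e^{z_i} = 0, \qquad \frac{\partial \log L}{\partial b} = -\frac{r}{b} - \frac{1}{b}\sum_{i=1}^{n}\delta_i z_i + \frac{1}{b}\sum_{i=1}^{n} z_i e^{z_i} = 0, \qquad (2.11)$$
where $z_i = (t_i - u)/b$. The above equations can be solved by numerical techniques such as Newton-Raphson iteration or random search to locate the estimates $\hat{u}$ and $\hat{b}$ of $u$ and $b$, respectively. In this paper, random search was used for simplicity, although it is computationally intensive.
From (2.11), the maximum likelihood equations can be written as
$$\sum_{i=1}^{n} e^{\hat{z}_i} = r, \qquad \sum_{i=1}^{n}\hat{z}_i e^{\hat{z}_i} - \sum_{i=1}^{n}\delta_i\hat{z}_i = r, \qquad (2.12)$$
with $\hat{z}_i = (t_i - \hat{u})/\hat{b}$, which are equivalent to
$$e^{\hat{u}} = \left[\frac{1}{r}\sum_{i=1}^{n} e^{t_i/\hat{b}}\right]^{\hat{b}} \qquad (2.13)$$
and
$$\frac{\sum_{i=1}^{n} t_i e^{t_i/\hat{b}}}{\sum_{i=1}^{n} e^{t_i/\hat{b}}} - \hat{b} - \frac{1}{r}\sum_{i=1}^{n}\delta_i t_i = 0, \qquad (2.14)$$
respectively. Equation (2.14) can be solved iteratively for $\hat{b}$, and then $\hat{u}$ can be calculated from (2.13).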
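The two-step solution (solve the scale equation for $\hat{b}$, then obtain $\hat{u}$ in closed form) can be sketched as follows. This is an illustrative stand-in, not the random search used in the paper: it applies bisection to the scale equation, whose left-hand side (the exponentially weighted mean of the $t_i$, minus $b$, minus the mean of the uncensored $t_i$) is decreasing in $b$. The function name `fit_extreme_value`, the simulated data, and the censoring rule $C_i = c + \mathrm{Normal}(0, 1)$ are assumptions made for the example.

```python
import math
import random


def fit_extreme_value(t, delta, tol=1e-10):
    """Return (u_hat, b_hat) for right-censored extreme value data.

    t     : observed times t_i = min(T_i, C_i)
    delta : 1 if t_i is an observed lifetime, 0 if it is censored
    """
    r = sum(delta)
    if r == 0:
        raise ValueError("at least one uncensored observation is required")
    t_max = max(t)
    mean_uncensored = sum(ti for ti, di in zip(t, delta) if di) / r

    def h(b):
        # Left-hand side of the scale equation; weights are shifted by
        # t_max so that exp() cannot overflow.
        w = [math.exp((ti - t_max) / b) for ti in t]
        weighted_mean = sum(ti * wi for ti, wi in zip(t, w)) / sum(w)
        return weighted_mean - b - mean_uncensored

    # h is decreasing in b, so bracket the root and bisect.
    lo, hi = 1e-8, 1.0
    while h(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    b_hat = 0.5 * (lo + hi)
    # Closed form for u_hat given b_hat, again shifted by t_max.
    s = sum(math.exp((ti - t_max) / b_hat) for ti in t)
    u_hat = t_max + b_hat * math.log(s / r)
    return u_hat, b_hat


# Hypothetical example: u = 0, b = 1, censoring times 0.75 + Normal(0, 1).
rng = random.Random(1)
lifetimes = [math.log(rng.expovariate(1.0)) for _ in range(2000)]
censor = [0.75 + rng.gauss(0.0, 1.0) for _ in range(2000)]
t = [min(x, c) for x, c in zip(lifetimes, censor)]
delta = [1 if x <= c else 0 for x, c in zip(lifetimes, censor)]
u_hat, b_hat = fit_extreme_value(t, delta)
print(u_hat, b_hat)
```

Bisection is chosen here only because the monotonicity of the scale equation makes it simple and robust; Newton-Raphson on the same equation would typically converge in fewer iterations.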
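To make inferences about $(u, b)$, we can use the fact that, by the usual large-sample theory, the joint distribution of $(\hat{u}, \hat{b})$ is approximately bivariate normal with mean $(u, b)$ and covariance matrix $I^{-1}(u, b)$, where $I(u, b)$ is the Fisher information matrix, defined as
$$I(u, b) = \begin{pmatrix} E\left(-\dfrac{\partial^2 \log L}{\partial u^2}\right) & E\left(-\dfrac{\partial^2 \log L}{\partial u\,\partial b}\right) \\ E\left(-\dfrac{\partial^2 \log L}{\partial u\,\partial b}\right) & E\left(-\dfrac{\partial^2 \log L}{\partial b^2}\right) \end{pmatrix}. \qquad (2.16)$$
It is often difficult to evaluate the expectations in $I(u, b)$, so a natural procedure is to use the approximation $I(u, b) \approx I_0$, where $I_0$ is the observed information matrix
$$I_0 = \begin{pmatrix} -\dfrac{\partial^2 \log L}{\partial u^2} & -\dfrac{\partial^2 \log L}{\partial u\,\partial b} \\ -\dfrac{\partial^2 \log L}{\partial u\,\partial b} & -\dfrac{\partial^2 \log L}{\partial b^2} \end{pmatrix}_{(u, b) = (\hat{u}, \hat{b})}. \qquad (2.18)$$
With $z_i = (t_i - u)/b$ and $r = \sum_{i=1}^{n}\delta_i$, we have
$$\frac{\partial^2 \log L}{\partial u^2} = -\frac{1}{b^2}\sum_{i=1}^{n} e^{z_i}, \qquad \frac{\partial^2 \log L}{\partial u\,\partial b} = \frac{1}{b^2}\left(r - \sum_{i=1}^{n} e^{z_i} - \sum_{i=1}^{n} z_i e^{z_i}\right),$$
$$\frac{\partial^2 \log L}{\partial b^2} = \frac{1}{b^2}\left(r + 2\sum_{i=1}^{n}\delta_i z_i - 2\sum_{i=1}^{n} z_i e^{z_i} - \sum_{i=1}^{n} z_i^2 e^{z_i}\right). \qquad (2.19)$$
These yield, on evaluating at $(\hat{u}, \hat{b})$ and using the maximum likelihood equations $\sum e^{\hat{z}_i} = r$ and $\sum \hat{z}_i e^{\hat{z}_i} - \sum \delta_i \hat{z}_i = r$,
$$I_0 = \frac{1}{\hat{b}^2}\begin{pmatrix} r & \sum_{i=1}^{n}\hat{z}_i e^{\hat{z}_i} \\ \sum_{i=1}^{n}\hat{z}_i e^{\hat{z}_i} & r + \sum_{i=1}^{n}\hat{z}_i^2 e^{\hat{z}_i} \end{pmatrix},$$
where $\hat{z}_i = (t_i - \hat{u})/\hat{b}$. The diagonal entries $d_{11}$ and $d_{22}$ of $I_0^{-1}$ estimate the variances of $\hat{u}$ and $\hat{b}$, respectively.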

Inference Procedure
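From the usual large-sample theory, we have that $(\hat{u}, \hat{b})$ is approximately bivariate normal with mean $(u, b)$ and covariance matrix $I_0^{-1}$. With $\xi = \log b$ and $\hat{\xi} = \log\hat{b}$, the transformation from $(\hat{u}, \hat{b})$ to $(\hat{u}, \hat{\xi})$ has the Jacobian matrix
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1/\hat{b} \end{pmatrix}.$$
Therefore, we have that $(\hat{u}, \hat{\xi})$ is approximately bivariate normal with mean $(u, \xi)$ and covariance matrix $A I_0^{-1} A'$, where $A'$ is the transpose of $A$. Let the $(2,2)$th entry of $A I_0^{-1} A'$ be $m_{22}$. Then, an asymptotic $(1-\alpha)100\%$ confidence interval for $\xi$ is
$$\left(\hat{\xi} - z_{\alpha/2}\sqrt{m_{22}},\ \hat{\xi} + z_{\alpha/2}\sqrt{m_{22}}\right),$$
where $z_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. Hence, since $b = e^{\xi}$, an asymptotic $(1-\alpha)100\%$ confidence interval for $b$ is
$$\left(\hat{b}\exp\left(-z_{\alpha/2}\sqrt{m_{22}}\right),\ \hat{b}\exp\left(z_{\alpha/2}\sqrt{m_{22}}\right)\right). \qquad (2.29)$$
Note that the interval always lies in the positive half of the axis.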
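The normal-approximation interval and its log-transformed version can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the observed information is taken at the maximum likelihood estimate in the form $\hat{b}^{-2}\,[\,r,\ s_1;\ s_1,\ r + s_2\,]$ with $s_1 = \sum \hat{z}_i e^{\hat{z}_i}$ and $s_2 = \sum \hat{z}_i^2 e^{\hat{z}_i}$, the solver is a bisection stand-in for the random search, and the function names and simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def fit_extreme_value(t, delta, tol=1e-10):
    """MLE (u_hat, b_hat) by bisection on the scale equation (stand-in solver)."""
    r = sum(delta)
    t_max = max(t)
    mean_unc = sum(ti for ti, di in zip(t, delta) if di) / r

    def h(b):
        w = [math.exp((ti - t_max) / b) for ti in t]
        return sum(ti * wi for ti, wi in zip(t, w)) / sum(w) - b - mean_unc

    lo, hi = 1e-8, 1.0
    while h(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h(mid) > 0.0 else (lo, mid)
    b_hat = 0.5 * (lo + hi)
    s = sum(math.exp((ti - t_max) / b_hat) for ti in t)
    return t_max + b_hat * math.log(s / r), b_hat


def normal_intervals(t, delta, alpha=0.05):
    """Intervals for u and b from the observed information at the MLE."""
    u_hat, b_hat = fit_extreme_value(t, delta)
    r = sum(delta)
    z = [(ti - u_hat) / b_hat for ti in t]
    s1 = sum(zi * math.exp(zi) for zi in z)
    s2 = sum(zi * zi * math.exp(zi) for zi in z)
    # Invert the 2x2 observed information (1/b^2) * [[r, s1], [s1, r + s2]].
    det = r * (r + s2) - s1 ** 2
    d11 = b_hat ** 2 * (r + s2) / det      # estimated variance of u_hat
    d22 = b_hat ** 2 * r / det             # estimated variance of b_hat
    m22 = d22 / b_hat ** 2                 # estimated variance of log(b_hat)
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return {
        "u_hat": u_hat,
        "b_hat": b_hat,
        "u_method1": (u_hat - q * math.sqrt(d11), u_hat + q * math.sqrt(d11)),
        "b_method1": (b_hat - q * math.sqrt(d22), b_hat + q * math.sqrt(d22)),
        "b_method2": (b_hat * math.exp(-q * math.sqrt(m22)),
                      b_hat * math.exp(q * math.sqrt(m22))),
    }


# Hypothetical data: n = 50, u = 0, b = 1, censoring times 0.33 + Normal(0, 1).
rng = random.Random(7)
life = [math.log(rng.expovariate(1.0)) for _ in range(50)]
cens = [0.33 + rng.gauss(0.0, 1.0) for _ in range(50)]
t = [min(x, c) for x, c in zip(life, cens)]
delta = [1 if x <= c else 0 for x, c in zip(life, cens)]
res = normal_intervals(t, delta)
print(res["b_method1"], res["b_method2"])
```

The log-transformed interval is symmetric about $\hat{b}$ on the log scale, so its endpoints multiply to $\hat{b}^2$ and its lower endpoint is always positive, whereas the untransformed interval can extend below zero when the estimated variance is large.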
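The procedures based on the normal approximation are appropriate for quite large sample sizes. An appealing alternative is to use likelihood ratio procedures. Chi-squared ($\chi^2$) distributions, which approximate the distributions of likelihood ratio statistics, are often found to be adequate for small sample sizes. We include the confidence intervals from these procedures for a comparative study. The procedures are discussed below.
Consider the test problem $H_0: b = b_0$ versus $H_a: b \neq b_0$. The likelihood ratio test, with level $\alpha$, rejects $H_0$ when
$$-2\log\frac{\max_{u} L(u, b_0)}{\max_{u, b} L(u, b)} > \chi^2_{1,\alpha},$$
where $\max_{u} L(u, b_0)$ is the likelihood maximized over $u$ with $b$ fixed at $b_0$, $\max_{u, b} L(u, b) = L(\hat{u}, \hat{b})$, and $\chi^2_{1,\alpha}$ is the upper $\alpha$ quantile of the $\chi^2$ distribution with one degree of freedom. The set of values $b_0$ that are not rejected forms an approximate $(1-\alpha)100\%$ confidence interval for $b$. The interval for $u$ is obtained in the same way from the test of $H_0: u = u_0$, with the numerator replaced by $\max_{b} L(u_0, b)$.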
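Inverting the likelihood ratio test for the scale parameter amounts to finding the two values of $b_0$ at which the likelihood ratio statistic equals the $\chi^2$ critical value. A minimal sketch follows, assuming the censored extreme value log likelihood $-r\log b + \sum\delta_i z_i - \sum e^{z_i}$ with $z_i = (t_i - u)/b$ and $r = \sum\delta_i$; all function names and the simulated data are hypothetical, and the critical value uses the fact that a $\chi^2_1$ variable is the square of a standard normal.

```python
import math
import random
from statistics import NormalDist


def profile_loglik(t, delta, b):
    """Log likelihood maximized over u for a fixed scale b."""
    r = sum(delta)
    t_max = max(t)
    s = sum(math.exp((ti - t_max) / b) for ti in t)
    u_tilde = t_max + b * math.log(s / r)       # solves d log L / du = 0
    sum_dz = sum((ti - u_tilde) / b for ti, di in zip(t, delta) if di)
    return -r * math.log(b) + sum_dz - r        # sum of exp(z_i) equals r at u_tilde


def scale_equation(t, delta, b):
    """Decreasing function of b whose root is the MLE b_hat."""
    r = sum(delta)
    t_max = max(t)
    w = [math.exp((ti - t_max) / b) for ti in t]
    weighted_mean = sum(ti * wi for ti, wi in zip(t, w)) / sum(w)
    return weighted_mean - b - sum(ti for ti, di in zip(t, delta) if di) / r


def bisect(f, inside, outside, tol=1e-9):
    """Boundary between a point with f <= 0 (inside) and one with f > 0 (outside)."""
    while abs(outside - inside) > tol:
        mid = 0.5 * (inside + outside)
        if f(mid) <= 0.0:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def lr_interval_scale(t, delta, alpha=0.05):
    """Return (b_hat, lower, upper) for the likelihood ratio interval for b."""
    hi = 1.0
    while scale_equation(t, delta, hi) > 0.0:
        hi *= 2.0
    b_hat = bisect(lambda b: -scale_equation(t, delta, b), 1e-8, hi)
    l_max = profile_loglik(t, delta, b_hat)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2   # chi-square(1) quantile

    def excess(b):
        return 2.0 * (l_max - profile_loglik(t, delta, b)) - crit

    lo, up = b_hat / 2.0, b_hat * 2.0
    while excess(lo) <= 0.0:     # move outward until the test rejects
        lo /= 2.0
    while excess(up) <= 0.0:
        up *= 2.0
    return b_hat, bisect(excess, b_hat, lo), bisect(excess, b_hat, up)


# Hypothetical data: n = 50, u = 0, b = 1, censoring times -0.47 + Normal(0, 1).
rng = random.Random(11)
life = [math.log(rng.expovariate(1.0)) for _ in range(50)]
cens = [-0.47 + rng.gauss(0.0, 1.0) for _ in range(50)]
t = [min(x, c) for x, c in zip(life, cens)]
delta = [1 if x <= c else 0 for x, c in zip(life, cens)]
b_hat, lower, upper = lr_interval_scale(t, delta)
print(b_hat, lower, upper)
```

Unlike the normal-approximation interval, the resulting interval is generally not symmetric about $\hat{b}$, which reflects the skewness of the likelihood in small or heavily censored samples.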

Numerical Studies
Several simulation experiments were carried out to assess the performance of the confidence intervals discussed in Section 2. We report the simulation results based on the criteria of the empirical coverage probability and the empirical mean length of the confidence interval. For the simulation study, the survival data are generated from the extreme value distribution with $u = 0$ and $b = 1$. The censoring distributions are normal, where $C_i = c + \mathrm{Normal}(0, 1)$, with $c$ chosen to result in various censoring percentages in the samples. For the sample sizes of $n = 20$, 50, and 100, results were based on 500 repetitions. For the censored data, the censoring proportions of 20%, 30%, 40%, 50%, 60%, and 70% were used. These censoring proportions were obtained from settings of $c = 0.75$, 0.33, −0.08, −0.47, −0.87, −1.3, and −1.87, respectively. Simulation results based on these settings are summarized in Table 1.
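The data-generating step of such a simulation can be sketched as follows. This is an illustration, not the original simulation code: it assumes that the censoring time is $c$ plus a standard normal variate, generates extreme value lifetimes as the logarithm of a unit exponential variate (which has survival function $\exp(-e^{t})$), and uses a hypothetical function name, seed, and a large sample size so that the realized censoring proportion is stable.

```python
import math
import random


def simulate_censored(n, c, rng, u=0.0, b=1.0):
    """Generate (t_i, delta_i) with extreme value lifetimes and normal censoring."""
    t, delta = [], []
    for _ in range(n):
        lifetime = u + b * math.log(rng.expovariate(1.0))
        censor = c + rng.gauss(0.0, 1.0)       # assumed form of the censoring time
        t.append(min(lifetime, censor))
        delta.append(1 if lifetime <= censor else 0)
    return t, delta


rng = random.Random(2024)
proportions = {}
for c in (0.75, -0.47):
    t, delta = simulate_censored(100_000, c, rng)
    proportions[c] = 1.0 - sum(delta) / len(delta)
    print(c, round(proportions[c], 3))
```

Under this assumption, $c = 0.75$ gives a censoring proportion of roughly 20% and $c = -0.47$ roughly 50%, in line with the settings reported above.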
It should be noted that although the normal approximation procedures are adequate for quite large samples, the approximations on which they are based are rather poor for small samples (Lawless [1]). One possible way to solve this problem is a transformation. For example, simulation results presented in Table 1 show that the log transformation appears to be one way of alleviating this problem. That is, treating $\log\hat{b}$ as approximately normal is preferable to treating $\hat{b}$ as approximately normal. In Table 1, Method 1 denotes the confidence interval results from the large-sample normal approximation, and Method 2 (in parentheses) denotes the confidence interval results from the log transformation of the maximum likelihood estimate $\hat{b}$. In the table, ECP is the empirical coverage probability, and EML is the empirical mean length of the confidence interval. The likelihood ratio method is denoted by LR. It is known that the likelihood ratio method is often found to be appropriate for small sample sizes.
We now look at the results for the censored data case presented in Table 1. Overall, the empirical coverage probabilities of Method 1 are noticeably improved by Method 2 and LR for all sample sizes at every censoring proportion considered. If we restrict our attention to the scale parameter $b$ in Table 1, the performance of the confidence interval obtained by Method 1 is improved by the log transformation (Method 2) for the small sample size at all censoring proportions. It is also observed that the LR method outperforms Method 2, which in turn is superior to Method 1. The 95% confidence intervals from the 500 simulations result in 500 independent Bernoulli random variables, where success occurs when the true parameter is covered by the confidence interval and the probability of success is 95%. Thus the 95% error margin for the empirical coverage probability is $1.96\sqrt{0.95(1 - 0.95)/500} \approx 0.0191$. This implies that the empirical coverage probability is expected to fall within the interval (0.9309, 0.9691). For the sample size of $n = 20$, the empirical coverage probabilities of Method 1 fall outside the interval (0.9309, 0.9691), indicating that it does not achieve the theoretical coverage probability of 95%. The empirical coverage probabilities of LR fall within the interval (0.9309, 0.9691), whereas Method 2 is close but still does not achieve the theoretical coverage probability. For moderate and large sample sizes, there seems to be no dominant method that outperforms the others. The confidence intervals by Methods 1, 2, and LR approximately achieve the theoretical coverage probability of 95%. However, LR tends to be superior to Method 1 when censoring is heavy. As the sample size increases, the confidence interval length decreases for all censoring proportions. It seems that the confidence interval lengths by Method 1 are slightly shorter than those by Method 2 and LR, but these differences are not substantial.
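The acceptance band for the empirical coverage probability follows from the binomial standard error with 500 repetitions and nominal coverage 0.95. The short calculation below reproduces it; the variable names are illustrative.

```python
import math

nominal, repetitions = 0.95, 500
margin = 1.96 * math.sqrt(nominal * (1.0 - nominal) / repetitions)
lower, upper = nominal - margin, nominal + margin
print(round(margin, 4), round(lower, 4), round(upper, 4))  # 0.0191 0.9309 0.9691
```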
In the case of the location parameter $u$, Methods 1, 2, and LR show nearly the same performance for all sample sizes when data are not heavily censored. However, for small samples with heavily censored data, LR appears to outperform Method 1, achieving the theoretical coverage probability of 95%. Method 1 fails to achieve the nominal level in this case. The differences in the confidence interval lengths are not substantial, although the Method 1 intervals are slightly shorter than the LR intervals.
We also discuss a graphical method for checking the adequacy of the distribution. The extreme value survival function satisfies $\log\{-\log S(t)\} = (t - u)/b$, so $t = u + b\log\{-\log S(t)\}$. Therefore, $\log\{-\log S(t)\}$ is a linear function of $t$ (with $u = 0$ and $b = 1$ it equals $t$ itself), and a plot of $\log\{-\log S(t)\}$ versus $t$ should be roughly linear if the extreme value distribution is reasonable. When data are censored, the most widely used estimate for $S(t)$ is the Kaplan-Meier estimate (Kaplan and Meier [4]), also referred to as the product-limit estimate of the survival function. The Kaplan-Meier estimate is defined as
$$\hat{S}(t) = \prod_{j:\, t_j \le t}\frac{n_j - d_j}{n_j},$$
where the $t_j$ are the distinct observed lifetimes, $d_j$ represents the number of lifetimes at time $t_j$, and $n_j$ is the number of individuals at risk (alive and uncensored) just before $t_j$. The estimate $\hat{S}(t)$ was used for $S(t)$ in this paper. Figure 1 shows a linear relationship between $t$ and $\log\{-\log\hat{S}(t)\}$, although some slight deviations from the linear trend are detected under heavy censoring. Figure 1 was obtained for a sample size of $n = 100$ from the same settings as the preceding simulations.
The procedures are applied to a real data set. Pike [13] gives results of a laboratory experiment concerning vaginal cancer in female rats. In this experiment, 19 rats were painted with the carcinogen DMBA, and the number of days until the appearance of a carcinoma was observed. At the end of the study, 17 of the 19 rats had developed a carcinoma, which indicates that two of the times are censored. The censoring proportion is $2/19 \approx 0.105$. See Pike [13] and Lawless [1] for details. In order to check the adequacy of an extreme value model, we consider a plot of $\log\{-\log\hat{S}(t)\}$ versus $t$. Figure 2 shows this to be roughly linear, suggesting that the extreme value distribution could be reasonable. Another graphical approach employed by Lawless [1] for the vaginal cancer data also confirms that the model is plausible. The estimation procedures under the extreme value distribution give $\hat{u} = 5.4565$ and $\hat{b} = 0.1649$, and the interval estimation results are summarized in Table 2. In this study concerning vaginal cancer in rats, the sample size is small and the censoring level is low. Therefore, among the three confidence intervals, the interval constructed by the likelihood ratio method would be the most reliable, especially for the scale parameter of the model.
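The graphical check described above can be sketched numerically. The code below is a hypothetical illustration on simulated data, not on the vaginal cancer data: it computes the Kaplan-Meier estimate, forms the points $(t_j, \log\{-\log\hat{S}(t_j)\})$, and fits a least-squares line to them as a rough numerical stand-in for judging linearity by eye. Under the extreme value model the slope should be near $1/b$ and the intercept near $-u/b$. The function names and the censoring rule $C_i = 0.75 + \mathrm{Normal}(0, 1)$ are assumptions.

```python
import math
import random


def kaplan_meier(t, delta):
    """Return [(t_j, S_hat(t_j))] at the distinct observed lifetimes t_j."""
    data = sorted(zip(t, delta))
    at_risk, s, out, i = len(data), 1.0, [], 0
    while i < len(data):
        time, deaths, count = data[i][0], 0, 0
        while i < len(data) and data[i][0] == time:   # group tied times
            deaths += data[i][1]
            count += 1
            i += 1
        if deaths > 0:
            s *= (at_risk - deaths) / at_risk
            out.append((time, s))
        at_risk -= count
    return out


def loglog_points(t, delta):
    """Points (t_j, log(-log S_hat(t_j))), skipping S_hat equal to 0 or 1."""
    return [(time, math.log(-math.log(s)))
            for time, s in kaplan_meier(t, delta) if 0.0 < s < 1.0]


def least_squares_line(points):
    """Slope and intercept of the ordinary least-squares line through points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: u = 0, b = 1, censoring times 0.75 + Normal(0, 1).
rng = random.Random(3)
life = [math.log(rng.expovariate(1.0)) for _ in range(2000)]
cens = [0.75 + rng.gauss(0.0, 1.0) for _ in range(2000)]
t = [min(x, c) for x, c in zip(life, cens)]
delta = [1 if x <= c else 0 for x, c in zip(life, cens)]
slope, intercept = least_squares_line(loglog_points(t, delta))
print(slope, intercept)   # expected to be near 1/b = 1 and -u/b = 0
```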

Concluding Remarks
In this paper, we have investigated the inference procedures for the extreme value distribution with censored observations. The extreme value distribution is a useful model in the parametric analysis of lifetime data. Through numerical studies, the inference procedures, based on the maximum likelihood estimates, were examined. The usual normal approximation procedures were enhanced by means of the log transformation and the likelihood ratio method. By analysis of the empirical coverage probabilities and the empirical mean lengths of the confidence intervals, we have found that the likelihood ratio method is very effective for small sample sizes when data are heavily censored. A graphical method for checking the adequacy of the distribution was also discussed. The procedures were then applied to a real-world data set.
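To estimate $(u, b)$, we find the values of $(u, b)$ that maximize $\log L(u, b)$ by setting the derivative $d\log L(u, b)/d(u, b)$ equal to zero and solving for $(u, b)$. From (2.8), we have the log likelihood function
$$\log L(u, b) = \sum_{i=1}^{n}\left(\delta_i z_i - e^{z_i}\right) - r\log b,$$
where $z_i = (t_i - u)/b$ and $r = \sum_{i=1}^{n}\delta_i$.
From the asymptotic normality of $(\hat{u}, \hat{b}) - (u, b)$ and the estimated variance, inference procedures on $(u, b)$ can be easily obtained. For example, possible asymptotic $(1-\alpha)100\%$ confidence intervals for $u$ and $b$ are
$$\left(\hat{u} - z_{\alpha/2}\sqrt{d_{11}},\ \hat{u} + z_{\alpha/2}\sqrt{d_{11}}\right), \qquad \left(\hat{b} - z_{\alpha/2}\sqrt{d_{22}},\ \hat{b} + z_{\alpha/2}\sqrt{d_{22}}\right), \qquad (2.23)$$
respectively, where $z_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution and $d_{11}$ and $d_{22}$ are the estimated variances of $\hat{u}$ and $\hat{b}$. However, such an interval for the scale parameter $b$ may contain negative values. Recall that $-\infty < u < \infty$ and $b > 0$. Similar to Lawless [1], this problem can be repaired by using the log transformation. Let $\xi = \log b$ and $\hat{\xi} = \log\hat{b}$. A standard argument shows that $\log\hat{b} - \log b$ is asymptotically equivalent to $1 - b/\hat{b}$. From this, we have that the variance of $\hat{\xi}$ is approximately $d_{22}/\hat{b}^2$.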

Figure 1: Plots of $t$ versus $\log\{-\log\hat{S}(t)\}$, $n = 100$.

Figure 2: Plot of $t$ versus $\log\{-\log\hat{S}(t)\}$, vaginal cancer data.
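Let $T_1, \ldots, T_n$ denote the lifetimes, and let $C_1, \ldots, C_n$ be censoring times with survival function $G(t) = P(C_i \geq t)$ that are independent of $T_i$. We observe $T_i$ only if $T_i \leq C_i$, and so the available data consist of pairs $(t_i, \delta_i)$, $i = 1, \ldots, n$, where
$$t_i = \min(T_i, C_i), \qquad \delta_i = \begin{cases} 1 & \text{if } T_i \leq C_i, \\ 0 & \text{if } T_i > C_i. \end{cases}$$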

Table 1: Simulation results: empirical coverage probability (ECP) and empirical mean length (EML) of 95% confidence intervals for $u$ and $b$.

Table 2: Confidence intervals for $u$ and $b$, vaginal cancer data.