The Weibull distribution is widely used in the parametric analysis of lifetime data. In place of the Weibull distribution, it is often more convenient to work with the equivalent extreme value distribution, which is the logarithm of the Weibull distribution. The main advantage in working with the extreme value distribution is that unlike the Weibull distribution, the extreme value distribution has location and scale parameters. This paper is devoted to a discussion of statistical inferences for the extreme value distribution with censored data. Numerical simulations are performed to examine the finite sample behaviors of the estimators of the parameters. These procedures are then applied to real-world data.

In medical research, data documenting the time until the occurrence of a particular event, such as the death of a patient, are frequently encountered. Such data are called time-to-event data, also referred to as lifetime, survival time, or failure time data, and in general have a right-skewed distribution. For this reason, the Weibull distribution is widely used. In place of the Weibull distribution, it is often more convenient to work with the equivalent extreme value distribution, in which the data are the logarithms of those taken from the Weibull distribution (Lawless [

A common feature of lifetime data is that some observations may be censored. For example, the event of interest may not have happened to all patients: a patient undergoing cancer therapy might die in a road accident, so the observation period is cut off before the event of interest occurs. In such a case, the data are said to be censored, and it would be incorrect to treat the time-to-death as the lifetime. Conventional statistical methods cannot be directly applied to censored data. Instead, special statistical methods are necessary. Censored data have been studied by many authors. Kaplan and Meier [

This paper is organized as follows: Section

The probability density function for the extreme value distribution considered here is

The above probabilities can be combined into the single expression

This yields the sampling distribution of

Knowing that

It can be easily shown that for the extreme value distribution, the survival function is

Hence, the above likelihood function can be written as
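For reference, the standard forms of these quantities can be written in one common notation, assumed here for illustration: location $u$, scale $b > 0$, observed log-times $x_1, \dots, x_n$, and indicators $\delta_i$ equal to 1 for an observed lifetime and 0 for a censored one. The symbols are assumptions and may differ from those used in the displays of this paper.

$$
f(x) = \frac{1}{b}\exp\!\left[\frac{x-u}{b} - \exp\!\left(\frac{x-u}{b}\right)\right],
\qquad
S(x) = \exp\!\left[-\exp\!\left(\frac{x-u}{b}\right)\right],
$$

$$
L(u,b) = \prod_{i=1}^{n} f(x_i)^{\delta_i}\, S(x_i)^{1-\delta_i}
       = b^{-r}\exp\!\left[\sum_{i=1}^{n}\delta_i z_i - \sum_{i=1}^{n} e^{z_i}\right],
\qquad
z_i = \frac{x_i-u}{b},\quad r = \sum_{i=1}^{n}\delta_i .
$$

An observed lifetime contributes the density and a censored one contributes the survival function, which is why the exponential term runs over all $n$ observations while the linear term runs only over the $r$ uncensored ones.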

From (

which is equivalent to

The above equations can be solved by some numerical techniques such as the Newton-Raphson iteration or random search to locate the estimates,
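As an illustration of one such numerical solution, the sketch below uses bisection rather than Newton-Raphson. It assumes the standard log-likelihood $-r\log b + \sum \delta_i z_i - \sum e^{z_i}$ with $z_i = (x_i - u)/b$, for which the likelihood equations reduce to a single equation in $b$, namely $\sum x_i e^{x_i/b} / \sum e^{x_i/b} - b - r^{-1}\sum \delta_i x_i = 0$, followed by $u = b \log\left(r^{-1}\sum e^{x_i/b}\right)$. The function name `fit_extreme_value` and the symbols `u`, `b` are hypothetical and not taken from the paper.

```python
import math


def fit_extreme_value(times, observed):
    """ML estimates (u, b) for right-censored extreme value data.

    times    : log-lifetimes or log-censoring times
    observed : 1 if the lifetime was observed, 0 if it was censored
    """
    r = sum(observed)
    if r == 0:
        raise ValueError("need at least one uncensored observation")
    x_max = max(times)
    mean_obs = sum(x for x, d in zip(times, observed) if d) / r
    if x_max <= mean_obs:
        raise ValueError("estimates do not exist for this sample")

    def h(b):
        # Left-hand side of the equation in b. Shifting by x_max keeps
        # every exponent non-positive, so exp() cannot overflow.
        w = [math.exp((x - x_max) / b) for x in times]
        return sum(x * wi for x, wi in zip(times, w)) / sum(w) - b - mean_obs

    # h is strictly decreasing, positive near 0 and non-positive at
    # x_max - mean_obs, so bisection on that interval finds the unique root.
    lo, hi = 0.0, x_max - mean_obs
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)

    # Closed-form location estimate for the fitted scale.
    s = sum(math.exp((x - x_max) / b) for x in times)
    u = x_max + b * math.log(s / r)
    return u, b
```

Bisection is slower than Newton-Raphson but needs no starting value and no derivative, which makes it a convenient check on a Newton-Raphson implementation.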

From (

which are equivalent to

To make inferences about

where

It is often difficult to evaluate the expectations in

where

From the usual large-sample theory, we have

Thus,

where the matrix
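The observed information and the resulting standard errors can be sketched as follows. The code assumes the standard censored log-likelihood $-r\log b + \sum \delta_i z_i - \sum e^{z_i}$ with $z_i = (x_i - u)/b$, and differentiates it twice analytically. The function names and the symbols `u`, `b` are hypothetical, and the estimates are taken as inputs rather than computed here.

```python
import math


def observed_information(u, b, times, observed):
    """Negative Hessian of the censored log-likelihood at (u, b)."""
    z = [(x - u) / b for x in times]
    ez = [math.exp(zi) for zi in z]
    r = sum(observed)
    s0 = sum(ez)                                       # sum of exp(z)
    s1 = sum(zi * e for zi, e in zip(z, ez))           # sum of z * exp(z)
    s2 = sum(zi * zi * e for zi, e in zip(z, ez))      # sum of z^2 * exp(z)
    dz = sum(zi for zi, d in zip(z, observed) if d)    # sum of z over observed
    i_uu = s0 / b**2
    i_ub = (s0 - r + s1) / b**2
    i_bb = (2.0 * s1 + s2 - r - 2.0 * dz) / b**2
    return [[i_uu, i_ub], [i_ub, i_bb]]


def standard_errors(u, b, times, observed):
    """Approximate standard errors of the location and scale estimates."""
    (a, c), (_, d) = observed_information(u, b, times, observed)
    det = a * d - c * c
    # Square roots of the diagonal of the inverse of [[a, c], [c, d]].
    return math.sqrt(d / det), math.sqrt(a / det)
```

Evaluated at the maximum likelihood estimates, the entries simplify because the likelihood equations force $\sum e^{z_i} = r$. The general form is kept here so that the function can be checked by finite differences at any point.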

From the asymptotic normality of

respectively, where

which is equivalent to

with

Therefore, we have

where

Hence, since

Note that the interval always lies in the positive half of the axis.
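The difference between the two intervals for the scale parameter can be seen in a small sketch. It assumes the usual construction: the normal interval is the estimate plus or minus a normal quantile times the standard error, and the log-based interval applies the same idea to the logarithm of the estimate, whose standard error is approximately the standard error divided by the estimate (delta method). The function name `scale_intervals` and the arguments `b_hat` and `se_b` are hypothetical.

```python
import math
from statistics import NormalDist


def scale_intervals(b_hat, se_b, level=0.95):
    """Normal-approximation and log-transformed intervals for the scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    normal = (b_hat - z * se_b, b_hat + z * se_b)
    # Delta method: se(log b_hat) is approximately se_b / b_hat.
    factor = math.exp(z * se_b / b_hat)
    log_based = (b_hat / factor, b_hat * factor)
    return normal, log_based
```

With an estimate of 0.1649 and a standard error of about 0.0290, both back-calculated from the normal interval [0.1081, 0.2217] reported for the real data set, the log-based interval comes out at roughly [0.1168, 0.2327]. Its lower limit is an estimate divided by a positive factor, so it cannot be negative, whereas the normal interval can cross zero when the standard error is large relative to the estimate.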

The procedures based on the normal approximation are appropriate for quite large sample sizes. An appealing alternative is to use likelihood ratio procedures. Chi-squared (

Consider the test problem

where

at which

Note that

Similarly, a

where
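A likelihood ratio interval for the scale parameter can be sketched as below. The sketch assumes the standard censored log-likelihood $-r\log b + \sum \delta_i z_i - \sum e^{z_i}$ with $z_i = (x_i - u)/b$, for which the location that maximizes the likelihood at a fixed scale $b$ has the closed form $u(b) = b\log\left(r^{-1}\sum e^{x_i/b}\right)$. The interval collects the values of $b$ whose likelihood ratio statistic stays below the chi-squared quantile with one degree of freedom. All function names are hypothetical, the estimate `b_hat` is supplied by the caller, and the statistic is assumed to increase on either side of `b_hat`.

```python
import math
from statistics import NormalDist


def profile_loglik(b, times, observed):
    """Log-likelihood maximized over the location for a fixed scale b."""
    r = sum(observed)
    x_max = max(times)
    s = sum(math.exp((x - x_max) / b) for x in times)
    u = x_max + b * math.log(s / r)        # maximizer over u given b
    dz = sum((x - u) / b for x, d in zip(times, observed) if d)
    return -r * math.log(b) + dz - r       # sum of exp(z_i) equals r at u


def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f changes sign on the interval."""
    lo_sign = f(lo) >= 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) >= 0) == lo_sign:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lr_interval_scale(times, observed, b_hat, level=0.95):
    """Likelihood ratio interval for the scale, given its ML estimate."""
    # The chi-squared(1) quantile is the square of a normal quantile.
    crit = NormalDist().inv_cdf(0.5 + level / 2.0) ** 2
    l_max = profile_loglik(b_hat, times, observed)

    def g(b):
        return 2.0 * (l_max - profile_loglik(b, times, observed)) - crit

    lo, hi = b_hat, b_hat
    while g(lo) < 0:       # step outward until the statistic exceeds crit
        lo /= 2.0
    while g(hi) < 0:
        hi *= 2.0
    return bisect(g, lo, b_hat), bisect(g, b_hat, hi)
```

The interval for the location parameter follows the same pattern, except that the scale maximizing the likelihood at a fixed location has no closed form and needs a one-dimensional numerical search.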

Several experimental simulations were carried out to assess the performance of the confidence intervals discussed in Section
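One replicate of such a simulation could be generated along the following lines. The censoring mechanism shown is Type II censoring (the largest observations are censored at the last observed value), which is an assumption chosen because it hits a target censoring percentage exactly; the scheme actually used in the study may differ. The function name `simulate_censored` and its arguments are hypothetical.

```python
import math
import random


def simulate_censored(n, u, b, censor_frac, rng):
    """Type II censored extreme value sample (assumed censoring scheme).

    Returns (times, observed), where observed[i] is 1 for a lifetime and
    0 for a censoring time.
    """
    r = n - int(round(censor_frac * n))    # number of observed lifetimes
    if r < 1:
        raise ValueError("censor_frac leaves no observed lifetimes")

    def draw():
        # If E is standard exponential, u + b * log(E) follows the extreme
        # value distribution with location u and scale b.
        e = rng.expovariate(1.0)
        while e == 0.0:                    # guard against log(0)
            e = rng.expovariate(1.0)
        return u + b * math.log(e)

    draws = sorted(draw() for _ in range(n))
    cutoff = draws[r - 1]                  # r-th smallest value
    times = draws[:r] + [cutoff] * (n - r)
    observed = [1] * r + [0] * (n - r)
    return times, observed
```

Repeating this many times, computing an interval for each replicate, and recording how often the interval contains the true parameter and how long it is gives the empirical coverage probability and the empirical mean length.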

Simulation results, empirical coverage probability (ECP) and empirical mean length (EML) of 95% confidence intervals of

Censoring | Sample size | Method | ECP (location) | ECP (scale) | EML (location) | EML (scale) |
---|---|---|---|---|---|---|
20% | 20 | 1 (2) | 94.0 | 87.8 (91.2) | 0.9508 | 0.7566 (0.7770) |
20% | 20 | LR | 94.4 | 93.4 | 1.0217 | 0.8167 |
20% | 50 | 1 (2) | 94.0 | 93.8 (94.2) | 0.6233 | 0.4948 (0.5000) |
20% | 50 | LR | 93.4 | 94.4 | 0.6335 | 0.5100 |
20% | 100 | 1 (2) | 94.6 | 92.6 (93.0) | 0.4379 | 0.3458 (0.3476) |
20% | 100 | LR | 93.8 | 92.6 | 0.4348 | 0.3510 |
30% | 20 | 1 (2) | 95.6 | 89.4 (92.6) | 1.0412 | 0.8303 (0.8564) |
30% | 20 | LR | 95.2 | 94.2 | 1.1397 | 0.9081 |
30% | 50 | 1 (2) | 95.4 | 93.4 (94.4) | 0.6604 | 0.5245 (0.5308) |
30% | 50 | LR | 95.2 | 94.6 | 0.6764 | 0.5431 |
30% | 100 | 1 (2) | 94.8 | 92.8 (94.4) | 0.4657 | 0.3696 (0.3718) |
30% | 100 | LR | 93.4 | 94.2 | 0.4647 | 0.3760 |
40% | 20 | 1 (2) | 93.4 | 88.4 (91.0) | 1.1755 | 0.9163 (0.9520) |
40% | 20 | LR | 93.6 | 93.0 | 1.2944 | 1.0241 |
40% | 50 | 1 (2) | 93.4 | 93.2 (95.0) | 0.7226 | 0.5659 (0.5738) |
40% | 50 | LR | 93.0 | 95.0 | 0.7469 | 0.5898 |
40% | 100 | 1 (2) | 94.6 | 92.6 (94.0) | 0.5101 | 0.3999 (0.4026) |
40% | 100 | LR | 93.8 | 94.2 | 0.5118 | 0.4081 |
50% | 20 | 1 (2) | 91.6 | 88.4 (92.8) | 1.3248 | 0.9886 (1.0345) |
50% | 20 | LR | 94.2 | 93.4 | 1.4620 | 1.1317 |
50% | 50 | 1 (2) | 94.2 | 90.0 (92.0) | 0.8143 | 0.6132 (0.6236) |
50% | 50 | LR | 94.8 | 93.8 | 0.8524 | 0.6451 |
50% | 100 | 1 (2) | 94.8 | 95.0 (95.4) | 0.5909 | 0.4471 (0.4508) |
50% | 100 | LR | 94.0 | 95.2 | 0.5983 | 0.4584 |
60% | 20 | 1 (2) | 92.8 | 87.8 (93.0) | 1.6431 | 1.1305 (1.2012) |
60% | 20 | LR | 94.6 | 94.0 | 1.7439 | 1.3587 |
60% | 50 | 1 (2) | 94.8 | 93.8 (94.4) | 1.0082 | 0.7065 (0.7219) |
60% | 50 | LR | 94.8 | 95.0 | 1.0733 | 0.7500 |
60% | 100 | 1 (2) | 95.0 | 94.0 (94.8) | 0.6803 | 0.4822 (0.4871) |
60% | 100 | LR | 93.4 | 95.4 | 0.6954 | 0.4975 |
70% | 20 | 1 (2) | 89.8 | 87.2 (91.4) | 2.2459 | 1.3388 (1.4785) |
70% | 20 | LR | 94.0 | 93.6 | 2.0291 | 1.8666 |
70% | 50 | 1 (2) | 94.4 | 92.0 (92.6) | 1.2435 | 0.7913 (0.8140) |
70% | 50 | LR | 93.6 | 93.2 | 1.3256 | 0.8641 |
70% | 100 | 1 (2) | 92.8 | 91.0 (93.2) | 0.8572 | 0.5503 (0.5576) |
70% | 100 | LR | 93.6 | 94.2 | 0.8894 | 0.5739 |

It should be noted that although the normal approximation procedures are adequate for quite large samples, the approximations on which they are based are rather poor for small samples (Lawless [

We now look at the results for the censored data case presented in Table

In the case of the location parameter (

We also discuss a graphical method for checking the adequacy of the distribution. The extreme value survival function satisfies

where

Plots of
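The points for such a plot can be computed as in the sketch below. It assumes that the plotted quantity is $\log(-\log \hat{S}(x))$, with $\hat{S}$ the Kaplan-Meier estimate of the survival function, since the extreme value survival function $S(x) = \exp[-\exp((x-u)/b)]$ gives $\log(-\log S(x)) = (x-u)/b$, a straight line in $x$. The function name is hypothetical, and the symbols $u$ (location) and $b$ (scale) are assumed notation.

```python
import math


def extreme_value_plot_points(times, observed):
    """Points (x, log(-log S_hat(x))) at the distinct observed lifetimes.

    S_hat is the Kaplan-Meier estimate. Points where S_hat is 0 or 1
    are skipped because the transformation is undefined there.
    """
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    surv = 1.0
    points = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = ties = 0
        while i < len(data) and data[i][0] == t:   # group tied times
            deaths += data[i][1]
            ties += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk       # Kaplan-Meier step
            if 0.0 < surv < 1.0:
                points.append((t, math.log(-math.log(surv))))
        n_at_risk -= ties
    return points
```

If the extreme value model is adequate, the points should fall close to a straight line with slope near $1/b$ and intercept near $-u/b$. Marked curvature suggests that the model does not fit.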

The procedures are applied to a real data set. Pike [

Confidence interval for

Method | C.I. (location) | Length | C.I. (scale) | Length |
---|---|---|---|---|
Normal | [5.3756, 5.5374] | 0.1618 | [0.1081, 0.2217] | 0.1136 |
Log | [5.3756, 5.5374] | 0.1618 | [0.1168, 0.2327] | 0.1159 |
LR | [5.3800, 5.5400] | 0.1600 | [0.1204, 0.2419] | 0.1215 |

Plot of

In this paper, we have investigated the inference procedures for the extreme value distribution with censored observations. The extreme value distribution is a useful model in the parametric analysis of lifetime data. Through numerical studies, the inference procedures, based on the maximum likelihood estimates, were examined. The usual normal approximation procedures were enhanced by means of the log transformation and the likelihood ratio method. By analysis of the empirical coverage probabilities and the empirical mean lengths of the confidence intervals, we have found that the likelihood ratio method is very effective for small sample sizes when data are heavily censored. A graphical method for checking the adequacy of the distribution was also discussed. The procedures were then applied to a real-world data set.