A Two-Sample Parametric Test for Partly Interval-Censored Data with Nonproportional Hazard

In clinical trials and engineering studies with periodic follow-ups, partly interval-censored failure time data arise frequently. Partly interval-censored failure time data are composed of exact observations and interval-censored observations. This paper discusses the two-sample parametric comparison of reliability functions in the presence of partly interval-censored failure time data. We construct a score test and a likelihood ratio test for this kind of failure time data under the piecewise exponential distribution by using a multiple imputation technique. A simulation study is conducted to assess the proposed tests, and it indicates that the presented procedure works well. Finally, an example is given for illustration purposes.


Introduction
Comparison of reliability functions is one of the main objectives in survival studies. In this paper, we discuss the comparison problem for two reliability functions in the presence of partly interval-censored (PIC) failure time data. With PIC data, the failure times are observed exactly for some items, while for the remaining items the failure times are known only to lie within an interval. PIC data often occur in medical and engineering studies with periodic follow-ups. An example of this kind of data is provided by the Framingham Heart Disease study; see [1].
Many test procedures have been proposed to solve the comparison problem when the observed data are right-censored (e.g., [2,3]) or interval-censored. For example, Finkelstein [4] developed a score test for comparing several survival functions under the proportional hazards model. Pan [5] suggested a multiple imputation method based on the approximate Bayesian bootstrap for comparing two interval-censored samples with proportional and nonproportional hazards. Lim and Sun [6] proposed nonparametric two-sample tests with proportional and nonproportional hazards for interval-censored failure time data, among many others [7-12]. For the case of PIC data, research is ongoing but so far limited. R. Peto and J. Peto [8] considered PIC data by treating the exact observations as extremely short interval-censored observations. Huang [13] derived the asymptotic properties of the nonparametric estimator of the distribution function with PIC data. Kim [14] studied maximum likelihood estimation with PIC data under the proportional hazards model. Zhao et al. [15] developed a nonparametric test for PIC data based on the same idea as the generalized log-rank test for interval-censored data given in Sun et al. [11].
Multiple imputation (MI) has proven useful in solving many statistical problems. However, to the best of the authors' knowledge, MI has not been applied in the presence of PIC data. This paper was motivated by the need to develop parametric tests for PIC data under nonproportional hazards by using the MI technique. The rest of this article is organized as follows. Section 2 introduces the concept of PIC failure time data. Section 3 introduces the piecewise exponential model. Section 4 develops the parametric tests for comparing the reliability functions of two systems in the presence of PIC failure time data. Section 5 will discuss

Partly Interval-Censored Data
To explain the concept of PIC data, consider a schematic follow-up for engineering studies in which $t_1, t_2, \ldots, t_k$ are the inspection times and the examiner may be absent from a follow-up with probability $p$, except at the first follow-up. Let $T_i > 0$ be a random variable denoting the failure time of interest for the $i$th item, and let $n$ be the number of items, with failure times following a distribution with reliability (survival) function $S(t, \theta)$, where $\theta$ is a parameter vector. Assume that we observe the exact failure time for $n_1$ items and an interval-censored failure time for the remaining $n_2$ items ($n_2 = n - n_1$). By an exact failure time we mean that the failure of the item occurs during an inspection, so that the failure time is recorded exactly. By an interval-censored failure time we mean that the failure occurs between two inspection times, in $(L_i, R_i]$, where $L_i, R_i \in (t_1, t_2, \ldots, t_k)$ and $L_i < R_i$ with probability one. In addition, if the item fails before the first inspection time, we have a left-censored observation, that is, $T_i \in (0, R_i]$, and if the item has not failed by the final inspection time, we have a right-censored observation, that is, $T_i \in (L_i, \infty)$. We also assume that censoring is independent of the examination time. For the $i$th item with interval censoring, define the indicators $\delta_i = I(T_i \in (0, R_i])$ and $\gamma_i = I(T_i \in (L_i, R_i])$.
Then the likelihood function for $\theta$ is given by (1), and the log-likelihood function for $\theta$ is given by (2).
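As a sketch only, a standard likelihood for PIC data that is consistent with the definitions above can be written as follows. The density notation $f(t; \theta)$, the observed exact times $t_i$, and the convention $L_i = 0$ for left-censored and $R_i = \infty$ for right-censored observations (so that $S(0; \theta) = 1$ and $S(\infty; \theta) = 0$) are assumptions of this sketch and may differ from the authors' exact display.

```latex
% Exact observations contribute the density; interval-censored
% observations contribute the probability of the observed interval.
L(\theta) = \prod_{i=1}^{n_1} f(t_i;\theta)
            \prod_{i=n_1+1}^{n} \bigl[ S(L_i;\theta) - S(R_i;\theta) \bigr],
\qquad
\log L(\theta) = \sum_{i=1}^{n_1} \log f(t_i;\theta)
               + \sum_{i=n_1+1}^{n} \log \bigl[ S(L_i;\theta) - S(R_i;\theta) \bigr].
```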

Piecewise Exponential Model
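Let $T_i$, the failure time, follow the piecewise exponential distribution. Suppose that we divide the duration time into $J$ intervals with cutpoints $0 = \tau_0 < \tau_1 < \cdots < \tau_J = \infty$, where the $\ell$th interval is defined as $(\tau_{\ell-1}, \tau_\ell]$ for $\ell = 1, \ldots, J$, and assume that the baseline hazard is constant in interval $\ell$, so that $h(t) = \lambda_\ell$ for $t \in (\tau_{\ell-1}, \tau_\ell]$. The corresponding reliability function is defined as
$$S(t) = \exp\Bigl\{ -\lambda_\ell (t - \tau_{\ell-1}) - \sum_{k=1}^{\ell-1} \lambda_k (\tau_k - \tau_{k-1}) \Bigr\}, \quad t \in (\tau_{\ell-1}, \tau_\ell], \tag{3}$$
where $\tau_0 = 0$. In this article we consider two situations, the first with one cutpoint and the second with two cutpoints. The reliability function with one cutpoint $\tau_1$ is defined as
$$S(t) = \begin{cases} \exp(-a t), & 0 < t \le \tau_1, \\ \exp\{-a \tau_1 - b (t - \tau_1)\}, & t > \tau_1, \end{cases} \tag{4}$$
where $\lambda_1 = a$ and $\lambda_2 = b$.
The reliability function with two cutpoints $\tau_1 < \tau_2$ is defined as
$$S(t) = \begin{cases} \exp(-a t), & 0 < t \le \tau_1, \\ \exp\{-a \tau_1 - b (t - \tau_1)\}, & \tau_1 < t \le \tau_2, \\ \exp\{-a \tau_1 - b (\tau_2 - \tau_1) - c (t - \tau_2)\}, & t > \tau_2, \end{cases} \tag{5}$$
where $\lambda_1 = a$, $\lambda_2 = b$, and $\lambda_3 = c$.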
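The piecewise exponential reliability function can be evaluated by accumulating the hazard over the intervals that precede the time of interest. The following is a minimal sketch; the function name `reliability` and its argument layout are illustrative assumptions, not part of the original work.

```python
import math


def reliability(t, hazards, cutpoints):
    """Piecewise exponential S(t): hazards[k] is constant on the k-th interval.

    `cutpoints` are the interior cutpoints in increasing order, so there must
    be exactly one more hazard than cutpoints (hypothetical layout).
    """
    if len(hazards) != len(cutpoints) + 1:
        raise ValueError("need exactly one more hazard than cutpoints")
    cum_hazard = 0.0
    start = 0.0
    # The last interval is unbounded, so append infinity as its right end.
    for rate, end in zip(hazards, list(cutpoints) + [math.inf]):
        if t <= end:
            cum_hazard += rate * (t - start)   # partial interval containing t
            break
        cum_hazard += rate * (end - start)     # full interval before t
        start = end
    return math.exp(-cum_hazard)
```

For instance, `reliability(300, [0.005, 0.004], [250])` evaluates the one-cutpoint form with hazard 0.005 up to 250 and 0.004 afterwards, and `reliability(t, [a, b, c], [tau1, tau2])` gives the two-cutpoint form.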
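Under nonproportional hazards, the relation between two survival functions is described as an early hazard difference, a late hazard difference, or a crossing hazard. Let $h_j(t)$ be the hazard function corresponding to $S_j(t)$, $j = 1, 2$. The nonproportional hazard configurations are (1) early hazard difference, in which $h_j(t) = a_j$ for $t \le \tau_1$ and $h_j(t) = b$ for $t > \tau_1$; (2) late hazard difference, in which $h_j(t) = a$ for $t \le \tau_1$ and $h_j(t) = b_j$ for $t > \tau_1$; and (3) crossing hazard, defined with two cutpoints, in which the hazard functions of the two systems cross.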

Parametric Test
To present the Score Test (ST) and the Likelihood Ratio Test (LRT), we consider $n$ independent items from two different systems. Let $T_i > 0$ be a random variable denoting the failure time of the $i$th item, and let system $\ell$, with reliability function $S_\ell(t)$, $\ell = 1, 2$, contribute $n_{\ell 1}$ exact and $n_{\ell 2}$ interval-censored observations, so that $n = \sum_{\ell=1}^{2} (n_{\ell 1} + n_{\ell 2})$. Suppose that PIC data are available for the $\ell$th system. This means that there are $n_1$ exact failure times, given in the form $\{t_i\}_{i=1}^{n_1}$ with $n_1 = \sum_{\ell=1}^{2} n_{\ell 1}$, and $n_2$ interval-censored failure times, given in the form $\{(L_i, R_i],\ i = n_1 + 1, \ldots, n\}$ with $n_2 = n - n_1$, where $(L_i, R_i]$ denotes the interval in which $T_i$ is observed and $L_i$ and $R_i$ are positive random variables independent of $T_i$ such that $L_i < R_i$ with probability one. When $L_i < R_i = \infty$, the failure time $T_i$ is subject to right-censoring. Assume that the data are generated from a piecewise exponential distribution with reliability function $S(t, \theta)$, where $\theta$ is the parameter vector and the cutpoints of the piecewise exponential distribution are given.
First, suppose that the piecewise exponential distribution has one cutpoint, so that the reliability function $S(t, \theta)$ is as in (4) with $\theta = (a, b)$. Under this assumption, substitution in (1) gives the likelihood function in (11), which is expressed through the numbers of items that satisfy the corresponding conditions on the observations. By using the Newton-Raphson method, we can obtain the maximum likelihood estimate $\hat\theta = (\hat a, \hat b)$ of $\theta$, so that the parametric estimate of $S(t)$ is $\hat S(t)$. To develop the parametric tests, suppose that we have two samples generated from piecewise exponential distributions with parameters $\theta_j = (a_j, b_j)$, $j = 1, 2$, and that our goal is to test the hypothesis $H_0: S_1(t) = S_2(t)$.
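To make the structure of the one-cutpoint likelihood concrete, the sketch below evaluates a PIC log-likelihood in which each exact observation contributes the log density (log hazard plus log reliability) and each interval-censored observation contributes the log probability of its interval. This is an illustration under stated assumptions rather than the authors' expression: the function names, the data layout, and the conventions `left = 0` for left-censoring and `right = math.inf` for right-censoring are hypothetical.

```python
import math


def surv(t, a, b, tau):
    """One-cutpoint reliability function: hazard a up to tau, b afterwards."""
    if t <= tau:
        return math.exp(-a * t)
    return math.exp(-a * tau - b * (t - tau))


def log_lik(exact, intervals, a, b, tau):
    """Log-likelihood of PIC data under the one-cutpoint model (sketch)."""
    ll = 0.0
    for t in exact:
        hazard = a if t <= tau else b
        # log f(t) = log h(t) + log S(t)
        ll += math.log(hazard) + math.log(surv(t, a, b, tau))
    for left, right in intervals:
        s_right = 0.0 if math.isinf(right) else surv(right, a, b, tau)
        # probability that the failure falls in (left, right]
        ll += math.log(surv(left, a, b, tau) - s_right)
    return ll
```

A function of this kind could be passed to any numerical optimizer, including a Newton-Raphson iteration, to obtain estimates of the two hazards.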
The configurations with one cutpoint are the early hazard difference and the late hazard difference. To test the hypothesis under early hazards, we need to test $H_0: a_1 = a_2 = a$ against $H_1: a_1 \neq a_2$, and to test the hypothesis under late hazards, we need to test $H_0: b_1 = b_2 = b$ against $H_1: b_1 \neq b_2$. Because the likelihood function is complex, we develop the parametric tests by using a multiple imputation technique. Our imputation technique is to compute exact failure time data from the PIC data and then apply the ST or LRT for exact data. The imputation scheme that we use is Rubin's MI [16], and the MI procedure is as follows.
Step 1. For each interval-censored observation $(L_i, R_i]$, impute an exact failure time from the conditional probability function of $T_i$ given that $T_i \in (L_i, R_i]$.
Step 2. Combine the exact data imputed from the conditional probability function in Step 1 with the exact data in the original data set. This gives an exact data set $\{t_i,\ i = 1, \ldots, n_1;\ t_i^{*},\ i = n_1 + 1, \ldots, n\}$, where $t_i^{*}$ denotes an imputed failure time.
Step 3. Apply the ST or LRT to the exact data.
To apply the ST, the score is $U(\theta)$, where $U(\theta) = \partial \log L(\theta) / \partial \theta$. The score test is based on the fact that, under the null hypothesis, the score is asymptotically normally distributed with mean zero and variance $I(\theta)$, where $I(\theta) = -\partial^2 \log L(\theta) / \partial \theta\, \partial \theta^{\top}$ is the observed information matrix. So under the null hypothesis $H_0$, the ST statistic is $U(\hat\theta_0)^{\top} I(\hat\theta_0)^{-1} U(\hat\theta_0)$ and has a chi-square distribution with $\nu$ degrees of freedom, where $\nu$ is the number of restrictions imposed by the null hypothesis and $\hat\theta_0$ is the maximum likelihood estimate of $\theta$ under the null hypothesis $H_0$.
For the LRT, the statistic is $\Lambda = 2\{\log L(\hat\theta) - \log L(\hat\theta_0)\}$, and it has asymptotically a chi-square distribution $\chi^2(\nu)$ with $\nu$ degrees of freedom under $H_0$, where $\nu$ is the number of restrictions imposed by the null hypothesis. See Kalbfleisch and Prentice [3].
Step 4. Repeat Steps 1 to 3 $M$ times to obtain $M$ values of the score or of the LRT statistic.
Step 5. Consider the following.
(a) For the ST, let $\bar U = \sum_{m=1}^{M} (U_m / M)$ be the average of the scores over the $M$ imputed data sets. The covariance matrix $\Sigma$ of $\bar U$ is the sum of two components. The first is the average within-imputation covariance associated with $U$, and the second is the between-imputation variance of $U$. Hence, the suggested test statistic for comparing two groups of systems is $\bar U^{\top} \Sigma^{-1} \bar U$. Our simulation results indicate that the test statistic has approximately a chi-square distribution with one degree of freedom, $\chi^2(1)$, under the null hypothesis.
(b) For the LRT, let $\bar\Lambda = \sum_{m=1}^{M} (\Lambda_m / M)$ be the test statistic for comparing two groups of systems, where $\Lambda_m$ is the LRT statistic computed from the $m$th imputed data set. Our simulation results indicate that the test statistic has approximately a chi-square distribution with one degree of freedom, $\chi^2(1)$, under the null hypothesis.
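The imputation steps and the averaged LRT in (b) can be sketched as follows for the early hazard difference with one cutpoint. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the hazards used for imputation are supplied by the caller (for example, estimates from the PIC likelihood), and the LRT for exact data uses the closed-form estimate "events divided by exposure" for a piecewise constant hazard. Because the late-interval hazard is common under both hypotheses, its terms cancel and only the first interval enters the statistic.

```python
import math
import random


def surv(t, a, b, tau):
    """One-cutpoint reliability function: hazard a up to tau, b afterwards."""
    return math.exp(-a * t) if t <= tau else math.exp(-a * tau - b * (t - tau))


def inv_surv(s, a, b, tau):
    """Return t such that surv(t) = s, for 0 < s <= 1."""
    h = -math.log(s)                       # cumulative hazard at t
    return h / a if h <= a * tau else tau + (h - a * tau) / b


def impute(left, right, a, b, tau, rng):
    """Draw T from its conditional distribution given left < T <= right."""
    s_left = surv(left, a, b, tau)
    s_right = 0.0 if math.isinf(right) else surv(right, a, b, tau)
    s = s_left - rng.random() * (s_left - s_right)   # S(T) is uniform here
    return inv_surv(s, a, b, tau)


def early_counts(times, tau):
    """Number of failures and total exposure time in the interval (0, tau]."""
    return sum(1 for t in times if t <= tau), sum(min(t, tau) for t in times)


def xlogy(d, e):
    """d * log(d / e), with the convention 0 * log(0) = 0."""
    return d * math.log(d / e) if d > 0 else 0.0


def lrt_early(times1, times2, tau):
    """LRT of a_1 = a_2 for exact data with a common late hazard."""
    d1, e1 = early_counts(times1, tau)
    d2, e2 = early_counts(times2, tau)
    return 2.0 * (xlogy(d1, e1) + xlogy(d2, e2) - xlogy(d1 + d2, e1 + e2))


def mi_lrt(exact1, cens1, theta1, exact2, cens2, theta2, tau, M=10, seed=0):
    """Average the exact-data LRT over M imputed data sets."""
    rng = random.Random(seed)
    stats = []
    for _ in range(M):
        full1 = exact1 + [impute(l, r, *theta1, tau, rng) for l, r in cens1]
        full2 = exact2 + [impute(l, r, *theta2, tau, rng) for l, r in cens2]
        stats.append(lrt_early(full1, full2, tau))
    return sum(stats) / M
```

Here `cens1` and `cens2` are lists of `(left, right)` pairs and `theta1`, `theta2` are `(a, b)` pairs; the default `M=10` mirrors the number of imputations used in the simulation study.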
Second, suppose that the piecewise exponential distribution has two cutpoints, so that the reliability function $S(t, \theta)$ is as in (5) with $\theta = (a, b, c)$. Under this assumption, substitution in (1) gives the likelihood function in (22), which is again expressed through the numbers of items that satisfy the corresponding conditions on the observations. By using the Newton-Raphson method, we can obtain the maximum likelihood estimate $\hat\theta = (\hat a, \hat b, \hat c)$ of $\theta$, so that the parametric estimate of $S(t)$ is $\hat S(t)$.
To develop the parametric test, suppose that we have two groups of systems generated from piecewise exponential distributions with parameters $\theta_j = (a_j, b_j, c_j)$, $j = 1, 2$, and that our goal is to test the hypothesis $H_0: S_1(t) = S_2(t)$. The configuration with two cutpoints is the crossing hazard. To test the hypothesis under crossing hazards, we test the null hypothesis of equal hazard parameters in the two systems against the crossing-hazard alternative. Because the likelihood function is complex, we develop the parametric test by using the MI technique as described before. The test statistic for comparing two groups of systems using the ST is constructed as in Step 5(a). Our simulation results indicate that the test statistic has approximately a chi-square distribution with two degrees of freedom, $\chi^2(2)$, under the null hypothesis. The test statistic for comparing two groups of systems using the LRT is constructed as in Step 5(b). Our simulation results indicate that the test statistic has approximately a chi-square distribution with two degrees of freedom, $\chi^2(2)$, under the null hypothesis.

Simulation Study
To examine the accuracy of the parametric tests under nonproportional hazards, we performed a simulation study in which we compared two samples under a nonproportional hazards model. We assume that the failure times are generated from the piecewise exponential distribution. In particular, the configurations considered for the relation between the two survival functions are (1) early hazard difference, (2) late hazard difference, and (3) crossing hazard, where $h_1(t)$ is the hazard function for system 1 and $h_2(t)$ is the hazard function for system 2. Furthermore, there are $n_{11} + n_{12}$ observations from system 1 and $n_{21} + n_{22}$ observations from system 2, where $n_{j1}$ is the number of exact observations and $n_{j2}$ is the number of interval-censored observations, $j = 1, 2$. Additionally, we take the total sample size of the two systems to be $n = 50, 100, 200$.
To generate interval-censored data, we constructed a set of follow-up studies with prespecified inspection times for the study items. We assume that each item is scheduled to be observed at 11 follow-up times $t_j = 1 + 1.5(j - 1)$, $j = 1, 2, \ldots, 11$, but the examiner may be absent from a follow-up with probability $p$, where $0 \le p \le 1$, except at the first follow-up. The interval-censored time for an item is then defined as the shortest interval between two successful inspection times that covers the generated true failure time. The left endpoint of the interval is set to zero when the true failure time is less than the first successful inspection time, and the right endpoint is set to infinity when the true failure time is greater than the final successful inspection time. A larger absence probability $p$ yields interval-censored observations with longer intervals. The results reported are based on 1000 replications, and the number of imputations is 10.
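The data-generating scheme just described can be sketched as follows. The inspection schedule and the absence mechanism follow the description above; the function names, the one-cutpoint failure time generator, and the rule that each item is observed exactly with a fixed probability `exact_frac` are assumptions of this sketch, since the mechanism that decides which items are exact is not spelled out.

```python
import math
import random


def inspection_times(k=11):
    """Scheduled follow-up times t_j = 1 + 1.5 (j - 1), j = 1, ..., k."""
    return [1 + 1.5 * j for j in range(k)]


def censor_interval(t_true, times, p, rng):
    """Shortest interval between attended inspections that covers t_true."""
    # The first follow-up is always attended; later ones are missed w.p. p.
    attended = [times[0]] + [t for t in times[1:] if rng.random() >= p]
    left, right = 0.0, math.inf
    for t in attended:
        if t < t_true:
            left = t          # last attended inspection before the failure
        else:
            right = t         # first attended inspection at or after it
            break
    return left, right


def simulate_pic(n, a, b, tau, p, exact_frac, seed=0):
    """Generate exact times and (left, right] intervals for n items."""
    rng = random.Random(seed)
    times = inspection_times()
    exact, censored = [], []
    for _ in range(n):
        h = rng.expovariate(1.0)   # cumulative hazard at the failure time
        t = h / a if h <= a * tau else tau + (h - a * tau) / b
        if rng.random() < exact_frac:
            exact.append(t)
        else:
            censored.append(censor_interval(t, times, p, rng))
    return exact, censored
```

With `p = 0` every inspection is attended and the intervals have length at most 1.5; as `p` grows, more inspections are missed and the intervals lengthen, which matches the behavior described above.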

Results and Discussion
Table 1 displays the estimated power and size of both tests at the significance levels 0.05 and 0.01, based on simulated data with 30% exact observations, for different sample sizes and different values of $p$. As presented in Table 1, the power of both tests increases as the sample size increases.
For the early difference and late difference configurations, the size of the test is reasonable for both tests, but the ST is more powerful than the LRT in detecting the early difference, while the LRT is more powerful than the ST in detecting the late difference. For the cross difference configuration, the size of the test is reasonable only for the ST, except for the case $n = 50$ and $p = 0.1$. Table 2 displays the estimated power and size of both tests at the significance levels 0.05 and 0.01, based on simulated data with 50% exact observations, for different sample sizes and different values of $p$. From Table 2, we find that the power of the tests increases when more exact data are available. Also, the size of the test is reasonable for both tests in all three configurations: early difference, late difference, and cross difference. The ST is more powerful than the LRT in detecting the early difference and the cross difference, while the LRT is more powerful than the ST in detecting the late difference.
Both tables reveal that the size of the test is reasonable at the 0.05 significance level for all situations considered, except for the LRT in detecting the cross difference. Also, both tables reveal that the size of the test is quite reasonable at the 0.01 significance level for all situations considered except at It should be noted that the conservative type I error rates (size of test) for the ST and the anticonservative type I error rates for the LRT may be due to the construction of the test statistics.

An Example
Consider the series and parallel systems data presented by Guo et al. [17]. We made several modifications to these data to suit the new tests: several interval-censored observations were added, generated randomly within the range of the given data. The data sets are shown in Table 3. They consist of 75 systems, of which 40 are series systems and 35 are parallel systems. The aim of this analysis is to compare the two groups of systems with respect to failure time. To compare the reliability of the two systems, we suppose that the data follow a piecewise exponential distribution with one cutpoint at 250. The parameter estimates obtained are ($a_1 = 0.005$, $b_1 = 0.004$) for the series systems and ($a_2 = 0.002$, $b_2 = 0.004$) for the parallel systems. This means that the data exhibit an early hazard difference, so we applied the ST and LRT for PIC data under the early hazard difference. The values of the ST and LRT statistics are 4.797 and 4.855, with $p$-values of 0.0285 and 0.0276, respectively. These results indicate that there is a significant difference between the two groups of systems at the 0.05 significance level. We use the parameter estimates to obtain the estimated reliability functions for the two groups of systems, as shown in Figure 1. It is interesting to note from Figure 1 that the failure rates of the series and parallel systems differ during the earlier examinations and are similar mainly in the later examinations.
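As a quick check of the reported $p$-values, the upper tail of a chi-square distribution with one degree of freedom can be computed from the complementary error function, since a $\chi^2(1)$ variable is the square of a standard normal variable. The sketch below assumes the $\chi^2(1)$ reference distribution stated for the one-cutpoint tests; the function name is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper tail probability P(X > x) for X ~ chi-square with 1 d.f."""
    # P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2.0))


p_st = chi2_sf_1df(4.797)    # score test statistic from the example
p_lrt = chi2_sf_1df(4.855)   # likelihood ratio statistic from the example
```

Both values agree with the reported 0.0285 and 0.0276 to the precision shown.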

Concluding Remarks
In this paper, we have proposed parametric tests (ST and LRT) for PIC failure time data under nonproportional hazards for comparing two reliability functions. A simulation study was conducted to examine and compare the performance of the parametric tests under nonproportional hazards for different sample sizes and different values of the absence probability $p$. The simulation results show that the parametric tests work well under the situations considered. Furthermore, the ST is more powerful than the LRT in detecting the early difference and the cross difference, while the LRT is more powerful than the ST in detecting the late difference.
It should be mentioned that sometimes in engineering studies the examiner has some prior knowledge about the possible differences between two systems that can be used to determine which type of configurations of nonproportional hazards must be used.

Appendix
Derivatives are required in the Newton-Raphson method to obtain the maximum likelihood estimates of the piecewise exponential distribution parameters. The derivatives needed are those of the log-likelihood function with one cutpoint in (11) and of the log-likelihood function with two cutpoints in (22).

Figure 1: Reliability function of failure time under piecewise exponential distribution.

Table 1: Estimated power and size of test with 30% exact observations based on simulated data with 10 multiple imputations and 1000 replications at the significance level 0.05 (0.01).

Table 2: Estimated power and size of test with 50% exact observations based on simulated data with 10 multiple imputations and 1000 replications at the significance level 0.05 (0.01).

Table 3: Failure time of two groups of systems.