Parametric Regression Models Using Reversed Hazard Rates

Proportional hazard regression models are widely used in survival analysis to understand and exploit the relationship between survival time and covariates. For left censored survival times, reversed hazard rate functions are more appropriate. In this paper, we develop a parametric proportional reversed hazard rates model using an inverted Weibull distribution. The estimation of the parameters and the construction of confidence intervals are discussed. We assess the performance of the proposed procedure through a large number of Monte Carlo simulations and illustrate the method with a real data example.


Introduction
In survival studies, covariates or explanatory variables are usually employed to represent heterogeneity in a population. The main objective in such situations is to understand and exploit the relationship between lifetime and covariates. Regression models are useful in such contexts to assess the effect of covariates on lifetime. These models can be formulated in many ways, and several types are in common use. Parametric regression models for lifetime involve specification of the distribution of a lifetime $T$ given a vector of covariates $z$. The most commonly used parametric model is the Weibull regression model, under which the hazard rate functions of the lifetimes of two subjects are proportional. The maximum likelihood technique is usually employed to estimate the parameters of the model. For more properties and applications of parametric regression models, one should refer to Lawless [1].
In survival studies, there are many occasions where lifetime data are left censored. For example, baboons in the Amboseli Reserve, Kenya, sleep in the trees and descend for foraging at certain times of the day. Observers often arrive later in the day, after this descent has occurred, and on such days they can only ascertain that the descent took place before a particular time, so that the descent times are left censored (see [2]). On such occasions, a reversed hazard rate is more appropriate than a hazard rate for analyzing lifetime data, because estimators of hazard rates are unstable when data are left censored. The reversed hazard rate of $T$ is defined as
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P(t - \Delta t < T \le t \mid T \le t)}{\Delta t}. \tag{1}$$
Introduced by Barlow et al. [3], the function $\lambda(t)$ has been used in various contexts such as the estimation of the distribution function under left censoring [1], defining a new stochastic order [4], characterization of lifetime distributions [5][6][7], studying ageing behavior [8,9], evolving new repair and maintenance strategies [10,11], the mixed proportional hazards model [12], and the stress hybrid hazards model [13].
Recently, Sengupta and Nanda [14] introduced the proportional reversed hazards model in a semiparametric setup. In the present work, we introduce a fully parametric regression model that satisfies the proportional reversed hazards property. The inverted Weibull distribution is employed as the lifetime model, and the approach can be extended to other parametric models. Simulation studies indicate that the proposed approach performs well.
The rest of the paper is organized as follows. In Section 2, we introduce a parametric regression model using an inverted Weibull distribution. The proposed model has the property that the reversed hazard rates for the lifetimes of a pair of subjects are proportional. The estimation of the parameters of the model is discussed in Section 3. Simulation studies are conducted in Section 4 to assess the finite sample behavior of the estimators. The proposed model is applied to real life data in Section 5 to illustrate its utility. Finally, Section 6 provides the major conclusions of the study.

Statistical Model
Let $T$ be a nonnegative random variable representing the lifetime of a subject, with distribution function $F(t)$. Assume that the probability density function of $T$, $f(t)$, exists. The reversed hazard rate of $T$ given in (1) can be written as
$$\lambda(t) = \frac{f(t)}{F(t)} = \frac{d}{dt} \log F(t).$$
Let $z$ be a $p \times 1$ vector of auxiliary information, which may be time dependent. The proportional reversed hazard (PRH) model is defined by
$$\lambda(t \mid z) = \lambda_0(t)\, \phi(z; \beta), \tag{2}$$
where $\lambda_0(t)$ is the baseline reversed hazard rate, $\phi(\cdot)$ is a nonnegative function of $z$ and $\beta$, $\beta$ is a $p \times 1$ vector of regression parameters, and $\lambda(t \mid z)$ is the reversed hazard rate of $T$ given the covariate $z$. The PRH model can be expressed in terms of the distribution function as
$$F(t \mid z) = [F_0(t)]^{\phi(z; \beta)},$$
where $F(t \mid z)$ is the distribution function of $T$ given $z$ and $F_0(t)$ is the baseline distribution function in the absence of covariates. It should be noted that, for two subjects with covariates $z_1$ and $z_2$, the ratio of reversed hazard rates, $\phi(z_1; \beta)/\phi(z_2; \beta)$, is independent of the time $t$.
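The time independence of the ratio can be checked numerically. The sketch below assumes an inverted Weibull baseline $F_0(t) = \exp(-\gamma t^{-\alpha})$, the link $\phi(z; \beta) = e^{\beta z}$, and a single covariate; the function names and parameter values are illustrative choices, not taken from the paper.

```python
import math

def log_cdf(t, z, alpha, gamma, beta):
    """log F(t | z) for F(t | z) = F0(t) ** exp(beta * z),
    with the assumed baseline F0(t) = exp(-gamma * t ** (-alpha))."""
    return -gamma * t ** (-alpha) * math.exp(beta * z)

def reversed_hazard(t, z, alpha, gamma, beta, h=1e-6):
    """Reversed hazard as d/dt log F(t | z), by central difference."""
    return (log_cdf(t + h, z, alpha, gamma, beta)
            - log_cdf(t - h, z, alpha, gamma, beta)) / (2 * h)

alpha, gamma, beta = 1.5, 2.0, 0.5   # illustrative parameter values
z1, z2 = 1.0, 0.0                    # covariates of two hypothetical subjects

ratios = [reversed_hazard(t, z1, alpha, gamma, beta)
          / reversed_hazard(t, z2, alpha, gamma, beta)
          for t in (0.5, 1.0, 2.0, 5.0)]
print(ratios)  # every entry should be close to exp(beta * (z1 - z2))
```

Whatever time point is used, the ratio stays at $e^{\beta(z_1 - z_2)}$, which is the proportionality property stated above.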
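Semiparametric analysis of the model (2) was recently discussed by Sengupta and Nanda [14]. Our objective here is to carry out a parametric analysis based on an inverted Weibull distribution under left censoring. When the lifetime random variable follows the inverted Weibull distribution, the baseline distribution function is given by
$$F_0(t) = \exp\left(-\gamma t^{-\alpha}\right), \quad t > 0,\ \alpha > 0,\ \gamma > 0.$$
The baseline reversed hazard rate of $T$ is then obtained as
$$\lambda_0(t) = \alpha \gamma\, t^{-(\alpha + 1)}.$$
Note that the baseline reversed hazard rate is decreasing as $t$ increases. In the presence of the covariate $z$, we have
$$\lambda(t \mid z) = \alpha \gamma\, t^{-(\alpha + 1)} \phi(z; \beta).$$
We assume that $\phi(z; \beta) = e^{\beta' z}$ so that
$$\lambda(t \mid z) = \alpha \gamma\, t^{-(\alpha + 1)} e^{\beta' z}$$
with
$$F(t \mid z) = \exp\left(-\gamma t^{-\alpha} e^{\beta' z}\right).$$
Suppose that the lifetime random variable $T$ is randomly left censored by $C$. In practice, one observes the vectors $(X, \delta, z)$, where $X = \max(T, C)$ and $\delta = I(T = X)$, with $I(\cdot)$ being the indicator function. Let $(X_i, \delta_i, z_i)$, $i = 1, 2, \ldots, n$, be i.i.d. copies of $(X, \delta, z)$, with $x_i$ the observed value of $X_i$. Then the likelihood function can be written as
$$L = \prod_{i=1}^{n} \left[f(x_i \mid z_i)\right]^{\delta_i} \left[F(x_i \mid z_i)\right]^{1 - \delta_i} = \prod_{i=1}^{n} \left[\lambda(x_i \mid z_i)\right]^{\delta_i} F(x_i \mid z_i). \tag{10}$$
Under the inverted Weibull distribution assumption, the likelihood function given in (10) is obtained as
$$L = \prod_{i=1}^{n} \left[\alpha \gamma\, x_i^{-(\alpha + 1)} e^{\beta' z_i}\right]^{\delta_i} \exp\left(-\gamma x_i^{-\alpha} e^{\beta' z_i}\right)$$
so that the log likelihood function is
$$\ell = c + \sum_{i=1}^{n} \delta_i \left(\log \alpha + \log \gamma - \alpha \log x_i + \beta' z_i\right) - \gamma \sum_{i=1}^{n} x_i^{-\alpha} e^{\beta' z_i}, \tag{12}$$
where $c = -\sum_{i=1}^{n} \delta_i \log x_i$ is a real constant independent of $\alpha$, $\beta$, and $\gamma$. We maximize (12) to estimate the parameters $\alpha$, $\beta$, and $\gamma$ by equating the partial derivatives with respect to each parameter to zero as
$$\begin{aligned}
\frac{\partial \ell}{\partial \alpha} &= \sum_{i=1}^{n} \delta_i \left(\frac{1}{\alpha} - \log x_i\right) + \gamma \sum_{i=1}^{n} x_i^{-\alpha} \log x_i\, e^{\beta' z_i} = 0, \\
\frac{\partial \ell}{\partial \beta} &= \sum_{i=1}^{n} \delta_i z_i - \gamma \sum_{i=1}^{n} z_i\, x_i^{-\alpha} e^{\beta' z_i} = 0, \\
\frac{\partial \ell}{\partial \gamma} &= \frac{1}{\gamma} \sum_{i=1}^{n} \delta_i - \sum_{i=1}^{n} x_i^{-\alpha} e^{\beta' z_i} = 0.
\end{aligned} \tag{13}$$
Since there is no closed form solution available for (13), we use numerical methods to estimate the parameters. The observed information matrix is given by
$$I(\beta, \alpha, \gamma) = -\begin{pmatrix}
\dfrac{\partial^2 \ell}{\partial \beta\, \partial \beta'} & \dfrac{\partial^2 \ell}{\partial \beta\, \partial \alpha} & \dfrac{\partial^2 \ell}{\partial \beta\, \partial \gamma} \\
\dfrac{\partial^2 \ell}{\partial \alpha\, \partial \beta'} & \dfrac{\partial^2 \ell}{\partial \alpha^2} & \dfrac{\partial^2 \ell}{\partial \alpha\, \partial \gamma} \\
\dfrac{\partial^2 \ell}{\partial \gamma\, \partial \beta'} & \dfrac{\partial^2 \ell}{\partial \gamma\, \partial \alpha} & \dfrac{\partial^2 \ell}{\partial \gamma^2}
\end{pmatrix}. \tag{14}$$
Note that the matrix (14) is of order $(p + 2) \times (p + 2)$. Under the standard regularity conditions, the vector of estimates $(\hat{\beta}, \hat{\alpha}, \hat{\gamma})$ is asymptotically $(p + 2)$-variate normal with mean vector $(\beta, \alpha, \gamma)$ and dispersion matrix $I^{*-1}$, where $I^{*}$ is the Fisher information matrix obtained from $I$ by taking the expected value of each entry.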
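As a sanity check on the score equations, the following sketch codes the log likelihood and its partial derivatives for a single covariate and compares the analytic derivatives with finite differences. It assumes the log likelihood has the form $\sum_i \delta_i(\log\alpha + \log\gamma - \alpha\log x_i + \beta z_i) - \gamma\sum_i x_i^{-\alpha}e^{\beta z_i}$ up to an additive constant; the data and the evaluation point are made up for illustration.

```python
import math

def log_lik(alpha, beta, gamma, x, delta, z):
    """Log likelihood without the additive constant, one covariate."""
    ll = 0.0
    for xi, di, zi in zip(x, delta, z):
        ll += di * (math.log(alpha) + math.log(gamma)
                    - alpha * math.log(xi) + beta * zi)
        ll -= gamma * xi ** (-alpha) * math.exp(beta * zi)
    return ll

def score(alpha, beta, gamma, x, delta, z):
    """Analytic partial derivatives with respect to (alpha, beta, gamma)."""
    d_alpha = d_beta = d_gamma = 0.0
    for xi, di, zi in zip(x, delta, z):
        w = xi ** (-alpha) * math.exp(beta * zi)
        d_alpha += di * (1.0 / alpha - math.log(xi)) + gamma * w * math.log(xi)
        d_beta += di * zi - gamma * zi * w
        d_gamma += di / gamma - w
    return d_alpha, d_beta, d_gamma

# Made-up toy data: observed times, censoring indicators, covariate values.
x = [1.2, 0.8, 2.5, 3.1, 1.7]
delta = [1, 0, 1, 1, 0]
z = [0.1, 0.9, 0.4, 0.7, 0.3]

theta = (1.5, 0.5, 2.0)  # arbitrary point (alpha, beta, gamma)
h = 1e-6
numeric = []
for j in range(3):
    up, dn = list(theta), list(theta)
    up[j] += h
    dn[j] -= h
    numeric.append((log_lik(*up, x, delta, z) - log_lik(*dn, x, delta, z)) / (2 * h))

analytic = score(*theta, x, delta, z)
print(analytic)
print(numeric)
```

Agreement between the two printed vectors indicates that the derivative expressions match the coded log likelihood; it does not by itself validate the model.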
Several algorithms are available to estimate the parameters, either by solving the score equations or by directly optimizing the likelihood function. The Newton-Raphson method is the most commonly used, since the derivatives of the score equations are easy to determine. In this iterative numerical method, the initial values play a vital role because of the logarithmic terms in the log likelihood. In the simulation studies given in Section 4, we use the simplex method proposed by Nelder and Mead [15] to estimate the parameters. The simplex method maximizes the likelihood function directly and does not require the derivatives of the function being optimized.
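A minimal sketch of derivative-free fitting is given below, using the Nelder-Mead option of `scipy.optimize.minimize` on simulated data. It assumes the inverted Weibull PRH log likelihood with one covariate; optimizing $\log\alpha$ and $\log\gamma$ to keep both parameters positive is an implementation choice of this sketch, and the simulation settings are illustrative rather than those of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, x, delta, z):
    """Negative log likelihood; alpha and gamma enter on the log scale."""
    log_alpha, beta, log_gamma = params
    alpha, gamma = np.exp(log_alpha), np.exp(log_gamma)
    w = x ** (-alpha) * np.exp(beta * z)
    ll = np.sum(delta * (log_alpha + log_gamma - alpha * np.log(x) + beta * z))
    return -(ll - gamma * np.sum(w))

# Simulated left censored data with illustrative true values.
rng = np.random.default_rng(0)
n, alpha0, beta0, gamma0 = 250, 1.5, 0.5, 1.0
z = rng.uniform(0.0, 1.0, n)
u = rng.uniform(0.0, 1.0, n)
t = (gamma0 * np.exp(beta0 * z) / (-np.log(u))) ** (1.0 / alpha0)  # inverse CDF
c = rng.exponential(0.6, n)       # censoring times; scale picked by hand
x = np.maximum(t, c)              # observed time under left censoring
delta = (t >= c).astype(float)    # 1 if the lifetime itself is observed

start = np.zeros(3)               # alpha = 1, beta = 0, gamma = 1
fit = minimize(neg_log_lik, start, args=(x, delta, z), method="Nelder-Mead")
alpha_hat, beta_hat, gamma_hat = np.exp(fit.x[0]), fit.x[1], np.exp(fit.x[2])
print(alpha_hat, beta_hat, gamma_hat, fit.fun)
```

Rerunning the fit from several starting points and keeping the solution with the largest likelihood is a simple guard against a poor initial simplex.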

Testing and Confidence Intervals for 𝛽
Tests and interval estimates of parameters can be derived by the likelihood ratio test procedure. We are mainly interested in the regression parameter $\beta$, while the parameters $\theta = (\alpha, \gamma)$ are normally considered as nuisance parameters.
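Let the $p$-vector of regression parameters be partitioned as $\beta = (\beta_1, \beta_2)$, where $\beta_1$ and $\beta_2$ are vectors of sizes $k$ and $p - k$, respectively, and let $\theta$ denote the other parameters in the model. We are interested in testing
$$H_0: \beta_1 = \beta_1^0,$$
where $\beta_1^0$ is the specified value of the regression parameter. To test $H_0$, we construct the likelihood ratio statistic
$$\Lambda^{*} = 2\left[\ell(\hat{\beta}_1, \hat{\beta}_2, \hat{\theta}) - \ell(\beta_1^0, \tilde{\beta}_2, \tilde{\theta})\right],$$
where $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\theta}$ are the maximum likelihood estimates under the full model and $\tilde{\beta}_2$ and $\tilde{\theta}$ are the maximum likelihood estimates under $H_0$. For large $n$, $\Lambda^{*}$ follows the $\chi^2_k$ distribution under the null hypothesis. Alternatively, we can use the test statistic
$$\Lambda_1 = (\hat{\beta}_1 - \beta_1^0)'\, V_{11}^{-1}\, (\hat{\beta}_1 - \beta_1^0),$$
where $V_{11}$ can be obtained from $V = I(\hat{\beta}, \hat{\theta})^{-1}$, which is partitioned as
$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix},$$
with $V_{11}$ the $k \times k$ block corresponding to $\beta_1$. Under the null hypothesis, $\Lambda_1$ follows the $\chi^2_k$ distribution for large $n$. Assuming asymptotic normality, we can construct the $100(1 - \alpha)\%$ confidence interval for the individual regression parameter $\beta_j$ as
$$\hat{\beta}_j \pm z_{\alpha/2}\, \text{s.e.}(\hat{\beta}_j),$$
where $\text{s.e.}(\hat{\beta}_j)$ is the square root of the corresponding diagonal entry of $V_{11}$ (here $\alpha$ denotes the significance level, not the shape parameter of the model).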
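For a single regression coefficient ($k = 1$), both the chi-square tail probability and the normal-theory interval can be computed with the standard library alone. The sketch below uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$; the helper names are hypothetical, the statistic 0.0074 is the likelihood ratio value reported in the example section, and the estimate and standard error passed to the interval function are arbitrary numbers.

```python
import math
from statistics import NormalDist

def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with 1 degree of
    freedom, using P(chi2_1 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

def wald_interval(estimate, std_err, level=0.95):
    """Two-sided normal-theory interval: estimate +/- z * std_err."""
    z_quantile = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z_quantile * std_err, estimate + z_quantile * std_err

p_value = chi2_sf_1df(0.0074)
print(round(p_value, 2))  # 0.93

# Arbitrary estimate and standard error, for illustration only.
lower, upper = wald_interval(0.5, 0.1)
print(lower, upper)
```

For $k > 1$ a general chi-square tail function is needed, which a statistics library would normally provide.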
Another important problem is the selection of important covariates in proportional reversed hazard models. Since we assume a parametric model, we can use variable selection criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). To check the adequacy of the parametric model, Cox-Snell residuals can be used, as illustrated in Section 5 with a case example.
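Both criteria are simple functions of the maximized log likelihood. The sketch below uses the usual definitions, AIC $= 2k - 2\ell$ and BIC $= k\log n - 2\ell$, with $k$ the number of estimated parameters; the log likelihood values and sample size are hypothetical placeholders, not results from the paper.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 * log likelihood."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k * log(n) - 2 * log likelihood."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical maximized log likelihoods and parameter counts.
candidates = {
    "baseline only (alpha, gamma)": (-120.4, 2),
    "with one covariate (alpha, beta, gamma)": (-120.0, 3),
}
n_obs = 100
for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 2), round(bic(ll, k, n_obs), 2))
```

The model with the smaller criterion value is preferred; BIC penalizes additional parameters more heavily than AIC once $\log n > 2$.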

Performance Analysis
To assess the performance of the proposed method, we carried out a large number of simulations. We generated samples of size 100 from an inverted Weibull distribution, with parameter values $\alpha$ = (0.5, 1, 1.5, 2) and $\gamma$ = (0.5, 1, 1.5, 2). We considered a single covariate generated from Uniform(0, 1), with the regression parameter taking the values $\beta$ = (0.5, 1, 1.5, 2). The censoring times were generated from an exponential distribution, with its parameter chosen so that between 10 and 20 percent of the observations are censored. We used the simplex method proposed by Nelder and Mead [15] to estimate the parameters. We repeated the study 10,000 times and computed the mean and standard deviation of the parameter estimates. The entire study was repeated for a sample size of 250. The summary of the parameter estimates is given in Table 1.
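One way to generate such samples is inverse transform sampling. Assuming $F(t \mid z) = \exp(-\gamma t^{-\alpha} e^{\beta z})$, setting $F(t \mid z) = u$ gives $t = \{\gamma e^{\beta z} / (-\log u)\}^{1/\alpha}$. The sketch below is a hypothetical data generator along these lines; the function names, the `censor_mean` argument, and the chosen values are assumptions for illustration, not the settings of the study.

```python
import math
import random

def quantile(u, z, alpha, beta, gamma):
    """Inverse of F(t | z) = exp(-gamma * t**(-alpha) * exp(beta * z))."""
    return (gamma * math.exp(beta * z) / (-math.log(u))) ** (1.0 / alpha)

def simulate(n, alpha, beta, gamma, censor_mean, seed=None):
    """Return a list of (x, delta, z) triples under left censoring."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random()                 # Uniform(0, 1) covariate
        u = rng.random()
        while u == 0.0:                  # avoid log(0)
            u = rng.random()
        t = quantile(u, z, alpha, beta, gamma)
        c = rng.expovariate(1.0 / censor_mean)   # exponential censoring time
        rows.append((max(t, c), 1 if t >= c else 0, z))
    return rows

sample = simulate(100, alpha=1.5, beta=0.5, gamma=1.0, censor_mean=0.6, seed=1)
censored_fraction = 1 - sum(row[1] for row in sample) / len(sample)
print(censored_fraction)
```

In practice the censoring mean would be tuned, for each parameter combination, until the printed fraction falls in the target range.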
From Table 1 we can see that the means of the parameter estimates based on 10,000 simulations are very close to the true parameter values and that the standard deviations are small. As the sample size increases, both the standard error and the bias of the estimates decrease. It should be noted that there is a slight positive bias in all cases, even though it is negligibly small. Since there are no comparable models based on reversed hazard rates, we did not perform any comparison studies.

An Example
We consider an extract of left censored data from an Australian twin study given in Duffy et al. [16]. The data consist of information on the age at appendectomy of monozygotic and dizygotic twins. For some observations the age at onset is missing, and therefore the data are left censored. The individuals with age at onset of less than 11 are left censored. The covariate, namely, Zygocity, takes values from 1 to 6. This data set consists of 54 observations, of which 15 are left censored. We use these data to illustrate the utility of the parametric reversed hazard rate model. A probability plot and a statistical test supported the inverted Weibull distribution for these data. We use the simplex method to estimate the parameters. Since the parameter values are unknown, and to avoid the effect of inappropriate initial values, we consider different initial values and choose the estimates that give the maximum likelihood. The estimates of the parameters are $\hat{\alpha} = 2.3940$, $\hat{\beta} = -0.0142$, and $\hat{\gamma} = 444.0586$. The 95% confidence interval for $\beta$ indicates that the regression coefficient corresponding to Zygocity is not significantly different from zero; that is, the effect of Zygocity is negligible. This conclusion is also supported by the likelihood ratio test statistic value of 0.0074, which has a $p$ value of 0.93. We use a Cox-Snell residual plot to assess the goodness of fit. The Cox-Snell residual is computed from the fitted model; for a reversed hazard model with left censored data the natural form is the estimated cumulative reversed hazard,
$$r_i = -\log \hat{F}(x_i \mid z_i) = \hat{\gamma}\, x_i^{-\hat{\alpha}} e^{\hat{\beta}' z_i}.$$
If the model fits the data, then the residuals should have a standard exponential distribution, so that a plot of the Nelson-Aalen estimator of the cumulative hazard of the residuals against the residuals will be a straight line with slope one. A plot of Cox-Snell residuals against the Nelson-Aalen estimates of the cumulative hazard rate of residuals is given in Figure 1, which shows that the fit is reasonably good.
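The residual check can be sketched as follows. The code assumes that the residual is $r_i = -\log\hat{F}(x_i \mid z_i) = \hat{\gamma} x_i^{-\hat{\alpha}} e^{\hat{\beta} z_i}$ with a single covariate; this form is an assumption of the sketch. Because $-\log F$ is decreasing in $t$, a left censored lifetime yields a right censored residual, so the usual Nelson-Aalen estimator applies. The toy data and parameter values are made up.

```python
import math

def cox_snell_residuals(x, z, alpha, beta, gamma):
    """r_i = -log F(x_i | z_i) = gamma * x_i**(-alpha) * exp(beta * z_i)
    under the assumed inverted Weibull PRH model."""
    return [gamma * xi ** (-alpha) * math.exp(beta * zi) for xi, zi in zip(x, z)]

def nelson_aalen(residuals, delta):
    """Nelson-Aalen cumulative hazard of the residuals.
    delta[i] == 1 marks an uncensored residual. Returns a list of
    (residual, cumulative hazard) pairs; assumes no tied residuals."""
    order = sorted(range(len(residuals)), key=lambda i: residuals[i])
    at_risk, cum, points = len(residuals), 0.0, []
    for i in order:
        if delta[i] == 1:
            cum += 1.0 / at_risk
            points.append((residuals[i], cum))
        at_risk -= 1
    return points

# Made-up toy data and illustrative parameter values.
x = [12.0, 15.0, 11.0, 20.0, 17.0]
delta = [1, 1, 0, 1, 1]
z = [1.0, 2.0, 3.0, 4.0, 5.0]
res = cox_snell_residuals(x, z, alpha=2.0, beta=0.1, gamma=100.0)
for r, h in nelson_aalen(res, delta):
    print(round(r, 3), round(h, 3))
```

Plotting the printed pairs and comparing them with the line through the origin with slope one gives the graphical check described above.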

Conclusions
Proportional reversed hazard rate models are more suitable for modeling left censored lifetime data. In this paper, we proposed a parametric PRH model assuming that the lifetime data follow an inverted Weibull distribution. The estimation and hypothesis testing of the parameters of the model have been discussed in detail. The performance of the proposed model was assessed through a large number of Monte Carlo simulations. Our simulation results indicate that the proposed model performs well. We applied the proposed model to a real life example to illustrate the utility of the method. Recently, Bayesian methodology has been extensively employed in the analysis of lifetime data. The inference procedures of the proposed model by selecting an appropriate

Figure 1: Plot of cumulative hazard rates of Cox-Snell residuals versus residuals.

Table 1: Averages and standard deviations (in brackets) of parameter estimates.