In the past two decades, joint models of longitudinal and survival data have received much attention in the literature. These models are often desirable in the following situations: (i) survival models with measurement errors or missing data in time-dependent covariates, (ii) longitudinal models with informative dropouts, and (iii) survival and longitudinal processes that are associated via latent variables. In these cases, separate inferences based on the longitudinal model and the survival model may lead to biased or inefficient results. In this paper, we provide a brief overview of joint models for longitudinal and survival data and of commonly used inference methods, including the likelihood method and two-stage methods.

Longitudinal data and survival data frequently arise together in practice. For example, in many medical studies, we often collect patients’ information (e.g., blood pressures) repeatedly over time and we are also interested in the time to recovery or recurrence of a disease. Longitudinal data and survival data are often

Figure: CD4 measurements over time. (a) All subjects. (b) Five randomly selected subjects.

Typically, joint models for longitudinal and survival data are required in the following situations:

(i) survival models with measurement errors in time-dependent covariates;

(ii) longitudinal models with informative dropouts;

(iii) longitudinal and survival processes governed by a common latent process;

(iv) the use of external information for more efficient inference.

Joint models of longitudinal and survival data have attracted increasing attention over the last two decades. Tsiatis and Davidian [

Since the literature on joint models is quite extensive, it is difficult to review all references here. In this paper, we provide a brief review of the joint model literature. In Section

In this section, we consider a standard formulation of a joint model. In the literature, a typical setup is a survival model with measurement errors in time-dependent covariates, in which a linear mixed-effects (LME) model is often used to model time-dependent covariates to address covariate measurement errors and a Cox proportional hazards (PH) model is used for modelling the survival data. We focus on this setup to illustrate the basic ideas.

Consider a longitudinal study with

In survival models, some time-dependent covariates may be measured with errors. For simplicity, we consider a single time-dependent covariate. Let

We consider the following Cox model for the survival data:

In Cox model (

To address either missing data or measurement error or both, a standard approach is to model the time-dependent covariates. A common choice is the following LME model:

Note that the survival model (

There are two commonly used approaches for inference of joint models:

(i) two-stage methods,

(ii) likelihood methods.

In the following sections, we describe these two approaches in detail. Other approaches for joint models have also been proposed, such as those based on estimating equations, but we omit them here for reasons of space.

In the joint modelling literature, various two-stage methods have been proposed. A simple (naive) two-stage method is as follows.

Fit an LME model to the longitudinal covariate data, and estimate the missing or mismeasured covariates based on the fitted model.

Fit the survival model
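The two steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the procedure used in the paper: stage 1 replaces a full LME fit with a random-intercept model estimated by simple moment formulas and empirical Bayes shrinkage, and stage 2 maximizes the Cox partial likelihood for a single covariate by Newton-Raphson. The function names `stage1_predict` and `stage2_cox` and the data layout are hypothetical.

```python
import math

def stage1_predict(data):
    """Stage 1 (simplified): z_ij = b0 + b1*t_ij + a_i + e_ij, random intercept a_i.

    data maps subject id -> list of (time, observed covariate) pairs.
    Returns predict(subject, t), the estimated error-free covariate value.
    Assumes at least two subjects and at least one subject with repeated measures.
    """
    pts = [(t, z) for obs in data.values() for t, z in obs]
    n, m = len(pts), len(data)
    tbar = sum(t for t, _ in pts) / n
    zbar = sum(z for _, z in pts) / n
    sxx = sum((t - tbar) ** 2 for t, _ in pts)
    b1 = sum((t - tbar) * (z - zbar) for t, z in pts) / sxx  # pooled OLS slope
    b0 = zbar - b1 * tbar
    mean_res, within_ss = {}, 0.0
    for i, obs in data.items():
        res = [z - b0 - b1 * t for t, z in obs]
        mean_res[i] = sum(res) / len(res)
        within_ss += sum((r - mean_res[i]) ** 2 for r in res)
    sigma2 = within_ss / (n - m)  # within-subject (error) variance
    grand = sum(mean_res.values()) / m
    var_means = sum((v - grand) ** 2 for v in mean_res.values()) / (m - 1)
    avg_inv_n = sum(1.0 / len(obs) for obs in data.values()) / m
    tau2 = max(0.0, var_means - sigma2 * avg_inv_n)  # between-subject variance
    a_hat = {}
    for i, obs in data.items():
        denom = tau2 + sigma2 / len(obs)
        # Shrink each subject's mean residual toward zero.
        a_hat[i] = (tau2 / denom) * mean_res[i] if denom > 0 else 0.0
    return lambda i, t: b0 + b1 * t + a_hat[i]

def stage2_cox(times, events, covariate, n_iter=50):
    """Stage 2: Newton-Raphson for the Cox partial likelihood, one covariate.

    times and events are dicts keyed by subject (event: 1 = observed, 0 = censored);
    covariate(i, t) returns the stage-1 prediction for subject i at time t.
    """
    ids = list(times)
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for i in ids:
            if not events[i]:
                continue
            t = times[i]
            xs = [covariate(j, t) for j in ids if times[j] >= t]  # risk set at t
            ws = [math.exp(beta * x) for x in xs]
            s0 = sum(ws)
            s1 = sum(w * x for w, x in zip(ws, xs))
            s2 = sum(w * x * x for w, x in zip(ws, xs))
            score += covariate(i, t) - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        if info <= 0:
            break
        step = max(-1.0, min(1.0, score / info))  # capped Newton step
        beta += step
        if abs(step) < 1e-10:
            break
    return beta
```

A typical call would be `predict = stage1_predict(data)` followed by `stage2_cox(times, events, predict)`. Because stage 2 treats the predicted covariate values as if they were observed, the sketch inherits the limitations of the naive two-stage method.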

The main advantages of two-stage methods, including the modified two-stage methods described below, are their simplicity and the fact that they can be implemented with existing software. Their limitation is that they may lead to biased inference, for several reasons. First, the estimation of the longitudinal covariate model parameters does not account for the truncation resulting from the events. That is, the longitudinal covariate trajectories of subjects who experience an event may differ from those of subjects who do not, so estimates of the longitudinal covariate model parameters in the first stage, based only on observed covariate data, may be biased. Second, the

The bias in the estimation of the longitudinal model parameters caused by ignoring the informative truncations from the events may depend on the

Self and Pawitan [

More recently, other two-stage methods have been developed in the literature. In the sequel, we review some of these recent methods. Following Prentice [

Note that the bias resulting from the naive two-stage method is caused by the fact that the covariate trajectory is related to the length of follow-up. For example, subjects who drop out early or die early may have different trajectories than those who stay in the study. Thus, much of the bias may be removed if we can recapture the covariate measurements that are missing due to truncation by incorporating the event time information. Albert and Shih [

To incorporate the estimation uncertainty in the first step, we may consider a

Generate covariate values based on the assumed covariate model, with the unknown parameters substituted by their estimates.

Generate survival times from the fitted survival model.

For each generated bootstrap dataset from Steps

Repeating the procedure
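The resampling scheme above can be sketched generically as follows. This is a hedged illustration, not the authors' code: `simulate` and `estimate` are hypothetical user-supplied functions, where `simulate` would generate covariates and survival times from the fitted models and `estimate` would rerun the two-stage fit on the generated dataset. The bootstrap standard error is taken here as the sample standard deviation of the bootstrap estimates, which is the usual convention and an assumption on our part.

```python
import random
import statistics

def parametric_bootstrap_se(simulate, estimate, theta_hat, n_boot=200, seed=0):
    """Parametric bootstrap standard error of an estimator.

    simulate(theta_hat, rng) -> one dataset generated from the fitted model
    estimate(dataset)        -> the parameter estimate recomputed on that dataset
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    boot_estimates = [estimate(simulate(theta_hat, rng)) for _ in range(n_boot)]
    return statistics.stdev(boot_estimates)

# Toy usage: exponential survival times with an estimated rate of 1.0.
def simulate_exponential(rate, rng, n=50):
    return [rng.expovariate(rate) for _ in range(n)]

def estimate_rate(sample):
    return len(sample) / sum(sample)  # maximum likelihood estimate of the rate

se = parametric_bootstrap_se(simulate_exponential, estimate_rate, theta_hat=1.0)
```

In the toy example the result should be close to the theoretical value of roughly rate/sqrt(n); in a joint modelling application the two placeholder functions would be replaced by the fitted covariate and survival models and by the full two-stage procedure.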

Two-stage methods have bearing with the

The likelihood method is perhaps the most widely used approach in the joint model literature. It provides a unified approach for inference, and it produces valid and most efficient inference if the assumed models are correct. The likelihood method is based on the joint likelihood for both the longitudinal data and the survival data. However, since the likelihood function can be complicated, a main challenge for the likelihood method is computation.

All the observed data are

Parameter estimation can then be based on the observed-data likelihood

MLEs of the model parameters can be obtained either by direct maximization of the observed-data log-likelihood or by using an EM algorithm. Since the observed-data log-likelihood involves an intractable integral, direct maximization is often based on numerical integration techniques such as Gauss-Hermite quadrature or Monte Carlo methods. These methods, however, can be quite computationally intensive if the dimension of the unobservable random effects
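To make the numerical integration step concrete, the following sketch integrates out a single normal random effect with a 5-point Gauss-Hermite rule. It is an illustration under stated assumptions: the model is a toy random-intercept model `y_j = mu + b + e_j` rather than a full joint model, the function names are hypothetical, and the nodes and weights are the standard tabulated values for the weight function exp(-x^2).

```python
import math

# 5-point Gauss-Hermite nodes and weights for the weight function exp(-x^2).
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]

def gh_expectation(f, sigma_b):
    """Approximate E[f(b)] for b ~ N(0, sigma_b^2).

    Uses the change of variable b = sqrt(2) * sigma_b * x, which turns the
    normal density into the Gauss-Hermite weight exp(-x^2) / sqrt(pi).
    """
    total = sum(w * f(math.sqrt(2.0) * sigma_b * x)
                for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)

def normal_pdf(y, mean, sd):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def subject_marginal_lik(ys, mu, sigma_e, sigma_b):
    """Marginal likelihood of one subject's measurements, random intercept integrated out."""
    def conditional_lik(b):
        # Likelihood of the measurements given the random effect b.
        return math.prod(normal_pdf(y, mu + b, sigma_e) for y in ys)
    return gh_expectation(conditional_lik, sigma_b)
```

With q random effects, a product rule with k nodes per dimension needs k^q evaluations of the conditional likelihood per subject, which is why the cost grows quickly with the dimension of the random effects.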

Hsieh et al. [

A main challenge in the likelihood inference for joint models is the computational complexity, since numerical methods or Monte Carlo methods can be very computationally intensive when the dimension of the random effects

Approximate but computationally more efficient methods for joint models have also appeared in the literature, such as those based on Laplace approximations (e.g., [

Rizopoulos et al. [

Bayesian joint models have also been studied by various authors, including Faucett and Thomas [

For Bayesian joint models, the model parameters are assumed to follow some prior distributions, and inference is then based on the posterior distribution given the observed data. Let

Like other Bayesian methods, it is desirable to check if the final results are sensitive to the choices of prior distributions. Sometimes, in the absence of prior information, noninformative priors or flat priors may be desirable.
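A prior sensitivity check can be illustrated on a deliberately simple case. The sketch below is not a joint model: it assumes a normal mean with known error standard deviation and a conjugate normal prior, so the posterior is available in closed form, and it compares the posterior under an informative prior and a nearly flat prior. The function name and the numbers are illustrative assumptions.

```python
import math

def normal_posterior(ys, sigma, prior_mean, prior_sd):
    """Posterior mean and sd of a normal mean with known error sd `sigma`."""
    n = len(ys)
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_prec = n / sigma ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * (sum(ys) / n))
    return post_mean, math.sqrt(post_var)

ys = [1.0, 2.0, 3.0]
informative = normal_posterior(ys, sigma=1.0, prior_mean=0.0, prior_sd=1.0)
nearly_flat = normal_posterior(ys, sigma=1.0, prior_mean=0.0, prior_sd=1e6)
```

Here the informative prior pulls the posterior mean from the sample mean of 2.0 down to 1.5, while the nearly flat prior leaves it essentially at 2.0. For joint models the posterior is usually not available in closed form, so the same comparison would be made by refitting the model under alternative priors.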

In the previous sections, we have focused on joint models based on a Cox model for right-censored survival data and an LME model for longitudinal data. Other models for survival data and longitudinal data can also be considered in joint models. For example, for survival data, we may consider accelerated failure time (AFT) models, models for interval-censored data, and models for recurrent events. For longitudinal data, nonlinear mixed-effects models, generalized linear mixed models, or semiparametric/nonparametric mixed models can be used. Although different survival and longitudinal models can be employed, the basic ideas and approaches for inference remain essentially the same. In the following, we briefly review some of these joint models.

In joint modelling of longitudinal and survival data, we can use an AFT model for the survival data. Here, we focus on an AFT model with measurement errors in time-dependent covariates. For the longitudinal data, we again consider LME models for simplicity. The description below is based on Tseng et al. [

Tseng et al. [

Handling the AFT structure in the joint modelling setting is more difficult than for the Cox model, since

Tseng et al. [

In the previous sections, we have focused on right censored survival data and assume that either the

Let

We have focused on LME models for modelling the longitudinal data. Other models for longitudinal data can also be considered. For example, one may consider nonlinear mixed-effects (NLME) models for modelling the longitudinal data in joint models [

When the longitudinal models are nonlinear, the general ideas of the two-stage methods and likelihood methods for joint models can still be applied. The complication is that computation becomes more demanding, because of the nonlinearity of the longitudinal models.

For longitudinal data, missing values are very common. When missing data are nonignorable, in the sense that the missingness probability may be related to the missing values or the random effects, the missing data process often needs to be incorporated in inferential procedures in order to obtain valid results. For likelihood methods, it is straightforward to incorporate missing data mechanisms in joint model inference. However, the computation becomes even more challenging. Wu et al. [

As an illustration, we consider an AIDS dataset that includes 46 HIV-infected patients receiving an anti-HIV treatment. Viral load (i.e., plasma HIV RNA) and CD4 cell count were repeatedly measured during the treatment. The number of viral load measurements per individual varies from 4 to 10. It is known that CD4 is measured with substantial error. About 11% of the viral load measurements are below the detection limit after the initial period of viral decay. We refer to a viral load below the detection limit as “viral suppression.” We wish to check whether the initial CD4 trajectories are predictive of the time to viral suppression.

Let

To address the measurement error in the time-dependent covariate CD4 cell count, we use the LME model to model the CD4 trajectories:

Table

Analyses of the AIDS data under different models.

| Model | Method | | | | | | |
|---|---|---|---|---|---|---|---|
| Cox model | Two-stage | Estimate | — | 0.315 | −0.233 | 1.345 | 0.605 |
| | | SE | — | 0.208 | 0.113 | 0.154 | — |
| | | p-value | — | 0.129 | 0.040 | <0.001 | — |
| | | BSE | — | 0.237 | — | — | — |
| Cox model | Joint model | Estimate | — | 0.648 | −0.201 | 1.342 | 0.603 |
| | | SE | — | 0.234 | 0.135 | 0.162 | — |
| | | p-value | — | 0.006 | 0.137 | <0.001 | — |
| AFT model | Joint model | Estimate | 0.168 | −0.487 | −0.237 | 1.341 | 0.604 |
| | | SE | 0.260 | 0.289 | 0.125 | 0.156 | — |
| | | p-value | 0.517 | 0.091 | 0.059 | <0.001 | — |

SE: standard error; BSE: bootstrap standard error. The

The parameter

In this section, we conduct a simulation study to compare the joint model method and the two-stage method with and without bootstrap standard error correction. We generate 500 datasets from the time-dependent covariate CD4 process (

Comparison of the two-stage method and the joint likelihood method via a simulation study.

| Method | | Parameter | | | | | |
|---|---|---|---|---|---|---|---|
| True value | | 0.6 | −0.2 | 1.3 | 0.6 | 0.5 | 0.3 |
| Two-stage | Est | 0.183 | −0.183 | 1.303 | 0.598 | 0.501 | 0.353 |
| | ESE | 0.216 | 0.111 | 0.164 | 0.025 | 0.114 | 0.209 |
| | ASE | 0.201 | 0.111 | 0.164 | — | — | — |
| | BSE | 0.250 | — | — | — | — | — |
| | Bias | −0.417 | 0.017 | 0.003 | −0.002 | −0.001 | 0.053 |
| | MSE | 0.221 | 0.013 | 0.027 | 0.0007 | 0.013 | 0.046 |
| | CR | 42.8 | 95.6 | 94.4 | — | — | — |
| Joint model | Est | 0.6004 | −0.175 | 1.296 | 0.598 | 0.492 | 0.321 |
| | ESE | 0.256 | 0.103 | 0.161 | 0.020 | 0.092 | 0.156 |
| | ASE | 0.249 | 0.099 | 0.163 | — | — | — |
| | Bias | 0.0004 | 0.025 | −0.004 | −0.002 | −0.008 | 0.021 |
| | MSE | 0.066 | 0.011 | 0.026 | 0.0004 | 0.008 | 0.025 |
| | CR | 95.6 | 95.8 | 95.2 | — | — | — |

In Table

From the above results, we see that the joint likelihood method produces less biased estimates and more reliable standard errors than the two-stage method. These results have important implications. For example, if one uses Wald-type tests for model selection, the likelihood method would give more reliable results. However, two-stage methods are generally simpler and produce estimates more quickly than likelihood methods. The two methods can also be compared with Bayesian methods. Note, however, that Bayesian methods are essentially equivalent to the likelihood method when noninformative priors are used, so we expect Bayesian methods to perform similarly to likelihood methods.
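The summary measures used in a simulation comparison of this kind can be computed from the replicate estimates as sketched below. The definitions are the conventional ones and are an assumption on our part: ESE is the sample standard deviation of the estimates, ASE is the average of the model-based standard errors, MSE is the mean squared deviation from the true value, and CR is the percentage of Wald 95% intervals that cover the true value. The function name `summarize` is hypothetical.

```python
import statistics

def summarize(estimates, std_errors, truth, z=1.96):
    """Summarize replicate estimates of one parameter from a simulation study."""
    est = statistics.mean(estimates)
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return {
        "Est": est,
        "ESE": statistics.stdev(estimates),        # empirical standard error
        "ASE": statistics.mean(std_errors),        # average model-based standard error
        "Bias": est - truth,
        "MSE": statistics.mean((e - truth) ** 2 for e in estimates),
        "CR": 100.0 * covered / len(estimates),    # coverage rate in percent
    }

# Tiny illustrative call with two replicates.
result = summarize([0.5, 0.7], [0.1, 0.1], truth=0.6)
```

Under these definitions MSE is approximately Bias² + ESE², and the tabulated values are consistent with that relation: for the first parameter, the two-stage method gives 0.417² + 0.216² ≈ 0.221 and the joint model gives 0.0004² + 0.256² ≈ 0.066.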

We have provided a brief review of common joint models and methods for inference. In practice, when we need to consider a longitudinal process and an event process and suspect that the two processes may be associated, as in survival models with time-dependent covariates or longitudinal models with informative dropouts, it is important to use joint model methods for inference in order to avoid biased results. The literature on model selection for joint models is quite limited. In practice, the best longitudinal model can be selected based on the observed longitudinal data, and the best survival model can be selected based on the survival data, using standard model selection procedures for these models. We then specify a reasonable link between the two models, such as shared random effects. As for the choice of inference method, the joint likelihood method generally produces the most reliable results

When the longitudinal covariate process terminates at event times, that is, when the longitudinal values are unavailable at and after the event times such as deaths or dropouts, the covariates are sometimes called

Survival models with measurement errors in time-dependent covariates have received much attention in the joint models literature. Another common situation is longitudinal models with informative dropouts, in which survival models can be used to model the dropout process. Both situations focus on characterizing the association between the longitudinal and survival processes. Some authors have also considered joint models in which the focus is on more efficient inference of the survival model, using longitudinal data as auxiliary information [

Joint models can also be extended to

Zeng and Cai [

Although there has been extensive research on joint models in the last two decades and their importance has been increasingly recognized, joint models are still not widely used in practice. A main reason is perhaps the lack of software. Recently, Dimitris Rizopoulos has developed an R package called

The authors are grateful to the editor and two referees for helpful and constructive comments. The research was partially supported by the Canada Natural Sciences and Engineering Research Council (NSERC) discovery grants to L. Wu, W. Liu, and G. Y. Yi and by NIAID/NIH Grant AI080338 and MSP/NSA Grant H98230-09-1-0053 to Y. Huang.