The process of using statistical inference to establish personalized treatment strategies requires specific data-analysis techniques that optimize the pairing of competing therapies with candidate genetic features and characteristics of the patient and disease. A wide variety of methods have been developed. However, the usefulness of these recent advances has not yet been fully recognized by the oncology community, and the scope of their applications has not been summarized. In this paper, we provide an overview of statistical methods for establishing optimal treatment rules for personalized medicine and discuss specific examples in various medical contexts, with an emphasis on oncology. We also point the reader to statistical software for implementing these methods when available.

Cancer is a set of diseases characterized by cellular alterations the complexity of which is defined at multiple levels of cellular organization [

In oncology, biomarkers are typically classified as either predictive or prognostic. Prognostic biomarkers are correlates of the extent of disease or of the extent to which the disease is curable. Therefore, prognostic biomarkers affect the likelihood of achieving a therapeutic response regardless of the type of treatment. By contrast, predictive biomarkers identify patients who are likely or unlikely to benefit from a particular class of therapies [

Statistically, predictive associations are identified using models with an interaction between a candidate biomarker and targeted therapy [

In this paper, we provide an overview of statistical methods for establishing optimal treatment rules for personalized medicine and discuss specific examples in various medical contexts, with an emphasis on oncology. We also point the reader to statistical software when available. The various approaches enable investigators to ascertain the extent to which one should expect a new untreated patient to respond to each candidate therapy and thereby select the treatment that maximizes the expected therapeutic response for the specific patient [

Cancer is an inherently heterogeneous disease. Yet efforts to personalize therapy often rely on analysis strategies that fail to account for the heterogeneity intrinsic to the patient and disease and are therefore too reductive for personalizing treatment in many areas of oncology [

Though very useful when well planned and properly conducted, the reliance on subgroup analysis for developing personalized treatment has been criticized [

From a statistical perspective, personalized medicine is a process involving six fundamental steps provided in Figure

The process for using statistical inference to establish personalized treatment rules.

Individualized treatment rules are functions of model parameters (usually treatment contrasts reflecting differences in treatment effects), which must be estimated from the assumed statistical model and training data. Statistical estimation takes place in step 4. The topic is quite general and thus is not covered in detail here, as other authors have provided several effective expositions of model-building strategies in this context [

The quality of a treatment rule depends on the aptness of the study design used to acquire the training data, clinical relevance of the primary endpoints, statistical analysis plans for model selection and inference, and quality of the data.

The predominant statistical challenge in identifying predictive biomarkers is the high-dimensional nature of molecularly derived candidate features. Classical regression models cannot be applied directly, since the number of covariates (e.g., genes) is much larger than the number of samples. Many approaches have been proposed for analyzing high-dimensional data for prognostic biomarkers. Section

In oncology, several endpoints are used to compare clinical effectiveness. However, the primary therapeutic goal is to extend survivorship or delay recurrence/progression. Thus, time-to-event endpoints are often considered to be the most representative of clinical effectiveness [

The performance of ITRs for personalized medicine depends heavily on the extent to which the model assumptions are satisfied and/or the posited model is correctly specified. Specifically, performance may suffer from misspecification of main effects and/or interactions or of the random-error distribution, violation of linearity assumptions, sensitivity to outliers, and other potential sources of inadequacy [

This section provides details of analytical approaches that are appropriate for identifying ITRs using a clinical data source. The very nature of treatment benefit is determined by the clinical endpoint. While extending overall survival is the ultimate therapeutic goal, often the extent of reduction in tumor size as assessed by RECIST criteria (

Let

We classify the statistical methods presented in this section into five categories: methods based on multivariate and generalized linear regression for analysis of data acquired from RCT (Section

Classical generalized linear models (GLMs) can be used to develop ITRs in the presence of training data derived from a randomized clinical study. The regression framework assumes that the outcome

Often one might want to impose a clinically meaningful minimal threshold,
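As a concrete illustration of this regression strategy, the following Python sketch fits a linear model with a treatment-biomarker interaction to synthetic data and derives the rule "treat when the estimated contrast exceeds a minimal threshold." The generative model, coefficients, and the threshold value (0.1) are all hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                      # a single candidate biomarker
a = rng.integers(0, 2, size=n)              # randomized treatment, A in {0, 1}
# Hypothetical generative model: treatment benefit grows with the biomarker
y = 1.0 + 0.5 * x + a * (-0.25 + 0.5 * x) + rng.normal(scale=0.5, size=n)

# Fit the interaction model Y ~ 1 + X + A + A*X by least squares
design = np.column_stack([np.ones(n), x, a, a * x])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
b0, b1, g0, g1 = beta

def contrast(x_new):
    """Estimated treatment contrast (expected benefit) at biomarker value x_new."""
    return g0 + g1 * x_new

# ITR with a clinically meaningful minimal threshold delta (illustrative value):
# recommend treatment only when the expected benefit exceeds delta
delta = 0.1
def itr(x_new, delta=delta):
    return int(contrast(x_new) > delta)
```

The rule reduces to a simple sign/threshold check on the fitted interaction terms; with a richer covariate vector the contrast is the corresponding linear combination of treatment-covariate coefficients.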

This strategy was used to develop an ITR for treatment of depression [

Randomization attenuates bias arising from treatment selection, thereby providing the highest-quality data for comparing competing interventions. However, due to ethical or financial constraints, RCTs are often infeasible, necessitating an observational study. Treatment selection is often based on a patient’s prognosis. In the absence of randomization, the study design fails to ensure that patients on competing arms exhibit similar clinical and prognostic characteristics, thereby inducing bias.

However, when the available covariates capture the sources of bias, a well-conducted observational study can also provide useful information for constructing ITRs. For example, the two-gene ratio index (HOXB13:IL17BR) was first discovered as an independent prognostic biomarker for ER+ node-negative patients using retrospective data from 60 patients [

Methods based on propensity scores are commonly used to attenuate selection bias [

The propensity score characterizes the probability of assigning a given treatment

Methods that use propensity scores fall into four categories: matching, stratification, covariate adjustment, and inverse probability weighted estimation [
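To illustrate the inverse probability weighting category, the sketch below uses synthetic, confounded data (all coefficients are hypothetical): the propensity score is estimated by logistic regression fit with a few Newton-Raphson steps, and stabilized (Hajek) IPW estimates of each arm's mean outcome are then formed:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
# Confounded treatment assignment: patients with higher x are treated more often
p_true = 1.0 / (1.0 + np.exp(-0.8 * x))
a = rng.binomial(1, p_true)
y = 2.0 + 1.0 * x + 0.5 * a + rng.normal(scale=0.5, size=n)  # true effect = 0.5

# Step 1: estimate the propensity score e(x) = P(A=1 | X=x) by logistic
# regression, using Newton-Raphson iterations on the log-likelihood
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    W = p * (1 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (a - p))
e = 1.0 / (1.0 + np.exp(-X @ beta))

# Step 2: stabilized (Hajek) inverse-probability-weighted estimates of the
# mean outcome that would be seen if everyone received each treatment
mu1 = np.sum(a * y / e) / np.sum(a / e)
mu0 = np.sum((1 - a) * y / (1 - e)) / np.sum((1 - a) / (1 - e))
ate = mu1 - mu0   # attenuates the selection bias of a naive arm comparison
```

Here the naive difference in arm means is biased upward (treated patients have systematically higher x); the weighted contrast recovers an estimate close to the true effect of 0.5, provided the measured covariates capture the confounding.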

Of course, propensity score methods can only attenuate the effects of the important confounding variables that have been captured by the study design. Causal inference in general is not robust to the presence of unmeasured confounders that influence treatment assignment [

The methods presented in the previous sections are appropriate for identifying an ITR using a small set of biomarkers (low-dimensional). However, recent advances in molecular biology in oncology have enabled researchers to acquire vast amounts of genetic and genomic characteristics on individual patients. Often the number of acquired genomic covariates will exceed the sample size. Proper analysis of these high-dimensional data sources poses many analytical challenges. Several methods have been proposed specifically for analysis of high-dimensional covariates [

An appropriate regression model can be defined generally as

Parameters for the specified model can be estimated using the following loss function:

Penalized estimation provides the subset of relevant predictive markers that are extracted from the nonzero coefficients of the corresponding treatment-biomarker interaction terms of

Alternative penalized regression approaches include SCAD [

Lu et al. [

Heretofore, we have discussed methods for continuous or binary outcomes, yet often investigators want to discern the extent to which a therapeutic intervention may alter the amount of time required before an event occurs. This type of statistical inference is referred to broadly as survival analysis. One challenge for survival analysis is that the outcomes may be only partially observable at the time of analysis due to censoring or incomplete follow-up. Survival analysis has been widely applied in cancer studies, often in association studies aimed to identify prognostic biomarkers [

The Cox regression model follows as
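The sketch below illustrates the Cox approach on synthetic data (all data-generating values are hypothetical): it maximizes the Breslow log partial likelihood, assuming no tied event times, for a model with a treatment-biomarker interaction, and reads off the rule "treat when the estimated log-hazard contrast is negative":

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
Z = np.column_stack([x, a, a * x])          # biomarker, treatment, interaction
beta_true = np.array([0.5, 0.3, -1.0])      # treatment benefit grows with x
# Simulate exponential survival under the Cox model, with random censoring
t = rng.exponential(1.0 / np.exp(Z @ beta_true))
c = rng.exponential(2.0, size=n)
time, event = np.minimum(t, c), (t <= c).astype(float)

order = np.argsort(time)                    # sort so each risk set is a suffix
Zs, es = Z[order], event[order]

def neg_log_partial_lik(beta):
    eta = Zs @ beta
    # log of each risk-set sum via a reversed cumulative log-sum-exp
    log_risk = np.logaddexp.accumulate(eta[::-1])[::-1]
    return -np.sum(es * (eta - log_risk))

beta_hat = minimize(neg_log_partial_lik, np.zeros(3), method="BFGS").x

def itr(x_new):
    # Treat when treatment lowers the hazard at this biomarker value,
    # i.e., when beta_treatment + beta_interaction * x < 0
    return int(beta_hat[1] + beta_hat[2] * x_new < 0)
```

Under the proportional hazards model the treatment contrast on the log-hazard scale is linear in the biomarker, so the fitted rule is again a simple sign check on the treatment and interaction coefficients.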

Several authors have provided model building strategies [

Accelerated failure time (AFT) models provide an alternative semiparametric model. Here we introduce its application for high-dimensional data. Let

This method can be extended to accommodate more than two treatments simultaneously by specifying appropriate treatment indicators. For instance, the mean function can be modeled as
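A minimal illustration of an AFT-type mean model with three treatments follows; censoring is ignored here for simplicity (a real analysis would use, e.g., Buckley-James or rank-based estimation), and all coefficients are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 600
x = rng.normal(size=n)
a = rng.integers(0, 3, size=n)              # three treatments: 0, 1, 2
A = np.eye(3)[a][:, 1:]                     # indicators for treatments 1 and 2
# Hypothetical AFT mean model: log survival time is linear in the biomarker
# with treatment-specific intercepts and slopes
log_t = (0.5 * x
         + A[:, 0] * (0.3 - 0.8 * x)        # treatment 1 vs treatment 0
         + A[:, 1] * (-0.2 + 0.8 * x)       # treatment 2 vs treatment 0
         + rng.normal(scale=0.3, size=n))

design = np.column_stack([np.ones(n), x, A, A * x[:, None]])
beta, *_ = np.linalg.lstsq(design, log_t, rcond=None)

def predicted_log_time(x_new, treatment):
    A_new = np.eye(3)[treatment][1:]
    row = np.concatenate([[1.0, x_new], A_new, A_new * x_new])
    return row @ beta

def itr(x_new):
    # assign the treatment with the longest predicted (log) survival time
    return int(np.argmax([predicted_log_time(x_new, k) for k in range(3)]))
```

The extension to more treatments only adds indicator columns (and their biomarker interactions) to the design matrix; the rule remains "pick the arm maximizing predicted survival time."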

We here present an example when an AFT model was used to construct an ITR for treatment of HIV [

The performance of the ITRs presented heretofore depends heavily on whether the statistical models were correctly specified. Recently, much attention has focused on the development of more advanced methods and modeling strategies that are robust to various aspects of potential misspecification. We have already presented a few robust models that avoid specifying parametric functional relationships for main effects [

Recall that the ITR for a linear model

Let

The methods presented in Section

Unlike classification problems wherein the class labels are observed for the training data, the binary “response” variable

One special case of this framework is the “virtual twins” approach [
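A rough sketch of the two-stage idea behind such approaches, on synthetic data with a hypothetical effect structure: arm-specific random forests estimate each patient's individual treatment effect, and a shallow, interpretable tree then summarizes who benefits:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
n = 1000
X = rng.normal(size=(n, 5))
a = rng.integers(0, 2, size=n)
# Nonlinear benefit: treatment helps only when biomarker 0 is positive
y = X[:, 1] + a * np.where(X[:, 0] > 0, 1.0, -1.0) + rng.normal(scale=0.5, size=n)

# Step 1 ("twins"): fit an outcome model per arm, then predict both
# counterfactual outcomes for every patient
rf1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[a == 1], y[a == 1])
rf0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[a == 0], y[a == 0])
z = rf1.predict(X) - rf0.predict(X)         # estimated individual treatment effect

# Step 2: fit a shallow classification tree to the sign of the estimated
# effect, yielding an interpretable treatment rule
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, (z > 0).astype(int))
itr = tree.predict                           # itr(X_new) -> 0/1 recommendations
```

The appeal of the second stage is interpretability: the final rule is a small set of biomarker thresholds rather than a black-box ensemble.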

Lastly, we present a breast cancer example in which several biomarkers were combined to construct an optimal ITR. The data were collected in the Southwest Oncology Group (SWOG) S8814 trial [

Heretofore, we have discussed various methodologies for constructing ITRs, but their performance must be assessed before these rules can be implemented in clinical practice. Several aspects of the performance of a constructed ITR need to be considered: first, how well the ITR fits the data, and second, how well the ITR performs compared with existing treatment-allocation rules. The former is related to the concept of goodness-of-fit or predictive performance [

ITRs may be derived from different methodologies, and comparisons should be conducted with respect to appropriate clinical summaries. A few summary measures for different types of outcomes have been proposed [

Alternatively, an ITR may be compared to an “optimal” drug that has shown universal benefit (a better drug on average) in a controlled trial. The clinical benefits of an “optimal” drug can be defined as
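One common way to make such a comparison empirically is through the value of a rule, i.e., the mean outcome that would result if every patient were treated according to it. Under 1:1 randomization the value can be estimated by inverse probability weighting, as in this synthetic sketch (the rules and generative model are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)              # 1:1 randomization, pi(A|X) = 0.5
# Treatment helps patients with x > 0 and harms the rest
y = x + a * np.where(x > 0, 1.0, -1.0) + rng.normal(scale=0.5, size=n)

def value(rule):
    """Stabilized IPW estimate of the mean outcome under treatment rule `rule`."""
    w = (a == rule(x)).astype(float) / 0.5  # keep patients treated per the rule
    return np.sum(w * y) / np.sum(w)

v_itr  = value(lambda xs: (xs > 0).astype(int))          # biomarker-guided rule
v_all1 = value(lambda xs: np.ones_like(xs, dtype=int))   # "better drug on average"
gain = v_itr - v_all1    # clinical benefit of personalizing over treat-everyone
```

In this toy example the one-size-fits-all rule has no average benefit (gains and harms cancel), whereas the biomarker-guided rule attains a strictly higher estimated value.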

The summaries heretofore discussed evaluate an optimal ITR for a given model and estimating procedure. Because these quantities are estimated conditionally given the observed covariates, they neglect to quantify the extent of marginal uncertainty for future patients. Hence an ITR needs to be internally validated if external data is not available [

We here briefly introduce a process that was proposed by Kapelner et al. [

The above procedure can be applied to all the methods we have discussed so far. The
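A schematic version of such a cross-validated assessment (a simplified sketch, not the exact published procedure): within each fold, an ITR is fit on the training portion (here, a simple linear-interaction model on synthetic RCT data) and its value is estimated by inverse probability weighting on the held-out portion:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)              # RCT, pi(A|X) = 0.5
y = 0.5 * x + a * x + rng.normal(scale=0.5, size=n)

def fit_itr(x_tr, a_tr, y_tr):
    """Fit Y ~ 1 + X + A + A*X and return the rule 1{g0 + g1*x > 0}."""
    D = np.column_stack([np.ones(len(x_tr)), x_tr, a_tr, a_tr * x_tr])
    b = np.linalg.lstsq(D, y_tr, rcond=None)[0]
    return lambda xs: (b[2] + b[3] * xs > 0).astype(int)

def ipw_value(rule, x_te, a_te, y_te):
    w = (a_te == rule(x_te)).astype(float) / 0.5
    return np.sum(w * y_te) / np.sum(w)

# 10-fold cross-validation: fit the rule on training folds, value it on the
# held-out fold, and average for an honest estimate of out-of-sample value
folds = np.array_split(rng.permutation(n), 10)
cv_values = []
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    rule = fit_itr(x[train_idx], a[train_idx], y[train_idx])
    cv_values.append(ipw_value(rule, x[test_idx], a[test_idx], y[test_idx]))
cv_value = np.mean(cv_values)
```

The same loop applies to any of the fitting methods discussed above; only `fit_itr` changes, while the held-out value estimate provides the common yardstick.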

Next we present two examples. Recall in Section

As our understanding of tumor heterogeneity evolves, personalized medicine will become standard medical practice in oncology. Therefore, it is essential that the oncology community use appropriate analytical methods for identifying and evaluating the performance of personalized treatment rules. This paper provided an exposition of the process of using statistical inference to establish optimal individualized treatment rules from data acquired in clinical studies. The quality of an ITR depends on the quality of the design used to acquire the data. Moreover, an ITR must be properly validated before it is integrated into clinical practice. Personalized medicine in some areas of oncology may be limited by the fact that biomarkers arising from a small panel of genes may never adequately characterize the extent of tumor heterogeneity inherent to the disease. Consequently, the available statistical methodology needs to evolve in order to optimally exploit global gene signatures for personalized medicine.

The bulk of our review focused on statistical approaches for treatment selection at a single time point. The reader should note that another important area of research considers optimal dynamic treatment regimes (DTRs) [

The authors declare that there is no conflict of interests regarding the publication of this paper.

Junsheng Ma was fully funded by the University of Texas MD Anderson Cancer Center internal funds. Brian P. Hobbs and Francesco C. Stingo were partially supported by the Cancer Center Support Grant (CCSG) (P30 CA016672).