Escalation with Overdose Control Using Ordinal Toxicity Grades for Cancer Phase I Clinical Trials

We extend a Bayesian adaptive phase I clinical trial design known as escalation with overdose control (EWOC) by introducing an intermediate grade 2 toxicity when assessing dose-limiting toxicity (DLT). Under the proportional odds model assumption of dose-toxicity relationship, we prove that in the absence of DLT, the dose allocated to the next patient given that the previously treated patient had a maximum of grade 2 toxicity is lower than the dose given to the next patient had the previously treated patient exhibited a grade 0 or 1 toxicity at the most. Further, we prove that the coherence properties of EWOC are preserved. Simulation results show that the safety of the trial is not compromised and the efficiency of the estimate of the maximum tolerated dose (MTD) is maintained relative to EWOC treating DLT as a binary outcome, and that fewer patients are overdosed using this design when the true MTD is close to the minimum dose.


Introduction
Cancer phase I clinical trials are sequential designs enrolling late stage cancer patients who have exhausted standard treatment therapies [1]. For cytotoxic agents or combinations of biologic with cytotoxic drugs, the main objectives of these trials are to characterize treatment-related toxicities and estimate a dose level that is associated with a predetermined level of dose-limiting toxicity (DLT). Such a dose is called the maximum tolerated dose (MTD) or phase II dose. Specifically, the MTD, $\gamma$, is defined as the dose that is expected to produce DLT after one cycle of therapy in a specified proportion $\theta$ of patients:
\[ P(\mathrm{DLT} \mid \text{dose} = \gamma) = \theta. \]
Although the definition of DLT depends on the cancer type and the agent under study, it is typically defined as a grade 3 or 4 nonhematologic and grade 4 hematologic toxicity for [...] properties of EWOC while at the same time satisfying those ethical considerations raised by our clinical colleagues.
The manuscript is organized as follows. In Section 2, we give a detailed description of the methodology and describe the trial design. In Section 3, we state and prove the ethical considerations and coherence properties of EWOC-POM. The simulation results of the design operating characteristics and the comparison with the EWOC design are included in Section 4. Section 5 contains some final remarks and a discussion of practical implementations.

Model
Let $G \in \{0, 1, \ldots, 4\}$ be the maximum grade of toxicity experienced by a patient by the end of one cycle of therapy, and define DLT as a maximum of grade 3 or 4 toxicity. Let
\[ Y = \begin{cases} 0 & \text{if } G = 0, 1, \\ 1 & \text{if } G = 2, \\ 2 & \text{if } G = 3, 4. \end{cases} \tag{2.1} \]
We model the dose-toxicities relationship by assuming that
\[ P(Y \ge j \mid x) = F(\alpha_j + \beta x), \quad j = 1, 2, \tag{2.2} \]
where $F(\cdot)$ is a known strictly increasing cdf. This implies that $\alpha_2 \le \alpha_1$. We assume that $\beta > 0$ so that the probability of DLT is an increasing function of dose. The MTD, $\gamma$, is defined as the dose that is expected to produce DLT in a specified proportion $\theta$ of patients:
\[ P(Y = 2 \mid x = \gamma) = F(\alpha_2 + \beta \gamma) = \theta. \tag{2.3} \]
The value chosen for the target probability $\theta$ depends on the nature and clinical manageability of the DLT; it is set relatively high when the DLT is a transient, correctable, or nonfatal condition and low when it is lethal or life threatening. Suppose that dose levels in the trial are selected in the interval $[X_{\min}, X_{\max}]$.
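The three-category model can be made concrete with a few lines of Python. This is a minimal sketch, assuming the logistic link used later in the paper and the cumulative form $P(Y \ge j \mid x) = F(\alpha_j + \beta x)$; the function names and the example parameter values are illustrative and not taken from the paper.

```python
import math


def logistic(w: float) -> float:
    """Logistic cdf F(w) = 1 / (1 + exp(-w))."""
    return 1.0 / (1.0 + math.exp(-w))


def pom_probabilities(x: float, alpha1: float, alpha2: float, beta: float):
    """Return (P(Y=0|x), P(Y=1|x), P(Y=2|x)) under P(Y>=j|x) = F(alpha_j + beta*x)."""
    if alpha2 > alpha1 or beta <= 0:
        raise ValueError("the model requires alpha2 <= alpha1 and beta > 0")
    p_ge1 = logistic(alpha1 + beta * x)  # grade 2 or worse
    p_ge2 = logistic(alpha2 + beta * x)  # DLT (grade 3 or 4)
    return 1.0 - p_ge1, p_ge1 - p_ge2, p_ge2


# Illustrative values (assumptions): theta = 0.33 and an MTD of 0.5 on a dose scale
# where the minimum dose is 0, with P(DLT | x=0) = 0.05 and P(Y>=1 | x=0) = 0.5.
theta, gamma = 0.33, 0.5
alpha1 = 0.0                                      # logit(0.5)
alpha2 = math.log(0.05 / 0.95)                    # logit(0.05)
beta = (math.log(theta / (1 - theta)) - alpha2) / gamma  # enforces F(alpha2 + beta*gamma) = theta
probs_at_mtd = pom_probabilities(gamma, alpha1, alpha2, beta)
```

By construction the third entry of `probs_at_mtd` equals the target probability $\theta$, which is the defining property of the MTD.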

Likelihood
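Let $D_n = \{(x_i, Y_i),\ i = 1, \ldots, n\}$ be the data after enrolling $n$ patients to the trial. The likelihood function for the parameters $\alpha_1$, $\alpha_2$, and $\beta$ is
\[ L(\alpha_1, \alpha_2, \beta \mid D_n) = \prod_{i=1}^{n} \bigl[1 - F(\alpha_1 + \beta x_i)\bigr]^{I(Y_i = 0)} \bigl[F(\alpha_1 + \beta x_i) - F(\alpha_2 + \beta x_i)\bigr]^{I(Y_i = 1)} \bigl[F(\alpha_2 + \beta x_i)\bigr]^{I(Y_i = 2)}, \tag{2.4} \]
where $I(\cdot)$ is the indicator function.
We reparameterize model (2.2) in terms of $\rho_0 = P(Y = 2 \mid x = X_{\min})$, the probability that a DLT manifests within the first cycle of therapy for a patient given dose $x = X_{\min}$; $\rho_1 = P(Y \ge 1 \mid x = X_{\min})$, the probability that a grade 2 or more toxicity manifests within the first cycle of therapy for a patient given dose $x = X_{\min}$; and the MTD $\gamma$. This reparameterization is convenient to clinicians since $\gamma$ is the parameter of interest. Assuming that the dose is standardized to be in the interval $[0, 1]$, it can be shown that
\[ \alpha_1 = F^{-1}(\rho_1), \qquad \alpha_2 = F^{-1}(\rho_0), \qquad \beta = \frac{F^{-1}(\theta) - F^{-1}(\rho_0)}{\gamma}. \tag{2.5--2.6} \]
Using (2.4), (2.5), and (2.6), the likelihood of the reparameterized model is
\[ L(\rho_0, \rho_1, \gamma \mid D_n) = \prod_{i=1}^{n} \bigl[1 - F(F^{-1}(\rho_1) + \beta x_i)\bigr]^{I(Y_i = 0)} \bigl[F(F^{-1}(\rho_1) + \beta x_i) - F(F^{-1}(\rho_0) + \beta x_i)\bigr]^{I(Y_i = 1)} \bigl[F(F^{-1}(\rho_0) + \beta x_i)\bigr]^{I(Y_i = 2)}, \tag{2.7} \]
with $\beta = (F^{-1}(\theta) - F^{-1}(\rho_0))/\gamma$.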
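The reparameterized likelihood can be sketched in Python as follows. This is a minimal illustration, assuming the logistic link, doses standardized so that $X_{\min} = 0$, and the relations $\alpha_1 = F^{-1}(\rho_1)$, $\alpha_2 = F^{-1}(\rho_0)$, $\beta = (F^{-1}(\theta) - F^{-1}(\rho_0))/\gamma$; the function name `log_likelihood` is hypothetical.

```python
import math


def logit(p: float) -> float:
    """Inverse of the logistic cdf."""
    return math.log(p / (1.0 - p))


def logistic(w: float) -> float:
    return 1.0 / (1.0 + math.exp(-w))


def log_likelihood(rho0, rho1, gamma, doses, grades, theta=0.33):
    """Log-likelihood in (rho0, rho1, gamma) for ordinal responses y_i in {0, 1, 2}.

    Assumes 0 < rho0 < theta, rho0 < rho1 < 1, gamma > 0, and doses in [0, 1].
    """
    alpha1 = logit(rho1)
    alpha2 = logit(rho0)
    beta = (logit(theta) - logit(rho0)) / gamma
    total = 0.0
    for x, y in zip(doses, grades):
        p_ge1 = logistic(alpha1 + beta * x)  # P(Y >= 1 | x)
        p_ge2 = logistic(alpha2 + beta * x)  # P(Y = 2 | x), the DLT probability
        cell = (1.0 - p_ge1, p_ge1 - p_ge2, p_ge2)[y]
        total += math.log(cell)
    return total
```

For example, `log_likelihood(0.05, 0.5, 0.5, [0.5], [2])` evaluates the log-probability of a DLT for a single patient treated exactly at the MTD, which is $\log\theta$ by the definition of $\gamma$.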

Prior and Posterior Distributions
Let $g(\rho_0, \rho_1, \gamma)$ be the prior distribution on $\Omega$, where $\Omega = \{(x, y, z) : 0 \le x \le \theta,\ x \le y \le 1,\ X_{\min} \le z \le X_{\max}\}$. Using Bayes rule, the posterior distribution of the model parameters is proportional to the product of the likelihood and prior distribution:
\[ \pi(\rho_0, \rho_1, \gamma \mid D_n) \propto L(\rho_0, \rho_1, \gamma \mid D_n)\, g(\rho_0, \rho_1, \gamma). \tag{2.8} \]
We designed an MCMC sampler based on the Metropolis-Hastings algorithm [47, 48] to obtain model operating characteristics. We also used WinBUGS [49] to estimate features of the posterior distribution of the MTD and design a trial. The WinBUGS code is included in the Appendix section. In the absence of prior information about the MTD and probability of DLT at $X_{\min}$, we specify vague priors for the model parameters as follows: [...] (2.9)

Trial Design
Dose levels in the trial are selected in the interval $[X_{\min}, X_{\max}]$. The adaptive design proceeds as follows. The first patient receives a dose $x_1 > X_{\min}$ that is deemed to be safe by the clinician. Denote by $\Pi_k(\gamma) = \Pi(\gamma \mid D_k)$ the marginal posterior cdf of the MTD, $k = 1, \ldots, n - 1$. The $(k+1)$st patient receives the dose $x_{k+1} = \Pi_k^{-1}(\alpha)$, so that the posterior probability of exceeding the MTD is equal to the feasibility bound $\alpha$. This is the overdose protection property of EWOC: at each stage of the design, we seek a dose to allocate to the next patient while controlling the posterior probability of exposing patients to toxic dose levels. The trial proceeds until a predetermined number $n$ of patients are enrolled to the trial. At the end of the trial, we estimate the MTD as $\hat{\gamma} = \Pi_n^{-1}(\alpha)$.
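The dose-assignment rule $x_{k+1} = \Pi_k^{-1}(\alpha)$ can be illustrated with a coarse grid approximation of the posterior. This is only a sketch under stated assumptions: it uses a logistic link, doses standardized to $[0, 1]$, and a prior that is uniform on $\Omega$ (the paper's exact prior (2.9) may differ), and it replaces the MCMC sampler of the paper with a midpoint grid. The function names and grid sizes are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def logistic(w: float) -> float:
    return 1.0 / (1.0 + math.exp(-w))


def cell_probability(y, x, rho0, rho1, gamma, theta):
    """P(Y = y | x) in the (rho0, rho1, gamma) parameterization."""
    beta = (logit(theta) - logit(rho0)) / gamma
    p_ge1 = logistic(logit(rho1) + beta * x)
    p_ge2 = logistic(logit(rho0) + beta * x)
    return (1.0 - p_ge1, p_ge1 - p_ge2, p_ge2)[y]


def next_dose(doses, grades, theta=0.33, feasibility=0.25, n_rho=25, n_gamma=200):
    """Approximate the feasibility-bound quantile of the marginal posterior of the MTD."""
    # Midpoint grids keep rho0, rho1, and gamma strictly inside their ranges.
    rho0_grid = [theta * (i + 0.5) / n_rho for i in range(n_rho)]
    rho1_grid = [(i + 0.5) / n_rho for i in range(n_rho)]
    gamma_grid = [(i + 0.5) / n_gamma for i in range(n_gamma)]
    marginal = []
    for gamma in gamma_grid:
        mass = 0.0
        for rho0 in rho0_grid:
            for rho1 in rho1_grid:
                if rho1 <= rho0:  # outside the assumed prior support
                    continue
                like = 1.0
                for x, y in zip(doses, grades):
                    like *= cell_probability(y, x, rho0, rho1, gamma, theta)
                mass += like
        marginal.append(mass)
    total = sum(marginal)
    cumulative = 0.0
    for gamma, mass in zip(gamma_grid, marginal):
        cumulative += mass / total
        if cumulative >= feasibility:
            return gamma
    return gamma_grid[-1]


# First patient treated at dose 0.10, as in the paper's illustration.
dose_after_grade_0_or_1 = next_dose([0.10], [0])
dose_after_grade_2 = next_dose([0.10], [1])
```

Because the prior and the numerical method are assumptions of this sketch, the two doses need not reproduce the values reported in the paper; the qualitative ordering (a lower dose after a grade 2 toxicity) is what the example is meant to show.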

Characteristics of EWOC-POM
The proposed design EWOC-POM assigns dose levels to future patients by taking into account the maximum observed toxicity grade during the first cycle of therapy according to the following properties.
(i) At each stage of the design, we seek a dose to allocate to the next patient while controlling the posterior probability of exposing patients to toxic dose levels.
(ii) If the maximum grade of toxicity experienced by patient $k - 1$ within one cycle of therapy is grade 2, then the dose allocated to patient $k$ is lower than the dose that would have been given to patient $k$ had the maximum grade of toxicity experienced by patient $k - 1$ been grade 0 or 1.
Property (i) is the overdose protection defining characteristic of EWOC, which is also satisfied by EWOC-POM. Property (ii) is naturally appealing because when a patient experiences grade 2 dose-related toxicity, it is ethical not to increase the dose by the same amount as if this patient had exhibited grade 0 or 1 toxicity at the most. Characteristic (ii) is summarized in the following theorem.
The proof of Theorem 3.1 is given in the Appendix section. It is easy to verify that the monotonicity condition of Theorem 3.1 holds for the logistic function $F(w) = 1/(1 + e^{-w})$. Using this link function and the uniform priors given in (2.9) with $\theta = 0.33$, Figure 1 gives all possible dose assignments for patients 1 and 2 and selected situations for patients 3 and 4 using the trial design described in Section 2.1.3. The dose has been standardized so that $X_{\min} = 0$ and $X_{\max} = 1$, and the first patient is given dose 0.10. If this patient experiences grade 0 or 1 toxicity at the most, the dose recommended for patient 2 is 0.36. On the other hand, if patient 1 experiences grade 2 toxicity at the most, the dose recommended for patient 2 is 0.22, much lower than 0.36. This figure also illustrates some of the coherence properties stated in Theorem 3.2 below.
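Some intuition for property (ii) under the logistic link can be obtained from a pointwise likelihood-ratio calculation. The following is a heuristic sketch, not the proof given in the Appendix; the shorthand $A_j = e^{\alpha_j}$ and $t = e^{\beta x}$ is introduced here only for illustration, and the relation $\beta = (F^{-1}(\theta) - F^{-1}(\rho_0))/\gamma$ assumes the dose is standardized so that $X_{\min} = 0$.

```latex
% Heuristic sketch (not the paper's proof), logistic link F(w) = e^w / (1 + e^w).
% Illustrative shorthand: A_j = e^{\alpha_j} with A_2 \le A_1, and t = e^{\beta x}.
\begin{align*}
P(Y = 0 \mid x) &= 1 - F(\alpha_1 + \beta x) = \frac{1}{1 + A_1 t}, \\
P(Y = 1 \mid x) &= F(\alpha_1 + \beta x) - F(\alpha_2 + \beta x)
                 = \frac{(A_1 - A_2)\, t}{(1 + A_1 t)(1 + A_2 t)}, \\
\frac{P(Y = 1 \mid x)}{P(Y = 0 \mid x)} &= \frac{(A_1 - A_2)\, t}{1 + A_2 t},
\qquad
\frac{\partial}{\partial t}\,\frac{(A_1 - A_2)\, t}{1 + A_2 t}
  = \frac{A_1 - A_2}{(1 + A_2 t)^2} \ge 0 .
\end{align*}
```

For $x > 0$, $t$ increases with $\beta$, and for fixed $\rho_0 < \theta$ the slope $\beta$ decreases as $\gamma$ increases. Hence, for fixed $(\rho_0, \rho_1)$, the ratio of the grade 2 likelihood to the grade 0 or 1 likelihood is nonincreasing in $\gamma$: observing a grade 2 toxicity reweights the likelihood toward smaller values of the MTD. This pointwise statement does not by itself establish Theorem 3.1, which concerns a quantile of the marginal posterior of $\gamma$.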

Coherence of EWOC-POM
In cancer phase I clinical trials, it is ethical not to increase a dose of a cytotoxic agent for the next patient if the previously treated patient exhibited DLT when given the same dose level. Furthermore, it is desirable not to decrease the dose of an agent for the next patient if the previously treated patient did not experience DLT when given that same dose level. These two properties are known as coherence in escalation and de-escalation, respectively; see Cheung [50]. A design that satisfies both of these properties is said to be coherent. Tighiouart and Rogatko [23] show that EWOC is coherent. The next theorem states some of the coherence properties of EWOC-POM.
Theorem 3.2. Suppose that for all $x \in [X_{\min}, X_{\max}]$ and all $(\rho_0, \rho_1)$ such that $0 \le \rho_0 \le \rho_1 \le 1$ and $\rho_0 \le \theta$, $F_1$ and $F_2$ are monotonically decreasing in $\gamma$. Then the design EWOC-POM described in Section 2.1.3 is coherent in de-escalation. Furthermore, if the toxicity response for patient $k$ is $Y_k = 0$, then the dose allocated to patient $k + 1$ satisfies [...]
The proof of Theorem 3.2 is given in the Appendix section.

Simulation Studies
We compare the design operating characteristics of EWOC-POM with the original EWOC by simulating a large number of trials under several scenarios. We used the logistic function $F(w) = 1/(1 + e^{-w})$ to model the dose-toxicities relationship in (2.2). EWOC was implemented as in Tighiouart et al. [22] using the same logistic function to model the dose-toxicity relationship. For all scenarios, the dose is standardized to be in the interval $[0, 1]$, with $\theta = 0.33$, feasibility bound $\alpha = 0.25$, and trial sample size $n = 30$. The priors in (2.9) were adopted for EWOC-POM. We also evaluated the design operating characteristics for $\theta = 0.20$, and the conclusions regarding the performance of EWOC-POM relative to EWOC were essentially the same.

Algorithm
For a given scenario determined by $\rho_0$, $\rho_1$, and $\gamma$, the first patient receives dose 0, and the next dose $x_2$ is determined according to the trial design described in Section 2.1.3. The second response $y_2$ is then generated from model (2.2), reparameterized in terms of $\rho_0$, $\rho_1$, and $\gamma$, with $x = x_2$. This process is repeated until all $n$ patients have been enrolled to the trial. We considered 9 scenarios corresponding to a fixed value $\rho_0 = 0.05$, three values of $\rho_1$ (0.2, 0.5, and 0.8), and three values of the MTD $\gamma$ (0.1, 0.5, and 0.7). The corresponding dose-toxicity relationships for these nine scenarios are illustrated in Figure 2. For each model and each scenario, we simulated $M = 1000$ trials. EWOC and EWOC-POM were compared in terms of the proportion of patients exhibiting DLT, the average bias, and the estimated mean square error (MSE):
\[ \mathrm{bias}_{\mathrm{ave}} = M^{-1} \sum_{i=1}^{M} (\hat{\gamma}_i - \gamma_{\mathrm{true}}), \qquad \mathrm{MSE} = M^{-1} \sum_{i=1}^{M} (\hat{\gamma}_i - \gamma_{\mathrm{true}})^2, \]
where $\hat{\gamma}_i$ is the Bayes estimate of the MTD at the end of the $i$th trial.
We also compared the models with respect to the proportion of patients that were overdosed. Here, a dose $x$ is defined as an overdose if $x > x^*$, where $x^*$ is defined as the dose for which $P(\mathrm{DLT} \mid x^*) = \theta + 0.05$. This probability is calculated using the parameter values from the corresponding scenario. These models are further compared with respect to the proportion of patients that were overdosed given that the previously treated patient exhibited grade 2 toxicity. Finally, we compared EWOC-POM to EWOC in terms of the proportion of trials for which the probability of DLT exceeds 0.4. This gives us an estimate of the probability that a prospective trial will result in an excessively high DLT rate. As for the proportion of trials with correct MTD recommendation, we present the percent of trials with estimated MTD within 10% and 20% of the dose range of the true MTD for EWOC-POM and EWOC.
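The building blocks of this simulation can be sketched in Python: generating an ordinal response for a given dose under a scenario, computing the overdose threshold $x^*$, and summarizing the MTD estimates. This is a minimal sketch assuming the logistic link and doses standardized to $[0, 1]$; all function names are hypothetical and the dose-selection step itself is not shown.

```python
import math
import random


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def logistic(w: float) -> float:
    return 1.0 / (1.0 + math.exp(-w))


def scenario_parameters(rho0, rho1, gamma, theta=0.33):
    """Map a scenario (rho0, rho1, gamma) to (alpha1, alpha2, beta)."""
    return logit(rho1), logit(rho0), (logit(theta) - logit(rho0)) / gamma


def simulate_response(x, rho0, rho1, gamma, rng, theta=0.33):
    """Draw Y in {0, 1, 2} at dose x from the proportional odds model."""
    alpha1, alpha2, beta = scenario_parameters(rho0, rho1, gamma, theta)
    u = rng.random()
    if u < logistic(alpha2 + beta * x):   # P(Y = 2 | x)
        return 2
    if u < logistic(alpha1 + beta * x):   # P(Y >= 1 | x)
        return 1
    return 0


def overdose_threshold(rho0, gamma, theta=0.33, margin=0.05):
    """Dose x* solving P(DLT | x*) = theta + margin under the scenario."""
    _, alpha2, beta = scenario_parameters(rho0, rho0, gamma, theta)
    return (logit(theta + margin) - alpha2) / beta


def bias_and_mse(estimates, gamma_true):
    """Average bias and mean square error of the MTD estimates across trials."""
    errors = [g - gamma_true for g in estimates]
    m = len(errors)
    return sum(errors) / m, sum(e * e for e in errors) / m


rng = random.Random(2024)  # arbitrary seed for reproducibility
responses = [simulate_response(0.5, 0.05, 0.5, 0.5, rng) for _ in range(200)]
x_star = overdose_threshold(0.05, 0.5)
```

In a full simulation, `simulate_response` would be called once per enrolled patient at the dose chosen by the design, and `bias_and_mse` would be applied to the $M$ final MTD estimates.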

Results
Figure 3 shows that the proportion of patients exhibiting DLT is always less than 34% for both EWOC and EWOC-POM under all scenarios, with 4% fewer patients experiencing DLT under EWOC-POM when the MTD is small ($\gamma = 0.1$) and $\rho_1 = 0.8$. The same figure shows that the proportion of patients that are overdosed using EWOC is uniformly higher relative to EWOC-POM when the MTD is small. The same trend is observed when $\gamma = 0.5$, except when $\rho_1 = 0.2$. The difference in the proportion of patients being overdosed when $\gamma = 0.7$ is negligible. The last panel of Figure 3 shows that the proportion of patients that are overdosed given that the previously treated patient exhibited grade 2 toxicity is uniformly higher using EWOC relative to EWOC-POM when $\gamma = 0.1, 0.5$, except when $\rho_1 = 0.2$. The difference in these proportions when $\gamma = 0.7$ is negligible. The last two columns of Table 1 show that the percent of trials with DLT rate of 0.4 or more is at most 7.5% for EWOC and 6.6% for EWOC-POM. A more detailed comparison is shown in Figure 4, where side-by-side box plots of the distributions of the proportion of DLTs for EWOC-POM and EWOC under the nine scenarios are displayed. These results show that EWOC-POM maintains the safety of the trial relative to EWOC and is much safer when the true MTD is close to the minimum dose by reducing the number of patients that are exposed to toxic doses.
Figure 5 shows that the estimated MTDs using EWOC and EWOC-POM are very close in general, with the highest difference observed when $\rho_1 = 0.8$. This is reflected by the estimated bias and RMSE shown in Figure 5. This is expected since EWOC-POM is characterized by a conservative dose escalation when a patient experiences grade 2 toxicity. The highest absolute value of the bias is 0.04 and is achieved when $\gamma = 0.5, 0.7$ and $\rho_1 = 0.8$. This constitutes 4% of the range of the dose and is practically not significant. The percent of trials with estimated MTD within 5% of the dose range and 10% of the dose range of the true MTD $\gamma$ under the nine scenarios are shown in columns 2-5 of Table 1. These results further confirm that the precision of the estimate of the MTD is similar between the two models, with a higher precision for EWOC achieved when $\gamma = 0.5$ and $\rho_1 = 0.8$. Values other than 5% and 10% of the dose range were also used, and the conclusions were essentially the same. We conclude that EWOC-POM maintains the efficiency of the trial relative to EWOC for all practical purposes.
These simulation results suggest that EWOC-POM is a good alternative design for cancer phase I clinical trials which takes into account the ethical consideration that dose escalation in the absence of DLT is mitigated by the occurrence of a grade 2 toxicity.

Model Robustness
Model robustness with respect to the proportional odds assumption in (2.2) was assessed by simulating the toxicity responses from a nonproportional odds model:
\[ P(Y \ge j \mid x) = F(\alpha_j + \beta_j x), \quad j = 1, 2. \tag{4.2} \]
The same logistic link function $F(w) = 1/(1 + e^{-w})$ was used. We considered two different models $M_1$ and $M_2$ corresponding to the parameters $\alpha_2 = -3.94$, $\beta_1 = 22.36$, $\beta_2 = 32.36$ for model $M_1$ and $\alpha_2 = -1.94$, $\beta_1 = 22.36$, $\beta_2 = 12.36$ for $M_2$. For each model $M_i$, $i = 1, 2$, we considered three different values for the intercept term $\alpha_1$, namely $\alpha_1 = -1.38, 0.00, 1.38$, which correspond to $\rho_1 = 0.2, 0.5, 0.8$. These parameters have been selected so that $\rho_0 = 0.020$ for model $M_1$, $\rho_0 = 0.126$ for model $M_2$, and the true MTD is $\gamma = 0.1$. Figure 6 shows the graphs of the probability of DLT and the probability of grade 2 or more toxicity as a function of dose for the proportional odds model (POM) and models $M_1$ and $M_2$ when $\rho_1 = 0.2$. For each scenario, we simulated 1000 trials with $n = 30$ patients where, at each stage of the trial, the next dose is calculated using the trial design described in Section 2.1.3 as in Section 4.1, but the toxicity response is generated using the nonproportional odds model (4.2). The simulation results are summarized in Table 2. The maximum difference in the proportion of patients exhibiting DLT, averaged across the simulated trials, between model $M_i$, $i = 1, 2$, and EWOC-POM is 3%. Under model $M_2$, the proportion of patients that are overdosed is higher than under EWOC-POM, and this proportion is 13% higher when $\rho_1 = 0.2$.
The percent of trials with DLT rate exceeding 0.4 is less than 15% in all cases. This percent is highest for model $M_2$; however, it is still relatively small compared to the results obtained in [42]. The percent of trials with estimated MTD within 10% of the dose range of the true MTD is 100% for all three models and across all scenarios, and is very good within 5% of the dose range of the true MTD.
In summary, EWOC-POM seems to be robust to model misspecification when the true MTD is near the initial dose. On the other hand, the model is sensitive to model misspecification when the true MTD is high, but the safety of the trial is not compromised. We also conducted similar simulations (results not shown) when the true MTD is $\gamma = 0.5, 0.7$. We found that under all scenarios, the proportion of patients exhibiting DLT is always less than 33%, but the bias tends to be higher for high values of $\rho_1$ and $\gamma$. This is the case when the probability of DLT curve increases very slowly as a function of dose, which results in a very slow dose escalation scheme.
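The parameter values quoted for models $M_1$ and $M_2$ can be checked numerically. The sketch below assumes the nonproportional form $P(Y \ge j \mid x) = F(\alpha_j + \beta_j x)$ with the logistic link and a minimum dose of 0, so that $\rho_0 = F(\alpha_2)$, $\rho_1 = F(\alpha_1)$, and the MTD solves $F(\alpha_2 + \beta_2 \gamma) = \theta$; the helper names are hypothetical.

```python
import math

THETA = 0.33


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def logistic(w: float) -> float:
    return 1.0 / (1.0 + math.exp(-w))


def implied_rho0(alpha2: float) -> float:
    """P(DLT) at the minimum dose x = 0."""
    return logistic(alpha2)


def implied_mtd(alpha2: float, beta2: float, theta: float = THETA) -> float:
    """Dose gamma solving F(alpha2 + beta2 * gamma) = theta."""
    return (logit(theta) - alpha2) / beta2


# Parameter values quoted in the text for the DLT curve of each model.
MODELS = {
    "M1": {"alpha2": -3.94, "beta2": 32.36},
    "M2": {"alpha2": -1.94, "beta2": 12.36},
}
summary = {
    name: (implied_rho0(p["alpha2"]), implied_mtd(p["alpha2"], p["beta2"]))
    for name, p in MODELS.items()
}
# Intercepts alpha1 quoted in the text and the implied rho1 = F(alpha1).
rho1_values = [logistic(a) for a in (-1.38, 0.00, 1.38)]
```

Each entry of `summary` holds the implied pair $(\rho_0, \gamma)$ for a model, which can be compared with the values stated in the text up to rounding.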

Discussion
In this paper, we proposed a Bayesian adaptive design for dose-finding studies in cancer phase I clinical trials. The method addresses the ethical concern regarding dose escalation in the absence of DLT. Specifically, if the current patient experiences drug-related grade 2 toxicity at the most, then it is ethical not to escalate the dose for the next patient by the same amount as would be used had the current patient experienced a maximum of grade 0 or 1 toxicity. The method, termed EWOC-POM, extends EWOC by accommodating an intermediate grade 2 toxicity in the model. We used a proportional odds model to describe the dose-toxicities relationship for simplicity. We proved that if the maximum grade of toxicity experienced by patient $k - 1$ within one cycle of therapy is grade 2, then the dose allocated to patient $k$ is lower than the dose that would have been given to patient $k$ had the maximum grade of toxicity experienced by patient $k - 1$ been grade 0 or 1. Furthermore, we showed that the coherence properties of EWOC are maintained.
We studied design operating characteristics by simulating a large number of trials under different scenarios of the dose-toxicity relationships. EWOC-POM was compared to EWOC with respect to the primary goals of cancer phase I trials: safety and efficiency of the estimate of the MTD. We found that, in general, the safety of the trial is not compromised when we account for an intermediate grade 2 toxicity. In particular, when the unknown MTD is close to the initial dose, substantially more patients are overdosed using EWOC than using EWOC-POM, and if the current patient experiences grade 2 toxicity, then the next patient is more likely to be overdosed using EWOC compared to EWOC-POM. The loss in efficiency of the estimate of the MTD by introducing an extra parameter to the model is very

Figure 2: Dose-toxicity relationship for the nine scenarios considered in the design operating characteristics. The solid curves correspond to $P(Y = 2 \mid x) = P(\mathrm{DLT} \mid x)$ and the dashed curves in bold correspond to $P(Y \ge 1 \mid x)$. The horizontal dashed lines represent the target probability of DLT $\theta = 0.33$ and the vertical lines correspond to the true values of the MTD $\gamma$.

Figure 3: Summary statistics for trial safety for EWOC and EWOC-POM under all scenarios. Each graph represents the mean proportion obtained from all patients from all 1000 simulated trials.

Figure 4: Box plots for the proportion of DLTs for EWOC-POM and EWOC under the nine scenarios. Each box plot was constructed from the DLT rates of the 1000 simulated trials. The dashed horizontal line corresponds to the target probability of DLT $\theta = 0.33$.

Figure 5: Summary statistics for trial efficiency for EWOC and EWOC-POM under all scenarios. Each graph represents a mean value obtained from all patients from all 1000 trials.

Figure 6: Dose-toxicity relationship under the proportional (EWOC-POM) and nonproportional odds models $M_1$ and $M_2$ when the true MTD is $\gamma = 0.1$. The dashed horizontal line corresponds to the target probability of DLT $\theta = 0.33$.

Theorem 3.1. Let $D_k = \{(Y_1, x_1), \ldots, (Y_k, x_k)\}$ be the data on the first $k$ patients generated by the design described in Section 2.1.3, and $\Pi_k$ [...]

Table 1: Percent of trials with estimated MTD within 5% of the dose range and 10% of the dose range of the true MTD $\gamma$, and percent of trials for which the rate of DLT exceeds 40%, for EWOC and EWOC-POM under the nine scenarios.
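The Bayes estimate $\hat{\gamma}_i$ of the MTD at the end of the $i$th trial is taken with respect to the asymmetric loss function:
\[ l_\alpha(x, \gamma) = \begin{cases} \alpha (\gamma - x) & \text{if } x \le \gamma, \\ (1 - \alpha)(x - \gamma) & \text{if } x > \gamma. \end{cases} \]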

Table 2: Summary statistics for trial safety and efficiency under model misspecification when the true MTD is $\gamma = 0.1$. Rows 2-6 give the summary statistics based on all patients from all 1000 trials.