Balancing Treatment Allocation over Continuous Covariates: A New Imbalance Measure for Minimization

In many clinical trials, it is important to balance treatment allocation over covariates. Although a great many papers have been published on balancing over discrete covariates, the procedures for continuous covariates have been less well studied. Traditionally, a continuous covariate usually needs to be transformed to a discrete one by splitting its range into several categories. Such practice may lead to loss of information and is susceptible to misspecification of covariate distribution. The more recent papers seek to define an imbalance measure that preserves the nature of continuous covariates and set the allocation rule in order to minimize that measure. We propose a new design, which defines the imbalance measure by the maximum assignment difference when all possible divisions of the covariate range are considered. This measure depends only on ranks of the covariate values and is therefore free of covariate distribution. In addition, we developed an efficient algorithm to implement the new procedure. By simulation studies we show that the new procedure is able to keep good balance properties in comparison with other popular designs.


Introduction
Balanced allocation among treatment groups is desirable in many clinical trials. A well-balanced design enhances a trial by increasing its credibility, the precision of subgroup or interim analyses, and robustness to model misspecification [1]. Moreover, ignoring balance at the design stage may lead to loss of statistical efficiency, especially in small trials [2]. In the presence of covariates or prognostic factors, well-balanced allocation means not only similar group sizes but also similar distributions of covariate values across treatment groups [3-6]. For discrete covariates, a great many procedures have been published, including stratified permuted block designs, marginal procedures or minimization [7, 8], and hierarchical models [9-11]. Y. Hu and F. Hu [12] are among the few who explored the theoretical properties of such procedures.
Simulation studies are carried out in Section 4, comparing our procedure with discretization methods as well as some other designs. In Section 5, we conclude the paper by discussing possible extensions of the current procedure.

Motivation
We first focus on two treatments, A and B, and only one continuous covariate $Z$. For simplicity, assume $Z$ takes values over the interval $[0, 1]$; any other type of $Z$ can be transformed to $[0, 1]$ by a linear function if $Z$ is bounded, or by a nonlinear function such as $e^x/(e^x + 1)$ otherwise.
To see the difficulty in defining "balance" for a continuous covariate, we use the example shown in Figure 1. The first 8 patients have been randomized. The 9th has just arrived and is tentatively assigned to treatment A (a) and B (b). The figure shows the assignments as well as the covariate values of the patients. First, we point out that discretization could cause complications, since different ways of splitting may lead to different assignments. For example, if the covariate range $[0, 1]$ in Figure 1 is split into $m$ intervals of equal length, then the choices $m = 1$, $m = 2$, and $m = 4$ result in different preferences for the assignment: if $m = 1$, that is, only overall patient numbers are considered, then the 9th patient could be assigned to either treatment A or treatment B, since either way the absolute difference is 1; if $m = 2$, then treatment A would be favored, producing balanced patient numbers in the second category, that is, over the interval $[0.5, 1.0]$ to which the 9th patient belongs; if $m = 4$, then treatment B would be preferred instead, since it produces an allocation of 2 : 1 in the third category, that is, over the interval $[0.5, 0.75)$, rather than 3 : 0 when treatment A is assigned. Second, we emphasize that balance in the mean values does not necessarily lead to balance in distributions. Taking the upper panel in Figure 1, for instance, the assignment of treatment A would not cause a significant mean difference between the two treatment groups. However, in terms of covariate distribution, a severe imbalance would exist, since the 5 patients (7th, 6th, 9th, 4th, and 8th) in group A have covariate values at the center of the range and the 4 patients (2nd, 3rd, 5th, and 1st) in group B at the two ends.
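The dependence on the number of categories can be checked with a short sketch. The covariate values below are hypothetical: Figure 1 is not reproduced in the text, so they were chosen only to agree with the ordering and counts described above, and the helper names `category` and `category_difference` are illustrative.

```python
# Hypothetical covariate values and assignments for patients 1-8, chosen only
# to agree with the ordering and counts described in the text for Figure 1.
z = {1: 0.95, 2: 0.10, 3: 0.80, 4: 0.62, 5: 0.88, 6: 0.40, 7: 0.30, 8: 0.70}
t = {1: "B", 2: "B", 3: "B", 4: "A", 5: "B", 6: "A", 7: "A", 8: "A"}
z_new = 0.55  # hypothetical covariate value of the 9th patient


def category(value, m):
    """Index of the equal-length category of [0, 1] that contains value."""
    return min(int(value * m), m - 1)


def category_difference(m, tentative):
    """|N_A - N_B| in the new patient's category under a tentative assignment."""
    diff = 1 if tentative == "A" else -1
    for i, zi in z.items():
        if category(zi, m) == category(z_new, m):
            diff += 1 if t[i] == "A" else -1
    return abs(diff)


for m in (1, 2, 4):
    print(m, category_difference(m, "A"), category_difference(m, "B"))
```

With these assumed values the differences under tentative assignment to A and B are 1 and 1 for $m = 1$ (a tie), 0 and 2 for $m = 2$ (A favored), and 3 and 1 for $m = 4$ (B favored), matching the three cases in the text.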
In light of the above discussion, we propose a new imbalance measure, which is defined as the maximum absolute assignment difference when all possible ways of

Procedure
where $k = A$ or $B$. Thus, $D^{k}_{n+1}$ is the potential maximum absolute difference that would result if the new patient were assigned to treatment $k$. Note that the interval $I$ in (3.2) needs to contain the new covariate value $Z_{n+1}$, because the difference over any other interval is not affected by the arrival of $Z_{n+1}$ and is therefore not of interest. Then, assign the $(n+1)$th patient to treatment A with probability $q$ if $D^{A}_{n+1} > D^{B}_{n+1}$, $p$ if $D^{A}_{n+1} < D^{B}_{n+1}$, and $1/2$ if $D^{A}_{n+1} = D^{B}_{n+1}$, where $0 < q < p < 1$ and $p + q = 1$.
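A minimal brute-force sketch of this rule is given here for illustration. It assumes distinct covariate values (no ties) and uses hypothetical data arranged to match the ordering described for Figure 1; the function names `tentative_max_imbalance` and `prob_assign_a` are not from the paper.

```python
def tentative_max_imbalance(z, t, z_new, k):
    """Brute-force D^k_{n+1}: max |N_A - N_B| over intervals containing z_new.

    Only runs of consecutive ordered points matter, so every pair of
    endpoints around the new patient's position is examined.
    """
    points = sorted(zip(z + [z_new], t + [k]))
    signs = [1 if trt == "A" else -1 for _, trt in points]
    pos = [value for value, _ in points].index(z_new)
    return max(
        abs(sum(signs[lo:hi + 1]))
        for lo in range(pos + 1)
        for hi in range(pos, len(signs))
    )


def prob_assign_a(d_a, d_b, p=2 / 3):
    """Probability of assigning treatment A under the biased-coin rule."""
    if d_a > d_b:
        return 1 - p  # this is q
    if d_a < d_b:
        return p
    return 0.5


# Hypothetical values for patients 1-8 (Figure 1 itself is not available).
z = [0.95, 0.10, 0.80, 0.62, 0.88, 0.40, 0.30, 0.70]
t = ["B", "B", "B", "A", "B", "A", "A", "A"]
d_a = tentative_max_imbalance(z, t, 0.55, "A")
d_b = tentative_max_imbalance(z, t, 0.55, "B")
print(d_a, d_b, prob_assign_a(d_a, d_b))
```

For these assumed values the sketch gives a maximum imbalance of 5 under tentative assignment to A and 3 under B, so treatment A receives the smaller probability $q = 1 - p$.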
We now show how the 9th patient is randomized according to the new procedure. The critical part lies in the calculation of $D^{A}_{9}$ and $D^{B}_{9}$. For the former, that is, when the 9th patient is tentatively assigned to treatment A (see the upper panel in Figure 1), the maximum absolute difference is 5, attained over intervals that exclusively contain $Z_7$, $Z_6$, $Z_9$, $Z_4$, and $Z_8$; that is, 5 patients over these intervals are assigned to A and 0 to B, so $D^{A}_{9} = 5$.
We would like to point out that in order to calculate the imbalance $D^{k}_{n+1}$, which is defined as a supremum, it is sufficient to examine intervals whose endpoints belong to the set $\{Z_1, \ldots, Z_{n+1}\}$. Hence the total number of such intervals is of order $O(n^2)$. Moreover, the difference of patient numbers over any of these intervals depends only on the ranks of $\{Z_1, \ldots, Z_{n+1}\}$, whose joint distribution places equal probability $1/(n+1)!$ on every permutation of $1, 2, \ldots, n+1$. To support this argument, we can reexamine the upper panel in Figure 1: for any interval $I$ with endpoints $a$ and $b$, where $Z_2 < a \le Z_7$ and $Z_5 \le b \le Z_1$, the difference of patient numbers over $I$ is exactly the same as that over the interval $[Z_7, Z_5]$, which is $3 = 5 - 2$; furthermore, so long as the relative positions of $\{Z_1, \ldots, Z_{n+1}\}$ remain the same, this difference of 3 does not change, nor does $D^{k}_{n+1}$. Therefore, we conclude that the new procedure is free of the underlying distribution $F$.
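The rank argument can be illustrated numerically. The sketch below uses hypothetical covariate values consistent with the ordering described for Figure 1 and an arbitrary strictly increasing transformation (`stretch`, chosen only for illustration); it checks that the ordered treatment sequence, and hence the imbalance, is unchanged.

```python
import math


def treatment_order(z, t, z_new, k):
    """Treatments listed by increasing covariate value, plus the new patient's position."""
    points = sorted(zip(z + [z_new], t + [k], range(len(z) + 1)))
    order = [trt for _, trt, _ in points]
    pos = [idx for _, _, idx in points].index(len(z))
    return order, pos


def max_imbalance_from_order(order, pos):
    """Max |N_A - N_B| over runs of consecutive ranks that include position pos."""
    signs = [1 if trt == "A" else -1 for trt in order]
    return max(
        abs(sum(signs[lo:hi + 1]))
        for lo in range(pos + 1)
        for hi in range(pos, len(signs))
    )


def stretch(x):
    """An arbitrary strictly increasing transformation (illustrative only)."""
    return math.exp(5 * x)


# Hypothetical values for patients 1-8 and the 9th patient.
z = [0.95, 0.10, 0.80, 0.62, 0.88, 0.40, 0.30, 0.70]
t = ["B", "B", "B", "A", "B", "A", "A", "A"]
z_new = 0.55

original = treatment_order(z, t, z_new, "A")
transformed = treatment_order([stretch(v) for v in z], t, stretch(z_new), "A")
print(original == transformed, max_imbalance_from_order(*original))
```

Because only the ordered sequence of treatments enters the calculation, any strictly increasing transformation of the covariate leaves the measure unchanged, which is the sense in which the procedure is distribution-free.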
In fact, the computation time of $D^{k}_{n+1}$ can be reduced by examining an even smaller number of intervals, namely $n + 2$ instead of $O(n^2)$. Before demonstrating this, we need some more notation and definitions. Since the new procedure is distribution-free, simply assume that the covariate $Z$ follows the uniform distribution on $[0, 1]$. Suppose $\mathbf{Z}_{n+1}$ and $\mathbf{T}_n$ have been observed. $\Delta(I, \mathbf{Z}_n, \mathbf{T}_n)$ in (3.1), defined as the difference of patient numbers in groups A and B over the interval $I$, will simply be written as $\Delta_n(I)$. For ease of presentation, let $Z_L = 0$ and $Z_R = 1$. Define two sets $S_L$ and $S_R$ as follows. For $S_L$, $Z_i$ is any point from the set $\{0, Z_1, \ldots, Z_n\}$ that is to the left of or equal to $Z_{n+1}$, and $[Z_i, Z_{n+1})$ is a left-closed and right-open interval. For instance, in Figure 1, $n = 8$ and $S_L = \{0, Z_2, Z_7, Z_6\}$ from left to right. The interpretation of $Z_j$ and the interval from $Z_{n+1}$ to $Z_j$ for $S_R$ is similar.

The proof of Proposition 3.1 is given in the Appendix. Proposition 3.1, together with the definitions of $S_L$ and $S_R$, suggests the following.
(2) For two consecutive intervals $[Z_i, Z_{n+1})$ and $[Z_{i'}, Z_{n+1})$ in $S_L$, where $Z_i < Z_{i'}$, $i \neq L$, and $i' \neq L$, we have $\Delta_n([Z_i, Z_{n+1})) = \Delta_n([Z_{i'}, Z_{n+1})) \pm 1$, depending on the assignment (A or B) of the patient at $Z_i$; the same argument applies to intervals in $S_R$.
The above two observations form the basis of the algorithm that was developed for the new procedure in the simulation studies (Section 4). We found that the computation time of the new procedure was even less than that of the discretization methods.
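One possible realization of such an algorithm is sketched here; it is not the authors' code, and the statement of Proposition 3.1 is not reproduced in this text. The sketch assumes distinct covariate values and relies on two facts: the running difference changes by ±1 each time an interval is extended by one patient, and the maximum of $|c + l + r|$ over independent choices of $l$ and $r$ is attained at their joint maxima or joint minima. The name `max_imbalance_fast` is illustrative.

```python
def max_imbalance_fast(z, t, z_new, k):
    """D^k_{n+1} from one scan to each side of z_new (n + 2 partial sums in total)."""
    sign = {"A": 1, "B": -1}
    left = sorted(((zi, ti) for zi, ti in zip(z, t) if zi < z_new), reverse=True)
    right = sorted((zi, ti) for zi, ti in zip(z, t) if zi > z_new)

    def extremes(side):
        # Running difference N_A - N_B as the interval grows away from z_new,
        # starting from the empty extension (difference 0).
        run = highest = lowest = 0
        for _, trt in side:
            run += sign[trt]
            highest, lowest = max(highest, run), min(lowest, run)
        return highest, lowest

    left_max, left_min = extremes(left)
    right_max, right_min = extremes(right)
    c = sign[k]  # contribution of the new patient under the tentative assignment
    return max(abs(c + left_max + right_max), abs(c + left_min + right_min))


# Hypothetical values consistent with the ordering described for Figure 1.
z = [0.95, 0.10, 0.80, 0.62, 0.88, 0.40, 0.30, 0.70]
t = ["B", "B", "B", "A", "B", "A", "A", "A"]
print(max_imbalance_fast(z, t, 0.55, "A"), max_imbalance_fast(z, t, 0.55, "B"))
```

The sorting step here costs $O(n \log n)$; keeping the patients in a sorted structure between arrivals would leave only the two linear scans. For the assumed data the function returns 5 and 3 for tentative assignment to A and B.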

Simulation Studies
Suppose $N = 60$ patients are enrolled. As mentioned in Section 1, for continuous covariates it is desirable to keep the treatment groups similar in two respects: the group sizes and the distributions of the covariates. Therefore, the new procedure was first compared with several other procedures in terms of the following two criteria: (1) the mean absolute difference $E|N_A(N) - N_B(N)|$ of the total patient numbers in the two groups, shown as $ED_{all}$; (2) the mean Kolmogorov-Smirnov (K-S) distance between the empirical distributions of covariate $Z$ in groups A and B, shown as $ED_{ks}$, which measures the similarity between two distributions. In addition, we used a new criterion: (3) the "maximum imbalance", defined by us as $\sup_{I \subset U} |\Delta(I, \mathbf{Z}_N, \mathbf{T}_N)|$ and shown as $ED_{max}$, which is the maximum absolute difference over all possible intervals after all patients have been assigned to a treatment. We will show that criterion (3) acts as a compromise between criteria (1) and (2).
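For a single completed trial, the three quantities underlying these criteria can be computed in one pass over the ordered patients, as in the following sketch. The function name `balance_criteria` and the example data are hypothetical, and distinct covariate values are assumed; averaging the returned values over simulated trials would give the three reported means.

```python
def balance_criteria(z, t):
    """Return (|N_A - N_B|, K-S distance, maximum imbalance) for a finished trial."""
    order = [trt for _, trt in sorted(zip(z, t))]
    n_a = order.count("A")
    n_b = len(order) - n_a
    seen_a = seen_b = 0
    # The K-S distance is only defined when both groups are non-empty.
    ks = float("nan") if 0 in (n_a, n_b) else 0.0
    highest = lowest = 0
    for trt in order:
        if trt == "A":
            seen_a += 1
        else:
            seen_b += 1
        run = seen_a - seen_b
        highest, lowest = max(highest, run), min(lowest, run)
        if n_a and n_b:
            ks = max(ks, abs(seen_a / n_a - seen_b / n_b))
    # The largest |N_A - N_B| over any interval equals the range of the running
    # difference, because every interval count is a difference of two prefixes.
    return abs(n_a - n_b), ks, highest - lowest


# Hypothetical allocation of 8 patients (not data from the paper).
z = [0.95, 0.10, 0.80, 0.62, 0.88, 0.40, 0.30, 0.70]
t = ["B", "B", "B", "A", "B", "A", "A", "A"]
print(balance_criteria(z, t))
```

In this example the group sizes are equal, yet the K-S distance is 0.75 and the maximum imbalance is 4, which shows how the second and third quantities detect imbalance that the overall difference misses.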
Since the procedures we compared are all distribution-free, the independent covariate values $Z_1, \ldots, Z_N$ were simply generated from Unif(0, 1). All procedures use the strategy of minimization, but each has a different imbalance measure. More specifically, under a given imbalance measure $D$, we calculate $D^A$ or $D^B$, defined as the imbalance that would occur if the new patient were assigned to treatment A or B. Depending on whether $D^A - D^B$ is positive, negative, or zero, the allocation probability toward treatment A is $q$, $p$, or $1/2$, where $0 < q < p < 1$ and $p + q = 1$. In the simulation, we used $p = 2/3$ and $p = 1$, with the latter corresponding to deterministic allocation unless there is a tie.
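The simulation scheme can be sketched as follows. This is an illustrative harness rather than the authors' code: the function names, the seed, and the reduced number of repetitions are assumptions, only two of the imbalance measures are included, and distinct covariate values are assumed.

```python
import random


def size_imbalance(z, t, z_new, k):
    """Efron-type measure: |N_A - N_B| after the tentative assignment."""
    return abs(sum(1 if trt == "A" else -1 for trt in t + [k]))


def max_imbalance(z, t, z_new, k):
    """Maximum |N_A - N_B| over intervals containing z_new."""
    sign = {"A": 1, "B": -1}

    def extremes(side):
        run = highest = lowest = 0
        for _, trt in side:
            run += sign[trt]
            highest, lowest = max(highest, run), min(lowest, run)
        return highest, lowest

    left = sorted(((a, b) for a, b in zip(z, t) if a < z_new), reverse=True)
    right = sorted((a, b) for a, b in zip(z, t) if a > z_new)
    l_max, l_min = extremes(left)
    r_max, r_min = extremes(right)
    return max(abs(sign[k] + l_max + r_max), abs(sign[k] + l_min + r_min))


def simulate_trial(n_patients, measure, p, rng):
    """Sequentially allocate patients by minimization with biased-coin probability p."""
    z, t = [], []
    for _ in range(n_patients):
        z_new = rng.random()  # covariate drawn from Unif(0, 1)
        d_a = measure(z, t, z_new, "A")
        d_b = measure(z, t, z_new, "B")
        prob_a = 0.5 if d_a == d_b else (1 - p if d_a > d_b else p)
        t.append("A" if rng.random() < prob_a else "B")
        z.append(z_new)
    return z, t


def mean_size_difference(measure, p, reps=200, n_patients=60, seed=1):
    """Monte Carlo estimate of E|N_A - N_B| (far fewer repetitions than the paper)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        _, t = simulate_trial(n_patients, measure, p, rng)
        total += abs(t.count("A") - t.count("B"))
    return total / reps


for name, measure in (("EFRON", size_imbalance), ("MAX-IMB", max_imbalance)):
    print(name, mean_size_difference(measure, p=2 / 3))
```

Any other imbalance measure with the same four-argument signature can be passed to `simulate_trial`, which is the design choice that lets all procedures share one allocation rule.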
The following procedures were studied.
The above two methods are rarely used as a way of balancing over a continuous covariate, since each of them is designed to meet only one criterion. In our simulations, they served as two controls to evaluate other procedures.
(3) Discretization (DSCRT). In practice, in order to discretize a continuous covariate $Z$ with cumulative distribution function $F$, the range is often split at the quantiles of $F$ at probabilities $1/m, 2/m, \ldots, (m-1)/m$. This is equivalent to splitting $[0, 1]$ into $m$ intervals of equal length $1/m$ for $Z \sim$ Unif(0, 1).
In our simulations, we tried $m = 2, 4, 8$. Within each category, Efron's design was applied.
The "qualitative" imbalance measure $D_L$ is defined with two weights $w_0$ and $w_1$ placed on its two items, two upper limits $c_0$ and $c_1$, and the indicator function $I$; the "quantitative" imbalance measure $D_T$ is defined separately. The values of $w_0$, $w_1$, $c_0$, and $c_1$ can be chosen freely from a subjective point of view. In the simulations, we fixed $w_0 = w_1 = 1$ and $c_1 = 10\%$, but tried $c_0 = 2, 4, 6$.
The results for $p = 2/3$ and $p = 1$ under 5000 repetitions are shown in Tables 1 and 2.
We first focus on Table 1 ($p = 2/3$). The 1st and 2nd columns suggest that the "best" $ED_{all}$ and the "best" $ED_{ks}$ that can be achieved are 1.28 by EFRON and 0.137 by K-S, respectively, each at the expense of large imbalance under the other criterion. For DSCRT, when $m$ increases from 2 to 4 and 8, $ED_{all}$ increases from 2.17 to 2.94 and 3.76, whereas $ED_{ks}$ decreases from 0.178 to 0.161 and 0.159. Therefore, there is a trade-off between the balance of group sizes and the balance of covariate distributions.
In terms of $ED_{max}$ (the 3rd column), MAX-IMB has the minimum value, 7.38, since it sequentially minimizes this criterion. In contrast, DSCRT minimizes the imbalance of patient numbers over the selected intervals but ignores the imbalance over others. As a result, on average, the maximum imbalance under DSCRT is higher than that under MAX-IMB. In a sense, $ED_{max}$ serves as a tool that detects any allocation imbalance ignored by DSCRT. Since the new procedure examines both "global" imbalance, that is, over the whole range, and "local" imbalance, that is, over any small interval, it can be regarded as a compromise between achieving balance in overall group sizes and achieving balance in covariate distributions.
A similar conclusion can be drawn for $p = 1$ (see Table 2), that is, when the allocation is deterministic except in the case of a tie. From $p = 2/3$ to $p = 1$, the decrease in $ED_{all}$ is most significant under DSCRT, from (2.17, 2.94, 3.76) to (0.49, 0.93, 1.45). This is because when $p = 1$ only the covariate values are random, and so long as the numbers of patients over the selected intervals are even (e.g., 34 over $[0, 0.5)$ and 26 over $[0.5, 1]$ with $m = 2$), DSCRT can always achieve perfect overall balance. Moreover, even if the patient numbers are odd (e.g., 35 over $[0, 0.5)$ and 25 over $[0.5, 1]$), there is still a chance that the allocation differences are 1 and −1, or the reverse, again resulting in perfect overall balance. Other procedures are more complex, and the decrease in $ED_{all}$ is less significant. As a result, when $p = 1$, MAX-IMB is uniformly better only than DSCRT with $m = 8$, not $m = 4$.
We also compared the above procedures by other commonly used measures, including the mean absolute difference of sample means ($ED_{mean}$) and the mean absolute difference of sample standard deviations ($ED_{std}$) of the covariate values in the two treatment groups. Furthermore, Lin and Su [23] introduced another criterion, the area between the empirical cumulative distribution functions of the covariate values in the two treatment groups (normalized by the difference between the maximum and the minimum values), denoted as $ED_{area}$, and pointed out that this criterion performs better than the Kolmogorov-Smirnov distance in capturing the difference between two distributions. We thus included this criterion in the simulation. Since the measurements of mean, standard deviation, and area under a distribution function depend on the underlying distribution of the covariate, we ran simulation studies under a uniform distribution, $Z \sim$ Unif(0, 1), and under a normal distribution, $Z \sim N(0, 1)$, and show the results in Tables 3 and 4, respectively. From Table 3, under the uniform distribution, K-S has the best performance, since its $ED_{mean}$, $ED_{std}$, and $ED_{area}$ (2.29, 1.77, and 5.02) are the lowest among all procedures. This is expected, since K-S solely minimizes the distance between the two distributions. Once the distributions are closest, so are the summary statistics of means and standard deviations as well as the area between the distributions. However, K-S is likely to produce severe imbalance of group sizes, as shown in Tables 1 and 2. EFRON has the worst performance under the three criteria since it completely ignores the covariate distributions.
Among the three choices of $m$ under DSCRT, roughly speaking, $m = 4$ performs best: its $ED_{mean}$ and $ED_{area}$ (3.22 and 5.56) are the lowest, and its $ED_{std}$ (1.92) is slightly higher than that under $m = 8$. Moreover, under these three criteria, DSCRT with $m = 4$ is uniformly better than MAX-IMB, RANK-SUM, and WGT-AVE. However, the good performance of DSCRT with $m = 4$ relies on the correct identification of the quartiles of the true covariate distribution, which may not be feasible before the data are collected. In contrast, the other methods do not require such information.
Comparing MAX-IMB, RANK-SUM, and WGT-AVE, RANK-SUM has the highest values under the three criteria; MAX-IMB has performance comparable to WGT-AVE with $c_0 = 4$, with the former having slightly higher $ED_{mean}$ and the latter slightly higher $ED_{std}$ and $ED_{area}$. A similar conclusion can be reached for the normal distribution (see Table 4). The results for $p = 1$ under the three criteria $ED_{mean}$, $ED_{std}$, and $ED_{area}$ resemble those for $p = 2/3$, the only difference being that the best choice of $m$ under DSCRT is $m = 8$ instead of $m = 4$. In fact, we also ran simulations under different sample sizes ($N = 30$ and 150), and the results are quite consistent.

Discussion and Conclusions
In this paper, we propose a new minimization procedure that balances treatment allocation over continuous covariates. For any new patient, it examines the imbalances in the neighborhoods of his or her covariate value and biases the allocation probability towards the treatment that would result in a smaller value of the maximum imbalance. The new method depends only on the ranks of the covariates and is therefore distribution-free. Our simulation studies have shown that it is able to maintain relatively good balance in terms of group sizes and covariate distributions across treatment groups.
In addition, the new procedure does not require the specification of any critical values, which are usually needed by discretization methods in order to define categories. For the latter methods, if quantiles of the covariate distribution $F$ are used as the critical values, then lack of knowledge about $F$ may lead to wrong guesses of the quantiles. The new procedure avoids this step by considering all possible divisions of the range. Nevertheless, only the assignment differences over $n + 2$ intervals have to be examined to calculate the new imbalance measure, and the corresponding algorithm is computationally efficient.
Borrowing the idea of Pocock and Simon's design [7], our method can easily be generalized to two or more continuous covariates or a mix of discrete and continuous covariates. Suppose that, for a total of $L$ covariates $Z_1, \ldots, Z_L$, the first $L_1$ are continuous and the rest are discrete. When the $(n+1)$th patient is enrolled, for any continuous covariate $Z_i$, $i = 1, \ldots, L_1$, we define $D^{k}_{n+1,i}$, $k = A, B$, by (3.2), which is the maximum imbalance measure with respect to the $i$th covariate; for any discrete covariate $Z_j$, $j = L_1 + 1, \ldots, L$, observe the category the new patient belongs to, tentatively assign him or her to treatment $k$, and define $D^{k}_{n+1,j}$ as the absolute difference of patient numbers in the two treatment groups with respect to that specific category. For example, if the $j$th covariate is gender and the new patient is male, then $D^{k}_{n+1,j}$ is calculated among all males. Define $D^{k}_{n+1,mix}$ as in (5.1), where the $w_i$'s and $w_j$'s are the weights placed on the covariates and can be assigned according to the importance of the different covariates. Depending on whether $D^{A}_{n+1,mix}$ is greater than, less than, or equal to $D^{B}_{n+1,mix}$, assign the $(n+1)$th patient to treatment A with probability $q$, $p$, or $1/2$. From (5.1), it is seen that $D^{k}_{n+1,mix}$ is similar to Pocock and Simon's weighted average of marginal imbalances. The only difference is that for the continuous covariates we redefine the marginal imbalances by the new measure proposed in the current paper, so that the negative effect caused by discretization can be mitigated. Since the marginal imbalances for discrete covariates in $D^{k}_{n+1,mix}$ remain the same as in Pocock and Simon's design, it is expected that the good balance properties for discrete covariates in their design are preserved when $D^{k}_{n+1,mix}$ is used in the minimization. We ran simulations for the case of two continuous covariates, and the new procedure again showed improvement over other procedures.
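A minimal sketch of such a mixed measure follows. Equation (5.1) is not reproduced in this text, so the plain weighted sum used here is an assumption suggested by the comparison with Pocock and Simon's weighted average; the function names, weights, and data are hypothetical, and distinct continuous covariate values are assumed.

```python
def continuous_imbalance(z, t, z_new, k):
    """Max |N_A - N_B| over intervals containing z_new (brute force)."""
    points = sorted(zip(z + [z_new], t + [k]))
    signs = [1 if trt == "A" else -1 for _, trt in points]
    pos = [value for value, _ in points].index(z_new)
    return max(
        abs(sum(signs[lo:hi + 1]))
        for lo in range(pos + 1)
        for hi in range(pos, len(signs))
    )


def discrete_imbalance(x, t, x_new, k):
    """|N_A - N_B| among patients in the new patient's category."""
    diff = 1 if k == "A" else -1
    for xi, trt in zip(x, t):
        if xi == x_new:
            diff += 1 if trt == "A" else -1
    return abs(diff)


def mixed_imbalance(cont, disc, t, new_cont, new_disc, k, w_cont, w_disc):
    """Assumed form of the mixed measure: weighted sum of marginal imbalances."""
    total = 0.0
    for z, z_new, w in zip(cont, new_cont, w_cont):
        total += w * continuous_imbalance(z, t, z_new, k)
    for x, x_new, w in zip(disc, new_disc, w_disc):
        total += w * discrete_imbalance(x, t, x_new, k)
    return total


# Hypothetical data: one continuous covariate and gender for 8 patients.
z = [0.95, 0.10, 0.80, 0.62, 0.88, 0.40, 0.30, 0.70]
gender = ["M", "F", "F", "M", "F", "M", "M", "F"]
t = ["B", "B", "B", "A", "B", "A", "A", "A"]
for k in "AB":
    print(k, mixed_imbalance([z], [gender], t, [0.55], ["M"], k, [1.0], [1.0]))
```

With equal weights and these assumed data, the mixed imbalance is 8.0 under tentative assignment to A (5 from the continuous covariate plus 3 among males) and 4.0 under B, so treatment B would be favored.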
In case an unequal allocation ratio such as $r_A : r_B$ is desired, one can generalize the proposed metric by redefining $\Delta(I, \mathbf{Z}_n, \mathbf{T}_n)$ in (3.1) accordingly. By doing so, it can be ensured that the allocation ratio over any interval is close to $r_A : r_B$.
In practice, one can also modify the maximum imbalance measure by adding weights to different intervals. The weight of each interval can be assigned as a function of the number of patients within the interval, so that the procedure remains distribution-free. However, the algorithm to implement such a procedure will be more complicated. We leave these as future research topics.
Now for any fixed $i$, apply (A.1) to the constant $\Delta_n([Z_i, Z_{n+1})) + 1$ and the sequence of $\Delta_n$ over the intervals from $Z_{n+1}$ to $Z_j$, $j = R, 1, \ldots, n$, and we obtain an expression of the form $\max\{\mathrm{I}, \mathrm{II}\}$ (A.4). Applying (A.1) again to I, with constant $C_{R1} + 1$ and sequence $\Delta_n([Z_i, Z_{n+1}))$, and similarly to II, yields the stated result. The derivation of $D^{B}_{n+1}$ is similar.

Figure 1: Two allocation sequences for the same covariate values.

Since $D^{A}_{9} > D^{B}_{9}$, with probability $p > 1/2$ the 9th patient will be assigned to treatment B.
(1) The calculation of $D^{k}_{n+1}$ requires the examination of $n + 2$ intervals, instead of $O(n^2)$.

(1) Efron's design (EFRON) [22]. Let $N_A$ and $N_B$ be the patient numbers in the two groups. Define the imbalance measure $D$ as $|N_A - N_B|$. This method focuses solely on the balance of patient numbers.
(2) Kolmogorov-Smirnov measure (K-S). Let $F_k$ be the empirical distribution function of the covariate values in group $k$, $k = A, B$. Define the imbalance measure $D$ as the Kolmogorov-Smirnov distance between $F_A$ and $F_B$. This method focuses solely on the balance of distributions.

Therefore, let $D^{k}_{L}$ be the qualitative imbalance resulting from the tentative assignment of treatment $k$ to the new patient, $k = A, B$. If $D^{A}_{L} - D^{B}_{L}$ is positive (negative), the allocation probability toward treatment A will be $q$ ($p$); if $D^{A}_{L} - D^{B}_{L} = 0$, use measure $D_T$ to determine the probability $p$, $q$, or $1/2$.
Consider a sequential trial with two treatments A and B (control and test). For the first $n$ patients that have arrived, let $\mathbf{T}_n = (T_1, \ldots, T_n)$ be the allocation sequence, with $T_i = A$ if the $i$th patient is assigned to treatment A and $T_i = B$ otherwise. Suppose we need to balance treatment allocation over a single continuous covariate $Z$ with cumulative distribution function $F$, whose range $U$ is an interval on the real line. The two endpoints of $U$ can be finite or infinite. Let $\mathbf{Z}_n = (Z_1, \ldots, Z_n)$, where $Z_i$ is the covariate value of the $i$th patient. Suppose $Z_1, \ldots, Z_n$ are independent. Define
$$\Delta(I, \mathbf{Z}_n, \mathbf{T}_n) = N_A(I, \mathbf{Z}_n, \mathbf{T}_n) - N_B(I, \mathbf{Z}_n, \mathbf{T}_n) \tag{3.1}$$
as the difference of patient numbers in treatment groups A and B over the interval $I \subset U$, given the covariate information $\mathbf{Z}_n$ and allocation sequence $\mathbf{T}_n$. Suppose the first $n$ patients have been randomized and the $(n+1)$th patient has just arrived, that is, $\mathbf{Z}_{n+1}$ and $\mathbf{T}_n$ have been observed.

Table 1: Comparison of $ED_{all}$, $ED_{ks}$, and $ED_{max}$ for $p = 2/3$.
Stigsby and Taves' rank sum (RANK-SUM) [16]. Let $R_1, \ldots, R_N$ be the ranks of $Z_1, \ldots, Z_N$. Suppose patients $i_1, \ldots, i_{N_1}$ are in group A and patients $j_1, \ldots, j_{N_2}$ are in group B. The imbalance measure $D$ is defined in terms of these ranks.

Table 2: Comparison of $ED_{all}$, $ED_{ks}$, and $ED_{max}$ for $p = 1$.
A similar trend can be observed for WGT-AVE when $c_0$ increases from 2 to 4 and 6. In terms of these two criteria, the new procedure (MAX-IMB), with $ED_{all} = 2.36$ and $ED_{ks} = 0.159$, performs better than DSCRT with $m = 4$ and $m = 8$ and WGT-AVE with $c_0 = 4$ and $c_0 = 6$; it also has lower $ED_{all}$ and $ED_{ks}$ than RANK-SUM.