Maximin Efficiencies under Treatment-Dependent Costs and Outcome Variances for Parallel, AA/BB, and AB/BA Designs

If there are no carryover effects, AB/BA crossover designs are more efficient than parallel (A/B) and extended parallel (AA/BB) group designs. This study extends these results in that (a) optimal instead of equal treatment allocation is examined, (b) allowance for treatment-dependent outcome variances is made, and (c) next to treatment effects, also treatment by period interaction effects are examined. Starting from a linear mixed model analysis, the optimal allocation requires knowledge on intraclass correlations in A and B, which typically is rather vague. To solve this, maximin versions of the designs are derived, which guarantee a power level across plausible ranges of the intraclass correlations at the lowest research costs. For the treatment effect, an extensive numerical evaluation shows that if the treatment costs of A and B are equal, or if the sum of the costs of one treatment and measurement per person is less than the remaining subject-specific costs (e.g., recruitment costs), the maximin crossover design is most efficient for ranges of intraclass correlations starting at 0.15 or higher. For other cost scenarios, the maximin parallel or extended parallel design can also become most efficient. For the treatment by period interaction, the maximin AA/BB design can be proven to be the most efficient. A simulation study supports these asymptotic results for small samples.


Introduction
The standard design of a randomized clinical trial is the parallel group design: subjects are randomly assigned to one of two treatments, say A or B. An alternative, well-known design is the AB/BA crossover trial, in which subjects receive both treatments, A and B, but the sequencing of the treatments is opposite for two randomly allocated groups [1,2]. An AB/BA crossover trial is considered most suited when examining treatments for chronic or ongoing diseases, such as rheumatism, chronic obstructive pulmonary disease, or (frequent) heartburn. In these cases, there is no real possibility that the disease gets cured, and the aim is to moderate the effects of the disease [2]. A third design that we will consider involves treatment sequences AA and BB. This design extends the parallel design across two treatment periods, allows for testing treatment by time interaction effects, and is a realistic alternative for the AB/BA design in case the treatment regime should continue.
If the outcome variable is continuous and (approximately) normally distributed, the data can be analyzed by mixed effects regression [3]. Of primary interest is testing the treatment effect of, for instance, a new medication for chronic obstructive pulmonary disease. A relevant issue then is which design is the most efficient in estimating the treatment effect, thereby yielding maximum power for testing this effect. Such optimality has already been examined when comparing crossover and parallel designs [2] and when comparing all three designs introduced before [4,5]. If there are no carryover effects and no dropouts, and the sample sizes are equal and equally allocated to the treatments, an AB/BA design yields more efficient estimates of the treatment effect than a parallel and an extended parallel design and, consequently, will yield more power to test this effect. The present study extends results on the relative efficiencies of these designs in that (a) optimal instead of equal treatment allocation is examined, (b) allowance is made for treatment-dependent outcome variances, and (c) next to treatment effects, also treatment by period interaction effects are examined.
Outcome variances may differ between treatments [6,7]. This also is to be expected if treatments differ in terms of their effectiveness. Furthermore, since research costs and outcome variances may differ between treatments, equal allocation to treatments may not be the most efficient. The issue then is how to allocate subjects to treatments such that a design's efficiency is optimized, and how different designs relate in terms of efficiency under such optimal allocation. Optimal allocation requires a priori knowledge on parameters of the analysis model, that is, intraclass correlations for the mixed effects model that we consider. Since this knowledge typically is rather vague, optimal allocations and corresponding efficiencies for maximin versions of the (extended) parallel design and crossover design will be derived. These maximin designs guarantee a power level across plausible ranges of the intraclass correlations at the lowest research costs.
In designs where treatments are successively given to the same group of subjects, carryover may occur. For the AB/BA trial, it may be that in the AB sequence treatment A still has an effect on the outcome when B has been given and the second measurement is done. When, in the BA sequence, the effect of B is still present once A has been administered, and this effect differs from the carryover effect for the AB sequence, differential carryover occurs. The present study assumes that differential carryover can be safely excluded or is negligible and that this effect does not need to be estimated in analyzing the data. The paper is structured as follows. Section 2 will present the linear mixed model for analyzing data from each of the three designs. Section 3 will introduce the efficiency criterion and will provide asymptotic expressions for this criterion in the case of maximum likelihood estimation of the treatment effect. Starting from a flexible cost function, optimal allocations to treatments will be derived as well as resulting design efficiencies. Since the efficiencies depend on the intraclass correlations and knowledge on these parameters is often limited, in Section 4, we will derive maximin designs. Section 5 will show to what extent the asymptotic efficiencies translate into desired power levels for small sample sizes. Section 6 will give an application of the results, and Section 7 will discuss some issues for further research.

Linear Mixed Effects Models
In the case of a parallel design, an extended parallel design, and a crossover design, the subjects are randomly allocated to one of the two arms. In a parallel design, they are allocated to treatment A or treatment B; in an extended parallel design, to treatment sequence AA or BB; and in a crossover trial, to treatment sequence AB or BA. We consider a quantitative outcome variable, denoted as $y_{ij}$ for person $j$ ($j = 1, \ldots, N$) at measurement occasion $i$, and assume $y_{ij}$ is (approximately) normally distributed.
For a parallel design and outcome variances that differ between treatments A and B, simple linear regression with heterogeneous variances may be an adequate tool for data analysis:

$$y_{ij} = \beta_0 + \beta_1\,\text{treatment}_{ij} + (1 - \text{treatment}_{ij})\,\delta^A_{ij} + \text{treatment}_{ij}\,\delta^B_{ij}, \tag{1}$$

where treatment is coded 0 for persons having treatment A and coded 1 for persons having treatment B, and $\delta^A_{ij}$ and $\delta^B_{ij}$ are normally distributed, with mean 0 and variances $\sigma^2_A$ and $\sigma^2_B$, respectively. The random terms $\delta^A_{ij}$ and $\delta^B_{ij}$ can be thought of as consisting of a random person (between-subject) effect, $u_{0j}$, and a treatment-dependent random error (within-subject) effect, $\varepsilon^A_{ij}$ and $\varepsilon^B_{ij}$. In formula, $\delta^A_{ij} = u_{0j} + \varepsilon^A_{ij}$ and $\delta^B_{ij} = u_{0j} + \varepsilon^B_{ij}$. These two sources of random variation cannot be separated in a single-period parallel trial.
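For a crossover AB/BA design and an extended, two-period, parallel AA/BB design, however, the variances of $u_{0j}$ and of $\varepsilon^A_{ij}$ and $\varepsilon^B_{ij}$ can be identified. The linear regression model can then be extended with a random intercept as well as a fixed effect of time, yielding the following mixed effects model:

$$y_{ij} = \beta_0 + \beta_1\,\text{treatment}_{ij} + \beta_2\,\text{time}_{ij} + u_{0j} + (1 - \text{treatment}_{ij})\,\varepsilon^A_{ij} + \text{treatment}_{ij}\,\varepsilon^B_{ij}. \tag{2}$$

In (2), time is coded 0 for observations at the first measurement and coded 1 for observations at the second measurement.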
The random terms $u_{0j}$, $\varepsilon^A_{ij}$, and $\varepsilon^B_{ij}$ are independently normally distributed, with mean 0 and variances $\sigma^2_0$, $\sigma^2_{\varepsilon A}$, and $\sigma^2_{\varepsilon B}$, respectively. Their relation with the variances in (1) is $\sigma^2_A = \sigma^2_0 + \sigma^2_{\varepsilon A}$ for treatment A and $\sigma^2_B = \sigma^2_0 + \sigma^2_{\varepsilon B}$ for treatment B. In the case we want to examine whether there is an interaction between treatment and period, the model in (2) is extended as follows:

$$y_{ij} = \beta_0 + \beta_1\,\text{treatment}_{ij} + \beta_2\,\text{time}_{ij} + \beta_3\,\text{treatment}_{ij} \times \text{time}_{ij} + u_{0j} + (1 - \text{treatment}_{ij})\,\varepsilon^A_{ij} + \text{treatment}_{ij}\,\varepsilon^B_{ij}, \tag{3}$$

where $\beta_3$ represents the treatment by period interaction effect. The parameters in (1)-(3) can be estimated through maximum likelihood (ML). In what follows, we are interested in optimally estimating $\beta_1$ in (1) and (2), which will be denoted as $\beta_{\text{treat}}$, and in optimally estimating $\beta_3$ in (3), which will be denoted as $\beta_{\text{treat}\times\text{time}}$.

A relevant concept is the intraclass correlation, which is the between-subject variation on the outcome as compared to the total outcome variation. For the models in (2) and (3), this can be expressed as $\rho_A = \sigma^2_0/(\sigma^2_0 + \sigma^2_{\varepsilon A})$ and $\rho_B = \sigma^2_0/(\sigma^2_0 + \sigma^2_{\varepsilon B})$ for treatments A and B, respectively. The larger the person (between-subject) variance as compared to the error (within-subject) variance, the larger the intraclass correlations. Note that we assume a common between-subject variance, but allow for treatment-dependent within-subject variances, leading to treatment-dependent within-subject correlations. We also define a variance ratio $\phi = (\sigma^2_0 + \sigma^2_{\varepsilon A})/(\sigma^2_0 + \sigma^2_{\varepsilon B}) = \sigma^2_A/\sigma^2_B$, which can be expressed as a ratio of the intraclass correlations, $\phi = \rho_B/\rho_A$.
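As a quick numerical check of these definitions, the following minimal sketch computes the two intraclass correlations and the variance ratio from the three variance components and confirms that the variance ratio equals $\rho_B/\rho_A$. The function name and the numeric variance components are illustrative assumptions and do not come from the paper.

```python
def intraclass_summary(var_subject, var_error_a, var_error_b):
    """Return (rho_A, rho_B, phi) for a common between-subject variance
    and treatment-dependent within-subject (error) variances."""
    var_a = var_subject + var_error_a   # total outcome variance under treatment A
    var_b = var_subject + var_error_b   # total outcome variance under treatment B
    rho_a = var_subject / var_a         # intraclass correlation for A
    rho_b = var_subject / var_b         # intraclass correlation for B
    phi = var_a / var_b                 # variance ratio sigma^2_A / sigma^2_B
    return rho_a, rho_b, phi


# Hypothetical variance components, chosen only for illustration.
rho_a, rho_b, phi = intraclass_summary(var_subject=1.0, var_error_a=9.0, var_error_b=3.0)
print(rho_a, rho_b, phi, rho_b / rho_a)
```

With these inputs the larger error variance under A gives the smaller intraclass correlation, and the last two printed values agree, as the identity requires.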

Optimal Allocations and Corresponding Design Efficiencies
Let $\mathrm{Var}(\hat{\beta}_x \mid \xi)$ denote the variance of the estimator of the treatment effect $\beta_1$ in (1) or (2) or the variance of the estimator of the treatment by period interaction effect $\beta_3$ in (3), given a design $\xi$. The efficiency of an estimator of $\beta_x$ is defined as the inverse of its variance, that is, $(\mathrm{Var}(\hat{\beta}_x \mid \xi))^{-1}$. In the sequel, we will consider the efficiency of one design, $\xi_1$, versus another design, $\xi_2$, which is defined as $\mathrm{Var}(\hat{\beta}_x \mid \xi_2)/\mathrm{Var}(\hat{\beta}_x \mid \xi_1)$ and denoted as the relative efficiency. Since no closed-form expressions are available for the variances of the maximum likelihood (ML) estimator, asymptotic variances of the ML estimator were derived (Appendices A.1 and A.2). The optimal allocation to treatments minimizes the variance of the estimator of $\beta_{\text{treat}}$ in (1) or (2) and of $\beta_{\text{treat}\times\text{time}}$ in (3), given a fixed research budget. Note that changing the coding of the treatment factor or the time factor in (1)-(3), for instance into 1 versus −1 instead of 1 versus 0, will not affect the optimal allocation. Such a change of coding leads to a linear transformation of $\beta_{\text{treat}}$ or $\beta_{\text{treat}\times\text{time}}$, and this will change the variance of their estimators only by a multiplicative constant. This implies that allocations that minimize the variance of the estimators do not depend on the coding of treatment and time.
To derive the optimal allocations under a budget restriction, we need to define a budget function. Let the costs involved with each subject in the parallel design be $c_{sp}$ euros, in an extended parallel design be $c_{sep}$ euros, and in a crossover design be $c_{sc}$ euros. These costs may represent financial rewards given to subjects for participating in the trial but also the (average) costs of recruiting a subject. Furthermore, for treatments A and B there are, for each subject, costs $c_A$ and $c_B$, respectively, and each measurement may involve $c_t$ euros. Finally, attached to each treatment sequence, there may be administration costs $c_{ts}$.
In the case of allocation proportions $p_A$ for treatment A and $p_B = 1 - p_A$ for treatment B in a parallel design having $n_p$ subjects, the following budget $C^*$ is required:

$$C^* = n_p\,(c_{sp} + c_t + p_A c_A + p_B c_B) + 2c_{ts}. \tag{4}$$

For the designs that we consider, this budget function can be reparametrized such that it is the same as the cost function given by Yuan and Zhou [8], thereby generalizing the cost function proposed by Brown [9] and Berger and Wong [4].
For an AB/BA crossover design, involving $n_c$ subjects and allocation proportions $p_{AB}$ for treatment sequence AB and $p_{BA} = 1 - p_{AB}$ for treatment sequence BA, noting that each subject receives both treatment A and B and is measured twice, the following budget is required:

$$C^* = n_c\,(c_{sc} + c_A + c_B + 2c_t) + 2c_{ts}. \tag{5}$$

Finally, the required budget for an AA/BB design, involving $n_{ep}$ subjects, with allocation proportions $p_{AA}$ and $p_{BB} = 1 - p_{AA}$ for the treatment sequences AA and BB, respectively, is as follows:

$$C^* = n_{ep}\,(c_{sep} + 2c_t + 2p_{AA} c_A + 2p_{BB} c_B) + 2c_{ts}. \tag{6}$$

Note that, for the functions in (4)-(6), the budget may simply be the total number of observations involved in a study, by setting $c_t = 1$ and the other costs to 0. It can also represent the total number of subjects involved, by setting $c_{sp} = c_{sep} = c_{sc} = 1$ and the remaining costs to 0.
In what follows, we will assume that the subject-specific costs of the two-period designs are the same; that is, $c_{sep} = c_{sc} = c_{s\_2p}$. Since subjects in these designs receive two treatments and a washout period may be involved, these costs are very likely larger than those of a parallel design. We also assume that the subject-specific costs for the two-period designs will not exceed 2 times the subject-specific costs for the parallel design, so that $c_{sp} \le c_{s\_2p} \le 2c_{sp}$. Finally, since each design involves two treatment sequences, the administration costs are the same for each of the three designs considered, and thus, the budgets that are available for remaining costs are identical; that is, the budget $C = C^* - 2c_{ts}$ is the same for each design.
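The cost bookkeeping can be sketched in a few lines of Python. The sketch assumes that each budget is additive: a per-subject cost (subject-specific cost, treatment costs, and measurement costs) times the number of subjects, plus the administration cost of the two treatment sequences. It also assumes that a subject in the AA/BB design incurs the treatment cost in both periods. The function and argument names are hypothetical.

```python
def budget_parallel(n, p_a, c_sp, c_a, c_b, c_t, c_ts=0.0):
    """Parallel design: every subject receives one treatment and one measurement."""
    per_subject = c_sp + c_t + p_a * c_a + (1.0 - p_a) * c_b
    return n * per_subject + 2.0 * c_ts


def budget_crossover(n, c_sc, c_a, c_b, c_t, c_ts=0.0):
    """AB/BA design: every subject receives both treatments and two measurements,
    so the budget does not depend on the allocation to AB versus BA."""
    return n * (c_sc + c_a + c_b + 2.0 * c_t) + 2.0 * c_ts


def budget_extended_parallel(n, p_aa, c_sep, c_a, c_b, c_t, c_ts=0.0):
    """AA/BB design: every subject receives the same treatment in two periods
    (assumed to cost twice the single-period treatment cost) and two measurements."""
    per_subject = c_sep + 2.0 * c_t + 2.0 * (p_aa * c_a + (1.0 - p_aa) * c_b)
    return n * per_subject + 2.0 * c_ts


# Special case from the text: c_t = 1 and all other costs 0 counts observations.
print(budget_parallel(40, 0.5, 0, 0, 0, 1), budget_crossover(40, 0, 0, 0, 1))
```

Setting the subject-specific cost to 1 and all other costs to 0 makes each function return the number of subjects, which matches the second special case mentioned in the text.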

Treatment Effect.
For treatment effect estimation, the optimal allocations to the treatment sequences are derived in Appendix B.
The optimal allocations and corresponding (asymptotic) variances of the treatment effect estimators are shown in the second and third columns of Table 1, respectively. The optimal allocation ratios of the parallel and the extended parallel design depend on the costs and intraclass correlations: the more expensive treatment A (or the cheaper treatment B) and the larger the intraclass correlation in treatment A (or the smaller the intraclass correlation in treatment B), the more subjects have to be assigned to treatment B. The optimal allocation ratio for a crossover design is 1, which may be expected, since both groups receive both treatments A and B.
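The direction of these dependencies can be illustrated for the parallel design with the standard cost-weighted allocation for comparing two group means with unequal variances. This is a minimal sketch and not a reproduction of the Table 1 expressions: it assumes that the variance of the treatment effect estimator is $\sigma^2_A/n_A + \sigma^2_B/n_B$, that the per-subject costs of the two arms are $c_{sp} + c_A + c_t$ and $c_{sp} + c_B + c_t$, and that sample sizes may be treated as continuous. All names and numbers are hypothetical.

```python
import math


def parallel_allocation(var_a, var_b, cost_a, cost_b, budget):
    """Closed-form allocation minimizing var_a/n_a + var_b/n_b subject to
    n_a*cost_a + n_b*cost_b = budget (sample sizes treated as continuous)."""
    ratio = math.sqrt(var_a / var_b) * math.sqrt(cost_b / cost_a)  # n_a / n_b
    n_b = budget / (ratio * cost_a + cost_b)
    n_a = ratio * n_b
    return n_a, n_b, var_a / n_a + var_b / n_b


def brute_force_variance(var_a, var_b, cost_a, cost_b, budget, steps=100_000):
    """Independent check: grid search over the share of the budget spent on arm A."""
    best = math.inf
    for k in range(1, steps):
        share = k / steps
        n_a = share * budget / cost_a
        n_b = (1.0 - share) * budget / cost_b
        best = min(best, var_a / n_a + var_b / n_b)
    return best


# Hypothetical inputs: A costs four times as much per subject and has the larger variance.
n_a, n_b, variance = parallel_allocation(10.0, 4.0, cost_a=4.0, cost_b=1.0, budget=400.0)
check = brute_force_variance(10.0, 4.0, cost_a=4.0, cost_b=1.0, budget=400.0)
print(round(n_a, 1), round(n_b, 1), variance, check)
```

Under these assumptions the allocation ratio $n_A/n_B$ equals $\sqrt{\phi}\,\sqrt{\text{cost}_B/\text{cost}_A}$, so a more expensive treatment A or a larger $\rho_A$ (a smaller $\phi = \rho_B/\rho_A$) shifts subjects toward treatment B, in line with the description above.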

Treatment by Period Interaction Effect.
In the case the treatment by period interaction effect is of primary interest, the optimal allocations can be derived along lines similar to the derivations for the treatment effect (Appendix B). The allocations and corresponding optimal variances are displayed in Table 1. Note that, similar to treatment effect estimation, the allocation ratio for a crossover design is 1, whereas the allocation ratio for an extended parallel design depends on the treatment costs and intraclass correlations, such that more persons are allocated to treatment sequence AA if the intraclass correlation of A decreases, the intraclass correlation of B increases, the costs of treatment A decrease, or the costs of treatment B increase.

Maximin Designs
Choosing the optimal allocation requires knowledge on the intraclass correlations $\rho_A$ and $\rho_B$ (remember that the variance ratio $\phi$ is fixed if $\rho_A$ and $\rho_B$ are given). Commonly, there is only limited knowledge on these parameters. A possible solution is the maximin strategy [4], consisting of two steps: (1) for each design, determine the minimum efficiency of the effect estimator across the plausible ranges for the intraclass correlations $\rho_A$ and $\rho_B$, and (2) choose the design that maximizes this minimum efficiency. Such a design optimizes a worst-case scenario and is called a maximin design. The maximin strategy implies choosing the design that minimizes the maximum variance of the estimator of the effect of interest. In determining sample sizes, choosing values for the intraclass correlations $\rho_A$ and $\rho_B$ within their plausible ranges (and thus a variance ratio $\phi$ within its plausible range) for which the variance is maximum will guarantee the desired power level also for all other values of these parameters. Moreover, the maximin design guarantees this power level at the lowest research costs. In what follows, we will refer to ranges of $\rho_A$ and $\rho_B$ that have lower bounds $\rho^L_A$ and $\rho^L_B$ and upper bounds $\rho^U_A$ and $\rho^U_B$, respectively.

Treatment Effect.
From the asymptotic variances in Table 1, one can derive for which values of $\rho_A$ and $\rho_B$ (and thus for which value of the variance ratio $\phi$) the variance of the treatment effect estimator is maximized. These derivations are given in Appendix C. The maximin parameter values and corresponding variances for the treatment effect estimator under optimal allocation to the treatments are shown in Table 2. The corresponding optimal allocations for the maximin designs are obtained by substituting the maximin parameter values of Table 2 into the allocation ratios as given in Table 1.
If for a parallel design the maximin value for the variance ratio $\phi$ is within its plausible range and $c_{s\_2p} \le 2c_{sp}$, then a parallel design is always less efficient than a maximin crossover design. If for an extended parallel design the maximin value for one of the intraclass correlations is within the plausible range for the corresponding intraclass correlation, then also this design is less efficient than a maximin crossover design. For other scenarios, the relations between the maximin designs are more complicated, depending on the ranges for $\rho_A$ and $\rho_B$ and on the costs of treatments, subject recruitment, and measurement.
A systematic numerical evaluation was done to examine under what conditions the crossover design is the best choice in terms of efficiency. For $\rho_A$ and $\rho_B$, we consider ranges of width 0.10 (small), 0.30 (medium), and 0.60 (large). The lower bounds were {0.01, 0.05, 0.10, 0.15, 0.20, ...}, where the largest possible lower bound was determined by the width of the range under consideration. For instance, if the range is 0.30 (medium), the largest lower bound for the intraclass correlation is 0.70. All combinations of small, medium, and large ranges for $\rho_A$ and $\rho_B$ were considered. The values of the variance ratio $\phi$ thus considered vary from 1/100 to 100. Since in most crossover trials the intraclass correlation exceeds 0.30 [1-3, 10, 11], ranges with lower bounds of 0.30 or higher are empirically most relevant. The empirical evidence on the costs $c_A$, $c_B$, $c_t$, $c_{sp}$, and $c_{s\_2p}$ is scarce, and we thus choose costs covering a wide range of scenarios. Let $CR_A = (c_A + c_t)/c_{sp}$, $CR_B = (c_B + c_t)/c_{sp}$, and $CR_p = c_{s\_2p}/c_{sp}$ (note that the relative efficiencies of the maximin designs depend only on these cost ratios). $CR_A$ and $CR_B$ take on the values 100, 20, 10, 1, 0.1, 0.05, and 0.01. For $CR_p$, we consider 1 and 2.
If the costs of treatments are identical between the treatment arms, that is, $CR_A = CR_B$, the crossover maximin design turns out to be most efficient for most scenarios examined. For $CR_p = 1$ and $CR_A = CR_B \le 1$, the crossover design always is the most efficient. For $CR_p = 1$ and $CR_A = CR_B > 1$, or for $CR_p = 2$, the parallel design can become most efficient only if the lower bound of one of the intraclass correlations is 0.05 or lower and the ranges of the intraclass correlations do not overlap. Since in most empirical studies the intraclass correlations will exceed 0.05, this implies that, for equal costs of treatments, the crossover maximin design will almost always be the most efficient design.
If the treatment costs differ and $CR_A \le 1$ and $CR_B \le 1$, the parallel or the extended parallel maximin design can become most efficient only if the lower bound of one of the intraclass correlations is 0.10 or smaller. The extended parallel design can only become most efficient if $CR_p = 1$. Hence, in all scenarios with unequal treatment costs and $CR_A \le 1$ and $CR_B \le 1$, for intraclass correlations of 0.15 or higher, the maximin crossover design is most efficient.

[Table 1. Allocation ratio ($n_1/n_2$) and $\mathrm{Var}(\hat{\beta}_{\text{treat}})$ per design. Note. $n_1$: sample size for the A (parallel design), AB (crossover design), or AA (extended parallel design) sequence; $n_2$: sample size for the B (parallel design), BA (crossover design), or BB (extended parallel design) sequence.]

Computational and Mathematical Methods in Medicine
In the case the treatment costs differ and $CR_A > 1$ or $CR_B > 1$, the maximin crossover design is less often most efficient. For these cost scenarios, also for ranges of intraclass correlations exceeding 0.15, the maximin parallel and extended parallel design may become more efficient. This especially occurs if the costs of treatment A and the lower bound of the range of $\rho_A$ are both larger (or both smaller) than the costs of treatment B and the lower bound of $\rho_B$, respectively. The efficiency improvement is large if treatment A is much more expensive than treatment B and if the costs of treatments and measurements are large compared to the subject-related costs. This is illustrated in Figure 1. The top row shows that if the costs of treatment A are larger than the costs of treatment B and the lower bound of $\rho_A$ is larger than the lower bound of $\rho_B$, a parallel design is most efficient, even up to an upper bound of 1 for $\rho_B$ if $CR_A = 100$. As can also be seen, the upper bound of $\rho_A$ is not very relevant in terms of the relative efficiencies. The left plot of the middle row of Figure 1 shows that if the lower bounds of $\rho_A$ and $\rho_B$ are equal, then for almost all upper bounds of $\rho_B$, the crossover design is most efficient. Again, as can be seen in the rightmost plot of the middle row, if the lower bound of $\rho_A$ is higher than the lower bound of $\rho_B$, then for higher upper bounds of $\rho_B$, the parallel design is most efficient but to a lesser extent as compared to a smaller lower bound of $\rho_B$. As is evident from the four subplots in the top and middle row, when increasing the ratio $CR_A/CR_B$, the crossover design becomes less efficient as compared to the other two designs. The subplots of the bottom row furthermore show that the crossover design also becomes less efficient compared to the other designs if $CR_A$ and $CR_B$ increase while the ratio $CR_A/CR_B$ remains constant. This illustrates that the efficiency of the other designs relative to the crossover design becomes larger if the costs of treatments and measurements are large compared to the subject-related costs.
However, to summarize, if the treatment costs differ and $CR_A > 1$ or $CR_B > 1$, no simple rules of thumb emerge, and the most reliable way to choose the most efficient design is to calculate the maximin variances as given in Table 2.
Finally, if $CR_p = 2$, the maximin parallel design is consistently more efficient than the maximin extended parallel design (as is illustrated in Figure 1). If $CR_p = 1$, the maximin extended parallel design can also become more efficient than the maximin parallel design.

Treatment by Period Interaction Effect.
The maximin parameter values and corresponding variances of the estimator of the treatment by period interaction effect are shown in Table 3. The derivations of these results can be done along lines similar to the derivations for the treatment effect estimator (Appendix C). The optimal allocation for the extended parallel design is obtained by substituting the maximin parameter values in the expression for the allocation ratio in Table 1. For a crossover design, the allocation ratio is 1. The maximin efficiency of an extended parallel design is always higher than that of a crossover design if the maximin value $\rho^*_A$ is within the plausible range for $\rho_A$. This follows from an inequality (7) that bounds the maximin variance of the extended parallel design from above, where the right-hand side of the inequality in turn is smaller than the variance of a maximin crossover design (Table 3). The higher maximin efficiency of the extended parallel design can also be shown to hold if the maximin value $\rho^*_B$ is within the plausible range for $\rho_B$. Furthermore, if the variance-maximizing values $\rho^*_A$ and $\rho^*_B$ are outside the plausible ranges for $\rho_A$ and $\rho_B$, respectively, then values for $\rho_A$ and $\rho_B$ that coincide with one of the borders of their corresponding ranges should be chosen as values that maximize the variance. But in that case, even smaller variances result for the extended parallel design.

[Table 2. Columns: design; maximin parameter values; $\mathrm{Var}(\hat{\beta}_{\text{treat}})$ for the maximin design.]

Maximin Designs That Minimize the Number of Subjects and Number of Measurements.
As noted in Section 3, by setting $c_{sp} = c_{sep} = c_{sc} = 1$ and the remaining costs to 0 in (4)-(6), the budget is simply the total number of subjects involved in a study, and by setting $c_t = 1$ and the other costs to 0, the budget reduces to the total number of measurements involved. When the budget is the total sample size and interest is in estimating the treatment effect, it can be proven, based on the formulas in Table 2, that a maximin crossover design requires fewer subjects than a maximin parallel design. From an extensive numerical evaluation analogous to the one of Section 4.1, a maximin crossover design also appears to require fewer subjects than a maximin extended parallel design.
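The two-step maximin strategy can be sketched for the special case in which the budget is the number of subjects. The variance expressions below are not the Table 2 formulas; they are derived here under stated assumptions: the crossover estimator is based on within-subject differences with equal allocation, the parallel estimator uses allocation in the ratio $\sigma_A : \sigma_B$, and variances are expressed per subject in units of $0.5(\sigma^2_A + \sigma^2_B)$, that is, with the standardized effect size held fixed. The plausible ranges are hypothetical.

```python
import math


def crossover_factor(rho_a, rho_b):
    """n * Var(treatment effect) for AB/BA with equal allocation,
    in units of 0.5*(sigma_A^2 + sigma_B^2)."""
    return 2.0 * (1.0 - 2.0 * rho_a * rho_b / (rho_a + rho_b))


def parallel_factor(rho_a, rho_b):
    """Same quantity for a parallel design with allocation ratio sigma_A : sigma_B."""
    phi = rho_b / rho_a  # variance ratio sigma_A^2 / sigma_B^2
    return 2.0 * (1.0 + math.sqrt(phi)) ** 2 / (1.0 + phi)


def grid(lower, upper, steps):
    """Equally spaced values from lower to upper, both bounds included."""
    return [lower + (upper - lower) * k / steps for k in range(steps + 1)]


def worst_case(factor, range_a, range_b, steps=60):
    """Step 1: largest variance over a grid on the plausible ranges.
    Returns (variance factor, rho_A, rho_B)."""
    return max((factor(a, b), a, b)
               for a in grid(*range_a, steps) for b in grid(*range_b, steps))


range_a, range_b = (0.10, 0.40), (0.30, 0.60)  # hypothetical plausible ranges
worst = {"crossover": worst_case(crossover_factor, range_a, range_b),
         "parallel": worst_case(parallel_factor, range_a, range_b)}
# Step 2: choose the design whose worst-case variance is smallest.
maximin_design = min(worst, key=lambda name: worst[name][0])
print(worst, maximin_design)
```

In this sketch the worst case for the crossover design lies at the lower bounds of both ranges, and the worst case for the parallel design lies where the two intraclass correlations coincide. The crossover factor never exceeds 2 while the parallel factor is at least 2, which agrees with the statement that the maximin crossover design needs fewer subjects.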
When minimizing the number of measurements, the numerical evaluation shows again that the maximin crossover design is the best choice provided the lower bounds of both intraclass correlations are 0.10 or higher. In other cases also a maximin parallel design may minimize the total number of measurements. Since in most crossover trials the intraclass correlation exceeds 0.30 [1-3, 10, 11], in practice, this implies that the maximin crossover trial also is the best choice when minimizing the number of measurements.
In case interest is in the treatment by period interaction, Section 4.2 showed a maximin extended parallel design to be more efficient and thus also to require less budget than a maximin crossover trial. In the special case where the number of subjects or the total number of measurements is minimized, the maximin extended parallel design will therefore also outperform the maximin crossover design.

Monte Carlo Evaluation of the Power of Maximin Designs
The efficiencies as derived for the maximin designs are based on the asymptotic variance of the ML estimator, $\mathrm{Var}(\hat{\beta}_x \mid \xi)$. For sufficiently large numbers of subjects, the relation between the asymptotic variance of the ML estimator and the power level $1 - \gamma$ to detect a treatment effect in a two-tailed test with type I error rate $\alpha$ can be approximated as follows:

$$\mathrm{Var}(\hat{\beta}_x \mid \xi) = \left(\frac{\beta_x}{z_{1-\alpha/2} + z_{1-\gamma}}\right)^2, \tag{8}$$

where $z_{1-\alpha/2}$ and $z_{1-\gamma}$ are the $100(1 - \alpha/2)$ and $100(1 - \gamma)$ percentiles of the standard normal distribution. For small sample sizes calculated by (8), corrections are needed [12,13]. For each of the three designs, these corrections will be applied. We will examine to what extent the differences between designs in asymptotic efficiencies translate into corresponding differences in power levels for small samples. Also, when planning sample sizes based on the asymptotic variances, we can check whether the commonly used power levels of 80% or 90% are realized in the case of small samples. For the treatment effect estimator, an expression (9) for the required number of subjects results for a crossover design with optimal allocation. If we let $ES = \beta_{\text{treat}}/\sqrt{0.5(\sigma^2_A + \sigma^2_B)}$ be the effect size based on the outcome variances in the treatment and control arm (cf. [14]), then (9) can be rewritten in terms of $ES$, $\rho_A$, and $\rho_B$ as (10). Note that, in the case of a maximin design, the expression is the same as (10), however, with $\rho^L_A$ and $\rho^L_B$ being substituted for $\rho_A$ and $\rho_B$, respectively. Similar rewritings of the sample sizes in terms of the effect size are possible for the parallel and extended parallel design, respectively (11). The choices to be made for $\rho_A$ and $\rho_B$ in the case of maximin versions of the parallel and extended parallel design are determined by the conditions as formulated in Table 2.
In the case of the treatment by period interaction effect, an expression (12) for the required number of subjects can be derived for a crossover design with optimal allocation, where $ES = \beta_{\text{treat}\times\text{time}}/\sqrt{0.5(\sigma^2_A + \sigma^2_B)}$. In the case of a maximin design, the expression is the same as (12), however, with $\rho^U_A$ and $\rho^U_B$ being substituted for $\rho_A$ and $\rho_B$, respectively.
The sample size of an extended parallel design, when allocating optimally, can be written in a similar way (13). The choices to be made for $\rho_A$ and $\rho_B$ in the case of a maximin extended parallel design are determined by the conditions formulated in Table 3.
Since maximin designs only require information on plausible ranges of model parameters, they are more practical than optimal designs. In what follows, we will therefore examine through a Monte Carlo simulation the power for maximin designs in the case of small sample sizes. First, we will discuss the factors that are varied and motivate the choices made for these factors in determining the simulation scenarios.

Costs.
The empirical evidence on costs is rather scarce, but we will choose the costs such that they imply minimizing the sample size of a study (i.e., $c_A = c_B = c_t = 0$ and $c_{sp} = c_{s\_2p} = 1$).

Intraclass Correlations.
The ranges for $\rho_A$ and $\rho_B$ are identical to the ranges of the numerical evaluation of Section 4.1. Since we are interested in the small sample performance, for each design, we consider that pair of ranges across all combinations of ranges for the intraclass correlations (i.e., small-small, medium-medium, large-large, small-medium, small-large, and medium-large) that leads to the smallest sample sizes. Since this each time turns out to be a pair from the small-small category, the same was done for all pairs of medium and large ranges, which will be used more often in practice. For each design, the two resulting pairs of ranges of intraclass correlations are displayed in the two leftmost columns of Table 4 and of the table in Appendix D.

Power Level and Type I Error Rate.
In sample size planning commonly used power levels are 80% and 90% in a two-tailed test with either a 5% or a 1% type I error rate.
Focusing on the small sample performance, we will consider 80% power in a two-tailed test with a 5% type I error rate. For small sample sizes derived from the standard normal distribution (as in (9)-(13)), corrections are needed that turn out to depend on the type I error rate [12,13]. For this reason, we will also study a 1% type I error rate.
For a test of the treatment effect, the data generated for the crossover design were analyzed with a two-sample t-test on the difference scores obtained by subtracting the two measurements for each subject. The model in (2) implies homogeneity of variances for these difference scores, so that a pooled variance t-test was applied. The data generated for the parallel design were simply analyzed by a two-sample t-test on the original scores, whereas the data for the extended parallel design were analyzed with a two-sample t-test on the scores averaged across both measurements. For these parallel designs, (1) and (2) imply that the analyzed scores may have variances differing between groups, so that an unpooled variance t-test was applied. For the treatment by period interaction effect, the data generated for the crossover design were analyzed with a two-sample (pooled variance) t-test on the scores averaged for each subject across both measurements, whereas the data generated for the extended parallel design were analyzed with a two-sample (unpooled variance) t-test on the differences between the two measurements (cf. the model in (3)). These different t-tests follow for each of the designs (involving equal numbers of measurements per subject) from the analysis models in (1)-(3) and do not require asymptotic assumptions.
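The crossover part of this analysis can be sketched as a small Monte Carlo experiment. The sketch below is an assumption-laden illustration and not the simulation code of the study: it generates data from a model of the form (2) with zero intercept and zero period effect, analyzes the period-2-minus-period-1 difference scores with a pooled two-sample t-test, and uses hypothetical parameter values ($\sigma^2_0 = 1$, $\rho_A = 0.10$, $\rho_B = 0.30$, a standardized effect of 0.5, and 28 subjects per sequence). It relies on `numpy` and on `scipy.stats.ttest_ind`.

```python
import numpy as np
from scipy import stats


def difference_scores(rng, n, a_first, beta_treat, sd_u, sd_a, sd_b):
    """Period-2 minus period-1 scores for one treatment sequence."""
    u = rng.normal(0.0, sd_u, n)                       # random subject effect
    y_a = u + rng.normal(0.0, sd_a, n)                 # outcome under A (coded 0)
    y_b = beta_treat + u + rng.normal(0.0, sd_b, n)    # outcome under B (coded 1)
    return y_b - y_a if a_first else y_a - y_b


def simulate_crossover_power(n_per_seq, beta_treat, var_u, var_err_a, var_err_b,
                             alpha=0.05, reps=2000, seed=1):
    """Share of replications in which the pooled t-test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    sd_u, sd_a, sd_b = np.sqrt([var_u, var_err_a, var_err_b])
    rejections = 0
    for _ in range(reps):
        d_ab = difference_scores(rng, n_per_seq, True, beta_treat, sd_u, sd_a, sd_b)
        d_ba = difference_scores(rng, n_per_seq, False, beta_treat, sd_u, sd_a, sd_b)
        p_value = stats.ttest_ind(d_ab, d_ba, equal_var=True).pvalue
        rejections += int(p_value < alpha)
    return rejections / reps


# Hypothetical scenario: sigma_0^2 = 1, rho_A = 0.10, rho_B = 0.30, effect size 0.5.
var_u, var_err_a, var_err_b = 1.0, 9.0, 7.0 / 3.0
beta = 0.5 * np.sqrt(0.5 * ((var_u + var_err_a) + (var_u + var_err_b)))
power = simulate_crossover_power(28, beta, var_u, var_err_a, var_err_b)
print(power)
```

The subject effect cancels in the difference scores, which is why the between-subject variance does not affect the test in this design. A period effect would shift both groups of difference scores equally and would therefore not change the comparison.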
In calculating the required sample size, (9)-(11) were used when interest is in testing the treatment effect, and (12) and (13) were used when interest is in testing the treatment by period interaction. The optimal allocations for each design are given in Table 1, taking the maximin values for $\rho_A$ and $\rho_B$ as determined from Tables 2 and 3. As shown in Table 4 and Appendix D, for all sample size-design combinations that should yield 80% power, the simulated powers were either within or above the 95% predictive intervals. This indicates that the asymptotic results, supplemented with simple correction rules for using the standard normal distribution, yield sample sizes that guarantee the desired level of power. The realized power levels generally are higher than 80%, since the small sample corrections are sufficient and in some cases smaller corrections would have been appropriate [12,13]. The power differences between the designs can become rather large and are in line with the asymptotic relative efficiencies. For the examples of Table 4, the crossover design always is most efficient and in the simulation also has the highest power. Additional simulations show that similar conclusions can be drawn for ranges of intraclass correlations for which the crossover design is not most efficient. As expected, when testing the treatment by period interaction, the extended parallel design has more power than the crossover design (Appendix D).

Application in Planning a Trial
Suppose one would like to perform a randomized trial on the effectiveness of indacaterol versus tiotropium, among subjects suffering from chronic obstructive pulmonary disease, similar to Donohue et al. [16]. After (9)) with 80% power in a two-tailed test with a 5% type I error rate, taking as maximin parameter values $\rho_A^* = 0.10$ and $\rho_B^* = 0.30$ in (10), 54 subjects are needed. Since the sample size calculation in (10) is based on the standard normal, whereas the test statistic follows a t-distribution, we add 1 subject to each treatment sequence [12], yielding a total sample size of 56 subjects with 28 subjects being allocated to each of the two treatment sequences of the crossover design.
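The small-sample correction in this example is simple arithmetic, which the following Python sketch reproduces. The function names are hypothetical, and equal allocation to the sequences is assumed, as in the crossover example. The first helper only computes the usual standard normal multiplier for a two-tailed test; it does not implement (9) or (10), which require the cost and variance inputs of the main text.

```python
import math
from statistics import NormalDist


def normal_multiplier(alpha=0.05, power=0.80):
    """(z_{1-alpha/2} + z_{power})^2 for a two-tailed test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def corrected_sample_size(n_normal, n_sequences=2, extra_per_sequence=1):
    """Add a fixed number of subjects per sequence to a normal-based total.

    Assumes equal allocation; returns (total, subjects per sequence).
    """
    per_sequence = math.ceil(n_normal / n_sequences) + extra_per_sequence
    return per_sequence * n_sequences, per_sequence


total, per_sequence = corrected_sample_size(54)
print(total, per_sequence)  # 56 28
```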

Conclusion and Discussion
We examined the asymptotic efficiency of the ML estimator of the treatment effect and of the treatment by period interaction effect for three two-treatment designs: a parallel, an extended parallel, and a crossover design. For a flexible cost function, the optimal allocations to the treatment sequences and corresponding optimal efficiencies were derived. Since the intraclass correlations for each of the treatments and the ratio of treatment-dependent variances are commonly not precisely known, maximin designs were also derived, which guarantee a power level across plausible ranges of values for the intraclass correlations at the lowest costs.
When interested in testing the main effects of the treatments, the relations between the efficiencies of the maximin versions of the A/B, AB/BA, and AA/BB designs depend on the assumed ranges of the intraclass correlations, on the costs of the treatments, and on the costs of recruiting and measuring subjects. A numerical investigation shows that if A and B are equally expensive or the sum of the costs of one treatment and measurement per person is less than the remaining subject-specific costs (such as recruitment costs), then the crossover design is most efficient for ranges of intraclass correlations starting at 0.15 or higher. In other cost scenarios, also for ranges of the intraclass correlations above 0.15, the parallel design or its extended version may become most efficient. Then, the efficiency relations are complicated, and the most efficient design is best determined by the results in Table 2. For the treatment by period interaction, however, the maximin AA/BB trial is proven to be more efficient than the maximin AB/BA design.
Since the efficiency comparisons of the maximin designs were based on asymptotic variances, a Monte Carlo simulation study was done for small samples. After applying correction factors in sample size planning based on the standard normal distribution, it was shown that (a) the asymptotic relative efficiencies translate into corresponding relative power levels and (b) power levels targeted in sample size planning are realized. This illustrates the practical utility of these results for sample size calculation.
If prerandomization measurements of the outcome variable are available, these could be included as covariates in the analysis [1]. Adding covariates in a randomized trial will not change the treatment effect of interest but will lead to a reduction of the intercept variance and thus of the intraclass correlations [3]. Provided that the costs of prerandomization measurements are the same for all designs (and there are no missing values on these prerandomization covariates), the results of the present study also apply. The present study did not consider carry over in deriving optimal and maximin designs. If there is self carry over, that is, carry over from a treatment onto itself, this implies that steady-state did not yet occur in the first period, and then the total treatment effect would be the relevant effect, that is, the direct effect in the first period plus the carry over effect in the second period [5]. If there is self carry over, one-period designs and the AB/BA design are not suitable, as they do not allow for estimating the total treatment effect, leaving only the AA/BB design as a suitable option. There may also be steady-state carry over, which can only occur if there is a switch of treatments [17]. Such carry over would affect the efficiency of the crossover design. Although one commonly tries to avoid carry over, examining to what extent steady-state carry over affects the relative efficiency of the maximin crossover design would be an interesting issue for further research.

A.1. Asymptotic Variance for the ML Estimator of the Treatment Effect
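Let the vector of observed scores on the dependent variable for person $j$ ($j = 1, \ldots, N$) in a two-period design be denoted as $y_j = (y_{1j}, y_{2j})^T$. The linear mixed model for the scores $y_j$ can be expressed as follows:
$$y_j = X_j \beta + \mathbf{1} u_{0j} + Q \varepsilon_{Aj} + (I - Q) \varepsilon_{Bj}, \tag{A.1}$$
where $X_j$ is the design matrix for subject $j$, $\beta$ is the vector of regression coefficients, $u_{0j}$ is the random person effect, $\varepsilon_{Aj}$ is the vector of residual scores under treatment A, and $\varepsilon_{Bj}$ is the vector of residual scores under treatment B. In (A.1), $\mathbf{1} = (1, 1)^T$, $I$ is the identity matrix, and $Q$ is a matrix which indicates whether treatment A has been given or not in a particular period, so for a person $j$ with treatment sequence AB, we have $Q = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, indicating that A has been given in period 1 and treatment B in period 2. Let $J$ be a matrix with only ones (of order 2 by 2). The variance-covariance matrix of $y_j$ can be derived as follows:
$$V_j = \sigma_0^2 J + \sigma_{\varepsilon A}^2 Q + \sigma_{\varepsilon B}^2 (I - Q). \tag{A.2}$$
For the AA sequence, we have $Q = I$, and (A.2) can be rewritten as follows:
$$V_j = \sigma_{\varepsilon A}^2 I + \sigma_0^2 J. \tag{A.3}$$
By applying the result that, for an $n \times n$ matrix $X = aI + bJ$ with $a \neq 0$ and $a \neq -nb$, the inverse is $X^{-1} = (1/a)(I - J(b/(nb + a)))$ ([18], p. 443), the inverse of the matrix in (A.3) can be written as follows:
$$V_j^{-1} = \frac{1}{\sigma_{\varepsilon A}^2} \left( I - \frac{\sigma_0^2}{2\sigma_0^2 + \sigma_{\varepsilon A}^2} J \right). \tag{A.4}$$
For the BB sequence, we have $I - Q = I$, and in a similar way, we obtain
$$V_j^{-1} = \frac{1}{\sigma_{\varepsilon B}^2} \left( I - \frac{\sigma_0^2}{2\sigma_0^2 + \sigma_{\varepsilon B}^2} J \right). \tag{A.5}$$
Finally, for the AB sequence and BA sequence, we can derive the following equations, respectively:
$$V_j^{-1} = \frac{1}{\sigma_0^2 \sigma_{\varepsilon A}^2 + \sigma_0^2 \sigma_{\varepsilon B}^2 + \sigma_{\varepsilon A}^2 \sigma_{\varepsilon B}^2} \begin{pmatrix} \sigma_0^2 + \sigma_{\varepsilon B}^2 & -\sigma_0^2 \\ -\sigma_0^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 \end{pmatrix}, \quad V_j^{-1} = \frac{1}{\sigma_0^2 \sigma_{\varepsilon A}^2 + \sigma_0^2 \sigma_{\varepsilon B}^2 + \sigma_{\varepsilon A}^2 \sigma_{\varepsilon B}^2} \begin{pmatrix} \sigma_0^2 + \sigma_{\varepsilon A}^2 & -\sigma_0^2 \\ -\sigma_0^2 & \sigma_0^2 + \sigma_{\varepsilon B}^2 \end{pmatrix}. \tag{A.6}$$
The information matrix of the ML estimator of $\beta$ can be written as
$$\sum_{j=1}^{N} X_j^T V_j^{-1} X_j. \tag{A.7}$$
Taking the inverse of the information matrix yields the asymptotic variance-covariance matrix of the ML estimators of $\beta$.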
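The inverse formula for matrices of the form $aI + bJ$ is easy to check numerically. The Python sketch below uses exact fractions and hypothetical helper names; the values $a = 1$ and $b = 2$ stand in for a residual variance and the person variance, respectively, and are chosen only for illustration.

```python
from fractions import Fraction


def a_i_plus_b_j(a, b, n):
    """Build the n x n matrix aI + bJ, where J contains only ones."""
    return [[(a if r == c else 0) + b for c in range(n)] for r in range(n)]


def closed_form_inverse(a, b, n):
    """Inverse (1/a)(I - J b/(nb + a)); requires a != 0 and a != -nb."""
    k = Fraction(b) / (n * b + a)
    return [[(Fraction(int(r == c)) - k) / a for c in range(n)]
            for r in range(n)]


def matmul(x, y):
    """Plain matrix product of two square matrices of equal order."""
    n = len(x)
    return [[sum(x[r][i] * y[i][c] for i in range(n)) for c in range(n)]
            for r in range(n)]


# Residual variance a = 1 and person variance b = 2 in a two-period design.
v = a_i_plus_b_j(1, 2, 2)
product = matmul(v, closed_form_inverse(1, 2, 2))
print(product == [[1, 0], [0, 1]])  # True
```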
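Different treatment sequences not only lead to different matrices $V_j$ but also to different $X_j$ matrices. As an example, we consider the crossover design. In total, we have $N$ subjects, and $p$ is the proportion of persons being allocated to the AB sequence. Let $\beta^T = (\beta_0, \beta_1, \beta_2)$, where the regression coefficients represent the intercept, the treatment effect, and the time effect, respectively. For persons allocated to the AB sequence, we have $X_j = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$, where $j = 1, \ldots, Np$. For persons allocated to the BA sequence, we have $X_j = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$, where $j = Np + 1, \ldots, N$. We can now elaborate the matrix formulation in (A.7). For persons in the AB sequence, we have
$$X_j^T V_j^{-1} X_j = \frac{1}{\sigma_0^2 \sigma_{\varepsilon A}^2 + \sigma_0^2 \sigma_{\varepsilon B}^2 + \sigma_{\varepsilon A}^2 \sigma_{\varepsilon B}^2} \begin{pmatrix} \sigma_{\varepsilon A}^2 + \sigma_{\varepsilon B}^2 & \sigma_{\varepsilon A}^2 & \sigma_{\varepsilon A}^2 \\ \sigma_{\varepsilon A}^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 \\ \sigma_{\varepsilon A}^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 \end{pmatrix}, \tag{A.8}$$
and for persons in the BA sequence, we obtain
$$X_j^T V_j^{-1} X_j = \frac{1}{\sigma_0^2 \sigma_{\varepsilon A}^2 + \sigma_0^2 \sigma_{\varepsilon B}^2 + \sigma_{\varepsilon A}^2 \sigma_{\varepsilon B}^2} \begin{pmatrix} \sigma_{\varepsilon A}^2 + \sigma_{\varepsilon B}^2 & \sigma_{\varepsilon A}^2 & \sigma_{\varepsilon B}^2 \\ \sigma_{\varepsilon A}^2 & \sigma_0^2 + \sigma_{\varepsilon A}^2 & -\sigma_0^2 \\ \sigma_{\varepsilon B}^2 & -\sigma_0^2 & \sigma_0^2 + \sigma_{\varepsilon B}^2 \end{pmatrix}. \tag{A.9}$$
The information matrix in (A.7) can be obtained by summing the expressions in (A.8) and (A.9) across all persons in each of the treatment sequences, and the asymptotic variance-covariance matrix of the ML estimators then results by taking the inverse of this matrix. We are interested in the variance of the treatment effect estimator, which is the entry in row 2 and column 2 of the resulting variance-covariance matrix and which, noting that $\rho_A = \sigma_0^2/(\sigma_0^2 + \sigma_{\varepsilon A}^2)$, $\rho_B = \sigma_0^2/(\sigma_0^2 + \sigma_{\varepsilon B}^2)$, and $\sigma_y^2 = 2\sigma_0^2 + \sigma_{\varepsilon A}^2 + \sigma_{\varepsilon B}^2$, can be written as the expression given in Table 5. Along similar lines, the variance of the treatment effect estimator for a parallel and an extended parallel design, as shown in Table 5, can be derived.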
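The route from the information matrix to the variance of the treatment effect estimator can also be followed numerically. The Python sketch below is a minimal illustration with hypothetical function names and exact fractions. It assumes independent residuals across periods and a BA design matrix inferred from the dummy coding of the AB matrix (treatment B and period 2 coded as 1). For equal allocation the result reduces to $(\sigma_{\varepsilon A}^2 + \sigma_{\varepsilon B}^2)/N$, the variance of the estimator based on difference scores, which serves as a check.

```python
from fractions import Fraction


def inverse(m):
    """Gauss-Jordan inverse of a square matrix in exact arithmetic."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def xt_w_x(x, w):
    """X^T W X for a 2 x k design matrix X and a 2 x 2 weight matrix W."""
    k = len(x[0])
    return [[sum(x[r][i] * w[r][s] * x[s][j] for r in range(2) for s in range(2))
             for j in range(k)] for i in range(k)]


def crossover_treatment_variance(sigma0_sq, var_a, var_b, n, p):
    """Asymptotic variance of the treatment effect estimator (AB/BA design)."""
    v_ab = [[sigma0_sq + var_a, sigma0_sq], [sigma0_sq, sigma0_sq + var_b]]
    v_ba = [[sigma0_sq + var_b, sigma0_sq], [sigma0_sq, sigma0_sq + var_a]]
    x_ab = [[1, 0, 0], [1, 1, 1]]  # columns: intercept, treatment, time
    x_ba = [[1, 1, 0], [1, 0, 1]]  # assumed coding, mirrors the AB matrix
    m_ab = xt_w_x(x_ab, inverse(v_ab))
    m_ba = xt_w_x(x_ba, inverse(v_ba))
    n_ab = Fraction(n) * Fraction(p)
    n_ba = n - n_ab
    info = [[n_ab * m_ab[i][j] + n_ba * m_ba[i][j] for j in range(3)]
            for i in range(3)]
    return inverse(info)[1][1]  # row 2, column 2


print(crossover_treatment_variance(2, 1, 3, 20, Fraction(1, 2)))  # 1/5
```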

A.2. Asymptotic Variance for the ML Estimator of the Treatment by Period Interaction Effect
The derivations of the variance of the treatment by period interaction estimator are similar, but start from another model and corresponding design matrices for the fixed regression coefficients. Now $\beta^T = (\beta_0, \beta_1, \beta_2, \beta_3)$, where the regression coefficients represent the intercept, the treatment effect, the time effect, and the treatment by period interaction effect, respectively. When considering again a crossover trial as an example, for persons assigned to the AB sequence, we have the expression in (A.11), and for persons in the BA sequence, we obtain the expression in (A.12). The information matrix in (A.7) is obtained by summing the expressions in (A.11) and (A.12) across all persons in both treatment sequences, and the asymptotic variance-covariance matrix of the ML estimators then results by taking the inverse of this matrix. We are interested in the variance of the treatment by period interaction effect estimator, which is the entry in row 4 and column 4 of the variance-covariance matrix and which can be rewritten in terms of $\rho_A$, $\rho_B$, and $\sigma_y^2$ as the expression in Table 5. Along similar lines, the variance of the treatment by period interaction effect estimator for an extended parallel design, as shown in Table 5, can be derived.

B. Derivation of Optimal Allocations under a Budget Constraint
The asymptotic variance of the treatment effect estimator is minimized as a function of $p$, the allocation proportion, given a fixed budget $C$. For a crossover, parallel, and extended parallel design, $p$ is the proportion allocated to the sequence AB, A, and AA, respectively. The variance for a crossover design as given in Table 5 can be rewritten in terms of the research costs and budget $C$, employing the cost function in (5) and noting that $c_{sc} = c_{s\_2p}$, as (B.1). It is easy to see that this expression is minimized if $p = 0.5$.
For a parallel design, the variance of the treatment effect estimator as given in Table 5 can be rewritten in terms of the cost function in (4) as (B.2). Taking the derivative of (B.2) with respect to $p$ yields (B.3). Solving (B.3) for $p$ gives two solutions, one of which turns out to give a minimum (the second derivative of the variance as a function of $p$ is positive for this particular value); this solution is (B.4).
Finally, for an extended parallel design, the variance of the treatment effect estimator as given in Table 5 can be rewritten in terms of the cost function in (6), noting that $c_{sep} = c_{s\_2p}$, as (B.5). Taking the derivative with respect to $p$ yields (B.6). Solving (B.6) for $p$ gives two solutions, one of which turns out to give a minimum (the second derivative of the variance as a function of $p$ is positive for this particular value); this solution is (B.7).
Substituting the optimal allocation $p = 0.5$ and the allocations in (B.4) and (B.7) into the corresponding expressions for the variance of the treatment effect estimator yields the optimal variances as given in Table 1 of the main text. Derivations along similar lines can be done if interest is in minimizing the variance of the treatment by period effect estimator. These result in the allocations and variances that are also presented in Table 1 of the main text.
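The logic of minimizing the variance over $p$ under a fixed budget can be illustrated for a generic two-group comparison. The Python sketch below does not reproduce (B.1)-(B.7): it assumes a simplified setting with a total cost per subject of `cost_a` in group A and `cost_b` in group B, and outcome variances `var_a` and `var_b`. All names are hypothetical. In this setting the optimum satisfies $p/(1-p) = \sqrt{(\mathrm{var}_a \cdot \mathrm{cost}_b)/(\mathrm{var}_b \cdot \mathrm{cost}_a)}$, and a numerical search gives the same answer.

```python
import math


def parallel_variance(p, var_a, var_b, cost_a, cost_b, budget):
    """Variance of a difference in means when the budget fixes the total N."""
    n_total = budget / (p * cost_a + (1 - p) * cost_b)
    return var_a / (n_total * p) + var_b / (n_total * (1 - p))


def closed_form_allocation(var_a, var_b, cost_a, cost_b):
    """Minimizer of parallel_variance in p (simplified setting, see lead-in)."""
    ratio = math.sqrt((var_a * cost_b) / (var_b * cost_a))
    return ratio / (1 + ratio)


def numeric_allocation(var_a, var_b, cost_a, cost_b, budget=1.0, tol=1e-10):
    """Ternary search for the minimizing p; the variance is unimodal in p."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        f1 = parallel_variance(m1, var_a, var_b, cost_a, cost_b, budget)
        f2 = parallel_variance(m2, var_a, var_b, cost_a, cost_b, budget)
        if f1 < f2:
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2
```

With equal costs and `var_b` four times `var_a`, both functions allocate about one third of the subjects to A, so the noisier and equally expensive arm receives more subjects.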

C. Derivation of Maximin Designs and Variances of the Effect Estimators
Maximin designs optimize the allocation to the treatment sequences under the worst case, that is, the maximum variance of the treatment effect estimator across plausible ranges of the model parameters, here the intraclass correlations ρ A and ρ B .
For crossover designs, the variance of the treatment effect estimator under optimal allocation to the treatments is the expression given in Table 1 of the main text, referred to here as (C.1). By taking the derivatives with respect to $\rho_A$ and $\rho_B$, it can be shown that this expression decreases as a function of these parameters. This implies that the worst case occurs for the lower bounds of the ranges for $\rho_A$ and $\rho_B$. This yields the expression for the variance of the treatment effect estimator in the case of a maximin crossover design in Table 2 of the main text. Similarly, for a parallel design, we have to examine for which values of $\rho_A$ and $\rho_B$ the expression in (C.2) is maximized. For derivation purposes, it is more convenient to rewrite (C.2) in terms of $\phi = \rho_B/\rho_A$, which gives (C.3). Taking the derivative of (C.3) with respect to $\phi$, we find that the variance increases as a function of $\phi$ as long as $\phi \le (c_A + c_t + c_{sp})/(c_B + c_t + c_{sp})$ and decreases as a function of $\phi$ if $\phi > (c_A + c_t + c_{sp})/(c_B + c_t + c_{sp})$. So, if we can choose $\rho_A$ and $\rho_B$ from their plausible ranges such that $\rho_B/\rho_A = (c_A + c_t + c_{sp})/(c_B + c_t + c_{sp})$, this maximizes (C.3). If this is not possible because $(c_A + c_t + c_{sp})/(c_B + c_t + c_{sp}) > \rho_B^U/\rho_A^L$, then choose $\rho_A^L$ and $\rho_B^U$; if $(c_A + c_t + c_{sp})/(c_B + c_t + c_{sp}) < \rho_B^L/\rho_A^U$, then choose $\rho_A^U$ and $\rho_B^L$. Substituting the maximin values of $\rho_A$ and $\rho_B$ into (C.2) will result in the variances of the treatment effect estimator as displayed in Table 2 of the main text.
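For the parallel design, the monotonicity argument above amounts to clamping the cost ratio to the range of attainable values of $\phi = \rho_B/\rho_A$. A minimal Python sketch, with a hypothetical function name and argument layout, shows the rule. It returns only the worst-case ratio; the choice of a specific pair $(\rho_A, \rho_B)$ with that ratio and the variance expressions of Table 2 are not covered.

```python
def worst_case_ratio(rho_a_range, rho_b_range, cost_a, cost_b, cost_t, cost_sp):
    """Ratio phi = rho_B / rho_A at which the parallel-design variance peaks.

    rho_a_range, rho_b_range: (lower, upper) bounds of the plausible ranges.
    The variance rises in phi up to the cost ratio and falls beyond it, so
    the maximizer is the cost ratio clamped to the attainable interval.
    """
    (a_lo, a_hi), (b_lo, b_hi) = rho_a_range, rho_b_range
    cost_ratio = (cost_a + cost_t + cost_sp) / (cost_b + cost_t + cost_sp)
    phi_min = b_lo / a_hi  # smallest attainable ratio: rho_B low, rho_A high
    phi_max = b_hi / a_lo  # largest attainable ratio: rho_B high, rho_A low
    return min(max(cost_ratio, phi_min), phi_max)
```

When the cost ratio exceeds the largest attainable ratio, the function returns `phi_max`, which corresponds to choosing the lower bound for $\rho_A$ and the upper bound for $\rho_B$; when it falls below the smallest attainable ratio, the opposite bounds are selected.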
For the extended parallel design, the variance for optimal allocation to the treatment sequences is given in Table 1 of the main text and is referred to here as (C.4).
Taking the derivative of (C.4) with respect to $\rho_A$ shows that the variance of the treatment effect estimator increases as a function of $\rho_A$ as long as $\lambda(1 - \rho_B)^2/(\rho_B(1 + \rho_B)) \le (1 + \rho_A)/\rho_A$, where $\lambda = (2(c_A + c_t) + c_{s\_2p})/(2(c_B + c_t) + c_{s\_2p})$, and decreases if $\lambda(1 - \rho_B)^2/(\rho_B(1 + \rho_B)) > (1 + \rho_A)/\rho_A$. Taking into account a feasible range for $\rho_A$, the value for $\rho_A$ maximizing the variance in (C.4), $\rho_A^*$, therefore is the lower bound of the range, the upper bound of the range, or the interior value at which this condition holds with equality. Taking the derivative of (C.4) with respect to $\rho_B$ shows that the variance increases as a function of $\rho_B$ as long as $(1 - \rho_A)^2/(\lambda\rho_A(1 + \rho_A)) \le (1 + \rho_B)/\rho_B$ and decreases if $(1 - \rho_A)^2/(\lambda\rho_A(1 + \rho_A)) > (1 + \rho_B)/\rho_B$. So, also taking into account a feasible range for $\rho_B$, the maximin value $\rho_B^*$ is determined in the same way. For each of the intraclass correlations, $\rho_A$ and $\rho_B$, there are thus three possible values that maximize the variance of the treatment effect estimator. Not all values can co-occur.
Similar derivations can be given for the maximin versions of a crossover and extended parallel design in the case interest is in the treatment by period interaction effect. These derivations are available from the author upon request.

D. Powers from the Monte Carlo Simulations for Maximin Designs in the Case of the Treatment by Period Interaction
Table 6 provides the simulated powers for maximin designs in the case of the treatment by period interaction. For each pair of ranges of the intraclass correlations, the asymptotic efficiency of each design versus the most efficient design is given within brackets. Note. The power printed in bold indicates the design for which the sample size calculation should yield a power of 80%. N: total sample size.

Data Availability
This study is not based on empirical data. However, the R programs that are used in this paper are available upon request from the corresponding author.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.