MANOVA for Nested Designs with Unequal Cell Sizes and Unequal Cell Covariance Matrices

We propose and study parametric bootstrap (PB) tests for heteroscedastic two-factor MANOVA with nested designs. For the problem of testing “main effects” of both factors, we develop a flexible test based on a parametric bootstrap approach. The PB test is shown to be invariant under affine-transformations. Moreover, the PB test does not depend on the chosen weights used to define the parameters uniquely. The proposed test is compared with the approximate Hotelling T2 (AHT) test by the simulations. Simulation results indicate that the PB test performs satisfactorily for various cell sizes and parameter configurations and generally outperforms the AHT test in terms of controlling the nominal size. For the heteroscedastic cases, the PB test outperforms the AHT test in terms of power. In addition, the PB test does not lose too much power when the homogeneity assumption is actually valid.


Introduction
There are many situations where we record more than one response variable from each sampling or experimental unit and where these units are allocated to or occur in treatment groups.Ecologists often record the abundances of many species from each sampling or experimental unit and physiologists commonly measure more than one variable (e.g., blood pressure, heart rate, etc.) on experimental animals.With multiple response variables, we might be more interested in whether there are group differences on all the response variables considered simultaneously.This is the aim of multivariate analysis of variance (MANOVA), the analogue of univariate ANOVA when we have multiple response variables for each experimental or sampling unit.
The simplest design is single factor MANOVA (or oneway MANOVA) which tests the effect of a factor, having  levels.If the cell covariance matrices are assumed to be equal, then there are some popular tests available to test the equality of the mean vectors.The tests that are commonly used are the Wilks likelihood ratio (WLR), Lawley-Hotelling trace (LHT), Bartlett-Nanda-Pillai (BNP), and Roys largest root tests ( [1], chap.8, sec.6).When there are some departures from the standard assumption, that is, unequal cell covariance matrices, these solutions were proposed by James [2], Johansen [3], Gamage et al. [4], and Krishnamoorthy and Lu [5], among others.The more complex design is multifactor MANOVA especially when the homogeneity of the cell covariance matrices assumption is seriously violated.There has been a continuous interest in the heteroscedastic twofactor MANOVA which tests in checking the significance of the effects of two factors  and , each having  and  levels, respectively, in a multivariate factorial experiment with crossed designs.Recently, Harrar and Bathke [6] attacked this problem via modifying the WLR, LHT, and BNP tests.Their main ideas focus on modifying the degrees of freedom of the random matrices involved in the test statistics so that the heteroscedasticity of the cell covariance matrices is taken into account and the WLR, LHT, and BNP tests can still be used but with the degrees of freedom estimated from the data via matching the first two moments; see some details in Section 2.2 of Zhang and Xiao [7].Zhang [8] proposed an approximate Hotelling  2 (AHT) test.A Wald-type test statistic is used.Its null distribution is approximated by a Hotelling  2 -distribution with one parameter estimated from the data.Some simulation studies conducted in Zhang [8] showed that the AHT test outperforms the modified LHT test of Harrar and Bathke [6].
Another useful two-factor design is the nested MANOVA [9].Notice that each of the testing problems associated with 2 Journal of Applied Mathematics the three null hypotheses in the heteroscedastic two-way MANOVA can be equivalently expressed in the form of the general linear hypothesis testing (GLHT) problem as described in Zhang [8].The GLHT problem is very general.It includes not only the main and interaction effect tests but also various post hoc and contrast tests as special cases.To construct the test for two-factor nested MANOVA, we may use the same Wald-type test statistic as in Zhang [8].However, according to our simulation studies, the empirical sizes of the AHT test for the two-factor nested MANOVA model may far exceed the nominal level.On the other hand, the AHT test needs to consider the the selected weights when the cell sizes are unequal.Therefore, it is important to develop a test procedure for the nesting effects and the nested effects with satisfactory size and power regardless of number of factorial effects and the sample sizes.
Accordingly, the present paper will develop a parametric bootstrap (PB) test for heteroscedastic two-factor nested MANOVA.We use standardized effects sum of squares and a natural test statistic obtained by replacing cell covariance matrices by the corresponding sample cell covariance matrices.The PB test admits several nice properties: (1) it can be simply conducted by a routine Monte Carlo algorithm; (2) it is shown to be invariant under affine-transformations; (3) The PB test does not depend on choices of the weights used to identify the parameters; and (4) it works well.Simulation results reported in Section 4 indicate that the PB test performs satisfactorily for various cell sizes and parameter configurations when the homogeneity assumption is seriously violated and generally outperforms the AHT test in terms of power and controlling size.The main ideas of the proposed PB test are closely related to the work by Xu et al. [10].The methodologies for the PB test are presented in Sections 2 and 3. Simulation results are presented in Section 4. Some concluding remarks are given in Section 5. Finally, some technical proofs of the main results are given in the Appendix.

Tests for Nested Effects
Consider a two-factor nested design model with factors  and .Suppose that the nesting effect corresponds to the factor  having  factor levels, and the nested effect corresponds to the factor  having a total of  = ∑  =1   levels with   levels of  nested within the th level of the factor  ( = 1, . . ., ).Suppose a -variate random sample of size   is available under (, )th level,  = 1, . . ., ;  = 1, . . .,   .Let Y  ,  = 1, . . ., ;  = 1, . . .,   ;  = 1, . . .,   represent these random vectors and y  represent their observed (sample) values.Assume that   >  so that positive definite sample covariance matrices can be computed for each cell of the design.Suppose that Y  satisfy the following model: where   :  × 1 and Σ  :  ×  are the cell mean vector and cell covariance matrix of the random sample at the (, )th cell and e  is an experimental error vector term.All these samples are assumed to be independent with each other.In the two-factor nested model, the cell mean vectors   are usually decomposed into the form where  0 is the grand mean vector,   is the effect vector of the th level of  on each of the  variables in Y  , and   is the effect due to the th level of the factor  nested within the th level of the factor  so that (1) can be further written as the following two-factor multivariate nested model with unequal error covariance matrices: e  ∼   (0, Σ  ) ,  = 1, . . ., ;  = 1, . . .,   ;  = 1, . . .,   . (3) In order to have  0 ,   , and   uniquely defined, we need to have additional constraints.Let  1 , . . .,   and V 1 , . . ., V   be nonnegative weights such that ∑  =1   > 0 and ∑   =1 V  > 0. We consider the following constraints: where  1 , . . .,   and V  , . . ., V   are nonnegative weights such that ∑  =1   > 0 and ∑   =1 V  > 0 for each .In this section, we are interested in testing the null hypothesis,  0 :   = 0;  = 1, . . ., ,  = 1, . . .,   , against its natural alternative hypothesis.The null hypothesis in (5) aims to test if the nested effect corresponding to the factor  is statistically significant.The sample mean vector and the sample covariance matrix of the (, )th cell are denoted by Y  and S  , respectively, where where μ0 and α are solutions of  0 and   that minimize the quadratic equation subject to the constraints given in (4).In fact, denoting  = (  0 ,   1 , . . .,    ) and θ = (μ  0 , α 1 , . . ., α  )  , it follows from Theorem 5.2.5 in Wang and Chow [11] that where Notice that θ indeed depends on L of chosen weights.However, we can prove that the standardized sum of squares in (9) does not depend on L. This is the following Theorem.
where M − denotes any generalized inverse of matrix M.
As shown in Zhang [8], the testing problem associated with the null hypothesis (5) can also be equivalently expressed in the form of the general linear hypothesis testing (GLHT) problem as  0 : C = 0, where C = H  A  ⊗ I  , ) , A  = ( and k  = (V 1 , . . ., V   )  ,  = 1, 2, . . ., .Then the associated Wald-type test statistic is given as where μ = Y, and S is defined as in (17).
In the following, we shall describe two tests based on (17) and (19), respectively.
2.1.The Approximate Hotelling  2 (AHT) Test.Similar to Zhang [8], we proposed a test referred as the AHT test, which is based on the test statistic where and Ω = where where D is any nonsingular matrix and  is any given vector.This property is desirable since in practice, the cell responses Y  (1) are often recentered or rescaled before an inference is conducted.The recentering and rescaling transformations are special cases of (25).
Theorem 3. The PB test (23) is affine-invariant in the sense that both the observed test statistic s (24) and the distribution of PB pivot variable SB (22) are affine-invariant.
The model ( 3) is identifiable when the constraints (4) are imposed.In many situations there are no natural weights to justify a particular test procedure.For this, we have the following result.

Theorem 4. The PB test (23) is invariant on the different weight choices in constraints (4).
Notice that the test statistic (17) does not depend on the chosen weights.The conclusion in Theorem 4 is obvious.However, we have the following property.
This property is desirable since in practice, in order to avoid computation of generalized inverse of matrices and enhance the accuracy of computation, we have taken a particular L such as L = (0, 1/, . . ., 1/) ⊗ I  in the simulation study.

Tests for the Nesting Effects
When the nested effects are present, the nesting effect   can not reflect the effect of   because it depends on which level of factor  it is in.As pointed in Searle [12], Chap.7, in the case of  = 1, a popular solution to the problem is not quite a test for   = 0 in the presence of nested effects but rather to test the null hypothesis We will see that the problem of testing ( 27) is the same case of the problem of comparing  = ∑  =1   normal mean vectors when the cell covariance matrices are unknown and arbitrary [5].In fact, if Σ  are known, then is the best linear unbiased estimator of  0 , and a natural statistic for testing ( 27) is where ) is an idempotent matrix with rank ( − 1); we have The noncentrality parameter In the following, we describe the PB test for  0| in (27).Recall that under  0| the vector Y have the mean X 1  0 .As the test statistic in (32) is location invariant under the group of location transformations G 1 = {Y + X 1 ,  ∈ R  }, without loss of generality, we can take X 1  0 = 0. Using these facts, the parametric bootstrap pivot variable can be developed as follows.For a given (y 11 , . . ., y   ; s 11 , . . ., s   ), let Y  ∼   (0, (1/  )s  ) and S  ∼   (  − 1, (1/(  − 1))s  ),  = 1, . . ., ,  = 1, . . .,   .Then the PB pivot variable based on the test statistic (32) is given by where Note that the PB test provided in this section has the same invariance properties as in Section 2.

Monte Carlo Studies
In this section, we need to compare the PB test with the rates) and powers for two-factor nested MANOVA models via simulations.As pointed out in Section 3, the problem of testing ( 27) is the same case of the problem of comparing  = ∑  =1   normal mean vectors when the cell covariance matrices are unknown and arbitrary.Due to this reason, we will look only at the nested effects for comparisons.
Let the two factors be  and , respectively.Suppose that the nesting effect of the factor  has  factor levels, and the nested effect of the factor  has a total of  = ∑  =1   levels with   levels of B nested within the th level of the factor . Let n = [ 11 ,  12 , . . .,    ] denote the vector of cell sizes.For given n and covariance matrices Σ  ,  = 1, 2, . . ., ,  = 1, 2, . . .,   , we first generate ∑  =1 ∑   =1   multivariate samples as where the cell mean vectors   =  11 + h/ with  11 being the first cell mean vector, h a constant unit vector specifying the direction of the cell mean differences, and  a tuning parameter controlling the amount of the cell mean differences.We independently generate the  entries of the error terms e  from the (0, 1) distribution so that we always have E(e  ) = 0 and Cov(e  ) = I  .This means that (36) will generate the ()th multivariate normal sample y  ,  = 1, 2, . . .,   with the given mean vector   and covariance matrix Σ  .Without loss of generality, we specify  11 as 0 and h as h 0 /‖h 0 ‖ where h 0 = [1, 2, . . ., ]  for any given dimension  and ‖h 0 ‖ denotes the usual  2 -norm of h 0 .To estimate the sizes and powers of the AHT test, we used simulation consisting of 10,000 runs and recorded corresponding  values.Notice that two nested "do loops" are required to estimate the sizes and powers of the PB test, we used 2500 runs for outer "do loops" (for generating the data) and 5000 runs for inner "do loops" for estimating the bootstrap  values.The empirical sizes (when  = 0) and powers (when  > 0) of the two tests are the proportions of rejecting the null hypothesis, that is, when their  values are less than the nominal significance level .In all the simulations conducted, we used  = 5% for simplicity.Notice that the PB test does not depend on chosen weights.Here we report only the comparative studies for the equal-weight method to specify the weights of the AHT test so that the AHT test in Zhang [8] may be used for showing properties of the PB test.The empirical sizes and powers of the two tests for nested effect tests, together with the associated tuning parameters, are presented in Tables 1-3, in the columns labeled with AHT and PB under " = 0" and " > 0", respectively.As seen from the three tables, three sets of the tuning parameters for the cell covariance matrices are examined, with the first set specifying the homogeneous cases; four sets of the cell sizes are specified, with the first two sets specifying the balanced cell size cases.To measure the overall performance of a test in terms of maintaining the nominal size , we use the average relative error defined in Zhang [8] as ARE =  −1 ∑  =1 | λ − |/ × 100 where λ denotes the th empirical size for  = 1, 2, . . ., ,  = 0.05, and  is the number of empirical sizes under consideration.The smaller ARE value indicates the better overall performance of the associated test.Usually, when ARE ≤ 10, the test performs very well; when 10 < ARE ≤ 20, the test performs reasonably well; and when ARE > 20, the test does not perform well since its empirical sizes are either too liberal or too conservative and hence may be unacceptable.Notice that for a good test, the larger the cell sizes, the smaller the ARE values.The ARE values of the two tests under the two error schemes are also presented in these three tables.Notice that for simplicity, in the specification of the covariance and size tuning parameters, we often use (u  ) to denote "u repeats  times." Table 1 shows the empirical sizes and powers of the two tests for a bivariate case with  = 2 and  1 = 16,  2 = 20.With  =  1 +  2 = 36, one may be able to check how the two tests behave when one of the factors has a large number of levels.Tables 2 and 3 show the empirical sizes and powers of the two tests for a three-variate case with  = 3 and  1 = 8,  2 = 10,  3 = 12, and a 10-variate case with  = 3 and  1 = 3,  2 = 5,  3 = 7, respectively.These two tables allow us to compare the two tests for higher-dimensional data.
In the following, let us compare the AHT and PB tests via examining their empirical sizes and powers.It is seen from the three tables that for all the cases, the PB test generally outperforms the AHT test for all the cases under consideration as shown by their empirical sizes and the associated ARE values presented in the three tables.The AHT test performs reasonably well only for 10-dimensional data.In the homogeneous cases, the AHT test appears to be more powerful than the PB tests because of its inflated empirical sizes exceeding the nominal level considerably.For the heteroscedastic cases, we once again observe from Tables 2 and 3 that the PB test performs superior to the AHT test in terms of controlling nominal level and powers.
We conclude this simulation section via summarizing all the simulation studies conducted.In terms of size, the overall conclusion is that the PB test is a flexible procedure that performs satisfactorily, regardless of the cell sizes, values of the cell covariance matrices, and the number of effects being compared.In terms of power, the PB test generally outperforms the AHT test for most heteroscedastic cases under consideration.Thus, one may recommend to use the PB test as a good alternative in practical applications because of its simplicity and accuracy.

Concluding Remarks
The available classical tests for the two-factor MANOVA model with crossed designs and heteroscedastic cell covariance matrices have serious Type I error problems that have been overlooked.To address this serious problem, Zhang [8] proposed the AHT test.In this paper, we proposed and studied the PB test and the AHT test for the two-factor MANOVA model with nested designs and heteroscedastic cell covariance matrices.We showed that the PB test is invariant under affine-transformations and different choices of weights used to define the parameters uniquely.We demonstrated via intensive simulations that the PB test generally performs well and outperforms the AHT test in terms of size and power for most cell sizes and parameter configurations.∑  =1   > 0. Then it can be easily verified that X 0 and L 0 satisfy the following conditions: