We propose a statistical method for constructing simultaneous confidence intervals on all linear combinations of means without assuming equal variances, a setting in which the classical Scheffé simultaneous confidence intervals no longer preserve the familywise error rate (FWER). The proposed method is useful when the number of comparisons on linear combinations of means is extremely large. The FWERs of the proposed simultaneous confidence intervals under various configurations of variances are assessed through simulations and are found to preserve the predefined nominal level very well. An example of pairwise comparisons on heteroscedastic means is given to illustrate the proposed method.

Multiple comparisons on a large number of linear combinations of means are of general interest in many applications. If an inferential procedure depends on the number of comparisons, it can become quite challenging as that number grows. In addition, we often cannot assume that all variances are equal. Many authors have proposed methods for multiple comparisons of means. Scheffé [

The problem of comparing two means when the population variances are unequal is known as the Behrens-Fisher problem [

Suppose that we have

We now consider the problem of constructing simultaneous intervals without assuming equal variance. Let

Note that

Finding the exact distribution of linear combination of

We then set

It can be estimated by

To derive the generalized Scheffé's interval we would need the following projection lemma (see [

Applying the projection lemma this probability can be pivoted to give the following generalized

In multiple comparisons, the Type I error refers to the probability of incorrectly rejecting at least one of the null hypotheses that make up the family. The validity of the proposed generalized Scheffé confidence intervals rests largely on successfully controlling the FWER at a given nominal level

Two major factors, the sample sizes and the population variances, affect the performance of Scheffé's confidence intervals. We will show through simulation that the FWER is inflated when the population variances are unequal.

A variety of configurations of variances and sample sizes will be selected to assess the performance of the generalized Scheffé method. To this end, the number of groups is chosen to be

Coverage rates of 95% Scheffé’s intervals (S) and generalized Scheffé (GS) intervals: two sets of inferences are considered, the population means and pairwise mean differences.

Sample size | Equal variances (0.1, 0.1, 0.1, 0.1), S | Equal variances (0.1, 0.1, 0.1, 0.1), GS | Equal variances (1, 1, 1, 1), S | Equal variances (1, 1, 1, 1), GS | Unequal variances (0.3, 0.3, 0.1, 0.1), S | Unequal variances (0.3, 0.3, 0.1, 0.1), GS | Unequal variances (3, 3, 1, 1), S | Unequal variances (3, 3, 1, 1), GS |
---|---|---|---|---|---|---|---|---|
Balanced | | | | | | | | |
(5, 5, 5, 5) | 98.00 | 98.60 | 98.45 | 99.00 | 93.60 | 96.85 | 94.05 | 97.35 |
(10, 10, 10, 10) | 97.90 | 98.45 | 98.20 | 98.65 | 94.75 | 97.10 | 95.10 | 97.30 |
(20, 20, 20, 20) | 97.70 | 97.90 | 98.20 | 98.35 | 93.90 | 96.25 | 94.80 | 96.45 |
(50, 50, 50, 50) | 97.90 | 97.95 | 98.35 | 98.35 | 94.35 | 96.75 | 94.55 | 96.60 |
Unbalanced | | | | | | | | |
(5, 5, 10, 10) | 98.20 | 98.75 | 98.20 | 98.70 | 87.70 | 97.50 | 87.20 | 97.40 |
(5, 5, 20, 20) | 98.40 | 99.10 | 97.90 | 98.45 | 73.00 | 96.20 | 76.50 | 96.65 |
(10, 10, 20, 20) | 97.95 | 98.05 | 98.35 | 98.35 | 88.40 | 97.30 | 87.05 | 96.65 |
(10, 10, 50, 50) | 98.60 | 98.80 | 98.70 | 98.65 | 73.95 | 96.70 | 72.55 | 97.10 |

Although Scheffé’s intervals apply to inference on all linear combinations, for simplicity we focus on two sets of inferences only: the population means and their pairwise differences. For each configuration we conducted 5,000 simulation runs, and in each run we computed 95% Scheffé intervals and generalized Scheffé intervals for both the population means and the pairwise mean differences. We then obtained the coverage rates, that is, the proportion of runs in which the intervals contain the true values; all true means equal 0.
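The coverage experiment for the classical Scheffé intervals can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the reported results: the function name `scheffe_coverage` is hypothetical, the intervals use the textbook form with a pooled variance and critical constant $\sqrt{k F_{1-\alpha;\,k,\,N-k}}$, and a run is counted as covered only if every interval (the $k$ means and all pairwise differences) contains 0. The tuples such as (3, 3, 1, 1) are treated here as standard deviations, an assumption motivated by the tenfold difference in widths between the (0.1, ...) and (1, ...) columns of the width table. The generalized intervals are not implemented.

```python
import numpy as np
from scipy import stats


def scheffe_coverage(sizes, sds, n_runs=2000, alpha=0.05, seed=0):
    """Estimate simultaneous coverage of classical Scheffé intervals for
    the k means and all pairwise differences when every true mean is 0."""
    rng = np.random.default_rng(seed)
    sizes = np.asarray(sizes)
    sds = np.asarray(sds, dtype=float)   # assumed to be standard deviations
    k, N = len(sizes), int(sizes.sum())

    # Coefficient vectors: k unit vectors (means) plus all pairwise differences.
    eye = np.eye(k)
    combos = [eye[i] for i in range(k)]
    combos += [eye[i] - eye[j] for i in range(k) for j in range(i + 1, k)]
    C = np.array(combos)                              # shape (m, k)

    # Scheffé critical constant for all linear combinations of k means.
    crit = np.sqrt(k * stats.f.ppf(1 - alpha, k, N - k))
    scale = np.sqrt((C**2 / sizes).sum(axis=1))       # sqrt(sum c_i^2 / n_i)

    covered = 0
    for _ in range(n_runs):
        samples = [rng.normal(0.0, sd, n) for sd, n in zip(sds, sizes)]
        means = np.array([s.mean() for s in samples])
        # Pooled standard deviation; only justified under equal variances.
        ss = sum(((s - s.mean()) ** 2).sum() for s in samples)
        s_pooled = np.sqrt(ss / (N - k))
        half_width = crit * s_pooled * scale
        # Every true combination equals 0, so all intervals must contain 0.
        covered += bool(np.all(np.abs(C @ means) <= half_width))
    return covered / n_runs


# Hypothetical usage: one equal and one unequal configuration.
for sds in [(1, 1, 1, 1), (3, 3, 1, 1)]:
    print(sds, scheffe_coverage((5, 5, 20, 20), sds, n_runs=500))
```

Under these assumptions one would expect the estimate for the unequal, unbalanced configuration to fall well below the nominal 95%, in line with the S columns of the coverage table, while the equal-variance estimate stays above it.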

Table

It would also be interesting to see how different in width the two types of intervals are. Comparing (

The averaged

Comparison of interval widths between Scheffé’s and generalized Scheffé’s methods. Their interval widths differ in quantities:

Sample size | Equal variances (0.1, 0.1, 0.1, 0.1), S | Equal variances (0.1, 0.1, 0.1, 0.1), GS | Equal variances (1, 1, 1, 1), S | Equal variances (1, 1, 1, 1), GS | Unequal variances (0.3, 0.3, 0.1, 0.1), S | Unequal variances (0.3, 0.3, 0.1, 0.1), GS | Unequal variances (3, 3, 1, 1), S | Unequal variances (3, 3, 1, 1), GS |
---|---|---|---|---|---|---|---|---|
Balanced | | | | | | | | |
(5, 5, 5, 5) | 0.343 | 0.370 | 3.422 | 3.680 | 0.754 | 0.909 | 7.598 | 9.182 |
(10, 10, 10, 10) | 0.322 | 0.331 | 3.229 | 3.323 | 0.718 | 0.813 | 7.166 | 8.105 |
(20, 20, 20, 20) | 0.315 | 0.319 | 3.153 | 3.194 | 0.703 | 0.778 | 7.032 | 7.780 |
(50, 50, 50, 50) | 0.310 | 0.312 | 3.105 | 3.120 | 0.694 | 0.759 | 6.945 | 7.595 |
Unbalanced | | | | | | | | |
(5, 5, 10, 10) | 0.329 | 0.350 | 3.284 | 3.490 | 0.602 | 0.905 | 6.052 | 9.125 |
(5, 5, 20, 20) | 0.318 | 0.340 | 3.196 | 3.423 | 0.489 | 0.905 | 4.894 | 9.093 |
(10, 10, 20, 20) | 0.318 | 0.326 | 3.173 | 3.250 | 0.597 | 0.812 | 5.951 | 8.092 |
(10, 10, 50, 50) | 0.312 | 0.321 | 3.128 | 3.218 | 0.466 | 0.810 | 4.669 | 8.138 |

It can be seen that they are very close to each other in the case of equal variances. However, in the case of unequal variances,

Empirical density plots: each density curve is generated from 5000 simulation runs. The solid line is for the

Configuration (1) denotes equal variances for the 4 means, with equal or different sample sizes. Configuration (2) denotes unequal variances for the 4 means, with equal or different sample sizes. We calculate the empirical distribution function of

One last comment: the simulation results above suggest that the generalized Scheffé intervals tend to be wider than the Scheffé intervals. This is our overall impression, but it may not hold in general. In the simulations, we occasionally observed narrower generalized Scheffé intervals. This feature also appears in the data analysis example in the next section.

Solomon et al. [

Table

The simultaneous Scheffé intervals and Generalized Scheffé’s intervals on means and pairwise mean differences in the cigarette example.

Parameters | Scheffé | Generalized Scheffé |
---|---|---|
Mean | | |
PC | (20.66, 28.94) | (20.76, 28.84) |
C | (10.95, 22.25) | (11.08, 22.12) |
P | (26.02, 31.58) | (26.09, 31.51) |
A | (10.08, 17.32) | (10.16, 17.24) |
Pairwise comparisons | | |
PC − C | (1.19, 15.21) | (1.36, 15.04) |
PC − P | (−8.98, 0.98) | (−8.87, 0.87) |
PC − A | (5.59, 16.60) | (5.73, 16.47) |
C − P | (−18.49, −5.90) | (−18.35, −6.05) |
C − A | (−3.81, 9.61) | (−3.65, 9.45) |
P − A | (10.53, 19.67) | (10.64, 19.56) |

Sample sizes, means, and sample standard deviations of 349 women who stopped smoking during pregnancy period [

Label | Condition | Description | Sample size | Mean | Standard deviation |
---|---|---|---|---|---|
PC | Precontemplation | Smokes and has no plan to quit smoking | 69 | 24.8 | 13.3 |
C | Contemplation | Smokes but is thinking of quitting | 37 | 16.6 | 5.2 |
P | Preparation | Smokes but has made some effort at quitting | 153 | 28.8 | 12.2 |
A | Action | Has already quit | 90 | 13.7 | 8.8 |
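As an illustration, the classical Scheffé intervals can be computed directly from the summary statistics in this table. The sketch below assumes the textbook form $\sum_i c_i \bar{y}_i \pm \sqrt{k F_{1-\alpha;\,k,\,N-k}}\; s_p \sqrt{\sum_i c_i^2 / n_i}$ with a pooled standard deviation $s_p$; the function name `scheffe_interval` is hypothetical, and the generalized intervals are not reproduced here.

```python
import numpy as np
from scipy import stats

# Summary statistics from the table above (groups PC, C, P, A).
labels = ["PC", "C", "P", "A"]
n = np.array([69, 37, 153, 90])
mean = np.array([24.8, 16.6, 28.8, 13.7])
sd = np.array([13.3, 5.2, 12.2, 8.8])


def scheffe_interval(c, n, mean, sd, alpha=0.05):
    """Classical Scheffé interval for sum(c_i * mu_i), assuming a common
    variance estimated by the pooled sample variance."""
    c = np.asarray(c, dtype=float)
    k, N = len(n), int(n.sum())
    s2_pooled = ((n - 1) * sd**2).sum() / (N - k)
    # Critical constant valid for all linear combinations of the k means.
    crit = np.sqrt(k * stats.f.ppf(1 - alpha, k, N - k))
    estimate = float(c @ mean)
    half_width = float(crit * np.sqrt(s2_pooled * (c**2 / n).sum()))
    return estimate - half_width, estimate + half_width


# Intervals for the four group means.
for i, label in enumerate(labels):
    lo, hi = scheffe_interval(np.eye(4)[i], n, mean, sd)
    print(f"{label}: ({lo:.2f}, {hi:.2f})")

# Interval for one pairwise difference, PC minus C.
lo, hi = scheffe_interval([1, -1, 0, 0], n, mean, sd)
print(f"PC - C: ({lo:.2f}, {hi:.2f})")
```

Under these assumptions the printed intervals should agree closely with the Scheffé column of the interval table. Any other linear combination can be examined by passing a different coefficient vector `c`.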

One may make a number of inferences with a joint confidence level of

The Scheffé method is one of the most commonly used methods for simultaneous inference on all linear combinations of means. Because Scheffé intervals cover all possible linear combinations of means, they are beneficial when a large number of linear combinations need to be compared. The assumption of equal variances is needed to control the Type I error. When this assumption is violated, the proposed method offers a convenient way to construct simultaneous confidence intervals with the Type I error controlled at a prespecified nominal level. Simulation results show that the FWER of the proposed simultaneous confidence intervals is well preserved at the nominal level, so the equal-variance assumption need not be made.