We study the loss of power of the stratified log-rank test (SLRT) compared to the unstratified log-rank test (ULRT) when there is a large number of strata with relatively small stratum sizes, in terms of the asymptotic distributions of the test statistics under local alternatives. The SLRT tends to lose information due to overstratification. It is advisable to test the homogeneity among strata before using the stratified log-rank test.

It is well known in survival analysis that the (unstratified) log-rank test (ULRT) is the most efficient invariant test under contiguous alternatives in the proportional hazards model [

In multicenter clinical trials with time-to-event as the primary outcome variable, we want to compare the treatment effects of two or more treatment methods. In the example in Section

Several previous studies have examined the power loss of the log-rank test. Akazawa et al. [

In this paper, we consider the case where there is a large number of strata, but each stratum has a relatively small sample size. We assume that patients are homogeneous within each treatment group. For this kind of data, we can construct both the stratified and unstratified log-rank tests. We derive a variance relation between the SLRT and ULRT and quantify the power loss due to unnecessary stratification by this relation. We illustrate our approach with data from a multi-center clinical trial (MADIT II) to test the treatment effect of an implantable defibrillator on survival of patients with reduced left ventricular function after myocardial infarction.
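The two tests can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes the standard two-sample log-rank statistic with the hypergeometric variance at each event time, and the function names (`logrank_components`, `ulrt_z`, `slrt_z`, `simulate_homogeneous`) and all simulation settings are hypothetical. The simulated data are not the MADIT II data.

```python
import numpy as np


def logrank_components(time, event, group):
    """Return (O - E, V) for the group == 1 arm of a two-sample log-rank test."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=int)
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event]):          # distinct observed event times
        at_risk = time >= t
        n = at_risk.sum()                     # total at risk just before t
        n1 = (at_risk & (group == 1)).sum()   # at risk in arm 1
        dead = event & (time == t)
        d = dead.sum()                        # events at t, both arms
        d1 = (dead & (group == 1)).sum()      # events at t, arm 1
        o_minus_e += d1 - d * n1 / n
        if n > 1:                             # hypergeometric variance term
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, var


def ulrt_z(time, event, group):
    """Unstratified log-rank Z statistic: pool all subjects into one risk set."""
    oe, v = logrank_components(time, event, group)
    return oe / np.sqrt(v)


def slrt_z(time, event, group, stratum):
    """Stratified log-rank Z statistic: sum O - E and V over strata."""
    time, event = np.asarray(time), np.asarray(event)
    group, stratum = np.asarray(group), np.asarray(stratum)
    oe_total, v_total = 0.0, 0.0
    for s in np.unique(stratum):
        m = stratum == s
        oe, v = logrank_components(time[m], event[m], group[m])
        oe_total += oe
        v_total += v
    return oe_total / np.sqrt(v_total)


def simulate_homogeneous(n_strata=50, size=4, seed=0):
    """Assumed toy data: many small strata sharing one baseline hazard."""
    rng = np.random.default_rng(seed)
    n = n_strata * size
    stratum = np.repeat(np.arange(n_strata), size)
    group = rng.integers(0, 2, size=n)
    # Arm 1 has a longer mean survival time; strata do not differ.
    t_event = rng.exponential(scale=np.where(group == 1, 1.5, 1.0))
    t_cens = rng.exponential(scale=3.0, size=n)
    time = np.minimum(t_event, t_cens)
    event = t_event <= t_cens
    return time, event, group, stratum


if __name__ == "__main__":
    time, event, group, stratum = simulate_homogeneous()
    print("ULRT Z:", ulrt_z(time, event, group))
    print("SLRT Z:", slrt_z(time, event, group, stratum))
```

With many small strata, each stratum-specific risk set is tiny, so the summed variance term of the stratified statistic is typically smaller than the pooled one; repeating the simulation over many seeds is one way to examine the resulting power difference empirically.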

This paper is organized as follows. Data and notation are described in Section

Suppose there are

The underlying survival times

The stratified log-rank test (SLRT) can be derived from the stratified Cox proportional hazards model [

Similar to the SLRT, the ULRT can be derived from the Cox proportional hazards model. The log partial likelihood function is

From martingale theory, the predictable covariation of

Suppose

This lemma is readily checked. From (

We study the SLRT and ULRT in a multi-center clinical trial [

In this paper, we studied the loss of power of the stratified log-rank test in multi-center clinical trials with a large number of centers but relatively small stratum sizes (assuming homogeneous strata). Our results show that the asymptotic variance of the SLRT is smaller than that of the ULRT, which makes the SLRT less powerful. Overstratification may therefore incur a loss of information compared to the unstratified log-rank test. However, our study has some limitations. First, we assumed that strata are homogeneous. In that case, the unstratified log-rank test should be the best choice. In practice, it is important to test homogeneity of strata before using the stratified log-rank test. Second, we considered the case with a large number of strata but small stratum sizes. Another case of interest is a small number of strata with large stratum sizes. Although (

To study the asymptotic distribution of SLRT under alternatives, we consider the following local alternatives:

The authors gratefully thank Dr. Arthur J. Moss (PI of MADIT-II) for allowing them to use the MADIT-II data in this paper. This research was supported by Grant 5U19AI056390-05 from the National Institutes of Health of the USA.