Let Continuous Outcome Variables Remain Continuous

The complementary log-log model is an alternative to the logistic model. In many areas of research, the outcome data are continuous. We aim to provide a procedure that allows the researcher to estimate the coefficients of the complementary log-log model without dichotomizing and without loss of information. We show that the sample size required for a specific power with the proposed approach is substantially smaller than with the dichotomizing method. We find that estimators derived from the proposed method are consistently more efficient than those from the dichotomizing method. To illustrate the use of the proposed method, we employ data arising from the NHSI.


Introduction
Recently, logistic regression has become a popular tool in biomedical studies. The parameter in logistic regression is interpreted as a log odds ratio, which is easy for practitioners such as physicians to understand. The probit and complementary log-log models are alternatives to the logistic model. For a covariate X and a binary response variable Y, let π(x) = P(Y = 1 | X = x). A model related to the complementary log-log link is the log-log link. For it, π(x) approaches 0 sharply but approaches 1 slowly. When the complementary log-log model holds for the probability of a success, the log-log model holds for the probability of a failure [1].
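The relationship between the two links can be checked numerically. The sketch below assumes the usual textbook forms of the links, log(−log(1 − π)) for the complementary log-log and log(−log(π)) for the log-log; the function names are illustrative and do not come from the paper.

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def cloglog_inv(eta):
    """Inverse complementary log-log link: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

def loglog(p):
    """Log-log link in the assumed convention: log(-log(p))."""
    return math.log(-math.log(p))

# If P(success) follows a complementary log-log model with linear predictor eta,
# then P(failure) = 1 - P(success) follows a log-log model with the same eta.
for eta in [-3.0, -1.0, 0.0, 1.0, 2.0]:
    success = cloglog_inv(eta)
    failure = 1.0 - success
    print(f"eta={eta:5.1f}  P(success)={success:.4f}  loglog(P(failure))={loglog(failure):.4f}")
```

The last column reproduces eta, which is the statement that the log-log model holds for the probability of a failure.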
These models use a categorical (dichotomous or polytomous) outcome variable. In many areas of research, however, the outcome data are continuous. Many researchers have no hesitation in dichotomizing a continuous variable, but this practice does not make use of within-category information. Several investigators have noted the disadvantages of dichotomizing both independent and outcome variables [2][3][4][5][6][7][8][9][10]. Ragland [11] showed that the magnitude of the odds ratio and the statistical power depend on the cutpoint used to dichotomize the response variable. From a clinical point of view, binary outcomes may be preferred for reasons such as (1) setting diagnostic criteria for disease and (2) offering a simpler interpretation of common effect measures from statistical models, such as odds ratios and relative risks. However, these advantages come at the cost of lost information. From a statistical point of view, this loss of information means that more samples are required to attain a prespecified power.
Moser and Coombs [12] provided a closed-form relationship that allows a direct comparison between the logistic and linear regression coefficients. They also provided a procedure that allows the researcher to analyze the original continuous outcome without dichotomizing. To date, a method that applies the complementary log-log model without dichotomizing and without loss of information has not been available.
We aim to (a) provide a method that allows the researcher to estimate the coefficients of the complementary log-log model without dichotomizing and without loss of information, (b) show that the coefficient of the complementary log-log model can be interpreted in terms of the regression coefficients, and (c) demonstrate that the coefficient estimates from this method have smaller variances and shorter confidence intervals than those from the dichotomizing method.

Model.
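Let $y_1, y_2, \ldots, y_n$ be $n$ independent observations on $y$, and let $x_1, x_2, \ldots, x_{p-1}$ be $p - 1$ predictor variables thought to be related to the response variable $y$. The multiple linear regression model for the $i$th observation can be expressed as
$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + E_i$$
or
$$y_i = x_i'\beta + E_i,$$
where $x_i' = (1, x_{i1}, \ldots, x_{i,p-1})$ and $\beta = (\beta_0, \beta_1, \ldots, \beta_{p-1})'$. To complete the model, we make the following assumptions: (1) $\operatorname{E}(E_i) = 0$ for $i = 1, 2, \ldots, n$, (2) $\operatorname{var}(E_i) = \sigma^2$ for $i = 1, 2, \ldots, n$, (3) the independent $E_i$ follow an extreme value distribution for $i = 1, 2, \ldots, n$.
Writing the model for each of the $n$ observations in matrix form, we have
$$\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1,p-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{n,p-1} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_{p-1} \end{pmatrix} + \begin{pmatrix} E_1 \\ \vdots \\ E_n \end{pmatrix}$$
or
$$y = X\beta + E.$$
The preceding three assumptions on $E_i$ and $y_i$ can be expressed in terms of this model: $\operatorname{E}(E) = 0$, $\operatorname{cov}(E) = \sigma^2 I$, and the elements of $E$ are independent and follow an extreme value distribution.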

(Largest) Extreme Value Distribution.
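The PDF and CDF of the extreme value distribution with location $a$ and scale $b > 0$ are given by
$$f(e) = \frac{1}{b}\exp\left\{-\frac{e - a}{b} - \exp\left(-\frac{e - a}{b}\right)\right\}, \qquad F(e) = \exp\left\{-\exp\left(-\frac{e - a}{b}\right)\right\}. \quad (6)$$
A mean of 0 and a variance of $\sigma^2$ correspond to $b = \sigma\sqrt{6}/\pi$ and $a = -\gamma b$, where $\gamma \approx 0.5772$ is Euler's constant. It is easy to check that, for a fixed value $C$ and a one-unit increase in $x_j$ with the other predictors held fixed,
$$\frac{\ln P(y \le C \mid x_j + 1)}{\ln P(y \le C \mid x_j)} = \omega_j, \quad (7)$$
where
$$\omega_j = \exp\left(\frac{\pi\beta_j}{\sigma\sqrt{6}}\right).$$
To return to a random sample of observations $(y_1, y_2, \ldots, y_n)$, we conclude that the PDF and CDF of each independent $y_i$ are given by (6) with $e$ replaced by $y_i - x_i'\beta$, and the estimate corresponding to (7) is given by
$$\hat\omega_j = \exp\left(\frac{\pi\hat\beta_j}{\hat\sigma\sqrt{6}}\right),$$
where the estimate $\hat\beta_j$ is the $(j + 1)$th element of the vector $\hat\beta = (\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_j, \ldots, \hat\beta_{p-1})'$. It is readily shown that the results also hold true for the smallest extreme value distribution (Appendix A).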
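As a numerical illustration, the sketch below assumes largest extreme value errors with mean 0 and variance σ², which fixes the scale at σ√6/π and the location at minus Euler's constant times the scale, and takes ω_j = exp(πβ_j/(σ√6)) as stated later in the power section. The helper names and the parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def gumbel_params(sigma):
    """Location a and scale b of a largest extreme value law with mean 0 and variance sigma**2."""
    b = sigma * math.sqrt(6.0) / math.pi
    a = -EULER_GAMMA * b
    return a, b

def cdf_y(cutoff, mean_part, sigma):
    """P(y <= cutoff) when y = mean_part + E and E is a mean-zero largest extreme value error."""
    a, b = gumbel_params(sigma)
    return math.exp(-math.exp(-(cutoff - mean_part - a) / b))

def omega(beta_j, sigma):
    """omega_j = exp(pi * beta_j / (sigma * sqrt(6)))."""
    return math.exp(math.pi * beta_j / (sigma * math.sqrt(6.0)))

# Illustrative (made-up) parameter values.
beta0, beta1, sigma, cutoff, x = 1.0, 0.4, 2.0, 3.0, 0.5

# Ratio of log-probabilities after a one-unit increase in x.
ratio = (math.log(cdf_y(cutoff, beta0 + beta1 * (x + 1.0), sigma))
         / math.log(cdf_y(cutoff, beta0 + beta1 * x, sigma)))
print(ratio, omega(beta1, sigma))
```

The two printed numbers agree for any choice of cutoff and x, which is why ω_j does not depend on where the continuous outcome would be cut.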
Computational and Mathematical Methods in Medicine 3

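The Proposed Confidence Intervals. Let
$$\hat\beta = (X'X)^{-1}X'y, \qquad \hat\sigma^2 = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{n - p}.$$
According to the preceding three assumptions on $E_i$ and $y_i$, we obtain
$$\operatorname{E}(\hat\beta) = \beta, \qquad \operatorname{E}(\hat\sigma^2) = \sigma^2.$$
Therefore, $\hat\beta$ and $\hat\sigma^2$ are unbiased estimators of $\beta$ and $\sigma^2$. We have assumed that $E_i$ follows an extreme value distribution, and we approximate this error distribution by the normal distribution. For normally distributed observations, $\hat\beta_j/(\hat\sigma\delta_j)$ follows a noncentral $t$ distribution with $n - p$ degrees of freedom and noncentrality parameter $-\infty < \beta_j/(\sigma\delta_j) < \infty$, so that
$$P\left(t_{1-\alpha/2}\left[n - p, \frac{\beta_j}{\sigma\delta_j}\right] \le \frac{\hat\beta_j}{\hat\sigma\delta_j} \le t_{\alpha/2}\left[n - p, \frac{\beta_j}{\sigma\delta_j}\right]\right) = 1 - \alpha,$$
where $t_{\alpha/2}[r, s]$ represents the $100(1 - (\alpha/2))$ percentile point of a noncentral $t$ distribution with $r$ degrees of freedom and noncentrality parameter $-\infty < s < \infty$, and $\delta_j$ is the square root of the $(j + 1)$st diagonal element of $(X'X)^{-1}$. We then approximate the percentiles of the noncentral $t$ distribution by standard normal percentiles [13], which yields an approximate $100(1 - \alpha)$ percent confidence interval for $\omega_j$.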
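The estimation step can be sketched for a single predictor. The interval below is a cruder large-sample stand-in, not the noncentral t construction described above: it treats β̂_1/(σ̂δ_1) as normal with unit variance, which ignores the sampling variability of σ̂. Here δ_1 is taken as the square root of the slope's diagonal element of (X'X)⁻¹. The function names, the seed, and the parameter values are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def draw_error(rng, sigma):
    """One largest extreme value error with mean 0 and standard deviation sigma."""
    b = sigma * math.sqrt(6.0) / math.pi      # scale
    a = -EULER_GAMMA * b                      # location that centres the mean at 0
    t = rng.expovariate(1.0)
    while t == 0.0:                           # guard against log(0)
        t = rng.expovariate(1.0)
    return a - b * math.log(t)                # -log(Exp(1)) is a standard Gumbel draw

def fit_simple_ols(x, y):
    """Least squares for y = b0 + b1*x; returns (b0, b1, sigma_hat, delta_1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))      # divisor n - p with p = 2
    delta_1 = math.sqrt(1.0 / sxx)            # sqrt of the slope's diagonal element of (X'X)^-1
    return b0, b1, sigma_hat, delta_1

def omega_interval(b1, sigma_hat, delta_1, z=1.959964):
    """Point estimate and rough normal-theory interval for omega = exp(pi*beta/(sigma*sqrt(6)))."""
    k = math.pi / math.sqrt(6.0)
    centre = b1 / sigma_hat
    return (math.exp(k * centre),
            math.exp(k * (centre - z * delta_1)),
            math.exp(k * (centre + z * delta_1)))

rng = random.Random(2012)                     # arbitrary seed
true_beta1, true_sigma = 0.5, 1.0             # made-up values
x = [rng.uniform(-2.0, 2.0) for _ in range(400)]
y = [true_beta1 * xi + draw_error(rng, true_sigma) for xi in x]
b0, b1, sigma_hat, delta_1 = fit_simple_ols(x, y)
est, lo, hi = omega_interval(b1, sigma_hat, delta_1)
print(f"omega_hat = {est:.3f}, rough 95% interval = ({lo:.3f}, {hi:.3f})")
```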

Comparison of the Two Methods
Let $Y_i$ be a continuous outcome variable. For a fixed value of $C$, we define the dichotomized variable $Y_i^*$ to equal 1 or 0 according to which side of $C$ the observation $Y_i$ falls.

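Suppose that $Y_1^*, \ldots, Y_n^*$ form a random sample of observations, and we fit a complementary log-log model
$$\ln\{-\ln[1 - \pi(x_i)]\} = x_i'\theta, \quad (17)$$
where $\pi(x_i) = P(Y_i^* = 1 \mid x_i)$ and $\omega_j^* = \exp(\theta_j)$. In general, maximum likelihood estimation (MLE) can be used to estimate the parameter $\theta = (\theta_0, \ldots, \theta_{p-1})'$. Let $\hat\theta = (\hat\theta_0, \ldots, \hat\theta_{p-1})'$ be the $p \times 1$ ML estimate of $\theta$, and let $\operatorname{COV}(\hat\theta)$ be the $p \times p$ covariance matrix of $\hat\theta$. Using $\operatorname{COV}(\hat\theta)$ from (23), one can construct confidence intervals. This matrix has as its diagonal the estimated variances of the ML estimates; the $(j + 1)$th diagonal element is $\hat\sigma^2_{\theta_j}$. Therefore, for large samples, $(\hat\theta_j^L, \hat\theta_j^U) = (\hat\theta_j - z_{\alpha/2}\hat\sigma_{\theta_j}, \hat\theta_j + z_{\alpha/2}\hat\sigma_{\theta_j})$ is a $100(1 - \alpha)$ percent confidence interval for the true $\theta_j$. Then $(\exp(\hat\theta_j^L), \exp(\hat\theta_j^U))$ is a $100(1 - \alpha)$ percent confidence interval for the true $\omega_j^*$. We now compare the $\omega_j$ from (7) with the $\omega_j^*$ from (17):
$$\omega_j^* = \exp(\theta_j), \qquad \omega_j = \exp\left(\frac{\pi\beta_j}{\sigma\sqrt{6}}\right), \qquad \text{so that} \qquad \theta_j = \frac{\pi\beta_j}{\sigma\sqrt{6}}.$$
This shows that the coefficient of the complementary log-log model, $\theta_j$, can be interpreted in terms of the regression coefficients, $\beta_j$. Note that $\beta$ is related to the responses through the general linear regression model $y_i = x_i'\beta + E_i$, where the independent $E_i$ follow an extreme value distribution with mean 0 and variance $\sigma^2 > 0$.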
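The link between the two sets of coefficients can be made concrete with a small sketch. It assumes that the dichotomized outcome is coded 1 when the continuous outcome exceeds C (the coding is an assumption here) and that the errors follow a largest extreme value distribution with mean 0 and variance σ². Under those assumptions the slope on the complementary log-log scale is πβ_1/(σ√6), and only the intercept depends on C. All names and values are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def scale_and_location(sigma):
    """Scale b and location a of a mean-zero largest extreme value law with variance sigma**2."""
    b = sigma * math.sqrt(6.0) / math.pi
    return b, -EULER_GAMMA * b

def cloglog_coefficients(beta0, beta1, sigma, cutoff):
    """(theta0, theta1) implied for Y* = 1{Y > cutoff} by the linear model parameters."""
    b, a = scale_and_location(sigma)
    theta0 = (beta0 + a - cutoff) / b         # the cutoff only shifts the intercept
    theta1 = beta1 / b                        # equals pi * beta1 / (sigma * sqrt(6))
    return theta0, theta1

def prob_exceeds(cutoff, x, beta0, beta1, sigma):
    """P(Y > cutoff | x) computed directly from the extreme value CDF."""
    b, a = scale_and_location(sigma)
    return 1.0 - math.exp(-math.exp(-(cutoff - beta0 - beta1 * x - a) / b))

theta0, theta1 = cloglog_coefficients(beta0=1.0, beta1=0.4, sigma=2.0, cutoff=3.0)
for x in (-2.0, 0.0, 1.5):
    direct = prob_exceeds(3.0, x, 1.0, 0.4, 2.0)
    via_cloglog = 1.0 - math.exp(-math.exp(theta0 + theta1 * x))
    print(f"x={x:4.1f}  direct={direct:.6f}  cloglog={via_cloglog:.6f}")
```

Changing the cutoff in `cloglog_coefficients` moves `theta0` but leaves `theta1` unchanged, which mirrors the comparison of ω_j with ω*_j.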

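Derivation of $\operatorname{var}(\hat\omega_j^*)$ for Large $n$. The information matrix of generalized linear models has the form $\mathcal{I} = X'WX$ [1], where $W$ is the diagonal matrix with diagonal elements $w_i = (\partial\mu_i/\partial\eta_i)^2/\operatorname{var}(y_i)$, $y$ is the response variable with independent observations $(y_1, \ldots, y_n)$, and $x_{ij}$ denotes the value of predictor $j$ for observation $i$. The covariance matrix of $\hat\theta$ is estimated by $(X'\hat{W}X)^{-1}$. Maximum likelihood estimation for the complementary log-log model is a special case of the generalized linear models. Let $\eta_i = x_i'\theta$ and $\mu_i = \pi_i = 1 - \exp(-\exp(\eta_i))$; then
$$w_i = \frac{(1 - \pi_i)[\ln(1 - \pi_i)]^2}{\pi_i}.$$
It is readily shown that the results hold true for the largest extreme value distribution (Appendix A). In large samples, $\operatorname{var}(\hat\theta_j)$ approaches $\sigma^2_{\theta_j}|_{\theta = \hat\theta}$ [14], which equals the $(j + 1)$th diagonal element of $(X'WX)^{-1}$.
By applying the delta method with $f(\hat\theta_j) = \exp(\hat\theta_j)$, we obtain
$$\operatorname{var}(\hat\omega_j^*) \approx [f'(\theta_j)]^2\,\sigma^2_{\theta_j} = \exp(2\theta_j)\,\sigma^2_{\theta_j}.$$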
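For an intercept and a single predictor, the large-sample variance can be written out by hand. The sketch below uses the general weight formula w_i = (∂μ_i/∂η_i)²/var(y_i) with the complementary log-log mean function, inverts the 2 × 2 matrix X'WX explicitly, and applies the delta method for exp(θ̂_1). The function names are illustrative rather than taken from the paper.

```python
import math

def cloglog_weight(eta):
    """GLM weight (dmu/deta)^2 / var(y) for a Bernoulli response with cloglog link."""
    p = 1.0 - math.exp(-math.exp(eta))        # mean mu = P(success)
    dmu_deta = math.exp(eta) * (1.0 - p)      # derivative of 1 - exp(-exp(eta))
    return dmu_deta ** 2 / (p * (1.0 - p))

def slope_variance(xs, theta0, theta1):
    """Slope entry of (X'WX)^-1 for the design columns (1, x)."""
    w = [cloglog_weight(theta0 + theta1 * x) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * x for wi, x in zip(w, xs))
    s2 = sum(wi * x * x for wi, x in zip(w, xs))
    return s0 / (s0 * s2 - s1 ** 2)           # explicit 2x2 inverse

def omega_star_variance(xs, theta0, theta1):
    """Delta method: var(exp(theta1_hat)) is about exp(2*theta1) * var(theta1_hat)."""
    return math.exp(2.0 * theta1) * slope_variance(xs, theta0, theta1)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0] * 20         # made-up design with 100 observations
print(slope_variance(xs, -0.5, math.log(0.75)))
print(omega_star_variance(xs, -0.5, math.log(0.75)))
```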

The Power for the Dichotomized Method.
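In large samples, $\hat\sigma_{\theta_j}$ converges to $\sigma_{\theta_j}$ almost surely [14]. Therefore, for a given value of $\omega_j = \exp\theta_j$ (i.e., $\ln\omega_j = \theta_j$), the power can be computed from the large-sample normal distribution of $\hat\theta_j$.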

The Power for the Proposed Method.
In large samples, $\hat\sigma$ converges to $\sigma$ almost surely [15]. Therefore, for a given value of $\omega_j = \exp(\pi\beta_j/(\sigma\sqrt{6}))$ (i.e., $\beta_j = \sigma\sqrt{6}\ln\omega_j/\pi$), the power can be computed from the large-sample distribution of $\hat\beta_j/(\hat\sigma\delta_j)$. Our proposed method, since it is based on continuous rather than dichotomized data, is likely to be more powerful.
We show that the proposed method can produce substantial sample size savings for a given power. Let
(i) the number of parameters be $p = 2$ (i.e., $\theta = (\theta_0, \theta_1)'$),
(ii) $x_{i1}$ follow a discrete uniform distribution on $g$ points with range $(-a, a)$; for simplicity, $a = 2$,
(iii) the total samples be $n$ and $n^*$ for the proposed and dichotomized methods, respectively. These samples include $k$ and $k^*$ sets of these $g$ uniformly distributed points for the proposed and dichotomized methods, respectively. That is, $n = gk$ and $n^* = gk^*$.
The variances of the two estimators then follow from this design and from (23). Requiring the same power for the two methods gives the relative sample size $n^*/n$ in (34). That is, (34) is independent of $\sigma^2$ and applies for any power and any test size $\alpha$. Table 1 presents relative sample sizes $n^*/n$ for a given fixed parameter $\omega_j^*$ and an average proportion of successes $\bar\pi$. We consider the situations in which $\bar\pi = \sum_{i=1}^{g}(\pi_i/g) = 0.1, 0.2, 0.3, 0.4, 0.5$, $g = 9$, and $\omega_j^* = 0.25, 0.50, 0.75$. For given fixed $\omega_j^*$ and $\bar\pi$, the relative sample sizes in Table 1 can be computed by the following steps: (i) compute the value $\theta_j$ via the equation $\theta_j = \ln(\omega_j^*)$, (ii) calculate the cut-off point $C$ iteratively such that $\bar\pi$ attains the specified value for the values $x_{i1}$, using the value of $\theta_j$ from (i).
As can be seen from Table 1, all values are greater than 1. The values of $n^*/n$ increase as $\omega_j^*$ moves farther away from 1. The values in Table 1 immediately highlight the improvement accomplished by the proposed method.
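The two computational steps can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the paper's equation (34). The g design points are taken as equally spaced on [−a, a]. The iterative search for C is replaced by an equivalent bisection on the complementary log-log intercept, since C only enters through that intercept. σ is treated as known, so the slope on the θ scale from least squares has variance π²/(6 Σx²) per set of points. Equal power is read as equal variance of the two slope estimates. All function names are hypothetical.

```python
import math

def success_prob(eta):
    """P(success) under the complementary log-log model, 1 - exp(-exp(eta))."""
    return -math.expm1(-math.exp(eta))

def cloglog_weight(eta):
    """Fisher weight (1 - p) * [ln(1 - p)]^2 / p, using ln(1 - p) = -exp(eta)."""
    p = success_prob(eta)
    return (1.0 - p) * math.exp(2.0 * eta) / p

def design_points(g=9, a=2.0):
    """g equally spaced points on [-a, a] (the equal spacing is an assumption)."""
    return [-a + 2.0 * a * i / (g - 1) for i in range(g)]

def solve_intercept(xs, theta1, target_mean):
    """Bisection for the intercept that gives the requested average success probability."""
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean_p = sum(success_prob(mid + theta1 * x) for x in xs) / len(xs)
        if mean_p < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_sample_size(omega_star, target_mean, g=9, a=2.0):
    """Approximate n*/n that equates the large-sample variances of the two slope estimates."""
    xs = design_points(g, a)
    theta1 = math.log(omega_star)                           # step (i)
    theta0 = solve_intercept(xs, theta1, target_mean)       # step (ii), via the intercept
    w = [cloglog_weight(theta0 + theta1 * x) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * x for wi, x in zip(w, xs))
    s2 = sum(wi * x * x for wi, x in zip(w, xs))
    var_dichotomized = s0 / (s0 * s2 - s1 ** 2)             # one set of g points
    sxx = sum(x * x for x in xs)                            # design is centred at 0
    var_proposed = math.pi ** 2 / (6.0 * sxx)               # one set of g points
    return var_dichotomized / var_proposed

for omega_star in (0.25, 0.50, 0.75):
    row = [relative_sample_size(omega_star, m) for m in (0.1, 0.2, 0.3, 0.4, 0.5)]
    print(omega_star, [round(r, 2) for r in row])
```

Because both variances scale as one over the number of replicate sets, their ratio is free of n and of σ², in line with the remark about (34).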

Relative Efficiency of $\hat\omega_j$ with $\hat\omega_j^*$
Here, we examine the relative efficiency of the estimate $\hat\omega_j$ to the estimate $\hat\omega_j^*$. Using (24) and (26), the relative efficiency is given by the ratio $\operatorname{var}(\hat\omega_j^*)/\operatorname{var}(\hat\omega_j)$ (35). Note that the relative efficiency is independent of $n$ and $\sigma^2$ and converges to a constant. Comparing (34) and (35), the relative efficiency equals the relative sample size. Therefore, as in Table 1, the proposed method is a consistent improvement over the dichotomizing method with respect to relative efficiency.
It should be noted that these results hold true under the following assumptions: (1) the responses $y_i$ and $\beta$ are related through the equation $y_i = x_i'\beta + E_i$, where the independent $E_i$ follow an extreme value distribution with mean 0 and variance $\sigma^2 > 0$, and (2) the independent variables $x_i$ follow a discrete uniform distribution.

Odds Ratio
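For values of $\pi$ larger than 0.90, $-\ln(\pi)$ and $(1 - \pi)/\pi$ are very close. Hence, for large values of $\pi$,
$$\frac{\ln(\pi_1)}{\ln(\pi_2)} \approx \frac{(1 - \pi_1)/\pi_1}{(1 - \pi_2)/\pi_2},$$
which is a ratio of odds. From (7), this odds ratio is therefore approximately $\omega_j = \exp(\pi\beta_j/(\sigma\sqrt{6}))$, so the parameters estimated from the linear regression can be interpreted as an odds ratio.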
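A quick numerical check of the approximation for π near 1, writing the odds of the complementary event as (1 − π)/π. The probabilities used are arbitrary illustrative values.

```python
import math

def neg_log(p):
    """-ln(p)."""
    return -math.log(p)

def complement_odds(p):
    """Odds of the complementary event, (1 - p) / p."""
    return (1.0 - p) / p

for p in (0.90, 0.95, 0.99):
    print(f"p={p:.2f}  -ln(p)={neg_log(p):.5f}  (1-p)/p={complement_odds(p):.5f}")

# Ratio of log-probabilities versus the corresponding ratio of odds.
p1, p2 = 0.95, 0.97
log_ratio = math.log(p1) / math.log(p2)
odds_ratio = complement_odds(p1) / complement_odds(p2)
print(f"ln(p1)/ln(p2)={log_ratio:.4f}  odds ratio={odds_ratio:.4f}")
```

The agreement tightens as π approaches 1, which is why the odds ratio reading is restricted to large π.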

Simulation Study
It should be noted that, as in Table 1, the proposed method is a consistent improvement over the dichotomizing method with respect to relative efficiencies. These results hold true under the assumptions that the predictor variable has a discrete uniform distribution and that the random variables $E_i$ follow an extreme value distribution. To examine the robustness of this conclusion to changes in the distribution of the predictor variable, simulations were run under different distributional conditions. The data were sampled 10000 times for three sample sizes ($n = 250, 500, 1000$), three average proportions of successes ($\bar\pi = 0.10, 0.50, 0.95$), and seven values of $\omega_j$ ($\omega_j = 0.75, 0.90, 1.1, 1.2, 1.3, 1.4, 1.5$). The simulated data were generated as follows: the slope was set to $\beta_1 = \sigma\sqrt{6}\ln\omega_j/\pi$ through (7) to produce the correct $\omega_j$, and for simplicity $\beta_0 = 0$ and $\sigma^2 = 1$.
We simulated the data for two scenarios based on the distribution of the explanatory variable. In the first scenario, the independent variable follows a continuous uniform distribution on the range (−2, 2), and in the second, the independent variable follows a truncated normal distribution with mean 0 and range (−2, 2). The relative mean square errors, relative interval lengths, absolute biases, and coverage probabilities were calculated.
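The first scenario can be sketched for the proposed estimator alone. The code below is a scaled-down illustration and not the study's program: it uses far fewer replications, covers only the continuous uniform design, omits the dichotomized fit, and reports just the mean and mean square error of the estimate of ω. The slope is set from ω through β_1 = σ√6 ln ω/π with β_0 = 0 and σ = 1; the seed and function names are arbitrary.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def draw_error(rng, sigma):
    """One largest extreme value error with mean 0 and standard deviation sigma."""
    b = sigma * math.sqrt(6.0) / math.pi
    a = -EULER_GAMMA * b
    t = rng.expovariate(1.0)
    while t == 0.0:                            # guard against log(0)
        t = rng.expovariate(1.0)
    return a - b * math.log(t)

def estimate_omega(x, y):
    """Least squares slope and residual sd, mapped to exp(pi*b1/(sigma_hat*sqrt(6)))."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))
    return math.exp(math.pi * b1 / (sigma_hat * math.sqrt(6.0)))

def simulate(omega_true, n, reps, seed=1, sigma=1.0, beta0=0.0):
    """Monte Carlo mean and mean square error of the proposed estimate of omega."""
    rng = random.Random(seed)
    beta1 = sigma * math.sqrt(6.0) * math.log(omega_true) / math.pi
    estimates = []
    for _ in range(reps):
        x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
        y = [beta0 + beta1 * xi + draw_error(rng, sigma) for xi in x]
        estimates.append(estimate_omega(x, y))
    mean_est = sum(estimates) / reps
    mse = sum((e - omega_true) ** 2 for e in estimates) / reps
    return mean_est, mse

mean_est, mse = simulate(omega_true=1.3, n=500, reps=200)
print(f"mean of omega_hat = {mean_est:.3f}, MSE = {mse:.5f}")
```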
Results of the simulations addressing the validity of the proposed method are displayed in Tables 2 and 3.
The simulations show that the relative mean square errors are all greater than 1, increasing with the average proportion of successes and when the ω j moves farther away

An Example
To illustrate the application of the proposed method presented in the previous section, we use data arising from the National Health Survey in Iran. Other analyses using these data have appeared elsewhere [16]. In this study, 14176 women aged 20-69 years were investigated. BMI (body mass index), our dependent variable, was calculated as weight in kilograms divided by height in meters squared (kg/m²). Independent variables included place of residence, age, smoking, economic index, marital status, and education level. The independent variables considered were both categorical and continuous. At first, BMI was treated as a continuous variable, and $\hat\omega_j$ and 95 percent confidence intervals were calculated using the proposed linear regression method. Then subjects were classified as obese (BMI ≥ 30 kg/m²) or nonobese (BMI < 30 kg/m²). A complementary log-log model was used for the binary analysis, with obese or nonobese as the outcome measure. The $\hat\omega_j^*$ and 95 percent confidence intervals were calculated using the dichotomized method. Table 4 presents the coefficient estimates, estimated confidence intervals, and relative confidence interval lengths. The proposed and dichotomizing methods produced different confidence intervals, although the estimates $\hat\omega_j$ and $\hat\omega_j^*$ were similar, varying only slightly. The $\hat\omega_j$ estimates from the proposed method had smaller variances and shorter confidence intervals than those from the dichotomizing method. All relative confidence interval lengths were greater than 2.58.

Discussion
When the errors $E_i$ are assumed to follow an extreme value distribution, the method has several advantages, as noted before. First, the method allows the researcher to apply the complementary log-log model without dichotomizing and without loss of information. Second, the $\omega_j^*$ from the dichotomizing method depends on the chosen cutoff point $C$ and will vary with $C$. However, the proposed $\omega_j$ is independent of $C$, since $\omega_j$ is a function of the continuous $Y_i$ and not a function of the dichotomized $Y_i^*$ defined through $C$. Third, we show that the coefficient of the complementary log-log model, $\theta_j$, can be interpreted in terms of the regression coefficients, $\beta_j$. Fourth, when the independent variables $x_i$ follow a discrete uniform distribution, the proposed method is a consistent improvement over the dichotomizing method with respect to relative efficiencies. The proposed method can provide sample size savings, smaller variances, and shorter confidence intervals than the dichotomized method. Fifth, when $\pi$ is large, the parameters estimated from the linear regression can be interpreted as odds ratios.
Our results are consistent with the findings of Moser and Coombs [12] and Bakhshi et al. [16], which showed the greater efficiency of parameter estimates from a regression method that avoids dichotomizing in comparison with the more traditional dichotomizing method using logistic regression.
Our main recommendation is to let continuous responses remain continuous. Do not throw away information by transforming the data to binary. This means that if the objective is to estimate and/or test coefficients when responses
Table 3: Simulated relative mean square errors, relative interval lengths, coverage probabilities, and absolute biases for the proposed and dichotomizing methods (using a truncated normal distribution for the explanatory variable and an extreme value distribution for the errors).