Reprints available directly from the Editor. Printed in New Zealand.

ROBUST ESTIMATION IN CAPITAL ASSET PRICING MODEL

Bian and Dickey (1996) developed a robust Bayesian estimator for the vector of regression coefficients using a Cauchy-type g-prior. This estimator is an adaptive weighted average of the least squares estimator and the prior location, and is highly robust with respect to flat-tailed sample distributions. In this paper, we introduce the robust Bayesian estimator to the estimation of the Capital Asset Pricing Model (CAPM), in which the distribution of the error component is well known to be flat-tailed. To support our proposal, we apply both the robust Bayesian estimator and the least squares estimator in the simulation of the CAPM and in the analysis of the CAPM for US annual and monthly stock returns. Our simulation results show that the Bayesian estimator is robust and superior to the least squares estimator when the CAPM is contaminated by large normal and/or non-normal disturbances, especially by Cauchy disturbances. In our empirical study, we find that the robust Bayesian estimate is uniformly more efficient than the least squares estimate in terms of the relative efficiency of one-step ahead forecast mean square error, especially for small samples.


Introduction
Both financial economists and statisticians have been concerned with the distributions of stock market returns. Fama (1963, 1965a, 1965b) and many others analyzed the empirical data and concluded that the normality assumption in the distribution of a security or portfolio return is violated such that the distribution is 'flat-tailed'. They suggested the family of stable Paretian distributions between normal and Cauchy distributions for the stock returns.
On the other hand, Blattberg and Gonedes (1974) examined security returns and suggested the Student-t as an alternative 'flat-tailed' distribution for the return. Clark (1973), Christie (1983), Kon (1984) and Tse (1991) suggested a mixture of normal distributions for the stock return, while Fielitz and Rozelle (1983) suggested that a mixture of non-normal stable distributions would be a better representation of the distribution of the return.
The distributional structure of the return may carry over into the structure of the disturbance in the Capital Asset Pricing Model (CAPM). In this situation, the distribution of the disturbance is 'flat-tailed' and the mixture of normal distributions or mixture of normal and Cauchy distributions may give a better description of the distribution of the disturbance in the CAPM. Harvey and Zhou (1993) supported this idea and pointed out that the non-normality in the return may carry over into non-normality of the disturbance in the CAPM. They examined the residuals of the world market portfolios in the CAPM and found that the distributions of the disturbances departed from normality in many cases. They then tested the sensitivity of the benchmark in the CAPM by specifying error structures that follow t-distributions or mixtures of normal distributions. Bian and Dickey (1996) developed the robust Bayesian estimator for the vector of regression coefficients using a Cauchy-type g-prior. They showed that this robust Bayesian estimator is adaptive and markedly robust with respect to a flat-tailed sample distribution as compared to both the least squares estimator (LSE) and the usual Bayesian estimator.
Based on the 'flat-tail' characteristic on the distributions of the security or portfolio returns and their corresponding disturbances in the CAPM, we recommend the robust Bayesian estimator for the estimation of the parameters of the CAPM for the stock returns. The findings by Bian and Dickey (1996) lead us to hypothesize that the robust Bayesian estimator is more appropriate in the estimation of the CAPM in the sense that it is more efficient than the LSE.
To illustrate the superiority of the proposed Bayesian estimator, we simulate the LSE and the proposed estimator for a CAPM model. Based on the simulation results, we find that the proposed Bayesian estimator is superior to LSE when the CAPM model is contaminated by large normal and/or non-normal disturbances, especially by Cauchy disturbances.
To test our hypothesis, we also apply the one-step ahead forecasting technique to compare the robust Bayesian estimator with the traditional least squares estimator, LSE, in the estimation of the parameters in the CAPM for the US annual and monthly stock returns. The one-step ahead forecasting technique is commonly used to compare the performance of different models, see Clements and Hendry (1997). In our empirical study, we find that the robust Bayesian estimate is uniformly more efficient than the LSE in terms of relative efficiency of one-step ahead forecast mean square error, especially for small samples. Hence we recommend the robust Bayesian estimator for the estimation of the CAPM.
Many applications in finance involve prior beliefs about the behavior of the data. However, almost all empirical analyses have been carried out in the classical framework, and relatively few studies have applied the Bayesian approach in finance. Among them are Shanken (1987), Gibbons, Ross, and Shanken (1989), McCulloch and Rossi (1991) and Harvey and Zhou (1990). Two practical difficulties in implementing the approach have slowed the adoption of Bayesian econometrics. The first is how to choose a prior and how to specify the prior parameters. The second lies in evaluating the posterior distribution.
To overcome these difficulties, Harvey and Zhou (1990) imposed a prior on all the parameters of the multivariate regression model and used Monte Carlo numerical integration to accurately evaluate 90-dimensional integrals to estimate the parameters in the posterior distribution. They developed a Bayesian framework to test the mean-variance efficiency of a given portfolio. The test is more direct than Shanken's (1987).
In recent studies, MacKinlay and Richardson (1991) developed the tests of unconditional mean-variance efficiency under weak distributional assumptions using a Generalized Method of Moments framework and concluded that the efficiency indexes can be sensitive to the test considered. Kandel, McCulloch and Stambaugh (1995) used a Bayesian approach to investigate a sample's information about a portfolio's degree of inefficiency and found that the NYSE-AMEX market portfolio is rather inefficient in the presence of a riskless asset.
There are two main issues concerning the CAPM. One is testing the efficiency hypothesis; the other is the estimation of the CAPM (see Chapter 5 in Campbell, Lo and MacKinlay, 1997). In our paper, we address the latter issue by proposing an efficient method to overcome the difficulties in both obtaining the prior information and evaluating the posterior distribution. The proposed prior is an independent Cauchy and improper g-prior, which is a robust prior. As such, the resulting estimator is adaptive and robust. In practice, we may use the information from the previous corresponding sample to specify the values of the prior parameters. This approach makes the computation of the Bayesian estimate as easy as that of the LSE, and it removes the need to compute integrals of any dimension for the estimation.
In Section 2, we review the least squares estimator (LSE), the usual Bayesian estimator and the robust Bayesian estimators. Section 3 reviews the theory of the standard CAPM, the non-stationarity of the Beta parameter and the 'flat-tailed' distribution of the security return, and discusses applying the robust Bayesian estimator to the estimation of the CAPM. Section 4 reports the results of the simulation for the Bayesian estimator and the least squares estimator when the CAPM is contaminated by normal and/or non-normal disturbances. In Section 5, we apply both the robust Bayesian estimator and the LSE to the estimation of the CAPM for the US annual and monthly stock returns and compare their efficiency. The conclusion is in the last section.

A review of the least squares estimator and the Bayesian estimators of regression coefficients
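The model considered is the normal linear multiple regression model (NLR) in the standard form
$$y = X\beta + e, \tag{1}$$
where $y$ is an $n \times 1$ vector of observations on the dependent variable, $X$ is an $n \times p$ design matrix with rank $p$, $\beta$ is a $p \times 1$ vector of regression parameters with unknown value, and $e$ is the $n \times 1$ vector of disturbances. It is assumed that the elements of $e$ are independently drawn from a normal distribution with mean 0 and finite variance $\sigma^2$. The likelihood function for the NLR is
$$L(\beta, \sigma^{-2} \mid y) \propto (\sigma^{-2})^{n/2} \exp\left\{-\frac{\sigma^{-2}}{2}\,(y - X\beta)'(y - X\beta)\right\}.$$
In this model, the traditional estimator is the LSE, which is also the maximum likelihood estimator for $\beta$. It is given by
$$\hat\beta = (X'X)^{-1}X'y. \tag{2}$$
So far in the literature of Bayesian estimation of regression coefficients, only the conjugate prior and the non-informative prior have been employed extensively in statistical estimation. The conjugate prior for the regression model (1) is a normal-reciprocal gamma distribution given by
$$\beta \mid \sigma^2 \sim N\!\left(\beta_0,\ \sigma^2 A^{-1}\right)$$
and
$$p(\sigma^{-2}) \propto (\sigma^{-2})^{\nu_0/2 - 1} \exp\left\{-\frac{\nu_0 s_0^2}{2}\,\sigma^{-2}\right\}. \tag{3}$$
The usual non-informative prior is
$$p(\beta, \sigma) \propto \frac{1}{\sigma}.$$
The posterior density of $\beta$ and $\sigma^{-2}$ associated with the conjugate prior (3) is
$$p(\beta, \sigma^{-2} \mid y) \propto (\sigma^{-2})^{(n + p + \nu_0)/2 - 1} \exp\left\{-\frac{\sigma^{-2}}{2}\left[\nu_0 s_0^2 + (\beta - \beta_0)'A(\beta - \beta_0) + (y - X\beta)'(y - X\beta)\right]\right\}.$$
The Bayesian estimator of $\beta$, under quadratic loss, is the posterior mean of $\beta$. It is given by
$$\beta^{*} = (A + X'X)^{-1}\left(A\beta_0 + X'X\hat\beta\right), \tag{4}$$
where $\hat\beta$ is the LSE of $\beta$ specified by (2). Zellner (1986) modified the above approach by considering the normal g-prior
$$\beta \mid \sigma^2 \sim N\!\left(\beta_0,\ \sigma^2 (gX'X)^{-1}\right),$$
with $\sigma^{-2}$ distributed as in (3). This is a special case of the conjugate prior with the covariance matrix $A^{-1}$ proportional to $(X'X)^{-1}$, the covariance matrix of the LSE. From (4) with $A = gX'X$, the Bayesian estimator $\hat\beta_N$ of $\beta$ becomes
$$\hat\beta_N = \frac{1}{1+g}\,\hat\beta + \frac{g}{1+g}\,\beta_0. \tag{5}$$
The prior that has been extensively employed is the conjugate prior. Its most important advantage is the mathematical simplicity of evaluating the inference analytically. Unfortunately, the resulting estimator is not robust.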
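As a rough illustration of the two classical estimators reviewed above, the following Python sketch computes the LSE and the normal g-prior estimate on simulated data. It assumes the g-prior is scaled so that the prior precision is proportional to $gX'X$, which makes the posterior mean a fixed weighted average of the LSE and the prior location with weights $1/(1+g)$ and $g/(1+g)$. The function names `lse` and `g_prior_estimate` and the simulated data are hypothetical and serve only this illustration.

```python
import numpy as np

def lse(X, y):
    """Least squares estimate (X'X)^{-1} X'y, computed with lstsq for stability."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat

def g_prior_estimate(X, y, beta0, g):
    """Posterior mean under a normal g-prior with prior precision g * X'X (assumed scaling)."""
    beta_hat = lse(X, y)
    return (beta_hat + g * np.asarray(beta0, dtype=float)) / (1.0 + g)

# Hypothetical data: intercept plus one regressor, normal errors.
rng = np.random.default_rng(0)
x = rng.normal(size=30)
X = np.column_stack([np.ones_like(x), x])
y = 0.5 + 1.2 * x + rng.normal(size=30)

beta0 = np.array([0.0, 1.0])                    # prior location (assumed)
b_ls = lse(X, y)
b_n = g_prior_estimate(X, y, beta0, g=1.0)      # g = 1 weights LSE and prior equally
print("LSE:", b_ls, "g-prior estimate:", b_n)
```

Note that the weight on the LSE here depends only on $g$ and not on the data, so a sample with wild observations receives the same weight as a clean one.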
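When robustness is of concern, an attractive way to develop robust Bayesian inference is to use robust priors, which possess flat but not too flat tails, to form Bayesian estimators; see Bian (1995), Bian and Tiku (1997), Dickey (1974), Ramsay and Novick (1980), Berger (1980, 1984) and Press (1989). However, it is difficult to evaluate the inference analytically because of the complicated forms of the resulting posterior densities. Bian and Dickey (1996) overcame this difficulty by introducing a prior in which the prior knowledge regarding $\beta$ and $\sigma^2$ is assumed to be independently distributed as a Cauchy g-prior and a reciprocal gamma distribution such that
$$p(\beta) \propto \left[1 + g\,(\beta - \beta_0)'X'X(\beta - \beta_0)\right]^{-(p+1)/2}, \qquad p(\sigma^{-2}) \propto (\sigma^{-2})^{\nu_0/2 - 1} \exp\left\{-\frac{\nu_0 s_0^2}{2}\,\sigma^{-2}\right\}. \tag{6}$$
The marginal prior density of $\beta$ has the same multivariate Cauchy form as the marginal density of $\beta$ under the conjugate prior specified in (3) when $\nu_0 = 1$. Combining this prior with the likelihood function, the posterior density of $\beta$ and $\sigma^{-2}$ is
$$p(\beta, \sigma^{-2} \mid y) \propto \left[1 + g\,(\beta - \beta_0)'X'X(\beta - \beta_0)\right]^{-(p+1)/2} (\sigma^{-2})^{(n + \nu_0)/2 - 1} \exp\left\{-\frac{\sigma^{-2}}{2}\left[\nu_0 s_0^2 + (y - X\beta)'(y - X\beta)\right]\right\}.$$
When $\nu_0 = p + 1 - n$ and $s_0 = 0$, the marginal posterior density of $\beta$ is a poly-Cauchy density
$$p(\beta \mid y) \propto \left[1 + g\,(\beta - \beta_0)'X'X(\beta - \beta_0)\right]^{-(p+1)/2} \left[\|y - X\hat\beta\|^2 + (\beta - \hat\beta)'X'X(\beta - \hat\beta)\right]^{-(p+1)/2}.$$
The robust Bayesian estimator $\hat\beta_C$ of $\beta$, under quadratic loss, is the posterior mean
$$\hat\beta_C = w\,\hat\beta + (1 - w)\,\beta_0, \tag{7}$$
with weight
$$w = \frac{1}{1 + \sqrt{g}\,\|y - X\hat\beta\|}.$$
The new estimator $\hat\beta_C$ is a non-linear function of $\hat\beta$. The weight $w$ in (7) is a decreasing function of the prior parameter $g$ and the residual $\|y - X\hat\beta\|$. When $g$ goes to zero, the prior of $\beta$ diffuses to the non-informative prior and the weight $w$ increases to 1. In this situation, $\hat\beta_C$ approaches the LSE $\hat\beta$, which is the Bayes estimator arising from the usual non-informative prior.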
The main attraction of $\hat\beta_C$ is that its weight $w$ depends reasonably on the residual $\|y - X\hat\beta\|$. When wild or extreme observations occur, the value of the residual $\|y - X\hat\beta\|$ rises. Hence, there is higher uncertainty about the LSE $\hat\beta$ and consequently the weight $w$ becomes smaller. In this sense, the estimator $\hat\beta_C$ is an adaptive weighted average and tends to be considerably more robust.
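The adaptive behaviour of the weight can be sketched numerically. The Python snippet below assumes that the weight in (7) takes the form $w = 1/(1 + \sqrt{g}\,\|y - X\hat\beta\|)$, which is the value obtained for the mean of a product of two Cauchy-type kernels when the prior scale matrix is $(gX'X)^{-1}$; under a different scaling of the prior the constant would change. The function name `robust_bayes_estimate` and the simulated data are hypothetical.

```python
import numpy as np

def robust_bayes_estimate(X, y, beta0, g):
    """Adaptive weighted average of the LSE and the prior centre.

    Assumed weight: w = 1 / (1 + sqrt(g) * ||y - X beta_hat||).
    Returns the estimate and the weight placed on the LSE.
    """
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_norm = np.linalg.norm(y - X @ beta_hat)
    w = 1.0 / (1.0 + np.sqrt(g) * resid_norm)
    return w * beta_hat + (1.0 - w) * np.asarray(beta0, dtype=float), w

# Hypothetical clean sample and the same sample with one wild observation.
rng = np.random.default_rng(1)
x = rng.normal(size=20)
X = np.column_stack([np.ones_like(x), x])
y_clean = x + 0.1 * rng.normal(size=20)
y_wild = y_clean.copy()
y_wild[0] += 50.0                                # a single extreme observation

beta0 = np.array([0.0, 1.0])
b_clean, w_clean = robust_bayes_estimate(X, y_clean, beta0, g=0.01)
b_wild, w_wild = robust_bayes_estimate(X, y_wild, beta0, g=0.01)
print("weight on LSE, clean sample:", w_clean)
print("weight on LSE, wild sample: ", w_wild)
```

The wild observation inflates the residual norm, so the estimate leans more heavily on the prior centre, whereas setting `g=0` returns the LSE itself.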
To compare the robust Bayesian estimator $\hat\beta_C$ in (7) with the least squares estimator $\hat\beta$ in (2) and the usual Bayesian estimator $\hat\beta_N$ in (5), Bian and Dickey (1996) simulated, for small samples, the simple regression model $y = a + bx + e$ in which the random term $e$ follows either the ε-contaminated normal distribution (8), a mixture that places weight $1 - \varepsilon$ on a normal component and weight $\varepsilon$ on a second, contaminating normal component, or the ε-contaminated Cauchy distribution (9), in which the contaminating component is Cauchy.
In their simulation results, they found that the efficiency of $\hat\beta_C$ relative to both $\hat\beta_N$ and $\hat\beta$ grows rapidly as ε grows in value. The higher the proportion of the observations contaminated by large fluctuations, the more efficient is $\hat\beta_C$ relative to both $\hat\beta_N$ and $\hat\beta$ in the simulation with the random terms in (8) and (9).
When the error terms are distributed as the ε-contaminated Cauchy distribution, the means and variances of $\hat\beta$ and $\hat\beta_N$ do not exist theoretically, while the means and variances of $\hat\beta_C$ do exist and $\hat\beta_C$ is unbiased if the prior center $\beta_0$ hits the true value of $\beta$ perfectly. In the simulation of this situation, they found that the efficiency of $\hat\beta_C$ relative to both $\hat\beta_N$ and $\hat\beta$ is extremely large even if the value of ε is as small as 0.01. This shows that $\hat\beta_C$ is considerably robust relative to both $\hat\beta_N$ and $\hat\beta$.
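For readers who wish to reproduce this kind of experiment, ε-contaminated errors can be drawn as in the minimal Python sketch below. Each draw comes from the contaminating component with probability ε and from a standard normal otherwise. The function name `contaminated_errors`, the standard normal base component and the default contaminating scale of 3 are assumptions made for illustration, not values taken from Bian and Dickey (1996).

```python
import numpy as np

def contaminated_errors(n, eps, rng, kind="cauchy", scale=3.0):
    """Draw n errors from (1 - eps) * N(0, 1) + eps * contaminating component.

    kind="cauchy": contaminating component is standard Cauchy.
    kind="normal": contaminating component is N(0, scale**2) (scale is an assumed value).
    """
    base = rng.standard_normal(n)
    if kind == "cauchy":
        wild = rng.standard_cauchy(n)
    elif kind == "normal":
        wild = scale * rng.standard_normal(n)
    else:
        raise ValueError("kind must be 'cauchy' or 'normal'")
    from_wild = rng.random(n) < eps              # True where the contaminating component is used
    return np.where(from_wild, wild, base)

rng = np.random.default_rng(42)
e_cauchy = contaminated_errors(1000, eps=0.10, rng=rng, kind="cauchy")
e_normal = contaminated_errors(1000, eps=0.10, rng=rng, kind="normal")
print("largest |error|, Cauchy contamination:", np.abs(e_cauchy).max())
print("largest |error|, normal contamination:", np.abs(e_normal).max())
```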

The application of the robust Bayesian estimator in CAPM
The Capital Asset Pricing Model is a parsimonious general equilibrium model developed by Sharpe (1963, 1964), Treynor (1961) and Lintner (1965). They suggested that the excess return $R$ on a security is formulated by
$$R = a + bR_m + e, \tag{10}$$
where $R_m$ is the excess return on the market portfolio, and $e$ is the random error. From Equations (1) and (10), we have $\beta = (a, b)'$. In this paper, we do not consider the Black version of the CAPM, which treats the zero-beta portfolio return as an unobserved quantity and so makes the analysis more complicated than that of the Sharpe-Lintner version. Blume (1975), Brenner (1974), Pettit and Westerfield (1974), Leavy (1971), Hamada (1972) and many others found that the measure of security risk is empirically non-stationary over time. To handle the non-stationarity of $\beta$, Bodurtha and Nelson (1991) modelled conditionally heteroskedastic errors with the autoregressive conditional heteroskedastic (ARCH) model for the estimation of the CAPM.
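To make the link between the CAPM and the regression model (1) concrete, the short Python sketch below stacks a column of ones and the market excess return into the design matrix, so that the coefficient vector is $(a, b)'$. The function name `capm_fit` and the toy return series are hypothetical.

```python
import numpy as np

def capm_fit(r_excess, rm_excess):
    """Least squares estimates of (a, b) in R = a + b * R_m + e."""
    rm_excess = np.asarray(rm_excess, dtype=float)
    X = np.column_stack([np.ones_like(rm_excess), rm_excess])   # design matrix of model (1)
    (a, b), *_ = np.linalg.lstsq(X, np.asarray(r_excess, dtype=float), rcond=None)
    return a, b

# Toy, noise-free excess returns generated with a = 0.002 and b = 1.3.
rm = np.array([0.010, -0.020, 0.030, 0.000, 0.015])
r = 0.002 + 1.3 * rm
a_hat, b_hat = capm_fit(r, rm)
print("alpha:", a_hat, "beta:", b_hat)
```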
In order to capture the stationary Beta parameter, one may estimate the model from a reasonably short subperiod. In this situation, the Bayesian approach is a good choice. Vasicek (1973) is one of the earliest papers that discusses the application of Bayesian estimation to the CAPM. However, Vasicek's approach is not robust.
Many papers such as Fama (1963, 1965a, 1965b) analyzed the empirical data and concluded that the distribution of the security or portfolio return is 'flat-tailed' and the normality assumption is violated. They suggested the family of stable Paretian distributions between normal and Cauchy distributions for the stock returns. Blattberg and Gonedes (1974) examined the security returns and suggested the Student-t as an alternative 'flat-tailed' distribution. Clark (1973), Christie (1983), Kon (1984), and Tse (1991) suggested a mixture of normal distributions for the stock return, while Fielitz and Rozelle (1983) suggested that a mixture of non-normal stable distributions would be a better representation of the distribution of security and portfolio returns.
The structure of the distribution for the return may carry over into the structure of the disturbance. As such, the disturbance's distribution is 'flat-tailed' and the mixture of normal distributions or the mixture of normal and Cauchy distributions may give a better description of the distribution of the disturbance in the CAPM. This is supported by Harvey and Zhou (1993), who pointed out that the non-normality in the return may carry over into non-normality of the disturbance in the CAPM. Harvey and Zhou examined the residuals of the world market portfolios in the CAPM and found that in many cases the distributions of the disturbances departed from normality. They then tested the sensitivity of the benchmark in the CAPM by specifying error structures that follow t-distributions or mixtures of normal distributions.
Based on the findings by Bian and Dickey (1996), we hypothesize that the robust Bayesian estimator is more appropriate for estimating the parameters of the CAPM in the sense that it is more efficient than the LSE.
To support our hypothesis, the first consideration pertains to the robustness of the estimator. Mandelbrot (1963) and others show empirically that the distribution of the return is intermediate between normal and Cauchy distributions and therefore the tails of the distribution are flatter than normal but thinner than Cauchy. Since the Cauchy prior distribution is a robust prior with tails very much flatter than the normal distribution, the robust Bayesian estimator $\hat\beta_C$ arising from a Cauchy-type g-prior with a normally distributed sample distribution performs well in yielding an estimator which is robust with respect to wild fluctuations and extreme observations of the stock return in the CAPM. The estimator is highly representative in the situation in which the distribution is between normal and Cauchy.
The next consideration concerns the sampling distribution following the mixture of normal distributions, as found by Brenner (1974), Boness, Chen and Jatusipitak (1974), Kon (1984) and Tse (1990). The simulation results in Bian and Dickey (1996) showed that $\hat\beta_C$ is more efficient than both $\hat\beta_N$ and $\hat\beta$ under the mixture of normal distributions for the error term. This suggests that our approach should provide a better estimation for the CAPM with respect to the issue of the mixture of normal distributions.
The last consideration refers to the mixture of normal and Cauchy distributions. Fielitz and Rozelle (1983) found that the distribution of some security returns fitted the mixture of normal and non-normal stable distributions with different characteristic exponents. Bian and Dickey (1996) have already demonstrated that $\hat\beta_C$ is more efficient in the case of the mixture of normal and Cauchy distributions. This suggests our hypothesis is justified for the issue of the mixture of normal and Cauchy distributions.

Simulation results
From the practical point of view, it is very important to examine the sensitivity of a statistical procedure to deviations from an assumed model. We thus evaluate, in the traditional sense, the performance of $\hat\beta_C$ relative to $\hat\beta$ under departures from the assumed model, based on the CAPM (10) with $\beta' = (a, b) = (0, 1)$ and the $e_i$'s being distributed as ε-contaminated distributions, as displayed in Table 1.
We then compare $\hat\beta_C$ with $g = 0.01$ to $\hat\beta$. The values of the mean error (bias) and MSE of these estimators for different error distributions and different locations of the prior center $\beta_0' = (a_0, b_0)$ are evaluated based on 10,000 runs. The results are tabulated in Table 1. For convenience, we define the bias and MSE as in (11). We note that, usually, the bias is defined as $E(\hat\beta) - \beta$ when the dimension of $\beta$ is equal to one. However, we define the bias as in (11) because in our model the dimension of $\beta$ is greater than 1. Based on the tabulated values, we obtain the following findings:
1. $\hat\beta_C$ has no bias when the prior center hits the true values of $\beta$ perfectly, and has negligible biases when the prior center deviates moderately.
2. $\hat\beta_C$ is uniformly superior to $\hat\beta$ when the model is contaminated by non-normal disturbances or when the prior center hits the true location of $\beta$ perfectly.
3. $\hat\beta_C$ is remarkably superior to $\hat\beta$ when the model is contaminated by Cauchy disturbances. The relative efficiency ranges from 8330.88 to as large as 18433.48 in our simulation.
Cauchy disturbances cause damage to the LSE $\hat\beta$. When Cauchy errors occur in observations, the sampling means and sampling variances of $\hat\beta$ do not exist. Hence the values of the LSE fluctuate violently and therefore the values of the MSE for $\hat\beta$ shown in Table 1 are very large. Thus one may conclude that the LSE is inappropriate when some or all of the errors follow a Cauchy distribution. On the contrary, the values of both bias and MSE for $\hat\beta_C$ are quite small. In addition, the sampling mean and sampling variance of $\hat\beta_C$ do exist. Hence, $\hat\beta_C$ is highly robust relative to $\hat\beta$. At the least, it can be viewed as a promising alternative method for the many CAPM applications in which the error terms follow mixture distributions. We note that in Table 1 the MSE for the Bayesian estimate with $\beta_0 = (0, 1)$ is less than the MSE for the LSE in the $N(0, 1)$ case. This is because the prior center $\beta_0 = (0, 1)$ hits the exact value of the parameters in the model. When $\beta_0 = (-2, 3)$, reasonably far away from the parameters, the MSE for the proposed Bayesian estimator is greater than the MSE for the LSE. When $\beta_0 = (-1, 2)$, the prior center is closer to the true value and hence the MSE is smaller than the MSE for the situation with $\beta_0 = (-2, 3)$.
We also note that in Table 1 the MSE for the LSE under $0.75N(0, 1) + 0.25C(0, 1)$ is less than that under $0.90N(0, 1) + 0.10C(0, 1)$. This is possible because the variance of $C(0, 1)$ does not exist and hence the MSE has huge variability and depends on the samples drawn.
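A small-scale version of this experiment can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the paper: the weight is $w = 1/(1 + \sqrt{g}\,\|y - X\hat\beta\|)$, the market excess returns are a fixed draw of standard normals, the sample size is 20, only 2,000 runs are used, the bias is measured as the Euclidean norm of the mean estimation error, and the MSE as the mean squared Euclidean distance to the true $\beta$. The names `robust_fit` and `simulate` are hypothetical, and the printed numbers are not expected to match Table 1.

```python
import numpy as np

def robust_fit(X, y, beta0, g):
    """Return (LSE, robust estimate) using the assumed weight 1 / (1 + sqrt(g) * ||resid||)."""
    b_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (1.0 + np.sqrt(g) * np.linalg.norm(y - X @ b_ls))
    return b_ls, w * b_ls + (1.0 - w) * beta0

def simulate(n=20, runs=2000, eps=0.10, g=0.01, beta0=(0.0, 1.0), seed=0):
    rng = np.random.default_rng(seed)
    beta = np.array([0.0, 1.0])                      # true (a, b)
    beta0 = np.asarray(beta0, dtype=float)           # prior centre
    rm = rng.standard_normal(n)                      # hypothetical market excess returns, held fixed
    X = np.column_stack([np.ones(n), rm])
    est_ls = np.empty((runs, 2))
    est_c = np.empty((runs, 2))
    for r in range(runs):
        wild = rng.random(n) < eps                   # positions hit by Cauchy contamination
        e = np.where(wild, rng.standard_cauchy(n), rng.standard_normal(n))
        est_ls[r], est_c[r] = robust_fit(X, X @ beta + e, beta0, g)

    def summary(est):
        bias = np.linalg.norm(est.mean(axis=0) - beta)        # assumed vector-bias measure
        mse = np.mean(np.sum((est - beta) ** 2, axis=1))      # assumed vector-MSE measure
        return bias, mse

    return summary(est_ls), summary(est_c)

(bias_ls, mse_ls), (bias_c, mse_c) = simulate()
rel_eff = mse_ls / mse_c
print("MSE of LSE:", mse_ls, "MSE of robust estimate:", mse_c, "relative efficiency:", rel_eff)
```

When the prior centre equals the true parameter, the estimation error of the robust estimate is the LSE error multiplied by a weight below one in every run, so its MSE cannot exceed that of the LSE; shifting `beta0` away from `(0, 1)` shows how this advantage can be lost under purely normal errors.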

Empirical Study
In this section, we demonstrate that the robust Bayesian estimator is a more appropriate estimator of the parameters in the CAPM by examining the US annual and monthly stock returns.
Twelve industrial portfolios of U.S. data are employed in the study. The industry classifications conform to Sharpe (1982), Breeden, Gibbons and Litzenberger (1989) and Gibbons, Ross, and Shanken (1989). The portfolios are value-weighted. The monthly market return is the value-weighted NYSE return. The portfolio returns are available from the Center for Research in Security Prices (CRSP) at the University of Chicago. These monthly returns for the period 1926-1987 are in excess of the 30-day Treasury-bill rate available from Ibbotson Associates. Harvey and Zhou (1990) introduced a Bayesian test and calculated posterior odds ratios for the industry portfolios of these returns to test the mean-variance efficiency. We use the same data set to demonstrate that the robust Bayesian estimator is a more appropriate approach in the CAPM estimation.
Notation for the error distributions in Table 1: $N(0, 1)$ denotes a normal distribution with mean 0 and variance 1, $3T_4$ denotes a scaled Student t distribution with 4 degrees of freedom and a scale of 3, and $C(0, 1)$ denotes a Cauchy distribution with location 0 and scale 1.
We specify the CAPM for the excess return $R_i$ for the $i$th industrial classification portfolio such that
$$R_i = a_i + b_i R_m + e_i,$$
where $R_m$ is the market excess return, and $e_i$ is the error term of the $i$th industrial portfolio. We first apply the normality test based on the measures of skewness and kurtosis to the returns and to the corresponding residuals in the CAPM, to test the hypothesis that the returns $R_i$ are normally distributed and the hypothesis that the disturbances $e_i$ come from a normal distribution. The results are shown in Tables 2 and 3. The results in Table 2 lead us to reject, at the 0.01 level of significance, the hypothesis that the monthly returns $R_i$ as well as their corresponding disturbances come from a normal distribution. This finding supports the hypothesis that the non-normality in the returns will carry over into the non-normality of the disturbances in the CAPM, as mentioned in Harvey and Zhou (1993). However, Table 3 leads us not to reject the normality hypothesis for the annual returns of all portfolios except Construction and Basic Industries at the 0.01 level of significance, but to reject the normality hypothesis for their corresponding disturbances in some cases. This suggests that the normality in the return may not carry over into the normality of the disturbance. Whether the disturbance of the U.S. portfolio return is normally or non-normally distributed, we apply both $\hat\beta_C$ and $\hat\beta$ to study the efficiency of estimation in the CAPM. We note that the return may possess ARCH effects, which may cause the return to depart from normality. However, temporal aggregation will reduce these ARCH effects; for example, see Drost and Nijman (1995). Hence, the annual excess portfolio returns are closer to normal than the monthly excess portfolio returns; see Tables 2 and 3 respectively.
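A test of this kind can be sketched in Python as below. The paper does not state which skewness and kurtosis test was used, so the sketch assumes the Jarque-Bera statistic, which combines the two measures and is compared with the 1% critical value of a chi-square distribution with two degrees of freedom (about 9.21). The function name `skew_kurt_jb` and the simulated series are hypothetical.

```python
import numpy as np

def skew_kurt_jb(x):
    """Sample skewness, kurtosis (3 under normality) and the Jarque-Bera statistic."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    m2 = np.mean(z ** 2)
    skew = np.mean(z ** 3) / m2 ** 1.5
    kurt = np.mean(z ** 4) / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb

CRIT_1PCT = 9.21          # approximate upper 1% point of chi-square with 2 degrees of freedom

rng = np.random.default_rng(2)
normal_like = rng.standard_normal(500)       # stand-in for a near-normal return series
heavy_tailed = rng.standard_cauchy(500)      # stand-in for a flat-tailed series

for name, series in [("normal-like", normal_like), ("heavy-tailed", heavy_tailed)]:
    s, k, jb = skew_kurt_jb(series)
    print(name, "skewness:", s, "kurtosis:", k, "JB:", jb, "reject at 1%:", jb > CRIT_1PCT)
```

The same function can be applied to the residuals of a fitted CAPM regression to check whether non-normality in the returns carries over to the disturbances.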
Since the Bayesian estimation involves subjective judgement, we have to specify the values of the prior parameters. Ideally, the specification of the hyper-parameters should be obtained from experts with thorough knowledge in the market. The expert opinion may come from the detailed information of the fundamentals such as corporate profitability, capital structure and leverage, and from confidential and restricted information such as the latest preliminary corporate accounts and investment plans. However, sometimes there is not enough information for statisticians or financial analysts to specify the values of the prior parameters. In this situation, we may acquire the information of the previous corresponding month to specify the information for the prior.
There are two prior parameters in the robust Bayesian estimator $\hat\beta_C$: $\beta_0$ and $g$. The parameter $\beta_0$ is the prior centre of $\beta$ while $g$ is the prior precision of $\beta$. In this study, we use the estimate of $\beta$ from the previous sample with the same sample size as the value of $\beta_0$ in the updated estimation. This is essentially an empirical Bayes approach (see Maritz and Lwin 1989).
We adopt the one-step ahead forecast MSE (see Clements and Hendry (1997) for more detail) as a basis for comparison between $\hat\beta_C$ and $\hat\beta$ for the U.S. monthly and annual data. In the computation, the sample size $n$ is chosen from 6 to 36 for monthly data and from 5 to 20 for annual data. The value of $g$ is chosen from 0.1 to 20. We note that the first $n$ data ($t = 1, \ldots, n$) are used only to compute the prior information for $\hat\beta_C$ in the first sample ($t = n + 1, \ldots, 2n$). The second $n$ data ($t = 2, \ldots, n + 1$) are used only to compute the prior information for $\hat\beta_C$ in the second sample ($t = n + 2, \ldots, 2n + 1$), and so on.
For each sample size $n$ and for each $g$ value, the estimates of both $\hat\beta_C$ and $\hat\beta$ are first computed for each industrial portfolio for $t = n + 1, \ldots, T - 1$, where $T$ is December 1987 for monthly data and 1987 for annual data. We then compute their one-step ahead forecasts, $\hat R_{it}$, by applying $\hat\beta_C$ and $\hat\beta$ respectively for each portfolio and for $t = 2n + 1, \ldots, T$, and subsequently the one-step ahead forecast MSE,
$$\frac{1}{T - 2n} \sum_{t = 2n + 1}^{T} \left(R_{it} - \hat R_{it}\right)^2,$$
for the $i$th portfolio with respect to $\hat\beta_C$ and $\hat\beta$ for each $n$ and each $g$. The average one-step ahead forecast MSE with respect to $\hat\beta_C$ and the average with respect to $\hat\beta$ are then computed for each $g$ and each $n$. Their relative efficiency,
$$\frac{\text{average one-step ahead forecast MSE of } \hat\beta}{\text{average one-step ahead forecast MSE of } \hat\beta_C},$$
is then computed for each $g$ value and sample size $n$.
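The rolling scheme can be sketched in Python as follows for a single portfolio. The sketch rests on several assumptions: the prior centre for each window is the least squares estimate from the preceding window of the same length, the weight is $w = 1/(1 + \sqrt{g}\,\|y - X\hat\beta\|)$, and the forecast for period $t$ uses the realised market excess return of that period. The function names and the simulated return series are hypothetical and are not the CRSP data used in the paper.

```python
import numpy as np

def fit_ls(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

def fit_robust(X, y, beta0, g):
    b = fit_ls(X, y)
    w = 1.0 / (1.0 + np.sqrt(g) * np.linalg.norm(y - X @ b))   # assumed weight
    return w * b + (1.0 - w) * beta0

def rolling_forecast_mse(r, rm, n, g):
    """One-step ahead forecast MSE of the LSE and the robust estimate, and their ratio."""
    T = len(r)
    X = np.column_stack([np.ones(T), rm])
    err_ls, err_c = [], []
    for t in range(2 * n, T):                                  # 0-based index of the forecast period
        prior = fit_ls(X[t - 2 * n:t - n], r[t - 2 * n:t - n]) # previous window gives the prior centre
        Xw, yw = X[t - n:t], r[t - n:t]                        # current estimation window
        err_ls.append(r[t] - X[t] @ fit_ls(Xw, yw))
        err_c.append(r[t] - X[t] @ fit_robust(Xw, yw, prior, g))
    mse_ls = float(np.mean(np.square(err_ls)))
    mse_c = float(np.mean(np.square(err_c)))
    return mse_ls, mse_c, mse_ls / mse_c

# Hypothetical monthly excess returns with occasional heavy-tailed shocks.
rng = np.random.default_rng(3)
T = 240
rm = 0.005 + 0.05 * rng.standard_normal(T)
shock = np.where(rng.random(T) < 0.05, 0.02 * rng.standard_cauchy(T), 0.02 * rng.standard_normal(T))
r = 0.001 + 1.1 * rm + shock

for g in (0.0, 1.0, 10.0):
    mse_ls, mse_c, rel_eff = rolling_forecast_mse(r, rm, n=12, g=g)
    print("g =", g, "relative efficiency =", rel_eff)
```

With `g = 0` the robust estimate coincides with the LSE, so the ratio equals one; for positive `g` the ratio depends on the data and need not exceed one for every simulated series.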
In our empirical study, we find that $\hat\beta_C$ is uniformly more efficient than $\hat\beta$ in the sense of the relative efficiency of one-step ahead forecast mean square error for any sample size and for any $g$ value. For simplicity, we only present the average relative efficiency for $g$ = 0, 0.1, 0.5, 1, 2, 5, 10, 15 and 20 and sample sizes from 6 to 36 with an increment of 6 for monthly data and from 5 to 20 with an increment of 5 for annual data. The results of the average one-step ahead forecast MSE obtained by applying $\hat\beta_C$ in the CAPM for monthly and annual US returns are in Table 4 and Table 6 respectively. We note that the values in the tables are 1000 times the original values and that the average one-step ahead forecast MSE with respect to $\hat\beta_C$ is equal to that of $\hat\beta$ when $g = 0$. The results of the efficiency of $\hat\beta_C$ relative to $\hat\beta$ for monthly and annual US stock returns are in Table 5 and Table 7 respectively. From the results in these tables, we find that the estimate of $\hat\beta_C$ is more efficient than that of $\hat\beta$ for any $g$ value and for any sample size $n$ in our study, especially for small sample sizes. We note from Tables 2 and 3 that the annual returns can be assumed to be normally distributed in many cases while the monthly returns are not normally distributed. This suggests that $\hat\beta_C$ can be applied to both normally distributed and non-normally distributed data. In both situations $\hat\beta_C$ is more efficient than $\hat\beta$, as illustrated in our study.
As shown in (7), $g$ is the precision of the prior density of $\beta$. The larger the value of $g$, the less is the prior uncertainty about $\beta$; consequently, the estimate $\hat\beta_C$ puts heavier weight on the prior location. The results in Table 5 and Table 7 show that in general the relative efficiency is higher for greater $g$ values and for smaller sample sizes. This suggests that our choice of prior information is appropriate and that the prior information contributes significantly to the estimation.
The results in Table 5 and Table 7 show that the relative efficiency is lower for large sample sizes. This implies that $\hat\beta_C$ is not much better than $\hat\beta$ for large sample sizes, perhaps because the portfolio of the US stock returns is not stable over time or because the estimate $\hat\beta$ is already sufficiently good. The results in the tables also show that the relative efficiency is lower for small $g$ values. This makes sense because $\hat\beta_C$ tends to $\hat\beta$ when $g$ tends to zero. Table 5 shows that $\hat\beta_C$ is up to 12% more efficient than $\hat\beta$, while Table 7 shows that $\hat\beta_C$ is up to 17% more efficient than $\hat\beta$. These empirical results illustrate that $\hat\beta_C$ is uniformly better than $\hat\beta$ in the estimation of the parameters in the CAPM.

Conclusion
Bian and Dickey (1996) developed a robust Bayesian estimator for the vector of regression coefficients using a Cauchy-type g-prior. This estimator is an adaptive weighted average of the least squares estimator and the prior location, and is highly robust with respect to wild and extreme observations. In this paper, we apply the robust Bayesian estimator to financial regression models of stock returns in which the error is well known to have a 'flat-tailed' distribution. To compare this estimator with the traditional least squares estimator, we apply both estimators to analyze the Capital Asset Pricing Model of the US annual and monthly stock returns. In our empirical study, we find that the robust Bayesian estimate is uniformly more efficient than the least squares estimate in terms of the relative efficiency of one-step ahead forecast mean square error, especially for small samples. Our study supports the view that the robust Bayesian estimator is more appropriate in the CAPM estimation.
The approach in our paper is based on the regression modeling technique. One may apply the technique in Wong and Miller (1990) and  to investigate the fundamental component and the error component for each portfolio. One may also use the modified maximum likelihood estimation approach (see Tiku et al. 1999a, b, c and Wong 1998) to relax the normality assumption on the CAPM.
Another possible area for further research is to compare the beta in this study with the equity cost of capital for each portfolio. For the estimation of the equity cost of capital, see, for example, Wong (1991, 1996). One may also apply the approach in this paper to study the difference in the beta between risk averters and risk lovers; see Li and Wong (1999) and Wong and Li (1999).
calculations. Special thanks also to the editor and the referees for their valuable comments that have significantly improved this manuscript.