Statistically Efficient Construction of α-Risk-Minimizing Portfolio

We propose a semiparametrically efficient estimator for α-risk-minimizing portfolio weights. Based on the work of Bassett et al. (2004), an α-risk-minimizing portfolio optimization is formulated as a linear quantile regression problem. The quantile regression method uses a pseudolikelihood based on an asymmetric Laplace reference density, and asymptotic properties such as consistency and asymptotic normality are obtained. We apply the results of Hallin et al. (2008) to the problem of constructing α-risk-minimizing portfolios using residual signs and ranks and a general reference density. Monte Carlo simulations assess the performance of the proposed method. Empirical applications are also investigated.


Introduction
Since the first formulation of Markowitz's mean-variance model, portfolio optimization and construction have been a critical part of asset and fund management. At the same time, portfolio risk assessment has become an essential tool in risk management. Yet there are well-known shortcomings of variance as a risk measure for the purposes of portfolio optimization; namely, variance is a good risk measure only for elliptical and symmetric return distributions.
The proper mathematical characterization of risk is of central importance in finance. The choice of an adequate risk measure is a complex task that, in principle, involves deep consideration of the attitudes of market players and the structure of markets. Recently, value at risk (VaR) has gained widespread use, in practice as well as in regulation. VaR has been criticized, however, because, as a quantile, it has no reason to be convex, and indeed, it is easy to construct portfolios for which VaR seriously violates convexity. The shortcomings of VaR led to the introduction of coherent risk measures. Artzner et al. [1] and Föllmer and Schied [2] question whether VaR qualifies as such a measure, and both find that VaR is not an adequate measure of risk. Unlike VaR, expected shortfall (or tail VaR), which is defined as the expected portfolio tail return, has been shown to have all necessary characteristics of a coherent risk measure. In this paper, we use α-risk as a risk measure that satisfies the conditions of a coherent risk measure (see [3]). Variants of the α-risk measure include expected shortfall and tail VaR. The α-risk-minimizing portfolio, introduced as a pessimistic portfolio in Bassett et al. [3], can be formulated as a problem of linear quantile regression.
Since the seminal work by Koenker and Bassett [4], quantile regression (QR) has become more widely used to describe the conditional distribution of a random variable given a set of covariates. One common finding in the extant literature is that the quantile regression estimator has good asymptotic properties under various data dependence structures and for a wide variety of conditional quantile models. A comprehensive guide to quantile regression is provided by Koenker [5]. Quantile regression methods use a pseudolikelihood based on an asymmetric Laplace reference density (see [6]). Komunjer [7] introduced a class of "tick-exponential" distributions, which includes the asymmetric Laplace density as a particular case, and showed that the tick-exponential QMLE reduces to the standard quantile regression estimator of Koenker and Bassett [4]. In quantile regression, one must know the conditional error density at zero, and incorrect specification of the conditional error density leads to inefficient estimators. Yet correct specification is difficult, because reliable shape information may be scarce. Zhao [8], Whang [9], and Komunjer and Vuong [10] propose efficiency corrections for the univariate quantile regression model.
This paper describes semiparametrically efficient estimation of an α-risk-minimizing portfolio in which the asymmetric Laplace reference density (i.e., the standard quantile regression estimator) is replaced by any other reference density f whose α-quantile is zero, based on residual ranks and signs. A √n-consistent and asymptotically normal one-step estimator is proposed. Like all semiparametric estimators in the literature, our method relies on the availability of a √n-consistent first-round estimator, a natural choice being the standard quantile regression estimator. Under correct specification, the one-step estimator attains the semiparametric efficiency bound associated with f. The remainder of this paper is organized as follows. In Section 2, we introduce the setup and definition of an α-risk-minimizing portfolio and present its equivalent formulation under quantile regression settings. Section 3 contains theoretical results for our one-step estimator, and Section 4 describes its computation and performance. Section 5 gives empirical applications, and Section 6 gives our conclusions.

α-Risk-Minimizing Portfolio Formulation
"α-risk" can be considered a coherent measure of risk, as discussed in Artzner et al. [1]. The α-risk of X, say $\rho_{\nu_\alpha}(X)$, is defined as
$$\rho_{\nu_\alpha}(X) := -\int_0^1 F_X^{\leftarrow}(t)\, d\nu_\alpha(t) = -\frac{1}{\alpha}\int_0^{\alpha} F_X^{\leftarrow}(t)\, dt,$$
where $\nu_\alpha(t) := \min\{t/\alpha, 1\}$ and $F_X^{\leftarrow}(\alpha) := \inf\{x : F_X(x) \ge \alpha\}$ denotes the quantile function of a random variable X with distribution function $F_X$. Here, we recall the definition of expected shortfall and the relationship among the tail risk measures in finance. The α-expected shortfall, defined for $\alpha \in (0, 1)$, can be shown to be a risk measure that satisfies the axioms of a coherent measure of risk. It is worth mentioning that the expected shortfall is closely related to, but does not coincide with, the notion of conditional value at risk $\mathrm{CVaR}_\alpha$ defined in Uryasev [11] and Pflug [12]. We note that expected shortfall and conditional VaR (or tail conditional expectation) are identical "extreme" risk measures only for continuous distributions. To avoid confusion, in this paper, we use the term "α-risk measure" instead of terms like expected shortfall, CVaR, or tail conditional expectation. Bassett et al. [3] showed that a portfolio with minimized α-risk can be constructed via the quantile regression (QR) methods of Koenker and Bassett [4]. QR is based on the fact that a quantile can be characterized as the minimizer of some expected asymmetric absolute loss function, namely,
$$F_X^{\leftarrow}(\alpha) = \arg\min_{\theta \in \mathbb{R}} E\,\rho_\alpha(X - \theta),$$
where $\rho_\alpha(u) := u(\alpha - 1\{u < 0\})$, $u \in \mathbb{R}$, is called the check function (see [5]), and $1_A$ is the indicator function defined by $1_A(\omega) := 1$ if $\omega \in A$, $:= 0$ if $\omega \notin A$. To construct the optimal (i.e., α-risk minimized) portfolio, the following lemma is needed.
Lemma 2.1 (Theorem 2 of [3]). Let X be a real-valued random variable with $EX = \mu < \infty$; then
$$\min_{\theta \in \mathbb{R}} E\,\rho_\alpha(X - \theta) = \alpha\,(\mu + \rho_{\nu_\alpha}(X)).$$
Let $Y = Y(\pi) = X^\top \pi$ denote a portfolio consisting of d different assets $X := (X_1, \ldots, X_d)^\top$ with allocation weights $\pi := (\pi_1, \ldots, \pi_d)^\top$ subject to $\sum_{j=1}^{d} \pi_j = 1$. The optimization problem under study is, for some prespecified expected return $\mu_0$,
$$\min_{\pi}\ \rho_{\nu_\alpha}(X^\top \pi) \quad \text{subject to} \quad E(X^\top \pi) = \mu_0.$$
The sample or empirical analogue of this problem can be expressed as the linear quantile regression problem (2.7), where $X_{ij}$ denotes the jth sample value of asset i and $\bar{X}_i$ the corresponding sample mean, with some κ sufficiently large. The minimizer of (2.7), namely $\hat{\beta}_n(\alpha)$ defined at (2.9), provides the optimal weights yielding the minimal α-risk.
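As a numerical illustration of Lemma 2.1, the following minimal sketch checks the identity on a simulated sample. It assumes the identity reads min over ξ of E ρ_α(X − ξ) = α(μ + ρ_{ν_α}(X)), as in Theorem 2 of Bassett et al. [3], and that the α-risk is estimated by minus the average of the nα smallest observations (with nα an integer). The function names are hypothetical and not part of the paper.

```python
import numpy as np

def check_loss(u, alpha):
    """Check function rho_alpha(u) = u * (alpha - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (alpha - (u < 0))

def empirical_alpha_risk(x, alpha):
    """Minus the average of the k = n * alpha smallest observations.

    Assumes n * alpha is an integer, so the sample identity is exact.
    """
    x = np.sort(np.asarray(x, dtype=float))
    k = int(round(len(x) * alpha))
    return -x[:k].mean()

def min_mean_check_loss(x, alpha):
    """Minimize the sample mean of rho_alpha(x_i - xi) over xi.

    The objective is convex and piecewise linear in xi, so a minimizer
    can always be found among the sample points themselves.
    """
    x = np.asarray(x, dtype=float)
    losses = check_loss(x[:, None] - x[None, :], alpha).mean(axis=0)
    return losses.min()

rng = np.random.default_rng(0)
x = rng.normal(loc=0.05, scale=0.02, size=1000)   # illustrative returns
alpha = 0.1

lhs = min_mean_check_loss(x, alpha)
rhs = alpha * (x.mean() + empirical_alpha_risk(x, alpha))
print(lhs, rhs)   # the two sides should agree up to rounding
```

The agreement of the two printed numbers is what allows α-risk minimization to be rewritten as a quantile regression problem.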
The large sample properties of $\hat{\beta}_n(\alpha)$, especially its √n-consistency, follow from the standard arguments and assumptions in the QR context (see [5]).
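The quantile regression formulation of the α-risk-minimizing portfolio can be sketched as a linear program. This is only an illustration under stated assumptions: asset 1 is regressed on the differences between asset 1 and the other assets, as in Bassett et al. [3], and the mean constraint is imposed as an explicit equality rather than through the κ-weighted pseudo-observation used in the text. The function name and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def alpha_risk_portfolio(returns, alpha, mu0):
    """Sketch: alpha-risk-minimizing weights via a quantile-regression LP.

    returns : (n, d) array, one column per asset.
    The target mean mu0 is enforced as an LP equality (an assumption of
    this sketch, replacing the kappa-weighted pseudo-observation).
    """
    x = np.asarray(returns, dtype=float)
    n, d = x.shape
    z = x[:, 0]                        # response: returns of asset 1
    diff = x[:, [0]] - x[:, 1:]        # regressors: X_1 - X_j, j = 2..d
    xbar = x.mean(axis=0)

    # Decision vector: [xi, beta_2..beta_d, u_1..u_n, v_1..v_n],
    # with residual_i = u_i - v_i and u_i, v_i >= 0.
    c = np.concatenate([np.zeros(d), alpha * np.ones(n), (1 - alpha) * np.ones(n)])

    # Fit equations: xi + diff_i' beta + u_i - v_i = z_i.
    a_fit = np.hstack([np.ones((n, 1)), diff, np.eye(n), -np.eye(n)])
    # Mean constraint: sum_j (xbar_1 - xbar_j) beta_j = xbar_1 - mu0.
    a_mean = np.concatenate([[0.0], xbar[0] - xbar[1:], np.zeros(2 * n)])[None, :]

    res = linprog(
        c,
        A_eq=np.vstack([a_fit, a_mean]),
        b_eq=np.concatenate([z, [xbar[0] - mu0]]),
        bounds=[(None, None)] * d + [(0, None)] * (2 * n),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    beta = res.x[1:d]
    weights = np.concatenate([[1.0 - beta.sum()], beta])
    return weights, res.x[0]

# Illustrative data only (not the paper's simulation design).
rng = np.random.default_rng(1)
n = 200
sample = np.column_stack([
    rng.normal(0.05, 0.02, n),
    rng.normal(0.05, 0.02, n),
    rng.normal(0.09, 0.05, n),
    rng.normal(0.09, 0.05, n),
])
weights, xi = alpha_risk_portfolio(sample, alpha=0.1, mu0=0.07)
print(weights, sample.mean(axis=0) @ weights)
```

The first weight is recovered as one minus the sum of the remaining coefficients, so the budget constraint holds by construction.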
Let $\bar{W}_n$ and $\Sigma^n_W$ be the mean vector and the covariance matrix of $W_i$, which are given by (2.10), respectively. The (p, q)th element of $\Sigma^n_W$ is given by (2.11). Let $D_{\Sigma_W} := \mathrm{diag}\{\sigma^n_{W,11}, \ldots, \sigma^n_{W,dd}\}$. Then the correlation matrix of $\{W_i\}_{i=1,\ldots,n+1}$ becomes $R := D_{\Sigma_W}^{-1/2}\, \Sigma^n_W\, D_{\Sigma_W}^{-1/2}$, and the (p, q)th element of R is given by (2.13) for $p, q = 2, \ldots, d$. The above correlation coefficient can take values close to 1 when $n/\kappa^2$ is close to 0 with $\bar{X}_1 - \bar{X}_p \neq 0$ and $\bar{X}_1 - \bar{X}_q \neq 0$. Hence, the estimated portfolio weights are possibly highly correlated among assets whose sample means differ from $\bar{X}_1$, while these problems are ignorable in an asymptotic inference problem if we take $\kappa = O(n^{1/2})$.
Thus far, we have seen that the α-risk-minimizing portfolio can be obtained by (2.9), which was the result of Bassett et al. [3]. In what follows, we show that semiparametrically efficient inference of the optimal weights $\hat{\beta}_n(\alpha)$ is feasible. The quantity estimated by (2.9) can be regarded as a QR coefficient $\beta(\alpha)$, defined by (2.14). Note that here the QR model (2.14) has a random coefficient regression (RCR) interpretation of the form $Z_i = W_i^\top \beta(U_i)$ with a componentwise monotone increasing function β and random variables $U_i$ that are uniformly distributed over $[0, 1]$, that is, $U_i \sim \mathrm{Uniform}[0, 1]$ (see [5]). Here, a choice such that $\beta(u) = (b_1 + F_\xi^{\leftarrow}(u), b_2, \ldots, b_d)^\top$, with $F_\xi$ the distribution function of some independent and identically distributed (i.i.d.) n-tuple $(\xi_1, \ldots, \xi_n)$, yields $Z_i = W_i^\top b + \xi_i$. Hence, recalling that the first component of $W_i$ is 1, it follows that, for any fixed $\alpha \in (0, 1)$, the QR coefficient $\beta(\alpha)$ can be characterized as the parameter $b \in \mathbb{R}^d$ of the model (2.16), in which the density g of the error distribution G is restricted to the class $\mathcal{F}_\alpha$ of densities whose α-quantile is zero, and where $P^n_{b,g}$ denotes the distribution of an observation $\{Z_i\}_{i=1}^{n}$. This model (2.16) is a fixed-α submodel of (2.14) and is the parametric submodel through which we will achieve semiparametric efficiency.
The model (2.16) is a quantile-restricted linear regression model. But here we have no knowledge about the true density g, other than that it belongs to $\mathcal{F}_\alpha$, which allows us to identify b. So, we arbitrarily choose f from $\mathcal{F}_\alpha$, call it the "reference density," and correspondingly define a "reference model" in which the density f of F is subject to $f \in \mathcal{F}_\alpha$. The goal of the next section is to construct an asymptotically efficient version of $\hat{\beta}_n(\alpha)$ based on some feasible $f \in \mathcal{F}_\alpha$, that is, one attaining the semiparametric lower bound at a correctly specified density ($f = g$) that nevertheless remains √n-consistent under a misspecified density ($f \neq g$).

Semiparametrically Efficient Estimation
The procedure that we will apply here to achieve semiparametric efficiency is based on the invariance principle. "Good" inference should then be based on the rank-based version of the central sequence, where $U_{b_n,i} := F(e_{b_n,i})$ is i.i.d. uniform on $[0, 1]$ under $P^n_{b_n,f}$ and is hence approximated from the residual signs and ranks, and where $\hat{I}_{fg}$ and $\hat{\mu}^{-}_{\varphi_g}$ are consistent estimates of $I_{fg}$ and $\mu^{-}_{\varphi_g}$, respectively. Consistent estimates $\hat{I}_{fg}$ and $\hat{\mu}^{-}_{\varphi_g}$ can be obtained in the manner of Hallin et al. [19], which is done without the kernel estimation of g, though here we omit the details.
Lemma 3.5 (Section 4.1 of [6]). Under $P_{b,g}$ with $g \in \mathcal{F}_\alpha$, (3.11) holds.
Therefore, the one-step estimator $\hat{b}^n_f$ defined by (3.8) for b is semiparametrically efficient at $f = g$.
In our original notation, the above statement can be rewritten in terms of the portfolio weights. Recall that the standard QR estimator, defined at (2.9), is asymptotically normal (see Koenker [5]). The variances of the α-risk-minimizing portfolios constructed by the standard quantile regression and one-step estimators are stated in the following proposition. Since direct evaluation gives the statement, we skip its proof.
Proposition 3.6. The asymptotic conditional variances of an α-risk-minimizing portfolio using the standard quantile regression and one-step estimators, given $X = x$, are, respectively,
For any positive definite matrices A and B, we say $A \le B$ if $B - A$ is nonnegative definite. To compare the efficiency of the standard quantile regression estimator and the one-step estimator, we need to show that $\Sigma_{fg}^{-1} \Sigma_{ff} \Sigma_{fg}^{-1} \le D^{-1}$. To see this, as in Section 3 of Koenker and Zhao [20], let us consider an auxiliary matrix Σ built from these quantities. Note that Σ is a nonnegative definite matrix. If $\Sigma_{fg}^{-1} D \Sigma_{fg}^{-1}$ is positive definite, then there exists an orthogonal matrix P such that the required difference is nonnegative definite whenever $\Sigma_{fg}$ is nonsingular. This result assures that the one-step estimator is asymptotically more efficient than the standard quantile regression estimator. From this result, it is easy to see that the same ordering holds for the conditional variances in Proposition 3.6. Also, by taking expectation on both sides, the same inequality holds for unconditional variances.
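The matrix ordering used above can be checked numerically. The following is a minimal sketch, assuming symmetric inputs. The helper name `loewner_leq` and the two example matrices are hypothetical and are not the covariance matrices of the paper.

```python
import numpy as np

def loewner_leq(a, b, tol=1e-10):
    """Return True if A <= B, i.e. B - A is nonnegative definite."""
    diff = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    diff = (diff + diff.T) / 2.0            # symmetrize against round-off
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)

# Illustrative matrices: B - A = diag(1.0, 0.5) is positive definite.
a = np.array([[1.0, 0.2], [0.2, 1.0]])
b = np.array([[2.0, 0.2], [0.2, 1.5]])
print(loewner_leq(a, b), loewner_leq(b, a))

# A <= B implies x'Ax <= x'Bx for every vector x.
x = np.array([0.3, -0.7])
print(x @ a @ x, x @ b @ x)
```

When A and B are asymptotic covariance matrices of two estimators of the weights, A ≤ B means that every linear combination of the weights has an asymptotic variance under A that is no larger than under B.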

Numerical Studies
In this section, we examine the finite sample properties of the proposed one-step estimator described in Section 3 for the cases where α = 0.1 and 0.5. Our simulations are performed with two data-generating processes to focus on the underlying true density g and how the choice of the reference density f might affect the finite sample performance. The first data-generating process (DGP1) is the same as that investigated by Bassett et al. [3]. For DGP1, we consider the construction of an α-risk-minimizing portfolio from four independently distributed assets. Asset 1 is normally distributed with mean 0.05 and standard deviation 0.02. Asset 2 has a reversed $\chi^2_3$ density with location and scale chosen so that its mean and variance are identical to those of asset 1. Asset 3 is normally distributed with mean 0.09 and standard deviation 0.05. Finally, asset 4 has a $\chi^2_3$ density with mean and standard deviation identical to those of asset 3. DGP2 is a four-dimensional normal distribution with the same mean vector as DGP1 and covariance matrix $\Sigma = (\sigma_{ij})_{i,j=1,\ldots,4}$ with $\mathrm{diag}\,\Sigma = (0.02, 0.02, 0.05, 0.05)$ and $\sigma_{ij} = \sqrt{\sigma_{ii}\sigma_{jj}}\,\rho$ for $i \neq j$. Here, we set ρ = 0.5, which indicates that the asset returns have correlation 0.5. Notice that both DGP1 and DGP2 have the same mean and variance structures. The underlying true conditional densities of $u = Z - W^\top b$ for DGP1 and DGP2 are a mixture of the normal, $\chi^2_3$, and reversed $\chi^2_3$ distributions and a normal distribution, respectively. Each simulation of the estimator, for sample sizes n = 100, 500, and 1000, consists of 1000 replications. We set the prespecified expected return $\mu_0$ to 0.07.
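A sample from DGP1 can be drawn as in the following minimal sketch. It assumes that "location and scale chosen" means standardizing a $\chi^2_3$ variate (mean 3, variance 6) and then shifting and scaling it to the target mean and standard deviation. The function name and seed are hypothetical.

```python
import numpy as np

def simulate_dgp1(n, rng):
    """Draw n observations of the four independent DGP1 assets."""
    # Standardized chi-square(3) variates: mean 0, variance 1.
    chi_a = (rng.chisquare(3, n) - 3.0) / np.sqrt(6.0)
    chi_b = (rng.chisquare(3, n) - 3.0) / np.sqrt(6.0)
    asset1 = rng.normal(0.05, 0.02, n)
    asset2 = 0.05 - 0.02 * chi_a      # reversed chi-square: left-skewed
    asset3 = rng.normal(0.09, 0.05, n)
    asset4 = 0.09 + 0.05 * chi_b      # chi-square: right-skewed
    return np.column_stack([asset1, asset2, asset3, asset4])

x = simulate_dgp1(200_000, np.random.default_rng(0))
print(x.mean(axis=0))   # approximately (0.05, 0.05, 0.09, 0.09)
print(x.std(axis=0))    # approximately (0.02, 0.02, 0.05, 0.05)
```

Assets 1 and 2 (and likewise 3 and 4) share their first two moments, so a mean-variance criterion cannot distinguish them, whereas a tail-sensitive criterion such as α-risk can.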
For each scenario, we computed standard quantile regression estimates $\hat{\beta}_n(\alpha)$ with corresponding portfolio weights $\hat{\pi}_{QR} = (1 - \sum_{j=2}^{d} \hat{\beta}_{n,j}(\alpha), \hat{\beta}_{n,2}(\alpha), \ldots, \hat{\beta}_{n,d}(\alpha))^\top$, and our one-step estimates defined by (3.8), for various choices of the reference density f and actual density g in the α-risk-minimizing portfolio allocation problem.
To make the problem a pure location model, we standardize the estimated residuals to have unit variance, that is, $\tilde{u} = \hat{u}/\sqrt{\mathrm{Var}(\hat{u})}$, where $\hat{u}$ denotes the estimated residual. For DGP1, the true density g can be estimated by the kernel estimator
$$\hat{g}(u) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{u - \tilde{u}_i}{h}\right),$$
where K is a kernel function and h is a bandwidth. The first derivative $g'(u)$ is estimated by
$$\hat{g}'(u) = \frac{1}{nh^2}\sum_{i=1}^{n} K'\!\left(\frac{u - \tilde{u}_i}{h}\right).$$
As for DGP2, the actual density g becomes normal because the portfolio is constructed from normally distributed returns. We use the normal distribution (N), the asymmetric Laplace distribution (AL), the logistic distribution (LGT), and the asymmetric power distribution (APD) with λ = 1.5 for the reference density f. The asymmetric power distribution, introduced by Komunjer [7], has a density indexed by 0 ≤ α < 1 and λ > 0. When α = 0.5, the APD pdf is symmetric around zero. In this case, the APD density reduces to the standard generalized power distribution (GPD) [21]. Special cases of the GPD include the uniform (λ = ∞), Gaussian (λ = 2), and Laplace (λ = 1) distributions. When α ≠ 0.5, the APD pdf is asymmetric. Special cases include the asymmetric Laplace (λ = 1) and the two-piece normal (λ = 2) distributions.
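For concreteness, the following sketch evaluates an asymmetric power density and checks two properties a reference density needs: it integrates to one, and it places mass α to the left of zero. The functional form and the constant `delta` are an assumption, written as the APD density is usually stated following Komunjer [7]. The function name is hypothetical.

```python
import math
import numpy as np

def apd_pdf(u, alpha, lam):
    """Asymmetric power density with alpha-quantile at zero (assumed form)."""
    u = np.asarray(u, dtype=float)
    delta = (2.0 * alpha**lam * (1.0 - alpha)**lam
             / (alpha**lam + (1.0 - alpha)**lam))
    norm = delta ** (1.0 / lam) / math.gamma(1.0 + 1.0 / lam)
    # Different decay rates on the two sides of zero.
    scale = np.where(u <= 0, alpha**lam, (1.0 - alpha)**lam)
    return norm * np.exp(-delta / scale * np.abs(u) ** lam)

def trapezoid(f, u):
    """Plain trapezoidal rule on a grid."""
    return float(np.sum((f[1:] + f[:-1]) * np.diff(u)) / 2.0)

grid = np.linspace(-60.0, 60.0, 600_001)
dens = apd_pdf(grid, alpha=0.1, lam=1.5)
total_mass = trapezoid(dens, grid)
left = grid <= 0
left_mass = trapezoid(dens[left], grid[left])
print(total_mass, left_mass)   # close to 1 and to alpha = 0.1
```

With α = 0.5, this form gives the Laplace density for λ = 1 and a Gaussian density with variance 1/2 for λ = 2, in line with the special cases listed above.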
For a given sample size, we compute the simulated mean and standard deviation of $\hat{\pi}_{QR}$ and $\hat{\pi}_{OS,f}$ and the relative efficiency $\mathrm{Var}(\hat{\pi}_{OS,f,i}) / \mathrm{Var}(\hat{\pi}_{QR,i})$ for i = 1, ..., 4. Table 1 gives the relative efficiencies for DGP1. When α = 0.1, the efficiency gains of the one-step estimators with the asymmetric Laplace reference density are large compared with the other reference densities for n = 1000, while these gains are smaller for n = 100. When α = 0.5, the relative efficiency for assets 3 and 4 is smallest with the asymmetric Laplace reference density, while for assets 1 and 2 it is smallest with the normal reference density. This is because of the covariance structure of $\Sigma^n_W$ defined by (2.10). As can be seen in Section 2, if $\mu_1 \neq \mu_p$ and $\mu_1 \neq \mu_q$, the (p, q)th element of the correlation matrix defined by (2.13) has a value close to unity. In this case, the asymptotic variance of the usual quantile regression estimator becomes large, which leads to unsatisfactorily large variances for assets 3 and 4. However, the asymptotic variance of our one-step estimator does not have such problems.
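The relative efficiency reported in the tables is a ratio of simulated variances across replications. The sketch below shows that computation. The input arrays are placeholder draws used only to exercise the function, not the paper's simulation output, and the function name is hypothetical.

```python
import numpy as np

def relative_efficiency(weights_os, weights_qr):
    """Var(pi_OS,i) / Var(pi_QR,i) for each asset i.

    Both inputs have shape (replications, assets); values below one
    indicate that the one-step weights are less variable.
    """
    var_os = np.var(np.asarray(weights_os, dtype=float), axis=0, ddof=1)
    var_qr = np.var(np.asarray(weights_qr, dtype=float), axis=0, ddof=1)
    return var_os / var_qr

# Placeholder replications (1000 runs, 4 assets), for illustration only.
rng = np.random.default_rng(2)
pi_qr = rng.normal(0.25, 0.10, size=(1000, 4))
pi_os = 0.25 + 0.5 * (pi_qr - 0.25)     # same centre, half the spread
ratio = relative_efficiency(pi_os, pi_qr)
print(ratio)
```

Halving the spread of the weights divides the variance by four, so each ratio in this toy example equals 0.25.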
Table 2 gives the relative efficiencies for DGP2. In line with efficiency at a correctly specified reference density (f = N), we see that the relative efficiency is minimal for all assets and sample sizes with α = 0.1 and 0.5. Even when the reference density is misspecified (f ≠ N), some efficiency gain remains, except for assets 1 and 2 under the asymmetric Laplace reference density with n = 100 and α = 0.1. Efficiency gains for the normal and logistic reference densities are almost the same because the underlying true density is a symmetric normal distribution, and the asymmetric power reference density with λ = 1.5 outperforms the asymmetric Laplace reference density.
Figure 1 plots the kernel densities of the estimated portfolio weights for DGP2 with α = 0.5 and n = 1000. We see that the standard quantile regression estimators have long

Empirical Application
We apply our methodology to weekly log returns of the 96 stocks of the TOPIX Large 100 index. The sample runs from January 5, 2007, to December 2, 2011, for a total of 257 observations. The stock prices are adjusted to take into account events such as stock splits on individual securities. Preliminary tests reveal that most log return series have high kurtosis and, in general, negative skewness, which indicates that the log returns are non-Gaussian. We computed the optimal portfolio allocations for α = 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, and 0.5. We set κ = 1000 and $\mu_0 = -0.002$, which is the third quartile of the average log-return distribution. For the first-round estimates, we used the standard quantile regression estimator, and for the one-step estimates, we chose a normal distribution as the reference density. Since we do not have enough information about the shape of the portfolio distributions for the various choices of α, the actual density g is estimated by the kernel method.
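Kernel estimation of the residual density g and of its derivative can be sketched as follows. The Gaussian kernel, the rule-of-thumb bandwidth, and the function name are assumptions made for illustration. The paper does not state which kernel or bandwidth was used.

```python
import numpy as np

def gaussian_kde_with_derivative(points, residuals, h):
    """Kernel estimates of g and g' at `points` with a Gaussian kernel.

    g_hat(u)  = (1 / (n h))   * sum_i K((u - u_i) / h)
    g_hat'(u) = (1 / (n h^2)) * sum_i K'((u - u_i) / h), with K'(t) = -t K(t).
    """
    points = np.atleast_1d(np.asarray(points, dtype=float))
    resid = np.asarray(residuals, dtype=float)
    t = (points[:, None] - resid[None, :]) / h
    kern = np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)
    g_hat = kern.mean(axis=1) / h
    dg_hat = (-t * kern).mean(axis=1) / h**2
    return g_hat, dg_hat

# Illustrative standardized residuals (placeholder data).
rng = np.random.default_rng(3)
resid = rng.standard_normal(500)
h = 1.06 * resid.std(ddof=1) * len(resid) ** (-0.2)   # rule-of-thumb bandwidth (assumed)
grid = np.linspace(-6.0, 6.0, 2401)
g_hat, dg_hat = gaussian_kde_with_derivative(grid, resid, h)
print(g_hat.max(), np.sum(g_hat) * (grid[1] - grid[0]))
```

The estimated density should integrate to approximately one over a grid that covers the residuals, which is a quick sanity check before using it in further computations.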

Advances in Decision Sciences
Figure 2 plots the cumulative distribution functions of the α-risk-minimizing portfolios obtained by the standard quantile regression estimates and one-step estimates for α = 0.1, 0.2, 0.3, and 0.5. Summary statistics for the distributions of the different portfolios are reported in Table 3.
Figure 2 and Table 3 clearly show that the optimal α-risk-minimizing portfolio manages to reduce the occurrence of events in the left tail when α is small, for both standard QR estimates and one-step estimates. The standard deviation of the one-step estimates of an α-risk-minimizing portfolio is smaller than that of the standard QR estimates. We can also observe that the range of a portfolio constructed with one-step estimates is much smaller than that of standard QR estimates, due to the semiparametric efficiency properties of our one-step estimators. As α becomes large, the difference in the standard deviation of the constructed portfolio between standard QR estimates and one-step estimates tends to become large. Hence, efficiency gains are large for α = 0.5, which is the mean absolute deviation portfolio. Another interesting finding is that the standard QR-constructed portfolios have high-density peaks at the required quantiles for all values of α, whereas the portfolio constructed by one-step estimates has a quite moderate density reduction at the required quantiles. Consistent with economic intuition, higher risk aversion is associated with a shorter left tail. In the case where α ≤ 0.1, maximum loss is limited to less than −0.02. This result is particularly striking given that the sample includes the stock market crash of October 2008 due to the US subprime mortgage crisis and the bankruptcy of Lehman Brothers, which resulted in a weekly loss of more than −0.220 for TOPIX. The sample also includes the stock market crash of March 2011 due to the catastrophic earthquake and tsunami that hit Japan, which resulted in a weekly loss of −0.104.
Figure 3 presents empirical efficient frontiers corresponding to the standard quantile regression-based portfolios and one-step estimates of a portfolio with α = 0.1 and 0.5.

Figure 3 clearly illustrates that the standard quantile regression-based portfolio is completely inefficient, far from the one-step frontier.

Summary and Conclusions
This paper considered semiparametrically efficient estimation of an α-risk-minimizing portfolio. A one-step estimator based on residual signs and ranks was proposed, and simulations were performed to compare the finite sample relative efficiencies of the standard quantile regression estimator and the one-step estimator. These simulations confirmed our theoretical findings. An empirical application constructing a portfolio from 96 Japanese stocks was investigated and confirms that the one-step α-risk-minimizing portfolio has smaller variance than that obtained by the standard quantile regression estimator. Further research topics include (1) construction of portfolios with no short sales and (2) extending the results to time series covariates with heteroskedastic returns. For the former, one can impose nonnegativity of the weights by using a penalty function containing a term that diverges to infinity as any of the weights becomes negative (see [22]). For the latter, we refer to Hallin et al. [6] and Taniai [23].
$\ldots + o_P(1)$, $h \in \mathbb{R}^d$, where all the stochastic convergences are taken under $P_{b,g} := P^{\infty}_{b,g}$. Here, the random vector $\Delta^n_{b;g}$ is called the central sequence, and the positive definite matrix $I_{b;g}$ is the information matrix. To ensure the LAN condition for model (2.18), the following assumption is required.
Assumption 3.1. The reference density f has finite Fisher information for location.
$N^n_{b_n,L} := \#\{i \in \{1, \ldots, n\} \mid S_{b_n,i} \le 0\}$. In short, we are first rewriting the residual $e_{b_n,i}$ as $F^{\leftarrow}(U_{b_n,i})$ with realizations $U_{b_n,i}$ of a $[0, 1]$-uniform random variable, and then approximating those $U_{b_n,i}$ by $V^n_{b_n,i}$ given $\{N^n_{b_n,L}; R^n_{b_n,1}, \ldots, R^n_{b_n,n}\}$. Using this rank-based central sequence, we can construct the one-step estimator (see, e.g., Bickel [17]; Bickel et al. [18]) as follows.
Definition 3.3. For any sequence of estimators $\hat{\theta}_n$, the discretized estimator $\bar{\theta}_n$ is defined to be the nearest vertex of $\{\theta : \theta = n^{-1/2}(i_1, i_2, \ldots, i_d),\ i_j \text{ integers}\}$.
Definition 3.4. Let $\bar{\beta}_n(\alpha)$ be the discretized version of $\hat{\beta}_n(\alpha)$ defined at (2.9). We define the rank-based one-step estimator of b based on reference density $f \in \mathcal{F}_\alpha$ as $\hat{b}^n_f$, given by (3.8).
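The discretization in Definition 3.3 amounts to rounding each coordinate of the estimate to the lattice with spacing $n^{-1/2}$. A minimal sketch follows. The function name is hypothetical, and ties are resolved by NumPy's rounding rule.

```python
import numpy as np

def discretize(theta, n):
    """Nearest vertex of the lattice {n^{-1/2} (i_1, ..., i_d) : i_j integers}."""
    theta = np.asarray(theta, dtype=float)
    root_n = np.sqrt(n)
    # Componentwise rounding gives the nearest lattice point in Euclidean norm.
    return np.round(theta * root_n) / root_n

theta_hat = np.array([0.234, -0.051])
theta_bar = discretize(theta_hat, n=100)
print(theta_bar)
```

With n = 100 the lattice spacing is 0.1, so each coordinate moves by at most 0.05, which is of order $n^{-1/2}$ and therefore does not affect √n-consistency.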

Figure 1: Kernel density plots for the portfolio weights. Panels (a) to (d) correspond to the kernel density for assets 1 to 4, respectively. The density shows the standard quantile estimator (QR; solid line); the estimator with an asymmetric Laplace distribution reference density (AL; dashed line); normal distribution (N; dotted line); logistic distribution (LGT; dotted-dashed line); asymmetric power distribution (APD; long dashed line).

Figure 2: Empirical cumulative distribution function of the α-risk-minimizing portfolio based on the standard quantile regression estimator (thin line) and one-step estimator (thick line) for α = 0.1, 0.2, 0.3, and 0.5, which correspond to (a) to (d), respectively.

Figure 3: Efficient frontiers for an α-risk-minimizing portfolio based on the standard quantile regression estimator and the one-step estimator. The lines with triangles and circles represent the pairs of obtained standard deviation and mean for the portfolio with α = 0.5 and 0.1, respectively. The solid and dashed lines represent risks and returns for the standard quantile regression and one-step portfolios, respectively.

Assumption 3.2. The regression vectors $W_i$ satisfy, under $P_{b,g}$, convergence of $\bar{W}_n$ and $\Sigma^n_W$ to some $\bar{W}$ and positive definite $\Sigma_W$, where $\bar{W}_n$ and $\Sigma^n_W$ are defined by (2.10).
Then, by Theorem 2.1 and Example 4.1 of Drost et al. [15], model (2.18) satisfies the uniform LAN condition for any $b_n$ of the form $b + O(n^{-1/2})$, with central sequence and information matrix expressed through the residuals, where $e_{b_n,i}$ denotes the residual (i.e., $e_{b_n,i} := Z_i - W_i^\top b_n$). Consequently, we have contiguity, where contiguity of $Q^n$ with respect to $P^n$ means that for any sequence $S_n$, if $P^n(S_n) \to 0$, then $Q^n(S_n) \to 0$ also. The reason why we have specified uniform LAN, rather than LAN at a single b, is the one-step improvement, which will be discussed later. By following Hallin and Werker [13], a semiparametrically efficient procedure can be obtained by projecting $\Delta^n_{b_n;f}$ on some σ-field to which the generating group for $\{P^n_{b_n,f} \mid f \in \mathcal{F}_\alpha\}$ becomes maximal invariant (see, e.g., Schmetterer [16]). For the quantile-restricted regression model (2.16), such a σ-field is studied by Hallin et al. [6] and found to be generated by signs and ranks of the residuals. Here, let us denote the sign of a residual as $S_{b_n,i}$ and the rank of a residual as $R^n_{b_n,i}$, and consider the σ-field generated by them.
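The residual signs and ranks that generate this σ-field are straightforward to compute. The sketch below uses a small made-up data set, assumes there are no ties among the residuals, and also returns the number of nonpositive signs. The function and variable names are hypothetical.

```python
import numpy as np

def residual_signs_and_ranks(z, w, b):
    """Signs S_i, ranks R_i (1..n), and the count of nonpositive signs.

    z : (n,) responses, w : (n, d) regressors, b : (d,) coefficient value.
    Assumes no ties among the residuals.
    """
    e = np.asarray(z, dtype=float) - np.asarray(w, dtype=float) @ np.asarray(b, dtype=float)
    signs = np.sign(e)
    ranks = np.argsort(np.argsort(e)) + 1     # rank of each residual
    n_left = int(np.sum(signs <= 0))          # residuals at or below zero
    return signs, ranks, n_left

# Made-up example: intercept-only model with b = 1.5.
z = np.array([1.0, 2.0, 0.5, 3.0])
w = np.ones((4, 1))
signs, ranks, n_left = residual_signs_and_ranks(z, w, np.array([1.5]))
print(signs, ranks, n_left)
```

Under the quantile restriction, roughly a fraction α of the residuals should be nonpositive at the true parameter value.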

Table 1: Relative efficiencies $\mathrm{Var}(\hat{\pi}_{OS,f})/\mathrm{Var}(\hat{\pi}_{QR})$ for DGP1. In this case, we estimate the unknown g, which must be an N-$\chi^2_3$ mixture, using the kernel method.

Table 2: Relative efficiencies $\mathrm{Var}(\hat{\pi}_{OS,f})/\mathrm{Var}(\hat{\pi}_{QR})$ for DGP2 (multinormal). In this case, the residual density is a normal distribution. Hence, we adopt g = N.