Stable Portfolio Selection Strategy for Mean-Variance-CVaR Model under High-Dimensional Scenarios

This paper aims to study stable portfolios with mean-variance-CVaR criteria for high-dimensional data. Combining diﬀerent estimators of covariance matrix, computational methods of CVaR, and regularization methods, we construct ﬁve progressive optimization problems with short selling allowed. The impacts of diﬀerent methods on out-of-sample performance of portfolios are compared. Results show that the optimization model with well-conditioned and sparse covariance estimator, quantile regression computational method for CVaR, and reweighted L 1 norm performs best, which serves for stabilizing the out-of-sample performance of the solution and also encourages a sparse portfolio.


Introduction
Mean-risk models are widely used and play an important role in financial risk management. e classical and revolutionary work is mean-variance (MV) optimization model proposed by Markowitz [1], in which variance is used to measure risk. Afterwards, more and more researchers are devoted to the study of this field. Kolm et al. [2] review the development, challenges, and trends of MV optimization problems in recent six decades. Considering different risk measures focus on different characteristics of risk, some risk measures other than variance are incorporated into the mean-risk framework. For example, Konno and Yamazaki [3] and Ogryczak and Ruszczynski [4] use absolute deviation and semideviation to measure risk, respectively, and construct mean-risk model for portfolio selection. Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) have been employed as the risk measure to conduct asset allocation (see Consigli [5], Alexander and Baptista [6], Xu et al. [7], and Quaranta and Zaffaroni [8] for more details).
Since different risk measures delineate different information of risk, the combination of two risk measures is used to control risk in mean-risk models. For instance, Konno et al. [9] and Konno and Suzuki [10] construct mean-absolute deviation-skewness model and mean-variance-skewness model, respectively, in which the skewness is indicative of unidirectional movement (bearish or bullish) of the stock market. Robert and Philip [11] propose a mean-variance-skewnesskurtosis portfolio optimization model and show that higherorder moments of return can significantly change optimal portfolio construction. Roman et al. [12] construct an optimization model based on mean-variance-CVaR criterion, in which variance and CVaR are combined to obtain a range of balanced solutions that are generally discarded by both meanvariance and mean-CVaR models. Additionally, the personal preference between variance and CVaR of the decision-maker can be considered. Gao et al. [13] extend it to dynamic scenario in financial field and Shi et al. [14] give the discussion in insurance investment under regulatory constraints. However, the study of the stability of optimal strategy's out-of-sample performance is ready to explore under mean-variance-CVaR criterion. us, the adaptability of optimal investment strategy in practice is questionable, especially in high-dimensional scenarios (dimensionality of assets is comparable to or even larger than the number of observations). Here, we will study this problem from the point of view of model-solving procedure.
To solve mean-variance-CVaR optimization problem, the computation of variance and CVaR for portfolios is a central and fundamental work. For variance term, estimating covariance matrix of the return variables of assets is necessary. Generally, sample covariance matrix is a good estimator when sample size is large sufficiently. But under high-dimensional scenarios, it usually delivers the presence of poor out-of-sample performance (see, e.g., Green and Hollifield [15]; Chopra and Ziemba [16]; DeMiguel et al. [17]). To deal with such instable out-of-sample performance, researchers make various contributions. Based on different thresholding function, thresholding covariance matrix estimator is one of the mainstreams of improving sample covariance matrix by considering sparsity (e.g., Tibshirani [18]; Bickel and Levin'a [19]; Cai et al. [20]). Moreover, to guarantee the presence of a convex optimization problem, Rothman et al. [21] propose the positive definite sparse covariance estimator (PDSCE). While considering the property of well condition, Ledoit and Wolf [22] propose an estimator of covariance matrix as a linear combination of sample covariance and identity matrix. Maurya [23] develops a well-conditioned and sparse estimator of covariance matrix in high-dimensional setting. is estimator has the properties of well-conditioned, sparsity, and positive definite; especially, the property of positive definite guarantees a convex optimization problem and it performs better than several other popular methods used in literatures (e.g., graphical lasso in Friedman et al. [24], PDSCE in Rothman [21], and Bickel and Levin'a thresholding estimator in Bickel and Levin'a [19]).
For CVaR item, Rockafeller and Uryasev [25,26] calculate CVaR through a convex programming problem, which pave the way of portfolio selection with CVaR under nonnormal assumption. Bassett et al. [27] bridge the gap between quantile regression and calculation of CVaR without distribution assumption of returns. However, it has also been investigated that the out-of-sample performance is unstable in mean-CVaR model (see Lim et al. [28], Takeda and Kanamori [29], and Kondor et al. [30] for more details). A popular way to make mean-CVaR model stable is regularization technique which can lead to a sparse portfolio simultaneously. Xu et al. [7] incorporate penalty function into mean-CVaR model under the large scale sample scenarios to get a sparse portfolio. Gao and Wu [31] combine penalty function and variance item to improve the out-ofsample performance of mean-CVaR model. Additionally, regularization method has wide applications in constructing mean-variance model to find stable optimal portfolios with better out-of-sample performance. For example, Brodie et al. [32] reformulate classical Markowitz's mean-variance model as a constrained least-squares regression problem. ey add a L 1 -regularization term to the objective function and show that this penalty regularizes the optimization problem and encourages sparse portfolios. Other works about regularization technique used in mean-variance or mean-CVaR model to promote stable solutions can be seen in literatures (e.g., Gotoh and Takeda [33] and Fastrich et al. [34]). Motivated by the above discussion, in this paper, we try to find portfolios with more stable out-of-sample performance for mean-variance-CVaR model under high-dimensional scenarios. Five progressive optimization problems are designed carefully based on mean-variance-CVaR criteria through combining two estimators for covariance matrix, two computing methods for CVaR, and two penalty functions in different ways. Simulation is conducted to compare the out-of-sample performance of portfolios obtained from these optimization problems and the impact of underlying methods on optimal strategy is analyzed. Based on the historical data from the constituent stocks in Shanghai Stock Exchange 50 Index, an empirical study is considered. e remaining of this paper is organized as follows. Section 2 describes the related methods used in this paper. Section 3 shows the optimization problems we formulate based on the mean-variance-CVaR criterion. Section 4 provides simulation and result comparisons. An empirical study is carried out in Section 5. Finally, Section 6 concludes the paper.

Methods
In this section, we will briefly introduce risk measures used in this paper, their estimating methods, and some regularization methods. Suppose that we have the opportunity to invest in p assets with returns X k , k � 1, 2, . . . , p. Let X � (X 1 , X 2 , . . . , X p ) ′ represent return vector with weight vector β � (β 1 , β 2 , . . . , β p ) ′ , in which β k is the portfolio allocation weight for X k . us, the total return is Y � X ′ β.

Variance.
Variance describes the degree of dispersion of a random variable and hence can measure the fluctuation of investment return. Here, we use V(X ′ β) to denote the variance of a portfolio return variable Y since Y � X ′ β, and V(X ′ β) can be expressed as where Σ � (σ kj ) p×p is the covariance matrix of X and σ kj is the covariance of X k and X j . Generally, Σ is estimated by sample covariance matrix (called as classical estimator); that is, Σ c � (σ kj ) p×p and σ kj � (1/n − 1) n i�1 (x ki − x k )(x ji − x j ) denote the sample covariance of X k and X j , where n is the sample size.
Since the well-conditioned and sparse estimator proposed by Maurya [23], noted as Σ w , has been proved its superiority of property. erefore, in this paper, we employ the well-conditioned and sparse estimator Σ w directly. Following the notation in Maurya [23], we have where Σ − � Σ − Σ + and Σ + denote a diagonal matrix with the same diagonal as Σ, σ i (Σ) is the ith largest eigenvalue of matrix Σ, and σ Σ is the mean of eigenvalues of Σ, and the way to identify the best pair (λ 1 , τ 1 ) can also be found in Maurya [23].

CVaR.
For financial assets or portfolio, VaR refers to the greatest possible loss over specific holding period, at a certain confidence level 100(1 − α)%, α ∈ (0, 1). If the cumulative distribution function for return Y is F, VaR can be defined as As we know, compared with VaR, a key advantage of CVaR is that CVaR satisfies the four coherence axioms of Artzner et al. [35], whereas VaR is not coherent. e definition of CVaR is as follows: Evidently, CVaR 1−α , the conditional expectation of losses exceeding VaR 1−α , is more informative about the tail of the distribution than VaR 1−α . Two popular ways to compute CVaR without distribution constraints have been employed recently. e first one is a linear programming model for optimizing CVaR proposed by Rockafellar and Uryasev [25,26], generally called as Rockafellar-Uryasev's approximation. ey have proved that CVaR can be calculated by solving a convex optimization problem. In other words, CVaR can be formulated as where ξ is the αth quantile of Y. is method has been widely used in many literatures (e.g., Roman et al. [12], Lim et al. [28], and Gao and Wu [31]). e second way is quantile regression method proposed in Bassett et al. [27]. Xu et al. [7] have proved that it has faster computational speed compared with Rockafellar-Uryasev's approximation method. According to Bassett et al. [27], the CVaR of a portfolio return Y could be calculated by where ξ is the αth quantile of Y and ρ α (u) � u(α − I(u < 0)) is the check function, in which I(·) is an indicator function that takes on the value one whenever its argument is true and zero otherwise.

Regularization Method.
To encourage a stable and sparse solution in optimization problem, penalty item Pen(·) is frequently used, which is regarded as regularization method. Pen(·) is a general penalty function that allows shrinking the components in β to zero. In this paper, we focus on smoothly clipped absolute deviation (SCAD) penalty proposed by Fan et al. [36] and reweighted L 1 norm penalty proposed by Emmanuel et al. [37], since they have oracle properties. e SCAD penalty is formulated as follows: where ζ(ζ < 2) and λ are tuning parameters. It is well known that SCAD is continuous and singular at the origin, penalizes large coefficients equally, and has no bias, and the reweighted L 1 norm is as follows: where λ is a tuning parameters and the value of ω k can be identified by iteration method based on the value of |β k |.
Reweighed L 1 norm has a better out-of-sample performance compared with L 1 norm penalty [34], since it gives every component in β a corresponding weight so that the penalization is more accurate.

Model Formation
Motivated by Roman et al. [12], we consider the following mean-variance-CVaR model: where 0 ≤ η ≤ 1 is a weighting parameter, which represents the attitude that investors hold towards the two risk measures and hence balances the importance of them.
Remark 1. Letting η � 0 or η � 1, respectively, the above optimization problem is reduced to the mean-CVaR model or mean-variance model.

Remark 2.
Different from Roman et al. [12], in this paper, we incorporate the variance term and CVaR term into the objective function of optimization model simultaneously and short selling is allowed.

Optimization Problems.
In this section, we present the following five optimization problems based on different methods of calculating variance, CVaR, and different regularization methods in order to obtain optimal portfolio for P mvc .
To illustrate clearly, we outline the optimization problems in Table 1. Here, "√" means the underlying method is incorporated in the corresponding optimization problem. For example, the second row tells us that the classical estimation Σ c of variance term and Rockafellar-Uryasev's approximation CVaR 1 1−α for CVaR term are used in problem P 1 .
Mathematically, optimization problems can be written in the following forms: Why do we design such five optimization problems in the foregoing way? In P 1 , the classical method of variance term and Rockafellar-Uryasev's approximation for CVaR term are used to find the solution. To construct P 2 , we replace the classical estimator Σ c in the variance term in P 1 with Σ w , and the CVaR part remains the same so that we could find if Σ w could mitigate the instability. Further, we replace CVaR 1 1−α in P 2 with CVaR 2 1−α to get P 3 , so that the difference between the two computational methods for CVaR could be compared. In P 3 − P 5 , 1 ′ β � 1 can be absorbed into the CVaR item based on quantile regression method. To examine the performance of SCAD and reweighted L 1 norm penalty function, we design P 4 and P 5 , in which quantile computational method for CVaR is used because of the convenience of selecting tuning parameter in penalty function and faster computational speed [7]. rough such progressive problem design, we can figure out the performance of different methods and find the most effective one in our setting.

Remark 3.
e estimation methods used in P 1 for variance and CVaR are the same as that in Roman et al. [12].
Remark 4. If η � 0, P 4 degenerates into the model in Xu et al. [7]. However, the constraint E[X ′ β] � μ 0 is missing in their paper. According to Bassett et al. [27], the E[X ′ β] item in CVaR part (see (7)) in objective function can be neglected only under the constraint that the expected return of portfolio is a constant.

Selection of Tuning
Parameter. For P 4 and P 5 , we need to search the best tuning parameter for penalty item.
In P 4 , the nonnegative regularization parameter λ drives the relevance of the SCAD penalty and ζ is a tuning parameter belonging to SCAD penalty. Here, we search the best pair θ � (λ, ζ) ′ over two-dimension grids following the criterion mentioned in the following. Motivated by Lee et al. [38] and Wang et al. [39], we use the modified Bayesian Information Criterion as follows: where x 1,1 , x 2,1 , . . . , x n,1 are random samples of X 1 , x 1,j , x 2,j , . . . , x n,j are random samples of X j for j � 2, 3, . . . , p, and df is the effective dimension of fitting model. Additionally, df � |E| indicates the number of points in the set E with E � j: β j,θ ≠ 0, 2 ≤ j ≤ p . Furthermore, the iterative algorithm for conducting P 4 can be seen in Wu and Liu [40]. In P 5 , λ is the nonnegative regularization parameter and ω k , k � 1, 2, . . . , p is the tuning parameter in reweighted L 1 norm penalty. For ω � (ω 1 , ω 2 , . . . , ω p ) ′ , we apply the following iterative procedure for every given λ to change the weighting parameter ω k dynamically and adaptively [37].
(1) Let the iteration count be h. For any given ω (h) , we solve the problem P 5 , which gives the solution β (h) . If the stopping criteria are satisfied, for example, compared with every component in β , the maximum change does not exceed some small constant, we stop the iteration. Otherwise, go to Step (2).

Data Generation.
To evaluate the out-of-sample performance of these five portfolio optimization problems, the dataset is generated by simulation approach with parameters being estimated from real historical price data of stock index.

Construction of Return Variable. Following the idea in
Lim et al. [28], the random returns are captured by a hybrid distribution with multivariate normal distribution and exponential distribution. So all assets might suffer a perfectly correlated exponential-tail loss in a small probability. Let z be the exponential random variable with parameter q and we set q � 10 for simplicity. So the return variable is constructed as follows: where B(τ) is the Bernoulli random variable with parameter τ, N follows multivariate normal distribution with mean vector μ and covariance matrix Σ, and c � ( where Σ kk is the k-th diagonal element of the matrix Σ. e parameter τ controls the tail loss of the distribution.

Determination of Parameter Values.
We choose 142 stocks which in Shanghai Stock Exchange Constituent (SSEC) index download the data from Joinquant Data platform. e sample period spans from Jan 3, 2015, to Dec 31, 2015. We calculate the daily yield return and estimate their mean return vector μ � (μ 1 , μ 2 , . . . , μ p ) ′ and covariance matrix Σ. (Note that the 180 constituent stocks in SSEC are always changing in the sample period, since the sample is adjusted every half year; therefore, we choose 142 stocks which continuously stay in the index over the period.)

Data Generation.
As we know, when the dimensionality of assets is comparable to the number of observations, the data set belongs to high-dimensional scenarios. erefore, in order to simulate a high-dimensional data set, we set the size of n close to p � 142. Letting n � 150 and τ � 0.05, we simulate 101 groups of data, one group as insample data, and 100 group as out-of-sample data.

Optimization Results.
In this section, we will calculate simulation results based on the five optimization problems, respectively, and give the comparisons from the point view of the out-of-sample performance. Without loss of generality, we set μ 0 � 0.01. All the calculations follow three steps: Step 1: calculate the weight of assets in portfolio selection model with different η � 0, 0.25, 0.5, 0.75, and 1 for in-sample data.

Mathematical Problems in Engineering
Step 2: based on the weights in Step 1, we calculate the variance, CVaR with 0.95 confidence level, and expected return for other 100 groups of out-of-sample data.
Step 3: summarize their mean value and standard deviation for every case. For example, under the case of P 1 at η � 1, the mean value and the standard deviation for 100 values of CVaR are calculated to be 10.0812 and (0.9708), respectively.
Steps 1 and 2 mean that we hold the optimal portfolio generated from in-sample data in the out-of-sample period, in which the data follow the same distribution as in-sample data. erefore, if the optimal portfolio is stable, the performance of portfolio in 100 groups of out-of-sample data should be similar, and hence, we calculate the standard deviation in Step 3 to measure the similarity of the performance in 100 groups. e results are shown in Table 2. From Table 2, the following results are concluded: (1) Compared P 1 with P 2 -P 5 , the mean for 100 values of variance, CVaR in P 1 are larger, which means that the risks of portfolios in out-of-sample data get out of control and the expected returns are also higher. is is consistent with the truth of "high risk; high return." e inherent reason is that sample covariance estimator lost its theoretical support (Law of Large Numbers) in high-dimensional scenarios.
(2) Compared P 2 with P 1 , we can find that all the counterparts in P 2 decrease, which tells that the use of well-conditioned and sparse estimator (Σ w ) could control the risk significantly and improves the stability of solutions' out-of-sample performance as well. When η � 0, the model is reduced to mean-CVaR model, whose solution is irrelevant to the estimation of Σ, so the values of CVaR and expected return should remain the same. e data in Table 2 really illustrate this fact. (3) e results in P 2 and P 3 share identical in-sample optimal portfolio values and out-of-sample performance. is means that the impact of Rockafellar-Uryasev's approximation and quantile regression method of CVaR on the solution of optimization problem shows no significant difference under our setting. (4) e values in P 4 further become smaller than ones in P 3 , which says that the use of SCAD penalty technique can improve the stability and the riskcontrolling ability for out-of-sample performance. (5) Compared P 5 with P 4 , the values of variance and CVaR in P 5 show a slight decrease, which means that the reweighted L 1 norm penalty contributes to a more stable out-of-sample performance than SCAD penalty, even though the improvement is fractional. (6) Bigger η represents a higher proportion for variance item in objective function. Since variance is a more conservative risk measure compared with CVaR, generally, bigger η results in a more conservative portfolio with relative low-risk index and expected return in out-of-sample analysis. e values in P 4 and P 5 show this trend well. But others fail, which may attribute to the instable out-of-sample performance.
To analyze the stability of out-of-sample performance in P 1 -P 5 easily, we draw the scatter diagram for each optimization problem at η � 0.5. For the convenience of comparison, we give the following graphical design. First, P 1 , P 2 , and P 3 share the same coordinate system (see Figure 1), and the diagrams of P 2 and P 3 are identical, so we draw them in one diagram. Second, to give a vivid comparison, we zoom in the coordinate scale and draw the diagrams of P 3 and P 4 ; see Figure 2. e coordinate scale is further amplified in Figure 3 for the diagrams of P 4 and P 5 .
From Figure 1, we can find that the points in (b) are in lower risk positions and more concentrative, which means that the portfolio generated from P 2 (or P 3 ) has a better risk-controlling ability and more similar performance in 100 groups of out-of-sample data sets.
us, the well-conditioned and sparse estimator Σ w helps to stabilize the out-ofsample performance and control the risk better.
is is consistent with result (2) from Table 2.
From Figures 2 and 3, we can conclude that penalty items also help to stabilize the out-of-sample performance

Empirical Study
Suppose that an investor intends to track a stock index in a financial market allowing short selling. rough the Joinquant Data platform, we select daily data of the constituent stocks in Shanghai Stock Exchange 50 (SSE 50) Index from Jun 11, 2018, to Dec 11, 2018. e reason why we choose data in a period of half a year elaborates as follows. e constituent stocks in SSE 50 change every half year or so, and keeping them as a group in an index apparently influences the dependency structure of assets, which further affects the optimal investment strategy. erefore, when an investor intends to track SSE 50, it is reasonable to get the data in which the same 50 stocks always stay in SSE 50.    Mathematical Problems in Engineering en, we tag the 50 stocks from No. 1 to No. 50. After calculating, we obtain 124 observations of daily logarithm yield return, including 64 in-sample data spanning from Jun 11, 2018, to Sept 10, 2018, and 60 out-of-sample data spanning from Sept 11, 2018, to Dec 11, 2018. Here, we can find that the dimensionality of in-sample data meets the requirement of high-dimensional scenarios because p � 50 and n � 64 are comparable.
Since P 5 outperforms P 1 -P 4 from Section 4, we calculate the optimal portfolios based on P 5 in formula (13) and show the results in Figure 4. e parameter η is supposed to be 0, 0.5, and 1, respectively, and the target return μ 0 is set to be the average value of the return data of 50 stocks. Considering the idea in this paper is illuminated by the pioneering work of Roman et al. [12], we further conduct the corresponding analysis from P 1 for comparison and the results are shown in Figure 5. It can be easily found that portfolios from P 5 are much more sparse and can avoid extreme investment action, which will be more applicable to the practice of finance.
Next, we observe the out-of-sample performance of the portfolios. ere is evidence that equal weight (EW) strategy cannot be defeated by many portfolio methods consistently (DeMiguel et al. [17]). We also add EW strategy into comparison here. e variance, CVaR 0.95 , and average return of the portfolios are calculated, respectively; see Table 3. e accumulated returns of the portfolios in out-ofsample period are displayed in Figure 6.
From Table 3 and Figure 6(a), we can conclude that the portfolio generated from P 5 has the highest terminal return and can control the risk robustly over the period, since it has a relatively slight fluctuation and avoids extreme losses and outperforms the EW strategy basically. However, the portfolio generated from P 1 is unstable even compared with EW strategy and ends up with the lowest return in spite of being dominate at the beginning. Figure 6(b) also tells us that η � 0.5 in P 5 will have a more promising terminal return compared with the case only considering CVaR (η � 0) or variance (η � 1).

Conclusion
In this paper, we have constructed five progressive optimization problems to figure out a relatively better way to conduct mean-variance-CVaR portfolio selection model in high-dimension scenarios. Specifically, two covariance matrix estimators (classical estimator and well-conditioned and sparse estimator), two estimation methods for CVaR (Rockafellar-Uryasev's approximation and quantile regression), and two penalty functions (SCAD and reweighted L 1 norm) have been considered and different combinations of them have been used for the formulation of optimization problems. e data for simulation is generated from a hybrid distribution with multivariate normal distribution and exponential distribution. Moreover, the correlation matrix and mean vector of the multivariate normal distribution are estimated from real data. e simulation studies show that the well-conditioned and sparse estimator and penalty item devote to alleviate poor out-of-sample performance significantly and the reweighted L 1 norm penalty has a slight superiority. Additionally, we find that the classical optimization methods for mean-variance-CVaR model in Roman et al. [12] may result in poor out-of-sample performance under high-dimension scenarios. An empirical study has been conducted based on the data of constituent stocks in SSE 50. e results solidify the foregoing findings. erefore, we suggest to combine wellconditioned and sparse covariance estimator, quantile regression computational method for CVaR, and reweighted L 1 norm for portfolio optimization under mean-variance-CVaR criterion, since the fragile out-of-sample performance could be effectively mitigated. e study in this paper focuses on the high-dimensional setting, where n is just a little bit greater than p. An extension of this study under a more extremely high-dimensional scenario in which p > n would also be interesting. Moreover, it is noteworthy to further stabilize the out-of-sample performance by controlling the fragile brought by mean item. We will explore these problems in the following study.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.