Applying Least Squares Support Vector Machines to Mean-Variance Portfolio Analysis

Jian Wang (https://orcid.org/0000-0003-1177-651X) and Junseok Kim (https://orcid.org/0000-0002-0484-9189)

Department of Mathematics, Korea University, Seoul 02841, Republic of Korea

Research Article. Mathematical Problems in Engineering (Hindawi), 2019, Article ID 4189683. DOI: 10.1155/2019/4189683. ISSN 1024-123X (print), 1563-5147 (online).

Academic Editor: Georgios Dounias

Received 22 March 2019; Revised 3 June 2019; Accepted 17 June 2019; Published 27 June 2019

Copyright © 2019 Jian Wang and Junseok Kim. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The portfolio selection problem introduced by Markowitz has been one of the most important research fields in modern finance. In this paper, we propose a model for portfolio management based on least squares support vector machines (LSSVM), which we call the LSSVM-mean-variance model. To verify the reliability of the LSSVM-mean-variance model, we conduct an empirical study and design an algorithm that illustrates the performance of the model using historical data from the Shanghai stock exchange. The numerical results show that the proposed model compares favorably with the traditional Markowitz model. Comparing the efficient frontier and total wealth of both models, our model can provide a more measurable standard of judgment for investors making investment decisions.

1. Introduction

A portfolio is a combination of securities such as foreign exchange, stocks, and other market instruments. It has become very common for household investors to participate in the stock market. Investors have used many technical methods to minimize risk and optimize return. Among these methods, the Markowitz model, developed by Harry Markowitz in 1952, has serious practical limitations due to the complexity of computing the variance, covariance, expectation, and standard deviation of each asset relative to the other assets in the portfolio. In recent years, scholars have done much work to make portfolio theory more efficient. In [1], the authors presented the different variants of the goal programming model that have been applied to the financial portfolio selection problem. In [2], the authors optimized the portfolio based on entropy and higher moments by using a polynomial goal programming model, and the results indicated that the proposed method is suited to portfolio models with higher moments. Portfolio optimization techniques also significantly improved the return-risk trade-off using the multiobjective evolutionary model proposed in [3]. The authors in [4] used the multiperiod mean-variance model to investigate a defined contribution pension plan investment problem during the accumulation phase. To incorporate social responsibility, a modification of the Markowitz model was proposed in [5]. Konno and Yamazaki [6] proposed a mean-absolute deviation portfolio optimization model and applied it to the Tokyo stock market. The results of their numerical experiments showed that the model generated a portfolio quite similar to that of the Markowitz model within a fraction of the time required to solve the latter. In [7], the authors used independently estimated possibilistic return rates to deal with a portfolio selection problem.

However, few scholars have used machine learning methods to modify the Markowitz model. As we know, the return rate in the mean-variance model refers to the historical return rate, and its variability is the historical volatility. Historical volatility refers to the standard deviation of the underlying asset price changes over a past period of time, and it represents the past volatility pattern. The actual volatility at the trading point cannot be determined; it can only be predicted from historical volatility and current market information. In this paper, we predict the actual volatility from historical volatility by using machine learning. As one of the important applications of machine learning, LSSVM has been used to deal with various financial problems such as stock price prediction [8, 9] and regression [10]. Mustaffa and Yusof [11] optimized LSSVM for nonvolatile financial prediction. In [12], the authors proposed a time series forecasting model that optimizes LSSVM with the Grey Wolf optimizer algorithm. For time series prediction [13-15], Van Gestel et al. [16] used LSSVM regression within the evidence framework to infer nonlinear models for predicting time series and volatility. By regularizing least squares fuzzy support vector regression, Khemchandani et al. [17] handled financial time series forecasting. In [18], the authors optimized the LSSVM model by weighted particle swarm optimization and forecasted stock returns accurately.

In our study, we apply the LSSVM regression model to the traditional Markowitz model; comparing the efficient frontier and total wealth of both models shows that the proposed model achieves better results.

The rest of this paper is organized as follows. In Section 2, we describe the mean-variance, LSSVM, and LSSVM-mean-variance models. In Section 3, we describe the data set and software used for the empirical study. In Section 4, we present and discuss the empirical results. In Section 5, conclusions are given.

2. Model Description

2.1. Mean-Variance Model

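Modern portfolio theory was first introduced by Markowitz [19]. With this model, Markowitz formulated an efficient frontier, shown in a two-dimensional graph, from which investors can choose a portfolio that maximizes return for a given level of risk as measured by the variance of returns. Suppose that the investor's wealth is $W_0$, the weights on the $n$ assets are $\omega_1, \omega_2, \ldots, \omega_n$, and the future return rate is $R_p$; then the investor's future wealth will be $W = W_0(1 + R_p)$. Investors usually determine the proportion of investment in each asset at the initial stage to maximize the expected investment value. With $U(\cdot)$ denoting the investor's utility function, the process can be formulated as

$$\max_{\omega_i} \; E[U(R_p)] = E\left[U\left(\sum_{i=1}^{n} \omega_i R_i\right)\right] \quad \text{s.t.} \quad \sum_{i=1}^{n} \omega_i = 1. \tag{1}$$

By Taylor expansion,

$$E[U(R_p)] = U(E[R_p]) + E\big[R_p - E[R_p]\big]\, U'(E[R_p]) + \frac{1}{2} E\big[(R_p - E[R_p])^2\big]\, U''(E[R_p]) + \cdots + \frac{1}{n!} E\big[(R_p - E[R_p])^n\big]\, U^{(n)}(E[R_p]) + \cdots \tag{2}$$

Assuming that the returns $R_1, R_2, \ldots, R_n$ follow a normal distribution, the above function depends on the mean and the variance of $R_p$. Suppose $U(\cdot)$ is a concave function; then we can simplify (1) to

$$\min_{\omega_i} \; \sigma^2(R_p) = \sum_{i=1}^{n} \omega_i^2 \sigma^2(R_i) + \sum_{i \neq j} \omega_i \omega_j \sigma(R_i, R_j) \quad \text{s.t.} \quad \overline{R_p} = \sum_{i=1}^{n} \omega_i E[R_i] \;\text{ and }\; \sum_{i=1}^{n} \omega_i = 1, \tag{3}$$

where $\overline{R_p}$ represents the expected portfolio return required by the investor. Problem (3) is a quadratic programming problem that can be solved by the Lagrangian method. The first-order condition is

$$\begin{pmatrix} \Sigma & e & \bar{R} \\ e^{\top} & 0 & 0 \\ \bar{R}^{\top} & 0 & 0 \end{pmatrix} \begin{pmatrix} \omega \\ \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ \overline{R_p} \end{pmatrix}, \tag{4}$$

where $\bar{R}$ is the mean return vector of the $n$ assets, $e$ is the $n \times 1$ column vector of ones, $\Sigma$ is the $n \times n$ covariance matrix of returns, and $\lambda_1, \lambda_2$ are the Lagrange multipliers. Then, the optimal investment proportion is given by

$$\omega = a + b\,\overline{R_p}, \qquad a = \frac{\beta \Sigma^{-1} e - \alpha \Sigma^{-1} \bar{R}}{\beta\delta - \alpha^2}, \qquad b = \frac{\delta \Sigma^{-1} \bar{R} - \alpha \Sigma^{-1} e}{\beta\delta - \alpha^2}, \tag{5}$$

$$\alpha = \bar{R}^{\top} \Sigma^{-1} e, \qquad \beta = \bar{R}^{\top} \Sigma^{-1} \bar{R}, \qquad \delta = e^{\top} \Sigma^{-1} e. \tag{6}$$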
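As a rough illustration of Eqs. (5) and (6), the closed-form weights can be sketched in Python with NumPy. The function name `mean_variance_weights` and the numerical inputs are hypothetical and are not taken from the paper, whose computations were done in MATLAB. Note that this closed form does not enforce nonnegative weights.

```python
import numpy as np

def mean_variance_weights(mean_returns, cov, target_return):
    """Closed-form weights omega = a + b * target_return, following Eqs. (5)-(6)."""
    r = np.asarray(mean_returns, dtype=float)
    sigma = np.asarray(cov, dtype=float)
    e = np.ones_like(r)
    sigma_inv_e = np.linalg.solve(sigma, e)  # Sigma^{-1} e
    sigma_inv_r = np.linalg.solve(sigma, r)  # Sigma^{-1} R
    alpha = r @ sigma_inv_e
    beta = r @ sigma_inv_r
    delta = e @ sigma_inv_e
    denom = beta * delta - alpha ** 2
    a = (beta * sigma_inv_e - alpha * sigma_inv_r) / denom
    b = (delta * sigma_inv_r - alpha * sigma_inv_e) / denom
    return a + b * target_return

# Hypothetical mean vector and covariance matrix for three assets.
mean_returns = np.array([0.010, 0.015, 0.020])
cov = np.array([[0.040, 0.006, 0.004],
                [0.006, 0.090, 0.010],
                [0.004, 0.010, 0.160]])
weights = mean_variance_weights(mean_returns, cov, target_return=0.015)
```

By construction, the resulting weights sum to one and reproduce the requested expected return, which gives a quick sanity check on the formulas.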

2.2. Least Squares Support Vector Machine

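Support vector machine (SVM) has been successfully applied to financial problems, especially in time series forecasting. LSSVM is the least squares formulation of SVM; it is described, together with the LS-SVMlab toolbox, by Pelckmans et al. [20]. LSSVM combines structural risk minimization and VC dimension theory [21] and is commonly used for classification as well as regression, for example in pattern recognition, function fitting [22, 23], and data analysis. The LSSVM algorithm is as follows. The regression model is constructed using a nonlinear mapping function $\varphi(\cdot): \mathbb{R}^n \to \mathbb{R}^{n_h}$, which maps the input data to a higher-dimensional feature space:

$$y(x) = \omega^{\top} \varphi(x) + b, \tag{7}$$

where $x \in \mathbb{R}^n$, $y \in \mathbb{R}$, $\omega$ is the weight vector, and $b$ is the bias. Assume a training set

$$S = \{(x_1, y_1), \ldots, (x_L, y_L) \mid x_l \in \mathbb{R}^n,\; y_l \in \mathbb{R}\}. \tag{8}$$

The original optimization problem is

$$\min \; \frac{1}{2}\|\omega\|^2, \tag{9}$$

and the LSSVM can be formulated as

$$\min \; J(\omega, e) = \frac{1}{2}\|\omega\|^2 + \frac{1}{2}\gamma \sum_{l=1}^{L} e_l^2 \tag{10}$$

subject to the constraints

$$y_l = \omega^{\top} \varphi(x_l) + b + e_l, \qquad l = 1, 2, \ldots, L, \tag{11}$$

where $\gamma$ is the regularization parameter and $e_l$ are the random errors. Using the Lagrange multiplier method, we have

$$\mathcal{L}(\omega, b, e, \alpha) = J(\omega, e) - \sum_{l=1}^{L} \alpha_l \left\{\omega^{\top} \varphi(x_l) + b + e_l - y_l\right\}, \tag{12}$$

where $\alpha_l$ are the Lagrange multipliers. The optimality conditions, obtained by partial differentiation with respect to each parameter, are

$$\frac{\partial \mathcal{L}}{\partial \omega} = 0 \Rightarrow \omega = \sum_{l=1}^{L} \alpha_l \varphi(x_l), \qquad \frac{\partial \mathcal{L}}{\partial b} = 0 \Rightarrow \sum_{l=1}^{L} \alpha_l = 0, \qquad \frac{\partial \mathcal{L}}{\partial e_l} = 0 \Rightarrow \alpha_l = \gamma e_l, \qquad \frac{\partial \mathcal{L}}{\partial \alpha_l} = 0 \Rightarrow \omega^{\top} \varphi(x_l) + b + e_l - y_l = 0. \tag{13}$$

After eliminating $\omega$ and $e$, we obtain the following matrix equation:

$$\begin{pmatrix} 0 & 1_L^{\top} \\ 1_L & \Omega + I/\gamma \end{pmatrix} \begin{pmatrix} b \\ \alpha \end{pmatrix} = \begin{pmatrix} 0 \\ y \end{pmatrix}, \tag{14}$$

where $1_L$ is the $L \times 1$ vector of ones, $y = (y_1, \ldots, y_L)^{\top}$, $\alpha = (\alpha_1, \ldots, \alpha_L)^{\top}$, and the entries of the matrix $\Omega$ are $\Omega_{ij} = \varphi(x_i)^{\top} \varphi(x_j) = K(x_i, x_j)$, $i, j = 1, \ldots, L$. Here, $K(x_i, x_j)$ is the radial basis function (RBF) kernel. For regression models, the RBF kernel is often applied because of its effectiveness and speed in the training process [24]:

$$K(x, x_l) = \exp\left(-\frac{\|x - x_l\|^2}{2\sigma^2}\right). \tag{15}$$

Then, we obtain the regression function

$$y(x) = \sum_{l=1}^{L} \alpha_l K(x, x_l) + b. \tag{16}$$

Here $\sigma^2$ is the kernel width, which we use to adjust the degree of generalization. To build the LSSVM model, the parameters $\gamma$ and $\sigma^2$ should be tuned.
In this paper, for the comparison test between the traditional mean-variance model and the proposed LSSVM-mean-variance model, we take $\sigma^2 = 0.06$ and $\gamma = 5$ unless otherwise specified.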
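The linear system (14) and the regression function (16) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' MATLAB code: the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict` are hypothetical, the kernel is assumed to have the form $\exp(-\|x - x_l\|^2 / (2\sigma^2))$, and the sine-wave training data are made up for the example.

```python
import numpy as np

def rbf_kernel(a, b, sigma2):
    """RBF kernel matrix, assuming K(x, x') = exp(-||x - x'||^2 / (2 * sigma2))."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma2))

def lssvm_fit(x, y, gamma=5.0, sigma2=0.06):
    """Solve the bordered linear system of Eq. (14) for the bias b and multipliers alpha."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0                     # top row: 1_L^T
    system[1:, 0] = 1.0                     # left column: 1_L
    system[1:, 1:] = rbf_kernel(x, x, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]

def lssvm_predict(x_new, x_train, b, alpha, sigma2=0.06):
    """Regression function of Eq. (16)."""
    return rbf_kernel(x_new, x_train, sigma2) @ alpha + b

# Hypothetical toy data: 39 training points on a sine wave.
x_train = np.linspace(0.0, 1.0, 39).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * x_train[:, 0])
b, alpha = lssvm_fit(x_train, y_train)
y_fit = lssvm_predict(x_train, x_train, b, alpha)
```

Two consequences of the optimality conditions (13) serve as checks: the multipliers sum to zero, and each training residual equals $\alpha_l / \gamma$.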

2.3. LSSVM for Mean-Variance Model

In this section, we describe how LSSVM is applied to the mean-variance model. We first select a portfolio and then calculate the returns of its assets. In the LSSVM model, we take the matrix of the assets' returns as the training set and, following the process of Section 2.2, obtain a matrix of regression returns. Then, we use both the returns and the regression returns in the test, following the steps described in Section 2.1. Finally, we compare the efficient frontier and final wealth for the two methods, as shown in Section 4.
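One way to read this procedure is that the two return series differ only in the moment estimates they feed into Section 2.1. The sketch below estimates a mean vector and covariance matrix from each series. The helper `estimate_moments` is a hypothetical name, the random data are made up, and the scaled array is only a placeholder for the LSSVM regression output; it is not how the paper produces regression returns.

```python
import numpy as np

def estimate_moments(return_matrix):
    """Sample mean vector and covariance matrix of a (T x n) matrix of returns."""
    returns = np.asarray(return_matrix, dtype=float)
    return returns.mean(axis=0), np.cov(returns, rowvar=False)

rng = np.random.default_rng(0)
historical = rng.normal(0.0, 0.01, size=(49, 3))  # hypothetical returns: 49 periods, 3 assets
regression = 0.8 * historical                     # placeholder standing in for LSSVM regression returns

hist_mean, hist_cov = estimate_moments(historical)  # inputs for the traditional model
reg_mean, reg_cov = estimate_moments(regression)    # inputs for the LSSVM-mean-variance model
```

Each pair of moments can then be passed to the same mean-variance optimization, so any difference in the resulting frontiers comes from the return series alone.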

3. Data Set and Software

We select a portfolio consisting of three assets chosen from the Shanghai stock market. For the buy-and-sell test, we use the historical data for the stocks “-zgyh-”, “-nyyh-”, and “-jtyh-” from August 09, 2018, to October 26, 2018. We feed the data from August 09, 2018, to October 25, 2018, into the mean-variance model and the LSSVM-mean-variance model; there are 50 data points in total. In the LSSVM model, we divide the data into a training set of 39 data points and a test set of 10 data points. Then, we compare the performance of the two models on October 26, 2018. For the buy-and-hold test, we use the historical data for the same three stocks from March 10, 2017, to March 12, 2018. We take the closing price every 5 days, which again gives 50 data points in total. In the LSSVM model, we again divide the data into a training set of 39 data points and a test set of 10 data points. Then, we compare the performance of the two models on March 19, 2018. All calculations are carried out in MATLAB R2016a.

4. Empirical Results

4.1. Empirical Results of Buy-and-Sell Strategy

The price trends of the stocks chosen in Section 3 are shown in Figure 1.

Figure 1: Candlestick chart for “-zgyh-”, “-nyyh-”, and “-jtyh-” from August 09, 2018, to October 26, 2018. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article. The data were downloaded from the website “http://quotes.money.163.com/stock”.)

The body of a candlestick spans the opening and closing prices; the price excursions above or below the body are called the wicks. For the time interval represented, the wicks extend to the highest and lowest prices, while the body covers the opening and closing prices. A red body indicates that the security closed higher than it opened, with the opening price at the bottom and the closing price at the top. A green body indicates that the security closed lower than it opened, with the opening price at the top and the closing price at the bottom.

Now we select the real historical data of the stocks “-zgyh-”, “-nyyh-”, and “-jtyh-” from August 09, 2018, to October 25, 2018. We use the daily closing prices to calculate the daily return rate; with 50 data points in total, we get 49 return rates for each asset. The return rate of “-jtyh-” is shown in Figure 2 and reflected by “”. Then, we feed the return rates into the calculation process of the Markowitz model. Table 1 shows the resulting proportion of each asset for the traditional mean-variance model, where we set 10 investment proportion combinations.
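The step from 50 closing prices to 49 return rates can be illustrated as follows. The text does not say whether simple or logarithmic returns are used, nor how the 39/10 split is drawn, so this sketch assumes simple returns and a chronological split. The function name `simple_returns` and the price series are hypothetical.

```python
def simple_returns(closing_prices):
    """Simple return r_t = P_t / P_(t-1) - 1 for consecutive closing prices."""
    return [curr / prev - 1.0
            for prev, curr in zip(closing_prices[:-1], closing_prices[1:])]

# Hypothetical series of 50 closing prices.
closing = [3.50 + 0.01 * day for day in range(50)]
returns = simple_returns(closing)

# Assumed chronological split into 39 training and 10 test observations.
train, test = returns[:39], returns[39:]
```

Under these assumptions, the 39 training and 10 test observations together account for all 49 return rates.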

Table 1: Proportion of each asset for the traditional mean-variance model.

| Investment proportion combination | Proportion of “-zgyh-” | Proportion of “-nyyh-” | Proportion of “-jtyh-” |
| --- | --- | --- | --- |
| 1 | 0.7994 | 0.0000 | 0.2006 |
| 2 | 0.6483 | 0.0000 | 0.3517 |
| 3 | 0.4972 | 0.0000 | 0.5028 |
| 4 | 0.3461 | 0.0000 | 0.6539 |
| 5 | 0.1950 | 0.0000 | 0.8050 |
| 6 | 0.0635 | 0.0349 | 0.9016 |
| 7 | 0.0000 | 0.0349 | 0.9016 |
| 8 | 0.0000 | 0.4608 | 0.5392 |
| 9 | 0.0000 | 0.7304 | 0.2696 |
| 10 | 0.0000 | 1.0000 | 0.0000 |

Figure 2: Return rate of the mean-variance model and the LSSVM-mean-variance model for (a) “-zgyh-”, (b) “-nyyh-”, and (c) “-jtyh-”.

For comparison, we calculate the proportion of each asset for the LSSVM-mean-variance model by using LSSVM regression. We feed the return rates mentioned above into the LSSVM model described in Section 2.2, dividing the data into a training set of 39 data points and a test set of 10 data points. Then, we get the regression data, which are shown in Figure 2 and reflected by “o”.

Then, we feed the regression return rates into the Markowitz model. Table 2 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.

Table 2: Proportion of each asset for the LSSVM-mean-variance model.

| Investment proportion combination | Proportion of “-zgyh-” | Proportion of “-nyyh-” | Proportion of “-jtyh-” |
| --- | --- | --- | --- |
| 1 | 0.9668 | 0.0000 | 0.0332 |
| 2 | 0.6654 | 0.0000 | 0.3346 |
| 3 | 0.4173 | 0.0305 | 0.5521 |
| 4 | 0.2670 | 0.1171 | 0.6159 |
| 5 | 0.1167 | 0.2037 | 0.6796 |
| 6 | 0.0000 | 0.3094 | 0.6906 |
| 7 | 0.0000 | 0.4821 | 0.5179 |
| 8 | 0.0000 | 0.6547 | 0.3453 |
| 9 | 0.0000 | 0.8274 | 0.1726 |
| 10 | 0.0000 | 1.0000 | 0.0000 |

Each investment proportion combination in the table corresponds to a maximum return for a given level of risk as measured by the standard deviation. The points defined by each mean and standard deviation form an efficient frontier.
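To make the mapping from a weight combination to a frontier point concrete, the sketch below computes the portfolio mean $\omega^{\top}\bar{R}$ and the standard deviation $\sqrt{\omega^{\top}\Sigma\,\omega}$, assuming risk is measured as the standard deviation of the portfolio return. The weights are combination 1 of Table 1, while the mean vector and covariance matrix are hypothetical values, since the estimated moments are not reported in the text. The function name `portfolio_mean_std` is also hypothetical.

```python
import math

def portfolio_mean_std(weights, mean_returns, cov):
    """Expected return and standard deviation of a portfolio with the given weights."""
    n = len(weights)
    mean = sum(w * r for w, r in zip(weights, mean_returns))
    variance = sum(weights[i] * weights[j] * cov[i][j]
                   for i in range(n) for j in range(n))
    return mean, math.sqrt(variance)

# Hypothetical moments for the three assets (not taken from the paper).
mean_returns = [0.0010, 0.0015, 0.0020]
cov = [[0.00040, 0.00006, 0.00004],
       [0.00006, 0.00090, 0.00010],
       [0.00004, 0.00010, 0.00160]]

weights = [0.7994, 0.0000, 0.2006]  # combination 1 of Table 1
frontier_point = portfolio_mean_std(weights, mean_returns, cov)
```

Repeating the calculation for all 10 combinations gives the 10 points that trace out each frontier.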

As seen in Figure 3, the efficient portfolio frontier of the LSSVM-mean-variance model performs better than that of the traditional model. Portfolios obtained with the proposed method improve the return at the same risk; equivalently, for the same expected return, the LSSVM-mean-variance model reduces the risk for investors.

Figure 3: Efficient portfolio frontier of the mean-variance model and the LSSVM-mean-variance model.

According to the investment proportion combinations shown in Tables 1 and 2, we perform a simulation test. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on October 25, 2018, and sell the portfolio on October 26, 2018. The total wealth on October 26, 2018, under the two models is shown in Figure 4, where each point represents the performance of one combination. In this simulation, investing with the new model earns more than investing with the traditional model under the buy-and-sell strategy.

Figure 4: Total wealth for each investment proportion combination of the two models.
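The buy-then-sell simulation described above can be sketched as a one-line wealth update. The sketch assumes that the wealth is split by the weights at the buy-date prices, that fractional shares are allowed, and that there are no transaction costs; none of these details are stated in the text. The function name `final_wealth` and the prices are hypothetical.

```python
def final_wealth(initial_wealth, weights, buy_prices, sell_prices):
    """Wealth after buying each asset at its buy price and selling at its sell price."""
    return sum(initial_wealth * w * sell / buy
               for w, buy, sell in zip(weights, buy_prices, sell_prices))

# Hypothetical prices on the buy and sell dates for the three stocks.
buy_prices = [3.60, 3.80, 5.70]
sell_prices = [3.63, 3.79, 5.76]

weights = [0.7994, 0.0000, 0.2006]  # combination 1 of Table 1
wealth = final_wealth(100.0, weights, buy_prices, sell_prices)  # initial wealth in millions
```

Applying the same function to each of the 10 combinations of both models would give the two sets of points compared in the total wealth figure.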

4.2. Empirical Results of Buy-and-Hold for 5-Day Strategy

With the buy-and-sell strategy, the proposed model gives a satisfactory result. However, the data set we selected is small and lies between summer and autumn, which may suggest that the above results are specific to that season. To address this concern, we select a longer data set covering all seasons of the year, from March 10, 2017, to March 12, 2018. The candlestick charts for the three stocks are shown in Figure 5. In addition, to show that our model does not work only for the buy-and-sell strategy, we take the closing price every 5 days for the buy-and-hold strategy.

Figure 5: Candlestick chart for “-zgyh-”, “-nyyh-”, and “-jtyh-” from March 10, 2017, to March 12, 2018. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article. The data were downloaded from the website “http://quotes.money.163.com/stock”.)

The return rate of “-jtyh-” for the 5-day buy-and-hold strategy is shown in Figure 6 and is reflected by “”. Then, we feed the return rates into the calculation process of the Markowitz model. Table 3 shows the proportion of each asset for the traditional mean-variance model, where we again set 10 investment proportion combinations. As in the buy-and-sell strategy, each investment proportion combination in the table corresponds to a maximum return for a given level of risk as measured by the standard deviation. The points defined by each mean and standard deviation form an efficient frontier.

Table 3: Proportion of each asset for the mean-variance model under the 5-day buy-and-hold strategy.

| Investment proportion combination | Proportion of “-zgyh-” | Proportion of “-nyyh-” | Proportion of “-jtyh-” |
| --- | --- | --- | --- |
| 1 | 0.0924 | 0.0000 | 0.9076 |
| 2 | 0.2957 | 0.0000 | 0.7043 |
| 3 | 0.4409 | 0.0302 | 0.5288 |
| 4 | 0.4531 | 0.1296 | 0.4172 |
| 5 | 0.4654 | 0.2290 | 0.3056 |
| 6 | 0.4776 | 0.3284 | 0.1940 |
| 7 | 0.4898 | 0.4279 | 0.0823 |
| 8 | 0.4410 | 0.5590 | 0.0000 |
| 9 | 0.2205 | 0.7795 | 0.0000 |
| 10 | 0.0000 | 1.0000 | 0.0000 |

Figure 6: Return rate of the mean-variance model and the LSSVM-mean-variance model under the 5-day buy-and-hold strategy for (a) “-zgyh-”, (b) “-nyyh-”, and (c) “-jtyh-”.

As seen in Figure 7, under the buy-and-hold strategy, the efficient portfolio frontier of the LSSVM-mean-variance model again performs better than that of the traditional model. Portfolios obtained with the proposed method improve the return at the same risk; equivalently, for the same expected return, the LSSVM-mean-variance model reduces the risk for investors.

Figure 7: Efficient portfolio frontier of the mean-variance model and the LSSVM-mean-variance model under the 5-day buy-and-hold strategy.

Then, we feed the regression return rates into the Markowitz model. Table 4 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.

Table 4: Proportion of each asset for the LSSVM-mean-variance model under the 5-day buy-and-hold strategy.

| Investment proportion combination | Proportion of “-zgyh-” | Proportion of “-nyyh-” | Proportion of “-jtyh-” |
| --- | --- | --- | --- |
| 1 | 0.4498 | 0.0000 | 0.5502 |
| 2 | 0.5634 | 0.0301 | 0.4065 |
| 3 | 0.5718 | 0.1123 | 0.3159 |
| 4 | 0.5803 | 0.1944 | 0.2253 |
| 5 | 0.5887 | 0.2766 | 0.1347 |
| 6 | 0.5971 | 0.3588 | 0.0441 |
| 7 | 0.5134 | 0.4866 | 0.0000 |
| 8 | 0.3422 | 0.6578 | 0.0000 |
| 9 | 0.1711 | 0.8289 | 0.0000 |
| 10 | 0.0000 | 1.0000 | 0.0000 |

According to the investment proportion combinations shown in Tables 3 and 4, we perform a simulation test for the 5-day buy-and-hold strategy. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on March 12, 2018, and sell the portfolio on March 19, 2018. The total wealth on March 19, 2018, under the two models is shown in Figure 8, where each point represents the performance of one combination. In this simulation, the new model earns more than the traditional model under the buy-and-hold strategy.

Figure 8: Total wealth for each investment proportion combination of the two models under the 5-day buy-and-hold strategy.

To illustrate that this result is not caused by the specific period we selected, we calculate the total wealth for each of the 15 days from August 30, 2018, to September 19, 2018, for the two models, each time based on the preceding 15 days. The calculation steps are the same as in the process above. We denote the total wealth of the mean-variance model by $TW_1$ and that of the LSSVM-mean-variance model by $TW_2$; the difference between the two models is defined by

$$\mathrm{Diff} = TW_2 - TW_1. \tag{17}$$

As shown in Figure 9, almost all the difference values are greater than 0, which indicates that the proposed model has a higher yield on nearly every one of the 15 days.

Figure 9: Total wealth difference of the two models.
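The comparison in Eq. (17) amounts to a day-by-day subtraction followed by a count of positive values. A minimal sketch, with hypothetical function names and made-up wealth figures that are not the paper's results:

```python
def wealth_differences(tw1, tw2):
    """Daily difference Diff = TW2 - TW1 between the two models, as in Eq. (17)."""
    return [second - first for first, second in zip(tw1, tw2)]

def share_positive(diffs):
    """Fraction of days on which the LSSVM-mean-variance model ends with more wealth."""
    return sum(1 for d in diffs if d > 0) / len(diffs)

# Made-up total wealth values (in millions) for five days, for illustration only.
tw1 = [100.10, 99.95, 100.30, 100.05, 99.80]
tw2 = [100.25, 100.00, 100.20, 100.15, 99.90]

diffs = wealth_differences(tw1, tw2)
positive_share = share_positive(diffs)
```

A share close to one corresponds to the situation described for Figure 9, where almost all difference values are greater than 0.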

5. Conclusion

Machine learning has in recent years created an opportunity for investors to invest in the financial market in a smarter and more profitable way. Combining machine learning with financial investment can change the way we make investment decisions. This paper shows how the two can be combined into a powerful tool and proposes the LSSVM-mean-variance algorithm with the aim of maximizing return for a given level of risk as measured by the variance of returns. The efficiency of the proposed method is measured on empirical data through the efficient frontier and total wealth. Comparing the efficient frontier and total wealth of both models, the curve of the mean-variance model is always below that of the proposed model. This shows that our model has a higher yield under the same risk and more total wealth under each combination; it provides a more measurable standard of judgment for investors making investment decisions. We confirm the efficiency through both the buy-and-sell and buy-and-hold strategies. The encouraging performance suggests that the proposed method may become a promising model in this context, and the results indicate a positive opportunity to be explored in the future.

Data Availability

Data and source program codes in this paper are available upon request from the corresponding author.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The first author (Jian Wang) was supported by the China Scholarship Council (201808260026).

References

[1] B. Aouni, C. Colapinto, and D. La Torre, "Financial portfolio management through the goal programming model: current state-of-the-art," European Journal of Operational Research, vol. 234, no. 2, pp. 536-545, 2014. DOI: 10.1016/j.ejor.2013.09.040
[2] M. Aksaraylı and O. Pala, "A polynomial goal programming model for portfolio optimization based on entropy and higher moments," Expert Systems with Applications, vol. 94, pp. 185-192, 2018. DOI: 10.1016/j.eswa.2017.10.056
[3] B. Y. Qu, Q. Zhou, J. M. Xiao, J. J. Liang, and P. N. Suganthan, "Large-scale portfolio optimization using multiobjective evolutionary algorithms and preselection methods," Mathematical Problems in Engineering, vol. 2017, 14 pages, 2017.
[4] L. Wang and Z. Chen, "Nash equilibrium strategy for a DC pension plan with state-dependent risk aversion: a multiperiod mean-variance framework," Discrete Dynamics in Nature and Society, vol. 2018, 17 pages, 2018. DOI: 10.1155/2018/7581231
[5] S. M. Gasser, M. Rammerstorfer, and K. Weinmayer, "Markowitz revisited: Social portfolio engineering," European Journal of Operational Research, vol. 258, no. 3, pp. 1181-1190, 2017. DOI: 10.1016/j.ejor.2016.10.043
[6] H. Konno and H. Yamazaki, "Mean absolute deviation portfolio optimization model and its application to Tokyo stock exchange," Management Science, vol. 37, pp. 519-531, 1991. DOI: 10.1287/mnsc.37.5.519
[7] M. Inuiguchi and T. Tanino, "Portfolio selection under independent possibilistic information," Fuzzy Sets and Systems, vol. 115, no. 1, pp. 83-92, 2000. DOI: 10.1016/s0165-0114(99)00026-3
[8] X. Zhou, Z. Pan, G. Hu, S. Tang, and C. Zhao, "Stock market prediction on high-frequency data using generative adversarial nets," Mathematical Problems in Engineering, vol. 2018, 11 pages, 2018.
[9] Z. Wang, J. Hu, and Y. Wu, "A bimodel algorithm with data-divider to predict stock index," Mathematical Problems in Engineering, vol. 2018, 14 pages, 2018.
[10] D. Prayogo and Y. T. T. Susanto, "Optimizing the prediction accuracy of friction capacity of driven piles in cohesive soil using a novel self-tuning least squares support vector machine," Advances in Civil Engineering, vol. 2018, Article ID 6490169, 9 pages, 2018. DOI: 10.1155/2018/6490169
[11] Z. Mustaffa and Y. Yusof, "Optimizing LSSVM using ABC for non-volatile financial prediction," Australian Journal of Basic and Applied Sciences, vol. 5, no. 11, pp. 549-556, 2011.
[12] Z. Mustaffa, M. H. Sulaiman, and M. N. M. Kahar, "LS-SVM hyper-parameters optimization based on GWO algorithm for time series forecasting," in Proceedings of the 4th International Conference on Software Engineering and Computer Systems, ICSECS 2015, pp. 183-188, Malaysia, August 2015.
[13] S. Lahmiri and S. Bekiros, "Chaos, randomness and multi-fractality in Bitcoin market," Chaos, Solitons & Fractals, vol. 106, pp. 28-34, 2018. DOI: 10.1016/j.chaos.2017.11.005
[14] S. Lahmiri, "Asymmetric and persistent responses in price volatility of fertilizers through stable and unstable periods," Physica A: Statistical Mechanics and its Applications, vol. 466, pp. 405-414, 2017. DOI: 10.1016/j.physa.2016.09.036
[15] S. Lahmiri and S. Bekiros, "Time-varying self-similarity in alternative investments," Chaos, Solitons & Fractals, vol. 111, pp. 1-5, 2018. DOI: 10.1016/j.chaos.2018.04.004
[16] T. Van Gestel, J. A. K. Suykens, D.-E. Baestaens, A. Lambrechts, G. Lanckriet, B. Vandaele, B. De Moor, and J. Vandewalle, "Financial time series prediction using least squares support vector machines within the evidence framework," IEEE Transactions on Neural Networks and Learning Systems, vol. 12, no. 4, pp. 809-821, 2001. DOI: 10.1109/72.935093
[17] R. Khemchandani, Jayadeva, and S. Chandra, "Regularized least squares fuzzy support vector regression for financial time series forecasting," Expert Systems with Applications, vol. 36, no. 1, pp. 132-138, 2009. DOI: 10.1016/j.eswa.2007.09.035
[18] W. Shen, Y. Zhang, and X. Ma, "Stock return forecast with LS-SVM and particle swarm optimization," in Proceedings of the 2009 International Conference on Business Intelligence and Financial Engineering, BIFE 2009, pp. 143-147, China, 2009.
[19] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77-91, 1952. DOI: 10.1111/j.1540-6261.1952.tb01525.x
[20] K. Pelckmans, J. A. Suykens, T. Van Gestel, J. De Brabanter, L. Lukas, B. Hamers, and J. Vandewalle, "LS-SVMlab: a matlab/c toolbox for least squares support vector machines," Tutorial, KULeuven-ESAT, vol. 142, pp. 1-2, 2002.
[21] V. N. Vapnik and A. Y. Chervonenkis, "On the uniform convergence of relative frequencies of events to their probabilities," in Measures of Complexity, pp. 11-30, Springer, Cham, Switzerland, 2015.
[22] X. Chen, J. Yang, J. Liang, and Q. Ye, "Recursive robust least squares support vector regression based on maximum correntropy criterion," Neurocomputing, vol. 97, pp. 63-73, 2012. DOI: 10.1016/j.neucom.2012.05.004
[23] H. Nie, G. Liu, X. Liu, and Y. Wang, "Hybrid of ARIMA and SVMs for short-term load forecasting," Energy Procedia, vol. 16, pp. 1455-1460, 2012.
[24] N. Xin, X. Gu, H. Wu, Y. Hu, and Z. Yang, "Application of genetic algorithm-support vector regression (GA-SVR) for quantitative analysis of herbal medicines," Journal of Chemometrics, vol. 26, pp. 353-360, 2012. DOI: 10.1002/cem.2435