Large-Scale Portfolio Optimization Using Multiobjective Evolutionary Algorithms and Preselection Methods

Portfolio optimization problems involve selecting assets to invest in so as to maximize the overall return and minimize the overall risk simultaneously. The complexity of the optimal asset allocation problem increases with the number of assets available for investment, and the problem becomes computationally challenging when there are more than a few hundred assets to select from. To reduce the complexity of large-scale portfolio optimization, this paper proposes two asset preselection procedures that consider the return and risk of individual assets and pairwise correlation in order to remove assets that are unlikely to be selected into any portfolio. With these asset preselection methods, the number of assets considered for inclusion in a portfolio can be increased to thousands. To test the effectiveness of the proposed methods, a Normalized Multiobjective Evolutionary Algorithm based on Decomposition (NMOEA/D) and several other commonly used multiobjective evolutionary algorithms are applied and compared. Six experiments with different settings are carried out. The experimental results show that with the proposed methods the simulation time is reduced while return-risk trade-off performances are significantly improved. Meanwhile, the NMOEA/D outperforms the other compared algorithms in all experiments according to the comparative analysis.


Introduction
In modern financial markets, portfolio allocation is one of the major problems faced by investors and fund managers. Investors need to choose several assets from thousands of available assets to form a single portfolio from an enormous set of possibilities in order to simultaneously maximize the return and minimize the risk. It is crucial for investors to find a good portfolio allocation. However, higher returns are generally associated with higher risk. Investors with varying degrees of risk aversion will demand different levels of return for taking on different degrees of risk [1, 2].
The concept of portfolio optimization has been an important tool in the development and understanding of financial markets. Portfolio optimization techniques can assist the search for the portfolio that best suits each investor's particular objective [3, 4]. As stated by BusinessWeek [5], the single best weapon against risk is to form portfolios with uncorrelated or negatively correlated assets, because when several such assets are combined, the overall risk of the portfolio may be less than that of the individual assets. Thus, finding a suitable combination of investments has attracted the attention of investors and scholars. The major breakthrough in portfolio optimization came in 1952 with the publication of Markowitz's theory of portfolio selection [6]. Markowitz quantified the return and risk of a security using statistical measures of its expected return and standard deviation. He suggested that investors should consider return and risk together and determine the allocation of funds among investment alternatives on the basis of their return-risk trade-off [7]. This theory is popularly referred to as modern portfolio theory, and it is also the theoretical basis for this work. The details of the theory are presented in Section 2.

Mathematical Problems in Engineering
As there are two conflicting objectives, there is no single optimal solution to this portfolio optimization problem. Instead, there is an efficient frontier of optimal trade-off solutions. It is often desirable to have the entire efficient frontier of optimal portfolio sets that give the average return against the possible risk so that individual investors can choose the most appropriate return-risk trade-off to suit their investment objectives. In recent years, evolutionary algorithms (EAs) have become an effective tool in handling optimization problems [8]. EAs search for the solution by initializing a population of random candidates. These candidates undergo an evolutionary process based on the survival-of-the-fittest mechanism, and the individuals that have superior fitness are passed to the next generations [9].
Using EAs to solve the asset allocation problem has become a trend in recent years, as EAs are able to find multiple Pareto-optimal solutions in a single run [10]. Most of the early works transformed the optimization problem into a single-objective problem using a trade-off function [10-13]. With the development of multiobjective evolutionary algorithms (MOEAs) [8, 14-16], researchers have focused their attention on using MOEAs for solving portfolio optimization problems. The first use of MOEAs for solving the portfolio optimization problem can be traced back to 1993 [17]. This work adopted lower partial moments and a genetic approach to handle the problem. In 1996, Shoaf and Foster [18] used a multiobjective GA to solve the conventional Markowitz formulation. They adopted a specific encoding scheme which was able to indicate whether the holding of a particular asset would be long or short. In [19], six different MOEAs including the original VEGA [20], two versions of modified VEGA, NSGAII [15], MOGA [21], and SPEA2 [22] were examined and compared using the classical mean-variance model. A mixed binary-real encoding scheme was used for all the compared algorithms. Yan et al. [23] used the semivariance as the risk and the growth rate as the return. In this work, instead of considering a single investment period, T periods were used for the optimization. The weight of each investment asset must be determined by the decision maker at the beginning of each period. Lin and Liu [24] applied a genetic algorithm to Taiwanese mutual funds for the portfolio selection problem with minimum transaction lots.
However, due to the computational complexity of solving a large-scale quadratic programming problem, Markowitz's portfolio optimization model has not been used extensively in its original form to construct a large-scale portfolio [24]. As the number of assets increases, the complexity of the search space increases exponentially. The state-of-the-art evolutionary algorithms may not effectively solve the problem when there are more than 200 assets available to form portfolios. Lin and Liu [24] used up to 204 assets to form the optimal portfolios. In [25], Ghahtarani and Najafi used 20 stocks from the Tehran stock exchange in the portfolio optimization. Fernández and Gómez [26] adopted Markowitz's model with cardinality and bounding constraints. In this work, four different markets with different numbers of stocks were considered, and the largest number of stocks used was 225. Motivated by these observations, two asset preselection processes are proposed in this work for large-scale portfolio optimization so that the number of assets to be considered can be increased to a few thousand or more.
The rest of the paper is outlined as follows. Section 2 presents the models of portfolio optimization. Section 3 briefly reviews the concept of multiobjective optimization and the multiobjective optimization algorithms used in this work. Sections 4 and 5 introduce the preselection process and the constraint handling method, respectively. The experimental settings and results are presented and analyzed in Section 6. Finally, the relevant conclusions and directions for future work are discussed in Section 7.

Portfolio Optimization
The mathematical representation of portfolio optimization was introduced by Markowitz in 1952, and he was awarded the Nobel Prize in Economics in 1990 [27]. The Markowitz model assumes that investors would like to maximize return under a certain risk level or minimize the risk with a certain return level [6], and it makes use of the mean and variance of normalized historical asset prices to compute the expected portfolio return and risk [24], respectively. The model can be expressed as a biobjective problem as follows:

$$\min \; \sigma_p^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij}, \tag{1}$$

$$\max \; r_p = \sum_{i=1}^{N} w_i r_i, \tag{2}$$

subject to $\sum_{i=1}^{N} w_i = 1$ and $0 \le w_i \le 1$, where $N$ is the number of assets in the portfolio, which is also the dimensionality of the optimization problem. $w_i$ is the weight of the $i$th asset to be optimized. $\sigma_p^2$ stands for the portfolio risk while $\sigma_{ij}$ is the covariance between asset $i$ and asset $j$. If $i = j$, $\sigma_{ii}$ is just the variance of that particular asset. $r_p$ is the average portfolio return and $r_i$ is the average individual return of asset $i$.
Due to the determined efforts of various researchers including Sharpe [28], Pang [29], Best and Hlouskova [30], and others, Markowitz's work has been widely extended. Recent works also include various constraints for the portfolio optimization problem. Speranza [31] proposed a model that takes into account the characteristics of the portfolio optimization. Chang et al. [32] extended the standard model to include cardinality constraints that limit a portfolio to have a specified number of assets and to impose limits on the proportion of the portfolio [...]
MOEA/D [34] is considered one of the most effective multiobjective optimization algorithms at present. MOEA/D provides a new framework by combining a decomposition method and an evolutionary algorithm for multiobjective optimization problems (MOPs) [35]. It explicitly decomposes the MOP into a number of single-objective optimization subproblems and solves these subproblems simultaneously by evolving a population of solutions.
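As a concrete reading of the two Markowitz objectives of this section (portfolio risk as a double sum over covariances, portfolio return as a weighted sum of asset returns), the following minimal sketch evaluates both for a given weight vector. The function names and the two-asset numbers are illustrative assumptions, not data from the experiments.

```python
def portfolio_return(weights, mean_returns):
    """Weighted sum of the average returns of the individual assets."""
    return sum(w * r for w, r in zip(weights, mean_returns))


def portfolio_risk(weights, cov):
    """Portfolio variance: sum over all pairs (i, j) of w_i * w_j * cov[i][j]."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical two-asset example (values chosen only for illustration).
example_weights = [0.5, 0.5]
example_returns = [0.10, 0.20]
example_cov = [[0.04, -0.01],
               [-0.01, 0.09]]

example_r = portfolio_return(example_weights, example_returns)
example_var = portfolio_risk(example_weights, example_cov)
```

Because the two example assets are negatively correlated, the portfolio variance comes out lower than the variance of either asset on its own, which is the effect the preselection methods later exploit.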

Multiobjective Optimization
In MOEA/D, various decomposition methods can be used to decompose an MOP into a set of subproblems [34]. However, the weighted Tchebycheff approach is the most prominent method as it is less sensitive to the shape of the Pareto Front. Therefore, this approach is also adopted in this work. The objective of each subproblem can be represented as follows:

$$\min \; g^{te}(x \mid \lambda, z^*) = \max_{1 \le i \le m} \left\{ \lambda_i \left| f_i(x) - z_i^* \right| \right\}, \tag{3}$$

where $m$ is the number of objectives, $z_i^*$ is the reference point for the $i$th objective, and $\lambda = (\lambda_1, \ldots, \lambda_m)$ is the weight vector. Note that the reference point can be represented as $z_i^* = \min\{ f_i(x) \mid x \in \Omega \}$. The details of the original MOEA/D can be found in [34].
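The weighted Tchebycheff scalarization described above can be sketched in a few lines. This is only an illustration of the formula, assuming all objectives are minimized; the function name and sample values are hypothetical.

```python
def tchebycheff(objectives, weights, reference):
    """Weighted Tchebycheff value of one solution for one subproblem.

    objectives: objective values f_i(x) of the solution
    weights:    weight vector of the subproblem
    reference:  reference point z* (best value seen for each objective)
    """
    return max(w * abs(f - z)
               for f, w, z in zip(objectives, weights, reference))


# Hypothetical biobjective example.
g_value = tchebycheff(objectives=[0.3, 0.8],
                      weights=[0.5, 0.5],
                      reference=[0.1, 0.2])
```

A subproblem prefers the solution with the smaller scalar value, so in this example the second objective, which is farther from its reference value, determines the result.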

Normalized MOEA/D with Normalized Objectives
The MOEA/D algorithm decomposes the MOP into a number of subproblems by a set of evenly spread weight vectors. The main idea is to narrow the gap between the objective value and the reference value in $m$ (the number of objectives) dimensions with the assigned weight vector. In this way, MOEA/D forces the objective function to evolve in the direction of minimizing the maximum difference. However, the scales of different objectives are generally not the same for real-world applications. In such cases, most of the searching resources will be spent on the most significant objective (the one with the largest value) while the other objectives are barely evolved. Considering portfolio optimization as an example, the Pareto Front of optimizing 100 stocks by the original MOEA/D is plotted in Figure 1. As can be seen from this plot, the solutions in the low risk area are relatively sparse. To overcome this problem, a normalization method is proposed as follows:

$$\bar{f}_i = \frac{f_i - f_i^{\min}}{f_i^{\max} - f_i^{\min}}, \tag{4}$$

where $f_i$ and $\bar{f}_i$ are the original and the normalized value of the $i$th objective, respectively, and $f_i^{\max}$ and $f_i^{\min}$ are the maximum and minimum values of the corresponding objective function. In each generation, the objectives of all the individuals are normalized using the above formula. To demonstrate the effect of normalization, the PFs obtained by MOEA/D with and without normalization are plotted in Figure 2. It is clear that a more complete front is obtained by using the normalization method.
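The per-generation normalization can be illustrated as follows. This sketch assumes that the maximum and minimum of each objective are taken over the current population, and it maps a constant objective to 0 to avoid division by zero; both choices are assumptions made for the example.

```python
def normalize_objectives(population_objs):
    """Min-max normalize each objective over the current population.

    population_objs: list of objective vectors, one per individual.
    Returns a new list in which every objective lies in [0, 1].
    """
    m = len(population_objs[0])
    mins = [min(ind[i] for ind in population_objs) for i in range(m)]
    maxs = [max(ind[i] for ind in population_objs) for i in range(m)]
    normalized = []
    for ind in population_objs:
        row = []
        for i in range(m):
            span = maxs[i] - mins[i]
            # Assumption: a constant objective is mapped to 0.0.
            row.append((ind[i] - mins[i]) / span if span > 0 else 0.0)
        normalized.append(row)
    return normalized


# Two objectives on very different scales (hypothetical values).
scaled = normalize_objectives([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])
```

After normalization both objectives span the same range, so neither dominates the Tchebycheff maximum purely because of its scale.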

[Figure 3. Flowchart of the optimization process: the data of one thousand stocks; Step 1 (preselection method): the selection of assets according to their return and risk (the number of stocks will decrease); Step 2 (optimization process): using MOEAs to get the Pareto optimality; End.]

Large-Scale Portfolio Optimization with Preselection Methods
State-of-the-art evolutionary algorithms cannot effectively solve complex multiobjective optimization problems with several thousand variables without requiring substantial computational time on a standard PC. Therefore, it is important to preselect potential assets which are likely to be included in the optimized portfolios. We employ the concepts of nondominance and Pareto optimality to eliminate assets which are unlikely to be used in the final optimal solutions. Subsequently, an MOEA is applied to the preselected assets to obtain the best approximation of the Pareto-optimal set. Figure 3 presents the optimization process starting with 1000 stocks. As can be seen from the figure, the number of assets is reduced to an acceptable range through Step 1. The asset selection process is motivated by the observation that every asset has its own risk and return values. These risk and return values can be plotted and can also be subjected to a nondomination sorting process. Portfolios constructed using the assets on the efficient frontier and on the frontiers just below it are expected to lie near the efficient frontier when the risk and return values of portfolios are subjected to nondomination sorting. Based on these observations, two asset preselection processes are proposed in this paper. The details are as follows.
Assets Preselection Process 1 (P-1). A good single asset (high return and low risk) is likely to contribute to the final optimized portfolio. Therefore, P-1 selects the assets according to their returns and risks using nondomination sorting. In this process, each asset is considered as a portfolio and these assets are ranked using the nondomination sorting method. Assets are selected in order of the ranks assigned to them until the number of selected assets reaches the predefined (user-defined) number. Although the idea of P-1 is simple, it is effective for removing the bad assets.
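A minimal sketch of P-1 is given below: each asset is treated as a point (return, risk), the points are peeled into successive nondominated fronts, and assets are taken front by front until the requested number is reached. The simple quadratic-time front peeling and the rule for truncating the last front (by asset index) are assumptions made for this illustration; the text does not specify them.

```python
def dominates(a, b):
    """True if asset a = (ret, risk) dominates b: no worse in both, better in one."""
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])


def nondominated_fronts(points):
    """Peel the points into successive nondominated fronts (lists of indices)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def preselect_p1(points, n_select):
    """Pick n_select asset indices in order of nondomination rank."""
    selected = []
    for front in nondominated_fronts(points):
        for i in front:  # assumption: ties in the last front are cut by index
            if len(selected) == n_select:
                return selected
            selected.append(i)
    return selected


# Hypothetical (return, risk) values for five assets.
assets = [(0.10, 0.02), (0.08, 0.03), (0.12, 0.05), (0.05, 0.01), (0.04, 0.04)]
chosen = preselect_p1(assets, 4)
```

In this example assets 0, 2, and 3 form the first front, asset 1 the second, and asset 4 the third, so asset 4 is the one removed when four assets are requested.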
Assets Preselection Process 2 (P-2). Although P-1 is simple and effective, considering only the risk and return of individual assets is insufficient when a high performance portfolio is required, because a pair of negatively correlated assets has the potential to reduce the risk, depending on the degree of negative correlation. Therefore, in asset preselection process 2 (P-2), both single assets and negatively correlated pairs of assets are considered. Figure 4 shows how the assets are selected using P-2. Imagine that we wish to select 3 assets out of the 8 assets A-H. All assets are marked according to their return and risk values. Out of the 8 assets, assume that only C and D are negatively correlated. From C and D, infinitely many weighted combinations can be obtained. However, only one of these combinations has the lowest risk; this point is called CD and is included in the nondomination sorting process. It is clear that point "CD" dominates asset B. Therefore, assets A, C, and D will be selected as the best 3 assets. For a pair of assets, the combined return and risk values are computed as follows:

$$r_p = w_1 r_1 + w_2 r_2, \tag{5}$$

$$\sigma_p^2 = w_1^2 \sigma_{11} + w_2^2 \sigma_{22} + 2 w_1 w_2 \sigma_{12}, \tag{6}$$

where $w_1$, $w_2$ are the fractions of the two assets held in the portfolio with $w_1 + w_2 = 1$, $r_i$ is the return on asset $i$, and $\sigma_{ij}$ is the covariance of returns of assets $i$ and $j$. In the selection process, $w_2 = 1 - w_1$ is substituted into (6), and (6) is differentiated with respect to $w_1$ and set to zero. By solving the obtained equation, the $w_1$ that gives the minimum combined risk is obtained. Substituting this $w_1$ back into (6) gives the minimum risk that can be obtained by that pair.
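Carrying out the differentiation described above, with $w_2 = 1 - w_1$, gives the closed form $w_1 = (\sigma_{22} - \sigma_{12}) / (\sigma_{11} + \sigma_{22} - 2\sigma_{12})$. The sketch below computes this weight and the resulting return and risk of the pair. Clipping $w_1$ to $[0, 1]$ (no short selling) and the handling of a zero denominator are assumptions added for the example, as are the function name and the sample numbers.

```python
def min_variance_pair(r1, r2, s11, s22, s12):
    """Minimum-risk combination of two assets.

    r1, r2:   average returns of the two assets
    s11, s22: variances of the two assets
    s12:      covariance between them
    Returns (w1, combined_return, combined_risk).
    """
    denom = s11 + s22 - 2.0 * s12  # variance of the return difference, >= 0
    if denom <= 0.0:
        w1 = 0.5                   # assumption: degenerate pair, split evenly
    else:
        w1 = (s22 - s12) / denom   # stationary point of the pair variance
    w1 = min(1.0, max(0.0, w1))    # assumption: no short selling
    w2 = 1.0 - w1
    combined_return = w1 * r1 + w2 * r2
    combined_risk = w1 * w1 * s11 + w2 * w2 * s22 + 2.0 * w1 * w2 * s12
    return w1, combined_return, combined_risk


# Hypothetical negatively correlated pair.
pair_w1, pair_return, pair_risk = min_variance_pair(0.10, 0.20, 0.04, 0.09, -0.01)
```

The resulting (return, risk) point plays the role of point "CD" in the example above: it enters the nondomination sorting alongside the single assets.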

Constraint Handling Methods
The Markowitz (mean-variance) model initially considered only one strong constraint; that is, the summation of all asset weights must be equal to 1. This constraint is known as the real-world constraint, as the available capital must strictly equal the sum of the amounts invested in all assets, which means that the percentages of all the stocks should sum to 100%. In this work, other constraints such as floor-ceiling constraints or the minimum lots constraint are not considered. These constraints will be considered in our future work. To handle practical constraints and to ensure that the weights are nonnegative (no short selling is allowed), two different encoding schemes are adopted. The details are presented as follows.
Scheme 1 (S-1). In this encoding scheme, the weights/percentages for all the assets are first randomly initialized in the range of [0, 1]. Then all the weights are summed together to obtain a total weight. Lastly, all the initial weights are divided by the total weight to get the new weights. In this case, the summation of all the weights is always equal to 1. This scheme can be described by the following pseudocode.
The Pseudocode of S-1
Step 1. Initialize the weights $w_{1,\mathrm{ini}}, w_{2,\mathrm{ini}}, \ldots, w_{N,\mathrm{ini}}$ in the range of 0 to 1, where $N$ is the number of assets in the portfolio.
Step 2. Sum all the initial weights to get the total weight $w_{\mathrm{total}}$.
Step 3. Use the following equation to obtain the new weights, which meet the real-world constraint: $w_i = w_{i,\mathrm{ini}} / w_{\mathrm{total}}$, $i = 1, 2, \ldots, N$.
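The three steps of S-1 can be sketched as follows. The function names and the use of Python's `random` module are illustrative choices; the sketch assumes the total weight is strictly positive, which holds with probability one for random initialization.

```python
import random


def normalize_weights(initial_weights):
    """Steps 2 and 3: divide every initial weight by the total weight."""
    total = sum(initial_weights)
    return [w / total for w in initial_weights]


def encode_s1(n_assets, rng):
    """Step 1 followed by normalization: random weights in [0, 1] that sum to 1."""
    initial = [rng.random() for _ in range(n_assets)]
    return normalize_weights(initial)


# Example with a seeded generator so the run is repeatable.
s1_weights = encode_s1(100, random.Random(0))
```

The same normalization can be applied after crossover and mutation, so every candidate evaluated by the optimizer satisfies the sum-to-one constraint.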
Scheme 2 (S-2). This scheme is based on a variation of random keys [36]. In this scheme, all weights are randomly initialized from 0 to 0.1. The assets are then sorted in decreasing order of their weights and included in the portfolio one by one until the summation of the included weights is greater than 1. Lastly, the weight of the last included asset is reassigned so that the weights sum exactly to 1 (the unity constraint). All assets that are not included receive a weight of 0, so under this scheme the weights of many stocks are equal to 0.
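A minimal sketch of the S-2 decoding step is shown below. The function name is hypothetical, and the fallback used when the keys never add up to 1 (the remainder is given to the asset with the largest key) is an assumption for the example, since the text does not cover that case.

```python
import random


def decode_s2(keys):
    """Turn random keys into portfolio weights that sum to 1.

    Assets are visited in decreasing order of key; each keeps its key as its
    weight until the running total would reach 1, at which point the current
    asset receives the remainder and all later assets get weight 0.
    """
    if not keys:
        return []
    order = sorted(range(len(keys)), key=lambda i: keys[i], reverse=True)
    weights = [0.0] * len(keys)
    total = 0.0
    for i in order:
        if total + keys[i] >= 1.0:
            weights[i] = 1.0 - total  # reassign the last included asset
            return weights
        weights[i] = keys[i]
        total += keys[i]
    # Assumption: keys sum to less than 1, so top up the largest-key asset.
    weights[order[0]] += 1.0 - total
    return weights


# Keys drawn from [0, 0.1] as in the scheme; seeded for repeatability.
rng = random.Random(1)
s2_weights = decode_s2([rng.uniform(0.0, 0.1) for _ in range(100)])
s2_small = decode_s2([0.5, 0.4, 0.3, 0.2])
```

With keys bounded by 0.1, at least ten assets are needed to reach a total of 1, while most of the remaining assets end up with zero weight, which gives the sparse portfolios noted above.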

Experimental Setup.
To test the performance of the proposed methods, MATLAB R2013a was used as the programming language and simulation platform, and all tests were conducted on a DELL server with an Intel(R) Xeon(R) E5-2603 processor (@1.8 GHz) and 8.00 GB of memory. For all the algorithms, 100 days' closing prices (starting from 2 Jan. 2014) of 1000 stocks of the Chinese stock market were downloaded and used as the historical stock data in the simulation. The population size and the maximum number of function evaluations (MaxFEs) are set as 100 and 20000, respectively, for all algorithms. The other parameters are set as follows:
(1) Normalized Multiobjective Evolutionary Algorithm based on Decomposition (NMOEA/D): The number of weight vectors in the neighbourhood of each weight vector is 20. The crossover rate CR is selected as 1 while the scaling factor F is 0.6. The remaining two parameters are chosen as 20 and 1/6, respectively.
To evaluate the performance of the proposed algorithm and the preselection methods presented in Section 4, the two different encoding schemes presented in Section 5 are used, and six experiments are conducted. The details of these experiments are presented as follows. Note that all MOEAs are tested and compared on each of the six experiments and 25 independent runs are executed for each algorithm.

Experiment 1 (S1-E1). The original 1000 stocks are used to form the portfolio and the dimension of the optimization problem is 1000 (without the asset preselection process).
Experiment 2 (S1-E2P1). Optimization with 100 stocks; these 100 stocks are selected from the 1000 stocks using asset preselection method 1 (P-1).

Experiment 6 (S2-E6P2). Optimization with 100 stocks; these 100 stocks are selected from the 1000 stocks using asset preselection method 2 (P-2).

S-1.
To evaluate the performance of different algorithms with encoding Scheme 1, Experiments 1-3 were carried out. The Pareto Front of the median run obtained by each algorithm is plotted in Figures 5-7. As can be seen from these plots, NMOEA/D performs the best among the algorithms for all three cases (both with and without a preselection method). This is attributed to the high searching efficiency of the NMOEA/D algorithm. The advantage of NMOEA/D is more obvious when the optimization is carried out without the preselection method. This shows that NMOEA/D is able to solve high dimensional problems well. MODE-NDS performs a little better than MOCLPSO and MODE-SS for S1-E1, while MODE-SS performs slightly better in S1-E2P1 and S1-E3P2. These results indicate that MODE-SS is weak in solving large-scale (1000 dimensions) problems but more efficient in solving low dimensional (100 dimensions) problems when compared with MOCLPSO and MODE-NDS. NSGAII performs the worst among the compared algorithms for all three experiments, which shows that NSGAII is not as effective as the others when solving multiobjective portfolio optimization problems.
To demonstrate the usefulness of the preselection processes, the return/risk trade-off curves for each algorithm without (1000 stocks) and with (100 stocks) the two different preselection processes (P-1 and P-2) are plotted in Figures 8-12, respectively. As can be seen from these plots, after applying the preselection methods, the performances of all five algorithms are greatly improved. The preselection processes remove the bad stocks and keep those stocks that could potentially contribute to the portfolio. These methods not only improve the computational efficiency of the algorithms but also improve the quality of the solution. From the plots, we can also see that preselection method 2 (P-2) generally performs better than preselection method 1 (P-1). The reason is that P-2 considers both the single stocks (high return, low risk) and negatively correlated pairs while P-1 only considers single stocks. The negatively correlated pairs are able to further improve the solution quality. However, P-1 can still be useful when the requirement of accuracy is not high, as this method is faster than P-2.

S-2.
In this part, the results of Experiments 4-6 are presented. Experiments 4-6 used encoding Scheme 2, and the nondominated fronts for each algorithm (median run) are presented in Figures 13-15. As can be seen from these graphs, the best results are still generated by NMOEA/D, which is consistent with the results obtained in S-1. The differences among the various algorithms are small compared with the results obtained in S-1, especially for S2-E5P1 and S2-E6P2. Although NSGAII performs the worst, it can still generate well-distributed and satisfactory fronts. Note that NMOEA/D showed good performance when optimizing the original 1000-stock portfolio, indicating that NMOEA/D has a stronger ability in solving large-scale optimization problems. However, the preselection methods offer more solutions at the lowest risk extreme. Similar to the plots in S-1, the Pareto Fronts for each algorithm without (1000 stocks) and with (100 stocks) the two different preselection processes (P-1 and P-2) are presented in Figures 16-20 to demonstrate the advantage of using preselection methods. The advantages of using preselection methods are quite clear for all algorithms except NMOEA/D. As shown in Figure 20, the three fronts obtained by the three different methods almost coincide. Moreover, the differences between P-1 and P-2 are marginal for all algorithms. To view the difference more clearly and to further illustrate the usefulness of the preselection process, the maximum number of function evaluations (FEs) was reduced to 5000 for MODE-SS and NMOEA/D. The results are presented in Figures 21 and 22. As can be seen from the plots, the advantages of using preselection methods become clearer as the number of FEs is reduced.

The Effects of Number of Stocks Selected.
To compare the results of selecting different numbers of stocks, the following experiments are performed. Using NMOEA/D as the optimizing tool, 500, 200, 100, and 50 stocks are selected from the original 1000 stocks using preselection process P-2, and the optimized results are plotted in Figure 23 (encoding Scheme 1, S-1) and Figure 24 (encoding Scheme 2, S-2). The maximum number of function evaluations used is 10000. From these two graphs, it can be seen that the performance of the algorithm improves as the number of selected stocks decreases. However, if the number of stocks is further decreased to 10, the final Pareto Front deteriorates again, especially in the low risk part (Figures 25 and 26 for S-1 and S-2, resp.). This is because, if the selected number of stocks is too small, it is difficult to cover a large number of negatively correlated assets, and these assets are the main contributors to reducing the portfolio risk. Based on these observations, 50-200 is generally the desirable range for the number of stocks to be preselected.

Numerical Analysis of the Results.
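To analyze the experimental results, Table 1 is presented. In this table, the best risk, best return, and the best compromise solutions of different algorithms and different methods are listed. The best compromise solution is obtained by using a fuzzy-based method which can be represented as follows [38]:

$$\mu_i = \begin{cases} 1, & f_i \le f_i^{\min}, \\ \dfrac{f_i^{\max} - f_i}{f_i^{\max} - f_i^{\min}}, & f_i^{\min} < f_i < f_i^{\max}, \\ 0, & f_i \ge f_i^{\max}, \end{cases}$$

where $\mu_i$ is the membership value of the $i$th objective function ($f_i$). The normalized membership value $\mu[k]$ for each nondominated solution $k$ is calculated using

$$\mu[k] = \frac{\sum_{i=1}^{m} \mu_i[k]}{\sum_{j=1}^{N_{\mathrm{pareto}}} \sum_{i=1}^{m} \mu_i[j]},$$

where $m$ is the number of objectives while $N_{\mathrm{pareto}}$ is the number of solutions in the nondominated front. The best compromise solution is the one with the largest normalized membership value.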
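The fuzzy-based selection of the best compromise solution can be sketched as below. The sketch assumes the commonly used linear membership function for minimized objectives, so a maximized objective such as return would be negated first; the function names and the three-point front are hypothetical.

```python
def membership(value, lo, hi):
    """Linear fuzzy membership of a minimized objective: 1 at lo, 0 at hi."""
    if hi == lo or value <= lo:
        return 1.0
    if value >= hi:
        return 0.0
    return (hi - value) / (hi - lo)


def best_compromise(front):
    """Return (index, normalized memberships) for a nondominated front.

    front: list of objective vectors, all objectives to be minimized.
    """
    m = len(front[0])
    mins = [min(sol[i] for sol in front) for i in range(m)]
    maxs = [max(sol[i] for sol in front) for i in range(m)]
    scores = [sum(membership(sol[i], mins[i], maxs[i]) for i in range(m))
              for sol in front]
    total = sum(scores)
    normalized = [s / total for s in scores]
    best = max(range(len(front)), key=lambda k: normalized[k])
    return best, normalized


# Hypothetical front of (risk, -return) pairs.
front_example = [(0.01, -0.05), (0.02, -0.10), (0.05, -0.12)]
best_index, memberships = best_compromise(front_example)
```

In this example the two extreme solutions each score well on only one objective, so the middle solution obtains the largest normalized membership and is returned as the compromise.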
As can be seen from Table 1, with the preselection processes, the results (best risk, best return, and best compromise solution) obtained are either better than or similar to those obtained without preselection. Moreover, the risk obtained by P-2 is always smaller than the risk obtained by P-1 when the returns of the two methods are the same. This advantage of P-2 is due to the negatively correlated pairs, which can reduce the risk of the portfolio. To further demonstrate the advantage of the preselection process, the simulation times (in seconds) for each algorithm with/without the preselection process are recorded in Table 2. It is clear that the simulation time is reduced greatly by using the preselection process for all the tested algorithms.
Table 3 presents the values of the performance indicator for each algorithm. The method of calculating the indicator can be obtained from [39]. The best mean values are identified in boldface. As can be seen from this table, the best results are generated by either P-1 or P-2 (in most cases by P-2). Moreover, a t-test is applied to the indicator values. A value of 1 indicates that the results obtained by using a preselection method are statistically superior to those obtained without using a preselection method. This also demonstrates the effectiveness of the preselection processes. One observation is that the mean values obtained by the other algorithms are better than those obtained by NMOEA/D for S-2. This seems to contradict the results obtained from the figures. The reason is that NMOEA/D is able to generate better results in the 45-degree part (low risk, high return), but the low risk, low return part is missing, and this leads to bad indicator values. However, investors are more interested in the low risk, high return part, and we can still conclude that NMOEA/D performs the best from this point of view.

Conclusion
In this paper, an efficient procedure for solving the large-scale portfolio optimization problem was introduced. The key features of this method are the asset preselection processes. Two different asset preselection processes were proposed. Large-scale optimization requires long computational time and large memory to compute the optimal portfolios. However, with the asset preselection process, the assets that are unlikely to be included in the final portfolio are removed, thereby reducing the computational time substantially while improving the quality of the generated efficient portfolios within the same number of function evaluations. Our extensive results showed that the quality of the generated efficient portfolios improved significantly due to the asset preselection procedures, especially for preselection process 2, which considers both single assets and negatively correlated pairs. Moreover, the Normalized MOEA/D algorithm performed the best due to its high searching efficiency. In the future, we plan to include various real-world constraints such as the floor-ceiling constraint and the cardinality constraint in the optimization process. These constraints will introduce additional difficulties for the optimization algorithms and constraint handling methods, thereby further highlighting the importance of the asset preselection process. With these constraints, the problem formulation will be readily applicable to real-world situations and the results obtained can be attractive to investors.
Five different MOEAs, namely, Normalized Multiobjective Evolutionary Algorithm based on Decomposition (NMOEA/D), Multiobjective Differential Evolution based on Summation Sorting (MODE-SS), Multiobjective Differential Evolution based on Nondomination Sorting (MODE-NDS), Multiobjective Comprehensive Learning Particle Swarm Optimizer (MOCLPSO), and NSGAII, are used as the optimization tools to test the effectiveness of the preselection process.

Figure 1: The PF obtained by MOEA/D without normalization.

Figure 2: The PFs obtained by MOEA/D with and without normalization.
(2) Multiobjective Differential Evolution based on Summation Sorting (MODE-SS) and Multiobjective Differential Evolution based on Nondomination Sorting (MODE-NDS) [8]: F and CR are set as 0.5 and 0.1, respectively. "DE/rand/2" is used as the mutation strategy of DE.