Application of Support Vector Regression in Indonesian Stock Price Prediction with Feature Selection Using Particle Swarm Optimisation

Stock investing is one of the most popular types of investment since it provides the highest return among all investment types; however, it is also associated with considerable risk. Fluctuating stock prices provide an opportunity for investors to make a high profit. The movement of groups of stock prices can be seen from a stock index, which in Indonesia is the Jakarta Composite Index (JKSE). Several studies have focused on the prediction of stock prices using machine learning, some of them using support vector regression (SVR). Therefore, this study examines the application of SVR and particle swarm optimisation (PSO) in predicting stock prices using historical stock data and several technical indicators, which are selected using PSO. Subsequently, SVR was applied to predict stock prices with the technical indicators selected by PSO as predictors. The study found that stock price prediction using SVR and PSO performs well for all the data used, and many of the feature and training-data configurations have relatively low errors. Thereby, an accurate model was obtained to predict stock prices in Indonesia.


Introduction
A stock is a sign of ownership of a company and indicates that a shareholder holds a share of the company's assets and earnings [1]. Stocks are one of the most popular investment instruments; they are doubly beneficial since they provide both dividends and capital gains. Dividends are profits shared among shareholders based on the number of shares held by them. Capital gain is the benefit of the difference between the purchase and selling prices; however, it becomes a capital loss if the selling price is lower than the purchase price. Stock prices are determined by the marketplace, where the sellers' supply meets the buyers' demand. Unfortunately, since there is no specific equation that can exactly determine how a stock price will behave, stock prices are always fluctuating.
These fluctuations provide an opportunity for investors to make big profits; however, they present big risks as well. This is because numerous factors, such as company news and performance, industry performance, investor sentiment, and economic factors, influence stock price fluctuations.
A stock index denotes stock price movement. One of the stock indices prevalent in Indonesia is the Jakarta Composite Index (JKSE). It denotes the movement, whether an increase or a decrease, of the stock prices of all securities listed on the Indonesia Stock Exchange (IDX).
This is an important concern for investors since JKSE affects their decisions on whether to buy, hold, or sell their shares. To ensure that the prediction model applies to other stock data as well, this study also uses three real estate stock datasets listed on the IDX.
Stock price prediction mechanisms are fundamental to the formation of investment strategies and the development of risk management models [2]. Computational advances have led to the formulation of several machine learning algorithms that can be used to anticipate market movements consistently and, thereby, estimate future asset values, such as company stock prices [3]. Machine learning is a branch of science that gives computers the ability to learn from existing data. One machine learning method is support vector regression (SVR), an extension of the support vector machine (SVM) method to regression cases. One of its advantages over other regression models, such as ordinary least squares (OLS), is that SVR can handle nonseparable data, for which the OLS method gives poor prediction results [4].
To maximise the performance of SVR, feature selection is performed using particle swarm optimisation (PSO), which also reduces computational time. PSO is an evolutionary computation technique that is computationally less expensive than other evolutionary computation algorithms [5, 6].
This study uses PSO to select the indicators that affect stock prices and predicts stock prices by feeding the selected indicators into the SVR.

Literature Review
Over the past few decades, numerous researchers have conducted studies on the prediction of stock prices using machine learning and deep learning. Henrique, Sobreiro, and Kimura used SVR for stock price prediction on daily and up-to-the-minute prices [7]. Hiransha et al. used deep learning models for NSE stock market prediction [8].

Technical Analysis.
Technical analysis is defined as the art of predicting stock prices based on current commodity offerings, stocks, indices, futures, or tradeable instruments. It inserts stock price and volume information into a chart and applies various patterns and indicators to assess future stock price movements [9]. The literature focusing on SVM and SVR usually uses technical analysis indicators. Table 1 depicts the formulas of technical analysis [10]. The closing price on day t is denoted by C_t, and the number of trading days used is represented by n. EMA_{t−1} is the exponential moving average one day before t, α is the weight coefficient, Dw_{t−1} denotes the changes in price decreases, and Up_{t−1} the changes in price increases. L_{t−x} is the low price x days before t, and H_{t−x} is the high price x days before t. Low is the lowest price, High is the highest price, and Close is the last price for day t. The expression %K_i represents the stochastic %K on day i. The typical price of day i is denoted by TP_i. SMA_1 and SMA_2 are simple moving average values for a certain period, and the difference between the typical price and the simple moving average of day i is denoted by D_i.
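As an illustration, a few of the indicators referenced in Table 1 can be computed directly from price series. The sketch below uses pandas with common default conventions (e.g. the EMA weight α = 2/(n + 1)); the window lengths are illustrative and not necessarily those used in [10].

```python
import pandas as pd

def sma(close: pd.Series, n: int) -> pd.Series:
    """Simple moving average over the last n closing prices."""
    return close.rolling(n).mean()

def ema(close: pd.Series, n: int) -> pd.Series:
    """Exponential moving average with weight alpha = 2 / (n + 1)."""
    return close.ewm(span=n, adjust=False).mean()

def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    """Relative strength index from average gains (Up) and losses (Dw)."""
    delta = close.diff()
    up = delta.clip(lower=0).rolling(n).mean()
    dw = (-delta.clip(upper=0)).rolling(n).mean()
    return 100 - 100 / (1 + up / dw)

def stochastic_k(close: pd.Series, low: pd.Series, high: pd.Series,
                 n: int = 14) -> pd.Series:
    """Stochastic %K: position of the close within the n-day high-low range."""
    lowest = low.rolling(n).min()
    highest = high.rolling(n).max()
    return 100 * (close - lowest) / (highest - lowest)
```

Each function returns a series aligned with the input, with NaN values for the first n − 1 days where the window is incomplete.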

Particle Swarm Optimisation.
In 1995, Eberhart and Kennedy proposed the PSO algorithm [5]. This algorithm mimics the behaviour of a flock of birds. Like a bird in a flock, each particle in a group has its own intelligence and can influence group behaviour; each particle behaves in a mutually connected manner, using its own intelligence while being influenced by the behaviour of the group. Therefore, if a particle or a bird finds the right or shortest path to a food source, the group can immediately follow that path, even if the particle or bird is located far from the group. The particles in a group have a certain size, and each particle has two characteristics, namely, location and velocity. PSO is a well-known tool for finding the optimal characteristics of a particle by performing local and global iterative searches in the feature search space [11]. In PSO, a group of random particles moves around the solution space until convergence is reached. Some features are irrelevant and noisy and lead to high misclassification rates [11].
Therefore, PSO is used for feature selection to reduce noisy features and remove irrelevant features. The model is simulated in a space of a certain dimension; an increase in the number of iterations indicates that the particles are getting closer to the intended target. This continues until the maximum number of iterations, or another stopping criterion, is reached.
First, we generate the initial positions, X_i, and initial velocities, V_i, of n random particles; then, for each particle, we evaluate the fitness function, f(x_i), based on its position. We determine the particle position with the best fitness and set it as gbest. For each particle, pbest is the position with the best fitness that the particle has obtained so far. After determining pbest and gbest, the velocities and positions are updated as specified by Seal and colleagues [11]:

V_i^{t+1} = w V_i^t + c_1 · ud · (pbest_i − X_i^t) + c_2 · ud · (gbest − X_i^t),
X_i^{t+1} = X_i^t + V_i^{t+1},

where w is a factor used to control the balance of the search between exploitation and exploration, c_1 and c_2 are the cognitive and social parameters, respectively, which have values between 0 and 1, ud is another random value bounded between 0 and 1, i = 1, 2, 3, …, n, and n is the size of the population.

The primal SVR problem can be defined as follows [12]:

min_{w, b, ξ, ξ*}  (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ_i*)
subject to  y_i − ⟨w, φ(x_i)⟩ − b ≤ ε + ξ_i,  ⟨w, φ(x_i)⟩ + b − y_i ≤ ε + ξ_i*,  ξ_i, ξ_i* ≥ 0,

where w is a d-dimensional weight vector. The constant C > 0 determines the trade-off between the flatness of the decision function and the degree to which deviations larger than ε can still be tolerated [12]. A deviation greater than ε is subject to a penalty governed by C, and high slack-variable values cause empirical errors to affect the regularisation term significantly. In SVR, the support vectors are the training data that lie on or outside the boundary of the decision function; therefore, the number of support vectors decreases as the error tolerance, ε, increases.
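The update rules above can be sketched as a generic continuous PSO minimiser. This is an illustrative implementation, not the study's configuration: the parameter values (w = 0.7, c1 = c2 = 0.5, 20 particles, 50 iterations) are assumptions, and a fresh random factor ud is drawn for the cognitive and social terms separately.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimise(fitness, dim, n_particles=20, iters=50,
                 w=0.7, c1=0.5, c2=0.5):
    """Minimise `fitness` using the velocity/position updates from the text."""
    X = rng.uniform(-1, 1, (n_particles, dim))   # initial positions X_i
    V = np.zeros((n_particles, dim))             # initial velocities V_i
    pbest = X.copy()                             # best position per particle
    pbest_f = np.apply_along_axis(fitness, 1, X)
    gbest = pbest[pbest_f.argmin()].copy()       # best position overall
    for _ in range(iters):
        ud1 = rng.random((n_particles, dim))     # random factors in [0, 1)
        ud2 = rng.random((n_particles, dim))
        V = w * V + c1 * ud1 * (pbest - X) + c2 * ud2 * (gbest - X)
        X = X + V
        f = np.apply_along_axis(fitness, 1, X)
        better = f < pbest_f                     # update personal bests
        pbest[better], pbest_f[better] = X[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()   # update global best
    return gbest
```

On a simple convex test function such as the sphere function, the swarm converges toward the minimum well within these iteration counts.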
In the dual formulation, the optimisation problem of SVR is represented as follows [12]:

max_{α, α̂}  −(1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} (α_i − α̂_i)(α_j − α̂_j) k(x_i, x_j) − ε Σ_{i=1}^{n} (α_i + α̂_i) + Σ_{i=1}^{n} y_i (α_i − α̂_i)
subject to  Σ_{i=1}^{n} (α_i − α̂_i) = 0,  0 ≤ α_i, α̂_i ≤ C,

where k(x_i, x_j) denotes the kernel function, defined as k(x_i, x_j) = φ(x_i) · φ(x_j), in which φ is a mapping from the data space to the feature space F, and α_i and α̂_i are the Lagrange multipliers. Using the Lagrange multipliers and the optimality conditions, the regression function can be explicitly formulated as follows:

f(x) = Σ_{i=1}^{n} (α_i − α̂_i) k(x_i, x) + b.
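In practice, this dual problem is solved by off-the-shelf libraries. The sketch below uses scikit-learn's SVR with an RBF kernel on a synthetic price-like series; the lag construction and the hyperparameter values (C, ε, γ) are illustrative assumptions, not the study's settings.

```python
import numpy as np
from sklearn.svm import SVR

# Toy series standing in for closing prices: predict the next value
# from the two previous values.
close = np.sin(np.linspace(0, 10, 200)) + 2.0
X = np.column_stack([close[:-2], close[1:-1]])  # two lagged features
y = close[2:]

# RBF kernel k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2); C penalises
# deviations larger than epsilon, as in the primal problem above.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma="scale")
model.fit(X[:150], y[:150])
pred = model.predict(X[150:])
```

After fitting, `model.support_` holds the indices of the support vectors, i.e. the training points lying on or outside the ε-tube of the decision function.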

Materials and Methods
This study proposes the application of SVR with PSO for feature selection. The closing price is initially used as a raw input; subsequently, technical analysis is used to transform the raw input into technical analysis indicators. Since the indicators have different scales, the data in each indicator are normalised. The normalised technical analysis indicators are used as features to predict the stock price and are then selected using PSO, which determines the optimal characteristics of a feature by iteratively searching for the local best and global best in the feature search space, making the prediction model more effective.

Data Set.
This study used the Stock Composite Index and several stocks from the real estate sector in Indonesia as data sets. The JKSE is one of the main indicators reflecting the performance of the capital market in Indonesia; it records the price movements of the shares of all securities listed on the IDX. The data comprise the adjusted daily closing prices from Yahoo Finance, amounting to 650 observations from 4 January 2016 to 10 September 2018. Using technical analysis, the data are processed into several indicators, such as the simple moving average, exponential moving average, momentum, rate of change, moving average convergence divergence, commodity channel index, relative strength index, and stochastic %K and stochastic %D.

Data Preprocessing.
In this study, the data were preprocessed using technical analysis, normalisation, and feature selection through PSO. The technical analysis indicators are obtained by applying formulas to the daily history of stock prices [7]. The daily stock price components used by the study include the close, low, and high prices. In total, 14 indicators are computed using the formulas in [10].
As discussed previously, we normalised the data to equalise the value scales. The formula used to normalise the data to the range [−1, 1] is as follows [8]:

x′(t) = ((x(t) − min_A) / (max_A − min_A)) (max′_A − min′_A) + min′_A,

where x′(t) is the normalised value of technical indicator A on day t; x(t) is the value of technical indicator A on day t; min_A is the smallest value of technical indicator A; and max_A is the largest value of technical indicator A. Further, min′_A and max′_A denote the new smallest and largest values of technical indicator A, respectively. Then, the normalised values of the technical indicators are selected using PSO. The selected features are used as input data for the prediction process.
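The scaling step can be written directly from the min-max formula above. A minimal sketch with NumPy, assuming the new range defaults to [−1, 1]:

```python
import numpy as np

def normalise(x, new_min=-1.0, new_max=1.0):
    """Min-max scale an indicator series to [new_min, new_max]:
    x'(t) = (x(t) - min_A) / (max_A - min_A) * (max'_A - min'_A) + min'_A."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min()) * (new_max - new_min) + new_min
```

Applied per indicator, this maps each indicator's smallest value to −1 and its largest to 1, so features on very different scales become comparable inputs for the SVR.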
Modelling and Simulation in Engineering

Feature Selection.
Feature selection, or variable selection, can be briefly described as a tool for selecting the variables that can represent the original data set [13]. It is the process of selecting the best i features in a data set by using an algorithm to evaluate the features [14]. Among its many advantages is that it increases the accuracy of the resulting model [15]. PSO is one tool for feature selection. In a previous study, Seal et al. used PSO for feature selection in thermal face recognition [11]. Cases that used PSO gave markedly better performance than those that did not; this inspired the authors to apply it to the stock price prediction model.
First, we input the variables into the PSO program and obtain a cost score for each variable from the PSO output. Subsequently, the scores are sorted in ascending order. After sorting, we create groups of input data ranging from one variable up to 14 variables and feed them into the SVR program. This study evaluated training proportions from 10% to 90% of the data to find the best model.
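The sorting-and-grouping procedure can be sketched as follows. This is an assumed reconstruction: `costs` stands for the per-feature cost scores returned by the PSO run, and the SVR hyperparameters are scikit-learn defaults rather than the study's settings.

```python
import numpy as np
from sklearn.svm import SVR

def evaluate_subsets(X, y, costs, train_frac=0.7):
    """Rank features by ascending PSO cost score, then fit an SVR on the
    best 1, 2, ..., d features and report test NMSE per subset size."""
    order = np.argsort(costs)                 # smallest cost first
    split = int(len(y) * train_frac)          # chronological train/test split
    results = {}
    for k in range(1, X.shape[1] + 1):
        cols = order[:k]                      # best k features so far
        model = SVR(kernel="rbf").fit(X[:split][:, cols], y[:split])
        pred = model.predict(X[split:][:, cols])
        resid = ((y[split:] - pred) ** 2).sum()
        total = ((y[split:] - y[split:].mean()) ** 2).sum()
        results[k] = resid / total            # NMSE for this subset size
    return results
```

Repeating this over the different training proportions yields the NMSE grid reported in Tables 2 to 5.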

Performance Criteria.
In this study, SVR was used to predict close prices, and the predictions were evaluated against the observed prices [7]. The adequacy of the model's price prediction can be evaluated using measures such as the root-mean-squared error, mean absolute percentage error, and normalised mean squared error (NMSE). In this study, the NMSE was calculated according to the following equation [16]:

NMSE = Σ_{i=1}^{n} (y_i − ŷ_i)² / Σ_{i=1}^{n} (y_i − ȳ)²,

where the observed value is represented as y_i, the predicted value as ŷ_i, and the mean of the observed values as ȳ.
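The NMSE divides the sum of squared prediction errors by the total sum of squares about the mean, so a value of 1 corresponds to a model no better than always predicting the mean of the observations. A minimal sketch:

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalised mean squared error: sum of squared errors divided by
    the total sum of squares about the mean of the observed values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return ((y_true - y_pred) ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
```

A perfect prediction gives NMSE = 0, and values well below 1 (such as the sub-0.1 averages reported here) indicate the model explains most of the variation in the observed prices.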

Experimental Results
The experimental results of stock price prediction using SVR and PSO showed good performance for all the data used, and many of the feature and training-data configurations had relatively small NMSE values, averaging below 0.1. The data used are JKSE and real estate stock data, comprising Alam Sutera Realty Tbk (ASRI), Agung Podomoro Land Tbk (APLN), and Bumi Serpong Damai Tbk (BSDE).
Table 2 reveals that, in the experiment using the SVR algorithm with PSO feature selection on the JKSE data, 70% training data gives the lowest NMSE value. With 70% training data, the use of 12, 13, and 14 features gives the lowest NMSE values.
As shown in Table 3, the smallest NMSE value for ASRI is obtained with 90% training data using 13 features, and with 80% and 90% training data using 14 features.
Table 4 clarifies that, for the same number of features, the smallest NMSE value of APLN is obtained using 50% to 90% training data. Likewise, for the same amount of training data, the smallest NMSE value is obtained using 11, 12, 13, and 14 features. Although the NMSE value is the same with and without feature selection, using fewer features is more beneficial, since it reduces the running time of the program. For example, if 13 and 14 features yield the same NMSE value, it is better to use 13 features. The same applies to the training data; for example, if 80% and 90% training data give the same NMSE value, it is better to use the former, since it reduces the running time of the program.

Conclusions
Based on the experimental results of this study, we conclude that the stock price prediction model using SVR with PSO-based feature selection exhibits good performance, with relatively small NMSE values averaging below 0.1. Although some data give the same smallest NMSE value with and without feature selection, it is better to use fewer features, since this reduces the running time of the program. Moreover, the ranking of features varies with the type of data. Thus, an accurate stock price prediction model is obtained, enabling investors to predict future stock prices and gain profits. In subsequent studies, more technical indicators will be added, and other feature selection methods and data inputs will be used for comparison.

Table 1 :
Formulas of technical analysis.
Support Vector Machines for Regression. SVM is a machine learning method introduced by Vladimir Vapnik and his team in 1992. Initially, SVM was used to solve classification problems alone; it has since been developed to solve regression problems as well. SVR is the application of SVM to regression cases, in which the output of the method comprises real numbers y_1, y_2, y_3, … ∈ ℝ. The purpose of this method is to find a function y(x) that gives the smallest error, ε, for all the learning data x_i.

Table 5 depicts that the smallest NMSE value of BSDE is obtained using 12 features with 70% and 80% training data; 13