A Hybrid Intelligent Method of Predicting Stock Returns

This paper proposes a novel method for predicting stock returns by means of a hybrid intelligent model. Initial predictions are obtained from a linear model, and the resulting prediction errors are collected and fed into a recurrent neural network, specifically an autoregressive moving reference neural network. The recurrent neural network yields minimized prediction errors because of its nonlinear processing and its configuration. These minimized errors are used to obtain final predictions by a summation method as well as by a multiplication method. The proposed model is thus a hybrid of a linear and a nonlinear model. The model has been tested on stock data obtained from the National Stock Exchange of India. The results indicate that the proposed model can be a promising approach in predicting future stock movements.


Introduction
Prediction of stock returns has attracted many researchers in the past, and it remains an active area in both academia and industry. Mathematically, the techniques used to predict stock returns can be broadly classified into two categories. The first category involves linear models such as autoregressive moving average models, exponential smoothing, linear trend prediction, the random walk model, generalized autoregressive conditional heteroskedasticity, and the stochastic volatility model [1]. The second category involves models based on artificial intelligence, such as artificial neural networks (ANNs) [2], support vector machines [3], genetic algorithms (GA), and particle swarm optimization (PSO) [4]. Linear models share a common limitation: their linearity prevents them from detecting nonlinear patterns in data. Because the stock market is unstable, stock data is volatile in nature, and linear models are therefore unable to detect the nonlinear patterns of such data. Nonlinear models overcome this limitation, as ANNs embody useful nonlinear functions which are able to detect nonlinear patterns of data [5]. As a consequence, prediction performance improves when nonlinear models are used [6,7].
A lot of work has been done in this field; for instance, a radial basis neural network was used for stock prediction on the Shanghai Stock Exchange, wherein artificial fish swarm optimization was introduced to optimize the radial basis function [8]. In time series prediction, ANNs have received overwhelming attention from researchers. For instance, Freitas et al. [9], Wang et al. [10], Khashei and Bijari [11], Chen et al. [12], and Jain and Kumar [13] report the use of ANNs in time series stock prediction. A new approach called the wavelet denoising based backpropagation neural network was proposed for predicting stock prices [14]. In another work, researchers explored the use of activation functions in ANNs and, to improve the performance of ANNs, suggested three new simple functions; financial time series data was used in the experiments [15]. For seasonal time series prediction, an ANN was proposed which considers the seasonal time periods in the time series. The purpose of such consideration is to determine the number of input and output neurons [16]. Multilayer perceptron and generalized regression neural networks were used to predict the Kuwait Stock Exchange [17]. The results showed that the models were useful in predicting stock exchange movements in emerging markets. Different ANN models were proposed that are able to capture temporal aspects of inputs [5]. For time series prediction, two types of ANN models have proved successful: time-lagged feedforward networks and dynamically driven recurrent (feedback) networks [18].

Advances in Artificial Neural Systems

ANNs are not always guaranteed to yield the desired results. To address this problem, researchers attempted to find a global optimization approach for ANNs to predict the stock price index [19]. With the goal of further improving the performance of predictors, researchers have also attempted to develop hybrid models for prediction of stock returns. The hybridization may include integration of linear and nonlinear models [10,20,21]. Researchers developed a hybrid forecasting model by integrating a recurrent neural network based on artificial bee colony with wavelet transforms [22]. In another work, an adaptive network-based fuzzy inference system was used to develop a hybrid prediction model [23]. A hybrid system was proposed by integrating the linear autoregressive integrated moving average model and an ANN [11]. To overcome the prediction problem with the use of technical indicators, researchers proposed a hybrid model by merging ANNs and genetic programming [24]. Feature selection was also used to improve the performance of the system. In another work, a random walk prediction model was merged with an ANN to develop a hybrid model [25]. Researchers also introduced a hybrid ANN architecture of particle swarm optimization and adaptive radial basis function [26]. Various regressive ANN models, such as self-organising maps and support vector regressions, were used to form a hybrid model to predict foreign exchange currency rates [27]. For business failure prediction, a hybrid model was proposed in which particle swarm optimization and a network-based fuzzy inference system were used [28]. In a recent work, researchers created a hybrid prediction model by integrating the autoregressive moving average model and differential evolution based training of its feedforward and feedback parameters [29]. This hybrid model was compared with other similar hybrid models, and the results confirm that it outperforms them.
The rest of the paper is arranged as follows. Section 2 discusses various prediction-based models, including those used in this work. The proposed hybrid model is described in detail in Section 3. Section 4 discusses two commonly used error metrics. In Section 5, experiments and results are presented. Finally, Section 6 presents conclusions.

Prediction Based Models
Stock return, or simply return, refers to the profit on an investment. The return $r_t$ at time $t$ is calculated using (1), where $P_t$ and $P_{t-1}$ are the prices of the stock at times $t$ and $t-1$, respectively:

$$r_t = \frac{P_t - P_{t-1}}{P_{t-1}} \tag{1}$$

A few prediction-based models available in the literature, including those used in this work, are described below.
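As a minimal sketch of the return calculation described above, the following Python function converts a price series into a return series. It assumes the simple (arithmetic) return definition, that is, the price change divided by the previous price; the function name `simple_returns` and the sample prices are hypothetical and not taken from the paper.

```python
def simple_returns(prices):
    """Return (P_t - P_{t-1}) / P_{t-1} for each consecutive pair of prices.

    Assumption: simple (arithmetic) returns; prices must be non-zero.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical adjusted closing prices, for illustration only.
prices = [100.0, 102.0, 99.96]
returns = simple_returns(prices)
print(returns)
```

A series of n + 1 prices yields n returns, which matches the way the experiments derive 164 daily returns from prices that start one day earlier.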

Exponential Smoothing Model.
The exponential smoothing model (ESM) is used to obtain predictions on time series data (Brown [30]). It computes a one-step-ahead prediction as a geometrically weighted sum of past observations, as shown in the following equation:

$$\hat{r}_{t+1} = \hat{r}_t + \alpha\,(r_t - \hat{r}_t) \tag{2}$$

where $\hat{r}_t$ is the prediction of $r_t$, $\hat{r}_{t+1}$ is the prediction for the future value, $\alpha$ is a smoothing parameter in the range (0, 1), and $r_t - \hat{r}_t$ is the prediction error. Exponential smoothing assigns exponentially decreasing weights over time, so the most recent observations receive the most weight.
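The recursion described above (new prediction = old prediction + smoothing parameter × prediction error) can be sketched in a few lines of Python. The initialisation with the first observation is an assumption made here for illustration, since the text does not state how the first prediction is chosen; the function name is hypothetical.

```python
def exponential_smoothing(returns, alpha, initial=None):
    """One-step-ahead exponential smoothing predictions.

    preds[t] is the prediction of returns[t]; preds[-1] is the prediction
    for the next, not yet observed, value.
    Assumption: the first prediction defaults to the first observation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    pred = returns[0] if initial is None else initial
    preds = [pred]
    for r in returns:
        pred = pred + alpha * (r - pred)  # correct by a fraction of the error
        preds.append(pred)
    return preds


preds = exponential_smoothing([0.02, -0.01, 0.03], alpha=0.5)
print(preds)
```

The returned list is one element longer than the input because its last entry is the forecast for the period after the final observation.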

Autoregressive Moving Average Model.
The autoregressive moving average model was introduced by Box and Jenkins [31] for time series prediction, inspired by the early work of Yule [32] and Wold [33]. The model consists of an autoregressive part AR($p$) and a moving average part MA($q$), which is why it is referred to as the ARMA($p$, $q$) model. The AR($p$) process can be generally expressed as shown in the following equation:

$$r_t = \sum_{i=1}^{p} \phi_i\, r_{t-i} + \varepsilon_t \tag{3}$$

Similarly, the MA($q$) process can be generally expressed as shown in the following equation:

$$r_t = \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} \tag{4}$$

Hence, the ARMA($p$, $q$) model can be expressed as

$$r_t = \sum_{i=1}^{p} \phi_i\, r_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} \tag{5}$$

where $p$ and $q$ represent the order of the autoregressive model and of the moving average model, respectively; $\phi_i$ and $\theta_j$ are coefficients that satisfy stationarity of the series; and $\varepsilon_t$ are random errors, or white noise, at time $t$ with zero mean and variance $\sigma_\varepsilon^2$.
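To make the structure of an ARMA model concrete, the sketch below computes a one-step-ahead conditional mean from given coefficients: a weighted sum of the most recent returns (autoregressive part) plus a weighted sum of the most recent residuals (moving average part). It assumes a zero-mean model without a constant term, and the coefficient values are made up for illustration; estimating them from data is outside the scope of this sketch.

```python
def arma_one_step(history, residuals, phi, theta):
    """One-step-ahead ARMA prediction (conditional mean).

    history   : past returns, most recent last
    residuals : past residuals (white-noise estimates), most recent last
    phi       : AR coefficients, phi[0] applies to the most recent return
    theta     : MA coefficients, theta[0] applies to the most recent residual
    Assumption: zero-mean model with no constant term.
    """
    ar_part = sum(c * r for c, r in zip(phi, reversed(history)))
    ma_part = sum(c * e for c, e in zip(theta, reversed(residuals)))
    return ar_part + ma_part


# Hypothetical ARMA(2, 1) coefficients and data, for illustration only.
forecast = arma_one_step(
    history=[0.01, 0.02],
    residuals=[0.0, 0.005],
    phi=[0.5, 0.2],
    theta=[0.4],
)
print(forecast)
```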

Autoregressive Moving Reference Regression Model.
This is a regression model proposed by Freitas et al. [9] and adopted in this work. The same model has been used in [34,35] along with different ANN models.
Let us consider time series data with $n$ past returns, as shown in the following equation:

$$r' = (r_{t-(n-1)}, \ldots, r_{t-1}, r_t) \tag{6}$$

Based on the available historical data, predicting the future return of a stock can be defined as the process in which the elements of the past returns are used to obtain an estimate of the future return $r_{t+l}$, where $l \geq 1$. The value of $l$ directly affects the choice of the prediction method.
This model generates input-output pairs which are given to the ANN in a supervised manner; thus, it can be called an autoregressive moving reference neural network, AR-MRNN($p$, $k$), where $p$ is the order of regression and $k$ is the delay from the point of reference.
Using an autoregressive predictor, say $R$, AR-MRNN($p$, $k$) implements the prediction system as shown in the following equation:

$$\hat{r}_{t+1} - z = R(r_t - z,\; r_{t-1} - z,\; \ldots,\; r_{t-(p-1)} - z) \tag{7}$$

where $\hat{r}_{t+1} - z$ is the prediction for $r_{t+1} - z$ obtained at time $t$ from the information available from the historical series $r' = (r_{t-(p-1)}, \ldots, r_{t-1}, r_t)$, $p$ is the order of regression, and $z$ is the moving reference given in the following equation:

$$z = r_{t-(p-1)-k} \tag{8}$$

After training the neural network, the prediction $\hat{r}_{t+1}$ is obtained as shown in the following equation:

$$\hat{r}_{t+1} = R(r_t - z,\; r_{t-1} - z,\; \ldots,\; r_{t-(p-1)} - z) + z \tag{9}$$

According to (7), inputs to the neural network are given as differences rather than original values; the network thus requires smaller weights, which improves its ability to generalize. The output of the neural network is not the final prediction; the final prediction is calculated by adding the value of $z$ to the output.
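The construction of supervised input-output pairs under the moving reference scheme can be sketched as follows. The sketch assumes the moving reference is the observation located k steps before the oldest of the p inputs, and it only emits pairs whose target is already known; the function name `ar_mrnn_pairs` is hypothetical.

```python
def ar_mrnn_pairs(series, p, k):
    """Build (inputs, target, reference) triples for an AR-MRNN(p, k) scheme.

    For each time index t, the reference z is the value k steps before the
    oldest input; inputs are the p most recent values minus z, and the
    target is the next value minus z.
    Assumption: z = series[t - (p - 1) - k].
    """
    offset = (p - 1) + k
    pairs = []
    for t in range(offset, len(series) - 1):
        z = series[t - offset]
        inputs = [series[t - i] - z for i in range(p)]  # most recent first
        target = series[t + 1] - z
        pairs.append((inputs, target, z))
    return pairs


# Toy series, for illustration only.
series = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
pairs = ar_mrnn_pairs(series, p=2, k=1)
for inputs, target, z in pairs:
    print(inputs, target, z)
```

Because the network learns the target relative to the reference, a final prediction is recovered by adding the stored reference back to the network output.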

The Proposed Hybrid Intelligent Prediction Model
This section discusses the proposed hybrid intelligent prediction model (HIPM) in detail. Consider the actual returns of a time series given by $r_t$ ($t = 1, \ldots, n$). Let the predictions obtained by any linear model be denoted by $\hat{r}_t$ ($t = 1, \ldots, n$).
The difference between the actual series ($r_t$) and the predicted series ($\hat{r}_t$) is known as the prediction error, or simply error, and is calculated as $\varepsilon_t = r_t - \hat{r}_t$.
In HIPM, the predictions are obtained via two methods, a summation method and a multiplicative method; these methods are defined below.

Summation Method.
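According to the summation method, the actual data is equal to the predicted linear data plus the error terms, as shown in the following equation:

$$r_t = \hat{r}_t + \varepsilon_t \tag{10}$$

The error terms are thus calculated as shown in the following equation:

$$\varepsilon_t = r_t - \hat{r}_t \tag{11}$$

Multiplicative Method.
For the multiplicative method, the actual data is equal to the predicted linear data multiplied by the error terms, as shown in the following equation:

$$r_t = \hat{r}_t \times \varepsilon_t \tag{12}$$

The error terms are multiplied back into the linear predictions because they were calculated as shown in the following equation:

$$\varepsilon_t = \frac{r_t}{\hat{r}_t} \tag{13}$$

The series of errors obtained by the above two methods are given to the ANN by means of AR-MRNN($p$, $k$), where $k = 1$. Thus, in terms of AR-MRNN, (7) to (9) are modified as given below:

$$\hat{\varepsilon}_{t+1} - z = R(\varepsilon_t - z,\; \varepsilon_{t-1} - z,\; \ldots,\; \varepsilon_{t-(p-1)} - z) \tag{14}$$

where $\hat{\varepsilon}_{t+1} - z$ is the estimate for $\varepsilon_{t+1} - z$ obtained at time $t$ from the information available from the previously obtained error series $\varepsilon' = (\varepsilon_{t-(p-1)}, \ldots, \varepsilon_{t-1}, \varepsilon_t)$, $p$ is the order of regression, and $z$ is the moving reference given in the following equation:

$$z = \varepsilon_{t-(p-1)-k} \tag{15}$$

After training the neural network, $\hat{\varepsilon}_{t+1}$ is obtained as shown in (16); thus, minimized errors are obtained:

$$\hat{\varepsilon}_{t+1} = R(\varepsilon_t - z,\; \varepsilon_{t-1} - z,\; \ldots,\; \varepsilon_{t-(p-1)} - z) + z \tag{16}$$

These minimized errors replace the original errors in (10) and (12). Hence, final predictions from HIPM are obtained by the summation method and the multiplicative method as shown in the following two equations, respectively:

$$\hat{r}_{\mathrm{HIPM}(t)} = \hat{r}_t + \hat{\varepsilon}_t \tag{17}$$

$$\hat{r}_{\mathrm{HIPM}(t)} = \hat{r}_t \times \hat{\varepsilon}_t \tag{18}$$

Figure 1 shows the complete workflow of HIPM. Initially, the actual returns $r_t$ are given as input to the linear prediction model, through which the predictions $\hat{r}_t$ are obtained. In the next step, the errors $\varepsilon_t$ are calculated. These errors are fed into the ANN, which performs nonlinear processing. The minimized errors $\hat{\varepsilon}_t$ obtained from the ANN are used to calculate the final predictions via the two methods, that is, the summation method and the multiplicative method.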
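The two ways of forming error terms and recombining them with the linear predictions can be sketched as below. The neural network step is deliberately left out: the sketch takes the estimated errors as given, so that the test can use the true errors to show that each combination rule inverts its own error definition. All function names are hypothetical, and the multiplicative variant assumes no linear prediction is exactly zero.

```python
def summation_errors(actual, predicted):
    """Additive errors: actual minus linear prediction."""
    return [r - r_hat for r, r_hat in zip(actual, predicted)]


def multiplicative_errors(actual, predicted):
    """Multiplicative errors: actual divided by linear prediction.

    Assumption: no linear prediction equals zero.
    """
    return [r / r_hat for r, r_hat in zip(actual, predicted)]


def combine_summation(predicted, estimated_errors):
    """Final prediction = linear prediction + estimated error."""
    return [r_hat + e for r_hat, e in zip(predicted, estimated_errors)]


def combine_multiplicative(predicted, estimated_errors):
    """Final prediction = linear prediction * estimated error."""
    return [r_hat * e for r_hat, e in zip(predicted, estimated_errors)]


# Toy values, for illustration only.
actual = [0.02, -0.01]
linear = [0.01, -0.02]
sum_err = summation_errors(actual, linear)
mul_err = multiplicative_errors(actual, linear)
print(combine_summation(linear, sum_err))
print(combine_multiplicative(linear, mul_err))
```

In the full model the estimated errors would come from the trained network rather than from the true errors, so the final predictions are only as good as the error estimates.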

Recurrent Neural Network with Autoregressive Moving Reference.
The importance of a recurrent neural network (RNN) is that it responds to the same input pattern differently at different times. Figure 2 shows the recurrent neural network with one hidden layer which is used in this work. In this type of network model, the input layer is not only fed to the hidden layer but also fed back into the input layer. The network is a supervised AR-MRNN with inputs $(\varepsilon_t - z, \varepsilon_{t-1} - z, \ldots, \varepsilon_{t-(p-1)} - z)$. The desired output is $\hat{\varepsilon}_{t+1} - z$. As shown, the receiving end for the input layer is a long term memory for the network. The function of this memory is to hold the data and pass it to the hidden layer immediately after each pattern is sent from the input layer. In this way, the network is able to see the knowledge it had about previous inputs. Thus, the long term memory remembers the new input data and uses it when the next pattern is processed. The disadvantage of this network is that it takes longer to train.
The input layer has as many input neurons as the chosen regression order, that is, $p$, each with a linear activation function. The hidden layer contains neurons $h_{1,1}, h_{1,2}, \ldots$, each with some activation function. The output neuron, $h_{2,1}$, also has some activation function. Thus, the network is able to learn how to predict complex nonlinear patterns of the data. This network is also known as the Jordan Elman neural network, and it is trained using the backpropagation algorithm [36,37].
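The idea of a memory that holds the previous input pattern and feeds it to the hidden layer alongside the current one can be illustrated with a forward pass only. This is a rough sketch under several assumptions, not the network used in the paper: weights are random rather than trained, sigmoid activations are assumed in the hidden and output layers, biases are omitted, and the class name `ContextRNN` is hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ContextRNN:
    """Forward pass of a one-hidden-layer network with an input memory.

    Assumptions: untrained random weights, sigmoid activations, no biases.
    """

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = rand_matrix(n_hidden, n_inputs)    # input -> hidden
        self.w_ctx = rand_matrix(n_hidden, n_inputs)   # memory -> hidden
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.context = [0.0] * n_inputs                # memory starts empty

    def step(self, x):
        hidden = []
        for row_in, row_ctx in zip(self.w_in, self.w_ctx):
            s = sum(w * v for w, v in zip(row_in, x))
            s += sum(w * c for w, c in zip(row_ctx, self.context))
            hidden.append(sigmoid(s))
        y = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)))
        self.context = list(x)  # remember this pattern for the next step
        return y


net = ContextRNN(n_inputs=6, n_hidden=16)
pattern = [0.01, -0.02, 0.005, 0.0, 0.015, -0.01]
first = net.step(pattern)
second = net.step(pattern)  # same input, different memory contents
print(first, second)
```

Presenting the same pattern twice gives two different outputs because the memory is empty on the first call and holds the pattern on the second, which is the behaviour the text attributes to recurrent networks.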

Error Metrics
The performance of HIPM is assessed here using two error metrics, mean square error and mean absolute error. A brief discussion of these error metrics is given in the following subsections.

Mean Square Error (MSE).
MSE is the arithmetic mean of the squared forecast errors. It is a standard metric for comparing the differences between two time series and is defined as shown in the following equation:

$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(r_t - \hat{r}_t\right)^2 \tag{19}$$

where $r_t$ and $\hat{r}_t$ are actual returns and predicted returns, respectively, and $n$ is the length of the series.

Mean Absolute Error (MAE).
MAE is a quantity used to measure how close forecasts are to the eventual outcomes. MAE measures the average magnitude of the errors in a set of forecasts, without considering their direction. The mean absolute error is given in the following equation:

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|r_t - \hat{r}_t\right| \tag{20}$$

where $r_t$ and $\hat{r}_t$ are actual returns and predicted returns, respectively, and $n$ is the length of the series.
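Both metrics are short enough to sketch directly. The functions below follow the verbal definitions above (mean of squared errors, mean of absolute errors); the function names and the sample values are illustrative only.

```python
def mean_square_error(actual, predicted):
    """Arithmetic mean of the squared differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((r - p) ** 2 for r, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Arithmetic mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(actual, predicted)) / len(actual)


# Toy values, for illustration only.
actual = [0.02, -0.01, 0.03]
predicted = [0.01, 0.0, 0.03]
mse = mean_square_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(mse, mae)
```

Because MSE squares each error, it penalises large misses more heavily than MAE does, which is why the two metrics can rank predictors differently.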

Application of HIPM on Stock Market Data
In order to verify the performance of the HIPM predictor, real-world stock data has been used for the experiments. The stock data of three different information technology companies have been used here. These companies are given in Table 1. Stock data of the above three companies has been obtained from Bombay Stock Exchange, India (http://www.bseindia.com/). Daily adjusted closing prices of the three stocks have been taken from 14-05-2013 to 30-12-2013, and returns of 164 days (15-05-2013 to 30-12-2013) are calculated using (1). Predictions were obtained using ESM and RNN.

Predictions Using ESM.
Initially, predictions are obtained using a linear prediction model; here, ESM has been chosen for the purpose. The value of the smoothing factor $\alpha$ in ESM has been obtained using the following optimization model:

$$\min_{\alpha}\; \frac{1}{n}\sum_{t=1}^{n}\left(r_t - \hat{r}_{\mathrm{ESM}(t)}\right)^2 \quad \text{subject to} \quad 0 < \alpha < 1 \tag{21}$$

where $r_t$ is the actual return, $\hat{r}_{\mathrm{ESM}(t)}$ is the prediction obtained from ESM, $\alpha$ is the smoothing factor, and $n$ is the length of the historical series. The smoothing factor $\alpha$ is associated with the term $\hat{r}_{\mathrm{ESM}(t)}$ as shown in (2). Thus, (21) is an objective function which minimizes the MSE of the predictions obtained from exponential smoothing. Its constraint guarantees that the value of the smoothing factor $\alpha$ lies between 0 and 1. As a linear prediction model, ESM did not produce satisfactory predictions, resulting in high prediction error.
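The text does not say which solver was used to find the smoothing factor, so the sketch below substitutes a plain grid search over candidate values strictly between 0 and 1 and keeps the one with the lowest in-sample MSE. The grid resolution, the initialisation of the first prediction with the first observation, and the function names are all assumptions made for illustration.

```python
def esm_mse(returns, alpha):
    """In-sample MSE of one-step-ahead exponential smoothing predictions.

    Assumption: the first prediction equals the first observation, so
    errors are measured from the second observation onward.
    """
    pred = returns[0]
    squared = []
    for r in returns[1:]:
        squared.append((r - pred) ** 2)
        pred = pred + alpha * (r - pred)
    return sum(squared) / len(squared)


def best_alpha(returns, grid=None):
    """Grid search for the smoothing factor that minimises esm_mse."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
    return min(grid, key=lambda a: esm_mse(returns, a))


# Toy return series, for illustration only.
returns = [0.01, 0.03, -0.02, 0.015, 0.0, 0.02, -0.01]
alpha = best_alpha(returns)
print(alpha, esm_mse(returns, alpha))
```

A grid search is crude but makes the constraint explicit: every candidate lies inside the open interval, so no separate feasibility check is needed.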

Predictions Using RNN.
After the ESM predictions were obtained and the series of errors calculated, these errors were given to the RNN by means of AR-MRNN($p$, $k$) ((14) to (16)). AR-MRNN(6, 1) (RNN) was used to obtain stock predictions. The regression order $p = 6$ was chosen by trial and error, as it was observed that the RNN produced less prediction error with this particular regression order. For each stock, the data was divided into two equal parts (50:50). Out of 164 returns, 50% of the data, or 82 returns between 15-05-2013 and 05-09-2013, were kept for training the RNN, and the remaining 50%, that is, 82 returns between 06-09-2013 and 30-12-2013, for testing. Sliding windows of 82 returns each were created. For each window, input-output pairs were calculated using the AR-MRNN(6, 1) method, which results in 76 input-output pairs in each window. In total, 83 sliding windows were formed, each giving a prediction for a future period; thus, 83 windows give 83 future predictions, with the initial window giving the prediction for period $t+1$. By combining the 83 sliding windows, 6308 input-output pairs were obtained. These sliding windows were given to the RNN, which was trained in a supervised manner; the procedure was repeated for all stocks.
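The window counts reported above can be checked with a short sketch. It assumes the windows advance one return at a time over the full series of 164 returns, which is not stated explicitly but is consistent with the reported total of 83 windows; the figure of 76 pairs per window is taken from the text as given rather than derived here.

```python
def sliding_windows(series, size):
    """All contiguous windows of the given size, advancing one step at a time."""
    return [series[i:i + size] for i in range(len(series) - size + 1)]


# Placeholder values standing in for 164 daily returns.
returns = list(range(164))
windows = sliding_windows(returns, size=82)

pairs_per_window = 76  # as reported in the text for AR-MRNN(6, 1)
total_pairs = len(windows) * pairs_per_window
print(len(windows), total_pairs)
```

Under this reading, the first window covers the training half and predicts the first test return, while the last window covers the test half and predicts the period immediately after the data ends.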
In the chosen RNN model, there are 16 neurons in the hidden layer with a sigmoid activation function and one output neuron in the output layer, also with a sigmoid activation function. An error threshold of MSE = 0.0002 was preset for the RNN, meaning that the RNN was considered to have converged only after the average error fell below the threshold. It took the RNN over 10,000 epochs for each stock to fall below the preset error.
Figures 3 and 4 show the prediction output of HIPM (between 06-09-2013 and 31-12-2013) via the multiplicative method and the summation method for stock 2 and stock 3, respectively. The performance of the HIPM predictor can be better judged from Table 2, which displays the values of the error metrics obtained. As seen, the multiplicative method outperforms the summation method in terms of lower prediction error.

Conclusions
A new and promising approach for prediction of stock returns is presented in this paper. A hybrid intelligent prediction model is developed by combining predictions obtained from a linear prediction model and a nonlinear model. The linear model chosen is the exponential smoothing model, while the autoregressive moving reference neural network is chosen as the nonlinear model. This is a new approach wherein errors are fed into a neural network so as to obtain minimized errors. Initially, predictions of stock returns are obtained using the exponential smoothing model and the prediction errors are calculated. The autoregressive moving reference method is used to calculate input-output pairs for the errors just obtained. These errors are fed into a recurrent neural network, and the network learns using the backpropagation algorithm in a supervised manner. Finally, the prediction of stock returns is calculated via two methods, the summation method and the multiplicative method. Based on the results, it is observed that the proposed model is able to detect the nonlinear patterns of the data well and that the results are satisfactory. Input to the neural network is given as differences rather than original values. The network thus needs to find smaller weights, which improves its prediction performance. The performance of the proposed hybrid model can be further improved and applied in other areas too; this is an important avenue for future research.

Table 2: Values of error metrics for HIPM.