Application of Empirical Mode Decomposition with Local Linear Quantile Regression in Financial Time Series Forecasting

This paper forecasts the daily closing prices of stock markets. We propose a two-stage technique that combines the empirical mode decomposition (EMD) with the nonparametric local linear quantile (LLQ) regression method. We use the proposed technique, EMD-LLQ, to forecast two stock index time series. Detailed experiments are carried out in which the EMD-LLQ, EMD, and Holt-Winter methods are compared. The proposed EMD-LLQ model is found to be superior to the EMD and Holt-Winter methods in predicting the stock closing prices.


Introduction
Recent studies indicate that financial markets typically exhibit nonlinear and nonstationary behavior, which makes forecasting financial series with classical techniques difficult. The empirical mode decomposition (EMD) introduced by Huang et al. [1] has emerged as a powerful statistical modeling technique in modern quantitative finance [2,3]. Its capacity to handle nonlinear and nonstationary behavior offers researchers and practitioners an attractive alternative tool. The EMD explains the generation of time series data from a different perspective: based on scale separation, it breaks a signal into a small number of independent intrinsic modes, each with a concrete interpretation. This feature makes the EMD a valuable tool for forecasting financial time series [4]. The current study aims to extract and forecast the trend of two stock markets, namely, the Kuala Lumpur Bursa (KLSE) index and the New Zealand stock market index (NZX50), using the advantages of local linear quantile (LLQ) regression. The proposed method consists of two stages. In the first stage, LLQ is applied to the corrupted and noisy data. The remaining series is then expected to be hidden in the residuals. In the second stage, EMD is applied to the residuals. The final estimate is the sum of the fitted estimates from the LLQ and EMD. The steps for extracting and forecasting the trend with EMD-LLQ and EMD are summarized as follows.
(1) A signal is decomposed by the EMD-LLQ and EMD.
(2) Meaningful intrinsic mode functions (IMFs) (components) are selected using the fast Fourier transform (FFT) (see [5]).
(3) Selected components are added to the residue to obtain the trend.
(4) The Holt-Winter method is applied to the selected components and provides the forecasting results.
The remainder of this paper is organized as follows. Section 2 presents a brief background of the EMD and LLQ. Section 3 introduces the proposed method. Section 4 compares the results of the original EMD algorithm and the new proposed method by forecasting the daily closing prices of two stock markets, namely, the KLSE and the NZX50. Section 5 concludes.

The Scientific World Journal

variance and bias at points near the boundary. This problem has dramatic effects on the results. A variety of works have been reported in the literature to reduce the effects of the boundary problem in traditional EMD. Two-stage or coupling methods are now used extensively to address this problem. For instance, [6] applied a neural network to each IMF to restrain the end effect, and [7] provided an algorithm based on the sigma-pi neural network that extends signals before EMD is applied. Reference [4] proposed a new two-stage algorithm: the signal is first extrapolated at both endpoints through support vector (SV) regression to form the primary expansion signal; this primary signal is then expanded further through extrema mirror expansion, and EMD is performed on the resulting signal to reduce end limitations. All of these methods have been shown to handle the end-point problem well and to achieve higher precision in applications. In this paper we follow [8]. The proposed method, EMD-LLQ (empirical mode decomposition combined with local linear quantile regression), is designed to be a robust version of classical empirical mode decomposition, especially in the presence of edge effect problems.

Empirical Mode Decomposition
The EMD [1] has proven to be a natural extension of and an alternative to traditional methods for analyzing nonlinear and nonstationary signals, such as wavelet methods, Fourier methods, and empirical orthogonal functions [9]. In this section, we briefly describe the EMD algorithm. The EMD decomposes the data into a small number of simpler signals called IMFs. An IMF is a function whose upper and lower envelopes are symmetric and whose number of zero-crossings and number of extrema are equal or differ by at most one [10]. The algorithm for extracting IMFs from a given time series is called sifting, and it consists of the following steps.
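(i) The initial estimate for the residue is set at $r_0(t) = x(t)$, with $h_0(t) = r_{k-1}(t)$ and $i = 1$, and the IMF index is set at $k = 1$.
(ii) The upper envelope $U_{i-1}(t)$ and the lower envelope $L_{i-1}(t)$ of $h_{i-1}(t)$ are constructed by interpolating its local maxima and local minima, respectively.
(iii) The mean of the two envelopes is computed as $m_{i-1}(t) = [U_{i-1}(t) + L_{i-1}(t)]/2$.
(iv) The mean is subtracted from the signal; that is, $h_i(t) = h_{i-1}(t) - m_{i-1}(t)$ and $i = i + 1$. Steps (ii) to (iv) are repeated until $h_i(t)$ becomes an IMF. Hence, the $k$th IMF is given by $\mathrm{IMF}_k(t) = h_i(t)$.
(v) The residue is updated via $r_k(t) = r_{k-1}(t) - \mathrm{IMF}_k(t)$. This residual component is treated as new data and subjected to the previously described process to calculate the next IMF, $\mathrm{IMF}_{k+1}$.
(vi) The previous steps are repeated until the final residual component $r(t)$ becomes a monotonic function. The final estimate of the residue, $\hat{r}(t)$, is subsequently considered.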
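A minimal Python sketch of the sifting procedure is given here for illustration. It is not the implementation used in this paper and rests on several simplifying assumptions: the envelopes are piecewise linear rather than spline based and are simply anchored at the two endpoints, each IMF is obtained with a fixed number of sifting passes (`n_sift`) instead of an explicit IMF test, and the residue is treated as monotonic once it has fewer than two interior extrema. The names `local_extrema`, `envelope`, `sift`, and `emd` are hypothetical.

```python
def local_extrema(h):
    """Return index lists of the interior local maxima and minima of h."""
    maxima, minima = [], []
    for i in range(1, len(h) - 1):
        if h[i - 1] < h[i] >= h[i + 1]:
            maxima.append(i)
        elif h[i - 1] > h[i] <= h[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(h, knots):
    """Piecewise-linear envelope through h at the knots (assumption: the
    paper's envelopes are smoother; endpoints are used as extra knots)."""
    idx = [0] + knots + [len(h) - 1]
    env = []
    for a, b in zip(idx[:-1], idx[1:]):
        for t in range(a, b):
            w = (t - a) / (b - a)
            env.append((1 - w) * h[a] + w * h[b])
    env.append(h[-1])
    return env


def sift(r, n_sift=10):
    """Sifting loop: repeatedly subtract the mean of the two envelopes."""
    h = list(r)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [hi - 0.5 * (u + l) for hi, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=8, n_sift=10):
    """Return (list of IMFs, final residue) for the sequence x."""
    r = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) + len(minima) < 2:  # proxy for a monotonic residue
            break
        imf = sift(r, n_sift)
        imfs.append(imf)
        r = [ri - ci for ri, ci in zip(r, imf)]  # residue update
    return imfs, r
```

By construction, the IMFs and the final residue add back up to the input series, which is a convenient sanity check for any sifting implementation. The crude endpoint anchoring used here is exactly the kind of boundary treatment that produces end effects.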
Several methods have been presented to extract trends from a time series. Freehand and least squares methods are the commonly used techniques; the former depends on the experience of users, whereas the latter is difficult to use when the original series is very irregular [11]. The EMD is another effective method for extracting trends [6].

Local Linear Quantile (LLQ) Regression
The seminal study of Koenker and Bassett [12] introduced parametric quantile regression, which is considered an alternative to the classical regression in both parametric and nonparametric fields. Numerous models for the nonparametric approach have been introduced in statistical literature, such as the locally polynomial quantile regression by Chaudhuri [13] and the kernel methods by Koenker et al. [14]. In this paper, we adopt the LLQ regression employed by Yu and Jones [15].
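Let $\{(X_i, Y_i),\ i = 1, \ldots, n\}$ be bivariate observations. To estimate the $\tau$th conditional quantile function of the response $Y$ given $X = x$, we define

$$q_\tau(x) = Q_Y(\tau \mid X = x).$$

Let $K$ be a positive symmetric unimodal kernel function, and consider the following weighted quantile regression problem:

$$\min_{(\beta_0, \beta_1) \in \mathbb{R}^2} \sum_{i=1}^{n} w_i(x)\, \rho_\tau\bigl(Y_i - \beta_0 - \beta_1 (X_i - x)\bigr), \tag{1}$$

where $w_i(x) = K((X_i - x)/h)/h$, $h$ is the bandwidth, and $\rho_\tau(u) = u\,(\tau - I(u < 0))$ is the check loss of quantile regression. Because the covariate observations are centered at the point $x$, the estimate of $q_\tau(x)$ is simply $\hat{\beta}_0$, the first component of the minimizer of (1), and $\hat{\beta}_1$ estimates the slope of the function $q_\tau$ at the point $x$.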
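The weighted quantile regression problem can be illustrated with a small Python sketch. It assumes a Gaussian kernel and solves the minimization at a single point `x0` by brute force, using the fact that a piecewise-linear objective of this form attains its minimum at a line through two of the data points. The function names are hypothetical, and the search is far too slow for real data; it is meant only to make the objective concrete.

```python
import math


def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def gaussian_kernel(u):
    """Standard normal density, one positive symmetric unimodal kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def llq_fit(xs, ys, x0, tau, h):
    """Local linear quantile fit at x0; returns (beta0_hat, beta1_hat)."""
    w = [gaussian_kernel((xi - x0) / h) / h for xi in xs]

    def objective(b0, b1):
        return sum(
            wi * check_loss(yi - b0 - b1 * (xi - x0), tau)
            for wi, xi, yi in zip(w, xs, ys)
        )

    best = None
    n = len(xs)
    # Brute force: try every line through two points with distinct x.
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue
            b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
            b0 = ys[i] - b1 * (xs[i] - x0)  # intercept centered at x0
            val = objective(b0, b1)
            if best is None or val < best[0]:
                best = (val, b0, b1)
    if best is None:
        raise ValueError("need at least two observations with distinct x")
    return best[1], best[2]
```

The first returned value estimates the conditional quantile at `x0` and the second estimates its slope there. In practice a linear-programming or iterative solver would replace the double loop.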

Bandwidth Selection
The practical performance of $\hat{q}_\tau(x)$ strongly depends on the selected bandwidth parameter. We adopt the strategy of Yu and Jones [15]. In summary, we employ the automatic bandwidth selection strategy for smoothing conditional quantiles as follows.
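(1) We use ready-made and sophisticated methods to select $h_{\mathrm{mean}}$; we employ [16], which explored a "direct plug-in" bandwidth selection procedure that relies on the asymptotically optimal bandwidth

$$h_{\mathrm{mean}} = C_1(K) \left[ \frac{\sigma^2 (b - a)}{\theta_{22}\, n} \right]^{1/5},$$

where $C_1(K) = \{R(K)/\mu_2(K)^2\}^{1/5}$, with $R(K) = \int K(u)^2\,du$ and $\mu_2(K) = \int u^2 K(u)\,du$, $[a, b]$ is the support of the design density $f$, and for later use we have introduced the array

$$\theta_{rs} = \int m^{(r)}(x)\, m^{(s)}(x)\, f(x)\,dx, \quad r, s \ge 0,\ r + s \text{ even}. \tag{4}$$

Again for later use, we write down the following estimator of $\sigma^2$ with a bandwidth $\lambda$:

$$\hat{\sigma}^2(\lambda) = \nu^{-1} \sum_{i=1}^{n} \{Y_i - \hat{m}(X_i; \lambda)\}^2.$$

This is simply a normalized residual sum of squares. The normalizing quantity $\nu$, sometimes known as the degrees of freedom, is given by $\nu = n - 2\sum_{i} w_{ii} + \sum_{i}\sum_{j} w_{ij}^2$, where $w_{ij}$ is the weight that $Y_j$ receives in the local linear fit at $X_i$. Its presence guarantees $E(\hat{\sigma}^2 \mid X_1, \ldots, X_n) = \sigma^2$ whenever $m(x)$ is either constant or linear [17].
(2) We use

$$h_\tau = h_{\mathrm{mean}} \left\{ \frac{\tau (1 - \tau)}{\phi(\Phi^{-1}(\tau))^2} \right\}^{1/5}$$

to obtain all of the other $h_\tau$ from $h_{\mathrm{mean}}$. Here $\phi$ and $\Phi$ are the standard normal density and distribution functions, and $h_{\mathrm{mean}}$ is a bandwidth parameter for regression mean estimation, for which various methods exist. This procedure obtains identical bandwidths for the $\tau$ and $(1 - \tau)$ quantiles.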
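The conversion from the mean-regression bandwidth to a quantile-specific bandwidth is easy to illustrate. The sketch below assumes the Yu and Jones rule in the form $h_\tau = h_{\mathrm{mean}} \{\tau(1-\tau)/\phi(\Phi^{-1}(\tau))^2\}^{1/5}$ and takes $h_{\mathrm{mean}}$ as given; the function name `quantile_bandwidth` is hypothetical.

```python
from statistics import NormalDist


def quantile_bandwidth(h_mean, tau):
    """Rescale a mean-regression bandwidth for the tau-th quantile.

    Assumes h_tau = h_mean * {tau (1 - tau) / phi(Phi^{-1}(tau))^2}^(1/5),
    with phi and Phi the standard normal density and distribution function.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(tau)   # Phi^{-1}(tau)
    phi = std_normal.pdf(z)       # phi(Phi^{-1}(tau))
    return h_mean * (tau * (1.0 - tau) / phi ** 2) ** 0.2
```

Because both $\tau(1-\tau)$ and $\phi(\Phi^{-1}(\tau))$ are symmetric about $\tau = 0.5$, the function returns the same bandwidth for $\tau$ and $1 - \tau$, and the bandwidth grows as $\tau$ moves toward the tails, where fewer observations carry information about the quantile.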

Proposed Method
The proposed method consists of two stages that automatically decrease the boundary effects of EMD [8]. In the first stage, LLQ, which is considered an excellent boundary treatment [18], is applied to the corrupted and noisy data. The remaining series is then expected to be hidden in the residuals. In the second stage, EMD is applied to the residuals. The final estimate is the sum of the fitted estimates from LLQ and EMD. Compared with EMD alone, this combination obtains more accurate estimates.
This section elaborates the proposed method, EMD-LLQ. The basic idea behind the proposed method is to estimate the underlying function $f$ with the sum of a set of EMD functions, $f_{\mathrm{EMD}}$, and an LLQ function, $f_{\mathrm{LLQ}}$. That is,

$$f = f_{\mathrm{EMD}} + f_{\mathrm{LLQ}}.$$

We estimate the two components $f_{\mathrm{EMD}}$ and $f_{\mathrm{LLQ}}$ to obtain our proposed estimate, $\hat{f}_{\mathrm{EMD.LLQ}}$, through the following steps.
(1) The LLQ is applied to the corrupted and noisy data $y_i$, and the trend estimate $\hat{f}_{\mathrm{LLQ}}$ is subsequently obtained.
(2) The residuals remaining from the LLQ fit are computed; that is, $e_i = y_i - \hat{f}_{\mathrm{LLQ}}(x_i)$.
(3) The EMD is applied to $e_i$, given that the remaining series is expected to be hidden in the residuals. This step is accomplished by performing steps (i) to (vi) of the EMD algorithm, which yields $\hat{f}_{\mathrm{EMD}}$.
(4) The final estimate is the sum of the fitted estimates from LLQ and EMD and is as follows:

$$\hat{f}_{\mathrm{EMD.LLQ}} = \hat{f}_{\mathrm{EMD}} + \hat{f}_{\mathrm{LLQ}}.$$
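The data flow of the two-stage procedure can be sketched in Python as follows. This is a structural illustration only: `running_median` is a crude stand-in for an LLQ fit at the median, `toy_decompose` is a stand-in for EMD that returns one fast component and a smooth residue, and the optional `keep` argument for dropping components is an assumption of this sketch. None of these are the estimators used in this paper.

```python
from statistics import median


def running_median(y, half_width=2):
    """Stand-in for stage one (NOT local linear quantile regression)."""
    n = len(y)
    return [
        median(y[max(0, i - half_width): min(n, i + half_width + 1)])
        for i in range(n)
    ]


def toy_decompose(e):
    """Stand-in for EMD: one fast component plus a smooth residue."""
    n = len(e)
    smooth = []
    for i in range(n):
        window = e[max(0, i - 1): min(n, i + 2)]
        smooth.append(sum(window) / len(window))
    fast = [ei - si for ei, si in zip(e, smooth)]
    return [fast], smooth


def two_stage_estimate(y, fit_trend, decompose, keep=None):
    """Stage one fit, residuals, decomposition of residuals, final sum."""
    f_llq = fit_trend(y)                              # stage-one estimate
    e = [yi - fi for yi, fi in zip(y, f_llq)]         # residuals
    imfs, residue = decompose(e)                      # stage two
    f_emd = list(residue)
    for k, imf in enumerate(imfs):
        if keep is None or k in keep:                 # optional selection
            f_emd = [a + b for a, b in zip(f_emd, imf)]
    return [a + b for a, b in zip(f_llq, f_emd)]      # final estimate
```

When every component is kept, the two-stage estimate reproduces the input exactly, so any smoothing in this sketch comes from dropping components through `keep`. Passing real LLQ and EMD routines as `fit_trend` and `decompose` leaves the control flow unchanged.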

Experiment Analysis and Results
In this section, we consider the daily closing prices of two stock markets, namely, KLSE and NZX50, from December 3, 2007, to December 6, 2013 (see Figure 1). The last 10, 30, and 50 days of each of the KLSE and NZX50 stock indices are forecasted based on the past sequences. The selection of these two indices aims to qualitatively and culturally compare the time series of two different markets.
The data used in this study are collected from the website http://in.finance.yahoo.com/. We analyze the two indices based on the EMD-LLQ and EMD, in combination with the FFT and the Holt-Winter methods. The approach consists of several steps. First, we decompose the daily closing prices of the stock markets into a finite number of components called IMFs and one residue. Second, we select significant components by applying the FFT to each IMF. Third, we add the significant components obtained in the second step to the residue to acquire the trend. Finally, we employ the Holt-Winter method to forecast the trend.
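The final forecasting step can be illustrated with a short Python sketch. It implements Holt's linear-trend exponential smoothing, which is the non-seasonal special case of the Holt-Winter family; whether a seasonal term was used in this study is not stated here, so the choice is an assumption. The function name `holt_forecast`, the smoothing constants `alpha` and `beta`, and the initialization from the first two observations are also illustrative assumptions.

```python
def holt_forecast(trend_series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` steps ahead with Holt's linear-trend smoothing.

    alpha smooths the level and beta smooths the slope; both values here
    are illustrative defaults, not tuned parameters.
    """
    if len(trend_series) < 2:
        raise ValueError("need at least two observations")
    level = trend_series[0]
    slope = trend_series[1] - trend_series[0]
    for obs in trend_series[1:]:
        prev_level = level
        # Update the level from the new observation and the one-step forecast.
        level = alpha * obs + (1 - alpha) * (level + slope)
        # Update the slope from the change in level.
        slope = beta * (level - prev_level) + (1 - beta) * slope
    return [level + (k + 1) * slope for k in range(horizon)]
```

For example, `holt_forecast(trend, 10)` would produce a 10-day-ahead forecast from an extracted trend series `trend`. On an exactly linear input the method continues the line, which makes a simple correctness check.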

Conclusion and Future Research
We propose an EMD-LLQ model for forecasting future prices by considering the past sequences of daily stock prices. The EMD-LLQ method is a new two-stage forecasting method that combines the EMD and LLQ algorithms. The effectiveness