The linear multiple kernel learning model has been used for predicting financial time series. However,
Forecasting the future values of financial time series is an appealing yet difficult task in the modern business world. As explained by Deboeck and Yaser [
In comparison with the previous models, SVR with a single kernel function can exhibit better prediction accuracy because it embodies the structural risk minimization principle, which balances the training error against the capacity of the regression model [
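As a concrete point of reference, a single-kernel ε-SVR is available in off-the-shelf libraries. The sketch below uses scikit-learn's SVR with an RBF kernel on synthetic data; the kernel choice and every hyperparameter value are illustrative assumptions, not taken from the paper.

```python
# Minimal single-kernel epsilon-SVR baseline; all values are illustrative.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))                 # stand-in feature vectors
y = X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + 0.05 * rng.standard_normal(200)

# A single RBF kernel; C, epsilon, and gamma jointly control the trade-off
# between training error and model capacity mentioned above.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=0.1)
model.fit(X[:150], y[:150])
rmse = np.sqrt(np.mean((model.predict(X[150:]) - y[150:]) ** 2))
print(f"held-out RMSE: {rmse:.3f}")
```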
In recent years there has been considerable interest in designing principled regression algorithms over multiple cues, based on the intuitive notion that using more features should lead to better performance and a lower generalization error. When the right choice of features is unknown, learning linear combinations of multiple kernels is an appealing strategy. This approach, together with an optimization process for the kernel weights, is called multiple kernel learning (MKL). A first step towards a more realistic model of MKL was achieved by Lanckriet et al. [
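The central object in linear MKL is a convex combination of base kernels, $k(\mathbf{x},\mathbf{x}') = \sum_{m} \beta_m k_m(\mathbf{x},\mathbf{x}')$ with $\beta_m \ge 0$ and $\sum_m \beta_m = 1$. A minimal sketch of such a combined Gram matrix follows; the particular base kernels and weights are illustrative assumptions.

```python
# A linear MKL kernel: weighted sum of base Gram matrices.
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

def combined_gram(X, Z, betas):
    """Convex combination of base kernels (betas >= 0, sum to 1)."""
    base_kernels = [
        lambda A, B: rbf_kernel(A, B, gamma=0.1),
        lambda A, B: polynomial_kernel(A, B, degree=2),
        linear_kernel,
    ]
    assert len(betas) == len(base_kernels) and abs(sum(betas) - 1.0) < 1e-9
    return sum(b * k(X, Z) for b, k in zip(betas, base_kernels))

X = np.random.default_rng(1).standard_normal((10, 5))
K = combined_gram(X, X, betas=[0.5, 0.3, 0.2])
# A convex combination of positive semidefinite kernels is again PSD,
# so K is a valid kernel matrix.
print(np.linalg.eigvalsh(K).min() > -1e-8)
```

With the weights fixed, K can be passed to any solver that accepts precomputed kernels (e.g., SVR(kernel="precomputed")); MKL's task is to learn the weights as well.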
In this paper, a multiple kernel learning framework is established for learning and predicting the stock prices. We present a regression model for the future values of stock prices, that is,
The rest of this paper is organized as follows. Section
In this section, the idea of
Let
The SVR model usually uses a single mapping function
An alternative approach to the previous equations has been considered by other researchers. For example, Zien and Ong [
It can be shown (see the Appendix for details) that the dual of (
In the first stage, the variables
According to
Set the
In the second stage, we give a chunking-based training algorithm (Algorithm
1. Input
2. Iterate:
   (1) Select
   (2) Store with respect to the selected variables.
   (3) Update the gradient
   (4) Compute the quadratic terms
   (5)
   (6) If else break endif
3. Output
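To make the skeleton above concrete, the following sketch shows the generic shape of a chunking/working-set solver for a simplified SVR dual (no ε-tube, no bias term, working set of size one): repeatedly select the variable that most violates the KKT conditions, minimize exactly along that coordinate, and update the gradient incrementally. All names and the selection rule are assumptions, not the paper's exact algorithm.

```python
# Working-set (chunking-style) solver for a simplified SVR dual:
#   min_theta 0.5 * theta^T K theta - y^T theta   s.t.  -C <= theta_i <= C.
import numpy as np

def chunked_dual_solver(K, y, C=1.0, tol=1e-6, max_iter=10_000):
    n = len(y)
    theta = np.zeros(n)
    grad = -np.asarray(y, dtype=float)    # gradient K @ theta - y at theta = 0
    for _ in range(max_iter):
        # Select the coordinate with the largest projected-gradient violation.
        viol = np.where((theta > -C) & (grad > 0), grad,
                        np.where((theta < C) & (grad < 0), -grad, 0.0))
        i = int(np.argmax(viol))
        if viol[i] < tol:                 # all KKT conditions hold: stop
            break
        # Exact minimization along coordinate i, clipped to the box.
        new_i = np.clip(theta[i] - grad[i] / K[i, i], -C, C)
        grad += (new_i - theta[i]) * K[:, i]   # cheap incremental gradient update
        theta[i] = new_i
    return theta
```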
In every iteration, the inner subproblem (
The interleaved optimization algorithm is depicted in Algorithm
Assume the original values of
Within each iteration, the procedure is standard for chunking-based SVR solvers and is carried out by
When the duality gap falls below a prespecified threshold, that is,
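Putting the pieces together, the interleaved scheme alternates an α-step (solve the SVR subproblem for the current combined kernel) with a β-step (update the kernel weights), stopping once progress stalls. In the sketch below, which reuses chunked_dual_solver from the earlier sketch and takes grams as a list of precomputed base Gram matrices, the β-update is a simple normalized heuristic and the stopping rule is a proxy (change in the weights) rather than the duality gap itself; both are assumptions, not the paper's exact rules.

```python
# Interleaved MKL optimization: alternate the alpha-step and the beta-step.
import numpy as np

def interleaved_mkl(grams, y, C=1.0, tol=1e-4, max_outer=50):
    m = len(grams)
    betas = np.full(m, 1.0 / m)                   # start from uniform weights
    theta = np.zeros(len(y))
    for _ in range(max_outer):
        K = sum(b * G for b, G in zip(betas, grams))
        theta = chunked_dual_solver(K, y, C=C)    # alpha-step (inner solver)
        # beta-step: weight each kernel by the "energy" theta puts on it.
        s = np.array([theta @ G @ theta for G in grams])
        new_betas = s / s.sum() if s.sum() > 0 else betas
        if np.abs(new_betas - betas).sum() < tol: # proxy stopping criterion
            betas = new_betas
            break
        betas = new_betas
    return betas, theta
```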
In this section, two experiments on real financial time series are carried out to assess the performance of
Firstly, we compare the performance of
The data sets for the first experiment.
Dataset | Training | Validating | Testing |
---|---|---|---|
data1 | 2003/1–2006/12 | 2007/1–2007/3 | 2007/4–2007/6 |
data2 | 2003/4–2007/3 | 2007/4–2007/6 | 2007/7–2007/9 |
data3 | 2003/7–2007/6 | 2007/7–2007/9 | 2007/10–2007/12 |
According to [
Based on the preceding discussion, the input variables can be defined as
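Inputs of this kind are typically windows of past closing prices mapped onto the next value. A minimal sketch follows, where the window length is an assumption:

```python
# Turn a price series into (lagged window -> next value) training pairs.
import numpy as np

def make_supervised(prices, lag=5):
    """X[t] = (p_t, ..., p_{t+lag-1}),  y[t] = p_{t+lag}."""
    prices = np.asarray(prices, dtype=float)
    X = np.stack([prices[i:i + lag] for i in range(len(prices) - lag)])
    y = prices[lag:]
    return X, y
```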
There are three parameters that should be determined in advance for SKSVR,
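For a single-kernel SVR these would typically be the penalty C, the tube width ε, and the kernel parameter (e.g., the RBF width γ); that reading, and the grid below, are assumptions. A standard way to fix them in advance is to minimize RMSE on the validation period:

```python
# Choose (C, epsilon, gamma) on the validation split by minimizing RMSE.
import itertools
import numpy as np
from sklearn.svm import SVR

def select_hyperparams(X_tr, y_tr, X_val, y_val):
    best = None
    for C, eps, gamma in itertools.product(
            [1.0, 10.0, 100.0], [0.001, 0.01, 0.1], [0.01, 0.1, 1.0]):
        m = SVR(kernel="rbf", C=C, epsilon=eps, gamma=gamma).fit(X_tr, y_tr)
        rmse = np.sqrt(np.mean((m.predict(X_val) - y_val) ** 2))
        if best is None or rmse < best[0]:
            best = (rmse, C, eps, gamma)
    return best   # (validation RMSE, C, epsilon, gamma)
```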
The comparison of RMSE values between SKSVR and
| Methods | Data1 | Data2 | Data3 |
|---|---|---|---|
| SKSVR | 0.179 | 0.183 | 0.197 |
| | 0.177 | 0.186 | |
| | 0.163 | | |
| | 0.189 | | |
| | 0.166 | 0.179 | |
Forecasting performance of SKSVR with different hyperparameters.
For
Secondly, we compare the performance of
The data sets for the second experiment.
Dataset | Training | Validating | Testing |
---|---|---|---|
D-I | 2008/1–2010/12 | 2011/1–2011/3 | 2011/4–2011/6 |
D-II | 2008/4–2011/3 | 2011/4–2011/6 | 2011/7–2011/9 |
D-III | 2008/7–2011/6 | 2011/7–2011/9 | 2011/10–2011/12 |
We also adopt RMSE (
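For completeness, the RMSE over $n$ test points is the standard

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^{2}},$$

where $y_t$ is the observed closing price and $\hat{y}_t$ its forecast; lower values are better.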
The comparison of RMSE values between
| Methods | D-I | D-II | D-III |
|---|---|---|---|
| | 0.182 | 0.189 | 0.178 |
| | 0.183 | 0.179 | |
| | 0.185 | | |
| | 0.180 | | |
| | 0.190 | 0.191 | |
Forecasting results by
Furthermore, we can use a statistical test proposed by Diebold and Mariano [
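That test works on the loss differential $d_t$ between two forecast error series; under the null hypothesis of equal predictive accuracy, the studentized mean of $d_t$ is asymptotically standard normal. A minimal sketch with squared-error loss follows; the loss choice and the lag window are assumptions.

```python
# Diebold-Mariano statistic for equal predictive accuracy.
import numpy as np

def dm_statistic(e1, e2, h=1):
    """e1, e2: forecast errors of two competing models; h: forecast horizon."""
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2   # loss differential d_t
    n = len(d)
    dbar = d.mean()
    # Long-run variance of dbar from autocovariances up to lag h - 1.
    gamma = [np.sum((d[k:] - dbar) * (d[:n - k] - dbar)) / n for k in range(h)]
    var_dbar = (gamma[0] + 2.0 * sum(gamma[1:])) / n
    return dbar / np.sqrt(var_dbar)                 # ~ N(0, 1) under the null
```

Values of the statistic outside ±1.96 reject equal accuracy at the 5% significance level.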
Loss differential (
Loss differential (
We denote
Asymptotic test.
[Table: asymptotic test statistics for the stock closing prices.]
We briefly mention that the superior performance of
In this paper, an
In this appendix, we detail the dual formulation of
In the following, we construct the Lagrangian of (
Resubstituting the previous equations into the Lagrangian yields the following
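For orientation, in standard single-kernel ε-SVR the corresponding step yields the familiar dual

$$\max_{\alpha,\alpha^{*}}\; -\frac{1}{2}\sum_{i,j}(\alpha_i-\alpha_i^{*})(\alpha_j-\alpha_j^{*})\,k(\mathbf{x}_i,\mathbf{x}_j) - \varepsilon\sum_{i}(\alpha_i+\alpha_i^{*}) + \sum_{i} y_i(\alpha_i-\alpha_i^{*})$$

subject to $\sum_{i}(\alpha_i-\alpha_i^{*})=0$ and $0\le\alpha_i,\alpha_i^{*}\le C$; the multiple-kernel case differs in how the combined kernel enters this expression.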
For standard support vector regression formulations, the hinge loss function can be defined as
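In standard SVR this is the ε-insensitive (hinge-type) loss

$$\ell_{\varepsilon}\bigl(y, f(\mathbf{x})\bigr) = \max\bigl(0,\; |y - f(\mathbf{x})| - \varepsilon\bigr),$$

which is zero inside the ε-tube and grows linearly outside it.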
In the following, we find
For the choice of
Now,
The authors would like to thank the handling editor and the anonymous reviewers for their constructive comments, which led to significant improvement of the paper. This work was partially supported by the National Natural Science Foundation of China under Grant no. 51174236.