A CNN-LSTM-Based Model to Forecast Stock Prices



Introduction
The trend of stock prices has always been regarded as a very important problem in the economic field [1]. Stock prices are affected by various internal and external factors, such as the domestic and foreign economic environment, the international situation, industry prospects, the financial data of listed companies, and stock market operation. Thus, forecasting methods differ in emphasis [2,3]. The traditional analysis method is based on economics and finance and mainly uses fundamental analysis and technical analysis. On the one hand, fundamental analysis pays more attention to the intrinsic value of stocks and qualitatively analyzes the external factors that affect the stock, such as interest rates, exchange rates, inflation, industrial policy, the finances of listed companies, international relations, and other economic and political factors. On the other hand, technical analysis mainly focuses on the direction of the stock price, trading volume, and investors' psychological expectations, and it primarily analyzes the index trajectory of individual stocks or the whole market by using K-line charts and other tools. At present, traditional fundamental analysis and technical analysis are still the methods most commonly employed by many organizations and individual investors [4,5]. The accuracy of the traditional fundamental analysis method is hardly convincing. The reason is not only that the influencing factors act over a long-term cycle but also that the forecasting results depend heavily on the professional quality of analysts. As a financial time series, stock data have the characteristics of a random walk [6].
Based on statistics and probability theory, some scholars use linear time series forecasting models to predict the short-term stock price from a large amount of long-term data, such as vector autoregression (VAR) [7], Bayesian vector autoregression (BVAR) [8], the autoregressive integrated moving average model (ARIMA) [9], and the generalized autoregressive conditional heteroskedasticity model (GARCH) [10]. However, the accuracy of using a time series model alone is questioned because financial time series are uncertain and noisy and because the relationship between independent and dependent variables is prone to change dynamically over time, which limits the further application and extension of such models [11].
Predicting the stock price trend with a single linear time series forecasting model or a single neural network model has certain limitations. At present, combining the advantages of various methods into hybrid methods is the development trend of deep learning for financial time series [12]. Therefore, in order to make the best of the time series characteristics of the data, deeply mine the data features, and improve the accuracy of stock price forecasting, this paper proposes a CNN-LSTM-based method for forecasting the stock closing price of the next day. The method combines the advantages of convolutional neural networks (CNN), which can extract effective features from the data, and long short-term memory (LSTM), which can not only find the interdependence of data in time series but also automatically detect the best mode suitable for the relevant data, and it can thereby effectively improve the accuracy of stock price forecasting. The CNN-LSTM model uses CNN to extract the features of the input time series data and uses LSTM to predict the stock closing price on the next day. In order to verify the effectiveness of the model, this paper uses the daily transaction data of 7127 trading days from July 1, 1991, to August 31, 2020, in which the data of the first 6627 trading days form the training set and the data of the last 500 trading days form the test set.

Related Work
At present, the financial market is a noisy, nonparametric dynamic system, and there are two main kinds of forecasting methods for stock prices: traditional analysis methods and machine learning methods [13]. Traditional econometric methods or parametric equations are not suitable for analyzing complex, high-dimensional, and noisy financial series data. In recent years, neural networks have become a hot research direction in the field of stock forecasting because they can extract data features from a large amount of high-frequency raw data without relying on prior knowledge. In 1988, White used a neural network to predict IBM stock, but the experimental results were not good [14]. In 2003, Zhang used a neural network and the autoregressive integrated moving average model (ARIMA), respectively, to forecast stocks. The experimental results show that the neural network has obvious advantages in nonlinear data forecasting, but the accuracy still needs to be improved [15]. In 2005, Sun et al. proposed a time series forecasting method based on neural networks. This method combines the optimal partition algorithm (OPA) and the radial basis function (RBF) neural network [16]. In 2014, Adhikari et al. proposed a method combining random walk (RW) and artificial neural network (ANN) to predict four financial time series, and the results showed a certain improvement in forecasting accuracy [17]. In 2018, Zhang et al. proposed a network structure for stock price forecasting based on the LM-BP neural network, which overcomes the slow training speed and low precision of the traditional BP neural network training algorithm [18]. In 2018, the experimental results of Hu et al. showed that a convolutional neural network can predict time series and that deep learning is more suitable for solving time series problems. However, because CNN is more commonly used for image recognition and feature extraction, the forecasting accuracy of CNN alone is relatively low [19]. In 2020, Kamalov used MLP, CNN, and LSTM to forecast the stock prices of four major US public companies. The experimental results showed that these three methods performed better than similar studies that forecast the direction of price change [20]. In 2020, Xue et al. established a high-precision short-term forecasting model of financial market time series based on the LSTM deep neural network and compared it with the BP neural network, the traditional RNN, and the improved LSTM deep neural network. The results showed that the LSTM deep neural network has high forecasting accuracy and can effectively predict the time series of the stock market [21].

The main contributions of this paper are as follows:
(1) By analyzing the correlation and time series of stock price data, a new deep learning method (CNN-LSTM) is proposed to predict the stock price. In this method, CNN is used to extract the time features of the data, and LSTM is used for data forecasting. The method makes full use of the time sequence of stock price data to obtain more reliable forecasts.
(2) By comparing the evaluation indexes of CNN-LSTM with those of the multilayer perceptron (MLP), CNN, RNN, LSTM, and CNN-RNN, it is shown that CNN-LSTM has high forecasting accuracy and is more suitable for stock price forecasting.

CNN-LSTM Model.
CNN tends to attend to the most salient features in its field of view, so it is widely used in feature engineering. LSTM unfolds along the time sequence, so it is widely used for time series. Based on these characteristics of CNN and LSTM, a stock forecasting model based on CNN-LSTM is established. The model structure diagram is shown in Figure 1; the main structure consists of CNN and LSTM and includes an input layer, a one-dimensional convolution layer, a pooling layer, an LSTM hidden layer, and a fully connected layer.

CNN.
CNN is a network model proposed by LeCun et al. in 1998 [22]. CNN is a kind of feedforward neural network that performs well in image processing and natural language processing [23], and it can also be applied effectively to time series forecasting. The local perception and weight sharing of CNN greatly reduce the number of parameters, thus improving the efficiency of model learning [24]. CNN is mainly composed of two parts: the convolution layer and the pooling layer. Each convolution layer contains a plurality of convolution kernels, and its calculation is shown in formula (1). The convolution operation extracts the features of the data, but the extracted features have very high dimensions. To solve this problem and reduce the cost of training the network, a pooling layer is added after the convolution layer to reduce the feature dimension:

$$l_t = \tanh(x_t * k_t + b_t), \tag{1}$$

where $l_t$ represents the output value after convolution, tanh is the activation function, $x_t$ is the input vector, $k_t$ is the weight of the convolution kernel, and $b_t$ is the bias of the convolution kernel.
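As an illustration of formula (1) followed by pooling, the following is a minimal pure-Python sketch on a single univariate series. It is not the authors' implementation: the function names `conv1d_tanh` and `max_pool1d`, the kernel values, and the toy input are assumptions, and the sketch uses an unpadded ("valid") convolution with non-overlapping max pooling, which the paper does not specify.

```python
import math

def conv1d_tanh(x, kernel, bias):
    """Slide one kernel over x and apply tanh: l_t = tanh(x_t * k_t + b_t).

    Unpadded ("valid") convolution, so the output is shorter than the input.
    """
    width = len(kernel)
    out = []
    for t in range(len(x) - width + 1):
        window = x[t:t + width]
        s = sum(w * v for w, v in zip(kernel, window)) + bias
        out.append(math.tanh(s))
    return out

def max_pool1d(features, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i:i + pool_size])
            for i in range(0, len(features) - pool_size + 1, pool_size)]

# Toy standardized series and a hypothetical two-tap kernel.
prices = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9]
features = conv1d_tanh(prices, kernel=[0.5, -0.5], bias=0.0)
pooled = max_pool1d(features, pool_size=2)

print(len(prices), len(features), len(pooled))
```

The pooled sequence is shorter than the convolution output, which is the dimension reduction the pooling layer is meant to provide.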

LSTM.
LSTM is a network model proposed by Schmidhuber et al. in 1997 [25]. It was designed to solve the long-standing problems of gradient explosion and gradient disappearance in RNN [26,27]. It has been widely used in speech recognition, sentiment analysis, and text analysis, as it has its own memory and can make relatively accurate forecasts [28,29]. In recent years, it has also been adopted in the field of stock market forecasting [30][31][32]. A standard RNN has only one repeating module, whose internal structure is simple; it is usually a tanh layer. In LSTM, however, the repeating module contains four layers similar to the standard RNN layer, and they interact in a special manner [33,34]. The LSTM memory cell consists of three parts: the forget gate, the input gate, and the output gate, as shown in Figure 2. The LSTM calculation process is as follows:

(1) The output value of the last moment and the input value of the current time are input into the forget gate, and the output value of the forget gate is obtained after calculation, as shown in the following formula:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f), \tag{2}$$

where the value range of $f_t$ is (0,1), $W_f$ is the weight of the forget gate, $b_f$ is the bias of the forget gate, $x_t$ is the input value of the current time, and $h_{t-1}$ is the output value of the last moment.

(2) The output value of the last moment and the input value of the current time are input into the input gate, and the output value of the input gate and the candidate cell state are obtained after calculation, as shown in the following formulas:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \tag{3}$$

$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c), \tag{4}$$

where the value range of $i_t$ is (0,1), $W_i$ is the weight of the input gate, $b_i$ is the bias of the input gate, $W_c$ is the weight of the candidate input gate, and $b_c$ is the bias of the candidate input gate.

(3) The current cell state is updated as follows:

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{5}$$

where $C_t$ is the cell state at the current time and $C_{t-1}$ is the cell state at the last moment.

(4) The output $h_{t-1}$ and the input $x_t$ are received as input values of the output gate at time $t$, and the output $o_t$ of the output gate is obtained as follows:

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \tag{6}$$

where the value range of $o_t$ is (0,1), $W_o$ is the weight of the output gate, and $b_o$ is the bias of the output gate.

(5) The output value of LSTM is obtained from the output of the output gate and the state of the cell, as shown in the following formula:

$$h_t = o_t * \tanh(C_t). \tag{7}$$
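The five calculation steps above can be sketched as a single time step of an LSTM cell. This is a minimal pure-Python illustration, not the authors' code: the helper names (`sigmoid`, `affine`, `lstm_step`), the parameter dictionary layout, the hidden size of 2, and the constant weights are all assumptions chosen to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: forget gate, input gate, cell update, output gate."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]        # forget gate
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]        # input gate
    c_hat = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate state
    c = [ft * cp + it * ch for ft, cp, it, ch in zip(f, c_prev, i, c_hat)]   # cell update
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]        # output gate
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                         # cell output
    return h, c

# Hypothetical sizes and constant weights, for illustration only.
hidden_size, input_size = 2, 1

def constant_matrix(value):
    return [[value] * (hidden_size + input_size) for _ in range(hidden_size)]

params = {}
for gate in ("f", "i", "c", "o"):
    params["W_" + gate] = constant_matrix(0.1)
    params["b_" + gate] = [0.0] * hidden_size

h_demo, c_demo = [0.0] * hidden_size, [0.0] * hidden_size
for x in [0.2, -0.1, 0.4]:  # a short toy input sequence
    h_demo, c_demo = lstm_step([x], h_demo, c_demo, params)

print(h_demo, c_demo)
```

Because the gates are sigmoid outputs, each lies strictly between 0 and 1, while the output $h_t$ is bounded in magnitude by 1 through the final tanh.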

CNN-LSTM Training and Prediction Process.
The CNN-LSTM process of training and prediction is shown in Figure 3. The main steps are as follows:

(1) Input data: input the data required for CNN-LSTM training.

(2) Data standardization: as the input data differ greatly in scale, the z-score standardization method is adopted to standardize the input data so that the model trains better, as shown in the following formula:

$$y_i = \frac{x_i - \bar{x}}{s}, \tag{8}$$

where $y_i$ is the standardized value, $x_i$ is the input data, $\bar{x}$ is the average of the input data, and $s$ is the standard deviation of the input data.

(5) LSTM layer calculation: the output data of the CNN layer are calculated through the LSTM layer, and the output value is obtained.

(6) Output layer calculation: the output value of the LSTM layer is input into the fully connected layer to get the output value.

(7) Calculation error: the output value calculated by the output layer is compared with the real value of this group of data, and the corresponding error is obtained.

(8) Judging whether the end condition is satisfied: the end conditions are that a predetermined number of cycles is completed, the weight is lower than a certain threshold, or the forecasting error rate is lower than a certain threshold. If one of the end conditions is met, training is completed, the entire CNN-LSTM network is updated, and the process goes to step 10; otherwise, it goes to step 9.

The output value obtained through the CNN-LSTM model is a standardized value, and it is restored to the original value as shown in formula (9):

$$x_i = y_i \cdot s + \bar{x}, \tag{9}$$

where $x_i$ is the restored value, $y_i$ is the output value of the CNN-LSTM, $s$ is the standard deviation of the input data, and $\bar{x}$ is the average value of the input data.

(15) Output result: output the restored results to complete the forecasting process.
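The z-score standardization and its inverse described in the steps above can be sketched as follows. This is an illustrative sketch only: the function names and the toy closing prices are assumptions, and the use of the sample standard deviation (`statistics.stdev`) is also an assumption, since the text does not say whether the sample or population form is used.

```python
import statistics

def zscore_fit(values):
    """Return the mean and standard deviation used for standardization."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # assumption: sample standard deviation
    return mean, std

def standardize(values, mean, std):
    """y_i = (x_i - mean) / std"""
    return [(x - mean) / std for x in values]

def restore(standardized, mean, std):
    """x_i = y_i * std + mean, the inverse of standardize()."""
    return [y * std + mean for y in standardized]

# Toy closing prices (made-up numbers, not the paper's data).
closes = [3300.0, 3320.0, 3310.0, 3350.0, 3340.0]
mean, std = zscore_fit(closes)
scaled = standardize(closes, mean, std)
recovered = restore(scaled, mean, std)

print(scaled)
print(recovered)
```

A common practice, not stated in the text, is to compute the mean and standard deviation on the training set only and reuse them for the test set and for restoring predictions.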

Experiments

Model Implementation.
In order to evaluate the forecasting effect of CNN-LSTM, the mean absolute error (MAE), root mean square error (RMSE), and R-square ($R^2$) are used as the evaluation criteria of the methods. The MAE calculation formula is as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|, \tag{10}$$

where $\hat{y}_i$ is the predictive value and $y_i$ is the true value. The smaller the value of MAE, the better the forecasting. The RMSE calculation formula is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}, \tag{11}$$

where $\hat{y}_i$ is the predictive value and $y_i$ is the true value. The smaller the value of RMSE, the better the forecasting. The $R^2$ calculation formula is as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}{\sum_{i=1}^{n} \left( \bar{y} - y_i \right)^2}, \tag{12}$$

where $\hat{y}_i$ is the predictive value, $y_i$ is the true value, and $\bar{y}$ is the average value. The value range of $R^2$ is (0,1). The closer MAE and RMSE are to 0, the smaller the error between the predicted value and the real value and the higher the forecasting accuracy. The closer $R^2$ is to 1, the better the fitting degree of the model.
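The three evaluation criteria can be computed with a few lines of plain Python. The sketch below follows the standard definitions of MAE, RMSE, and $R^2$ (with $R^2$ taken as one minus the ratio of the residual sum of squares to the total sum of squares); the function names and the toy values are assumptions and do not come from the paper's experiments.

```python
import math

def mae(y_pred, y_true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)

def rmse(y_pred, y_true):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))

def r_squared(y_pred, y_true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((mean_true - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values for illustration only.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]

print(mae(y_pred, y_true))        # 0.5
print(rmse(y_pred, y_true))       # about 0.7071
print(r_squared(y_pred, y_true))  # about 0.9
```

MAE and RMSE are expressed in the units of the target (here, index points), so they are only comparable across models evaluated on the same data, whereas $R^2$ is unitless.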

Implementation of CNN-LSTM.
The parameter settings of the CNN-LSTM for this experiment are shown in Table 2.
According to the parameter settings of the CNN-LSTM network, the specific model is constructed as follows: the input training set data form a three-dimensional data vector (None, 10, 8), in which 10 is the size of the time_step and 8 is the number of input features. First, the data enter the one-dimensional convolution layer to further extract features and obtain a three-dimensional output vector (None, 10, 32), in which 32 is the number of convolution layer filters. Next, the vector enters the pooling layer, and a three-dimensional output vector (None, 10, 32) is also obtained. Then, the output vector enters the LSTM layer for training, and the output data (None, 64) after training enter a fully connected layer to get the output value.
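To make the (None, 10, 8) input shape concrete, the following sketch builds sliding windows of 10 consecutive days with 8 features each and pairs every window with the closing price of the following day, then splits the samples chronologically. It is a hedged illustration: the function names, the synthetic data, and the choice of column index 3 for the closing price are assumptions, and the paper does not describe exactly how its windows and its training and test sets are constructed.

```python
def make_windows(rows, time_step=10, target_col=3):
    """Build (X, y) pairs from per-day feature rows.

    X[k] holds time_step consecutive days; y[k] is the target column
    (assumed here to be the closing price) of the day after the window.
    """
    X, y = [], []
    for start in range(len(rows) - time_step):
        X.append(rows[start:start + time_step])
        y.append(rows[start + time_step][target_col])
    return X, y

def chronological_split(X, y, n_test):
    """Keep time order: the last n_test samples form the test set."""
    return (X[:-n_test], y[:-n_test]), (X[-n_test:], y[-n_test:])

# Synthetic data: 30 days, 8 features per day (made-up values).
rows = [[float(day + feature) for feature in range(8)] for day in range(30)]

X, y = make_windows(rows, time_step=10, target_col=3)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, n_test=5)

# Each sample has shape (10, 8), matching (None, 10, 8) with None as the batch size.
print(len(X), len(X[0]), len(X[0][0]))
print(len(train_X), len(test_X))
```

Splitting by position rather than at random keeps future days out of the training set, which matches the paper's use of the earliest trading days for training and the latest for testing.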

Results
After the processed training set data are used to train MLP, CNN, RNN, LSTM, CNN-RNN, and CNN-LSTM, respectively, the trained models are used to predict the test set data, and the real values are compared with the predicted values, as shown in Figures 5-10.
In Figures 5-10, among the six forecasting methods, the degree of fit between the real-value and predicted-value curves, ranked from highest to lowest, is CNN-LSTM, CNN-RNN, LSTM, CNN, RNN, and MLP. CNN-LSTM has the highest degree of fit, with the two curves almost coinciding, and MLP has the lowest.
According to the predicted value and real value of each method, the evaluation indexes of each method can be calculated. The results show that the performance of CNN-LSTM is the best among the six methods. In terms of forecasting accuracy, its MAE of 27.564 and RMSE of 39.688 are the smallest among the six forecasting models. In terms of forecasting performance, the $R^2$ of CNN-LSTM is 0.9646, which is improved by 2.2%, 0.6%, 0.5%, and 0.2%, respectively, compared with the other four methods. Therefore, the CNN-LSTM proposed in this paper is superior to the other comparative models in terms of fitting degree and error value. It can predict the closing price of the next day well and provide a reference for investors' investment decisions.

Conclusions
According to the chronological characteristics of stock price data, this paper proposes a CNN-LSTM to predict the stock closing price of the next day. The method uses the opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change of the stock data as the input, making full use of the time sequence characteristics of the stock data. CNN is used to extract the features of the input data. LSTM is used to learn the extracted feature data and predict the closing price of the stock on the next day. This paper takes the relevant data of the Shanghai Composite Index as an example to verify the method. The experimental results show that the CNN-LSTM has the highest forecasting accuracy and the best performance compared with the MLP, CNN, RNN, LSTM, and CNN-RNN. Its MAE and RMSE are the smallest of all methods, and its $R^2$ is close to 1. CNN-LSTM is suitable for the forecasting of stock prices and can provide a relevant reference for investors seeking to maximize investment returns. CNN-LSTM also offers practical experience for research on financial time series data. However, the model still has some shortcomings. For example, it only considers the impact of stock price data on closing prices and fails to integrate sentiment factors such as news and national policy into the forecast. Our future research work will mainly add sentiment analysis of stock-related news and national policies, so as to improve the accuracy of stock forecasts.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.