Stock Prediction Based on Optimized LSTM and GRU Models

Stock market prediction has always been an important research topic in the financial field. In the past, investors used traditional analysis methods such as K-line diagrams to predict stock trends, but with the progress of science and technology and the development of the market economy, the price trend of a stock is disturbed by various factors. Traditional analysis methods are far from able to extract the important information hidden in stock price fluctuations, so their prediction accuracy is greatly reduced. In this paper, we design a new model for optimizing stock forecasting. We incorporate a range of technical indicators, including investor sentiment indicators and financial data, and perform dimension reduction on the many influencing factors of the retrieved stock price using the LASSO and PCA approaches. In addition, a comparison of the performances of LSTM and GRU for stock market forecasting under various parameters was performed. Our experiments show that (1) both the LSTM and GRU models can predict stock prices efficiently, with neither clearly better than the other, and (2) for the two dimension reduction methods, both neural models using LASSO show better prediction ability than the models using PCA.


Introduction
The financial market is quite volatile and experiences periods of contraction as well as expansion. The stock market, as a major financial market, is likewise highly volatile. It has the characteristics of high return, which attracts the majority of investors, and high risk, which puts pressure on investors to sell out at the wrong time. In order to reduce unnecessary losses and obtain higher trading profits, investors usually expect to predict the stock price trend. As a result, stock market forecasting has been a major research topic in the financial area and attracts the attention of investors. In the stock market, the factors affecting the rise and fall of stock prices are complex and diverse. They include not only the impact of economic factors such as price indicators, circulation indicators, activity degree, and economic uncertainty but also the impact of noneconomic factors such as traders' expectations, traders' psychological factors, and the political environment. Therefore, the prediction of stock prices has always been a challenging task.
According to the efficient market hypothesis [1], the stock price can be predicted from historical stock data. Furthermore, in recent years, with increasing computing power and decreasing data storage costs, and especially the rise and development of innovative technologies such as big data, machine learning, reinforcement learning, and other optimization technologies, researchers have developed various models for predicting stock prices. Machine learning has been widely used in the capital market and plays an indispensable role in predicting future stock prices from historical data. Traditional stock price forecasting models are mainly linear models, including the autoregressive integrated moving average (ARIMA) model [2], the multiple linear regression model, and the exponential smoothing model [3,4]. These linear models have played an important role in promoting the progress and development of stock forecasting. However, stock prices are typically noisy, fluctuating, and nonparametric, resulting in nonlinear and nonstationary characteristics in the stock market.
The standard linear prediction model is therefore unable to produce reliable stock predictions. With the development of deep learning methods, nonlinear neural networks are increasingly employed to predict stock prices for their higher accuracy. The artificial neural network (ANN) family includes the MP neural network and the back propagation (BP) neural network. However, the structure of the ANN model is too simple, and it suffers from several problems: (1) overfitting weakens the model's generalization ability, (2) local extrema degrade the model's prediction ability, and (3) the gradient disappears or explodes due to excessive neuron weights during optimization, causing prediction to fail. Therefore, relevant scholars have introduced deep neural networks (DNN), including the convolutional neural network (CNN), the recurrent neural network (RNN), the long short-term memory neural network (LSTM), and the gated recurrent unit network (GRU), to alleviate the problems of the ANN model and improve the accuracy and efficiency of prediction.
CNN is a type of neural network that has become increasingly popular in recent years. CNN was originally designed to analyse image data efficiently, and its one-dimensional variant applies the same idea to sequences. CNN can read the original input data and automatically extract its most significant features for learning. This method feeds observed time series values to the network as input and uses a multilayer network to predict the unobserved value. For example, Xu et al. [5] employed CNN to extract important stock features from stock market returns for forecasting stock market trends. Recurrent neural networks (RNN) such as the long short-term memory neural network (LSTM) are another tool for predicting time series [6,7]. LSTM accurately estimates time series data by using both historical and present stock data. In recent years, LSTM has been applied to stock market forecasting in different stock markets around the world. Chen et al. [8] used an LSTM model to predict China's Shanghai and Shenzhen stock markets. Li et al. [9] introduced stock indicators with investor sentiment into the LSTM model to predict the CSI 300 index value, and the research results showed that the model surpassed the support vector machine method in prediction accuracy. However, this model does not reduce the dimension of the stock indicators. Jiawei and Murata [10] attempted to identify the influencing factors of stock market trend prediction through the LSTM model, using a preprocessing algorithm to reduce the dimension of stock features and a sentiment analyzer applied to financial news for stock trend prediction. However, only one dimension reduction method was used, with no comparison against other methods. Hu [11] reduced the dimension of stock technical analysis indicators by the PCA and LASSO methods before predicting with the LSTM model. The results demonstrated that, compared with the LASSO-LSTM model, the PCA-LSTM model can significantly reduce data redundancy and enhance prediction accuracy.
Although this work used different dimension reduction methods, it used only one model and did not compare it with other models.
Cho et al. [12] simplified the LSTM structure and created GRU, a new deep learning architecture that integrates long-term and short-term memory. GRU solves the problems of gradient disappearance and explosion that classic recurrent neural networks (RNNs) encounter when learning long-term dependence. GRU has also been widely used in recent stock forecasting. Shen et al. [13] compared the predicted trading signals of stock indicators based on the GRU model and SVM. The results demonstrated that the prediction accuracy of the GRU models is higher than that of the other models. However, sentiment indicators were not included in that study. Rahman et al. [14] used mobile phone stock data from Yahoo Finance and the GRU model to predict the stock price. Sentiment indicators were not considered in that study either, nor was the model compared with the performance of other models [15].
In this paper, we integrate a variety of technical indicators, such as investor sentiment indicators and financial data, based on the Shanghai Composite Index data. We use the LASSO and PCA methods to perform dimension reduction on the multiple influencing factors of the extracted stock price. The LSTM and GRU models are then utilized to forecast the stock price. Most importantly, by comparing the accuracy and stability of the LASSO-LSTM, LASSO-GRU, PCA-LSTM, and PCA-GRU models, the optimal forecasting model can be recommended.

LASSO.
In empirical analysis, in order to minimize the model deviation caused by the lack of important independent variables, we set multidimensional variables. The models need to find the set of independent variables with the strongest explanatory power for the dependent variables.
That is, the models need to improve interpretability and prediction accuracy through independent variable selection (indicator selection and field selection). Indicator selection is an extremely important problem in statistical modelling. LASSO is an estimation method that can simplify the indicator set; it is a compressed estimation. It obtains a more refined model by constructing a penalty function, which compresses some coefficients and sets others exactly to zero. Therefore, it retains the advantage of subset shrinkage and is a biased estimation method for dealing with complex collinear data.
LASSO's basic idea is to minimize the residual sum of squares under the constraint that the sum of the absolute values of the regression coefficients is less than a constant, so that some regression coefficients become strictly equal to 0 and an interpretable model is obtained. LASSO adds a penalty term to the ordinary linear regression model; the LASSO estimate of the ordinary linear model is

\hat{\beta}(\text{LASSO}) = \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{d} x_{ij}\beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{d} |\beta_j| \le t,

which is equivalent to

\hat{\beta}(\text{LASSO}) = \arg\min_{\beta} \Big\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{d} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{d} |\beta_j| \Big\},

where t and \lambda are in one-to-one correspondence and act as the adjustment coefficients. Let t_0 = \sum_{j=1}^{d} |\hat{\beta}_j(\text{OLS})|; when t < t_0, some coefficients are compressed to 0, reducing the dimension of X and the complexity of the model. Variable selection can thus be realized by controlling the adjustment coefficient \lambda.
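To illustrate how the penalty compresses coefficients exactly to zero, the following minimal sketch implements LASSO by coordinate descent on toy data. The data, the penalty weight, and the helper name `lasso_cd` are illustrative assumptions, not part of the original study.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO: minimize ||y - X b||^2 + lam * ||b||_1."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual with feature j's contribution removed.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # Soft-thresholding: small coefficients become exactly zero.
            beta[j] = np.sign(rho) * max(abs(rho) - lam / 2.0, 0.0) / z
    return beta

# Toy data: 5 candidate indicators, only the first two drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

beta = lasso_cd(X, y, lam=100.0)
print(np.flatnonzero(beta))  # indices of the retained indicators
```

With this penalty weight, the three irrelevant coefficients are shrunk to exactly zero while the two informative ones survive (slightly biased toward zero), which is the variable-selection behaviour described above.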
PCA.

Principal component analysis (PCA) is a dimension reduction statistical method. With the help of an orthogonal transformation, it transforms the original random vector, whose components are correlated, into a new random vector whose components are uncorrelated. Algebraically, this amounts to transforming the covariance matrix of the original random vector into a diagonal matrix; geometrically, it amounts to transforming the original coordinate system into a new orthogonal coordinate system. The multidimensional variable system is thereby reduced, so that it can be transformed into a low-dimensional variable system with high accuracy, and the low-dimensional system can be further transformed into a one-dimensional system by constructing an appropriate value function.
(1) Construct the sample array and carry out the following standardized transformation on its elements:

z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, p,

where \bar{x}_j and s_j are the sample mean and sample standard deviation of the jth indicator. Thus, the standardized matrix Z is obtained.
(2) Find the correlation coefficient matrix of the standardized matrix Z,

R = (r_{ij})_{p \times p}, \quad r_{ij} = \frac{\sum_{k=1}^{n} z_{ki} z_{kj}}{n - 1}, \quad i, j = 1, 2, \ldots, p.

The final evaluation value is obtained as the weighted sum of the m principal components, where the weight of each principal component is its variance contribution rate.
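The standardization and correlation-matrix steps above can be sketched as follows; the toy data and the helper name `pca_reduce` are illustrative only.

```python
import numpy as np

def pca_reduce(X, m):
    """Standardize X, eigendecompose its correlation matrix, and keep the
    first m principal components plus their variance contribution rates."""
    n = X.shape[0]
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardized matrix Z
    R = Z.T @ Z / (n - 1)                             # correlation matrix r_ij
    eigval, eigvec = np.linalg.eigh(R)                # ascending eigenvalues
    order = np.argsort(eigval)[::-1]                  # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = Z @ eigvec[:, :m]                        # principal component scores
    contrib = eigval[:m] / eigval.sum()               # variance contribution rates
    return scores, contrib

# Toy data: 4 indicators, two of which are nearly duplicates.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
X[:, 3] = X[:, 0] + 0.05 * rng.normal(size=300)

scores, contrib = pca_reduce(X, m=2)
```

Because two indicators are almost perfectly correlated, the first component absorbs roughly half the total variance, showing how correlated indicators collapse into fewer components.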

LSTM and GRU.
LSTM is a special type of recurrent neural network (RNN). The RNN model reuses the weight parameters of its neurons and can effectively employ past data information for prediction. However, RNN can only handle fairly short-term dependence and is prone to gradient explosion and gradient disappearance when learning long-term dependence on historical data. To solve these problems, LSTM was proposed by Hochreiter and Schmidhuber [6] and later improved and promoted by Graves [16]. It has been widely applied to a variety of challenging tasks and has yielded impressive results.
Compared with the RNN model, the LSTM model introduces a cell state (C_t) and uses an input gate (i_t), a forget gate (f_t), and an output gate (o_t). The three gates maintain and control information. At time t, x_t is the input data, h_t is the current output, \tilde{c}_t is the candidate value produced through the input gate, tanh is the hyperbolic tangent function, \sigma is the sigmoid function, W denotes a weight matrix, and b is a bias. The operation formulas of LSTM are as follows.
Forget gate:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f).

Input gate:

i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i),
\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c),
C_t = f_t * C_{t-1} + i_t * \tilde{c}_t.

Output gate:

o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o),
h_t = o_t * \tanh(C_t).

The LSTM model is especially popular in the field of financial forecasting because it effectively deals with the redundancy of relevant information in historical data.
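The standard gate equations can be written out directly in code. The following minimal NumPy sketch runs one LSTM cell over a few random time steps; the weights, sizes, and the helper name `lstm_step` are hypothetical, not taken from the paper's models.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the forget/input/output gate equations.
    Each W[k] maps the concatenated vector [h_{t-1}, x_t] to one gate."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell value
    c_t = f_t * c_prev + i_t * c_tilde      # updated cell state C_t
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # current output h_t
    return h_t, c_t

# Tiny demo: hidden size 3, input size 2, random weights, zero biases.
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(3, 5)) for k in "fico"}
b = {k: np.zeros(3) for k in "fico"}
h, c = np.zeros(3), np.zeros(3)
for x_t in rng.normal(size=(4, 2)):  # run 4 time steps
    h, c = lstm_step(x_t, h, c, W, b)
```

Note that since h_t = o_t * tanh(C_t), each component of the hidden state is bounded in (-1, 1), which helps keep gradients stable over long sequences.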
GRU is a variant of RNN introduced by Cho et al. [12]. By introducing a gating structure, it solves the difficulty RNN has in acquiring long-distance information. Compared with LSTM, GRU is simplified: only an update gate (z_t) and a reset gate (r_t) are introduced. In GRU, the update (or input) gate decides how much of the input (x_t) and the previous output (h_{t-1}) is passed to the next cell, and the reset gate determines how much of the past information to forget. The current memory content ensures that only the relevant information is passed to the next iteration, which is determined by the weight W. The main operations in GRU are governed by the following formulae.
Update gate:

z_t = \sigma(W_z \cdot [h_{t-1}, x_t]).

Reset gate:

r_t = \sigma(W_r \cdot [h_{t-1}, x_t]).

Given the reset and update gates, the candidate state value of the GRU unit is \tilde{h}_t and the final output state value is h_t:

\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, x_t]),
h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t.
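A corresponding sketch of a single GRU step, again with hypothetical random weights and sizes (the helper name `gru_step` is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU time step following the update/reset gate equations."""
    z = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ z)  # update gate
    r_t = sigmoid(W_r @ z)  # reset gate
    # Candidate state: the reset gate scales how much past state is used.
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))
    # Final output: interpolation between previous and candidate state.
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde
    return h_t

# Tiny demo: hidden size 3, input size 2.
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.normal(size=(3, 5)) for _ in range(3))
h = np.zeros(3)
for x_t in rng.normal(size=(4, 2)):  # run 4 time steps
    h = gru_step(x_t, h, W_z, W_r, W_h)
```

Compared with the LSTM step, there is no separate cell state and one fewer gate, which is why GRU is often cheaper to train while behaving similarly.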

Data Source and Indicator Selection.
In this paper, the data of the Shanghai Composite Index (000001) from April 11, 2007, to August 3, 2021, are selected as the experimental data. The data come from the NetEase Finance and Economics website and cover a total of 3,481 days. In order to evaluate the training effect of the model, we divide the experimental data into a training set and a test set: 80% are used as the training set to train the stock prediction model, and the remaining 20% are used as the test set to verify the prediction effect of the model. In addition, we use an Intel Core i9-9900K CPU with 64 GB of memory to run the experiments.
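The chronological 80/20 split described above can be sketched as follows; the series here is a stand-in of the same length as the study's data, not the actual index prices.

```python
import numpy as np

# Stand-in series with the same length as the study's 3,481 trading days;
# the real data are Shanghai Composite Index prices from NetEase Finance.
prices = np.arange(3481, dtype=float)

# Chronological 80/20 split: no shuffling, so the test set lies strictly
# after the training set in time and the model is judged on future data.
split = int(len(prices) * 0.8)
train, test = prices[:split], prices[split:]
print(len(train), len(test))  # 2784 697
```

Keeping the split chronological matters for time series: shuffling before splitting would leak future information into training.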
In the selection of stock technical indicators, this paper considers the factors affecting the stock price as comprehensively as possible. Compared with other studies, this paper selects the open price, highest price, lowest price, trading volume, and other common technical indicators, such as OBV, KDJ, BIAS, RSI, CCI, and MFI, as well as other technical indicators for judging the stock price and the PSY indicator reflecting investors' psychological mood. These indicators comprehensively reflect the information affecting stock price fluctuations and have strong explanatory power for them. The selected indicators are described in detail in Table 1.

Experimental Setup.

Different hyperparameters have a significant impact on the prediction ability of the LSTM and GRU models. Therefore, different hyperparameter settings are used in the prediction to compare the results. The number of neuron layers is set to 2 and 3, the number of neurons to 8, 16, and 32, the learning rate to 0.001, and the number of iterations to 1000. The most accurate prediction method can be determined by analyzing the prediction accuracy of the experimental results and how well the predicted stock price fits the trend of the historical stock price. Prediction accuracy is evaluated by the mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) at different look-back values; the smaller these three values are, the more accurate the forecast. The full specification of the parameters used in these models is listed in Tables 2-5.
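The three error measures can be computed as in this minimal sketch; the helper name `evaluate` and the sample numbers are illustrative.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (MSE, RMSE, MAE); smaller values mean a more accurate forecast."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mse = float(np.mean(err ** 2))     # mean square error
    rmse = float(np.sqrt(mse))         # root mean square error
    mae = float(np.mean(np.abs(err)))  # mean absolute error
    return mse, rmse, mae

# Toy example: one prediction off by 3 units.
mse, rmse, mae = evaluate([10.0, 12.0, 11.0], [10.0, 12.0, 14.0])
```

Note that RMSE penalizes large individual errors more heavily than MAE, so comparing both gives a fuller picture of forecast quality.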

Experimental Results.
The experimental results of the four stock prediction models are shown in Tables 2-5. Two different feature sets were obtained in this experiment: set I is the data obtained by the LASSO dimension reduction method, and set II is the data obtained by the PCA dimension reduction method. These feature data are used to train the LSTM and GRU models. In the experiment, different look-back values were set. All parameter specifications used by the four models are shown in Tables 2-5. The results show that, according to the MSE, RMSE, and MAE indicators, both the LSTM and GRU models can predict stock prices effectively, with neither clearly more efficient than the other. However, for the different dimension reduction methods, we find that all indicators (except the training time) show that the prediction results of the two neural network models using LASSO dimension reduction are mostly better than those using the PCA-reduced data. In other words, under the same network model, the prediction performance of the LASSO-LSTM model is better than that of PCA-LSTM, and the prediction performance of LASSO-GRU is better than that of PCA-GRU.
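A look-back value of k means each sample consists of the k previous observations used to predict the next one. A minimal sketch of this windowing step (the helper name `make_windows` is illustrative, not from the paper):

```python
import numpy as np

def make_windows(series, look_back):
    """Turn a 1-D series into (samples, look_back) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + look_back]
                  for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X, y

X, y = make_windows([1, 2, 3, 4, 5, 6], look_back=3)
# X rows: [1,2,3], [2,3,4], [3,4,5]; targets y: [4,5,6]
```

Larger look-back values give the recurrent model more history per sample but reduce the number of training samples, which is why comparing several values, as done here, is worthwhile.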

Conclusion
This study innovatively integrates a variety of technical indicators, such as investor sentiment indicators and financial data, and carries out dimension reduction on the multiple influencing factors of the extracted stock price through the LASSO and PCA approaches. This work compares the performances of LSTM and GRU for stock market forecasting under different parameters. Our experimental results show that (1) both the LSTM and GRU models can be used to predict stock prices effectively and (2) for the different dimension reduction methods, the prediction results of the two neural network models using LASSO dimension reduction are mostly better than those using the PCA-reduced data.

Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.