Impact of COVID-19 on Forecasting Stock Prices: An Integration of Stationary Wavelet Transform and Bidirectional Long Short-Term Memory

COVID-19 is an infectious disease that mostly affects the respiratory system. At the time this research was performed, there were more than 1.4 million cases of COVID-19, and one of the biggest anxieties is not just our health, but our livelihoods, too. In this research, the authors investigate the impact of COVID-19 on the global economy, more specifically, its impact on the price movement of Crude Oil and three U.S. stock indexes: DJI, S&P 500 and NASDAQ Composite. The proposed system for predicting commodity and stock prices integrates the Stationary Wavelet Transform (SWT) and Bidirectional Long Short-Term Memory (BDLSTM) networks. First, SWT is used to decompose the data into approximation and detail coefficients. After decomposition, the Crude Oil price and stock market index data, along with COVID-19 confirmed cases, are used as input variables for forecasting future price movement. The proposed system, BDLSTM+WT-ADA, achieved satisfactory results for the five-day Crude Oil price forecast.


Introduction
Infectious diseases have always been a threat to humanity, especially those about which little or nothing is known. The World Health Organization (WHO) describes a pandemic as "the worldwide spread of a new disease", and although in such times the greatest concern is how to save human lives, the next objective is how to save the economy and preserve well-being [1]. In recent history, it is possible to observe the impact of the Spanish flu (1918-1919) on the economy. According to Centers for Disease Control and Prevention (CDC) estimates, roughly 500 million people were taken ill with the disease, which ultimately took the lives of about 50 million worldwide [2]. Even though economic data from the early 20th century are scarce, it has been noted that business closures led to unemployment, and businesses that survived suffered huge losses. A comparison can also be drawn with pandemics from the more recent past. During the 2003 outbreak of SARS (Severe Acute Respiratory Syndrome), which lasted less than a year, businesses saw an enormous plunge in revenue. A similar scenario occurred in 2009, when the spread of the H1N1 flu triggered numerous consequences [3] [4]. A pandemic like COVID-19 will surely have a significant influence on the global economy, as well as on the financial markets. From 24 to 28 February 2020, stock markets worldwide reported their largest one-week declines since the 2008 financial crisis. Traders began to sell shares out of fear, and as a result, a market-wide circuit breaker was triggered four times in March [5] [6]. Each halt lasted 15 minutes, in the hope that the situation would calm down. Every pandemic is unique and the same results are unlikely to recur, but direction and movement can be predicted, which is important for a timely response. The recent pandemic has created both a supply and a demand shock, which makes it significantly different from other crises.
Starting with the supply-side reductions due to the astonishing closures of factories and labor shortages, the global economy was simultaneously affected by the demand-side shock with immediate reduction in consumer spending. These shocks have ultimately resulted in shifting aggregate supply and aggregate demand downward and, consequently, in reducing national and global gross domestic products.
Forecasting stock prices has always been considered a challenging task because the stock market tends to be non-stationary, non-linear and highly noisy [7]. Artificial Intelligence (AI) algorithms have proven successful in solving problems such as predicting stock prices [8], as well as in various other fields of science, technology and medicine [9][10][11]. Numerous factors influence financial market performance, and even financial experts find it complicated to make accurate predictions. The algorithm that may be efficient in [...] research was performed, data of commodity and each stock market index consisted of 4992 data-points, which were split into the training and testing sets.
The aim of this research is to integrate SWT with BDLSTM in order to predict the movement of the aforementioned commodity and stock market indexes during the COVID-19 outbreak.
COVID-19 caused a huge shock to the global economy, including commodity prices as well as the stock market [23]. By implementing the system and forecasting price movement, this research is expected to make a prompt and significant contribution to understanding and responding to the impact of the COVID-19 pandemic on the global economy. This approach allows more effective predictions during a pandemic and should help lower the negative impact of COVID-19 on the financial market by providing experts with additional information and tools for decision making. Integrating SWT with BDLSTM should help not only in the current situation but also in future situations similar to COVID-19, making it possible to react in time and prevent a financial crisis.
First, the original data of each dataset will be used as input variables in order to forecast future price movement with BDLSTM. Second, the commodity and stock market index data will be decomposed using SWT in order to obtain approximation and detail coefficients, which will be used to train the BDLSTM model. Afterwards, the results obtained for each system configuration will be compared. Third, the impact of the detail coefficients of confirmed cases on forecasting accuracy will be examined. Lastly, the best-performing system configuration will be used to show the forecasted movement of the Crude Oil price for the next five days from 128 observation days. The overview of the proposed system is given in Figure 1.
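The last step, forecasting five days from 128 observation days, implies slicing each series into fixed-length input and target windows. The following is a minimal sketch of that slicing; the function name `make_windows`, the one-step stride and the placeholder values are assumptions made for illustration and are not taken from this research.

```python
def make_windows(series, n_obs=128, horizon=5):
    """Split a series into (input window, target window) pairs.

    n_obs and horizon default to the 128 observation days and the
    five-day forecast mentioned in the text; the one-step stride is
    an assumption of this sketch.
    """
    pairs = []
    last_start = len(series) - n_obs - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + n_obs]
        targets = series[start + n_obs:start + n_obs + horizon]
        pairs.append((inputs, targets))
    return pairs


prices = [float(i) for i in range(140)]  # placeholder values, not real prices
pairs = make_windows(prices)
first_inputs, first_targets = pairs[0]
```

Each pair holds 128 consecutive observations followed by the five values the model would be asked to predict.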

Materials and Methods
This section provides a detailed description of the datasets used for forecasting price movements, as well as a brief overview and mathematical description of the Stationary Wavelet Transform, the Bidirectional Recurrent Neural Network and the Bidirectional Long Short-Term Memory network. The last two subsections describe the grid search algorithm and the evaluation criteria.

Dataset description
In order to create the dataset used in this research, historical data of West [...] Descriptive statistics of the commodity, stock market indexes and COVID-19 confirmed cases are provided in Table 1. With these statistics, the features of each dataset can be described [26]. The descriptive statistics used in this research are: mean, maximum, minimum, standard deviation, kurtosis and skewness. The total number of data-points, i.e. observations, in each of the aforementioned datasets is 4992, which were split into two parts. The first part (80% of the total) is used for model training, while the second part (20% of the total) is used to evaluate the performance of the trained models. Additionally, each dataset is tested for stationarity using the Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) unit root tests. The results for the level and 1st difference are obtained with intercept, and with trend and intercept, for both the ADF and PP tests, as shown in Table 2. In order to select the optimal lag length in the ADF test, the Schwarz Information Criterion (SIC) was utilized with a maximum of 31 lags. In the PP test, the Bartlett kernel was used as the spectral estimation method along with the Newey-West automatic bandwidth selection. The value of the optimal lag length (ADF test) and optimal bandwidth (PP test) for each dataset is enclosed in parentheses and given in [...] Furthermore, the results indicate that the commodity and the three stock market indexes are stationary in their 1st difference form.
In the case of COVID-19 confirmed cases, the results reveal that the series rejects the null hypothesis in the PP test, with intercept and with trend and intercept, at the 1st difference and can be considered stationary.
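The 80/20 split described above can be sketched as follows. The text does not state how the split was made; this sketch assumes a chronological split without shuffling, which is the usual choice for time series, and the function name is hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Return (train, test), keeping the time order intact (assumed, not confirmed)."""
    split_at = int(len(series) * train_fraction)
    return series[:split_at], series[split_at:]


observations = list(range(4992))  # stand-in indices for the 4992 data-points
train, test = chronological_split(observations)
```

With 4992 observations this yields 3993 training points and 999 test points, and every test point lies after every training point in time.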

Data decomposition with Wavelet Transform
The Wavelet Transform (WT) is a powerful mathematical tool for signal processing [27].
Applying WT, a signal can be decomposed into many frequency bands, which simplifies the analysis. The major drawback of the Fourier Transform (FT) is the loss of time information, while the precision of the Short Time Fourier Transform (STFT) largely depends on its window size and shape. Unlike FT and STFT, WT preserves precise information about both time and frequency. Since stock market data are non-stationary, non-linear and noisy, and considering the aforementioned drawbacks of FT and STFT, WT can be an appropriate approach for economic and financial time-series analysis. The wavelet transform of a signal x(t) can be calculated as:

$$W_x(a, \tau) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - \tau}{a}\right) dt \tag{1}$$

where ψ represents the analyzing wavelet, * stands for the complex conjugate, a represents the time dilation and τ represents the time translation [28]. Therefore, the Discrete Wavelet Transform can be defined as [29]: [...] To obtain approximation coefficients cA and detail coefficients cD from the original signal [...]
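For reference, one common dyadic form of the Discrete Wavelet Transform is obtained by sampling the dilation and translation of the continuous transform as $a = 2^{m}$ and $\tau = n\,2^{m}$ and replacing the integral with a sum over samples. The indices $m$, $n$ and $k$ and the dyadic sampling are assumptions of this sketch; the exact form given in [29] may differ.

```latex
W_x(m, n) = \frac{1}{\sqrt{2^{m}}} \sum_{k} x(k)\, \psi^{*}\!\left(\frac{k - n\,2^{m}}{2^{m}}\right)
```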

Figure 2. Signal decomposition using SWT at level five (h_i: high level features at decomposition level i, l_i: low level features at decomposition level i, cD_i: detail coefficients at decomposition level i, cA_5: approximation coefficients at decomposition level five)
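The defining property of SWT, namely that no downsampling takes place and every level yields coefficient arrays as long as the input, can be illustrated with a short sketch. It assumes the Haar wavelet and periodic boundary handling purely for brevity; this research uses the discrete Meyer wavelet, whose longer filters are not reproduced here. The function names are illustrative.

```python
import math


def swt_haar(signal, levels):
    """Undecimated Haar decomposition with periodic wrap-around.

    Returns (cA, [cD_1, ..., cD_levels]); every list has len(signal) items.
    """
    n = len(signal)
    scale = 1.0 / math.sqrt(2.0)
    approx = [float(v) for v in signal]
    details = []
    for level in range(levels):
        step = 2 ** level  # the filter is dilated instead of downsampling the signal
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([scale * (a - b) for a, b in zip(approx, shifted)])
        approx = [scale * (a + b) for a, b in zip(approx, shifted)]
    return approx, details


def iswt_haar(approx, details):
    """Invert swt_haar by undoing the levels from coarsest to finest."""
    scale = 1.0 / math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        current = [scale * (a + d) for a, d in zip(current, detail)]
    return current


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0,
          5.0, 3.0, 5.0, 8.0, 9.0, 7.0, 9.0, 3.0]  # arbitrary demo values
cA, cDs = swt_haar(signal, levels=3)
restored = iswt_haar(cA, cDs)
```

Because the coefficient arrays stay aligned with the original time axis, they can be fed to a sequence model as parallel input channels.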
In order to obtain a good decomposition of the original signal, the discrete Meyer wavelet is utilized. The Meyer wavelet is a linear-phase, orthogonal wavelet, and it is defined in the frequency domain as follows [31]: [...] where ν is an auxiliary function that can be defined as [...]
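For reference, the Meyer wavelet is commonly written in the frequency domain as shown below. The piecewise-linear auxiliary function ν is one common choice and is an assumption here; a smoother polynomial ν is also widely used, and [31] may use either.

```latex
\Psi(\omega) =
\begin{cases}
\dfrac{1}{\sqrt{2\pi}} \sin\!\left(\dfrac{\pi}{2}\,\nu\!\left(\dfrac{3|\omega|}{2\pi} - 1\right)\right) e^{j\omega/2}, & \dfrac{2\pi}{3} < |\omega| < \dfrac{4\pi}{3}, \\[2ex]
\dfrac{1}{\sqrt{2\pi}} \cos\!\left(\dfrac{\pi}{2}\,\nu\!\left(\dfrac{3|\omega|}{4\pi} - 1\right)\right) e^{j\omega/2}, & \dfrac{4\pi}{3} < |\omega| < \dfrac{8\pi}{3}, \\[2ex]
0, & \text{otherwise},
\end{cases}
\qquad
\nu(x) =
\begin{cases}
0, & x < 0, \\
x, & 0 \le x \le 1, \\
1, & x > 1.
\end{cases}
```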

Bidirectional Recurrent Neural Network
Recurrent Neural Networks (RNNs) are a class of Artificial Neural Networks (ANNs) with feedback connections [32]. The connections between units form a directed cycle, so in an RNN model a signal can travel both forward and backward. In such a network, knowledge is represented by the values of the synaptic connections between the input, hidden and output layers of neurons. The main idea behind RNNs is to use sequential data as input.
The RNN model can be simplified by unfolding the RNN architecture over the input sequence of data, as shown in Figure 3.

Figure 3. Folded RNN architecture and the process of unfolding with T time-steps (x: input vector, w_xh: weight matrix between input and hidden layer, h: hidden state, w_hh: weight matrix between two hidden states, w_hy: weight matrix between hidden and output layer, y: output vector)
Conventional feedback neural networks process the data in one direction only, but in certain areas both past and future information is desirable. Therefore, in 1997, Schuster and Paliwal introduced the Bidirectional Recurrent Neural Network (BRNN), whose basic idea was to extend the RNN architecture with additional hidden layers in which the data are processed in the opposite, negative direction. The hidden layer maintains a hidden state which can be defined as:

$$\overrightarrow{h}_t = f\left(\overrightarrow{w}_{xh}\, x_t + \overrightarrow{w}_{hh}\, \overrightarrow{h}_{t-1} + \overrightarrow{b}_h\right) \tag{5}$$

for the positive direction, and

$$\overleftarrow{h}_t = f\left(\overleftarrow{w}_{xh}\, x_t + \overleftarrow{w}_{hh}\, \overleftarrow{h}_{t+1} + \overleftarrow{b}_h\right) \tag{6}$$

for the negative direction [33]. Here $w_{xh}$ represents the weight matrix between input and hidden layer, $x_t$ represents the input vector, $w_{hh}$ represents the weight matrix between two hidden states, $b_h$ represents the bias of the hidden layer and $f$ represents the activation function. The output layer can be defined as:

$$y_t = \overrightarrow{w}_{hy}\, \overrightarrow{h}_t + \overleftarrow{w}_{hy}\, \overleftarrow{h}_t + b_y \tag{7}$$

where $\overrightarrow{w}_{hy}$ represents the weight matrix between hidden and output layer, $\overleftarrow{w}_{hy}$ represents the same in the other direction, and $b_y$ is the bias of the output layer [33]. As a major drawback, BRNN in its basic form cannot model complex time dynamics and can suffer from vanishing or exploding gradients.
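The two recurrences and the combined output described above can be traced with a minimal sketch. It assumes scalar weights and a tanh activation for readability (real layers use weight matrices), and the function and parameter names are illustrative rather than taken from [33].

```python
import math


def brnn_forward(xs, fwd, bwd, w_hy_fwd, w_hy_bwd, b_y):
    """Run a scalar bidirectional RNN over the sequence xs.

    fwd and bwd are (w_xh, w_hh, b_h) tuples for the positive and
    negative direction; tanh is an assumed activation function.
    """
    steps = len(xs)
    h_fwd = [0.0] * steps
    h_bwd = [0.0] * steps

    prev = 0.0
    w_xh, w_hh, b_h = fwd
    for t in range(steps):  # positive direction: first to last time step
        prev = math.tanh(w_xh * xs[t] + w_hh * prev + b_h)
        h_fwd[t] = prev

    nxt = 0.0
    w_xh, w_hh, b_h = bwd
    for t in reversed(range(steps)):  # negative direction: last to first time step
        nxt = math.tanh(w_xh * xs[t] + w_hh * nxt + b_h)
        h_bwd[t] = nxt

    # The output at each step mixes the hidden states of both directions.
    return [w_hy_fwd * hf + w_hy_bwd * hb + b_y for hf, hb in zip(h_fwd, h_bwd)]


xs = [0.5, -1.0, 0.25]  # arbitrary demo inputs
ys = brnn_forward(xs, (0.8, 0.3, 0.0), (0.6, -0.2, 0.1), 1.0, 1.0, 0.0)
```

At any time step the output therefore depends on inputs both before and after that step, which a unidirectional RNN cannot provide.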

Bidirectional Long Short-Term Memory
One of the solutions for overcoming the aforementioned problems is to use the Bidirectional Long Short-Term Memory (BDLSTM) architecture. This architecture differs from the RNN architecture in terms of its hidden layers. BDLSTM has an LSTM cell as the hidden layer, which consists of three gates: an input gate, a forget gate and an output gate. The LSTM cell can be mathematically defined as follows [34]:

$$f_t = \sigma_g\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{8}$$

$$i_t = \sigma_g\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{9}$$

$$o_t = \sigma_g\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{10}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh\left(W_C x_t + U_C h_{t-1} + b_C\right) \tag{11}$$

$$h_t = o_t \odot \tanh\left(C_t\right) \tag{12}$$

In Eq. (8)-(12), $f_t$, $i_t$ and $o_t$ represent the forget, input and output gates, $W$ and $U$ represent the weight matrices, $b$ is a bias vector, $\sigma_g$ is a sigmoid activation function, tanh is the hyperbolic tangent function, $C_t$ is the cell output state, $h_t$ is the layer output, and the operator $\odot$ is the element-wise product of vectors. By using Eq. (5)-(11), forward and backward layer outputs can be calculated. The result of combining BRNN with LSTM cells is a BDLSTM network, which can model more complex time dynamics and deal with long-term dependencies [35]. The architecture of an unfolded BDLSTM is shown in Figure 4. By using inputs in a positive sequence, the forward layer output sequence is calculated, and by using reversed inputs, the backward layer output sequence is calculated. Each element in the output vector of the BDLSTM layer can be calculated as:

$$y_t = \sigma\left(\overrightarrow{h}_t, \overleftarrow{h}_t\right) \tag{13}$$

where the two output sequences are combined by utilizing the σ function [35]. In many studies, bidirectional networks have been shown to perform significantly better than unidirectional networks in various fields, such as speech recognition [36], classification problems [37] and stock price prediction [38]. In this research, BDLSTM is trained in order to predict price movement for the time period in which the impact of COVID-19 on the global economy is relatively high. In the output vector of a BDLSTM layer, the last element is the predicted value for the next time iteration. Furthermore, to prevent the network from overfitting, dropout can be implemented on the hidden layers [39].
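The gate updates and the pairing of forward and backward outputs described above can be followed step by step in a small sketch. It assumes scalar states and weights for readability and uses concatenation (pairing) as the combining function; the parameter names (`W_f`, `U_f`, `b_f`, and so on) follow common LSTM notation and are assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM cell update with scalar state; p maps names to weights."""
    f = sigmoid(p["W_f"] * x + p["U_f"] * h_prev + p["b_f"])  # forget gate
    i = sigmoid(p["W_i"] * x + p["U_i"] * h_prev + p["b_i"])  # input gate
    o = sigmoid(p["W_o"] * x + p["U_o"] * h_prev + p["b_o"])  # output gate
    c = f * c_prev + i * math.tanh(p["W_c"] * x + p["U_c"] * h_prev + p["b_c"])
    h = o * math.tanh(c)  # layer output
    return h, c


def run_lstm(xs, p):
    """Feed xs through the cell in order, starting from zero state."""
    h, c, outputs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def bdlstm_layer(xs, p_fwd, p_bwd):
    """Pair forward outputs with backward outputs at each time step."""
    forward = run_lstm(xs, p_fwd)
    backward = run_lstm(xs[::-1], p_bwd)[::-1]  # re-align to the original time order
    return list(zip(forward, backward))


names = ["W_f", "U_f", "b_f", "W_i", "U_i", "b_i",
         "W_o", "U_o", "b_o", "W_c", "U_c", "b_c"]
params = {name: 0.5 for name in names}  # arbitrary demo weights
outputs = bdlstm_layer([1.0, 2.0, 1.0], params, params)
```

Other combining functions, such as summation or averaging of the two outputs, would replace only the final `zip` step.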

Hyperparameter Optimization
In order to determine the optimal hyperparameters of the ANN, the grid search algorithm has been used. This algorithm can be described as an exhaustive search through a set of manually specified parameters [40]. It therefore iterates through every possible parameter combination, trains the network and stores the result for each combination. Hyperparameters can be described as follows [41]: the regularization parameter L2 forces the weights to decay towards zero, but does not make them exactly zero, in order to limit the influence of input parameters.
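The exhaustive search described above can be sketched in a few lines. The grid values, the hyperparameter names other than the L2 parameter, and the stand-in objective are hypothetical; in practice `evaluate` would train the network and return a validation error.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep all scores.

    evaluate(params) must return a score where lower is better,
    for example a validation RMSE.
    """
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((evaluate(params), params))
    best_score, best_params = min(results, key=lambda item: item[0])
    return best_params, best_score, results


# Hypothetical grid and stand-in objective; a real run would train the network here.
grid = {"units": [32, 64, 128], "dropout": [0.0, 0.2], "l2": [0.0, 0.01]}


def toy_objective(params):
    return abs(params["units"] - 64) / 64 + params["dropout"] + params["l2"]


best_params, best_score, results = grid_search(grid, toy_objective)
```

The cost grows with the product of the grid sizes (here 3 x 2 x 2 = 12 evaluations), which is why the grid is specified manually and kept small.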

Results
The forecasting results are obtained for the Crude Oil commodity and the Dow Jones Industrial Average, S&P 500 and NASDAQ Composite indexes. For each dataset, SWT is performed in order to obtain the approximation and detail coefficients at five decomposition levels using the discrete Meyer wavelet function. As an example, the decomposed Crude Oil price signal is shown in Figure 5, where s is the closing price for the period from March 22, 2000 to April 07, 2020, and cA and cD are the approximation and detail coefficients.

Figure 5. Five-level decomposition of Crude Oil closing price using SWT (s: input data, cA_i: approximation coefficients at decomposition level i, cD_i: detail coefficients at decomposition level i)
Three main system configurations were examined in order to achieve high-quality regression and small values of the performance measures. In the first configuration, non-preprocessed data are used to train the BDLSTM model; in the second, the BDLSTM model is trained using both approximation and detail coefficients (AD). In the last configuration, the data contain approximation and detail coefficients for the commodity and stock index prices, but only the approximation coefficients for COVID-19 confirmed cases (ADA). The values of the performance measures for Crude Oil and the stock market indexes with each system configuration are shown in Table 4. Using data on the Crude Oil commodity, the three stock indexes and COVID-19 confirmed cases from the past 128 days, predictions were made for the Crude Oil price for the next five days, as shown in Figure 6. The values of the performance measures for Crude Oil with the BDLSTM+WT-ADA system configuration are shown in Table 5.
Table 5. Simulation results obtained with the best configuration system for Crude Oil price.
As evaluation criteria, RMSE and MAE are used.
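Both criteria follow their standard definitions: RMSE is the square root of the mean squared error, and MAE is the mean of the absolute errors. A minimal sketch, with placeholder values that are not results from this research:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [20.0, 22.0, 25.0, 23.0, 24.0]     # placeholder prices
predicted = [21.0, 22.0, 23.0, 23.0, 26.0]  # placeholder forecasts
error_rmse = rmse(actual, predicted)
error_mae = mae(actual, predicted)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of forecast quality.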

Discussion
This research proposes an integrated system, BDLSTM+WT-ADA, for commodity and stock price movement prediction during the current pandemic. In order to validate its feasibility, the proposed system is compared with other approaches presented in the literature [43,44]. Crude Oil is globally the most important commodity and, like any other good, is driven by supply and demand, but its price tends to fluctuate more than, for example, the prices of stocks and bonds on financial markets. As Crude Oil prices rise, so do other fuel prices, which increases production prices in general. Rising production prices lead to higher prices of food and industrial products, thus generating inflation. Reduced demand for Crude Oil caused by various impacts, in this case the global pandemic, results in Crude Oil price disruption and, as mentioned, has a profound effect on the economy in general. For this reason, the Crude Oil price was selected for the five-day prediction, which can be extremely useful for foreseeing the events that follow.
The relationship between COVID-19 confirmed cases and the crude oil price is significant. As the number of cases increases, measures are taken to slow further spread, such as closing factories, offices and shops and restricting movement. Consequently, much less fuel is needed for vehicles, machinery, etc. If demand decreases while supply remains unchanged, commodity prices, including crude oil prices, fall [45]. The same goes for the stock market. If companies on the stock market reduce or close their operations, shareholders become nervous about whether the value of the company's shares will decline in the future. They start selling stocks, thus increasing the supply in the market. As the number of confirmed cases increases and measures become more stringent, other buyers are not interested in buying. If more market participants are looking to sell a stock than to acquire it, the stock price will fall. Therefore, the inclusion of a large amount of data (confirmed cases for each day) allows for more accurate information and a more credible result.
From the obtained results of forecasting Crude Oil price movement it can be seen [...] Consequently, prediction models are valuable and can be used to foresee the sequence of events, but factors such as political interference, which cannot be included in the model, also affect the price and must be emphasized.

Result comparison
The obtained results demonstrate the connection between the crude oil price and the number [...] contraction being predicted with 90 percent confidence interval [46]. Toda (2020) shows the possibility of a temporary 50 percent stock price decrease using classic asset pricing modeling [47]. Baldwin and Tomiura (2020) conclude that there is a danger of permanent damage to the trade system, depending on the policies implemented [48]. Atkeson (2020) uses an SIR Markov chain based model to determine the spread and comments on the possibility of key financial and economic infrastructure being affected temporarily and permanently due to possible extreme staff shortages, in the case where the number of active cases exceeds 10 percent of the population [49]. Albulescu (2020) investigates the impact of COVID-19 on oil pricing, due to the initial 20 percent drop caused by the market being flooded with oil [50].

Conclusion
The goal of this research was to generate a forecasting model that integrates Stationary [...] showed a huge impact on energy prices as well as the stock market. Our proposed system shows a decline in the Crude Oil price. In addition to predicting future events through the methods presented, it is important to note that the geopolitical aspect is only indirectly included in the presented model through the input data. Therefore, it is not possible to clearly define the impact of geopolitical aspects in the model presented here. It can be assumed that the geopolitical aspect in this model is negligible, but it has a significant impact on the global economy.
The observed period used in the analysis was marked by an extreme increase in oil stocks on the market. Due to this over-supply from the most important exporting countries (e.g. OPEC countries) and geopolitical issues between the major players on the market, prices consequently slumped. Following the trends after the research was conducted, it is concluded that, despite the increase in the number of COVID-19 confirmed cases, the market is gradually adjusting oil prices owing to the joint agreement on production cuts (lowering the supply side) on the one hand, and the gradual opening of markets and recovery of demand on the other. The logical consequence is growing demand on a global level, accompanied by improved relations between oil exporters, which contributes to temporary market stability.
Unexpected situations such as a pandemic can have a significant effect on market fundamentals in the short term, and correlations with indexes and oil have been observed. On further observation over a period of several months, and with the gradual opening of economies, supply and demand are stabilizing, which has a positive effect on the formation of market equilibrium. The continued movement of stock indexes, especially the positive movement, does not reflect the real situation in the economy but is primarily based on expectations and is further stimulated by monetary and fiscal incentives (e.g. interest rate cuts, tax reductions) from national governments.
The main contribution and novelty of the presented research is not only demonstrating the existence of a link between COVID-19 infections and commodity prices, along with stock market prices, but also showing that this link can be modeled using data-driven, artificial intelligence based methods.
Future work should use datasets with more data-points, i.e. long-term historical intraday data, in order to achieve more precise forecasting. It should also apply more AI algorithms, such as Dynamic Programming (DP), Genetic Programming (GP) and a combination of Convolutional Neural Networks (CNNs) with an LSTM network, in an attempt to find more robust systems. The main idea of using such algorithms will be to develop an advanced automatic forecasting system capable of recognizing the positive correlation between financial markets.