Two-Stage Hybrid Machine Learning Model for High-Frequency Intraday Bitcoin Price Prediction Based on Technical Indicators, Variational Mode Decomposition, and Support Vector Regression

Due to the inherent chaotic and fractal dynamics in the price series of Bitcoin, this paper proposes a two-stage Bitcoin price prediction model that combines the advantages of variational mode decomposition (VMD) and technical analysis. VMD eliminates the noise signals and stochastic volatility in the price data by decomposing the data into variational mode functions, while technical analysis uses statistical trends obtained from past trading activity and price changes to construct technical indicators. The support vector regression (SVR) accepts input from a hybrid of technical indicators (TI) and reconstructed variational mode functions (rVMF). The model is trained, validated, and tested in a period characterized by unprecedented economic turmoil due to the COVID-19 pandemic, allowing the evaluation of the model in the presence of the pandemic. The constructed hybrid model outperforms the single SVR models that use only TI or only rVMF as features. The ability to predict the one-minute intraday Bitcoin price has the potential to reduce investors’ exposure to risk and to provide better assurances of annualized returns.


Introduction
Bitcoin, which is considered the largest cryptocurrency with a market capitalization of about $125 billion [1], experienced its largest-ever inflows and also saw significant plunges in value during the COVID-19 pandemic period. This has caused an unstable intraday price, leading to price uncertainties and threatening its potential to be used as a currency; Bitcoin is thus regarded as a highly volatile digital currency [2]. Different factors contribute to the volatility in the Bitcoin price, including the small market size of Bitcoin trading in contrast to conventional financial assets such as stocks, fiat currencies, and bonds; unmonitored mining activity; news events; its availability to trade 24/7; low liquidity, which increases price fluctuations; shifting sentiment; decentralization; and high speculation. This high price volatility makes it difficult to efficiently predict its price. Moreover, there are structural changes in the price of Bitcoin as a result of the effect of COVID-19. The article "What is going on with the Bitcoin Market" published on the website of Chainalysis [3] describes the response of Bitcoin to the COVID-19 pandemic. Starting on March 9, 2020, Bitcoin exchange markets cumulatively received 1.1 million Bitcoin over an eight-day period, with a peak of 319,000 Bitcoin on March 13, 2020. This differed significantly from the average of 52,000 Bitcoin per day before March 9, 2020. Also, the daily average amount of Bitcoin sent to different Bitcoin exchange markets to be sold between March 12 and March 13, 2020, increased ninefold. This selling pressure caused the price of Bitcoin to fall by approximately 37%.
The Bitcoin market is an inefficient market; hence, the market does not incorporate all available information to determine a fair price for Bitcoin [4][5][6]. Reference [4] concluded that Bitcoin returns do not satisfy the efficient market hypothesis. Using data at different frequencies and both overlapping and nonoverlapping window analysis, [5] examined the dynamics of informational efficiency of Bitcoin and concluded that the Bitcoin market is an inefficient market. Reference [6] supported this claim by investigating the efficiency of the top 31 cryptocurrencies by market capitalization. This suggests that it is possible to uncover price predictability based on historical information. Three types of time-series prediction models have been proposed in the literature: statistical models, artificial intelligence models, and hybrid models. In the past decades, researchers have used conventional statistical models such as autoregressive (AR) models [7], autoregressive moving average (ARMA) [8], autoregressive integrated moving average (ARIMA) [9], and multivariate linear regression [10] for forecasting the price of Bitcoin. Statistical models are not appropriate for chaotic systems (such as the cryptocurrency market) with many uncertainties because they require the time-series data to satisfy specific a priori assumptions, such as stationarity [11]. However, Bitcoin price series are nonstationary and nonlinear [12]. As stated in [13], conventional forecasting models such as regression models can hardly capture nonlinear dynamics in most time-series data.
These imply that there are inherent properties of nonstationarity and nonlinearity in some of the technical indicators that are used as features for prediction. As such, predicting the price of Bitcoin using statistical models is prone to large errors. Time-series data with these stylized facts can be effectively explored using nonlinear models such as machine learning models and decomposition techniques. Machine learning models, which are a subset of artificial intelligence, have received a lot of attention as a result of the advancement of computational intelligence. They are data-driven methods that do not rely on any a priori assumptions and so have diverse applications. Support vector regression (SVR), a type of machine learning model, has become a popular algorithm for different forecasting problems because of its strong nonlinear learning capability [14,15]. However, the performance of machine learning models is highly dependent on the nature and characteristics of the data [16], which makes it difficult for machine learning models to deeply mine the inner characteristics of the Bitcoin series. Also, due to the nonlinear dynamics of the Bitcoin series, which include its inherent fractality and chaoticity, single-stage prediction models are not sufficient to forecast the price of Bitcoin with very high accuracy.
Reference [14] proposed an SVR model based on empirical mode decomposition (EMD) and AR for forecasting electric load. Using unbalanced data, the proposed hybrid model outperformed the original SVR model. By employing moving average technical indicators as input for a multilayer perceptron-based nonlinear autoregressive model with exogenous inputs, [17] predicted the price of Bitcoin. Reference [12] predicted intraday Bitcoin price series using an ensemble model of VMD and a generalized additive model (GAM). The VMD-GAM model was compared to an ensemble model of EMD and GAM. The author concluded that VMD-GAM performed better than EMD-GAM. As noted by [18], VMD performs better than other signal decomposition tools because it eliminates modal aliasing and is robust to noise.
Reference [19] constructed a hybrid forecasting model for stock price indices using VMD and a Gated Recurrent Unit (GRU) network. The constructed model outperformed the single models using VMD or GRU. Reference [20] concluded that their proposed hybrid model (EMD-SVM) outperformed the individual forecasting model. These studies indicate that, in chaotic time-series prediction, hybrid models tend to outperform their single counterparts.
Further, [21] indicated that the price of Bitcoin is heavily fueled by market sentiments and momentum instead of underlying economic fundamentals. Hence, we employ technical indicators as features for predicting the one-minute intraday Bitcoin price. The benefit of technical indicators as features is that they require only historical data of the Bitcoin price and do not depend on any economic fundamentals. Reference [22]  A new framework based on a decomposition algorithm (variational mode decomposition) that is able to capture the chaotic nature of the Bitcoin series has been proposed in the literature for time-series prediction. In variational mode decomposition (VMD), selecting an optimal model is contingent on the model's potential to sample the fundamental dynamics (variational modes) from the original series and the intensity of noise it carries [24]. Hence, the Bitcoin price is decomposed into stationary variational mode function (VMF) components.
These VMFs are reconstructed into new series, which give a better forecasting performance than the original VMFs.
The reconstructed series are combined with the high-dimensional technical indicators as features for the SVR prediction model. This hybrid model is intended to increase forecasting accuracy.
As one of the most volatile assets, is the Bitcoin price predictable out-of-sample in the midst of the COVID-19 pandemic? Will a hybrid of technical indicators and VMF series provide an optimal feature set for high-frequency intraday Bitcoin price prediction during this COVID-19 pandemic? Given different features, and hence distinct training datasets, how do we measure the generalization performance of a support vector model trained on the data? These are the questions we seek to answer in this paper. The contribution of this paper is fourfold: (1) defining a new performance metric, called signal average absolute difference (SAAD), to evaluate the effectiveness of the reconstructed VMF in selecting an optimal mode (K) value, (2) evaluating the out-of-sample predictability of the intraday price of Bitcoin in the midst of the COVID-19 pandemic by using a hybrid of technical indicators (TI) and reconstructed variational mode functions (rVMF) as features for the SVR prediction model, (3) evaluating and comparing the predictive performance of two features (TI and rVMF) to the hybrid model in the midst of COVID-19, and (4) adding to the scarce empirical evidence on hybrid models using SVR, TI, and rVMF in predicting the one-minute intraday Bitcoin price. The rest of the paper is organized as follows: Section 2 provides the materials and methods for constructing the predictive models; data, experimental results, and discussion of the study are presented in Section 3; and the conclusion is outlined in Section 4.

Materials and Methods
In this section, the theoretical concepts for the implementation of the SVR-TI-rVMF prediction model are described in detail. The methodology used for the proposed prediction model and the evaluation metrics are also described in this section.

Technical Indicators and Feature Selection Using Boruta Algorithm.
Technical analysis of an asset is based on the premise that all the important information about that asset is contained in its price and/or other market data like the price low, price high, open price, and the volume traded. In this paper, 30 technical indicators (TI) are used as the initial feature space that characterizes the Bitcoin price. Table 1 presents the list of initial technical indicators before feature selection. A comprehensive review of these technical indicators is given in [25].
Feature selection is an important step that finds a subset of features that minimizes redundant technical indicator features and maximizes relevance to the prediction. It helps to avoid the curse of dimensionality and thus improves the accuracy of prediction. In this paper, the Boruta algorithm is used for feature selection. The Boruta algorithm (BA) is a feature selection algorithm that uses a wrapper approach built on the Random Forest (RF) algorithm to select the most important features for a prediction model. BA can effectively handle the interactions between variables and considers all features that are relevant to the outcome variable. The BA is implemented in seven steps (Steps 1-7).

Variational Mode Decomposition (VMD).
The VMD algorithm is used to decompose a real-valued input signal into a set of modes ($\mu_l$), also called variational mode functions (VMFs). Each VMF has a unique frequency range derived from the input signal. Each VMF is also assumed to be mostly compact around a center pulsation ($\tau_l$), and the modes are extracted concurrently from a convex optimization perspective.
The VMD algorithm can be stated as a constrained variational formulation, equation (1), where $\{\mu_l\} = \{\mu_1, \ldots, \mu_L\}$ are the modes and $\{\tau_l\} = \{\tau_1, \ldots, \tau_L\}$ are the center frequencies of all modes.
Expressing equation (1) as an augmented Lagrangian introduces the balancing parameter c and the Lagrange multiplier β. For the detailed algorithm, see [27].
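For reference, the standard VMD formulation from the general VMD literature can be written with this section's symbols as in the sketch below ($\mu_l$ for the modes, $\tau_l$ for the center frequencies, $c$ for the balancing parameter, $\beta$ for the Lagrange multiplier, $L$ for the number of modes). It is not a verbatim copy of the authors' equations. The input signal $f(t)$, the Dirac distribution $\delta(t)$, the imaginary unit $j$, and the convolution operator $*$ are notation assumed here.

```latex
% Constrained variational problem: each mode is shifted to baseband and
% its bandwidth is measured by the squared L2 norm of the gradient.
\min_{\{\mu_l\},\{\tau_l\}}
  \left\{ \sum_{l=1}^{L} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_l(t) \right] e^{-j \tau_l t} \right\|_2^2 \right\}
\quad \text{subject to} \quad \sum_{l=1}^{L} \mu_l(t) = f(t)

% Augmented Lagrangian: quadratic penalty plus Lagrange multiplier term.
\mathcal{L}\left(\{\mu_l\},\{\tau_l\},\beta\right) =
  c \sum_{l=1}^{L} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_l(t) \right] e^{-j \tau_l t} \right\|_2^2
  + \left\| f(t) - \sum_{l=1}^{L} \mu_l(t) \right\|_2^2
  + \left\langle \beta(t),\, f(t) - \sum_{l=1}^{L} \mu_l(t) \right\rangle
```

In this form, a larger $c$ puts more weight on narrow-band modes, while the last two terms enforce that the modes sum back to the input signal.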

Signal Average Absolute Difference for Variational Mode Determination and Reconstruction of VMF.
The number of variational modes (K) is set from 4 to a maximum possible number (in this study, the maximum value is set to 14) depending on the signal conditions. For each K value, VMD is used to decompose the original signal into K VMFs, and all the VMFs are aggregated into a single signal (the reconstructed VMF). Figure 1 presents the proposed reconstruction of the variational mode function.
In this paper, a new performance metric is introduced to evaluate the effectiveness of the reconstructed VMF and to help in selecting the best K value. The signal average absolute difference (SAAD) computes the average absolute difference between the aggregate VMFs obtained from VMD and the original signal. A very small SAAD indicates that the signals are very similar, and a large SAAD is evidence of information loss relative to the original signal. From [28], the signal average absolute difference is given as
$$\mathrm{SAAD} = \frac{1}{N} \sum_{n=1}^{N} \left| y_n - \mathrm{AVMF}_n \right|,$$
where $N$ is the total number of sampling points in the signals, $y_n$ is the original input signal, and $\mathrm{AVMF}_n$ is the aggregate of the VMFs.
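A minimal Python sketch of the SAAD computation follows. It assumes that "aggregating" the VMFs means summing them point by point, which the text does not state explicitly. The function names and the toy numbers are hypothetical, and the paper's own implementation was in R.

```python
from typing import List, Sequence


def aggregate_vmfs(vmfs: Sequence[Sequence[float]]) -> List[float]:
    """Sum K mode series point by point into one reconstructed signal.

    Assumption: aggregation is a plain sum of the modes.
    """
    return [sum(points) for points in zip(*vmfs)]


def saad(original: Sequence[float], aggregate: Sequence[float]) -> float:
    """Average absolute difference between two equal-length signals."""
    if len(original) != len(aggregate):
        raise ValueError("signals must have the same length")
    return sum(abs(y - a) for y, a in zip(original, aggregate)) / len(original)


# Toy modes standing in for the output of a VMD routine (not Bitcoin data).
toy_vmfs = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
toy_signal = [1.5, 2.5, 4.0]
toy_score = saad(toy_signal, aggregate_vmfs(toy_vmfs))
```

Repeating this for each candidate K and keeping the K with the smallest score mirrors the selection rule described above.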

Support Vector Regression (SVR).
Initially constructed by Vapnik as a classifier [29], SVR is designed with the ability to capture nonlinear relationships in the feature space.It is a machine learning technique that is highly regarded as an effective technique in regression analysis (i.e., functional approximation) [30].
For a set of training patterns $H = \{(x_i, y_i)\}_{i=1}^{m}$ obtained from an unknown function $y = f(x)$ with noise, a function $y = h(x)$ has to be established that depends entirely on H and reduces the difference between h and the unknown function f. Suppose first that f is a linear relationship between x and y, as in linear regression, where x is the feature vector and lives in a space X called the feature space, n is the dimension of x and X, and y is the label for each (x, y). However, the assumption of linearity is too simple to describe the dynamics of most time-series data. Consequently, it is important to consider a nonlinear f. The basis of SVR for nonlinear regression is to construct a mapping $x \to \phi(x)$ from the original n-dimensional space X to a new space $X'$. The dimension of $X'$ depends on the mapping scheme, and it is not necessarily finite. The nonlinear form is expressed in terms of the support vectors $x_k$ in the given training patterns H, the corresponding labels $y_k$, and a kernel defined as the inner product in $X'$.
Some of the commonly used kernels are the radial basis function (RBF), polynomial, and sigmoid kernels:
$$K(x, y) = \exp\left(-\gamma \|x - y\|^2\right) \quad \text{(RBF)},$$
$$K(x, y) = \left(\gamma \langle x, y \rangle + r\right)^d \quad \text{(polynomial)},$$
$$K(x, y) = \tanh\left(\gamma \langle x, y \rangle + r\right) \quad \text{(sigmoid)},$$
where $K(x, y)$ is the kernel function, $\gamma$ is the gamma term in the kernel function, $r$ is the bias term of the polynomial and sigmoid kernels, and $d$ is the degree of the polynomial kernel. Generally, the performance of SVR depends on the settings of the global parameters $C$, $d$, and $\gamma$. SVR always gives the same results when the same dataset is processed at any given time. We train the technical indicator-SVR (TI-SVR), reconstructed variational mode functions-SVR (rVMF-SVR), and TI-rVMF-SVR models using the parameter settings shown in Table 2. This is to help in deciding on optimal parameter values.
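The kernel parameters described above (gamma, the bias term r, and the degree d) can be made concrete with a small Python sketch. It uses the common textbook forms of the RBF, polynomial, and sigmoid kernels, which is an assumption about the exact forms intended here. The function names are illustrative.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def polynomial_kernel(x: Sequence[float], y: Sequence[float],
                      gamma: float, r: float, d: int) -> float:
    """(gamma * <x, y> + r) ** d."""
    return (gamma * dot(x, y) + r) ** d


def sigmoid_kernel(x: Sequence[float], y: Sequence[float],
                   gamma: float, r: float) -> float:
    """tanh(gamma * <x, y> + r)."""
    return math.tanh(gamma * dot(x, y) + r)
```

With the RBF kernel, identical inputs give a value of 1 and the value decays toward 0 as the inputs move apart, at a rate controlled by gamma.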

Data Preprocessing and Evaluation Metrics.
To make the data more suitable for the SVR prediction model, the intraday Bitcoin price data are preprocessed and normalized. To make learning easier for the support vectors, heterogeneous time-series data (time series that have different scales) should be converted to homogeneous data (time series with similar scales). Hence, the Bitcoin time-series data should take small values (normally between 0 and 1) and must be homogeneous (all features should take values in roughly the same range). In this study, the mean normalization method (see equation (7)) is used as the data normalization technique. With this technique, all features are brought to the same scale, with values in the range 0-1. The normalized values are converted back to the magnitude of the original data values via the antinormalization technique given in equation (8).
In equations (7) and (8), $y$ and $y_{\text{normalization}}$ are the input value and the normalized input value, respectively. R statistical software was used in implementing the data normalization. Table 3 shows the evaluation/performance metrics used in evaluating the prediction model.
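The normalization and antinormalization round trip can be sketched in Python as follows. This sketch assumes a min-max style scaling, which is one common choice that yields values between 0 and 1. The exact formulas of equations (7) and (8) may differ. The function names are hypothetical, and the study itself used R.

```python
from typing import List, Sequence, Tuple


def normalize(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale values to [0, 1]; also return the min and max for later inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map normalized values back to the original price scale."""
    return [s * (hi - lo) + lo for s in scaled]


# Toy prices in USD (illustrative only).
prices = [9000.0, 9500.0, 10000.0]
scaled, price_min, price_max = normalize(prices)
restored = denormalize(scaled, price_min, price_max)
```

In practice the minimum and maximum would be taken from the training data only, so that no information from the validation or testing periods leaks into the scaling.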

Proposed Two-Stage Hybrid Model for Predicting the Price of Bitcoin.
Let $A_t$ represent the one-minute closing price of Bitcoin and $P_t$ represent the final price prediction of $A_t$. A two-stage hybrid model for the high-frequency Bitcoin price is proposed (see Figure 2). The two-stage approach is presented as follows:
Stage One. Selection of technical indicators and variational mode decomposition.
Step 1. Technical indicators are filtered using a correlation matrix filter (technical indicators with a correlation of more than 0.7 are removed).
Step 2. The Boruta algorithm is used to select the most important technical indicators for predicting the Bitcoin series. These technical indicators are used as feature set 1.
Step 4. Using evaluation metrics (SAAD and NRMSE), the best K value is selected and used as feature set 2.
Stage Two. Aggregating feature set 1 and feature set 2 as input for SVR.
Step 1. Preprocessing the data (data normalization and data partitioning into training, validation, and testing data sets).
Step 2. Hyperparameter optimization. A set of optimal hyperparameters for the SVR algorithm is selected using grid search. The SVR model is trained using feature sets 1 and 2. Using the testing data, SVR predicts the Bitcoin series.
Step 3. Single-stage prediction models (Figure 3) are constructed and used as competitor models. Evaluation metrics (MAE, RMSE, NRMSE, and MAPE) are used to verify the performance of the proposed two-stage model and the single-stage models.
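The data partitioning and grid search of Stage Two can be sketched in Python as below. The split fractions, the parameter names `cost` and `gamma`, and the toy error function are all hypothetical. In a real run, `validation_error` would train an SVR with the given parameters and return its error on the validation set.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def chronological_split(rows: Sequence, train_frac: float = 0.7,
                        val_frac: float = 0.15) -> Tuple[Sequence, Sequence, Sequence]:
    """Split time-ordered rows into train/validation/test without shuffling."""
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


def grid_search(param_grid: Dict[str, Sequence],
                validation_error: Callable[[Dict], float]) -> Tuple[Dict, float]:
    """Try every parameter combination and keep the one with the lowest error."""
    names: List[str] = list(param_grid)
    best_params: Dict = {}
    best_error = float("inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        error = validation_error(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Toy stand-in for "train SVR, score on validation data" (assumption).
def toy_error(params: Dict) -> float:
    return (params["cost"] - 10) ** 2 + (params["gamma"] - 0.1) ** 2


train, val, test = chronological_split(list(range(100)))
best_params, best_error = grid_search(
    {"cost": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}, toy_error
)
```

Keeping the split chronological avoids training on observations that occur after the ones being predicted, which matters for one-minute price series.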

Results and Discussion
In this section, we present the data and the experimental results obtained using technical indicators, reconstructed variational mode functions, and a hybrid of technical indicators and reconstructed variational mode functions as inputs for the support vector regression model with a radial kernel. The results obtained are also compared in the Discussion section.

Data.
The dataset used for this study is downloaded from https://www.cryptodatadownload.com/data/bitstamp/, a publicly available source of data. The data are high-frequency intraday data sampled at a one-minute time interval from 29/03/2020 to 22/11/2020, making a total of 143464 data points. Each data point contains the minute open, high, low, close, and the trading volume for Bitcoin in United States Dollars (USD). However, after data cleaning, a final sample of 136465 data points was retained. The intraday Bitcoin closing price (close) is used as a measure of the price of Bitcoin in this paper. Figure 4 shows the Bitcoin price dynamics over the selected period under study. From the figure, the price of Bitcoin can be seen to be highly volatile.
Filtering the technical indicators by correlation helps to discard all irrelevant and redundant information. Figure 5 presents the correlation matrix for the 32 technical indicators selected as initial features for the prediction model. Positive and negative correlations are shown in blue and red colours, respectively. The correlation matrix is reordered according to the correlation coefficient. This is important for identifying the hidden structure and pattern in the matrix. Colour intensity is proportional to the correlation coefficients. From the correlation matrix (Figure 5), some of the technical indicators are highly correlated. A higher correlation between two technical indicators indicates the redundancy of one feature with regard to the other feature.
The resulting correlation matrix is shown in Figure 6. The filtered technical indicators for the Boruta algorithm are given in Table 5. The Boruta algorithm is used to select the most important features for the prediction model. The algorithm performed 10 iterations in 3.126292 hours. All 10 attributes were confirmed important; that is, no attributes were deemed unimportant.
The graphical summary of the Boruta algorithm run is shown in Figure 7 for the features. The boxplots show the distribution of the features' importance over the Boruta run, using colours to mark final decisions. Green boxplots represent Z scores of confirmed technical indicator attributes, and blue boxplots represent the minimum, mean, and maximum Z score of a shadow technical indicator attribute. The feature statistics (see Table 6) present the mean importance (meanImp), median importance (medianImp), minimum importance (minImp), maximum importance (maxImp), and the decision for each feature over the complete iterations.
The mean importance value of the Zig Zag indicator is the highest among the selected features. This indicates the importance of the Zig Zag indicator as a feature.

Decomposition of Bitcoin Price Series via VMD and Reconstruction of Variational Mode Function.
The intraday Bitcoin price is decomposed into K (K = 4, 6, 8, 10, 12, 14) relatively stationary variational mode functions using VMD, as depicted in Figure 1. With a SAAD value of 0.0006 and an NRMSE value of 7.1537e-04, K = 14 is selected as the best K value (see Table 7). The VMD results for the optimal number of variational modes (K = 14) are shown in Figure 8. Compared to the other VMFs, VMF 2 and VMF 11 have the largest and lowest errors, respectively. The reconstructed signal for the effective mode number (K = 14) is given in Figure 9, and the figure shows that VMD was able to avoid information loss during the decomposition process.
3.4. Discussion. Single-stage models (SVR-TI and SVR-rVMF) are constructed and compared to the SVR-TI-rVMF model to demonstrate the reliability and efficiency of the constructed SVR-TI-rVMF model in improving the performance of Bitcoin price prediction. The SVR-TI and SVR-rVMF models are constructed using technical indicators and reconstructed variational mode functions, respectively, as features (inputs) for SVR. Figures 10 and 11 illustrate the visual performance using the selected technical indicators and the reconstructed variational mode functions as features for the SVR model. The visual performance of the proposed SVR-TI-rVMF is presented in Figure 12. In Figure 13, we present the comparison results with respect to the evaluation metrics of all three models. In view of the model effectiveness and efficiency, on the whole, we can conclude that the proposed model is quite competitive against the two single-stage models, SVR-TI and SVR-rVMF. In other words, the hybrid model leads to better accuracy. Furthermore, the results from Table 8 show that hybrid models outperform their single counterparts in chaotic time-series prediction.

Conclusion
In this paper, we combine the advantages of technical indicators (TI) and reconstructed variational mode functions (rVMF) obtained from variational mode decomposition (VMD) to construct a hybrid support vector prediction model for the one-minute intraday Bitcoin price. The model (SVR-TI-rVMF) shows that decomposition methods (VMD) and TI can be used together, yet separately, to construct a hybrid model to predict the Bitcoin price in the cryptocurrency market. Our contribution in this paper is as follows: (1) defining a new performance metric, called signal average absolute difference (SAAD), to evaluate the effectiveness of the reconstructed VMF in selecting an optimal mode (K) value, (2) predicting the out-of-sample intraday price of Bitcoin in the midst of the COVID-19 pandemic via a hybrid of TI and rVMF as features for the SVR prediction model, (3) evaluating and comparing the predictive performance of two features (TI and rVMF) to the hybrid model in the midst of COVID-19, and (4) adding to the scarce empirical evidence on hybrid models using SVR, TI, and rVMF in predicting the one-minute intraday Bitcoin price. Based on this study, investors can decide whether to buy Bitcoin or not.
The findings are important for practitioners, such as traders and investors, as well as policymakers, who want to learn more about the cryptocurrency market.

Step 1. Generate duplicates of the technical indicators.
Step 2. Randomly shuffle the original and duplicate technical indicators to take out their correlations with the outcome variable.
Step 3. Using the RF algorithm, search for the key technical indicators based on higher mean values.
Step 4. Using the mean/standard deviation, compute the Z score.
Step 5. From the duplicates and technical indicator features, find the maximum Z score.
Step 6. Remove each technical indicator feature whose Z score is less than this maximum Z score.
Step 7. Repeat Steps 1-6 until the iterations are complete.
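The shadow-feature idea behind these steps can be illustrated with a deliberately simplified Python sketch. It is not the Boruta algorithm itself: the Random Forest importance and Z score are replaced by the absolute Pearson correlation with the target, purely to keep the example dependency-free. It shuffles only the duplicated copies, as in the usual description of Boruta. All names and data are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def shadow_feature_selection(features: Dict[str, Sequence[float]],
                             target: Sequence[float],
                             n_iter: int = 30, seed: int = 0) -> List[str]:
    """Keep features that beat the best shuffled copy in most iterations."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Duplicate every feature and shuffle the copy to break its link
        # with the target (the "shadow" features).
        shadow_scores = []
        for values in features.values():
            shadow = list(values)
            rng.shuffle(shadow)
            shadow_scores.append(abs(pearson(shadow, target)))
        best_shadow = max(shadow_scores)
        # A real feature scores a hit when it beats the best shadow.
        for name, values in features.items():
            if abs(pearson(values, target)) > best_shadow:
                hits[name] += 1
    return [name for name, count in hits.items() if count > n_iter / 2]


# Toy data: one informative feature and one symmetric, uncorrelated feature.
target = [float(i) for i in range(20)]
toy_features = {
    "signal": [2.0 * t + 1.0 for t in target],
    "noise": [(t - 9.5) ** 2 for t in target],
}
selected = shadow_feature_selection(toy_features, target)
```

A feature that cannot beat randomly shuffled copies of the inputs carries no more information than chance, which is the rationale for discarding it.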

Figure 1: Proposed reconstruction of variational mode function.

3.2. Selection of Technical Indicators. Before applying the Boruta algorithm to identify the most relevant technical indicators for the prediction model, we filter the technical indicators using a correlation matrix filter. That is, we remove technical indicators with a correlation of more than 0.7.
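A correlation matrix filter of this kind can be sketched in Python as follows. The text does not say which member of a highly correlated pair is dropped. This sketch assumes a greedy rule that keeps the first indicator encountered and compares absolute correlations against the 0.7 threshold. The indicator names and values are hypothetical.

```python
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def correlation_filter(features: Dict[str, Sequence[float]],
                       threshold: float = 0.7) -> List[str]:
    """Greedily keep indicators whose |correlation| with every kept one is <= threshold."""
    kept: List[str] = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy indicators: ind_b nearly duplicates ind_a, ind_c is uncorrelated with it.
toy_indicators = {
    "ind_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ind_b": [2.0, 4.0, 6.0, 8.0, 11.0],
    "ind_c": [1.0, -1.0, 1.0, -1.0, 1.0],
}
kept_indicators = correlation_filter(toy_indicators)
```

The order in which indicators are visited determines which member of a redundant pair survives, so a different ordering can produce a different but equally valid subset.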

Figure 7: Boruta result plot (a) and plot of importance history (b) for the Boruta run on the technical indicators.

Table 2: Parameter settings for SVR.

Table 3: Evaluation metrics. $n$, $A_t$, $\bar{A}$, $P_t$, and $e_t = A_t - P_t$ are the size of the data set, the original data values, the mean of the original data values, the predicted values, and the error, respectively.
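The four evaluation metrics can be sketched in Python using the symbols defined for Table 3. Two conventions are assumptions here, since the table's formulas are not shown: NRMSE is taken as the RMSE divided by the mean of the original values, and MAPE is expressed as a percentage.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def nrmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE divided by the mean of the original values (assumed convention)."""
    return rmse(actual, predicted) / (sum(actual) / len(actual))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (assumed convention)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy values (illustrative only, not taken from the study).
toy_actual = [100.0, 200.0, 300.0]
toy_predicted = [110.0, 190.0, 300.0]
```

MAE and RMSE carry the units of the price, whereas NRMSE and MAPE are scale-free, which makes them easier to compare across price levels.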
Figure 2: The structure of the proposed SVR-TI-rVMF model.
the optimal parameter values selected for the SVR model, and the testing data were used to test the constructed SVR model.
3.1.1. Data Preprocessing. The historical dataset downloaded is transformed into an acceptable format as inputs for the machine learning technique. The following preprocessing steps were used:

Table 4: Descriptive statistics of closing price (price) of Bitcoin.
Data Transformation. The original historical time-series data is transformed into a set of technical indicators. In this study, 30 technical indicators are computed for each data point. Using the correlation matrix filter, the technical indicators are filtered, and the Boruta algorithm is then used to select important features (10 technical indicators) as inputs (see Figures 5 and 6 and Table 5). The closing price of the data is also decomposed into 8 different IMFs.
Table 8 presents the performance measures of the constructed models using the testing data. The lower the performance metrics (MAE, RMSE, NRMSE, and MAPE), the more accurate the model. It is evident from the table that the constructed SVR-TI-rVMF hybrid model has the lowest MAE (748.4339 USD), RMSE (993.6821 USD), and NRMSE (0.0919). However, SVR-TI performed well when MAPE was used as the evaluation metric. The proposed model shows the

Table 5: Filtered features for Boruta algorithm feature selection.

Table 7: Evaluation metrics for different variational mode functions.