A Neural Learning Approach for a Data-Driven Nonlinear Error Correction Model

A nonlinear error correction model (ECM) is developed to fit nonlinear relationships between nonstationary time series that are cointegrated. Unlike previous parametric methods, this paper constructs a hybrid neural network that learns the nonlinear error correction model by combining a linear recurrent neural network with a multilayer BP network. The network learning algorithm is derived using gradient descent and error back propagation. Following the data-driven principle, all network parameters are obtained through network learning and training. Daily data on the gold price and the US dollar index in 2021 were used to verify the proposed nonlinear ECM neural learning method, and the results were compared using the likelihood ratio chi-square test. Simulation results show that the proposed data-driven nonlinear error correction neural learning method yields a statistically significant improvement in goodness of fit for complex nonlinear relationships between time series.


Introduction
The error correction model (ECM) proposed by Davidson et al. [1] is a regression model built from the differences of variables and error correction terms. According to the Granger representation theorem proposed by Engle and Granger [2], if nonstationary variables are cointegrated, an ECM can be established to reflect the relationship between long-term equilibrium and short-term fluctuations among the variables. ECM combines the long-term equilibrium relation with the short-term disequilibrium fluctuation in the variable series, improves the stability of the series prediction model, and effectively avoids the spurious regression problem. ECM not only makes full use of the possible equilibrium relationship between nonstationary variables but also retains the economic significance of the variables. It has become a classical model for analyzing nonstationary time series and has been widely studied and applied. It is also treated in textbooks such as Baltagi [3] and Enders [4].
The application of ECM in time series analysis has achieved significant results. Hall et al. [5] established an ECM for house prices in the UK; the model can switch disequilibrium correction on or off through the adjustment of coefficients. Kulendran and King [6] successfully used error correction and time series models to forecast quarterly international tourist flows and pointed out that the performance of the error correction model can be improved by improving the methods for deciding how to model nonstationarity and seasonality. Cook [7] studied how the potential prediction bias of the asymmetric ECM arises; by dividing the error correction term, it is revealed that the prediction bias shows regularity in the prediction space. Zhang et al. [8] analyzed electricity futures price forecasts using the ECM; the model includes not only lagged variables such as the futures price and spot price but also the long-term relationship between them. Based on cointegration theory, Yan and Zhao [9] constructed an ECM for the GDP of China, which accurately reflects the actual situation. Zhang and Zhu [10] studied the relationship between real estate investment and regional economic development in Chengdu, China. The cointegration model describes the long-term equilibrium relationship between variables, and the error correction model reflects the short-term disequilibrium relationship between them. The results show that real estate investment and economic growth in Chengdu are subject to short-term fluctuations and long-term equilibrium. Kim et al. [11] used ECM to predict the logarithmic rate of return of bitcoin (BTC), analyzed how BTC is affected by other coins, and carried out Granger causality tests on 14 cryptocurrencies. Liang [12] used ECM and found that the relationship between the bitcoin yield and the indicators that measure monetary function is not significant, which rejects the original assumption that bitcoin can perform a monetary function and indicates that bitcoin has neither the ability nor the potential to do so.
To make ECM more widely applicable and adaptable to more time series characteristics, many studies have carried out theoretical research on ECM. Xu et al. [13] studied a static source error correction model based on MATLAB and Simulink. Jochmann et al. [14] developed a stochastic search variable selection method for a vector error correction model and successfully applied it to UK macroeconomic modelling. Compared with the currently popular regression and vector autoregressive models, this method can relax many possible restrictions on the cointegration space. Hong et al. [15] studied the influence of measurement errors on the analysis of vector process error correction models and proposed a method that uses an instrumental variable (IV) to obtain the asymptotic distribution of the reduced rank estimator so as to eliminate the adverse effect of endogeneity.
However, the traditional ECM still relies on linear regression. A nonlinear relationship between variables is difficult for the model to capture fully, which limits the application scope of ECM to a certain extent. Although the relationship between cointegration and error correction (EC) models is well described in the linear setting, its extension to the nonlinear setting is still a challenge.
Oliveira et al. [16] proposed a hybrid optimized error correction system for time series prediction, which uses a linear model of the time series, a nonlinear model of the error series, and a combination of the forecasts of the different methods. The system searches for the linear and nonlinear components and the best parameters of the combination method with a particle swarm optimization algorithm, and it has achieved good results in time series prediction. Berger [17] estimated an ECM of savings and investment. The model distinguishes between short-term and long-term capital flows, and its parameters are allowed to change over time; they are estimated by the Kalman filter and maximum likelihood. Saikkonen [18] studied the Granger representation theorem in the setting of a general nonlinear vector autoregressive error correction model. The model allows for nonlinear autoregressive conditional heteroscedasticity, and the conditional distribution involved can be a general mixture distribution. Omay et al. [19] proposed a cointegration test based on nonlinear error correction in panel data. Escribano and Mira [20] studied the nonlinear error correction model, proposed a theoretical framework based on the concept of near epoch dependence (NED), and partially extended the Granger representation theorem to the nonlinear case. Psaradakis and Spagnolo [21] studied the ECM for nonlinear and discontinuous adjustment to long-term equilibrium, proposed a nonlinear error correction model with regime switching, and analyzed its prediction performance. The research shows that, if the nonlinear model is used properly, considerable benefits can be obtained, especially when the disequilibrium adjustment is strong and/or the range of parameter change is relatively large. Ma et al. [22] proposed a nonlinear multivariate spatiotemporal threshold vector error correction model for short-term traffic state prediction using cointegration theory with an error correction mechanism. Through a threshold switching mechanism, the spatial cross-correlation information is combined with a piecewise linear vector error correction model to handle unknown structural changes in traffic time series. Song and Lei [23] used the nonlinear ECM to better explain the short-term dynamic mechanism of China's broad money demand and analyzed the money demand function under the condition of an open economy by introducing the exchange rate as a variable.
However, it is not easy to build a nonlinear model by parametric methods. Intelligent computing methods have been used successfully to deal with nonlinear time series. Hornik et al. [24] proved that neural networks can approximate nonlinear dynamics arbitrarily well. Various neural networks, and methods that combine them with other models, have been applied to nonlinear time series analysis. Based on cointegration and Granger causality analysis, Haefke and Helmenstein [25] constructed linear and neural network error correction models of the Austrian initial public offering index (IPOX ATX). The neural network adopts an enhanced feed-forward structure, takes the Schwarz information criterion as the estimator of prediction risk, and uses the significant relationship between IPOX ATX and the Austrian stock market index ATX to predict IPOX ATX. Zhu et al. [26] proposed a method combining neural networks (NNs) and data assimilation (DA), which improves the assimilation process and the prediction results of time series when the structural model is uncertain. Wu et al. [27] studied the consumer price index (CPI) series of Beijing; using the advantages of the ARIMA time series model in linear prediction and of the BP neural network model in nonlinear prediction, they established a combined autoregressive integrated moving average (ARIMA) prediction model with BP network error correction.
Statistics is the art and science of collecting data, analyzing data, and making inferences from data; it includes parameter estimation, hypothesis testing, regression analysis, factor analysis, time series analysis, and nonparametric statistics. Traditional statistics is developed mainly on the basis of probability theory: a mathematical model is established, data are collected from the observed system, quantitative analysis is carried out, and inferences and forecasts are then made to provide a basis and reference for relevant decisions. Establishing accurate mathematical models is the most important task in traditional statistical methods. However, although traditional statistical methods have the great advantage of being interpretable, model-based methods often show large deviations when dealing with complex systems, and it is very difficult or even impossible to establish accurate mathematical models of the observed systems. Moreover, in the era of big data, great changes have taken place in the concepts of sample, data type, data acquisition, quantitative methods, and analysis methods. Statistics is therefore required to solve much more complex problems, which places higher demands on it. In this situation, it is meaningful to explore and develop data-driven statistical methods. A data-driven method uses only the input and output data of the observed system for direct quantitative analysis, which removes the dependence of traditional statistical theory on the mathematical model and overcomes the problems of complex dynamic modeling and robustness. Data-driven methods have been widely developed in control engineering, fault diagnosis, production management, and other fields. For example, in control engineering, many different data-driven methods have been studied so far, typically PID control, iterative learning control (ILC), and model-free adaptive control (MFAC).
In recent years in particular, data-driven approaches combining self-attention mechanisms and generative adversarial networks (GANs) have been studied in depth and have led to remarkable applications. Hu et al. [28] developed a self-attention-based machine theory of mind for electric vehicle charging demand forecasting. A short-term probabilistic charging demand forecast model is proposed to estimate the future charging demand quantiles of a charging station 15 min ahead. Case studies based on real-world data have demonstrated its superiority over state-of-the-art methods in electric vehicle charging demand forecasting. Hu et al. [29] also proposed electrochemical-theory-guided modelling of the conditional GAN to improve both point and probabilistic forecasts of battery calendar ageing. Using the ability of GANs to learn arbitrarily complex distributions, the Capacity Forecast GAN (CFGAN) is proposed to approximate all the possible joint distributions. Because electrochemical knowledge guides the design of CFGAN's crucial parts, CFGAN achieves a satisfying consistency between knowledge and data, making it both knowledge-driven and data-driven.
As is widely known, being a "black box" is a shortcoming of machine learning, and interpretability is an important research direction for machine learning approaches. Liu et al. [30] made meaningful contributions in this direction. They proposed an interpretable machine learning framework that can effectively predict battery product properties, explain dynamic effects, and reveal the interactions of manufacturing parameters. Because no specific knowledge of battery manufacturing mechanisms is required, this data-driven framework can be easily adopted by engineers. The work assists engineers in drawing critical insights about the underlying complicated battery material and manufacturing behaviour and further contributes to the smart control of battery manufacturing. Liu et al. [31] developed an ensemble learning approach that is accurate, interpretable, and data-driven. The effects of component parameters from the mixing stage on the manufactured results of Li-ion battery electrodes are scrutinized via classification modeling. The proposed ensemble learning framework based on RUBoost can compensate for the class imbalance issue and classify three key quality indicators.
This paper establishes the error correction model (ECM) between the US dollar index and the gold price based on cointegration analysis and uses a combined network of a linear recurrent neural network and a multilayer BP neural network to fit the nonlinear relationship between the two. The main difference between this paper and existing work is that the data-driven method is developed within the framework of traditional statistical theory. The proposed method not only promotes the application of data-driven methods but also combines traditional statistical theory with data-driven solutions in a new way, so as to take scientific rigor, accuracy, and interpretability into account.
The motivation of the research in this paper rests mainly on the following two aspects: (1) ECM can describe the relationship between the US dollar index and the gold price well, but it needs to be extended to the nonlinear case. As both the US dollar and gold are important components of international reserves, their price linkage mechanism affects the optimal allocation of international reserves and has been closely watched, as noted by Yang and Fang [32]. The linkage between the US dollar and the gold price is therefore worthy of in-depth study. Several studies have shown that ECM can describe this linkage well. Zhang [33] studied the impact of the US dollar exchange rate on the gold price and described the linkage between them through cointegration analysis and ECM. Nie and Jiang [34] also established a linear ECM between the gold price and the US dollar index and studied the long-term equilibrium relationship between them. However, the time series of the US dollar index and the gold price have strong nonlinear characteristics. The research results of Gilmore [35] and Joy [36] show that the gold price series, with its violent fluctuations, has chaotic characteristics, that its multifractal intensity is time-varying, and that the relationship between the gold price and US dollar index time series is nonlinear. Therefore, to describe the nonlinear linkage between the US dollar and gold more accurately, the ECM based on linear regression needs to be extended to a nonlinear ECM.

Computational Intelligence and Neuroscience
(2) Neural networks have been widely used in nonlinear time series analysis, but neural model selection needs to comply with statistical principles. Although neural network learning models and even deep learning models have been successful in the nonlinear field, their internal structure is usually very complex, their operation is like a black box, and the intermediate process is difficult for humans to understand. While this kind of nonlinear model improves accuracy, the number of parameters to be estimated often surges, which increases model instability and the risk of overfitting. Therefore, whether the improvement in goodness of fit can offset the negative effects of the additional parameters is the focus of this paper. This paper combines the neural network learning method with the traditional linear statistical model. The goodness of fit of the model is evaluated by multiple indexes, and a nested model statistical test is used to select the model. Only when the goodness-of-fit gain of the complex model over the simple model is statistically significant is the added complexity considered worthwhile. This design not only provides a basis for comparing linear and nonlinear time series models but also supports the subsequent accurate analysis of the relationship between time series. It is therefore worthy of in-depth discussion and research.
In the rest of the paper, Section 2 describes the proposed data-driven nonlinear ECM with the neural learning approach and its implementation. Section 3 presents a simulation study that applies the proposed nonlinear error correction neural learning method to the nonlinear relationship between the US dollar index and the gold price and reports its performance in this real-world case. Section 4 gives some concluding remarks.

Methodology
In this section, we explain the neural learning methods for data-driven nonlinear ECM after introducing model-based ECM.

Model-Based ECM.
ECM is a specific form of regression model for nonstationary cointegrated time series. After verifying whether there is a cointegration relationship between the time series variables, the variables representing the short-term effect and the degree of deviation from the long-term equilibrium state can be constructed, and the parameters can be estimated by linear regression.
The cointegration test can be used to determine whether there is a long-term equilibrium relationship between time series variables, that is, to test whether a linear combination of the time series variables is stationary. In this paper, the two-step cointegration test method proposed by Engle and Granger [2] is used.
Let $Y_t$ and $X_t$ denote the two time series to be analyzed. First, after verifying that $Y_t$ and $X_t$ are integrated of the same order, the following equation is estimated by ordinary least squares (OLS):
$$Y_t = \alpha_0 + \alpha_1 X_t + \mu_t, \tag{1}$$
where $\mu_t$ is the error term. After obtaining the estimated coefficients $\hat{\alpha}_0$ and $\hat{\alpha}_1$, the estimated value of $Y_t$ can be calculated as $\hat{Y}_t = \hat{\alpha}_0 + \hat{\alpha}_1 X_t$, and the disequilibrium error $e_t$ is calculated by
$$e_t = Y_t - \hat{Y}_t. \tag{2}$$
Second, the Dickey-Fuller (DF) test or Augmented Dickey-Fuller (ADF) test is used to test the stationarity of $e_t$. If $Y_t$ and $X_t$ are integrated of order $d$ and $e_t$ is integrated of order $d - b$, then $Y_t$ and $X_t$ are said to be cointegrated of order $(d, b)$. In their study of the long-term equilibrium relationship between the gold price and the US dollar index, Nie and Jiang [34] applied the ADF cointegration test to the two series and found that the logarithmic series were nonstationary, that their first-order differences were stationary, and that both were therefore integrated of order one. Moreover, they found that the residuals had first-order autocorrelation but no second-order autocorrelation. As a result, the error correction model with first-order lag is presented as equation (3), where $Y_t$ stands for the gold price and $X_t$ stands for the US dollar index. $\Delta Y_t = Y_t - Y_{t-1}$ is the dependent variable in the model. $e_{t-1}$ represents the degree of deviation from the long-term equilibrium state in the previous period, and the differenced terms represent the short-term effects. $\epsilon_t$ is the error term. After linear regression, the Akaike information criterion (AIC) and Schwarz criterion (SC) values of this error correction model are the smallest, $R^2$ is the largest, and there is no autocorrelation in the residual sequence, so the model can be considered the best.
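The display for equation (3) does not survive in this text. A plausible form, assembled from the symbols that the later sections use for the same terms ($\beta_0$, $\beta_1$, $\pi_y$, $\pi_x$), is sketched below. The coefficient name $\lambda$ on $e_{t-1}$ is an assumption introduced only for illustration and is not taken from the source.

```latex
\Delta Y_t = \beta_0 + \beta_1 \Delta X_t + \pi_y \Delta Y_{t-1} + \pi_x \Delta X_{t-1}
           + \lambda e_{t-1} + \epsilon_t
```

Under this reading, a negative $\lambda$ would pull $\Delta Y_t$ back toward the long-term equilibrium whenever the previous period's residual $e_{t-1}$ is positive.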
The advantage of the model in equation (3) is that the difference terms eliminate possible trend factors between the variables, which weakens multicollinearity and avoids the spurious regression problem as far as possible. Compared with the conventional difference regression model $\Delta Y_t = f(\Delta X_t, v_t)$, this model adds to its independent variables the sequence of deviations from the long-term equilibrium state, estimated from the levels of $Y_t$ and $X_t$, so that the estimated value of $\Delta Y_t$ is corrected according to the previous degree of disequilibrium.
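The first stage of the two-step procedure and the construction of the ECM variables can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ols_line` and `ecm_variables` are hypothetical, only the standard library is used, and the DF/ADF stationarity test on $e_t$ is not included.

```python
from statistics import mean


def ols_line(x, y):
    """Least-squares intercept and slope for y = a0 + a1 * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def ecm_variables(x, y):
    """Build first-order-lag ECM variables from the level series x and y.

    Returns rows (dY_t, dX_t, dY_{t-1}, dX_{t-1}, e_{t-1}).
    """
    a0, a1 = ols_line(x, y)                             # long-run regression
    e = [yi - (a0 + a1 * xi) for xi, yi in zip(x, y)]   # disequilibrium e_t
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]    # dx[k] is dX_{k+1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]    # dy[k] is dY_{k+1}
    rows = []
    for t in range(2, len(x)):                          # one lag of the differences is needed
        rows.append((dy[t - 1], dx[t - 1], dy[t - 2], dx[t - 2], e[t - 1]))
    return rows
```

Each returned row holds the dependent variable followed by the regressors of a first-order-lag ECM, so the rows can be passed to any linear regression routine or used as training samples for a network.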
Generally, the linear regression method is still used to fit the model in equation (3). However, time series data often have strong nonlinear characteristics. To describe the interaction mechanism between time series with nonlinear characteristics more accurately, most nonlinear learning methods based on neural networks represent the model as the general nonlinear ECM of equation (4). Unfortunately, because of the complex structure of neural networks, no statistical analysis of the network process can be carried out, whatever learning algorithm is adopted. As in Li's [37] work, the nonlinear ECM is expressed in this paper by combining equations (3) and (4) into equation (5), where $g(e_{t-1})$ describes the adjusting effect of the last cointegration residual $e_{t-1}$ on $\Delta Y_t$. Its regulatory effect depends on the functional form of $g(\cdot)$. In the work by Li [37], taking the commonly used smooth transformation autoregressive (STAR) nonlinear structure as an example, equation (5) is expressed as equation (6), where $f(\cdot)$ is the smooth transformation function, which varies continuously with $e_{t-1}$ within $[0, 1]$. The smoothing parameter that determines the transfer speed of this mechanism is the coefficient $\gamma$ (a positive parameter), and $c$ is a threshold value that marks the point at which the transfer occurs. The exponential transformation function of the ESTR model, equation (7), is nonmonotonic with respect to the transformation variable but symmetric about the transformation point. The logistic transformation function of the LSTR model, equation (8), is monotonic with respect to the transformation variable but asymmetric about the transformation point. $\gamma$ and $c$ can be estimated by nonlinear least squares. If the transformation function $f(\cdot) = 0$ or $1$, the cointegration relationship is linear; otherwise, it is nonlinear. The F distribution can be used to test the nonlinear hypothesis.
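The two transformation functions can be sketched in Python as follows. The sketch assumes the standard STAR forms, $1 - \exp(-\gamma (e_{t-1} - c)^2)$ for ESTR and $1 / (1 + \exp(-\gamma (e_{t-1} - c)))$ for LSTR, with a positive smoothing parameter (written `gamma` here) and a threshold `c`. The function names `estr` and `lstr` are illustrative and do not come from the source.

```python
import math


def estr(e, gamma, c):
    """Exponential transition: 0 at e == c, approaches 1 as |e - c| grows."""
    return 1.0 - math.exp(-gamma * (e - c) ** 2)


def lstr(e, gamma, c):
    """Logistic transition: rises monotonically from 0 to 1, equals 0.5 at e == c."""
    return 1.0 / (1.0 + math.exp(-gamma * (e - c)))
```

The exponential form treats deviations above and below the threshold alike, while the logistic form responds differently on the two sides, which matches the symmetric and asymmetric behaviour described in the text.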
However, the null hypothesis for testing whether equation (6) is linear is that the smoothing coefficient $\gamma = 0$. Under this null hypothesis, the model cannot be identified because the parameters $\varphi$ and $c$ can take any value. In the work by Granger [38], Teräsvirta [39], and Song and Lei [23], a third-order Taylor expansion was carried out, and the nonlinear hypothesis was then tested on the resulting auxiliary regression.
The transformation function is thus used to establish the nonlinear model of the ECM; however, the function should change according to the characteristics of different time series. Further, the parameters of the transformation function are obtained by regression. Whenever the data are updated, the regression has to be performed again.
When time series are cointegrated, the error correction model is one of the most effective statistical methods, but its regression is essentially linear and cannot accurately describe a nonlinear relationship between the series. The transformation function and other modified modeling methods that have been developed are also difficult to adapt to the complex nonlinear linkage of time series. In this paper, a data-driven neural network learning method is developed to describe the nonlinear linkage between time series by using the nonlinear expression ability of neural networks. This method not only adapts to changes in the data but can also express the nonlinear relationship. More importantly, it remains within the framework of existing statistical methods and can be subjected to traditional statistical tests, which improves the scientific rigor and interpretability of the approach. The data-driven approach developed in this paper is therefore necessary.
This paper explores the use of the arbitrary nonlinear approximation ability of neural networks to establish a nonlinear ECM. Through learning, the obtained network can accurately describe the nonlinear ECM. However, unlike existing neural-network-based nonlinear time series models, this paper does not simply choose the model with the best goodness of fit but emphasizes comparing models by the criterion of a statistical test.
It is worth pointing out that this paper establishes a hybrid neural nonlinear ECM integrating RNN and BP networks by using the nonlinear representation ability of neural networks. We did not choose only one of these networks, or some other deep neural network, to learn this nonlinear relationship. This is because existing research results based on statistical tests have shown that the structure of the nonlinear ECM between cointegrated time series is more consistent with the proposed network structure. In addition, the proposed method can avoid the "black box" problem of neural networks and achieve interpretability of its process in terms of statistical significance.

Data-Driven Nonlinear ECM Neural Network.
The construction of the neural network data-driven nonlinear error correction learning model is divided into three steps. First, provided that the time series data meet the conditions for constructing an ECM, the ECM variables representing the short-term effect and the degree of deviation from the long-term equilibrium state are constructed. Second, the nonlinear error correction learning model based on a two-layer recurrent neural network (RNN) and a multilayer BP network is established, together with the parameter learning algorithm of the network. Finally, model selection among linear models and neural networks with different parameters is decided by the nested model statistical significance test.
It is well known that a three-layer feed-forward neural network can approximate any complex nonlinear continuous function if the hidden layer contains enough neurons [24]. With enough hidden units, the approximation error can be made arbitrarily small, and the number of network layers can be increased to improve the approximation performance of the network. Therefore, this paper constructs a hybrid network of a linear RNN and a multilayer BP neural network to model the nonlinear error correction model expressed in equation (5). This paper takes a four-layer BP neural network with two hidden layers as an example (a deeper BP network only requires additional hidden layers). The network inputs are $\Delta X_t$ and $e_{t-1}$, and the output is $\Delta Y_t$. The first-order lag nonlinear neural network ECM is given by equation (9). Since the output of the multilayer BP neural network uses the Sigmoid function in this paper, the nonlinear part $g(e_{t-1})$ of the ECM is expressed as the combination of a linear and a nonlinear term, $\rho_1 e_{t-1} + \rho_2\,\mathrm{BPNN}(e_{t-1})$. The structure of the neural network, with its inputs and outputs, is shown in Figure 1.
In Figure 1, the input layer includes $\Delta X_t$, its first-order lag $\Delta X_{t-1}$, and $e_{t-1}$. The BP part of the network has two hidden layers. The input of the $i$-th neuron of the first hidden layer is $net_i$; after the activation function mapping, the output of the first hidden layer, $h^{(1)}_i$, is obtained. The input of the $j$-th neuron of the second hidden layer is $net_j$, and its output is $h^{(2)}_j$. The output of the BP neural network is then computed from the outputs of the second hidden layer. The output layer of the network includes the following three parts. The first part is the linear combination of the multilayer BP output and the error correction term, $\rho_1 e_{t-1} + \rho_2\,\mathrm{BPNN}(e_{t-1})$. The second part is the output of $\Delta X_t$ through the linear network, $\beta_0 + \beta_1 \Delta X_t$, where $\beta_1$ is the network weight parameter and $\beta_0$ is the offset parameter of the network. The third part is the linear combination of the first-order lag $\Delta X_{t-1}$ of the network input and the first-order lag $\Delta Y_{t-1}$ of the output, $\pi_y \Delta Y_{t-1} + \pi_x \Delta X_{t-1}$, where $\pi_y$ and $\pi_x$ are the network weight parameters.
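The layer-by-layer displays are not recoverable from this text, so the following is a hedged reconstruction rather than a quotation. It assumes that the BP subnetwork takes only $e_{t-1}$ as input and borrows the parameter names $u_i$, $b^{(1)}_i$, $v_{ij}$, $b^{(2)}_j$, $w_j$, and $b^{(o)}$ from the learning algorithm subsection. The assignment of each name to a layer is an assumption.

```latex
\begin{aligned}
net_i &= u_i\, e_{t-1} + b^{(1)}_i, &\quad h^{(1)}_i &= f(net_i),\\
net_j &= \sum_i v_{ij}\, h^{(1)}_i + b^{(2)}_j, &\quad h^{(2)}_j &= f(net_j),\\
\mathrm{BPNN}(e_{t-1}) &= f\Big(\sum_j w_j\, h^{(2)}_j + b^{(o)}\Big),\\
\Delta\hat{Y}_t &= \rho_1 e_{t-1} + \rho_2\,\mathrm{BPNN}(e_{t-1})
  + \beta_0 + \beta_1 \Delta X_t
  + \pi_y \Delta Y_{t-1} + \pi_x \Delta X_{t-1}.
\end{aligned}
```

The last line is simply the sum of the three output parts described in the text, which is how the first-order lag network ECM of equation (9) is read here.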
Because the network output contains the lag term of the network output, which is equivalent to feedback in the network, this network is a hybrid of a multilayer BP network and a linear RNN. The neural network activation function $f(\cdot)$ selected in this paper is the Sigmoid function, $f(x) = 1/(1 + e^{-x})$. A network with more hidden layers can likewise be used to improve the nonlinear approximation ability.
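A forward pass through the hybrid network can be sketched as below. This is an illustration under assumptions, not the authors' code: the BP subnetwork is assumed to take only $e_{t-1}$ as input, the parameter dictionary `p` and its keys are hypothetical names, and `p["v"][j][i]` is assumed to hold the weight from first-layer neuron `i` to second-layer neuron `j`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bpnn(e_prev, p):
    """Two-hidden-layer BP subnetwork applied to the lagged residual e_{t-1}."""
    h1 = [sigmoid(u * e_prev + b) for u, b in zip(p["u"], p["b1"])]
    h2 = [sigmoid(sum(v_ij * h for v_ij, h in zip(row, h1)) + b)
          for row, b in zip(p["v"], p["b2"])]
    return sigmoid(sum(w * h for w, h in zip(p["w"], h2)) + p["bo"])


def predict(dx_t, dx_prev, dy_prev, e_prev, p):
    """Hybrid output: error correction part + linear part + lagged part."""
    part1 = p["rho1"] * e_prev + p["rho2"] * bpnn(e_prev, p)
    part2 = p["beta0"] + p["beta1"] * dx_t
    part3 = p["pi_y"] * dy_prev + p["pi_x"] * dx_prev
    return part1 + part2 + part3
```

Setting `rho2` to zero reduces `predict` to a linear first-order-lag ECM, which is what makes the linear model a nested special case of the network.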
The hybrid network of equation (9) is structurally similar to the ECM of equation (6) with a smooth transformation autoregressive (STAR) nonlinear structure, with the multilayer BP neural network replacing the smooth transformation function. In addition to the improved nonlinear approximation ability of the neural network, this model optimizes the network parameters through a training and learning algorithm rather than obtaining all the parameters through regression. The network learning method optimizes not only the approximation coefficients but also, adaptively, the set of approximation basis functions, so it has a strong nonlinear approximation ability. Moreover, the training algorithm adapts better when obtaining the approximation parameters: when the data change, the parameters can be obtained through distributed parallel computing, and the real-time performance of the algorithm is also better. When the number of hidden layers of the neural network increases, the number of hidden neurons also increases, and so does the ability to model the nonlinear relationship of time series. By contrast, because the nonlinear ECM in equation (6) is expressed by a transformation function, its parameters are obtained by regression and rely heavily on the regression data. When the data change, all the data must take part in the regression again, and the parameters are less adaptable than the approximation coefficients obtained by neural network learning. Moreover, the form of the transformation function restricts its nonlinear expression ability. At the same time, unlike the neural network models in the general literature, the parameter set of the network model is selected not only on the basis of goodness of fit but also with a statistical significance test that evaluates the gain from adding more parameters.

Neural Learning Algorithm.
In this subsection, we use the gradient descent method and error back propagation to give the learning and training algorithm for this network structure. Suppose that the output of the hybrid neural network is $\Delta\hat{Y}_t$ and that the loss function is the squared difference between the network output and the actual value of the dependent variable. The error backpropagation gradient descent method is used to derive the network parameter updating formulas at time $t$.
We define an error signal at time t. The error signal at time t depends on the error at time t − 1, which reflects the memory ability of the RNN.
The update formulas for the other parameters of the linear RNN are deduced similarly. For the multilayer BP network, the weight update formulas are also derived by the error backpropagation gradient descent method, in turn for w_j, b^(o), v_ij, b^(2)_j, u_i, and b^(1)_i. This completes the update formulas for all parameters of the hybrid network of the linear RNN and the multilayer BP network.
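The update order described above can be illustrated with a minimal Python sketch of one gradient-descent step. Several details here are assumptions made for illustration and are not taken from the paper: the loss carries a factor 1/2, both hidden layers use sigmoid activations, the nonlinear part receives a single scalar input `z`, and the linear part is reduced to one coefficient `a` on a scalar input `x_lin` (the recurrent error term of the linear RNN is omitted). The names `forward`, `sgd_step`, and the dictionary keys are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(p, x_lin, z):
    """Forward pass: linear term plus a two-hidden-layer sigmoid network on scalar z."""
    h1 = [sigmoid(p["u"][i] * z + p["b1"][i]) for i in range(len(p["u"]))]
    h2 = [
        sigmoid(sum(p["v"][i][j] * h1[i] for i in range(len(h1))) + p["b2"][j])
        for j in range(len(p["w"]))
    ]
    y_hat = p["a"] * x_lin + sum(w * h for w, h in zip(p["w"], h2)) + p["bo"]
    return h1, h2, y_hat


def sgd_step(p, x_lin, z, y, mu=0.001):
    """One update on E = 0.5 * (y - y_hat)**2; returns the loss before the update."""
    h1, h2, y_hat = forward(p, x_lin, z)
    err = y_hat - y  # dE/dy_hat
    # Error signals of the second and first hidden layers (computed before any update).
    d2 = [err * p["w"][j] * h2[j] * (1.0 - h2[j]) for j in range(len(h2))]
    d1 = [
        sum(d2[j] * p["v"][i][j] for j in range(len(h2))) * h1[i] * (1.0 - h1[i])
        for i in range(len(h1))
    ]
    # Gradient-descent updates, output layer first, then backwards.
    p["a"] -= mu * err * x_lin
    p["bo"] -= mu * err
    for j in range(len(h2)):
        p["w"][j] -= mu * err * h2[j]
        p["b2"][j] -= mu * d2[j]
    for i in range(len(h1)):
        for j in range(len(h2)):
            p["v"][i][j] -= mu * d2[j] * h1[i]
        p["u"][i] -= mu * d1[i] * z
        p["b1"][i] -= mu * d1[i]
    return 0.5 * err ** 2
```

Calling `sgd_step` repeatedly over the observations gives a plain stochastic gradient descent loop; a faithful implementation would also propagate the error signal from time t − 1 through the recurrent part.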

Implementation of Neural Learning Methods for Nonlinear ECM.
To avoid the influence of differences in variable scale on modelling, all variables in the model (the ΔY_t, ΔX_t, and e_t sequences, etc.) are standardized. For convenience, the notation of the processed variables remains unchanged. The loss function is the squared difference between the network output and the actual value of the dependent variable, as in equations (20) and (21). The initial value of each error signal is set to 0. Besides the loss function, three commonly used performance indicators, namely the mean square error (MSE), R², and log L, are reported in the simulations to describe the network performance.
The complete procedure is given in Algorithm 1.
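As an illustration of the preprocessing and the performance indicators, the following Python sketch standardizes a series and computes MSE, R², and log L. It assumes z-score standardization with the population standard deviation and a Gaussian log-likelihood evaluated at the maximum likelihood variance; the exact formulas used in the paper may differ. The function names `standardize` and `fit_metrics` are hypothetical.

```python
import math


def standardize(series):
    """Z-score standardization (assumed form): zero mean, unit population variance."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    return [(v - mean) / std for v in series]


def fit_metrics(actual, predicted):
    """Return MSE, R^2 and the Gaussian log-likelihood at the ML variance estimate."""
    n = len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mse = sse / n
    mean_a = sum(actual) / n
    sst = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1.0 - sse / sst
    # With sigma^2 = SSE / n, the Gaussian log-likelihood reduces to this closed form.
    log_l = -0.5 * n * (math.log(2.0 * math.pi * mse) + 1.0)
    return {"mse": mse, "r2": r2, "log_l": log_l}
```

For example, `fit_metrics(standardize(y), predictions)` would report the three indicators on the standardized scale, where `y` and `predictions` are placeholder lists of equal length.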

Statistical
The test statistic follows a chi-square distribution whose degrees of freedom df equal the difference between the numbers of free parameters without and with constraints. Here θ̂_0 = argmax_{θ∈Θ_0} L(θ) and θ̂ = argmax_{θ∈Θ_0∪Θ_c} L(θ), where L(θ) is the likelihood function under parameter θ. If the P value calculated under this distribution is larger than 0.05, the two models are considered not significantly different, and the simple model under θ ∈ Θ_0 should be chosen.
In the nonlinear ECM neural network, ΔY_t is assumed to be normally distributed, as in equation (4), and the model function in equation (4) can be regarded as the mean of ΔY_t. Under the uniform variance assumption, maximum likelihood estimation finds the parameters that maximize the joint distribution of ΔY_t over the T observations. These parameters can be obtained in two steps. For a given σ, maximizing the likelihood is equivalent to minimizing the sum of squared errors Σ_t (ΔY_t − ΔŶ_t)², which is consistent with the loss function of the nonlinear ECM neural network. For a given ΔŶ_t, the maximizing value of σ then follows directly. Once both are known, the likelihood ratio test of nested models can be carried out.
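The two-step estimation and the nested-model test can be sketched in Python as follows. This is a minimal sketch under the Gaussian, uniform-variance assumption stated above: with the variance replaced by its maximum likelihood estimate SSE/T, the likelihood ratio statistic reduces to T·ln(SSE_simple/SSE_complex). The function names `chi2_sf` and `likelihood_ratio_test` are hypothetical, and the chi-square tail probability is computed with the standard recurrence for the regularized incomplete gamma function rather than with a statistics library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)  # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(sse_simple, sse_complex, n_obs, df, alpha=0.05):
    """Return (statistic, p_value, prefer_complex) for two nested Gaussian models."""
    stat = n_obs * math.log(sse_simple / sse_complex)
    p_value = chi2_sf(stat, df)
    # A P value above alpha means no significant difference: keep the simple model.
    return stat, p_value, p_value <= alpha
```

Here `df` is the number of extra free parameters in the complex model, and `sse_simple` and `sse_complex` are the sums of squared errors of the two fitted models on the same observations.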

Empirical Studies
Gold and the US dollar have always been common components of international reserves and investment portfolios, and gold has the dual properties of currency and commodity, so the relationship between the US dollar index and the gold price has attracted academic attention. The gold price studied in this paper is London gold denominated in US dollars, which is a relatively active spot gold market in the world. The US dollar index is selected to reflect the comprehensive trend of the US dollar exchange rate. The nonlinear model between the US dollar index and the gold price is described by the proposed nonlinear ECM neural network. We first examine whether the gold price and the US dollar index are cointegrated, and then compare and analyze the linear ECM model and the nonlinear ECM neural network, respectively, to show the evolution of the dollar-gold nonlinear relation model. Figure 2 shows the daily data of the gold price and the US dollar index from January 4, 2021, to December 31, 2021. As we can see, the relationship between these two series was constantly changing. From January to March, the gold price and the US dollar index were negatively related: the gold price showed a sharp downward trend, while the US dollar index moved in the opposite direction. From April to May, the inverse relation continued as the gold price rebounded and rose while the US dollar index went downward. In the second half of 2021, the US dollar index maintained a gentle upward trend, and the gold price remained volatile.

Cointegration Test of Data.
The ADF test is used to determine the order of integration of each time series as a prerequisite for verifying the cointegration of the dollar index and the gold price. The null hypothesis of this test is that the series has a unit root, that is, it is nonstationary. When the P value corresponding to the DF statistic is greater than or equal to the selected threshold, the null hypothesis cannot be rejected and the series is considered nonstationary; otherwise, it is considered stationary. The threshold selected in this paper is 0.05.
As can be seen from Table 1, both the US dollar index series and the gold price series are nonstationary in the selected time period, but they are stationary after first-order differencing. This shows that the dollar index and gold price series meet the prerequisite for cointegration, namely that both are integrated of order one, and need to be further tested.
Since the dollar index and the gold price are both integrated of order one, a long-term equilibrium model regressing the gold price on a constant and the US dollar index is established. The regression is significant overall: the F statistic is 28.64 with P value < 0.01, and R² is 10.70%. The coefficients of the constant term and the US dollar index are also statistically significant, as shown in Table 2.
In the ADF stationarity test of the residuals of the long-term equilibrium model, the DF statistic is −3.2177 with a P value < 0.01, so the residuals are stationary. In summary, the US dollar index and the gold price in 2021 are cointegrated of order (1, 1), which satisfies the premise for building an ECM. The residual of the long-term equilibrium model serves as the long-term equilibrium deviation variable in the later model.
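The first step of this procedure, the long-term equilibrium regression whose residuals feed the ECM, can be sketched in Python as below. This is an illustrative ordinary least squares fit of one series on a constant and another series; the function name `long_run_regression` is hypothetical. The ADF test on the residuals is not implemented here because it relies on non-standard critical values that are normally taken from a statistics package.

```python
def long_run_regression(x, y):
    """OLS of y on a constant and x; returns (alpha, beta, residuals, r2, f_stat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Residuals play the role of the long-term equilibrium deviation e_t.
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    # Overall F statistic for one regressor; undefined for a perfect fit (r2 == 1).
    f_stat = r2 / (1.0 - r2) * (n - 2)
    return alpha, beta, residuals, r2, f_stat
```

With the gold price as `y` and the US dollar index as `x`, the lagged residuals would then enter the ECM as the error correction term.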

Neural Learning Results and Comparison.
The simulation includes the following parameter sets. The nonlinear part of the neural network adopts the four-layer BP network, in which the numbers of nodes of the first and second hidden layers (i and j in subsection 2.2) are each chosen from 1, 3, and 5. The results of the linear ECM are also given in this paper for comparison with the nonlinear ECM neural network. The number of training epochs for the hybrid network is set to 2000. The initial learning step μ is 0.001. To make the nonlinear ECM neural network converge as soon as possible, the learning step is changed to 0.0001 after 1000 epochs of training.
The training processes for the 2021 data under different parameter sets are shown in Figures 3-6. In Figures 3 and 4, the loss and MSE no longer decrease when approaching the 2000th epoch. In Figures 5 and 6, R² and log L no longer increase when approaching the 2000th epoch. As a result, the training is sufficiently complete and the network has converged. Figures 3-6 start from the 10th epoch because the statistical indicators of the first 10 epochs changed greatly in scale, which was caused by the random initialization of the network parameters and has no bearing on the conclusions obtained.
The neural learning results are shown in Table 3. For the data in 2021, the nonlinear ECM neural network with 1 and 5 nodes in the first and second hidden layers, respectively (model No. 4), was the best from the perspective of goodness of fit.

Robustness Analysis.
To evaluate the robustness of the proposed nonlinear ECM neural learning method and analyze its performance on different data, the empirical study is extended to other time periods and other variable assignments. In detail, besides analyzing the 2021 gold price as Y_t and the US dollar index as X_t in equation (3), as described in 3.2, the following simulations are also performed after the cointegration test: (1) Generalized time period: both 2021 and 2020 are studied. (2) Generalized variables: the US dollar index is used as Y_t and the gold price as X_t in equation (3).
The results are listed in Table 4. In all 4 cases, the nonlinear ECM neural network models perform better than the linear ECM model from the perspective of goodness of fit alone. When the likelihood ratio chi-square test is introduced to compare the complex and simple models, the advantage of the nonlinear ECM neural network models is statistically significant in most cases (3 out of 4). This demonstrates the necessity of the proposed nonlinear ECM neural learning method. However, in most cases (3 out of 4), the model with the best goodness of fit was not the comprehensively best model when both goodness of fit and model complexity were taken into consideration. This demonstrates the necessity of the proposed likelihood ratio chi-square test.
In conclusion, the proposed nonlinear ECM neural network models can be applied to different data cases and generalize well. The robustness and necessity of the method are verified.

Conclusions
The data-driven method combining traditional statistics under a linear paradigm with nonlinear computational intelligence is a promising approach for future financial time series data processing. To address the nonlinear relationship between the gold price and the US dollar index, this paper constructs a data-driven nonlinear ECM by combining a linear RNN with a multilayer BP network and gives the corresponding neural learning algorithm to model nonlinear time series with a long-term equilibrium relationship accurately. It provides a new research method for analyzing time series with a nonlinear error correction learning model. Different from other methods that simply use neural networks to describe nonlinear time series, the nonlinear ECM proposed in this paper is a neural learning network constructed on the principle of statistical testing, and all network parameters are obtained through learning. In the empirical analysis of the gold price and the US dollar index in 2021, the nonlinear ECM neural network with 1 and 5 nodes in the first and second hidden layers, respectively (model No. 4), was the best from the perspective of goodness of fit. After applying the likelihood ratio chi-square test for nested models, which takes both goodness of fit and model complexity into consideration, the nonlinear ECM neural network with 1 node in each of the first and second hidden layers is considered the comprehensively best model. These conclusions are intuitive. In some complex economic and financial cases, the nonlinearity of time series increases, and traditional linear models need to be improved. Therefore, it is necessary to compare the nonlinear ECM neural network under different parameter sets with the linear model during modelling before making the final model selection.
However, it is not always the case that the most complex model performs best. We need to strike a balance between goodness of fit and complexity. Statistical hypothesis testing is a good method for quantitative comparison, as suggested in this paper.
The nonlinear ECM neural learning method proposed in this paper can meet the needs of traditional statistical testing and thereby gains interpretability. It not only improves the goodness of fit of time series within a certain range but also expands the application range of the ECM. By combining with traditional statistical methods, deep learning neural network methods will be more widely used in the field of time series modelling. This method can be further extended to a real-time learning network. When new data are obtained, they participate in learning based on the original network, and the network parameters are obtained online without all data participating in regression. It can also be easily extended to multivariable nonlinear time series analysis.
The hybrid neural learning method of RNN and BP networks proposed in this paper cannot be applied to all nonlinear time series because the established learning model needs to meet the modelling condition of the ECM, namely, cointegration. Only time series that pass the cointegration test are suitable for the proposed approach. In addition, existing deep machine learning methods, such as the combination of self-attention mechanisms with the data-driven method of the generative adversarial network and ensemble machine learning methods [28][29][30][31], will be used as references in future research to closely combine the application conditions of the ECM and develop more effective data-driven nonlinear ECMs.

Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding this work.