An Approach for Reconstruction of Realistic Economic Data Based on Frequency Characteristics between IMFs

Reconstruction of realistic economic data often leads social economists to analyze the underlying driving factors in time-series data or to study volatility, and the intrinsic complexity of time-series data attracts their interest. This paper proposes the bilateral permutation entropy (BPE) index method to address this reconstruction problem on the basis of partly ensemble empirical mode decomposition (PEEMD), a novel data analysis method for nonlinear and nonstationary time series, and compares it with the T-test method. First, PEEMD is applied to gold price data, which are decomposed into several independent intrinsic mode functions (IMFs), ordered from high to low frequency. Second, the IMFs are grouped into three parts, namely a high-frequency part, a low-frequency part, and the overall trend, based on a fine-to-coarse reconstruction by the BPE index method and the T-test method. Then, a correlation analysis is conducted between the reconstructed data and the related macroeconomic factors, including global gold production, world crude oil prices, and world inflation. Finally, the results indicate that the BPE index method is an effective technique for time-series data analysis, reconstructing IMFs into realistic data.


Introduction
The importance of revealing the underlying characteristics of macroeconomic data has attracted considerable attention from social economists studying its underlying driving mechanism [1,2]. Because macroeconomic data are affected by complex factors, especially noise signals, they are difficult to decompose into economically meaningful components.
As an empirical, intuitive, direct, and self-adaptive data processing method, the empirical mode decomposition (EMD) is a novel data analysis method that decomposes time-series data into a small number of independent intrinsic modes based on a local characteristic scale, and the IMFs have specific economic meanings [3,4]. Since then, the ensemble empirical mode decomposition (EEMD) [5], the complete ensemble empirical mode decomposition (CEEMD) [6], and PEEMD [7], as improvements of the EMD algorithm, have been widely applied to the decomposition of time-series data to obtain more accurate IMFs by eliminating the effects of interfering signals [8][9][10][11], and hybrid models based on them have been used to predict time-series data, taking the numerical distribution characteristics of the IMFs as the entry point [6,12,13].
Furthermore, to study the influence of driving factors, it is necessary to classify the IMFs according to the potential influencing factors. There are several methods and concepts for reconstructing the IMFs. Zhang et al. [14] proposed the T-test method to synthesize the IMFs into components with more realistic economic significance, which reconstructs high- and low-frequency data based on the frequency characteristics of the IMFs. Yu et al. [15] proposed a decomposition-ensemble methodology with data-characteristic-driven reconstruction based on two promising principles: "divide and conquer" and "data-characteristic-driven modeling." Aamir et al. [16] proposed a decomposition-ensemble model with reconstructed IMFs for forecasting crude oil prices based on the well-known autoregressive integrated moving average (ARIMA) model. Gao et al. [17] proposed using average mutual information (AMI) for the reconstruction of decomposition modes. This paper proposes an economic meaning reconstruction method based on the BPE index [18] to classify high- and low-frequency data. The method compares the chaos degree of the synthetic signal with that of the adjacent IMFs and thus relies on the frequency relationship between IMFs, whereas the T-test reconstruction method relies on the independent distribution of frequencies within each IMF.
This paper selects gold data as the application object. Gold is not only a crucial foundation of the international monetary system but also plays an important role in national economic security, financial stability, and national defense security, especially in the context of the deterioration of the international financial environment and international political turmoil [19,20]. This paper uses the T-test method and the BPE method to divide the IMFs obtained by PEEMD [7] into high-frequency data, low-frequency data, and trend data. Then, a correlation analysis between the composited data and related factors is carried out to explain the rationality of the new composition method for gold price analysis. The rest of the paper is organized as follows: Section 2 gives a brief introduction to the PEEMD, T-test, and BPE algorithms and then proposes a new reconstruction method based on BPE. Section 3 describes the process of applying PEEMD and BPE to gold prices. Section 4 presents a detailed analysis based on the composition of intrinsic modes and verifies the rationality of the different composition methods. Section 5 concludes the paper.

PEEMD Algorithm.
The PEEMD algorithm, an improvement of the EMD algorithm, is a nonlinear, nonstationary, and self-adaptive data processing method [7,12,21]. Under the assumption that the data may contain several coexisting modes of oscillation together with noise, PEEMD extracts the intrinsic modes of the original data without the noise signals by using permutation entropy (PE) [22,23] to estimate the effect of the noise. The PEEMD is described as follows:
(i) $S(t)$ is a given time-series signal. A pair of white noise series $n_i(t)$ and $-n_i(t)$ is added to $S(t)$, where $i = 1, 2, \ldots, Ne$ indexes the pairs of added white noise, $j$ is the number of iterations for decomposing the IMFs that meet the requirements, and $a$ is the amplitude of the added white noise.
(ii) EMD decomposes the two resulting signal series $r^{+}_{ij}(t)$ and $r^{-}_{ij}(t)$ to obtain two IMF sets $I^{+}_{ij}$ and $I^{-}_{ij}$ as well as two residue sets $r^{+}_{i(j+1)}(t)$ and $r^{-}_{i(j+1)}(t)$.
(iii) The final IMF of rank $j$, $I_j(t)$, is assembled from $I^{+}_{ij}$ and $I^{-}_{ij}$ over all noise pairs, which eliminates the effect of the added pairs of white noise signals.
(iv) The PE of $I_j(t)$ is calculated and compared with the threshold $\theta_0$, which is set to reject the intermittency or noise signal in the original data. Steps (i)-(iii) are repeated, with the PE recalculated each time, until $PE_j$ is smaller than $\theta_0$.
(v) The first $j-1$ IMFs are then considered intermittency or noise signals, which are separated from the original signal, leaving the residue $r(t)$.
(vi) $r(t)$ contains the coexisting modes of oscillation without the noise signal, and it is decomposed completely by EMD into components $c_k(t)$.
(vii) The $c_k(t)$ are taken as the IMFs following the first $j-1$ IMFs, so the initial signal is described by the first $j-1$ IMFs, the components $c_k(t)$, and the final EMD residue.
In the PEEMD algorithm, the reconstruction error (RE) can be limited to a negligible level by adding the white noise in pairs with positive and negative signs, and the PE indicates the chaos degree so that the noise signal is eliminated, which guarantees that the IMFs are closer to the intrinsic modes of the original signal than those obtained by EMD.
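The role of the paired noise in limiting the reconstruction error can be illustrated with a minimal Python sketch. It deliberately leaves out EMD and the PE threshold and only shows that averaging the copies $S(t) + a\,n_i(t)$ and $S(t) - a\,n_i(t)$ returns $S(t)$ up to rounding. The function names, the toy signal, and the Gaussian noise are assumptions made for this illustration, not part of the PEEMD reference implementation.

```python
import random


def paired_noise_copies(signal, amplitude, n_pairs, seed=0):
    """Return the 2 * n_pairs noisy copies S + a*n_i and S - a*n_i."""
    rng = random.Random(seed)
    copies = []
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, 1.0) for _ in signal]
        copies.append([s + amplitude * n for s, n in zip(signal, noise)])
        copies.append([s - amplitude * n for s, n in zip(signal, noise)])
    return copies


def ensemble_mean(copies):
    """Average the copies point by point."""
    count = len(copies)
    return [sum(values) / count for values in zip(*copies)]


# Toy signal (hypothetical): a straight line sampled at 50 points.
signal = [0.1 * t for t in range(50)]
copies = paired_noise_copies(signal, amplitude=0.2, n_pairs=10)
recovered = ensemble_mean(copies)

# The +noise and -noise terms cancel pairwise, so only rounding error remains.
max_error = max(abs(r - s) for r, s in zip(recovered, signal))
```

In the full algorithm the averaging is applied to the IMFs obtained from each noisy copy rather than to the copies themselves, so the cancellation there is approximate rather than exact.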

T-Test Algorithm.
The principle of the T-test method for grouping the IMFs is that, with the IMFs arranged in descending order of frequency [24], the component at which the zero-mean characteristic first changes significantly is the demarcation. This is a fine-to-coarse reconstruction, i.e., high-pass filtering by adding fast oscillations (IMFs with smaller index) up to slow ones (IMFs with larger index), so that all IMFs before this component are the high-frequency parts, and this component together with the subsequent components forms the low-frequency parts. Additionally, the PEEMD stops either when the residue $r(t)$ becomes smaller than a predetermined value or when the residue $r(t)$ becomes a monotonic function from which no more IMFs can be extracted, and the residue is considered the trend.
Then, the high-frequency, low-frequency, and trend parts of the original data are obtained. The specific IMF reconstruction process is as follows [14]:
(1) Compute the mean of the sum of IMF 1 to IMF i for each component (except for the residue).
(2) Use the T-test to identify for which IMF i the mean first significantly departs from zero.
(3) Once IMF i is identified as a significant change point, identify the partial reconstruction with IMFs from this one to the end as the low-frequency part, and identify the partial reconstruction with the other IMFs as the high-frequency part.
Zhang et al. [14] assumed that the IMFs obey a zero-mean normal distribution and then used the T-test to check whether the hypothesis holds. The IMF that does not satisfy the hypothesis is the critical point of the frequency reconstruction.
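Steps (1)-(3) can be sketched in Python as follows. This is a minimal illustration rather than the procedure of [14]: the p-value uses a normal approximation to the one-sample t-test (reasonable only for long series), and the function names, the 0.05 level, and the three toy components are assumptions introduced here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def p_value_zero_mean(samples):
    """Two-sided p-value for H0: mean == 0 (normal approximation, assumed here)."""
    t_stat = mean(samples) / (stdev(samples) / sqrt(len(samples)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def fine_to_coarse_split(imfs, alpha=0.05):
    """Return 1-based index lists (high, low), split at the first partial sum
    IMF 1 + ... + IMF i whose mean departs significantly from zero."""
    partial = [0.0] * len(imfs[0])
    for i, imf in enumerate(imfs, start=1):
        partial = [p + v for p, v in zip(partial, imf)]
        if p_value_zero_mean(partial) < alpha:
            return list(range(1, i)), list(range(i, len(imfs) + 1))
    return list(range(1, len(imfs) + 1)), []


# Toy components (hypothetical): two zero-mean oscillations and one slow drift.
n = 200
imf1 = [1.0 if t % 2 == 0 else -1.0 for t in range(n)]
imf2 = [1.0 if (t // 2) % 2 == 0 else -1.0 for t in range(n)]
imf3 = [0.5 + 0.01 * t for t in range(n)]

high, low = fine_to_coarse_split([imf1, imf2, imf3])
# high == [1, 2], low == [3]
```

A constant partial sum would make the standard deviation zero, so a production version would need to guard against that case.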

BPE Algorithm.
To estimate the degree of mode splitting in adjacent IMFs, the BPE algorithm was proposed by Liu et al. [18] based on PE. The frequency distribution characteristics of adjacent IMFs are analyzed to determine the critical point between the high-frequency and low-frequency IMFs. This paper proposes a method based on BPE to evaluate this critical point and uses it to reconstruct the IMFs. The BPE is described as follows.

Hypothesis.
The targeted time-series signal is composed of weakly correlated signals. The BPE index $BPE_{ij}$ is defined in terms of $PE_i$, the PE value of the $i$th IMF component, and $PE_{ij}$, the PE value of the signal comprising the $i$th and $j$th IMF components. There are two domains of $BPE_{ij}$ values. The PE is calculated according to the algorithm proposed by Bandt and Pompe [22]:
(i) Given a time series $x_k$, $k = 1, 2, \ldots, N$, form
$$x_i^m = [x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}],$$
where $x_i^m$ denotes the $m$-dimensional delay embedding vector at time $i$, $m$ indicates the embedded dimension, and $\tau$ is the time delay.
(ii) Then, $x_i^m$ has a permutation $\pi_{r_0 r_1 \ldots r_{m-1}}$ if it satisfies
$$x_{i+r_0\tau} \le x_{i+r_1\tau} \le \cdots \le x_{i+r_{m-1}\tau},$$
where $0 \le r_i \le m-1$ and $r_i \ne r_j$.
(iii) There are $m!$ possible permutations of an $m$-tuple vector. For each permutation $\pi$, the relative frequency is determined by
$$p(\pi) = \frac{\#\{\, i \mid 1 \le i \le N-(m-1)\tau,\ x_i^m \text{ has type } \pi \,\}}{N-(m-1)\tau}.$$
(iv) The PE of the $m$-dimension is then defined as
$$PE(m) = -\sum_{\pi} p(\pi)\ln p(\pi).$$
(v) Therefore, the normalized permutation entropy (NPE) can be expressed as
$$NPE(m) = \frac{PE(m)}{\ln(m!)}.$$
Specifically, $BPE_{ij} \ge 1$ means that the chaos degree of the synthetic signal is higher than that of the single IMF because no signal compatibility exists between the $i$th and $j$th IMF components, so the inner modes are considered to be decomposed independently into a single IMF. By contrast, $BPE_{ij} < 1$ signifies that the chaos degree of the synthetic signal is lower than that of the single IMF component, which is largely attributed to the chaos being offset by the compatibility between the IMFs because some inner modes are divided between adjacent IMFs.
According to the definition of BPE proposed in the literature, the IMF reconstruction process based on BPE is proposed as follows:
(1) Arrange all the IMFs obtained by decomposition from high frequency to low frequency (IMF 1, IMF 2, IMF 3, ...).
(2) Separately calculate the PE values and the BPE values ($BPE_{12}$, $BPE_{23}$, $BPE_{34}$, ...).
(3) The number of maximum points ($n$) in the BPE values should equal the required number of parts ($N$), i.e., $n = N$.
(4) The IMFs between the BPE maxima constitute the corresponding frequency part.
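The permutation entropy steps (i)-(v) can be sketched in Python as follows. This is a minimal illustration that assumes natural logarithms and breaks ties between equal values by their position in the window; the function name and default parameters are chosen here for illustration only.

```python
from collections import Counter
from math import factorial, log


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Permutation entropy of a sequence for embedding dimension `order`
    and time delay `delay`; normalized by ln(order!) when requested."""
    n_vectors = len(series) - (order - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this order and delay")
    patterns = Counter()
    for i in range(n_vectors):
        window = [series[i + k * delay] for k in range(order)]
        # Ordinal pattern: the indices that sort the window in ascending order
        # (sorted() is stable, so ties keep their original position).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    entropy = -sum((c / n_vectors) * log(c / n_vectors) for c in patterns.values())
    return entropy / log(factorial(order)) if normalize else entropy


# A monotone series shows a single ordinal pattern, so its entropy is zero.
pe_monotone = permutation_entropy([1, 2, 3, 4, 5, 6], order=3)

# Six neighbouring pairs: four ascending, two descending; NPE is about 0.918.
pe_example = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2)
```

Because the normalized value divides by $\ln(m!)$, it does not depend on the base of the logarithm as long as the same base is used in the numerator and denominator.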

Process of Data Analysis
According to the above analysis, the time-series data are first decomposed by PEEMD to obtain orthogonal IMFs; then, the short-term and long-term trends are composed by the BPE index and by the T-test for comparison. Finally, a correlation analysis is carried out with the long-term factors and short-term factors to explain the rationality of the reconstructed data. The scheme is shown in Figure 1.

Decomposition.
According to the discussion of PEEMD-related parameters in the literature [7], the relevant parameters for PEEMD processing of the gold price data are shown in Table 1. In particular, the setting of the PE threshold directly affects the accuracy of the IMF decomposition results and of the subsequent combined models. When the PE threshold is zero, the PEEMD decomposition method is the same as the complete ensemble empirical mode decomposition (CEEMD) method. As demonstrated in the literature, the best decomposition results are obtained when the threshold is between 0.5 and 0.6. Based on this, the threshold value is chosen as 0.6, and the decomposition result of the gold price data is shown in Figure 2.
As shown in Figure 2, the gold/dollar data are decomposed by PEEMD into IMF 1 to IMF 7 and the residual trend signal RS 8. The frequency of the IMFs gradually decreases from IMF 1 to IMF 7, and both the high-frequency part and the low-frequency part of the gold price trend are included. Therefore, the paper takes the high-frequency part to indicate the short-term trend underlying gold price changes, the low-frequency part to indicate the long-term trend, and the remaining signal RS 8 to indicate the overall downward trend.
Note to Table 1. The symbols "NStd," "MaxIter," "Ne," "T," "Mode," and "Thr" indicate the noise standard deviation, the maximum number of sifting iterations, the number of added white-noise pairs, the delay time of PE, the order of PE, and the threshold, respectively.
In this paper, the T-test method and the BPE index are used to measure the IMFs of the decomposition to show the relative range of the frequency and then reconstruct the IMFs to obtain the high-frequency and low-frequency parts.

T-Test Reconstruction.
The literature bases the method on the fact that the higher the data frequency is, the more random the data are and the closer the mean value is to zero under the hypothesis of the normal distribution. The T-test method is applied to the numerical distribution characteristics of each single IMF.
The T-test statistic for each of the IMFs is calculated, as shown in Table 2.
It can be seen that the P value of IMF 3 (0.0001) is the first to fall below 0.05, i.e., at the 95% confidence level. This indicates that the mean of IMF 3 significantly departs from zero. Therefore, the superimposed IMF 1 and IMF 2 data are the high-frequency part, given the independence and orthogonality between the IMFs. The superimposed IMF 3 to IMF 7 data are the low-frequency part, with RS 8 being the trend. However, IMF 5 is also significantly different from zero mean, with a P value less than 0.05. On that basis, IMF 1 to IMF 4 could be reconstructed as the high-frequency part and IMF 5 to IMF 7 as the low-frequency part, which may be rational to some degree if judged by statistical values alone. It is therefore necessary that the IMFs are arranged from high frequency to low frequency and that the first IMF whose mean differs significantly from zero is treated as the demarcation point. Essentially, the T-test method divides the parts according to the statistical characteristics of a single IMF and ignores the gradation between adjacent IMFs. This paper instead explores the relationship between adjacent IMFs to classify high- and low-frequency data by BPE.

BPE Reconstruction.
In theory, however the parameters of the PE calculation are chosen, the pattern of change keeps the same relative tendency as long as the parameters are applied consistently. To calculate the changes in the PE and BPE values under the same conditions, the PE parameters in this paper are consistent with those used in the PEEMD decomposition: the delay time is the same, and the PE order is also 6. The calculated PE values and the BPE change trend of the mode functions are shown in Table 3.
IMF 1 to IMF 7 are arranged in descending order of frequency. Moreover, Table 3 shows that the calculated PE values fall from a maximum of 0.4287 to a minimum of 0.1084, a gradually decreasing trend that accompanies the decrease in frequency.
The degree of disorder of the data structure is gradually reduced, and the later IMFs represent the gold/dollar trend information. The BPE values before IMF 5 are less than 1, indicating that there is a certain degree of mode splitting between IMF 1 and IMF 2, IMF 2 and IMF 3, IMF 3 and IMF 4, and IMF 4 and IMF 5, so that the degree of chaos in the reconstructed signal is lower than that of the single IMF. The BPE value between IMF 5 and IMF 6 ($BPE_{65}$) exceeds 1 for the first time, at 1.0014, indicating that the degree of chaos in the signal reconstructed from IMF 5 and IMF 6 is higher than that of IMF 5. The degree of chaos increases in the absence of mode splitting between the two signals, which shows that there is a significant frequency difference between IMF 5 and IMF 6. At the same time, the BPE value between IMF 7 and RS 8 ($BPE_{87}$) is 1.0198, which is also more than 1, indicating that there is also a significant frequency domain division between IMF 7 and the residual signal. Therefore, based on the calculation of BPE, the paper concludes that IMF 1 to IMF 5 form the high-frequency part, IMF 6 and IMF 7 form the low-frequency part, and RS 8 is the trend part.
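The grouping rule applied above (start a new part wherever the BPE of an adjacent pair reaches 1) can be sketched in Python. Only the values 1.0014 and 1.0198 come from the text; the remaining entries are placeholders standing for "less than 1", and the entry for the IMF 6/IMF 7 pair is assumed to be below 1 because the paper places both components in the same part. The function name is hypothetical.

```python
def split_by_bpe(names, adjacent_bpe, threshold=1.0):
    """Group consecutive components, starting a new group after every
    adjacent pair whose BPE value reaches the threshold."""
    if len(adjacent_bpe) != len(names) - 1:
        raise ValueError("need exactly one BPE value per adjacent pair")
    groups = [[names[0]]]
    for name, bpe in zip(names[1:], adjacent_bpe):
        if bpe >= threshold:
            groups.append([name])    # boundary: no mode splitting across it
        else:
            groups[-1].append(name)  # same part: modes are shared
    return groups


names = ["IMF1", "IMF2", "IMF3", "IMF4", "IMF5", "IMF6", "IMF7", "RS8"]

BELOW_ONE = 0.99  # placeholder only, NOT a value reported in the paper
adjacent_bpe = [
    BELOW_ONE, BELOW_ONE, BELOW_ONE, BELOW_ONE,  # pairs up to IMF4/IMF5
    1.0014,                                      # IMF5/IMF6, from the text
    BELOW_ONE,                                   # IMF6/IMF7, assumed < 1
    1.0198,                                      # IMF7/RS8, from the text
]

groups = split_by_bpe(names, adjacent_bpe)
high_frequency, low_frequency, trend = groups
```

With these inputs the three groups reproduce the division stated in the text: IMF 1 to IMF 5, IMF 6 and IMF 7, and RS 8.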

Verification.
The T-test and the BPE index yield different high- and low-frequency groupings and therefore different reconstructed data for comparison. This paper analyzes the factors influencing the differently reconstructed high- and low-frequency data. Many studies [25][26][27][28] show that the long-term trend of gold prices is mainly affected by global gold production, and the short-term fluctuations are mainly affected by world crude oil prices and world inflation.

Long-Term Trend.
The low-frequency data reconstructed by the T-test and by the BPE analysis are shown in Figure 3. It can be seen intuitively that the low-frequency data reconstructed by the BPE index are consistent on the whole, whereas the low-frequency data formed by the T-test contain more local fluctuations. Table 4 shows that the low-frequency data have a high Pearson coefficient with the original gold data and a high variance share [29], indicating that the low-frequency data explain the main part of the original gold price fluctuation, so they must reflect the impact of long-term trends to some extent. The Pearson correlation coefficient and variance of the low-frequency data reconstructed by the T-test with respect to the original gold price data are 0.9314 and 57.14%, respectively, which are higher than those of the low-frequency data formed by the BPE reconstruction. This shows that more relatively high-frequency data remain in the low-frequency data (long-term trend) reconstructed by the T-test.
To further analyze the long-term trend effects of the reconstructed low-frequency data, this paper refers to the world gold mine output published by the World Gold Council (WGC) from 2010 to 2018, as shown in Figure 4. It can be seen intuitively from Figures 3 and 4 that the long-term trend fluctuations in gold prices are negatively correlated with global gold production. The Pearson correlation coefficient between them is negative, and the absolute value of the correlation coefficient between the long-term trend and global gold production is larger for the BPE index, as shown in Table 5. This demonstrates that the low-frequency data formed by this method have an advantage over the traditional T-test method in reflecting the long-term trend in the gold price.
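The two comparison measures used here can be sketched in Python. The Pearson coefficient follows its standard definition; the "variance" percentage is assumed to be the variance of the reconstructed component divided by the variance of the original series, which the text does not state explicitly. The function names and the toy series are illustrative only and are not the gold price data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def variance_share(component, original):
    """Variance of `component` as a percentage of the variance of `original`
    (assumed definition of the variance column)."""
    def variance(values):
        centre = sum(values) / len(values)
        return sum((v - centre) ** 2 for v in values) / len(values)
    return 100.0 * variance(component) / variance(original)


# Toy series (hypothetical): the component is exactly half of the original.
original = [1.0, 2.0, 3.0, 4.0, 5.0]
component = [0.5, 1.0, 1.5, 2.0, 2.5]

r = pearson(component, original)           # perfectly correlated: 1.0
share = variance_share(component, original)  # half the amplitude: 25.0
```

A component can therefore correlate strongly with the original series while carrying only a fraction of its variance, which is why the paper reports both measures.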

Short-Term Trend.
The high-frequency data reconstructed by the T-test and by the BPE analysis are shown in Figure 5. The high-frequency data formed by the T-test have a higher frequency than those formed by the BPE. Table 6 shows that the variance of the high-frequency data relative to the original gold data is 1.17% for the T-test and 11.38% for the BPE, and the correlation coefficient between the high-frequency data formed by the T-test and the original gold price data is only 0.1376, approximately half of the correlation coefficient of the high-frequency data formed by the BPE index. This means that the high-frequency data formed by the T-test are more random and can hardly explain the short-term trend fluctuations of the original gold price data, whereas the high-frequency data formed by BPE explain these fluctuations more accurately.
To further explore the rationality of the high-frequency data reconstructed by the different methods, this paper uses the inflation rate of the United States, taken from InflationData.com, in place of the world inflation rate [27,30] to analyze the correlation. Table 7 shows the monthly inflation rate from 2010 to 2018. Similarly, this paper selects the spot price of crude oil from 2010 to 2018, downloaded from the spot trading software MT4, as shown in Table 8. Table 9 shows that the correlation coefficients between the high-frequency data and the short-term factors are small regardless of the method used. This shows that the calculated correlation coefficient is distorted, and may even change sign, when the short-term influencing factors are not removed, because the microeconomic factors contain many high-frequency noise signals. However, comparing absolute values for coefficients of the same sign, the correlation between the high-frequency data formed by the BPE division and the crude oil price is stronger than that for the T-test, which shows that the high-frequency data formed by the BPE are closer to the short-term trend of the original gold price data.

Conclusion
There are data-driven macrofactors behind any economic data. It is important for social economists to analyze the causes and predict data volatility. This paper selects the world gold price as an example for verifying the reconstruction method based on the BPE index.
Adaptive PEEMD is used to decompose the gold price data into orthogonal mode functions while eliminating as much of the effect of the noise signal as possible, and then the IMFs are classified into high- and low-frequency parts to determine long-term and short-term trends by the BPE index method, in comparison with the T-test method. The composed results are subjected to correlation analysis with global gold production, world crude oil prices, and world inflation to assess the rationality of the long-term and short-term trends composed by the two methods. According to the study results regarding the trends in gold price data, the main conclusions are as follows:
(1) Comparing the values calculated by the two methods, the BPE values of the IMFs indicate clearer boundaries than the T-test and are more in line with the actual situation. $BPE_{87}$ and $BPE_{65}$ are more than 1, which marks obvious boundaries between RS 8 and IMF 7 and between IMF 6 and IMF 5, whereas under the T-test method the P value of IMF 3 (0.0001) is the first to fall below 0.05, and those of IMF 5, IMF 6, and IMF 7 are also less than 0.05.
(2) From the perspective of the composed result, the correlation analysis shows that the Pearson correlation coefficient between the low-frequency data reconstructed by the BPE index method and mine production is -0.5005, larger in absolute value than -0.4421, the Pearson correlation coefficient between the low-frequency data reconstructed by the T-test method and mine production. This indicates that the long-term trend reconstructed by the BPE index method is closer to the economically meaningful trend. Therefore, the high-frequency data extracted by the same method are correspondingly more effective.
The paper demonstrates that the trend data reconstructed by the BPE index method can better explain the internal drivers of gold price volatility, in terms of both long-term and short-term trends, and it lays the foundation for further study of the trend of gold price volatility. It provides new methods and ideas for studying the driving factors behind macroeconomic fluctuations to better explain the changes in macroeconomic indicators.

Data Availability
This paper selects the simulation signal for empirical analysis and does not contain specific data.

Conflicts of Interest
The authors declare that they have no conflicts of interest.