A Causal and Correlation Analysis between China Energy Futures and China Energy-Related Companies Stock Market

Taking the opportunity of China’s launch of Shanghai crude oil futures (INE), this study empirically examined the information transmission in this immature financial market, investigating this issue from a new perspective. To identify the impact of INE on the related stock market, we collected high-frequency trading data of oil futures and 22 stocks owned by listed companies in the upstream and downstream of China’s oil-related industry chains, constructed a causal chain through Directed Acyclic Graph, and used MFDCCA-MODWT to perform multifractal analysis on the chain. Research shows that INE does have a causal relationship with the stock market of the related industry chain, and there is a multifractal correlation between its transaction time series. Subsequently, the source of fractal correlation was analysed with shuffled and surrogated sequences. We conclude that long memory plays a leading role and is the main reason for multifractal features.


Introduction
After the signing of the Paris Climate Agreement, most signatories adopted the development of green finance as one of the important means by which the domestic financial industry responds to global climate change. In China, the work of the financial sector has focused on issuing green bonds and green credit and on establishing a regional green finance pilot system. For example, the People's Bank of China announced in August 2019 that since 2016, China's green bond market has ranked first in the world in terms of issuance, with annual issuance and outstanding volume ranking among the highest in the world. The balance of green loans has increased year by year, accounting for nearly 10% of total domestic corporate loans. The effective use of green finance depends on the effective operation of international financial markets, especially those closely related to climate change. The trading of oil-based energy products is one of the most important of these markets. In developing countries, financial markets are often underdeveloped while energy consumption is huge, and traditional analysis methods are of doubtful value when applied to the relevant markets [1,2]. How can we better assess the development status of financial markets that are imperfect but closely related to green finance? Data-driven analysis methods have become an option.
Regarding the impact of the futures market on the spot market, scholars mostly proceed from the perspective of information efficiency. In the classic literature, Fama divides information into historical information, publicly disclosed information, and all unknown related information [3]. The information efficiency of a market refers to the ability of market prices to digest and absorb information when a new information shock occurs and to guide investors in using one market price to predict another, and increases in transaction time and transaction volume can strengthen information transmission [4]. That is, information efficiency can be the basis of market price efficiency, and the measurement of information efficiency is theoretically a test of market efficiency. However, in the financial markets of developing economies, false transactions and falsified financial data are common, so some traditional models are not effective in studying these markets [5]. Since every transaction in a market is still a transaction, and the data generated at each moment carry useful information, this study approaches the problem in a purely data-driven way. To explore the information transmission efficiency of China's crude oil futures, this paper collects high-frequency data on the Shanghai crude oil futures (INE) price and the stock prices of several listed companies. Furthermore, we use DAG and Multifractal Detrended Cross-correlation Analysis (MFDCCA) to study the information transmission path and the multifractal relationship between these two markets and then analyse the sources of multifractality.
Unlike previous research, we analyse the information transmission between INE and related stock markets from a data-driven perspective. Since scholars disagree over whether the market is efficient, we choose not to make any prior logical assumptions but to investigate the problem from a purely data-based perspective. Such methods can be used not only in the immature financial markets of developing countries but also in the analysis of relevant topics in mature markets. On the one hand, the research provides ideas for investigating the relationship between the energy market and energy-related stock markets in China. On the other hand, it provides a basis for investigating market linkages in the immature financial markets of developing countries and a framework for investigating related questions in other markets around the world.
In this paper, Section 2 reviews the literature. Section 3 introduces the data and methodology. Section 4 presents the empirical research and discusses the implications of the results. Section 5 concludes with a discussion and policy suggestions.

Energy Futures.
Energy futures play an important role in ensuring global energy supply and maintaining economic stability [6,7]. On the one hand, energy futures influence global economic development by influencing the price of energy products [8,9]; on the other hand, the financialization of energy products also has an impact on the economy through financial markets [10,11]. Among energy futures, oil futures attract the most attention. Researchers have carried out a large number of studies on several major oil futures contracts, including their linkage with other financial markets [12,13], their influence on oil spot prices and the stock prices of downstream industries [14,15], and the prediction of macroeconomic indicators. INE has been listed for only a short time, and the relevant research on it also follows the above directions [16]. Scholars have found clear tail dependence between the INE, equities, foreign exchange, and gold markets [17]. Moreover, there is a significant and continuous two-way volatility spillover effect between INE, WTI, and Brent [18,19], and the INE yield is also in equilibrium with the yields of Daqing, Shengli, Oman, WTI, and Brent crude oil spot prices, which supports the pricing efficiency of crude oil futures prices in the Asia-Pacific region [20]. Global financial uncertainty also closely affects the volatility of the INE [21]. In some extreme environments, there is a strong causal relationship between oil futures prices and investor sentiment [22,23]. After several years of development, INE prices now have short-term and medium-term independence and conductivity. However, compared with international benchmark oil prices, INE has limited pricing power and lacks long-term influence on the international crude oil market [24,25].
The introduction of INE promotes balance between the crude oil price system of China and those of Europe and North America and helps to improve the world crude oil price system, so research on crude oil prices and capital markets is increasingly relevant. Compared with international research, there is a relative lack of research on the volatility spillover effect of China's oil prices, and less attention has been paid to the complex impact of different types of oil price volatility on capital markets. The correlation between economic sectors and oil prices also needs to be studied in greater depth.

Directed Acyclic Graph (DAG).
In mathematics, particularly graph theory, a DAG is a finite directed graph with no directed cycles. This means that it is impossible to start at any node and follow a sequence of directed edges that eventually leads back to that node. The edges of the directed graph only go one way. Such a graph admits a topological ordering, in which every node appears before all the nodes it points to. Based on this idea, Spirtes proposed the DAG analysis method in 2000 [26]. This method can effectively identify the causal relationships among high-dimensional variables and determine the conduction path and direction of information. Compared with traditional methods such as the Granger causality test, DAG does not need to apply any theoretical assumptions. It can derive the causal relationship from the residual variance-covariance matrix of the data alone, which makes it a purely data-driven method. In addition, the use of data-driven methods such as machine learning for analysis and prediction in the field of energy finance has been a hot topic in recent years [27,28]. DAG was first applied to economic analysis in 2003 [29] and has since been widely used in several fields. Liang et al. analysed the internationalization trend of China's stock market from the perspective of information spillover and found that although there are significant differences in the dynamic paths of return and volatility spillover, the international integration process of China's stock market is steadily advancing [30]. Yang et al. combined DAG and VAR to study the international transfer of inflation among G7 countries and found that US inflation has become less vulnerable to foreign shocks since the early 1990s [31]. Awokuse studied the relationship between Japanese exports and economic growth and found a bidirectional relationship [32]. In these studies, DAG can not only identify the causal relationship in the overlap period [33] but also solve the problem of nonsynchronization between different markets by applying constraints and restrictions [29].
According to what is mentioned above, DAG can not only analyse the domestic market that overlaps with the trading time of Shanghai crude oil futures but also the overseas market with the overnight time difference, which makes this article and subsequent research possible.
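The acyclicity property described above can be illustrated with a minimal Python sketch. The node names (`futures`, `upstream_stock`, `downstream_stock`) are hypothetical placeholders and are not taken from the paper's results; the sketch only shows that a graph without directed cycles has a topological (causal) order, while adding a backward edge removes it.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical toy graph: each key maps to the set of its direct causes
# (predecessors). The names are illustrative only.
causes = {
    "futures": set(),
    "upstream_stock": {"futures"},
    "downstream_stock": {"upstream_stock"},
}

def causal_order(predecessors):
    """Return one topological order, or None if a directed cycle exists."""
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None

order = causal_order(causes)
print(order)

# An edge from the last node back to the first creates a directed cycle,
# so the graph is no longer a DAG and no ordering exists.
cyclic = dict(causes, futures={"downstream_stock"})
print(causal_order(cyclic))
```

In a causal reading, the returned order lists every variable after all of its direct causes, which is the sense in which a DAG fixes the direction of information flow.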

MFDCCA-MODWT (Multifractal Detrended Cross-Correlation Analysis-Maximal Overlap Discrete Wavelet Transform).
The idea of fractals first appeared in the 19th century, in objects such as the Weierstrass function and the Cantor set [34]. According to Britannica, a fractal, in mathematics, is any of a class of complex geometric shapes that commonly have "fractional dimension," a concept first introduced by the mathematician Felix Hausdorff in 1918. The term fractal, derived from the Latin word fractus, was coined by the Polish-born mathematician Benoit B. Mandelbrot [35], and fractals have since been applied in several fields [7]. Fractal methods are either single-fractal or multifractal. Single-fractal methods are mainly used to analyse the long memory of sequences, also known as persistence or antipersistence. Later, scholars gradually found that the multiscale nature and complexity of financial time series pose a challenge to single-fractal methods. Mandelbrot pointed out that multifractal methods can better quantify the complex fluctuation characteristics of financial markets than single-fractal methods and have wider application in empirical research [36]. However, both single-fractal and multifractal methods can only characterize the fractal features of a single time series and cannot capture the correlation between two time series. For time-series correlation analysis, Podobnik et al. introduced the detrended fluctuation analysis method to correlation analysis and proposed detrended cross-correlation analysis (DCCA), which can measure the long memory of two nonstationary time series [37]. On this basis, Zhou combined DCCA with multifractal analysis and proposed MFDCCA to study the multifractal cross-correlation of two time series with the same number of observations [38]. Subsequently, scholars proposed MFDMA [39], MFADCCA [40], DMF-ADCCA [41], and MFDCCA-MODWT [42] for different purposes and achieved certain results [43]. Correlation analysis based on multifractals is now widely used in energy [44,45], meteorology [46], and financial markets [47,48,49].
MFDCCA-MODWT performs better than MFDCCA when measuring the long memory features of sequences of different lengths and different Hurst exponents because it does not require choosing a polynomial order to fit and remove the trend in the time series. Hence, this paper uses MFDCCA-MODWT to measure the multifractal correlations between the sequences. The contribution of this paper is therefore to use DAG for the first time to examine the influence path of INE on the industry chain and, on this basis, to analyse the source of the multifractal correlation between INE and related stocks.

Data and Method
3.1. Data. We searched all related upstream and downstream industries along the industry chain and selected the stock with the largest market value in each industry. The sample includes INE and 22 stocks selected according to the industry chain, covering the oil and gas, coal, energy, and downstream sectors. In the crude oil futures market, multiple contracts are traded at the same time. To make the sample representative, we selected continuous data on the main contract of INE. The sample period is from March 26 to August 23, 2018, and the data are 1-minute high-frequency data. The details can be seen in Table 1. Column 1 is the acronym of the Chinese Pinyin of the stock name (e.g., zgsh stands for Zhong Guo Shen Hua in Chinese and China Shenhua Energy Company Limited in English), Column 2 is the stock code, and the remaining two columns give the industry to which the stock belongs and the role of the enterprise in the industrial chain.
After data matching, each stock or futures series has 193 price observations per day over 103 trading days, giving 193 × 103 = 19,879 observations.

Method
3.2.1. DAG. A DAG is composed of nodes and directed edges. Nodes represent variables, and directed edges connect these nodes to represent the contemporaneous relationship. The contemporaneous relationship between variables is identified by analysing their correlation coefficients and partial correlation coefficients. The identification steps are divided into "edge removal" and "orientation." In the "edge removal" stage, DAG starts from an "undirected complete graph," first tests the unconditional correlation coefficients between variables, removes the edges whose coefficients are not significantly different from zero, and then analyses the first-order partial correlation coefficients. In the above analysis, Fisher's z test is normally used to determine significance. For two variables x and y, there are five possible results for the causal relationship: no edge between x and y (independent and unconnected), x ⟶ y (x causes y), y ⟶ x (y causes x), x ↔ y (two-way causality), and x − y (a causal relationship whose direction cannot be determined). In this paper, the above operations are implemented by the PC algorithm in the TETRAD 6.6.0 software.
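The edge-removal test described above can be sketched in a few lines of Python. This is an illustrative sketch on synthetic data, not the TETRAD implementation: the helper names `partial_corr` and `fisher_z_pvalue`, the chain x → y → z, and the coefficients are all assumptions made for the example.

```python
import math
import numpy as np

def partial_corr(data, i, j, cond=()):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)  # precision matrix of the selected variables
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: correlation is zero, k conditioning variables."""
    r = min(max(r, -0.999999), 0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))       # Fisher's z transform
    stat = math.sqrt(n - k - 3) * abs(z)        # approximately N(0, 1) under H0
    return math.erfc(stat / math.sqrt(2))       # equals 2 * (1 - Phi(stat))

# Synthetic chain x -> y -> z (hypothetical data, not the paper's sample).
rng = np.random.default_rng(0)
n = 5000
x = rng.standard_normal(n)
y = 0.8 * x + rng.standard_normal(n)
z = 0.8 * y + rng.standard_normal(n)
data = np.column_stack([x, y, z])

r_uncond = partial_corr(data, 0, 2)             # x and z, no conditioning
r_cond = partial_corr(data, 0, 2, cond=(1,))    # x and z given y
print(r_uncond, fisher_z_pvalue(r_uncond, n, 0))
print(r_cond, fisher_z_pvalue(r_cond, n, 1))
```

In this toy chain the unconditional correlation between x and z is clearly nonzero, while the partial correlation given y is close to zero, so an edge-removal step would keep the x − z edge at order zero and be expected to drop it at order one.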

MFDCCA-MODWT.
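First, suppose there are two time series $\{x(i)\}$ and $\{y(i)\}$, $i = 1, 2, \ldots, N$, where $N$ represents the length of the time series, and construct the contour (profile) sequences:

$$X(i) = \sum_{k=1}^{i} \left[x(k) - \bar{x}\right], \qquad Y(i) = \sum_{k=1}^{i} \left[y(k) - \bar{y}\right], \tag{1}$$

where $\bar{x}$ and $\bar{y}$ are the sample means. Second, each profile is divided into $N_s = \lfloor N/s \rfloor$ nonoverlapping subintervals of length $s$. The same processing is performed on the reverse order of the time series to obtain $2N_s$ subintervals and avoid information loss.

Third, according to MODWT, a sequence $x(t)$ can be decomposed by wavelets:

$$x(t) = \sum_{i} S_{J,i}\,\phi_{J,i}(t) + \sum_{j=1}^{J} \sum_{i} D_{j,i}\,\psi_{j,i}(t), \tag{2}$$

where $\phi$ and $\psi$ are the scaling and wavelet functions, and $J$ and $i$ are integers that represent the maximum level of scale $s$ and the number of coefficients in the specified component, respectively. $D_{J,i}$ and $S_{J,i}$ represent the wavelet (detail) and smooth trend components of the sequence in the interval, respectively. By this method, a local trend is calculated for each interval $v$: $\tilde{X}_v(i) = S_{J,i}$, and likewise $\tilde{Y}_v(i)$ for the second series. Then, construct the residual sequences:

$$\varepsilon_{x,v}(i) = X\big((v-1)s + i\big) - \tilde{X}_v(i), \qquad \varepsilon_{y,v}(i) = Y\big((v-1)s + i\big) - \tilde{Y}_v(i), \tag{3}$$

for $v = 1, \ldots, N_s$, with the analogous indexing from the end of the series for $v = N_s + 1, \ldots, 2N_s$. Thus, for each interval $v$, we can obtain the wave (fluctuation) function of the two time series as follows:

$$F^2(s, v) = \frac{1}{s} \sum_{i=1}^{s} \left|\varepsilon_{x,v}(i)\right| \left|\varepsilon_{y,v}(i)\right|. \tag{4}$$

Fourth, construct the $q$-order wave function. Given any real number $q \neq 0$,

$$F_q(s) = \left\{ \frac{1}{2N_s} \sum_{v=1}^{2N_s} \left[F^2(s, v)\right]^{q/2} \right\}^{1/q}, \tag{5}$$

and for $q = 0$ the limit $F_0(s) = \exp\left\{ \frac{1}{4N_s} \sum_{v=1}^{2N_s} \ln F^2(s, v) \right\}$ is used.

Last, the scaling behavior of the fluctuations can be described by a log-log graph of $F_q(s)$ against $s$. If $\{x(i)\}$ and $\{y(i)\}$ have a long-term correlation, then $F_q(s)$ follows a power law:

$$F_q(s) \sim s^{H_{xy}(q)}. \tag{6}$$

Take the logarithm of both sides of equation (6):

$$\ln F_q(s) = H_{xy}(q) \ln s + \ln A. \tag{7}$$

The scaling index $H_{xy}(q)$ is the Hurst exponent, which is the slope of the plot of $\ln F_q(s)$ against $\ln s$. It measures the power-law relationship between the time series. If $H_{xy}(q)$ is independent of $q$, the correlation is single-fractal; if $H_{xy}(q)$ changes with $q$, the correlation has multifractal characteristics. When $q > 0$ ($q < 0$), $H_{xy}(q)$ describes the scaling behavior of the correlation between large (small) fluctuations of the two time series. The relationship between $H_{xy}(q)$ and the multifractal index $\tau(q)$ is as follows:

$$\tau(q) = q H_{xy}(q) - 1. \tag{8}$$

If $\tau(q)$ is a nonlinear function of $q$, the series have multifractal characteristics.

With the Legendre transform, we can obtain the relationship between the multifractal spectrum $D(q)$ and $h(q)$:

$$h(q) = \frac{d\tau(q)}{dq}, \qquad D(q) = q\,h(q) - \tau(q), \tag{9}$$

where $h(q)$ is a singularity index that describes the singularity of a time series, and $D(q)$ is the multifractal spectrum reflecting the fractal dimension associated with the singularity exponent $h(q)$. To better reflect the multifractal characteristics, we use a financial risk index:

$$\Delta H = \max_{q} H_{xy}(q) - \min_{q} H_{xy}(q), \tag{10}$$

where $\Delta H$ is the range of $H_{xy}(q)$. The larger the span, the more obvious the multifractal feature and the higher the risk. According to equation (10), $\Delta H$ serves as an index of the degree of multifractality.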
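The procedure above can be sketched in Python under one loudly stated simplification: the local trend in each interval is a fitted polynomial (ordinary MF-DCCA) rather than the MODWT smooth component, so this is not the MFDCCA-MODWT variant used in the paper. The function names, the synthetic input series, and the chosen scales and q values are assumptions for illustration only.

```python
import numpy as np

def mfdcca(x, y, scales, qs, order=1):
    """Estimate H_xy(q) with polynomial local trends (stand-in for MODWT)."""
    X = np.cumsum(x - np.mean(x))            # profile of x
    Y = np.cumsum(y - np.mean(y))            # profile of y
    N = len(X)
    logF = np.empty((len(qs), len(scales)))
    for k, s in enumerate(scales):
        s = int(s)
        Ns = N // s
        # Intervals counted from the start and from the end: 2 * Ns in total.
        starts = [v * s for v in range(Ns)] + [N - (v + 1) * s for v in range(Ns)]
        t = np.arange(s)
        F2 = np.empty(len(starts))
        for m, a in enumerate(starts):
            xs, ys = X[a:a + s], Y[a:a + s]
            rx = xs - np.polyval(np.polyfit(t, xs, order), t)   # residual of x
            ry = ys - np.polyval(np.polyfit(t, ys, order), t)   # residual of y
            F2[m] = np.mean(np.abs(rx) * np.abs(ry))
        for r, q in enumerate(qs):
            if q == 0:
                logF[r, k] = 0.5 * np.mean(np.log(F2))          # q -> 0 limit
            else:
                logF[r, k] = np.log(np.mean(F2 ** (q / 2.0))) / q
    log_s = np.log(scales)
    # H_xy(q) is the slope of ln F_q(s) against ln s.
    return np.array([np.polyfit(log_s, logF[r], 1)[0] for r in range(len(qs))])

def spectrum(qs, H):
    """Return tau(q), h(q), D(q) via a numerical Legendre transform."""
    tau = qs * H - 1.0
    h = np.gradient(tau, qs)
    D = qs * h - tau
    return tau, h, D

# Synthetic, correlated white noise (hypothetical data, not the paper's sample).
rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
y = 0.5 * x + rng.standard_normal(4096)
qs = np.arange(-4, 5, dtype=float)
scales = np.array([16, 32, 64, 128, 256])

H = mfdcca(x, y, scales, qs)
tau, h, D = spectrum(qs, H)
delta_H = H.max() - H.min()                  # range of H_xy(q)
print(H, delta_H)
```

For uncorrelated-in-time inputs such as this synthetic example, the estimate at q = 2 should lie near 0.5; values clearly above 0.5 in real data would point to persistent cross-correlation.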

Causality Analysis.
We first import the processed 23 sets of data into TETRAD 6.6.0, and set the display mode as "causal order" and obtain Figure 1. In Figure 1, there are

Correlation Analysis.
In this section, we analyse the multifractal correlation between INE and stocks based on the causal relationships shown in Figure 2 and examine the sources of the correlation. Figure 3 shows the plot of log F_q(s) against log(s) obtained by MFDCCA-MODWT. The curves in each subgraph, from bottom to top, correspond to q = −10, −9, −8, ..., 8, 9, 10. In Figure 3, we find that the slope of the curve differs for different q. The larger the q, the flatter the curve. When q approaches −10, the curve fluctuates to a certain extent, but the values of the coefficient H(q) and the constant term log(A) obtained by OLS are significant at the 1% significance level. Therefore, for different q, each curve is approximately linear, which indicates that there is a power-law correlation between the volatility of two products with a causal relationship. Figure 4 shows the Hurst exponent H(q) calculated by MFDCCA-MODWT. It can be seen from the figure that H(q) decreases gradually as q increases, indicating that the scaling index is not a constant; that is, the cross-correlation between the volatility series is multifractal. In addition, when q = 2, H(q) is greater than 0.5, indicating long-term memory. The scale index is approximately greater than 0.5 in the interval −10 < q < 7, indicating that the correlation of volatility in this interval has long-range persistence, and less than 0.5 in the interval 7 < q < 10, indicating that the volatility correlation is anti-persistent in this interval. That is to say, the cross-correlation of the volatility of the selected samples is characterized by multifractality. In general, H(q) decreases as q increases, indicating that the cross-correlation of volatility is more persistent for small fluctuations than for large fluctuations. In short, the cross-correlation between the prices of the two markets is more persistent when one market experiences small fluctuations than when it experiences large ones.

Multifractal Analysis.
The multifractal strength of the financial system is expressed by the degree of nonlinearity of the scale index. It can be seen from the second column of Figure 5 that the curve has a certain degree of curvature, but it is not pronounced, indicating that the cross-correlation between the prices of the selected products has weak multifractal characteristics. The third column of Figure 5 is the singular spectral function of the multifractal spectrum, which describes the complex dynamics of financial markets. In general, the multifractal spectral width is used to estimate the fractal strength. According to the study of Chen and He [50], the multifractality can be expressed by the width of the multifractal spectrum:

$$\Delta h_q = \max_{q} h(q) - \min_{q} h(q).$$

ΔHq and Δhq can measure the absolute magnitude of the price fluctuation of time series.

In the above, we found that the cross-correlation between the price volatility of INE and the selected stocks has strong multifractal characteristics. Based on this, we further explore the source of the multifractal features. In the existing literature, many different methods are used to characterize the implicit behavior of financial variables, such as wave scale analysis, the structure function, the wavelet transform method, and so on. It is generally believed that a thick-tailed distribution and long memory are the two possible sources of multifractal properties in financial time series [51]. First, by comparing the degree of multifractality between the original and the shuffled series, we can quantify the contribution of long memory. In this paper, the shuffled sequence is generated with the randperm command in Matlab. We repeat the shuffling 100 times to ensure that the original series is completely disrupted. Second, the classical method of quantifying the contribution of thick-tailed distributions in a sequence with multifractal features is to compare the degree of multifractality between the original sequence and a surrogate sequence. Here, the surrogate sequence is obtained by Fourier phase randomization. The procedure creates surrogate data with the same correlation properties as the original signal [52]. Following the procedure, one performs a Fourier transform on the original time series, preserving the Fourier amplitudes but randomizing the Fourier phases. Finally, one performs an inverse Fourier transform to create the surrogate data [53]. The results are shown in Figure 5 and Table 2.
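The two reference series described above can be sketched in Python. This is a hedged illustration rather than the authors' Matlab code: `rng.permutation` is used as the NumPy analogue of randperm, the function names are hypothetical, and the heavy-tailed input is synthetic.

```python
import numpy as np

def shuffled(x, rng):
    """Randomly permute the series: keeps the distribution, destroys memory."""
    return rng.permutation(x)

def phase_surrogate(x, rng):
    """Keep the Fourier amplitudes of x and randomize the Fourier phases."""
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    new_spec = np.abs(spec) * np.exp(1j * phases)
    new_spec[0] = spec[0]            # zero-frequency term (the mean) stays as is
    if n % 2 == 0:
        new_spec[-1] = spec[-1]      # Nyquist term must stay real for even n
    return np.fft.irfft(new_spec, n=n)

# Synthetic heavy-tailed series (hypothetical data, not the paper's sample).
rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=1024)

x_shuf = shuffled(x, rng)
x_sur = phase_surrogate(x, rng)

# The shuffled series has exactly the same values in a different order;
# the surrogate has the same amplitude spectrum but different values.
print(np.array_equal(np.sort(x_shuf), np.sort(x)))
print(np.allclose(np.abs(np.fft.rfft(x_sur)), np.abs(np.fft.rfft(x))))
```

Running the multifractal analysis on `x_shuf` and `x_sur` and comparing the resulting spans with those of the original series is the comparison reported in Figure 5 and Table 2.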
From Table 2 we can see that the ΔHq and Δhq calculated with the shuffled and surrogated sequences are smaller than those of the original sequence, indicating that the multifractal feature between the volatility sequences is caused by both long memory and the thick-tailed distribution. Figure 5 provides the span of the Hurst exponent H(q) and the fractal spectrum for the original, shuffled, and surrogated series. It can be seen that the ranges ΔH and Δhq of the original sequence are significantly reduced after shuffling and surrogating, indicating that long memory and thick-tailed distributions both play a role in the multifractality. However, comparing the shuffled and surrogated sequences, we found that the shuffled sequence has a narrower opening in Figure 5, indicating that long memory plays a leading role and is the main reason for the multifractal features.
For the empirical tests, we focused on the price correlation between China's first mature crude oil futures product and several key stocks in the crude oil industry chain in the domestic stock market and obtained the following three findings. First, DAG ascertains the causal price relationship between INE and key stocks. From the simplified DAG, it can be seen that the price of INE affects the market prices of oil and gas exploration and sales companies in China's financial market and, through them, the stock prices of companies in downstream industries. Second, based on MFDCCA-MODWT with high-frequency data, we verified that there is a multifractal correlation between INE and the selected stocks and that the correlation between small fluctuations is higher than that between large fluctuations. Third, by using shuffled and surrogated sequences, we show that this multifractality is caused by long memory and the thick-tailed distribution, and that long memory is the main source of the multifractal features.

Discussion
Following the above results, conclusions can be drawn from two aspects. On the one hand, for the financial market, the results of DAG and MFDCCA-MODWT show that INE could bring stable expectations and guidance to the market performance of China's key crude oil industry-related companies during trading hours. These results indicate that INE has a good ability to reduce the risk of related products in the financial market. This indicator is especially crucial for high-frequency traders. Also, the correlation between the price of crude oil futures and the stock price of listed companies can be a useful piece of supplementary information when using green finance tools. If a listed company does promote a green technology, its stock price can usually find its value quickly. In that case, an auditing institution can use the correlated fluctuation between the company's stock price and the price of crude oil futures as a second safeguard when evaluating the authenticity of the declaration materials.
On the other hand, for Chinese policymakers, we suggest that China can better sort out its market linkages, and on this basis, promote its green financial policies more effectively. In addition, considering that the spillover effects of fluctuations between different financial sub-markets in China are more complex, decision-making departments should take into account the development trend of each financial sub-market and the information spillover and risk dissemination among the sub-markets, and strengthen structural governance to enhance the suitability of China's financial market system. Besides China, this research provides a basis for investigating market linkages in the immature financial markets of developing countries, and it also provides a framework for investigating related content in other markets around the world. At the same time, we can also use the linkages between the markets to design a check and balance mechanism to better regulate the energy market and implement precise policies to make more contributions to the control of global warming.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.