Coal Price Forecasting and Structural Analysis in China

Coal plays an important role in China’s energy structure, and its price has decreased continuously since the second half of 2012. The persistently low price has affected the profits of coal enterprises and the coal use of downstream firms, so accurate coal price forecasts provide a reference for these enterprises when they set their management strategy. Based on historical data on coal price and related factors (port stocks, sales volume, futures prices, the Producer Price Index (PPI), and crude oil price) from November 2013 to June 2016, this study forecasts coal price with a vector autoregression (VAR) model and portrays the dynamic correlations between coal price and these variables through impulse response functions and variance decomposition. Comparing predicted and actual values, the root mean square error (RMSE) was small, which indicates good precision of the model. This short-period prediction can therefore help these enterprises make sound business decisions.


Introduction
Coal is the fundamental and dominant energy source in China. In 2015, total energy consumption in China was 4.3 billion tons of standard coal, of which coal accounted for 64%. Bhattacharya et al. suggested that improving overall efficiency in the coal sector will continue to play a significant role in maintaining sustainable growth in China in the long run [1]. On the supply side, the Chinese government amended the Law of the People's Republic of China on the Prevention and Control of Atmospheric Pollution at the 16th session of the 12th Standing Committee of the National People's Congress. The law stipulates that the relevant ministries of the State Council and local people's governments at all levels are responsible for taking proper measures to adjust the energy structure and promote the production and utilization of clean energy, and that the related departments should optimize the way coal is used, promote its clean and efficient utilization, and consequently reduce the proportion of coal in primary energy consumption. Meanwhile, coal has priority in the production capacity reduction under the 13th five-year plan: production capacity is planned to be reduced by about 0.5 billion tons in order to advance the supply-side energy revolution. On the demand side, China energetically promotes the utilization of clean energy and reduces thermal power generation. In 2015, thermal power generation was 4210.2 billion kWh, down 2.8% year on year. The equipment utilization hours of thermal power averaged 4329 hours, 410 hours fewer than in the previous year. The Action Plan of the Clean and Effective Utilization of Coals in Industrial Circle aims to save more than 0.16 billion tons of coal consumption by 2020. Yuan et al. found that coal used for power generation would peak at around 1280 million tons of coal equivalent under the clean coal power plan declared by the Chinese government [2]. With these changes on both the supply side and the demand side, both sides of the market will pay much more attention to the price of coal, so coal price forecasting becomes more important to both of them.
In 2013, the Chinese government canceled the key contract on coal supply and the double-track pricing system for electrical coal, which marked the coal market's full entry into a period of marketization in which the coal price is determined entirely by the market. However, since the second half of 2012, the electric power and steel industries, which normally consume large amounts of coal, had reduced their coal consumption under the influence of decelerating economic growth. At the same time, attracted by the previous high price, many companies entered the coal market and competed for resources, with the result that coal production capacity far exceeded demand. This supply-demand imbalance caused the price of coal to start falling and to keep falling. The price in 2015 was 60% lower than its high level in 2011. At the beginning of 2016, the price of coal rose slightly because of the winter consumption peak and tight resources at ports. Afterwards, the price tended to be stable.
For coal suppliers, coal consumers, and energy policy makers, accurate prediction of the coal price supports more rational decisions, so it is useful to explore a coal price prediction model under a market economy. In the available literature, Liu and Ma predicted coal price using ARMA and grey forecasting models [3]. Liu et al. analyzed the major factors behind coal price in China [4]. Xu et al. built an FS-SVM model for coal price [5]. These papers have practical value, but energy price analyses more commonly address oil prices, natural gas prices, and so forth. Kanamura proposed an SDV model for energy prices using the supply-demand relationship [6]. García-Martos et al. built a multivariate model for energy prices and compared its results with those of the univariate model [7]. Yuan et al. studied the relations between Chinese energy consumption and energy prices [8]. Wei and Guo studied the relationship between oil prices and the Chinese macroeconomy [9]. Zhang and Yao employed the state-space model and the log-periodic power law model to explore the dynamic bubbles of oil prices and predict their crash time [10]. van Goor and Scholtens investigated daily natural gas prices in the UK [11]. The essential ideas and forecasting methods of these papers inspired our research. In this paper we choose PPI, port stocks and sales volume, futures prices of steam coal, and crude oil price to represent, respectively, the economic situation, supply and demand, financial instruments, and a substitute, and we use them to predict the coal price in China. Because coal futures in China only started trading in September 2013, this factor is rarely used in other articles.
Considering the advantages of the VAR model, we use it in this paper. The VAR model allows us to explore how variable values depend on their own lags and the lags of other variables, offering a rich structure for evaluating integrated data. These characteristics make it useful for prediction. Wang et al. analyzed the trend of the RMB equilibrium exchange rate in a VAR model system [12]. Zheng et al. used the VAR model to forecast pork consumption in China [13]. Sing et al. used a VAR model to estimate the value of annual construction work carried out by main contractors, which was useful for forecasting private-sector construction completions [14]. Shah and Ghonasgi developed a VAR model to find the determinants of the price level in India for the period 1994-2008 and tested the predictive ability of the model [15]. Aydin and Cavdar compared the prediction performance of an Artificial Neural Network and a VAR model [16].

Vector Autoregression Model.
The VAR model can be used to analyze the dynamic interaction of time series and the dynamic impact of random disturbances on the variable system, as well as for forecasting. It thus explains how various shocks influence the variables, and it is one of the most widely applied multivariate time series models. It can easily handle multiple variables and predict time series. Compared with some neural network models or models with complex kernel functions, it allows a more straightforward analysis of economic questions and of the influence of variables on prices. We used the VAR model to predict the coal price with port stocks, sales volume, futures prices, PPI, and the price of WTI.
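The mathematical expression of a general VAR($p$) model is

$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + B x_t + \varepsilon_t, \qquad t = 1, 2, \ldots, T,$$

where $y_t$ is a $k \times 1$ vector of time series; each $A_i$ is a $k \times k$ parametric matrix; $x_t$ is a $d \times 1$ vector of exogenous variables; and $B$ is the $k \times d$ coefficient matrix to be estimated. $\varepsilon_t$ represents the random error term, while $p$ represents the lag period.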
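The recursion behind a VAR forecast can be illustrated with a minimal sketch. It is not the estimation procedure used in this paper: the coefficient matrices are assumed to be already known, the exogenous term is reduced to a constant intercept, and the function names (`mat_vec`, `var_forecast`) are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(a: Matrix, v: Vector) -> Vector:
    """Multiply a k x k matrix by a length-k vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def var_forecast(history: List[Vector], coefs: List[Matrix],
                 intercept: Vector, steps: int) -> List[Vector]:
    """Iterate y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} forward.

    history holds observed vectors, oldest first, and must contain at
    least p = len(coefs) observations. Future error terms are replaced
    by zero, their expected value.
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p observations to start the recursion")
    path = [list(y) for y in history]
    for _ in range(steps):
        y_next = list(intercept)
        for lag, a in enumerate(coefs, start=1):
            contribution = mat_vec(a, path[-lag])
            y_next = [u + w for u, w in zip(y_next, contribution)]
        path.append(y_next)
    return path[len(history):]


# Illustrative two-variable VAR(1) with made-up coefficients.
A1 = [[0.5, 0.0], [0.0, 0.5]]
print(var_forecast([[2.0, 4.0]], [A1], [0.0, 0.0], steps=2))
```

Each forecast is fed back into the recursion, which is why multi-step forecasts from a stable VAR decay toward the unconditional mean.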
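The Akaike Information Criterion (AIC) and the Schwarz Criterion (SC) are used to select the lag periods of the VAR model in this study. The AIC and SC are computed as follows:

$$\mathrm{AIC} = -\frac{2l}{T} + \frac{2n}{T}, \qquad \mathrm{SC} = -\frac{2l}{T} + \frac{n \ln T}{T},$$

where $n$ represents the total number of estimated parameters and $T$ represents the sample length. $l$ is determined using

$$l = -\frac{Tk}{2}\,(1 + \ln 2\pi) - \frac{T}{2}\,\ln\bigl|\hat{\Sigma}\bigr|,$$

where $k$ is the number of endogenous variables and $\hat{\Sigma}$ is the estimated residual covariance matrix.

In practice, the unit root test is used to check stationarity, while the cointegration test is used to check whether a long-run relationship exists among the variables. If the VAR model is stable, the analysis can proceed to coal price prediction and the error test.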
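A small sketch of how these criteria can be evaluated follows. It assumes the commonly used definitions AIC = -2l/T + 2n/T and SC = -2l/T + n ln(T)/T with a Gaussian log-likelihood l, and the parameter count n = k(d + pk) for a VAR with k endogenous variables, p lags, and d exogenous terms per equation; the function names are hypothetical and the inputs are illustrative, not the values behind Table 3.

```python
import math


def var_log_likelihood(T: int, k: int, det_sigma: float) -> float:
    """Gaussian log-likelihood l = -(T*k/2)*(1 + ln 2*pi) - (T/2)*ln|Sigma|."""
    if det_sigma <= 0:
        raise ValueError("residual covariance determinant must be positive")
    return (-(T * k / 2.0) * (1.0 + math.log(2.0 * math.pi))
            - (T / 2.0) * math.log(det_sigma))


def n_params(k: int, p: int, d: int = 1) -> int:
    """Total estimated coefficients: k equations, each with p*k lags and d exogenous terms."""
    return k * (d + p * k)


def aic(log_lik: float, n: int, T: int) -> float:
    """Akaike criterion; smaller is better."""
    return -2.0 * log_lik / T + 2.0 * n / T


def sc(log_lik: float, n: int, T: int) -> float:
    """Schwarz criterion; penalizes parameters more heavily once ln(T) > 2."""
    return -2.0 * log_lik / T + n * math.log(T) / T


def select_lag(aic_by_lag: dict) -> int:
    """Return the lag order with the smallest criterion value."""
    return min(aic_by_lag, key=aic_by_lag.get)


# Illustrative numbers only: six variables, two lags, one constant per equation.
n = n_params(k=6, p=2)
l = var_log_likelihood(T=27, k=6, det_sigma=1e-20)
print(n, aic(l, n, 27), sc(l, n, 27))
```

With a sample as short as the one used here, the parameter penalty is large relative to T, which is one reason the candidate lag orders have to stay small.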

Data.
This paper estimates the coal price with five variables: port stocks, sales volume, futures prices of steam coal, PPI, and crude oil price, selected to cover supply and demand, financial instruments, the economic situation, and the price of a substitute. The major factor that influences prices is the degree of marketization; within it, the relationship between demand and supply plays a significant role, and substitutes also make a difference. For the coal price in China, the data on coal capacity remain inaccurate, so port stocks are used as the indicator of supply and demand, as described under (i) Port Stocks.
(ii) Futures Prices. Chinese steam coal futures were officially listed in September 2013. Wei found that steam coal futures have shown a price discovery function [17]. The futures prices of steam coal can therefore serve as an indicator of the price trend in the spot market.
(iii) Sales Volume. Sales volume influences the sale price set by coal enterprises. When consumers anticipate rising prices or strained transport capacity, they increase the quantity of coal they purchase in the same month. Meanwhile, coal enterprises may cut the coal price to boost sales volume.
(iv) PPI. The PPI is a price indicator that reflects, from the producers' viewpoint, changes in the price level of goods and services across periods; it is also an important reference for national economic accounting. Because coal price is monthly data, this paper uses the PPI, which is also monthly, to represent the economic situation in China. Over the last two years, the Chinese economy has performed poorly, with a declining PPI and serious problems of weak demand and excess capacity. The sustained economic downturn has put great downward pressure on coal prices.
(v) Crude Oil Price. As an important fossil energy source, crude oil is a major substitute for coal, so fluctuations in the crude oil price influence both the demand for and the price of coal. When the crude oil price rises, demand for coal as a substitute increases correspondingly, and the coal price may move as well. China's oil price depends on Chinese policies and lags the market reaction. The WTI price can influence not only coal imports but also, to some extent, the domestic oil price. Several papers have examined WTI and Chinese coal prices: Guo et al. analyzed the relationship between the world oil price and China's coke price [18], and Liu et al. carried out a comparative analysis of the price of WTI, the BJI steam coal price, and the Shanxi blending coal price [4]. In this paper, we therefore use the price of WTI as the crude oil price.
To eliminate heteroscedasticity and linearize the relationships among the variables, we take the natural logarithm of port stocks (LOGSS), sales volume (LOGS), futures prices (LOGF), and the price of WTI (LOGW). Since PPI is a dimensionless variable, its original time series (PP) is used. We selected the 29 months from November 2013 to March 2016 as time units in the model. All data in this paper were obtained from the China coal market network (http://www.cctd.com.cn/) and the Zhengzhou Commodity Exchange website (http://www.czce.com.cn/portal/index.htm).

Unit Root Test Results.
Before using the VAR model, it was necessary to ensure that the data are stationary in order to prevent spurious regression. Table 1 shows the results of augmented Dickey-Fuller unit root tests. For the series in levels, the null hypothesis of a unit root was not rejected. For the first-differenced series, the null hypothesis is rejected at the 10% level. Thus, it can be concluded that all the variables are first-order difference stationary, and we can proceed with the cointegration test.
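The transformation behind the phrase "first-order difference stationary" can be sketched as follows. This only shows the log transform and first differencing to which a unit root test would then be applied; it does not implement the augmented Dickey-Fuller test itself, and the function names and sample values are hypothetical.

```python
import math
from typing import List


def log_series(values: List[float]) -> List[float]:
    """Natural logarithm of a strictly positive series."""
    if any(v <= 0 for v in values):
        raise ValueError("logarithm requires strictly positive values")
    return [math.log(v) for v in values]


def first_difference(series: List[float]) -> List[float]:
    """Return x_t - x_{t-1}; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]


# Made-up monthly prices, for illustration only.
prices = [520.0, 510.0, 495.0, 500.0]
log_prices = log_series(prices)
print(first_difference(log_prices))  # approximate month-on-month growth rates
```

Differencing a log series gives approximate percentage changes, which is why a trending price series can still have a stationary first difference.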

Cointegration Test Results.
Since most time series are nonstationary and the transformed series often lack direct economic significance, Engle and Granger proposed the cointegration theory and methods, providing another way to model nonstationary series [19]. The cointegration test is applied to multiple variables: although each may vary independently over the long term, if they are cointegrated, then a long-term, stable relationship exists among them. We use the multivariate cointegration method proposed by Johansen and Juselius [20], based on the VAR model, and carry out the cointegration tests on the data using a first-order lag. Test results are shown in Table 2. The Max-Eigen statistic shows that, at the 95% confidence level, there is only one long-term cointegration relationship among coal price, port stocks, sales volume, futures prices of steam coal, PPI, and crude oil price.

Construction of Our Model.
A VAR model is built on the statistical properties of the data. It is constructed by taking each endogenous variable in the system as a function of the lagged values of all the endogenous variables in the system. In this way, a single-variable autoregression model is expanded to a "vector" autoregression model comprising multivariate time series.
An important aspect of the VAR model is the determination of the lag order. The larger the lag order, the more parameters must be estimated and the fewer degrees of freedom remain, while an insufficient lag order will not reflect the dynamic characteristics of the model. The AIC is used to evaluate the lag order in this study. The lag order for our VAR model is given in Table 3, reflecting the AIC.
As shown in Table 3, the selected lag order is 2 in accordance with the AIC. To ensure our model was well specified, it was necessary to conduct a stability test. If the VAR model is not stable, then the prediction cannot be carried out. Stability was assessed using the autoregressive characteristic polynomial. When the moduli of all the characteristic roots are less than 1, that is, the roots are located within the unit circle, the model is stable. The VAR roots of this characteristic polynomial are shown in Figure 1.
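The stability condition can be checked numerically, as in the following sketch. It stacks the lag matrices into the companion matrix and estimates its spectral radius from the growth rate of matrix powers (Gelfand's formula) instead of computing the roots directly, so it is a swapped-in technique, not the root plot of Figure 1. Because the estimate never falls below the true spectral radius, a value under 1 implies stability, while a model with roots very close to the unit circle may be flagged as unstable. The function names and coefficients are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def companion_matrix(coefs: List[Matrix]) -> Matrix:
    """Stack A_1..A_p into the (k*p) x (k*p) companion matrix."""
    p, k = len(coefs), len(coefs[0])
    size = k * p
    comp = [[0.0] * size for _ in range(size)]
    for lag, a in enumerate(coefs):
        for i in range(k):
            for j in range(k):
                comp[i][lag * k + j] = a[i][j]
    for i in range(k, size):
        comp[i][i - k] = 1.0  # identity blocks below the first block row
    return comp


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_radius(m: Matrix, iterations: int = 500) -> float:
    """Estimate max |eigenvalue| as ||M^n||_F ** (1/n), rescaling each step."""
    size = len(m)
    power = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    log_norm = 0.0
    for _ in range(iterations):
        power = mat_mul(power, m)
        norm = math.sqrt(sum(x * x for row in power for x in row))
        if norm == 0.0:
            return 0.0  # nilpotent matrix: all eigenvalues are zero
        log_norm += math.log(norm)
        power = [[x / norm for x in row] for row in power]
    return math.exp(log_norm / iterations)


def is_stable(coefs: List[Matrix], iterations: int = 500) -> bool:
    """True when the estimated spectral radius is below one."""
    return spectral_radius(companion_matrix(coefs), iterations) < 1.0


# Illustrative scalar VAR(2): y_t = 0.5 y_{t-1} + 0.3 y_{t-2} + e_t.
print(is_stable([[[0.5]], [[0.3]]]))
```

The eigenvalues of the companion matrix are the inverse roots of the autoregressive characteristic polynomial, so this check and the unit-circle criterion express the same condition.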

Impulse Response Functions.
Because the VAR model is not a theoretical model, no a priori constraints are imposed on the variables. The analysis therefore focuses on the dynamic effect on the system when the VAR model receives a shock. The impulse response function is an analysis tool used by many researchers to describe this dynamic, that is, the effect that a change in one variable of the VAR model has on another variable (e.g., Xu and Lin [21]; Zhu et al. [22]). Results are presented as graphs that show the direction, amplitude, and persistence of each variable's response to a shock over time. The advantage of this approach is that it highlights changes over time at the scale of the whole system: when the system undergoes an external shock, it becomes temporarily unstable but returns to balance as it adjusts over time. In impulse response analysis, the VAR model is transformed into a vector moving-average process of infinite order. With all other conditions unchanged, a unit shock to the error term at time $t$ affects the endogenous variables in the model during the current period and, through the lags, in later periods. The resulting impulse response functions for coal price are shown in Figure 2. The horizontal axis shows the lag period of the shock, up to 15 periods. The vertical axis shows the response of coal price to each factor; the solid line is the impulse response function.
Figure 2 shows how the dependent variable (coal price) responds over the following 15 periods when the coal price itself and each explanatory variable (port stocks, sales volume, futures prices of steam coal, PPI, and crude oil price) receive an innovation shock. First, in the early stages (the first 4 periods), coal price responds strongly to its own shock, and the response to itself is positive in all periods, peaking at 0.027 in the second period. Second, coal price responds positively to PPI in both the short and the long term, peaking at 0.046 in the ninth period. This indicates that as economic conditions improve, industrial electricity consumption increases, which raises coal consumption and hence the coal price. Third, coal price responds modestly to sales volume shocks in the early stage. From the third to the fifteenth period, the response fluctuates widely from positive to negative, suggesting that the price tends to rise as demand increases but that overproduction may cut prices in the long run. Fourth, coal price responds positively to futures prices in the short term and negatively in the long term, and the intensity of the response declines over time. Fifth, a positive shock to the crude oil price brings a negative response in coal price in the short term, which later returns to equilibrium before turning positive. Sixth, a positive shock to port stocks brings a weak negative response in coal price. This suggests that high port stocks signal oversupply and hence an expected fall in coal prices.
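The moving-average representation mentioned above can be computed with a short recursion, sketched here for illustration. The sketch returns non-orthogonalized responses to a unit shock; the responses in Figure 2 may instead be orthogonalized, which the text does not specify. The coefficient values and function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ma_coefficients(coefs: List[Matrix], horizon: int) -> List[Matrix]:
    """Phi_0 = I and Phi_i = sum_{j=1..min(i,p)} Phi_{i-j} A_j, for i = 1..horizon."""
    k = len(coefs[0])
    phis = [[[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]]
    for i in range(1, horizon + 1):
        phi = [[0.0] * k for _ in range(k)]
        for j, a in enumerate(coefs, start=1):
            if j > i:
                break
            term = mat_mul(phis[i - j], a)
            phi = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(phi, term)]
        phis.append(phi)
    return phis


def impulse_response(coefs: List[Matrix], shock_var: int,
                     response_var: int, horizon: int) -> List[float]:
    """Response of one variable, periods 0..horizon, to a unit shock in another."""
    return [phi[response_var][shock_var]
            for phi in ma_coefficients(coefs, horizon)]


# Illustrative VAR(1): variable 1 feeds into variable 0 with weight 0.1.
A1 = [[0.5, 0.1], [0.0, 0.5]]
print(impulse_response([A1], shock_var=1, response_var=0, horizon=5))
```

In a stable model the entries of these matrices shrink toward zero as the horizon grows, which matches the description of a shocked system returning to balance.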

Variance Decomposition.
Variance decomposition analyzes the contribution of each shock to the endogenous variable, highlighting the importance of different structural shocks. It can therefore illustrate the relative importance of a given factor in the VAR model. To describe quantitatively the contributions of port stocks, sales volume, futures prices, PPI, and crude oil price to coal price, the variance decomposition of our VAR model is given in Table 4, which shows the contributions of these five factors to changes in the coal price standard error.
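How such contribution shares arise can be sketched with a forecast error variance decomposition based on a Cholesky factor of the residual covariance matrix. This is one common identification scheme and is assumed here for illustration; the text does not state which scheme underlies Table 4, and the result depends on the ordering of the variables. All names and numbers are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cholesky(sigma: Matrix) -> Matrix:
    """Lower-triangular L with L L^T = sigma (sigma must be positive definite)."""
    n = len(sigma)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                d = sigma[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low


def ma_coefficients(coefs: List[Matrix], count: int) -> List[Matrix]:
    """Return Phi_0 .. Phi_{count-1} of the moving-average representation."""
    k = len(coefs[0])
    phis = [[[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]]
    for i in range(1, count):
        phi = [[0.0] * k for _ in range(k)]
        for j, a in enumerate(coefs, start=1):
            if j > i:
                break
            term = mat_mul(phis[i - j], a)
            phi = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(phi, term)]
        phis.append(phi)
    return phis


def fevd(coefs: List[Matrix], sigma: Matrix, horizon: int) -> Matrix:
    """shares[i][j]: fraction of variable i's h-step forecast error variance due to shock j."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    k = len(sigma)
    chol = cholesky(sigma)
    contrib = [[0.0] * k for _ in range(k)]
    for phi in ma_coefficients(coefs, horizon):
        theta = mat_mul(phi, chol)  # orthogonalized responses at this step
        for i in range(k):
            for j in range(k):
                contrib[i][j] += theta[i][j] ** 2
    return [[c / sum(row) for c in row] for row in contrib]


# Illustrative two-variable VAR(1) with uncorrelated unit-variance shocks.
A1 = [[0.5, 0.1], [0.0, 0.5]]
print(fevd([A1], [[1.0, 0.0], [0.0, 1.0]], horizon=2))
```

Each row of the result sums to one, so the entries can be read as percentage contribution rates of the kind reported in a variance decomposition table.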
As shown in Table 4, over the 15 forecast periods, coal price has its greatest impact during the first six periods, with its own contribution rate above 45%. Thereafter, its contribution levels off at about 22%. The contribution rate of PPI to coal price increases steeply from the fifth period, exceeds that of all other variables from the seventh period, and then remains stable at a high level between 62% and 65%. Futures price shocks rank third between the fifth and eighth periods, and sales volume shocks rank third after the eighth period. The contribution rates of port stocks and crude oil price remain relatively static. Therefore, in the long run, economic conditions contribute most to changes in coal price, while in the short run coal price contributes most to its own changes. We applied the root mean square error (RMSE) to further verify precision. The RMSE is expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{P}_i - P_i\right)^2},$$

where $\hat{P}_i$ means predicted coal price, $P_i$ means actual coal price, and $i = 1, 2, 3, \ldots, 28$, so that $N = 28$.
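A minimal sketch of the error measure follows, assuming the plain definition of RMSE as the square root of the mean squared difference between predicted and actual values. The paper reports RMSE as a percentage; whether that reflects log-transformed prices or a normalization is not stated, so the sketch makes no such adjustment. The function name and sample values are hypothetical.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between two equally long, non-empty series."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up log prices, for illustration only.
predicted = [6.20, 6.18, 6.15]
actual = [6.21, 6.17, 6.16]
print(rmse(predicted, actual))
```

When the series are in natural logarithms, the RMSE is approximately a relative error, which is one way a value could be quoted as a percentage.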

Results and Error Test.
We obtained an RMSE of 1.39%. We then predicted the coal price for May 2016 (using data from November 2013 to April 2016), June 2016 (using data from November 2013 to May 2016), and July 2016 (using data from November 2013 to June 2016) by the same process; the RMSE values are 1.48%, 1.46%, and 1.44%, respectively, which indicates good precision of the model.
Table 5 compares the prediction results of the VAR model in this paper with those of other methods.

Conclusions and Discussion
This study built a VAR model to predict the coal price in China, with port stocks, sales volume, futures prices, PPI, and crude oil price as influencing factors. After verifying the stability of the model, we carried out impulse response analysis, variance decomposition, and coal price prediction. The impulse response results indicated that economic growth and increases in the coal price itself are the driving forces of the coal price. The variance decomposition results showed that the state of the economy contributes over 50% from the seventh period onward. The RMSE values used to verify precision are 1.39%, 1.48%, 1.46%, and 1.44%, respectively, which indicates good precision of the model and suggests that it is instructive for coal-related corporations and government departments.

Table 5 (only partially recovered): 2.23%; SVM [5] 7.27%. Notes: MAE indicates the mean absolute error and MRE indicates the mean relative error.

Figure 2: Responses of the coal price to influencing factors.

Figure 3 shows the obtained results. We used 29 months of data (November 2013-March 2016) to predict the coal price between January 2014 and April 2016. The blue line represents the predicted coal price over the 28 months from January 2014, and the red line represents the actual coal price. The simulated price curve shows the same downward trend as the actual price over these 28 months (January 2014-April 2016). The two curves are comparatively close to each other, which supports the accuracy of the VAR model.

Figure 3: Predicted results of VAR model.
Data on coal capacity remain inaccurate because of extensive mining and small mines. Port inventory can usually be taken as an important indicator of the supply-demand relationship in the circulation stage.
(i) Port Stocks. With both upstream and downstream enterprises participating, the port inventory of coal intuitively reflects the supply and demand situation for coal. In 2014, nationwide port stocks averaged 41.82 million tons per month. In 2015, the average dropped to 37.44 million tons, 4.38 million tons less than in 2014. In 2016, port stocks kept declining, staying below 30 million tons in each of the first 3 months. The quantity of coal supplied at ports has thus been relatively low compared with the quantity demanded since 2014. The low price of coal made coal trading companies unwilling to hold large stocks at ports. If this situation persists, it may induce a modest rise in the coal price.

Table 2: Cointegration test results. * means MacKinnon-Haug-Michelis $p$ values and * denotes that the null hypothesis of a unit root is rejected at the 5% significance level.

Table 3: Lag length test results.
Notes: * indicates lag order selected by the criterion.