This paper proposes a novel approach to the directional forecasting of short-term oil price changes. In this approach, the short-term oil price series is treated as incomplete fuzzy information, and a new fused genetic-fuzzy information distribution method is developed to process such a fuzzy incomplete information set; a feasible coding method for multidimensional information controlling points is then adopted to fit genetic-fuzzy information distribution to time series forecasting. Using the crude oil spot prices of West Texas Intermediate (WTI) and Brent as sample data, the empirical analysis demonstrates that the fused genetic-fuzzy information distribution method statistically outperforms the logistic regression benchmark in prediction accuracy. The results indicate that the new approach is effective in terms of directional accuracy.

It is well documented that the oil price is strongly connected with the business cycle, macroeconomic variables, global economic conditions, and policy uncertainty [

The oil price, or equivalently the return on the oil price, can be decomposed into a product of two components: the direction of the price change (the sign of the log return) and the magnitude of the change (the absolute value of the return). While forecasting models of variance, such as Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) models, have been widely developed in statistics and econometrics, forecasting the direction of change is less understood and less developed in the literature, apart from a few remarkable exceptions. Several methods have been used in the literature to forecast the oil price; for example, Ghaffari developed a soft computing approach to predict the daily variation of the West Texas Intermediate (WTI) crude oil price and adopted the direction prediction accuracy ratio to demonstrate its effectiveness (see [

Several other researchers have analyzed the directional forecasting problem in other contexts of economics. The directional accuracy as a reasonable utility-based measure of forecasting performance was also advocated by Engel and Hamilton [

The literature review shows that most studies model the oil price series as a complete information structure, and fewer studies have considered the fuzziness of oil price information. This paper contributes to the literature by presenting a new approach to forecasting the directional change of the oil price. In our approach, the crude oil price series is treated as an incomplete and inaccurate data set, owing to possible missing samples or noisy information. To deal with incomplete data, a fuzzy framework is developed to explore the true hidden data generation process. Huang (see [

In this paper, a new fused genetic-fuzzy information distribution method is proposed to predict the direction of short-term crude oil price changes in a fuzzy incomplete information setting. The adjustable weighted sum of the reciprocal of directional accuracy and the root mean square error (RMSE) is set as the fitness function, so as to identify the solution with the least RMSE under the same directional accuracy. The numbers of information controlling points and the lagged order of returns in fuzzy information distribution are optimized by the presented genetic algorithm. A coding algorithm for multidimensional information controlling points is further introduced to fit genetic-fuzzy information distribution to time series forecasting. With the crude oil spot prices of WTI and Brent as sample data, an empirical analysis of oil price changes is carried out following the presented approach. We select logistic regression as a benchmark model to compare the directional forecasting accuracy.

The rest of the paper is organized as follows. In Section

The fuzzy information distribution theory has been considerably successful in processing the fuzziness of information, especially when the observed information is incomplete or inaccurate. To clarify its essential characteristics, the genetic-fuzzy information distribution model is explained next in detail.

In fuzzy information distribution method, the objective is to construct a fuzzy inference from

For the purpose of directional forecasting, the fuzzy information distribution theory is extended in several important theoretical aspects. (1) Fuzzy information distribution is fused with a genetic algorithm to develop a new fuzzy forecasting model, in which the weighted sum of the reciprocal of directional accuracy and the root mean square error (RMSE) serves as the fitness function. The role of the genetic algorithm is to search for the “optimal parameters” that enhance the quality of fuzzy reasoning, a step that has been missing in fuzzy information distribution theory and its applications. (2) It is demonstrated that there is no loss of sample information throughout the multidimensional linear information distribution process. (3) To fit fuzzy information distribution to oil price time series analysis, a coding algorithm for multidimensional information controlling points is adopted, which maintains the temporal structure of time series data, particularly in large-sample applications.
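A minimal sketch of a fitness function of this kind is given below. The exact functional form cannot be read off the text here, so the convex combination `w / D + (1 - w) * RMSE`, the function name `fitness`, and the treatment of `D = 0` are assumptions made purely for illustration; the numbers passed in are hypothetical.

```python
def fitness(direction_accuracy: float, rmse: float, weight: float) -> float:
    """Weighted sum of 1/D and RMSE; smaller is better (assumed form)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if direction_accuracy <= 0.0:
        # Assumption: a candidate that never predicts the direction
        # correctly is treated as infinitely unfit.
        return float("inf")
    return weight / direction_accuracy + (1.0 - weight) * rmse


# Hypothetical candidates with equal directional accuracy, different RMSE.
candidate_a = fitness(0.70, 0.0132, weight=0.5)
candidate_b = fitness(0.70, 0.0145, weight=0.5)
print(candidate_a < candidate_b)
```

Under this assumed form, two candidates with the same directional accuracy are ranked by RMSE whenever the weight is below 1, while a weight of 1 ignores RMSE altogether.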

Let

Without loss of generality, the output variable

Let

For any one sample

Similarly, for input variables, let

A multidimensional sample,

The multidimensional linear distribution function is equal to the product of all one-dimensional linear distribution functions:

In practical application of fuzzy information distribution, lots of multidimensional information controlling points like

To maintain the temporal structure of time series data, a coding method with

With this method, dynamic non-repetitive coding according to the number of information controlling points of each index is conveniently realized. The coding method preserves the logical temporal structure of time series data and therefore facilitates the application of fuzzy information distribution in time series analysis.
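The paper's own coding formula is not reproduced here. As a hedged illustration of what a non-repetitive code that depends on the number of controlling points per index can look like, the sketch below uses a standard mixed-radix encoding; the function names `encode` and `decode` and the example sizes are assumptions, not the authors' scheme.

```python
from itertools import product
from typing import List, Sequence


def encode(indices: Sequence[int], sizes: Sequence[int]) -> int:
    """Map a tuple of per-index point positions to one unique integer."""
    code = 0
    for idx, size in zip(indices, sizes):
        if not 0 <= idx < size:
            raise ValueError("index out of range for its dimension")
        code = code * size + idx
    return code


def decode(code: int, sizes: Sequence[int]) -> List[int]:
    """Invert encode(): recover the per-index positions from the code."""
    indices = []
    for size in reversed(sizes):
        code, idx = divmod(code, size)
        indices.append(idx)
    return indices[::-1]


# Hypothetical setting: two lagged-return indexes with 3 points each
# and one output index with 4 points.
sizes = [3, 3, 4]
codes = [encode(c, sizes) for c in product(*(range(s) for s in sizes))]
print(len(codes) == len(set(codes)))  # every combination gets its own code
print(decode(encode([2, 0, 3], sizes), sizes))
```

Because the radix of each position is the number of controlling points of that index, changing those numbers changes the code range automatically, and the order of the lagged indexes is kept in the order of the digits.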

Through multidimensional information distribution, all information controlling points with

Finally, the two-dimensional information distribution matrix from

As shown by the next result, the whole information gains on all multidimensional information controlling points from sample information are preserved; in other words, there is no sample information loss through the multidimensional linear information process.
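The conservation property can be illustrated numerically. The sketch below assumes the usual one-dimensional linear distribution on equally spaced controlling points, in which a sample gives weight `1 - |x - u| / step` to each point within one step, and takes the multidimensional weight as the product of the one-dimensional weights. The function names, grids, and sample values are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def distribute_1d(x: float, points: Sequence[float]) -> List[float]:
    """Linear distribution of x over equally spaced controlling points."""
    step = points[1] - points[0]
    return [max(0.0, 1.0 - abs(x - u) / step) for u in points]


def distribute_nd(
    sample: Sequence[float], grids: Sequence[Sequence[float]]
) -> Dict[Tuple[int, ...], float]:
    """Product of the 1-D weights on every multidimensional point."""
    per_dim = [distribute_1d(x, g) for x, g in zip(sample, grids)]
    gains = {}
    for combo in product(*(range(len(g)) for g in grids)):
        q = 1.0
        for dim, j in enumerate(combo):
            q *= per_dim[dim][j]
        if q > 0.0:
            gains[combo] = q
    return gains


# Hypothetical grids covering the sample range in each dimension.
grids = [
    [-0.05, -0.025, 0.0, 0.025, 0.05],
    [-0.04, -0.02, 0.0, 0.02, 0.04],
]
gains = distribute_nd([0.003, -0.012], grids)
print(len(gains), sum(gains.values()))
```

For a sample lying inside the grid in every dimension, the weights over all controlling points add up to 1 up to floating-point rounding, so the sample's information is only redistributed, not lost. A sample outside the grid would lose weight, which is why the grid has to cover the sample range.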

The distribution matrix

Alternatively, the set-valued statistics and the conditional falling shadow formula based on the falling shadow theory are used by Jun and Kang [

The fuzzy relation matrix is

Both approaches will be used in our model below. We now turn to the fuzzy inference process and use it to deal with the incomplete information data set.

A fuzzy inference process is, in essence, a fuzzy transform from an input fuzzy set

Suppose that

For fuzzy inference on

For inference for

Moreover, two

Another one is the weighted average of the information controlling points with the membership degree as weights, that is,

At last, to simplify notations of

In fuzzy information distribution process, the key optimization parameters are: the numbers of input indexes

The genetic algorithm is applied to optimize the parameter combinations. For this purpose,

Since the directional change of the crude oil spot price is investigated, the logarithmic returns are used as

In the fuzzy information distribution theory, the numbers of information controlling points of the input and output indexes are the vital parameters of the information distribution model, and they directly affect its precision. To reduce the number of estimated parameters, we consider a special case where each input index has the same number of controlling points, i.e.,

There are three steps in finding Pareto solution.

Clearly, each combination in

We obtained daily crude oil spot prices, in dollars per barrel, from the US Department of Energy, Energy Information Administration. Brent crude oil spot prices (Brent-Europe) and West Texas Intermediate crude oil spot prices (WTI-Cushing, Oklahoma) are extracted from Nov. 13, 2017, to Sep. 28, 2018, giving a total of 220 samples. The first 200 samples are used for modeling, while the last 20 samples are used to test the prediction accuracy. The descriptive statistics of the samples are shown in Table

The descriptive statistics of oil returns.

Series | Mean | Maximum | Minimum | Standard deviation | Skewness | Kurtosis |
---|---|---|---|---|---|---|
Brent | 0.001359 | 0.045343 | -0.044136 | 0.015835 | -0.282877 | 3.24081 |
WTI | 0.001153 | 0.073341 | -0.052511 | 0.016796 | -0.052855 | 4.89266 |
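The data preparation behind these statistics can be sketched as follows: prices are turned into logarithmic returns, lagged returns form the inputs, and the last observations are held out for testing. The helper names and the lag order `m = 2` are assumptions for illustration; the five prices are Brent values taken from the out-of-sample table later in the paper.

```python
import math
from typing import List, Sequence, Tuple


def log_returns(prices: Sequence[float]) -> List[float]:
    """r_t = ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def lagged_samples(
    returns: Sequence[float], m: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each return with its m preceding returns (oldest first)."""
    inputs = [list(returns[t - m:t]) for t in range(m, len(returns))]
    outputs = [returns[t] for t in range(m, len(returns))]
    return inputs, outputs


def split_last(items: Sequence, n_test: int) -> Tuple[list, list]:
    """Hold out the final n_test items, keeping the temporal order."""
    return list(items[:-n_test]), list(items[-n_test:])


brent_prices = [77.81, 77.51, 76.68, 75.67, 75.55]
returns = log_returns(brent_prices)
inputs, outputs = lagged_samples(returns, m=2)
train_y, test_y = split_last(outputs, n_test=1)
print([round(r, 4) for r in returns])
```

Splitting by position rather than at random keeps the test period strictly after the modeling period, which matches the 200/20 split described above.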

To apply the method of fuzzy information distribution, the first step is to determine the number of input indexes and the number of information controlling points of each index. It is known that the crude oil market assimilates and reacts to new information with a time delay, and the length of this delay depends on the maturity of the crude oil market. In the model, therefore, the lagged returns are set as the input indexes (explanatory variables) and the return of the following day as the output index (dependent variable). Matlab codes are written to carry out the fuzzy inference, and the genetic optimization is done by the GA(x) function in the Matlab genetic algorithm toolbox, allowing for integer optimization. In the crude oil spot market, the historical short-term price information of about one week is of particular importance in affecting the current oil price; and it also indicates that a larger

The intervals of the information controlling points are set to 1-1.1 times the range of the sample returns, and they can be adjusted dynamically as future oil price data accumulate. To simplify the calculation, information controlling points with equal steps are used. However, if enough information is available, a more flexible approach can be taken to arrange the information controlling points in line with experts’ knowledge, such as placing more information controlling points over the range of returns that is of interest. Another rule is that the number of controlling points should be appropriate. We aim to determine the numbers of controlling points objectively by a genetic optimization algorithm. The initial ranges of

In order to achieve the best prediction effect, several weights

The estimated results of the genetic-fuzzy information distribution approach.

m | h | n | | | | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Brent crude oil prices | |||||||||||||||

| |||||||||||||||

5 | 13 | 25 | 1 | 0.9231 | 0.9231 | 0.65 | 0.7 | 0.55 | 0.55 | 0.0145 | 0.0151 | 0.0132 | 0.0131 | 0.6125 | 0.014 |

5 | 13 | 26 | 0.95 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 13 | 26 | 0.9 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 13 | 26 | 0.85 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 14 | 11 | 0.8 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

5 | 14 | 21 | 0.75 | 0.9128 | 0.9385 | 0.65 | 0.45 | 0.7 | 0.65 | 0.0139 | 0.0155 | 0.0132 | 0.0134 | 0.6125 | 0.014 |

5 | 13 | 16 | 0.7 | 0.9538 | 0.9538 | 0.6 | 0.65 | 0.55 | 0.6 | 0.0123 | 0.0145 | 0.0132 | 0.0131 | 0.6 | 0.0133 |

5 | 13 | 26 | 0.65 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 14 | 21 | 0.6 | 0.9128 | 0.9385 | 0.65 | 0.45 | 0.7 | 0.65 | 0.0139 | 0.0155 | 0.0132 | 0.0134 | 0.6125 | 0.014 |

5 | 13 | 26 | 0.55 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 14 | 11 | 0.5 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

5 | 13 | 26 | 0.45 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 14 | 11 | 0.4 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

5 | 14 | 11 | 0.35 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

5 | 14 | 11 | 0.3 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

5 | 13 | 26 | 0.25 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 13 | 26 | 0.2 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 13 | 26 | 0.15 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 13 | 26 | 0.1 | 0.9538 | 0.9487 | 0.55 | 0.75 | 0.55 | 0.6 | 0.0133 | 0.0146 | 0.0132 | 0.0133 | 0.6125 | 0.0136 |

5 | 14 | 11 | 0.05 | 0.8615 | 0.8821 | 0.55 | 0.6 | 0.7 | 0.6 | 0.0133 | 0.0137 | 0.0132 | 0.0137 | 0.6125 | 0.0135 |

6 | 13 | 11 | 0 | 0.8608 | 0.8814 | 0.55 | 0.55 | 0.55 | 0.45 | 0.0137 | 0.0137 | 0.0134 | 0.0134 | 0.525 | 0.0135 |

| |||||||||||||||

WTI crude oil prices | |||||||||||||||

| |||||||||||||||

5 | 10 | 27 | 1 | 0.9231 | 0.8462 | 0.5 | 0.45 | 0.75 | 0.55 | 0.0159 | 0.024 | 0.0147 | 0.0155 | 0.5625 | 0.0175 |

4 | 14 | 22 | 0.95 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 10 | 29 | 0.9 | 0.8622 | 0.7551 | 0.55 | 0.55 | 0.5 | 0.65 | 0.0194 | 0.0222 | 0.0152 | 0.0161 | 0.5625 | 0.0182 |

4 | 10 | 29 | 0.85 | 0.8622 | 0.7551 | 0.55 | 0.55 | 0.5 | 0.65 | 0.0194 | 0.0222 | 0.0152 | 0.0161 | 0.5625 | 0.0182 |

5 | 10 | 27 | 0.8 | 0.9231 | 0.8462 | 0.5 | 0.45 | 0.75 | 0.55 | 0.0159 | 0.024 | 0.0147 | 0.0155 | 0.5625 | 0.0175 |

4 | 14 | 22 | 0.75 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 14 | 22 | 0.7 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 14 | 25 | 0.65 | 0.9082 | 0.8214 | 0.45 | 0.5 | 0.7 | 0.55 | 0.0178 | 0.0235 | 0.0152 | 0.0168 | 0.55 | 0.0184 |

4 | 14 | 22 | 0.6 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 14 | 22 | 0.55 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 14 | 22 | 0.5 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 10 | 29 | 0.45 | 0.8622 | 0.7551 | 0.55 | 0.55 | 0.5 | 0.65 | 0.0194 | 0.0222 | 0.0152 | 0.0161 | 0.5625 | 0.0182 |

4 | 14 | 22 | 0.4 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

5 | 10 | 27 | 0.35 | 0.9231 | 0.8462 | 0.5 | 0.45 | 0.75 | 0.55 | 0.0159 | 0.024 | 0.0147 | 0.0155 | 0.5625 | 0.0175 |

4 | 14 | 22 | 0.3 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

4 | 14 | 22 | 0.25 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

6 | 10 | 13 | 0.2 | 0.7784 | 0.7577 | 0.6 | 0.5 | 0.5 | 0.6 | 0.018 | 0.0249 | 0.0147 | 0.016 | 0.55 | 0.0184 |

5 | 10 | 15 | 0.15 | 0.8718 | 0.8462 | 0.5 | 0.45 | 0.75 | 0.5 | 0.0162 | 0.0264 | 0.0147 | 0.0163 | 0.55 | 0.0184 |

5 | 10 | 27 | 0.1 | 0.9231 | 0.8462 | 0.5 | 0.45 | 0.75 | 0.55 | 0.0159 | 0.024 | 0.0147 | 0.0155 | 0.5625 | 0.0175 |

4 | 14 | 22 | 0.05 | 0.9184 | 0.8878 | 0.45 | 0.55 | 0.7 | 0.65 | 0.0201 | 0.0232 | 0.0152 | 0.0168 | 0.5875 | 0.0188 |

6 | 13 | 10 | 0 | 0.8969 | 0.9639 | 0.45 | 0.45 | 0.55 | 0.45 | 0.0143 | 0.0143 | 0.0148 | 0.0151 | 0.475 | 0.0147 |

The fitness values during optimization for forecasting WTI oil price (

The parameter optimization problem under directional accuracy and RMSE is similar to multi-objective programming. Because of the complexity of the fitness function of the fuzzy information model, however, the linear weighted average of the two objectives does not necessarily produce Pareto solutions. Since the fitness function of the fuzzy information model is the synthesis of four fuzzy inference modes, and each solution includes four sub-solutions, there are 84 sub-solutions in Table

The Pareto solutions and fuzzy inference methods.

Series | Inference Method | m | h | n | RMSE | D |
---|---|---|---|---|---|---|
| | | | | | |
Brent | Rs-avr | 5 | 14 | 11 | 0.0132 | 0.7 |
Brent | Rs-max | 5 | 13 | 16 | 0.0123 | 0.6 |
| | | | | | |

The bold estimators are optimal.
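Selecting Pareto solutions from a pool of sub-solutions amounts to discarding every candidate that is dominated, that is, one for which another candidate has no larger RMSE and no smaller directional accuracy and is strictly better on at least one of the two. The sketch below shows this filter; the function name `pareto_front` is an assumption, the two Brent rows come from the table above, and the third candidate is hypothetical.

```python
from typing import List, Tuple

Solution = Tuple[str, float, float]  # (label, RMSE, directional accuracy D)


def dominates(a: Solution, b: Solution) -> bool:
    """a dominates b: RMSE no larger, D no smaller, strictly better once."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions)
    ]


candidates = [
    ("Rs-avr", 0.0132, 0.70),        # Brent row from the table
    ("Rs-max", 0.0123, 0.60),        # Brent row from the table
    ("hypothetical", 0.0140, 0.60),  # made-up dominated candidate
]
print([label for label, _, _ in pareto_front(candidates)])
```

The two tabulated Brent solutions survive because each is better than the other on one objective, which is why a further preference rule is needed to pick a single final solution.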

Among the Pareto optimal solutions, those with a strong preference for directional accuracy are then selected. For the Brent crude oil price series,

The three-dimensional stereograms of the information distribution matrix at the final solutions. Panels: Brent crude oil prices; WTI crude oil prices.

As shown in Table

The oil spot prices (solid line) and forecasting prices (dotted line). Panels: Brent crude oil prices; WTI crude oil prices.

Clearly, the in-sample predictions are better than the out-of-sample ones. The out-of-sample forecasted values and the direction consistency comparison are presented in Table

The out-of-sample forecasted and actual values.

Forecasted sample | Brent price | Brent price forecasted | Brent return | Brent return forecasted | Brent direction consistency | WTI price | WTI price forecasted | WTI return | WTI return forecasted | WTI direction consistency |
---|---|---|---|---|---|---|---|---|---|---|
201 | 77.8100 | 77.1249 | 0.0112 | 0.0024 | Y | 69.84 | 70.7789 | -0.0059 | 0.0075 | N |
202 | 77.5100 | 78.5606 | -0.0039 | 0.0096 | N | 69.82 | 68.4365 | -0.0003 | -0.0203 | Y |
203 | 76.6800 | 75.4830 | -0.0108 | -0.0265 | Y | 68.69 | 69.7991 | -0.0163 | -0.0003 | Y |
204 | 75.6700 | 76.8643 | -0.0133 | 0.0024 | N | 67.81 | 69.0274 | -0.0129 | 0.0049 | N |
205 | 75.5500 | 75.8518 | -0.0016 | 0.0024 | N | 67.73 | 67.8982 | -0.0012 | 0.0013 | N |
206 | 76.7700 | 76.8376 | 0.0160 | 0.0169 | Y | 67.55 | 67.6894 | -0.0027 | -0.0006 | Y |
207 | 78.2200 | 76.9545 | 0.0187 | 0.0024 | Y | 69.29 | 67.9905 | 0.0254 | 0.0065 | Y |
208 | 80.0200 | 78.4080 | 0.0228 | 0.0024 | Y | 70.37 | 69.1516 | 0.0155 | -0.002 | N |
209 | 77.6600 | 80.2123 | -0.0299 | 0.0024 | N | 68.60 | 69.9840 | -0.0255 | -0.0055 | Y |
210 | 77.8700 | 77.8466 | 0.0027 | 0.0024 | Y | 68.98 | 68.6549 | 0.0055 | 0.0008 | Y |
211 | 78.2200 | 78.0571 | 0.0045 | 0.0024 | Y | 68.86 | 68.8491 | -0.0017 | -0.0019 | Y |
212 | 79.2500 | 78.4080 | 0.0131 | 0.0024 | Y | 69.87 | 68.9496 | 0.0146 | 0.0013 | Y |
213 | 79.4300 | 79.4404 | 0.0023 | 0.0024 | Y | 71.08 | 70.0309 | 0.0172 | 0.0023 | Y |
214 | 79.0300 | 79.3347 | -0.0050 | -0.0012 | Y | 70.77 | 70.8529 | -0.0044 | -0.0032 | Y |
215 | 78.9000 | 76.9632 | -0.0016 | -0.0265 | Y | 70.80 | 70.7983 | 0.0004 | 0.0004 | Y |
216 | 80.8900 | 79.0896 | 0.0249 | 0.0024 | Y | 73.23 | 70.6656 | 0.0337 | -0.0019 | N |
217 | 82.2100 | 81.0844 | 0.0162 | 0.0024 | Y | 73.40 | 73.3033 | 0.0023 | 0.001 | Y |
218 | 81.8700 | 83.3107 | -0.0041 | 0.0133 | N | 72.22 | 73.3780 | -0.0162 | -0.0003 | Y |
219 | 81.5400 | 81.7718 | -0.0040 | -0.0012 | Y | 72.18 | 71.7377 | -0.0006 | -0.0067 | Y |
220 | 82.7200 | 82.9297 | 0.0144 | 0.0169 | Y | 73.16 | 72.4548 | 0.0135 | 0.0038 | Y |

Y represents the consistent forecasting results and N is for inconsistent cases.
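The direction consistency column can be recomputed from the return columns. The sketch below counts a forecast as consistent when the actual and forecasted returns have the same sign and also reports the RMSE of the forecasted returns. The function names are assumptions, as is the convention that a return of exactly zero counts as non-positive; the numbers are the Brent return columns of the table above.

```python
import math
from typing import Sequence


def directional_accuracy(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Share of samples whose actual and forecasted returns agree in sign."""
    hits = sum((a > 0) == (f > 0) for a, f in zip(actual, forecast))
    return hits / len(actual)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error of the forecasted returns."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Brent returns and forecasted returns for samples 201-220.
brent_actual = [
    0.0112, -0.0039, -0.0108, -0.0133, -0.0016, 0.0160, 0.0187, 0.0228,
    -0.0299, 0.0027, 0.0045, 0.0131, 0.0023, -0.0050, -0.0016, 0.0249,
    0.0162, -0.0041, -0.0040, 0.0144,
]
brent_forecast = [
    0.0024, 0.0096, -0.0265, 0.0024, 0.0024, 0.0169, 0.0024, 0.0024,
    0.0024, 0.0024, 0.0024, 0.0024, 0.0024, -0.0012, -0.0265, 0.0024,
    0.0024, 0.0133, -0.0012, 0.0169,
]
print(directional_accuracy(brent_actual, brent_forecast))  # 0.75
print(rmse(brent_actual, brent_forecast))
```

The value 0.75 corresponds to the 15 rows marked Y out of 20 in the Brent columns.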

By fuzzy inference, a fuzzy set

The fuzzy possibility distributions of the out-of-sample WTI oil returns.

To test the effectiveness of the model in directional forecasting, the classical logistic regression model and a random walk with drift are taken as two benchmark models against which the directional prediction accuracy is compared. The results are promising.

First, we discuss the logistic regression model. The number of lagged input indexes is the only parameter of the logistic regression model. The identified results of the logistic regression model are shown in Table

The forecasting result of logistic regression.

Series | m | D | TN | TP | AUC | Optimal threshold |
---|---|---|---|---|---|---|
Brent | 4 | 0.7000 | 0.8889 | 0.5455 | 0.6364 | 0.5584 |
| | | | | | |
Brent | 6 | 0.6500 | 0.7778 | 0.5455 | 0.6061 | 0.5574 |
WTI | 4 | 0.6000 | 1.0000 | 0.1111 | 0.4444 | 0.6748 |
WTI | 5 | 0.7000 | 1.0000 | 0.3333 | 0.5253 | 0.6436 |
| | | | | | |

The bold estimators are optimal.

According to the principle of maximum

Second, we assume that the stock price

The forecasting result of the geometric random walk with a dynamic drift.

Series | m | RMSE | D |
---|---|---|---|
Brent | 4 | 0.016 | 0.45 |
Brent | 5 | 0.0149 | 0.55 |
| | | |
WTI | 4 | 0.0167 | 0.5 |
| | | |
WTI | 6 | 0.0153 | 0.45 |

The bold estimators are optimal.
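The benchmark's exact specification is not reproduced in the text above, so the following is only one plausible reading of a geometric random walk with a dynamic drift: the drift is re-estimated at each step as the mean of the last `m` log returns, and the next price is the current price multiplied by the exponential of that drift. The function name, the rolling-mean estimator, and the toy prices are all assumptions.

```python
import math
from typing import Sequence


def random_walk_drift_forecast(prices: Sequence[float], m: int) -> float:
    """One-step-ahead price forecast with drift from the last m log returns."""
    if m < 1 or len(prices) < m + 1:
        raise ValueError("need at least m + 1 prices")
    recent = prices[-(m + 1):]
    returns = [math.log(b / a) for a, b in zip(recent, recent[1:])]
    drift = sum(returns) / m  # assumed estimator: rolling mean return
    return prices[-1] * math.exp(drift)


# Toy prices growing by 10 percent per step, so the forecast repeats that growth.
forecast = random_walk_drift_forecast([100.0, 110.0, 121.0], m=2)
print(round(forecast, 4))
```

Under this reading, the forecasted direction is simply the sign of the recent average return, which gives a natural baseline for the directional accuracy D reported in the table.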

In this paper, a new approach is proposed to enhance the directional forecast accuracy of the short-term oil price. Unlike previous studies, it views the crude oil price series as an incomplete data set with inaccurate fuzzy information and models it by fuzzy information distribution theory. A new fused approach of genetic-fuzzy information distribution is adopted to forecast the direction of oil price changes. The genetic algorithm is used to optimize the assignment of fuzzy information controlling points, and a coding algorithm for multidimensional information controlling points is applied to make the genetic-fuzzy information distribution approach feasible in time series applications. With the crude oil spot prices of WTI and Brent as sample data, the empirical analysis demonstrates that the novel approach offers high accuracy in the directional forecasting of short-term oil prices, compared with logistic regression and the random walk.

The genetic-fuzzy information distribution can distribute the information of a high-level oil price onto two neighbouring information points; it is thus a new method to construct the oil price distribution structure and then form a fuzzy knowledge-based inference to predict oil prices. It particularly helps to describe the true oil price behaviour hidden in oil price bubbles, when oil prices show an unsustainably fast rise. Although an effective time series prediction is demonstrated in this paper, it would be worthwhile in future work to select economic or political factors as input variables to enhance the prediction ability and to perform medium- and long-term oil price forecasting.

The data used to support the findings of this study can be found on the website of US Department of Energy: Energy Information Administration, which is free for academic use.

The authors declare that they have no conflicts of interest.

The research described in this paper was substantially supported by the National Natural Science Foundation of China under Grant no. 71871215; the Humanity and Social Science Youth Foundation of Ministry of Education of China under Grant no. 17YJC630012; and the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant no. KYCX17_1501.