Grain Consumption Forecasting: One Modified MLR Model Combined with Time Series Forecasting Theory

,


Introduction
With the change of social, economic, and environmental factors, the grain consumption structure appears to have new features. ere have been various prediction methods by research methods, focuses, and perspectives, which can be divided into qualitative analysis and quantitative analysis. Qualitative prediction is the analysis of the specialists based on the characteristics of historical data and intuitive materials, and the corresponding results rely on the experience and analysis ability of researchers. e quantitative prediction method is to build one mathematical model to predict the changing laws and the development trend in future, which has obtained wider acceptance and application. In general, the quantitative analysis methods include the time series prediction model, econometric equation (single equation and simultaneous equation), regression analysis prediction model, etc [1]. Cheng et al. considered the actual level of China's current grain consumption and the future trend of China's population development and predicted that the total grain consumption demand in China will not exceed 640 million tons in 2030 [2]. Gao used the time series method to calculate per capita grain consumption by selecting a national sample of residents and predicted total food consumption. He said that in 2020, China's total grain consumption will reach to 595 million tons [3]. Chen et al. used the single equation econometric model to complete the prediction of per capita rural consumption and population changes in China; here the two variables were multiplied to obtain the total grain consumption, which will reach 76 million tons in 2020 and 65 million tons in 2030 [4]. Liao et al. proposed the modified CAPSIM-PODIUM model to forecast and analyze the grain demand of the whole country and nine major watershed slices, including rice, wheat, and corn; the results have shown that China's total grain demand will reach 507 million tons in 2020 [5]. Pan et al. first established a regression analysis; here only two factors of food price and per capita disposable income are regarded as significant impacts on food consumption. us, the direct and indirect grain consumption models of rural residents per capita were obtained by the linear fitting model, and the predicted value of total grain consumption was obtained by combining the total population. e results have shown that direct and indirect grain consumption in China's rural areas will reach 96.8658 million tons and 87.3173 million tons by 2020 and 58.1396 tons and 91.1647 million tons by 2030 [6]. From the existing research, the impact factors and historical data features are more important for the grain consumption prediction model. However, so far, it is not enough for the more comprehensive analysis. In this paper, one new combined model is proposed, including the selection of impact factors and the consideration of optimal preprocessing. e main contributions of this paper are listed as follows: (1) To smooth original data and optimize the selection of impact factors, firstly, the abnormal points in the original data are removed by moving average filtering with the optimal window width, which makes the data trend to be smoothed and stabilized, and the relational degree of grain consumption and its impact factors are calculated by the proposed combined method, and thus the key impact factors are selected. (2) To minimize the prediction error of grain consumption, we combine the advantages of two models, and one new modified MLR model combined with time series forecasting theory based on data barycenter is proposed.

Research of Consumption Forecasting Method
China's grain consumption can be divided into two categories: food consumption and nonfood consumption. Food consumption includes ration and feed grain consumption, while nonfood consumption includes industrial grain, seed grain, and loss grain. As shown in Figure 1, food consumption is the most important mode of grain consumption in China, so the research focus of this paper is the changing trend of ration and feed grain.

Moving Average Filtering with Optimal Window Span.
Moving average filtering has been applied widely in numerical analysis, also known as linear filtering, which belongs to one-dimensional filtering. e algorithm is realized by recursion, that is, replacing the original value of the corresponding position with the average value of several neighborhood points, and moving average value is regarded as the trend representative value of the moving average term in this period. In this paper, all the historical data will be preprocessed by average moving filter for eliminating abnormal point and random fluctuations. Generally, when the filtering window width N is a positive odd number, the general formula of moving average filtering is defined as where x(n) is the input sequence, f(n) is the output sequence, and N is the window width. Under different window widths, the smoothness and stabilization of the filtering original data will have a large difference. Here, coefficient of variation is adopted to evaluate the discreteness of observed values, which can be expressed by the ratio of standard deviation to average value of each observed value, and the symbol is CV [7]: where σis the standard deviation, σ �

����������������
(1/n) n i�1 (x i − μ) 2 , and μ is the average value, μ � (1/n) n i�1 x i . e smaller the coefficient of variation is, the better stability the group of data has. In fact, for obtaining higher prediction accuracy, the largest window width with the smallest coefficient of variation may not bring expected results; therefore, the optimal filtering window width will be chosen by the coefficient of variation and the prediction error.

Grey Relational Degree.
e method of grey correlation analysis, also called "grey relational degree," is used widely to calculate the correlation between variables [8], which can analyze the degree of correlation between two systems or two factors within the system under the condition of the change of time and speed and judge whether two data series are highly related by comparing their geometric similarity. In this paper, assuming the grain consumption data are a reference sequence and their impact factors are comparative sequence, the grey relational degree between the reference sequence and comparative sequence is calculated as follows [9]:  (1) e reference sequence is x 0 (k) � x 0 (1), x 0 (2), . . . , x 0 (n)}; comparative sequence is (3) (2) Normalization is adopted to carry on a dimensionless preprocessing, and the normalized reference and comparative series are expressed as z 0 (k) and z i (k), respectively.
(3) Calculating the absolute difference sequence Δ 0i (k) between reference sequence z 0 (k) and comparative sequence z i (k): (4) Searching the maximum absolute difference sequence Δ max and minimum absolute difference sequence Δ min : the i-th grey relational coefficient is defined as where ρ is the resolution coefficient, which is generally between 0.5 and 1. e smaller the parameter p is, the stronger the distinguishing ability for the difference among the relational coefficients is, usually taking 0.5. (5) Averaging the relational coefficients: the relational degree can be obtained as

Pearson Correlation Coefficient.
Pearson correlation coefficient is a classical statistic proposed by Karl Pearson. In statistics, it is widely used to measure the degree of correlation between two variables, which can reflect the linear correlation between two variables. It is defined as the quotient of covariance and standard deviation between two variables [10]. Assuming two series X � x 1 , x 2 , . . . , x n and Y � y 1 , y 2 , . . . , y n , the Pearson correlation coefficient between them is where r is between − 1 and 1. e larger the absolute value of the correlation coefficient is, the higher the correlation between X and Y is, and the positive correlation coefficient means positive correlation, and the negative correlation coefficient represents negative correlation.

Combined Model of Selecting Main Impact Factors.
Grain consumption includes ration consumption, feed-grain consumption, and industrial/seed/loss grain consumption. e changing social, economic, and environmental factors will produce a different effect on each component of the grain consumption; however, some uncorrelated impact factors and those with lower correlation will reduce the prediction accuracy and increase the complexity. erefore, it is necessary to select main impact factors for the following prediction of the grain consumption.
From Sections 2.2.1 and 2.2.2, grey correlation analysis can embody the geometric similarity between any two factors, and Pearson correlation coefficient can reflect the linear correlation between two variables. How to combine the above two correlation degrees and select the real key impact factors will be one of innovations in this paper. In the existing references, only one correlation analysis usually is used to rank the correlation degree between one main behavioral factor and the reference series. ough for the same impact factors, there can produce the different ranking order of correlation degree by the above two correlation analysis methods [11]. One new combined model of selecting main impact factors is proposed in Figure 2. Here, grain consumption is the main behavioral factor, and its more possible impact factors are the reference series. From Figure 2, the correlation degree between them will firstly be computed separately by grey correlation analysis and Pearson correlation coefficient, and then one weighted operator will be adopted to obtain the final relational degree. us, the key impact factors can be chosen by the ranking order and one threshold of the final relational degree. Here, the threshold can be determined by the feedback of the following prediction error.

Prediction Model of Impact Factors.
e ARIMA model is one time series prediction method proposed by Box and Jenkins in the early 1970s, which is also called the Box-Jenkins model [12,13]. In the ARIMA (p, d, q) model, p is called autoregressive order and q is called moving average order. For obtaining better prediction, the original data should be stationary. e unit root test is often used to check the stationarity of time series. If the time series is nonstationary, it can be transformed into stationary time series by several times of difference [14].
Suppose that ω t is the predictive value in t year and ω t− 1 , ω t− 2 , . . . , ω t− p are the original values of various impact factors in past p years and set ω t � (1 − L) d y t ; here y t is a single integer sequence with d order and ω t is the stationary series; the general form of the ARMA model can be written as Suppose L is the lag operator, then International Journal of Mathematics and Mathematical Sciences and equation (9) can be rewritten as where e d order difference transformation of ARMA(p, q) as equation (10) is called ARIMA(p, d, q): where ε t is a white noise process with its mean value being 0 and variance being σ 2 .

MLR Prediction Model Based on Data
Barycenter. In the classical multivariate linear regression (MLR) model, the least-square method is usually used to compute the prediction parameters. Compared with the least-square method, the barycenter operators [15] can provide higher stable parameters for a multivariate regression model [16]. Assuming the dataset is x i , i � 1, 2, . . . , n, its first-order barycenter can be written as (1) x i a first-order barycenter operator. Similarly, its second-order barycenter can be written as (2) x i , a second-order barycenter operator.
And so forth, the k-order barycenter operator is n i�1 In the classical multivariate linear regression model [17,18], the prediction value can be expressed as here, β 0 , β 1 , . . . , β p− 1 are the prediction parameters; x is the independent variables, and the number are p − 1 of them, n sets of observations of variables are expressed In this paper, the k-order barycenter operators are adopted to compute the parameters of multivariate prediction model. e order of k-order barycenter operator is equal to the number of the prediction model parameters. e new proposed model based on data barycenter is called DBbased MLR.
In the DB-based MLR, equation (17) can be transformed as Let k � 1, 2, . . ., p in equation (18); we can get one linear equation set. e corresponding parameters can be solved by Cramer's rule: . . . , (1) 1 Replacing the j-th column of D with n i�1 (1) y i , n i�1 (2) y i , n i�1 (p) y i (that is, the left term in equation (18)), we can get the coefficients D j , j � 1, 2, . . . , p. Furthermore, taking equation (19) into equation (17), the optimal predictive model parameters can be obtained by several iterations.

Time Series Prediction Model.
e time series prediction model is one of the quantitative prediction methods, which arranges the historical data of the predicted object into a time series in time order, analyzes its changing trend with time, and establishes a mathematical model to extrapolate the future value. [19]. e general structure of time series prediction model is expressed as follows: where ϕ 1 , ϕ 2 , . . . , ϕ n are the prediction parameters and y t , y t− 1 , . . . , y t− n are the historical values of grain consumption. By the modeling idea and parameter calculating algorithms, the time series models can be divided into the moving average (MA) model, trend extrapolation model, exponential regression (ES) model, autoregressive (AR) model, ARMA model, ARIMA model, etc [20][21][22]. It is same for the above time series to predict the future value by fitting its historical data, only using the different computing methods of model parameters.

Modified MLR Model Combined with Time Series
Forecasting eory. Summarizing the analysis of the time series model and classical MLR model, we can find that the two models cannot make full use of the historical data and the impact factors at the same time. In this paper, the historical data of prediction variable will be considered as one key impact factor, which will be imputed to the MLR model with the other chosen impact factors. e proposed modified MLR model can be expressed as y t � β 0 + β 1 x 1 (t) + β 2 x 2 (t) + · · · β n x n (t) + ϕ 1 y t− 1 + ϕ 2 y t− 2 + ϕ 3 y t− 3 + · · · + ϕ n y t− n , (22) where y t , y t− 1 , . . . , y t− n are the historical data of the prediction value y t ; they are the "internal factors" of the MLR model; besides, x 1 (t), x 2 (t), . . . , x n (t) are the chosen key impact factors, which are "external factors." erefore, the modified MLR model can take advantages from adding the internal factors. In this model, the model parameters β 0 , β 1 , . . . , β n and ϕ 1 , ϕ 2 , . . . , ϕ n can be computed at one time; therefore, the proposed model is different with the simple combining the two models, respectively. e prediction performance will be discussed in the following sections.

e Evaluation Index of Prediction Error.
In this paper, the absolute percentage error (APE), the mean absolute percentage error (MAPE), and the eil inequality coefficient of prediction residuals are used as the criteria for the model prediction. MAPE is the average ratio of all absolute percentage errors (APEs) to real values, and it is a percentage. In general, absolute error can avoid the problem of errors cancelling out each other; therefore, MAPE can accurately reflect the actual prediction error and can better reflect the reliability of prediction model. Besides, eil inequality coefficient is a correlation index to measure the prediction accuracy. e closer the eil inequality coefficient is to 0, the smaller the difference between the predicted value and the real value will be, which indicates the better fitting degree of the prediction model. e mean absolute percentage error and the eil inequality coefficient are defined as

Moving Average Filtering with Optimal Window Width.
In experiment, the data are from the website of the National Statistical Office and "China Statistical Yearbook 2018" [23]. All impact factors and different kinds of grain consumptions will be preprocessed by the moving average filtering under different window widths, as shown in Figures 3 and 4. Here, urban ration (1981∼2017) is one example of the original data in the experiment. From Figures 3 and 4, for different window width N, the larger the window width is, the smaller the coefficient of variation is and the smoother the filtered sequence is. Considering the following prediction accuracy, for given training data, the optimal window width can be determined by two indexes: coefficient of variation test and prediction error.

Selection of Impact Factors.
As shown in Sections 2.2.1 and 2.2.2, the relational degree and order between each kind of grain consumption and the corresponding several impact factors are analyzed. e selection of key impact factors is shown as Table 1. Here, the grain consumption includes urban ration, rural ration, urban feed grain, and rural feed grain. e impact factors include population, urbanization International Journal of Mathematics and Mathematical Sciences 5 level, Engel's coefficient, per capita income, price index of agricultural products, etc.

Analysis of Model Prediction.
In the modified prediction model proposed above, by considering the complexity of the algorithm and the comparison of experimental results, the final model determined by the project is as follows: where model parameters β 0 , β 1 , β 2 , and ϕ 1 can be calculated by the data barycenter method. e original data include grain consumption and several impact factors. In simulation, the first 33 years (1981-2013) of grain consumption data are trained in the model, and then the next five years (2014-2017) of grain consumption data are predicted. Defining the year of 1981 as t � 1 and 2013 as t � 33, the urban ration consumption, urban feed-grain consumption, rural ration consumption, and rural feedgrain consumption are, respectively, expressed as y 1 , y 2 , y 3 , and y 4 ; the five factors are, respectively, defined as x 1 , x 2 , x 3 , x 4 , and x 5 . e forecasting process is shown as follows: (1) When t � 1∼37, the relational degree between y 1 and x 1 , x 2 , x 3 , x 4 , and x 5 is calculated; the two key impact factors will be selected by the proposed combined method, expressed as x 11 (t) and x 22 (t).
(3) x 11 (t + 1) and x 22 (t + 1) are predicted by the ARIMA model. (4) Assuming t � t + 1 and y 1 (t + 1) � β 0 + β 1 x 11 (t + 1)+ β 2 x 22 (t + 1) + ϕ 1 y 0 (t), the urban ration consumption next year is obtained by the proposed joint model. Similar to the forecasting process of rural ration consumption, the urban feed-grain consumption y 2 , rural ration consumption y 3 , and rural feed-grain consumption y 4 can be obtained, respectively. For obtaining the total grain consumption, we can sum the above prediction results.

Fitting Results.
By the proposed method as equation (24), for every kind of grain consumption, two key impact factors and the historical grain consumption are chosen as the input of the time series-MLR joint model. However, for the classical MLR model, the input is only the two chosen key impact factors. e fitting results of the two models are shown in Figure 5(a) for feed grain prediction and Figure 5(b) for ration grain prediction.
From Table 2, it can be seen that the fitting goodness of ration and feed grain all can be improved. e fitting results of the proposed model can provide better prediction reliability.

Analysis of Prediction Results.
For the experiments of predicting the future trend of grain consumption, we choose the filter window width equal to 3, 5, or 7 and establish the prediction model. Different filtering window widths will affect the model prediction performance; therefore, if we feed the prediction error back to adjust the filter parameters, the optimal filtering window width can be chosen. Here, the chosen optimal window width is 5 and 3, respectively, for ration consumption prediction and feed grain consumption prediction. In this paper, the MLR model and modified MLR model are established based on these optimal preprocessing methods.     6 International Journal of Mathematics and Mathematical Sciences In addition, in our simulation, a classical time series ARIMA model is adopted to predict the ration and feed grain consumption in China. Besides, deep learning prediction is also considered for comparison, which is the emerging research hotspot in recent years [24][25][26][27][28][29][30]. Five kinds of prediction models are considered for comparing APE, MAPE, and eil's U performance.
(1) e classical time series ARIMA model: where y t− 1 , y t− 2 , . . . , y t− p is the actual value of the past p years; the order of p and q can be realized by model recognition and φ 1 , φ 2 , . . . , φ p and θ 1 , θ 2 , . . . , θ q can be calculated by the least-square method.
(2) e MLR model: e two impact factors with the greatest relational degree as the key impact factors in the modeling were input into the MLR prediction model: where x 1 (t), x 2 (t) are key impact factors; model parameters α 0 , α 1 , α 2 can be calculated by the data barycenter method. (3) e modified MLR model: Considering the complexity of the algorithm and the prediction performance, the proposed MLR model is where model parameters β 0 , β 1 , β 2 , and ϕ 1 can be calculated by the data barycenter method; y t is the future prediction value; y t− 1 is the historical data of  Output layer Figure 6: e prediction frame of the BP model. Figure 7: e prediction frame of the LSTM model. e prediction frame of the BP model is shown in Figure 6.
ere are three layers to predict the future value from the three inputted history data; here the hidden layer is used to analyze the characteristics of input data. In Figure 6, x t− 2 , x t− 1 , x t is the inputted history data of grain consumption, y t is the future prediction value, and W 1 and W 2 are the weighting values of the trained model. (5) e LSTM model: LSTM is one special recurrent neural network (RNN), which substitutes the original neural units of hidden layers for memory units [29,30]. Each memory unit has an input door, output door, forgotten door, and memory cells, as shown in Figure 7.
Here x t− 2 , x t− 1 , x t is the input history data of grain consumption, c t− 2 , c t− 2 , c t− 1 , c t is the output of different memory unit, y t− 3 , y t− 2 , y t− 1 is the prediction value of each layer, and so on; we can get the final future value y t . e comparison of the prediction results of these five models is shown in Tables 3 and 4. From Tables 3 and 4, for the ARIMA model, the closer the forecast year is, the higher the prediction accuracy of the model is. For the MLR model, the further away the predicted year is, the higher the prediction accuracy of the model is. is is because the time series prediction model is   e prediction performance of ration consumption and feed grain consumption is shown in Table 5.
From Table 5, it can be seen that the prediction error limit is kept within 2% for ration and feed grain consumption under different time intervals, which indicates that the proposed model can realize stable prediction with satisfying accuracy for the medium-short-term prediction.

Conclusions
In this paper, one new modified MLR model combined with time series forecasting theory is proposed. In the proposed model, the historical data of prediction variable will be considered as one key impact factor, which will be imputed to the MLR model with the other chosen impact factors. us, it can make full use of the advantages of the time series model and the MLR model, and the prediction performance has been improved strikingly, compared with the MLR model, ARIMA model, and even the BP model and LSTM model. Besides, there are other innovative works; the original data are preprocessed by moving average filtering under optimal window, and the data barycenter method is also adapted to compute the model parameters of MLR for improving the robustness. e proposed prediction model can be applied widely in short-medium-term prediction of grain prediction or the other fields.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.