A Combined Model of Global Cultivated Area Change and Prediction for Future: 1961–2020

As a basic condition for food security, global cultivated area has fluctuated in recent years, so understanding its future trend has important political and economic significance. Based on historical data for the past 50 years, a deterministic time series model, a random time series model, and a combined model were established using time series analysis. By comparison, the combined model has the highest fitting and prediction accuracy and is suitable for predicting the trend of global cultivated area. Driven by a combination of deterministic and random factors, the global cultivated area will rise slightly, with fluctuations, in the near future.


Introduction
Food is the first necessity of the people and the foundation for settling down. In recent years, food security has become part of the national strategy of many countries in international exchange, so global food security has always been a focus of all parties. Guaranteeing food security depends on two things: technological level and cultivated area. Cultivated land is the basis for food production [1], and grain output is closely related to cultivated area. As the carrier of food production, cultivated land is scarce and irreplaceable [2]. Therefore, an in-depth study of how global cultivated area evolves, together with a prediction of its future trend, helps assess the basic guarantee for global food security. It can also provide a reference for national cultivated land protection and food security policies [3].
Global cultivated area has shown an overall increasing trend since the 1960s, as illustrated in Figure 1. The average annual increase of cultivated area from 1961 to 2011 is about 7,150,897 hectares (calculated from data provided by the World Bank). However, the increase takes different forms at different stages, and three stages can be distinguished. The first stage is from 1961 to 1990, when the cultivated area increased steadily, with an average annual increase of 4,211,730 hectares. The second stage is from 1990 to 1993, when the cultivated area was adjusted greatly owing to changes in the statistical criteria of some countries. The third stage is from 1993 to the present, when the global cultivated area has fluctuated: from 1993 to 2011 it increased in 7 years and decreased in 9 years. On the whole, the change of global cultivated area shows a definite trend, especially in the first stage, when it rose significantly, but the change in the third stage is uncertain and the recent trend is not clear. Therefore, it is of great theoretical significance to construct a precise model of global cultivated area change and to predict its future trend from historical data.
There have been many studies on the change and prediction of cultivated area [4-11], but most of them address one country or a few regions, and studies of global cultivated area are lacking. The methods used were mainly qualitative [4,5,8], and only a few scholars [7] adopted quantitative methods. The fitting models were chosen mainly by subjective judgment and took a single form, without selecting the best model after comparing several candidates. The model selection therefore lacked a scientific basis, and all these factors led to large differences among the predictions. This paper uses the historical data on global cultivated area for the past 50 years to build a deterministic time series model, a random time series model, and a combined model from a global perspective. After simulating the time series of global cultivated area and evaluating the results, a combined model is finally selected to predict the global cultivated area from 2008 to 2020.

Research Methods
The research method adopted in this paper is mainly time series analysis. Time series analysis reveals how a time series evolves and predicts its development trend by studying its observed values. According to the factors that drive the evolution of the time series, time series analysis can be divided into deterministic time series analysis, time series statistical analysis, and time series combined analysis.

Deterministic Time Series Analysis Method.
Deterministic time series analysis, also known as the factor decomposition method, decomposes a time series into a long-term trend ($T_t$), seasonal variation ($S_t$), cyclical variation ($C_t$), and an irregular part ($I_t$). The time series can be expressed by the following formula:

$$x_t = f(T_t, S_t, C_t, I_t). \tag{1}$$

In deterministic time series analysis, $T_t$, $S_t$, and $C_t$ are deterministic functions of the time $t$. $I_t$ is the random disturbance without regularity, often known as white noise and denoted $\varepsilon_t$. What remains once $\varepsilon_t$ is removed can be expressed by a deterministic time function, so the time series can also be written as an additive model:

$$x_t = f(t) + \varepsilon_t, \tag{2}$$

in which $x_t$ represents the time series, $f(t)$ is the selected time function, and $\varepsilon_t$ is the white noise disturbance, which satisfies

$$E(\varepsilon_t) = 0, \quad \operatorname{Var}(\varepsilon_t) = \sigma^2, \quad \operatorname{Cov}(\varepsilon_t, \varepsilon_s) = 0 \ (t \neq s). \tag{3}$$

Deterministic time series analysis uses econometric parameter estimation: the parameters of the time function are estimated from the historical data of the series to construct the deterministic time function model.
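As an illustration of estimating the time function in the additive model, the following minimal sketch fits a polynomial trend by ordinary least squares. The function names, the rescaled time index, and the synthetic data are assumptions made for this example only; they are not the data or software used in this paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial_trend(times, values, degree):
    """Least-squares coefficients [b0, ..., b_degree] of f(t) = sum b_k t^k."""
    powers = [[t ** k for k in range(degree + 1)] for t in times]
    xtx = [[sum(p[i] * p[j] for p in powers) for j in range(degree + 1)]
           for i in range(degree + 1)]
    xty = [sum(p[i] * v for p, v in zip(powers, values))
           for i in range(degree + 1)]
    return solve_linear_system(xtx, xty)


def evaluate_trend(coeffs, t):
    """Evaluate the fitted trend f(t)."""
    return sum(b * t ** k for k, b in enumerate(coeffs))


# Synthetic example (hypothetical data): an exact cubic on a time index
# rescaled to [0, 1], which keeps the normal equations well conditioned.
times = [i / 10 for i in range(11)]
values = [1.0 + 2.0 * t - 3.0 * t ** 2 + 0.5 * t ** 3 for t in times]
coeffs = fit_polynomial_trend(times, values, degree=3)
print([round(c, 6) for c in coeffs])
```

Rescaling the time index is a numerical convenience in this sketch; it changes the coefficients but not the fitted values.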

Statistical Analysis Method.
Statistical analysis methods can be divided into time domain analysis and frequency domain analysis. The application of frequency domain analysis is restricted owing to its complexity and abstraction, so this study adopts time domain analysis. The time series is considered to be composed of a long-term trend, a seasonal effect, and the white noise of random disturbance. A single series is random and uncertain, but it should be a stationary process, meaning that its mean and variance do not change with time. Only then can the statistical analysis method be applied and the typical linear model ARMA($p$, $q$) be established as follows:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q},$$

where $x_t$ is the time series value, $p$ and $q$ are the lag orders, and $\varepsilon_t$ is the random term, which also satisfies (3). When $\theta_1 = \cdots = \theta_q = 0$ in the model, the AR($p$) model is obtained:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t.$$

When $\phi_1 = \cdots = \phi_p = 0$, the MA($q$) model is obtained:

$$x_t = \phi_0 + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}.$$

The identification of the ARMA($p$, $q$), AR($p$), and MA($q$) models and the selection of $p$ and $q$ can be carried out with the help of the autocorrelation coefficient (AC) and partial autocorrelation coefficient (PAC) of the time series. For the MA($q$) model, the AC function can be expressed as follows:

$$\rho_k = \begin{cases} 1, & k = 0, \\ \dfrac{-\theta_k + \sum_{i=1}^{q-k} \theta_i \theta_{k+i}}{1 + \theta_1^2 + \cdots + \theta_q^2}, & 1 \le k \le q, \\ 0, & k > q. \end{cases}$$

It can be seen that the AC function of the MA($q$) model cuts off after lag $q$, whereas the PAC function does not cut off: as the lag increases, it decays exponentially or as a damped sine wave and tends to 0. This property is called the tailing feature of the PAC function of the MA($q$) model.
For the AR($p$) model, the PAC function satisfies $\phi_{kk} \neq 0$ (a nonzero constant) for $1 \le k \le p$ and $\phi_{kk} = 0$ for $k > p$. This shows that the PAC function of the AR($p$) model cuts off after lag $p$, while the AC function does not cut off: as the lag increases, it decays exponentially or as a damped sine wave and tends to 0. This property is called the tailing feature of the AC function of the AR($p$) model.
For the ARMA($p$, $q$) model, both the autocorrelation function and the partial autocorrelation function have the tailing feature. In this case, $p$ and $q$ are selected step by step via trial and error.
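The identification rules above can be tried out on simulated data. The sketch below computes the sample AC function directly and the sample PAC function with the Durbin-Levinson recursion, then applies them to a simulated AR(2) series. The function names, the AR coefficients (0.5 and 0.3), the sample size, and the seed are all assumptions chosen for illustration.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # AR coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def simulate_ar2(phi1, phi2, n, seed=0, burn_in=200):
    """Simulate an AR(2) process with standard normal innovations."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n + burn_in):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[-n:]


series = simulate_ar2(0.5, 0.3, n=2000)
print("AC :", [round(r, 2) for r in sample_acf(series, 6)])
print("PAC:", [round(r, 2) for r in sample_pacf(series, 6)])
```

For an AR(2) series such as this one, the printed PAC values should be close to zero beyond lag 2 while the AC values decay gradually, which is the pattern used in this paper to choose between AR, MA, and ARMA forms.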
After the model form and order are determined, the parameters of the model are estimated from the historical data of the time series using econometric parameter estimation. The number of model parameters is chosen according to the significance of the parameters and the overall significance of the model. The optimum model is then selected according to the criterion functions (AIC, SC). Generally speaking, the model that minimizes the criterion function value is the optimum one.
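A minimal sketch of order selection by criterion functions follows. It assumes one common Gaussian form of the criteria, AIC = n ln(sigma^2) + 2k and SC = n ln(sigma^2) + k ln(n), and it estimates the innovation variance of each AR(p) candidate by the Yule-Walker (Levinson-Durbin) method rather than by the least squares estimation used in this paper. All names and the simulated data are hypothetical.

```python
import math
import random


def autocovariances(x, max_lag):
    """Sample autocovariances for lags 0..max_lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def yule_walker_variances(x, max_order):
    """Innovation variances of AR(p) fits for p = 0..max_order."""
    gamma = autocovariances(x, max_order)
    variances = [gamma[0]]
    phi = []
    for k in range(1, max_order + 1):
        num = gamma[k] - sum(phi[j] * gamma[k - 1 - j] for j in range(k - 1))
        phi_kk = num / variances[-1]
        phi = [phi[j] - phi_kk * phi[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        variances.append(variances[-1] * (1.0 - phi_kk ** 2))
    return variances


def information_criteria(n, sigma2, k):
    """Return (AIC, SC) for n observations, variance sigma2, k parameters."""
    aic = n * math.log(sigma2) + 2 * k
    sc = n * math.log(sigma2) + k * math.log(n)
    return aic, sc


def select_ar_order(x, max_order):
    """Orders minimizing AIC and SC among AR(0)..AR(max_order)."""
    n = len(x)
    variances = yule_walker_variances(x, max_order)
    scores = [information_criteria(n, v, p) for p, v in enumerate(variances)]
    best_aic = min(range(max_order + 1), key=lambda p: scores[p][0])
    best_sc = min(range(max_order + 1), key=lambda p: scores[p][1])
    return best_aic, best_sc, scores


# Hypothetical data: a simulated AR(2) series.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(3200):
    x.append(0.5 * x[-1] + 0.3 * x[-2] + rng.gauss(0.0, 1.0))
x = x[-3000:]

best_aic, best_sc, scores = select_ar_order(x, max_order=5)
print("order chosen by AIC:", best_aic, "order chosen by SC:", best_sc)
```

SC penalizes additional parameters more heavily than AIC once n exceeds about 7, so it tends to choose the more parsimonious model when the two criteria disagree.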
If the random time series is nonstationary, meaning that a unit root exists, the series is differenced until it becomes stationary, and the Box-Jenkins (B-J) method is then applied to the differenced series; that is, an ARIMA($p$, $d$, $q$) model is established for the original series.
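The differencing step can be sketched in a few lines. The helper names below are hypothetical, and the toy series is chosen only so that the effect is easy to verify by hand: a quadratic trend becomes constant after two rounds of differencing.

```python
def difference(series, order=1):
    """Apply first differencing `order` times."""
    result = list(series)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


def undifference(first_value, diffs):
    """Invert one round of differencing, given the first original value."""
    levels = [first_value]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


levels = [1, 4, 9, 16, 25]
print(difference(levels, 1))          # [3, 5, 7, 9]
print(difference(levels, 2))          # [2, 2, 2]
print(undifference(1, [3, 5, 7, 9]))  # [1, 4, 9, 16, 25]
```

The inverse operation matters for prediction: forecasts made for the differenced series have to be cumulated back to the level of the original series.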

Combined Analysis Method.
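Combined analysis extracts the long-term trend, seasonal variation, and periodic variation of a time series through a deterministic time function and the factor decomposition method. The remaining irregular, randomly changing part is a stationary series, and the information left in it can be extracted with the random time series analysis method. It is thus a method that combines deterministic time series analysis with stationary random time series analysis (ARMA). If the evolution of the time series is driven by both deterministic and random factors, the model structure is as follows:

$$x_t = f(T_t, S_t, C_t) + I_t, \qquad \Phi(B) I_t = \Theta(B) \varepsilon_t,$$

where $f(T_t, S_t, C_t) = f(t)$ is used to extract information about the long-term trend, seasonal effect, and periodic variation, and the ARMA($p$, $q$) equation for the irregular part $I_t$, with $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, and lag operator $B$, is used to extract the information remaining in the irregular part.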

Model Construction and Prediction Result
Data used in this paper come from the thematic databases of the World Bank (cultivated area per capita comes from the topic "Agricultural and Rural Development", and total world population comes from the topic "Health"), and the data span 1961 to 2010. Table 1 presents the descriptive statistics of the data.

Selection of Model Form.
The statistical results for the global cultivated area data are listed in Table 1. The distribution does not follow a normal distribution. According to the trend chart of global cultivated area in Figure 1, the series is not seasonal, but it has a strong trend. Judging from the shape of the curve, candidate forms include a linear trend, quadratic curve, cubic curve, logarithmic curve, power function, exponential function, Gompertz curve, growth curve, cyclic curve, and so on. Statistical data on global cultivated area from 1961 to 2010 provided by the World Bank were used to fit these models, and the results are listed in Table 2. The cubic function has the highest goodness of fit, and the model can explain more than 90% of the variation in cultivated area. The Gompertz model has the smallest residual value. Both the goodness of fit and the residual value of the cyclic curve lie between these two. Therefore, the cubic function, Gompertz, and cyclic curve trend models were preliminarily selected as deterministic time series models of cultivated area change.
The D-W statistics of these three models are small, which shows that their residuals are positively serially correlated. Some information has therefore not been extracted, which is a shortcoming of deterministic time series analysis.
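The serial correlation diagnosis mentioned above rests on the Durbin-Watson (D-W) statistic, which is the sum of squared successive residual differences divided by the sum of squared residuals. A minimal sketch follows; the residual values are made up for illustration and are not taken from Table 3.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest no first-order
    autocorrelation, values near 0 suggest positive autocorrelation."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals that drift slowly, as trend-model residuals often do.
slow_residuals = [1, 2, 3, 2, 1, -1, -2, -3, -2, -1]
print(round(durbin_watson(slow_residuals), 3))

# Residuals that alternate in sign give a value above 2 instead.
print(durbin_watson([1, -1, 1, -1]))
```

The statistic is approximately 2(1 - r), where r is the lag-1 autocorrelation of the residuals, which is why a small value indicates positive serial correlation.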

Model Evaluation and Ultimate Prediction Model. Three models estimated in
in which $x_t$ is the global cultivated area, $B$ is the lag operator with $B x_t = x_{t-1}$, and $\varepsilon_t$ is the white noise disturbance, which satisfies (3).
This model was used to extract the deterministic information from the global cultivated area series, and the remaining part was then tested for stationarity; that is, a unit root test was conducted on the series $I_t = x_t - f(t)$. The results are shown in Table 7. The irregular part is a stationary process, so the random time series analysis method can be used directly to construct the model. To determine the lag order, the AC and PAC functions were used to test the correlation of the series. The AC coefficients decay exponentially, and the partial AC coefficients drop rapidly after 2 lags and are then not significantly different from 0. It can therefore be preliminarily judged that the series is an autoregressive process with a highest order of 2, so AR(1) and AR(2) models can be adopted.

Estimation and Evaluation of Model Parameter and Prediction Model.
The above models were estimated by the least squares method using the data for the irregular part. According to the results in Table 8, the coefficients of the independent variables in both models are significant at the 0.01 level. However, the AC test of the residual series of the AR(1) model shows that the residuals are not purely random, so the model fails the adaptability test. The residuals of the AR(2) model are purely random, and the model passes the adaptability test. Therefore, the model for the irregularly changing part of the global cultivated area series takes the AR(2) form

$$(1 - \phi_1 B - \phi_2 B^2) I_t = \varepsilon_t,$$

with the coefficient estimates reported in Table 8, in which $I_t$ is the irregular part of the global cultivated area series, $B$ is the lag operator with $B I_t = I_{t-1}$, and $\varepsilon_t$ is the white noise disturbance, which satisfies (3).
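In conclusion, the following combined model can be established for the global cultivated area series:

$$x_t = f(t) + I_t, \qquad (1 - \phi_1 B - \phi_2 B^2) I_t = \varepsilon_t,$$

in which $x_t$ is the global cultivated area, $f(t)$ is the deterministic trend function, $I_t$ is the irregular part, $B$ is the lag operator, and $B x_t = x_{t-1}$.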
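The two-step construction of the combined (residual autoregressive) model can be sketched as follows: fit a deterministic trend by least squares, fit an autoregression to what is left, and add the two forecasts. This is a simplified illustration under stated assumptions, not the estimation carried out in this paper: the cubic trend on a rescaled time index, the AR(2) order, the function names, and the synthetic series are all hypothetical choices.

```python
import random


def least_squares(rows, targets):
    """OLS coefficients via normal equations and Gaussian elimination."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, k))
        beta[r] = (m[r][k] - tail) / m[r][r]
    return beta


def fit_combined(series, degree=3, ar_order=2):
    """Fit the trend f(t), then an AR model to the irregular part x_t - f(t)."""
    n = len(series)
    scale = float(n - 1)  # rescale time to [0, 1] for numerical stability
    design = [[(t / scale) ** d for d in range(degree + 1)] for t in range(n)]
    beta = least_squares(design, series)
    trend = [sum(b * x for b, x in zip(beta, row)) for row in design]
    irregular = [y - f for y, f in zip(series, trend)]
    lagged = [[irregular[t - j] for j in range(1, ar_order + 1)]
              for t in range(ar_order, n)]
    phi = least_squares(lagged, irregular[ar_order:])
    return beta, phi, irregular, scale


def forecast_combined(beta, phi, irregular, scale, n, horizon):
    """Forecast = extrapolated trend + recursive AR forecast of the irregular part."""
    history = list(irregular)
    forecasts = []
    for h in range(horizon):
        t = n + h
        trend = sum(b * (t / scale) ** d for d, b in enumerate(beta))
        next_irregular = sum(p * history[-j] for j, p in enumerate(phi, start=1))
        history.append(next_irregular)
        forecasts.append(trend + next_irregular)
    return forecasts


# Synthetic series (hypothetical): smooth trend plus AR(2) disturbance.
rng = random.Random(1)
noise = [0.0, 0.0]
for _ in range(60):
    noise.append(0.6 * noise[-1] + 0.2 * noise[-2] + rng.gauss(0.0, 1.0))
series = [100.0 + 0.8 * t - 0.005 * t ** 2 + e for t, e in enumerate(noise[2:])]

beta, phi, irregular, scale = fit_combined(series)
print(forecast_combined(beta, phi, irregular, scale, len(series), horizon=3))
```

Estimating the two parts sequentially keeps the sketch short; a joint estimation of trend and autoregressive parameters is also possible and may give somewhat different coefficients.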

Model Evaluation and Prediction of Global Cultivated Area.
In order to compare the prediction precision of the three models, the mean absolute percentage error (MAPE) of each model on the historical data was calculated. According to the results listed in Table 9, the MAPE of all three models is less than 10. The MAPE of the combined model is the lowest, so its fitting precision for global cultivated area is the highest. The random model takes second position, and the deterministic model is the worst. The combined model was therefore used to predict the future change of global cultivated area. The prediction illustrated in Figure 2 shows that global cultivated area in recent years (from 2009 to 2011) is higher than that in 2008. The predicted values are 1,376,300,000 hectares, 1,378,810,000 hectares, and 1,378,140,000 hectares, respectively, about 4,500,000 to 7,000,000 hectares more than in 2008. The value rises from 2009 to 2010 but drops from 2010 to 2011. For the later years (i.e., from 2018 to 2020), the predicted value of the combined model (i.e., the residual autoregressive model) is similar to that in recent years. This shows that, under the combined action of deterministic and random factors, global cultivated area will rise slightly while fluctuating.
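For reference, the MAPE used in this comparison is the average of the absolute errors expressed as percentages of the actual values. A minimal sketch, with made-up numbers rather than the values in Table 9:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and fitted values: each fit is off by 10 percent.
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because each error is divided by the actual value, the measure is scale free, which makes it convenient for comparing models fitted to a series measured in hundreds of millions of hectares.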

Conclusion
The change process of global cultivated area can be simulated closely using the time series analysis models established in this study, which helps to understand its dynamics and to predict its future values. The combined model fits the data best, and its prediction results are the most reliable. This shows that the evolution of global cultivated area is driven by both deterministic and random factors. According to the prediction results, global cultivated area will rise slightly while fluctuating.
The model and prediction in this study are based only on the global cultivated area time series for the past 50 years. The precision of the established model is limited, and the conclusion reflects only the regularity inherent in the data. The driving factors behind this regularity were roughly classified as deterministic factors and random factors. The deterministic factors may include world population growth, world industrial development, global urbanization, and the land policies of various countries. The random factors may include war and climatic variation. Whether these factors have a long-term equilibrium relationship with global cultivated area, and how they drive its evolution, requires further study.
In the combined model, the ARMA($p$, $q$) component is used to extract the information carried by the irregular, randomly changing factors. If the irregular part follows an autoregressive model, the combined model is also called a residual autoregressive model. Changes in global cultivated area may be caused by deterministic factors, random factors, or a combination of both. When the specific factors cannot be determined in advance, the above three methods can all be used to construct models of global cultivated area change. To choose the ultimate model, indices such as the mean absolute percentage error (MAPE) are computed to evaluate how precisely each model fits the time series.
Test. Elementary linear transformations were applied to the nonlinear models according to the forms selected. The three models were then estimated and tested using statistical data on global cultivated area from 1961 to 2010. The results are shown in Table 3. According to the estimates, all three models perform well in both parameter significance and overall fit. They can explain about 90% of the change in global cultivated area and can be used to fit and predict future change. Besides, the D-W statistics of the models are small.

Figure 2: Chart of global farmland area change prediction.

Table 1: Statistical test for normal distribution of the global cultivated area time series.
Note: data come from the thematic databases of the World Bank.

Table 2: Fitting results of various deterministic trend models.
Note: * the fitting of this model is estimated by turning it into a linear model via algebraic transformation; the residual error and adjusted $R^2$ are the fitting results of the transformed model. ** Up to 2010, the maximum value of agricultural land in the world was 4,935,796,100 hectares, and the growth limit of cultivated area will not exceed this value, so the growth limit parameter is set to 4,935,796,100.

Table 3: Estimation results of the cubic curve model, the Gompertz model, and the cyclic model.
Note: *** means significance at the 0.01 level, ** means significance at the 0.05 level, and * means significance at the 0.1 level.

Table 3
value of the cubic curve model is the smallest, so the prediction precision of the cubic curve model is the highest. A comparison of the three models in bias proportion (BP), variance proportion (VP), and covariance proportion (CP) also shows that the cubic curve model predicts best. According to these results, the cubic curve form was chosen as the ultimate prediction model. The specific form of the model is as follows:
unit root test for the global cultivated area series. The series clearly has a unit root and is a nonstationary time series, while its first difference series is stationary. Therefore, the global cultivated area is integrated of order one, that is, an I(1) process. The seasonality test shows that the differenced series has no seasonal pattern, so it is proper to establish an ARIMA($p$, 1, $q$) model.

Table 4: Prediction results of the cubic curve model, the Gompertz model, and the cyclic model.

Table 5: ADF and PP tests for the global cultivated area time series**.
Note: * C means intercept term, and T means trend term with time. ** ADF and PP test equations often have three forms: with intercept term but without trend term, with both intercept and trend terms, and with neither intercept nor trend term. The global cultivated area time series has trend and intercept terms, but its first difference series has neither.

To determine the orders of the ARIMA($p$, 1, $q$) model, the AC function and PAC function were used to test the correlation of the first difference series of global cultivated area. The PAC function cuts off after the second order, and the AC function has the tailing feature. Therefore, AR(1) and AR(2) models can be chosen to fit the first difference series, which means that ARIMA(1, 1, 0) and ARIMA(2, 1, 0) models can be used to fit the global cultivated area series.

Estimation and Evaluation of Model Parameter and Prediction Model.
Statistical data on global cultivated area from 1961 to 2010 provided by the World Bank were used to estimate the above models. The results are listed in Table 6. The coefficient of the independent variable in the AR(1) model is significant at the 0.01 level; in the AR(2) model, the first coefficient is significant at the 0.01 level, but the second is insignificant. The autocorrelation test of the sample residual series shows that the residuals have no autocorrelation and that the model passes the adaptability test. According to the AIC and SC criteria, it is proper to adopt the AR(1) model for fitting, and the test statistic of this model shows that its residuals have no autocorrelation and that the model passes the adaptability test.

Table 6: Evaluation results of various models of the first difference series of global cultivated area.
Note: *** means significance at the 0.01 level, ** means significance at the 0.05 level, and * means significance at the 0.1 level.

Table 7: Unit root test results for the irregular part of the global cultivated area series.
Note: C means intercept term, and T means trend term with time. The unit root test equation often has three forms: with intercept term but without trend term, with both intercept and trend terms, and with neither intercept nor trend term. The statistical test shows that the series of the irregular part of global cultivated area has no trend or intercept term.

Table 8: Estimation results of various models of the irregular part series.
Note: *** means significance at the 0.01 level, ** means significance at the 0.05 level, and * means significance at the 0.1 level.

Table 9: Evaluation results of the fitting precision of the models on historical data for global cultivated area.