Short-Term Forecasting of Agriculture Commodities in Context of Indian Market for Sustainable Agriculture by Using the Artificial Neural Network

Prediction of well-grounded market information, particularly short-term forecasting of prices of agricultural commodities, is an essential requirement for the sustainable development of the farming community. Such predictions are mostly performed with the help of time series models. In this study, a soft computing method, the artificial neural network (ANN), is used for short-term forecasting of agriculture commodity prices based on time series data. Sunflower seed and soybean seed are considered as the agriculture commodities. The soybean seed time series data were collected for a period of five years (Jan 2014–Dec 2018) for the Akola district market, Maharashtra, India. The sunflower time series data were collected for a period of six years (Jan 2011–Dec 2016) for the Kadari district market, Andhra Pradesh, India. The dataset is available on the Indian government website www.data.gov.in. For forecasting, the ANN model is applied to the abovementioned datasets. The performance of the model is compared with the result of the traditional ARIMA model. The mean absolute percentage error (MAPE) and root mean square percentage error (RMSPE) are considered as the performance parameters for the forecasting model. On both performance parameters, MAPE and RMSPE, the ANN is observed to be a better forecasting model than the ARIMA model.


Introduction
In India, about two-thirds of the total population depends directly or indirectly on agriculture [1,2]. As per the survey conducted by the "Agriculture Census of India" in 2011, approximately 62% of the Indian population living in rural areas is dependent upon agriculture directly or indirectly. For this part of the population, agriculture is the main source of income. India ranks second in the production of agriculture commodities. The agriculture sector contributes almost 18% to the Indian GDP [3]. Agriculture commodities are an important source of earnings. Hence, the influence of commodity price [4] is crucial in the Indian economy. The agriculture commodity price forecast will play an important role for farmers, policymakers, and various administrative offices. For example, if a farmer knows in advance the price of a crop in the near future (short term), then he can decide how much farming area to devote to that particular crop. Other than farmers, government agencies also need to know the probable price of a commodity in advance for implementing government schemes (subsidy schemes and import/export activity) smoothly.
Agriculture commodity forecasting is very important for the sustainability of future generations. With the ever-increasing demand for agricultural products and the reduction in agricultural land, this forecasting methodology is very important for the sustainability of farmers. The Indian economy is largely an agriculture-based economy. This forecasting methodology can help farmers and other stakeholders sustain it for a longer duration. The advantages of this forecasting methodology include healthy and economical food products for consumers, leading to improved health parameters. This can lead to sustainability of the agricultural land and products.
Forecasting of agricultural commodities is essential for our day-to-day life. Agricultural price fluctuations are rising nowadays, resulting in mismanagement of people's food expenditure. Agriculturalists need to resolve this problem for the better future of agricultural commodities.
Fluctuating and rising agricultural prices are one of the major factors in the global fight against poverty. Many models are used for forecasting, of which statistical methods are used the most. Still, no model has yet resolved the problem efficiently. The methods for forecasting agriculture commodity prices are mainly divided into two categories: structural and nonstructural. The structural methods [5] mainly consider the supply-demand ratio. Computationally, it is very difficult to estimate the consumers' needs and the production of a particular crop for developing countries. The nonstructural methods [6] may be categorized as statistical techniques [7,8] and machine learning techniques. For nonstructural methods, historical data are collected as time series data. The time series data can be linear or nonlinear in nature.
There are various methods [9] for forecasting based on time series data. The ANN is a better alternative than the statistical model for nonlinear time series data [10,11].
Some research studies have been performed in forecasting of agriculture commodities price in developing countries such as India. According to [12,13], there are some special features of the ANN such as nonlinearity, adaptability, and mapping procedures providing strong support for using the ANN as a good forecasting model.
In [14], the ARIMA model and the time delay neural network (TDNN) are used for time series forecasting of agriculture commodity prices. They concluded that the neural network model performed better due to the nonlinear nature of the time series data. Finally, they presented a hybrid model for forecasting. Surprisingly, the hybrid model was less efficient than the ANN for soybean data and more efficient for mustard.
According to the work reported in [15], the neural network is a very good alternative for "short term" forecasting, while the Box-Jenkins method performs better for very short-term forecasting. They also discussed that a neural network without a hidden layer can work similarly to the Box-Jenkins method.
The work presented in [16] used the support vector machine for forecasting financial time series data and found that it performed better in terms of efficiency than the back propagation neural network.
In [17], the authors presented the ANN approach for multivariate time series data. They used a dataset of flour prices for three cities, and based on training and testing results, they concluded that the ANN model can be used well for forecasting.
In [18] too, the ANN model is used, in this case for electrical load forecasting. They used the characteristics of the ANN to learn the relationship among past, current, and future temperature data. Based on the testing data, the result was very satisfactory.
In [19], the Jordan neural network is used in forecasting inflation based on time series data. They used macroeconomic variables such as financial variables, lagged inflation, and labor market variables. In the work [20], the ANN is used for sales forecasting of apparel retail chain stores. The MAPE they observed for the model was 8.79%. Some of the applications of the ANN model for forecasting based on time series data are as follows:
(1) Electricity load forecasting [21,22]
(2) Financial forecasting [10,23]
(3) Monthly average rainfall prediction [24]
This study is summarized in five sections. The first and current section contains a brief introduction of the problem statement and the various solutions given by scholars. The second section elaborates the computational models ARIMA and ANN. The third section explains the implementation and result analysis for measuring the efficiency of the computational models discussed in the second section. The fourth section presents the conclusion of the work presented in this study.

Materials and Methods
Sunflower time series data and soybean time series data are used in this research work. A statistical description of the data is given in Table 1.
Description of soybean time series data is as follows:
(1) Taken from "data.gov.in," an Indian government website
(2) For the period of five years (January 2014-December 2018)
(3) Data related to the Akola district market, Maharashtra, India
Description of sunflower time series data is as follows:
(1) Taken from "data.gov.in," an Indian government website
(2) For the period of six years (January 2011-December 2016)
(3) Data related to the Kadari district market, Andhra Pradesh, India

Forecasting Techniques.
Forecasting is defined as the prediction made on the basis of some scientific calculation based on historical data and demand-supply data. Classification of forecasting techniques [25] is shown in Figure 1.
Forecasting techniques are mainly divided into two types: "Quantitative Technique" and "Qualitative Technique." In the qualitative method, we use facts that cannot be measured as numeric values. It is also known as judgmental forecasting [26], where the prediction is made on the basis of surveys, events, and many other noncomputational parameters. The quantitative technique [27] works on numerical or computational data. It is also known as the statistical technique or time series technique. Time series forecasting can be divided into two parts: (a) classical Box-Jenkins models [15,28,29] and (b) machine learning models [30,31]. The classical models work well on linear data, while the machine learning models work well on a wide range of data. The ARIMA model and ANN model are further discussed later in this section.
Selection of a forecasting technique depends upon various parameters. Some of them are the level of accuracy required, the purpose of forecasting, the type of data available, and the tenure of the forecast. Qualitative models for agriculture commodity forecasting are very expensive and not suitable for developing countries. As India is a developing country, time series forecasting models are suitable to forecast the agriculture commodity price. The agriculture time series data are nonlinear in nature; hence, the "Artificial Neural Network Model" is the most suitable model [32-35] for forecasting agriculture commodity prices.

Forecasting Using ARIMA.
ARIMA stands for autoregressive (AR) integrated (I) moving average (MA). It works on the principle of Box-Jenkins [5,29,36,37]. ARIMA [38] is associated with three important parameters, namely, p, d, and q, as shown in Figure 2.
The working model of ARIMA is shown in Figure 3. Visualization of the time series data is the most basic and fundamental step for ARIMA. After visualization, we can preprocess the data, for example, by removing outliers and dealing with missing data. By visualization, we can also judge whether the data are stationary or not. If the series is nonstationary, then we should first make the time series stationary. Once the series is stationary, we should find the optimal parameters for the ARIMA model with the help of the ACF plot and PACF plot [39].
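The two operations mentioned above, differencing a nonstationary series and inspecting its autocorrelation, can be illustrated with a minimal sketch. The helper names `difference` and `sample_acf` are hypothetical and are not taken from the original study; the autocorrelation uses the usual biased sample estimator, which is an assumption about how the ACF plot is computed.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("a constant series has undefined autocorrelation")
    acf = []
    for k in range(max_lag + 1):
        num = sum((series[t] - mean) * (series[t - k] - mean)
                  for t in range(k, n))
        acf.append(num / denom)
    return acf


# Illustrative values only, not data from the study.
prices = [1.0, 3.0, 6.0, 10.0]
first_diff = difference(prices)      # one round of differencing (d = 1)
acf_values = sample_acf(first_diff, max_lag=1)
```

A slowly decaying ACF on the raw series and a quickly decaying ACF after differencing is the usual visual hint that one order of differencing is sufficient.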

Forecasting Using the Artificial Neural Network.
The feed forward neural network with a single hidden layer is used, as shown in Figure 4. The back propagation concept is used for learning. Let "m" be the input size (neurons at the input layer) of the neural network and "n" be the number of nodes at the hidden layer. The input $y_{t-m}, y_{t-m+1}, \ldots, y_{t-1}$ is scaled into the interval [0, 1]. The rectified linear unit (ReLU) activation function [1] is used to compute the activation from the input layer neurons to the hidden layer neurons. Sigmoid is used to compute the activation from the hidden layer to the output layer.
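The input preparation described above (scaling to [0, 1] and feeding the previous m observations) can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling is assumed as the scaling method, and the function names `min_max_scale` and `make_windows` are hypothetical.

```python
def min_max_scale(values):
    """Scale values linearly into [0, 1] (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, m):
    """Build (input window, target) pairs.

    Each input is [y[t-m], ..., y[t-1]] and the target is y[t].
    """
    inputs, targets = [], []
    for t in range(m, len(series)):
        inputs.append(series[t - m:t])
        targets.append(series[t])
    return inputs, targets


# Illustrative values only, not data from the study.
scaled = min_max_scale([10.0, 20.0, 30.0, 25.0, 15.0])
windows, targets = make_windows(scaled, m=2)
```

In practice the minimum and maximum would typically be taken from the training portion only, so that no information from the test period leaks into the inputs.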
The ReLU function is mathematically defined as
$$f(x) = \max(0, x) \tag{1}$$
The final output of the network is obtained by applying $f_1$ to the weighted inputs at the hidden layer and then $f_2$ to the weighted hidden-layer activations at the output layer, where $f_1$ and $f_2$ are the activation functions at the hidden layer and output layer, respectively. The training of the network consists of two parts. In the first part, the network produces the output based on the selected input window. In the second part, the error is calculated from the actual value and the predicted value. This error is then propagated back via the output layer to update the weights of the neurons in the hidden layer for the next round, as shown in Figure 5.
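The forward pass of such a network can be sketched in a few lines. This is an illustrative sketch only: the weight layout, the bias terms, and the function name `forward` are assumptions, not the authors' implementation.

```python
import math


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(window, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a single-hidden-layer network.

    window         : m scaled inputs
    hidden_weights : n rows, each with m weights (assumed layout)
    output_weights : n weights from hidden layer to the single output
    """
    hidden = [
        relu(sum(w * x for w, x in zip(row, window)) + b)   # f1 = ReLU
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    z = sum(v * h for v, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(z)                                       # f2 = sigmoid


# Illustrative weights only (m = 2 inputs, n = 2 hidden nodes).
prediction = forward([0.2, 0.4],
                     [[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0],
                     [1.0, 1.0], 0.0)
```

Because the output activation is a sigmoid, the prediction lies in (0, 1), which is consistent with targets that have been scaled into [0, 1].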
For training, the error is calculated by comparing the actual value $y_t$ with the predicted value. The error is back propagated through the neural network to update the weights of the connections between the hidden layer and the output layer.
The output of a neuron is calculated from its weighted inputs and bias, where $O_i^{(L)}$ is the output of neuron $i$ in the $L$th layer, $(\mathrm{Bias})_i$ is the bias in the $L$th layer, $w_{ij}^{(L)}$ is the weight from the $i$th neuron of layer $(L-1)$ to the $j$th neuron of layer $L$, and $Z_i^{(L)}$ is the output of the $i$th neuron in layer $L$.
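A single back propagation update of the hidden-to-output weights can be sketched as below. The sketch assumes a squared-error loss, a sigmoid output, and an update with learning rate `eta` and momentum `alpha`; the function name `output_layer_step` and the default values are hypothetical, and this is not claimed to be the authors' exact procedure.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def output_layer_step(hidden, weights, bias, target,
                      eta=0.5, alpha=0.0, prev_deltas=None):
    """One gradient step on the hidden-to-output weights.

    Assumes squared error and a sigmoid output neuron.
    Returns (new_weights, new_bias, weight_changes, prediction_before_update).
    """
    if prev_deltas is None:
        prev_deltas = [0.0] * len(weights)
    predicted = sigmoid(sum(w * h for w, h in zip(weights, hidden)) + bias)
    # Error signal: (actual - predicted) times the sigmoid derivative.
    delta = (target - predicted) * predicted * (1.0 - predicted)
    # Weight change = learning-rate term + momentum term.
    changes = [eta * delta * h + alpha * prev
               for h, prev in zip(hidden, prev_deltas)]
    new_weights = [w + c for w, c in zip(weights, changes)]
    new_bias = bias + eta * delta
    return new_weights, new_bias, changes, predicted


# Illustrative values only.
hidden_out = [0.5, 1.0]
w, b, changes, before = output_layer_step(hidden_out, [0.1, -0.2], 0.0, 0.8)
after = sigmoid(sum(x * y for x, y in zip(w, hidden_out)) + b)
```

For this small example the prediction moves closer to the target after one step, which is the behavior the error propagation is meant to produce.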
The updated weight at time T is obtained from the previous weight by adding a correction that depends on η, α, and δ, where η is the learning rate, α is the momentum, and δ [40] can be calculated with the help of the gradient of the output function of the neuron.

The boxplot of the soybean time series data in Figure 8 suggests that the price has a higher mean and variance in the months of February, March, and April. Similarly, in the boxplot of the sunflower time series data, the price has a higher mean and variance in the months of March and April, as shown in Figure 9. From Figures 8 and 9, we can clearly say that there are components such as seasonality, trend, and cycle, which are shown in Figures 10 and 11.

Finding Parameters of the ARIMA Model.
Stationarity of the time series data is checked with the help of the "Augmented Dickey-Fuller Test" as shown in Figure 12.

The "auto.arima()" function is used to automatically fit the model based on the input time series data and to find the optimal parameters for the ARIMA model. ARIMA (0, 1, 0) is chosen as the optimal model for forecasting for both the sunflower time series data and the soybean time series data, as shown in Figures 13 and 14. Figures 15 and 16 show the plots of the linear models for the soybean time series data and sunflower time series data, respectively.
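To make the chosen model concrete: an ARIMA (0, 1, 0) model is a random walk, so its point forecast is the last observed value, optionally shifted by an average drift. The sketch below illustrates this; whether the fitted model in the study included a drift term is not stated, so the `with_drift` flag and the function name `arima_010_forecast` are assumptions for illustration.

```python
def arima_010_forecast(series, horizon, with_drift=False):
    """Point forecasts of a random walk (ARIMA(0, 1, 0)).

    Without drift, every forecast equals the last observation.
    With drift, the mean first difference is added per step ahead.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs) if with_drift else 0.0
    last = series[-1]
    return [last + drift * h for h in range(1, horizon + 1)]


# Illustrative values only, not prices from the study.
flat = arima_010_forecast([10.0, 12.0, 11.0, 13.0], horizon=3)
trended = arima_010_forecast([10.0, 12.0, 11.0, 13.0], horizon=3,
                             with_drift=True)
```

Such a forecast cannot follow seasonal movements within the forecast horizon, which is one reason a random walk can show large percentage errors on seasonal price data.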

Data Preprocessing.
In data preprocessing, we mainly focus on analyzing the data, removing the noise, dealing with the missing values, and transforming the input values to the scale required by the model to be implemented. The first step of data preprocessing is to plot the series. Figure 17 shows the plot of the monthly average price of soybean for the period January 2014-December 2018. Similarly, Figure 18 shows the time series data for sunflower.

Training Dataset and Test Dataset.
We used the supervised learning concept. The first 80 percent of the preprocessed data is used to train the model, and the last 20 percent of the data is used to test the model, as per the standard accepted by various scholars [42]. Figure 19 shows the division of the soybean dataset into train data and test data. Similarly, Figure 20 shows the division of the sunflower dataset into train and test data.

Journal of Food Quality

[...] given in Tables 2 and 3 for the commodities soybean and sunflower, respectively.
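The 80/20 division described in this section keeps the observations in time order, with the earliest part used for training. A minimal sketch follows; the function name `chronological_split` is hypothetical, and rounding the cut point down is an assumption.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)   # cut point rounded down
    return series[:cut], series[cut:]


# Example with 60 placeholder monthly observations (five years).
monthly = list(range(60))
train, test = chronological_split(monthly)
```

With 60 monthly observations this gives 48 training points and 12 test points, so the test set covers the final year of the series.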

Evaluating Forecasting Accuracy.
We have used two parameters, "Mean Absolute Percentage Error (MAPE)" and "Root Mean Square Percentage Error (RMSPE)," to measure the forecasting accuracy.

MAPE.
MAPE [43,44] is one of the important parameters to measure the quality of the forecasting system. It is defined in the following equation:
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$
where $A_t$ is the actual price at time $t$, $F_t$ is the forecasted price at time $t$, and $n$ is the number of forecasted values. The MAPE by using the ANN for the forecasted sunflower time series data is 2.4% and for soybean time series data is 7.7%, as given in Table 4.
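As an illustration, MAPE can be computed directly from paired actual and forecasted prices. The function name `mape` and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only: errors of 10% and 10% average to 10%.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual price, the measure is undefined when an actual price is zero, which is not a concern for commodity prices.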

RMSPE.
RMSPE can be calculated by following the given steps:
Step 1: Calculate the percentage residuals by using the following formula:
$$r_t = \frac{A_t - F_t}{A_t} \times 100$$
where $A_t$ is the "actual price" and $F_t$ is the "forecasted price."
Step 2: Calculate the squares of the residuals, $r_t^2$.
Step 3: Calculate the mean of the squared residuals by adding the squared residuals and dividing the sum by $n$.
Step 4: Calculate the square root of the mean obtained in Step 3.
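The four steps above translate directly into code. The function name `rmspe` and the sample numbers are hypothetical and are used only to show the calculation.

```python
import math


def rmspe(actual, forecast):
    """Root mean square percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1: percentage residuals.
    residuals = [100.0 * (a - f) / a for a, f in zip(actual, forecast)]
    # Step 2: squared residuals.
    squares = [r * r for r in residuals]
    # Step 3: mean of the squared residuals.
    mean_square = sum(squares) / len(squares)
    # Step 4: square root of the mean.
    return math.sqrt(mean_square)


# Illustrative values only: residuals of -10% and +10%.
example_rmspe = rmspe([100.0, 200.0], [110.0, 180.0])
```

Squaring before averaging weights large percentage errors more heavily than MAPE does, so RMSPE is never smaller than MAPE on the same forecasts.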

Conclusion and Future Work
Currently, India is ranked second in the world for production of agricultural commodities, and the agriculture sector contributes almost 18% to the Indian GDP. However, the market prices of these commodities fluctuate geographically. To give stakeholders a better understanding of these fluctuations, in this study, we have presented a short-term price forecasting model, which will eventually lead to more sustainability for different stakeholders. For this, we have compared the ANN and ARIMA models for forecasting the prices. We considered sunflower time series data and soybean time series data collected from the Indian government portal for training and testing the proposed forecasting model. The parameters MAPE and RMSPE are used to measure the accuracy of the presented model. For soybean and sunflower price time series data, the mean absolute percentage error (MAPE) by using the ANN is 2.4% and 7.7%, respectively, whereas by using ARIMA, MAPE for soybean and sunflower time series data is 19.76% and 15.2%, respectively. Similarly, the root mean square percentage error (RMSPE) by using the ANN for soybean and sunflower time series data is 3.15% and 8.92%, respectively, whereas by using ARIMA for the same time series data, RMSPE is 19.84% and 15.9%, respectively. These results show that the ANN is a better model for forecasting agriculture commodity prices than the ARIMA model. As per the literature review, the ANN model is suitable for nonlinear time series data and the ARIMA model is suitable for linear time series data. Hence, future work will focus on developing a hybrid model for forecasting agriculture commodity prices to overcome the limitation of the ANN model.

Data Availability
The data used to support the findings of this study are taken from the website "http://www.data.gov.in" managed by the Government of India.

Conflicts of Interest
The authors declare that they have no conflicts of interest.