A Hybrid Neural Network and H-P Filter Model for Short-Term Vegetable Price Forecasting



Introduction
Among all the price fluctuations in the market, the prices of agricultural products have the most obvious and basic impact on the cost of living. Many countries have established price early warning systems to monitor and evaluate grain prices so that the price can be adjusted and controlled in a timely manner when it is in an abnormal state, in order to guarantee that the grain economy develops in a sustainable, healthy, and stable way [1,2].
Predicting a price time series is a challenging problem according to current studies [3,4]. In recent decades, several linear and nonlinear prediction models have been developed for time series data forecasting. The autoregressive integrated moving average (ARIMA) model [5] is one of the most popular methods based on time series and has been widely used for data prediction [6]. Bianco et al. proposed ARIMA and transfer function models for the prediction of electricity consumption [7]. Linear regression models have also been proposed for the prediction of energy consumption [8].
However, time series data forecasting is usually a nonlinear problem, so linear approaches may fail to capture the nonlinear dynamics of the process. There are some nonlinear methods for forecasting future prices using a machine learning model [9-12]. As the price of vegetables has seasonal cyclical factors, some authors use signal processing methods to analyze the historic data and predict future prices [6,13]. The limitation of this approach is that it assumes that the period of price changes is fixed and will follow the same cycle in the future. As the price is affected by many factors, it cannot have a fixed change cycle. Moreover, none of these methods can handle both linear and nonlinear patterns at the same time.
In this paper, a hybrid approach, combining H-P filtering [14] and a neural network model [15,16], is developed to predict short-term time series price data. Because time series data are highly correlated, they often contain a trend component. Hence, using an H-P filter to decompose the price time series into its trend and cyclical components is proposed as an effective technique for time series data forecasting. The use of combined models on price time series data could assist in capturing patterns in the data and could improve prediction accuracy. The motivation behind this hybrid approach is largely that price forecasting problems are often complex in nature, and so any individual model may not be able to capture all the different patterns equally well.
This paper is organized as follows. Section 2 briefly describes the situation of the current vegetable market in China and the factors affecting prices. In Section 3, we introduce some technical background knowledge. Section 4 describes the proposed H-P filter based hybrid neural network model. We discuss the experimental results in Section 5. Our conclusion is presented in Section 6.

A Brief Description of the Vegetable Market in China
The Chinese vegetable market plays an important role in people's daily life and in the agricultural industry. In 2011, the area planted with vegetables was about 19.7 square kilometers, and vegetable production reached 6.79 million tons. According to a report from the Food and Agriculture Organization (FAO), China has 43 percent of the world's planted area and 50 percent of the world's production, and so it ranks first in world vegetable production. However, vegetable prices have been volatile in recent years. There are many factors that influence vegetable prices, such as population, national policies, area of arable land, international financial markets, price of alternatives, economic growth, and international trade. Here, we study the hidden patterns behind the history of vegetable prices to see if we can build a more accurate model for price forecasting.

Background
There are several methods for extracting the trend and cyclical components from an original single set of data. As the prediction results from the two components need to be merged together, we have to find a linear filter to separate the original single set of data. Here, we choose the linear Hodrick-Prescott filter to get the trend and cyclical components [17]. Then, we use a traditional neural network to learn from the patterns.

Hodrick-Prescott Filter Model.
The H-P filter [17] decomposes time series data into trend and cyclical components:

$$y_t = g_t + c_t, \quad t = 1, 2, \ldots, T,$$

where $g_t$ and $c_t$ are the trend and cyclical components, respectively. This decomposition assumes that the trend component does not contain any seasonality and, because the cycle is derived residually, it does not separate out the cycle from any irregular movements. Hodrick and Prescott [17] minimize the variance of $c_t$, subject to a penalty for variations in the second difference of the growth term. Their filter is given by

$$\min_{\{g_t\}_{t=1}^{T}} \left\{ \sum_{t=1}^{T} (y_t - g_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (g_{t+1} - g_t) - (g_t - g_{t-1}) \right]^2 \right\}. \tag{1}$$

The parameter $\lambda$ controls the smoothness of $g_t$. The minimization of (1) provides a mapping from $y_t$ to $g_t$, with $c_t$ determined residually. The estimate of potential output using the H-P filter depends on the choice of $\lambda$, the smoothing parameter. A $\lambda$ of zero corresponds to an extreme real business cycle model where all of the fluctuations in real output are caused by technology shocks, because the H-P trend is the same as the series being detrended. Conversely, as $\lambda$ tends to infinity, the H-P trend moves towards a deterministic time trend. Following Hodrick and Prescott, researchers typically set $\lambda$ at 1600 for use with quarterly data [17] but test the robustness of their results with different values.
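The minimization above has a closed-form solution, which the following minimal sketch illustrates. It assumes the standard first-order condition $(I + \lambda D^{\top} D)\,g = y$, where $D$ is the second-difference matrix; the function name `hp_filter` and the sample prices are hypothetical and are not taken from the paper's Matlab implementation. The default of 1600 is the quarterly value cited in the text, and whether it suits monthly prices is an assumption to be checked.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Split a series into (trend, cycle) by solving (I + lam * D'D) g = y.

    D is the (n-2) x n second-difference operator, so lam * ||D g||^2 is the
    smoothness penalty of the H-P objective. Illustrative helper only.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 3:
        raise ValueError("need at least three observations")
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]  # g[i] - 2*g[i+1] + g[i+2]
    trend = np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)
    cycle = y - trend  # the cycle is obtained residually
    return trend, cycle

if __name__ == "__main__":
    # Hypothetical monthly prices: slow growth plus a 12-month seasonal swing.
    t = np.arange(36)
    prices = 3.0 + 0.02 * t + 0.8 * np.sin(2 * np.pi * t / 12)
    trend, cycle = hp_filter(prices)
    print(trend[:3], cycle[:3])
```

A dense solve is adequate for series of a few hundred points; longer series would normally use a banded or sparse solver instead.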

Principles of Neural Networks.
Artificial neural networks (ANN) are popular models for studying nonlinear relationship functions. One of the most significant advantages of an ANN model is that it can approximate a large class of functions with a high degree of accuracy [18]. This capability comes from the parallel processing of the information. No prior assumption of the model form is required in the model building process. Instead, the network structure is mainly determined by the features of the data. The key element of an artificial neural network is an artificial neuron. For a given neuron, there are multiple ($k + 1$) inputs and one output. The output of the $j$th neuron is $y_j = f\left(\sum_{i=0}^{k} w_{ij} x_i\right)$, where $x_i$ are the inputs, $w_{ij}$ are the connection weights, and $f(\cdot)$ is called the activation function.
A single hidden layer feed-forward network is the most widely used model form for time series modeling and forecasting [19]. The model is characterized by a network of three layers of simple processing units connected by links, as shown in Figure 1. The output $y_t$ and the inputs $(y_{t-1}, \ldots, y_{t-p})$ have the following relationship:

$$y_t = w_0 + \sum_{j=1}^{q} w_j \, \sigma\!\left(w_{0j} + \sum_{i=1}^{p} w_{ij}\, y_{t-i}\right), \tag{2}$$

where $w_j$ and $w_{ij}$ ($i = 0, 1, \ldots, p$; $j = 1, 2, \ldots, q$) are parameters often called connection weights, $\sigma(\cdot)$ is the hidden layer activation function, $p$ is the number of input nodes, and $q$ is the number of nodes in the hidden layer. There are several types of activation function. The most widely used activation function for the output layer is the linear function, as a nonlinear activation function may introduce distortion to the predicted output. The log-sigmoid and hyperbolic tangent functions are often used as the hidden layer activation function.

Hence, the neural network model of (2) acts as a nonlinear function mapping from past observations to the future value $y_t$. The function can be represented as

$$y_t = f(y_{t-1}, \ldots, y_{t-p}, \mathbf{w}),$$

where $\mathbf{w}$ is a vector of all parameters and $f(\cdot)$ is the mapping function that we want to find. The function learning process uses a back propagation training algorithm [20] to minimize the error $E$ between the output results and the desired results. This minimization is done by adjusting the parameters $w_{ij}$ of the neural network by an amount $\Delta w_{ij}$ according to the following formula:

$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}},$$

where $\eta$ is the learning rate. Finally, the estimated model is evaluated using a separate hold-out sample that has not been exposed to the training process.
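The forward mapping and the weight update can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes a hyperbolic tangent hidden layer, a linear output node, a mean squared error loss, and plain gradient descent with a hypothetical learning rate `eta`. The function names are invented for the example.

```python
import numpy as np

def init_params(p, q, seed=0):
    """Random starting weights for a p-input, q-hidden-node, 1-output network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(q, p)),  # input-to-hidden weights
        "b1": np.zeros(q),                         # hidden-node biases
        "w2": rng.normal(scale=0.5, size=q),       # hidden-to-output weights
        "b2": 0.0,                                 # output bias
    }

def forward(params, X):
    """Return the network outputs and the hidden activations for inputs X (n x p)."""
    H = np.tanh(X @ params["W1"].T + params["b1"])  # hidden layer, shape (n, q)
    y_hat = H @ params["w2"] + params["b2"]         # linear output layer
    return y_hat, H

def train_step(params, X, y, eta=0.05):
    """One back propagation step; returns the loss measured before the update."""
    y_hat, H = forward(params, X)
    err = y_hat - y
    loss = 0.5 * np.mean(err ** 2)
    d_out = err / len(y)                                     # dE/dy_hat
    d_hid = np.outer(d_out, params["w2"]) * (1.0 - H ** 2)   # back through tanh
    # Each parameter moves by -eta times its error gradient.
    params["w2"] -= eta * (H.T @ d_out)
    params["b2"] -= eta * d_out.sum()
    params["W1"] -= eta * (d_hid.T @ X)
    params["b1"] -= eta * d_hid.sum(axis=0)
    return loss
```

Calling `train_step` repeatedly on the training set and then scoring a hold-out set with `forward` mirrors the evaluation procedure described above.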

Model of H-P Filter Based Hybrid Neural Network
In this section, we will introduce the proposed forecasting model. The whole system framework will be presented first. Then, the model formulation will be given. Finally, we will discuss the workflow of our forecasting scheme.

Framework of the Proposed Approach.
The module description of our proposed time series data forecast framework is presented in Figure 2. As shown in Figure 2, the proposed approach includes three main stages: data preprocessing, data forecasting, and data merge.
In the first stage, the time series price data are passed through the H-P filter. Trend and cyclical components are generated. This decomposition allows us to model the trend and the cyclical fluctuations of the time series separately and more accurately. As the H-P filter is a linear filter, we can merge these two components after forecasting. It must be noted that the trend and the cyclical components are separately learned by $\mathrm{ANN}_T$ and $\mathrm{ANN}_C$. Then, in the next stage, we select suitable features to be applied to the ANN models and forecast each component individually. In the third stage, we use a linear function to merge the two components together as the original data series forecasting result.

Hybrid Model Formulation.
The behavior of vegetable prices may not easily be captured by stand-alone models because the price time series data could include a variety of characteristics such as seasonality, heteroskedasticity, or a non-Gaussian error. Our hybrid model aims to reduce the dimension of the raw data. As the raw data contain multidimensional variables, a more complex function is needed to fit them. The artificial neural network then needs more nodes and more layers to learn the complex function. This also requires more time to train the ANN, which increases the risk of failure and makes the results less accurate. After we have reduced the dimension of the raw data, the artificial neural networks are easier to train and the predictive performance will be improved in the combined models.
Based on Box's [21] work in linear modeling, time series data are considered as a nonlinear function of several past observations and random errors as follows:

$$y_t = f(y_{t-1}, \ldots, y_{t-m}, \varepsilon_{t-1}, \ldots, \varepsilon_{t-n}) + \varepsilon_t, \tag{7}$$

where $f(\cdot)$ is a nonlinear function learned by the neural network, the noise $\varepsilon_t$ is the residual at time $t$, and $m$ and $n$ are integers. Suppose the time series $y_t$ is composed of a trend and a cyclical component:

$$y_t = g_t + c_t,$$

where $g_t$ and $c_t$ are the trend and the cyclical components. Each component is modeled by its own network:

$$g_t = f_T(y_{t-1}, \ldots, y_{t-p}), \qquad c_t = f_C(y_{t-1}, \ldots, y_{t-p}), \tag{9}$$

where $f_T(\cdot)$ and $f_C(\cdot)$ are the functions learned by $\mathrm{ANN}_T$ and $\mathrm{ANN}_C$, respectively.
As we can see from (9), we still use the original data as the inputs for the two neural networks, and the output result is the trend value and the cyclical value. Then, in the forecasting process, we use the following function to compute the future data:

$$\hat{y}_t = \hat{g}_t + \hat{c}_t,$$

where $\hat{g}_t$ and $\hat{c}_t$ are the trend and cyclical forecasts produced by $\mathrm{ANN}_T$ and $\mathrm{ANN}_C$.
The training scheme consists of the following steps.
Step 1. Prepare the raw data of the vegetable price time series $y_t$ and compute the cycle $\tau$ of the time series data using the Fourier transform.
Step 2. Preprocess the raw price data using the H-P filter and generate two series of data: $g_t$ and $c_t$.
Step 3. According to the time series data cycle $\tau$, build the training set for the neural network.
Step 4. Use a heuristic algorithm to select the optimal number of neurons in the hidden layers of $\mathrm{ANN}_T$ and $\mathrm{ANN}_C$ and initialize the parameters of each neural network.
Step 5. Use the back propagation algorithm to train the neural network.
Step 6. Repeat Steps 4 and 5 until the best fitness value satisfies the minimum requirement, or the given count of total generations is reached.
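Steps 1 and 3 can be illustrated with a short sketch. It assumes that the cycle is read off the largest peak of the discrete Fourier spectrum and that each training sample pairs one cycle of consecutive observations with the component value that follows it; the helper names `dominant_period` and `make_windows` are hypothetical. For a strongly trending series the lowest frequencies may dominate the spectrum, so detrending before this step may be necessary.

```python
import numpy as np

def dominant_period(y):
    """Estimate the cycle length (in samples) from the largest FFT peak."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                        # remove the zero-frequency offset
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(y.size, d=1.0)  # cycles per sample
    k = 1 + int(np.argmax(spectrum[1:]))    # skip the zero-frequency bin
    return int(round(1.0 / freqs[k]))

def make_windows(inputs, targets, period):
    """Pair each run of `period` inputs with the target that follows it."""
    inputs = np.asarray(inputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if inputs.size != targets.size or inputs.size <= period:
        raise ValueError("series must have equal length greater than period")
    X = np.array([inputs[i:i + period] for i in range(inputs.size - period)])
    t = targets[period:]
    return X, t
```

In the scheme described here, `inputs` would be the raw prices and `targets` would be either the trend or the cyclical series from Step 2, giving one training set per network.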
The proposed forecasting scheme is briefly described in the following steps.
Step 1. Select the latest data within a cycle as the input data.
Step 2. Preprocess these data using the H-P filter.
Step 3. Use the trained $\mathrm{ANN}_T$ and $\mathrm{ANN}_C$ to forecast the price at the next date.
Step 4. Combine the results of the two networks as the final forecast price value.
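The merge in Step 4 is a plain sum, as the following sketch shows. It assumes that both trained networks are available as callables that take the latest cycle of prices and return a scalar, following the statement that the original data are the inputs to both networks; `forecast_next`, `predict_trend`, and `predict_cycle` are hypothetical names, and any H-P preprocessing of the window is left to the callables.

```python
from typing import Callable, Sequence

import numpy as np

Predictor = Callable[[np.ndarray], float]

def forecast_next(prices: Sequence[float], period: int,
                  predict_trend: Predictor, predict_cycle: Predictor) -> float:
    """One-step forecast: trend forecast plus cyclical forecast."""
    if len(prices) < period:
        raise ValueError("need at least one full cycle of prices")
    window = np.asarray(prices[-period:], dtype=float)  # latest cycle only
    trend_hat = float(predict_trend(window))   # output of the trend network
    cycle_hat = float(predict_cycle(window))   # output of the cyclical network
    return trend_hat + cycle_hat               # linear merge of the two parts
```

Because the H-P decomposition is additive, no weighting is needed when the two forecasts are recombined.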

Experimental Results
In this section, we compare two popular forecasting models with our proposed method. The time series consist of monthly price data for five types of vegetable from 2012 to 2013. We randomly choose the price data for one cycle as the testing dataset and the remaining data as the training dataset. All the experiments are done using Matlab R2013b on the Windows 7 platform. We use the neural network function in the Matlab toolbox [22]. We first study the characteristics of the prices of those five kinds of vegetable.

Study of Vegetable Price Data.
We choose the prices of five types of vegetable: cabbages, peppers, cucumbers, green beans, and tomatoes. The price trends of the five types of vegetable are shown in Figure 4. As we can see from Figure 4, the price series data have a significant cyclical character. This cycle is mainly influenced by the seasonality of agricultural production. We can also note that there is a growth trend in this series data. This is mainly due to the effects of inflation.
In Table 1, we illustrate the price characteristics for the five types of vegetable. They have very similar cycles as they have annual changes. The prices of peppers and green beans have a higher variance than the other prices. It can be seen from Figure 4 that the prices of those two vegetables have relatively larger fluctuations than the prices of the others.

Experiment Modeling.
We implement two traditional forecasting models, ARIMA and ANN, to compare with the proposed model.

ARIMA Modeling.
In the present study, several trials were carried out to choose the optimal ARIMA model parameters. The model parameters that satisfy the statistical residual diagnostic checking were chosen for the ARIMA forecasting model. In an ARIMA($p$, $d$, $q$) model, the future value of a variable is assumed to be a linear function of the past observations and random errors; that is, the underlying process that generates the time series with the mean $\mu$ has the form

$$\phi(B)\,\nabla^{d}\,(y_t - \mu) = \theta(B)\,\varepsilon_t,$$

where $y_t$ and $\varepsilon_t$ are the actual value and random error at time period $t$, $B$ is the backward shift operator, $\nabla = 1 - B$, and $\phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i$ and $\theta(B) = 1 - \sum_{j=1}^{q} \theta_j B^j$ are polynomials in $B$ of degree $p$ and $q$.
From Tables 2, 3, 4, 5, and 6, we can see that the ARIMA models can basically predict the results. However, the parameters have to be recalculated for each model, and there is no easy way to find them. The artificial neural network is easier to use because we just need to determine the number of layers. From Figure 5, we can see that the dashed lines are very close to the solid lines. This means that our ANN model can successfully learn the patterns of the time series price data and predict the future results. The neural network is more robust than the ARIMA model, because it does not require us to analyze the characteristics of the original data and find suitable parameters. ANN is easier to use and has similar accuracy.
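As a worked illustration, consider the ARIMA(1, 1, 1) case. The derivation below assumes the standard Box-Jenkins form $\phi(B)\nabla^{d}(y_t - \mu) = \theta(B)\varepsilon_t$, with $B$ the backward shift operator and $\nabla = 1 - B$; the orders are chosen only for illustration and are not the orders fitted in this study.

```latex
% ARIMA(1,1,1): \phi(B) = 1 - \phi_1 B, \theta(B) = 1 - \theta_1 B, d = 1.
% The mean \mu drops out because \nabla \mu = 0.
\begin{align*}
(1 - \phi_1 B)(1 - B)\, y_t &= (1 - \theta_1 B)\, \varepsilon_t \\
\bigl(1 - (1 + \phi_1) B + \phi_1 B^2\bigr)\, y_t &= \varepsilon_t - \theta_1 \varepsilon_{t-1} \\
y_t &= (1 + \phi_1)\, y_{t-1} - \phi_1\, y_{t-2} + \varepsilon_t - \theta_1 \varepsilon_{t-1}
\end{align*}
```

The last line shows the future value written explicitly as a linear function of two past observations and one past error, which is the structure that has to be identified separately for each vegetable series.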

Hybrid Modeling.
Our proposed model uses the same data from the previous 12 months as the testing dataset. First, we need to extract the trend and cyclical components. Figure 6 is an example for the pepper price data after using an H-P filter. We use the decomposed data as the training datasets. The forecast results are shown in Figure 7. The neural network uses the same structure as we used in Section 5.2.2. As the cycle of the price data is 12 months, we use the latest 12 months' data as the input datasets. Figure 7 shows a cycle's real data and the predicted results. We can see that the proposed method has good performance in learning the trend and cyclical patterns of the original data. The forecasting results are more accurate than the traditional results. The performance comparison is presented in the next section.

Model Verification and Comparison.
To illustrate the accuracy of the method, two different forecast consistency measures are used for the different types of vegetables. The root mean squared error (RMSE, (13)) is used as the error criterion; it is the square root of the mean squared difference between the actual and forecast values. The mean absolute error (MAE, (14)) is also employed as a performance indicator. The RMSE and the MAE are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}, \tag{13}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|, \tag{14}$$

where $y_t$ is the actual price, $\hat{y}_t$ is the forecast price, and $n$ is the number of forecast points.
Table 7 contains a statistical analysis of the performance of the three forecasting models. We can see that our proposed model has the best performance in predicting the future prices for the vegetables. The ARIMA model is unstable for different kinds of time series data. It did well for the cabbage price data but worse for the pepper price data. This is likely because we did not find the best parameters for the ARIMA model. Our proposed model is more stable when handling different kinds of time series data. The ANN model has intermediate performance.
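Both measures are straightforward to compute. The sketch below assumes the standard definitions, the square root of the mean squared error and the mean of the absolute errors over the forecast points, without any normalization; the function names are illustrative.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error between two equally long series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mae(actual, forecast):
    """Mean absolute error between two equally long series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(actual - forecast)))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both indicates whether a model's error is dominated by a few badly predicted months.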

Conclusion
Time series forecasting is one of the most important quantitative modeling tasks and has received a considerable amount of attention in the literature. This study presents a novel adaptive approach to extending the artificial neural network model; adaptive metrics of the inputs and a new mechanism for mixing the outputs are proposed for time series prediction. Due to the individual modeling of the trend and cyclical components, the forecasting accuracy is improved. The experimental results generated by a set of consistent performance measures with different metrics (RMSE, MAE) show that this new method can improve the accuracy of time series prediction. The performance of the proposed method is validated by time series data for five types of vegetable.

Figure 1: Structure of the three layers of a neural network.

Figure 2: Framework of the proposed forecasting approach.

Figure 3: The overall flowchart of the proposed forecasting approach.

Forecasting Scheme. The overall flowchart of the proposed H-P filter based neural network forecasting is shown in Figure 3. The proposed training scheme is briefly described in the following steps.

Figure 5: Regression of the predictive data using ANN.
ANN Modeling. A three-layer feed-forward neural network model was developed for the prediction of the price series data using an optimized back propagation training algorithm. In the present study, the scaled conjugate gradient algorithm was selected as the optimized training method. The network structure is shown in Figure 1. We choose 12 neural nodes for the input data and 8 nodes in the hidden layer. The output has one node with the "purelin" function. In what follows, artificial neural network model performances were validated for price prediction under a monthly time-step condition. The forecast results are shown in Figures 5(a), 5(b), 5(c), 5(d), and 5(e) for the prices of the five different types of vegetable for one season.

Figure 7: Regression of the proposed predictive data and real data.

Table 1: The price characteristics of the five types of vegetable.