Simulated Annealing Based Hybrid Forecast for Improving Daily Municipal Solid Waste Generation Prediction

A simulated annealing (SA) based variable weighted forecast model is proposed to combine and weigh a local chaotic model, an artificial neural network (ANN), and a partial least square support vector machine (PLS-SVM) to build a more accurate forecast model. The hybrid model was built and its multistep ahead prediction ability was tested on daily municipal solid waste (MSW) generation data from Seattle, Washington, the United States. The hybrid forecast model was shown to produce more accurate and reliable results and to degrade less in longer predictions than the three individual models. The average one-week step ahead prediction error (MAPE) was reduced from 11.21% (chaotic model), 12.93% (ANN), and 12.94% (PLS-SVM) to 9.38%. The five-week average was reduced from 13.02% (chaotic model), 15.69% (ANN), and 15.92% (PLS-SVM) to 11.27%.


Introduction
Municipal solid waste (MSW) refers to discarded materials, such as trash or garbage, produced from daily life in urban areas. As a result of rapid urbanization and dynamic population fluctuation in recent years, large amounts of MSW have been generated and have led to environmental pollution and health problems [1][2][3]. Therefore, an efficient and accurate MSW management system is essential for the planning and management of entire urban systems, even though difficulties exist due to the high complexity and uncertainty of MSW amounts. Specifically, accurate prediction of MSW generation is a crucial and fundamental step in a solid waste management system, as it provides guidance for allocating transport and disposal resources. Among the many proposed MSW prediction methods, time series analysis is an effective method for determining the temporal trend and patterns from historical data [4]. Compared with multivariable models involving demographic, socioeconomic, and other explanatory variables, the time series analysis method is more flexible and feasible because its data are easier to acquire.
To date, a number of research efforts have explored effective methods for MSW generation prediction using time series analysis. Conventional statistical models have been widely used since the 1970s. Linear time series models, such as the seasonal autoregressive integrated moving average (sARIMA) model, have been used by Navarro-Esbri et al. [5] and Xu et al. [6] to simulate and predict MSW time series data. Other studies treated MSW generation as a nonlinear complex system and introduced machine learning methods to mine past patterns and make predictions. Grey system theory [1,6,7], artificial neural networks [8,9], and support vector machines [10][11][12][13][14] are the most widely used for monthly and weekly MSW generation predictions. These algorithms use a stochastic model to determine the optimum black-box model in the training process and then apply it to further prediction. Other algorithms, however, use nonlinear dynamic methods for long term MSW prediction. Navarro-Esbri et al. [5] proposed a nonlinear dynamics algorithm based on state space reconstruction and a fitting function for the trajectory of the system. Song and He [15] proposed a chaotic model based on nearest neighbor state space reconstruction for multistep ahead MSW generation prediction. These nonlinear dynamic system methods are based on the embedding theorem of Takens [16] and aim to rebuild the dynamic behavior from discrete time series data with a time delay and an embedding dimension.
Hybrid forecasts or hybrid models have long been used in power load [17,18], economics [19,20], hydrology [21,22], and many other fields. However, the potential of hybrid models for MSW generation prediction has seldom been explored. Xu et al. [6] proposed a fixed weighted hybrid forecast method integrating sARIMA and grey system theory (GM (1, 1)). However, a fixed weighted hybrid forecast is suitable only for medium and long term prediction, because the performance of the individual models changes constantly. At the time of this writing, there was still no report of using a variable weighted hybrid forecast for daily MSW generation. In this study, a simulated annealing (SA) based variable weighted hybrid method is proposed to combine the chaotic model, the ANN model, and the partial least square support vector machine (PLS-SVM) model for daily municipal solid waste prediction, and the results show that the hybrid forecast is more accurate than each of the three individual models.
In this paper, Section 2 introduces the basic theory of the SA based hybrid model and three candidate models: ANN, PLS-SVM, and the chaotic model. Section 3 compares the time series modeling results between the hybrid model and three individual models, discusses the accuracy and stability of the proposed method, and summarizes the pros and cons. Finally, Section 4 provides the main conclusions of the study.

Data.
Daily MSW garbage generation data were obtained from the Seattle Public Utilities website. The data cover 1001 consecutive days from January 2011 to September 2013. The whole dataset was split into three parts: the first 851 values, the middle 50 values, and the last 100 values. The first 851 values were used to train the three models, the middle 50 values were used to decide the initial weights, and the last 100 values were used for comparison to evaluate the model performances.
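The three-way split can be illustrated with a short Python sketch. The function name `split_series` and the placeholder series are hypothetical and only stand in for the 1001 daily observations; the part sizes are the ones stated above.

```python
def split_series(series, n_train=851, n_init=50, n_test=100):
    """Split a series into training, initial-weight, and evaluation parts."""
    if len(series) != n_train + n_init + n_test:
        raise ValueError("series length does not match the requested split")
    train = series[:n_train]
    init_part = series[n_train:n_train + n_init]
    test = series[n_train + n_init:]
    return train, init_part, test


# Placeholder values standing in for the 1001 daily observations.
daily_msw = list(range(1, 1002))
train, init_part, test = split_series(daily_msw)
```

Keeping the split chronological (no shuffling) matters here, because the later parts are used to judge forecasts made from the earlier ones.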

SA Based Hybrid Model.
As the real world is often complex in nature, no individual model can fit all conditions when making predictions. Instead, hybrid forecast methods can compensate for the errors caused by using a single model and make more precise predictions to some extent [20]. Considering the different performances and accuracies of individual models, a variable weighted hybrid forecast method based on historical performance makes full use of all integrated models and produces better predictions [23]. In this study, the SA method was adopted to determine the weights for each forecast step.
The SA based hybrid forecast weighs the individual forecast methods based on their prior performance. As each model is stable in the short term, the best weight combination for the past time window will remain optimal for some time. As formula (1) shows, the hybrid model forecast ($\hat{y}_t$) is composed of the individual model forecasts with corresponding weights. The optimum linear weight assignment is carried out through SA, which is described later in this section. Consider

$$\hat{y}_t = \sum_{i=1}^{n} w_i\, \hat{y}_{i,t}, \tag{1}$$

where $\hat{y}_{i,t}$ is the forecast of the $i$th individual model at time $t$, $w_i$ is its weight, and $n$ is the number of individual models.

Figure 1 provides the basic structure of the SA based hybrid model. Note that there is a time window for picking the training data for the optimizer. This is because the MSW system is highly susceptible to its environment, such as economics or weather. Earlier observations may contain different patterns, which can influence each individual model differently. As urban areas do not change dramatically, an appropriate time window will provide relevant training data for the optimizer to determine the best weight combination.

Developed in 1983, simulated annealing is a heuristic algorithm for optimization [24]. It has been widely used in a broad range of application areas ever since. Based on thermodynamics, the method is controlled by a sequence of Markov chains generated while the temperature of the system is gradually decreased. In this paper, all the weights form a state vector, and SA searches for the optimum one in the state space by applying randomly generated states to the training data. For any pair of states $i$ and $j$, the probability of the system moving from state $i$ to state $j$ is denoted by $G_{i,j}(t)$ and the probability of acceptance is denoted by $A_{i,j}(t)$. Each neighbor state $j$ in the neighborhood $N(i)$ is selected with a probability $G_{i,j}(t)$ that is inversely related to its distance from $i$. The probability of accepting an inferior state is $\exp((f(i) - f(j))/t)$, where $f$ is the objective function to be minimized and $t$ is the temperature. This expression indicates that (1) the worse the state is, the less likely it is to be accepted.
(2) As the temperature declines, the probability of accepting an inferior state becomes smaller and smaller. As Dowsland and Thompson [25] point out, the performance of this algorithm is largely dependent on the cooling schedule, which is the annealing rate of the temperature ($t$). Consider

$$t_{k+1} = \alpha\, t_k,$$

where, for any iteration $k$, $\alpha \in (0, 1)$ is the decay factor.
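The weight search can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the weights are kept nonnegative and normalised to sum to one, MAPE over the time window is the objective, neighbour states are drawn by a uniform perturbation (a simplification of the inverse distance selection), and the temperature is multiplied by a constant decay factor. The function names, the starting temperature `t0`, `moves_per_temp`, and the toy data are hypothetical; the defaults 0.95, 0.02, and 1e-9 mirror the decay scale, step factor, and tolerance reported in the results, but their mapping onto these parameters is an interpretation.

```python
import math
import random


def combine(weights, forecasts):
    """Weighted sum of the individual model forecasts at each time step."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def neighbor(weights, step, rng):
    """Perturb each weight, clip at zero, and renormalise to sum to one."""
    moved = [max(0.0, w + rng.uniform(-step, step)) for w in weights]
    total = sum(moved)
    if total == 0.0:
        return list(weights)
    return [m / total for m in moved]


def anneal_weights(observed, forecasts, t0=1.0, decay=0.95, step=0.02,
                   tol=1e-9, moves_per_temp=20, seed=0):
    """Search for weights minimising the MAPE of the combined forecast."""
    rng = random.Random(seed)
    n = len(forecasts)
    current = [1.0 / n] * n
    cost = mape(observed, combine(current, forecasts))
    best, best_cost = current, cost
    t = t0
    while t > tol:
        for _ in range(moves_per_temp):
            cand = neighbor(current, step, rng)
            cand_cost = mape(observed, combine(cand, forecasts))
            # Always accept improvements; accept an inferior state with
            # probability exp((f(i) - f(j)) / t).
            if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / t):
                current, cost = cand, cand_cost
                if cost < best_cost:
                    best, best_cost = current, cost
        t *= decay  # geometric cooling (assumed schedule)
    return best, best_cost


# Toy data (made up): three models that are 10% high, 10% low, and 30% high.
observed = [100.0, 120.0, 90.0, 110.0, 105.0, 95.0]
forecasts = [[1.1 * o for o in observed],
             [0.9 * o for o in observed],
             [1.3 * o for o in observed]]
start_cost = mape(observed, combine([1.0 / 3] * 3, forecasts))
best_w, best_cost = anneal_weights(observed, forecasts)
```

In a rolling forecast, `observed` and `forecasts` would hold only the values inside the chosen time window, and the search would be repeated at each forecast step.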

A Univariate Chaotic Model.
First proposed by Packard et al. [26] and Takens [16], phase space reconstruction is a technique for rebuilding a nonlinear dynamic system from a discrete time series by using a time delay and an embedding dimension. For a time series $x = \{x_1, x_2, x_3, \ldots, x_N\}$ with length $N$, the phase space coordinates can be expressed in the form of vectors $X = \{X_1, X_2, X_3, \ldots, X_M\}$, where the length $M = N - (m - 1)\tau$; $m$ denotes the embedding dimension; $\tau$ denotes the time delay. The coordinate of each point in phase space is

$$X_i = \left(x_i, x_{i+\tau}, x_{i+2\tau}, \ldots, x_{i+(m-1)\tau}\right).$$

In this way, the univariate time series is reconstructed as a multivariate time series, and its future status can be predicted by analyzing the evolution of this dynamic system. One of the most commonly used analysis methods is the local forecast approach. Based on the self-similarity of the chaotic attractor, in which the trajectory of the current point is similar to the trajectories of its neighboring points, it searches for several neighboring points and makes a forecast by averaging the next points of these neighbors. In this method, the distance $d_i$ of the $i$th selected point $X_i$ to the current point $X_M$ is calculated first. Each point is then assigned the weight $P_i = e^{-(d_i/d_{\min})}$, where $d_{\min}$ is the minimum distance.
The $k$ step forecast is the weighted average of the $k$ step ahead neighboring points,

$$X_{M+k} = \frac{\sum_{i=1}^{q} P_i\, X_{i+k}}{\sum_{i=1}^{q} P_i},$$

where $X_i$ is the $i$th neighboring point of the current point, $X_{i+k}$ is the point $k$ steps ahead of the $i$th point, and $q$ is the number of neighboring points.
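A compact Python sketch of the delay embedding and the local weighted forecast is given here. It is an illustration under assumptions rather than the study's code: the names `embed` and `local_forecast`, the parameters `m` (embedding dimension), `tau` (time delay), `q` (number of neighbours), and `k` (steps ahead), the 0-based indexing, and the small floor on the minimum distance (to avoid division by zero when a neighbour matches exactly) are all choices of this sketch.

```python
import math


def embed(series, m, tau):
    """Delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}), 0-based i."""
    count = len(series) - (m - 1) * tau
    return [[series[i + j * tau] for j in range(m)] for i in range(count)]


def local_forecast(series, m, tau, q, k):
    """Forecast the value k steps after the last observation in `series`."""
    points = embed(series, m, tau)
    current = points[-1]
    # Candidate neighbours must have a point k steps ahead of them.
    ranked = sorted((math.dist(points[i], current), i)
                    for i in range(len(points) - k))
    if not ranked:
        raise ValueError("series too short for this m, tau, and k")
    nearest = ranked[:q]
    # Floor the minimum distance so exact matches do not divide by zero
    # (an assumption of this sketch).
    d_min = max(nearest[0][0], 1e-12)
    weights = [math.exp(-d / d_min) for d, _ in nearest]
    # The last coordinate of the point k steps ahead is the newest value.
    successors = [points[i + k][-1] for _, i in nearest]
    return sum(w * s for w, s in zip(weights, successors)) / sum(weights)


# Toy weekly pattern (made up) repeated for ten weeks.
pattern = [10, 12, 11, 13, 12, 5, 4]
series = [pattern[i % 7] for i in range(70)]
one_step = local_forecast(series, m=7, tau=4, q=9, k=1)
three_step = local_forecast(series, m=7, tau=4, q=9, k=3)
```

On this strictly periodic toy series the exact-match neighbours dominate the weights, so the forecasts reproduce the pattern; on real data the weights are spread across the nearest neighbours.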

Artificial Neural Network.
Zade and Noori [8] first adopted a multilayer perceptron with back propagation for predicting MSW generation. Figure 2 shows the three-layer neural network (input layer, hidden layer, and output layer) with a feed forward structure including back propagation of error.
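The input of this network is a time window of observations from $x_{t-d}$ to $x_t$. The output of the network is the predicted result $\hat{x}_{t+1}$. In the hidden layer,

$$\mathrm{Net}_j = \sum_{i=0}^{d} v_{ij}\, x_{t-i}, \qquad h_j = f_H(\mathrm{Net}_j),$$

where $\mathrm{Net}_j$ represents the $j$th node in the hidden layer, $v_{ij}$ is the weight connecting the $i$th input to that node, and $f_H$ represents the activation function of a node. This study adopts the sigmoid function used by Zade and Noori [8] as follows:

$$f_H(x) = \frac{1}{1 + e^{-cx}},$$

where $c$ is a constant. In the output layer, the prediction is calculated through the following function:

$$\hat{x}_{t+1} = f_0\Big(\sum_{j} w_j\, h_j\Big),$$

where $w_j$ is the corresponding weight for each hidden node and $f_0$ is the activation function. In this study, we use the commonly used linear function. Both $w$ and $v$ are assigned random values initially and are then adjusted by the delta rule derived from the learning samples.

[Figure 2: three-layer feed forward neural network with input layer, hidden layer, and output layer.]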
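The forward pass of such a three-layer network can be sketched in Python. This is a hedged illustration only: the names `sigmoid` and `forward`, the layout of the weight lists, the omission of bias terms, and the random toy weights are assumptions of the sketch, and the inputs are assumed to be scaled so that the exponential does not overflow.

```python
import math
import random


def sigmoid(x, c=1.0):
    """Sigmoid activation with steepness constant c."""
    return 1.0 / (1.0 + math.exp(-c * x))


def forward(window, v, w, c=1.0):
    """One forward pass: sigmoid hidden layer, linear output layer.

    window: the most recent observations (the network input)
    v:      input-to-hidden weights, one row per hidden node
    w:      hidden-to-output weights, one value per hidden node
    """
    hidden = [sigmoid(sum(v_ij * x for v_ij, x in zip(row, window)), c)
              for row in v]
    return sum(w_j * h for w_j, h in zip(w, hidden))


# Randomly initialised weights (toy sizes: 7 inputs, 4 hidden nodes).
rng = random.Random(0)
v = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(4)]
w = [rng.uniform(-1, 1) for _ in range(4)]
prediction = forward([0.2, 0.4, 0.3, 0.5, 0.4, 0.1, 0.1], v, w)
```

Training would then adjust `v` and `w` from the prediction errors on the learning samples; that step is left out to keep the sketch short.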

Partial Least Square Support Vector Machine.
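Originally proposed by Wold [27], partial least square (PLS) assumes that $X$ is an $n \times m$ matrix and $Y$ is an $n \times p$ matrix (with $p = 1$ for the single response considered here). To build a PLS model, $X$ is regressed onto the $X$-scores ($T$) to predict the $Y$-scores ($U$); then $T$ and $U$ in turn are used to predict the responses $Y$. As a result, $X$ and $Y$ can be expressed as $X = TP^{T} + E$ and $Y = UQ^{T} + F$, where $P$ ($m \times a$) is the $X$-loadings, $Q$ ($1 \times a$) is the $Y$-loadings, $T$ ($n \times a$) is the $X$-scores, $U$ ($n \times a$) is the $Y$-scores, $E$ ($n \times m$) is the $X$-residuals, $F$ ($n \times 1$) is the $Y$-residuals, and $a$ is the number of components.

The details of the underlying principle of the support vector machine (SVM) have been well described in the literature (Vapnik et al. [28]; Vapnik [29]). It assumes that the relationship between the dependent and independent variables is given by a deterministic function $f(x)$:

$$f(x) = w \cdot \Phi(x) + b, \tag{8}$$

where $w$ and $b$ are the coefficients to be estimated and $\Phi$ represents a nonlinear transformation from the input space to a high dimensional feature space. The goal of SVM is to find the values of $w$ and $b$ by minimizing the regression risk:

$$R_{\mathrm{reg}}(f) = C \sum_{i=1}^{n} \Gamma\big(f(x_i) - y_i\big) + \frac{1}{2}\lVert w \rVert^{2},$$

where $C$ is a constant, $\Gamma(\cdot)$ is a cost function, and the vector $w$ can be written in terms of the data points as

$$w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\, \Phi(x_i). \tag{10}$$

Substituting (10) into (8), the generic equation can be rewritten as

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\, \big(\Phi(x_i) \cdot \Phi(x)\big) + b = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\, k(x_i, x) + b.$$

The dot product can be replaced with the kernel function $k(x_i, x)$. The $\varepsilon$-insensitive loss function is the most widely used cost function [30]. The function is in the following form:

$$\Gamma\big(f(x) - y\big) = \begin{cases} \lvert f(x) - y \rvert - \varepsilon, & \lvert f(x) - y \rvert \ge \varepsilon, \\ 0, & \text{otherwise}. \end{cases}$$

The regression risk and the $\varepsilon$-insensitive loss function can be minimized by solving the following quadratic optimization problem:

$$\min_{\alpha, \alpha^{*}} \; \frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\, k(x_i, x_j) + \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^{*}) - \sum_{i=1}^{n} y_i\, (\alpha_i - \alpha_i^{*}),$$

subject to

$$\sum_{i=1}^{n} (\alpha_i - \alpha_i^{*}) = 0, \qquad \alpha_i, \alpha_i^{*} \in [0, C],$$

where $\alpha_i$ and $\alpha_i^{*}$ are Lagrange multipliers. The variable $b$ can be computed by applying the Karush-Kuhn-Tucker (KKT) conditions.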
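Two building blocks of the support vector regression described above, the insensitive loss and the kernel form of the prediction function, can be written out in Python. This is a sketch under assumptions: the function names are hypothetical, the Gaussian (RBF) kernel is chosen only for illustration because the text does not name the kernel, and the coefficients (the differences of the two Lagrange multipliers) and the offset `b` are taken as given rather than obtained from the quadratic program.

```python
import math


def eps_insensitive(residual, eps):
    """Loss that ignores residuals smaller than eps in absolute value."""
    return max(0.0, abs(residual) - eps)


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel (an assumed choice for this sketch)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def svr_predict(x, support, coef, b, kernel=rbf_kernel):
    """Kernel expansion of the regression function.

    coef[i] holds the difference of the two multipliers of support[i].
    """
    return sum(c * kernel(s, x) for c, s in zip(coef, support)) + b


# Toy values (made up) to show the call pattern.
support = [[0.0, 0.0], [1.0, 1.0]]
coef = [2.0, -0.5]
value = svr_predict([0.0, 0.0], support, coef, b=1.0)
```

Only points with a nonzero coefficient contribute to the sum, which is why the prediction depends on a subset of the training data.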

Results and Discussion
To assess the performance of the hybrid model and the three individual models, three indices of prediction error were calculated: mean absolute percentage error (MAPE), root mean square error (RMSE), and coefficient of determination ($R^2$).
MAPE is a commonly used index that measures accuracy as a percentage in time series forecasting. It is defined as follows:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|,$$

where $y_t$ is the observation, $\hat{y}_t$ is the forecast, and $n$ is the number of forecasts. RMSE is frequently used to indicate the sample standard deviation of the differences between forecast and observation. It is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}.$$

The coefficient of determination ($R^2$) is a measure of the linear correlation of two variables. It indicates how well the observation and forecast values fit a line. Consider

$$R^2 = 1 - \frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{n} (y_t - \bar{y})^2},$$

where $\bar{y}$ is the mean value of $y_t$.
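The three indices are straightforward to compute. The Python sketch below is illustrative: the function names and the toy numbers are made up, and the coefficient of determination is written in the residual sum of squares form, which is one common definition and is assumed here.

```python
import math


def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Toy observations and forecasts (made up).
obs = [100.0, 200.0, 300.0]
pred = [110.0, 190.0, 300.0]
scores = (mape(obs, pred), rmse(obs, pred), r_squared(obs, pred))
```

MAPE is undefined when an observation is zero, so days with no recorded waste would need to be handled separately.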
In this study, we adopted three commonly used models as individual models: the chaotic model, ANN, and PLS-SVM. The embedding dimension, time delay, and number of nearest neighbors of the chaotic model were estimated by heuristic analysis to be 7, 4, and 9, respectively. Since previous literature reported that an advanced ANN model, the nonlinear autoregressive model with exogenous input (NARX), outperforms standard neural network based predictors [31,32], NARX is employed as the ANN model. The time delays of both the ANN and PLS-SVM models were also estimated as 7 by heuristic analysis. This is consistent with the 7 days of a week, as daily MSW is a pseudoperiodic series. After training the three models with the first part of the data (1 to $851 - k + 1$, where $k$ is the number of steps ahead), $k$-step ahead predictions were carried out in the second part ($852 - k + 1$ to $900 - k + 1$). In order to obtain reliable weights, SA based weight estimation was implemented from data point $901 - k + 1$ onward. In each prediction step, the ANN and PLS-SVM models were run 5 times and the median performance was recorded for the hybrid forecast, to avoid unstable results caused by the randomness involved in these models. In the SA hybrid forecast, the decay scale, step factor, and tolerance are set to 0.95, 0.02, and $1.0 \times 10^{-9}$, respectively.

Figures 3 and 4 illustrate the performance of the SA based hybrid model in one step ahead prediction and the corresponding coefficient of determination. In one step ahead prediction, the average MAPE is lowered from 10.08% (chaotic model), 11.47% (ANN model), and 13.30% (PLS-SVM) to 8.80%. Only 8 days exceed 20% and one day exceeds 40%. These results show that the SA based hybrid model has better forecast ability than any individual model and can be further applied to daily prediction. Note that the high $R^2$ in Figure 4 results from the large gap between workdays (Monday to Friday) and weekends (Saturday and Sunday). If we calculate $R^2$ for these two patterns separately, we get much lower values (0.518 for workdays and 0.656 for weekends).
The MAPE and $R^2$ of the results are inferior to those of the weekly MSW forecasts in previous studies [12,13] because daily MSW is more likely to be influenced by random chance.
To test the multistep ahead forecast ability of the hybrid model, we ran all the models for horizons from 1 day to 5 weeks ahead (Figure 5). All three indices indicate that the SA based hybrid forecast outperforms the three individual models, which suggests that it can be recommended for practical use. A previous report [15] has shown that the chaotic model outperforms the other two models by around 2%. In this study, the hybrid model improves on the accuracy of the chaotic model by a further 1.75%.
Different from weekly MSW generation [8,9,12,13], daily data fluctuate quasiperiodically with a period of 7 days (one week), which is also equal to the time delay estimated for the three individual models. In this paper, we calculated the one-week average MAPE and RMSE from the first week to the fifth week and measured the degradation of each model, as shown in Tables 1 and 2. The degradation is estimated by the average difference between neighboring weeks, as in the following formula:

$$\mathrm{Degradation} = \frac{1}{4} \sum_{i=1}^{4} \left( \mathrm{Param}_{i+1} - \mathrm{Param}_i \right),$$

where $\mathrm{Param}_i$ can be either the MAPE or the RMSE of the $i$th week. Both tables suggest that the SA based hybrid model not only gives better forecasts but also degrades less than the three individual models. This means that the hybrid model is more suitable and robust in multistep ahead prediction. We also accumulated the weights assigned to each of the three models. Comparing the performances of the three individual models in Figure 5, the chaotic model performs better than the other two models, and the ANN model performs the worst. The different weights of the three models are in accordance with these different performances. If a model with very poor performance is included in the hybrid forecast, it will be assigned a very low weight, which makes it practically irrelevant to the hybrid forecast.
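The degradation measure, the average difference between neighboring weeks, can be computed as in the following Python sketch. The function name and the weekly values are hypothetical and are not taken from Tables 1 and 2.

```python
def degradation(weekly_values):
    """Average difference between each pair of neighboring weeks."""
    diffs = [later - earlier
             for earlier, later in zip(weekly_values, weekly_values[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical weekly average MAPE values (percent) for weeks 1 to 5.
weekly_mape = [9.0, 10.0, 10.5, 11.0, 12.0]
rate = degradation(weekly_mape)
```

Because the differences telescope, the result equals the last value minus the first, divided by the number of intervals, so it summarizes the overall drift and is not sensitive to how the error moves in the intermediate weeks.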

Conclusions
With the rapid process of urbanization, a growing amount of municipal solid waste is produced in daily life. Aiming at accurate prediction of MSW generation for policy making and optimum resource allocation, this paper proposed a simulated annealing based variable weighted hybrid forecast method that combines the chaotic model, ANN model, and PLS-SVM model to improve multistep MSW generation forecasts. By applying this SA based hybrid method to the candidate models, more accurate forecasts can be produced. The results of the three indices (MAPE, RMSE, and $R^2$) show that the SA based hybrid forecast not only has better prediction capabilities but also degrades less when applied to longer forecasts.
Previous studies have utilized hybrid models to improve the prediction performance on weekly or monthly MSW time series for midterm and long term forecasting [6,10,11]. However, these studies simply choose the weights beforehand using various criteria and cannot select the best weight combination automatically during the prediction process. Our study integrates the SA method to determine the weights dynamically, which makes full use of all three models and improves on the prediction performance of the existing models. Yet this study is just a preliminary exploration of integrating variable weighted models into daily MSW prediction. In practice, a hybrid model should be tested on historical data before it is applied to real prediction. More studies will be conducted to explore different variable weighted models to support time series prediction in the solid waste area.