Comparison of Stabilization Ability of Models for Hydrological Time Series with a Deterministic Trend

Under the influence of climate change and human activities, deterministic trends have been detected and reported in various hydrometeorological observation records. To model the stochastic properties correctly, the time series has to be stabilized by removing the trend. Both detrending and differencing have been proposed for this task, but the two stabilizing approaches affect the residual series differently. In this study, ARMA models are constructed based on the two stabilization approaches for an annual minimum daily discharge series with a deterministic trend. Comparisons are made with respect to stabilization ability, model simulation, and forecasting. Results indicate that the model based on detrending is superior to the one based on differencing in almost all the selected comparison criteria. Detrending is therefore suggested for removing the deterministic trend before an ARMA model is fitted to the observed data.


Introduction
During the last decades, time series analysis and modeling have received considerable attention for simulation, forecasting, and control. In the literature, the linear time series model, particularly the autoregressive moving average (ARMA) model, has been widely applied to different hydrologic and climatic variables such as rainfall [1], runoff [2], droughts [3], and water quality variables [4]. Theoretically, the trend of any nonlinear system can be approximated by classical regression models. Numerous studies have been undertaken in different regions of the world with respect to trend estimation by various regression techniques [5][6][7][8][9][10]. To be consistent with the practical situation in China, where most annual discharge series follow the Pearson type III (P-III) probability distribution [11,12], it is wiser to fit the observed data using more flexible polynomials based on the P-III distribution than to blindly choose a traditional linear model based on the normal distribution.
The ARMA model is constructed to fit observed data under the assumption that the time series is stationary. However, with the development of human society, the climate all over the world is changing gradually, which may result in nonstationarity of the observed data series in the form of a deterministic trend. This kind of trend has been detected and reported in various hydrometeorological observation records. To model the stochastic properties of the time series correctly, the trend has to be removed so that the stationarity assumption is fulfilled. Traditionally, differencing is used to stabilize a nonstationary series, which leads to the autoregressive integrated moving average (ARIMA) model. However, this approach may seriously damage the structure of the residual series [13]. As an alternative, detrending has been proposed to remove a deterministic trend from a nonstationary time series without distorting the residual series. Li et al. [14] have applied detrending to remove the serial correlation of their streamflow data. No results have been documented on the relative performance of ARMA models based on the two stabilization approaches; comparing them is the main objective of our study. As a case study, we take the annual minimum daily discharge series of the Yangtze River at the Hankou hydrological station for the period 1952-2005 to investigate the stabilization ability.
The paper is organized as follows. First, the ARMA modeling procedure and comparison criteria used in this research are described. Then, ARMA models based on differencing and detrending are constructed to fit the observed discharge series with a deterministic trend. Comparisons are made with different criteria in model calibration and prediction. Finally, suggestions and conclusions are presented.

Methods
2.1. ARMA Model. According to time series theory, difference equations are established based on the observed data series. Among these equations, the ARMA model is the most widely used. The form of an ARMA($p$, $q$) model is usually written as

$$\varphi(B)\,x_t = \theta(B)\,\varepsilon_t,$$

where $\varphi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$; $x_t$ is the observed data series, $\varphi_i$, $\theta_j$ are model parameters, and $p$, $q$ are orders of the model. $B$ is the backward shift operator ($B^k x_t = x_{t-k}$). Random errors $\varepsilon_t$ are assumed to be white noise. The significance of the ARMA($p$, $q$) model is that the current value can be expressed as a combination of the historical observations and white noise.
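To make the recursion behind an ARMA model concrete, the following minimal sketch simulates a series from given coefficients. It assumes the sign convention $x_t = \sum_i \varphi_i x_{t-i} + \varepsilon_t - \sum_j \theta_j \varepsilon_{t-j}$ with Gaussian white noise; the function name `simulate_arma`, its parameters, and the burn-in length are illustrative choices, not taken from the study.

```python
import random

def simulate_arma(phi, theta, n, sigma=1.0, seed=0, burn_in=200):
    """Simulate n values of an ARMA(p, q) process (illustrative sketch).

    Assumed convention: x_t = sum_i phi[i-1] * x_{t-i}
                              + e_t - sum_j theta[j-1] * e_{t-j},
    where e_t is Gaussian white noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    x, e = [], []
    for t in range(n + burn_in):
        e_t = rng.gauss(0.0, sigma)
        # Autoregressive part: weighted sum of the available past values.
        ar = sum(phi[i] * x[t - 1 - i] for i in range(min(p, t)))
        # Moving-average part: weighted sum of the available past shocks.
        ma = sum(theta[j] * e[t - 1 - j] for j in range(min(q, t)))
        x.append(ar + e_t - ma)
        e.append(e_t)
    # Discard the burn-in so the start-up transient does not affect the sample.
    return x[burn_in:]

series = simulate_arma(phi=[0.5], theta=[0.3], n=54)
```

A series of length 54 matches the number of annual values in a 1952-2005 record, which gives a feel for how short such hydrological samples are.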
The procedure of building an ARMA model to fit the observed data series usually includes model identification, parameter estimation, and model diagnosis. The detailed analysis stages are as follows.
(1) Data preparation: transformations of the data (such as square roots or logarithms) can help stabilize the variance in a series where the variation changes with the level. Then, the data are stabilized in the mean level by differencing or detrending. The stabilized data are often easier to model than the original data.
(2) Stationarity and linear correlation checking: checks for stationarity and serial linear correlation are necessary before an ARMA model is established.
(3) Model structure identification: after data preparation and checking, the order of the model can be obtained from the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots.
(4) Parameter estimation: this stage means finding the values of the model coefficients that provide the best fit to the data. There are sophisticated computational algorithms designed to do this. Here, the maximum likelihood method is adopted.
(5) Model checking: this step involves testing the assumptions of the model to identify any areas where the model is inadequate. If the model is found to be inadequate, it is necessary to go back to Step (3) and try to identify a better model.
(6) Model prediction: forecasting is what the whole procedure is designed to accomplish. Once the model has been selected, estimated, and checked, computing forecasts is usually a straightforward task that is carried out numerically.
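The identification stage in Step (3) relies on the sample ACF and PACF. The sketch below shows one common way to compute them, using the standard biased autocorrelation estimator and the Durbin-Levinson recursion; the function names are illustrative, and the source does not state which estimator was used.

```python
def acf(x, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(x, max_lag)
    pac = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1..k-1} from the previous order
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pac.append(phi_kk)
        phi_prev = phi_new
    return pac
```

As a rule of thumb, sample values outside roughly plus or minus 1.96 divided by the square root of the series length are treated as significantly different from zero; an ACF that cuts off after lag q suggests an MA(q) term, and a PACF that cuts off after lag p suggests an AR(p) term.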

Comparison Approach.
The ARMA models built in this study are compared through a multicriteria comparison by applying a set of evaluation metrics [15]. The evaluation metrics can be classified into three groups: (1) metrics that calculate the absolute error, (2) metrics that calculate the relative error, and (3) dimensionless metrics [16].
The evaluation metrics used in this study are as follows.
(1) Metrics for calculating absolute errors: absolute maximum error (AME), peak difference (PD), mean absolute error (MAE), and root mean squared error (RMSE).
(2) Metrics for calculating relative errors: relative absolute error (RAE) and mean relative error (MRE).
(3) Dimensionless metrics: coefficient of determination ($R$-squared), coefficient of efficiency (CE), and index of agreement (IA).
In these metrics, $x_t$ is the observed time series, $\hat{x}_t$ is the predicted time series, and $\bar{x}$ and $\bar{\hat{x}}$ are the means of the observed and predicted time series, respectively.
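The formulas for these metrics are not reproduced in this text, so the following sketch uses definitions that are common in the hydrological literature. They are assumptions and may differ in detail from the ones applied in the study; in particular, MRE is taken here as a signed mean of the errors relative to the observations. The function name `evaluation_metrics` is illustrative.

```python
import math

def evaluation_metrics(obs, pred):
    """Compute nine fit metrics with commonly used (assumed) definitions."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    err = [o - p for o, p in zip(obs, pred)]
    sse = sum(e * e for e in err)
    sst = sum((o - mean_obs) ** 2 for o in obs)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return {
        # Absolute-error metrics
        "AME": max(abs(e) for e in err),
        "PD": max(obs) - max(pred),
        "MAE": sum(abs(e) for e in err) / n,
        "RMSE": math.sqrt(sse / n),
        # Relative-error metrics
        "RAE": sum(abs(e) for e in err) / sum(abs(o - mean_obs) for o in obs),
        "MRE": sum(e / o for e, o in zip(err, obs)) / n,
        # Dimensionless metrics
        "R2": cov ** 2 / (sst * var_pred),
        "CE": 1.0 - sse / sst,
        "IA": 1.0 - sse / sum(
            (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
            for o, p in zip(obs, pred)
        ),
    }

metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0])
```

Smaller values are better for the absolute and relative error metrics, while values closer to 1 are better for the three dimensionless metrics.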

Application and Comparison
In this section, we illustrate the application of the ARMA model to an annual minimum daily discharge series with a statistically significant deterministic trend. ARMA models are constructed to fit the residual time series after the trend is removed from the original series.
For detrending, we adopt three types of polynomials to analyze the trend of the annual minimum discharge series of the Yangtze River at the Hankou hydrological station: a linear trend based on the nonparametric approach of TSA [13], linear regression based on P-III distribution, and quadratic polynomial regression based on P-III distribution [17], which are denoted hereinafter as Model 1, Model 2, and Model 3, respectively. The ARMA model built on differencing is denoted as Model 4. Estimates of the coefficients are found by maximum likelihood estimation with the Newton-Raphson algorithm [18].
Then, comparisons are made among the above four models with regard to stabilization ability, simulation, and forecasting accuracy.
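The two stabilization approaches can be contrasted in a few lines. The sketch below assumes that TSA refers to the Theil-Sen approach (the median of all pairwise slopes) and uses a simple median-based intercept. The P-III based regressions of Model 2 and Model 3 are not reproduced here, and all function names are illustrative.

```python
from statistics import median

def theil_sen(y):
    """Theil-Sen slope and intercept for values observed at t = 0, 1, 2, ..."""
    n = len(y)
    # Slope: median of the slopes between every pair of points.
    slopes = [(y[j] - y[i]) / (j - i) for i in range(n) for j in range(i + 1, n)]
    slope = median(slopes)
    # Intercept: median of the offsets left after removing the slope.
    intercept = median(y[t] - slope * t for t in range(n))
    return slope, intercept

def detrend(y):
    """Stabilize by subtracting the fitted linear trend (keeps n values)."""
    slope, intercept = theil_sen(y)
    return [y[t] - (intercept + slope * t) for t in range(len(y))]

def difference(y):
    """Stabilize by first-order differencing (returns n - 1 values)."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [2.0 * t + 1.0 for t in range(10)]   # synthetic series: pure linear trend
residual_detrended = detrend(y)           # all zeros for this series
residual_differenced = difference(y)      # constant equal to the slope
```

Detrending returns a residual series of the same length as the input and centered near zero, whereas differencing shortens the series by one value and leaves the slope as a constant mean.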

Model Construction. The annual minimum daily discharge series from the Hankou hydrological station in the Yangtze River basin of China is chosen as the case study. The discharge series for the period 1952-2005 is illustrated in Figure 1. The time series plot indicates a deterministic trend in the mean level of the series. In fact, the $p$ values in the linear regression test [19], the Mann-Kendall (MK) trend test, and the Spearman product-moment correlation test for the study series indicate that an increasing trend is significant at the significance level of 0.05 (Table 1). Meanwhile, the ACF and PACF of the original series (Figure 2) indicate no seasonality in the series.
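For readers who want to reproduce a trend check of this kind, the following is a minimal sketch of the Mann-Kendall test using the usual normal approximation with a tie correction and a continuity correction. It is a generic implementation rather than the exact procedure behind Table 1, and the function name `mann_kendall` is illustrative.

```python
import math
from collections import Counter

def mann_kendall(y):
    """Return (S, Z, two-sided p value) of the Mann-Kendall trend test."""
    n = len(y)
    # S counts concordant minus discordant pairs in time order.
    s = sum(
        (y[j] > y[i]) - (y[j] < y[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    # Variance of S under the null hypothesis, corrected for tied values.
    ties = Counter(y).values()
    var_s = (
        n * (n - 1) * (2 * n + 5) - sum(t * (t - 1) * (2 * t + 5) for t in ties)
    ) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value

s, z, p_value = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

A positive S with a p value below 0.05 indicates a significant increasing trend at that level; applying the same function to the detrended or differenced series checks whether the trend has been removed.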
In order to establish the ARMA model, the original data series has to be stabilized by removing the trend. After detrending and differencing, there is no visible trend in the series, which is confirmed by the MK trend test. The stabilized series are depicted in Figure 3. It can therefore be concluded that both detrending and differencing successfully remove the trend from a nonstationary time series. However, Yue and Pilon [13] have documented that detrending can remove the trend without distorting the residual series, while differencing may seriously damage the existing AR(1) process. To investigate the influence of the two stabilization approaches, ARMA models based on the above stabilized series are built and compared using multiple criteria.
The ACF and PACF of the residual time series after detrending (Figure 4) suggest that the stabilized series are stochastic variables, which is verified through model diagnosis. So Model 1, Model 2, and Model 3 are composed of the stochastic process and the deterministic trend component, which leads to the following fitted series [14]:

$$\hat{x}_t = z_t + T_t^{*},$$

where $\hat{x}_t$ is the combined series generated by the detrending-based model, $z_t$ comes from the residual series model ARMA($p$, $q$), and $T_t^{*}$ is the deterministic trend estimated by the nonparametric approach of TSA [13] or by linear or quadratic polynomial regression based on P-III distribution, where the observations form a sequence of Pearson type III distributed random variables.
After first-order differencing, the ACF and PACF of the corresponding residual series are presented in Figure 5, and ARMA(1, 1) is selected as the adequate model after model checking. So ARIMA(1, 1, 1) becomes the time series model for the original data series, which has been denoted as Model 4.
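The distortion caused by differencing can be made explicit with a short derivation. As an illustrative setup that is not taken from the source, assume a linear trend $a + b\,t$ plus an AR(1) residual $z_t$ with coefficient $\varphi_1$ and white noise $\varepsilon_t$:

```latex
\begin{aligned}
x_t &= a + b\,t + z_t, \qquad (1 - \varphi_1 B)\,z_t = \varepsilon_t,\\
\nabla x_t &= x_t - x_{t-1} = b + (1 - B)\,z_t,\\
(1 - \varphi_1 B)\,(\nabla x_t - b) &= (1 - B)\,\varepsilon_t .
\end{aligned}
```

Under these assumptions the differenced series is an ARMA(1, 1) process whose moving-average polynomial $1 - B$ has a unit root, so it is not invertible. In the special case $\varphi_1 = 0$, the differenced noise $\varepsilon_t - \varepsilon_{t-1}$ has lag-1 autocorrelation $-1/2$ even though the original residuals were uncorrelated. Subtracting the trend instead returns $z_t$ itself, up to the error in the estimated trend.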

Model Comparison.
This section provides the multicriteria for model performance evaluation and comparison of Model 1, Model 2, Model 3, and Model 4 fitted to the discharge data at Hankou station.
Table 2 lists the results for the multiple criteria introduced in Section 2.2. The comparison suggests that the detrending-based models perform much better than the differencing-based model according to the evaluation metrics. Meanwhile, Model 3 appears to outperform Model 1 and Model 2. The evaluation metrics also indicate slightly better performance of Model 2 than Model 1. These findings confirm that the regression model based on P-III distribution is more effective, and that the quadratic polynomial is more flexible, for analyzing the trend of the discharge series at Hankou station.
To further investigate the accuracy of the studied models, the time series plots are depicted in Figure 6. Although the performance of the models looks almost the same for most of the annual minimum daily discharge values, it can be seen that, for peak values, Model 4 based on differencing usually reaches the largest value, while Model 3 based on detrending has the largest error for valley values. This indicates that the accuracy of these two models is poor at extreme points of the series.
The prediction results are presented in Table 3. In the forecast, the first 52 data points are used to determine the model parameters and the data of the last 5 years are used for comparison. The forecasting error in Table 3 is the relative error between the predicted value and the observed value. The comparisons suggest that the model based on detrending is better for one- and two-year prediction of the annual minimum daily discharge, while the model based on differencing performs better for three- to five-year forecasts of the studied series.
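The hold-out evaluation described above can be sketched as follows. The helper names are illustrative, and the sign convention of the relative error (prediction minus observation, divided by observation, in percent) is an assumption, since the source does not state it.

```python
def split_holdout(series, n_test):
    """Split a series into a calibration part and the last n_test values."""
    return series[:-n_test], series[-n_test:]

def relative_errors(observed, predicted):
    """Relative forecast error in percent for each lead time (assumed sign)."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]

# Hypothetical numbers, used only to show the calling pattern.
calibration, validation = split_holdout(list(range(10)), 3)
errors = relative_errors([100.0, 200.0], [110.0, 190.0])
```

Only the calibration part should be used for trend estimation and ARMA fitting, so that the errors on the held-out years reflect genuine out-of-sample forecasts.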

Discussion and Conclusion
This study investigates the influence of two stabilizing approaches, detrending and differencing, on the performance of the subsequent time series models. ARMA models are constructed based on the two stabilization approaches for an annual minimum daily discharge series with a deterministic trend. As a case study, the annual minimum daily discharge series of the Yangtze River at the Hankou hydrological station for the period 1952-2005 is analyzed using the different stabilizing approaches. Comparisons are made with respect to stabilization ability, model simulation, and forecasting. Results indicate the following.
(1) The model based on detrending is superior to the one based on differencing in almost all the selected comparison criteria. Detrending is therefore suggested for removing the deterministic trend before an ARMA model is used to represent the observed data.
(2) For stabilization ability and model simulation, the detrending model using quadratic polynomial regression based on P-III distribution is more effective than the one using linear regression based on P-III distribution, which in turn outperforms the linear trend based on the TSA method.
(3) Because the stabilization power is sensitive to the method used to estimate the trend, before stabilizing the series by detrending it is necessary to identify the probability distribution type that fits the studied data and to choose a regression curve that best represents the real long-term trend.

Figure 1: Annual minimum flow series of Hankou station.

Table 1: Trend test results of observed data.

Table 2: Model criteria for the annual minimum daily discharge series.

Table 3: Predictions for annual minimum daily discharge series.