Groundwater Depth Forecasting Using a Coupled Model

Accurate and reliable prediction of groundwater depth is a critical component in water resources management. In this paper, a new method based on coupling wavelet decomposition method (WA), autoregressive moving average (ARMA) model, and BP neural network (BP) model for groundwater depth forecasting applications was proposed. The relative performance of the proposed coupled model (WA-ARMA-BP) was compared to the regular autoregressive integrated moving average (ARIMA) and BP models for annual average groundwater depth forecasting using leave-one-out cross-validation (LOO-CV). The variables used to develop and validate the models were average groundwater depth data recorded from 1981 to 2010 in Jinghui Canal Irrigation District in the northwest of China. It was found that the WA-ARMA-BP model provided more accurate annual average groundwater depth forecasts compared to the ARIMA and BP models. The results of the study indicate the potential of the WA-ARMA-BP model in forecasting nonstationary time series such as groundwater depth.


Introduction
Groundwater is a major source of water supply for industry and agriculture in many countries. However, with the development of urbanization and augment of water consumption, groundwater has been overexploited which leads to harmful environmental side effects, for example, major water-level declines, drying up of wells, reduction of water in streams and lakes, water-quality degradation, increased pumping costs, land subsidence, and decreased well yields, in many parts of the word, especially in developing countries [1,2]. In order to prevent the above situation and manage groundwater resources more effectively, accurate and reliable forecasting of groundwater level (GWL) is crucial [3]. However, the GWL is very complex and highly nonlinear as it depends on many complex factors such as precipitation and temperature. erefore, developing effective models that can predict GWL precisely is imperative [4].
Physical-based numerical models have been frequently employed for the simulation of the GWL [5,6]. However, to develop a physical-based numerical model needs collecting a great deal of data such as thickness and hydraulic properties of each hydrogeological unit, groundwater recharge/discharge, topography information, geological coverage, soil properties, land use map, vegetation distribution, evapotranspiration information, and hydrologic and climatic data [7]. A large quantity of precise data/parameters of study domain are required in the calibration and validation process of the model [8]. In addition, because numerical models make many assumptions on the governing equation in porous media and most of the model forcings such as evapotranspiration and rainfall are less predictable, they are less competent in forecasting. erefore, the use of physicalbased models is highly restricted.
Statistical models are a suitable alternative when data is limited and obtaining accurate forecasts is more important than conceiving underlying mechanisms [9]. For example, Bidwell (2005) [10] developed an autoregressive moving average (ARMA) model to predict GWL in New Zealand. Lee et al. (2009) [11] used an ARMA model to forecast the GWL in Changwon, Korea. Guzman et al. (2017) [12] applied an ANN model to simulate daily GWL at a local scale in the Mississippi River Valley Alluvial Aquifer, located in the southeastern United States. Yaseen et al. (2019) [13] applied the bioinspired adaptive neurofuzzy inference system (ANFIS) models to forecast the highly nonlinear streamflow of the Pahang River.
However, the above models frequently have limitations with nonstationary data and cannot handle nonstationary data without input data preprocessing [14]. e methods for dealing with nonstationary data are not as advanced as those for stationary data and additional research is needed to investigate methods that are better able to handle nonstationary data effectively. An example of such a method is wavelet analysis.
Preliminary studies have indicated that wavelet analysis (WA) is more effective when analyzing nonstationary time series. WA can decompose an observed time series variables (such as GWL) into various components so that the newly established time series can be used as inputs for an ANN model [15][16][17]. In recent years, coupled wavelet-artificial neural network (WA-ANN) models have been widely used for the prediction and forecasting in hydrological studies. For instance, Wang et al. (2009) [18] developed a wavelet neural network model to forecast the inflow at the ree Gorges Dam in Yangtze River. Adamowski and Chan (2011) [2] proposed WA-ANN models for 1-month-ahead forecasting of GWL at two sites in the Chateauguay watershed in Quebec, Canada. It was found that WA-ANN models provide more accurate forecasts compared to the ANN and ARIMA models.
In this research, the groundwater depth series was decomposed into multiple detailed signal sequences and approximate signal sequences through wavelet decomposition and reconstruction. en, detailed signal sequences and approximate signal sequences were calculated and predicted using the ARMA model and BP neural network model, respectively. e coupled model mentioned above is called the WA-ARMA-BP model. e objective of this paper was to develop a WA-ARMA-BP model and compare it with the ARIMA model and the BP neural network model for the forecast of groundwater depth in Jinghui Canal Irrigation District in the northwest of China.

Study Area and Data Description.
e Jinghui Canal Irrigation District lies at a longitude of 108°34′34″ to 109°21′35″N and latitude of 34°25′20″ to 34°41′40″E in Shaanxi Province, China ( Figure 1). It covers an area of 1180 km 2 , with a continental semiarid climate. e average annual temperature is approximately 13.4°C. e average annual precipitation is about 538.9 mm, most of which occur during May and October. Rural population accounts for approximately 84% of the total population, and cultivated land accounts for approximately 80% of the total area. e irrigation water resources of Jinghui Canal Irrigation District are Jing River and groundwater. Recently, with the intensification of industrial and agricultural activities, the demand for water resources is constantly increasing. At the same time, affected by the decrease of water inflow from Jing River and the expansion of the scale of motor-operated wells, the water consumption ratio of well and canal irrigation has risen from 0.3 in the 1970s to 1.3 at present. e increase in the amount of groundwater exploitation results in the increase of groundwater depth in the irrigated areas. Over the course of the recent decades, the increasing groundwater depth has caused a series of ecological and environmental problems in Jinghui Canal Irrigation District, for example, land subsidence and soil salinization [19].
us, accurate predictions of GWL fluctuations would be imperative for water resources management in this area.
As groundwater cannot be renewed artificially on a large scale, sustainable management of this resource is vital and its nonsustainable use is a serious danger. It needs a globally averaged analysis of the groundwater depth [7]. To calculate the average groundwater depth over the study area, observed groundwater depth data from 1981 to 2010 from 15 observation wells were obtained from Shaanxi Jinghui Canal Irrigation Administration. Figure 2 presents

Wavelet Decomposition Method (WA).
e main feature of the wavelet decomposition method is to decompose the nonstationary time series into those which are much more stationary than the original time series and then study each sequence separately to realize the simulation and prediction of the nonstationary time series. Mallat algorithm [20] is generally used for multiscale decomposition and reconstruction. Actually, the time series f(t) is decomposed based on different frequency channel components in the Mallat algorithm. e time series f(t) can be described as e nth wavelet decomposition of f(t) can be expressed as where c j and d j represent the approximation signal and detail signal of the original signal c 0 at 2 -j resolution, respectively; H is the low pass filter; G is high pass filter; φ nk and ψ nk represent the orthogonal basis; and P 0 f � f. In this way, the nonstationary time series f(t) is decomposed into the superposition form of n detailed signal sequences R 1 f, R 2 f,. . .R n f and 1 approximation signal sequence P n f.
Since the number of signal points of the approximation signal and detail signal obtained by the Mallat algorithm is 2 Discrete Dynamics in Nature and Society less than the number of signal points of the original signal, the decomposed signals can be reconstructed by reconstruction algorithm as follows: where H * and G * denote the dual operators of H and G, respectively.

ARMA and ARIMA Models.
Autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models are the most commonly used stochastic models.
In stochastic modeling techniques, data should be divided into trend, periodical, autoregressive, and random components. If the statistical characteristics have increasing or decreasing changes over time, there will be a trend term; if they change periodically, there will be a periodic term; and if   Discrete Dynamics in Nature and Society a large change occurs suddenly and it continues, there will be a jump term. e three parts of the trend, period, and jump in time series are deterministic and predictable components that should be omitted for modeling with stochastic methods, and the remaining term is the stochastic term which is used for modeling [21]. e average groundwater depth of the study area from 1981 to 2010 experiences obvious increasing changes over time, which is clear by observing the time series of the data ( Figure 3). ere were neither periodic fluctuations nor suddenly major changes in the data series, which indicated that the trend term and jump term did not exist. e Mann-Kendall nonparametric test is widely applied in hydroclimatic trend analysis. However, the performance of the Mann-Kendall (MK) method is affected by the presence of autocorrelation [22][23][24]. erefore, a modified Mann-Kendall (MMK) test introduced by Hamed and Rao (1998) [22] was employed to evaluate the trend of serial correlation data. e relationship of the traditional MK nonparametric test is as follows [25,26]: where U MK is the standard of Mann-Kendall statistic; MK is the Mann-Kendall statistic; Var(MK) is the variance of MK; sgn is the sign function; and N is the number of samples. A positive MK value indicates an increasing trend, while a negative MK value shows a decreasing trend. In this paper, the significance level of 5% is adopted. If |U MK |>1.96, the trend is obvious. In MMK test, Var(MK) * is used as the corrected variance [22].
where n is the actual number of observations, n * is the effective sample size, r i is the lag i significant autocorrelation coefficient of i th rank of time series, and its expression is Statistic U * MK of MMK trend test method is computed by replacing Var(MK) in equation (4) with Var(MK) * . e rule used to determine the trend and significance of U * MK is similar to U MK .
If the MMK test results indicate nonstationary factors, it is necessary to remove the trend term. Differencing is also one of the most basic methods of data stationary. In this method, each data is reduced from the previous data, and the differentiated series (Diff(t) � T(t) -T(t-1)) is obtained. e stationary time series' Auto Correlation Function (ACF) structure can be detected by drawing a correlogram. If the graph damps down to zero rapidly in exponential or oscillatory form, the time series is stationary.
ARMA model provides a parsimonious description of a weakly stationary stochastic process. e ARMA model is developed via combining the AR (autoregressive) and MA (moving average) models. ARMA (p, q) models are defined as [27] x where x t (t � 0, ±1, ±2, . . . ) represents a time series; φ 1 , φ 2 ,. . ., φ p and θ 1 , θ 2 ,. . ., θ q are called the AR and MA parameters, respectively; the parameter p and q denote the AR and MA orders, respectively; w t (t � 0, ±1, ±2, . . .) is a Gaussian white noise sequence. When a time series does not appear to be covariance stationary, the differencing procedure may be applied to make it stationary. e ARMA (p, q) model can be applied to the stationary differenced time series and the model so constructed is called the ARIMA (p, d, q) model where d denotes the order of differencing.
ARIMA modeling involves three stages: identification stage, estimation stage, and diagnostic checking stage.
Firstly, in the identification stage, it is ensured whether the time series is stationary or not. For nonstationary time series data, a differencing process is applied till the data series are transformed into stationary ones. Auto Correlation Functions (ACFs) and Partial Auto Correlation Functions (PACFs) are commonly used in evaluating a time series variable's dependency on its past. e AR and MA terms of the stationary time series are determined by examining the patterns of the graphs of ACFs and PACFs. Autocorrelation is the correlation between time series and the same time series lag, while partial autocorrelations are the correlation coefficients between the basic time series and the same time series lag, but with the elimination of the influence of the members in between [28].
Secondly, in the estimation stage, parameters p and q of AR and MA terms are estimated. Parameters p and q can be identified by looking at ACF, which is a bar chart of time series of coefficients of correlation of time series and lags of itself, and PACF, which represents the plot of partial coefficients of correlation of time series and lags of itself [29]. However, the more objective way to choose the orders of p and q of an ARMA (p, q) process is to use objectively defined criteria such as Bayesian Information Criterion (BIC), which is given as formula (8) [30]. e orders of p and q that minimizes BIC value are used as the appropriate orders of the model.
where L is the likelihood function, k is the number of parameters in the model, and n is the number of observations. Finally, the diagnostic stage ascertains whether the developed ARIMA model fits well with the input time series data or not. Once it is ensured that the difference between the predicted and actual observations is sufficiently small, then the model can be utilized for forecasting.

BP Neural
Network. BP neural network, called the backpropagation neural network, is a typical type of multilayer feed-forward neural network. It was proposed by Rumelhart and McClelland in 1986, which is one of the most easily understood and widely used neural network algorithms [31,32]. e structure of the BP neural network is shown in Figure 2. It consists of three layers, including the input layer, output layer, and hidden layer. ere is at least one hidden layer. Each layer is composed of several neurons (nodes), via which each layer is connected to its adjacent layer. ere is no connection between nodes of the same layer. e input data is stored in the input layer node and transmitted to the output layer through the hidden layer. e learning and training of the BP neural network adopt the method in which the error function decreases according to the gradient to minimize the mean square error between the actual output value and the expected output value [33]. e learning process is divided into two stages: a multilayer feed-forward stage, which calculates the actual input and output of each node at each layer in turn from the input layer; the backpropagation stage; according to the output error of the output layer neurons, the weight of each connection is corrected along the reverse path to reduce the error [34]. e multilayer feed-forward mathematical model is described by equation (6): where y l i is the output value for node i of layer l; x l i is the activation value for node i of layer l;w l ij is the connection weight for node j of the layer l-1 to the node i of the layer l; θ l i is the threshold of the node i of the layer l; N is the number of nodes in the layer; L is the total number of layers; f() is the activation function of the neuron, which is Sigmoid function in this paper described as f(x) � 1/(1+e-x).

Coupled Model.
In the new coupled model, called the WA-ARMA-BP model, groundwater depth series was decomposed into multiple detailed signal sequences and approximate signal sequences through wavelet decomposition.
en, detailed signal sequences and approximate signal sequences were calculated using the ARMA model and BP neural network model, respectively.

Model Performance Comparison.
In order to test the accuracy of the models, the coefficient of determination (R 2 ), Nash-Sutcliffe model efficiency coefficient (NSE), and rootmean-squared error (RMSE) were used in this paper. R 2 measures the degree of correlation among the observed and modeled values. Its values range from 0 to 1, with 1 indicating a perfect fit between the data and the line drawn through them. R 2 is given by [35]   RMSE indicates the discrepancy between the observed and calculated values. A perfect fit between observed and forecasted values would have an RMSE of 0. RMSE is expressed as [7] RMSE � where n is the number of the observations; M i and O i represent the forecasted and observed values.
Leave-one-out cross-validation (LOO-CV) is a commonly used algorithm for estimating the predictive performance of a multivariable calibration model. Usually, practical calibration experiments have to be based on a limited set of available samples. e idea behind the LOO-CV algorithm is to predict the property value for a sample from the data set, which is, in turn, predicted from the calibration model calculated from the data for all other samples. When applying the algorithm to a data set with N examples, the calibration modeling is performed N times, each time using N-1 samples for modeling and one sample for testing. us, the procedure of LOO-CV can be divided into N segment. Finally, the R 2 and RMSE of LOO-CV for the three models can be obtained.

ARIMA Model. Statistic U MK
* of the average groundwater depth series was 1.635 which indicated an increasing trend but not significance. us, the average groundwater depth data series was not stationary and it is necessary to transform the data series into a stationary series through the differencing process. After twice differencing processes, it is obvious that both ACF and PACF showed the trailing trend which illustrated that the data series was stationary and it is suitable to build ARIMA (p, 2, q) model for average groundwater depth data series Figure 4. According to the ACF and PACF, the appropriate models were determined as ARIMA (1, 2, 1), ARIMA (2, 2, 1), ARIMA (3, 2, 1), ARIMA (1, 2, 2), ARIMA (2, 2, 2), and ARIMA (3,2,2). In order to determine the optimal model, the values of BIC were calculated for the above models ( Table 2). As shown in Table 2, ARIMA (3, 2, 1) was the optimal model since the value of BIC was minimum. Following this, the parameter estimation of the ARMA model was performed. e model was trained using the data from 1981 to 2000 and tested using the data from 2001 to 2010.

BP Neural Network.
e BP neural network model for average groundwater depth forecasting was developed in MATLAB R2014a software. In this paper, it was considered that the groundwater depth (H t ) at time t was related to the previous groundwater depth data (H t-l , l � 1, 2, ...), and the effect of H t-r on H t is greater than that of H t-r-l . e groundwater depth data in the previous n years were set as input parameters, and the groundwater depth data of the (n+1)th year were set as output parameters of the BP network model. e number of n was set to 5. us, there were altogether 25 input nodes and 25 output nodes in the BP network model of the study area. e structure of the neural network was determined by trial and error. e optimal number of nodes in the hidden layer and the stopping criteria were optimized by trial and error for obtaining accurate output. e activation function was set as tansig transfer function. Out of the 30-year data sets available, 20 data sets were used for training the BP network model and 10 data sets were used for testing the model.

Coupled Model.
In the coupled WA-ARMA-BP model for average groundwater depth forecasting, the first step was decomposing groundwater depth data series into multiple detailed signal sequences and approximate signal sequences through wavelet decomposition and reconstruction. is work was developed using the wavelet toolbox of MATLAB software. According to the method of wavelet decomposition mentioned above, the groundwater depth series (x) was decomposed threefold as follows: where d 1 , d 2 , and d 3 represent the three detailed signal sequences; a 3 represents the approximate signal sequence ( Figure 5). e three detailed signal sequences were calculated using the ARMA model. In this study, the simulation process is illustrated by taking detailed signal d 1 as an example. e first step is to estimate the stationarity of the average groundwater depth data via ACF and PACF. As shown in Figure 6, it is obvious that both ACF and PACF showed the trailing trend which illustrated that the d 1 data series was stationary and it is suitable to build ARMA (p, q) model for detailed signal d 1 series. Based on the principle of minimizing BIC value, ARMA (2, 1) was the optimal model Table 3. Following this, the parameter estimation of the ARMA model was performed. e model was trained using the data from 1981 to 2000 and tested using the data from 2001 to 2010. e calculation of detailed signals d 2 and d 3 followed the procedures above. e approximate signal sequence was calculated using BP neural network model, which was developed in MAT-LAB R2014a software. e calculation steps of the model were consistent with the above description. Out of the 30year data sets available, 20 data sets were used for training the BP network model and 10 data sets were used for testing the model.
After obtaining the simulation and prediction models of the above sequences, the models were superposed to achieve the simulation and prediction model of the average groundwater depth.

Results and Discussion
In order to verify the validity of the proposed prediction method, the ARIMA model and BP neural network model were developed to forecast the groundwater depth directly and then compared with the result of the coupled model. e results of the models are shown in Table 4        Observed average groundwater depth (m)

Conclusions
In this paper, a new method based on coupling wavelet decomposition method (WA), ARMA model, and BP neural network model for groundwater depth forecasting applications was proposed to help manage groundwater supplies in a more effective and sustainable manner. e coupled model (WA-ARMA-BP) was compared to the regular ARIMA model and BP neural network model for average groundwater depth forecasting in Jinghui Canal Irrigation District in the northwest of China. Using the wavelet decomposition method, the original data series was decomposed into three detailed signal sequences which were then forecasted via ARMA models and one approximate signal sequence which was then forecasted via BP neural network. e wavelet decomposition method allowed most of the "noisy" data to be removed and made data changes more stable.
is study found that the WA-ARMA-BP model was substantially more accurate than the ARIMA model and BP neural network model. It is hypothesized that the WA-ARMA-BP model is more accurate because the wavelet decomposition method allows most of the "noisy" data to be removed and makes data changes more stable. It is indicated that the WA-ARMA-BP model is a potentially very useful new method for nonstationary time series such as groundwater depth forecasting.
Data Availability e data can be obtained upon request to the corresponding author.

Conflicts of Interest
e authors declare that they have no conflicts of interest.