Comparison of Three Statistical Downscaling Methods and Ensemble Downscaling Method Based on Bayesian Model Averaging in Upper Hanjiang River Basin, China

Many downscaling techniques have been developed in the past few years for projection of station-scale hydrological variables from large-scale atmospheric variables to assess the hydrological impacts of climate change. To improve the simulation accuracy of downscaling methods, the Bayesian Model Averaging (BMA) method combined with three statistical downscaling methods, namely support vector machine (SVM), BCC/RCG-Weather Generators (BCC/RCG-WG), and Statistics Downscaling Model (SDSM), is proposed in this study, based on the statistical relationship between the large-scale climate predictors and observed precipitation in the upper Hanjiang River Basin (HRB). The statistical analysis of three performance criteria (the Nash-Sutcliffe coefficient of efficiency, the coefficient of correlation, and the relative error) shows that the ensemble downscaling method based on BMA performs better for rainfall than each single statistical downscaling method. Moreover, the runoff modelled by the SWAT rainfall-runoff model using the daily rainfall downscaled by the four methods is also compared, and the ensemble downscaling method again has better simulation accuracy. The ensemble downscaling technology based on BMA can provide a scientific basis for the study of runoff response to climate change.


Introduction
Global climatic changes could lead to changes in regional water availability. Such hydrologic changes will affect nearly every aspect of human well-being, from agricultural productivity, energy use, and flood control to municipal and industrial water supply and fish and wildlife management. The tremendous importance of water in both society and nature underscores the necessity of understanding how a change in global climate could affect regional water supplies [1]. General circulation models (GCMs), which are numerical coupled models that describe the atmospheric processes through mathematical equations, have been one of the most important tools for studying climate change. GCMs represent various earth systems including the atmosphere, oceans, land surface, and sea ice and offer considerable potential for studying climate change. At large scales, GCMs, which have been steadily evolving over several decades, are able to simulate the most important features of the global climate, and simulations are most reliable over the tropical regions [2]. However, these same models perform poorly at the smaller spatial and temporal scales relevant to regional impact analyses [3,4]. Because the spatial resolution of GCM grids is too coarse to resolve many important subgrid scale processes, GCM outputs are often unreliable at individual grid and subgrid box scales [1,5].
One possible solution to overcome this problem is to downscale the output from GCMs to a higher resolution in space/time and then to use scenario output in local water management. The basic idea of downscaling is to transfer large-scale changes in atmospheric variables (predictors), reliably simulated from GCMs, to local weather series (predictands) [6]. To deal with this issue several downscaling methodologies, such as dynamic downscaling and statistical downscaling, have been developed. Dynamic downscaling refers to the use of regional climate models (RCMs), or limited-area models (LAMs), which employ large-scale and lateral boundary conditions from GCMs to produce higher resolution outputs [7]. The statistical downscaling methodology has obvious drawbacks, such as the uncertain assumption of applicability in a future climate, but it is a computationally cheap and statistically sound complement to dynamical downscaling. It is notable that predictor and predictand can be the same parameter on different scales, but statistical methods can freely select any variable as predictor as long as it can be motivated. Several existing statistical downscaling methods have been applied in different climate regions [1,4]. More sophisticated statistical downscaling methods are generally classified into three groups [7]: regression models (e.g., CCA, SVM), weather typing schemes, and weather generators (e.g., LARS-WG, BCC/RCG-WG).
Studies comparing different statistical downscaling methods are now relatively common [4,8-10]. The results of these studies have shown that different methods perform differently in a given area, and a given method performs differently in different study areas. Since different methods have strengths in capturing different aspects of the downscaling, combining the results from diverse methods by weighting procedures can give a better performance than any individual method [11-13].
The early combination techniques employed such tools as simple model average, linear regression, and artificial neural network [14-17]. These methods use a set of deterministic weights to combine multiple model outputs, and the weights in such combinations can take any arbitrary real (positive or negative) values that lack physical interpretations [18]. Bayesian Model Averaging (BMA) came to prominence in statistics in the mid-1990s, and Madigan and Raftery [19] were the first to propose this method for combining predictions. Subsequently, Raftery [20] and Draper [21] gave more detailed discussion about BMA. It has been applied in diverse fields such as economics [22], biology [23], ecology [24], public health [25], toxicology [26], meteorology [27], and management science [28]. In many case studies, BMA produces accurate and reliable predictions and was shown to be a better scheme than other model-combining methods [29-31]. In recent years, hydrologists have also applied BMA to hydrologic modelling, such as groundwater [32] and rainfall-runoff modelling [18,33,34].
Although the BMA method has the ability to improve the accuracy of the prediction, only a few studies have applied BMA to downscaling methods. For example, Yang et al. [35] used the BMA ensemble method to reduce the uncertainties in lateral boundary forcing and improve model performance in regional climate downscaling. In this study, three statistical downscaling models are combined by using the BMA method, with the aim of investigating the potential use of BMA in downscaling GCM simulations in the upper Hanjiang River Basin (HRB), China. The three statistical downscaling techniques used here are as follows: (i) support vector machines (SVM) model [36,37]; (ii) BCC/RCG-Weather Generators (BCC/RCG-WG) [38,39]; and (iii) Statistics Downscaling Model (SDSM) [40]. More specifically, the following objectives have been set for this paper: (1) to establish the statistical relationship between large-scale circulation (using NCEP/NCAR reanalysis data) and precipitation in the upper HRB by using these three downscaling methods; (2) to combine the three downscaling models by using the BMA method; and (3) to assess the performance of the ensemble downscaling method based on BMA for rainfall and for runoff modelled by the SWAT model in the upper HRB. The technical route of the research in this paper is described in Figure 1. This paper is organized as follows. First, the details of the study area, station-observed data, NCEP/NCAR reanalysis data, and digital watershed data used in the study are described. This is followed by a description of the downscaling methods, BMA method, and hydrological model. The results are then presented, followed by a discussion, and finally the conclusions are presented.

Study Area and Data
2.1. Study Area. The selected area for this study is the upper HRB as shown in Figure 2. It is located in Shanxi Province and Hubei Province of China and the total area is approximately 95,200 km². The length of the main stream is 925 km, which takes up 59% of the total length of the Hanjiang River. The basin has a subtropical climate and the area is humid with fairly high precipitation. The mean annual rainfall is 904 mm. The precipitation distribution in this area changes greatly in time and space, and the precipitation is mainly concentrated in summer.
3). The NCEP grids are interpolated spatially to the meteorological stations by using the inverse distance weighting method. When establishing the statistical downscaling models, selection of predictors is one of the most critical steps and three criteria should be followed [5]. The complexity of the models can be effectively reduced when only the predictors that have significant impacts on the predictands are selected. At the same time, the predictors that have no significant impacts on the predictands should be excluded to eliminate redundant information and avoid introducing additional interference factors. According to the previous studies [5,41], 10 alternative predictors are chosen from the 26 atmospheric circulation factors available at each grid point in the NCEP reanalysis data (see Table 1), by stepwise regression and correlation analysis combined with the criteria and the climate characteristics of the HRB.
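The interpolation of gridded reanalysis values to a station can be illustrated with a minimal sketch of inverse distance weighting. The function name `idw_interpolate`, the distance exponent of 2, and the use of planar distance in degrees are assumptions made for illustration; the paper does not state these details.

```python
import math

def idw_interpolate(grid_points, station, power=2.0):
    """Inverse-distance-weighted estimate of a predictor at one station.

    grid_points: list of (lon, lat, value) tuples for surrounding grid cells
    station:     (lon, lat) of the target meteorological station
    power:       distance exponent (2 is a common choice, assumed here)
    """
    weight_sum = 0.0
    weighted_values = 0.0
    for lon, lat, value in grid_points:
        # Planar distance in degrees: a simplification, not a great-circle distance.
        dist = math.hypot(lon - station[0], lat - station[1])
        if dist == 0.0:
            return value  # station coincides with a grid point
        w = 1.0 / dist ** power
        weight_sum += w
        weighted_values += w * value
    return weighted_values / weight_sum

# Hypothetical example: four corners of one 2.5-degree grid cell.
corners = [(107.5, 32.5, 1.0), (110.0, 32.5, 2.0),
           (107.5, 35.0, 3.0), (110.0, 35.0, 4.0)]
centre_value = idw_interpolate(corners, (108.75, 33.75))
```

A station at the centre of the cell is equidistant from all four corners, so the estimate reduces to their arithmetic mean.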

Digital Watershed Data.
This study uses the GTOPO30 digital elevation model (DEM) provided in downloadable form by the US Geological Survey (USGS). Based on these data, the watershed DEM and stream networks are extracted and divided into subwatersheds (see Figure 1). The soil spatial distribution data are obtained from the soil database of the Institute of Soil Science, Chinese Academy of Sciences, Nanjing, and are classified according to the soil subclasses under the class of land resources and environment in the Chinese Resources and Environment Database. The spatial distribution data of land use are obtained from the national land cover framing TIF maps (30 m spatial resolution) provided by the State Bureau of Surveying and Mapping and are classified into 12 categories according to the Soil and Water Assessment Tool (SWAT) parameter database of land use in the United States.

SVM.
The foundations of SVM were developed by Vapnik [42,43], initially for optical character recognition. More recently, the SVM approach has been recognized for its ability to capture nonlinear regression relationships between variables [36,37,44]. The SVM model has been used as a downscaling technique for predicting precipitation in different regions and has proven to be effective for downscaling precipitation [41,45]. The least squares support vector machine (LS-SVM), which provides a computational advantage over standard SVM [46], is a least squares version of SVM in which the solution to the optimization problem is found by solving a set of linear equations instead of the convex quadratic programming problem of classical SVMs. Because the final decision function of SVM is determined only by the support vectors, its complexity depends on the number of support vectors rather than on the dimension of the sample space (the number of factors), so more forecast factors reflecting more of the spatial variation in the atmosphere can be chosen. In this study, the optimal model is established through optimization of the parameters relating predictors and predictands, using the 10 primary predictors as prediction factors of the SVM.
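The statement that LS-SVM reduces training to a set of linear equations can be made concrete with a small sketch. It assumes an RBF kernel and hypothetical values for the regularization parameter `gamma` and kernel width `sigma`; the kernel and parameter values used in the study are not given in the text. The system solved is [[0, 1ᵀ], [1, K + I/gamma]] [b; alpha] = [0; y].

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two predictor vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Return (b, alpha) from the LS-SVM regression linear system."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term I / gamma on the diagonal
        A.append(row)
    sol = solve_linear(A, [0.0] + list(y))
    return sol[0], sol[1:]

def lssvm_predict(x, X, alpha, b, sigma=1.0):
    """Decision function f(x) = sum_i alpha_i k(x, x_i) + b."""
    return b + sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X))

# Toy one-predictor example (synthetic data, for illustration only).
X_train = [[0.5 * i] for i in range(7)]
y_train = [x[0] ** 2 for x in X_train]
b, alpha = lssvm_fit(X_train, y_train)
```

In a downscaling setting each row of `X_train` would hold the predictor values for one day and `y_train` the observed station precipitation; the toy data above only demonstrate the mechanics.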

BCC/RCG-WG.
BCC/RCG-WG is named for the weather generator developed by the Beijing Climate Center of the China Meteorological Administration (CMA) and the Regional Climate Group at the University of Gothenburg [39]. Owing to its stochastic framework for the daily climatic variables, BCC/RCG-WG can generate arbitrarily long series to meet the needs of impact assessment and risk analysis of climate change. Liao et al. [38,39] have shown that BCC/RCG-WG can successfully simulate daily precipitation and nonprecipitation variables including maximum temperature, minimum temperature, and sunshine hours in China. The input data are the magnitude of the temperature change and the percentage increase or decrease in precipitation. In this study, the input data are calculated from the monthly temperature and precipitation from 1991 to 2009 and the monthly mean across the years during this period.

SDSM.
The SDSM is a decision-support tool for assessing local climate-change impacts using a robust statistical downscaling technique. It was developed by Wilby et al. [40]. SDSM uses a hybrid of a stochastic weather generator and a multilinear regression method to simulate local variables from regional circulation and atmospheric moisture predictors [47]. The model has been applied in many catchments in North America [48] and Europe [40,49].
Here $\sigma_k^2$ is the variance associated with model prediction $f_k$ and observations $Y$. In order to make this assumption valid, some techniques such as the Box-Cox transformation are needed to make the data approximately normally distributed and to narrow the data range. In the case that the observations and individual model predictions are all normally distributed, the BMA predictive model is then
$$E[y \mid D] = \sum_{k=1}^{K} w_k f_k,$$
where $w_k$ is the posterior probability of forecast $f_k$ being the best one and is based on the performance of forecast $f_k$ in the training period. The $w_k$'s are probabilities, so they are nonnegative and add up to 1; that is, $\sum_{k=1}^{K} w_k = 1$.
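A minimal sketch of the two ingredients mentioned above, the Box-Cox transformation and the weighted BMA mean, is given here. The transformation parameter `lam`, the weights, and the member values are hypothetical. Because the Box-Cox transform requires positive inputs, the sketch assumes it is applied to wet-day amounts only, which is an assumption rather than a detail stated in the text.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Inverse Box-Cox transform."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

def bma_mean(weights, predictions):
    """BMA predictive mean: sum of w_k * f_k, with weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("BMA weights must sum to 1")
    return sum(w * f for w, f in zip(weights, predictions))

# Hypothetical weights and daily rainfall (mm) from three member models.
weights = [0.5, 0.3, 0.2]
members_mm = [4.0, 6.0, 1.0]
lam = 0.3  # illustrative value only

transformed = [box_cox(p, lam) for p in members_mm]
ensemble_mm = inv_box_cox(bma_mean(weights, transformed), lam)
```

Averaging in the transformed space and back-transforming gives a value between the smallest and largest member prediction, but it generally differs from the weighted mean computed directly in millimetres.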

EM Algorithm for BMA Parameter Estimation.
To estimate the BMA weight $w_k$ and model prediction variance $\sigma_k^2$, the Expectation-Maximization (EM) algorithm has proved to be an efficient technique for BMA calculation, based on the assumption that the $K$-member predictions are normally distributed [33].
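The EM iteration can be sketched as follows, in the form commonly used for Gaussian BMA: the E-step computes the probability that each member is the best one at each time step, and the M-step updates the weights and variances from those probabilities. The function names, convergence tolerance, and synthetic data are assumptions for illustration, and the inputs are assumed to have been transformed to approximate normality beforehand.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bma_em(obs, preds, tol=1e-6, max_iter=500):
    """Estimate BMA weights w_k and variances sigma_k^2 by EM.

    obs:   list of T observations
    preds: list of K lists, each with T predictions from one member model
    """
    K, T = len(preds), len(obs)
    w = [1.0 / K] * K
    var = [max(sum((obs[t] - preds[k][t]) ** 2 for t in range(T)) / T, 1e-12)
           for k in range(K)]
    loglik_old = -math.inf
    for _ in range(max_iter):
        # E-step: z[k][t] = probability that member k is best at time t.
        z = [[0.0] * T for _ in range(K)]
        loglik = 0.0
        for t in range(T):
            dens = [w[k] * normal_pdf(obs[t], preds[k][t], var[k]) for k in range(K)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            loglik += math.log(total)
            for k in range(K):
                z[k][t] = dens[k] / total
        # M-step: update weights and variances from the z values.
        for k in range(K):
            zk = sum(z[k])
            w[k] = zk / T
            if zk > 0.0:
                sq = sum(z[k][t] * (obs[t] - preds[k][t]) ** 2 for t in range(T))
                var[k] = max(sq / zk, 1e-12)
        if abs(loglik - loglik_old) < tol:
            break
        loglik_old = loglik
    return w, var

# Synthetic check: one accurate member and two biased members.
obs = [10.0 + 5.0 * math.sin(0.3 * t) for t in range(60)]
good = [o + (0.2 if t % 2 == 0 else -0.2) for t, o in enumerate(obs)]
high = [o + 4.0 for o in obs]
low = [o - 6.0 for o in obs]
weights, variances = bma_em(obs, [good, high, low])
```

In this synthetic case nearly all of the weight goes to the accurate member, which is the behaviour the BMA weights are meant to capture.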

Introduction of SWAT.
The Soil and Water Assessment Tool (SWAT) is chosen to simulate the hydrological processes in this study. SWAT is a model that can be applied to a large ungauged rural watershed with hundreds of small subwatersheds. It was developed in the early 1990s by the US Department of Agriculture's Agricultural Research Service. SWAT can be applied to large-scale river basins and at different time scales. It has been used extensively worldwide and has been shown to adequately reproduce the hydrological response of watersheds across a range of geographical regions and climates. SWAT is a physically based model able to estimate the impact of land use on water, sediment, and agricultural chemicals at subcatchment and/or land use unit scales over long periods of time, as well as the responses to climate factors (for example, precipitation and evaporation), spatial variation of underlying surface factors, and human activities. The SWAT model has been widely used in domestic and international basins for simulation of watershed hydrologic processes and hydrologic responses under conditions of climate change and land use change, evaluation of the impacts of human activities on ecological environments, and planning and management of regional water resources [31,50,51].

Simulation Methods.
The precipitation input of the SWAT model is the daily precipitation at 9 stations. The model then calculates the areal precipitation for each subcatchment by using a spatial interpolation method. Since only daily precipitation data are available, the surface runoff is estimated using the Soil Conservation Service (SCS) runoff curve number method. The Priestley-Taylor method was selected to calculate potential evapotranspiration (PET). According to the relationship between sensible heat flux and latent heat flux over wet surfaces, and based on the idea of evapotranspiration balance, Priestley and Taylor put forward the PET calculation formula for low-advective conditions. This method takes into consideration several meteorological elements such as solar radiation, soil heat flux, air temperature, and relative humidity. It has been shown by many researchers to be applicable for calculating PET in humid areas [52-54]. For river flood routing, the Muskingum method is adopted. The time scale of the streamflow simulation is monthly and the evaluation objective of parameter calibration is the monthly streamflow efficiency coefficient.
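The SCS curve number calculation can be illustrated with a short sketch of the standard metric form of the equation, with retention parameter S = 25400/CN - 254 (mm) and initial abstraction Ia = 0.2 S. The function name and the example values are hypothetical, and SWAT's daily adjustment of the curve number for soil moisture is not reproduced here.

```python
def scs_runoff(precip_mm, cn, ia_ratio=0.2):
    """Daily surface runoff depth (mm) from the SCS curve number equation.

    precip_mm: daily rainfall depth (mm)
    cn:        curve number, 0 < cn <= 100
    ia_ratio:  initial abstraction as a fraction of S (0.2 is the usual default)
    """
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    s = 25400.0 / cn - 254.0   # retention parameter S (mm)
    ia = ia_ratio * s          # initial abstraction Ia (mm)
    if precip_mm <= ia:
        return 0.0             # all rainfall is abstracted, no runoff
    return (precip_mm - ia) ** 2 / (precip_mm - ia + s)

# Hypothetical example: 50 mm of rain on a subcatchment with CN = 80.
runoff_mm = scs_runoff(50.0, 80.0)
```

With CN = 80 the retention parameter is 63.5 mm and the initial abstraction 12.7 mm, so a 10 mm event produces no runoff while a 50 mm event produces roughly 13.8 mm.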

SWAT Parameter Calibration.
There are numerous parameters in the SWAT model, which can be classified into two types. Parameters of the first type can be directly determined by their physical meaning. The values of attribute parameters such as soil physical properties and land use/land cover properties can be determined via the SWAT model database. Parameters of the second type are mainly related to discharge, including the initial SCS runoff curve number for moisture condition (CN), available water capacity of the soil layer (SOL_AWC), soil evaporation compensation factor (ESCO), groundwater reevaporation coefficient (GW_REVAP), and the baseflow recession constant (ALPHA_BF). This paper focuses on the watershed streamflow simulation and optimization of the second type of parameters. The parameter calibration process follows several principles: give upstream priority over downstream; first adjust the water balance and then the flow duration curve; and first adjust the surface flow and then the soil water, evaporation, and underground streamflow [31,50,51]. In view of the complexity of the SWAT model, both automatic and manual calibration methods are used to optimize the second type of parameters.

Performance Criteria for Evaluating the Simulation.
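The Nash-Sutcliffe coefficient of efficiency (NSC), the coefficient of correlation (RC), and the relative error (ERR) (%) are used to measure model performance. The expressions are given by
$$\mathrm{NSC} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{obs}^{i} - Q_{sim}^{i}\right)^2}{\sum_{i=1}^{n} \left(Q_{obs}^{i} - \bar{Q}_{obs}\right)^2},$$
$$\mathrm{RC} = \frac{\sum_{i=1}^{n} \left(Q_{obs}^{i} - \bar{Q}_{obs}\right)\left(Q_{sim}^{i} - \bar{Q}_{sim}\right)}{\sqrt{\sum_{i=1}^{n} \left(Q_{obs}^{i} - \bar{Q}_{obs}\right)^2 \sum_{i=1}^{n} \left(Q_{sim}^{i} - \bar{Q}_{sim}\right)^2}},$$
$$\mathrm{ERR} = \frac{\bar{Q}_{sim} - \bar{Q}_{obs}}{\bar{Q}_{obs}} \times 100\%,$$
where $n$ is the number of time steps, $Q_{obs}^{i}$ is the observed data at time step $i$, $Q_{sim}^{i}$ is the simulated data at time step $i$, $\bar{Q}_{obs}$ is the mean value of the observed data, and $\bar{Q}_{sim}$ is the mean value of the simulated data. The closer the values of NSC and RC are to 1, the more successful the model calibration or validation is. Simulations are considered satisfactory when the magnitude of ERR is below 10% and excellent when it is less than 5%.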
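The three criteria can be computed as in the sketch below, which uses the usual definitions: NSC as one minus the ratio of squared simulation errors to the squared deviations of the observations from their mean, RC as the Pearson correlation, and ERR as the difference between simulated and observed means relative to the observed mean. The sign convention for ERR (simulated minus observed) and the function names are assumptions.

```python
import math

def nsc(obs, sim):
    """Nash-Sutcliffe coefficient of efficiency."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def rc(obs, sim):
    """Pearson coefficient of correlation between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)

def err_percent(obs, sim):
    """Relative error of the simulated mean with respect to the observed mean (%)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    return (mean_sim - mean_obs) / mean_obs * 100.0

# Toy series: the simulation has the right shape but a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
scores = (nsc(observed, simulated), rc(observed, simulated),
          err_percent(observed, simulated))
```

The toy series shows why the criteria are used together: the biased simulation is perfectly correlated with the observations (RC = 1) yet has a low NSC and a large ERR.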

Precipitation Downscaling Results.
In this study, the first 30 years (1961-1990) are chosen for calibrating the models and the remaining data (1991-2009) are used for validation. The three models are tested over the period 1991-2009 for reproduction of various daily precipitation statistics. All downscaling methods are trained with the same set of predictors. To assess the accuracy of the three downscaling models in producing rainfall inputs for the hydrological model, a set of indicators is compared, including mean precipitation, 95th percentile of rain-day amounts (P95q, mm), largest 5-day total rainfall (PM5, mm), maximum length of wet spell (CWD, days), maximum length of dry spell (CDD, days), and percentage of days long-term exceeding 90th percentile (R90t, %). The values of the indicators are shown in Table 2. The observed and predicted monthly precipitation time series during the validation period are shown in Figure 7(a), and the comparisons between observed monthly precipitation and monthly precipitation predicted by SDSM, SVM, and BCC/RCG-WG are shown in Figures 7(b)-7(d).
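Several of these indicators can be computed from a daily series as in the following sketch. The wet-day threshold of 1 mm and the nearest-rank percentile rule are assumptions, since the text does not state them, and R90t is omitted because its exact definition cannot be recovered from the text.

```python
import math

def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

def precip_indices(daily, wet_threshold=1.0):
    """Mean precipitation, P95q, PM5, CWD, and CDD for a daily series (mm)."""
    n = len(daily)
    wet = [p >= wet_threshold for p in daily]
    wet_amounts = sorted(p for p, is_wet in zip(daily, wet) if is_wet)
    if wet_amounts:
        # Nearest-rank 95th percentile of rain-day amounts.
        rank = math.ceil(0.95 * len(wet_amounts))
        p95q = wet_amounts[rank - 1]
    else:
        p95q = 0.0
    # Largest total over any 5 consecutive days.
    pm5 = max(sum(daily[i:i + 5]) for i in range(max(1, n - 4)))
    return {
        "mean": sum(daily) / n,
        "P95q": p95q,
        "PM5": pm5,
        "CWD": longest_run(wet),
        "CDD": longest_run([not is_wet for is_wet in wet]),
    }

# Ten days of synthetic rainfall (mm), for illustration only.
sample = [0.0, 2.0, 3.0, 0.0, 0.0, 0.0, 5.0, 10.0, 1.0, 0.0]
indices = precip_indices(sample)
```

Applying the same function to the observed series and to each downscaled series gives directly comparable values for each indicator.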
It can be seen that there is only a small difference between the simulated and observed mean daily precipitation for all three methods. It is also evident that the simulation results of P95q and CDD obtained by BCC/RCG-WG are close to those simulated by SVM and SDSM, while the other statistical indicators simulated by BCC/RCG-WG have a relatively large deviation. In addition, SDSM has a slight advantage over SVM in the simulation results of P95q and R90t, showing that SDSM performs better at simulating precipitation extremes. SVM is more effective at simulating CDD, and its other indicators are similar to the SDSM results. The result for continuous dry days is better than that for continuous wet days, but both results are smaller than the observed values. Therefore, the simulation accuracy for rain/no-rain events needs to be further improved. SDSM has no significant advantage in the simulation of individual indicators but, comparing the simulation results of different seasons, is slightly better than the other two methods in overall stability. In general, the SVM model simulates the distribution of rainfall within the year better, while it is slightly less accurate than SDSM for precipitation extremes. This may be because SDSM estimates rainfall using the conditional probability of precipitation.
The comparison results shown in Figure 7 also indicate that BCC/RCG-WG performed worse than SDSM and SVM, while SDSM performed a little better than SVM. The poor simulation by WG in recent years is associated with the parameter estimation, which used observed climate data from before the year 2000. Since SDSM and SVM require the input of meteorological factors in the corresponding years to predict precipitation by using the established relationships between meteorological factors and precipitation, their results are relatively better than those obtained by BCC/RCG-WG.
Comparing SVM and SDSM, the results by SDSM are a little better than those by SVM because all 10 factors in Table 1 were used in SVM, while in SDSM the meteorological factors of each station are screened and the factors which correlate well with the precipitation are chosen (see Table 6).
stations. It can be seen from Table 4 that the efficiency coefficients of the precipitation sequence are negative, and the correlation coefficients are all below 0.11, when simulated using BCC/RCG-WG. This demonstrates that the precipitation process simulated by BCC/RCG-WG differs greatly from the observed data, which explains why the improvement in the efficiency coefficient from the BMA approach is not obvious. The correlation coefficients calculated by the BMA method are greater than those of the three single downscaling models and the simple average (SA) ensemble method, which indicates that the BMA method could improve the correlation between simulated and measured values. Because the same weight is applied to each individual method in the SA ensemble method, the simulation by SA is better than that by WG but worse than those by SVM and SDSM. However, the precipitation simulated by the BMA ensemble outperformed not only that of the individual downscaling methods but also that of the equal-weight ensemble. These results suggest that the BMA ensemble method is an effective method for improving model performance in climate downscaling.

Runoff Simulation.
In order to fully analyze the effect of rainfall-runoff simulation using the BMA method, the daily precipitation sequences obtained by SVM, BCC/RCG-WG, SDSM, SA, and the ensemble downscaling method based on BMA are put into the SWAT model, which was calibrated as described above. The simulated runoff results from 1991 to 2009 are compared with the measured data; the evaluation indicators are listed in Table 5 and the runoff process is shown in Figure 8. The runoff coefficient is only about 0.2 when using the input daily precipitation downscaled by the BCC/RCG-WG model. The value is significantly lower than the runoff coefficients calculated with the precipitation downscaled using SVM and SDSM. Judging from the relative error, the BCC/RCG-WG model performs well at simulating the total amount of precipitation but poorly at capturing the process of precipitation. The runoff processes simulated with the precipitation from the SVM and SDSM methods are similar, and the runoff response of these two models is almost the same, but the relative error of the indicators is significantly greater than the measured ones. This indicates that there are great uncertainties when simulating precipitation through downscaling. The NSC of simulated runoff calculated with input precipitation downscaled using SVM is bigger than that using SDSM. But from the ERR, which reflects the total error, we can see that SVM performs a little better. This runoff simulation result does not agree with the downscaled precipitation results, in which SDSM describes the precipitation more reasonably. Although precipitation is the main factor affecting runoff, evaluation indicators that describe precipitation well may not accurately describe the runoff or the correlation characteristics between rainfall and runoff. Similarly, the runoff simulation by the SA ensemble method is better than that by WG but worse than those by SVM and SDSM. It can be seen that the simulated runoff accuracy is improved when inputting the precipitation weighted by BMA, compared with the single downscaling methods. Also, the efficiency coefficient and correlation coefficient are slightly increased, and the relative error obviously decreases.
In general, due to the uncertainty and randomness of precipitation simulation, the statistical downscaling method focuses on the statistical distribution characteristics rather than on errors in the precipitation process. Therefore, statistical downscaling methods should be improved through comprehensive analysis of the precipitation simulation, and several methods should be combined to obtain better results.

Conclusions
This paper first assesses three downscaling models (SVM, BCC/RCG-WG, and SDSM) by comparing the predictands against observed historical data, then evaluates the runoff modelled by the SWAT rainfall-runoff model using the downscaled daily rainfall against observed historical runoff characteristics, and further proposes an ensemble downscaling method based on BMA that combines the above three statistical downscaling models; the performance of each model is measured by the chosen indexes. In particular, the downscaled precipitation from all these methods is put into the SWAT model for detailed comparison and analysis. The specific conclusions are as follows.
(1) In terms of mean precipitation, the downscaled results are close to the observed data, while for the other predictors each downscaling model has different performance. For the ability to estimate precipitation events and simulate the distribution of rainfall during the year, SVM performs well. SDSM has no significant advantage in the assessment of

Figure 1: The technical route of this study.

Figure 2: Location of the meteorological and hydrological stations in upper HRB.

Figure 4: Comparison between simulated and observed monthly streamflow in the study area during the calibration period of 1961-1990.
Model. The historical records for DJK Reservoir over the period of 1961-2009 are split into two periods: 1961-1990 for calibration and 1991-2009 for validation. The SWAT model is first calibrated on the period 1961-1990 and then validated on the period 1991-2009. Model calibration and validation are conducted by comparing the SWAT simulated data with the observed discharge on a monthly basis. Figures 4 and 5 compare simulated monthly streamflow with observed streamflow values. Except for several years, most of the periods have a very good agreement between the simulated and observed

Figure 7: (a) Monthly precipitation time series of observed data and data predicted by three methods and the relationship of observed monthly precipitation with monthly precipitations predicted by (b) SDSM, (c) SVM, and (d) BCC/RCG-WG.

Figure 8: Simulated monthly streamflow by the SWAT model through inputting the precipitation obtained using the BMA method. The red and blue solid curves are precipitation with values shown on the right y-axis, while the black and blue-dashed curves are runoff with values shown on the left y-axis. The scales have been adjusted to avoid overlapping of the curves.
Previous studies have shown that SDSM has superior capability to capture local-scale climate variability [9,10,40]. In this study, SDSM is established using NCEP/NCAR reanalysis data and observed data. The first 30 years (1961-1990) are used for calibrating the model, and the remaining 19 years (1991-2009) are used to validate the model.
Bayesian Model Averaging. The BMA probability density function (PDF) is a weighted average of the conditional PDFs given each of the individual models, weighted by their posterior model probabilities. BMA possesses a range of theoretical optimality properties and has shown good performance in a variety of simulated and real data situations [31]. Consider a quantity $y$ to be predicted on the basis of training data $D = [X, Y]$ ($X$ denotes input forcing data and $Y$ stands for the observational rainfall data). $f = [f_1, f_2, \ldots, f_K]$ is the ensemble of the $K$-member predictions. The posterior distribution of the BMA prediction is thus given as
$$p(y \mid D) = \sum_{k=1}^{K} p(f_k \mid D) \cdot p_k(y \mid f_k, D), \quad (1)$$
where $p(f_k \mid D)$ is the posterior probability of the prediction $f_k$ given the training data $D$ and reflects how well model $f_k$ fits $Y$. The posterior model probabilities add up to one, and they can thus be viewed as weights. $p_k(y \mid f_k, D)$ is the conditional PDF of the prediction $y$ conditional on $f_k$ and training data $D$, and it is always assumed to be a normal PDF and is represented as $g(y \mid f_k, \sigma_k^2) \sim N(f_k, \sigma_k^2)$.
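Equation (1) with normal conditional PDFs is a Gaussian mixture, which can be evaluated as in the sketch below. The weights, member predictions, and variances are hypothetical values chosen only to show the calculation.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bma_pdf(y, weights, predictions, variances):
    """BMA predictive density: weighted sum of the members' normal PDFs."""
    return sum(w * normal_pdf(y, f, v)
               for w, f, v in zip(weights, predictions, variances))

# Hypothetical three-member ensemble for a single time step.
weights = [0.5, 0.3, 0.2]
predictions = [4.0, 6.0, 1.0]
variances = [1.0, 2.0, 0.5]

# Trapezoidal check that the mixture integrates to approximately 1.
step = 0.01
grid = [-10.0 + step * i for i in range(3001)]
values = [bma_pdf(y, weights, predictions, variances) for y in grid]
area = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

Because the weights sum to one and each member PDF integrates to one, the mixture is itself a valid PDF, which the numerical integral confirms.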

Table 2: Results of three statistical downscaling methods (calibration and validation).

Table 3: Performance assessment of SWAT model during calibration and validation periods.

Table 3.
In the model calibration period, the relative error for monthly average streamflow is −6.31%, and RC and NSC are 0.95 and 0.85, respectively. All three are within the range of satisfactory accuracy. The model validation result shows that the model gives satisfactory and comparable performance on the streamflow simulation. Model performance over the validation period is acceptable, with values of −12.31% for ERR, 0.89 for RC, and 0.76 for NSC. Because the monthly data include the seasonal cycle, the correlation is very high. Thus, to objectively evaluate the predictability, the performance for each month is shown in Table 3. Considering individual months, the simulation results show good performance from May to October, although the accuracy was quite low from December to February. In other words, the SWAT model can operate well in the wet season but is less accurate in the dry season, like many other hydrological models. In general, the results indicate that it is feasible to apply the SWAT distributed hydrological model to streamflow simulations.

Table 4.
Figure 6 presents the effect of the BMA method through comparing the mean value of evaluation indicators of the nine

Table 4: Comparison of precipitation simulation results from the different methods.