Response of Extreme Hydrological Events to Climate Change in the Water Source Area for the Middle Route of South-to-North Water Diversion Project

As the water source area for the middle route of China’s South-to-North Water Diversion Project, the upper Hanjiang basin is of central concern for future management of the country’s water resources. The upper Hanjiang is also one of the most floodprone rivers in China. This paper explores the process of extreme floods by using multivariate analysis to characterize flood and precipitation event data in combination, for historical data and simulated data from global climate models. The results suggested that the generalized extreme value and Gammamodels better simulated the extreme precipitation and flood volume sequence than the generalized Pareto model for the annual maximum series, while the generalized Pareto distribution model was the best-fit model for peaks over threshold series. For the two-dimensional joint distributions of precipitation and flood volume, the Frank Copula was preferred in simulation of the annual maximum flood series whereas the Gumbel Copula was the most appropriate function to simulate the points over threshold flood series. We concluded that, compared with the traditional univariate approach, multivariate statistical analysis produced flood estimates that were more physically based and statistically sound and carried lower risk for flood design purposes.


Introduction
The impact of climate change on water resources is a matter of worldwide concern [1][2][3].Global warming accelerates processes of the hydrological cycle and leads to redistribution and change in the quantity of water resources in time and space.One important implication of this is predicted increases in the frequency and intensity of extreme hydrological events, namely, droughts and floods [4].Droughts have become more serious in the Sahara, South Africa, and eastern Asia, and floods have generally increased in America and Europe in the last few decades [5,6].Similar observations have been made in China [7].As well as risks to human safety, extreme hydrological events cause economic losses, and these costs are rising exponentially [8], threatening sustainable development.There is a need then to improve the scientific understanding of trends and patterns in extreme hydrological events in the context of global climate change to inform planning for disaster protection and alleviation.
There is a large body of literature on the topic of characterizing the statistical distributions and likely future patterns of extreme hydrological events, with a few key papers highlighted below.Müller-Wohlfeil et al. [9] used a global climate model (GCM) downscaling and hydrological model to simulate extreme hydrological processes under current climate conditions and future scenarios.The impact and uncertainties of climate change on hydrology was assessed by Dessu and Melesse [10] by comparing and contrasting results across diverse GCMs, future climate scenarios, and two downscaling methods.A dramatic increase in the frequency of the heaviest precipitation events over Britain in the future was predicted by Jones and Reid [11].Using a secondgeneration coupled global climate model under different emission scenarios and fitting a generalized extreme value

Study Area and Data
2.1.Study Area.Hanjiang, with a watershed area of 159,000 km 2 and a mainstream length of 1577 km, is the largest tributary of the Yangtze River.It originates in the south Qinling Mountains and flows through the five provinces of Shanxi, Gansu, Sichuan, Henan, and Hubei.The focus of this study is the headwater area located upstream of the Danjiangkou Reservoir between 106-112 ∘ E and 31.4-34.1 ∘ N.This area serves as the water source area for the middle route of SNWDP, which draws water from Danjiangkou Reservoir.In this headwater area the mainstream has a length of 925 km and drains a watershed area of 95,200 km 2 .The subtropical monsoon climate gives rise to mild and humid weather, with an annual average temperature of 14.6 ∘ C and mean annual precipitation of 819.5 mm.Rainfall is unevenly distributed in the basin, declining from south to north.The precipitation is strongly seasonal, with 70% occurring between May and September.The mean annual runoff is about 368.7 billion m 3  and is also strongly seasonal, being dominantly sourced from storm event surface runoff.1).The areal mean precipitation was calculated with the Thiessen polygon method.Two extreme series were considered, the annual maximum (AM) series and peaks over threshold (POT) series.The AM series comprised the maximum daily precipitation in each year.The POT series comprised the daily precipitation exceeding the 98th percentile value of the data.

GCM Data.
The Fourth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR4) provides 23 GCMs which have the ability to simulate current climate over East Asia with a given degree of uncertainty [18].In order to reduce the uncertainties of GCMs simulation of precipitation, it is common practice to adopt the coupled

Copula function 2D distribution 3D distribution
Gumbel-Hougaard climate model rather than the single model.The relative error of the simulation results for extreme precipitation indices with the coupled climate model is smaller than with the single model [19].However, our research focuses on extreme events, and the homogenizing effect of the coupled climate model would dilute the extremes.Therefore, the single model dataset from the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project Phase 3 (CMIP3) was applied in this study.The Special Report for Emission Scenarios (SRES) [20] developed four future greenhouse gas emission scenarios on the basis of possible long-term global and regional dynamics of the 21st century, and three of them, A2 (high emission), A1B (medium emission) and B1 (low emission) [18], were selected for use in this study.The daily precipitation series from CSIRO MK3 5, INMCM3 0, and NCAR PCM1 GCM models were adopted.

Extreme Value Statistical Probability Models.
The Gamma, GEV, and GPD models have been widely applied in simulations of extreme hydrological events.The GEV distribution integrates three extreme distributions [21][22][23], including the Weibull, Gumbel, and Fréchet.As the GEV distribution is independent of the probability distribution characteristics of the original data and only samples the extreme value, it is the most direct description of extreme climate information contained within climate observation data.The Gamma distribution is the most important skewed distribution in climatological statistics, it can be used to fit normal distributions, and it shows high stability in description of precipitation [24].
In application of the POT series, the GDP is often used to describe the probability distribution of all observation data beyond a certain threshold value [25].The POT method increases the number of measurements included in the analysis and correspondingly reduces the statistical uncertainty of quantile variances and improves the fitting accuracy.The cumulative distribution functions (CDFs) of the GEV, GPD, and Gamma distribution are expressed as where  and  are shape parameters,  and  are scale parameters, and  is a location parameter.

Copula Function.
The Copula function plays an important role in the study of multivariate extreme theory [26][27][28].The Copula function can connect joint distributions of several random variables with their marginal distributions [26].In this study, three Copula functions, Gumbel-Hougaard, Clayton, and Frank, were used to build a joint distribution model of precipitation and flood discharge.These functions can be described by two-dimensional (2D) and three-dimensional (3D) distributions (Table 1).In this case, the joint CDF with two or three variables can expressed as where   () is the marginal CDF of each variable.

Selection of the
where   () is actual frequency and   () is theoretical frequency.

Return Periods.
A return period is an estimate of the average time between rainfall or flood events of a given magnitude.The return periods for variables greater than (or equal to) a specific value are usually determined as For bivariate distributions, the probability that both  1 and  2 exceed certain thresholds can be derived in terms of Copulas: The joint return periods can be expressed as For the 3D distribution, the joint return periods can be expressed as ) . (7)

Mann-Kendall Tests for Trend and Change Point in Time
Series.The Mann-Kendall (MK) test is a nonparametric method of detecting monotonic trend in a data series [29,30].
As the MK method does not require the data to conform to any particular distribution and is less sensitive to outliers, it has been widely applied to hydrological data, which rarely follows a normal distribution.The data should not be serially correlated, so for this study the data were first prewhitened [31].
For a time series {  ;  = 1, 2, . . ., }, the test statistic  is defined as When  ≥ 10, the distribution of  approaches a normal distribution, and the mean and variance of  are given as The normalized test statistic  is calculated as In a two-side trend test, the null hypothesis of no trend is rejected if || >  1−/2 at the  level of significance ( = 5% in this study).A positive  shows increasing trend and a negative  shows decreasing trend.
The sequential version of Mann-Kendall test [32] is used to test assumptions about the start of a trend within a sample based on rank series of progressive and retrograde rows of the sample.The sequential MK test is therefore useful for detecting abrupt change points in a hydrological series [33,34].For a time series {  ;  = 1, 2, . . ., }, the rank series   is defined as The mean and variance of   are given as Under the assumption that the time sequences are independent, the normalized test statistic   is defined as which is the forward sequence, and the backward sequence   is calculated using the same equation but in the reverse data series. 1 = 0, and the distribution of   approaches a normal distribution.If   > 0, the trend is increasing with time, and if   < 0, the trend is decreasing with time.The calculated   value is compared with the standard normal distribution table with two-tailed confidence levels.If

Univariate Frequency Analysis of Extreme Hydrological
Events.The maximum 1-day precipitation (RX1day) series, maximum 3-day precipitation (RX3day) series, maximum 1day flood discharge (W1) series, and maximum 3-days flood discharge (W3) series were established for the upper reaches of the Hanjiang basin by the AM and POT methods.The GEV, GPD, and Gamma models were selected for fitting of the series.The KS test statistic  was less than the critical value 0.21 ( < 0.05) for the three models, suggesting that all were an adequate fit (Table 2).For the AM series, the Gamma model was the best-fit model for RX1day, RX3day, and W1 series, while GPD model was optimal for the W3 series.As a whole, the GEV and Gamma models better simulated the AM series than did the GPD model.For the POT series, the GPD model was the best fit for three of the four series (Table 2).Estimated extreme precipitation and flood discharge associated with return periods of 10, 50, 100, 500, and 1000 years were calculated by these three distribution models for the AM and POT series (Table 3).The results indicated that, for AM series, for recurrence intervals ≥50 years, values of precipitation and flood discharge estimated by GPD model were noticeably lower than those by GEV and Gamma models.These lower values would translate to a higher risk for flood planning so GDP was regarded as unsuitable for the AM series.For the POT series, values estimated by GEV model were higher than those by GPD and Gamma models, and the values were distant from the observed data, indicating that GEV was unsuitable for the POT series (Table 3).For the AM series, the values calculated by Gamma distribution model indicated slightly higher values series than for the POT series.Adopting the more conservative estimates of the AM series would provide lower risk for flood planning.

Multivariate Frequency Analysis of Extreme Hydrological
Events.The Kendall tau rank correlation coefficient and the Spearman's rank correlation coefficient were used to analyze the degree of bivariate correlation between pairs of RX1day, RX3day, W1, and W3 for the AM and POT series (Table 4).There were significant positive correlations between extreme precipitation and extreme flood discharge, allowing the possibility of building a joint distribution model of precipitation and flood discharge using Copula functions.
Two-dimensional joint distributions were established based on the Gumbel-Hougaard, Clayton, and Frank Copula functions.The KS test and RMSE criterion suggested that the Frank Copula was superior for simulation of the AM series and the corresponding GEV distribution, while the Gumbel-Hougaard Copula was superior for simulation of the POT series and the corresponding GPD model (Table 5).All the values of the KS test statistic  of the optimal functions were smaller than the critical value of 0.21 under the significant level of  = 0.05, demonstrating that the simulations passed the KS test.
The 2D Frank Copula (Figure 2) and Gumbel-Hougaard Copula (Figure 3) functions both displayed highly significant correlation between the empirical and theoretical frequencies.For the AM series, the correlation coefficient for RX1day-W1, RX1day-W3, RX3day-W1, and RX3day-W3 were 0.986, 0.989, 0.991, and 0.991, respectively, demonstrating that the Frank Copula function was a good fit to the AM samples.For the POT series, the correlation coefficient for RX1day-W1, RX1day-W3, RX3day-W1, and RX3day-W3 were 0.992, 0.995, 0.994, and 0.993, respectively, demonstrating that the Gumbel-Hougaard Copula function was a good fit to the POT samples and could be used to build the 2D joint distributions of precipitation and flood discharge.
Extreme precipitation and flood discharge under the return periods of 10, 20, 50, 100, 200, 500, and 1000 years were estimated using the 2D joint distributions (Figure 4).It was assumed that the frequency of precipitation and floods were the same.The notation used to describe the joint distributions was that, for example, "AM, RX1day-W1" represents that the variables of the ordinate were calculated via joint distributions of RX1day and W1 from the AM series, and so on (Figure 4).
Similar to the 2D joint distributions, 3D joint distributions were established based on the Gumbel, Clayton, and Frank Copula functions.The Frank Copula was the best-fit function for both the AM and POT series.Extreme precipitation and flood volume under the return periods of 10, 20, 50, 100, 200, 500, and 1000 years were estimated (Figure 5).The notation used to describe the joint distributions was that, for example, "AM, RX1day-W1-W3" represents that the variables of the ordinate were calculated by joint distributions of RX1day, W1, and W3 from the AM series, and so on (Figure 5).
The estimated extreme precipitation rates and flood discharges for the AM series were larger than those for the POT series under the same return period (Figures 4 and 5).Also, the estimated extreme value of the POT series did not increase with increasing return period to the same extent as the AM series.It appears that the Copula function and the corresponding GPD model for the POT series was unsuitable for estimating the extreme value under large return periods because of the limited extreme sample of the POT series.This result suggests that the AM series better described the  extreme hydrological events for the purpose of lower risk flood planning.The design floods of Danjiangkou Reservoir dam used in the preliminary design stage, calculated by a traditional single distribution hydrological method [37], were compared with the design value of W1 under a range of return periods calculated by 2D and 3D Copula functions for the AM series (Table 6).The design value calculated by the 3D Copula function was the highest of these for all return periods.The design values calculated by the 2D Copula function were slightly larger than the ones used in the preliminary dam design stage for 10-, 20-, and 1000-year return periods, while 2D Copula function estimates were slightly lower than preliminary dam design estimates for 50-and 100year return periods.The joint distribution makes more use of available extreme information than a traditional single distribution model, and the higher estimates of flood peaks given by the joint distribution are more conservative for application to flood planning.Overall, the results of the multivariate frequency analysis suggests that, in the case of the upper Hanjiang, this approach to describing extreme floods would lead to more conservative (lower risk) design than a traditional approach.

Prediction of Extreme Hydrological Events
under Future Climate Change Scenarios  INMCM3 0, and NCAR PCM1, having complete daily outputs of precipitation data, were utilized for this study.
Observed daily precipitation data in the baseline period  was used to assess the capability of these three GCMs in simulating the historical precipitation.Precipitation in the example years 1970,1980,1990, and 2000 and the total precipitation over the 40 years of the baseline period were used for comparison (Table 7).For the simulation of the total precipitation over 40 years, the errors of three models CSIRO MK3 5, INMCM3 0, and NCAR PCM1 were 3.65%, 88.69%, and 70.05%, respectively.As well as closely predicting the 40-year total precipitation, the CSIRO MK3 5 model also closely predicted annual precipitation for the four example years, while the other two models were highly inferior (Table 7).This result is in agreement with previous work that has demonstrated the capacity of CSIRO MK3 5 to simulate the contemporary climate of China [38,39].On this basis, the CSIRO MK3 5 was chosen to simulate hydrological events in the study area under future climate change scenarios.The annual precipitation series over the baseline period simulated by the CSIRO MK3 5 model fitted the observed pattern of annual precipitation reasonably well, with both series having negative trend (Figure 6).However, the rate test for trend in future extreme precipitation under the three climate change scenarios.The  values of RX1day were 2.152, 2.167, and 1.454 for A1B, A2, and B1 scenarios, respectively, indicating an increasing trend of RX1day series under each scenario, and the trend was significant under A1B and A2 scenarios.The  values of RX3day were 1.37, 1.438, and 0.452 for A1B, A2, and B1 scenarios, respectively, indicating increasing trend but the trend was not significant.The results of sequential MK test statistic (Figure 10) indicated a similar pattern over time for the three climate change scenarios.The   curves showed an increasing trend over the considered periods, and the   curves indicated a decreasing trend.Under the A1B scenarios, the sequential version of MK test for RX1day series indicated a change point in 2053 (Figure 10(a)), and the   also displayed that RX1day had shown an increasing trend since 2053.The   and   plots for RX3day series intersected each other several times (Figure 10(b)), but only 2046 can be recognized as a change point.Under the A2 scenarios, both RX1day and RX3day series had a change point in 2072, and the extreme precipitation showed a decreasing trend from 2016 till 2072 and afterward it had increased.Under the B1 scenarios, the   and   plots indicated that the change points of RX1day series were detected in 2025 and 2085, and the change points of RX3day series occurred in 2030 and 2075.Future precipitation had no significant trend under the three climate change scenarios (Figure 8), but the extreme precipitation had an increasing trend in most years (Figure 10), indicating that the proportion of the extreme precipitation in total precipitation increased constantly.

Change Trends of Future Extreme Flood.
It was assumed that the frequencies of extreme precipitation and extreme flood are the same.Based on the Frank Copula function, the joint probability of precipitation and flood was calculated by the probability of future precipitation.This was used to represent the probability of floods in order to calculate the future 1-day (W1) and 3-day (W3) flood discharges via the marginal distribution function of flood volume.The future extreme flood discharge time series obtained for the AM series of RX1day and RX3day under the A1B, A2, and B1 scenarios indicated that future extreme flood volumes were    larger under the A2 scenarios than under the A1B and B1 scenarios (Figure 11).
The future extreme 1-day (W1) and 3-day (W3) flood discharges calculated from precipitation for a range of return periods were slightly smaller than the corresponding values calculated from the historical data (Table 8).This was consistent with the results of the previous research [40].In China, the convention is to classify floods by size according to return period: (1) small, having a return period less than 5 years; (2) medium, having a return period between 5 and 20 years; (3) large, having a return period longer than 20 years.The frequencies of occurrence of floods of the three conventional size grades under the A1B, A2, and B1 scenarios (Figure 12) indicate that under the future scenarios small floods would occur less frequently than under historical conditions, while the frequency of occurrence of the medium floods and large floods under the future scenarios would be higher than the observed frequency of occurrence.Also, the frequency of occurrence of large floods under the A2 scenario was higher than that under the A1B and B1 scenarios, while the frequency of occurrence of small floods under the A2 scenario was less than that under the A1B and B1 scenarios.

Conclusions
In this study, the GEV, GPD, and Gamma distribution models and Copula functions were applied to estimate extreme hydrological events from 1969 to 2009 in the water source area of the middle route of South-to-North Water Diversion Project in China.Based on the simulated results of 23 GCMs from the World Climate Research Programme's CMIP3 single-model dataset in the Intergovernmental Panel on Climate Change Fourth Assessment Report, the future extreme hydrological events from 2016 to 2100 were simulated   under the A1B, A2, and B1 scenarios.The main conclusions can be summarized as follows: (1) For the AM (annual maximum) series, the GEV and Gamma model better simulated the extreme precipitation and flood volume distributions than the GPD model, while the GPD model was the best fit for the POT (peaks over threshold) series.
(2) For the 2D (two-dimensional) joint distributions of precipitation and flood volume, the Frank Copula performed better in simulation of the AM series and the corresponding GEV distribution, whereas the Gumbel Copula was the most appropriate function to simulate the POT series and the corresponding GPD distribution.The estimated extreme precipitation and flood discharges of the AM series were larger than those of the POT series for the same return period.
Adopting the more conservative estimates of the AM series would provide lower risk for flood planning.
(3) For the same return period, the magnitudes of the design floods calculated by the 2D and 3D (threedimensional) Copula functions were larger than those used in the preliminary design stage of Danjiangkou Reservoir.The joint distributions utilize more of the available extreme information, and the higher estimated flood magnitudes carry lower risk for design purposes, suggesting that multivariate statistical analysis has benefits over a traditional univariate approach.
(4) The outputs of CSIRO MK3 5 global climate model were applied to simulate the future precipitation over the study area from 2016 to 2100.The results suggested that the future precipitation shows no significant trend under the three climate change scenarios, but the extreme precipitation showed a tendency that it will decrease in the first few years and increase in the last few years under these three scenarios, which indicated that the proportion of the extreme precipitation in total precipitation increases constantly.(5) The future extreme flood discharges were estimated to be slightly smaller than the historical values for the same return period, while the occurrence frequency of the medium and large floods under the future scenarios is higher than the observed occurrence frequency.The frequency of occurrence of the large flood under the A2 scenario is higher than that under the A1B and B1 scenarios, while the frequency of occurrence of the small flood under A2 scenario is less than that under the A1B and B1 scenarios.

Figure 1 :
Figure 1: Locations of meteorological stations and reservoir in the study area.

Figure 4 :
Figure 4: Extreme precipitation and flood discharge estimated via 2D joint distributions under a range of return periods.

Figure 5 :
Figure 5: Extreme precipitation and flood discharge estimated via 3D joint distributions under a range of return periods.

Figure 6 :
Figure 6: Comparison of simulated and observed annual precipitation in the study area.

Figure 7 :
Figure 7: Annual precipitation changes from 2016 to 2100 under three climate change scenarios.

Figure 8 :
Figure 8: MK test score for annual precipitation from 2016 to 2100 under three climate change scenarios.

Figure 10 :
Figure 10: The MK test score of extreme precipitation from 2016 to 2100 under three climate change scenarios; (a) RX1day, (b) RX3day.

Figure 11 :Figure 12 :
Figure 11: Extreme flood volume changes from 2016 to 2100 under three climate change calculated by (a) RX1day and (b) RX3day and W3 calculated by (c) RX1day and (d) RX3day.

Table
: The Gumbel-Hougaard, Clayton, and Frank Copula functions and model parameters.

Table 2 :
[35,36]test statistic  value of marginal distribution models.||>  /2 , the trend is statistically significant; otherwise, the trend is not significant.The sequential MK test enables detection of the approximate beginning of a developing trend from the intersection point of the curves   and   of the test statistic.If the intersection point is significant at  = 0.05, then the critical point of change is at that period[35,36].

Table 3 :
Extreme precipitation and flood discharge estimated via marginal distribution models for a range of return periods.

Table 4 :
Correlation coefficient of the different extreme samples.
Note:  is the Kendall coefficient;  is the Spearman's coefficient.RX1day-W1 represents the joint distribution of RX1day and W1.

Table 5 :
Optimal function and evaluation indicators of 2D joint distribution for AM and POT series.

Table 6 :
The design flood calculated by different methods under a range of return periods.

Table 7 :
Comparison of simulated and observed annual precipitation in the study area.

Table 8 :
Future and historical predicted extreme flood discharges for a range of return periods.