Exploration of Use of Copulas in Analysing the Relationship between Precipitation and Meteorological Drought in Beijing , China

1College ofWater Sciences, BeijingNormalUniversity, BeijingKey Laboratory ofUrbanHydrological Cycle and SpongeCity Technology, Beijing 100875, China 2Agricultural Water Conservancy Department, Changjiang River Scientific Research Institute, Wuhan 430015, China 3Environmental Science Division, Argonne National Laboratory, Lemont, IL 60439, USA 4State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research, Beijing 100038, China


Introduction
Drought is a recurrent extreme climate event that affects every aspect of the natural environment, as well as human lives [1].It is defined as a natural temporary imbalance in water availability, which may be related to persistently lower-than-average precipitation.Droughts have uncertain frequencies, durations, and severities, and their occurrence is difficult to predict.Additionally, the water resource availability and carrying capacity of an ecosystem are diminished [2,3].Based on these factors, drought analysis has been a challenging topic in water resource management.
A variety of studies have examined the different aspects of drought phenomena by developing different drought indices [4][5][6][7][8][9], analysing the spatiotemporal patterns of drought [10][11][12], and predicting future drought risks [13,14].Regarded as a complex hydrologic event, drought is a type of multivariate event that is characterized by a few correlated random variables.Thus, a number of studies have focused on constructing multivariate distributions of drought characteristics, such as the intensity, duration, frequency, and recurrence [15][16][17][18].Copulas are effective tools in multivariate distribution construction.They use a nonlinear approach to establish the joint probability distribution of two or more related variables and are able to model the dependence structure between random variables independent of their marginal distributions.Since Sklar [19] first introduced copulas, they have been widely used in water-related research fields.When using copulas for drought analysis, a joint probability distribution has commonly been used between different characteristics of drought.Shiau [20] used two-dimensional copulas to construct a joint drought duration and severity distribution for the first time.Song and Singh [21] used three-dimensional metaelliptical copulas to construct a joint drought duration, severity, and interarrival time distribution.Similar studies [22][23][24][25][26][27] have shown that copulas are useful in exploring the associations among correlated drought variables.
Apart from the joint probability distribution, copulas are also able to establish conditional probability distributions.Unlike joint probability, conditional properties can provide the probability distributions of drought under different conditions.Saghafian and Mehdikhani [28] investigated the conditional return periods of drought severity, peak, and duration based on a conditional probability distribution established by copulas.Additionally, Madadgar and Moradkhani [29] applied conditional probability for drought prediction.By incorporating copulas into a Bayesian framework, they developed a probabilistic forecasting model for predicting future spring drought status given the drought conditions in the previous season.Hao et al. [30] proposed a hydrological drought prediction method based on a meta-Gaussian model of conditional probability.The model considered the drought persistence and the prior meteorological drought conditions.
Since climate change and the increasing global warming trend have affected hydrological events in recent years, it is important to consider climatic variables when analysing drought.Generally, drought originates from a precipitation deficit, which first causes meteorological drought.Zhang et al. [31] studied how precipitation and temperature affect drought occurrence.They concluded that the similarity coefficient between the drought probability and low precipitation probability in China was 0.9732, reflecting a strong relationship between drought and precipitation.Copulas can be used to combine precipitation and drought indexes and to provide conditional probability distributions (with precipitation as the condition) between them.From this conditional probability distribution, we can quantify the probability of drought occurrence under different precipitation conditions.
Beijing is located in the dry northern part of China, which experienced water crises caused by drought from 1980 to 1985 and from 1999 to 2007 [32].Cai et al. [33] evaluated the spatiotemporal characteristics of drought in the Beijing-Tianjin-Hebei metropolitan region.Liu et al. [34] proposed a vegetation-dependent temperature-vegetation dryness index model for analysing regional drought disasters in the Beijing-Tianjin-Hebei metropolitan region.Li et al. [35] analysed the joint probability and return period of the drought severity and duration during the growth period of winter wheat in Beijing.Such studies have focused more on the spatiotemporal characteristics of drought and drought variables.They found that Beijing experienced quite frequent moderate and severe droughts, especially during the periods of 1965-1973 and 1997-2002.In addition, the risk of drought during winter was very high, and the area affected by drought was >70,000 ha.However, similar studies have focused less on how the climatic variables can affect drought conditions.
In this study, a conditional probability distribution was applied for meteorological drought analysis based on copulas.Monthly total precipitation and monthly mean temperature data from 1951 to 2013 were collected from one meteorological station to calculate the Standardized Precipitation Evapotranspiration Index (SPEI).After a temporal analysis of the SPEI based on the Hilbert-Huang Transform (HHT) and Mann-Kendall (MK) trend test, the meteorological drought probability was analysed under three different precipitation conditions (precipitation equal to, greater than, or less than a threshold).

Data and SPEI Calculation.
The SPEI was used to represent meteorological drought.A meteorological station in Beijing (latitude 39.80 ∘ N, longitude 116.47 ∘ E) was chosen as the study site.The monthly total precipitation (mm) and monthly mean air temperature ( ∘ C) data from 1951 to 2013 were collected from the National Climate Center of the China Meteorological Administration (http://data.cma.cn/).The SPEI describes the effects of drought on vegetation and agricultural practices on short-term scales and represents a broad proxy for water resource management on longterm scales [36].In this study, we calculated both shortterm (1 month) and long-term (12 months) SPEI values for meteorological drought analysis.According to the definition of the SPEI, the -month SPEI is based on precipitation and evapotranspiration total for the previous  months.Thus, a 12-month SPEI represents conditions over the full calendar year, while a 1-month SPEI represents conditions within a single month.The aim of analysing the temporal characteristics of the SPEI is to explore the changing patterns in meteorological drought conditions between different years and provide a holistic view.Thus, we chose the 12-month SPEI for this part of the analysis.Meanwhile, the goal of the copula-based analysis of precipitation and SPEI is to obtain the conditional probability distributions of meteorological drought given different precipitation conditions.Conditional probability distributions can be used for meteorological drought prediction in the future.Because short-term drought prediction is more practical than long-term prediction, we used the 1-month SPEI for the copulas-based analysis.
The calculation process can be divided into the following three steps [8].
(1) Calculate the water balance for month  according to the following equation: where  is monthly total precipitation and PET is potential evapotranspiration, which was calculated using the Thornthwaite method [37].The calculated   values are aggregated at different time scales.The difference   , in month  and year  depends on the chosen time scale .For example, the where  , is the  − PET difference in month  and year i, in millimetres.
(2) Use a three-parameter log-logistic probability density function to fit the established  series.Then, obtain the cumulative probability function as follows: where () is the probability density function of  series; () is the probability distribution function of  series; and , , and  are scale, shape, and origin parameters, respectively, for  values, which can be obtained using an Lmoments approach.
The SPEI follows the same classification criteria as the Standardized Precipitation Index (SPI) because of the similarity in the calculation principles of SPEI and SPI [11].Table 1 reports the climate classification according to the SPEI [5].

Hilbert-Huang Transform.
The HHT method was used to analyse the periodic features of the SPEI.The HHT comprises two algorithms: empirical mode decomposition (EMD) and Hilbert transform.After the EMD application, the original time series can be divided into several intrinsic mode functions (IMF) and one residual (RES).Each IMF represents a period, and RES represents the overall trend of the original time series.A Hilbert transform is then performed on each IMF to find its instantaneous frequency and amplitude.Thus, it provides the time-frequency-amplitude relationship of all the IMFs together.The calculation process is based on that presented in [38].

Mann-Kendall Trend Test with a Running Approach.
The MK nonparametric test is widely used in the hydrological field for trend analysis [39,40].The related equations for calculating the MK test statistic  and the standardized test statistic  are as follows: where In this paper, the MK test was employed for SPEI trend analysis using a running approach, which was conducted as shown in Figure 1 [36].
When applying the approach shown in Figure 1 in this paper, the minimum window width  min was 10 years.The maximum window width was 63 years.

Copula Definition and Selection.
Copulas offer a flexible way of describing the dependence structure of multivariate data independent of their marginal probability distributions.Additionally, copulas are powerful tools for modelling and sampling multivariate data that are nonlinearly interrelated.Gumbel, Frank, and Clayton copulas are all Archimedean copulas, which have been widely used in drought analysis.In this paper, these three types of copulas are chosen to match the two variables (precipitation and SPEI).The cumulative Draw Z values in a diagram according to the window width and the index of first data within the window window width Change the window width from w min to n; repeat the former two steps for every Move the window forward; calculate Z value for data within the window, until the running window covers the last data-x n calculating window; then calculate Z value for (x 1 , x 2 , . . ., x wmin ) For time series (x 1 , x 2 , x 3 , . . ., x n ), set a minimum width w min for the running distribution functions (CDF) of the Gumbel, Frank, and Clayton copulas are described in ( 6) to ( 8): Frank is Clayton is where  and V are two variables and  is the copula parameter. is the Kendall rank correlation coefficient, which can be calculated using (7): where sgn is the sign function,  is the number of samples, and (  ,   ) is the actual data.Ordinary least squares (OLS), mean square error (MSE) [41], Akaike information criterion (AIC) [42], and Bayesian information criterion (BIC) [43] were used to assess the fitting results of different copulas and to identify the bestperforming one.The minimum value reflects the best-fitting result.The OLS, MSE, AIC, and BIC measures were defined as follows: where   is the empirical frequency,   is the theoretical frequency,  is the number of copula parameters, and  is the number of samples.

Conditional Probability Distribution.
To distinguish between the abbreviations of precipitation and probability (both capital letter P), we use R (short for rainfall) as the abbreviation for precipitation and  as the abbreviation for SPEI.With the joint probability distribution of precipitation and SPEI, the conditional probability can be calculated in three ways.Given a certain precipitation threshold ( = ), the conditional CDF of the SPEI is given as follows [44]: If precipitation exceeds a certain threshold ( ≥ ), the probability that the SPEI is less than a certain threshold ( ≤ ) or exceeds a certain threshold ( ≥ ) is given as follows [45]: where   () and   () are the marginal CDFs of precipitation and the SPEI, respectively;  , ( | ) is the joint CDF of precipitation and SPEI, which is derived using copulas; and  is the unique copula function that links   () and   () to form the joint CDF.When applying copulas, five steps should be followed.
Step 1. Determine the marginal distributions of two variables based on the collected data.
Step 2. Determine the Kendall rank correlation coefficient  to establish whether the two variables are related.
Step 3.According to , calculate the parameter  for the Gumbel, Frank, and Clayton copulas to obtain the expressions of different copula functions.
Step 4. Choose the best-fitting copula function based on OLS, MSE, AIC, and BIC criteria.
Step 5. Calculate the conditional probability distribution according to the corresponding functions.
All the methods used in this paper, such as HHT, the MK trend test, and copulas, were executed using MATLAB5 software.

SPEI Calculation Results.
The SPEI was calculated at both the 1-month (short-term) and 12-month (long-term) scales from 1951 to 2013 (Figure 2).Both the 1-month SPEI and 12-month SPEI exhibited fluctuating patterns.The 12month SPEI (Figure 2(a)) was used for trend and period analysis, while the 1-month SPEI (Figure 2(b)) was used for establishing copula functions between monthly precipitation () and the SPEI.

SPEI Trend and Period.
The HHT method was applied to analyse the periodic features of the 12-month SPEI.After the EMD application, the original SPEI was decomposed into four IMFs and one RES (Figure 3).Hilbert transform was conducted on four IMFs and one RES to obtain the timefrequency-amplitude Hilbert spectrum (Figure 4).From the EMD results in Figure 3, the average periods of IMF1, IMF2, IMF3, and IMF4 were 2.7 years, 5 years, 10 years, and 18 years, respectively.In particular, the RES exhibited a decreasing trend.The Hilbert spectrum (Figure 4) offers a different method of analysing IMF and RES values from the point of view of energy variations.The colours of lines and dots represent the energy.The period is the reciprocal of the frequency.Thus, a lower frequency reflects a longer period.The two lines near the horizontal ordinate can be described as lowfrequency and high-energy lines.The periods of these two lines were larger than those of others, indicating that RES and IMF4 had periods longer than 18 years.This phenomenon  indicates that large periods, especially periods over 18 years, were the major SPEI periods, and they significantly influenced the variations in SPEI.In the upper part of Figure 3, the high-frequency IMFs appear as vaguely scattered dots.These scattered dots also displayed high energy, indicating that short 2-5-year periods can significantly influence the variations in SPEI.
The MK trend test with a running approach was applied to analyse the trend characteristics of different start times and durations.The trends were plotted for better visualization.The -axis indicates the window width, and the -axis is the first year of the window to which the trend refers (Figure 5).The value of the trend is characterized by the coloured triangle symbols.An increasing or decreasing trend is significant when || > 1.96.This figure captures the entire spectrum of significant trends present in the series; thus, it provides all detailed subseries information.The frequency of each symbol in Figure 5 is summarized in Table 2.
From Table 2, 80% of the subseries displayed a general decrease, half of which were significant.Most blue inverted triangle symbols appeared in the upper-right portion of Figure 5, indicating a significant decreasing trend for the entire SPEI series.For the largest window widths, a significant decreasing pattern was detected (the blue inverted triangle in the upper-left portion of the figure).In the first years of the observation period, for temporal window widths of less than 15 years, an insignificant increasing trend was detected (yellow triangles in the bottom-left portion of the figure).This phenomenon is consistent with the RES result, with a slight increasing trend from 1951 to 1965.A significant increasing trend was also detected in recent years from 1998 to 2013 (red triangles in the bottom-right portion of the figure).

Conditional Probability between Precipitation and SPEI.
Figure 6 illustrates the marginal distributions of  and the Windows (years) 1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 2005 1950 Starting year  SPEI.The CDF of precipitation fit an exponential distribution, while the SPEI values fit a normal distribution.
The statistical dependencies between  and SPEI were estimated using statistical measures.Kendall's  was calculated to be 0.734, indicating a strong positive dependence between  and SPEI.Initially, the Gumbel, Frank, and Clayton copulas were established to jointly describe the two variables.The OLS, MSE, AIC, and BIC criteria were used to compare the fits of the different copula models and to select the model that best described the dependency between the two variables.The results, which are presented in Table 3,  show that the best-fitting model was the Clayton copula; therefore, it was chosen for use in this study.
When precipitation equals a threshold, that is,  = , the probability of SPEI ≤  can be calculated based on (12).Figure 7(a) illustrates the conditional probabilities in three ways: SPEI ≤ −2, SPEI ≤ −1.5, and SPEI ≤ −1.According to the classification of SPEI in Table 1, SPEI ≤ −2 represents the conditions of extreme drought and worse, SPEI ≤ −1.5 represents the conditions of severe drought and worse, and SPEI ≤ −1 represents the conditions of moderate drought and worse.In Figure 7(a), all three curves exhibit decreasing trends, which suggests that the probability of drought decreases as monthly precipitation increases.Additionally, the slopes of the three curves display the same decreasing pattern.The curve of SPEI ≤ −1 is above the other two curves, indicating a higher probability.
When precipitation exceeds a threshold, that is,  ≥ , the probability of SPEI ≤  can be calculated based on (13).Figure 7(b) illustrates the conditional probabilities of five conditions: R ≥ 10 mm, R ≥ 20 mm, R ≥ 50 mm, R ≥ 100 mm, and R ≥ 200 mm.The -axis represents the values of SPEI, while the -axis represents the probability that SPEI is less than the corresponding value.In other words, these curves can be regarded as the CDF of the SPEI.The shape of the five curves is similar to a normal distribution, in which the probability that SPEI = 0 is approximately 0.5.The curve of R ≥ 200 mm plots lowest in the figure, while the curve of R ≥ 10 mm is at the top.This result suggests that the probability of drought decreases as the precipitation threshold increases.
When precipitation is equal to or less than a threshold, that is,  ≤ , the probability of SPEI ≤  can be calculated based on (14).Figure 7(c) illustrates the conditional probabilities of five conditions: R ≤ 10 mm, R ≤ 20 mm, R ≤ 50 mm, R ≤ 100 mm, and R ≤ 200 mm.In this sense, Figure 7(c) also illustrates the CDF of SPEI for precipitation greater than a threshold.The shapes of the five curves in Figure 7(c) are similar to a normal distribution.Similar to the curves in Figure 7(b), the curve of R ≤ 200 mm plots at the bottom, while the curve of R ≤ 10 mm is at the top.This result suggests that the probability of drought decreases as the threshold of precipitation increases.
The three diagrams in Figure 7 all correspond to the same trend: the probability of drought decreases as precipitation increases.This rule is logical, as precipitation can alleviate drought.However, the diagrams in Figure 7 can further provide a quantitative relationship between precipitation and meteorological drought.In Figure 7(a), when precipitation ranges from 0 to 20 mm, the curves of SPEI ≤ −1 and SPEI ≤ 1.5 exhibit large decreasing trends; however, when precipitation increases above 20 mm, the curves are concave.For the curve of SPEI ≤ −2, the inflection point occurs when precipitation is approximately 10 mm.Thus, 10 mm can be regarded as the monthly precipitation threshold for triggering extreme drought phenomena, while 20 mm can be considered the threshold value for lesser droughts.Within the threshold value, even 1 mm of precipitation can effectively reduce the incidence of drought.Quantitatively, when monthly precipitation in Beijing is less than 20 mm, the probability of moderate drought occurrence is 8%, the probability of severe drought occurrence is 15%, and the probability of extreme drought occurrence is 5%.
Figures 7(b) and 7(c) are two types of cumulative probability distributions for the SPEI.The probability of each type of drought (or level of wetness) classified in Table 1 can be calculated based on these cumulative probability distributions, as shown in the diagrams in Figure 8.The axis indicates the SPEI class in Table 1 divided by the SPEI value (going from dry to wet conditions from Class 7 to Class 1).For instance, Class 7 represents extreme drought, Class 4 represents normal conditions, and Class 2 represents very wet conditions.The -axis indicates the probability of meteorological drought.Figure 8(a) was calculated from Figure 7(b) for precipitation exceeding a threshold ( ≥ ).The upper diagram in Figure 8(a) demonstrates the probabilities of all seven classes.The probability of a normal situation is the largest at approximately 70% for all five conditions, and this probability is far higher than those of the other six classes.The probabilities of wet conditions are higher than those of drought conditions.To obtain a clear view of the probability of drought, we magnified the diagram to obtain the bottom plot in Figure 8(a).The probabilities of extreme drought, severe drought, and moderate drought are all below 10% for the five conditions.Figure 8(b) was calculated from Figure 7(c) for precipitation equal to or less than a threshold ( ≤ ).The probability of each scenario is listed in Table 4.The probability of normal conditions (Class 4) is the highest, but different situations vary.The probabilities of drought conditions are higher than those of wet conditions.When R ≤ 10 mm, the probabilities of moderate drought, severe drought, and extreme drought are 22.1%, 18%, and 13.6%, respectively.When  increases above 20 mm, the probability of each class does not vary considerably.
This copula-based conditional probability between  and SPEI not only reveals the occurrence of drought in Beijing but also provides a probability distribution of drought under different precipitation conditions.This quantitative distribution is based on a comprehensive analysis of historical data, and it can be used for future drought prediction.For instance, if we know how much precipitation will occur in the next month, then we can refer to this probability distribution and obtain the risk associated with each type of drought.Copula functions are effective tools for establishing relationships between related variables based on a nonlinear approach.However, in this study, we used data from only one station and analysed only the relationship between precipitation and meteorological drought.In future studies, these analysis methods can be expanded for similar fields, and other meteorological variables that affect drought conditions (temperature, wind speed, sun exposure, etc.) can be considered.

Conclusions
In this paper, copulas were applied to analyse the probabilistic relationship between monthly precipitation and SPEI (representing the situations of meteorological drought) based on three conditions ( = ,  ≥ , and  ≤ ) in Beijing, China.The HHT and MK trend tests with a running approach were also applied to analyse the temporal variations in the SPEI.
Results from HHT method indicated that the SPEI varied in four periods: 18 years, 10 years, 5 years, and 3 years.The results from MK test with a running approach showed a significant decreasing trend for the entire SPEI series and insignificant increasing trends from 1951 to 1965 and 1998 to 2013, respectively.
When establishing the copula functions, Clayton copula was selected as the best-fitting copula.Based on the conditional probability method, the probability of drought was calculated based on three conditions ( = ,  ≥ , and  ≤ ).For all three conditions, the probability of meteorological drought decreased as monthly precipitation increased, and 10 mm can be regarded as the threshold for triggering extreme drought, while 20 mm can be regarded as the threshold for lesser droughts.From a quantitative perspective, when precipitation is equal to 20 mm, the probability of moderate drought, severe drought, and extreme drought is 8%, 15%, and 5%, respectively.For  ≥ , the probability of normal conditions is approximately 70%, which is the largest among all 7 classes.The probabilities of wet conditions are larger than those of drought conditions.Notably, the probabilities of extreme drought, severe drought, and moderate drought are all below 10%.When  ≤ , the probabilities of drought conditions are higher than those of wet conditions.The probabilities of moderate drought, severe drought, and extreme drought are 22.1%, 18%, and 13.6%, respectively, when  ≤ 10 mm.

Figure 1 :
Figure 1: A schematic diagram of the MK trend test with a running approach.

Figure 5 :
Figure 5: Running trends of annual mean regional series in the 12month SPEI.

Table 1 :
Classification of SPEI values.

Table 2 :
Results of the MK trend test with a running approach.

Table 3 :
Parameter estimation and criteria results of copula fitting.