Estimation of Risk Thresholds for a Landslide in the Three Gorges Reservoir Based on a KDE-Copula-VaR Approach

The Three Gorges Reservoir area, one of the most landslide-prone areas in China, is characterized by widely distributed deep-seated landslides exhibiting creep deformation due to rainfall and reservoir fluctuation. Thresholds, which are a key component for a reliable landslide early warning system, are still lacking for the prediction of movements of deep-seated reservoir landslides with creep deformation because information about reservoir fluctuation indicators is lacking, uncertainty is ignored, and binary output is provided. The risk threshold, defined as the tolerance criteria for risks that will lead to action, is an effective measure of the degree of uncertainty. In the present study, a hybrid approach utilizing kernel density estimation, a copula function, and the value at risk is proposed for the estimation of the risk threshold for the Baishuihe Landslide, a typical deep-seated landslide in the Three Gorges Reservoir area. Historical observations over approximately 15 years including rainfall, reservoir fluctuation, and landslide velocity were used to extract the risk threshold. A three-level risk threshold describing the minimum magnitudes of rainfall and reservoir fluctuation for changing the landslide movement state under three confidence levels was developed. A three-level risk response procedure, including risk responses in yellow alert, orange alert, and red alert, is proposed for risk management. Given its successful application, the present approach can be used to estimate the risk threshold for deep-seated landslides.


Introduction
The movement and failure of landslides can cause significant societal and economic losses. According to statistics, landslides caused 10,338 deaths during the period 2008-2017 [1]. Rainfall is considered to be the primary causal factor of landslide movement and failure [2,3]. A threshold is defined as the minimum or maximum level of a quantity needed for a process to take place or a state to change [4], and this concept is a key component of landslide early warning systems. For rainfall-induced landslides, the rainfall threshold defines the lower bound of known hydrological conditions (e.g., rainfall intensity, rainfall duration, or soil moisture) that have resulted in landslides [5-7], and this concept has been widely used to forecast the possible occurrence of a landslide in a given study area. In the 1980s, a global rainfall threshold for shallow landslides and debris flows was proposed based on 73 cases [8]. Following this pioneering work, a series of rainfall thresholds ranging from global to regional to local scales [6,7,9-21] have been proposed based on landslide inventories and precipitation. The available rainfall thresholds can also be classified according to both the estimation approaches that are used (i.e., physically based approaches or empirical approaches) and the rainfall variables that express the threshold (e.g., rainfall duration, cumulative event rainfall, rainfall intensity, or antecedent rainfall) [22]. These rainfall thresholds have been widely applied because they can indicate the timing of shallow landslide occurrence.
The Three Gorges Reservoir area, one of the most landslide-prone areas in China, is characterized by deep-seated landslides. The deep-seated landslides there exhibit creep deformation, which can be described as a series of steps consisting of rapid movements at certain times and suspended activity during other periods due to rainfall and reservoir fluctuation [23,24]. Changes of the landslide movement state from suspended activity to rapid movement pose a significant threat to lives and property. For example, the movement of the Qianjiangping Landslide, which was triggered by the initial filling of the Three Gorges Reservoir, caused the deaths of 24 people and damage to 129 houses [25]. However, only a few efforts have attempted to estimate the thresholds in the Three Gorges Reservoir area [26-28]. The available thresholds are classifiers that provide a binary output (landslide or no landslide), which is more applicable to sudden and rapid rainfall-induced failures. Moreover, hydrological indicators (such as reservoir fluctuation) that have been acknowledged to be dominant causal factors for the movement of deep-seated landslides [29,30] are not usually included in the available thresholds. Furthermore, little attention has been paid to the uncertainty of the models. In fact, a certain degree of uncertainty exists in the identification of rainfall thresholds [31], given that only the major variable, rainfall, is included, and less important variables are excluded. Poor-quality observations or insufficient data sets may also increase the involved uncertainty.
Therefore, it is of great value to define thresholds to describe the minimum magnitudes of rainfall and reservoir fluctuation for changing the landslide movement state and to thus implement an effective tool for landslide risk management in the Three Gorges Reservoir area. The risk threshold, defined as the tolerance criteria for risks that will lead to action, is an effective measure of the degree of uncertainty [32,33]. It has been widely applied as a popular risk management measure in the fields of finance [34], hydrology [35,36], and energy storage operation [37] due to the advantage of conceptual simplicity.
The purpose of the present study is to estimate the risk threshold for deep-seated reservoir landslides based on kernel density estimation (KDE), a copula function, and the value at risk (VaR). The Baishuihe Landslide, a typical deep-seated landslide in the Three Gorges Reservoir, is selected as a case study. Based on approximately fifteen years of observations, a three-level threshold under three confidence levels is developed for risk management.

Materials and Methods
2.1. Features of the Baishuihe Landslide. The Baishuihe Landslide (latitude and longitude: N31°01′34″ and E110°32′09″, Figure 1(a)) is located on the south bank of the Yangtze River approximately 56 km upstream from the Three Gorges Dam. The main sliding direction is 20°. The toe of the landslide, at an elevation of approximately 130 m, is submerged by the Yangtze River, while the crown elevation varies from 350 to 400 m (Figures 1(b) and 1(c)). The landslide is approximately 600 m long and 700 m wide. The average thickness is 30 m, and the landslide volume is estimated to be 12.6 million cubic meters. The mean inclination of the landslide surface is 30°.
Based on the deformation characteristics, the Baishuihe Landslide can be divided into two blocks [38]: the warning zone and the relatively stable block (Figures 1(b) and 1(c)). The warning zone is located in the front part, with elevations ranging from 130 to 270 m. The warning zone is approximately 450 m long and 500 m wide. The planar area of the warning zone is 0.225 million square meters, and the volume is 6.75 million cubic meters (Figures 1(b) and 1(c)). Borehole analysis shows that the landslide materials are Quaternary deposits consisting of gravel clasts and silty clay. The underlying bedrock is sandstone and mudstone of the Triassic Shazhenxi Formation, with an average dip direction of 15° and a dip angle of 36°.
The Baishuihe Landslide is an ancient landslide. Since the 1990s, continuous deformation has been recorded and has caused damage to roads and retaining walls (Figure 2(a)).    Figure 3. The available data reveal that the Baishuihe Landslide exhibits step-like deformation and that rapid movements occur in the rainy season (May to September) and then essentially cease during the dry season (October to April). Figure 4 displays histograms and empirical cumulative distribution functions (CDFs) for rainfall, reservoir fluctuation, and landslide velocity. Rainfall, reservoir fluctuation, and landslide velocity clearly follow different distribution patterns. Therefore, a universal framework capable of handling different distribution patterns and being further extended to more complex variables is required.

Methodology
2.3.1. Risk Threshold and Value at Risk (VaR). It is generally accepted that risk arises from uncertainty [40]. According to the definition from the Project Management Institute, a risk is an uncertain event or condition that, if it occurs, has a positive or negative effect on a project objective such as time, cost, span, or quality, which implies an uncertainty in the identified events and conditions [32]. A risk threshold refers to a measure along the level of uncertainty at which a stakeholder may have a specific interest. Below that risk threshold, the organization will accept the risk.

Geofluids
Above that risk threshold, the organization will not tolerate the risk [32,33]. In the present study, VaR, an effective risk management tool in the field of finance, is utilized to estimate the risk threshold.
In finance, VaR is defined as a threshold value that measures the potential financial loss within a given time horizon and at a given confidence level. It has been widely adopted by practitioners and regulators as the standard measure of financial loss. Define α as the confidence level ranging from 0 to 1 and L as the loss measured as a positive value. VaR_α is the threshold at which the probability of a loss exceeding the threshold is equal to 1 − α (schematically illustrated in Figure 5) and can be expressed as follows:
$$\mathrm{VaR}_{\alpha} = \inf\{\, l \in \mathbb{R} : P(L > l) \le 1 - \alpha \,\}.$$
Take, for instance, a 99% confidence level (α = 0.99). VaR_{0.99} is the loss such that the probability of experiencing a greater loss is at most 1%.
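As a minimal illustration of the VaR definition above, the following Python sketch reads VaR_α off a sample of losses as an empirical quantile. The function name `value_at_risk` and the loss values are hypothetical and are not taken from the study; the sketch only shows the definition at work, not the estimation procedure used for the landslide.

```python
import math


def value_at_risk(losses, alpha):
    """Smallest observed loss l whose empirical exceedance P(L > l) is <= 1 - alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    # Smallest k such that k / n >= alpha; the small tolerance guards
    # against floating-point noise in alpha * n.
    k = max(1, math.ceil(alpha * n - 1e-9))
    return ordered[k - 1]


# Hypothetical losses 1, 2, ..., 100 (illustrative only).
losses = list(range(1, 101))
var_99 = value_at_risk(losses, 0.99)
# Share of losses strictly greater than the VaR estimate.
exceed = sum(1 for x in losses if x > var_99) / len(losses)
```

With these illustrative losses, only one observation in one hundred exceeds the returned value, which is what the 99% confidence level requires.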
Various methods, including the variance-covariance method, historical simulations, and Monte Carlo simulations, are available for the estimation of the VaR [41]. However, these methods are based on the assumption of probability distributions, such as normal distributions, lognormal distributions, or any of a number of other distributions [42,43]. Moreover, a large number of samples is required in a Monte Carlo simulation. Additionally, when addressing practical applications with multiple factors, VaR estimation can become very difficult due to the complexity of joint distribution estimation. In the present study, the copula, an efficient mathematical tool capable of constructing multivariate distributions from their univariate marginal functions without any assumptions of normality and linear correlation, is utilized for VaR estimation.

2.3.2. Copula. The copula theory, first developed by Sklar in the 1960s, states that any multivariate distribution function can be written in terms of univariate marginal distributions and a copula function [44]. The flexibility of specifying the marginal distributions separately from the copula function that links these distributions to form the joint distribution has resulted in its wide application by researchers for multivariate analysis [45]. Taking the bivariate copula as an example, let F(u, v) be a two-dimensional joint distribution function of random variables u and v with univariate margins F_u(u) and F_v(v). Then, there exists a copula C such that
$$F(u, v) = C\big(F_u(u), F_v(v)\big).$$
Various copulas exist, including Gaussian, Clayton, and Frank. A more detailed description of the copula function is listed in Table 2.
The parameter θ of the copula can be estimated from the original data through the maximum likelihood method based on the log-likelihood function as follows:
$$\ln L(\theta) = \sum_{i=1}^{n} \ln c\big(F_u(u_i), F_v(v_i); \theta\big),$$
where c is the density of the copula C. In the present study, the Euclidean distance between the empirical copula and the candidate copula is chosen to determine the optimal copula as follows:
$$d = \sqrt{\sum_{i=1}^{n} \Big|\hat{C}_n\big(F_u(u_i), F_v(v_i)\big) - C\big(F_u(u_i), F_v(v_i)\big)\Big|^{2}},$$
where n is the data size and $\hat{C}_n$ is the empirical copula. The empirical copula [46,47] is defined as follows:
$$\hat{C}_n\!\left(\frac{k}{n}, \frac{l}{n}\right) = \frac{1}{n}\sum_{i=1}^{n} I\big[u_i \le u_{(k)},\; v_i \le v_{(l)}\big],$$
where I[·] denotes the indicator function and u_(k) and v_(l), 1 ≤ k, l ≤ n, represent order statistics from the sample.
Figure 5: Schematic illustration of VaR_α. The blue area to the left of the line represents α of the total area under the curve; the red area to the right of the line represents (1 − α) of the total area under the curve.
The construction of bivariate or multivariate copulas usually starts by specifying marginal univariate distributions. In this study, nonparametric KDE is applied for marginal univariate distribution estimation.
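The copula selection criterion described in this subsection can be sketched in a few lines of Python. The sketch below builds rank-based pseudo-observations, evaluates the empirical copula at each sample point, and measures the Euclidean distance to a candidate copula. The data, the fixed Frank parameter θ = 10, and all function names are hypothetical; in particular, the parameter is fixed by hand here rather than estimated by maximum likelihood, and ties in the data are assumed absent.

```python
import math


def pseudo_observations(values):
    """Map a sample to ranks scaled into (0, 1]; ties are assumed absent."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    n = len(values)
    return [r / n for r in ranks]


def empirical_copula(u, v):
    """Empirical copula evaluated at every sample point (u_i, v_i)."""
    n = len(u)
    return [
        sum(1 for j in range(n) if u[j] <= u[i] and v[j] <= v[i]) / n
        for i in range(n)
    ]


def frank_cdf(u, v, theta):
    """Frank copula C(u, v; theta) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def euclidean_distance(u, v, copula):
    """Euclidean distance between the empirical copula and a candidate copula."""
    emp = empirical_copula(u, v)
    return math.sqrt(sum((e - copula(a, b)) ** 2 for e, a, b in zip(emp, u, v)))


# Hypothetical, positively dependent sample (illustrative only).
x = [0.2, 1.5, 3.1, 4.0, 5.6, 6.3, 7.7, 8.2, 9.9, 11.0]
y = [0.5, 0.1, 2.0, 3.5, 3.0, 5.2, 6.1, 8.0, 7.4, 9.3]
u, v = pseudo_observations(x), pseudo_observations(y)

d_frank = euclidean_distance(u, v, lambda a, b: frank_cdf(a, b, 10.0))
d_indep = euclidean_distance(u, v, lambda a, b: a * b)  # independence copula
```

For this strongly dependent toy sample, the Frank candidate lies closer to the empirical copula than the independence copula does, so it would be preferred under the minimum-distance rule.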

Kernel Density Estimation.
As one of the most well-known nonparametric density estimators, KDE is capable of learning the shape of the density of a data set automatically without any distribution hypotheses. Let x_1, x_2, ⋯, x_n be a sample from a distribution P with density p. Nonparametric density estimation refers to the estimation of p with as few assumptions as possible. Various approaches, including the k-nearest neighbor method, Parzen windows, and KDE, are available for nonparametric density estimation. However, the performances of the k-nearest neighbors and Parzen windows methods are generally poor [48]. The output of the Parzen windows method is usually characterized by discontinuous curves with step-like features. Fortunately, these drawbacks can easily be eliminated by adopting KDE.
The estimator $\hat{p}$ of p based on x_1, x_2, ⋯, x_n is given by
$$\hat{p}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),$$
where K(·) is the kernel and h is the bandwidth. Technically, KDE smooths out each data sample into a bump, the shape of which is determined by the kernel. Then, KDE estimates the shape of the data set by summing over all these smoothed bumps.
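The kernel estimator described above can be illustrated with a short Python sketch using the Epanechnikov kernel, K(t) = 0.75(1 − t²) for |t| ≤ 1, whose integral gives the smoothed CDF in closed form. The sample values and function names below are hypothetical; only the kernel choice and the bandwidth of 0.5 mirror values mentioned in this paper.

```python
def epanechnikov_pdf(x, sample, h):
    """Kernel density estimate at x with the Epanechnikov kernel and bandwidth h."""
    total = 0.0
    for xi in sample:
        t = (x - xi) / h
        if abs(t) <= 1.0:
            total += 0.75 * (1.0 - t * t)  # one "bump" per observation
    return total / (len(sample) * h)


def epanechnikov_cdf(x, sample, h):
    """Smoothed CDF obtained by integrating each kernel bump up to x."""
    total = 0.0
    for xi in sample:
        t = (x - xi) / h
        if t >= 1.0:
            total += 1.0  # the whole bump lies to the left of x
        elif t > -1.0:
            # Integral of 0.75 * (1 - s^2) from -1 to t.
            total += 0.5 + 0.75 * t - 0.25 * t ** 3
    return total / len(sample)


# Hypothetical monthly velocities in mm (illustrative only).
sample = [12.0, 15.5, 15.9, 16.2, 20.0, 42.0]
h = 0.5
density_at_16 = epanechnikov_pdf(16.0, sample, h)
cdf_at_16 = epanechnikov_cdf(16.0, sample, h)
```

Because each bump integrates to one, the smoothed CDF rises from 0 to 1 without the step-like jumps of the empirical CDF.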

Risk Threshold Estimation Based on KDE-Copula-VaR.
The overall framework of estimating the risk threshold based on a KDE-Copula-VaR approach mainly consists of three steps: (1) marginal distribution estimation by KDE, (2) joint distribution construction by the copula, and (3) risk threshold extraction by VaR. First, the marginal distribution functions of rainfall intensity (x_1), reservoir fluctuation (x_2), and landslide velocity (y) were estimated by Epanechnikov KDE. The bandwidths of the Epanechnikov KDE for the variables related to rainfall, reservoir fluctuation, and velocity were set to 0.05, 0.05, and 0.5, respectively. Figure 6 shows the empirical CDFs and the estimated CDFs based on KDE for rainfall intensity (x_1), reservoir fluctuation (x_2), and landslide velocity (y). The CDF estimations based on KDE match the empirical CDFs well. This close agreement reflects the nonparametric nature of KDE.
Second, joint probability distribution functions were constructed with a copula. The popular copula functions listed in Table 2, including the Gaussian, t, Gumbel, Clayton, and Frank functions, were set as candidate copula functions. The Euclidean distance between the empirical copula and each candidate copula was calculated to determine the optimal copula function. The Euclidean distances between the empirical copula and the candidate copulas of (x_1, y) and (x_2, y) are listed in Table 3, where the best-fit copula is in italic font. The Euclidean distances indicate that the Frank copula was the optimal copula function, with the minimum Euclidean distance. Therefore, the Frank copula was selected to form a multivariate distribution function. Third, a three-level risk threshold at three popular confidence levels (α = 0.95, 0.90, and 0.85) [49] was extracted from the α quantiles of the multivariate joint probability distribution functions.
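The text does not spell out how the α quantiles are read off the fitted copula. One common construction, shown here purely as an assumption and not as the authors' procedure, is the conditional quantile of a bivariate Frank copula: given the probability level u of one variable, find the level v of the other such that P(V ≤ v | U = u) = α. The parameter θ = 3 and the function names are hypothetical.

```python
import math


def frank_conditional_cdf(v, u, theta):
    """P(V <= v | U = u) for a Frank copula, i.e. dC(u, v)/du, theta != 0."""
    num = math.exp(-theta * u) * math.expm1(-theta * v)
    den = math.expm1(-theta) + math.expm1(-theta * u) * math.expm1(-theta * v)
    return num / den


def frank_conditional_quantile(p, u, theta):
    """Closed-form inverse of frank_conditional_cdf with respect to v."""
    ratio = p * math.expm1(-theta) / (p + (1.0 - p) * math.exp(-theta * u))
    return -math.log1p(ratio) / theta


# Hypothetical parameter and conditioning level (illustrative only).
theta = 3.0
u_level = 0.8
levels = {alpha: frank_conditional_quantile(alpha, u_level, theta)
          for alpha in (0.85, 0.90, 0.95)}
```

Each returned value is a probability level in (0, 1); mapping it back to physical units would require the inverse of the KDE-based marginal CDF of the corresponding variable.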

Results
The joint Frank copula probability density functions (PDFs) and CDFs of (x_1, x_2), (x_1, y), and (x_2, y) are shown in Figures 7 and 8. Moreover, the correlations between rainfall intensity and landslide velocity and between reservoir fluctuation and landslide velocity are measured by two indexes, Kendall's tau and Spearman's rho, and are listed in Table 4. Both indexes range from -1 to +1: zero indicates no correlation, and +1 and -1 indicate complete positive and negative correlations, respectively. The copula analysis shows a moderate positive correlation (Kendall's tau = 0.3812, Spearman's rho = 0.5675) between rainfall and landslide velocity. The constructed joint PDF between the rainfall and landslide velocity (Figure 7) is characterized by symmetric tails, indicating a moderate correlation in both the lower tail (0,0) and the upper tail (1,1). That is, if the landslide area experienced a heavier rainfall event, the landslide mass underwent more deformation. The effect of rainfall infiltration on the movement of the Baishuihe Landslide can be described as follows: under prolonged rainfall, most of the rain infiltrates into the landslide mass, which has permeability coefficients of up to 9 × 10^−3 cm/s according to in situ tests. The infiltrated water may increase the weight of the landslide mass and the pore-water pressures and reduce the shear strength along the sliding zone, thus triggering the landslide movements. Meanwhile, infiltrated rainfall may reduce the mechanical strength of the sliding zone, accelerating movements. A moderate negative correlation (Kendall's tau = −0.3689, Spearman's rho = −0.5399) is found between the reservoir fluctuation and landslide velocity, suggesting that a higher landslide velocity was more likely to occur during a sharp drawdown of the reservoir.
The movements of the Baishuihe Landslide during the sharp drawdown of the reservoir were caused by the reduction of the support provided by the reservoir at the toe and by high seepage pressures toward the toe. The obtained results also indicate that the correlation of reservoir fluctuation and landslide velocity is stronger than that of rainfall and landslide velocity. This finding agrees well with the conclusion derived from the monitoring. The fast landslide movements mainly occur between May and June, when the water in the reservoir rapidly decreases to its lowest level. For example, during June 2015, the reservoir level rapidly decreased to 145 m at a rate ranging from 0.6 to 0.99 m per day, and fast movement occurred with a displacement increment of up to 497 mm.
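For readers unfamiliar with the two rank correlation indexes used above, the following Python sketch computes Kendall's tau and Spearman's rho directly from their definitions. The data are hypothetical and tie-free, so the simple tie-free formulas apply; the values reported in Table 4 are not reproduced here.

```python
def ranks(values):
    """Ranks 1..n of a sample; ties are assumed absent."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def spearman_rho(x, y):
    """1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d the rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical rainfall-like and velocity-like series (illustrative only).
rain = [0.2, 1.5, 3.1, 4.0, 5.6, 6.3, 7.7, 8.2, 9.9, 11.0]
velocity = [0.5, 0.1, 2.0, 3.5, 3.0, 5.2, 6.1, 8.0, 7.4, 9.3]
# Drawdown expressed as a negative fluctuation, so the sign of the correlation flips.
fluctuation = [-r for r in rain]

tau_rain = kendall_tau(rain, velocity)
rho_rain = spearman_rho(rain, velocity)
tau_fluct = kendall_tau(fluctuation, velocity)
```

Expressing drawdown as a negative fluctuation is what produces a negative coefficient when faster drawdown accompanies faster movement.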
A three-level threshold relating the rainfall intensity, reservoir fluctuation, and landslide velocity under three popular levels of confidence is shown in Table 5. To illustrate the essence of the obtained threshold models, three models are described in the following paragraphs.
The threshold at the 0.95 confidence level states that under prolonged rainfall (exceeding 314 mm per month) and rapid drawdown with a magnitude of 11 m per month, the landslide was almost certain to change into the rapid movement state, with a probability of up to 95%. When those thresholds are met, the copula-based velocity of the landslide movement will reach 320 mm per month. The relationship between rainfall intensity and rapid movement (Figure 9(a)) shows that rapid landslide movements are positively correlated with the rainfall intensity. The heavier the rainfall is, the more likely it is that significant landslide movements will occur. A negative correlation exists between the reservoir fluctuation and landslide velocity (Figure 9(b)). The sharper the reservoir drawdown is, the more likely it is for significant landslide movements to occur. Almost all rapid movement events fall below the obtained thresholds. These results correspond with the values obtained from the KDE-Copula-VaR approach.
The threshold at the 0.90 confidence level shows that under heavy rainfall (exceeding 206 mm per month) and water level drawdown at a moderate speed, namely, 8 m per month or 0.27 m per day, the landslide would be extremely likely to change into the rapid movement state, with a probability of up to 90%. When those thresholds are met, the copula-based velocity of the landslide movement will reach 51 mm per month. The reservoir fluctuation threshold is consistent with the work of He et al. [28], in which the reservoir drawdown criterion for landslide instability is 0.2 m per day.
The threshold at the 0.85 confidence level shows that under heavy rainfall (exceeding 174 mm per month) and water level drawdown at a speed of 6 m per month, the landslide would be highly likely to change into the rapid movement state, with a probability of up to 85%. When those thresholds are met, the copula-based velocity of the landslide movement will reach 42 mm per month.
The obtained thresholds also indicate that the copula-based velocity increases rapidly, at an approximately exponential rate, with the confidence level (42, 51, and 320 mm per month at the 0.85, 0.90, and 0.95 confidence levels, respectively). This is because with higher confidence levels, people will be more risk tolerant and therefore willing to take more risk to achieve a higher expected return.
The obtained thresholds can be valuable for risk mitigation and the implementation of an effective tool for landslide early warning systems. Decisions on risk response can be made based on the comparison between measured or forecasted values and the obtained thresholds (Figure 10). For example, the rainfall thresholds can be compared with the rainfall intensity recorded from the beginning of the rainfall event or with the forecasted rainfall intensity. When the rainfall is close to or exceeds a preestablished rainfall threshold, the appropriate risk response can be raised. For the purpose of landslide early warning, a three-level risk response procedure is proposed for risk mitigation (Figure 10 and Table 5). The three-level risk response procedure mainly consists of the following steps: risk identification and risk quantification, comparison with the risk threshold, and response to the risk (Figure 10). A red alert is raised, indicating a high potential risk of landslide failure, when the risk value exceeds the risk threshold at the 0.95 confidence level. Mitigation measures including 24 h continuous and comprehensive monitoring, general inspection, deployment of warning signs, closure of roads and river channels, and issuance of evacuation orders should be employed.
An orange alert is raised when the risk value exceeds the risk threshold at the 0.90 confidence level but is not higher than the threshold at the 0.95 confidence level. The following mitigation measures should be implemented: 24 h continuous and comprehensive monitoring and general inspection; development of emergency mitigation, preparedness, response, and evacuation plans; and a consultation meeting between experts and government decision makers.
A yellow alert is raised when the risk value exceeds the risk threshold at the 0.85 confidence level but is not higher than the threshold at the 0.90 confidence level. Implementation of the following measures is recommended to minimize the landslide risk: more frequent and comprehensive monitoring and issuing the information to expert and decision-maker groups.
Otherwise, the risk events are withheld or ignored.
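The three-level response procedure above can be summarized as a small lookup, sketched here in Python with the monthly rainfall and drawdown thresholds quoted in the text. How the two indicators are combined into a single "risk value" is not specified, so the `require_both` switch is an assumption of this sketch, as are the function and variable names.

```python
# (alert level, minimum monthly rainfall in mm, minimum monthly drawdown in m),
# ordered from the 0.95 to the 0.85 confidence level.
THRESHOLDS = [
    ("red", 314.0, 11.0),
    ("orange", 206.0, 8.0),
    ("yellow", 174.0, 6.0),
]


def alert_level(rainfall_mm, drawdown_m, require_both=True):
    """Return the highest alert whose thresholds are met, or "none".

    require_both=True (an assumption) demands that rainfall AND drawdown
    reach the threshold; require_both=False raises the alert if either does.
    """
    for level, rain_min, drawdown_min in THRESHOLDS:
        rain_hit = rainfall_mm >= rain_min
        drawdown_hit = drawdown_m >= drawdown_min
        hit = (rain_hit and drawdown_hit) if require_both else (rain_hit or drawdown_hit)
        if hit:
            return level
    return "none"


# Hypothetical monthly observations (illustrative only).
example = alert_level(rainfall_mm=210.0, drawdown_m=9.0)
```

Under the stricter reading, a month with very heavy rainfall but only moderate drawdown is assigned the level at which both indicators are satisfied, which may be lower than the level suggested by rainfall alone.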

Discussions
In order to estimate early warning criteria of rainfall and reservoir fluctuation and to quantify the uncertainty associated with the estimation, a three-level risk threshold under three confidence levels is extracted using a KDE-Copula-VaR approach. In fact, the hybrid approach utilizing the KDE, a copula function, and the VaR is a data-driven approach. The major advantages of the proposed approach are as follows. The modeling procedure is quite straightforward, and little detailed information about the physical mechanisms of the landslide is required, which saves both time and cost in threshold estimation. The process of data acquisition and processing is relatively convenient, which means that the threshold model can be easily established. The hybrid data-driven approach provides satisfactory performance once the threshold model is well trained.
However, limitations relating to data size and representativeness still exist. Firstly, a sufficiently large number of observations is needed to train models with satisfactory performance. Generally, the trained models only extrapolate the past pattern of landslide movement. Since the future is never exactly like the past, a mere extrapolation of past patterns cannot provide accurate predictions, however sophisticated the model. To ensure better model performance, one should consider a sample period that is sufficiently long, so that the landslide movement patterns can be modelled more appropriately. Another limitation concerns representativeness. Natural systems, including landslides, may change rapidly due to changing material supply and topography, restoration of vegetation, and human actions. The conditions that triggered landslide movements in the past may not be representative of the future. Therefore, the trained threshold models should be retrained and updated frequently when changes occur or more data become available.
In spite of the limitations described above, we still believe that the proposed hybrid approach can be useful for end users. From the perspective of practice, our research provides an analytical procedure for end users, enabling them to minimize landslide risk, especially when the associated safety factors are already marginal. However, end users still face essential challenges. Two of these challenges, related to uncertainty and confidence level, are discussed as follows.
The first challenge is the attitude of end users towards uncertainty. For some, uncertainty is perceived as a curse, with risk arising from uncertainty. For others, uncertainty is perceived as a blessing that offers possible benefits if addressed properly. The ways in which end users interpret uncertainty also depend on their attitude. When users face uncertainty, they may choose to either downplay uncertainty for the sake of efficiency or prepare for the worst-case scenario because of high-risk aversion [50]. In fact, as Kahneman [50] has noted, an unbiased appreciation of uncertainty is a basis for rationality. However, this viewpoint is at odds with the will of the people and organizations. In general, expressions of uncertainty by researchers are typically perceived as an indication of weakness and vulnerability. Researchers who display more confidence will easily gain the trust of end users and the public and will take the place of researchers who acknowledge the full extent of their ignorance.
The second challenge is how to evaluate the degree of confidence level. This factor is particularly important for those situations in which estimations are established to aid decision-making. In fact, the level of confidence will have an impact on the application of the obtained results.
To address the abovementioned challenges in practical applications, we should begin by changing the discourse about uncertainty, moving from a lack of knowledge to a more realistic and nuanced view. To this end, uncertainty could be treated as confidence and additional knowledge instead of as a lack of knowledge. Moreover, making uncertainty explicit and transparent should also be valued as a virtue similar to honesty, humility, and trust.
Based on experience and the expert-judgment framework within the Intergovernmental Panel on Climate Change (IPCC) [51], the following process (Figure 11) is provided to address the second challenge, evaluating the degree of confidence. Evidence and agreement are two metrics for evaluating the degree of confidence. The first step in the process is the identification of existing evidence (see step 1 in Figure 11). Types of evidence include experimental results, mechanistic understanding, observations, and models. The next step in the process is to evaluate evidence and agreement (see step 2 in Figure 11), particularly the type, amount, quality, and consistency of evidence and the degree of agreement. The degree of agreement is a measure of the consensus across the scientific community on a given topic. Qualitative language is used to characterize the amount of evidence and the degree of agreement among author teams. Evidence ranges from limited to robust, and the consistency of evidence ranges from low to high. In general, evidence is most robust when there are multiple independent and consistent sources of high-quality evidence. The next step in the process is to evaluate the level of confidence by integrating the evaluation of evidence and agreement into one metric (see step 3 in Figure 11). Increasing confidence is correlated with an increasing level of evidence and degree of agreement. The obtained confidence is expressed qualitatively using five qualifiers: very low (less than 1 in 10 chance of being correct), low (approximately 2 in 10 chance of being correct), medium (approximately 5 in 10 chance of being correct), high (approximately 8 in 10 chance of being correct), and very high (at least 9 in 10 chance of being correct).

Conclusions
Due to rainfall and reservoir fluctuation, the Three Gorges Reservoir area is highly affected by the creep deformation of deep-seated landslides. However, few attempts have been made to estimate early warning criteria that describe the magnitudes of rainfall and reservoir fluctuation for changing the landslide movement state. The risk threshold is a measurement along the level of uncertainty at which a stakeholder may have a specific interest, below which the organization will accept the risk, and above which the organization will not tolerate the risk. In the present study, a hybrid approach utilizing the KDE, a copula function, and the VaR is proposed to estimate the risk threshold describing the minimum magnitudes of rainfall and reservoir fluctuation for changing the landslide movement state for the Baishuihe Landslide, a typical deep-seated landslide in the Three Gorges Reservoir. The following conclusions were obtained from this study:
(1) A three-level risk threshold describing the minimum magnitudes of rainfall and reservoir fluctuation for rapid movement under three confidence levels is extracted. The thresholds at the 0.95 confidence level are a minimum monthly rainfall of 314 mm and a monthly reservoir decrease of 11 m. The thresholds at the 0.90 confidence level are a minimum monthly rainfall of 206 mm and a monthly reservoir decrease of 8 m. The thresholds at the 0.85 confidence level are a minimum monthly rainfall of 174 mm and a monthly reservoir decrease of 6 m.
(2) A three-level risk response procedure, including risk responses in yellow alert, orange alert, and red alert, is proposed for risk management.
(3) Given the successful use of the KDE-Copula-VaR approach for estimating the risk threshold for the Baishuihe Landslide, this methodology is likely to be useful for estimating the thresholds for deep-seated reservoir landslides. More specifically, the three-level thresholds found in this paper relating rainfall and reservoir fluctuation and the three-level risk response procedure could be applicable to other deep-seated landslides in the Three Gorges Reservoir.

Conflicts of Interest
The authors declare no conflicts of interest.

Figure 11: The expert-judgment approach for evaluating the degree of confidence.