Copula-Based Probabilistic Hazard Assessment Model for Debris Flow Considering the Uncertainties of Multiple Influencing Factors

This paper proposes a probabilistic hazard assessment model for debris flows that considers the uncertainties of multiple influencing factors based on copula approaches. Fifty-nine rainfall-induced debris flows that occurred between 2001 and 2009 in Taiwan are taken as an illustrative example to validate the proposed approaches. A copula-based probabilistic model is developed to model the joint probability distribution of debris-flow volume V and its influencing factors (e


Introduction
Debris flows are characterized by large discharge, high velocity, and enormous destructive power, and they can cause devastating damage to buildings, infrastructure, and local residents downstream [1,2]. To reduce this damage, accurate hazard assessment of debris flows is necessary for risk management and the design of remedial measures [3][4][5]. Hazard assessment of debris flows generally includes quantitative estimation of the most important parameters (e.g., the debris-flow volume, mean flow velocity, and runout distance) and determination of the probability that a debris-flow event occurs in a specific debris-flow basin [6,7]. Among these parameters, the debris-flow volume, V, is one of the most important for potential debris-flow hazard assessment because it is a prerequisite for predicting the peak discharge and runout distance of a debris flow [8][9][10]. Meanwhile, the probability level of debris flows can be quantitatively determined by a probabilistic model of debris-flow volume [11,12]. However, due to various uncertainties and variabilities in debris flows, it is difficult to accurately estimate the volume of a debris flow and its corresponding probability of occurrence.
Various attempts have been made to estimate the debris-flow volume for a specific debris-flow basin, including theoretical and empirical approaches. Theoretical methods are physically based models that simulate the dynamic processes of debris flows [13][14][15][16]. Their main limitation is the selection of appropriate rheological parameters [17]: because rheological parameter data are scarce and laboratory tests are subject to size effects, these parameters are associated with great uncertainties. Empirical approaches are simple and convenient and have been widely used to estimate the debris-flow volume. Most empirical equations relate the debris-flow volume to the basin area of a specific debris-flow basin [18][19][20][21]. Some empirical equations consider more factors (e.g., geological and rainfall parameters) to estimate the debris-flow volume more accurately [22][23][24]. For example, Gartner et al. [20] developed a multivariate nonlinear empirical equation for predicting the debris-flow volume from the basin area and the total storm rainfall. Chang et al. [24] proposed an empirical relationship that includes geological, topographic, and rainfall parameters. However, debris flows involve various unavoidable uncertainties and variabilities [25][26][27]. For example, due to sediment production, erosion, and deposition, the debris-flow volume for a specific basin may change over time. Moreover, the topography of a particular debris-flow basin also changes over time, affecting the propagation of debris flows [28]. The influencing factors are themselves associated with various uncertainties, which are usually ignored in estimating the debris-flow volume. Thus, it is necessary to take these uncertainties into account in the hazard assessment of debris flows.
This paper proposes probabilistic hazard assessment models for debris flows that consider the uncertainties of multiple influencing factors based on copula approaches. Probabilistic analyses are conducted on 59 past rainfall-induced debris-flow events in Taiwan. First, the 59 datasets of past debris flows are divided into 50 sets of data for model construction and 9 sets of data for validation. A multivariate copula model is developed from the 50 sets of observation data to model the joint probability distribution of the debris-flow volume V and its influencing factors (e.g., rainfall intensity, RI, and landslide area, $A_L$). Then, the developed V-RI-$A_L$ copula model is used to make probabilistic predictions of the debris-flow volume for a specific hazard level, and the predictions are compared with those of the empirical approaches. Finally, the proposed probabilistic model is validated with the nine independent sets of data. The proposed probabilistic model is also used to develop exceedance probability charts of quantities (e.g., the debris-flow volume, V, and rainfall intensity, RI) for a given landslide area, $A_L$, for a specific debris flow. This paper characterizes the high uncertainties and variabilities of the debris-flow volume and its influencing factors and provides a preliminary reference for debris-flow risk assessment and control measure design.

Methodology
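2.1. Definition of Multivariate Copula Model. According to Sklar's theorem [29], the basic idea of copula theory is to decompose a joint probability function into the modeling of dependence and the modeling of multiple marginal distributions. Let $X_1, X_2, \ldots, X_n$ ($n > 2$) denote debris-flow parameters. The multivariate joint cumulative distribution function (CDF) of $X_1, X_2, \ldots, X_n$ is denoted as $F_{1,2,\ldots,n}(x_1, x_2, \ldots, x_n)$. Then, $F_{1,2,\ldots,n}(x_1, x_2, \ldots, x_n)$ can be presented as follows [30,31]:

$$F_{1,2,\ldots,n}(x_1, x_2, \ldots, x_n) = C\left(F_1(x_1), F_2(x_2), \ldots, F_n(x_n); \theta\right) = C(u_1, u_2, \ldots, u_n; \theta) \tag{1}$$

where $u_i = F_i(x_i)$ ($i = 1, 2, \ldots, n$) are the marginal CDFs of $X_1, X_2, \ldots, X_n$, and $C(u_1, u_2, \ldots, u_n; \theta)$ is the multivariate copula function with the copula parameter $\theta$. The corresponding joint probability density function (PDF) is

$$f_{1,2,\ldots,n}(x_1, x_2, \ldots, x_n) = c(u_1, u_2, \ldots, u_n; \theta) \prod_{i=1}^{n} f_i(x_i) \tag{2}$$

where $f_1(x_1), f_2(x_2), \ldots, f_n(x_n)$ are the marginal PDFs of $X_1, X_2, \ldots, X_n$, respectively; $c(u_1, u_2, \ldots, u_n; \theta)$ is the multivariate copula density function associated with the copula function $C(u_1, u_2, \ldots, u_n; \theta)$, which is given by:

$$c(u_1, u_2, \ldots, u_n; \theta) = \frac{\partial^n C(u_1, u_2, \ldots, u_n; \theta)}{\partial u_1 \, \partial u_2 \cdots \partial u_n} \tag{3}$$

Multivariate copula functions, such as elliptical copulas (e.g., Gaussian copula) and Archimedean copulas (e.g., Frank and Clayton copulas), have n-dimensional generalizations, which can be used in Equation (3) to construct the joint distribution of multivariate parameters. The copula parameter $\theta$ can be estimated by maximum likelihood estimation (MLE) based on the observation data $\{(x_{1j}, x_{2j}, \ldots, x_{nj}),\ j = 1, 2, \ldots, N\}$ with a sample size of $N$. The log-likelihood function is derived as follows:

$$L(\theta) = \sum_{j=1}^{N} \ln c(u_{1j}, u_{2j}, \ldots, u_{nj}; \theta) \tag{4}$$

where $(u_{1j}, u_{2j}, \ldots, u_{nj})$ are the empirical distributions of the observation data $(x_{1j}, x_{2j}, \ldots, x_{nj})$, which are defined as follows [30,31]:

$$u_{ij} = \frac{\operatorname{rank}(x_{ij})}{N + 1}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, N \tag{5}$$

where $\operatorname{rank}(\cdot)$ is the ranking function. For example, $\operatorname{rank}(x_{1j})$ denotes the rank of $x_{1j}$ among the list $x_1 = (x_{11}, x_{12}, \ldots, x_{1N})$ in ascending order. Then, the copula parameter can be obtained by maximizing the likelihood function $L(\theta)$, which can be expressed as follows:

$$\hat{\theta} = \arg\max_{\theta} L(\theta) \tag{6}$$

The best-fit copula among the set of candidate copulas can be selected by the AIC score, which is given by [28]:

$$\mathrm{AIC} = N \ln\left\{ \frac{1}{N} \sum_{j=1}^{N} \left[ \hat{F}_{1,2,\ldots,n}(x_{1j}, x_{2j}, \ldots, x_{nj}) - C(u_{1j}, u_{2j}, \ldots, u_{nj}; \theta) \right]^2 \right\} + 2k \tag{7}$$

where $\hat{F}_{1,2,\ldots,n}$ is the measured joint probability of the multivariate parameters (i.e., $X_1, X_2, \ldots, X_n$); $C(u_{1j}, u_{2j}, \ldots, u_{nj}; \theta)$ is the candidate copula; $k$ is the number of copula parameters for the candidate copulas.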
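As an illustrative sketch of the rank-based empirical distributions and the AIC-based copula selection described above, the following Python snippet computes pseudo-observations and an AIC score for one candidate copula. It is not the authors' implementation: the rank/(N + 1) convention, the N + 1 denominator of the empirical joint probability, the form AIC = N ln(MSE) + 2k, the toy data, and the use of the independence copula as a stand-in candidate are all assumptions made here for illustration.

```python
from math import log

def pseudo_observations(column):
    """Rank-based empirical CDF values rank/(N + 1) for one variable (ties not handled)."""
    n = len(column)
    order = sorted(range(n), key=lambda j: column[j])
    u = [0.0] * n
    for rank, j in enumerate(order, start=1):
        u[j] = rank / (n + 1)
    return u

def empirical_joint_cdf(rows):
    """Share of observations that are <= each observation in every coordinate.

    The N + 1 denominator is an assumption chosen to match the pseudo-observations.
    """
    n = len(rows)
    return [
        sum(all(a <= b for a, b in zip(other, row)) for other in rows) / (n + 1)
        for row in rows
    ]

def aic_from_mse(measured, modeled, k):
    """AIC = N * ln(MSE) + 2k, with the MSE taken between joint probabilities."""
    n = len(measured)
    mse = sum((m - c) ** 2 for m, c in zip(measured, modeled)) / n
    return n * log(mse) + 2 * k

def independence_copula(u_row):
    """Product copula, used here only as a stand-in candidate model."""
    p = 1.0
    for u in u_row:
        p *= u
    return p

# Toy data (hypothetical values, not taken from Table 1): columns are X1, X2, X3.
rows = [(12.0, 150.0, 40.0), (25.0, 900.0, 310.0), (18.0, 400.0, 95.0),
        (31.0, 2500.0, 800.0), (22.0, 700.0, 120.0)]
u_columns = [pseudo_observations(col) for col in zip(*rows)]
u_rows = list(zip(*u_columns))

measured = empirical_joint_cdf(rows)
modeled = [independence_copula(u) for u in u_rows]
print("AIC (independence copula):", round(aic_from_mse(measured, modeled, k=0), 3))
```

In practice each candidate copula would be scored in the same way with its own fitted parameter, and the candidate with the smallest AIC would be retained.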

Data Sources and Marginal Probability Models
3.1. Data Sources of Rainfall-Induced Debris Flows. This paper collected 59 datasets of rainfall-induced debris-flow events that occurred between 2001 and 2009 in Taiwan from Chang et al. [24]. These 59 debris-flow basins were affected by frequent earthquakes, which often induced a large number of landslides. A large amount of loose material was deposited in the debris-flow basins, making it easy for debris flows to form after intense rainfall. Chang et al. [24] collected physiographical parameters, geological index, and rainfall factors. Because the debris flows in the study area were mainly triggered by intense rainfall, the rainfall intensity, RI, and landslide area, $A_L$, are taken as the main influencing factors. The 59 datasets of past debris flows are divided into 50 sets of data for model construction and 9 sets of data for validation, as shown in Table 1. More details of the geological conditions and debris flows in the study area can be found in Chang et al. [24]. Table 1 summarizes the statistical characteristics of RI, $A_L$, and V for the 50 datasets of debris flows in Taiwan. The mean value of $A_L$ in the study area is 263,785.70 m² with a coefficient of variation (COV) of 2.37. Even under the same rainfall conditions in the study area, the landslide area of different debris-flow basins varies by many orders of magnitude. The mean value of V is 97,905.920 m³ with a COV of 1.37. Compared with RI, which has a COV of 0.28, $A_L$ and V show great variability. Given the high uncertainties in the debris-flow volume and its influencing factors, RI, $A_L$, and V are treated as random variables, where $X_1$ = RI, $X_2$ = $A_L$, and $X_3$ = V.
The Kendall rank correlation coefficient is used to illustrate the cross-correlation among the debris-flow volume and its influencing factors (i.e., RI, $A_L$, and V), as shown in Table 2. The three variables are positively correlated. There is a strong positive correlation between $A_L$ and V, with a Kendall rank correlation coefficient of 0.603. Such a strong correlation may imply that a larger landslide area supplies more loose material that can be transported downstream to form larger debris flows. RI-$A_L$ and RI-V are also positively correlated, with Kendall rank correlation coefficients of 0.180 and 0.181, respectively.
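For readers unfamiliar with the Kendall rank correlation coefficient, a minimal Python sketch is given below. It implements the tau-a variant (no tie correction), which is an assumption, since the text does not state which variant was used; the data are hypothetical and are not taken from Table 1.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall rank correlation (tau-a): (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical landslide areas (m^2) and debris-flow volumes (m^3), for illustration only.
area = [1.2e4, 5.0e4, 8.0e4, 3.0e5, 9.0e5]
volume = [9.0e3, 2.0e4, 1.5e4, 1.1e5, 4.0e5]
print(kendall_tau(area, volume))  # 0.8 (9 concordant pairs, 1 discordant pair)
```

Because the coefficient depends only on the ordering of the data, it is unaffected by the strongly skewed scales of landslide area and debris-flow volume.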
Advances in Civil Engineering

Considering that RI, $A_L$, and V are positively correlated variables, three-dimensional copula density functions are used to construct the joint probability model of RI, $A_L$, and V. Four three-dimensional Archimedean copula functions, namely the Clayton, Frank, Ali-Mikhail-Haq, and Gumbel-Hougaard copulas, are considered as candidate copulas to characterize the dependence structure of RI, $A_L$, and V. These four commonly used copula functions and their copula parameters are presented in Table 3. Because the copula functions in Table 3 are single-parameter copulas, the number of copula parameters k for each of the four candidate copulas equals 1.
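To illustrate how single-parameter Archimedean copulas extend to three dimensions, the following Python sketch evaluates two of the candidate families. The closed forms are the standard exchangeable n-dimensional expressions and are assumed to match Table 3, which is not reproduced here; the parameter values and the inputs are arbitrary choices for illustration.

```python
from math import exp, log

def clayton_cdf(u, theta):
    """n-dimensional Clayton copula CDF, theta > 0."""
    s = sum(ui ** (-theta) for ui in u) - len(u) + 1
    return s ** (-1.0 / theta)

def gumbel_hougaard_cdf(u, theta):
    """n-dimensional Gumbel-Hougaard copula CDF, theta >= 1."""
    s = sum(max(0.0, -log(ui)) ** theta for ui in u)
    return exp(-(s ** (1.0 / theta)))

u = (0.30, 0.60, 0.85)  # example marginal CDF values in (0, 1]
for name, cdf, theta in [("Clayton", clayton_cdf, 2.0),
                         ("Gumbel-Hougaard", gumbel_hougaard_cdf, 1.5)]:
    print(name, round(cdf(u, theta), 4))
```

A quick sanity check for any copula implementation is that setting all but one argument to 1 returns the remaining argument, which both functions satisfy.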

3.2. Best-Fit Marginal Distributions. To fit the marginal distributions of RI, $A_L$, and V, four marginal distributions commonly used in the geotechnical literature [26][27][28], namely the truncated Normal (TruncNormal), Lognormal, Weibull, and truncated Gumbel (TruncGumbel) distributions, are considered in this paper. Table 4 summarizes the CDFs and PDFs of these four distributions.
Based on the 50 sets of debris-flow data, the optimal marginal distributions of RI, $A_L$, and V are each identified by AIC. Table 5 presents the AIC scores for the four candidate marginal distributions. For RI, the Weibull distribution has the minimum AIC value and is identified as the most suitable marginal distribution. For $A_L$, the Lognormal distribution has the minimum AIC value and is identified as the most suitable marginal distribution in the study area. Similarly, the most suitable marginal distribution for V in the study area is the Lognormal distribution (Table 5). Figures 1-3 show the PDFs and CDFs of the measured data and the four candidate marginal distributions for RI, $A_L$, and V, respectively. The Weibull distribution fits the observation data of RI well, and the Lognormal distribution is in good agreement with the measured data of $A_L$ and V. This supports the identified marginal distributions of RI, $A_L$, and V.
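The AIC-based choice between candidate marginal distributions can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: it uses the likelihood-based form AIC = 2k − 2 ln L, fits only two of the four candidates (Lognormal and Weibull) by maximum likelihood, and runs on synthetic lognormal data instead of the Table 1 observations. The helper names are hypothetical.

```python
import math
import random

def fit_lognormal(x):
    """Closed-form MLE for the lognormal; returns (mu, sigma, log-likelihood)."""
    logs = [math.log(v) for v in x]
    n = len(x)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
    loglik = -sum(logs) - n * math.log(sigma * math.sqrt(2 * math.pi)) - n / 2
    return mu, sigma, loglik

def fit_weibull(x, lo=0.05, hi=50.0, iters=100):
    """MLE for the two-parameter Weibull; returns (shape, scale, log-likelihood)."""
    logs = [math.log(v) for v in x]
    n = len(x)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def weights(k):
        # x**k rescaled by max(x)**k to avoid overflow for large data values.
        return [math.exp(k * (l - max_log)) for l in logs]

    def score(k):
        # Profile-likelihood equation for the shape; it increases with k.
        w = weights(k)
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    for _ in range(iters):  # bisection on the shape parameter
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    log_scale = max_log + math.log(sum(weights(k)) / n) / k
    loglik = n * math.log(k) - n * k * log_scale + (k - 1.0) * sum(logs) - n
    return k, math.exp(log_scale), loglik

def aic(loglik, n_params=2):
    """Likelihood-based AIC; the smaller value indicates the preferred model."""
    return 2 * n_params - 2 * loglik

random.seed(1)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]  # synthetic data
mu, sigma, ll_logn = fit_lognormal(sample)
shape, scale, ll_weib = fit_weibull(sample)
print("Lognormal AIC:", round(aic(ll_logn), 1))
print("Weibull AIC:  ", round(aic(ll_weib), 1))
```

With real data, the same comparison would be repeated for every candidate distribution and each variable, and the candidate with the minimum AIC would be selected.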

Probabilistic Prediction Model of Debris-Flow Volume Based on Copula Approaches
Based on the 50 sets of observation data of RI, $A_L$, and V in Table 1, copula approaches are used to construct their joint probabilistic model. Table 6 presents the calibration results for the most suitable copula function and its corresponding copula parameter. The AIC values for the Clayton, Frank, Ali-Mikhail-Haq, and Gumbel-Hougaard copulas are −235.346, −244.087, −197.52, and −100.01, respectively. The Frank copula has the minimum AIC value and is therefore identified as the most suitable copula to characterize the dependence structure of RI, $A_L$, and V in the study area. The corresponding copula parameter $\theta$ of the Frank copula is calculated as 2.808. By substituting the best-fit marginal distributions for RI, $A_L$, and V (i.e., Weibull, Lognormal, and Lognormal distributions) and the most suitable copula function (i.e., Frank copula) into Equation (1), the three-dimensional joint distribution for RI, $A_L$, and V can be expressed as follows:

$$F(RI^*, A_L^*, V^*) = C(u_{RI}, u_{A_L}, u_V; \theta) = -\frac{1}{\theta} \ln\left[ 1 + \frac{\left(e^{-\theta u_{RI}} - 1\right)\left(e^{-\theta u_{A_L}} - 1\right)\left(e^{-\theta u_V} - 1\right)}{\left(e^{-\theta} - 1\right)^2} \right], \quad \theta = 2.808 \tag{8}$$

where $RI^*$, $A_L^*$, and $V^*$ are the arguments for RI, $A_L$, and V, respectively; $u_{RI}$, $u_{A_L}$, and $u_V$ are the marginal CDFs for RI, $A_L$, and V, respectively. Then, based on probability theory, the conditional probability $\alpha$ that the debris-flow volume V for a specific basin does not exceed $V_0$ considering RI = $RI_0$ and $A_L$ = $A_{L0}$ can be calculated as follows:

$$\alpha = \frac{C(u_{RI_0}, u_{A_{L0}}, u_{V_0}; \theta)}{C(u_{RI_0}, u_{A_{L0}})} \tag{9}$$

where $u_{RI_0}$, $u_{A_{L0}}$, and $u_{V_0}$ are the marginal CDFs evaluated at $RI_0$, $A_{L0}$, and $V_0$, and $C(u_{RI_0}, u_{A_{L0}})$ is the bivariate joint distribution of RI and $A_L$. From Equation (9), given RI = $RI_0$ and $A_L$ = $A_{L0}$, the risk level of a potential debris-flow event whose volume does not exceed $V_0$ is quantitatively characterized as $\alpha$. Equation (9) also shows that the debris-flow volume can be estimated for a given probability level $\alpha$ considering RI = $RI_0$ and $A_L$ = $A_{L0}$. There exists a value of $\alpha$ that yields the best-fit estimates of the debris-flow volume. The mean-square error (MSE) measures the difference between the measured and estimated values and is given by:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( V'_i - V_i \right)^2 \tag{10}$$

where $N$ is the sample size of the debris-flow observation data; $V'_i$ is the estimated value of the debris-flow volume with a given probability level $\alpha$; $V_i$ is the measured value of the debris-flow volume. The probability level $\alpha$ with the minimum MSE is considered the best-fit risk level, which is subsequently used to estimate the debris-flow volume by using Equation (9).

Table 4 column headings: Marginal distribution; CDF F(x; p, q); PDF f(x; p, q); μ and σ.
Note. Φ denotes the CDF of the standard normal distribution; p and q are the parameters of the marginal probability distribution function; μ is the mean value; σ is the standard deviation.
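The conditional prediction step described above can be sketched in Python as follows. This is a hedged illustration, not the authors' code. It assumes that the conditional probability is the ratio of the trivariate to the bivariate Frank copula, that the marginals can be approximated by matching the means and COVs quoted in the text (the fitted parameters of Table 5 are not reproduced here), and that the Weibull shape for RI is 4.0, a value chosen only because it gives a COV close to 0.28. All function and constant names are hypothetical.

```python
from math import exp, expm1, gamma, log, log1p, sqrt
from statistics import NormalDist

THETA = 2.808                       # Frank copula parameter reported in the text
RI_SHAPE = 4.0                      # assumed Weibull shape (COV of about 0.28)
RI_SCALE = 22.08 / gamma(1.0 + 1.0 / RI_SHAPE)  # scale matched to the mean RI
AL_MEAN, AL_COV = 263785.70, 2.37   # mean and COV of A_L quoted in the text
V_MEAN, V_COV = 97905.92, 1.37      # mean and COV of V quoted in the text

def frank_cdf(u, theta=THETA):
    """n-dimensional Frank copula CDF for a sequence u of marginal CDF values."""
    prod = 1.0
    for ui in u:
        prod *= expm1(-theta * ui)
    return -log1p(prod / expm1(-theta) ** (len(u) - 1)) / theta

def lognormal_cdf(x, mean, cov):
    """Lognormal CDF parameterised by mean and COV (moment matching, an assumption)."""
    s2 = log(1.0 + cov ** 2)
    mu = log(mean) - 0.5 * s2
    return NormalDist(mu, sqrt(s2)).cdf(log(x))

def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF."""
    return 1.0 - exp(-((x / scale) ** shape))

def conditional_alpha(v0, ri0, al0):
    """alpha = C(u_RI0, u_AL0, u_V0) / C(u_RI0, u_AL0)."""
    u_ri = weibull_cdf(ri0, RI_SHAPE, RI_SCALE)
    u_al = lognormal_cdf(al0, AL_MEAN, AL_COV)
    u_v = lognormal_cdf(v0, V_MEAN, V_COV)
    return frank_cdf((u_ri, u_al, u_v)) / frank_cdf((u_ri, u_al))

def volume_at_alpha(alpha, ri0, al0, lo=1.0, hi=1.0e9, iters=200):
    """Invert conditional_alpha for V0 by bisection on a logarithmic scale."""
    for _ in range(iters):
        mid = sqrt(lo * hi)
        if conditional_alpha(mid, ri0, al0) < alpha:
            lo = mid
        else:
            hi = mid
    return sqrt(lo * hi)

v94 = volume_at_alpha(0.94, ri0=22.08, al0=263785.70)
print(f"V0 at alpha = 0.94: {v94:.0f} m^3")
```

Because the conditional probability increases monotonically with $V_0$, a simple bisection is sufficient to recover the volume for any target probability level.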
Figure 4 shows the relationship between MSE and the probability level α. The MSE reaches its minimum when α equals 0.94. Given the conditional probability α = 0.94 and the 50 sets of observation data in Table 1, the volumes of the 50 past debris-flow events are estimated by using Equation (9). Figure 5 shows the debris-flow volumes estimated by the copula approaches. The determination coefficient is about 0.743. The estimated debris-flow volumes generally lie around the 1 : 1 line, and about 98% of the estimated values are within the 95% confidence interval.
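The search for the probability level with the minimum MSE can be sketched as a simple grid search. In the Python snippet below, the predictor `toy_predict` and the data are hypothetical placeholders (a lognormal-type quantile scaled by a single input) and do not represent the copula model or the Table 1 observations; only the selection logic is of interest.

```python
from math import exp
from statistics import NormalDist

def mse(estimated, measured):
    """Mean-square error between estimated and measured volumes."""
    return sum((e - m) ** 2 for e, m in zip(estimated, measured)) / len(measured)

def best_alpha(predict, inputs, measured, alphas):
    """Return (alpha, MSE) with the smallest MSE over a grid of probability levels."""
    scored = [(mse([predict(a, x) for x in inputs], measured), a) for a in alphas]
    best_mse, best = min(scored)
    return best, best_mse

def toy_predict(alpha, x):
    """Hypothetical predictor: input x scaled by a lognormal-type quantile factor."""
    return x * exp(0.5 * NormalDist().inv_cdf(alpha))

inputs = [10.0, 20.0, 40.0]      # hypothetical predictor inputs
measured = [15.0, 31.0, 60.5]    # hypothetical measured volumes
alphas = [i / 100 for i in range(50, 100)]
alpha_star, mse_star = best_alpha(toy_predict, inputs, measured, alphas)
print(alpha_star)  # 0.8
```

Replacing `toy_predict` with a function that inverts the conditional copula probability for the volume would reproduce the type of curve shown in Figure 4.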
To further validate the proposed copula approaches, the nine sets of independent observation data in Table 1 (Nos. 51-59) are used to forecast the debris-flow volume using Equation (9), with the conditional probability α set to 0.94. Meanwhile, an empirical relationship is developed with the 50 sets of "training" data in Table 1, as shown in Figure 6. The determination coefficient of the empirical relationship is about 0.643, and this relationship is used for comparison. Figure 7 shows the measured values and the values predicted by the proposed copula approaches and the empirical relationship. The V values predicted by the proposed copula approaches are closer to the measured values and show less scatter than those obtained from the empirical relationship. This indicates that the proposed copula approaches properly characterize the high uncertainties and variabilities of the debris-flow volume and its influencing factors and can provide reasonable forecasts of the debris-flow volume.
In addition, it should be noted that the debris-flow volume also depends on debris-flow parameters (e.g., flow velocity) and physiographical parameters. These factors are not considered in this paper and should be included for more rigorous and accurate estimates of the debris-flow volume. These limitations do not affect the proposed method itself; they are common to most physical and empirical methods. The probabilistic model is developed from a limited number of observations of debris-flow events in the study area. If more debris-flow event data become available, the joint probability model can be recalibrated to improve the prediction accuracy of the debris-flow volume.

Exceedance Probability Charts for Debris-Flow Hazard Assessment Based on Multivariate Joint Probabilistic Model
Considering that the debris-flow volume is a key parameter in the hazard assessment of debris flows, it is worthwhile to develop exceedance probability charts for the design of mitigation strategies based on the previous analyses of this study. The developed probabilistic model (i.e., Equation (8)) can be used to develop exceedance probability design charts for debris-flow hazard assessment. From Equation (8), the exceedance probability of RI and V given a specific landslide area $A_L$ = $A_{L0}$ is defined following [26]. Using Equation (8), the respective exceedance probability values of RI and V can be calculated at different threshold values. Figure 8 shows the conditional probability distribution and bivariate exceedance probability chart considering $A_L$ = 263,785.70 m² (i.e., the mean value in Table 1). Each line in Figure 8(b) is a line of equal exceedance probability. The bivariate exceedance probability chart of RI and V shows that the exceedance probability increases as the threshold values of RI and V decrease. The chart also provides a means to determine the magnitude of a debris flow. For example, if the rainfall intensity RI = 60 mm/hr, the corresponding debris-flow volume can be estimated for different exceedance probabilities. In addition, the intensity of a debris flow can be characterized in a probability-based manner. For example, the exceedance probability of RI = 60 mm/hr and V = 10,000 m³ equals 0.7 from the chart, which can provide a preliminary reference for debris-flow risk assessment and the design of control measures. Similarly, the exceedance probability chart of $A_L$ and V given a specific rainfall intensity RI = 22.08 mm/hr (i.e., the mean value in Table 1) shows the same trend, as shown in Figure 9. The bivariate exceedance probability of $A_L$ and V also increases as the threshold values of $A_L$ and V decrease, and the chart likewise provides a means to determine the magnitude of a debris flow.
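As a rough illustration of how a bivariate exceedance probability chart can be tabulated from a copula, the Python sketch below evaluates one common definition, the joint ("AND") exceedance P(X > x, Y > y) = 1 − u − v + C(u, v), on a grid of marginal CDF levels. This is an assumption made for illustration: the definition used in the paper conditions on a fixed third variable and may differ, and the bivariate Frank copula with θ = 2.808 is reused here only as a convenient example.

```python
from math import expm1, log1p

def frank2(u, v, theta):
    """Bivariate Frank copula CDF."""
    return -log1p(expm1(-theta * u) * expm1(-theta * v) / expm1(-theta)) / theta

def joint_exceedance(u, v, theta):
    """P(X > x, Y > y) = 1 - u - v + C(u, v), with u = F_X(x) and v = F_Y(y)."""
    return 1.0 - u - v + frank2(u, v, theta)

theta = 2.808  # Frank parameter reported in the text, reused for illustration
levels = [0.1, 0.3, 0.5, 0.7, 0.9]
for u in levels:
    print(u, [round(joint_exceedance(u, v, theta), 3) for v in levels])
```

Each printed row corresponds to one threshold of the first variable; contour lines through equal values of this table would play the role of the equal exceedance probability lines in such a chart, and the values decrease as either threshold increases.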

Summary and Conclusions
Hazard assessment is crucial for debris-flow risk assessment and the design of control measures. This paper proposed probabilistic models for debris-flow hazard assessment that consider the uncertainties of multiple influencing factors based on copula approaches. The proposed probabilistic models not only provide probabilistic estimates of the debris-flow volume but also determine the probability of a potential debris-flow event. The proposed copula approaches were illustrated using 59 past rainfall-induced debris-flow events in Taiwan. First, the 59 datasets of past debris flows were divided into 50 sets of data for model construction and 9 sets of data for validation. Then, a three-dimensional copula model incorporating the debris-flow volume V and its influencing factors (e.g., rainfall intensity, RI, and landslide area, $A_L$) was developed from the 50 sets of observation data. Finally, the developed V-RI-$A_L$ joint probabilistic model was used to make probabilistic predictions of the debris-flow volume for a specific hazard level. The proposed approaches were validated and compared with the empirical approach using nine sets of independent observation data in the study area. The proposed probabilistic model was also used to develop exceedance probability charts of quantities (e.g., the debris-flow volume, V, and rainfall intensity, RI) for a given landslide area, $A_L$, for a specific debris flow. The findings are summarized as follows:

(1) The AIC-based goodness-of-fit comparison shows that the Weibull distribution is the most appropriate marginal distribution for rainfall intensity, RI, in Taiwan. The Lognormal distribution is the most appropriate marginal distribution for the debris-flow volume, V, and landslide area, $A_L$, in the study area.

(2) Among the Clayton, Frank, Ali-Mikhail-Haq, and Gumbel-Hougaard copulas, the Frank copula is the best-fit copula for characterizing the dependence structure among RI, $A_L$, and V in the study area. A three-dimensional joint probabilistic model that combines the Weibull-Lognormal-Lognormal marginal distributions with the Frank copula can be used to characterize the joint probability distribution of RI, $A_L$, and V in Taiwan.

(3) The joint probabilistic model of RI, $A_L$, and V can provide reasonable predictions of the debris-flow volume with a specific conditional probability α = 0.94. Compared with the empirical relationship, the debris-flow volumes estimated by the proposed copula approaches are closer to the measured values. The proposed approaches can provide an alternative method for forecasting the magnitude of a potential debris-flow event in Taiwan.

(4) The developed probabilistic model of RI, $A_L$, and V can provide exceedance probability design charts for debris-flow hazard assessment. The exceedance probability increases as the threshold values of RI and V decrease, given a specific landslide area $A_L$ = $A_{L0}$. The exceedance probability chart can provide a preliminary reference for debris-flow risk assessment and the design of control measures.

FIGURE 2: Best-fit marginal distribution of $A_L$: (a) PDF and (b) CDF.

FIGURE 6: Empirical relationship for predicting the debris-flow volume in the study area.

FIGURE 5: Comparison of measured debris-flow volume with the estimates obtained from the copula approaches.

FIGURE 7: Validation of the proposed copula approaches based on the nine sets of independent data.

FIGURE 8: Conditional probability distribution and bivariate exceedance probability chart considering $A_L$ = 263,785.70 m² (i.e., the mean value in Table 1). (a) Conditional probability distribution of V and RI and (b) bivariate exceedance probability chart.

TABLE 1: Fifty-nine datasets of debris flows in Taiwan and the statistics of RI, $A_L$, and V.

TABLE 2: Kendall rank correlation coefficients of RI, $A_L$, and V.

TABLE 4: Four commonly used marginal distributions.

TABLE 5: Summary of the calibration for the best-fit marginal distributions.
Note. Bold indicates the minimum AIC value among the four candidate marginal distributions.

TABLE 6: Identification of best-fit copula function and copula parameters.

FIGURE 4: Relationship between MSE and the probability level α.