Method of Fusion Diagnosis for Dam Service Status Based on Joint Distribution Function of Multiple Points

Traditional methods for diagnosing dam service status are generally designed for a single measuring point. They reflect only the local status of a dam and do not fuse multisource data effectively, so they are poorly suited to diagnosing overall service status. This study proposes a new multipoint method for diagnosing dam service status based on a joint distribution function. The function, which incorporates the monitoring data of multiple points, is established with the t-copula function. The probability of any combination of measured values, which serves as the fused diagnostic quantity, can then be calculated, and the corresponding diagnostic criterion is established with the typical small probability method. An engineering case study indicates that the fusion diagnosis can be conducted in real time and that the abnormal point can be detected, thereby providing a new early warning method for engineering safety.


Introduction
A dam is an important piece of infrastructure that forms a reservoir, and its safety, a major public safety issue, bears not only on reservoir benefits but also on human lives, national economic development, social stability, and many other aspects. However, some dams face safety and failure risks because of the complexity of their surroundings and deficiencies in management and engineering; dam failures have also occurred occasionally around the world [1,2].
Dam safety monitoring is an important means of ensuring the normal operation of dams. By analyzing effect variables such as deformation, seepage, and stress, the service status of a dam can be assessed and abnormal situations can be detected, which provides a practical basis for monitoring these structures and for early warning [3].
At present, the analysis of monitoring data and the diagnosis of dam status rely mainly on single-point monitoring models. Since the Italian scholar Tonini [4] identified hydraulic pressure, temperature, and time as the main factors affecting dam deformation in 1956, the classical single-point monitoring model has been studied further and applied widely in practical engineering, giving rise to the statistical model [5,6], the deterministic model [7], and the hybrid model [8]. Meanwhile, with the development of machine learning algorithms [9,10], single-point modeling has moved in a nonlinear direction. Stojanovic et al. [11] used artificial neural networks to establish a nonlinear monitoring model between environmental variables and dam displacement, improving the adaptability of the model. Ranković et al. [12] trained a support vector machine on the original monitoring data, improving the forecast accuracy of the model in the nonlinear situation. Other scholars introduced intelligent algorithms, such as particle swarm optimization [13], the ant colony optimization algorithm [14], and the genetic algorithm [15], to construct single-point monitoring models, thereby enriching the modeling approach and achieving accurate results.
However, a single-point monitoring model reflects only the local structure of a dam, and the overall status of the dam cannot be described easily in this way. In general, the monitoring information of the measuring points is strongly correlated. Thus, the overall service state of a dam should be diagnosed by fusing multipoint monitoring information. Research in this area is still relatively scarce. He et al. [16] introduced D-S evidence theory to realize fusion diagnosis for dams on the basis of expert evaluations. De Sortis and Paoliani [17] used the modulus of elasticity as the diagnostic indicator and diagnosed dam behavior on the basis of a finite element model and monitoring data. Su et al. [18] used the fractal dimension as the diagnostic indicator and applied rescaled range analysis to fuse multipoint monitoring information. Fusion diagnosis methods from other engineering fields are also instructive. Rafiq et al. [19] proposed Bayesian networks to diagnose bridge behavior according to expert evaluations. Masoumi et al. [20] constructed a finite element model of a steel structure and introduced an optimization algorithm to improve the diagnostic precision of the modulus of elasticity. Georgoulas et al. [21] diagnosed rotor bars with a Markov model on the condition that the monitoring data obey the normal distribution.
Although these studies represent useful attempts to diagnose dam service status, some shortcomings remain. First, expert evaluations are subjective, and differences among experts may affect the diagnosis. Second, the diagnostic models based on these methods take a characteristic parameter (such as the fractal dimension or the modulus of elasticity) as the basis of diagnosis, which makes it impossible to diagnose the whole dam in real time from the values measured at each point on each monitoring day. Third, these methods presume that the monitoring values obey the normal distribution when the whole dam is diagnosed. However, these values do not strictly obey the normal distribution, particularly when abnormal data exist in the monitoring series. Therefore, fusion diagnosis methods for dam behavior based on multiple points need further study.
This study addresses the aforementioned shortcomings and proposes a new fusion diagnostic method based on a joint distribution function built from the in situ monitoring data of dams. The distribution of each single point is estimated with kernel density estimation (KDE), which yields a distribution close to the real one for each point. Then, the distributions of the single points are connected into a joint (multipoint) distribution function with the t-copula function, and the probability of any combination of measured values can be calculated, which helps in analyzing how multiple points change together. Next, the diagnostic criterion is established with the typical small probability method, and the diagnostic process is described, thereby providing a real-time and efficient way to diagnose the overall service status of dams. The proposed method can also be applied to fusion diagnosis in other engineering fields.

Single Point Distribution.
In general, the monitoring data of a single point are assumed to follow the normal distribution to facilitate analysis and calculation [22]. However, deviations always exist between the true distribution and the normal distribution, and the approximation may ignore the real characteristics of the monitoring data, thereby leading to calculation error. KDE [23] uses the sample data to estimate the probability density function of the population distribution. The advantage of this method is that the estimate depends only on the sample data, without a priori assumptions. Therefore, the real traits of the monitoring data, which reflect the real features of a single point, can be retained. The expression of KDE can be written as follows:

$$f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right), \tag{1}$$

where $f(x)$ is the probability density function of a single point (S-PDF), $x_1, x_2, \ldots, x_n$ are the sample values, $n$ is the sample size, $h$ is the bandwidth, and $K(\cdot)$ is the kernel function, for example, the Gaussian kernel

$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right). \tag{2}$$

The precision of the S-PDF therefore relies mainly on the bandwidth, which is the key factor determining the shape and smoothness of the S-PDF curve. The cumulative distribution function of a single point (S-CDF) is obtained by integrating the S-PDF. Let the S-CDF be $F(x)$ and let the measured value in one observation be $x_0$; the S-CDF value, which is the probability $P$ that $x$ is less than $x_0$, can be expressed as

$$F(x_0) = P(x < x_0) = \int_{-\infty}^{x_0} f(x)\,dx, \tag{3}$$

where $0 \le F(x_0) \le 1$. To compare the accuracy obtained with different bandwidths, let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be the order statistics of the sample; the empirical cumulative distribution function of a single point (S-ECDF) can be written as

$$F_n(x) = \begin{cases} 0, & x < x_{(1)}, \\ k/n, & x_{(k)} \le x < x_{(k+1)},\ k = 1, 2, \ldots, n-1, \\ 1, & x \ge x_{(n)}. \end{cases} \tag{4}$$

The goodness of fit can be estimated with the root mean square error (RMSE) by comparing the S-CDF value with the S-ECDF value at the sample values $x_i$; a low RMSE means high fitting accuracy. The expression of RMSE is written as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[F(x_i) - F_n(x_i)\right]^2}. \tag{5}$$

Mathematical Problems in Engineering
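The single-point quantities described above (S-PDF, S-CDF, S-ECDF, and their RMSE) can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the authors' implementation: it assumes a Gaussian kernel, the function names are hypothetical, and the sample is synthetic stand-in data rather than dam monitoring data.

```python
import math
import random


def s_pdf(x, sample, h):
    """Gaussian-kernel KDE density at x (assumed Gaussian kernel)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def s_cdf(x, sample, h):
    """Integral of s_pdf up to x: the mean of Gaussian CDFs centred on the samples."""
    total = sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in sample)
    return total / len(sample)


def s_ecdf(x, sample):
    """Empirical CDF: fraction of sample values not exceeding x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def rmse_cdf(sample, h):
    """RMSE between S-CDF and S-ECDF evaluated at the sample values."""
    sq = sum((s_cdf(xi, sample, h) - s_ecdf(xi, sample)) ** 2 for xi in sample)
    return math.sqrt(sq / len(sample))


# Synthetic stand-in for one point's displacement series (not real data).
random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(300)]

# Compare the same three bandwidths that the case study examines.
errors = {h: rmse_cdf(sample, h) for h in (0.1, 0.5, 1.0)}
for h, err in errors.items():
    print(f"h = {h}: RMSE = {err:.4f}")
```

On data like this, a smaller bandwidth tracks the empirical distribution more closely, which is the effect the RMSE comparison is meant to expose.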

Multipoint Distribution Based on t-Copula Function.
The S-PDF and S-CDF reflect the operational state of only one point. To diagnose the overall service status of dams, the probability density function of multiple points (M-PDF) and the cumulative distribution function of multiple points (M-CDF) should be constructed from the S-PDFs and S-CDFs. Sklar's theorem [24,25] provides a theoretical basis for generating the M-CDF from nonnormal S-CDFs. If the number of points is $m$ and their S-CDFs are $F_1(x_1), F_2(x_2), \ldots, F_m(x_m)$, then a unique copula cumulative distribution function (C-CDF) $C(\cdot; \theta)$ exists that connects the S-CDFs to the M-CDF $F(x_1, x_2, \ldots, x_m; \theta)$. For the measured values $(x_1, x_2, \ldots, x_m)$, the M-CDF can be expressed as

$$F(x_1, x_2, \ldots, x_m; \theta) = C(F_1(x_1), F_2(x_2), \ldots, F_m(x_m); \theta), \tag{6}$$

where $\theta$ is the parameter of the copula function and the domain of $C(\cdot; \theta)$ is $[0, 1]^m$. For any value $(F_1(x_1), F_2(x_2), \ldots, F_m(x_m))$ in the domain, $0 \le C(\cdot; \theta) \le 1$ holds. Sklar's theorem shows that the M-CDF can be decomposed into several S-CDFs and one C-CDF that describes the dependence among these variables. The C-CDF has many forms. In this study, the t-copula function is chosen as the C-CDF because of its symmetric tails and its sensitivity in capturing the tail correlation of variables. For the t-copula function, the parameters are $\rho$ (a matrix) and $k$ (a constant), which describe the relationship among the points and the degrees of freedom of the function, respectively. The parameters can be calculated with maximum likelihood estimation (MLE). If the S-PDFs of the $m$ points are $f_1(x_1), f_2(x_2), \ldots, f_m(x_m)$, then differentiating (6) gives the M-PDF:

$$f(x_1, x_2, \ldots, x_m; \rho, k) = c(F_1(x_1), F_2(x_2), \ldots, F_m(x_m); \rho, k) \prod_{i=1}^{m} f_i(x_i), \tag{7}$$

where $c(\cdot; \rho, k)$ is the probability density function of the t-copula. Let $T$ be the number of observations and let $x_{it}$ be the value of point $i$ in observation $t$; the likelihood function can be written as

$$L(\rho, k) = \prod_{t=1}^{T} f(x_{1t}, x_{2t}, \ldots, x_{mt}; \rho, k). \tag{8}$$

The corresponding logarithmic likelihood function can be written as

$$\ln L(\rho, k) = \sum_{t=1}^{T} \ln c(F_1(x_{1t}), F_2(x_{2t}), \ldots, F_m(x_{mt}); \rho, k) + \sum_{t=1}^{T} \sum_{i=1}^{m} \ln f_i(x_{it}). \tag{9}$$

The MLE for $\rho$ and $k$ is obtained by maximizing (9):

$$(\hat{\rho}, \hat{k}) = \arg\max \ln L(\rho, k). \tag{10}$$

The fitting accuracy can also be evaluated by comparing the M-CDF values with the empirical cumulative distribution function of multiple points (M-ECDF).
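For readers who want the t-copula written out, the block below gives its standard textbook form. It is a reference sketch, not a formula taken from this paper: the symbols $T_{\rho,k}$ (the CDF of the $m$-variate Student t distribution with correlation matrix $\rho$ and $k$ degrees of freedom), $t_k^{-1}$ (the quantile function of the univariate Student t distribution), and $\zeta$ are notation introduced here for illustration.

```latex
% Standard t-copula with u_i = F_i(x_i) and \zeta_i = t_k^{-1}(u_i)
\begin{align}
C(u_1,\ldots,u_m;\rho,k)
  &= T_{\rho,k}\bigl(t_k^{-1}(u_1),\ldots,t_k^{-1}(u_m)\bigr), \\
c(u_1,\ldots,u_m;\rho,k)
  &= |\rho|^{-1/2}\,
     \frac{\Gamma\!\left(\frac{k+m}{2}\right)\left[\Gamma\!\left(\frac{k}{2}\right)\right]^{m-1}}
          {\left[\Gamma\!\left(\frac{k+1}{2}\right)\right]^{m}}\,
     \frac{\left(1+\frac{1}{k}\,\zeta^{\top}\rho^{-1}\zeta\right)^{-\frac{k+m}{2}}}
          {\prod_{i=1}^{m}\left(1+\frac{\zeta_i^{2}}{k}\right)^{-\frac{k+1}{2}}},
\qquad \zeta=(\zeta_1,\ldots,\zeta_m)^{\top}.
\end{align}
```

The density is the ratio of the multivariate t density to the product of its univariate margins. Because only the copula term of the log-likelihood depends on $\rho$ and $k$, the marginal S-PDF term can be treated as a constant once the S-PDFs have been fixed by KDE.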

Diagnostic Criteria Based on Typical Small Probability Method.
As shown in (6), the M-CDF is a function of the values measured at the different points and is positively related to them (it is nondecreasing in each measured value). Therefore, the M-CDF can be used as an important parameter for diagnosing the service status of an entire dam. If its value varies within a fixed interval, the service status of the entire dam can be considered normal; if the value leaves the interval, the status can be judged abnormal. In this study, the monitoring data are divided into two periods: the modeling period and the diagnosis period. The data of the modeling period are used to establish the M-CDF, and the interval is calculated from the M-CDF values in this period with the typical small probability method.
Let the length of the monitoring data in the modeling period be $N$ years; the M-CDF can be established, and the corresponding M-CDF value on each monitoring day can be calculated. Then, the unfavorable M-CDF values $M$ of each year, namely, the annual maximum $M_i^{\max}$ and the annual minimum $M_i^{\min}$, form the samples $M_{\max}$ and $M_{\min}$:

$$M_{\max} = \{M_1^{\max}, M_2^{\max}, \ldots, M_N^{\max}\}, \quad M_{\min} = \{M_1^{\min}, M_2^{\min}, \ldots, M_N^{\min}\}. \tag{11}$$

The mean values $\mu_{\max}$ and $\mu_{\min}$ and the standard deviations $\sigma_{\max}$ and $\sigma_{\min}$ are written as follows:

$$\mu_{\max} = \frac{1}{N} \sum_{i=1}^{N} M_i^{\max}, \quad \sigma_{\max} = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(M_i^{\max} - \mu_{\max}\right)^2}, \quad \mu_{\min} = \frac{1}{N} \sum_{i=1}^{N} M_i^{\min}, \quad \sigma_{\min} = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(M_i^{\min} - \mu_{\min}\right)^2}. \tag{12}$$

According to these characteristic values, the K-S testing approach is applied to check the distribution pattern of $M_{\max}$ and $M_{\min}$. Then, the probability density functions $f_{\max}(M_{\max})$ and $f_{\min}(M_{\min})$ and the cumulative distribution functions $F_{\max}(M_{\max})$ and $F_{\min}(M_{\min})$ are confirmed. Let the threshold values of the M-CDF be $M^{\alpha}_{\max}$ and $M^{\alpha}_{\min}$; the probability of $M > M^{\alpha}_{\max}$ or $M < M^{\alpha}_{\min}$, which indicates an abnormal situation of the dam, can be written as

$$P(M > M^{\alpha}_{\max}) = \int_{M^{\alpha}_{\max}}^{+\infty} f_{\max}(M)\,dM = \alpha, \quad P(M < M^{\alpha}_{\min}) = \int_{-\infty}^{M^{\alpha}_{\min}} f_{\min}(M)\,dM = \alpha. \tag{13}$$

Based on the small probability principle [26], $\alpha$ is commonly set as 0.05. Thus, $M^{\alpha}_{\max}$ and $M^{\alpha}_{\min}$ can be calculated as in (14). If the M-CDF exceeds a threshold, then the values measured at the different points lie in the small probability event area. If an M-CDF reestablished after removing suspected abnormal points still exceeds the threshold level, then its trend should be observed in the following days. If necessary, emergency action should also be taken to bring the value back into the reasonable region. The flowchart based on the preceding discussion is illustrated in Figure 1 to describe clearly the diagnosis process for the service behavior of dams.
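The threshold procedure described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the authors' code: the function names and the toy records are hypothetical, and the normal law for annual maxima and the log-normal law for annual minima are assumed here purely for illustration, since in practice the K-S test decides which law to use. The K-S statistic is computed by hand so that the sketch needs only the standard library.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def annual_extremes(records):
    """records: iterable of (year, mcdf_value). Returns (maxima, minima), one entry per year."""
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    years = sorted(by_year)
    return [max(by_year[y]) for y in years], [min(by_year[y]) for y in years]


def ks_statistic(sample, dist):
    """Kolmogorov-Smirnov distance between the sample ECDF and dist.cdf."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - dist.cdf(x), dist.cdf(x) - i / n)
               for i, x in enumerate(xs))


def upper_threshold(maxima, alpha=0.05):
    """Assumed normal fit to annual maxima; value exceeded with probability alpha."""
    dist = NormalDist(mean(maxima), stdev(maxima))
    return dist.inv_cdf(1.0 - alpha), ks_statistic(maxima, dist)


def lower_threshold(minima, alpha=0.05):
    """Assumed log-normal fit to annual minima; value undershot with probability alpha."""
    logs = [math.log(v) for v in minima]
    dist = NormalDist(mean(logs), stdev(logs))
    return math.exp(dist.inv_cdf(alpha)), ks_statistic(logs, dist)


# Hypothetical (year, M-CDF value) pairs, not taken from the case study.
records = [(1999, 0.31), (1999, 0.74), (2000, 0.28), (2000, 0.81),
           (2001, 0.35), (2001, 0.69), (2002, 0.30), (2002, 0.77)]
maxima, minima = annual_extremes(records)
upper, ks_upper = upper_threshold(maxima)
lower, ks_lower = lower_threshold(minima)
print(f"upper threshold = {upper:.3f}, lower threshold = {lower:.3f}")
```

A small K-S distance supports the assumed law; when the parameters are estimated from the same sample, as here, standard K-S critical values are only approximate.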

Description of the Project.
Wanan Water Conservancy Project is located in the middle reaches of the Gan River in Jiangxi Province, China. The project consists of a concrete gravity dam, earth- and rock-filled dams, and a ship lock.
Figure 2 shows a photo of the dam. The normal water level, design flood level, and maximum flood level of the project are 96 m, 100 m, and 100.70 m, respectively. The crest elevation of the concrete gravity dam is 104 m, and the height of the gravity dam is 46.04 m. The project has a complete monitoring system for displacement and seepage. A total of 25 dam foundations are located in the gravity dam. One wire alignment named EX4, with 25 measuring points, is arranged in the observation gallery at the crest to monitor the static horizontal displacement of each dam foundation. The sketch of the point arrangement is shown in Figure 3. In this study, four points (i.e., EX422, EX423, EX424, and EX425) in the powerhouse dam are selected as an example to analyze the diagnosis method (EX421 is excluded because it has insufficient monitoring data). After the measured values of the four points are processed and synchronized, 4091 sets of data from January 1, 1999, to December 31, 2013, are set as the modeling period. These data are used to establish the M-CDF and the diagnosis criteria. The data in 2014 (295 sets) are set as the diagnosis period, and the service status of the powerhouse dam in this period is diagnosed.

Distribution of a Single Point and Multiple Points.
The process lines of the four points in the modeling period and the diagnosis period are shown in Figure 4. The orientation is defined as follows: displacement toward downstream is positive, whereas displacement toward upstream is negative. Although EX422 drifts toward downstream from 2005 to 2013, all points vary consistently and regularly, which facilitates the construction of the M-CDF.
According to (1), the S-PDF values of the different points in the modeling period are determined with KDE. The fitting results of these points against the frequency histograms, obtained with the Gauss kernel function and with bandwidths of 0.1, 0.5, and 1, are shown in Figure 5. Then, the S-CDF can be calculated from the S-PDF, and the S-ECDF values are calculated with (4). Table 1 lists the RMSE for the different bandwidths and points. As shown in Figure 5 and Table 1, if the bandwidth is small, then the S-PDF curve is close to the frequency histogram, which indicates a high fitting accuracy. When the bandwidth is set as 0.1, the S-PDF reproduces the features of the frequency histograms best.
The small RMSE indicates that the corresponding S-CDF values are close to the S-ECDF values. If the samples of EX422, EX423, EX424, and EX425 are $\{x_1\}$, $\{x_2\}$, $\{x_3\}$, and $\{x_4\}$, respectively, then the S-PDF of these points at this bandwidth can be expressed as

$$f_j(x_j) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x_j - x_{j,i}}{h}\right), \quad j = 1, 2, 3, 4,$$

with $n = 4091$ and $h = 0.1$. The corresponding S-CDF can be written as

$$F_j(x_j) = \int_{-\infty}^{x_j} f_j(x)\,dx, \quad j = 1, 2, 3, 4.$$

Then, the M-CDF of the four points is established with (6) to (10), and the M-ECDF at each sample point is calculated. The following is the expression of the M-CDF, with the parameters $\hat{\rho}$ and $\hat{k}$ estimated by (10):

$$F(x_1, x_2, x_3, x_4) = C(F_1(x_1), F_2(x_2), F_3(x_3), F_4(x_4); \hat{\rho}, \hat{k}).$$

Diagnosis of Service Status.
According to (11) and (12), the annual maximum $M_{\max}$ and minimum $M_{\min}$ of the four-point M-CDF in the modeling period are calculated and shown in Table 2. The K-S test shows that $M_{\max}$ obeys the normal distribution $N(7.54 \times 10^{-1}, 7.95 \times 10^{-2})$, whereas $M_{\min}$ obeys the logarithmic normal distribution $LN(-5.69, 2.03)$. When $\alpha$ is set as 0.05, $M^{\alpha}_{\max}$ and $M^{\alpha}_{\min}$ are $8.84 \times 10^{-1}$ and $1.2 \times 10^{-4}$, respectively. The process line of the four-point M-CDF values in the diagnosis period is shown in Figure 7.
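The reported thresholds can be checked with a short calculation. The sketch below assumes that the second parameter of each fitted distribution is a standard deviation (of $M_{\max}$ for the normal law and of $\ln M_{\min}$ for the log-normal law); this reading is an assumption made here, chosen because it reproduces the reported values to within rounding.

```python
import math
from statistics import NormalDist

alpha = 0.05

# Upper threshold: 0.95 quantile of the fitted normal law for annual maxima.
# Assumption: 7.95e-2 is a standard deviation.
m_max_alpha = NormalDist(7.54e-1, 7.95e-2).inv_cdf(1.0 - alpha)

# Lower threshold: 0.05 quantile of the fitted log-normal law for annual minima,
# i.e. exp of the 0.05 quantile of a normal law for ln(M_min).
# Assumption: 2.03 is the standard deviation of ln(M_min).
m_min_alpha = math.exp(NormalDist(-5.69, 2.03).inv_cdf(alpha))

print(f"upper threshold ~ {m_max_alpha:.3f}")
print(f"lower threshold ~ {m_min_alpha:.2e}")
```

Both quantiles land close to the reported $8.84 \times 10^{-1}$ and $1.2 \times 10^{-4}$; the small difference in the upper value is consistent with rounding of the fitted parameters.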
As shown in Figure 7, none of the M-CDF values in the diagnosis period falls below $M^{\alpha}_{\min}$. This observation indicates that the displacement of the dam sections toward upstream is coordinated and that no excessively large upstream displacement occurs, either in all dam sections or in an individual section. However, from March 11 to 15, the value exceeds $M^{\alpha}_{\max}$, and the maximum is reached on March 13. This observation indicates that the displacement of all sections or of some sections toward downstream in this period is excessively large, and further analysis should be conducted. According to the diagnosis process of Figure 1, the three-point M-CDFs obtained by eliminating one point at a time are established. With $F_1$, $F_2$, $F_3$, and $F_4$ denoting the S-CDFs of EX422, EX423, EX424, and EX425, they take the form of (6):

$$F(x_1, x_2, x_3) = C(F_1(x_1), F_2(x_2), F_3(x_3); \rho, k), \quad F(x_1, x_2, x_4) = C(F_1(x_1), F_2(x_2), F_4(x_4); \rho, k),$$
$$F(x_1, x_3, x_4) = C(F_1(x_1), F_3(x_3), F_4(x_4); \rho, k), \quad F(x_2, x_3, x_4) = C(F_2(x_2), F_3(x_3), F_4(x_4); \rho, k),$$

where $\rho$ and $k$ are estimated separately for each combination. The results of the K-S test show that $M_{\max}$ obeys the normal distribution, whereas $M_{\min}$ obeys the logarithmic normal distribution, for all three-point M-CDFs. The characteristic parameters of these distributions, namely, the expectation $\mu$ and the standard deviation $\sigma$, are shown in Table 3. For $M_{\max}$ and $M_{\min}$ of the M-CDF involving EX423, EX424, and EX425, $\mu$ is larger whereas $\sigma$ is smaller than those of the other M-CDFs. This finding indicates that the annual extreme values of this M-CDF change in a relatively stable manner and that their dispersion is small. The process lines of the three-point M-CDF values in the diagnosis period are shown in Figure 8.
As shown in Figures 8(a)-8(c), the M-CDF values involving EX422 exceed $M^{\alpha}_{\max}$ from March 11 to 15, and the maximum is reached on March 13. This scenario is similar to that in Figure 7. However, the M-CDF values involving EX423, EX424, and EX425 are in the normal range from March 11 to 15, 2014. Therefore, from the qualitative perspective, the M-CDF anomaly is regarded as being caused by a single point (i.e., EX422). Furthermore, a quantitative analysis can be applied to EX422. The measured information of the single points on March 13 is shown in Table 4. The M-CDF involving EX423, EX424, and EX425 is in the normal region on March 13. Therefore, under this condition, when the M-CDF value of the four points reaches $M^{\alpha}_{\max} = 8.84 \times 10^{-1}$, the S-CDF of EX422 can be calculated as

$$\text{S-CDF}_{\text{EX422}} = C^{-1}(M^{\alpha}_{\max}, F_2, F_3, F_4) = C^{-1}(8.84 \times 10^{-1}, \ldots)$$

According to the analysis, the measured value of EX422 lies in the region [11.21, 13.35] under the condition that the M-CDF involving EX423, EX424, and EX425 is in the normal region while the M-CDF of the four points is in the abnormal region. The probability of this event is $6.72 \times 10^{-3}$, which is much less than the significance level of a small probability event (0.05). However, the measured values of EX422 are in this region from March 11 to 15, 2014. This observation indicates that the small probability event has already occurred. Therefore, the change of EX422 is not coordinated with that of the other points, and the displacement toward downstream of the 4# powerhouse dam section is excessively large, whereas the displacements of the other sections are normal. As shown in Figure 4, the displacement toward downstream of EX422 shows an increasing trend in the modeling period, and this trend is fully considered in the process of building the M-CDF. Even so, the M-CDF values related to EX422 from March 11 to March 15, 2014, still exceed the high threshold level. This finding indicates that the downstream trend of this dam section continues to increase. Therefore, focusing on the monitoring values of EX422 is essential, and maintenance or strengthening measures can be applied to the 4# powerhouse dam section if necessary.

Conclusion
In this study, a diagnosis method for the service status of the whole dam based on the joint distribution of multiple points is proposed. The distributions of a single point and of multiple points are studied. The method of parameter estimation and the accuracy of the S-CDF and M-CDF are discussed. Then, the diagnostic criteria and processes are established with the typical small probability method. The following conclusions are obtained.
(1) KDE can calculate the S-PDF while effectively preserving the in situ monitoring information. When the bandwidth is set as 0.1, the S-PDF has a high fitting precision, which avoids the shortcomings of a priori assumptions and provides a new method for estimating the distribution of a single point.
(2) The M-CDF based on the t-copula can reflect the relationship among multiple points. It also has a high fitting precision for the joint distribution. The M-CDF effectively fuses the measured values of multiple points. Therefore, it can be considered an important index for diagnosing the service status of an entire dam.
(3) The diagnosis criteria based on the typical small probability method reflect the distribution of the extrema of the M-CDF. These extrema help to diagnose the service status of an entire dam and to identify abnormal points.

Figure 3: Diagram of wire alignment in dam crest.

Figure 6 describes the relationship between the M-CDF and the M-ECDF at each sample point. All sample points lie close to the diagonal line, which indicates that the M-CDF values are close to the M-ECDF values. In addition, the RMSE is $2.04 \times 10^{-2}$, which indicates that the M-CDF simulates the joint distribution of the four points with high accuracy and is therefore suitable for fusion diagnosis.
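The M-ECDF used in this comparison can be computed directly from the samples. The sketch below is illustrative only: the function names are hypothetical, the data are synthetic independent uniforms, and the "model" CDF is the independence copula $u_1 u_2$, used here as a stand-in for a fitted t-copula M-CDF.

```python
import math
import random


def m_ecdf(point, samples):
    """Fraction of samples whose every coordinate is <= the matching coordinate of point."""
    hits = sum(1 for s in samples if all(si <= pi for si, pi in zip(s, point)))
    return hits / len(samples)


def rmse(model_values, empirical_values):
    """Root mean square error between two equally long sequences."""
    n = len(model_values)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(model_values, empirical_values)) / n)


# Synthetic two-point data on [0, 1)^2 with independent coordinates (not real data).
random.seed(1)
samples = [(random.random(), random.random()) for _ in range(400)]

# Stand-in model CDF: for independent uniforms the joint CDF is u1 * u2.
model = [u1 * u2 for u1, u2 in samples]
empirical = [m_ecdf(s, samples) for s in samples]

fit_error = rmse(model, empirical)
print(f"RMSE between model CDF and M-ECDF = {fit_error:.4f}")
```

Plotting `model` against `empirical` gives the kind of diagonal scatter that Figure 6 describes; points far from the diagonal would signal a poorly fitting joint distribution.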

Figure 8: Process line of the three-point M-CDF values in the diagnosis period.
$$M^{\alpha}_{\max} = F^{-1}_{\max}(\mu_{\max}, \sigma_{\max}, 0.95), \quad M^{\alpha}_{\min} = F^{-1}_{\min}(\mu_{\min}, \sigma_{\min}, 0.05). \tag{14}$$

2.4. Diagnosis Process for Service Status of the Entire Dam.
The M-CDF reflects the joint distribution of the various points based on historical monitoring data. Therefore, if its value in the diagnosis period exceeds the threshold level, two cases should be considered.
(1) The abnormal M-CDF value is caused by multiple points. This phenomenon indicates that the whole dam is in a state of high risk. Attention should be given to the trend of the value in the following days, and related measures should be taken to bring the value back into the normal range.
(2) The abnormal value is caused by the measured data of several points. When this phenomenon occurs, the abnormal points should be removed, and a new M-CDF should be established. If the new M-CDF is in a normal state, then the analysis should focus on these abnormal points.
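The two cases above can be sketched as a leave-one-out check. This is a simplified illustration, not the authors' procedure in full: it removes only one point at a time, and `is_abnormal` is a hypothetical callback that is assumed to rebuild the M-CDF for the given subset of points and compare it with that subset's thresholds. The toy callback below merely mimics the outcome reported in the case study.

```python
def diagnose(points, is_abnormal):
    """
    points: list of measuring-point labels.
    is_abnormal: hypothetical callback taking a tuple of labels and returning
    True when the M-CDF built from those points lies outside its thresholds.
    Returns (status, suspect_points).
    """
    if not is_abnormal(tuple(points)):
        return "normal", []
    # Case (2): find points whose removal brings the M-CDF back into range.
    suspects = [p for p in points
                if not is_abnormal(tuple(q for q in points if q != p))]
    if suspects:
        return "local anomaly", suspects
    # Case (1): no single removal helps, so the anomaly involves multiple points.
    return "overall anomaly", list(points)


def toy_is_abnormal(subset):
    """Toy stand-in: any subset that contains EX422 is flagged as abnormal."""
    return "EX422" in subset


status, suspects = diagnose(["EX422", "EX423", "EX424", "EX425"], toy_is_abnormal)
print(status, suspects)
```

With the toy callback, the four-point check fails, only the subset without EX422 passes, and the sketch therefore reports EX422 as the point to examine further.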

Table 1: RMSE in different bandwidths of different points.

Table 2: Maximum and minimum values of the four-point M-CDF.

Table 3: Characteristic parameters of the three-point M-CDFs.

Table 4: Measured information of various single points on March 13.