Seepage behavior detection is an important tool for ensuring the safety of earth dams. However, traditional detection methods use monitoring data insufficiently and focus mainly on single-point measurements and local seepage behavior; the overall seepage behavior of a dam is not quantitatively detected from the monitoring data of multiple measuring points. Therefore, this study uses data mining techniques to analyze the monitoring data and overcome these shortcomings. Massive seepage monitoring data from multiple points are taken as the research object. The key information on seepage behavior is extracted using principal component analysis. The correlation between seepage behavior and upstream water level is described by mutual information. A detection model for overall seepage behavior is established. Results show that the model can fully exploit the seepage monitoring data from multiple points and quantitatively detect the overall seepage behavior of earth dams. The proposed method provides a new and reasonable means of quantitatively detecting the overall seepage behavior of earth dams.
Funding: National Natural Science Foundation of China (51379162); Water Conservancy Science and Technology Innovation Project of Guangdong Province (2016-06).
1. Introduction
Seepage is an important factor that affects the safety of earth dams. Based on the statistical data of the International Commission on Large Dams, approximately 52.2% of earth dam failures are caused by seepage damage [1].
The upstream water level under normal service conditions causes earth dams to form a stable seepage field in the dam body and foundation, thereby indicating a stable seepage behavior. However, excessive seepage gradient, excessive seepage pressure, and other abnormal seepage phenomena may occur in earth dams due to the construction defects and material aging. These phenomena can cause seepage damage, increase in instability of a dam’s slope, and lead to dam breakage.
However, dam safety can be controlled, and safety monitoring provides the basis for that control. Several osmometers are typically placed at key points in the dam to detect seepage behavior. The measured values of the osmometers fluctuate within a reasonable range when the seepage behavior of the earth dam is normal, and they exhibit sudden changes or trends when the seepage behavior is abnormal. Therefore, the data of these osmometers should be analyzed to detect the seepage behavior of earth dams. Mathematical and mechanical methods are used to analyze the data and detect seepage behavior. When an abnormal seepage phenomenon is detected, appropriate remedial measures are taken to reduce the risk of dam breakage and provide technical assurance on the service safety of earth dams.
Currently, the detection methods for the seepage behavior of earth dams are divided into three types. The first type uses a statistical regression method for analyzing the monitoring data. The factors that influence the seepage behavior of earth dams are summarized as water level, rainfall, temperature, and aging. A seepage monitoring model for a single point is established. Abnormal seepage behavior is detected by analyzing the trend of different factors [2]. Si et al. [3] used support vector machine to train original monitoring data. This method improved the precision and detection accuracy for seepage behavior and provided reasonable trends from different factors. Xiang et al. [4] introduced a particle swarm optimization algorithm during the modeling process, which analyzed the lag time of water level in osmometers and optimized the expressions of water and rainfall factors. Thus, this algorithm was optimized, and the accuracy of the model was improved. Gamse and Oberguggenberger [5] used a “coordinate time series” method to analyze the seepage monitoring data of earth dams, thereby avoiding the overfitting phenomenon in the model. The expressions for water level, temperature, and aging factor were reasonable, and the monitoring model detected abnormal seepage behavior.
The second type of detection method combines finite element calculation and monitoring data. The seepage parameters of dam body and foundation are calculated by back analysis, and the seepage field of the dam is simulated [6, 7]. Abnormal seepage behavior is detected by comparing the calculated and measured values of the finite element model. Zhang et al. [8] established a finite element model for earth dam. The seepage parameters for the different parts of the dam were calculated by using seepage monitoring data, and the seepage field of the dam was simulated under various operating conditions. A comparison of the calculated and measured values obtained from the osmometers revealed that core wall failure causes the abnormal seepage behavior of the dam. Ren et al. [9] conducted an inversion of dam seepage parameters based on the finite element model of an earth dam and the monitoring data from the osmometers. The seepage field of a dam was simulated under normal water levels. The result provided a basis for seepage behavior detection under normal operating conditions. Chi et al. [10] introduced a neural network algorithm for the inversion of seepage parameters of earth dam, thereby improving the accuracy of inversion results and providing reliable seepage behavior detection.
The third type of detection method uses knowledge of experts to evaluate seepage behavior, in which the seepage monitoring data at a single point are scored. A fusion algorithm is used to integrate the score of different points, and the overall seepage behavior of an earth dam is detected based on fusion results. Yang [11] evaluated the monitoring data of seepage pressure by using expert experience. The membership matrix of each monitoring point was established, and an analytic hierarchy process was used to integrate the monitoring projects. Then, the overall seepage of the dam was detected. Shao and Xin [12] introduced a projection pursuit method into the fusion process, in which the weights of different monitoring points were assigned during the fusion process, thereby improving the detection accuracy.
The existing methods provide the means for seepage behavior detection of earth dams from different perspectives. However, the safety monitoring of modern dams is mainly based on automatic monitoring, and the amount of monitoring data constantly increases. The information contained in the data has become increasingly abundant. Therefore, the existing methods exhibit the following shortcomings. (1) The methods mainly focus on data with only one monitoring point, which is called the single-point detection method. This method can only detect local seepage behavior. Data with different points should be fused if the overall seepage behavior must be detected; this technique is called the multiple-point detection method. (2) The overall seepage behavior is qualitatively detected based on experts’ experiences and detection results of local seepage behavior. The subjectivity of this method is strong, and the experts’ experiences affect the detection results. Therefore, the quantitative detection method for the overall seepage behavior of dams should be investigated.
In summary, new research methods should be developed to explore the potential information in massive monitoring data and establish an efficient and accurate model for seepage behavior detection given the increase in monitoring data. Therefore, this study utilizes principal component analysis (PCA) and mutual information (MI) in data mining technology to extract massive seepage monitoring data with multiple points. The PCA is used for information extraction, and MI is used to describe the correlation between the principal component (PC) and upstream water level. The detection model for seepage behavior is established based on MI distribution, thus providing a new means of accurately detecting the overall seepage behavior of earth dams.
2. Method for Establishing the Model
2.1. Modeling Process
Data mining [13, 14] refers to the process of discovering hidden information from massive amounts of data. Given the increasing volume of monitoring data, establishing a seepage detection model chiefly involves studying the correlations in the data and extracting the key information from them.
The PCA is an important data mining algorithm [15] that extracts one or a few PCs from a plurality of variables to replace the original variables through the correlation between the data based on the principle of minimizing data information loss. Currently, PCA is extensively applied in data analysis [16, 17]. Several seepage monitoring cross sections are arranged in the dam, and osmometers are arranged in the sections to monitor the seepage behavior of dams. The monitoring data are called the water level of osmometers. The fusion of the data from multiple osmometers should be conducted to detect the overall seepage behavior quantitatively. The locations of several sections and working conditions are similar. Therefore, the data from these osmometers are similar and are correlated. In this study, the PCA is used to reconstruct one or a few integrated variables (PC) that reflect the basic characteristics of primitive variables when each osmometer is considered a primitive variable. PCs contain key information from primitive variables and provide the basis for quantitative detection of the overall seepage behavior.
The upstream water level is an important factor that affects the seepage of earth dams [18]. The earth used to fill the dam does not prevent seepage, so the upstream water enters the fill. To prevent seepage damage, a core wall made from impervious materials is constructed in the dam to block the seepage and ensure safety. Because of the poor antipermeability of the earth, the water level of osmometers arranged in front of the core wall is close to the upstream water level. The PCs of these osmometers are also close to the upstream water level, and the correlation between the PC and upstream water level is strong. The water level of osmometers arranged behind the core wall is significantly reduced by the antipermeability of the core wall. The PCs of these osmometers are also significantly reduced, and the correlation between the PC and upstream water level is low. After the PCs of the osmometers are extracted, MI [19] is used to quantitatively describe the correlation between the upstream water level and each PC. A large MI value indicates a strong correlation between upstream water level and PC. Compared with the traditional correlation coefficient, MI describes both linear and nonlinear relationships between variables. In addition, MI is extensively used to describe the correlation of variables [20, 21].
The MI between the upstream water level and PC should fluctuate within a reasonable range when the core wall is intact, thereby indicating that the seepage behavior of the dam is normal. When the core wall is damaged, the seepage quantity in the dam increases under the effect of the upstream water level and the PC becomes abnormal, thus leading to an abnormal MI. This condition indicates that the seepage behavior of the dam is abnormal. The MI fluctuation range, that is, the detection model, can be obtained by analyzing the MI distribution of historical data. The seepage behavior of the dam is normal if the MI falls within this range and abnormal if the MI falls outside this range.
The advantage of MI in detecting seepage behavior is that MI can be used to eliminate the interference of osmometer failure. In general, several abnormal values are found in the data of osmometers. The PC fuses the data from different osmometers. Therefore, the PC also contains abnormal data. These abnormal values may reflect the abnormal seepage behavior. However, these abnormal values may be caused by osmometer failure. The abnormal values caused by osmometer failure may interfere in detecting seepage behavior and lead to misdiagnosis. The MI represents the correlation between PC and upstream water level. If the abnormal data are caused by osmometer failure, then the MI will not be abnormal because the abnormal data are not caused by the upstream water level. Therefore, MI eliminates the interference from osmometer failure and improves the detection accuracy.
In summary, this study uses PCA to extract PCs from massive seepage measurement and MI to describe the correlation between the PC and the upstream water level. The detection model for seepage behavior is constructed based on the MI distribution. A flowchart that illustrates the modeling process is depicted in Figure 1.
Figure 1: Modeling process for seepage detection of earth dams.
2.2. Extracting PCs of Effect Variables
Assume that there are $n$ seepage monitoring points, that is, $n$ primitive variables, and that each point has $m$ observed values. These observed values form the following $n \times m$ matrix:
$$X=\begin{bmatrix}X_1\\X_2\\\vdots\\X_n\end{bmatrix}=\begin{bmatrix}x_{11}&x_{12}&\cdots&x_{1m}\\x_{21}&x_{22}&\cdots&x_{2m}\\\vdots&\vdots&\ddots&\vdots\\x_{n1}&x_{n2}&\cdots&x_{nm}\end{bmatrix},\tag{1}$$
where $X_i$ ($i=1,2,\ldots,n$) is the row vector that denotes the monitoring data sequence of the $i$th monitoring point and $x_{ij}$ ($i=1,2,\ldots,n$; $j=1,2,\ldots,m$) denotes the $j$th monitoring datum of the $i$th monitoring point.
In matrix $X$, the working environments of the primitive variables (seepage monitoring points) are similar. Therefore, the measured data of these points $(X_1,X_2,\ldots,X_n)^{T}$ exhibit a strong correlation. When the number of primitive variables is $n$, the PCA is used to reconstruct $n$ mutually uncorrelated integrated variables (PCs). Score matrix $F$ can be expressed as
$$F=\begin{bmatrix}F_1\\F_2\\\vdots\\F_n\end{bmatrix}=\begin{bmatrix}l_{11}&l_{12}&\cdots&l_{1n}\\l_{21}&l_{22}&\cdots&l_{2n}\\\vdots&\vdots&\ddots&\vdots\\l_{n1}&l_{n2}&\cdots&l_{nn}\end{bmatrix}\begin{bmatrix}X_1\\X_2\\\vdots\\X_n\end{bmatrix}=LX,\tag{2}$$
where $F_i$ is the $i$th PC and $L$ is the score coefficient matrix. $l_{ij}$ is the coefficient of the $j$th primitive variable in the $i$th PC and reflects the relevance between the $j$th primitive variable $X_j$ and the $i$th PC $F_i$. A large absolute value of $l_{ij}$ indicates a high correlation between $F_i$ and $X_j$; hence, considerable information on $X_j$ can be explained by $F_i$. If $l_{ij}$ is positive, then the correlation between $F_i$ and $X_j$ is positive. If $l_{ij}$ is negative, then the correlation between $F_i$ and $X_j$ is negative. $F_i$ has the following properties:
① $F_i$ is uncorrelated with $F_j$ ($i\neq j$; $i,j=1,2,\ldots,n$).
② $F_1$ has the largest variance among all linear combinations of $X_1,X_2,\ldots,X_n$. $F_2$ has the largest variance among all linear combinations of $X_1,X_2,\ldots,X_n$ that are uncorrelated with $F_1$. Continuing in this way, $F_n$ has the largest variance among all linear combinations of $X_1,X_2,\ldots,X_n$ that are uncorrelated with $F_1,F_2,\ldots,F_{n-1}$.
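Equation (2) shows that calculating $L$ is the key step: the PCs are obtained once $L$ is known. Assume that the covariance matrix of the primitive variables is expressed as
$$C=\begin{bmatrix}\operatorname{cov}(X_1,X_1)&\operatorname{cov}(X_1,X_2)&\cdots&\operatorname{cov}(X_1,X_n)\\\operatorname{cov}(X_2,X_1)&\operatorname{cov}(X_2,X_2)&\cdots&\operatorname{cov}(X_2,X_n)\\\vdots&\vdots&\ddots&\vdots\\\operatorname{cov}(X_n,X_1)&\operatorname{cov}(X_n,X_2)&\cdots&\operatorname{cov}(X_n,X_n)\end{bmatrix}.\tag{3}$$
The eigenvalue decomposition of $C$ can be expressed as
$$C=U\Lambda U^{T},\tag{4}$$
where $\Lambda$ is the diagonal matrix $\Lambda=\operatorname{diag}[\lambda_1,\lambda_2,\ldots,\lambda_n]$. $\lambda_i$ is the $i$th eigenvalue of $C$, which is the variance of the $i$th PC. $U$ is the eigenvector matrix, which can be written as $U=(u_1,\ldots,u_i,\ldots,u_n)$ with $u_i=(u_{1i},u_{2i},\ldots,u_{ni})^{T}$ ($i=1,2,\ldots,n$). It can then be shown that $L=U^{T}$.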
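Assume that $a_i=(a_{1i},a_{2i},\ldots,a_{ni})^{T}$ is a unit vector ($a_i^{T}a_i=1$) that defines the $i$th PC as $F_i=a_{1i}X_1+a_{2i}X_2+\cdots+a_{ni}X_n=a_i^{T}X$, and that the eigenvalues are sorted in descending order. Because $F_i$ must be uncorrelated with $F_1,\ldots,F_{i-1}$, $a_i$ is orthogonal to $u_1,\ldots,u_{i-1}$, so only the terms with $k\ge i$ (for which $\lambda_k\le\lambda_i$) contribute to the sum below. The variance of $F_i$ is then bounded as follows:
$$\operatorname{var}(F_i)=a_i^{T}Ca_i=a_i^{T}U\Lambda U^{T}a_i=\sum_{k=1}^{n}\lambda_k\,a_i^{T}u_ku_k^{T}a_i\le\lambda_i\sum_{k=1}^{n}a_i^{T}u_ku_k^{T}a_i=\lambda_i\,a_i^{T}a_i=\lambda_i.\tag{5}$$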
The second property of $F_i$ indicates that $F_i$ has the largest variance among all linear combinations of $X_1,X_2,\ldots,X_n$ that are uncorrelated with $F_1,F_2,\ldots,F_{i-1}$. Therefore, $\operatorname{var}(F_i)$ reaches its largest value $\lambda_i$ when $a_i=u_i$; that is, $F_i=u_{1i}X_1+u_{2i}X_2+\cdots+u_{ni}X_n=u_i^{T}X$. Thus, $L=U^{T}$.
If the number of primitive variables is $n$, then $n$ PCs can be reconstructed. The ability of these $n$ PCs to explain the primitive variables differs. Therefore, the $z$ PCs (where $z<n$) that best describe the properties of the primitive variables should be extracted from the $n$ PCs. The values of $\lambda_i$ are sorted from large to small, and the value of $z$ is typically determined based on the cumulative variance contribution rate $\eta$, which is calculated as follows:
$$\eta=\frac{\sum_{i=1}^{z}\lambda_i}{\sum_{i=1}^{n}\lambda_i}\times 100\%.\tag{6}$$
In general, if $\eta$ is greater than 95%, then over 95% of the original information can be explained by the first $z$ PCs. Therefore, $\eta\ge 95\%$ is set as the discriminant index for extracting $z$ PCs from $n$ PCs. In engineering applications, the number of PCs can be adjusted according to the specific circumstances.
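The extraction procedure of (1)–(6) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function name `extract_pcs`, the argument `eta_min`, and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def extract_pcs(X, eta_min=95.0):
    """Sketch of Eqs. (2)-(6). X is an n x m array: n monitoring points, m observations.

    The name and signature are hypothetical; eta_min is the cumulative
    variance contribution threshold in percent.
    """
    X = np.asarray(X, dtype=float)
    C = np.cov(X)                            # n x n covariance matrix, Eq. (3)
    lam, U = np.linalg.eigh(C)               # C = U diag(lam) U^T, Eq. (4)
    order = np.argsort(lam)[::-1]            # sort eigenvalues from large to small
    lam, U = lam[order], U[:, order]
    eta = 100.0 * np.cumsum(lam) / lam.sum() # cumulative contribution rate, Eq. (6)
    # Smallest z whose cumulative rate reaches eta_min (capped at n).
    z = min(int(np.searchsorted(eta, eta_min)) + 1, lam.size)
    L = U.T                                  # score coefficient matrix, L = U^T
    F = L[:z] @ X                            # first z PCs, F = L X as in Eq. (2)
    return F, L[:z], lam, eta, z
```

Because each row of `F` is $u_i^{T}X$, its sample variance equals the corresponding eigenvalue, which is the property used in (5).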
2.3. MI Calculation
Assume that the $i$th PC of seepage is $F_i$ and the upstream water level is $Y$. Then, the MI $I_i$ between $F_i$ and $Y$ is calculated as
$$I_i=\sum_{f_i\in F_i}\sum_{y\in Y}g(f_i,y)\log\frac{g(f_i,y)}{g(f_i)\,g(y)},\tag{7}$$
where $g(f_i)$ and $g(y)$ are the probability density functions of $F_i$ and $Y$, respectively, and $g(f_i,y)$ is the joint probability density function of $F_i$ and $Y$. If the correlation between $F_i$ and $Y$ is high, then $I_i$ is large. Moreover, if $F_i$ and $Y$ are not related, then $I_i$ is zero. $F_i$ and $Y$ may not follow a fixed-form distribution type. Hence, the kernel density estimation (KDE) method [22] is used to estimate the probability density functions of $F_i$ and $Y$. In this method, $g(f_i)$ and $g(y)$ are expressed as
$$g(f_i)=\frac{1}{mh}\sum_{d=1}^{m}K\!\left(\frac{f_i-F_{id}}{h}\right);\qquad g(y)=\frac{1}{mh}\sum_{d=1}^{m}K\!\left(\frac{y-Y_d}{h}\right),\tag{8}$$
where $m$ is the number of measurements, $F_{id}$ is the $d$th measured value of the $i$th PC, $Y_d$ is the $d$th measured value of the upstream water level, and $K$ is the kernel function. The Gaussian kernel function [22] is generally used and expressed as
$$K\!\left(\frac{f_i-F_{id}}{h}\right)=\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(f_i-F_{id})^{2}}{2h^{2}}\right);\qquad K\!\left(\frac{y-Y_d}{h}\right)=\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(y-Y_d)^{2}}{2h^{2}}\right).\tag{9}$$
The joint probability density function $g(f_i,y)$ is expressed as
$$g(f_i,y)=\frac{1}{m}\sum_{d=1}^{m}\frac{1}{h^{2}}\,K\!\left(\frac{f_i-F_{id}}{h},\frac{y-Y_d}{h}\right).\tag{10}$$
In (8)–(10), $h$ is the bandwidth used to control the smoothness and fitting accuracy of the probability density curve. If the value of $h$ is high, then the probability density curve is smooth with a low fitting precision. If the value of $h$ is small, then the smoothness of the probability density curve decreases but the fitting precision increases. In general, the value of $h$ is determined through a comprehensive analysis of smoothness and fitting accuracy.
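A numerical reading of (7)–(10) is sketched below in Python. It is an illustration under stated assumptions, not the procedure used in this study: the joint kernel in (10) is taken to be the product of two one-dimensional Gaussian kernels with a shared bandwidth $h$, the double sum in (7) is approximated by a Riemann sum on a regular grid, and the function names `gauss` and `kde_mutual_information` and the `grid` parameter are hypothetical.

```python
import numpy as np

def gauss(u):
    """Standard Gaussian kernel, Eq. (9) written in terms of u = (x - x_d) / h."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde_mutual_information(f, y, h, grid=100):
    """Approximate the MI of Eq. (7) from KDE densities (natural logarithm).

    Assumption: the bivariate kernel of Eq. (10) is a product of two 1-D
    Gaussian kernels sharing the bandwidth h.
    """
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    m = f.size
    # Evaluation grids, padded by three bandwidths on each side.
    fg = np.linspace(f.min() - 3 * h, f.max() + 3 * h, grid)
    yg = np.linspace(y.min() - 3 * h, y.max() + 3 * h, grid)
    Kf = gauss((fg[:, None] - f[None, :]) / h)    # grid x m kernel values
    Ky = gauss((yg[:, None] - y[None, :]) / h)
    gf = Kf.sum(axis=1) / (m * h)                 # marginal density, Eq. (8)
    gy = Ky.sum(axis=1) / (m * h)
    gfy = Kf @ Ky.T / (m * h * h)                 # joint density, Eq. (10)
    prod = gf[:, None] * gy[None, :]
    mask = (gfy > 0) & (prod > 0)                 # skip cells where the log is undefined
    integrand = np.zeros_like(gfy)
    integrand[mask] = gfy[mask] * np.log(gfy[mask] / prod[mask])
    df, dy = fg[1] - fg[0], yg[1] - yg[0]
    return float(integrand.sum() * df * dy)       # Riemann sum for Eq. (7)
```

With this estimator, a series that closely tracks the upstream water level yields a clearly larger value than an unrelated series, which is the contrast the detection model relies on. The absolute values depend on $h$ and on the grid resolution.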
2.4. Detection Model of Seepage
Assume that the dam safety monitoring data span $r$ years and that the detection model is built on the annual variation of MI. Using (7), the MI between each PC and the upstream water level is obtained for each year. These MI values form the following MI matrix:
$$I=\begin{bmatrix}I_1\\I_2\\\vdots\\I_z\end{bmatrix}=\begin{bmatrix}I_{11}&I_{12}&\cdots&I_{1r}\\I_{21}&I_{22}&\cdots&I_{2r}\\\vdots&\vdots&\ddots&\vdots\\I_{z1}&I_{z2}&\cdots&I_{zr}\end{bmatrix},\tag{11}$$
where $I_i$ ($i=1,2,\ldots,z$) is the row vector of the MI between the $i$th seepage PC and the upstream water level in different years, and $I_{ij}$ ($i=1,2,\ldots,z$; $j=1,2,\ldots,r$) denotes the MI between the $i$th seepage PC and the upstream water level in the $j$th year. Let $\bar{I}_i$ ($i=1,2,\ldots,z$) denote the mean of $I_i$. Then, the mean vector $\bar{I}=[\bar{I}_1,\bar{I}_2,\ldots,\bar{I}_i,\ldots,\bar{I}_z]^{T}$ can be obtained.
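Assume that the annual MI vectors are independent and normally distributed. The seepage detection model can be established by analyzing the distribution of $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ ($j=1,2,\ldots,r$). The statistic $T^{2}$ can be constructed as follows:
$$T^{2}=\frac{r}{r+1}\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right)^{T}S^{-1}\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right),\tag{12}$$
where $S$ is the following covariance matrix:
$$S=\begin{bmatrix}\operatorname{cov}(I_1,I_1)&\operatorname{cov}(I_1,I_2)&\cdots&\operatorname{cov}(I_1,I_z)\\\operatorname{cov}(I_2,I_1)&\operatorname{cov}(I_2,I_2)&\cdots&\operatorname{cov}(I_2,I_z)\\\vdots&\vdots&\ddots&\vdots\\\operatorname{cov}(I_z,I_1)&\operatorname{cov}(I_z,I_2)&\cdots&\operatorname{cov}(I_z,I_z)\end{bmatrix}.\tag{13}$$
Based on statistical theory [15], $T^{2}$ follows a scaled $F$ distribution:
$$T^{2}\sim\frac{(r-1)z}{r-z}F(z,r-z).\tag{14}$$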
Therefore, the probability $P$ that the MI value $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ in the $j$th year falls into the $100(1-\alpha)\%$ confidence region satisfies the following equation:
$$P\left\{T^{2}\le\frac{(r-1)z}{r-z}F_{\alpha}(z,r-z)\right\}=1-\alpha,\tag{15}$$
and the confidence region satisfies the following inequality:
$$T^{2}\le\frac{(r-1)z}{r-z}F_{\alpha}(z,r-z).\tag{16}$$
The region is a confidence interval when $z=1$, a confidence ellipse when $z=2$, a confidence ellipsoid when $z=3$, and a hyperellipsoid when $z>3$.
The range of the confidence region is determined by the eigenvalues of covariance matrix $S$ and significance level $\alpha$. $S$ is symmetric and positive definite and therefore has $z$ real eigenvalues that are greater than zero. The eigenvalues of $S$ are expressed as
$$\delta_1\ge\delta_2\ge\cdots\ge\delta_z>0.\tag{17}$$
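The $100(1-\alpha)\%$ confidence region of the MI value $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ in the $j$th year is centered on mean vector $\bar{I}$. The lengths of its half axes are expressed as
$$\sqrt{\delta_1}\sqrt{\frac{z(r^{2}-1)}{r(r-z)}F_{\alpha}(z,r-z)},\ \sqrt{\delta_2}\sqrt{\frac{z(r^{2}-1)}{r(r-z)}F_{\alpha}(z,r-z)},\ \ldots,\ \sqrt{\delta_z}\sqrt{\frac{z(r^{2}-1)}{r(r-z)}F_{\alpha}(z,r-z)}.\tag{18}$$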
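From the statistical theory [15], significance level $\alpha$ is typically set to 0.05 or 0.01. Therefore, the distribution of $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ satisfies the following inequalities:
$$\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right)^{T}S^{-1}\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right)\le\frac{z(r^{2}-1)}{r(r-z)}F_{0.05}(z,r-z),\tag{19}$$
$$\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right)^{T}S^{-1}\left([I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}-\bar{I}\right)\le\frac{z(r^{2}-1)}{r(r-z)}F_{0.01}(z,r-z).\tag{20}$$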
Equations (19) and (20) are considered the detection models for seepage behavior of earth dams. For the MI value $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ in the $j$th year, the probability of falling in the range of (19) is 0.95, and the probability of falling outside the range of (20) is 0.01. Based on the small probability principle, an event is considered a small probability event when its probability is less than 0.01. If a small probability event occurs, then appropriate attention must be paid to it. Combining the preceding theory with engineering experience in seepage monitoring, the seepage behavior of earth dams is divided into three states, namely, normal, early warning, and abnormal, which are described as follows:
(1) $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ falls within the range of (19) ($P=0.95$); that is, the seepage behavior of the earth dam is normal.
(2) $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ falls outside the range of (20) ($P=0.01$), thereby indicating a small probability event; that is, the seepage behavior is abnormal, and corresponding engineering measures should be immediately taken.
(3) The region between (19) and (20) is a transition region between the normal and abnormal states. If $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ falls within this region ($P=0.04$), then it warrants an early warning status. The trend of $[I_{1j},I_{2j},\ldots,I_{ij},\ldots,I_{zj}]^{T}$ should be observed in the future. If it tends toward the abnormal range, then appropriate engineering measures should be taken.
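The three-state rule can be sketched in Python as follows. The sketch is illustrative only: the function names `t2_statistic` and `classify` are hypothetical, and the critical values $F_{0.05}(z,r-z)$ and $F_{0.01}(z,r-z)$ are assumed to be supplied by the caller, for example from a standard $F$ table.

```python
import numpy as np

def t2_statistic(I_hist, I_new):
    """T^2 of Eq. (12) for one MI vector.

    I_hist: z x r matrix of historical annual MI values, Eq. (11).
    I_new:  length-z MI vector to be checked.
    """
    I_hist = np.atleast_2d(np.asarray(I_hist, dtype=float))
    z, r = I_hist.shape
    mean = I_hist.mean(axis=1)                   # mean vector of the MI rows
    S = np.atleast_2d(np.cov(I_hist))            # covariance matrix, Eq. (13)
    d = np.asarray(I_new, dtype=float) - mean
    return float(r / (r + 1.0) * d @ np.linalg.solve(S, d))

def classify(t2, z, r, f_crit_05, f_crit_01):
    """Map T^2 to one of the three states using Eq. (16) at alpha = 0.05 and 0.01."""
    scale = (r - 1.0) * z / (r - z)              # factor in Eq. (14)
    if t2 <= scale * f_crit_05:
        return "normal"
    if t2 <= scale * f_crit_01:
        return "early warning"
    return "abnormal"
```

Testing $T^{2}$ against $\frac{(r-1)z}{r-z}F_{\alpha}$ is equivalent to testing the quadratic form in (19) and (20) against $\frac{z(r^{2}-1)}{r(r-z)}F_{\alpha}$, because the two differ only by the factor $r/(r+1)$ in (12).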
3. Case Study
3.1. Description of the Project
The Shenzhen Reservoir (Figure 2) is located downstream of Shawan River in Shenzhen City, Guangdong Province, China. This reservoir is a water conservation project with functions of flood control, water supply, and power generation. The main building includes the main dam, the left auxiliary dam, and the right auxiliary dam. This main dam is an earth dam that has a core wall with a shell material that is gravelly, silty, and clayey, and the core wall for antiseepage is made of concrete. Four seepage monitoring cross sections (MXF, MXG, MXS, and MXL) are arranged to monitor the dam seepage behavior and antiseepage effect of the core wall. Then, 20 osmometers are placed on the cross sections, where five osmometers are placed in each cross section. The osmometers in front of the core wall are called prewall osmometers, which are numbered as MXF1, MXG1, MXS1, and MXL1 to facilitate early recognition. The osmometers behind the core wall are called back-wall osmometers and are numbered as MXF2–MXF5, MXG2–MXG5, MXS2–MXS5, and MXL2–MXL5. The locations of the osmometers are exhibited in Figure 3.
Figure 2: Shenzhen Reservoir Project.
Figure 3: Layout of the osmometers in the Shenzhen Reservoir Project.
In general, the current study uses prewall osmometers (MXF1, MXG1, MXS1, and MXL1) and the first osmometers of the back-wall (MXF2, MXG2, MXS2, and MXL2) as representative monitoring points. Seepage behavior is detected through a data mining method that uses the monitoring data, which are obtained from the osmometers from January 1, 1995, to December 31, 2014. The process lines for prewall osmometers MXF1, MXG1, MXS1, and MXL1 are demonstrated in Figure 4. Meanwhile, the process lines for back-wall osmometers MXF2, MXG2, MXS2, and MXL2 are displayed in Figure 5.
Figure 4: Process lines of the prewall osmometers.
Figure 5: Process lines of the back-wall osmometers.
A qualitative analysis of Figure 4 shows that the measured values of the prewall osmometers and their variations are similar. A qualitative analysis of Figure 5 indicates that the measured values of MXF2, MXG2, and MXS2 among the first back-wall osmometers are also similar. However, the fluctuation of the MXL2 value increases significantly from 2005 to 2010, an abnormal phenomenon in which the measured values of MXL2 are inconsistent with those of MXF2, MXG2, and MXS2.
The possible causes of the abnormal measured values of MXL2 include the following: (1) damage to the core wall in the MXL monitoring section, which would result in an abnormal seepage behavior; (2) osmometer failure, such as external water infiltration into the MXL2 osmometer or abnormal operation of the MXL2 osmometer. This study uses the data mining method to establish the detection model for seepage behavior. The seepage behavior is detected, and the cause of the abnormal MXL2 data is then inferred.
3.2. Mining Seepage Monitoring Data
3.2.1. Extraction of PCs in the Prewall Osmometers
The covariance matrix Cp for prewall osmometers MXF1, MXG1, MXS1, and MXL1 is calculated by using (3), as displayed in Table 1. The covariance matrix Cb for back-wall osmometers MXF2, MXG2, MXS2, and MXL2 is also calculated, as presented in Table 2.
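Table 1: Covariance matrix of the prewall osmometers.

|      | MXF1 | MXG1 | MXS1 | MXL1 |
|------|------|------|------|------|
| MXF1 | 1.00 | 0.94 | 0.91 | 0.92 |
| MXG1 | 0.94 | 1.00 | 0.98 | 0.97 |
| MXS1 | 0.91 | 0.98 | 1.00 | 0.98 |
| MXL1 | 0.92 | 0.97 | 0.98 | 1.00 |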
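Table 2: Covariance matrix of the first back-wall osmometers.

|      | MXF2 | MXG2  | MXS2 | MXL2  |
|------|------|-------|------|-------|
| MXF2 | 1.00 | 0.76  | 0.70 | 0.21  |
| MXG2 | 0.76 | 1.00  | 0.85 | −0.01 |
| MXS2 | 0.70 | 0.85  | 1.00 | 0.05  |
| MXL2 | 0.21 | −0.01 | 0.05 | 1.00  |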
From Tables 1 and 2, the covariances among the prewall osmometers lie between 0.91 and 0.98, which indicates a high correlation among their values. The covariances among MXF2, MXG2, and MXS2 lie between 0.70 and 0.85, so their correlation is also high. However, the covariances between MXL2 and each of MXF2, MXG2, and MXS2 lie between −0.01 and 0.21, thereby indicating that MXL2 is weakly correlated with the first back-wall osmometers on the other monitored sections.
The eigenvalues and their variance contribution rate and the cumulative variance contribution rates of Cp and Cb can be calculated by using (4) and (6) as summarized in Table 3.
Table 3: Eigenvalues and the variance contribution rates of Cp and Cb.

| Osmometers | PC  | Eigenvalue | Variance contribution rate (%) | Cumulative variance contribution rate (%) |
|------------|-----|------------|--------------------------------|-------------------------------------------|
| Prewall    | Fp1 | 3.86       | 96.59                          | 96.59                                     |
| Prewall    | Fp2 | 0.10       | 2.58                           | 99.17                                     |
| Prewall    | Fp3 | 0.02       | 0.51                           | 99.68                                     |
| Prewall    | Fp4 | 0.01       | 0.32                           | 100                                       |
| Back-wall  | Fb1 | 2.56       | 63.85                          | 63.85                                     |
| Back-wall  | Fb2 | 1.02       | 25.55                          | 89.40                                     |
| Back-wall  | Fb3 | 0.29       | 7.35                           | 96.75                                     |
| Back-wall  | Fb4 | 0.13       | 3.26                           | 100                                       |
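The prewall rows of Table 3 can be cross-checked from the entries of Table 1 with a short Python sketch. This is an illustration rather than the computation performed in the study; because the entries of Table 1 are rounded to two decimals, the result is expected to agree with Table 3 only approximately. The variable names are chosen for the example.

```python
import numpy as np

# Matrix C_p copied from Table 1 (two-decimal rounding).
C_p = np.array([
    [1.00, 0.94, 0.91, 0.92],
    [0.94, 1.00, 0.98, 0.97],
    [0.91, 0.98, 1.00, 0.98],
    [0.92, 0.97, 0.98, 1.00],
])

lam = np.linalg.eigvalsh(C_p)[::-1]      # eigenvalues sorted from large to small, Eq. (4)
rate = 100.0 * lam / lam.sum()           # variance contribution rate of each PC
cumulative = np.cumsum(rate)             # cumulative contribution rate, Eq. (6)

for k in range(lam.size):
    print(f"Fp{k + 1}: eigenvalue={lam[k]:.2f}, "
          f"rate={rate[k]:.2f}%, cumulative={cumulative[k]:.2f}%")
```

The eigenvalues sum to the trace of the matrix, which is 4 here, so the contribution rate of the first PC is simply its eigenvalue divided by 4.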
Table 3 indicates that the eigenvalue of Fp1 in the prewall osmometers is considerably larger than the eigenvalues of the other PCs. In addition, the variance contribution rate of Fp1 reaches 96.59%, which exceeds the 95% threshold set in Section 2.2, thereby denoting that the main information in the original data of MXF1, MXG1, MXS1, and MXL1 can be explained using Fp1. Hence, Fp1 can be used to describe the seepage characteristics of MXF1, MXG1, MXS1, and MXL1. Let $X_{p1}$, $X_{p2}$, $X_{p3}$, and $X_{p4}$ represent MXF1, MXG1, MXS1, and MXL1, respectively. Using (2), Fp1 can be expressed as
$$F_{p1}=0.253X_{p1}+0.251X_{p2}+0.252X_{p3}+0.244X_{p4}.\tag{21}$$
In (21), the coefficients of $X_{p1}$, $X_{p2}$, $X_{p3}$, and $X_{p4}$ are extremely close; these coefficients indicate that the measured data of MXF1, MXG1, MXS1, and MXL1 are similar. Therefore, Fp1 can represent MXF1, MXG1, MXS1, and MXL1.
Table 3 also indicates that 63.85% of the original measured information can be explained by using the first PC Fb1 in the first back-wall osmometers, whereas 25.55% of the original measured information can be explained by using the second PC Fb2. The cumulative variance contribution rate of Fb1 and Fb2 reaches 89.40%. Although this cumulative rate is below the threshold, the information carried by Fb3 and Fb4 is significantly smaller. Therefore, the original measured information of MXF2, MXG2, MXS2, and MXL2 can be represented by Fb1 and Fb2. Let $X_{b1}$, $X_{b2}$, $X_{b3}$, and $X_{b4}$ be the measured data of MXF2, MXG2, MXS2, and MXL2, respectively. Using (2), Fb1 and Fb2 are expressed as
$$F_{b1}=0.311X_{b1}+0.326X_{b2}+0.312X_{b3}+0.051X_{b4},\tag{22}$$
$$F_{b2}=0.152X_{b1}-0.201X_{b2}-0.132X_{b3}+1.181X_{b4}.\tag{23}$$
In (22), the coefficients of MXF2, MXG2, and MXS2 are close and significantly higher than the coefficient of MXL2, thereby indicating that Fb1 mainly explains the original measured information of MXF2, MXG2, and MXS2. In (23), the coefficient of MXL2 is higher than the absolute values of the coefficients of MXF2, MXG2, and MXS2, thus denoting that Fb2 mainly explains the original measured information of MXL2.
The values of Fp1 during the monitoring period from January 1, 1995, to December 31, 2014, are calculated using (21), whereas those of Fb1 and Fb2 are calculated using (22) and (23), respectively. The process lines for Fp1, Fb1, and Fb2 and upstream water level are illustrated in Figure 6.
Figure 6: Process lines for Fp1, Fb1, and Fb2 and upstream water level.
In this figure, the process lines for Fp1 and upstream water level coincide, thereby showing a strong correlation between Fp1 and the upstream water level. The correlations between Fb1 and the upstream water level and between Fb2 and the upstream water level are not evident. In addition, Fb1 mainly describes the seepage characteristics of MXF2, MXG2, and MXS2, thus exhibiting the strong regularity of process line. By contrast, Fb2 mainly describes the seepage characteristics of MXL2. Therefore, a significant fluctuation in its process line is observed during the period of 2005–2010.
The PCA in data mining combines information that highly correlates and separates anomalous data. The key information for the original osmometers is expressed by Fp1, Fb1, and Fb2, thereby reducing the number of original variables and providing the basis for quantitative detection.
3.2.2. MI between PCs and Upstream Water Level
Additional analysis is conducted by calculating the MI between PC and upstream water level to establish the detection model for seepage behavior and determine the cause of abnormal measurements of MXL2. Let Ip1, Ib1, and Ib2 denote the MI values between Fp1 and upstream water level, between Fb1 and upstream water level, and between Fb2 and upstream water level, correspondingly, during the period of 1995–2014. In (8), the probability density function of each PC and upstream water level can be obtained from the KDE when the bandwidth is set to 1.0, 0.5, and 0.1. The image is displayed in Figure 7.
Figure 7: Probability density functions of the PC and upstream water level by KDE. Panels: Fp1; Fb1; Fb2; upstream water level.
In this figure, the probability density function can accurately simulate the distribution of the PC and the upstream water level when the bandwidth is set to 0.1. The MI values $I_{p1}$, $I_{b1}$, and $I_{b2}$ under this bandwidth during the period of 1995–2014 (i.e., 20 years) are calculated using (7). The matrix $I$ is expressed as
$$I=\begin{bmatrix}I_{p1}\\I_{b1}\\I_{b2}\end{bmatrix}=\begin{bmatrix}I_{p1,1}&I_{p1,2}&\cdots&I_{p1,j}&\cdots&I_{p1,20}\\I_{b1,1}&I_{b1,2}&\cdots&I_{b1,j}&\cdots&I_{b1,20}\\I_{b2,1}&I_{b2,2}&\cdots&I_{b2,j}&\cdots&I_{b2,20}\end{bmatrix};\quad j=1,2,\ldots,20.\tag{24}$$
The process lines for the MI values are depicted in Figure 8.
Figure 8: Process lines of MI.
MI reflects the correlation among variables, and MXF1, MXG1, MXS1, and MXL1 are placed in front of the core wall. Therefore, a high correlation theoretically exists between Fp1 and the upstream water level, thus indicating that Ip1 is large. However, the core wall plays the main role for antiseepage. If the seepage behavior of the earth dam is normal, then the correlations between Fb1 and the upstream water level and between Fb2 and the upstream water level should be significantly reduced; these conditions indicate that Ib1 and Ib2 are small. If the seepage behavior is abnormal, then the correlations between Fb1 and the upstream water level and between Fb2 and the upstream water level will increase, thereby indicating that Ib1 and Ib2 will exhibit a significant increase.
In Figure 8, Ip1 varies within the range of [1.17, 2.44], and Ib1 and Ib2 vary within the range of [0.130, 0.632] during the period of 1995–2014. Ib1 and Ib2 are significantly lower than Ip1. Therefore, the seepage behavior can be qualitatively considered normal.
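The contrast between a large Ip1 and small Ib1 and Ib2 can be reproduced on synthetic data. The sketch below assumes that MI is the usual integral of p(x, y) log[p(x, y) / (p(x) p(y))] in nats, estimates the densities with a Gaussian product kernel, and integrates on a grid. The series `level`, `front`, and `behind`, the bandwidth 0.3, and the grid size are hypothetical choices, not the study's settings.

```python
import math
import random

def kernel(u, h):
    """Gaussian kernel with bandwidth h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def axis(values, h, size):
    """Evenly spaced grid covering the data plus three bandwidths on each side."""
    lo, hi = min(values) - 3.0 * h, max(values) + 3.0 * h
    step = (hi - lo) / (size - 1)
    return [lo + i * step for i in range(size)], step

def mutual_information(xs, ys, h, size=60):
    """Estimate I(X;Y) in nats from product-kernel KDEs evaluated on a grid."""
    n = len(xs)
    gx, dx = axis(xs, h, size)
    gy, dy = axis(ys, h, size)
    kx = [[kernel(g - x, h) for x in xs] for g in gx]
    ky = [[kernel(g - y, h) for y in ys] for g in gy]
    px = [sum(row) / n for row in kx]  # marginal density of X on its grid
    py = [sum(row) / n for row in ky]  # marginal density of Y on its grid
    mi = 0.0
    for a in range(size):
        for b in range(size):
            pxy = sum(p * q for p, q in zip(kx[a], ky[b])) / n  # joint density
            if pxy > 0.0:
                mi += pxy * math.log(pxy / (px[a] * py[b])) * dx * dy
    return mi

# Synthetic series (illustration only): a periodic "water level", one series that
# tracks it closely, and one that is only weakly coupled to it.
rng = random.Random(0)
n = 120
level = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
front = [w + rng.gauss(0.0, 0.1) for w in level]
behind = [0.1 * w + rng.gauss(0.0, 0.7) for w in level]

mi_front = mutual_information(level, front, h=0.3)
mi_behind = mutual_information(level, behind, h=0.3)
print(f"MI(level, front)  = {mi_front:.3f} nats")
print(f"MI(level, behind) = {mi_behind:.3f} nats")
```

The series that tracks the water level yields the larger MI, which mirrors the expected ordering between the PC in front of the core wall and the PCs behind it.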
3.3. Detection Model of Seepage Behavior
The Kolmogorov-Smirnov test [23] shows that Ip1 follows a normal distribution N(1.86, 0.31^2), Ib1 follows a normal distribution N(0.31, 0.11^2), and Ib2 follows a normal distribution N(0.27, 0.10^2). The detection model is established using the distribution of the MI values to quantitatively analyze the seepage behavior. The measured value of the MXL2 osmometer in 2005–2010 is evidently abnormal, so Ib2 may not reflect the real MI between MXL2 (Fb2) and the upstream water level. Therefore, the detection model is established based on the distributions of Ip1 and Ib1, which reflect the real MI between the measured values and the upstream water level.
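The mechanics of a one-sample Kolmogorov-Smirnov check can be sketched as follows. The sample is synthetic (evenly spaced quantiles of a normal distribution with mean 1.86 and standard deviation 0.31) and does not represent the measured MI values. The critical value 0.294 is the commonly tabulated one for n = 20 at the 0.05 level, and the names `ks_statistic`, `D_CRIT`, `d_same`, and `d_other` are illustrative.

```python
from statistics import NormalDist

def ks_statistic(sample, dist):
    """One-sample Kolmogorov-Smirnov statistic D against a fully specified distribution."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # Largest gap between the empirical step function and the model CDF,
        # checked just after and just before each jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

n = 20
D_CRIT = 0.294  # tabulated critical value for n = 20, alpha = 0.05

# Synthetic sample: evenly spaced quantiles of N(1.86, 0.31^2).
# These are NOT the measured MI values; they only exercise the test.
reference = NormalDist(mu=1.86, sigma=0.31)
sample = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

d_same = ks_statistic(sample, reference)
d_other = ks_statistic(sample, NormalDist(mu=0.31, sigma=0.11))

print(f"D against N(1.86, 0.31^2) = {d_same:.3f}, rejected: {d_same > D_CRIT}")
print(f"D against N(0.31, 0.11^2) = {d_other:.3f}, rejected: {d_other > D_CRIT}")
```

When the mean and standard deviation are estimated from the same sample that is being tested, the tabulated critical values are conservative, so a corrected table may be preferable in practice.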
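According to (15) and (16), the confidence region is an ellipse when the number of PCs is 2. The means of Ip1 and Ib1 are 1.86 and 0.31, respectively. The significance level α is set to 0.05 and 0.01. Then, the two confidence ellipses can be obtained using (19) and (20). The equations are expressed as follows:
\[
\left( \begin{bmatrix} I_{p1,j} \\ I_{b1,j} \end{bmatrix} - \begin{bmatrix} 1.86 \\ 0.31 \end{bmatrix} \right)^{T}
\begin{bmatrix} 0.0911 & 0.0041 \\ 0.0041 & 0.0116 \end{bmatrix}^{-1}
\left( \begin{bmatrix} I_{p1,j} \\ I_{b1,j} \end{bmatrix} - \begin{bmatrix} 1.86 \\ 0.31 \end{bmatrix} \right) = 7.87, \tag{25}
\]
\[
\left( \begin{bmatrix} I_{p1,j} \\ I_{b1,j} \end{bmatrix} - \begin{bmatrix} 1.86 \\ 0.31 \end{bmatrix} \right)^{T}
\begin{bmatrix} 0.0911 & 0.0041 \\ 0.0041 & 0.0116 \end{bmatrix}^{-1}
\left( \begin{bmatrix} I_{p1,j} \\ I_{b1,j} \end{bmatrix} - \begin{bmatrix} 1.86 \\ 0.31 \end{bmatrix} \right) = 13.32. \tag{26}
\]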
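The right-hand constants can be checked by hand. The following is a sketch under an assumption: (19) and (20) are not restated in this section, so the standard prediction-ellipse form for a new bivariate normal observation is used here, with p = 2 PCs, n = 20 yearly values, and the tabulated quantiles F_{2,18}(0.05) ≈ 3.55 and F_{2,18}(0.01) ≈ 6.01. This form reproduces both constants.
\[
c_{\alpha} = \frac{p\,(n-1)(n+1)}{n\,(n-p)}\, F_{p,\,n-p}(\alpha),
\qquad
\frac{2 \cdot 19 \cdot 21}{20 \cdot 18} = \frac{798}{360} \approx 2.217,
\]
\[
c_{0.05} \approx 2.217 \times 3.55 \approx 7.87,
\qquad
c_{0.01} \approx 2.217 \times 6.01 \approx 13.32.
\]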
Equations (25) and (26) constitute the detection model for seepage behavior and are plotted in Figure 9. In this model, the seepage behavior is determined from the positions of the points (Ip1,j, Ib1,j) and (Ip1,j, Ib2,j) (j = 1, 2, …, 20) relative to the two ellipses. (1) If the point falls inside ellipse (25), then the seepage behavior is normal. (2) If the point falls between ellipses (25) and (26), then the seepage behavior signals an early warning. (3) If the point falls outside ellipse (26), then the seepage behavior is abnormal.
Figure 9: Detection model of seepage behavior.
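The three-state rule can be written as a short classifier. The sketch below uses the mean vector, covariance matrix, and right-hand constants of (25) and (26). The function names, the test points, and the choice to count points lying exactly on an ellipse as inside it are assumptions made for illustration.

```python
MEAN = (1.86, 0.31)                          # means of Ip1 and Ib1
COV = ((0.0911, 0.0041), (0.0041, 0.0116))   # covariance matrix in (25) and (26)
T2_WARNING = 7.87                            # right-hand side of (25), alpha = 0.05
T2_ABNORMAL = 13.32                          # right-hand side of (26), alpha = 0.01

def t2_distance(ip, ib):
    """Quadratic form (x - mean)^T COV^{-1} (x - mean) for a 2 x 2 covariance."""
    dx, dy = ip - MEAN[0], ib - MEAN[1]
    (a, b), (c, d) = COV
    det = a * d - b * c
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def classify(ip, ib):
    """Map an MI pair to one of the three states of the detection model."""
    t2 = t2_distance(ip, ib)
    if t2 <= T2_WARNING:
        return "normal"
    if t2 <= T2_ABNORMAL:
        return "early warning"
    return "abnormal"

# Hypothetical MI pairs (illustration only, not values read from Figure 9).
for point in [(1.86, 0.31), (2.20, 0.35), (1.86, 0.63), (1.86, 0.80)]:
    print(point, round(t2_distance(*point), 2), classify(*point))
```

The same function can be applied to the pairs (Ip1,j, Ib2,j), because the model evaluates them against the same two ellipses.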
The MI pairs (Ip1,j, Ib1,j) and (Ip1,j, Ib2,j) from 1995 to 2014 are plotted in Figure 9. All pairs fall in the normal region except (Ip1,j, Ib1,j) in 2004, which falls in the early warning region. This result indicates that the overall seepage behavior is normal. Therefore, the significant fluctuation of MXL2 in 2005–2010 may be caused by equipment failure.
3.4. Verifying the Speculation
The MXL2 osmometer was tested and analyzed through an engineering method to verify the speculation.
(1) The technical performance of the MXL2 osmometer was tested, and the results showed that the osmometer is currently in serviceable condition.
(2) The sensitivity of the piezometer in the MXL2 osmometer was also tested, and the results showed that it does not meet the requirements. The piezometer was partially clogged.
(3) The working records of the MXL2 piezometer were investigated and analyzed. The records showed that the dam surface was renovated in 2004. However, the piezometer in the MXL2 osmometer was poorly maintained, which allowed rainfall to infiltrate. The piezometer was flushed and cleaned at the beginning of 2011, and piezometer maintenance was conducted. Thus, the MXL2 measurements returned to normal after 2011.
In summary, the evident fluctuation of MXL2 in 2005–2010 was mainly attributed to the inadequate sensitivity of the piezometer in the MXL2 osmometer and to rainfall infiltration into it. The results of the engineering test and of the detection model are consistent with each other, thereby confirming the speculation.
4. Conclusion
Seepage behavior is an important factor that affects the safety of earth dams. In this study, the PCA and MI methods are combined to detect the overall seepage behavior of earth dams. The monitoring data from different monitoring sections are synthesized and mined. The detection model can eliminate the interference of osmometer failure and improve the accuracy of the detection, thereby providing a new method for detecting the overall seepage behavior of earth dams.
The main contributions of this paper are as follows: (1) The PCA method is applied to fuse the data of correlated osmometers, thus extending seepage detection from a single point to multiple points. (2) The detection model is established from the MI distribution, which moves seepage detection from a qualitative method to a quantitative one. In particular, the method can be extended to detect the behavior of concealed works such as core walls, foundations, and steel structures.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (no. 51379162) and Water Conservancy Science and Technology Innovation Project of Guangdong Province (2016-06).
References
[1] ICOLD. Paris, France: International Commission on Large Dams; 1995.
[2] Liang G. Q., Zheng M. S., Sun B. Y. Analysis model and method of seepage observation data for earth rock-fill dams. 2003;28(3):87.
[3] Si C. D., Lian J. J., Zheng Y. Genetic support vector machine model for seepage safety monitoring of earth-rock dams. 2007;38(11):1340–1346.
[4] Xiang Y., Fu S.-Y., Zhu K., Yuan H., Fang Z.-Y. Seepage safety monitoring model for an earth rock dam under influence of high-impact typhoons based on particle swarm optimization algorithm. 2017;10(1):70–77. doi:10.1016/j.wse.2017.03.005. Scopus 2-s2.0-85017479835.
[5] Gamse S., Oberguggenberger M. Assessment of long-term coordinate time series using hydrostatic-season-time model for rock-fill embankment dam. 2017;24(1):e1859. doi:10.1002/stc.1859. Scopus 2-s2.0-84964389586.
[6] Wang L., Xu Q. Analysis of three dimensional random seepage field based on Monte Carlo stochastic finite element method. 2014;35(1):287–292. Scopus 2-s2.0-84893709205.
[7] Zhou C.-B., Liu W., Chen Y.-F., Hu R., Wei K. Inverse modeling of leakage through a rockfill dam foundation during its construction stage using transient flow model, neural network and genetic algorithm. 2015;187:183–195. doi:10.1016/j.enggeo.2015.01.008. Scopus 2-s2.0-85027941533.
[8] Zhang J., Wang J., Cui H. Causes of the abnormal seepage field in a dam with asphaltic concrete core. 2016;27(1):74–82. doi:10.1007/s12583-016-0623-6. Scopus 2-s2.0-84957077150.
[9] Ren J., Shen Z.-Z., Yang J., Yu C.-Z. Back analysis of the 3D seepage problem and its engineering applications. 2016;75(2), article no. 113:1–8. doi:10.1007/s12665-015-4837-1. Scopus 2-s2.0-84953281030.
[10] Chi S., Ni S., Liu Z. Back Analysis of the Permeability Coefficient of a High Core Rockfill Dam Based on a RBF Neural Network Optimized Using the PSO Algorithm. 2015;2015:124042. doi:10.1155/2015/124042. Scopus 2-s2.0-84948798464.
[11] Yang H. P. Fuzzy Comprehensive Evaluation of Dam Safety Based on AHP-Entropy Combination Weight Method. 2013;35(6):116–118.
[12] Shao L. F., Xin Y. Y. Safety Evaluation of Earth-Rock Dam Based on Projection Pursuit Analysis and Normal Cloud Model. 2015;33(12):81–84.
[13] Peral J., Mate A., Marco M. Application of Data Mining techniques to identify relevant Key Performance Indicators. 2017;54(SI):76–85. doi:10.1016/j.csi.2016.11.006.
[14] Choi J., Kim B., Hahn H., Park H., Jeong Y., Yoo J., Jeong M. K. Data mining-based variable assessment methodology for evaluating the contribution of knowledge services of a public research institute to business performance of firms. 2017;84:37–48. doi:10.1016/j.eswa.2017.04.057. Scopus 2-s2.0-85018422332.
[15] Johnson R. A., Wichern D. W. 6th ed. Upper Saddle River, NJ, USA: Prentice Hall; 2007. MR2372475. Zbl 1269.62044.
[16] Chen H., Jiang B., Lu N., Mao Z. Multi-mode kernel principal component analysis–based incipient fault detection for pulse width modulated inverter of China Railway High-speed 5. 2017;9(10). doi:10.1177/1687814017727383. Scopus 2-s2.0-85033463546.
[17] Qiang L., Qin S. J., Tianyou C. Decentralized fault diagnosis of continuous annealing processes based on multilevel PCA. 2013;10(3):687–698. doi:10.1109/tase.2012.2230628. Scopus 2-s2.0-84892564502.
[18] He J. P. Beijing: China Water Power Press; 2010.
[19] Cover T. M., Thomas J. A. New York, NY, USA: John Wiley & Sons; 1991. doi:10.1002/0471200611. MR1122806. Zbl 0762.94001.
[20] Legg P. A., Rosin P. L., Marshall D., Morgan J. E. Feature Neighbourhood Mutual Information for multi-modal image registration: An application to eye fundus imaging. 2015;48(6):1937–1946. doi:10.1016/j.patcog.2014.12.014. Scopus 2-s2.0-84925348413.
[21] Bidelman G. M., Bhagat S. P. Objective detection of auditory steady-state evoked potentials based on mutual information. 2016;55(5):313–319. doi:10.3109/14992027.2016.1141246. Scopus 2-s2.0-84961200539.
[22] Silverman B. W. London, UK: Chapman & Hall; 1986. doi:10.1007/978-1-4899-3324-9. MR848134. Zbl 0617.62042.
[23] Daniel W. W. 2nd ed. Boston, Mass, USA: PWS-Kent; 1989.