Improving Dam Seepage Prediction Using Back-Propagation Neural Network and Genetic Algorithm

The statistical model is a traditional safety diagnostic model for dam seepage. It can hardly represent the nonlinear relationship between dam seepage and the load sets, and its extrapolated predictions are poor. In this paper, the theories of Back Propagation Neural Network (BPNN) combined with Genetic Algorithm (GA) are applied to the seepage prediction model. Taking a typical dam in China as an example, the prediction results of the BPNN-GA model and the statistical model are compared with the monitoring values. The results show that the improved dam seepage model enhances the ability of nonlinear mapping and generalization and makes the near-term seepage prediction more accurate and reasonable. According to the established criterion, the safety state of the dam in flood season is evaluated.


Introduction
Seepage directly reflects the working state of a dam and plays an important role in dam safety monitoring. Related statistical studies have shown that dam breaks caused by seepage account for 30%-40% of all dam breaks [1]. Therefore, developing or improving a reliable model, analyzing the dam seepage monitoring data in a timely manner, and predicting its trend are of great significance for grasping the safety state of dam seepage.
Faneli from Italy and Roeha from Portugal first applied the statistical regression method to the field of dam safety monitoring. Afterwards, Fanarin and others combined finite element theory with the monitoring data and derived the deterministic model and the hybrid model [2]. In recent years, with the continuous development of numerical simulation technology, scholars have introduced different prediction methods into the study of dam seepage analysis models [3-9]. Artificial Neural Network (ANN) has been a research hotspot in the field of artificial intelligence since the 1980s. In water conservancy engineering, ANN is a new and very important frontier research topic, and it has mature applications in many areas, including hydrological simulation, flood prediction, safety monitoring, and comprehensive evaluation [10-19]. Dawson et al. [10] used the Radial Basis Function Neural Network (RBFNN) to simulate the local rainfall-runoff evolution of the Yangtze River in China, which promoted its development in the field of hydrological simulation. Campolo et al. [11] established a flood forecasting model based on the ANN and used real-time information of a watershed to predict the water level evolution. Jeong et al. [12] used the ANN method to establish a rainfall-runoff model in order to predict the ensemble streamflow. Wang et al. [13] used an improved ANN to predict the displacement of a concrete-faced rockfill dam, which provided a new idea for monitoring dam displacement. Mata [14] combined the ANN model with a comprehensive evaluation system and proposed a corresponding method for evaluating the dam working behaviour. Zadhesh et al. [15] adopted the ANN to estimate the secondary permeability required for the grouting quality assessment of the Cheraghvays dam's foundation. Unes et al. [16] used the ANN to predict daily reservoir levels of the Millers Ferry Dam on the Alabama River in the USA. Li et al. [17] applied the Back Propagation Neural Network (BPNN) to analyze the relationship between the sediment-flushing efficiency of the Three Gorges Reservoir and its influencing factors. Shaw et al. [18] used the ANN to successfully emulate the predictions of a high-fidelity hydrodynamic and water quality model. Bui et al. [19] proposed a novel hybrid artificial intelligence approach for modeling and predicting the horizontal displacement of hydropower dams. In the proposed model, a neural fuzzy inference system was used to create a regression model, whereas particle swarm optimization was employed to search for the best parameters of the model.
BPNN is a feedforward multilayer network trained with the error backpropagation algorithm. It can solve many complex nonlinear problems in practical engineering applications. GA is an evolutionary algorithm inspired by biology and is often used to search for optimal solutions. In this paper, we combine the BPNN and GA to improve the dam seepage prediction model and illustrate it with a typical dam in China. On this basis, the seepage safety state of this dam is studied.

Dam Seepage Statistical Model
Dam seepage monitoring variables are generally divided into two categories: uplift pressure and leakage. Taking the uplift pressure as an example, it is mainly affected by factors including water pressure, temperature, and time-effect [2, 20-25], which can be expressed as

$$\delta = \delta_H + \delta_T + \delta_\theta \tag{1}$$

where $\delta$ is the uplift pressure, $\delta_H$ is the water pressure component, $\delta_T$ is the temperature component, and $\delta_\theta$ is the time-effect component.
(1) Water Pressure Component $\delta_H$. The change of the reservoir water level has a great influence on the seepage of the dam, and this influence acts with a certain delay [2]. The equivalent water level can be calculated according to the method in the literature [21-23] to express the delay effect of seepage. $\delta_H$ has a linear relationship with the water level, so the water pressure component can be expressed as

$$\delta_H = a_1 H^* \tag{2}$$

where $H^*$ is the equivalent water level on the monitoring day and $a_1$ is the regression coefficient.
(2) Temperature Component $\delta_T$. The temperature component refers to the seepage change caused by the temperature change of the dam concrete and foundation rock. Thermal expansion closes cracks, enhances impermeability, and thus slows down seepage. Cooling shrinkage opens cracks, reduces impermeability, and thus intensifies seepage. The temperature of the dam body and foundation rock varies periodically with atmospheric temperature, which can be expressed with a periodic function. Considering the linear relationship between seepage and temperature, the multiperiod harmonic is chosen as the factor to represent the temperature component:

$$\delta_T = \sum_{i=1}^{p} \left( b_{1i} \sin\frac{2\pi i t}{365} + b_{2i} \cos\frac{2\pi i t}{365} \right) \tag{3}$$

where $t$ is the number of days from the initial monitoring day to the monitoring day, $p$ is the number of harmonic periods, and $b_{1i}$ and $b_{2i}$ are the regression coefficients.
(3) Time-Effect Component $\delta_\theta$. The time-effect component is an irreversible component that develops in a certain direction over time. It mainly reflects the influence of dam body material creep, dam foundation rock creep, rock mass joints and cracks, and weak structures on uplift pressure. Its mechanism is complicated, and an accurate expression is often difficult to derive [23]. Typically, the time-effect component changes sharply at the beginning and gradually stabilizes during the later period. This general variation law can be expressed by the combination of two empirical formulas:

$$\delta_\theta = c_1 \theta + c_2 \ln\theta \tag{4}$$

where $\theta = 0.01t$ and $c_1$ and $c_2$ are the regression coefficients.
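In summary, the statistical model of the uplift pressure can be expressed as

$$\delta = a_1 H^* + \sum_{i=1}^{p} \left( b_{1i} \sin\frac{2\pi i t}{365} + b_{2i} \cos\frac{2\pi i t}{365} \right) + c_1 \theta + c_2 \ln\theta \tag{5}$$

where $p$ is the number of harmonic periods in the temperature component and $\theta = 0.01t$.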
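As an illustration of how the coefficients of such a statistical model could be estimated, the following minimal sketch fits the factors $H^*$, the harmonics, $\theta$, and $\ln\theta$ by ordinary least squares. The function names `build_factors` and `fit_coefficients`, the choice of two harmonics, the absence of an intercept term, and the synthetic data are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def build_factors(h_eq, t, n_harmonics=2):
    """Stack the regression factors column by column.

    Columns: H*, then sin/cos pairs for each harmonic, then theta and ln(theta).
    n_harmonics=2 is an assumption for this sketch.
    """
    h_eq = np.asarray(h_eq, dtype=float)
    t = np.asarray(t, dtype=float)
    theta = 0.01 * t
    columns = [h_eq]
    for i in range(1, n_harmonics + 1):
        columns.append(np.sin(2.0 * np.pi * i * t / 365.0))
        columns.append(np.cos(2.0 * np.pi * i * t / 365.0))
    columns.extend([theta, np.log(theta)])
    return np.column_stack(columns)

def fit_coefficients(h_eq, t, delta):
    """Estimate a1, b_1i, b_2i, c1, c2 by ordinary least squares."""
    factors = build_factors(h_eq, t)
    coef, *_ = np.linalg.lstsq(factors, np.asarray(delta, dtype=float), rcond=None)
    return coef

# Synthetic, noise-free data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
t = np.arange(1, 731)                            # days since first monitoring day
h_eq = 170.0 + rng.normal(0.0, 1.5, t.size)      # made-up equivalent water levels
true_coef = np.array([0.4, 0.8, -0.5, 0.2, 0.1, 0.3, 0.6])
delta = build_factors(h_eq, t) @ true_coef       # uplift pressure from the model
coef = fit_coefficients(h_eq, t, delta)          # should recover true_coef
```

Because the synthetic series is generated from the model itself, the fit recovers the chosen coefficients; with real monitoring data the residuals would give the standard deviation of the model.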

Back-Propagation Neural Network.
The Back Propagation Neural Network (BPNN) was proposed by Rumelhart and McClelland in 1986. It is a multilayer feedforward network trained by the error backpropagation algorithm and is also the most widely used neural network model at present. The BPNN consists of an input layer, a hidden layer, and an output layer, and each layer contains several neurons [26]. The neurons in the same layer are not connected with each other, while the neurons in adjacent layers are connected with each other, as shown in Figure 1.
Assume there are $n$ samples $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$, where $x_i$ is the input vector and $y_i$ is the expected output vector of the BPNN. Taking node $j$ in layer $m$ as the object of study, when sample $i$ is input, node $j$ receives the output information of the previous layer as follows:

$$\mathrm{net}_{ji} = \sum_{k=1}^{n_1} \omega_{kj} O_{ki} \tag{6}$$

where $k$ is a node in the previous layer, $n_1$ is the number of nodes in that layer, $\omega_{kj}$ is the connection weight between node $k$ and node $j$, and $O_{ki}$ is the output signal of node $k$. The output value of node $j$ after the activation function can be expressed as

$$O_{ji} = f(\mathrm{net}_{ji}) \tag{7}$$

where $f(x)$ is the activation function at node $j$. The sigmoid function is often used in practical applications. Its output value lies in (0, 1), and its form is as follows:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{8}$$

According to equations (6)-(8), the output value $\hat{y}_i$ of the output layer can be obtained by calculating the outputs layer by layer.
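The layer-by-layer calculation described above can be sketched for a network with one hidden layer as follows. This is a minimal illustration, not the authors' implementation: the function names, the array shapes, and the additive threshold terms `b_hidden` and `b_out` are assumptions (the thresholds are mentioned later in the text but do not appear in the weighted sum itself).

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation; the output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer network.

    x        : (n_samples, n_inputs) input vectors
    w_hidden : (n_inputs, n_hidden) weights omega_kj into the hidden layer
    w_out    : (n_hidden, n_outputs) weights into the output layer
    b_*      : threshold terms, added to the weighted sum (assumed convention)
    """
    net_hidden = x @ w_hidden + b_hidden   # weighted sum received by each hidden node
    o_hidden = sigmoid(net_hidden)         # hidden-node outputs
    net_out = o_hidden @ w_out + b_out     # weighted sum received by each output node
    o_out = sigmoid(net_out)               # network output
    return o_hidden, o_out

# Tiny example with hypothetical sizes: 3 inputs, 4 hidden nodes, 1 output.
x = np.array([[0.2, 0.7, 0.1]])
o_hidden, o_out = forward(x, np.zeros((3, 4)), np.zeros(4), np.zeros((4, 1)), np.zeros(1))
```

With all weights and thresholds set to zero, every node receives a net input of zero, so each output equals the sigmoid midpoint 0.5.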
In the process of error backpropagation, it is necessary to define an objective function, that is, the sum of squared errors between the output value $\hat{y}_{ji}$ of the output layer and the expected output value $y_{ji}$:

$$E_i = \frac{1}{2} \sum_{j=1}^{l} \left( y_{ji} - \hat{y}_{ji} \right)^2$$

where $l$ is the number of nodes in the output layer of the BPNN. In practical applications, there is often only one node in the output layer, so the above expression reduces to

$$E_i = \frac{1}{2} \left( y_i - \hat{y}_i \right)^2$$

Then, the total error is

$$E = \sum_{i=1}^{n} E_i$$

where $n$ is the number of samples. The process of error adjustment is equivalent to an unconstrained optimization problem, that is, to minimize $E$ by adjusting the connection weights $\omega$ without any restriction:

$$\min_{\omega} E$$

The above steps are repeated, and the weights of the BPNN are corrected continually until the error meets the requirement.
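A minimal sketch of this iterative weight correction is given below, using plain batch gradient descent on a squared-error objective with a factor of one half. The network size, the learning rate `lr`, the number of epochs, the toy data, and the additive threshold terms are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def total_error(y, y_hat):
    """Half the sum of squared errors over all samples."""
    return 0.5 * np.sum((y - y_hat) ** 2)

def train_bpnn(x, y, n_hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer sigmoid network by batch gradient descent."""
    rng = np.random.default_rng(seed)
    # Random initialization in [0, 1], as described for the plain BPNN.
    w1 = rng.uniform(0.0, 1.0, (x.shape[1], n_hidden))
    b1 = rng.uniform(0.0, 1.0, n_hidden)
    w2 = rng.uniform(0.0, 1.0, (n_hidden, 1))
    b2 = rng.uniform(0.0, 1.0, 1)
    history = []
    for _ in range(epochs):
        # Forward pass.
        h = sigmoid(x @ w1 + b1)
        y_hat = sigmoid(h @ w2 + b2)
        history.append(total_error(y, y_hat))
        # Backward pass: propagate the output error through the sigmoid derivatives.
        d_out = (y_hat - y) * y_hat * (1.0 - y_hat)
        d_hid = (d_out @ w2.T) * h * (1.0 - h)
        # Gradient descent update of weights and thresholds.
        w2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        w1 -= lr * (x.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (w1, b1, w2, b2), history

# Toy data (hypothetical): targets already lie inside the sigmoid range (0, 1).
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 0.2 + 0.6 * x
params, history = train_bpnn(x, y)
```

In practice the loop would stop once the error falls below a set tolerance rather than after a fixed number of epochs.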
BPNN, due to its ability to map complex nonlinear and unknown relationships, is a preferred choice among researchers for modeling unstructured problems and has been well applied in many practical engineering problems. However, it also has some shortcomings, such as being affected by the initial values and easily falling into a local extremum [26][27][28]. In this paper, the Genetic Algorithm (GA) is used to optimize the initial weights and thresholds of the BPNN, so as to improve the convergence speed of the BPNN and reduce the possibility of the BPNN algorithm falling into the local extremum.

Optimizing BPNN by Genetic Algorithm.
Genetic Algorithm (GA) is a gradient-free global optimization and search technique inspired by evolutionary processes, namely, natural selection and genetic variation. It was first proposed by Professor Holland of the United States in the 1970s. It operates directly on structural objects and requires neither differentiability nor continuity of the function. It adopts a probabilistic optimization method, automatically obtains and guides the optimization search space, adjusts the search direction adaptively, and needs no predetermined rules. Unlike traditional optimization algorithms, the GA searches for optimal solutions in different directions simultaneously, instead of starting from one direction [26]. GA has no specific requirements for the form of the objective function, and it has a good global search ability. Therefore, GA can be used to optimize the connection weights and thresholds of the BPNN [26-29], and the flow chart is shown in Figure 2.
BPNN uses the gradient descent algorithm for training and weight adjustment. Before training, the BPNN randomly initializes the connection weights and thresholds of each layer to values in the interval [0, 1]. This method of random initialization tends to slow down the convergence of the BPNN and can lead to the local extremum problem, while GA has strong global convergence but is deficient in local refinement. The two can therefore be combined. When the convergence of the BPNN is slow, the connection weights and thresholds of each layer in the network can be used as the input information of the GA. Through the genetic operators, the optimal individual is obtained. The optimal individual obtained by the GA is decoded and assigned as the initial weights and thresholds of the BPNN. Then, the BPNN is used for local optimization, and the output values corresponding to the global optimal solution are obtained.
From the point of view of the GA-based BPNN, the method is to use the GA to search the solution space of the target information. Then, when the GA finds a better network form, the BP algorithm is used for local refinement so as to find the optimal solution of the problem. The specific steps are as follows [30-32]:

(1) Initial Population. First, the topological structure of the BPNN is determined, and then the length of each individual is determined according to the network structure. All weights and thresholds in the network are real-coded as a chromosome $X$:

$$X = [\omega, \theta, \varphi, \mu]$$

where $\omega$ is the weight between the input layer and the hidden layer, $\theta$ is the threshold of the hidden layer, $\varphi$ is the weight between the hidden layer and the output layer, and $\mu$ is the output layer threshold.

(2) Fitness Function. The error between the predicted output and the expected output of the BPNN is taken as the fitness function $F$:

$$F = \sum_{i=1}^{N} \sum_{j=1}^{m} \left| y_j^i - o_j^i \right|$$

where $N$ is the number of training samples, $m$ is the dimension of the output variables, $y_j^i$ is the target value of output node $j$ of the BPNN for sample $i$, and $o_j^i$ is the corresponding output value.

(3) Genetic Operations.

(a) Selection. The roulette method is used as the selection operator: each of the $c$ individuals in the population is selected with a probability determined by its fitness value $F$.

(b) Crossover. Using the real-number crossover method, the crossover operation of the $i$th and $j$th individuals at position $k$ is as follows:

$$y_{ik} = y_{ik} r + y_{jk} (1 - r)$$

where $y_{ik}$ and $y_{jk}$ represent, respectively, the gene of the $i$th and $j$th individual at position $k$, and $r$ is a random number in (0, 1).

(c) Mutation. Selecting the $j$th gene of the $i$th individual ($y_{ij}$) for the mutation operation, the method is as follows:

$$y_{ij} = \begin{cases} y_{ij} + \left( y_{ij} - y_{\max} \right) r_1 \left( 1 - \dfrac{s}{s_{\max}} \right), & r_2 \ge 0.5, \\ y_{ij} + \left( y_{\min} - y_{ij} \right) r_1 \left( 1 - \dfrac{s}{s_{\max}} \right), & r_2 < 0.5, \end{cases}$$

where $y_{\max}$ is the upper bound of $y_{ij}$, $y_{\min}$ is the lower bound, $s$ is the current number of iterations, $s_{\max}$ is the maximum number of iterations, and $r_1$ and $r_2$ are random numbers in [0, 1].

(4) Decode. The weights and thresholds output by the GA are taken as the initial weights and thresholds of the BPNN. The BPNN carries out forward propagation, calculates the global error, adjusts the network parameters, and repeats the learning and training.
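The three genetic operators can be sketched as small functions. This is a hedged illustration rather than the authors' code. The selection rule below weights each individual by the inverse of its error-type fitness so that smaller errors are favoured, which is an assumption, as is the symmetric update of the second parent in the crossover. The mutation follows the two-branch rule with $r_1$, $r_2$, $s$, and $s_{\max}$ and applies no clipping to the bounds.

```python
import numpy as np

def selection_probabilities(errors):
    """Roulette-wheel shares; inverse-error weighting is an assumption of this sketch."""
    inv = 1.0 / np.asarray(errors, dtype=float)
    return inv / inv.sum()

def roulette_select(errors, rng):
    """Draw as many parent indices as there are individuals, with replacement."""
    p = selection_probabilities(errors)
    return rng.choice(len(p), size=len(p), p=p)

def crossover_gene(y_i, y_j, k, r):
    """Real-number crossover of individuals y_i and y_j at position k."""
    a, b = y_i.copy(), y_j.copy()
    a[k] = y_i[k] * r + y_j[k] * (1.0 - r)
    b[k] = y_j[k] * r + y_i[k] * (1.0 - r)   # symmetric update (assumed)
    return a, b

def mutate_gene(y, j, y_min, y_max, s, s_max, r1, r2):
    """Mutate gene j; the step shrinks as iteration s approaches s_max."""
    out = y.copy()
    step = r1 * (1.0 - s / s_max)
    if r2 >= 0.5:
        out[j] = y[j] + (y[j] - y_max) * step
    else:
        out[j] = y[j] + (y_min - y[j]) * step
    return out

# Hypothetical values, for illustration only.
p = selection_probabilities([1.0, 2.0, 4.0])
parents = roulette_select([1.0, 2.0, 4.0], np.random.default_rng(0))
a, b = crossover_gene(np.array([1.0, 2.0]), np.array([3.0, 4.0]), k=0, r=0.25)
m_hi = mutate_gene(np.array([0.8]), 0, 0.0, 1.0, s=0, s_max=10, r1=0.5, r2=0.7)
m_lo = mutate_gene(np.array([0.8]), 0, 0.0, 1.0, s=0, s_max=10, r1=0.5, r2=0.3)
```

The factor `1 - s / s_max` makes mutations large early in the search and vanishingly small near the final iteration, which matches the GA's role of coarse global search before the BPNN refines the result.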
Using GA to optimize the BPNN has a great probability to avoid the network falling into a local extremum and to speed up the training of the network, and the final connection weights will be more stable.

Seepage Model Based on BPNN-GA.
The research object of this paper is the dam seepage prediction model, so the output variable is the predicted value of the uplift pressure of the dam foundation over a certain period of time in the future. The prediction model is designed as a three-layer BPNN. Combining the seepage theory with BPNN-GA, the input variables of the improved model are denoted as $x$:

$$x = \left( H^*, \sin\frac{2\pi i t}{365}, \cos\frac{2\pi i t}{365}, \theta, \ln\theta \right)$$

where the meaning of each symbol is the same as in equation (5). The output variable is the uplift pressure $y$, and the sample set is denoted as $Q_m$:

$$Q_m = \left\{ (x_i, y_i) \mid i = 1, 2, \ldots, m \right\}$$

where $x_i$ represents the input variables of the $i$th group and $y_i$ represents the corresponding uplift pressure monitoring value. The GA assigns the optimal initial weights to the BPNN through selection, crossover, and mutation. The BPNN uses the training sample set $Q_m$ for learning, and the network continuously adjusts the weights until the set minimum error is reached, so that the optimized BPNN can better predict the uplift pressure. At this point, if the test factors are input into the trained model, the model prediction value can be obtained.
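One practical detail when a sigmoid is used at the output node is that its values lie in (0, 1), so the uplift pressure samples would normally be rescaled before training and mapped back afterwards. The sketch below shows one common way to do this, min-max scaling into [0.1, 0.9]. The paper does not state which normalization was used, so the functions, the target range, and the sample values are all assumptions.

```python
import numpy as np

def fit_minmax(values, low=0.1, high=0.9):
    """Record the range of the training data and the target interval."""
    values = np.asarray(values, dtype=float)
    return values.min(axis=0), values.max(axis=0), low, high

def scale(values, params):
    """Map monitoring values into [low, high] using the training range."""
    vmin, vmax, low, high = params
    return low + (np.asarray(values, dtype=float) - vmin) * (high - low) / (vmax - vmin)

def unscale(scaled, params):
    """Map network outputs back to physical units."""
    vmin, vmax, low, high = params
    return vmin + (np.asarray(scaled, dtype=float) - low) * (vmax - vmin) / (high - low)

# Hypothetical uplift pressure readings, for illustration only.
y_train = np.array([100.0, 102.0, 104.0])
params = fit_minmax(y_train)
y_scaled = scale(y_train, params)
y_back = unscale(y_scaled, params)
```

Fitting the range on the training period only, and reusing it for the prediction period, avoids leaking information from the values to be predicted.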

Evaluation Criterion of Dam Seepage State.
The BPNN-GA model is used to obtain the prediction value ($\delta'$) of the uplift pressure at a certain time, which can be compared with the monitoring value ($\delta$) at that time. According to probability and statistics theory, the probability of $|\delta - \delta'|$ (the absolute difference between the monitoring value and the prediction value) falling into (0, 2S) is 95.5%, and the probability of $|\delta - \delta'|$ falling into (0, 3S) is 99.7% [33,34], where $S$ is the standard deviation of the model.
In this way, the state of seepage can be evaluated by comparing $|\delta - \delta'|$ with the thresholds 2S and 3S [3,20]; in particular, values within (0, 2S) indicate a normal seepage state.
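A check of this kind can be sketched as a small function that places each absolute difference into one of three bands bounded by 2S and 3S. The band names used here are neutral, hypothetical identifiers and are not the state labels of the paper; only the 2S and 3S thresholds come from the text. The numbers in the example are made up.

```python
import numpy as np

def residual_band(delta, delta_pred, s):
    """Classify |delta - delta_pred| against the 2S and 3S thresholds."""
    diff = np.abs(np.asarray(delta, dtype=float) - np.asarray(delta_pred, dtype=float))
    return np.where(
        diff <= 2.0 * s,
        "within_2S",
        np.where(diff <= 3.0 * s, "between_2S_and_3S", "beyond_3S"),
    )

# Hypothetical monitoring and prediction values with S = 0.1.
bands = residual_band([10.0, 10.0, 10.0], [10.1, 10.25, 10.5], s=0.1)
```

Here the three differences are about 0.1, 0.25, and 0.5, so they fall in the first, second, and third band, respectively.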

Case Study
Cotton Beach hydropower station is located in Yongding County, Fujian Province, China. The project is mainly used for power generation, with comprehensive benefits such as flood control, navigation, and aquaculture. It is a class I project, and the hub is mainly composed of the main dam, the auxiliary dam, the spillway, the bottom hole, the water delivery structure, and the underground power house. The main dam is arranged in the main river bed of the river valley. The dam crest elevation is 179.0 m, the maximum dam height is 113.0 m, and the dam crest length is 308.5 m. The normal water storage level of the reservoir is 173.0 m, with a storage capacity of 1.122 billion cubic meters, and the check flood level is 177.8 m, corresponding to a total storage capacity of 2.035 billion cubic meters. The main dam is divided into six dam sections, of which 1# to 2# are the left bank gravity dam sections, 3# to 4# are the overflow dam sections, and 5# to 6# are the right bank gravity dam sections. The uplift pressure is one of the safety monitoring items of a gravity dam, and its magnitude directly affects the stability, strength, and engineering cost of the dam. To this end, various engineering measures such as curtain grouting, dam foundation drainage, or forced drainage are needed to reduce uplift pressure in dam construction [35]. Engineering practice shows that the geological conditions of the dam section in the main river bed are complex and greatly influenced by environmental factors. The seepage state in the overflow dam section deserves more attention because of the frequent influence of overflow [20]. Therefore, the measuring points UP9 and UP11 of the overflow dam section are studied, as shown in Figure 3.
The uplift pressure and the monitoring data of upstream water level, temperature, and time in the overflow section of Cotton Beach dam from January 2006 to June 2008 are selected as training samples, and the BPNN-GA model is trained on them. The trained model is used to predict the uplift pressure in flood season (July 2008). The statistical model is used as a contrast model. The monitoring values and the prediction values of the two models are shown in Figure 4 and Tables 1 and 2. It can be seen from Figure 4 that both the statistical model and the BPNN-GA model have a certain capability for extended prediction. In terms of the trend, the BPNN-GA model follows the change of the monitoring values more closely than the statistical model.
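We also introduce the Mean Absolute Percentage Error (MAPE), Mean Square Error (MSE), and Mean Absolute Error (MAE) to quantify the prediction effectiveness of the two models, where a smaller value indicates a more effective model. They are expressed as follows:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{A_i - F_i}{A_i} \right| \times 100\%$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( A_i - F_i \right)^2$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| A_i - F_i \right|$$

where $n$ is the number of samples, $A_i$ is the monitoring value, and $F_i$ is the prediction value. The calculated parameters are presented in Table 3.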
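These three error measures are straightforward to compute. The following sketch uses their standard definitions; the function names and the sample values are hypothetical and are not data from the case study.

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual)) * 100.0

def mse(actual, predicted):
    """Mean Square Error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((actual - predicted) ** 2)

def mae(actual, predicted):
    """Mean Absolute Error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted))

# Made-up monitoring and prediction values.
A = [100.0, 200.0]
F = [110.0, 190.0]
scores = (mape(A, F), mse(A, F), mae(A, F))   # (7.5, 100.0, 10.0)
```

MAPE is scale-free, which makes it convenient for comparing measuring points with different pressure levels, while MSE penalizes large individual errors more heavily than MAE.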
From Table 3, taking MAPE as an example, it can be seen that the prediction error of the statistical model is 0.18% and 0.38% at the two points, while that of the BPNN-GA model is only 0.08% and 0.14%. The prediction accuracy of the BPNN-GA model is 180% and 158% higher than that of the statistical model, respectively. This is mainly because the BPNN-GA model has stronger nonlinear mapping and generalization ability, which improves the prediction accuracy of the model. The above results show that the BPNN-GA model has better prediction ability.
Since the prediction accuracy of BPNN-GA model is higher than that of the statistical model, the prediction values of the BPNN-GA model are used to evaluate the seepage safety state of Cotton Beach dam in flood season.
The prediction values of the uplift pressure at the two points from July 1 to 31, 2008, are obtained by BPNN-GA. The absolute differences |δ − δ′| are shown in Figure 5.
From Figure 5, the absolute differences |δ − δ′| are relatively small and stable on the whole, illustrating that the BPNN-GA model has a good prediction effect. However, Figure 5(a) shows a slightly larger fluctuation on July 29. The influencing factors are analysed, especially the water level, which is often considered to be the main factor. The analysis shows that the upstream water level dropped significantly, by 0.6 m, on that day, which has a certain impact on the data analysis. When there are abrupt changes in an otherwise stationary time series, the BPNN-GA model does not capture their impact well, which leads to the above error. The accuracy of the BPNN-GA model is slightly lower in capturing abrupt changes, but it can accurately predict the trend of the data. This is enough to meet the needs of engineering application.
In general, all of the absolute differences |δ − δ′| lie in the interval (0, 2S). According to the criteria in Section 3.4, it can be evaluated that the two points are in the normal seepage state, indicating that the seepage control effect of the overflow dam section is good. Accordingly, the same method can be used to predict the seepage state of other points and analyse the seepage control effect of other dam sections, so that the decision-makers can take corresponding countermeasures in special circumstances.

Conclusions
In this paper, the seepage monitoring model of the dam is studied according to its own working characteristics. Combining the seepage theory with BPNN-GA, an improved seepage model is established and applied to the monitoring and analysis of the uplift pressure of Cotton Beach dam. The results are as follows: (1) From the prediction results of the two points, compared with the statistical model, the BPNN-GA model has higher prediction accuracy and can better predict the trend of the data. This shows that it is reasonable and feasible to apply the BPNN and GA theory to improve the seepage prediction model. This method can be extended to other points to obtain a comprehensive picture of the seepage safety state and the effect of seepage control treatment.

Data Availability
The data used to support the findings of this study are available from the author upon request.

Conflicts of Interest
The authors declare no conflicts of interest.