Study on MPGA-BP of Gravity Dam Deformation Prediction

Displacement is an important physical quantity in the deformation monitoring of hydraulic structures, and accurate displacement prediction is a prerequisite for safe operation. Most existing metaheuristic methods have three problems: (1) they easily fall into local minima, (2) they converge slowly, and (3) they are sensitive to initial values. To resolve these problems and improve prediction accuracy, this study draws on the genetic algorithm-based backpropagation (GA-BP) neural network and the multiple population genetic algorithm (MPGA). A hybrid multiple population genetic algorithm backpropagation (MPGA-BP) neural network algorithm is put forward to optimize deformation prediction from periodic monitoring surveys of hydraulic structures. This hybrid model is employed to analyze the displacement of a gravity dam in China. The results show that the proposed model is superior to an ordinary BP neural network and a statistical regression model in terms of global search, convergence speed, and prediction accuracy.


Introduction
Monitoring of dam cracks and displacement, which reflect structural aging and disease, is widely used in various forms of dams (e.g., Xianghongdian Dam [1], Chencun Dam [2], and Dokan Dam [3]). Dam deformation is generally caused by three primary factors: temperature variation, chemical reactions, and live loads [4]. Many ideas have been proposed to monitor dam deformation with statistical regression analysis or mechanical calculation. Tonini [5] proposed that dam displacement may come from three causative influences: water pressure, temperature variation, and aging. Other scholars further researched these influences and proposed various models.
Statistical regression models [6] have been proposed to analyze and describe dam deformation data quantitatively. Improved models used the average temperature over a certain number of days before the observation to analyze Castelo Arch Dam, accounting for the hysteresis of air temperature in the dam [7]. Rocha used a power polynomial of the reservoir water level to express the hydrostatic pressure factor [6]. It was not until the failure of Malpasset Arch Dam in France in 1959 and Arch Dam's reservoir-bank landslide that the importance of dam safety and of safety monitoring work was widely recognized. Thus, the statistical regression model was used, through the finite-element method, to calculate the influence factors of dam deformation [8].
Researchers further studied and proposed deterministic models [9]. A deterministic model can be used to analyze a dam's observation time series both quantitatively and qualitatively [10]. Multiple linear regression approaches, which are simple and require no prior knowledge of the structure's material properties, became popular for analyzing the relationship between environmental quantities and effect quantities [11]. Hybrid models, a combination of deterministic models and purely statistical (regression) models, can be used to analyze dam displacement through finite-element theory.
In the last 10 years, artificial intelligence algorithms, such as the grey system, fuzzy mathematics theory, time series, wavelet theory, and bionic algorithms, have also gained popularity. The grey system has been proposed and applied to the grey forecasting model of dam stress [12,13]. Fuzzy mathematics theory has been used to analyze gravity dam instability due to interval risk [14]. The method based on wavelet theory separates dam monitoring data into effect quantity and environmental quantity [15]. Recent studies using artificial neural networks [16] and the artificial bee colony (ABC) algorithm [17] provided excellent fitting precision for assessing the feasibility and practicability of dam safety monitoring models.
Artificial neural networks, which possess a strong capacity for nonlinear function approximation as well as self-organizing and self-adaptive abilities, have been applied to the analysis and forecasting of dam safety monitoring data to remove data irregularity. Backpropagation (BP) neural networks have been proposed to monitor and predict dam deformation based on the actual values of a concrete gravity dam's horizontal displacement [16]. The prediction accuracy was greatly improved over the previous statistical model. In addition, space displacement analysis [18] and the forward-inversion analysis method [19] were also proposed to supervise dam deformation. The shortcomings of the BP network model were addressed by optimizing the structure of the network model and establishing a dam deformation forecasting model based on the improved BP neural network [20]. Recent studies [21,22] have revealed that artificial neural networks can be applied to monitor the deformation and seepage of earth-rock dams.
Many studies [23,24] demonstrated that the BP neural network has some defects, such as slow learning convergence, local extrema, and the inconsistency and unpredictability of the structure. Therefore, many scholars combined the BP neural network with other theories, such as fuzzy mathematics theory [25,26], to improve prediction accuracy. Such combinations provide better results than regression models. The combination with wavelet decomposition for function approximation improved the fitting and prediction accuracy of dam deformation monitoring [27]. The combination with particle swarm also achieved good forecasting results [28]. A hybrid wavelet neural network methodology showed superiority in improving the prediction precision of the time-varying behavior of engineering structures [29].
As a new type of search method, the genetic algorithm (GA) has many advantages, such as simplicity, generality, high global searching capability, strong robustness, and a wide application range. Many studies have shown that these advantages can be used to optimize the BP neural network's structure, weights, threshold values, and parameters and to improve prediction accuracy. Fu et al. [30] combined genetic algorithm backpropagation neural network prediction and finite-element model simulation to improve the process of multiple-step incremental air-bending forming of sheet metal. Yin et al. [31] proposed the use of GA-BP to optimize injection molding process parameters. Chen et al. [32] suggested that a GA-based BP neural network could be a promising approach for anticipating MMP in the CO2-EOR process. However, little research has been conducted in dam deformation analysis. Moreover, the results [30][31][32] might have been more beneficial if premature convergence in the GA had been considered: when all individuals in the population tend toward the same state and stop evolving, the algorithm fails to give a satisfactory solution. Furthermore, when a normal GA is used to solve practical problems, setting the control parameters and designing the genetic operators can be puzzling. They tend to be set tentatively based on the actual problem, and inappropriate settings will largely influence the performance of the algorithm. Thus, further studies are required to explore the applicability and reliability of GA-BP in dam deformation and to seek better methods of improvement.
The objective of this study is to analyze the feasibility of GA-BP in dam deformation analysis, to explore the usefulness of the proposed multiple population genetic algorithm backpropagation model (hybrid MPGA-BP model), and to compare it with a statistical regression model and a conventional BP network model with the same parameters. The rest of the paper is arranged as follows. Section 2 points out the relation between the loads and the dam behavior, presents a brief review of the BP neural network and MPGA, and introduces the proposed MPGA-BP model. Section 3 provides a case study of a gravity dam, which includes model setup, the results of model analysis, and evaluation. Several figures and tables are presented to illustrate the comparison among the statistical regression model, the BP neural network model, and the MPGA-BP model. Section 4 presents concluding remarks.

Statistical Relation between the Loads and the Dam Behavior.
Dam structure is influenced by hydraulic, environmental, and geomechanical factors. Therefore, the variables that affect the dam behavior must be studied before applying the improved artificial neural network approaches. Based on the results of [33,34], the accepted formulation for the deformation $\delta$ of an observation point located on the dam is

$$\delta = \delta_H + \delta_T + \delta_\theta, \tag{1}$$

where $\delta_H$, $\delta_T$, and $\delta_\theta$ are the displacements contributed by hydrostatic pressure, temperature variation, and aging, respectively.
As an important factor of deformation, hydrostatic pressure can be expressed as a polynomial function of the reservoir water level $H$ above the foundation:

$$\delta_H = a_0 + \sum_{i=1}^{m_1} a_i H^i, \tag{2}$$

where $a_0$ and $a_i$ are determined by regression analysis and $m_1$ is the number of water pressure factors ($m_1 = 3$ for the concrete gravity dam). The displacement contributed by temperature variation can be modeled in two ways. If temperature measurements within the dam body and foundation are adequate and available, then

$$\delta_T = b_0 + \sum_{i=1}^{m_2} b_i T_i, \tag{3}$$

where $T_i$ is the temperature measured at measuring point $i$; $b_0$ and $b_i$ are determined by regression analysis; $m_2$ is the number of temperature factors. If temperature measurements are inadequate or unavailable, measured temperatures cannot be used to describe the variation of the temperature field. When the dam temperature field is close to the quasi-stable temperature field, the changes of the temperature field within the dam body can be described approximately through the changes of the outside temperature. However, the internal temperature of the dam body lags behind changes in air temperature, so the influence of air temperature variation on the monitored effect quantity is also lagged. Consequently, the average temperatures over several days before the observation day serve as temperature factors, and the temperature component can be described as

$$\delta_T = b_0 + \sum_{i=1}^{m_2} b_i T_{i,(s-e)}, \tag{4}$$

where $T_{i,(s-e)}$ is the $i$th temperature factor, the average temperature from day $s$ to day $e$ before the observation day; $b_0$ and $b_i$ are determined by regression analysis; $m_2$ is the number of temperature factors. The influence contributed by aging can be modeled as

$$\delta_\theta = c_0 + \sum_{i=1}^{m_3} c_i f_i(\theta), \tag{5}$$

where $f_i(\theta)$ are relation functions of the aging factor $\theta$, and $c_0$ and $c_i$ are determined by regression analysis.

BP Neural Network.
As the hidden layer excitation function, $f_1$ adopts the Sigmoid function, which maps an arbitrary input into the range (0, 1). The logarithmic Sigmoid function is

$$f_1(x) = \frac{1}{1 + e^{-x}}. \tag{6}$$

As the output layer excitation function, $f_2$ adopts the linear function

$$f_2(x) = x. \tag{7}$$

Here $w_{ij}$ is the connection weight between the $i$th input layer neuron and the $j$th hidden layer neuron; $w_{jk}$ is the connection weight between the $j$th hidden layer neuron and the $k$th output layer neuron; $\gamma_j$ is the threshold value of the $j$th hidden layer neuron; $\gamma_k$ is the threshold value of the $k$th output layer neuron; $\eta$ is the learning rate of the network; $E$ is the error between the output value and the target value.
The BP neural network training process is mainly divided into the following two steps.
Step 1 (forward propagation). Obtain the input value of the $j$th hidden layer neuron from the inputs $x_i$ as follows:

$$\mathrm{net}_j = \sum_{i=1}^{n} w_{ij} x_i. \tag{8}$$

Because the threshold value of the $j$th hidden layer neuron is $\gamma_j$, the activation value of the $j$th hidden layer neuron is $\mathrm{net}_j - \gamma_j$. Therefore, the $j$th hidden layer output value can be expressed as $v_j = f_1(\mathrm{net}_j - \gamma_j)$, $j = 1, 2, \ldots, l$. In the same way, the input value of the $k$th output layer neuron is $\mathrm{net}_k = \sum_{j=1}^{l} w_{jk} v_j$, the threshold value of the $k$th output layer neuron is $\gamma_k$, the activation value of the $k$th output layer neuron is $\mathrm{net}_k - \gamma_k$, and the $k$th output layer output value is

$$y_k = f_2(\mathrm{net}_k - \gamma_k), \quad k = 1, 2, \ldots, m. \tag{9}$$

Step 2 (reverse propagation). If the difference between the output value of forward propagation and the expected value $t_k$ is larger than the set value, the corresponding weights need to be adjusted backwards repeatedly. The error function of the neural network can be expressed as

$$E = \frac{1}{2} \sum_{k=1}^{m} (t_k - y_k)^2. \tag{10}$$

Based on the gradient descent method, the increment of the connection weight between the $j$th hidden layer neuron and the $k$th output layer neuron is

$$\Delta w_{jk} = -\eta \frac{\partial E}{\partial w_{jk}} = \eta \sigma_k v_j, \tag{11}$$

where $\sigma_k = (t_k - y_k) \cdot f_2'(x)$, $f_2'(x)$ is the derivative of the excitation function $f_2(x)$ with respect to the independent variable $x$, and $x$ is the activation value of this layer. In the same way, the threshold increment of the output layer is

$$\Delta \gamma_k = -\eta \frac{\partial E}{\partial \gamma_k} = -\eta \sigma_k. \tag{12}$$

Then, based on the same gradient method, the increment of the connection weight between the $i$th input layer neuron and the $j$th hidden layer neuron is

$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} = \eta \sigma_j x_i, \tag{13}$$

where $\sigma_j = f_1'(x) \sum_{k=1}^{m} \sigma_k w_{jk}$ and $f_1'(x)$ is the derivative of the excitation function $f_1(x)$ with respect to the independent variable $x$. Therefore, the threshold increment of the hidden layer is

$$\Delta \gamma_j = -\eta \frac{\partial E}{\partial \gamma_j} = -\eta \sigma_j. \tag{14}$$

Through the increment formulas above, the connection weights and thresholds are updated for the next round of network learning and training. As these steps are repeated, the connection weights and thresholds are continually updated and the output error tends to a minimum. When the error reaches the set value, the loop is completed; otherwise, training continues. The calculation flow diagram of the BP neural network is shown in Figure 2.

Multiple Population Genetic Algorithm (MPGA).
Although the genetic algorithm shows excellent characteristics of global efficiency, it also has disadvantages in practical application. An important one is premature convergence, a common phenomenon in the GA. It strongly affects the search for the optimal value; its main characteristic is that all individuals in the population converge toward the same state and stop evolving, so a satisfactory solution cannot be obtained. The multiple population structure has been introduced to solve the premature convergence problem of the GA. This paper uses the multiple population genetic algorithm in place of the conventional standard genetic algorithm (SGA). The structure diagram is shown in Figure 3.
MPGA mainly makes the following improvements to SGA: (1) MPGA introduces multiple populations that search simultaneously, while SGA has only a single population. Different populations can achieve different search purposes by using different control parameters.
(2) The migration operator links the populations so that multiple population coevolution can be realized and the optimal solution is eventually obtained.
(3) The artificial selection operator saves the best individual of each population in every generation, and these saved individuals serve as the criterion of algorithm convergence.
In MPGA, each population uses different control parameters. The values of the crossover probability $P_c$ and the mutation probability $P_m$ determine the algorithm's global and local search ability. Although many articles and scholars advise choosing a larger crossover probability $P_c$ (0.7∼0.9) and a smaller mutation probability $P_m$ (0.001∼0.05), there is still no definite way to choose $P_c$ and $P_m$, and different choices can lead to very different optimization results. MPGA overcomes this shortcoming of SGA: the coevolution of multiple populations with randomly chosen control parameters balances global search and local search. Although the populations in MPGA evolve independently of each other, they are linked by the immigration operator (immigrant), which moves the best individual of each population into other populations every certain number of generations so that information is exchanged between populations. The immigrant is therefore crucial in MPGA. Without it, there is no contact between the populations, and MPGA is equivalent to running SGA many times with different control parameters, which does not solve the premature convergence problem. In every generation, the artificial selection operator (EliteIndividual) selects the best individual of each population and saves it in the elite population. Unlike the other populations, the elite population undergoes no genetic operation, which ensures the integrity of the best individuals. The elite population also provides the termination criterion of the algorithm: the run stops once the best individual has been kept for a set number of generations. This criterion is more reasonable than that of SGA, which terminates at a maximum number of generations.
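The interplay of independent populations, the immigrant operator, and the elite population can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's MATLAB implementation: the toy objective `onemax`, the ring migration topology, the replace-the-worst policy, and all function and parameter names are assumptions made for the example.

```python
import random

def onemax(bits):
    """Toy fitness (assumption for illustration): number of 1 bits."""
    return sum(bits)

def sga_step(pop, pc, pm, rng):
    """One SGA generation: roulette selection, single-point crossover, bit-flip mutation."""
    weights = [onemax(ind) + 1e-9 for ind in pop]  # small offset keeps weights positive
    parents = rng.choices(pop, weights=weights, k=len(pop))
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        a, b = a[:], b[:]                          # copy so parents are not modified
        if rng.random() < pc:
            cut = rng.randrange(1, len(a))
            a[cut:], b[cut:] = b[cut:], a[cut:]
        children += [a, b]
    for ind in children:
        for i in range(len(ind)):
            if rng.random() < pm:
                ind[i] ^= 1
    return children

def immigrate(pops):
    """Assumed ring migration: the best of population i replaces the worst of population i+1."""
    bests = [max(p, key=onemax)[:] for p in pops]
    for i, best in enumerate(bests):
        target = pops[(i + 1) % len(pops)]
        worst = min(range(len(target)), key=lambda j: onemax(target[j]))
        target[worst] = best[:]

def mpga(n_pops=4, pop_size=20, n_bits=30, keep_gens=10, max_gens=500, seed=0):
    rng = random.Random(seed)
    pops = [[[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
            for _ in range(n_pops)]
    # each population draws its own control parameters from the ranges in the text
    params = [(rng.uniform(0.7, 0.9), rng.uniform(0.001, 0.05)) for _ in range(n_pops)]
    elite, best_fit, kept, gens = None, -1, 0, 0
    while kept < keep_gens and gens < max_gens:
        pops = [sga_step(p, pc, pm, rng) for p, (pc, pm) in zip(pops, params)]
        immigrate(pops)
        # elite population: best of each population, never crossed or mutated
        elites = [max(p, key=onemax)[:] for p in pops]
        gen_best = max(elites, key=onemax)
        if onemax(gen_best) > best_fit:
            elite, best_fit, kept = gen_best, onemax(gen_best), 0
        else:
            kept += 1                              # best individual kept one more generation
        gens += 1
    return elite, best_fit, gens

best, fit, gens = mpga()
```

The loop stops when the best individual has been kept unchanged for `keep_gens` generations, which mirrors the termination criterion described above; `max_gens` is only a safety cap added for the sketch.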

Multiple Population Genetic Algorithm with Backpropagation (MPGA-BP).
As aforesaid, the BP neural network and the standard genetic algorithm (SGA) each have their own shortcomings. This study suggests using the multiple population genetic algorithm (MPGA) to solve the premature convergence problem of SGA and to optimize the weights and thresholds of the BP neural network so as to speed up network learning. The MPGA-BP model retains the advantages of the BP neural network, such as nonlinear mapping ability, self-learning and adaptive ability, and strong fault tolerance. It uses the genetic algorithm to overcome the shortcomings of the BP neural network, such as easily falling into local minima, slow convergence, and the difficulty of determining initial thresholds and weights. Meanwhile, it uses the concept of multiple populations to solve the premature convergence of the genetic algorithm itself while retaining its global search ability. The modeling steps of MPGA-BP generally include the following.
Step 1. Determine the topological structure of BP network (determine the number of weights and thresholds to be optimized).
Step 2. Genetic operation is based on each population's initialized code (put the norm of prediction error matrix as objective function; design the fitness function).
Step 3. Set crossover and mutation parameters for each population (introduce multiple population concept to genetic algorithm).
Step 4. Collect the best individual of various populations to form an elite population through immigration artificial selection operation (realize best individual with the optimal weights and thresholds of the network).
Step 5. Assign the best individual weights and threshold to the BP neural network.
Step 6. Get the final result through the network training, testing, and verifying.
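Steps 5 and 6 hand the decoded weights and thresholds to ordinary BP training. A minimal sketch of one BP update on a single sample is given below, assuming a sigmoid hidden layer, a linear output layer, and thresholds that are subtracted from the weighted sums. The function names, the network size, the learning rate, and the sample values are illustrative assumptions; random initial weights stand in for the values MPGA would supply.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, th_h, w_ho, th_o):
    """Forward pass: sigmoid hidden layer, linear output layer, thresholds subtracted."""
    v = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))) - th_h[j])
         for j in range(len(th_h))]
    y = [sum(w_ho[j][k] * v[j] for j in range(len(v))) - th_o[k]
         for k in range(len(th_o))]
    return v, y

def train_step(x, t, w_ih, th_h, w_ho, th_o, eta=0.1):
    """One gradient-descent update on a single sample; returns E before the update."""
    v, y = forward(x, w_ih, th_h, w_ho, th_o)
    # output error terms: the output function is linear, so its derivative is 1
    s_out = [t[k] - y[k] for k in range(len(y))]
    # hidden error terms use the old hidden-to-output weights; sigmoid' = v(1 - v)
    s_hid = [v[j] * (1.0 - v[j]) * sum(s_out[k] * w_ho[j][k] for k in range(len(y)))
             for j in range(len(v))]
    for j in range(len(v)):
        for k in range(len(y)):
            w_ho[j][k] += eta * s_out[k] * v[j]
    for k in range(len(y)):
        th_o[k] -= eta * s_out[k]
    for i in range(len(x)):
        for j in range(len(v)):
            w_ih[i][j] += eta * s_hid[j] * x[i]
    for j in range(len(v)):
        th_h[j] -= eta * s_hid[j]
    return 0.5 * sum(e * e for e in s_out)

# Hypothetical 3-4-1 network with random initial values in (-1, 1)
rng = random.Random(1)
n, l, m = 3, 4, 1
w_ih = [[rng.uniform(-1, 1) for _ in range(l)] for _ in range(n)]
th_h = [rng.uniform(-1, 1) for _ in range(l)]
w_ho = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(l)]
th_o = [rng.uniform(-1, 1) for _ in range(m)]
x, t = [0.2, -0.5, 0.9], [0.3]
errors = [train_step(x, t, w_ih, th_h, w_ho, th_o) for _ in range(200)]
```

In the hybrid model the only change is the starting point: the four parameter groups would be filled from the best individual found by MPGA instead of from `rng.uniform`.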

Set Parameters.
Set the size of the population as $N$. Then, an initial population can be generated randomly as follows:

$$X = (X_1, X_2, \ldots, X_N)^{T}. \tag{15}$$

Each individual in the population needs encoding because the solutions to be optimized in the initial population are the BP network's weights and thresholds. This study adopts the binary encoding method, which encodes every individual into a binary string. Each individual consists of four parts: the connection weights between the input layer and the hidden layer, the thresholds of the hidden layer, the connection weights between the hidden layer and the output layer, and the thresholds of the output layer. Each weight and threshold uses a $q$-bit binary code. Connecting the codes of all weights and thresholds forms an individual code string. Assume that the input layer contains $n$ nodes, the hidden layer contains $l$ nodes, and the output layer contains $m$ nodes; then the length of an individual code string is $S = (n \cdot l + l + l \cdot m + m) \cdot q$. Therefore each individual code string can be expressed as follows:

$$X_i = (g_{i1}, g_{i2}, \ldots, g_{iS}), \quad i = 1, 2, \ldots, N. \tag{16}$$
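The encoding can be made concrete with a short sketch that computes the string length and decodes a bit string back into the four parameter groups. The plain binary-to-real mapping onto [−1, 1] and the function names are assumptions for illustration; the paper states only that binary coding is used.

```python
def code_length(n, l, m, q):
    """Bits needed for all weights and thresholds of an n-l-m network, q bits each."""
    return (n * l + l + l * m + m) * q

def decode(bits, n, l, m, q, lo=-1.0, hi=1.0):
    """Split a bit list into q-bit fields and map each field linearly to [lo, hi]."""
    assert len(bits) == code_length(n, l, m, q)
    vals = []
    for s in range(0, len(bits), q):
        k = int("".join(str(b) for b in bits[s:s + q]), 2)
        vals.append(lo + (hi - lo) * k / (2 ** q - 1))
    a, b, c = n * l, n * l + l, n * l + l + l * m
    # order: input-hidden weights, hidden thresholds, hidden-output weights, output thresholds
    return vals[:a], vals[a:b], vals[b:c], vals[c:]

length = code_length(10, 13, 1, 10)          # the 10-13-1 network used later in the paper
parts = decode([0] * length, 10, 13, 1, 10)
```

For a 10-13-1 network with 10 bits per parameter, the string has 1570 bits and decodes into groups of 130, 13, 13, and 1 values.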

Design Fitness Function.
In the genetic algorithm, candidate solutions are represented by the individuals of the population, and survival of the fittest is realized through the fitness function. In this paper, the aim is to obtain the optimal weights and thresholds of the BP network, so each individual encodes the weights and thresholds of the BP network, and the norm of the error matrix is chosen as the target function. Because the target function may take positive or negative values, a fitness function must be designed. Based on the relationship between the fitness function and the target function, the fitness value must be nonnegative, and the direction in which the target function is optimized must be the direction in which fitness increases.
For the problem of minimizing the target function $f(x)$, a minus sign is added to transform it, in theory, into a maximization problem:

$$\mathrm{Fit}(f(x)) = -f(x). \tag{17}$$

For the problem of maximizing the target function, the fitness function can be set directly equal to the target function when the target function is always positive:

$$\mathrm{Fit}(f(x)) = f(x). \tag{18}$$

This paper mainly involves the minimization of the target function, so the fitness function is chosen as in formula (17).

Design Genetic Operators and Multiple Populations.
MPGA includes the genetic operations of SGA: selection, crossover, and mutation. The selection operation chooses excellent individuals from the old population with a certain probability to form a new population, with the goal of breeding the next generation. The crossover operation selects two individuals at random from the population and passes the parents' excellent genes to the offspring by exchanging and recombining the two chromosomes, generating new excellent individuals. The main purpose of the mutation operation is to maintain the diversity of the population.
In the MPGA model, each population has a different crossover probability $P_c$ and a smaller mutation probability $P_m$. The crossover probability $P_c$ is generated at random in [0.7, 0.9], while the mutation probability $P_m$ is generated at random in [0.001, 0.05]. Selection in each population uses roulette wheel sampling. The crossover operator is the simple single-point crossover operator, while the mutation operator is the discrete mutation operator. By introducing the migration operator (immigrant) and the artificial selection operator (EliteIndividual), the best individuals of the various populations form the elite population. The best individual finally obtained is the best solution of the error function, that is, the optimal weights and thresholds of the neural network. The calculation process of the multiple population genetic neural network is shown in Figure 4.

Data Set.
A series of observations including the water level, temperature, and aging of a gravity dam in China were used to build the MPGA-BP model. The crest length of the dam is 1080 m, the crest elevation is approximately 91.7 m, the upstream slope is 95%, the downstream slope is 78%, the profile is approximately triangular, and the total reservoir capacity is 11.5 billion cubic meters. An observation wire system has been set within the crest cable corridor. In this study, we examined the recorded data of monitoring point No. 16, which is located at the centre of the impervious reinforced concrete face of the overflow dam section, as shown in Figure 5. The test data set, built up over a period of ten years, consists of 489 pairs. These data reflect the change of displacement, water level, temperature, and aging. To avoid overfitting in BP network training, we use half of the data for training, a quarter for validation, and a quarter for prediction. By considering the dam characteristics in Section 2.1 and the available observations, we selected the ten variables listed in Table 1 as influential factors for forecasting dam deformation.
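To illustrate how the hydrostatic, temperature, and aging components of Section 2.1 turn into input factors, the sketch below builds one row of factors for a single observation. The exact ten variables of Table 1 are not reproduced in this text, so the particular choice here (three water level powers, five lagged mean temperatures, two aging terms), the scaling of the aging variable, and the function name are all assumptions for illustration only.

```python
import math

def hst_factors(H, lagged_temps, t, t0):
    """Return one row of input factors for a single observation.

    H            : reservoir water level above the foundation
    lagged_temps : mean air temperatures over preceding windows (assumed five values)
    t, t0        : order values of the observation day and the base day (t > t0)
    """
    theta = (t - t0) / 100.0            # assumed scaling of the aging variable
    water = [H, H ** 2, H ** 3]         # polynomial of order 3, as stated for gravity dams
    aging = [theta, math.log(theta)]    # assumed relation functions of the aging factor
    return water + list(lagged_temps) + aging

row = hst_factors(2.0, [10.0, 11.0, 12.0, 13.0, 14.0], 200, 0)
```

Each such row would be normalized and fed to the input layer, with the measured displacement as the target.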

Determine Hidden Layer Nodes.
The Kolmogorov theorem, namely, the continuous function representation theorem, ensures that any continuous function or mapping can be realized by a three-layer neural network. An empirical formula commonly used to choose the number of hidden layer nodes is

$$l = \sqrt{n + m} + a, \tag{19}$$

where $n$, $m$, and $l$ are the number of input layer nodes, the number of output layer nodes, and the number of hidden layer nodes, respectively, and $a$ is a constant in the range 1∼10.
In addition, the number of nodes can be found through a trial algorithm. An appropriate value is eventually obtained either by starting from a small number of nodes and gradually increasing it according to the change of the network output error, or by starting from a large number of nodes and gradually reducing it. This method consumes more time and effort because of the large number of calculations. Moreover, because of its strong randomness, it cannot quickly find a suitable number of hidden layer nodes.
This study combines the empirical formula and the trial algorithm to determine the number of nodes in the hidden layer. In this case, $n$ and $m$ are 10 and 1, so the range of $l$ is 5∼14 based on formula (19). We then use the gradual growth method and compare the mean square error (MSE) of the network verification results for different numbers of nodes. Each number of nodes (5∼14) is tested 10 times, and the average MSE values are compared. The test results are listed in Table 2. The MSE is computed as

$$\mathrm{MSE} = \frac{1}{K} \sum_{i=1}^{K} (y_i - t_i)^2, \tag{20}$$

where $y_i$ and $t_i$ are the network output value and the desired output of the $i$th validation sample and $K$ is the number of validation samples.
Through observing the relationship between the number of hidden layer nodes and the MSE value we concluded that if the number of hidden layer nodes were 13, the MSE value of network would be smaller.So we determine that the number of hidden layer nodes is 13 and the network structure is 10-13-1.
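The candidate range and the comparison criterion can be checked with a short sketch. Rounding the empirical formula up to the next integer is an assumption, chosen because it reproduces the stated range 5∼14 for a network with 10 inputs and 1 output; the function names are illustrative.

```python
import math

def hidden_node_range(n, m, a_min=1, a_max=10):
    """Candidate hidden node counts from l = sqrt(n + m) + a, rounded up (assumption)."""
    base = math.sqrt(n + m)
    return math.ceil(base + a_min), math.ceil(base + a_max)

def mse(outputs, targets):
    """Mean square error between network outputs and desired outputs."""
    return sum((y - t) ** 2 for y, t in zip(outputs, targets)) / len(outputs)

low, high = hidden_node_range(10, 1)
```

In the trial procedure, `mse` would be averaged over 10 training runs for each candidate between `low` and `high`, and the candidate with the smallest average would be kept.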

Sample Data Normalization.
Because the sample data (displacement, water pressure, temperature, and aging) differ in units and dimension, the sample values of the input and output factors must be normalized. The normalization retains the original nature of the sample data while mapping the values into [−1, 1]. The normalization formula is as follows:

$$x_{\mathrm{nor}} = \frac{2 (x - x_{\min})}{x_{\max} - x_{\min}} - 1, \tag{21}$$

where $x$, $x_{\max}$, and $x_{\min}$ are the original sample data, the maximum of the sample data, and the minimum of the sample data, and $x_{\mathrm{nor}}$ is the normalized sample data. The premnmx function can be used to achieve this process in MATLAB. For the network output value, the inverse of the above formula can be used to recover the sample data.
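A minimal Python sketch of min-max normalization to [−1, 1] and of its inverse is given below. It is an illustrative stand-in for the MATLAB function, not a port of it; the function names are assumptions, and the sketch assumes the data are not constant (otherwise the denominator is zero).

```python
def normalize(xs):
    """Map a list of values linearly onto [-1, 1]; also return the min and max used."""
    lo, hi = min(xs), max(xs)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in xs], lo, hi

def denormalize(xs_nor, lo, hi):
    """Invert normalize() using the stored minimum and maximum."""
    return [(x + 1.0) * (hi - lo) / 2.0 + lo for x in xs_nor]

vals, lo, hi = normalize([0.0, 5.0, 10.0])
back = denormalize(vals, lo, hi)
```

The minimum and maximum should come from the training data and be reused for validation and prediction samples so that all sets share one scale.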

Optimization Process of MPGA-BP.
Common encoding schemes include binary coding, Gray coding, and real coding. Binary coding is one of the most common. It not only facilitates the implementation of genetic operations such as crossover and mutation, but also makes encoding and decoding easy. This study uses a binary code built from the binary symbols {0, 1}.
Code string consists of connection weights of input layer and hidden layer, hidden layer threshold, connection weights of hidden layer and output layer, and output layer threshold.
As aforementioned, the length of an individual code string is $S = (n \cdot l + l + l \cdot m + m) \cdot q$, where $q$ is the number of binary digits per parameter, set as $q = 10$. So the length of an individual code string is $S = (10 \times 13 + 13 + 13 \times 1 + 1) \times 10 = 1570$. Then $N$ such individuals form a population, with $N = 100$. The range of the initial weights and thresholds is set to (−1, 1). In the MPGA, the number of populations is more than one and is set as MP = 10; the populations interact with one another. The fitness function judges whether an individual in a population is good or bad. We decode the weights and thresholds from the code string and substitute them into the BP neural network. By using the training samples to train the network and the testing samples to test it, the test error $E = \sum_i |t_i - y_i|$ can be obtained, where $t_i$ is the expected output of the $i$th node and $y_i$ is the predicted output of the $i$th node. To make the error between predicted and expected values as small as possible in BP network prediction, the error matrix norm is chosen as the objective function. Because the objective function is to be minimized, we simply add a minus sign. Therefore, the fitness function can be written as follows:

$$\mathrm{Fit} = -E. \tag{22}$$

In MATLAB, the Sheffield genetic algorithm toolbox has a rank-based fitness assignment function named ranking to achieve this goal: Fitn = ranking(−obj), where obj is the output of the objective function.
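Rank-based fitness assignment sidesteps the sign problem, because fitness depends only on the ordering of the objective values. The sketch below shows a simple linear ranking for a minimization problem. It is written in the spirit of rank-based assignment and is not the toolbox routine itself: the selective pressure of 2, the arbitrary handling of ties, and the function name are assumptions.

```python
def rank_fitness(obj, pressure=2.0):
    """Linear rank-based fitness for minimization.

    The individual with the smallest objective gets fitness `pressure`, the one
    with the largest gets 2 - pressure, and the rest are spaced evenly between.
    Requires at least two individuals; ties are broken by position (simplification).
    """
    n = len(obj)
    order = sorted(range(n), key=lambda i: obj[i], reverse=True)  # worst first
    fit = [0.0] * n
    for pos, i in enumerate(order):
        fit[i] = 2 - pressure + 2 * (pressure - 1) * pos / (n - 1)
    return fit

fitness = rank_fitness([0.5, 0.1, 0.9])
```

All fitness values are nonnegative whatever the sign or scale of the objective, which is the property the fitness design requires for roulette wheel selection.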
The optimization process of MPGA contains the inherent optimization operations of the standard genetic algorithm (SGA): selection, crossover, and mutation. The selection operation adopts the roulette wheel selection operator (RWS) in the Sheffield genetic algorithm toolbox. The crossover operation adopts the single-point crossover operator (xovsp). The crossover probability $P_c$ is drawn at random in the range [0.7, 0.9] for each randomly generated population. The mutation operation adopts the discrete mutation operator (mut). Like the crossover probability, the mutation probability is drawn at random in the range [0.001, 0.05] for each population. The parameters were set to default values (see Table 3) so as to obtain the best exploitation. Besides the SGA operations, MPGA optimization contains the migration operator and the artificial selection operator. The populations are relatively independent, but they are linked by the immigration operator (immigrant), which brings the best individual arising from genetic operation into other populations every certain number of generations. This process allows the exchange of information between populations. The best individual of each population is selected by the artificial selection operator (EliteIndividual) and preserved within the elite population. The minimum number of generations for which the best individual is preserved serves as the termination criterion, and the elite population undergoes no genetic operation. The above-mentioned operations effectively solve the premature convergence problem of SGA.

Modeling and Forecasting.
In this paper, we established the MPGA-BP model in MATLAB to train the sample data and forecast dam deformation. The error evolution curve (see Figure 6) demonstrates the relationship between error changes and genetic generations. The figure shows that the network error decreases overall as the number of generations increases. After 21 generations, the fitness is stable and gradually converges to 0.3781. The optimal initial weights and thresholds were then substituted into the BP neural network for training (see Figure 7). On the other hand, stepwise regression and an ordinary BP network with the same topological structure 10-13-1 were also carried out in this study for comparison with the MPGA-BP model. The same data were used for training and forecasting. Details of the investigation are listed in Table 4.
The comparison in Table 4 shows that the error of statistical regression is higher than that of the MPGA-BP model and the BP model. The reason is that, with limited data, a linear regression model can hardly fit a nonlinear relationship accurately. The prediction error of the MPGA-BP network model lies in [−1, 1], while that of the BP network model lies in [−1.8, 2.6]. Thus the MPGA-BP model is better than the BP model in overall accuracy. The standard deviation of the prediction error is 0.4778 for the MPGA-BP model and 0.7545 for the BP model, and the average relative error is 2.35% and 8.04%, respectively. These statistics indicate that the MPGA-BP model is much more stable than the BP model. Moreover, the number of network training iterations is 8 and 14, respectively, which means that the MPGA-BP model is superior to the BP network model in both prediction accuracy and convergence. To better compare the three models, the error matrix norm of the predicted and measured values of the test samples can be used. The norm results, obtained from MATLAB and SPSS, are given in Table 5.
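The comparison statistics named above can be computed as in the following sketch. The paper does not state which standard deviation (sample or population) or which norm it uses, so the sample standard deviation and the Euclidean norm here are assumptions, as are the function name and the toy numbers in the usage line.

```python
import math

def error_stats(measured, predicted):
    """Standard deviation of errors, mean relative error (%), and Euclidean error norm."""
    errs = [p - m for p, m in zip(predicted, measured)]
    mean = sum(errs) / len(errs)
    std = math.sqrt(sum((e - mean) ** 2 for e in errs) / (len(errs) - 1))
    rel = sum(abs(e / m) for e, m in zip(errs, measured)) / len(errs) * 100.0
    norm2 = math.sqrt(sum(e * e for e in errs))
    return {"std": std, "mean_rel_pct": rel, "norm2": norm2}

# Hypothetical values, not data from the paper
stats = error_stats([10.0, 20.0], [11.0, 18.0])
```

The relative error assumes that no measured value is zero; displacement series that cross zero would need a different normalization.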
The error matrix norms of the MPGA-BP model are smaller than those of the other two models in training, validation, and prediction, which shows from another aspect that the MPGA-BP model is superior to the other two models. Figure 8 shows the forecasting curves of the three models compared with the measured values. Likewise, Figure 9 compares the prediction error curves of the three models.
Figures 10 and 11 show the simulation results of the MPGA-BP model and its error distribution curve, respectively. Figure 10 illustrates that the trends of the predicted and measured values are consistent on the whole. The error distribution lies in the range [−1, 1]. Only at a few points in this interval is the fitting precision low and the error large. Individual points deviate because the BP network reflects the general pattern rather than any single individual.
To sum up, we suggest that the MPGA-BP model is far superior to the statistical regression model and the BP network model, as shown from three angles: prediction accuracy, convergence speed, and error matrix norm. This means that the MPGA-BP model built here is scientific and effective, and that it is feasible to predict gravity dam deformation with this model.

Conclusions
This study extends the methods of the genetic algorithm with backpropagation neural network (GA-BP) and the multiple population genetic algorithm (MPGA) to gravity dam deformation analysis. A hybrid multiple population genetic algorithm backpropagation (MPGA-BP) neural network algorithm has been proposed to cope with the premature convergence problem and the local extremum problem. A case study using the observations of a gravity dam in China has been presented and discussed to examine the performance of the proposed model. Compared with the statistical regression model and the traditional neural network algorithm, the prediction accuracy of this model is greatly improved. The applicability of MPGA-BP for the prediction of gravity dam deformation has been highlighted. The results show that the proposed model can be used as an excellent tool in place of those commonly used in gravity dam deformation analysis and prediction.

Figure 2: Calculation flow chart of BP neural network.
In Figure 7, the blue line represents the training process, while the black line represents the target error. The figure shows that network training terminated after 8 iterations. The training error of the sample is 0.000864965, which is less than the target of 0.001.

Figure 6: Error evolution process curve of MPGA-BP network.

Figure 8: Predicted values of three models compared with the measured values.
In the aging component $\delta_\theta$, $f_i(\theta)$ is the relation function of the aging factor; $t$ is the order value of the observation day; $t_0$ is the order value of the base day; $m_3$ is the number of aging factors; $c_0$ is the regression constant; $c_i$ is the regression coefficient; $c_0$ and $c_i$ are determined by regression analysis.
predetermined error values to adjust the weights between layers. This process repeats until the predetermined error value is reached. Assume that the input value of the neural network is $[x_1, x_2, \ldots, x_i, \ldots, x_n]^{T}$, the output value is $[y_1, y_2, \ldots, y_k, \ldots, y_m]^{T}$, and the target value is $t_k$; the hidden layer excitation function is denoted $f_1$.

Table 1: Variables used to forecast dam deformation.
The number of hidden layer nodes has a great impact on the generalization and the training speed of the network. If the number of nodes is too small, the nonlinear mapping ability of the network is low, the prediction accuracy is poor, and the network is weak. If the number of nodes is too large, the learning time is too long and training is slow. So an appropriate number of nodes is crucial to a strong network. At present, there is no clear theoretical guidance for selecting the number of hidden layer nodes; an empirical formula, formula (19), is commonly used in applications.

Table 2: Relationships between the number of hidden layer nodes and the MSE value.

Table 3: Parameters used in implementing MPGA-BP variable selection.

Table 4: Comparison of measured value under the conditions and predicted values for three models.

Table 5: The related parameters for three models.