Integrating the Finite Element Method with a Data-Driven Approach for Dam Displacement Prediction

Both numerical simulations and data-driven methods have been applied to dam displacement modeling. Data-driven methods based on monitored displacement, however, have rarely considered the physical mechanism or the structural correlations among monitoring points. To take the spatial and temporal correlations among all monitoring points into account, we took a first step toward integrating the finite element method into a data-driven model. As the data-driven method, we selected the random coefficient model, in which the coefficient of each explanatory variable across all monitoring points follows one or several normal distributions. In this way, the coefficients of the explanatory variables are constrained. Another contribution of the proposed model is that the actual elastic modulus at each monitoring point can be back-calculated. Moreover, with a Lagrange polynomial interpolation, we can obtain the distribution field of the elastic modulus, rather than a single value for the whole dam as in previous studies. The proposed model was validated by a case study of the concrete arch dam at the Jinping-I hydropower station. It achieves better prediction precision than the random coefficient model without the finite element method.


Introduction
For dams that have been running for years, their actual physical and mechanical properties often have considerable differences from the initial or design values, due to the effect of external environment. Hence, it is crucial to monitor and evaluate the dam's structural behaviors and its actual material properties. Using monitoring devices installed inside the dam, displacement, rotation, seepage, and stress can be measured, among which displacement is the most important [1]. Dam's displacement is not only affected by the external conditions (e.g., water load, temperature, and aging), but also depends on its material properties such as elastic modulus. Elastic modulus, the ratio of applied external stress to the elastic deformation of material, is usually difficult to measure; hence, it is often obtained by back-analysis based on monitored displacement data.
For dam displacement prediction, two approaches are commonly used: numerical simulation using the finite element method [2,3] and data-driven methods based on monitored displacement data [4,5]. The principle of numerical simulation is to solve the constitutive equations; the emphasis is on the physical mechanism governing the displacement [6,7]. Yet numerical simulation often has considerable inaccuracies induced by unavoidable simplifications and idealizations of the dam's structural and material properties. Moreover, the simulation results depend entirely on the initial parameter settings, without any calibration against the actual displacement. Many studies have improved the accuracy of the finite element method, especially when complex situations are involved [8][9][10]. However, there is still a gap in integrating numerical simulation with real-world monitored data and in avoiding the inaccuracies due to structural simplifications.
For data-driven methods, with the development of soft computing techniques, the modeling algorithms have progressed from simple linear regression methods [11][12][13] to advanced machine learning methods such as support vector machines (SVM) [14,15] and artificial neural networks (ANN) [16]. In recent years, many synthetic models that integrate several machine learning methods have also been proposed [17,18]. Unlike finite element methods, which simulate the dam's behavior based on the design structural and material properties, data-driven methods work with real-world monitored data. However, whereas the finite element method can simulate the displacement of an arbitrary point on the dam, a statistical model can only model the displacement at the monitoring points. In addition, the data recorded at each monitoring point have usually been modeled independently; the spatial and temporal correlations among the monitoring points have rarely been discussed, which may weaken the generalization ability and sometimes result in overfitting [19,20].
To account for spatial correlations among monitoring points in data-driven models, recent studies have integrated clustering models with statistical models: they first cluster the monitoring points into several groups based on one or several indicators and then constrain the coefficients of the explanatory variables in the statistical model based on the clustering results. Shao et al. clustered the monitoring points based on their spatial characteristics, namely, the distance from the monitoring point to the dam foundation [21]. Hu et al. provided a clustering model that considers temporal-spatial characteristics [22]. However, the constraint provided by these models is still not satisfactory, as they constrain each group rather than each monitoring point. The primary objective of the present study is to consider spatial correlations and provide constraints for each monitoring point rather than for each group.
With this objective in mind, we took advantage of the finite element method, which can simulate the deformation at any arbitrary point on the dam, to consider the structural correlations of each monitoring point. Specifically, we determined the water pressure component and the temperature component over the whole region of the dam with the finite element model, so that the ratio between the design and actual elastic modulus and the ratio between the design and actual coefficient of linear expansion are the only unknowns in these components of the proposed model. Then we constructed time-related variables in the aging component to consider the temporal correlations. The random coefficient model treats the monitoring points as a panel and assumes that the coefficients of all explanatory variables in the model follow a Gaussian distribution, which constrains the coefficients to a certain extent. By constraining the explanatory variables, the multicollinearity problem is mitigated, and the model becomes more stable and has better prediction ability. Another contribution of the proposed model is that the actual elastic modulus can be back-calculated for each monitoring point. Rather than obtaining one elastic modulus for the whole dam as in previous studies, we used a Lagrange polynomial interpolation to obtain a distribution field of the elastic modulus from the value at each monitoring point.
This article is organized as follows. Section 2 presents the development of the synthetic model that combines the finite element method with the random coefficient model. The engineering project and dataset are described in Section 3. The modeling results are presented in Section 4. Section 5 discusses the prediction precision of the proposed model and the distribution field of the elastic modulus. Concluding remarks complete the paper in Section 6.

Statistical Model.
In the statistical model, the concrete dam's displacement $\delta$ is commonly divided into three components: the water pressure component $\delta_H$, the temperature component $\delta_T$, and the aging component $\delta_\theta$ [23]. The dam's displacement is expressed as follows:
$$\delta = \delta_H + \delta_T + \delta_\theta.$$
The water pressure component $\delta_H$ can be simplified as a polynomial in the upstream water level [1]:
$$\delta_H = \sum_{i=1}^{m_1} a_i H^i,$$
where $H$ is the upstream water level, $a_i$ are the coefficients of the water pressure component, and $m_1$ is the polynomial order ($m_1 = 4$ for an arch dam). The coefficients $a_i$ are related to the dam height $h$, the downstream slope angle $m$, and the distance $d$ from the monitoring point to the dam foundation, which are constants for a given monitoring point. In addition, $a_i$ also depends on the dam's material properties, including the elastic modulus of the dam body $E_c$, Poisson's ratio of the dam body $\mu_c$, the elastic modulus of the foundation $E_r$, and Poisson's ratio of the foundation $\mu_r$. Poisson's ratios $\mu_c$ and $\mu_r$ have a negligible effect on the dam's displacement. Therefore, the elastic moduli of the dam body and the foundation, $E_c$ and $E_r$, are the only random coefficients in the water pressure component.

Temperature Component.
The temperature component $\delta_T$ is mainly composed of the displacement produced by the temperature variation of the dam body concrete and the dam foundation rock. When temperature data are lacking, a multiperiod harmonic function is used to describe the temperature component:
$$\delta_T = \sum_{i=1}^{m_3} \left[ b_{1i} \sin\frac{2\pi i t}{365} + b_{2i} \cos\frac{2\pi i t}{365} \right],$$
where $i$ indicates the period ($i = 1$ for the annual cycle and $i = 2$ for the half-year cycle); $b_{1i}$ and $b_{2i}$ represent coefficients; $t$ is the cumulative number of days from the initial value to the monitoring value; and $m_3$ is usually taken as 1 or 2.
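The multiperiod harmonic description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 365-day base period, the function name `harmonic_features`, and the column ordering are choices made for this example and are not taken from the original model.

```python
import numpy as np

def harmonic_features(t_days, m3=2, period=365.0):
    """Build the regressors sin(2*pi*i*t/period), cos(2*pi*i*t/period), i = 1..m3.

    Columns are ordered [sin_1, cos_1, sin_2, cos_2, ...] (an assumed layout).
    """
    t = np.asarray(t_days, dtype=float)
    cols = []
    for i in range(1, m3 + 1):
        angle = 2.0 * np.pi * i * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# Hypothetical usage: recover known coefficients from a synthetic series.
t = np.arange(0, 730, 5)                    # two years, one sample every 5 days
X = harmonic_features(t, m3=2)
b_true = np.array([1.5, -0.7, 0.3, 0.2])    # made-up b_11, b_21, b_12, b_22
b_hat, *_ = np.linalg.lstsq(X, X @ b_true, rcond=None)
```

With real monitoring data, the fitted coefficients would of course also absorb noise, so the exact recovery seen here holds only for the synthetic series.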

Aging Component.
Under long-term external loads, the dam produces inelastic displacement related to time; a fixed form of trend function in time is used to describe the aging component.
For the temperature component in the proposed model, $b_i^0$ represents the coefficients of the temperature component, $\alpha_c$ is the actual coefficient of linear expansion of the dam body, and $T$ is the environmental temperature. With a given design coefficient of linear expansion $\alpha_c^0$ and temperature $T$, we can obtain a numerical solution of the temperature component $\delta_T^0$ using the finite element method. By fitting the numerical solution $\delta_T^0$ against the temperature $T$, the coefficients $b_i^1$ of $T$ can be obtained as
$$\delta_T^0 = \sum_{i} b_i^1 T^i.$$
We then apply $b_i^1$ instead of $b_i^0$ in equation (9), and $\delta_T$ can then be expressed as
$$\delta_T = \psi \sum_{i} b_i^1 T^i = \psi\,\delta_T^0,$$
where $\psi = \alpha_c / \alpha_c^0$ is the ratio of the actual coefficient of linear expansion $\alpha_c$ to the design coefficient of linear expansion $\alpha_c^0$.
The aging component $\delta_\theta$ characterizes irreversible displacement caused by factors such as creep and fatigue of concrete. Because $\delta_\theta$ is difficult to express theoretically, we provide a formula that includes time and the monitoring point's coordinates as variables to describe its tendency:
$$\delta_\theta = \sum_{l=0}^{L} \sum_{m=0}^{M} \left( d_{1lm}\,\theta + d_{2lm} \ln\theta \right) x^l z^m,$$
where $x$ is the horizontal coordinate and $z$ is the vertical coordinate of the monitoring point, $\theta = t/100$, $t$ is the cumulative number of days from the initial value to the monitoring value, and $d_{1lm}$ and $d_{2lm}$ are unknown coefficients. Then, the radial displacement can be expressed as
$$\delta_i = \xi_i\,\delta_H^0 + \psi_i\,\delta_T^0 + \sum_{l=0}^{L} \sum_{m=0}^{M} \left( d_{1lm}\,\theta + d_{2lm} \ln\theta \right) x_i^l z_i^m + u,$$
where $i$ is the index of the measuring point, $u$ is the random error term, $\delta_H^0$ is the series of water pressure components calculated with the finite element method under groups of upstream water levels $H$, $\delta_T^0$ is the series of temperature components calculated with the finite element method under groups of temperatures $T$, and $\xi_i$, $\psi_i$, $d_{1lm}$, and $d_{2lm}$ are the unknown coefficients of all monitoring points, which can be estimated with the random coefficient model.
In the model, we can obtain the actual material parameters $E_c$ and $\alpha_c$ once $\xi$ and $\psi$ are calculated. In addition, $\xi_i \delta_H^0$ and $\psi_i \delta_T^0$ represent the displacement produced by water pressure and by temperature fluctuation, respectively; $\sum_{l=0}^{L} \sum_{m=0}^{M} (d_{1lm}\theta + d_{2lm}\ln\theta)\, x_i^l z_i^m$ represents the displacement produced by the aging behavior of the material, such as creep and deterioration of material strength.
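As a rough illustration of how the aging regressors and the combined displacement expression could be assembled, the sketch below builds the terms $\theta x^l z^m$ and $\ln\theta\, x^l z^m$ and adds the scaled finite element components. The function names, the default orders $L = M = 1$, and the ordering of the coefficient vector `d` are assumptions made for this example only.

```python
import numpy as np

def aging_features(t_days, x, z, L=1, M=1):
    """Columns theta*x^l*z^m and ln(theta)*x^l*z^m for l = 0..L, m = 0..M.

    theta = t/100 as in the text; t must be positive so that ln(theta) exists.
    """
    theta = np.asarray(t_days, dtype=float) / 100.0
    cols = []
    for l in range(L + 1):
        for m in range(M + 1):
            scale = (x ** l) * (z ** m)
            cols.append(theta * scale)          # multiplies d_1lm
            cols.append(np.log(theta) * scale)  # multiplies d_2lm
    return np.column_stack(cols)

def model_displacement(xi, psi, delta_H0, delta_T0, aging_X, d):
    """Radial displacement without the error term:
    xi * delta_H0 + psi * delta_T0 + aging terms."""
    return xi * np.asarray(delta_H0) + psi * np.asarray(delta_T0) + aging_X @ d

# Hypothetical usage for one monitoring point at (x, z) = (2, 3).
A = aging_features([100.0, 200.0], x=2.0, z=3.0)
delta = model_displacement(0.8, 1.1, [10.0, 20.0], [-2.0, -4.0], A, np.ones(8))
```

In the proposed model the coefficients are estimated jointly over all monitoring points; this sketch only evaluates the expression for given coefficient values.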

Solving with the Random Coefficient Model.
With the random coefficient model, the pending coefficients in equation (9) of all monitoring points can be estimated simultaneously, under the assumption that they obey an asymptotic normal distribution. The monitored displacement data are two-dimensional panel data containing a time-series dimension and a cross-section dimension. The data on one panel represent the displacement data at an indicated time.
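Advances in Civil Engineering
The regression coefficients of one panel can be expressed by the following formula:
$$y_{it} = \sum_{k=1}^{K} \beta_{ki}\, x_{kit} + u_{it} = \sum_{k=1}^{K} (\beta_k + c_{ki})\, x_{kit} + u_{it},$$
where $y_{it}$ represents the two-dimensional dam displacement data; $x_{kit}$ represents the two-dimensional data of the explanatory variables; $t$ is the time index; $i$ is the index of the section; $k$ is the index of the explanatory variables; the coefficient $\beta_{ki}$ is independent of time and can be decomposed into $\beta_k$ and $c_{ki}$; $\beta = (\beta_1, \ldots, \beta_K)'$ is the common mean coefficient vector; $c_i = (c_{1i}, \ldots, c_{Ki})'$ is the deviation of the individual from the common mean; and $u$ is the random error term. The objective is to obtain $\beta_{ki}$ from the two-dimensional data series $y_{it}$ and $x_{kit}$. According to the above analysis, we consider that the random coefficients $\beta_{ki}$ obey an asymptotic normal distribution. Here, $\beta_i = \beta + c_i$ is considered as a random coefficient with a common mean value $\beta$; Swamy gave the following assumptions [24]. In the equations below, the prime at the upper right corner of a symbol represents the transposition operation:
$$E(c_i) = 0, \qquad E(c_i c_j') = \begin{cases} \Delta, & i = j, \\ 0, & i \neq j, \end{cases} \qquad E(x_{it}\, c_j') = 0, \qquad E(u_i u_j') = \begin{cases} \sigma_i^2 I_T, & i = j, \\ 0, & i \neq j. \end{cases}$$
By integrating the observation data, the matrix equation can be obtained as follows:
$$y = X\beta + \tilde{X} c + u,$$
where $y = (y_1', \ldots, y_N')'$, $X = (X_1', \ldots, X_N')'$, $\tilde{X} = \mathrm{diag}(X_1, \ldots, X_N)$, $c = (c_1', \ldots, c_N')'$, and $u = (u_1', \ldots, u_N')'$; $N$ is the number of panels and $T$ is the amount of data in the panel. The covariance matrix of the compound error term $\tilde{X} c + u$ is block diagonal, and the $i$-th diagonal block is $\Phi_i = X_i \Delta X_i' + \sigma_i^2 I_T$ (written here as $\Phi_i$ to avoid confusion with the expansion ratio $\psi_i$). Under Swamy's assumptions, the ordinary least squares estimator of $\beta$ is unbiased but not efficient; when $X'X/NT$ converges to a nonzero constant matrix, it is also consistent. The generalized least squares estimator $\beta_{GLS}$ is the best linear unbiased estimator of $\beta$:
$$\beta_{GLS} = \left( \sum_{i=1}^{N} X_i' \Phi_i^{-1} X_i \right)^{-1} \left( \sum_{i=1}^{N} X_i' \Phi_i^{-1} y_i \right).$$
The variance of the estimator can be written as
$$\mathrm{Var}(\beta_{GLS}) = \left( \sum_{i=1}^{N} X_i' \Phi_i^{-1} X_i \right)^{-1}.$$
Using the least squares estimates for each section, $\hat{\beta}_i = (X_i' X_i)^{-1} X_i' y_i$, their mean $\bar{\beta} = N^{-1} \sum_{i=1}^{N} \hat{\beta}_i$, and the residuals $\hat{u}_i = y_i - X_i \hat{\beta}_i$, we can obtain the unbiased estimators of $\sigma_i^2$ and $\Delta$:
$$\hat{\sigma}_i^2 = \frac{\hat{u}_i' \hat{u}_i}{T - K}, \qquad \hat{\Delta} = \frac{1}{N-1} \sum_{i=1}^{N} (\hat{\beta}_i - \bar{\beta})(\hat{\beta}_i - \bar{\beta})' - \frac{1}{N} \sum_{i=1}^{N} \hat{\sigma}_i^2 (X_i' X_i)^{-1}.$$
However, the $\hat{\Delta}$ obtained from this estimator might not be nonnegative definite. In this case, we chose
$$\hat{\Delta} = \frac{1}{N-1} \sum_{i=1}^{N} (\hat{\beta}_i - \bar{\beta})(\hat{\beta}_i - \bar{\beta})'.$$
Thus, even though the estimator is not unbiased, it is always nonnegative definite. Moreover, when $T \to \infty$, the estimate is consistent. Therefore, $\beta_{GLS}$ obeys an asymptotic normal distribution and is a valid estimator of $\beta$.
Figure 1 presents the flowchart of the proposed synthetic model, which integrates the finite element method into a random coefficient model based on the statistical model. The cross-section displacement data are analyzed as a panel in the random coefficient model, and the independent random coefficients $\xi_i$ and $\psi_i$ obey normal distributions. In addition, the proposed model can derive the actual elastic modulus and the actual coefficient of linear expansion at each monitoring point, which are dominant factors representing the dam's properties.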
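The estimation procedure of this section can be sketched with NumPy as follows. This is a minimal illustration of the standard Swamy random coefficient estimator, written under the assumption that every unit shares the same regressors and has more observations than coefficients; the function name `swamy_rcm` and the returned tuple are choices made for this example, not the implementation used in the study.

```python
import numpy as np

def swamy_rcm(X_list, y_list):
    """Swamy-type random coefficient estimate for N units with K shared regressors.

    Returns the GLS mean coefficients, the estimated coefficient covariance,
    the variance of the GLS estimate, and the unit-by-unit OLS estimates.
    """
    n_units = len(X_list)
    n_coef = X_list[0].shape[1]
    b = np.empty((n_units, n_coef))            # unit-by-unit OLS estimates
    V = np.empty((n_units, n_coef, n_coef))    # sigma_i^2 * (X_i'X_i)^-1
    for i, (X, y) in enumerate(zip(X_list, y_list)):
        n_obs = X.shape[0]
        xtx_inv = np.linalg.inv(X.T @ X)
        b[i] = xtx_inv @ X.T @ y
        resid = y - X @ b[i]
        sigma2 = resid @ resid / (n_obs - n_coef)
        V[i] = sigma2 * xtx_inv
    dev = b - b.mean(axis=0)
    S = dev.T @ dev / (n_units - 1)
    delta = S - V.mean(axis=0)                 # unbiased estimate of Delta
    if np.linalg.eigvalsh(delta).min() < 0.0:
        delta = S                              # nonnegative definite fallback
    # GLS mean as a matrix-weighted average of the unit-level estimates.
    weights = np.array([np.linalg.inv(delta + V[i]) for i in range(n_units)])
    var_beta = np.linalg.inv(weights.sum(axis=0))
    beta = var_beta @ np.einsum("ijk,ik->j", weights, b)
    return beta, delta, var_beta, b
```

The weighted-average form used in the last lines is algebraically equivalent to the generalized least squares estimator, which avoids forming the T-by-T covariance blocks explicitly.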

Case Study
We selected the concrete dam at the Jinping-I hydropower station as an example, which is the tallest arch dam in the world at present with a height of 305 m (the elevation of the dam crest is 1885 m and the elevation of the dam foundation is 1580 m). It is a double-curved arch dam with a crest width of 16 m and a reservoir volume of $474 \times 10^{6}$ m$^3$. The dam is located in the southwest of China on the Yalong River (see Figure 2). The construction of the dam's body was completed in June 2013. The displacement of an arch dam consists of three parts: vertical, tangential, and radial displacement, among which radial displacement is the most important indicator for evaluating the health of a concrete arch dam. Hence, we took the radial displacement as an example in this study. In order to monitor the radial displacement, a plumb system consisting of 40 direct plumb meters and 30 reversed plumb meters was installed in the selected dam. The structures of the plumb meters are illustrated in Figure 3. For the direct plumb meter, the anchor is fitted on the dam crest with a block; for the reversed plumb meter, the anchor is buried in the bedrock of the dam foundation. The model of plumb meter used in this project is the CCD 50 manufactured by Xi'an Huateng Co., Ltd., with a measurement range of 0-0.1 m, a resolution of $0.1 \times 10^{-4}$ m, and a measurement accuracy of $0.1 \times 10^{-3}$ m.
In this study, we selected a dataset of 24 monitoring points that were recorded during the period from June 16, 2013, to August 25, 2015. There are 26 sections from the left bank to the right bank and 6 vertical lines (5#, 9#, 11#, 13#, 16#, and 19#) in the dam. The 24 monitoring points are located on the 6 indicated vertical lines (see Figure 4).
Once the dataset had been selected, we first eliminated erroneous records and obtained 274 validated data samples for each monitoring point. As shown in Figure 5, the temperature evolution has a markedly regular periodicity in time. The upstream water level is manually controlled depending on the water resources allocation of the whole river basin. The water level in June 2013 was fairly low, as the dam had just been constructed and put into operation. Since then, the water level has gradually been raised to a normal level. Figure 6 shows the time evolution of the monitored radial displacement for all the monitoring points. We classified the monitoring points by their spatial distribution (i.e., the vertical lines where they are located). Comparing Figures 5(a) and 6, it is notable that the general trend of the monitored radial displacement is approximately in line with the upstream water level. Another noticeable trend is the similarity in radial displacement for monitoring points on one vertical line and for adjacent monitoring points, which means that the displacement has a strong structural correlation.

Figure 1: Flowchart of the proposed synthetic model (steps shown: construct the 3D finite element model; calculate $\delta_H$ and $\delta_T$ with the finite element method; obtain coefficients).

Coefficient Determination Using the Finite Element Model.
In order to quantify the spatial distribution of displacement of all monitoring points, we first established a three-dimensional finite element model based on the actual geological characteristics of the dam (see Figure 7). The three-dimensional model includes three parts: the dam body, the dam foundation, and the surrounding mountains; the dam body was discretized into 38537 elements and 31941 nodes. The boundary of the surrounding mountains was set to 2-3 times higher than the dam body in all directions (x, y, and z). Figure 8 exhibits the radial displacement simulated by the finite element method under three water levels (1700 m, 1780 m, and 1880 m). We can see that the displacements in all these subfigures are approximately horizontally symmetric, with the displacement at the midline being more significant than in the border areas. In addition, as the upstream water level increases, the vertical coordinate of the center of the minimum displacement region moves from the bottom toward the dam crest.
After the radial displacement field on the cross section had been simulated, we obtained the displacement at each monitoring point under the six selected upstream water levels. The simulated radial displacement of all the monitoring points versus upstream water level is illustrated in Figure 9. For all monitoring points, the radial displacement increases sharply as the upstream water level rises.
We then calculated the coefficients $a_i^1$ (introduced in equation (8) in Section 2) in the water pressure component by ordinary least squares (OLS) regression based on the simulated radial displacement. For an arch dam, $i = 1, 2, 3, 4$. The coefficients $a_1^1$, $a_2^1$, $a_3^1$, and $a_4^1$ for each monitoring point are listed in Table 1.
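The OLS step for the water pressure coefficients might look like the sketch below. It assumes a fourth-order polynomial without a constant term and, purely for numerical conditioning, rescales the water level by the foundation elevation (1580 m) and the dam height (305 m); this rescaling, the function names, and the sample water levels are assumptions of this example rather than details reported in the text.

```python
import numpy as np

def fit_water_pressure(H, delta_H0, degree=4, H_ref=1580.0, H_scale=305.0):
    """Fit delta_H0 = sum_{i=1..degree} a_i * h^i with h = (H - H_ref) / H_scale."""
    h = (np.asarray(H, dtype=float) - H_ref) / H_scale
    A = np.column_stack([h ** i for i in range(1, degree + 1)])
    a, *_ = np.linalg.lstsq(A, np.asarray(delta_H0, dtype=float), rcond=None)
    return a

def water_pressure_component(H, a, xi=1.0, H_ref=1580.0, H_scale=305.0):
    """Evaluate xi * sum_i a_i * h^i for given coefficients and modulus ratio xi."""
    h = (np.asarray(H, dtype=float) - H_ref) / H_scale
    return xi * sum(a_i * h ** (i + 1) for i, a_i in enumerate(a))

# Hypothetical usage: six water levels and a made-up simulated response.
H_levels = np.array([1700.0, 1740.0, 1780.0, 1820.0, 1860.0, 1880.0])
a_true = np.array([1.0, 2.0, 3.0, 4.0])
simulated = water_pressure_component(H_levels, a_true)
a_hat = fit_water_pressure(H_levels, simulated)
```

Because the rescaled coefficients differ from coefficients fitted on the raw water level, values obtained this way are not directly comparable with a table fitted on unscaled data.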

Coefficients for Temperature Component.
Similar to the calculation of the coefficients in the water pressure component, and because the temperature of the Jinping region varies from 4°C to 24°C in the real world, we selected six temperatures within this range: 4°C, 8°C, 12°C, 16°C, 20°C, and 24°C. We simulated the radial displacement under these temperatures using the finite element method and then obtained the relation between the simulated radial displacement and the indicated temperatures. As shown in Figure 10, in contrast with the water load impact, the radial displacement declines steadily as the temperature rises.
By fitting the simulation results, we obtained the coefficients $b_i^1$ ($i = 1, 2$) in equation (11) (see Section 2.1) for each monitoring point. It should be noted that the displacement curves in Figure 10 never pass through the origin of the coordinate axes, which implies the necessity of adjusting equation (11) by adding a constant term. Table 2 shows the fitting results of the coefficients $b_i^1$ and the constant term.
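A fit of the temperature response with the added constant term could be sketched as follows. The quadratic form with $i = 1, 2$ follows the text; the function name and the sample displacement values are hypothetical.

```python
import numpy as np

def fit_temperature(T, delta_T0):
    """Least-squares fit of delta_T0 = b0 + b1*T + b2*T^2; returns [b0, b1, b2]."""
    T = np.asarray(T, dtype=float)
    A = np.column_stack([np.ones_like(T), T, T ** 2])
    coef, *_ = np.linalg.lstsq(A, np.asarray(delta_T0, dtype=float), rcond=None)
    return coef

# The six simulated temperatures from the text, with a made-up response.
T_levels = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0])
made_up_response = 3.0 - 0.5 * T_levels + 0.01 * T_levels ** 2
b_const, b_1, b_2 = fit_temperature(T_levels, made_up_response)
```

Dropping the first column of `A` would reproduce the original form without a constant, which cannot match curves that do not pass through the origin.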

Fitting Using the Random Coefficient Model.
After the coefficients in the water pressure component and the temperature component had been obtained, we fitted the monitoring data based on equation (13) using the random coefficient model. The coefficients of the aging component, $d_{1lm}$ and $d_{2lm}$, for each monitoring point were fitted by the random coefficient model (see Table 3). Meanwhile, we can obtain the ratio between the design and actual elastic modulus $\xi$ and the ratio between the actual and design coefficient of linear expansion $\psi$; then we can derive the actual elastic modulus $E_c$ and the actual coefficient of linear expansion $\alpha_c$ (see Table 4). The elastic modulus reflects the resistance of an object to elastic deformation when a stress is applied. The coefficient of linear expansion represents how much the length changes when the material is heated or cooled; namely, the length changes by an amount proportional to the original length and the change in temperature. The elastic modulus may serve to evaluate the dam's running status from the viewpoint of the material.
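The back-calculation from the fitted ratios is a one-line conversion. The sketch below assumes that $\xi$ is the design modulus divided by the actual modulus and that $\psi$ is the actual expansion coefficient divided by the design one, as the wording above suggests; the function name and the numerical inputs are hypothetical.

```python
def back_calculate(xi, psi, E_design, alpha_design):
    """Recover actual material parameters from the fitted ratios.

    Assumes xi = E_design / E_actual and psi = alpha_actual / alpha_design.
    """
    E_actual = E_design / xi
    alpha_actual = psi * alpha_design
    return E_actual, alpha_actual

# Hypothetical example: a design modulus of 24 GPa and a fitted xi below 1
# give an actual modulus above the design value.
E_c, alpha_c = back_calculate(xi=0.8, psi=1.1, E_design=24.0, alpha_design=1.0e-5)
```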
Table 4 shows that $E_c$ depends on the spatial location of the monitoring points: the minimum is 14.89 GPa at monitoring point PL9-3, and the maximum is 32.51 GPa at IP16-1. We provide a detailed discussion in Section 5.2.
With the random coefficient model, the radial displacement of all the monitoring points can be modeled simultaneously. The dataset was divided into two groups: 80% of the data were selected as fitting data, and the other 20% were selected to validate the model (validation data). We developed the prediction model with the fitting data.
Then, we used the validation data to evaluate the performance of the model. Taking the monitoring points located on vertical line 5# as an example, the fitting and predicting results obtained from the synthetic model are shown in Figure 11. Both the fitting results (left side of the line) and the validating results (right side of the line) agree well with the monitored data.
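The 80/20 division can be sketched as a chronological split, which is an assumption suggested by the description of Figure 11 (fitting period on the left, validation period on the right) rather than something stated explicitly; the function name is hypothetical.

```python
def chronological_split(series, fit_fraction=0.8):
    """Split a time-ordered series into an early fitting part and a late validation part."""
    n_fit = int(len(series) * fit_fraction)
    return series[:n_fit], series[n_fit:]

# With the 274 validated samples per monitoring point mentioned in the text:
samples = list(range(274))
fit_part, validation_part = chronological_split(samples)
```

Under this assumption, 219 samples would be used for fitting and 55 for validation at each monitoring point.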
In order to validate the synthetic model, we calculated the correlation coefficient $R$ and the residual standard deviation $s$ for the data series of all the monitoring points. Here, $R$ is the correlation coefficient between the monitored and fitted displacement series, and $s$ is the standard deviation of the residuals between them. Usually, a model can be considered valid once the correlation coefficient $R$ is above 0.9. As exhibited in Table 5, $R$ for all the monitoring points is higher than 0.987, which means that the fitting results and the monitoring data are in quite good agreement. The residual standard deviation $s$ lies in the range of 0.186 to 1.550. These results lead us to the conclusion that the model performed well in fitting the radial displacement of the concrete dam.
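The two indicators can be computed as in the sketch below. The Pearson form of $R$ is standard; the denominator used for $s$ (number of samples minus the number of fitted parameters minus one) is an assumption, since conventions for the residual standard deviation differ, and the function name is hypothetical.

```python
import numpy as np

def correlation_and_residual_std(observed, modeled, n_params=0):
    """Return the Pearson correlation R and the residual standard deviation s.

    s uses the denominator n - n_params - 1 (an assumed convention).
    """
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    R = np.corrcoef(obs, mod)[0, 1]
    resid = obs - mod
    s = np.sqrt(np.sum(resid ** 2) / (len(obs) - n_params - 1))
    return R, s

# Hypothetical usage with made-up monitored and modeled values.
R, s = correlation_and_residual_std([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9])
```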

Improvement of Prediction Precision.
Here, we selected the correlation coefficient $R$ as the indicator to evaluate the prediction accuracy of the proposed model and two previous models. One is the most commonly used statistical model based on ordinary least squares (OLS) regression. The other is the Clustering-RCM model, which combines the random coefficient model with a clustering model. Compared with these models, the number of explanatory variables in the proposed model is reduced further. Hence, the coefficient estimates are more stable and the prediction ability is better in the proposed model. Figure 12 compares the correlation coefficients $R$ of the three models on the validation dataset of the 24 monitoring points; the proposed model had the highest $R$ at 14 monitoring points, and the Clustering-RCM model had the highest $R$ at 7 monitoring points. For IP16-1, PL19-3, PL5-2, and PL16-2, the difference in $R$ between the proposed model and the Clustering-RCM model is smaller than 0.001. In summary, the proposed model has better prediction precision than the Clustering-RCM model, and the Clustering-RCM model performs better than OLS regression.

Back-Analysis of Elastic Modulus.
From the actual elastic modulus $E_c$ at each monitoring point presented in Table 4, we can derive the field distribution of $E_c$ for the whole dam section using a Lagrange polynomial interpolation (see Figures 13(a) and 13(b)). It is striking that the actual elastic modulus at a point depends strongly on its spatial location on the dam section. In particular, the actual elastic modulus is small in the central area and large in the border area. The elastic modulus is a material property that varies during operation owing to the external environment and time. It is usually obtained by back-calculation based on the monitored displacement and the corresponding upstream water level, which is roughly written as $E_c = f(H)/\delta_H$, where $f(H)$ is an expression of the upstream water level. One reason for the regularity of the elastic modulus distribution is that the displacement $\delta_H$ near the foundation is limited owing to the strong restraint of the dam foundation. Taking the annual radial displacement in 2014 as an example, the displacement in the central area is also higher than in the border area (see Figure 13(c)). Hence, the back-calculated elastic modulus of the concrete in the region surrounding the dam foundation is higher than that in the dam center.
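The interpolation idea can be illustrated in one dimension, for example along a single vertical line of monitoring points. The sketch below evaluates the Lagrange polynomial through a set of nodes; extending it to a two-dimensional field over the dam section (for instance, by a tensor-product construction) is not shown, and the function name and sample values are hypothetical.

```python
def lagrange_interpolate(x_nodes, y_nodes, x):
    """Evaluate at x the Lagrange polynomial passing through (x_nodes[j], y_nodes[j])."""
    total = 0.0
    n = len(x_nodes)
    for j in range(n):
        basis = 1.0
        for k in range(n):
            if k != j:
                basis *= (x - x_nodes[k]) / (x_nodes[j] - x_nodes[k])
        total += y_nodes[j] * basis
    return total

# Made-up nodes lying on 1 + x + x^2; the interpolant reproduces that polynomial.
value = lagrange_interpolate([0.0, 1.0, 2.0], [1.0, 3.0, 7.0], 1.5)
```

High-order Lagrange interpolation can oscillate between widely spaced nodes, so interpolated values far from any monitoring point should be read with caution.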
In Figure 13(d), the design elastic moduli of the three zones A, B, and C are 24 GPa, 23.5 GPa, and 23 GPa, respectively. First, it is noted that the actual elastic modulus deviates from the design value over most areas of the dam. Moreover, the actual elastic modulus in the central areas is commonly smaller than the design value, whereas the values near the foundation are larger than the design value. It is inevitable that each material may have a different deterioration rate under external load. Overall, the distribution of the difference between the actual and design elastic modulus roughly agrees with the spatial location and the deformation under external load.

Limitations.
The proposed method can only be applied to projects with known topography data and design data, including the material parameters of the dam, the dam foundation, and the mountains, as well as the physical dimensions.

Another issue that should be noticed is that the correctness of data at all monitoring points should be guaranteed. With random coefficient model, all data series are analyzed simultaneously, so the measurement errors in one monitoring point may affect the accuracy of other monitoring points.

Conclusions
In this article, we presented a synthetic model that integrates the finite element method into a random coefficient model. The primary objective is to take the structural correlations among the monitoring points into account in a statistical model.
With this objective in mind, we first used the finite element method to simulate the water pressure component and the temperature component of displacement at each monitoring point, so as to constrain the spatial correlations among monitoring points. Then, with the random coefficient model, the explanatory variable coefficients of all monitoring points are calculated, obeying an asymptotic normal distribution. Finally, we selected the concrete arch dam at the Jinping-I hydropower station as an example to validate the proposed model. The main results can be summarized in two aspects.
First, with the proposed model, the modeled displacement data agreed well with the monitored data. We found that the proposed model performed better than a combined model of the random coefficient model and a Gaussian mixture clustering model, because the finite element model can provide more rigorous constraints than clustering methods for each monitoring point.
Second, using a Lagrange interpolation, we can obtain the distribution field of the actual elastic modulus based on the actual elastic modulus at each monitoring point. The field distribution of the actual elastic modulus deviates from the design value owing to the combined effect of external loads. In addition, with the help of the finite element method, the distribution field of the displacement on the basis of monitored data can also be obtained.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.
Figure 13: Maps of (a) field distribution of the actual elastic modulus $E_c$, (b) actual elastic modulus $E_c$, (c) annual radial displacement in 2014, and (d) zoning designation of concrete material [26].