A Coupled Model for Dam Foundation Seepage Behavior Monitoring and Forecasting Based on Variational Mode Decomposition and Improved Temporal Convolutional Network



Introduction
The concrete dam is one of the main types of high dam in the world. Among high dams above 200 m, concrete dams account for more than 60%. Therefore, it is of great significance to ensure the safety of concrete dams. The behavior of dam foundation seepage is one of the key factors affecting the safety and stability of concrete dams [1][2][3][4]. On the one hand, the uplift pressure generated by seepage acts directly on the dam foundation and is one of the important unfavorable loads affecting the structural stability of the gravity dam; on the other hand, under the long-term effect of seepage, the joint fissures of the rock mass around the dam foundation may transform into weak interlayers [4][5][6]. The seepage pressure is an important manifestation of dam foundation seepage, so it is of great significance to establish a high-performance prediction model for concrete dam foundation seepage pressure [7,8]. However, because the real environment of dam foundation seepage is complex, many influencing factors cause the seepage pressure to show obvious nonstationary and nonlinear characteristics [9][10][11][12]. Furthermore, the seepage pressure data may be contaminated by noise [13]. All of the above pose challenges to accurate prediction [14,15]. Thus, it is necessary to develop new methods that improve the performance of dam foundation seepage pressure prediction models.
Under the action of environmental factors, external loads, and other factors, the original monitoring data inevitably suffer from a certain amount of noise interference [16][17][18][19]. For dam monitoring data contaminated by noise, traditional denoising methods include wavelet analysis, empirical mode decomposition (EMD), seasonal-trend decomposition based on Loess (STL), and so on. For example, Li et al. [20] used the STL method to decompose the horizontal displacement monitoring sequence of a concrete dam and denoised the signal by maximizing the signal-to-noise ratio. The noise components of a measured signal are generally located in its high-frequency portion. However, wavelet analysis only further decomposes the low-frequency part of the signal and does not continue to decompose the high-frequency part. For nonstationary signals in which the useful signal is drowned by noise, the wavelet threshold denoising method is not ideal [21]. EMD is prone to end effects and mode mixing. When these methods are applied, denoising also loses the detailed characteristics of the high-frequency portion of the signal. Therefore, it is necessary to find a more refined denoising method.
Wavelet packet analysis is an extension of wavelet analysis that also decomposes the high-frequency part of the signal, in a more refined way. The extraction effect of high-frequency information is better in wavelet packet analysis than in wavelet analysis under the condition of low frequency [22]. Variational mode decomposition (VMD) is an adaptive and completely nonrecursive signal decomposition method proposed by Dragomiretskiy and Zosso in 2014. This method has a solid foundation in mathematical theory, overcomes the end effect and mode mixing defects of the EMD method, and has better noise robustness.
Consequently, this paper combines the advantages of VMD and wavelet packet analysis. Firstly, the contaminated signal is decomposed into several components by VMD. Secondly, the components that contain more noise are denoised by the wavelet packet threshold method. Finally, the components are reconstructed to obtain the denoised signal.
In terms of the prediction of dam monitoring quantities, statistical model methods such as stepwise regression were used first. However, their index selection is subjective, and their results are not good for high-dimensional nonlinear data [3,[23][24][25]. Later, the emergence and development of machine learning techniques provided a new way to predict dam monitoring quantities. For example, Malekloo et al. [22] used the ELM algorithm to build an accurate and easy-to-train gravity dam displacement monitoring model. Su et al. [26] combined rough set theory and support vector machine theory to obtain the relationship between dam safety operation and influencing factors. Classical machine learning methods work well on small sample sets; however, on large sample sets they frequently converge slowly and easily fall into local optima.
As a branch of machine learning, deep learning has a good capability for feature extraction and data fitting [17,18,27,28]. It has been widely used in image, speech, and natural language processing [29,30], and it mainly includes several major types of model structures such as the convolutional neural network (CNN) and the recurrent neural network (RNN). The prediction of dam monitoring quantities is a typical time series prediction problem [31,32]. In this respect, the most commonly used deep learning algorithms are the RNN and its variants such as the LSTM. For example, Wei et al. [33] established RNNs and trained them as dynamic predictors of landslide displacement using a training algorithm named reservoir computing. The biggest weakness of RNN methods (the RNN and its improved versions) is that the limited number of network layers (usually 2-3 layers) can easily cause overfitting. Consequently, the extraction of information is not concise enough, which leads to long processing times for large-scale data. Therefore, it is necessary to find and study new prediction methods.
The temporal convolutional network (TCN) is a new deep learning algorithm for time series prediction problems [34]. It combines the best practices extracted from CNNs and merges the advantages of the traditional CNN and RNN models: like a CNN, it processes data in parallel to extract key information, and like an RNN, it processes time series data with a certain memory. Its performance on a variety of tasks and data sets is comparable to, or even exceeds, that of RNN models. At present, the TCN has been applied successfully in many fields, such as speech recognition and machine translation, but it has not yet been used in the field of dam engineering.
This paper proposes an improved temporal convolutional network (ITCN) model suitable for predicting dam foundation seepage pressure data. A hysteresis experiment is carried out to obtain the optimal model structure by correlating the receptive field size of the ITCN model with the hysteresis of the dam foundation seepage pressure. The optimal ITCN prediction model of each measuring point is then obtained after the training process. To evaluate the effectiveness of the proposed model, three benchmark methods are adopted: two deep learning models, namely the RNN and its variant the LSTM, which are common in the field of time series prediction, and the stepwise regression statistical model, which is commonly used in traditional dam engineering. The MSE, RMSE, MAE, and MAPE are used as the evaluation indicators of prediction accuracy. The verification results confirm the flexibility and high prediction accuracy of the proposed model.
The rest of the study is organized as follows. Section 2.1 introduces the basic principle and specific method of the VMD-wavelet packet denoising. Section 2.

Methodology
where $A_k(t)$ is the instantaneous amplitude of $u_k(t)$. The bandwidth of each IMF component is estimated in three steps: the Hilbert transform is applied to each IMF component to obtain its analytic signal, the analytic signal is shifted to its corresponding baseband, and the L2 norm of the gradient of the shifted signal is calculated (Gaussian smoothness). We can then construct a constrained variational problem: the sum of the bandwidths of the modal components is minimized, subject to the condition that the sum of the modal components is equal to the input signal $f$.

Variational Problem Solving
By introducing both a quadratic penalty term and Lagrangian multipliers, the constrained problem is turned into an unconstrained one. Then the alternating direction method of multipliers (ADMM) is used to turn the problem into a series of suboptimization problems. The approximate solutions of the IMF component $u_k$, the center frequency $\omega_k$, and the Lagrangian multiplier $\lambda(t)$ are updated iteratively in the frequency domain, where $\hat{f}(\omega)$, $\hat{u}_i(\omega)$, $\hat{\lambda}^{n+1}(\omega)$, and $\hat{u}_k^{n+1}(\omega)$ are the Fourier transforms of $f(t)$, $u_i(t)$, $\lambda^{n+1}(t)$, and $u_k^{n+1}(t)$, respectively, and $n$ is the current iteration number.
The IMF component $u_k$, the center frequency $\omega_k$, and the Lagrangian multiplier $\lambda(t)$ are updated alternately until the convergence criterion for the given accuracy is satisfied. The $K$ decomposed IMF components are then obtained, and the remaining undecomposed part of the signal is the residual component.
To sum up, the algorithm flowchart of VMD is shown in Figure 1.
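For reference, the standard VMD formulation of Dragomiretskiy and Zosso, which the description above follows, can be sketched as below. The Dirac distribution $\delta(t)$, the penalty factor $\alpha$, the dual ascent step $\tau$, and the tolerance $\varepsilon$ are symbols of that standard formulation and are assumptions here, since this section does not name them.

```latex
% Constrained variational problem: minimize the summed bandwidths
\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K}
  \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% ADMM updates in the frequency domain
\hat{u}_k^{n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}
       {1 + 2\alpha \left( \omega - \omega_k^{n} \right)^2}

\omega_k^{n+1} =
  \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}
       {\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}

\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega)
  + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)

% Convergence criterion
\sum_{k=1}^{K}
  \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}
       {\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon
```

In this form each mode update acts as a Wiener-type filter centered on the current center frequency, and each center frequency is re-estimated as the center of gravity of the power spectrum of its mode.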

Principle of Wavelet Packet Threshold Denoising
2.2.1. Wavelet Packet Analysis. In mathematics, a wavelet packet is composed of a set of linearly combined wavelet functions; therefore, the choice of wavelet basis function directly affects the effect of wavelet packet denoising. Currently, there are hundreds of wavelet basis functions, and according to [35], the best basis function in wavelet denoising is the db4 wavelet. Therefore, this study chooses the db4 wavelet as the basis function of the wavelet packet decomposition.
The wavelet packet decomposition tree shows the decomposition process of the wavelet packet intuitively. When the signal S is decomposed into three levels, its wavelet packet decomposition tree is as shown in Figure 2.
In Figure 2, A represents the low-frequency component of the signal, D represents the high-frequency component of the signal, and the subscript represents the decomposition level (scale). The components of different decomposition scales can be combined with each other, so many decomposition structures exist, and together they constitute the wavelet packet basis library. Each wavelet packet basis preserves all the energy of the signal. However, the signal characteristics that each basis reflects are different; therefore, it is necessary to determine an optimal wavelet packet basis.
In the wavelet packet basis library, the basis that minimizes the cost function is the optimal wavelet packet basis. Commonly used cost functions include the gate threshold coefficient, relative energy, and the entropy criterion [20]. In this study, Shannon entropy is used as the cost function.

Denoising Threshold Selection
(1) Threshold Estimation Method. In the process of wavelet packet denoising, how to choose the threshold T is a key problem. Common threshold estimation methods include the rigrsure threshold (based on Stein's unbiased risk estimate), the sqtwolog threshold, the heursure threshold, and the minimax threshold. Experiments on the data in this paper show that the rigrsure threshold retains more signal characteristics; therefore, the rigrsure threshold is selected as the threshold T in this paper, and its basic principle is as follows.
Let $S$ be the signal to be denoised, and let $Q = [q_1, q_2, \ldots, q_N]$ with $q_1 \le q_2 \le \cdots \le q_N$, where $N$ is the number of elements in $S$ and the elements of $Q$ are the squares of the elements of $S$ arranged in ascending order. Define the risk vector $R = [r_1, r_2, \ldots, r_N]$, whose elements are as follows:

$$r_i = \frac{N - 2i + \sum_{j=1}^{i} q_j + (N - i)\, q_i}{N}, \quad i = 1, 2, \ldots, N.$$

Structural Control and Health Monitoring

We take the minimum element $r_a$ of $R$ as the risk value and, according to its subscript $a$, determine the corresponding threshold as follows:

$$T = \sigma \sqrt{q_a},$$

where $\sigma$ is the standard deviation of the noise signal [22].
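The rigrsure rule described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `rigrsure_threshold` is hypothetical, and the risk expression is the usual Stein unbiased risk form, which assumes the coefficients have already been scaled so that the noise level is comparable to one.

```python
import numpy as np

def rigrsure_threshold(coeffs, sigma=1.0):
    """Return the rigrsure threshold T = sigma * sqrt(q_a) (hypothetical helper)."""
    s = np.asarray(coeffs, dtype=float).ravel()
    n = s.size
    if n == 0:
        raise ValueError("coeffs must not be empty")
    q = np.sort(s ** 2)                      # squared values in ascending order
    i = np.arange(1, n + 1)                  # 1-based index, as in the text
    # r_i = [N - 2i + sum_{j<=i} q_j + (N - i) * q_i] / N
    risk = (n - 2 * i + np.cumsum(q) + (n - i) * q) / n
    a = int(np.argmin(risk))                 # position of the minimum risk
    return sigma * np.sqrt(q[a])

if __name__ == "__main__":
    print(rigrsure_threshold([0.1, -0.2, 0.1, 5.0]))
```

In the example call, the three small coefficients fall at or below the selected threshold, while the large one, which stands in for real signal content, lies well above it.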
(2) Threshold Function. After the threshold T is determined, each node coefficient is processed by a threshold function. Two threshold functions, the hard threshold and the soft threshold proposed by Li et al. [23], are widely used.
Compared with the hard threshold method, the soft threshold denoising method can achieve the optimal estimation and ensures that the denoised signal is as smooth as the original signal; therefore, the soft threshold function is used in this study.
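The difference between the two threshold functions can be seen in a short sketch. It uses the commonly cited definitions (hard: keep coefficients whose magnitude exceeds T and zero the rest; soft: additionally shrink the kept coefficients toward zero by T); the function names are illustrative.

```python
import numpy as np

def hard_threshold(w, T):
    """Keep coefficients with |w| > T unchanged, set the others to zero."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) > T, w, 0.0)

def soft_threshold(w, T):
    """Zero coefficients with |w| <= T and shrink the others toward zero by T."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - T, 0.0)

if __name__ == "__main__":
    coeffs = [-3.0, -0.5, 0.2, 2.0]
    print(hard_threshold(coeffs, 1.0))
    print(soft_threshold(coeffs, 1.0))
```

The soft function is continuous at |w| = T, which is why it tends to give a smoother reconstruction than the hard function, at the cost of a small bias in the large coefficients.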

Seepage Pressure Monitoring Model.
All prediction models use the same multidimensional input factors; the only difference is that the statistical model additionally includes lagged (previous-period) terms. The first step is therefore to introduce the factor selection of the statistical model.
According to the analysis of actual engineering data, changes in the upstream and downstream water levels, rainfall, and dam foundation temperature all have a certain impact on the seepage pressure of a concrete dam foundation; in addition, considering the change in the overall internal environment, an aging factor should also be selected. Therefore, the statistical model of dam foundation seepage pressure is the sum of an upstream water level component, a downstream water level component, a rainfall component, a temperature component, and an aging component. The selection of each component factor in the statistical model is introduced as follows.

Upstream Water Level Component $Y_{Hu}$.
Since the dam foundation seepage pressure has a certain hysteresis relative to changes in the upstream water level, the influence of the upstream water level over the previous month needs to be considered, which is

$$Y_{Hu} = \sum_{i=1}^{5} a_{ui} H_{ui},$$

where $H_{ui}$ is the average upstream water level on the monitoring day and over the periods 1∼4 days, 5∼10 days, 11∼20 days, and 21∼30 days before the monitoring day ($i = 1 \sim 5$), and $a_{ui}$ are the regression coefficients.

Downstream Water Level Component $Y_{Hd}$.
The downstream water level component takes a form similar to that of the upstream water level component.
For the rainfall component, $p_i$ is the average rainfall on the monitoring day, 1 day before, 2 days before, and over the periods 3∼4 days, 5∼15 days, and 16∼30 days before the monitoring day ($i = 1 \sim 6$), and $d_i$ are the regression coefficients.

Temperature Component $Y_T$.
To fully reflect the influence of temperature change on dam foundation seepage pressure, we use the measured temperature at the measuring point together with sine-wave periodic functions as the temperature component, where $t$ is the cumulative number of days from the initial measurement date to the monitoring date; $t_0$ is the cumulative number of days from the initial measurement date to the first measurement date of the data series taken for modeling; $i = 1, 2$ denotes the annual cycle and the semiannual cycle; and $b_0$, $b_{1i}$, $b_{2i}$ are regression coefficients.

Aging Component $Y_\theta$.
Due to the deposition in front of the dam and the change in the effectiveness of the impervious body, the composition of the aging component is complex. It is generally expressed in terms of $\theta$ and $\theta_0$, where $\theta$ is the $t$ defined above divided by 100; $\theta_0$ is the $t_0$ defined above divided by 100; and $d_1$, $d_2$, $d_3$ are regression coefficients. In summary, the statistical model for concrete dam foundation seepage pressure prediction is the sum of the components above plus a constant term $a_0$.
It should be noted that, to reflect the hysteresis of the impact factors, the statistical model adds lagged terms. However, the deep learning methods in this paper (ITCN, LSTM, and RNN) all have a certain degree of memory. Therefore, there is no need to add lagged terms to their multidimensional input factors. In all other respects, their input factors are consistent with those of the statistical model.
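The lagged factors of the statistical model can be assembled as in the following sketch. The lag windows are taken from the text above; the function names are hypothetical, and the harmonic form (sine and cosine terms with a 365-day base period, referenced to $t_0$) is an assumption based on the usual dam statistical models, since the exact expression is not shown here.

```python
import numpy as np

# (start, end) lag windows in days before the monitoring day, as listed in the text
WATER_LEVEL_WINDOWS = [(0, 0), (1, 4), (5, 10), (11, 20), (21, 30)]
RAINFALL_WINDOWS = [(0, 0), (1, 1), (2, 2), (3, 4), (5, 15), (16, 30)]

def window_means(series, day, windows):
    """Mean of a daily series over each lag window ending before or on `day`."""
    series = np.asarray(series, dtype=float)
    feats = []
    for start, end in windows:
        lo, hi = day - end, day - start      # inclusive index range
        if lo < 0:
            raise ValueError("not enough history for the requested window")
        feats.append(series[lo:hi + 1].mean())
    return np.array(feats)

def harmonic_terms(t, t0, period=365.0):
    """Assumed annual (i=1) and semiannual (i=2) sine/cosine factors."""
    terms = []
    for i in (1, 2):
        w = 2.0 * np.pi * i / period
        terms.append(np.sin(w * t) - np.sin(w * t0))
        terms.append(np.cos(w * t) - np.cos(w * t0))
    return np.array(terms)

if __name__ == "__main__":
    level = np.arange(40.0)                  # synthetic daily water level
    print(window_means(level, 35, WATER_LEVEL_WINDOWS))
    print(harmonic_terms(100, 0))
```

Each row of the regression design matrix would then concatenate the upstream and downstream window means, the rainfall window means, the measured temperature, the harmonic terms, and the aging terms for one monitoring day.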

Improved Temporal Convolutional Network (ITCN).
Dam foundation seepage pressure prediction is a typical time series prediction problem, and the TCN model is suitable for processing prediction tasks with time series structures [36]. However, in reality, the environment of dam foundation seepage is complex: there are many influencing factors, and the hysteresis relative to environmental changes is limited. The original TCN model structure is therefore not well suited to the prediction of dam foundation seepage pressure. Consequently, based on the TCN model, this paper proposes an improved temporal convolutional network (ITCN) model suitable for predicting dam foundation seepage pressure data. The model retains the basic structure of the TCN model as a whole, that is, the dilated causal convolution structure, the residual block, and the fully convolutional network (FCN); in addition, according to the characteristics of dam foundation seepage pressure data, the following improvements have been made:
(1) To better extract the features of the input multidimensional factor data and to adapt them to the subsequent convolutional layer, a fully connected layer is added at the front of the model to increase the dimension of the data.
(2) To address the limited hysteresis and environmental complexity of dam foundation seepage pressure, this study draws on the idea of the bottleneck residual block to improve the residual block of the original TCN model.
The structure of each part of the ITCN is introduced in detail as follows.
2.4.1. Fully Connected Layer. Dam foundation seepage pressure prediction is a typical multidimensional input time series prediction problem. To extract advanced features of the input multidimensional factor data $[x_i]$ and to adapt them to the subsequent convolutional layer, a fully connected layer is added in front of the ITCN model. The number of neurons $k$ in the fully connected layer is greater than the input data dimension $n$. The $k$-dimensional data produced by the fully connected layer are then used as the input of the subsequent convolutional layer, as shown in Figure 3.
2.4.2. Improved Residual Block. As a typical time series prediction problem, dam foundation seepage pressure prediction can be expressed as follows: the input multidimensional factor data sequence is used to predict the corresponding sequence of seepage pressure values. The output prediction sequence needs to satisfy the causality condition, that is, the predicted value of seepage pressure at time $t$ is related only to the multidimensional factor data at time $t$ and before and has nothing to do with the "future" data sequence. For this reason, the concepts of the fully convolutional network and causal convolution are introduced first.

Fully Convolutional Network.
Compared with the traditional convolutional network, the fully convolutional network (FCN) [26] uses convolution layers instead of fully connected layers in the last few layers, building a complete fully convolutional network to achieve dense prediction. In other words, element-level prediction of the sequence can be achieved with one-dimensional convolution, which is the significance of introducing the fully convolutional network into time series prediction problems.
In addition, compared with low-level convolution layers, high-level convolution layers have a larger receptive field. They can sense historical information over a longer time frame and are sensitive to changes in characteristics, which is very helpful in building long-term memory.
The causality constraint is specific to causal time series prediction problems. Based on the fully convolutional network, the causal convolution structure [27,28] is developed, as shown in Figure 4. Causal convolution can be regarded as cutting the fully convolutional network in half, so that convolution operations are performed only on the input at the current time $t$ and at previous times.
To some extent, the temporal convolutional network can be simply expressed as TCN = 1D FCN + causal convolutions; thus, the CNN model is transformed into a model suitable for processing causal time series data.
The receptive field of an ordinary causal convolutional network depends linearly on the network depth. To extract information from historical data over a long period, one requires a fairly deep network structure or a large convolution kernel, either of which greatly increases the computational burden of the model. Therefore, the TCN introduces the dilated causal convolution structure [28].
The dilated causal convolution increases the receptive field of the model by adding holes to the standard causal convolution. The size of the hole is the number of intervals between the points of the convolution kernel. The larger the hole, the larger the receptive field of the convolution kernel, and the longer the history to which the convolution output is related.
Generally, to avoid the gridding effect, the hole size $D$ of each layer is set as an exponential form of the hyperparameter dilation rate $d$, namely $(1, d^1, d^2, \ldots, d^i)$, and the dilation rate $d$ should not be smaller than the size of the convolution kernel. A three-layer dilated causal convolutional network with a convolution kernel size of 2 and $d = 2$ is shown in Figure 4. As can be seen, by adding holes, the receptive field of the model expands exponentially with the increase of the network depth.
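A dilated causal convolution can be illustrated with plain NumPy. The sketch below is a single-channel toy version, not the network layer used in this paper: the function name is hypothetical, and zeros are assumed for the inputs before the start of the sequence.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    """Compute y[t] = sum_j w[j] * x[t - j * dilation], with zeros before t = 0."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    pad = (len(w) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # pad on the left only (causality)
    y = np.zeros_like(x)
    for j, wj in enumerate(w):
        start = pad - j * dilation           # offset that reaches back j holes
        y += wj * xp[start:start + len(x)]
    return y

if __name__ == "__main__":
    # Impulse response of three stacked layers: kernel size 2, holes 1, 2, 4
    h = np.zeros(12)
    h[0] = 1.0
    for hole in (1, 2, 4):
        h = causal_dilated_conv1d(h, [1.0, 1.0], dilation=hole)
    print(int(np.count_nonzero(h)))
```

Because the padding is applied only on the left, changing a later input never alters an earlier output. Stacking the three layers with hole sizes 1, 2, and 4 spreads a single impulse over 8 consecutive time steps, which is the exponential growth of the receptive field described above.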
For a single convolution kernel, the size of the receptive field of the dilated causal convolution depends on three quantities: $i$, the index of the layer in which the convolution kernel is located; $C$, the size of the convolution kernel; and $D$, the size of the hole.
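How the receptive field grows with depth can be checked numerically. The helper below is a sketch under stated assumptions, not this paper's own formula: it assumes one dilated causal convolution per layer, with hole size $d^i$ in layer $i = 0, \ldots, n-1$, so each layer adds $(C - 1) d^i$ time steps to the receptive field.

```python
def receptive_field(kernel_size, dilation_rate, n_layers):
    """Receptive field of n stacked dilated causal convolutions (assumed layout)."""
    rf = 1                                   # a single time step with no layers
    for i in range(n_layers):
        hole = dilation_rate ** i            # hole size D of layer i
        rf += (kernel_size - 1) * hole       # each layer extends the history
    return rf

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, receptive_field(2, 2, n))
```

With $C = 2$ and $d = 2$ the receptive field doubles with every added layer. A network that stacks two dilated convolutions per residual block, as the original TCN does, would grow about twice as fast per block.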
In this paper, we deduce an expression for the final receptive field size of a dilated causal convolution network with depth $n$. In addition, it was found during the experiment that an approximate formula can be used to quickly estimate the range of the receptive field.
As the network becomes progressively deeper, network degradation may occur. The residual block [29] adds a shortcut connection around the redundant layers of the network, which realizes identity mapping and makes the deep network equivalent to the shallower optimal network structure.
Several network layers bridged by a shortcut connection are called a residual block. In a convolutional network, the residual block can be expressed in terms of Activation(•), which represents the activation function, and W − H, which represents the convolution kernel weight to be learned.
Taking the characteristics of dam foundation seepage pressure prediction into account, this paper improves the residual block of the original TCN model. Practice has shown that a residual block needs at least two network layers to achieve a good improvement. In the residual block of the original TCN, both layers are dilated causal convolutional layers. Although this kind of residual block can greatly increase the receptive field of the model, it also limits the development of network depth. In practical engineering projects, dam foundation seepage pressure has hysteresis relative to environmental changes, but this hysteresis has a certain limit. Meanwhile, because the environment of dam foundation seepage is complex and there are many influencing factors, the dam foundation seepage pressure problem is very complicated. In other words, the model does not need a huge receptive field; instead, it needs depth.
Therefore, this paper draws on the idea of the bottleneck residual block [29]. After many trials, we designed an improved residual block, whose structure is shown in Figure 5.
As shown in Figure 5, for the residual mapping part (F(x)) on the left, the improvement is that the first dilated causal convolutional layer of the original residual block is replaced with a standard convolutional layer that has no causal relationship, while the second layer remains a dilated causal convolution. In this way, the depth of the network can be doubled while a certain receptive field is maintained, which improves the feature extraction ability of the model. At the same time, the number of output channels of the standard convolutional layer is set to be consistent with that of the dilated causal convolutional layer, which means the first convolution layer increases or decreases the dimension of the input data so that the number of input channels equals the number of output channels for the second layer.
In addition, the first standard convolution layer and the second dilated causal convolution layer both use the rectified linear unit (ReLU) [30] as the activation function. A dropout regularization layer [31] is retained after each layer, as in the original TCN. In addition, the second layer is processed by weight normalization.
For the identity mapping part (x) on the right, that is, the shortcut connection, if the input data and output data have the same dimensions, the input x and the output F(x) are added directly; if they have different dimensions, a 1 × 1 convolution is added to adjust the number of filters so that the tensors being added have the same shape.
(1) This model uses a flexible convolution architecture: according to the number of output interfaces, an input sequence of any dimension can be mapped freely to an output sequence of fixed dimensions, which gives the model the ability of multidimensional input and multidimensional output. The proposed model can therefore handle the diversity of factors that influence dam foundation seepage pressure.
(2) By using the dilated causal convolution structure, this model has memory and is capable of handling the hysteresis in the prediction of dam foundation seepage pressure.
(3) Through the stacked dilated convolution layers and the parameter sharing mechanism, the proposed model greatly reduces the computational burden.
(4) By improving the residual blocks, this model can extract higher dimensional features while maintaining a certain receptive field, thus addressing the limited hysteresis and the high complexity of dam foundation seepage pressure under environmental changes.

The Basic Workflow of the Proposed Model.
For the dam foundation seepage pressure prediction model based on VMD-wavelet packet denoising and ITCN proposed in this study, the basic workflow is as follows:
Step 1: for the measured data of dam foundation seepage pressure at each measuring point, we judge whether it has been contaminated by noise.

Step 2: we perform the VMD decomposition experiment on the output data of Step 1. According to the results at different decomposition levels, we determine the final decomposition level and the components that need to be denoised.
Step 3: for the components selected in Step 2, we determine the level of wavelet packet decomposition and the denoising threshold according to the percentage of energy recovery (perf2) of the data after denoising and the retained signal characteristics. We then perform wavelet packet threshold denoising.
Step 4: we reconstruct the signal from the components denoised by the wavelet packet threshold and the remaining components that do not need denoising. The dam foundation seepage pressure data after VMD-wavelet packet denoising are thus obtained.
Step 5: after preparing the denoised dam foundation seepage pressure data and the multidimensional factor data composed of relevant environmental factors, we normalize these data and divide them into the training set, validation set, and test set.
Step 6: we set the corresponding parameters of the model and initialize them.
Step 7: we perform hysteresis experiments on different measuring points and find the optimal receptive field that is closest to the real lag time of dam foundation seepage pressure at each measuring point, which gives the optimal ITCN model structure for each measuring point.
Step 8: based on this optimal model structure, we train the model to obtain the optimal ITCN dam foundation seepage pressure prediction model for each measuring point and input the multidimensional factor data at the future moment. The future changes of dam foundation seepage pressure can consequently be predicted.
The basic flowchart is shown in Figure 7.
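Step 5 of the workflow above can be sketched as follows. The text does not state the normalization method or the split ratios, so min-max scaling and a 70/15/15 chronological split are assumptions made purely for illustration, as are the function names. The sketch also splits first and scales with training-set statistics only, a common precaution against information leaking from the validation and test periods.

```python
import numpy as np

def chronological_split(data, train_frac=0.7, val_frac=0.15):
    """Split time-ordered rows into train, validation, and test sets (assumed ratios)."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def min_max_scale(train, *others):
    """Scale every set with the column-wise minimum and maximum of the training set."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return tuple((part - lo) / span for part in (train,) + others)

if __name__ == "__main__":
    series = np.arange(1421.0).reshape(-1, 1)    # synthetic stand-in for 1421 daily records
    train, val, test = chronological_split(series)
    train_s, val_s, test_s = min_max_scale(train, val, test)
    print(len(train), len(val), len(test))
```

Keeping the split chronological rather than random preserves the causal ordering that the prediction model relies on.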

Case Study
3.1. Project Overview. Uplift pressure is one of the important unfavorable loads affecting the structural stability of a gravity dam. Compared with the arch dam, the gravity dam has a wider foundation, a longer seepage path, and more obvious hysteresis. Therefore, this paper selects a high gravity dam as the research object, using dam foundation uplift pressure measuring points located in the first and second rows behind the impervious curtain in the middle dam block.
This high gravity dam is an RCC gravity dam with a maximum dam height of 168 m, divided into 24 dam blocks. The two dam foundation uplift pressure measuring points selected in the middle dam block (13#) are numbered UP13 and UP27, respectively, as shown in Figure 8.
In terms of measured data, we selected 2015-2-10 ∼ 2018-12-31 as the research period. During this period, the monitoring frequency was guaranteed to be once a day, and a total of 1421 groups of dam foundation seepage pressure data were obtained.
In terms of the environmental variables, four variables were selected: upstream water level, downstream water level, rainfall, and dam foundation temperature. The dam foundation temperature was taken to be the measurement temperature of

VMD-Wavelet Packet Denoising.
The measured process line shows that the measured signal of the UP27 measuring point has been severely contaminated by noise. Therefore, before training the ITCN model, it is necessary to perform VMD-wavelet packet denoising.

VMD Decomposition.
According to the basic workflow, the decomposition level K of VMD needs to be determined first. In general, when K is less than 3, the decomposition of the signal is insufficient and mode mixing occurs; when K is too large, false signal components appear. This paper therefore starts with K = 3 and gradually increases the decomposition level to find the optimal K value.
After a comprehensive comparison, a 5-level VMD decomposition is adopted for the measured signal of the dam foundation seepage pressure at the UP27 measuring point. With this setting, the noise is better concentrated in the IMF1∼3 components and the residual component, as shown in Figure 10. Therefore, the number of decomposition levels is set to K = 5, and the IMF1∼3 components and the residual component are selected for wavelet packet threshold denoising.

Wavelet Packet Threshold Denoising.
As mentioned earlier, when performing wavelet packet threshold denoising in this paper, the db4 wavelet is chosen as the basis function, Shannon entropy is used as the cost function, the rigrsure threshold is used as the threshold estimation method, and the soft threshold is used as the threshold function. Therefore, it is only necessary to determine the level of wavelet packet decomposition for each component.
The key basis for determining the level of wavelet packet decomposition is the percentage of energy recovery perf2 of the data after denoising, whose expression is as follows:

$$\mathrm{perf2} = 100 \times \frac{\lVert x_d \rVert_2^2}{\lVert x \rVert_2^2},$$

where $x$ is the signal before denoising and $x_d$ is the signal after denoising. The smaller the perf2 is, the less information is retained after denoising, and the better the denoising effect is.
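The energy recovery percentage is straightforward to compute once a component and its denoised version are available. The sketch below assumes the squared L2 norm ratio of the two signals, as described above; the function name `perf2` simply mirrors the symbol used in the text.

```python
import numpy as np

def perf2(x, xd):
    """Percentage of the energy of x that is retained in the denoised signal xd."""
    x = np.asarray(x, dtype=float)
    xd = np.asarray(xd, dtype=float)
    energy = np.sum(x ** 2)
    if energy == 0.0:
        raise ValueError("the signal before denoising has zero energy")
    return 100.0 * np.sum(xd ** 2) / energy

if __name__ == "__main__":
    print(perf2([2.0, 0.0], [1.0, 0.0]))
```

Evaluating this function for each candidate decomposition level of a component gives the curve from which the level with the minimum perf2 is read off.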

For each component, the search range of the best decomposition level is set to 1∼7, and the perf2 value after each denoising run is recorded in the following figure.
As shown in Figure 11, the IMF1 component reaches its minimum perf2 value at decomposition level 2, while the other components reach their minimum perf2 values at decomposition level 3. The denoised signal also retains many signal features and achieves a good denoising effect. Therefore, the optimal decomposition level is determined to be 2 for the IMF1 component and 3 for all the remaining components.
The signal is then reconstructed from the denoised components and the IMF4∼5 components to obtain the dam foundation seepage pressure data of the UP27 measuring point after VMD-wavelet packet denoising, as shown in Figure 12.

Model Training and Prediction.
After the denoised dam foundation seepage pressure data and the multidimensional factor data composed of relevant environmental factors are prepared, the data are input into the initial ITCN model. The model structure is optimized by performing the hysteresis experiment, and the optimal ITCN dam foundation seepage pressure prediction model is then obtained by training. All the prediction models in this paper are implemented on the TensorFlow 2.0.0a platform using Python 3.7.

Hysteresis Experiment.
The measured data and seepage theory analysis show that the dam foundation seepage pressure has a certain hysteresis relative to environmental changes, which means that the seepage pressure at a given time is closely related to the factor data of the preceding time period. When the receptive field of the model is close to the length of this lag time, the model can fully capture the historical information within this time range and achieve the best learning effect.
The dam foundation seepage pressure shows different hysteresis for different dam types and positions. For each measuring point, we therefore vary the size of the receptive field of the model to find the optimal receptive field, that is, the one closest to the real lag time of the dam foundation seepage pressure at that point. This gives the optimal ITCN dam foundation seepage pressure prediction model structure for each measuring point.
According to formulas (12) and (13), the parameters that determine the receptive field of the ITCN network are the convolution kernel size C, the dilation rate d, and the number of residual network layers n. As mentioned earlier, owing to the particular nature of the dam foundation seepage pressure, the model does not need a huge receptive field, but it does need depth. Therefore, the convolution kernel size C and the dilation rate d are both set to their minimum value of 2, and the size of the receptive field is changed by adjusting the number of residual network layers. At the same time, the number of filters in each layer of the residual network is successively reduced to 1 to ensure a smooth dimension-decreasing process. The receptive field size and the filters of each layer for different numbers of residual network layers are shown in Table 1.
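To illustrate how the number of layers controls the receptive field, the sketch below evaluates the textbook expression for a stack of dilated causal convolutions. It assumes one convolution per layer, with layer i (counted from 0) using dilation d^i. This is an assumption for illustration and may differ from formulas (12) and (13), for example if each residual block contains more than one convolution. The function name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_size, dilation_base, n_layers):
    """Receptive field of a stack of dilated causal convolutions.

    Assumes one convolution per layer, with layer i (0-based) using
    dilation dilation_base**i. This is the textbook TCN expression and
    may differ from the paper's formulas (12) and (13).
    """
    dilations = [dilation_base ** i for i in range(n_layers)]
    return 1 + (kernel_size - 1) * sum(dilations)


# Kernel size C = 2 and dilation rate d = 2, as set in the text.
fields = {n: receptive_field(2, 2, n) for n in range(1, 6)}
```

Under these assumptions the receptive field doubles with each added layer, so a small change in depth is enough to scan a wide range of candidate lag times.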
To compare the learning effects of models with different receptive fields, we record the mean of the validation-set mean square error (MSE) over multiple training runs. The MSE expression is as follows:

$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2,$$

where $y_i$ is the measured value, $\hat{y}_i$ is the predicted value of the model, and $m$ is the length of the validation set data.
The hysteresis experiment results for each measuring point are shown in Figure 13. Note that because the data are normalized at this stage, the MSE values are small in magnitude.
As can be seen from Table 1 and Figure 13, as the number of residual network layers increases, the receptive field of the model gradually grows, and the learning effect for the dam foundation seepage pressure at each measuring point gradually improves until a certain optimal receptive field is reached, at which the best learning effect is achieved. With a further increase of the receptive field, however, the learning effect no longer improves overall.
The optimal receptive field of the model reflects the real lag time of the dam foundation seepage pressure at the measuring point. The UP13 measuring point, in the first row behind the impervious curtain, showed a lag of about 8 days, and the UP27 measuring point, in the second row behind the impervious curtain, showed a lag of 8 to 16 days. This is broadly consistent with engineering experience and reflects the basic seepage law that the longer the seepage path is, the more pronounced the hysteresis is.
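The selection step of the hysteresis experiment can be sketched as follows: for each candidate number of residual layers, average the validation MSE over the repeated training runs and keep the layer count with the lowest mean. The function name `pick_layer_count`, the tie-breaking rule (the shallower model wins), and the numbers are assumptions for illustration and are not read from Figure 13.

```python
from statistics import mean


def pick_layer_count(val_mse_runs):
    """Choose the number of residual layers with the lowest mean MSE.

    val_mse_runs maps a layer count to the validation MSEs of its
    repeated training runs. Returns (best layer count, mean MSE table).
    Ties are resolved in favor of the smaller layer count (assumption).
    """
    mean_mse = {n: mean(runs) for n, runs in val_mse_runs.items()}
    best = min(sorted(mean_mse), key=lambda n: mean_mse[n])
    return best, mean_mse


# Placeholder validation MSEs on normalized data (made up for illustration).
runs = {2: [0.004, 0.006], 3: [0.002, 0.002], 4: [0.003, 0.005]}
best_n, table = pick_layer_count(runs)
```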

Model Prediction.
Through the hysteresis experiment, we obtained the optimal model structure for each measuring point. Each model was then trained, and training was stopped when the MSE of the validation set reached its minimum, which gives the optimal ITCN dam foundation seepage pressure prediction model for each measuring point.
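One common way to realize this stopping rule is patience-based early stopping: training ends once the validation MSE has not improved for a fixed number of epochs, and the weights from the best epoch are kept. The sketch below shows only the bookkeeping, in plain Python. The `patience` parameter and the loss history are assumptions, since the text does not state how the minimum was detected.

```python
def early_stopping_epoch(val_mse_per_epoch, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation MSE has failed to improve for
    `patience` consecutive epochs; best_epoch is the epoch with the
    lowest validation MSE seen so far. `patience` is an assumed
    hyperparameter, not a value from the paper.
    """
    if not val_mse_per_epoch:
        raise ValueError("at least one epoch is required")
    best_epoch, best_mse, waited = 0, float("inf"), 0
    for epoch, mse in enumerate(val_mse_per_epoch):
        if mse < best_mse:
            best_epoch, best_mse, waited = epoch, mse, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_mse_per_epoch) - 1, best_epoch


# Made-up validation MSE history for illustration.
history = [0.020, 0.012, 0.009, 0.010, 0.011, 0.0095, 0.008]
stop, best = early_stopping_epoch(history, patience=3)
```

In tf.keras, the same behavior is available through the `tf.keras.callbacks.EarlyStopping` callback with `monitor="val_loss"` and `restore_best_weights=True`.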

To evaluate the effectiveness of the ITCN dam foundation seepage pressure prediction model, we adopted three benchmark methods: two deep learning methods that are common in time series prediction, namely the RNN model and its variant the LSTM model, and the stepwise regression statistical model, which is commonly used in traditional dam engineering. All models were trained on the denoised data to predict the change of dam foundation seepage pressure over the last half year. To evaluate the prediction effect comprehensively, in addition to the mean squared error (MSE), this paper also considers the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The formula of each evaluation indicator is as follows:

$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2},$$

$$\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right|, \qquad \mathrm{MAPE} = \frac{100\%}{m} \sum_{i=1}^{m} \left| \frac{y_i - \hat{y}_i}{y_i} \right|,$$

where $y_i$ is the measured value, $\hat{y}_i$ is the predicted value of the model, and $m$ is the number of samples in the evaluated data set.
Finally, the prediction results of the dam foundation seepage pressure models at each measuring point are shown in Figure 14.
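The four evaluation indicators can be computed with a few lines of plain Python. The sketch below uses their standard definitions. The function name `regression_metrics` and the sample values are illustrative, and MAPE is undefined when a measured value is zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and MAPE (in percent) as a dict.

    y_true : measured values
    y_pred : predicted values (same length as y_true)
    MAPE divides by the measured values, so they must be nonzero.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / m
    mae = sum(abs(e) for e in errors) / m
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / m
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Illustrative values only (not monitoring data).
metrics = regression_metrics([10.0, 20.0, 40.0], [12.0, 18.0, 40.0])
```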
In addition, to verify how much the VMD-wavelet packet denoising improves model prediction, the original data of the UP27 measuring point without denoising were used to train the ITCN dam foundation seepage pressure prediction model for comparison. The comparison results are shown in the following charts.
First of all, it can be seen intuitively from the process lines of the prediction results in Figure 14 that, among all the models, the ITCN model proposed in this study achieves the best fit between the predicted values and the true seepage pressure values. The two other deep learning models, LSTM and RNN, have slightly weaker fitting performance. The stepwise regression model has the worst fitting performance, although it also reflects the correct overall trend. The evaluation indicator values in Table 2 further confirm this statement: each indicator value of the ITCN model is smaller than that of the other models. The indicator values of the ITCN model are only 50%-90% of those of the LSTM and RNN models and 15%-40% of those of the stepwise regression model, and the values are all small. These results show that the proposed prediction model has a strong fitting ability, a small overall error, and high accuracy.
Secondly, as can be seen in Figure 15 and Table 3, after VMD-wavelet packet denoising is applied to the dam foundation seepage pressure data contaminated by noise, the predicted values are closer to the true values, and each evaluation indicator is reduced by 25%-50%. This indicates that the influence of the noise has been reduced to some extent and that the fitting effect of the model has improved noticeably.
It should be noted, however, that even after the UP27 measuring point data are denoised, the accuracy of the model still falls short of that of the UP13 measuring point model. This is because, in addition to the severe noise contamination, the UP27 measuring point data also show obvious abnormal fluctuations. These fluctuations are not related to environmental factors and are caused by the measuring point itself. Therefore, the UP27 measuring point should be checked promptly to eliminate the relevant abnormalities and further improve the prediction accuracy.
Overall, the verification results for this engineering example show that the VMD-wavelet packet denoising method can effectively reduce the influence of noise contamination on the prediction model, that the prediction accuracy of the ITCN model is better than that of the other models, and that the prediction accuracy of each deep learning model is better than that of the stepwise regression statistical model. The models rank by prediction accuracy as ITCN > LSTM > RNN > stepwise regression.

Conclusion and Discussion
To establish a higher-performance dam foundation seepage pressure prediction model, this work first performs VMD-wavelet packet denoising on the data contaminated by noise. Then, we propose and study in depth the ITCN dam foundation seepage pressure prediction model. The conclusions are as follows:
(1) For the dam foundation seepage pressure data contaminated by noise, the signal is decomposed by VMD, and wavelet packet threshold denoising is applied to the components with more noise. The signal is then reconstructed to obtain the denoised dam foundation seepage pressure data. Experimental results show that this reduces the influence of noise and improves the prediction accuracy of the model.
However, some limitations must be addressed. For newly built dams, more consideration should be given to applying machine learning methods to build monitoring models when only a small number of samples is available. Furthermore, data augmentation methods should also be introduced to increase the richness of the data.
3 introduces the relevant influencing factors of concrete dam foundation seepage pressure and the selection of the multidimensional input factors of each prediction model. Section 2.4 introduces the structure, function, and improvement process of each part of the proposed ITCN model, as well as the overall structure and characteristics of our model in detail. Section 2.5 explains the basic flow of the model as a whole. Section 4 verifies the effectiveness of the proposed method with specific engineering examples. Finally, the conclusions are drawn in Section 4.

Figure 3: Dimension increase diagram of the fully connected layer.

2.4.4. The Whole Architecture of the ITCN Model.
In the design of the ITCN model, several residual blocks are stacked to form a deep residual network. The whole architecture of the ITCN model can be described as follows: the multidimensional factor sequence [x_i] is input into the fully connected layer for dimension-increase processing, and the result is then input into the deep residual network; after the dimension-decrease processing of several residual blocks, the one-dimensional dam foundation seepage pressure prediction sequence Y is output. The whole architecture of the ITCN model is shown in Figure 6. The whole architecture of the ITCN model proposed in this paper for dam foundation seepage pressure data prediction has the following characteristics:

Figure 5: Schematic diagram of the improved residual block structure.

Figure 6: Schematic diagram of the whole architecture of the ITCN model.

Figure 8: Schematic diagram of gravity dam foundation and uplift pressure measuring points. (a) Dam body. (b) The vibrating wire piezometer.

Figure 9: Relevant environmental variables and the actual measurement process line of the dam foundation seepage pressure at each measuring point.

Figure 10: VMD 5-level decomposition of the measured signal at the UP27 measuring point.

Figure 12: Schematic diagram of the denoising effect on the UP27 measuring point data.

Figure 13: Hysteresis experimental results of dam foundation seepage pressure at each measuring point.

Figure 14: The process line of the prediction results of different dam foundation seepage pressure models at each measuring point.

Figure 15: The process line of ITCN model prediction results before and after denoising at the UP27 measuring point.

(2) An improved TCN model is used to build the prediction model according to the characteristics of dam foundation seepage pressure data. This ITCN model also retains the advantages of a flexible architecture, free adjustment, and a small amount of calculation.
(3) In addition, because the ITCN model has a certain memory, it learns well the dam foundation seepage pressure with hysteresis. This work relates the receptive field size of the ITCN model to the hysteresis of the dam foundation seepage pressure and, by changing the receptive field of the model, investigates the real lag time of the dam foundation seepage pressure.
(4) Through the hysteresis experiment, we obtained the optimal model structure, based on which the optimal ITCN dam foundation seepage pressure prediction model was obtained by training. The prediction results show that the prediction accuracy of the ITCN model is better than that of the LSTM, RNN, and stepwise regression models.

Table 1: Schematic table of residual networks with different numbers of layers.

Table 2: Evaluation indicator values for the prediction results of the dam foundation seepage pressure models at each measuring point.

Table 3: Evaluation indicator values for the ITCN model prediction results before and after denoising at the UP27 measuring point.