
Structural engineering is subject to various subjective and objective factors, so deformation is usually inevitable; the resulting monitoring data are typically nonstationary and nonlinear, which makes deformation prediction a difficult problem in the field of structural monitoring. To address the limitations of traditional structural deformation prediction methods, this study proposes a deformation prediction model based on temporal convolutional networks (TCNs). The proposed model uses one-dimensional dilated causal convolutions to reduce the number of model parameters, expand the receptive field, and prevent leakage of future information. By capturing the long-term memory of the time series, the model can effectively mine the internal temporal characteristics of structural deformation data. The network hyperparameters of the TCN model are optimized through an orthogonal experiment, which determines the optimal combination of model parameters. The experimental results show that the predicted values of the proposed model agree closely with the actual monitored values. With the optimized model parameters, the average RMSE, MAPE, and MAE are reduced by 44.15%, 82.03%, and 66.48%, respectively, and the average running time is reduced by 45.41% compared with the unoptimized model. Compared with the WNN, DBN-SVR, GRU, and LSTM models, the average RMSE, MAE, and MAPE are reduced by 26.88%, 62.16%, and 40.83%, respectively.

With the rapid development of the social economy, the demand for large structures such as subways, bridges, and tunnels continues to grow, and their safety during construction and operation is becoming increasingly important. Since structural engineering is subject to various subjective and objective factors, deformation is usually inevitable. Serious deformation can even cause disastrous accidents, bringing huge losses to people's lives and property. Therefore, strict safety requirements are imposed during the construction and operation stages, and automatic monitoring systems for structural deformation have become indispensable. A structural deformation prediction model can capture the deformation trend from monitoring data, so that emergency measures can be taken in advance to prevent disasters. Accurate and real-time deformation prediction has therefore become a research hotspot in automated structural monitoring.

Many structural deformation prediction models have been proposed in the past few years. The existing methods mainly include regression analysis, gray theory, time series analysis, neural networks, and combined prediction models. Many factors cause structural deformation, such as monitoring equipment, geological conditions, physical properties, and the external environment, and different prediction models have their own advantages and suit different monitoring data. Regression analysis is a mathematical statistical method that determines the relationship between structural deformation and relevant factors. The regression method is very effective when there are few variables and the data show obvious regularity [

Since structural deformation data form a typical time series, classical time series analysis methods, including the autoregressive moving average (ARMA) [

In recent years, neural networks and their improved algorithms have been successfully applied in structural deformation prediction [

Because many factors need to be considered, it is difficult for a single model to achieve the desired accuracy. Therefore, to address the limitations of single models, combined prediction models have often been used to predict structural deformation [

With the development of information technology, deformation monitoring has entered the era of big data. It has also become possible to predict structural deformation using artificial intelligence methods, especially deep learning, which is gradually being applied to structural deformation prediction. Yang et al. [

The TCN (temporal convolutional network) is an improved CNN algorithm, and its applicability was demonstrated by Bai et al. [

The rest of this study is organized as follows. Section

The TCN is a structural innovation on the one-dimensional CNN. It adds dilation factors to the traditional convolution, which allows the network to cover all historical information and use it effectively for sequence problems, and it adds causal convolution to prevent leakage of future information. The TCN is widely used in speech recognition and time series modeling because of its simple structure and flexible receptive field. Since structural deformation data are a typical time series, the TCN is used to establish the deformation prediction model in this study.

The TCN model follows two main principles: the output sequence of the network should have the same length as the input sequence, and the network can only use information from past time steps. According to the first principle, the TCN uses a 1D fully convolutional network so that all convolution layers have the same length after zero padding. According to the second principle, the TCN uses causal convolutions, which means that the output
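These two principles can be illustrated with a small pure-Python sketch (not from the original paper; the kernel values are arbitrary): left-side zero padding keeps the output length equal to the input length, and each output depends only on the current and past inputs.

```python
def causal_conv1d(x, kernel):
    """1D causal convolution: pad with zeros on the left only, so the
    output has the same length as the input and y[t] depends solely
    on x[0..t] (no future information leaks into y[t])."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)  # left-side zero padding
    return [sum(kernel[j] * padded[t + j] for j in range(k))
            for t in range(len(x))]

x = [1.0, 2.0, 3.0, 4.0]
y = causal_conv1d(x, kernel=[0.5, 0.5])  # average of previous and current step
# y == [0.5, 1.5, 2.5, 3.5]: same length as x, y[0] uses only x[0]
```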

Noncausal convolution kernel size (

Causal convolution kernel size (

The output of causal convolutions in the time step

In order to obtain a long-term memory of the sequence, causal convolutions need a larger convolution kernel or a deeper network, either of which increases the amount of calculation.

Since structural deformation data have a strong temporal correlation, the analysis of structural deformation must consider not only the data at the previous moment but also data from long ago. Generally, there are three ways to broaden the receptive field of a simple causal convolution. The first is to deepen the network; however, as the network deepens, training becomes complicated and the fitting effect is not necessarily good. The second is to increase the size of the convolution kernel, with which the receptive field grows linearly. This enlarges the receptive field but increases the number of parameters and the network complexity; in general, stacking multiple small convolution kernels greatly reduces the number of parameters and the computational complexity compared with a single large kernel. The third is to increase the stride. When the stride is too large, the output sequence length is reduced by downsampling; moreover, effective information may be omitted, affecting the accuracy of feature extraction.

In order to overcome these shortcomings, Yu and Koltun [

By skipping the input value with the given step size, the dilated convolution obtains the long-term memory of the sequence without increasing the calculation amount. Formally, for a one-dimensional time sequence input
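A minimal sketch of this dilated causal convolution follows (illustrative, not the paper's implementation): the filter taps are spaced `dilation` steps apart, so the filter reaches further into the past with the same number of parameters, and with `dilation=1` it reduces to an ordinary causal convolution.

```python
def dilated_causal_conv1d(x, kernel, dilation):
    """Dilated causal convolution: taps are spaced `dilation` steps
    apart, so the receptive field widens without adding parameters.
    kernel[-1] multiplies the current sample, kernel[0] the oldest."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)  # left-side zero padding only
    return [sum(kernel[j] * padded[t + j * dilation] for j in range(k))
            for t in range(len(x))]

x = [1, 2, 3, 4, 5, 6]
y = dilated_causal_conv1d(x, kernel=[1.0, 1.0], dilation=2)
# y == [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]: y[t] = x[t-2] + x[t]
```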

Causal dilated convolution (

The receptive field of the TCN depends on the convolution kernel size
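This dependence can be made concrete with a small helper (an illustrative sketch, not taken from the paper). It assumes one dilated convolution per level; with two convolutions per residual block, as in the standard TCN, each level's contribution of (k − 1)·d doubles.

```python
def receptive_field(kernel_size, dilations, convs_per_level=1):
    """Receptive field of stacked dilated causal convolutions:
    each convolution with dilation d widens the field by
    (kernel_size - 1) * d samples beyond the current one."""
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)

# doubling dilation factors 1, 2, 4, 8 with kernel size 3:
rf = receptive_field(3, [1, 2, 4, 8])       # 1 + 2 * 15 = 31 samples
rf2 = receptive_field(3, [1, 2, 4, 8], 2)   # two convs per block: 61
```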

The

The residual module of the TCN is shown in Figure

Residual connections in the TCN.
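The residual module can be sketched as follows (a simplified illustration, not the paper's implementation): two dilated causal convolutions form the branch F(x), and the block output is ReLU(x + F(x)). The actual TCN block also applies weight normalization and dropout, omitted here for clarity.

```python
def dilated_causal_conv1d(x, kernel, dilation):
    """Dilated causal convolution (as sketched above)."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)
    return [sum(kernel[j] * padded[t + j * dilation] for j in range(k))
            for t in range(len(x))]

def relu(v):
    return [max(0.0, a) for a in v]

def residual_block(x, kernel, dilation):
    """Simplified TCN residual block: two dilated causal convolutions
    with a ReLU in between, plus an identity skip connection,
    o = ReLU(x + F(x)). A 1x1 convolution would replace the identity
    skip when input and output channel counts differ."""
    h = relu(dilated_causal_conv1d(x, kernel, dilation))
    h = dilated_causal_conv1d(h, kernel, dilation)
    return relu([xi + hi for xi, hi in zip(x, h)])

out = residual_block([1.0, 2.0, 3.0, 4.0], kernel=[0.5, 0.5], dilation=1)
# out == [1.25, 3.0, 5.0, 7.0]: same length as the input
```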

To ensure the safety of a structure, monitoring equipment is widely applied and essential. Structural deformation monitoring data are collected by automatic monitoring equipment at a fixed sampling interval; they form a typical time series that reflects how the structural health changes over time. The future deformation trend can be predicted by analyzing the historical monitoring data [

The flowchart of the proposed prediction model.

As shown in Figure

In equation (

Suppose that

In equation (

Suppose that there are
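The construction of supervised samples from the deformation series can be sketched as a sliding window (an illustrative assumption, since the paper's equations are not reproduced here): each input is a window of w consecutive values, and the target is the value one step ahead.

```python
def make_windows(series, w):
    """Turn a deformation time series into supervised samples:
    each input is w consecutive values, the target is the next value."""
    X, y = [], []
    for i in range(len(series) - w):
        X.append(series[i:i + w])
        y.append(series[i + w])
    return X, y

X, y = make_windows([10, 11, 13, 12, 14, 15], w=3)
# X == [[10, 11, 13], [11, 13, 12], [13, 12, 14]], y == [12, 14, 15]
```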

In order to verify the validity of the model,

The specific steps for the proposed method are as follows:

Step 1: the original data are preprocessed and normalized, and the training and test sets are divided.

Step 2: the hyperparameters of the TCN model are selected through orthogonal experiments

Step 3: the structural deformation prediction result can be obtained by the TCN model

Step 4: calculate the prediction errors

Step 5: repeat steps 1–4 with all hyperparameters combinations

Step 6: obtain the predicted structural deformation when the error is the smallest
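Steps 2-6 can be sketched as a selection loop over the orthogonal hyperparameter combinations (a schematic sketch, not the paper's code): `evaluate` stands in for training the TCN and computing the test-set error, and the toy error function here is purely illustrative.

```python
def select_best(combinations, evaluate):
    """Evaluate each orthogonal hyperparameter combination and keep
    the one with the smallest prediction error (Steps 2-6)."""
    best, best_err = None, float("inf")
    for params in combinations:
        err = evaluate(params)  # stand-in for train-then-score
        if err < best_err:
            best, best_err = params, err
    return best, best_err

# toy stand-in: pretend the error is minimized at kernel size 8
combos = [{"kernel": k} for k in (5, 6, 7, 8)]
best, err = select_best(combos, lambda p: abs(p["kernel"] - 8) + 1.0)
# best == {"kernel": 8}, err == 1.0
```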

To verify the applicability of the proposed model, experiments are conducted with the cumulative strain data of the upper steel beam of a foundation pit in China. As shown in Figure

The project site for data acquisition.

To accelerate gradient descent, seek the optimal solution, and improve accuracy, the training samples are normalized according to equation (

In equation (
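A common form of this normalization is min-max scaling to [0, 1]; the sketch below assumes that form, since the paper's exact equation is not reproduced here.

```python
def min_max_normalize(x):
    """Min-max normalization to [0, 1] (assumed form):
    x' = (x - min) / (max - min)."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]

scaled = min_max_normalize([2.0, 4.0, 6.0])
# scaled == [0.0, 0.5, 1.0]
```

The inverse transform, x = x'·(max − min) + min, is applied to the network outputs so that predictions are reported in the original strain units.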

The original and preprocessed data are shown in Figure

The original and preprocessed data.

In order to verify the validity of the model, some models [

In equations (
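The three evaluation indices can be computed as follows (standard definitions are assumed; MAPE is expressed in percent and assumes no zero-valued targets).

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

y_true, y_pred = [2.0, 4.0], [1.0, 6.0]
# mae == 1.5, mape == 50.0, rmse == sqrt(2.5)
```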

Since the proposed model is a deep learning model, the network parameters have an important influence on the experimental results. To analyze the influence of the hyperparameters on model performance, the optimized model is found by examining the parameter combinations that may affect prediction performance. The orthogonal test is an experimental method for studying multiple factors and levels through orthogonal tables, based on the principles of uniformity and orthogonality. By selecting the factors that most influence the test results, partial experiments can effectively replace comprehensive experiments, making it an efficient and precise way to find the optimal parameter combination. Some key hyperparameters

Types and levels of orthogonal experimental factors.

| Factor | Type | Level 1 | Level 2 | Level 3 | Level 4 |
|---|---|---|---|---|---|
| A | Kernel size | 5 | 6 | 7 | 8 |
| B | Kernel numbers | 8 | 16 | 24 | 32 |
| C | Dilation factor | 8 | 16 | 32 | 64 |
| D | TCN layer number | 8 | 12 | 16 | 20 |
| E | Learning rate | 0.0001 | 0.001 | 0.01 | 0.05 |

To verify the effectiveness and stability of the proposed model, each experiment is carried out five times, and the average value is taken as the final result. The computer configuration for the experiments is given in Table

Experimental environment.

| Item | Configuration |
|---|---|
| CPU | Intel(R) Core(TM) i5-6200U @ 2.30 GHz |
| RAM | 4 GB |
| Operating system | Windows (64-bit) |
| Python | 3.7 |

Under the experimental environment and parameter settings described in this study, the results of the hyperparameter optimization experiments are given in Table

Orthogonal experiment results.

| Test number | A | B | C | D | E | RMSE | MAPE | MAE | Running time (min) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 8 | 8 | 8 | 0.0001 | 1.08 | 1.13 | 0.66 | 7.73 |
| 2 | 5 | 16 | 16 | 12 | 0.001 | 1.05 | 0.86 | 0.53 | 38.73 |
| 3 | 5 | 24 | 32 | 16 | 0.01 | 2.26 | 5.61 | 1.70 | 97.90 |
| 4 | 5 | 32 | 64 | 20 | 0.05 | 9.09 | 9.93 | 7.26 | 247.30 |
| 5 | 6 | 8 | 16 | 16 | 0.05 | 1.08 | 1.07 | 0.58 | 45.34 |
| 6 | 6 | 16 | 8 | 20 | 0.01 | 1.10 | 0.84 | 0.57 | 47.70 |
| 7 | 6 | 24 | 64 | 8 | 0.001 | 1.05 | 0.64 | 0.49 | 75.61 |
| 8 | 6 | 32 | 32 | 12 | 0.0001 | 2.41 | 1.83 | 1.69 | 87.78 |
| 9 | 7 | 8 | 32 | 20 | 0.001 | 1.22 | 0.63 | 0.47 | 77.55 |
| 10 | 7 | 16 | 64 | 16 | 0.0001 | 1.20 | 1.47 | 0.74 | 129.25 |
| 11 | 7 | 24 | 8 | 12 | 0.05 | 1.17 | 1.73 | 0.74 | 36.68 |
| 12 | 7 | 32 | 16 | 8 | 0.01 | 1.08 | 0.76 | 0.51 | 39.23 |
| 13 | 8 | 8 | 64 | 12 | 0.01 | 1.08 | 0.89 | 0.57 | 68.86 |
| 15 | 8 | 24 | 16 | 20 | 0.0001 | 1.18 | 1.5648 | 0.73 | 98.53 |
| 16 | 8 | 32 | 8 | 16 | 0.001 | 1.28 | 1.3177 | 0.83 | 63.74 |

Bold values indicate the best prediction result, obtained when the TCN model uses the corresponding set of parameters.

From the experimental results in Table

The parameters of the TCN prediction model are set as follows: the convolution kernel size is 8, the number of convolution kernels is 16, the dilation factor is 32, the learning rate is 0.05, and the number of TCN layers is 8; residual connections are adopted between TCN layers, the optimization function is Adam, and the loss function is the RMSE. With the optimized parameters, the average RMSE, MAPE, and MAE are reduced by 44.15%, 82.03%, and 66.48%, respectively, and the average running time is reduced by 45.41%.

The predicted results and the actual measured values are shown in Figure

The prediction results of the proposed model.

To further evaluate the performance of the proposed model, it is compared with the wavelet neural network (WNN), deep belief network with support vector regression (DBN-SVR), gated recurrent unit (GRU), and long short-term memory (LSTM) models. The hyperparameters are critical to prediction performance, and cross-validation over multiple tests yields the optimal hyperparameter values for each model. The specific hyperparameters are as follows:

WNN: in the wavelet neural network, the number of hidden layer neurons is 3, the number of training iterations is set to 5000, and the kernel function is the radial basis function (RBF).

DBN-SVR: the number of network layers of the DBN model is set to 3. Temporal characteristics are extracted through the pretraining and fine-tuning stages of the DBN, and the output is then predicted by SVR. The kernel function of the SVR is the RBF, the number of training iterations is set to 10000, and the penalty factor is 0.01.

LSTM: the number of network layers in LSTM is 3, and the number of neurons in each layer is set as 24. Each layer is followed by regularization to prevent overfitting. The batch size is 16.

GRU: the number of network layers in the GRU is 3, and the number of neurons in each layer is set to 24. Each layer is followed by regularization to prevent overfitting, and the batch size is 8.

The prediction results of different models are shown in Figure

The prediction results with different models.

From Figures

The error analysis for different models.

To compare the specific error values of the different prediction methods, performance comparison results for the different models are given in Table

The performance comparison for different models.

| Models | RMSE | MAPE | MAE |
|---|---|---|---|
| WNN | 2.2940 | 2.6395 | 1.5349 |
| DBN-SVR | 1.5038 | 0.8533 | 0.8724 |
| GRU | 1.0901 | 0.7688 | 0.5474 |
| LSTM | 1.0600 | 0.6570 | 0.4937 |
| TCN | 0.9876 | 0.3438 | 0.4138 |

According to Table

In view of the insufficient feature extraction of traditional time series prediction models, this study proposes a structural deformation prediction model based on the TCN. Since structural deformation data have temporal correlation characteristics, the temporal features are first extracted by the TCN model: the long-term memory of the time series is obtained by dilated convolution, causal convolution realized by padding effectively prevents information leakage, and residual connections reduce the prediction error. Second, the predicted output is obtained through a fully connected layer. Finally, the hyperparameters of the model are optimized by an orthogonal experiment, and the optimized parameter combination is selected, which allows the model to fully extract features from the deformation data. Experimental results indicate that the TCN has a smaller prediction error when dealing with structural deformation time series; its performance is confirmed by comparison with the WNN, DBN-SVR, LSTM, and GRU models. However, although the TCN can extract richer temporal features as the number of TCN layers increases, deeper networks are harder to train and take longer to run. In addition, multiple sensors are generally used at the same measuring point in structural deformation monitoring, and their data are usually related in both temporal and spatial characteristics. We will further explore the spatial characteristics of the monitoring data to improve the accuracy of the prediction model in the future.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that there are no conflicts of interest.

This research was partly supported by the National Key R&D Program of China (2018YFC0808706).