Transfer Learning for CNN-Based Damage Detection in Civil Structures with Insufficient Data

Among various methods proposed for health monitoring of structures, deep learning-based techniques have attracted considerable attention in recent years owing to their powerful performance. However, a major problem with these methods is that they usually need large amounts of data in the training phase, while such data may not be available in real applications. In this study, compact one-dimensional (1D) convolutional neural networks (CNNs), which require less data for training, are utilized. The study comprises two stages: the first stage aims to develop a compact CNN that can recognize damage in a structure with high accuracy when sufficient data are available. The problem of inadequate training data in health monitoring of experimental and real-life structures is then investigated in the second stage, where transfer learning is used to deal with this problem. A compact CNN is utilized as the source domain network, and the target domain network receives all of its knowledge from this source. Acceleration time histories from a numerical model, an experimental structure, and a full-scale bridge are utilized to validate the proposed methodology. According to the results, the compact CNN can reach 100% accuracy when data are available for training. Also, for the case of insufficient data, using a compact network as well as transfer learning causes considerable improvement (about 95%) in the accuracy of damage detection.


Introduction
Damage is unavoidable in civil structures during their service lives and can lead to vast human and economic losses if not repaired on time. Civil structures need to be monitored regularly in order to detect damages in initial stages; this can be done automatically through structural health monitoring (SHM) methods; therefore, much research has been devoted to developing these techniques in recent years.
Vibration-based SHM techniques aim to detect, localize, and quantify damages in a structure using vibration signals acquired from it by a network of sensors. These techniques can be categorized into two types: model based and response based. In model-based methods, the numerical model of a structure is created using its vibration responses and damage detection is conducted by monitoring changes in modal parameters obtained from the model [1]. These methods have been widely used in literature [2][3][4]; however, the efficiency of model-based approaches relies on the accuracy of the numerical model, while it is impractical to establish a totally accurate fine-tuned model to represent the structure of interest, due to structural and environmental uncertainties, among other factors [5]. Thus, the main focus of vibration-based SHM research has recently been on response-based methods. These methods use various signal processing tools to extract damage indices directly from the measured vibration responses. The health state of civil infrastructures can be effectively monitored using machine learning algorithms. As a subset of artificial intelligence, machine learning has proven its high efficiency in many engineering applications and has been growing rapidly with great advances in sensor and computer technologies. Over the last decades, many machine learning algorithms have been utilized in the field of SHM, including, but not limited to, artificial neural network (ANN) [6][7][8][9], fuzzy neural network [10][11][12], support vector machine (SVM) [13][14][15], genetic algorithm (GA) [16][17][18], and federated learning [19,20].
Deep learning is the latest achievement of machine learning that has attracted so much attention from researchers, due to its ability to extract features from raw data automatically. CNNs are currently the most popular algorithm of deep learning, thanks to their great feature extraction capabilities that enable them to outperform other algorithms.
Abdeljaber et al. [21] utilized 1D CNNs to localize damage in a grid structure using raw acceleration signals recorded by sensors installed on the joints of the structure. Lin et al. [22] proposed a new method for automatic feature extraction and structural damage detection. A CNN was designed to extract features and identify damage locations. The performance of the proposed method was tested on both noise-free and noisy datasets. Khodabandehlou et al. [23] proposed the use of CNNs to extract features from acceleration responses and reduce their dimension to be able to classify the damage state. The applicability of this technique was demonstrated using signals obtained from a scaled model of a concrete bridge. Puruncajas et al. [24] presented a method for SHM of jacket-type foundations of offshore wind turbines using acceleration data and CNNs. The accelerometer data were converted into gray scale images and the test set error of the CNN was diminished using a data augmentation technique. Yang et al. [25] employed a parallel CNN and bidirectional gated recurrent unit framework for structural damage detection. Vibration data from IASC-ASCE benchmark structure and TCRF bridge were utilized to evaluate the proposed methodology. Rastin et al. [26] used CNNs as generators and discriminator parts of generative adversarial networks to identify damages in civil structures using only healthy state data to train the networks.
Although CNN-based techniques can successfully be employed for accurate structural damage detection, the main problem with these methods is that they need a large amount of data to train the network, which is not usually accessible in many cases. Transfer learning is one of the most effective methods, developed in the field of deep learning to address this issue. The idea behind this method is that the knowledge obtained by a network with a special dataset (source domain network) can sometimes be transferred to another network with a different dataset (target domain network). This strategy allows for training the target domain network with less amount of data and is particularly effective for sharing knowledge between different classification tasks [27]. Collecting large amounts of data for SHM and damage detection applications is often time and resource consuming, motivating researchers to employ transfer learning techniques to solve this problem.
Chakraborty et al. [27] used transfer learning to identify cracks in an aluminum lug joint. The source domain network was trained utilizing a large number of data points and the knowledge obtained by this network was transferred to another network with insufficient data. Gao and Mosalam [28] employed the pretrained VGGNet as the source domain network to identify structural damage from images, using a target domain network. Feng et al. [29] adopted a deep CNN with transfer learning for damage detection in a hydrojunction infrastructure. The Inception-V3 network, which has a great feature extraction power, and is trained on ImageNet data, was utilized as the source domain network. Han et al. [30] presented a transfer learning framework to diagnose unseen machine conditions. Three strategies were adopted to study feature transferability in diagnosis tasks. In a study by Azimi and Pekcan [31], a new CNN-based approach was introduced for SHM that exploited a form of compressed data through transfer learning. Han et al. [32] proposed a new fault diagnosis framework (deep transfer network) and a joint distribution adaptation scheme to reduce the discrepancy between training and testing data distributions and eliminate the need for a great deal of data for diagnosis models. Zhang et al. [33] used a universal domain adaptation method for fault detection in rotating machines with no obvious presumption on target labels. Han et al. [34] utilized a novel transfer learning-based method for wind turbine and bearing fault diagnosis with extremely limited fault data.
Transfer learning-based SHM techniques presented in the literature employ complex networks that usually have many layers and are trained using special hardware and millions of samples, which are not always available in real applications. Also, source and target domain networks utilized in these techniques have different objectives, causing the target domain network to receive only a part of its knowledge from the source. The aim of this study is health monitoring of structures using compact 1D CNNs, which use less data for training. Health monitoring is carried out in two stages: the first stage aims to develop a compact 1D CNN that can recognize damages in a structure with high accuracy, when sufficient data from the structure are available. The problem of inadequate training data in health monitoring of experimental and real-life structures is then investigated in the second stage using transfer learning. A compact 1D CNN is utilized as the source domain network and the target domain network receives all of its knowledge from the source. The two stages in this study are applied on a bridge health monitoring (BHM) benchmark model, an experimental grid structure, and a full-scale bridge, for validation. The rest of the study comprises 6 sections: an overview of CNNs and transfer learning is provided in Section 2, Section 3 introduces the proposed methodology, Section 4 describes how data from the structures are utilized, the presented method is validated in Section 5, conclusions are drawn in Section 6, and references are presented in the references section.

Convolutional Neural Networks.
Shock and Vibration
In the history of deep learning, few algorithms have been as influential as CNNs. They have exhibited their great performance in many fields including face recognition, speech processing, medical science, and SHM. CNNs employ convolution and pooling layers to extract features from raw signals; fully connected layers are then employed to classify the input data. Figure 1 shows an example of a CNN that consists of two convolution, two pooling, and two fully connected layers and is used to classify 128 × 1 signals into two classes. The success of CNNs is mainly the result of using convolution layers. Each neuron in a convolution layer is connected to only a small part of the previous layer, reducing the computational burden and increasing the efficiency of the network to a large extent. CNNs are employed in this study to detect damages in civil structures, in the manner discussed in Section 3.
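To make the efficiency argument concrete, the following minimal sketch compares the number of trainable parameters in a fully connected layer with that of a single-filter 1D convolution layer applied to a 128 × 1 signal. The kernel size of 5 and the helper names are hypothetical, chosen only for illustration; they are not taken from Figure 1.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(kernel_size: int, in_channels: int, n_filters: int) -> int:
    """Weights plus biases of a 1D convolution layer (shared across positions)."""
    return kernel_size * in_channels * n_filters + n_filters


# 128 x 1 input signal, as in the example of Figure 1.
signal_length = 128
# Hypothetical kernel size, used only for illustration.
kernel_size = 5

dense = dense_params(signal_length, signal_length)
conv = conv1d_params(kernel_size, in_channels=1, n_filters=1)
print(dense, conv)  # 16512 6
```

Because the convolution kernel is reused at every position of the signal, its parameter count does not grow with the signal length, whereas the fully connected layer scales with the product of its input and output sizes.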

Transfer Learning.
Transfer learning is a method developed in the field of deep learning that focuses on applying the knowledge obtained by working out a problem to another related problem [35]. The aim of this technique is to compensate for data shortage in solving the new problem. A source domain network is first trained on a source database and task; then, the learned features are transferred to a target domain network that should be trained on a target database and task [36]. Learning low-level features using a large dataset will improve the performance of the target domain network and the speed of learning is greatly increased. These features are minor details that can be shared across different related machine learning-based tasks, such as lines or edges in image data. In general, there are two approaches to transfer learning: the first one includes selecting a related task for which there is an abundance of data, developing a source network for this task, and using all or parts of the source model as a base for a network for the target task. In the second approach, a pretrained source model trained on a related task is first selected. All or parts of this model are then used as a starting point for the target domain network. In both cases, the model may need to be tuned on the target dataset. The first approach is adopted in this study.
In order for transfer learning to be effective, three conditions need to be satisfied:
(1) The type of data should be the same on both source and target domain networks.
(2) The amount of the training data for the source domain network must be much greater than that for the target domain network.
(3) The low-level features obtained for the source task should be suitable for the target task as well.
As the insufficient amount of data is one of the main problems when employing deep learning-based SHM techniques for real-life civil structures, this study proposes a transfer learning-based approach to overcome this issue. Further explanations in this regard are presented in Section 3.2.

Methodology
As previously stated, this study investigates the health monitoring of civil structures in two stages. The method proposed for this purpose is explained in the following subsections.

Stage 1.
In this stage, a CNN is designed and trained on sufficient acceleration data from a structure, gathered by the sensors installed on it, to detect the presence of damage in the structure. Seven sensors with a random placement are considered for this purpose to record acceleration data in the healthy and damaged states. The output for each sensor is an array of acceleration data. Samples of these arrays are shown in Figures 2-4. Data from these sensors are first normalized between 0 and 1 and then concatenated to form a 7-column matrix. This matrix is divided into 1000 × 7 matrices, which are shuffled before being inputted to the network. The architecture of the CNN and its hyperparameters are determined through trial and error to gain an optimal network with the best possible performance.
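The preprocessing steps described above (normalization, concatenation into a 7-column matrix, division into 1000 × 7 matrices, and shuffling) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `make_samples` is hypothetical, min-max normalization is assumed to be applied per sensor (the text does not say whether it is per sensor or global), incomplete trailing windows are assumed to be discarded, and the input records are synthetic.

```python
import numpy as np


def make_samples(sensor_signals, window=1000, seed=0):
    """Normalize each sensor to [0, 1], stack as columns, cut into windows, shuffle."""
    columns = []
    for signal in sensor_signals:
        signal = np.asarray(signal, dtype=float)
        lo, hi = signal.min(), signal.max()
        # Per-sensor min-max normalization (assumed); flat signals map to zeros.
        columns.append((signal - lo) / (hi - lo) if hi > lo else np.zeros_like(signal))
    matrix = np.column_stack(columns)        # shape: (n_points, n_sensors)
    n_windows = matrix.shape[0] // window    # the incomplete tail is dropped (assumed)
    samples = matrix[: n_windows * window].reshape(n_windows, window, matrix.shape[1])
    rng = np.random.default_rng(seed)
    return samples[rng.permutation(n_windows)]  # shuffle the order of the windows


# Synthetic stand-in for seven acceleration records (not data from the study).
rng = np.random.default_rng(42)
records = [rng.normal(size=5500) for _ in range(7)]
samples = make_samples(records)
print(samples.shape)  # (5, 1000, 7)
```

Only the order of the windows is shuffled; the time ordering inside each 1000 × 7 matrix is preserved.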
The study conducted to find optimal network parameters is further elaborated in Section 5.1. Figure 5 shows the architecture of the network. The three 1D convolution layers each have one filter, with kernel sizes of 32, 16, and 8, respectively. The stride in these layers is equal to 2, 1, and 1, respectively, and the exponential linear unit (ELU) activation function is employed for all of them. The softmax function is used as the activation function of the output layer; cross-entropy and adaptive moment estimation (Adam) are chosen as the loss function and optimizer, respectively. The learning rate, the number of epochs, and the batch size are set to 0.005, 30, and 64, respectively. The designed CNN is trained on an equal number of 1000 × 7 matrices from the healthy and the damaged states, and the validation data are used to verify the performance of the network during training. The trained CNN is then utilized to detect the presence of damage when it receives the test data as input. The test dataset also consists of an equal number of healthy and damaged state data. About 75% of the available samples (1000 × 7 matrices) from each state are used in the training phase, of which 20% (15% of all samples) are employed as validation samples. The remaining 25% of the samples are employed to check the ability of the network to detect structural damages.
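As an illustration of how the stated kernel sizes and strides shape the data, the sketch below passes one synthetic 1000 × 7 matrix through three single-filter 1D convolutions with ELU activations, followed by a softmax output and a cross-entropy loss. It rests on several assumptions that the text does not specify: no padding is used, the pooling layers of Figure 5 are omitted, the weights are random, and biases are left out. It is a forward pass only, not the trained network of this study.

```python
import numpy as np


def conv1d_valid(x, kernel, stride):
    """Single-filter 1D convolution (cross-correlation) over a (length, channels) input, no padding."""
    k = kernel.shape[0]
    n_out = (x.shape[0] - k) // stride + 1
    return np.array([np.sum(x[i * stride : i * stride + k] * kernel) for i in range(n_out)])


def elu(z, alpha=1.0):
    """Exponential linear unit."""
    return np.where(z > 0, z, alpha * (np.exp(np.minimum(z, 0)) - 1.0))


def softmax(logits):
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return e / e.sum()


def cross_entropy(probs, label):
    return -np.log(probs[label])


rng = np.random.default_rng(0)
h = rng.random((1000, 7))  # synthetic normalized input, not data from the study
# (kernel size, stride) of the three convolution layers as stated in the text.
for k, s in [(32, 2), (16, 1), (8, 1)]:
    kernel = rng.normal(scale=0.1, size=(k, h.shape[1]))  # random stand-in weights
    h = elu(conv1d_valid(h, kernel, s))[:, None]          # one filter -> one channel
print(h.shape)  # (463, 1)

# Fully connected output layer with two classes (0 = healthy, 1 = damaged).
w_out = rng.normal(scale=0.01, size=(h.shape[0], 2))
probs = softmax(h[:, 0] @ w_out)
loss = cross_entropy(probs, label=1)
```

Under the no-padding assumption, the sequence length shrinks from 1000 to 485, 470, and 463 after the three layers; with pooling layers in between, as in Figure 5, the lengths would be smaller.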

Stage 2.
This stage presents a transfer learning-based solution to deal with the lack of data in deep learning-based health monitoring of civil structures. As a large amount of data can always be obtained from a numerical model, data from the model of a BHM benchmark structure were utilized to train the source domain network. A simple CNN is used as the source domain network, whose details are the same as those of the network described in stage 1; the only differences are that the number of filters in the three convolution layers of the new network is increased to 32, 64, and 128, and the number of epochs is set to 100. Similar to stage 1, these numbers were chosen by trial and error to reach the best possible performance.
After training, the knowledge obtained by this network can be utilized to compensate for the lack of data in experimental and real-life structures. The target domain networks, which are trained on inadequate experimental and real-life data, have the same structure as the source domain network. A comparison between source and target domain data is shown in Figures 2-4. The convolution and the pooling layers of the target domain networks receive their knowledge completely from the source domain network through transfer learning, so weights and biases are not updated in these layers. In the end, a fully connected layer classifies the input data as healthy or damaged, and the health state of the structures is evaluated.
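The idea of keeping the transferred layers fixed while training only the final fully connected layer can be sketched as below. This is a simplified illustration, not the network of this study: the "source" kernels are random stand-ins for weights that would be copied from a trained source domain network, a single convolution layer with four filters and global max pooling replaces the full architecture, plain gradient descent replaces Adam, and the target data are synthetic. The names `extract_features` and `train_head` are hypothetical.

```python
import numpy as np


def extract_features(x, kernels, stride=2):
    """Frozen feature extractor: 1D convolution, ELU, then global max pooling."""
    k = kernels.shape[1]
    n_out = (x.shape[0] - k) // stride + 1
    windows = np.stack([x[i * stride : i * stride + k] for i in range(n_out)])
    z = np.einsum("wkc,fkc->wf", windows, kernels)         # (positions, filters)
    a = np.where(z > 0, z, np.exp(np.minimum(z, 0)) - 1)    # ELU
    return a.max(axis=0)                                    # (filters,)


def train_head(features, labels, epochs=500, lr=0.001):
    """Train only the fully connected softmax layer by gradient descent."""
    n, d = features.shape
    w, b = np.zeros((d, 2)), np.zeros(2)
    onehot = np.eye(2)[labels]
    losses = []
    for _ in range(epochs):
        logits = features @ w + b
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        losses.append(-np.mean(np.log(probs[np.arange(n), labels])))
        grad = (probs - onehot) / n        # gradient of the loss w.r.t. the logits
        w -= lr * features.T @ grad        # only the head is updated
        b -= lr * grad.sum(axis=0)
    return w, b, losses


rng = np.random.default_rng(0)
# Stand-in for convolution weights copied from a trained source network (random here).
source_kernels = rng.normal(scale=0.1, size=(4, 32, 7))
frozen = source_kernels.copy()  # transferred weights, never updated below

# Small synthetic target dataset: label 0 = healthy, 1 = damaged (not real data).
labels = np.array([0, 1] * 10)
signals = [rng.normal(scale=1.0 + 0.5 * y, size=(1000, 7)) for y in labels]
features = np.stack([extract_features(s, frozen) for s in signals])

w, b, losses = train_head(features, labels)
print(losses[0], losses[-1])
```

Because the transferred kernels are fixed, the features of each sample need to be computed only once, and the few parameters of the output layer are the only quantities fitted to the scarce target data.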

A Bridge Health Monitoring Benchmark Model.
The grid structure located at the University of Central Florida and its numerical model were developed by Burkett [37] to provide a test bed for researchers to evaluate their SHM techniques. A scheme of this structure and its dimensions are depicted in Figure 6. S3 × 5.7 and W12 × 26 profiles are used for beams and columns, respectively. The numerical model of the structure is used to validate the proposed methodology. Damage is simulated by releasing the major moment of a beam-to-girder connection. To obtain the data needed for SHM, the structure is excited by a dynamic load of 10 kN, and its vibration response in the healthy and the damaged state is recorded by 7 sensors with a random placement. Figure 2 shows samples of the data recorded by one of these sensors in the healthy and the damaged state. Also, the damaged connection and excitation location are depicted in Figure 7.

Qatar University Grandstand Simulator.
The main steel frame of Qatar University (QU) grandstand simulator is considered as an experimental structure to test the performance of the proposed method. This structure is shown in Figure 8. About 30 damage scenarios were considered for this structure, in which each of the damages was introduced to the frame by loosening the bolts at a specific beam-to-girder connection. A sensor was installed on each of these connections to record the acceleration response of the frame in the healthy and the damaged states under random shaker excitation [38]. Data from 7 randomly selected sensors are utilized in this study. Here, data gained by each sensor when its corresponding joint is damaged are employed to form the damaged state input samples by concatenating the array of data points from each sensor and dividing the resulting matrix into 1000 × 7 matrices. Samples of acceleration signals recorded by one of these sensors in both states are depicted in Figure 3.

Tianjin Yonghe Bridge.
The Tianjin Yonghe bridge is an old cable-stayed bridge located in mainland China (Figure 9). The main span of the bridge is 260 m long. It also has two side spans with a length of 125 m each and two 60.5 m tall towers. The total width of the deck is 11 m (9 m for vehicles and 2 m for pedestrians). After 19 years in service, serious damages were observed in the midspan girder, so the bridge was repaired and rehabilitated. Meanwhile, the Center of Structural Monitoring and Control (SMC) at the Harbin Institute of Technology designed an SHM system and installed it on the bridge. This included 14 accelerometers on the deck and 1 accelerometer on top of the south tower. The sensor placement on the bridge is depicted in Figure 10 [39].
A while after the bridge was reopened to traffic, new damage patterns were observed in the side spans and piers. During this period, the acceleration time histories of the bridge under traffic and environmental loads, from the healthy to the highly damaged state, were recorded by the installed SHM system [40]. In this study, data from 7 randomly chosen sensors, recorded on January 17, 2008, and July 31, 2008, are used as the healthy and the damaged state data, respectively. Samples of signals from one of the sensors recorded on these dates are depicted in Figure 4.

Validation of the Proposed Methodology
To validate the performance of the proposed method, it is applied to the three structures shown in Section 4. For this purpose, Python 3.8.2 is utilized on a computer with an Intel Core i5-5200U processor and 4 GB installed memory (RAM). The results obtained in each of the two stages are presented in this section.

Stage 1.
The CNN described in Section 3.1 is used here to detect the presence of damage in the three structures. A total of 4096, 1048, and 17280 samples (1000 × 7 matrices) are available for the BHM benchmark model, QU grandstand simulator, and the Tianjin Yonghe bridge, respectively. About 60% of the available samples from each structure are used to train the network, 15% of them are used to validate the training process, and the remaining 25% are employed to test its ability to detect damages. The sensitivity analysis conducted to determine network parameters includes checking the accuracy, loss, and total training time of the network, for varying optimizers, learning rates, and epochs, when receiving data from the three structures. About 12 cases were considered for this purpose (Table 1). Figure 11 shows the results of the training process considering these cases. It is obvious from the figure that choosing case (b) results in the best performance, so this case was considered in the proposed method.
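For orientation, the 60%/15%/25% split can be turned into approximate sample counts for the three datasets with the short sketch below. The rounding rule (integer truncation for the training and validation sets, with the test set taking the remainder) and the function name `split_counts` are assumptions, so the counts may differ slightly from those reported in Table 2.

```python
def split_counts(n_samples, train_pct=60, val_pct=15):
    """Return (train, validation, test) counts; the test set takes the remainder."""
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    return n_train, n_val, n_samples - n_train - n_val


# Numbers of available 1000 x 7 samples as stated in the text.
available = {
    "BHM benchmark model": 4096,
    "QU grandstand simulator": 1048,
    "Tianjin Yonghe bridge": 17280,
}
for name, n in available.items():
    print(name, split_counts(n))
```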
Accuracy and loss graphs of the CNN when trained on data from the three structures are depicted in Figure 12. Also, the number of the utilized samples and the results of the training process are presented in Table 2. As can be seen from the figure and the table, for all structures, when 75% of the available samples are used in the training phase, the network reaches a very high classification accuracy and a very low error. This means that the created model is well trained and is able to accurately identify the class of the input data. However, according to the experiments conducted in this study, for all structures, when less than 50% of the available data is used in the training phase, the test accuracy decreases to less than 60%. Since the amount of available data from experimental and real-life civil structures is often low, the CNN by itself might be inadequate for the SHM of these structures. The efficiency of the second stage of the proposed methodology in dealing with this problem is validated in the next subsection.

Stage 2.
In this stage, it is assumed that only 45% and 10% of the available samples from QU grandstand simulator and the Tianjin Yonghe bridge can be utilized in the training [...] as test samples. Accuracy and loss graphs of the source domain network and the results of the training process are shown in Figure 13 and Table 3, respectively. The convolution and the pooling layers of the target domain networks receive their knowledge completely from the trained source domain network. These networks are then trained and validated on 45% and 10% of the available samples from QU grandstand simulator and the Tianjin Yonghe bridge, respectively (Figure 14). The results of employing the transfer learning technique for the two structures are shown in Figure 15 and Table 4. The obtained results indicate that although the amount of data used to train the target domain networks is quite low, they can identify the class of the input data with a high accuracy. Samples of the predicted labels for a number of randomly selected healthy and damaged state matrices from QU grandstand simulator are listed in Table 5. In this table, 0 and 1 labels represent healthy and damaged states, respectively. It can be observed that the labels for all of the considered data samples are predicted correctly. The results were similar for the other structure. This indicates that the proposed transfer learning-based technique is efficient in health monitoring of these structures, even when the amount of available data is very low.
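The mapping from network outputs to the 0/1 labels and the resulting accuracy can be sketched as follows. The softmax outputs are hypothetical values chosen for illustration; they are not the entries of Table 5, and the helper names are assumptions.

```python
import numpy as np


def predict_labels(probabilities):
    """Pick the class with the highest softmax probability (0 = healthy, 1 = damaged)."""
    return np.argmax(probabilities, axis=1)


def accuracy(predicted, true):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(predicted == true))


# Hypothetical softmax outputs for four test matrices (not values from Table 5).
probs = np.array([[0.97, 0.03], [0.10, 0.90], [0.80, 0.20], [0.05, 0.95]])
true = np.array([0, 1, 0, 1])
pred = predict_labels(probs)
print(pred, accuracy(pred, true))
```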

Conclusions
The aim of this study was to provide a comprehensive method for damage detection in civil structures, even when the amount of available data is low. This aim was reached through two stages using compact 1D CNNs, which need much less data for training compared to complex networks. In the first stage, a compact CNN was developed to recognize damages in a structure with high accuracy, when provided with sufficient raw acceleration data from the structure. The problem of the lack of data in experimental and real-life structures was then studied in the second stage. A transfer learning-based technique was proposed to deal with this issue. A compact CNN was used as the source domain network to be trained on sufficient data from a numerical model. The target domain networks, which were to be trained on inadequate data from experimental and real-life structures, would receive their knowledge completely from the source domain network through transfer learning.
To validate the performance of the proposed methodology, it was applied on the numerical model of the BHM benchmark structure, QU grandstand simulator, and the Tianjin Yonghe bridge. The results demonstrated that the proposed method can successfully be utilized for damage detection in civil structures, even with a low amount of acceleration data.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.