Ensemble Classifiers and Feature-Based Methods for Structural Damage Assessment

In this paper, a new structural damage detection framework is proposed based on vibration analysis and pattern recognition. It consists of two stages: (1) signal processing and feature extraction and (2) damage detection by combining classification results. In the first stage, discriminative features were extracted as a set of proposed descriptors related to the statistical moments of the spectrum and spectral shape properties using five competitive time-frequency techniques: fast S-transform, synchrosqueezed wavelet transform, empirical wavelet transform, wavelet transform, and short-time Fourier transform. Then, forward feature selection was employed to remove redundant information and select damage features from the vibration signals. By applying different classifiers, the capability of the feature sets for damage identification was investigated. In the second stage, ensemble-based classifiers were used to improve the overall damage detection performance of the individual classifiers and increase the number of detectable damage cases. The proposed framework was verified by a suite of numerical and full-scale studies (a bridge health monitoring benchmark problem, the IASC-ASCE SHM benchmark structure, and a cable-stayed bridge in China). The results showed that the proposed framework was superior to the existing single classifiers and could assess damage with fewer false alarms.


Introduction
Intelligent damage detection of civil infrastructure is vital in structural health monitoring (SHM) in order to improve damage prediction performance and reduce maintenance costs. Therefore, developing efficient methods for detecting structural damage at an early stage is extremely important for assessing structural integrity and supporting decision-making on a structure's repair. In recent years, the main SHM focus has been on vibration-based techniques because of their ability to detect damage hidden within the internal areas of a structure before it can be observed by visual inspection [1]. These techniques are based on the idea that damage changes both the physical properties of a structure and its dynamic characteristics, which are revealed in the measured vibration response [2]. A vibration-based damage detection method includes three main steps: (1) signal monitoring, (2) signal processing, and (3) data interpretation. The objective of signal processing, which is a major component of any vibration-based technique, is to extract subtle behaviour changes from the vibration data (feature extraction) in order to determine whether the structure is damaged or not [3,4]. The features that have been applied in vibration-based SHM research consist of time-domain, frequency-domain, and time-frequency domain features extracted by signal processing methods.
Time-domain features represent temporal aspects and are fast and readily applicable. Techniques such as the mean, root mean square, skewness, kurtosis, and productivity ratio can be computed directly on time-series data [5][6][7]. Frequency-domain features represent the frequency content and spectral aspects obtained by the fast Fourier transform (FFT), such as the energy in different frequency bands and the Fourier coefficients [8][9][10]. Time-frequency domain features, such as energy concentration, amplitude levels in time-frequency (TF) bands, and the time-frequency distribution, can be extracted by using various signal processing tools to represent signal characteristics in the joint time and frequency domains. Time-frequency techniques such as the wavelet transform (WT), Wigner-Ville distribution, short-time Fourier transform, and Hilbert-Huang transform have overcome the time-information loss problem of frequency-domain methods. Wang and Shi [11] proposed a novel damage index, namely, the energy curvature difference (ECD), based on the wavelet packet transform, to identify damage in structures. The results of their study indicated that the proposed ECD index was sensitive to low damage levels and applicable for damage identification. Xu and Wu [12] proposed a damage assessment strategy based on the energy of acceleration responses for identifying damage in long-span bridge structures. Xin et al. [13] introduced an improved empirical wavelet transform (EWT) method using measured dynamic responses of structures to identify structural modal parameters. Young et al. [14] introduced three damage-sensitive features (DSFs) obtained by applying the continuous wavelet transform. These DSFs were extracted from structural responses and determined as wavelet energy functions at specific times and specific frequencies. Liu et al. [15] introduced a time-frequency analysis method, i.e., the S-transform, to analyse the vibration signals of a reinforced concrete beam under different loading force states in order to extract changes in the vibration data for damage identification. The synchrosqueezed wavelet transform (SWT) has also been employed to extract features for structural damage assessment [16,17].
In the data interpretation stage, an automatic decision-making system is required to classify the structural condition into different health categories. Indeed, combining pattern recognition algorithms with signal processing techniques has attracted the attention of many researchers in recent years. Some of the most common classification methods used in structural damage detection are artificial neural networks, fuzzy logic, support vector machines, k-nearest neighbor, and Bayesian classifiers [18][19][20][21].
Each of the abovementioned signal processing techniques has its own advantages and disadvantages, which may affect the final results of the damage identification process. Some of these techniques are suitable for one application, but not for another. Thus, it is important to choose an appropriate signal processing method for assessing structural damage; an inappropriate method may lead to erroneous results or false alarms.
In the present study, the proposed strategy is an extension of a technique proposed by Bisheh et al. [22], which evaluated the possibility of damage occurrence by analysing the measured structural responses using feature extraction and selection. The present work aims to identify the presence of damage in structures by employing pattern recognition methods using a set of damage-sensitive features. The proposed feature set, as a structural damage indicator, is selected by combining feature-based techniques, feature selection, and ensemble classifier methods to increase the detection accuracy and reduce false alarms. A traditional method, i.e., the short-time Fourier transform (STFT), was employed as a tool for nonstationary signal processing to extract damage-related information from the vibration signals. However, assessing the damage that occurred in the bridge depends on how efficiently the damage features are extracted by the signal processing procedures. Therefore, using more recent signal processing procedures can improve the accuracy of feature extraction.
This work focuses on recently developed signal processing methods for the feature extraction and selection process in order to provide damage feature subsets.
The ensemble classifier is used to find an optimal feature set with high accuracy as the indicator of structural damage. Moreover, in the study by Bisheh et al. [22], the support vector machine (SVM) was utilized as a classical classification method to classify damage, whereas in this study, classifier combination techniques were applied to enhance the accuracy of damage detection. Accordingly, in the first stage, five signal processing techniques were used as potential candidates for feature extraction and the results obtained from the various approaches were investigated and compared. These techniques consisted of the fast S-transform (FST), synchrosqueezed wavelet transform (SWT), empirical wavelet transform (EWT), wavelet transform (WT), and short-time Fourier transform (STFT). In the second stage, ensemble-based classifiers were used for finding the predicted class in order to improve the overall damage detection performance of three individual classifiers, namely, MLP, KNN, and SVM. In particular, the following combining algorithms were applied as ensemble-based methods: majority voting, algebraic combiners, decision templates (DTs), and Dempster-Shafer (DS). The performance of the proposed method was validated using a suite of numerical simulations and full-scale studies. This paper is organized as follows.
The related techniques are briefly presented first. Next, the proposed framework is described and applied to the structures. Later, the results are discussed and, finally, conclusions are given in the last section.

Feature Extraction Methods in Vibration Signals.
Feature extraction is a crucial step in signal processing and a key procedure in damage identification in structures. It aims to extract a set of features that maximizes the recognition rate by retrieving the most important information from the raw data. How well the extracted features reflect the relevant information about structural damage is the most important factor in achieving high assessment performance. In this paper, time-frequency features are applied, since they are able to capture signal characteristics that may be hidden in the time domain. Furthermore, they can track the time-varying nature of real signals, which is not possible with the conventional methods. These descriptors are spectral indicators corresponding to the statistical properties of the spectrum and spectral shape properties. These features are calculated using Equations (1)-(10), as indicated in Table 1 [23].
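For illustration, the statistical moments of a magnitude spectrum, which underlie descriptors of this kind, can be sketched as follows. This is a minimal numpy sketch with illustrative names and formulas; the paper's exact Table 1 definitions are not reproduced here.

```python
import numpy as np

def spectral_moments(mag, freqs):
    """Statistical moments of one magnitude-spectrum frame.

    Illustrative sketch of spectral-shape descriptors (centroid, spread,
    skewness, kurtosis); not the paper's exact Table 1 formulas.
    """
    p = mag ** 2
    p = p / p.sum()                                 # normalize to a distribution
    centroid = np.sum(freqs * p)                    # 1st moment: spectral centroid
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * p))   # 2nd moment
    skew = np.sum(((freqs - centroid) / spread) ** 3 * p)   # 3rd moment
    kurt = np.sum(((freqs - centroid) / spread) ** 4 * p)   # 4th moment
    return centroid, spread, skew, kurt

# toy example: a spectrum peaked at 10 Hz
freqs = np.linspace(0, 50, 128)
mag = np.exp(-0.5 * ((freqs - 10.0) / 2.0) ** 2)
c, s, sk, ku = spectral_moments(mag, freqs)
```

For this symmetric peak, the centroid lands near 10 Hz and the skewness is close to zero; damage-induced changes in the response spectrum would shift such descriptors.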
Various time-frequency techniques are used for feature extraction in order to obtain damage features with lower dimensionality and higher sensitivity. Time-frequency techniques analyse the time-varying spectral properties of the vibration signals. These techniques include the FST, SWT, EWT, WT, and STFT. A detailed description of the methods used is given in Appendix A.

Ensemble Classifiers.
Ensemble techniques combine various classifiers to enhance performance relative to an individual classifier. Instead of using a single classifier, this approach combines several weak classifiers to enhance the recognition precision. In this paper, three individual classifiers, i.e., KNN, MLP, and SVM, were implemented to find the predicted class. These classifiers rely on different classification strategies. The following algorithms are used to cover various categories of combinations. The decision of the t-th classifier is given as d_{t,j} ∈ {0, 1}, where j = 1, ..., C and t = 1, ..., T; T and C are the numbers of classifiers and classes, respectively. d_{t,j} = 1 when the t-th classifier selects class ω_j, and d_{t,j} = 0 otherwise [24].
(1) Majority voting: the most well-known majority voting schemes are (1) unanimous voting, in which all classifiers agree on a class; (2) simple majority, in which more than half of the classifiers agree; and (3) plurality voting, in which the class receiving the largest number of votes is chosen. The plurality voting rule selects class ω_J such that Σ_{t=1}^{T} d_{t,J} = max_{j=1,...,C} Σ_{t=1}^{T} d_{t,j}. (2) Average rule: the mean of all classifier outputs is calculated; this rule is the same as the summation rule with a dividing factor of 1/T: μ_j(x) = (1/T) Σ_{t=1}^{T} d_{t,j}(x). (3) Extrema rule: these functions simply take the maximum or minimum among the classifiers' individual outputs: μ_j(x) = max_t d_{t,j}(x) or μ_j(x) = min_t d_{t,j}(x). (4) Product rule: the supports for each class are multiplied across classifiers, μ_j(x) = Π_{t=1}^{T} d_{t,j}(x), so classifiers whose scores are near 1 are favoured, while classifiers with a low score (close to 0) are given less chance to be selected.
(5) DTs: decision templates are calculated as the mean of the decision profiles of each class over the training set, DT_j = (1/N_j) Σ_{x_k ∈ ω_j} DP(x_k), where N_j is the number of class-j instances. The decision profile of each instance, DP(x), is compared to the DT of each class, and the most similar one is selected as the ensemble decision. (6) DS-based rule: DS theory is widely used in data fusion techniques, using belief functions (unlike common probability theory) to combine data from different sources. Inspired by data fusion, DS theory is used here for ensemble combination. If DT_j^t is the t-th row of DT_j and C_t(x) is the t-th classifier output, the proximity Φ_{j,t}(x) is computed as Φ_{j,t}(x) = (1 + ||DT_j^t − C_t(x)||²)^{−1} / Σ_{k=1}^{C} (1 + ||DT_k^t − C_t(x)||²)^{−1}.
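The voting and algebraic rules above can be sketched as follows. This is a hedged numpy sketch: the function names are ours, and the supports d[t, j] are taken as soft classifier outputs in [0, 1] rather than the crisp 0/1 decisions of the formal definition.

```python
import numpy as np

def majority_vote(d):
    """Plurality vote: each classifier picks its top class; most votes wins."""
    votes = np.argmax(d, axis=1)
    return int(np.bincount(votes, minlength=d.shape[1]).argmax())

def average_rule(d):
    """Average (sum rule with factor 1/T) over classifier supports."""
    return int(d.mean(axis=0).argmax())

def product_rule(d):
    """Product rule: rewards classes consistently scored near 1."""
    return int(np.prod(d, axis=0).argmax())

# three classifiers, two classes: two of the three prefer class 1
d = np.array([[0.8, 0.2],
              [0.2, 0.8],
              [0.4, 0.6]])
```

All three rules agree on class 1 here, but they can disagree when one classifier is very confident, which is exactly the diversity that ensemble combination exploits.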

Table 1 lists the spectral descriptors and their defining equations; for example, the spectral centroid is S1 = Σ_{k=0}^{(K/2)−1} k |X(k, n)| / Σ_{k=0}^{(K/2)−1} |X(k, n)|, where |X(k, n)| is the magnitude spectrum (the magnitudes of the STFT) of the input signal. The detailed description of the individual classifiers used is given in Appendix B.

Proposed Framework
The proposed strategy was an extension of the method proposed by Bisheh et al. [22]. In this section, we provide a proposed feature set based on feature extraction and present an ensemble learning framework that combines classifiers trained on different feature sets. An optimal feature set was obtained by combining feature-based methods, feature selection, and ensemble classifier techniques in order to improve damage detection accuracy. First, various feature extraction techniques in the time-frequency domain and the feature selection method were employed to extract the spectral descriptors from the signals in order to provide the damage-sensitive feature subsets. Next, an effective feature set that correlated well with damage was found by using ensemble classifiers. Different combinations of these classifiers were examined to find the best subset of features with the highest damage detection capability, as summarized in Figure 1:

(1) A dataset containing vibration data before and after damage of the structures is used. The data are the vertical accelerations of the bridge deck, which are divided into shorter segments for processing.

(2) The descriptor set (instantaneous features in the time-frequency domain) presented in Table 1 is extracted from the vibration signals by using different signal processing methods, including STFT, FST, WT, SWT, and EWT. For each of the competitive time-frequency techniques, the extracted feature set produces one vector per data segment. The forward selection method is carried out to eliminate redundant information and select damage features from the descriptor set. Before the analysis, normalization is conducted because the data have different ranges: v_nj = (v_j − μ_j)/σ_j, where σ_j and μ_j are the standard deviation and mean of the j-th dataset and v_j and v_nj are the input and normalized data points, respectively.
(3) To investigate the ability of each feature set extracted by the competitive time-frequency methods, three schemes for decision-making are used. The individual classifiers include the multilayer perceptron (MLP), SVM, and k-nearest neighbor (KNN). Cross-validation is conducted to verify the classification performance on unseen data; the k-fold cross-validation method is employed in this work, in which the original data sample is randomly divided into k equal subsamples. It is worth noting that MATLAB (2009) is used for the calculations.

(4) Ensemble learning methods are applied to improve the performance of the diverse classifiers. These methods combine different classifiers that are trained on different feature sets obtained by using the various signal processing methods. To find the predicted class, the strategies employed in combining the single classifiers cover various categories of combinations: algebraic combiners (maximum/minimum/average/sum/product rule), majority voting, DTs, and Dempster-Shafer.
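The normalization of step (2) and the random k-fold split of step (3) can be sketched as follows (a minimal numpy illustration; the function names and the fixed seed are ours):

```python
import numpy as np

def zscore(v, mu, sigma):
    """Step (2) normalization: v_n = (v - mu) / sigma, applied per feature."""
    return (v - mu) / sigma

def kfold_indices(n, k, seed=0):
    """Step (3): randomly divide n sample indices into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

# toy data: two features with very different ranges
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
Xn = zscore(X, X.mean(axis=0), X.std(axis=0))
folds = kfold_indices(len(X), 2)
```

After normalization each feature has zero mean and unit variance, so no descriptor dominates the classifier purely because of its scale.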

Bridge Health Monitoring Benchmark Problem

The numerical model of a bridge health monitoring benchmark problem, developed at the University of Central Florida, was used to validate the proposed method, as shown in Figure 2. The physical model had two spans of 5.49 m in the longitudinal direction with continuous beams. It was supported by 1.07 m columns and the width of the bridge was 0.92 m. The numerical benchmark problem was prepared using a finite element (FE) model, which included 1056 degrees of freedom, 176 nodes, and 181 elements. Three damage cases with different levels, such as boundary condition changes and reduced stiffness at connections, were simulated. Several sensors, such as accelerometers, were located on the model to record the dynamic responses under random loading. More details about the benchmark study and the numerical model can be found at the benchmark bridge website [25]. For this part of the study, a number of accelerometers located on the model were considered to record the vertical accelerations (N1, N2, N4, and N5 in Figure 2). In addition to the undamaged bridge, three damage cases (i.e., cases A, B, and C) were considered to show the effectiveness of the proposed optimal feature subsets and compare them with each other.

Case A: Removing Plate and Releasing Moment at N3.
In this case, the gusset plates at node N3 were removed and the moment of the transverse beam connecting at this node was released by removing the bolts. Also, 10% white noise was added artificially. The response data were collected from the model. To evaluate the effect of the feature extraction methods on the final results, the features presented in Table 1 were extracted by using different signal processing techniques. Next, forward feature selection (FFS) was performed for each of the feature extraction methods in order to obtain an optimal feature set as the damage index. STFT, FST, WT, SWT, and EWT were considered as the signal processing methods. The classification precision of the various feature sets in damage detection of the bridge was studied using SVM as a classifier to separate the damaged and healthy states. The classification accuracies are presented in Table 2. Results showed that the proposed features were successful in detecting damage for this case at node N2 without giving any false alarm or misclassification. Moreover, the selected features as a damage index provided satisfactory precision for classifying or predicting the bridge condition for this case at node N1. Results illustrated that STFT was more successful in extracting these features than the other methods, since 100% and 99.6% classification accuracy was obtained for this case at nodes N2 and N1, respectively.

Case B: Boundary Support Restraint at N6 and N7.
For this case, the moment releases at nodes N6 and N7 were eliminated and 10% white noise was added artificially. The deck was now fixed to the columns. The response of the accelerometers was recorded from the model at nodes N4 and N5, and the feature extraction and selection process was performed for this location, as shown in Figure 2. The average classification accuracy is presented in Table 3, which shows that these features were efficient for damage detection. Figure 3 depicts the effectiveness of all the feature extraction methods, which had an average accuracy of 100% for this case at node N4. Results clearly demonstrated that N5 had lower accuracy than N4, since it was the node farther away from the damage location.

Case C: Release Moment at N3.
The moment of the transverse beam connecting at node N3 was released and 10% white noise was added for case C.

The vertical accelerations at nodes N1 and N2 were recorded as the input data to investigate the effectiveness of the features and feature extraction methods in improving the final results. For this case, results showed that the damage detection accuracy depended on the signal processing technique, with STFT being relatively successful. It obtained an average accuracy of 95.2% for this case at node N2, as shown in Table 4.

The IASC-ASCE Benchmark Structure.
In this section, a shear-building structure is employed to verify the proposed method, as depicted in Figure 4. The structure was a four-story, 2 × 2 bay, steel-frame structure constructed at the University of British Columbia (UBC). The model was 2.5 m wide and 3.6 m high. Each floor was 0.9 m high and there were fixed connections between the beams and columns. It had two braces on each floor and steel plates were located on each bay. Two analytical models were developed to generate the simulated response data. The first one was a 12-degree-of-freedom shear-building model. In this case, the slabs and beams were assumed to be rigid bodies, which constrained all motion except one rotation and two horizontal translations per floor. In the second model, each of the nodes had six DOFs, including three rotational DOFs and three translational DOFs, with respect to the x, y, and z directions. For more details on the benchmark structure, refer to Johnson et al. [26]. In this paper, the finite element model of the 120-degree-of-freedom structure, which was accessible through the Task Group website, was applied. In addition to the undamaged structure, four damage cases were defined to examine the capability of the various signal processing methods for damage detection, as shown in Figure 5. In damage pattern (a), the stiffness of the braces of the 1st story was removed. The features presented in Table 1 were extracted, the forward selection algorithm was carried out to select features from the feature set, and SVM was then applied for classification; the results are presented in Tables 5 and 6. Results showed that all the signal processing methods, except EWT, were successful, since they reached 100% classification accuracy for the damage patterns.
Using the signal characteristics in the joint time and frequency domains and the frequency contents obtained by EWT to extract the spectral descriptors was not successful for the damage patterns in the shear-building structure. The spectral indicators, or time-varying descriptors corresponding to the statistical properties of the spectrum and spectral shape properties, were extracted from all the empirical modes in the EWT analysis. Since some of the empirical modes may not be efficient for extracting damage features, using all empirical modes can lead to misclassifications or false alarms in identifying structural damage. Using only the effective empirical modes (one mode or more) for feature extraction, instead of all modes, may improve the damage detection accuracy for these damage patterns and can be investigated in future studies.

Full-Scale Study.
The Tianjin Yonghe Bridge is a cable-stayed bridge with a continuous prestressed box girder. The bridge had a main span of 260 m and two side spans of 99.85 m. The total width of the bridge was 11 m (a 9 m wide roadway with four vehicle lanes and two 1 m pedestrian walkways). The bridge was built in 1987 and, after 19 years of operation, cracks as wide as 2 cm were observed at the bottom of the midspan girder. During the repair process between 2005 and 2007, an SHM system was designed and implemented for the bridge. The monitoring system consisted of 14 uniaxial accelerometers placed on the deck, downstream and upstream, as shown in Figure 6. More details on the full-scale bridge benchmark problem can be found in [27,28] and are accessible at http://smc.hit.edu.cn.
In August 2008, two damage patterns were identified during the bridge inspection: the external portions of both side spans were cracked and the piers were damaged (Figure 7). Fortunately, time history data of the accelerations for both the healthy and damaged states were available. The available data consist of 24 h records (24 parts of 1 h length), which were recorded on January 1, January 17, February 3, March 19, March 30, April 9, June 16, and July 31, 2008. Data in the healthy and damaged conditions were recorded on January 17 and July 31, 2008, respectively, at the same locations [27]. The sampling frequency of the data was 100 Hz.
In this study, the whole process consisted of feature extraction, feature selection (to obtain optimal feature subsets), three individual classifiers, and ensemble techniques for finding the predicted class. By applying different signal processing techniques in the time-frequency domain as the potential candidates, the set of features was extracted to identify the feature set yielding the most accurate classification. Different feature sets were prepared by feature selection in order to eliminate redundant features and select effective features from the original feature sets. In the feature selection stage, the forward feature selection technique [29] was employed to achieve an optimal reduced set of features. Normalization was also conducted before feature selection. By applying individual classifiers and, then, ensemble classifiers, the classification precision of the various feature sets was studied.
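The forward feature selection step can be sketched as follows. This is a hedged illustration: the greedy add-one-feature loop is the standard scheme, but the scoring function used here (separation between class means) is a simple stand-in for the cross-validated classifier accuracy actually used in the study.

```python
import numpy as np

def forward_select(X, y, score):
    """Greedily add the feature that most improves `score`; stop when none helps."""
    selected, remaining = [], list(range(X.shape[1]))
    best = -np.inf
    while remaining:
        gains = [(score(X[:, selected + [f]], y), f) for f in remaining]
        s, f = max(gains)
        if s <= best:          # no candidate feature improves the score
            break
        best = s
        selected.append(f)
        remaining.remove(f)
    return selected

def class_separation(Xs, y):
    """Illustrative score: distance between the two class-mean vectors."""
    m0, m1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    return float(np.linalg.norm(m0 - m1))

# toy data: only feature 0 separates healthy (0) from damaged (1)
y = np.array([0, 0, 1, 1])
X = np.array([[0.0, 5.0, 0.1],
              [0.2, 5.1, 0.0],
              [1.0, 5.0, 0.1],
              [1.2, 5.1, 0.0]])
chosen = forward_select(X, y, class_separation)
```

On this toy set the procedure keeps only the discriminative feature and discards the two redundant ones, which mirrors how the optimal subsets of Table 7 are obtained.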

Assessment Capability of Feature Sets.
By applying different signal processing methods, the features shown in Table 1 were extracted from the measured acceleration data. Different feature sets were calculated by using different signal processing techniques: feature sets 1, 2, 3, 4, and 5 included the features extracted by using STFT, FST, WT, SWT, and EWT, respectively. In the feature extraction part, the data were divided into successive segments for time-frequency analysis. Features were then computed for each of these blocks and, finally, the mean of the extracted features over a 1 h period was considered as the input for classification. For the STFT, the spectrogram of the data was computed using a Hamming window with a length of 100 samples and 50% overlap. In the next step, classification was carried out and the corresponding accuracy was calculated for the healthy and damaged data segments by utilizing the individual classifiers. After performing the forward feature selection process for all the original feature sets, three types of classifiers were used: MLP, KNN, and SVM. In the classification part, all the experiments were conducted using 10-fold cross-validation; 50% of the data were used randomly for training and the rest for testing. For each of the original feature sets, the subset of important features (the optimal subset) was obtained by using forward feature selection, as presented in Table 7.
It was found that the most significant features for the original feature set extracted by using STFT were S8, S2, S7, S3, and S1, while the selected features in feature set 2, extracted by using FST, were S10, S5, S4, S1, S9, and S7. Feature set 4, with the effective features S5, S2, S9, S1, S4, S7, S6, and S3, was obtained by using SWT and the forward selection method. The average accuracy of the individual classification methods for the different feature sets is shown in Figure 8. Results showed that feature set 1 and feature set 4, extracted by STFT and SWT, respectively, with a classification accuracy of about 97%, had better performance than the other feature sets, whose classification accuracy was about 92%. In other words, the STFT and SWT methods can be effective tools for extracting the proposed damage features. Figure 9 shows the normalized damage features extracted by using the different signal processing techniques for the various monitoring dates. The exact time at which damage occurred was not known [28,30]. In other words, the data labels for the different months were not clear, except for January 17 and July 31, 2008, which were considered the healthy and damaged labels, respectively, in the abovementioned research. However, each of the features extracted by the different methods partly reflected the effect of damage on the bridge behaviour from the healthy to the damaged state over the monitoring dates. It was clear that the change in the bridge behaviour occurred in about May.

Effectiveness of Ensemble-Based Classifiers.
To improve the classification performance of each feature set, various types of ensemble-based classifiers were implemented, which illustrated different combining strategies between the classification methods. The selected combinations included algebraic combiners (maximum, minimum, summation, average, and product rule), majority voting, DS, and DTs for improving the performance of the single classifiers. The average accuracy obtained using these techniques is presented in Table 8.
Majority vote is one of the simplest and most intuitive ensemble combination techniques: the ensemble chooses the class that is selected by the majority of the classifiers. By applying this method, the detection performance is increased to 99%. After classifying the feature sets, algebraic combiners are used to combine the base classifiers, and 99% classification accuracy is achieved by the summation method. In the DT combiner, the most common decision specifications are obtained for each class; new patterns are then classified by comparing their decision profiles with the DT of each class using similarity measurements. The accuracy of damage identification is increased to 99% by the DT and DS techniques. Results showed that the ensemble-based classifiers improved the classification performance, so that a maximum accuracy of 99% was achieved for feature set 1 and feature set 4. This means that the false alarms were decreased by about 2%. For feature set 2, 99% accuracy was yielded compared to the 92% accuracy obtained by the individual classifiers. Also, for the feature sets extracted by FST and EWT, the accuracy of 92% was increased to about 95%. Generally, the ensemble-based classifiers have better performance than the individual classifiers, among which the majority voting, summation rule, DT, and DS methods similarly lead to high accuracy in identifying the damage of the bridge. To summarize the above results, the average improvement in performance between the ensemble-based classifiers and the single classifiers for the different feature sets is presented in Table 9. Results indicated that an improvement in performance of around 2% to 5% can be achieved by applying ensemble-based classifiers, thereby decreasing the false alarms. Ensemble classifiers combine multiple classifiers with various types of features to raise the detection precision and decrease variance and bias.
In ensemble-based systems, if the single classifiers are diverse, they make different errors; combining these classifiers can decrease the total error through averaging. Thus, the final classification result of the ensemble, computed by predefined combination rules over the individual classifiers, is better than the result of a single classifier.
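The averaging argument can be illustrated with a toy simulation. All numbers here are illustrative assumptions, not the paper's data: T diverse "classifiers" each produce a noisy estimate of the same class score, and averaging their independent errors shrinks the squared error roughly by a factor of T.

```python
import numpy as np

rng = np.random.default_rng(42)
true_score, T, n_trials = 0.7, 9, 2000

# one classifier: unbiased but noisy estimate of the class score
single = true_score + rng.normal(0.0, 0.2, size=n_trials)

# ensemble: average of T classifiers with independent noise
ensemble = true_score + rng.normal(0.0, 0.2, size=(n_trials, T)).mean(axis=1)

err_single = float(np.mean((single - true_score) ** 2))      # ~= 0.2**2
err_ensemble = float(np.mean((ensemble - true_score) ** 2))  # ~= 0.2**2 / T
```

The simulation only captures the variance-reduction half of the argument; in practice diversity among KNN, MLP, and SVM errors is what makes the independence assumption approximately hold.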

Conclusions
In this paper, a framework was proposed which involved feature extraction and the combination of individual classifiers for structural damage detection. In the first stage, a set of time-varying descriptors was extracted from the vibration signals by using competitive signal processing techniques as the potential candidates for damage assessment of the bridge to obtain the original feature sets. These techniques included STFT, FST, WT, SWT, and EWT. To evaluate the effectiveness of the feature sets obtained through the forward feature selection process, three diverse classifiers, i.e., MLP, KNN, and SVM, were employed. In the second stage, ensemble methods that combine the outputs of the single classification algorithms were used to achieve improved recognition accuracy for the bridge. Majority voting, algebraic combiners, DTs, and DS were used as the ensemble-based methods to predict the class. Apart from the numerical studies, a full-scale bridge was utilized to validate the efficacy of the proposed method. It was observed that the methodology was successful in damage detection. Overall, the following conclusions can be drawn from the results: feature sets with the effective features extracted by using STFT and SWT yielded better performance than the other feature sets, achieving an average accuracy of 97%.
Using an ensemble classifier instead of an individual classifier improved the average classification accuracy. By applying the DT and DS-based combinations, 99% accuracy was achieved for the feature sets extracted by STFT and SWT. Furthermore, for feature set 2, the average classification accuracy was increased to 97%. In other words, using an ensemble of classifiers yielded a 2% to 5% reduction in false alarms.

A.1. Short-Time Fourier Transform (STFT)
STFT is a signal processing technique for analysing nonstationary signals, in which the statistical features vary over time.
It extracts several blocks (frames) of the original vibration signals to represent frequency contents of the signal by moving a window block over time. If the frame used is sufficiently small, each of the extracted blocks can be estimated as the stationary signal, so that fast Fourier transform (FFT) can be applied. By shifting the window through the whole record and implementing the Fourier transform, the relevance among time and variance of frequency could be recognized and time-varying spectrum is calculated for each of the frames. STFT of a sequence (time-series data x(i)) can be defined as follows: where X(k, n) denotes the window function and k denotes the time index. STFT breaks up the signal in time domain into a number of signals of shorter duration. en, it applies Fourier transform to each part. A spectrogram that is the squared magnitude of STFT is defined as follows [31]:
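The windowing-plus-DFT procedure above can be sketched directly in code. The following is a minimal illustration (not the paper's implementation) using a Hann window and a direct DFT on a toy sine wave; the frame length, hop size, and test signal are chosen only for the example.

```python
import cmath
import math

def stft(x, frame_len, hop):
    """Short-time Fourier transform via a sliding Hann window and a direct DFT.

    Returns a list of frames, each a list of complex DFT coefficients X(k, n).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + i] * window[i] for i in range(frame_len)]
        frames.append([sum(seg[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                           for i in range(frame_len))
                       for k in range(frame_len)])
    return frames

def spectrogram(x, frame_len, hop):
    """Squared magnitude |X(k, n)|^2 of the STFT."""
    return [[abs(c) ** 2 for c in frame] for frame in stft(x, frame_len, hop)]

# 64-sample sine whose frequency falls exactly on bin 8 of a 32-point frame:
x = [math.sin(2 * math.pi * 8 * i / 32) for i in range(64)]
S = spectrogram(x, frame_len=32, hop=16)
peak_bin = max(range(17), key=lambda k: S[0][k])  # search non-negative bins only
print(peak_bin)  # strongest energy falls in bin 8
```

A practical implementation would use the FFT instead of the direct DFT; the direct form is kept here because it mirrors the defining equation term by term.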

A.2. Wavelet Transform (WT) and Synchrosqueezed Wavelet Transform (SWT)
WT is a signal processing tool that is applied to time- and frequency-domain analysis to obtain the optimal trade-off between time resolution and frequency resolution. By using a wavelet basis function, WT decomposes the original signal into several components at different frequency bands. For original data x(t), the continuous wavelet transform (CWT) is defined as follows:

W_x(u, s) = ∫ x(t) (1/√s) ψ*((t − u)/s) dt,

where the asterisk in ψ* denotes the complex conjugate and W_x(u, s) denotes the wavelet coefficient. The translation parameter u and the scale parameter s vary continuously, and ψ(t) ∈ L²(ℝ) is the mother wavelet.
SWT reallocates the CWT coefficients to obtain a sharper representation in both the frequency and time domains. The following steps are involved in SWT:
(i) For time u and scale s, the CWT W_x(u, s) is computed in order to retrieve the amplitudes at the temporary frequencies.
(ii) An instantaneous frequency ω(u, s) for the data x(t) is calculated from the derivative of the coefficients W_x(u, s) at any point (u, s):

ω(u, s) = −i [W_x(u, s)]⁻¹ ∂W_x(u, s)/∂u.

(iii) The information from the time-scale plane is converted into the time-frequency plane, in which every value of W_x(u, s) is reassigned to (u, ω_l), where ω_l denotes the frequency bin closest to the instantaneous frequency ω(u, s). The synchrosqueezed transform T(u, ω_l) is written as follows:

T(u, ω_l) = (Δω)⁻¹ Σ_{s_k : |ω(u, s_k) − ω_l| ≤ Δω/2} W_x(u, s_k) s_k^(−3/2) Δs_k,

where Δω = ω_l − ω_{l−1} is the width of every frequency bin and, equivalently, Δs_k = s_k − s_{k−1}. Detailed information regarding CWT and SWT can be found in [32].
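Step (ii) above, the phase-derivative estimate of instantaneous frequency, can be illustrated numerically. The sketch below (an assumption-laden toy, not the SWT algorithm itself) applies the formula ω = Im[(∂z/∂t)/z]/(2π) to a pure analytic tone via a centred finite difference; the 5 Hz frequency and 1 kHz sampling rate are arbitrary example values.

```python
import cmath
import math

def inst_freq(z, t_index, dt):
    """Estimate the instantaneous frequency (in Hz) of complex samples z
    from the phase derivative: f = Im[(dz/dt) / z] / (2*pi)."""
    dz = (z[t_index + 1] - z[t_index - 1]) / (2 * dt)  # centred difference
    return (dz / z[t_index]).imag / (2 * math.pi)

f, dt = 5.0, 1e-3  # 5 Hz analytic tone sampled at 1 kHz (example values)
z = [cmath.exp(2j * math.pi * f * n * dt) for n in range(100)]
print(inst_freq(z, 50, dt))  # close to 5.0, up to finite-difference error
```

In SWT proper, the same phase-derivative idea is applied to the CWT coefficients W_x(u, s) rather than to the raw signal.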

A.3. Fast S-Transform (FST)
FST is an algorithm introduced by Brown and Frayne [33] for computing the classic S-transform with significantly reduced computational requirements. FST uses double frequency sampling to narrow the window and reduce the amount of data, thereby decreasing the number of instances that need to be assessed. The detailed formulation of the FST algorithm is given in Brown and Frayne [33].

A.4. EWT
The EWT is a new approach for building adaptive wavelets, developed by Gilles [34] as an adaptive data analysis technique to extract the various modes of a time-domain signal. By defining a series of wavelet filters adapted to the processed signal, all the modes can be extracted. From the Fourier viewpoint, the transform builds a number of band-pass filters and constructs the supports of the filters based on the location of the information in the signal spectrum. In other words, EWT decomposes the input signal x(t) into narrow subbands in the time-frequency domain based on the frequency content of the signal, in contrast to DWT, in which the subbands are based on the sampling frequency of the input signal. The following steps are involved in EWT: (1) Fourier transform and segmentation: first, the local maxima of the Fourier spectrum x̂(ω) of the input signal are found. Then, the spectrum is segmented and, for each segment, the boundaries ω_n are assigned as the midpoints between two successive maxima. (2) Filter construction: a group of wavelets called empirical wavelets is constructed as band-pass filters on the segmented Fourier spectrum. (3) Empirical transform: with the wavelet filters constructed, EWT is applied similarly to the conventional WT to decompose the signal into narrow-band signals as

w^ε_f(n, t) = ⟨f, ψ_n⟩,   w^ε_f(0, t) = ⟨f, φ_1⟩,

where the detail w^ε_f(n, t) and approximation w^ε_f(0, t) coefficients are calculated by the inner product of the signal with the empirical wavelets ψ_n and the scaling function φ_1, respectively. The reconstructed signal f(t) is obtained by the following equation:

f(t) = w^ε_f(0, t) ⋆ φ_1(t) + Σ_{n=1}^{N} w^ε_f(n, t) ⋆ ψ_n(t).

Detailed information on this method is given in Gilles [34].
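Step (1) of the procedure above, the spectrum segmentation, is easy to sketch in isolation. The toy below (illustrative only; the spectrum values are invented) finds the local maxima of a magnitude spectrum and places the segment boundaries ω_n midway between successive maxima.

```python
def ewt_boundaries(spectrum):
    """Step (1) of EWT: locate local maxima of a magnitude spectrum and
    place segment boundaries midway between successive maxima."""
    peaks = [i for i in range(1, len(spectrum) - 1)
             if spectrum[i] > spectrum[i - 1] and spectrum[i] > spectrum[i + 1]]
    return [(peaks[i] + peaks[i + 1]) / 2 for i in range(len(peaks) - 1)]

# Toy magnitude spectrum with modes centred at bins 2, 6, and 11:
mag = [0, 1, 5, 1, 0, 2, 7, 2, 0, 0, 3, 9, 3, 0]
print(ewt_boundaries(mag))  # boundaries midway between the peaks: [4.0, 8.5]
```

A full EWT would then build band-pass filters supported on the resulting segments before applying step (3).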

B.1. K-Nearest Neighbor (KNN)
The KNN algorithm is a nonparametric classification method that, despite its simplicity, performs very well in problems with unknown and nonnormal distributions. For a particular sample, the algorithm finds the k nearest points to that sample among the data in the feature space. To define the neighbors based on distance, the KNN classifier needs a positive integer K and a metric d. Generally, the Euclidean distance is used as the most common metric to calculate the distance between the training samples and the query data. The K samples with the minimum distances are then selected, and the result is the class that appears most frequently among these K neighbors. The Euclidean distance between two points u and v is the length of the line segment between them.
If u and v are two points in n-dimensional Euclidean space, the distance from u to v is defined as follows [35]:

d(u, v) = √( Σ_{i=1}^{n} (u_i − v_i)² ).
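The classification rule above can be sketched in a few lines. The following toy example (feature values and labels are invented, loosely echoing an intact/damaged two-class problem) computes the Euclidean distance and assigns the majority label among the K nearest training samples.

```python
import math
from collections import Counter

def euclidean(u, v):
    """d(u, v) = sqrt(sum_i (u_i - v_i)^2)"""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn_classify(train, query, k):
    """train: list of (feature_vector, label) pairs.
    Assigns the majority label among the k nearest training samples."""
    nearest = sorted(train, key=lambda s: euclidean(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0.1, 0.2), "intact"), ((0.2, 0.1), "intact"),
         ((0.9, 0.8), "damaged"), ((0.8, 0.9), "damaged"),
         ((0.85, 0.85), "damaged")]
print(knn_classify(train, (0.8, 0.8), k=3))  # -> "damaged"
```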

B.2. MLP Artificial Neural Network
An ANN is a computational model and a supervised learning algorithm based on a set of connected nodes or units called artificial neurons. The multilayer perceptron network trained with the backpropagation algorithm is a very popular ANN architecture and is applicable in several domains, including a number of structural engineering applications. By connecting perceptrons, a neural network structure called the multilayer perceptron (MLP) can be designed. A typical multilayer perceptron network is constructed with layers of neurons and consists of at least three layers of nodes: (1) an input layer, (2) hidden layers, and (3) an output layer, as shown in Figure 10. Each neuron in a layer receives a weighted sum of its inputs (x) that is passed through an activation function (f) providing the output (O), described mathematically as follows [36]:

O = f( Σ_i W_i x_i + b ),

where, in the activation function f, the weights W and the bias b control the steepness and the delay, respectively.
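The per-neuron equation above, stacked into layers, gives the MLP forward pass. The sketch below is a minimal illustration with a sigmoid activation and hand-picked weights (all values are invented; a real network would learn them via backpropagation).

```python
import math

def neuron(x, w, b):
    """O = f(sum_i w_i * x_i + b) with a sigmoid activation f."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, hidden_params, output_params):
    """One hidden layer of neurons feeding a single output neuron."""
    hidden = [neuron(x, w, b) for w, b in hidden_params]
    w_out, b_out = output_params
    return neuron(hidden, w_out, b_out)

# Tiny illustrative network: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_params = [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.25)]
output_params = ([2.0, -1.0], 0.1)
print(mlp_forward([0.3, 0.7], hidden_params, output_params))  # value in (0, 1)
```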

B.3. Support Vector Machine (SVM)
The SVM is a relatively new multivariate statistical approach and a supervised learning model that was initially developed for classification, pattern recognition, and regression tasks. It can classify one or more classes efficiently using small datasets. The main idea of SVM is to locate the optimal separating plane (boundary) between the classes. It searches for the boundary by maximizing the margin (the distance between the boundary and the nearest point of each class) computed from the training data. For nonlinear data that are not linearly separable, the data are transferred to a higher-dimensional space (the kernel space) by a transformation function. The nearest data points, which determine the margin, are called support vectors; an increasing number of support vectors may increase the complexity of the problem.
For the given training data T = {(x_i, y_i)}_{i=1}^{m}, x_i is the input feature vector, m is the number of data points, and y_i ∈ {−1, +1} is the label; y_i = −1 belongs to one class and y_i = +1 to the other class. For a linear kernel, the decision function is

f(x) = wᵀx + b,

where w denotes the weight vector and b denotes a scalar bias.
Maximizing the margin can be achieved by minimizing ‖w‖. The optimal hyperplane with the largest margin that separates the data can be defined as the solution to the following constrained quadratic optimization problem [30]:

min_{w,b} (1/2)‖w‖²   subject to   y_i(wᵀx_i + b) ≥ 1,   i = 1, …, m.

If the input data are not linearly separable, SVM transforms the data into higher dimensions by using kernel functions. The four generally used kernel functions are given in Table 10.
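The linear decision function and the margin constraint above can be checked numerically. The sketch below does not solve the quadratic program; it only evaluates f(x) = w·x + b and verifies the constraint y_i(w·x_i + b) ≥ 1 for a hand-picked separating hyperplane on invented two-class toy data.

```python
def decision(w, b, x):
    """Linear SVM decision value f(x) = w . x + b; its sign gives the class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def satisfies_margin(w, b, samples):
    """Check the hard-margin constraint y_i * (w . x_i + b) >= 1 for all samples."""
    return all(y * decision(w, b, x) >= 1 for x, y in samples)

# Separable toy data and a hand-picked hyperplane x1 + x2 = 1:
samples = [((0.0, 0.0), -1), ((-0.2, 0.1), -1),
           ((1.5, 1.5), +1), ((2.0, 1.0), +1)]
w, b = (1.0, 1.0), -1.0
print([1 if decision(w, b, x) > 0 else -1 for x, _ in samples])  # [-1, -1, 1, 1]
print(satisfies_margin(w, b, samples))  # True: all points lie outside the margin
```

A real SVM solver would instead find the (w, b) minimizing ‖w‖² subject to this same constraint, and the samples with y_i f(x_i) = 1 would be the support vectors.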

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.