A Damage Classification Approach for Structural Health Monitoring Using Machine Learning

1 Departamento de Ingeniería Eléctrica y Electrónica, Universidad Nacional de Colombia, Cra 45 No. 26-85, Bogotá, Colombia
2 MAN Energy Solutions SE, Test & Validation - R&D Engineering Four-Stroke (EEEFTTM), Stadtbachstr. 1, 86153, Augsburg, Germany
3 Control, Modeling, Identification, and Applications (CoDAlab), Departament de Matemàtiques, Escola d'Enginyeria de Barcelona Est (EEBE), Universitat Politècnica de Catalunya (UPC), Campus Diagonal-Besòs (CDB), Eduard Maristany, 16, Barcelona 08019, Spain
4 MEM (Modelling, Electronics, and Monitoring Research Group), Faculty of Electronics Engineering, Universidad Santo Tomás, Cra. 9 Nos. 51-11, Bogotá, Colombia
5 Faculty of Electronics Engineering, Universidad Sergio Arboleda, Calle 74 #14-14, Bogotá, Colombia


Introduction
Data-driven algorithms have demonstrated their utility in structural health monitoring (SHM) applications. In fact, this kind of approach is a useful tool for real-time condition monitoring (CM). However, one of the challenges in the use of data-driven algorithms is the size and quantity of the information, which is often obtained from sensor networks or multiple sensors and represents a great deal of data to process and analyse. In this sense, it is necessary to develop better methodologies that avoid false alarms in the damage identification process.
An SHM system typically includes five steps in its design: (i) sensor network design, (ii) data acquisition, (iii) feature extraction, (iv) diagnosis, and (v) prognosis. The first four stages normally involve methods for data-sensor fusion, multivariate statistical modelling, and pattern recognition algorithms. For the last stage, a physics-based model is almost inevitable so that reliable predictions can be performed. Structural health monitoring systems have been advancing worldwide, as shown by the number of relevant scientific papers available and by recent practical applications [1][2][3]. Applications of data-driven algorithms for SHM can be found in bridges [4][5][6], aeronautics [7,8], aerospace [9,10], and wind turbines [11][12][13], among others.
As a contribution to the development of new ways to process and evaluate the condition of a structure using data from sensors, a methodology for damage detection and classification is presented in this paper. This work is also motivated by the need to further develop, integrate, and evaluate damage identification algorithms [7,14,15]. The proposed methodology is based on an acousto-ultrasonic approach in which ultrasonic waves are generated in a piezoelectric transducer sensor network in several actuation phases. The captured signals are preprocessed by means of the discrete wavelet transform (DWT) for feature extraction and then integrated into a nonlinear multivariate model, where nonlinear components are generated in order to form feature vectors for all the actuation phases and to train a machine from a machine learning point of view. Afterward, measurements are captured with the sensor network from the structure in an unknown state, and the interaction with the trained machine allows defining the current structural state according to the states defined in the training step. To validate the proposed methodology, experiments are carried out on a composite sandwich structure in which increasing damage is intentionally introduced and on a composite plate with simulated damages.
The remaining part of this paper is organized as follows. For completeness, the article first presents a brief summary of the basic theoretical background for the different signal processing algorithms evaluated. Afterward, the methodology is introduced in Section 3. Section 4 is devoted to the experimental validation, where the experimental setup and results are included. Finally, conclusions are given in Section 5.

Theoretical Background
This section briefly introduces some well-known methods that are used in the developed methodology. Readers are referred to the references in each subsection if more information about a method is required.

Discrete Wavelet Transform.
The discrete wavelet transform (DWT) is a very useful tool used in an increasingly broad range of fields, such as image processing, health care, energy distribution, and SHM, among others. It can be defined as a filter bank structure that distinguishes features through the use of low-pass and high-pass filters [16,17]. This configuration allows representing the variability of a given function by means of coefficients at a specified time and scale. These coefficients are calculated by using quadrature mirror filters and are decomposed into approximation coefficients (A1, A2, ...) and detail coefficients (D1, D2, ...) [7], as shown in Figure 1.
Detail coefficients are the low-scale, high-frequency components, while approximation coefficients represent the high-scale, low-frequency components. The wavelet technique has been of great interest in recent years and has direct application in SHM, as demonstrated by several research works [18][19][20][21][22]. For further details about the DWT and its implementation, please refer to [23].
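The filter bank decomposition described above can be illustrated with a minimal sketch. It uses the Haar wavelet, chosen here only because its low-pass and high-pass filters fit in one line each; it is not the db8 wavelet used in the experiments of this paper. The function names and the sample signal are hypothetical.

```python
import math

def haar_dwt_step(signal):
    """One analysis step: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Low-pass branch (scaled pairwise sums) gives the approximation.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # High-pass branch (scaled pairwise differences) gives the detail.
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_wavedec(signal, level):
    """Multilevel decomposition, returned as [A_level, D_level, ..., D_1]."""
    approx = list(signal)
    details = []
    for _ in range(level):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Hypothetical 8-sample signal, decomposed to level 2.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a2, d2, d1 = haar_wavedec(signal, level=2)
```

Each level halves the length of the approximation, which is why keeping only the approximation coefficients at a given level yields a reduced version of the original signal.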

Hierarchical Nonlinear Principal Component Analysis.
Hierarchical nonlinear principal component analysis, also known as h-NLPCA, is defined as a nonlinear generalization of traditional PCA [24]. It is a method based on a multilayer perceptron (MLP) architecture with an auto-associative topology. The auto-associative network is trained to perform the identity mapping between its inputs and outputs by minimizing the squared error [24]. This architecture, shown in Figure 2, includes a bottleneck layer that compresses the data and reduces the dimension of the original data. Note that the nodes in the mapping and demapping layers must have nonlinear transfer functions, whereas nonlinear transfer functions are not necessary for the bottleneck layer [25]. To guarantee that the calculated nonlinear components have the same hierarchical order as the linear components in standard principal component analysis (PCA), and in contrast to standard NLPCA, the reconstruction error is controlled by searching for a k-dimensional subspace of minimum mean square error (MSE) under the constraint that the (k-1)-dimensional subspace is also of minimal MSE [26].
This process is repeated for any k-dimensional subspace, so that all subspaces are of minimal MSE. h-NLPCA describes the data with greater accuracy and/or with fewer factors than PCA, provided that there are sufficient data to support the formulation of more complex mapping functions [27,28].
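The nested minimal-MSE constraint can be written compactly as a single hierarchical error. The following is a sketch of the usual formulation in the h-NLPCA literature; the symbols (N samples of dimension d, reconstruction obtained with only the first k bottleneck units active) are notation assumed here, not taken from this paper.

```latex
% Reconstruction error when only the first k bottleneck components are used
E_{1,\dots,k} = \frac{1}{dN} \sum_{n=1}^{N} \sum_{i=1}^{d}
    \left( x_i^{n} - \hat{x}_i^{n}(1,\dots,k) \right)^{2}

% Hierarchical error minimized during training
E_H = E_{1} + E_{1,2} + \dots + E_{1,2,\dots,k}
```

Minimizing the sum, rather than only the last term, is what forces the first component alone, the first two together, and so on, to each give a small reconstruction error.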

Machine Learning.
In recent years, machine learning (ML) has been the focus of many researchers in the area of structural health monitoring (SHM) because of its effectiveness and continuous development [29][30][31][32]. Machine learning is a set of algorithms that can automatically extract the hidden patterns in a large group of data [33,34]. There are two different approaches in ML according to the training process:
(i) Supervised, where the machine gets the inputs and the expected outputs. The machine is trained to find the complex patterns and relationships between them and to obtain generalized responses based on this training with the right answers [35,36].
(ii) Unsupervised, where the machine is trained to find the similarities in the data and to provide a clustering organization that indicates their proximity [37,38].

[Figure 2 (layer labels): input layer, mapping layer, bottleneck layer, de-mapping layer, output layer.]
In this work, supervised training is explored; accordingly, some of the supervised machines used in the methodology are explained next. On the one hand, k nearest neighbours (kNN) is a machine learning algorithm with a very simple strategy: elements are classified according to their distance to other elements and to how often each class appears among the closest ones. It is important to avoid overfitting, since an overfitted machine is only valid for the current group of data. To prevent this, it is important to keep a moderate number of characteristics and training examples. On the other hand, decision trees are a predictive model used in, for instance, data mining and statistics. This mechanism maps the observations into a structure that allows us to draw conclusions about them. This structure also allows us to extrapolate these conclusions and to predict new behaviours from new observations. To extract the structure that describes the information, the data are analysed to find the critical values that produce the best division of them. This division is performed after locating the choice nodes and the change nodes in the decision structure, with the aim of obtaining better decision branches and a better behaviour in the prediction.
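The kNN strategy described above can be sketched in a few lines. This is an illustrative implementation, not the one used by the authors; the function name, the two-dimensional feature vectors, and the labels are hypothetical. It assumes that a "fine" kNN corresponds to a single neighbour and a "weighted" kNN to squared-inverse distance voting, which is how these preset names are commonly understood.

```python
import math
from collections import defaultdict

def knn_predict(train_x, train_y, query, k=1, weighted=False):
    """Classify `query` by voting among its k nearest training vectors."""
    # Sort training samples by Euclidean distance to the query.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = defaultdict(float)
    for dist, label in dists[:k]:
        # Weighted mode: closer neighbours count more (squared-inverse distance).
        votes[label] += 1.0 / (dist ** 2 + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)

# Hypothetical training set: two features per sample.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["healthy", "healthy", "damage 1", "damage 1", "damage 1"]
query = (0.2, 0.1)

fine = knn_predict(train_x, train_y, query, k=1)
majority = knn_predict(train_x, train_y, query, k=5)
weighted = knn_predict(train_x, train_y, query, k=5, weighted=True)
```

With five neighbours and plain voting, the larger class wins even though the query lies next to the healthy samples; the single-neighbour and distance-weighted variants both return "healthy", which illustrates why the number of neighbours and the weighting affect the response.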
To help the machine reach its goal, it is very common to simplify the input data through some techniques [2,39]. In this work, only the supervised type is explored, and results are presented by means of confusion matrices. These matrices are a very useful tool to evaluate a classification by counting the following outcomes: true positives, false negatives, false positives, and true negatives.
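As a minimal sketch of how such a matrix is built, the following example counts true and predicted labels for a two-class case. The function names and the six labelled samples are hypothetical and serve only to show the bookkeeping.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the main diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

classes = ["healthy", "damaged"]
true_labels = ["healthy", "healthy", "healthy", "damaged", "damaged", "damaged"]
predicted = ["healthy", "healthy", "damaged", "damaged", "damaged", "healthy"]

cm = confusion_matrix(true_labels, predicted, classes)
# Taking "damaged" as the positive class:
tn, fp = cm[0]  # healthy predicted healthy / healthy predicted damaged (false alarm)
fn, tp = cm[1]  # damaged predicted healthy (missed damage) / damaged predicted damaged
```

In a multi-class setting such as the one in this paper, the same matrix has one row and one column per structural state, and the off-diagonal cells show which states are confused with each other.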

Damage Classification Methodology
In this work, piezoelectric transducers (PZTs) were used because these devices are cheap, lightweight, and easy to install, among other good characteristics [40,41]. Figure 3 shows a representation of the applied methodology. It can be divided into two parts, training and testing; in both cases the strategy uses data from the structure collected by a piezoelectric sensor network in several actuation phases. This network is built with several piezoelectric transducers that are permanently attached to the structure under test and distributed over its surface, as in Figures 9 and 13. Because these transducers can work as actuators or as sensors, each actuation phase is defined by one PZT working as an actuator and the rest of the PZTs working as sensors [42]. This means that an excitation signal is applied to one piezoelectric transducer, and the signals propagated through the structure are collected by the rest of the sensors, organized, and preprocessed. This process is repeated for each transducer in the structure [43,44].
Each signal captured by the acquisition system is preprocessed by the discrete wavelet transform at a defined decomposition level; as a result, a reduced signal is obtained and organized by actuation phase as in [43]. These steps are the same for training and testing. Once the data are preprocessed and organized, during the training step h-NLPCA is applied to the data of each actuation phase, and a determined number of nonlinear components is obtained and used for training the machines. In particular, for the cases explored in this paper, the first three scores (S1, S2, and S3) of each actuation phase were used to define the feature vector for training the machines, as shown in Figure 4. This figure is an example in which only four sensors are used, as in the case of specimen 1. As a result of this step, a machine trained with the information of the structural states is available for the testing step.
Testing is performed by using data from the structure in an unknown structural state and projecting this information onto the nonlinear components. The scores that result from this projection are used as input to the trained machine to predict the structural state. This procedure allows us to detect and classify the structural state.

Experimental Validation
To validate the methodology, data from two structures are considered: a carbon fiber-reinforced plastic (CFRP) sandwich structure with several damages on the multilayered composite, and a CFRP plate with an added mass to simulate damages. Several experiments were collected for each structural state (including the undamaged state) to train the machines and to test the behaviour of the prediction, as will be explained in the following subsections. In addition, there is a detailed description of the measurement procedure, the structures, and the results obtained with the developed methodology.

Table 1:
Damage number | Description
1 | Delamination: started symmetrically from the right side of the sample at its middle position along the y-axis. Its width along the y-axis is 16 mm and its depth along the x-axis is 10 mm

[Figure labels: signal collected, 60,000 samples per channel; 60,000 x 3 sensors = 180,000.]

Measurement Procedure.
As introduced in the previous section, the interaction with the structure is performed through the signals applied to and collected from the piezoelectric sensor network. For the structures evaluated in this paper, PIC-151 piezoelectric sensors were used. The inspection is performed in four phases for specimen 1 and nine phases for specimen 2, according to the number of piezoelectric sensors installed in each structure.
During the first actuation phase (phase I), the first piezoelectric transducer was excited with a Hamming-windowed cosine signal with a peak value of 12 V and a frequency determined for each structure, and the information about the interaction of the propagated waves with the structure was collected at different places of the structure by the rest of the sensors. Figure 5 describes an example of this actuation phase. The second actuation phase (phase II) uses the second piezoelectric transducer as the actuator and the rest as sensors, and so on. This process ends when all piezoelectric transducers have been used as actuators. All this information is stored for subsequent processing in several matrices and files, one per actuation phase. The experiments consider different structural states (healthy structure and structure with damage in different positions), as will be explained in the following subsections. The number of samples from each sensor is 60,000. This means that the number of columns in the matrix of each actuation phase is (n-1) sensors x 60,000 samples, where n is the number of transducers. Figure 6 shows this organization, the corresponding preprocessing, and the procedure to extract the h-NLPCA scores for the case of a structure with four piezoelectric sensors. In this case, the first 30 scores are retained during the model construction with h-NLPCA.
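The round-robin actuation scheme and the resulting matrix size can be sketched as follows. The function names are hypothetical, and the row layout (one experiment per row) is an assumption made for illustration; only the column count, (n-1) sensors x 60,000 samples, is stated in the text.

```python
def actuation_phases(num_pzt):
    """Map each actuator index to the PZTs that act as sensors in that phase."""
    return {
        actuator: [s for s in range(1, num_pzt + 1) if s != actuator]
        for actuator in range(1, num_pzt + 1)
    }

def phase_matrix_shape(num_pzt, num_experiments, samples_per_sensor):
    """Shape of one per-phase matrix: experiments x concatenated sensor signals."""
    return num_experiments, (num_pzt - 1) * samples_per_sensor

# Specimen 1: four PZTs, 60,000 samples per sensor.
phases = actuation_phases(4)
shape = phase_matrix_shape(num_pzt=4, num_experiments=150, samples_per_sensor=60_000)
```

For specimen 1 this gives four phases with three sensing PZTs each and 180,000 raw columns per phase, before the DWT reduces each signal.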
After the scores are obtained for each actuation phase, the feature vector for training is defined. Figure 7 shows the assembled training vector. In this case, training is performed with a vector of twelve elements (three scores from each actuation phase); for instance, in the case of specimen 1 with 4 sensors, 4 actuation phases were considered, as in Figure 7. The figure also shows the case where 150 experiments were acquired for each structural state. With respect to normalization, group scaling was applied to the matrix of each actuation phase [45].
These steps are repeated in the same way for the data in the training and testing processes. Details about the particular experiments with each evaluated specimen are presented next.
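The assembly of the feature vector from the per-phase scores can be sketched as a simple concatenation. The function name and the score values below are hypothetical; only the structure (the first three scores of each of the four actuation phases, giving twelve elements) follows the text.

```python
def build_feature_vector(scores_by_phase, scores_per_phase=3):
    """Concatenate the first `scores_per_phase` scores of every actuation phase."""
    vector = []
    for phase in sorted(scores_by_phase):
        scores = scores_by_phase[phase]
        if len(scores) < scores_per_phase:
            raise ValueError(f"phase {phase} has too few scores")
        vector.extend(scores[:scores_per_phase])
    return vector

# Hypothetical scores of one experiment for the four actuation phases.
scores_by_phase = {
    1: [0.12, -0.40, 0.05, 0.01],
    2: [0.30, 0.22, -0.11, 0.02],
    3: [-0.25, 0.08, 0.17, -0.03],
    4: [0.02, -0.19, 0.33, 0.04],
}
feature_vector = build_feature_vector(scores_by_phase)
```

One such vector is produced per experiment, so the training set for one structural state is a matrix with as many rows as experiments and twelve columns.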

Specimen 1: CFRP Sandwich Structure.
The first structure corresponds to a CFRP sandwich structure (Figure 8), where damages are intentionally produced to simulate different damage mechanisms, i.e., delamination and cracking of the skin. These damage mechanisms are summarized in Table 1. The overall size of this structure is 217 mm x 217 mm x 31 mm, and the skin is made of carbon/epoxy material with a 0.5 mm thickness. The stacking sequence is [0°/90°] (Figure 9).
The core is made of polyetherimide foam with a 30 mm thickness. Four PIC-151 piezoelectric transducers from PI Ceramics are attached to the surface of the structure with equidistant spacing. Figure 9 shows a photo of the experiment.
The excitation was a Hamming-windowed cosine signal with five cycles, a frequency of 50 kHz, and a peak voltage of 12 V. Seven structural states were studied (the healthy state and six damages), as previously explained. For each structural state, 150 experiments were performed, of which 100 were used for training and 50 for testing. Data from each experiment were preprocessed by means of the DWT. The Daubechies family (db8) was chosen to obtain the approximation coefficients [46,47], since previous works demonstrated that this family retains the most relevant information for this kind of application. The coefficients are used to build the hierarchical nonlinear PCA model for each actuation phase. The architecture of the h-NLPCA is a five-layer nonlinear autoencoder network with 3-4-2-4-3 components as in [48]. As a result, three components per actuation phase are used to build the feature vector that serves as the input for training the different machines. For the training part, the MATLAB Classification Learner app was used. Subsequently, testing is performed by using data from the structure in an unknown structural state and projecting this information onto the nonlinear components. The projected information, called scores, is used as the input to the trained machine to predict the structural state. This procedure allows detecting and classifying the structural state.
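The excitation described above (five cycles of a 50 kHz cosine, shaped by a Hamming window, with a 12 V peak) can be generated with a short sketch. The function name and the 5 MHz sampling rate are assumptions made for illustration; the sampling rate of the actual acquisition system is not stated here.

```python
import math

def hamming_windowed_cosine(freq_hz, cycles, peak_volts, sample_rate_hz):
    """Tone burst: `cycles` periods of a cosine carrier under a Hamming window."""
    n = round(cycles * sample_rate_hz / freq_hz)
    burst = []
    for i in range(n):
        # Standard Hamming window: 0.54 - 0.46 cos(2*pi*i / (n - 1)).
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1))
        carrier = math.cos(2.0 * math.pi * freq_hz * i / sample_rate_hz)
        burst.append(peak_volts * window * carrier)
    return burst

# 50 kHz carrier, 5 cycles, 12 V peak; 5 MHz sampling rate is hypothetical.
burst = hamming_windowed_cosine(50e3, 5, 12.0, 5e6)
duration_s = len(burst) / 5e6
```

The window tapers the burst smoothly at both ends, which concentrates the excitation energy around the carrier frequency.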
Several machines were trained to determine the elements of the feature vector, i.e., the influence and number of scores to use per actuation phase and the number of experiments needed to train a machine adequately. Table 2 shows the results of the prediction process when two scores per actuation phase and fifty experiments are used in the training step. During the prediction, one hundred experiments per damage are used. Twenty supervised learning machines were trained using MATLAB's Classification Learner toolbox.
As can be observed, not all structural states are properly predicted by all the trained machines, which means that a low number of scores affects the classification process. Table 3 shows the results when the number of scores per actuation phase is increased to five. Prediction improves in most of the machines; however, it is necessary to determine an adequate number of scores, because too many scores could produce machine overfitting and poor learning. That is, mistakes are added to other predictions and the uncertainty grows.
Consistent with previous research, fine kNN and weighted kNN showed better results in the classification. However, when the number of scores is increased, other machines such as bagged trees and subspace kNN significantly improved their performance. Following this analysis, three scores per actuation phase were selected, because they produce results similar to those obtained with a greater number of scores. Some considerations about the algorithms can be summarized as follows. The k nearest neighbour (kNN) classifier is an algorithm recommended for low-dimensional data. In this kind of machine, the number of neighbours has an effect on the response, so that, in general, the use of a reduced number of neighbours improves the outcome. Decision trees (DT) are a different kind of machine. A DT is a classification mechanism that allows us to construct a predictive model in which the number of splits can increase or decrease the flexibility of the algorithm, as can the use of several trees (ensemble). Another kind of machine explored in this paper is the RUS (random undersampling) algorithm in RUSBoost, which is a mechanism to eliminate imbalances in the data distribution.
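The random undersampling idea behind RUSBoost can be sketched as follows. This shows only the resampling step, under the assumption that every class is reduced to the size of the rarest one; in the boosting algorithm this step is combined with the iterative training of weak learners, which is not shown. The function name and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_class):
        # Draw `target` samples without replacement from each class.
        for sample in rng.sample(by_class[label], target):
            balanced.append((sample, label))
    return balanced

# Hypothetical imbalanced set: six healthy samples, two damaged ones.
samples = list(range(8))
labels = ["healthy"] * 6 + ["damage 1"] * 2
balanced = random_undersample(samples, labels)
counts = {lab: sum(1 for _, l in balanced if l == lab) for lab in set(labels)}
```

When the classes are already balanced, as when the same number of experiments is recorded for every structural state, undersampling leaves the class sizes unchanged.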
Figures 10 and 11 show the results of the damage classification process for fine kNN, weighted kNN, simple tree, and RUSBoosted trees. Detailed information about the definition of these machines can be found in [7,44,49]. As can be observed in Figure 10, both fine kNN and weighted kNN presented some of the best results, since the classification was properly performed in most of the experiments, which confirms their good behaviour as statistical classifiers [50]. For instance, with the fine kNN classifier, 348 out of 350 cases were correctly classified, which represents 99.4% of correct decisions. It is worth noting that the specimen with damage is never confused with the healthy state of the structure, so there are no missed faults. The only misclassification between damages occurs with a sample corresponding to damage 2 that is classified as damage 3. Similar results are obtained when the weighted kNN is considered as the classifier. In this case, 348 out of 350 cases were also correctly classified, which again represents 99.4% of correct decisions. However, in this case, all the damages are perfectly classified. The number of false alarms is quite low in both cases: 1 out of 50 (2%) and 2 out of 50 (4%) for fine kNN and weighted kNN, respectively. The worst classification results are obtained when RUSBoosted trees and simple tree machines are used. These results are summarized through the corresponding confusion matrices in Figure 11. The overall accuracy is 26.3% and 88.9% for RUSBoosted trees and simple tree machines, respectively. The classification is especially unacceptable in the case of RUSBoosted trees, where damages 2 to 6 are all misclassified.
The second structure (specimen 2, a CFRP plate) was instrumented with nine PIC-151 piezoelectric transducers from PI Ceramics, which are attached to the surface of the structure as shown in Figure 12. Damage on the tested composite was simulated by placing masses at different positions, as described in Table 4.
The excitation signal is a 12 V Hamming-windowed cosine train with 5 cycles. For each structural state, 150 experiments were performed, and the signals from the sensors were recorded for each sensor-actuator configuration.
To determine the carrier central frequency of the actuation signal for each structure, a frequency sweep was performed and the spectrum of each signal was analysed in order to find the optimal excitation frequency (for structure and sensors), at which the obtained signals have a signal-to-noise ratio that facilitates the data analysis. The carrier frequency for this specimen was found to be 30 kHz. A photo of this second specimen can be found in Figure 13.
As with the previous specimen, several machines were trained and three scores were used per actuation phase. For this second experiment, the results of fine kNN and weighted kNN are even better (Figure 14). More precisely, with the fine kNN classifier, 349 out of 350 cases were correctly classified, which represents 99.7% of correct decisions. It is worth noting that the specimen with damage is never confused with the healthy state of the structure, so there are no missed faults. The only misclassification between damages occurs with a sample corresponding to damage 1 that is classified as damage 2. A perfect classification is obtained when the weighted kNN is considered as the classifier: 350 out of 350 cases were correctly classified, which represents 100% of correct decisions. For this second specimen, false alarms are no longer present. The worst classification results are obtained when RUSBoosted trees and simple tree machines are used. These results are summarized through the corresponding confusion matrices in Figure 15. The overall accuracy is 28.3% and 70.9% for RUSBoosted trees and simple tree machines, respectively. The classification is especially unacceptable in the case of RUSBoosted trees, where damages 2 to 6 are all misclassified. Although the percentage of correct decisions ranges between 28.3% and 70.9%, both machines are able to accurately identify the structure with no damage.
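The percentages reported for the two kNN classifiers follow directly from the counts given in the text, as the following check shows. The counts are taken from the results for both specimens; the variable and function names are hypothetical.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole

# (correctly classified cases, total test cases) as reported in the text.
reported = {
    ("specimen 1", "fine kNN"): (348, 350),
    ("specimen 1", "weighted kNN"): (348, 350),
    ("specimen 2", "fine kNN"): (349, 350),
    ("specimen 2", "weighted kNN"): (350, 350),
}
accuracy_percent = {
    key: round(percent(correct, total), 1)
    for key, (correct, total) in reported.items()
}

# False alarms for specimen 1: healthy test cases classified as damaged, out of 50.
false_alarm_percent = {
    "fine kNN": percent(1, 50),
    "weighted kNN": percent(2, 50),
}
```

The total of 350 test cases corresponds to seven structural states with 50 test experiments each.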

Complexity 11
In general, the behaviour of these four machines with respect to both specimens is coherent with previous results in the literature. For instance, in the works of Vitola et al. [44,49], a distributed sensor network is used to detect and classify structural changes with and without the influence of environmental conditions. Although the way the data are collected and preprocessed in those papers differs significantly from the current work, the performance of both fine kNN and weighted kNN is similar.

Conclusions
In this work, a damage classification methodology has been introduced. The proposed methodology includes the use of a piezoelectric sensor network, the discrete wavelet transform, hierarchical nonlinear PCA, and machine learning approaches. The methodology has been validated with excellent results, showing its capability for damage classification tasks. Although different machines were trained, only the best two and the worst two were included in the paper, showing that the best results were obtained with fine kNN and weighted kNN machines and the worst results with trees. This is due to the way in which the data are organized by the different machines, as was introduced throughout the paper.
In order to work with machine learning algorithms, it is very important to select the training data properly. Otherwise, the results of the trained machine can differ from what is expected of the system. The nonlinear scores proved to be very useful, since these scores significantly reduced the information by facilitating

Table 1 (continued):
Damage number | Description
2 | Extended the previous damage to a width of 33 mm and depth of 42 mm
3 | A crack of 25 mm length initiated at the middle position along the vertical y-axis and in the direction parallel to the x-axis
4 | Extended the previous crack to a length of 30 mm
5 | Extended the previous crack to a length of 45 mm
6 | Extended the previous crack to a length of 70 mm

[Figure labels: sensor as actuator; sensor; Hamming signal, 12 Vpp, 30 kHz.]

Figure 6: Data organization: h-NLPCA scores, before building the training vector.

Table 2: Behavior of machines with two scores per sensor (specimen 1, four sensors).

Table 3: Behavior of machines with five scores per sensor (specimen 1, four sensors).
Specimen 2: CFRP Composite Plate.
The second structure, shown in Figures 12 and 13, corresponds to a CFRP plate made of 4 equal layers with a stacking sequence of [0°/90°/90°/0°]. Its dimensions are 200 mm x 250 mm, with a thickness of 1.7 mm.

Table 4: Damages in the CFRP composite plate.