Locating Impact on Structural Plate Using Principal Component Analysis and Support Vector Machines

A new method which integrates principal component analysis (PCA) and support vector machines (SVM) is presented to predict the location of impact on a clamped aluminum plate structure. When the plate is knocked using an instrumented hammer, the induced time-varying strain signals are collected by four piezoelectric sensors which are mounted on the plate surface. The PCA algorithm is adopted for the dimension reduction of the large original data sets. Afterwards, a new two-layer SVM regression framework is proposed to improve the impact location accuracy. For a comparison study, the conventional backpropagation neural networks (BPNN) approach is implemented as well. Experimental results show that the proposed strategy achieves much better locating accuracy in comparison with the conventional approach.


Introduction
Damage detection is valuable for identifying the location and severity of damage, and thereby for reducing accidents and maintaining the stability of various structures [1,2]. Structural health monitoring is the process of implementing damage detection and characterization for the structures concerned [3]. A variety of approaches to structural health monitoring have been reported in previous works [4,5]. The majority of the investigations have focused on impacted metallic and composite plate structures, which are widely used in engineering applications such as aircraft and automobiles. This research concentrates on impact location detection for a metallic structural plate subject to low-velocity impacts.
In the literature, various soft computing algorithms have been widely employed for signal processing dedicated to impact detection [6]. For instance, Jones et al. estimated impact location and magnitude by applying backpropagation neural networks (BPNN) to experimental data acquired by sensors attached to a fully clamped plate [7]. The authors adopted several methods to improve the detection accuracy, such as removing the DC shift and high-frequency noise and only using impacts within the range of the training set. More recently, it has been demonstrated that the neural-network-based impact location strategy is applicable to more complex structures, such as a real aircraft component with a high degree of material and geometrical complexity [8]. Hence, impact detection investigations on simple structures are of practical significance.
As the damage severity is correlated with the kinetic energy of impact, it has been shown that the in-plane strain waves caused by impact-induced damage in fiber-reinforced composite plates are associated with the type and extent of damage [9]. It is worth noting that, in order to ensure consistent experimental conditions, the impacts are commonly produced at levels which will not cause damage to the structures. This means that the detection methods concern nondamaging impact events.
Impact detection can be posed as a classification or a regression problem. It is well recognized that neural networks (NN) are appropriate for handling complicated classification or regression problems [10]. Neural networks are easy to implement because their realization does not require the establishment of mathematical models. However, they exhibit some challenges in terms of overtraining and generalization, since their optimal parameters are difficult to determine. Compared with neural networks, an alternative approach, support vector machines (SVM), offers better generalization and more stable performance. In the literature, the SVM was employed in [11] to solve the classification problem, and it was demonstrated that a result better than that of a conventional NN can be achieved. Besides, the least squares support vector machines (LS-SVM) have been adopted in engineering applications [12,13]. Considering that the conjugate gradient method cannot solve the KKT system of LS-SVM directly owing to the coefficient, an enhanced LS-SVM with the Hilbert transform was proposed to reinforce the convexity and computational speed [14]. However, the majority of existing SVM-based works are dedicated to classification problems [15]. It has been shown that the regression method is capable of providing better accuracy than the classification approach in solving the impact detection problem [8]. Hence, an SVM regression-based approach is devised in the present research.
Furthermore, most of the regression-based approaches developed so far are based on some transform of the original sensor signals. A single-layer regression model is then established to predict the output once a new input occurs [7,8]. Nonetheless, preliminary experimental investigations reveal that such an intuitive approach is not able to produce satisfactory impact detection accuracy in the current research. To this end, a new method of two-layer SVM regression is proposed in this paper for predicting the low-velocity impact location. The method consists of two general steps: data preprocessing and model establishment for the prediction. In the data preprocessing part, the time-varying strain data are acquired using piezoelectric sensors mounted on the plate surface. Before the training of the SVM for subsequent prediction, principal component analysis (PCA) is executed to extract principal components from the large volume of sensor output signals. As a result, the computational cost is lowered, as the dimensions of the data are greatly reduced with minimal information loss. The two-layer SVM regression model is then implemented, with the model parameters adjusted using a grid search and cross-validation method. The first layer contains four models which handle the training data acquired from each of the four sensors, respectively. After that, the second-layer model is constructed from the outputs of these four models, which carry underlying information about the sensor locations. Preliminary comparison shows that the two-layer SVM regression model leads to better accuracy than a model built without knowledge of the sensor locations. To illustrate the effectiveness of the proposed detection strategy, the conventional BPNN method has been implemented for a comparative study. Results explicitly demonstrate the efficiency of the proposed strategy for impact location detection.
The remainder of the paper is organized as follows. Section 2 presents a brief review of the PCA and SVM algorithms. The experimental setup is described in Section 3 and is then used for the data acquisition conducted in Section 4, where the location detection strategy is outlined in detail. A series of experimental studies are carried out in Section 5, along with a comparative study with respect to the BPNN-based approach. Finally, Section 6 concludes this paper.

Algorithms
In this section, the basic PCA and SVM algorithms are briefly reviewed.
2.1. PCA. Principal component analysis (PCA) is a widely used statistical analysis technique which aims at selecting the most significant patterns or features from a multivariate data set in order to simplify complicated problems. In the area of computer science, this technique is often used for "data preprocessing" or "dimensionality reduction", extracting effective features while retaining most of the information in the original data. More precisely, PCA is an orthogonal transformation to a new coordinate system whose axes are oriented so that the direction of greatest variance, which carries most of the information, is projected onto the first coordinate.
PCA can also be explained using formulas as follows. After PCA receives a data matrix X (in this research, X consists of all of the data in the training and testing sets), a set of principal components can be returned as the eigenvectors of the correlation matrix. The correlation matrix R can be expressed in the following form:

$$R = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{T},$$

where $x_i$ is the $i$th data vector in X and $N$ denotes the number of data in X. Afterwards, the eigenproblem

$$\lambda q = R q$$

is solved, in which $q$ stands for an eigenvector of R and the associated $\lambda$ indicates the eigenvalue of R. There are many eigenvalues for R, which are denoted by $\lambda_i$ for $i = 1$ to $n$. Each eigenvalue $\lambda_i$ is associated with an eigenvector $q_i$. As a result, the transformation function $T = [q_1, q_2, \ldots, q_n]$ can be used to calculate the projection $a$ of a new matrix Y on the principal directions as shown below:

$$a = T^{T} Y,$$

where the principal component $a_i = q_i^{T} Y$ represents the projection of Y onto the $i$th principal direction.
In this research, the original data matrix X is processed using PCA to obtain the transformation function T. After that, T is applied to X again to calculate the projection a in the new coordinate system. As the projection a produced by PCA is arranged in descending order along the principal directions, a user-defined tolerance can be chosen to reduce the dimensions while minimizing the information loss. In this paper, the tolerance is selected as 98% by trial and error.
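The PCA step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the Matlab toolbox routine used in this work: the function names `fit_pca` and `project` are hypothetical, the rows of the input are assumed to be observations, the data are mean-centered before R is formed, and the 98% tolerance is interpreted as the cumulative share of the eigenvalues to retain. The demonstration uses a small synthetic matrix, since a direct eigendecomposition of a 10000-dimensional R would be costly.

```python
import numpy as np


def fit_pca(data, tolerance=0.98):
    """Return (mean, T); the columns of T are the retained principal directions."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean                       # centering is an assumption of this sketch
    R = centered.T @ centered / data.shape[0]    # R = (1/N) * sum_i x_i x_i^T
    eigenvalues, eigenvectors = np.linalg.eigh(R)  # eigh returns ascending eigenvalues
    order = np.argsort(eigenvalues)[::-1]          # reorder to descending
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    eigenvectors = eigenvectors[:, order]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Smallest number of components whose cumulative share reaches the tolerance.
    n_keep = min(int(np.searchsorted(cumulative, tolerance)) + 1, len(eigenvalues))
    return mean, eigenvectors[:, :n_keep]


def project(data, mean, T):
    """Project observations onto the retained principal directions."""
    return (np.asarray(data, dtype=float) - mean) @ T


# Synthetic stand-in for sensor signals: 40 observations of 50 samples each,
# generated from 3 latent factors plus a little noise.
rng = np.random.default_rng(0)
signals = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 50))
signals += 0.01 * rng.normal(size=signals.shape)

mean, T = fit_pca(signals, tolerance=0.98)
components = project(signals, mean, T)
print(components.shape)  # (40, number of retained components)
```

The same `mean` and `T` obtained from the fitted data would be reused to project any new signal, so that training and testing inputs share one coordinate system.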

2.2. SVM.
In view of the difficulties suffered by traditional neural network approaches, such as poor generalization and the risk of overfitting, support vector machines (SVM), which belong to the family of supervised learning algorithms, have been proposed. SVM has become an active research area as it promotes the development of machine learning theories and techniques. SVM can be used to solve both classification and regression problems.
In classification problems, the goal of SVM is to produce a classifier that can reliably separate new data into two classes based on the training and testing data sets. For nonlinear cases, it transforms the data from the nonlinear input space to a higher dimensional linear feature space through a nonlinear mapping and then finds the optimal hyperplane to classify the data.
As for nonlinear regression problems, SVM performs similarly to the nonlinear classification case. A nonlinear mapping is used to map the data into a higher dimensional linear feature space, and a kernel function $K$ is then employed to eliminate the effect of dimensionality. Besides, it adopts an $\varepsilon$-insensitive loss function, $|y - f(x)|_{\varepsilon} = \max\{0, |y - f(x)| - \varepsilon\}$, which leads to the maximization of

$$W(\alpha, \alpha^{*}) = -\varepsilon \sum_{i} (\alpha_i + \alpha_i^{*}) + \sum_{i} y_i (\alpha_i - \alpha_i^{*}) - \frac{1}{2} \sum_{i} \sum_{j} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) K(x_i, x_j)$$

along with the following constraints:

$$\sum_{i} (\alpha_i - \alpha_i^{*}) = 0, \qquad 0 \le \alpha_i, \alpha_i^{*} \le C.$$

Based on the solution $(\alpha_i, \alpha_i^{*})$, the target nonlinear function is given by

$$f(x) = \sum_{i} (\alpha_i - \alpha_i^{*}) K(x_i, x) + b.$$

In these expressions, the parameter $\varepsilon$ of the loss function and the constant $C$ affect the calculation results, and they are usually defined by the user. To generate an optimal setting of these parameters and maximize the performance, genetic algorithms or exhaustive search methods are usually adopted. In this paper, the grid search algorithm coupled with cross-validation is chosen to find the best pair of $C$ and $\varepsilon$ owing to its convenience and exhaustive capability. In practice, both methods are assigned a search range with a small interval, and each pair of $C$ and $\varepsilon$ is then evaluated to select the best values. As a result, an optimal pair of $C$ and $\varepsilon$ within the specified range can be obtained. The details of applying SVM to the present research by developing a novel two-layer SVM regression model are described in Section 4.2.
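As an illustration of the grid search coupled with cross-validation, the following sketch tunes the two SVM regression parameters with scikit-learn instead of the Matlab toolbox used in this work. The helper name `tune_svr`, the RBF kernel, the five folds, the mean-squared-error score, and the candidate grids are all assumptions made for the example; the paper does not state these settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR


def tune_svr(features, targets, c_grid, eps_grid, folds=5):
    """Evaluate every (C, epsilon) pair by k-fold cross-validation; return the best model."""
    search = GridSearchCV(
        SVR(kernel="rbf"),                      # kernel choice is an assumption
        param_grid={"C": list(c_grid), "epsilon": list(eps_grid)},
        scoring="neg_mean_squared_error",
        cv=folds,
    )
    search.fit(features, targets)
    return search.best_estimator_, search.best_params_


# Synthetic regression problem standing in for (PCA features, impact coordinate).
rng = np.random.default_rng(0)
features = rng.uniform(-1.0, 1.0, size=(60, 3))
targets = np.sin(features[:, 0]) + 0.5 * features[:, 1]

c_grid = [0.25, 1.0, 4.0, 16.0, 64.0]   # hypothetical search range for C
eps_grid = [0.01, 0.05, 0.1]            # hypothetical search range for epsilon
model, best = tune_svr(features, targets, c_grid, eps_grid)
print(best)
```

A finer grid can be laid around the best pair found by a coarse pass, which keeps the exhaustive search affordable.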

Experimental Setup
The experimental setup consists of a rectangular 491 mm × 392 mm clamped plate, four piezoelectric patch sensors bonded to the plate with conductive glue, a data acquisition board installed in a personal computer, and an instrumented hammer with a rubber head for knocking. The experimental setup is depicted in Figure 1(a), and a photo of the plate structure is shown in Figure 1(b).
The plate is fastened onto a heavy metal table using screws located at the four corners. The four sensors are mounted on the plate surface with conductive glue. The distances between the sensors and the nearest edges are 170 mm (horizontally) and 100 mm (vertically), respectively. The dimensions of the plate and the locations of the sensors are shown in Figure 2.
For data acquisition purposes, a data acquisition board (NI PCI-4472) manufactured by National Instruments, Inc. is inserted in a PCI slot of the personal computer. The time-varying data are gathered by a program developed with LabVIEW software. All the sensors are sensitive to the impacts applied on the plate. When the hammer knocks on the plate, the voltages collected by the sensors increase rapidly at first and then decay to zero quickly. During this transient response time, the sensors' output voltage signals are recorded and saved in data files. The data acquisition rate is set to 100 kHz and the number of acquired samples is selected as 10000. The format of data storage in the file is presented in Figure 3.
In each data file, there is a 10000 × 6 matrix corresponding to one impact. The first column is the time stamp of sampling, the second is the impact force magnitude, and the remaining four columns are the voltages acquired by the first, second, third, and fourth sensor, respectively. The last four columns of voltage values are used in the later training and prediction procedures.
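The record layout described above can be handled as in the following sketch. It assumes that a data file has already been loaded into a 10000 × 6 numeric array (the on-disk file format is not specified here), and the function name `split_impact_record` is hypothetical. With 10000 samples at 100 kHz, each record spans 0.1 s.

```python
import numpy as np

SAMPLE_RATE_HZ = 100_000   # data acquisition rate
N_SAMPLES = 10_000         # samples stored per impact


def split_impact_record(record):
    """Split one 10000 x 6 record into time stamps, hammer force, and sensor voltages."""
    record = np.asarray(record, dtype=float)
    if record.shape != (N_SAMPLES, 6):
        raise ValueError(f"expected shape ({N_SAMPLES}, 6), got {record.shape}")
    time_stamps = record[:, 0]
    hammer_force = record[:, 1]
    sensor_voltages = record[:, 2:6]   # columns for sensors 1 to 4
    return time_stamps, hammer_force, sensor_voltages


# Duration of one record implied by the acquisition settings.
record_duration_s = N_SAMPLES / SAMPLE_RATE_HZ

# Synthetic record: a time axis plus five columns of zeros.
demo = np.zeros((N_SAMPLES, 6))
demo[:, 0] = np.arange(N_SAMPLES) / SAMPLE_RATE_HZ
time_stamps, hammer_force, sensor_voltages = split_impact_record(demo)
print(sensor_voltages.shape, record_duration_s)
```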

Data Acquisition and Impact Detection Strategy
4.1. Data Acquisition. To detect the location of impact using SVM regression models, two sets of impact data have been acquired with the aforementioned experimental equipment.
The first set includes a grid of 63 impacts (nine impacts equally spaced horizontally by seven impacts equally spaced vertically), which is shown in Figure 4. It is used for the SVM model training. Another set of 100 random impacts, which differs from the training set, is presented in Figure 5.
In order to quantify the detection performance, a root mean square (RMS) error function, which was used in previous work [7], is employed here to provide a measure of detection accuracy. The RMS error function is expressed by

$$E_{\mathrm{RMS}} = \frac{1}{N} \sqrt{\sum_{i=1}^{N} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]}, \tag{8}$$

where $N$ is the number of testing impacts, $(x_i, y_i)$ are the real coordinates of the impacts within the testing set in centimeters, and $(\hat{x}_i, \hat{y}_i)$ are the results predicted by the SVM in centimeters. The error function gives the radial distance between the real and computed impact location in centimeters.

[Figure 3 sample. Columns: Time, Hammer, Sensor 1, Sensor 2, Sensor 3, Sensor 4. First row: 0.000000, 0.001527, 0.002631, 0.002451, −0.002360, −0.000214.]
Alternatively, another evaluation approach [6] is employed to better visualize the results. In this method, the averaged errors of the x and y coordinates are calculated separately, and their product is taken as the evaluation factor. The smaller the ratio of this factor to the area of the plate, the better the detection accuracy for the impact location:

$$A = \frac{\bar{e}_x \, \bar{e}_y}{S} \times 100\%,$$

where $\bar{e}_x$ and $\bar{e}_y$ are the averaged errors of the x and y coordinates, $S$ denotes the area of the plate structure, and $A$ is the ratio of the error area to the whole plate area.
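The two evaluation measures can be computed as in the sketch below. Several points are assumptions rather than statements of this paper: the function name `location_errors` is hypothetical, the "averaged errors" are taken as mean absolute errors, and the overall RMS error is taken as the square root of the summed squared radial errors divided by the number of testing impacts, a form chosen because it is consistent with the figures reported in Section 5.1. The plate area follows from the 491 mm × 392 mm dimensions.

```python
import numpy as np

PLATE_AREA_CM2 = 49.1 * 39.2   # 491 mm x 392 mm plate, in square centimeters


def location_errors(actual_xy, predicted_xy, plate_area=PLATE_AREA_CM2):
    """Return per-axis and overall error measures for N impacts given as (N, 2) arrays in cm."""
    actual_xy = np.asarray(actual_xy, dtype=float)
    predicted_xy = np.asarray(predicted_xy, dtype=float)
    diff = predicted_xy - actual_xy
    n_impacts = diff.shape[0]
    mean_abs_error = np.abs(diff).mean(axis=0)            # averaged errors of x and y
    rms_xy = np.sqrt((diff ** 2).mean(axis=0))            # per-axis RMS errors
    overall_rms = np.sqrt((diff ** 2).sum()) / n_impacts  # assumed form of the overall RMS error
    area_percent = 100.0 * mean_abs_error[0] * mean_abs_error[1] / plate_area
    return {
        "mean_abs_error": mean_abs_error,
        "rms_xy": rms_xy,
        "overall_rms": overall_rms,
        "area_percent": area_percent,
    }


# Two synthetic impacts, each predicted 3 cm off in x and 4 cm off in y.
print(location_errors([[10.0, 10.0], [20.0, 15.0]], [[13.0, 14.0], [23.0, 19.0]]))
```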

4.2. Proposed Detection Strategy.
The whole procedure of the proposed impact detection strategy is depicted in Figure 6.
The procedure is implemented in Matlab with toolboxes for the PCA and SVM algorithms. Generally, the strategy contains the following two steps. The first step is data preprocessing for dimension reduction, which extracts the principal components from each data file using PCA. When an impact happens, each of the four sensors collects 10000 voltage values. In each data file, apart from the first column of time stamps and the second column of impact force, one impact corresponds to the other four columns of piezoelectric sensor outputs (10000 × 4). Using PCA, all the impact data input to PCA are projected onto another coordinate system, in which the projections of the original data are distributed in descending order along the principal directions. Only a portion of the values represents the dominant features of the information. After defining a tolerance for information loss, 19 main components with nearly complete information are selected. Hence, the data dimension is significantly reduced from 10000 to 19 for each sensor output. With these reduced components, the result is still accurate for the purpose of impact location.
The second step concerns the regression model development. The four sensors represent four positions on the plate, and the generated main components (a 19 × 4 matrix) are directly related to the impact. However, direct use of the data from those four independent sensors cannot offer good detection performance. To build a relationship among the four sensors, this paper proposes a new method which merges the four models corresponding to these sensors into a single model. This step is further divided into the following three substeps.
(1) For the model training process, n data files are used. Each data file records the four sensors' outputs associated with one impact. From each data file, a 19 × 4 matrix is generated using the PCA results of the four sensors' data. For the convenience of SVM regression, the matrix is transposed into a 4 × 19 format, where each row represents one sensor output related to one impact. Thus, a total of n impacts yields an n × 19 matrix for each sensor output. Using this n × 19 matrix as the training input and the corresponding coordinates of the n impacts as the desired output, one SVM regression model is obtained for each sensor output. Hence, four SVM models are developed for the four sensors.

(2) Using the training sets of Step (1) as the testing sets for each SVM model, one set of impact locations is estimated by each SVM model. Hence, the four SVM models predict four sets of impact locations. It is worth mentioning that each of the four SVM models represents one sensor mounting site. Thus, the underlying reason for the different predicted impact locations lies in the different mounting sites of the four sensors. Using the four locations predicted for the n impacts, that is, an n × 4 matrix, as the training input and the corresponding actual coordinates of the impacts as the desired output, the fifth SVM regression model is established.

(3) For the purpose of model testing, another m data files are used. In the testing process, the m data files are processed in the same way as described in Step (1) and Step (2). That is, four matrices (m × 19) are generated and fed to the first layer of four SVM models, and then the outputs of the four SVM models (m × 4) are passed to the second-layer SVM model to predict the impact location, as depicted in Figure 6.
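The three substeps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the Matlab implementation used in this work: the function names `fit_two_layer` and `predict_two_layer` are hypothetical, default SVR settings stand in for the tuned parameters, and one coordinate is handled per call, so the procedure would be run once for x and once for y.

```python
import numpy as np
from sklearn.svm import SVR


def fit_two_layer(sensor_features, target, **svr_kwargs):
    """Fit one SVR per sensor, then a fifth SVR on the four first-layer predictions.

    sensor_features: list of four (n, 19) arrays, one per sensor.
    target: (n,) array holding one coordinate of the n training impacts.
    """
    first_layer = [SVR(**svr_kwargs).fit(feats, target) for feats in sensor_features]
    # Step (2): predictions on the training inputs form the n x 4 second-layer input.
    stacked = np.column_stack(
        [model.predict(feats) for model, feats in zip(first_layer, sensor_features)]
    )
    second_layer = SVR(**svr_kwargs).fit(stacked, target)
    return first_layer, second_layer


def predict_two_layer(first_layer, second_layer, sensor_features):
    """Step (3): pass m x 19 inputs through both layers to predict one coordinate."""
    stacked = np.column_stack(
        [model.predict(feats) for model, feats in zip(first_layer, sensor_features)]
    )
    return second_layer.predict(stacked)


# Synthetic stand-in: 63 training and 10 testing impacts, 19 components per sensor.
rng = np.random.default_rng(0)
train_feats = [rng.normal(size=(63, 19)) for _ in range(4)]
train_x = sum(feats[:, 0] for feats in train_feats)   # artificial target coordinate
test_feats = [rng.normal(size=(10, 19)) for _ in range(4)]

layers = fit_two_layer(train_feats, train_x)
print(predict_two_layer(*layers, test_feats).shape)  # (10,)
```

Because the second layer is trained on in-sample predictions of the first layer, as in Step (2), it mainly learns how to weight the four sensor-specific estimates against one another.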

Results and Discussions
5.1. SVM with PCA. Using the approach described previously, the SVM model is trained with the training set shown in Figure 4 and then tested using the original testing set illustrated in Figure 5.
In order to enhance the accuracy, a grid search method coupled with cross-validation is applied to find the optimum parameters for the SVM. By analyzing the result, the mean error of the x coordinates is found to be 2.6847 cm and the mean error of the y coordinates is 2.023 cm, which gives an area error corresponding to 0.28% of the plate area. Besides, the RMS errors for the x and y coordinates are 4.0755 and 3.0809 cm, respectively. As a result, the overall RMS error of the results is 0.5109 cm.
The discrepancy between the predicted and actual x and y coordinates is illustrated in Figure 7. In addition, the error between each pair of predicted and actual coordinates is given in Figure 8. It is observed that some specific points exhibit large errors. By inspecting the properties of those impact points and comparing the predicted and original coordinates, it is further found that the errors depend on whether the testing impact lies within the training region: test impacts located outside the training set have higher errors than those inside, because detection is more difficult when the impact is outside the training region. Hence, it is desirable to enhance the detection accuracy by changing the constitution of the testing set, and it is reasonable to delete the impacts outside the training set so as to construct a new testing set. The sites remaining after deletion are illustrated in Figure 9.
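The construction of the modified testing set can be sketched as a simple filter. The sketch assumes that the training region is the axis-aligned bounding box of the training grid, which the text does not state explicitly; the function name `inside_training_region` and the grid coordinates are hypothetical placeholders, not the measured impact sites.

```python
import numpy as np


def inside_training_region(train_xy, test_xy):
    """Return a boolean mask marking test impacts inside the bounding box of the training grid."""
    train_xy = np.asarray(train_xy, dtype=float)
    test_xy = np.asarray(test_xy, dtype=float)
    lower = train_xy.min(axis=0)
    upper = train_xy.max(axis=0)
    return np.all((test_xy >= lower) & (test_xy <= upper), axis=1)


# Hypothetical 9 x 7 training grid in centimeters (placeholder coordinates).
grid_x, grid_y = np.meshgrid(np.linspace(5.0, 45.0, 9), np.linspace(5.0, 35.0, 7))
train_xy = np.column_stack([grid_x.ravel(), grid_y.ravel()])   # 63 impact sites

test_xy = np.array([[10.0, 10.0], [2.0, 10.0], [46.0, 20.0], [45.0, 35.0]])
keep = inside_training_region(train_xy, test_xy)
modified_test_xy = test_xy[keep]   # impacts retained for the modified testing set
print(keep)
```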
Consequently, the SVM models are trained with the original training set and tested with the modified testing set. The final results give an RMS error of 0.343 cm. The area of error occupies 0.129% of the plate size. Moreover, the RMS errors of the x and y coordinates are 2.6954 and 1.8543 cm, respectively. The differences between the predicted and real values for the x and y coordinates are shown in Figures 10 and 11, respectively.
To illustrate the necessity of modifying the testing set, a comparison between the results obtained from the original testing set with 100 impacts and the modified testing set with 91 impacts is presented in Table 1. The comparison covers the mean errors of the x and y coordinates, the error area with respect to the plate size, the RMS errors of the x and y coordinates, and their combination, that is, the overall RMS error. It is found that the detection using the modified set achieves a much better result than that using the original set. This comparison is the reason why the modified testing set is employed in the foregoing and subsequent testing experiments.
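As a consistency check on the figures quoted in this subsection, the arithmetic can be worked through explicitly. The check assumes that the area error is the product of the two mean coordinate errors divided by the plate area (49.1 cm × 39.2 cm), and that the overall RMS error combines the per-axis RMS errors in quadrature and divides by the square root of the number of testing impacts; the second assumption is inferred from the reported values rather than stated in the text.

```latex
\begin{align*}
A_{100} &= \frac{2.6847 \times 2.023}{49.1 \times 39.2}
         = \frac{5.431}{1924.72} \approx 0.0028 = 0.28\% \\
E_{100} &= \frac{\sqrt{4.0755^2 + 3.0809^2}}{\sqrt{100}}
         \approx \frac{5.109}{10} \approx 0.5109\ \text{cm} \\
E_{91}  &= \frac{\sqrt{2.6954^2 + 1.8543^2}}{\sqrt{91}}
         \approx \frac{3.2716}{9.5394} \approx 0.343\ \text{cm}
\end{align*}
```

Under these assumptions, the reported values of 0.28%, 0.5109 cm, and 0.343 cm are all reproduced.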

5.2. BPNN with PCA.
To illustrate the detection performance of the proposed approach in comparison with another method, a three-layer BPNN with a 76 : 11 : 2 topology is adopted. The data preprocessing is also conducted using PCA. Because of the random initial conditions, the program uses an iterative loop to explore the optimal number of neurons in the hidden layer (i.e., 11 neurons). The best result predicted by the BPNN is listed in Table 2 in comparison with the SVM result.
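To make the 76 : 11 : 2 topology concrete, the sketch below builds the corresponding weight matrices and runs a forward pass; the training loop (backpropagation) is deliberately omitted. Several details are assumptions: the 76 inputs are read as the 19 principal components of each of the four sensors placed side by side, the hidden layer uses a tanh activation with a linear output layer, and the function names are hypothetical.

```python
import numpy as np


def init_network(n_in=76, n_hidden=11, n_out=2, seed=0):
    """Create randomly initialized weights for a three-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, inputs):
    """Map (batch, 76) inputs to (batch, 2) outputs, i.e. the x and y coordinates."""
    hidden = np.tanh(inputs @ params["W1"] + params["b1"])   # activation is an assumption
    return hidden @ params["W2"] + params["b2"]


def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(p.size for p in params.values())


params = init_network()
batch = np.zeros((5, 76))             # five impacts, 4 sensors x 19 components each
print(forward(params, batch).shape)   # (5, 2)
print(count_parameters(params))       # 76*11 + 11 + 11*2 + 2 = 871
```

With only 63 training impacts, the 871 trainable parameters of such a network outnumber the training examples.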

5.3. Discussions on Detection Results. It is evident from Table 2 that the SVM delivers better detection accuracy than the BPNN. The worse performance of the BPNN may arise from the insufficient number of training data sets. In this sense, the SVM is capable of achieving better location detection accuracy than the BPNN using a small set of training data. The location detection results generated by the proposed strategy are also compared with typical previous works based on regression modeling. For example, [7] adopted the integrated real and imaginary parts of the FFT results of the impact strain signals as inputs and the RMS error as shown in (8) to evaluate the performance. The RMS error of the detection result was 1.56 cm, which is much larger than the RMS error (0.343 cm) achieved by the current work.
In addition, another work [6] used the magnitude of the maximum response and its time after impact as inputs for neural network training. That study resulted in an error area of 1.5% with respect to the plate size. Recalling that the current research obtains an RMS error of 0.343 cm and an error area of 0.129% using SVM regression, a much better result has been achieved by the proposed approach in comparison with the previous work [6]. Therefore, the experimental results reveal that the proposed two-layer SVM in combination with PCA exhibits a stronger capacity for generalization and offers higher accuracy than conventional approaches.

Conclusions
This paper proposes a new approach that integrates PCA and a two-layer SVM to predict the location of impacts on a plate structure. By extracting principal components, PCA improves the computational efficiency. Using the PCA results, a two-layer SVM framework incorporating five SVM models has been built for predicting the x and y coordinates separately. The models have been trained using a set of 63 impacts and tested using another set of 91 impacts. The presented detection strategy delivers an RMS error of 0.343 cm, which indicates better performance than that of the conventional method based on backpropagation neural networks.
In the future, the effects of noise and of variation in impact magnitude on the detection accuracy will be investigated. Moreover, with an online real-time realization [13,16], the whole prediction system is expected to meet practical requirements and broaden the prospects for application. Besides, severe impacts generally lead to real damage. Hence, the issue of whether such damage affects subsequent prediction will be investigated as well.

Figure 1: Experimental setup for impacts. (a) Schematic of hardware connection, (b) photo of the plate structure.

Figure 2: Dimensions of the plate and locations of the sensors.

Figure 3: Format of data storage in data file.

Figure 4: Training set consisting of a grid of 63 impacts.

Figure 5: Testing set consisting of 100 random impacts.

Figure 6: Flow chart of the proposed impact detection strategy.

Figure 7: (a) Comparison of measured x coordinates and predicted results (100 impacts for test). (b) Comparison of measured y coordinates and predicted results (100 impacts for test).

Figure 8: (a) Error of predicted and measured x coordinates (100 impacts for test). (b) Error of predicted and measured y coordinates (100 impacts for test).

Figure 9: Site of testing set after deletion.

Figure 10: (a) Comparison of measured x coordinates and predicted results (91 impacts for test). (b) Comparison of measured y coordinates and predicted results (91 impacts for test).

Figure 11: (a) Error of predicted and measured x coordinates (91 impacts for test). (b) Error of predicted and measured y coordinates (91 impacts for test).

Table 1: Comparison of predicted results from different test sets with 100 and 91 impacts.

Table 2: Comparison between SVM and BPNN results.