Classification of Earthquake-Induced Damage for R / C Slab Column Frames Using Multiclass SVM and Its Combination with MLP Neural Network

Nonlinear time history analysis (NTHA) is an important engineering method in order to evaluate the seismic vulnerability of buildings under earthquake loads. However, it is time consuming and requires complex calculations and a high memory machine. In this study, two networks were used for damage classification: multiclass support vector machine (M-SVM) and combination of multilayer perceptron neural network with M-SVM (MM-SVM). In order to collect data, three frames of R/C slab column frame buildings with wide beams in slab were considered. For NTHA, twenty different groundmotion records were selected and scaled to ten different levels of peak ground acceleration (PGA).Thus, 600 obtained data from the numerical simulations were applied toMSVM andMM-SVM in order to predict the global damage classification of samples based on park and Ang damage index. Amongst the four different kernel tricks, the Gaussian function was determined as an efficient kernel trick using themaximum total accuracy method of test data. By comparing the obtained results fromM-SVM andMM-SVM, the total classification accuracy of MM-SVM is more than M-SVM and it is accurate and reliable for global damage classification of R/C slab column frames. Furthermore, the proposed combined model is able to classify the classes with low members.


Introduction and Background
Artificial neural networks (ANNs) are one of the popular computational models applied widely throughout the sciences.Also they are specially used in many fields of civil engineering like materials strength prediction, thermographic inspection of electrical installations within buildings, traffic management and transportation systems, forecast water pressure in pipes, and so forth.Generally ANNs are used to solve complex problems by considering effective indices and establish a good relationship between the input and output parameters.Moreover, these networks can be applied in classification problems.The first classification algorithm was presented by Fisher [1].In this algorithm, minimizing the classification error of train data was evaluated as an optimization criterion.This method has been used in many classification algorithms, yet there are some problems encountered mainly the generalization properties of the classifiers, which are not directly involved in the cost function.Also for doing the training process, determining the structure of the network was not easy.As an example, to determine the optimum number of neurons in the hidden layers of the multilayer perceptron (MLP) neural networks or the number of Gaussian functions in radial basis function (RBF) neural networks are difficult and time consuming.Cortes and Vapnik [2] introduced a new learning statistical theory which leads to presenting the support vector machines (SVMs).The significant features of these networks are their ability to minimize the classification errors, maximize the geometric margins between classes, design the classifiers with maximum generalization, and automatically determine the architecture of network for classifiers and modeling the nonlinear separator functions using nonlinear cores.
In recent years, several different neural networks such as SVM have been applied in different branches of civil engineering.In a tunnel construction, an intelligent controlling system was presented by Jun et al. [3].This system needed to recognize the geophysical parameters to find the optimum solution of problems.Therefore, a nonlinear optimization technique was employed using the least square support vector machine (LSSVM).The results showed that this method is timesaving and easy to use in local optimal problems.Mingheng et al. [4] employed several different models of traffic flow using SVM to find the best intelligent traffic control tool.They obtained that amongst the three proposed models, the SVM with the historical pattern data for the target road section model has the best performance.Vafaei et al. [5] applied MLP neural network to identify the realtime seismic damage for concrete shear walls.It was observed that the neural network could detect the amount of imposed damage with high accuracy.Two different neural networks, the adaptive neuro-fuzzy inference system (ANFIS) and the three-layered artificial neural network (TL-ANN) model, were used to estimate the earthquake load reduction factor for industrial structures by Ceylan et al. [6].They showed that the ANFIS model was more successfully than the TL-ANN model.Xie et al. [7] investigated the amount of voids inside the concrete using SVM.The grid-algorithm and the genetic-algorithm were used to determine the kernel function and network parameters.The obtained results presented that the SVM exhibits a promising performance for identification of voids inside the reinforced concrete.In addition, ANNs were used in conjunction with each other.Köroĝlu et al. [8] applied MLP neural network in two models: single MLP and combined MLP with itself (CMLP) for estimation of the flexural capacity of the quadrilateral FRP-confined R/C columns.They obtained the model of CMLP having lower prediction error than the single MLP model.In order to classify the cardiac arrhythmias, Castillo et al. [9] considered a hybrid intelligent system which consists of the fuzzy K-nearest neighbors with the MLP and a very high classification rate was obtained.To predict the short-term wind power generation, combination of genetic algorithm (GA) and orthogonal least squares (OLS) algorithm with RBF neural network was proposed by Chang [10].The test results indicated that the proposed model is reliable with the sufficient performance.
The main aim of this research is to classify the vulnerability of R/C frames taken from slab column frame buildings built in Famagusta, Cyprus, by using ANNs.The distinguishing characteristics of this building type are the rectangular columns and the slabs supported on columns with wide beams in slab.Since the nonlinear time history analysis (NTHA) is time-consuming and imposes a significant computational burden, two networks include M-SVM and MM-SVM were proposed as good alternatives and efficient networks.Several common kernel tricks were tested to determine the best kernel function and applied to these machines for learning process.Therefore, using the M-SVM and MM-SVM, the classes of global damage for the similar frames can be predicted easily and earthquakeinduced damage can be prevented by retrofit plan.

Nonlinear Time History Analysis (NTHA).
Nonlinear time history analysis (NTHA) is one of the most common numerical analysis methods for evaluation of building's behavior under seismic loads.It can be applied to a system with single or multifreedom degrees.In this method, several ground motion records are selected based on the soil condition of a specific zone, distance from fault line, the time duration of earthquakes, amplitude and frequency content, and so forth.Also the dynamic responses in the incremental state can be determined by scaling the records based on the different levels of ground motion parameter such as peak ground acceleration (PGA), peak ground velocity (PGV), or peak ground displacement (PGD), which is called incremental nonlinear time history analysis (INTHA).

Sample Frames and Material
Properties.One of the existing building types in Cyprus is slab-column frame buildings with wide beams and rectangular columns.In this study, three R/C slab-column frames with the 4, 6, and 8 levels were selected from this type of buildings which is representative of the midrise frames in Famagusta, Cyprus.These buildings were designed according to 1975 version of the Turkish seismic design code [11].The compressive strength of concrete, yield, and ultimate strength of steel were specified based on the previous researches equal to 15 MPa, 220 MPa, and 300 MPa, respectively [12].The properties of soil type IV (D) were considered for this zone.Based on the detailed information mentioned in the building documents, the rectangular columns with aspect ratio between 2 and 3 (height/width ratio of cross section area) were used.Additionally, the beams heights were equal to slab thickness (around 15 cm) and were used as connection elements between columns.Figure 1 depicts the plan views for the 4-, 6-, and 8-story frames.

Ground Motions.
A significant step for doing the NTHA is selecting a set of ground motion records.In this study, due to the lack and uncertainty of ground motion records for the Famagusta region, twenty records were selected carefully which their average had the most correlation with Turkish design code [13].This set of records was chosen based on the D-type site properties, the strike-slip fault mechanism, and the distance less than 100 km from the fault line [14].The records were taken from the Berkeley database site [15] and then multiplied by factor of 2.3 in order to fit with the Turkish design spectrum using root mean square error reduction technique.The characteristics of these ground motion records are tabulated in Table 1.Additionally, the Turkish design spectrum [13], the mean, and the response spectrums of these scaled records are shown in Figure 2.

Damage Identification Method.
Several different criterions such as ductility ratio [16], interstory drift [17], flexural damage ratio [18], and maximum permanent drift [19,20], were introduced in order to evaluate the damage level of buildings.These criteria calculated the amount of structural vulnerability only based on a proper theoretical background.Later, mathematical models of damage that have been determined based on vulnerability of buildings under the earthquakes were defined as the functions of structural   members' ductility, strength of materials, distance from the fault line, effective time duration of the earthquake, and so forth.Therefore it leads to improved damage function according to theoretical calculations and practical tests.In this study, Park et al. damage index was selected among the improved indices and is defined as the linear combination of the maximum displacement and the dissipated energy [21].This index is expressed in the following equation:   where   and   are the maximum deformation and ultimate deformation of element under monotonic loading, respectively;   is the yield strength of the structure, which can be calculated by nonlinear dynamic analysis; ∫  ℎ is the hysteretic energy absorbed by the structural element during the response history; and  is a constant parameter.Therefore, based on this index, three classes of damages are used which is shown in Table 2.

Support Vector Machine
SVM has been introduced for the classification and pattern recognition problems by Cortes and Vapnik [2].It is a relatively new learning algorithm used for binary classification problems.The main difference between SVM and the other algorithms is that the SVM minimizes the operational risk as an objective function instead of minimizing the classification error.The original pattern classification of this machine is to classify the linear input data using the perfect hyperplane into two classes with the largest margin in between classes.For nonlinear input data, a nonlinear mapping is used to transfer the input data from the primal space to the higher dimensional feature space and lead to finding the proper hyperplane.Furthermore, SVMs have also been extended to solve multiclass problems.
In the two-dimensional space, the discriminator is a line in the middle of the margin between the classes.Thus, for -dimension space, the discriminator is a hyperplane.Suppose the distance between the each separate data and the discriminator is equal to 1, the two support hyperplanes are considered parallel to the discriminator, and the classifier function can be obtained as follows (see Figure 3): For a unique separator, the maximum margin between classes is needed.Thus, if the distance between the support hyperplanes is equal to , using (2), the optimum margin () is given by After calculating the maximum margin, the target function is defined as follows: Subject to (s.t.) : Since the probability of being the separated data in nature is very low and more datasets are inseparable; therefore, the discriminator (hyperplane) is also determined based on minimum number of errors.As a result, those members belonging to another class are penalized based on the distance from the boundary of its own class () (see Figure 3).This strategy is represented as a model of soft margin SVM.For this reason, nonnegative variables (  ) are defined and called as slack variable s.t.  ≥ 0. Thus, ( 5) is changed as follows: By multiplying both sides of first s.t. of (6) by , the primal problem becomes Minimize thus The primal problem is a quadratic program but it cannot be solved easily because it does not just depend on the parameters which are related to input vectors.Therefore, this equation changes from the primal form to dual form by using the Lagrange method.The Lagrange factors (  ,   ) must be nonnegative real coefficients and (8) becomes where  is penalty factor.In this case,   is a saddle point.Thus, at this point, the minimum value should be taken with respect to the parameters , , and  and the maximum value should be taken with respect to the Lagrange multipliers (  ,   ).This can be done by taking the partial derivative with respect to , , and  in order to change the primal problem to a maximum problem as follows: By substituting (10) and ( 11) into (9), the dual problem is obtained as follows: Also based on (12), the box constrains are defined as In addition, by considering ℎ  and  as the following definition: and substituting ℎ  and  into (13), the dual formulation becomes where  and  are defined as Therefore, the target function is expressed as follows: The quadratic programming problem (see (19)) can be solved easily by using the quadprog function in the Matlab software and the values of   are calculated.Then by substituting   values into (10), the values of  are obtained.Also for calculating the bias term, the Karush-Kuhn-Tucker (KKT) conditions [23] are necessary and sufficient for the optimization problems.Therefore, these conditions should be established in optimum point (see (9)).The bias value is calculated as Thus based on K.K.T conditions three cases occurred.
Case 1. None support vectors if (  = 0) Case 2. Outliers if (  = ) Case 3. Support vectors if (0 <   < ) In Case 3, each   corresponding to   are support vector machines.Thus by multiplying both sides of first s.t. of (19) by  as follows: the amount of the bias term can be obtained as follows: and also using (10),  becomes Finally, by having the amounts of  and , the optimal hyperplane decision function can be expressed as follows: 3.2.Nonlinear SVM.For nonlinearly data, the selection of optimal hyperplane for separation of data is difficult.For this case, Cortes and Vapnik [2] used the Hilbert-Schmidt theory [24] in order to transform the -dimensional input vector  into (usually higher) an -dimensional feature vector by using an -dimensional vector function 0: 0 :   →   , 0 :  → ,  = 0 () . ( Therefore based on the SVM algorithm, the discriminator equation can be applied into  space instead of  space as follows: And according to the properties of soft margin classifier method, the dual problem is obtained as follows: by substituting (  ,   ) instead of 0(   ) ⋅ 0(  ), the dual formulation becomes  3. Therefore, the optimal hyperplane decision function is expressed as follows: (32)

Multiclass SVM (M-SVM).
The basic theory of SVM is designing the discriminator (hyperplane) with maximum margin between the two classes, while most of classification problems are in the multiclass models [25].For  classes' model, Cortes and Vapnik [2] presented a strategy to compare one class with the remaining classes and this leads to generating the  classifiers.Therefore, this method needs the solution of the  quadratic programming optimization problems.This strategy can be named as "one-versus-rest" and was used in this research.

Data Generation, Selection of Input Parameters
For classification of the imposed global damage of these sample frames, M-SVM and MM-SVM were employed.In order to do so, twenty suitable ground motion records are selected and scaled to ten levels of PGA, then each scaled record applied to each frames and the amount of damage were obtained using IDARC-2D software [26].After carrying out the NTHA, 600 data were generated and divided into three classes based on the Park and Ang damage definition (see Table 2) including 153, 62, and 385 data for class number 1, class number 2, and class number 3, respectively.For generation of each input data, the appropriate parameters should be selected that are able to describe the properties of R/C frames and ground motion characteristic.In this investigation, seven and four parameters were chosen for structural and ground motion properties, respectively.The structural parameters were selected based solely on geometry and dimension of sample frames and without any relation to the engineering analysis such as first mode period, ductility and energy absorption of structural elements, and top displacement.In addition, for ground motion parameters, the main and available characteristics of records were selected.The ranges and definition of parameters are shown in Table 4.
The scaling of the data set is very important for training and also testing process of network.Thus, before presenting the data to the network, it is advised to normalize them.Therefore, linear normalization method was used to change the input parameters range between zero and one.

Models Used for Classification
In this study, two models of neural networks consisting of M-SVM and MM-SVM were applied.In order to find the best kernel function for training process, the total accuracy prediction scores of the test data were calculated.Also, the kernel parameters (, , , ,  0 , and  1 ) and penalty factor () should be determined to reach the maximum margin between classes and the minimum classification error between real and predicted data.The amounts of , , ,  0 and  1 were obtained using trial and error.Also for the two remaining parameters ( and ), the grid-search method was considered and the best values were selected automatically using Libsvm-3.17[27] in the Matlab software.The results showed that 143, 149, 155, and 152 class labels from the total of 180 test data class labels were correctly predicted for linear, polynomial (5 degree), Gaussian and sigmoid functions, respectively.Therefore the Gaussian function was chosen as the best kernel trick function.

M-SVM Model.
In the M-SVM, the set of normalized data which include 600 input data and each data containing eleven elements were shuffled and then applied to this machine that 70% and 30% of the total data were used for training and testing process, respectively.Figure 4 shows the comparison of actual classes and predicted classes of the imposed global damage for train data, test data, and all data of M-SVM.In this Figure, the hollow circles and stars indicated the actual classes and predicted classes, respectively, which if the classification be correctly done, then the hollow circles and stars will be overlapped together.The obtained results showed that the M-SVM was predicted the classes number 1 and number 3 with high precision.But for class number 2, this performance was very low because the M-SVM was not able to determine the proper margins based on feature of input data.
Also for evaluation of the obtained results from classified data, the confusion matrix is used and is defined as an error matrix or a contingency table to determine the performance  of network.Each element of this matrix expresses the number of actual classes versus predicted classes.The structure of confusion matrix is shown in Figure 5.

Actual class Predicted class
Whereas TP is a true positive observation, TN is a true negative observation.FN is a false negative since observation is an actual negative (−) but the classifier label is positive (+) and FP is a false positive since observation is an actual positive (+), nonetheless, the classifier label is negative (−).For assessment of this matrix, some parameters can be used which are shown in Table 5.Based on extracted confusion matrices, the amounts of SEN, SPC, PRE, ACC, Error, NPV, and PPV for each class and each set of data are presented in Table 6.The obtained results from these parameters showed that the best and worst classifications were done for class number 3 and class number 2 with ACC equal to 95% and 89.83%, respectively.Also, the low value of PRE for class number 2 represents inaccuracies in classification of this class.

MM-SVM Model.
For generation of MM-SVM, a onelayer feed-forward MLP neural network [28,29] was used at first level, then the output of this network was applied to M-SVM in form of input data.This combined artificial neural network was named MM-SVM and is shown in Figure 6.At the first level of MM-SVM, 600 input data applied to MLP neural network.In this network, each layer's nodes are connected to the next layer nodes with a specific weight similar to the synaptic weight in human neural networks.For training process, the Levenberg-Marquardt back propagation algorithm was employed to update the weights and bias terms of the MLP network.Therefore, using this network leads to changing the primal data space from the eleven dimensions to one dimension.The MLP network consists of eleven neurons in the input layer, optimum neurons in the hidden layer, and one neuron in the output layer.The number of neurons and the type of activation function in the hidden layer are very important parameters for the network training process.In this research, Gaussian and linear functions were used for the hidden layer and the output layer of the MLP network, respectively.In addition, the number of hidden layer neurons was determined based on the minimum test error, as in the following equation: where   is the output of the neural network,   is the desired output,  is the number of testing or training data,  is the number of testing or training segments, and  is the number of neural network outputs for testing and training procedures [30].The test error values for different number of neurons in hidden layer which were calculated based on the Gaussian function are shown in Figure 7.  Therefore, the minimum test error was obtained equal to 4.09 for the 25 hidden layer neurons.In the MLP network, all the data was divided into three sets that include train data (70% of all data), validation data (15% of all data), and test data (15% of all data).For finding the best fit of the data set, the statistical models were used.For this reason, root mean square error (RMSE), mean square error (MSE), correlation coefficient (), mean of error (), and standard deviation of error () were considered [31].In the training process, the network stopped after 16 iterations with MSE and gradient equal to 0.0413 and 0.0254, respectively.In addition, the best validation performance was 0.09486 at epoch 10.The values of RMSE, MSE, , , and  are presented in Table 7.
As seen in Table 7, the RMSE and MSE variables in testing cases are greater than in training and validating cases.Also the amount of Rs ranged between 0.93508 and 0.95944.Figure 8 compares the real output values and the predicted values of all data.The most variation of errors for all data sets happened between −0.5 and +0.5 which successfully represents the network damage prediction values with high accuracy.The regression and fit function for the train, validation, test, and all data are shown in Figure 9.The high value of  (around 0.95) indicates a good linear relationship between predicted values and actual values for the total response in the MLP model.Moreover, the histogram of error for all data is presented in Figure 10.The concentration of the bins error around the zero line with mean −0.017479 and standard deviation 0.26452 for all sets of data represents a good performance of this network.
At the second level of MM-SVM, the obtained outputs of the first level (MLP network) were applied to the M-SVM as input data.This set includes 600 input data and each data consists of only one element.Figure 11 shows the comparison of the actual classes and the predicted classes of the imposed global damage for the train data, test data, and all data of MM-SVM.For all classes, the results showed that the MM-SVM predicted the classification of global damage with high accuracy compared to M-SVM.Indeed, reduction in feature space of input data and creating high correlation between the input and output data by the MLP neural network lead to determine more precisely of margins for each class by the M-SVM.
For evaluation of the system's performance of MM-SVM network, the confusion matrix for the train data, test data, and all data is given as follows.
Confusion matrix for the train data = [  Based on extracted confusion matrices from MM-SVM, the amounts of SEN, SPC, PRE, ACC, Error, NPV, and PPV for each class and each set of data are presented in Table 8.The maximum and minimum of SPC were obtained equal to 98.66% and 93.98% for class number 2, respectively.Also for all data, the ACC-values for class number 1, class number 2, and class number 3 were extracted equal to 97.33, 94.67, and 97.33 respectively.
The above obtained results for all three classes showed that although the amount of data for each class is different,  the amount of these parameters are close together which indicated the MM-SVM model to be highly suited to classification of global damage for R/C slab column frames and provide reference for future seismic assessment of this frame's type.

Discussion
Since that the percentage of input data for class number, class number 2, and class number 3 was equal to 25.5%, 10.3%, and 64.2% of all data, respectively, the M-SVM model showed very weak performance for classification of class with minimum members.In fact, this model predicted only 2, 0, and 1 data from the set of 41, 21, and 62 considered data for the train, test and all data cases of class number 2, respectively, whereas the MM-SVM model was able to predicte 36, 11, and 44 data for this class.Indeed, using the MLP model in first level of MM-SVM leads to reducing the dispersion and complication in feature space of input data and for this cause, in the second phase, the M-SVM was able to determine the margins for each class with high precision.Table 9 compares the ACC-value for M-SVM and MM-SVM models.These results showed that the MM-SVM

Conclusion
In this study, the classification of the imposed seismic damage under earthquake loads for R/C slab-column frames in the Famagusta city was investigated.Based on the Park and Ang global damage classification, the three classes of damage are applied including repairable (economic), beyond repair (not economic), and loss of building (collapse).The NTHA was used to generate the 600 databases.Two ANNs, M-SVM, and MM-SVM were applied to establish the relationship between the structural and ground motion parameters (input data) and damage classification (output data).The following conclusions were obtained from this investigation.
In the M-SVM, to find the best kernel trick, four different kernel functions were applied including linear function, polynomial function (5 degree), Gaussian function, and sigmoid function and they evaluated using maximum accuracy of test data.The results showed that the Gaussian function had the maximum accuracy and it was employed as an efficient kernel trick function.
The number of hidden layer neurons for the first level of MM-SVM (MLP neural network) was selected based on the minimum test error, which was obtained equal to 25 neurons.Additionally, the value of  for train data, validation data, test data, and all data was calculated around 0.96, 0.94, 0.94, and 0.95, respectively.
Comparing the classification results of the M-SVM and MM-SVM showed that the total accuracy of MM-SVM is more than M-SVM.Also for class number 2 (class with the lowest member), the obtained values of PRE or PPV indicated that the MM-SVM predicted the label of this class with high efficiency towards the M-SVM.Thus, the MM-SVM is proven as an efficient and time saving way for classification of the imposed global damage under earthquake loads and it can be used for similar R/C slab-column frames solely by selecting the structural geometric and ground motion parameters.In addition, this method of damage classification can be used by the insurance companies because it is easy and fast.

Figure 2 :
Figure 2: The response spectrums of individual scaled records, mean and Turkish design spectrum (Target).

1 Figure 3 :
Figure 3: A sample of linear soft margin SVM.

Figure 4 :
Figure 4: Comparison of the actual and predicted classes for train data, test data, and all data of M-SVM.

Figure 7 :
Figure 7: Number of hidden layer neurons versus test error.

Figure 8 :
Figure 8: Comparing the real and predicted values.

Figure 9 :
Figure 9: The regression and fit function for each set of data.

Figure 10 :
Figure 10: The error histogram for all data.

Figure 11 :
Figure 11: Comparison of the actual and predicted classes for train data, test data, and all data of MM-SVM.

Table 1 :
Characteristics of the twenty natural records.

Table 2 :
The classification of global damage based on the Park and Ang investigation.

Table 4 :
The range of structural and ground motion parameters used.

Table 6 :
The SEN, SPC, PRE, ACC, Error, NPV, and PPV values for each class.

Table 7 :
The RMSE, MSE, , , and  values for each set of data.

Table 8 :
The SEN, SPC, PRE, ACC, Error, NPV, and PPV values for each class.

Table 9 :
Comparing the ACC-value for M-SVM and MM-SVM.