Application of PCA and SVM in Fault Detection and Diagnosis of Bearings with Varying Speed

Vibration analysis is widely used as an efficient condition monitoring (CM) tool for rotating machines in various industries. Fault detection and diagnosis (FDD) models play an important role in the development of any CM system, and artificial intelligence (AI) has gained recognition in the development of such systems. In this paper, a combination of principal component analysis (PCA), which is used to reduce the data dimensionality, and support vector machine (SVM), which is adopted for classification, is used to detect and diagnose faults in bearings from vibration signals. The diagnostic feature designer and machine learning toolbox in MATLAB are used to develop features and train the models, respectively. Real data from the Mendeley data repository is used to test and evaluate the models. Model training is carried out using data with varying speeds representing different bearing conditions, which distinguishes this work from similar SVM-based approaches. The choice of data shows that SVM can classify faults while the operating speed varies. The results show that the combination of PCA and SVM is effective in diagnosing bearing faults under varying speeds, achieving a classification accuracy of 97.4%. This implies that PCA and SVM can be implemented in industrial setups where speed variations occur either intentionally or unintentionally. Furthermore, the method was able to differentiate between compound faults and faults that occur at different times. The confusion matrix further supports the quality and accuracy of the trained model. Future work will focus on developing models for the prognosis of bearing faults and on modelling faults other than bearing faults.


Introduction
Condition monitoring (CM) is a form of predictive maintenance. It is a tool for estimating the current health of rotating equipment using techniques such as vibration analysis, motor current analysis, oil or wear debris analysis, and temperature analysis. Of these techniques, vibration analysis is well studied and has been accepted in most CM processes [1]. CM involves fault detection and diagnosis of machines in various industries. Before CM, time-based maintenance (TBM) was adopted throughout various industries. TBM wastes manpower, time, and money, as maintenance is done at predefined times [2]. The adverse conditions of industries and the need for continuous production put a strain on rotating machines, making them susceptible to faults [3]. Bearings are common parts found in most rotating machines such as motors, turbines, engines, and pumps; they are highly susceptible to faults, which makes their maintenance significant, especially since their failure leads to machine failure and disturbs production [4]. Bearing conditions are categorized as follows: healthy bearing, inner race (ring) fault, outer race (ring) fault, and ball fault. The faulty conditions are due to various factors such as machine overloading, misalignment, and improper mounting. Bearings are generally made of two concentric rings: an outer ring and an inner ring. Between the two rings, there are balls or rollers which reduce rotational friction in rotating equipment. The parts of a bearing that usually fail are the balls, the inner race, the outer race, and the cage, as illustrated in Figure 1.
In this work, real vibration data is analyzed to detect and diagnose bearing faults using PCA and SVM techniques. It should be noted that the data used in the paper is recorded at various speeds. Traditionally, fast Fourier transform (FFT) techniques are applied to infer faults from vibration data. However, FFT struggles to reveal the faults in the signal, which led to the introduction of the order tracking technique.
These techniques require a trained person to further analyze the signals to detect and diagnose faults using the fault characteristic frequencies. This wastes time and increases the likelihood of human error. In recent years, intelligent approaches ranging from machine learning to more advanced deep learning have been researched. However, most of the research in the field does not cater for the changes in speed and load conditions of rotating equipment, which affect the diagnosis. Furthermore, the effects of compound faults, where more than one fault occurs at the same time, are largely neglected. The application of PCA and SVM adopted in this work automates the fault detection and diagnosis process, hence eliminating the need for trained personnel, avoiding wasted time, and reducing human errors. Even though the approach has been researched before, the method has not been tested for cases in which the data covers various speed conditions, namely increasing speed, decreasing speed, increasing then decreasing speed, and lastly decreasing then increasing speed. Note that these changes occur during the measuring or sampling period. The purpose of this paper is therefore to test the performance of PCA-SVM with K-fold cross-validation on the detection of bearing faults under various operating speed conditions and for compound faults.
The remainder of the paper is structured as follows: Section II reviews the related literature, Section III presents the methods and techniques used, Section IV presents the results, and Section V concludes the paper.

Related Work
FDD based on vibration analysis has been extensively studied and developed, from simple models to more complex mathematical algorithms. FFT is commonly adopted as a preprocessing technique for vibration data. Order tracking has been the most used technique for analyzing nonstationary signals to reveal the order characteristics necessary for fault diagnosis [6,7]. In recent years, the autoregressive integrated moving average (ARIMA) has gained recognition [8]. An extended version of ARIMA has also been used [9] to address the complexity of modelling multisensor condition monitoring. The use of artificial intelligence (AI) has also gained popularity in developing models and algorithms for fault detection and diagnosis, and various models have been developed in recent years. One widely accepted tool is the use of machine learning (ML) models or algorithms in fault detection and classification. The most utilized ML models include K-nearest neighbor [10], Naïve Bayes, artificial neural networks, and support vector machines (SVMs). Kumar et al. used a novel convolutional neural network for bearing defects. The approach is unique in that it can cope with insufficient data, unlike other convolutional neural networks, while retaining the deep learning effect [11]. SVM has been used [8] to solve change detection problems in a gearbox. Multiple measurement vector compressive sampling, a combination of geodesic minimal spanning tree, stochastic proximity embedding, and neighborhood component analysis, and multiclass SVMs are used in [4] for condition monitoring of roller bearings in rotating machines using vibration signals. The approach has been shown to reach high bearing-health classification accuracy while outperforming existing methods. Chowdhury adopted multiclass SVMs, Naïve Bayes, binary decision tree, discriminant analysis, nearest neighbour, and ensemble classifiers to test time-windowed extracted features in nonintrusive load monitoring [12].
As stated above, AI methods are the recent advances in the FDD field, and one of the most used and studied of these methods is SVM. Ahmed et al. [13] proposed a fault detection and diagnosis approach based on SVM for fault classification. The approach was combined with compressive sampling and the Laplacian score to generate compressively sampled data from raw vibration signals and to rank the sampled signals, respectively. The method was tested on bearing data sets and proved able to classify faults, surpassing the other AI-based approaches. Fault diagnosis depends greatly on the features and on how they are extracted, and treating the two processes separately leads to inferior fault diagnosis accuracy. Zhang et al. [14] proposed a two-phase approach aiming to synchronously extract features and optimize SVM parameters for improved fault diagnosis. A hybrid filter and wrapper method was adopted for this purpose. The proposed method was trained and tested for bearing fault diagnosis and rotor fault diagnosis, and it proved able to perform the fault classification tasks accurately. Amir et al. [15] compared the accuracy of the classical SVM and the one-class SVM (OC-SVM) for fault detection and diagnosis in bearings. Classical SVM constructs a hyperplane separating two classes, whereas OC-SVM maximizes the distance between the origin and the positive samples in the absence of negative samples. Furthermore, OC-SVM is able to learn from little data. The authors found that the classification accuracy of OC-SVM is higher than that of classical SVM, which is also affirmed by [16]. More research on the application of SVM in fault detection and diagnosis problems can be found in [17].
SVMs are also adopted for fault classification in [15,16] and for other classification problems in [18]. In [1], principal component analysis (PCA) is used on FFT data to reduce the dimensionality of the data. PCA can distinguish between various motor faults and provides an inexpensive and simple alternative [19]. The application of PCA in fault detection and diagnosis problems is also explored in [20–22]. Another well-researched approach is the extension of artificial neural networks, including deep normalized convolutional neural networks [23] and deep neural networks with batch normalization (BN) [24], for the classification of bearing faults. Other neural network approaches are reported in [25–29]. The authors of [26] proposed a fault detection and diagnosis method based on a deep convolutional neural network (DCNN) that combines feature extraction and fault classification into a single stage. The approach differs from most methods in that it depends on feature learning rather than manual feature engineering, hence eliminating the reliance on an individual's knowledge for feature extraction and selection. With raw data, the approach achieved 98% accuracy for the training sample and 91% for the test sample, which is on par with other approaches based on manual feature engineering. The approach seems to preserve most features, as raw signals are used and there is less signal processing. Another notable property of the approach is its ability to reflect unknown information relating to the bearings. However, the approach does not seem to cater for changes in rotating speed, and situations where two or more faults exist are not covered. Proposed in [30] is a meta learning method which aims to learn priors from relevant tasks instead of learning from scratch. The method is named meta learning fault diagnosis (MLFD), and the authors achieved an average classification accuracy of 97.28% for complex working conditions on the Case Western Reserve University data. Deep learning techniques, which depend on larger datasets, are discussed in [30–35].
In this paper, the capabilities of ML are tested by employing a combination of PCA and SVM with K-fold cross-validation to detect and diagnose bearing faults using vibration analysis methods.

Methodology
This section presents the methods, tools, and techniques used in this research work. The whole process from data processing to model training is carried out in MATLAB. Since the fault labels are already known from the data publisher, order tracking was not done. A combination of SVM and PCA is adopted for the fault diagnosis of speed-varying bearing data. PCA is adopted to reduce the data dimensions as well as the computation time, and SVM is used to classify the bearing faults using the PCA data. As shown in Figure 2, raw time-domain vibration data is prepared using MATLAB. This includes labelling the data according to the type of fault it represents and creating an ensemble of the data so that it can be treated as a whole. The data is then taken through the preprocessing stage, where the order spectrum and power spectrum are created. This enables the subsequent extraction of features in both the time and frequency domains. The diagnostic feature designer toolbox in MATLAB is employed for the extraction and selection of the features, which are then used in the classification learner to train and verify the PCA and SVM models. The PCA and SVM parameters are optimized for high accuracy. The evaluation criteria are based on the performance of the algorithm in classifying the various faults or classes.

Data Specification.
Raw experimental data in the form of vibration signals from a bearing is used for training the model. The data is available through the Mendeley data repository, and more information on the data can be found in [36]. The setup for the collection of data is shown in Figure 3. The signals are sampled at 200 kHz for 10 seconds. The bearing type is ER16K with 9 balls, a ball diameter of 7.94 mm, and a pitch diameter of 38.52 mm. The data covers a wide range of rotating speed conditions, hence the effects of changes in speed during the sampling period on the bearing vibrations are taken into consideration. The operating speed conditions include increasing speed, decreasing speed, increasing then decreasing speed, and lastly, decreasing then increasing speed. The various speeds can be seen in [36]. The data represent various health conditions, including healthy bearing, ball fault, combined faults, inner race (ring) fault, and outer race (ring) fault. In this paper, the health conditions are labelled as follows: 0-healthy, 1-ball fault, 11-combined faults, 111-inner race fault, and 1111-outer race fault. A total of 40 samples in the dataset were used, with each class or fault represented by 8 samples. A combined fault is defined as a case in which a ball fault, an inner race fault, and an outer race fault occur at the same time.
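As a hedged illustration of the fault characteristic frequencies mentioned in the Introduction, the bearing geometry given above can be converted into fault orders (multiples of the shaft speed) using the standard kinematic bearing formulas. The sketch below is in Python rather than MATLAB and is not part of the original workflow; the contact angle of zero is an assumption that is not stated in the text, and the function name is illustrative.

```python
import math

def bearing_fault_orders(n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Return kinematic fault orders (multiples of shaft speed).

    contact_angle_deg defaults to 0, which is an assumption here.
    """
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        # Ball pass frequency, outer race
        "BPFO": 0.5 * n_balls * (1.0 - ratio),
        # Ball pass frequency, inner race
        "BPFI": 0.5 * n_balls * (1.0 + ratio),
        # Ball spin frequency
        "BSF": 0.5 * (pitch_d / ball_d) * (1.0 - ratio ** 2),
        # Fundamental train (cage) frequency
        "FTF": 0.5 * (1.0 - ratio),
    }

# Geometry quoted in the text: 9 balls, 7.94 mm ball, 38.52 mm pitch diameter
orders = bearing_fault_orders(n_balls=9, ball_d=7.94, pitch_d=38.52)
for name, value in orders.items():
    print(f"{name}: {value:.3f} x shaft speed")
```

Because the results are expressed in orders rather than hertz, they stay fixed while the shaft speed changes, which is why order-based analysis suits speed-varying data.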

Preprocessing and Feature Extraction.
Firstly, the data is preprocessed to ensure that it is ready for the extraction of condition indicators. The signals were assembled into an ensemble so that they could all be processed at once. Preprocessing of the data is carried out using both signal-based functions and model-based functions. The time synchronous average (TSA) signal is computed from the original data to filter out noise and disturbances. Spectral analysis is the most widely used approach for vibration signals of rotating machines, as it allows for easy detection of the resonance frequency or the fault frequency [37]. The order and power spectra of the signals are extracted from the resulting TSA signal, as it is free from disturbances. The power spectrum allows for the characterization of frequency content and resonances within a system. Faults cause changes in the spectral signature, which makes it easy to extract features from the power spectrum. The order spectrum, similar to the power spectrum, provides an extended understanding of harmonically interrelated systems in rotating machinery. Features are extracted from both the frequency and time domains. The features are ranked according to their significance, and less important features are not selected for classification. The data is then exported to the classification learner for model training. With PCA active, the optimized Gaussian SVM model is trained. Upon completion of training, a confusion matrix is generated, which is used to check how the trained model performs for each class.
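To make the feature extraction step more concrete, the following is a minimal Python sketch of a few common time-domain condition indicators and of the peak frequency of a power spectrum. It is not the MATLAB diagnostic feature designer used in this work; the chosen feature set, the function names, and the synthetic 50 Hz test signal are all assumptions made for illustration only.

```python
import numpy as np

def time_domain_features(x):
    """Compute a few common time-domain condition indicators."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    std = centered.std()
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "rms": rms,
        "peak_to_peak": x.max() - x.min(),
        "skewness": np.mean(centered ** 3) / std ** 3,
        "kurtosis": np.mean(centered ** 4) / std ** 4,
        "crest_factor": np.max(np.abs(x)) / rms,
    }

def peak_frequency(x, fs):
    """Return the frequency (Hz) of the largest power-spectrum peak."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return freqs[np.argmax(power)]

# Synthetic stand-in for a vibration record: a 50 Hz tone plus noise
fs = 2000.0
t = np.arange(0, 1.0, 1.0 / fs)
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 50.0 * t) + 0.1 * rng.standard_normal(t.size)

features = time_domain_features(signal)
features["peak_frequency_hz"] = peak_frequency(signal, fs)
print(features)
```

Each record in the ensemble would yield one such feature vector, and the resulting table of features is what the ranking and classification stages operate on.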

Condition Indicator Classification.
PCA is used to reduce the dimensions of the training data features, and SVM is adopted as the classification model. Features that are less significant in the preprocessed data are excluded to increase the accuracy of the model.

Principal Component Analysis (PCA).
PCA is one of the statistical learning algorithms used to reduce the number of features extracted from a signal. It has been widely adopted for prediction, classification, and feature extraction problems [38,39]. It transforms a large number of correlated variables into a new set of uncorrelated variables while retaining most of the information in the original signal. The principal components (PCs) are obtained from the uncorrelated variables to detect and isolate process anomalies in a robust way [40]. For simplicity, any given normal data matrix X (N × m) is transformed into a new matrix T (N × r), where r is smaller than m. This is achieved by using a transformation matrix P (m × r) [21]:

$$ T = XP, \qquad (1) $$

where P and T represent the orthogonal loading matrix and score matrix, respectively.
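The projection of a data matrix X (N × m) onto r principal components through an orthogonal loading matrix P (m × r) can be sketched with a singular value decomposition. This is an illustrative Python sketch, not the MATLAB implementation used in this work; mean-centering of X, the function name, and the random example data are assumptions.

```python
import numpy as np

def pca_scores(X, r):
    """Project X (N x m) onto its first r principal components.

    Returns the score matrix T (N x r), the loading matrix P (m x r),
    and the fraction of variance explained by each kept component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # centering is assumed here
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:r].T                            # orthonormal columns, m x r
    T = Xc @ P                              # scores, N x r
    explained = (s[:r] ** 2) / np.sum(s ** 2)
    return T, P, explained

# Hypothetical feature table: 40 records with 6 features each
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
T, P, explained = pca_scores(X, r=3)
print(T.shape, P.shape, explained)
```

The columns of T are mutually uncorrelated, which is the property that lets a small number of components stand in for the full feature set.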

Support Vector Machine.
A support vector machine (SVM) is a supervised machine learning model used for both classification and regression problems, defined by a separating hyperplane or line for a given training set [41]. SVM has been adopted for various classification problems such as Internet traffic classification [42]. Various hyperplanes separating two or more classes exist (Figure 4), but the SVM classifier relies on the hyperplane or line that has the maximum separating margin between the fault classes (Figure 5) [44,45]. The larger the separating or functional margin, the lower the classification error [46]. Originally, SVMs were designed for binary classification; however, they can also be used for nonlinear classification problems with the help of kernels [47], which makes them more versatile for classification problems [46]. The linear SVM is considered first since it is simple to implement. The linear classifier function is expressed as follows:

$$ f(x) = w^{T} x + b, \qquad (2) $$

where w and b are unknown coefficients that are determined by minimizing the following cost function:

$$ \min_{w,\,b,\,\xi} \; \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{N} \xi_i, \qquad (3) $$

subject to

$$ y_i \left( w^{T} x_i + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, \ldots, N, \qquad (4) $$

where $\xi_i$ are slack variables and C is a user-specified, positive regularization parameter that controls the trade-off between model complexity and empirical risk [48].
To extend (4) to nonlinear classification, the kernel notion shown in (5) is used:

$$ K(x_i, x_j) = \Phi(x_i)^{T} \Phi(x_j), \qquad (5) $$

where Φ is the nonlinear operator. The following types of kernels are usually adopted: linear kernel, polynomial kernel, RBF kernel, and MLP kernel [49]. Therefore, SVM is represented as follows:

$$ f(x) = \operatorname{sign}\left( \sum_{k=1}^{N_i} \alpha_k y_k K(s_k, x) + b \right), \qquad (6) $$

where $s_k$, $k = 1, \ldots, N_i$, represents the support vectors that correspond to the training data samples, set during the training step, $\alpha_k$ are the corresponding Lagrange multipliers, and $y_k$ is the class label. The SVM parameters are stated in Table 1.
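As a small illustration of how a kernel SVM evaluates a new sample, the sketch below computes a Gaussian (RBF) kernel, K(a, b) = exp(−γ‖a − b‖²), and the sign of the weighted kernel sum over the support vectors. The support vectors, labels, multipliers, bias, and γ are hand-picked toy values, not quantities learned from the bearing data, and the function names are assumptions.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))

def svm_decision(x, support_vectors, labels, alphas, bias, gamma):
    """Return +1 or -1 from the sign of the kernel expansion."""
    total = sum(
        alpha * y * rbf_kernel(s, x, gamma)
        for s, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if total + bias >= 0 else -1

# Toy values chosen by hand for illustration (not a trained model)
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [1, -1]
alphas = [1.0, 1.0]

near_first = svm_decision([0.1, 0.2], support_vectors, labels, alphas, 0.0, 0.5)
near_second = svm_decision([1.9, 2.1], support_vectors, labels, alphas, 0.0, 0.5)
print(near_first, near_second)
```

A sample is assigned to the class of the support vectors it lies closest to in the kernel sense, which is how a nonlinear boundary emerges from a linear rule in the transformed space.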

K-fold Cross-Validation.
Cross-validation involves partitioning the dataset into k equal parts, or folds, to reduce the risk of overfitting and underfitting of classification models. The model is trained on k − 1 folds while the remaining fold is reserved for validating the training accuracy. After training, the reserved fold is used to test the trained model, and the procedure is repeated so that each fold serves once as the validation set [50].
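The partitioning can be sketched in a few lines of plain Python. This is an illustrative sketch only; the function name, the shuffling seed, and the choice of 40 samples with k = 10 (matching the sample count and fold count reported in this paper) are assumptions about how one might reproduce the split.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

splits = list(k_fold_indices(n_samples=40, k=10))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, len(train), len(validation))
```

With 40 samples and 10 folds, each round trains on 36 samples and validates on 4, and every sample is used for validation exactly once.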

Results and Analysis
This section presents the results from the preprocessing of the data and the training of the SVM model. A total of 60 vibration datasets were available; 40 datasets were used for training and 20 datasets were left for testing. Figure 6 shows the original vibration signals before preprocessing. The health conditions represented by the data are healthy, ball fault, combined faults, inner race (ring) fault, and outer race (ring) fault. These are labelled as 0, 1, 11, 111, and 1111, respectively. The y-axis represents the vibration (peak-to-peak amplitude) in m/s², and the x-axis indicates the timestamps (samples) of the vibrations. The frequency-domain plots are obtained by transforming the time-domain signals. Figures 7 and 8 depict the power and order spectra of the vibration signals, respectively. Spectrum analysis is adopted due to its ability to show underlying features which cannot be seen in a simple time-domain signal, and it allows for easy detection of the resonance frequency or the fault frequency. The order and power spectra of the signals are extracted from the raw vibration signal. The power spectrum allows for the characterization of frequency content and resonances within a system. In the power spectrum, the y-axis indicates the power of the signals in decibels, and the x-axis represents the frequency of the signals on a logarithmic scale. The logarithmic scale allows coverage of a wider range of frequencies. In the order spectrum, the y-axis indicates the power of the signals in decibels and the x-axis represents the orders, i.e., the frequency expressed in multiples of the running speed. The order spectrum is necessary to reveal the order characteristics of the signal [6].
Features are extracted from both the time-domain series and the frequency-domain series. The features are ranked according to their significance using analysis of variance (ANOVA). The feature ranking can be seen in Figure 9. Through a manual feature engineering process, less important features were excluded from classification model training, as seen in Table 2. A scatter plot was used to investigate the features by plotting them against each other to identify the combinations that give the best classification results, i.e., those that show the clearest separation between the classes. Figure 10 shows that a combination of the peak frequency of the order spectrum and the time-domain skewness is well suited to the classification of bearing faults. The scatter plot suggests that some overlap may be present between class 0 and class 1111, as they appear close to each other. This may be a factor leading to the misclassification error of the algorithm. Figure 11 shows the legend for the scatter plot.
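The idea behind ANOVA-based ranking is that a feature is useful when its variation between classes is large relative to its variation within classes. The following Python sketch computes a one-way ANOVA F-statistic per feature and sorts the features by it. It is not the ranking routine of the MATLAB toolbox; the function names, the feature names, and the synthetic data (one informative and one pure-noise feature) are assumptions for illustration.

```python
import numpy as np

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across classes."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for c in classes:
        group = values[labels == c]
        ss_between += group.size * (group.mean() - grand_mean) ** 2
        ss_within += np.sum((group - group.mean()) ** 2)
    df_between = classes.size - 1
    df_within = values.size - classes.size
    return (ss_between / df_between) / (ss_within / df_within)

def rank_features(X, y, names):
    """Return (name, F) pairs sorted from most to least discriminative."""
    scores = {name: anova_f(X[:, j], y) for j, name in enumerate(names)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Synthetic data: 5 classes with 8 records each, using the paper's labels
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 11, 111, 1111], 8)
class_index = np.repeat(np.arange(5), 8)
informative = class_index + 0.1 * rng.standard_normal(40)
noise = rng.standard_normal(40)
X = np.column_stack([informative, noise])

ranking = rank_features(X, y, ["informative", "noise"])
print(ranking)
```

Features at the bottom of such a ranking are natural candidates for exclusion before training.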
With 10-fold cross-validation, so that one of the 10 folds is reserved for validation in each round, PCA was set to keep 3 numeric components after training, and a Gaussian kernel was selected for the optimizable SVM (one vs all). As stated above, PCA is used to reduce the dimensions of the feature dataset by keeping the directions with maximum variance. The model achieved a classification accuracy of 97.4% with a training time of 88.315 seconds. The test accuracy was found to be 90% for the remaining 20 data samples. Figure 12 displays the confusion matrix, which is used to check how the trained model performs for each class. The model achieves positive predictive values of at least 89% (annotated in green) and false discovery rates of at most 11% (annotated in pink) for some of the classes. A false discovery occurred for the inner race fault class, where some of its data was predicted to be in the combined fault class. This may be because the combined fault class (11) includes inner race faults, so some of the feature values of the inner race fault class (111) are also strongly present in the case of combined faults. The analysis of the results implies that PCA-SVM can be used to classify bearing faults when there are changes in speed while sampling, without mistaking changes in speed for faults.
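For readers without MATLAB, a comparable configuration can be sketched with scikit-learn. This is a hedged stand-in, not the classification learner setup used in this work: the data are synthetic clusters rather than the bearing features, the SVM hyperparameters are library defaults rather than optimized values, the feature standardization step is an assumption, and a shuffled plain K-fold is used because a stratified 10-fold split is not possible with only 8 samples per class.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 5 classes x 8 records, 10 features per record
class_labels = [0, 1, 11, 111, 1111]
rng = np.random.default_rng(0)
centers = 5.0 * rng.standard_normal((len(class_labels), 10))
X = np.vstack([c + rng.standard_normal((8, 10)) for c in centers])
y = np.repeat(class_labels, 8)

# Standardize, keep 3 principal components, Gaussian-kernel SVM (one vs all)
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=3),
    OneVsRestClassifier(SVC(kernel="rbf", C=1.0, gamma="scale")),
)

# 10-fold cross-validation: every record is predicted once while held out
cv = KFold(n_splits=10, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

cm = confusion_matrix(y, y_pred, labels=class_labels)
accuracy = np.trace(cm) / cm.sum()

# Column-wise rates: positive predictive value and false discovery rate
predicted_totals = cm.sum(axis=0)
ppv = np.divide(
    np.diag(cm),
    predicted_totals,
    out=np.zeros(len(class_labels)),
    where=predicted_totals > 0,
)
fdr = np.where(predicted_totals > 0, 1.0 - ppv, 0.0)

print(cm)
print("accuracy:", accuracy)
print("PPV per class:", ppv)
print("FDR per class:", fdr)
```

Placing PCA inside the pipeline means the components are re-estimated on each training fold, so the held-out fold never influences the projection.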
The comparisons based on accuracy are documented in Table 3 [35, 51–54]. From Table 3, it can be seen that the method performed reasonably well compared to other approaches. Moreover, SVM has the advantages of low computational time and the ability to cope with limited data. The SVM model was tested on test vibration data, resulting in the confusion matrix in Figure 13. The test accuracy was 97.8%, and the test results have shown that SVM can indeed be considered for fault detection and diagnosis of rotating equipment using vibration data.

Conclusion
Vibration analysis has proven to be the most consistent technique when it comes to condition monitoring of rotating equipment. In this paper, a combination of PCA and SVM with cross-validation is used for fault classification of bearings using real data from the Mendeley data repository. The trained model achieves an accuracy of 97.4% with a training time of 88.315 seconds. The model shows an accuracy of at least 89% for all the health condition classes. These results imply that PCA and SVM can be combined to detect and classify faults; however, more work is needed to further optimize the approach, improve the classification accuracy, and reduce the dependence of the feature selection process on individual knowledge. The training and test accuracies suggest that PCA and SVM can be successfully employed in real engineering practice. The main advantage of the method is that SVM can achieve high accuracy in cases where there is limited data, as is the case in most industrial setups. Industrial setups have limited data since the industrial sites are not collecting and archiving historical data; hence, there is not enough data to train AI-based algorithms. Additionally, SVM and PCA do not impose high computational complexity, hence they are suitable for application in real engineering scenarios. Future work includes testing the approach on other problems based on acoustic waves and expanding the research to components with both bearings and gears.

Data Availability
The vibration data, consisting of tacho signal and accelerometer signals, used in this research can be found at [36].

Conflicts of Interest
The authors declare that they have no conflicts of interest.