A Vibration Feature Extraction Method Based on Time-Domain Dimensional Parameters and Mahalanobis Distance

To describe the characteristics of a signal accurately, feature parameters in the time domain and frequency domain are usually extracted. However, the total number of time-domain and frequency-domain feature parameters exceeds twenty, and using all of them for feature extraction results in a large amount of data processing. To reflect the characteristics of the vibration signal accurately with fewer feature parameters, a simple but effective vibration feature extraction method combining time-domain dimensional parameters (TDDP) and Mahalanobis distance (MD) is proposed, i.e., TDDP-MD. In this method, ten time-domain dimensional parameters are selected to extract fault features, and the distance evaluation technique based on the Mahalanobis distance criterion function is introduced to calculate the feature vector, which can be used to classify different failure types. Finally, the proposed method is applied to fault diagnosis of rolling element bearings, and experimental results show that it can recognize different failure types accurately and effectively with only ten time-domain dimensional parameters and a small number of training samples.


Introduction
The functions of mechanical equipment are becoming more and more diverse, and the working environment is getting harsher and more complex [1,2]. This leads to the gradual aging of various components during long-term operation, and the potential for failure gradually increases. Rolling element bearings and other rotating parts play a very important part in today's manufacturing industry, so their failure may damage the machine and reduce its running capability [3,4]. To lessen damage to machinery and keep the equipment performing at its best, vibration signals are usually used to detect faults in machine parts. Accordingly, different fault diagnosis methods have been developed [5][6][7]. Many algorithms have also been proposed and applied to different fields, and a wealth of research results have been achieved [8][9][10][11]. In the field of feature extraction from rolling bearing vibration signals, many scholars have studied various methods. The earliest research on vibration signal processing of bearings was very simple and relied mainly on analysis in the time domain [12], the frequency domain [13], and the time-frequency domain [14]. Between 2001 and 2010, vibration signal processing methods included the wavelet transform [15], empirical mode decomposition (EMD) [16], and entropy [17]. More advanced fault diagnosis approaches have been proposed since 2011, such as spectral kurtosis and the kurtogram [18], ensemble EMD [19], and the improved wavelet transform [20]. Generally speaking, the purpose of signal processing in the failure diagnosis of rolling element bearings is to extract the vibration feature information of the fault signal, that is, feature extraction, so as to distinguish signals that are caused by different types of faults.
As one of the key procedures of rolling bearing fault diagnosis, feature extraction directly affects the diagnosis results. To acquire rich fault information, features in the time domain, frequency domain, and time-frequency domain are extracted. A natural approach is to construct a feature set containing all of this fault information to identify and distinguish different types of faults. However, such a feature set generally contains redundant features and mutually exclusive features alongside superior features. If all features in the feature set are entered directly into a classifier, the classification process becomes slow and the classification accuracy is difficult to guarantee. Therefore, how to select several features that better reflect the machine state, so as to improve the calculation speed of the classifier and the classification accuracy, has become a research hotspot. There are a number of feature selection methods, such as deep learning algorithms [21,22], singular spectrum entropy [23], power spectrum entropy [24], and the distance evaluation technique [25]. Owing to its reliability and simplicity, the distance evaluation technique is commonly used in fault diagnosis. Here, the distance evaluation technique based on the Mahalanobis distance (MD) criterion function is introduced and used in vibration feature extraction for rolling element bearing faults.
In this paper, we utilize time-domain dimensional parameters to represent the fault characteristics of vibration signals collected from rolling bearings. To better combine the feature parameters, the Mahalanobis distance criterion function is introduced for failure classification of vibration signals. Consequently, a vibration feature extraction method based on dimensional parameters in the time domain and MD classification is proposed and used for fault diagnosis of rolling element bearings. First, time-domain dimensional parameters are extracted from the training samples and the test samples, respectively. Second, the MD of every feature parameter of the test samples is calculated, and the norm of each fault feature vector of the test samples is obtained. Finally, each norm of the fault feature vectors is input into the MD classifiers, and the fault modes of the rolling element bearings can be discerned. The analysis results of test signals collected from rolling element bearings demonstrate the validity and applicability of the presented method. The method also requires only a few feature parameters and a handful of training samples. The rest of this article is arranged as follows. In Section 2, feature parameters in the time domain are introduced, and the definitions of Mahalanobis distance (MD) and Euclidean distance (ED) are given briefly. In Section 3, the novel vibration feature extraction method is put forward. In Section 4, the validity of the presented method is verified with experimental data from faulty rolling element bearings. Finally, conclusions are given in Section 5.

Methodology
In condition monitoring and fault diagnosis systems, the signals collected from the testing equipment are usually time-domain signals. Since the test signals are random, they cannot directly reflect the state change of the system. Therefore, it is necessary to analyze the test signals to find the characteristics that reflect their statistical behavior. Time-domain analysis and frequency-domain analysis are often used for signal feature extraction. Time-domain analysis estimates and calculates various time-domain parameters of a signal, while frequency-domain analysis is another description of the signal that can disclose information that cannot be found in the time domain. Although a time-domain feature parameter can distinguish between the normal case and the fault case, its value may show large abnormal fluctuations as the degree of the fault deepens. This demonstrates that a single time-domain feature parameter cannot effectively evaluate the fault condition. Frequency-domain feature extraction can usually reflect the periodic components in the signal, but its underlying stationarity assumption is not applicable to nonstationary and nonlinear signals without periodic characteristics. Thus, frequency-domain features generally cannot be used alone as the fault feature parameters of vibration signals. Fault features can be accurately extracted by combining time-domain and frequency-domain features, but this increases the amount of data processing and reduces the efficiency of diagnostic testing. To decrease the number of feature parameters and the amount of data processing, we propose a vibration feature extraction method that uses only several feature parameters in the time domain. The feature parameters in the time domain can be divided into dimensional features and dimensionless features according to whether or not they have a dimension.
Dimensional features include minimum, maximum, mean, variance, mean square, and root-mean-square value. The commonly used dimensionless features include kurtosis factor, shape factor, crest factor, impulse factor, and clearance factor. Their expressions [4,26] are shown in formulas (1) to (15).

Dimensionless Parameters.
where x(n) is a signal series for n = 1, 2, ..., N, and N is the number of data points. As mentioned above, time-domain dimensional parameters can be used to compose an eigenvector. Before the eigenvector is fed into a classifier, the distance evaluation technique based on Mahalanobis distance is used to form the superior feature that better reflects the characteristics of a signal. To demonstrate the superiority of our method, time-domain dimensionless parameters and Euclidean distance are also introduced for comparison.
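The time-domain parameters named above can be sketched in code as follows. This is a minimal illustration using common textbook definitions, not a reproduction of formulas (1) to (15), whose exact normalizations may differ. The function name `time_domain_features`, the dictionary keys, and the use of the population variance are assumptions made for this sketch.

```python
import numpy as np

def time_domain_features(x):
    """Return (dimensional, dimensionless) time-domain features of a 1-D signal.

    Textbook definitions are assumed; the paper's formulas may normalize differently.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var()                      # population variance (assumption)
    mean_square = np.mean(x ** 2)
    rms = np.sqrt(mean_square)
    abs_mean = np.mean(np.abs(x))
    peak = np.max(np.abs(x))
    root_amplitude = np.mean(np.sqrt(np.abs(x))) ** 2

    # Dimensional features carry the unit of the signal (or its square).
    dimensional = {
        "minimum": x.min(),
        "maximum": x.max(),
        "mean": mean,
        "variance": var,
        "mean_square": mean_square,
        "rms": rms,
    }
    # Dimensionless features are ratios, so the unit cancels.
    dimensionless = {
        "kurtosis_factor": np.mean((x - mean) ** 4) / var ** 2,
        "shape_factor": rms / abs_mean,
        "crest_factor": peak / rms,
        "impulse_factor": peak / abs_mean,
        "clearance_factor": peak / root_amplitude,
    }
    return dimensional, dimensionless
```

Rescaling the signal (for example, changing its unit) changes every dimensional feature but leaves the dimensionless ones unchanged, which is the distinction the text draws between the two groups.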

Description of Time-Domain Feature Based on MD.
In mathematics, MD is a statistical measure of the similarity between two sets of data. Unlike Euclidean distance (ED) [27], MD is dimensionless and also considers the correlation between data. The time-domain feature based on MD is defined as [3].
where T_i denotes the fault characteristics of the test sample x(n), as shown in Sections 2.1 and 2.2, and S_{T_i} denotes the fault characteristics set of each state of the training samples. The states represent the normal condition, rolling element fault, inner race fault, and outer race fault of rolling bearings, respectively. mean(S_{T_i}) and var(S_{T_i}) are the mean and variance of S_{T_i}, respectively, and MD_i is the MD discriminant distance from T_i to S_{T_i}. The MD method provides a number that evaluates the resemblance between the unknown and known sample sets. In general, the smaller the Mahalanobis distance between samples, the more likely they are to belong to the same fault category. Therefore, we can use MD to extract fault features and classify samples with different failure types.
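A minimal sketch of a per-feature MD of this kind is given below. It assumes the one-dimensional form of the Mahalanobis distance, the square root of (T_i - mean(S_{T_i}))^2 / var(S_{T_i}), which matches the quantities named above but is not confirmed to be identical to formula (16). The function name `md_distance` and the choice of the sample variance (`ddof=1`) are assumptions.

```python
import numpy as np

def md_distance(t, train_values):
    """One-dimensional Mahalanobis distance of a test feature t to a training set.

    Assumed form: sqrt((t - mean)^2 / var), using the sample variance (ddof=1).
    """
    train_values = np.asarray(train_values, dtype=float)
    mu = train_values.mean()
    var = train_values.var(ddof=1)
    return float(np.sqrt((t - mu) ** 2 / var))
```

For example, `md_distance(4.0, [1.0, 2.0, 3.0])` and `md_distance(40.0, [10.0, 20.0, 30.0])` give the same value, which illustrates why the distance does not depend on the unit of the feature.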

Description of ED.
To show the advantage of the MD method, the ED is also introduced in this paper. In mathematics, the ED is the "ordinary" straight-line distance between two points in Euclidean space [27]. In Cartesian coordinates, the ED between two points x_1 and x_2 is the square root of the sum of the squared differences of their coordinates. ED satisfies four conditions, the first of which is that x_1 and x_2 are interchangeable in the expression of ED; that is, ED(x_1, x_2) = ED(x_2, x_1). As suggested above, compared with the MD, ED is easily affected by the dimension. Therefore, this paper adopts the MD method to extract fault features and classify faults.
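The sensitivity of ED to the dimension can be illustrated with a small numerical sketch. The feature values below are made up for illustration, and the per-feature standardized distance stands in for MD under the assumption that features are treated independently (a diagonal covariance). The helper names `euclidean` and `standardized_distance` are hypothetical.

```python
import numpy as np

def euclidean(a, b):
    """Straight-line distance between two points."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def standardized_distance(t, train):
    """Distance to the training mean after dividing each feature by its sample std."""
    train = np.asarray(train, float)
    mu = train.mean(axis=0)
    sd = train.std(axis=0, ddof=1)
    return float(np.linalg.norm((np.asarray(t, float) - mu) / sd))

# Illustrative (made-up) training set with two features on very different scales.
train = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])
test = np.array([4.0, 250.0])
scale = np.array([1.0, 0.001])  # change the unit of the second feature only

ed_before = euclidean(test, train.mean(axis=0))
ed_after = euclidean(test * scale, (train * scale).mean(axis=0))
md_before = standardized_distance(test, train)
md_after = standardized_distance(test * scale, train * scale)
```

Here `ed_before` and `ed_after` differ substantially because the second feature dominates the ED until its unit is changed, whereas `md_before` and `md_after` agree.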

The Proposed Method in This Paper
To show the superiority of our method, which combines the time-domain dimensional parameters and MD and is denoted M_1, we also define three other feature extraction methods. M_2 combines the time-domain dimensionless parameters and MD, E_1 combines the time-domain dimensional parameters and ED, and E_2 combines the time-domain dimensionless parameters and ED. We therefore obtain four feature extraction methods: M_1, M_2, E_1, and E_2. In the failure diagnosis method detailed in Section 3.2, the feature extraction is calculated by M_1, for instance.

Fault Diagnosis Method Based on Time-Domain Dimensional Parameters and MD.
When something goes wrong with the machinery, the feature parameters in the time domain will change, and the value of M_1 will change too. Thus, M_1 can be used to reflect the change of the machinery under different operational conditions.
In the proposed method, every time-domain dimensional parameter is first calculated to extract the fault characteristic from the vibration signals. Because each characteristic parameter of a signal contains only one value, MD can be used to compute the feature distance vector MD_Df1, whose 2-norm is used to identify the working state of the machinery. A flowchart of the failure diagnosis method based on time-domain dimensional parameters and MD is shown in Figure 1. The main operation process ends with the following steps: (4) Compute MD_i (i = 1, 2, ..., 10) for the test samples according to (16). (5) Compute M_1 of the test samples according to (18) and (19).
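The classification idea described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's formulas (16), (18), and (19): the per-feature MD is taken as sqrt((T_i - mean)^2 / var) with the sample variance, the score for each state is the 2-norm of the resulting distance vector, and the test sample is assigned to the state with the smallest score. The function names, the state labels, and the toy feature values are all hypothetical.

```python
import numpy as np

def md_vector(t, train):
    """Per-feature MD of a test feature vector t (shape (d,)) to a training set (shape (m, d))."""
    train = np.asarray(train, dtype=float)
    mu = train.mean(axis=0)
    var = train.var(axis=0, ddof=1)   # sample variance per feature (assumption)
    return np.sqrt((np.asarray(t, dtype=float) - mu) ** 2 / var)

def classify(t, train_by_state):
    """Return (predicted_state, scores), where each score is the 2-norm of the MD vector."""
    scores = {
        state: float(np.linalg.norm(md_vector(t, train)))
        for state, train in train_by_state.items()
    }
    return min(scores, key=scores.get), scores

# Toy feature matrices (rows = training samples, columns = features); values are made up.
train_by_state = {
    "normal": np.array([[1.0, 0.1], [1.2, 0.2], [0.8, 0.3]]),
    "inner_race": np.array([[5.0, 2.0], [5.5, 2.2], [4.5, 1.8]]),
}
label, scores = classify(np.array([5.1, 2.1]), train_by_state)
```

In this toy case the test vector lies close to the "inner_race" training set, so `label` is `"inner_race"`. In practice each row would hold the ten time-domain dimensional parameters of one signal sample.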

Experimental Validation
The experimental data come from the Bearing Data Center of Case Western Reserve University [28]. As shown in Figures 2 and 3, the test bearings are 6205-2RS JEM SKF deep groove ball bearings, the motor loads are 0, 1, 2, and 3 horsepower, and the motor speeds are 1797 r/min, 1772 r/min, 1750 r/min, and 1730 r/min. Besides the normal bearing, rolling element bearings with an inner race fault, an outer race fault, and a rolling element fault are all considered. Given the large amount of data and the limited length of the article, we randomly select one set of data for study, in which single-point faults with a diameter of 0.5334 mm and a depth of 0.2794 mm were introduced into the tested bearing by electro-discharge machining. For this data set, the motor load is zero horsepower, the motor speed is 1797 r/min, and the sampling frequency per channel is 12000 Hz.
In light of the mean motor speed and the sampling frequency, each signal sample must contain at least about 400 sampling points to reflect the running state of the bearing. To obtain more information about the bearing, each signal sample consists of 1024 data points. Each type has 25 vibration signals, and a total of 100 samples are randomly selected from the data sets. The typical vibration signal waveforms of normal bearings and faulty bearings (rolling element fault, inner race fault, and outer race fault) are shown in Figure 4.
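The figure of about 400 sampling points can be checked with a short calculation, assuming it corresponds to the number of points acquired during one shaft revolution (an interpretation, since the text does not spell out the derivation). The sketch uses the speed of the selected data set, 1797 r/min; the variable names are illustrative.

```python
sampling_frequency_hz = 12000.0
motor_speed_rpm = 1797.0

revolutions_per_second = motor_speed_rpm / 60.0                        # 29.95 rev/s
points_per_revolution = sampling_frequency_hz / revolutions_per_second  # about 400.7
revolutions_per_sample = 1024 / points_per_revolution                   # about 2.56
```

Under this reading, a 1024-point sample spans roughly two and a half shaft revolutions.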
First, T_i is calculated for the 100 samples in the experiments. For every failure type, 10 samples are randomly selected for training and the other 15 samples are used for testing, so the number of test samples obtained is 60. All the data used for testing differ from the data used for training. Next, the mean and variance of the fault feature set of the training samples in each state are calculated, and the MD discriminant distance MD_i is computed according to (16).
Thirdly, the M_1 values of the test samples can be obtained according to (18) and (19).

Mathematical Problems in Engineering
To demonstrate the superiority and effectiveness of the presented method, Section 4.1 introduces the other three feature extraction methods for comparison, namely the M_2, E_1, and E_2 methods. The analysis process of each method is identical to that of the vibration feature extraction method based on M_1, as shown in Figure 1. The corresponding fault classification results for the rolling bearings show that these three feature extraction methods can also classify faults to some extent. Next, we conduct a quantitative study of the accuracy of each feature extraction method.
In view of the complexity of practical operating conditions, we analyze the factors that influence the classification accuracy, such as the signal-to-noise ratio (SNR), the number of training samples, and the number of sample data points. First, the performance of our method and the other three methods under different SNR conditions is studied. The data sets mentioned above (10 training samples and 15 test samples for each state) are used for the tests, with Gaussian noise of different SNRs added. The experimental results of the four methods are shown in Table 1. From Table 1, we can see that our method has high classification accuracy under different SNR conditions. The recognition rate of our method is stable at about 98%, while the maximum recognition rates of the other methods are 88.3%, 88.3%, and 85%. Even when SNR = 0 dB, the classification accuracy of our method still reaches 96.7%, which is significantly higher than that of the other methods. This demonstrates that the proposed method has a high recognition rate and good antinoise capability.
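Adding Gaussian noise at a prescribed SNR can be sketched as below. The paper does not describe its noise-generation procedure, so this is one common approach: the noise power is set to the measured signal power divided by 10^(SNR/10). The function name `add_gaussian_noise` and its parameters are assumptions for illustration.

```python
import numpy as np

def add_gaussian_noise(x, snr_db, rng=None):
    """Return x plus zero-mean white Gaussian noise scaled to the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise
```

At 0 dB the noise power equals the signal power, which is the most severe condition reported in Table 1. Passing a seeded generator, for example `np.random.default_rng(0)`, makes the noisy samples reproducible.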
In general, the number of training samples and the number of sample data points may affect the recognition results. To investigate the influence of these two factors on classification accuracy at the same sampling frequency, different numbers of training samples and sampling points are chosen to identify the failure types. The number of training samples is set to 10, 8, 6, 4, and 2, respectively. The number of sampling points N is set to 2048, 1024, 512, 400, 256, and 128, respectively, and the sampling points are taken as consecutive segments of the sample data. The failure classification results for the different numbers of training samples and sampling points are listed in Tables 2-6, respectively. Furthermore, to describe clearly the relationship between the classification accuracy of the different methods and the number of sampling points for each fault type under different numbers of training samples, the results of Tables 2-6 are illustrated in Figure 9. It can be seen from Figures 9(a)-9(e) that the classification accuracy of our method remains above 90% when the number of sampling points is more than 1000, and that the number of training samples has little impact on classification accuracy. Generally speaking, the classification accuracy of our method is higher than that of the other methods under different numbers of training samples. However, we can see from Figures 9(b)-9(e) that the classification accuracy of our method is lower than that of the E_1 method when the number of sampling points is less than 400. As mentioned above, each signal sample must contain at least about 400 sampling points to reflect the operating state of the bearing. Therefore, the information contained in the operating state of the bearing cannot be reflected when the number of sampling points is less than 400, so the classification accuracy is random in such a situation. Thus, the number of sampling points cannot be less than 400.
When the number of sampling points is more than 400, we can see that our method is superior to the other three methods. The classification accuracy of each method tends to be stable when the number of sampling points is more than 2000.

Conclusions
In this article, we propose a vibration feature extraction method based on dimensional parameters in the time domain and Mahalanobis distance (TDDP-MD). First, ten time-domain dimensional parameters are selected to extract fault features. Second, the feature vector is calculated using the Mahalanobis distance criterion function and fed into a classifier to achieve fault classification. Finally, the proposed method is used to diagnose faults in rolling element bearings. The effects of the SNR, the number of training samples, and the number of sample data points on the classification accuracy of the presented method are discussed, and the classification results of the proposed method are compared with those of the other three methods. Experimental results show that the presented method can effectively recognize the states of rolling element bearings under working conditions. It also achieves good classification results under different SNR conditions, numbers of training samples, and numbers of sampling points.