Adaptive Extraction Method Based on Time-Frequency Images for Fault Diagnosis in Rolling Bearings of Motor

To diagnose faults of motor rolling bearings quickly through time-frequency analysis of bearing vibration signals, this paper puts forward a method of extracting the main components from time-frequency images. A threshold is adaptively determined from the gray histogram of the time-frequency image obtained from the vibration signals of the motor rolling bearings. A mask template is then generated by binarizing the image with this threshold. By multiplying the mask template with the original time-frequency image, the low-energy signal components in the time-frequency image are filtered out and only the high-energy main components are retained, which simplifies the subsequent identification of motor rolling bearing faults. Because the thresholds are determined by the time-frequency images themselves, the main components are retained adaptively.


Introduction
Condition monitoring and fault diagnosis can track the health status of equipment in real time and determine the location and severity of a fault from changes in measured signals, which not only helps avoid major accidents but also greatly reduces maintenance costs. While a motor is working, factors such as overload impact, assembly error, poor lubrication, or contamination by impurities can lead to bearing failure. The vibration signals of the motor then become nonstationary, with limited duration and time-varying characteristics. Traditional signal processing methods are mostly based on the assumption of stationarity; they can analyze the statistical characteristics of a signal only in the time domain or the frequency domain and are unable to reveal its instantaneous characteristics in the joint time-frequency domain. The time-frequency representation of a signal describes its energy distribution and time-varying characteristics in the time-frequency domain and is the most complete way of expressing nonstationary signals. With the development of image recognition, mechanical fault identification methods have been put forward that extract texture, shape, and other visual features from time-frequency images. These methods not only help us understand the images but also improve recognition accuracy.
Many scholars have studied this problem. Hongkun et al. [1] investigated the diagnosis of rolling bearing faults by time-frequency image processing, and their experimental results showed that the Hough transform of time-frequency images can effectively classify the faults of rolling bearings. Isobe et al. [2] combined the local wave time-frequency spectrum with image processing to extract features from the vibration signals of reciprocating machines. Cai et al. [3] calculated the Wigner-Ville distributions of acceleration signals by time-frequency analysis, obtained a series of time-frequency gray images from these distributions by image processing, and then obtained a group of fractal texture characteristic parameters from the gray images to identify the abnormal status of a diesel engine valve gap. Wei and Zhan-Sheng [4] studied a diagnosis method based on the gray level-gradient cooccurrence matrix, which extracts image texture characteristics to diagnose the faults of a rotating machine. Cai et al. [5] proposed a fault diagnosis method based on SVM recognition of time-frequency images of EMD-WVD vibration spectra. By extracting the moment invariant features of the images, the diagnosis eigenvectors were obtained, and their modes were recognized by an improved binary tree classifier. Verstraete et al. [6] proposed a deep learning enabled featureless method in which the images generated by time-frequency representations of the raw data were fed into a deep convolutional neural network (CNN) architecture for classification and fault diagnosis, with good results.
In time-frequency images, important information is expressed through the time-frequency components with high energy. Therefore, when the distribution of frequency components in time-frequency images is studied, the time-frequency components with low energy can be regarded as noise and filtered out, which helps us pay attention to the components with high energy. In the time-frequency images, the energy of each time-frequency component is reflected by the gray value of the image, so the images can be classified based on the important components. A noise removal method for time-frequency images is studied in this paper. Binarization is applied to a time-frequency image to get a mask template, which is overlaid on the original image to highlight the components with concentrated energy. Then, fault diagnosis can be carried out according to the remaining signal components. The remaining sections of this paper are arranged as follows. Firstly, the method of extracting the main components of time-frequency images is introduced in Section 2. Then, the OTSU method, the KSW-entropy method, and our improved method based on the OTSU and KSW-entropy methods are introduced in Section 3. The comparison of the results and the analysis of the experimental data are described in Section 4. The summary of our results is given in Section 5.

The Method of Extracting the Main Components from Time-Frequency Images
In fault diagnosis for mechanical equipment, especially in the treatment of nonstationary signals, we mainly focus on the changes of the main signal components, which greatly affect or even determine the characteristics of the whole signal. We often want to know how the frequency and the energy of a signal component change with time, and time-frequency analysis makes these changes visible. There are many methods of time-frequency analysis, such as the Wigner-Ville distribution (WVD) [7][8][9], the short-time Fourier transform (STFT) [10][11][12], the wavelet transform [13][14][15], and the Hilbert-Huang transform [16][17][18]. Among these methods, the STFT is simple and fast to compute, while still giving the main information on how the signal components change with time. Although the time-frequency resolution of the STFT is not as high as that of the WVD, the STFT is widely used because it is free of the cross-terms that largely limit the application of the WVD. To show the results of time-frequency analysis visually, images are usually used, with time on the horizontal axis and frequency on the vertical axis. In this paper, the STFT is used to get the time-frequency images of motor bearings.

Short-Time Fourier Transform (STFT) and Time-Frequency Images.
The STFT is a popular method for analyzing nonstationary signals and is an extension of the traditional Fourier transform [19]. The basic idea of the STFT is as follows.
When a short-time window function is applied to an original signal, the nonstationary signal can be viewed as stationary during the very short interval of the window. The window function $\omega(t)$ is then moved along the time axis so that $x(\tau)\omega(\tau - t)$ can always be considered a stationary signal over a finite time length. Then, the power spectrum of the signal in different time periods can be calculated. The STFT of the signal $x(t)$ is defined as

$$F_x^{\omega}(t, f) = \int_{-\infty}^{\infty} x(\tau)\,\omega(\tau - t)\,e^{-j 2\pi f \tau}\,d\tau, \quad (1)$$

where $x(\tau)$ is the signal to be analyzed, $\omega(\tau)$ is the sliding window function, and $F_x^{\omega}(t, f)$ is the spectrum of the windowed signal around time $t$. The discrete STFT is defined as

$$F_x^{\omega}(m, n) = \sum_{k} x(k)\,\omega(k - ms)\,e^{-j 2\pi n k / N}, \quad (2)$$

where $\omega(k)$ is the window function of length $N$, the sliding step of the window function is $s$ sampling intervals, $m$ is the location of the window, corresponding to the time parameter of the STFT, and $n$ is the frequency parameter. Suppose the sampling frequency of the original signal $x(k)$ is $f_s$; then, the sampling interval is $T_s = 1/f_s$. $F_x^{\omega}(m, n)$ is the spectrum of the signal at the time $m s T_s$, where the frequency parameter $n$ corresponds to the frequency $n f_s / N$. By using the STFT, we can get the power spectrum of the signal at different times. Then, we show the results of the STFT as time-frequency images, with time on the horizontal axis, frequency on the vertical axis, and the amplitude of the STFT as the gray value. To make the energy distribution in the time-frequency images easier to observe, this paper inverts the gray scale of the images; that is, at a given moment and frequency, the larger the energy is, the smaller the gray value will be.
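The discrete STFT and its conversion to a gray image can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `stft_gray_image` is hypothetical, the default window (Hamming, length 63, step 1) is taken from the experimental section, and the scaling of the amplitude to the range 0 to 255 is an assumption. The sketch returns a non-inverted image in which larger values mean larger energy; `255 - gray` would give the inverted display described above.

```python
import numpy as np

def stft_gray_image(x, fs, win_len=63, step=1):
    """Magnitude STFT of x scaled to an 8-bit gray image (hypothetical helper).

    Returns (times, freqs, gray); gray has frequency on the vertical axis
    (rows) and time on the horizontal axis (columns).
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - win_len) // step
    if n_frames < 1:
        raise ValueError("signal is shorter than the window")
    window = np.hamming(win_len)
    # Row m holds the samples x(k) covered by the window at position m*step.
    idx = np.arange(win_len)[None, :] + step * np.arange(n_frames)[:, None]
    frames = x[idx] * window
    spectrum = np.abs(np.fft.rfft(frames, axis=1))   # amplitude per frame
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)     # n * fs / N
    times = np.arange(n_frames) * step / fs          # m * s * Ts
    peak = spectrum.max()
    if peak == 0:
        gray = np.zeros(spectrum.shape, dtype=np.uint8)
    else:
        # Assumed scaling: largest amplitude maps to gray value 255.
        gray = np.round(255.0 * spectrum / peak).astype(np.uint8)
    return times, freqs, gray.T
```

With a window of 63 samples at a 12 kHz sampling rate, the frequency spacing of the rows is 12000/63, roughly 190 Hz, which is the price paid for the fine time resolution of a short window.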

Extraction of the Main Components from Time-Frequency Images.
A time-frequency image can be regarded as an ordinary two-dimensional image, with time on the horizontal axis and frequency on the vertical axis, in which the energy of every time-frequency component is reflected by the gray value.
In the process of bearing fault diagnosis, the important feature components can be classified according to the gray values of the image. That is to say, the features of faults are largely contained in the main components, whose high energy is reflected in their gray values in the time-frequency image. So, our attention can focus only on the high-energy parts of the time-frequency image.
In this paper, an adaptive method of extracting the main components of time-frequency images is presented. Firstly, the STFT is used to get the time-frequency image of the vibration signals. Then, a suitable threshold is calculated from the time-frequency image based on the OTSU and KSW-entropy methods. Then, a mask template with the same size as the original image is generated according to the threshold. The value of each pixel in the mask is 0 or 1, where 1 means the pixel will be kept and 0 means the pixel will be removed. Then, the time-frequency image that retains only the main components is obtained by multiplying the mask template with the original time-frequency image. Finally, fault diagnosis is carried out based on the time-frequency image with only the main components. Recognizing a time-frequency image that retains only the important fault feature information requires much less computation than recognizing the original time-frequency image. The process of the main components extraction method is shown in Figure 1.
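The mask-and-multiply step described above can be illustrated with a minimal NumPy sketch. The function name `extract_main_components` is hypothetical, and the sketch assumes a non-inverted image in which a larger value means larger energy; for an image stored with the inverted gray scale used for display in this paper, the comparison would have to be reversed.

```python
import numpy as np

def extract_main_components(image, threshold):
    """Binarize `image` at `threshold` and keep only the pixels above it.

    Assumption: larger pixel value = larger energy (non-inverted image).
    Returns (mask, main), where mask holds 1 for kept pixels and 0 for
    removed pixels, and main = mask * image.
    """
    image = np.asarray(image)
    mask = (image >= threshold).astype(image.dtype)  # 1 = keep, 0 = remove
    main = mask * image                              # element-wise product
    return mask, main
```

For example, `extract_main_components(gray, 128)` would zero every pixel of `gray` below 128 and leave the remaining pixels unchanged; the value 128 is only a placeholder for an adaptively selected threshold.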

The Adaptive Methods of Threshold Selection
The key of our method is the threshold selection for image binarization, which amounts to selecting an energy threshold. An appropriate energy threshold extracts the main characteristic components of a time-frequency image and filters out weak signals and irrelevant features. Therefore, an improved adaptive threshold selection method is proposed based on the KSW-entropy algorithm and the OTSU threshold segmentation algorithm.

Threshold Based on OTSU.
Among all the algorithms related to image thresholding, the OTSU algorithm [20], proposed by the Japanese scholar Otsu, is considered the best algorithm for threshold selection in image segmentation. It divides the image into background and foreground according to its gray scale. As variance is a measure of the uniformity of the gray distribution, the greater the interclass variance between the background and the foreground, the greater the difference between the two parts of the image. If part of the foreground is misclassified as background or part of the background is misclassified as foreground, the difference between the two parts will decrease. Therefore, the segmentation that maximizes the variance between classes is the one that minimizes the probability of misclassification.
The principle of OTSU is as follows. If a threshold value is set as $t$, then the image pixels can be divided into two categories, C1 (whose gray values are less than $t$) and C2 (whose gray values are greater than $t$). Assume that the mean gray values of the two classes are $\mu_1$ and $\mu_2$, the average gray value of the whole image is $\mu$, the percentage of C1 in the total pixels is $\omega_1$, the percentage of C2 in the total pixels is $\omega_2$, the total number of pixels is $N \times M$, and the interclass variance is $\sigma^2$.
Then, with $N_1$ and $N_2$ denoting the numbers of pixels in C1 and C2, the formulas can be expressed as follows:

$$\omega_1 = \frac{N_1}{N \times M}, \quad (3)$$

$$\omega_2 = \frac{N_2}{N \times M}, \quad (4)$$

$$\omega_1 + \omega_2 = 1, \quad (5)$$

$$\mu = \omega_1 \mu_1 + \omega_2 \mu_2, \quad (6)$$

$$\sigma^2 = \omega_1 (\mu_1 - \mu)^2 + \omega_2 (\mu_2 - \mu)^2. \quad (7)$$

According to formulas (6) and (7), the final expression of the interclass variance is

$$\sigma^2 = \omega_1 \omega_2 (\mu_1 - \mu_2)^2. \quad (8)$$

If the maximal gray level of the image is $L$, then by trying every gray value and calculating the interclass variance of the C1 and C2 pixels of the image, the best threshold $T$ can be found as the one with the largest interclass variance:

$$T = \arg\max_{0 \le t \le L} \sigma^2(t). \quad (9)$$

Threshold Based on KSW-Entropy.
In 1985, Kapur, Sahoo, and Wong proposed a method to select the threshold automatically based on optimal entropy, abbreviated as the KSW-entropy algorithm [21]. The method applies the entropy of image information to image segmentation. For an image, a threshold divides the histogram into two categories, and the information entropy of each category is calculated. The threshold for which the total entropy is maximal is selected. Entropy is used in information theory to describe uncertainty: the more ordered a system is, the lower its entropy is. In an image, the boundary distribution of the target is the most uncertain, so the boundary between the image target and the background has the maximum entropy. The KSW-entropy algorithm is well suited to segmenting images with fuzzy boundaries between the target and the background.
For an image with $L$ gray levels, assume that $p_0, p_1, p_2, \ldots, p_{L-1}$ is the probability distribution of the gray levels in the image. The image pixels are divided into two categories by the threshold $t$. The pixels whose gray values are in the range $[0, t]$ are assigned to category C1, and the pixels whose gray values are in the range $[t + 1, L - 1]$ are assigned to category C2. Let $P_{C1} = \sum_{i=0}^{t} p_i$ be the total probability of the pixels in C1 and $P_{C2} = \sum_{i=t+1}^{L-1} p_i$ be the total probability of the pixels in C2, so that $P_{C1} = 1 - P_{C2}$. The probability distribution within C1 is $p_0/P_{C1}, p_1/P_{C1}, p_2/P_{C1}, \ldots, p_t/P_{C1}$, and the probability distribution within C2 is $p_{t+1}/P_{C2}, p_{t+2}/P_{C2}, p_{t+3}/P_{C2}, \ldots, p_{L-1}/P_{C2}$. Then, the information entropy $E(C1)$ of C1 and the entropy $E(C2)$ of C2 are calculated as follows:

$$E(C1) = -\sum_{i=0}^{t} \frac{p_i}{P_{C1}} \ln \frac{p_i}{P_{C1}}, \quad (10)$$

$$E(C2) = -\sum_{i=t+1}^{L-1} \frac{p_i}{P_{C2}} \ln \frac{p_i}{P_{C2}}. \quad (11)$$

The total information entropy is

$$E = E(C1) + E(C2). \quad (12)$$

After traversing all $L$ gray levels, the threshold $T$ that maximizes the entropy $E$ is the optimal segmentation threshold:

$$T = \arg\max_{0 \le t \le L-1} E(t). \quad (13)$$
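The two threshold criteria described above, the interclass variance of OTSU and the total entropy of KSW, can both be evaluated for every candidate threshold with cumulative sums over the gray histogram. The following is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the image is assumed to hold integer gray values in the range 0 to 255, and class C1 is taken as the pixels with gray value at most $t$ in both functions (the KSW convention).

```python
import numpy as np

def gray_probabilities(image, levels=256):
    """Normalized gray histogram p_0 ... p_{L-1}; assumes integer values in [0, levels)."""
    hist = np.bincount(np.asarray(image, dtype=np.int64).ravel(), minlength=levels)
    return hist / hist.sum()

def otsu_threshold(p):
    """Threshold maximizing sigma^2 = w1 * w2 * (mu1 - mu2)^2."""
    levels = np.arange(p.size)
    w1 = np.cumsum(p)                    # probability of C1 for each t
    w2 = 1.0 - w1                        # probability of C2 for each t
    m = np.cumsum(p * levels)            # cumulative first moment up to t
    valid = (w1 > 1e-12) & (w2 > 1e-12)  # both classes must be non-empty
    sigma2 = np.zeros(p.size)
    # Algebraically equal to w1*w2*(mu1-mu2)^2 with mu1 = m/w1, mu2 = (mu-m)/w2.
    sigma2[valid] = (m[-1] * w1[valid] - m[valid]) ** 2 / (w1[valid] * w2[valid])
    return int(np.argmax(sigma2))

def ksw_threshold(p):
    """Threshold maximizing the total entropy E(C1) + E(C2)."""
    plogp = np.zeros(p.size)
    nonzero = p > 0
    plogp[nonzero] = p[nonzero] * np.log(p[nonzero])
    h = -np.cumsum(plogp)                # -sum_{i<=t} p_i ln p_i
    w1 = np.cumsum(p)
    w2 = 1.0 - w1
    valid = (w1 > 1e-12) & (w2 > 1e-12)
    entropy = np.full(p.size, -np.inf)   # invalid thresholds can never win
    # E(C1) = ln(w1) + h/w1 and E(C2) = ln(w2) + (h_total - h)/w2.
    entropy[valid] = (np.log(w1[valid]) + h[valid] / w1[valid]
                      + np.log(w2[valid]) + (h[-1] - h[valid]) / w2[valid])
    return int(np.argmax(entropy))
```

When several thresholds give the same criterion value, as happens on a plateau between two well separated gray clusters, `np.argmax` returns the smallest of them.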

Threshold Based on Combined OTSU and KSW-Entropy.
The segmentation result of OTSU is not good for images with blurred edges, which shows mainly in the misclassification of image edges and in the sensitivity to noise. KSW-entropy processes the edge parts of images better than OTSU, but it may segment the background part incorrectly. So, we combine the methods of OTSU and KSW-entropy to propose an adaptive threshold segmentation method.
To satisfy formulas (9) and (13) simultaneously as far as possible, the linear weighting method from multiobjective programming is used to construct a single evaluation function for threshold selection. Suppose the weight of the interclass variance is $S$, $E_{\min}$ and $E_{\max}$ are the minimum and maximum entropy obtained while traversing the candidate thresholds, and $\mathrm{norm}(\sigma^2)$ denotes the interclass variance of all candidate thresholds normalized into $[E_{\min}, E_{\max}]$. Then, the mathematical model of our method can be expressed as follows:

$$T = \arg\max_{0 \le t \le L-1} \left[ S \cdot \mathrm{norm}\left(\sigma^2(t)\right) + (1 - S) \cdot E(t) \right]. \quad (14)$$

The weight $S$ is calculated from the thresholds $T_1$ and $T_2$ determined by OTSU and KSW-entropy, respectively. Considering that OTSU tends to miss edges and KSW-entropy tends to retain excessive background, the best threshold should lie between the thresholds determined by the two methods. So, when the threshold of an image is decided, both the variance and the entropy should be taken into consideration. At the same time, to balance the effects of the two methods, the variance term should be moved towards the direction of maximum entropy, and the entropy term should be moved towards the direction of maximum variance. Therefore, the definition of $S$ can be expressed as the following formula:

With the weight $S$, the threshold of an image can be selected dynamically and adjusted to the image. The classification between the edge and the background of a time-frequency image can be achieved by taking the maximum interclass variance and the maximum entropy into consideration as far as possible.
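The linear weighting idea can be sketched as follows. This is only an illustration under explicit assumptions: the formula that derives $S$ from $T_1$ and $T_2$ is not available in this text, so the sketch takes the weight as a plain argument `weight_s`; the entropy term is assumed to receive the complementary weight $1 - S$; and the function names are hypothetical. The image is assumed to hold integer gray values in the range 0 to 255.

```python
import numpy as np

def threshold_criteria(p):
    """Interclass variance and total entropy for every candidate threshold t."""
    levels = np.arange(p.size)
    w1 = np.cumsum(p)
    w2 = 1.0 - w1
    valid = (w1 > 1e-12) & (w2 > 1e-12)   # both classes non-empty
    m = np.cumsum(p * levels)
    plogp = np.zeros(p.size)
    nonzero = p > 0
    plogp[nonzero] = p[nonzero] * np.log(p[nonzero])
    h = -np.cumsum(plogp)
    sigma2 = np.zeros(p.size)
    entropy = np.zeros(p.size)
    v = valid
    sigma2[v] = (m[-1] * w1[v] - m[v]) ** 2 / (w1[v] * w2[v])
    entropy[v] = (np.log(w1[v]) + h[v] / w1[v]
                  + np.log(w2[v]) + (h[-1] - h[v]) / w2[v])
    return sigma2, entropy, valid

def combined_threshold(image, weight_s=0.5, levels=256):
    """Threshold maximizing S*norm(sigma^2) + (1-S)*E.

    weight_s is a free parameter here (assumption); the paper derives S
    from the OTSU and KSW thresholds by a formula not reproduced in this sketch.
    """
    hist = np.bincount(np.asarray(image, dtype=np.int64).ravel(), minlength=levels)
    p = hist / hist.sum()
    sigma2, entropy, valid = threshold_criteria(p)
    if not valid.any():
        raise ValueError("image has a single gray level; no threshold exists")
    e_min, e_max = entropy[valid].min(), entropy[valid].max()
    s_min, s_max = sigma2[valid].min(), sigma2[valid].max()
    span = s_max - s_min
    scale = (e_max - e_min) / span if span > 0 else 0.0
    norm_sigma2 = e_min + (sigma2 - s_min) * scale   # mapped into [E_min, E_max]
    score = weight_s * norm_sigma2 + (1.0 - weight_s) * entropy
    score = np.where(valid, score, -np.inf)          # ignore empty-class thresholds
    return int(np.argmax(score))
```

With `weight_s=1.0` the sketch reduces to the OTSU criterion and with `weight_s=0.0` to the KSW-entropy criterion; intermediate weights trade the two off, although nothing in this sketch guarantees that the result lies between the two pure thresholds for every image.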

Introduction to the Bearing Data.
The experimental data we used were obtained from the Bearing Datasets of Case Western Reserve University (CWRU) [22][23][24]. The test rig consisted of a 2 horsepower (hp) motor driving a shaft fitted with a torque transducer and encoder. Torque is applied to the shaft by a dynamometer and a control system. The vibration acceleration data were measured near the motor bearings. The faults of the motor bearings were artificially seeded using electro-discharge machining (EDM). Faults ranging from 0.007 inches (or 7 mil) to 0.040 inches in diameter were introduced separately at the inner raceway, the rolling element (i.e., ball), and the outer raceway. The faulted bearings were reinstalled into the test motor, and the vibration data were recorded for motor loads of 0 to 3 horsepower (the motor speeds ranged from 1720 rpm to 1797 rpm).
The faulty-bearing vibration data we analyzed came from this dataset, with a fault size of 7 mil, zero load, a shaft rotation speed of 1797 rpm, and a sampling frequency of 12 kHz. For the STFT, a Hamming window of length 63 is used, and the sliding step of the window is 1. Firstly, the results for a normal bearing under the same conditions are shown in Figures 2 and 3. Figure 2 shows the time domain and frequency domain waveforms, and Figure 3 shows the joint time-frequency distribution image. The time and frequency waveforms are also shown in Figure 3, where the upper waveform is the time domain and the left waveform is the frequency domain; the joint time-frequency distribution image of the STFT is in the bottom-right corner. Figure 4 shows the time domain and frequency domain waveforms of a faulty bearing whose inner ring has a 7 mil fault. The joint time-frequency distribution image of the faulty bearing is shown in Figure 5, laid out in the same manner as for the normal bearing. In the following parts, we only show the time-frequency images of the STFT.
By comparing the time domain waveforms, the frequency domain waveforms, and the time-frequency images of the normal bearing and the faulty bearing, it can be seen that the waveforms differ considerably depending on whether the bearing has a fault. From Figure 2, we can see that the vibration signal of the normal bearing is mainly concentrated near 160 Hz, 360 Hz, 1050 Hz, and 2100 Hz, among which the component near 1050 Hz has the largest energy and the component at 160 Hz has the second largest energy. This is all the information we can obtain from the spectrum diagram. However, it can be seen from the time-frequency image that the components near 1050 Hz do not always exist; these components appear at about 0.009 s, 0.046 s, and 0.079 s, and each lasts less than 0.01 s, as shown in Figure 3.
In the vibration signal of the faulty bearing, as shown in Figure 4, the signal components are particularly rich and are mainly concentrated in the frequency band between 2600 Hz and 2900 Hz and around 3900 Hz. From the time-frequency image shown in Figure 5, we can see that even within these two frequency bands, the signal components appear intermittently and the durations of the components are slightly different. At the same time, we can also see that, in addition to these main components, there are many weak-energy components distributed randomly in the time-frequency domain, which tend to distract our attention. We hope to filter out these disturbances so that we can concentrate on finding the components that reflect the characteristics of the bearing failure.

Comparison of the Extraction Effects.
The original time-frequency image of the faulty bearing data is shown in Figure 6. The mask template and the main components extracted with the OTSU threshold are shown in Figures 7 and 8. The mask template and the main components extracted using KSW-entropy are shown in Figures 9 and 10. The mask template and the main components extracted by our method are shown in Figures 11 and 12. The main components of the normal bearing extracted by our method are shown in Figure 13. The threshold selected by our method is 192, which lies between the thresholds of 205 and 157 obtained by OTSU and KSW-entropy, respectively. Comparing Figures 6, 8, 10, and 12, we can see that when the threshold value is different, the extracted main components are not exactly the same. The larger the threshold is, the fewer time-frequency components are filtered out. The threshold calculated by KSW-entropy is smaller than that of OTSU, so fewer main components are extracted by KSW-entropy than by OTSU. The amount of main components extracted affects our judgment and our ability to grasp the principal information of faulty bearings.
By comparing Figures 12 and 13, it can be seen that the main time-frequency components extracted from the time-frequency images of the faulty bearing and the normal bearing are greatly different. The fault of the bearing can be judged by observing the distribution of these major components. From the main time-frequency components extracted by our method, it can easily be seen that the signal components are mainly concentrated in the frequency bands around 1300 Hz, 2800 Hz, and 3600 Hz, as shown in Figure 12.
These signal components do not appear continuously but occur at regular intervals with slight changes in energy each time. For example, there are some obvious signal components at 0.0085 s, 0.0272 s, 0.0455 s, and 0.0642 s, marked in Figure 12 with red dotted lines, and the time interval between these signal components is about 0.0185 s. Between every two components with obviously high energy, there are also two signal components with slightly lower energy. That is to say, a signal component occurs roughly every 0.0066 s. From this period, we can see that the repetition frequency of the components is about 152 Hz, which is close to the characteristic frequency of an inner bearing ring fault. Based on the analysis of the main components of the time-frequency image, we can roughly infer that there may be a fault in the inner bearing ring.
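The arithmetic behind this inference can be retraced with a short script. It is a sketch under stated assumptions: the timestamps and the 0.0066 s period are taken from the text above, while the bearing geometry (9 balls, ball diameter 0.3126 in, pitch diameter 1.537 in, zero contact angle) is an assumption that is not given in this paper and corresponds to the drive-end bearing commonly reported for the CWRU test rig. Note that dividing the spacing of the listed timestamps by three gives a period of about 0.0062 s, slightly shorter than the 0.0066 s quoted above; both estimates are of the same order as the inner-race characteristic frequency.

```python
# Timestamps (s) of the high-energy components quoted in the text.
peaks = [0.0085, 0.0272, 0.0455, 0.0642]
intervals = [b - a for a, b in zip(peaks, peaks[1:])]
mean_interval = sum(intervals) / len(intervals)

# Two weaker components lie between neighbouring strong ones,
# so each strong-to-strong interval contains three repetition periods.
period_from_peaks = mean_interval / 3
rate_from_peaks = 1.0 / period_from_peaks

# Repetition rate implied by the 0.0066 s period quoted in the text.
rate_from_text = 1.0 / 0.0066

# Inner-race characteristic frequency (BPFI) for comparison.
# ASSUMED geometry, not stated in the paper.
shaft_hz = 1797 / 60.0     # shaft rotation frequency in Hz
n_balls = 9                # assumed number of rolling elements
ball_d = 0.3126            # assumed ball diameter, inches
pitch_d = 1.537            # assumed pitch diameter, inches
bpfi = n_balls / 2.0 * shaft_hz * (1.0 + ball_d / pitch_d)

print(f"mean strong-component interval: {mean_interval:.4f} s")
print(f"rate from timestamps: {rate_from_peaks:.1f} Hz")
print(f"rate from quoted period: {rate_from_text:.1f} Hz")
print(f"assumed-geometry BPFI: {bpfi:.1f} Hz")
```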
To make the results more convincing, another data record is analyzed to verify our method. This record has a fault size of 21 mil, a motor load of 2 horsepower at 1750 rpm, and a sampling frequency of 12 kHz. The time domain and frequency domain waveforms are shown in Figure 14, and the STFT time-frequency image and the extracted main components are shown in Figures 15 and 16. The threshold calculated with our method is 202. As shown in Figure 16, where the bearing failure is more serious, the signal components are still mainly concentrated in the frequency band of 2400 Hz to 3400 Hz. The signal components in this frequency band are very abundant and occur discontinuously. The time intervals are not constant, and the intensity of the signal components also varies.

Conclusions
This paper presents an adaptive method of extracting the main components from time-frequency images, based on the gray histogram features of the images. To get a mask template with which the main components of a time-frequency image can be extracted, a threshold is first calculated adaptively by a method combining OTSU and KSW-entropy. Then, following the idea of binarization, the mask template is multiplied with the original time-frequency image, so that the signal components with little energy in the time-frequency image are filtered out. By this method, the main components of time-frequency images are retained adaptively while small details and noisy components are filtered out, which helps us focus on and find the characteristics of the time-frequency images obtained from the vibration signals of motor bearings. With this method, the number of effective pixels in a time-frequency image is substantially reduced, and the amount of data to be processed during later recognition is reduced as well, which helps computers automatically recognize or classify time-frequency images for bearing fault diagnosis.

Mathematical Problems in Engineering
Data Availability
The data used to support the findings of this study can be obtained from https://csegroups.case.edu/bearingdatacenter/home.

Conflicts of Interest
The authors declare that they have no conflicts of interest.