Acoustic Diagnosis of Rolling Bearings Fault of CR400 EMU Traction Motor Based on XWT and GoogleNet

Acoustic diagnosis has been a research hotspot in recent years because it allows noncontact signal acquisition. However, acoustic diagnosis technology has not yet been applied to bearing fault diagnosis of Electric Multiple Unit (EMU) traction motors, and traditional fault diagnosis methods have difficulty handling acoustic signals with complex noise. This paper proposes an intelligent fault diagnosis method based on the Cross Wavelet Transform (XWT) and the GoogLeNet model. Firstly, a fault feature enhancement algorithm is proposed using XWT and bandpass filtering. Secondly, a CR400 EMU traction motor bearing fault test bed is built to collect real fault acoustic signals from two different positions; XWT is applied to the original signals to identify the fault feature frequency band, and bandpass filtering is then used to filter out the noise frequency bands outside the fault feature frequency band. Finally, the kurtosis spectra of the denoised signal and the original signal are input into GoogLeNet, respectively, for fault classification. The results show that (1) GoogLeNet achieves 98.23% accuracy in fault classification for the denoised signals but only 89.66% accuracy for the original signals, and (2) deep learning is an effective method for the acoustic diagnosis of motor bearing faults in EMU trains.


Introduction
The railway transportation system plays a major role in the rapid development of the national economy. Because the traction motor is the core component of the train, its health status directly affects the safety of train operation [1]. According to surveys, bearing failure is one of the most frequent train faults [2][3][4]. An undetected bearing failure may lead to train derailment, resulting in serious accidents and economic losses. Therefore, it is necessary to monitor the train bearings and obtain their health status in time. Diagnosis based on vibration signals, a traditional diagnostic technology, has been steadily improved and has become the mainstream fault diagnosis technology [5][6][7][8]. Over the past 15 years, acoustic diagnosis technology has become important in bearing condition monitoring and has remained a research hotspot of fault diagnosis [9][10][11][12][13][14][15]. Previous studies have shown that acoustic measurement technology can be successfully applied to the field of fault diagnosis [16,17]. Compared with vibration signals, acoustic signals are more convenient to collect: there is no need to drill the bearing seat, so the strength of the structure is not affected. In this respect, acoustic diagnostic technology has an advantage over vibration diagnosis technology.
Although acoustic diagnostic technology has advantages, in practice acoustic signals are extremely complex and contain a lot of noise, so it is difficult to extract fault features from the original signals, which undermines the reliability of acoustic diagnostic technology [18]. With the development of computer technology, machine learning has been widely used in the field of mechanical equipment fault diagnosis. By taking the time-domain statistical characteristics of the original or denoised signal as the input dataset, excellent fault diagnostic results can be achieved using algorithms such as KNN, SVM, and DBN [19][20][21][22]. The convolutional neural network (CNN) has superior performance in the field of feature recognition. By taking the characteristic data of the signal as the input of a CNN for deep feature extraction, excellent fault classification can be achieved. For example, Fengli proposed deep convolution domain adversarial transfer learning (DCDATL), in which a deep convolution residual feature extraction method is used to extract deep fault features from the bearing signal; the resulting fault classification achieves an accuracy of more than 90% [23]. Alexander took the kurtosis spectrum of the acoustic emission signal as the input image of a LeNet-5 network and used a softmax classifier for fault classification. Under working conditions of 250 rpm to 500 rpm, he achieved a classification accuracy of 95.6% to 100% [24]. Zhan proposed a normalized convolutional neural network model, applied the model to fault classification on the bearing dataset of Case Western Reserve University, and achieved a classification accuracy of 98.5% [25]. Liu used a probabilistic neural network (PNN) to diagnose the denoised vibration signal and obtained 100% classification accuracy [26]. Zhang et al. applied a support vector machine (SVM) to intelligent fault diagnosis of bearings and achieved 89.58% accuracy on the overall sample [27]. Appana extracted the deep features of the signal through a self-built CNN architecture and used softmax to classify the fault type, achieving an accuracy of 86.5% [28]. Tao used a DBN for bearing fault identification and compared the diagnostic performance of the DBN with that of SVM, BPNN, and KNN [29]. Liu et al. proposed the Categorical Adversarial Autoencoder (CatAAE) for unsupervised bearing fault diagnosis; as the SNR falls from 20 dB to −4 dB, the diagnostic accuracy falls from 96.76% to 85.76% [30]. Kumar et al. used an ANN to diagnose signals after wavelet denoising and achieved 96.67% accuracy [31]. In addition, using a machine learning algorithm for fault classification and comparing the classification accuracy of the model on the denoised signal and on the original signal can also verify the effect of denoising.
The cross wavelet transform (XWT) has been widely used in the field of regional climate analysis. It can be used to analyze the coherence of two signals in the time-frequency domain and to obtain their common frequency band [32]. Research shows that XWT can be used to enhance the fault features of bearing signals, but XWT is still rarely used in the field of fault diagnosis. For example, Jimeng proposed a bearing fault feature enhancement method based on XWT, and his experimental comparison showed that the bearing fault feature was enhanced in both the time and frequency domains of the vibration signal [33]. Lihua applied XWT to transformer fault diagnosis: XWT was applied to vibration signals collected from different directions, and the principal components of the signal were then extracted according to the cross wavelet coherence spectrum. The results show that the interference components in the signal were greatly reduced and the fault features were enhanced [34]. At present, XWT has not been applied to acoustic signal fault feature extraction. To address the difficulty of extracting fault features from the complex sound produced by a working train traction motor, this paper proposes a fault feature enhancement method that combines XWT and bandpass filtering. The innovations of this paper are as follows: (1) in order to study the acoustic diagnosis of bearing faults based on real acoustic signals of the traction motor, we have established a CR400 experimental platform for collecting acoustic signals of train traction motor bearing faults; (2) the fault feature frequency band (coherent frequency band) is identified by wavelet coherence analysis of the acoustic signals; (3) the kurtosis spectrum of the signal is used as the input image dataset of a CNN to realize the acoustic diagnosis of the traction motor bearing of the CR400 EMU.
The layout of this paper is as follows: Section 2 introduces the fault feature enhancement method proposed in this paper, Section 3 introduces the deployment of the experimental platform and the source of the sound data, and Section 4 provides the details of fault feature extraction. In Section 4.3, bearing fault classification based on GoogLeNet is performed to verify the effectiveness of the method. Finally, the research of this paper is summarized in Section 5.

Fault Feature Enhancement Algorithm Based on XWT
In practice, the waveform of the sound signal emitted by a working traction motor is very complex. The sound generated by the motor contains many noise sources, so it is difficult to identify a bearing fault. Once the noise is removed from the original signal, the periodic impact component caused by the bearing fault becomes more obvious. To remove the noise component, this paper proposes a method combining XWT and bandpass filtering that retains only the fault feature frequency band of the original signal. The feature signal is reconstructed with this method to achieve noise reduction. In the following, the sound signal emitted by the traction motor is written as $x_{\mathrm{sound}}(t)$ (abbreviated as $x_s(t)$) for convenience of expression.

Wavelet Coherence Analysis Based on XWT.
Based on the theory of wavelet analysis, XWT can be used to analyze the coherence of two time-domain signals over their whole frequency range. Multiple sound signals are collected from different directions for the same target, and each of these signals contains both noise and fault components. In this paper, two microphones are used to collect the sound of the working traction motor from two different directions; the collected signals are called $x_{s1}(t)$ and $x_{s2}(t)$, respectively. Because the sound propagation paths differ, the signals collected by microphones at different positions differ as well. Based on the principle of the wavelet transform, XWT is applied between $x_{s1}(t)$ and $x_{s2}(t)$ as follows:

$$W_{x_{s1}}(a,\tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x_{s1}(t)\, \psi^{*}\!\left(\frac{t-\tau}{a}\right) dt,$$

$$W_{x_{s1}-x_{s2}}(a,\tau) = W_{x_{s1}}(a,\tau)\, W_{x_{s2}}^{*}(a,\tau),$$

$$\psi(t) = \pi^{-1/4}\, e^{i\omega_0 t}\, e^{-t^{2}/2},$$

where $W_{x_{s2}}(a,\tau)$ is defined in the same way as $W_{x_{s1}}(a,\tau)$, $a$ is the scaling factor, $\tau$ is the translation factor, $*$ denotes the complex conjugate, $\psi(t)$ is the Morlet wavelet function, and $\omega_0$ is the dimensionless center frequency of the wavelet. The absolute value of $W_{x_{s1}-x_{s2}}(a,\tau)$ is the cross wavelet power spectral density. The higher the value, the greater the coherence between $x_{s1}(t)$ and $x_{s2}(t)$. Because noise is random and unstable, the cross wavelet power spectral density between noise signals is very small, that is, their coherence is very small. Based on this difference in coherence, the noise and the fault signal can be easily distinguished over the complete frequency band of the traction motor signal [32,35]. To visualize the coherence over the whole frequency band and to locate the frequency bands of the noise and of the fault signal, the wavelet coherence spectrum of $x_{s1}(t)$ and $x_{s2}(t)$ is plotted. In the wavelet coherence spectrum, the coherence is expressed by the brightness of the color.
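The cross wavelet power can be illustrated with a small numerical sketch. The code below is not the implementation used in this paper. It is a direct-sum approximation of the continuous wavelet transform with a Morlet wavelet, and the function names, the choice $\omega_0 = 6$, and the synthetic test tones are all assumptions made for illustration. Two synthetic signals share a 10 Hz component, while the 3 Hz and 25 Hz components each appear in only one signal, so the cross wavelet power should be large only at the scale matching 10 Hz.

```python
import cmath
import math

def morlet(t, omega0=6.0):
    """Morlet mother wavelet: pi^(-1/4) * exp(i*omega0*t) * exp(-t^2/2)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * t) * math.exp(-t * t / 2.0)

def cwt(x, dt, scales, omega0=6.0):
    """Direct-sum continuous wavelet transform; returns W[scale][shift]."""
    n = len(x)
    out = []
    for a in scales:
        row = []
        for k in range(n):  # translation tau = k * dt
            acc = 0j
            for m in range(n):
                acc += x[m] * morlet((m - k) * dt / a, omega0).conjugate()
            row.append(acc * dt / math.sqrt(a))
        out.append(row)
    return out

def cross_wavelet_power(x1, x2, dt, scales, omega0=6.0):
    """|W_x1 * conj(W_x2)| for every scale and shift."""
    w1 = cwt(x1, dt, scales, omega0)
    w2 = cwt(x2, dt, scales, omega0)
    return [[abs(u * v.conjugate()) for u, v in zip(r1, r2)]
            for r1, r2 in zip(w1, w2)]

def demo():
    """Return the cross power at the 3 Hz, 10 Hz, and 25 Hz scales (mid-signal)."""
    fs, n = 100.0, 200  # hypothetical sampling rate and length
    t = [m / fs for m in range(n)]
    # Shared 10 Hz tone (with a phase offset), plus one private tone per signal.
    x1 = [math.cos(2 * math.pi * 10 * v) + math.cos(2 * math.pi * 3 * v) for v in t]
    x2 = [math.cos(2 * math.pi * 10 * v + 0.7) + math.cos(2 * math.pi * 25 * v) for v in t]
    omega0 = 6.0
    # Scale whose Morlet passband is centered on frequency f: a = omega0 / (2*pi*f).
    scales = [omega0 / (2 * math.pi * f) for f in (3.0, 10.0, 25.0)]
    power = cross_wavelet_power(x1, x2, 1.0 / fs, scales, omega0)
    mid = n // 2
    return power[0][mid], power[1][mid], power[2][mid]

if __name__ == "__main__":
    print(demo())
```

In this toy case only the shared 10 Hz tone produces a large value, which mirrors the argument above: components that are not common to both microphones contribute very little cross wavelet power.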

Bandpass Filter.
A bandpass filter transmits signals within a specific frequency range and blocks signals outside this range, achieving selective transmission. After the noise frequency band or the fault feature frequency band has been identified, the noise can be deliberately removed from the original signal by using a bandpass filter. Bandpass filters are widely used in signal processing.
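A minimal sketch of the idea follows. The filter design used in this paper is not specified in the text, so the example substitutes an ideal (brick-wall) frequency-domain mask implemented with a plain discrete Fourier transform. The function names and the two-tone test signal are assumptions for illustration only, and a practical implementation would more likely use an FFT and a designed FIR or IIR filter.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

def bandpass(x, fs, f_low, f_high):
    """Keep only DFT bins whose absolute frequency lies in [f_low, f_high]."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        f = min(k, n - k) * fs / n  # bins k and n-k carry the same |frequency|
        if not (f_low <= f <= f_high):
            spectrum[k] = 0j
    return [v.real for v in idft(spectrum)]

def two_tone(fs, n, f1, f2):
    """Hypothetical test signal: the sum of two unit-amplitude sine waves."""
    return [math.sin(2 * math.pi * f1 * m / fs) + math.sin(2 * math.pi * f2 * m / fs)
            for m in range(n)]

if __name__ == "__main__":
    fs, n = 128.0, 128
    x = two_tone(fs, n, 5.0, 40.0)
    y = bandpass(x, fs, 30.0, 50.0)  # keeps the 40 Hz tone, removes the 5 Hz tone
    print(max(abs(v) for v in y))
```

The brick-wall mask is chosen here only because it makes the pass and stop behavior easy to verify; it rings on signals whose components do not fall on exact DFT bins.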

Steps of Fault Feature Enhancement.
According to the wavelet coherence spectrum, the frequency band of the noise can be identified, and the design parameters of the bandpass filter can be determined. The steps of the fault feature enhancement algorithm based on the wavelet coherence spectrum and filtering are as follows: (1) Deploy two microphones at two different positions around the motor to collect two sets of sound signals ($x_{s1}(t)$ and $x_{s2}(t)$). To ensure that the propagation paths of $x_{s1}(t)$ and $x_{s2}(t)$ differ from each other, the two microphones are placed at positions 90 degrees apart around the motor. (2) The cross wavelet transform is applied to $x_{s1}(t)$ and $x_{s2}(t)$, and the wavelet coherence spectrum between them is drawn. According to the light and dark distribution of each frequency band in the wavelet coherence spectrum, the frequency band of the noise is determined. Based on this noise band, a bandpass filter is designed. (3) The bandpass filter designed from the wavelet coherence spectrum is applied to $x_{s2}(t)$ to reduce its noise. The denoised signal is denoted $x_{\mathrm{feature}}(t)$. The flow of the whole process is shown in Figure 1.

CR400 EMU Motor Bearing Acoustic Data Experiment
An acoustic bearing fault test bed for the traction motor of the CR400 EMU is established. The traction motor is a model YQ-625 with a rated output power of 625 kW and a maximum output speed of 5600 rpm. The test bearing is a type NU214 cylindrical roller bearing. In this experiment, the bearing is installed at the drive end of the traction motor. The bearing specifications are shown in Table 1.
Laser etching is used to produce single-point damage. The width of the laser-etched damage is 0.3 mm for a mild fault and 2 mm for a severe fault. Mild and severe faults are produced on the cage, ball, outer race, and inner race. There are therefore eight fault types, namely cage mild fault, cage severe fault, ball mild fault, ball severe fault, outer race mild fault, outer race severe fault, inner race mild fault, and inner race severe fault. These fault types are abbreviated as CMF, CSF, BMF, BSF, ORMF, ORSF, IRMF, and IRSF, respectively. The faulty bearings are shown in Figure 2.
Five microphones are placed around the traction motor to collect the sound signal while the traction motor is working. The layout of the test bed is shown in Figure 3. The bearing speed is set to 2414 rpm to simulate the working condition of the train at 160 km/h. The sampling rate is set to 54.94 kHz. The sampling duration of each bearing acoustic signal is shown in Table 2.

Signal Denoising Based on XWT.
The method proposed in this paper first requires two microphones to be deployed around the motor. As shown in Figure 3, microphone 1 and microphone 2 are selected as the research objects because their positions relative to the motor are 90 degrees apart, so their signal propagation directions are completely different. The acoustic signals collected by the two microphones are abbreviated as $x_{s1}(t)$ and $x_{s2}(t)$. Their sound pressure fluctuation waveforms are shown in Figure 4.
The CMF fault is taken as an example to demonstrate the method. Firstly, the cross wavelet transform is applied to $x_{s1}(t)$ and $x_{s2}(t)$, and the wavelet coherence spectrum is obtained, as shown in Figure 5.
It can be observed from Figure 5 that the noise frequency bands lie between 0.125 kHz and 4 kHz and below 0.03125 kHz, and that the other frequency bands are coherent frequency bands (common frequency bands). Then, the bandpass filter is used to reduce the noise of $x_{s2}(t)$, so that only the coherent frequency bands are retained in the signal. In this way, the fault characteristic signal is obtained. The sound pressure fluctuation waveform of $x_{\mathrm{feature}}(t)$ is shown in Figure 6.
Compared with the original signal, $x_{\mathrm{feature}}(t)$ is less complex. A large part of the noise in the signal is filtered out, and the burrs in the waveform are largely eliminated. The other fault signals are processed in the same way, and the details are not repeated here.
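The retained bands follow from the noise bands by simple interval arithmetic. The sketch below is only an illustration with a hypothetical helper name. It takes the two noise bands read from Figure 5 and the Nyquist frequency implied by the 54.94 kHz sampling rate (27.47 kHz), and returns the complementary bands that a filter would need to pass.

```python
def passbands_from_noise_bands(noise_bands, f_max):
    """Return the sub-bands of [0, f_max] that are not covered by any noise band.

    noise_bands is a list of (low, high) pairs in Hz; f_max is the Nyquist frequency.
    """
    passbands = []
    cursor = 0.0  # lowest frequency not yet assigned to a noise band
    for low, high in sorted(noise_bands):
        if low > cursor:
            passbands.append((cursor, min(low, f_max)))
        cursor = max(cursor, high)
        if cursor >= f_max:
            break
    if cursor < f_max:
        passbands.append((cursor, f_max))
    return passbands

if __name__ == "__main__":
    nyquist = 54940.0 / 2.0                      # Hz, from the 54.94 kHz sampling rate
    noise = [(0.0, 31.25), (125.0, 4000.0)]      # Hz, noise bands read from Figure 5
    print(passbands_from_noise_bands(noise, nyquist))
```

Under these assumptions the coherent bands are 31.25 Hz to 125 Hz and 4 kHz to 27.47 kHz, so the denoising step amounts to passing these two bands and rejecting everything else.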

Spectral Kurtosis Analysis of Feature Signal.
Spectral kurtosis (SK) was precisely defined by Antoni in 2006 [36]. The kurtogram, a further development of SK analysis for the optimum selection of the bandwidth, is accepted and used in fault diagnosis, particularly for bearings. Moreover, kurtosis is likely to take a high value for non-Gaussian noise and fault features [37]. When a bearing fails, a gap appears between its components, and fierce collisions occur between them under violent rotation. The fault impact introduces a component in a specific frequency band into the original signal. In this paper, the spectral kurtosis of the raw signal and of the feature signal is calculated based on the short-time Fourier transform. SK is very sensitive to transient impacts in the signal. When the noise is removed, the SK value increases, indicating that the signal better reflects the bearing fault.
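To make the quantity concrete, the sketch below estimates spectral kurtosis from short-time Fourier transform frames using the common estimator $SK(f) = \langle |X(t,f)|^4 \rangle / \langle |X(t,f)|^2 \rangle^2 - 2$. It is an illustration under simplifying assumptions (rectangular window, naive DFT, hypothetical function names), not the kurtogram procedure used to produce the figures in this paper.

```python
import cmath
import math

def stft_frames(x, win, hop):
    """Rectangular-window STFT; returns one half-spectrum per frame."""
    frames = []
    for start in range(0, len(x) - win + 1, hop):
        seg = x[start:start + win]
        frames.append([
            sum(seg[m] * cmath.exp(-2j * math.pi * k * m / win) for m in range(win))
            for k in range(win // 2 + 1)
        ])
    return frames

def spectral_kurtosis(x, win, hop, eps=1e-12):
    """SK per frequency bin: <|X|^4> / <|X|^2>^2 - 2, averaged over frames."""
    frames = stft_frames(x, win, hop)
    sk = []
    for k in range(win // 2 + 1):
        power = [abs(frame[k]) ** 2 for frame in frames]
        m2 = sum(power) / len(power)
        m4 = sum(p * p for p in power) / len(power)
        # Bins with no energy carry no information; report 0 instead of dividing by ~0.
        sk.append(m4 / (m2 * m2) - 2.0 if m2 > eps else 0.0)
    return sk

if __name__ == "__main__":
    # Stationary tone at a quarter of the sampling rate: SK is -1 at its bin.
    tone = [1.0, 0.0, -1.0, 0.0] * 64
    print(spectral_kurtosis(tone, 32, 32)[8])
    # Sparse impulses (one in every fourth frame): SK is positive in all bins.
    impulses = [0.0] * 256
    impulses[0] = 1.0
    impulses[128] = 1.0
    print(spectral_kurtosis(impulses, 32, 32)[:3])
```

The two toy signals show why SK is useful here: a steady tone gives a negative value, whereas rare transient impacts, like those produced by a bearing defect, give a clearly positive value.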
The kurtogram takes the frequency as the horizontal axis and uses a color scale to represent the spectral kurtosis value at each frequency. Figure 7 (left) shows the spectral kurtosis of the original IRMF signal, and Figure 7 (right) shows the spectral kurtosis of its feature signal ($x_{\mathrm{feature}}(t)$). It can be seen from Figure 7 (right) that the spectral kurtosis is largest in the band with a center frequency of 2.5753 kHz and a bandwidth of 1.7169 kHz, that is, the transient impact is most obvious in this range. In Figure 7, the maximum kurtosis ($K_{\max}$) of the feature signal is 23.8332, which is significantly larger than the 2.6636 of the original signal. Therefore, $x_{\mathrm{feature}}(t)$ reflects the bearing fault better than the original signal. In the kurtogram, the frequency band corresponding to the brightest color region best characterizes the impact between bearing parts [38]. Figures 8-14 show the kurtograms of the other seven fault feature signals. It can be seen that for all fault types, the $K_{\max}$ of the fault feature signal is significantly greater than that of the corresponding original signal. According to the definition of spectral kurtosis, the greater the value, the more obvious the impact component in the signal. After processing by the proposed method, the noise in the original signal is removed to a great extent.
The spectral kurtosis differs greatly between different types of fault data. The differences in center frequency, bandwidth, and maximum spectral kurtosis result in different light and dark distributions in the corresponding kurtograms. The kurtograms of $x_{\mathrm{feature}}(t)$ are therefore used as the feature images of the bearing fault signal to represent the fault condition. Figure 15 shows the preparation process of the kurtogram dataset. The sampling durations of the signals are recorded in Table 2. To increase the number of samples, equal-interval overlapping segmentation is adopted to generate the sample signals. The length of a single sample is set to 1 second, and the interval between the starts of consecutive samples is set to 0.5 second. After dividing the signals, a total of 4512 sample signals are generated. Then, the kurtograms of all sample signals are randomly divided into a training dataset and a testing dataset in a ratio of about 7 : 3.
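The segmentation and the split can be sketched as follows. The helper names, the random seed, and the 10-second example recording are hypothetical (the actual durations are those in Table 2); only the 1 second window, the 0.5 second step, the 54.94 kHz sampling rate, the total of 4512 samples, and the approximate 7 : 3 ratio come from the text.

```python
import random

def overlapping_segments(n_samples, seg_len, step):
    """Return (start, end) index pairs of equal-length, equally spaced segments."""
    return [(s, s + seg_len) for s in range(0, n_samples - seg_len + 1, step)]

def train_test_split(items, train_ratio=0.7, seed=0):
    """Shuffle a copy of items and cut it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_ratio * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

if __name__ == "__main__":
    fs = 54940                 # samples per second (54.94 kHz)
    seg_len = fs               # 1 second per sample
    step = fs // 2             # a new sample starts every 0.5 second
    # Hypothetical 10-second recording, used only to show the counting.
    segments = overlapping_segments(10 * fs, seg_len, step)
    print(len(segments))       # number of 1 s samples cut from 10 s of signal
    train, test = train_test_split(range(4512))
    print(len(train), len(test))
```

With a 50% overlap, a recording of T seconds yields 2T - 1 samples, which is how overlapping segmentation roughly doubles the dataset compared with non-overlapping windows.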

Fault Diagnosis Based on GoogLeNet and Denoised Signal.
A convolutional neural network (CNN) is used for fault classification, and the effectiveness of the proposed method is further verified by comparing the classification performance of the CNN on the original signal and on the fault feature signal. A fault classification model is established based on the original GoogLeNet. GoogLeNet uses the Inception structure to improve the sparsity of the network and thereby the computing speed. In addition, GoogLeNet has higher performance than AlexNet and LeNet-5. For more details about GoogLeNet, see reference [39]. The GoogLeNet structure is shown in Table 3.
Since there are 8 types of bearing faults in this paper, the number of output classes of GoogLeNet is set to 8. The cross entropy loss function is used, and the Adam optimizer is used to optimize all the weight parameters of GoogLeNet. To fit the dataset as quickly as possible, the initial learning rate is set to 0.001. The model is trained on the kurtogram dataset for 45 epochs in total, and the learning rate is changed every 15 epochs to ensure the stability of the model: it is set to 0.0001 at the 16th epoch and to 0.00001 at the 31st epoch. After every epoch of fitting on the training dataset, the performance of the model on the test dataset is output. Figure 16 shows the fitting process of GoogLeNet on the kurtogram dataset in this paper. The blue curve in Figure 16 shows the change of the fitting degree of the model on the training dataset. The orange curve indicates the performance of the model on the test dataset after each epoch of fitting on the training dataset. It can be seen that at the end of the 15th epoch, the fitting accuracy of the model on the training dataset reached 98.23%, and the classification accuracy on the test dataset reached 96.30%. Then, as the learning rate decreases to 0.0001, the model fits better. At the end of the 45th epoch, the accuracy reached 98.23%. The result in Figure 16 shows that the processed fault feature signal can effectively reflect the fault state of the bearing.
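The learning-rate schedule described above can be written as a small step function. This is only a sketch of the schedule stated in the text (0.001, then 0.0001 from the 16th epoch, then 0.00001 from the 31st epoch); the function name and its parameters are hypothetical, and the training framework used in this paper is not stated.

```python
def learning_rate(epoch, base_lr=1e-3, drop_every=15, factor=0.1):
    """Step schedule: multiply the rate by `factor` after every `drop_every` epochs.

    `epoch` is 1-based, so epochs 1-15 use base_lr, 16-30 use base_lr * factor, etc.
    """
    if epoch < 1:
        raise ValueError("epoch must be 1 or greater")
    return base_lr * factor ** ((epoch - 1) // drop_every)

if __name__ == "__main__":
    for epoch in (1, 15, 16, 30, 31, 45):
        print(epoch, learning_rate(epoch))
```

Most deep learning frameworks provide an equivalent built-in step scheduler, so in practice this function would usually be replaced by the framework's own mechanism.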
The results show that the method proposed in this paper can enhance the fault feature, and that GoogLeNet can accurately identify the fault types of faulty bearings from the kurtograms of the fault feature signals.

Comparison Experiment with and without Denoising.
The original signal is divided following the steps shown in Figure 15, but XWT-bandpass noise reduction is not applied before the division. Then, the kurtosis spectrum dataset of all original signal samples is produced. Finally, the kurtosis spectra of the original signal are input into the GoogLeNet described in Table 3 for fault classification, and the fault classification accuracy for the original signal is obtained. Figure 17 shows the fitting process of GoogLeNet on the kurtogram datasets of the original signal and of the fault feature signal. The blue curve and the orange curve in Figure 17 correspond to these two datasets. It can be seen that, compared with the fault feature signal, GoogLeNet fits the original signal worse, which reflects that the fault feature of the signal is more obvious after noise reduction. Figure 18 shows the confusion matrices of the classification of the fault feature signal and of the original signal. It can be seen from Figure 18 that the classification accuracy of all fault types is improved after fault feature enhancement. The classification recall and precision of all fault types before and after noise reduction are shown in Table 4.
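For reference, per-class precision and recall are derived from a confusion matrix as sketched below. The function names and the small 3-class matrix are hypothetical and serve only to show the arithmetic; they are not the values in Figure 18 or Table 4.

```python
def per_class_precision_recall(cm):
    """cm[i][j] = number of samples of true class i predicted as class j.

    Returns a list of (precision, recall) pairs, one per class.
    """
    n = len(cm)
    result = []
    for c in range(n):
        true_positive = cm[c][c]
        predicted_as_c = sum(cm[r][c] for r in range(n))  # column sum
        actually_c = sum(cm[c])                           # row sum
        precision = true_positive / predicted_as_c if predicted_as_c else 0.0
        recall = true_positive / actually_c if actually_c else 0.0
        result.append((precision, recall))
    return result

def accuracy(cm):
    """Overall accuracy: correctly classified samples over all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

if __name__ == "__main__":
    toy = [[8, 2, 0],
           [1, 9, 0],
           [0, 0, 10]]  # hypothetical counts, not data from this paper
    print(per_class_precision_recall(toy))
    print(accuracy(toy))
```

Recall is computed along the rows (how many samples of a fault type were found) and precision along the columns (how many predictions of a fault type were correct), which is why the two can differ for the same class.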

Comparison with Other Diagnostic Methods.
At present, many classification methods have been used in bearing fault detection, such as SVM, CNN, DBN, KNN, and BPNN, and their application in the field of fault diagnosis has achieved good results. To compare their effect, the results of these methods and of our method, together with the per-fault classification accuracy of some of the methods, are listed in Table 5. It can be seen from Table 5 that the accuracy of our proposed method is higher than that of the other methods, and that the accuracy for each fault is maintained between 95% and 100%. The performance and stability of the method are better, which again demonstrates its superiority.

Conclusion
In this paper, microphones are used to collect the sound signals of the working motor from different positions. Using the cross wavelet transform and bandpass filtering, the coherent frequency band is identified and retained. Comparing the spectral kurtosis values of the original signal and the feature signal shows that the feature signal better reflects the impact caused by the bearing fault. The kurtograms of the feature signal are input to GoogLeNet for fault classification, and an accuracy of 98.23% is achieved, whereas the fault classification accuracy of GoogLeNet on the original signal is only 89.66%. This further confirms the effectiveness of the proposed method. Therefore, the method proposed in this paper can effectively remove noise and enhance the fault feature, and excellent fault diagnosis can be achieved by using a convolutional neural network.

Data Availability
This study does not involve any public data sets.

Conflicts of Interest
The authors declare that they have no conflicts of interest.