Research on the Sparse Representation for Gearbox Compound Fault Features Using Wavelet Bases

Gearbox fault diagnosis has attracted increasing attention in recent years, but most studies address single faults. In engineering practice, a gearbox often contains more than one fault at the same time, which is referred to as a compound fault, so compound fault diagnosis is equally important. Bearing faults and gear faults in a gearbox both produce transient impulse responses in the captured signal, but of different kinds, and an approach that can tell them apart is therefore needed. Sparse representation is an effective method for extracting features from strong background noise. This paper therefore develops a sparse representation under wavelet bases for compound fault feature extraction. With the proposed method, the different transient features of the bearing and the gear can be separated and extracted. Both a simulation study and a practical application to a gearbox with a compound fault verify the effectiveness of the proposed method.


Introduction
As a fundamental mechanical component for transmitting power, the gearbox is widely used in modern industry. Because of its complicated structure, hostile working conditions, and other factors, a gearbox is easily damaged and prone to breakdown. Therefore, it is of great significance to develop proper condition monitoring and fault diagnosis methods for gearboxes in order to prevent unexpected machine failures during operation and even casualties [1].
When a fault occurs in a bearing or a gear, both of which are vital gearbox components, periodic transient impulses appear in the vibration signal it generates. Research has shown that the transients in the captured signal carry important fault feature information from the defective component [2]. Therefore, gearbox fault feature extraction can be recast as extracting the transients from the measured signal.
During the past two decades, various advanced signal processing methods have been proposed for effective fault feature extraction in rotary machines. Time-frequency analysis [3,4], which analyzes a signal in both the time and frequency domains, was developed for nonstationary signal processing. As a typical time-frequency method, the Wigner-Ville distribution (WVD) has proven effective in mechanical fault diagnosis [5,6]. Wavelet transforms, which decompose a signal into different scales with varying frequency bandwidths [7,8], are also used to extract fault-related information from rotary machines [9,10]. Empirical mode decomposition (EMD) [11], a self-adaptive signal processing technique, decomposes a nonlinear and nonstationary signal into a set of intrinsic mode functions (IMFs) and has also been introduced to the fault diagnosis of rotary machines [12,13].
However, the methods mentioned above are mostly used when there is a single fault in the machinery. Engineering practice has shown that a gearbox usually contains more than one fault, which is referred to as a compound fault [14]. When different faults exist simultaneously, the vibration signal they excite contains different transient impulse responses, which makes it complex and difficult to identify each fault from the observed signal with traditional methods. Hence, some novel techniques for compound fault diagnosis have gradually been developed. The blind source separation (BSS) technique [15] can separate several source signals that cannot be observed directly from their superposition and has been used to extract machinery fault features from different rotary components [16,17]. However, the compound signals analyzed in BSS are usually acquired from different channels through several transducers, which makes transducer installation inconvenient in engineering applications. Morphological component analysis (MCA) [18] was also developed for compound signal decomposition based on the morphological diversity of each component [19,20]. However, MCA requires that the vibrations generated by each faulty component be completely uncorrelated, a requirement that may lower the separation quality. Additionally, some model-based intelligent methods have been proposed in recent years for separating compound fault signals [21,22], but because appropriate data are difficult to acquire, these methods have not been widely adopted.
Meanwhile, there has been growing interest in the sparse representation of signals [23,24]. With an overcomplete dictionary that contains prototype signal atoms, a signal can be described as a sparse linear combination of these atoms [25,26]. Sparse representation has already been used for single fault feature extraction, where its extraction performance has been demonstrated [27,28]. According to the characteristics of the gear fault vibration signal, Cai et al. [27] proposed sparsity-enabled signal decomposition using the tunable Q-factor wavelet transform and successfully extracted the gear fault feature. In [28], Fan et al. constructed a sparse dictionary matched to the signal characteristics and combined it with the majorization-minimization (MM) algorithm to extract the gear transient impulse responses sparsely.
Engineering practice therefore calls for further research on gearbox compound fault diagnosis. Investigations of failure ratios among gearbox components show that the components that fail most often are bearings and gears. However, there are few methods for extracting gearbox compound fault features when bearing and gear faults occur together, so this case deserves particular attention. Because of its self-adaptability, concise expression, and other merits, sparse representation is introduced in this paper for separating and extracting gearbox compound fault features. When both bearing and gear faults are present in a gearbox, the sampled signal contains two different kinds of transient impulse responses. Considering the different waveforms of each fault, a different optimal wavelet basis can be constructed for each by correlation filtering [29,30]. A constrained optimization algorithm is then combined with each basis to obtain a series of sparse coefficients that represent the corresponding fault. After each fault has been represented by sparse coefficients in turn, the impulse times and the period of each fault in the gearbox can be detected from the sparse coefficients.
The rest of this paper is organized as follows. Section 2 introduces the theoretical background of the proposed method. Section 3 gives a simulation study to verify the effectiveness of the proposed method. In Section 4, the proposed method is applied to gearbox compound fault feature extraction to further verify its effectiveness. Finally, Section 5 gives the conclusions.

Theoretical Framework
2.1. Sparse Representation Theory. Sparse representation expresses a signal with as few nonzero coefficients as possible in an overcomplete dictionary, which simplifies subsequent signal processing. The theory is described concretely below.
Assume that a set $A = \{a_\gamma\}_{\gamma \in \Gamma}$ contains $N$ elements and $M$ linearly independent vectors with $M \ll N$; the set $A$ is then an overcomplete dictionary or basis. Each column of the matrix $A$ is a signal called an atom. Because the signal sampled from the transducer contains noise, the observed signal $y(t)$ can be modeled as

$$y(t) = x(t) + n(t), \tag{1}$$

where $y(t)$ is the observed noisy signal, $x(t)$ is the true signal without noise, and $n(t)$ is the noise. The true signal $x(t)$ can be represented sparsely with the overcomplete basis $A$, that is, as a linear combination of a few atoms of $A$, so that $x = Ac$, where $c$ is the vector of representation coefficients, which also represents the transients. Each impulse in the cyclic signal generates a value in the sparse coefficient vector $c$. Therefore, when cyclic impulses occur in the signal, cyclic values occur in $c$ correspondingly, and all other values in $c$ are theoretically zero. The coefficient vector $c$ is thus sparse, and the cyclic transient components as well as the impulse times of the signal can be extracted from the sparse coefficients. As a result, the estimation model in (1) turns into

$$y = Ac + n, \tag{2}$$

where $A$ is an $M \times N$ matrix with $M < N$ and $c$ is a length-$N$ vector. The more similar the basis $A$ and the signal $x$ are, the sparser the vector $c$ will be. Based on sparse representation theory, to obtain the sparse representation of the signal $x$ under the overcomplete basis $A$, the optimization problem can be constructed as

$$\min_{c} \|c\|_0 \quad \text{subject to} \quad x = Ac, \tag{3}$$

where $\|c\|_0$ is the $\ell_0$-norm of $c$, which counts the nonzero values of $c$. Because minimizing the $\ell_0$-norm is a nonconvex combinatorial problem, it is commonly relaxed to the $\ell_1$-norm:

$$\min_{c} \|c\|_1 \quad \text{subject to} \quad x = Ac. \tag{4}$$

Since only the noisy observation $y$ is available, the sparse coefficients are estimated by minimizing the basis pursuit denoising (BPD) objective

$$J(c) = \tfrac{1}{2}\|y - Ac\|_2^2 + \lambda \|c\|_1, \tag{5}$$

where $\|c\|_2$ is the $\ell_2$-norm of $c$, defined by $\|c\|_2^2 = \sum_{n=1}^{N} |c(n)|^2$, $\|c\|_1 = \sum_{n=1}^{N} |c(n)|$ is the $\ell_1$-norm, and $\lambda$ is the regularization parameter. Minimizing the objective function in (5) yields a sparse representation vector $c$. To minimize $J(c)$, an iterative algorithm must be introduced. Traditional gradient-based methods, such as the iterative shrinkage/thresholding algorithm (ISTA) [34] and the fast IST algorithm (FISTA) [35], can converge slowly. To improve the speed of convergence, Afonso et al. proposed the split augmented Lagrangian shrinkage algorithm (SALSA) [36], which is faster than the earlier methods. The algorithm updates the vector $c$ at each iteration so as to decrease the objective function $J(c)$ until the optimal solution $\hat{c}$ is reached.
Because the objective function is the sum of two functions, (5) can be written as the unconstrained problem

$$\min_{c} \; f_1(c) + f_2(c), \tag{6}$$

where $f_1(c) = \tfrac{1}{2}\|y - Ac\|_2^2$ and $f_2(c) = \lambda\|c\|_1$. Variable splitting then introduces a new variable $u$ to serve as the argument of $f_1$, under the constraint that $u = c$. This leads to the constrained problem

$$\min_{u,\,c} \; f_1(u) + f_2(c) \quad \text{subject to} \quad u = c, \tag{7}$$

which is clearly equivalent to the unconstrained problem in (6). Then, use the following definitions:

$$z = \begin{bmatrix} u \\ c \end{bmatrix}, \qquad H = \begin{bmatrix} I & -I \end{bmatrix}, \qquad b = 0, \qquad E(z) = f_1(u) + f_2(c).$$

With these definitions, (7) can be transformed into

$$\min_{z} \; E(z) \quad \text{subject to} \quad Hz - b = 0. \tag{8}$$

The augmented Lagrangian function for this problem is defined as

$$L(z, \nu, \mu) = E(z) + \nu^{T}(b - Hz) + \frac{\mu}{2}\|Hz - b\|_2^2, \tag{9}$$

where $\nu$ is a vector of Lagrange multipliers and $\mu \ge 0$ is the penalty parameter. The augmented Lagrangian method (ALM) is used to minimize the objective function $L(z, \nu, \mu)$, and the following results can be obtained:

$$z^{(k+1)} = \arg\min_{z} L(z, \nu^{(k)}, \mu), \qquad \nu^{(k+1)} = \nu^{(k)} + \mu\,(b - Hz^{(k+1)}), \tag{10}$$

where $k$ is the iteration counter. Considering the concrete forms of the function $E(z)$, the matrix $H$, and the vector $b$, writing $d = \nu/\mu$ for the scaled multiplier, and minimizing over $u$ and $c$ alternately, these results can be written as

$$u^{(k+1)} = \arg\min_{u} \; \tfrac{1}{2}\|y - Au\|_2^2 + \frac{\mu}{2}\|u - c^{(k)} - d^{(k)}\|_2^2, \tag{11}$$

$$c^{(k+1)} = \arg\min_{c} \; \lambda\|c\|_1 + \frac{\mu}{2}\|u^{(k+1)} - c - d^{(k)}\|_2^2, \tag{12}$$

$$d^{(k+1)} = d^{(k)} - (u^{(k+1)} - c^{(k+1)}). \tag{13}$$

Equation (11) is a strictly convex quadratic function to be minimized, which leads to the solution $u^{(k+1)}$ directly, and the soft threshold $\operatorname{soft}(v, T) = \operatorname{sign}(v)\max(|v| - T, 0)$ solves the minimization in (12). The iteration procedure of SALSA can therefore be listed as

$$\begin{aligned} u^{(k+1)} &= (A^{T}A + \mu I)^{-1}\bigl(A^{T}y + \mu\,(c^{(k)} + d^{(k)})\bigr), \\ c^{(k+1)} &= \operatorname{soft}\bigl(u^{(k+1)} - d^{(k)},\ \lambda/\mu\bigr), \\ d^{(k+1)} &= d^{(k)} - (u^{(k+1)} - c^{(k+1)}). \end{aligned} \tag{14}$$

By running the iterative numerical algorithm SALSA, the optimal sparse solution $\hat{c}$ can be found. Sparsity here means that most elements of $\hat{c}$ are close to zero, and the reconstructed signal can be represented as $\hat{x} = A\hat{c}$. In the sparse vector $\hat{c}$, there are successive periodic nonzero coefficients, which represent the transient responses in the original signal. Thus, the fault period can be calculated from the envelope spectrum analysis of the reconstructed signal, after which the fault feature is extracted.
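The SALSA iteration described above can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' implementation: it assumes the objective $\tfrac{1}{2}\|y - Ac\|_2^2 + \lambda\|c\|_1$, the splitting $u = c$ with a scaled multiplier `d`, and illustrative defaults for the penalty parameter and the iteration count. The names `soft` and `salsa_bpd` are hypothetical.

```python
import numpy as np


def soft(v, thresh):
    """Elementwise soft threshold: shrink v toward zero by thresh."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def salsa_bpd(A, y, lam, mu=1.0, n_iter=200):
    """Minimize 0.5*||y - A c||_2^2 + lam*||c||_1 by variable splitting (u = c).

    lam, mu and n_iter are illustrative choices, not values from the paper.
    """
    n = A.shape[1]
    c = np.zeros(n)
    d = np.zeros(n)                               # scaled Lagrange multiplier
    G = np.linalg.inv(A.T @ A + mu * np.eye(n))   # reused at every iteration
    Aty = A.T @ y
    for _ in range(n_iter):
        u = G @ (Aty + mu * (c + d))   # quadratic subproblem, closed form
        c = soft(u - d, lam / mu)      # l1 subproblem, soft threshold
        d = d - (u - c)                # multiplier update
    return c


if __name__ == "__main__":
    # Sanity check: with an identity dictionary the minimizer is soft(y, lam).
    y = np.array([3.0, -0.2, 0.1, -2.0, 0.0])
    print(salsa_bpd(np.eye(5), y, lam=0.5))
```

With an identity dictionary the minimizer is simply the soft-thresholded signal, which gives a quick check of the iteration. For a large wavelet dictionary the explicit inverse would normally be replaced by a fast transform or a matrix inversion identity.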

2.2. Selection of the Optimal Wavelet Bases by Correlation Filtering. After SALSA has been chosen as the sparse optimization algorithm, the selection of the wavelet bases becomes the other key task. To represent the original signal as sparsely as possible, the basis should be as similar to the signal as possible. Considering the characteristics of the vibration signal of a faulty bearing, the Laplace wavelet, whose shape is similar to the signal transients caused by a localized bearing fault, is selected to construct the wavelet basis for extracting the bearing fault feature. The Laplace wavelet is defined as

$$\psi_\gamma(t) = \psi(f, \zeta, \tau, t) = \begin{cases} K_1\, e^{-\frac{\zeta}{\sqrt{1-\zeta^2}}\, 2\pi f (t-\tau)} \sin\bigl(2\pi f (t-\tau)\bigr), & t \ge \tau, \\ 0, & t < \tau, \end{cases} \tag{15}$$

where the parameter vector $\gamma = (f, \zeta, \tau)$ determines the wavelet properties. The parameters $(f, \zeta, \tau)$ denote the frequency $f \in \mathbb{R}^+$, the damping ratio $\zeta \in [0, 1) \subset \mathbb{R}^+$, and the time index $\tau \in \mathbb{R}$, respectively. The parameter $K_1$ is used to normalize the wavelet function.
The parameters $f$, $\zeta$, and $\tau$ belong to the subsets $F$, $Z$, and $T$, respectively. With different parameters, the Laplace wavelet dictionary can be constructed as

$$\Psi = \{\psi_\gamma(t) : f \in F,\ \zeta \in Z,\ \tau \in T\}. \tag{16}$$

With the constructed Laplace wavelet dictionary, correlation filtering is introduced to identify the optimal set of parameters $(f, \zeta, \tau)$. Correlation measures the similarity between a wavelet atom and the original signal and is computed with the inner product. The correlation function $k_\gamma$ is defined to quantify the degree of correlation between the atom $\psi_\gamma(t)$ and the original signal $x(t)$:

$$k_\gamma = \cos\theta = \frac{\langle \psi_\gamma(t), x(t) \rangle}{\|\psi_\gamma(t)\|_2\, \|x(t)\|_2}, \tag{17}$$

where $\theta$ is the angle between $\psi_\gamma(t)$ and $x(t)$. The smaller the angle is, the more similar the atom $\psi_\gamma(t)$ and the original signal $x(t)$ are. Therefore, the optimal wavelet atom with optimal parameters $(f, \zeta, \tau)$ can be obtained from the constructed Laplace wavelet dictionary by maximizing the correlation function $k_\gamma$ at each time value. The peak of $k_\gamma$ for a given time value $\tau$ can be represented as

$$k(\tau) = \max_{f \in F,\ \zeta \in Z} k_\gamma, \tag{18}$$

and the time index parameter $\tau$ can be found by maximizing $k(\tau)$. With correlation filtering, the optimal parameters $(f, \zeta, \tau)$ can be found effectively, and the optimal wavelet atom with these parameters can then be constructed.
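The parameter search described above can be sketched as an exhaustive grid search. The following is a minimal sketch under stated assumptions: the atom is taken to be a real, one-sided damped sinusoid truncated after a hypothetical support `width`, the sign of the inner product is ignored, and the function names and parameter grids are illustrative rather than taken from the paper.

```python
import numpy as np


def laplace_wavelet(t, f, zeta, tau, width):
    """Real one-sided damped sinusoid starting at tau (assumed atom shape).

    The atom is zero outside [tau, tau + width]; width is a hypothetical
    truncation length, not a parameter named in the paper.
    """
    dt = t - tau
    mask = (dt >= 0) & (dt <= width)
    psi = np.zeros_like(t, dtype=float)
    decay = np.exp(-zeta / np.sqrt(1.0 - zeta**2) * 2.0 * np.pi * f * dt[mask])
    psi[mask] = decay * np.sin(2.0 * np.pi * f * dt[mask])
    return psi


def correlation_filter(x, t, freqs, zetas, taus, width):
    """Grid search for the (f, zeta, tau) maximizing the normalized correlation."""
    best, best_k = None, -1.0
    x_norm = np.linalg.norm(x)
    for f in freqs:
        for zeta in zetas:
            for tau in taus:
                psi = laplace_wavelet(t, f, zeta, tau, width)
                psi_norm = np.linalg.norm(psi)
                if psi_norm == 0.0:
                    continue  # atom lies entirely outside the record
                k = abs(psi @ x) / (psi_norm * x_norm)
                if k > best_k:
                    best, best_k = (f, zeta, tau), k
    return best, best_k


if __name__ == "__main__":
    t = np.arange(2000) / 25600.0
    x = laplace_wavelet(t, 3500.0, 0.08, 0.01, 0.005)
    print(correlation_filter(x, t, [3000.0, 3500.0, 4000.0],
                             [0.05, 0.08, 0.12], [0.005, 0.01, 0.015], 0.005))
```

The cost grows with the product of the three grid sizes, so in practice the time index is often scanned with a cross-correlation instead of an explicit loop; the loop is kept here only for clarity.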
Meanwhile, the Morlet wavelet, whose shape is similar to the vibration signal transients caused by a localized gear fault, is used to construct the wavelet basis for extracting the gear fault feature. The Morlet wavelet is defined as

$$\psi_\gamma(t) = \psi(f, \zeta, \tau, t) = K_2\, e^{-\frac{\zeta}{\sqrt{1-\zeta^2}}\, [2\pi f (t-\tau)]^2} \cos\bigl(2\pi f (t-\tau)\bigr), \tag{19}$$

where the parameter vector $\gamma = (f, \zeta, \tau)$ again determines the wavelet properties. The parameters $(f, \zeta, \tau)$ denote the frequency $f \in \mathbb{R}^+$, the damping ratio $\zeta \in [0, 1) \subset \mathbb{R}^+$, and the time index $\tau \in \mathbb{R}$, respectively, and the optimal set of characteristic parameters can also be found by correlation filtering. The parameter $K_2$ is used to normalize the wavelet function.

2.3. Separation and Extraction for Gearbox Compound Fault. The order in which the compound fault features are extracted should be determined first. The propagation path of signals in the gearbox is taken into consideration to settle this question. The vibration signals generated in the gearbox contain not only normal vibrations but also fault vibrations. In theory, these signals propagate arbitrarily in the gearbox, but as a whole there is an overall propagation path: gear-spline-shaft-bearing-casing [37]. When there are faults in both the gear and the bearing, the signal sampled by the sensor contains different kinds of transient impulse responses. Because the sensor is placed on the casing, which is closer to the bearing along the propagation path of the fault vibration signal, as shown in Figure 1, the energy of the bearing fault feature is higher than that of the gear fault feature in the captured compound signal. As a result, the bearing feature, which has higher energy, is extracted first in order to reduce interference during the extraction of the gear fault feature. Once the bases $A_1$ and $A_2$ have been chosen, incorporating them into SALSA during the extraction of each fault feature yields two sparse vectors $\hat{c}_1$ and $\hat{c}_2$ one after the other. That is, given the characteristics of the vibration signal of the defective bearing, the Laplace wavelet basis, matched with the original compound fault signal by correlation filtering, is first incorporated into the iterative algorithm SALSA. A sparse vector $\hat{c}_1$, which represents the bearing fault feature sparsely, and the reconstructed bearing fault signal $\hat{x}_1$ are then obtained. After the sparse representation of the bearing fault feature, the amplitude of each transient impulse is represented by the sparse vector $\hat{c}_1$. To estimate the real amplitude of the bearing fault transients, a constrained optimization strategy is proposed that estimates the amplitude of the single fault component by introducing the parameter $\eta$. With the spectrum of the residual fault signal $x - x_1$ denoted by $S_1(f)$, the problem is

$$\min_{\eta} \; S_1(f_1) \quad \text{subject to} \quad x_1 = \eta\,\hat{x}_1,\ \ \eta > 0, \tag{20}$$

where $x$ is the original signal, $\hat{x}_1$ is the reconstructed bearing signal, $f_1$ is the peak frequency of $\hat{x}_1$, and $\eta$ is a positive parameter. When $S_1(f_1)$ is minimized subject to its constraints, the bearing fault component has been removed from the residual fault signal to the largest extent. By solving the problem in (20), an optimal value $\eta_{\mathrm{opt1}}$ is acquired, and the estimated bearing fault signal is obtained as $x_1 = \eta_{\mathrm{opt1}}\,\hat{x}_1$.
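One way to carry out this amplitude estimation is sketched below. It assumes that the spectrum is evaluated with the discrete Fourier transform at the single bin where the reconstructed signal peaks, in which case the minimizing scale factor has a closed form (a real least-squares fit of one complex number to another). The function name `estimate_amplitude` is hypothetical, and the paper may solve the constrained problem differently.

```python
import numpy as np


def estimate_amplitude(x, x_hat):
    """Scale factor eta that minimizes |FFT(x - eta * x_hat)| at the peak bin of x_hat.

    Assumption: the residual spectrum is judged at one DFT bin only.
    The result is clipped at zero when the unconstrained minimizer is negative.
    """
    X = np.fft.rfft(x)
    X_hat = np.fft.rfft(x_hat)
    k = int(np.argmax(np.abs(X_hat)))          # bin of the peak frequency
    eta = np.real(X[k] * np.conj(X_hat[k])) / np.abs(X_hat[k]) ** 2
    return max(float(eta), 0.0)


if __name__ == "__main__":
    fs = 1000.0
    t = np.arange(1000) / fs
    x_hat = np.sin(2 * np.pi * 50 * t)                   # reconstructed component
    x = 1.7 * x_hat + 0.3 * np.sin(2 * np.pi * 120 * t)  # observed signal
    eta = estimate_amplitude(x, x_hat)
    print(eta)                                           # close to 1.7
    residual = x - eta * x_hat                           # passed to the next stage
```

The same function can be reused for the second fault component by passing the residual signal and the second reconstructed signal.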
Then, after the bearing fault signal has been removed, the residual signal should contain only the gear fault transient responses and background noise. To extract the gear fault feature, the Morlet wavelet basis, matched with the residual signal by correlation filtering, is incorporated into the iterative algorithm SALSA. The second sparse vector $\hat{c}_2$, which represents the gear transients, is then generated. Similarly, the reconstructed gear signal $\hat{x}_2$ is obtained as $\hat{x}_2 = A_2\hat{c}_2$. As in the bearing fault feature extraction, it is also necessary to estimate the real gear fault signal by solving another constrained optimization problem:

$$\min_{\eta} \; S_2(f_2) \quad \text{subject to} \quad x_2 = \eta\,\hat{x}_2,\ \ \eta > 0, \tag{21}$$

where $S_2(f)$ denotes the spectrum of the residual fault signal $x_{\mathrm{res}} - x_2$, $x_{\mathrm{res}}$ is the residual signal after removing the bearing fault component, $\hat{x}_2$ is the preliminary reconstructed gear fault signal, and $f_2$ is the peak frequency of $\hat{x}_2$. Solving the problem in (21) gives another optimal value $\eta_{\mathrm{opt2}}$. The estimated gear fault signal is then obtained as $x_2 = \eta_{\mathrm{opt2}}\,\hat{x}_2$. To summarize, the procedure of the proposed method for separating and extracting gearbox compound fault features using wavelet bases is presented in Figure 2.

Simulated Study
To verify the effectiveness of the proposed method, a simulated compound fault signal is processed to extract its different features. Considering the characteristics of the compound fault vibration signal in a gearbox, the signal is constructed as

$$y(t) = x_1(t) + x_2(t) + \sigma\, n(t), \tag{22}$$

where $x_1(t)$ is a periodic train of impulse responses that simulates the bearing fault, shown in Figure 3(a). Its parameters are as follows: the frequency is $f_1 = 3500$ Hz, the damping ratio is $\zeta_1 = 0.080$, the time index is $\tau_1 = 0.1$ s, the cyclic period is $T_1 = 0.007$ s, and the normalization parameter is $K_1 = 1$. $x_2(t)$ is also a periodic signal, which simulates the vibration signal of the faulty gear, shown in Figure 3(b). Its parameters are $f_2 = 275$ Hz, $\zeta_2 = 0.0074$, $\tau_2 = 0.02$ s, $T_2 = 0.05$ s, and $K_2 = 0.6$. The signal $n(t)$ is white Gaussian noise, weighted by $\sigma = 0.3$. The sampling frequency is 25.6 kHz and the number of samples is 5000. Figure 3(c) gives the waveform of the noisy compound signal.
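A signal of this kind can be generated as in the following sketch, which uses the parameter values quoted in the text. Several points are assumptions rather than details given in the paper: the bearing transient is modeled as a real one-sided damped sinusoid, the gear transient as a Gaussian-windowed cosine, each transient is repeated from its time index to the end of the record, and the random seed and all function names are illustrative.

```python
import numpy as np

FS = 25600      # sampling frequency in Hz (from the text)
N = 5000        # number of samples (from the text)


def laplace_atom(f, zeta):
    """One-sided damped sinusoid (assumed real form of the Laplace wavelet)."""
    a = zeta / np.sqrt(1.0 - zeta**2) * 2.0 * np.pi * f

    def atom(dt):
        pos = np.clip(dt, 0.0, None)     # zero response before the impulse
        return np.exp(-a * pos) * np.sin(2.0 * np.pi * f * pos)
    return atom


def morlet_atom(f, zeta):
    """Two-sided Gaussian-windowed cosine (assumed real form of the Morlet wavelet)."""
    a = zeta / np.sqrt(1.0 - zeta**2)

    def atom(dt):
        phase = 2.0 * np.pi * f * dt
        return np.exp(-a * phase**2) * np.cos(phase)
    return atom


def periodic_transients(t, atom, tau, period):
    """Sum copies of the atom centered at tau, tau + period, ... inside the record."""
    out = np.zeros_like(t)
    t0 = tau
    while t0 < t[-1]:
        out += atom(t - t0)
        t0 += period
    return out


def simulate(sigma=0.3, seed=0):
    """Return time axis, bearing component, gear component and noisy compound signal."""
    t = np.arange(N) / FS
    x1 = 1.0 * periodic_transients(t, laplace_atom(3500.0, 0.080), 0.1, 0.007)
    x2 = 0.6 * periodic_transients(t, morlet_atom(275.0, 0.0074), 0.02, 0.05)
    noise = sigma * np.random.default_rng(seed).standard_normal(N)
    return t, x1, x2, x1 + x2 + noise


if __name__ == "__main__":
    t, x1, x2, y = simulate()
    print(y.shape, float(np.abs(x2).max()))
```

Raising `sigma` reproduces the kind of noise robustness test carried out later in this section.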
To separate and extract each fault feature from the noisy compound signal, the proposed sparse representation under wavelet bases is applied. According to the procedure in Figure 2, the first step is to obtain the optimal Laplace wavelet basis matched with the original noisy signal by correlation filtering, which is shown in Figure 4(a). The matched Laplace wavelet basis is then incorporated into the iterative process of SALSA, which yields the sparse coefficients representing the transient feature of the faulty bearing, as shown in Figure 4(b). The reconstructed bearing signal is shown in Figure 4(c), and Figure 4(d) gives the envelope spectrum analysis of the reconstructed signal. In Figure 4(b), there are successive periodic nonzero values in the sparse vector, which represent the bearing fault transients in the original signal. In Figure 4(d), the fault characteristic frequency of the bearing is obtained as 141.1 Hz, close to the theoretical value ($f_0 = 1/T_1 = 142.9$ Hz). Figure 4(e) shows the estimated bearing fault signal with $\eta_{\mathrm{opt1}} = 1.716$.
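The envelope spectrum used to read off the fault characteristic frequency can be computed, for example, as in the sketch below. It assumes that the envelope is the magnitude of the analytic signal obtained with an FFT-based Hilbert transform and that the mean is removed before the spectrum is taken; the paper does not state how its envelope spectrum is computed, and the function name `envelope_spectrum` is hypothetical.

```python
import numpy as np


def envelope_spectrum(x, fs):
    """Return (freqs, amplitude) of the spectrum of the Hilbert envelope of x."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)                 # one-sided weights that build the analytic signal
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    envelope = np.abs(np.fft.ifft(X * h))
    envelope = envelope - envelope.mean()      # drop the DC component
    amplitude = np.abs(np.fft.rfft(envelope)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitude


if __name__ == "__main__":
    fs = 10000.0
    t = np.arange(10000) / fs
    # 2 kHz carrier whose amplitude is modulated at 100 Hz
    x = (1.0 + 0.5 * np.cos(2 * np.pi * 100 * t)) * np.cos(2 * np.pi * 2000 * t)
    freqs, amp = envelope_spectrum(x, fs)
    print(freqs[np.argmax(amp)])               # modulation frequency
```

For a train of transients repeating every $T_1 = 0.007$ s, the dominant line of such a spectrum is expected near $1/T_1 = 142.9$ Hz, with the exact reading limited by the frequency resolution of the record.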
Removing the estimated bearing fault signal from the original noisy compound signal gives the residual signal shown in Figure 5. As for the bearing, the first step is to obtain the optimal Morlet wavelet basis matched with the residual signal by correlation filtering, which is illustrated in Figure 6(a). The Morlet wavelet basis is then incorporated into the iterative process of SALSA, and the sparse coefficients representing the gear fault feature are obtained, as shown in Figure 6 ... simulated compound fault signal. The effectiveness of the proposed method is thus preliminarily verified.
To account for noise interference, the noise intensity $\sigma$ is increased gradually in order to analyze the robustness of the proposed method. Figures 7-9 show the extraction results when $\sigma$ is set to 0.4, 0.5, and 0.6, respectively. As shown in these figures, the bearing and gear faults can still be separated and extracted from the compound signal accurately. It can be concluded that the proposed method suppresses noise interference at least up to the highest noise intensity tested here, $\sigma = 0.6$.

Application to Gearbox Compound Fault Features Extraction
To further verify the effectiveness of the proposed method in practical engineering applications, data from a defective gearbox are analyzed. The test object is a single-stage transmission gearbox on a test-bed, as illustrated in Figure 10. The faulty gear is a helical gear, whose working parameters are listed in Table 1. The bearing used in the experiment is a taper roller bearing, model 30625, and its geometric parameters are listed in Table 2. Based on the known parameters, the theoretical fault frequency of the bearing outer race is calculated as 176.18 Hz. To obtain gearbox compound fault data, a crack 0.4 mm wide is cut in the outer race of the bearing by wire cutting to simulate a localized bearing fault, and half of one tooth of the driving gear is removed by electric spark machining to simulate a localized gear defect, as shown in Figure 11. Additionally, to reduce the influence of the propagation path, the sensor is placed on the bearing end.
The measured vibration signal with the compound fault is shown in Figure 12(a). From Figure 12(a), the characteristics of each fault cannot be identified clearly. Figure 12(b) gives its frequency spectrum, and Figure 12(c) is the envelope spectrum analysis of the original signal. In Figure 12(c), several different frequency components are present, so the location of the fault in the gearbox cannot be identified exactly. Therefore, the proposed method is employed to extract the different transient features from the noisy signal. According to the procedure of the proposed method, the first step is to obtain the optimal Laplace wavelet basis matched with the original measured signal by correlation filtering. Figure 13(a) gives the matched Laplace wavelet basis. The matched Laplace wavelet basis is then incorporated into the iterative algorithm SALSA. Figure 13(b) shows the sparse coefficients of the bearing. In Figure 13(b), there are successive nonzero values in the sparse vector, which represent the transients of the bearing outer race fault. Figure 13(c) is the reconstructed signal of the faulty bearing. Figure 13(d) gives the envelope spectrum analysis of the reconstructed signal, from which the feature frequency of the bearing outer race is found to be 174.1 Hz, close to the theoretical value of 176.18 Hz. This indicates that there is indeed a localized fault in the outer race of the bearing. Finally, Figure 13(e) shows the estimated bearing fault signal with $\eta_{\mathrm{opt1}} = 1.322$.
After removing the estimated bearing fault signal from the original signal, we can obtain the residual signal in Figure 14.Then the next steps of the proposed method are conducted and the extraction results of the defective gear fault feature are shown in Figure 15.

Conclusions
This paper proposes a novel method that represents gearbox compound fault features sparsely using different wavelet bases so as to separate the different faulty components from the compound signal. Based on sparse representation theory, the proposed method applies the numerical iterative algorithm SALSA under the Laplace wavelet basis and the Morlet wavelet basis in turn to solve the BPD problem, after which two sparse vectors are obtained one after the other. One vector represents the transient feature of the faulty bearing and the other represents the transient feature of the defective gear. As a result, the proposed method converts the gearbox compound fault features into a series of sparse coefficients, which facilitates gearbox fault diagnosis. Both the simulation study and the application to the measured gearbox compound fault data verify that the proposed method can separate and extract the compound fault features of the gearbox effectively.

Figure 1: Diagram of test-bed (the red arrow illustrates the propagation path of the compound fault signal).

Figure 2: Procedure of the proposed compound fault transients extraction method.

Figure 3: Simulated signals: (a) the simulated signal for faulty bearing, (b) the simulated signal for faulty gear, and (c) noisy compound fault signal.

Figure 4: Results of bearing fault signal: (a) optimal Laplace basis, (b) sparse coefficients, (c) reconstructed signal, (d) the envelope spectrum analysis of reconstructed signal, and (e) the estimated bearing fault signal.

Figure 7: Extraction results of the compound signal with $\sigma = 0.4$: (a) the noisy signal, (b) sparse coefficients of bearing fault component, (c) the envelope spectrum analysis of reconstructed bearing fault signal, (d) sparse coefficients of gear fault component, and (e) the envelope spectrum analysis of reconstructed gear fault signal.

Figure 8: Extraction results of the compound signal with $\sigma = 0.5$: (a) the noisy signal, (b) sparse coefficients of bearing fault component, (c) the envelope spectrum analysis of reconstructed bearing fault signal, (d) sparse coefficients of gear fault component, and (e) the envelope spectrum analysis of reconstructed gear fault signal.

Figure 9: Extraction results of the compound signal with $\sigma = 0.6$: (a) the noisy signal, (b) sparse coefficients of bearing fault component, (c) the envelope spectrum analysis of reconstructed bearing fault signal, (d) sparse coefficients of gear fault component, and (e) the envelope spectrum analysis of reconstructed gear fault signal.

Figure 10: Experimental gearbox in a test-bed.

Figure 12: The measured signal and its spectral analysis: (a) the measured signal with compound fault, (b) its frequency spectrum, and (c) its envelope spectrum analysis.

Figure 13: Results of bearing fault signal: (a) optimal Laplace basis, (b) sparse coefficients, (c) reconstructed signal, (d) the envelope spectrum analysis of the reconstructed signal, and (e) the estimated bearing fault signal.

Figure 15: Results of gear fault signal: (a) optimal Morlet basis, (b) sparse coefficients, (c) reconstructed signal, (d) the envelope spectrum analysis of the reconstructed signal, and (e) the estimated gear fault signal.

Figure 15(a) gives the optimal matched Morlet wavelet basis. Figure 15(b) gives the sparse coefficients. In Figure 15(b), there are successive nonzero values in the sparse vector. Figure 15(c) shows the reconstructed signal of the defective gear. Figure 15(d) is the envelope spectrum analysis of the reconstructed signal, from which the feature frequency is obtained as 25.6 Hz, close to the theoretical value of 24.67 Hz. This indicates that there is indeed a localized fault in the gear. Finally, Figure 15(e) shows the estimated gear fault signal with $\eta_{\mathrm{opt2}} = 2.193$.

Table 1: Working parameters of gears in the tested gearbox.

Table 2: Geometry of the tested bearing.