Title Fault Diagnosis Method of Gearbox Multifeature Fusion Based on Quadratic Filter and QPSO-KELM

Effective filtering and noise reduction, feature extraction and fault diagnosis, and prognostics technology are important to Prognostics and Health Management (PHM) of equipment. +erefore, a multifeature fusion fault diagnosis method based on the combination of quadratic filtering and QPSO-KELM algorithm is proposed. In the quadratic filtering, stable filtering can reduce the impact of noise and fast-kurtogram can filtrate fault frequency bands with rich fault information. +en, the time-domain, frequency-domain, and time-frequency parameters of the secondary filter signal are extracted. MSSST was used to analyze the filtered signal, and the time-frequency image was obtained.+e time-frequency parameter was extracted from the time-frequency image by 2DPCA, and all the extracted parameters are taken as the fusion fault feature of the gearbox. Finally, the fault feature parameters are taken as the training sample and testing sample of QPSO-KELM for training and testing to achieve the purpose of fault diagnosis. +e experimental results show that the proposed method can effectively filter the noise, complete the fault mode identification of gearbox, and improve the fault diagnosis accuracy better than other methods.


Introduction
As a key component of mechanical transmission system, gearbox failure will lead to equipment shutdown, economic losses, and even casualties. e earlier the gearbox fault is found and repaired in time, the less the loss will be caused. Gearbox is mainly composed of rotating shaft, bearing, and gear. Its common fault types are gear fault and bearing fault. e fault will produce an impact signal, but the background noise interference and complex propagation path make it difficult to judge the fault state of gearbox timely and accurately. In order to filter out the noise, extract the effective state information, and better monitor and diagnose the state of gearbox, the related research is being carried out gradually.
As the signal is inevitably disturbed by noise in the process of acquisition, filtering technology as a signal denoising method plays an important role in the follow-up research. At present, the commonly used methods of filtering and denoising can be divided into linear filtering and nonlinear filtering. Linear noise reduction algorithm is not suitable for processing complex vibration signals. erefore, the nonlinear filtering noise reduction method is the mainstream noise reduction method in the field of mechanical equipment fault diagnosis. e commonly used technologies include noise reduction method based on empirical mode decomposition (EMD) [1][2][3][4], noise reduction method based on wavelet transform [5][6][7], and noise reduction method based on manifold learning [8][9][10][11]. For example, Boudraa and Cexus [1,2] proposed a noise reduction method based on EMD threshold, Gu et al. [7] used wavelet analysis to denoise data, and Song et al. [11] and others used the manifold learning method to denoise. Although the above methods can achieve some effective results, the mechanical transmission system is often disturbed by impact noise in the motion process, and the impact noise can be better filtered by a stable filter.
Feature extraction technology is the core of fault diagnosis. e commonly used feature parameters include time-domain feature [12,13], frequency-domain feature [14], and timefrequency feature [15]. Time-domain feature is the feature information directly extracted from the time-domain signal, which can intuitively obtain the signal feature. e frequencydomain feature is extracted from the frequency spectrum obtained by Fourier transform. e time-frequency feature contains more information. e commonly used time-frequency analysis methods include short-time Fourier transform (STFT) [16,17], continuous wavelet transform (CWT) [18,19], and S-transform (ST) [20][21][22]. e S-transform inherits the advantages of STFT and CWT and develops on the basis of STFT and CWT. However, there are still some problems such as low resolution and energy leakage of the time-frequency images obtained by using ST technology.
Intelligent classification algorithm is widely used in the field of fault diagnosis, and its effect is good. Common intelligent algorithms include neural network [23][24][25], support vector machine (SVM) [26,27], extreme learning machine (ELM) [28,29], kernel extreme learning machine (KELM) [30], deep learning [31,32], and other methods. Among them, the KELM algorithm has stronger comprehensive advantages in sample demand, network generalization, calculation speed, and accuracy and is more suitable for solving gearbox fault diagnosis problems with high calculation speed and accuracy requirements. However, the classification ability of KELM network is affected by the value of kernel parameters and penalty coefficient, so it is necessary to optimize the above structural parameters. erefore, a method, using the quantum particle swarm optimization algorithm [33][34][35] to optimize the structural parameters of KELM (QPSO-KELM), is proposed in this paper. e algorithm based on QPSO can optimize the structural parameters of KELM with strong global search ability, so as to improve the learning speed and classification accuracy of KELM and improve the accuracy of gearbox fault diagnosis.
At present, most of the methods are not perfect in terms of diagnosis process and structure, and main problems are as follows: (1) e effect of signal filtering and noise reduction is not obvious enough, leading to the fault feature extraction which is not obvious enough.
(2) e common time-frequency analysis methods also have the problems of energy leakage and low resolution, which cannot extract effective two-dimensional features.
(3) e fault diagnosis accuracy of intelligent classification algorithm is not enough. To a large extent, the classification accuracy can be improved by improving the parameters. erefore, filtering noise reduction, feature extraction, and fault diagnosis technology are the focus of this paper [36].
In order to solve the above problems, denoising, feature extraction, and intelligent fault diagnosis technology are listed as the research priorities in this paper, and the following researches are carried out. e related steps involved are shown in Figure 1.
e working process is summarized as follows: (1) Quadratic filtering: collecting the signals of gearbox under different fault conditions, as the signals are susceptible to noise, the stable filter is first used for primary filtering, and then, the fast-kurtogram is used to filter frequency bands with rich fault information [37][38][39]

Stable Filter and Fast-Kurtogram. Materials and
Methods should be described with sufficient details to allow others to replicate and build on published results. Please note that publication of your manuscript implicates that you must make all materials, data, computer code, and protocols associated with the publication available to readers. Please disclose at the submission stage any restrictions on the availability of materials or information. New methods and protocols should be described in detail, while well-established methods can be briefly described and appropriately cited.
Research manuscripts reporting large datasets that are deposited in a publicly available database should specify where the data have been deposited and provide the relevant accession numbers. If the accession numbers have not yet been obtained at the time of submission, please state that they will be provided during review. ey must be provided prior to publication.
Interventional studies involving animals or humans and other studies requiring ethical approval must list the authority that provided approval and the corresponding ethical approval code.
Conventional signal processing methods usually filter noise by linear algorithms. But when the noise is impact and the trailing is serious, the linear filter cannot deal with it effectively. Stable filter is better for this kind of impact signal.
e standard additive noise model of a signal is as follows: 2 Mathematical Problems in Engineering where s t is the signal of interest and n t is the noise. Noise signals come from a wide range of sources, such as equipment operation, resonance, and environmental noise.
e distribution of such noise can be expressed by symmetrical stable distribution. Stable filter can filter out the noise to the maximum extent and recover the signal s t .
For the noise which conforms to the stable distribution, the nonlinear filtering technology based on the same steady-  state distribution model has the best processing effect. ese filters have good robustness and can also effectively deal with the situation of serious signal tailing distribution. Suppose ρ(x) � − log f(x) is the negative value of logarithmic density, and the cost function is defined as follows: e stable filter is defined as the minimum θ value of the cost function: When ρ(x) � − log f(x), the minimum value of cost function is the maximum likelihood estimation of parameter θ. When the distribution is the Gaussian distribution of α � 2, ρ(x) � − x 2 /2, and the minimum value θ LINEAR can be obviously got. When 0 < α < 2, filtering is nonlinear, and the minimum value of equation (3) must be found by numerical method.
A variety of filtering forms can be obtained by replacing the cost function equation (2). e simplest extension is to add a nonnegative weight ω i : In the detection problem, the known signal s 1 , . . . , s n is extracted from the signal interfered by stable distributed noise, which can be expressed as follows: x t � θs t + n t , t � 1, 2, 3, . . . .
If there is no known signal in the signal, θ � 0, if there is a known mode in the signal θ ≠ 0. e filtering process can be realized by constructing a real filter function: By the minimization equation (6), θ can be obtained, and θ is the estimated value of the extracted known signal.
After getting the stable-filtered signal, the fast-kurtogram is used to select the frequency band with rich fault information.
e basic theory of fast-kurtogram is as follows. Spectral kurtosis is a performance index reflecting the instantaneous impact strength. Its basic theory is to calculate the kurtosis value of each spectral line, so as to find the frequency band of impact [42]. e calculation equation is as follows: where c i k (n) is the complex envelope of signal x(n) at the corresponding central frequency e spectral kurtosis is proposed by Antoni et al. But the calculation of kurtosis is cumbersome and not conducive to engineering practice. erefore, Antoni further proposed the concept of fast spectral kurtosis. e fast spectral kurtosis is an inverted pyramid structure. e abscissa represents the frequency, the ordinate represents the number of decomposition layers, and the color depth of the image represents the spectral kurtosis under different central frequencies and bandwidths. According to the color depth of fast spectral kurtosis, the optimal bandwidth and center frequency can be obtained. en, the fault frequency band is extracted by band-pass filter, and then, the square envelope order spectrum of fault signal is obtained by square envelope analysis. e inverted pyramid structure of the fast-kurtogram is shown in Figure 2.

MSSST-2DPCA. For signal x(t), the S-transform of x(t)
can be defined as follows: where the result of S(t, f) represents the time spectrum of the signal; g(t − τ, f) is the Gaussian window function; t and f represent time and frequency, respectively; and τ is a time parameter that represents the position of the Gaussian window on the timeline. According to equation (9), synchronous squeezing S-transformation (SSST) is performed on signal x(t), and the calculation steps are as follows.
e coefficients of S-transform are S x (t, f), which can be obtained by equation (8).
Calculate the time partial derivative of the S-transform, and the equation is as follows: e instantaneous frequency ω x can be obtained by transforming equation (10), and the calculated result is By compressing the value of the interval [ω x − (1/2)Δω, ω x + (1/2)Δω] to ω x , we can get the synchronous compression transformation value SSST. e formula is as follows: e time-frequency analysis method of MSSST can be obtained by multiple iterations of the SSST, and the iterative equation of MSSST is as follows: 4 Mathematical Problems in Engineering SSST [1] � SSST, where SSST [1] is the SSST and N is the number of iterations For MSSST, it is only required to calculate ω [N] x (t, ξ) and substitute it into the first-order SSST, and the calculated result is the same as equation (13). erefore, the computational complexity of MSSST is similar to that of the firstorder SSST, and it is beneficial to improve the calculation speed.
e proposed MSSST is effective to solve energy leakage and low resolution of the spectrum.
After getting the time-frequency images, 2DPCA is used to extract the feature matrix. e theory of twodimensional principal component analysis (2DPCA) is as follows.
Suppose X is the n-dimensional column vector, A is the image matrix of m × n, and Y is the m-dimensional projection vector of A on X after linear transformation. e process can be expressed as follows: e trace of covariance matrix Y of S x is defined as the total divergence: In maximum total divergence criterion, when the divergence of the projected vector Y is the largest, the direction of vector X is the optimal projection direction, where the equation of S x is e covariance matrix expression of the image is as follows: According to equation (19), G t is a nonnegative definite n × n-dimensional matrix. Suppose there are M training images, the j-th image is represented as A j , and the mean value of all training images is recorded as A, then the divergence matrix expression is as follows: e normalized expression is as follows: e X obtained from the criterion formula of the maximum divergence is called the optimal projection axis. e optimal projection axis X is the eigenvector corresponding to the maximum eigenvalue of G t . In general, one optimal projection axis is not enough. erefore, the first d orthogonal unit eigenvectors with the largest eigenvalues are selected as the optimal projection axis, which can be expressed by the following equation: e optimal projection vector X 1 , ..., X d of 2DPCA is used to extract the two-dimensional features of the image. Suppose sample image A, according to the above equations,

QPSO-KELM.
KELM algorithm is adding kernel function into ELM, and the input weights and offsets in ELM are replaced by kernel function mapping, which makes the output of the ELM more stable and solves the over learning problems.
ELM is a single hidden layer feedforward neural network, and its hidden layer weight does not need to be adjusted by feedback regulation. e structure of the ELM is shown in Figure 3.
then the ELM training model can be expressed as where So β i , w i , and b i can be used to obtain equation (25), as follows: Equation (23) can also be expressed as a matrix: where H is the output matrix of the hidden layer, and the equation of β is as follows: where H + represents the inverse matrix of H, and β ′ is the optimal quadratic solution to β.
According to the structural optimization and ERM criteria, the output weight β of ELM can be determined, that is, where ξ i � [ξ i,1 , . . . , ξ i,m ] T represents the network calculation error, (1/2)‖β‖ 2 is the structural error, (1/2) N p�1 ‖ξ p ‖ 2 is the empirical error, and C is the penalty factor. rough the optimization of the solution, the optimal solution of β can be obtained as follows: erefore, the output model function of ELM is as follows: By using kernel operation to replace matrix operation H in ELM, the KELM classification model can be established. e kernel matrix is defined as follows: where Ω is the symmetric matrix of N × N and K(x, y) is the kernel function. Equation (30) can be expressed by kernel function as follows: where λ represents the output weight matrix of KELM. Different support vector machines can be constructed by selecting different kernel functions. Among the commonly used kernel functions, compared with the polynomial kernel function and sigmoid kernel function, the radial basis kernel function has the advantages of simple parameters and strong adaptability to randomly distributed samples. erefore, a  SVM based on radial basis kernel function is established in this paper, and its expression is shown as follows: According to equations (28) and (33), it can be seen that KELM contains kernel parameter and penalty coefficient C, and it is necessary to find the optimal parameter to make the classification effect of KELM the best.
By quantizing the iterative updating process of particles of PSO algorithm, the QPSO can reduce the algorithm complexity and improve algorithm convergence speed and global search ability. e basic principles of QPSO are as follows.
Assume that Ω is the d-dimensional search space, and the population number of particles in the space is b, then the position of the ith particle can be expressed as Suppose the individual optimal position of the particle is p ibest , and the global optimal position of the particle is p gbest ; the p ibest and p gbest are as follows: e particle can find and update its individual optimal position p i and population optimal position p g through iterative operation, and the average optimal position mbest can be introduced as the population optimal center. en, the particle optimization process can be expressed as where α is the contraction expansion factor, which is dynamically adjusted in the iterative operation according to the following equation: where N represents the maximum number of iterations. α 1 and α 2 are the initial and final values of α, respectively; usually, the two parameters are set as α 1 � 0.5 and α 2 � 1.
According to the basic principle of KELM and QPSO, the network training process which used QPSO to optimize the network structure parameters of KELM is as follows: Step 1. Initialize the position of population particles, and set parameters such as particle swarm size, iteration step size, and termination conditions.
Step 2. Initialize the current position of each particle as p ibest , define the classification accuracy of KELM as fitness function, and calculate the fitness value of each particle. e position of the particle with the maximum fitness value was initialized as p gbest .
Step 4. e fitness value of each particle is calculated, and the individual optimal position p best , the group optimal position p gbest , and the group optimal center mbest are updated based on the optimal fitness value.
Step 5. Judge whether the termination conditions are met. If so, stop the calculation and output the result; if not, return to step 3.
According to the above steps, the modeling flow chart of QPSO-KELM can be drawn, as shown in Figure 4.

Experimental Verification and Discussion
To verify whether the proposed method is available, three preset fault tests of the gearbox are conducted. e fault simulation test rig is shown in Figure 5. e test rig consists of four parts: the power and control part, the bearing fault simulation part (not used), the gear fault simulation part, and the data acquisition part (not shown).
is section mainly uses the gearbox fault simulation part. e perspective and internal structure of the gearbox are shown in Figure 6. e teeth number in the gearbox from high-speed shaft to low-speed shaft is 41, 79, 36, and 90. e fault gear is gear 3 with 36 teeth. e fault types include wear, break, and miss teeth. e fault bearing is located in the deep groove ball bearing at the end cover of the gear 3 side of the medium-speed shaft (Figure 6(b)). e deep groove ball bearing type is ER-16k, and its fault types include inner race fault and outer race fault.
In the test, two vibration acceleration sensors are set in vertical and horizontal directions, respectively ( Figure 6(c)). e data acquisition parameters are set as follows: sampling frequency fs � 20.48 kHz, sampling time t � 48 s, and motor speed 20 r/s. e main dimension parameters of the bearing ER-16K are shown in Table 1. According to the speed of the motor and the parameters of the gearbox and the bearing ER-16K, the relevant frequency can be calculated. e calculation results are shown in Table 2.
f r1 is the rotating frequency of high-speed axis, f r2 is the rotating frequency of medium-speed axis, and f r3 is the rotating frequency of low-speed axis; f m1 is the meshing frequency of gear 1 and gear 2, and f m2 is the meshing frequency of gear 3 and gear 4; f BPFO is the fault characteristic frequency of the bearing outer race, f BPFI is the fault characteristic frequency of the bearing inner race, and f BSF is the fault characteristic frequency of the bearing ball spin.

Denoising with Quadratic Filter.
e quadratic filtering method can effectively filter out the interference noise in the signal and select the fault frequency band with rich fault information. Two cases of bearing inner race fault and gear Mathematical Problems in Engineering broken tooth fault are selected as fault cases in the quadratic filtering part.
Firstly, the stable filter is used to filter out the noise, and the detailed process is as follows.
e fault signal of bearing inner race and gear broken tooth are filtered with stable filter, and the filtering parameters are set as α � 1.5, β � 0, and c � 1; then, the filtered spectrum can be obtained, as shown in Figure 7.
Comparing with the spectrum before and after filtering, the signal-to-noise ratio (SNR) is improved with a stable filter. e SNR of bearing inner race fault before filtering is − 9.3491, the SNR after filtering is increased to − 8.2371, the SNR before filtering is − 8.7745, and the SNR after filtering is − 8.1016. Comparing with the spectrum before and after filtering, it can be seen that the trailing part of the signal is reduced, which indicates that the stable filter is effective to reduce noise.
Secondly, the fast-kurtogram technology is used to process the filtered signals, and the fault frequency band with rich fault information can be obtained. e fastkurtogram diagram and square envelope results of bearing inner race fault and tooth break fault are shown in Figure 8. Figure 9 is the square envelope spectrum of bearing inner ring fault and gear broken tooth after quadratic filtering. It can be seen from Figure 9 that the feature frequency of bearing inner race fault is obvious, and the bearing inner race fault can be judged according to the fault feature frequency. In the square envelope spectrum of tooth broken fault, the meshing frequency, frequency doubling, and the surrounding side band can be observed, so as to judge the gear tooth broken fault. e above results show that the fault information of gearbox can be reflected to a great extent through the signal analysis after quadratic filtering.

Feature Images Extraction with MSSST-2DPCA.
Time-frequency analysis is carried out on the filtered inner race fault signal of bearing and tooth break fault signal, and the time-frequency analysis results of STFT, CWT, ST, and MSSST are compared. e time-frequency analysis results are shown in Figures 10 and 11.
From the time-frequency analysis images, it can be seen that the order of time-frequency resolution and energy aggregation is MSSST > ST > CWT > STFT, both in bearing inner race fault or gear broken tooth fault.
After getting the time-frequency images, 2DPCA is used to extract the six cases of normal, tooth miss, tooth break, tooth wear, and bearing inner race fault and outer race fault. e parameter settings are as follows: the original image size is 258 * 330, and the image size after 2DPCA feature extraction is 72 * 99. By using the MSSST method, 960 images are generated for each case, and a total of 5760 images can be obtained. en, the 2DPCA algorithm is used to extract the feature matrix of the image. Comparing STFT and MSSST, and the results are as follows. e images in Figure 12 and 13 are the time-frequency features of the filtered fault signal. Comparing the results of 2DPCA-STFT and 2DPCA-MSSST, it can be seen that the feature extracted by 2DPCA-MSSST contains more abundant fault information.

Feature Fusion.
In the previous paper, 19 feature parameters are extracted by calculation, and the feature matrix corresponding to the time-frequency image is obtained by using MSSST-2DPCA technology. e fusion features are Initialize particle swarm size, evolutionary algebra, and particle position (C, σ)

Determine particle fitness function (the test classification accuracy of KELM)
Calculate the fitness function value of each particle Update individual optimal location, group optimal location, and group optimal center Output the optimal parameter combination of KELM(C, σ) Whether the termination conditions are satisfied?
No Yes Figure 4: Flow chart of QPSO-KELM. QPSO is used to search global optimal parameters of KELM. 8 Mathematical Problems in Engineering obtained by fusing the feature parameters and the feature matrix. e specific steps are as follows: (1) Since the size of the feature image is 72 * 99, in order to ensure the same length of the feature vector, 99 segments of the same length signal are intercepted from the collected signals, 19 feature parameters are extracted, respectively, and the feature parameter matrix of 19 * 99 can be obtained. e corresponding image is shown in Figure 14.

Fault Diagnosis Based on QPSO-KELM.
e time-domain, frequency-domain, and time-frequency feature parameters obtained above are summarized; then, 960 groups of features can be obtained, and the features are randomly divided into training samples and testing samples according to the ratio of 3 :1; the training samples number is 720, and the testing samples number is 240. e 720 training samples are used to train the QPSO-KELM. After training, the 240 testing samples are input into the QPSO-KELM, and then, the test results are obtained, as shown in Figure 15. ere is no error point in Figure 15, so the classification accuracy is 100%.

Comparison and Discussion
In order to verify the effectiveness of the proposed method, ELM and PSO-KELM algorithms are used to compare with the QPSO-KELM algorithm. Meanwhile, in order to verify the noise filtering effect of quadratic filter, the fault classification accuracy before filtering is also analyzed based on the above three methods, which is compared with the fault diagnosis effect after filtering. e experimental results are shown in Table 4 and Figure 16.  In Figure 16 and Table 4, it can be seen that, among all the classification methods, QPSO-KELM has the best classification effect and the highest fault diagnosis accuracy. By comparing the fault diagnosis accuracy before and after filtering, it can be concluded that no matter which method is used, the fault diagnosis accuracy after filtering has been improved.
erefore, the effectiveness of the proposed method is proved.

Conclusion
In this paper, a fault diagnosis method based on gearbox multifeature fusion, quadratic filter, and QPSO-KELM algorithm is proposed. rough the analysis of the experimental results, the following conclusions can be drawn: after stable filtering of the collected signal, the noise interference can be effectively filtered out, and the SNR can be improved. Combined with the method of fast-kurtogram, the fault frequency band with rich fault characteristic signals can be screened out, which is also conducive to the improvement of fault diagnosis accuracy. By extracting the time-domain, frequency-domain, and time-frequency characteristic parameters, more fault characteristic information can be extracted, which is beneficial to improve the accuracy of fault diagnosis. e MSSST method is proposed to improve the time-frequency analysis method, which can effectively solve the problems of energy leakage and low resolution. e KELM algorithm optimized by QPSO, to a large extent, improves the learning ability and classification accuracy of the KELM algorithm. Compared with other methods, the classification effect is better. Although the method proposed in this paper is better than the traditional fault diagnosis method, it still needs to continue to explore in some aspects, looking for some new fault characteristic parameters to make it more conducive to monitor the current state of the equipment. In this paper, only some single fault modes of the gearbox are analyzed, but in the actual situation, the compound fault of gearbox is more common. So, the next step of the study is to extract more sensitive and effective fault feature parameters and try to analyze more complex gearbox failure.

Data Availability
e experiment data are obtained through the laboratory bench of our unit, which is not open to the public.

Conflicts of Interest
e authors declare that there are no conflicts of interest.