An Intelligent Diagnostic Method for Multisource Coupling Faults of Complex Mechanical Systems

In real operating environments, mechanical system faults are difficult to diagnose accurately because fault types are complicated, fault locations are random, and minor fault signals are inconspicuous. This paper proposes a new adaptive multisensor fault diagnosis method for bearing-gear systems based on GAF/MTF (Gramian angular fields and Markov transition fields) and ResNet (deep residual network). First, we establish a multisensor signal acquisition system to monitor the running signals of the bearing-gearbox composite test bench in real time. The faulty parts include multiple types of composite faults of different sizes, different fault types, and different transmission stages. Second, based on GAF/MTF, the multichannel time series collected by the acquisition system are converted into multichannel pictures, which are then fused and compressed into three-channel pictures. Finally, we input these pictures into ResNet for fault diagnosis. The experimental results show that the GAF/MTF-ResNet model has a recognition accuracy of 72.14% on a test set with a total of 520 classification labels under different motor speeds, different sampling times, and different types of faults. Among them, the accuracy for the motor speed and sampling time is close to 100%, and the accuracy for gearbox failure and bearing failure is 75.25% and 88.97%, respectively. This shows that the method provides new ideas for the composite fault diagnosis of mechanical systems under different working conditions and different types of faults and has theoretical guiding significance.


Introduction
With the progress of science and technology and the rapid development of the economy, mechanical equipment plays an increasingly irreplaceable role in people's daily life. This puts forward more stringent requirements for the safety and stability of the equipment. Prognostics and health management (PHM), as an extremely critical part of equipment operation and maintenance, has attracted widespread attention [1,2].
Bearings and gearboxes, as basic components of mechanical equipment, are used in large numbers, require high precision, and work in complex operating environments. Their failure directly affects the overall operating status of the equipment, and they have become a research hotspot in the field of PHM [3,4]. However, research on early failures, composite failures, and multiple types of failures of these components is still insufficient [5,6].
For the diagnosis of single-component failures, Sun and Jia proposed a new method based on experimental data-driven random fuzzy evidence acquisition and intuitionistic fuzzy set fusion [7]. Meng et al. calculated the time-varying mesh stiffness of gear teeth with different crack lengths; their results show that the impulse factor is sensitive to fault characteristics [8]. Shao et al. used transfer learning for bearing fault diagnosis and achieved good results [9]. Although these studies perform well in the fault diagnosis of a single component, they do not consider the complexity of the mechanical system or the compound faults caused by the coupling of various components in actual operation.
Gearboxes have become the main platform for compound failure experiments due to their wide range of applications and simple structure [10]. Compared with the signal generated by the failure of a single component, the signals generated by the compound failure of a gearbox (vibration, sound, current, torque, and others) are more diverse and complex. These signals behave differently and require different sensors to collect, and different sensors are more sensitive to specific fault types. Therefore, the fusion of signals from multiple sensors is a key link in gearbox fault diagnosis [11]. Feature extraction and feature selection for different types of signals have become two major difficulties in multisensor information fusion [12,13].
The neural network has the advantages of self-adaptation and high precision and has been widely used in various fields [14,15]. As a typical neural network structure, the convolutional neural network (CNN) [16] has been used in image recognition [17], pose estimation [18], and other fields. Deep convolutional neural networks (DCNNs) [19] are obtained by further increasing the number of layers of the CNN structure to meet higher precision requirements. Studies have found that DCNNs achieve better recognition accuracy than CNNs [20-22]. The deep residual network (ResNet), a kind of DCNN, solves the degradation problem of deep networks [23]. Therefore, it can be used to build very deep networks, for example with a depth of 101 layers, to solve complex problems [24,25].
In the field of mechanical fault diagnosis, DCNNs have also been applied successfully. Shao et al. used the wavelet transform to generate time-frequency distribution (TFD) images from multisensor signals and then used a DCNN to learn discriminative representations from the TFD images to diagnose faults of asynchronous motors [26]. Jing et al. proposed an adaptive multisensor data fusion method based on DCNN for fault diagnosis; the results demonstrate that the proposed method can detect the conditions of a planetary gearbox effectively [27].
Although these studies have investigated the fault diagnosis of mechanical components in depth, the following problems remain [28]:
(1) Much research exists on the fault diagnosis of a single component of the mechanical system, but research on coupling faults is not in depth. In particular, coupling failures of components in different transmission stages have not been studied [26,27].
(2) The operating conditions are single, the fault types are few, and the diagnostic results are single [29], so the complex operating state in actual operation cannot be simulated well.
(3) The classical neural network model suffers from vanishing or exploding gradients when dealing with long time series, which reduces the accuracy of fault diagnosis.
In response to these problems, we built a multisensor acquisition system on the bearing-gearbox composite test bench, including three-way accelerometers, microphones, current sensors, torque sensors, and rotary encoders. The designed fault types include faults of different sizes and types, individual faults, composite faults in the same transmission stage, and composite faults in different transmission stages. The GAF/MTF algorithm is then used to fuse the multidimensional time series into two-dimensional RGB images, which are used to train ResNet in order to find the optimal model. Through the GAF/MTF algorithm, the complex multidimensional time series task is transformed into a two-dimensional image classification task suitable for neural networks, and the multilayer advantage of ResNet can better deal with the transformed images. Finally, multitype coupling fault diagnosis of the multisensor bearing-gear system is realized, with a test accuracy of 72.14% over 520 classification labels.

Gramian Angular Field and Markov Transition Field.
Deep learning performs well in computer vision and pattern recognition but poorly in processing time series. In response to this problem, Wang and Oates proposed the GAF/MTF method for encoding time series into pictures [30]. This method can encode any type of time series into Gramian angular summation field (GASF), Gramian angular difference field (GADF), and Markov transition field (MTF) pictures, respectively, with very little loss of information after encoding.
Given a time series $X = \{x_1, x_2, \ldots, x_n\}$ of length $n$, the values of $X$ are rescaled to the interval $[-1, 1]$ by min-max scaling:

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)}. \qquad (1)$$

The rescaled time series $\tilde{X}$ is then re-encoded in polar coordinates by Formula (2):

$$\phi_i = \arccos(\tilde{x}_i), \quad -1 \le \tilde{x}_i \le 1, \quad \tilde{x}_i \in \tilde{X}; \qquad r_i = \frac{t_i}{N}, \quad t_i \in \mathbb{N}, \qquad (2)$$

where the angle $\phi_i$ is the arc cosine of $\tilde{x}_i$ and lies in $[0, \pi]$, the radius $r_i$ maps the timestamp $t_i$ of $x_i$ onto the interval $[0, 1]$, and $N$ is a constant factor that regularizes the span of the polar coordinate system and is related to the number of timestamps in the time series.
The encoding of Formula (2) has the following advantages. The mapping from the time series to polar coordinates is bijective; that is, given a time series point $x_i$, there is exactly one corresponding point $(\phi, r)$ in the polar coordinate system. In contrast to the Cartesian coordinate system, the radius $r$ preserves the time dependence in the polar coordinate system, which prevents the loss of the time label. Therefore, in polar coordinates, we can identify the temporal correlation within different time intervals by calculating the trigonometric sum or difference between each pair of points. We define GASF and GADF as follows:

$$\mathrm{GASF} = [\cos(\phi_i + \phi_j)] = \tilde{X}^{\top} \tilde{X} - \left(\sqrt{I - \tilde{X}^2}\right)^{\top} \sqrt{I - \tilde{X}^2},$$

$$\mathrm{GADF} = [\sin(\phi_i - \phi_j)] = \left(\sqrt{I - \tilde{X}^2}\right)^{\top} \tilde{X} - \tilde{X}^{\top} \sqrt{I - \tilde{X}^2},$$

where $I$ is the unit row vector $[1, 1, \ldots, 1]$ and the subscripts $i$ and $j$ of $\phi$ index the rows and columns of the GASF and GADF matrices, respectively. After converting to the polar coordinate system, we regard the time series at each time step as a one-dimensional metric space. By defining the inner products $\langle x, y \rangle = x y - \sqrt{1 - x^2}\sqrt{1 - y^2}$ and $\langle x, y \rangle = \sqrt{1 - x^2}\, y - x \sqrt{1 - y^2}$, the two GAFs are actually quasi-Gramian matrices $G_{i,j} = \langle \tilde{x}_i, \tilde{x}_j \rangle$.
The main diagonals of GASF and GADF are

$$\mathrm{GASF}_{i,i} = \cos(2\phi_i) = 2\tilde{x}_i^2 - 1, \qquad \mathrm{GADF}_{i,i} = \sin(0) = 0.$$

GAFs have several advantages. First, they preserve the temporal dependence, because time increases as the position in the matrix moves from the upper left corner to the lower right corner. Second, they contain temporal correlations, because $G_{(i,j \mid |i-j| = k)}$ represents the relative correlation, by superposition or difference of directions, with respect to the time interval $k$. The main diagonal $G_{i,i}$ is the special case $k = 0$ and contains the original value or angle information. Finally, the time series can be reconstructed from the main diagonal. However, when the length of the original time series is $n$, the Gramian matrix has size $n \times n$, which leads to large GAFs.
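The GASF and GADF definitions can be illustrated with a minimal NumPy sketch. It assumes a non-constant one-dimensional series; the function names `rescale` and `gramian_angular_fields` and the sine test signal are hypothetical choices for illustration and are not taken from the original implementation.

```python
import numpy as np

def rescale(x):
    """Min-max rescale a 1-D series to [-1, 1] (assumes max(x) > min(x))."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return ((x - hi) + (x - lo)) / (hi - lo)

def gramian_angular_fields(x):
    """Return (GASF, GADF) as n x n arrays for a 1-D series x."""
    x_tilde = np.clip(rescale(x), -1.0, 1.0)    # guard against rounding outside [-1, 1]
    phi = np.arccos(x_tilde)                     # polar angle in [0, pi]
    gasf = np.cos(phi[:, None] + phi[None, :])   # entry (i, j) is cos(phi_i + phi_j)
    gadf = np.sin(phi[:, None] - phi[None, :])   # entry (i, j) is sin(phi_i - phi_j)
    return gasf, gadf

# Illustrative input: two periods of a sine wave sampled at 64 points.
series = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
gasf, gadf = gramian_angular_fields(series)
```

In this sketch the GASF is symmetric and the GADF is antisymmetric with a zero diagonal, which follows directly from the sum and difference of angles in the definitions.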
Similarly, for the time series $X = \{x_1, x_2, \ldots, x_n\}$, we define the MTF as follows:

$$M = \begin{bmatrix} w_{ij \mid x_1 \in q_i,\, x_1 \in q_j} & \cdots & w_{ij \mid x_1 \in q_i,\, x_n \in q_j} \\ \vdots & \ddots & \vdots \\ w_{ij \mid x_n \in q_i,\, x_1 \in q_j} & \cdots & w_{ij \mid x_n \in q_i,\, x_n \in q_j} \end{bmatrix}.$$

By dividing the data into $Q$ quantile bins, a $Q \times Q$ Markov transition matrix $W$ can be established. Here, $q_i$ and $q_j$ ($q \in [1, Q]$) are the quantile bins that contain the data at timestamps $i$ and $j$ (temporal axis), respectively, and $w_{i,j}$ is given by the frequency with which a point in quantile $q_j$ is followed by a point in quantile $q_i$. By taking the temporal positions into account, the matrix $W$, which contains the transition probabilities on the amplitude axis, is expanded into the MTF matrix $M$. The main diagonal $M_{ii}$, the special case $k = 0$, is the probability of each quantile transitioning to itself (the self-transition probability). Compared with the Markov transition matrix $W$, which is constructed by directly counting the transitions between quantiles along the time axis as a first-order Markov chain, the MTF matrix $M$ is more sensitive to the distribution of $X$ and to the temporal dependence on the time steps $t_i$ [30].
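The construction of the MTF can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the original implementation: the parameter `n_bins` plays the role of $Q$, rows of `W` index the current bin and columns the next bin, and each row is normalized to sum to one. The opposite indexing convention is obtained by transposing `W`.

```python
import numpy as np

def markov_transition_field(x, n_bins=8):
    """Return the n x n MTF of a 1-D series x using n_bins quantile bins."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges; np.digitize maps each sample to a bin index 0..n_bins-1.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    # First-order transition counts: W[a, b] counts steps from bin a to bin b.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (bins[:-1], bins[1:]), 1.0)
    # Normalize each row to probabilities (rows without outgoing steps stay zero).
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    # Spread W along the time axis: M[k, l] = W[bin of x_k, bin of x_l].
    return W[bins[:, None], bins[None, :]]

# Illustrative input: a random walk of 128 samples.
rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(128))
mtf = markov_transition_field(series, n_bins=8)
```

The output has one row and one column per time step, so its size grows as $n \times n$ in the same way as the GAFs.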

Multidimensional Time Series Imaging.
We introduce the GAF/MTF framework into multisensor data fusion. First, the GAF/MTF unit is established, and then the multidimensional time series imaging framework is built on this unit. The framework synchronously compresses and fuses the time series collected by different sensors into RGB images of a specified size according to the timestamp. GASF, GADF, and MTF conversions are performed on the time series $X = \{x_1, x_2, \ldots, x_n\}$ of length $n$, and a three-dimensional matrix $I$ of size $[n, n, 3]$ is obtained. The layers along the third dimension of $I$ correspond, from top to bottom, to the GASF, GADF, and MTF conversion results. Because the size of $I$ grows with the length $n$ of the time series $X$, a large $n$ makes $I$ too large for subsequent calculations. Therefore, $I$ is downsampled with a decimation filter to obtain a matrix $I'$ of size $[n', n', 3]$. The GAF/MTF unit is shown in Figure 1, where the input is a single-channel time series and the output is expanded into three two-dimensional single-channel pictures, represented as grayscale images.
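The downsampling step can be illustrated with a short sketch. The text does not specify the decimation filter, so the block averaging used here (a box filter followed by subsampling) is an assumption, as are the function name `downsample` and the requirement that $n$ is a multiple of $n'$.

```python
import numpy as np

def downsample(image, out_size):
    """Shrink an [n, n, c] array to [out_size, out_size, c] by averaging
    non-overlapping k x k blocks, where k = n // out_size."""
    n, _, c = image.shape
    if n % out_size != 0:
        raise ValueError("n must be a multiple of out_size in this sketch")
    k = n // out_size
    # Split both spatial axes into (block index, offset within block) and average the offsets.
    return image.reshape(out_size, k, out_size, k, c).mean(axis=(1, 3))

# Placeholder [n, n, 3] stack standing in for the GASF, GADF, and MTF layers.
stack = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
small = downsample(stack, 4)
```

Averaging before subsampling acts as a simple low-pass filter, which limits aliasing compared with keeping every k-th pixel.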
Furthermore, for the multichannel time series $\{X_1, X_2, \ldots, X_m\}$, the number of channels is $m$ and the signal length is $n$. After GAF/MTF conversion, a matrix set containing $m$ pictures of size $[n', n', 3]$ is obtained. In Formulas (4)-(6), the larger the value of the calculation result, the more obvious the feature. We therefore propose the concept of vertical maximum pooling: a maximum pooling window with a depth of $m$ is placed at each corresponding position of the multidimensional pictures, with the window size and sliding step set to 1. By taking the maximum value in each window, the size of the matrix set is reduced to $[n', n', 3]$. After that, the GASF, GADF, and MTF layers are placed on the red, green, and blue layers, respectively, and saved as an RGB image. The flowchart is shown in Figure 2.
Through this method, the time series classification problem is converted into an image classification problem, which bypasses the weakness of deep learning in processing time series signals. At the same time, the method realizes feature-level fusion of the signals and reduces the data dimension, which benefits the subsequent calculations. The GAF/MTF conversion requires no hyperparameters, which avoids the interference caused by manual feature selection.
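The vertical maximum pooling and the RGB assignment can be sketched as follows. The element-wise maximum over the sensor axis follows the description above; the per-layer scaling to 8-bit pixel values in `to_rgb` is an assumption made for illustration, since the text does not state how the values are mapped to pixel intensities. Both function names are hypothetical.

```python
import numpy as np

def fuse_channels(pictures):
    """Vertical max pooling over m per-sensor arrays of shape [n', n', 3].
    Returns one [n', n', 3] array holding the element-wise maximum across sensors."""
    stack = np.stack(pictures, axis=0)       # shape [m, n', n', 3]
    return stack.max(axis=0)

def to_rgb(fused):
    """Scale each layer to 0..255; layer 0 (GASF) -> red, 1 (GADF) -> green, 2 (MTF) -> blue."""
    fused = np.asarray(fused, dtype=float)
    lo = fused.min(axis=(0, 1), keepdims=True)
    hi = fused.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero for flat layers
    return np.round(255.0 * (fused - lo) / span).astype(np.uint8)

# Placeholder data: m = 10 sensor pictures of size [32, 32, 3].
rng = np.random.default_rng(0)
pictures = [rng.uniform(-1.0, 1.0, size=(32, 32, 3)) for _ in range(10)]
fused = fuse_channels(pictures)
rgb = to_rgb(fused)
```

Because the pooling window has a spatial size of 1 and a depth of $m$, the output keeps the spatial size of each input picture while the sensor axis collapses to a single picture.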

Architecture of the Deep Residual Network.
For two-dimensional image classification, relatively mature neural networks are already available, including AlexNet [31], GoogLeNet [32], and VGG [33]. These are deep convolutional networks. As the requirements for image classification accuracy and data volume have grown, network models have become deeper and their learning ability has improved further. However, because of the degradation problem, a deeper network may have a higher error rate than a shallower one [34]. In response to this problem, He et al. proposed the deep residual network (ResNet), which differs from previous networks in its special "shortcut connection" [23]. Figure 3 shows the two types of shortcut connections.
Through this structure, the input x of the block and the output F(x) of its stacked layers are added element-wise. This simple addition adds no extra parameters and almost no computation to the network, but it can greatly improve the speed and quality of model training. As the number of layers increases, this simple structure handles the degradation problem well.
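The element-wise addition of x and F(x) can be illustrated with a minimal NumPy sketch of a block with an identity shortcut. For brevity, the residual branch uses two dense layers instead of the convolutional layers of ResNet; this simplification, the layer width of 16, and all names are assumptions for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x) with F(x) = relu(x @ w1) @ w2 (identity shortcut)."""
    fx = relu(x @ w1) @ w2   # residual branch F(x); keeps the feature width of x
    return relu(fx + x)      # element-wise addition with the shortcut, then ReLU

rng = np.random.default_rng(0)
x = relu(rng.standard_normal((4, 16)))      # non-negative activations, batch of 4
w1 = 0.1 * rng.standard_normal((16, 16))
w2 = 0.1 * rng.standard_normal((16, 16))
y = residual_block(x, w1, w2)

# With the residual branch switched off (zero weights), the block passes x through unchanged.
y_identity = residual_block(x, np.zeros((16, 16)), np.zeros((16, 16)))
```

The second call shows why extra blocks need not hurt a deeper network: a block can fall back to the identity mapping simply by driving its residual weights toward zero.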

Training Network Parameter.
In the process of model building, the training time of the network is the largest part of the time cost. We introduce several mechanisms to improve the training speed of the network, including the stochastic gradient descent with momentum (SGDM) solver, L2 regularization, and minibatches. The SGDM solver combines the stochastic gradient descent (SGD) solver with first-order momentum, which imitates the concept of momentum in physics: instead of the current gradient alone, it uses an accumulation of previous gradients [35]. The solver accelerates well in the early stage of the descent, can jump out of local minima in the middle and late stages, and suppresses oscillation and accelerates convergence when the gradient changes direction. L2 regularization adds an L2 norm term to the original loss function. This constraint imposes a large penalty on sparse, peaked weight vectors and prefers diffuse, uniform parameters [36]. It encourages each neural unit to use all of the inputs from the previous layer instead of only a part of them, which helps to avoid overfitting. The minibatch approach divides a large training set into several small training sets, and only the samples in one small training set are used in each training step [37]. Compared with batch gradient descent, which trains with all samples at once, minibatch gradient descent updates the parameters faster. It is conducive to more robust convergence, helps to avoid local optima, and limits the amount of data loaded into the network at a time, which reduces the hardware demand.
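The interplay of the three mechanisms can be sketched on a small linear regression problem. This is an illustrative NumPy sketch, not the training code of the model; the function name `sgdm_fit`, the hyperparameter values, and the synthetic data are assumptions.

```python
import numpy as np

def sgdm_fit(X, y, lr=0.02, momentum=0.9, l2=1e-4, batch_size=16, epochs=50, seed=0):
    """Fit linear weights w (y ~ X @ w) with minibatch SGD, momentum, and L2 regularization."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    v = np.zeros(d)                               # velocity: decayed sum of past gradients
    for _ in range(epochs):
        order = rng.permutation(n)                # reshuffle, then cut into minibatches
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w - y[idx]
            grad = X[idx].T @ err / len(idx)      # gradient of 0.5 * mean squared error
            grad += l2 * w                        # gradient of the penalty 0.5 * l2 * ||w||^2
            v = momentum * v - lr * grad          # momentum update
            w = w + v
    return w

# Synthetic data with known weights and a small amount of noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.standard_normal(200)
w_hat = sgdm_fit(X, y)
```

Each parameter update touches only `batch_size` samples, which is what keeps the memory demand per step independent of the size of the full training set.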

Adaptive Diagnostic Model for Multicondition Composite Fault of the Bearing-Gear System.
An adaptive multisensor fault diagnostic model for the bearing-gear system based on GAF/MTF and ResNet is proposed. Through adaptive feature-level data fusion with GAF/MTF, this model can fuse time series of different lengths and types, collected by sensors at different sampling frequencies, into two-dimensional images. All two-dimensional images have the same size and number of channels. In theory, this fusion method can fuse arbitrary sensor data. The subsequent ResNet, through its very deep network structure and special residual mechanism, can mine deep fault features in a relatively short time and make decisions to obtain the final diagnostic result.
As the network depth increases, both the model accuracy and the training time increase. Therefore, we selected ResNet18, which contains 18 weighted layers (17 convolutional layers and one fully connected layer), as the representative deep network for the model.
The flowchart of the model is shown in Figure 4. First, different types of sensors collect the experimental data. The sensors include three-way accelerometers, microphones, current sensors, torque sensors, and rotary encoders, and their sampling frequency is related to the motor speed. The collected experimental data are then preprocessed, including operations such as deleting invalid signals and randomly dividing signal segments. Through these operations, the n-channel sensor data can be preprocessed to obtain a matrix of size

Shock and Vibration
In Figure 4, the input of the model is the multichannel time series collected by the sensors, and the output is the diagnostic result, which realizes end-to-end diagnosis. During operation, the model adjusts its internal structure adaptively without manual intervention. The adaptive steps include the automatic fusion of time series of different lengths into pictures of equal size in the GAF/MTF conversion and, in the training of ResNet, the adaptive adjustment of hyperparameters, which facilitates the sharing of parameters in network operation.

Construction of a Test Bed.
The mechanical part of the bearing-gear system fault diagnostic test bench includes a drive motor, a bearing seat with replaceable bearings, and a planetary gearbox, as shown in Figure 5. The bearing is fixed in the bearing seat by the bearing end cover and can be replaced. The planetary gearbox includes a sun gear at the driving end, three planet gears surrounding the sun gear, and a ring gear fixed to the housing. The motor drives the bearing and the sun gear to rotate synchronously through the transmission shaft, and the planet gears are driven through meshing with the sun gear. There is a long transmission distance between the bearing and the gearbox. By replacing the faulty parts of the bearing and the gearbox, compound failures in different transmission stages can be simulated.
In this study, we built a multisensor acquisition system, as shown in Figure 6. The acquisition system includes two three-way accelerometers, two microphones, torque sensors, rotary encoders, and current sensors. A three-way accelerometer and a microphone are arranged at each of the bearing seat and the planetary gearbox; the torque sensor is arranged at the output end of the motor; the rotary encoder is arranged at the input end of the planetary gearbox; and the current sensor is arranged on an input line of the driving motor. Table 1 lists the types of sensors used in the experiment and the types of signals each collects. Through these sensors, various data during the operation of the experimental platform can be monitored synchronously and saved by the NI industrial computer.

Types of Faulty Parts.
In the experiment, artificial faults were made on bearings and gears to simulate the faults generated in the actual operation of the equipment. The bearing is a cylindrical roller bearing, which comes in two structures: one with a detachable outer ring, used to make outer ring and rolling element failures, and one with a detachable inner ring, used to make inner ring failures. The two bearings have the same size. For the gearbox, we made different types of failures of the planet gear and the sun gear. Figure 7 shows the types of faulty parts of the bearings and gearboxes. By combining the fault types of bearings and gearboxes, many mechanical faults in the actual environment can be simulated. Table 2 lists the faulty parts, corresponding to the faulty part codes shown in Figure 7. Three processing methods are mainly used to produce the faulty parts: electrodischarge machining (EDM), wire cutting, and milling. Wire cutting is used to produce larger faults; EDM and milling are used to produce smaller faults. There are 12 types of faulty parts and 2 types of normal parts in the experiment. The 12 types of faulty parts include 6 types of bearing failure parts and 6 types of gearbox failure parts.
Correspondingly, we designed the experiments shown in Table 3.

Comparative Model.
We divide the collected data into a training set, a validation set, and a test set at a ratio of 7 : 1 : 2. For the 26 types of faults at the four speeds, 5 sampling lengths of 0.2 s, 0.4 s, 0.6 s, and 1 s were used to intercept the signals. In total, 26 × 4 × 5 = 520 categories are obtained, and each category contains 200 signal samples.
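The category count and the split sizes can be checked with a short sketch. It assumes that the 7 : 1 : 2 split is applied within each category after shuffling, which the text does not state explicitly; the function name `split_indices` and the random seed are hypothetical.

```python
import numpy as np

n_experiments, n_speeds, n_lengths = 26, 4, 5
samples_per_class = 200
n_runs = n_experiments * n_speeds                 # experiments on the test bench
n_classes = n_experiments * n_speeds * n_lengths  # classification categories

def split_indices(n_samples, ratios=(7, 1, 2), seed=0):
    """Shuffle the sample indices of one category and split them by the given ratios."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train, val, test = split_indices(samples_per_class)
```

Under this assumption, each category contributes 140 training, 20 validation, and 40 test samples.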
In order to verify the superiority of the GAF/MTF-ResNet model, we established several comparison models based on the results of peer research. Figure 8 shows the structure of each comparative model. (a) In 2019, Shao et al. proposed a multisensor fault diagnostic model [26] that uses an accelerometer and a current sensor to collect the operating signals of the test bed, performs a wavelet transform on the signal from each sensor to obtain the corresponding time-frequency diagrams, and feeds them to a multi-input single-output DCNN for single fault diagnosis of the gearbox. (b) On the basis of model (a), we increase the number of input sensor signals to 10, including vibration, current, torque, and sound pressure signals. (c) Jing et al. proposed a multisensor diagnostic model for compound faults in 2017 [27], which preprocesses the sensor signals and classifies them with a one-dimensional deep convolutional network (1D-DCNN).
All comparative models are built using the Keras framework [38] on top of TensorFlow [39], and Numba [40] is used to accelerate training. TensorFlow is an open-source library developed and maintained by Google Brain. It is a symbolic mathematics system based on dataflow programming and has been widely used in machine learning. Keras is an open-source neural network library written in Python that provides a high-level application programming interface to TensorFlow. Its code is object-oriented, fully modular, and extensible, which simplifies model building. Numba is a JIT compiler that compiles Python functions into machine code. Python code compiled by Numba (array operations only) can run at speeds close to those of C or Fortran code.
The models run on a Windows 10 system equipped with an NVIDIA TITAN Xp graphics card, and the GPU is used to accelerate training.
Table 4 shows the parameters of each model, including the comparative models. Among them, the batch size should be chosen as large as the available video memory allows. Keskar et al. [41] and Smith et al. [42] reported that, other conditions being equal, a model trained with a larger batch size shows better performance.

Experimental Results and Analysis.
Figure 9 shows how the accuracy on the training set and validation set changes with the number of epochs for the model using ResNet18. To ensure the reliability of the experimental results, a prediction for a composite fault counts as correct only if the model predicts all of the faults; otherwise, its accuracy is taken as 0.
In Figure 9, the accuracy on the training set eventually reaches 100%. As the number of epochs increases, the accuracy on the validation set does not decrease, which shows that the "shortcut connection" mechanism helps to avoid overfitting. The accuracy of ResNet18 reaches 71.92% on the validation set and 72.14% on the test set. The results show that the model is able to identify complex fault types. To test this ability, we carried out the experiments shown in Tables 5 and 6. We analyzed the test set results in more detail. All classification labels are divided into five types: the gearbox fault label, bearing fault label, gearbox and bearing fault label, speed label, and time label; compound faults in the same transmission stage are regarded as separate fault types. The accuracy of the test set is reported under each of the five label types.
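The difference between the strict accuracy used for composite faults and the accuracy under a single label type can be made concrete with a small sketch. It assumes that each of the 520 categories is decomposed into label components stored as integer columns; the column layout, the toy values, and the function names are hypothetical.

```python
import numpy as np

def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples for which every label component is predicted correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.all(y_true == y_pred, axis=1)))

def per_label_accuracy(y_true, y_pred):
    """Accuracy of each label component (column) taken separately."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_true == y_pred, axis=0)

# Hypothetical columns: gearbox fault, bearing fault, speed, sampling time.
y_true = np.array([[2, 1, 0, 3], [0, 4, 1, 2], [5, 0, 2, 1], [1, 1, 3, 0]])
y_pred = np.array([[2, 1, 0, 3], [0, 3, 1, 2], [5, 0, 2, 1], [2, 1, 3, 0]])

strict = exact_match_accuracy(y_true, y_pred)
by_label = per_label_accuracy(y_true, y_pred)
```

A sample with one wrong component lowers the strict accuracy but still counts as correct for every other component, which is why the overall accuracy can be lower than the accuracy under any single label type.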
In Table 5, our model classifies the speed and time labels particularly well, with a recognition rate close to 100%. The model classifies bearing faults (88.97%) better than gearbox faults (75.25%).
To show the advantages of our model over other models, experiments were carried out on all comparative models. With the model parameters shown in Table 4, all experiments used the same training set, validation set, and test set and the same running environment. Table 6 shows the results of the comparative experiments. We recorded the accuracy on the training set, validation set, and test set, as well as the training time of each model.
In Table 6, the GAF/MTF-ResNet model performs better than the other models on the training set, validation set, and test set. The training time of the model using ResNet18 is not much different from that of the other models. The wavelet-DCNN model performs worse than the wavelet-DCNN (max) model because the former uses less sensor data than the latter. The 1D-DCNN model, which takes the one-dimensional time series as input, has the worst performance and almost no ability to classify different types of faults. A possible reason is that the input of the neural network must have a fixed size, whereas this experiment adaptively acquires signals of different lengths according to the motor speed, so the signals are scaled during data preprocessing. Scaling inevitably loses information, and the loss may be especially pronounced for the one-dimensional time series, which ultimately leads to the poor performance of the model.

Conclusion
In this paper, a GAF/MTF-ResNet model based on multisensor detection is proposed for coupled fault diagnosis of bearing-gear mechanical systems. The model first uses the GAF/MTF algorithm, which bypasses the vanishing or exploding gradient problem of the classic neural network model, to convert multichannel time series of different types and lengths into two-dimensional RGB pictures through feature-level fusion. Then, we use ResNet to train on and classify the picture sets, and finally, we realize the diagnosis of different fault types.
In the process of building the model, the manual selection of key parameters, such as those for feature extraction, was avoided, and the end-to-end operation of the model was realized. The experiment we designed has complex failure types, small failure sizes, and low motor speeds, which requires the model to have high resolution capabilities. More importantly, we designed mechanical faults in different transmission stages so that the experiments more realistically simulate the coupling faults in the actual complex working conditions of the mechanical system.
The GAF/MTF algorithm is used to realize the fusion and dimensionality reduction of multidimensional information based on the time-frequency domain information from the original signal. Moreover, this algorithm transforms the original time series classification problem into a two-dimensional image classification problem, which avoids the neural network's poor performance in processing time series. The subsequently introduced ResNet has more layers and better performance than the other networks, and its training time does not increase excessively. The experimental results show that the GAF/MTF-ResNet model has a training set accuracy of 100% and a test set accuracy of 72.14% for 520 types of experimental labels with different speeds, different sampling times, and different types of failures. Among them, the accuracy for the speed and sampling time is close to 100%, and the accuracy for gearbox failure and bearing failure is 75.25% and 88.97%, respectively. It has high application performance and further research value.

Figure 5: The mechanical system of the test bed.

Figure 7: Failure part list (the red circle marks the fault location).

Figure 8: The structure of comparative models.
In comparative model (c), the signals collected by the sensors are preprocessed and input directly into the one-dimensional deep convolutional network (1D-DCNN) model. The output of the network is the type of fault. This model uses convolutional layers to replace specialized feature extraction units. Model (d) is the GAF/MTF-ResNet model that we built for multisensor complex fault diagnosis.

Table 1: Types of sensors and signals.

Table 2: Failure parts list.
The experiments in Table 3 include the no-failure experiment (No. 1), single-failure experiments (No. 2-No. 12), compound failure experiments in the same transmission stage (No. 13-No. 19), compound failure experiments in different transmission stages (No. 20-No. 25), and a multiple compound failure experiment (No. 26). Each experiment in the table contains subexperiments at four motor speeds. The approximate motor speeds are 86.4 RPM, 288 RPM, 576 RPM, and 864 RPM. The precise speed can be calculated by using the rotary encoder on the test bench. A total of 104 experiments are performed.

Table 6: Model experiment results. In the table, the best-performing data in each group, including the highest accuracy and the shortest time, are in bold to highlight the advantages of the method used in this paper.