Composite Fault Diagnosis for Rotating Machinery of Large Units Based on Evidence Theory and Multi-Information Fusion

Due to the complexity of the structure and process of large-scale petrochemical equipment, different fault characteristics are mixed and exhibit multiple couplings and ambiguities, which makes composite faults in rotating machinery difficult to identify. This paper proposes a composite fault diagnosis method for rotating machinery of large units based on evidence theory and multi-information fusion. Evidence theory and multi-information fusion mainly deal with multisource and conflicting information: they synthesize multiple pieces of uncertain information and obtain synthetic information from multiple data sources. To detect faults in rotating machinery, the dimensionless index ranges of composite faults are first used to form a feature set as the reference. Then, a two-sample distribution test is applied to compare the known fault samples with the tested fault samples, and the maximum statistical distance is obtained. Finally, the multiple maximum statistical distances are fused by evidence theory, and the fault types are identified based on the fusion result. The proposed method was applied to a large petrochemical unit simulation experiment system; the results showed that it could accurately identify composite faults and provide maintenance guidance for composite fault diagnosis.


Introduction
Rotating machinery in industrial petrochemical plants works in complex environments in which fault signals are difficult to separate, which complicates the fault diagnosis decision [1][2][3]. Once a large unit presents problems, it needs to be entirely stopped for inspection, which can result in a huge economic loss. Therefore, it is essential to quickly identify the fault signal and predict the fault types.
For the fault diagnosis problem of rotating machinery, researchers have presented solutions including dimensionless algorithms [4][5][6], neural networks [7][8][9], classification methods [10][11][12][13][14], and evidence theory [15][16][17]. Among these, the dimensionless algorithm is insensitive to signal disturbance, amplitude changes, and stable frequency signals. Therefore, it has been widely used in rotating machinery fault diagnosis [18,19]. In [20], three EEMD-based dimensionless indexes were proposed to characterize the steadiness states of railway axle bearings and detect different defects. Wu et al. [21] proposed a method combining the Hilbert-Huang transform and instantaneous dimensionless frequency normalization and applied it to a gearbox system. However, with this method there is a certain overlap between the ranges of the normal state and the various fault states. In other words, the ranges of the dimensionless indexes of normal equipment and faulty equipment are difficult to distinguish, which makes the decision more difficult. To solve these problems, Xiong et al. [22] proposed a genetic programming method based on dimensionless indexes in the time domain, which has achieved positive results in rotating machinery classification. However, constructing new dimensionless indexes with this method presents many deficiencies. For instance, the operator set and the terminal set affect the complexity and convergence of the program. Therefore, when the search scope is expanded, it is easy to lose potentially useful fault information. All of these variables can affect fault diagnosis efficiency. To solve the problem of information loss due to the reduction and clustering classification during the generation of fault feature information, Dempster [23] proposed an integrated fault diagnosis method of the dimensionless indexes immune detector. In [24,25], a small sample method was proposed, but the method had an overfitting problem. In [26,27], a fault diagnosis method for induction motors based on a sparse noise reduction self-encoder was proposed, which reduced the risk of network overfitting with small samples.
Evidence theory is an uncertainty theory [28][29][30][31] whose major characteristics are measuring and addressing various kinds of uncertain information and using the synthetic principle to obtain multi-information entropy. In this way, evidence theory can process multi-information and conflicting information better and thus has been widely used in areas such as information fusion and uncertain reasoning. The authors of [32,33] took advantage of evidence theory to propose a fault diagnosis method based on multisensor information fusion, which improved the fault diagnosis accuracy rate. Xiao [34] proposed a method based on evidence theory and fuzzy preference to handle the conflicting evidence combination problem in a multisensor environment; however, it does not consider cracks and misalignment. In [35,36], a multiparameter comprehensive diagnosis system model was proposed, which solved the difficult problem of distinguishing various faults in the same symptom domain. Song and Jiang [37] proposed a new evidential fault diagnosis method in which multiple hypotheses are taken into consideration.
Evidence theory and multi-information fusion are mainly aimed at fusing information from multiple sources; however, it is impractical to install multiple sensors on a petrochemical equipment unit. Therefore, to improve the accuracy of fault diagnosis, we propose in this paper a fault diagnosis method based on dimensionless indexes, multi-information fusion, and a two-sample distribution test. We believe that the faults of large units are difficult to identify mainly because of the existence of multi-information and conflicting information. The multi-information fusion method can fuse multiple pieces of uncertain probabilistic information to determine the potential faults of samples. The main contributions of this paper are summarized as follows:
(i) The distribution of the known fault samples is compared with that of the tested fault samples by a two-sample distribution test, and the maximum statistical distance is obtained
(ii) Multi-dimensionless information is fused, and the fault types can be identified according to the fused results
(iii) The proposed method is verified with a large petrochemical unit simulation experiment system and is shown to effectively improve fault identification
The rest of this paper is organized as follows: Section 2 describes the process of fault diagnosis and its theoretical basis: the dimensionless algorithm, the two-sample distribution test, and multi-information fusion. Section 3 verifies the reliability of the proposed method. In Section 4, the conclusion of this paper is presented.

Proposed Method
In this section, the dimensionless algorithm is first briefly introduced. Then, the two-sample distribution test compares the cumulative probability distribution functions of the known fault samples and the tested fault samples, and the maximum statistical distance is obtained. Next, the multiple similarities are fused. Finally, the process of fault identification is described. The diagram of composite fault diagnosis for rotating machinery of a large unit based on evidence theory and multi-information fusion is shown in Figure 1.

Dimensionless Algorithm. Dimensionless indexes refer to the ratio of two quantities with the same dimensions. The basic idea behind the dimensionless algorithm is to eliminate the dimension by taking the ratio of two quantities of the same dimension that are based on the probability density function, so the dimensionless indexes are not affected by the frequency and amplitude of the mechanical signal in fault diagnosis [38,39]. In the definition of the dimensionless algorithm, equation (1), x is the amplitude of the vibration time-domain signal, p(x) is the probability density function, and l and m are the numerator and denominator coefficients, respectively. The five dimensionless indexes, namely the waveform index, peak index, impulse index, margin index, and kurtosis index, are shown in Table 1 [40].
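For orientation, the general dimensionless index is commonly written in the literature in the form below. This is an assumed form that matches the symbols x, p(x), l, and m defined above; the symbol $\zeta_x$ is introduced here only for illustration.

```latex
\zeta_x =
\frac{\left[\int_{-\infty}^{+\infty} |x|^{l}\, p(x)\, dx\right]^{1/l}}
     {\left[\int_{-\infty}^{+\infty} |x|^{m}\, p(x)\, dx\right]^{1/m}}
```

Under this form, l = 2 and m = 1 give the ratio of the RMS value to the average amplitude (the waveform index), while letting l tend to infinity turns the numerator into the maximum amplitude, which gives the peak index (m = 2), the impulse index (m = 1), and the margin index (m = 1/2).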
Assume that the vibration time-domain signal of the known fault sample is $X = (x_1, x_2, \ldots, x_n)$ and the amplitude of the tested fault sample is $Y = (y_1, y_2, \ldots, y_n)$.
For the dimensionless indexes of the known fault samples, c and d denote the minimum and maximum, respectively; for the dimensionless indexes of the tested fault samples, e and f denote the minimum and maximum, respectively.
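As an illustration of how the five indexes can be computed from one sampled signal, the following is a minimal sketch. It assumes the widely used time-domain definitions (RMS value over average amplitude, and so on); the function name `dimensionless_indexes` and the dictionary keys are hypothetical, and the definitions in Table 1 take precedence if they differ.

```python
import numpy as np

def dimensionless_indexes(x):
    """Return five time-domain dimensionless indexes of a 1-D signal.

    Assumed (common) definitions, not taken verbatim from Table 1.
    """
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    x_max = abs_x.max()                     # maximum value
    x_rms = np.sqrt(np.mean(x ** 2))        # RMS value
    x_mean = abs_x.mean()                   # average amplitude
    x_r = np.mean(np.sqrt(abs_x)) ** 2      # square root amplitude
    beta = np.mean(x ** 4)                  # kurtosis value (fourth moment)
    return {
        "waveform": x_rms / x_mean,
        "peak": x_max / x_rms,
        "impulse": x_max / x_mean,
        "margin": x_max / x_r,
        "kurtosis": beta / x_rms ** 4,
    }

# Usage: a 1024-point sine wave with 8 full periods.
t = np.arange(1024) / 1024
sine = np.sin(2 * np.pi * 8 * t)
indexes = dimensionless_indexes(sine)
scaled = dimensionless_indexes(3.0 * sine)  # same signal, three times the amplitude
```

Because every index is a ratio of quantities with the same dimension, `indexes` and `scaled` agree, which is the amplitude insensitivity described above. The sketch does not guard against an all-zero signal.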

Two-Sample Distribution Test.
Let the nonstationary random signal be x(t). In the definition of its probability density function at time t [41], $P_r$ is the probability, which reflects the probability that the signal falls within different amplitude intensity regions. For each state of the process, the probability function can be described in terms of the quantities T and $T_x$.

Shock and Vibration
Here, T represents the length of the sample and $T_x$ represents the time during which the signal lies between x and x + Δx. The probability density function of the signal is denoted by $f_X(x)$. The cumulative probability distribution function of the random variable x, that is, the probability that the value of x(t) is less than or equal to x, is denoted by $F_X(x)$. $F_X(x)$ is also used to represent the cumulative probability distribution function of random sample observations with sample size n.
Assume that the cumulative probability distribution functions of the known fault samples are $M = (F_{a_1}(x), F_{a_2}(x), \ldots, F_{a_n}(x))$ and those of the tested fault samples are denoted by N. By comparing the cumulative probability distribution functions of the known fault samples with those of the tested fault samples, the statistical distance at each cumulative frequency is obtained [42]. After comparing M with N, the statistical distances are $W_n = (D_1, D_2, \ldots, D_n)$. Assume that the cumulative probability distribution function of the known fault samples is $H_n(x)$ and the cumulative probability distribution function of the tested fault samples is $R_n(x)$. The maximum statistical distance between them, the k value, is obtained by comparison, that is, $k = \max_x |H_n(x) - R_n(x)|$.

Note: x is the vibration time-domain signal; p(x) is the probability density function; β is the kurtosis value; $X_{\max}$ is the maximum value; $X_{\mathrm{rms}}$ is the RMS value; $X_r$ is the square root amplitude; $\overline{|X|}$ is the average amplitude.

After comparing $H_n(x)$ with $R_n(x)$, the k value is obtained, and a small probability z is given. For each x value, if $D_n > \lambda/\sqrt{n}$, the difference between $H_n(x)$ and $R_n(x)$ is too large, and the hypothesis that x obeys the distribution of the known fault samples is rejected. Otherwise, the distribution function of the known fault samples has a high degree of fitting with the distribution function of the tested fault samples.
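The comparison of the two cumulative distribution functions can be sketched as follows. This is a minimal illustration that assumes the k value is the largest absolute gap between the two empirical cumulative distribution functions (the usual two-sample Kolmogorov-Smirnov statistic); the function names are hypothetical, and `lam` stands for the threshold coefficient λ, whose value must be supplied by the caller.

```python
import numpy as np

def max_statistical_distance(known, tested):
    """Largest gap between the empirical CDFs H_n(x) and R_n(x)."""
    known = np.sort(np.asarray(known, dtype=float))
    tested = np.sort(np.asarray(tested, dtype=float))
    # The gap can only change at observed sample values, so evaluate there.
    grid = np.concatenate([known, tested])
    h = np.searchsorted(known, grid, side="right") / known.size    # H_n(x)
    r = np.searchsorted(tested, grid, side="right") / tested.size  # R_n(x)
    return float(np.max(np.abs(h - r)))

def distributions_differ(k, lam, n):
    """Decision rule from the text: the gap is too large if k > lam / sqrt(n)."""
    return bool(k > lam / np.sqrt(n))

# Usage with synthetic index values (illustrative data only).
rng = np.random.default_rng(0)
same = max_statistical_distance(rng.normal(0, 1, 50), rng.normal(0, 1, 50))
shifted = max_statistical_distance(rng.normal(0, 1, 50), rng.normal(3, 1, 50))
```

In this synthetic example the shifted pair should yield a clearly larger distance than the pair drawn from the same distribution, which is the behavior the k value is used for.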
For any assumption A, its trust degree Bel(A) is defined as the sum of the basic probabilities corresponding to all subsets of A, that is, $\mathrm{Bel}: 2^{\Theta} \to [0, 1]$. Bel(A) is defined as follows:
$$\mathrm{Bel}(A) = \sum_{B \subseteq A} M(B). \tag{12}$$
The Bel(A) function, called the lower bound function, represents the full trust in A.
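For any assumption A, the likelihood function Pl(A), which is the sum of the basic probabilities corresponding to all subsets that intersect A, is as follows:
$$\mathrm{Pl}(A) = 1 - \mathrm{Bel}(\neg A) = \sum_{B \cap A \neq \emptyset} M(B), \tag{13}$$
where the Pl function is called the upper-bound function, which represents the degree to which A is not disbelieved, that is, one minus the degree of trust in $\neg A$. The relationship between the trust function and the likelihood function is obtained from formulas (12) and (13) as follows: $\mathrm{Pl}(A) \geq \mathrm{Bel}(A)$, $A \subseteq \Theta$. The uncertainty of A is expressed as follows: $\mu(A) = \mathrm{Pl}(A) - \mathrm{Bel}(A)$. Then, $[\mathrm{Bel}(A), \mathrm{Pl}(A)]$ is called the trust interval.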
The trust interval and fitting range are shown in Figure 2.
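Assuming that $M_1$ and $M_2$ are two probability assignment functions on $2^{\Theta}$, their orthogonal sum $M = M_1 \oplus M_2$ is defined as
$$M(\emptyset) = 0, \qquad M(A) = c^{-1} \sum_{x \cap y = A} M_1(x)\, M_2(y), \quad A \neq \emptyset, \tag{14}$$
where
$$c = 1 - \sum_{x \cap y = \emptyset} M_1(x)\, M_2(y).$$
According to formula (14), the orthogonal sum of multiple probability assignment functions $M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ is defined as follows:
$$M(\emptyset) = 0, \qquad M(A) = c^{-1} \sum_{\cap A_i = A} \; \prod_{1 \leq i \leq n} M_i(A_i), \quad A \neq \emptyset, \tag{15}$$
where
$$c = 1 - \sum_{\cap A_i = \emptyset} \; \prod_{1 \leq i \leq n} M_i(A_i).$$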
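The orthogonal sum can be illustrated with a short sketch. It assumes the standard Dempster combination rule, represents each probability assignment function as a dictionary from `frozenset` focal elements to basic probabilities, and uses hypothetical fault labels `F1` and `F2`; the numbers are made up for illustration.

```python
from functools import reduce
from itertools import product

def combine(m1, m2):
    """Orthogonal sum of two probability assignment functions."""
    fused = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb          # mass falling on the empty set
    c = 1.0 - conflict                   # normalization constant
    if c <= 0.0:
        raise ValueError("total conflict: the orthogonal sum is undefined")
    return {a: p / c for a, p in fused.items()}

def combine_all(masses):
    """Orthogonal sum of several functions, combined pairwise in sequence."""
    return reduce(combine, masses)

# Usage: two pieces of evidence over the frame {F1, F2}.
F1, F2 = frozenset({"F1"}), frozenset({"F2"})
THETA = F1 | F2
m1 = {F1: 0.6, F2: 0.3, THETA: 0.1}
m2 = {F1: 0.5, F2: 0.2, THETA: 0.3}
fused = combine_all([m1, m2])
best = max(fused, key=fused.get)         # focal element with the largest mass
```

Pairwise combination is used because the rule is associative, so combining the functions one after another gives the same result as the n-ary form.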

Fault Identification Process.
The process of evidence theory and multi-information fusion is described below:
Step 1. The vibration time-domain signal of the rotating machinery is collected. The amplitude of the vibration time-domain signal for the known fault sample is $X = (x_1, x_2, \ldots, x_n)$ and for the tested fault sample is $Y = (y_1, y_2, \ldots, y_n)$.
Step 2. The dimensionless algorithm is used to process the vibration time-domain signal to obtain five dimensionless indexes, according to equation (1) and Table 1. The dimensionless index of the known fault samples is $A = (a_1, a_2, \ldots, a_n)$, and that of the tested fault samples is B.
Step 3. According to A and B in step 2, the cumulative frequency of the known fault samples is $M = (F_{a_1}(x), F_{a_2}(x), \ldots, F_{a_n}(x))$, and that of the tested fault samples is N.
Step 4. The cumulative probability distribution of the known fault samples is compared with that of the tested fault samples to obtain the maximum statistical distance k.
Step 5. The evidence theory and multi-information fusion method are used to fuse the maximum statistical distances $k = (k_1, k_2, \ldots, k_n)$ and obtain the fusion result M(A) according to equations (12)-(15).
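Step 5 can be sketched as follows under explicit assumptions. The text does not state how a k value is turned into a basic probability, so this sketch assumes that the support for a candidate fault is `1 - k`, normalized per dimensionless index, and that all mass is placed on single faults. With only single-fault focal elements, the orthogonal sum reduces to a normalized product. The function name `fuse_k_values` and the example numbers are hypothetical.

```python
import numpy as np

def fuse_k_values(k):
    """Fuse maximum statistical distances into one probability per fault.

    k has shape (n_indexes, n_faults): one row per dimensionless index,
    one column per known fault type.
    """
    k = np.asarray(k, dtype=float)
    support = 1.0 - k                                       # ASSUMPTION: small k means strong support
    masses = support / support.sum(axis=1, keepdims=True)   # one assignment per index
    joint = masses.prod(axis=0)                             # agreement of all indexes on each fault
    total = joint.sum()                                     # equals the normalization constant c
    if total <= 0.0:
        raise ValueError("total conflict: the fusion result is undefined")
    return joint / total

# Usage: two indexes, two candidate faults (illustrative numbers only).
k_values = [[0.2, 0.5],
            [0.3, 0.6]]
probabilities = fuse_k_values(k_values)
identified = int(np.argmax(probabilities))   # fault with the maximum fused probability
```

The fault type is then read off as the column with the maximum fused probability, mirroring the identification rule used in the validation experiment.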

Validation Experiment
In this section, in order to verify the effectiveness of the proposed approach, the composite fault diagnosis of rotating machinery is studied and verified by using the large petrochemical unit simulation experiment system.

Data Acquisition and Processing.
The simulation experiment system consists of the multistage centrifugal air compressor unit, various test stations, and test software. The acquisition software can display the waveform of the vibration time-domain signal in real time, extracting signal characteristics while storing historical data and monitoring the operation of the unit. The large petrochemical unit simulation experiment system is shown in Figure 3, and the model and parameters of its main components are shown in Table 2.

In the experiments, the EMT390 data collector is used to collect the vibration time-domain signal of the composite faults as discrete sets of 1024 points. One hundred groups of vibration time-domain signal data are collected under each fault condition. The first 50 groups are used as the known fault samples, and the latter 50 groups are the tested fault samples. The test conditions are as follows: the motor speed is 1000 r/min and the motor rated power is 11 kW. The sensor is placed next to the gearbox, as shown in Figure 3(b).
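The grouping described above can be sketched as follows. This is a minimal illustration assuming that the recording for one fault condition is available as a single one-dimensional array; the function name `split_groups` and the synthetic signal are hypothetical.

```python
import numpy as np

def split_groups(signal, points_per_group=1024, n_known=50):
    """Cut a recording into fixed-length groups and split known/tested samples."""
    signal = np.asarray(signal, dtype=float)
    n_groups = signal.size // points_per_group
    # Drop any incomplete trailing group, then reshape to (groups, points).
    groups = signal[: n_groups * points_per_group].reshape(n_groups, points_per_group)
    return groups[:n_known], groups[n_known:]

# Usage: 100 groups of 1024 points for one fault condition (synthetic data).
recording = np.arange(100 * 1024)
known_groups, tested_groups = split_groups(recording)
```

Each row of `known_groups` and `tested_groups` is one 1024-point group, ready to be passed to the dimensionless index computation.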
According to the laboratory conditions, six different faults were combined, and four fault conditions of rotating machinery are shown in Figure 4.
Vibration time-domain signal acquisition and processing:
(1) In this paper, the sampling frequency (rate) is 1024 Hz, which means that 1024 points are collected per second, and a total of 400 seconds is collected. The acquired vibration time-domain signals of the different faults are shown in Figure 5.
(2) The vibration time-domain signal of the rotating machinery is processed according to the dimensionless algorithm, and the ranges of the five dimensionless indexes are obtained, as shown in Table 3.
The main cause of samples that cannot be identified by the two-sample distribution test alone is multisource information and conflicting information. Because the change rules of Sample 2 and Sample 6 are obvious, their fault types can be identified according to the principle that the same fault gives the minimum k value: gear teeth missing and bearing outer ring wear, and large and small gear teeth missing and bearing inner ring wear, respectively. Nevertheless, the accuracy of identification over the six groups of tested fault samples was only 33.33%.

Evidence Theory and Multiple Information Fusion Results.
The effectiveness of the proposed composite fault diagnosis method based on evidence theory and multi-information fusion for a large petrochemical unit is verified, and the results are presented in Figure 7. Each color represents the probability of a different fault occurring in the same sample. After the five k values are fused through evidence theory and multi-information fusion, the resulting probabilities differ between faults. By comparing the probabilities of the different faults, the fault types are identified based on the maximum probability.
The probability of occurrence of each fault is given in Figure 7, and the fault types are identified by the maximum probability. In the six groups of tested fault samples, the accuracy rate reached 100%, an improvement of 66.67 percentage points over the two-sample distribution test alone. The experimental results demonstrate that the proposed method is able to identify the fault types accurately.
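The reported accuracies can be checked with a short calculation. It assumes that the 33.33% figure corresponds to Sample 2 and Sample 6 being the only two of the six tested samples identified by the two-sample distribution test alone.

```latex
\text{accuracy}_{\text{two-sample test}} = \frac{2}{6} \approx 33.33\%, \qquad
\text{accuracy}_{\text{fusion}} = \frac{6}{6} = 100\%, \qquad
100\% - 33.33\% = 66.67 \text{ percentage points}
```

The improvement is therefore a difference in percentage points between the two accuracy rates, not a relative increase.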

Conclusion
This paper proposes a composite fault diagnosis method for large unit rotating machinery based on evidence theory and multi-information fusion. It effectively addresses the problem of multisource information and conflicting information and improves the accuracy of composite fault diagnosis.
The results of the simulation experiment showed that the composite fault types can be accurately identified from the maximum probability of occurrence obtained with the evidence theory and multi-information fusion method. In future work, we will further study the robustness and applicability of the proposed method in real applications by considering more rotating machinery working conditions and noise levels.
Composite fault types (figure panel labels): gear teeth missing and bearing outer ring wear; gear teeth missing and bearing inner ring wear; gear teeth missing and lack of ball bearings; large and small gear teeth missing; large and small gear teeth missing and bearing outer ring wear; large and small gear teeth missing and bearing inner ring wear.

Figure 1: Diagram of composite fault diagnosis for rotating machinery of a large unit based on evidence theory and multi-information fusion.

Figure 6 shows the different faults. The two-sample distribution test is applied to the five dimensionless indexes (waveform index, impulse index, margin index, peak index, and kurtosis index), and the maximum statistical distance k value of each index forms one junction point of a fold line. Each color shown in Figure 6 represents a different fault. The distribution of the five k values for different faults with different samples can be seen in Figure 6. The two-sample distribution test identifies the fault types based on the five k values of the dimensionless indexes. The k value of samples with the same fault is small relative to that of samples with different faults, indicating that the two samples belong to the same distribution. The k value of samples with different faults is larger than that of the same fault, indicating that the two samples belong to different distributions. The individual samples are described below:
Sample 1. The four fold lines overlap each other, and the fault type of the sample cannot be recognized.
Sample 2. The fold line for gear teeth missing and bearing outer ring wear lies at the bottom, and the five k values of this fault are smaller than those of the others. The sample fault is gear teeth missing and bearing outer ring wear.
Sample 3. It can be seen from the figure that the five k values of gear teeth missing and bearing outer ring wear are the largest, so this fault is ruled out. However, the other three fold lines overlap each other, and the fault of the sample cannot be identified.
Sample 4. The four fold lines overlap each other without significant difference. In this situation, the fault cannot be identified.
Sample 5. Figure 6(e) shows that the fold lines of three faults still overlap, and the fault cannot be identified.
Sample 6. From the distribution of the fold lines, the sample can be identified as large and small gear teeth missing and bearing inner ring wear.
There are no obvious change rules in the five k values of sample 1, sample 3, sample 4, and sample 5.

Figure 7: The result of evidence theory and multiple information fusion.

Table 3: Five dimensionless indexes of composite fault states.