Research on Fault Diagnosis for Pumping Station Based on T-S Fuzzy Fault Tree and Bayesian Network

Fault diagnosis for pumping stations is characterized by complex structure, multiple mappings, and numerous uncertainties. To address these characteristics, a new approach combining the T-S fuzzy gate fault tree and the Bayesian network (BN) is proposed. On the one hand, the traditional fault tree method requires exact logical relationships between events and exact event probabilities, and it can represent only events with two states. The T-S fuzzy gate fault tree method overcomes these disadvantages, but its reasoning is complex and only one-way. On the other hand, the BN is suitable for fault diagnosis of pumping stations because of its powerful ability to deal with uncertain information. However, it is difficult to determine the structure and conditional probability tables of the BN. Therefore, the proposed method integrates the advantages of the two methods. Finally, the feasibility of the method is verified through a fault diagnosis model of the rotor in the pumping unit. Its accuracy is verified by comparison with methods based on the traditional Bayesian network and the BP neural network when the historical data is sufficient, and its results are superior to those of both methods when the historical data is insufficient.


Introduction
With the operation of the first phase of the South-to-North Water Diversion Project, the reliability of the pumping stations and units at every level, and their ability to achieve preset functions, will affect the effectiveness of the whole project. If the faults of a pumping station expand further, they may cause major engineering safety issues, significant economic losses, and serious social impact. Therefore, it is of great significance to monitor, evaluate, predict, and diagnose the running state of the pumping stations and units.
The operating state of a large-scale pumping station is affected by the coupling of hydraulic, mechanical, and electromagnetic factors, and the factors that affect its efficiency or cause failure are often multiple. At the same time, different steady or transition states correspond to different operating conditions, such as starting, stopping, and blade adjustment. Therefore, the fault diagnosis of a pumping station (group) can be divided into conventional fault diagnosis and uncertain fault diagnosis. The former, such as the diagnosis of electrical equipment, has been solved because it is handled by the station's computer monitoring system. However, uncertain fault diagnosis under the coupling of mechanical, hydraulic, and electromagnetic factors remains difficult [1][2][3].
A Bayesian network is an intelligent method combining probability theory, graph theory, and decision theory. Recently, many researchers have applied it to fault diagnosis, especially in complex systems with a large amount of uncertain information [4][5][6]. However, it has rarely been applied to large pumping and drainage pumps. In [7], the Bayesian network is first applied to the fault diagnosis of the hydrogenerator by constructing a simple fault diagnosis system for the hydrogenerator set, SmartHydro, which uses vibration at different frequencies as the fault features to diagnose several major faults caused by mechanical, hydraulic, electromagnetic, and other factors. This method makes full use of the advantages of the Bayesian network to solve the problem of uncertain fault diagnosis in the pumping unit. However, the fault mechanism of a real pumping unit is far more complicated, and a large number of nodes and conditional probability tables are required to construct a complete Bayesian network. In [8], the Bayesian network is therefore combined with the Noisy-OR model, so that the connection probability between a single node and the result in the whole system can be calculated by formula once the probability relationship between every node and the result is determined. It is verified that this model greatly reduces the number of conditional probabilities that need to be determined and advances the application of the Bayesian network in fault diagnosis of pumping units. However, it is difficult to understand the fault mechanism and build an accurate Bayesian network in large complex systems such as pumping stations. Doing so needs the assistance of experienced experts and learning from a large amount of historical data, especially historical fault data. As a newly developed large-scale complex system, the pumping station (group) in the South-to-North Water Diversion Project has very little historical fault data. In addition, the number of nodes in the constructed Bayesian network is very large. Therefore, it is even more difficult to determine the structure and conditional probability tables.
Some researchers have combined the traditional fault tree theory with the Bayesian network to solve the problem of constructing a Bayesian network. However, the traditional fault tree has several shortcomings: (1) The logical relationships and probabilities between events need to be known exactly. (2) Compatibility is not strong; that is, the existing data is no longer applicable when the system conditions change. (3) Every event can be described with only two states: {0, 1}. The T-S fuzzy gate fault tree analysis method proposed by Song et al. [9] integrates fuzzy theory into the fault tree. It not only overcomes shortcoming (1) by describing the connection between events as an uncertain item but also describes multiple states of the system conveniently. However, it still has disadvantages, such as poor compatibility, a complex reasoning process, and only one-way reasoning.
Therefore, this paper combines the T-S fuzzy gate fault tree method with the Bayesian network method [10]. The combination converts the fuzzy gate rules of the T-S fuzzy gate fault tree into the conditional probability tables of the Bayesian network and makes full use of the efficient, parallel, two-way reasoning ability of the Bayesian network to realize the uncertain fault diagnosis of the pumping station. Finally, a Bayesian network is constructed according to this method, and the fault diagnosis of the rotor, which is one of the most important and most fault-prone components of the pump unit, demonstrates the correctness and superiority of the network.
The remainder of this paper is organized as follows. First, the T-S fuzzy gate fault tree and the Bayesian network are briefly reviewed. Then, the concrete steps of transforming the T-S fuzzy gate fault tree into a Bayesian network are described. Finally, the effectiveness and superiority of the proposed approach are illustrated using the rotor, which is one of the most important and most fault-prone components of the pump unit.

T-S Fuzzy Gate Fault Tree
Compared with the traditional fault tree method, the T-S fuzzy gate fault tree method combines fuzzy theory with the fault tree method, describes the relationships between upper and lower events with uncertainties, and expresses fault probabilities with fuzzy numbers. Events in adjacent layers are connected through the fuzzy gate, which is a production rule defining the probability of the different states of the top event caused by different combinations of bottom events. A typical T-S fuzzy gate fault tree model is shown in Figure 1.
In Figure 1, $x_1, x_2, \ldots, x_5$ are five bottom events; bottom event $x_i$ ($i = 1, 2, \ldots, 5$) has $k_i$ state values, and its state is described as $x_i^{a_i}$. $y_1$ is the top event, and $y_2$ and $y_3$ are the intermediate events; $k_j$ ($j = 1, 2, 3$) is the number of fault states of $y_j$, whose state is described as $y_j^{b_j}$. Each of the three T-S gates has its own set of fuzzy rules. The fuzzy rules of the local T-S fuzzy gate fault tree composed of $y_2$, its T-S gate, and $x_1$, $x_2$, and $x_3$ are given in Table 1.
A T-S gate obeys the following rule, which is also the conditional probability of the corresponding nodes in the BN:

$$P\left(y_2 = y_2^{b_2} \mid x_1 = x_1^{a_1},\, x_2 = x_2^{a_2},\, x_3 = x_3^{a_3}\right). \tag{1}$$

Bayesian Network
In a Bayesian network, the nodes represent multiple variables, and the causality between variables is represented by directional connection lines. Each root node is given a prior probability, and each child node is given a conditional probability table. A Bayesian network can be represented by a triple $\langle V, E, P \rangle$, where $V$ is the set of node variables, $E$ is the set of directional connection lines between nodes, and $P$ is the set of conditional probability tables representing the connection strength between nodes. It combines a directed acyclic graph with probability theory. With its formal foundation in probability theory, it is more objective and scientific, and its knowledge representation is also more intuitive. The Bayesian network also gains objectivity by combining the prior knowledge of experts with posterior data: the prior knowledge dominates when the posterior data is scarce, while the posterior data dominates when it is abundant. A typical Bayesian network structure is depicted in Figure 2.
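As a minimal sketch of the representation described above, the structure below stores a network as node variables, directed edges, and conditional probability tables, and checks that the graph is acyclic. The class name `BayesianNetwork`, its field names, and the small two-cause example are illustrative assumptions and do not come from the original method.

```python
from dataclasses import dataclass, field


@dataclass
class BayesianNetwork:
    """Hypothetical container for the three parts of a Bayesian network."""
    nodes: dict                                # node name -> tuple of states
    edges: list                                # (parent, child) pairs
    cpts: dict = field(default_factory=dict)   # node name -> conditional probability table

    def parents(self, node):
        """Return the parents of a node in edge order."""
        return [p for p, c in self.edges if c == node]

    def is_acyclic(self):
        """Kahn's algorithm: the graph is a DAG if every node can be removed."""
        indegree = {n: 0 for n in self.nodes}
        for _, child in self.edges:
            indegree[child] += 1
        queue = [n for n, d in indegree.items() if d == 0]
        removed = 0
        while queue:
            current = queue.pop()
            removed += 1
            for parent, child in self.edges:
                if parent == current:
                    indegree[child] -= 1
                    if indegree[child] == 0:
                        queue.append(child)
        return removed == len(self.nodes)


# Illustrative network: two root causes pointing at one child node.
bn = BayesianNetwork(
    nodes={"x1": (0, 1), "x2": (0, 1), "y": (0, 1)},
    edges=[("x1", "y"), ("x2", "y")],
)
```

Keeping the edges as an explicit list makes the acyclicity check independent of how the conditional probability tables are filled in, which matches the order in which the structure and the parameters are determined.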

Construction of Bayesian Network. Three parts should be determined when building a complete Bayesian network: the node variables, the structure of the network, and the conditional probability table for every node. There are three main methods to determine the latter two.
(1) Through Experts' Experience Completely. This method is limited by human knowledge, and the resulting network is easily biased in practical applications.
(2) Learning through Historical Data Completely. When the historical data is sufficient, this method has strong adaptability because the structure and parameters of the BN are inferred from the data.
(3) Combining the above Two.The historical data is often insufficient, so the nodes and structure of BN can be determined by experts, and the parameters can be determined through learning from data.This method is applied more in practice because it can reduce the difficulty of determining parameters of the network and the structural learning error caused by insufficient data.
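Parameter learning methods for a data-driven BN are mainly the following two: (1) The maximum-likelihood estimation method and the Bayesian method can be used when data is sufficient. They are shown, respectively, in

$$\hat{\theta} = \arg\max_{\theta} L(\theta \mid D), \tag{2}$$

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}, \tag{3}$$

where $\theta$ is a random variable, $D = (d_1, d_2, d_3, \ldots, d_m)$ is a data set, and $L(\theta \mid D)$ is the maximum-likelihood function of $\theta$.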
(2) When data is insufficient, the expectation maximization (EM) algorithm can be used to calculate the parameters if the topology of the network is known. If the structure of the network is unknown, the structural EM method can be used. The specific steps are not detailed here and can be found in the related literature.
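The two parameter learning cases above can be sketched as follows. This is only an illustration under stated assumptions: the Bayesian estimate assumes a Dirichlet prior expressed as pseudo-counts, the EM example assumes a two-node network X -> Y with binary states, and all function names, records, and starting values are hypothetical.

```python
import math
from collections import Counter


def mle_estimate(samples, states):
    """Maximum-likelihood estimate: relative frequency of each state."""
    counts = Counter(samples)
    return {s: counts[s] / len(samples) for s in states}


def bayes_estimate(samples, states, prior_counts):
    """Posterior mean under an assumed Dirichlet prior given as pseudo-counts."""
    counts = Counter(samples)
    total = len(samples) + sum(prior_counts[s] for s in states)
    return {s: (counts[s] + prior_counts[s]) / total for s in states}


def log_likelihood(data, p, q0, q1):
    """Observed-data log-likelihood for X -> Y; x may be None (missing)."""
    def joint(x, y):
        px = p if x == 1 else 1 - p
        q = q1 if x == 1 else q0
        return px * (q if y == 1 else 1 - q)

    total = 0.0
    for x, y in data:
        prob = joint(x, y) if x is not None else (joint(0, y) + joint(1, y))
        total += math.log(prob)
    return total


def em_step(data, p, q0, q1):
    """One EM iteration for p = P(X=1), q0 = P(Y=1|X=0), q1 = P(Y=1|X=1)."""
    weights = []  # E-step: weight = P(X=1 | record) under current parameters
    for x, y in data:
        if x is not None:
            weights.append(float(x))
        else:
            l1 = p * (q1 if y == 1 else 1 - q1)
            l0 = (1 - p) * (q0 if y == 1 else 1 - q0)
            weights.append(l1 / (l1 + l0))
    # M-step: re-estimate the parameters from expected counts
    n1 = sum(weights)
    n0 = len(data) - n1
    new_p = n1 / len(data)
    new_q1 = sum(w * y for w, (_, y) in zip(weights, data)) / n1
    new_q0 = sum((1 - w) * y for w, (_, y) in zip(weights, data)) / n0
    return new_p, new_q0, new_q1


# Hypothetical records (x, y); None marks a missing parent state.
records = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1),
           (None, 1), (None, 1), (None, 0)]
params = (0.5, 0.4, 0.6)  # arbitrary starting values for (p, q0, q1)
history = [log_likelihood(records, *params)]
for _ in range(20):
    params = em_step(records, *params)
    history.append(log_likelihood(records, *params))
```

With few samples, the pseudo-counts in `bayes_estimate` pull the estimate toward the prior, while with many samples it approaches the maximum-likelihood estimate. In the EM loop, `history` records the observed-data log-likelihood, which should not decrease from one iteration to the next.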

Bayesian Network Inference.
There are three kinds of inference in a Bayesian network: support inference, causal inference, and diagnostic inference. This section focuses on the last one, which determines the cause from the measured feature node that shows an abnormal phenomenon when the fault occurs. The steps are as follows:
(1) Obtain the state fact of the feature node, and let its probability value be 1.
(2) Let the obtained fact node be $e$; then the marginal probability of any node $x$ is

$$P(x \mid e) = \frac{P(x, e)}{P(e)}. \tag{4}$$

(3) According to the given $P(e)$, $P(x, e)$ can be calculated by marginalizing the joint probability density of all nodes.
The formulas used in the process of fault diagnosis include the following. The Bayesian formula is

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}. \tag{5}$$

The chain rule is

$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1}). \tag{6}$$
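The diagnostic inference steps can be illustrated by brute-force enumeration on a very small network. The sketch below assumes two binary root causes `a` and `b` with one binary symptom node `s`; the prior and conditional probabilities are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
from itertools import product

# Assumed prior probabilities P(cause = 1) for two hypothetical root causes.
prior = {"a": 0.1, "b": 0.2}

# Assumed conditional probability table P(s = 1 | a, b) for the symptom node.
cpt_s = {(0, 0): 0.01, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.95}


def joint(a, b, s):
    """Joint probability from the chain rule with conditional independence."""
    pa = prior["a"] if a == 1 else 1 - prior["a"]
    pb = prior["b"] if b == 1 else 1 - prior["b"]
    ps = cpt_s[(a, b)] if s == 1 else 1 - cpt_s[(a, b)]
    return pa * pb * ps


def posterior_cause(cause, s_obs=1):
    """P(cause = 1 | s = s_obs), obtained by marginalizing the joint distribution."""
    combos = list(product((0, 1), repeat=2))
    p_e = sum(joint(a, b, s_obs) for a, b in combos)
    p_xe = sum(joint(a, b, s_obs) for a, b in combos
               if (a if cause == "a" else b) == 1)
    return p_xe / p_e
```

Calling `posterior_cause("a")` and `posterior_cause("b")` gives the two posterior probabilities that would be compared to rank the causes. Enumeration is exponential in the number of nodes, so it is only suitable for small illustrative networks like this one.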

Transformation from T-S Fuzzy Tree to BN
In the process of transforming the T-S fuzzy fault tree into a BN, the top events, intermediate events, and bottom events of the T-S fuzzy fault tree correspond to the leaf nodes, intermediate nodes, and root nodes of the Bayesian network, respectively. The fuzzy rules between events correspond to the conditional probability tables between nodes. According to the relationships between the top events and the intermediate events and between the intermediate events and the bottom events, the root nodes, intermediate nodes, and leaf nodes are connected with directed connection lines to form a complete BN [10,11]. The flow chart is depicted in Figure 3.
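The structural part of this transformation can be sketched as below. The function name `fault_tree_to_bn` and the dictionary layout are illustrative assumptions; the event names follow the rotor fault tree used later in the paper, with the labels `x1` to `x5` and `y1` to `y3` assumed for the bottom events and the upper events.

```python
def fault_tree_to_bn(gates):
    """Map a fault tree, given as {output_event: [input_events]}, to BN roles and edges."""
    outputs = set(gates)
    inputs = {event for ins in gates.values() for event in ins}
    return {
        "roots": sorted(inputs - outputs),          # bottom events
        "intermediates": sorted(outputs & inputs),  # intermediate events
        "leaves": sorted(outputs - inputs),         # top events
        # Each gate input becomes a parent of the gate output.
        "edges": [(i, out) for out, ins in gates.items() for i in ins],
    }


# Assumed labelling of the rotor fault tree: y1 is the top event,
# y2 and y3 are intermediate events, x1..x5 are bottom events.
rotor_gates = {
    "y1": ["y2", "y3"],
    "y2": ["x1", "x2", "x3"],
    "y3": ["x4", "x5"],
}
rotor_bn = fault_tree_to_bn(rotor_gates)
```

The conditional probability table of each non-root node would then be filled from the fuzzy rules of the gate whose output is that node.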

[Figure 3 flow chart labels. T-S fuzzy gate fault tree: bottom events, intermediate events, top events, fuzzy gates, fuzzy rules. Bayesian network: root nodes, intermediate nodes, leaf nodes, conditional probability tables, a priori probability.]

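Given that the leaf node $T$ is in state $T_q$, the posterior probability that the root node $x_i$ is in state $x_i^{a_i}$ is

$$P\left(x_i = x_i^{a_i} \mid T = T_q\right) = E\left[\frac{\tilde{P}\left(x_i = x_i^{a_i},\, T = T_q\right)}{\tilde{P}\left(T = T_q\right)}\right], \tag{7}$$

where $E[\tilde{P}(x_i = x_i^{a_i}, T = T_q)/\tilde{P}(T = T_q)]$ is the center of gravity of the fuzzy subset, which converts the fuzzy subset into an exact value.
If the fault probability fuzzy subsets of all the root nodes are known, then the fault probability fuzzy subset of the leaf node state $T = T_q$ can be obtained through the conditional independence of Bayesian networks and the chain rule. It is expressed by the following formula:

$$\tilde{P}(T = T_q) = \sum P\left(T = T_q \mid \pi(T)\right) \prod_{j} P\left(y_j \mid \pi(y_j)\right) \prod_{i} \tilde{P}\left(x_i = x_i^{a_i}\right), \tag{8}$$

where the sum runs over all state combinations of the root nodes $x_i$ and the intermediate nodes $y_j$, $\pi(T)$ represents the set of all parent nodes of the leaf node $T$, and $\tilde{P}(x_i = x_i^{a_i})$ is the fault probability fuzzy subset of the root node $x_i$ when its fault state is $x_i^{a_i}$.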
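A simplified numerical sketch of this computation is given below for a single gate whose parents are all root nodes. It assumes that each fuzzy subset is a triangular triple (lower, central, upper) and uses component-wise arithmetic on the triples, which is a common approximation and not necessarily the fuzzy arithmetic used in the original work. The helper names, the OR-like gate, and the example probabilities are hypothetical.

```python
from itertools import product


def tri_mul(a, b):
    """Component-wise product of two non-negative triangular triples."""
    return tuple(x * y for x, y in zip(a, b))


def tri_add(a, b):
    """Component-wise sum of two triangular triples."""
    return tuple(x + y for x, y in zip(a, b))


def tri_complement(a):
    """Fuzzy subset of the complementary state: 1 - (lower, central, upper)."""
    lower, central, upper = a
    return (1 - upper, 1 - central, 1 - lower)


def centroid(a):
    """Center of gravity of a triangular fuzzy number."""
    return sum(a) / 3.0


def leaf_fuzzy_probability(cpt, root_subsets):
    """Fuzzy probability of one leaf state for a single gate over root nodes.

    cpt[(s1, ..., sn)] is the crisp probability of the leaf state given the
    root states; root_subsets[i][s] is the fuzzy subset for root i in state s.
    """
    total = (0.0, 0.0, 0.0)
    for states in product(*(sorted(r) for r in root_subsets)):
        term = (cpt[states],) * 3
        for subsets, state in zip(root_subsets, states):
            term = tri_mul(term, subsets[state])
        total = tri_add(total, term)
    return total


# Hypothetical fuzzy fault probabilities of two binary root nodes.
p1 = (0.09, 0.10, 0.11)
p2 = (0.18, 0.20, 0.22)
roots = [{1: p1, 0: tri_complement(p1)}, {1: p2, 0: tri_complement(p2)}]

# Hypothetical OR-like gate: the leaf fails if either root fails.
or_gate = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0}
leaf = leaf_fuzzy_probability(or_gate, roots)
```

The component-wise treatment keeps the triple ordered (lower, central, upper) and reduces to the ordinary crisp computation when the three components coincide. The `centroid` helper corresponds to the center-of-gravity step that converts a fuzzy subset into an exact value.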

Fault Diagnosis of Rotor Based on T-S Fuzzy Gate Fault Tree and BN

Rotor Fault Diagnosis of Water Pump. Buildings, electrical and mechanical equipment, and auxiliary equipment are the main components of the pump station. The mechanical and electrical equipment mainly includes the main water pump, the power machine, the electrical equipment, and the metal structure [12]. As the parts that do the direct work of the unit, the main water pump and motor are the most failure-prone parts of the pumping station, and whether they can operate safely directly affects the function and efficiency of the pumping unit. Studies have shown that more than half of the faults of rotating machinery are caused by faults of the rotor, which is a major component of a pump unit [13]. Therefore, the fault diagnosis of the rotor is the most important part of the fault diagnosis of the whole pump unit. A fault in the rotor not only does great harm to the whole pump unit but also seriously affects its watering and drainage tasks, which can result in immeasurable losses. So it is necessary to implement fault diagnosis for the rotor. Vibration is the main form of rotor faults, and an abnormal increase in the amplitude of the power frequency is the most common phenomenon. In the following, the effectiveness of the proposed method is illustrated through the phenomenon that the amplitude of the rotor's power frequency increases. The common faults that cause this abnormal increase are the mass imbalance and the thermal bending of the rotor. The causes of the mass imbalance are fouling, breakage or shedding of components, and initial eccentricity, while the causes of the thermal bending are inappropriate parking of the unit and uneven heat in movement. The T-S fuzzy gate fault tree constructed with this method is depicted in Figure 1. Table 2 lists the corresponding fault modes or causes and the states of each node. The events of an abnormal increase in the amplitude of the rotor's power frequency and of a breakage or shedding of a component have two states (yes, no), and the other fault events have three states (severe, general, and none). Then, the T-S fuzzy gate fault tree is transformed into the Bayesian network depicted in Figure 4 according to the method depicted in Figure 3.
Table 3 shows the fault data of some root nodes. This data comes from several large pumping stations in Jiangsu Province in recent years and has been compiled statistically. When the fault state of the fundamental frequency is 1, these data are combined with consultations with experts to obtain the possible fault probability fuzzy subsets of the rotor system shown in Table 4 (for example, $\{2.96 \times 10^{-6}, 3.0 \times 10^{-6}, 3.04 \times 10^{-6}\}$). In each fuzzy probability subset, the central value is the most likely value of the fault probability, the left value is the lower limit, and the right value is the upper limit. Tables 5 to 7 give the fuzzy gate rules corresponding to the constructed T-S fuzzy gate fault tree. For example, rule 1 in Table 5 indicates that when the state of $(x_1, x_2, x_3)$ is (0, 0, 0), the probabilities that the upper event takes the states 0, 0.5, and 1 are 1, 0, and 0, respectively. The fault probability fuzzy subsets of $y_1$, $y_2$, and $y_3$ given in Table 8 are obtained by (1) and (8).
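To make the shape of such a fuzzy gate rule table concrete, the sketch below enumerates the input-state combinations of one gate and stores rule 1 as quoted above. It assumes the labels `x1`, `x2`, `x3` for the three inputs, with `x2` (breakage or shedding of components) taken as the two-state event; the remaining rules are deliberately left empty because their values are not reproduced here. The helper names are hypothetical.

```python
from itertools import product

# Assumed state sets: three-state events use 0, 0.5, 1; the two-state event uses 0, 1.
INPUT_STATES = {"x1": (0, 0.5, 1), "x2": (0, 1), "x3": (0, 0.5, 1)}
OUTPUT_STATES = (0, 0.5, 1)

# One rule per combination of input states; values are filled in from the rule table.
combos = list(product(*INPUT_STATES.values()))
rules = {combo: None for combo in combos}

# Rule 1 as described in the text: inputs (0, 0, 0) give output state 0 with probability 1.
rules[(0, 0, 0)] = {0: 1.0, 0.5: 0.0, 1: 0.0}


def is_valid_rule(rule):
    """A rule is a probability distribution over the output states."""
    return (set(rule) == set(OUTPUT_STATES)
            and all(0.0 <= p <= 1.0 for p in rule.values())
            and abs(sum(rule.values()) - 1.0) < 1e-9)


def missing_rules(table):
    """Input-state combinations that still need a rule."""
    return [combo for combo, rule in table.items() if rule is None]
```

Each completed row of `rules` is one row of the conditional probability table of the corresponding BN node, so checking that every row sums to 1 is a cheap consistency test before inference.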
When the fault state of the leaf node $y_1$ is 1, the fault probability of the root node state $x_1 = 1$ can be obtained by (7). Similarly, the fault probabilities of the remaining root nodes are obtained, and all are shown in Table 9.
When the acquired amplitude of the rotor's power frequency increases abnormally, the fault probability of each root node is obtained through the above reasoning. From Table 9, the order from large to small is $x_2 > x_3 > x_4 > x_5 > x_1$. So the most likely cause of the abnormal increase in the amplitude of the rotor's power frequency is breakage or shedding of components, followed by initial eccentricity.

Algorithm Comparison and Analysis.
In the case of complete historical data, the traditional BN-based fault diagnosis method is built as follows: the structure of the BN is constructed from experts' experience, and the conditional probability table of every node is obtained through learning from the data. When the leaf node is abnormal, this method calculates the fault probabilities of the root nodes shown in Table 10. In addition, a BP neural network is trained on the same data, and its fault diagnosis results are also shown in Table 10. To consider incomplete data, the states of some nodes are set as unknown, and the fault diagnosis results of the proposed method, the traditional BN, and the BP neural network are also shown in Table 10.
For ease of analysis, the results in Table 10 are converted into the line chart shown in Figure 5. The figure shows that when the historical data is complete, the results of the proposed method are similar to those of the traditional BN-based fault diagnosis method, which demonstrates the effectiveness of the proposed method. The results of the method based on the BP neural network, however, deviate from both, because fault diagnosis based on a BP neural network requires a large amount of effective historical data, which is not available in reality. When the historical data is incomplete, the results of all three methods deviate from those obtained with complete data, but the diagnosis results of the proposed method stay closer to the complete-data results. The reason is that when the data is incomplete, the accuracy of the traditional BN diagnosis is degraded by the increasing error in parameter learning, and the diagnosis of the BP neural network becomes even less accurate. The proposed method effectively reduces the impact of data loss because it integrates experts' experience with the T-S fuzzy fault tree.
In this paper, the T-S fuzzy fault tree is used for construction, and the advantages of the BN are used for reasoning. At present, the inference algorithm based on the junction tree is the fastest in calculation and the most widely used for BNs. Its computational complexity increases exponentially with the size of the largest clique in the junction tree. For general BNs, the computing speed of current computers can meet the requirements.
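The exponential growth mentioned above can be seen from the size of the probability table that must be stored for one clique, which is the product of the state counts of its nodes. The short sketch below is only an illustration; the function name is hypothetical, and treating a node together with its three parents as one clique (with state counts 3, 2, 3, and 3) is an assumption based on the state counts described for the rotor example.

```python
from math import prod


def clique_table_size(state_counts):
    """Number of entries in the joint table of a clique with the given state counts."""
    return prod(state_counts)


# Assumed clique: one three-state node with parents having 3, 2, and 3 states.
rotor_clique_entries = clique_table_size([3, 2, 3, 3])

# Table sizes for cliques of 1 to 5 binary nodes double with every added node.
binary_sizes = [clique_table_size([2] * k) for k in range(1, 6)]
```

For a network of the size considered here the tables stay small, which is consistent with the remark that current computers handle general BNs without difficulty.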

Conclusions
This paper combines the T-S fuzzy fault tree method with the Bayesian network method to solve the problem of fault diagnosis of the pumping unit. The method overcomes both the complex reasoning of the T-S fuzzy fault tree method and the difficulty of determining the structure and conditional probability tables of the Bayesian network. The effectiveness of the method is verified by fault diagnosis of the rotor, which is one of the most fault-prone parts of the pump unit. The results are superior to those of the simple Bayesian network method when the data is insufficient. The method can be applied to the fault diagnosis of pumping stations with complex structure, many uncertainties, and multiple mappings.

Figure 3: T-S fuzzy tree converted into BN.

Figure 4: Bayesian network for fault diagnosis when the amplitude of the rotor's power frequency increased.

Figure 5: The fault probability of root nodes when the leaf node state is 1. [Plot of probability value; series: BN in complete data, BN in incomplete data, FFTA&BN in complete data, FFTA&BN in incomplete data, BPNN in complete data, BPNN in incomplete data.]

Table 1: Rules for the T-S gate.

Table 2: The fault modes and causes for increased amplitude of the rotor's power frequency.

Table 3: Partial characteristic data of the fault state of the rotor system of some large pumping stations in Jiangsu Province.

Table 4: Fault probability fuzzy subsets of root nodes with fault state 1.

Table 5: The fuzzy gate rules for mass imbalance of the rotor.

Table 6: The fuzzy gate rules for thermal bending of the rotor.

Table 7: The fuzzy gate rules for increased amplitude of the rotor's power frequency.

Table 8: Probability fuzzy subsets of leaf nodes and intermediate nodes.

Table 9: Conditional probability table of root nodes.