Bayesian Network Based Fault Prognosis via Bond Graph Modeling of High-Speed Railway Traction Device

Reliability of the traction system is of critical importance to the safety of CRH (China Railway High-speed) trains. To investigate the fault propagation mechanism and accurately predict the probabilities of component-level faults in a high-speed railway traction system, a fault prognosis approach based on Bayesian network and bond graph modeling techniques is proposed. The inherent structure of a railway traction system is represented by a bond graph model, based on which a multilayer Bayesian network is developed for fault propagation analysis and fault prediction. Two parameter learning algorithms, Bayesian estimation for complete data sets and the expectation maximization (EM) algorithm for incomplete data sets, are adopted to determine the conditional probability table of the Bayesian network. Using Pearl's polytree propagation algorithm for joint probability reasoning, the proposed prognosis approach predicts the failure probabilities of leaf nodes from the current status of root nodes. Verification results on a high-speed railway traction simulation system demonstrate the effectiveness of the proposed approach.


Introduction
The CRH (China Railway High-speed) train traction system is a complex electromechanical coupling system consisting of many electrical and mechanical devices, such as the pantograph, traction transformers, traction converters, and traction motors. As running time accumulates, some components in a traction system, such as IGBTs (insulated gate bipolar transistors) and diodes, degrade with age. These fatigued components are prone to abrupt faults such as short-circuit or open-circuit faults, which increase the risk of serious accidents across the railway. Thus, fault prognosis is urgently needed in high-speed railway traction systems.
Due to the complex structures and behaviors of electromechanical coupling traction systems, it is difficult to describe the causalities accurately through analytical models, which limits the application of existing analytical-model-based fault prognosis methods [1]. Instead, data-driven statistical prognosis methods, especially Bayesian networks, have become the mainstream [2][3][4][5][6][7]. However, constructing an accurate Bayesian network structure is a major challenge in practice. Some scholars have proposed approaches to learn the Bayesian network structure from data [4,5]. Because the accuracy of the learned Bayesian network depends heavily on the richness of the data and on prior knowledge of the network ordering, few attractive results have been obtained. This motivates the idea proposed in this paper: constructing the Bayesian network from a bond graph model.
Bond graph modeling has been widely applied in many engineering fields for modeling dynamic systems. It is particularly popular for modeling electromechanical coupling systems. In bond graphs, elements that belong to different energy domains (such as mechanical, electrical, and electromagnetic domains) can be described by the same model structure with a uniform modeling language. Bond graph theory and its recent applications have been summarized in [8][9][10]. Owing to its capability in describing the causality of a complex system, the bond graph has been investigated for modeling high-speed railway vehicles in recent years [11][12][13]. The bond graph model of a squirrel cage induction motor is introduced in [14], which contributes to the bond graph modeling of the high-speed train traction device. Model-based [15][16][17] and data-driven [18,19] FDD (fault detection and diagnosis) approaches, especially bond graph based methods [20][21][22][23][24][25], are becoming hot research topics. However, the causal relationships in a bond graph, which carry important information for reasoning, are seldom used directly in the fault prognosis domain.
This paper proposes a general procedure for constructing a Bayesian network structure on the basis of bond graph modeling for fault prognosis of the high-speed train traction device. The causal relationships revealed by the bond graph model are combined with the reasoning capacity of the Bayesian network. For complete and incomplete data sets, Bayesian estimation and the expectation maximization (EM) algorithm are adopted, respectively, to determine the conditional probability table of the Bayesian network. Pearl's polytree propagation algorithm is used for joint probability reasoning. The failure probabilities of the leaf nodes are determined from the current status of the root nodes. Simulation results on the CRH 5 traction system verify the effectiveness of the proposed approach.

Modeling of CRH 5 Traction Device
2.1. Railway Traction System. CRH 5 high-speed trains provide convenient public transportation among major cities in China. According to [26], the CRH 5 traction system is made up of a pantograph (model: DSA250), a vacuum circuit breaker (model: 22CBNG), a traction transformer (electrical standard: IEC 60310), a traction converter (model: YGN2Q213), traction motors (model: YJ87A, three-phase squirrel cage induction motor), and other components. An AC-DC-AC driving method is adopted, as shown in Figure 1: 25 kV HVAC (high voltage alternating current) collected through the pantograph is transformed into 1700 V AC (alternating current) by the traction transformer; then the converter (AC-DC-AC) outputs three-phase AC with controllable voltage and frequency for the traction motors.
For simplicity, the traction system consisting of two sets of inverters and induction motors is studied in this paper.
The three-phase inverter bridge circuits realize the VVVF (variable voltage variable frequency) drive of the three-phase ACIMs (alternating current induction motors). The inverter circuit, shown in Figure 2, uses IGBTs as the switching elements of its main circuit, and the voltage space vector control scheme is adopted in the inverter control circuit.
Open-circuit faults and short-circuit faults are the two common kinds of faults in traction inverters. Switch-on failure of the transistors and breakdown of a motor phase can cause open-circuit faults, which increase torque pulsations and copper losses or reduce mean torque and efficiency. Switch-off failure of the transistors and grounding of the phase terminals can cause short-circuit faults, which lead to overload burning of the stator and rotor circuits.

Bond Graph Modeling
2.2.1. Bond Graph: Basis. Electromechanical systems are governed by many effects arising from different physical phenomena and various technological components. The bond graph, a unified multidomain modeling and simulation approach, is well suited to such systems. It supports both structural and behavioral system analysis [27]. According to [28], the power variable is the product of effort (represented by $e$) and flow (represented by $f$), where the effort variable represents force, voltage, or pressure and the flow variable represents current, flow rate, or velocity. Capacitances ($C$), inertias ($I$), resistances ($R$), sources $\{S_e, S_f\}$, the gyrator (GY), the transformer (TF), and the junctions 0 and 1 are the generic bond graph components, which can be classified as 1-port components $\{S_e, S_f, C, I, R\}$, 2-port components {TF, GY}, and multiport components {1, 0}. These components are connected by a set of bonds, drawn as half arrows that indicate the direction of positive energy flow from one component to another. Each bond is annotated with an effort and a flow variable that describe the signals exchanged by the components it connects. The generic components and bond connections form the structure of the bond graph model, and a set of relations called constitutive relations describes the behavior of each component [29].
Besides its modeling capability for electromechanical systems, the bond graph approach can derive equations or information from the graph itself by using a concept called causality. In bond graphs, a stroke is marked at one end of each bond, which indicates the direction of an effort or flow signal. Users can derive the relations or analytical expressions between system variables to understand how fault signals propagate in the system.
CRH 5 Traction Devices. The converter in a CRH 5 unit consists of two sets of rectifiers, two sets of inverters, one traction control device, and a cooling system. Each traction motor is controlled by one set of inverters, whose equivalent circuit is shown in Figure 3. The three-phase AC motor can be represented by a series circuit of resistances, inductances, and counter electromotive force, where the effect of the counter electromotive force can be neglected when the load side is unloaded. For bond graph modeling of the inverter part, the IGBT and diode circuit can be simplified to a switch model consisting of a turn-on resistance $R_{\mathrm{on}}$ with a low resistance value, a turn-off resistance $R_{\mathrm{off}}$ with a high resistance value, and an ideal switch. In Figure 3, the series connection of $R_{\mathrm{on1}}$ and its ideal switch is the equivalent circuit of IGBT $V_1$, $R_{\mathrm{off1}}$ is the equivalent circuit of diode VD1, and so on. The equivalent resistance ($R_{\mathrm{eq}}$) takes different values for the different switch statuses. The final equivalent circuits of the inverter and three-phase AC motors for the open-circuit fault and short-circuit fault statuses are shown in Figures 4 and 5, respectively.
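The switch model described above can be sketched numerically. This is a minimal illustration, not the paper's own formula: it assumes that the ideal switch is in series with the turn-on resistance and that this branch is in parallel with the turn-off resistance, as the description of Figure 3 suggests. The function name and the resistance values are hypothetical.

```python
def equivalent_resistance(r_on: float, r_off: float, switch_closed: bool) -> float:
    """Equivalent resistance of one IGBT/diode pair under the assumed topology."""
    if switch_closed:
        # Conducting branch (r_on) in parallel with the blocking branch (r_off).
        return r_on * r_off / (r_on + r_off)
    # Open switch: only the high-resistance branch remains.
    return r_off


# Hypothetical values in ohms, chosen only to show the two regimes.
R_ON, R_OFF = 0.01, 1.0e6

r_closed = equivalent_resistance(R_ON, R_OFF, switch_closed=True)
r_open = equivalent_resistance(R_ON, R_OFF, switch_closed=False)
print(f"closed: {r_closed:.6g} ohm, open: {r_open:.6g} ohm")
```

Under this assumption the closed-switch value is dominated by the turn-on resistance and the open-switch value equals the turn-off resistance, which matches the low and high resistance roles given in the text.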

Bond Graph Modeling of Traction System.
According to [28], the main difficulty in bond graph modeling of power switching devices is how to describe the discrete system dynamics. So far, many bond graph based modeling approaches for power switching devices have been proposed, such as the enumeration method, the Petri net method, and the MTF (multitransformer) method. The MTF method is the most popular, where a modulus of 0 or 1 represents the status (ON or OFF) of the power switching device. As shown in Figure 6, the switch elements numbered 1 to 12 and $R_{\mathrm{off1}}$ to $R_{\mathrm{off12}}$ represent the switching tubes $V_1$ to $V_{12}$ and the diodes VD1 to VD12, respectively, in the inverter circuit. In the bond graph model of the three-phase AC motors, the stator resistance and stator winding determine the behavior of the stator circuit; the magnetizing resistance and winding represent the hysteresis losses and the magnetic flux losses in the stator and rotor circuit; the rotor resistance and rotor winding determine the behavior of the rotor circuit. The mechanical power generated by each phase is modeled by the "MGY" ports. The rotational mechanical power is added at the "1" junction, to which the motor shaft is attached through the inertia port "I". The friction losses are modeled by the resistance port "R".
Open-circuit faults and short-circuit faults are the two common kinds of faults in inverter circuits. According to circuit analysis, the features of the collector-emitter average voltage of the switching tube are the same when an open-circuit fault happens on $V_1$, $V_3$, or $V_5$ as when a short-circuit fault happens on $V_2$, $V_4$, or $V_6$. To locate faults accurately, the behaviors of $R_{\mathrm{on}}$ and $R_{\mathrm{off}}$ are defined as follows: the value of $R_{\mathrm{on}}$ increases rapidly when an open-circuit fault happens; the value of $R_{\mathrm{off}}$ decreases rapidly when a short-circuit fault happens. The entries of Table 1 are the collector-emitter average voltages on bonds number 10, 13, 17, 20, 24, and 27. "0" represents the nominal value of the collector-emitter average voltage; "+" represents an increase of the collector-emitter voltage; "−" represents a decrease of the collector-emitter voltage.
According to the equivalent circuits shown in Figures 4 and 5, Figure 7 shows the bond graphs of the B-phase circuit in the number 1 inverter under fault.

Fault Prognosis Based on Bayesian Network
3.1. Key Idea. Since the three phase circuits in the inverter are identical, the B-phase circuit is taken as the example for studying fault prognosis in this paper. Figure 8 illustrates the fault prognosis mechanism, which uses the Bayesian network of the B-phase circuit to predict the fault probability of the stator or rotor circuit. In the offline stage, the bond graph model of the CRH 5 inverter and three-phase AC motor is used to construct a Bayesian network for fault prognosis. The Bayesian network based fault prognosis module is activated once the structure of the predictive Bayesian network has been obtained and the conditional probability distributions have been acquired. In the online stage, evidence extracted from the system measurements propagates through the network, and the probability distribution of each variable is inferred.

Bayesian Network.
A Bayesian network is a directed acyclic graph (DAG) whose nodes represent random variables. The directed edges leading from cause variables to effect variables represent the causal relations. The strengths of these relations are denoted by the conditional probabilities between nodes and their parent nodes. According to [30], prior probabilities need to be specified for root nodes, while conditional probability distributions (CPDs) are specified for nonroot nodes. The joint probability distribution represented by the Bayesian network can be defined as

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{parents}(X_i)\bigr),$$

where $\mathrm{parents}(X_i)$ is the parent set of node $X_i$. The equation above, known as the chain rule, states that the joint probability distribution of all variables in the Bayesian network is the product of each variable's probability given the values of its parents. The probability distribution of each variable, or of a subset of variables, can be obtained by Bayesian network inference when the other variables are known.
A Bayesian network comprises two parts, $B = \langle G, \Theta \rangle$, where $G$ is a DAG conveying the direct dependence relationships within the data set, while $\Theta$ is the set of CPD parameters of the variables. Assume that $\Pi_{X_i}$ represents the set of direct parents of $X_i$ in $G$. $\Theta$ contains a parameter $\theta_{x_i \mid \Pi_{x_i}} = P_B(x_i \mid \Pi_{x_i})$ for each $x_i$, such that the network $B$ represents the following joint probability distribution [31]:

$$P_B(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P_B\bigl(X_i \mid \Pi_{X_i}\bigr) = \prod_{i=1}^{n} \theta_{X_i \mid \Pi_{X_i}}.$$

In the bond graph, the input variable of a "0" junction is defined as the effort variable of the connected bond that has a causal stroke assigned at the junction. The storage elements are usually assigned preferred derivative causality [29].
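The chain-rule factorization of the joint distribution can be illustrated with a short sketch. The three-node chain, the node names, and all probability values below are hypothetical and serve only to show how a joint probability is assembled from per-node conditional probabilities; they are not taken from the traction system.

```python
from itertools import product

# Hypothetical chain A -> B -> C with binary states (0 = normal, 1 = faulty).
parents = {"A": (), "B": ("A",), "C": ("B",)}

# cpt[node][parent_values] = P(node = 1 | parents); illustrative numbers only.
cpt = {
    "A": {(): 0.1},
    "B": {(0,): 0.05, (1,): 0.8},
    "C": {(0,): 0.02, (1,): 0.9},
}


def joint_probability(assignment: dict) -> float:
    """Chain rule: product over all nodes of P(node | parents(node))."""
    p = 1.0
    for node, pars in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pars)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p


nodes = list(parents)
# Summing the joint over every assignment must give 1 for a valid network.
total = sum(
    joint_probability(dict(zip(nodes, values)))
    for values in product((0, 1), repeat=len(nodes))
)
p_all_faulty = joint_probability({"A": 1, "B": 1, "C": 1})
print(total, p_all_faulty)
```

Storing only `P(node = 1 | parents)` is enough for binary nodes, since the complementary probability follows directly.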
The bond graph model implies the state equations describing the system dynamics. The causal relationships and the power transfer between components can be obtained clearly from the state equations derived from the bond graph. The causal relationships of the B-phase circuit in the number 1 inverter are listed on this basis.

Building Bayesian Network.
The first step is to establish the directed links between variables of a causal network by using the causality derived from the bond graph model of the CRH 5 traction system. The second step is to use intermediate variables to obtain the conditional probability distributions (CPDs). The third step is to specify the CPDs for each variable.
The directed graph model of the B-phase circuit, transformed from the bond graph model, is shown in Figure 9. Because a Bayesian network must be a directed acyclic graph, the simplifications proposed in [2] for directed graph models that contain loops are applied. The resulting Bayesian network of the B-phase circuit is shown in Figure 10, where $R_{\mathrm{on3}}$, $R_{\mathrm{off6}}$, $R_{\mathrm{off3}}$, and $R_{\mathrm{on6}}$ are the root nodes corresponding to the components of the B-phase circuit in the equivalent diagram. The resistances and windings of the stator and rotor circuits, which indicate the behavior of these circuits, are the leaf nodes of the Bayesian network of the B-phase circuit. The node of bond 17 is duplicated as a copied node according to the procedure in [2].
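The first construction step, turning causal links into a directed graph and confirming that it is acyclic, can be sketched as follows. The edge list is hypothetical (the actual links are those of Figures 9 and 10, which are not reproduced here); only the root node names follow the text, and the intermediate and leaf names are placeholders. The acyclicity check uses Kahn's topological sort, which is a generic technique and not necessarily the procedure of [2].

```python
from collections import deque

# Hypothetical causal links (cause, effect); placeholders for the real graph.
edges = [
    ("R_on3", "bond17"), ("R_off3", "bond17"),
    ("R_on6", "bond20"), ("R_off6", "bond20"),
    ("bond17", "stator"), ("bond20", "stator"),
    ("bond17", "rotor"),
]


def topological_order(edge_list):
    """Return a topological order of the nodes; raise ValueError on a loop."""
    children, indegree = {}, {}
    for cause, effect in edge_list:
        children.setdefault(cause, []).append(effect)
        children.setdefault(effect, [])
        indegree[effect] = indegree.get(effect, 0) + 1
        indegree.setdefault(cause, 0)
    queue = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError("directed graph contains a loop; simplify it first")
    return order


order = topological_order(edges)
causes = {c for c, _ in edges}
effects = {e for _, e in edges}
roots = sorted(causes - effects)    # nodes without parents
leaves = sorted(effects - causes)   # nodes without children
print(order, roots, leaves)
```

A graph that fails this check would need the loop-removal simplifications mentioned in the text before CPDs can be attached to its nodes.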

The Parameters of Bayesian Network.
Different parameter learning algorithms are used to obtain the conditional probability table of the Bayesian network for complete and incomplete data sets. For complete data sets, the Bayesian estimation algorithm is used for parameter estimation. Given the topological structure $G$ and the training data set $D$, it searches for the parameter value with maximum posterior probability according to the a priori knowledge:

$$P(\theta \mid D, G) = \frac{P(D \mid \theta, G)\, P(\theta \mid G)}{P(D \mid G)},$$

where $\theta$ is a fixed unknown parameter and $P(\theta \mid G)$ is the prior probability of $\theta$ under the topological structure $G$. Let the multinomial parameters $\theta_1, \theta_2, \ldots, \theta_n$ satisfy $\sum_{i=1}^{n} \theta_i = 1$. When $P(\theta \mid G)$ follows a Dirichlet distribution, the posterior probability of the parameter is again a Dirichlet distribution, from which the parameter estimate is obtained.
For incomplete data sets, the expectation maximization (EM) based iterative algorithm is used to compute the maximum likelihood estimate of the network parameters. Let the initial value of the parameter be $\theta^{(0)}$. The parameter is then modified iteratively to maximize the log-likelihood $\ln P(D \mid \theta)$, where $D$ is the whole training sample.
Given the observable training sample and the current parameter estimate, the expectation of the log-likelihood over the data set $D$ is computed first. This expectation is then maximized with respect to $\theta$ by means of maximum likelihood estimation. Here the whole training data set $D$ is the union of the observable data set and the data set that has not been observed.
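Both parameter learning routes can be sketched on a toy example. The code below is an illustration under stated assumptions, not the paper's implementation: `dirichlet_estimate` uses the posterior mean under a Dirichlet prior (the paper's exact estimator is not recoverable from the text), and `em_two_node` runs EM on a hypothetical two-node network A -> B in which the value of A is sometimes missing. All names, priors, and records are made up for illustration.

```python
def dirichlet_estimate(counts, alphas):
    """Posterior-mean estimate of multinomial parameters under a Dirichlet prior."""
    total = sum(counts) + sum(alphas)
    return [(n + a) / total for n, a in zip(counts, alphas)]


def em_two_node(records, p_a=0.5, p_b=(0.3, 0.7), iterations=50):
    """EM for a binary network A -> B; records are (a, b), a may be None.

    p_a = P(A=1); p_b[a] = P(B=1 | A=a). Assumes both states of A keep
    nonzero expected counts, which holds for the sample data below.
    """
    p_b = list(p_b)
    for _ in range(iterations):
        # E-step: weight w = P(A=1 | record) under the current parameters.
        weights = []
        for a, b in records:
            if a is not None:
                weights.append(float(a))
                continue
            like1 = p_a * (p_b[1] if b else 1.0 - p_b[1])
            like0 = (1.0 - p_a) * (p_b[0] if b else 1.0 - p_b[0])
            weights.append(like1 / (like1 + like0))
        # M-step: re-estimate the parameters from the expected counts.
        n1 = sum(weights)
        n0 = len(records) - n1
        p_a = n1 / len(records)
        p_b[1] = sum(w * b for w, (_, b) in zip(weights, records)) / n1
        p_b[0] = sum((1.0 - w) * b for w, (_, b) in zip(weights, records)) / n0
    return p_a, tuple(p_b)


# Complete data: 8 normal and 2 faulty observations with a uniform prior.
theta = dirichlet_estimate(counts=[8, 2], alphas=[1, 1])

# Incomplete data: None marks a missing value of A.
records = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0),
           (0, 1), (None, 1), (None, 0), (0, 0), (1, 1)]
p_a_hat, p_b_hat = em_two_node(records)
print(theta, p_a_hat, p_b_hat)
```

With no missing values the E-step weights are just the observed states, so the EM update reduces to the ordinary relative-frequency estimate after one iteration.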

Fault Prognosis Scheme.
In this paper, a fault prognosis scheme based on the Bayesian network of the B-phase circuit is designed to predict the fault probability of the stator or rotor circuit, whose faults may cause abnormal behavior of the overall system. Bayesian network based fault prognosis is a prediction mechanism that uses the joint probability distribution to obtain the fault probability of child nodes when the network structure, the fault probabilities of the root nodes, and the conditional probability tables of the other nodes are given. Causal variable analysis is the main application of a Bayesian network when the observed status of any of the random variables is given. The conditional probabilities of unobserved nodes are updated through belief propagation, and inference can be made about the most probable status [32]. In the example of Figure 10, if a leaf variable is observed as true, the status of $R_{\mathrm{off6}}$ can be inferred from this evidence. The conditional probability of $R_{\mathrm{off6}}$ given that the leaf variable is true is needed for this inference, and it can be quantified by marginalizing the joint distribution over the values of the remaining nodes ($R_{\mathrm{on3}}$, $R_{\mathrm{off3}}$, $R_{\mathrm{on6}}$, and the nodes of bonds 20 and 17) under the condition that the status of the leaf variable is known. The difficulty of this approach is that, as the number of nodes grows, exact calculation of the updated conditional probabilities by direct marginalization becomes an NP-hard problem in general. For this reason, Pearl's polytree propagation algorithm, a message-passing reasoning algorithm, is adopted in this paper. The fault probabilities of each node are computed according to the Bayesian network in Figure 10.
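The marginalization described above can be sketched with brute-force enumeration on a small network. Note the substitution: this is plain enumeration of the joint distribution, not Pearl's polytree propagation, and it is practical only for a handful of nodes. The network shape, node names, and probability values are hypothetical.

```python
from itertools import product

# Hypothetical polytree: root_a -> mid <- root_b, mid -> leaf.
# States: 0 = normal, 1 = faulty.
parents = {"root_a": (), "root_b": (), "mid": ("root_a", "root_b"), "leaf": ("mid",)}
cpt = {  # P(node = 1 | parent values); illustrative numbers only
    "root_a": {(): 0.10},
    "root_b": {(): 0.05},
    "mid": {(0, 0): 0.01, (0, 1): 0.70, (1, 0): 0.80, (1, 1): 0.95},
    "leaf": {(0,): 0.02, (1,): 0.90},
}
nodes = list(parents)


def joint(assignment):
    """Joint probability of a full assignment via the chain rule."""
    p = 1.0
    for node in nodes:
        p1 = cpt[node][tuple(assignment[q] for q in parents[node])]
        p *= p1 if assignment[node] else 1.0 - p1
    return p


def query(target, evidence):
    """P(target = 1 | evidence), summing the joint over unobserved nodes."""
    num = den = 0.0
    for values in product((0, 1), repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(assignment)
        den += p
        if assignment[target] == 1:
            num += p
    return num / den


# Prediction: leaf fault probability given the status of the root nodes.
p_leaf_given_roots = query("leaf", {"root_a": 1, "root_b": 0})
# Diagnosis: root fault probability given that the leaf is observed faulty.
p_root_given_leaf = query("root_a", {"leaf": 1})
print(p_leaf_given_roots, p_root_given_leaf)
```

The loop visits every one of the 2^n assignments, which is exactly the cost that motivates a propagation algorithm on larger networks.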

Results and Discussions
First, a bond graph model of the CRH 5 traction system is built in the 20-sim environment. When an open-circuit fault happens on IGBT $V_3$, the load current loses its positive half wave, which may cause the motor to stall. When a short-circuit fault happens on IGBT $V_3$, the load current increases rapidly. Because a short circuit on an IGBT may cause catastrophic failure of the stator or rotor circuit, it is the fault studied in this simulation section.
In this section, 10000 sets of data produced by experiments are divided into 10 groups, where 600 sets of data are used for training and 400 sets for testing in each group. According to [33], the fault distribution in an electric drive system is as follows: power devices (37%), capacitors (20%), inductors (5%), resistors (2%), connectors (15%), gate drives (16%), and others (5%). In addition, the fault probabilities of the IGBT open-circuit fault (37%) and short-circuit fault (43%) are also given in [33]. Taking the complete experimental data as the training sample for parameter learning, the conditional probabilities of the intermediate nodes and leaf nodes are computed by maximum likelihood estimation. The conditional probability table is shown in Table 2, where the two state labels represent normal nodes and faulty nodes, respectively.
The fault probability of the stator or rotor circuit is then predicted for the case in which a short-circuit fault of the IGBT components in the B-phase inverter circuit happens. With Pearl's polytree propagation algorithm used for joint probability reasoning, the prediction results for the fault probability of the stator and rotor circuits over the 10000 sets of complete data are shown in Figures 11 and 12.
In modern engineering systems, some values in the measurement data set are often missing. In that case the EM (expectation maximization) method, a well-known parameter estimation algorithm, is used for probabilistic reasoning in combination with Pearl's polytree propagation algorithm. Each iteration of the EM algorithm contains two steps: the E-step (expectation) and the M-step (maximization). The procedure alternates between the two steps until convergence is achieved. It should be pointed out that the search slows down as the EM algorithm approaches its convergence point.
In this simulation, 20% of the 10000 sets of data are hidden at random to imitate missing data. The remaining sets are divided into 10 groups, where 600 sets of data are used for training and 200 sets for testing in each group. The conditional probability table of each node is obtained by the EM algorithm. The fault probability of the leaf nodes is then predicted by Pearl's polytree propagation algorithm when the prior probabilities of the root nodes are given. In Figures 13 and 14, the prediction results for the stator and rotor circuits under incomplete data also verify the effectiveness of the proposed approach.
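The data preparation described above can be sketched as follows. This assumes one reading of the text, namely that 20% of whole records are hidden and the remaining 8000 are split into 10 groups of 600 training and 200 test records; the function name, the use of record indices in place of real measurements, and the seed are hypothetical.

```python
import random


def hide_and_split(n_total=10000, hide_fraction=0.2, n_groups=10,
                   n_train=600, n_test=200, seed=0):
    """Hide a random fraction of record indices, then split the rest into groups."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    n_hidden = int(n_total * hide_fraction)
    hidden, remaining = indices[:n_hidden], indices[n_hidden:]
    group_size = n_train + n_test
    if len(remaining) < n_groups * group_size:
        raise ValueError("not enough remaining records for the requested split")
    groups = []
    for g in range(n_groups):
        chunk = remaining[g * group_size:(g + 1) * group_size]
        # First n_train indices for training, the rest for testing.
        groups.append((chunk[:n_train], chunk[n_train:]))
    return hidden, groups


hidden, groups = hide_and_split()
print(len(hidden), len(groups), len(groups[0][0]), len(groups[0][1]))
```

Shuffling once and slicing keeps the hidden set and all groups disjoint, so no record is used for both training and testing.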

Conclusion
Based on the schematic diagram of the CRH 5 traction system, a bond graph model of the inverter circuit and three-phase AC motor is built. Then, a bond graph and Bayesian network based fault prognosis approach is proposed to predict the fault probability of the stator and rotor circuits when the prior fault probabilities of the IGBT components are given. The bond graph model of the CRH 5 traction system is used to build the Bayesian network, which addresses the problem of constructing an accurate Bayesian network structure in practice. Two parameter learning algorithms, Bayesian estimation and the EM algorithm, are adopted to determine the conditional probability table of the Bayesian network for complete and incomplete data sets, respectively. The fault probabilities of the leaf nodes (stator and rotor circuits) are predicted by joint probability reasoning through Pearl's polytree propagation algorithm. The simulation results verify the effectiveness of the approach in fault prognosis for both complete and incomplete data sets. Future work will address the following: (1) bond graph modeling of the junction parts; (2) new bond graph modeling approaches for the power switching device (IGBT), to achieve more accurate modeling.

Figure 3: Equivalent circuit of the inverter and three-phase AC motor.

Figure 6: Bond graph modeling of CRH 5 inverter and three-phase AC motor.
Figure 7(a) shows the bond graph of the B-phase circuit in the number 1 inverter, where IGBT $V_3$ has an open-circuit fault. Figure 7(b) shows the bond graph of the B-phase circuit in the number 1 inverter, where IGBT $V_3$ has a short-circuit fault.

Figure 7: Bond graph modeling of B-phase inverter circuit in fault.

Table 1: Common faults in the bond graph model of CRH 5 traction system.

Table 2: Conditional probability table of parameter learning.