Data-Driven Fault Diagnosis Method for Power Transformers Using Modified Kriging Model

A data-driven fault diagnosis method that combines the Kriging model and a neural network is presented and applied to power transformers based on the analysis of gases dissolved in oil. To improve the modeling accuracy of the Kriging model, a modified model that replaces the global model of the Kriging model with a BP neural network is presented and is further extended using the linear weighted aggregation method. The presented method integrates the global approximation of the neural network with the localized departures of the Kriging model, which improves modeling accuracy. Finally, the validity of this method is demonstrated by several numerical computations of transformer fault diagnosis problems.


Introduction
The transformer is one of the most important pieces of equipment in a power system [1]; it is mainly used to transfer electrical energy between two or more circuits through electromagnetic induction. During operation, factors such as electrical, thermal, and mechanical stresses may cause irreversible damage to the insulating material [2]. To improve the reliability of power supply, transformer fault diagnosis has drawn much attention from researchers, and many fault diagnosis methods have been proposed during the past decades.
At present, Dissolved Gas-in-oil Analysis (DGA) is a commonly used method for identifying incipient transformer faults [3]. With the development of artificial intelligence and computer technology, many fault diagnosis algorithms have been proposed based on DGA, such as neural networks [4,5], fuzzy logic [6,7], expert systems [8], support vector machines [2,9], and rough set theory [10]. Because each existing method has its own characteristics and limitations, effective fault diagnosis methods that integrate the advantages of existing technologies to improve modeling accuracy remain an open area of research.
As a classic modeling technology, the Kriging model combines a global model with localized departures to construct an approximation from sample data. It has been widely used in the field of Computer-Aided Engineering (CAE) [11,12]. On the other hand, neural network technology is a well-known information processing paradigm and has been widely applied in various areas owing to advantages such as adaptive learning. In this paper, a data-driven fault diagnosis model based on the Kriging model and a BP neural network is constructed and then applied to transformer fault diagnosis problems based on DGA. In order to improve the modeling accuracy of Kriging ... response values. The output can be viewed as a combination of a regression model and a stochastic process [13,14]. The regression model is equivalent to a global simulation of the sample space, and the stochastic process is equivalent to a local deviation.
In order to further improve the accuracy of the Kriging model, this paper proposes a transformer fault diagnosis method that combines the Kriging model and neural network technology. The structure of this hybrid model is shown in Figure 1. The main steps for constructing the hybrid model are as follows:
(1) Determine the characteristic variables and fault types based on the DGA method, and collect sample data and test data.
(2) Construct the BP network model from the sample data.
(3) Construct the Kriging model based on the sample data.
(4) Modify and update the global simulation of the Kriging model with the neural network, and then construct the hybrid model.

Kriging Model and Parameter Optimization.
The Kriging model consists of a global simulation plus localized departures, and its basic principles can be briefly stated as follows [13,14]. Let the approximate function be $\hat{y}(x)$; the relationship between the response value and the independent variable of the system can be formulated as

$$y(x) = F(\beta, x) + z(x), \tag{1}$$

where $y(x)$ is the unknown function of interest, $F(\beta, x)$ is the regression model that is equivalent to the global simulation, and $\beta$ is the regression parameter; $z(x)$ is a normal stochastic process with mean 0 and variance $\sigma^2$. It reflects the randomness of the response and is equivalent to the local deviation. The covariance of $z(x)$ is formulated as

$$\operatorname{Cov}\left[z(x_i), z(x_j)\right] = \sigma^2 R(\theta, x_i, x_j), \quad i, j = 1, 2, \ldots, n, \tag{2}$$

where $R(\theta, x_i, x_j)$ is the correlation function between $x_i$ and $x_j$ and is the $(i, j)$ entry of the correlation matrix $\mathbf{R}$ of order $n$; $n$ is the number of sample points; $x_i$ and $x_j$ are the $i$th and the $j$th sample points. In this paper, we utilize the Gaussian correlation function:

$$R(\theta, x_i, x_j) = \exp\left(-\sum_{k=1}^{m} \theta_k \left|x_i^{(k)} - x_j^{(k)}\right|^2\right), \tag{3}$$

where $m$ is the dimension of the problem; $x_i^{(k)}$ and $x_j^{(k)}$ are the $k$th components of the $i$th and the $j$th sample points, respectively; $\theta_k$ are the unknown correlation parameters of the interpolation model.
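As an illustration of the Gaussian correlation function, the following minimal Python sketch builds the correlation matrix for a handful of sample points. The function names, the sample points, and the parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def gaussian_correlation(xi, xj, thetas):
    """Gaussian correlation between two points, with one theta per dimension."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(thetas, xi, xj)))

def correlation_matrix(samples, thetas):
    """Build the n-by-n correlation matrix over all pairs of sample points."""
    return [[gaussian_correlation(xi, xj, thetas) for xj in samples]
            for xi in samples]

# Hypothetical 2-D sample points and parameters, for illustration only.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
thetas = (0.5, 0.25)
R = correlation_matrix(samples, thetas)
```

The matrix is symmetric with ones on the diagonal, and each off-diagonal entry decays with the weighted squared distance between the two sample points.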
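In general, $\theta_k$ can be replaced with a scalar $\theta$. Thus, formula (3) can be written as

$$R(\theta, x_i, x_j) = \exp\left(-\theta \sum_{k=1}^{m} \left|x_i^{(k)} - x_j^{(k)}\right|^2\right). \tag{4}$$

Therefore, the estimated value at a test point $x$ is given by

$$\hat{y}(x) = \hat{\beta} + \mathbf{r}^T(x)\,\mathbf{R}^{-1}\left(\mathbf{y} - \mathbf{f}\hat{\beta}\right), \tag{5}$$

where $\hat{\beta}$ is the estimate of the global simulation; $\mathbf{y}$ is the vector of sample responses; $\mathbf{f}$ is a column vector of length $n$ (filled with ones when the regression model is constant); $\mathbf{r}(x)$ is the correlation vector between the observation point $x$ and the sample data:

$$\mathbf{r}(x) = \left(R(\theta, x, x_1), R(\theta, x, x_2), \ldots, R(\theta, x, x_n)\right)^T. \tag{6}$$

When the regression model is taken as a constant, $\hat{\beta}$ can be estimated by

$$\hat{\beta} = \left(\mathbf{f}^T \mathbf{R}^{-1} \mathbf{f}\right)^{-1} \mathbf{f}^T \mathbf{R}^{-1} \mathbf{y}. \tag{7}$$

The parameter $\theta$ determines the accuracy of the Kriging model and is obtained by solving the following optimization problem:

$$\max_{\theta > 0} \; -\frac{n \ln\left(\hat{\sigma}^2\right) + \ln\left|\mathbf{R}\right|}{2}, \tag{8}$$

where $\hat{\sigma}^2$ is the variance estimate, determined by

$$\hat{\sigma}^2 = \frac{\left(\mathbf{y} - \mathbf{f}\hat{\beta}\right)^T \mathbf{R}^{-1} \left(\mathbf{y} - \mathbf{f}\hat{\beta}\right)}{n}. \tag{9}$$

Intelligent optimization algorithms are commonly used to optimize $\theta$. In this paper, a modified particle swarm optimization (MPSO) [15] is used to optimize the parameters of the Kriging model. The key points of applying MPSO are threefold: (1) encode $\theta$ as a real number; (2) take (8) as the objective function; and (3) handle the constraint $\theta > 0$ with the commonly used penalty function method.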
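The prediction step can be made concrete with a short sketch. The Python code below is a minimal illustration of an ordinary Kriging predictor with a constant global model and a scalar correlation parameter; it is not the authors' implementation. The helper names (`solve`, `corr`, `fit_kriging`, `predict`), the tiny one-dimensional data set, and the fixed value of the correlation parameter are assumptions made for illustration, and the likelihood-based search for the parameter is omitted.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def corr(xi, xj, theta):
    """Gaussian correlation with a single scalar theta."""
    return math.exp(-theta * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def fit_kriging(X, y, theta):
    """Estimate the constant global term and the variance for a fixed theta."""
    n = len(X)
    R = [[corr(xi, xj, theta) for xj in X] for xi in X]
    # With f a vector of ones, f^T R^-1 v is the sum of the entries of R^-1 v.
    beta = sum(solve(R, y)) / sum(solve(R, [1.0] * n))
    resid = [yi - beta for yi in y]
    weights = solve(R, resid)  # R^-1 (y - f beta)
    sigma2 = sum(r * w for r, w in zip(resid, weights)) / n
    return {"X": X, "theta": theta, "beta": beta,
            "weights": weights, "sigma2": sigma2}

def predict(model, x):
    """Global estimate plus the correlation-weighted local correction."""
    r = [corr(x, xi, model["theta"]) for xi in model["X"]]
    return model["beta"] + sum(ri * wi for ri, wi in zip(r, model["weights"]))

# Hypothetical 1-D samples, for illustration only.
X = [(0.0,), (1.0,), (2.0,)]
y = [1.0, 3.0, 2.0]
model = fit_kriging(X, y, theta=1.0)
```

Because the local correction interpolates the residuals, `predict(model, X[i])` reproduces `y[i]` up to rounding error, which is a convenient sanity check for any implementation.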
The inertia weight $w$ and the learning factors $c_1$ and $c_2$ in MPSO are updated at each iteration following [15]. In the update rules, $w_s$ and $w_e$ are the initial value and the final value of the inertia weight $w$, respectively; $c_{1s}$ and $c_{1e}$ and $c_{2s}$ and $c_{2e}$ are the initial value and the final value of the learning factors $c_1$ and $c_2$, respectively; $T_{\max}$ is the maximum number of iterations; $t$ is the current iteration number.
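The exact update rules of [15] are not reproduced in this text. As a rough sketch only, the Python code below assumes a simple linear interpolation between the initial and final values, which is one common choice in particle swarm optimization and may differ from the MPSO rules; it also shows one possible penalty wrapper for the positivity constraint on the Kriging parameter. All names and numeric settings are hypothetical.

```python
def linear_schedule(start, end, t, t_max):
    """Interpolate linearly from start (t = 0) to end (t = t_max).

    Assumption: a linear schedule; the rule used in [15] may differ.
    """
    return start + (end - start) * t / t_max

def penalized_objective(objective, theta, penalty=1e6):
    """Objective to maximize, with a penalty when theta > 0 is violated."""
    if theta <= 0.0:
        return -penalty * (1.0 + abs(theta))
    return objective(theta)

# Hypothetical settings, not taken from the source.
T_MAX = 100
w_start, w_end = 0.9, 0.4
inertia = [linear_schedule(w_start, w_end, t, T_MAX) for t in range(T_MAX + 1)]
```

The same `linear_schedule` helper could be reused for the two learning factors by passing their own initial and final values.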

Combinations of Neural Network and Kriging Model.
The mapping between the characteristic variables and the fault types of a transformer is very complex, which makes it difficult for the Kriging model alone to reach high accuracy. On the other hand, neural network technology, for example, the BP network, is a well-known information processing paradigm with advantages such as adaptive learning and strong adaptability. The basic principle of the BP network can be found in many references such as [4,5,16] and is not detailed here. In general, the output of a BP network node can be formulated as

$$o_j = g\left(\sum_{i} w_{ij}\, o_i - b_j\right), \tag{13}$$

where $o_j$ are the outputs of the BP network, $g$ is the transfer function, $w_{ij}$ are the network weights, $b_j$ are the network thresholds, and $o_i$ are the outputs of the upper layer nodes.
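To improve modeling accuracy, a modified Kriging model (hybrid model) is constructed by combining the Kriging model and BP neural network technology; its overall design is shown in Figure 1. More specifically, the global model of the Kriging model is replaced with the BP neural network, which gives

$$\hat{y}(x) = \hat{y}_{BP}(x) + \mathbf{r}^T(x)\,\mathbf{R}^{-1}\left(\mathbf{y} - \hat{\mathbf{y}}_{BP}\right), \tag{14}$$

where $\hat{y}_{BP}(x)$ is the output of the BP network at $x$ and $\hat{\mathbf{y}}_{BP}$ is the vector of BP outputs at the $n$ sample points. Further, this modified method can be extended using the linear weighted aggregation method:

$$\tilde{F}(x) = \lambda_1\, \hat{y}_{BP}(x) + \lambda_2\, \hat{\beta}, \tag{15}$$

where $\tilde{F}$ is the modified global model and $\lambda_1$ and $\lambda_2$ are weighting coefficients. Thus, the final output of the hybrid model can be given as follows:

$$\hat{y}(x) = \tilde{F}(x) + \mathbf{r}^T(x)\,\mathbf{R}^{-1}\left(\mathbf{y} - \tilde{\mathbf{F}}\right), \tag{16}$$

where $\tilde{\mathbf{F}}$ is the vector of values of $\tilde{F}$ at the $n$ sample points. The hybrid model is therefore general. When $\lambda_1 = 0$ and $\lambda_2 = 1$, the hybrid model becomes the original Kriging model and (16) reduces to (5). When $\lambda_1 = 1$ and $\lambda_2 = 0$, the hybrid model reduces to (14), where the global model of the Kriging model is replaced with the BP neural network.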
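The weighted combination of the BP output and the Kriging global estimate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the local correction is assumed to interpolate the residuals between the sample responses and the blended global model, `stand_in_bp` is a hypothetical placeholder for a trained BP network, and the data, the correlation parameter, and all function names are invented for illustration.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def corr(xi, xj, theta):
    """Gaussian correlation with a single scalar theta."""
    return math.exp(-theta * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def hybrid_predictor(X, y, theta, global_fn, lam1, lam2):
    """Return a predictor: blended global model plus a Kriging correction."""
    n = len(X)
    R = [[corr(xi, xj, theta) for xj in X] for xi in X]
    # Constant Kriging global estimate, with f taken as a vector of ones.
    beta = sum(solve(R, y)) / sum(solve(R, [1.0] * n))

    def blended(x):
        return lam1 * global_fn(x) + lam2 * beta

    # R^-1 times the residuals against the blended model at the samples.
    weights = solve(R, [yi - blended(xi) for xi, yi in zip(X, y)])

    def predict(x):
        r = [corr(x, xi, theta) for xi in X]
        return blended(x) + sum(ri * wi for ri, wi in zip(r, weights))

    return predict

def stand_in_bp(x):
    """Hypothetical placeholder for a trained BP network output."""
    return 1.0 + 0.5 * x[0]

# Hypothetical 1-D samples, for illustration only.
X = [(0.0,), (1.0,), (2.0,)]
y = [1.0, 3.0, 2.0]
predict_kriging_only = hybrid_predictor(X, y, 1.0, stand_in_bp, 0.0, 1.0)
predict_bp_global = hybrid_predictor(X, y, 1.0, stand_in_bp, 1.0, 0.0)
```

With weights (0, 1) the predictor coincides with ordinary Kriging, and with weights (1, 0) the global trend comes entirely from the stand-in network; in both cases the correction term still reproduces the sample responses at the sample points.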

Feature Variable and Fault Type.
In general, the concentrations of five gases ($\mathrm{H_2}$, $\mathrm{CH_4}$, $\mathrm{C_2H_6}$, $\mathrm{C_2H_4}$, and $\mathrm{C_2H_2}$) dissolved in transformer oil can be selected as characteristic variables based on DGA data samples. The corresponding fault types are normal, high temperature overheating, medium temperature overheating, low temperature overheating, partial discharge, low energy discharge, and high energy discharge. In this paper, we select DGA data published in [17,18]. The distributions and coding of these fault data are shown in Table 1.

Parameter Setting.
In this paper, the BP network is set up as a three-layer network. The selected sample data have 5 characteristic variables; thus the number of nodes in the input layer is set to 5, the number of nodes in the output layer is set to 1, and the number of hidden layer nodes is set to 8 by trial and error. The weights and thresholds of the network are randomly initialized, and the log-sigmoid function is selected as the transfer function.
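To make the network structure concrete, the following Python sketch builds a 5-8-1 network with random initial weights and thresholds and evaluates one forward pass. It is an illustration only: training by back-propagation is omitted, the log-sigmoid is assumed for both the hidden and the output layer, the threshold is assumed to be subtracted from the weighted sum, and the initialization range, the random seed, and the input values are hypothetical.

```python
import math
import random

def logsig(v):
    """Log-sigmoid transfer function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and one threshold per node."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    thresholds = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, thresholds

def forward_layer(inputs, weights, thresholds):
    """Node output: transfer function of (weighted input sum minus threshold)."""
    return [logsig(sum(w * x for w, x in zip(row, inputs)) - b)
            for row, b in zip(weights, thresholds)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_w, hidden_b = init_layer(5, 8, rng)   # 5 inputs -> 8 hidden nodes
output_w, output_b = init_layer(8, 1, rng)   # 8 hidden nodes -> 1 output

def bp_forward(gases):
    """Forward pass for one sample of five (already scaled) gas values."""
    hidden = forward_layer(gases, hidden_w, hidden_b)
    return forward_layer(hidden, output_w, output_b)[0]

# Hypothetical scaled gas concentrations, for illustration only.
sample = [0.12, 0.30, 0.05, 0.41, 0.02]
out = bp_forward(sample)
```

Because the output node also uses the log-sigmoid in this sketch, the network output always lies strictly between 0 and 1, so any fault coding would have to be scaled to that range.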

Analysis of Examples.
(1) Table 2 compares the proposed method with other methods. For these test examples, the proposed method effectively improves the accuracy of transformer fault diagnosis.
(2) Tables 1 and 2 also show that the average calculation time of the presented method is about 7.84 s for test data comprising 111 sample points (hardware: i5 CPU, 4 GB RAM; software: Matlab), which demonstrates the efficiency of the presented method.
(3) Table 3 lists some test results obtained with the presented method (owing to the limit on paper length, the complete results are not listed here). The results show that the diagnosis accuracy is basically satisfactory.

Conclusions
In this paper, a data-driven fault diagnosis model based on the Kriging model and a neural network is proposed. The proposed model replaces the global model of the Kriging model with neural network technology while retaining the localized departures of the Kriging model. The presented hybrid model is then applied to power transformer fault diagnosis problems based on the DGA method. Numerical computations of transformer fault diagnosis problems are conducted, and the results show the feasibility and efficiency of the proposed method. In addition, the presented modified Kriging model has potential application value in other areas such as power systems and engineering machinery.

Conflicts of Interest
The authors declare that the funding mentioned did not lead to any conflicts of interest regarding the publication of this manuscript and that there are no other possible conflicts of interest in the manuscript.