Modeling Analysis of Power Transformer Fault Diagnosis Based on Improved Relevance Vector Machine

A new method of transformer fault diagnosis based on relevance vector machine (RVM) is proposed. Bayesian estimation is applied to support vector machine (SVM) in the novel algorithm, which makes the fault diagnosis system work more effectively. In the paper, the analysis model shows that the solutions of RVM are sparse and that RVM can obtain global solutions under finite samples. The process of transformer fault diagnosis for four working statuses is given in experiments and simulations. The results validate that this method has obvious advantages in diagnosis time and accuracy compared with backpropagation (BP) neural networks and general SVM methods.


Introduction
Power transformers are key equipment for electric power transmission and distribution; they are widely distributed, complex, and expensive components of the power system. Their condition strongly affects the stability and security of the power system. Therefore, it is of great practical significance to study fault diagnosis technology and raise the level of maintenance of power transformers. Dissolved gas analysis (DGA) [1,2], which provides operation information by detecting certain gases generated in an oil-filled transformer, is the method widely adopted by utilities. The concentrations of the dissolved gases, their generation rates, the ratios of respective gases, or the total combustible gases in the oil are the attributes used in the DGA method to interpret a malfunction. Accuracy and reliability of the DGA data are influenced by many factors, so the DGA three-ratio method and the improved ratio methods of the classical IEC standard [3] have obvious limitations, for example when a ratio crosses a coding boundary or the codes change sharply [3,4]. They cannot offer completely objective, accurate diagnosis for all faults; for instance, diagnostic accuracy is low for overheating faults. With the development of artificial intelligence, some solutions [5,6] based on neural networks have been applied to power transformer fault diagnosis. Though they improve diagnosis results, such methods have inherent disadvantages in application, such as local optima and the danger of overfitting [7,8]. Support vector machine (SVM) overcomes the drawbacks of neural networks in convergence and real-time application [9,10]. In recent years, SVM methods have been applied to fault diagnosis and identification of power transformers [11][12][13]. However, the input characteristic information during fault diagnosis is so large that SVM methods require heavy computation to approximate the optimal solution and spend much time on parameter searching [14,15]. They are therefore difficult to use for real-time monitoring and fault diagnosis of power transformers.
Because there are many uncertainties in the running of transformers and only finite samples are available during fault diagnosis, relevance vector machine is introduced to fault detection and identification of power transformers. RVM, which applies a Bayesian estimation model to SVM algorithms [16,17], can reduce the number of hyperparameters in diagnosis algorithms and adapts well in the choice of kernel function and model parameters. RVM is a machine learning methodology based on sparse Bayesian learning theory proposed by Tipping in 2000. It retains the good generalization and precision of SVM; meanwhile, it overcomes some inherent limitations of SVM and possesses advantageous features such as a high degree of sparsity, fewer kernel functions, and low computation load. In particular, RVM classification can provide posterior probabilities for class memberships, which suits the analysis of the uncertain problems in power transformer fault diagnosis. Fault diagnosis models based on RVM are established in the paper from dissolved gas analysis of power transformer oil. Through fault diagnosis for several kinds of power transformers, the proposed algorithm is validated to be better than BP neural networks and general SVM methods in terms of finite sample size, diagnosis speed, and diagnosis accuracy. It provides a new way of thinking for resolving the transformer fault diagnosis problem in real time. Future research on RVM-based transformer fault diagnosis will address real-time implementation for power transformer systems. Recently, data-driven fault diagnosis methods have become more popular in many industry sectors [18][19][20][21]. So combining the proposed method with data-driven methods is future work toward large-scale real-time implementation of fault diagnosis for power transformers.
The paper is organized as follows. In the next section, we introduce the RVM model of fault diagnosis. The diagnosis process for power transformers is described in Section 3, which includes fault information preprocessing, diagnosis model training, and fault identification for power transformers. In Section 4, we give the experimental results and offer a benchmark comparison with least squares support vector machine (LS-SVM) and BP neural networks before summarising in Section 5.

Analysis of RVM Model for Faults Diagnosis
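RVM, based on the Bayesian learning framework, introduces the Bayesian theory of Gaussian processes into SVM. The results of inference are given in the form of a probability density, and the model can be written as

$$y(\mathbf{x}; \mathbf{h}) = \sum_{i=0}^{N} h_i \Phi_i(\mathbf{x}), \qquad \Phi_0(\mathbf{x}) = 1, \tag{1}$$

where $\Phi_i(\mathbf{x})$ is a nonlinear kernel function and $h_i$ is the weight of the model. To avoid the overfitting of traditional SVM, the maximum likelihood method is applied within the Bayesian framework [22] to train the model weights. RVM defines the prior probability distribution as follows:

$$p(\mathbf{h} \mid \mathbf{a}) = \prod_{i=0}^{N} \mathcal{N}\left(h_i \mid 0, a_i^{-1}\right), \tag{2}$$

where $a_i$ is the hyperparameter of the prior distribution. Assuming that the input training sample set is $\{\mathbf{x}_i, k_i\}_{i=1}^{N}$, the target values $k_i$ are independent, and the input noise is Gaussian with variance $\sigma^2$, the likelihood function of the training sample set can be expressed as

$$p(\mathbf{k} \mid \mathbf{h}, \sigma^2) = \left(2\pi\sigma^2\right)^{-N/2} \exp\left(-\frac{\lVert \mathbf{k} - \Phi\mathbf{h} \rVert^2}{2\sigma^2}\right), \tag{3}$$

where $\mathbf{k} = [k_1, \ldots, k_N]^T$, $\mathbf{h} = [h_0, h_1, \ldots, h_N]^T$, and $\Phi$ is the design matrix whose $i$th row $\boldsymbol{\phi}(\mathbf{x}_i)^T = [1, \Phi_1(\mathbf{x}_i), \ldots, \Phi_N(\mathbf{x}_i)]$ is the response of all kernel functions to input $\mathbf{x}_i$. The posterior distribution of the weights conditioned on the data is given by combining the likelihood function (3) and the prior probability distribution (2) through Bayes' rule:

$$p(\mathbf{h} \mid \mathbf{k}, \mathbf{a}, \sigma^2) = \frac{p(\mathbf{k} \mid \mathbf{h}, \sigma^2)\, p(\mathbf{h} \mid \mathbf{a})}{p(\mathbf{k} \mid \mathbf{a}, \sigma^2)}, \tag{4}$$

where $\mathbf{a} = [a_0, a_1, \ldots, a_N]^T$ denotes the vector of hyperparameters. The posterior probability distribution over the weights is a multivariate Gaussian distribution:

$$p(\mathbf{h} \mid \mathbf{k}, \mathbf{a}, \sigma^2) = \mathcal{N}(\mathbf{h} \mid \boldsymbol{\mu}, \Sigma), \tag{5}$$

where $\Sigma = (\sigma^{-2}\Phi^T\Phi + \mathbf{A})^{-1}$ is the posterior covariance of the multivariate Gaussian distribution, $\mathbf{A}$ is the diagonal matrix with elements $(a_0, a_1, \ldots, a_N)$, and $\boldsymbol{\mu} = \sigma^{-2}\Sigma\Phi^T\mathbf{k}$ is the mean of the multivariate Gaussian distribution. Integrating over the weights, the likelihood of the training targets is given by

$$p(\mathbf{k} \mid \mathbf{a}, \sigma^2) = \int p(\mathbf{k} \mid \mathbf{h}, \sigma^2)\, p(\mathbf{h} \mid \mathbf{a})\, d\mathbf{h}. \tag{6}$$

Evaluating the integral in (6), the marginal likelihood of the hyperparameters can be written as

$$p(\mathbf{k} \mid \mathbf{a}, \sigma^2) = (2\pi)^{-N/2}\, \lvert \mathbf{C} \rvert^{-1/2} \exp\left(-\tfrac{1}{2}\, \mathbf{k}^T \mathbf{C}^{-1} \mathbf{k}\right), \tag{7}$$

where $\mathbf{C} = \sigma^2\mathbf{I} + \Phi\mathbf{A}^{-1}\Phi^T$. The estimated values of the weights in the RVM method are given by the mean of the posterior probability distribution. The maximum a posteriori estimate of the weights depends on the hyperparameters $\mathbf{a}$ and the noise variance $\sigma^2$, whose estimates $\hat{\mathbf{a}}$ and $\hat{\sigma}^2$ are computed by maximizing the marginal likelihood (7). The uncertainty of the diagnosis model prediction is represented by the uncertainty of the optimum weights, which is expressed by the posterior distribution. If an input $\mathbf{x}_*$ is given, the corresponding probability distribution of the output $k_*$ can be expressed as

$$p(k_* \mid \mathbf{k}, \hat{\mathbf{a}}, \hat{\sigma}^2) = \int p(k_* \mid \mathbf{h}, \hat{\sigma}^2)\, p(\mathbf{h} \mid \mathbf{k}, \hat{\mathbf{a}}, \hat{\sigma}^2)\, d\mathbf{h}, \tag{8}$$

and this probability distribution is Gaussian:

$$p(k_* \mid \mathbf{k}, \hat{\mathbf{a}}, \hat{\sigma}^2) = \mathcal{N}\left(k_* \mid y_*, \sigma_*^2\right). \tag{9}$$

In (9), the predictive mean and variance are, respectively,

$$y_* = \boldsymbol{\mu}^T \boldsymbol{\phi}(\mathbf{x}_*), \qquad \sigma_*^2 = \hat{\sigma}^2 + \boldsymbol{\phi}(\mathbf{x}_*)^T \Sigma\, \boldsymbol{\phi}(\mathbf{x}_*). \tag{10}$$

By using Bayesian theory to compute the parameters of the RVM-based fault diagnosis model, the choice of diagnosis parameters can be optimized and the application range of fault diagnosis can be extended.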
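The posterior and predictive quantities described in this section can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it keeps the hyperparameters and the noise variance fixed (no marginal-likelihood re-estimation), uses a Gaussian radial basis kernel with an assumed width of 1, and the function names (`rbf`, `design_row`, `invert`, `posterior`, `predict`) are hypothetical. It uses only the Python standard library, so the small matrix inverse is written out by hand.

```python
import math

def rbf(u, v, width=1.0):
    """Gaussian radial basis kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * width ** 2))

def design_row(x, centers, width=1.0):
    """Response of all basis functions to x: [1, K(x, x_1), ..., K(x, x_N)]."""
    return [1.0] + [rbf(x, c, width) for c in centers]

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def posterior(Phi, k, a, sigma2):
    """Sigma = (Phi^T Phi / sigma2 + A)^-1 and mu = Sigma Phi^T k / sigma2."""
    m = len(a)
    precision = [[sum(r[i] * r[j] for r in Phi) / sigma2
                  + (a[i] if i == j else 0.0)
                  for j in range(m)] for i in range(m)]
    Sigma = invert(precision)
    Phi_t_k = [sum(r[i] * t for r, t in zip(Phi, k)) for i in range(m)]
    mu = [sum(Sigma[i][j] * Phi_t_k[j] for j in range(m)) / sigma2
          for i in range(m)]
    return mu, Sigma

def predict(phi_star, mu, Sigma, sigma2):
    """Predictive mean mu^T phi and variance sigma2 + phi^T Sigma phi."""
    m = len(mu)
    mean = sum(p * w for p, w in zip(phi_star, mu))
    var = sigma2 + sum(phi_star[i] * Sigma[i][j] * phi_star[j]
                       for i in range(m) for j in range(m))
    return mean, var

# Toy data (hypothetical): three one-dimensional inputs with targets 0, 1, 0.
X = [[0.0], [1.0], [2.0]]
k = [0.0, 1.0, 0.0]
Phi = [design_row(x, X) for x in X]
mu, Sigma = posterior(Phi, k, a=[1.0] * 4, sigma2=0.01)
mean, var = predict(design_row([1.0], X), mu, Sigma, 0.01)
```

In a full sparse Bayesian learner the hyperparameters would be re-estimated iteratively, and weights whose hyperparameters grow very large would be pruned, which is where the sparsity of the solution comes from.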

Diagnosis Process of the Transformer Based on RVM
Here, $i = 1, 2, \ldots, 5$ is the index of each gas and $c_i$ is the concentration of gas $i$ in ppm. Then, we obtain an additional characteristic from the maximum gas concentration, $x_6 = \log_{10}\left(\max_{1 \le i \le 5} c_i\right)$, which is used for crosswise comparisons among all groups of test data. Together with the five gas ratios $x_1, \ldots, x_5$, the six characteristics form a six-dimensional vector $\mathbf{x} = [x_1, x_2, x_3, x_4, x_5, x_6]^T$. This fault characteristic vector is used to determine the working condition of the power transformer: high energy discharge, low energy discharge, overheating, or normal condition.
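The feature construction can be sketched as follows. The sixth feature follows the text directly (base-10 logarithm of the largest concentration). The ratio formula itself is not recoverable from the text here, so the normalisation used below, dividing each concentration by the largest of the five, is an assumption made only for illustration; the function name `dga_features` and the dictionary keys are likewise hypothetical.

```python
import math

# Order of the five dissolved gases used as inputs.
GASES = ("H2", "CH4", "C2H6", "C2H4", "C2H2")

def dga_features(ppm):
    """Map five gas concentrations (in ppm) to a six-element feature vector."""
    c = [float(ppm[g]) for g in GASES]
    c_max = max(c)
    if c_max <= 0:
        raise ValueError("at least one concentration must be positive")
    # ASSUMPTION: each ratio is the concentration divided by the largest one.
    ratios = [ci / c_max for ci in c]
    # From the text: x6 = log10 of the maximum concentration.
    x6 = math.log10(c_max)
    return ratios + [x6]

# Hypothetical sample, not taken from the paper's data.
sample = {"H2": 100, "CH4": 50, "C2H6": 10, "C2H4": 20, "C2H2": 0}
features = dga_features(sample)
```

Whatever ratio definition is used, the logarithmic sixth feature keeps the absolute gas level that the ratios alone discard, which is what allows comparison across groups of test data.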

Diagnosis Model Training.
After fault feature extraction, RVM classifiers are trained to identify four possible states: high energy discharge, low energy discharge, overheating, and normal condition. Three diagnosis RVMs are therefore set up and trained with all training samples. Each RVM corresponds to one of the fault states of the power transformer. Training samples are sent into the RVMs for training and identification. The output is assigned one if the identification result is larger than zero; conversely, the output is zero if the result is less than zero. Firstly, RVM1 is trained to identify overheating with the training sample set. The output of RVM1 for overheating samples is set to one, and the output for the other states remains zero. Secondly, RVM2 is trained to separate high energy discharge from low energy discharge. The output of RVM2 is set to one when the input is a high energy discharge sample; otherwise the output of RVM2 is zero. Finally, RVM3 for low energy discharge is trained. In the same way, the output for low energy discharge samples is set to one; otherwise the output is zero. Thus, three RVMs are obtained that perform binary classification of all the samples for fault diagnosis. The kernel functions used in the three RVMs are all Gaussian radial basis functions. The flowchart of diagnosis model training is shown in Figure 1.
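The labelling scheme for the three binary classifiers can be sketched as below. This is only an illustration of how the 0/1 training targets described above might be built from state labels; the names `binary_targets`, `training_targets`, and `threshold`, and the label strings, are hypothetical and do not come from the paper.

```python
# Fault state that each binary RVM treats as its positive class.
POSITIVE_STATE = {
    "RVM1": "overheating",
    "RVM2": "high energy discharge",
    "RVM3": "low energy discharge",
}

def binary_targets(labels, positive_state):
    """Return 1 for samples of the positive state and 0 for all others."""
    return [1 if label == positive_state else 0 for label in labels]

def training_targets(labels):
    """Build one 0/1 target vector per RVM from the same training labels."""
    return {name: binary_targets(labels, state)
            for name, state in POSITIVE_STATE.items()}

def threshold(score):
    """Map a raw classifier score to 1 if it is larger than zero, else 0."""
    return 1 if score > 0 else 0

# Hypothetical labels for four training samples.
labels = ["overheating", "normal", "high energy discharge", "low energy discharge"]
targets = training_targets(labels)
```

Normal-condition samples receive a target of zero in all three vectors, so no fourth classifier is needed for them.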

Fault Identification of Power Transformers.
After feature extraction from DGA as in Section 3.1, the test data with characteristic information are treated as identification samples, which are input to the RVMs established for fault diagnosis.
The three RVMs are used to recognize the fault types. If the output of RVM1 is larger than zero, the fault of the power transformer is considered to be overheating. If the output of RVM1 is less than zero, the sample is passed on to RVM2 and RVM3 to recognize high energy discharge and low energy discharge in the same way. If the outputs of the three RVMs are all less than zero, the power transformer is judged to be in normal working condition. Thus, the RVM classifier identifies the four states of the transformer after at most three identification steps.
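The cascade described above can be sketched as a short decision function. This is a minimal illustration under stated assumptions: each trained RVM is represented by a hypothetical callable that maps a feature vector to a real-valued score, and a score of exactly zero, which the text does not cover, is treated here as "not this fault".

```python
from typing import Callable, Sequence

# A scorer maps a feature vector to a real-valued classifier output.
Scorer = Callable[[Sequence[float]], float]

def diagnose(x: Sequence[float], rvm1: Scorer, rvm2: Scorer, rvm3: Scorer) -> str:
    """Apply the three binary classifiers in sequence and name the state."""
    if rvm1(x) > 0:
        return "overheating"
    if rvm2(x) > 0:
        return "high energy discharge"
    if rvm3(x) > 0:
        return "low energy discharge"
    # No classifier fired, so the transformer is judged to be normal.
    return "normal"

# Stand-in scorers (hypothetical) that return fixed scores.
state = diagnose([0.2, 1.0, 0.1, 0.4, 0.0, 2.3],
                 rvm1=lambda x: -0.7,
                 rvm2=lambda x: 0.4,
                 rvm3=lambda x: -0.1)
```

Because the classifiers are evaluated in order, later ones are only called when the earlier ones reject the sample, so a diagnosis needs at most three evaluations.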

Experimental Results and Analysis of Faults Diagnosis of Power Transformers
Sixty groups of historical data on gas concentrations in a certain power transformer were collected as the training set. The actual fault conditions corresponding to the data were known. The training dataset covered four working states of the power transformer: overheating, high energy discharge, low energy discharge, and normal operation. Each state had 15 samples. Three RVMs (RVM1, RVM2, and RVM3) were constructed separately from the samples related to each state in the training dataset. Another 10 groups of samples from the running transformer were used as the test data set; the test samples included 3 overheating samples, 2 high energy discharge samples, 3 low energy discharge samples, and 2 normal operation samples; see Table 1.
To evaluate the proposed fault diagnosis model, we compared three learning models: the proposed RVM, a BP neural network, and LS-SVM. The BP neural network had six input nodes, 44 hidden nodes, and 4 output nodes, with a target error of 0.01. For LS-SVM, the target error was 0.01 with the Gaussian radial basis function as the kernel function. The evaluation criteria were training time, test time, and diagnosis accuracy of the three methods under the same training samples. The test samples were 50 groups of historical data samples of the power transformer. The test performance is shown in Table 2. From Figure 2, we can see that training time becomes obviously longer for both LS-SVM and the proposed RVM as the number of training samples increases. However, the training time of the proposed optimum RVM model is much less than that of LS-SVM. In Figure 3, the testing diagnosis accuracy reaches 100% for the proposed method when the number of training samples is larger than 25. The proposed method is also superior to LS-SVM in test time (see Figure 4), in which the training sample number is 200. The performance described above indicates that the proposed RVM algorithm has advantages in global optimization and sparsity of the solution. It has a wider range of adaptability than the conventional SVM model. From the results of the fault diagnosis model by the RVM algorithm in the experiments, we can conclude that the proposed method can output the probability of the working states of power transformers, which supplies more useful information for transformer overhaul. The method is superior to conventional diagnosis methods in diagnosis accuracy, response time, and sample number.

Conclusion
A novel fault diagnosis model based on RVM for power transformers is proposed in the paper. The Bayesian learning framework applied in the model reduces the number of hyperparameters and improves the sparsity of the solutions. It overcomes the limitations of some traditional algorithms and enhances the applicability of fault diagnosis. The proposed method uses three RVMs to identify the faults of power transformers and brings about good results. The probabilistic characteristics of the diagnosis model provide the advantages of small sample size, high sparsity, and low computation load. Compared with the BP neural network and LS-SVM, the test time of fault diagnosis is shorter and the accuracy of fault diagnosis is much higher. The results of experiments and simulations validate that this diagnosis method is suitable for real-time surveillance and identification of power transformer faults. The study of this issue is quite significant for transformer fault diagnosis applications. Future work is to realize real-time implementation of the proposed method in power transformer systems.

Figure 1: The flowchart of diagnosis model training.

Figure 2: Training time via different numbers of training samples.

Figure 3: Diagnosis accuracy via different numbers of training samples.
Internal faults of transformers are classified into heat faults and electrical faults. In this paper, the RVM method is applied to diagnose four types of working situations in power transformers: high energy discharge, low energy discharge, overheating, and normal operation. After three steps (characteristic information extraction by DGA, diagnosis model training, and fault recognition), the fault of the transformer can be identified accurately.

Fault Information Preprocessing.
Through DGA, the concentrations of the dissolved gases H2 (hydrogen), CH4 (methane), C2H6 (ethane), C2H4 (ethylene), and C2H2 (acetylene) can be obtained. The concentrations of the five dissolved gases can be used to verify the four fault types stated above. So the ratio of each gas is computed first, giving five ratio features.

Table 1: Sample group of dissolved gases.

Table 2: Diagnosis comparison of the algorithms.
It is shown in Table 2 that the fault diagnosis models of the LS-SVM and proposed RVM algorithms are superior to that of the BP neural network in training time, test time, and diagnosis accuracy under the same target error. Furthermore, because the Bayesian probability model is applied in the proposed RVM algorithm, the number of hyperparameters decreases. So the test time and diagnosis accuracy of the proposed algorithm are better than those of the conventional LS-SVM algorithm, and it is more suitable for practical application. Next, the training time, diagnosis accuracy, and test time of LS-SVM and the proposed RVM are examined. Figures 2 and 3 show the training time and diagnosis accuracy for different numbers of training samples. The number of training samples ranges from 25 to 200.