A Model of Intelligent Fault Diagnosis of Power Equipment Based on CBR

The demand for reliable power supply has risen sharply with the rapid development of the power industry, and meeting this demand requires a substantial power grid. The running and testing data of power equipment, which contain a large amount of information, underpin online monitoring and fault diagnosis and ultimately enable condition based maintenance. In this paper, an intelligent fault diagnosis model for power equipment based on case-based reasoning (IFDCBR) is proposed. The model aims to discover the latent rules of equipment faults by data mining. It constructs a condition case base of the equipment by analyzing four categories of data: online recording data, history data, basic test data, and environmental data. SVM regression analysis is then applied to mine the case base and establish the equipment condition fingerprint. The running data of the equipment can be diagnosed against this condition fingerprint to detect whether there is a fault. Finally, the intelligent model and the three-ratio method are compared on a set of practical data. The results demonstrate that the intelligent model is more effective and accurate in fault diagnosis.


Introduction
The increasing scale of the power system and the growing amount of power equipment have accelerated the integration of the power grid. Following the reform of the electricity system, electric power enterprises continuously improve their service quality to satisfy customer demand. Meanwhile, reducing operation costs has become a primary goal of electric power enterprises in order to maximize profits. Although the number of power grid accidents decreased significantly during "The Ninth Five-Year Plan" period, there were still 3 accidents caused by electrical equipment faults, accounting for 23.1% of the total number of accidents. Furthermore, power grid accidents caused by power equipment malfunction tend to increase year by year. On average, the corresponding power load of each accident has risen to 585.7 MW and the accident recovery time to 526 min [1]. Ensuring the safety of power equipment, detecting potential faults in time, reducing power system accidents effectively, and improving power supply quality and reliability have therefore become the most urgent problems for the power system. Real-time online monitoring of equipment condition is central to the safety of the power system [2,3]. As a result, the maintenance of power equipment underpins the normal operation of the power system.
Condition based maintenance (CBM), which is based on the actual health of the power equipment, is both targeted and timely. CBM can ensure the safety of the power grid, minimize the resources spent on maintaining power equipment, reduce the operating cost of power enterprises, and improve the benefits to enterprise and society. Consequently, CBM has gained ground in both research and commercial realms [4,5]. CBM comprehensively assesses the health status of equipment based on online condition information and attribute information of the power equipment. It then decides whether to repair the power equipment according to the assessment result. To ensure the safety and reliability of the power equipment, equipment in imperfect running condition requires timely maintenance. In contrast, the frequency of maintenance for equipment in fine running condition should be reduced in order to preserve that good condition as long as possible [6][7][8].
State diagnosis of power equipment relies on a vast amount of equipment state data. It is affected by the historical running data and the external environment of the power equipment. State diagnosis first estimates whether the equipment needs maintenance by comparing the current time-point data with the equipment's own historical data and by analyzing data generated by the same kind of equipment in different time phases. Furthermore, state diagnosis helps to identify the type of fault, the cause of the fault, and its severity. Based on the results of state diagnosis, reliable strategies and methods can be introduced to overcome these faults [9].
In this paper, we propose an intelligent fault diagnosis model for power equipment based on case-based reasoning, which satisfies the new requirements of the power grid. Moreover, a case study was conducted by applying IFDCBR and the three-ratio method. The result shows that IFDCBR is more accurate and more widely applicable. The rest of the paper is organized as follows. In the second part, a new kernel function for SVM regression analysis is constructed based on the radial basis kernel function and the polynomial kernel function. Then we propose IFDCBR in part three. Afterwards, we present the case study comparing IFDCBR with the three-ratio method. Finally, we summarize the paper and put forward the potential challenges.

Relevant Theories
2.1. Case-Based Reasoning. Case-based reasoning (CBR) is an important method for learning and problem solving in artificial intelligence. The core idea of CBR is to apply past experience to solve similar problems now or in the future [10,11]. Moreover, the problem solving ability increases as experience accumulates. When there are no suitable algorithms or models to solve a problem, CBR can use case-based experience to solve it effectively [12][13][14].
The typical CBR problem solving process includes four main steps: case retrieval, case reuse, case revision, and case preservation [15], as shown in Figure 1. First, to analyze the target sample, CBR searches for a similar case in the problem space (case retrieval). Then it either directly uses the solution of the retrieved case (case reuse) or adapts it (case revision) to solve the new problem. Finally, the target case is saved to the case base after being screened (case preservation).
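The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `StoredCase` class, the squared Euclidean distance used for retrieval, and the toy feature values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class StoredCase:
    features: tuple   # numeric description of a past problem (hypothetical)
    solution: str     # solution that worked for that problem

def retrieve(case_base, query):
    """Case retrieval: return the stored case closest to the query."""
    def sq_dist(case):
        return sum((a - b) ** 2 for a, b in zip(case.features, query))
    return min(case_base, key=sq_dist)

def solve(case_base, query, revise=None, keep=True):
    nearest = retrieve(case_base, query)
    solution = nearest.solution                 # case reuse
    if revise is not None:
        solution = revise(query, solution)      # case revision
    if keep:
        # case preservation: store the solved problem for future queries
        case_base.append(StoredCase(tuple(query), solution))
    return solution

case_base = [
    StoredCase((0.1, 0.2), "normal"),
    StoredCase((0.9, 0.8), "overheating"),
]
result = solve(case_base, (0.85, 0.75))
```

Because the solved query is appended to `case_base`, later queries can retrieve it, which is how the experience of a CBR system grows over time.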

Regression Analysis of SVM.
Regression analysis is a statistical method to identify the interdependence between two or more variables in quantitative relationship [16,17].
According to the theory of CBR, the core of fault diagnosis is seeking an optimal function relation $y = f(x)$ between the status and the operation data of power equipment based on the historical running data set. Essentially, fault diagnosis of power equipment is a method of classifying power equipment status data [18].
One key technique for classifying data is to find the optimal hyperplane, which is an optimal linear discriminant function. We set $x_i \in R^d$ as the sample data, which is divided into two categories, +1 and −1, expressed as $y_i \in \{+1, -1\}$ $(i = 1, \ldots, n)$. Here $d$ is the number of attributes of each data sample and $n$ is the number of samples. The general form of the function in linearly separable conditions is
$$f(x) = w \cdot x + b, \qquad (1)$$
where $w \in R^d$ is a $d$-dimensional vector and $b$ is the offset. We should determine the coefficients $w$ and $b$ of the linear function. The criterion is that the sum of squared deviations between $f(x) = w \cdot x + b$ and the observed $y$ is minimized. In other words, it is the process of solving for the optimal regression hyperplane.
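Vapnik proposed the $\varepsilon$-insensitive function, which is not sensitive to small errors, as
$$L_\varepsilon(y, f(x)) = |y - f(x)|_\varepsilon, \qquad (2)$$
where
$$|y - f(x)|_\varepsilon = \max\{0,\ |y - f(x)| - \varepsilon\}. \qquad (3)$$
The value $\varepsilon$ is a positive number. The error is ignored when the deviation between $f(x)$ and the observed $y$ is not more than $\varepsilon$. The $\varepsilon$ function gives an insensitive zone of width $2\varepsilon$, named the $\varepsilon$ pipeline, as shown in Figure 3.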
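As a small numerical illustration of the insensitive zone, the following sketch evaluates the standard $\varepsilon$-insensitive loss $\max\{0, |y - f(x)| - \varepsilon\}$. The function name and the sample values are illustrative assumptions, not values from the paper.

```python
def eps_insensitive_loss(y, fx, eps):
    """Return max(0, |y - f(x)| - eps): zero inside the epsilon pipeline."""
    return max(0.0, abs(y - fx) - eps)

# Deviation 0.05 lies inside a pipeline of half-width 0.1, so it costs nothing.
inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)

# Deviation 0.5 exceeds eps by 0.4, and only that excess is penalized.
outside = eps_insensitive_loss(1.0, 1.5, eps=0.1)
```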
The deviation is 0 if a sample lies inside the $\varepsilon$ pipeline. Otherwise, the sample lies away from the group. In fact, samples away from the group are fundamentally erroneous or noisy data, which should be ignored to ensure that the hyperplane is suitable for most of the samples. Therefore, we introduce slack variables $\xi_i, \xi_i^* \ge 0$ $(i = 1, 2, \ldots, n)$ and a penalty factor $C$ to control the samples away from the group:
$$y_i - w \cdot x_i - b \le \varepsilon + \xi_i, \qquad w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*. \qquad (4)$$
The $\varepsilon$ pipeline and the slack variable $\xi_i$ in linear conditions are shown in Figure 2.
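Essentially, the SVM regression problem is fitting $(x_i, y_i)$ with $f(x) = w \cdot x + b$, where $i = 1, 2, \ldots, n$, $x_i \in R^d$, $y_i \in R$. After introducing the slack variables $\xi_i, \xi_i^*$ and the penalty factor $C$, seeking the optimal hyperplane is converted into a convex quadratic programming problem under constraints based on the $\varepsilon$ function. Consider
$$\min_{w, b, \xi, \xi^*}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - w \cdot x_i - b \le \varepsilon + \xi_i,\ \ w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*,\ \ \xi_i, \xi_i^* \ge 0. \qquad (5)$$
$C$ is a punishment coefficient given beforehand, which is generally determined by experiment. When the slack variables of the samples away from the group are fixed, the loss of the objective function increases with increasing penalty factor $C$.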
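The role of the penalty factor can be checked numerically. The sketch below evaluates the standard $\varepsilon$-SVR primal objective, $\frac{1}{2}\|w\|^2 + C\sum_i \max\{0, |y_i - f(x_i)| - \varepsilon\}$, for a fixed hyperplane. It assumes the textbook formulation in which the optimal slack of a sample equals its excess over $\varepsilon$; the function name and the toy data are illustrative only.

```python
def svr_primal_objective(w, b, xs, ys, eps, C):
    """0.5 * ||w||^2 + C * (total slack) for a fixed linear function f."""
    norm_term = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for x, y in zip(xs, ys):
        fx = sum(wj * xj for wj, xj in zip(w, x)) + b
        # Slack is the part of the deviation that sticks out of the pipeline.
        slack_total += max(0.0, abs(y - fx) - eps)
    return norm_term + C * slack_total

xs = [[1.0], [2.0]]
ys = [1.0, 3.0]          # second sample deviates by 1.0 from f(x) = x
low_penalty = svr_primal_objective([1.0], 0.0, xs, ys, eps=0.5, C=1.0)
high_penalty = svr_primal_objective([1.0], 0.0, xs, ys, eps=0.5, C=10.0)
```

With the slack fixed at 0.5, raising $C$ from 1 to 10 raises the objective from 1.0 to 5.5, which matches the statement that the loss grows with the penalty factor.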
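In order to solve the convex quadratic programming problem, we introduce the Lagrange function:
$$L = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*) - \sum_{i=1}^{n}\alpha_i(\varepsilon + \xi_i - y_i + w \cdot x_i + b) - \sum_{i=1}^{n}\alpha_i^*(\varepsilon + \xi_i^* + y_i - w \cdot x_i - b) - \sum_{i=1}^{n}(\eta_i\xi_i + \eta_i^*\xi_i^*), \qquad (6)$$
where $\alpha_i, \alpha_i^* \ge 0$ are the Lagrange multipliers of the pipeline constraints and $\eta_i, \eta_i^* \ge 0$ are those of the nonnegativity constraints on the slack variables. The function $L$ meets the conditions $\partial L/\partial w = 0$, $\partial L/\partial b = 0$, $\partial L/\partial \xi_i = 0$, and $\partial L/\partial \xi_i^* = 0$. The first condition gives $w = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)x_i$, from which we get the hyperplane:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)(x_i \cdot x) + b. \qquad (7)$$
For nonlinear regression problems, the basic idea of SVM is to convert the inseparable data samples $x$ in the $d$-dimensional space into separable data samples in a high-dimensional feature space with the kernel function.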
The nonlinear problem in the original data space is converted into a linear problem in a high-dimensional feature space by introducing the mapping $\Theta$. As a result, we use $\Theta(x_i)$ to replace $x_i$, which gives
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)(\Theta(x_i) \cdot \Theta(x)) + b. \qquad (8)$$
According to the defined kernel function $K(x, x_i) = \Theta(x) \cdot \Theta(x_i)$, we get the nonlinear regression function of SVM as
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)K(x, x_i) + b. \qquad (9)$$
For the RBF kernel function $K(x, x_i) = \exp(-\|x - x_i\|^2/2\sigma^2)$, we enter the test point $x_i = 0.1$ and select the kernel parameters $\sigma = 0.2, 0.4, 0.6, 0.8, 1$. The characteristics of the RBF kernel function are shown in Figure 3(a). We can easily see that its impact is confined to a small area (−0.2, 0.2) near the test point $x_i = 0.1$. The RBF kernel function is insensitive far from the test point, where its value tends to 0. In contrast, we set $x_i = 0.1$, $c = 1$, and $q = 1, 3, 5, 7, 9$ for the polynomial kernel function $K(x, x_i) = (\langle x, x_i\rangle + c)^q$, and its characteristics are shown in Figure 3(b). As we can see, its impact extends not only over the area near the test point $x_i = 0.1$ but also far away from it. The polynomial kernel function therefore has a strong ability of global data generalization.
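The local and global behaviour described above can be reproduced with a short sketch. It assumes the standard scalar forms $K_{rbf}(x, x_i) = \exp(-(x - x_i)^2/2\sigma^2)$ and $K_{poly}(x, x_i) = (x\,x_i + c)^q$; the parameter names `sigma`, `c`, `q` and the two probe points are assumptions made for illustration.

```python
import math

def rbf_kernel(x, xi, sigma):
    """Local kernel: decays quickly as x moves away from xi."""
    return math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2))

def poly_kernel(x, xi, c, q):
    """Global kernel: keeps responding far away from xi."""
    return (x * xi + c) ** q

xi = 0.1                 # test point
near, far = 0.15, 1.0    # one probe close to xi, one far from it

rbf_near = rbf_kernel(near, xi, sigma=0.2)
rbf_far = rbf_kernel(far, xi, sigma=0.2)
poly_near = poly_kernel(near, xi, c=1.0, q=3)
poly_far = poly_kernel(far, xi, c=1.0, q=3)
```

The RBF value is close to 1 at the near probe and practically 0 at the far one, while the polynomial value keeps growing with distance, which is the contrast the two panels of Figure 3 are described as showing.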
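In this paper, we construct a new kernel function from these two kernel functions, as follows:
$$K(x, x_i) = \lambda_1 K_{rbf}(x, x_i) + \lambda_2 K_{poly}(x, x_i), \qquad (10)$$
where $K_{rbf}(x, x_i) = \exp(-\|x - x_i\|^2/2\sigma^2)$ and $K_{poly}(x, x_i) = (\langle x, x_i\rangle + c)^q$. By introducing the compound kernel function into formula (9), we get the SVM nonlinear regression function as follows:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\left[\lambda_1 \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) + \lambda_2(\langle x, x_i\rangle + c)^q\right] + b, \qquad (11)$$
where $\alpha_i, \alpha_i^*$ are the Lagrange multipliers and $\lambda_1, \lambda_2 \ge 0$, $\lambda_1 + \lambda_2 = 1$.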
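A compound kernel of this kind, and the regression function built on it, might be sketched as follows. The sketch assumes the kernel is a convex combination of the RBF and polynomial kernels with weights `lam1` and `1 - lam1`; which weight belongs to which kernel, and all parameter names, are assumptions. The coefficients passed to `predict` stand for the differences $\alpha_i - \alpha_i^*$ and would in practice come from a trained model.

```python
import math

def compound_kernel(x, xi, sigma, c, q, lam1):
    """lam1 * RBF + (1 - lam1) * polynomial, for equal-length sequences."""
    if not 0.0 <= lam1 <= 1.0:
        raise ValueError("lam1 must lie in [0, 1]")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    inner = sum(a * b for a, b in zip(x, xi))
    k_rbf = math.exp(-sq_dist / (2.0 * sigma ** 2))
    k_poly = (inner + c) ** q
    return lam1 * k_rbf + (1.0 - lam1) * k_poly

def predict(x, support_vectors, coeffs, b, **kernel_params):
    """f(x) = sum_i coeffs[i] * K(x, x_i) + b, coeffs[i] = alpha_i - alpha_i^*."""
    return sum(
        a * compound_kernel(x, xi, **kernel_params)
        for a, xi in zip(coeffs, support_vectors)
    ) + b

pure_rbf = compound_kernel([0.1], [0.1], sigma=0.5, c=1.0, q=1, lam1=1.0)
pure_poly = compound_kernel([0.1], [0.1], sigma=0.5, c=1.0, q=1, lam1=0.0)
y_hat = predict([0.1], [[0.1]], [2.0], 0.5, sigma=0.5, c=1.0, q=1, lam1=1.0)
```

Setting `lam1` to 1 or 0 recovers the pure RBF or pure polynomial kernel, so intermediate values trade local fitting against global generalization.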

Data Model.
A unified information model for power equipment is the basis of state diagnosis and prerequisite for standardization of equipment [21].The unified model is convenient for macrostatus diagnosis of equipment and makes it possible to share status data of the same or similar category equipment.Definition 1.One defines the information model of power equipment as  = {, , , , }, where one has the following.
The running data of power equipment is gathered and stored by time.The running data is closely related to the gathering time .We construct abstract spatial data model for equipment.In this model, the monitoring value is  axis and the time is the  axis, as shown in Figure 5.
As we can see, ( 1 ,  1 ) is on behalf of the monitoring value at time   .We can get the equipment monitoring data sample  = (  ,   ) at time   when we slice the data at time   along  axis.
Based on the information model and abstract spatial data model of equipment, we can define the power equipment case.
Definition 2. Power equipment case is a data collection which reflects the status of power equipment.One defines it as Case = ⟨  ,   , ⟩, where (i)   is the data sample of equipment monitoring point at time ; (ii)   is the environmental factors of the equipment; (iii)  is the condition of the equipment in environment   at time .Generally,  is a constant which is the classification thresholds of equipment conditions. is the important basis of case training.One assumes  as null when one cannot confirm the value in the process of case reasoning.
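Definition 2 maps naturally onto a small record type. The following is a hypothetical sketch: the class name, field names, and the example values are assumptions chosen for readability and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class EquipmentCase:
    sample: Tuple[float, ...]           # monitoring-point data at time t
    environment: Tuple[float, ...]      # environmental factors at time t
    condition: Optional[float] = None   # condition label; None while unknown

    def features(self):
        """Concatenate monitoring data and environment into one input vector."""
        return self.sample + self.environment

    def is_labeled(self):
        """Only labeled cases can be used for training."""
        return self.condition is not None

# A labeled training case and an undiagnosed case awaiting reasoning.
known = EquipmentCase(sample=(12.0, 3.5), environment=(25.0, 0.4), condition=1.0)
pending = EquipmentCase(sample=(40.0, 9.1), environment=(31.0, 0.6))
```

Leaving `condition` as `None` corresponds to treating the condition as null when its value cannot be confirmed during case reasoning.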

Status Fingerprint.
We assume that  is a status case base of equipment .There are five types of conditions identified as {, , , , }.We set up five subsets of case,   ,   ,   ,   , and   , according to different types.We train cases to establish the status fingerprint reader according to SVM nonlinear regression functions shown in formula (11).We assume that () is the status fingerprint reader of   ,   ,   ,   , and   , the train goals are Φ(), Φ(), Φ(), Φ(), and Φ(), and the tolerance is .The five subsets of conditions can be mapped to different data intervals, as shown in Figure 6.
According to the theory of SVM regression analysis, we can get the status fingerprint readers by training the cases.We set In this paper, we define ((), Φ()) as the equipment status fingerprint, in which () is the recognizer of status fingerprint and Φ() is the standard value of fingerprint identification.We set   as the tolerance for fingerprint similarity.Thus we can set up the equipment status fingerprint, as shown in Table 1.

Diagnosis Model.
We establish equipment case base on the basis of running data, history data, environmental factors, and equipment data.Then we establish equipment fingerprinting by training the cases based on SVM regression theory in order to diagnose the status of power equipment.There are two phases in IFDCBR: learning and application.We train the fingerprint recognizer and establish the fingerprint identification database in the learning phase, while we diagnose the status with fingerprint identification in application phase.IFDCBR is shown in Figure 7.
We establish the equipment case base based on the basic information, historical operating data, parameter calibration of equipment, and environmental data.Then we set the learning goals and train the equipment fingerprint recognizer and establish the fingerprint identification database.
In the application stage of IFDCBR, we first obtain the current equipment monitoring data. After filtering and normalization, these data form the untested equipment condition case. Next, we apply the equipment fingerprint recognizer to the untested case to obtain the tested equipment fingerprint. Finally, we compare the tested fingerprint with the stored equipment condition fingerprints; the condition fingerprint that fits best determines the diagnosis category.
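The final matching step can be illustrated with a minimal sketch. It assumes each condition is represented by one scalar standard value and that a match requires the recognizer output to fall within a similarity tolerance of that value; the function name, the condition labels, and the numbers are hypothetical.

```python
def diagnose(reading, fingerprints, tolerance):
    """Return the condition whose standard value is closest to `reading`,
    or None when no standard value lies within `tolerance`."""
    best_label, best_gap = None, None
    for label, standard in fingerprints.items():
        gap = abs(reading - standard)
        if gap <= tolerance and (best_gap is None or gap < best_gap):
            best_label, best_gap = label, gap
    return best_label

# Hypothetical standard values; real ones would come from training.
fingerprints = {
    "normal": 0.0,
    "partial discharge": 1.0,
    "high energy discharge": 2.0,
}
matched = diagnose(0.92, fingerprints, tolerance=0.2)
unmatched = diagnose(1.5, fingerprints, tolerance=0.2)
```

Returning `None` for a reading that fits no fingerprint keeps ambiguous cases visible so they can be reviewed before being stored in the case base.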
We can assess the diagnosis result given by the application model of IFDCBR in order to optimize the equipment fingerprint recognizer. At the same time, we can improve and optimize the equipment condition fingerprints by storing the untested equipment condition case into the equipment case base. In this way, IFDCBR has the ability of self-learning and self-optimizing.

Empirical Analysis
In this paper, we program corresponding applications to test IFDCBR.Then we diagnose the oil-immersed transformer using the IFDCBR and three-ratio method.The results show that IFDCBR is more effective and accurate.

Classification of Transformer Condition.
The types of equipment condition are the premise of equipment state analysis. If fine-grained equipment condition types cannot be obtained, coarse types are needed. A reasonable equipment condition classification model is very important for avoiding overfitting and underfitting in IFDCBR. We divided the transformer condition into six types by analyzing the transformer faults and the related information on dissolved gas in transformer insulation oil [22][23][24], as shown in Table 2.

Modeling. A 110 kV oil-immersed transformer of Hebei Electric Power Company, registered in the PMS system and the oil chromatography online monitoring system, is studied. In total, 1229 pieces of gas concentration data were collected and combined with equipment data such as temperature and humidity, and 819 pieces of useful data were obtained after the data were organized and noise data removed. Among them, 573 are normal operation data and 246 are failure data. Taking these as the basis of the experimental research, we obtain the data distribution over the status classes shown in Table 3.
Sample library of equipment status T is established according to experimental data, T1 is a subset of normal samples, T2 is a subset of low energy discharge fault samples, T3 is a subset of high energy discharge fault samples, T4 is a subset of partial discharge fault samples, T5 is a subset of low temperature thermal samples, and T6 is a subset of high temperature thermal fault samples.
Once the sample library has been established, the key to establishing a diagnostic model is optimizing the parameters of the identifier. In this paper, we optimize the classifiers using the Cyclic Variable Method [25]. We train the fingerprint recognizer by repeatedly using the related tools in Libsvm and obtain the parameter values shown in Table 4.
Through the above training, we can finally get the kernel function parameters:  = 6.2,  = 8.5,  = 0.315,  = 0.455, and  = 77.The status fingerprint of 110 KV oil-immersed transformer is shown in Table 5.

Results
Analysis. To ensure accuracy and reasonableness, each type of data is randomly divided into two parts at a ratio of 4 : 1 when the classifier is trained: 80% of the data is used to train the fingerprint recognizer, and the other 20% is used to test it.
Insulating oil decomposes because of internal heat caused by equipment failure; the causes include poor tap contact, weak welds between leads and tap connections, core multipoint grounding, and partial short circuit.

Summary
State diagnosis with artificial intelligence technology has become an important part of smart grid construction. In this paper, CBR is used to organize the online monitoring data, historical operating data, environmental data, and basic test data of similar power transmission equipment, and a device status fingerprint is established to analyze real-time detection data. The results show that the diagnostic model has higher accuracy and wider applicability; it provides a viable solution for discovering and eliminating latent failures of power transmission equipment. However, the diagnostic model still faces many challenges in terms of self-learning, self-optimizing, and the assessment of diagnostic results.
$\lambda_1$ and $\lambda_2$ are the coefficients that adjust the contributions of the RBF kernel function and the polynomial kernel function to the compound kernel function. We can adjust the coefficients to preserve both the fitting and the generalization ability for data samples of different distributions. The characteristics of the compound kernel function are shown in Figure 4, where we set $x_i = 0.1$, $\sigma = 0.5$, $c = 1$, $q = 1$, and $\lambda_1 = 0, 0.2, 0.4, 0.6, 0.8, 1.0$. From these characteristics, we can see that the compound kernel function combines the characteristics of both the RBF kernel function and the polynomial kernel function. It has a good ability of data fitting and generalization. The compound kernel function can be adapted to data sets of different distributions by adjusting the parameters $\lambda_1$ and $\lambda_2$.

Figure 4: Feature figure of diagnostic kernel function.

Figure 5: Spatial data model of equipment.
Kernel functions include local kernel functions and global kernel functions. The kind of kernel function has a great influence on the efficiency of extracting information from data. A local kernel function suits small-scale data and gives higher accuracy. By contrast, a global kernel function suits large-scale data but gives lower accuracy. The RBF kernel function $K(x, x_i) = \exp(-\|x - x_i\|^2/2\sigma^2)$ is a typical example of a local kernel function, while the polynomial kernel function $K(x, x_i) = (\langle x, x_i\rangle + c)^q$ is a very good global kernel function. By applying a global function or a local function alone, we cannot completely cover the distribution characteristics of the data. In practice, we should construct a new kernel function that balances the global and local behavior according to the data characteristics.

Table 2: Classification of transformer condition.
An intermittent discharge failure, such as potential discharge caused by poor contact discharge between the laminations, core and ground, and core and screw mandrel.

Table 3: Distribution of data.
On the held-out test data, the highest classification accuracy is 95.19%, the lowest is 75%, and the average is 82.19%, as shown in Table 6. For five groups of data in Table 7, the traditional three-ratio method (TRM) either cannot produce a result or produces a misjudgment, whereas the diagnostic results obtained by IFDCBR are consistent with the actual situation. As we can see, the validity and accuracy of the model are high and will increase as more learning experience is gathered.
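The per-class 4 : 1 division used in the analysis can be sketched as a stratified random split. This is an illustrative version under assumptions: the function name, the fixed seed, and the toy class sizes are hypothetical, and the paper does not state how its split was implemented.

```python
import random

def stratified_split(samples_by_class, train_fraction=0.8, seed=0):
    """Split every class separately so each keeps the same train/test ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(s, label) for s in shuffled[:cut]]
        test += [(s, label) for s in shuffled[cut:]]
    return train, test

# Toy data: 10 "normal" samples and 5 "fault" samples.
data = {"normal": list(range(10)), "fault": list(range(100, 105))}
train, test = stratified_split(data)
```

Splitting within each class keeps rare fault types represented in both parts, which matters when some condition types have far fewer samples than the normal class.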

Table 6: Test results of the experimental cases.

Table 7: Comparison of diagnosis results.