A Novel Power System Reliability Predicting Model Based on PCA and RVM

Power system reliability is an important index for evaluating power supply capability. Based on the characteristics of practical grid operation, this paper builds and trains a power grid reliability predicting model using the relevance vector machine (RVM), taking the load supplying capacity of the grid and natural calamities as input variables and the outage time caused by grid failures, which affects power supply reliability, as the output variable. In the modeling process, principal component analysis (PCA) is applied to the RVM training sample set, so that the input factors are condensed, the number of network inputs is reduced, the network structure is simplified, and the predicting accuracy is increased. Simulation results verify the effectiveness of the proposed algorithm and show that it provides a new way of predicting power system reliability.


Introduction
Power system reliability is the ability of a power system to provide continuous power supply to all consumers. With the development of society and the rise in living standards, the requirements on power system reliability have become increasingly demanding in recent years. Higher reliability is not only needed by consumers but also important for the development of power companies. Therefore, in the past few decades, power system reliability has attracted increasing attention from researchers at home and abroad, and several different ways have been proposed to improve it by analyzing the affecting factors [1].
The traditional predicting assessment methods for power system reliability require historical data with an accurate network structure and reliable element parameters [2]. In [3], the authors applied the Monte Carlo method to the assessment of power system reliability; the calculation structure was simple, but the calculation error was inversely proportional to the square root of the number of trials, so errors could be reduced only at the expense of computing time. In [4], a grid reliability algorithm based on sensitivity analysis was proposed, in which the most important key elements were identified through sensitivity analysis of the element reliability parameters. In [5], a power grid reliability evaluation model based on the radial basis function (RBF) neural network was proposed, where the states of the power system were first classified through the adaptive algorithm of the RBF neural network, and the speed of identifying grid failure states was then improved when computing the power grid reliability. In [6], the BP neural network was used to predict the power supply reliability of a city grid; however, because few influencing factors were considered, the predicting performance was not good, the speed was slow, and the network sometimes even fell into local extrema. The authors in [7] proposed a power system reliability predicting assessment algorithm based on actual operation constraints, which used the actual power grid operation parameters to predict the reliability index of a planned distribution network, so that the reliability index could be decomposed and the reliability management could be arranged. In [8], the Markov cut-set algorithm was employed to evaluate power system reliability.
However, the grid structure is very complex, the amount of data is huge, and the structure is often difficult to determine, so the traditional predicting methods cannot be used to predict the power system reliability. On the other hand, in practical applications, if all the influencing factors are taken as inputs of the sample set, the dimension of the sample set inevitably expands, which degrades the predicting accuracy and the generalization capability.
Mathematical Problems in Engineering
Based on the above discussion, this paper presents a power system reliability predicting model based on the relevance vector machine (RVM). The principal component analysis (PCA) method is used to extract features from the sample set, and the result is fed into the RVM model after the correlation between variables has been eliminated. The proposed approach exploits both the feature extracting ability of PCA and the nonlinear approximating ability of RVM, and thus the predicting accuracy and the generalization ability of the predicting model can be improved greatly.
The rest of this paper is organized as follows. In Sections 2 and 3, the theory of PCA and the principle of RVM are introduced, respectively. In Section 4, the reliability predicting model is proposed and the corresponding algorithm is presented. In Section 5, an illustrative example of a regional grid is provided to illustrate the effectiveness of the developed results. Finally, the paper ends with a conclusion.

The Theory of PCA
In this section, the theory of PCA is briefly introduced. PCA first calculates the correlation matrix of the data matrix formed from the sample data, then obtains the cumulative variance contribution from the eigenvalues of the correlation matrix, and finally determines the principal components from the eigenvectors of the correlation matrix. The procedure can be divided into the following five steps [9][10][11].
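Step 1 (standardization of the original data). The main purpose is to eliminate the effects of the different dimensions of the original variables and of large numerical differences; that is,

$$x_{ij}^{*} = \frac{x_{ij} - E(x_j)}{\sqrt{\operatorname{Var}(x_j)}}, \quad i = 1, \ldots, n, \; j = 1, \ldots, p,$$

where $x_j$ is the $j$th column of the original data $X_{n \times p}$, $n$ is the number of samples, $p$ is the number of evaluation indexes, and $E(x_j)$ and $\operatorname{Var}(x_j)$ stand for the mean and variance of $x_j$, respectively.

Step 4 (calculation of the principal component contribution rate and cumulative contribution rate). The contribution rate (%) of the $i$th principal component is derived by

$$\eta_i = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k} \times 100\%,$$

and the cumulative contribution rate (%) of the first $m$ principal components is derived by

$$\eta_{\Sigma}(m) = \frac{\sum_{k=1}^{m} \lambda_k}{\sum_{k=1}^{p} \lambda_k} \times 100\%,$$

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$ are the eigenvalues of the correlation matrix. The number of principal components is selected according to the cumulative contribution rate, which is usually required to exceed 85%-90%; the first $m$ principal components then retain most of the information provided by the $p$ original variables. The number of principal components is $m$.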
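The PCA procedure described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the MATLAB implementation used in this paper: the function name `pca_select`, the 0.85 default threshold, and the use of `numpy.linalg.eigh` are assumptions made for the example, and every column of the input is assumed to have nonzero variance.

```python
import numpy as np

def pca_select(X, threshold=0.85):
    """Hypothetical helper: PCA on the correlation matrix of X (n samples x p indexes).

    Returns the component scores, the contribution rates, the cumulative
    contribution rates, and the number m of retained components.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column (assumes no column is constant).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the standardized data.
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns ascending eigenvalues of a symmetric matrix; sort descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    # Contribution rate of each component and cumulative contribution rate.
    contrib = eigvals / eigvals.sum()
    cumulative = np.cumsum(contrib)
    # Smallest m whose cumulative contribution reaches the threshold.
    m = min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))
    # Project the standardized data onto the first m eigenvectors.
    scores = Z @ eigvecs[:, :m]
    return scores, contrib, cumulative, m
```

Calling `pca_select(X, threshold=0.95)` on an $n \times p$ sample matrix would return the reduced input matrix together with the contribution rates from which a table of eigenvalues and contributions can be built.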

The Principle of RVM
RVM is a learning algorithm derived from Bayesian learning theory and built on the support vector machine [12][13][14]. This algorithm combines several theories, such as the Markov chain, the Bayesian theorem, the automatic relevance determination prior, and the maximum likelihood, and thus it has the following advantages: (1) high sparsity; (2) shorter training time, because only the kernel function needs to be set; (3) flexible choice of the kernel function, which does not need to satisfy the Mercer condition. For a given training sample set $\{x_i\}_{i=1}^{N}$ with output set $\{t_i\}_{i=1}^{N}$, the RVM regression model can be obtained by

$$t = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 + \varepsilon,$$

where $\varepsilon \sim N(0, \sigma^2)$ is the independent sample error, $w_i$ ($i = 0, 1, \ldots, N$) is the weight coefficient, $K(x, x_i)$ is the kernel function, and $N$ is the sample quantity.
For the independent output set, the likelihood function of the whole sample is

$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert \mathbf{t} - \Phi \mathbf{w} \rVert^2}{2\sigma^2}\right),$$

where $\mathbf{t} = (t_1, \ldots, t_N)^{T}$, $\mathbf{w} = (w_0, w_1, \ldots, w_N)^{T}$, and $\Phi$ is the $N \times (N+1)$ design matrix whose $i$th row is $\phi(x_i)^{T} = [1, K(x_i, x_1), \ldots, K(x_i, x_N)]$. Employing the maximum likelihood method to solve $\mathbf{w}$ and $\sigma^2$ directly usually results in a serious over-fitting problem. To avoid this, a zero-mean Gaussian prior distribution is placed on $\mathbf{w}$ by using the sparse Bayesian principle:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} N(w_i \mid 0, \alpha_i^{-1}),$$

where $\alpha_i$ is the hyperparameter corresponding to weight $w_i$. Because every weight has its own hyperparameter, the influence of the prior distribution on each parameter can be controlled individually, and thus the sparse characteristics of the RVM can be realized.
After defining the prior probability distribution and the likelihood distribution, according to the Bayesian theorem, the posterior probability distribution of the weights can be derived by

$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = N(\mathbf{w} \mid \boldsymbol{\mu}, \Psi),$$

where the posterior covariance matrix is $\Psi = (\sigma^{-2} \Phi^{T} \Phi + A)^{-1}$ with $A = \operatorname{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$, and the posterior mean is $\boldsymbol{\mu} = \sigma^{-2} \Psi \Phi^{T} \mathbf{t}$. Finally, the maximum (marginal) likelihood method is used to estimate the hyperparameters $\boldsymbol{\alpha}$ and the variance $\sigma^2$. Given a new input sample $x_*$, the corresponding output follows a Gaussian distribution, and the corresponding predicting value is derived by $t_* = \boldsymbol{\mu}^{T} \phi(x_*)$.
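To make the estimation procedure more concrete, the following Python sketch implements one common way of fitting an RVM regression model: alternating between the Gaussian posterior of the weights and the standard fixed-point re-estimation of the hyperparameters and the noise variance. It is a simplified illustration under stated assumptions, not the implementation used in this paper. The function names, the plain RBF design matrix, the pruning threshold `prune_at`, and the iteration count are all hypothetical choices, and pruned basis functions are never re-added.

```python
import numpy as np

def rbf_design(X, centers, width=1.0):
    """Design matrix with a bias column followed by RBF kernel columns (assumed form)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), K])

def rvm_fit(Phi, t, n_iter=300, prune_at=1e4):
    """Sparse Bayesian regression by iterative hyperparameter re-estimation."""
    N, M = Phi.shape
    alpha = np.ones(M)                 # one hyperparameter per weight
    sigma2 = 0.1 * np.var(t) + 1e-6    # initial noise variance (heuristic)
    active = np.arange(M)              # indices of basis functions still in the model

    def posterior(idx, s2):
        # Gaussian posterior of the active weights: covariance Psi and mean mu.
        P = Phi[:, idx]
        Psi = np.linalg.inv(P.T @ P / s2 + np.diag(alpha[idx]))
        mu = Psi @ P.T @ t / s2
        return P, Psi, mu

    for _ in range(n_iter):
        P, Psi, mu = posterior(active, sigma2)
        # gamma_i measures how well weight i is determined by the data.
        gamma = np.clip(1.0 - alpha[active] * np.diag(Psi), 1e-12, 1.0)
        alpha[active] = gamma / np.maximum(mu ** 2, 1e-30)
        resid = t - P @ mu
        sigma2 = max(resid @ resid / max(N - gamma.sum(), 1e-6), 1e-8)
        # Weights whose hyperparameter grows very large are effectively zero: drop them.
        keep = alpha[active] < prune_at
        if keep.any():
            active = active[keep]

    _, Psi, mu = posterior(active, sigma2)
    return active, mu, Psi, sigma2

def rvm_predict(Phi_new, active, mu):
    """Predictive mean: posterior mean weights applied to the retained basis functions."""
    return Phi_new[:, active] @ mu
```

The basis functions that survive pruning play the role of the relevance vectors; the kernel used to build the design matrix can be swapped for any other kernel without changing `rvm_fit`.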

Selection of the Input Variables.
It is well known that power system reliability depends highly on the supply capacity of the grid and on the natural environment of the grid. Therefore, in this paper, the main factors reflecting the supply capability of the grid itself, such as the availability coefficient of the grid equipment, the power supply reliability ratio, the power supply radius, the capacity-load ratio, the electricity loss due to power rationing, and the new substation capacity per unit load, together with high temperature, lightning strikes, strong winds, and heavy rain, which have a large impact on the safe operation of the power system, are all taken as inputs of the model. The availability coefficient of the grid equipment is the ratio of the available hours of major equipment to the hours in the survey period. The power supply reliability ratio is the ratio of the total power supply hours to the hours in the survey period. The power supply radius is the physical length of the supply circuit between the power point and the farthest point it supplies. The capacity-load ratio is the ratio of the city substation capacity to the corresponding load, on the premise that power supply reliability is met. The electricity loss due to power rationing is the power supply loss caused by power rationing during the survey period. The new substation capacity per unit load is the new substation capacity required at each voltage level for every unit of load increase during the survey period.
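As a small illustration of the ratio-type inputs defined above, the following Python sketch computes the equipment availability, the supply reliability ratio, and the capacity-load ratio. The function names, the units, and the 8760-hour survey period (one non-leap year) are assumptions made for the example and are not taken from the data of this paper.

```python
HOURS_IN_PERIOD = 8760.0  # assumed survey period: one non-leap year

def availability_coefficient(available_hours, period_hours=HOURS_IN_PERIOD):
    """Available hours of major equipment divided by hours in the survey period."""
    return available_hours / period_hours

def supply_reliability_ratio(outage_hours, period_hours=HOURS_IN_PERIOD):
    """Total power supply hours divided by hours in the survey period."""
    return (period_hours - outage_hours) / period_hours

def capacity_load_ratio(substation_capacity, load):
    """Substation capacity divided by the corresponding load (units assumed consistent)."""
    return substation_capacity / load
```

For instance, under these assumptions an average outage of 8.76 hours in a year corresponds to a supply reliability ratio of about 0.999.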

Selection of the Output Variables.
In daily production, two main aspects affect the power system reliability: one is fault outages, and the other is prearranged (scheduled) outages. A grid system is made up of a large number of transmission lines, transformers, switch equipment, various kinds of reactive power compensation equipment, and power points. The most direct manifestation of the effect of these factors on the grid is a grid blackout. Therefore, in this paper, the grid average fault power failure time is taken as the output of the power reliability predicting model.

RVM Modeling and Structure Model.
The basic idea of the model is briefly introduced first: principal component analysis is performed on the original input variables, RVM is then used to model the resulting matrix, and finally the predicting result is obtained. The kernel function of RVM does not need to satisfy the Mercer condition, so there is a certain degree of freedom in selecting it; by integrating the different characteristics of several kernel functions, a hybrid kernel function with better properties can be obtained.
In this paper, the linear combination of the RBF kernel function and the polynomial kernel function is chosen as the kernel function of RVM, which can be written as

$$K(x_i, x_j) = \rho\, G(x_i, x_j) + (1 - \rho)\, P(x_i, x_j),$$

where $G(x_i, x_j)$ is the radial basis function, $P(x_i, x_j)$ is the binomial kernel function, and $\rho \in [0, 1]$ is the weight of the kernel function. It reduces to a single kernel function if $\rho = 0$ or $\rho = 1$. $\delta$ is the width of the RBF kernel function. $\rho$ and $\delta$ are the parameters to be optimized, for which the grid search algorithm is employed. Figure 1 is the structure diagram of the power supply reliability predicting model.
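A hybrid kernel of this kind, together with a simple grid search over its weight and width, might be sketched in Python as follows. Several details are assumptions for illustration only: the exact forms of the RBF kernel ($\exp(-\lVert x - y \rVert^2 / 2\delta^2)$) and of the degree-two polynomial kernel ($(x^{T} y + 1)^2$), the parameter names `rho` and `width`, and the use of leave-one-out error with kernel ridge regression as an inexpensive stand-in for a full RVM fit inside the search loop.

```python
import numpy as np

def hybrid_kernel(X, Y, rho=0.5, width=1.0, degree=2):
    """Weighted sum of an RBF kernel and a polynomial kernel (assumed forms)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    rbf = np.exp(-d2 / (2.0 * width ** 2))
    poly = (X @ Y.T + 1.0) ** degree
    return rho * rbf + (1.0 - rho) * poly

def loo_error(K, t, ridge=1e-3):
    """Leave-one-out mean squared error of kernel ridge regression (stand-in model)."""
    n = len(t)
    errors = []
    for i in range(n):
        train = np.delete(np.arange(n), i)
        # Fit on all samples except i, then predict sample i.
        coef = np.linalg.solve(K[np.ix_(train, train)] + ridge * np.eye(n - 1), t[train])
        errors.append((K[i, train] @ coef - t[i]) ** 2)
    return float(np.mean(errors))

def grid_search(X, t, rhos, widths):
    """Return (error, rho, width) for the best pair on the given grid."""
    best = None
    for rho in rhos:
        for width in widths:
            err = loo_error(hybrid_kernel(X, X, rho, width), t)
            if best is None or err < best[0]:
                best = (err, rho, width)
    return best
```

Setting `rho` to 0 or 1 recovers the pure polynomial or pure RBF kernel, so the single-kernel models are included in the search grid as special cases.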

Numerical Analysis
In this section, a regional grid is taken as an example to illustrate the effectiveness of the proposed algorithm. This region mainly relies on the 500 kV and 220 kV grid, the transformer substations are mainly 220 kV and 110 kV, and the 220 kV and 110 kV lines are the most numerous. The average fault power failure time from 1998 to 2008 and the historical data of the 19 influence factors are taken as the training sample, and the actual data of 2009 (true value 12.13465 per household) are taken as the testing sample. The power supply reliability predicting model of this region is established in the MATLAB environment. Applying the PCA discussed in Section 2 to the original input sample $X_{12 \times 19}$ gives the eigenvalues and the contribution rates of the principal components, and the results are provided in Table 1. It follows from Table 1 that the cumulative contribution rate of the first five principal components already reaches 98.24%, which is larger than 95%; that is, the first five principal components provide sufficient information, so these five uncorrelated new variables can replace the original nineteen variables. We first obtain the eigenvectors of the five new principal components and then calculate the loadings of the variables on the principal components; the calculation results are given in Table 2.
Finally, based on the five principal components, the RVM approach is used to predict the new sample space, and some predicting results are shown in Table 3. It can be seen from Table 3 that the RVM method predicts more accurately than the BP neural network and has a shorter training time. The proposed approach combines the RVM method and the PCA method; it does not change the structure of the sample data but reduces the number of input variables and simplifies the network structure. Because the correlation among the network input factors is eliminated, the learning rate and performance of the network are improved; therefore, the predicting performance is improved greatly and the training time is shortened.

Conclusions
In this paper, a novel power system reliability predicting model based on PCA and RVM has been proposed. By using the PCA method, a variety of factors affecting the reliability of power supply were analyzed, and the newly derived matrix was taken as the input of RVM for training and predicting. This lower-dimensional matrix eliminates redundant information and reduces the dimension of the sample space, and thus the predicting accuracy of the network can be greatly improved. Finally, simulation results on a regional grid were provided to show the usefulness and effectiveness of the proposed predicting approach. The proposed approach not only provides a scientific reference for improving power system reliability but also provides an effective method for grid investment modeling. Furthermore, to test the robustness of the proposed method, the impact of noise on prediction should be considered in future research [15][16][17][18].

Figure 1: Schematic diagram of power supply reliability structure based on PCA and RVM.

Table 1: The eigenvalues and principal component contributions.