Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems

Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.


Introduction
Distribution networks deliver electrical energy from transmission systems to consumers and are an important and integral part of all power systems. Once an electrical fault occurs in any distribution feeder, immediate fault classification plays an important role in postfault analysis and power supply restoration. Accurate fault type information helps the fault diagnosis system not only to locate electrical faults promptly but also to ensure the power quality and reliability of the system [1,2].
A variety of approaches have been developed to build an effective fault classifier for electrical distribution feeders. As the amount of power delivered by distribution systems increases significantly, it is essential to focus on fault classification schemes. Studies of fault classification in distribution feeders can be divided into three categories: (1) impedance based methods [3,4], (2) travelling wave based methods [5,6], and (3) artificial intelligence based methods [7,8]. The most common method for fault classification in power systems is known as time-domain reflectometry (TDR) [9][10][11].
TDR is rather simple to implement; however, it is not a perfect fault-location method, since any single pulse stimulus injected into the electrical line is quickly attenuated along that line, causing fault location and classification to become inaccurate. To overcome this problem, an improved TDR method using incident pseudorandom binary sequence (PRBS) excitation is proposed to locate such faults in [12]; however, it should be noted that it is only applied to high-power transmission lines. In fact, it is quite difficult to apply the TDR method to find faults in distribution feeders because of the various junctions and ends of the branched network involved. As a result, various reflected responses may occur in the reflectometry trace [13]. Therefore, an intelligent algorithm is required to extract fault location information on a multiple-branched network from the reflectometry trace provided. SVM has been used successfully to resolve classification issues for a wide range of applications because of its strongly regularized characteristic and rapid training speed [14][15][16].
Computational Intelligence and Neuroscience
To build an SVM classifier, feature subset selection plays an important role in detecting relevant variables in classification spaces. Principal component analysis (PCA) [17] and multidimensional scaling (MDS) [18] are two traditional methods applied to remove redundant variables from the original feature vectors. The authors of [19] proposed a Hadoop scheme to extract features in parallel, in which hundreds of mappers are composed. In a recent paper [20], Ma and Niu used the firework algorithm to select input features by removing redundant influence in order to improve the icing forecasting of high voltage transmission lines.
In addition to feature subset selection, the optimal set of SVM parameters also plays an important role in the distribution of samples in a given search space. Vapnik showed that the penalty parameter and the kernel function parameter, such as gamma for the radial basis function (RBF), significantly affect the performance of SVM [21]. Various studies have proposed ways to select these two parameters, but there is no general agreement on their settings [22]. The grid search method (GSM) determines optimal parameters by trying different values and selecting those with the lowest testing error [23]. Because of the computational complexity involved with GSM, the genetic algorithm (GA) has been developed to improve classification accuracy and reduce training time by using a minimal number of features [24]. However, it requires a significant amount of calculation time due to its complex operational process, including inheritance, selection, recombination, and mutation. To overcome this problem, Kennedy and Eberhart proposed a population-based search technique known as particle swarm optimization (PSO) [25]. The primary advantage of the PSO based encoding technique is its capacity to reduce the chance of becoming trapped in local optima and to increase the classification accuracy as well as the training speed.
In this paper, a novel method based upon PSO techniques is developed to simultaneously optimize input features and SVM parameters in order to classify the fault types found in the distribution network. These fault types can be divided into ten classes, including single phase-to-ground faults (AG, BG, and CG), line-to-line faults (AB, AC, and BC), double line-to-ground faults (ABG, ACG, and BCG), and three-phase short-circuit faults (ABC). Further, this PSO-SVM classifier uses a dataset obtained from TDR analysis with PRBS excitation. Not only is the proposed PSO based encoding technique easy to use, but it also helps to significantly increase the success rate of the SVM classifier.
The remainder of this paper is organized as follows. In Section 2, the theory of the proposed method is discussed, including TDR, SVM, and PSO. Section 3 presents the modeling of a typical two-branched distribution feeder. The developed PSO based SVM fault diagnosis approach is given in Section 4. In Section 5, experimental simulation results and discussions are presented. Finally, a conclusion is presented in Section 6.
TDR is based on a single pulse being injected into the given line or cable to be examined. Afterwards, some of the pulse energy is reflected back to the source whenever it reaches a discontinuity, such as an electrical fault, a tee joint, or a line terminal. Since the propagation velocity is assumed to be constant, the fault distance can be measured based on the expected pulse transit time. Hence, the reflectometry trace not only displays the desired information on the fault type but also determines the fault location. Assume a distribution line is modeled by a lumped-parameter equivalent circuit, as shown in Figure 1, with a distributed series inductance $L$, resistance $R$, capacitance $C$, and conductance $G$.
A voltage introduced at the generator requires a certain amount of time to propagate along the line, as represented in the following equation:

$$\frac{\partial v(x,t)}{\partial x} = -R\,i(x,t) - L\,\frac{\partial i(x,t)}{\partial t}, \qquad \frac{\partial i(x,t)}{\partial x} = -G\,v(x,t) - C\,\frac{\partial v(x,t)}{\partial t}, \tag{1}$$

where $v(x,t)$ and $i(x,t)$ are the voltage and current waves travelling along the line at position $x$ and time $t$, respectively. The amplitude of the incident pulse is attenuated along the line, and the phase of the voltage travelling along the line is distorted as a result of the varying frequency [26]. The attenuation and phase shift are determined by the propagation coefficient, as shown in

$$\gamma = \sqrt{(R + j\omega L)(G + j\omega C)} = \alpha + j\beta, \tag{2}$$

where $\omega$ is the angular frequency and $\alpha$ and $\beta$ are the attenuation coefficient and the phase change coefficient, respectively. The velocity at which the voltage moves down the line can be defined as

$$u = \frac{\omega}{\beta}. \tag{3}$$

From (1), using the Laplace transform and the differential equations, we can obtain

$$v(x,t) = v^{+}(t - x/u) + v^{-}(t + x/u), \qquad i(x,t) = i^{+}(t - x/u) + i^{-}(t + x/u), \tag{4}$$

where $v^{+}(t - x/u)$ and $i^{+}(t - x/u)$ are the forward travelling voltage and current waves, respectively; $v^{-}(t + x/u)$ and $i^{-}(t + x/u)$ are the backward travelling voltage and current waves, respectively. Equating the coefficients of $t - x/u$ and $t + x/u$, the current in (4) can be rewritten as

$$i(x,t) = \frac{1}{Z_0}\left[v^{+}(t - x/u) - v^{-}(t + x/u)\right], \tag{5}$$

where $Z_0$ is called the characteristic impedance. When the line is terminated with a load whose impedance differs from the characteristic impedance, a reflected wave occurs at the load and then propagates back toward the source. The voltage at the load in this case is given by

$$v_L(t) = Z_L\,i_L(t), \tag{6}$$

where $v_L$ and $i_L$ are the voltage and current at the load and $Z_L$ is called the load impedance. The reflected wave is related to the incident wave by the following equation:

$$v^{-}(t + \tau) = \Gamma\,v^{+}(t - \tau), \qquad \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}, \tag{7}$$

where $\Gamma$ is called the receiving-end voltage reflection coefficient and $\tau$ is called the transit time. TDR is quite simple to implement, but it is not a perfect technique, since a single pulse excitation is quickly attenuated along the line. In addition, the pulse width is one of the factors that affect the accuracy of the reflectometry method.
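The two quantities that TDR relies on can be illustrated with a minimal sketch. It uses the standard relation for the receiving-end reflection coefficient, $(Z_L - Z_0)/(Z_L + Z_0)$, and converts the round-trip delay of an echo into a distance. The 400 Ω characteristic impedance and the 30 µs delay are hypothetical values chosen for illustration; the velocity of 198,000 km/s is the one quoted for the test system.

```python
def reflection_coefficient(z_load: complex, z0: complex) -> complex:
    """Receiving-end voltage reflection coefficient (Z_L - Z_0) / (Z_L + Z_0)."""
    return (z_load - z0) / (z_load + z0)


def fault_distance_km(round_trip_s: float, velocity_km_s: float) -> float:
    """Distance to a discontinuity from the round-trip delay of its echo."""
    return velocity_km_s * round_trip_s / 2.0


Z0 = 400.0            # hypothetical characteristic impedance, in ohms
VELOCITY = 198_000.0  # propagation velocity in km/s, as quoted for the test system

gamma_matched = reflection_coefficient(400.0, Z0)  # matched load: no echo
gamma_short = reflection_coefficient(0.0, Z0)      # bolted short circuit
gamma_open = reflection_coefficient(1e12, Z0)      # (nearly) open line end
distance = fault_distance_km(30.0e-6, VELOCITY)    # hypothetical 30 microsecond echo
```

A matched load gives no reflection, a short circuit inverts the pulse, and an open end returns it with the same sign, which is why different fault types leave different signatures in the reflectometry trace.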
The TDR method using incident pseudorandom binary sequence (PRBS) excitation can solve these problems by using the cross-correlation (CCR) function between the reflected wave and the incident wave, given by (8), for fault diagnosis in distribution feeders:

$$R_{xy}(k) = \frac{1}{N}\sum_{n=0}^{N-1} x(n)\,y(n+k), \tag{8}$$

where $R_{xy}$ is the CCR function between the reflected wave and the incident wave; $x$ is the forward signal, $y$ is the feedback signal, $N$ is the number of samples, and $k$ is the lag.
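The benefit of PRBS excitation can be seen in a small numerical sketch. The code below generates one period of a 127-chip PRBS with a 7-bit shift register, builds a hypothetical echo (attenuated to 30% of the incident amplitude, delayed by 30 chips, with added noise), and recovers the delay from the peak of a circular cross-correlation. The generator polynomial $x^7 + x^6 + 1$, the echo parameters, and the circular form of the correlation are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def prbs7(seed: int = 0x7F) -> np.ndarray:
    """One period (127 chips) of a PRBS from a 7-bit LFSR, polynomial x^7 + x^6 + 1."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must have at least one of its 7 low bits set")
    chips = []
    for _ in range(127):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        chips.append(new_bit)
    return 2.0 * np.array(chips) - 1.0  # map {0, 1} to {-1, +1}


def circular_ccr(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """R_xy(k) = (1/N) * sum_n x(n) * y((n + k) mod N) for k = 0..N-1."""
    n = len(x)
    return np.array([np.dot(x, np.roll(y, -k)) for k in range(n)]) / n


# Hypothetical echo: 30 % amplitude, 30-chip delay, Gaussian noise.
rng = np.random.default_rng(0)
incident = prbs7()
reflected = 0.3 * np.roll(incident, 30) + rng.normal(0.0, 0.1, incident.size)

ccr = circular_ccr(incident, reflected)
lag = int(np.argmax(ccr))  # lag of the CCR peak, in chips
peak = float(ccr[lag])     # roughly the echo amplitude
```

Because the autocorrelation of a maximal-length sequence is sharply peaked, the echo stands out at its delay even though it is weaker than the incident signal and buried in noise; the peak height tracks the echo amplitude, which is what makes CCR peaks usable as classification features.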
As previously mentioned, a variety of components exist along electrical distribution lines, such as transformers, capacitors, tap changers, and phase splitters, so it is not easy to extract fault locations from the various reflections observed in the reflectometry trace. In this study, a multilayer SVM classifier is proposed as a supporting technique for the TDR method to provide fault diagnosis in multibranch distribution networks, including single phase-to-ground faults (AG, BG, and CG), line-to-line faults (AB, AC, and BC), double line-to-ground faults (ABG, ACG, and BCG), and three-phase faults (ABC).

Support Vector Machine.
The support vector machine (SVM) was first introduced by Vapnik in 1995, and it has become one of the most effective techniques for data classification. It has a solid theoretical foundation based on a combination of the structural risk minimization principle and statistical learning theory (SLT). The main advantages of SVM are its global optimization and high generalization ability. Further, it overcomes overfitting problems and provides sparse solutions in comparison to existing methods such as the artificial neural network (ANN) and the refined genetic algorithm (RGA) in fault classification.
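In a standard linear classification problem, for example, one should separate the set of training data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, where $n$ is the number of given observations, $x_i \in \mathbb{R}^d$ are feature vectors, and $y_i \in \{-1, +1\}$ are labels. A binary classification problem can be posed as an optimization problem in the following way:

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \tag{9}$$

subject to

$$y_i\left(w^{T}x_i + b\right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1, 2, \ldots, n, \tag{10}$$

where $C$ is the regularization parameter and $\xi_i$ are the penalizing relaxation variables. Equation (10) means

$$w^{T}x_i + b \ge 1 - \xi_i \ \text{ if } y_i = +1, \qquad w^{T}x_i + b \le -1 + \xi_i \ \text{ if } y_i = -1. \tag{11}$$

It is to be noted that the nonlinear classifier may be denoted in the input space as

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n}\alpha_i^{*}\,y_i\,K(x_i, x) + b^{*}\right), \tag{12}$$

where $f(x)$ is the decision function, $\alpha_i^{*}$ are the optimal Lagrange multipliers, and the bias $b^{*}$ is calculated by the Karush-Kuhn-Tucker (KKT) conditions; $K(x_i, x)$ is the kernel function that produces the inner product for this feature space. In this paper, the following radial basis function (RBF) is used:

$$K(x_i, x_j) = \exp\left(-\gamma\lVert x_i - x_j\rVert^{2}\right), \tag{13}$$

where $\gamma$ is the kernel parameter.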
To obtain optimum performance, some SVM parameters need to be selected properly, including the regularization parameter $C$ and the kernel parameter $\gamma$. In this work, the PSO technique is applied to optimize these two parameters accordingly.
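To make the role of the kernel parameter concrete, the following minimal sketch evaluates an RBF kernel and a kernel-based decision function with NumPy. It assumes the kernel is written as $\exp(-\gamma\lVert x_i - x_j\rVert^2)$; the support vectors, coefficients, and bias are hypothetical placeholders rather than values from a trained model.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, gamma: float) -> np.ndarray:
    """K[i, j] = exp(-gamma * ||a_i - b_j||^2) for row vectors a_i and b_j."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def decision(x_new, support_vectors, dual_coef, bias, gamma):
    """sign(sum_i alpha_i * y_i * K(x_i, x) + b); dual_coef holds alpha_i * y_i."""
    scores = rbf_kernel(x_new, support_vectors, gamma) @ dual_coef + bias
    return np.where(scores >= 0.0, 1, -1)


# Hypothetical two-support-vector model, for illustration only.
support_vectors = np.array([[0.0, 0.0], [2.0, 2.0]])
dual_coef = np.array([-1.0, 1.0])  # alpha_i * y_i for each support vector
samples = np.array([[0.1, 0.0], [1.9, 2.1]])
labels = decision(samples, support_vectors, dual_coef, bias=0.0, gamma=0.5)
```

A larger kernel parameter makes each support vector's influence more local, while the regularization parameter controls how strongly margin violations are penalized during training, which is why the two are usually tuned together.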

Particle Swarm Optimization.
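Particle swarm optimization (PSO) is inspired by the social and cooperative behavior displayed by various species to fill their needs in the search space. The algorithm is guided by personal experience ($\mathit{pbest}$), overall experience ($\mathit{gbest}$), and the present movement of the particles to decide their next positions in the search space.
Figure 2: The PSO search mechanism of the $i$th particle at the $k$th iteration.
Further, the experiences are accelerated by two factors $c_1$ and $c_2$ and two random numbers $r_1$ and $r_2$ generated in $[0, 1]$, whereas the present movement is multiplied by an inertia factor $w$. Mathematically, the updated positions of each particle in the search space can be expressed using the two equations given below.
The initial population (swarm) of size $N$ and dimension $D$ is denoted as $X = [X_1, X_2, \ldots, X_N]^{T}$, where each particle is $X_i = [X_{i,1}, X_{i,2}, \ldots, X_{i,D}]$. The velocity and position of each particle are updated as

$$V_{i,j}^{k+1} = w\,V_{i,j}^{k} + c_1 r_1\left(\mathit{pbest}_{i,j}^{k} - X_{i,j}^{k}\right) + c_2 r_2\left(\mathit{gbest}_{j}^{k} - X_{i,j}^{k}\right), \tag{14}$$

$$X_{i,j}^{k+1} = X_{i,j}^{k} + V_{i,j}^{k+1}. \tag{15}$$

In (14), $\mathit{pbest}_{i,j}$ represents the personal best $j$th component of the $i$th individual, whereas $\mathit{gbest}_{j}$ represents the $j$th component of the best individual of the population up to iteration $k$. Figure 2 shows the search mechanism of PSO in a multidimensional search space. The initial $\mathit{pbest}$ of each particle is its initial position, whereas the initial $\mathit{gbest}$ is the best particle position among the randomly initialized population. The $\mathit{pbest}$ and $\mathit{gbest}$ are updated as follows.
At iteration $k$,

$$\mathit{pbest}_{i}^{k+1} = \begin{cases} X_{i}^{k+1}, & f\left(X_{i}^{k+1}\right) < f\left(\mathit{pbest}_{i}^{k}\right), \\ \mathit{pbest}_{i}^{k}, & \text{otherwise}, \end{cases} \tag{16}$$

$$\mathit{gbest}^{k+1} = \mathit{pbest}_{m}^{k+1}, \qquad m = \arg\min_{1 \le i \le N} f\left(\mathit{pbest}_{i}^{k+1}\right), \tag{17}$$

where $f(\cdot)$ is the objective function subject to minimization. The updating procedure is repeated until a stop condition is reached, such as a prespecified number of iterations.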
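The search mechanism described above can be sketched as a basic global-best PSO. This is a minimal illustration, not the authors' implementation: the velocity of each particle is the inertia-weighted previous velocity plus attractions toward its personal best and the global best, the inertia weight is redrawn in [0.1, 0.5] at every iteration with both acceleration factors equal to 2 (the settings reported in the results section), and the box constraint, the 200-iteration budget, and the toy objective are assumptions.

```python
import numpy as np


def pso_minimize(f, lower, upper, n_particles=10, n_iter=200, c1=2.0, c2=2.0, seed=0):
    """Minimize f over the box [lower, upper] with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, dim))  # positions
    v = np.zeros((n_particles, dim))                        # velocities
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    history = [gbest_val]

    for _ in range(n_iter):
        w = rng.uniform(0.1, 0.5)  # inertia weight redrawn at each iteration
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lower, upper)  # keep particles inside the box
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
        history.append(gbest_val)
    return gbest, gbest_val, history


# Toy objective with its minimum at (1, 1); a stand-in for a classification error.
best_x, best_val, history = pso_minimize(
    lambda p: float(np.sum((p - 1.0) ** 2)), [-5.0, -5.0], [5.0, 5.0]
)
```

In the fault classification setting, the toy objective would be replaced by the classification error of an SVM trained with the features or parameters encoded in the particle position.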

System Modeling
An equivalent model is constructed by using Simulink software and MATLAB Toolbox to simulate a typical two-branched distribution feeder shown in Figure 3, in which dots represent the distribution transformers and their loads.
Two distribution transformers in the sample system are used to reduce the voltage on the distribution line to the level of the customers that are distributed along a feeder. Their parameters and connection phases are shown in Table 1 [31]. It is noted that these distribution transformers are operated in a full-load condition with 0.8 lagging power factor; as a result, the sample distribution system operates under unbalanced conditions. The main feeder and laterals are constructed by means of overhead lines whose positive-sequence impedance is 0.131 + j0.364 Ω/km [31].
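As a rough illustration of the line model, and reading the quoted value as the complex per-kilometre impedance $z = 0.131 + j0.364$ Ω/km, the positive-sequence series impedance of the overhead line between the substation and a fault at 3 km or 4 km (the distances used later in Table 2) works out as follows. Shunt elements and loads are ignored in this sketch.

```latex
\begin{align*}
|z| &= \sqrt{0.131^{2} + 0.364^{2}} \approx 0.387\ \Omega/\text{km},\\
Z_{3\,\text{km}} &= 3\,z = 0.393 + j1.092\ \Omega, & |Z_{3\,\text{km}}| &\approx 1.161\ \Omega,\\
Z_{4\,\text{km}} &= 4\,z = 0.524 + j1.456\ \Omega, & |Z_{4\,\text{km}}| &\approx 1.547\ \Omega.
\end{align*}
```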

Developed PSO Based SVM Fault Diagnosis Approach
Since the TDR technique alone does not easily diagnose faults in distribution networks, it needs to be supported by other intelligent techniques in order to obtain the best results. This paper proposes a PSO based SVM classifier to improve the performance of the TDR method in fault classification in electrical distribution feeders. The overall structure of the SVM short-circuit classifier is shown in Figure 4, in which PSO is performed to optimize the feature subset and SVM parameters. The data acquisition and preprocessing are described first.

Data Acquisition.
To obtain a suitable dataset for the classification process, a PRBS disturbance is injected directly into the secondary circuit of a 200/5 A current transformer (CT), which is placed at the beginning of the line under test. The primary circuit of the CT is connected to the main feeder; thus the amplified PRBS is propagated along the line to diagnose any faults which may occur. Once a fault occurs in the distribution feeder, it produces a reflected signal that travels between the fault location and the substation. These reflected responses are then cross-correlated with the incident impulse by (8) in order to reduce the impact of noise and to overcome amplitude attenuation. It is worth noting that, for each of the fault types specified, the magnitudes of the feedback waves differ at the instant of the short circuit; as a result, the peaks of the CCR are not the same. Hence, the reflected responses and the CCR between the reflected wave and the incident wave are used as input features for the training phase. The total number of features is 12, and they comprise a feature vector $v = [v_1, v_2, \ldots, v_{12}]$, in which $v_1$ to $v_6$ are the reflected voltages and currents obtained at the substation and $v_7$ to $v_{12}$ are the peaks of the CCR between the reflected and the incident waves.

Feature Extraction.
When the reflectometry method is used, various echo responses are collected, some of which are irrelevant data that may confuse the SVM classifier and increase the training time. Feature extraction is an effective way to select appropriate input features in order to improve the speed of training as well as to ensure the success rate of classification. For optimum feature selection in this work, PSO is employed to improve the performance of the SVM classifier. To select the optimum features of the given dataset, a binary string is optimized using PSO, where each bit represents a given feature of the dataset. In the binary string, a "0" represents an ignored feature, whereas a "1" represents a selected feature of the dataset. The optimum features are those features of the given dataset whose bits in the optimized binary string are "1." A given set of predefined SVM parameters is used while the features of the dataset are selected using PSO. At the end of the feature selection stage, the selected strings indicate which features are needed for optimizing the SVM parameters.
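The binary-string encoding can be illustrated with a short sketch that applies a 12-bit mask to a feature matrix. The mask shown here corresponds to the subset reported in the results section (features 1 to 7 and 9); the helper names and the placeholder data are hypothetical.

```python
import numpy as np


def apply_mask(features: np.ndarray, mask_bits: str) -> np.ndarray:
    """Keep the columns whose bit is '1'; bit k (from the left) maps to feature k + 1."""
    keep = np.array([bit == "1" for bit in mask_bits])
    if keep.size != features.shape[1]:
        raise ValueError("mask length must equal the number of features")
    return features[:, keep]


def selected_indices(mask_bits: str) -> list[int]:
    """1-based indices of the selected features."""
    return [k + 1 for k, bit in enumerate(mask_bits) if bit == "1"]


mask = "111111101000"  # features 1 to 7 and 9 selected, the rest ignored
data = np.arange(36, dtype=float).reshape(3, 12)  # 3 placeholder samples, 12 features
reduced = apply_mask(data, mask)
chosen = selected_indices(mask)
```

The same mask must be applied to both the training and the testing samples so that the classifier sees the same columns in both phases.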

Optimum SVM Parameters.
The performance of SVM is sensitive to the kernel function parameter $\gamma$ and the regularization parameter $C$, so these parameters must be carefully selected to increase the classification accuracy. In this paper, the PSO technique is used to select the parameters of the SVM classifier. Performance is measured according to the classification accuracy on unseen testing data. In the learning stage, the PSO based encoding SVM model is trained based on structural risk minimization to minimize the training error. While the training error continues to improve, the penalty parameter $C$ and the kernel function parameter $\gamma$ are adjusted by means of PSO. The adjusted parameters with minimal error are reported as the most suitable parameters. As a result, the optimal parameters ($C$ and $\gamma$) are obtained.
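One practical detail when searching for these two parameters is how a particle position is mapped to parameter values. The sketch below shows one common choice, a logarithmic mapping from the unit square to the parameter ranges, so that the search covers several orders of magnitude evenly. The mapping and the bounds are assumptions for illustration; the paper does not state the ranges it used.

```python
import math


def decode_params(position, c_bounds=(1e-2, 1e3), gamma_bounds=(1e-4, 1e1)):
    """Map a position in [0, 1]^2 to (C, gamma) on a logarithmic scale."""

    def scale(u, lo, hi):
        u = min(max(u, 0.0), 1.0)  # clamp to the unit interval
        return 10.0 ** (math.log10(lo) + u * (math.log10(hi) - math.log10(lo)))

    return scale(position[0], *c_bounds), scale(position[1], *gamma_bounds)


c_low, gamma_low = decode_params((0.0, 0.0))   # lower corner of the search box
c_high, gamma_high = decode_params((1.0, 1.0))  # upper corner of the search box
c_mid, gamma_mid = decode_params((0.5, 0.5))    # geometric midpoint of each range
```

With this mapping, the objective function evaluated by each particle would decode its position, train an SVM with the resulting parameters on the selected features, and return the classification error.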
Once the optimized parameters of the SVM are obtained, they are used to retrain the SVM model. After the training phase, the SVM classifier is ready to identify new samples in the testing phase. The testing set is also chosen by means of the above feature selection from the original dataset obtained from the TDR trace. Then, testing patterns are input to the trained multilayer SVM classifier, which can identify all 10 types of faults, including single-phase-to-ground faults (AG, BG, and CG), line-to-line faults (AB, AC, and BC), double-line-to-ground faults (ABG, ACG, and BCG), and three-phase faults (ABC).
The detailed experimental procedure for feature extraction and SVM parameter selection using the PSO algorithm can be expressed using the following steps:
(1) Read the complete data and set the $w$, $c_1$, and $c_2$ parameters.
(2) Initialize the positions X and velocities V of each particle of the population.
(3) Initialize sets of SVM parameters within their ranges as particle position and velocity.
(4) Form the SVM using the training dataset and the initialized positions of each particle.
(13) If $k <$ max ite, then $k = k + 1$ and go to step (6); else go to step (14).
(14) Optimum solution obtained: print the results of the optimum generation.
(15) Retrain the SVM with the optimum features and parameters; then identify unknown samples in the testing dataset.
The experiment procedure can be visualized in Figure 5.
Table 2: Dataset of ten fault types located at distances of 3 km and 4 km from the substation.
$V_a$, $V_b$, $V_c$, $I_a$, $I_b$, and $I_c$ are magnitudes of reflected voltages and currents, respectively; cc-$V_a$, cc-$V_b$, cc-$V_c$, cc-$I_a$, cc-$I_b$, and cc-$I_c$ are the CCR between the reflected signal and the incident signal.

Test Results and Discussion
In this paper, the fault types are considered by using a 127-bit PRBS stimulus with a frequency of 1 MHz and a velocity of 198,000 km/s propagated along the sample system given in Figure 3. The dataset used in this study was obtained at the substation end by TDR analysis, with 12 features, of which six are the magnitudes of the reflected signals and the remaining six are extracted from the peaks of the CCR between the feedback wave and the forward wave. This dataset comprises 5700 samples generated by creating each type of fault at different locations on the two laterals with varying fault impedance values.
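A quick calculation shows the time scales these settings imply. It assumes that the quoted 1 MHz is the PRBS chip rate (an assumption) and uses the 3 km and 4 km fault distances of Table 2.

```latex
\begin{align*}
T_{\text{chip}} &= \frac{1}{1\ \text{MHz}} = 1\ \mu\text{s}, &
T_{\text{PRBS}} &= 127 \times 1\ \mu\text{s} = 127\ \mu\text{s},\\
t_{3\,\text{km}} &= \frac{2 \times 3\ \text{km}}{198{,}000\ \text{km/s}} \approx 30.3\ \mu\text{s}, &
t_{4\,\text{km}} &= \frac{2 \times 4\ \text{km}}{198{,}000\ \text{km/s}} \approx 40.4\ \mu\text{s}.
\end{align*}
```

Under these assumptions, both round-trip delays are shorter than one PRBS period, so the CCR peak of each echo falls at an unambiguous lag of roughly 30 and 40 chips, respectively.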
Note that the training and testing sets are randomly drawn from the original dataset, with 4500 samples used for training and 1200 for testing. For brevity, Table 2 gives only a small portion of the dataset, which was created by simulating the ten types of short-circuit fault on the first lateral at distances of 3 km and 4 km from the substation.
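A random split of this kind can be sketched in a few lines. The seed and the helper name are hypothetical, and the sketch only produces index arrays; the actual samples would be selected from the dataset with them.

```python
import numpy as np


def random_split(n_samples: int, n_train: int, seed: int = 0):
    """Return disjoint index arrays for the training and testing sets."""
    if not 0 < n_train < n_samples:
        raise ValueError("n_train must lie strictly between 0 and n_samples")
    order = np.random.default_rng(seed).permutation(n_samples)
    return order[:n_train], order[n_train:]


# 5700 samples in total: 4500 for training and the remaining 1200 for testing.
train_idx, test_idx = random_split(5700, 4500)
```

Fixing the seed makes the split reproducible, so that runs with and without PSO are compared on the same testing samples.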
In this paper, the PSO technique is used to select the features and parameters of the SVM classifier. Based on preliminary experiments, the population size is set to 10; the inertia weight $w$ is drawn randomly between 0.1 and 0.5 at each iteration; and the acceleration factors $c_1$ and $c_2$ are both set to 2, with the maximum number of iterations set to 1000. Table 3 gives the classification accuracy of the SVM algorithm both with and without PSO optimization. The optimum values of $C$ and $\gamma$ of the SVM classifier are 181.0193 and 1.1212 without PSO and 15.0381 and 0.0334 with PSO. From this table, it is observed that the classification accuracy when using the entire feature set is 93%, whereas the classification accuracy when using the PSO based encoding technique is 97.15%. This demonstrates the improved efficiency of the proposed method in which PSO optimization is applied. The features are selected automatically from the 12 input features, and the testing success rate is improved significantly. Eight features remain after selection, namely features 1 to 7 and 9. Furthermore, Table 3 provides the computational times for training the SVM classifier. The overall simulation time taken by the SVM classifier without PSO is 134.8 seconds, whereas with PSO it is 83.54 seconds. It can be concluded that the PSO technique takes a relatively shorter computational time for training. The convergence characteristic of the proposed PSO is shown in Figure 6. From this figure, it can be observed that the MSE no longer decreases beyond 15 iterations; thus the optimized SVM parameters can be obtained before the total training time (83.54 sec) has elapsed.

Conclusions
In this paper, a multilayer support vector machine (SVM) based on parameter optimization and feature selection has been developed to classify ten types of faults in radial distribution feeders. Particle swarm optimization (PSO) has been used as an optimizer to improve the performance of the SVM classifier by selecting an appropriate feature subset and kernel parameters. Further, time-domain reflectometry (TDR) with a pseudorandom binary sequence (PRBS) stimulus has been utilized for generating a fault dataset. In the proposed technique, PRBS injection not only overcomes the stimulus distortion problem but also reduces the impact of noise, providing a reliable dataset for the SVM classifier. The proposed PSO based SVM classifier has been successfully applied to identify all ten types of short-circuit faults in the radial distribution network observed. The high accuracy achieved in classifying fault types (over 97%) demonstrates greater effectiveness over existing fault identifiers.