Optimized Extreme Learning Machine for Power System Transient Stability Prediction Using Synchrophasors

A new optimized extreme learning machine (ELM)-based method for power system transient stability prediction (TSP) using synchrophasors is presented in this paper. First, the input features symbolizing the transient stability of power systems are extracted from synchronized measurements. Then, an ELM classifier is employed to build the TSP model. Finally, the parameters of the model are optimized by using the improved particle swarm optimization (IPSO) algorithm. The novelty of the proposal lies in using IPSO to optimize the parameters of the ELM-based TSP model with synchrophasors, which improves its prediction performance. Test results on both the IEEE 39-bus system and a large-scale real power system verify the correctness and validity of the presented approach.


Introduction
Monitoring the power system stability status in real time is regarded as an important task for guaranteeing safe and stable power system operation [1,2]. Up to now, the existing transient stability analysis (TSA) methods can mainly be divided into three classes: direct methods [3], time-domain simulations [4], and the extended equal area criterion method [5]. Unfortunately, these methods cannot work well for real-time stability analysis of modern complex power systems.
In recent years, pattern-recognition-based TSA (PRTSA) has been attracting ever-growing attention from researchers all over the world [6,7]. This kind of method has shown potential in the area of on-line dynamic security analysis by applying machine learning techniques. So far, PRTSA models mainly include artificial neural networks (ANN), decision trees (DT), and support vector machines (SVM) [8][9][10][11][12][13][14][15]. However, the reported PRTSA approaches usually suffer from some inherent disadvantages and lack the ability of big data management and utilization, which restricts their further application in actual operating scenarios. For example, ANN has problems of overfitting, local optima, and slow convergence, and SVM has difficulty in parameter selection. On the other hand, wide area measurement systems (WAMS) provide synchronous measurement information for wide area power systems [16], which makes it possible to explore wide area protection and control schemes to avoid system collapse [17][18][19].
Recently, a novel machine learning algorithm called extreme learning machine (ELM) was proposed by Huang et al. [20]. Compared with the conventional PRTSA approaches, ELM has significant advantages, such as better generalization ability and a much faster learning speed [21][22][23]. Inspired by the social behavior of flocks, the particle swarm optimization (PSO) algorithm was proposed in 1995 [24]. PSO has been widely used to solve a variety of optimization problems, with advantages including good robustness, fast convergence speed, and high search efficiency [25].
In this paper, a novel ELM-based transient stability prediction (TSP) method using synchronized measurements is proposed. Moreover, to further improve the prediction performance, the ideal model is obtained by applying the improved particle swarm optimization (IPSO) algorithm to select the optimal parameters of the model.

Mathematical Problems in Engineering
The rest of this paper is arranged as follows. First, the methodologies used, including ELM classification and PSO, are presented briefly. Second, the proposed real-time TSP method based on IPSO-ELM is presented in detail. Finally, the proposal is tested using the IEEE 39-bus system and a real system.

Related Methodologies
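The ELM output for the $j$th sample is written as
$$\sum_{i=1}^{L} \beta_i\, g(a_i \cdot x_j + b_i) = y_j, \quad j = 1, \dots, N, \tag{1}$$
where $a_i$ and $\beta_i$ are, respectively, the input and output weight vectors, $g(\cdot)$ is the activation function, and $b_i$ denotes the bias of the $i$th hidden node.
For convenience of expression, (1) can be rewritten as
$$H\beta = Y, \tag{2}$$
where $H$ is the hidden layer output matrix. ELM minimizes the training error as well as the norm of the output weights [20]:
$$\text{Minimize: } \|H\beta - Y\|^2 \text{ and } \|\beta\|. \tag{3}$$
Finally, the minimal norm least square method is used in the original implementation of ELM:
$$\beta = H^{\dagger} Y, \tag{4}$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$.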
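The training procedure described above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the sigmoid activation, the uniform range of the random weights, the function names `train_elm` and `predict_elm`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def train_elm(X, Y, n_hidden=50, seed=0):
    """Fit a single-hidden-layer ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    # Input weights a_i and biases b_i are drawn at random and never trained
    # (the range [-1, 1] is an assumption).
    A = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))   # hidden layer output matrix H (sigmoid g assumed)
    beta = np.linalg.pinv(H) @ Y             # beta = pinv(H) Y, the minimum-norm least-squares solution
    return A, b, beta

def predict_elm(X, A, b, beta):
    """Evaluate the trained network on new samples."""
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta

# Toy two-class problem with labels +1 / -1 (purely illustrative data).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
Y = np.where(X[:, :1] + X[:, 1:2] > 0, 1.0, -1.0)
A, b, beta = train_elm(X, Y, n_hidden=20)
train_acc = float(np.mean(np.sign(predict_elm(X, A, b, beta)) == Y))
```

Because only the output weights are solved for, training reduces to a single pseudo-inverse, which is the source of the fast learning speed mentioned in the Introduction.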

PSO.
The fundamental principle of PSO is to find the optimal solution in a complex search space by moving candidate solutions (called particles) according to the competition and collaboration among particles through repeated iterations. The movement of each particle is determined by a mathematical formula over its position and velocity. In the $(k+1)$th iteration, the velocity and position update equations of the $i$th particle are, respectively, as follows:
$$v_i^{k+1} = \omega v_i^{k} + c_1 r_1 \left(p_{best} - x_i^{k}\right) + c_2 r_2 \left(g_{best} - x_i^{k}\right), \qquad x_i^{k+1} = x_i^{k} + v_i^{k+1},$$
where $p_{best}$ and $g_{best}$ denote, respectively, the local and global best known solutions; $k$ and $\omega$ denote, respectively, the evolutionary generation and the inertia weight; $c_1$ and $c_2$ are the learning factors, which represent the self-cognition and social-cognition in turn; and $r_1$ and $r_2$ are random numbers uniformly distributed between 0 and 1.
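As a minimal sketch of the update rule described above, the following pure-Python function applies one velocity and position update to a whole swarm. The function name `pso_step`, the list-of-lists data layout, and the choice of drawing fresh random numbers per dimension are assumptions, not details taken from the paper.

```python
import random

def pso_step(positions, velocities, p_best, g_best, w, c1=2.0, c2=2.0, rng=random):
    """Apply one PSO velocity and position update to every particle (in place)."""
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()          # uniform draws in [0, 1)
            v[d] = (w * v[d]
                    + c1 * r1 * (p_best[i][d] - x[d])    # self-cognition term
                    + c2 * r2 * (g_best[d] - x[d]))      # social-cognition term
            x[d] += v[d]                                 # position update uses the new velocity
    return positions, velocities

# A particle that already sits on both best positions with zero velocity does not move.
pos, vel = [[1.0, 2.0]], [[0.0, 0.0]]
pso_step(pos, vel, p_best=[[1.0, 2.0]], g_best=[1.0, 2.0], w=0.7)
```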

Real-Time TSP Based on IPSO-ELM
3.1. IPSO Algorithm. According to the well-known "No free lunch" theorem, the performances of different optimization algorithms are equivalent when averaged over all possible problems [26], which implies that no single algorithm is the best for every problem. In this paper, a mutation strategy is introduced to keep PSO from converging prematurely to a local optimum. First, the optimization process is monitored by dynamically tracking changes in the population fitness variance $\sigma^2$, which is computed from the population size $N_p$, the fitness value $f_i$ of the $i$th individual particle, the best fitness value $f_{best}$ in the whole population, and the average fitness $f_{avg}$ in the current iteration. Second, when premature convergence occurs, a mutation strategy is used to maintain population diversity. Specifically, the positions of particles are updated by adding random perturbations in a timely manner: the update from the position $x^{(k)}$ at the $k$th iteration to the position $x^{(k+1)}$ at the $(k+1)$th iteration is governed by a variation coefficient and a real number rand randomly generated in the range from 0 to 1. The criterion for detecting premature convergence compares $\sigma_{k+1}^2$ and $\sigma_k^2$, the population fitness variances of the $(k+1)$th and $k$th iterations.
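The monitoring and mutation mechanism can be sketched as follows. The exact formulas of the paper are not reproduced here: the normalized variance below is one commonly used form, and the stagnation test `is_premature`, its tolerance `tol`, the perturbation in `mutate`, and its coefficient `coeff` are hypothetical stand-ins chosen only to make the idea concrete.

```python
import random

def fitness_variance(fitness):
    """Normalized population fitness variance (one common form; an assumption here)."""
    f_avg = sum(fitness) / len(fitness)
    scale = max(1.0, max(abs(f - f_avg) for f in fitness))   # keeps each ratio within [-1, 1]
    return sum(((f - f_avg) / scale) ** 2 for f in fitness)

def is_premature(var_prev, var_curr, tol=1e-6):
    """Hypothetical criterion: the variance has essentially stopped changing."""
    return abs(var_curr - var_prev) < tol

def mutate(position, coeff=0.1, rng=random):
    """Hypothetical perturbation: add coeff * (2 * rand - 1) per coordinate, clipped to [0, 1]."""
    return [min(1.0, max(0.0, x + coeff * (2.0 * rng.random() - 1.0))) for x in position]
```

A variance near zero means the particles have clustered around similar fitness values, which is the situation in which a perturbation is useful for restoring diversity.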

Fitness Function.
A proper fitness function plays an important role in optimization problems. In this paper, the fitness function is the classification accuracy of 5-fold cross-validation (CV), evaluated as a function of the model parameter vector to be optimized, which is represented by the position of each particle.
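A cross-validation accuracy of this kind can be computed as in the sketch below. The helper name `cv_accuracy`, the `train_fn`/`predict_fn` callables, the interleaved fold assignment, and the majority-vote stand-in classifier are assumptions for illustration; in the proposed method the classifier would be the ELM decoded from a particle.

```python
import random

def cv_accuracy(samples, labels, train_fn, predict_fn, k=5, seed=0):
    """Mean k-fold cross-validation accuracy, usable as a fitness value to maximize."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]            # k disjoint folds of near-equal size
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict_fn(model, samples[i]) == labels[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

# Stand-in classifier: always predicts the majority label of its training fold.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

demo = cv_accuracy(list(range(20)), [1] * 20, train_majority, predict_majority)
```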

Coding Scheme.
A mixed-integer encoding scheme is used in the optimization process [27]. Here, the $i$th individual particle/state is constituted by
$$[s_1, \dots, s_n, t_1, \dots, t_L], \tag{10}$$
where $n$ is the total count of input features; $s_j$ ($j = 1, \dots, n$) is a binary variable whose value ("1" or "0") indicates whether the corresponding feature is selected or not; and $t_l \in \{0, 1, 2\}$ ($l = 1, \dots, L$) is an integer variable that defines the activation function of each neuron $l$ of the hidden layer. The parameters $t_l$ make it possible to adjust both the number of neurons (if $t_l = 0$, the neuron is not considered) and the activation function of each neuron (sigmoid or linear function). To facilitate the optimization, the decision variables are mapped into real variables within the interval [0, 1], and all variables are converted back into their true values when computing the fitness value of each particle [27].
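The conversion from a real-valued particle in [0, 1] back to its true values might look like the following sketch. The decoding rules (a 0.5 threshold for the binary part and three equal bins for the integer part), the function name `decode_particle`, and the assignment of codes 1 and 2 to the sigmoid and linear functions are assumptions; the source only states that such a conversion takes place.

```python
def decode_particle(position, n_features, n_hidden):
    """Map a real-valued position in [0, 1] to a feature mask and per-neuron activation codes."""
    if len(position) != n_features + n_hidden:
        raise ValueError("position length must equal n_features + n_hidden")
    # Binary part: threshold at 0.5 (assumed rule) to decide whether a feature is kept.
    mask = [1 if p >= 0.5 else 0 for p in position[:n_features]]
    # Integer part: split [0, 1] into three equal bins for the codes 0, 1, 2 (assumed rule).
    codes = [min(2, int(p * 3)) for p in position[n_features:]]
    return mask, codes

# Which nonzero code selects which function is an assumption made for this sketch.
ACTIVATION = {0: "off", 1: "sigmoid", 2: "linear"}

mask, codes = decode_particle([0.9, 0.2, 0.5, 0.0, 0.4, 1.0], n_features=3, n_hidden=3)
active_neurons = sum(1 for c in codes if c != 0)
```

Keeping every decision variable in [0, 1] lets the standard continuous PSO update operate on the particle unchanged, with the discrete meaning recovered only at fitness evaluation time.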

Modeling Process.
The modeling process of the proposed method can be divided into 8 steps.
Step 1. Data preprocessing: the z-score standardization method is used [15]:
$$x' = \frac{x - \mu}{\sigma_X},$$
where $\mu$ is the mean of any feature $X$ in the sample data, $\sigma_X$ is the standard deviation of the feature $X$, and $x'$ is the normalized value corresponding to $x$, $x \in X$.
Step 2. Initialization of the parameters: the maximum iteration number is set to 200, the population size $N_p$ is set to 20, and the number of ELM hidden layer neurons is 50.
Step 3. Initialization of the population: $N_p$ solutions are generated randomly, and each solution corresponds to a particle, which is encoded according to (10).
Step 4. According to (4), calculate the output weights $\beta$ and the individual fitness values in turn.
Step 5. According to the optimization mechanism of the PSO algorithm, update the position of each particle and generate the next population.
Step 6. Dynamic monitoring of changes in the population fitness variance: once premature convergence occurs, save the current optimal solution and proceed to the mutation operation. If a solution with a higher fitness value than the saved one is found in the solution space, it replaces the saved optimal solution and the mutation operation ends.
Step 7. Judgment of the termination condition: the optimization process is terminated if the current number of iterations $k$ exceeds the prespecified maximum number of iterations or the value of the fitness function is greater than 99.00%; otherwise, set $k = k + 1$ and jump to Step 4.
Step 8. Acquisition of the ideal model: output the optimal solution and obtain the ideal TSP model.
The flowchart of the modeling process is shown in Figure 1.
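The standardization in Step 1 can be sketched with the standard library alone. The function name `zscore_columns`, the use of the population standard deviation, and the handling of constant features are assumptions, since the text does not specify them.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero deviation; dividing by 1.0 instead leaves it centered at 0.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]

# Three samples with two features; the second feature is constant.
scaled = zscore_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```

In practice the means and deviations would be estimated on the training samples and reused unchanged for the testing samples and for on-line inputs.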

Construction of the Initial Feature Set.
Input features play an important role in PRTSA [8,14,15]. However, the features used in previous works are mainly prefault static features, because earlier measurement systems were not able to provide wide area dynamic information. To take full advantage of the synchrophasor information from WAMS [16][17][18][19], the presented approach selects input features from both prefault static information and postfault dynamic information.
As an extension of related works, the initial features used in the presented approach are the same as the ones in [15] (see [15] for further details). These features are made up of prefault static features and postfault dynamic features. Therefore, the selected features comprehensively indicate the stability of power systems during different stages of the disturbance process, and they constitute the original feature set of the transient disturbed pattern space.

TSP Based on the Trained Model.
This section briefly explains how the ELM-based TSP model is used after it has been trained. It should be noted that synchrophasors are simulated through detailed time-domain simulations in this work. All the input features can be obtained from the following physical quantities: the rotor angle, angular velocity, mechanical power, and electromagnetic power of each generator, and the generators' inertial time constants. All these physical quantities are available from PMU measurements except for the inertial time constants, which are given. Therefore, the proposed method can be applied to TSP based on PMU measurements.
In the present approach, it is supposed that the tripping signal(s) issued by the local protection are available for triggering the TSP system. Once a fault is cleared by the action of the relevant relays, the trigger starts the sampling of the input variables to construct the input vector for the proposed ELM-based TSP model. The proposed model then takes 9 consecutive synchronously measured samples of each generator at the rate of 60 per second to form the input vector for the classifier. Finally, for a specific input vector, the transient stability status of the disturbed power system can be immediately predicted by using the trained model.
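The triggering and sampling logic described above can be sketched as a small buffer that is armed by the fault-clearing signal. The class name `PostFaultWindow`, its methods, and the flat per-frame measurement layout are hypothetical; the sketch only collects the raw window, from which the input features would then be computed.

```python
from collections import deque

SAMPLES_PER_WINDOW = 9      # consecutive post-fault samples per generator (from the text)
REPORTING_RATE_HZ = 60      # synchrophasor samples per second (from the text)

class PostFaultWindow:
    """Collects PMU frames after fault clearing until the raw window is complete."""

    def __init__(self, size=SAMPLES_PER_WINDOW):
        self.frames = deque(maxlen=size)
        self.armed = False

    def on_fault_cleared(self):
        """Called when the tripping signal from local protection is received."""
        self.frames.clear()
        self.armed = True

    def on_frame(self, measurements):
        """measurements: flat list of per-generator quantities for one time stamp."""
        if not self.armed:
            return None                       # frames before the trigger are ignored
        self.frames.append(list(measurements))
        if len(self.frames) < self.frames.maxlen:
            return None                       # window not yet full
        self.armed = False
        return [v for frame in self.frames for v in frame]   # flattened raw window

# Observation time implied by 9 samples at 60 samples per second, in milliseconds.
observation_ms = 1000 * SAMPLES_PER_WINDOW / REPORTING_RATE_HZ
```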

Results and Discussion
For power system TSA, the New England 39-bus test system is widely used to examine the performance of various assessment methods [9][10][11][12][13][14][15]. The system contains 10 generators and 39 buses and is shown in Figure 2.

Generation of Knowledge Base.
For PRTSA, the generalization ability of a TSA model largely depends on the completeness and representativeness of the knowledge base (KB). Hence, it is important to generate a KB that reflects the relationship between the input features and the stability status. In this work, the KB is built through extensive detailed time-domain simulations. The generator model is a fourth-order model with the IEEE DC1 excitation system; the load model is the constant impedance model. The faults are three-phase short-circuit faults, and the fault clearing times are varied from five to ten cycles. It is assumed that reclosures are successfully applied and the network topology is not changed when the fault is cleared. The load levels range from 80% to 130% of the basic load level, and the active and reactive power outputs of each generator are assigned correspondingly. Among the 3300 samples created in total, 2200 are randomly chosen as the training samples and the rest as the testing samples.
The class label of each sample is given by a transient stability index related to the relative rotor angle deviation during the transient period of a disturbed power system [13,15]. The label is determined by applying the sign function sgn(⋅) to an expression in $|\Delta\delta_{\max}|$, where | ⋅ | is the absolute value function and $\Delta\delta_{\max}$ is the maximum relative rotor angle deviation between generators in the period. Plots of the generator rotor angle swing curves illustrate an unstable case in Figure 3 and a transiently stable case in Figure 4.
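A labeling rule of this kind can be sketched as below. The 360 degree threshold is an assumption based on common practice for rotor-angle-based stability indices and is not confirmed by the text above; the function name `class_label` and the treatment of the boundary case are likewise illustrative.

```python
def class_label(max_angle_deviation_deg, threshold_deg=360.0):
    """Return +1 (stable) or -1 (unstable) from the maximum relative rotor angle deviation.

    threshold_deg is an assumed value; the boundary case (deviation equal to the
    threshold) is treated as unstable in this sketch.
    """
    margin = threshold_deg - abs(max_angle_deviation_deg)
    return 1 if margin > 0 else -1

labels = [class_label(d) for d in (120.0, -85.0, 540.0, -720.0)]
```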

Training Results of Different Optimization Algorithms.
Comparison tests are carried out by using other optimization algorithms comprising PSO and genetic algorithm (GA).
To facilitate comparative analysis, all the common parameters of these algorithms are assigned the same numerical values, and the other parameters are set as follows: in GA, the mutation probability is set to 0.01 and the crossover probability is set to 0.85; in PSO, both learning factors $c_1$ and $c_2$ are set to 2, and the inertia weight $\omega$ is linearly decreased from 0.9 to 0.4. At the same time, taking into account the randomness of the above-mentioned algorithms, all of them are executed 100 times, with the results listed in Table 1.
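The linearly decreasing inertia weight mentioned above can be written as a one-line schedule. The function name `inertia_weight` and the convention that the weight reaches its final value exactly at the last iteration are assumptions.

```python
def inertia_weight(k, max_iter, w_start=0.9, w_end=0.4):
    """Linearly decrease the inertia weight from w_start at k = 0 to w_end at k = max_iter."""
    return w_start - (w_start - w_end) * k / max_iter

# Schedule over the 200 iterations used in the modeling process.
schedule = [inertia_weight(k, 200) for k in range(201)]
```

A large weight early on favors exploration of the search space, while a small weight late in the run favors refinement around the best known solutions.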
Table 1 shows that the proposed IPSO algorithm achieves better solution quality, shorter training time, and a higher search success rate than GA and PSO. The reason is that the dynamic monitoring mechanism and the mutation strategy are both employed during the optimization process in the proposed approach; thus it has the best result and the most stable performance. Therefore, it can be concluded that IPSO is able to solve the ELM parameter optimization problem effectively. It should be noted that both the training time and the number of hidden nodes in Table 1 are averages over the 100 runs.
The fitness evolution curves of the optimal individuals for the three optimization algorithms during the process are shown in Figure 5.
Figure 5 shows that all these algorithms are clearly effective in the parameter optimization process for ELM. Among them, IPSO has the highest search efficiency, reaching the optimal solution after only 106 iterations; moreover, the best fitness value of IPSO is the highest among all these optimization algorithms. At the same time, IPSO pauses briefly at the 55th iteration but soon resumes its progress. This suggests that IPSO can quickly escape from a local optimum and overcome premature convergence effectively, owing to its strong global search capability. Therefore, the above results demonstrate that IPSO is able to enhance the solution quality and search efficiency of the proposed TSP approach.

Test Results
First, the proposed IPSO-ELM-based TSP model is tested. Fast prediction of instability is crucial for TSA, since transient instability develops very quickly and demands a control measure within a short period of time (<1 s) [13]. The required observation time in the presented approach is 150 ms (9 cycles), which allows over 400 ms for measurement, telecommunication, and processing delays. Once the data are in the control center, the prediction time using the trained model is short. For example, with MATLAB code running on a PC with a 2.66 GHz CPU and 3 GB of RAM, this calculation required only 42.267 ms. In practical applications, the computing time can be further reduced by means of state-of-the-art parallel and distributed computing technology. Therefore, the proposed method is able to predict the transient stability status of power systems in real time. Comparison tests using ELM are then carried out, with the results shown in Table 2.
Because the test accuracy Acc alone is subject to randomness, the classification performance of the tested TSP model should also be evaluated by using statistical indicators, such as the Kappa statistic value Kap and the area under the ROC curve AUC [15]. For this reason, a composite indicator combining these measures is adopted here to comprehensively evaluate TSP models.
Table 2 illustrates that the proposed algorithm manages to predict the transient stability with good accuracy. The reasons are as follows: on one hand, the classification features extracted from WAMS information fully reflect the dynamic response characteristics of the actual system; on the other hand, ELM has good generalization ability, for it minimizes both the norm of the output weights and the training error. At the same time, it can be observed that the classification ability of IPSO-ELM is much better than that of ELM. Therefore, it can be concluded that IPSO is an effective way to enhance the prediction performance of the proposed ELM-based TSP model.
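Two of the indicators named above, the test accuracy and the Kappa statistic, can be computed from a two-class confusion matrix as in the following sketch. It uses the standard definition of Cohen's kappa; the function name `accuracy_and_kappa` and the example counts are illustrative and are not results from the paper, and the composite indicator itself is not reproduced here.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Accuracy and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    p_o = (tp + tn) / n                                   # observed agreement (accuracy)
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 0.0
    return p_o, kappa

# Made-up counts for illustration only.
acc, kap = accuracy_and_kappa(tp=45, fn=5, fp=5, tn=45)
```

Kappa discounts the agreement that would be obtained by chance, which makes it more informative than accuracy when the stable and unstable classes are imbalanced.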

Results of Comparative Tests.
To test the proposed scheme further, comparative tests are carried out by using other TSP models, namely, DT [7,10], multilayer perceptron (MLP) [6,9], and SVM [13], with the results shown in Table 3. The parameters of these TSP models are set as follows. DT is constructed using the C4.5 algorithm; the radial basis function is used as the kernel function of SVM, and its associated parameters are optimized through grid search combined with 5-fold cross-validation [13,15]; in MLP, the number of hidden neurons is set to 25, and the learning algorithm is the backpropagation algorithm.
Table 3 shows that the proposed method has better predictive performance and faster training speed than the other TSP models, not only because of the superior generalization ability and fast training speed of ELM itself but also because IPSO is applied to determine the optimal parameters of ELM. As a result, it can be concluded that the proposed method is effective in real-time transient stability prediction for power systems.

Application to the Power System of Liaoning Province
In order to further verify the applicability of the proposed approach to practical large power systems, the proposed method is examined on the power system of Liaoning province, China.
The modeled system comprises 91 generators, 750 major buses, and some series compensated lines and SVCs. It is a highly interconnected grid with an approximate installed capacity of 39657.2 MW, covering an area of 148,000 square kilometers. The system is connected to the external power network through 5 channels. The contingencies considered are three-phase short-circuit faults. The stability criterion employed here is exactly the same as in the former case, the IEEE 39-bus system. Through a large number of simulations, 2000 samples are created in total. Of all the samples, 1320 are randomly selected as the training samples, and the rest are used as the testing samples. The results for the power system of Liaoning province are shown in Table 4.
As demonstrated in Table 4, the proposed approach is applicable to large-scale real power systems as well.Furthermore, the results also show that the proposed approach is able to determine the transient stability status in the power system of Liaoning province, China.Therefore, the applicability of the proposal to a real power system is verified.

Conclusions
Machine learning has proved to be promising for solving on-line TSA problems.However, the existing PRTSA approaches cannot meet the needs of big data management and utilization.To overcome this problem, a new optimized ELM-based approach for real-time power system TSP using synchrophasors is presented.Based on the test results on the well-studied IEEE 39-bus system and a large-scale real power system, the conclusions can be obtained as follows.
(1) The method proposed is able to effectively predict the power system transient stability status using synchronized measurements, and it has better predictive performance and generalization ability than other commonly used TSP models, such as DT, MLP, and SVM.
(2) The predictive performance of the proposal is evidently strengthened by using the proposed IPSO to optimize the parameters of the ELM-based TSP model with synchronized measurements. Furthermore, by introducing the dynamic monitoring mechanism and the mutation strategy, IPSO achieves better global optimization ability and faster search efficiency than the traditional optimization algorithms GA and PSO.
(3) For a wide area protection and control system, the presented method may be used as a trigger to start the related emergency control measures. Meanwhile, the proposed TSP model can be applied to other similar pattern recognition problems.

Figure 1: Flowchart of the modeling process.

Figure 5: Best fitness evolution curves of different algorithms.
2.1. ELM Classification. Assuming an ELM with $L$ hidden layer neurons to model $N$ data samples $\{x_j, y_j\}_{j=1}^{N}$, it can be mathematically represented as in (1).

Table 1: Comparison results of different algorithms.

Table 2: Test results of the proposed approach.

Table 3: Test results of other TSP models.

Table 4: Test results of the power system of Liaoning province.