Stacked Autoencoder Framework of False Data Injection Attack Detection in Smart Grid

Advanced communication technology provides new monitoring and control strategies for smart grids. However, the application of information technology also increases the risk of malicious attacks. The false data injection (FDI) attack is a kind of cyber attack that cannot be detected by the bad data detection in state estimation. In this paper, a data-driven FDI attack detection framework for smart grids with phasor measurement units (PMUs) is proposed. To enhance the detection accuracy and efficiency, a multilayer autoencoder is applied to abstract the hidden features of PMU measurements layer by layer in an unsupervised manner. Then, the features of the measurements and the corresponding labels are taken as inputs to train a softmax layer. Finally, the autoencoder and the softmax layer are stacked to form an FDI detection framework. The proposed method is applied to the IEEE 39-bus system, and the simulation results show that FDI attacks can be detected with higher accuracy and computational efficiency than with other artificial intelligence algorithms.


Introduction
Phasor measurement units (PMUs) can measure voltage and current phasors directly with the help of a global positioning system synchronization clock [1,2]. Because they can monitor the transient dynamics of power systems, more and more PMUs have been installed in the smart grid. Meanwhile, the rapid development of enhanced monitoring and information technology also facilitates malicious cyber attacks [3]. The large-scale integration of renewable energy resources poses a challenge for the security of system operation due to the inherent uncertainties of renewables [4][5][6]. The power system monitoring and data acquisition systems are the main targets of attackers who seek to seriously threaten the operating safety of the power system. Attackers launch a cyber attack by sending malicious information from the measurements to the control center. One of the most important functions of a state estimator is bad data detection, by which some malicious attacks can be detected because the value of the objective function increases dramatically when an attack is launched. However, one serious kind of cyber attack that cannot be detected by the bad data detection in state estimation is the false data injection (FDI) attack [7].
Up to now, much research has been carried out on different cyber attacks. Under the assumption that the network topology and parameters are known by the attackers, the FDI attack method was proposed in [8] for the first time. However, it is hard for an attacker to obtain full knowledge of a power system. Aiming at this problem, an FDI attack method based on only partial knowledge of the system topology and a subset of meter measurements is given in [9]. To reduce attack costs and detection risks, the minimal set of meters that must be compromised is taken as the objective function in [10]. In [11], the FDI attack is combined with other kinds of cyber attacks, forming an enhanced FDI attack method. Once an FDI attack is launched in a power system, it is hard to detect. To prevent the measurements from being attacked, the meters should be protected. Many methods for minimizing the protection costs have been presented in [12,13].
At the same time, FDI attack detection has become an active research topic. In [14], a reactance perturbation-based scheme is proposed to detect and identify originally covert FDI attacks on power system state estimation; it enhances the security of state estimation without significantly increasing the operational cost of the power system. In [15], an online anomaly detection algorithm is given that utilizes load forecasts, generation schedules, and synchrophasor data to detect measurement anomalies. In [16], the feasibility and limitations of adopting the proactive false data detection approach to thwart FDI attacks on power grid state estimation are studied, and a detection framework based on this approach is proposed.
With the rapid development of artificial intelligence, research on data-driven detection methods is increasing rapidly. Principal component analysis is used to analyze FDI attacks in a real-time environment [17], providing a more accurate and sensitive response than previous FDI detection techniques. In [18], a support vector machine-based FDI attack detection method, which is a supervised learning method using labeled data, is proposed. Principal component analysis is used to reduce the dimension of the data to be processed, which lowers the computational complexity. Deep learning has proven to be an effective way of solving pattern classification problems in engineering [19]. Under FDI attacks, spatial and temporal data correlations may deviate from those under normal operating conditions. Based on this characteristic, a discrete wavelet transform and deep neural networks are used in [20] to construct an intelligent system for AC FDI attack detection. In [21], deep learning is applied to recognize the behavior features of FDI attacks from historical measurement data, and the captured features are employed to detect FDI attacks in real time. Although deep learning is effective for detecting FDI attacks, drawbacks such as heavy computation loads and poor generalization with a huge number of inputs restrict its further application. Autoencoders [22,23], which learn compressed features in an unsupervised manner, are an effective way to cope with these problems and are attracting growing research interest [24,25]. However, the effectiveness of an autoencoder decreases when the number of hidden units exceeds the dimension of the input data.
To address this problem, sparse autoencoders, in which sparsity is integrated into the autoencoder model to learn more efficient sparse features, have been developed [26]. In [27], a denoising autoencoder, which learns useful features from raw inputs by denoising, is used in wind turbine gearbox fault diagnosis. Owing to its ability to abstract robust representations from noisy data, the denoising autoencoder has been applied in many fields in recent years [27,28]. In [29], autoencoders are used to reduce the dimension of measurement datasets and extract features from them. Further, the autoencoders are integrated into an advanced generative adversarial network framework, which successfully detects anomalies under FDI attacks with only a few labeled measurement data. However, a single-layer autoencoder cannot abstract the entire representation of the original data. Aiming at this problem, the stacked autoencoder, which is made up of multiple autoencoders, has been proposed. The output of the first autoencoder layer is taken as the input of the second layer.
In this paper, a stacked autoencoder-based FDI attack detection framework for the smart grid is proposed. The main contributions are as follows:
(1) A data-driven FDI attack detection framework is proposed. Topology errors and bad data are detected by state estimation. The hidden FDI attacks in the measurements that cannot be identified by state estimation are detected by the intelligent algorithm.
(2) The stacked autoencoder is applied to detect FDI attacks. Compared with other methods, the stacked autoencoder performs better when the numbers of normal and attacked samples differ widely.
(3) The proposed method is applied to the IEEE 39-bus test system. It performs better than traditional deep learning methods, which suggests that it is capable of practical application.
The rest of this paper is organized as follows. Section 2 establishes the power system linear state estimation model and gives the bad data detection method. In Section 3, the basic principle of FDI attacks is given. In Section 4, the stacked autoencoder-based FDI attack detection method is proposed. To evaluate the performance of the proposed method, a case study is carried out under different conditions in Section 5. Finally, Section 6 concludes this paper.

Linear State Estimation Model.
With the rapid development of PMUs, linear state estimation based on phasor measurements has become possible. Linear state estimation can be solved directly without iteration, so its computational burden is lighter than that of nonlinear estimation. The measurements include the real and imaginary parts of the bus voltage and current phasors, which can be measured directly. The real and imaginary parts of the bus voltages are taken as the states to be estimated.
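The relationships between the branch current measurements and the states are derived from the π equivalent of transmission lines:

$$\begin{aligned} I_{ij,r} &= g_{ij}(e_i - e_j) - b_{ij}(f_i - f_j) + g_{i0} e_i - b_{i0} f_i, \\ I_{ij,i} &= b_{ij}(e_i - e_j) + g_{ij}(f_i - f_j) + b_{i0} e_i + g_{i0} f_i, \end{aligned} \tag{1}$$

where $I_{ij,r}$ and $I_{ij,i}$ are the real and imaginary parts of the branch current phasor flowing from bus $i$ to bus $j$, respectively, $g_{ij}$ and $b_{ij}$ are the conductance and susceptance of branch $i$-$j$, respectively, $g_{i0}$ and $b_{i0}$ are the conductance and susceptance of the shunt branch at bus $i$, respectively, and $e_i$ and $f_i$ are the real and imaginary parts of the voltage phasor of bus $i$, respectively. The matrix form of (1) is

$$\begin{bmatrix} I_{ij,r} \\ I_{ij,i} \end{bmatrix} = \begin{bmatrix} g_{ij}+g_{i0} & -(b_{ij}+b_{i0}) & -g_{ij} & b_{ij} \\ b_{ij}+b_{i0} & g_{ij}+g_{i0} & -b_{ij} & -g_{ij} \end{bmatrix} \begin{bmatrix} e_i \\ f_i \\ e_j \\ f_j \end{bmatrix}. \tag{2}$$

Collecting all measured branches, Equation (2) can be rewritten as

$$z_B = Y_B x, \tag{3}$$

where $z_B = [\ldots, I_{ij,r}, I_{ij,i}, \ldots]^T$ is the vector of the branch current measurements, $x = [\ldots, e_i, f_i, \ldots]^T$ is the vector of states, and the coefficient matrix, written here as $Y_B$, is assembled from the blocks in (2).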
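The π-equivalent relationship described above can be checked numerically. The following minimal sketch assumes a series admittance $g_{ij} + jb_{ij}$ and a shunt admittance $g_{i0} + jb_{i0}$ as defined in the text, builds the real-valued 2×4 block that maps $[e_i, f_i, e_j, f_j]$ to $[I_{ij,r}, I_{ij,i}]$, and compares it with plain complex arithmetic. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def branch_current_block(g_ij, b_ij, g_i0, b_i0):
    """Return the 2x4 block mapping [e_i, f_i, e_j, f_j] to [I_ij_r, I_ij_i]."""
    return np.array([
        [g_ij + g_i0, -(b_ij + b_i0), -g_ij,  b_ij],
        [b_ij + b_i0,   g_ij + g_i0,  -b_ij, -g_ij],
    ])

# Hypothetical per-unit line parameters and bus voltages (illustration only).
g_ij, b_ij, g_i0, b_i0 = 1.2, -4.5, 0.0, 0.02
e_i, f_i, e_j, f_j = 1.02, 0.05, 0.98, -0.03

block = branch_current_block(g_ij, b_ij, g_i0, b_i0)
I_real_imag = block @ np.array([e_i, f_i, e_j, f_j])

# Cross-check: series current plus shunt current, computed with complex numbers.
V_i, V_j = complex(e_i, f_i), complex(e_j, f_j)
I_complex = complex(g_ij, b_ij) * (V_i - V_j) + complex(g_i0, b_i0) * V_i

print(I_real_imag, I_complex)
```

Stacking one such block per measured branch, placed in the columns of the two end buses, gives the branch part of the measurement matrix.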
In addition to the branch currents, the injected currents and bus voltages can also be measured by PMUs. The measurement equation of the linear state estimation is

$$\begin{bmatrix} z_U \\ z_{IN} \\ z_B \end{bmatrix} = \begin{bmatrix} I_{2m \times 2n} \\ Y_M \\ Y_B \end{bmatrix} x, \tag{4}$$

where $z_U$ and $z_{IN}$ are the phasor measurement vectors of bus voltages and injected currents, respectively, $z_B$ is the vector of branch current measurements, $I_{2m \times 2n}$ is the measurement matrix of bus voltages, $m$ and $n$ are the number of buses equipped with PMUs and the total number of buses, respectively, $Y_M$ is the injected current measurement matrix, and the branch current measurement matrix of (3) is written here as $Y_B$. Equation (4) can be rewritten as

$$z = Hx + v, \tag{5}$$

where $z$ is the measurement vector, $H$ is the measurement matrix, $v$ is the measurement error, and $v$ follows a Gaussian distribution with zero mean and variance $\sigma^2$. Equation (5) is linear, so linear weighted least squares can be used to estimate the states. The objective is to minimize the weighted sum of squared measurement errors:

$$\min_x J = (z - Hx)^T R (z - Hx), \tag{6}$$

where $J$ is the objective function, $R$ is a diagonal matrix whose $i$th diagonal element is $1/\sigma_i^2$, and $\sigma_i^2$ is the variance of the $i$th measurement. The estimated states are

$$\hat{x} = (H^T R H)^{-1} H^T R z, \tag{7}$$

where $\hat{x}$ is the vector of estimated states.
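The weighted least squares solution described above can be sketched in a few lines of NumPy. The measurement matrix below is a random stand-in rather than the matrix of any real network, and the problem sizes and noise level are assumptions made only for illustration; `wls_estimate` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem size: 6 states, 14 measurements.
n_states, n_meas = 6, 14
H = rng.normal(size=(n_meas, n_states))      # random stand-in for the measurement matrix
x_true = rng.normal(size=n_states)
sigma = np.full(n_meas, 0.01)                # assumed measurement standard deviations
z = H @ x_true + rng.normal(scale=sigma)     # noisy measurements

def wls_estimate(H, z, sigma):
    """Closed-form linear WLS: x_hat = (H^T R H)^-1 H^T R z with R = diag(1/sigma^2)."""
    R = np.diag(1.0 / sigma**2)
    gain = H.T @ R @ H
    x_hat = np.linalg.solve(gain, H.T @ R @ z)   # solve instead of forming the inverse
    r_hat = z - H @ x_hat                        # measurement residual
    J = float(r_hat @ R @ r_hat)                 # weighted sum of squared residuals
    return x_hat, r_hat, J

x_hat, r_hat, J = wls_estimate(H, z, sigma)
print(np.max(np.abs(x_hat - x_true)), J)
```

Because the model is linear, the estimate is obtained in a single solve, which is the source of the lighter computational burden compared with iterative nonlinear estimation.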

Bad Data Detection.
Under normal conditions (no bad data in the measurements), the weighted sum of squared measurement residuals stays below a given threshold $\varepsilon$; if the measurements contain bad data, the threshold is exceeded. The sum is given as

$$J(\hat{x}) = \hat{r}^T R \hat{r}, \quad \hat{r} = z - H\hat{x}, \tag{8}$$

where $\hat{r}$ is the estimated measurement residual. Bad data can be detected by the following judgement:

$$J(\hat{x}) \le \varepsilon: \text{no bad data}, \qquad J(\hat{x}) > \varepsilon: \text{bad data}. \tag{9}$$

If the measurements contain bad data, the suspect measurements are removed one by one, and the states are estimated again until all bad data are removed.
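The remove-and-re-estimate loop described above can be sketched as follows. The text does not state which measurement is removed at each step, so the sketch assumes the one with the largest weighted residual; the threshold value, the random measurement matrix, and the function name are likewise hypothetical.

```python
import numpy as np

def detect_and_remove_bad_data(H, z, sigma, eps):
    """Re-estimate while dropping one measurement at a time until J <= eps."""
    keep = np.arange(len(z))
    while True:
        Hk, zk, sk = H[keep], z[keep], sigma[keep]
        w = 1.0 / sk**2                                   # diagonal of R
        x_hat = np.linalg.solve(Hk.T @ (w[:, None] * Hk), Hk.T @ (w * zk))
        r = zk - Hk @ x_hat
        J = float(np.sum(w * r**2))
        # Stop when the test passes or no redundancy is left.
        if J <= eps or len(keep) <= H.shape[1]:
            return x_hat, keep, J
        worst = np.argmax(np.abs(r) / sk)                 # assumed removal rule
        keep = np.delete(keep, worst)

rng = np.random.default_rng(3)
H = rng.normal(size=(40, 4))                              # random stand-in matrix
x_true = rng.normal(size=4)
sigma = np.full(40, 0.01)
z = H @ x_true + rng.normal(scale=sigma)
z[3] += 1.0                                               # gross error on measurement 3

x_hat, keep, J = detect_and_remove_bad_data(H, z, sigma, eps=80.0)
print(len(keep), 3 in keep, J)
```

A gross error of this size inflates the objective value far beyond the threshold, so it is caught by the test; the attack discussed in the next section is constructed precisely so that this does not happen.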

False Data Injection Attacks
Aiming at the above bad data detection, an FDI attack constructs an attack vector that is added to the measurements and bypasses the bad data detection, while the estimated states deviate seriously from the true values. Assuming that the attackers can obtain the system topology and parameters, the FDI attack is formulated as

$$z_a = z + a, \tag{10}$$

where $z_a$ is the attacked measurement and $a$ is the attack vector. If $a$ is not deliberately designed, the weighted sum of squared residuals would exceed the threshold, and the attack would be detected. As a result, the attacker must find a proper vector $a$ that satisfies the following constraint:

$$\hat{r}_a = z_a - H\hat{x}_c = z + a - H(\hat{x} + c) = \hat{r} + (a - Hc), \tag{11}$$

where $\hat{r}_a$ is the estimated measurement residual under the attack condition, $\hat{x}_c = \hat{x} + c$ is the vector of estimated states under the attack condition, and $c$ is the deviation of the estimate caused by the attacked measurements. It can be seen from (11) that if $a = Hc$, the residual is unchanged, $\hat{r}_a = \hat{r}$, so the attack passes the bad data detection. This can cause serious consequences for the power system while remaining undetected. The attacked measurements satisfy the same constraints as normal measurements, which can be presented as follows:

$$z_a = z + a = Hx + v + Hc = H(x + c) + v. \tag{12}$$

Equation (12) shows that the attacked measurement $z_a$ still satisfies the measurement model (5), with $x$ replaced by $x + c$, so the estimated states deviate from the actual values.
This characteristic makes FDI attacks hard to detect with traditional methods. In this paper, the stacked autoencoder is proposed to abstract the intrinsic features of the attacked measurements.
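The stealth property of an attack vector of the form $a = Hc$ can be illustrated numerically. The sketch below uses a random stand-in for the measurement matrix and assumes equal measurement variances, so that weighted least squares reduces to ordinary least squares; the sizes, the noise level, and the chosen deviation $c$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.normal(size=(30, 6))                  # random stand-in measurement matrix
x = rng.normal(size=6)
z = H @ x + rng.normal(scale=0.01, size=30)   # normal measurements

def estimate(H, z):
    """Least-squares state estimate and residual (equal variances assumed)."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return x_hat, z - H @ x_hat

# Structured attack: a = H c shifts the estimate by c and leaves the residual unchanged.
c = np.zeros(6)
c[[1, 4]] = [0.5, -0.3]
a = H @ c

x_hat, r = estimate(H, z)
x_hat_a, r_a = estimate(H, z + a)

# Unstructured attack of comparable size, for contrast.
a_rand = rng.normal(scale=0.5, size=30)
_, r_rand = estimate(H, z + a_rand)

print(np.linalg.norm(r), np.linalg.norm(r_a), np.linalg.norm(r_rand))
```

The residual norms of the clean and the structured-attack cases coincide, while the unstructured attack produces a much larger residual and would be flagged by the bad data test.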

Stacked Autoencoder.
The autoencoder is a typical unsupervised learning neural network whose inputs are a set of unlabeled data. An autoencoder includes two parts: an encoder and a decoder. The encoder produces a reduced dimensional feature representation, which is taken as the input of the decoder. The decoder tries to reconstruct the original input from the reduced dimensional feature. The structure of the autoencoder is shown in Figure 1. $z$ is the measurement vector, which is taken as the input of the autoencoder. $y$ is the reduced dimensional feature of $z$ abstracted by the encoder, which is the decoder input. The output $\hat{z}$ is the reconstruction of the original input $z$. The objective of the autoencoder is to copy its input to its output by two transformations:

$$y = f(W_1 z + b_1), \quad \hat{z} = g(W_2 y + b_2), \tag{13}$$

where $f$ and $g$ are the activation functions of the encoder and decoder, respectively, $W_1$ and $W_2$ are the weight matrixes, and $b_1$ and $b_2$ are the bias vectors. $W_1$, $W_2$, $b_1$, and $b_2$ are obtained by training the autoencoder with the unlabeled data $z$. It must be noted that the autoencoder can reconstruct different original inputs accordingly, which means that the feature representation $y$ retains the information of the original input $z$ in a lower dimensional form. As a result, the objective of the autoencoder is to minimize the gap between the output $\hat{z}$ and the input $z$. Thus, in the training process, the reconstruction loss function is

$$J_a = \lVert \hat{z} - z \rVert^2, \tag{14}$$

where $J_a$ is the loss function of the autoencoder. In our FDI attack detection, once an autoencoder is trained, the output layer is no longer needed. Only the hidden layer of the encoder is used to abstract the features of the inputs. However, the representation ability of a single encoder is limited. Aiming at this problem, the stacked autoencoder is proposed; its structure is shown in Figure 2. It can be seen that the outputs of one encoder are taken as the inputs of the next encoder. In this way, several encoders are stacked together to form a multilayer autoencoder.
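The encoder and decoder transformations and the stacking of encoders can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the encoder activation is taken to be a sigmoid, the decoder is linear, the loss is the squared reconstruction error averaged over samples, training is plain batch gradient descent, and the data and layer sizes are synthetic. The names `train_autoencoder` and `encode` are hypothetical.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_autoencoder(Z, n_hidden, lr=0.05, epochs=500, seed=0):
    """Train one autoencoder by batch gradient descent; return parameters and loss history."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        Y = sigmoid(Z @ W1 + b1)              # encoder: y = f(W1 z + b1)
        Z_rec = Y @ W2 + b2                   # decoder: linear g (assumption)
        err = Z_rec - Z
        losses.append(float(np.mean(np.sum(err**2, axis=1))))
        # Backpropagation of the mean squared reconstruction error.
        dZ_rec = 2.0 * err / n
        dW2, db2 = Y.T @ dZ_rec, dZ_rec.sum(axis=0)
        dU = (dZ_rec @ W2.T) * Y * (1.0 - Y)  # sigmoid derivative
        dW1, db1 = Z.T @ dU, dU.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (W1, b1, W2, b2), losses

def encode(Z, params):
    """Keep only the encoder half of a trained autoencoder."""
    W1, b1, _, _ = params
    return sigmoid(Z @ W1 + b1)

# Synthetic stand-in data: 8 correlated features driven by 3 latent factors.
rng = np.random.default_rng(7)
Z = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8)) + 0.05 * rng.normal(size=(200, 8))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

params1, losses1 = train_autoencoder(Z, n_hidden=4)    # encoder 1 on raw data
Y1 = encode(Z, params1)
params2, losses2 = train_autoencoder(Y1, n_hidden=2)   # encoder 2 on encoder-1 features
Y2 = encode(Y1, params2)
print(losses1[0], losses1[-1], Y2.shape)
```

Each autoencoder is trained on the output of the previous encoder and its decoder is then discarded, so only the chain of encoders is carried forward as the feature extractor.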
The features of the original data are abstracted layer by layer. The stacked autoencoder is trained by the layer-wise unsupervised pretraining method. Encoder 1 is trained on the original data $z$ using the loss (14). The output of encoder 1, $y_1$, is taken as the input for training encoder 2. This process continues until the last encoder is trained. The output dimension of each encoder is smaller than that of the preceding one. Finally, a softmax layer is trained by supervised learning using the output of the last encoder as its input. The softmax function maps its input scalars to a probability distribution whose values range from 0 to 1. The softmax layer is commonly used as the output layer for classification problems. The probability function of the softmax layer is

$$\phi(s_l) = \frac{e^{s_l}}{\sum_{c=1}^{C} e^{s_c}}, \quad l = 1, \ldots, C, \tag{15}$$

where $\phi$ is the probability function of the softmax layer, $s$ is the input of the softmax layer, $s_l$ is the $l$th input element, and $C$ is the total number of inputs. The elements of the softmax layer output sum to 1, and the value of each element represents the probability of the corresponding class.
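The softmax mapping can be written compactly as follows. Subtracting the row maximum before exponentiating is a common numerical safeguard that leaves the result unchanged; it is an addition of this sketch and is not stated in the text. The two-column scores are hypothetical and stand for the classes "no attack" and "attack".

```python
import numpy as np

def softmax(s):
    """Row-wise softmax; the max shift avoids overflow and does not change the output."""
    s = np.asarray(s, dtype=float)
    shifted = s - s.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical scores for three samples, columns = [no attack, attack].
scores = np.array([[2.0, -1.0],
                   [0.0, 0.0],
                   [1000.0, 1001.0]])
probs = softmax(scores)
labels = probs.argmax(axis=1)      # predicted class = most probable one
print(probs, labels)
```

With two classes, picking the larger probability is the same as thresholding the attack probability at 0.5.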

Framework of False Data Injection Attack Detection.
The flowchart of the proposed FDI attack detection is shown in Figure 3. After the measurement $z_k$ is obtained, the linear state estimation is carried out first. Then, the value of the objective function is used to detect bad data. If the value exceeds the threshold, the bad data are deleted and the state estimation is repeated until all bad data are removed. FDI attacks can bypass the bad data detection, so the proposed FDI attack detection is carried out in the next step. If an attack is detected, the attacked measurements should be identified, which is beyond the scope of this paper.

Descriptions of the Testing System and Data.
To verify the validity of the proposed FDI attack detection method, the IEEE 39-bus test system [16,19] is used in this study. The voltage and current phasors measured by PMUs are taken as the inputs of the FDI attack detector. The power system states are obtained by power flow calculation using MATPOWER [30]. To simulate practical operating conditions, the generator and load powers are created by Monte Carlo simulations.
The simulated values are taken as the true values, while the measured values are generated by adding random numbers with a specified distribution to the true values. The measurement errors of the amplitudes and angles are 2% and 2°, respectively. Assume that the attacker chooses 5 states to attack and that the estimated deviation $c$ ranges from −2 to 2. The attack vector $a = Hc$ is added to the measurement $z$ to form $z_a$. In practice, attacked measurements are far fewer than normal measurements. In this simulation, the training set includes 5000 normal measurement samples and 500 attacked samples; the testing set includes 3000 normal samples and 300 attacked samples.
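The construction of the labeled training set can be sketched as follows. The sketch keeps the sample counts, the 5 attacked states, and the range of $c$ given above, but everything else is an assumption made for illustration: the measurement matrix is random rather than derived from the IEEE 39-bus system, the states are random rather than obtained from MATPOWER power flows, the measurement count of 120 is arbitrary, and the noise is a simple additive Gaussian term instead of separate amplitude and angle errors.

```python
import numpy as np

rng = np.random.default_rng(42)

n_states = 78      # 39 buses x (real, imaginary) voltage parts
n_meas = 120       # arbitrary measurement count (assumption)
H = rng.normal(size=(n_meas, n_states))   # random stand-in, not the 39-bus matrix

def sample_measurements(n):
    """Hypothetical normal samples: random states mapped through H plus noise."""
    X = rng.normal(size=(n, n_states))
    return X @ H.T + rng.normal(scale=0.02, size=(n, n_meas))

def make_attack_vector(H, n_attacked_states=5, c_max=2.0, rng=rng):
    """Pick 5 states, draw deviations c in [-c_max, c_max], return a = H c and c."""
    c = np.zeros(H.shape[1])
    idx = rng.choice(H.shape[1], size=n_attacked_states, replace=False)
    c[idx] = rng.uniform(-c_max, c_max, size=n_attacked_states)
    return H @ c, c

Z_normal = sample_measurements(5000)
Z_attack = sample_measurements(500)
for k in range(len(Z_attack)):
    a, _ = make_attack_vector(H)
    Z_attack[k] += a                      # z_a = z + a

Z_train = np.vstack([Z_normal, Z_attack])
y_train = np.concatenate([np.zeros(5000, dtype=int), np.ones(500, dtype=int)])

a_example, c_example = make_attack_vector(H)
print(Z_train.shape, y_train.mean())
```

The label vector is used only when training the softmax layer; the encoders are pretrained on `Z_train` without labels.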
In this study, two encoders and a softmax layer are stacked to form the stacked autoencoder-based FDI attack detection framework. The overall structure, together with the input and output dimensions of the stacked encoders, is shown in Figure 4.

The Performance of the Method.
To evaluate the performance of the detection method, the confusion matrix, whose entries are defined in Figure 5, is used to analyze the detection results quantitatively. True positives (TP) are actual attacks that are correctly classified as attacks; true negatives (TN) are actual normal measurements that are correctly classified as no attack; false positives (FP) are actual normal measurements that are incorrectly classified as attacks; false negatives (FN) are actual attacks that are incorrectly classified as no attack. The following three indexes are used to evaluate the ability of the proposed method:

$$Acc = \frac{TP + TN}{TP + TN + FP + FN}, \quad Pre = \frac{TP}{TP + FP}, \quad Rec = \frac{TP}{TP + FN}, \tag{16}$$

where Acc, Pre, and Rec are the accuracy, precision, and recall, respectively. Acc represents the overall performance of the method, Rec evaluates the performance of the attack detection, and Pre evaluates the probability that normal measurements are not detected as attacks. The confusion matrix of the detection results is shown in Figure 6. It can be seen that all 300 attacks are detected and all other samples are classified as normal measurements. The values of Acc, Pre, and Rec are all 100%.
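The three indexes can be computed directly from the confusion matrix counts. The sketch below uses the standard definitions of accuracy, precision, and recall; the first call reproduces the result reported above (300 detected attacks, 3000 correctly classified normal samples), while the second call is a hypothetical scenario with 20 missed attacks and no false positives, included only to show how Rec reacts. The function names are hypothetical.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels with 1 = attack, 0 = no attack."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn

def detection_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall; undefined ratios are returned as NaN."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp) if (tp + fp) else float("nan")
    rec = tp / (tp + fn) if (tp + fn) else float("nan")
    return acc, pre, rec

# Reported result: every test sample classified correctly.
y_true = np.concatenate([np.zeros(3000, dtype=int), np.ones(300, dtype=int)])
reported = detection_metrics(*confusion_counts(y_true, y_true))

# Hypothetical scenario: 20 attacks missed, no false positives (assumption).
missed = detection_metrics(tp=280, tn=3000, fp=0, fn=20)
print(reported, missed)
```

Because normal samples dominate the test set, Acc stays above 99% even when 20 of the 300 attacks are missed, which is why Rec is the more informative index for attack detection.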

Comparison with Other Methods.
Three other detection methods, i.e., multilayer perceptron (MLP), support vector machine (SVM), and deep neural network (DNN), are applied in the simulation. The MLP has 15 neurons in its hidden layer. If the output of the MLP is smaller than 0.5, the sample is classified as no attack; otherwise, it is classified as an attack. The DNN has 4 hidden layers with 150 units each. The confusion matrixes of the three methods are shown in Figure 7. The TN numbers of the three methods are all 3000, meaning that all normal measurements are correctly classified. However, not all of the 300 attacks are detected; the detection performance can be evaluated by the index Rec shown in Table 1. Among the three methods, the DNN performs best, but it is still worse than the proposed detection method.

Sensitivity Analysis.
In this section, the influences of the following factors on the detection performance are studied:
(1) The number of neurons in the encoders: the confusion matrixes of the cases considered are shown in Figure 8. It shows that 20 attacks are not detected in Case 1, meaning that the performance of the proposed method decreases when there are fewer neurons. In Case 3, 16 attacks are not detected. The reason is that encoder 1 has only 20 neurons, which cannot abstract the full features of the measurements, even though encoder 2 has 200 neurons.
(2) The number of encoders: the influence of the number of encoders stacked in the detection algorithm is studied. Three cases are considered, and the confusion matrixes are shown in Figure 9. It can be seen that 9 attacks are not detected in Case 1 because there is only one encoder, and the features cannot be abstracted fully. Although there are 3 encoders in Case 3, 7 attacks are not detected because each encoder has too few neurons.
(3) Attack proportion of the training set: in practice, attacked samples are far fewer than normal samples. The influence of the attack proportion in the training set is therefore also studied. The detection framework of Figure 4 is applied, and the testing samples include 3000 normal measurements and 300 attacks. The following training sets are considered:
Case 1: 7000 normal samples; 500 attacks
Case 2: 9000 normal samples; 500 attacks
Case 3: 9500 normal samples; 200 attacks
The confusion matrixes are shown in Figure 10. It shows that, as the proportion of attack samples decreases, more attacks go undetected. The proposed method is sensitive to the proportion of attacks in the training set. The reason is that the features of FDI attacks are hard for the encoder to abstract when the attack proportion is low.
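The attack proportions implied by these training sets follow from simple arithmetic. The short sketch below computes them for the three cases and, for comparison, for the base training set of 5000 normal and 500 attacked samples described in the data section; the dictionary keys are labels chosen for this illustration.

```python
# (normal samples, attacked samples) for each training set.
training_sets = {
    "base":   (5000, 500),
    "case 1": (7000, 500),
    "case 2": (9000, 500),
    "case 3": (9500, 200),
}

# Share of attacked samples in each training set.
proportions = {
    name: attacks / (normal + attacks)
    for name, (normal, attacks) in training_sets.items()
}

for name, p in proportions.items():
    print(f"{name}: {100 * p:.2f}% attacked samples")
```

The share of attacked samples falls from about 9.1% in the base set to about 2.1% in Case 3, which is the range over which the loss of detection performance is reported.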

Conclusion
In this paper, a stacked autoencoder-based FDI attack detection framework is proposed and applied to the IEEE 39-bus test system under different conditions. The confusion matrix and three indexes are used to evaluate the performance of the detection methods.
The simulation results show that the number of neurons in the encoders influences the detection performance. If there are too few neurons, the features cannot be abstracted fully, resulting in low Rec values. The number of encoders is another factor influencing the detection performance. If there are too few encoders, some attacks cannot be detected. It should be noted that if there are too few neurons, the detection performance still decreases even when many encoders are stacked. The proposed detection method is sensitive to the proportion of attack samples in the training set. If the training set contains too few attacks, the features of FDI attacks cannot be abstracted fully, and the detection performance decreases.
Future work on FDI attack detection based on stacked autoencoders can be carried out in the following areas: methods for determining the optimal numbers of encoders and neurons, the denoising function of the detectors, robustness to wrongly labeled samples, and detection with unbalanced data. Another interesting topic is to extend this work to detecting cyber attacks in integrated energy systems [31][32][33][34][35][36].
Data Availability
The IEEE 39-bus system data used to support the findings of this study are included within the article.
Mathematical Problems in Engineering