Data injection attacks in a cyber-physical system aim to manipulate a number of measurements in order to alter the estimated real-time system states. Many researchers have recently focused on detecting such attacks. However, most existing detection methods do not work well for nonlinear systems. In this paper, we present a compressive sampling methodology to identify the attack, which allows determining how many and which measurement signals are attacked. The methodology exploits the sparsity of the attack vector and can be applied to both linear and nonlinear systems. Experimental testing, which includes realistic load patterns from NYISO with various attack scenarios in the IEEE 14-bus system, confirms that our detector identifies the attacks with high precision and recall.
1. Introduction
A cyber-physical system (CPS) is a dynamical system that integrates computational components (i.e., real-time operations) with physical components (i.e., hardware facilities). Examples of CPS include large-scale distributed systems such as smart grids, transportation networks, railway control systems, and medical monitoring systems. The design of a CPS involves various disciplines, such as control engineering, software engineering, mechanics, and networks. In particular, control engineering relies on a communication network for transmitting sensor data (measurements) so that the system operator can monitor the production process in real time. Among the control schemes, a bad data detector (BDD) is applied to detect whether sensor data have been disrupted by generic malfunctions or malicious attacks. The classical BDD technique uses the “residual principle,” which calculates the difference between the observed readings and the readings computed from the estimated system states. When bad data are detected, the BDD removes those sensor readings whose residuals are larger than a threshold.
With the increased vulnerabilities revealed by recent discoveries of system malware, concerns about the security of CPS are rising. In 2011, a malware known as Stuxnet [1] successfully penetrated the networks of Iran’s uranium enrichment infrastructure via programmable logic controllers. This instance shows that it is possible for an attacker to introduce errors into physical readings. Inspired by this attack strategy, a class of attacks named data injection attacks has been proposed in recent years; these attacks can affect the system control algorithms and thus lead to abnormal operations [2, 3]. Hence, sufficient attention should be paid to detection techniques against this attack, which is easy to implement for strong adversaries who are knowledgeable about the targeted systems.
To fight against this attack, existing works focus on the detection of data injection attacks and the protection of nonlinear measurements [4, 5]. Detectors utilizing the sparsity and low rank of the system topology are proposed in [6–8]. Greedy and game-theoretic methods have been used to optimize the placement of devices [9] and thereby lower the possibility of constructing data injection attacks. A machine learning approach is proposed in [10], whose authors use a “first difference aware” machine learning (FDML) classifier to detect cyber attacks. A graph theory-based algorithm is proposed in [11] to determine which measurement signals an attacker will alter. However, we notice that all of these detection models except [11, 12] are developed in a constrained setting, assuming that the functions from system states to measurements are linear. This assumption is too stringent for nonlinear systems, for example, the alternating current (AC) model in power grids.
This paper investigates an alternative approach to detecting data injection attacks in nonlinear systems. We propose a detector framework named F-DDIA to reconstruct the initial states of the plant from the corrupted observations, which we formulate as an error correction problem. In particular, we notice that, due to the nature of data injection attacks, only a small fraction of the observations are expected to be attacked at a given time instance. Thus, we formulate the error correction problem as a sparse optimization problem, which can be solved with general l1-minimization techniques. Among these techniques, we apply the Douglas-Rachford splitting algorithm [13]. Furthermore, we employ the “divide-and-conquer” principle to construct a compressive sensing model of a linear subspace, which is of interest in general mathematical settings.
To validate and illustrate our algorithm, we use real-world CPS power grids as a case study. In particular, we use the data injection attacks model proposed in [2], where the attacks are directed by injecting false data into the sensors. Simulations based on IEEE 14-bus test systems validate the effectiveness of our methodology. The results show that the proposed algorithm can efficiently identify the data injection attacks (i.e., with high precision and recall values) and recover the initial system states (i.e., with small average phase error).
The rest of this paper is organized as follows. Section 2 presents the system model in a nonlinear system, including preliminaries related to a broad class of attacks. Section 3 states the problem and derives a theoretical justification of the efficacy of the security algorithm in a general cyber-physical system model. Section 4 analyzes the performance of the proposed approach through simulations. Section 5 gives concluding remarks.
2. Preliminaries
2.1. System Model and Bad Data Detector
A cyber-physical system is usually described by the following widely adopted discrete-time nonlinear dynamical model:
(1a) $x[k+1] = \delta(x[k]) + B u[k] + w[k],$
(1b) $z[k] = h(x[k]) + e[k],$
where at time $k \in I \triangleq \{0, \dots, T-1\}$: $x[k] \in \mathbb{R}^n$ is the system state; $u[k] \in \mathbb{R}^l$ is the bounded input vector; $z[k] \in \mathbb{R}^m$ is the measurement vector (data collected by the sensors); $w[k]$ denotes the state noise (e.g., Gaussian with known statistics); and $e[k]$ denotes the measurement errors. Here $B$ is a constant matrix, $\delta: \mathbb{R}^n \to \mathbb{R}^n$ denotes the state transition function, and $h: \mathbb{R}^n \to \mathbb{R}^m$ denotes the measurement function determined by the topology of the system; both $\delta$ and $h$ are nonlinear functions of the states. The process of estimating the system states from the measurements is called state estimation.
In traditional weighted least squares (WLS) state estimation, the estimated system states are accepted as valid only if the measurement residual $r[k]$ is less than a threshold [14]:
(2) $r[k] = \| z[k] - h(\hat{x}[k]) \|_{\ell_2},$
where $\hat{x}[k]$ is the system state obtained from state estimation. Specifically, the presence of bad measurements is inferred if $r[k] > \tau$, where $\tau$ is a chosen identification threshold. Upon detection of bad data, two methods, the largest normalized residual test ($r^N_{\max}$) and the hypothesis testing identification (HTI) method, are widely used to identify which measurements contain bad data.
2.2. Data Injection Attack
Data injection attacks are commonly known as false data injection attacks [2] or data framing attacks [3, 15]. We use the term in the sense of the following definition.
Definition 1.
A vector $a[k]$ is called a $(\kappa, m)$-data injection attack if there exists an index set $A \subset P \triangleq \{1, \dots, m\}$ of manipulated measurements such that
$\| a[k] \|_{\ell_0} \le \kappa$;
$a_i[k] = 0, \ \forall i \in P \setminus A$;
$a_i[k] \ne 0, \ \forall i \in A$.
To implement this class of attack, the attacker needs knowledge of either the measurement information ($z$) or the topology configuration ($h(\cdot)$). Specifically, a data injection attack can be written in the form
(3) $\bar{z}[k] = z[k] + a[k] = h(x[k]) + e[k] + a[k],$
where $a[k]$ is the injected false measurement data. There are many ways to generate this type of attack. For example, if $h(\cdot)$ is available to the attacker, the attack $a$ can be constructed in the following form (namely, the false data injection attack in a linear system):
(4) $a = Hc,$
where $c$ is the error injected into the system state and $H = \partial h(x) / \partial x$ is the Jacobian matrix. However, to implement this attack, the attacker needs to take control of at least $\kappa$ sensors, where $\kappa \le m$.
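The interplay between the residual test (2) and the attack construction (4) can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: a noise-free linear measurement model with a hand-picked 4×2 matrix `H`, an ordinary (unweighted) least squares estimator, and an arbitrary threshold `tau`. It is not the paper's estimator.

```python
import numpy as np

# Hypothetical linear measurement matrix (4 sensors, 2 states).
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])

def estimate_state(z):
    """Least squares state estimate for the linear model z = H x."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return x_hat

def residual(z):
    """l2 norm of the measurement residual, as in (2) with h(x) = H x."""
    return np.linalg.norm(z - H @ estimate_state(z))

x_true = np.array([1.0, 2.0])
z_clean = H @ x_true                      # noise-free measurements

c = np.array([0.3, -0.1])                 # error the attacker wants in the state
a_stealth = H @ c                         # attack built as a = H c, see (4)
a_naive = np.array([5.0, 0.0, 0.0, 0.0])  # gross error on a single sensor

tau = 0.5                                 # illustrative threshold
r_clean = residual(z_clean)
r_stealth = residual(z_clean + a_stealth)
r_naive = residual(z_clean + a_naive)

print("clean   :", r_clean, r_clean > tau)
print("a = H c :", r_stealth, r_stealth > tau)
print("naive   :", r_naive, r_naive > tau)
```

In this toy setting the attack `a = H c` lies in the column space of `H`, so it shifts the estimate by `c` while leaving the residual unchanged, whereas the single-sensor gross error is flagged.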
2.3. Measurement Dynamics
We can use a polynomial regression approach to fit the measurement dynamics,
(5) $z[k+1] = \delta(x[k]) + B u[k] + w[k] = \delta'(z[k]),$
where $\delta': \mathbb{R}^m \to \mathbb{R}^m$ denotes the dynamics of the measurements. Furthermore, we define $z_i[k]$ as the $i$th corrupted measurement at time $k$. A polynomial regression model that expresses the dynamics of the $i$th measurement can then be given as follows:
(6) $\delta_i'(z_i[k]) = \gamma_{i,1} z_i[k]^l + \dots + \gamma_{i,l} z_i[k] + \gamma_{i,l+1},$
where $l$ is the degree of the polynomial and $i \in P$. We denote $\gamma_i = (\gamma_{i,1}, \dots, \gamma_{i,l+1}) \in \mathbb{R}^{l+1}$. As $\delta_i'(z_i[k])$ can be expressed in matrix form in terms of a response vector $z_i[k]$ and parameters $\gamma_{i,j}$, where $1 \le j \le l+1$, we can rewrite $z_i[k+1]$ as a system of linear equations:
(7) $z_i[k+1] = X \begin{bmatrix} \gamma_{i,1} \\ \vdots \\ \gamma_{i,l+1} \end{bmatrix},$
where $X = (z_i[k]^l \ \cdots \ z_i[k] \ 1) \in \mathbb{R}^{l+1}$. Thus, the dynamical coefficients $\gamma_i$ can be estimated as
(8) $\hat{\gamma}_i = (X^T X)^{-1} X^T z_i[k+1], \quad i \in P.$
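The estimate in (8) can be sketched numerically as follows. The sketch assumes that the regressor rows of (7) are stacked over all time instants of a fitting window (one row per $k$), so that the least squares problem is well posed; the function name `fit_dynamics` and the synthetic trace are hypothetical and only serve the illustration.

```python
import numpy as np

def fit_dynamics(z, degree):
    """Least squares fit of z[k+1] = g_1 z[k]^l + ... + g_l z[k] + g_{l+1}.

    z      : 1-D array holding one measurement trace z_i[0..T-1]
    degree : polynomial degree l
    Returns the coefficient vector (g_1, ..., g_{l+1}), highest power first.
    """
    # One regressor row (z[k]^l, ..., z[k], 1) per time instant k.
    X = np.vander(z[:-1], degree + 1)
    gamma_hat, *_ = np.linalg.lstsq(X, z[1:], rcond=None)
    return gamma_hat

# Synthetic trace generated from known (made-up) quadratic dynamics.
gamma_true = np.array([-0.001, 1.02, 0.5])
z = np.empty(30)
z[0] = 1.0
for k in range(29):
    z[k + 1] = gamma_true[0] * z[k] ** 2 + gamma_true[1] * z[k] + gamma_true[2]

gamma_hat = fit_dynamics(z, degree=2)
print(gamma_hat)
```

Solving the stacked system with `np.linalg.lstsq` is numerically preferable to forming $(X^T X)^{-1}$ explicitly, but yields the same estimate when $X$ has full column rank.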
3. Our Methodologies
In this section, we formulate the detection problem as an error correction problem. We will further describe and explain why we can use l1-norm minimization technique (including Douglas-Rachford) to solve the detection problem.
3.1. Sparse Optimization Problem Formulation
In this paper, we consider the scenario in which an attacker is limited to the resources of $\kappa$ sensors and possesses knowledge of the system topology $h$ as well as the historical measurements $\bar{Z} = (\bar{z}[0]; \dots; \bar{z}[T-1]) \in \mathbb{R}^{mT}$. Denote by $Z = (z[0]; \dots; z[T-1]) \in \mathbb{R}^{mT}$ the initial measurements (without attacks) over the time horizon. The obtained temporal observations $\bar{Z}$ can be expressed as
(9) $\bar{Z} = Z + A,$
where $A = (a[0]; \dots; a[T-1]) \in \mathbb{R}^{mT}$. Recall that, due to the nature of data injection attacks, only a small fraction of the observations are expected to be attacked at a given time instance. Hence, exploiting the sparsity of the vector $A$, the detection problem can be converted to
(10) $\min_{A} \ \| A \|_{\ell_0} \quad \text{subject to} \quad \bar{Z} = Z + A, \quad \| a[k] \|_{\ell_0} \le \kappa, \ k \in I,$
where $\kappa$ is the maximum number of meters that can be compromised. We therefore focus on the problem of recovering the sparse vector $A$ from $\bar{Z}$, and we denote the optimal solution of problem (10) by $A^*$.
3.2. Subproblem Formulation
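In the rest of this paper, we write the vectors entrywise as $A = [A_1, \dots, A_{mT}]$, $Z = [Z_1, \dots, Z_{mT}]$, and $\bar{Z} = [\bar{Z}_1, \dots, \bar{Z}_{mT}]$. We further define the matrices $E$, $W$, and $\bar{W}$ in the following forms:
(11)
$E = \begin{bmatrix} E_1^T \\ \vdots \\ E_m^T \end{bmatrix} = \begin{bmatrix} A_1 & A_{m+1} & \dots & A_{(T-1)m+1} \\ \vdots & \vdots & \ddots & \vdots \\ A_m & A_{2m} & \dots & A_{Tm} \end{bmatrix} \in \mathbb{R}^{m \times T};$
$W = \begin{bmatrix} W_1^T \\ \vdots \\ W_m^T \end{bmatrix} = \begin{bmatrix} Z_1 & Z_{m+1} & \dots & Z_{(T-1)m+1} \\ \vdots & \vdots & \ddots & \vdots \\ Z_m & Z_{2m} & \dots & Z_{Tm} \end{bmatrix} \in \mathbb{R}^{m \times T};$
$\bar{W} = \begin{bmatrix} \bar{W}_1^T \\ \vdots \\ \bar{W}_m^T \end{bmatrix} = \begin{bmatrix} \bar{Z}_1 & \bar{Z}_{m+1} & \dots & \bar{Z}_{(T-1)m+1} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{Z}_m & \bar{Z}_{2m} & \dots & \bar{Z}_{Tm} \end{bmatrix} \in \mathbb{R}^{m \times T}.$
We then obtain the following relations among $E_i \in \mathbb{R}^T$, $W_i \in \mathbb{R}^T$, and $\bar{W}_i \in \mathbb{R}^T$:
(12) $\bar{W}_i = W_i + E_i, \ i \in P; \qquad \| A \|_{\ell_0} = \sum_{j=1}^{mT} \| A_j \|_{\ell_0} = \sum_{i=1}^{m} \| E_i \|_{\ell_0} = \| E \|_{\ell_0}.$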
We denote by $\mathrm{col}_k(E) \in \mathbb{R}^m$, $k \in I$, the columns of the matrix $E$. Hence, problem (10) is equivalent to
(13) $\min_{E} \ \| E \|_{\ell_0} \quad \text{subject to} \quad \bar{W} = W + E, \quad \| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa, \ k \in I.$
Since $\| E \|_{\ell_0} = \sum_{i=1}^{m} \| E_i \|_{\ell_0}$, we can solve problem (13) by seeking the locally optimal choice for each $E_i^*$, with the hope of finding a globally optimal solution $E^*$:
(14) $\min_{E_i} \ \| E_i \|_{\ell_0} \quad \text{subject to} \quad \bar{W}_i = W_i + E_i, \ i \in P.$
The solution of subproblem (14) is given in Section 3.3. After solving the $m$ optimization problems above, the optimal solution $E^*$ is checked against the following constraint:
(15) $f(\mathrm{col}_k(E^*)) = \mathrm{sgn}\left( \kappa - \| \mathrm{col}_k(E^*) \|_{\ell_0} \right).$
For any $k \in I$, if $f(\mathrm{col}_k(E^*)) = 1$, there exists an attack; otherwise, there does not exist any data injection attack.
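The rearrangement of the stacked vector $A$ into the matrix $E$ and the per-column check (15) can be sketched as follows. The sketch assumes that $A$ is stacked time-major, $A = (a[0]; \dots; a[T-1])$, so that column $k$ of $E$ equals $a[k]$; the toy values of `m`, `T`, `kappa`, and `A` are made up.

```python
import numpy as np

m, T, kappa = 3, 4, 2

# Stacked attack vector A = (a[0]; a[1]; a[2]; a[3]), each a[k] in R^m.
A = np.array([0.0, 0.0, 0.0,    # a[0]
              1.0, 0.0, 0.0,    # a[1]
              0.0, 2.0, 3.0,    # a[2]
              0.0, 0.0, 0.0])   # a[3]

# Row i of E is the time series E_i of sensor i; column k is a[k].
E = A.reshape(T, m).T

col_l0 = np.count_nonzero(E, axis=0)   # ||col_k(E)||_0 for every k
row_l0 = np.count_nonzero(E, axis=1)   # ||E_i||_0 for every sensor i

# Identity (12): the l0 norm of A splits into the row-wise l0 norms of E.
total_l0 = np.count_nonzero(A)

# Per-column value of f in (15).
f = np.sign(kappa - col_l0)

print(E)
print(col_l0, row_l0, total_l0, f)
```

The row-wise split is what allows problem (13) to be attacked one sensor at a time, while the column-wise count is only needed for the final check.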
3.3. Solving Subproblem by l1-Minimization
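Recall that the dynamical coefficients $(\gamma_1, \dots, \gamma_m)$ are obtained by the polynomial fitting in Section 2.3. In the following, $g_i(\cdot)$ collects the terms of the fitted polynomial (6) other than the linear term $\gamma_{i,l} z_i[k]$. From the adversary's point of view, $\bar{W}_i$ can be rewritten as
(16)
$\bar{W}_i = \begin{bmatrix} \bar{z}_i[0] \\ \bar{z}_i[1] \\ \bar{z}_i[2] \\ \vdots \\ \bar{z}_i[T-1] \end{bmatrix} = \begin{bmatrix} z_i[0] + a_i[0] \\ z_i[1] + a_i[1] \\ z_i[2] + a_i[2] \\ \vdots \\ z_i[T-1] + a_i[T-1] \end{bmatrix} = \begin{bmatrix} 1 \\ \gamma_{i,l} \\ \gamma_{i,l}^2 \\ \vdots \\ \gamma_{i,l}^{T-1} \end{bmatrix} z_i[0] + \begin{bmatrix} 1 & & & & \\ \gamma_{i,l} & 1 & & & \\ \gamma_{i,l}^2 & \gamma_{i,l} & 1 & & \\ \vdots & \vdots & & \ddots & \\ \gamma_{i,l}^{T-1} & \dots & & \gamma_{i,l} & 1 \end{bmatrix} E_i + \begin{bmatrix} 0 \\ g_i(\bar{z}[0]) \\ g_i(\bar{z}[1]) + \gamma_{i,l} g_i(z[0]) \\ \vdots \\ g_i(\bar{z}[T-1]) + \gamma_{i,l} g_i(z[T-2]) + \cdots \end{bmatrix}.$
Then we introduce the notation $\tilde{W}_i$ as follows:
(17)
$\tilde{W}_i = \bar{W}_i - \begin{bmatrix} 0 \\ g_i(\bar{z}[0]) \\ g_i(\bar{z}[1]) + \gamma_{i,l} g_i(z[0]) \\ \vdots \\ g_i(\bar{z}[T-1]) + \gamma_{i,l} g_i(z[T-2]) + \cdots \end{bmatrix} = \Gamma_i z_i[0] + \Psi_i E_i,$
where $\Gamma_i \in \mathbb{R}^T$ and $\Psi_i \in \mathbb{R}^{T \times T}$ are
(18)
$\Gamma_i = \begin{bmatrix} 1 \\ \gamma_{i,l} \\ \gamma_{i,l}^2 \\ \vdots \\ \gamma_{i,l}^{T-1} \end{bmatrix}, \qquad \Psi_i = \begin{bmatrix} 1 & & & \\ \gamma_{i,l} & 1 & & \\ \gamma_{i,l}^2 & \gamma_{i,l} & \ddots & \\ \vdots & \vdots & & \\ \gamma_{i,l}^{T-1} & \dots & & 1 \end{bmatrix}.$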
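In this paper, we use the approximation $g_i(z[k]) \doteq g_i(\bar{z}[k])$. The reason is that the difference between the two is
(19) $g_i(\bar{z}[k]) - g_i(z[k]) = \gamma_{i,1} \bar{z}_i[k]^l + \dots + \gamma_{i,l-1} \bar{z}_i[k]^2 - \gamma_{i,1} z_i[k]^l - \dots - \gamma_{i,l-1} z_i[k]^2.$
For example, when $l = 2$, $g_i(\bar{z}[k]) - g_i(z[k]) = \gamma_{i,1} \left( (z_i[k] + a_i[k])^2 - z_i[k]^2 \right) = \gamma_{i,1} a_i[k] \left( 2 z_i[k] + a_i[k] \right)$. Since the values of $\gamma_{i,1}$ are small ($i = 1, \dots, m$), $g_i(z[k]) \doteq g_i(\bar{z}[k])$. Our experiments support this approximation. Then, $\tilde{W}_i$ in (17) can be updated as
(20)
$\tilde{W}_i \doteq \bar{W}_i - \begin{bmatrix} 0 \\ g_i(\bar{z}[0]) \\ g_i(\bar{z}[1]) + \gamma_{i,l} g_i(\bar{z}[0]) \\ \vdots \\ g_i(\bar{z}[T-1]) + \gamma_{i,l} g_i(\bar{z}[T-2]) + \cdots \end{bmatrix}.$
We can further take the QR decomposition of $\Gamma_i \in \mathbb{R}^T$ [16]:
(21) $\Gamma_i = S \begin{bmatrix} R_{i1} \\ 0 \end{bmatrix} = \begin{bmatrix} S_{i1} & S_{i2} \end{bmatrix} \begin{bmatrix} R_{i1} \\ 0 \end{bmatrix},$
where $S \in \mathbb{R}^{T \times T}$, $S_{i1} \in \mathbb{R}^{T}$, $S_{i2} \in \mathbb{R}^{T \times (T-1)}$, $R_{i1} \in \mathbb{R}^{1}$, and $[S_{i1} \ S_{i2}]$ is orthogonal. Pre-multiplying (17) by $[S_{i1} \ S_{i2}]^T$, we obtain
(22) $\begin{bmatrix} S_{i1}^T \\ S_{i2}^T \end{bmatrix} \tilde{W}_i = \begin{bmatrix} R_{i1} \\ 0 \end{bmatrix} z_i[0] + \begin{bmatrix} S_{i1}^T \\ S_{i2}^T \end{bmatrix} \Psi_i E_i.$
The second block row no longer depends on the unknown initial measurement $z_i[0]$, so we can use it to obtain the sparse solution $E$, instead of $A$:
(23) $S_{i2}^T \tilde{W}_i = S_{i2}^T \Psi_i E_i, \quad i \in P.$
Hence, the problem is reduced to reconstructing a sparse vector $E_i$ from the observations $S_{i2}^T \tilde{W}_i$. Problem (14) is equivalent to the following problem:
(24) $\min_{E_i} \ \| E_i \|_{\ell_0} \quad \text{subject to} \quad S_{i2}^T \Psi_i E_i = S_{i2}^T \tilde{W}_i,$
where $E_i \in \mathbb{R}^T$. Solving problem (24) is in general NP-hard, since it requires a search over all subsets of columns of $S_{i2}^T \Psi_i$, a procedure with exponential complexity. To overcome this difficulty, a frequently discussed approach considers a similar program in the $\ell_1$-norm:
(25) $\min_{E_i} \ \| E_i \|_{\ell_1} \quad \text{subject to} \quad S_{i2}^T \Psi_i E_i = S_{i2}^T \tilde{W}_i.$
This relaxation is common and can be found in [13, 17, 18]. Throughout this paper, we use the Douglas-Rachford splitting algorithm [13] to solve the above $\ell_1$-minimization.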
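Steps (17) to (25) can be sketched in a few lines of NumPy. The sketch is illustrative only: it simulates $\tilde{W}_i$ directly from (17) with made-up values for $\gamma_{i,l}$, $z_i[0]$, and $E_i$, and it uses a textbook Douglas-Rachford iteration for basis pursuit (projection onto the affine constraint set combined with soft thresholding) with an assumed step size `t`; the function names are hypothetical and the implementation details in [13] may differ.

```python
import numpy as np

def build_gamma_psi(gamma_lin, T):
    """Gamma_i and the lower-triangular Psi_i of (18) for a scalar gamma_{i,l}."""
    idx = np.arange(T)
    Gamma = gamma_lin ** idx
    diff = idx[:, None] - idx[None, :]
    Psi = np.where(diff >= 0, gamma_lin ** np.clip(diff, 0, None), 0.0)
    return Gamma, Psi

def annihilate_initial_state(Gamma, Psi, W_tilde):
    """Use the QR factors of Gamma, as in (21)-(23), to remove z_i[0]."""
    Q, _ = np.linalg.qr(Gamma.reshape(-1, 1), mode="complete")
    S2 = Q[:, 1:]                        # T x (T-1) block orthogonal to Gamma
    return S2.T @ Psi, S2.T @ W_tilde

def douglas_rachford_bp(M, b, t=1.0, iters=500):
    """Douglas-Rachford iteration for min ||x||_1 subject to M x = b."""
    M_pinv = np.linalg.pinv(M)
    project = lambda v: v - M_pinv @ (M @ v - b)            # onto {x : M x = b}
    shrink = lambda v: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    y = np.zeros(M.shape[1])
    for _ in range(iters):
        x = project(y)
        y = y + shrink(2.0 * x - y) - x
    return project(y)

# Made-up example: T = 50 samples, three injected errors at k >= 1.
T, gamma_lin, z0 = 50, 1.02, 3.0
E_true = np.zeros(T)
E_true[[7, 20, 33]] = [0.5, -1.2, 0.8]

Gamma, Psi = build_gamma_psi(gamma_lin, T)
W_tilde = Gamma * z0 + Psi @ E_true      # right-hand side of (17)

M, b = annihilate_initial_state(Gamma, Psi, W_tilde)
E_hat = douglas_rachford_bp(M, b)
print(np.flatnonzero(np.abs(E_hat) > 1e-6))
```

Because the first column of $\Psi_i$ equals $\Gamma_i$, an error injected at $k = 0$ cannot be separated from $z_i[0]$ in this construction, which is why the example places all errors at $k \ge 1$.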
3.4. Theoretical Guarantee
In this paper, we are also interested in studying the theoretical conditions under which obtaining the solution of the problem is guaranteed. It is well known that an inverse problem of finding the solution to the compressive sensing problem involves mathematical questions on the existence, uniqueness, and stability of the solution. On the other hand, the equivalence of the solution between (13) and (25) is not very clear and proof may be needed. We therefore consider two questions for a given Si2TΨi and signal Si2TW~i(i∈P): (i) uniqueness: under which conditions a possible sparsest solution is necessarily unique to problem (13)/(25)? and (ii) equivalence: under which conditions a sparse solution to problem (13) is also equivalent to the solution of problem (25)?
3.4.1. Uniqueness
As is described in Section 3.3, solving problem (24) requires exhaustive searches over all subsets of columns of Si2TΨi. Actually, it is a combinatorial procedure in nature and has exponential complexity. Inspired by [7, 17], Theorem 3 provides a sufficient condition for a unique solution to problem (24). It guarantees obtaining a unique sparse vector (i.e., E) from the corrupted observations (i.e., Z¯) for the l0 minimization. We denote by rowi∈P(E)∈RT the rows of the matrix E. Before giving the theorem, we need to first introduce the following definition [17].
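Definition 2.
Let $S_{i2}^T \Psi_i$ be the matrix with the finite collection of vectors $\mathrm{col}_k(S_{i2}^T \Psi_i)$, $k \in I$, as columns. For every integer $1 \le \nu \le |I|$, we define the $\nu$-restricted isometry constant $\rho_\nu$ to be the smallest quantity such that $S_{i2}^T \Psi_i$ obeys
(26) $(1 - \rho_\nu) \| E_i \|_{\ell_2}^2 \le \| S_{i2}^T \Psi_i E_i \|_{\ell_2}^2 \le (1 + \rho_\nu) \| E_i \|_{\ell_2}^2,$
for all real coefficient vectors $E_i$, $i \in P$, with at most $\nu$ nonzero entries.
The number $\rho_\nu$ measures how close the vectors $\mathrm{row}_i(S_{i2}^T \Psi_i)$ are to behaving like an orthonormal system. In particular, for $\nu = 1$, we have
(27) $1 - \rho_1 \le \| \mathrm{row}_i(S_{i2}^T \Psi_i) \|_{\ell_2}^2 \le 1 + \rho_1, \quad \forall i \in P.$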
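For very small matrices, a restricted isometry constant can be computed by brute force, which may help build intuition for (26). The sketch below assumes the squared-norm form of the definition used in [17] and enumerates every support of size $\nu$; the function name and the 2×2 example matrix are made up, and the enumeration is exponential in the number of columns, so this is not a practical test for the matrices used in the experiments.

```python
import itertools
import numpy as np

def restricted_isometry_constant(M, nu):
    """Smallest rho with (1-rho)||c||^2 <= ||M_S c||^2 <= (1+rho)||c||^2
    for every column subset S of size nu (brute-force enumeration)."""
    n_cols = M.shape[1]
    rho = 0.0
    for support in itertools.combinations(range(n_cols), nu):
        sub = M[:, list(support)]
        eigvals = np.linalg.eigvalsh(sub.T @ sub)   # ascending eigenvalues
        rho = max(rho, abs(eigvals[-1] - 1.0), abs(1.0 - eigvals[0]))
    return rho

# Toy matrix with unit-norm columns whose inner product is 0.6.
M_toy = np.array([[1.0, 0.6],
                  [0.0, 0.8]])

rho_1 = restricted_isometry_constant(M_toy, 1)
rho_2 = restricted_isometry_constant(M_toy, 2)
print(rho_1, rho_2)
```

Here the columns have unit norm, so the constant for $\nu = 1$ vanishes, while the correlation between the two columns shows up in the constant for $\nu = 2$.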
To see the relevance of ρν to the error recovery problem, we consider the following theorem.
Theorem 3.
In a cyber-physical system, let $S_{i2}$, $\tilde{W}_i$, $\Psi_i$, $\nu$, $\kappa$, and $I$ be specified as above. A sparse solution $E$ can be uniquely recovered by solving the optimization problem (13) if $\rho_{2\nu} < 1$ and $\| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa$ for all $k \in I$.
Proof.
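We first prove that if $\rho_{2\nu} < 1$, problem (24) has a unique solution $E_i$. Suppose for the sake of contradiction that the solution is not unique; then there exist two solutions $E_{\mathrm{opt}1} \ne E_{\mathrm{opt}2}$. Thus, there exists at least one index $i$ ($1 \le i \le m$) such that
(28) $S_{i2}^T \Psi_i \, \mathrm{row}_i(E_{\mathrm{opt}1}) = S_{i2}^T \tilde{W}_i, \qquad S_{i2}^T \Psi_i \, \mathrm{row}_i(E_{\mathrm{opt}2}) = S_{i2}^T \tilde{W}_i,$
where $\| \mathrm{row}_i(E_{\mathrm{opt}1}) \|_{\ell_0} = \| \mathrm{row}_i(E_{\mathrm{opt}2}) \|_{\ell_0} = \nu$. Then we have
(29) $S_{i2}^T \Psi_i \left( \mathrm{row}_i(E_{\mathrm{opt}1}) - \mathrm{row}_i(E_{\mathrm{opt}2}) \right) = 0.$
By construction, $\mathrm{row}_i(E_{\mathrm{opt}1}) - \mathrm{row}_i(E_{\mathrm{opt}2})$ has at most $2\nu$ nonzero entries. Applying (26) and the hypothesis $\rho_{2\nu} < 1$, we conclude that $\| \mathrm{row}_i(E_{\mathrm{opt}1}) - \mathrm{row}_i(E_{\mathrm{opt}2}) \|_{\ell_2} = 0$, contradicting the hypothesis that $\mathrm{row}_i(E_{\mathrm{opt}1})$ and $\mathrm{row}_i(E_{\mathrm{opt}2})$ are distinct.
Then we prove that $E$ is the unique solution to problem (13). Since $E_i$, or equivalently $\mathrm{row}_i(E)$, can be uniquely obtained by solving problem (24) and $E = [E_1; \dots; E_m]$, we conclude that $E$ is the unique solution to the following problem:
(30) $\min_{E} \ \| E \|_{\ell_0} \quad \text{subject to} \quad \bar{W} = W + E.$
Given the condition that $\| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa$ for $k \in I$, we conclude that $E$ is also the unique solution to problem (13).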
In the literature, considerable effort has been devoted to determining how sparse the error must be for equivalence to hold. As we use l1-minimization instead of l0-minimization to obtain the desired error, the conditions in the above theorem may not be guaranteed. Thus, Theorem 4 gives a general condition that guarantees a unique solution Ei for the l1-minimization problem.
Theorem 4.
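In a cyber-physical system, let $S_{i2}$, $\tilde{W}_i$, and $\Psi_i$ be specified as above. A sparse solution $E$ can be uniquely recovered by solving the optimization problem
(31) $\min_{E} \ \| E \|_{\ell_1} \quad \text{subject to} \quad \bar{W} = W + E, \quad \| \mathrm{col}_k(E) \|_{\ell_1} \le \kappa, \ k \in I,$
if, for all $E^* \ne E$, we have $\| (E - E^*)_J \|_{\ell_1} - \| (E - E^*)_{\bar{J}} \|_{\ell_1} < 0$ and $\| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa$ for $k \in I$, where $J$ and $\bar{J}$ are the supports of the vectors $E$ and $E^* - E$, respectively.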
Proof.
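We prove that, given any $E_{\mathrm{opt}1} \ne E_{\mathrm{opt}2}$ with $\| (E_{\mathrm{opt}2} - E_{\mathrm{opt}1})_J \|_{\ell_1} - \| (E_{\mathrm{opt}2} - E_{\mathrm{opt}1})_{\bar{J}} \|_{\ell_1} < 0$ and $\| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa$ ($k \in I$), we can always uniquely recover $E^*$ from (31). Suppose for the sake of contradiction that the solution is not unique; then there exist two distinct solutions $E_{\mathrm{opt}1} \ne E_{\mathrm{opt}2}$ with $\| E_{\mathrm{opt}1} \|_{\ell_1} = \| E_{\mathrm{opt}2} \|_{\ell_1}$. We use the vectors $A_{\mathrm{opt}1} = (\mathrm{row}_1(E_{\mathrm{opt}1}); \dots; \mathrm{row}_m(E_{\mathrm{opt}1})) \in \mathbb{R}^{mT}$ and $A_{\mathrm{opt}2} = (\mathrm{row}_1(E_{\mathrm{opt}2}); \dots; \mathrm{row}_m(E_{\mathrm{opt}2})) \in \mathbb{R}^{mT}$ instead of $E_{\mathrm{opt}1}$ and $E_{\mathrm{opt}2}$, respectively. Then
(32)
$\| A_{\mathrm{opt}1} \|_{\ell_1} = \| \bar{Z} - Z_{\mathrm{opt}1} \|_{\ell_1} = \| Z_{\mathrm{opt}2} + A_{\mathrm{opt}2} - Z_{\mathrm{opt}1} \|_{\ell_1}$
$= \| (A_{\mathrm{opt}2} - Z_{\mathrm{opt}1} + Z_{\mathrm{opt}2})_J \|_{\ell_1} + \| (Z_{\mathrm{opt}1} - Z_{\mathrm{opt}2})_{\bar{J}} \|_{\ell_1}$
$\ge \| (A_{\mathrm{opt}2})_J \|_{\ell_1} - \| (Z_{\mathrm{opt}1} - Z_{\mathrm{opt}2})_J \|_{\ell_1} + \| (Z_{\mathrm{opt}1} - Z_{\mathrm{opt}2})_{\bar{J}} \|_{\ell_1}$
$= \| (A_{\mathrm{opt}2})_J \|_{\ell_1} - \| (A_{\mathrm{opt}2} - A_{\mathrm{opt}1})_J \|_{\ell_1} + \| (A_{\mathrm{opt}2} - A_{\mathrm{opt}1})_{\bar{J}} \|_{\ell_1}$
$> \| (A_{\mathrm{opt}2})_J \|_{\ell_1} = \| A_{\mathrm{opt}2} \|_{\ell_1},$
contradicting the assumption that $\| A_{\mathrm{opt}1} \|_{\ell_1} = \| A_{\mathrm{opt}2} \|_{\ell_1}$. Therefore, we conclude that $\mathrm{row}_i(E)$ is the unique solution to problem (25). Equivalently, $E$ is the unique solution to the following problem:
(33) $\min_{E} \ \| E \|_{\ell_1} \quad \text{subject to} \quad \bar{W} = W + E.$
Furthermore, given the condition that $\| \mathrm{col}_k(E) \|_{\ell_0} \le \kappa$ for $k \in I$, we conclude that $E$ is the unique solution to problem (31).
In conclusion, Theorems 3 and 4 show that the sparse error can be uniquely recovered provided that their hypotheses hold. If these assumptions do not hold, uniqueness of the solutions to (13) and (31) is no longer guaranteed.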
3.4.2. Equivalence
Next, we will discuss the conditions under which it is theoretically possible to use l1-minimization to obtain the sparse solution E (or A) instead of l0-minimization. We derive an algorithm for precisely verifying l0-l1 equivalence. We can use the following definition and proposition proposed in [19].
Definition 5.
We define $SK_d(B_1)$ as the collection of all $d$-dimensional faces of the $\ell_1$-ball $B_1$:
(34) $SK_d(B_1) \doteq \{ \mu \in \mathbb{R}^{mT} : \| \mu \|_{\ell_1} = 1, \ \| \mu \|_{\ell_0} \le d + 1 \},$
where $B_1 \doteq \{ \mu \in \mathbb{R}^{mT} : \| \mu \|_{\ell_1} \le 1 \}$.
Proposition 6 (see [19, Proposition 3]).
In a cyber-physical system, let $S_{i2}$, $\tilde{W}_i$, and $\Psi_i$ be specified as above. For every $E_i \in \mathbb{R}^T$ and $S_{i2}^T \tilde{W} \in \mathbb{R}^{T-1}$, the following implication holds:
(35) $\| S_{i2}^T \tilde{W} - S_{i2}^T \Psi_i E_i^* \|_{\ell_0} \le \tfrac{1}{2} C_i \ \Longrightarrow \ E_i^* = \arg\min_{E_i} \| S_{i2}^T \tilde{W} - S_{i2}^T \Psi_i E_i \|_{\ell_1},$
if and only if, $\forall \mu \in SK_{C_i - 1}(B_1)$ and $\forall S^T \tilde{W} \in \mathbb{R}^T \setminus \{0\}$, $\| \mu + S_{i2}^T \Psi_i S^T \tilde{W} \|_{\ell_1} > 1$, where $C_i$ is the number of linearly independent columns of $S_{i2}^T \Psi_i$.
Proof.
See Proposition 3 in [19].
Note that implication (35) is the condition that we want to verify. As we need to deal with high-dimensional matrices (e.g., E∈Rm×T), we need to give asymptotic guarantees of equivalence, which is described in Proposition 6. In our experiments, it is confirmed that we can benefit from this equivalence, even when the matrices are in high dimensions.
4. Experimental Results
4.1. Case Study: Power Network
We use a real-world power grid as the test system. A state-space control model in a smart grid consists of buses connected by transmission lines. We use the IEEE 14-bus system as the test system [20]. Moreover, we use the real load data for the year 2016 from the New York Independent System Operator (NYISO). The NYISO load data cover 11 regions (namely, A–K). Similar to [12], the following procedure is used to estimate the 5-minute system state (x) using the load pattern from NYISO.
(i) Link each load bus of the IEEE 14-bus system to one region of NYISO using the following matrix:
(36) $\begin{bmatrix} 2 & 3 & 4 & 5 & 6 & 9 & 10 & 11 & 12 & 13 & 14 \\ F & C & I & B & G & K & E & H & J & D & A \end{bmatrix}.$
The first row of the matrix is the bus number in the IEEE 14-bus system and the second row is the corresponding NYISO region index.
(ii) Normalize the load data collected from NYISO to the initial real and reactive load of the corresponding bus of the IEEE 14-bus system. Due to the lack of reactive load information in the NYISO database, we use the direct current (DC) power flow model to estimate system states. This condition can be relaxed when reactive load data are available.
(iii) Add the normalized load data to the IEEE 14-bus system.
(iv) Estimate the system state (x^) from the solution of the power flow analysis for benchmarking purposes. In this paper, we apply the Newton-Raphson algorithm to estimate x^.
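The first two steps of the procedure above can be sketched as follows. The bus-to-zone mapping is taken from (36); everything else is an assumption made for illustration: the paper does not specify the normalization formula, so the sketch scales each zone profile so that its mean equals the base load of the linked bus, and the zone profile and base load values are placeholders rather than NYISO or IEEE 14-bus data.

```python
# Mapping (36): IEEE 14-bus load bus -> NYISO zone.
BUS_TO_ZONE = {2: "F", 3: "C", 4: "I", 5: "B", 6: "G", 9: "K",
               10: "E", 11: "H", 12: "J", 13: "D", 14: "A"}

def normalize_profile(zone_load_mw, bus_base_mw):
    """Scale a zone load profile so that its mean equals the bus base load.

    This scaling rule is an assumption; other normalizations are possible.
    """
    mean_load = sum(zone_load_mw) / len(zone_load_mw)
    return [bus_base_mw * value / mean_load for value in zone_load_mw]

# Placeholder 5-minute load samples for zone F (MW) and a placeholder base load.
zone_f_profile = [1000.0, 1100.0, 900.0]
bus_2_base_mw = 21.7

bus_2_profile = normalize_profile(zone_f_profile, bus_2_base_mw)
print(BUS_TO_ZONE[2], bus_2_profile)
```

The normalized profile would then be written into the case file at each operating point before running the power flow, which is outside the scope of this sketch.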
Similar to [12], we estimate T operating points of the system state (x) by adding the normalized 5-minute load data to the MATPOWER IEEE 14-bus case file [21]. In this paper, we use one day of NYISO data as the testing set, which gives 288 operating points per day. We therefore set T=288 to construct the F-DDIA method. We then prepare the attacked samples as follows. We let the parameter κ range from 1 to m=54 in the IEEE 14-bus test system. For each κ, we select κ specific meters and attempt the attack construction (a=Hc) with a randomly injected error c. In total, at most 6564 labeled samples are prepared, comprising 6017 attack samples and 547 initial samples (without attacks).
4.2. Parameters in Load Fitting
According to Section 2.3, γ in (6) is the parameter of the measurement (load) dynamical model for the power grid system. We estimate γ by polynomial regression using data traces of z¯[k+1]-z¯[k]. The historical load data from NYISO and the attack samples prepared in the previous section are used to construct the matrix γ^ (i.e., polynomial regression of order l) in (6). The measurement dynamics at each time k are estimated from the data of the 24 hours prior to that time. For example, to estimate the load dynamics at 0:05 am on Jun 30 in Zone F, the load data samples (which may contain attacks) from 0:05 am, Jun 29, to 0:00 am, Jun 30, are used.
An important question is which regression order l is appropriate for fitting the dynamics of the system. The experimental results show that l=2 is a suitable regression order. Since increasing l improves the load fitting accuracy only at the cost of computation time, we use l=2 in the rest of our experiments. Table 1 gives the regression results for predicting the dynamical model using the load data on Jun 30, 2016.
Table 1: Regression coefficients for the prediction at 11:55 pm, Jun 30, 2016.

| Zone | γ^1 | γ^2 | γ^3 |
| --- | --- | --- | --- |
| Zone A | -7.12×10^-6 | 1.02 | -14.62 |
| Zone B | -2.66×10^-6 | 1.01 | -2.47 |
| Zone C | -1.63×10^-5 | 1.05 | -41.82 |
| Zone D | -1.5×10^-3 | 2.39 | -314.79 |
| Zone E | -2.14×10^-5 | 1.03 | -12.99 |
| Zone F | -8.72×10^-6 | 1.02 | -16.77 |
| Zone G | -4.03×10^-6 | 1.01 | -3.64 |
| Zone H | -6.22×10^-5 | 1.04 | -4.19 |
| Zone I | -2.67×10^-5 | 1.04 | -13.75 |
| Zone J | -1.76×10^-6 | 1.02 | -474.37 |
| Zone K | -1.27×10^-6 | 1.01 | -18.61 |
Taking Zone F as an example, Figure 1 shows a quadratic polynomial fit of the load in Zone F with 95% confidence bounds (the 95% interval indicates that a new observation has a 95% chance of falling within the bounds). We use hourly data to fit the model; the blue “+” markers represent the actual hourly load, and the green curve is the fitted model.
Figure 1: The quadratic polynomial fit of the load data in Zone F with 95% confidence bounds on Jun 30, 2016, when l=2.
4.3. Performance Metrics
Once $A$ has been calculated by our detector, we apply the following rule to identify whether the system is attacked:
(37) $D_i[k] = \begin{cases} 1, & |A_{i+11k}| \ge \sigma_{ob} \times |\bar{Z}_{i+11k}|, \\ 0, & \text{otherwise}, \end{cases}$
where $\sigma_{ob}$ is the observation threshold used when detecting data injection attacks. The parameter $\sigma_{ob}$ is discussed later in this section. We have $D_i[k] = 1$ when $\bar{z}_i[k]$ is identified as attacked. Then, we identify whether $\bar{z}[k]$ is attacked by aggregating the values of $D_i[k]$ ($i \in P$). We predict $\bar{z}[k]$ as attacked (denoted by $\mathrm{Label}[k] = 1$) if the sum of the $D_i[k]$ is larger than a user-defined threshold $N_a$, and as secure (denoted by $\mathrm{Label}[k] = 0$) otherwise:
(38) $\mathrm{Label}[k] = \begin{cases} 1, & \sum_{i=1}^{m} D_i[k] > N_a, \\ 0, & \text{otherwise}. \end{cases}$
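Rules (37) and (38) can be sketched as follows. The sketch stores the recovered errors and the observations as $m \times T$ arrays instead of the flattened indexing used in (37), compares magnitudes, and uses made-up values for `A_hat`, `Z_bar`, `sigma_ob`, and `N_a`.

```python
import numpy as np

def label_samples(A_hat, Z_bar, sigma_ob, N_a):
    """Apply the per-measurement rule (37) and the aggregation rule (38).

    A_hat, Z_bar : arrays of shape (m, T), recovered errors and observations
    Returns (D, labels), where D has shape (m, T) and labels has shape (T,).
    """
    D = (np.abs(A_hat) >= sigma_ob * np.abs(Z_bar)).astype(int)
    labels = (D.sum(axis=0) > N_a).astype(int)
    return D, labels

# Toy example with m = 3 measurements and T = 2 time instants.
A_hat = np.array([[0.00, 0.50],
                  [0.01, 0.30],
                  [0.00, 0.00]])
Z_bar = np.ones((3, 2))

D, labels = label_samples(A_hat, Z_bar, sigma_ob=0.05, N_a=1)
print(D)
print(labels)
```

The small recovered error 0.01 at the first time instant falls below the relative threshold and is therefore not counted, which is the mechanism by which a larger `sigma_ob` suppresses false alarms.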
In smart grid networks, the major concern is not only the correct detection of attack cases but also the correct identification of secure cases. In other words, after applying rule (38), both classes need high precision and recall in order to avoid false alarms. Therefore, we use the precision and recall metrics, which are commonly used for classification tasks [10]. Specifically, as Table 2 defines, we denote by CA the number of attacked samples identified as attacked, by WA the number of secure samples identified as attacked, by CS the number of secure samples identified as secure, and by WS the number of attacked samples identified as secure.
Table 2: Notation for defining the evaluation metrics.

| | Attacked | Secure |
| --- | --- | --- |
| Classified as attacked | CA | WA |
| Classified as secure | WS | CS |
The performance of the proposed detector is measured by the precision and recall metrics:
(39) $P_a = \dfrac{CA}{CA + WA}, \quad R_a = \dfrac{CA}{CA + WS}, \quad P_s = \dfrac{CS}{CS + WS}, \quad R_s = \dfrac{CS}{WA + CS},$
where $P_a$ ($P_s$) and $R_a$ ($R_s$) indicate the precision and recall values for the attacked (secure) class, respectively. Precision measures how many of the samples assigned to a class actually belong to it, and recall measures how many of the samples of a class are retrieved.
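The four metrics in (39) follow directly from the counts in Table 2. The sketch below uses made-up counts and a hypothetical function name; it does not guard against empty classes.

```python
def precision_recall(CA, WA, WS, CS):
    """Precision and recall for the attacked and secure classes, as in (39)."""
    return {
        "P_a": CA / (CA + WA),   # precision, attacked class
        "R_a": CA / (CA + WS),   # recall, attacked class
        "P_s": CS / (CS + WS),   # precision, secure class
        "R_s": CS / (WA + CS),   # recall, secure class
    }

# Made-up confusion counts for illustration.
metrics = precision_recall(CA=8, WA=2, WS=2, CS=88)
print(metrics)
```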
4.4. Performance on Detecting Attacks
We first analyze the performance of the proposed algorithm against a set of false data injection attacks with κ=1. In the experiments, we observe that the choice of the threshold parameter σob affects the precision and recall performance. Table 3 compares different σob values. Pa and Rs increase as σob increases and remain at 100% when σob≥0.04. In contrast, Ra and Ps generally decrease as σob increases. Note that the precision for the attacked class at σob=0.01 is only 7.14% and the recall for the attacked class is lower than 50% when σob>0.06. Thus, the optimal σob value should lie in the range [0.02,0.06]. Because the performance at σob=0.05 is quite similar to that at σob=0.06, we omit σob=0.06 from Figures 2, 3, 4, and 5 to keep them readable.
Table 3: Performance of the proposed detector against multiperiod attacks for the IEEE 14-bus system, κ=1.

| σob | Pa | Ra | Ps | Rs |
| --- | --- | --- | --- | --- |
| 0.01 | 7.14% | 100% | 100% | 95.47% |
| 0.02 | 17.33% | 83.87% | 99.94% | 98.61% |
| 0.03 | 44.64% | 80.65% | 99.93% | 99.65% |
| 0.04 | 100% | 70.97% | 99.90% | 100% |
| 0.05 | 100% | 64.52% | 99.88% | 100% |
| 0.06 | 100% | 63.64% | 99.87% | 100% |
| 0.07 | 100% | 41.67% | 99.80% | 100% |
| 0.08 | 100% | 45.45% | 99.81% | 100% |
| 0.09 | 100% | 41.67% | 99.80% | 100% |
| 0.10 | 100% | 38.10% | 99.78% | 100% |
| 0.20 | 100% | 9.10% | 99.68% | 100% |
Figure 2: Precision of attacked samples for the IEEE 14-bus system (curves for σob=0.02, 0.03, 0.04, and 0.05).
Figure 3: Recall of attacked samples for the IEEE 14-bus system (curves for σob=0.02, 0.03, 0.04, and 0.05).
Figure 4: Precision of secure samples for the IEEE 14-bus system (curves for σob=0.02, 0.03, 0.04, and 0.05).
Figure 5: Recall of secure samples for the IEEE 14-bus system (curves for σob=0.02, 0.03, 0.04, and 0.05).
The performance of different σob values for identifying attacked samples is compared in Figures 2 and 3, where κ/m∈[0,1]. We observe that Pa increases and Ra decreases when σob increases. The precision value of attacked class is approximately 100% when σob=0.04 and σob=0.05. The recall value of the attacked class increases with rising κ/m values and is approximately 100% when κ/m is larger than 54.55%. Although the proposed algorithm at σob=0.02 and σob=0.03 may correctly detect the attacked samples as κ/m increases, the secure variables are incorrectly labeled as attacked and therefore give more false alarms.
Meanwhile, the performance of identifying secure samples is compared in Figures 4 and 5. Both values (precision and recall) of the secure class are high (i.e., near 100%). Summing up, the above experimental results show that if we choose the parameter σob∈[0.04,0.06], our methodology can efficiently detect the data injection attacks.
4.5. Performance on Recovering System States
In this part, we compare our detector with the residual-based approach in terms of recovering the initial system states. We first introduce how we evaluate the performance of an algorithm. In the IEEE 14-bus system, the state vector $x$ has 14 bus voltage magnitudes and 13 phase angles, where the phase angle of one bus is taken as the reference. If the system is observable [14], the state vector can be represented as $x = [V_1, V_2, \dots, V_{14}, \theta_2, \theta_3, \dots, \theta_{14}]^T$, where $V_i$ and $\theta_i$ are the voltage magnitude and voltage angle at bus $i$. The average absolute phase error for bus $i$, denoted by $\zeta_{\theta_i}[k]$, is then defined as
(40) $\zeta_{\theta_i}[k] = \dfrac{1}{\vartheta} \sum_{j=1}^{\vartheta} \zeta_{\theta_i}^{j}[k] = \dfrac{1}{\vartheta} \sum_{j=1}^{\vartheta} \dfrac{\left| \hat{\theta}_i^{j}[k] - \theta_i[k] \right|}{\left| \theta_i[k] \right|},$
where
$\vartheta$ is the number of testing samples;
$\zeta_{\theta_i}^{j}[k]$ is the absolute phase error of the $i$th bus at time $k$ under the $j$th attack in the testing samples;
$\hat{\theta}_i^{j}[k]$ is the recovered phase angle of the $i$th bus at time $k$ under the $j$th attack in the testing samples;
$\theta_i[k]$ is the true phase angle of the $i$th bus at time $k$.
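The error measure (40) for a single bus and time instant can be sketched as follows. The sketch reads (40) as a relative absolute error averaged over the testing samples; the function name and the angle values are made up.

```python
def average_phase_error(theta_hat_samples, theta_true):
    """Average absolute phase error of one bus at one time instant, as in (40).

    theta_hat_samples : recovered phase angles, one per attack sample j
    theta_true        : true phase angle of the bus (must be nonzero)
    """
    errors = [abs(theta_hat - theta_true) / abs(theta_true)
              for theta_hat in theta_hat_samples]
    return sum(errors) / len(errors)

# Made-up recovered angles (radians) for three attack samples.
zeta = average_phase_error([-0.11, -0.09, -0.10], theta_true=-0.10)
print(zeta)
```

The division by the true angle means the measure is undefined for the reference bus, whose angle is zero, which is consistent with reporting errors only for buses 2 to 14.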
The proposed algorithm and the residual-based algorithm have been tested under various attack scenarios (i.e., κ=1,2,…,11). Table 4 presents the results for κ=1 and κ=11. First, our methodology outperforms the residual-based algorithm. For example, when κ=11, the average phase error of our proposed algorithm on bus 12 is 0.2886, whereas the error of the residual-based algorithm is 4.2793, roughly 15 times that of our algorithm. Second, the average phase errors for κ=1 are in general smaller than those for κ=11, which means that both algorithms perform better when κ is small. Third, for κ=11 the F-DDIA result for bus 11 (or bus 13) is quite different from that for bus 12 (or bus 14), whereas for κ=1 the results for buses 11–14 differ little; we attribute this phenomenon to the value of κ. To sum up, the causes of the differences in the average absolute phase error are complex, and the F-DDIA performance depends on several parameters (e.g., κ and σob).
Table 4: ζθi[k] for the IEEE 14-bus system when σob=0.05.

| Bus index | κ=1, residual-based | κ=1, our detector | κ=11, residual-based | κ=11, our detector |
| --- | --- | --- | --- | --- |
| 2 | 0.2388 | 0.0770 | 0.5855 | 0.1387 |
| 3 | 0.7420 | 0.3612 | 1.2131 | 0.1374 |
| 4 | 0.3274 | 0.0430 | 0.9988 | 0.2117 |
| 5 | 0.4026 | 0.1076 | 0.9927 | 0.2187 |
| 6 | 1.0687 | 0.3478 | 2.0091 | 0.7807 |
| 7 | 0.6875 | 0.0493 | 1.6335 | 0.4186 |
| 8 | 0.6875 | 0.0493 | 1.6335 | 0.4186 |
| 9 | 0.8095 | 0.0443 | 1.8286 | 0.4823 |
| 10 | 0.9589 | 0.1817 | 1.7800 | 0.7220 |
| 11 | 0.9869 | 0.2925 | 1.8136 | 1.6674 |
| 12 | 1.2767 | 0.4810 | 4.2793 | 0.2886 |
| 13 | 1.1997 | 0.4977 | 2.1074 | 1.1148 |
| 14 | 1.0374 | 0.2894 | 1.6876 | 0.3817 |
4.6. Comparison on Execution Time
In our experiments, we find that the proposed approach is faster than the residual-based detector. The residual-based fault detector takes around 50 min (0.043 s per sample), while the proposed approach takes only 12 min (0.011 s per sample). The 12 min of our approach includes the load dynamics fitting and the Douglas-Rachford iterations. The main computational burden of our approach lies in running the Douglas-Rachford iterations for the basis pursuit problem. Unlike the residual-based detector, our approach does not need to run the state estimation process, which is why it is faster.
5. Conclusions
This paper examines the problem of detecting data injection attacks in smart grid networks. We propose a detection framework named F-DDIA, which can recover the initial system state as well as the real measurement readings. Due to the sparse nature of data injection attacks, l1-minimization techniques (including Douglas-Rachford splitting) can be applied. The proposed detection algorithm is validated using load data from NYISO. Our detector works well in both linear and nonlinear systems.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work is supported in part by National High Technology Research and Development Program of China (no. 2015AA016008).
References
[1] R. Langner, Stuxnet: dissecting a cyberwarfare weapon.
[2] Y. Liu, P. Ning, and M. K. Reiter, False data injection attacks against state estimation in electric power grids.
[3] J. Kim, L. Tong, and R. J. Thomas, Data framing attack on state estimation.
[4] J. Wang, L. C. K. Hui, S. M. Yiu, X. Cui, E. K. Wang, and J. Fang, A survey on the cyber attacks against non-linear state estimation in smart grids, Proceedings of the Australasian Conference on Information Security and Privacy (ACISP '16), Springer, Berlin, Germany, 2016.
[5] J. Wang, L. C. K. Hui, S. M. Yiu, E. K. Wang, and J. Fang, A survey on cyber attacks against nonlinear state estimation in power systems of ubiquitous cities.
[6] L. Liu, M. Esmalifalak, Q. Ding, V. A. Emesih, and Z. Han, Detecting false data injection attacks on power grid by sparse optimization.
[7] Q. Hu, D. Fooladivanda, Y. H. Chang, and C. J. Tomlin, Secure state estimation and control for cyber security of the nonlinear power systems, 2016, https://arxiv.org/abs/1603.06894.
[8] D. Han, Y. Mo, and L. Xie, Robust state estimation against sparse integrity attacks, 2016, https://arxiv.org/abs/1601.04180.
[9] T. T. Kim and H. V. Poor, Strategic protection against data injection attacks on power grids.
[10] J. Wang, W. Tu, L. C. K. Hui, S. M. Yiu, and E. K. Wang, Detecting time synchronization attacks in cyber-physical systems with machine learning techniques, Proceedings of the 37th IEEE International Conference on Distributed Computing Systems (ICDCS '17), 2017.
[11] G. Hug and J. A. Giampapa, Vulnerability assessment of AC state estimation with respect to false data injection cyber-attacks.
[12] G. Chaojun, P. Jirutitijaroen, and M. Motani, Detecting False Data Injection Attacks in AC State Estimation.
[13] L. Demanet and X. Zhang, Eventual linear convergence of the Douglas-Rachford iteration for basis pursuit.
[14] A. Abur and A. G. Exposito.
[15] J. Wang, L. C. K. Hui, and S. M. Yiu, Data framing attacks against nonlinear state estimation in smart grid, Proceedings of the IEEE Global Communications Conference Workshop (GLOBECOM '15), 2015.
[16] C. R. Goodall.
[17] E. J. Candes and T. Tao, Decoding by linear programming.
[18] J. Zhang, C. Zhao, D. Zhao, and W. Gao, Image compressive sensing recovery using adaptively learned sparsifying basis via L0 minimization.
[19] Y. Sharon, J. Wright, and Y. Ma, Computation and relaxation of conditions for equivalence between l1 and l0 minimization, UILU-ENG-07-2208, University of Illinois, Urbana-Champaign, Ill, USA, 2007.
[20] S. K. M. Kodsi and C. A. Canizares, Modeling and simulation of IEEE 14 bus system with FACTS controllers, University of Waterloo, Waterloo, Ontario, Canada, 2003.
[21] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, MATPOWER: steady-state operations, planning, and analysis tools for power systems research and education.