Lightweight Data Aggregation Scheme against Internal Attackers in Smart Grid Using Elliptic Curve Cryptography

1 State Key Lab of Software Engineering, Computer School, Wuhan University, Wuhan, China
2 Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei, China
3 College of Communication and Information, University of Kentucky, Lexington, KY, USA
4 Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, China


Introduction
By providing bidirectional flows of electricity and information, the smart grid supports real-time monitoring of power usage [1]. Based on this real-time information, providers can monitor power generation and consumption and obtain the immediate power demand of each area. They can then take prompt action to optimize the power supply. Consumers can also obtain the current power price and adjust their behavior to lower expenses. Therefore, the smart grid can achieve efficient, economical, and reliable power services. Owing to these advantages, the smart grid has attracted widespread attention from governments, industry, and academia over the last decade and is considered the most promising candidate for the next-generation power system [2].
The National Institute of Standards and Technology (NIST) presents a conceptual model of the smart grid [3]. As shown in Figure 1 [4], a smart grid consists of seven important domains, that is, the power generation (PG) domain, the power transmission (PT) domain, the power distribution (PD) domain, the power customer (PC) domain, the power operation (PO) domain, the power market (PM) domain, and the power service provider (PSP) domain [5,6]. Power is generated, transmitted, and distributed in the PG domain, the PT domain, and the PD domain, respectively, and is then consumed by the customers in the PC domain. The PO domain, the PM domain, and the PSP domain manage the power flow, the participants, and all third-party operations, respectively [7,8].
The smart meters in the smart grid collect the consumers' power consumption data and other information and send them to the remote control center. Generally speaking, a smart meter is installed outside the consumer's door, and because the communication channel is open, an attacker can easily take control of it. The attacker may maliciously modify the power consumption data to increase or decrease the consumer's power expense. He/she can also learn the daily routine of the consumer in order to commit crimes. For example, an attacker who observes that there is no power consumption can infer that the consumer is out and sneak into the house to steal valuables.
Achieving secure communications in the smart grid is therefore an issue that needs to be addressed. In particular, ensuring the integrity and confidentiality of the data is essential. Several cryptographic schemes can be applied for secure communications in the smart grid. Many key management schemes [9–11], key distribution schemes [12–14], and key agreement schemes [15–17] were presented in recent years. However, many of these schemes cannot provide integrity and confidentiality simultaneously. To address this challenge, data aggregation schemes have been proposed by several researchers and applied in the smart grid. However, most of them are vulnerable to attacks from internal attackers. Although several data aggregation schemes against internal attackers were proposed to enhance security, their computation or communication costs are too high for practical smart grid applications, in which the smart meter has very limited computation and communication capabilities. It is therefore necessary to design lightweight data aggregation schemes for practical deployment.

Our Contributions.
To reduce both computation and communication costs, we propose a lightweight data aggregation scheme based on Elliptic Curve Cryptography (ECC) [18,19], which can provide the same security level with a much shorter key size. The main contributions of our paper are as follows:
(i) First, we propose a lightweight data aggregation scheme based on Schnorr's signature scheme [18].
(ii) Second, we prove that the proposed lightweight data aggregation scheme is secure and is able to satisfy the security requirements.
(iii) Finally, we analyze the performance of the proposed lightweight data aggregation scheme to demonstrate its high performance.

Organization of the Paper.
In Section 2, we briefly review related papers about data aggregation schemes. In Section 3, we give some preliminaries, including the background of ECC, the network model, and the security requirements of the data aggregation scheme. In Section 4, we present our lightweight data aggregation scheme based on ECC. In Section 5, we describe a security model for the data aggregation scheme and present the security analyses of our scheme. In Section 6, we present the computation and communication analyses of our data aggregation scheme.

Related Works
To guarantee secure communication in open environments, many authentication schemes [20–22], encryption schemes [23–26], and secure outsourcing schemes [25,27,28] have been constructed in the last several years. Li et al. [29] and Garcia and Jacobs [30] designed two data aggregation schemes using Paillier's encryption scheme [31]. To improve performance, Lu et al. [32] designed an improved data aggregation scheme using Paillier's encryption scheme and the super-increasing sequence. However, the above three schemes [29,30,32] cannot protect consumers' privacy because none of them can provide anonymity. To protect consumers' privacy, Zhang et al. [33] designed a security-enhanced data aggregation scheme based on the Chinese Remainder Theorem and Paillier's encryption scheme. Chen et al. [34] also designed a security-enhanced data aggregation scheme with fault tolerance based on Paillier's encryption scheme.
Unfortunately, internal attacks are not considered in the above data aggregation schemes [29,30,32–34], which allows internal attackers to access the consumers' smart grid data. To address this weakness, Fan et al. [35] designed the first data aggregation scheme that can withstand attacks from internal attackers by using blinding technology. However, Bao and Lu [36] demonstrated that Fan et al.'s data aggregation scheme cannot guarantee the integrity of transmitted data. To enhance security, He et al. [4] designed an improved data aggregation scheme based on Boneh et al.'s encryption scheme [37]. The performance of Fan et al.'s data aggregation scheme [35] and He et al.'s data aggregation scheme [4] is not good enough for smart meters because both use bilinear pairing operations.

Preliminaries
3.1. Elliptic Curve. Given a prime number $p$, we say that the equation $y^2 = x^3 + a \cdot x + b \bmod p$ defines an elliptic curve $E(F_p)$, where $a, b \in F_p$ and $\Delta = 4a^3 + 27b^2 \neq 0 \bmod p$ [38]. It is well known that all points on $E(F_p)$ and the infinite point $O$ make an additive group $G$. Given a generator point $P$ with a prime order $q$, the scalar multiplication operation is defined as $k \cdot P = P + P + \cdots + P$ ($k$ times), where $k$ is a positive integer.
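The group law and the scalar multiplication defined above can be illustrated with a minimal Python sketch. The toy curve $y^2 = x^3 + 2x + 2 \bmod 17$ with generator $(5, 1)$ of prime order 19 is a small textbook example chosen here for illustration only; it is an assumption of this sketch, not a parameter of the proposed scheme, and the function names are hypothetical.

```python
# Toy parameters (illustrative assumption): y^2 = x^3 + 2x + 2 mod 17.
p, a, b = 17, 2, 2
P = (5, 1)   # generator point
q = 19       # prime order of P

def on_curve(pt):
    """Return True if pt is the point at infinity (None) or satisfies the curve equation."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + a * x + b)) % p == 0

def add(p1, p2):
    """Add two points of E(F_p); None represents the infinite point O."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p        # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def mul(k, pt):
    """Compute k * pt with the double-and-add method."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result
```

Double-and-add needs about $\log_2 k$ doublings instead of $k - 1$ additions, which is why point multiplication remains practical for 160-bit scalars.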
Previous research has shown that the following problems in the group $G$ are suitable for the design of public key cryptography because no probabilistic polynomial-time algorithm is known to solve them efficiently [38].
Discrete Logarithm (DL) Problem. Given an element $Q \in G$, the DL problem is to extract an element $x \in Z_q^*$ such that $Q = x \cdot P$.
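The security analysis in Section 5 also relies on the Computational Diffie-Hellman (CDH) problem. For reference, its standard statement in the notation of this section is given below; the symbols $Q_1$, $Q_2$, $a$, and $b$ are labels chosen for this statement.

```latex
\textbf{CDH problem.}\quad
\text{Given } Q_1 = a \cdot P \text{ and } Q_2 = b \cdot P
\text{ for unknown } a, b \in Z_q^{*},
\text{ compute } a \cdot b \cdot P .
```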

Network Model.
As shown in Figure 2 [4], there are three types of participants in the system of a data aggregation scheme, namely, a trusted third party (TTP), an aggregator (Agg), and the smart meters ($\mathrm{SM}_i$) [4,35]. The functions of these participants are described below.
(i) TTP: it is a trusted third party and its function is to generate blinding factors to withstand the internal attackers.
(ii) Agg: it is the manager of the smart grid and its function is to generate the system parameters and the private keys of smart meters.
(iii) $\mathrm{SM}_i$: it is a smart meter and its function is to collect consumers' electricity consumption data and send it to Agg.
The workflow of the system is as follows.
(1) Agg produces the system parameters and the master private key.
(2) $\mathrm{SM}_i$ registers with Agg and gets its private key.
(3) TTP generates the blinding factors for Agg and $\mathrm{SM}_i$.
(4) $\mathrm{SM}_i$ collects the electricity consumption data, produces a ciphertext, and sends it to Agg.
(5) After collecting all ciphertexts, Agg checks their validity and extracts the sum of all electricity consumption data.
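The role of the blinding factors in steps (3) to (5) can be sketched in a few lines of Python. This is a simplified illustration, not the scheme itself: it assumes the common construction in which the TTP picks factors $\pi_1, \ldots, \pi_n$ for the meters and $\pi_0$ for Agg with $\pi_0 + \sum \pi_i \equiv 0 \bmod q$, and it omits the encryption and signature layers entirely. The function names and the toy modulus are hypothetical.

```python
import secrets

def generate_blinding_factors(n, q):
    """TTP side: return (pi_0, [pi_1, ..., pi_n]) with pi_0 + sum(pi_i) = 0 mod q."""
    meter_factors = [secrets.randbelow(q - 1) + 1 for _ in range(n)]
    agg_factor = (-sum(meter_factors)) % q
    return agg_factor, meter_factors

def blind_reading(m_i, pi_i, q):
    """Smart meter side: hide the individual reading m_i behind pi_i."""
    return (m_i + pi_i) % q

def aggregate(blinded_readings, pi_0, q):
    """Agg side: the blinding factors cancel only in the sum of all readings."""
    return (sum(blinded_readings) + pi_0) % q

# Toy run with an illustrative prime modulus (assumption of this sketch).
q = 2**61 - 1
readings = [12, 7, 30, 5]
pi_0, pis = generate_blinding_factors(len(readings), q)
blinded = [blind_reading(m, pi, q) for m, pi in zip(readings, pis)]
total = aggregate(blinded, pi_0, q)
```

Because Agg holds only $\pi_0$, it can remove the blinding from the sum but not from any single reading, which is the property that protects against an internal attacker.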

Security Requirements.
Based on recent work, a data aggregation scheme for the smart grid should meet the following security requirements [4,35].
(i) Confidentiality. The consumer's power consumption data reveals his/her habits, and its leakage may be used by an attacker to commit a crime. To ensure the consumer's safety, a data aggregation scheme should provide confidentiality; that is, neither the external attackers nor the internal attackers can extract the electricity consumption data from intercepted messages.
(ii) Authentication. A malicious attacker may forge a message and impersonate the consumer. To ensure that a received message was transmitted by a legal $\mathrm{SM}_i$, a data aggregation scheme should provide authentication; that is, Agg can check the legality of the received message.
(iii) Integrity. All messages are transmitted over open communication channels, and a malicious attacker may modify them to disrupt regular transactions. To protect the rights and interests of all participants in the smart grid, a data aggregation scheme should provide integrity; that is, Agg can detect any modification of the received data.
(iv) Resistance against Attacks. Due to the openness of communication channels in the smart grid, the system is vulnerable to many types of attacks. To obtain secure communications in the smart grid, a data aggregation scheme should resist these attacks; that is, it can withstand the replay attack, the modification attack, the man-in-the-middle attack, and the impersonation attack.

The Proposed Data Scheme
We describe our proposed lightweight data aggregation scheme, which consists of three phases, namely, the initialization phase, the registration phase, and the aggregation phase.
Initialization Phase. In this phase, Agg executes some steps to produce the system parameters. TTP and Agg execute some other steps to produce the blinding factors against internal attackers.
Agg runs the following steps to produce the system parameters.
(2) Agg selects an element $P$ of prime order $q$ on $E(F_p)$.
TTP and Agg execute the following steps to produce the blinding factors.
Registration Phase. In this phase, $\mathrm{SM}_i$ registers with Agg. After registration, $\mathrm{SM}_i$ receives its private key and becomes a legal smart meter. As demonstrated in Table 1, $\mathrm{SM}_i$ and Agg run the following processes to finish the registration.
(1) $\mathrm{SM}_i$ randomly chooses an element $x_i \in Z_q^*$, computes $X_i = x_i \cdot P$, and transmits $\{\mathrm{id}_i, \ldots\}$ …
Therefore, the correctness of the registration phase is demonstrated.
Aggregation Phase. In this phase, $\mathrm{SM}_i$ extracts the power consumption data and sends it to Agg. Agg checks the validity of the received messages and aggregates all the received data. As demonstrated in Table 1, the steps below are executed by $\mathrm{SM}_i$ and Agg.
(1) $\mathrm{SM}_i$ gets the power consumption data $m_i$, randomly chooses an element $a_i \in Z_q^*$, and computes …
According to the above equations, the correctness of the aggregation phase of our scheme is demonstrated.
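The validity check that Agg performs on received messages can be sketched as follows. This is a hedged illustration built from the verification equation used in Section 5, $\sigma_i \cdot P = P_{pub} + \alpha_i \cdot R_i + \beta_i \cdot X_i$. The key structure $y_i = s + \alpha_i \cdot r_i$ and $\sigma_i = y_i + \beta_i \cdot x_i$, the hash inputs, the random batching coefficients, and all function names are assumptions of this sketch; the encrypted value $c_i$ is treated as an opaque integer, and the toy curve of order 19 offers no real security.

```python
import hashlib
import secrets

# Toy curve y^2 = x^3 + 2x + 2 mod 17; P = (5, 1) has prime order q = 19.
p, a = 17, 2
P, q = (5, 1), 19

def add(A, B):
    """Add two curve points; None is the point at infinity."""
    if A is None:
        return B
    if B is None:
        return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if A == B:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(k, A):
    """Double-and-add scalar multiplication; k is reduced modulo the group order."""
    k %= q
    R = None
    while k:
        if k & 1:
            R = add(R, A)
        A = add(A, A)
        k >>= 1
    return R

def h(tag, *parts):
    """Hash into Z_q^* (hypothetical stand-in for h1 and h3)."""
    digest = hashlib.sha256(repr((tag, parts)).encode()).digest()
    return int.from_bytes(digest, "big") % (q - 1) + 1

def rand_scalar():
    return secrets.randbelow(q - 1) + 1

s = rand_scalar()        # Agg's master private key
P_pub = mul(s, P)

def register(id_i):
    """Assumed registration: the meter picks x, Agg issues y = s + alpha * r."""
    x = rand_scalar(); X = mul(x, P)
    r = rand_scalar(); R = mul(r, P)
    alpha = h("h1", id_i, R, X)
    y = (s + alpha * r) % q
    assert mul(y, P) == add(P_pub, mul(alpha, R))   # the meter checks its key
    return {"id": id_i, "x": x, "X": X, "R": R, "y": y}

def hashes(m):
    alpha = h("h1", m["id"], m["R"], m["X"])
    beta = h("h3", m["id"], m["c"], m["t"], m["R"], m["X"])
    return alpha, beta

def make_message(sm, c, t):
    """Meter side: sign the (already encrypted) value c with timestamp t."""
    m = {"id": sm["id"], "X": sm["X"], "R": sm["R"], "c": c, "t": t}
    _, beta = hashes(m)
    m["sigma"] = (sm["y"] + beta * sm["x"]) % q
    return m

def verify(m):
    """Single check: sigma * P == P_pub + alpha * R + beta * X."""
    alpha, beta = hashes(m)
    rhs = add(add(P_pub, mul(alpha, m["R"])), mul(beta, m["X"]))
    return mul(m["sigma"], P) == rhs

def batch_verify(msgs):
    """Batch check with random coefficients v_i in Z_q^*."""
    lhs_scalar, v_sum, rhs = 0, 0, None
    for m in msgs:
        v = rand_scalar()
        alpha, beta = hashes(m)
        lhs_scalar += v * m["sigma"]
        v_sum += v
        rhs = add(rhs, add(mul(v * alpha, m["R"]), mul(v * beta, m["X"])))
    return mul(lhs_scalar, P) == add(mul(v_sum, P_pub), rhs)
```

Under these assumptions correctness follows directly: $\sigma_i \cdot P = (s + \alpha_i r_i + \beta_i x_i) \cdot P = P_{pub} + \alpha_i \cdot R_i + \beta_i \cdot X_i$, and the batch check is the same identity summed with weights $v_i$.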

Security Analysis
The security of the proposed lightweight data aggregation scheme is analyzed in this section. First, we present a security model for the data aggregation scheme. Second, we demonstrate that the proposed lightweight data aggregation scheme is provably secure in the security model. Finally, we demonstrate that the proposed lightweight data aggregation scheme can meet the security requirements presented in Section 3.

Security Model.
Based on security models [40] for signcryption schemes, we present a security model for data aggregation schemes. Confidentiality and unforgeability are formally defined by two games executed by an attacker A and a challenger C. A is allowed to make the following queries.
(i) $h_i(m)$: for such a query made by A, C randomly selects $h \in$ …
Setup. C produces the system parameters and sends them to A.
Query. In this phase, A picks a challenging identity $\mathrm{id}_i^*$ and is able to adaptively make $h_i$, CreateSM, CorruptSM, Signcrypt, and Designcrypt queries except that it cannot make a CorruptSM query with $\mathrm{id}_i^*$.
Forgery. In this phase, A outputs a ciphertext corresponding to the challenging identity $\mathrm{id}_i^*$. We say A wins in the above game if this ciphertext is valid and was not generated by a Signcrypt query.

Security Analysis
Theorem 3. The proposed data aggregation scheme is able to provide confidentiality if the CDH problem is hard.
Proof. Assume that an attacker A wins the game defined in Definition 1 with a nonnegligible advantage $\varepsilon$. Based on A's capability, we can construct a challenger C that solves the CDH problem with a nonnegligible advantage. Given an instance $(P, Q_1 = a \cdot P, Q_2 = b \cdot P)$ of the CDH problem, C sets $P_{pub} \leftarrow a \cdot P$ and sends the system parameters params, which include $P_{pub}$, $h_1$, $h_2$, and $h_3$, to A. C randomly picks an identity $\mathrm{id}_I$ as the challenging identity and answers queries from A according to the rules below. After that, A can make $h_i$, CreateSM, CorruptSM, and Signcrypt queries and get the corresponding responses. Then, A outputs $b'$ as his/her guess against the confidentiality. C selects a random tuple $(m, h)$ from $L_{h_2}$ and outputs $m$ as the solution of the given CDH problem.
Let $q_{h_2}$ denote the number of $h_2$ queries made in the game. The probability that C solves the given CDH problem is $\varepsilon' = \varepsilon / q_{h_2}$. Because $\varepsilon$ is nonnegligible, $\varepsilon'$ is also nonnegligible. This contradicts the hardness of the CDH problem. Thus, the proposed data aggregation scheme is able to provide confidentiality.
(v) Designcrypt: for such a query made by A on an identity $\mathrm{id}_i$ and a ciphertext, C checks the validity of the ciphertext and decrypts it to get the plaintext using the system's secret key $s$.
At last, A outputs a forged ciphertext for an identity $\mathrm{id}_i$ whose signature component is $\sigma_i$. C stops the game if $\mathrm{id}_i \neq \mathrm{id}_I$. Based on the forking lemma [41], C can output another valid ciphertext that differs only in its signature component $\sigma_i'$ by choosing a different hash function $h_1$, which yields a different hash value $\alpha_i'$. Since both ciphertexts are valid, we can derive the following two equations: $\sigma_i \cdot P = P_{pub} + \alpha_i \cdot R_i + \beta_i \cdot X_i$ and $\sigma_i' \cdot P = P_{pub} + \alpha_i' \cdot R_i + \beta_i \cdot X_i$.
Wireless Communications and Mobile Computing 7
Based on the above two equations, we can derive the equation below: $(\sigma_i - \sigma_i') \cdot P = (\alpha_i - \alpha_i') \cdot R_i$. C outputs $(\sigma_i - \sigma_i') \cdot (\alpha_i - \alpha_i')^{-1}$ as the solution of the given DL problem. To compute the probability that C solves the DL problem, three related events are listed below.
(ii) $E_2$: C is able to forge a legal ciphertext.
Let $q_{h_1}$ denote the number of $h_1$ queries made in the game. It is easy to get that $\Pr[E_1] = 1/q_{h_1}$ and $\Pr[E_2 \mid E_1] = \varepsilon$. Then, the probability $\varepsilon'$ that C solves the DL problem follows from these events. Because $\varepsilon$ is nonnegligible, $\varepsilon'$ is also nonnegligible. This contradicts the hardness of the DL problem. Thus, the proposed data aggregation scheme is able to provide unforgeability.
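The extraction step of the reduction can be checked numerically. The sketch below assumes, as an illustration only, that a valid signature has the form $\sigma = s + \alpha \cdot r + \beta \cdot x \bmod q$, which is the scalar relation matching the verification equation $\sigma \cdot P = P_{pub} + \alpha \cdot R + \beta \cdot X$. Given two signatures that share $r$, $x$, and $\beta$ but use different values $\alpha$ and $\alpha'$, the discrete logarithm $r$ of $R$ is recovered as $(\sigma - \sigma') \cdot (\alpha - \alpha')^{-1}$. All names and the toy modulus are hypothetical.

```python
# Toy prime standing in for the group order q (illustrative assumption).
q = 2**61 - 1

def sign(s, r, x, alpha, beta):
    """Assumed scalar form of a valid signature: sigma = s + alpha*r + beta*x mod q."""
    return (s + alpha * r + beta * x) % q

def extract_r(sigma, sigma_prime, alpha, alpha_prime):
    """Recover r from two signatures that differ only in the h1 value alpha."""
    diff_alpha = (alpha - alpha_prime) % q
    return (sigma - sigma_prime) * pow(diff_alpha, -1, q) % q

# Two runs of the forger with the same randomness but different h1 outputs.
s, r, x, beta = 11, 123456789, 77, 5
alpha, alpha_prime = 3, 8
sigma = sign(s, r, x, alpha, beta)
sigma_prime = sign(s, r, x, alpha_prime, beta)
recovered = extract_r(sigma, sigma_prime, alpha, alpha_prime)
```

The terms $s$ and $\beta \cdot x$ cancel in the difference, so the challenger does not need to know the meter's secret $x$ to extract $r$.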

Analysis of Security Requirements.
We will show that the proposed lightweight data aggregation scheme is able to meet the security requirements presented in Section 3.
(i) Confidentiality. The internal attacker against the proposed data aggregation scheme can compute $D = \sum_{i=1}^{n} (c_i - h_2(s \cdot A_i))$. Without the blinding factor $\pi_0$, he/she cannot extract the sum of the power consumption data by computing $m = D + \pi_0 \bmod q$. Besides, Theorem 3 shows that the proposed lightweight data aggregation scheme can supply confidentiality against any external attacker. Thus, our lightweight data aggregation scheme can supply confidentiality.
(ii) Authentication. Theorem 4 shows that no attacker can forge a legal ciphertext. Then, Agg can verify the legality of received messages by verifying whether the equation $\sigma_i \cdot P = P_{pub} + \alpha_i \cdot R_i + \beta_i \cdot X_i$ holds. Therefore, the proposed data aggregation scheme can provide authentication.
(iii) Integrity. Theorem 4 demonstrates that no attacker against the proposed data aggregation scheme can forge a legal ciphertext. Agg can detect any modification of the received data by verifying whether the equation $\left(\sum_{i=1}^{n} v_i \cdot \sigma_i\right) \cdot P = \left(\sum_{i=1}^{n} v_i\right) \cdot P_{pub} + \sum_{i=1}^{n} (v_i \cdot \alpha_i \cdot R_i + v_i \cdot \beta_i \cdot X_i)$ holds. Therefore, the proposed data aggregation scheme can provide integrity.
(iv) Resistance against Attacks. The proposed lightweight data aggregation scheme can resist the replay attack, the modification attack, the man-in-the-middle attack, and the impersonation attack. The reasons are analyzed below.
(1) Replay Attack. The timestamp $t$ is involved in the ciphertext. Agg can detect any replay of a previous message by verifying the freshness of $t$. Thus, the proposed lightweight data aggregation scheme can resist the replay attack.
(2) Modification Attack. Theorem 4 demonstrates that no attacker against the proposed data aggregation scheme can forge a legal ciphertext. Agg can detect any modification of the received data by verifying whether $\left(\sum_{i=1}^{n} v_i \cdot \sigma_i\right) \cdot P = \left(\sum_{i=1}^{n} v_i\right) \cdot P_{pub} + \sum_{i=1}^{n} (v_i \cdot \alpha_i \cdot R_i + v_i \cdot \beta_i \cdot X_i)$ holds. Thus, the proposed lightweight data aggregation scheme can resist the modification attack.
(3) Man-in-the-Middle Attack. The above analysis demonstrates that the proposed lightweight data aggregation scheme can supply authentication; that is, Agg can authenticate $\mathrm{SM}_i$ by checking whether $\sigma_i \cdot P = P_{pub} + \alpha_i \cdot R_i + \beta_i \cdot X_i$ holds. Thus, the proposed lightweight data aggregation scheme can resist the man-in-the-middle attack.
(4) Impersonation Attack. Theorem 4 shows that no attacker can forge a legal ciphertext without $\mathrm{SM}_i$'s secret key. Then, Agg can detect any impersonation by verifying the validity of the received ciphertext. Therefore, the proposed lightweight data aggregation scheme can resist the impersonation attack.
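The freshness check behind the replay-attack argument can be sketched as follows. The acceptance window of 30 seconds and the function name are hypothetical choices for illustration; the text only states that Agg verifies the freshness of the timestamp carried in the ciphertext.

```python
import time

MAX_AGE_SECONDS = 30  # hypothetical acceptance window, not specified in the paper

def is_fresh(t, now=None, max_age=MAX_AGE_SECONDS):
    """Accept a message only if its timestamp t is recent and not in the future."""
    if now is None:
        now = time.time()
    return 0 <= now - t <= max_age
```

A replayed message carries its original timestamp, so it fails this check once the window has passed; because the timestamp is covered by the signature, an attacker cannot simply refresh it without being detected by the verification equation.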

Performance Analysis
We analyze both the computation and communication costs of our lightweight data aggregation scheme in this section. We also compare its performance with two recently proposed data aggregation schemes to show its lightweight costs.
To achieve a fair comparison, we compare the schemes under the same security level. For the BGN encryption scheme [37], two 512-bit prime numbers of the form $2 \cdot p' + 1$, where $p'$ is also a large prime number, are applied in our experiments. For schemes based on bilinear pairing, a Tate pairing based on a Type A elliptic curve $\hat{E}: y^2 = x^3 + 1 \bmod \bar{p}$ with a prime order $\bar{q}$ is applied in our experiments, where the lengths of $\bar{p}$ and $\bar{q}$ are 512 bits and 160 bits, respectively. For schemes based on ECC, an elliptic curve $E: y^2 = x^3 + a \cdot x + b \bmod p$ with a prime order $q$ is applied in our experiments, where the lengths of $p$ and $q$ are both 160 bits.

Analysis of Computation Costs.
Based on the well-known cryptographic library MIRACL [42], we have implemented all related operations on a personal computer with an Intel I5-3210M 2.50 GHz Central Processing Unit (CPU), 8 GB of Random Access Memory (RAM), and the Windows 7 operating system. Table 3 lists the runtimes of the related operations.
Each $\mathrm{SM}_i$ in the proposed scheme executes two point multiplication operations related to ECC and two general hash functions. Therefore, $\mathrm{SM}_i$'s runtime is $2 \times T_{PM\text{-}ECC} + 2 \times T_{GH} = 2 \times 0.986 + 2 \times 0.001 = 1.974$ microseconds. Agg in the proposed scheme executes $3n + 2$ point multiplications related to ECC, $2n$ point additions related to ECC, and $3n$ general hash functions, where $n$ is the number of smart meters. Therefore, Agg's runtime is $(3n + 2) \times T_{PM\text{-}ECC} + 2n \times T_{PA\text{-}ECC} + 3n \times T_{GH}$.
Table 4 and Figure 3 show the runtime comparisons among Fan et al.'s data aggregation scheme [35], He et al.'s scheme [4], and the proposed scheme. From Tables 4 and 2, the proposed scheme incurs a lower computation cost than Fan et al.'s scheme and He et al.'s scheme on both the $\mathrm{SM}_i$ side and the Agg side.
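The cost expressions above can be evaluated with a short Python sketch. The two timing constants are the values quoted in the text; the point-addition time is not given in the surviving text, so it is left as a parameter (`t_pa_ecc`, a hypothetical name), and results are in the same time unit as the quoted constants.

```python
T_PM_ECC = 0.986  # runtime of one ECC point multiplication, as quoted in the text
T_GH = 0.001      # runtime of one general hash function, as quoted in the text

def sm_runtime():
    """Smart meter cost: two point multiplications and two hash evaluations."""
    return 2 * T_PM_ECC + 2 * T_GH

def agg_runtime(n, t_pa_ecc):
    """Aggregator cost for n meters: (3n + 2) multiplications, 2n additions, 3n hashes.

    t_pa_ecc is the runtime of one ECC point addition (value to be taken from Table 3).
    """
    return (3 * n + 2) * T_PM_ECC + 2 * n * t_pa_ecc + 3 * n * T_GH
```

The aggregator cost grows linearly in the number of meters $n$ and is dominated by the $3n + 2$ point multiplications.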

Analysis of Communication Costs.
Since the sizes of  1 ,  1 ,   ,   , p, and q are 512 bits, 512 bits, 512 bits, 160 bits, 1024 bits, and 160 bits, respectively, we can determine that the sizes of elements in  *  ,  1 ,  2 ,  *   ,  * p, and  * q are 1024 bits, 1024 bits, 1024 bits, 160 bits, 1024 bits, and 160 bits, respectively.We assume that the size of both the timestamp and the identity are each 32 bits.The communication costs of the related data aggregation schemes are shown below.
Based on the above evaluation, we note that the proposed data aggregation scheme incurs a lower communication cost than He et al.'s data aggregation scheme but a higher communication cost than Fan et al.'s data aggregation scheme. Security is the most important property of a data aggregation scheme. Therefore, it is reasonable to address the serious security weaknesses in Fan et al.'s data aggregation scheme at the cost of a slight increase in communication cost.

Conclusion
To ensure security and protect privacy in the smart grid environment, several data aggregation schemes have been proposed in recent years. However, most of them are not secure against internal attackers. To address the problem, Fan et al. [35] proposed a data aggregation scheme to mitigate internal attacks. Unfortunately, their data aggregation scheme suffers from serious security weaknesses. To enhance security, He et al. [4] proposed an improved data aggregation scheme using bilinear pairing. However, He et al.'s scheme is not well suited to the smart grid environment because the smart meter has limited computation capability. In this paper, we have proposed a novel data aggregation scheme that can thwart internal attacks in the smart grid environment. The security analysis shows that the proposed scheme is provably secure and can meet the security requirements. Besides, the performance evaluation shows that the proposed scheme incurs a lower computation cost than both schemes and a lower communication cost than He et al.'s scheme. The stronger security and better performance of the proposed scheme demonstrate that it is more suitable for smart grids.
With the fast development of quantum computing, traditional mathematical problems (such as the DL problem and the CDH problem) are likely to become solvable in polynomial time by quantum computers. All of the above data aggregation schemes for the smart grid would then no longer be secure. Lattices have been widely used to construct cryptographic schemes that can resist the strong capabilities of quantum computers. However, no lattice-based data aggregation scheme has been proposed yet. To improve security, it is worthwhile to consider the design of a data aggregation scheme for the smart grid based on the lattice approach.

Figure 1: The model of the smart grid.
(i) $h_i(m)$: for such a query made by A, C randomly selects $h \in Z_q^*$, sends $h$ to A, and stores $(m, h)$ in the table $L_{h_i}$, where $i = 1, 2, 3$.
(ii) CreateSM($\mathrm{id}_i$): for such a query made by A, C generates $\mathrm{SM}_i$'s secret key and blinding factor and stores them in the table $L_{SM}$.
(iii) CorruptSM($\mathrm{id}_i$): for such a query made by A, C sends $\mathrm{SM}_i$'s private key and blinding factor to A.
(iv) Signcrypt($\mathrm{id}_i$, $m_i$): for such a query made by A, C generates a ciphertext corresponding to the message $m_i$.
(v) Designcrypt: for such a query made by A on an identity $\mathrm{id}_i$ and a ciphertext, C checks the validity of the ciphertext and decrypts it to get the plaintext.
Definition 1. A data aggregation scheme is able to provide confidentiality (indistinguishability against adaptive chosen ciphertext attacks, IND-CCA) if no attacker can win the following game with a nonnegligible advantage.
Setup. C produces system parameters and transmits them to A.
Phase 1. A is able to adaptively make $h_i$, CreateSM, CorruptSM, Signcrypt, and Designcrypt queries.
Challenge. A picks a challenging identity $\mathrm{id}_i^*$, chooses two messages $m_0$ and $m_1$, and sends them to C. C picks a random element $b \in \{0, 1\}$, produces a signcrypted ciphertext of $m_b$, and sends it to A.
Phase 2. In this phase, A can adaptively make $h_i$, CreateSM, CorruptSM, and Signcrypt queries except that it cannot make a CorruptSM query with $\mathrm{id}_i^*$ or a Designcrypt query with the challenge ciphertext. Finally, A gives its guess $b' \in \{0, 1\}$ about the value of $b$ selected by C. A's advantage is defined as $|2\Pr[b' = b] - 1|$. A wins in the above game if it guesses the value of $b$ correctly.

Theorem 4. The proposed data aggregation scheme is able to provide unforgeability if the DL problem is hard.
Proof. Assume that an attacker A wins the game defined in Definition 1 with a nonnegligible advantage $\varepsilon$. Based on A's capability, we can construct a challenger C that solves the DL problem with a nonnegligible advantage. Given an instance $(P, Q_1 = a \cdot P)$ of the DL problem, C picks a random integer $s \in Z_q^*$, computes $P_{pub} = s \cdot P$, and sends the system parameters params, which include $P_{pub}$, $h_1$, $h_2$, and $h_3$, to A. C randomly selects an identity $\mathrm{id}_I$ as the challenging identity and answers queries from A according to the rules below.
(i) $h_i(m)$: C keeps a table $L_{h_i}$ of the form $(m, h)$, where $i \in \{1, 2, 3\}$. Upon receiving such a query, C checks if $L_{h_i}$ contains a tuple $(m, h)$. If so, C sends $h$ to A; otherwise, C randomly picks an element $h \in Z_q^*$, stores $(m, h)$ into $L_{h_i}$, and sends $h$ to A.
(ii) CreateSM($\mathrm{id}_i$): C keeps a table $L_{SM}$ with one tuple per identity. Upon receiving such a query, C checks if $L_{SM}$ contains a tuple for $\mathrm{id}_i$. If so, C sends the stored public value to A; otherwise, C answers the query through the rules below:
(1) If $\mathrm{id}_i = \mathrm{id}_I$, C randomly picks two integers in $Z_q^*$ and sets $R_i \leftarrow a \cdot P$. C stores the resulting tuples, with $\bot$ in place of the unknown component, into $L_{SM}$.
(2) Otherwise ($\mathrm{id}_i \neq \mathrm{id}_I$), C randomly selects three integers in $Z_q^*$, among them $y_i$ and $\alpha_i$, and sets $R_i \leftarrow \alpha_i^{-1} \cdot (y_i \cdot P - P_{pub})$. C stores the resulting tuples into $L_{SM}$.
(iii) CorruptSM($\mathrm{id}_i$): C checks if $L_{SM}$ contains a tuple for $\mathrm{id}_i$. If not, C makes a CreateSM query with the identity $\mathrm{id}_i$. After that, C returns the stored tuple to A.
(iv) Signcrypt($\mathrm{id}_i$, $m_i$): C checks if $\mathrm{id}_i$ and $\mathrm{id}_I$ are equal. If they are not, C extracts the tuple for $\mathrm{id}_i$ from $L_{SM}$ and uses it to produce a ciphertext according to the description of the proposed data aggregation scheme; otherwise, C randomly selects two integers in $Z_q^*$ and computes $R_i = \alpha_i^{-1} \cdot (\sigma_i \cdot P - P_{pub} - \beta_i \cdot X_i)$ and $c_i = m_i + \pi_i + h_2(s \cdot A_i)$. C stores the corresponding tuple into $L_{h_2}$ and sends the ciphertext to A.