Secure Data Aggregation Aided by Privacy Preserving in Internet of Things

With the rapid development of the Internet of Things (IoT), more and more wireless smart devices are being deployed in its typical applications. These devices acquire large amounts of data, and transmitting the data securely is a challenge due to the open, broadcast nature of wireless networks. Secure data aggregation (SDA) plays a significant role in IoT because it can both protect data privacy and reduce the network traffic among smart devices. In this paper, a multiple application SDA (MASDA) mechanism is proposed that ensures data confidentiality without losing the integrity of data transmission. Firstly, we discuss an improved homomorphic message authentication code (iHMAC) which can verify the integrity of sensing data and filter out injected false data. Secondly, a multiple application elliptic curve cryptography mechanism (iECC) is described, in which simultaneous encryption of multiple types of data is introduced into an elliptic curve encryption system. This encryption mechanism aggregates the encrypted sensing data and prevents the sensing data from being tampered with or leaked at relay nodes. Finally, unlike other SDA schemes, MASDA is a multiple application mechanism that provides both confidentiality and integrity. It fuses the sensing data of different applications in a single data packet, and the base station can recover them one by one. The security analysis and the simulation experiments indicate that MASDA performs better in terms of security, computation cost, communication overhead, and data accuracy.


Introduction
The Internet of Things (IoT) is an intelligent system which realizes sensing, communication, and decision-making through its underlying technology [1]. With the emergence of fifth generation (5G) communication, the reasonable deployment of distributed devices has become an important research topic in IoT [2,3]. The rapid development of sensor techniques, including microprocessors and wireless communication, provides a feasible solution and extends the concept of IoT to a broader field covering data communication, data exchange, and data integration [4]. As the driving force of IoT, wireless sensor networks (WSNs) are widely used in typical IoT application scenarios, such as industrial manufacturing, smart transportation, environmental monitoring, and smart cities [5,6]. The integration of IoT and WSNs is one of the key schemes pushing forward the third wave of the information technology revolution. However, new security threats are a major obstacle to applying novel network techniques in IoT with WSNs. In an IoT system, the sensor nodes of WSNs are usually deployed in open and hostile environments, so attackers can more easily compromise data security through eavesdropping and tampering attacks. What is worse, because of the constrained resources of sensor nodes (e.g., battery energy, computation capacity, and communication range), it is challenging to directly introduce traditional security mechanisms into WSNs [7][8][9]. Therefore, secure data aggregation (SDA) is proposed to guarantee the security of IoT, eliminate the redundancy of data acquisition, and improve the energy efficiency of smart nodes.
Although data aggregation (DA) is a common technique to reduce the data redundancy, improve the network efficiency, and prolong the lifetime of the network, it is vulnerable to various attacks in WSNs [10,11]. Therefore, DA algorithms need to be designed together with the security protocol in order to protect the sensing data and improve the network efficiency as well. The goal of SDA is to provide security for DA and prevent the private data from being eavesdropped on and leaked to unauthorized users in the process of data transmission among resource-limited sensor nodes [12]. As a result, the network efficiency and security can be assured. In the traditional sense, confidentiality and integrity are two important indexes for data transmission. The confidentiality requires that the attacker cannot understand the transmitted data and the integrity means that the receiver can distinguish the tampered data from untampered ones.
The confidentiality of DA can be implemented through two different mechanisms: hop-by-hop encryption and end-to-end encryption. The hop-by-hop DA mechanism encrypts the sensing data before sending them to the next hop, where the ciphertexts are decrypted so that they can be aggregated with the sensing data of other sensor nodes. However, hop-by-hop encryption introduces non-negligible network delay. Moreover, an attacker may obtain the decrypted data at an intermediate node. The end-to-end DA mechanism aggregates the sensing data directly in ciphertext form, so relay nodes do not need to perform decryption and encryption. In recent years, many SDA algorithms have been proposed to protect the data privacy of WSNs [13][14][15][16][17]. These mechanisms provide end-to-end data confidentiality, which improves data privacy and reduces the aggregation delay. However, most of these contributions focus on the confidentiality of data and ignore the integrity of DA. In addition, these mechanisms are difficult to apply in heterogeneous WSNs with different sensor nodes (e.g., the temperature sensor, the humidity sensor, and the light sensor) in a practical multiple application sensor network [18,19], and a malicious node can eavesdrop on the keys and tamper with the private data. Therefore, it is important to design a multiple application SDA mechanism that protects confidentiality and integrity simultaneously.
Three major challenges need to be overcome in designing an SDA mechanism for multiple application scenarios. (i) In an IoT system with WSNs, sensor nodes are usually deployed in unattended or even hostile environments, and each sensor node is equipped with a non-rechargeable battery and a low-power processor. This implies that an SDA scheme must not place a heavy burden on sensor resources. (ii) The security of data collection includes both confidentiality and integrity, which calls for a novel encryption mechanism. (iii) Traditional encryption algorithms have high complexity and can hardly support the simultaneous aggregation of multiple application data (such as temperature and humidity).
To address the aforementioned issues, we propose a multiapplication secure data aggregation (MASDA) mechanism based on the elliptic curve encryption of compound-order finite groups, which can protect the data aggregation, reduce the energy consumption, and provide simultaneous DA for multiple applications. The contributions are summarized as follows:
(i) A multiapplication SDA mechanism is proposed. It can provide confidentiality and integrity in the data transmission of IoT with WSNs.
(ii) An elliptic curve encryption scheme based on compound-order finite groups is discussed. It encrypts the data of different applications so that they can be aggregated at the cluster head in ciphertext form, which reduces the communication overhead and the computation complexity.
(iii) A novel homomorphic message authentication code is designed. It can verify the integrity of sensing data and filter out injected false data.
The rest of this paper is organized as follows. In Section 2, we discuss the related work on data aggregation schemes. In Section 3, we describe the network model and attacks. In Section 4, we present the detailed description of the proposed mechanism, MASDA. In Section 5, we provide the security analysis. After providing the simulation experiments and analysis in Section 6, we give our conclusion and future work in Section 7.

Related Works
As the critical technique of IoT, SDA has received great attention in recent years. In this section, we classify the SDA into three categories, namely, the confidentiality mechanism, the integrity mechanism, and both confidentiality and integrity mechanisms.

Confidentiality.
In [20], Alghamdi et al. investigated the reliable and secure end-to-end DA issue. Two data aggregation approaches were discussed, and the selective forwarding attack and the modification attack were taken into consideration in a homogeneous cluster-based WSN. The authors suggested that secret sharing and signatures make it possible to aggregate the data without understanding the contents of messages. Meanwhile, the base station can verify the aggregation result and retrieve the raw data from the aggregated data. However, these approaches can only be adopted in a homogeneous network with single application data. They focus on the confidentiality of the network, and the data integrity is left to be investigated in the future. In [21], Zhong et al. proposed a privacy homomorphism scheme based on elliptic curve encryption, which filtered out false data and avoided unnecessary energy consumption. It ensured that the base station was able to verify the received data and recover the original sensing data, which provided arbitrary aggregation in a WSN. In [22], Gope and Sikdar discussed a lightweight and privacy-friendly shield-based spatial data aggregation mechanism which depended on lightweight primitives, such as hash functions and exclusive OR operations, to protect the data privacy. Compared with other methods, the proposed scheme significantly reduced the computation overhead. In [23], Fang et al. proposed a data aggregation approach, called CSDA, based on cluster privacy preserving. In CSDA, the slice-assemble technique was designed and the number of slices was determined according to the network size so as to improve the aggregation flexibility and data privacy.

Wireless Communications and Mobile Computing
These reviewed protocols focus on the confidentiality of data aggregation and protect data from being leaked to unauthorized nodes. However, they ignore the data integrity in data aggregation. Therefore, the aggregation data are prone to be tampered with by malicious nodes, and the cluster head or the base station may not collect the accurate data.
2.2. Integrity. In [24], Arazi proposed a complete and highly compact MAC scheme based on stream encryption. The contribution implements the hash conversion with a stream cipher, so that the strength of the hash is closely tied to the underlying security of the cipher. In [25], Shen et al. combined the advantages of the aggregation signature with ID-based cryptography and proposed an ID-based aggregation signature (IBAS) mechanism. IBAS consisted of six probabilistic polynomial time (PPT) algorithms: the setup algorithm, the key generation algorithm, the signing algorithm, the verification algorithm, the aggregation algorithm, and the aggregation verification algorithm. Its security was reduced to the Diffie-Hellman assumption in the random oracle model, which proved that IBAS was able to ensure the integrity of data and reduce communication and storage costs.
These data aggregation protocols provide the integrity protection of private data and achieve the end-to-end security of DA. However, some issues are left to be further studied, such as low confidentiality and the excessive consumption of resources.
2.3. Both Confidentiality and Integrity. Boudia et al. [26] developed a secure aggregation scheme based on the stateful public key cryptography (SASPKC). SASPKC depended on the symmetric homomorphic encryption and the message authentication code to aggregate the ciphertext and generate the signature of aggregation data, respectively. It provided not only the end-to-end data confidentiality but also the data integrity. Shim and Park [27] have worked on a secure data aggregation protocol (Sen-SDA) for heterogeneous networks. Sen-SDA employed the additive homomorphic encryption to reduce the length of ciphertext and improve the end-to-end security. In order to provide hop-by-hop authentication, the pairless identity-based signature (IBS) technique and the binary fast search (BQS) were discussed to filter out the false data in a heterogeneous WSN. As a result, Sen-SDA can protect the private data and improve the efficiency of multisignature verification. Shim and Park [28] proposed a t times lattice-based homomorphic cryptosystem based on random soft noise techniques for smart grids. After the sensing data were aggregated, the smart meter (SM) generated a complete aggregation ciphertext and the trusted authority (TA) issued a time stamp and a new random number. These components meet the security requirements of integrity, confidentiality, and authenticity, which can defend the replay attack, the forwarding attack, and the quantum attack.
Although the existing protocols have discussed the security of DA, few of them support integrity and confidentiality simultaneously. Moreover, the current solutions either aggregate single application data or are limited to a certain type of data query. These facts motivate this study. We consider both data confidentiality and data integrity and combine elliptic curve encryption with homomorphic message authentication to concurrently aggregate multiple types of sensing data, which expands the application scenarios of SDA.

Network Model and Attacks
In this section, we present the network model and the attacks in an IoT with WSNs. The symbols used in the MASDA protocol are shown in Table 1.
3.1. Network Model. In this paper, we deploy a cluster network topology to aggregate the sensing data in a WSN. Three types of nodes are involved: the base station (BS), the cluster head (CH), and the cluster member (CM). The sensor nodes are capable of sensing, computing, and transmitting the collected data. A network is divided into several application groups according to the different functions of nodes, such as the temperature sensor group, the humidity sensor group, and the light sensor group. Nodes of different groups are randomly scattered in the network. CH receives the data from its CMs and aggregates them based on the application types. The aggregation results are eventually transmitted to BS. Figure 1 depicts the cluster model of data aggregation.

Attack Types.
We take various attacks in IoT into consideration, including passive and active attacks. The specific attacks are described in Table 2.
3.2.1. Passive Attack. A passive attacker listens to the communication channel and eavesdrops on the transmitted data, which compromises confidentiality and may even allow the key of the cryptosystem to be inferred. Possible passive attacks include the ciphertext-only attack, the known-plaintext attack, and the chosen-plaintext attack. Traditionally, encryption is an effective measure to protect sensitive data against passive attacks in an IoT with WSNs.

Active Attack.
An active attack attempts to compromise the network security by changing the data in a network. It may tamper with the sensing data, or replay or discard a whole packet. An IoT may be damaged by one or more active attacks, such as the replay attack, the malleability attack, unauthorized aggregation, and the forged packet attack. The negative impacts of these attacks can be mitigated by verifying the identity of nodes and checking the integrity of the aggregated data.

MASDA Algorithm
In this section, we first introduce the idea of MASDA. Then, the elliptic curve encryption algorithm and the homomorphic message authentication are discussed. Finally, a numerical example is provided.
In the cluster model of Figure 1, CH is elected according to the residual energy and the distance between the sensor node and BS. Firstly, each CM encrypts its sensing data using the elliptic curve encryption algorithm. Different keys are used for different application data. Secondly, after the data are encrypted, CM generates a message authentication code using the key shared with CH. Then, CM sends the ciphertext and the message authentication code to CH. After CH receives the message from the CM, it verifies whether the message authentication code of the ciphertext is valid. Finally, CH directly aggregates the ciphertexts and transmits the aggregation data to BS, which decrypts the aggregated ciphertext and recovers the raw data of the different applications.

Elliptic Curve Encryption Algorithm.
We adopt the privacy homomorphic encryption algorithm based on elliptic curves [29] in MASDA. The security of this homomorphic encryption depends on the subgroup decision problem: given an element of a composite-order group of order $n = q_1 q_2$, it is infeasible to decide whether the element belongs to the subgroup of order $q_1$. This allows a concealed aggregation that relies only on the ciphertexts, even when they are encrypted with different keys. The algorithm supports both additive and multiplicative homomorphism; we describe only the additive homomorphic encryption, which is what the design of MASDA builds on.

Key Generation.
The message receiver, $R$, generates a tuple, $(q_1, q_2, E, n)$, according to the security parameter, $\tau \in \mathbb{Z}$. $E$ is a set of elliptic curve points which forms a cyclic group, and $n = q_1 q_2$ is the order of $E$. Two points of order $n$, $u$ and $g$, are randomly selected from $E$. Let $h = u^{q_2}$, so that the order of $h$ is $q_1$. The public key, $P_{pub} = (n, E, g, h)$, is sent to the sender, and the private key is $P_{pri} = q_1$.

Encryption.
The sender, $S$, chooses an integer $M$ ($M < q_2$) whose length is close to the length of $q_2$. The value of the data, $m$, should be less than or equal to $M$. After $S$ receives the public key, it generates a random number $r \in \{0, 1, 2, \cdots, n-1\}$ and obtains the ciphertext $C = g^{m} + h^{r}$. Here $g^{m}$ and $h^{r}$ denote scalar multiplications of elliptic curve points, and the plus sign represents the addition of elliptic curve points.

Data Aggregation.
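Two ciphertexts received by the aggregator, $A$, are expressed as $C_1 = g^{m_1} + h^{r_1}$ and $C_2 = g^{m_2} + h^{r_2}$. $A$ fuses these ciphertexts according to Equation (3):
$$C' = C_1 + C_2 = g^{m_1 + m_2} + h^{r_1 + r_2}. \tag{3}$$
Then, it sends the ciphertext $C'$ to the receiver $R$.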
4.2.4. Decryption. After receiving the data from $A$, $R$ decrypts them using the private key, $P_{pri}$, according to $C'^{q_1} = g^{m_3 q_1} + h^{r_3 q_1}$, where $m_3 = m_1 + m_2$ and $r_3 = r_1 + r_2$. Because the order of $h$ is $q_1$, the term $h^{r_3 q_1}$ is the point at infinity, and the expression is transformed to
$$C'^{q_1} = (g^{q_1})^{m_3}.$$
$R$ recovers the original data $m_3$ according to
$$m_3 = \log_{g^{q_1}} C'^{q_1}.$$
Because the plaintext is less than or equal to $M$, the time complexity of decryption is $O(\sqrt{M})$ according to reference [30].
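The additive homomorphism of Sections 4.2.1 to 4.2.4 can be checked with a small sketch. As a loudly stated assumption, the elliptic curve group is replaced here by the additive group of integers modulo $n$, which is also cyclic of order $n = q_1 q_2$ but offers no security at all (discrete logarithms are trivial in it). The primes, the elements `G` and `U`, and all function names are illustrative choices, not values from the paper. In this model the paper's $g^{m}$ becomes `m * G % N`.

```python
import secrets

# Toy parameters (assumed for illustration only; far too small to be secure).
Q1, Q2 = 11, 13          # private primes; Q1 plays the role of P_pri
N = Q1 * Q2              # group order n = q1 * q2
G, U = 2, 3              # elements of order n in (Z_n, +): both coprime to n
H = (Q2 * U) % N         # h = q2 * u has order q1
M_MAX = Q2 - 1           # plaintext bound M < q2


def encrypt(m, r=None):
    """Return C = m*g + r*h (the paper's g^m + h^r) in the toy group."""
    if not 0 <= m <= M_MAX:
        raise ValueError("plaintext out of range")
    if r is None:
        r = secrets.randbelow(N)   # random r in {0, ..., n-1}
    return (m * G + r * H) % N


def aggregate(c1, c2):
    """Fuse two ciphertexts; in the toy group this is addition mod n."""
    return (c1 + c2) % N


def decrypt(c):
    """Multiply by q1 to cancel the h term, then search for m by brute force."""
    target = (Q1 * c) % N
    base = (Q1 * G) % N            # has order q2, so m is determined mod q2
    for m in range(M_MAX + 1):
        if (m * base) % N == target:
            return m
    raise ValueError("plaintext sum exceeds the bound M")


# The aggregator adds ciphertexts without ever seeing 3 or 4 individually.
c_sum = aggregate(encrypt(3), encrypt(4))
print(decrypt(c_sum))
```

The linear search in `decrypt` is chosen for brevity; reaching the $O(\sqrt{M})$ bound quoted from [30] would need a square-root discrete logarithm method instead.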

Improved Homomorphic Message Authentication Code.
In [31], Kamal et al. proposed a homomorphic message authentication code (HMAC), which satisfies the homomorphic property. The message authentication code cannot be calculated even if the attacker knows the original data. In order to meet the demands of IoT with WSNs, we improve HMAC into a scheme called iHMAC, which includes three polynomial-time algorithms: the signature algorithm, the aggregation algorithm, and the verification algorithm. In iHMAC, BS constructs a pseudorandom number generator, $G : K_G \rightarrow F_p$, and a pseudorandom function, $F$, keyed from $K_F$. The key pair $(k_{mac1}, k_{mac2})$, where $k_{mac1} \in K_G$ and $k_{mac2} \in K_F$, is used to generate the MAC.
The $i$th CM calculates its MAC, $mac_i$, from its original data $m$ and the key pair. The aggregation node receives the authentication codes of $j$ nodes, and the aggregation result of these codes is $mac_{agg} = \sum_{i=1}^{j} mac_i$. BS generates a verification code, $mac_{BS}$, according to the collected aggregation data, $m_{agg}$, and the keys, $(k_{mac1}, k_{mac2})$. The data are intact if $mac_{agg} = mac_{BS}$; otherwise, the data are likely to have been tampered with.
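To make the sign, aggregate, and verify steps concrete, the following sketch uses one common linear construction, $mac_i = G(k_{mac1}) \cdot m_i + F(k_{mac2}, i) \bmod p$. This formula is an assumption chosen for illustration and is not claimed to be the exact iHMAC equation; the modulus `P`, the SHA-256 based stand-ins for $G$ and $F$, and the function names are likewise hypothetical.

```python
import hashlib
import hmac

P = 2**61 - 1  # prime field modulus (an assumed choice)


def prg(k_mac1: bytes) -> int:
    """Stand-in for G: derive one field element from k_mac1."""
    return int.from_bytes(hashlib.sha256(k_mac1).digest(), "big") % P


def prf(k_mac2: bytes, node_id: int) -> int:
    """Stand-in for F: a per-node pad derived from k_mac2 and the node id."""
    digest = hmac.new(k_mac2, node_id.to_bytes(4, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def sign(k_mac1: bytes, k_mac2: bytes, node_id: int, m: int) -> int:
    """Signature algorithm run by a cluster member on its data m."""
    return (prg(k_mac1) * m + prf(k_mac2, node_id)) % P


def aggregate(macs) -> int:
    """Aggregation algorithm: the MACs are simply added in the field."""
    return sum(macs) % P


def verify(k_mac1: bytes, k_mac2: bytes, node_ids, m_agg: int, mac_agg: int) -> bool:
    """Verification algorithm run by BS on the aggregated data."""
    pads = sum(prf(k_mac2, i) for i in node_ids)
    mac_bs = (prg(k_mac1) * m_agg + pads) % P
    return mac_bs == mac_agg


k1, k2 = b"key-one", b"key-two"
readings = {1: 21, 2: 23, 3: 19}          # node id -> sensed value
mac_agg = aggregate(sign(k1, k2, i, m) for i, m in readings.items())
print(verify(k1, k2, readings, sum(readings.values()), mac_agg))
```

Because the MAC is linear in the data, the sum of the individual MACs equals the MAC of the summed data, which is what lets BS check an aggregate without seeing the individual readings.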

Confidentiality and Integrity Mechanism of MASDA.
MASDA is designed to work in multiple application scenarios, so the traditional elliptic curve encryption needs to be improved to ensure integrity as well as confidentiality.
4.4.1. Key Generation. BS generates a tuple, $(q_1, q_2, \cdots, q_{k+1}, E, n)$, according to the security parameter $\tau \in \mathbb{Z}$. $E$ is a set of elliptic curve points which forms a cyclic group, and $n$ is the order of $E$, satisfying $n = q_1 q_2 \cdots q_{k+1}$. $k + 2$ points of order $n$, $(u_1, u_2, \cdots, u_{k+1}, g)$, are randomly selected from $E$. The value of $h$ is determined according to
$$h = u_{k+1}^{\beta},$$
where the order of $h$ is $q_{k+1}$ and $\beta$ is
$$\beta = \frac{n}{q_{k+1}} = q_1 q_2 \cdots q_k.$$
Then, the $k$ keys are allocated to the $k$ application group nodes. $P_z$ is the key of the $z$th ($1 \le z \le k$) application group node:
$$P_z = u_z^{\alpha},$$
where $\alpha$ is
$$\alpha = \frac{n}{q_z},$$
so that the order of $P_z$ is $q_z$. The public key deployed for the $z$th application group node is $P_{pubz} = (n, E, g, h, P_z)$, and the private key group, $P_{pri} = (q_1, q_2, \cdots, q_{k+1})$, is retained by BS. For the integrity verification, BS loads the keys, $(k_{mac1}, k_{mac2})$, used to generate the MAC into each sensor node in advance.

Table 2 (excerpt).
Unauthorized aggregation: Malicious nodes must know the encryption key and the data authentication key to realize unauthorized aggregation. Affects integrity.
Forgery packet: The attacker forges the packet without knowing the encryption key. Increases network energy consumption.

Encryption.
The bound $T_z$ on the data of the $z$th application should satisfy $T_z < q_z$, with the length of $T_z$ close to the length of $q_z$, and the data, $m$, should be less than or equal to $T_z$. After a CM of the $z$th application group receives the public key, it creates a random number $r \in \{0, 1, 2, \cdots, n-1\}$ and encrypts $m$ with the public key into the ciphertext
$$C = P_z^{m} + h^{r}.$$
The CM with identity $i$ also calculates a verification code, $mac_i$, from the ciphertext $C$ (Equation (13)). The CM then sends the data packet, consisting of the ciphertext and the verification code, to the aggregation node (CH).
4.4.3. Aggregation. After CH receives data from $j$ nodes in a cluster, it first verifies whether the data are intact according to their authentication codes. CH obtains a new MAC, $mac_{CH}$, corresponding to the data of every CM using Equation (13) and compares $mac_{CH}$ with the authentication code sent by the node. If the verification succeeds, the data of the node are aggregated; otherwise, the data are discarded. The encrypted data and the MACs are aggregated separately to produce the aggregation ciphertext, $C_{agg}$, and the aggregation MAC, $mac_{agg}$, as shown in
$$C_{agg} = \sum_{l=1}^{j} C_l = \sum_{l=1}^{j} \left(P_{z_l}^{m_l} + h^{r_l}\right), \qquad mac_{agg} = \sum_{l=1}^{j} mac_l,$$
where $m_l$ represents the original data of the $l$th node, $z_l$ is the application group of the $l$th node, $r_l$ denotes the random number of the $l$th node, and $mac_l$ is the MAC of the $l$th node. The aggregation ciphertext and message code, $C_{agg} \,|\, mac_{agg}$, are transmitted to BS for decryption and verification in the next step.
Then, BS computes its own verification code, $mac_{BS}$, and compares it with $mac_{agg}$. If $mac_{agg} = mac_{BS}$, BS decrypts the ciphertext of the aggregation data; otherwise, BS regards the data as tampered with or incomplete.
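After the data are authenticated, BS decrypts the ciphertext using the private key $P_{pri}$. Different private keys are used to decrypt the data of the $k$ different applications. If BS wants to recover the data of the $z$th application, they are decrypted according to
$$C_{agg}^{\gamma} = \left(\sum_{l=1}^{j} C_l\right)^{\gamma}, \tag{17}$$
where
$$\gamma = \frac{n}{q_z}.$$
Because the order of $h$ and the order of every key $P_w$ with $w \neq z$ divide $\gamma$, the corresponding terms vanish, and Equation (17) is transformed to
$$C_{agg}^{\gamma} = \left(P_z^{\gamma}\right)^{\sum_{l} m_l},$$
where the sum runs over the nodes of the $z$th application. The aggregation result $m^{z}_{agg}$ of the $z$th application is
$$m^{z}_{agg} = \log_{P_z^{\gamma}} C_{agg}^{\gamma}.$$
BS can recover the aggregated information of all application groups according to the above steps, and the goals of secure data transmission and integrity verification are achieved.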

A Numerical Example of MASDA.
In this section, we use a numerical example to illustrate the working mechanism of MASDA. Suppose that two clusters ($Cluster_1$ and $Cluster_2$) serving two applications are deployed. The public key of $node_t$ is $(n, E, g, h, P_t)$ and the public key of $node_h$ is $(n, E, g, h, P_h)$. The aggregation nodes in $Cluster_1$ and $Cluster_2$ are $DA_1$ and $DA_2$, respectively. $Cluster_1$ is farther from BS than $Cluster_2$. There are two different types of nodes in each cluster: the temperature sensor node, $node_t$, and the humidity sensor node, $node_h$.

Key Generation.
In the key generation step, the orders of $P_t$, $P_h$, and $h$ are 11, 13, and 17, so $n = 11 \times 13 \times 17 = 2431$.

Encryption.
The two nodes in $Cluster_1$ encrypt their data as follows, where each random number is generated by the node itself. $m_{t1} = 1$ is the sensing data collected by $node_{t1}$, with the random number 4. The data are encrypted using Equation (19),
$$C_{t1} = P_t^{1} + h^{4}, \tag{19}$$
and the message verification code $MAC_{t1}$ is generated based on $C_{t1}$.
The sensing data collected by $node_{h1}$ is $m_{h1} = 3$, with the random number 6. The ciphertext $C_{h1}$ is obtained according to Equation (20),
$$C_{h1} = P_h^{3} + h^{6}, \tag{20}$$
and the message verification code $MAC_{h1}$ is calculated based on $C_{h1}$.
4.5.3. Aggregation. When $DA_1$ receives the data from the two nodes, it verifies the MACs and aggregates the data into $C_{DA1}$ according to Equation (21). The message verification codes are also fused into $MAC_{agg1}$ according to Equation (22).
$$C_{DA1} = C_{t1} + C_{h1} = P_t^{1} + P_h^{3} + h^{10}, \tag{21}$$
$$MAC_{agg1} = MAC_{t1} + MAC_{h1}. \tag{22}$$
Then, $DA_1$ sends the aggregation data $C_{DA1}$ to $DA_2$. In $Cluster_2$, two nodes hold the data $m_{t2} = 4$ and $m_{h2} = 2$, with the random numbers 2 and 7, respectively. After the same encryption process as in $Cluster_1$ is executed, the aggregation data $C_{agg}$ are obtained in $DA_2$ using Equation (23),
$$C_{agg} = C_{DA1} + C_{t2} + C_{h2} = P_t^{5} + P_h^{5} + h^{19}, \tag{23}$$
and the message verification code of the aggregation data is $MAC_{agg} = MAC_{agg1} + MAC_{agg2}$.
Because the order of $h$ is 17, $h^{17} = \infty$, where $\infty$ is the point at infinity, i.e., the identity element of the elliptic curve group. Therefore, Equation (23) can be rewritten as $C_{agg} = P_t^{5} + P_h^{5} + h^{2}$. Finally, $Cluster_2$ sends $C_{agg}$ and $MAC_{agg}$ to BS.

Decryption and Verification.
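After BS receives $C_{agg}$ and $MAC_{agg}$, it verifies the MAC and decrypts the data if the verification is successful. The private key of the temperature data is $\gamma_t = 17 \times 13 = 221$, and BS can decrypt the aggregated temperature data $m_t$ according to
$$m_t = \log_{P_t^{221}} C_{agg}^{221} = 5,$$
where $C_{agg}^{221} = (P_t^{221})^{5}$ because the terms in $P_h$ and $h$, whose orders 13 and 17 divide 221, vanish. BS can also use the private key of the humidity data, $11 \times 17 = 187$, to recover the aggregated humidity data.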
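The arithmetic of this example can be replayed in a few lines. As an explicit assumption, the elliptic curve group is again modeled by the integers modulo $n = 2431$ under addition, with $P_t$, $P_h$, and $h$ taken as the multiples $n/11$, $n/13$, and $n/17$ of the generator 1 so that they have orders 11, 13, and 17. The paper does not give concrete curve points, so the integer ciphertext values produced here are artifacts of this toy model; only the recovered sums are meant to match the text.

```python
N = 11 * 13 * 17                       # n = 2431
ORDERS = {"t": 11, "h": 13}            # orders of P_t and P_h
KEYS = {z: N // q for z, q in ORDERS.items()}   # P_z = n / q_z has order q_z
H = N // 17                            # h has order 17


def encrypt(app, m, r):
    """C = m*P_z + r*h, the toy-group form of P_z^m + h^r."""
    return (m * KEYS[app] + r * H) % N


def decrypt(app, c_agg):
    """Multiply by gamma = n / q_z, then search for the aggregated plaintext."""
    gamma = N // ORDERS[app]           # 221 for temperature, 187 for humidity
    target = (gamma * c_agg) % N
    base = (gamma * KEYS[app]) % N
    for m in range(ORDERS[app]):
        if (m * base) % N == target:
            return m
    raise ValueError("aggregate out of range")


# Cluster 1: temperature 1 (r = 4) and humidity 3 (r = 6).
c_da1 = (encrypt("t", 1, 4) + encrypt("h", 3, 6)) % N
# Cluster 2 adds temperature 4 (r = 2) and humidity 2 (r = 7).
c_agg = (c_da1 + encrypt("t", 4, 2) + encrypt("h", 2, 7)) % N

print(decrypt("t", c_agg), decrypt("h", c_agg))   # both aggregated sums are 5
```

The random exponents add up to 19, which reduces to 2 modulo the order of $h$, matching the rewritten form $P_t^{5} + P_h^{5} + h^{2}$ in the text.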

Security Analysis
In this section, we will show the resistance of MASDA against the passive and the active attacks.
5.1. Ability to Resist Passive Attacks. The passive attacks include ciphertext analysis, the known-plaintext attack, and the chosen-plaintext attack. The security of the elliptic curve encryption of MASDA relies on the hardness of factoring large integers, which makes it robust to ciphertext analysis. As for the known-plaintext and chosen-plaintext attacks, the encryption of MASDA involves a random number, so the ciphertext is probabilistic. Therefore, MASDA can defend against the known-plaintext attack and the chosen-plaintext attack. The above attack methods all analyze ciphertexts and plaintexts. According to the characteristics of MASDA plaintexts and ciphertexts, we can draw the following conclusion: MASDA achieves end-to-end confidentiality with indistinguishable ciphertexts in the presence of a probabilistic polynomial-time adversary.
Since random numbers are added during encryption, MASDA generates different ciphertexts for the same plaintext. The attackers cannot deduce the plaintext from an eavesdropped ciphertext, nor can they deduce any important information by comparing ciphertexts. Furthermore, the ciphertext is secured by appending the MAC. Even if the attackers break the signature and get the ciphertext, they cannot decrypt it because the key is held by the BS. Because we assume that the BS is a powerful and trusted device, the attackers cannot obtain the secret needed to decrypt the ciphertext. Therefore, MASDA has the ability to resist plaintext and ciphertext analysis.
In addition, we analyze brute-force cracking of the key. We compare the scheme with two other asymmetric homomorphic encryption mechanisms based on elliptic curves: EC-OU [32] and EC-EG [33]. The security of EC-OU is based on the intractability of factoring, and the security of EC-EG is based on the elliptic curve discrete logarithm problem. In an elliptic curve encryption algorithm, the security of the encryption mechanism mainly depends on how well users protect the key, and the more bits the key has, the harder it is to crack the ciphertext. In MASDA, the length of the private key is $k|q|$, where $k$ is the number of application types and $|q|$ is the length of the key in iECC. Table 3 compares the key strength of MASDA with that of EC-OU and EC-EG. As shown in the table, the key strength of EC-OU and EC-EG depends only on the length of the key and does not change with the number of applications, whereas the key strength of MASDA depends on both the number of applications and the length of the key, and it increases as the number of applications increases.

Ability to Resist Active Attacks.
The active attacks mainly include replay attacks, malleability attacks, and unauthorized aggregation. These active attacks can compromise data integrity or generate false data. MASDA adopts iHMAC to defend against these attacks. Each node uses its pseudorandom keys and the data to be sent to generate the MAC and sends the result to the aggregation node. After receiving the MACs, the aggregation node performs an aggregation operation on them and sends the aggregated MAC to the BS. The BS checks whether the data have been subjected to an active attack by recalculating the MAC. Data integrity and false data screening are ensured if the calculated MAC is the same as the received MAC. Therefore, it is difficult for an attacker to launch an attack for the following reasons: (i) it is difficult to generate a MAC unless the key is known; (ii) the keys generated by each node are different; (iii) the key is not shared with other nodes. We now analyze in detail MASDA's defense against active attacks.

Replay Attacks.
Replay attacks destroy data freshness. Data freshness measures whether the collected data are recent: if the collected data fall within the query time period, the result is recent. Data freshness is violated if an adversary submits legitimate data from before or after the query time period to a querier. MASDA on its own is not resistant to replay attacks. However, in situations where data freshness is critical, two methods can be used as additional means of protection.
(i) Node adds a timestamp to the packet. BS asks the nodes to collect data for a time period t. The sending node records the time of sending and adds it to the data packet as a timestamp. When BS receives the data packet, it first checks the timestamp. If the timestamp is within t, BS accepts the packet. Otherwise, BS rejects the packet regardless of whether the data is correct.
(ii) BS uses different keys in different time periods. BS sends the nodes a new key for generating the MAC at regular intervals. Suppose an attacker tries to interfere with the data collection process by injecting old data, data_old, into the WSN. When data_old arrives at an aggregation node or BS, the receiver uses the new key to verify whether the MAC in data_old is correct. If it is not correct, the data packet is discarded.
5.2.2. Malleability Attack. Malleability is an undesirable property of cryptosystems that undermines the integrity of ciphertexts. In this attack, one ciphertext is transformed into another that decrypts to a related plaintext. We show that MASDA can resist such attacks. We assume that the attacker actively captures the information in the wireless signal and attempts to make BS decrypt a different plaintext by modifying certain bytes in the data. When the modified data arrives at BS, BS generates a MAC, mac_modify, from the modified data and compares it with the original MAC, mac_original. Since iHMAC adopts a linear calculation, different inputs lead to different outputs, so mac_modify differs from mac_original. BS can therefore detect the modification by comparing mac_modify with mac_original. The attackers can successfully change the ciphertext if and only if they can forge a valid MAC for the modified ciphertext. According to HMAC [31], forging such a MAC is difficult for an attacker.
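The two freshness countermeasures and the MAC comparison used against malleability can be sketched together as a single acceptance check on the receiving side. This is a minimal illustration only: it uses standard HMAC-SHA256 from the Python standard library as a stand-in for iHMAC, and the packet layout, the function names `make_packet` and `accept`, and the integer timestamps are assumptions made for the example, not part of MASDA.

```python
import hashlib
import hmac

def make_packet(key: bytes, ciphertext: bytes, timestamp: int) -> dict:
    """Sender side: bind the timestamp and ciphertext together under the MAC."""
    msg = timestamp.to_bytes(8, "big") + ciphertext
    tag = hmac.new(key, msg, hashlib.sha256).digest()
    return {"ts": timestamp, "ct": ciphertext, "mac": tag}

def accept(packet: dict, current_key: bytes, t_start: int, t_end: int) -> bool:
    """Receiver side: reject stale packets, then recompute and compare the MAC."""
    # (i) Timestamp check: the packet must fall inside the query period.
    if not (t_start <= packet["ts"] <= t_end):
        return False
    # (ii) Key rotation: only the key of the current period is used to verify.
    msg = packet["ts"].to_bytes(8, "big") + packet["ct"]
    expected = hmac.new(current_key, msg, hashlib.sha256).digest()
    # Malleability check: any modified byte yields a different MAC.
    return hmac.compare_digest(expected, packet["mac"])

# Hypothetical keys for two consecutive periods and a query period [100, 200].
key_old, key_new = b"period-1-key", b"period-2-key"

fresh = make_packet(key_new, b"ciphertext", 150)
stale = make_packet(key_new, b"ciphertext", 50)      # outside the query period
replayed = make_packet(key_old, b"ciphertext", 150)  # MAC made with the old key
tampered = dict(fresh, ct=b"ciphertexu")             # one byte modified in transit

print(accept(fresh, key_new, 100, 200))     # True
print(accept(stale, key_new, 100, 200))     # False
print(accept(replayed, key_new, 100, 200))  # False
print(accept(tampered, key_new, 100, 200))  # False
```

Including the timestamp in the MAC input is a design choice of this sketch: it prevents an attacker from rewriting the timestamp of an old packet to move it into the current query period.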

Unauthorized Aggregation.
Unauthorized aggregation is a particular weakness of homomorphic encryption schemes. The main idea of this attack is to combine multiple correct ciphertexts into a single forged but valid ciphertext to deceive the BS. If the CH only performs data aggregation, anyone can mislead the BS by dropping some packets to forge a wrong aggregation result.
In MASDA, each CH not only performs the aggregation operation but also generates a signature on the aggregation result. Thus, the BS can check the authenticity and integrity of the aggregated data sent by CHs. To perform unauthorized aggregation, the adversary must compromise at least one sensor node in order to obtain the keys of iECC and iHMAC. Our encryption mechanism is built on the asymmetric cryptosystem of elliptic curve encryption. Since it is infeasible in practice for an attacker to learn the secret details of the curve, attackers are unable to perform unauthorized aggregation.

Forged Packet Attack.
Forged packet attacks disrupt the data aggregation process by injecting fake data into the network, causing the aggregation result to deviate from the true value in a way that the base station cannot detect on its own. As a result, network resources are wasted. To detect spurious data injection or forged packets, MASDA appends a MAC to each ciphertext and each aggregated ciphertext. The MAC allows the origin of the messages generated by each sensor node to be verified. In MASDA, iECC is used for data encryption, and then iHMAC is used to prevent false data injection. After receiving the data, the aggregation node first verifies whether the data is reliable. If the verification is successful, the aggregated ciphertext and the aggregated MAC value are sent to BS, and the responsible node discards the ciphertext. After receiving the message, the BS first checks the integrity and decrypts the message if it passes the verification. Otherwise, BS considers that the information has been tampered with or is incomplete. Therefore, the scheme can effectively resist forged packet attacks.

Table 3: EC-OU, EC-EG, and MASDA key strength comparison (columns: Scheme; Key strength (bits)).
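To illustrate how an aggregated MAC can expose injected data, the following toy sketch uses a simple linear (additively homomorphic) MAC over a prime field. It is explicitly not the iHMAC construction of MASDA: the modulus `P`, the tag formula, the per-node pad derived with HMAC-SHA256, and all function names are assumptions chosen for illustration, and the sketch aggregates plain integers instead of iECC ciphertexts.

```python
import hashlib
import hmac

P = 2**127 - 1  # toy prime modulus (assumption, not a MASDA parameter)

def pad(k2: bytes, node_id: int, round_no: int) -> int:
    """Per-node, per-round pseudorandom pad derived from a shared key."""
    digest = hmac.new(k2, f"{node_id}:{round_no}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def tag(k1: int, k2: bytes, node_id: int, round_no: int, value: int) -> int:
    """Linear tag: k1 * value + pad (mod P)."""
    return (k1 * value + pad(k2, node_id, round_no)) % P

def aggregate(pairs):
    """Aggregation node: add values and tags separately, without any key."""
    total_value = sum(v for v, _ in pairs) % P
    total_tag = sum(t for _, t in pairs) % P
    return total_value, total_tag

def verify(k1, k2, node_ids, round_no, total_value, total_tag) -> bool:
    """Key holder: recompute the tag expected for the claimed sum."""
    pads = sum(pad(k2, i, round_no) for i in node_ids)
    return (k1 * total_value + pads) % P == total_tag

# Hypothetical keys and readings for three nodes in round 7.
k1, k2 = 123456789, b"shared-pad-key"
readings = {1: 21, 2: 19, 3: 25}
pairs = [(v, tag(k1, k2, i, 7, v)) for i, v in readings.items()]

total_value, total_tag = aggregate(pairs)
print(verify(k1, k2, readings.keys(), 7, total_value, total_tag))       # True
print(verify(k1, k2, readings.keys(), 7, total_value + 5, total_tag))   # False
```

Because the tag is linear in the value, the sum of the tags is a valid tag for the sum of the values, while a sum altered by an attacker who does not know `k1` no longer matches.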

Simulation Experiment and Analysis
In this section, we present the simulation experiments for MASDA. We employ OMNeT++ as the simulation platform, and the parameters are shown in Table 4. Four types of nodes are deployed in a WSN for four applications. MASDA is evaluated in terms of the computation overhead, the communication cost, and the aggregation accuracy.
6.1. Computation Overhead. Homomorphic encryption mechanisms based on symmetric keys consume fewer resources, whereas mechanisms based on asymmetric homomorphic encryption are more secure. EC-OU, EC-EG, and MASDA all include the encryption, the aggregation, and the decryption operations. However, they are based on different mathematical foundations. We follow the evaluation method defined in [27]: we first calculate the number of |q|-bit modular multiplications and then convert the different calculation operations into a number of basic units (1024-bit modular multiplications), which serves as the evaluation standard. The resulting computation overheads are shown in Figure 2. Figure 2 shows that EC-EG has the lowest computation overhead compared with EC-OU and MASDA because of its smaller modulus. Note that this comes at the price of lower encryption strength. EC-OU and MASDA choose the same modulus, so their computation overheads are at the same level. However, neither EC-EG nor EC-OU can support the data aggregation of different applications, while MASDA can be applied to multiple application scenarios. In terms of decryption, MASDA needs more computation than EC-OU and EC-EG. Generally speaking, BS is assumed to have unlimited resources, so the additional decryption overhead of MASDA does not place a significant burden on the overall lifetime of the WSN.

6.2. Communication Cost. The communication cost in WSNs is closely related to the length of the ciphertext. For MASDA, the number of applications served by a sensor node affects the length of the ciphertext. The length of the ciphertext in MASDA is (k + 1) × |q| bits, where k is the number of application types in a node. Therefore, the length of the ciphertext increases with the number of application types. In [34], the length of the ciphertext in EC-OU is 3 × |q| + 2 (|q| = 341 bits), and that of EC-EG is 2 × |q| + 2 (|q| = 163 bits). We choose |q| = 163 bits and |q| = 341 bits as the moduli of MASDA, and the ciphertext lengths of EC-OU, EC-EG, and MASDA are shown in Figure 3. Figure 3 shows that the ciphertext of EC-EG is the shortest. As k increases, the length of the ciphertext in MASDA also increases. Meanwhile, the ciphertext of MASDA is shorter than that of EC-OU when |q| = 163 bits and k < 5. If a node serves more than two applications with |q| = 341 bits, the ciphertext of MASDA is longer than those of EC-EG and EC-OU. However, neither EC-EG nor EC-OU can support the notable attributes of MASDA, such as the multiple application data aggregation and the recovery of aggregated data in BS. The results also show that there is a trade-off between the security and the length of the ciphertext in MASDA. If stronger security is the primary demand, a larger modulus is recommended. If the lifetime of the WSN is the major goal, a smaller modulus is practicable.
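The ciphertext lengths quoted above follow directly from the three formulas, so the comparison can be reproduced with a few lines of arithmetic. The function names below are hypothetical helpers introduced for this sketch; the formulas and moduli are those stated in the text.

```python
def masda_bits(k: int, q_bits: int) -> int:
    """MASDA ciphertext length: (k + 1) * |q| bits for k application types."""
    return (k + 1) * q_bits

def ec_ou_bits(q_bits: int = 341) -> int:
    """EC-OU ciphertext length: 3 * |q| + 2 bits."""
    return 3 * q_bits + 2

def ec_eg_bits(q_bits: int = 163) -> int:
    """EC-EG ciphertext length: 2 * |q| + 2 bits."""
    return 2 * q_bits + 2

print("EC-OU:", ec_ou_bits(), "bits")  # 1025
print("EC-EG:", ec_eg_bits(), "bits")  # 328
for k in range(1, 5):
    # Columns: k, MASDA with |q| = 163, MASDA with |q| = 341
    print(k, masda_bits(k, 163), masda_bits(k, 341))
```

With |q| = 163 bits, MASDA yields 326, 489, 652, and 815 bits for k = 1 to 4, all below the 1025 bits of EC-OU. With |q| = 341 bits, MASDA yields 1023 bits for k = 2 and 1364 bits for k = 3, so it exceeds both reference schemes once a node serves more than two applications.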
The energy cost is also correlated with the amount of data transmitted. We compared MASDA with EC-EG and EC-OU in terms of data transmission. Different numbers of sensor nodes (120, 150, and 180) are randomly scattered in a 100 m × 100 m square area with BS in the center. Each node belongs to a cluster, and the CH is determined according to the LEACH protocol. The probability of a node being selected as CH is 0.05. The simulation result of each mechanism is the average of 10 simulation rounds. In MASDA, a CH is allowed to aggregate the data forwarded by other CHs in order to fuse the data of different clusters into a single ciphertext. EC-EG and EC-OU can only aggregate a single application. CSMA is adopted as the media access control mechanism. The upper bound on retransmissions is five, and the initial energy of a node is E_i. Then, we can evaluate the energy cost of MASDA. Two energy models are usually adopted in data transmission, the free space model and the multipath attenuation model, as shown in [35].
The maximum frame of the media access control layer is 39 bytes, and its structure is depicted in Figure 4(a). Note that the bit error rate varies with the size of the data packets. Therefore, it is necessary to divide the aggregated ciphertext into slices and mark each slice with a sequence number. When k = 4 and |q| = 341, the ciphertext of MASDA is about 200 bytes long, and the application layer data packet (ciphertext and message authentication code) of MASDA can be divided into 8 data packets before sending. Each packet is at most 30 bytes, and a serial number is assigned to it. After receiving these small data blocks, BS can reconstruct the original data according to the assigned sequence numbers. Figure 4(b) shows the slicing pattern of the ciphertext. (EC-EG and EC-OU also divide their data into the same slices as MASDA.)
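The slicing and reassembly step can be sketched as follows. The exact frame layout is given in Figure 4 and is not reproduced here, so this sketch assumes a hypothetical layout in which each slice carries a one-byte sequence number followed by up to 29 bytes of payload, which keeps every slice within the 30-byte bound. The 230-byte example packet is an arbitrary stand-in for a ciphertext plus MAC.

```python
MAX_SLICE = 30                   # upper bound on slice size in bytes (from the text)
SEQ_BYTES = 1                    # assumed: one byte of sequence number per slice
PAYLOAD = MAX_SLICE - SEQ_BYTES  # payload bytes carried by each slice

def slice_packet(data: bytes) -> list[bytes]:
    """Split data into numbered slices of at most MAX_SLICE bytes each."""
    return [
        bytes([seq]) + data[offset:offset + PAYLOAD]
        for seq, offset in enumerate(range(0, len(data), PAYLOAD))
    ]

def reassemble(slices: list[bytes]) -> bytes:
    """Restore the original data, whatever order the slices arrived in."""
    ordered = sorted(slices, key=lambda s: s[0])
    return b"".join(s[1:] for s in ordered)

# Arbitrary 230-byte example packet (placeholder for ciphertext plus MAC).
packet = bytes(range(230))
slices = slice_packet(packet)

print(len(slices))                         # 8
print(max(len(s) for s in slices))         # 30
print(reassemble(slices[::-1]) == packet)  # True
```

A one-byte sequence number limits a packet to 256 slices in this sketch, which is far more than the 8 slices discussed above.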
Figure 5 shows the survival of nodes in networks with different numbers of sensor nodes. As the number of surviving nodes decreases over time, the number of packets sent in the whole network decreases gradually. The frame format of the media access layer of EC-OU and EC-EG is the same as that of MASDA, so the node survival of the three mechanisms is similar. In addition to using the same packet format, we also keep all sensor nodes alive so as to fairly compare the transmission volume of the three mechanisms. Figure 6 shows the total data transfer volume for the different mechanisms. We obtain the same ciphertext length by adjusting the modulus |q|. When k = 2, the length of the EC-EG ciphertext is 326 bits. To achieve the same ciphertext length, we set |q| = 107 in MASDA. When k > 2, we set |q| = 163 in MASDA. In this case, the key strength of MASDA is much higher than that of EC-EG. In addition, we also simulate the data transmission volume of MASDA under the same ciphertext length as EC-OU (n = 1024): when k = 2, |q| = 341; when k = 3, |q| = 257; and when k = 4, |q| = 203. The specific analysis is shown in Figure 6. When k = 2, the total amount of data transmission of EC-OU is the largest. However, the key strength of MASDA is higher than that of EC-OU and EC-EG. When k > 2, our scheme is significantly better than EC-EG and slightly lower than EC-OU. This is because the key length of MASDA exceeds 163 bits, and the cost of key generation inevitably increases. The results show that our scheme achieves a good compromise between energy consumption and security. According to the simulation results, the desired mechanism can be chosen according to the amount of data transmission and the security level required by different applications.
6.3. Aggregation Accuracy. In some traditional data aggregation (DA) mechanisms, the sensing data may be changed by the compression algorithm, and the data accuracy is an important indicator for evaluating the security of DA. EC-EG, EC-OU, and MASDA aggregate the sensing data using only the ciphertexts, and the raw sensing data are not compressed during the data aggregation. Therefore, SDA mechanisms based on homomorphic encryption can provide better accuracy. Meanwhile, the bit error rate is another factor that affects the accuracy of DA. In an ideal situation, we generally assume that there is no data conflict or packet loss in the network, and the accuracy of DA may reach 100%. However, data conflicts and packet loss are inevitable in practice, and the data accuracy varies across mechanisms. In [36], Li et al. proposed a metric for data accuracy and defined it as "the ratio of the actual sum of the original data to the sum of the data received by BS." We adopted this definition and deployed a simulation network with 120 sensors and a bit error rate of 5%. We compared the data accuracy of EC-EG, EC-OU, and MASDA, as shown in Table 5.
It can be seen that MASDA achieves the best accuracy among these mechanisms. However, the differences are not remarkable, because EC-EG, EC-OU, and MASDA transmit data packets in the same way. After the 240th round, the data accuracy tends to be stable and is mainly affected by the channel noise. At the same time, because the channel congestion in the transmission process decreases over time, the accuracy of the aggregated data also improves. Note that MASDA has other prominent advantages compared with EC-EG and EC-OU. It can be applied to multiple applications and provides confidentiality and integrity simultaneously. This indicates that we expand the application scope of SDA without losing security and effectiveness in terms of the encryption strength, the energy consumption, and the data accuracy.

Conclusion
In this paper, we discussed a multiple application secure data aggregation mechanism (MASDA) and its application in IoT. This scheme can encrypt the data of different applications in a single ciphertext and aggregate the encrypted data in a relay node in order to reduce the overhead, and it ensures the data integrity through the homomorphic message authentication code. MASDA provides the desired confidentiality and integrity, and the security analysis and simulation experiments show that it can maintain higher security, a longer lifetime, and better accuracy. Although our scheme may provide a solution for secure aggregation in WSNs, there are still many meaningful topics to be studied in the future. We should verify the impact of different packet loss rates on the aggregation accuracy and design a more robust homomorphic encryption scheme. Reducing the communication cost is also a major challenge for subsequent studies.

Data Availability
The datasets generated or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.