EPCT: An Efficient Privacy-Preserving and Collusion-Resisting Top-k Query Processing in WSNs

Data privacy threats arise when top-k query processing is provided in wireless sensor networks (WSNs). This article presents an efficient privacy-preserving and collusion-resisting top-k (EPCT) query processing protocol. A minimized candidate encrypted dataset determination model is first designed as the foundation of EPCT; it guides the query processing and guarantees the correctness of the protocol. Symmetric encryption with a different private key in each sensor protects the privacy of sensory data even when a few sensors in the network collude with adversaries. Based on this model and security setting, two phases of interaction between the queried sensors and the sink implement the secure query processing protocol. The security analysis shows that the proposed protocol provides top-k queries with privacy protection and collusion resistance, and the experimental results indicate that it outperforms existing works in communication overhead.


Introduction
Wireless sensor networks (WSNs), as one of the important technologies in the Internet of Things (IoT), have been widely deployed to provide practical solutions in various applications, such as environment monitoring, military target sensing, and smart homes. Meanwhile, data privacy leakage in WSNs is becoming the main obstacle to their further development. For example, in a smart home application, videos or pictures collected by wireless IP cameras could be eavesdropped for illegal profit. As a result, privacy protection for sensitive data is a critical issue that must be addressed in WSNs.
In WSNs, the top-k query is one of the critical data aggregation operations in sensor monitoring. The top-k query requests the k lowest or highest data items collected from IoT sensors in WSNs. For example, "collecting the 10 lowest humidity data in forest area A-Z in the last 2 hours" is a top-k query that can be performed for fire monitoring. The aim of this work is to design a secure top-k query approach that is privacy-preserving and collusion-resisting.
This article presents an efficient privacy-preserving and collusion-resisting top-k query processing protocol (EPCT) in WSNs. We first propose a minimized candidate encrypted dataset determination model, which is the foundation of our proposed protocol. It guides the query processing and guarantees the correctness of the protocol. There are two phases of interaction between the queried sensors and the sink in EPCT. In the first phase, when the queried sensors receive a top-k query from the sink, each of them uses its own private key to encrypt the maximum of the data it collected in the queried time slot and submits the encrypted data to the sink. In the second phase, the sink decrypts the received ciphertext and determines the candidate sensors; after that, it informs the candidate sensors by unicast to submit the remaining candidate data. Once the sink obtains enough data from the candidate sensors, the final result of the query is determined. The security analysis and performance evaluation indicate that EPCT protects data privacy and is efficient in transmission overhead. The main contributions of this article are listed as follows:
(i) We present a minimized candidate encrypted dataset determination model, which is the foundation of our proposed scheme. It guides the query processing and guarantees the correctness of the protocol.
(ii) We present a novel privacy-preserving and collusion-resisting top-k query processing protocol, which consists of two phases of secure interaction between the queried nodes and the sink. We also analyse the correctness, security, and transmission overhead of the proposed method.
(iii) We evaluate the transmission overhead of the proposed protocol and the existing works. The experimental results show the advantages of the proposed scheme in transmission overhead.
The remainder of this article is organized as follows. Section 2 discusses the related work. Section 3 introduces the network model, query model, threat model, and the problem description. Section 4 proposes the minimized candidate encrypted dataset determination model. Section 5 presents the top-k query processing protocol and its analysis. Section 6 presents the performance evaluation of query protocols on communication cost, and Section 7 concludes this article.

Related Work
Secure data queries (such as top-k queries, range queries, and MAX/MIN queries) are critical operations for sensor monitoring and data collection in security-sensitive environments.
There are many works focusing on confidentiality, integrity, and completeness when performing data queries.
Kui et al. [3] utilize pairwise keys and order-preserving symmetric encryption together to protect the privacy of data in top-k queries in two-tiered WSNs. Peng et al. [4] encoded both sensory data and top-k query commands, and storage nodes are designed to correctly perform top-k queries over those encoded data. Li et al. [6] use a pseudorandom hash function with a bloom filter and a partition algorithm to protect data privacy and integrity for top-k queries, respectively. Tsou et al. [7] constructed a layered authentication tree by order-preserving symmetric encryption and used it to verify the completeness of query results. Zhang et al. [12] designed a renormalized arithmetic coding method such that storage nodes can calculate exact top-k query results without knowing the real values of data, and they proposed a verification scheme to detect compromised storage nodes. Peng et al. [13] encoded top-k queries by a threshold-based scheme and proposed a secure protocol in which storage nodes can calculate query results over encrypted sensory data. Xingpo et al. [14] proposed a secure top-k query protocol with privacy and integrity preservation by deploying secure data preprocessing in sensor nodes. Wu and Wang [17] bound the collected sensory data with the corresponding locations to achieve secure top-k query processing on hybrid sensory data. Liu et al. [18] proposed a verifiable top-k query protocol for two-tiered mobile sensor networks, which adopts distinct symmetric data encryption and maps real nodes into virtual nodes. These methods are designed for two-tiered WSNs, which add resource-rich storage nodes to traditional multihop WSNs. The different network architecture makes them unsuitable for addressing secure top-k queries in traditional multihop WSNs.
In traditional multihop WSNs, the earlier studies [19, 25-29] proposed various top-k query schemes but without considering any security issues. Huang et al. [30] designed a privacy-protection top-k query algorithm using a filter and a data distribution table.
The algorithm adopts a conic section function to protect the privacy of the sensory data. However, the algorithm is vulnerable to collusion attacks because all sensor nodes share the same secure keys and functions. If a sensor node colludes with adversaries, these secure keys and functions will be disclosed, and the adversaries could obtain the private data of other innocent sensors. In our previous work [31], we gave the first solution providing a privacy-protecting and anticollusion top-k query processing scheme in wireless sensor networks. It adopts the bloom filter and HMAC in the interactions between nodes and the sink to achieve secure top-k query processing. However, there is room for saving transmission overhead because of the redundant data submission and the false positives of the bloom filter. This article presents an efficient and secure top-k query processing protocol, which addresses the above problems.
Additionally, some previous studies have focused on the privacy-preserving range queries [8,11,32] and MAX/MIN queries [15,16] in WSNs. Because the query types are different, the ideas of these works cannot be applied to achieve the secure top-k queries in WSNs.

Network Model.
The adopted architecture is shown in Figure 1. The network routing topology is structured as a tree, following the TAG protocol [33]. In our scenario, $n$ sensors $S = \{s_1, s_2, \ldots, s_n\}$ and a sink are deployed. Sensor nodes are sensory devices with limited resources in energy, storage, and computation. They are in charge of collecting data items from their neighboring areas and then submitting the collected data to the sink through the tree route. The sink is a resourceful device, which executes query commands from users and returns query results to users. When receiving a query command, the sink cooperates with the queried sensors in $S$ to process queries according to predeployed protocols. After the sink obtains the query result, it returns the result to the upper-level users.

Top-k Query Model.
A top-k query is a data aggregation operation to get the $k$ highest or lowest sensory data items from the queried sensors. It is denoted as a triple $Query_t = (t, S, k)$, where $t$ is the queried time slot identity, $S$ is the set of interested sensors, and $k$ is the number of interested data items. For example, $(t, \{s_1, s_2, \ldots, s_{12}\}, 3)$ is a top-3 query to obtain the 3 highest or lowest data items among sensors $s_1, s_2, \ldots, s_{12}$ in the time slot $t$. Each sensor $s_i \in S$ is assumed to collect $N$ data items in a time slot, denoted as $D_i = \{d_{i,1}, d_{i,2}, \ldots, d_{i,N}\}$, and each data item collected by a sensor is assumed to have a unique score. The uniqueness of collected data items can be achieved by integrating the data collecting time and the sensor identity into the data item score calculation. It ensures the uniqueness and correctness of a top-k query result.
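As a plain, unencrypted reference point for the query model above, the following minimal sketch evaluates a top-k query over the union of all sensors' readings. The function name `top_k`, the dictionary layout, and the sample readings are assumptions made for illustration; they do not come from the article. Pairing each value with its sensor ID is one simple way to keep scores unique.

```python
import heapq
from typing import Dict, List, Tuple


def top_k(readings: Dict[int, List[float]], k: int,
          highest: bool = True) -> List[Tuple[float, int]]:
    """Return the k highest (or lowest) (value, sensor_id) pairs.

    `readings` maps a hypothetical sensor ID to the data items that sensor
    collected in the queried time slot.
    """
    items = [(value, sid) for sid, values in readings.items() for value in values]
    pick = heapq.nlargest if highest else heapq.nsmallest
    return pick(k, items)


# Illustrative data: three sensors, three readings each.
readings = {
    1: [20.5, 18.0, 22.1],
    2: [30.2, 12.4, 25.0],
    3: [19.9, 27.3, 11.1],
}
print(top_k(readings, 3))                  # three highest readings
print(top_k(readings, 3, highest=False))   # three lowest readings
```

This baseline needs every plaintext reading in one place, which is exactly what the secure protocol is designed to avoid.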

Threat Model.
The honest-but-curious threat model [9] is adopted in this article. The sink is trusted, while sensors could collude with adversaries to leak their collected or forwarded data. However, the sensors that have been attacked still perform the predeployed protocols and cooperate with other innocent (noncompromised) sensors to process query commands. Note that the innocent sensors are assumed to be the majority in WSNs; otherwise, the network would be useless. The goals of the proposed secure top-k query protocol are as follows:
(1) A sensor only owns the data collected by itself, and the data can be shared with the sink. It has no knowledge of the data collected by other sensors, even when it is colluding with the adversaries.
(2) Query results can only be obtained by the sink; the adversaries have no knowledge of them, even when a few compromised sensors are colluding with the adversaries.
(3) The $k$ data items obtained by the sink are the $k$ highest or lowest data items collected by the queried sensors, which means that the query result is correct.
Because sensors have limited energy, the network lifetime is usually determined by the energy consumption of the sensors. Reference [33] shows that sensors consume most of their energy in data transmission. Thus, the transmission overhead of the network is an important metric for performance evaluation. We evaluate this metric in Section 6.

Minimized Candidate Encrypted Dataset Determination Model
To make the proposed protocol efficient in transmission overhead, we propose the minimized candidate encrypted dataset determination model in this section.

Minimized Candidate Sensor Set.
Let $Query_t = (t, S, k)$ be a query command, where each sensor $s_i \in S$ collects $N$ data items in a time slot. The set of collected data of all sensors in $S$ is $D = \bigcup_{s_i \in S} D_i$.

Definition 1. For a top-k query, the query result $R_t$ is the dataset consisting of the $k$ largest data items of $D$. $L(R_t)$ denotes the lower bound of $R_t$, which is the minimum of $R_t$.

Definition 3. For a top-k query, $\Phi$ is the sensor set consisting of the $k$ sensors whose in-node-maximums are the $k$ largest in-node-maximums of sensors in $S$, that is, $|\Phi| = k$ and $d_{i,1} > d_{j,1}$ for all $s_i \in \Phi$ and $s_j \in S - \Phi$, where $d_{i,1}$ denotes the in-node-maximum of $s_i$. (1)

Proof. According to Definition 1, $L(R_t)$ is the lower bound of $R_t$, which is the $k$th largest data item of $D$. Because $|\Phi| = k$, there are $k$ in-node-maximums of sensors of $\Phi$, that is, □

Lemma 2. $\Phi$ is the candidate sensor set of a query, which means that all data in the query result $R_t$ are contributed by sensors of $\Phi$, that is, $R_t \subseteq \bigcup_{s_i \in \Phi} D_i$.

Proof. We give the proof by contradiction. Assume that there is at least one data item of $R_t$ that is not contributed by a sensor of $\Phi$. It means that there exists $x \in R_t$, where $x$ is collected by $s_j$ and $s_j$ is not in $\Phi$, i.e., $s_j \in S - \Phi$. Assume that $x$ is the $l$th largest data item of $D$. Then, we can deduce two results:
1. $1 \le l \le k$ holds because $x \in R_t$ and $|R_t| = k$.
2. According to the definition of $\Phi$, for $\forall y \in \{d_{i,1} \mid s_i \in \Phi\}$, because $x$ is assumed to be collected by $s_j \in S - \Phi$, we have $y > d_{j,1} \ge x$, where $d_{j,1}$ is the in-node-maximum of $s_j$. Additionally, there are $k$ in-node-maximums contributed by sensors in $\Phi$, i.e., $|\{d_{i,1} \mid s_i \in \Phi\}| = k$. Therefore, we can deduce that $l > k$ holds.

Security and Communication Networks
Obviously, results 1 and 2 contradict each other. As a result, we deduce that Lemma 2 holds. Lemma 2 indicates that all sensors in $\Phi$ are candidate sensors, which contribute to the query result. In addition, $\Phi$ is also the minimized candidate sensor set, which we prove in Lemma 3. □

Lemma 3. $\Phi$ is the minimized candidate sensor set that contributes the query result $R_t$.
Proof. To prove this lemma, we have to prove the following two observations.
Observation 2. Any sensor deletion from $\Phi$ could incur the incompleteness of the query result. If the two observations hold simultaneously, then we can deduce that $\Phi$ is the minimized candidate sensor set that contributes the query result.
Proof of Observation 1. According to Definition 3, for $\forall s_i \in \Phi$ and $\forall s_j \in S - \Phi$, $d_{i,1}$ and $d_{j,1}$ are their in-node-maximums and $d_{i,$ □

Proof of Observation 2. To prove the second observation, we just need to prove that, for any sensor of $\Phi$, its collected data could belong to the query result $R_t$. If this is true, then deleting any sensor from $\Phi$ could cause the incompleteness of $R_t$. Assume that the collected data of sensors of $\Phi$ satisfy: . Because $|\Phi| = k$, the top-k query result $R_t$ is determined and $R_t = \{d_{p,1} \mid s_p \in \Phi\}$. It means that the in-node-maximums of all sensors of $\Phi$ are exactly the elements of $R_t$. Obviously, in such a circumstance, deleting any sensor from $\Phi$ will incur the incompleteness of $R_t$. Therefore, the second observation is proved.
According to the proofs, the above two observations both hold. Thus, $\Phi$ is the minimized candidate sensor set that contributes the query result.
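The selection of the candidate sensor set can be sketched as follows, assuming the sink already holds every sensor's in-node-maximum in plaintext. The helper name `candidate_sensors` and the sample data are hypothetical; the final check simply confirms, on this one example, that every item of the true top-k result comes from a sensor in the selected set, as Lemma 2 states.

```python
from typing import Dict, List


def candidate_sensors(maxima: Dict[int, float], k: int) -> List[int]:
    """Return the IDs of the k sensors with the largest in-node-maximums,
    ordered from the largest maximum to the smallest."""
    ranked = sorted(maxima, key=lambda sid: maxima[sid], reverse=True)
    return ranked[:k]


# Illustrative data: four sensors, three items each, all values distinct.
data = {
    1: [9, 8, 7],
    2: [6, 5, 4],
    3: [3, 2, 1],
    4: [10, 0.5, 0.2],
}
k = 2
maxima = {sid: max(values) for sid, values in data.items()}
phi = candidate_sensors(maxima, k)

# True top-k over all data, and the sensors that contributed to it.
all_items = sorted((v for values in data.values() for v in values), reverse=True)
result = all_items[:k]
contributors = {sid for sid, values in data.items() if any(v in result for v in values)}

print(phi, result, contributors)
```

A sensor in the candidate set does not necessarily contribute to a particular result; the lemma only guarantees that no sensor outside the set does.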

Minimized Candidate Encrypted Dataset.
To protect data privacy, each sensor holds a private key of its own. When a query is started, sensors first encrypt the qualified data with their keys and then submit the encrypted data to the sink. For sensor $s_i$, we assume that its key is $g_i$, which is shared only by $s_i$ and the sink. The encrypted data of $d_{i,j}$ is denoted as $(d_{i,j})_{g_i}$.

Definition 4 (minimized candidate encrypted dataset). For a top-k query, the minimized candidate encrypted dataset, denoted as $\Gamma$, is contributed by sensors of $\Phi$ and consists of the minimum number of encrypted data items that contain the encrypted query result.
We assume that the candidate sensors are $\Phi = \{s_1, s_2, \ldots, s_k\}$ and their in-node-maximums are $d_{1,1}, d_{2,1}, \ldots, d_{k,1}$, respectively, where $d_{1,1} > d_{2,1} > \cdots > d_{k,1}$. For any sensor $s_i \in \Phi$, its collected data items are $d_{i,1} > d_{i,2} > \cdots > d_{i,N}$. Thus, the calculation of $\Gamma$ is given as follows:

$\Gamma = \bigcup_{i=1}^{k} \Gamma_i$, (3)

where

$\Gamma_i = \{(d_{i,j})_{g_i} \mid 1 \le j \le k - i + 1\}$. (4)

We give an example to describe the minimized candidate encrypted dataset. As shown in Figure 2, we assume that there are 5 nodes $s_1, s_2, s_3, s_4, s_5$, and each sensor has collected 4 data items.
Their in-node-maximums satisfy $d_{1,1} > d_{2,1} > \cdots > d_{5,1}$. For sensor $s_i$, its collected data satisfy $d_{i,1} > d_{i,2} > d_{i,3} > d_{i,4}$. According to Definition 4, the minimized candidate encrypted datasets when $k = 3$ and $k = 5$ are shown in the dotted-lined area and solid-lined area, respectively.

Lemma 4. $\Gamma$ is the minimized candidate encrypted dataset that contains the encrypted query result.
Proof. To prove this lemma, the following two observations need to be proved.
Observation 3. For any $(d_{i,j})_{g_i} \notin \Gamma$, which is generated by $s_i$, $L(R_t) > d_{i,j}$ holds.
Observation 4. Any encrypted data deletion from $\Gamma$ could incur the incompleteness of the query result. If the two observations hold simultaneously, then we can deduce that $\Gamma$ is the minimized candidate encrypted dataset that contains the encrypted query result.
Proof of Observation 3. For sensor $s_i$, there are two alternative cases, $s_i \notin \Phi$ or $s_i \in \Phi$. We give the proofs for the two cases:
(i) Case I: $s_i \notin \Phi$. According to Lemma 3, $\Phi$ is the minimized candidate sensor set that contributes the query result $R_t$. Because $s_i \notin \Phi$, none of its data belongs to $R_t$; hence, for any $d_{i,j} \in D_i$, $L(R_t) > d_{i,j}$ is deduced.
(ii) Case II: $s_i \in \Phi$. Because $(d_{i,j})_{g_i} \notin \Gamma$, $k - i + 2 \le j \le N$ is deduced according to equation (4). In the calculation of $\Gamma$, $d_{1,1} > d_{2,1} > \cdots > d_{i,1} > \cdots > d_{k,1}$ and $d_{i,1} > d_{i,2} > \cdots > d_{i,j} > \cdots > d_{i,N}$ are the given assumptions. Thus, there are at least $k' = i + j - 2$ data items larger than $d_{i,j}$: the $i - 1$ in-node-maximums $d_{1,1}, \ldots, d_{i-1,1}$ and the $j - 1$ items $d_{i,1}, \ldots, d_{i,j-1}$. According to $k - i + 2 \le j \le N$ and $k' = i + j - 2$, we have $k' \ge k$. It means that there are at least $k$ data items larger than $d_{i,j}$. Definition 1 shows that the query result $R_t$ consists of the $k$ largest data items, so the minimum of $R_t$ is larger than $d_{i,j}$, that is, $L(R_t) > d_{i,j}$.
The deductions in the two cases both lead to the same result $L(R_t) > d_{i,j}$, and the first observation is proved.

□
Proof of Observation 4. To prove the second observation, we just need to prove that, for any $(d_{i,j})_{g_i} \in \Gamma$, the corresponding plaintext data $d_{i,j}$ could belong to $R_t$. If this is true, then deleting any encrypted data from $\Gamma$ could cause the incompleteness of $R_t$. According to the assumptions of the calculation of $\Gamma$, the minimized candidate sensor set is $\Phi = \{s_1, s_2, \ldots, s_k\}$, where their in-node-maximums satisfy $d_{1,1} > d_{2,1} > \cdots > d_{k,1}$ and the collected data of any According to the calculation of $\Gamma$ in equations (3) and (4), we have $j \le k - i + 1$; then $|C| \le k - 1$ holds. It means that $d_{i,j}$ is at least the $k$th largest data item when equation (5) holds. In such a scenario, $d_{i,j}$ always belongs to $R_t$. Therefore, deleting any encrypted data from $\Gamma$ could cause the incompleteness of $R_t$. Observation 4 is proved.
According to the above proofs, the two observations both hold. Thus, $\Gamma$ is the minimized candidate encrypted dataset that contains the encrypted query result. Lemma 4 is proved.
Lemma 4 indicates that $\Gamma$ is the minimized candidate encrypted dataset that contains the encrypted query result. It is the key to achieving an efficient privacy-preserving query processing method.
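The shape of the candidate dataset can be illustrated with a short sketch. It assumes, consistent with the bounds $j \le k - i + 1$ and $k - i + 2 \le j \le N$ used in the proofs above, that the candidate sensor ranked $i$th by in-node-maximum contributes its $k - i + 1$ largest items. The function names and the random test data are hypothetical, and the sketch works on plaintext values only (encryption is left out).

```python
import random
from typing import Dict, List, Tuple


def gamma_indices(k: int) -> List[Tuple[int, int]]:
    """1-based (i, j) pairs: the j-th largest item of the i-th ranked candidate sensor."""
    return [(i, j) for i in range(1, k + 1) for j in range(1, k - i + 2)]


def gamma_plain(data: Dict[int, List[int]], k: int) -> List[int]:
    """Plaintext items that the candidate dataset would cover for a top-k query."""
    # Rank sensors by in-node-maximum and keep the k best (the candidate sensor set).
    ranked = sorted(data.values(), key=max, reverse=True)[:k]
    selected: List[int] = []
    for i, values in enumerate(ranked, start=1):
        selected.extend(sorted(values, reverse=True)[: k - i + 1])
    return selected


# For k = 3 the dataset holds 3 + 2 + 1 = 6 items, i.e. k(k + 1)/2.
print(gamma_indices(3))

# Randomised check on distinct values: the true top-k always lies inside the dataset.
rng = random.Random(7)
values = rng.sample(range(1000), 40)
data = {sid: values[sid * 5:(sid + 1) * 5] for sid in range(8)}
k = 4
true_top_k = sorted(values, reverse=True)[:k]
print(set(true_top_k) <= set(gamma_plain(data, k)))
```

The triangular index pattern is what keeps the number of submitted items independent of the number of queried sensors once the candidate sensors are known.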

Top-k Query Processing
This section first introduces the efficient privacy-preserving and collusion-resisting top-k (EPCT) query scheme. Then, the correctness, security, and communication cost of the proposed EPCT protocol are analysed.

Query Processing Protocol.
The queried nodes and the sink cooperate in the EPCT protocol. To perform the protocol, sensors and the sink are first provisioned with keys at network deployment. Each sensor is assigned a private key, which it shares only with the sink. The sink owns the keys of all sensors, whereas sensors have no knowledge of each other's keys. The protocol has two phases, shown in Figure 3. In the first phase, after the sink receives a top-k query $Query_t = (t, S, k)$ from the user, it broadcasts the command to the sensors in $S$. Once the sensor $s_i$ gets $Query_t$, it transmits the encrypted in-node-maximum of the queried time slot $t$ to the sink. When the first phase ends, the second phase begins. In the second phase, the minimized candidate sensor set is determined according to the maximum values of the queried sensors. Then, the sink transmits the second-phase data request command to the candidate sensors. After each candidate sensor submits the qualified encrypted data, the sink obtains the minimized candidate encrypted dataset and gets the final query result after decryption. The processing of the top-k query $Query_t$ is then finished. The detailed procedures of the query processing protocol are shown in Protocol 1.

Protocol 1. The EPCT protocol proceeds as follows:
(1) Phase 1:
    (1) When a query $Query_t = (t, S, k)$ arrives, the first phase starts. The sink broadcasts $Query_t$ through the whole network and initializes the dataset $\Gamma = \emptyset$. Then, it waits for the first-phase responses from the queried nodes in the network.
    (2) Each node $s_i \in S$, after it gets $Query_t$, encrypts its in-node-maximum $d_{i,1}$ with its private key $g_i$. Then, $s_i$ generates the encrypted data $(d_{i,1})_{g_i}$ and submits the following message to the sink:
        $s_i \to sink: \langle t, id(s_i), (d_{i,1})_{g_i} \rangle$.
(2) Phase 2:
    (1) When the submitted message $\langle t, id(s_i), (d_{i,1})_{g_i} \rangle$ from a queried sensor $s_i \in S$ arrives, the sink decrypts $(d_{i,1})_{g_i}$ with the shared private key $g_i$ and gets the plaintext in-node-maximum of $s_i$. After the sink obtains all the decrypted in-node-maximums of the nodes in $S$, $\{d_{i,1} \mid s_i \in S\}$, it determines the top-k data among them. Suppose the determined top-k data are $d_{1,1} > d_{2,1} > \cdots > d_{k,1}$ and the corresponding sensor list in descending order of data is $\Phi = \{s_1, s_2, \ldots, s_k\}$. According to Lemma 3, $\Phi$ is the minimized candidate sensor set. Then, the sink appends $d_{1,1}, d_{2,1}, \ldots, d_{k,1}$ into $\Gamma$ and transmits the following messages to the $k - 1$ candidate nodes in $\Phi - \{s_k\}$ in unicast mode:
        $sink \to s_i: \langle t, (k - i)_{g_i} \rangle, \forall s_i \in \Phi - \{s_k\}$. (7)
    (2) For each candidate node $s_i \in \Phi - \{s_k\}$, when the message $\langle t, (k - i)_{g_i} \rangle$ arrives, $s_i$ decrypts the ciphertext and gets the plaintext number $k - i$. Then, $s_i$ encrypts $k - i$ collected data items and sends them to the sink, i.e.,
        $s_i \to sink: \langle t, i, LR_i \rangle$,
        where $LR_i = \{(d_{i,j})_{g_i} \mid 2 \le j \le k - i + 1\}$.
    (3) The sink obtains the message $\langle t, i, LR_i \rangle$ transmitted by the candidate node $s_i \in \Phi - \{s_k\}$ in the second phase and decrypts the ciphertext of the message. The plaintext data after decryption are denoted as $Dec(LR_i, g_i)$ and appended into $\Gamma$. After all messages submitted from the candidate nodes are processed, the minimized candidate dataset $\Gamma$ is determined, where $\Gamma = \{d_{1,1}, d_{2,1}, \ldots, d_{k,1}\} \cup \bigcup_{s_i \in \Phi - \{s_k\}} Dec(LR_i, g_i)$.
    (4) The sink gets the top-k data of $\Gamma$, which is the exact query result $R_t$.
As presented in Protocol 1, the query command $Query_t$ arrives from the user in the first phase, and the sink broadcasts it through the whole network. When a queried sensor receives $Query_t$, it encrypts its in-node-maximum and transmits the encrypted data to the sink. In the second phase, the sink decrypts the received ciphertext to obtain the in-node-maximums of the queried sensors. Afterwards, the sink uses the in-node-maximums to determine the candidate sensor set $\Phi - \{s_k\}$ and sends a unicast message to each candidate sensor in $\Phi - \{s_k\}$. Once a candidate sensor receives the unicast message, it submits the remaining requested data in ciphertext to the sink. When the sink has obtained all the needed data from the candidate nodes, the query result is determined.
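The two phases described above can be simulated end to end in a few lines. This is only a sketch under stated assumptions: the article does not name a concrete symmetric cipher, so the code uses a toy SHA-256 keystream XOR as a stand-in (it is not a vetted cipher and must not be used in practice), the helper names, keys, and nonce format are hypothetical, and the encrypted request count of the second phase is passed as a plain integer for brevity.

```python
import hashlib
import struct
from typing import Dict, List


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream built from SHA-256; a stand-in for a real symmetric cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + struct.pack(">I", counter)).digest()
        counter += 1
    return out[:n]


def enc(key: bytes, nonce: bytes, value: float) -> bytes:
    plain = struct.pack(">d", value)
    return bytes(a ^ b for a, b in zip(plain, _keystream(key, nonce, len(plain))))


def dec(key: bytes, nonce: bytes, cipher: bytes) -> float:
    plain = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return struct.unpack(">d", plain)[0]


def epct(data: Dict[int, List[float]], keys: Dict[int, bytes], t: int, k: int) -> List[float]:
    """Simulate both phases; assumes at least k queried sensors with distinct values."""
    def nonce(sid: int, j: int) -> bytes:
        return f"{t}:{sid}:{j}".encode()

    ordered = {sid: sorted(vals, reverse=True) for sid, vals in data.items()}

    # Phase 1: every queried sensor submits its encrypted in-node-maximum.
    phase1 = {sid: enc(keys[sid], nonce(sid, 1), vals[0]) for sid, vals in ordered.items()}

    # Sink: decrypt the maxima and pick the k sensors with the largest ones.
    maxima = {sid: dec(keys[sid], nonce(sid, 1), c) for sid, c in phase1.items()}
    phi = sorted(maxima, key=maxima.get, reverse=True)[:k]
    gamma = [maxima[sid] for sid in phi]

    # Phase 2: the i-th candidate (i = 1..k-1) submits its next k - i items.
    for i, sid in enumerate(phi[:-1], start=1):
        last = min(k - i + 1, len(ordered[sid]))
        reply = [enc(keys[sid], nonce(sid, j), ordered[sid][j - 1]) for j in range(2, last + 1)]
        gamma += [dec(keys[sid], nonce(sid, j), c) for j, c in enumerate(reply, start=2)]

    return sorted(gamma, reverse=True)[:k]


data = {1: [5.0, 4.5, 1.0], 2: [9.0, 8.5, 8.0], 3: [7.0, 2.0, 0.5], 4: [3.0, 2.5, 2.2]}
keys = {sid: f"hypothetical-key-{sid}".encode() for sid in data}
print(epct(data, keys, t=1, k=3))
```

In this example sensor 2 supplies all three result items, which shows why the top-ranked candidate must be asked for up to $k$ items while the lowest-ranked one only needs to send its maximum.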

Correctness Analysis.
In the proposed EPCT protocol, when a user starts a query command $Query_t$, the sink obtains the minimized candidate encrypted dataset $\Gamma$ after the two phases of interaction between the sink and the sensors. $\Gamma$ contains the encrypted data items of the query result. According to Lemma 4, any $(d_{i,j})_{g_i} \notin \Gamma$ does not belong to the query result $R_t$. Additionally, $\Gamma$ is the minimized candidate encrypted dataset that contains the encrypted query result: any encrypted data deletion from $\Gamma$ could incur the incompleteness of the query result. Once $\Gamma$ is received by the sink, the sink can get the query result by taking the top-k data of $\Gamma$. Therefore, our proposed scheme guarantees the correctness of the top-k query result.

Security Analysis.
The security analysis covers the privacy of the collected data and of the query results. In EPCT, each node is deployed with a private key, which is shared only with the sink. The collected data of sensors appear in the network only during data submission from sensors to the sink. When a top-k query is started, two phases of query processing are performed. In the first phase, each sensor uses symmetric encryption to encrypt its in-node-maximum and then transmits it to the sink. In the second phase, the candidate nodes are informed by the sink in unicast mode; they encrypt the requested number of collected data items and send the encrypted data to the sink. Clearly, all data transmitted through the network are in ciphertext. Every node in the WSN owns a unique private key, so it can access only the data it collected itself; it cannot learn the data collected by other sensors because breaking the symmetric encryption is computationally infeasible. Even if a few nodes are compromised and collude with adversaries, the adversaries can snoop only the collected data of those colluding sensors and learn nothing about the collected data of innocent sensors. Besides, because the query result is decrypted and computed in the sink and sensor nodes only process encrypted data for the query, it is hard for the attackers to learn the plaintext query result even if a few compromised sensors collude with them. Therefore, EPCT is a privacy-preserving and anticollusion top-k query processing protocol, which protects the privacy of the collected data of sensors even when a few compromised sensors are in collusion with the adversaries.

Communication Cost Analysis.
In WSNs, sensors have limited energy resources, and the energy is mainly consumed by communication. During the top-k query procedures, the communication cost of the network is mainly caused by the transmission overhead of sensors. The parameters used in sensor networks are introduced in Table 1.
We assume that the transmission overheads of phase 1 and phase 2 are $C_1$ and $C_2$, respectively. According to the proposed EPCT protocol, all sensors participate in phase 1, whereas only the candidate sensors participate in phase 2. In phase 1, each of the $n$ sensors receives the query command and returns one message carrying a time slot, a sensor ID, and one encrypted data item over an average path of length $L$. Then, we obtain

$C_1 = n \cdot l_q + n \cdot (l_{id} + l_t + l_c) \cdot L$.

The total communication overhead of the whole network is computed as follows:

$C_{total} = C_1 + C_2$.

Performance Evaluation
Based on the improved simulator of [34], we implement three protocols: EPCT, PCTQ [31], and a naive protocol (Naive). In the Naive scheme, each queried node first encrypts its $k$ highest data items and then submits them to the sink. After the sink gets all the ciphertext from the sensors, it decrypts them to obtain the final query result. The performance is evaluated by the communication overhead in WSNs. The experiment is conducted on a PC with an AMD R5-3600 CPU (6 cores, 12 threads, 4.2 GHz) and 32 GB RAM, running 64-bit Windows 10 Professional and Java JDK 1.8. In the simulation, we generate 10 networks with random topologies, each identified by a different network ID. In each network, sensors are randomly distributed in an area of $200 \times 200\ \mathrm{m}^2$, and the communication radius of a sensor is 6 m. The collected data of sensors are randomly generated in each time slot. The network communication cost $C_{total}$ is measured as the average result over these 10 networks. The default settings of the other parameters are shown in Table 2.
(1) $C_{total}$ versus Network ID. Figure 4 shows that the transmission overhead of each method is distributed uniformly across the different networks. Naive has a much higher cost than PCTQ and EPCT. Statistically, the communication overhead of EPCT is on average 89.06% and 43.23% lower than that of Naive and PCTQ, respectively.
(2) $C_{total}$ versus $l_c$. Figure 5 shows that the communication overhead of EPCT, PCTQ, and Naive increases as the space size of an encrypted data item $l_c$ increases. The reason is that the transmission overhead of all three approaches is in proportion to the space size of an encrypted data item. The growth rates of communication overhead in EPCT and PCTQ are smaller than that in Naive. Statistically, EPCT reduces transmission overhead by about 89.14% and 38.32% compared with Naive and PCTQ, respectively.
(3) $C_{total}$ versus $n$. Figure 6 shows that the communication overhead of the three schemes grows as the number of sensors $n$ increases. The reason is that the more sensors are queried, the more data are transmitted in the network, i.e., the higher the communication cost. Moreover, the curves in Figure 6 show that the growth rate of transmission overhead in Naive is significantly higher than that in PCTQ and EPCT. Statistically, EPCT saves about 89.51% and 42.00% communication overhead compared with Naive and PCTQ, respectively.
(4) $C_{total}$ versus $k$. As shown in Figure 7, the transmission overhead of all three methods increases as the number of requested data items $k$ increases. The reason is that, when $k$ increases, more data items are requested in all three protocols. The growth rates of communication cost in PCTQ and EPCT are both lower than that in Naive. Specifically, EPCT saves about 93.44% and 44.57% on average in communication cost compared with Naive and PCTQ, respectively.

Table 1: Parameters.
Para       Description
$l_{id}$   The space size of a sensor ID
$l_t$      The space size of a time slot
$l_c$      The space size of a coded data item
$l_q$      The space size of a query command
$L$        The average path length from sensors to the sink

According to the results of Figures 4-7, the transmission overhead of EPCT is the lowest of the three protocols, whereas the overhead of Naive is much higher than the others. The reason is that, in EPCT and PCTQ, only the candidate sensors need to submit further data, whereas in the Naive scheme all sensors submit their $k$ highest data items. Specifically, $k \cdot (k + 1)/2$, at least $k^2$, and $n \cdot k$ encrypted data items are submitted from sensors to the sink in EPCT, PCTQ, and Naive, respectively. As a result, according to the above evaluations, the proposed EPCT has a lower network communication cost and is more efficient than the PCTQ and Naive protocols.
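The item counts quoted above can be tabulated with a small helper. This is a sketch only: the function name is hypothetical, and the values $n = 100$ and $k = 10$ are illustrative choices rather than the settings of Table 2. The counts ignore message headers and hop counts, so they do not translate directly into the percentages measured in the simulation.

```python
from typing import Dict


def encrypted_items_submitted(n: int, k: int) -> Dict[str, int]:
    """Number of encrypted data items sent to the sink, using the counts stated in the text."""
    return {
        "EPCT": k * (k + 1) // 2,        # k(k + 1)/2
        "PCTQ (lower bound)": k * k,     # at least k^2
        "Naive": n * k,                  # every sensor sends k items
    }


counts = encrypted_items_submitted(n=100, k=10)
for protocol, items in counts.items():
    print(f"{protocol}: {items}")
```

Only the Naive count depends on $n$, which matches the observation that its overhead grows fastest as the number of sensors increases.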

Conclusion
Data privacy threats arise when top-k query processing is provided in wireless sensor networks. To address this issue, we proposed a novel and efficient top-k query processing approach, which provides privacy protection and collusion resistance. We first present a minimized candidate encrypted dataset determination model, which is the foundation of the protocol. The model guides the query processing and guarantees the correctness of the protocol. Symmetric encryption with a different private key in each node is employed to protect data privacy, even when attackers collude with a few nodes. Based on the above model and security setting, two phases of secure interaction between the queried nodes and the sink are designed to implement the query processing protocol. The security analysis shows that our scheme provides privacy-protecting and collusion-resisting top-k queries, and the experimental results on network communication indicate that our approach is efficient.
Data Availability
The data generated randomly in WSN and used to support the findings of this study are available from the corresponding author upon request.