Efficient Privacy-Preserving Protocol for k-NN Search over Encrypted Data in Location-Based Service

With the development of mobile communication technology, location-based services (LBS) are booming. Meanwhile, privacy protection has become the main obstacle to the further development of LBS. The k-nearest neighbor (k-NN) search is one of the most common types of LBS. In this paper, we propose an efficient private circular query protocol (EPCQP) with a high accuracy rate and low computation and communication cost. We adopt the Moore curve to convert two-dimensional spatial data into a one-dimensional sequence and encrypt the points of interest (POIs) information with the Brakerski-Gentry-Vaikuntanathan homomorphic encryption scheme to preserve privacy. The proposed scheme performs a secret circular shift of the encrypted POI information to hide the location of the user without a trusted third party. To reduce the computation and communication cost, we dynamically divide the table of POI information according to the value of k. Experiments show that the proposed scheme provides highly accurate query results while maintaining low computation and communication cost.


Introduction
Nowadays, location-based services are developing rapidly with the wide use of the mobile Internet and smart mobile devices. Location-based services use a mobile device to learn the current location with the help of built-in positioning devices and obtain the location information through the mobile network.
In a location-based service, a user obtains the query result by providing an accurate location to the service provider. An adversary can probably compromise the privacy of the user by combining relevant background knowledge with the captured location information. How to use the service while protecting the location privacy of the user has become a research topic in location-based services in recent years.
The private data of location-based services includes three aspects: identity information, location information, and query content. The privacy of location refers to hiding the accurate location of the user. The privacy of query content refers to hiding the specific description of the request submitted by the user. When the query content is obtained, the characteristics and the behaviors of the user can be deduced.
In LBS, the core of privacy preservation is to cut off the relevance among the identity information, location information, and query content. The query contents from the user are relevant to each other in the case of continuous querying. This not only makes privacy protection more difficult but also increases the computation cost.
There are two communication modes in privacy-preserving queries in LBS: one is based on a trusted third party (TTP) and the other works without a trusted third party (TTP-free). Although TTP-based solutions [1][2][3][4][5][6][7] are able to collect enough information to maximally meet the needs of privacy protection, there are two problems: (1) it is difficult to find a TTP that meets the requirements; (2) a centralized attack on the TTP makes it the bottleneck of the query scheme. TTP-free solutions take advantage of the limited information to help maximize privacy protection. However, there is a lack of methods that are outstanding in all three aspects of query accuracy, efficiency, and privacy preservation.
Lien et al. [8] have proposed a private circular query protocol (PCQP) without a TTP, which uses the Moore curve and the Paillier cryptosystem to protect the location and the query content. The scheme contains a large number of homomorphic additions and multiplications, and thus it requires higher computation and communication cost. Utsunomiya et al. [9] made some improvements on the basis of PCQP and proposed a lightweight private circular query protocol (LPCQP) with a divided POI-table to effectively reduce the number of homomorphic additions and multiplications. The dividing of the POI-table is performed only once in the initialization process, and the number of subtables significantly influences the accuracy of the query. In some extreme cases, the scheme cannot return enough $k$ POIs to the user. In addition, when the number of POIs in a subtable is much larger than $k$, the homomorphic additions and multiplications bring a large amount of unnecessary computation cost. LPCQP uses the homomorphic encryption scheme proposed by Smart and Vercauteren to ensure the security. The scheme needs a large public key and a large amount of computation.
Considering the advantages and disadvantages of the above two schemes, this paper proposes an efficient private circular query protocol (EPCQP) to mitigate the drawbacks of LPCQP without damaging the security. The proposed scheme utilizes a fully homomorphic encryption scheme to address the problem of secure querying over encrypted data in LBS. To omit the redundant homomorphic additions and multiplications, the proposed scheme dynamically divides the encrypted POI-table according to the query requirement of the user. The data security depends on the Brakerski-Gentry-Vaikuntanathan (BGV) homomorphic encryption scheme [10]. In addition, the user utilizes the circular shift and modulo operation to replace the real location, which preserves location privacy.
The proposed scheme has the following advantages.
(1) Location Privacy and Data Privacy. The proposed scheme can defend against the correlation attack, the background knowledge attack, the offline keyword guessing attack, the inference attack, the man-in-the-middle attack, and the link attack.
(2) Computation Efficiency. The computation cost is reduced by 99% or more when $k$ is from 5 to 50 compared with that of PCQP. When $k$ is smaller than 25, the computation cost of EPCQP is significantly lower than that of LPCQP.
(3) High Accuracy Rate. The accuracy rate of the proposed scheme is higher than 90% even if $k$ is large for the uniform dataset, and it is higher than 84% when $k$ is large for the real-world dataset.
The remainder of the paper is organized as follows. Section 2 introduces the relevant background knowledge. Section 3 describes the proposed protocol in detail and Section 4 discusses the performance of the proposed scheme based on various experiments. Related work is reviewed in Section 5 and the conclusions are drawn in Section 6.

Background
In this section, we review the main techniques which are utilized in the proposed protocol.
2.1. Moore Curve. Space-filling curves [11] represent a class of curves which traverse all points in a two-dimensional region, or more generally a hypercube of any dimension, without crossing themselves. Hilbert [12] demonstrated the general geometrical generating procedure for constructing an entire class of space-filling curves in 1891. The Hilbert curve offers superior clustering and partially retains the neighboring adjacency of the original data [13,14]. Figure 1 illustrates the Hilbert curves of different orders. An $N$-th order Hilbert curve passes through all cells of a $2^N \times 2^N$ square grid. The number on the corner of each cell denotes an index, called the H-value, from the set $[0, 2^{2N} - 1]$, assigned as the curve traverses the cells.
The Moore curve [15] is a variation of the Hilbert curve with the end-point-connected property. Figure 2 illustrates the Moore curves of different orders. The POIs in a two-dimensional region can be strung into a circular structure. Our scheme adopts the Moore curve because of this circularly connected property.
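The end-point-connected property can be checked with a short sketch. The snippet below generates a Moore curve from the commonly used L-system (axiom `LFL+F+LFL`); the function names `moore_curve` and `h_values` and the 0-based cell coordinates are assumptions made for illustration, since the paper does not prescribe a particular construction.

```python
def moore_curve(order):
    """Return the cells of an order-`order` Moore curve in traversal order."""
    # L-system for the Moore curve: F = step forward, +/- = 90 degree turns.
    rules = {"L": "-RF+LFL+FR-", "R": "+LF-RFR-FL+"}
    path = "LFL+F+LFL"
    for _ in range(order - 1):
        path = "".join(rules.get(symbol, symbol) for symbol in path)

    x, y, dx, dy = 0, 0, 0, 1  # start at the origin, heading "up"
    cells = [(x, y)]
    for symbol in path:
        if symbol == "F":
            x, y = x + dx, y + dy
            cells.append((x, y))
        elif symbol == "-":  # turn left
            dx, dy = -dy, dx
        elif symbol == "+":  # turn right
            dx, dy = dy, -dx

    # Translate the curve so that all coordinates are non-negative.
    min_x = min(cx for cx, _ in cells)
    min_y = min(cy for _, cy in cells)
    return [(cx - min_x, cy - min_y) for cx, cy in cells]


def h_values(order):
    """Map each cell to its position along the curve (its H-value)."""
    return {cell: index for index, cell in enumerate(moore_curve(order))}


if __name__ == "__main__":
    cells = moore_curve(3)
    first, last = cells[0], cells[-1]
    # The first and last cells are direct neighbors, so the sequence is circular.
    print(len(cells), first, last)
```

Because the first and last cells are adjacent, a POI near the start of the sequence is also geographically close to the POIs at the end of it, which is the property the protocol relies on when it wraps the POI-table around.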

2.2. Homomorphic Encryption.
Without a trusted third party, our scheme adopts homomorphic encryption to protect the data privacy. Due to the property of homomorphic encryption shown in (1), the server can perform addition or multiplication of plaintexts without decryption. In (1), $m_1$, $m_2$, pk, and sk denote the two plaintexts, the public key, and the private key, respectively. The encryption and decryption are denoted as $E_{pk}()$ and $D_{sk}()$. Here, $+_h$ and $\times_h$ denote the homomorphic addition and multiplication over ciphertexts.

$$m_1 + m_2 = D_{sk}\big(E_{pk}(m_1) +_h E_{pk}(m_2)\big), \qquad m_1 \times m_2 = D_{sk}\big(E_{pk}(m_1) \times_h E_{pk}(m_2)\big). \tag{1}$$

Homomorphic encryption [16] was proposed by Rivest et al. in 1978. Gentry [17] proposed the first fully homomorphic encryption scheme in 2009, which used bootstrapping to construct a fully homomorphic encryption scheme from a somewhat homomorphic encryption scheme. The scheme can be summarized as follows: FHE = SHE + Bootstrapping, and its security depends on certain worst-case problems over ideal lattices. In order to reduce the computational complexity of the decryption circuit, he designed a lattice-based decryption circuit, whose security depends on the assumed hardness of two problems: the sparse subset sum problem (SSSP) and certain worst-case problems over ideal lattices. Gentry's scheme uses matrix operations and vector modular arithmetic, resulting in discontinuities in computation and fast growth in computation cost. In addition, the size of the ciphertext generated for a 1-bit plaintext grows exponentially, so the ciphertext becomes too long to be handled in practice. Despite all of these shortcomings, Gentry made a great contribution to the study of fully homomorphic encryption. To improve the poor practicality of Gentry's fully homomorphic encryption scheme, many optimized schemes have emerged since 2009. In 2010, van Dijk et al. [18] applied a simple algebraic structure to construct a fully homomorphic encryption scheme based on integer arithmetic, on the basis of Gentry's scheme. This scheme is simpler, but not practical.
In 2010, Smart and Vercauteren [19] proposed a fully homomorphic encryption scheme with smaller key and ciphertext size. In 2011, Gentry and Halevi [20] improved the original fully homomorphic encryption scheme with a new key generation algorithm that does not require full polynomial inversion. In 2011, Gentry and Halevi [21] put forward some schemes which required neither the squashing step nor the hardness assumption of SSSP, further optimizing the performance and improving the practicability. In 2012, Brakerski et al. [10] proposed the Brakerski-Gentry-Vaikuntanathan homomorphic encryption scheme, which applies the key switching and modulus switching techniques together with the BitDecomp technique to manage the noise. In the same year, Coron et al. [22] proposed a fully homomorphic encryption scheme over the integers which reduces the public key size of the van Dijk et al. scheme.
Most of the recent research results are still achieved by improving Gentry's original scheme. Halevi and Shoup developed a new fully homomorphic encryption library, namely, HELib [23], in 2013. It is based on the BGV homomorphic encryption scheme and uses the modulus switching technique to reduce the ciphertext noise. It supports homomorphic subtraction and shift in addition to the original additions and multiplications. Besides the functional improvement, the performance of HELib is mainly optimized by the Smart-Vercauteren ciphertext packing technique [24] and the Gentry-Halevi optimization [25], which further improve the efficiency of homomorphic operations. In general, HELib supports homomorphic operations on encrypted data with better performance.
Taking everything above into consideration, we adopt the homomorphic encryption scheme released in HELib in the proposed protocol.

The Proposed Protocol
An efficient private circular query protocol is proposed which can achieve high accuracy and low computation cost without the trusted third party.

Initialization Process
Step 1. The server constructs a Moore curve and generates the H-index for every registered POI in the target region.
In this step, an LBS server selects appropriate parameters to construct a Moore curve that covers the target region and builds the POI-table containing the information of all registered POIs. The POI-table contains the H-index and POI-info of each POI, for example, the longitude, the latitude, and the name. Each stored POI in the POI-table is numbered in accordance with the evenly distributed H-index with common difference $d$ instead of the associated H-value.
The definition of H-index will be presented in Section 3.2.1.
Step 2. The user generates the public and private key pairs and sends the public key pk to the server.Then the server encrypts the POI-table with the public key pk.

Query Process
Step 3. The user issues a k-NN query to the server.
Step 4. The server divides the POI-table into subtables according to the value of $k$. The details of the dividing method will be presented in Section 3.2.2.
Step 5. The server publicly announces the parameters of the Moore curve and the lookup-table to all registered users. The user can retrieve the H-index of his current location in the target region and the subtable that he should search. The details will be presented in Section 3.2.3.
Step 6. The user chooses an integer $t$ to generate a $2k \times 2k$, $t$-offset circular shift permutation matrix $P^t_{i,j}$ ($i, j = 0, \ldots, 2k - 1$) and a vector $q^M$ whose length $M$ is the number of subtables. In the first row of $P^t_{i,j}$, the $(2k - t + 1)$th element is the only nonzero element. The $m$-th element of $q^M$ is the only nonzero element, where $m$ is the index of the subtable selected by the user. The definitions of $P^t_{i,j}$ and $q^M$ will be presented in Sections 3.2.3 and 3.2.4.
Step 7. The user calculates the value (H-index $+\ t \times d$) modulo $(2k \times d)$ to generate the shift-H-index. The H-index is retrieved from the lookup-table according to the current location. The user encrypts $q^M$ and the first row of $P^t_{i,j}$ with the public key pk generated in Step 2, denoted as $E_{pk}(q^M)$ and $E_{pk}(P^t_{0,j})$. Then the user sends the shift-H-index, $E_{pk}(P^t_{0,j})$, and $E_{pk}(q^M)$ to the server.
Step 8. The server utilizes $E_{pk}(P^t_{0,j})$ to construct a secret circular shift matrix $E_{pk}(P^t)$. The server aggregates all of the subtables with $E_{pk}(q^M)$ into a new table $T'_{sub}$. The H-indexes of the POIs in $T'_{sub}$ are numbered again from $d$ to $(2k \times d)$.
Step 9. The server utilizes $E_{pk}(P^t)$ to perform a secret circular shift on $T'_{sub}$ based on the fully homomorphic property with the public key of the user. The server performs a k-NN search on the circularly shifted $T'_{sub}$ and then returns the $k$ encrypted results to the user. The details will be presented in Section 3.2.4.
Step 10. The user decrypts the received results with the private key sk generated in Step 2 and obtains the required k-NN POIs.
The H-index of the user has been increased by $t \times d$ (in Step 7), and the POIs in $T'_{sub}$ have been secretly circularly shifted by $t$ positions, that is, by $(t \times d)$ in H-index (in Step 9). Based on the additive and multiplicative homomorphism, the secret k-NN search results in the shifted $T'_{sub}$ will be consistent with the results searched in the plaintexts of the original subtable.

3.2.1. Mapping from H-Value to H-Index. Figure 1 shows that the starting cell is not adjacent to the ending cell in the Hilbert curve. The search range is reduced when the user is near the starting or ending cell of the Hilbert curve, which lowers the query accuracy. Figure 2 indicates that the start point and the end point of the Moore curve are neighbors. Therefore, we adopt the Moore curve to transform a two-dimensional space into a sequence of H-values. With this property of the Moore curve, all the POIs can be arranged into a circular structure, which is important when dividing the POI-table in the subsequent steps. In addition, the results mainly rely on the order of the H-values, so altering the H-values of the POIs does not affect the query results as long as their order is preserved. The H-index denotes an evenly distributed sequence numbered in ascending order of H-value with a common difference $d$. The H-index of the $i$-th POI in a Moore curve is calculated as

$$\text{H-index}_i = d \times o_i, \tag{2}$$

where $d$ is an integer greater than or equal to one and $o_i$ is the sequencing order of the POI in ascending order of H-value in the given Moore curve. The server constructs a POI-table whose entries are stored in ascending order of H-index.

3.2.2. Dividing the POI-Table. According to the value of $k$ in the query, the server divides the POI-table into $M$ subtables of $2k$ entries each, with

$$M = \left\lceil \frac{n}{2k} \right\rceil, \tag{3}$$

where $n$ denotes the number of all entries in the POI-table.
The $i$-th subtable, denoted as $T^i_{sub}$, is defined as

$$T^i_{sub} = \{\, p^i_j \mid p^i_j = p_{2k(i-1)+j},\ j = 1, \ldots, 2k \,\}, \quad i = 1, \ldots, M, \tag{4}$$

where $p_l$ denotes the $l$-th entry of the original POI-table and $p^i_j$ denotes the $j$-th entry of the $i$-th subtable.
As mentioned in Section 3.2.1, the start point is adjacent to the end point in the Moore curve. Geographically, the first POI and the last POI stored in the POI-table are close to each other. Therefore, the first and the last POIs are neighbors in two-dimensional space regardless of the H-index distance between them. If the number of entries in the $M$-th subtable is less than $2k$, it is padded with the entries of the original POI-table from the first to the $n_a$-th in order. $n_a$ is calculated as

$$n_a = 2k - (n \bmod 2k). \tag{5}$$

As shown in Figure 4, there are nine POIs in the POI-table. If $k = 2$, the POI-table will be divided into three subtables each containing four POIs. The third subtable contains the ninth POI followed by the first three POIs of the POI-table.
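The numbering and dividing steps can be sketched in plain Python as follows. This is a minimal plaintext illustration under stated assumptions: the helper names `build_poi_table` and `divide_poi_table`, the POI names, and the H-values are hypothetical, the subtable size of $2k$ with wrap-around padding follows the nine-POI example in the text, and the table is assumed to hold at least $2k$ entries. In the protocol itself the POI-info is encrypted.

```python
import math


def build_poi_table(pois_with_h_value, d=1):
    """Sort POIs by H-value and number them with H-index = d * order."""
    ordered = sorted(pois_with_h_value, key=lambda item: item[1])
    return [(d * order, name) for order, (name, _) in enumerate(ordered, start=1)]


def divide_poi_table(poi_table, k, d=1):
    """Split the table into ceil(n / 2k) subtables of exactly 2k entries each."""
    size = 2 * k
    count = math.ceil(len(poi_table) / size)
    subtables = [poi_table[i * size:(i + 1) * size] for i in range(count)]

    # The Moore curve is closed, so the last subtable is padded with the
    # first entries of the table (its geographic neighbors).
    missing = size - len(subtables[-1])
    subtables[-1] = subtables[-1] + poi_table[:missing]

    # Inside every subtable the H-indexes are renumbered d, 2d, ..., 2k*d.
    return [
        [(d * (j + 1), name) for j, (_, name) in enumerate(subtable)]
        for subtable in subtables
    ]


if __name__ == "__main__":
    # Nine hypothetical POIs with arbitrary increasing H-values.
    pois = [(f"p{i}", 10 * i) for i in range(1, 10)]
    table = build_poi_table(pois, d=2)
    for subtable in divide_poi_table(table, k=2, d=2):
        print(subtable)
```

With nine POIs and $k = 2$, the sketch yields three subtables of four entries, and the last one holds the ninth POI followed by the first three.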

3.2.3. Aggregating Subtables.
The user looks up the H-index in the lookup-table according to the current location and then chooses the subtable which contains the nearest POI as the subtable for querying. Next, the user obtains the index of the target subtable by retrieving the ID sub-table column of the lookup-table. Without loss of generality, let the $m$-th subtable be the one that the user selects. The user generates a vector $q^M = (q_1, \ldots, q_M)$ defined as

$$q_i = \begin{cases} 1, & i = m, \\ 0, & \text{otherwise}, \end{cases} \quad i = 1, \ldots, M. \tag{6}$$

The user encrypts $q^M$ with the public key pk and then sends the ciphertext $E_{pk}(q^M)$ to the server. The server multiplies each element of $E_{pk}(q^M)$ by the corresponding subtable, so that the $m$-th subtable is multiplied by $E_{pk}(1)$ and every other subtable is multiplied by $E_{pk}(0)$:

$$E_{pk}(q_i) \times_h T^i_{sub} = \begin{cases} E_{pk}(1) \times_h T^i_{sub}, & i = m, \\ E_{pk}(0) \times_h T^i_{sub}, & \text{otherwise}. \end{cases} \tag{7}$$

The server aggregates all the subtables into a new table $T'_{sub}$ calculated as

$$T'_{sub} = \sum_{i=1}^{M} E_{pk}(q_i) \times_h T^i_{sub}, \tag{8}$$

where the sum is taken entry by entry with the homomorphic addition $+_h$. According to (6) and (8), all entries of the POI-info in the subtables become zero in their plaintext domain except those of the $m$-th subtable. Note that the H-indexes of the POIs in $T'_{sub}$ are numbered from $d$ to $(2k \times d)$.
Due to the properties of homomorphic encryption, $T'_{sub}$ satisfies

$$D_{sk}(T'_{sub}) = \sum_{i=1}^{M} q_i \times D_{sk}(T^i_{sub}) = D_{sk}(T^m_{sub}). \tag{9}$$

Figure 4 indicates the process of aggregating subtables. There are nine POIs and a user (Q) on the map. According to the Moore curve, the nine POIs are stored in the POI-table in ascending order of H-index. When the user issues a k-NN search with $k = 2$, the server divides the POI-table into three subtables and each subtable has four POIs. The POI with the nearest H-index to the user lies in the first subtable; therefore, the user selects the first subtable to query. Then all the subtables are aggregated into one table with the entries of the first subtable.
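The effect of the aggregation can be illustrated on plaintext integers. The sketch below is only a simulation under stated assumptions: POI-info entries are replaced by hypothetical integer identifiers, the function names are illustrative, and ordinary `*` and `+` stand in for the homomorphic operations that the server would apply to ciphertexts, so the server in the real protocol never sees which subtable survives.

```python
def selection_vector(num_subtables, m):
    """Vector with a single 1 at the (1-based) index m of the chosen subtable."""
    return [1 if i == m else 0 for i in range(1, num_subtables + 1)]


def aggregate(subtables, q):
    """Entry-wise sum of q_i * subtable_i; only the selected subtable survives."""
    size = len(subtables[0])
    return [
        sum(q_i * subtable[j] for q_i, subtable in zip(q, subtables))
        for j in range(size)
    ]


if __name__ == "__main__":
    # Three subtables of four POI identifiers each (nine POIs, k = 2).
    subtables = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3]]
    q = selection_vector(len(subtables), m=1)
    print(aggregate(subtables, q))
```

The cost of this step grows with the number of subtables, because every subtable takes part in the multiplication and the sum even though all but one contribute zeros.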

3.2.4. Secret Circular Shift in the $m$-th Subtable. In order to keep the original H-index secret from the server, the POI-info column of the original POI-table is circularly shifted in PCQP. On the basis of the Paillier encryption scheme, PCQP performs the circular shift by an encrypted matrix-vector multiplication, where the POI-info column is in its plaintext domain whereas the permutation matrix is encrypted. Although the circular shift over the entire POI-table maintains the neighboring relationship among the POIs, it requires a huge number of calculations because the multiplication is applied across the entire POI-table. Utsunomiya et al. proposed a lightweight scheme (LPCQP) to mitigate the drawbacks of PCQP. Inspired by LPCQP, we utilize the circular shift in the aggregated table $T'_{sub}$ and the modulo operation to hide the real location of the user.
After shifting the POI-info column $t$ units downward circularly, the same k-NN search results can be obtained by changing the querying H-index to the shift-H-index, which is calculated as

$$\text{shift-H-index} = (\text{H-index} + t \times d) \bmod (2k \times d). \tag{10}$$

When $t$ is a negative integer, it represents an upward shift.
The shift parameter $t$ is decided by the user; hence the server cannot obtain the original H-index of the user from the shift-H-index. The current location of the user is protected from being disclosed.
Likewise, the circular shift of the POI-info column on the server side should be kept secret. In PCQP, the entire POI-info column is circularly shifted by multiplying it by an encrypted $n \times n$ matrix. The multiplicative and additive homomorphic operations incur a huge overhead, especially for mobile services. In this paper, only the POI-info of the $m$-th subtable which the user selects is circularly shifted, by multiplying it by an encrypted $2k \times 2k$, $t$-offset matrix $P^t_{i,j}$, which is defined as

$$P^t_{i,j} = \begin{cases} 1, & j = (i - t) \bmod 2k, \\ 0, & \text{otherwise}, \end{cases} \tag{11}$$

where $i, j = 0, 1, \ldots, 2k - 1$.
The user encrypts the permutation matrix $P^t_{i,j}$ with the public key pk to obtain the ciphertext $E_{pk}(P^t)$ and then sends it to the server. The server multiplies $E_{pk}(P^t)$ with the aggregated table $T'_{sub}$ in order to circularly shift the POI-info column of the $m$-th subtable while keeping the H-index column intact.
Note that the pseudorandom numbers are different in each encryption process, which means that the server has no way to distinguish the encrypted '0' and '1'. In order to reduce the computation and communication cost, the user only encrypts the first row of $P^t_{i,j}$ and sends $E_{pk}(P^t_{0,j})$ to the server. Then the server constructs $E_{pk}(P^t)$ by circularly shifting $E_{pk}(P^t_{0,j})$: the $i$-th row of $E_{pk}(P^t)$ is the first row shifted $i$ positions to the right.
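As a small worked illustration of this construction, take $k = 2$ and $t = 1$ (values chosen here for illustration, not taken from the paper). The first row has its only nonzero element in position $2k - t + 1 = 4$, each following row is the previous row shifted one position to the right, and multiplying the matrix by a column $(p_1, p_2, p_3, p_4)^{T}$ moves every entry one position downward, with the last entry wrapping to the top. The computation is shown on plaintexts; in the protocol every entry of the matrix is a ciphertext.

```latex
P^{1}_{0,j} = (0,\ 0,\ 0,\ 1), \qquad
P^{1} =
\begin{pmatrix}
0 & 0 & 0 & 1\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0
\end{pmatrix}, \qquad
P^{1}
\begin{pmatrix} p_1\\ p_2\\ p_3\\ p_4 \end{pmatrix}
=
\begin{pmatrix} p_4\\ p_1\\ p_2\\ p_3 \end{pmatrix}.
```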
The server multiplies $E_{pk}(P^t)$ with the aggregated table $T'_{sub}$ to obtain the circularly shifted POI-info data, which remains encrypted.
After decryption, the results will have the same shifted value t.
As shown in Figure 5(a), there are 8 POIs in the subtable, where $k = 4$ and $d = 2$. The H-index of the user is 10, and the results are the four POIs whose H-indexes are nearest to it. The shift-H-index is 6 when $t = 6$, since $(10 + 6 \times 2) \bmod (8 \times 2) = 6$. As presented in Figure 5(b), the user will obtain the same results according to the shift-H-index.
Note that the first entry and the last entry of the subtable are not adjacent to each other. Therefore, the neighboring relationship will be changed after circularly shifting the $m$-th subtable, and this will reduce the search accuracy. This effect is negligible compared with the reduction in computation and communication overhead.
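The example with $k = 4$, $d = 2$, and $t = 6$ can be replayed on plaintext data with the sketch below. All function names are hypothetical, POI-info entries are replaced by integer identifiers, and the nearest-neighbor search is a simple one-dimensional search on H-index distance rather than the query point selection used in the paper; in the protocol the matrix and the table are ciphertexts.

```python
def shift_matrix(k, t):
    """2k x 2k t-offset circular shift matrix: entry (i, j) is 1 iff j == (i - t) mod 2k."""
    size = 2 * k
    first_row = [1 if j == (-t) % size else 0 for j in range(size)]
    # Row i is the first row rotated i positions to the right, which is
    # how the server can rebuild the whole matrix from its first row.
    return [first_row[-i:] + first_row[:-i] for i in range(size)]


def apply_shift(matrix, column):
    """Matrix-vector product; moves every entry of `column` t positions down."""
    return [sum(p * v for p, v in zip(row, column)) for row in matrix]


def shift_h_index(h_index, t, d, k):
    """Shifted query position: (H-index + t * d) mod (2k * d)."""
    return (h_index + t * d) % (2 * k * d)


def nearest(h_indexes, column, query, count):
    """POI identifiers of the `count` entries whose H-index is closest to `query`."""
    ranked = sorted(zip(h_indexes, column), key=lambda e: (abs(e[0] - query), e[0]))
    return sorted(info for _, info in ranked[:count])


if __name__ == "__main__":
    k, d, t = 4, 2, 6
    h_indexes = [d * (j + 1) for j in range(2 * k)]  # 2, 4, ..., 16
    poi_info = list(range(1, 2 * k + 1))             # hypothetical POI identifiers
    shifted_info = apply_shift(shift_matrix(k, t), poi_info)

    original = nearest(h_indexes, poi_info, 10, k)
    shifted = nearest(h_indexes, shifted_info, shift_h_index(10, t, d, k), k)
    print(original, shifted)
```

In this example both searches return the same POIs. When the query lies close to the first or last entry of the subtable, the wrap-around changes the neighborhood and the two result sets can differ, which is the accuracy loss noted in the text.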

3.2.5. k-NN Search.
Lien et al. [8] proposed a cross-like k-NN search algorithm to achieve a high accuracy rate and showed that two additional queries started from the central cells of the search region are sufficient to achieve a reasonable accuracy rate in most cases. However, the results may include duplicated POIs, especially when the user is close to the boundary. Utsunomiya et al. [9] proposed a group-based query point selection algorithm which achieved a higher accuracy rate. We adopt the method proposed in LPCQP to improve the accuracy rate.

3.3. Security Analysis.
The most common attacks that can obtain private information of the user fall into six kinds. We divide them into three groups of two attacks each.
The first group consists of the correlation attack and the background knowledge attack. The adversary uses the former to eavesdrop on some input queries and output results through the network. Then, combining these with prior knowledge about the user obtained from the latter, such as age or job, the adversary can infer the location of the user with a relatively large probability.
The second group includes the offline keyword guessing attack and the inference attack. A trapdoor is generated from a search word and does not leak any information about this search word. The first attack guesses the content of the encrypted data by computing trapdoors of some widely used words. The second attack combines background knowledge of the data content with access patterns to identify the trapdoors of some words.
The last group consists of the man-in-the-middle (MITM) attack and the link attack. For MITM, the adversary needs to be a third party between users and servers and must make both parties believe that they are talking directly to each other. In the link attack, which is more commonly used, the adversary combines the inaccurate location information of the user with outside data sources to determine the accurate location or the identity of the user.
In this paper, we aim at proposing an efficient scheme for k-NN search with strong privacy preservation. In the following part, we analyze the security of the scheme against the six kinds of attacks mentioned above.

3.3.1. The Correlation Attack.
The correlation attack belongs to the plaintext attacks that exploit a statistical weakness due to a poor choice of the Boolean function.
Correlation attacks can be successfully mounted because obvious correlations between the output of a particular linear feedback shift register (LFSR) and the combined output of all the LFSRs defined by the Boolean function can be distinguished. Thus, combined with partial knowledge of the keystream, an adversary can recover the key of that LFSR by brute force.
As for the correlation attack in EPCQP, the user sends $E_{pk}(P^t_{0,j})$ and $E_{pk}(q^M)$, which are all encrypted with distinct random numbers that change every time. Therefore, the server has no way of correlating the queries issued by the user. This implies that our scheme is secure under the correlation attack.

3.3.2. The Background Knowledge Attack.
This kind of attack exploits the close relationship between several standard common attributes, one of which is the sensitive attribute. In this way, the adversary can reduce the cardinality of the set of possible values to find this sensitive attribute.
Only the user knows the shift amount because it is encrypted in our scheme. The server can learn nothing about the location even though it can obtain the query history and the profiles of the user. This arises from the fact that the user can change the shift amount every time while all the sensitive information transferred is encrypted.
In our scheme, we only transfer the shifted location chosen by the user during the k-NN search. The shifted location (H-index $+\ t \times d$) mod $(2k \times d)$ is not sensitive because the shift amount is selected by the user independently and randomly. Therefore, no sensitive information about the user is leaked to the server, and thus our scheme is resistant to this attack.

3.3.3. The Offline Keyword Guessing Attack.
Generally speaking, the dictionary attack and the offline guessing attack rely on the fact that so-called weak secrets may have low entropy, that is, they come from a value set of small cardinality. Keywords come from an even smaller set than weak secrets such as passwords. Basically, low entropy means high probability, so the user is more likely to use low-entropy keywords when querying the table.
In our scheme, taking the location of the user as an example, the user rarely uses low-entropy keywords. In particular, the location data in the server is collected as a lookup-table which is not sensitive information of the user. Accordingly, no one can guess the location data of the user during the query in our scheme by an offline keyword guessing attack.

3.3.4. The Inference Attack.
The inference attack aims at gaining knowledge of the user by analyzing the data. If the adversary can obtain the real value of the sensitive information with a relatively high probability, we say the user leaks the information. In the whole process, the adversary cannot directly obtain any data from trivial information.
The inference attack in our setting utilizes the access pattern of the user, such as a document that contains previously queried keywords. However, no sensitive information about the access pattern of the user is leaked to the server or an adversary because $P^t_{i,j}$ and $q^M$ are encrypted with distinct random numbers, and these random numbers are refreshed by the user for each query. Consequently, our scheme is robust to this attack.

3.3.5. The Man-in-the-Middle Attack.
The man-in-the-middle attack is a widely used attack in which a third party, called the man-in-the-middle, plays the role of a user when communicating with the server and the role of a server when facing the user. It therefore needs to imitate all the information transmitted in the network to make the user and the server believe that they are talking directly to each other.
There are two widely used methods for preventing the man-in-the-middle attack. Authentication is used to make sure that the information transmitted during the communication is from a legitimate source. Tamper detection ensures that the information is not tampered with during the transmission.
In our scheme, the user never sends any sensitive information in plaintext to the server, nor does the server. The user only sends $k$, pk, (H-index $+\ t \times d$) mod $(2k \times d)$, $E_{pk}(P^t_{0,j})$, and $E_{pk}(q^M)$. The server only needs to send the parameters of the Moore curve, the lookup-table with its associated public parameters (such as $d$), and the final result set, which is encrypted by the homomorphic encryption scheme. All ciphertexts are encrypted with distinct random numbers that vary with every query. Meanwhile, the information of $m$, which identifies the selected subtable, is transmitted in the form of $E_{pk}(q^M)$. Thus, the adversary cannot play the role of the server, because he has no idea of $m$ and cannot communicate with the user. In sum, our scheme can defend against this attack.
3.3.6. The Link Attack.
In the link attack, which is more commonly used, the adversary combines the inaccurate location information of the user with outside data sources to determine the accurate location or the identity of the user. However, all of the location information is represented as (H-index $+\ t \times d$) mod $(2k \times d)$ in the aggregated subtable. The server can learn nothing from the shift amount, though it may have access to partial knowledge of the outside data sources. Therefore, our scheme also guarantees security under the link attack.

Performance Evaluation
In this section, we compare the performance of EPCQP with that of two related works: PCQP [8] and LPCQP [9]. We adopt the homomorphic encryption scheme released in HELib. The proposed scheme is implemented in Java and run on a laptop computer with a 1.6 GHz Intel Core i5 CPU and 4 GB RAM.
We use two datasets: a uniform dataset and a real-world dataset. Each dataset contains 10000 POIs. The real-world dataset is extracted from base station datasets in China. We randomly select 1000 locations on the map to issue the k-NN search in each experiment and the results are averaged.
4.1.Query Accuracy.The accuracy criteria have been well used in data mining and machine learning areas [26][27][28][29][30]. Similarly, query accuracy is used for validation of the experiment in this paper.The value of  and  is varied from 5 to 50 and from 10 to 500, respectively.Let  and  denote the returned result set and the -NN ground-truth result set, and the accuracy rate is defined as In LPCQP, only the -th subtable is selected by the user for searching.The number of the subtables will directly affect the query accuracy.When  is greater than   , the server returns less than  POIs to the user.Although the computation cost is reduced by dividing the POI-table only once in the initialization process, the query accuracy rate decreases in some cases.After dividing the POI-table into  subtables, ( −   ) POIs are lost due to the aggregating subtables in (8).Hence, the quality of the results gets even worse as the number of the subtables gets larger.In contrast, the server can construct the subtables with a large unit when  is small, which guarantees that the number of results satisfies the querying requirement.Nevertheless, when   is far greater than , as mentioned in ( 7) and ( 8 and multiplications will be too wasteful to find at most  POIs.In order to find the maximum  that satisfies the high accuracy and significantly reduces the computation cost, Utsunomiya et al. [9] defined the ratio  as where  LPCQP and  PCQP denote the accuracy rate of LPCQP and PCQP.The performance results showed that /  should be 0.5 or less in order to achieve  ≥ 0.95.It means that when   ≥ 2, the scheme can keep a considerable high accuracy and reduce the computation cost.
When  >   /2, the entries lost from aggregating the subtables are too much.The situation will get worse when k >   , and the user will receive less than  results in this case.
In our scheme, n′ = 2k. As mentioned above, EPCQP therefore achieves a high query accuracy rate, and it also reduces the computation cost, as presented in Section 4.2. The advantage of EPCQP is obvious compared with the case that k > n′ in LPCQP.
Figures 6 and 7 represent the accuracy rate versus k for the two datasets. They show that the accuracy rate of EPCQP is higher than that of LPCQP when m is 500. Although the accuracy rate of EPCQP is slightly lower than that of PCQP and LPCQP when m ≤ 100, it is still higher than that of LPCQP when k > 25 and m = 200. The accuracy rate of EPCQP is higher than 90% even if k is large for the uniform dataset, and it is higher than 84% when k is large for the real-world dataset. We define r′ as

$$r' = \frac{A_{\mathrm{EPCQP}}}{A_{\mathrm{LPCQP}}},$$

where $A_{\mathrm{EPCQP}}$ denotes the accuracy rate of EPCQP. Figures 8 and 9 indicate the ratio r′ versus k for the two datasets. When m = 1, the value of r′ denotes the ratio of $A_{\mathrm{EPCQP}}$ to $A_{\mathrm{PCQP}}$. Note that the ratio r′ stays at 0.9 or above regardless of k for the two datasets. In particular, for m = 500, EPCQP achieves r′ ≥ 1 when k ≥ 10. As shown in Figures 8 and 9, EPCQP keeps the high accuracy rate of LPCQP when m < 500 and improves the accuracy rate when m = 500. Besides, the advantage of EPCQP will be more significant when m is larger than 500.

As mentioned in Section 4.1, in order to achieve a high accuracy in LPCQP, n′ should be greater than or equal to 2k. As shown in Table 1, the cost of multiplication in EPCQP is lower than that in LPCQP when n′ > 2k. Besides, the cost of addition in the shifting process is clearly reduced. Therefore, our proposed scheme has a high accuracy with a lower computation cost.

Computation Cost.
The computational time for aggregating subtables in the proposed scheme is approximately the same as that in LPCQP. Figures 10 and 11 represent the computational time for encrypting the circular shift matrix and for the shifting process, respectively. As shown in Figure 10, the computation cost of encrypting the circular shift matrix in EPCQP is one-thousandth of that in PCQP or below when k ≤ 20, and it is about one-tenth of that in LPCQP regardless of k when m = 10. Figure 11 shows that the computation cost of the shifting process in EPCQP is one-thousandth of that in PCQP or below when k ≤ 30, and it is one percent of that in LPCQP regardless of k when m = 10. Figures 10 and 11 show that the computation cost of EPCQP is lower than that of LPCQP when m ≤ 100. Although the computation cost of EPCQP is slightly higher than that of LPCQP when m = 200 and k > 25, the accuracy rate of EPCQP is higher than that of LPCQP.

Figure 12 represents the total computation cost of PCQP, LPCQP, and EPCQP. As shown in Figure 12, the computation cost of our scheme is lower than that of LPCQP regardless of k when m ≤ 100. The decreasing rate of the total computation cost is represented in Figure 13. When m = 1, it denotes the decreasing rate of the total computation cost compared with that of PCQP. These results show that the computation cost of our proposed scheme is reduced by 99% or more compared with that of PCQP. When k ≤ 25 and m ≤ 200, the decreasing rates are positive, which means that the computation cost of EPCQP is lower than that of LPCQP.
Based on the above results, we can say that EPCQP keeps the high accuracy rate of LPCQP while reducing the computation cost. There is a trade-off between the accuracy rate and the computation cost of EPCQP.

Communication Cost.
In this section, we discuss the communication cost of our scheme. Table 2 shows the communication cost of PCQP, LPCQP, and EPCQP. As described in Section 3.2.4, the encrypted matrix is constructed from the first row. Let l denote the bit-length of an encrypted POI-info. In PCQP, the user sends the encrypted first row of the circular shift matrix to the server, which takes (n × l) bits. In LPCQP, the user sends the encrypted first row together with the encrypted subtable-selection vector, which takes (n′ + m)l bits. The communication cost of our scheme is (2k + m)l bits. After searching, the server returns k results to the user.
As shown in Table 2, the downlink communication cost of EPCQP is the same as that of PCQP and LPCQP. In the uplink process, the communication cost of EPCQP is lower than that of PCQP. In summary, the communication cost of our scheme is of the same order as that of LPCQP, while it achieves high accuracy.
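The uplink comparison can be made concrete with a small calculation. The sketch below assumes that every element the user sends is one ciphertext of the same bit-length, that the user sends n ciphertexts in PCQP, n′ + m in LPCQP, and 2k + m in EPCQP, and that n′ = n/m and m = ⌈n/2k⌉; the function name `uplink_bits` and the parameter `cipher_bits` are hypothetical.

```python
import math


def uplink_bits(n, k, m_lpcqp, cipher_bits):
    """Uplink cost in bits for PCQP, LPCQP, and EPCQP.

    Assumptions (not taken from Table 2 itself): one ciphertext of
    `cipher_bits` bits per transmitted element, LPCQP subtable size
    n' = ceil(n / m_lpcqp), and EPCQP subtable size 2k.
    """
    n_sub = math.ceil(n / m_lpcqp)      # entries per LPCQP subtable
    m_epcqp = math.ceil(n / (2 * k))    # number of EPCQP subtables
    return {
        "PCQP": n * cipher_bits,
        "LPCQP": (n_sub + m_lpcqp) * cipher_bits,
        "EPCQP": (2 * k + m_epcqp) * cipher_bits,
    }


# Hypothetical setting: 10000 POIs, k = 10, LPCQP with 100 subtables.
costs = uplink_bits(n=10000, k=10, m_lpcqp=100, cipher_bits=1)
```

With `cipher_bits=1` the values are simply ciphertext counts, which makes it easy to see that both LPCQP and EPCQP send far fewer ciphertexts than PCQP.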

Related Work
The existing methods for location privacy protection mainly fall into three categories: spatial cloaking, location obstruction, and spatial transformation.

Spatial cloaking methods [31][32][33][34][35][36][37][38][39] generate a cloaking region and send it to the location server, and the server returns the query results to the users or to a trusted third party. k-anonymity is the most common model [40]; it was first implemented in LBS by Gruteser and Grunwald [1]. Chow et al. [5] maintain the location information of the user by using the R-tree structure, which is the classic method for protecting the location of users with the k-anonymity model. Their scheme accurately searches the nearest neighbors in the rectangular anonymous region.

Location Obstruction.
The user continues to submit queries with a specific fake location to the location-based server, and the server iteratively returns the results based on the fake location until the user obtains the result that satisfies the privacy and security requirements.The classic algorithm of location obstruction is SpaceTwist [41], which requires multiple rounds of communication, and the communication cost required for each complete query is large.

Spatial Transformation.
The principle of spatial transformation for protecting location privacy is to convert the location information from the conventional data space to another space. The scheme in [42] utilizes homomorphic encryption to accomplish the data interaction between users and servers. Although it achieves strong privacy protection, its extremely expensive computation cost makes it difficult to adapt to applications with continuous queries and real-time responses. In order to reduce the computation cost, a scheme [6] based on the Hilbert curve is proposed. The scheme, which effectively reduces the computation cost of encryption, transforms all POIs in two-dimensional space into a one-dimensional sequence of integers and approximately maintains the original neighborhood relationship. The drawback of this scheme is that the query accuracy is not high. A k-anonymous spatial region construction mechanism [7] is proposed for distributed systems. It combines the user location with the Hilbert order to form a spatial area with other peer nodes and then sends the area and the query requirement to the server. The HilAnchor scheme [43] is based on SpaceTwist and the Hilbert curve. With only two rounds of communication, the user can get the exact k-nearest neighbor POIs without leaking the location. The MobiCrowd algorithm [44] utilizes a buffer so that queries can be answered locally, which reduces the communication cost.
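To illustrate how a space-filling curve turns two-dimensional positions into a one-dimensional sequence, the following sketch computes the Hilbert-curve index of a grid cell with the standard bit-manipulation algorithm. It is only an illustration of the transformation idea used by the Hilbert-based schemes above, not the code of any cited scheme; the function name `hilbert_index` and the grid order are assumptions.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2^order x 2^order grid to its Hilbert-curve index."""
    side = 1 << order
    index = 0
    s = side >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        index += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s >>= 1
    return index


# Order the 16 cells of a 4 x 4 grid along the curve.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(2, c[0], c[1]))
```

Consecutive cells in `cells` are always grid neighbors, which is the locality property that lets a one-dimensional index approximate spatial proximity. The converse does not hold in general, which is one reason why curve-based k-NN answers are approximate.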
Query privacy is as important as location privacy. The scheme in [45] generates dummy queries so that the server and the attacker cannot obtain the preference information of the user by summarizing the rules of the query contexts. An attack model [46] marks a query according to the query context, the location, and the querying time. Reference [47] proposed the k-Approximate Beyond Suspicion scheme, which first utilizes a clustering algorithm (such as k-means) to cluster the users who have similar locations and issue similar queries and then calculates the anonymous areas according to k-anonymity to protect the privacy of the location and the query.

Conclusion
In this paper, we propose a privacy-preserving circular query protocol with high accuracy and low complexity, which can be utilized in the location-based k-NN search. With the circular shift and the homomorphic encryption, the proposed scheme accomplishes efficient querying and privacy protection simultaneously. The server dynamically divides the encrypted POI-table according to the query of the user. Our scheme mitigates the drawbacks of PCQP and LPCQP without impairing their advantages. The computation cost is reduced by 99% or more compared with that of PCQP at the price of a 6% reduction in the accuracy rate regardless of k. Compared with LPCQP when m ≤ 100, the computation cost is reduced by up to 47.5% with a 5% reduction in the accuracy rate. With the rapid development of spatial crowdsourcing, location privacy has attracted more and more attention. We expect that our scheme will inspire research on location privacy protection and encrypted data computing in spatial crowdsourcing.

Figure 3 illustrates the whole architecture of EPCQP.

Figure 6: Query accuracy rate versus k of the uniform dataset.

Figure 8: r′ versus k of the uniform dataset.

Figure 9: r′ versus k of the real-world dataset.

Figure 10: Computational time for encrypting the circular shift matrix.

Figure 13: Decreasing rate of the total computation cost.
... table which contains the POI-info and the Moore-curve index of all POIs and records the mapping from the Moore-curve value to the Moore-curve index in the lookup-table. The server should update the POI-table and the lookup-table whenever any POI changes and then publicly announce the new lookup-table to all registered users.

3.2.2. Dividing the POI-Table. PCQP requires a huge number of calculations because the multiplication is applied across the entire POI-table. Utsunomiya et al. proposed a lightweight k-NN search protocol, LPCQP, that divides the POI-table into m subtables. In LPCQP, the server divides the POI-table in advance during the initialization process. When the user issues a k-NN query, he selects only the one subtable that contains the POIs he needs. Although LPCQP reduces the computation cost considerably compared with PCQP, it requires a trade-off between the query accuracy and the computation cost on the server. Utsunomiya et al. have illustrated that k/n′ should be 0.5 or less in order to obtain high query accuracy, where n′ denotes the number of all entries in a subtable. The query accuracy of LPCQP becomes worse as k gets larger because some POIs are lost when the subtables are aggregated. In the extreme case, when k > n′, the server returns only n′ POIs to the user. The dividing of the POI-table is performed only once in advance; therefore, it cannot satisfy the query requirement if m is not appropriate.

Inspired by LPCQP, we propose an efficient k-NN search scheme, called EPCQP, to mitigate the drawbacks of LPCQP. In EPCQP, the size of each subtable depends on the query requirement of the user instead of being fixed by dividing the POI-table only once in the initialization process. We divide the POI-table into m subtables, and each subtable has 2k POIs. m is calculated as

$$m = \left\lceil \frac{n}{2k} \right\rceil.$$

The server divides the POI-table into m subtables. Note that the Moore-curve index column of the POI-table will not be encrypted. The number of all entries in each subtable is 2k. ID_sub-table denotes the index of the subtable in which a POI is contained. The server stores the mapping from the Moore-curve value to the Moore-curve index of each POI in a lookup-table. In addition, the lookup-table records the ID_sub-table so that the user knows which subtable he should search. Consequently, the lookup-table contains three attributes of each POI: the Moore-curve index, the Moore-curve value, and ID_sub-table.
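The query-dependent division of the POI-table can be sketched on plaintext data as follows. This is a minimal illustration, assuming the table is already sorted along the curve, that m is the ceiling of n/2k, and that the last subtable may simply be shorter (how the scheme pads it is not stated here); the names `divide_poi_table` and `subtable_id` are hypothetical.

```python
import math


def divide_poi_table(poi_table, k):
    """Split a curve-ordered POI-table into subtables of 2k entries each.

    Assumption: the last subtable is left shorter when 2k does not
    divide the table size.
    """
    size = 2 * k
    m = math.ceil(len(poi_table) / size)
    return [poi_table[i * size:(i + 1) * size] for i in range(m)]


def subtable_id(poi_index, k):
    """Index of the subtable that holds the POI at position poi_index."""
    return poi_index // (2 * k)


# Hypothetical table of 10 POIs queried with k = 2: three subtables.
table = list(range(10))
subtables = divide_poi_table(table, 2)
```

Because the subtable size is derived from k at query time, a subtable always holds at least twice as many POIs as the user asks for, except possibly the last one under the assumption above.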

In the query process of EPCQP, the server first divides the POI-table into m subtables. Compared with the calculation on ciphertexts and the homomorphic encryption/decryption, the cost of dividing the table is negligible. As presented in Section 3.2.3, the server multiplies each element of the encrypted subtable-selection vector by the POI-info column of the corresponding subtable and then aggregates all the subtables into a single aggregated table. The process of aggregating requires n multiplications and 2k(m − 1) additions according to (8). Finally, the server multiplies the encrypted circular shift matrix with the aggregated table to obtain the circularly shifted and encrypted POI-info data, which requires 2k(2k − 1) additions and (2k × 2k) multiplications. Table 1 represents the computation cost of PCQP, LPCQP, and EPCQP. Compared with the cost of the shifting process, the cost of aggregating subtables is negligible when k is large.

Table 1: Comparison of computation cost on the server. n and n′ denote the number of all entries in the POI-table and a subtable, respectively.
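The aggregation and shifting steps can be followed on plaintext integers. The sketch below replaces every homomorphic operation with ordinary arithmetic, so it shows only the data flow and the operation counts, not the encryption; it assumes a one-hot selection vector and a circular shift matrix built from its first row, and all function names are hypothetical.

```python
import math


def aggregate(subtables, selector):
    """Sum selector[i] * subtables[i] entry-wise; a one-hot selector keeps one subtable."""
    size = len(subtables[0])
    aggregated = [0] * size
    for bit, subtable in zip(selector, subtables):
        for j, value in enumerate(subtable):
            aggregated[j] += bit * value
    return aggregated


def circulant(first_row):
    """Build the circular shift matrix whose row i is first_row rotated right by i."""
    size = len(first_row)
    return [[first_row[(j - i) % size] for j in range(size)] for i in range(size)]


def circular_shift(table, offset):
    """Shift a table by multiplying it with a circulant 0/1 matrix."""
    size = len(table)
    first_row = [1 if j == offset % size else 0 for j in range(size)]
    matrix = circulant(first_row)
    return [sum(matrix[i][j] * table[j] for j in range(size)) for i in range(size)]


def operation_counts(n, k):
    """Scalar operations needed for m equal subtables of 2k entries."""
    size = 2 * k
    m = math.ceil(n / size)
    return {
        "aggregate_mul": m * size,        # one product per table entry
        "aggregate_add": size * (m - 1),  # summing m vectors of length 2k
        "shift_mul": size * size,         # matrix-vector product
        "shift_add": size * (size - 1),
    }


# Hypothetical example: three subtables, the user secretly selects the second one.
subtables = [[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]]
picked = aggregate(subtables, [0, 1, 0])
shifted = circular_shift(picked, 1)
```

In the actual protocol the selector and the first row of the shift matrix arrive encrypted, so the server performs the same multiplications and additions without learning which subtable or which offset was chosen.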

Table 2: Comparison of communication cost. n and n′ denote the number of all entries in the POI-table and a subtable, respectively.