With the rapid development of location-based services (LBSs) in mobile network applications, users enjoy the convenience of these services on the one hand while being exposed to the risk of privacy disclosure on the other. An attacker can mount a strong attack based on query probability, map data, points of interest (POIs), and other supplementary information. Existing location privacy protection techniques seldom consider the supplementary information held by attackers, usually generate only a single cloaking region around the protected location point, and have relatively low query efficiency. In this paper, we improve the existing LBSs system framework, in which we generate double cloaking regions by constraining the supplementary information, and then
In recent years, with the rapid development of cellular networks and GPS (Global Positioning System) positioning technology, LBSs (location-based services) devices such as phones and tablets have become increasingly popular, driving the rapid growth of LBSs applications. Typical LBSs applications include POI retrieval (such as MeiTuan), maps (such as Google Maps), GPS navigation (such as Amap), and location-aware social networks (such as WeChat). LBSs have penetrated many aspects of daily life, and their use undoubtedly brings great convenience to people.
At the same time, the privacy risks of LBSs have also attracted public attention, because a user requesting LBSs must submit specific location information, and the locations which are involved in a large number of user’s query data [
The existing LBSs framework is shown as Figure
A centralized framework of LBSs.
Traditional LBSs privacy protection algorithms seldom consider that the anonymizer may be untrustworthy, so the user’s specific location information is sent directly to the anonymizer. If the data held by the anonymizer is leaked and exploited by attackers, the user’s location data is disclosed directly. In addition, an attacker may mount a strong attack based on query probability, map data, POIs, and other supplementary information. For example, if part of a cloaking region covers locations with very few queries, such as lakes and mountains, the attacker can exclude that part with high probability, which increases the risk of the user being exposed in the remaining region.
In this paper, we improve the existing LBSs framework and design several related algorithms within it. First, the user’s actual location contained in the query request is generalized into a grid ID, and the user’s grid region is matched to another region by a dynamic matching algorithm, so that double cloaking regions are formed under the assumption that the attacker knows the number of historical queries; second,
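The filtering attack described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the attacker discards every cell whose historical query count falls below a threshold and treats the remaining cells as equally likely; the function name, the threshold parameter, and the sample counts are hypothetical and do not come from the paper.

```python
def filter_cloaking_region(query_counts, min_queries=1):
    """Return the attacker's belief over cells after excluding low-query cells.

    query_counts maps a cell name to its historical query count.
    Assumption: surviving cells are treated as equally likely.
    """
    plausible = [cell for cell, n in query_counts.items() if n >= min_queries]
    if not plausible:
        # Nothing can be excluded, so the belief stays uniform over all cells.
        plausible = list(query_counts)
    return {cell: 1.0 / len(plausible) for cell in plausible}


# Hypothetical cloaking region made of four cells.
region = {"lake": 0, "mountain": 0, "mall": 120, "office": 95}
belief = filter_cloaking_region(region)
```

Under these assumptions the user's exposure probability rises from 1/4 per cell to 1/2 per cell once the lake and mountain cells are excluded, which is the effect the double cloaking region is meant to counter.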
The remainder of the paper is organized as follows. Related work is discussed in Section
At present, researchers have put forward a lot of privacy protection methods for LBSs, and
Spatial cloaking [
Dummy position [
Encryption-based methods [
In summary, the existing location privacy protection mechanisms and methods still have the following problems:
The differences of this paper include the following:
In this paper, we assume a strong attacker. LBSPs can be regarded as strong attackers, since they not only hold supplementary information, such as the number of historical queries, but also know the privacy protection mechanism. A strong attacker usually infers the region where the user is located, uses the supplementary information to filter that region, and may even mount a reverse attack based on the privacy protection mechanism. In this way the attacker can uniquely identify the user’s region, infer the user’s real location within it, and finally compromise the user’s privacy.
For example, as shown in Figure
The strong attacker may not only hold the supplementary information but also know the privacy protection mechanism. Suppose that we simply pair the user’s region with the region whose number of historical queries is closest in order to form the double cloaking region, and the attacker knows this mechanism. As shown in Figure
As shown in Figure
As shown in Figure
Improved framework of LBSs system.
Two mechanisms of generating cloaking region.
Cloaking region is formed by randomly matching the number of queries
Cloaking region is formed according to the closest number of query
Data storage structure.
The query request passed by the anonymizer to LBSPs is denoted as
The LBSPs return the candidate results to the anonymizer as
The anonymizer returns candidate results to the user as
The quality of service obtained by the user is measured by the Euclidean distance between the dummy and the user. The closer the dummy is to the user, the more closely the results for the requested location match those for the real location, and the higher the service quality. Assume that the user’s specific position is
For convenience, the notations used in this paper are listed in the Notations section.
In order to protect the user’s real location which is contained in the query request, we employ the double cloaking region mechanism. The double cloaking region includes real cloaking region (
The whole process of our proposed solution is shown as Figure
The improved framework.
In our improved framework, there are two important algorithms, which are dynamic matching algorithm
The main idea of the dynamic matching algorithm (DMA for short) is to separate regions with relatively large, relatively small, and zero numbers of queries, so that two regions with obviously different numbers of queries are never matched together to form a double cloaking region. As shown in Figure
Historical requests on the map.
The positions where users make historical requests
Users are allocated randomly into the 4 × 4 grid region
As shown in Figure
Number of historical queries in 4 × 4 region.
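The per-cell query counts illustrated above can be produced by a minimal sketch such as the following. It assumes planar coordinates in metres measured from the lower-left corner of the map, a hypothetical cell edge of 800 m (a 3.2 km square split into 4 × 4 cells), and row-by-row grid IDs starting at the lower-left cell; the numbering scheme and function names are illustrative, not the paper's.

```python
from collections import Counter


def grid_id(x, y, cell_len, n_cols):
    """Generalize a point into a grid ID (assumed row-major from lower-left)."""
    col = int(x // cell_len)
    row = int(y // cell_len)
    return row * n_cols + col


def count_queries(points, cell_len, n_rows, n_cols):
    """Count historical query points per cell; points outside the grid are skipped."""
    counts = Counter()
    for x, y in points:
        col, row = int(x // cell_len), int(y // cell_len)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[row * n_cols + col] += 1
    # Row 0 of the returned table is the bottom row of the map.
    return [[counts[r * n_cols + c] for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical historical query points (metres).
points = [(100, 100), (900, 100), (850, 50), (3100, 3100), (4000, 10)]
grid = count_queries(points, cell_len=800, n_rows=4, n_cols=4)
```

Only the grid ID, rather than the exact coordinates, would then need to leave the client, which is the generalization step the framework relies on.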
The numbers of queries in the 4 × 4 region are then divided into three categories (relatively large, relatively small, and zero) by the classical dynamic clustering algorithm [
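As a rough illustration of this three-way division and of matching within a class, the sketch below uses a one-dimensional two-means split on the nonzero counts. This is a stand-in chosen for brevity and is only assumed to behave like the clustering algorithm cited in the paper; the function names and the sample counts are hypothetical.

```python
import random


def split_small_large(values, iters=50):
    """1-D two-means: return a threshold separating small from large values."""
    lo, hi = float(min(values)), float(max(values))
    if lo == hi:
        return hi  # all counts equal: everything is treated as "small"
    for _ in range(iters):
        mid = (lo + hi) / 2
        small = [v for v in values if v <= mid]
        large = [v for v in values if v > mid]
        new_lo, new_hi = sum(small) / len(small), sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def classify(counts):
    """Label each cell as 'zero', 'small' or 'large' by its query count."""
    nonzero = [n for n in counts.values() if n > 0]
    threshold = split_small_large(nonzero) if nonzero else 0
    labels = {}
    for cell, n in counts.items():
        if n == 0:
            labels[cell] = "zero"
        elif n <= threshold:
            labels[cell] = "small"
        else:
            labels[cell] = "large"
    return labels


def match_region(user_cell, labels, rng=random):
    """Pick a random partner cell from the same class as the user's cell."""
    same = [c for c, lab in labels.items()
            if lab == labels[user_cell] and c != user_cell]
    return rng.choice(same) if same else None


counts = {0: 0, 1: 2, 2: 3, 3: 40, 4: 45, 5: 0, 6: 4, 7: 50}
labels = classify(counts)
partner = match_region(3, labels, random.Random(0))
```

Choosing the partner at random within the class, rather than always taking the cell with the closest count, keeps the pairing from being a deterministic function of the counts, which is the property the discussion of the strong attacker asks for.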
Take the data in Figure
The pseudocode of
The core idea of dummies generation algorithm
The shortest distance between the user and the random dummy
A two-dimensional coordinate system is established with its origin at the lower-left vertex of the grid. In the anonymizer, there is data of each grid length
When When When When When
Base rule
If If If If
In total,
Take
An illustrating example of
An initial division when
The final division when
According to
In this paper, we use historical GPS sampling points within a 5.5 km × 3.5 km area of Hefei city as historical query points, comprising more than 60,000 sampling points produced by more than 30,000 people. Each record consists of an ID, a latitude, and a longitude, where “ID” is the user’s unique identifier and “longitude” and “latitude” together give the location where the user submits a query. For convenience, the experiment selects an area of 3.2 km × 3.2 km and sets the threshold of edge length
We will compare the dummy algorithm (DA) and naive algorithm (NA) with our proposed
Dummy algorithm.
As shown in Figure
Naive algorithm.
The algorithms are implemented in Python, and the experiments run on 64-bit Windows 10 with an Intel(R) Core(TM) i5-4590 CPU and 8 GB of memory.
As shown in Figure
Comparison on time cost of generating dummies.
While
As shown in Figure
Comparison on time cost of result processing on client side.
Note that the experiment is simulated on a computer, so the experimental results are reported in microseconds (μs); in a real environment, where the results are processed by the client on a smartphone, the time cost would be on the order of milliseconds (ms).
In this section, we compare the total time cost of the three algorithms, taking into account the device performance of the anonymizer and the client. In general, the computing power of our PC is much greater than that of the phones used by clients. In theory, the floating-point performance of a 1.3 GHz quad-core ARM processor is about 10 MFLOPS, and that of a 2.5 GHz quad-core Intel Q8300 is 25 GFLOPS, a 2500-fold difference. Because computing power varies across devices, we conservatively assume that the PC is 500 times as powerful as the client device and that the anonymizer is as powerful as the PC; therefore the total time consumption is
According to formula (
Comparison on total time consumption.
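The device-performance scaling behind this comparison can be checked with a short sketch. It only reproduces the arithmetic quoted in the text (10 MFLOPS versus 25 GFLOPS, and the conservative factor of 500); the helper name and the sample measurement are hypothetical, and the paper's formula for combining anonymizer and client time is not reproduced here.

```python
ARM_FLOPS = 10e6   # about 10 MFLOPS for the quad-core ARM processor, as quoted
PC_FLOPS = 25e9    # 25 GFLOPS for the Intel Q8300, as quoted

# Theoretical gap between the two devices.
theoretical_ratio = PC_FLOPS / ARM_FLOPS

# Conservative factor the text adopts instead of the theoretical gap.
CONSERVATIVE_RATIO = 500


def estimated_phone_time_ms(pc_time_us, ratio=CONSERVATIVE_RATIO):
    """Scale a client-side time measured on the PC (microseconds)
    to an estimated time on the phone (milliseconds)."""
    return pc_time_us * ratio / 1000.0


# Hypothetical measurement: 4 microseconds of result processing on the PC.
phone_ms = estimated_phone_time_ms(4.0)
```

With the factor of 500, a client-side cost of a few microseconds on the PC corresponds to a few milliseconds on the phone, which matches the remark about the unit of the client-side results.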
In order to compare
Comparison on the probability of getting better quality of service.
When
In summary,
We further compare the average quality of service of
Comparison on the average quality of service.
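The quality-of-service measure used in this comparison, the Euclidean distance between the user and the dummy, can be sketched as follows. The averaging over a list of (user, dummy) pairs is an assumption made for illustration, as are the function names and the sample coordinates.

```python
import math


def service_distance(user, dummy):
    """Euclidean distance between the user's position and a dummy position.
    A smaller distance is read as a higher quality of service."""
    return math.dist(user, dummy)


def average_service_distance(pairs):
    """Mean user-to-dummy distance over several (user, dummy) pairs."""
    return sum(service_distance(u, d) for u, d in pairs) / len(pairs)


# Hypothetical positions in a planar coordinate system.
pairs = [((0.0, 0.0), (3.0, 4.0)), ((1.0, 1.0), (1.0, 2.0))]
avg = average_service_distance(pairs)
```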
In this paper, we propose an improved privacy-preserving framework for location-based services based on double cloaking regions with supplementary information constraints. Compared with previous work, our method is effective against strong attacks that exploit supplementary information, and, compared with generating random dummy positions, generating fixed ones improves the service quality and reduces the computational overhead for the client. However, when the distribution of the information data is extremely nonuniform, the dynamic matching algorithm has difficulty finding a region with similar information to pair with the user’s region and form double cloaking regions. In the future, we plan to improve the dynamic matching algorithm; in addition, we will consider continuous query requests from mobile users.
Location-Based Services Providers
Grid ID
Dummies IDs
User ID
A set of query requests submitted by the user to the Anonymizer
A set of query requests passed by the Anonymizer to LBSPs
The set of candidate results sent by LBSPs to Anonymizer
The set of candidate results sent by Anonymizer to Client
RCR randomly matches into the grid of
Spatial hierarchy
Level saturated
Euclidean distance between user and dummy
Dummies Data.
The authors declare that they have no conflicts of interest.
The research is supported by “National Natural Science Foundation of China” (no. 61772560), “Natural Science Foundation of Hunan Province” (no. 2016JJ3154), “Key Support Projects of Shanghai Science and Technology Committee” (no. 16010500100), “Scientific Research Project for Professors in Central South University, China” (no. 904010001), and “Innovation Project for Graduate Students in Central South University” (no. 1053320170313).