Searchable encryption schemes can perform keyword search directly over encrypted data without decryption, which is crucial to cloud storage and has attracted considerable attention in recent years. However, developing an efficient public key encryption scheme that supports conjunctive and disjunctive keyword search simultaneously remains an open problem. To achieve this goal, we introduce a keyword conversion method that transforms the query and index keywords into a vector space model. By applying this vector space model to a predicate encryption scheme supporting inner products, we propose a novel public key encryption scheme with conjunctive and disjunctive keyword search. The experimental results demonstrate that our scheme is more efficient in both time and space, and more suitable for the mobile cloud, than state-of-the-art schemes.
With the rapid development of cloud computing and the growing volume of data, more and more enterprises and individuals are willing to share their data on cloud platforms. Because data stored in the cloud may be sensitive, such as medical records, the popularity of cloud storage inevitably raises security concerns among its users. Specifically, attacks by hackers and theft by administrators can lead to data leakage. A common way to protect data privacy is to encrypt the data before outsourcing it to the cloud server. However, users then face the problem of how to search the encrypted data stored in the cloud efficiently. A straightforward approach is to download all the encrypted data to the client and decrypt it. Users can then search the plaintext documents using common information retrieval techniques. Nevertheless, this strategy incurs tremendous transmission, storage, and computation costs, which raises a new issue: how to search encrypted data efficiently without decrypting it first.
Many searchable encryption (SE) schemes were proposed to realize keyword search over encrypted data with various search functions. There are two main categories of SE according to its applications: searchable public key encryption and searchable symmetric key encryption. Over the last few years, many searchable symmetric key encryption schemes have been proposed, which achieve complex search conditions such as Boolean keyword search, personal keyword search, and query result ranking [
However, research progress on PECDK has been slow. To support disjunctive formulae, Katz et al. proposed a predicate encryption scheme supporting inner products (IPE) [
Although we can create a PECDK scheme by making use of an IPE scheme and a trivial method presented in [
In this paper, we first propose a new method that converts an IPE scheme into a PECDK scheme and then give an instance. Our contributions are summarized as follows.
(1) We design a new approach that converts an index keyword set and a query keyword set into an attribute matrix and a predicate vector, respectively. Technically, we first use the index keyword set to construct an equation of degree n in one unknown. Then, we use the coefficients and the roots of the equation to create a predicate vector and an attribute matrix, respectively.
(2) We propose a construction of PECDK based on the method mentioned in (1) and an efficient IPE scheme proposed in [
The rest of this paper is organized as follows. Related work is discussed in Section
Searchable encryption schemes enable clients to store encrypted data in the cloud and execute keyword search over the ciphertext domain; our solution belongs to this field. Depending on the underlying cryptographic primitives, searchable encryption schemes can be classified into public key systems and symmetric key systems.
Song et al. first introduced the definition of searchable symmetric encryption and proposed a concrete scheme [
Searchable public key encryption has developed more slowly than searchable symmetric encryption, and it also has difficulty supporting complex query conditions. Boneh et al. introduced the concept of PEKS and provided several constructions [
Another class of searchable encryption is called range search over encrypted data. It can be used to test whether a multidimensional point lies inside a hyperrectangle. Related works were presented in [
Consider a data storage service in cloud, where a data owner has a set of documents
The application scenario of searchable public key encryption involves three roles: data senders, a data receiver, and a cloud server, as illustrated in Figure
Architecture of the search over encrypted cloud data.
In this paper, we focus on searchable public key encryption supporting conjunctive and disjunctive keyword search. Specifically, we present a formal definition of the PECDK model, derived from the model proposed in [
There are four polynomial time algorithms in the PECDK scheme:
Generally speaking, the security of a searchable encryption scheme means that the cloud server can infer as little information as possible from the encrypted data and the search process, without sacrificing search ability. Before introducing the adaptive security definition of our scheme, we first define the privacy leakage that is inevitably revealed to the cloud server.
Since the encrypted documents and queries are submitted to the cloud server, the cloud server can easily obtain the basic size information of this encrypted data. This is called the leakage of size pattern.
For each query, the cloud server can obtain the identifiers of data records that match this query. This is called the leakage of access pattern.
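To make these two notions concrete, the toy sketch below shows what a curious server could record without decrypting anything. The class `ServerLog`, its methods `store` and `search`, and the opaque `test` callback are hypothetical names introduced only for illustration; they stand in for the scheme's real algorithms and are not part of it.

```python
from typing import Callable, Dict, List


class ServerLog:
    """Toy model of what the cloud server can observe (not a real scheme)."""

    def __init__(self) -> None:
        self.ciphertexts: Dict[str, bytes] = {}
        self.size_pattern: Dict[str, int] = {}
        self.access_pattern: List[List[str]] = []

    def store(self, doc_id: str, ciphertext: bytes) -> None:
        # Size pattern: the server sees the length of every uploaded blob.
        self.ciphertexts[doc_id] = ciphertext
        self.size_pattern[doc_id] = len(ciphertext)

    def search(self, test: Callable[[bytes], bool]) -> List[str]:
        # Access pattern: the server learns which identifiers match the query.
        matched = sorted(d for d, c in self.ciphertexts.items() if test(c))
        self.access_pattern.append(matched)
        return matched


log = ServerLog()
log.store("doc1", b"\x01" + bytes(31))  # placeholder "ciphertext"
log.store("doc2", b"\x00" + bytes(63))  # placeholder "ciphertext"
result = log.search(lambda c: c[0] == 1)  # hypothetical opaque test
```

After these calls, `log.size_pattern` holds the blob lengths and `log.access_pattern` holds the matched identifiers for each query, which is exactly the information the two definitions above treat as leaked.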
Given a record set
In principle, Oblivious RAM can be used to hide the access and search patterns, but this technique is too inefficient for real applications. In this paper, we do not consider how to protect the access pattern and search pattern in our scheme.
The leakage of query privacy means that the keywords in an encrypted query are revealed to the cloud server. This leakage is common in the public key setting, since anyone can construct an encrypted index for arbitrary keywords and test it against the query. Because our scheme belongs to the public key encryption category, it does not protect query privacy.
As in previous works, we denote the information leakage, including size pattern, access pattern, search pattern, and query privacy, as leakage function
With the leakage function mentioned above, we introduce an adaptive security definition of the PECDK scheme related to the one proposed in [
A PECDK scheme is adaptively index-hiding against chosen plaintext attacks if for all probabilistic polynomial time adversaries Setup: The challenger Phase 1: The attacker Challenge: Phase 2: Response:
We define
Generally speaking, as long as the information leakage of
Based on the system and security models described in the previous section, we now present a method that converts the index and query keyword sets into a vector space model. This model can be easily applied to an IPE scheme.
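As a rough illustration of how keyword sets can be mapped to vectors whose inner products decide a match, the sketch below uses the well-known polynomial-root encoding. It is not the exact conversion defined in this section: the prime modulus `P`, the hash-based `keyword_to_int` encoding, and all function names are assumptions made for the example, and the inner products are computed in the clear, whereas a real scheme would evaluate them under an IPE scheme.

```python
import hashlib

P = 2**61 - 1  # prime modulus; an assumption made for this sketch


def keyword_to_int(word: str) -> int:
    """Hash a keyword to an element of Z_P (hypothetical encoding)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def predicate_vector(query_keywords):
    """Coefficients, lowest degree first, of prod (x - H(q)) mod P."""
    coeffs = [1]
    for q in sorted(set(query_keywords)):
        r = keyword_to_int(q)
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - r * c) % P      # term -r * c * x^i
            nxt[i + 1] = (nxt[i + 1] + c) % P  # term c * x^(i+1)
        coeffs = nxt
    return coeffs


def attribute_matrix(index_keywords, length):
    """One row (1, x, x^2, ...) per index keyword, with x = H(w)."""
    rows = []
    for w in sorted(set(index_keywords)):
        x = keyword_to_int(w)
        rows.append([pow(x, i, P) for i in range(length)])
    return rows


def count_zero_products(matrix, vector):
    """Number of rows whose inner product with the vector is 0 mod P."""
    return sum(
        1 for row in matrix
        if sum(a * b for a, b in zip(row, vector)) % P == 0
    )


def disjunctive_match(index_keywords, query_keywords):
    """True if at least one query keyword appears in the index."""
    v = predicate_vector(query_keywords)
    m = attribute_matrix(index_keywords, len(v))
    return count_zero_products(m, v) >= 1


def conjunctive_match(index_keywords, query_keywords):
    """True if every query keyword appears in the index."""
    v = predicate_vector(query_keywords)
    m = attribute_matrix(index_keywords, len(v))
    return count_zero_products(m, v) == len(set(query_keywords))
```

Each row's inner product with the coefficient vector equals the query polynomial evaluated at the hashed index keyword, so it is zero exactly when that keyword is one of the query keywords (up to a negligible chance of hash collisions modulo `P`). Counting the zero products then separates the disjunctive case (at least one) from the conjunctive case (as many as there are query keywords).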
We suppose that any keyword
We first construct an equation of degree For the keyword set
According to the coefficient of the For the keyword set
It is not difficult to find that the roots of the equation Note that if there is a keyword
As a result, if we can make sure there is
According to the definition of IPE [ If the symbol in Choosing a counter If The algorithm outputs 0 and ends. If the symbol in Choosing two counters i and j and setting If If
The proposed PECDK scheme can be constructed by making use of the fully secure IPE scheme. Therefore, we have the following proposition.
We denote the previous PECDK scheme [
Index structure for a single file
From Figure
From Figure
Let
Comparison with the previous PECDK scheme.
Metric | PECDK-2 | PECDK-1 |
---|---|---|
pk size | | |
sk size | | |
Trapdoor size | | |
Index size | | |
Encryption time | | |
Test time | | |
Since pairing and exponentiation operations cost far more time than other operations in the encryption and test phases, we do not take the other operations into account. According to Table
In the following, we will argue that our statement where
Statistics for data collections used in the domain of information retrieval.
Dataset name | Documents | Vocabulary size |
---|---|---|
AP88-89 | 164,597 | 247,350 |
WSJ87-92 | 173,252 | 216,539 |
DOTGOV | 1,247,442 | 3,051,601 |
MedTrack | 100,866 | 55,065 |
Yelp2013 | 335,018 | 211,245 |
Yelp2014 | 1,125,457 | 476,191 |
Yelp2015 | 1,569,264 | 612,636 |
IMDB review | 348,415 | 115,831 |
Yahoo answer | 1,450,000 | 1,554,607 |
Amazon review | 3,650,000 | 1,919,336 |
Moreover, we have investigated the OHSUMED collection [
Statistics for documents’ information in the OHSUMED collection (#
Field name | # | Vocabulary size |
---|---|---|
Title | 5∼20 | 33059 |
Abstract | 50∼200 | 83496 |
In addition, since PECDK-1 needs to test each keyword
For our experiments, we build artificial plaintext index with different number of keywords in a dictionary (i.e.,
For a query with five keywords, Figures Figures Figures Figures Figures
Impact of
Impact of
According to the item (4), since
The running time of key generation in PECDK-1 is less than
Figure
Tables
Impact of
Size of | 5 | 10 | 15 | 20 |
---|---|---|---|---|
PECDK-1 (ms) | 576 | 963 | 1338 | 1805 |
PECDK-2 (ms) | 5564 | 5562 | 5565 | 5603 |
Impact of N on the time consumption of trapdoor generation (
Size of | 50 | 100 | 150 | 200 |
---|---|---|---|---|
PECDK-1 (ms) | 592 | 576 | 589 | 606 |
PECDK-2 (ms) | 2862 | 5564 | 8224 | 10860 |
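The trends in the two tables above can be summarized with a quick back-of-the-envelope calculation. The sketch below copies the reported times and computes the average change in trapdoor-generation time per unit of the varied parameter. It assumes that the first table varies the number of query keywords and the second varies N, which is how the surrounding discussion reads; the variable names and the helper `average_rate` are introduced only for this example.

```python
# Trapdoor-generation times (ms) copied from the two tables above.
query_sizes = [5, 10, 15, 20]  # assumed: number of query keywords
pecdk1_by_query = [576, 963, 1338, 1805]
pecdk2_by_query = [5564, 5562, 5565, 5603]

n_values = [50, 100, 150, 200]  # values of N
pecdk1_by_n = [592, 576, 589, 606]
pecdk2_by_n = [2862, 5564, 8224, 10860]


def average_rate(xs, ys):
    """Average change in y per unit of x between the first and last point."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])


rates = {
    "PECDK-1 per query keyword": average_rate(query_sizes, pecdk1_by_query),
    "PECDK-2 per query keyword": average_rate(query_sizes, pecdk2_by_query),
    "PECDK-1 per unit of N": average_rate(n_values, pecdk1_by_n),
    "PECDK-2 per unit of N": average_rate(n_values, pecdk2_by_n),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} ms")
```

Under these assumptions, the PECDK-1 time grows mainly with the first parameter and the PECDK-2 time grows mainly with N, while each scheme is nearly flat in the other parameter.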
Generally speaking, Figures
As shown in Figures The parameters The storage of the index in PECDK-1 rises with the square of
Impact of
Although index structures such as inverted indexes and R-trees can improve search efficiency, they fail to support dynamic operations in the public key setting. Because anyone with access to the pk can construct an index in this setting, it is difficult to combine the indices obtained from different data senders into a single structured index. Our proposal naturally supports dynamic document updates, since each document is associated with its own encrypted index.
In addition, because of the simple index structure, the search process can easily be accelerated through parallel computation. Thus, we argue that our scheme is practical on cloud platforms.
In this paper, we proposed a new approach to constructing an efficient PECDK scheme with better time and space complexity under an adaptive security model. To demonstrate the efficiency of the proposed scheme, we compared it with the existing PECDK scheme presented in [
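Because each document carries its own index, the per-document tests are independent of one another and can be distributed across workers. The sketch below shows one possible way to do this with Python's standard `concurrent.futures` module. The function `parallel_search` and the plaintext `toy_test` are hypothetical stand-ins: a real deployment would call the scheme's test algorithm on encrypted indices, and for CPU-bound pairing computations it would likely use processes or a native library rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def parallel_search(
    indices: Dict[str, object],
    trapdoor: object,
    test: Callable[[object, object], bool],
    workers: int = 4,
) -> List[str]:
    """Run the per-document test independently and collect matching ids."""
    doc_ids = list(indices)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as doc_ids.
        flags = list(pool.map(lambda d: test(indices[d], trapdoor), doc_ids))
    return [d for d, hit in zip(doc_ids, flags) if hit]


def toy_test(index: set, trapdoor: set) -> bool:
    """Plaintext stand-in for the test algorithm: conjunctive containment."""
    return trapdoor <= index


toy_indices = {
    "d1": {"cloud", "search"},
    "d2": {"storage"},
    "d3": {"cloud"},
}
matches = parallel_search(toy_indices, {"cloud"}, toy_test)
```

Adding or removing a document only changes one entry of `toy_indices`, which mirrors the point made above that a per-document index makes updates straightforward.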
Since
The authors declare that they have no conflicts of interest.
The authors gratefully acknowledge the National Natural Science Foundation of China under Grant nos. 61402393 and 61601396 and Shanghai Key Laboratory of Integrated Administration Technologies for Information Security (no. AGK201607).