A Blockchain-Based Normalized Searchable Encryption System for Medical Data

1 The School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
2 The Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China
3 The Shanghai Key Laboratory of Privacy-Preserving Computation, Matrix-Elements Technologies, Shanghai 201204, China
4 The Department of Ophthalmology, Renmin Hospital of Wuhan University, Wuhan, Hubei, China


Introduction
With the rapid spread of COVID-19 around the world, the healthcare industry has accelerated the shift to digital healthcare services [1,2]. In the era of big data, many hospitals prefer to use remote cloud servers to store and manage huge amounts of electronic medical data. However, due to inherent properties of the cloud such as centralization and openness, cloud-assisted medical data systems face new privacy and security challenges [3]. First, medical data outsourced to cloud servers may be accessed or tampered with by unauthorized users, so the confidentiality and integrity of the medical data cannot be guaranteed. Second, centralized cloud servers may suffer from a single point of failure, which results in the unavailability of medical data. Although traditional encryption technology can ensure security, it is difficult for it to also preserve the availability of outsourced medical data.
Searchable encryption (SE) is a critical cryptographic technique that keeps data available while ensuring security and privacy by enabling users to search ciphertext data [4]. In a searchable encryption scheme, the data owner uploads the encrypted data to the cloud server. Then, the user constructs a trapdoor and submits it to the cloud server to search for data containing the target keywords. In most cases, the server is regarded as a semihonest third-party entity [5,6]; i.e., the server will perform the search operation correctly according to the protocol. However, in practical scenarios, the server may return mismatched search results due to economic interest or a single point of failure.
Blockchain is a decentralized computing paradigm with public verifiability and tamper-proof features [7]. Applying blockchain to searchable encryption can effectively solve the problem of untrustworthy search results from centralized servers [8][9][10]. Smart contracts deployed on the blockchain can perform search functions instead of third-party servers and automatically execute search protocols based on trigger conditions to produce correct results. In addition, blockchain nodes record transaction results in an immutable ledger, which guarantees the integrity of the results and eliminates the need for further validation. Even if one or more nodes fail or are corrupted by malicious adversaries, the correctness of the results is not affected because of the fault tolerance of the blockchain.
In a blockchain-based searchable system, data and search structures are stored as ciphertext, and thus, the legality of the data in the system cannot be guaranteed. In the medical scenario, encrypted medical data must be supervised to ensure its legality. For example, supervisors need to filter search requests that contain illegal keywords to prevent the spread of false medical information. In addition, supervisors are supposed to check the legality of ciphertext data in the remote cloud when they suspect illegal data or when a user files a complaint with them. The controllability of medical data is key to maintaining a stable healthcare system, yet there is a lack of research on the supervision of ciphertext data.

Motivation and Contributions.
Inspired by the work [11], a blockchain-based searchable public key encryption with forward and backward privacy can be used to design a searchable encryption system for medical data. In medical scenarios, multiuser search functions and dynamic updates for authorized users need to be supported to meet data sharing requirements. In addition, to ensure the stability of the medical searchable encryption system, the ciphertext data in the system need to be legally supervised. As for the blockchain platform, the consortium chain Hyperledger Fabric is a good option considering practicality and privacy. Based on the above analysis and the practical application scenarios of medical data, we propose a blockchain-based multiuser searchable encryption scheme supporting supervision and design a searchable encryption system for medical data based on the scheme. The main contributions of this study are as follows:
(i) We propose a blockchain-based multiuser normalized searchable encryption (BNSE) scheme, which achieves efficient retrieval of ciphertext data in multiuser scenarios and supports supervision of the ciphertext data.
(ii) We design a blockchain-based normalized searchable encryption system for medical data (BNSEM) based on the above scheme, which realizes retrieval over encrypted medical data.
(iii) We evaluate the theoretical performance of the scheme and test the practical performance of the system to verify its availability.

Organization.
In Section 2, we review the existing research work related to the security and functionality of searchable encryption. In Section 3, we introduce the blockchain technology and broadcast encryption, as well as the security definition. In Section 4, we describe the specific construction of our proposed scheme BNSE and prove its security. In Section 5, we present the design of our system BNSEM. In Section 6, we provide the security analysis of BNSEM. In Section 7, we conduct a performance evaluation of the system. Section 8 concludes this study.

Related Work
In 2000, Song et al. [4] proposed the first SE scheme, which is a noninteractive single-keyword search scheme. The drawback of the scheme is that it is extremely inefficient when the number of documents is large. However, this pioneering work still greatly contributed to the research and development of searchable encryption. Later, many works [12,13] focused on designing efficient mechanisms to enhance security. Meanwhile, some works extended the functionality of searchable encryption, including the multiuser SE scheme [14] and the dynamic SE scheme [15].
To balance security and efficiency, a practical SE scheme will leak some information to the adversary. However, the information leakage attack undermines the security of SE schemes [16]. Adaptive leakage exploit attacks have brought more attention to forward privacy [17]. Song et al. [12] proposed two schemes FAST and FASTIO, both of which have forward privacy. In addition, Bost et al. [6] presented a formal definition of backward privacy, and backward privacy gradually became a major security property of interest. Chamani et al. [18] proposed improvements in various aspects of performance to the work [6].
To avoid the problem of key management and distribution restrictions prevalent in symmetric searchable encryption (SSE) schemes, Boneh et al. [19] proposed the first searchable public key encryption (SPE) scheme, which is a noninteractive single-user search scheme. However, a significant limitation of SPE is that it contains a large number of time-consuming operations, such as bilinear pairings and exponentiations. In 2020, Chen et al. [11] proposed a lightweight SPE scheme with search performance close to that of efficient SSE. However, the scheme does not implement multiuser search and cannot share data in multiuser scenarios.
From a functional point of view, most of the current research efforts focus on symmetric searchable encryption schemes that support only single-user search mode; i.e., the data user is the data owner.
The few SSE schemes that support multiuser search also require the owner to calculate a search trapdoor online [20]. Multiuser searchable encryption (MUSE) [21] is an important research topic in SE with practical significance. In MUSE, a data owner uploads data to a cloud server and wants to share the data with multiple users. Attribute-based searchable encryption (ABSE) [22] can manage the retrieval of ciphertext data in a multiuser scenario, but it is computationally inefficient and lacks practicality.
Broadcast encryption (BE) [23] enables multiuser data sharing and is suitable for scenarios where data users are relatively fixed. Liu et al. [24] designed a multiuser searchable encryption scheme based on a single-user system prototype and inherited the functionality of adding, modifying, and deleting documents from the original dynamic scheme. However, scheme [24] requires online search trapdoor generation and multiple rounds of client-server interaction, which increases the communication overhead. Later, Liu et al. [25] combined public key authenticated encryption supporting keyword search with BE and proposed a broadcast authenticated encryption primitive supporting keyword search (BAEKS); however, the scheme reaches a performance bottleneck once the number of users grows beyond a certain point.
In most existing schemes, the search server is regarded as an honest third party that performs the prescribed search protocol [26]. However, the search server may be a malicious third party that returns partial or even mismatched search results due to profit or random failures. The main reason for these problems is that centralized servers have complete control over the data and execute the protocols independently without supervision. In view of this, blockchain technology [7], a decentralized computing paradigm with public verifiability and immutability, combined with searchable encryption [14] can effectively solve the problem of untrustworthy third-party search results.
There are two ways to combine blockchain with searchable encryption: one is to use the blockchain for storing credentials and the other is to use the blockchain's smart contracts to perform the search function. The first approach still follows the traditional server-side search and stores the transaction credentials on the blockchain [27]. Cai et al. [8] designed a dynamic and efficient searchable encryption scheme using blockchain. Tang [28] extended searchable encryption by saving essential messages on the blockchain; the scheme performs only a small number of operations on the blockchain, thus reducing the burden on it. When disputes arise, the misconduct of participants can be revealed through transactions on the blockchain. However, using the blockchain to store credentials still does not prevent the malicious behavior of servers. Therefore, researchers have also proposed an alternative construction in which smart contracts that include search functions perform the keyword search operation instead of cloud servers [14]. Chen et al. [29] used electronic health record (EHR) file indexes to construct complex logical structures and stored them on a blockchain so that data users can search the file indexes using these logical expressions. Hu et al. [20] enabled users to search private databases in a blockchain environment and implemented dynamic access control for searches. However, all of the above schemes outsource complex operations or encrypted data to the blockchain, which greatly degrades the performance of the system. Chen et al. [11] designed a blockchain-based searchable public key encryption scheme with only lightweight hash operations.

Preliminaries
In this section, we introduce the blockchain technology, broadcast encryption, system model, security definition, and design goals.

Blockchain Technology.
In 2008, blockchain technology received widespread attention following the publication of the Bitcoin white paper [7]. Blockchain provides a distributed, immutable, secure, transparent, and auditable ledger. The blocks in a blockchain store the transactions of a specific period, and their hash values are recorded by a Merkle tree. The transaction data on the blockchain are shared in a P2P (peer-to-peer) network, and the security of the transaction data is ensured by cryptographic primitives (Merkle tree, asymmetric encryption, and digital signatures).
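To make the role of the Merkle tree concrete, the following is a minimal sketch of how the transaction hashes of one block can be folded into a single root. It is an illustration only: the function name `merkle_root`, the use of single SHA-256, and the convention of duplicating the last node on odd-sized levels are assumptions, and real platforms differ in these details.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Fold the hashes of all transactions into one root hash.

    Assumed convention: when a level has an odd number of nodes,
    the last node is duplicated before pairing.
    """
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Changing any single transaction changes the root, which is why recording only the root in the block header is enough to detect tampering with the block's transactions.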
Since blockchain operates on a P2P network, a P2P network including a number of blockchain nodes (peer nodes, orderer nodes, etc.) needs to be created before deploying a blockchain platform. Each node holds two keys that can be used for encryption and signature. When a transaction is initiated, one node signs the transaction and broadcasts it to other peer nodes. When another node receives the signed transaction, it needs to verify the validity of the transaction before broadcasting it. The peer nodes (also known as miners) collect enough signatures for this transaction, pack it into a block, and store it on the blockchain after it passes consensus.
Smart contract: a smart contract contains a set of rules and logic and is a decentralized, information-sharable piece of program code deployed on the blockchain. The parties involved in signing a contract agree on the content of the smart contract and deploy it on the blockchain, which can automate the execution of the contract without relying on any third-party authority [30]. Smart contracts run automatically once started without the intervention of any contract signatory.

Broadcast Encryption.
A public key broadcast encryption scheme consists of four algorithms, namely system setup (Setup), key generation (KeyGen), encryption (Encrypt), and decryption (Decrypt), defined as follows:
(i) $\mathrm{Setup}(\kappa) \to (N, EK)$: with the security parameter $\kappa$ as input, the maximum capacity $N$ of the broadcast receiver group and the initial encryption key list $EK$ are output.
(ii) $\mathrm{KeyGen}(\kappa, EK) \to (pk, sk)$: with the security parameter $\kappa$ and the encryption key list $EK$ as input, the user's public-private key pair $(pk, sk)$ is output and the public key $pk$ is added to the key list $EK$.
(iii) $\mathrm{Encrypt}(S, EK, m) \to C_m$: with a receiver subset $S$, the encryption key list $EK$, and a plaintext message $m$ to be broadcast as input, the public keys $PK = \{pk_1, \ldots, pk_n\}$ corresponding to the users in the subset $S$ are selected from the encryption key list $EK$. The broadcast ciphertext $C_m$ of the message $m$ encrypted under the key set $PK$ is output. Note that the broadcast ciphertext can only be correctly decrypted by the receivers in $S$.
(iv) $\mathrm{Decrypt}(sk, C_m) \to m$: taking the user's private key $sk$ and the broadcast ciphertext $C_m$ as input, if the user $u_i \in S$ holds the private key $sk$, then $u_i$ can use $sk$ to decrypt the broadcast ciphertext $C_m$ and output the broadcast message $m$.

System Model.
The system model of our BMNSE scheme is shown in Figure 1. It consists of six entities: trusted institute (TI), cloud server (CS), blockchain (BC), data owner (DO), data user (DU), and supervisor (SUP). Although the TI and SUP are not involved in the main process of data search, they are still two indispensable entities that play an important role in the execution of the scheme and the maintenance of the ecosystem. Before the scheme runs, TI first generates the parameters required for system initialization and issues public key certificates for users who join the system; TI is offline the rest of the time.
After the initialization is completed, the scheme performs five main steps, which are described as follows:
(1) Encrypt File. The DO first encrypts the data file using a symmetric encryption algorithm and then encrypts the symmetric key using a public key cryptography algorithm. Finally, the DO uploads the ciphertext to CS.
The DU generates a search trapdoor containing the target keyword and then sends the search request containing the trapdoor to a nearby blockchain node. The search request triggers the search process of the smart contract, which then returns the index of all matching encrypted files.
(4) Access the Files. The DU first decrypts the encrypted file index returned by the smart contract in step 3 and then accesses the data in the CS after getting the plaintext index.
To ensure the legitimacy of the data in the scheme, the necessary supervision of ciphertext data by SUP is required. SUP has two main tasks: first, carrying out periodic audits of the ciphertext data stored on CS, and second, scrutinizing the search requests of DUs. The purpose of the ciphertext data audit is to detect data files that contain illegal or sensitive keywords, promptly revoke illegal files hosted on CS, and warn or punish the corresponding DO. The purpose of scrutinizing search requests is to monitor keyword search requests sent by DUs to the BC in real time and to intercept and warn against noncompliant search requests.
Based on the above system model, the following eight algorithms are defined in our scheme:
(i) $\mathrm{Setup}(\kappa) \to \mathrm{Param}$: it is executed by TI, takes the security parameter $\kappa$ as input, and outputs the system public parameter Param.
(ii) $\mathrm{KeyGen}(\mathrm{Param}) \to (Q_u, d_u)$: it is executed by TI, takes the public parameter Param as input, and outputs the user's public-private key pair $(Q_u, d_u)$.
(iii) $\mathrm{Encrypt}(\mathrm{Param}, Q_{u_i}, DB, \Sigma) \to EDB$: this algorithm is executed by DO. The input parameters contain the system public parameter Param, the public keys $Q_{u_i}$ of the authorized DUs, the database $DB$, and an empty mapping $\Sigma$. The algorithm outputs the searchable encrypted database $EDB$ and the initialized mapping $\Sigma$.
(iv) Authorization update: this algorithm is executed by DO with input parameters including the system parameter Param, the public key set $Q_{u_i'}$ of the users to be authorized, the original broadcast ciphertext $\vec{Z}$, the target keyword $w_k$, and the secret values $r, s$ saved by DO, where $r$ is the secret value associated with version information and $s$ is the secret value involved in the encryption calculation. The algorithm outputs the updated broadcast ciphertext.
(v) $\mathrm{Trapdoor}(\mathrm{Param}, d_{u_i}, w_k) \to T_{w_k}$: it is executed by DU with the public parameter Param, the authorized user's private key $d_{u_i}$, and the target keyword $w_k$ as input and outputs the search trapdoor $T_{w_k}$.
(vi) $\mathrm{Search}(\mathrm{Param}, T_{w_k}, EDB) \to RS(w_k)$: it is automatically executed by the smart contract, takes the system parameter Param, the search trapdoor $T_{w_k}$ for the keyword $w_k$, and the encrypted database $EDB$ as input, and outputs the matched search results $RS(w_k)$.
(vii) $\mathrm{Decrypt}(\mathrm{Param}, d_{u_i}, w_k, RS(w_k)) \to ind_{w_k}$: this algorithm is executed by DU, takes the private key $d_{u_i}$ and the search result set $RS(w_k)$ as inputs, and outputs the decrypted file index set $ind_{w_k}$.
(viii) $\mathrm{Supervise}(\mathrm{Param}, Q_{u_i}, d_{sup}, W^*) \to T_{w_k}$: this algorithm is executed by SUP with input parameters including the system parameter Param, the public keys $Q_{u_i}$ of authorized users, the illegal or sensitive keywords $w_k \in W^*$, and the private key $d_{sup}$ of the supervisor, and outputs the search trapdoors $T_{w_k}$ of the sensitive words.

Security Definition.
Similar to [12], we demonstrate the confidentiality of our BMNSE scheme with a real/ideal simulation paradigm. To achieve higher operational efficiency, searchable encryption schemes disclose some information to the server. The leakage information of our scheme is described by the leakage function $L = \{L_{Setup}, L_{KeyGen}, L_{Encrypt}, L_{Trapdoor}, L_{Search}, L_{Supervise}\}$. Informally, a searchable encryption scheme is confidential if no information about the database is revealed other than the information captured by the leakage function $L$. The formal definition of confidentiality can be presented by a real/ideal simulation paradigm containing the games $\mathrm{Real}_{SE}$ and $\mathrm{Ideal}_{SE}$.

Definition 1.
Let $\Pi = (\mathrm{Setup}, \mathrm{KeyGen}, \mathrm{Encrypt}, \mathrm{Trapdoor}, \mathrm{Search}, \mathrm{Supervise})$ denote the BMNSE scheme, $\mathcal{A}$ denote the adversary, and $\mathcal{S}$ be a simulator with the leakage function $L = \{L_{Setup}, L_{KeyGen}, L_{Encrypt}, L_{Trapdoor}, L_{Search}, L_{Supervise}\}$ as a parameter. The following two probabilistic game experiments are defined:

$\mathrm{Real}^{\Pi}_{\mathcal{A}}(\kappa)$: the game runs the system setup algorithm $\mathrm{Setup}(\kappa)$ to generate the system parameters Param and the key generation algorithm $\mathrm{KeyGen}(\mathrm{Param})$ to generate the user's public-private key pair $(Q_u, d_u)$. The game publishes the public message $(\mathrm{Param}, Q_u)$ and keeps the private key $d_u$ secret. Then, the adversary $\mathcal{A}$ selects a database $DB$ and issues an encryption query based on the information $(\mathrm{Param}, Q_{u_i})$. Next, the game runs the encryption algorithm $\mathrm{Encrypt}(\mathrm{Param}, Q_u, DB, \Sigma) = EDB$ and returns the encrypted database $EDB$ to $\mathcal{A}$. $\mathcal{A}$ chooses a keyword $w_k$ for the trapdoor query, and the game runs the trapdoor generation algorithm $\mathrm{Trapdoor}(\mathrm{Param}, d_{u_i}, w_k) = T_{w_k}$ and returns the trapdoor $T_{w_k}$ to $\mathcal{A}$. Then, $\mathcal{A}$ selects a trapdoor $T_{w_k}$ for the search query, and the game runs the search algorithm $\mathrm{Search}(\mathrm{Param}, T_{w_k}, EDB) = RS$ and returns the result set $RS$ to $\mathcal{A}$. The adversary $\mathcal{A}$ can repeat the above steps several times and finally outputs a bit $b \in \{0, 1\}$.

$\mathrm{Ideal}^{\Pi}_{\mathcal{A},\mathcal{S}}(\kappa)$: the simulator $\mathcal{S}$ generates the system public parameter $\mathrm{Param} \leftarrow \mathcal{S}(L_{Setup})$ using the leakage function of system setup. Then, $\mathcal{S}$ generates the user's public-private key pair $(d_u, Q_u) \leftarrow \mathcal{S}(L_{KeyGen})$ based on the public parameter Param and the leakage function $L_{KeyGen}$ and publishes the public key list $Q_{u_i}$. Next, the adversary $\mathcal{A}$ issues an encryption query, and the simulator $\mathcal{S}$ generates an encrypted database $EDB \leftarrow \mathcal{S}(L_{Encrypt})$ and returns it to $\mathcal{A}$. Then, the simulator $\mathcal{S}$ uses the leakage function of the trapdoor to generate a search trapdoor $T_{w_k} \leftarrow \mathcal{S}(L_{Trapdoor})$ in response to a trapdoor query from $\mathcal{A}$. After the adversary issues a search query, the simulator returns the result $RS \leftarrow \mathcal{S}(L_{Search})$ using the leakage function of the search. Finally, the adversary $\mathcal{A}$ outputs a bit $b \in \{0, 1\}$.

Scheme $\Pi$ satisfies $L$-adaptive security if for any probabilistic polynomial time (PPT) adversary $\mathcal{A}$, there exists a PPT simulator $\mathcal{S}$ such that

$$\left| \Pr\left[\mathrm{Real}^{\Pi}_{\mathcal{A}}(\kappa) = 1\right] - \Pr\left[\mathrm{Ideal}^{\Pi}_{\mathcal{A},\mathcal{S}}(\kappa) = 1\right] \right| \le \mathrm{negl}(\kappa),$$

where $\mathrm{negl}(\kappa)$ is a negligible function.

Design Goals.
Combining the above system model and practical application requirements, our scheme should meet the following functional objectives:
(i) Supervisibility. Supervision can ensure the controllability of the ciphertext data. The SUP needs to supervise the encrypted data in the CS and the DU's search requests to ensure that the data are stored and used in a legal and compliant manner.
(ii) Multiuser Search. Multiuser search is a basic function in data sharing scenarios. In this scenario, multiple DUs need to be authorized to access the encrypted data to provide easier data retrieval services.

A Multiuser Normalized Searchable Encryption Scheme via Blockchain
In this section, we describe the specific construction of BNSE in detail and present its security proof. The algorithms are constructed as follows.

Setup($\kappa$): the setup algorithm takes the security parameter $\kappa$ as input. It generates parameters $(G_1, G_2, q, P, e)$ for the bilinear map system, where $G_1$ is an additive group and $G_2$ is a multiplicative group with the same prime order $q$, $P$ is a generator of $G_1$, and $e: G_1 \times G_1 \to G_2$ is a bilinear map. Then, the algorithm picks several secure hash functions $H_0, H_1, H_2, H_3, H_4$ and $h_1, h_2, h_3, h_4$. Then, the algorithm selects a pseudorandom permutation $F$ together with its inverse $F^{-1}$. Finally, it outputs the public parameter $\mathrm{Param} = \{G_1, G_2, q, P, e, H_{0,1,2,3,4}, h_{1,2,3,4}, F, F^{-1}\}$.

KeyGen(Param): $d_u \in Z_q^*$ is generated randomly, and $D_u = d_u \cdot P$ is computed. The private key of the data user is $d_u$, and $D_u$ is a secret value used to derive the public key. Given the public key $Q_{sup}$ of the supervisor, $t_u \in Z_q^*$ is randomly selected and $Q_u^1 = t_u \cdot D_u$, $Q_u^2 = d_u \cdot t_u \cdot Q_{sup} + D_u$, and $Q_u^3 = H_0(D_u) \cdot P$ are computed. Then, the key generation algorithm outputs the user's public key $Q_u = (Q_u^1, Q_u^2, Q_u^3)$.

Encrypt(Param, $Q_{u_i}$, $DB$, $\Sigma$): the input parameters of the encryption algorithm contain the system public parameter Param, the authorized data users' public keys $\{Q_{u_i}\}_{i \in [1,n]}$, where $n$ is the number of authorized users, the database $DB$ (where $m_{w_k}$ is the number of files containing the keyword $w_k$), and the mapping $\Sigma$, where $\Sigma[key] = value$ stores the keyword state pointer, which makes it possible to trace back to the last update of the files including the keyword. Then, the following steps are performed:
(1) $r \in Z_q^*$ is randomly selected, and the version information $VI = r \cdot P$ is computed for the encrypted database.
(2) Knowing that the authorized users of the encrypted data are $\{u_i\}_{i \in [1,n]}$ and each user's public key is $Q_{u_i} = (Q^1_{u_i}, Q^2_{u_i}, Q^3_{u_i})$, the broadcast polynomial $f(z) = \prod_{i=1}^{n} (z - H_0(r \cdot Q^3_{u_i})) + s$ is constructed with the secret value $s$, as stated in the correctness analysis.

For each keyword $w_k$ in the keyword set $W$:
(1) The state pointer map $\Sigma[w_k] \to (pt^{c}_{w_k}, c)$ of the keyword $w_k$ is retrieved. If the retrieval result is empty, then the state of the current keyword is initialized: let $c \leftarrow 0$ and $pt^{c}_{w_k} \leftarrow \{0,1\}^{\kappa}$, where $pt^{0}_{w_k}$ is not involved in information storage and $c$ is the number of times the keyword $w_k$ has been updated. If the retrieval result is not empty, no initialization is required. The pseudorandom permutation key $k^{c+1}_{w_k} \leftarrow \{0,1\}^{\kappa}$ is randomly generated, and $pt^{c+1}_{w_k} = F(k^{c+1}_{w_k}, pt^{c}_{w_k})$ is calculated. Subsequently, the local mapping $\Sigma[w_k] = (pt^{c+1}_{w_k}, c + 1)$ is updated.

The encrypted database is obtained through the above calculation.

For the authorization update, the inputs include the public keys of the users, where $n'$ is the number of all authorized users, and the secret values $r, s$ saved by the data owner, where the random number $r$ is tied to the version information of the keyword $w_k$ and the secret value $s$ is used to generate the trapdoor and the encrypted index. The authorization update is performed on the file index set $Ind_{w_k}$ of the keyword $w_k$ by forming the new polynomial $f'(z) = \prod_{i=1}^{n'} (z - H_0(r \cdot Q^3_{u_i})) + s$, where $n'$ is the total number of all authorized users.
To improve the computation efficiency, the original polynomial ciphertext $f(z)$ can be reused: first the secret value $s$ is subtracted from the polynomial $f(z)$, and then $f(z) - s$ is multiplied by the term generated from the public keys of the users to be authorized to get a new polynomial. Finally, the secret value $s$ is embedded into this polynomial to get the new authorized polynomial $f'(z)$. The update process only needs to calculate the terms of the users to be authorized on top of the original ciphertext. In addition, the previously authorized users can still use the original vector $\vec{Z}$ to compute the trapdoor and to decrypt.
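The polynomial-based authorization and its incremental update can be sketched as follows. This is a toy illustration under stated assumptions, not the scheme's implementation: the field modulus `Q`, the helper names, and the use of SHA-256 over byte strings as a stand-in for $H_0(r \cdot Q^3_{u_i})$ are all hypothetical, and no elliptic-curve or pairing operations are performed.

```python
import hashlib

# Toy prime field modulus (the Mersenne prime 2^127 - 1); an assumption
# standing in for the group order q of the scheme.
Q = 2**127 - 1


def h0(data: bytes) -> int:
    """Stand-in for H_0(r * Q^3_u): hash a byte string into the field."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def mul_linear(coeffs: list[int], root: int) -> list[int]:
    """Multiply a polynomial (coefficients low to high) by (z - root)."""
    out = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i] = (out[i] - c * root) % Q
        out[i + 1] = (out[i + 1] + c) % Q
    return out


def broadcast_poly(roots: list[int], s: int) -> list[int]:
    """Build f(z) = prod(z - x_i) + s over the toy field."""
    coeffs = [1]
    for x in roots:
        coeffs = mul_linear(coeffs, x)
    coeffs[0] = (coeffs[0] + s) % Q
    return coeffs


def authorize(coeffs: list[int], new_roots: list[int], s: int) -> list[int]:
    """Update: f'(z) = (f(z) - s) * prod(z - x_new) + s."""
    g = list(coeffs)
    g[0] = (g[0] - s) % Q          # remove the embedded secret
    for x in new_roots:
        g = mul_linear(g, x)       # add one factor per new user
    g[0] = (g[0] + s) % Q          # embed the secret again
    return g


def evaluate(coeffs: list[int], x: int) -> int:
    """Evaluate the polynomial at x with Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc
```

An authorized user evaluates the polynomial at their own value and recovers `s`, because their factor makes the product vanish. The update touches only the factors of the newly authorized users, and evaluating the old polynomial at an old user's value still yields `s`, which matches the remark that previously authorized users can keep using the original vector.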
Trapdoor(Param, $d_{u_i}$, $w_k$): with the system parameter Param as input, only an authorized user in $\{u_i\}_{i \in [1,n]}$ can use his private key $d_{u_i}$ to compute the trapdoor of the keyword $w_k$. The steps of the trapdoor calculation are as follows:
(1) The version information $VI = r \cdot P$ of the keyword $w_k$ is obtained.
Search(Param, $T_{w_k}$, $EDB$): the search algorithm is the inverse process of the encryption algorithm, with the public parameter Param, the trapdoor $T_{w_k}$ of the keyword $w_k$, and the encrypted database $EDB$ as input parameters. An empty set $RS(w_k) \leftarrow \emptyset$ is initialized to store the search results. Then, the following steps are performed: the corresponding record is retrieved from the encrypted database. If $inc_{t'_{w_k}} = \perp$, then the search algorithm is terminated and the search result set $RS(w_k)$ is returned. Otherwise, the encrypted index $EI^{j}_{w_k}$ is computed, and then $(j, EI^{j}_{w_k})$ is inserted into the search result set $RS(w_k)$.
(4) Using the state pointer $pt^{c+1}_{w_k}$ of the keyword $w_k$ and the pseudorandom permutation key $k^{c+1}_{w_k}$ obtained in step 2, the previous state pointer $pt^{c}_{w_k} = F^{-1}(k^{c+1}_{w_k}, pt^{c+1}_{w_k})$ is computed. Let $pt^{c+1}_{w_k} \leftarrow pt^{c}_{w_k}$, and then, the algorithm returns to step 2.
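The backward walk along the state-pointer chain can be sketched as follows. This is a simplified illustration under several assumptions: the permutation $F$ is replaced by a toy four-round Feistel network built from HMAC-SHA256, the record layout (permutation key followed by a payload, masked with a hash of the pointer) is hypothetical, and the trapdoor and pairing computations are left out entirely.

```python
import hashlib
import hmac
import os

BLOCK = 32          # pointer length in bytes (kappa = 256 bits, assumed)
HALF = BLOCK // 2
ROUNDS = 4


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Keyed round function of the toy Feistel network."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def F(key: bytes, block: bytes) -> bytes:
    """Toy invertible keyed permutation standing in for F."""
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def F_inv(key: bytes, block: bytes) -> bytes:
    """Inverse of F: undo the Feistel rounds in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def _address(pt: bytes) -> bytes:
    return hashlib.sha256(b"addr" + pt).digest()


def _mask(pt: bytes, length: int) -> bytes:
    return hashlib.shake_256(b"mask" + pt).digest(length)


def add(edb: dict, sigma: dict, keyword: str, payload: bytes) -> None:
    """Owner side: advance the keyword's pointer and store one record."""
    pt, c = sigma.get(keyword) or (os.urandom(BLOCK), 0)  # pt^0 stores nothing
    k = os.urandom(BLOCK)
    new_pt = F(k, pt)
    record = k + payload
    edb[_address(new_pt)] = _xor(record, _mask(new_pt, len(record)))
    sigma[keyword] = (new_pt, c + 1)


def search(edb: dict, latest_pt: bytes) -> list[bytes]:
    """Contract side: walk from the newest pointer back to pt^0."""
    results, pt = [], latest_pt
    while True:
        masked = edb.get(_address(pt))
        if masked is None:          # nothing stored here: chain start reached
            return results
        record = _xor(masked, _mask(pt, len(masked)))
        k, payload = record[:BLOCK], record[BLOCK:]
        results.append(payload)
        pt = F_inv(k, pt)           # step back to the previous pointer
```

Because each record carries the key needed to invert the step that produced its pointer, whoever holds the newest pointer can reach all older records, while older pointers reveal nothing about records added later; that asymmetry is what the forward-privacy argument relies on.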
Decrypt(Param, $d_{u_i}$, $w_k$, $RS(w_k)$): the decryption algorithm is used to decrypt the encrypted indexes in the search results $RS(w_k)$. Using the secret value $s$ computed in step 2 of the trapdoor algorithm Trapdoor, the plaintext index $ind^{j}_{w_k}$ and the operation type $op$ are computed for each record in $RS(w_k)$. If $op = add$, the index $ind^{j}_{w_k}$ is used to access the corresponding data ciphertext from the cloud server CS, and the ciphertext is decrypted using the key $key$ obtained in step 2 of the search algorithm to get the plaintext data file. If $op = del$, this file index has been deleted and there is no need to access the file in the cloud server.
Supervise(Param, $Q_{u_i}$, $d_{sup}$, $W^*$): the input parameters include the system parameter Param, the public keys $Q_{u_i} = (Q^1_{u_i}, Q^2_{u_i}, Q^3_{u_i})$ of authorized users, the private key $d_{sup}$ of the supervisor, and the set of sensitive words $W^*$. First, $Q^2_{u_i} - d_{sup} \cdot Q^1_{u_i}$ is computed to obtain $D_{u_i}$, and the steps of the trapdoor algorithm Trapdoor are subsequently performed to compute the secret value $s$. After obtaining a set of secret values $S = \{s\}$, the supervisor generates a search trapdoor $T_{w_k}$ for each secret value $s$ and each sensitive word $w_k \in W^*$. Then, the hash value $H_4(T_{w_k})$ of each trapdoor is calculated, and the hash values are stored in the list $L_h$ and uploaded to the BC through the smart contract to realize the supervision of search requests. Second, the trapdoor set $\{T_{w_k}\}$ is used to get the matching file index ciphertext by executing the search smart contract, and the ciphertext is decrypted using the secret value $s$ to get the file index. Finally, the index is used to locate the illegal file containing the sensitive word $w_k \in W^*$ in CS to achieve the supervision of the ciphertext data in CS.
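The request-filtering half of supervision can be sketched as a simple hash blocklist check. The class and method names below are hypothetical, trapdoors are treated as opaque byte strings, and SHA-256 with a prefix stands in for $H_4$; this only illustrates the idea of comparing $H_4(T_{w_k})$ against the uploaded list $L_h$, not an actual chaincode interface.

```python
import hashlib


def h4(trapdoor: bytes) -> bytes:
    """Stand-in for H_4: hash a serialized trapdoor."""
    return hashlib.sha256(b"H4" + trapdoor).digest()


class SearchContract:
    """Toy model of the contract-side check against the list L_h."""

    def __init__(self) -> None:
        self.blocked: set[bytes] = set()

    def upload_blocklist(self, hashes: list[bytes]) -> None:
        """Supervisor uploads the hashes of sensitive-word trapdoors."""
        self.blocked.update(hashes)

    def is_allowed(self, trapdoor: bytes) -> bool:
        """A search request passes only if its trapdoor hash is not listed."""
        return h4(trapdoor) not in self.blocked
```

Storing only hashes means the contract can intercept a noncompliant request without the ledger holding the sensitive trapdoors themselves.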
Correctness analysis: when generating the searchable encrypted data structure, a broadcast polynomial $f(z) = \prod_{i=1}^{n} (z - H_0(r \cdot Q^3_{u_i})) + s$ is constructed. The authorized user $u_i \in \{u_1, \ldots, u_n\}$ is able to use his private key $d_{u_i}$ to compute $r \cdot Q^3_{u_i} = H_0(d_{u_i} \cdot P) \cdot VI$. After obtaining the secret value $s$ by substituting $H_0(r \cdot Q^3_{u_i})$ into the broadcast ciphertext, the search trapdoor $T_{w_k} = e(H_1(w_k), s \cdot P)$ is computed. The trapdoor search steps are described in the soundness proof of the security proof subsection. As for the ciphertext data supervision, given the private key $d_{sup}$ of the supervisor and the partial public key $(Q^1_{u_i}, Q^2_{u_i})$ of the authorized user, $D_{u_i}$ is computed as $D_{u_i} = Q^2_{u_i} - d_{sup} \cdot Q^1_{u_i}$. After getting $D_{u_i}$, the secret value $s$ used for searching and decryption can be calculated as formula (1).
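The two recovery paths can be checked step by step from the key definitions in KeyGen. The derivation below is a worked sketch that assumes the supervisor's public key has the form $Q_{sup} = d_{sup} \cdot P$, which the text implies but does not state explicitly.

```latex
\begin{align*}
d_{sup} \cdot Q^1_{u_i}
  &= d_{sup}\, t_{u_i}\, d_{u_i} \cdot P
   = d_{u_i}\, t_{u_i} \cdot Q_{sup}, \\
Q^2_{u_i} - d_{sup} \cdot Q^1_{u_i}
  &= d_{u_i} t_{u_i} \cdot Q_{sup} + D_{u_i} - d_{u_i} t_{u_i} \cdot Q_{sup}
   = D_{u_i}, \\
H_0(D_{u_i}) \cdot VI
  &= H_0(D_{u_i})\, r \cdot P
   = r \cdot Q^3_{u_i}, \\
f\big(H_0(r \cdot Q^3_{u_i})\big)
  &= \prod_{j=1}^{n} \big(H_0(r \cdot Q^3_{u_i}) - H_0(r \cdot Q^3_{u_j})\big) + s
   = s .
\end{align*}
```

The first two lines show how the supervisor strips the blinding term from $Q^2_{u_i}$; the last two show that anyone who knows $D_{u_i}$ and the public version information $VI$ reaches the same root of the product and therefore the same $s$, since the factor with $j = i$ is zero.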

Theorem 1.
The BMNSE scheme $\Pi$ is an $L$-adaptively secure searchable encryption scheme if $F$ is a pseudorandom permutation, the hash functions are collision-resistant, the DBDH problem is hard, and the polynomial-based broadcast encryption algorithm is adaptively secure.
Proof. We demonstrate the adaptive security of the scheme through a sequence of games similar to reference [11]. The first game $G_1$ is the real-world game $\mathrm{Real}^{\Pi}_{\mathcal{A}}(\kappa)$. Each game is slightly different from the previous one, but consecutive games are indistinguishable to the adversary, finally reaching the last, ideal-world game $\mathrm{Ideal}^{\Pi}_{\mathcal{A},\mathcal{S}}(\kappa)$. By the transitivity of indistinguishability, it can be concluded that $\left| \Pr[\mathrm{Real}^{\Pi}_{\mathcal{A}}(\kappa) = 1] - \Pr[\mathrm{Ideal}^{\Pi}_{\mathcal{A},\mathcal{S}}(\kappa) = 1] \right| \le \mathrm{negl}(\kappa)$, thus completing the proof of confidentiality.
The second game $G_2$ maintains a list of state pointers PList for storing state pointers; i.e., $\mathrm{PList}[w, c] = pt^{c}$. The state pointers are used in the encryption algorithm, and the game $G_2$ randomly chooses a string $pt^{c} \xleftarrow{R} \{0,1\}^{\kappa}$ to generate each state pointer instead of using the pseudorandom permutation $F$. Because the pseudorandom permutation $F$ is indistinguishable from a truly random function, the games $G_2$ and $G_1$ are indistinguishable.
The third game $G_3$ models all hash functions as random oracles, where each oracle maintains a list to store input/output pairs. For example, given a random oracle $H_1$ with input $x$, the oracle randomly selects a string $y \xleftarrow{R} \{0,1\}^{l}$ as output, where $l$ is the output length of the hash function, and stores $(x, y)$ in the list $H_1$-List. Because the hash function is collision-resistant, the games $G_2$ and $G_3$ are indistinguishable.
The fourth game $G_4$ computes $st_{w_k} \in G_2$ on the basis of $t_{w_k} = e(H_1(w_k), s \cdot P)$ by randomly choosing a secret value $s$ in the encryption phase. Also, the game $G_4$ needs to maintain a list TList for storing $(w_k, st_{w_k})$ in response to the trapdoor query from the adversary $\mathcal{A}$. $(P, sP, H_0(w_k), t_{w_k})$ is a tuple based on the DBDH problem, and $(P, sP, H_0(w_k), st_{w_k})$ is a random tuple. If the adversary $\mathcal{A}$ can distinguish the games $G_3$ and $G_4$, then the adversary is able to distinguish the two tuples, i.e., solve the DBDH problem, which contradicts the hardness assumption. Thus, the games $G_3$ and $G_4$ are indistinguishable.
In the last game $G_5$, the simulator $\mathcal{S}$ maintains two structures: a list for simulating random oracle queries and a counter that keeps track of the number of encryption updates since the system was initialized. For each encryption query, two random strings are selected. The simulator uses the encryption history to determine the encryption queries for the keyword $w$. Based on the encryption history, state pointers and keys can be generated, and then the random oracle is updated. From the adversary's perspective, the view generated by the simulator $\mathcal{S}$ is indistinguishable from the view in the game $G_4$.
Summing up, we obtain equation (12), where the advantage $\mathrm{Adv}^{\mathrm{DBDH}}_{A}(\kappa)$ of solving the hard DBDH problem is negligible, so our proposed scheme $\Pi$ is an $L$-adaptively secure searchable encryption scheme.

A Blockchain-Based Normalized Searchable Encryption System for Medical Data
In this section, we present our design of the BNSEM system based on the BNSE scheme presented in the preceding section.

System Architecture.
We divide the BNSEM system into three layers: the medical data collection layer, the medical data processing layer, and the medical data access layer. The system architecture is shown in Figure 2. The entities in the system are roughly the same as those in the BNSE scheme; the difference is that the entities in the medical system are all medical service providers or users, including the medical data owner (MDO), medical data user (MDU), medical cloud server (MCS), and medical consortium blockchain platform (MCB). In the medical data collection layer, medical data are mainly generated by doctors and patients. On the one hand, patients generate medical data when they visit hospitals. On the other hand, health data are generated when patients use home medical tools or wearable medical monitoring devices, and these data can serve as reference indicators for doctors' diagnoses.
In the medical data processing layer, the patients need to preprocess the data before uploading. Preprocessing includes encrypting the medical data, establishing the indexes of the medical files, extracting the keywords in the medical files, and constructing a searchable structure based on the file indexes and the keywords. Finally, the ciphertexts of the medical records are uploaded to MCS, and the searchable structure is uploaded to MCB.
In the medical data access layer, only authorized medical data users can access the patient's medical data. First, the MDU generates a trapdoor for the target search keyword and sends the search request containing the trapdoor to MCB. Then, the smart contract matches the trapdoor with the searchable structure and returns the corresponding medical file indexes. Finally, the MDU uses the file indexes to access medical data in MCS, and MCS returns the corresponding data to the MDU.

Medical Data Preprocessing.
When a patient goes to the hospital, the doctor makes a diagnosis and generates an electronic medical record. The record includes the diagnosed disease, examination results (medical images, laboratory tests, etc.), medication prescriptions, and personal information (such as name, age, and gender). Each electronic medical record is treated as a file and has a unique file identifier. The doctor synchronizes the generated medical records to the patient to complete a disease diagnosis process.

Building the Indexes of Medical Records.
Once the patient holds a certain number of medical data records, the patient can upload the record files. Before uploading, indexes corresponding to the files need to be constructed. For example, when the patient, i.e., the MDO, receives $m$ medical files $D = \{D_1, D_2, \ldots, D_m\}$, indexes are constructed for these files. Information related to the files, such as the date and size of the files, can be embedded into the indexes according to the actual situation. The file indexes built for the $m$ medical data files in $D$ are $IND = \{ind_1, ind_2, \ldots, ind_m\}$.

Extracting Keywords from Medical Records.
MDO performs keyword extraction on each file in $D$. For medical data files, we mainly consider extracting the name, gender, and age from the basic information; the disease name and drug prescription from the medical indicators; and the doctor, hospital, and visit time from the treatment information.

Constructing the Inverse Indexes.
The keyword sets $W_{ind_1}, W_{ind_2}, \ldots, W_{ind_m}$ extracted from the different medical data files in $D$ are merged to obtain the keyword dictionary $W = \{w_1, w_2, \ldots, w_D\}$.
Then, for each keyword $w_k$ in the keyword dictionary $W$, the inverse index $Ind_{w_k}$ of the files containing the keyword is constructed. A specific construction of the inverse index of medical record files is shown in Figure 3.
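The construction of the keyword dictionary and the inverse index can be sketched in a few lines of Python. This is a minimal illustration, assuming each file index is a string and each file's extracted keywords are given as a set; the function name `build_inverted_index` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(file_keywords):
    """Map each keyword to the sorted list of file indexes that contain it."""
    inverted = defaultdict(set)
    for ind, keywords in file_keywords.items():
        for w in keywords:
            inverted[w].add(ind)
    return {w: sorted(inds) for w, inds in inverted.items()}


# Hypothetical per-file keyword sets (W_ind1, W_ind2, W_ind3).
file_keywords = {
    "ind1": {"hypertension", "aspirin"},
    "ind2": {"hypertension"},
    "ind3": {"diabetes", "aspirin"},
}

inverted_index = build_inverted_index(file_keywords)
# The keyword dictionary W is the set of keys of the inverse index.
dictionary = sorted(inverted_index)
```

In the actual system the inverse index is not stored in plaintext; it is the input to the encryption algorithm that produces the searchable structure.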

Medical Consortium Blockchain Platform.
In the BNSEM system, Hyperledger Fabric is chosen as the medical consortium blockchain (MCB) platform. Because Fabric has a strict access mechanism, it can be managed collaboratively in a polycentric manner by entities from multiple organizations. In addition, the consortium blockchain can best balance the security and efficiency of the system compared with public and private blockchains. Initial access control can be achieved through the access mechanism of Fabric. By deploying smart contracts on Fabric, more fine-grained data access control can be realized.
MCB is a federation of multiple healthcare providers, which is built and maintained by different entities such as hospitals, research institutions, regulatory bodies (e.g., healthcare commissions), and insurance and pharmaceutical companies. Organizations with a high trust level preselect some peer nodes as consensus nodes according to their management policies (e.g., supervision institutions and hospital management nodes). These designated consensus nodes are responsible for managing and updating the distributed ledger, while other peer nodes can only generate or contribute healthcare data transactions. Consensus nodes require a certain amount of computing power to perform consensus algorithms on transactions. In addition, if the number of consensus nodes increases, the degree of decentralization of the system increases and security and scalability can be improved.
MCB enables the storage of the searchable structure and the retrieval of encrypted medical data by invoking predesigned and deployed smart contracts. Before MCB operates, the consortium members need to define a number of contracts, developed by different organizations, that cover common terminology, data, rules, and processes and specify the model of data storage and sharing. A client application invokes a smart contract to execute the search protocol. When the execution is complete, the smart contract records the results (i.e., state changes) in the distributed ledger of MCB. Together with the ledger, smart contracts form the core part of the MCB system.

System Setup.
Before the system runs, TI sets the security parameter $\kappa$ and generates the system public parameters Param. With reference to the setup algorithm in Section 4, Param includes the bilinear pairing parameters $(G_1, G_2, q, P, e)$, hash functions $H/h$ with different output lengths, and pseudorandom permutation functions $F/F^{-1}$. The system selects the AES algorithm as the pseudorandom permutation function to encrypt medical data. The medical data users in the system mainly include the data users (MDUs) and the supervisory institution SUP. Before the users join the system, they need to generate a public-private key pair for data authorization. The public-private key pair of SUP is $(d_{sup}, Q_{sup})$. The public-private key pair of an MDU is $(d_u, Q_u)$, which is computed in the setup algorithm.

Encryption and Updating of Medical Data.
After the system is initialized, MDO will store the encrypted medical data and the corresponding searchable structure to authorize access by multiple MDUs. When patients visit the hospital and get multiple electronic medical records, these medical records will be preprocessed as described in subsection B.
Next pointer) is initialized.
Then, $n$ MDUs are specified to be authorized, denoted as $\{u_i\}_{i \in [1,n]}$, whose public keys are $Q_{u_i}$.
Taking the above parameters as input, the Encrypt data encryption algorithm in Section 4 is invoked to encrypt the medical record database DB to obtain the encrypted database, i.e., the searchable data structure EDB = 〈ref pt c+1

Retrieval of Encrypted Medical Data.
When a patient (MDO) goes to another hospital for treatment, the authorized doctor (MDU) reviews the patient's past medical records to assist in the diagnosis. The search process for medical records is as follows:
Step 1. MDU selects a keyword $w_k$ (e.g., hypertension) and generates the search trapdoor $T_{w_k}$ by invoking the trapdoor algorithm using his private key.
Step 2. MDU sends a search request containing the search trapdoor $T_{w_k}$ to the smart contract.
Step 3. The smart contract matches the trapdoor $T_{w_k}$ with the search structure EDB stored in the blockchain to obtain the encrypted medical indexes $EI^{j}_{w_k}$ according to the search algorithm.
Step 4. MDU uses the secret value $s$ to decrypt the ciphertext index $EI^{j}_{w_k}$ to get the plaintext index $ind^{j}_{w_k}$, the option (add/del) corresponding to the index, and the file decryption key, key.
Step 5. If the option is del, the file has been deleted and no access is needed. Otherwise, MDU accesses the medical data stored in the MCS with the indexes.
Step 6. MDU uses key to decrypt the medical record ciphertext returned from MCS and obtains the medical record file $D$.

Supervision of Medical Data and Search Requests.
To ensure that the data in the BNSEM system are stored and used legally, supervisors such as the healthcare commission are required to regularly review the encrypted medical data in MCS and monitor the search requests of MDUs in real time. First, SUP maintains a sensitive word dictionary $W'$, which includes sensitive keywords such as prohibited drugs, illegal hospitals, and fake doctors. Next, SUP invokes the Supervise algorithm to locate the illegal files containing sensitive words in MCS using the private key $d_{sup}$. Then, SUP generates a trapdoor for each sensitive word in $W'$ and hashes the trapdoors to obtain a trapdoor hash list. Finally, SUP uploads the hash list to MCB through smart contracts to filter the trapdoors in search requests and intercept the illegal requests containing sensitive words.
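The filtering of search requests against the trapdoor hash list can be sketched as follows. This is a minimal illustration in Python, not Fabric chaincode. It assumes that the trapdoor bytes submitted for a sensitive word match the bytes SUP hashed, and it uses SHA-256 as the hash; the function names and sample byte strings are hypothetical.

```python
import hashlib


def trapdoor_digest(trapdoor: bytes) -> str:
    """Hash a serialized trapdoor (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(trapdoor).hexdigest()


def build_blocklist(sensitive_trapdoors):
    """SUP side: hash every sensitive-word trapdoor into a lookup set."""
    return {trapdoor_digest(t) for t in sensitive_trapdoors}


def is_request_allowed(trapdoor: bytes, blocklist) -> bool:
    """Contract side: reject a request whose trapdoor hash is on the list."""
    return trapdoor_digest(trapdoor) not in blocklist


# Placeholder byte strings stand in for real serialized trapdoors.
blocklist = build_blocklist([b"trapdoor-for-prohibited-drug"])
allowed = is_request_allowed(b"trapdoor-for-hypertension", blocklist)
blocked = not is_request_allowed(b"trapdoor-for-prohibited-drug", blocklist)
```

Storing only hashes on the ledger means the contract can filter requests without the sensitive-word trapdoors themselves being published.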

Forward Privacy.
The requirement of forward privacy is that, given a previous search trapdoor, the update query does not reveal information about the keywords that were searched in the past; i.e., the previous keyword trapdoor cannot be used to search medical records newly added after the trapdoor was released. In the BNSEM system, the trapdoor $T_{w_k}$ is equivalent to a state pointer of keyword $w_k$. With the help of this pointer, the smart contract finds the latest state $pt^{c+1}_{w_k}$ of the keyword $w_k$, which is used to locate the corresponding encrypted medical file indexes $EI^{j}_{w_k}$, where $j = 1, \ldots, m_{w_k}$. The smart contract then computes the previous state $pt^{c}_{w_k} = F^{-1}(k^{c+1}_{w_k}, pt^{c+1}_{w_k})$ to search the previously updated medical files.
When updating the medical files containing the keyword $w_k$, MDO computes a new state pointer $pt^{c+2}_{w_k} = F(k^{c+2}_{w_k}, pt^{c+1}_{w_k})$, which is used to encrypt the file indexes and generate the searchable structure $\langle ref_{t_{w_k}}, inc_{t_{w_k}} \rangle$ corresponding to the latest version information VI. Due to the security of the pseudorandom permutation function $F$, the adversary cannot predict the next state pointer based on the current state pointer $pt^{c+1}_{w_k}$ and the version information. Therefore, the previous search trapdoor cannot be used to search the medical data updated afterward, so forward privacy is guaranteed. A BNSEM system with forward privacy can effectively resist file injection attacks and prevent adversaries from inferring the keyword contained in a trapdoor.
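The pointer chain can be illustrated with a toy Python sketch, where each update computes $pt^{c+1} = F(k^{c+1}, pt^{c})$ and a search walks back with $pt^{c} = F^{-1}(k^{c+1}, pt^{c+1})$. The keyed permutation below (XOR with a hash of the key) is a deliberately simple, insecure stand-in for AES, chosen only so that the example needs no third-party library. How the per-step keys reach the party that walks the chain is also an assumption; here they are simply passed in as a list.

```python
import hashlib
import secrets

BLOCK = 32  # pointer length in bytes for this toy example


def _mask(key: bytes) -> bytes:
    return hashlib.sha256(key).digest()


def F(key: bytes, pt: bytes) -> bytes:
    """Toy keyed permutation on 32-byte blocks. NOT a secure PRP."""
    return bytes(a ^ b for a, b in zip(pt, _mask(key)))


def F_inv(key: bytes, pt: bytes) -> bytes:
    """Inverse of F; XOR with the same mask undoes it."""
    return bytes(a ^ b for a, b in zip(pt, _mask(key)))


def extend_chain(pointers, keys):
    """Update step: append pt^{c+1} = F(k^{c+1}, pt^c) under a fresh key."""
    k_next = secrets.token_bytes(16)
    pointers.append(F(k_next, pointers[-1]))
    keys.append(k_next)


def walk_back(latest: bytes, keys):
    """Search step: recover pt^c = F^{-1}(k^{c+1}, pt^{c+1}) down to pt^0."""
    out = [latest]
    for k in reversed(keys):
        out.append(F_inv(k, out[-1]))
    return out


pointers = [secrets.token_bytes(BLOCK)]  # pt^0
keys = []
for _ in range(3):
    extend_chain(pointers, keys)

recovered = walk_back(pointers[-1], keys)
```

The sketch shows only the direction of traversal. A holder of the latest pointer and the keys can reach every older pointer, whereas a later pointer requires a key that has not yet been generated.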

Backward Privacy.
Backward privacy limits the update information about a keyword $w$ that an adversary can obtain during a search query on $w$. That is, a searchable encryption system satisfies backward privacy if, after a keyword-file index pair $(w, ind)$ is added to the database and then deleted, a search query on the keyword $w$ does not disclose the index $ind$. In the BNSEM system, encrypting a medical file index yields $EI^{j}_{w_k} = H_2(s, w_k \| j) \oplus (op \| ind^{j}_{w_k})$, where the secret value $s$ is broadcast-encrypted under the authorized MDUs' public keys and can only be decrypted by the authorized MDUs. Since the search result is in the form of ciphertext, even if it is stored publicly on the MCB, the adversary cannot decrypt the broadcast ciphertext to recover the secret value $s$ and cannot learn any useful information about the indexes of medical files. Therefore, the backward privacy of BNSEM can be achieved.
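The index masking $EI^{j}_{w_k} = H_2(s, w_k \| j) \oplus (op \| ind^{j}_{w_k})$ can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the scheme: $H_2$ is instantiated with SHA-256, the payload is padded to 32 bytes, the operation is encoded in one leading byte, and the input encoding inside `H2` is an illustrative choice.

```python
import hashlib

PAYLOAD_LEN = 32  # matches the SHA-256 output length


def H2(s: bytes, w: str, j: int) -> bytes:
    """Hash of (s, w || j); the byte encoding is an assumption of this sketch."""
    data = len(s).to_bytes(2, "big") + s + w.encode() + b"|" + j.to_bytes(4, "big")
    return hashlib.sha256(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_index(s: bytes, w: str, j: int, op: str, ind: str) -> bytes:
    """Compute EI = H2(s, w || j) XOR (op || ind) on a padded payload."""
    tag = b"\x01" if op == "add" else b"\x00"
    body = ind.encode()
    if len(body) > PAYLOAD_LEN - 1:
        raise ValueError("file index too long for this toy payload size")
    payload = tag + body.ljust(PAYLOAD_LEN - 1, b"\x00")
    return xor(H2(s, w, j), payload)


def decrypt_index(s: bytes, w: str, j: int, ei: bytes):
    """Unmask EI with the same hash value and split it into (op, ind)."""
    payload = xor(H2(s, w, j), ei)
    op = "add" if payload[0] == 1 else "del"
    return op, payload[1:].rstrip(b"\x00").decode()


s = b"demo-secret-value"  # stands in for the broadcast-encrypted secret s
ei = encrypt_index(s, "hypertension", 1, "add", "ind7")
op, ind = decrypt_index(s, "hypertension", 1, ei)
```

Only a party that knows $s$ can recompute the mask, which is why the masked entries can sit on the ledger in the clear.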

Distribution.
Although BNSEM requires the use of a centralized MCS to store encrypted data, the search process is accomplished by smart contracts, which ensures the reliability and correctness of search results. First, to achieve the retrieval of encrypted medical data, the MDO uploads the searchable data structure to the distributed MCB platform by invoking the smart contract with the storage function. Second, the MDU runs the trapdoor algorithm and uploads the trapdoor to trigger the smart contract with the search function. The correctness of the whole search process does not rely on the MCS, which enables decentralized search. The blockchain is distributed, and each blockchain node is relatively independent and must be authenticated to join the system. It is difficult for the adversary to manipulate a large number of nodes at the same time to change the network rules and damage the blockchain system, so the system can effectively resist Sybil attacks. In addition, since each search is recorded as an immutable transaction on the blockchain, the number of search requests sent by each MDU cannot be tampered with. The online keyword guessing attack (KGA) can be effectively resisted by setting an upper limit on the number of requests from each MDU.
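The request cap used against online keyword guessing can be illustrated with a small counter. This is a minimal Python sketch with hypothetical names (`SearchLimiter`, `try_search`). In the system itself the count would be derived from the search transactions recorded on the ledger rather than held in memory.

```python
from collections import Counter


class SearchLimiter:
    """Allow each MDU at most max_requests search requests."""

    def __init__(self, max_requests: int):
        self.max_requests = max_requests
        self.counts = Counter()  # searches recorded per MDU identifier

    def try_search(self, mdu_id: str) -> bool:
        # Refuse once the MDU has used up its quota.
        if self.counts[mdu_id] >= self.max_requests:
            return False
        self.counts[mdu_id] += 1
        return True


limiter = SearchLimiter(max_requests=3)
results = [limiter.try_search("mdu-1") for _ in range(5)]
other_mdu_ok = limiter.try_search("mdu-2")
```

Because every guess costs one recorded request, the cap bounds how many candidate keywords a single MDU can test.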

Performance Comparison.
We compare the theoretical performance of our scheme with other multiuser searchable encryption schemes, where the MVSSE [24] and BAEKS [25] schemes are both based on public key cryptography, and Π + [20] is a symmetric searchable encryption scheme. In this study, we compare the computational overheads of the main algorithms of searchable encryption schemes, including encryption algorithm, trapdoor algorithm, and search algorithm.
The results of the performance comparison are given in Table 1.
The notations in Table 1 are explained as follows: $n$ denotes the number of authorized MDUs and $m$ denotes the number of indexes containing the keyword $w$. The symbols $h$, $exp$, $sm$, $mul_2$, $e$, and $mtp$ denote a general hash function (e.g., SHA-256 and SHA-3), an exponentiation, a scalar multiplication on the group $G_1$, a multiplication on the group $G_2$, a bilinear pairing from $G_1$ to $G_2$, and a map-to-point operation, respectively. Although the hash functions $H_{0,2,3,4}$ and $h_{1,2,3,4}$ used in our scheme differ in input/output lengths, they can all be obtained by simple transformations of the general hash functions and do not add complexity. In addition, $F/F^{-1}$ denotes the pseudorandom permutation function (i.e., symmetric cryptography, e.g., the AES and DES algorithms). The time overhead of the above operations is shown in Table 2.
Table 1 shows that the computational overhead of the encryption algorithm in most schemes is linear in the number of indexes $m$. The BAEKS scheme does not consider the number of indexes containing the keywords. In addition, the encryption computational complexity of BAEKS is linearly related to the number of users, so it is not shown in the computational overhead graph. The scheme $\Pi^{+}$ does not describe the broadcast encryption algorithm it uses, so the broadcast encryption overhead cannot be calculated. The encryption computation overheads of our scheme and the MVSSE scheme are $2 \cdot sm + e + mtp + F + (2m + 4) \cdot h$ and $sm + 2m \cdot mul_2 + (2 + m) \cdot F + 2m \cdot h$, respectively. Although our scheme contains additional time-consuming operations, they are independent of the number of indexes. The theoretical computational overhead of the encryption algorithm for each scheme with respect to the number of file indexes is shown in Figure 4.
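The two encryption cost expressions can be evaluated directly to see how they grow with $m$. The Python sketch below encodes the formulas for our scheme and MVSSE exactly as stated above. The per-operation timings in `placeholder_ms` are made-up placeholders for illustration only and are not the values of Table 2.

```python
def encrypt_cost_ours(m: int, t: dict) -> float:
    """2*sm + e + mtp + F + (2m + 4)*h"""
    return 2 * t["sm"] + t["e"] + t["mtp"] + t["F"] + (2 * m + 4) * t["h"]


def encrypt_cost_mvsse(m: int, t: dict) -> float:
    """sm + 2m*mul2 + (2 + m)*F + 2m*h"""
    return t["sm"] + 2 * m * t["mul2"] + (2 + m) * t["F"] + 2 * m * t["h"]


def per_index_slope(cost_fn, t: dict) -> float:
    """Extra cost of one more index; constant because both formulas are linear in m."""
    return cost_fn(1, t) - cost_fn(0, t)


# Placeholder timings in milliseconds; NOT the measurements of Table 2.
placeholder_ms = {"sm": 1.0, "e": 3.0, "mtp": 1.0, "F": 0.01, "h": 0.001, "mul2": 0.005}

slope_ours = per_index_slope(encrypt_cost_ours, placeholder_ms)
slope_mvsse = per_index_slope(encrypt_cost_mvsse, placeholder_ms)
fixed_ours = encrypt_cost_ours(0, placeholder_ms)
```

Whatever the actual timings are, the per-index term of our scheme is $2 \cdot h$, while that of MVSSE is $2 \cdot mul_2 + F + 2 \cdot h$; the pairing, map-to-point, and scalar multiplications of our scheme appear only in the fixed part.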
As for the trapdoor algorithm, the computation overheads of MVSSE, $\Pi^{+}$, and our scheme are $2 \cdot sm + 2 \cdot mul_2 + 2 \cdot F$, $5 \cdot F$, and $3 \cdot sm + e + mtp + h$, respectively. Although the trapdoor computation overhead of our scheme is slightly higher than that of the other schemes, we avoid key management and distribution operations compared with the symmetric scheme MVSSE. Moreover, the user in the MVSSE scheme cannot generate search trapdoors independently and requires interactive communication with the server. Similarly, the $\Pi^{+}$ scheme requires the data owner to generate and distribute public-private key pairs for multiple recipients, which does not meet the key security specification. The theoretical computational overhead of the trapdoor algorithm for each scheme is compared in Figure 5. The computational overhead of our scheme is slightly higher than that of the MVSSE scheme, and the trapdoor generation of the $\Pi^{+}$ scheme involves only pseudorandom permutation operations, with minimal time overhead.
When performing search operations, the MVSSE scheme performs multiple scalar multiplication operations, which incur a large computation overhead. The search computation overhead of our scheme is lower than that of the symmetric searchable encryption scheme $\Pi^{+}$ because the computations in the main algorithm of our scheme are hash operations or symmetric cryptographic primitives. Therefore, our scheme is a searchable public key scheme with high search performance. Figure 6 shows the variation of the theoretical search computation overhead with the number of indexes for each scheme. Our scheme has the lowest computation overhead, and the time overhead of the MVSSE scheme grows fastest with the number of indexes.

Prototype Implementation.
We implement our BNSEM system using the MIRACL cryptographic library (C++) on a PC with 16 GB of RAM, an Intel Core i5-7500 CPU, and Windows 10, and we deploy a Fabric consortium blockchain on a PC with 16 GB of RAM, an Intel Core i5-7500 CPU, and Ubuntu 16.04. In addition, we set the system security parameter $\kappa$ to 128 bits, implement hash functions with different input and output lengths based on SHA-256, and use the AES algorithm in CBC mode with a key length of 128 bits as the pseudorandom permutation function. Finally, we choose a supersingular elliptic curve ($y^2 = x^3 - 3x$, $p = 2^{255} + 2^{41} + 1$) to achieve the AES-128 security level. Next, we perform three simulation tests: the time cost of the encryption algorithm with respect to the number of indexes, the time cost of the search algorithm with respect to the number of indexes, and the time cost of all algorithms of our system under a fixed number of indexes.
To evaluate the overall efficiency of our system, we test the average time overhead of all algorithms when the number of indexes containing the keyword is 10000, as shown in Figure 7. The key generation requires multiple scalar multiplication operations on the group $G_1$, with a time overhead of about 85 ms. In addition, the time to generate a search trapdoor for the keyword is about 166 ms, while the time overhead to encrypt a search structure with 10000 file indexes is only 287 ms, mainly because the trapdoor algorithm requires the time-

Conclusion
In this study, we propose a blockchain-based searchable encryption scheme, BNSE, and design a searchable encryption system for medical data, BNSEM, based on the scheme. Firstly, the system adopts the smart contracts of Fabric to guarantee the accuracy of search results. Secondly, we use polynomial-based broadcast encryption to implement a multiuser search function. Then, the system achieves legal regulation of medical ciphertext data without violating the privacy of the private key. Finally, we provide the security analysis of BNSEM and test the time cost of each algorithm. For future work, we will consider functional extensions to multikeyword search and range queries on numerical data.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.