Security Enhancements for Data-Driven Systems: A Blockchain-Based Trustworthy Data Sharing Scheme

As the value of big data grows, data sharing within enterprises and organizations has become increasingly common, and many institutions have established data centers for effective data storage and sharing. Meanwhile, data security and privacy have become a primary concern, since shared data often involves commercial secrets and sensitive information. Data encryption techniques are already used to protect the sensitive data stored in and shared by data centers. However, challenges remain: efficient data sharing, secure management of decryption keys, deduplication of the plaintext, and transparency and auditability of data access. These challenges may obstruct the development of data sharing in data-driven systems. To meet them, we propose a secure and trustworthy data sharing scheme that introduces blockchain, proxy re-encryption (PRE), and trusted execution environments (TEEs) into data-driven systems. Our scheme enables (1) automatic distribution and management of the decryption keys, (2) reduction of duplicate data, and (3) trustworthy data sharing and recording. Finally, we implement the proposed scheme and compare it with existing schemes. The results show that our scheme reduces the computation and communication overhead.


Introduction
With the development of big data, the Internet of Things, and other network technologies, various kinds of data are being produced. The economic and social benefits of this data drive the demand for sharing data between institutes and enterprises. Therefore, many organizations have established data centers on private or public clouds to realize effective data storage and sharing. Because data often involves business secrets and sensitive information, data privacy and security are key concerns, especially in large enterprises and scientific research institutes.
Encryption can protect, to a certain extent, the privacy and security of the sensitive data stored in and shared by the data center. Encrypted data sharing schemes [1,2] have been proposed in which the data are encrypted by the owner and can only be decrypted by authorized users. In these schemes [1,2], an owner negotiates a session key with a group of users in advance in order to share data with them. However, if a new user is added to the authorized sharing group, a new session key must be negotiated, and the data must be encrypted again under this new session key. This inevitably introduces a large computation overhead if the sharing group changes frequently.
To alleviate these key management problems, proxy re-encryption (PRE) techniques [3][4][5], which allow a proxy [e.g., a cloud server (CS)] to convert a ciphertext of a delegator into different ciphertexts for different delegatees, have been used to share data with different users dynamically without complex key agreement or decrypt-then-encrypt operations. These properties make PRE a practical approach to cloud-assisted data sharing. However, to avoid the heavy computation of PRE's ciphertext conversion, the CS may not generate the re-encrypted ciphertext honestly [6]. Additionally, many PRE-based data sharing schemes [7,8] cannot satisfy nonframeability. In other words, in these schemes, the CS may be maliciously framed for refusing to perform a re-encryption operation or for outputting a wrong re-encrypted ciphertext when it has in fact behaved honestly.
Recently, blockchain has been applied in data sharing solutions [2,9,10], in which the encrypted data are stored in an off-chain data center (e.g., a cloud), whereas the metadata and data transfer logs are recorded in the blockchain for data retrieval and auditing. Then, data sharing schemes that combine blockchain and PRE emerged, such as [11][12][13][14], where the encrypted data can be accessed by authorized users with the help of a proxy server, and misbehavior of the proxy server or the users during re-encryption is deterred because all operations are recorded in the blockchain and can be audited. Nevertheless, the schemes in [11][12][13][14] face an efficiency challenge caused by complex encryption/decryption and frequent interaction between data owners (DO) and users. On the one hand, frequent interactions and complex ciphertext transformations are heavy burdens on DO and CSs. On the other hand, storing large data is a great burden on the blockchain. In addition, in [11][12][13][14], the same data will be packaged into different ciphertexts, and the redundant plaintext copies result in additional storage overhead. These challenges motivate us to propose a more efficient and versatile encrypted data sharing scheme.
In this paper, we combine blockchain, PRE, and trusted execution environments (TEEs) and propose a flexible and secure data sharing method for data-driven systems. By employing PRE, our scheme allows the encrypted data to be transformed (by the CS) into different ciphertexts for different authorized users without complex key agreement. Furthermore, by involving the blockchain, misbehavior of the CS or the users during re-encryption can be recorded and audited. Meanwhile, the smart contract can automatically delegate the re-encryption key to authorized users, so the DO's computation and communication burden is also reduced. Finally, by utilizing TEEs, the smart contract can be executed in a secure enclave to protect the DO's private key. The major contributions of our scheme are as follows:
(1) A smart contract is employed to control data access, such that the delegation of the decryption key is executed automatically and the DO is not required to be online all the time. This greatly reduces the computation and communication burden of the DO and makes data sharing convenient.
(2) Our scheme utilizes the tamper-proof and consensus properties of the blockchain: the transfer logs of data requests and replies are recorded in the distributed ledger, which realizes trustworthy recording and real-time monitoring.
(3) In our scheme, duplicate data can be quickly detected, and its storage request will be refused. Therefore, the number of ciphertexts corresponding to the same plaintext is reduced.
(4) We conduct experiments to evaluate the performance of our proposed scheme, and the results show that our scheme is more efficient than the existing schemes [11][12][13][14] with respect to computation overhead and ciphertext size.
The rest of this paper is organized as follows: Section 2 surveys the related works. The preliminaries are presented in Section 3. The system model and design goals are described in Section 4. Section 5 introduces the detailed proposal, including the data release and retrieval. The analysis and simulation of the proposed solution are shown in Section 6. Finally, in Section 7, we draw our conclusion.

Related Works
With the rapid development of blockchain technologies, schemes that decouple the storage layer from the blockchain [2,9,15,16] have been proposed to achieve efficient and reliable data sharing, especially in large-scale data-driven systems. In these schemes, the data generated at the source are stored in an off-chain data center, and the metadata (e.g., a digest) is recorded on the blockchain for efficient data retrieval. When a user queries data from the blockchain, the distributed ledger's retrieval mechanism helps the user quickly retrieve the queried information, which greatly improves the efficiency and credibility of the system. However, most of these schemes use the blockchain merely as a distributed and immutable database, and issues such as trusted access control have not been completely solved.
The concept of PRE was initially introduced and constructed in [3], in which the DO controls the delegation of data access with the help of a CS. Based on this concept, Canetti and Hohenberger [4] proposed a PRE scheme that is secure against chosen-ciphertext attacks. Next, Weng et al. [5] proposed a conditional PRE, which achieves a more fine-grained delegation. To ensure secure and efficient data sharing, PRE is used in [11][12][13][14] to realize multisharing control of ciphertexts for blockchain-based big data storage. In the schemes of [11][12][13][14], DOs outsource their encrypted data to the cloud using identity-based encryption and grant legitimate users access to the data. However, these schemes incur heavy communication and computation costs. Additionally, they either rely on a proxy to fully manage the data, in which case data transparency and auditability cannot be achieved, or require the DO to always be online to delegate the decryption key to the data user.
To achieve privacy-preserving and automatic data sharing, Li et al. [17] proposed a blockchain-based privacy-preserving data sharing scheme with rewards, which uses smart contracts to automatically generate the decryption keys for users and uses TEEs to ensure the security of secret keys in smart contracts. Wang et al. [18] proposed a TEE-based smart contract execution scheme, which is used to share private data with fine-grained access control for smart grids. Lei et al. [19] proposed a multiparty data sharing platform that combines blockchain and TEEs and realizes automated data sharing. However, these schemes face communication and computation burdens caused by fine-grained access control.
Based on the above research, we provide a trustworthy and efficient data sharing scheme. We incorporate blockchain, TEEs, and proxy re-encryption to achieve secure data sharing together with (1) efficient one-to-many sharing of data, (2) automatic distribution and management of the decryption keys, (3) reduced storage of duplicate data, and (4) trustworthy data transmission and records.

Preliminaries
This section briefly outlines the preliminaries on pairing groups, blockchain technologies, and PRE. The key notations used in this paper are summarized in Table 1.

Pairing Groups.
Let $pp = (p, G, G_T, e)$ be the pairing parameter, where $G$ and $G_T$ are finite groups of order $p$ and $e$ is an efficiently computable bilinear map from $G \times G$ to $G_T$ that satisfies the following: (1) Bilinearity: for any generators $P, Q \in G$ and $a, b \in Z_p$, the equation $e(aP, bQ) = e(P, Q)^{ab}$ holds; (2) Nondegeneracy: $e(P, Q) \neq 1$; (3) Computability: $e$ can be efficiently computed.
Hard problems related to the above bilinear pairings are defined as follows:

Definition 1 (DL assumption) [3]. Let $G$ be a cyclic group. The Discrete Logarithm (DL) assumption is that, for all $P \in G$ and $a \in Z_p$, given the input $\{P, aP\}$, the probability of outputting $a$ is negligible for any polynomial-time algorithm.

Definition 2 (3-QBDH assumption) [5]. Let $pp = (p, G, G_T, e)$ be the pairing parameter and $P$ a generator of $G$. The 3-quotient bilinear Diffie-Hellman (3-QBDH) assumption on $pp$ is as follows: for any unknown $a, b \in Z_p$, given $(P, -aP, aP, a^2 P, bP)$, the probability of computing $e(P, P)^{b/a^2}$ is negligible for any polynomial-time algorithm.
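
The bilinearity and nondegeneracy properties of the pairing can be checked numerically in a toy model. The sketch below is not a real pairing and is not secure: it assumes that elements of $G$ are stored as scalars modulo a small prime (so the DL problem is trivial here), and the names `scalar_mul` and `pairing` as well as the parameters are hypothetical choices made only for illustration. It shows nothing more than the identity $e(aP, bQ) = e(P, Q)^{ab}$.

```python
# Toy model of a symmetric bilinear map, for illustration only (NOT secure).
# Assumption: the point xP of G is stored as the scalar x mod ORDER, and G_T
# is the order-ORDER subgroup of Z_MODULUS^* generated by GT_GEN.
ORDER = 1019        # small prime group order (toy size)
MODULUS = 2039      # prime with MODULUS = 2 * ORDER + 1
GT_GEN = 4          # generates the subgroup of order ORDER in Z_MODULUS^*


def scalar_mul(a: int, point: int) -> int:
    """Return a * point in the toy additive group G."""
    return (a * point) % ORDER


def pairing(x: int, y: int) -> int:
    """Toy map e(xP, yP) = GT_GEN^(x*y), bilinear by construction."""
    return pow(GT_GEN, (x * y) % ORDER, MODULUS)


P = 1        # the generator P, stored as scalar 1
Q_POINT = 7  # a second generator Q = 7P
a, b = 123, 456

lhs = pairing(scalar_mul(a, P), scalar_mul(b, Q_POINT))   # e(aP, bQ)
rhs = pow(pairing(P, Q_POINT), (a * b) % ORDER, MODULUS)  # e(P, Q)^(ab)
print("bilinear:", lhs == rhs)
print("nondegenerate:", pairing(P, Q_POINT) != 1)
```

A real deployment would use a pairing library over an elliptic curve; the toy representation is chosen only so that the algebra can be followed by hand.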

Some Basic Knowledge of Blockchain.
With the launch of the Bitcoin network [20], the concept of blockchain has become widely known to the public. As a decentralized distributed ledger maintained by multiple parties, the primary purpose of blockchain is to solve the trust problem in untrustworthy distributed environments by using peer-to-peer (P2P) networks, consensus algorithms, asymmetric encryption, cryptographic hashing, and other technologies. The blockchain can be used as a secure data management system [15] to ensure data integrity and availability. It can also be used as a supervision and audit platform [21] to achieve transparent supervision of the data.
Additionally, the blockchain can be used as a platform [17] that achieves secure and trusted data processing by utilizing smart contracts (self-executing programs with clauses clearly specified by the underlying code and deployed on the blockchain).

Trusted Execution Environments.
As data in smart contracts are transparent on the blockchain, users' private information can easily be exposed [17]. TEEs, such as Intel Software Guard Extensions (SGX) [22], TrustZone [23], and MultiZone [24], can be utilized to solve this problem. TEEs enable secure execution of programs on untrusted hosts (e.g., a cloud), as the programs run in a protected manner, isolated from the outside world. They also allow remote verifiers to ascertain a device's current configuration and behavior via remote attestation. It is worth noting that, via remote attestation, a TEE can build a secure channel over which the user and other TEEs can communicate with it securely. These properties make TEEs a good choice for processing and sharing private data. For example, Bowman et al. [25] proposed a TEE-based private data processing scheme, named private data objects (PDOs), which allows mutually untrusted parties to work on private data based on preagreed policies; the open-source code is provided in [26].

Proxy Re-encryption.
The concept of PRE was introduced by Blaze et al. [3], in which a semitrusted proxy server is delegated to convert a delegator's ciphertext into a delegatee's ciphertext without leaking the corresponding plaintext. PRE can be used for secure data sharing in cloud environments and usually consists of five algorithms, including:
KeyGen($\lambda$) $\to$ $(pk, sk)$: on input of the security parameter $\lambda$, this algorithm outputs a public/private key pair $(pk, sk)$;
Re-Key($sk_o$, $pk_c$) $\to$ $r_{o \to c}$: on input of a user's private key $sk_o$ and another user's public key $pk_c$, this algorithm outputs a re-encryption key $r_{o \to c}$;

Table 1: Key notations.
SEnc(·): symmetric encryption (and the corresponding decryption)
AEnc(·), ADec(·): asymmetric encryption and decryption
$sig_{sk_i}(\cdot)$: signature under the secret key $sk_i$
$k_{st}$: state key of smart contracts
$k_{o \to c}$: re-encryption key for DC
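
To make the ciphertext conversion idea concrete, the following sketch implements the classic ElGamal-style proxy re-encryption attributed to Blaze et al. [3] over a tiny prime-order subgroup. It is an illustration under stated assumptions, not the construction proposed in this paper: the parameters are toy-sized and insecure, the function names are hypothetical, and the re-encryption key here is derived from both secret keys, whereas the Re-Key interface above takes the delegatee's public key.

```python
import secrets

# Toy parameters (assumption, insecure): MODULUS = 2 * ORDER + 1, both prime.
MODULUS = 2039
ORDER = 1019
GEN = 4  # generator of the subgroup of order ORDER


def keygen():
    """Return (pk, sk) with pk = GEN^sk."""
    sk = secrets.randbelow(ORDER - 1) + 1
    return pow(GEN, sk, MODULUS), sk


def encrypt(pk: int, m: int):
    """Encrypt m (1 <= m < MODULUS) as (m * g^r, pk^r)."""
    r = secrets.randbelow(ORDER - 1) + 1
    return (m * pow(GEN, r, MODULUS)) % MODULUS, pow(pk, r, MODULUS)


def rekey(sk_from: int, sk_to: int) -> int:
    """Re-encryption key sk_to / sk_from mod ORDER."""
    return (sk_to * pow(sk_from, -1, ORDER)) % ORDER


def reencrypt(rk: int, ct):
    """Proxy step: turn g^(a*r) into g^(b*r) without seeing m."""
    c1, c2 = ct
    return c1, pow(c2, rk, MODULUS)


def decrypt(sk: int, ct) -> int:
    """Recover g^r from the second component, then unmask m."""
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, ORDER), MODULUS)
    return (c1 * pow(g_r, -1, MODULUS)) % MODULUS


pk_o, sk_o = keygen()   # delegator (data owner)
pk_c, sk_c = keygen()   # delegatee (data consumer)
message = 1234
ct_owner = encrypt(pk_o, message)
ct_consumer = reencrypt(rekey(sk_o, sk_c), ct_owner)
print(decrypt(sk_o, ct_owner) == message, decrypt(sk_c, ct_consumer) == message)
```

The proxy only exponentiates the second ciphertext component, so it never learns the message; this is the property that makes PRE attractive for cloud-assisted sharing. Modular inverses via `pow(x, -1, n)` require Python 3.8 or later.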

System Model
In this section, we describe the framework, the threat model, and the design goals of the proposed scheme.

Framework.
The schematic diagram of our framework is shown in Figure 1, in which consensus is separated from the execution of the smart contract. Similar to Ekiden [27], our framework consists of a CS, consensus nodes, participant nodes, and authorities. Each component and its role are described below.

Cloud Server.
It is usually a data center responsible for storing and securely sharing the encrypted data for the users with the help of a TEE (this TEE in the cloud is called the sTEE). The CS loads the smart contract, executes it in the sTEE to generate a re-encryption key, and performs the re-encryption operations.

Consensus Nodes.
There are two types of consensus nodes: without TEEs and with TEEs. A consensus node without a TEE is responsible for maintaining the blockchain ledger and realizing basic blockchain functions such as packaging and verifying blocks. Besides maintaining the ledger, consensus nodes with TEEs act as key management committees (these TEEs are called kTEEs) and are responsible for managing secret keys and remotely attesting the sTEE.

Participating Nodes.
They are the blockchain users, including the DO and data consumers (DC). The DO is responsible for data release; the DC requests the data by invoking the smart contract. After that, the DC obtains the re-encrypted data, which can be decrypted using the DC's private key.

Authorities.
There are two types of authorities: the certificate authority (CA) and the judgment authority (JA). The CA is responsible for membership enrollment and certificate distribution, and the JA is responsible for judging whether malicious behaviors have been performed.
Algorithm 1 further illustrates the interactions of Figure 1 among CS, consensus nodes, and participating nodes. Algorithm 1 shows that the interactions include two parts, data release and data retrieval, and DC can obtain data M when the algorithm ends.

Threat Model and Design Goals.
In our system, the authorities are trusted, and DO will honestly share the ciphertext of the data. CS and some unauthorized DCs are curious about DO's data, and CS may not honestly transfer the ciphertext to DC. Moreover, the openness of the blockchain enables analysis of the transaction information (such as inputs and outputs) [28], which may leak DO's data or DC's identity and attributes. We further assume that data and programs can be securely stored and executed in the TEEs.
Based on the above security assumptions, our design goals are met if the following properties are satisfied.

Data Confidentiality.
The DO's data should be kept confidential from the CS during storage and computation. Moreover, except for the DO and the authorized DC, no other user can obtain the original data.

Nonredundant Storage.
Storage requests for duplicate data should be quickly detected and refused in order to reduce storage costs.

Verifiable Integrity.
When obtaining the data from CS, DC can easily verify the data integrity.

Anonymity.
The DC's identity and attributes should not be recognizable by anyone during the data requesting process.

Transparency and Auditability.
DO should know with whom their data are shared. Besides, the access and computation processes should be auditable.

Efficient Computation.
The data encryption/decryption should avoid heavy cryptographic overhead and save computation costs as much as possible.

The Proposed Approach
The proposed data sharing scheme includes three phases: system initialization, data release, and data retrieval. In the system initialization phase, CA generates the system parameters, and each entity generates a private/public key pair and then registers with CA. In the data release phase, DO releases their encrypted data and delegates the corresponding secret key shares to a group of kTEEs. Finally, in the data retrieval phase, DC invokes the smart contract, obtains a re-encrypted ciphertext, and decrypts it to recover DO's data.

System Initialization.
CA chooses a security parameter $\lambda$ and generates the pairing parameter $pp = (p, G, G_T, e)$. CA also chooses a generator $P$ of $G$, a symmetric encryption algorithm SEnc (such as AES), an asymmetric encryption algorithm AEnc (such as ElGamal), and three secure cryptographic hash functions $h$, $h_1$, and $H$. Each participant node $i$ picks a private key $sk_i \in Z_p$, computes the public key $pk_i = sk_i P$, and then registers with CA. CA then issues a certificate $cert_i$ to user $i$. The certificate is combined with the user's public key $pk_i$ and attributes. The CS picks a private key $sk_s \in Z_p$, computes the public key $pk_s = sk_s P$, and then registers with CA. CS and the key management nodes initialize their TEEs and send the necessary information (e.g., the TEE's public key $pk_{tee}$) to the blockchain.

Data Release.
In our scheme, data are encrypted and stored off-chain while the related information is stored on-chain, and the DO can specify with whom their data can be shared. The data release (Figure 2) is conducted as follows.

Algorithm 1.
Require: DO: the encrypted data CF, the contract's program code Contract, the secret key shares $dk_1, dk_2, \ldots, dk_n$; DC: the request req.
Ensure: the data $M$.
procedure DATA RELEASE:
  Step 1.1: DO sends CF and Contract to CS;
  Step 1.2: CS publishes Contract to the blockchain;
  Step 1.3: DO checks Contract; if the Contract in the blockchain is correct, then DO sends the secret key share $dk_i$ to kTEE $i$.
procedure DATA RETRIEVAL:
  Step 2.1: DC invokes the smart contract with input req;
  Step 2.2: CS loads Contract and req into the sTEE;
  Step 2.3: sTEE performs remote attestation with the kTEEs; if the sTEE environment and loaded data are correct, then kTEE $i$ transmits $dk_i$ to the sTEE;
  Step 2.4: CS executes the smart contract inside the sTEE to obtain a re-encryption key;
  Step 2.5: CS computes the re-encrypted ciphertext $CF'$ outside the sTEE;
  Step 2.6: CS sends the re-encrypted data $CF'$ to the blockchain;
  Step 2.7: DC obtains $CF'$ and decrypts it to obtain $M$.

Then, DO randomly chooses $r \in Z_p$ and computes $s = h(m)$, $C_0 = SEnc_s(m)$, $C_1 = r \cdot ek$, and $C_2 = s \oplus H(e(P, P)^r)$, where $SEnc_s$ denotes symmetric encryption under the secret key $s$. DO sends $CF = (C_0, C_1, C_2)$ to CS. If $C_0$ does not duplicate an existing ciphertext (which means $m$ differs from the data already stored), DO receives a data retrieval index $DI$ and a timestamp $TS$ from the CS, where $DI = sig_{sk_s}(h(CF) \| TS)$. DO checks the validity of $DI$ using the CS's public key $pk_s$. If it is valid, DO continues; otherwise, DO aborts.
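
Because the symmetric key is derived from the data itself ($s = h(m)$), equal plaintexts produce equal $C_0$, which is what lets the CS refuse duplicates and lets a recipient check integrity. The sketch below illustrates only this part of the release step. It assumes SHA-256 for $h$ and, since the Python standard library has no AES, substitutes a toy XOR keystream for SEnc; the names `keystream_xor`, `release`, `CloudStore`, and `verify_integrity` are hypothetical, and $C_1$, $C_2$, and the index $DI$ are omitted.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for SEnc/its inverse: XOR with a SHA-256 counter keystream.

    Assumption: used only because the standard library lacks AES; it is
    deterministic, which is the property the deduplication relies on.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[offset:offset + 32], pad))
    return bytes(out)


def release(m: bytes):
    """DO side: derive s = h(m) and compute C0 = SEnc_s(m)."""
    s = hashlib.sha256(m).digest()
    return s, keystream_xor(s, m)


class CloudStore:
    """CS side: refuse a storage request whose C0 is already stored."""

    def __init__(self):
        self._stored = {}

    def store(self, c0: bytes) -> bool:
        tag = hashlib.sha256(c0).hexdigest()
        if tag in self._stored:
            return False          # duplicate plaintext detected, request refused
        self._stored[tag] = c0
        return True


def verify_integrity(s: bytes, c0: bytes) -> bool:
    """Recipient side: decrypt C0 with s and check that h(m') equals s."""
    return hashlib.sha256(keystream_xor(s, c0)).digest() == s


cs = CloudStore()
s1, c0_first = release(b"quarterly sales report")
s2, c0_second = release(b"quarterly sales report")
print(cs.store(c0_first), cs.store(c0_second), verify_integrity(s1, c0_first))
```

Deriving the key from the plaintext trades semantic security of $C_0$ for deduplication: anyone who can guess $m$ can confirm the guess, which is the usual caveat of this design.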

Smart Contract Creation.
The smart contract is responsible for generating the re-encryption key for the authorized DC, and the contract's creation is carried out as follows:
(1) DO creates a smart contract Contract, written as program code. Then, DO chooses a state key $k_{st}$ and generates $k'_{st}$ by encrypting $k_{st}$ under the sTEE's public key $pk_{tee}$. DO sends the Contract to the CS. The Contract contains the related information $RI = \{ek, A, W, DI, k'_{st}, TS, sig_{sk_o}(ek, A, W, DI, k'_{st}, TS)\}$, where $A$ is the set of access attributes, $W$ is the keyword set, $k'_{st}$ is the encrypted state key, and $sig_{sk_o}(\cdot)$ is the signature under DO's private key $sk_o$.
(2) CS loads the code of Contract into the sTEE, and then the sTEE generates a new contract ID, namely, CID, and decrypts $k'_{st}$ using its secret key $sk_{tee}$ to recover the state key $k_{st}$. Then, the sTEE encrypts the initial contract state as $st'_0 = SEnc_{k_{st}}(\vec{0})$ and outputs $(CID, Contract, st'_0, \pi)$, where $\pi$ is a correctness proof generated using the sTEE's secret key $sk_{tee}$. After that, CS sends the output to the blockchain. The consensus nodes verify $\pi$, pack the legitimate $(CID, Contract, st'_0, \pi)$ into a block, and record it on the blockchain.
(3) After the Contract has been confirmed in the blockchain, DO sends CID and the shares of $dk$ to the kTEEs. The key $dk$ is shared using the secret-sharing schemes [29,30], and each share is encrypted using the corresponding kTEE's public key. The security features of TEEs ensure that $dk$ is kept secret from other nodes.
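
The splitting of $dk$ among the kTEEs can be illustrated with a threshold secret-sharing sketch. The text only cites secret-sharing schemes [29,30] without fixing one, so the choice of Shamir's scheme below is an assumption, as are the field prime, the threshold parameters, and the function names `split` and `recover`; the per-share encryption under each kTEE's public key is omitted.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime used as the field modulus (assumption)


def split(secret: int, n: int, t: int):
    """Split secret into n shares so that any t of them recover it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation of the polynomial
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover(shares) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


dk = 123456789                      # stand-in for the decryption key dk
ktee_shares = split(dk, n=5, t=3)   # one share per kTEE
print(recover(ktee_shares[:3]) == dk, recover(ktee_shares[2:]) == dk)
```

With a threshold below the number of kTEEs, the sTEE can still recover $dk$ when some key management nodes are offline, while fewer than the threshold number of shares reveal nothing about it.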

Data Retrieval.
The anonymous data retrieval is conducted as shown in Figure 3 and comprises three phases: off-chain re-encryption key generation, ciphertext re-encryption, and data decryption. In the first phase, DC requests the ciphertext by invoking the smart contract, and the smart contract executed in the sTEE generates a re-encryption key for the authorized DC. In the second phase, CS re-encrypts the ciphertext and sends it to DC. In the last phase, DC receives the encrypted data and decrypts it to obtain the plaintext.

Off-Chain Re-encryption Key Generation.
DC searches for the keyword of interest $W_i$ on the blockchain and obtains CID from the related information RI. After checking the contents of the contract, DC invokes the smart contract with input $req = (CID, AEnc_{ek}(DI, pk_c, cert_c))$, where $AEnc_{ek}$ denotes asymmetric encryption under the public key $ek$. Then, the CS loads the corresponding contract state $st'_{old}$ and req into the sTEE, and the sTEE performs remote attestation with the kTEEs to attest that the sTEE environment, the loaded smart contract, and the data are correct. After the attestation passes, the shares of the decryption key $dk$ are transmitted to the sTEE through secure channels. Once it has obtained enough shares, the smart contract in the sTEE performs the following steps: (1) recovers the decryption key $dk$ and state key $k_{st}$; [...] (6) computes a re-encryption key $k_{o \to c} = (-dk) \cdot k \cdot pk_c$. When the execution ends, all involved keys and intermediate results of the off-chain smart contract execution in the sTEE can be securely erased [18]. There are two outputs: $out_1 = (DI, k_{o \to c})$ and $out_2 = (st'_{new}, h(k_{o \to c}), \pi)$ (if DC is illegal, then $out_1 = \perp$ and $out_2 = (st'_{new}, \perp, \pi)$).

Figure 2: Sequence diagram of data release.

Cipher Re-encryption.
CS retrieves the ciphertext CF according to $DI$ and re-encrypts CF to obtain $CF' = (C'_0 = C_0, C'_1 = e(C_1, k_{o \to c}), C'_2 = C_2)$. CS then sends the re-encrypted ciphertext to a temporary location, and a transaction $tran = \{out_2, url, h(CF'), sig_{sk_s}(out_2, url, h(CF'))\}$ is sent to the blockchain by CS, where url is the link to the temporary location where $CF'$ is stored. For the transaction tran, consensus nodes check the validity of $out_2$ through the proof $\pi$ provided by the sTEE, check the validity of url and $h(CF')$ through the signature provided by CS, and maintain the consistency of the state through the consensus scheme. [...] otherwise, DC rejects it and complains to the authority JA.

Claim.
JA first requests the CS to provide the ciphertext CF and the sTEE's output $out_1$. After confirming the correctness of CF (using the index $DI$ in the blockchain), JA computes the hash of the re-encryption key, $h(k_{o \to c})$, and checks whether it equals the value in $out_2$ on the blockchain. If so, JA computes the re-encrypted ciphertext $(C_0, C_1, C_2)$ and checks whether $C_0 = C'_0$, $C_1 = C'_1$, and $C_2 = C'_2$ hold. If they hold, JA ignores the complaint; otherwise, the CS has misbehaved, and JA takes action accordingly.
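
The decision logic of the claim procedure can be sketched as a small function. This is a simplified illustration, not the authors' implementation: SHA-256 stands in for $h$, the re-encryption routine is passed in as a parameter, `toy_reencrypt` is a hypothetical placeholder for the pairing-based step, and the `key_mismatch` outcome is an assumption, because the text does not say what JA does when the key hash differs from the ledger.

```python
import hashlib
from typing import Callable, Tuple

Cipher = Tuple[bytes, bytes, bytes]   # (C0, C1, C2)


def h(data: bytes) -> bytes:
    """Hash used for ledger commitments (SHA-256 assumed)."""
    return hashlib.sha256(data).digest()


def judge_complaint(
    cf: Cipher,
    rekey: bytes,
    ledger_key_hash: bytes,
    delivered: Cipher,
    reencrypt: Callable[[Cipher, bytes], Cipher],
) -> str:
    """Replay the re-encryption and compare it with what DC received."""
    if h(rekey) != ledger_key_hash:
        return "key_mismatch"         # outcome not specified in the text
    if reencrypt(cf, rekey) == delivered:
        return "complaint_ignored"    # CS behaved honestly
    return "cs_misbehaved"


def toy_reencrypt(cf: Cipher, rekey: bytes) -> Cipher:
    """Placeholder for C1' = e(C1, k): only C1 changes, as in the scheme."""
    c0, c1, c2 = cf
    return c0, h(c1 + rekey), c2


cf = (b"C0", b"C1", b"C2")
key = b"re-encryption key"
honest = toy_reencrypt(cf, key)
forged = (honest[0], b"wrong", honest[2])
print(judge_complaint(cf, key, h(key), honest, toy_reencrypt))
print(judge_complaint(cf, key, h(key), forged, toy_reencrypt))
```

Because the ledger stores only the hash of the re-encryption key, JA can bind the key revealed by the CS to the recorded execution without the key itself ever being published.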

Analysis and Evaluation
In this section, we analyze the security properties and evaluate the performance of the proposed scheme.

Security Analysis.
Our scheme achieves the security properties of correctness, confidentiality, verifiable integrity, transparency, and auditability.

Theorem 1. If DO, DC, and CS execute the scheme honestly, then DC can obtain DO's data correctly.
We can prove Theorem 1 by verifying the following equation:
$$\cdots = s \oplus H(e(P, P)^r) \oplus H(e(C_1, k_{o \to c})) = s \oplus H(e(P, P)^r) \oplus H(e(P, P)^r) = s.$$

Figure 3: Sequence diagram of data retrieval.

Security and Communication Networks
We prove the confidentiality of our scheme by proving that the secret $s$ cannot be recovered from the ciphertext by unauthorized users. We construct an algorithm $B$ that is given the pairing parameters $(G, G_T, p, e)$ and an instance $(P, A_0 = -aP, A_1 = aP, A_2 = a^2 P, B = bP, T)$ and aims to decide whether $T = e(P, P)^{b/a^2}$. $B$ controls a hash oracle and runs an algorithm $A$ (which aims to break the confidentiality of $s$) as a subroutine. We can prove that if $A$ breaks the confidentiality of $s$, then $B$ can break the 3-QBDH problem.
Before starting, we define two lists, L h and L c , where L h is the list of honest users and L c is the list of corrupt users.
(i) Init phase: $A$ prepares lists $L_h$ and $L_c$ and outputs $i^\star \in L_h$ as the challenge user. Let $sk_{i^\star}, sk_i$ be random numbers chosen from $Z_p$. The public key of the challenge user $i^\star$ is set as $pk_{i^\star} = sk_{i^\star} A_2$, and the corresponding secret key is $a^2 sk_{i^\star}$. Public keys of the other honest users $i \in L_h$ are set as $pk_i = sk_i A_1$, and the corresponding secret key is $a \cdot sk_i$. Public keys of corrupt users $i \in L_c$ are $pk_i = sk_i P$, and the corresponding secret key is $sk_i$. It should be noted that the corrupt users' key pairs $(sk_i, pk_i)_{i \in L_c}$ are known to $A$.
(ii) Find phase: $A$ plays the role of user $j$ and queries a re-encryption key of user $i$ from $B$. If $i = i^\star$ and $j \in L_h$, $B$ randomly chooses $k \in Z_p$ and computes [...] and $j \in L_h$, $B$ randomly chooses $k \in Z_p$ and computes $k_{i \to j} = sk_j \cdot k \cdot (-sk_i) \cdot P = -(sk_i a) \cdot (sk_j a) \cdot k \cdot P$. If $i \in L_h$, $i \neq i^\star$, and $j = i^\star$, $B$ randomly chooses $k \in Z_p$ and computes the re-encryption key $k_{i \to j} = sk_{i^\star} \cdot k \cdot (-sk_j) \cdot A_1 = -(sk_j a) \cdot (sk_{i^\star} a^2) \cdot k \cdot P$. If $i \neq i^\star$, $i \in L_h$, and $j \in L_c$, $B$ randomly chooses $k \in Z_p$ and computes $k_{i \to j} = sk_j \cdot k \cdot (-sk_i) \cdot A_0 = -(sk_i a) \cdot sk_j \cdot k \cdot P$. If $i \in L_c$, $B$ computes $k_{i \to j} = (-sk_i) \cdot k \cdot pk_j$.
(iii) Challenge phase: $A$ chooses two numbers $(s_0, s_1)$ and sends them to $B$. $B$ chooses $s_b$, where $b \in \{0, 1\}$. $B$ sets the challenge ciphertext $C^\star$ as $C^\star_1 = sk_{i^\star} B$ and $C^\star_2 = s_b \oplus H(T)$ and returns $C^\star$ to $A$.
The confidentiality of the secret $s$ is thus reduced to the hardness of the 3-QBDH problem. If $T$ is a random number, then the probability that $A$ breaks the confidentiality of our scheme is $1/2$. If $T = e(P, P)^{b/a^2}$, then $C^\star$ is a valid ciphertext of $s_b$ with $r = b/a^2$, $pk_{i^\star} = sk_{i^\star} A_2 = sk_{i^\star} a^2 P$, and $B = bP$. Therefore, if $A$ can break the confidentiality of our scheme with advantage $\varepsilon$, then $B$ can break the 3-QBDH assumption with advantage $\varepsilon/2$.

In the data release phase, data $m$ are encrypted under the key $s = h(m)$. Therefore, the same data $m$ will always be encrypted under the same symmetric key $s$, and the ciphertext $C_0 = SEnc_s(m)$ will be the same. Consequently, CS can quickly detect whether plaintext $m$ already exists in the database and refuse the redundant storage request. Furthermore, when DC recovers the key $s$ from the ciphertexts $C'_1, C'_2$ and then recovers data $m'$ from $C_0$ using $s$, DC can check whether $h(m')$ equals $s$ to verify the integrity. If the verification fails, DC can infer that the ciphertext has been modified or was not generated correctly.

Theorem 4. Our scheme satisfies the anonymity for DC.
In the data retrieval phase, DC uses regenerated pseudonym addresses to request the ciphertext. Meanwhile, the certificate is encrypted and can only be decrypted in the sTEE. Therefore, others cannot learn the real identity and attributes of DC, and no one can recognize DC from the pseudonym. Hence, our scheme achieves anonymous data retrieval.

Theorem 5. Our scheme achieves transparency and auditability.
The data access events are transparently recorded in the unforgeable ledger as transactions, which are publicly auditable by DO and JA. If the CS did not honestly generate the re-encrypted ciphertext, or if the CS was maliciously accused of returning a wrong re-encrypted ciphertext, JA could easily detect it using the information in the ledger.

Performance Evaluation.
We evaluate the computation and communication overheads of our scheme and then compare them with those of the related blockchain-based PRE schemes [11][12][13][14]. The symmetric pairing $e: G \times G \to G_T$ is constructed over the elliptic curve $E: y^2 = x^3 + 3x \bmod p$ with embedding degree 2; the field size is 520 bits, and the group order is 160 bits. This pairing achieves a security level of 80 bits. Our simulations use the MIRACL library [31] and are run on a laptop with an AMD Ryzen 5 3550H 2.10 GHz processor and 16.00 GB of memory.

Computation Overhead.
The key cryptographic operations are Par, Exp, and Sm, which denote the bilinear pairing operation, the exponentiation in $G_T$, and the scalar multiplication in $G$, respectively. Based on this setting, the main computational costs of our scheme are listed in Table 2. In the data release phase, DO performs 1 Sm operation and 1 Par operation. In the data retrieval phase, the smart contract performs 2 Sm operations, the CS performs 1 Par operation, and DC performs 1 Exp operation.
To demonstrate the efficiency of our scheme, we compare it with the related schemes [11][12][13][14] in terms of the above operations. Reference [11] proposed a blockchain-based data trading scheme. Reference [12] proposed a data sharing scheme for the scenario of multiple DCs, in which PRE and smart contracts were integrated to achieve privacy-preserving sharing of medical data. References [13,14] proposed identity-based PRE approaches to achieve secure data sharing for cloud-assisted systems. We compare our scheme with these schemes [11][12][13][14] because they all utilize PRE and blockchain to achieve secure data sharing. Table 3 compares the numbers of Par, Exp, and Sm operations in our scheme and in the schemes of [11][12][13][14]. We can see that our scheme needs the fewest Par, Exp, and Sm operations. Figure 4 shows the total time of these schemes, from which we can see that the computation time of data release in our protocol is 14.7 ms, which is much smaller than the 23.9, 29.4, 22.3, and 22.9 ms of [11][12][13][14]. The computation time of data retrieval in our protocol is 23.2 ms, which is also smaller than the 44.6, 63, 42.8, and 66.2 ms of [11][12][13][14].

Figure 5: Ciphertext length (bytes) versus the number of data owners, for [11], [12], [13], [14], and ours.
Figure 6: Communication overhead of the reencrypted ciphertext (re-encrypted ciphertext length in bytes versus the number of data consumers, for [11], [12], [13], [14], and ours).
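
The reported running times can be turned into relative savings with a few lines of arithmetic. The sketch below uses only the millisecond figures quoted in this section and assumes they are listed in the order [11], [12], [13], [14]; the dictionary layout and the helper name `reduction` are illustrative.

```python
# Running times in milliseconds, as quoted in the text.
OURS = {"release": 14.7, "retrieval": 23.2}
OTHERS = {
    "[11]": {"release": 23.9, "retrieval": 44.6},
    "[12]": {"release": 29.4, "retrieval": 63.0},
    "[13]": {"release": 22.3, "retrieval": 42.8},
    "[14]": {"release": 22.9, "retrieval": 66.2},
}


def reduction(ours: float, other: float) -> float:
    """Percentage of time saved relative to the other scheme."""
    return 100.0 * (1.0 - ours / other)


for ref, times in OTHERS.items():
    release_gain = reduction(OURS["release"], times["release"])
    retrieval_gain = reduction(OURS["retrieval"], times["retrieval"])
    print(f"{ref}: release -{release_gain:.1f}%, retrieval -{retrieval_gain:.1f}%")
```

On these figures the saving in data release is between roughly one third and one half, and the saving in data retrieval is larger, between roughly 45% and 65%.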

Communication Overhead.
To evaluate the communication overhead, we denote the sizes of a scalar value in $Z_p$, a group element in $G$, and a group element in $G_T$ by $|Z_p|$, $|G|$, and $|G_T|$, respectively. We choose SHA-256 as the hash function $h$. The symmetric encryption algorithm is AES-256. The signature used to sign a blockchain transaction is ecdsa-with-SHA256. Based on these, Table 4 compares the communication costs of our scheme with those of [11][12][13][14] in terms of encryption overhead and re-encryption overhead. Figure 5 and Figure 6 show the comparison results, from which we can see that the ciphertext length of our protocol is identical to that of [13] and shorter than those of [11,12,14], and that the re-encrypted ciphertext of our scheme is shorter than those of [11][12][13][14]. Considering that our scheme also has a significant advantage in computation time, it is more efficient in terms of both computation overhead and communication overhead.
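
For orientation, the element sizes $|Z_p|$, $|G|$, and $|G_T|$ can be estimated from the stated parameters (520-bit field, 160-bit group order, embedding degree 2). The sketch below is a back-of-the-envelope calculation under assumed encodings: uncompressed affine points for $G$ and a pair of field elements for $G_T$. The encoding actually used in Table 4 is not specified here, so these values are illustrative only.

```python
FIELD_BITS = 520        # size of the base field, from the text
ORDER_BITS = 160        # size of the group order, from the text
EMBEDDING_DEGREE = 2    # from the text
HASH_BYTES = 32         # SHA-256 digest length


def to_bytes(bits: int) -> int:
    """Round a bit length up to whole bytes."""
    return (bits + 7) // 8


size_zp = to_bytes(ORDER_BITS)                       # scalar in Z_p
size_g = 2 * to_bytes(FIELD_BITS)                    # (x, y), uncompressed (assumption)
size_g_compressed = to_bytes(FIELD_BITS) + 1         # x plus a sign byte (assumption)
size_gt = EMBEDDING_DEGREE * to_bytes(FIELD_BITS)    # element of the degree-2 extension

print(f"|Z_p| = {size_zp} B, |G| = {size_g} B "
      f"({size_g_compressed} B compressed), |G_T| = {size_gt} B, |h| = {HASH_BYTES} B")
```

Under these assumptions a $G_T$ element is as large as an uncompressed point of $G$, which is why replacing group elements by hash outputs or scalars is an effective way to shorten a ciphertext.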

Conclusion
This paper proposes a flexible and secure data sharing method for data-driven systems. To ensure confidentiality and reliability, data are encrypted and then stored off-chain, whereas the relevant information, such as the digest, is stored on-chain, and data can be efficiently shared with authorized users with the help of a CS. The smart contract is employed to control data access such that the key delegation is executed automatically and the DO is not required to be online all the time. To ensure security and privacy, the smart contract is executed in the TEEs. Besides, all interactions, data delegations, and other operations are recorded in the blockchain and can be checked at any time, which realizes efficient monitoring and auditing of the data. We proved that security properties such as confidentiality, anonymity, and verifiable integrity are ensured during the whole data sharing process. We also simulated the proposed scheme, and the results show that it performs better than the related works.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.