Genuine and Secure Identity-Based Public Audit for the Stored Data in Healthcare Cloud

Cloud storage has attracted increasing attention since it permits cloud users to save and access their outsourced files at any time, with any device, and from any place. To ensure data integrity, numerous public auditing constructions have been presented. However, existing constructions are mainly built on public key infrastructure (PKI). In these constructions, before checking data integrity, the auditor must first verify the legitimacy of the public key certificate (PKC), which places a heavy burden on the auditor. To eliminate time-consuming certificate verification, in this work we present an efficient identity-based public auditing proposal. Our construction is an identity-based data auditing system in the true sense, in that the algorithm used to compute authentication signatures is an identity-based signature algorithm. Extensive security evaluation and experimental testing demonstrate that our proposal is secure and efficient; it can effectively resist forgery attacks and replay attacks. Finally, compared with two recent identity-based public auditing proposals, ours outperforms both when computational cost, communication overhead, and security strength are considered together.


Introduction
With technical progress in the communication field, the amount of generated data is growing rapidly. Many companies in the healthcare industry increasingly make use of cloud storage services. Instead of every hospital storing and maintaining medical data on physical servers, cloud storage is becoming a popular alternative since it offers clients more convenient network connectivity, on-demand data storage, and resource sharing. The aging population urges healthcare services to reform continuously so as to achieve cost-effectiveness and timeliness and to furnish services of higher quality. Numerous specialists deem that cloud-computing techniques may improve healthcare services by reducing electronic health record (EHR) start-up costs, such as software, equipment, employee, and various license fees. These reasons will encourage the adoption of the relevant cloud techniques. Consider one instance of healthcare services in which cloud techniques are applied: a healthcare sensor system can automatically collect patients' vital data from wearable devices, which are connected to traditional medical equipment via wireless sensor networks, and then upload these data to a "medical cloud" for storage. Another typical instance is the Sphere of Care by Aossia Healthcare, started in 2015. Such cloud-based systems can automatically collect users' everyday real-time data, which alleviates the manual collection burden and simplifies the deployment of the whole medical system. However, healthcare providers face many challenges in migrating all local health data to a remote cloud server, where the paramount concerns are privacy and security, since the healthcare administrator no longer completely controls the security of those medical records. After medical data are stored on the cloud, they may be corrupted or lost.
To ensure the intactness of data stored on a remote server, patients or healthcare service providers expect that the integrity of the stored data can be checked periodically to detect damage. However, for individuals, one of the greatest challenges is how to carry out periodic data integrity checks once they no longer keep a local copy of the files. Meanwhile, it is also infeasible for a resource-limited individual to conduct integrity verification by retrieving the entire data file.
To deal with the above issue, many specialists have presented a number of solutions aimed at diverse systems and diverse security models [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. Nevertheless, most existing solutions are built on public key infrastructure (PKI for short). It is well known that PKI-based auditing proposals suffer from a complex key management problem: the data client needs to perform key updating, key revocation, key maintenance, and so on. Hence, key management and certificate verification in a PKI-based data auditing system are troublesome issues. Furthermore, a public key certificate (PKC for short) also needs more storage space than an individual ID, since the key pair (PK, sk) must be kept locally. To guarantee data integrity, a verifier must first extract the PKC from the public key directory and then verify whether the certificate is valid. Therefore, this also increases the computation burden and communication overhead of the verifier.
In 2014, the first so-called ID-based data integrity proposal was put forward by Wang et al. [13]. Strictly speaking, their proposal is not an identity-based auditing scheme, because the algorithm that generates metadata authentication tags is not an identity-based algorithm but a PKC-based one. In 2015, Yu et al. proposed a generic method of constructing identity-based public auditing systems by integrating an identity-based signature algorithm with traditional PDP protocols [15]. Their research is very significant for the study of ID-based public auditing systems. However, in their scheme, the algorithm that produces block authentication tags is still PKI-based. Furthermore, in the auditing phase, the auditor first verifies the validity of an identity-based signature on a public key PK and then executes data integrity verification using this public key PK again, which increases the computation burden of the auditor. In 2016, Zhang and Dong brought forward a novel identity-based public auditing proposal [16]. Their proposal is an identity-based public auditing system in the literal sense, since the algorithm that produces metadata authentication tags is an ID-based signature algorithm. However, their scheme is shown to be insecure in the Appendix.
To increase the efficiency and strengthen the security of ID-based auditing protocols, in this work a secure and efficient ID-based auditing construction is proposed. Its original contributions are as follows: (1) On the basis of the idea of homomorphic signatures in the ID-based setting, we devise a genuine identity-based auditing proposal for data integrity. The proposal not only avoids key management but also relieves the auditor's burden. (2) In the auditing phase, our scheme has constant communication overhead. Compared with the two schemes [15, 16], our proposal has advantages with regard to computational cost and communication cost. (3) In the random oracle model, the proposed scheme has a rigorous security proof, and the proof can be tightly reduced to the CDH problem.

Architecture and Security of System
In this section, in order to better understand our ID-based data integrity auditing protocol (ID-DIAP for short), we first describe the system model, and afterwards define the security model of our ID-DIAP for cloud storage.

System Architecture.
For our ID-DIAP system in the cloud, the architecture is composed of four entities: the private key generator (the PKG for short), the third-party verifier/auditor (the TPA for short), cloud servers, and the data user. The whole system architecture is demonstrated in Figure 1.
To avoid biases in the auditing process, the TPA is recommended to implement the audit function in our system model. The detailed function of each role in the system architecture is described below.
(i) Data User. It acts as a cloud user and possesses a number of files which need to be uploaded to the remote cloud server without keeping a local copy. Generally speaking, the data user may be a resource-limited entity due to its limited storage and computing capability, and it can flexibly access and share the outsourced data at any time.
(ii) Cloud Servers. They are composed of a group of distributed servers and have tremendous storage and computing capability. They are responsible for saving and maintaining the files stored in the cloud. Nevertheless, the cloud server might be untrusted: for its own profit and a good commercial reputation, it might conceal data corruption incidents from its cloud users.
(iii) The Third-Party Auditor. It acts as a verifier of data intactness. In principle, it has the professional experience and practical capability to take charge of data integrity audits on behalf of cloud users/data users.
(iv) The PKG. It is a trusted entity and is duty-bound to set up the system parameters and to compute the private key of every cloud user.
For a cloud-based storage system, the goal is to alleviate cloud users' burden of data storage and maintenance. Nevertheless, after data are uploaded to the remote server in the cloud, a potential security problem arises, since the uploaded data are out of the data user's control and the remote cloud server is generally unreliable. The data user may be concerned about whether the files stored in the cloud are intact.
Thus, the data user wants security measures that ensure the integrity of the outsourced data is examined regularly without a local copy.

Definition 1 (identity-based data integrity auditing protocol, ID-DIAP). In general, an ID-DIAP system contains three phases:
(1) System Initialization Phase. In this phase, the PKG is duty-bound to produce the system parameters. It runs the Setup(1^λ) algorithm on the security parameter λ to obtain the system parameters Para and the PKG's key pair (mpk, mpsk). Meanwhile, the PKG invokes the KeyExtr(1^λ, Para, mpsk, ID) algorithm, on input its mpsk, Para, and the identity ID of the data user, to compute the private key sk_ID for the data user with identity ID.
(2) Data Outsourcing Phase. In this phase, the data owner (data user) runs TagGen(M, sk_ID) with its private key sk_ID to generate a metadata authentication tag δ_i for each data block m_i.
(3) Auditing Phase. In this phase, the TPA issues a challenge to the cloud server, the server responds with proof information, and the TPA verifies the proof to decide whether the outsourced data remain intact.

Different Types of Attack and Security Definition.
In this subsection, we analyze the diverse attacks our ID-DIAP system may be confronted with, in light of the behavior of every role in the system architecture. In our system architecture, the PKG is the private key generator which computes the data user's private key. In general, it is a credible authority, and we assume that the PKG does not launch any security attack against the other entities in the system model. The third-party auditor is deemed to be an honest-but-curious entity and earnestly executes every step of the auditing course. The cloud server is considered unreliable: it might deliberately delete or alter rarely accessed data files to save storage space. It is a powerful inside attacker in our security model, and its goal is to tamper with or replace the stored data without being detected by the auditor. Because the cloud server is a powerful attacker in our security model, we mainly consider the attacks [7] launched by the cloud server in this paper.

Forge Attack.
The malicious cloud server may produce a forged meta-authentication signature on a new data block, or fabricate fake proving information Prf that satisfies the auditing verification, in order to deceive the auditor.

Replace Attack.
If a certain data block in the challenge set were corrupted, the malicious cloud server would select another valid pair (m_i, δ_i) of data block and authentication tag to substitute for the corrupted pair (m_j, δ_j).

Replay Attack.
It is an effective attack. The malicious cloud storage server might produce new proof information Prf* without retrieving the challenged data of the auditor, by reusing the former proving information Prf.

Our Public Auditing Construction
In the following, we describe our ID-DIAP system. It contains four entities: the PKG, the cloud server, the data user, and the TPA, and the whole system is composed of five PPT algorithms. The framework of every entity and all algorithms is represented in Figure 2. The algorithms are described in detail below.

Setup.
For the sake of readability, the notations used in our ID-DIAP system are listed in Table 1.
The PKG takes a security parameter λ as input and generates two cyclic groups G_1 and G_T of the same prime order q > 2^λ. Let P and T be two generators of group G_1 with P ≠ T, and define a bilinear pairing map e: G_1 × G_1 → G_T. The PKG randomly chooses s ∈ Z_q as its master private key and computes P_pub = sP as its public key. Finally, the public parameters are published as Para = (G_1, G_T, q, e, P, T, P_pub, H_0, H_1, H_2), where H_0, H_1, and H_2 are cryptographic hash functions, and the PKG keeps its master private key s secret.

Key Extraction.
For a data user to obtain its private key, it delivers its identity ID to the PKG. The PKG then uses its master private key s and ID in the following process: (1) First, the data user delivers its identity information ID to the PKG. (2) Next, the PKG computes the data user's private key (D_ID0, D_ID1) as D_ID0 = s·Q_0 and D_ID1 = s·Q_1, where Q_0 = H_0(ID‖0) and Q_1 = H_0(ID‖1). It then returns (D_ID0, D_ID1) to the data user through a secret and secure channel.
(3) Upon receiving the private key (D_ID0, D_ID1), the data user can test whether it is valid through the following equations: e(D_ID0, P) = e(Q_0, P_pub) and e(D_ID1, P) = e(Q_1, P_pub).
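To make the Setup and Key Extraction flow concrete, here is a toy Python sketch. It is an assumption-laden stand-in, not the paper's scheme: the pairing group G_1 is modeled by exponents modulo a toy prime, so scalar multiplication s·Q becomes modular exponentiation, and the user's pairing-based validity check e(D_IDb, P) = e(Q_b, P_pub) collapses to comparing D_IDb with P_pub^{Q_b}. All constants and function names are illustrative.

```python
import hashlib
import secrets

# Toy stand-ins for the group parameters (assumptions, not the paper's curve).
P_MOD = (1 << 127) - 1   # toy prime modulus
GEN = 5                  # toy generator, playing the role of P

def h0(identity: str, bit: int) -> int:
    # Stand-in for the hash-to-point H_0(ID || bit): hash to an exponent.
    d = hashlib.sha256(f"{identity}||{bit}".encode()).digest()
    return int.from_bytes(d, "big") % (P_MOD - 1)

def setup():
    # Master secret key s and master public key P_pub = sP.
    s = secrets.randbelow(P_MOD - 2) + 1
    p_pub = pow(GEN, s, P_MOD)
    return s, p_pub

def key_extract(s: int, identity: str):
    # D_ID0 = s*Q_0 and D_ID1 = s*Q_1 with Q_b = H_0(ID || b),
    # realized here as exponentiation by s*Q_b.
    q0, q1 = h0(identity, 0), h0(identity, 1)
    d0 = pow(GEN, s * q0 % (P_MOD - 1), P_MOD)
    d1 = pow(GEN, s * q1 % (P_MOD - 1), P_MOD)
    return d0, d1

def key_is_valid(p_pub: int, identity: str, d0: int, d1: int) -> bool:
    # Mirrors the user's check e(D_IDb, P) = e(Q_b, P_pub):
    # in exponent form, D_IDb must equal P_pub^{Q_b}.
    q0, q1 = h0(identity, 0), h0(identity, 1)
    return d0 == pow(p_pub, q0, P_MOD) and d1 == pow(p_pub, q1, P_MOD)
```

The validity check passes for the identity the key was extracted for and fails for any other identity, which is exactly the property the user relies on in step (3).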

TagGen Phase.
For an upload file M, the data user first divides M into n blocks, namely, M = m_1‖m_2‖...‖m_n. To outsource this file to the cloud, the data user randomly chooses a private-public key pair (x_ID, Y_ID) of a secure signature algorithm (Sig, Ver), for example, the BLS short signature.
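The block-splitting step can be sketched as follows; the zero-padding rule is our assumption, since the paper does not specify how a file whose length is not a multiple of the block size is handled.

```python
def split_into_blocks(data: bytes, n: int):
    # Split file M into n blocks m_1 || m_2 || ... || m_n, zero-padding the
    # tail so every block has the same length (the padding rule is an
    # illustrative assumption, not specified by the scheme).
    size = -(-len(data) // n)                  # ceil(len(data) / n)
    padded = data.ljust(size * n, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(n)]
```

For example, an 8-byte file split into n = 3 blocks yields three 3-byte blocks, the last one padded with a zero byte.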
Let Name denote the identifier of data file M. The data user computes the file authentication tag τ = τ_0‖Sig(x_ID, τ_0), where τ_0 is the information string τ_0 = "Name‖n" and Sig(x_ID, τ_0) denotes a secure signature on τ_0. Subsequently, the data user generates a metadata authentication tag for each data block. To compute the block authentication tags for all data blocks m_i, i = 1, 2, ..., n, the data user uniformly samples r ∈ Z_q and computes R = rP.
Next, for i = 1 to n, it computes the metadata authentication tag for data block m_i by the following steps: (1) First, it computes h_i = H_2(i‖ID‖R‖0). (2) Then, it uses its private key (D_ID0, D_ID1) to compute δ_i = r·H_1(Name‖i) + h_i·D_ID0 + m_i·D_ID1. (3) The resultant authentication tag of data block m_i is θ_i = (R, δ_i).
Finally, the data user uploads all the meta-authentication tags (τ, R, {δ_i | i = 1, 2, ..., n}) and the outsourced file M to the remote cloud server. On receiving these data, the cloud server executes the following validation procedure. For i = 1 to n, it verifies the relation e(δ_i, P) = e(H_1(Name‖i), R)·e(h_i·Q_0 + m_i·Q_1, P_pub), where Q_1 = H_0(ID‖1) and Q_0 = H_0(ID‖0). If all relations hold, it parses τ into τ_0 and Sig(x_ID, τ_0) and verifies the validity of the signature by checking Ver(Sig(x_ID, τ_0), τ_0, Y_ID) = 1. If this also holds, the cloud server preserves the data in the cloud.

Challenge Phase.
To audit the integrity of the outsourced file M, the TPA first parses τ into τ_0 and Sig(x_ID, τ_0) and verifies Ver(Sig(x_ID, τ_0), τ_0, Y_ID) = 1. If this does not hold, it terminates. Otherwise, it retrieves the file identifier Name and the block count n.
Then, the auditor randomly picks a subset I ⊆ [1, n] with |I| = l and a value ρ ∈ Z_q to generate the challenge information Chall = (ρ, I).
Finally, it delivers Chall to the cloud server as the challenge.
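The TPA-side challenge generation can be sketched in a few lines; the function name and the use of Python's `secrets` module are illustrative choices.

```python
import secrets

def gen_challenge(n: int, l: int, q: int):
    # TPA picks a random size-l subset I of [1, n] and a random rho in Z_q,
    # then sends Chall = (rho, I) to the cloud server.
    rng = secrets.SystemRandom()
    subset = sorted(rng.sample(range(1, n + 1), l))
    rho = rng.randrange(1, q)
    return rho, subset
```

Using a cryptographically strong source for both ρ and I matters here: a predictable challenge would let the server precompute proofs and mount the replay attack described above.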

Proving Phase.
After obtaining the challenge information Chall = (ρ, I), the cloud server computes v_i = ρ^i mod q for each i ∈ I and produces the set Q = {(i, v_i)}_{i∈I}. Subsequently, from the outsourced data file M = (m_1, ..., m_n) and the meta-authentication tag θ_i of each challenged block, it produces δ = Σ_{i∈I} v_i·δ_i and μ = Σ_{i∈I} v_i·m_i mod q. Finally, the cloud storage server returns the 3-tuple Prf = (δ, μ, R) to the auditor as the proof information.
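The derivation of the coefficient set Q = {(i, v_i)} can be sketched directly; the indices and modulus below are toy values.

```python
def challenge_coefficients(rho: int, subset, q: int):
    # Server derives v_i = rho^i mod q for each challenged index i,
    # forming the set Q = {(i, v_i)} used to aggregate the proof.
    return {i: pow(rho, i, q) for i in subset}
```

Deriving all v_i from the single value ρ is what keeps the challenge message short: the TPA transmits one element of Z_q plus the index set, rather than one coefficient per challenged block.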

Verifying Phase.
To check the integrity of the outsourced data in the cloud, upon receiving the proof information Prf = (δ, μ, R), the third-party auditor computes H = Σ_{i∈I} v_i·H_1(Name‖i) and h = Σ_{i∈I} v_i·h_i mod q, where h_i = H_2(i‖ID‖R‖0) for i ∈ I. Then, it checks the validity of the following equation:

e(δ, P) = e(H, R)·e(h·Q_0 + μ·Q_1, P_pub).    (11)
If Equation (11) holds, the TPA outputs VerifyRes as true; otherwise, it outputs VerifyRes as false.
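To see why Equation (11) balances, the following scalar-level sketch models every group element by its discrete logarithm, so the pairing equation collapses to δ = r·H + s·(h·q_0 + μ·q_1) (mod q). The real verifier never sees r or s; they appear here only to expose the algebra, and all names and constants are illustrative.

```python
MOD = 2_147_483_647   # toy prime modulus (illustrative)

def make_tags(blocks, h1, h2, r, s, q0, q1):
    # Scalar form of delta_i = r*H_1(Name||i) + h_i*D_ID0 + m_i*D_ID1,
    # with D_ID0 = s*q0 and D_ID1 = s*q1 as exponents.
    return [(r * h1[i] + s * (h2[i] * q0 + m * q1)) % MOD
            for i, m in enumerate(blocks)]

def prove(blocks, tags, coeffs):
    # delta = sum v_i*delta_i and mu = sum v_i*m_i over challenged indices.
    delta = sum(v * tags[i] for i, v in coeffs.items()) % MOD
    mu = sum(v * blocks[i] for i, v in coeffs.items()) % MOD
    return delta, mu

def verify(delta, mu, coeffs, h1, h2, r, s, q0, q1):
    # Scalar analogue of e(delta, P) = e(H, R) * e(h*Q_0 + mu*Q_1, P_pub):
    # H and h are the same v_i-weighted sums the TPA computes.
    big_h = sum(v * h1[i] for i, v in coeffs.items()) % MOD
    h = sum(v * h2[i] for i, v in coeffs.items()) % MOD
    return delta == (r * big_h + s * (h * q0 + mu * q1)) % MOD
```

The check passes for an honestly aggregated (δ, μ) and fails as soon as μ is tampered with, which is the homomorphic property the pairing equation enforces without ever revealing r or s.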

Security Analysis
To show our proposal's security, we demonstrate that it is secure against the above three attacks.

Theorem 1. Assume there exists a probabilistic polynomial-time (PPT for short) adversary Adv that can cheat the auditor with invalid proving information Prf forged by Adv (the dishonest cloud storage server) with non-negligible probability ε. Then we can design an algorithm B that efficiently breaks the CDH assumption by invoking Adv as a subprogram.
Proof. Suppose a PPT adversary Adv is capable of computing faked proving information Prf after data blocks or metadata authentication tags have been corrupted; then we can construct a PPT algorithm B that breaks the CDH assumption by utilizing the adversary Adv. First, let the 3-tuple (P, aP, bP) ∈ G_1 be a random instance of the CDH problem, for which it is hard to compute the solution abP.
In the security proof, the hash function H_0 is regarded as a random oracle, and each data user's identity ID is submitted to H_0 only once. H_1 and H_2 act only as one-way functions. In addition, the adversary Adv can adaptively issue queries to three oracles: the H_0-oracle, the Key-Extract oracle, and the TagGen oracle.
Setup. Choose two cyclic groups G_1 and G_T of the same prime order q. The algorithm B first sets P_pub = aP as the public key of the PKG. Let H_0, H_1, and H_2 be three hash functions. Finally, B sends the public system parameters (G_1, G_T, P, T, q, P_pub, e, H_0, H_1, H_2) to the adversary Adv, and lets j* ∈ {1, ..., q_{H_0}} be the index of the challenged data user's identity.
H_0-Hash Oracle. The adversary Adv submits a query to the H_0-oracle with an identity ID_i. If the index of identity ID_i satisfies i ≠ j*, then the challenger B picks t_i0, t_i1 ∈ Z_q at random and sets H_0(ID_i‖0) = t_i0·P = h_i0 and H_0(ID_i‖1) = t_i1·P = h_i1. Otherwise, the challenger B uniformly samples t*_0 and t*_1 from Z_q and sets H_0(ID_i‖0) = t*_0·bP = h_i0 and H_0(ID_i‖1) = t*_1·bP = h_i1. Finally, the 5-tuple (ID_i, h_i0, h_i1, t_i0, t_i1) is added to the H_0-list, which is initially empty.

Key Extraction Oracle. For a key extraction query, Adv submits an identity ID_i to the key extraction oracle. To respond, the challenger B proceeds as follows: if the identity index of ID_i satisfies i ≠ j*, then B looks up the 5-tuple (ID_i, h_i0, h_i1, t_i0, t_i1) in the H_0-list (first querying the H_0-oracle with ID_i if it is absent) and sends D_i0 = t_i0·aP, D_i1 = t_i1·aP to the adversary Adv; otherwise, B aborts.
TagGen Oracle. For a tag generation query on identity ID_i and data file M = m_1‖...‖m_n, B responds as follows: (a) First, for the file identifier "Name", it picks r_name ∈ Z_q at random and computes R_name = r_name·P. (b) Next, for l = 1 to n, it computes h_l = H_2(l‖ID_i‖R_name‖0) and the authentication tag of data block m_l as δ_il = r_name·H_1(Name‖l) + h_l·D_i0 + m_l·D_i1, and adds (ID_i, R_name, {m_l, δ_il | 1 ≤ l ≤ n}) to the Tag list, which is initially empty.
Output. Finally, for challenge information {(i, v_i)}, i ∈ I, the adversary Adv outputs fake proving information Prf′ = (δ*, μ*, R*) on the corrupted file M* of the data user with identity ID* with non-negligible probability ε. Adv wins this security game if and only if the following conditions hold: (1) ID* = ID_{j*}; (2) Prf′ = (δ*, μ*, R*) passes verification equation (11); (3) Prf′ ≠ Prf, where Prf = (δ, μ, R) is the legitimate proving information for the challenge information {(i, v_i)}, i ∈ I, and the data file M, which satisfies M* ≠ M. When the adversary Adv wins this game, we obtain
e(δ*, P) = e(H, R*)·e(h*·Q_{ID_{j*}0} + μ*·Q_{ID_{j*}1}, P_pub),
e(δ, P) = e(H, R)·e(h·Q_{ID_{j*}0} + μ·Q_{ID_{j*}1}, P_pub),
where the right-hand sides are computed by the verifier and, for the same data file, R = R* and h* = h. Dividing the two equations gives e(δ* − δ, P) = e((μ* − μ)·Q_{ID_{j*}1}, P_pub). Since Q_{ID_{j*}1} = t*_1·bP and P_pub = aP, the challenger B can output abP = (t*_1·(μ* − μ))^{-1}·(δ* − δ). This indicates that the CDH assumption can be broken with non-negligible probability ε′, which is impossible since the CDH problem is hard to solve.

Theorem 2. Our proposed auditing proposal can efficiently resist the replay attack of a malicious cloud server.
Proof. The detailed proof is very similar to the security proof in [16]; hence, it is omitted due to limited space.

Performance Evaluation
To evaluate our proposal's performance, in the following we show that our proposal is efficient by comparing it with Yu et al.'s proposal [15] and Zhang and Dong's proposal [16] in terms of computational cost and communication overhead; Zhang and Dong's proposal [16] is the state-of-the-art identity-based public auditing scheme with respect to communication overhead.

Computation Costs.
To evaluate the computation costs of our proposal, we contrast it with Zhang and Dong's proposal [16] and Yu et al.'s proposal [15], since these are two recent efficient ID-based public auditing schemes. We implement the operations adopted in the three schemes on an HP laptop with an Intel Core i3-6500 CPU at 2.4 GHz and 8 GB RAM, and all algorithms are implemented using the MIRACL cryptography library [21, 22], which is used for the "MIRACL Authentication Server Project Wiki" by Certivox. We employ a supersingular elliptic curve over the field GF(p), with a 160-bit modulus p and embedding degree 2. Moreover, in our experiments, all statistical results are the mean values of 10 simulation trials. For explicit demonstration, we use Mul_G1 to denote a point multiplication operation, and let Hash and Pairing be a hash-to-point operation from Z_q to G_1 and a bilinear pairing operation, respectively. For a public auditing protocol, the computational cost of the TagGen phase is mainly determined by the computation of block authentication tags. To outsource a data file of n blocks, the data user requires (3n + 1) Mul_G1 + n Hash to produce the block authentication tags in our construction; in Yu et al.'s proposal and Zhang and Dong's proposal, the data user needs (n + 1) Hash + (2n + 2) Mul_G1 and 4n Mul_G1, respectively. In Figure 3, we show the simulation results for generating the block authentication tags for diverse block counts of the same data file.
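The TagGen operation counts quoted above can be tabulated programmatically; this helper simply encodes the formulas from the text (the function and variable names are ours).

```python
def taggen_costs(n: int):
    # Operation counts for generating n block authentication tags, as
    # (point multiplications in G_1, hash-to-point operations).
    ours = (3 * n + 1, n)
    yu_et_al = (2 * n + 2, n + 1)      # scheme [15]
    zhang_dong = (4 * n, 0)            # scheme [16]
    return ours, yu_et_al, zhang_dong
```

For a 1000-block file this gives 3001 vs. 2002 vs. 4000 point multiplications, matching the ordering observed in Figure 3.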
From Figure 3, we see that, in the TagGen phase, Zhang and Dong's proposal is the most time-consuming and Yu et al.'s proposal is the most efficient. Our proposal is slightly slower than Yu et al.'s because the algorithm producing block authentication tags in their proposal is a public key certificate-based signature algorithm, whereas the algorithm used in our proposal is an ID-based signature algorithm. Because the block authentication tags for a data file can be produced in an offline phase, this has little influence on the whole protocol.
In the auditing/verification phase, the computational cost mainly comes from verifying the proof information and is determined by the number of challenged data blocks. In our construction, to check the validity of the proving information, the auditor requires 3 Pairing + c Hash + (c + 2) Mul_G1; in the other two proposals, the TPA needs 3 Pairing + 2 Mul_G1 and 5 Pairing + (c + 2) Mul_G1 + c Hash, respectively, to check the integrity of the file stored in the cloud, where c denotes the size of the challenge subset. In Table 2, we compare the computational time for different challenge subset sizes.
According to Table 2, the proposal in [16] is the most efficient, and our proposal is slightly more efficient than the proposal in [15]. However, the proposal in [16] is shown to be insecure; the detailed attack is presented in the Appendix. We also observe that the TPA's computational cost grows linearly with the size of the challenge subset.

Communication Cost.
In a data auditing system, communication costs mainly come from two phases: the outsourcing phase of the data file and the auditing phase. In the outsourcing phase, the data owner uploads the data file and the corresponding meta-authentication tags. In our proposal, the data owner needs to upload (n + 1)|G_1| + |M| bits to the cloud storage server; in the proposals [15, 16], the data owner needs to upload (n + 3)|G_1| + |M| bits and 2n·|G_1| + |M| bits, respectively. Here, |G_1| represents the bit length of an element of group G_1, |M| denotes the bit length of the data file, and n is the number of data blocks.
In the auditing phase, communication costs mainly come from transmitting the challenge information and proving information between the TPA and the cloud storage server. In our scheme, the challenge information Chall is |Z_q| + |I| bits and the proving information is 3|G_1| bits; thus, the total communication overhead is |Z_q| + |I| + 3|G_1| bits, where |Z_q| represents the bit length of an element of Z_q and |I| is the size of the challenge subset. In the proposal [16], the total communication cost of the auditing phase is 2|Z_q| + |I| + 2|G_1| bits; in the proposal [15], it is (2 + |I|)|Z_q| + |I| + 5|G_1| bits. A detailed comparison is shown in Table 3.
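The auditing-phase totals can likewise be computed from the formulas above; the element sizes plugged into the example are illustrative parameter choices, not measurements.

```python
def auditing_comm_bits(zq: int, g1: int, l: int):
    # Total auditing-phase traffic (challenge + proof) in bits for the three
    # schemes, per the formulas in the text; l = |I| is the challenge size,
    # zq = |Z_q| and g1 = |G_1| are element bit lengths.
    ours = zq + l + 3 * g1
    zhang_dong = 2 * zq + l + 2 * g1          # scheme [16]
    yu_et_al = (2 + l) * zq + l + 5 * g1      # scheme [15]
    return ours, zhang_dong, yu_et_al
```

Note that the cost of [15] grows linearly in |I| through the (2 + |I|)|Z_q| term, while ours and that of [16] stay constant apart from transmitting the index set itself.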
As shown in Table 3, our scheme has the least communication overhead among three schemes.

Security Comparison.
According to Theorem 1, our scheme is provably secure against a malicious cloud server under the computational Diffie-Hellman assumption, with a tight security reduction. Yu et al.'s proposal [15] is also provably secure against a malicious cloud server under the CDH assumption. However, Zhang and Dong's proposal [16] is shown to be insecure against a malicious cloud server: a malicious cloud server can delete the whole file without the TPA being aware of it, and the detailed security analysis is given in the Appendix.

Conclusion
In this work, we present a novel identity-based public auditing system by merging the homomorphic authentication technique of ID-based cryptography into the auditing system. Our proposal overcomes the security and efficiency problems existing in ID-based public auditing systems. The proposal is proven secure, and its security is tightly related to the classical CDH assumption. Compared with two efficient ID-based schemes, our scheme outperforms both when computation complexity, communication overhead, and security are considered together.