Attribute-Based User Revocable Data Integrity Audit for Internet-of-Things Devices in Cloud Storage

Introduction
With the widespread use of MCS systems, exemplified by Internet-of-Things (IoT) and mobile communication devices [1], the ways in which users collect and use data have become more diverse. While portable terminals bring convenience to users' work and life [2][3][4][5], they also have limitations, such as limited storage space, difficulty in synchronizing data between different terminal devices, and limited immediacy of data access, which prevent data from being fully utilized. To improve users' work efficiency, raise the data utilization rate, and reduce local data management and maintenance costs, cloud storage technology has been promoted.
Because the storage service provider is not completely trusted, data that users entrust to third-party storage face potential security risks [6][7][8][9][10][11][12]: data with low usage rates may be deleted, damage caused by attacks may be concealed, storage may not meet user requirements, and data may be maliciously leaked. Data integrity auditing technology therefore helps users ensure that the integrity and availability of their data are not compromised when they use incompletely trusted cloud storage services, so that they can better monitor the data storage status. For example, in 2019, Alibaba Cloud went down due to a server failure, which paralyzed many apps and websites of companies that had entrusted their software business to Alibaba Cloud and caused business losses for users. When such losses occur, users hope to receive timely feedback from the service provider so that the problems can be solved quickly. However, if the service provider deliberately conceals the loss, the users' interests are damaged. Therefore, to ensure the security of cloud storage services, more efficient mechanisms for resisting security threats are urgently needed. The use of data is also no longer limited to a single user. In many cases, data are shared in a specified work area for multiple users to access. In this sharing scenario, users face many security threats, such as abuse of user access rights, collusion attacks on data mounted by revoked users together with the storage server, dynamic data modification issues, and user privacy leaks. Adding an effective mechanism for granting user authority to the auditing scheme ensures secure data access from the perspective of both users and service providers. It protects the security of the data while ensuring that the legitimate rights and interests of data users are not infringed.
When hackers attack a server, they can directly obtain the data stored in the cloud for illegal transactions. For example, as much as 87 GB of user data stored by MEGA, a cloud storage service provider, has been leaked. Measured by the amount of data leaked, this incident became the largest security accident in history. Hackers also attacked the servers of MyHeritage and other websites to obtain user information, resulting in up to 617 million pieces of private data being sold on the dark web. User passwords stored by Facebook in plaintext could also be viewed by the company's employees, leaving users with no privacy, and unencrypted data are likely to be used directly to cause user losses. These examples show that restricting users' access rights and encrypting data before storing them on a third-party server can better protect key data.

Related Work.
With the development of science and technology, changing ways of working place higher requirements on the function and security of cloud storage services. In addition to solving users' data storage problems, cloud storage services are expected to let users access data anytime and anywhere. The cloud storage service needs to ensure that the data stored on the server cannot be modified, without the user's permission, by either the cloud server or the users sharing the data. However, data stored with a third party cannot be under the users' supervision all the time. On the one hand, this does not match the actual situation; on the other hand, it wastes too many resources and thus defeats the purpose of using cloud storage services. Therefore, with the development of cloud storage technology, efficient and low-cost auditing of cloud storage data integrity has become one of the hot issues in this field.
Today, data storage is becoming more flexible as the scale of data access grows. Relying only on the data owner (DO) to audit data integrity is not conducive to the development and use of cloud storage services, and as the data volume increases, the audit burden on the DO grows, especially for users with limited computing power and resources. In this situation, the DO's burden can be reduced and audit efficiency improved by distributing the audit work. Considering the actual situation, researchers have proposed a new form of data integrity audit in which the audit work is entrusted to an outside party; this is called public audit. In 2007, Ateniese et al. proposed an original public audit model based on RSA homomorphic tags [13]. The verifier only needs to store a small amount of data to verify the integrity of the data stored on the server. The scheme realizes remote verification of data integrity by randomly sampling data blocks, which improves the reliability of the audit and reduces the audit burden on users. In 2010, Wang et al. proposed a public audit scheme [14] based on homomorphic authenticators and random masking, which not only lets the third-party auditor (TPA) perform batch audits but also ensures that the TPA cannot obtain any information about the data during the audit. In recent years, blockchain has also been used to audit Internet-of-Things data [15][16][17] in the cloud storage environment.
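The benefit of auditing by random sampling can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from any of the cited schemes: it assumes the auditor samples blocks uniformly without replacement, and the function name and parameters are hypothetical.

```python
from fractions import Fraction


def detection_probability(n_blocks: int, n_corrupted: int, n_sampled: int) -> float:
    """Probability that a uniform sample (without replacement) of n_sampled
    blocks contains at least one of the n_corrupted damaged blocks."""
    if not (0 <= n_corrupted <= n_blocks and 0 <= n_sampled <= n_blocks):
        raise ValueError("counts must lie between 0 and n_blocks")
    miss = Fraction(1)  # probability that every sampled block is intact
    for i in range(n_sampled):
        intact_left = max(n_blocks - n_corrupted - i, 0)
        miss *= Fraction(intact_left, n_blocks - i)
    return float(1 - miss)


if __name__ == "__main__":
    # 10,000 stored blocks, 1% of them damaged, 460 blocks challenged.
    print(detection_probability(10_000, 100, 460))
```

Under these assumptions, challenging a few hundred blocks already detects a 1% corruption rate with probability above 0.99, which is why sampling keeps the audit cost low without giving up much reliability.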
The proxy re-encryption technology [18] used in the scheme proposed by Ateniese et al. can control an authorized decryptor's decryption authority over the ciphertext, so when the business changes, that authority can be withdrawn, further ensuring that the ciphertext can only be decrypted by the designated user. To achieve secure user revocation, Jiang et al. constructed a new data audit scheme [19] that resists collusion attacks by using group signature technology together with a data-processing mechanism.
This scheme solved the potential collusion attack problem in the scheme [20] constructed by Yuan et al. and achieved the desirable property of public auditing. In the Panda scheme, Wang et al. [21] proposed letting the cloud sign data on behalf of users, supporting batch audit and user revocation. Later, related revocation schemes [22][23][24] that combine attribute-based encryption (ABE) [25] with proxy re-encryption were proposed. We call this revocable attribute-based encryption (R-ABE) here. Sahai et al. used an attribute-based encryption scheme [26] to construct a user revocation scheme. In this scheme, ciphertext delegation and two ABE schemes are used so that the cloud storage service provider (CSP) is responsible for updating the ciphertext [27]. The attribute authorization center holds the private key and the update key used for indirect revocation. The validity of the update key is determined by its effective time, and the update key only performs key update operations for users who have not been revoked. However, because two ABE schemes are needed in its construction, this kind of revocation method is inefficient and unsuitable for application scenarios with a large number of frequent updates. As an improvement, scheme [28] raises the efficiency of CSP updates by using a new access mechanism to replace the time-based update key of the R-ABE scheme, which removes the step of entrusting the key to the CSP. The data owner (DO) stores the original ciphertext with the CSP during the revocation period. If a ciphertext query arrives beyond this revocation period, the CSP sends the ciphertext that is valid within the current time limit to the legitimate inquirer. As long as the user satisfies the access policy and revocation time limit specified by the DO, the user can decrypt the ciphertext.
This scheme combines identity-based encryption (IBE) and a time-encoding mechanism to achieve fine-grained access control and data sharing [29][30][31]. Inspired by such user revocation schemes, some attribute-based user revocable data integrity audit schemes have been proposed [24]. In 2017, Yu et al. proposed an attribute-based cloud data audit protocol [32] that checks the data status while achieving efficient key processing. The data are uploaded to the cloud by the user, but attributes are needed for the cloud to identify these data providers afterwards.
This is done by the specifier. Tian et al. proposed a new concept of attribute-based cloud data integrity auditing [33], together with a security model for such systems; it readily provides user anonymity during integrity audits and prevents the third party from inferring the identity of data owners from the checking procedure. Yan et al. proposed a novel remote data possession checking scheme [34]. Building on the original remote data possession checking (RDPC) protocol, it prevents forgery attacks and supports dynamic data updates. Fu et al. proposed a privacy-preserving audit scheme, NPP [35], which supports the privacy protection of multiple parties in the group and realizes effective user revocation.
The new data structure, designed on the basis of a binary tree, can effectively track data changes. At the same time, the TPA in this scheme must obtain authorization from the group manager (GM) before checking the data on the cloud server, which lets the CSP resist malicious audit requests to a certain extent and keeps the CSP working effectively.
Cloud storage service not only provides users with convenient data usage [36] but also deprives users of absolute control over their data. Therefore, providing an effective CSP supervision mechanism is very important for improving users' trust in cloud services. Data integrity verification has been one of the hot issues in information security since its inception. In reality, cloud storage services usually have complex application scenarios such as multicloud, remote access, and duplicate data storage, in which user data sharing can help promote the use of the CSP. Existing cloud storage data integrity audit schemes with user revocation are suitable for remote data access and for data sharing among group users, and they have strong practicability. Therefore, the main focus of our scheme is secure user revocation in data integrity auditing.

Our Contribution.
To audit data integrity efficiently while keeping users secure, our scheme makes the following contributions:
(i) Public audit: a trusted third-party organization with strong computing power is entrusted to monitor the storage status of the data stored by the CSP.
(ii) Correctness: the scheme can effectively verify whether the data on the cloud server are stored correctly and can effectively supervise the storage behavior of the CSP.
(iii) Access control: only users who satisfy the specified attribute policy can access the shared data.
(iv) Secure user revocation: a revoked user cannot pass identity verification and cannot access the shared data or modify them through dynamic operations.
(v) Traceability: if a user abuses data or poses a threat to data security, the user's true identity can be identified and traced.

Organization.
This paper is organized as follows: Section 1 describes the research background and related work. Section 2 introduces the research content of this paper. Section 3 describes the preliminaries used in this paper. Section 4 elaborates the details of the scheme. Section 5 gives the security proof of the proposed scheme. In Section 6, the scheme is compared and simulated. Section 7 summarizes the whole work.

System Model.
As shown in Figure 1, the entities involved in this scheme are the cloud storage service provider (CSP), cloud storage service consumers, the third-party auditor (TPA), the key generation center (KGC), and a user group with various memberships. The specific functions and definitions of each party are as follows:
(i) CSP: it provides outsourced data computation and storage services for user groups and verifies group members' identities when they access data. When a user needs to verify the data storage status, the CSP generates a data integrity proof and sends it to the TPA for verification.
(ii) TPA: it verifies the data integrity proof generated by the CSP, relieving users of the data audit burden.
(iii) Users: they are the consumers of cloud services who, by purchasing cloud services, entrust the data storage work to the cloud server.
(iv) KGC: it is responsible for generating the system parameters and the attribute keys for all users.

Security Model.
The security threats considered in this scheme are mainly as follows:
(i) Semitrusted CSP: the CSP may not faithfully report the data storage status to the user and may maliciously cover up data damage or loss.
(ii) Revoked user: a revoked user may collude with the CSP by using an expired signature, conceal the true modification of the data, and mount a collusion attack on the database.
(iii) Third-party auditor: the TPA may be honest but curious. If user identity is not protected by anonymity, the TPA is likely to analyze the user identity and user data information.

Security Definition.
A game between the adversary A and the challenger B is constructed, and the security of the proposed scheme is proved through the operation of the game. The details are as follows:
(1) System setup phase: the challenger runs the system setup algorithm, obtains the system public parameters, and sends them to the adversary. It keeps the master key msk and the user revocation list.
(2) Query phase: the adversary performs hash queries, key queries, signature queries, and proof queries.
(i) Hash query (query $H_1$ and query $H_2$, respectively): through these two types of queries, the adversary converts information such as user attributes and signature policies into elements of $G$. The challenger maintains a separate query list for each to track the adversary's queries.
(ii) Key query: the identity and the related attribute set are input, and the challenger generates the corresponding key and sends it to A.
(iii) Signature query: A asks for the signature on a message; the challenger runs the signing algorithm to generate $\sigma$ and sends it to A.
(iv) Proof query: the data integrity proof is obtained from the CSP, and the proof, audit message, and signature are sent to the TPA for data integrity verification.
(3) Forgery phase: the adversary outputs a tuple containing elements such as a forged signature and a data integrity proof. If it passes verification, the game ends and the adversary wins.
If A can win the game with only negligible advantage $\varepsilon$, then the scheme is secure.

Notations.
The main notations used in this paper are shown and explained in Table 1.

Bilinear Groups.
Let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$, and let $g$ be a generator of $G$. A bilinear map [37] $e: G \times G \to G_T$ has the following properties: (i) Bilinearity: for all $u, v \in G$ and $a, b \in Z_p$, $e(u^a, v^b) = e(u, v)^{ab}$. (ii) Nondegeneracy: $e(g, g) \neq 1$. (iii) Computability: there is an efficient algorithm to compute the bilinear map $e$.
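The three properties can be checked exhaustively on a toy example. The sketch below is an illustration under strong assumptions and is not a real pairing: it uses the order-11 subgroup of $Z_{23}^*$ for both $G$ and $G_T$ and evaluates the map by brute-force discrete logarithms, which is only feasible (and only insecure enough to be harmless) because the group is tiny. All constants and function names are hypothetical.

```python
# Toy symmetric "pairing" for illustration only. It is NOT secure: it recovers
# discrete logarithms by brute force, which works only in a tiny group.
P_MOD = 23    # modulus of the ambient group Z_23^*
ORDER = 11    # prime order p of the subgroups used as G and G_T
G_GEN = 2     # generator g of G (2 has order 11 modulo 23)
GT_GEN = 3    # plays the role of e(g, g), a generator of G_T


def dlog(u: int) -> int:
    """Return x with g^x = u in G, found by exhaustive search."""
    for x in range(ORDER):
        if pow(G_GEN, x, P_MOD) == u % P_MOD:
            return x
    raise ValueError("element is not in G")


def pairing(u: int, v: int) -> int:
    """e(g^x, g^y) = e(g, g)^(x*y), computed from the recovered exponents."""
    return pow(GT_GEN, (dlog(u) * dlog(v)) % ORDER, P_MOD)


def check_bilinearity() -> bool:
    """Verify e(u^a, v^b) == e(u, v)^(a*b) for all u, v in G and a, b in Z_p."""
    elems = [pow(G_GEN, x, P_MOD) for x in range(ORDER)]
    for u in elems:
        for v in elems:
            for a in range(ORDER):
                for b in range(ORDER):
                    lhs = pairing(pow(u, a, P_MOD), pow(v, b, P_MOD))
                    rhs = pow(pairing(u, v), a * b, P_MOD)
                    if lhs != rhs:
                        return False
    return True


if __name__ == "__main__":
    print("bilinear:", check_bilinearity())
    print("nondegenerate:", pairing(G_GEN, G_GEN) != 1)
```

A practical implementation would instead use a pairing-friendly elliptic curve from a cryptographic library; the toy version only makes the algebraic identities tangible.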

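Computational Diffie-Hellman Problem.
Let $G$ be a cyclic group of prime order $p$ [38,39] with generator $g$. Given $g^a, g^b \in G$ with $a, b \in_R Z_p^*$, the CDH assumption states that the probability that an adversary $\mathcal{A}$ computes $g^{ab}$ in polynomial time is negligible; that is, $\mathrm{Adv}^{\mathrm{CDH}}_{\mathcal{A}} = \Pr[\mathcal{A}(g, g^a, g^b) = g^{ab}] \le \varepsilon$ for a negligible $\varepsilon$.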
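To see why the assumption depends on the group being large, the following sketch generates CDH instances in a deliberately tiny group and solves them by exhaustive search. The group (the order-11 subgroup of $Z_{23}^*$) and the helper names are assumptions made for illustration; in a group of cryptographic size, the same search would be infeasible.

```python
import secrets

# Toy parameters (assumed for illustration): order-11 subgroup of Z_23^*.
P_MOD, ORDER, G_GEN = 23, 11, 2


def cdh_instance():
    """Return (g^a, g^b, g^(ab)) for random a, b in Z_p^*."""
    a = 1 + secrets.randbelow(ORDER - 1)
    b = 1 + secrets.randbelow(ORDER - 1)
    return pow(G_GEN, a, P_MOD), pow(G_GEN, b, P_MOD), pow(G_GEN, a * b, P_MOD)


def brute_force_cdh(g_a: int, g_b: int) -> int:
    """Solve CDH by searching for a, then raising g^b to the power a.

    The loop runs over the whole exponent range, so its cost grows linearly
    with the group order and is hopeless for realistic parameters.
    """
    for a in range(1, ORDER):
        if pow(G_GEN, a, P_MOD) == g_a:
            return pow(g_b, a, P_MOD)
    raise ValueError("g_a is not a valid group element")


if __name__ == "__main__":
    g_a, g_b, g_ab = cdh_instance()
    print(brute_force_cdh(g_a, g_b) == g_ab)
```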

Revocable Signature Based on Attribute.
In attribute-based signature (ABS), users sign messages using any predicate over the attributes issued to them by the attribute authority. Under this concept, the signature does not prove the identity of the person signing the message; it attests to the attributes owned by the underlying signer. In ABS, even if malicious users collude to pool attributes that together could generate valid signatures, users cannot forge signatures with attributes that they do not have.
Users obtain a secret key from the GM according to their attributes and choose a signature strategy that their attributes satisfy. With the secret key, users can compute the data signature under this signature strategy. The verifier does not learn any information about identity or attributes when verifying the user's signature and only needs to verify that the attributes satisfy the signature policy. In this section, we describe in detail the four main algorithms of this signature scheme [40] as follows:

Threshold Strategy.
Let $(\Delta, c, \Phi)$ be a threshold strategy, where $\Delta$ is a set containing $n$ attributes and $c$ is the threshold. An attribute set $A \subseteq \Delta$ satisfies the strategy if $|A| \ge c$, that is, if it holds at least $c$ attributes of the attribute set.
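Read as an "at least $c$ of $n$" rule, the threshold strategy amounts to a set-intersection test. The sketch below is a minimal illustration under that reading; the function name and the attribute strings are hypothetical and carry no cryptographic meaning.

```python
from typing import Iterable


def satisfies_threshold(user_attrs: Iterable[str],
                        policy_attrs: Iterable[str],
                        threshold: int) -> bool:
    """Return True if the user holds at least `threshold` policy attributes."""
    policy = set(policy_attrs)
    if not 1 <= threshold <= len(policy):
        raise ValueError("threshold must be between 1 and the policy size")
    return len(set(user_attrs) & policy) >= threshold


if __name__ == "__main__":
    policy = {"doctor", "cardiology", "hospital-A", "senior"}
    print(satisfies_threshold({"doctor", "hospital-A"}, policy, 2))  # True
    print(satisfies_threshold({"doctor", "nurse"}, policy, 2))       # False
```

In the actual scheme this check is enforced cryptographically by the signature, so a verifier learns only that the threshold is met, not which attributes the signer holds.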
Automorphic Signature.
In the signature generation process, the scheme embeds the verification key in the message space, and the data and signatures in the message space are composed of elements of the bilinear group. Such a signature scheme [41,42] is called an automorphic signature (AS). The validity of a signature on message data is verified by a set of pairing-product equations. The automorphic signature constructed under the CDH assumption resists chosen-message attacks (CMA) from adaptive adversaries. The general structure of the automorphic signature scheme is as follows:
(i) Setup: let $(e, g, G, G_T, p)$ be a tuple of bilinear group parameters. Elements $(x, y, z) \leftarrow G$ are selected, and a data space $D = \{(W^d, V^d) \mid d \in Z_p\}$ is defined.
(ii) KeyGen: $h \in Z_p$ is selected to calculate the private key $k = g^h$.
(iii) Sign: the data $d_i \in D$ to be signed are input, random numbers $s, r \leftarrow Z_p$ are selected, and the signature is computed as $\sigma = (B, E, Q, T, I)$ with $B = (y \cdot x \cdot d_i)^{1/(h+s)}$ for $1 \le i \le n$, $E = g^s$, $Q = x^s$, $T = g^r$, and $I = x^r$.
(iv) Verify: the signature $\sigma$ generated in the previous step is valid if it satisfies the three verification equations $e(B, h \cdot E) = e(y \cdot d, g)\, e(z, g^r)$, $e(E, x) = e(g, Q)$, and $e(g^r, x) = e(g, x^r)$.
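The second and third verification equations follow directly from bilinearity. As a short worked check, using only the definitions $E = g^s$ and $Q = x^s$ from the Sign step:

```latex
\begin{align*}
e(E, x)     &= e(g^{s}, x) = e(g, x)^{s} = e(g, x^{s}) = e(g, Q), \\
e(g^{r}, x) &= e(g, x)^{r} = e(g, x^{r}).
\end{align*}
```

Both equations therefore hold for every honestly generated signature, and they tie the pairs $(E, Q)$ and $(T, I)$ to the same exponents $s$ and $r$ without revealing them.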

Scheme Framework.
The construction of this scheme includes setup, key generation, signature, proof generation, verification, and secure user revocation. The algorithms are defined as follows:
$\mathrm{Setup}(1^{\lambda}) \to (params, msk, \xi_p)$: the algorithm takes the security parameter $\lambda$ as input and outputs the public parameters and the master key of the system. In addition, it generates the secret value $\xi_p$ used for user revocation. This step is completed by the attribute authorization center.
$\mathrm{KeyGen}(id, \Delta, msk, params) \to (sk_i, SK_{id,\Delta}, RL)$: the attribute authorization center takes as input the user identity $id$, the associated user attribute set $\Delta$, the system parameters $params$, and the master key $msk$ generated at system initialization. It outputs the user private key $sk_i$ and the global attribute private key $SK_{id,\Delta}$ and stores them together with the list $RL$ used to judge user revocation.
$\mathrm{Sign}(M, \Phi, c, SK_{id,\Delta}, params) \to (c_i, \alpha_i, \sigma)$: this algorithm takes the data $M$, the user attribute domain $\Phi$, the attribute threshold $c$, the global private key $SK_{id,\Delta}$, and the public parameters as input, and outputs the commitment value $c_i$, the corresponding attribute proof $\alpha_i$, and the user's signature $\sigma$ on the data.
$\mathrm{Proof}(\sigma, M) \to \Lambda$: the algorithm takes the data and signature as input and generates a data integrity proof $\Lambda$.
$\mathrm{Verify}(\Lambda) \to \{1, \perp\}$: it takes the data integrity proof as input and verifies the data storage status through the verification equation.
$\mathrm{Revoke}(list_{id}, rk) \to (id, k_{id})$: taking the identity list $list_{id}$, which contains the users' identity information, and the revocation key $rk$ as input, the algorithm traces the user's real identity so that the user can be revoked.
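The six algorithms above can be summarized as a programming interface. The sketch below is only a structural outline under assumed names: the class name, method names, and the use of `Any` for group elements and keys are hypothetical, and no cryptographic logic is implemented.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence, Set, Tuple


class RevocableAttributeAudit(ABC):
    """Hypothetical interface mirroring the inputs and outputs listed above."""

    @abstractmethod
    def setup(self, security_param: int) -> Tuple[Any, Any, Any]:
        """Return (params, msk, xi_p); run by the attribute authorization center."""

    @abstractmethod
    def keygen(self, user_id: str, attrs: Set[str], msk: Any,
               params: Any) -> Tuple[Any, Any, list]:
        """Return (sk_i, SK_id_delta, RL) for one user."""

    @abstractmethod
    def sign(self, data: Sequence[bytes], attr_domain: Set[str], threshold: int,
             global_key: Any, params: Any) -> Tuple[Any, Any, Any]:
        """Return (c_i, alpha_i, sigma): commitment, attribute proof, signature."""

    @abstractmethod
    def proof(self, sigma: Any, data: Sequence[bytes]) -> Any:
        """Return the data integrity proof Lambda; run by the CSP."""

    @abstractmethod
    def verify(self, integrity_proof: Any) -> bool:
        """Return True (1) if the proof verifies and False (bottom) otherwise."""

    @abstractmethod
    def revoke(self, identity_list: dict, revocation_key: Any) -> Tuple[str, Any]:
        """Return (id, k_id) for the traced user, who is then added to RL."""
```

Separating the roles this way makes it explicit which party calls which algorithm: the authorization center runs `setup`, `keygen`, and `revoke`, users run `sign`, the CSP runs `proof`, and the TPA runs `verify`.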

A Concrete Scheme.
This section introduces the algorithms of the attribute-based user revocable integrity audit scheme. The details of the algorithms are as follows:
(1) Setup: the attribute authorization center first generates a 5-tuple $\beta = (n, G, G_T, e, g)$ for the system, where $G$ and $G_T$ are two bilinear groups and $e: G \times G \to G_T$ is a bilinear map. $p$ and $q$ are prime numbers with bit size $\vartheta(\lambda)$ and satisfy the […] Randomly select $\theta \in_R G$, $\pi \in_R G_q$, $g \in G$, and $\rho \in Z_n$. Use the hash functions $H_1, H_2: \{0, 1\}^* \to G$, so that our scheme can be extended to support any element in $G$. Let the revocation key be $rk = \xi_p \in Z_n$.
(2) KeyGen: let the user attribute be $at_i$. The attribute set $\Delta$ is contained in the attribute domain $\Phi$. Select the elements $\varepsilon_{id} \in G$ and $r_i \in Z_n$, and then generate an automorphic signature $\sigma_{\varepsilon_{id}}$. Compute the user's private key as […] The user identity $id$ and the secret value $\varepsilon_{id}$ are stored in the user information table $list_{id}$.
(3) Sign: the user inputs the data $M = (m_1, \ldots, m_n)$, the attribute set $\Delta$, the private key $sk_i$, and the public parameters $params$. The attributes $at_i \in \Phi$ owned by the user form the minimum authorized attribute set $\Delta'$, and the number of matching attributes in $\Delta \cap \Phi$ is set as the threshold $c$. The signature generation follows the following steps: […]
(4) Proof: the auditor chooses $i \in I \subseteq [1, n]$ and random elements $k_i \in Z_q^*$, outputs the audit message $AM = \{(i, k_i)\}_{i \in I}$, and sends it to the CSP. The CSP computes […]
(7) Revoke: the attribute authorization center sends the revocation key $rk = \xi_p$ to a specific user, such as the group manager, to revoke a user's authority. The algorithm takes as input the public parameters, the user signature $\sigma$, the secret value $\xi_p$, and the table $list_{id}$ of user identities. If $\mathrm{com}(\varepsilon_{id})^{\xi_p} = \varepsilon_{id}^{\xi_p}$ holds, the user identity is successfully traced, and the user information is added to the revocation list $RL$ to realize user revocation. The attribute verification formula is $e(c_i, c_i/(H_1(at_i)/\theta)) = e(\rho, \alpha)$.
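The challenge generation in the Proof step, where the auditor picks a random index set $I \subseteq [1, n]$ and a random coefficient $k_i \in Z_q^*$ for each chosen index, can be sketched as follows. The function name and parameters are hypothetical, and the sketch covers only the construction of the audit message, not the CSP's response.

```python
import secrets
from typing import Dict


def generate_audit_message(n_blocks: int, n_challenged: int, q: int) -> Dict[int, int]:
    """Build AM = {(i, k_i)} with distinct indices i in [1, n_blocks]
    and coefficients k_i drawn uniformly from Z_q^* = {1, ..., q - 1}."""
    if not 1 <= n_challenged <= n_blocks:
        raise ValueError("n_challenged must be between 1 and n_blocks")
    if q < 2:
        raise ValueError("q must be at least 2")
    rng = secrets.SystemRandom()  # operating-system randomness
    indices = sorted(rng.sample(range(1, n_blocks + 1), n_challenged))
    return {i: 1 + secrets.randbelow(q - 1) for i in indices}


if __name__ == "__main__":
    # 1000 stored blocks, 5 challenged, toy prime q = 101 (assumed values).
    print(generate_audit_message(1000, 5, 101))
```

Drawing fresh indices and coefficients for every audit prevents the CSP from precomputing or replaying an earlier response.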

Correctness and Security Analysis
Theorem 1. When the TPA sends an audit message to the cloud server, if the audit response returned by the CSP passes the following verification equation, then the CSP has stored the data correctly.
Proof. The correctness of the storage can be verified by the following equation: […] When the equation holds, the CSP has stored the data correctly, and the data were entrusted to the CSP by an authorized user.

Theorem 2.
Consider data security under attacks in which the adversary chooses the message and the signature strategy. The data integrity verification scheme is constructed under the CDH assumption, which ensures that the adversary cannot pass legitimate authentication and damage the data in this scheme with forged evidence.
Proof. Assume that there is a polynomial-time algorithm B that solves the CDH problem on $G$ by interacting with the adversary. That is, given $g_p, g_p^{\nu}, g_p^{\iota} \in G$, where $g_p$ is a generator of $G$, B calculates the value of $g_p^{\nu\iota}$. This requires that the adversary A can successfully forge a user signature and thereby pass authentication to obtain data operation authority.
Game 0: this game mainly covers the generation of the challenge-response parameters and the correct distribution of parameters. Select the generator $h$ in $G$. Randomly select elements $(r_1, \ldots, r_5) \in Z_q$. Let the user attribute domain be $\Phi$ and the automorphic signature key pair be $(sk_{au}, pk_{au})$. Set $g_1 = g_p^{\nu} h^{r_2}$ and $g_2 = g_p^{\iota} h^{r_4}$, which satisfy the following relationship: […] Finally, send the parameters $params$ to the adversary.
Game 1: in this game, the query operations are mainly initiated by A.
$Q_{H_1}$: the function $H_1$ converts a user attribute into an element of $G$ to facilitate subsequent verification calculations. Taking $at_i$ as input, B runs the $H_1$ query and returns the entry $(at_i, c_i, x_i)$ from the query list $list_{H_1}$ to A as the query response. The probability that $x_i = 1$ is recorded as $\tau_1$.
$Q_{H_2}$: the function $H_2$ transforms the data and its signature strategy into an element of $G$ to facilitate subsequent verification. Taking $(M, \Delta, c)$ as input, B runs the $H_2$ query. If the item is found in the query list $list_{H_2}$, the stored entry $(M, \Delta, c, s)$ is output as the query response and sent to A. If not, an element $s \in_R Z_n$ is randomly selected, and $H_2(M, \Delta, c) = g^s$ is calculated and stored in $list_{H_2}$ together with $(M, \Delta, c, s)$.
$Q_{key}$: the adversary A wants to obtain the user's private key generated from the user's attributes and queries the key through B. When $x_i = 0$, B selects $\mu \in_R Z_n$. When $x_i = 1$, B selects $\mu, \mu' \in_R Z_n^*$ and computes $\varepsilon_{id} = g^{\mu} g^{\mu'}$. At the same time, the entry $(at_i, c_i, x_i)$ in $list_{H_1}$ is used to parse out the commitment information $c_i$. The user private key is obtained as $sk_i = ((g_2^{c_i})^{-\mu/\mu'} c_{id}^{r_i},\ g^{r_i} g^{-c_i/\mu'})$.
$Q_{sign}$: after obtaining $c_{id}$, B obtains the user's signature $\sigma$ on the data from the entries queried in $list_{H_2}$ and $list_{H_1}$.
$Q_{proof}$: the adversary selects a signature and a challenge value and sends them to B for a proof query; B generates the integrity proof $\Lambda = (m', \sigma_1', \sigma_2', \sigma_3')$ and sends it to the adversary.
Game 2: the adversary attempts to forge a valid signature of a legitimate user and to generate integrity evidence based on that signature. A has tuples […] That is to say, if A could break the security with non-negligible advantage $\varepsilon$, B could use the forgery to solve the CDH problem, which contradicts the CDH assumption. Therefore, our scheme is secure and resists signature forgery attacks from adaptive adversaries.
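The way the simulator B answers $H_2$ queries in Game 1 is the usual lazily sampled random-oracle table: a repeated query returns the stored answer, and a new query is answered with $g^s$ for a fresh random $s$ that B remembers. The sketch below illustrates that bookkeeping only; the class name is hypothetical, and the tiny group (order-11 subgroup of $Z_{23}^*$) stands in for $G$ purely for demonstration.

```python
import secrets
from typing import Dict, FrozenSet, Tuple

# Toy group assumed for illustration: order-11 subgroup of Z_23^*.
P_MOD, ORDER, G_GEN = 23, 11, 2

QueryKey = Tuple[bytes, FrozenSet[str], int]


class H2Oracle:
    """Lazily sampled table playing the role of list_H2 in the simulation."""

    def __init__(self) -> None:
        self.table: Dict[QueryKey, Tuple[int, int]] = {}  # key -> (s, g^s)

    def query(self, message: bytes, attrs: FrozenSet[str], threshold: int) -> int:
        key = (message, frozenset(attrs), threshold)
        if key not in self.table:
            s = secrets.randbelow(ORDER)            # fresh exponent for a new query
            self.table[key] = (s, pow(G_GEN, s, P_MOD))
        return self.table[key][1]                   # consistent answer on repeats


if __name__ == "__main__":
    oracle = H2Oracle()
    first = oracle.query(b"block-1", frozenset({"doctor", "senior"}), 2)
    print(first == oracle.query(b"block-1", frozenset({"senior", "doctor"}), 2))
```

Keeping the exponent $s$ alongside each answer is what later lets the simulator relate a forged signature to the embedded CDH instance.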

Performance Analysis
In this section, we compare the computational cost of our scheme with that of other data integrity audit schemes [21,32]. As shown in Table 2, to keep the description concise, we use M to denote multiplication in the multiplicative cyclic group, P the pairing operation, H the hash operation, and E the exponentiation operation; r denotes the number of revoked users, and n the number of data blocks. The analysis of the computational cost of each stage therefore revolves around four operations: multiplication, pairing, hashing, and exponentiation. The experimental environment is a PC with an Intel(R) i5-7300HQ CPU @ 2.5 GHz and 8 GB of memory. The algorithms are simulated in the Java programming language on the Eclipse platform, using the Java Pairing-Based Cryptography (JPBC) library [43] with a Type A elliptic curve to test the efficiency of the scheme.
As shown in Figure 2, the main computational overhead in the data integrity proof generation phase comes from the computation and storage of the audit message (AM). The computational cost of this scheme at this stage is 2P + 5M + 2H + 3E. In the other two schemes, the cost of proof generation is related to the size of the data block, whereas our computational cost at this stage is constant and is not affected by the size of the data. Therefore, the scheme is suitable for large-scale proof generation and greatly reduces the cost of proof.
As shown in Figure 3, we use r to denote the number of users who have been revoked and test the user revocation time for r users. Since scheme [32] does not include a user revocation function, our scheme is compared only with scheme [21] in the revocation phase. The computational cost of revoking a single user is constant, and as the number of users increases, the efficiency of this scheme is significantly higher than that of scheme [21]. At this stage, our computational cost is r(E + 2P).
As shown in Figure 4, the cost of the proof verification phase does not change with the number of data blocks in the data sequence, and the verification time in this stage is constant. Compared with schemes [21,32], the efficiency of our scheme is also improved in the data integrity verification stage. The computational cost of this scheme at this stage is H + 5P.
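The three cost expressions quoted above (2P + 5M + 2H + 3E for proof generation, r(E + 2P) for revocation, and H + 5P for verification) can be turned into a small cost model. The operation counts come from the text; the per-operation timings in the example are placeholder values chosen for illustration, not measurements from the paper, and the function names are hypothetical.

```python
from collections import Counter
from typing import Mapping

# Operation counts taken from the expressions in the text.
PROOF_COST = Counter(P=2, M=5, H=2, E=3)   # 2P + 5M + 2H + 3E
VERIFY_COST = Counter(H=1, P=5)            # H + 5P


def revoke_cost(revoked_users: int) -> Counter:
    """Operation counts for revoking r users: r(E + 2P)."""
    if revoked_users < 0:
        raise ValueError("revoked_users must be non-negative")
    return Counter(E=revoked_users, P=2 * revoked_users)


def estimate_ms(cost: Mapping[str, int], unit_ms: Mapping[str, float]) -> float:
    """Weight each operation count by an assumed per-operation time."""
    return sum(count * unit_ms[op] for op, count in cost.items())


if __name__ == "__main__":
    # Placeholder timings in milliseconds (assumed, not measured).
    unit_ms = {"P": 10.0, "E": 5.0, "H": 1.0, "M": 0.05}
    print("proof  :", estimate_ms(PROOF_COST, unit_ms))
    print("verify :", estimate_ms(VERIFY_COST, unit_ms))
    print("revoke :", estimate_ms(revoke_cost(20), unit_ms))
```

The model makes the scaling behavior explicit: proof generation and verification do not depend on the number of data blocks n, while revocation grows linearly in r.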

Conclusion and Future Work
This paper presents a user revocable attribute-based data integrity audit scheme. Compared with schemes in which user identity is completely anonymous, this scheme can break the anonymity of a user signature when necessary, so it can be applied where users should not be completely anonymous, and it supports public audit. In addition, the scheme uses attribute-based signatures to realize a flexible mechanism for granting access permission and achieves signature unforgeability to resist collusion attacks from revoked users.
As the demand for cloud storage services becomes more and more diverse, more and more data security problems are exposed, so we propose the following research directions as the next research content.
The Authorization Verification of TPA. In the process of data integrity audit, users entrust a third party to handle data verification. After receiving the audit challenge from the TPA, the CSP responds by sending the computed data integrity proof to the TPA, but if the TPA's request is not authorized, CSP resources are wasted. The introduction of third-party audit saves users the extra audit cost and makes the audit work more efficient through the auditor's more professional ability. However, to prevent the CSP server from being hit by DDoS attacks initiated by a malicious TPA, an audit authorization mechanism for the TPA needs to be considered to limit its audit requests.
Data Batch Audit. In practice, multiple user groups use CSP services at the same time. When multiple user groups send audit requests to the same TPA, the TPA needs to be able to process audit requests in batches. Solving this problem can help users gain confidence in the reliability of cloud service applications and help developers better promote cloud computing services. Therefore, batch processing of data integrity audit requests in the cloud storage environment also needs further research.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors have declared that no conflicts of interest exist.