Improved Public Auditing System of Cloud Storage Based on BLS Signature

Cloud storage and cloud computing technologies have developed rapidly, and many users outsource the storage of their data to the cloud to obtain more convenient storage services. Allowing users to audit the integrity of their private data has become a basic additional function of cloud servers. In 2021, based on the BLS signature and an automatic blocker protocol, Jalil et al. proposed a secure and efficient cloud data auditing protocol. The protocol realizes public audit, batch audit, and data update and protects data privacy. Moreover, the automatic blocker protocol is used to authenticate the identity of the auditor. The protocol is relatively novel and innovative and has broad application prospects. However, we found that their scheme has a security problem: a malicious cloud server can forge the proof that it holds users' data using only the stored labels and still pass the audit. Referring to the original protocol and inspired by it, we propose an improved audit protocol. The improved protocol solves the security problem and is more efficient.


Introduction
Recently, advanced and innovative technologies represented by cloud computing and cloud storage have matured rapidly. They offer convenience, economy, and high scalability: users can store their data on the cloud platform and manage it remotely without purchasing and maintaining local storage devices. Users are therefore increasingly inclined to use cloud storage services to handle data more quickly and easily.
Cloud server providers centrally hold massive amounts of users' data, which makes them attractive targets for malicious attackers, and dishonest cloud server providers may deliberately delete users' data or conceal data security incidents from users, whether to reduce their own storage burden or to maintain their reputation. In the application of cloud storage technology, users no longer have physical control over their data, so the integrity of the users' data is threatened. Verifying the integrity of cloud data is therefore a hot topic of current research.
1.1. Organization. We organize our paper as follows: in Section 1, we introduce the research background and related work. In Section 2, we describe the system model of the cloud storage audit protocol. In Section 3, we review Jalil et al.'s public audit protocol. In Section 4, we give our attack on the original protocol and show that it is neither secure nor efficient. In Section 5, we introduce our improved secure auditing protocol. In Section 6, we analyze the security of the improved protocol and compare its audit efficiency with that of the original protocol. Finally, in Section 7, we conclude our work.

Related Work.
Scholars have proposed many cloud storage data integrity audit protocols with different functions to meet the needs of users in different application scenarios more effectively. In 2004, based on the RSA signature, Deswarte et al. [1] designed a protocol to audit remote files. However, the exponential computation over all data blocks of the file is performed on the user side, which results in expensive computational overhead. In 2007, Ateniese et al. [2] designed a verification scheme suitable for the cloud storage environment called "provable data possession" (PDP). The protocol uses an RSA-based homomorphic linear authenticator and random sampling, so users only need to download part of the file to verify its integrity. Then, Juels and Kaliski [3] designed another scheme suitable for the cloud storage environment, "proof of retrievability" (PoR), which implements data integrity detection by inserting special data blocks (generally called "sentinels") into the data file.
In the actual application of cloud storage, users may need to perform various modification and update operations on the data. Therefore, researchers have proposed audit protocols that support dynamic data updates. In 2008, Ateniese et al. [4] first proposed an audit protocol that can achieve dynamic data update with a symmetric encryption method. However, this audit protocol has the shortcoming of a limited number of audits and does not support public data audit. In 2012, Zhu et al. [5] constructed an audit protocol that supports dynamic data update with an index hash table (IHT) based on zero-knowledge proof. In 2015, Erway et al. [6] designed an audit protocol based on a rank-based authenticated skip list. The protocol supports fully dynamic data update. In 2016, Jin et al. [7] introduced an index switcher to propose an audit protocol that not only provides fair arbitration but also supports dynamic data updates. In 2017, Shen et al. [8] used a doubly linked list with a location array to implement data auditing. The protocol uses global and block-free sampling verification methods, which also reduce computing and communication costs. In 2019, Guo et al. [9] designed a verification protocol that supports task outsourcing and dynamic data updates. It provides a log audit mechanism to enable users to detect misconduct by dishonest auditors. However, the solution has a security loophole: after multiple audits of data blocks with the same index, data labels can in theory be forged by solving linearly independent equations. In 2020, the cloud audit scheme suitable for IoT [10] designed by Hou et al. [11] uses a chameleon authentication tree to save computational overhead during dynamic data updates and supports batch audit.
If users undertake the periodic audit work themselves, it generates a large computational overhead and consumes many resources [12]. In practical application scenarios, it is also important to protect the privacy of users' data [13]. Scholars therefore introduce a third-party auditor (TPA) to help users regularly check the integrity of the data stored on the cloud server. However, when users outsource the audit task, the TPA may obtain data content while carrying out audit tasks [14]. In 2013, Wang et al. [15] designed a public verification scheme that supports privacy protection based on random masking technology and batch auditing based on a homomorphic linear authenticator. The protocol ensures that TPA cannot obtain the user's real data during the data integrity audit. In 2014, Worku et al. [16] used random masking technology to propose an efficient public audit protocol with a data privacy protection function. Wang et al. [17] designed a shared data audit protocol, which uses ring signature technology and can protect users' identity privacy. In 2015, Xiong et al. [18] used an ID-based encryption algorithm to design a privacy protection protocol that uses a distributed hash table network to protect sensitive data. In 2016, Li et al. [19] used online/offline signatures to design a lightweight public audit protocol with a data privacy protection function.
Traditional cloud audit protocols are mostly designed on the PKI cryptosystem, which brings complicated certificate management issues. In 2013, the first public identity-based audit scheme was designed by Zhao et al. [20]. The protocol minimizes the information carried in the verification process and the information obtained or stored by TPA, which simplifies key management and reduces communication and computation overhead. In 2014, Wang et al. [21] proposed an ID-based data audit scheme, which formally defines the ID-based remote file verification model. The protocol gave the first security proof of an identity-based audit protocol based on the hardness of the CDH problem. In 2016, Wang et al. [22] designed a proxy-oriented ID-based remote data audit protocol. According to the user's authorization, the protocol can realize three modes: private audit, delegated audit, and public audit. In the same year, Yu et al. [23] used zero-knowledge proof to propose an ID-based cloud audit protocol that supports the privacy protection of users' data. The protocol formalizes the identity-based audit protocol and its security model and can realize zero-knowledge privacy protection against TPA. In 2019, as a solution to the complex key management problem in cloud data integrity verification, Li et al. [24] used fuzzy identity to design an audit protocol. Xue et al. [25] designed an ID-based audit protocol using a blockchain to construct random challenge messages. In their protocol, TPA cannot forge audit results to deceive users [26]. Peng et al. [27] designed a new ID-based data possession verification protocol using compressed authentication arrays, which can simultaneously and efficiently support batch verification for multiple users in terms of computing and communication. Rabaninejad et al. [28] used the online/offline signature to design an ID-based PDP scheme, which supports privacy protection, batch audit, and fully dynamic data update [26].
However, the key escrow problem exists in ID-based cloud audit protocols, so many cloud audit protocols based on certificateless signatures have been proposed. In the certificateless signature system, the user and the key generation center (KGC) cooperate to produce the user's private key, which avoids the strong dependence of system security on KGC security [29]. In 2013, Wang et al. [30] designed a certificateless cloud audit protocol, but He et al. [31] later pointed out its security problems. In 2015, Zhang et al. [32] designed a certificateless cloud data verification protocol that can resist malicious auditors. In 2017, Kang et al. [33] applied a certificateless cloud audit protocol to wireless body area networks. The proposed protocol can resist malicious auditors and protect the data content. The certificateless cloud audit protocol proposed by He et al. [34] can protect users' privacy, but it has also been shown to have security problems. He et al. [35] applied a certificateless data audit protocol to the data management system of the smart grid, reducing the computational overhead. In 2018, Yang et al. [36] designed a certificateless cloud audit scheme for group user file sharing, which supports the protection of data content and users' identity privacy. In 2019, Wu et al. [37] defined the security model of the certificateless cloud audit protocol with privacy protection. The proposed protocol supports the protection of multiuser group identity privacy. In 2020, Huang et al. [38] designed a certificateless data verification protocol supporting batch auditing, which realizes efficient key updates based on the Chinese remainder theorem.

Our Contribution.
Recently, Jalil et al. [39] proposed an effective cloud data public audit protocol based on the BLS signature to realize public audit and protect file content privacy. The protocol implements batch audit and dynamic update. Their scheme also uses an automatic blocker protocol (ABP) to prevent unauthorized TPAs from participating in the audit work, which is highly innovative; ABP is essentially an access control facility [40] that can detect threats from auditors [41]. However, we found that their protocol has a security issue: even if the cloud server does not hold the stored data, he can mathematically prove that he holds the user's data. We then propose an improved protocol with high security. The analysis shows the safety and effectiveness of our improved protocol in practical environments.

System Model
To facilitate understanding, we define and explain the various symbols and variables that appear in the original scheme and the improved scheme in Table 1.
The existing cloud audit systems generally include three interactive entities. Cloud server provider (CSP): the CSP provides users with data storage services for remuneration. CSPs are not fully trusted; they may delete cloud data for profit or pry into users' private data. Users: users are the owners of the data, and they upload files to the cloud to save their own storage cost. Third-party auditor (TPA): the TPA entrusted by users is not entirely trustworthy; on the one hand, TPA performs the audit task faithfully, and on the other hand, TPA may attempt to learn the content of the user's data out of curiosity. The interaction process of the entities is as follows: the user preprocesses the data to be stored and uploads it to CSP. When the data integrity needs to be verified, TPA generates a challenge with relevant parameters and sends it to CSP. Based on the challenge parameters, CSP uses the cloud data to generate a proof that he holds the user's data in full and sends the proof to TPA. TPA uses the proof to audit the data's integrity and sends the result to the user.

Review of Jalil et al.'s Protocol
There are three entities involved in Jalil et al.'s scheme. Jalil et al. used the BLS signature to achieve public audit and protect data content privacy. The scheme also supports batch audit and dynamic update. In addition, the proposed system enhances the level of security authentication through an ABP to protect the system from unauthorized TPAs. In particular, their scheme contains the following protocols.

DataProtection Protocol.
To protect data privacy, the data file blocks need to be encrypted first. The user divides the data file F into n data blocks (b_1, ..., b_n) and then uses the AES encryption algorithm to encrypt the data blocks, obtaining the encrypted data blocks (e_1, ..., e_n).
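The DataProtection step can be sketched as follows. The original scheme uses AES; the stand-in stream cipher below (SHA-256 in counter mode) and all function names are our own illustrative choices, and a real deployment would substitute AES-CTR or AES-GCM from a cryptographic library.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Stand-in stream cipher: SHA-256 in counter mode. The paper's
    # scheme uses AES; swap in AES-CTR/GCM for any real deployment.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def protect(data: bytes, key: bytes, n: int):
    """Split file F into n blocks (b_1, ..., b_n) and encrypt each
    block b_i into e_i, as in the DataProtection protocol."""
    size = -(-len(data) // n)  # ceiling division
    blocks = [data[i * size:(i + 1) * size] for i in range(n)]
    encrypted = []
    for i, b in enumerate(blocks):
        pad = keystream(key, i.to_bytes(8, "big"), len(b))
        encrypted.append(bytes(x ^ y for x, y in zip(b, pad)))
    return blocks, encrypted
```

Because the cipher is an XOR stream, XORing each e_i with the same per-block keystream decrypts it; only the user holds the key, so the CSP never sees plaintext.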

Setup Protocol.
The user takes the security parameter λ ∈ Z*_q as input and, for each data block e_i, outputs the corresponding private key ks_i ∈ Z*_q and calculates the corresponding public key kp_i = g^{ks_i} ∈ G.

SignatureGen Protocol.
For each data block e_i, the user generates a random value a_i ∈ Z*_q and calculates the corresponding label S_i:

S_i = (H(m_i) · g^{a_i})^{ks_i},  (1)

where m_i is the name of the relevant block e_i and H is the SHA-256 hash function; the intermediate parameter is defined as V_i = g^{a_i}. Then, the user uploads V_i and m_i to the auditor, uploads e_i and S_i with kp_i to the cloud for i ∈ [1, n], and deletes the local data.

ChallGen Protocol.
When the user needs to verify the integrity of the cloud data, he sends an audit request to the TPA. TPA first randomly selects c elements of [1, n] to form a subset Q. For each i ∈ Q, TPA selects a random p_i ∈ Z*_q and sends all i and p_i to CSP.
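A minimal sketch of the ChallGen step (the function name and parameters are our own): the TPA draws c distinct indexes from [1, n] and a fresh random coefficient p_i ∈ Z*_q for each.

```python
import random

def gen_challenge(n: int, c: int, q: int) -> dict:
    # TPA side: choose a random c-element subset Q of [1, n] and a
    # random coefficient p_i in Z_q^* for every challenged index i.
    indexes = random.sample(range(1, n + 1), c)
    return {i: random.randrange(1, q) for i in indexes}

chal = gen_challenge(n=100, c=10, q=(1 << 61) - 1)
```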

Response Protocol.
When CSP receives an audit challenge from TPA, he first asks the user whether the user has issued an audit request, thereby confirming the authenticity of the challenge from the TPA. After receiving the user's affirmative reply, CSP confirms that the challenge is genuine and performs the next step. This process is implemented through the ABP. CSP calculates the aggregate tag with the following equation and sends the evidence S to the auditor:

S = ∏_{i∈Q} S_i.  (2)

CheckProof Protocol.
When the TPA receives the evidence generated by the CSP for the challenge, he verifies the following equation to check the integrity of the data:

E(S, g) = ∏_{i∈Q} E(H(m_i) · V_i, kp_i).  (3)

If equation (3) holds, it shows that the CSP has faithfully performed the service and ensured the integrity of the cloud data.

BatchAuditing Protocol.
Each user divides his original file into n data blocks, then uses different encryption keys to encrypt the respective data blocks, generates private and public keys for the different data blocks, and uses equation (1) to generate the data tags. Every user sends (e_i, S_i, kp_i, id) for i ∈ [1, n] to the cloud and uploads the metadata (m_i, V_i, id) to TPA, where id represents the user's identifier. When the data integrity needs to be verified, TPA randomly selects c data block indexes to be challenged and sends them to CSP. After CSP receives the challenge and confirms its authenticity, based on the label set S_j of each user, the aggregate label S_U is calculated over all challenged data blocks:

S_U = ∏_{j=1}^{u} ∏_{i∈Q} S_{ij}.  (4)

CSP generates the evidence (S_U, kp_{ij}) (1 ≤ i ≤ c, 1 ≤ j ≤ u) and sends it to TPA. After receiving the evidence, the TPA verifies whether the following equation holds:

E(S_U, g) = ∏_{j=1}^{u} ∏_{i∈Q} E(H(m_{ij}) · V_{ij}, kp_{ij}).  (5)

If equation (5) is true, it means that the integrity of the data has not been damaged.

Our Attack
In the audit protocol of Jalil et al.'s scheme, the correctness of the audit cannot be guaranteed: even if the user's data held by the CSP are incomplete, the CSP can pass the audit. In the SignatureGen protocol, the user calculates the signatures S_i (1 ≤ i ≤ n) as in equation (1). In equation (1), the calculation of S_i is determined by the private key ks_i, the random value a_i, and the name m_i of the data block; the labels are not signatures of the content e_i. In the Response protocol, CSP only uses equation (2) to calculate the aggregate signature; he does not compute any aggregation of the data content. The integrity proof generated by the CSP therefore has nothing to do with the content of the data blocks. The CSP can use the stored signatures S_i (1 ≤ i ≤ n) to generate the integrity evidence and pass the audit, so he can store the names m_i locally instead of the contents e_i. In addition, in the original scheme, the number of public and private keys required is extremely large, proportional to n. Both in terms of certificate management and in terms of the storage overhead of the three entities, this is complicated and cumbersome. In the CheckProof protocol, c bilinear maps are used, so the computation cost is also relatively high. In this section, we show that CSP can generate an integrity proof that passes the TPA's audit without storing the data blocks e_i. The relevant data stored by CSP include e_1, ..., e_n, S_1, ..., S_n, and kp_1, ..., kp_n. The user needs to store ks_1, ks_2, ..., ks_n and kp_1, kp_2, ..., kp_n.
The data stored by TPA include m_1, ..., m_n, V_1, ..., V_n, and kp_1, ..., kp_n. We can see that the storage costs of all three entities are proportional to n and are relatively large, which violates the original intention of cloud storage. In addition, CSP and TPA each need to store n public keys, and the user needs to store as many private and public keys as there are data blocks e_i, requiring a large number of certificates, so certificate management is complicated.
In the Response protocol of Jalil et al.'s scheme, CSP only generates the aggregation of signatures. Since CSP stores the S_i, the aggregate tag S can be generated according to equation (2) regardless of whether CSP stores the data. As long as the stored signatures are correct, the CSP can generate a data audit proof that passes verification.
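The attack can be checked concretely in a toy model of the pairing algebra, in which a group element g^x is represented by its exponent x mod q, multiplication in G becomes addition of exponents, and the pairing E(g^a, g^b) becomes the product a·b mod q. This only exercises the bookkeeping of equations (1)-(3), not cryptographic security; the modulus and all names are our own choices.

```python
import hashlib
import random

q = (1 << 61) - 1  # toy prime modulus standing in for the group order

def H(name: str) -> int:
    # Hash to a "group element", written as its exponent in the toy model.
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big") % q

n, c = 8, 4
names = [f"m_{i}" for i in range(n)]                 # block names m_i
ks = [random.randrange(1, q) for _ in range(n)]      # private keys ks_i
kp = ks                                              # kp_i = g^{ks_i} (exponent view)
a = [random.randrange(1, q) for _ in range(n)]
V = a                                                # V_i = g^{a_i}
S = [ks[i] * (H(names[i]) + a[i]) % q for i in range(n)]  # eq (1)

Q = random.sample(range(n), c)                       # challenged indexes

# Malicious CSP: every data block e_i has been deleted; only the tags
# S_i survive, yet the aggregate proof of eq (2) is still computable.
proof = sum(S[i] for i in Q) % q

# TPA check, eq (3): E(S, g) =? prod E(H(m_i) * V_i, kp_i)
lhs = proof * 1 % q                                  # pairing with g multiplies by 1
rhs = sum((H(names[i]) + V[i]) * kp[i] for i in Q) % q
print("audit passed with no data stored:", lhs == rhs)  # → True
```

The check succeeds even though no e_i appears anywhere in the proof, which is exactly the flaw described above.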
In the CheckProof stage, after the auditor receives the proof, he needs to verify whether equation (3) holds, which requires calculating c bilinear maps. The bilinear map is computationally expensive, which reduces the audit efficiency.

Improvements to the Secure Auditing Protocol
Based on the above analysis, the original protocol is improved here to enhance security and efficiency. The difference between the original scheme and the improved scheme is shown in Figure 1.

DataProtection Protocol.
The user encrypts the n data blocks (b_1, ..., b_n) divided from the data file F using the AES encryption algorithm and obtains the encrypted data blocks (e_1, ..., e_n), which protects data privacy.

Setup Protocol.
CSP inputs the security parameter λ and outputs the public parameters {G, g, E, H}. Here, G is a multiplicative cyclic group, g is a generator of G, E is the bilinear map, and H is the hash function. The user randomly generates ks ∈ Z*_q and calculates kp = g^{ks} ∈ G.

SignatureGen Protocol.
For each data block e_i, the user calculates the corresponding label S_i:

S_i = (H(i) · g^{e_i})^{ks}.  (9)

The tag S_i (1 ≤ i ≤ n) is calculated from the secret key ks, the data block e_i, and the data block index i. Then, the user uploads the data blocks and tags to the cloud and deletes the local copies.

ChallGen Protocol.
To verify whether the data are complete, the user first sends a message to TPA requesting an audit. TPA randomly selects c elements from [1, n] to form a subset Q and then selects a random p_i ∈ Z*_q for each i ∈ Q. Finally, all i and p_i are sent to CSP.

Response Protocol.
When CSP receives an audit challenge from TPA, he first confirms the authenticity of the challenge by querying the user. When the authenticity is confirmed, the CSP accepts the challenge. This process is implemented through the ABP. CSP then randomly generates r ∈ Z*_q and uses the following equations to calculate the proof:

R = kp^r,  (10)

S = ∏_{i∈Q} S_i^{p_i},  (11)

μ' = ∑_{i∈Q} p_i e_i,  (12)

μ = μ' + r,  (13)

and then sends the proof (R, S, μ) to the auditor.

CheckProof Protocol.
When the CSP sends the evidence to the TPA, the TPA verifies equation (14):

E(S · R, g) = E(∏_{i∈Q} H(i)^{p_i} · g^μ, kp).  (14)

If equation (14) is true, the data are complete and not corrupted. The correctness of equation (14) is shown as follows:

E(S · R, g) = E(∏_{i∈Q} (H(i) · g^{e_i})^{ks·p_i} · kp^r, g) = E(∏_{i∈Q} H(i)^{p_i} · g^{∑_{i∈Q} p_i e_i + r}, g^{ks}) = E(∏_{i∈Q} H(i)^{p_i} · g^μ, kp).  (15)

BatchAuditing Protocol.
u users use different encryption keys to encrypt the data blocks belonging to themselves among the n data blocks divided from the original file, generate private keys ks_j (1 ≤ j ≤ u) and public keys kp_j (1 ≤ j ≤ u), and then use equation (9) to generate data tags. All users delete their local data after transferring (e_i, S_i) to the cloud server. To verify the completeness of the data, the TPA randomly selects c data block indexes to be challenged and sends the indexes and the corresponding random values p_i (1 ≤ i ≤ c) to the CSP. After the CSP receives the challenge and confirms its authenticity, he randomly generates r_j ∈ Z*_q for each user and calculates

R_j = kp_j^{r_j},  (16)

μ_j = ∑_{i∈Q} p_i e_{ij} + r_j,  (17)

S_j = ∏_{i∈Q} S_{ij}^{p_i}.  (18)

Based on the tag set S_j (1 ≤ j ≤ u) of each user, the aggregate tag S_U is calculated over all challenged data blocks:

S_U = ∏_{j=1}^{u} S_j.  (19)

CSP generates the evidence P = (S_U, R_j, μ_j) (1 ≤ j ≤ u) and sends it to TPA as the basis for verification. Upon receipt, TPA determines whether the cloud data are complete by verifying the following equation:

E(S_U · ∏_{j=1}^{u} R_j, g) = ∏_{j=1}^{u} E(∏_{i∈Q} H(i)^{p_i} · g^{μ_j}, kp_j).  (20)

If equation (20) holds, it proves that data integrity has not been compromised; its correctness follows in the same way as the proof of equation (14).

Analysis of the Improved Protocol
The security of the improved protocol is first analyzed and explained here, including resistance to forgery attacks from the CSP and to attacks from the TPA aimed at stealing data content privacy. Then, the storage and computation overhead of the improved protocol are compared with those of the original protocol, to show that the improved protocol is safe and efficient.
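Before the formal analysis, the algebra of the improved Response and CheckProof steps (equations (10)-(14)) can be sanity-checked in the same toy exponent model used above (g^x represented by x mod q, pairing as multiplication; modulus and names are illustrative only), including the case of a tampered block:

```python
import hashlib
import random

q = (1 << 61) - 1  # toy prime modulus (exponent-view model of the group)

def H(i: int) -> int:
    return int.from_bytes(hashlib.sha256(str(i).encode()).digest(), "big") % q

n, c = 8, 4
e = [random.randrange(q) for _ in range(n)]      # encrypted blocks e_i as integers
ks = random.randrange(1, q)                      # single user private key
kp = ks                                          # kp = g^{ks} (exponent view)
S = [ks * (H(i) + e[i]) % q for i in range(n)]   # eq (9): S_i = (H(i)·g^{e_i})^{ks}

def respond(blocks, chal):
    """CSP side, eqs (10)-(13), with random masking value r."""
    r = random.randrange(1, q)
    R = kp * r % q                                        # eq (10): R = kp^r
    aggS = sum(p * S[i] for i, p in chal.items()) % q     # eq (11): S = prod S_i^{p_i}
    mu = (sum(p * blocks[i] for i, p in chal.items()) + r) % q  # eqs (12)-(13)
    return R, aggS, mu

def verify(chal, R, aggS, mu):
    """TPA side, eq (14): E(S·R, g) =? E(prod H(i)^{p_i} · g^mu, kp)."""
    lhs = (aggS + R) % q
    rhs = (sum(p * H(i) for i, p in chal.items()) + mu) * kp % q
    return lhs == rhs

chal = {i: random.randrange(1, q) for i in random.sample(range(n), c)}
print(verify(chal, *respond(e, chal)))           # honest CSP → True

i0 = next(iter(chal))
bad = list(e)
bad[i0] = (bad[i0] + 1) % q                      # corrupt one challenged block
print(verify(chal, *respond(bad, chal)))         # tampered data → False
```

Unlike the original scheme, the proof now mixes the block contents e_i into μ, so any corrupted challenged block makes verification fail.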

Security Analysis.
(1) Anti-Forgery Attack: if the CSP generates a forged audit proof μ̃ while the stored user data are corrupted or tampered with, and the forged proof passes the audit, then the discrete logarithm problem in G can be solved with probability 1 − 1/q (q is a large prime). Suppose the CSP, holding incorrect data, generates the forged data possession proof μ̃ = ∑_{i∈Q} p_i ẽ_i + r, and define Δμ = μ̃ − μ = ∑_{i∈Q} p_i Δe_i with Δe_i = ẽ_i − e_i. Because μ̃ is forged evidence, there must be a difference between μ̃ and μ, so at least one Δe_i ≠ 0. Assume that the CSP's forged proof μ̃ passes TPA's audit; then

E(S · R, g) = E(∏_{i∈Q} H(i)^{p_i} · g^{μ̃}, kp).  (23)

The correct proof also passes the TPA audit; therefore,

E(S · R, g) = E(∏_{i∈Q} H(i)^{p_i} · g^μ, kp).  (24)

From equations (23) and (24), we get g^{Δμ} = 1. Given a, b ∈ G, the generator g can be written as g = a^{y_1} · b^{y_2} with y_1, y_2 ∈ Z_q, and therefore

1 = g^{Δμ} = (a^{y_1} b^{y_2})^{Δμ} = a^{y_1 Δμ} · b^{y_2 Δμ},  (25)

which can be simplified to b = a^{−y_1 Δμ / (y_2 Δμ)} = a^{−y_1 / y_2}, that is, the discrete logarithm of b to the base a. This computation fails only when the denominator y_2 Δμ = 0, i.e., when y_2 = 0, and P[y_2 = 0] = 1/q. It follows that if the CSP could successfully forge the proof, he could solve the discrete logarithm problem with probability 1 − 1/q; since the discrete logarithm problem is hard, the CSP cannot forge a proof that passes the audit.
(2) Privacy Protection: first, an authentication protocol (ABP) is used to prevent unauthorized adversaries from entering the system. Then, in the DataProtection protocol, the user's original data (b_1, ..., b_n) are encrypted by AES to obtain (e_1, ..., e_n), so the data uploaded to the cloud are ciphertexts. The CSP does not hold the encryption and decryption keys of the AES algorithm, so it cannot learn the real data content of the user, avoiding the leakage of data privacy. Finally, for TPA, the improved protocol uses random masking technology to protect the data. Assume that TPA is curious about the content of the challenged data blocks (e_1, ..., e_c) and audits the c data blocks t (t ≥ 1) times, where p_{ji} denotes the random parameter used in the j-th audit of the i-th data block; then, the set of random numbers is Q = {p_{ji} | 1 ≤ i ≤ c, 1 ≤ j ≤ t}, and the evidence set consisting of t proofs is {(μ_j, S_j, R_j) | 1 ≤ j ≤ t}. TPA can obtain the following equations:

p_{11} e_1 + p_{12} e_2 + ··· + p_{1c} e_c + r_1 = μ_1,
p_{21} e_1 + p_{22} e_2 + ··· + p_{2c} e_c + r_2 = μ_2,
⋮
p_{t1} e_1 + p_{t2} e_2 + ··· + p_{tc} e_c + r_t = μ_t.  (26)

In the above equations, TPA knows p_{ji} (1 ≤ i ≤ c, 1 ≤ j ≤ t) and μ_j (1 ≤ j ≤ t), but he knows neither e_i (1 ≤ i ≤ c) nor r_j (1 ≤ j ≤ t).
There are c + t unknowns in equation (26); no matter how many times TPA audits the same data blocks, that is, whatever the value of t, the number of equations t is always less than c + t. Hence, TPA cannot solve equation (26) and cannot learn the content of the data blocks (e_1, ..., e_c), let alone (b_1, ..., b_c).
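The counting argument can also be checked numerically: the t audit transcripts form a t × (c + t) linear system over Z_q whose rank is at most t, strictly less than the number of unknowns c + t, so the system never determines the blocks. The small Gaussian-elimination helper below is our own code with a toy modulus.

```python
import random

q = 1_000_003  # toy prime modulus

def rank_mod_q(M, q):
    # Rank of a matrix over Z_q by Gaussian elimination.
    M = [row[:] for row in M]
    rank = 0
    for col in range(len(M[0])):
        piv = next((r for r in range(rank, len(M)) if M[r][col] % q), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][col], -1, q)   # modular inverse (Python 3.8+)
        M[rank] = [x * inv % q for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col] % q:
                M[r] = [(x - M[r][col] * y) % q for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

c, t = 5, 50  # far more audits (t) than challenged blocks (c)
# Row j encodes p_{j1} e_1 + ... + p_{jc} e_c + r_j = mu_j; the unknowns
# are (e_1, ..., e_c, r_1, ..., r_t), i.e. c + t columns per row.
rows = [[random.randrange(1, q) for _ in range(c)] +
        [1 if k == j else 0 for k in range(t)] for j in range(t)]
print(rank_mod_q(rows, q), "rank vs", c + t, "unknowns")
```

Each audit adds one equation but also one fresh unknown r_j, so the rank stays at t no matter how large t grows.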

Efficiency Analysis.
In the original protocol, the user needs to generate the corresponding public keys kp_i and private keys ks_i for e_i (1 ≤ i ≤ n). After uploading the data blocks and tags S_i (1 ≤ i ≤ n) to CSP, the user still needs to store his own public and private keys, so the storage cost is 2n|Z*_q|. In addition to the data blocks, CSP also needs to store the tags and public keys, and the storage cost is 2n|Z*_q|. The metadata (m_i, V_i) together with the n public keys are stored at TPA, so the storage overhead is 3n|Z*_q|. In the improved protocol, the user only holds one pair (kp, ks), and the storage overhead on the user side is 2|Z*_q|. CSP needs to store S_i and e_i (1 ≤ i ≤ n), and the storage overhead is n|Z*_q| + n|S|. When TPA verifies the evidence, he needs only the user's public key in addition to the challenge information, so the storage cost is |Z*_q|. The storage cost comparison between the original protocol and the improved protocol is shown in Table 2. The storage overhead of the improved scheme is lower than that of the original scheme.
Because the multiplication and addition operations on Z*_q have minimal computational overhead compared with the other operations, we omit them. In the original protocol, the user needs to calculate kp_i = g^{ks_i}, S_i = (H(m_i) · g^{a_i})^{ks_i}, and V_i = g^{a_i} ∈ G, and the calculation cost is 4n|E_G| + n|H| + n|M_G|. CSP needs to calculate S = ∏_{i=1}^{c} S_i, and the calculation cost is c|M_G|. TPA needs to calculate equation (3), and the calculation cost is 2c|M_G| + c|E| + c|H|.
In the improved protocol, the user needs to calculate S_i = (H(i) · g^{e_i})^{ks}, and the calculation cost is n|E_G| + n|H| + n|M_G|. CSP needs to calculate S = ∏_{i=1}^{c} S_i^{p_i} and μ' = ∑_{i=1}^{c} p_i e_i, and the calculation cost is c|M_G| + c|E_G|. TPA needs to calculate equation (14), and the calculation cost is c|E_G| + 2|E| + c|H| + c|M_G|. The calculation cost comparison between the original protocol and the improved protocol is shown in Table 3. Among the entities of the improved scheme, only CSP's calculation overhead is slightly higher than in the original scheme; the calculation overhead of TPA and the user is significantly reduced.

Conclusion
According to the analysis in this study, it is clear that the protocol of Jalil et al. is insecure. We point out the security loophole in the original protocol and attack it, and then we propose an improved audit scheme with higher security and efficiency.

Data Availability
The data supporting this systematic review were taken from previously reported studies and datasets, which have been cited.
The processed data are available from the corresponding author upon request.

Conflicts of Interest
There are no potential conflicts of interest.

Authors' Contributions
Ruifeng Li is responsible for the writing of the article and the construction of the improved scheme, Xu An Wang is responsible for the derivation of the formulas in the article and gives some significant ideas, Haibin Yang is responsible for the polishing of the language of the article and the collecting of the information related to this article, Zhengge Yi is responsible for the verification of the security of this article, and Ke Niu revised the finished manuscript.