Dynamic Outsourced Proofs of Retrievability Enabling Auditing Migration for Remote Storage Security

Remote data auditing is important for mobile clients to guarantee the intactness of their outsourced data stored at the cloud side. To relieve the mobile client of the nonnegligible burden of frequent data auditing, a growing body of literature proposes migrating the execution of such auditing from the mobile client to a third-party auditor (TPA). However, existing public auditing schemes always assume that the TPA is reliable, which is a potential risk for outsourced data security. Although Outsourced Proofs of Retrievability (OPOR) have been proposed to further protect against a malicious TPA and collusion among any two entities, the original OPOR scheme applies only to static data, a limitation that must be overcome to enable data dynamics. In this paper, we design a novel authenticated data structure called bv23Tree, which enables the client to batch-verify the indices and values of any number of appointed leaves all at once for efficiency. By utilizing bv23Tree and a hierarchical storage structure, we present the first solution for Dynamic OPOR (DOPOR), which extends the OPOR model to support dynamic updates of the outsourced data. Extensive security and performance analyses show the reliability and effectiveness of our proposed scheme.


Introduction
In today's era of data explosion, most people inevitably face ever-increasing demands for big data storage. Storage outsourcing through the cloud has become a promising technology paradigm that populates the recent literature [1-3] and has been regarded as a faster profit growth point [4] by various IT industry giants (e.g., Google Drive, Microsoft OneDrive, and Amazon EC2 and S3). Cloud storage not only allows mobile clients to access their outsourced data from anywhere at any time, but also provides them with benefits such as inexpensive storage and elastic configuration of the storage capacity, which attract more and more mobile clients (e.g., smartphones and laptops) to join the cloud for a convenient lifestyle.
However, at the side of the cloud storage server (CSS), there still exist all kinds of internal and external threats against the storage security of outsourced data, such as Byzantine failures, monetary reasons, and hacker attacks [5,6]. So, the CSS is commonly considered a potentially malicious entity that might try to hide the accident when data loss occurs, or even deliberately delete the client's data to save storage cost. In this case, for the mobile client who no longer physically possesses her data after storage outsourcing, an urgent requirement is how to guarantee the correctness and retrievability of the outsourced data. Here, the correctness guarantee means that any appointed data returned from CSS should be the latest version of the authentic data, and the retrievability guarantee means that the whole outsourced data can be correctly retrieved by the client without any data loss.
Based on the application of erasure code and periodic auditing against CSS, the security model Proof of Retrievability (POR) [7,8] is defined to offer client-side devices the above two guarantees in the context of a malicious CSS. However, since most mobile clients have limited capacity and are unlikely to stay online all the time to perform frequent auditing, various public auditing schemes have been proposed [5,6,9-11] for auditing migration, which enables the mobile client to free herself by moving the heavy auditing tasks to a third-party auditor (TPA). But existing public schemes rely on the hypothesis that TPA is trusted to complete the migrated auditing tasks, meaning that these schemes do not provide any security guarantee against a malicious TPA that might break the auditing protocols, which is exactly the potential risk that has not been covered by current public auditing schemes [12].
The first Outsourced Proof of Retrievability (OPOR) solution, proposed by Armknecht et al. and called Fortress [12], is a stronger security model that protects against any malicious entity (e.g., a malicious TPA) and against collusion among any two entities. Fortress enables auditing migration, but it is just a static scheme. Supporting dynamic updates is an essential requirement for numerous practical cloud storage applications [13]. Although all kinds of authenticated data structures [6,9,13-16] have been proposed to support data dynamics, there still exist research gaps in these structures. For one thing, most existing dynamic authenticated data structures [9,14-16] are designed based on the Merkle Hash Tree (MHT). Unfortunately, the use of MHT is not efficient in some cases since MHT is an unbalanced tree. For example, the height of MHT will increase linearly when many new data blocks are continuously inserted at the same leaf position, so in this worst case the expected O(log n) performance cannot be ensured for authenticating the index and value of any appointed leaf node via MHT. For another, although some other authenticated structures such as rb23Tree [6] and Skip List [13] are proposed for dynamism, when there is a need to verify multiple leaf nodes of different data blocks (i.e., authenticate the indices and values of these leaf nodes), all above-mentioned dynamic methods merely adopt the straightforward way of verifying these different leaf nodes one by one with their respective proof paths. Since the verification upon a single proof path has O(log n) bandwidth and computation costs, verifying many leaf nodes separately becomes inefficient as the number of verified leaf nodes increases. A balanced authenticated data structure (e.g., the rb23Tree [6]) constructed upon a balanced tree can avoid the worst case of MHT discussed above. However, to the best of our knowledge, there is no such balanced authenticated structure that can efficiently batch-verify any number of appointed leaf nodes altogether, which is a limitation that is addressed in this paper.
Furthermore, to support data dynamics when applying erasure code, the scheme of [6] is based on local coding, that is, encoding each raw data block individually to ensure that an update upon any raw block only affects a small number of the encoded blocks. However, such a local coding solution is vulnerable to the selective deletion attack from a malicious CSS [14], because once a targeted raw block has been updated in the case of local coding, CSS can learn which congenetic encoded blocks correspond to this targeted block. Then, CSS can selectively delete these congenetic encoded blocks to actually cause the loss of the targeted block, and simultaneously CSS can pass the data auditing with significant probability, since the auditing relies on sampling, which can hardly cover these selectively deleted encoded blocks if the size of the deleted data is tiny. Although both the private and the public Dynamic POR (DPOR) schemes [14] are proposed to resist the above selective deletion attack, the direct application of these two DPOR schemes to the OPOR model will result in security or efficiency problems. On the one hand, within the private DPOR of [14], the auditing can only be executed using the client's secret key. But TPA is prohibited from obtaining such a secret in OPOR, since otherwise a malicious TPA might share the client's secret key with a malicious CSS [12] so that CSS can break the auditing protocols without actually holding the outsourced data. On the other hand, in order to support public auditing, the public DPOR of [14] cannot apply the blockless verification technique [8,9,13,17] that combines multiple challenged blocks into a single aggregated block for efficiency, so it has to use the straightforward way of requiring TPA to retrieve all randomly challenged actual blocks during each POR audit. As shown in [9], this straightforward way could lead to a large communication overhead and thus is inefficient and should be avoided.
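The sampling argument behind the selective deletion attack can be made concrete with a small calculation. The sketch below is only illustrative: the function name and the block counts are assumptions chosen for the example, not parameters of any scheme discussed here. It computes the probability that a uniform sample (without replacement) hits at least one deleted block.

```python
def detection_probability(total_blocks: int, deleted_blocks: int, samples: int) -> float:
    """Probability that at least one of `samples` uniformly sampled blocks
    (drawn without replacement) falls among the `deleted_blocks` deleted ones."""
    if deleted_blocks <= 0:
        return 0.0
    if samples > total_blocks - deleted_blocks:
        return 1.0  # more samples than intact blocks: a deleted block must be hit
    miss = 1.0
    for k in range(samples):
        # probability that the (k+1)-th sample is also an intact block
        miss *= (total_blocks - deleted_blocks - k) / (total_blocks - k)
    return 1.0 - miss


# Hypothetical numbers: 10^6 encoded blocks, 300 sampled per audit.
tiny = detection_probability(1_000_000, 10, 300)       # selective deletion of 10 blocks
large = detection_probability(1_000_000, 10_000, 300)  # 1% of all blocks corrupted
```

With these assumed numbers, deleting only ten targeted blocks is detected in well under 1% of audits, while corrupting 1% of all blocks is detected in more than 90% of audits. This is why forcing the adversary to corrupt a large portion of an entirely encoded unit makes sampling effective.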
From the above, various problems arise if the current dynamic schemes are directly ported to the OPOR model. In this paper, to solve these problems, we propose a concrete Dynamic Outsourced Proofs of Retrievability (DOPOR) scheme enabling auditing migration. DOPOR not only can defend against a malicious TPA and collusion, but also can enable efficient data dynamics under the setting of erasure code. Specifically, our contributions are summarized as follows.
(1) Different from traditional authenticated structures that only have the ability to verify different leaves one by one, we propose the novel authenticated data structure called bv23Tree, which is based on the balanced 2-3 tree to ensure the logarithmic complexity in any case of updates and simultaneously enables the verifier to batch-verify the indices and values of multiple appointed leaves all at once for efficiency.
(2) To defend against the selective deletion attack, we utilize a hierarchical storage structure with same-sized levels for the unified management of outsourced encoded data and encoded update operations. According to this hierarchical structure and the bv23Tree, we resolve the open questions of [12] by transforming another secure public POR into the OPOR model and designing an appropriate dynamic scheme to efficiently integrate dynamic updates with OPOR.
(3) We analyze the security of our solution and conduct an extensive experimental study. The experimental results demonstrate the effectiveness of our scheme.
The rest of this paper is organized as follows: Section 2 states the background and introduces the architecture and system model of DOPOR. Section 3 shows the novel dynamic structure bv23Tree in detail, based on which we present the detailed DOPOR solution in Section 4. Section 5 provides the security analysis, and Section 6 evaluates the experimental performance. Section 7 overviews the related work. Finally, this paper is concluded in Section 8.

Background and System Architecture
2.1. Problem Statement. We begin with the background of OPOR, as shown in [12]. There are three entities involved in an outsourced auditing environment: the mobile client (i.e., data owner), the cloud storage server (CSS), and the third-party auditor (TPA). Because of their limited local storage capacity, mobile clients are motivated to outsource their large data files to CSS and can then make use of various on-demand cloud storage services. However, because CSS might misbehave, it is very important to design a periodic remote data auditing mechanism against CSS, which assures the mobile client that her outsourced data is always available and can be completely retrieved from CSS if necessary. Further, to liberate the mobile client from endless online data auditing, OPOR also introduces a TPA that has the expertise and ability to perform the above frequent auditing tasks on behalf of the mobile client, so that the mobile client can stay offline most of the time.
Although the OPOR model has the same three entities as the existing public auditing model [9,10,15,16], one of the main differences between the two is that TPA might also be malicious within OPOR [12]. In other words, TPA might violate the auditing protocols, for example, by falsely claiming that he has honestly performed all past audits. Furthermore, any two entities might be in collusion within OPOR [12]. For example, firstly, a malicious CSS might collude with a malicious TPA to deceive the honest client when outsourced data has been lost. Secondly, a malicious client might collude with a malicious CSS to frame the honest TPA, by asserting that TPA did not correctly perform the required auditing work in order to bring TPA into a compensation lawsuit. Since any entity might be malicious, in order to solve the problem of securely sampling the periodic challenges for frequent auditing, the time-based bitcoin pseudorandom source is introduced into the OPOR model. As demonstrated in [12], because the bitcoin pseudorandom source cannot be manipulated by any entity, it is secure to use it for generating the continuous random seeds for periodic challenges.
OPOR inherits the retrievability guarantee of POR [8] by applying erasure code, meaning that the client can retrieve the whole outsourced data in case of minor data corruption. However, when both data dynamics and erasure code are considered, the problem of how to efficiently perform the updates is intractable. Within the first OPOR scheme Fortress [12], the client's original data file (including n raw data blocks) is entirely encoded before outsourcing, so an update operation upon any single raw data block will affect the whole outsourced encoded data. In this case, the only way for the client is to download and decode the whole outsourced encoded data and then encode and upload all the data again after performing the updates, which means unbearable bandwidth and computation costs. This is why Fortress is just a static scheme that cannot support efficient dynamic updates.

Dynamic OPOR (DOPOR) Architecture.
The representative DOPOR architecture is presented in Figure 1. Based on the bitcoin source that controls the random sampling of periodic challenges, once TPA accepts the migrated auditing tasks from the client, TPA must generate the corresponding logs after he completes each specified POR audit against CSS. In this case, the client is able to check TPA's work at any point in time by verifying TPA's logs, and she can then judge whether TPA did his auditing work correctly in the past. As shown in [12], such client checking against TPA can be much less frequent than the TPA's POR audits against CSS, since the client can batch-check a number of accumulated TPA logs all at once.
To support dynamic updates, we apply an idea similar to that of [14] within our solution, which is to place all accumulated update operations into an erasure-coded buffer at the CSS side, rather than immediately executing these update operations upon the outsourced encoded data. As shown in Figure 1, the CSS-side storage is organized in two different buffers denoted by U (i.e., unencoded buffer) and E (i.e., erasure-coded buffer). Buffer U independently stores an up-to-date copy of all the raw data blocks, which are organized by our proposed bv23Tree, to support efficient batch reads from cloud storage without struggling with the erasure code. On the other side, buffer E is further divided into two parts, ED and EO, which store the whole outsourced original encoded data blocks and all the accumulated encoded update operations, respectively. In case of data loss, the client can recover the up-to-date copy of the whole outsourced raw data by decoding the entire buffer E and combining ED and EO, so the periodic POR audits only need to be performed upon buffer E for the retrievability guarantee.
As will be described in Section 4.1, ED and EO together constitute a complete hierarchical storage structure where all levels have the same size, which differs from the levels of exponentially growing capacity in [14]. After the client performs a batch of update operations upon buffer U, benefitting from the same-sized levels of DOPOR, this batch of update operations can be wholly encoded and then directly placed into EO to fill up the corresponding level for improved computation cost, without rebuilding a level as in [14], which incurs O(log n) amortized cost for each update operation. More importantly, based on the same-sized levels, our DOPOR solution can build upon the publicly verifiable POR scheme of [8], the aggregation technique of which supports the client's checking against a malicious TPA.
Finally, after every n update operations, when the size of EO grows to the same size as ED, buffer E will be rebuilt. The rebuilding of E rewrites the whole ED with the encoded version of all the up-to-date raw data blocks and meanwhile empties the whole EO. Since E is only rebuilt once in every n update operations, the amortized complexity of rebuilding E is O(1) per update operation. However, different from the existing schemes [14,18,19] that require O(n) client-side temporary memory for such rebuilding, as shown in Section 4.3, based on the ability of batch reads and the same-sized levels, the client of DOPOR only requires O(λ) client-side memory to gradually rebuild E, where λ is the security parameter that is independent of the data size n. So, DOPOR further reduces the client memory required for rebuilding and thus is suitable for client-side mobile devices.

System Model.
Formally, the complete definition of the DOPOR system can be described by the following ten protocols:
[...]
(vii) RandomChal(B, t, P) → {Q^(t)}: on input of the bitcoin source B, the time t, and the public parameters P, it outputs a bitcoin-based challenge Q^(t) for the POR audit.
[...]

Balanced Authenticated Data Structure
Within the rb23Tree of [6], for verifying a leaf, the client must retrieve from the adversary a corresponding proof path that consists of O(log n) marks. However, the limitation of the rb23Tree method is that a lot of duplicated information exists in the proof paths of different leaves, which wastes communication when the client verifies many leaves one after another by retrieving different proof paths.
To solve this problem, we propose the batch-verifications 2-3 Tree, called bv23Tree, for efficiency. To verify a set L of appointed leaves, the verifier only needs to retrieve the necessary auxiliary tree nodes (colored in red), which are organized into a special table structure as will be shown later. Then, the verifier can batch-verify the indices and values of these leaves in L all together.
3.1. Batch-Verifications 2-3 Tree. Following the definition of the 2-3 tree [20], each nonleaf node of bv23Tree can have two or three children. Let F be an original data file consisting of n raw blocks F = {m_1, m_2, ..., m_n}. With the hash function h(·), the bv23Tree on file F can be constructed by storing at each tree node v a 3-element tuple consisting of the status, the rank value, and the authentication hash value of v, defined as follows:
(i) The status of node v. Let P denote the parent node of v, and let ch_1, ch_2, ch_3 be the three left-to-right children of P, respectively. Specifically, if P only has two children, then ch_3 = null. The status of v is then defined as [...]
(ii) The rank value of node v, which is similar to the concept defined in the existing rank-based MHT scheme [16]. Namely, it stores the number of leaf nodes that belong to the subtree rooted at v. If v is a leaf node, its rank value is defined as 1.
(iii) The authentication hash value of node v, which is defined by cases:
Case 0. v = null; then [...]
Case 1. v is the i-th leaf node; then [...]
Case 2. v is a nonleaf node with children ch_1, ch_2, and ch_3 as above (sometimes ch_3 will be null); then [...]
where ‖ denotes the concatenation operation.
We show an example of bv23Tree in Figure 2, which is constructed on 16 file blocks {m_i}, 1 ≤ i ≤ 16. Next, we use a concise shorthand for some symbols. Given a bv23Tree node v_i, we index its status, rank value, and authentication hash value directly by i instead of by v_i, so each tree node v_i and its corresponding 3-element tuple are not distinguished. We also use h_i to denote the hash value h(m_i) for convenience.
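The node labeling described above can be sketched in a few lines. This is a minimal illustration under explicit assumptions: the status is encoded as the left-to-right position among siblings (0, 1, 2), the authentication hash is computed as SHA-256 over status ‖ rank ‖ payload, and the tree is built bottom-up by grouping nodes in twos and threes. None of these choices is taken from the exact formulas of bv23Tree; the names `Node`, `label`, and `build` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass
class Node:
    children: List["Node"] = field(default_factory=list)  # empty for a leaf
    block_hash: bytes = b""  # h(m_i), only meaningful for leaves
    status: int = 0          # ASSUMED encoding: position among siblings (0, 1, 2)
    rank: int = 1            # number of leaves in the subtree rooted here
    auth: bytes = b""        # authentication hash value


def label(node: Node, status: int = 0) -> None:
    """Fill in status, rank, and authentication hash bottom-up."""
    node.status = status
    if not node.children:  # leaf: rank 1, payload is the block hash
        node.rank = 1
        payload = node.block_hash
    else:
        assert len(node.children) in (2, 3), "2-3 tree: two or three children"
        for pos, child in enumerate(node.children):
            label(child, pos)
        node.rank = sum(c.rank for c in node.children)
        payload = b"".join(c.auth for c in node.children)
    # ASSUMED formula: hash of status || rank || payload
    node.auth = h(bytes([node.status]) + node.rank.to_bytes(8, "big") + payload)


def build(blocks: List[bytes]) -> Node:
    """Build a balanced 2-3 tree over the blocks; all leaves end up at the same depth."""
    assert blocks, "at least one block is required"
    level = [Node(block_hash=h(b)) for b in blocks]
    while len(level) > 1:
        groups, i = [], 0
        while i < len(level):
            # take three nodes unless that would leave a single orphan
            take = 2 if len(level) - i in (2, 4) else 3
            groups.append(Node(children=level[i:i + take]))
            i += take
        level = groups
    label(level[0])
    return level[0]


root = build([bytes([i]) for i in range(16)])  # 16 blocks, as in the Figure 2 example
```

Because every rank is bound into the parent's hash, a verifier who trusts only the root hash can check both the value of a leaf and its index, which is the property that rank-based authenticated trees rely on.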

Batch Queries.
The integrity (i.e., authenticity and freshness) of file blocks can be protected by the corresponding hash values h_i (1 ≤ i ≤ n) stored in leaves, while the integrity of the leaves themselves is protected by bv23Tree. Now, suppose that all file blocks {m_1, m_2, ..., m_n} and a bv23Tree on all blocks have been stored at CSS. The client wants to verify the integrity of any c ordered file blocks m_{i_1}, m_{i_2}, ..., m_{i_c} (1 ≤ i_1 < i_2 < ⋅⋅⋅ < i_c ≤ n) read from CSS and thus batch-queries CSS by issuing the ordered index set Π = {i_1, i_2, ..., i_c} that appoints c ordered leaves {v_{i_1}, v_{i_2}, ..., v_{i_c}}. Then, CSS calls Algorithm 1 to generate the corresponding proof table and responds to the client with the c appointed leaves and their proof table.
An example of a proof table is shown in Table 1. Without loss of generality, suppose the number of levels of a bv23Tree is d. Due to the property of the balanced tree, d must be O(log n). Let T_{c×d} denote the proof table of any c appointed leaves; then T_{c×d} has the following characteristics:
(i) T_{c×d} contains c rows and d columns, respectively (i.e., T_{c×d} consists of c × d items).
(ii) Each item T_{i,j} (1 ≤ i ≤ c, 1 ≤ j ≤ d) can have one or two components, and each component can be a tree node, a mark θ (2 ≤ θ ≤ c), or null. Since a mark θ only denotes a pointer to the θ-th row of the table T_{c×d} itself, the communication cost of a mark is small compared to the cost of a tree node.
(iii) The more leaves are batch-verified, the more null items exist in T_{c×d}. And no matter how many leaves are batch-verified, each necessary auxiliary tree node appears only once in T_{c×d}.
Algorithm 1 (excerpt, steps 10-17): let v denote the current tree node stored in the working array at position i. If v has only one sibling node, then: if this sibling is stored in the array at some position θ (in this case i < θ ≤ c), the item T_{i,j} is set to the mark θ and array position θ is set to null; otherwise (the sibling does not exist in the current array), T_{i,j} is set to the sibling node itself. If v has two sibling nodes, they are taken in the left-to-right order as in Figure 2 [...]
Table 1: The corresponding proof table T_{6×3} is generated based on the bv23Tree and the appointed ordered leaves set.
In the context of batch verifications upon c appointed leaves, compared to the proof path method of [6], the communication cost of the proof table is much less than that of transferring the c different proof paths separately. This is because the proof paths of different leaves repeat the same node information (e.g., node hash values) with high probability, whereas the proof table replaces such repetitions with null items to save communication cost. Furthermore, compared to the 8-element tuple mark in the proof path of [6], each tree node included in the proof table is only related to a 3-element tuple, which further reduces the communication cost.
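The saving from sending each auxiliary node only once can be illustrated by counting nodes. The sketch below is a simplification and not the proof table algorithm itself: it uses a complete binary tree with heap numbering as a stand-in for a balanced 2-3 tree, and the function names are hypothetical. It compares the number of sibling nodes sent with separate proof paths against the number of distinct auxiliary nodes a batch proof actually needs.

```python
from typing import Iterable, List


def path_nodes(leaf: int, height: int) -> List[int]:
    """Node ids from a leaf up to the root (heap numbering, root = 1)."""
    node = (1 << height) + leaf  # leaves occupy ids 2^height .. 2^(height+1) - 1
    nodes = []
    while node >= 1:
        nodes.append(node)
        node //= 2
    return nodes


def path_siblings(leaf: int, height: int) -> List[int]:
    """Sibling of every non-root node on the leaf-to-root path."""
    return [node ^ 1 for node in path_nodes(leaf, height) if node > 1]


def separate_cost(leaves: Iterable[int], height: int) -> int:
    """Nodes transferred when every leaf gets its own proof path."""
    return sum(len(path_siblings(leaf, height)) for leaf in leaves)


def batch_cost(leaves: Iterable[int], height: int) -> int:
    """Distinct auxiliary nodes needed when all leaves are verified together.
    Siblings that lie on another appointed leaf's path can be recomputed by
    the verifier, so they are not transferred."""
    on_path, needed = set(), set()
    for leaf in leaves:
        on_path.update(path_nodes(leaf, height))
        needed.update(path_siblings(leaf, height))
    return len(needed - on_path)


appointed = [3, 5, 7, 9, 11, 15]           # six 0-based leaf positions out of 16
separate = separate_cost(appointed, 4)     # 6 paths of 4 siblings each = 24
batched = batch_cost(appointed, 4)         # 8 distinct auxiliary nodes
```

In this toy setting the batch proof carries 8 nodes instead of 24, and the gap widens as more leaves are appointed, since more siblings fall on paths the verifier reconstructs anyway.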

Batch Verifications.
Upon receiving from CSS the set of c appointed leaves L = {v_{i_1}, v_{i_2}, ..., v_{i_c}} and the required proof table T_{c×d}, the client can run Algorithm 2 to batch-verify the indices and values of these c leaves in L all at once, by using her local root metadata. An example of batch verifications upon multiple leaves is shown in Table 2.
Within Algorithm 2, the function Append(⋅) merges two different sets of numbers while preserving the order of these numbers. For example, given two sets IS_1 = {4} and IS_2 = {6, 8, 10}, then Append(IS_1, IS_2) = {4, 6, 8, 10}. In addition, the operator "⊕" adds a number to every element of a set. For example, let IS_3 = {4, 6, 8, 10}; [...]
Algorithm 2: BatchVerify(root metadata, Π, L, T_{c×d}) → {true, false}. This algorithm can batch-verify not only the hash values of all c ordered leaves L = {v_{i_1}, v_{i_2}, ..., v_{i_c}} provided by CSS, but also that the indices of these c leaves exactly match the appointed index set Π. (Excerpt, steps 13-14: [...] and the component C_2 is a mark; then, in terms of the left-to-right principle, the status value can only be 0 or 1 in this case [...])
Table 2: The process of batch verifications upon the appointed ordered leaves set.
At a high level, Algorithm 2 is also the way to gradually construct the partial bv23Tree, which precisely covers all appointed leaves in L, the paths from these appointed leaves to the root, and all the siblings of the nodes on these paths. Based on this partial bv23Tree, batch updates upon the outsourced original raw blocks {m_1, m_2, ..., m_n} can be supported, as shown in Algorithms 3 and 4.
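The two helper operations can be written down directly. The sketch below implements Append(⋅) and "⊕" as described, and adds a hypothetical `lift` function showing one plausible way they combine with rank values: the local leaf indices of each child are shifted by the total rank of the siblings to its left and then merged. The `lift` step is an assumption made for illustration and is not copied from Algorithm 2.

```python
from typing import List, Tuple


def append(is1: List[int], is2: List[int]) -> List[int]:
    """Append(.): merge two index sets while preserving ascending order."""
    return sorted(set(is1) | set(is2))


def shift(offset: int, index_set: List[int]) -> List[int]:
    """The "⊕" operator: add `offset` to every element of the set."""
    return [offset + i for i in index_set]


def lift(children: List[Tuple[int, List[int]]]) -> List[int]:
    """ASSUMED use: given (rank, local indices) for each child from left to
    right, compute the indices of the same leaves relative to the parent."""
    merged: List[int] = []
    offset = 0
    for rank, local in children:
        merged = append(merged, shift(offset, local))
        offset += rank  # leaves under this child precede those of later siblings
    return merged


print(append([4], [6, 8, 10]))           # [4, 6, 8, 10]
print(shift(2, [4, 6, 8, 10]))           # [6, 8, 10, 12]
print(lift([(2, [1]), (3, [2, 3])]))     # [1, 4, 5]
```

Repeating such a lifting step from the leaves to the root yields global leaf indices, which can then be compared with the appointed index set Π.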

Batch Updates.
Three basic types of dynamic update operations are modification (M), insertion (I), and deletion (D) [9]. Any block-level update operation O can be defined in the form O := "U i m", where U ∈ {M, I, D} denotes the operation type, i is the index of the targeted block, and m is the new data block to be stored according to the targeted index i (m is null for deletion). For example, "M 2 m" modifies the 2nd block to m, "I 3 m" inserts m after the 3rd block, and "D 4 null" deletes the 4th block.
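The semantics of the three operation types can be sketched on a plain list of blocks. The type labels "M", "I", and "D", the tuple encoding, and the function name are assumptions made for this example; indices are 1-based as in the text.

```python
from typing import List, Optional, Tuple

# (operation type, 1-based target index, new block or None for deletion)
Op = Tuple[str, int, Optional[bytes]]


def apply_ops(blocks: List[bytes], ops: List[Op]) -> List[bytes]:
    """Apply an ordered batch of block-level update operations."""
    out = list(blocks)
    for kind, i, m in ops:
        if kind == "M":                      # modify the i-th block
            if not 1 <= i <= len(out) or m is None:
                raise ValueError(f"invalid modification at index {i}")
            out[i - 1] = m
        elif kind == "I":                    # insert m after the i-th block
            if not 0 <= i <= len(out) or m is None:
                raise ValueError(f"invalid insertion at index {i}")
            out.insert(i, m)
        elif kind == "D":                    # delete the i-th block
            if not 1 <= i <= len(out):
                raise ValueError(f"invalid deletion at index {i}")
            del out[i - 1]
        else:
            raise ValueError(f"unknown operation type {kind!r}")
    return out


blocks = [b"a", b"b", b"c", b"d"]
ops = [("M", 2, b"x"), ("I", 3, b"y"), ("D", 4, None)]
print(apply_ops(blocks, ops))  # [b'a', b'x', b'c', b'd']
```

Operations are applied sequentially, so each index refers to the state left by the previous operation: here the deletion removes the block that was inserted one step earlier.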
In the setting of dynamism, the update operations should be performed not only on the data blocks, but also on the bv23Tree. Note that only insertion and deletion can cause a structure transformation of bv23Tree, and the maintenance of this transformation is essentially identical to the maintenance of a standard 2-3 tree [20], except that the updating of each affected tree node v should be considered in terms of its 3-element tuple (status, rank value, and authentication hash value). As in Figure 3, we give an example of the structure transformation of bv23Tree after repeatedly inserting (or deleting) the appointed data blocks at the same index position. Now, suppose that the client caches a batch of ordered update operations SO := {O_1, O_2, ..., O_w}, which refer to an ordered index set Π = {i_1, i_2, ..., i_c}, c ≤ w, obtained by removing duplicate indices. To batch-update the remote data, the client first issues SO to CSS and obtains the returned ordered leaves set L = {v_{i_1}, v_{i_2}, ..., v_{i_c}} along with the proof table T_{c×d}. As shown in Section 3.3, during the process of executing Algorithm 2 to batch-verify L, the client can construct the corresponding partial bv23Tree. Then, after sequentially performing each operation of SO upon the partial bv23Tree, the client can compute by herself the authentication hash value of the root of the final-state tree. Finally, if CSS outputs the same final-state root hash value as the one computed by the client herself, the client outputs true, meaning that CSS correctly performed the batch updates according to SO. Otherwise, the client outputs false. As shown in Algorithms 3 and 4, we outline the Batch Updates algorithm performed by CSS and the Verify Updates algorithm performed by the client, respectively.

Cloud Server Storage Configuration.
As shown in Figure 1, there are two different buffers U and E at the CSS side for outsourced data storage. The client's original file F, consisting of n raw blocks F = {m_1, m_2, ..., m_n}, will be separately stored into U and E with different formats, detailed as follows.

U (Unencoded Buffer).
To support efficient reads, buffer U always stores the up-to-date copy of the raw data blocks, and thus the update operations issued by the client must be immediately performed upon the appointed blocks of U. All the blocks in U are organized by the bv23Tree proposed in Section 3, the batch-verification property of which enables the client to batch-read a group of raw blocks from U for improved performance.

E (Erasure-Coded Buffer).
Buffer E is organized by the same-sized hierarchical structure, as in Figure 4, where each level has the same size to hold λ encoded blocks with the block tags (λ is the security parameter). Moreover, E is equally divided into two parts ED and EO with the same capacity. ED stores encoded data blocks, and EO stores encoded operation blocks. At the beginning, the client applies erasure code to encode the entire file F into ñ encoded data blocks, computes the block tags, and sequentially stores them into ED according to the block indices as shown in Figure 4. Subsequently, the content of ED will not be changed until buffer E is rebuilt. As shown in Section 3.4, each update operation O is of the form "U i m", which can also be regarded as an operation block. In this case, after a batch of update operations SO = {O_1, O_2, ..., O_w} (w < λ) is performed upon U, the client will encode SO into λ encoded operation blocks and store them along with the block tags into a corresponding level of EO. So, EO is empty in the initial state and will be incrementally filled up level by level over time. At last, once EO is full, the rebuilding of E is triggered, as will be described later. At a high level, by decoding the whole E and sequentially performing the accumulated update operations of EO upon the original data blocks of ED, the latest version of the client's whole outsourced raw data can be recovered. Therefore, the periodic POR audits only need to be deployed against buffer E, which functions as a backup storage to provide the retrievability guarantee when data loss occurs. To resist the selective deletion attack mentioned before, every POR audit will sample challenged blocks from each filled level of E. Because each level is entirely encoded, a malicious CSS must corrupt a significant portion of the encoded blocks of one level to actually cause data loss. However, if a malicious CSS corrupts that many encoded blocks of one level, it will fail the POR audits with overwhelming probability.

Initialization.
We also work in the bilinear setting, where G is a multiplicative cyclic group of prime order p and g is a generator of G. Let e : G × G → G_T be the same nondegenerate bilinear map as in [8] that has the following property: for any z, y ∈ G and a, b ∈ Z_p, e(z^a, y^b) = e(z, y)^{ab}. Let H : {0, 1}* → G be the secure BLS hash function. The initialization of our scheme is described as follows.
(1) GenKey(1^k): each entity E ∈ {client, CSS, TPA} generates a signing key pair (ssk_E, spk_E) for their respective signatures.
[...] (along with their tags Φ) into the ED part of buffer E, the layout of which is shown in Figure 4.
In addition, let sp be a state pointer that denotes the number of filled levels of buffer E. Clearly, the range of sp is ñ/λ ≤ sp ≤ 2ñ/λ, as in Figure 4. Let cℓ denote the number of challenged blocks from a filled level, and let P be the public parameter set for POR audits, which includes sp, cℓ, g, and fid. Finally, the client keeps P and the root metadata locally, sends P to TPA, and deletes the outsourced data and its authentication information (including Ψ and the tags Φ) from her local storage.

Data Access Mechanisms.
Based on the bv23Tree of buffer U and the hierarchical configuration of same-sized levels of buffer E, DOPOR supports batch updates that enable the client to perform a batch of update operations upon the outsourced storage, which is suitable for the common scenario of [14] where writes are frequent.Now, after completing the initialization of DOPOR, the client can access and update her outsourced data by the following three protocols.
[...] of these λ encoded blocks in order i_j = sp⋅λ + j, 1 ≤ j ≤ λ, and computes the tag for each of these encoded blocks by the same tag formula as in GenTags(⋅) of initialization. Secondly, the client outsources these λ encoded blocks along with their tags into the (sp + 1)th level of buffer E (i.e., the corresponding empty level of EO). Finally, the client sends to TPA the updated state pointer sp* = sp + 1 with her signature and empties her O(λ) local storage for caching the next batch of ordered update operations. Overall, through this protocol, the CSS state st consists of U and EO as above.
(3) Rebuild( root , sk client , st, sp). Once in every  update operations, when EO of buffer E gets filled (i.e., sp is equal to 2ñ/ as in Figure 4), the periodic rebuilding of E is triggered. Since buffer U stores the up-to-date copy of all raw data blocks, client can carry out this rebuilding based on U for better performance, instead of decoding the whole E and applying all operations of EO to the original data of ED. (ED and EO will not be decoded and combined unless client detects data corruption within CSS and wants to retrieve the whole data.) Benefitting from the ability to batch-read multiple blocks from CSS and from our hierarchical configuration, client can rebuild E with only () local memory, which is the same size as a level of the hierarchical structure. This is a significant improvement over existing schemes [14,18,19], which require () client local memory for such rebuilding. In this protocol, the CSS state st consists of U and E. At the beginning, client runs ReadBlocks(⋅) to batch-read the first  ( < ) raw blocks from U.
After encoding these  blocks into  codeword blocks, client computes the corresponding  tags in the same way as in GenTags(⋅), where the indices of  codeword blocks of each level are shown in Figure 4. Then, client outsources to CSS these  codeword blocks and tags, which will be stored in the first level of ED. Subsequently, client can empty her () local memory and batch-read from U the next batch of  blocks, which are processed and outsourced to CSS by the same procedure as above, except that each batch of  processed codeword blocks and tags will be stored in a different level of ED (e.g., the second batch will be stored in the second level of ED, etc.). After the last batch of blocks from U is processed and outsourced, client authorizes CSS to update the whole ED with all of the above outsourced codeword blocks and tags, and CSS simultaneously empties the whole EO. Finally, client publishes the updated state pointer sp * to TPA, meaning that the rebuilding is over. Overall, the amortized bandwidth of rebuilding E is (1) per update operation, since this rebuilding is executed only once every  update operations.
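The amortized claim can be checked with a short count. The following is a sketch under our own assumptions, with symbols chosen by us where the extracted text lost them: U holds n raw blocks, the erasure code expands them to ñ = ρn codeword blocks for a constant expansion factor ρ (for example, ρ = 12/9 for the code used in the experiments), and the comparatively small tags are ignored.

```latex
\underbrace{n}_{\text{raw blocks read from } U}
\;+\;
\underbrace{\tilde{n}}_{\text{codeword blocks written to } ED}
\;=\; (1+\rho)\,n ,
\qquad
\frac{(1+\rho)\,n}{n} \;=\; 1+\rho \;=\; O(1)
\ \text{blocks per update operation.}
```

Because one rebuild is charged to the n update operations that precede it, the per-update share stays constant no matter how large the file grows.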

Outsourced Proof of Retrievability (OPOR)
(1) RandomChal(B, , P). The periodic random challenges are generated based on the time-based bitcoin pseudorandom source B. More specifically, given the current time , the tool getblockhash() [12] from the bitcoin source B can output the hash of the latest block that has arisen since time  in the bitcoin block chain. As shown in [12], for a future time , no adversary can predict the hash of a bitcoin block that will arise in the future. In addition, for a past time , the hash of the previous bitcoin block, returned by getblockhash(), is objective and irrefutable against any adversary. Thus, let coin () denote the output of getblockhash(); then coin () can be considered as a secure pseudorandom coin for time .
With the public parameters sp, cℓ ∈ P, TPA can generate the random POR challenge of length sp ⋅ cℓ by calling the same probabilistic algorithm Sample(coin () , sp ⋅ cℓ) as in [12], where coin () is obtained from getblockhash() for each different time . Specifically, to ensure that a POR challenge samples the same number cℓ of blocks from each filled level of buffer E (i.e., ED and EO) at CSS, TPA first calls Sample(coin () , cℓ) to choose a random cℓ-element subset  * = { ,1 ,  ,2 , . . .,  ,cℓ } of [1, ], and then it computes the (sp ⋅ cℓ)-element set  as follows: Moreover, for each  ∈ , as in [12], TPA also relies on Sample(coin () , sp ⋅ cℓ) to choose a corresponding random element    ←  Z  . Finally, let  () := {(,   )} ∈ denote an (sp ⋅ cℓ)-element set, which is regarded as the POR challenge at time  and sent to CSS by TPA.
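The key property of RandomChal(⋅) is that anyone holding the same coin can regenerate the same challenge. The sketch below illustrates this in Python under stated assumptions: SHA-256 in counter mode stands in for the Sample(⋅) algorithm of [12], the names `sample_subset`, `prg_ints`, `random_chal`, and `level_size` are hypothetical, and level j (counted from 0) is assumed to occupy block indices j·level_size + 1 through (j + 1)·level_size. Reducing a hash modulo a bound introduces a slight bias that a real implementation would need to remove.

```python
import hashlib


def _expand(coin: bytes, label: bytes, counter: int) -> int:
    """Derive one pseudorandom integer from the coin (SHA-256 in counter mode)."""
    digest = hashlib.sha256(coin + label + counter.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def sample_subset(coin: bytes, size: int, universe: int) -> list:
    """Deterministically pick `size` distinct indices from [1, universe]."""
    if size > universe:
        raise ValueError("cannot sample more indices than the level holds")
    chosen, seen, counter = [], set(), 0
    while len(chosen) < size:
        idx = _expand(coin, b"idx", counter) % universe + 1
        counter += 1
        if idx not in seen:
            seen.add(idx)
            chosen.append(idx)
    return sorted(chosen)


def prg_ints(coin: bytes, count: int, modulus: int) -> list:
    """Derive `count` pseudorandom coefficients in [0, modulus)."""
    return [_expand(coin, b"coef", c) % modulus for c in range(count)]


def random_chal(coin: bytes, sp: int, cl: int, level_size: int, p: int) -> list:
    """Build a challenge that samples the same cl positions from each of sp levels."""
    base = sample_subset(coin, cl, level_size)
    # Assumed layout: shift the sampled positions by one level size per level.
    indices = [level * level_size + i for level in range(sp) for i in base]
    coeffs = prg_ints(coin, len(indices), p)
    return list(zip(indices, coeffs))
```

Calling `random_chal` twice with the same coin returns the same list of (index, coefficient) pairs, which is what later lets the client rebuild a past challenge without trusting TPA.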
(3) PORAudit(P,  () ,  () ). Based on the public parameters set P and the challenge  () , TPA can compute his own auditing parameter  () as follows: Then, after TPA parses the CSS's response to obtain  () , TPA will audit  () by checking the following equation: If this verification does not pass, TPA informs the client of this abnormal situation, meaning that data loss has occurred. Finally, TPA must generate and store the following log Λ () that corresponds to the challenge at time : Λ () := (, coin () ,  () ,  () ‖ sig CSS ).
(4) CheckLogs(B, , Λ () , P, sk client ). To protect against a malicious TPA who might violate the above auditing process or even collude with CSS, the client can verify TPA's work by checking TPA's logs. However, instead of checking the accumulated logs one by one, client is able to batch-check multiple logs together, so such batch-checking against TPA needs to be performed only seldom in practice, as shown in [12]. Client can check the latest TPA's log for a minimal check, since this log reflects the latest status of retrievability for the outsourced data [12]. More generally, to perform a batch-check against TPA, client selects a point-in-time set  = { 1 ,  2 , . . .,   } and sends  to TPA, where each  ∈  marks the time of a past challenge.
Upon receiving , for each  ∈ , in terms of  () and  () stored in log Λ () , TPA computes Then, TPA responds to client with his proof  () := ( ()  1 ,  () 2 , . . .,  ()  ,  () ,  () ), which is also signed by TPA. Based on the public parameters sp and cℓ, for each  ∈ , the client is able to reconstruct each past challenge  () on her own as described in the protocol RandomChal(⋅), by using Sample(coin () , sp ⋅ cℓ) with the pseudorandom coin () obtained from getblockhash(). So, client can compute her own checking parameter  () as follows: After verifying TPA's signature on the proof  () , client first checks whether  () is equal to  () of  () . If  () ≠  () , client outputs false, confirming that TPA was irresponsible for the past POR audits. Finally, client checks the following equation with her secret key  ∈ sk client : If the above client's check fails, client outputs false, which means that there exists collusion between TPA and CSS and that data corruption has occurred within CSS. The correctness of (12) is demonstrated as follows:
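To make the client's secret-key check concrete, here is a toy Python sketch of an aggregated homomorphic tag in the style of [8]. It is an illustration under loud assumptions, not the scheme's implementation: it works in the multiplicative group modulo a small Mersenne prime instead of a pairing-friendly elliptic curve, it supports only verification with the secret exponent `x` (no public, pairing-based check), and all names (`gen_tag`, `aggregate`, `check`, `h_to_group`) are hypothetical.

```python
import hashlib

P = 2**127 - 1   # toy prime modulus (assumption; the paper uses a bilinear group)
Q = P - 1        # exponents can be reduced modulo P - 1 by Fermat's little theorem


def h_to_group(fid: str, i: int) -> int:
    """Hash (fid, block index) to a nonzero element modulo P."""
    digest = hashlib.sha256(f"{fid}|{i}".encode()).digest()
    return int.from_bytes(digest, "big") % P or 1


def gen_tag(x: int, u: list, fid: str, i: int, block: list) -> int:
    """Tag of block i: (H(fid, i) * prod_j u_j^{c_ij})^x mod P."""
    base = h_to_group(fid, i)
    for u_j, c_ij in zip(u, block):
        base = base * pow(u_j, c_ij, P) % P
    return pow(base, x, P)


def aggregate(challenge: list, blocks: dict, tags: dict):
    """Prover side: fold the challenged blocks and tags into (mu, sigma)."""
    sectors = len(next(iter(blocks.values())))
    mu, sigma = [0] * sectors, 1
    for i, nu in challenge:
        for j in range(sectors):
            mu[j] = (mu[j] + nu * blocks[i][j]) % Q
        sigma = sigma * pow(tags[i], nu, P) % P
    return mu, sigma


def check(x: int, u: list, fid: str, challenge: list, mu: list, sigma: int) -> bool:
    """Verifier side: recompute the expected aggregate with the secret exponent x."""
    expected = 1
    for i, nu in challenge:
        expected = expected * pow(h_to_group(fid, i), nu, P) % P
    for u_j, mu_j in zip(u, mu):
        expected = expected * pow(u_j, mu_j, P) % P
    return sigma == pow(expected, x, P)
```

The verifier never touches the blocks themselves: it needs only the challenge, the short vector `mu`, and the single value `sigma`, which is why the proof size stays small regardless of how many blocks are challenged.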

Security Analysis
Similar to the analysis of [8,12], we evaluate the soundness of our DOPOR scheme according to three parts: unforgeability, liability, and extractability.

Theorem 1 (unforgeability). It is computationally infeasible for any adversary A to forge a proof that can pass verifier's check, if the Computational Diffie-Hellman (CDH) problem and the Discrete Logarithm (DL) problem are hard.
Proof. Since CSS does not check any proof throughout the whole process of executing DOPOR, there are only two cases to be discussed.
Case 1. TPA plays the role of verifier to check the proof returned from CSS during the execution of the protocols GenProof(⋅) and PORAudit(⋅) as shown in Section 4.4. In this case, observe that both CSS and TPA behave exactly as in the BLS-based public verification scheme of [8], so the unforgeability guarantee immediately follows from the work of [8]. As shown in [21], the BLS scheme is secure when the CDH problem is hard in bilinear groups, based on which the unforgeability of the BLS-based public scheme has been proven in [8]; the proof is thus omitted here.
Case 2. The client acts as the verifier to check TPA's logs as in the protocol CheckLogs(⋅) of Section 4.4. To pass the client's check with (12), TPA should return the correct proof ( () 1 ,  () 2 , . . .,  ()  ,  () ). Now, assume that TPA is able to forge the proof. As shown in [8], due to the security of the BLS scheme, the BLS-based homomorphic verifiable tag   is unforgeable, and thus the aggregated tag  () is also unforgeable. So, the only choice for TPA is to generate the forged aggregated block, denoted by μ() 1 , μ() 2 , . . ., μ()  , as the response to client's check. Then, for (12) to be satisfied, we have In addition, according to the correct proof, we have Note that  () is the parameter computed by client herself, and the security of  () is ensured by the security of the bitcoin pseudorandom source of [12]. Based on the security of the BLS scheme, we can learn that where Δ ()   = μ()  −  ()  and μ() . For any two given elements  1 ,  2 ∈ , we have   =  1    2   ∈ , where   ,   ∈ Z  . Hence, (16) is transformed as follows: Obviously, (17) means that a malicious TPA can solve the DL problem, which conflicts with the assumption that the DL problem is hard. Therefore, it is infeasible for TPA to forge a proof to pass the client's check. This completes our proof.

Theorem 2 (liability). If any adversary A attempts to cheat or frame the honest entity who has been well behaving, the honest entity can output incontestable evidence to confirm the misbehavior of adversary A in case of lawsuit.
Proof. It is clear that if the honest entity can protect against the collusion of the other two malicious entities, then this honest entity can certainly protect against any single malicious entity. Hence, to prove Theorem 2, it suffices to consider the following three cases where only one entity is honest.
Case 1. Honest client defends against the collusion of CSS and TPA.Obviously, CSS has incentive to collude with TPA only when the outsourced data corruption has occurred at cloud side.In this case, once the corrupted data blocks are challenged, according to Theorem 1, both CSS and TPA cannot forge an effective proof to pass client's check against TPA's log, unless they can solve the DL problem (but this probability is negligible).Therefore, the false output by client when executing the protocol CheckLogs(⋅) is the incontestable evidence to identify the collusion of CSS and TPA.
Case 2. Honest TPA defends against the collusion of client and CSS.As shown in the protocol PORAudit(⋅) of Section 4.4, TPA completes his auditing work by computing the auditing parameter  () and verifying (8), where all these processes are reproducible and undeniable for the malicious entities.
More specifically, on the one hand, the nonrepudiation of parameter  () is derived from the objectivity of challenge  () , which is computed based on the secure bitcoin pseudorandom source and the public parameters {sp, cℓ}.On the other hand, all other inputs involved in verifying (8) are also undeniable for both client and CSS; for example, ( ()  1 ,  () 2 , . . .,  ()  ,  () ) are signed by CSS's signature, and {, ,  1 , . . .,   } are the public parameters confirmed by all entities.Hence, the honest TPA can provide his logs Λ () in case of lawsuit, which includes all above incontestable evidence enabling the playback of all past TPA's auditing work to prove the innocence of TPA.
Case 3. Honest CSS defends against the collusion of client and TPA.When malicious client colludes with TPA to falsely accuse CSS of corrupting the th block   , CSS can output the intact   = {  } 1≤≤ and its tag   as the incontestable evidence.As shown in [8], each tag   constructed and outsourced by client herself is unforgeable.Based on the security of BLS signature scheme, as long as the above   and   output by CSS satisfy (18), then CSS is innocent.
This completes the proof of Theorem 2.
According to Theorem 2, all three entities have to behave properly in DOPOR. In this case, the extractability during TPA's audits against CSS immediately follows from the work of [8], since the procedure of TPA's audits in DOPOR corresponds to the public verification scheme of [8]. As for the extractability of performing the CheckLogs(⋅) protocol, we have the following theorem.
Theorem 3 (extractability). During the client's checking against TPA, if client does not output false after checking TPA's logs, then there exists a deterministic extraction algorithm with which client can extract the challenged file blocks through repeated interactions with TPA.
Proof. According to Theorem 1, to pass client's check during the execution of protocol CheckLogs(⋅), TPA has to respond to client with the correct proof  () that includes the aggregated block ( ()  1 ,  () 2 , . . .,  ()  ), and each  ()  is a linear equation of the following form: where  is a set of points in time chosen by client herself, and all the coefficients   are determined by these points in time as shown in the protocol RandomChal(⋅) of Section 4.4. Now, suppose client chooses appropriate points in time in the past to generate different sets  and checks TPA's logs a polynomial number of times by sending different  to TPA; then, client can get a total of  systems of linear equations that are built upon the challenged target blocks.
Finally, by solving these  systems, client can extract all target blocks.
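The extraction step amounts to solving linear systems over a prime field. The Python sketch below shows this for a single sector, assuming (our simplification) that each verified response is a known linear combination of the challenged sector values modulo a prime p and that the client has collected as many independent responses as there are unknowns. The function name `solve_mod_p` is hypothetical, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
def solve_mod_p(coeffs: list, responses: list, p: int) -> list:
    """Solve coeffs * unknowns = responses (mod p) by Gauss-Jordan elimination."""
    n = len(coeffs)
    # Augmented matrix: one row per verified response.
    m = [[v % p for v in row] + [rhs % p] for row, rhs in zip(coeffs, responses)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("responses are not linearly independent")
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, p)          # modular inverse of the pivot
        m[col] = [v * inv % p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]
```

If the coefficient rows are linearly dependent, the function raises an error, which corresponds to the client needing to issue further checks with different point-in-time sets.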
When referring to dynamic updates, as shown in Section 4.3, the procedure of performing updates is divided into two parts: (1) performing the batch updates upon buffer U according to the algorithms Batch Updates (⋅) and Verify Updates (⋅), as shown in Section 3.4, which rely on the batch verification property of bv23Tree, whose security is ensured by Theorem 4; (2) outsourcing a batch of encoded operations blocks and their tags to buffer E, the security of which is directly ensured by the security of the unforgeable tags, the periodic executions of POR audits against buffer E, and the erasure code scheme.

Theorem 4. Assuming the existence of a collision-resistant hash function ℎ(⋅), for any  ordered indices set Π = { 1 ,  2 , . . .,   } appointed by the client, the corresponding proof table  × generated using the bv23Tree ensures the integrity of all the  appointed leaves L = {V  1 , V  2 , . . ., V   } returned from CSS with overwhelming probability.
Proof. As shown in Section 3.2, upon receiving the appointed ordered indices set Π = { 1 ,  2 , . . .,   }, CSS should respond to client with the corresponding leaves L = {V  1 , V  2 , . . ., V   } and proof table  × . Now, suppose that CSS tries to act dishonestly; then, the possible ways for CSS to misbehave are covered by the following two cases.
Case 1. Malicious CSS either forges some leaves within L or forges some items within the proof table  × . As shown in Section 3.3, client possesses the public tree root hash  root and will verify both L and  × by calling the algorithm Batch Verify (⋅), which recalculates the public  root by iteratively hashing the values of all tree nodes included in L and  × in the specified order. If the forged L or  × caused Batch Verify (⋅) to output the same root hash value as  root , CSS would have found a collision for the hash function ℎ(⋅), which contradicts the assumption that ℎ(⋅) is collision-resistant. Therefore, the probability that malicious CSS successfully forges L or  × is negligible.
Case 2. Malicious CSS launches the replacing attack; that is, CSS returns the replaced L, where some appointed leaves are replaced with other existing leaves of bv23Tree, and the corresponding proof table T× that is correctly generated based on L. Without loss of generality, suppose that L = {V  , V  , . . ., V   }, where V  and V  are not the appointed leaves; that is,  ≠  1 and  ≠  2 . In this case, although the final hash value output by Batch Verify (⋅) is equal to the public  root , the final value of the variable IS 1 within Batch Verify (⋅) will be {, , . . .,   } instead of the specified { 1 ,  2 , . . .,   }, which contradicts the expected results as shown in Section 3.3. So, client will still output false, meaning that the above attack is detected by client.
In short, if there exists a collision-resistant hash function, malicious CSS has to return all the appointed leaves along with the correct proof table to pass client's batch verifications upon these leaves.
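The two cases can be illustrated with a simplified stand-in. The Python sketch below is not the bv23Tree: it uses a static binary Merkle tree with a power-of-two number of leaves, where a leaf's index is implied by its position rather than computed from ranks, and all function names are hypothetical. It does show the two ideas the proof relies on: a batch of leaves is verified with a single root recomputation from a deduplicated set of auxiliary nodes, and the verifier rejects any answer whose leaf indices differ from the requested ones.

```python
import hashlib


def leaf_hash(value: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_tree(values: list) -> list:
    """Return all levels of hashes, leaves first, root last."""
    if not values or len(values) & (len(values) - 1):
        raise ValueError("sketch assumes a power-of-two number of leaves")
    levels = [[leaf_hash(v) for v in values]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([node_hash(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_proof(levels: list, indices: list) -> dict:
    """Server side: collect only the sibling hashes the verifier cannot derive."""
    aux, known = {}, set(indices)
    for lvl in range(len(levels) - 1):
        for pos in known:
            if pos ^ 1 not in known:
                aux[(lvl, pos ^ 1)] = levels[lvl][pos ^ 1]
        known = {pos >> 1 for pos in known}
    return aux


def batch_verify(root: bytes, n_leaves: int, requested: list, leaves: dict, aux: dict) -> bool:
    """Client side: recompute the root once for all requested leaves."""
    if set(leaves) != set(requested):      # detects replaced or missing indices
        return False
    current = {i: leaf_hash(v) for i, v in leaves.items()}
    for lvl in range(n_leaves.bit_length() - 1):
        parents = {}
        for pos in sorted(current):
            if pos >> 1 in parents:
                continue
            sib = pos ^ 1
            sib_hash = current[sib] if sib in current else aux.get((lvl, sib))
            if sib_hash is None:
                return False
            pair = (current[pos], sib_hash) if pos % 2 == 0 else (sib_hash, current[pos])
            parents[pos >> 1] = node_hash(*pair)
        current = parents
    return current.get(0) == root
```

For leaves 2, 3, and 7 of an eight-leaf tree, the batch proof carries three auxiliary hashes, whereas three separate proof paths would carry nine.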

Performance Evaluation
Our experiments were implemented in Python on a Linux system with an Intel Xeon E5-2609 CPU running at 2.40 GHz, 16 GB of RAM, and a 7200 RPM 600 GB Serial ATA drive with a 32 MB buffer. The cryptographic operations were implemented based on the Python Cryptography Toolkit [22] and the Pypbc library [23], and we used the 80-bit security parameter, which means that the order  of group  is of 160-bit length. We chose a 1 GB raw data file  for testing and relied on the (9, 12) erasure code for encoding. For ease of comparison, all block sizes are set to 4 KB as in [6,14]. Our results are an average of 20 rounds.
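For orientation, the block counts implied by these settings can be worked out directly. The sketch below assumes that 1 GB means 2^30 bytes, that "(9, 12)" means 9 data blocks are encoded into 12 codeword blocks, and that the ratio applies to the file as a whole; the function name and parameter names are hypothetical.

```python
def storage_parameters(file_bytes: int, block_bytes: int,
                       data_shards: int, total_shards: int,
                       challenged_percent: int):
    """Return (raw blocks, encoded blocks, challenged blocks) using integer math."""
    n_raw = -(-file_bytes // block_bytes)                  # ceiling division
    n_encoded = -(-n_raw * total_shards // data_shards)    # ceiling division
    challenged = n_encoded * challenged_percent // 100
    return n_raw, n_encoded, challenged


# 1 GB file, 4 KB blocks, (9, 12) code, 1% of encoded blocks challenged per audit
print(storage_parameters(1 << 30, 4 << 10, 9, 12, 1))
# -> (262144, 349526, 3495)
```

Under these assumptions, a 1% audit touches roughly 3,500 of about 350,000 encoded blocks.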

POR Audits Cost.
During each POR audit, since the number of challenged blocks || is far less than the total number of encoded file blocks ñ (e.g., the percentage ||/ñ = 1% as in [12]), the time consumed in proof computation (or proof verification) will not be the bottleneck for CSS (or TPA). The POR audit phase of DOPOR corresponds to the execution of the public verification construction of [8], whose computational efficiency has been confirmed in previous studies [5,6,10]. Therefore, the computation time of the POR audit phase is not the primary concern in our DOPOR scheme, and we focus on evaluating the bandwidth cost of this phase. Figure 5 depicts the total TPA-CSS bandwidth cost for executing a POR audit once, for various percentages of challenged blocks. Here, with regard to the given parameters cℓ and  in DOPOR, the percentage of challenged blocks is equal to cℓ/. The public DPOR of [14] results in a large communication overhead since it must transfer all challenged blocks during each audit, which greatly affects the bandwidth performance. By relying on blockless verification and homomorphic authenticators (tags) to compress the proof size, the bandwidth costs of both our DOPOR and the static OPOR (i.e., Fortress) of [12] are dominated only by the sizes of challenges released from TPA and thus gradually increase with the percentage of challenged blocks. Note that the bandwidth cost of DOPOR is always less than that of Fortress, since TPA only needs to send a single challenge for each audit in DOPOR, whereas TPA must send two parallel challenges in Fortress. This is because, during each audit, Fortress requires CSS to respond with two different responses [12]: one is used by TPA for auditing CSS and the other will be used by client for checking TPA's work. These two responses thus correspond to two parallel challenges in Fortress. However, as shown in Section 4.4, DOPOR enables CSS to respond with only one response, which is based on the public key cryptosystem to support both TPA's auditing and client's checking. So, within DOPOR, there is only one challenge corresponding to this sole CSS response.

Read Cost.
When client reads a batch of raw blocks from CSS, the integrity of these blocks is guaranteed by the authenticated data structure of the up-to-date buffer U. In Figure 6, we evaluate the extra bandwidth cost (i.e., not including the bandwidth of transferring the blocks themselves) incurred on client side for batch-verifying the leaves of all returned raw blocks with the proposed proof table of bv23Tree, when compared to the costs of the rb23Tree method [6] and the standard MHT method [14] that can only verify all appointed leaves by transferring their respective proof paths.As shown, with the increasing number of blocks batch-read from CSS, the extra bandwidth cost caused by rb23Tree is much higher than MHT, since the basic component of proof path of rb23Tree is an 8-element tuple mark with a larger size.However, owing to the proof  in Figure 7, based on bv23Tree, client also further reduces the computation time spent for verifying the integrity of all returned blocks, due to the fact that the proof table enables client to batch-verify all returned leaves together just by computing the tree root hash once, avoiding the straightforward way of rb23Tree and standard MHT that client has to verify different leaves one after another by repeatedly computing the tree root hash with different proof paths.
In conclusion, our results show that the more blocks are batch-read from CSS, the more repetitive node values the proof table omits from transfer and computation, and thus the more cost the client saves with bv23Tree when performing reads.

Write Cost.
Now, we evaluate the performance of writes for DOPOR and the public DPOR of [14], both of which apply the client-side cache measure for performing writes; that is, client will cache locally a group of raw blocks (contained in the update operations of modification or insertion as in Section 3.4) and write these blocks in a batch to CSS, as shown in the protocol PerformUpdates(⋅) of Section 4.3. For DOPOR in this experiment, the parameter  is the number of client-side cached blocks, which determines the parameter  according to the erasure-coding rate.
Figure 8 depicts the client-CSS amortized bandwidth cost for writing each 4 KB raw block.With the increasing number of client-side cached blocks, our results show that DOPOR incurs 17%∼48% more amortized bandwidth than the public DPOR, due to the fact that DOPOR needs to transfer the additional encoded operations blocks besides the raw blocks.However, recall that the required bandwidth cost for frequent POR audits in the public DPOR is orders of magnitude higher than that in DOPOR (Figure 5), and DOPOR achieves a stronger security level by protecting against malicious TPA and collusion than the public DPOR.Furthermore, as shown in Figure 9, the public DPOR incurs an average of 45% more computation time at CSS side than DOPOR, since during performing writes the public DPOR must rebuild the corresponding levels of an MHT-based hierarchical structure located on CSS's disk, which results in a lot of additional disk I/O time when compared to DOPOR that does not need to do such rebuilding.

Client-Side Checking Cost.
As shown in [12], although client should not be embroiled in the most frequent POR audits, it is necessary to give client the capability of checking TPA's past work to protect against a malicious TPA. Since both DOPOR and the Fortress scheme of [12] adopt the aggregation technique to compress the proof size, the client-TPA bandwidth costs during checking TPA are alike for these two schemes, so in this experiment we focus on measuring the client's computation time in the two schemes when batch-checking TPA's logs, as shown in Figure 10. Compared to Fortress, DOPOR requires more time at client side for checking TPA, because the exponentiation operation on the elliptic curve of DOPOR incurs more computation cost than the modular operation of Fortress. However, recall that DOPOR enables efficient data dynamics that cannot be supported in Fortress, so the additional client-side time cost incurred by DOPOR, that is, the distance between the line of DOPOR and that of Fortress in Figure 10, can be regarded as the price for dynamism under the OPOR setting. Indeed, this additional dynamism cost can be tolerated by client to a great extent, since the client's checking against TPA is an optional verification and is only seldom executed in practice [12]; for example, client might batch-check TPA's logs just once in several months or even a year.

Related Work
Nowadays, with the rapid development of cloud computing, more and more cloud applications are designed upon the big data stored at CSS side, such as service quality evaluation [24] and cloud service recommendation [25,26]. However, how to guarantee the storage security of the big data is a critical challenge for mobile clients in the setting of cloud computing. Proof of Retrievability (POR) is a kind of security measure that builds upon cryptographic proofs to ensure the correctness and retrievability of client's big data outsourced to cloud. Juels and Kaliski Jr. [7] proposed the first POR scheme by utilizing the "sentinels" technique, where client can conceal some sentinel blocks among other original data blocks for remote POR audits before outsourcing her data. But this proposal can only support a limited number of POR audits, since performing the audits will expose the corresponding sentinels, so frequent audits cannot be sustained once all sentinels are exhausted. Based on pseudorandom functions (PRFs) and BLS signatures [21], Shacham and Waters [8] proposed two improved POR schemes with private verification and public verification, respectively. Both of these schemes enable an unlimited number of audits against CSS and simultaneously compress the response of CSS into one aggregated block along with a small authenticator value for optimized auditing bandwidth. Subsequently, Dodis et al. [27] generalized the constructions of [7,8] by combining the concepts of POR with coding and complexity theory. In view of the importance of data dynamics, Cash et al. [18] provided a Dynamic POR (DPOR) scheme based on the ORAM technique. Because ORAM incurs heavy bandwidth overhead for client when performing dynamic updates under the POR setting, Shi et al. [14] replaced ORAM with the FFT-based constructible code and the hierarchical storage structure to design a more efficient private DPOR scheme than that of [18], and simultaneously applied the MHT structure to turn this private DPOR into a public DPOR scheme. Furthermore, with the observations made upon previous POR studies, Etemad and Küpçü [19] proposed a general framework to construct efficient DPOR and defend against the selective deletion attack described in [14].
On the other hand, Provable Data Possession (PDP), first proposed by Ateniese et al. [17], is a closely related research direction that focuses on ensuring the integrity of outsourced data. The difference between POR and PDP is that POR applies the erasure code but PDP does not. As shown in [19], the security level of POR is stronger than that of PDP, since POR ensures that the whole outsourced data can be retrieved by client, whereas PDP only guarantees the integrity of most of the outsourced data. Given that some existing public auditing schemes [5,15,16] are designed without involving erasure code, these schemes can be classed as variants of PDP. Zhu et al. [4] presented a cooperative PDP (CPDP) scheme for the distributed multicloud storage setting. Wang et al. [5] designed the random masking technique to protect client's outsourced data from leaking to TPA during the audits. Erway et al. [13] proposed the first Dynamic PDP (DPDP) scheme to support efficient data updates using Skip List. Various authenticated structures were then proposed for data dynamics, such as the standard MHT method [9], rank-based MHT [15], multireplica MHT [16], and rb23Tree [6]. However, all these authenticated structures can only verify different leaves one by one, which is inefficient for client when there are many leaves that need to be verified.
In addition, as shown in [12], when referring to public verification (auditing), the potential security risk is that TPA might also be malicious. But this risk has not been considered by any of the above public schemes. Outsourced Proof of Retrievability (OPOR), proposed by Armknecht et al. [12], is the first scheme to protect against a malicious TPA under the public POR setting. However, the OPOR construction of [12] only supports static data, which is a limitation that should be further solved.

Conclusions
As a stronger security model in the context of remote data auditing, Outsourced Proof of Retrievability (OPOR) focuses on dealing with the dilemma that client hopes to resort to TPA for assessing the storage security of her outsourced data, while TPA might be malicious and collude with CSS to cheat client. In this paper, we propose a concrete DOPOR scheme to support data dynamics under the environment of OPOR. Our DOPOR scheme is constructed based on a newly designed authenticated data structure, called bv23Tree, which not only relies on the property of a balanced tree to guarantee the expected logarithmic complexity in any case of dynamic updates, but also enables client to batch-verify multiple appointed leaves all together for improved performance. Under the setting of employing erasure code, by separating the updated data from the original data and adopting the hierarchical structure of same-sized levels to uniformly store all encoded data, DOPOR can efficiently support batch reads and updates upon outsourced storage by exploiting the batch verification feature of bv23Tree. When compared to the state of the art, our experiments show that DOPOR incurs a lower bandwidth cost for frequent TPA's audits than the original static OPOR scheme, and the overall performance of DOPOR for reads and writes is comparable to that of the existing public Dynamic POR scheme.

[Figure labels: Cloud Storage Server (CSS); continuous random seeds for periodic challenges; reads blocks in a batch; batch update operations; rebuild once every n update operations; unencoded; logs correspond to the POR audits.]

Figure 2: An example of bv23Tree. Besides the verified ordered leaves set L = {V 3 , V 4 , V 8 , V 10 , V 12 , V 14 }, the verifier only needs to retrieve the necessary auxiliary tree nodes (colored in red), which are organized into a special table structure as will be shown later. Then, the verifier can batch-verify the indices and values of these leaves in L all together.

Figure 4: CSS-side hierarchical storage structure for erasure-coded buffer E, where each level stores  encoded blocks along with tags.

Figure 5: TPA-CSS bandwidth cost for a POR audit, with respect to the fraction of challenged blocks of the total number of encoded blocks.

Figure 6: Extra bandwidth cost consumed by client for verifying the leaves of all blocks batch-read from CSS, according to different authenticated data structures.

Figure 7: Computation time spent by client for verifying the integrity of all blocks batch-read from CSS, according to different authenticated data structures.

Figure 8: Client-CSS amortized bandwidth cost for each block when writing all cached blocks in a batch to CSS.

Figure 10: Computation time spent by client for batch-checking TPA's logs (including the time to access the bitcoin source).

GenTags(sk client , ) → {, Φ}: when inputting the client's secret key sk client and the original file  that is an ordered set of raw data blocks {  }, this protocol encodes  into the encoded file  and outputs  as an ordered set of codeword blocks {  }. It also outputs the tags set Φ = {  }, where each   is computed based on sk client and   .
,  2 , . . .,   }, and the CSS state st, this protocol outputs the appointed data blocks set M = {  1 ,   2 , . . .,    }, or false otherwise.
(v) PerformUpdates( root , SO, sk client , st, sp) → {( root * , st * , sp * ), false}: when inputting the tree root hash  root , the set of update operations SO, client's secret key sk client , the CSS state st, and the state pointer sp that is related to st, it outputs a new root hash  root * , a new CSS state st * , and a new pointer sp * showing that all the operations in SO are correctly executed in a batch, or false otherwise.

Rebuild(𝑥 root , sk client , st, sp) → {(st * , sp * ), false}: when inputting the tree root hash  root , client's secret key sk client , CSS state st, and state pointer sp, it outputs a new CSS state st * and a new pointer sp * showing that the rebuilding is completed, or false otherwise.
and decision D TPA .

Table 1: An example of proof table.

Table 2: An example of batch verifications according to Algorithm 2 and the proof table  6×3 of Table 1.

then IS 3 ⊕ 4 = {8, 10, 12, 14}. Algorithm 2 applies each nonnull item  , of  × to iteratively compute the tuple (R  , H  , IS  ). If the returned leaves set L and table  × are right, we will get the following results after the outermost for-loop of Algorithm 2 is finished:
(i) Value H 1 is equal to  root , that is, the authentication hash value of the root of bv23Tree.
(ii) Value IS 1 is exactly the same as { 1 ,  2 , . . .,   }, that is, the indices set of  appointed ordered leaves in L.
Batch Updates(SO, Ψ, $F$) → {$L$, $T$, $h^{*}_{root}$}. The input SO = $\{O_1, O_2, \ldots, O_c\}$ is a batch of update operations, Ψ is the bv23Tree, and $F = \{m_1, m_2, \ldots, m_n\}$ is the whole set of outsourced original blocks. The algorithm outputs an ordered leaf set $L$, the proof table $T$, and the updated root hash value $h^{*}_{root}$ of the final-state bv23Tree Ψ*.
(1) extract from SO the largest ordered set of targeted indices Π = $\{i_1, i_2, \ldots, i_k\}$ by removing duplicate indices;
(2) read the leaf set $L = \{V_{i_1}, V_{i_2}, \ldots, V_{i_k}\}$ from Ψ;
(3) obtain $T$ ← Proof Table(Ψ, Π);
(4) update the file block set $F$ by executing all update operations $\{O_1, O_2, \ldots, O_c\}$ in sequence;
(5) perform each update operation in sequence on Ψ to obtain the final state Ψ*; more specifically, transform Ψ according to each operation, and update the status, rank, and hash value of every affected tree node during each transformation (a modification operation involves no transformation, so only the hash values of the nodes on the path from the targeted leaf to the root need to be updated);
(6) return {$L$, $T$, $h^{*}_{root}$}, where $h^{*}_{root}$ is the authentication hash value of the root of the final state Ψ*;
Algorithm 3: Algorithm for CSS to perform the batch updates.

Verify Updates($h_{root}$, SO, $L$, $T$, $h^{*}_{root}$) → {true, false}. The input $h_{root}$ is the client's local metadata, the update operation set SO = $\{O_1, O_2, \ldots, O_c\}$ is generated by the client herself, and $L$, $T$, $h^{*}_{root}$ are provided by CSS as computed in Algorithm 3. The algorithm outputs true if the batch updates are successful and false otherwise.
(1) extract Π = $\{i_1, i_2, \ldots, i_k\}$ from SO as in Algorithm 3;
(2) if Batch Verify($h_{root}$, Π, $L$, $T$) = true then
(3)     construct the partial bv23Tree with $h_{root}$ as the root hash;
(4) else {Batch Verify($h_{root}$, Π, $L$, $T$) = false}
(5)     return false;
(6) end if
(7) perform each update operation $O_j$ of SO on the above partial bv23Tree using the same transformations as in Algorithm 3, and then compute the final-state root hash $h'_{root}$;
(8) if $h'_{root} = h^{*}_{root}$ then
(9)     replace the local $h_{root}$ with $h^{*}_{root}$ and return true;
(10) else {$h'_{root} \neq h^{*}_{root}$}
(11)     return false;
(12) end if
Algorithm 4: Algorithm for client to verify the result of batch updates.
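The interplay of Algorithms 3 and 4 can be sketched in Python under strong simplifying assumptions: a plain binary Merkle tree (power-of-two leaf count, 0-based indices) replaces the bv23Tree, only modification operations are supported, and every name below (`proof_table`, `fold_root`, `target_indices`, `batch_updates`, `verify_updates`) is hypothetical. The client first checks the old leaves against its stored root, then replays the operations on that partial view and accepts the server's new root only if it matches the root it computed itself. Reusing the same sibling hashes for the new root is sound here because a sibling in the proof never covers a targeted leaf.

```python
import hashlib

def h(tag, *parts):
    d = hashlib.sha256(tag)
    for p in parts:
        d.update(p)
    return d.digest()

def build_levels(blocks):
    """All levels of a binary Merkle tree; len(blocks) must be a power of two."""
    levels = [[h(b"leaf", b) for b in blocks]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(b"node", cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def proof_table(levels, pi):
    """Sibling hashes needed to fold the leaves at sorted indices pi up to the root."""
    proof, known = [], list(pi)
    for level in levels[:-1]:
        present = set(known)
        proof += [level[p ^ 1] for p in known if (p ^ 1) not in present]
        known = sorted({p // 2 for p in known})
    return proof

def fold_root(n, leaves, proof):
    """leaves: {index: block}. Returns the root implied by leaves and proof."""
    known = {i: h(b"leaf", b) for i, b in leaves.items()}
    it = iter(proof)
    while n > 1:
        parents = {}
        for p in sorted(known):
            if p // 2 not in parents:
                sib = known[p ^ 1] if (p ^ 1) in known else next(it)
                pair = (known[p], sib) if p % 2 == 0 else (sib, known[p])
                parents[p // 2] = h(b"node", *pair)
        known, n = parents, n // 2
    return known[0]

def target_indices(ops):
    """Step (1): ordered target indices of the batch with duplicates removed."""
    return sorted({i for i, _ in ops})

def batch_updates(ops, blocks):
    """Server side: returns old leaves, proof, and new root; mutates blocks."""
    pi = target_indices(ops)
    old_leaves = {i: blocks[i] for i in pi}
    proof = proof_table(build_levels(blocks), pi)
    for i, new in ops:  # ops are (index, new_block) modifications
        blocks[i] = new
    return old_leaves, proof, build_levels(blocks)[-1][0]

def verify_updates(root, n, ops, old_leaves, proof, claimed_root):
    """Client side: returns (accepted, root hash the client should keep)."""
    pi = target_indices(ops)
    try:
        if sorted(old_leaves) != pi or fold_root(n, old_leaves, proof) != root:
            return False, root
        leaves = dict(old_leaves)
        for i, new in ops:  # replay in order; later ops override earlier ones
            leaves[i] = new
        new_root = fold_root(n, leaves, proof)
    except (StopIteration, KeyError):
        return False, root  # malformed proof or empty batch
    return (True, claimed_root) if new_root == claimed_root else (False, root)
```

Insertions and deletions are deliberately left out: in the paper they trigger tree transformations that change node status and rank, which a fixed-shape Merkle tree cannot express.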
In addition, the client samples a random element $x \xleftarrow{R} \mathbb{Z}_p$ and computes $v \leftarrow g^{x}$; $x$ is kept secret by the client, but $v$ is public. So the client's private key is $sk_{client} = (x, ssk_{client})$ and her public key is $pk_{client} = (v, spk_{client})$.

(2) GenTags($sk_{client}$, $F$): the client applies an erasure code to encode $F = \{m_i\}_{1\le i\le n}$ into $\tilde{n}$ codeword blocks $C = \{c_i\}_{1\le i\le \tilde{n}}$, where each $c_i$ is $s$ sectors long: $c_i := \{c_{ij}\}_{1\le j\le s}$. The client then generates a name fid for the file and samples $s$ elements $u_1, \ldots, u_s \xleftarrow{R} G$. For each index $i$, $1 \le i \le \tilde{n}$, the client uses her secret key $x$ in $sk_{client}$ to compute for $c_i$ the corresponding tag $\sigma_i \leftarrow \left(H(\mathrm{fid} \,\|\, i) \cdot \prod_{j=1}^{s} u_j^{c_{ij}}\right)^{x}$ and attaches $\sigma_i$ to $c_i$.

(3) OutsourceData($F$, $C$, Φ): based on $F = \{m_1, m_2, \ldots, m_n\}$, the client generates the corresponding bv23Tree Ψ with the root hash $h_{root}$, as shown in Section 3.1. Then the client outsources {$F$, Ψ} into the buffer U and outsources all encoded data blocks $C = \{c_i\}_{1\le i\le \tilde{n}}$ …

(1) ReadBlocks($h_{root}$, Π, st). With her local root hash $h_{root}$ of the bv23Tree, the client can batch-read any $k$ appointed raw blocks from CSS by sending the ordered block index set Π = $\{i_1, i_2, \ldots, i_k\}$ as the query to CSS. Here, let the buffer U be the CSS state st. As described in Section 3, upon receiving Π, CSS accesses the appointed raw blocks $M = \{m_{i_1}, m_{i_2}, \ldots, m_{i_k}\}$ and the tree leaves $L = \{V_{i_1}, V_{i_2}, \ldots, V_{i_k}\}$ from U, generates $T$ ← ProofTable(Ψ, Π), and returns {$M$, $L$, $T$} to the client. The client then batch-verifies the authenticity of $L$ by calling Batch Verify($h_{root}$, Π, $L$, $T$) and finally checks the integrity of every raw block in $M$ against the corresponding hash value stored in $L$.

(2) PerformUpdates($h_{root}$, SO, $sk_{client}$, st, sp). Suppose that the client keeps $O(c)$ local storage to cache $c$ ($c < n$) ordered update operations SO := $\{O_1, O_2, \ldots, O_c\}$. The client then sends SO to CSS, which performs these $c$ operations in a batch. As shown in Section 3.4, on receiving SO, and with the raw data $F$ and the bv23Tree Ψ stored in the buffer U, CSS executes Batch Updates(SO, Ψ, $F$) and returns the results {$L$, $T$, $h^{*}_{root}$} of the batch updates to the client, who calls Verify Updates($h_{root}$, SO, $L$, $T$, $h^{*}_{root}$) to authenticate them. If the results pass the client's authentication, the client applies an erasure code to encode SO into encoded operation blocks. Based on the local state pointer sp, the client first computes the indices …