A Privacy-Preserving Outsourcing Data Storage Scheme with Fragile Digital Watermarking-Based Data Auditing

Cloud storage has been recognized as a popular solution to the rising storage costs of IT enterprises. However, outsourcing data to cloud service providers (CSPs) may leak sensitive private information, as the data is out of the user's control. Ensuring the integrity and privacy of outsourced data has therefore become a big challenge. Encryption and data auditing provide a solution to this challenge. In this paper, we propose a privacy-preserving and auditing-supporting outsourcing data storage scheme by using encryption and digital watermarking. A logistic map-based chaotic cryptography algorithm is used to preserve the privacy of outsourced data; it has a fast operation speed and a good encryption effect. A local histogram shifting digital watermark algorithm is used to protect data integrity; it has a high payload and allows the original image to be restored losslessly if the data is verified to be intact. Experiments show that our scheme is secure and feasible.


Introduction
With the development of cloud computing, outsourcing data to cloud storage servers has become a popular choice for firms and individuals. Cloud storage reduces data storage and maintenance costs, and it gives users a flexible and convenient way to access their data anywhere. However, the cloud service providers (CSPs) may not be honest, and the data should not be disclosed to them. Encryption is a fundamental method to preserve data confidentiality, so the data owner encrypts the data before outsourcing it to CSPs. Many problems of querying over the encrypted domain are discussed in the research literature [1][2][3]. In addition, data owners worry whether the outsourced data is modified or revealed by the CSPs. It is therefore necessary to add a data auditing service to the outsourcing data storage scheme.
In the existing outsourcing data storage schemes, the data auditing methods can be classified into three categories: message authentication code-(MAC-) based methods, RSA-based homomorphic methods, and Boneh-Lynn-Shacham signature-(BLS-) based homomorphic methods [4]. In these methods, a MAC or digital signature is computed over the data, and the verification information needs to be attached to the original data. If the data is digitally signed, any change in the data after signing invalidates the signature. Furthermore, these methods increase the data size and the signing time, which is inconvenient for digital media (images, video, audio, etc.). We therefore use digital watermarking technology to offset this deficiency. Digital watermarking hides watermark information in the digital media without affecting data utilization, and it reduces the communication and computation costs. This means digital watermarking can provide a more effective auditing method than other cryptographic protocols.
Many works on outsourcing data storage schemes with digital watermarking have been proposed. N. Singh and S. Singh [5] point out that combining digital watermarking with cloud computing can significantly increase the robustness of the system as well as the security of users' data. Boopathy and Sundaresan [6] propose a model of data storage and access with digital watermarking technology in the cloud. Though they do not give a concrete realization, the model shows the broad prospects of applying digital watermarking technology in the cloud environment. In addition, digital watermarking technology is used for data auditing in the cloud environment. Wang and Lian [7] focus on the application scenarios of multiwatermarking in the cloud environment by investigating secure media distribution models. Ren et al. [8] propose a provable data possession scheme based on a self-embedded digital watermark for auditing service. However, these works do not provide privacy preservation through encryption. We believe that supporting privacy preservation is of vital importance to outsourcing data storage.
In this paper, a logistic map-based chaotic cryptography algorithm is used to preserve the privacy of outsourced data; it has a fast operation speed and a good encryption effect. Traditional encryption techniques such as AES, DES, and RSA are slow when encrypting media data, so they are not suitable for the real-time requirements of media data transmission. Chaotic cryptography has many good characteristics such as sensitivity to the initial value, pseudorandom properties, and ergodicity. The logistic map is a simple nonlinear model with complex dynamics, and it is widely used in image encryption. In this paper, the logistic map-based chaotic cryptography method is used to permute the positions of the image pixels in the spatial domain. The result is suitable for embedding watermark information later with the local histogram shifting digital watermark algorithm. The local histogram shifting digital watermark algorithm is utilized to protect data integrity. It has a high payload and allows the original image to be restored losslessly if the data is verified to be intact.
We propose an outsourcing data storage scheme supporting auditing service by using fragile digital watermarking technology. Meanwhile, the scheme uses encryption methods to preserve privacy. In this scheme, digital watermarking technology and encryption methods are used to enhance the integrity and privacy of outsourced data storage. Our contributions are as follows.
(i) We propose an outsourcing data storage scheme supporting privacy-preserving and auditing service.
In this scheme, we use a scrambling encryption algorithm based on the logistic chaotic map, which has a fast operation speed and a good encryption effect. Besides, the local histogram shifting digital watermark algorithm [9] is used to embed the watermark; it has a high payload and allows the original image to be restored losslessly if the data is verified to be intact.
(ii) To reduce data owners' overhead, a third-party auditor (TPA) is used to verify the integrity of data in the cloud. The TPA verifies the data integrity in the encrypted domain, which ensures data confidentiality in the auditing process.
The rest of this paper is organized as follows. Section 2 summarizes the related work. Section 3 introduces the proposed scheme. Experimental results are given in Section 4. Section 5 concludes the paper and discusses future work.

Related Work
Many secure outsourcing data storage schemes have been proposed in recent years. The privacy and integrity of data in the cloud are the main concerns of data owners. Outsourced data is often distributed geographically across different locations. CSPs can access the stored data if it is stored in plain format, and data owners lose control over their data after it is uploaded to the cloud. For this reason, data containing private information [10] or sensitive information [11] is encrypted in the data storage schemes.
To verify data integrity, data auditing is considered in outsourcing data storage schemes. Ateniese et al. [12] first define the provable data possession (PDP) model for auditing service in untrusted storage. Juels and Kaliski Jr. [13] describe a proof of retrievability (POR) model, which ensures both "possession" and "retrievability" of data files. Sravan Kumar and Saxena [14] propose a proof of data integrity in the cloud, which could be agreed upon by both clients and the server via the Service Level Agreement (SLA). Hao et al. [15] propose the first protocol that provides public verifiability without TPA. Lu et al. [16] exploit the secure provenance model, which consists of the following modules: system setup, key generation, anonymous authentication, authorized access, and provenance tracking. Their scheme is based on bilinear pairing techniques, and it records the ownership and the process history of data objects to increase the trust of public users. However, all these methods require additional data to verify data integrity and are not suitable for multimedia files. Digital watermarking technology can offset this deficiency and is an effective method for data auditing. Digital watermarking can be divided into spatial domain and frequency domain methods [17]. A spatial domain digital watermark directly embeds watermark information into the image pixels. A frequency domain [18] algorithm embeds watermark information into the coefficients of a transform domain.
Encryption is a fundamental method to preserve data confidentiality in outsourcing data storage schemes, and digital watermarking technology is an effective method for data auditing. Methods of embedding a digital watermark in the encrypted domain have been proposed [6,19,20]. In the medical domain, many healthcare information systems (HISs) [21] have been proposed. Haas et al. [22] propose a privacy-protecting information system for controlled disclosure of personal data to third parties. This scheme uses authentic log files to check the completeness of data, and digital watermarking is used for tracing nonauthorized data disclosure. In the field of information hiding, Zhang [19] encrypts the image with a simple exclusive-OR operation using a stream cipher and embeds watermark information by flipping the 3 LSBs of each encrypted pixel. Zhang [20] further proposes a scheme which makes watermark extraction independent of image decryption, so that a user can extract data from the encrypted image directly. Yin et al. [9] propose a scheme with a multigranularity encryption algorithm and the local histogram shifting digital watermark algorithm, which ensures larger embedding capacity and better embedding quality. Chaotic scrambling encryption, in turn, is widely used in image encryption. The common choices are the one-dimensional logistic map, the two-dimensional Smale and Henon maps, and the three-dimensional Lorenz map. The logistic map is a simple nonlinear model with complex dynamics, and it gives a good encryption effect at high speed.
In our scheme, we combine encryption technology with watermark technology. The data owner encrypts the image before transmission. CSP embeds some additional message into the encrypted image without knowing the original image content. TPA is required to extract the watermark from the encrypted image. A user can first decrypt the encrypted image containing watermark information with the decryption key and then extract the embedded watermark from the decrypted version with the extraction key. The transmission of encryption keys is assumed to be secure and is not discussed here. The logistic map-based chaotic cryptography method is used to permute the positions of the image pixels in the spatial domain, so the histogram of the encrypted version is the same as that of the original image. This histogram property makes the encryption method suitable for embedding watermark information with the local histogram shifting digital watermark algorithm [9]. This is a blind fragile watermark algorithm: the extraction of the watermark needs neither the original image nor the original watermark information. Its error-free decryption can be used for military, remote sensing, and medical data.

Proposed Scheme
In this section, we first analyze the framework of the system and then give the main steps of our scheme.

System Model.
We first give the sketch of the proposed scheme in Figure 1. The four parties in the scheme are described as follows.
(i) Data owner encrypts an original image with an encryption key, computes verification information for the encrypted image to serve as watermark information, embeds it into the encrypted image with the embedding key, and uploads the encrypted image to CSP.
(ii) CSP stores the watermark-embedded encrypted image.
(iii) TPA extracts the watermarking information with the extraction key in the encrypted domain to verify the integrity, and reconstructs the image if it is intact.
(iv) Data user receives the reconstructed image from TPA and decrypts it exactly to the original image with the decryption key.

Main Steps of Proposed Scheme.
The proposed scheme contains four modules: image encryption, watermarking embedding, watermarking extraction, and image decryption.
The main steps of the proposed scheme are shown as follows.

Image Encryption.
Data owner creates an original image. Assume it is a grayscale image of fixed pixel dimensions in uncompressed format. The process of image encryption is as follows.
(iv) Scramble the sequence of image with the same location set.
The encryption key consists of the initial value and the control parameter of the logistic map. The encrypted image is then generated. This algorithm is simple and has good performance, and it preserves the histogram of the image.
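The permutation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic recurrence x ← μx(1 − x), the function names, and the use of a flat pixel list are assumptions, and the "generate, sort, record the location set, scramble" steps follow the decryption description in this paper. The values 0.5 and 3.7 are the ones reported in the experiments.

```python
def logistic_sequence(x0, mu, n):
    """Iterate the logistic map x <- mu * x * (1 - x) and return n values."""
    seq, x = [], x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return seq


def location_set(x0, mu, n):
    """Sort the chaotic sequence and record where each sorted value came from."""
    seq = logistic_sequence(x0, mu, n)
    return sorted(range(n), key=lambda k: seq[k])


def encrypt(pixels, x0, mu):
    """Permute pixel positions with the location set (values are unchanged)."""
    order = location_set(x0, mu, len(pixels))
    return [pixels[k] for k in order]


def decrypt(cipher, x0, mu):
    """Regenerate the same location set from the key and undo the permutation."""
    order = location_set(x0, mu, len(cipher))
    plain = [0] * len(cipher)
    for pos, k in enumerate(order):
        plain[k] = cipher[pos]
    return plain


# Hypothetical 4 x 4 grayscale image flattened row by row.
image = [12, 200, 37, 37, 90, 91, 255, 0, 5, 5, 5, 128, 64, 32, 16, 8]
cipher = encrypt(image, x0=0.5, mu=3.7)
restored = decrypt(cipher, x0=0.5, mu=3.7)
```

Because only positions change, the sorted pixel values of `cipher` and `image` are identical, which is the histogram-preserving property the watermarking step relies on.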

Watermarking Embedding.
The embedded watermarking information should be unpredictable and random. The Arnold transform or chaotic encryption can be used to improve the security of the image watermarking algorithm. The above-mentioned encryption algorithm preserves the histogram of the image. Therefore, the local histogram shifting watermarking algorithm is suitable for embedding data into the encrypted image [9].
When data owner embeds watermarking information into the encrypted image, the steps are as follows.
(i) Divide the encrypted image into blocks of pixels of equal size.
If the two peaks of a block are equal, the larger peak is set to the smaller peak plus 1.
(iv) Saturated pixels (valued 0 or 255) have to be preprocessed by modifying them by one grayscale unit. They are recorded in a location map so that saturated pixels do not overflow or underflow during the embedding process. Scan the pixels block by block and append bit "1" to the location map when the pixel value is 1 or 254. Append bit "0" to the location map when the pixel value is 0 or 255, and modify the pixel by one grayscale unit. The embedding capacity of each block is the number of pixels whose values are equal to the peak points in that block.
(v) The embedded information consists of the location map and the histogram information of the image. Scan the nonbasic pixels in each block. If the scanned pixel equals either peak value, a bit (0 or 1) of the embedded information is embedded by modifying the pixel.
The encrypted image with embedded data is obtained. The embedding key consists of the block parameter, two length values, and the seed. The data owner outsources the encrypted image with embedded watermarking information to the cloud. Then the watermark embedding key is transferred to TPA and the decryption key is shared with the legal users.
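As an illustration of the embedding steps, the sketch below applies a common form of two-peak histogram shifting to a single block and then reverses it. The exact modification rule of the scheme is not reproduced in this text, so the shifting rule, the function names, and the use of Python's seeded `random.Random` to pick the two basic pixels are all assumptions. The block is assumed to have been preprocessed already so that no pixel is 0 or 255.

```python
import random


def pick_peaks(block, seed):
    """Pick two basic pixels with a seeded RNG; the peaks are their min and max."""
    i, j = random.Random(seed).sample(range(len(block)), 2)
    g_left, g_right = min(block[i], block[j]), max(block[i], block[j])
    if g_left == g_right:          # equal peaks: move the larger one up by 1
        g_right = g_left + 1
    return (i, j), g_left, g_right


def embed_block(block, bits, seed):
    """Embed bits into nonbasic pixels equal to a peak; shift the outer pixels."""
    basic, gl, gr = pick_peaks(block, seed)
    marked, used = list(block), 0
    for k, p in enumerate(block):
        if k in basic:             # basic pixels stay untouched
            continue
        if p == gl or p == gr:
            bit = bits[used] if used < len(bits) else 0
            used += 1 if used < len(bits) else 0
            marked[k] = p - bit if p == gl else p + bit
        elif p < gl:
            marked[k] = p - 1      # make room left of the left peak
        elif p > gr:
            marked[k] = p + 1      # make room right of the right peak
    return marked, used


def extract_block(marked, seed):
    """Blindly recover the embedded bits and the original block."""
    basic, gl, gr = pick_peaks(marked, seed)   # basic pixels were not modified
    restored, bits = list(marked), []
    for k, q in enumerate(marked):
        if k in basic:
            continue
        if q in (gl, gr):
            bits.append(0)
        elif q == gl - 1:
            bits.append(1)
            restored[k] = gl
        elif q == gr + 1:
            bits.append(1)
            restored[k] = gr
        elif q < gl - 1:
            restored[k] = q + 1
        elif q > gr + 1:
            restored[k] = q - 1
    return restored, bits


block = [100, 102, 101, 100, 103, 102, 99, 100, 104, 102, 100, 101, 102, 98, 100, 102]
payload = [1, 0, 1, 1, 0, 1]
marked, used = embed_block(block, payload, seed=7)
restored, extracted = extract_block(marked, seed=7)
```

The extractor needs only the marked block and the seed, which matches the blind extraction described in this paper; how many payload bits fit depends on how many nonbasic pixels equal a peak.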

Watermarking Extraction and Data Auditing.
TPA extracts the watermarking information with the extraction key before the user downloads the data from the cloud. TPA can extract the watermarking information only from the encrypted domain, which ensures data privacy.
This blind extracting algorithm is shown as follows.
(i) Divide the watermarked encrypted image into blocks of pixels of equal size. Determine the two basic pixels in each block.
(ii) The absolute difference between the two basic pixels is calculated to estimate the smoothness of each block. Blocks with a smaller difference have higher priority to be chosen for extracting data.
(iv) Scan nonbasic pixels in each block. For each scanned pixel, the embedded bit is extracted according to the pixel's value relative to the two peaks. The extracted bits consist of the location map and the histogram information.
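The block ordering used in both embedding and extraction can be sketched as below. The function name and the representation of each block by its pair of basic pixel values are hypothetical; the only rule taken from the text is that a smaller absolute difference means a smoother block and a higher priority.

```python
def block_priority(basic_pairs):
    """Return block indices ordered from smoothest (smallest difference) to roughest."""
    diffs = [abs(a - b) for a, b in basic_pairs]
    # sorted() is stable, so blocks with equal differences keep their scan order.
    return sorted(range(len(diffs)), key=lambda k: diffs[k])


# Hypothetical basic pixel pairs for three blocks.
order = block_priority([(100, 110), (50, 50), (7, 4)])
```

Since the basic pixels are never modified, the embedder and the extractor compute the same ordering without exchanging it.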
TPA verifies the data integrity after extracting the watermark information.
The auditing process is as follows.
(i) Scan nonbasic pixels in each block. Each scanned pixel is restored to its original value by reversing the histogram shift. The reconstructed encrypted image is generated.
If the computed comparison value is below the given threshold, the watermark information is correct and the data is verified to be intact.
(i) Generate a chaotic sequence whose length equals the number of image pixels, using the decryption key.
(ii) Sort the chaotic sequence and record the location set.
(iii) Scramble the sequence of image and restore a decrypted image with the location set.
Then the original image is obtained by the legal users.

Experimental Results
To study the performance of the proposed scheme, MATLAB 7 is used. The 8-bit grayscale test image Lena, sized 512 × 512 pixels, is selected as the original image and is shown in Figure 2(a). The experimental results of the proposed scheme are shown in Figure 3.
The quality of the encrypted image can be evaluated by the Peak Signal-to-Noise Ratio (PSNR), computed between the original image and the image with watermark information, which have the same size. The mean square error (MSE) evaluates the error between the original image and the decrypted image. Table 1 lists the embedding payloads and MSEs for the images Lena, bridge, aerial, and dollar without any attacks.
From Table 1, the MSEs between the decrypted version and the original image are 0. This means the encrypted image is reconstructed error-free during the watermark extraction and data auditing process if the data in the cloud is not attacked. The payload is enough for embedding the verification information.
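For reference, the two quality measures can be computed as in the following sketch. It uses the standard definitions for 8-bit images (peak value 255); the function names and the flat-list image representation are assumptions made for illustration, and the experiments in this paper were run in MATLAB rather than with this code.

```python
import math


def mse(original, other):
    """Mean square error between two equally sized images given as flat lists."""
    if len(original) != len(other):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(original, other)) / len(original)


def psnr(original, other, peak=255):
    """Peak Signal-to-Noise Ratio in dB; infinite when the images are identical."""
    err = mse(original, other)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


# Hypothetical pixel data: an image compared with an error-free reconstruction.
image = [12, 200, 37, 90, 255, 0, 5, 128]
reconstructed = list(image)
```

An MSE of 0, as reported in Table 1, corresponds to an unbounded PSNR, which is why the sketch returns infinity in that case instead of dividing by zero.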
In this paper, the watermark algorithm is fragile, so it cannot withstand any attack: any modification of the stored data invalidates the watermark. This property makes it suitable for military, remote sensing, and medical images.

Conclusion and Future Work
In this paper, we propose a privacy-preserving and auditing-supporting outsourcing data storage scheme by using encryption and digital watermarking. The proposed scheme combines digital watermark technology with encryption methods for outsourcing data storage, and it supports both auditing service and privacy preservation. We adopt the logistic map-based chaotic cryptography algorithm for image encryption and the local histogram shifting watermarking algorithm [9] for embedding data integrity verification information. This scheme has high authentication precision and can be used for high quality images.
In the future, we will add a semifragile watermark to verify the integrity of images, which can tolerate some benign image operations, such as JPEG compression. We can also apply algorithms that support tamper localization and recovery.

Figure 1: Sketch of the proposed scheme.

Image Decryption.
The legal users can decrypt the reconstructed encrypted image using the decryption key and obtain the original image. The decryption process is as follows.
With the original image in Figure 2(a), we use the logistic map-based chaotic cryptography algorithm (initial value 0.5, control parameter 3.7) to generate an encrypted image, which is shown in Figure 2(b). The encrypted image containing watermarking information is shown in Figure 2(c). After the watermarking information is extracted by TPA, a reconstructed image is shown in Figure 2(d). Then the legal user can decrypt the reconstructed image. The decrypted image is shown in Figure 2(e).

Figure 3: Experimental results of the proposed scheme.

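$$\mathrm{PSNR} = 10 \times \log_{10} \frac{255^{2}}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i,j) - I_{w}(i,j) \right)^{2},$$
where I denotes the original image, I_w the image with watermark information, and M × N the image size in pixels.
Two basic pixels are randomly selected in each block with the seed of the random permutation.
(ii) Calculate the absolute difference between the two basic pixels to estimate the smoothness of each block. Blocks with a smaller difference are smoother than blocks with a larger difference and have higher priority to be chosen for carrying data.
(iii) Determine the two peaks in each block as the minimum and the maximum of the two basic pixels.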

Table 1: Payload bits and MSE.