High-Capacity Reversible Data Hiding in Encrypted Images by Information Preprocessing


Introduction
In the last few years, with the development of digital image processing technologies, the transmission and exchange of images have become increasingly convenient. At the same time, original images have begun to suffer from many security issues. Reversible data hiding (RDH) is a methodology that embeds secret messages into cover images, such as forensic, military, and medical images, in a reversible way, so that the cover images can be completely recovered after the hidden data are extracted.
Reversible data hiding methods can be categorized into four major groups: lossless compression, difference expansion (DE), prediction error expansion (PEE), and histogram shifting (HS). The core idea of lossless compression [1][2][3] is that secret information is embedded into compressed images; however, this approach commonly cannot achieve high-capacity data hiding. Tian [4] proposed a difference expansion information hiding method that explored the vacant room in digital images. The method calculated the differences between pairs of adjacent pixels and embedded secret data into these values; compared to the traditional methods, the data hiding capacity was significantly improved. DE was generalized by Alattar [5], in which n-1 bits can be embedded into a vector with n pixels. Thodi and Rodriguez [6] proposed an extension of DE called prediction error expansion. PEE uses each pixel's prediction error instead of the pixel difference to hide information, which increases the hiding capacity beyond that of DE. A high-fidelity reversible data hiding method was proposed by Li et al. [7], in which a new prediction method called pixel value ordering (PVO) was combined with PEE. For each pixel block, the pixels are reordered into a pixel vector according to their values, and the secret data are embedded into two prediction errors, which correspond to the difference between the smallest pixel and the second smallest pixel and between the largest pixel and the second largest pixel. In [8], all k maximum pixels (or minimum pixels) were treated as a unit for information hiding. These methods performed PVO and PVO-k in a block-by-block manner, and the capacity was generally small. In [9], a pixel-based PVO (PPVO) was proposed, in which the predictions were generated in a pixel-by-pixel manner; thus, the hiding capacity of PPVO was increased. The histogram shifting (HS) [10][11][12] method modifies the image histogram to embed secret information. It can produce high-quality marked images, but its hiding capacity is limited.
With the development of cloud computing, the growth in information technology has led to serious security problems such as copying, hacking, or malicious usage of information. To ensure the secure transmission of digital images over the public network, two kinds of security techniques can be utilized: encryption and data hiding. Reversible data hiding in encrypted images (RDHEI) is an effective method for embedding secret information in the encrypted domain. There are three roles in RDHEI: the content owner, the data hider (cloud manager), and the receiver. The content owner encrypts the original image, the data hider embeds secret information into the encrypted image, and the receiver extracts the secret data and reconstructs the cover image. In recent years, research on RDHEI has paid a lot of attention to the encryption stage [13][14][15][16][17][18][19] and the data hiding stage [20][21][22][23][24], and many RDHEI schemes have been proposed. On the basis of when the embedding space for additional data is created, i.e., before or after image encryption, the embedding mechanisms of the current schemes can be divided into two categories: vacating room after encryption (VRAE) and reserving room before encryption (RRBE). Generally, an RDHEI algorithm based on RRBE can achieve a greater hiding capacity than one based on VRAE. However, an RDHEI scheme based on VRAE can reconstruct the original image without loss. Therefore, the former has relatively wider applicability than the latter.
In [13], the original image was divided into patches that were then represented according to an overcomplete dictionary via sparse coding. Because each patch was regarded as a whole, the number of coefficients was small, and thus a large room was available. The embedding capacity was 1.071 bpp, and the quality of the embedded image was 40 dB. The average maximum embedding rate was 1.7 times as large as that of the previous best alternative scheme. An RDHEI method based on an adaptive encoding strategy has been proposed [20]. During image encryption, block permutation and stream cipher encryption are applied to mask the original image. The permutation of blocks and pixels basically does not change the redundancy of the cover image. The embedding capacity was 1.8319 bpp, and the quality of the embedded image was good. In [15], a new reversible data hiding scheme in encrypted images with public key cryptography was constructed from difference expansion. In the previous RDHEI methods, the encrypted images rarely contained redundant space.
Thus, Liu and Pun [21] proposed a novel RDHEI method based on reversible image reconstruction. The content owner rearranges the cover image to construct a redundancy image; simultaneously, the content of the cover image is made invisible. The rearranged image is used as the encrypted image. In 2019, Liu and Pun [23] proposed a novel reversible data hiding scheme in encrypted images by redundant space transfer. To avoid destroying the majority of the redundant space, the paper designed an image encryption phase with three steps: disordering bit planes, disordering patches, and applying the Arnold transform. To reach better visual image quality, Wu and Sun [22] proposed two reversible data hiding schemes in encrypted images based on the prediction error: a separable method and a joint method. In [25], a lossless RDHEI method based on Chaos-Block was proposed. The paper used features of the pixel difference to embed more data than other methods and carried out refinement with a single-level wavelet decomposition shifting technique to prevent image distortion problems. Because vacating room from the encrypted image without loss is difficult, a completely reversible data hiding method in encrypted images was proposed in [14]. This method empties room by shifting the histogram of the estimation errors of some pixels, which are estimated before encryption. In 2018, Qin et al. [16] proposed a high-capacity RDHEI method, in which the data hider first preprocesses all the encrypted patches by run-length coding and Huffman coding. Thus, a large amount of spare room is vacated to hide certain kinds of messages. In [17], the content owner preprocessed the cover image by run-length coding and block-based MSB plane rearrangement schemes, so that the data hider can achieve high-capacity data hiding in the encrypted image.
In the embedding phase of [26], the encrypted image is adapted according to the error location map; thus, the receiver can extract the secret data perfectly and reconstruct the cover image without any errors. However, the algorithm cannot achieve high-capacity data hiding. In [18], before the image encryption, the content owner calculates the prediction values and marks the original image by Huffman coding. Then, the content owner encrypts the cover image and embeds the label map into the encrypted image. The data hider embeds multibit data in each encrypted pixel by multi-MSB substitution according to the embedded label map. When the label map is large, however, there will be little redundant room. In [19], two RDHEI schemes were proposed, both of which are based on MSB prediction. In the first method, the cover image is preprocessed before encryption. In the second method, the cover image is encrypted, and then the prediction errors are embedded. The former method can embed more secret data into the encrypted image than the latter, but the latter method can reconstruct the original image without loss. To improve the security of the encrypted image and the quality of the decrypted image, Shu et al. [24] proposed an RDHEI method based on neighborhood prediction using XOR-permutation encryption. In this method, the XOR-permutation is used to encrypt the cover image. Thus, the encrypted image can retain redundant room and statistical information.
In all cases, the presented RRBE and VRAE methods are not able to offer a high embedding rate and high reconstructed image quality simultaneously. Although the previously proposed methods can perform very well, they cannot be used for all images. It is necessary to propose a general method for high-capacity reversible data hiding in encrypted images. In this study, we present an efficient MSB method for high-capacity reversible data hiding in encrypted images based on information preprocessing. Because the Arnold transform [27] and the Sudoku [28] permutation encrypt an image by displacing the pixel positions, the pixel values do not change. For this reason, it seems natural to transfer redundant space from the original image to the encrypted image, as many image encryption methods do. However, using the Arnold transform alone to process an image introduces a security risk because the transformation cycle of images with the same size is fixed. In this study, we combine the Arnold transform with the Sudoku permutation to reduce this risk. Normally, most existing methods cannot achieve a high embedding capacity (more than 1 bpp) because the redundant room is limited. In this study, the secret data are preprocessed by the halftone, quadtree [29], and DES [30] algorithms. When the secret data are scanned documents, we convert them from gray-level form to binary values by half-toning and then extract the content using the quadtree algorithm. Whether the original secret message consists of random bits or scanned documents, the message is compressed by S-BOX at the end: each 6-bit group is compressed into 4 bits. For these reasons, we can achieve a high embedding capacity (more than 1 bpp) in the hiding phase. In the decoding phase, the receiver can extract the secret information and reconstruct the original image without loss using MSB prediction.
This study is concerned with high-capacity data hiding and image reconstruction. For this problem, the schemes in earlier studies mainly involve the encryption stage and the data hiding stage. In the previous RDHEI methods, the encrypted image, processed by traditional standard encryption schemes, contains almost no redundant space, and some compression coding methods are not conducive to image reconstruction. It is therefore necessary to achieve both high embedding capacity and high visual quality. In summary, RDHEI schemes based on MSB prediction offer a very high capacity, and we employ this approach to handle the data hiding and image reconstruction operations. The rest of the paper is organized as follows. Section 2 describes the proposed scheme in detail, including image encryption, data hiding, and data extraction and image recovery. Section 3 reports the experimental results and analysis. Finally, the conclusions are drawn and future work is proposed in Section 4.

Proposed Method
At present, few methods succeed in combining high embedding capacity (near 1 bpp) and high visual quality (greater than 50 dB). It is difficult to detect whether an encrypted image contains secret data. Regarding the LSB (least significant bit) and MSB (most significant bit) methods, their confidentiality is similar in the encryption stage, while the prediction of the MSB values is easier to carry out than that of the LSB values during decryption. To achieve high-capacity data hiding in encrypted images, we propose a method for high-capacity reversible data hiding in encrypted images based on information preprocessing. The Sudoku and Arnold transformations both scramble the positions of matrix elements without changing the size of the matrix. The encrypted image obtained through the Sudoku and Arnold transformations has a good encryption effect and also retains redundant space. S-BOX is the core of the DES algorithm. It is the only nonlinear part of the algorithm and the key to its security. The algorithm involves eight S-BOXes, each with a 6-bit input and a 4-bit output. The first and sixth bits of the 6-bit input indicate the row number, and the middle four bits indicate the column number; the output is the value found at that row and column. In this study, the S-BOX is used to preprocess the secret data, and the row number is transmitted through a special secure channel to achieve the effect of compression. The contributions of this study are summarized as follows. (1) We present a new encryption scheme with the Arnold and Sudoku transformations to transfer redundant space from the original image to the encrypted image. In this way, the redundant space is retained and the information of the cover image becomes invisible. (2) We preprocess the secret message before data hiding. The compressed data are embedded in the encrypted image. Thus, we can embed more than one bit per pixel.
Figures 1(a)-1(c) show a flow chart of our scheme, including image encryption, data hiding, and data extraction and image recovery.

Image Encryption.
In the previous RDHEI methods, the encrypted image processed by traditional standard encryption schemes contains almost no redundant space, and an encrypted image processed by compression methods is difficult to reconstruct completely. To avoid these problems, we design an encryption method based on the Arnold transform [23] and the Sudoku [28] matrix that preserves the redundant space and ensures security. The conventional schemes for data hiding are not flexible and offer few opportunities to improve the security of data hiding, so they are easily cracked or guessed. Therefore, more nonfixed and randomized techniques are needed. With the independence and randomness of the Arnold and Sudoku transformations, we enhance the unpredictability of the encrypted image and improve its security.
In this stage, the cover image $I$ of size $M \times N$ is processed by the Arnold transform to generate $I_A$ of the same size.
Then, the image $I_A$ is divided into $N_0$ nonoverlapping blocks $B_m$ of size $2 \times 2$, which are scanned block by block in scan-line order, where $N_0 = (M \times N)/(2 \times 2)$ and $m = 1, 2, \ldots, N_0$. Each subblock is processed by the Arnold transform with a different number of iterations.
Pixels of $I_A$ that do not belong to any $2 \times 2$ subblock are left unchanged. The divided blocks and their pixels are then permuted by using the Sudoku matrix.

Prediction Error Detection.
In this study, the compressed data are embedded by MSB substitution. Hence, the original MSB values are lost after the data hiding phase. During the decoding phase, the previously decoded pixels are used to predict the current pixel value, and this prediction must not produce any errors. Therefore, the content owner needs to analyze the original image content to detect all the possible prediction errors:
(1) Consider the current pixel value $p(i, j)$ and its inverse value $inv(i, j)$.
(2) Consider the average of the left and the top pixels as the predictor $pred(i, j)$, which is also used during the decoding step: $pred(i, j) = (p(i-1, j) + p(i, j-1))/2$.
(3) Calculate the absolute difference between $pred(i, j)$ and $p(i, j)$, as well as between $pred(i, j)$ and $inv(i, j)$, denoted by $e$ and $e_{inv}$, respectively: $e = |pred(i, j) - p(i, j)|$ and $e_{inv} = |pred(i, j) - inv(i, j)|$.
(4) When $e < e_{inv}$, there is no prediction error. Otherwise, there is an error, and we store the value.
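The detection steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that the inverse value is the pixel with its MSB flipped, i.e. `(p + 128) % 256`, that the average is an integer average, that the first pixel is never predicted, and that pixels in the first row or column use their single available neighbour as predictor. The function name `detect_prediction_errors` is hypothetical.

```python
def detect_prediction_errors(img):
    """Return the (i, j) positions where MSB prediction would fail.

    img is a list of rows of 8-bit grey values (0..255).
    """
    errors = []
    for i, row in enumerate(img):
        for j, p in enumerate(row):
            if i == 0 and j == 0:
                continue                      # assumption: first pixel is not predicted
            if i == 0:
                pred = row[j - 1]             # assumption: first row uses the left pixel
            elif j == 0:
                pred = img[i - 1][0]          # assumption: first column uses the top pixel
            else:
                pred = (row[j - 1] + img[i - 1][j]) // 2
            inv = (p + 128) % 256             # assumption: inverse value = MSB flipped
            e = abs(pred - p)
            e_inv = abs(pred - inv)
            if not e < e_inv:                 # step (4): error unless e < e_inv
                errors.append((i, j))
    return errors
```

On a smooth image the list is empty, while an isolated bright pixel produces errors at the pixel itself and at the neighbours whose predictor includes it.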

Sudoku Matrix.
To preserve the redundant space of the original image, we introduce an image encryption method based on the Sudoku matrix and the Arnold transform. Sudoku is a logic-based number-placement game in which the numbers 1 to 9 occur exactly once in each subblock. A Sudoku grid involves nine houses, and each house is divided into nine small squares. Our algorithm employs a Sudoku matrix and selects one subblock as the reference matrix, which is used to permute the pixels. In the 3 * 3 grid of the reference matrix, each value represents the position of the original pixel. In this study, we divide the original image into nonoverlapping blocks of 3 * 3 in size and then permute the pixels of each block according to the reference matrix. Because the Sudoku matrix permutes only the position of each pixel and does not change its value, the redundant space can be preserved. Furthermore, with the
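The block permutation described in this subsection could look as follows. This is a sketch under stated assumptions: the reference matrix is read row by row, each value 1 to 9 names the source position inside the block in row-major order, and pixels outside complete 3 * 3 blocks are copied unchanged. The function name `permute_blocks` and the `inverse` flag are hypothetical.

```python
def permute_blocks(img, ref, inverse=False):
    """Permute the pixels inside every 3x3 block according to a reference matrix."""
    order = [v - 1 for row in ref for v in row]   # 0-based source index per destination
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]                 # pixels outside full blocks stay as-is
    for bi in range(0, h - h % 3, 3):
        for bj in range(0, w - w % 3, 3):
            for dst, src in enumerate(order):
                if inverse:
                    dst, src = src, dst           # undo: send each pixel back to its source
                out[bi + dst // 3][bj + dst % 3] = img[bi + src // 3][bj + src % 3]
    return out
```

Because only positions change, the multiset of pixel values in each block is preserved, and calling the function again with `inverse=True` and the same reference matrix restores the original block.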

Arnold Transform.
In the 1960s, Arnold [27] first proposed the Arnold transform, with which the content owner can encrypt an image by displacing pixel positions. The Arnold transform can be represented as a matrix transformation of the pixel coordinates. We first use the Arnold transform to process the original image $n_i$ times, where $n_i$ is less than the transform cycle $T$, whose value is calculated by formula (4). Then, we divide the processed image into subblocks of size 3 * 3 and permute the pixels by using the Sudoku matrix. After these two steps, the original image $I$ is encrypted, and the redundant space is transferred from the original image to the encrypted image $I_{AS}$. In this study, we set the Sudoku matrix and $n_i$ as the secret keys.
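To make the position scrambling and the transform cycle concrete, the sketch below uses the classical Arnold cat map on a square N x N image, $(x, y) \mapsto ((x + y) \bmod N, (x + 2y) \bmod N)$. This classical form is an assumption, since the paper's own matrix and its formula for $T$ are not reproduced here. The function names are hypothetical, and the cycle is found by brute force rather than by a closed formula.

```python
def arnold(img, times=1):
    """Apply the classical Arnold cat map `times` times to a square image."""
    n = len(img)
    for _ in range(times):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_inverse(img, times=1):
    """Undo `times` applications of arnold()."""
    n = len(img)
    for _ in range(times):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[x][y] = img[(x + y) % n][(x + 2 * y) % n]
        img = out
    return img


def arnold_period(n):
    """Smallest T >= 1 such that T applications restore an n x n image (n >= 2)."""
    a, b, c, d = 1, 1 % n, 1 % n, 2 % n       # the matrix [[1, 1], [1, 2]] mod n
    t = 1
    while (a, b, c, d) != (1, 0, 0, 1):
        # multiply the running matrix by [[1, 1], [1, 2]] modulo n
        a, b, c, d = (a + b) % n, (a + 2 * b) % n, (c + d) % n, (c + 2 * d) % n
        t += 1
    return t
```

Under this form a 2 x 2 block has cycle 3, so the number of iterations applied to such a block can only be 1 or 2 if it must stay below the cycle.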

Data Embedding.
In the last few years, with the development of cloud computing, people have sought to embed more secret information into original images; at the same time, data privacy has become a major problem. In this study, the redundant space is transferred from the original image to the encrypted image. Thus, it is possible to achieve high-capacity reversible data hiding in an encrypted image. In many of the existing schemes, data hiding is carried out by MSB substitution, LSB substitution, or both. The previous methods cannot achieve high-capacity data hiding, with most yielding results near 1 bpp. In the data hiding phase, we compress the secret data and then embed the compressed data into the encrypted image by MSB substitution. In many steganography schemes, the information is sent as random bits, while others make great efforts to embed and send scanned documents as secret data in a safe way. Any multimedia information can be converted into a binary string. However, multimedia information contains a lot of redundancy, so the string will be relatively large if the information is converted directly. Therefore, when multimedia information is the secret message, eliminating redundancy will greatly improve image quality. In this study, when the secret data are a scanned document, we convert the scanned document from gray-level form to binary values by half-toning and then extract the parts that carry information by using the quadtree algorithm. When the message has a random bit form, it is compressed by S-BOX. Because the messages are compressed before data hiding, we can easily increase the embedding capacity.

Halftone and Quadtree.
The scanned document is converted to a binary image by the halftone method. This means that each 8-bit pixel is represented by 1 bit; thus, the size is reduced by a factor of 8. There are many methods for calculating the halftone image from a scanned document image, which are divided into three groups: error diffusion, dither, and iteration. In the present study, the error diffusion method is used to calculate the halftone image. These halftone methods convert each pixel to 1 or 0, and the reverse halftone calculation converts each 1 and 0 bit to an integer value between 0 and 255. The halftone image of a scanned document can include content such as text, images, and tables, which is shown with 0 bits, on a white background, which is shown with 1 bits. In this study, when the secret data are scanned documents, only the document content is useful, not the background itself. Thus, we separate the content from the background and consider the content as the secret data. In [29], an improved quadtree method was proposed, which can be applied to images of any dimensions. Figures 4(a)-4(e) depict a scanned document image, halftone image, subrectangle image, content subrectangle image, and rectangle merging image. The quadtree scheme consists of the following steps:
(1) Scanned document images of any dimension are processed into N * N images.
(2) Each scanned document is converted to a halftone image by error diffusion, and the size is reduced by a factor of 8.
(3) The halftone image is considered a rectangle. When the minimum width and height of the rectangle are larger than 1 * 1, the rectangle is segmented into subrectangles. This means that the rectangle is divided into four subrectangles, and this process is performed repeatedly on all subrectangles until no subrectangle satisfies the division conditions.
(4) Because some subrectangles contain no information, we keep only the content and coordinates of the subrectangles that contain information. Normally, the number of these subrectangles can be high, and the 4 numbers retained as coordinates for each subrectangle increase the final amount of information.
(5) All subrectangles that contain information are merged by scanning neighboring rectangles horizontally and vertically. Therefore, the number of merged rectangles can be small, and 4 numbers are retained as coordinates for each merged rectangle, reducing the size mentioned in step 4.
(6) To ignore the 0s on the left side of a binary bit string, the decimal coding algorithm is used in our study. Hence, we can read longer binary bitstreams and convert them to their equivalent decimal values. To achieve high-capacity data hiding, the data are then processed by S-BOX, and the binary bit strings are compressed by a factor of 1.5.
In our study, the scanned document is processed by the above steps, yielding the rectangle merging image and the coordinates. We can see that the data are compressed very well. For example, we test the title of this paper in Figure 4, where (a) is the original scanned document image (24.3 KB, 24942 bytes), (b) is the halftone image, (c) is the subrectangle image, (d) is the content subrectangle image, and (e) is the rectangle merging image. The data are recorded in text documents: after step 5, we obtain data-bin.txt (192 KB, 197539 bytes); through the decimal coding algorithm, we obtain data-dec.txt (35.6 KB, 36492 bytes); and through the S-BOX operator, we obtain the data to be embedded, datasbox.txt (23.7 KB, 24328 bytes). In this test, the secret data were compressed by a factor of 8.202.
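Steps (2) to (4) can be illustrated with a short sketch. It is not the improved quadtree of [29]: Floyd-Steinberg weights are assumed for the error diffusion (the paper only says "error diffusion"), the split is a plain four-way halving, and the merging of step (5) and the decimal coding of step (6) are left out. The function names and the `min_size` parameter are hypothetical.

```python
def halftone(gray):
    """Floyd-Steinberg error diffusion: rows of 8-bit values -> rows of 0/1 bits."""
    h, w = len(gray), len(gray[0])
    buf = [[float(v) for v in row] for row in gray]
    bits = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            old = buf[i][j]
            new = 255.0 if old >= 128 else 0.0
            bits[i][j] = 1 if new else 0          # 1 = white background, 0 = content
            err = old - new
            # spread the quantisation error over the not-yet-visited neighbours
            for di, dj, wgt in ((0, 1, 7), (1, -1, 3), (1, 0, 5), (1, 1, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    buf[ni][nj] += err * wgt / 16
    return bits


def quadtree(bits, top, left, height, width, min_size=1, out=None):
    """Collect (top, left, height, width) rectangles that contain content (0 bits)."""
    if out is None:
        out = []
    region = [v for i in range(top, top + height) for v in bits[i][left:left + width]]
    if all(v == 1 for v in region):
        return out                                # background only: discard
    if all(v == 0 for v in region) or min(height, width) <= min_size:
        out.append((top, left, height, width))    # uniform content or smallest rectangle
        return out
    h2, w2 = height // 2, width // 2
    for t, l, hh, ww in ((top, left, h2, w2),
                         (top, left + w2, h2, width - w2),
                         (top + h2, left, height - h2, w2),
                         (top + h2, left + w2, height - h2, width - w2)):
        quadtree(bits, t, l, hh, ww, min_size, out)
    return out
```

For a 4 x 4 halftone image that is white except for one black pixel in the top-left corner, `quadtree(bits, 0, 0, 4, 4)` keeps a single 1 x 1 rectangle, so only that rectangle's coordinates and content would need to be transmitted.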

S-BOX.
S-BOX is the core of the DES algorithm. It is the only nonlinear part of the algorithm and the key to the security of the algorithm. The input of the S-BOX is 6 binary digits, and the output is 4 binary digits. According to the characteristics of the S-BOX, it can be used for packet compression and expansion. In this study, the S-BOX is adopted to preprocess the secret data, and the row number is stored and transmitted through a special secure channel. S-BOX is a simple substitution operation in which the input is a bitstream with each 6-bit group considered a set of data. Its output is also a bitstream, but each set of data is converted from 6 bits to 4 bits. Table 1 presents the substitution mapping. Assuming $A = a_1 a_2 a_3 a_4 a_5 a_6$, let $k = a_2 a_3 a_4 a_5$ and $h = a_1 a_6$. In row $h$ and column $k$ of the S-BOX, we find a number $B = b_1 b_2 b_3 b_4$. The value of $B$ is in the range 0 to 15, so the substitution operator compresses each group of secret data from 6 bits to 4 bits. In this study, we use the row numbers of all groups as secret keys, and the secret data are preprocessed in this way before being hidden.
The secret data are a random bitstream or a scanned document. When the secret data are a scanned document, the scanned document is converted into the content and 4 numbers as coordinates for each merged rectangle, and then the data or the bitstreams are substituted by S-BOX. For example, regarding the bitstream 111000101010111000101010111000101010111000101010, first, each 6-bit group is treated as a set. Thus, we obtain 8 groups of data: 111000, 101010, 111000, 101010, 111000, 101010, 111000, and 101010. Second, each group is substituted by S-BOX. Thus, we obtain 8 substitution groups of data, i.e., 0011, 0110, 0011, 0110, 0011, 0110, 0011, and 0110, and the row numbers 10, 10, 10, 10, 10, 10, 10, and 10 are kept as secret keys.
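The substitution and its inverse can be written directly from Table 1. The sketch below is illustrative only: the function names are hypothetical, the bitstream is a Python string of '0' and '1' characters, and its length is assumed to be a multiple of 6 (the text does not say how a shorter final group is padded). The inverse lookup works because every row of the table contains each value 0 to 15 exactly once.

```python
# Table 1 (the S-BOX used in the text), indexed as S_BOX[row][column]
S_BOX = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_compress(bits):
    """Map each 6-bit group to 4 bits; return (compressed bits, row numbers)."""
    if len(bits) % 6:
        raise ValueError("length must be a multiple of 6 (padding is not modelled)")
    out, rows = [], []
    for n in range(0, len(bits), 6):
        group = bits[n:n + 6]
        row = int(group[0] + group[5], 2)     # h = a1 a6
        col = int(group[1:5], 2)              # k = a2 a3 a4 a5
        out.append(format(S_BOX[row][col], "04b"))
        rows.append(row)                      # kept as the secret key
    return "".join(out), rows


def sbox_expand(bits, rows):
    """Invert sbox_compress using the stored row numbers."""
    out = []
    for n, row in enumerate(rows):
        value = int(bits[4 * n:4 * n + 4], 2)
        col = S_BOX[row].index(value)         # each row is a permutation of 0..15
        h = format(row, "02b")
        out.append(h[0] + format(col, "04b") + h[1])
    return "".join(out)
```

Applied to the 48-bit example above, `sbox_compress` returns the 32-bit string made of the groups 0011 and 0110 together with the row list `[2, 2, 2, 2, 2, 2, 2, 2]` (binary 10), and `sbox_expand` restores the original bitstream from them.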

Data Embedding.
In the data hiding stage, the data hider first preprocesses the secret message by the halftone, quadtree, and S-BOX operations. Then, the preprocessed data are embedded in the encrypted image without knowledge of the encryption key. The pixels of the encrypted image are scanned from left to right and then from top to bottom, and the preprocessed data are embedded into the MSB plane by MSB substitution.
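MSB substitution in raster order can be sketched as follows. The function name `embed_msb` is hypothetical, and the payload is assumed to be a string of '0' and '1' characters of at most one bit per pixel. Each marked pixel keeps its seven low bits and takes the payload bit as its MSB.

```python
def embed_msb(encrypted, bits):
    """Replace the MSB of the pixels, in raster order, with the payload bits."""
    width = len(encrypted[0])
    if len(bits) > len(encrypted) * width:
        raise ValueError("payload exceeds one bit per pixel")
    marked = [row[:] for row in encrypted]
    for k, b in enumerate(bits):
        i, j = divmod(k, width)               # left to right, then top to bottom
        marked[i][j] = int(b) * 128 + encrypted[i][j] % 128
    return marked
```

For instance, embedding `"0110"` into `[[200, 17], [128, 255]]` gives `[[72, 145], [128, 127]]`.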

Receiver.
In the decoding phase, the receiver receives the marked encrypted image $p_{em}$ and can then conduct data extraction and image recovery, which include two scenarios: (1) if the receiver has only the data hiding key, the secret message can be extracted; (2) if the receiver has the encryption key and the data hiding key, the secret data and the cover image can be recovered with no error.

Data Extraction.
The pixels of the marked encrypted image are scanned from left to right and then from top to bottom, and the MSB of each pixel is extracted to recover the preprocessed secret message; the $k$-th recovered bit, where $0 \le k < M \times N$ is its index in the preprocessed secret message, is the MSB of the $k$-th scanned pixel. For example, regarding the bitstream 00110110001101100011011000110110, first, each 4-bit group is regarded as a set. Thus, we obtain 8 groups of data: 0011, 0110, 0011, 0110, 0011, 0110, 0011, and 0110. Second, according to the row numbers, each group is substituted by S-BOX. Thus, we obtain 8 substitution groups of data: 111000, 101010, 111000, 101010, 111000, 101010, 111000, and 101010. When the secret data are a scanned document, the bit string is first processed by S-BOX, and then the coordinates and the content of the rectangles can be obtained. Finally, we recover the scanned document.
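Reading the MSB plane back in the same raster order is a one-line operation. In this sketch the function name `extract_msb` and the explicit `n_bits` argument (the number of embedded bits, assumed to be known to the receiver) are hypothetical.

```python
def extract_msb(marked, n_bits):
    """Read the MSB of the first n_bits pixels in raster order as a bit string."""
    width = len(marked[0])
    return "".join(str(marked[k // width][k % width] >> 7) for k in range(n_bits))
```

For the marked image `[[72, 145], [128, 127]]` and `n_bits = 4` the result is `"0110"`. The recovered 4-bit groups are then expanded to 6-bit groups with the stored row numbers, as in the example above.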

Image Recovery.
In this study, three steps are adopted to realize image reconstruction:
① The Sudoku permutation is inverted to obtain the reconstructed image $p_{e1}$.
② The reconstructed image $p_{e1}$ is processed by the inverse Arnold transform, which maps each pixel position $(x', y')$ of $p_{e1}$ to the position $(x'', y'')$ of the reconstructed image $p_{e2}$.
③ The MSB value is calculated: (a) The pixel value is considered for MSB = 0 and for MSB = 1, and we calculate the differences between each of these two values and $pred(i, j)$. The two differences are recorded as $e_1$ and $e_2$. (b) The candidate with the smaller of $e_1$ and $e_2$ gives the reconstructed pixel value.
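The MSB prediction step can be sketched as below, operating on an image that is already back in its original pixel order. The assumptions mirror those commonly made for MSB prediction and are not taken from the paper: the first pixel is left unmarked, pixels in the first row or column use their single available neighbour as predictor, and the image contains no prediction errors. The function name `recover_msb` is hypothetical.

```python
def recover_msb(marked):
    """Rebuild the overwritten MSBs in raster order using MSB prediction."""
    img = [row[:] for row in marked]
    for i in range(len(img)):
        for j in range(len(img[0])):
            if i == 0 and j == 0:
                continue                          # assumption: first pixel is unmarked
            if i == 0:
                pred = img[0][j - 1]              # already reconstructed left pixel
            elif j == 0:
                pred = img[i - 1][0]              # already reconstructed top pixel
            else:
                pred = (img[i][j - 1] + img[i - 1][j]) // 2
            low = img[i][j] % 128                 # the seven bits that survived embedding
            e1 = abs(pred - low)                  # candidate with MSB = 0
            e2 = abs(pred - (low + 128))          # candidate with MSB = 1
            img[i][j] = low if e1 < e2 else low + 128
    return img
```

Because each pixel is predicted from neighbours that have already been reconstructed, the scan order must match the order used during prediction error detection.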

Experimental Results and Comparisons
In this section, we present the results obtained by using our method for high-capacity reversible data hiding in encrypted images based on information preprocessing. To evaluate the performance of the proposed scheme, we first applied it to original images of 512 × 512 pixels from the USC-SIPI image database. PSNR (peak signal-to-noise ratio) was used to evaluate the degradation in visual quality: a high PSNR value indicates that the distortion of the image is imperceptible to the human visual system. SSIM (structural similarity index measurement) was used to evaluate the similarity of two images: an SSIM value of 1 indicates that the reconstructed image and the cover image are the same. Section 3.1 gives a full example of our method and shows the results on the test images. We perform a statistical analysis to test the capacity and the visual security of our method. Finally, in Section 3.2, we compare our approach with related methods and discuss its efficiency. For data hiding in encrypted images, we need to measure several aspects of performance: the number of incorrectly extracted bits, the payload, and the recovered image quality after message extraction. We are interested in discovering a general method to improve the embedding capacity for all images and in finding the best trade-off between all the above parameters.
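For reference, PSNR for 8-bit images can be computed as $10 \log_{10}(255^2 / \mathrm{MSE})$, which is infinite when the two images are identical. The sketch below uses this standard definition; the function name `psnr` and the `peak` parameter are hypothetical, and SSIM is not shown.

```python
import math


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized grey images."""
    n = len(a) * len(a[0])
    mse = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

A perfectly reconstructed image therefore yields `math.inf`, which corresponds to the statement that the PSNR tends toward +∞.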

Detailed Example for the Proposed Study.
In our study, the scanned document of [31] was used as secret information (each page was scanned at a scale of 1700 * 2388). Table 2 lists the sizes of the eight scanned documents. We first applied our method to the test images, sized 512 × 512, from the USC-SIPI image database. In Figure 5, for the same scanned document, the average compression ratios for five different thresholds with the rectangular sizes of 1 × 1, 4 × 4, 8 × 8, 16 × 16, and 32 × 32 were 8.001296, 4.573963, 2.992956, 2.051653, and 1.665751, respectively. Eight different pages were used as secret information, and Lena was adopted as the cover image. In Figure 6, for the Lena image, the embedding capacities for five different thresholds with the rectangular sizes of 1 × 1, 4 × 4, 8 × 8, 16 × 16, and 32 × 32 were 442082 bits, 770898 bits, 1178221 bits, 1721599 bits, and 2119225 bits, respectively. For the same document, when the minimum rectangle size was 1 × 1, we can see that the actual embedding capacity was the highest and the mean hiding capacity of the test image was 1.6865 bpp. Hence, in this study, the minimum rectangle size of 1 × 1 was adopted. The experiment is shown in Figures 7 and 8, in which the embedding rate is 1.4483 bpp when the secret data are a bitstream preprocessed with S-BOX. When the secret data are a scanned document, the embedding rate is 7.714 bpp; in this case, we preprocess the scanned documents with the halftone, quadtree, and S-BOX technologies. For the test image, all pixels are correctly recovered (PSNR = ∞, SSIM = 1).

Table 1: S-BOX.
Row    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
00    14   4  13   1   2  15  11   8   3  10   6  12   5   9   0   7
01     0  15   7   4  14   2  13   1  10   6  12  11   9   5   3   8
10     4   1  14   8  13   6   2  11  15  12   9   7   3  10   5   0
11    15  12   8   2   4   9   1   7   5  11   3  14  10   0   6  13

Table 2 shows the results of applying the proposed method to the test images. The PSNR tends toward +∞, and the embedding rate is more than 1 bpp. However, as the amount of embedded data becomes greater, the quality of the reconstructed image declines.

Security Analysis of the Proposed Study.
We analyze the security of our method through the keyspace, i.e., the total number of possible combinations of the encryption key. Usually, a large keyspace can effectively ensure that encrypted contents are not accessible by unauthorized users. According to the detailed explanation in Section 2, the key includes two parts: (1) the Sudoku matrix, for which the total number of different Sudoku solutions is $6.67 \times 10^{21}$, and (2) the Arnold transform parameters $E_A$. The number of possible combinations of $E_A$ is denoted by $KS_A$. Therefore, the keyspace of our method is $6.67 \times 10^{21} \times KS_A$. Supposing that a 512 × 512 × 8 image is encrypted by our method and the value of $T$ is set to 3, the keyspace is $6.67 \times 10^{21} \times 8 \times 2^{65536}$. This keyspace is sufficiently large to resist many kinds of brute force attacks.
In addition, any secret information extracted by an attacker is garbled; the real message is obtained only after the inverse transformation of the S-BOX.

Comparisons with Related Studies and Discussions.
We applied our method to two different kinds of secret data and present the detailed results obtained on random bits and scanned documents. In Table 3, for random bits as secret data, we compare our method with five existing ones [13,14,19,22,25]. We use the four test images presented in Figure 6; the results are listed in Table 4. The method in [19] can totally recover all the images, with an SSIM equal to 1 and a PSNR that tends toward +∞. The methods in [13,14,22,25] cannot totally recover all the cover images, and the embedding capacity of each of these four methods is less than 1 bpp. With our proposed approach, we achieve 1.3824 bpp, with an SSIM equal to 1 and a PSNR that tends toward +∞. In Table 5, the secret data are a scanned document; in our method, the data actually embedded are the content and the coordinates of each merged rectangle, compressed by S-BOX. Thus, the embedding capacity is higher: we achieve 7.714 bpp, with an SSIM equal to 1 and a PSNR that tends toward +∞. In conclusion, in addition to being error-free during secret data extraction, our method offers a good trade-off between the hiding capacity and the recovered image quality after data extraction. From the security point of view, the statistical analysis shows that there is no information about the content of the cover image in the marked encrypted version. Most importantly, we use the S-BOX substitution, which makes the information useless to an attacker even if it is extracted.

Conclusions
In this study, we proposed an efficient MSB method for high-capacity data hiding in encrypted images based on information preprocessing that outperforms the previous state-of-the-art methods. We encrypt the original image by an Arnold and a Sudoku transformation; thus, we retain the redundant room. To achieve high-capacity reversible data hiding, we preprocess the secret data before the embedding operation. We can see that the S-BOX phase sacrifices memory for embedding capacity. In this study, the S-BOX and Sudoku matrix are saved and sent to the receiver through a separate encrypted channel. In future work, we are interested in hiding more secret data; for example, multiple compression transformations can be applied to the secret data before embedding, and the second MSB of each pixel can be used to enlarge the amount of embedded data.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no financial conflicts of interest.