Constructing HVS-Based Optimal Substitution Matrix Using Enhanced Differential Evolution

Least significant bit (LSB) substitution is a method of information hiding. The secret message is embedded into the last k bits of each pixel of a cover-image in order to evade the notice of hackers. Security and stego-image quality are the two main limitations of the LSB substitution method. Therefore, some researchers have proposed an LSB substitution matrix to address these two issues. Finding the optimal LSB substitution matrix can be formulated as a combinatorial optimization problem. In this paper, we adopt a different heuristic method developed by other researchers, called enhanced differential evolution (EDE), to construct an optimal LSB substitution matrix. Unlike other researchers, we adopt an HVS-based measurement as the fitness function and embed the secret by modifying the pixel to the closest suitable value rather than simply substituting the LSBs. Our scheme extracts the secret by modular operations, just as simple LSB substitution does. The experimental results show that the proposed embedding algorithm substantially improves the imperceptibility of stego-images.


Introduction
The internet provides an easy way to exchange information with others. However, information is also prone to eavesdropping by hackers. Several methods can be employed to protect secret information, such as cryptography, steganography, and secret sharing schemes. These methods rest on fundamentally different ideas. Cryptography scrambles the content with a private key. Without the appropriate key, unauthorized users cannot decode the secret within limited time and resources. Unlike cryptography, steganography, also called information hiding, conceals the secret rather than scrambling it. That is, the secret is covered by innocent-looking information that does not attract the attention of hackers, who therefore pass over it. Owing to its simplicity and efficiency, steganography remains a popular method [1][2][3][4].
Among the many steganographic methods, simple least significant bit (LSB) substitution is the most common one [5]. The secret message is decomposed and embedded into the least significant bits of each pixel of the cover-image. The modified cover-image is called a stego-image. The secret message can be extracted by performing a modular operation on each pixel of the stego-image. This method is very simple and easy to implement; however, the stego-image is sometimes not imperceptible enough when more least significant bits are substituted. Wang et al. proposed the idea of a substitution matrix to improve the quality and security of the stego-image [6]. The substitution matrix can be seen as a mapping function, which maps each secret value into another value. Different substitution matrices represent different mappings and result in different stego-images. Among these stego-images, some are closer to the original cover-image than others. The optimal substitution matrix is the one that produces the stego-image closest to the cover-image. Because the number of possible substitution matrices is huge, Wang et al. utilized genetic algorithms (GA) [7] to find the optimal substitution matrix. According to the patterns of chromosomes, GA can be classified into two types: binary GA and real-parameter GA. In the course of evolution, binary GA has to encode the original problem into binary chromosomes, and the encoding method may influence the efficiency of problem-solving. However, some problems, such as combinatorial optimization problems, are not easy to encode into binary chromosomes. Furthermore, the chromosomes for such problems may be too long for the problems to be solved efficiently. Although real-parameter GA can encode a problem with shorter chromosomes, its efficiency and the quality of its final solution are not as good as those of binary GA. Therefore, each kind of GA has its limitations.
Later, other researchers adopted different heuristic methods, such as tabu search [8], the ant colony algorithm [9], and cat swarm optimization [10], to construct an optimal substitution matrix. All of these researchers adopted simple LSB substitution to embed the secret. However, even when an optimal substitution matrix is adopted, the modification to the pixels of the cover-image may still be large due to the intrinsic features of simple LSB substitution. Consequently, the improvement in imperceptibility may be limited. Besides, these researchers adopted the peak signal-to-noise ratio (PSNR) as a fitness function to measure how close the stego-image is to the cover-image. The PSNR value represents the average difference between pixels; however, the difference between pixels does not always correspond to human perception. Generally speaking, human eyes can tolerate modifications to textured areas more than those to smooth areas [11]. Images are, after all, viewed by human eyes; hence it is more suitable to use a measurement based on the human visual system (HVS) to evaluate the imperceptibility of the stego-image.
This paper also proposes a method to construct an optimal substitution matrix. Nevertheless, the proposed scheme differs from similar studies in three aspects. First, secrets are embedded by changing the pixel values rather than by substituting the least significant bits directly. The extraction, however, remains easy and is the same as that of simple LSB substitution. Second, another heuristic method, called enhanced differential evolution (EDE) [12], is adopted to search for the optimal substitution matrix. Differential evolution (DE) was first introduced by Storn and Price [13] and copes with problems whose feasible solutions are continuous values. Later, Onwubolu and Babu extended the capability of DE to handle problems with discrete solutions. Third, an HVS-based fitness function, called structural similarity (SSIM) [14], is employed to measure the difference between the stego-image and the cover-image. Therefore, our stego-image is not only numerically close to the cover-image but is also perceived as similar to the cover-image by human eyes.
The rest of this paper is organized as follows. In Section 2, some preliminary knowledge for our work is provided, and some related literature is reviewed. In Section 3, the way of constructing an optimal substitution matrix and the embedding and extraction methods are explained in detail. Then, the experimental results and comparisons with other researchers' methods are presented in Section 4. Finally, we give some conclusions in Section 5.

Literature Review
2.1. Simple LSB Substitution.
Simple LSB substitution is one of the earliest steganographic techniques. The so-called least significant bits are the least important part of a pixel. Therefore, modifying the LSBs of pixels does not change an image much. Embedding and extracting secrets are very simple and easy to implement. Suppose that $p$ denotes a pixel of the cover-image and is expressed as
$$p = q \times 2^k + r. \tag{1}$$
That is, $q$ and $r$ are the quotient and the remainder when $p$ is divided by $2^k$. Suppose that $s$ is a $k$-bit secret. The secret $s$ can be embedded into $p$ by means of the following:
$$p' = q \times 2^k + s. \tag{2}$$
Performing a modulo operation on the stego-pixel $p'$, as shown in (3), extracts the secret $s$:
$$s = p' \bmod 2^k. \tag{3}$$
Simply speaking, the secret $s$ is embedded by directly substituting the last $k$ bits of $p$ and is retrieved from the last $k$ bits of $p'$. Take a pixel $(00100000)_2$ and a 3-bit secret message $(111)_2$ as an example. Since the length of the secret message is three, we can substitute the last three bits of the pixel with the secret. Using (1) and (2), we get the stego-pixel $(00100111)_2$. If the secret is very large, it is divided into segments of fixed length, which are distributed evenly over the pixels of the cover-image.
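The embedding and extraction rules in (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names `lsb_embed` and `lsb_extract` are hypothetical, and the example values are the ones used in the text.

```python
def lsb_embed(pixel: int, secret: int, k: int) -> int:
    """Replace the k least significant bits of `pixel` with `secret` (Eqs. (1)-(2))."""
    if not 0 <= secret < (1 << k):
        raise ValueError("secret must fit in k bits")
    quotient = pixel >> k              # quotient of pixel divided by 2^k
    return (quotient << k) + secret    # quotient * 2^k + secret


def lsb_extract(stego_pixel: int, k: int) -> int:
    """Recover the k-bit secret with a modulo operation (Eq. (3))."""
    return stego_pixel % (1 << k)


stego = lsb_embed(0b00100000, 0b111, 3)
print(format(stego, "08b"))    # 00100111
print(lsb_extract(stego, 3))   # 7
```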
There are two problems with this method. First, the larger the secret message is, the more bits of the cover-image have to be modified. Hence the stego-image may become too different from the cover-image to conceal the secret message inside. Second, the simplicity is a double-edged sword. The receiver can recover the secret easily, but so can hackers. Therefore, the security of this method has to be enhanced.

Substitution Matrix.
In 2001, Wang et al. introduced the substitution matrix to improve the quality and security of the stego-image [6]. Briefly speaking, a substitution matrix is used to replace the secret value with another value. Wang et al.'s method can be summarized as follows. At first, the secret $S$ is divided into segments of $k$-bit length, and then the order of the segments is randomly permuted. Suppose that $D'$ denotes the set of reordered segments of $S$ and that
$$D' = \{d'_1, d'_2, \ldots, d'_n\},$$
where $n$ is the total number of segments of $S$. Let $M$ denote a $2^k \times 2^k$ substitution matrix, and $M = [m_{i,j}]$, where $0 \le i, j \le 2^k - 1$ and $m_{i,j} \in \{0, 1\}$. According to $M$, every element $d'_i$ of $D'$ is changed into another value $d^*_i$ as follows:
$$d^*_i = j \quad \text{if } m_{d'_i,\, j} = 1.$$
Note that there is exactly one "1" in each row and each column.
In short, the substitution matrix can be seen as a one-to-one mapping function from $V$ to $V$, where $V$ is the set of all possible integer values of a segment $d'_i$. Then, the mapping result is embedded into the cover-image by means of simple LSB substitution. The following serves as an example:
$$M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad D' = \{1, 3, 0, 2, 1\}.$$
The elements $m_{0,0}$, $m_{1,2}$, $m_{2,1}$, and $m_{3,3}$ of $M$ indicate that the possible values 0, 1, 2, and 3 of a segment are substituted with 0, 2, 1, and 3, respectively. Therefore, $D'$ is changed into another set $D^*$ as {2, 3, 0, 1, 2}. Obviously, there are various possible substitution matrices. Different substitution matrices produce different $D^*$ and hence different stego-images. Wang et al. defined an optimal substitution matrix as the one that produces a stego-image with maximal peak signal-to-noise ratio (PSNR),
$$\mathrm{PSNR} = 10 \times \log_{10} \frac{255^2}{\mathrm{MSE}},$$
where
$$\mathrm{MSE} = \frac{1}{w \times h} \sum_{i=1}^{w} \sum_{j=1}^{h} \left(p_{i,j} - p'_{i,j}\right)^2. \tag{8}$$
In (8), $w$ and $h$ denote the width and height of the cover-image, respectively, and $p_{i,j}$ and $p'_{i,j}$ denote the pixels of the cover-image and the stego-image, respectively. Essentially, finding an optimal substitution matrix is a combinatorial optimization problem, and there are $(2^k)!$ possible solutions in total. Moreover, the solution space grows rapidly with $k$. If $k = 4$, for example, the number of possible solutions becomes 20,922,789,888,000. For optimization problems with such a large solution space, heuristic algorithms are generally more practical than deterministic algorithms. Therefore, Wang et al. utilized a genetic algorithm (GA) to find a near-optimal substitution matrix.
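The mapping in the example above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `matrix_to_mapping` and `substitute` are hypothetical, and the input segment list is an assumption, chosen so that the result matches the set {2, 3, 0, 1, 2} quoted in the text. The last line confirms the size of the solution space for 4-bit segments.

```python
from math import factorial


def matrix_to_mapping(matrix):
    """Return mapping[i] = j, where row i of the matrix has its single 1 in column j."""
    size = len(matrix)
    if any(sum(row) != 1 for row in matrix):
        raise ValueError("every row must contain exactly one 1")
    mapping = [row.index(1) for row in matrix]
    if sorted(mapping) != list(range(size)):
        raise ValueError("every column must contain exactly one 1")
    return mapping


def substitute(segments, mapping):
    """Replace every segment value by its image under the one-to-one mapping."""
    return [mapping[value] for value in segments]


M = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
mapping = matrix_to_mapping(M)
print(mapping)                                  # [0, 2, 1, 3]
print(substitute([1, 3, 0, 2, 1], mapping))     # [2, 3, 0, 1, 2]
print(factorial(2 ** 4))                        # 20922789888000
```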

Optimal LSB-Based Steganography.
Some other researchers applied Wang et al.'s substitution matrix to improve the quality of the stego-images. The main difference is that they adopted different optimization algorithms, especially bioinspired algorithms. In 1992, Dorigo proposed the ant colony optimization (ACO) algorithm [15] in his Ph.D. thesis, which is well suited to combinatorial optimization problems. Since finding an optimal substitution matrix is a combinatorial optimization problem, Hsu and Tu [9] adopted ACO to find an optimal substitution matrix. In 2007, Chu and Tsai introduced the cat swarm optimization (CSO) algorithm [16], which is derived from the behavior of cats. Valuing its performance in finding globally best solutions, Wang et al. [10] revised CSO to generate an optimal substitution matrix. When using bioinspired algorithms, one needs to provide a fitness function to evaluate a solution so that the algorithm can guide the virtual creatures, such as cats or ants, toward the optimal solution. Hsu and Tu and Wang et al. utilized the pixel difference between the cover-image and the stego-image as the fitness of a solution.
Some researchers adopted different embedding strategies to keep the distortion of the cover-image as small as possible. Xu et al. [17] adopted Mielikainen's pairwise LSB matching method [18] and changed the matching order between the secret bits and cover pixels to decrease the distortion of the cover-image. They designed a three-tiered score system to evaluate the performance of a matching order and utilized immune programming to find the best matching order. Considering that the difference measured in pixels is not necessarily the same as that perceived by human eyes, a few researchers took the human visual system into consideration. Lee and Tsai [19] determined the number of bits used to carry the secret in a pixel according to the principle of just noticeable difference (JND). Further, they utilized dynamic programming to divide the secret data into segments so as to minimize the modification to the cover-image when embedding secret data. Instead of using every pixel of a block, Bedi et al. [20] chose a subset of the pixels in a block to carry secret data. In view of the image quality, the choice is not made at random. For each block of 8 × 8 pixels, they utilized the particle swarm optimization (PSO) algorithm [21] to determine the best pixels in which to embed secret data sequentially. The distortion between the cover-image and the stego-image is measured with a quality index based on the human visual system. Since the pixels used to embed secret data vary from block to block, the pixel positions have to be recorded as the key in order to extract the secret data successfully. If the size of the cover-image is $w \times h$ pixels, the minimal required space for the key is $w \times h$ bits. Obviously, the required space grows with the size of the cover-image. Another concern about Bedi et al.'s scheme is its hiding capacity. Not all of the pixels in a block can be used to embed secret data; otherwise, Bedi et al.'s scheme becomes meaningless. In fact, in their experiments, only eight pixels of a block are used to embed secret data. The highest possible payload is only 0.5 bits per pixel if the last four bits of a pixel are used to embed data.

HVS-Based Measurement.
Human eyes are complex biological organs. The way human eyes perceive the difference between two images is not the same as that measured by PSNR. Sometimes, a difference that is perceptible to human eyes does not necessarily correspond to a large difference between pixels. After some observations, Barni and Bartolini [11] listed the following three rules of thumb.
(1) Disturbances are much less visible on highly textured regions than on smooth areas.
(2) Contours are more sensitive to noise addition than highly textured regions but less than flat areas.
(3) Disturbances are less visible over dark and bright regions.
Based on the characteristics of human visual system (HVS), some researchers proposed different methods to evaluate image quality or to estimate the acceptable change to an image [22][23][24][25][26].
Combining the three components of luminance, contrast, and structure, Wang et al. [14] proposed the structural similarity (SSIM) index to measure the similarity between two images in light of HVS. Suppose that $x$ and $y$ denote two gray-level images. The luminance comparison function is
$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \tag{9}$$
where $\mu_x$ and $\mu_y$ denote the average pixel values of images $x$ and $y$, respectively, and $C_1 = (K_1 L)^2$, where $L$ is the dynamic range of pixel values and $K_1 \ll 1$. The contrast comparison function is
$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \tag{10}$$
where $\sigma_x$ and $\sigma_y$ denote the standard deviations of pixel values of images $x$ and $y$, respectively, and $C_2 = (K_2 L)^2$, where $K_2 \ll 1$. The structure comparison function is
$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}, \tag{11}$$
where $\sigma_{xy}$ is the covariance of pixel values of images $x$ and $y$. Combining (9), (10), and (11), we get the following SSIM index:
$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma}. \tag{12}$$
For simplicity, Wang et al. set $\alpha = \beta = \gamma = 1$ and $C_3 = C_2/2$. Consequently, SSIM can be transformed into the following specific form:
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}. \tag{13}$$
The SSIM metric is calculated on various windows of the image, and hence we can use the following mean SSIM (MSSIM) index to evaluate the overall image quality:
$$\mathrm{MSSIM}(x, y) = \frac{1}{N} \sum_{j=1}^{N} \mathrm{SSIM}(x_j, y_j), \tag{14}$$
where $x_j$ and $y_j$ are the contents of the two images in the $j$th window and $N$ is the number of windows.

2.5. Enhanced Differential Evolution.
Differential evolution (DE) was first introduced by Storn and Price [13]. DE is a population-based optimization method in which candidate solutions are represented as vectors. For each individual (called a target vector) in the current population, an offspring (called a trial vector) is generated by adding a scaled, random vector difference to a randomly selected population vector. The trial vector competes with its corresponding target vector on fitness, and the winner survives to the next generation. Although simple, DE performs well on a wide variety of test problems [12,13,27,28]. Figure 1(a) is the flowchart of DE.
DE was originally designed for continuous-space optimization problems. Later, some researchers modified DE to tackle permutation-based combinatorial optimization problems, that is, problems whose candidate solutions are permutations of a sequence of integers. Among these modifications, Onwubolu and Babu's approach, called enhanced differential evolution (EDE), is intuitive and easy to implement [12]. The main idea of this approach is to transform the permutation-based population into a continuous population. The forward transformation converts each discrete parameter $x_i$ of a vector into a continuous value by means of a scaling factor. After being transformed into continuous form, the population can be handled by the canonical DE strategy to generate the child population. However, the individuals of the child population are continuous values and cannot be evaluated by the fitness function. Therefore, they have to be backward transformed into discrete solutions, using a function $\mathrm{int}[\cdot]$ that rounds a real value to the nearest integer. Backward transformation may produce infeasible solutions, so the offspring population has to be repaired.
To generate better offspring, Onwubolu and Babu proposed two improvement strategies for the repaired offspring: swap mutation and insertion mutation. The final offspring then compete with the parents. Figure 1(b) shows the flowchart of EDE.
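The transform, mutate, transform back, and repair cycle described above can be sketched as follows. This is a rough illustration under loudly stated assumptions, not the formulas of [12]: the linear forward map `-1 + alpha * x`, its inverse, the value of `alpha`, and the random repair rule are all hypothetical choices made only to show the flow of data.

```python
import random


def forward(perm, alpha):
    """Map discrete values to continuous ones (ASSUMED linear form, not taken from [12])."""
    return [-1.0 + alpha * x for x in perm]


def backward(cont, alpha):
    """Invert the assumed forward map and round to the nearest integer."""
    return [int(round((1.0 + y) / alpha)) for y in cont]


def repair(candidate, size, rng):
    """Turn an integer vector of length `size` into a permutation of 0..size-1.

    Out-of-range or repeated entries are overwritten with the missing values
    in random order (one simple repair rule, assumed for illustration).
    """
    seen = set()
    bad_positions = []
    for pos, value in enumerate(candidate):
        if 0 <= value < size and value not in seen:
            seen.add(value)
        else:
            bad_positions.append(pos)
    missing = [v for v in range(size) if v not in seen]
    rng.shuffle(missing)
    repaired = list(candidate)
    for pos, value in zip(bad_positions, missing):
        repaired[pos] = value
    return repaired


rng = random.Random(1)
alpha = 0.05                       # hypothetical scaling factor
F = 0.5                            # DE scale factor (illustrative value)
a, b, c = [0, 2, 1, 3], [3, 1, 0, 2], [1, 0, 3, 2]
fa, fb, fc = (forward(v, alpha) for v in (a, b, c))
# Canonical DE step in continuous space: base vector plus a scaled difference.
trial_cont = [x + F * (y - z) for x, y, z in zip(fa, fb, fc)]
trial = repair(backward(trial_cont, alpha), 4, rng)
print(sorted(trial) == [0, 1, 2, 3])   # True
```

The repaired trial vector would then be improved by swap or insertion mutation and compared with its parent on fitness; those steps are omitted here.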

The Proposed Method
With the help of a substitution matrix, imperceptibility can be improved, but the embedding procedure of simple LSB substitution may limit the improvement. Several studies have been devoted to constructing an optimal substitution matrix. Some utilized deterministic algorithms [29,30], while others utilized heuristic algorithms [6, 8-10] to search for the optimal matrix. No matter which algorithms they employed, they adopted PSNR as the objective function (also called the fitness function in heuristic algorithms) to guide the search direction. As we have mentioned above, PSNR measures the absolute difference of pixel values, but not the difference perceived by human eyes. Therefore, we adopt MSSIM (14) as the fitness function. The heuristic algorithm we utilize to search for the optimal substitution matrix is EDE. In addition, to break through the intrinsic limitation of simple LSB substitution, we design a ModEmbedding algorithm, which makes the stego-image as close to the cover-image as possible.
Before moving on to the main task, it is helpful to give an overview of our scheme. Figure 2 is the flowchart of the proposed scheme. In the embedding process, a secret $S$ is split into segments, each of which is of $k$-bit length. Let $D$ denote the set of segments. Then, all elements of $D$ are randomly permuted using a pseudorandom number generator seeded by a key, which yields $D'$. According to $D'$ and the cover-image $C$, EDE constructs a near-optimal substitution matrix $M^*$. With $M^*$, $D'$ is transformed into $D^*$ and embedded into $C$. Finally, we get the stego-image $C'$ with the secret inside. In the extraction process, $D^*$ is extracted from the stego-image $C'$ and is reversely transformed into $D'$ with the same substitution matrix $M^*$. Using a pseudorandom number generator seeded by the same key, we can rearrange the elements of $D'$ into the original order of the elements of $D$. Finally, combining the elements of $D$, we recover the secret $S$.
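The segmentation and keyed permutation steps of the overview can be sketched as follows. The function names, the zero-padding of the last segment, and the use of Python's `random.Random` as the keyed pseudorandom number generator are assumptions made for illustration; the text does not specify which generator or padding rule is used.

```python
import random


def split_into_segments(secret: bytes, k: int):
    """Split the secret into k-bit integer segments (the tail is zero-padded: an assumption)."""
    bits = "".join(format(byte, "08b") for byte in secret)
    bits += "0" * (-len(bits) % k)
    return [int(bits[i:i + k], 2) for i in range(0, len(bits), k)]


def permute(segments, key):
    """Reorder the segments with a PRNG seeded by the key."""
    order = list(range(len(segments)))
    random.Random(key).shuffle(order)
    return [segments[i] for i in order]


def unpermute(permuted, key):
    """Regenerate the same order from the key and undo the permutation."""
    order = list(range(len(permuted)))
    random.Random(key).shuffle(order)
    original = [None] * len(permuted)
    for position, source in enumerate(order):
        original[source] = permuted[position]
    return original


segments = split_into_segments(b"\xa7\x3c", 4)
print(segments)                                       # [10, 7, 3, 12]
shuffled = permute(segments, key=2024)
print(unpermute(shuffled, key=2024) == segments)      # True
```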
With the overview in mind, we can now look deeper into the details of the proposed scheme.

EDE Subroutine.
In this paper, EDE is employed to construct a near-optimal substitution matrix. Since initialization and selection are the problem-dependent parts of EDE, we concentrate on these two parts.
Initialization. EDE is a population-based evolutionary algorithm. Therefore, a population size $N_p$ has to be predefined. Initially, users have to randomly generate a set of $N_p$ distinct candidate solutions. Starting from the initial population, EDE generates offspring and evolves continuously to find the optimal solution until the termination condition is satisfied. In order to use EDE to solve a problem, we first need to represent solutions in the form of vectors. In the proposed scheme, a solution is a $2^k \times 2^k$ matrix. More precisely, as mentioned before, it is a one-to-one mapping from the set $V$ to the set $V$, where $V = \{v_i \mid i = 0, \ldots, 2^k - 1,\ 0 \le v_i \le 2^k - 1\}$. Consequently, a substitution matrix can be represented simply as a permutation of the set $V$, that is, a permutation of 0 to $2^k - 1$. We now explain more precisely how a substitution matrix is encoded into a vector in EDE. Suppose that the substitution matrix is $M = [m_{i,j}]$, where $0 \le i, j \le 2^k - 1$ and $m_{i,j} \in \{0, 1\}$. Then, the corresponding vector is $X = [x_0, x_1, \ldots, x_{2^k - 1}]$, where $x_i = j$ if $m_{i,j} = 1$. Take the following substitution matrix as an example:
$$M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
The corresponding vector is [0 2 1 3].
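The vector encoding and the random initial population described above can be sketched as follows. The helper names are hypothetical, and the way distinct permutations are drawn (rejection of duplicates) is an assumption; the text only requires that the initial candidates be distinct and random.

```python
import random
from math import factorial


def matrix_to_vector(matrix):
    """Encode a substitution matrix as a vector: entry i is the column of the 1 in row i."""
    return [row.index(1) for row in matrix]


def vector_to_matrix(vector):
    """Decode a permutation vector back into a 0/1 substitution matrix."""
    size = len(vector)
    return [[1 if vector[i] == j else 0 for j in range(size)] for i in range(size)]


def initial_population(k, pop_size, rng):
    """Draw pop_size distinct random permutations of 0..2^k - 1."""
    size = 1 << k
    if pop_size > factorial(size):
        raise ValueError("population size exceeds the number of permutations")
    seen = set()
    population = []
    while len(population) < pop_size:
        candidate = tuple(rng.sample(range(size), size))
        if candidate not in seen:          # reject duplicates
            seen.add(candidate)
            population.append(list(candidate))
    return population


M = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
print(matrix_to_vector(M))                                # [0, 2, 1, 3]
print(len(initial_population(4, 10, random.Random(0))))   # 10
```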
Selection. In the selection phase, each individual (i.e., vector) of the current population has to compete with its offspring. The competition is based on their quality; hence users have to provide a fitness function to score vectors. Here we adopt (14) as our fitness function. Figure 3 illustrates the detailed process of computing the fitness of a vector $X$. At first, the permuted segment set $D'$ is transformed by the substitution matrix corresponding to the vector $X$. Next, the transformed result $D^{(X)}$ is embedded into the cover-image $C$ to get the stego-image $C^{(X)}$. The fitness of the vector is then $\mathrm{MSSIM}(C, C^{(X)})$.
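The fitness computation just described can be sketched in plain Python. Several details here are assumptions rather than the authors' settings: the windows are uniformly weighted and slid with a configurable step, the constants `k1 = 0.01` and `k2 = 0.03` are commonly used defaults that the text does not state, and `embed` is a hypothetical callable standing in for whichever embedding routine is used.

```python
def ssim_window(xs, ys, c1, c2):
    """SSIM of two equally sized pixel lists, following the specific form (13)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) * (x - mean_x) for x in xs) / (n - 1)
    var_y = sum((y - mean_y) * (y - mean_y) for y in ys) / (n - 1)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mssim(img_a, img_b, window=11, step=1, k1=0.01, k2=0.03, dyn_range=255):
    """Mean SSIM over square windows of two images given as lists of rows."""
    c1 = (k1 * dyn_range) ** 2
    c2 = (k2 * dyn_range) ** 2
    height, width = len(img_a), len(img_a[0])
    if height < window or width < window:
        raise ValueError("image is smaller than the window")
    scores = []
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            cells = [(r, c) for r in range(top, top + window)
                     for c in range(left, left + window)]
            xs = [img_a[r][c] for r, c in cells]
            ys = [img_b[r][c] for r, c in cells]
            scores.append(ssim_window(xs, ys, c1, c2))
    return sum(scores) / len(scores)


def fitness(vector, segments, cover, embed, window=11, step=1):
    """Score a permutation vector: substitute, embed, then compare with the cover."""
    transformed = [vector[value] for value in segments]
    stego = embed(cover, transformed)      # `embed` is supplied by the caller
    return mssim(cover, stego, window=window, step=step)
```

A pure-Python sliding window is slow on 512 × 512 images; the sketch favors readability, and a larger `step` or an array library would be the obvious ways to speed it up.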

Embedding.
Before explaining the proposed embedding method, let us consider the following example. Suppose that a cover-pixel is 60 and a 2-bit secret is 3. According to (1), 60 is expressed as follows:
$$60 = 15 \times 2^2 + 0.$$
Under simple LSB substitution, the secret substitutes the remainder directly and hence results in the following stego-pixel:
$$63 = 15 \times 2^2 + 3.$$
Performing $63 \bmod 2^2$, we can extract the secret. Let us consider another situation. If we change the cover-pixel to 59, instead of 63, we can still extract the secret by performing the same modulo operation (i.e., $59 \bmod 2^2$) because $59 = 14 \times 2^2 + 3$. It is clear that 59 is closer to the cover-pixel 60 than 63 is. This example makes it clear that we can test (2) on three quotients $(q - 1)$, $q$, and $(q + 1)$ to see which one results in a stego-pixel closest to the cover-pixel. The complete embedding algorithm is given in Algorithm 1.
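The idea of testing three quotients can be sketched as follows. Algorithm 1 itself is not reproduced in this text, so this is only an illustration of the principle: the function name `mod_embed`, the restriction of candidates to the range 0 to 255, and the tie-breaking rule (the smaller quotient wins) are assumptions.

```python
def mod_embed(pixel: int, secret: int, k: int, max_value: int = 255) -> int:
    """Return the valid pixel value closest to `pixel` whose remainder mod 2^k is `secret`."""
    base = 1 << k
    if not 0 <= secret < base:
        raise ValueError("secret must fit in k bits")
    quotient = pixel // base
    # Apply Eq. (2) to the quotients q - 1, q, and q + 1.
    candidates = [cq * base + secret for cq in (quotient - 1, quotient, quotient + 1)]
    # Keep only values that are legal gray levels (assumed 8-bit range).
    valid = [value for value in candidates if 0 <= value <= max_value]
    return min(valid, key=lambda value: abs(value - pixel))


print(mod_embed(60, 3, 2))        # 59
print(mod_embed(60, 3, 2) % 4)    # 3
```

Because the candidate built from the unchanged quotient is always a legal value, the result is never farther from the cover-pixel than the one produced by simple LSB substitution.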

Extraction.
Extracting the secret is very simple. Algorithm 2 illustrates the extraction algorithm. Each $k$-bit secret $s$ is extracted and concatenated to form the whole transformed secret $D^*$. However, $D^*$ is not the original secret. We have to convert each $k$-bit value of $D^*$ back to the original value according to the substitution matrix $M^*$.
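The two extraction steps, the modulo operation and the reverse substitution, can be sketched as follows. Algorithm 2 is not reproduced in this text, so the function names and the list-based representation of the mapping are assumptions; the example pixel values are made up so that their remainders form the set {2, 3, 0, 1, 2} used earlier.

```python
def extract_segments(stego_pixels, k):
    """Read the k-bit value carried by every stego-pixel with a modulo operation."""
    return [pixel % (1 << k) for pixel in stego_pixels]


def invert_mapping(mapping):
    """Invert a one-to-one mapping given as a list: inverse[mapping[i]] = i."""
    inverse = [0] * len(mapping)
    for source, target in enumerate(mapping):
        inverse[target] = source
    return inverse


def recover_segments(stego_pixels, mapping, k):
    """Extract the transformed values and map them back to the original ones."""
    inverse = invert_mapping(mapping)
    return [inverse[value] for value in extract_segments(stego_pixels, k)]


# Hypothetical stego-pixels with 2-bit remainders 2, 3, 0, 1, 2.
print(recover_segments([58, 63, 60, 61, 62], [0, 2, 1, 3], 2))   # [1, 3, 0, 2, 1]
```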

Experimental Results and Discussions
This section presents some experimental results of the proposed method. In addition, the proposed scheme is compared with some simulated experiments. The experiments in this section were carried out on a PC with an Intel Core 2 Duo CPU at 2.8 GHz, 4 GB RAM, the Windows 7 Professional operating system, NetBeans IDE, and JDK 6. Before turning to a closer examination of the experimental results, we outline our assumptions here.
(1) The cover-image is a gray-level and uncompressed image.
(2) The stego-image is not modified by any form of signal processing.
(3) The key is preserved secretly.
(4) The size of a secret image is $(w \times h)/2$ pixels, where $w$ and $h$ are the width and the height of the cover-image, respectively.
Having clarified the assumptions, we may now go into details about our experiments. We ran three distinct types of simulations, as follows.
(a) Experiment I. The secret was embedded and extracted by means of simple LSB substitution.
(b) Experiment II. The secret was transformed with an optimal substitution matrix and then embedded and extracted by means of simple LSB substitution. The optimal substitution matrix was constructed by EDE with PSNR as the fitness function.
(c) Experiment III.The secret was transformed with an optimal substitution matrix and then embedded and extracted by means of simple LSB substitution.The optimal substitution matrix was constructed by EDE with MSSIM as the fitness function.
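For reference, the PSNR used as the fitness function in Experiment II can be computed as in the following sketch. It assumes 8-bit gray-level images stored as lists of rows; the function name and the handling of identical images (returning infinity) are assumptions.

```python
import math


def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio (in dB) between two equally sized gray-level images."""
    height, width = len(cover), len(cover[0])
    squared_error = sum((cover[i][j] - stego[i][j]) ** 2
                        for i in range(height) for j in range(width))
    mse = squared_error / (width * height)
    if mse == 0:
        return math.inf            # identical images
    return 10 * math.log10(peak * peak / mse)


cover = [[60, 61], [62, 63]]
stego = [[59, 62], [63, 62]]       # every pixel differs by 1, so MSE = 1
print(round(psnr(cover, stego), 2))   # 48.13
```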
As regards the proposed scheme, we transformed the secret with an optimal substitution matrix and then embedded it by the proposed ModEmbedding algorithm. The secret is extracted in the same way as in simple LSB substitution. The optimal substitution matrix was constructed by EDE with MSSIM as the fitness function. For clarity, Table 1 summarizes the similarities and dissimilarities between the proposed scheme and the above simulations.
Table 3 lists the PSNR and MSSIM values of the stego-images of the three simulated experiments and our method. To summarize, these values are plotted as a bar chart in Figure 5.
Several observations from these experimental results are discussed as follows.
(1) The proposed scheme outperforms simple LSB substitution (i.e., Experiment I) in both PSNR and MSSIM.
(2) As we have mentioned before, there are some researchers using heuristic algorithms to construct the near-optimal substitution matrix.
As a whole, the merits of our work are summarized as follows.
(1) The extra space for the substitution matrix is small. The extra space for the substitution matrix is related to the number of bits used to carry the secret, that is, the parameter $k$ in our scheme. If $k = 4$, the required space is 16 × 16 bits. In Bedi et al.'s scheme [20], the extra space is related to the size of the cover-image. If the size of the cover-image is 512 × 512 pixels, the required space is 512 × 512 bits, which is 1024 times that of our scheme.
(2) The payload is high, but the image quality is not destroyed too much.
Generally speaking, at most the last four bits of a pixel can be modified; otherwise, the image quality is not acceptable. Hence the highest possible payload of a steganographic scheme is four bits per pixel. The experimental results show that our scheme achieves the highest payload, which is eight times that of Bedi et al.'s scheme. In addition, the average MSSIM of our scheme as shown in the experiments is 0.9183, while that of Bedi et al.'s scheme is 0.9124 according to their experimental results.
(3) We give consideration to image quality at pixel level and at visual level simultaneously.
Previous research on the optimal substitution matrix only considers the image quality at pixel level [6,9,10]. Our scheme takes the human visual system into account and adopts MSSIM as the fitness function. Besides, we refine the embedding algorithm so that the difference between the cover- and stego-images at pixel level is as small as possible. Though Bedi et al. also adopt MSSIM as their fitness function, the space their scheme requires for the key is too large.
(4) Our extracting method is as simple as the simple LSB.
One of the merits of simple LSB substitution is its simple way of extracting the secret, that is, the modular operation. Like simple LSB substitution, we extract the secret only through the modular operation.
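The two ratios quoted in merits (1) and (2) follow from simple arithmetic. The worked check below uses only figures stated in the text: a 512 × 512 cover-image, four bits per pixel in our scheme, and eight 4-bit pixels per 8 × 8 block in Bedi et al.'s experiments.

```latex
\begin{align*}
\text{key-space ratio} &= \frac{512 \times 512}{2^4 \times 2^4}
                        = \frac{262144}{256} = 1024, \\
\text{payload of Bedi et al.'s scheme} &= \frac{8 \times 4}{8 \times 8}
                        = 0.5 \text{ bits per pixel}, \\
\text{payload ratio} &= \frac{4}{0.5} = 8 .
\end{align*}
```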

Conclusions
As we have mentioned in Section 2.2, the number of possible solutions becomes 20,922,789,888,000 when $k = 4$. In this paper, we adopt EDE to construct a near-optimal substitution matrix. The experimental results indicate that EDE can construct a good substitution matrix within a few iterations.
Considering the features of human eyes, we adopt an HVS-based measurement, MSSIM, instead of PSNR as the fitness function. We can see from the experimental results that adopting MSSIM as the fitness function indeed improves imperceptibility visually. Besides, the proposed embedding algorithm improves the stego-image quality considerably; at the same time, the extraction is as simple as in the traditional LSB substitution method. Many researchers have utilized different methods to solve the problem of constructing an optimal substitution matrix, so we believe that this is an interesting problem. So far as we know, no one has previously attempted to apply discrete DE to this problem. Therefore, this paper provides an efficient method to construct a substitution matrix and extends the applications of the DE algorithm.
In future work, we intend to address the issue of steganalysis [31]. We will design a sophisticated embedding strategy against statistical steganalysis. In addition, we may compare the results obtained from different bioinspired algorithms.

Figure 1: Flowcharts of (a) canonical DE and (b) enhanced DE.

Figure 2: The flowchart of the proposed scheme.

Figure 3: The process of computing fitness.

Figure 4(a) is our secret image of 256 × 512 pixels, and Figures 4(b) to 4(f) are our cover-images of 512 × 512 pixels. The secret image is embedded into the four least significant bits of the pixels of the cover-image (i.e., $k = 4$). The window size of MSSIM is 11 × 11. Table 2 lists the parameters of EDE.

Table 1: Summary of the simulated experiments and the proposed scheme.

Table 3: The PSNR and MSSIM of the three experiments and our method.