An Enhanced Informed Watermarking Scheme Using the Posterior Hidden Markov Model

Designing a practical watermarking scheme with high robustness, feasible imperceptibility, and large capacity remains one of the most important research topics in robust watermarking. This paper presents an informed image watermarking scheme based on a posterior hidden Markov model (HMM), which makes prior-HMM-based informed watermarking more practical while retaining favorable robustness, imperceptibility, and capacity. To make the encoder and decoder use (nearly) identical posterior HMMs, each cover image at the encoder and each received image at the decoder are attacked with JPEG compression at the same small quality factor (QF). The attacked images are then used to estimate the HMM parameter sets for the encoder and decoder, respectively. Numerical simulations show that a small QF of 5 is an optimum setting for practical use. Based on this posterior HMM, we develop an enhanced posterior-HMM-based informed watermarking scheme. Extensive experimental simulations show that the proposed scheme is comparable to its prior counterpart, in which the HMM is estimated with the original image, but it avoids the transmission of the prior HMM from the encoder to the decoder. This enhances the practical applicability of HMM-based informed watermarking systems. It is also demonstrated that the proposed scheme has robustness comparable to the state of the art with significantly reduced computation time.

The scheme of [6], developed by Chen and Wornell, is one of the pioneering works on designing practical informed watermarking algorithms. They constructed a class of lattice-code-based quantization index modulation (QIM) and theoretically proved the optimality of QIM with distortion compensation (DC-QIM) under the framework of dirty-paper coding [5]. The QIM-based watermarking scheme has the characteristics of high capacity and easy implementation. However, it is vulnerable to scaling attacks, also called gain attacks, which multiply the amplitudes of watermarked images by a scalar factor. To address this issue, several variants of QIM have been proposed in the literature [14][15][16][17]. The scheme developed in [14] inserts a pilot signal to facilitate the estimation of the scalar factor and then employs the estimated factor to correct the amplitudes. The work discussed in [15] resists gain attacks by designing a gain-invariant step size for quantization at both the embedder and receiver. Rather than directly quantizing host signals, the approaches proposed in [16,17] first group a number of host signals into a vector, then take any two vectors to generate an angular value, and finally quantize the angular values to insert the message.
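The sensitivity of QIM to gain attacks can be illustrated with a minimal scalar sketch. It is not the lattice construction of [6]: it assumes a one-dimensional quantizer without dither or distortion compensation, and the names `qim_embed`, `qim_decode`, and the step size of 8 are hypothetical choices made only for illustration.

```python
def qim_embed(x, bit, step=8.0):
    """Quantize host sample x onto the lattice that represents `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted
    by step/2. This is an assumed toy construction, not the scheme of [6].
    """
    offset = bit * step / 2.0
    return step * round((x - offset) / step) + offset


def qim_decode(y, step=8.0):
    """Return the bit whose lattice contains the point closest to y."""
    distances = [abs(y - qim_embed(y, bit, step)) for bit in (0, 1)]
    return distances.index(min(distances))


marked = qim_embed(21.0, 1)          # host sample moved onto the bit-1 lattice
intact = qim_decode(marked)          # decoding without any attack
scaled = qim_decode(1.25 * marked)   # decoding after a gain attack of 1.25
```

In this toy setting the unattacked sample decodes to the embedded bit, while a moderate amplitude scaling moves it closer to the other lattice, which is the weakness the QIM variants above try to remove.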
Besides the lattice code, the spherical code is also deployed in informed watermarking to implement the dirtypaper coding. The codewords of spherical code have the same 2 The Scientific World Journal energy and thus lie on the surface of a sphere with a radius equivalent to the codewords' norm. This code gives rise to a hyperconic decoding region that centers in the origin. This in turn allows, from a geometrical viewpoint, a signal point multiplied with a constant scalar factor to stay in the same decoding region. Consequently, it leads to a capability to resist against gain attacks. A number of spherical-code-based informed watermarking approaches have been developed in the literature. In [18], Malvar and Florêncio proposed an improved spread-spectrum watermarking scheme (ISS). They used two spread-spectrum vectors of equienergy to represent message bits 0 and 1 and developed an embedding technique to adapt watermark signals to the host one. As a result, the robustness is significantly improved. In [19], Miller et al. designed another informed watermarking scheme using the spherical code. It consists of two stages, namely, informed coding and informed embedding. In informed coding, a message bit is associated with a coset containing a number of spherical codewords, and an optimum codeword is then chosen to represent the message bit. In informed embedding, the selected codeword is tailored, according to the host signal and the constraints of robustness and distortion, and embedded in the host signal. This aims for putting the watermarked signal in the decoding region of the chosen codeword. A similar informed watermarking scheme is presented in [20], in which orthogonal and biorthogonal spherical codewords are used for implementing informed coding and an optimization approach is employed for performing informed embedding. 
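The gain invariance of conic decoding regions can be checked with a small sketch. It assumes a plain normalized-correlation decoder and two made-up equal-energy codewords; it is not the detector of [18], [19], or [20], and all names are illustrative.

```python
import math


def normalized_correlation(signal, codeword):
    """Cosine of the angle between the received signal and a codeword."""
    dot = sum(s * c for s, c in zip(signal, codeword))
    norm_s = math.sqrt(sum(s * s for s in signal))
    norm_c = math.sqrt(sum(c * c for c in codeword))
    return dot / (norm_s * norm_c)


def decode(signal, codewords):
    """Pick the index of the codeword with the largest normalized correlation."""
    scores = [normalized_correlation(signal, c) for c in codewords]
    return scores.index(max(scores))


# Two hypothetical equal-energy codewords (both have squared norm 4).
CODEWORDS = [(1.0, 1.0, -1.0, -1.0), (1.0, -1.0, 1.0, -1.0)]
received = (3.0, 0.5, 1.5, -2.0)

reference = decode(received, CODEWORDS)
after_gain = [
    decode(tuple(gain * v for v in received), CODEWORDS)
    for gain in (0.1, 0.5, 2.0)
]
```

Because a positive gain scales the numerator and the signal norm by the same factor, every score is unchanged and so is the decision.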
Both schemes of [19,20] achieve a large embedding capacity as well as high robustness to gain attacks, additive white Gaussian noise (AWGN), JPEG compression, and so forth. Nevertheless, the scheme of [20] merely evaluated the performance through Monte Carlo simulations, and that of [19] has a relatively high computational complexity because of the employed trellis code with a long codeword length. In [21], Wang et al. developed an informed watermarking scheme using the hidden Markov model (HMM) in the wavelet domain. They used a spherical code with a relatively short codeword length for dirty-paper coding, aiming to decrease the computational complexity. This approach achieves performance comparable to the state of the art [19], but it decreases the computational complexity of informed embedding by an order of magnitude.
As the scheme presented in [21] employs the HMM in the wavelet domain to construct the detector, the HMM parameters need to be sent as side information to the receiver. This would hinder the practical application of HMM-based informed watermarking systems. To address this issue, we are motivated to develop an informed watermarking scheme using a posterior HMM. That is, rather than transmitting the HMM parameters to the receiver, we reestimate them at the receiver. If the reestimated HMM parameters are sufficiently close to the original ones estimated at the transmitter, the transmission of HMM parameters to the receiver is avoided. In [22], we developed a kind of posterior HMM for a spread-spectrum-based robust image watermarking scheme. Considering that the inserted watermark is essentially a weak signal compared to the host one, this scheme takes the watermarked signal as the source to estimate the posterior HMM. Such treatment degrades the detection performance only slightly, as demonstrated in [22]. In contrast to the posterior HMM, the one estimated with the original signal is hereafter denoted as the prior HMM for notational convenience.
Although the scheme of [22] shows that it is feasible to estimate the posterior HMM with the watermarked signal, it cannot be directly applied to the case of informed watermarking. As will be illustrated in Section 3, using the posterior HMM for watermark detection would decrease the performance significantly. This is due to the fact that the predefined robustness in informed watermarking is ensured with respect to the prior HMM, but it would be no longer guaranteed if the posterior HMM rather than the prior one is used. The more the differences between the prior and posterior HMMs, the larger the performance degradation is. To make the posterior-HMM-based informed watermarking feasible, the key point is to let both the encoder and decoder adopt HMMs that are sufficiently close to each other. To this end, we impose the same noises that are much stronger than the inserted watermark on the original and received signals and then use the corresponding noisy versions to estimate HMM parameters for the encoder and decoder, respectively. In this way, the estimated HMM at the decoder would be sufficiently close to that at the encoder. Consequently, the decoding performance using this posterior HMM would not degrade or merely degrade slightly.
In the interest of obtaining high robustness to JPEG compression, we take JPEG compression at a small quality factor (QF) as the strong noise for the posterior HMM estimation. We then rely on numerical simulations to determine a practically optimum QF, namely, QF_opt. Based on this QF, we construct a posterior-HMM-based informed watermarking scheme. At the encoder, we impose JPEG compression with QF = QF_opt on the original image to estimate the HMM parameters and use these parameters to implement the informed watermarking algorithm in [21]. At the decoder, we employ the same method as the encoder to obtain the HMM parameters and apply them to extract the message. Extensive experimental simulations show that the proposed posterior-HMM-based informed watermarking scheme can achieve the same performance as the prior-based one when attacks are either (equivalently) weaker than or much stronger than the predefined robustness, but it degrades remarkably in other cases. It is also observed that the proposed scheme obtains robustness comparable to the state of the art [19] with significantly decreased computation time.
The rest of the paper is organized as follows. Section 2 reviews the HMM in the wavelet domain and the HMM-based informed watermarking scheme presented in [21]. The posterior-HMM-based informed watermarking algorithm is developed in Section 3. Section 4 determines a practically optimum QF by way of numerical simulations. Based on this QF, experimental simulations are carried out in Section 5 to evaluate the proposed scheme. The conclusion is finally drawn in Section 6.

Review of the HMM-Based Informed Watermarking Scheme
As the algorithm proposed in this paper is an enhanced version of that in [21], in this section we briefly review the HMM in the wavelet domain and the HMM-based informed watermarking scheme in [21]. As supposed in both schemes of [23,24], any input image is decomposed via orthogonal or biorthogonal wavelets into a $J$-level ($J \geq 1$) pyramid. The input image is first decomposed into four subbands, LH_1, HL_1, HH_1, and LL_1. The LL_1 subband is further used to generate another four subbands, LH_2, HL_2, HH_2, and LL_2, at the next scale. Such decomposition is repeated until a predefined scale, $J$, is reached or the subband size does not allow further decomposition, as illustrated in Figure 1. Following [21,23,24], it is reasonable to adopt the same $M = 2$ states, $S_j = m$ ($m = 1, 2$), for all wavelet coefficients at the same scale. One state corresponds to small and the other to large wavelet coefficients.
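Considering that the average of wavelet coefficients is zero, the probability density function (pdf) of a vector node $\mathbf{t}_{j,k}$, taken over the states $S_j = m$ ($m = 1, 2$), can be modeled as a mixture of two multivariate Gaussians with zero means and covariances $\mathbf{C}_j^{(1)}$ and $\mathbf{C}_j^{(2)}$ [21,24] as follows:

$$f(\mathbf{t}_{j,k}) = \sum_{m=1}^{2} p_j^{(m)}\, g\left(\mathbf{t}_{j,k}; \mathbf{C}_j^{(m)}\right), \quad (1)$$

where $p_j^{(m)} = P(S_j = m)$, $\sum_{m=1}^{2} p_j^{(m)} = 1$ holds for all $j$s, and $g(\mathbf{t}; \mathbf{C})$ is defined as

$$g(\mathbf{t}; \mathbf{C}) = \frac{1}{\sqrt{(2\pi)^{n}\, |\det(\mathbf{C})|}} \exp\left(-\frac{\mathbf{t}^{T} \mathbf{C}^{-1} \mathbf{t}}{2}\right), \quad (2)$$

where $n$ is the dimension of $\mathbf{t}$, $\mathbf{C} = E[\mathbf{t}\mathbf{t}^{T}]$ is a covariance matrix, $\det(\mathbf{C})$ denotes the determinant of $\mathbf{C}$, and $|\cdot|$ stands for the absolute value.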
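As a numerical companion to the two-state model, the following sketch evaluates a zero-mean Gaussian mixture density for a three-dimensional vector node. It assumes positive-definite covariance matrices and uses a small Cholesky routine to stay within the standard library; the function names and the example state probabilities and covariances are made up for illustration and are not taken from [21,24].

```python
import math


def cholesky(cov):
    """Lower-triangular factor L with cov = L * L^T (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def gaussian_pdf(t, cov):
    """Zero-mean multivariate Gaussian density at vector t."""
    n = len(t)
    low = cholesky(cov)
    z = []  # forward substitution: solve low * z = t
    for i in range(n):
        z.append((t[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)                       # equals t^T cov^-1 t
    det = math.prod(low[i][i] for i in range(n)) ** 2  # equals det(cov)
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** n * det)


def mixture_pdf(t, state_probs, covs):
    """Mixture density: sum over states of P(state) * Gaussian(t; cov_state)."""
    return sum(p * gaussian_pdf(t, c) for p, c in zip(state_probs, covs))


# Made-up parameters: a "small" and a "large" coefficient state.
small = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.2], [0.1, 0.2, 1.0]]
large = [[25.0, 5.0, 2.0], [5.0, 25.0, 5.0], [2.0, 5.0, 25.0]]
density = mixture_pdf([0.5, -0.3, 0.2], [0.7, 0.3], [small, large])
```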
Figure 1: Illustration of the $J = 3$ wavelet pyramid decomposition, the VWD-HMM, and a vector quad-tree of wavelet coefficients (3 levels). A vector node is $\mathbf{t}_{j,k} = \{t^{1}_{j,k}, t^{2}_{j,k}, t^{3}_{j,k}\}$ (shown for $j = 3$). The square block with thick solid lines denotes the LL_1 subband at the finest pyramid level, which is further decomposed to form the subbands LH_2, HL_2, HH_2, and LL_2 at the second level. The four subbands at the third level are formed in the same way.

As the parent vector node links itself to its four child vector nodes, as shown in Figure 1, the VWD-HMM uses the Markov chain to capture the energy dependency across scales. To this end, it defines the state transition probability $\varepsilon^{m \to n}$, which represents the probability that child vector nodes are in state $n$ ($n = 1, 2$) given that their parent vector node is in state $m$ ($m = 1, 2$). Thus, the state probability of child vector nodes can be determined from that of their parent vector node as follows:

$$p^{(n)}_{\text{child}} = \sum_{m=1}^{2} \varepsilon^{m \to n}\, p^{(m)}_{\text{parent}}, \quad n = 1, 2,$$

where $\mathbf{p} = (p^{(1)}\ p^{(2)})^{T}$ collects the two state probabilities at one scale. Therefore, the VWD-HMM for the wavelet coefficients of an image is represented with a parameter set $\Theta$ that comprises the state probabilities, the state transition probabilities, and the covariance matrices. According to [24], $\Theta$ can be efficiently estimated by the expectation-maximization (EM) algorithm.
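The propagation of state probabilities from a parent vector node to its children is an application of the law of total probability, which can be sketched as follows. The indexing convention `transition[m][n]` and the numerical values are assumptions chosen for illustration only.

```python
def child_state_probs(parent_probs, transition):
    """Propagate state probabilities from a parent node to its child nodes.

    transition[m][n] is assumed to hold P(child in state n | parent in state m),
    so every row of `transition` sums to one.
    """
    n_states = len(parent_probs)
    return [
        sum(parent_probs[m] * transition[m][n] for m in range(n_states))
        for n in range(n_states)
    ]


# Made-up example: parents are mostly in state 0 and states tend to persist.
parent = [0.6, 0.4]
transition = [[0.9, 0.1], [0.3, 0.7]]
child = child_state_probs(parent, transition)
```

Applying the function repeatedly walks the probabilities down the quad-tree one scale at a time.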

HMM-Based Informed Watermarking Scheme.
Based on the aforementioned VWD-HMM, the authors in [21] developed an informed watermarking scheme, as illustrated in Figure 2. It includes message embedding and message extraction.

Figure 2: Block diagram of the watermark embedding and extraction processes in [21]. GA, "∘", and TLOT denote the genetic algorithm, element-wise multiplication, and the Taylor-series-approximated locally optimum test, respectively.
The message embedding is performed as follows.
(1) Decompose the original image $I(x, y)$ of size $N_1 \times N_2$ with the biorthogonal 9/7 wavelet into a 3-level wavelet pyramid, and use the coarsest two levels to construct vector trees (see Figure 1).
(2) Given that each vector tree is embedded with one bit, generate a message sequence b of $N_1 N_2/64$ random bits. After permuting b with a secret key KEY, allocate one permuted bit $b_i$ ($b_i = 0, 1$) to each vector tree. This leads to an information rate of 1/64 bit/pixel.
(3) Associate the given message bit $b_i$ with its representative spherical codeword $\mathbf{M}_{b_i}$ in the coset $\text{Coset}_{b_i}$. See [21] for the codeword design.
(4) Determine the optimal strength vector A opt for M through informed embedding. This is formulated as an optimization problem and solved by the genetic algorithm (GA), as given in [21]. In this process, an HMM-based robustness metric is exploited to ensure the predefined robustness, where the HMM is estimated with the original image.
(6) After embedding all message bits into their corresponding vector trees via Steps (3) to (5), perform the inverse wavelet transformation to obtain the watermarked image $I^{w}(x, y)$.
The message extraction process is implemented in the following manner. For each vector tree $\mathbf{Z}_i$ ($i = 1, 2, \ldots, N_1 N_2/64$) at the receiver, the detector based on the Taylor-series-approximated locally optimum test (TLOT) is used to find the codeword with the maximum TLOT value, say $\mathbf{M}_{\hat{b}} \in \{\mathbf{M}_0, \mathbf{M}_1\}$ ($\hat{b} = 0, 1$), where the TLOT-based detector exploits the prior HMM. The corresponding coset index (0 or 1) of this codeword is then taken as the extracted message bit $\hat{b} \in \{0, 1\}$. After all vector trees have been processed, the extracted bit sequence is reordered with the key KEY, and the message sequence is finally recovered.
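The message bookkeeping on both sides (payload size, key-driven permutation at the embedder, reordering at the extractor) can be sketched as follows. The permutation method of [21] is not specified here, so the sketch assumes a shuffle seeded by an integer key; `permute_bits`, `restore_bits`, and the key value 42 are hypothetical.

```python
import random


def permute_bits(bits, key):
    """Permute the message bits with a key-seeded shuffle (assumed mechanism)."""
    order = list(range(len(bits)))
    random.Random(key).shuffle(order)
    return [bits[i] for i in order]


def restore_bits(permuted, key):
    """Invert permute_bits by regenerating the same order from the key."""
    order = list(range(len(permuted)))
    random.Random(key).shuffle(order)
    restored = [0] * len(permuted)
    for position, source in enumerate(order):
        restored[source] = permuted[position]
    return restored


# One bit per vector tree: a 256 x 256 image carries 256 * 256 / 64 bits.
n_bits = (256 * 256) // 64
rng = random.Random(1)
message = [rng.getrandbits(1) for _ in range(n_bits)]
recovered = restore_bits(permute_bits(message, key=42), key=42)
```

Only a receiver that knows the key regenerates the same order, which is why the reordering step needs KEY but no other side information.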
Because of the paper length limit, readers are referred to [21] for more details of the reviewed HMM-based informed watermarking scheme. From the above description, it is clear that both the encoder and decoder use the prior VWD-HMM. This implies that the encoder needs to send the prior VWD-HMM to the receiver for message extraction, which would probably hinder the practical application of HMM-based informed watermarking systems. Addressing this problem gives rise to the proposed scheme, as presented in the next section.

Posterior-HMM-Based Informed Watermarking Scheme
As pointed out in Section 2.2, the prior VWD-HMM parameters need to be sent as side information to the receiver. To handle this problem, we are inspired to take the posterior VWD-HMM estimated with the received image for message extraction, as similarly implemented in [22]. To assess its feasibility, we perform the following examination.
In the examination, we test 35 256 × 256 grey images with different textures. For each image, we take JPEG compression with QF = 70 as the predefined robustness and embed the same message sequence of 1024 random bits via the approach in [21] (see also Section 2). The generated watermarked images are then attacked with JPEG compression at different QFs, and the message sequence is extracted from the attacked images. In message extraction, the prior and posterior HMMs that are estimated with the original and attacked images are adopted, respectively. In other words, the compared two cases have the same setting except the used HMM parameters at the receiver. Their corresponding performance in terms of bit error rate (BER) is plotted in Figure 3, where BERs have been averaged over all 35 test images. It is clearly found that the performance of the posterior-HMM-based informed watermarking scheme degrades significantly. This is strongly contrasted with the results for the noninformed scheme of [22], in which the posterior HMM only leads to a slight degradation of detection performance.
The reasons for the significant degradation are explained as follows. Consider the wavelet coefficients of the original image and those of the received image at each scale $j$ and location $k$. The received coefficients would be remarkably different from the original ones, because a relatively large embedding strength is generally adopted for informed watermarking with high capacity and robustness. As a result, the posterior HMM estimated with the received coefficients, say $\Theta_{\text{post}}$, would significantly deviate from the prior one estimated with the original coefficients, namely, $\Theta_{\text{pri}}$. In turn, the detection performance using $\Theta_{\text{post}}$ decreases greatly, and the predefined robustness ensured with respect to $\Theta_{\text{pri}}$ is no longer achieved. According to the above analysis, the performance degradation is mainly due to the deviation of $\Theta_{\text{post}}$ from $\Theta_{\text{pri}}$. If they are sufficiently close to each other, the detection performance would decrease only slightly. That is, the key point for posterior-HMM-based informed watermarking is to let both the encoder and decoder adopt (nearly) identical HMM parameters. To achieve this objective, we impose the same strong attack on the original and received images and then use these attacked versions to estimate the HMM parameters for the encoder and decoder, respectively. Once the imposed attack is much stronger than the inserted watermark signal, the HMMs estimated at the encoder and decoder would be close enough to each other.
The feasibility of this strategy can be evaluated as follows. Let $\Delta^{\text{attack}}_{j,k}$ be the distortion due to a particular strong attack.
The attacked version of each original wavelet coefficient is then obtained by adding $\Delta^{\text{attack}}_{j,k}$ to it. These attacked coefficients are used to yield the HMM parameter set, namely, $\Theta^{\text{attack}}_{\text{pri}}$. With respect to $\Theta^{\text{attack}}_{\text{pri}}$, the predefined robustness is ensured via the informed embedding algorithm in [21] (see also Section 2.2). Similar to the encoder, the received wavelet coefficients are also subjected to the same strong attack to yield their attacked version, from which $\Theta^{\text{attack}}_{\text{post}}$ is estimated. Because the imposed attack is much stronger than the inserted watermark, $\Theta^{\text{attack}}_{\text{post}}$ would be close enough to $\Theta^{\text{attack}}_{\text{pri}}$. In turn, using $\Theta^{\text{attack}}_{\text{post}}$ for watermark detection would degrade the detection performance only slightly. In other words, this strategy allows designing a posterior-HMM-based informed watermarking scheme that achieves performance comparable to the prior-HMM-based one but eliminates the transmission of HMM parameters to the receiver.
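The intuition that a sufficiently strong common distortion hides the watermark from the model estimator can be checked with a toy experiment. The sketch below is only an analogy: a coarse uniform quantizer stands in for JPEG compression at a small QF, the host coefficients are synthetic Gaussian samples, and the watermark is a small uniform perturbation. None of these choices comes from the scheme itself.

```python
import math
import random


def quantize(values, step):
    """Uniform quantizer, used here as a stand-in for a strong lossy attack."""
    return [step * math.floor(v / step + 0.5) for v in values]


def fraction_identical(a, b):
    """Share of positions at which the two sequences agree exactly."""
    return sum(1 for u, v in zip(a, b) if u == v) / len(a)


rng = random.Random(0)
host = [rng.gauss(0.0, 30.0) for _ in range(20000)]    # synthetic coefficients
marked = [x + rng.uniform(-2.0, 2.0) for x in host]    # weak synthetic watermark

# Compare the attacked host with the attacked watermarked signal.
weak = fraction_identical(quantize(host, 4.0), quantize(marked, 4.0))
strong = fraction_identical(quantize(host, 64.0), quantize(marked, 64.0))
```

With the coarser quantizer, far more coefficients become identical on both sides, so any statistic estimated from them (such as HMM parameters) is expected to differ less.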
In the interest of achieving high robustness against JPEG compression, the prior-HMM-based informed watermarking scheme [21] takes JPEG compression at a particular QF (e.g., QF = 70) as the predefined robustness. Under such a setting, we can similarly adopt JPEG compression with a small QF as the strong attack for estimating $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$. For notational convenience, the QFs for the predefined robustness and the strong attack are denoted as Rbst_QF and Attk_QF, respectively. Therefore, by using $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$ for informed embedding and message extraction, respectively, we can develop a posterior-HMM-based informed watermarking scheme (PostHIW), as illustrated in Figure 4. The details are as follows.
(2) Impose JPEG compression with QF = Attk_QF on the cover image $I(x, y)$. Then use the EM algorithm (see also Section 2.1) to obtain the HMM parameter set, $\Theta^{\text{attack}}_{\text{pri}}$.
(3) Set the predefined robustness, represented by a JPEG QF, to Rbst_QF.
(4) Execute Steps (4) to (6) in Section 2.2 to yield the watermarked image, $I^{w}(x, y)$. In this implementation, $\Theta^{\text{attack}}_{\text{pri}}$ estimated with the attacked original image, rather than $\Theta_{\text{pri}}$ trained with the original image, is used for informed embedding.
The posterior-HMM-based extraction process is the same as that in Section 2.2 except that the posterior HMM parameter set $\Theta^{\text{attack}}_{\text{post}}$, rather than the prior one $\Theta_{\text{pri}}$, is exploited for watermark detection. In particular, the received (probably distorted) image is first attacked with JPEG compression at QF = Attk_QF. The attacked image is then used to estimate the HMM parameter set, $\Theta^{\text{attack}}_{\text{post}}$, which is finally employed to extract the message via the detection approach in Section 2.2.
As mentioned above, the performance of the proposed PostHIW is highly related to the closeness between $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$. The closer they are, the smaller the performance degradation is. Therefore, we need to find an optimum Attk_QF that makes $\Theta^{\text{attack}}_{\text{post}}$ closest to $\Theta^{\text{attack}}_{\text{pri}}$. This is a fundamental issue for the proposed PostHIW. Its practical determination is given in the next section.

Determination of the Practically Optimum Attk_QF
As mentioned in Section 3, Attk_QF is a key parameter for the estimation of $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$ and would have a significant impact on the PostHIW's performance. To achieve the best performance, we attempt to determine an optimum setting for Attk_QF. As analyzed in Section 3, Attk_QF should be sufficiently small to let $\Theta^{\text{attack}}_{\text{post}}$ be close enough to $\Theta^{\text{attack}}_{\text{pri}}$. As it is difficult to obtain an analytic function that characterizes the relationship between Attk_QF and the deviation of $\Theta^{\text{attack}}_{\text{post}}$ from $\Theta^{\text{attack}}_{\text{pri}}$, we rely on numerical simulation to roughly analyze this relationship. This gives rise to a practically optimum setting for Attk_QF, say Attk_QF_opt.
In the simulation, we test 35 grey images of size 256 × 256 with different textures. We start by estimating the HMM parameter sets, $\Theta^{\text{attack}}_{\text{pri}}$s, for the original test images. For each test image, we impose JPEG compression with Attk_QFs in the range [5 : 5 : 100], where the second 5 denotes the step, and then employ the EM algorithm [21,23] (see also Section 2.1) to estimate $\Theta^{\text{attack}}_{\text{pri}}$. We proceed to estimate the HMM parameter sets, $\Theta^{\text{attack}}_{\text{post}}$s, for the watermarked images. As Attk_QF_opt has not yet been determined at this point, we cannot employ the $\Theta^{\text{attack}}_{\text{pri}}$ with respect to Attk_QF_opt to generate watermarked images via the approach in Section 3. Instead, we obtain the watermarked images for evaluation using the prior-HMM-based informed watermarking (PriHIW) in [21] (see also Section 2.2). This makes sense as long as the JPEG compression attack is much stronger than the inserted watermark, for example, in the situation of a small Attk_QF. In the implementation, we set the Rbst_QF that represents the predefined robustness to 70, 80, and 90, respectively, and then employ the PriHIW to generate 105 watermarked images. These are further used to obtain 105 HMM parameter sets, $\Theta^{\text{attack}}_{\text{post}}$s, via the same EM algorithm as that for the $\Theta^{\text{attack}}_{\text{pri}}$s. To evaluate the deviation of the $\Theta^{\text{attack}}_{\text{post}}$s from the $\Theta^{\text{attack}}_{\text{pri}}$s, we adopt the Kullback-Leibler divergence (KLD). Assume that a total of LEV (e.g., LEV = 2) coarsest levels of a $J$-level ($J \geq \text{LEV}$) wavelet pyramid are used for HMM estimation.
The KLD at the $j$th level, $\mathrm{KLD}_j$, is computed between $p_j(\Theta^{\text{attack}}_{\text{pri}})$ and $p_j(\Theta^{\text{attack}}_{\text{post}})$, the probabilities of wavelet coefficients at the $j$th level that are computed from $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$, respectively. Here $(\cdot)^{\text{attack}}_{\text{pri}}$ and $(\cdot)^{\text{attack}}_{\text{post}}$ denote the parameters belonging to the sets $\Theta^{\text{attack}}_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$, respectively, and $g(\mathbf{t}; \mathbf{C})$ is defined in (2). According to (3), the $p_j(\Theta^{\text{attack}}_{\text{post}})$ and $p_j(\Theta^{\text{attack}}_{\text{pri}})$ at the other levels (i.e., $J - \text{LEV} + 1 \leq j < J$) are computed accordingly. After obtaining the KLD for each level of the given image, we finally average them to yield the average distance, that is,

$$\mathrm{KLD}_{\text{avg}} = \frac{1}{\text{LEV}} \sum_{j = J - \text{LEV} + 1}^{J} \mathrm{KLD}_j,$$

to reflect the deviation of $\Theta^{\text{attack}}_{\text{post}}$ from $\Theta^{\text{attack}}_{\text{pri}}$. Further averaging the $\mathrm{KLD}_{\text{avg}}$s of all test images yields the statistically averaged distance, say $\overline{\mathrm{KLD}}_{\text{avg}}$. In the statistical sense, $\overline{\mathrm{KLD}}_{\text{avg}}$ characterizes the relationship between Attk_QF and the deviation of $\Theta^{\text{attack}}_{\text{post}}$ from $\Theta^{\text{attack}}_{\text{pri}}$. Figures 5(a) to 5(c) summarize the relationship between Attk_QF and $\overline{\mathrm{KLD}}_{\text{avg}}$ for Rbst_QF = 70, 80, and 90, respectively. It is observed that increasing Attk_QF generally increases $\overline{\mathrm{KLD}}_{\text{avg}}$. This is consistent with intuition: increasing Attk_QF makes $\Delta^{\text{attack}}_{j,k}$ gradually decrease until it is of the same order of magnitude as the embedded signal (see also (6)), and thus $\Theta^{\text{attack}}_{\text{post}}$ gradually deviates from $\Theta^{\text{attack}}_{\text{pri}}$. Therefore, it makes sense to take Attk_QF_opt = 5 as an optimum setting for the PostHIW under different predefined robustness levels. Although this setting might not be theoretically ideal, it is an optimum setting for practical application, as will be demonstrated in Section 5.
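The averaging and selection procedure can be summarized in a short sketch. It makes a strong simplification: each level is represented by a discrete probability vector instead of the continuous pdfs of the VWD-HMM, so `kld` below is the textbook discrete divergence rather than the formula used in the simulation. All function names are illustrative, and the numbers in the usage lines are placeholders, not measured results.

```python
import math


def kld(p, q):
    """Discrete Kullback-Leibler divergence: sum of p * log(p / q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def average_kld(per_level_pairs):
    """Average the per-level divergences over the LEV levels used for the HMM."""
    values = [kld(p, q) for p, q in per_level_pairs]
    return sum(values) / len(values)


def pick_attack_qf(avg_kld_by_qf):
    """Return the candidate Attk_QF with the smallest averaged divergence."""
    return min(avg_kld_by_qf, key=avg_kld_by_qf.get)


# Candidate quality factors 5, 10, ..., 100, i.e. the range [5 : 5 : 100].
QF_GRID = list(range(5, 101, 5))

# Placeholder distributions for LEV = 2 levels (not measured data).
levels = [([0.7, 0.3], [0.6, 0.4]), ([0.5, 0.5], [0.5, 0.5])]
example_avg = average_kld(levels)
```

In a full experiment, `average_kld` would be evaluated per image and per candidate in `QF_GRID`, averaged over all test images, and passed to `pick_attack_qf`.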

Experimental Results and Analysis
In this section, we assess the proposed PostHIW by comparing it to the PriHIW [21] (see also Section 2) and the state of the art [19]. In the simulation for the state of the art, that is, the trellis-based informed watermarking (TIW) [19], we also embed the same message in each test image. As the TIW presented in [19] is implemented in the DCT domain rather than the wavelet domain used here, we slightly modify its implementation from the DCT domain to the wavelet one for a fair comparison. That is, we replace the embedding units of 12 DCT coefficients and the perceptual masks in the TIW with the 15-node vector trees and the visual masks in the PostHIW or PriHIW, respectively, but we keep the other parts of the TIW unchanged. Under this setting, the perceptual distance in the wavelet domain can be employed for fair performance evaluation. As in the comparison between the PriHIW and PostHIW, the perceptual distance for the TIW, namely, TIW DWT, is also set to be nearly identical to Pri DWT and Post DWT by adjusting the robustness threshold of the TIW. Figure 6 illustrates several images watermarked by the PriHIW, PostHIW, and TIW. It can be observed that the subjective visual fidelity of the watermarked images of the PriHIW, PostHIW, and TIW is similar. Table 1 summarizes the mean and standard deviation of the peak signal-to-noise ratios (PSNRs) and perceptual distances for all 35 watermarked images generated by the PriHIW, PostHIW, and TIW, respectively. It is seen that these three schemes have nearly identical average perceptual distances.

Fidelity Evaluation.
In Figure 6, we adopt a relatively large perceptual distance for the convenience of illustrating visual artifacts. In practice, a smaller perceptual distance, and thus better subjective visual fidelity, can be achieved by using a larger Rbst_QF. This is demonstrated in Figure 7, where the image "Lena" is taken as an example and the watermarked images are generated via the PostHIW with Rbst_QF set to 70, 80, and 90, respectively.

Performance Comparison between the PriHIW, PostHIW, and TIW.
In this subsection, we evaluate the proposed PostHIW by comparing it to the PriHIW and TIW. In the evaluation, we take the watermarked images under the settings of Rbst QF = 70 and Attk QF = 5 as test images. On each test image, we impose JPEG compression, gain attacks, additive white Gaussian noise (AWGN), and lowpass Gaussian filtering (LPGF). We then assess the average performance in terms of BER against these attacks. The results are summarized in sections from Section 5.3.1 to Section 5.3.4, followed by the computation time evaluation for all three schemes.
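The figure of merit used throughout this comparison, the BER averaged over test images, can be computed as in the following sketch. The function names are illustrative and the short bit sequences are toy inputs, not experimental data.

```python
def bit_error_rate(sent, received):
    """Fraction of positions at which the extracted bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("bit sequences must have the same length")
    errors = sum(1 for s, r in zip(sent, received) if s != r)
    return errors / len(sent)


def average_ber(pairs):
    """Mean BER over (sent, received) pairs, one pair per test image."""
    rates = [bit_error_rate(s, r) for s, r in pairs]
    return sum(rates) / len(rates)


# Toy inputs: one image decoded perfectly, one with a single wrong bit.
toy_pairs = [
    ([0, 1, 1, 0], [0, 1, 1, 0]),
    ([1, 0, 0, 1], [1, 0, 1, 1]),
]
toy_average = average_ber(toy_pairs)
```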

Performance against JPEG Compression.
To examine the performance against JPEG compression, we impose JPEG compression with QFs ranging from 10 to 100 on all test watermarked images. Figure 8 plots the performance comparison for the PriHIW, PostHIW, and TIW. It is shown that the three compared schemes are equivalently robust to JPEG compression with QF ≥ 70. This implies that the predefined robustness indicated by Rbst QF = 70 has been desirably achieved, which demonstrates the feasibility of the PriHIW and PostHIW.
The PostHIW's performance is remarkably weaker than the PriHIW's for QF ∈ (20, 70] and sufficiently close to it for QF ≤ 20. These observations can be analyzed as follows. Suppose that the real HMMs for these attacked images are $\Theta_{\text{real}}$s. Assume that the KLD between $\Theta^{\text{attack}}_{\text{post}}$ (see also Section 4) and $\Theta_{\text{real}}$ is KLD post-real and the KLD between $\Theta_{\text{pri}}$ (i.e., estimated with the original image) and $\Theta_{\text{real}}$ is KLD pri-real. When QFs are not sufficiently small, for example, QF ∈ (20, 70], both $\Theta_{\text{pri}}$ and $\Theta^{\text{attack}}_{\text{post}}$ deviate from $\Theta_{\text{real}}$, but KLD pri-real is smaller than KLD post-real. Therefore, the PriHIW's detection performance using $\Theta_{\text{pri}}$ is better than the PostHIW's using $\Theta^{\text{attack}}_{\text{post}}$. When QFs are sufficiently small, for example, QF ≤ 20, KLD post-real and KLD pri-real are probably close to each other, and thus the performance of the PriHIW and that of the PostHIW are close to each other. In the comparison with the TIW, the PostHIW has the same robustness as the TIW for QF ≥ 70, achieves worse performance than the TIW for QF ∈ [30, 70], and obtains higher robustness in other cases. The reasons are as follows. When QF ≥ 70 holds, the attacks are (equivalently) weaker than the predefined robustness, and thus BER = 0 can be exactly achieved for all compared schemes. In the situation of QF ∈ [30, 70], the attacks are more severe than the predefined robustness of the PostHIW but probably weaker than that of the TIW. As a result, the PostHIW has worse performance than the TIW. This can be expected since the TIW uses the trellis code with a long codeword length, which allows achieving better performance at the cost of relatively high computational complexity. In other cases (e.g., QF ≤ 20), the attacks are so strong that both the PostHIW and TIW are subject to uninformed attacks. In contrast to the TIW, the PostHIW exploits the statistical model of wavelet coefficients and consequently achieves higher robustness than the TIW.
In comparison to the TIW, the PriHIW performs similarly to the PostHIW, except that the PriHIW is worse and better than the TIW for QF ∈ [40, 70] and QF ≤ 30, respectively. The reasons are the same as in the analysis given above.

Performance against AWGN.
Figure 9 gives the performance against AWGN attacks for the PriHIW, PostHIW, and TIW, where the standard deviation of the Gaussian noise, namely $\sigma$, is set in the range [1, 20] with step 1. It is seen that both the PriHIW and PostHIW achieve zero BERs for $\sigma \leq 6$. This comes from the fact that AWGN attacks with $\sigma \leq 6$ are equivalently weaker than the robustness threshold represented by Rbst_QF = 70. The PostHIW obtains performance comparable to the PriHIW for $\sigma \in [7, 8]$, where the BERs of the PostHIW for $\sigma = 7$ and $\sigma = 8$ are larger and smaller than those of the PriHIW, respectively. This may arise from unstable HMM estimation for several images. For $\sigma \geq 9$, the PostHIW is considerably worse than the PriHIW. The explanations are similar to those for the case QF ∈ (20, 70] in the comparison between the PriHIW and PostHIW.
It is observed from Figure 9 that the PostHIW has the same zero BERs as the TIW for $\sigma \leq 6$, obtains worse performance than the TIW for $\sigma \in [7, 9]$, and reaches higher robustness in other cases. The reasons are the same as those in the comparison between the PostHIW and TIW for the cases QF ≥ 70, QF ∈ [30, 70], and QF ≤ 20, respectively. Similar results are also found for the PriHIW.

Performance against Gain Attacks.
As the PriHIW, PostHIW, and TIW employ spherical codes, they are promising to be highly robust to gain attacks. To illustrate this, we impose gain attacks with different scaling factors on the aforementioned test images. In the simulation, scaling factors, say s, are set as the range [0.1, 2.0] with step 0.1. The performance comparison against these attacks is summarized in Figure 10. It is demonstrated that the performance of the PostHIW is slightly worse than that of the PriHIW for scaling factors varying from 1.1 to 2.0. The reasons are the same as the explanations for the case of QF ≤ 20 in the comparison between the PostHIW and PriHiW. However, the PostHIW is vulnerable to scaling factors below 0.9. This is because these scaling factors significantly decrease the image amplitude so that a large portion of wavelet coefficients are quantized by the JPEG compression attack with Attk QF = 5 to be zero. In return, this makes, with a large probability, the estimation of Θ attack post unstable and consequently degrades the detection performance significantly. In other cases, both the PriHIW and PostHIW have the identical robustness.
It is found from Figure 10 that the PostHIW obtains higher robustness than the TIW for s ≥ 1.2, behaves the same as the TIW for s = 1, and achieves worse performance in the other situations. The explanations are similar to those for the cases QF < 30, QF ≥ 70, and QF ∈ [30, 70], respectively, in the comparison between the PostHIW and TIW. Somewhat similarly to the PostHIW, the PriHIW is better than the TIW for s ≥ 1.1 and identical to it in the other cases.
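Two effects discussed above can be illustrated with a small sketch. It assumes a detector that picks the codeword with the largest normalized correlation, which is a common way to decode spherical codes but is not confirmed here as the exact detector of the compared schemes. It also uses a plain uniform quantizer with a hypothetical step of 16 as a stand-in for JPEG compression at a small QF. All names are illustrative.

```python
import math

def normalized_correlation(x, c):
    """Cosine of the angle between vectors x and c (both must be nonzero)."""
    dot = sum(a * b for a, b in zip(x, c))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_c = math.sqrt(sum(b * b for b in c))
    return dot / (norm_x * norm_c)

def decode(received, codebook):
    """Index of the codeword with the largest normalized correlation."""
    return max(range(len(codebook)),
               key=lambda i: normalized_correlation(received, codebook[i]))

def zero_fraction(coeffs, step):
    """Fraction of coefficients that a uniform quantizer with the given
    step rounds to zero (stand-in for coarse JPEG quantization)."""
    return sum(1 for v in coeffs if round(v / step) == 0) / len(coeffs)

# Toy codebook and received vector (assumed values for illustration).
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
received = [0.2, 0.9, 0.1]

# Gain attack: multiply every amplitude by s in {0.1, 0.2, ..., 2.0}.
scales = [k / 10 for k in range(1, 21)]
decisions = [decode([s * v for v in received], codebook) for s in scales]

# Shrinking the amplitudes pushes more coefficients into the zero bin.
coeffs = [-12.0, -7.0, -3.0, 4.0, 9.0, 15.0]
zeros_original = zero_fraction(coeffs, step=16.0)
zeros_halved = zero_fraction([0.5 * v for v in coeffs], step=16.0)
```

The decoding decision is the same for every positive scaling factor, because the scaling cancels in the normalized correlation. The fraction of zeroed coefficients, in contrast, grows as the amplitudes shrink, which is consistent with the explanation above for why estimating the posterior HMM becomes unstable at small scaling factors.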

Performance against LPGF.
We further examine the performance against the LPGF. The standard deviation of the Gaussian filter, denoted σ, ranges over [0.1, 2.0] in steps of 0.1. Figure 11 shows the performance comparison for the PriHIW, PostHIW, and TIW. It indicates that the PostHIW is as robust as the PriHIW for σ ≤ 0.5, because these LPGF attacks are weaker than or equivalent to the predefined robustness. In other situations, however, the PostHIW is generally better than the PriHIW. This is because LPGF attacks with σ ≥ 0.6 significantly smooth watermarked images, and thus the KLD post-real would be smaller than the KLD pri-real (see also Section 5.3.1). Consequently, the PostHIW using the posterior HMM achieves higher robustness than the PriHIW employing the prior HMM. The BER exceptions for σ ∈ {0.7, 1.3} may arise from unstable HMM estimation for several watermarked images. Figure 11 also shows that the PostHIW obtains the same zero BERs as the TIW for σ < 0.5 but yields higher robustness than the TIW in the other cases. Similar results are found for the PriHIW. The analyses are analogous to those in Section 5.3.1.
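The reason small filter deviations act as weak attacks can be seen from the filter weights themselves. The following is a minimal sketch, assuming a 3-tap (radius 1) separable Gaussian kernel with edge replication; the window size used in the experiments is not stated in this section, so this choice and the function names are assumptions.

```python
import math

def gaussian_kernel_1d(sigma, radius=1):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def filter_row(row, kernel):
    """Convolve one image row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(n - 1, max(0, i + k - radius))  # clamp index at the borders
            acc += w * row[j]
        out.append(acc)
    return out

# Centre weight of the kernel for each deviation in the sweep 0.1, 0.2, ..., 2.0.
sigmas = [k / 10 for k in range(1, 21)]
centre_weights = {s: gaussian_kernel_1d(s)[1] for s in sigmas}
```

For very small deviations the centre weight is close to 1, so the filter leaves the image almost unchanged; as the deviation grows, the weight spreads to the neighbours and the smoothing becomes stronger. A 2-D filter would apply `filter_row` to every row and then to every column.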

Computation Time Evaluation.
As described in Section 3, the PostHIW replaces the prior HMM with the posterior HMM but keeps the other parts of the PriHIW unchanged. Therefore, the computational complexity of both the embedding and detection processes for the PostHIW would be close to that for the PriHIW. As both the PostHIW and PriHIW use the spherical code with a short codeword length, the computational complexity of their informed embedding is reduced. In contrast, the TIW adopts the trellis-based spherical code with a long codeword length, and it also requires Viterbi-decoding-based iterations in the process of informed embedding. Thus, the computational complexity of informed embedding for the TIW would be relatively large. Unlike the informed embedding process, however, the detection process of the TIW does not need the time-consuming iterations; only a single pass of Viterbi decoding is needed. Thus, the computational complexity of the TIW detection process would be small.
As it is rather difficult to obtain an analytic expression for the computational complexity of the embedding and detection processes of the PostHIW/PriHIW and TIW, we rely on numerical simulation to evaluate their computation time. In the simulation, we implement these schemes in C and run them on a 2.2 GHz Intel Core(TM)2 Duo CPU with 2 GB of memory. The parameter settings are the same as those in Section 5.1. Table 2 summarizes the computation time of the embedding and detection processes for the PriHIW, PostHIW, and TIW, where the results are averaged over all test images. It can be seen that the embedding times of the PriHIW and PostHIW are close to each other and roughly an order of magnitude less than that of the TIW. In addition, the detection times of the three compared schemes are approximately of the same order of magnitude, and detection can be performed in real time.
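The averaging procedure can be sketched as follows. This is only an illustration of timing a routine over a set of inputs and averaging the result; it is written in Python rather than the C used for the reported measurements, and `mean_runtime` and `placeholder_embed` are hypothetical names standing in for the actual embedding or detection routines.

```python
import time

def mean_runtime(func, inputs, repeats=1):
    """Average wall-clock seconds per call of func over all inputs."""
    if not inputs or repeats < 1:
        raise ValueError("need at least one input and one repeat")
    start = time.perf_counter()
    for _ in range(repeats):
        for item in inputs:
            func(item)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))

def placeholder_embed(image):
    """Stand-in workload; a real test would call the embedder here."""
    return sum(v * v for v in image)

# Toy "test images" (flat lists of pixel values).
test_images = [[(i * j) % 256 for i in range(1024)] for j in range(1, 5)]
seconds_per_image = mean_runtime(placeholder_embed, test_images, repeats=3)
```

Timing the whole loop and dividing by the number of calls, rather than timing each call separately, reduces the influence of timer overhead when individual calls are short.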

Conclusion
In this paper, we have presented an enhanced informed image watermarking scheme using the posterior HMM. The key point is to let both the encoder and decoder obtain (nearly) identical HMM parameter sets. This can be achieved by imposing strong attacks on the original and received images and then using the attacked versions to estimate the HMM parameter sets. In the interest of obtaining high robustness to JPEG compression, we take JPEG compression with a small QF as the strong attack. According to numerical simulations, a QF of 5 can reasonably be considered an optimum setting for practical use. Based on this setting, we developed a posterior-HMM-based informed watermarking scheme. Extensive simulations show that the proposed scheme is highly robust to JPEG compression, AWGN, gain attacks, and LPGF. It is also observed that the proposed scheme is comparable to its prior counterpart but eliminates the transmission of the prior HMM as side information to the receiver. This enhances the practical applicability of HMM-based informed watermarking systems. In addition, the proposed scheme is demonstrated to have performance comparable to the state of the art [19] with significantly reduced computation time.