The Theoretical Limits of Watermark Spread Spectrum Sequence

At present, the spread spectrum (SS) sequences used in watermarking include i.i.d. random sequences and the sequences used in SS communications. Both predate digital watermarking, and almost no researchers have examined whether they are really fit for watermarking. In this paper, we compare the SS watermark channel with the traditional SS communication channel and find that the correlation properties they require are different. Considering cropping and translation attacks, we define watermark auto- and cross-correlation and propose the Loose Autocorrelation and Tight Cross-Correlation (LAC&TCC) properties for SS watermarking: whether or not the sequences are synchronized, the autocorrelation is equal or close to 1 and the cross-correlation is equal or close to 0. Accordingly, the peak correlation is divided into the peak autocorrelation $R_a(l)$ and the peak cross-correlation $R_c(l)$. We establish a lower bound on $R_c(l)$ and an upper bound on $R_a(l)$. The two bounds indicate that, no matter how small a part of the cover is retained, the extractor can in theory always find a threshold that distinguishes auto- from cross-correlation.


Introduction
Digital watermarking has been applied to protect digital media from illegal copying and reproduction. Among various watermarking methods [1,2], spread spectrum (SS) watermarking, originally proposed by Cox et al. [3], is a useful approach. SS watermarking is developed from SS communications. The watermark is spread over very many frequency bins so that the energy in any one bin is very small and certainly undetectable [3].
In SS communications, the key factor is the pseudorandom (PN) sequence. By comparison, we propose that SS watermarking has two key components: the insertion strategy and the PN sequences. However, researchers have paid much less attention to the PN sequences than to the insertion strategy. At present, the PN sequences (also called PN codes) used in SS watermarking can be categorized into three kinds: independent and identically distributed (i.i.d.) random sequences, the sequences used in SS communications, and other PN-like sequences.
The i.i.d. Gaussian sequences $N(\mu, \sigma^2)$ (where $\mu$ is the mean and $\sigma^2$ is the variance) are the most widely used i.i.d. random sequences in SS watermarking. Cox et al. first used real-valued $N(0, 1)$ sequences [3], and many researchers have since followed them [4][5][6]. In [7], Kuribayashi and Kato quantize the $N(0, \sigma^2)$ variable to an integer. Their detailed analysis reveals that the attenuation of the signal energy strongly depends on the quantization performed during the embedding and averaging stages. References [8][9][10] use randomly generated sequences taking values from {−1, +1} with equal probability, which can be regarded as a quantization of $N(0, \sigma^2)$ that maps values less than 0 to −1 and values greater than 0 to +1.
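The two kinds of i.i.d. sequence described above can be illustrated with a minimal sketch. The function names, the seed, and the sequence length are assumptions made for illustration and do not come from the cited schemes; a sample that is exactly 0 (a probability-zero event) is mapped to +1 here.

```python
import random


def gaussian_sequence(length, sigma=1.0, seed=0):
    """Draw an i.i.d. N(0, sigma^2) sequence of real values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(length)]


def to_bipolar(values):
    """Quantize real values to {-1, +1}: negatives map to -1, the rest to +1."""
    return [-1 if v < 0 else 1 for v in values]


real_valued = gaussian_sequence(1000)   # real-valued sequence
bipolar = to_bipolar(real_valued)       # its {-1, +1} quantization
```

The bipolar version keeps only the sign of each Gaussian sample, so the two values occur with equal probability.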
SS communications generally make use of a sequential noise-like signal structure, that is, the SS sequences, to spread the normally narrowband information signal over a relatively wide band of frequencies. The receiver correlates the received signals to retrieve the original information signal. The typical sequences used in SS communications include m-sequences [11], Gold sequences [12,13], M-sequences [14], and Walsh sequences [15,16]. Most of them are used in SS watermarking.
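As an illustration of one of these families, the sketch below generates a length-15 m-sequence with a linear feedback shift register and evaluates its periodic autocorrelation. The primitive polynomial $x^4 + x + 1$, the initial state, and the function names are assumptions chosen for the example; they are not taken from the references.

```python
def m_sequence(taps=(0, 1), degree=4, state=(1, 0, 0, 0)):
    """One period of a binary m-sequence.

    Uses the recurrence s[n + degree] = XOR of s[n + t] for t in taps.
    The defaults correspond to the primitive polynomial x^4 + x + 1.
    """
    s = list(state)
    period = 2 ** degree - 1
    while len(s) < period:
        n = len(s) - degree
        bit = 0
        for t in taps:
            bit ^= s[n + t]
        s.append(bit)
    return s[:period]


def periodic_autocorr(x, shift):
    """Unnormalized periodic autocorrelation of a bipolar sequence."""
    n = len(x)
    return sum(x[i] * x[(i + shift) % n] for i in range(n))


bits = m_sequence()
chips = [1 - 2 * b for b in bits]   # map 0 -> +1 and 1 -> -1
profile = [periodic_autocorr(chips, s) for s in range(len(chips))]
```

The profile has a single peak at zero shift and a constant small value elsewhere, which is the sharp autocorrelation that SS communications look for.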
The Scientific World Journal
Among the other PN-like sequences, chaotic sequences are the most commonly used. Chaos is a deterministic phenomenon that has almost all the features of noise. The watermark signal generated by a chaotic system can be embedded in the host image as a small noise-like signal [17,18]. In addition, because chaos is sensitive to initial conditions, it plays an important role in watermarking security.
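A minimal sketch of a chaotic PN-like generator follows. It assumes the logistic map with parameter 3.99 and a threshold at 0.5; the cited schemes [17,18] may use other maps, so the map, the parameter values, and the function name are illustrative assumptions only.

```python
def logistic_bipolar(x0, length, r=3.99, skip=100):
    """Bipolar sequence from the logistic map x <- r * x * (1 - x).

    x0 in (0, 1) plays the role of the key; the first `skip` iterates
    are discarded before any output is produced.
    """
    x = x0
    for _ in range(skip):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        out.append(1 if x >= 0.5 else -1)
    return out


a = logistic_bipolar(0.3, 256)
b = logistic_bipolar(0.3 + 1e-9, 256)   # slightly perturbed initial condition
differing = sum(1 for u, v in zip(a, b) if u != v)
```

Comparing `a` and `b` shows the sensitivity to initial conditions: a tiny change in `x0` yields a different sequence, which is why the initial value can serve as a secret key.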
Although all the preceding PN sequences play some role in SS watermarking, they predate digital watermarking and were not specifically designed for it. Almost no researchers have examined whether they are really fit for digital watermarking. Only a few papers discuss SS watermarking from the perspective of SS sequences.
Kojima et al. [19] propose a digital watermarking scheme based on complete complementary codes and extend it to steganography. The method is shown to be superior to watermarking schemes based on other SS sequences. The complete complementary codes (pairs of sequence sets) have ideal auto- and cross-correlation properties, which improves robustness against collusion attacks.
Huang et al. [20] develop a long PN code based direct sequence spread spectrum (DSSS) flow marking technique for invisibly tracing suspect anonymous flows. One segment of the long PN code is used to spread only one bit of the signal. Benefits of using long PN code include the following: (i) the approach can defeat mean-square autocorrelation based detection technique and make the traceback hard to detect; (ii) it can trace multiple traffic flows in an anonymous network simultaneously.
Deng and Jiang [21] use wavelet sequences as SS codes. The Haar wavelet basis is changed into a multilevel orthogonal sequence by scaling and translation, and this sequence is used to encode the binary image. Experimental results show that, compared with the Hadamard sequence and the improved Gold sequence, the wavelet multilevel sequence resists attacks better than binary sequences.
In this paper, we discuss the influence of cropping on the correlation of SS sequences in SS watermarking. The peak cross-correlation and the peak autocorrelation are separated and redefined. Based on these definitions, we give the theoretical limits.

Watermark Correlations. The correlation properties of sets of SS sequences are important in SS communications.
Traditionally, the delay (or shift) receives the most attention in auto- and cross-correlation. However, delay alone cannot describe the watermark channel correctly. In the watermark channel, attackers use a variety of attacks to remove the watermark or render it useless. These attacks can be roughly grouped into signal processing attacks and geometric attacks. In SS watermarking, geometric attacks are difficult to deal with because they displace pixels, thereby inducing synchronization errors between the original and extracted SS sequences during detection.
In this paper, we consider cropping and translation, because cropping is easy to perform yet hard to counter, and it is often accompanied by translation. For example, as shown in Figure 1, a user can use almost any image processing software to crop a region of interest from a watermarked picture. The extractor can then obtain only a partial SS sequence from the cropped picture. We call the size of the partial sequence the correlation window. The extractor calculates the correlation between the original sequence in the correlation window and the partial sequence. Because it is difficult to locate the partial SS sequence within the whole original sequence, the correlation is computed with a delay (i.e., translation).
Let $x = (x_0, x_1, \ldots, x_{n-1})$ and $y = (y_0, y_1, \ldots, y_{n-1})$ be two bipolar {−1, +1} vectors of length $n$. Then the correlation between $x$ and $y$ is defined to be the number of positions in which $x$ and $y$ agree minus the number of positions in which they disagree.
Definition 1. Suppose $i + l + \tau < n$. The watermark cross-correlation between $x$ and $y$ over the subsequence of length $l$ beginning at position $i$ and with relative shift $\tau$, denoted by $C_{x,y}(\tau, i, l)$, is defined by
$$C_{x,y}(\tau, i, l) = \frac{1}{l} \sum_{j=i}^{i+l-1} x_j \, y_{j+\tau},$$
where $i$ and $l$ are, respectively, the starting position and the size of the correlation window and $\tau$ is the delay.
Similarly, we define the autocorrelation of $x$ over $l$ consecutive bits beginning at position $i$ and with relative shift $\tau$ by
$$A_x(\tau, i, l) = \frac{1}{l} \sum_{j=i}^{i+l-1} x_j \, x_{j+\tau}.$$
Of course, for any $i$ and $l$, we have $A_x(0, i, l) = 1$.
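The windowed correlations defined above can be sketched directly in code. The function names and the sample sequence are hypothetical, and the arguments `shift`, `start`, and `window` stand for the delay, the starting position, and the correlation window size. The text states a strict inequality on their sum; the check below only enforces that every index stays inside the sequences.

```python
def watermark_crosscorr(x, y, shift, start, window):
    """Normalized correlation of x[start:start+window] with y delayed by `shift`.

    For bipolar {-1, +1} inputs this is (agreements - disagreements) / window.
    """
    if shift < 0 or start < 0 or window < 1:
        raise ValueError("shift and start must be >= 0 and window >= 1")
    if start + window + shift > len(y) or start + window > len(x):
        raise ValueError("window plus shift runs past the end of a sequence")
    total = sum(x[j] * y[j + shift] for j in range(start, start + window))
    return total / window


def watermark_autocorr(x, shift, start, window):
    """Autocorrelation: the cross-correlation of x with itself."""
    return watermark_crosscorr(x, x, shift, start, window)


x = [1, -1, 1, 1, -1, -1, 1, -1]
in_sync = watermark_autocorr(x, 0, 2, 4)   # zero delay: every position agrees
delayed = watermark_autocorr(x, 1, 0, 4)   # one-chip delay over the first 4 chips
```

With zero delay the value is 1 for any start and window, matching the remark above; with a nonzero delay the value depends on both the window position and the sequence itself.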

Peak Watermark Correlations.
In order to improve the anti-interference ability, traditional SS communications are interested in the maximum peak absolute value of the autocorrelations and cross-correlations, $R_{\max}$. The smaller this value, the better; that is, the cross-correlation should be close to 0 and the autocorrelation should be sharp.
However, the property that "the autocorrelation is sharp" does not suit watermarking. Since the extractor depends on the correlation value to extract watermarks, the autocorrelation should always be large no matter what attacks the cover has suffered. Hence, we present the LAC&TCC properties: (i) Loose Autocorrelation (LAC): whether or not synchronized, the autocorrelation is equal to or close to 1; (ii) Tight Cross-Correlation (TCC): whether or not synchronized, the cross-correlation is equal to or close to 0.
Correspondingly, we are interested in the maximum peak absolute value of the cross-correlations, $R_c(l)$, and the minimum peak absolute value of the autocorrelations, $R_a(l)$.
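For small sequence sets, the two peak quantities can be evaluated by brute force. The sketch below assumes that the minimum and maximum run over every admissible window start and every non-negative delay that keeps the indices in range; the function names and the toy sequence set are hypothetical.

```python
from itertools import permutations


def corr(x, y, shift, start, window):
    """Normalized windowed correlation of bipolar sequences x and y."""
    return sum(x[j] * y[j + shift] for j in range(start, start + window)) / window


def peak_correlations(seqs, window):
    """Return (r_a, r_c) for equal-length bipolar sequences and a window size.

    r_a: minimum |autocorrelation| over all sequences, starts, and delays.
    r_c: maximum |cross-correlation| over all ordered pairs, starts, and delays.
    """
    n = len(seqs[0])
    offsets = [(start, shift)
               for start in range(n - window + 1)
               for shift in range(n - window - start + 1)]
    r_a = min(abs(corr(x, x, s, i, window)) for x in seqs for i, s in offsets)
    r_c = max(abs(corr(x, y, s, i, window))
              for x, y in permutations(seqs, 2) for i, s in offsets)
    return r_a, r_c


toy = [[1, 1, 1, 1, 1, 1], [1, -1, 1, -1, 1, -1]]
r_a, r_c = peak_correlations(toy, window=2)
```

For this toy pair and a window of two chips, the autocorrelation magnitude stays at 1 and the cross-correlation stays at 0 for every delay, which is the behaviour the LAC and TCC properties ask for. With a window of three chips the cross-correlation magnitude rises to 1/3.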

The Theoretical Limits
Usually, it is difficult to give the exact values of $R_c(l)$ and $R_a(l)$, but we can give lower and upper bounds. In traditional SS communications, there are the Welch bound [22,23], the Sarwate bound [24], the Sidelnikov bound [25], and so forth. They give lower bounds on the maximum correlation $R_{\max}$, that is, on how small the cross-correlation and autocorrelation can simultaneously be. In SS watermarking, because we define $R_c(l)$ and $R_a(l)$ separately, we should give a lower bound on $R_c(l)$ and an upper bound on $R_a(l)$, respectively.
3.1. The Lower Bound of $R_c(l)$. As described above, researchers use different approaches to derive different lower bounds on $R_{\max}$. The lower bound on $R_c(l)$ is similar to that on $R_{\max}$; the difference is that $R_c(l)$ includes only cross-correlation.

Theorem 2.
In SS watermarking, let Λ be a set of LAC&TCC sequences of length $n$ and let $k$ be a positive integer; then for every $l$ with $1 \le l \le n$, [...]
Proof. Obviously, [...] We have [...] Because the maximum value of the product [...] is 1, [...] Various choices of the indices give rise to the same product. The number of choices can be expressed as multinomial coefficients. This observation can be used to rearrange the sum in the form [...] where [...] Since the summands are nonnegative, the terms with [...] may be dropped to yield [...] Cauchy's inequality for sums of squares then gives [...] Interchanging the orders of summation and applying the multinomial expansion theorem gives [...] From the previous inequality this yields [...] and the theorem follows.
In particular, when $k = 1$, we have [...]
3.2. The Upper Bound of $R_a(l)$.
Theorem 3. In SS watermarking, let Λ be a set of LAC&TCC sequences of length $n$ and let $k$ be a positive integer; then for every $l$ with $1 \le l \le n$, [...]
Proof. Assume that, in $A_x(\tau, i, l) = (1/l) \sum_{j=i}^{i+l-1} x_j x_{j+\tau}$, the number of positions in which $x_j$ and $x_{j+\tau}$ agree is $m$; then $A_x(\tau, i, l) = (m - (l - m))/l = (2m - l)/l$. In order to reduce the bit error rate (BER), $A_x(\tau, i, l)$ should be between 0.5 and 1. From $1/2 < A_x(\tau, i, l) \le 1$, we can derive $(3/4)l < m \le l$. Obviously, [...] We have [...] From the previous inequality this yields [...] and the theorem follows.
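The agreement-count step in the proof of Theorem 3 can be checked numerically. In the sketch below, `l` is the window size and `m` the number of agreeing positions, so the normalized autocorrelation is (2m − l)/l; the function names are hypothetical, and exact fractions are used to avoid rounding at the boundary value 1/2.

```python
from fractions import Fraction


def autocorr_from_agreements(m, l):
    """Autocorrelation when m of the l window positions agree."""
    return Fraction(2 * m - l, l)


def min_agreements(l):
    """Smallest integer m with (2m - l)/l > 1/2, that is, m > 3l/4."""
    return (3 * l) // 4 + 1


# For each window size, list the required agreements and the resulting value.
table = [(l, min_agreements(l), autocorr_from_agreements(min_agreements(l), l))
         for l in (4, 8, 10, 16)]
```

For a window of 4 chips, all 4 positions must agree, since 3 agreements give exactly 1/2 and the inequality is strict; for larger windows a few disagreements are tolerated.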
(1) $R_c(l)$ cannot be arbitrarily small and $R_a(l)$ cannot be arbitrarily large.
$R_c(l)$ contains only cross-correlation information and $R_a(l)$ contains only autocorrelation information. Although they are more "pure" than $R_{\max}$, in general they cannot be equal to 0 and 1, respectively. The lower bound on $R_c(l)$ and the upper bound on $R_a(l)$ indicate that there is a constraint between them.
(2) The threshold to distinguish auto- and cross-correlation always exists. Figure 2 plots the upper bound of $R_a(l)$ and the lower bound of $R_c(l)$. It shows that, no matter how small a part of the cover is retained, the curve for $R_a(l)$ always lies above the curve for $R_c(l)$. That is to say, the extractor can in theory always find a threshold to distinguish auto- and cross-correlation.
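As a rough illustration of how an extractor might use such a threshold, the sketch below places it midway between the peak autocorrelation and the peak cross-correlation. The midpoint rule and the function names are assumptions made for the example; the analysis above only states that a separating threshold exists.

```python
def pick_threshold(r_a, r_c):
    """Midpoint between the peak autocorrelation r_a and peak cross-correlation r_c.

    The midpoint is an illustrative choice, not one prescribed by the analysis.
    """
    if r_a <= r_c:
        raise ValueError("no separating threshold: r_a must exceed r_c")
    return (r_a + r_c) / 2


def is_matching_sequence(measured, threshold):
    """Decide whether a measured correlation indicates the matching sequence."""
    return abs(measured) > threshold


threshold = pick_threshold(0.75, 0.25)
decisions = [is_matching_sequence(v, threshold) for v in (0.8, -0.1)]
```

Any value strictly between the two peaks would separate the two cases equally well in theory; the midpoint simply leaves the same margin on both sides.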

Conclusion
In this paper, we propose the LAC&TCC properties and give the theoretical limits of SS watermarking sequences under cropping and translation attacks. From the perspective of SS watermarking sequences, future work includes the following.
(1) Considering Other Attacks. In this paper, to limit the complexity, we do not discuss rotation, scaling, and other attacks. In practice, these attacks occur often and are difficult to deal with. In future research, we will add them to (1) and (2).
(2) The LAC&TCC Sequences Design. As described previously, the existing SS sequences do not have the LAC&TCC properties and are not really fit for watermarking. It is necessary to design LAC&TCC sequences to improve the performance of SS watermarking.