This paper proposes a statistically matched wavelet based textured image coding scheme for efficient representation of texture data in a compressive sensing (CS) framework. Statistically matched wavelet based representation concentrates most of the captured energy in the approximation subspace, leaving very little information in the detail subspace. Rather than encoding the full-resolution statistically matched wavelet subband coefficients, we encode only the approximation subband coefficients (LL) using a standard image compression scheme such as JPEG2000. The detail subband coefficients, that is, HL, LH, and HH, are jointly encoded in a compressive sensing framework. The compressive sensing technique has shown that a sampling rate lower than the Nyquist rate can achieve acceptable reconstruction quality. The experimental results demonstrate that, at a similar compression ratio, the proposed scheme provides better PSNR and MOS than conventional DWT-based image compression schemes in a CS framework and other wavelet based texture synthesis schemes such as HMT-3S.
1. Introduction
Texture data contain spatial, temporal, statistical, and perceptual redundancies. Representing texture data using standard compression schemes such as MPEG-2 [1] and H.264 [2] is not efficient, as these schemes are based on Shannon-Nyquist sampling [3] and do not account for perceptual redundancies. They are also resource consuming (acquiring too many samples) because of the fine detail and high-frequency content of textured images. A variety of applications in computer vision, graphics, and image processing (such as robotics, defence, medicine, and geosciences) demand better compression with good perceptual reconstruction quality rather than bit-accurate (high PSNR) reconstruction, because the human brain can decipher important variations in data at scales smaller than those of the viewed objects. Ndjiki-Nya et al. [4–8], Bosch et al. [9, 10], Byrne et al. [11, 12], and Zhang et al. [13, 14] have proposed techniques to reconstruct visually similar texture from sample data. The statistically matched wavelet [15] aims at designing a filter bank that matches a given pattern in the image and can represent the corresponding image better than other wavelet families.
The compressive sensing (CS) technique [16] has shown that a sampling rate lower than the Nyquist rate [3] can achieve acceptable reconstruction quality. Leveraging the concept of transform coding, compressive sensing enables a potentially large reduction in the sampling and computation costs of sensing signals that have a sparse or compressible representation (by a sparse representation, we mean that a signal of length N can be represented with K≪N nonzero coefficients). The compressive sensing framework opens a new research dimension in which most sparse signals can be reconstructed from a small number of measurements (M) using algorithms such as convex optimization, greedy methods, and iterative thresholding [17, 18]. Significant theoretical contributions on compressive sensing have been published in recent years [16, 17, 19], including image processing applications [20–24]. The compressive sensing framework mainly consists of three stages: sparsification by transformation, measurement (projection), and optimization (reconstruction). Designing a measurement matrix with a large compression effect and designing a good signal recovery algorithm are the two major challenges in applying the CS technique to image compression.
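As a toy illustration of compressibility (ours, not from the paper), the following Python sketch keeps only the K largest DCT coefficients of a length-N signal and shows that they retain nearly all of its energy; the signal, the DCT sparsifying basis, and the values of N and K are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dct, idct

# A compressible signal: keeping only the K largest of N transform
# coefficients preserves almost all of the signal energy (K << N).
N, K = 1024, 32
t = np.linspace(0, 1, N)
x = np.cos(2 * np.pi * 5 * t) + 0.5 * np.cos(2 * np.pi * 12 * t)

s = dct(x, norm='ortho')               # sparsifying transform (DCT here)
idx = np.argsort(np.abs(s))[::-1][:K]  # indices of the K largest coefficients
s_K = np.zeros(N)
s_K[idx] = s[idx]                      # K-term approximation in transform domain
x_K = idct(s_K, norm='ortho')          # back to the signal domain

energy_kept = np.sum(x_K**2) / np.sum(x**2)
```

In the paper's setting the sparsifying basis is the statistically matched wavelet rather than the DCT, but the principle is the same: a compressible signal is well approximated by K≪N coefficients.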
1.1. The Prior Works and Motivation
Existing textured image compression schemes can be broadly classified as parametric or nonparametric. Nonparametric approaches can be applied to a wide variety of textures (including irregular texture patterns) and provide better perceptual results; however, they are often computationally more complex. Parametric approaches can achieve very high compression at low computational cost, but they are not effective for structured textures, such as those with primarily nonstationary data content.
Nonparametric approaches are pixel-based or patch-based. Efros and Leung [25] proposed pixel-based nonparametric sampling to synthesize texture. Wei and Levoy [26] further improved this approach using a multiresolution image pyramid based on a hierarchical statistical method. A limitation of these pixel-based methods is incorrect synthesis owing to incorrect matches when searching for similar statistics. Patch-based methods overcome this limitation by matching features across patch boundaries with multiple pixel statistics. Markov Random Field (MRF) models are commonly used for texture analysis [7, 26], and a popular choice for texture synthesis is the patch-based graph cut [27]. Region-based texture representation and synthesis algorithms [28–30] have been explored recently to address the limitations of block-based representation in handling homogeneous texture and blocking artifacts. Byrne et al. [11, 12] have demonstrated a region-based synthesis structure using a morphological and spectral image representation technique [31]. Region-based representation has also been explored using multiresolution wavelet based decomposition, as reported in [32–34]; however, that work is limited to the document class of images.
Recent work in the parametric approach is typically based on Auto Regressive (AR) [38] or Auto Regressive Moving Average (ARMA) modelling [39–41]. AR- and ARMA-based texture synthesis enables blocks to be selectively removed at the encoding stage and reconstructed at the decoder with acceptable perceptual quality. These approaches are suitable for textures with stationary data, such as steady water, grass, and sky; however, they are not suitable for structured textures with nonstationary data, as blocks with nonstationary data are not amenable to AR modelling. Further, being block-based, they can introduce blocking artifacts in the synthesised image. Portilla and Simoncelli [37] propose a statistical model for texture images based on joint statistical constraints on the wavelet coefficients. The authors proposed an algorithm for synthesizing textures using sequential projections onto the constraint surfaces; however, the model is limited by the choice of statistical constraints and is not suitable for textures with structural patterns. The wavelet domain hidden Markov tree (HMT-3S) [42] was recently used for texture analysis and synthesis, where the three subbands of the two-dimensional DWT, that is, HL, LH, and HH, were assumed independent. HMT-3S adds dependencies across subbands by jointly treating the three hidden elements across the three orientations; however, HMT-3S is built on the nonredundant DWT, which is inferior to the redundant DWT for statistical modelling. Moreover, for structural textures with regular patterns, both HMT and HMT-3S fail to reproduce the regular patterns.
In this paper, we propose a statistically matched wavelet based texture data representation and a compressive sensing based texture synthesis scheme (henceforth referred to as SMWT-CS). Statistically matched wavelet based representation [43] concentrates most of the captured energy in the approximation subspace, leaving very little information in the detail subspace. Rather than encoding the full-resolution statistically matched wavelet subband coefficients (as normally done in standard wavelet based image compression), we encode only the approximation subband coefficients (LL), which account for 1/4th of the total coefficients and can be represented with fewer bits, using a standard image compression scheme such as JPEG2000 [44]. The detail subband coefficients, that is, HL, LH, and HH (which account for 3/4th of the total coefficients), are jointly encoded in a compressive sensing framework and can therefore be represented with fewer measurements. A quality metric is an essential component for assessing the performance of an image compression system. We have computed the Mean Opinion Score (MOS) for subjective assessment, as it captures the visual perception of human subjects, and PSNR for objective assessment in the comparative study.
Table 1 provides a comparison of existing texture analysis and synthesis schemes, with our proposed scheme included to put the work in perspective. As can be seen, the proposed scheme is more efficient than the existing schemes. The main difference between a conventional wavelet based compression scheme in a CS framework (DWT-CS) [21, 35, 45] and our approach is the use of statistically matched wavelet based sparsification [15, 43] instead of the generic DWT used in previous works. The existing schemes use the regular DWT, which cannot fully exploit the signal properties (an orthogonal/nonredundant DWT cannot well preserve regularity and/or periodicity [46]) and hence does not fully exploit the sparsity. Performing the CS optimization over the detail subspace allows scalability in choosing the measurement vector size, providing a tradeoff between compression and perceptual reconstruction quality. The other novelty is that we apply standard coding to the low resolution part (LL subband) of the information: because of the nature of wavelet decomposition (especially multilevel decomposition), the low frequency information is not sparse and hence not amenable to CS based encoding. Our proposed scheme can easily be integrated into an existing encoding framework [38], with texture analysis and synthesis performed by the proposed scheme. So far, multiresolution based representation has been used for the document class of images; to the best of our knowledge, however, no work has reported a statistically matched wavelet and compressive sensing based analysis and synthesis technique for generic texture class images achieving compression with good subjective reconstruction quality.
Table 1: Comparative study of texture analysis and synthesis schemes.

Proposed work | Texture analysis and synthesis technique | Texture type | Quality assessment | Limitation | Complexity
Wang and Adelson [29] | Affine warping | Rigid textures | None | Not suitable for structural texture | Medium
Dumitras and Haskell [30] | Steerable pyramids | Rigid textures | None | Not suitable for structural texture | Medium
Ndjiki-Nya and Wiegand [8] | Perspective warping | Rigid and nonrigid textures | Yes | Prone to propagation error | High
Zhang and Bull [14] | ARMA | Rigid and nonrigid textures | Yes | Not suitable for structural texture | Medium
Portilla and Simoncelli [37] | Wavelet based (joint statistics) | Rigid and nonrigid textures | None | Not suitable for structural texture | Medium
Fan and Xia [42] | Wavelet based (HMT-3S) | Rigid and nonrigid textures | None | Not suitable for structural texture | Medium
Our scheme (SMWT-CS) | Compressive sensing and statistically matched wavelet | Rigid and nonrigid textures | Yes (PSNR and MOS) | CS optimization still evolving | Medium
1.2. Overview of the Scheme
Figure 1 gives an overall view of the proposed scheme. The texture analyzer block decomposes the input textured image into approximation (LL) and detail (HL, LH, and HH) subbands using statistically matched wavelet based image representation [15]. The basic idea is to design a statistically matched wavelet filter bank from the source data and decompose the input textured image into approximation and detail subspaces [43]. A standard image encoder such as JPEG2000 [44] is used to encode the approximation subband coefficients (LL). The detail subband coefficients, that is, HL, LH, and HH, are jointly encoded in a compressive sensing framework. The proposed SMWT-CS encoder block performs compressive measurement using the noiselet transform [47] over the detail subspace coefficients (a joint representation of the HL, LH, and HH subband coefficients), followed by quantization and entropy coding. For a K-sparse signal x, we compute a measurement vector of size M×1, which is much smaller than the signal length (N×1), and therefore compression is guaranteed. The texture synthesis block performs compressive sensing optimization to synthesize the samples from the detail subspace measurements using convex optimization [16, 19], that is, l1-norm minimization with an equality constraint. Combining the decoded samples from the standard decoder and the texture synthesizer and applying the inverse statistically matched wavelet transform, a synthesized image is reconstructed.
Overview of analysis and synthesis based image compression.
The rest of the paper is organized as follows. Section 2 provides an overview of the statistically matched wavelet based texture representation scheme. Section 3 presents the design and implementation of the compressive measurement and encoding framework, while Section 4 details the synthesis framework. The experimental results are discussed in Section 5, followed by the conclusion in Section 6.
2. Statistically Matched Wavelet Based Texture Representation
Gupta et al. [15] propose the statistically matched wavelet, which estimates wavelets matched to a given signal in the statistical sense. This concept is further extended to image data, and Figure 2 shows a 2D two-band separable kernel wavelet system (with an example of estimated matched wavelet filters for the brick wall image). In this figure, x and y represent the horizontal and vertical directions, respectively. The scaling filter is represented by f0x and f0y and the wavelet filter by f1x and f1y for the horizontal and vertical directions; their duals are represented by h0x, h0y, h1x, and h1y. The input to this system is a 2D signal (an image in our framework). The output of Channel 1 in Figure 2 is called the approximation (or scaling) subspace, while the outputs of the other three channels (Channels 2, 3, and 4 in Figure 2) are called the detail subspace. The system is designed as a biorthogonal wavelet system so that it satisfies condition (1) for perfect reconstruction of the two-band filter bank:
$$
\begin{aligned}
h_{1i}(n) &= (-1)^{n} f_{0i}(d-n), \\
f_{1i}(n) &= (-1)^{n} h_{0i}(d-n), \\
\sum_{n} h_{0i}(n-2m_{1})\, f_{0i}(n-2m_{2}) &= \delta(m_{1}-m_{2}), \quad \forall m_{1}, m_{2} \in \mathbb{Z}, \\
\sum_{n} h_{0i}(n)\, h_{1i}(n) &= 0,
\end{aligned} \tag{1}
$$
where i is x or y for the horizontal or vertical direction, respectively, and d is any odd delay. f0i is the scaling filter and f1i the wavelet filter; h0i and h1i are the duals of the scaling and wavelet filters, respectively.
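As a quick numerical sanity check of the conditions in (1) (our illustration, not the authors' code), consider a Haar-like biorthogonal pair of length-2 filters with delay d = 1; the filter values here are an illustrative assumption:

```python
import numpy as np

# Toy biorthogonal pair used to check the perfect-reconstruction
# conditions of (1), with delay d = 1.
f0 = np.array([1.0, 1.0])     # analysis scaling filter f0i
h0 = np.array([0.5, 0.5])     # dual (synthesis) scaling filter h0i
d = 1
n = np.arange(2)
h1 = (-1.0)**n * f0[d - n]    # h1i(n) = (-1)^n f0i(d - n)  ->  [1, -1]
f1 = (-1.0)**n * h0[d - n]    # f1i(n) = (-1)^n h0i(d - n)  ->  [0.5, -0.5]

# Biorthogonality: sum_n h0(n - 2m1) f0(n - 2m2) = delta(m1 - m2).
# For length-2 filters the shifted terms (m1 != m2) do not overlap,
# so only the m1 = m2 term needs checking.
assert np.isclose(np.sum(h0 * f0), 1.0)
# Channel orthogonality: sum_n h0(n) h1(n) = 0.
assert np.isclose(np.sum(h0 * h1), 0.0)
```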
Statistically matched wavelet estimation. (a) Matched Wavelet filter bank (output of Channel 1 represents approximation subspace while output of Channels 2, 3, and 4 is detailed subspace). (b) Estimated analysis filter coefficients for brick wall test image.
The optimization criterion used for the matched wavelet is the minimization of the energy in the detail subspace [43]. If a(x,y) is a 2D image signal and â(x,y) is the 2D image reconstructed using only the outputs of Channels 2, 3, and 4 (detail subspace) in Figure 2, then the error function e(x,y) is defined as
$$e(x,y) = a(x,y) - \hat{a}(x,y). \tag{2}$$
To ensure that maximum input signal energy moves to the approximation subspace, the energy E captured in the difference signal e(x,y) should be maximized with respect to both the x- and y-direction filters. This leads to the following set of equations [43]:
$$\sum_{k} h_{1x}(k) \left[\sum_{m} a_{0x}(2m+k)\, a_{0x}(2m+r)\right] = 0 \quad \text{for } r = 0,1,\dots,j-1,j+1,\dots,N-1, \tag{3}$$
$$\sum_{k} h_{1y}(k) \left[\sum_{m} a_{0y}(2m+k)\, a_{0y}(2m+r)\right] = 0 \quad \text{for } r = 0,1,\dots,j-1,j+1,\dots,N-1. \tag{4}$$
Here the jth filter weight is kept constant, leading to a closed-form expression: this is a set of N−1 linear equations in the filter weights that can be solved simultaneously. All rows of the image are placed adjacent to each other to form a 1D signal with variations in the horizontal direction only, represented as a0x in (3). Similarly, all columns of the image are placed together to form a 1D signal with variations in the vertical direction only, represented as a0y in (4). The solution of (3) and (4) gives the analysis high pass (wavelet) filters h1x and h1y. From these, the other filters (analysis low pass, synthesis high pass, and synthesis low pass) are computed using finite impulse response perfect reconstruction biorthogonal filter bank design, as presented with (1).
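A minimal Python sketch of this estimation step (our reconstruction of the procedure, not the authors' code): `a0` stands for the hypothetical rearranged 1D signal of (3), the jth weight is fixed to 1, and the remaining N−1 weights are obtained by solving the resulting linear system:

```python
import numpy as np

def matched_wavelet_filter(a0, N=5, j=2):
    """Estimate the analysis high-pass (wavelet) filter h1 of length N by
    solving the N-1 linear equations of (3): for each r != j,
    sum_k h1[k] * R[k, r] = 0, where R[k, r] = sum_m a0[2m+k] a0[2m+r].
    The j-th weight is fixed to 1 to obtain a closed-form solution."""
    M = (len(a0) - N) // 2                 # number of even shifts used
    R = np.zeros((N, N))                   # autocorrelation-like matrix
    for k in range(N):
        for r in range(N):
            R[k, r] = sum(a0[2*m + k] * a0[2*m + r] for m in range(M))
    rows = [r for r in range(N) if r != j]
    A = R[np.ix_(rows, rows)].T            # A[r, k] = R[k, r], k != j, r != j
    b = -R[j, rows]                        # move the fixed j-th term to the RHS
    w = np.linalg.solve(A, b)
    h1 = np.empty(N)
    h1[j] = 1.0
    h1[rows] = w
    return h1 / np.linalg.norm(h1)         # normalise to unit energy
```

The solution is only defined up to scale (the equations are homogeneous), which is why one weight is fixed and the result is normalised.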
The estimation of the scaling and wavelet functions is done separately for the variations in the horizontal and vertical directions of the input image. Passing the input image through the 2D statistically matched wavelet filter bank yields subsampled images corresponding to the approximation and detail subspaces. Figure 3 demonstrates the decomposition of one of the input test sequences into the approximation and detail subspaces, and Figure 4 gives the distribution of the matched wavelet coefficients in the approximation and detail (horizontal, vertical, and diagonal) subspaces. As one can observe from Figures 3 and 4, most of the information is represented by the approximation subspace alone, with very little information in the detail subspace. In the rest of the paper, we refer to the approximation (scaling) subspace (output of Channel 1 in Figure 2) as the candidate data for host encoding and the detail subspace (outputs of Channels 2, 3, and 4 in Figure 2) as the candidate data for CS measurement and synthesis.
Illustration of statistically matched wavelet decomposition: (a) input image, (b) approximation subspace image, and (c) to (e) detail subspace images for the horizontal, diagonal, and vertical subbands, respectively.
Illustration of the statistically matched wavelet coefficient distribution in the approximation and detail subspaces: (a) histogram and autocorrelation plot of matched wavelet coefficients in the approximation subspace; (b) to (d) histograms and autocorrelation plots of matched wavelet coefficients in the detail subspace for the diagonal, vertical, and horizontal subbands, respectively.
3. Compressive Measurement and Encoding
Compressive sensing exploits the fact that most natural or artificial signals are sparse in some domain and hence compressible. A real valued signal x∈R^N can be represented as a function of basis vectors as follows [16, 17]:
$$x = \sum_{i=1}^{N} s_{i}\psi_{i} \quad \text{or} \quad x = \Psi s, \tag{5}$$
where x and s are N×1 column vectors and Ψ is an N×N sparsifying basis matrix. The signal x is called K-sparse if it can be represented as a linear combination of only K basis vectors, that is, if only K elements of the vector s are nonzero. The signal x can be treated as compressible if it can be well approximated by a signal with only K (K≪N) nonzero coefficients. Figure 5 shows the sparse representation of subband coefficients for the brick wall and escalator test sequences; significant wavelet coefficients are shown as blue pixels and all other, nonsignificant coefficients in white. One can observe that most of the detail subband coefficients are close to zero, and CS is therefore well suited to exploit this sparsity for encoding the detail subband coefficients. The compressive measurement is computed through linear projection as shown in (6):
$$y = \Phi x = \Phi\Psi s = \Theta s. \tag{6}$$
Here y is an M×1 measurement vector with M<N, and Θ is an M×N measurement matrix. The matrix Φ is a dimensionality reduction matrix; that is, it maps R^N to R^M, where M is typically much smaller than N. A main challenge in CS theory is how to design the sensing matrix Φ so that it preserves the information in the signal x. Several results in the literature show that x can be reconstructed from y if Θ satisfies a Restricted Isometry Property (RIP) [16]. We use the statistically matched wavelet as the sparsifying matrix (Ψ) and the noiselet [47] as the sensing matrix (Φ) for compressive measurement according to (6). Noiselet measurements were chosen because they are highly incoherent with the considered sparse domain, so the RIP tends to hold for reasonable values of M. In addition, noiselets come with very fast algorithms, and, just like the Fourier transform, the noiselet matrix does not need to be stored to be applied to a vector; this is of crucial importance for efficient numerical computation, without which applying CS can be very complex. The CS measurement matrix is orthogonal and self-adjoint and is thus easy to manipulate. Scalar quantization of the measurement vector is proposed to achieve a better compression ratio.
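The measurement step of (6) can be sketched in a few lines; note that we substitute a random Gaussian Φ for the noiselet transform of the paper (both are standard RIP-satisfying choices) and take s directly in the sparse domain, so Ψ is the identity here; the dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 256, 64, 8            # signal length, measurements, sparsity

# K-sparse coefficient vector s (stand-in for detail subband coefficients)
s = np.zeros(N)
support = rng.choice(N, K, replace=False)
s[support] = rng.standard_normal(K)

# Random Gaussian Phi as an illustrative stand-in for the noiselet
# measurement matrix; columns scaled so rows have roughly unit norm.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ s                     # M compressive measurements, M << N
```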
Sparse representation of statistically matched wavelets subbands coefficients for brickwall and escalator texture.
It is important to note that (6) is underdetermined, since there are more unknowns than equations (M<N); however, it has been shown that if the signal x is K-sparse and the locations of the K nonzero elements are known, the problem can be solved provided M≥K, through a simplified linear system obtained by deleting the columns and elements corresponding to zero or nonsignificant entries. Figure 6 gives an overall block diagram of the proposed encoder. The bit mapper block generates the final bit stream, combining the encoded data from the standard codec for the approximation subspace with the CS measurements for the detail subspace. Algorithm 1 provides an implementation summary of the proposed encoder framework. The CS measurements are quantized as
$$y \leftarrow \operatorname{round}\!\left(\frac{y}{Q}\right)\cdot Q, \tag{7}$$
where Q is the quantization factor.
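The quantization of (7) is ordinary uniform scalar quantization; a minimal sketch (ours, with an illustrative step size Q):

```python
import numpy as np

def quantize(y, Q):
    """Uniform scalar quantization of the CS measurement vector, per (7):
    y <- round(y / Q) * Q. Larger Q gives coarser measurements and a
    better compression ratio at the cost of reconstruction quality."""
    return np.round(y / Q) * Q
```

Since the quantized value is the nearest multiple of Q, the per-sample error is bounded by Q/2.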
Algorithm 1: CS measurement and encoding.
(1) Design a 2-D separable kernel filter bank using Statistically Matched Wavelet [Detailed in Section 2]
(a) Decompose input image into approximation and detail subbands coefficients
(i) Use the MATLAB wavelet function "wavecut" to represent approximation and detail coefficients
(2) Design a CS measurement matrix Φ using Noiselet Transform [47] and detail subbands coefficients
(3) Do quantization of the CS measurements using (7)
(4) Do Entropy coding of the quantized CS measurements
(5) Combine standard and CS encoded bit streams
Encoder framework.
4. Texture Synthesis Framework
In this section, we present the overall texture synthesis framework. Figure 7 gives an overall block diagram of the proposed decoder.
Decoder framework.
At the decoder, the detail subband coefficients are synthesized from the CS measurements using compressive optimization, while the approximation coefficients are decoded using standard JPEG2000. In all our experiments, we have used convex optimization (l1-norm minimization with an equality constraint) for recovery in the CS framework; because the noiselet measurements ensure the RIP, the l1 solution matches that of l0 minimization. Owing to the nondifferentiability of the l1-norm, this optimization principle leads to sparser decompositions [17] and offers faster and more stable recovery than methods from the class of greedy algorithms, while remaining a computationally viable approach to sparse signal recovery. In all our experiments we use standard basis pursuit with a primal-dual algorithm [48], which finds the vector with the smallest l1-norm (8). Algorithm 2 provides an implementation summary of the convex optimization and the proposed decoder framework. Combining the decoded samples from the standard decoder and the texture synthesizer (CS optimisation) and applying the inverse statistically matched wavelet transform, a synthesized image is reconstructed:
$$\min \|x\|_{1} \quad \text{subject to} \quad \Theta x = s_{a}, \qquad \|x\|_{1} = \sum_{i}|x_{i}|. \tag{8}$$
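Problem (8) can be posed as a linear program. The sketch below (ours) uses SciPy's `linprog` rather than the primal-dual solver of [48]; `Theta` and `y` are assumed to denote the CS matrix and measurement vector:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Theta, y):
    """Solve min ||x||_1 subject to Theta x = y, as in (8), via the
    standard LP reformulation: write x = u - v with u, v >= 0 and
    minimise sum(u) + sum(v) subject to Theta (u - v) = y."""
    M, N = Theta.shape
    c = np.ones(2 * N)                     # objective: sum(u) + sum(v)
    A_eq = np.hstack([Theta, -Theta])      # Theta u - Theta v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * N))
    u, v = res.x[:N], res.x[N:]
    return u - v
```

At the optimum, at most one of u_i and v_i is nonzero, so sum(u) + sum(v) equals the l1-norm of x.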
Algorithm 2: Texture synthesis in a CS framework.
Inputs: Initial Reference Point, Observation Vector, Optimization Parameter
5. Experimental Results
In this section, we present the experimental results. For our experiments, we used the texture database from the Brodatz album [36] and the Portilla and Simoncelli [37] website to select different classes of structural textured images (periodic, pseudoperiodic, and aperiodic) and complex structured photographic textures. All test sequences are 128 × 128, 8-bit, grayscale textures. PSNR and MOS were used as the quality metrics. The Mean Opinion Score (MOS) was computed by collecting and averaging the responses of students and staff working in the lab. All experiments were carried out in MATLAB, running on a Windows XP PC with a P4 CPU and 1 GB RAM.
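The objective metric is the standard PSNR for 8-bit images; a minimal helper (ours, for reference):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """PSNR in dB between an 8-bit reference image and a reconstruction:
    PSNR = 10 log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak**2 / mse)
```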
5.1. Statistically Matched Wavelet Representation Results
In this section we present the simulation results of the statistically matched wavelet based 2D separable kernel wavelet filters used for input data decomposition and subband representation in the proposed framework. Table 2 gives the statistically matched analysis wavelet filters estimated from the input data with the filter length set to 5. Using the analysis filters, we computed all other wavelet filters as described in Section 2 to construct the filter bank. Figure 8 shows the reconstruction results for one of our test sequences (escalator) using different sets of approximation and detail subband coefficients. As one can see, increasing the number of approximation subband coefficients (a = 1000 to 4000) while keeping the number of detail subband coefficients constant (d = 1000) improves the reconstruction quality and PSNR significantly. In contrast, if we keep the number of approximation subband coefficients constant (a = 1000) and increase the number of detail subband coefficients (d = 1000 to 4000), the reconstruction quality and PSNR remain almost unchanged. This experiment supports our theoretical claim that statistically matched wavelet based texture representation captures maximum energy in the approximation subspace with very little information left in the detail subspace. One can also observe from Figure 8 that far fewer coefficients are needed from the detail subbands (we select 1000 coefficients out of a total of 13872 (3 × 68 × 68)) than from the approximation subband. This experiment illustrates the relative significance of the approximation and detail subband coefficients for good quality reconstruction.
Table 2: Estimate of statistically matched wavelet filters.

Test sequence | Filter | n1 | n2 | n3 | n4 | n5
Brick wall | h1x | 0.1441 | −0.4869 | 0.6997 | −0.4855 | 0.1300
Brick wall | h1y | 0.0862 | −0.5230 | 0.7329 | −0.4099 | 0.1180
Escalator | h1x | 0.1564 | −0.4931 | 0.6894 | −0.4827 | 0.1553
Escalator | h1y | 0.1248 | −0.3860 | 0.7252 | −0.5496 | 0.0860
Floor box | h1x | 0.1256 | −0.4708 | 0.7171 | −0.4852 | 0.1134
Floor box | h1y | 0.1018 | −0.3761 | 0.7148 | −0.5668 | 0.1264
Black hole | h1x | 0.1276 | −0.4873 | 0.7047 | −0.4808 | 0.1360
Black hole | h1y | 0.1058 | −0.4367 | 0.7065 | −0.5257 | 0.1503
D20 | h1x | 0.1197 | −0.4864 | 0.7371 | −0.4459 | 0.0832
D20 | h1y | 0.1080 | −0.4851 | 0.7307 | −0.4558 | 0.1062
D36 | h1x | 0.1012 | −0.4673 | 0.7495 | −0.4516 | 0.0753
D36 | h1y | 0.0661 | −0.4249 | 0.7732 | −0.4633 | 0.0517
D75 | h1x | 0.1301 | −0.4776 | 0.7051 | −0.4896 | 0.1342
D75 | h1y | 0.1366 | −0.5195 | 0.7060 | −0.4446 | 0.1242
D68 | h1x | 0.1341 | −0.4849 | 0.7026 | −0.4832 | 0.1405
D68 | h1y | 0.1171 | −0.4321 | 0.7241 | −0.5139 | 0.1057
D87 | h1x | 0.1354 | −0.4858 | 0.7052 | −0.4811 | 0.1303
D87 | h1y | 0.1529 | −0.4996 | 0.6894 | −0.4813 | 0.1418
Fish fabric | h1x | 0.0942 | −0.4654 | 0.7448 | −0.4607 | 0.0870
Fish fabric | h1y | 0.0687 | −0.4580 | 0.7686 | −0.4373 | 0.0583
Illustrating the significance of the approximation and detail subband coefficients in image reconstruction. R1, R2, R3, and R4 are images reconstructed using different numbers of approximation (a) and detail (d) subband coefficients. The upper part shows reconstruction for varying a with constant d, while the lower part shows reconstruction for constant a with varying d.
5.2. Texture Synthesis Results
In this section we present the texture synthesis results of our proposed scheme (SMWT-CS) compared with the conventional DWT-based texture synthesis scheme in a CS framework (DWT-CS), as presented in [21, 35]. In addition, we compare our synthesis results with the joint statistics based texture synthesis schemes of [37, 42] and with standard JPEG2000. The texture synthesis performance is measured for varying numbers of CS measurement samples, such as M = 2000 and M = 4000.
(i) Figures 9 and 10 present the synthesis results for structural textures such as brick wall and escalator and statistical textures such as floor box and black hole, using conventional DWT-CS and our proposed scheme. Table 3 gives the PSNR values and MOS scores of the synthesized textures. As one can observe from Table 3, the proposed scheme outperforms the conventional DWT-CS scheme, providing significantly better PSNR (5 to 10 dB gain) for the same compressive measurements (M = 2000 or M = 4000), or better compression at the same PSNR. It is important to note that SMWT-CS reconstructs the textural pattern more smoothly and sharply than the conventional DWT-CS based scheme for the same CS measurements (Figure 9). In addition, the texture synthesis quality improves as the number of CS measurements increases, both subjectively (MOS assessment) and objectively (PSNR data), indicating the scalability of the proposed framework.
Illustrating the PSNR and MOS data of the proposed model ((a) floor box, (b) black hole, (c) escalator, and (d) brick wall) using JPEG2000, conventional wavelet based CS scheme (DWT-CS) [21, 35], and our proposed CS scheme (SMWT-CS).
Sequence: floor box (128×128).

CS measurements (number of coefficients for JPEG2000) | PSNR in dB (JPEG2000) | PSNR in dB (DWT-CS) | PSNR in dB (our scheme, SMWT-CS) | MOS (our scheme, SMWT-CS)
2000 | 39.19 | 30.47 | 42.86 | 5.0
4000 | 41.30 | 32.93 | 44.02 | 5.0
6000 | 41.85 | 36.39 | 45.40 | 5.0
8000 | 41.90 | 38.19 | 46.96 | 5.0
10000 | 41.90 | 42.27 | 48.98 | 5.0
12000 | 41.90 | 45.01 | 51.16 | 5.0

Sequence: black hole (128×128).

CS measurements (number of coefficients for JPEG2000) | PSNR in dB (JPEG2000) | PSNR in dB (DWT-CS) | PSNR in dB (our scheme, SMWT-CS) | MOS (our scheme, SMWT-CS)
2000 | 38.88 | 31.18 | 46.8 | 5.0
4000 | 41.51 | 34.43 | 47.86 | 5.0
6000 | 41.97 | 36.72 | 49.06 | 5.0
8000 | 41.98 | 39.63 | 50.58 | 5.0
10000 | 41.98 | 42.86 | 52.24 | 5.0
12000 | 41.98 | 46.30 | 54.16 | 5.0

Sequence: escalator (128×128).

CS measurements (number of coefficients for JPEG2000) | PSNR in dB (JPEG2000) | PSNR in dB (DWT-CS) | PSNR in dB (our scheme, SMWT-CS) | MOS (our scheme, SMWT-CS)
2000 | 30.86 | 13.93 | 17.22 | 3.0
4000 | 31.08 | 16.13 | 19.63 | 3.5
6000 | 34.76 | 18.26 | 22.13 | 4.0
8000 | 38.39 | 20.88 | 24.75 | 4.2
10000 | 40.07 | 23.75 | 27.84 | 4.4
12000 | 40.56 | 27.48 | 31.76 | 4.8

Sequence: brick wall (128×128).

CS measurements (number of coefficients for JPEG2000) | PSNR in dB (JPEG2000) | PSNR in dB (DWT-CS) | PSNR in dB (our scheme, SMWT-CS) | MOS (our scheme, SMWT-CS)
2000 | 30.47 | 18.52 | 21.52 | 3.5
4000 | 31.41 | 20.49 | 22.79 | 3.5
6000 | 33.25 | 22.34 | 24.50 | 4.1
8000 | 35.19 | 24.50 | 26.17 | 4.1
10000 | 37.48 | 27.14 | 28.35 | 4.4
12000 | 38.95 | 30.50 | 31.04 | 4.8
Synthesis results of structural texture with pseudoperiodic patterns (brick wall and escalator textures). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
Synthesis results of statistical texture (black hole and floor box textures). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
(ii) Figures 11, 12, and 13 show the synthesis results for structural textures with regular patterns (D20, D36, and D75) and structural textures with irregular patterns (D68, D76, and D87) from the Brodatz album [36]. As one can observe, the proposed scheme synthesizes both regular and irregular structural patterns with better PSNR and perceptual quality than conventional DWT-CS schemes for the same CS measurements. Compared with other wavelet based texture synthesis schemes such as HMT-3S [42] on the same textures, the proposed scheme also provides better perceptual synthesis quality; indeed, such joint statistics based statistical texture models cannot handle regular and periodic structures, as reported by the authors of [42].
Synthesis results of structural texture with periodic patterns (D20 and D36 textures from Brodatz album [36]). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
Synthesis results of structural texture with irregular and regular patterns (D68 and D75 textures from Brodatz album [36]). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
Synthesis results of structural texture with irregular patterns (D76 and D87 textures from Brodatz album [36]). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
(iii) Figure 14 shows the synthesis results for complex structured photographic textures such as stone wall and fish fabric from the Portilla website [37]. As one can observe, the proposed scheme achieves better PSNR and perceptual quality than the conventional DWT-CS schemes for the same number of CS measurements. When compared with the joint-statistical texture synthesis model of Portilla and Simoncelli [37] on the same textures, the proposed scheme provides much better synthesis quality and can synthesize all of the pattern types.
Synthesis results of complex structural texture with irregular patterns (stone-wall and fish-fabric textures from the Portilla website [37]). For each original texture, the lower image pair is the synthesized texture using conventional DWT-based CS scheme for image synthesis [21, 35], and the upper image pair is the synthesized texture using our proposed scheme. (a) Synthesis results for CS measurements (M) = 2000. (b) Synthesis results for CS measurements (M) = 4000.
When compared with JPEG2000, we obtain approximately 4 dB PSNR gain for statistical textures such as black hole and floor box (Table 3). For structural textures such as brick wall and escalator (Figure 9), JPEG2000 performs better at the same bit rate. The reason is the significant energy present in the high-frequency region of structured textures: the coding efficiency loss in a scene with very high-frequency transitions stems from the fact that compressive sensing recovery of sparse signals typically requires the number of measurements to be larger than the number of nonzero coefficients.
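This measurement-count requirement can be seen in a small experiment. The following is an illustrative sketch only, not the recovery algorithm used in the paper: it recovers a synthetic k-sparse vector with a simple orthogonal matching pursuit (OMP) from Gaussian measurements, once with too few measurements and once with amply more than the number of nonzero samples. All sizes and the OMP solver are toy choices for illustration.

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual = y.copy()
    support = []
    x_hat = np.zeros(A.shape[1])
    for _ in range(2 * k):                      # a few extra iterations for safety
        if np.linalg.norm(residual) < 1e-10 * np.linalg.norm(y):
            break
        idx = int(np.argmax(np.abs(A.T @ residual)))   # most correlated column
        if idx not in support:
            support.append(idx)
        # Least-squares refit over the currently selected columns.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x_hat[:] = 0.0
        x_hat[support] = coef
        residual = y - A @ x_hat
    return x_hat

rng = np.random.default_rng(0)
n, k = 256, 10                                  # signal length and sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)

errors = {}
for m in (20, 120):                             # too few vs. ample measurements
    A = rng.normal(size=(m, n)) / np.sqrt(m)    # Gaussian measurement matrix
    x_hat = omp(A, A @ x, k)
    errors[m] = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
    print(f"M = {m:3d}: relative recovery error = {errors[m]:.3e}")
```

With M well above the sparsity level the recovery is essentially exact, while M close to the number of nonzero samples fails, which is the behavior driving the coding efficiency loss on high-frequency structured textures.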
6. Conclusion
In this paper, we propose statistically matched wavelet based texture data representation and synthesis in a compressive sensing framework (SMWT-CS). Statistically matched wavelet based representation concentrates most of the captured energy in the approximation subspace, while very little information remains in the detail subspace. We encode not the full-resolution statistically matched wavelet subband coefficients but only the approximation subband coefficients (LL), using a standard image compression scheme such as JPEG2000. The detail subband coefficients, that is, HL, LH, and HH, are jointly encoded in a compressive sensing framework using compressive projections and measurements. The experimental results demonstrate that the proposed scheme provides significantly better PSNR for the same number of compressive measurements, or better compression at the same PSNR, than conventional DWT-based image compression schemes in a CS framework. It can also be observed that performing linear compression over the approximation subspace yields better reconstruction quality for the same number of samples than taking CS measurements over the approximation subspace. This indicates that combining linear compression over the approximation subspace with CS measurements over the detail subspace provides better compression and reconstruction quality than using either standard linear compression alone or CS measurements alone over both subspaces.
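The encoder structure summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: a one-level Haar DWT stands in for the statistically matched wavelet, a smooth synthetic image stands in for a texture, and the subband sizes and measurement count M are toy values.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT; a stand-in for the statistically matched wavelet."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0      # approximation subband
    HL = (a[:, 0::2] - a[:, 1::2]) / 2.0      # detail subbands
    LH = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, HL, LH, HH

# Toy smooth 64x64 "texture", so that most energy lands in LL.
img = np.add.outer(np.arange(64.0), np.arange(64.0))
LL, HL, LH, HH = haar_dwt2(img)

# Step 1: LL would be linearly coded (e.g. with JPEG2000); here we only note its size.
# Step 2: the detail subbands are stacked and jointly measured with one random
#         Gaussian projection -- the CS encoding of HL, LH, and HH.
detail = np.concatenate([HL.ravel(), LH.ravel(), HH.ravel()])
M = 500                                        # toy number of CS measurements
rng = np.random.default_rng(1)
Phi = rng.normal(size=(M, detail.size)) / np.sqrt(M)
y = Phi @ detail                               # measurements sent to the decoder

detail_energy = float(np.sum(detail ** 2))
total_energy = float(np.sum(LL ** 2)) + detail_energy
print(f"LL shape: {LL.shape}, detail coefficients: {detail.size}, measurements: {y.size}")
print(f"fraction of energy in detail subbands: {detail_energy / total_energy:.2e}")
```

The energy fraction printed at the end illustrates why the split makes sense: when the transform matches the data, the detail subspace carries a tiny share of the energy, so its coefficients are sparse enough for CS measurement while the energy-dense LL subband is better served by linear coding.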
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] ITU-T Rec. H.262 | ISO/IEC 13818-2 (MPEG-2), "Generic Coding of Moving Pictures and Associated Audio Information—Part 2: Video."
[2] ITU-T Rec. H.264 | ISO/IEC 14496-10 (MPEG-4 AVC), "Advanced Video Coding for Generic Audiovisual Services," standard version 7, ITU-T and ISO/IEC JTC 1.
[3] H. Nyquist, "Certain topics in telegraph transmission theory."
[4] P. Ndjiki-Nya, T. Hinz, C. Stüber, and T. Wiegand, "A content-based video coding approach for rigid and non-rigid textures," in Proceedings of the IEEE International Conference on Image Processing (ICIP '06), Atlanta, Ga, USA, October 2006, pp. 3169–3172.
[5] P. Ndjiki-Nya, C. Stüber, and T. Wiegand, "Texture synthesis method for generic video sequences," in Proceedings of the IEEE International Conference on Image Processing (ICIP '07), San Antonio, Tex, USA, September 2007, pp. III-397–III-400.
[6] P. Ndjiki-Nya, T. Hinz, and T. Wiegand, "Generic and robust video coding with texture analysis and synthesis," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '07), Beijing, China, July 2007, pp. 1447–1450.
[7] P. Ndjiki-Nya, M. Köppel, D. Doshkov, and T. Wiegand, "Automatic structure-aware inpainting for complex image content."
[8] P. Ndjiki-Nya and T. Wiegand, "Video coding using closed-loop texture analysis and synthesis."
[9] M. Bosch, F. Zhu, and E. J. Delp, "Spatial texture models for video compression," in Proceedings of the IEEE International Conference on Image Processing (ICIP '07), San Antonio, Tex, USA, September 2007, pp. I-93–I-96.
[10] M. Bosch, F. Zhu, and E. J. Delp, "Video coding using motion classification," in Proceedings of the 15th IEEE International Conference on Image Processing (ICIP '08), San Diego, Calif, USA, October 2008, pp. 1588–1591.
[11] J. Byrne, S. Ierodiaconou, D. Bull, D. Redmill, and P. Hill, "Unsupervised image compression-by-synthesis within a JPEG framework," in Proceedings of the 15th IEEE International Conference on Image Processing (ICIP '08), San Diego, Calif, USA, October 2008, pp. 2892–2895.
[12] S. Ierodiaconou, J. Byrne, D. R. Bull, D. Redmill, and P. Hill, "Unsupervised image compression using graphcut texture synthesis," in Proceedings of the 16th IEEE International Conference on Image Processing (ICIP '09), Cairo, Egypt, November 2009, pp. 2289–2292.
[13] F. Zhang, D. R. Bull, and N. Canagarajah, "Region-based texture modelling for next generation video codecs," in Proceedings of the 17th IEEE International Conference on Image Processing (ICIP '10), Hong Kong, China, September 2010, pp. 2593–2596.
[14] F. Zhang and D. R. Bull, "Enhanced video compression with region-based texture models," in Proceedings of the Picture Coding Symposium (PCS '10), Nagoya, Japan, December 2010, pp. 54–57.
[15] A. Gupta, S. D. Joshi, and S. Prasad, "A new approach for estimation of statistically matched wavelet."
[16] D. L. Donoho, "Compressed sensing."
[17] R. G. Baraniuk, "Compressive sensing."
[18] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit."
[19] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information."
[20] Y. Zhang, S. Mei, Q. Chen, and Z. Chen, "A novel image/video coding method based on compressed sensing theory," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08), Las Vegas, Nev, USA, April 2008, pp. 1361–1364.
[21] D. Venkatraman and A. Makur, "A compressive sensing approach to object-based surveillance video coding," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '09), Taipei City, Taiwan, April 2009, pp. 3513–3516.
[22] J. Prades-Nebot, Y. Ma, and T. Huang, "Distributed video coding using compressive sampling," in Proceedings of the Picture Coding Symposium (PCS '09), Chicago, Ill, USA, May 2009, pp. 1–4.
[23] M. B. Wakin, J. N. Laska, M. F. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. F. Kelly, and R. G. Baraniuk, "Compressive imaging for video representation and coding," in Proceedings of the Picture Coding Symposium (PCS '06), Beijing, China, April 2006.
[24] Y. Yang, O. C. Au, L. Fang, X. Wen, and W. Tang, "Perceptual compressive sensing for image signals," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '09), New York, NY, USA, July 2009, pp. 89–92.
[25] A. A. Efros and T. K. Leung, "Texture synthesis by non-parametric sampling," in Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV '99), September 1999, vol. 2, pp. 1033–1038.
[26] L. Y. Wei and M. Levoy, "Fast texture synthesis using tree-structured vector quantization," in Proceedings of SIGGRAPH '00, New Orleans, La, USA, July 2000, pp. 479–488.
[27] V. Kwatra, A. Schödl, I. Essa, G. Turk, and A. Bobick, "Graphcut textures: image and video synthesis using graph cuts," in Proceedings of SIGGRAPH '03, San Diego, Calif, USA, July 2003, pp. 277–286.
[28] P. Hill.
[29] J. Y. A. Wang and E. H. Adelson, "Representing moving images with layers."
[30] A. Dumitraş and B. G. Haskell, "An encoder-decoder texture replacement method with application to content-based movie coding."
[31] R. J. O'Callaghan and D. R. Bull, "Combined morphological-spectral unsupervised image segmentation."
[32] H. Choi and R. G. Baraniuk, "Multiscale image segmentation using wavelet-domain hidden Markov models."
[33] J. Li and R. M. Gray, "Context-based multiscale classification of document images using wavelet coefficient distributions."
[34] M. Acharyya and M. K. Kundu, "An adaptive approach to unsupervised texture segmentation using M-band wavelet transform."
[35] A. Schulz, L. Velho, and E. A. B. da Silva, "On the empirical rate-distortion performance of compressive sensing," in Proceedings of the 16th IEEE International Conference on Image Processing (ICIP '09), Cairo, Egypt, November 2009, pp. 3049–3052.
[36] P. Brodatz, Textures: A Photographic Album for Artists and Designers.
[37] J. Portilla and E. P. Simoncelli, "A parametric texture model based on joint statistics of complex wavelet coefficients."
[38] A. Khandelia, S. Gorecha, B. Lall, S. Chaudhury, and M. Mathur, "Parametric video compression scheme using AR based texture synthesis," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), Bhubaneswar, India, December 2008, pp. 219–225.
[39] A. Stojanovic, M. Wien, and J. R. Ohm, "Dynamic texture synthesis for H.264/AVC inter coding," in Proceedings of the IEEE International Conference on Image Processing (ICIP '08), San Diego, Calif, USA, October 2008, pp. 1608–1611.
[40] A. Stojanovic, M. Wien, and T. K. Tan, "Synthesis-in-the-loop for video texture coding," in Proceedings of the 16th IEEE International Conference on Image Processing (ICIP '09), Cairo, Egypt, November 2009, pp. 2293–2296.
[41] H. Chen, R. Hu, D. Mao, R. Zhong, and Z. Wang, "Video coding using dynamic texture synthesis," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '10), Singapore, July 2010, pp. 203–208.
[42] G. Fan and X. Xia, "Wavelet-based texture analysis and synthesis using hidden Markov models."
[43] S. Kumar, R. Gupta, N. Khanna, S. Chaudhury, and S. D. Joshi, "Text extraction and document image segmentation using matched wavelets and MRF model."
[44] ITU-T Rec. T.800, "JPEG 2000: Core Coding System," International Telecommunication Union, 2000.
[45] C. Deng, W. Lin, B. Lee, and C. T. Lau, "Robust image compression based on compressive sensing," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '10), Singapore, July 2010, pp. 462–467.
[46] A. Mojsilovic, M. V. Popovic, and D. M. Rackov, "On the selection of an optimal wavelet basis for texture characterization."
[47] R. Coifman, F. Geshwind, and Y. Meyer, "Noiselets."
[48] S. Boyd and L. Vandenberghe, Convex Optimization.