Implementation of 2D Discrete Wavelet Transform by Number Theoretic Transform and 2D Overlap-Save Method

To reduce the computational complexity of the wavelet transform, this paper presents a novel implementation approach. It consists of two key techniques. (1) Fast number theoretic transform (FNTT): in the FNTT, linear convolution is replaced by circular convolution, which speeds up the computation of the 2D discrete wavelet transform. (2) Two-dimensional overlap-save method: directly applying the FNTT to the whole input sequence meets two difficulties; namely, a large modulus obstructs the effective implementation of the FNTT, and a long input sequence slows the computation of the FNTT down. To overcome these difficulties, a new technique, referred to as the 2D overlap-save method, is developed. Experiments have been conducted in which the fast number theoretic transform and the 2D overlap-save method are used to implement the dyadic wavelet transform and are applied to contour extraction in pattern recognition.

In digital signal processing, a signal is represented by a discrete sequence. Therefore, the discrete wavelet transform can be utilized to process it. We have [13]

$$W_s f(x, y) = \iint f(u, v)\,\psi_s(x - u,\, y - v)\,du\,dv = \sum_{k,l} f(x - 1 - k,\, y - 1 - l)\,\psi^{s}_{k,l}, \tag{1}$$

where the coefficients $\psi^{s}_{k,l}$ are given by a double integral of the wavelet. In essence, both the continuous wavelet transform and the discrete wavelet transform are filtering operations. Directly calculating (1) may be time-consuming, because the number of multiplications may become large. Although the Mallat algorithm [14] can implement the wavelet transform successfully in some special cases, that is, for an orthogonal wavelet basis or within a multiresolution analysis, it is not a solution for all kinds of wavelet transforms. Therefore, studying how to speed up the general discrete wavelet transform is of great practical significance. In signal processing, filtering is a linear convolution. In the following, without loss of generality, we begin with the definition of the linear convolution of two 2D sequences.
In many applications, only an output sequence of finite length in (3) needs to be computed. Let $x_{n_1,n_2}$ be a 2D finite input of the filter with lengths $L_1$ and $K_1$, and let $h_{k_1,k_2}$ be the filter sequence with lengths $L_2$ and $K_2$. We compute the finite output as follows:

$$y_{n_1,n_2} = \sum_{k_1=0}^{L_2-1}\sum_{k_2=0}^{K_2-1} h_{k_1,k_2}\, x_{n_1-k_1,\,n_2-k_2}, \quad n_1 = 0, 1, \ldots, L_1 - 1,\; n_2 = 0, 1, \ldots, K_1 - 1. \tag{4}$$

In practice, the difference between the length of the input signal and that of the filter is often large. If we calculate convolution (4) directly, a large number of multiplications will be executed. If, instead, the fast Fourier transform (FFT) is used to speed up the computation, many operations are spent on padded zeros. Both methods are therefore time-consuming for computing the linear convolution. To overcome this problem, this paper presents a novel approach that follows our previous work and is based on the number theoretic transform (NTT) [15][16][17]. We will also prove that the computation of the linear convolution can be replaced by that of the cyclic convolution of two 2D sequences with lengths $N_1 = L_1 + L_2 - 1$ and $N_2 = K_1 + K_2 - 1$, respectively. For two 2D sequences $x_{n_1,n_2}$ and $h_{n_1,n_2}$ of finite lengths $N_1$ and $N_2$, their cyclic convolution can be written as

$$y_{n_1,n_2} = \sum_{k_1=0}^{N_1-1}\sum_{k_2=0}^{N_2-1} h_{k_1,k_2}\, x_{\langle n_1-k_1\rangle_{N_1},\,\langle n_2-k_2\rangle_{N_2}}, \quad n_1 = 0, 1, \ldots, N_1 - 1,\; n_2 = 0, 1, \ldots, N_2 - 1, \tag{5}$$

where $\langle n\rangle_N$ denotes the remainder $r$ of $n$ modulo $N$; that is, $n = qN + r$, where $q$ is an integer and $0 \le r < N$.
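The replacement of linear convolution by cyclic convolution can be checked numerically on a small case. The following is a minimal sketch, not the authors' implementation: the function names and the test data are illustrative assumptions, and the size symbols `L1`, `K1` (input), `L2`, `K2` (filter), and `N1`, `N2` (cyclic lengths) are chosen here for readability. Both sequences are zero-padded to $N_1 \times N_2$ before the cyclic convolution is taken.

```python
def linear_conv2d(x, h):
    """Direct 2D linear convolution; output size is (L1+L2-1) x (K1+K2-1)."""
    L1, K1 = len(x), len(x[0])
    L2, K2 = len(h), len(h[0])
    y = [[0] * (K1 + K2 - 1) for _ in range(L1 + L2 - 1)]
    for i in range(L1):
        for j in range(K1):
            for a in range(L2):
                for b in range(K2):
                    y[i + a][j + b] += x[i][j] * h[a][b]
    return y


def zero_pad(a, rows, cols):
    """Embed `a` in the top-left corner of a rows x cols array of zeros."""
    out = [[0] * cols for _ in range(rows)]
    for i, row in enumerate(a):
        for j, v in enumerate(row):
            out[i][j] = v
    return out


def cyclic_conv2d(x, h):
    """2D cyclic convolution of two equally sized arrays (indices wrap around)."""
    N1, N2 = len(x), len(x[0])
    y = [[0] * N2 for _ in range(N1)]
    for n1 in range(N1):
        for n2 in range(N2):
            y[n1][n2] = sum(
                h[k1][k2] * x[(n1 - k1) % N1][(n2 - k2) % N2]
                for k1 in range(N1)
                for k2 in range(N2)
            )
    return y


# Illustrative data: a 3 x 3 input and a 2 x 2 filter.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = [[1, -1], [2, 0]]
N1 = len(x) + len(h) - 1        # cyclic length along the first index
N2 = len(x[0]) + len(h[0]) - 1  # cyclic length along the second index

lin = linear_conv2d(x, h)
cyc = cyclic_conv2d(zero_pad(x, N1, N2), zero_pad(h, N1, N2))
print(lin == cyc)
```

With cyclic lengths shorter than $N_1$ and $N_2$, the wrapped indices would mix unrelated samples, which is why the padding is needed.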
The number theoretic transform (NTT) provides an effective way to calculate the cyclic convolution. However, two difficulties arise if the NTT is applied directly to the whole input sequence: (i) a considerably large modulus M has to be imposed, which obstructs the effective implementation of the NTT because of the limited word length used to store data in computers and thus becomes a bottleneck in the computation of the convolution; (ii) the advantage of the fast computation of the NTT cannot be realized when the difference between the length of the input sequence and that of the filter sequence is large, so the NTT may not speed up the calculation of the convolution in this case.
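To make the role of the modulus concrete, the sketch below computes a 1D cyclic convolution with a naive NTT. It is an illustration under stated assumptions rather than the paper's algorithm: the prime modulus 17, the element 2 (which has multiplicative order 8 modulo 17), and the input data are chosen here only for the example. The result is exact because every output value is smaller than the modulus; longer or larger-valued inputs would force a larger modulus.

```python
def ntt(a, root, mod):
    """Naive length-N transform: A[k] = sum_n a[n] * root^(n*k) (mod `mod`)."""
    n = len(a)
    return [sum(a[j] * pow(root, j * k, mod) for j in range(n)) % mod
            for k in range(n)]


def cyclic_conv_ntt(x, h, root, mod):
    """Cyclic convolution via pointwise products in the NTT domain.

    Assumes `mod` is prime and `root` has multiplicative order len(x) mod `mod`.
    """
    n = len(x)
    X, H = ntt(x, root, mod), ntt(h, root, mod)
    Y = [(p * q) % mod for p, q in zip(X, H)]
    inv_root = pow(root, mod - 2, mod)  # inverse by Fermat's little theorem
    inv_n = pow(n, mod - 2, mod)
    return [(v * inv_n) % mod for v in ntt(Y, inv_root, mod)]


def cyclic_conv_direct(x, h):
    """Reference cyclic convolution over the integers."""
    n = len(x)
    return [sum(h[k] * x[(i - k) % n] for k in range(n)) for i in range(n)]


MOD, ROOT = 17, 2  # 2^8 = 256 = 1 (mod 17) and 2^4 = 16 = -1 (mod 17)
x = [1, 2, 3, 4, 0, 0, 0, 0]
h = [1, 1, 1, 0, 0, 0, 0, 0]

y_ntt = cyclic_conv_ntt(x, h, ROOT, MOD)
y_ref = cyclic_conv_direct(x, h)
print(y_ntt, y_ref)
```

Only integer arithmetic is involved, so no rounding error occurs, in contrast with a floating-point FFT.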
To avoid using a large modulus M, we can use the Chinese remainder theorem to reduce the size of the modulus M and successfully apply the NTT to calculate the convolution. Unfortunately, the number of multiplications will be at least doubled, and thus the computational complexity will increase. The second key technique is developed in this paper to overcome the above difficulties and speed up the calculation of the convolution. It is termed the 2D overlap-save method, which is an extension of the 1D overlap-save method. It can be used to implement the 2D wavelet transform when there is a large difference between the length of the input sequence $x_{n_1,n_2}$ and that of the filter sequence $h_{n_1,n_2}$. This method consists of three steps.
(i) First, the original 2D input sequence is divided into many small sections. This division brings two evident improvements.
(a) The size of each section is much smaller than that of the whole input sequence, so that a small modulus M can be used and the NTT can, therefore, be performed in a computer system. (b) The difference between the length of one section and that of the filter sequence becomes smaller, which makes the application of the NTT effective.
(ii) Second, we calculate the cyclic convolution of each section with the filter sequence by the NTT.
(iii) Finally, the result of (4) is obtained by picking out data from each of the results calculated in the second step and combining these data.
In comparison with the direct method and the fast Fourier transform (FFT), it will be shown explicitly that the number of multiplications in the 2D overlap-save method is smaller than that in either of those two methods in many cases.
In the next section, the number theoretic transform (NTT) is presented. Details of the 2D overlap-save method are discussed in Section 3. A comparison of the number of multiplications in the three methods is given in Section 4. A computational example and a practical experiment, which verify the efficiency of the proposed approach, are presented in Section 5. In the experiment, the fast number theoretic transform and the 2D overlap-save method were applied to implement the dyadic wavelet transform for extracting contours in the recognition of Chinese handwriting.
Agarwal and Burrus gave some sufficient conditions for the existence of a 2-dimensional NTT with $M > 1$, $N_1 > 1$, and $N_2 > 1$ (see [15]). In [17], the authors gave necessary and sufficient conditions for existence. We have the following theorem.
Finally, we obtain

Fast Number Theoretic Transform (FNTT).
The idea of the FFT can be used to perform the NTT. In this subsection, a theoretical description is presented briefly; more details are given in Section 5. The 2-dimensional transform is evaluated through the congruence equations (17) and (18), where (18) is deduced from (17). Suppose that $N_1$ and $N_2$ satisfy the required inequalities; we can then use the idea of the FFT to calculate the congruence equations (17) and (18).
Computing each quantity in congruence equation (17) with the FFT algorithm requires $(N_2 \log N_2)/2$ multiplications. Hence, calculating all the congruences in (17) with the FFT algorithm requires $((N_1 N_2)/2) \log N_2$ multiplications. Similarly, calculating all the congruence equations in (18) with the FFT algorithm requires $((N_2 N_1)/2) \log N_1$ multiplications.
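The count of $(N \log_2 N)/2$ multiplications for one length-$N$ transform can be observed directly in a radix-2 implementation. The sketch below is a generic decimation-in-time FNTT written for illustration, not the authors' code; the modulus 17, the root 2 of order 8, and the one-element list used as a counter are assumptions of this example. Only the butterfly multiplications are counted, on the assumption that the powers of the root are precomputed.

```python
def naive_ntt(a, root, mod):
    """Reference O(N^2) transform used only to check the fast version."""
    n = len(a)
    return [sum(a[j] * pow(root, j * k, mod) for j in range(n)) % mod
            for k in range(n)]


def fntt(a, root, mod, counter):
    """Recursive radix-2 NTT.

    len(a) must be a power of 2 and `root` must have multiplicative
    order len(a) modulo the prime `mod`. counter[0] accumulates the
    number of butterfly multiplications.
    """
    n = len(a)
    if n == 1:
        return a[:]
    root_sq = root * root % mod
    even = fntt(a[0::2], root_sq, mod, counter)
    odd = fntt(a[1::2], root_sq, mod, counter)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % mod  # the single multiplication of this butterfly
        counter[0] += 1
        out[k] = (even[k] + t) % mod
        out[k + n // 2] = (even[k] - t) % mod  # uses root^(n/2) = -1
        w = w * root % mod
    return out


MOD, ROOT = 17, 2
a = [1, 2, 3, 4, 0, 0, 0, 0]
counter = [0]
fast = fntt(a, ROOT, MOD, counter)
slow = naive_ntt(a, ROOT, MOD)
print(fast == slow, counter[0])
```

For $N = 8$ the counter reaches $(8/2)\log_2 8 = 12$, compared with $N^2 = 64$ products in the naive evaluation.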
Therefore, if the fast number theoretic transform (FNTT) is used to calculate the congruence equation (8), the total number of multiplications required is $((N_1 N_2)/2) \log (N_1 N_2)$. If $M$ is very large, we may reduce the word length by using the Chinese remainder theorem (CRT). Therefore, we have the following proposition.
Proposition 5. Suppose that congruence equation (8) is a 2-dimensional NTT mod $M$, where $M = p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s}$ and $p_1, \ldots, p_s$ are distinct primes; then there are in total $s$ 2-dimensional NTTs mod $p_i^{r_i}$, $i = 1, \ldots, s$, described by congruence equations (22) and (23), and the result mod $M$ is recovered from them by congruence equation (24).

Proof. Because the congruence equation (8) is a 2-dimensional NTT mod $M$, it also holds modulo each $p_i^{r_i}$. Hence, the congruence equation (22) comprises a total of $s$ 2-dimensional NTTs mod $p_i^{r_i}$, $i = 1, \ldots, s$. From the CRT and the congruence equation (23), we deduce that the congruence equation (24) holds.
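The idea of working with several small moduli and recombining by the CRT can be sketched in one dimension. This is an illustrative example under stated assumptions, not the construction of Proposition 5 itself: it uses two distinct primes with exponent 1 (17 and 41, with roots 2 and 3 of order 8), a naive transform, and made-up data whose convolution values exceed 17 but stay below $17 \cdot 41 = 697$.

```python
def ntt(a, root, mod):
    """Naive length-N number theoretic transform modulo a prime."""
    n = len(a)
    return [sum(a[j] * pow(root, j * k, mod) for j in range(n)) % mod
            for k in range(n)]


def cyclic_conv_mod(x, h, root, mod):
    """Cyclic convolution reduced modulo the prime `mod`, computed by NTT."""
    n = len(x)
    Y = [(p * q) % mod for p, q in zip(ntt(x, root, mod), ntt(h, root, mod))]
    inv_root, inv_n = pow(root, mod - 2, mod), pow(n, mod - 2, mod)
    return [(v * inv_n) % mod for v in ntt(Y, inv_root, mod)]


def crt_pair(r1, m1, r2, m2):
    """Solve y = r1 (mod m1), y = r2 (mod m2) for distinct primes m1, m2.

    Returns the unique solution in the range [0, m1*m2).
    """
    inv_m1 = pow(m1, m2 - 2, m2)  # inverse of m1 modulo the prime m2
    return r1 + m1 * (((r2 - r1) * inv_m1) % m2)


x = [10, 20, 30, 40, 0, 0, 0, 0]
h = [1, 2, 3, 0, 0, 0, 0, 0]

res17 = cyclic_conv_mod(x, h, 2, 17)  # 2 has order 8 modulo 17
res41 = cyclic_conv_mod(x, h, 3, 41)  # 3 has order 8 modulo 41
y = [crt_pair(a, 17, b, 41) for a, b in zip(res17, res41)]
print(y)
```

Each transform uses a short word, at the cost of running the convolution once per modulus, which matches the remark that the number of multiplications is at least doubled.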
Assume that the input sequence $x_{n_1,n_2}$ ($n_1 = 0, 1, \ldots, L_1 - 1$, $n_2 = 0, 1, \ldots, K_1 - 1$) is divided into many small sections, which are termed submatrices and are defined by (27). Each consecutive submatrix overlaps the previous one. As a simple example, four overlapping blocks chosen from the divided input matrix are shown graphically in Figure 1. The parameters $v_1$ and $v_2$ can be considered as shift parameters, because changing either $v_1$ or $v_2$ selects a different submatrix. The first submatrix shown in Figure 1 corresponds to $v_1 = 0$ and $v_2 = 0$. The procedure of the overlapping is also illustrated graphically in Figure 1.
For the operation of circular convolution, it is necessary to have two sequences with the same length. Taking (7) for example, the sequences $x_{n_1,n_2}$ and $h_{n_1,n_2}$ are of the same size $N_1 \times N_2$. Since the input sequence has already been divided into submatrices of size $L \times K$ by (27), the filter sequence has to be of size $L \times K$. Therefore, $L - L_2$ augmenting zeros must be added to the rows and $K - K_2$ zeros to the columns of the filter sequence $h_{n_1,n_2}$. As a result, the sequence $h_{n_1,n_2}$ becomes

$$h'_{n_1,n_2} = \begin{cases} h_{n_1,n_2}, & 0 \le n_1 \le L_2 - 1,\; 0 \le n_2 \le K_2 - 1,\\ 0, & \text{otherwise}, \end{cases} \tag{28}$$

where $n_1 = 0, 1, \ldots, L - 1$ and $n_2 = 0, 1, \ldots, K - 1$.
The following form shows how the sequence $h'_{n_1,n_2}$ is obtained by adding augmenting zeros to the filter sequence $h_{n_1,n_2}$:

$$h' = \begin{pmatrix} h_{0,0} & \cdots & h_{0,K_2-1} & 0 & \cdots & 0\\ \vdots & & \vdots & \vdots & & \vdots\\ h_{L_2-1,0} & \cdots & h_{L_2-1,K_2-1} & 0 & \cdots & 0\\ 0 & \cdots & 0 & 0 & \cdots & 0\\ \vdots & & \vdots & \vdots & & \vdots\\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{pmatrix}.$$

For each of the divided submatrices, that is, for given $v_1$ and $v_2$, we compute the circular convolution of the submatrix, written here as $x^{(v_1,v_2)}_{n_1,n_2}$, with $h'_{n_1,n_2}$. From (28), we deduce that

$$y^{(v_1,v_2)}_{n_1,n_2} = \sum_{k_1=0}^{L_2-1}\sum_{k_2=0}^{K_2-1} h_{k_1,k_2}\, x^{(v_1,v_2)}_{\langle n_1-k_1\rangle_{L},\,\langle n_2-k_2\rangle_{K}}. \tag{31}$$

The result of (31) is that of the circular convolution. As we will see below, there exists an important connection between the circular convolution and the linear one. In fact, since $k_1 = 0, 1, \ldots, L_2 - 1$ and $k_2 = 0, 1, \ldots, K_2 - 1$, we can choose $n_1 = L_2 - 1, \ldots, L - 1$ and $n_2 = K_2 - 1, \ldots, K - 1$ in (31); this makes $L > n_1 - k_1 \ge 0$ and $K > n_2 - k_2 \ge 0$ in (31). Thus, the remainders are exactly equal to $n_1 - k_1$ and $n_2 - k_2$. Then, the circular convolution of (31) becomes

$$y^{(v_1,v_2)}_{n_1,n_2} = \sum_{k_1=0}^{L_2-1}\sum_{k_2=0}^{K_2-1} h_{k_1,k_2}\, x^{(v_1,v_2)}_{n_1-k_1,\,n_2-k_2}, \quad n_1 = L_2 - 1, \ldots, L - 1,\; n_2 = K_2 - 1, \ldots, K - 1. \tag{32}$$

To clarify how the components of the linear convolution are obtained from the circular convolution (31), a graphic description is given in Figure 2. In Figure 2(a), a dashed rectangle of size $L$ by $K$ represents the circular convolution described in (31), and a darkened rectangle of size $L_0$ by $K_0$ indicates the linear convolution stated in (32). Obviously, if the first $L_2 - 1$ data in rows and the first $K_2 - 1$ data in columns are discarded from the dashed rectangle, the darkened rectangle is obtained.

In the following, we explain how to obtain the result of (4). The whole output of (4) can be represented as a matrix $Y = (y_{n_1,n_2})$ (33), which is an $L_1 \times K_1$ matrix. So, (32) gives the $L_0 \times K_0$ submatrix (34) of the matrix $Y$ (note that $L_0 = L - L_2 + 1$ and that $K_0 = K - K_2 + 1$). Next, to obtain the whole result, we need to compute the circular convolution of the next submatrix of the input with $h'_{n_1,n_2}$ ($n_1 = 0, 1, \ldots, L - 1$, $n_2 = 0, 1, \ldots, K - 1$). With the same reasoning as above, the submatrix (35) of $Y$, consecutive to the submatrix (34), is obtained.

The corresponding graphic description is given in Figure 2(b). The darkened rectangle in Figure 2(b) expresses the submatrix (35), and the dashed one on its upper-left corner is the graphic description of the submatrix (34).
In Figure 3, there is a white Γ-type area indicating that there are no output data. Two calculation methods can be applied to obtain these data. The first is the direct method, because only a few computations are needed. Alternatively, we can compute the data of the Γ-type area with the overlap-save method proposed above, if $L_0 = L_2 - 1$ and $K_0 = K_2 - 1$. The details are presented in the Technical Report [18]. The entire result is shown graphically in Figure 4.
The two-dimensional overlap-save method is summarized as follows.
(i) Select two positive integers $L_0$ and $K_0$, so that $L_0 + L_2 - 1 = L < L_1$ and $K_0 + K_2 - 1 = K < K_1$. Each of $L$ and $K$ is chosen as an integer power of 2 so that an FFT-type algorithm can be used; that is, $L = 2^{m_1}$ and $K = 2^{m_2}$, where $m_1$ and $m_2$ are two positive integers.
(ii) Let $h'_{n_1,n_2}$ satisfy (28).
(iii) Compute the circular convolution of each input submatrix with $h'_{n_1,n_2}$ for all shift parameters $v_1$ and $v_2$; we obtain all $L_0 \times K_0$ submatrices of the matrix $Y$ whose form is as (34).
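The steps summarized above can be sketched end to end on a small case. The code below is a minimal illustration under stated assumptions, not the authors' implementation: a direct cyclic convolution stands in for the FNTT, the Γ-type border is filled by the direct method, the sizes are chosen so that the sections tile the output exactly, and all function names, size symbols, and test data are introduced here for the example.

```python
def direct_output(x, h, n1, n2):
    """One sample of the linear convolution; x is taken as zero outside its support."""
    total = 0
    for k1 in range(len(h)):
        for k2 in range(len(h[0])):
            if n1 - k1 >= 0 and n2 - k2 >= 0:
                total += h[k1][k2] * x[n1 - k1][n2 - k2]
    return total


def cyclic_conv2d(block, hp):
    """Direct 2D cyclic convolution (stand-in for the FNTT-based computation)."""
    L, K = len(block), len(block[0])
    return [[sum(hp[k1][k2] * block[(n1 - k1) % L][(n2 - k2) % K]
                 for k1 in range(L) for k2 in range(K))
             for n2 in range(K)] for n1 in range(L)]


def overlap_save_2d(x, h, L, K):
    """2D overlap-save with L x K sections; returns an output of the input's size."""
    L1, K1 = len(x), len(x[0])
    L2, K2 = len(h), len(h[0])
    L0, K0 = L - L2 + 1, K - K2 + 1
    # This sketch assumes the sections tile the non-border output exactly.
    assert (L1 - L2 + 1) % L0 == 0 and (K1 - K2 + 1) % K0 == 0
    # Step (ii): zero-pad the filter to the section size.
    hp = [[h[i][j] if i < L2 and j < K2 else 0 for j in range(K)]
          for i in range(L)]
    y = [[None] * K1 for _ in range(L1)]
    # Step (iii): cyclic convolution of every overlapping section.
    for v1 in range((L1 - L2 + 1) // L0):
        for v2 in range((K1 - K2 + 1) // K0):
            r0, c0 = v1 * L0, v2 * K0
            block = [row[c0:c0 + K] for row in x[r0:r0 + L]]
            c = cyclic_conv2d(block, hp)
            # Discard the first L2-1 rows and K2-1 columns; keep the rest.
            for i in range(L2 - 1, L):
                for j in range(K2 - 1, K):
                    y[r0 + i][c0 + j] = c[i][j]
    # Gamma-type border: first L2-1 rows and K2-1 columns, by the direct method.
    for n1 in range(L1):
        for n2 in range(K1):
            if y[n1][n2] is None:
                y[n1][n2] = direct_output(x, h, n1, n2)
    return y


# Illustrative 8 x 8 input and 3 x 3 filter, processed in 4 x 4 sections.
x = [[(3 * i + 5 * j) % 7 for j in range(8)] for i in range(8)]
h = [[1, 2, 1], [0, 1, 0], [-1, 0, 2]]
y = overlap_save_2d(x, h, 4, 4)
y_ref = [[direct_output(x, h, n1, n2) for n2 in range(8)] for n1 in range(8)]
print(y == y_ref)
```

Here each 4 × 4 section contributes a 2 × 2 block of valid outputs, so nine sections cover everything except the two border rows and columns.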
We illustrate the use of the FNTT and the 2D overlap-save method with two examples: one is a numerical computation, and the other is an experiment on extracting the contour of a Chinese handwritten character.
We have the following proposition.
Compute the cyclic convolution of the two sequences, that is, compute the matrix (35), by using the FNTT. It is well known that the number of multiplications required to calculate (35) with the FNTT is determined by the section size, where $L = K$ is a power of 2.

Experiments
The theoretical description of the 2D overlap-save method has been provided in the preceding section. This section examines how the technique is performed and how it is applied to the wavelet transform, using image processing as an example. The experiment focuses on the application of the 2D overlap-save method to the dyadic wavelet transform for extracting the contours of Chinese handwriting.

Numerical Computation.
Let $h_{n_1,n_2}$ denote a digital filter whose coefficients form a two-dimensional sequence, and let a two-dimensional sequence $x_{n_1,n_2}$ represent the input of the filter. In this example, because $L_1 = K_1 = 8$ and $L_2 = K_2 = 3$, the size of the input matrix is 8 × 8 and that of the filter is 3 × 3.

Application to Image Processing.
The proposed method can be applied to many fields such as image processing and pattern recognition. An example of the application to image processing is illustrated in Figure 6. The original image shown in Figure 6(a) is a 2-dimensional image of an aircraft with a size of 256 × 256 pixels; that is, $L_1 = K_1 = 256$. Our task is to extract its contours. The discrete wavelet transform with scale $s = 2^j$, where $j = 2$, was applied, and the spline wavelet, described in (1), of size 17 × 17 was chosen; that is, $L_2 = K_2 = 17$. We use the filtering coefficients tabulated in Table 1. Using the 2D overlap-save method, this original image was divided into 25 sections, each having the dimension of 64 × 64 pixels; that is, we select $L = K = 64$. To facilitate the presentation of the 2D overlap-save method, its procedure is displayed diagonally in Figure 7, where a series of the divided subimages are illustrated. The image in Figure 6(a) was divided into 25 sections by applying (27). Figure 7(a) shows 5 subimages. The lower-right corner of each subimage is repeated in the upper-left corner of its consecutive subimage. Four subimages are displayed in Figure 7(b).

Figure 7: A 2-dimensional image of an aircraft is divided into small subimages; the parameters $v_0$ and $v_1$ are chosen as follows: (a) …, and $v_0 = 2$ and $v_1 = 4$; (d) $v_0 = 0$, $v_1 = 3$, and $v_0 = 1$ and $v_1 = 4$; (e) $v_0 = 0$ and $v_1 = 4$; (f) …, and $v_0 = 4$ and $v_1 = 2$; (h) $v_0 = 3$, $v_1 = 0$, and $v_0 = 4$ and $v_1 = 1$; and (i) $v_0 = 4$ and $v_1 = 0$.
The FNTT was applied to calculate the circular convolution of each subimage. After applying (31) to each of the subblocks, a series of segments of the contours were obtained.
By using (32), we gradually reach the final result of the linear convolution. Figure 8(a) illustrates those pixels which satisfy (35), where $L_2 = K_2 = 17$, $L = K = 64$, $L_0 = K_0 = 48$, and $v_1, v_2 = 0, 1, 2, 3, 4$. Combining them, the contours of the whole character were obtained and are displayed in Figure 8(c), which excludes the Γ-type border. Next, the Γ-type border was computed by the direct method; Figure 8(b) shows the result. Combining this result with that shown in Figure 8(c), the entire contours of the Chinese handwriting were obtained. Figure 8(d) represents the entire contour.
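The section parameters quoted above follow from the image, filter, and section sizes by simple arithmetic. The short check below is only a sketch of that bookkeeping; the variable names are introduced here, and it assumes, as in the experiment, that the Γ-type border is handled separately from the sections.

```python
# Sizes reported for the experiment.
L1 = K1 = 256   # image size
L2 = K2 = 17    # filter (wavelet) size
L = K = 64      # section size, a power of 2

# Valid outputs per section: L0 = L - L2 + 1, K0 = K - K2 + 1.
L0, K0 = L - L2 + 1, K - K2 + 1

# The Gamma-type border consists of the first L2-1 rows and K2-1 columns.
border_rows, border_cols = L2 - 1, K2 - 1

# Number of shifts along each axis and total number of sections.
sections_per_axis = (L1 - border_rows) // L0
shifts = list(range(sections_per_axis))
total_sections = sections_per_axis * ((K1 - border_cols) // K0)

print(L0, K0, shifts, total_sections)
```

The 240 non-border rows and columns are covered exactly by five shifts of 48 along each axis, which gives the 25 sections.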

Conclusions
A novel approach to reduce the computational complexity of the wavelet transform has been presented in this paper. Two key techniques have been applied in this new method, namely, the fast number theoretic transform (FNTT) and the two-dimensional overlap-save technique. In the fast number theoretic transform, the linear convolution is replaced by the circular convolution, which speeds up the computation of the 2D discrete wavelet transform. Directly applying the fast number theoretic transform to the whole input sequence meets two difficulties; namely, a large modulus obstructs the effective implementation of the fast number theoretic transform, and a long input sequence slows its computation down. To overcome these difficulties, a new technique, referred to as the 2D overlap-save method, has been developed. Experiments have been conducted in which the fast number theoretic transform and the 2D overlap-save method were used to implement the dyadic wavelet transform and were applied to contour extraction in image processing and pattern recognition.

Figure 1: Four divided input submatrices are overlapped using the 2D overlap-save method: (a) the first submatrix, with parameters $v_1 = 0$, $v_2 = 0$, is located on the top-left corner; (b) the second one ($v_1 = 1$, $v_2 = 0$) overlaps the first submatrix, and the size of the overlap between them is $L$ times $K_2 - 1$; (c) the third divided input submatrix has parameters $v_1 = 0$, $v_2 = 1$; it overlaps the first submatrix, and the size of the overlap between them is $L_2 - 1$ times $K$; (d) the fourth one, with $v_1 = 1$, $v_2 = 1$, overlaps the first submatrix, and the size of the overlap between them is $L_2 - 1$ times $K_2 - 1$.

Figure 3: All $L_0 \times K_0$ submatrices of the output, except the boundary of the Γ-type area.

Figure 6: A 2-dimensional image of an aircraft was processed with the 2D overlap-save method: (a) original Chinese handwriting and (b) the divided subimages of the character.

Figure 8: The linear outputs of the 2D overlap-save method: (a) the outputs of the linear convolution of each subimage, (b) the left border and the upper border of the outputs of the image of the aircraft, (c) the combined outputs of the linear convolution, except the Γ-type border, and (d) the entire contour extracted using the 2D overlap-save method under the wavelet transform.