Tight wavelet frames in Lebesgue and Sobolev spaces

ABSTRACT. We study tight wavelet frame systems in $L_p(\mathbb{R}^d)$ and prove that such systems (under mild hypotheses) give atomic decompositions of $L_p(\mathbb{R}^d)$ for $1 < p < \infty$. We also characterize $L_p(\mathbb{R}^d)$ and Sobolev space norms by the analysis coefficients for the frame. We consider Jackson inequalities for best $m$-term approximation with the systems in $L_p(\mathbb{R}^d)$ and prove that such inequalities exist. Moreover, we prove that the approximation rate given by the Jackson inequality can be realized by thresholding the frame coefficients. Finally, we show that in certain restricted cases the approximation spaces for best $m$-term approximation associated with tight wavelet frames can be characterized in terms of (essentially) Besov spaces.


INTRODUCTION
A tight wavelet frame (TWF) for $L_2(\mathbb{R}^d)$ is a finite collection of functions $\Psi = \{\psi^\ell\}_{\ell\in E}$ in $L_2(\mathbb{R}^d)$, $E = \{1, 2, \ldots, L\}$, for which the system $X(\Psi) := \{2^{jd/2}\psi^\ell(2^j\cdot - k) \mid j\in\mathbb{Z},\ k\in\mathbb{Z}^d,\ \ell\in E\}$ is a tight frame for $L_2(\mathbb{R}^d)$, i.e., there exists a constant $A > 0$ such that $\sum_{g\in X(\Psi)} |\langle f, g\rangle|^2 = A\|f\|_{L_2}^2$ for any $f\in L_2(\mathbb{R}^d)$. The functions $\psi^\ell$ are called the generators of the TWF. The construction and properties of TWFs in $L_2(\mathbb{R}^d)$ have been studied extensively by many authors (see e.g. [16,17]). The purpose of this paper is to study such frames in spaces different from $L_2(\mathbb{R}^d)$.
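The tight frame identity can be checked by hand in a finite-dimensional toy setting. The sketch below is only an illustration and not part of the paper's construction: it assumes the three unit vectors in $\mathbb{R}^2$ at angles 90, 210 and 330 degrees (often called the Mercedes-Benz frame), for which the frame constant is $A = 3/2$. The names `FRAME`, `analysis`, `frame_energy_ratio` and `reconstruct` are hypothetical.

```python
import math

# Assumed toy frame: three unit vectors in R^2, 120 degrees apart.
FRAME = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in (90, 210, 330)]


def analysis(f):
    """Return the analysis coefficients <f, g> for every frame vector g."""
    return [f[0] * g[0] + f[1] * g[1] for g in FRAME]


def frame_energy_ratio(f):
    """Return sum_g |<f, g>|^2 / ||f||^2, which is constant in f for a tight frame."""
    energy = sum(c * c for c in analysis(f))
    return energy / (f[0] ** 2 + f[1] ** 2)


def reconstruct(f, A=1.5):
    """Rebuild f from its coefficients via f = (1/A) * sum_g <f, g> g."""
    coeffs = analysis(f)
    return tuple(sum(c * g[i] for c, g in zip(coeffs, FRAME)) / A for i in range(2))
```

The same two facts (constant energy ratio and reconstruction with the frame itself) are what the definition above requires of $X(\Psi)$ in $L_2(\mathbb{R}^d)$, where the sum runs over all scales and translates.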
In particular, we will study TWFs in $L_p(\mathbb{R}^d)$ and $L_p$-based Sobolev spaces. We prove that most reasonable TWFs give atomic decompositions of $L_p(\mathbb{R}^d)$, $1 < p < \infty$, and that the $L_p(\mathbb{R}^d)$ and Sobolev norms can be characterized by the analysis coefficients associated with the frame. An important consequence of the characterization is that there is a Jackson inequality for nonlinear approximation with TWFs. Moreover, we show that the rate of convergence given by the Jackson inequality can be reached simply by thresholding the analysis coefficients.
The structure of the paper is as follows. In Section 2 we review the most common method to construct TWFs, the so-called extension principles of Ron and Shen. The TWFs generated through an extension principle are based on a multiresolution analysis, and the generators are often called framelets. They have been studied extensively, see e.g. [2,4,15,16,17]. Section 3 contains the analysis of the properties of TWF expansions in $L_p(\mathbb{R}^d)$ and $L_p$-based Sobolev spaces. We give a complete characterization of the $L_p$-norm, $1 < p < \infty$, in terms of analysis coefficients associated with the frame, and prove that a TWF gives an atomic decomposition for $L_p(\mathbb{R}^d)$. The characterization has the same form as the classical characterization of the $L_p$-norm by wavelet coefficients, see e.g. [13]. In Section 3.3 the analysis is extended to $L_p$-based Sobolev spaces. In Section 4, we consider Jackson inequalities for best $m$-term approximation with TWFs in $L_p(\mathbb{R}^d)$, and we discuss some cases where a complete characterization of the approximation spaces (associated with best $m$-term approximation in $L_p(\mathbb{R}^d)$ with TWFs) in terms of (essentially) Besov spaces is possible. Two of the present authors have studied approximation with spline-based framelets, defined on $\mathbb{R}$, in [11].

TIGHT WAVELET FRAMES
The most common methods to construct TWFs are the extension principles of Ron and Shen. Tight wavelet frames built through an extension principle are based on a multiresolution analysis (MRA), and we will briefly touch upon some of the main ideas in the construction, see [4,16,17]. The MRA-based constructions also have the significant advantage that there are fast associated algorithms. For historical notes on this construction, we refer the reader to [4]. MRA-based TWFs are called framelets.

We begin by introducing some basic notation and general assumptions. Let $\tau = (\tau_0, \tau_1, \ldots, \tau_L)$ be a vector of $2\pi\mathbb{Z}^d$-periodic measurable functions with $\tau_0$ the mask of a refinable scaling function $\phi$ of an MRA $\{V_j\}_{j\in\mathbb{Z}}$. We assume that $\phi$ satisfies $\lim_{\xi\to 0}\hat\phi(\xi) = 1$ and that there exist $0 < c \le C < \infty$ such that $c \le \sum_{k\in\mathbb{Z}^d} |\hat\phi(\cdot - 2\pi k)|^2 \le C$, i.e., $\phi$ generates a Riesz basis of the scaling space $V_0$ of the MRA. We associate the "wavelets" $\Psi = \{\psi^\ell\}_{\ell\in E}$ to $\tau$ by letting $\hat\psi^\ell(2\xi) = \tau_\ell(\xi)\hat\phi(\xi)$. The following is the fundamental tool to construct framelets:

Theorem 2.1 (The Oblique Extension Principle (OEP) [4]). Suppose there exists a $2\pi\mathbb{Z}^d$-periodic function $\Theta$ that is non-negative, essentially bounded, and continuous at the origin with $\Theta(0) = 1$. If for every $\xi\in[-\pi,\pi]^d$ and $\nu\in\{0,\pi\}^d$,
$$\Theta(2\xi)\,\tau_0(\xi)\overline{\tau_0(\xi+\nu)} + \sum_{\ell=1}^{L}\tau_\ell(\xi)\overline{\tau_\ell(\xi+\nu)} = \begin{cases}\Theta(\xi), & \nu = 0,\\ 0, & \text{otherwise},\end{cases}$$
then $X(\Psi)$ is a tight wavelet frame for $L_2(\mathbb{R}^d)$.

The system $X(\Psi)$ is usually called the framelet system generated by $\Psi$.
Remark 2.2. Theorem 2.1 can be stated in slightly more generality by introducing the notion of a spectrum for the scaling space $V_0$ and dropping the requirement that the scaling function generates a Riesz basis, see [4].

The unitary extension principle (UEP) of Ron and Shen [17] is the special case of Theorem 2.1 in which the periodic function in the hypothesis is identically 1. The advantage of the OEP over the UEP is that one can construct framelets with a high number of vanishing moments using the OEP. This is not possible with the UEP, where at least one of the generators has only one vanishing moment.
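For a concrete feel for the extension principles, the following sketch numerically checks the UEP identities for one classical univariate example. Everything in it is an assumption made for illustration: the identities are taken in the form commonly stated in the literature, $\sum_{\ell=0}^{L}\tau_\ell(\xi)\overline{\tau_\ell(\xi+\nu)} = \delta_{\nu,0}$ for $\nu\in\{0,\pi\}$, and the three masks are the ones usually associated with the piecewise linear B-spline, written in centered form. The function names are hypothetical.

```python
import math


def masks(xi):
    """Three 2*pi-periodic masks (low-pass first), built from the linear B-spline mask."""
    tau0 = math.cos(xi / 2) ** 2
    tau1 = 1j * math.sqrt(2) / 2 * math.sin(xi)
    tau2 = -math.sin(xi / 2) ** 2
    return (tau0, tau1, tau2)


def uep_sum(xi, nu):
    """Return sum_l tau_l(xi) * conj(tau_l(xi + nu))."""
    return sum(a * complex(b).conjugate() for a, b in zip(masks(xi), masks(xi + nu)))


def max_uep_defect(samples=200):
    """Largest deviation from the two UEP identities over a grid of xi in [-pi, pi]."""
    worst = 0.0
    for n in range(samples + 1):
        xi = -math.pi + 2 * math.pi * n / samples
        # nu = 0 should give 1, nu = pi should give 0.
        worst = max(worst, abs(uep_sum(xi, 0.0) - 1.0), abs(uep_sum(xi, math.pi)))
    return worst
```

A defect at the level of rounding error is consistent with these masks satisfying the UEP; it says nothing about vanishing moments, which is where the OEP is needed.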

TIGHT WAVELET FRAMES IN $L_p$ AND SOBOLEV SPACES
In this section we study TWFs in $L_p(\mathbb{R}^d)$, $1 < p < \infty$ (Section 3.1), and in $L_p$-based Sobolev spaces (Section 3.3). In Section 3.2 we show that thresholding the analysis coefficients associated with a TWF is a bounded operation in $L_p(\mathbb{R}^d)$, $1 < p < \infty$. For notational convenience we let $\mathcal{D}$ denote the set of dyadic cubes [...]

3.1. TWFs in $L_p(\mathbb{R}^d)$. Theorem 3.1 below shows that we can characterize the $L_p$-norms by the analysis coefficients associated with the TWF. Theorem 3.3 shows that there is a stable way to reconstruct $L_p$-functions using the TWF, and this leads to two results: TWFs form atomic decompositions of $L_p(\mathbb{R}^d)$ (see Corollary 3.6), and thresholding (or shrinkage) of the frame coefficients is a stable operation in $L_p(\mathbb{R}^d)$ (see Section 3.2).

[...] Notice that the corresponding operator is bounded on $L_2(\mathbb{R}^d)$ due to the fact that $\{\psi^\ell_I\}_{I\in\mathcal{D}}$ is a subset of a frame. Also, standard estimates show that (see e.g. [3]) [...] because of the smoothness and decay of $\psi^\ell$. Thus $T$ is a Calderón-Zygmund operator and therefore bounded on $L_p(\mathbb{R}^d)$, $1 < p < \infty$. However, $Tf$ has a nice expansion in the orthonormal Meyer wavelet, so using the $L_p$-characterization of such expansions we get [...] Using this estimate for $\ell = 1, 2, \ldots, L$, and the fact that $\ell_1 \hookrightarrow \ell_2$, we get [...]

Now we turn to the converse estimate. Notice that since we have a tight wavelet frame we have the identity [...] where $A > 0$ is a constant depending only on the frame. Write [...] and notice that for [...] Taking the supremum of this estimate for [...]

This proves the result for [...]
To complete the proof we just notice that from the first part of the proof it follows that $f \mapsto \langle Wf(x), Wf(x)\rangle_{\ell_2}$ is continuous on $L_p(\mathbb{R}^d)$.
From Theorem 3.1 we see that the following sequence space plays an important role.
In fact, let us show that there is a stable reconstruction operator defined on $d_p$.
Suppose that, for all $\ell\in E$ and for some positive constants, [...] This is a well-defined function in $L_p(\mathbb{R}^d)$ with [...] where we used the characterization of $L_p(\mathbb{R}^d)$ using wavelets. Thus, [...] and it follows that $T : d_p \to L_p(\mathbb{R}^d)$ is bounded. Unconditionality follows easily from the observation that none of the above estimates depend on the sign of each $c_I$.

Recall the Lorentz space $\ell_{p,q}(\Lambda)$, $1 \le p < \infty$, $0 < q \le \infty$, for some countable set $\Lambda$, as the set of sequences $\{a_m\}_{m\in\Lambda}$ satisfying $\|\{a_m\}\|_{\ell_{p,q}} < \infty$, where [...] with $\{a^*_j\}_{j=0}^{\infty}$ a decreasing rearrangement of $\{a_m\}_{m\in\Lambda}$. It is known from the orthonormal wavelet case [10,11] that there exist constants $c, C > 0$ such that [...] for any $\{c_I\}\in\ell_{p,1}(\mathcal{D})$. Notice that for any $\{c_I\}\in\ell_{p,1}(\mathcal{D}\times E)$, [...] Combining these two estimates, we get that there exist constants $c, C > 0$ such that [...]

We have in fact proved that any reasonable (in the sense of Theorem 3.1) TWF system induces an atomic decomposition of $L_p(\mathbb{R}^d)$, $1 < p < \infty$. Let us recall the definition of an atomic decomposition: [...]
• For any $f\in X$ we have [...]
From this definition we read off the following: [...]

3.2. Thresholding the TWF analysis coefficients. From a practical point of view, it is interesting to study different types of thresholding (or shrinkage) operators for the framelet system in $L_p$. Let $\rho : \mathbb{C}\times\mathbb{R}_+ \to \mathbb{C}$ be a function for which there exists a constant $C$ such that [...] We call such a function a shrinkage rule, see e.g. [18]. The well-known notions of hard and soft thresholding are two of the prime examples of shrinkage rules. The expressions are given by
$$\rho_h(x,\lambda) = x\,\mathbf{1}_{\{|x| > \lambda\}}, \qquad \rho_s(x,\lambda) = \frac{x}{|x|}\,(|x| - \lambda)\,\mathbf{1}_{\{|x| > \lambda\}}.$$
We define the associated shrinkage operator $T_\lambda$ as [...] To see this, we use the estimates given by Theorem 3.3, [...] By (3.3) and the dominated convergence theorem we see that $\|f - T_\lambda f\|_p \to 0$ as $\lambda \to 0$.
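The two shrinkage rules named above are easy to state in code. The sketch below uses the standard definitions of hard and soft thresholding for complex coefficients. The function names are hypothetical, and the bound checked by `shrinkage_defect`, namely $|x - \rho(x,\lambda)| \le \min(|x|, \lambda)$, is an assumed stand-in for the shrinkage-rule condition, chosen because both rules satisfy it with constant 1; it is not necessarily the exact condition used in the text.

```python
def hard_threshold(x, lam):
    """Keep x when |x| > lam, otherwise return 0."""
    return x if abs(x) > lam else 0


def soft_threshold(x, lam):
    """Shrink the modulus of x by lam and keep its phase; return 0 when |x| <= lam."""
    r = abs(x)
    return x * (r - lam) / r if r > lam else 0


def shrinkage_defect(rule, x, lam):
    """Return |x - rule(x, lam)| / min(|x|, lam) for x != 0 and lam > 0."""
    return abs(x - rule(x, lam)) / min(abs(x), lam)
```

Applied coefficient by coefficient to the analysis coefficients of $f$, either rule gives a shrinkage operator of the kind discussed in this section; soft thresholding changes every surviving coefficient, while hard thresholding leaves survivors untouched.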
3.3. Sobolev Spaces. We now turn our attention to $L_p$-based Sobolev spaces. For $1 \le p \le \infty$ and $r \ge 0$ denote by $W^r(L_p(\mathbb{R}^d))$ the Sobolev space consisting of functions [...] with $\Delta$ the Laplace operator. We prove in this section that for TWFs with some smoothness and vanishing moments, we can characterize the Sobolev norm using the frame coefficients. For a nonnegative integer $N$, we say that a function $f$ belongs to the set $S_N(\mathbb{R}^d)$ if there exist constants $C, C' < \infty$ and $\varepsilon > 0$ such that [...] Here, $|\alpha| := \sum_{k=1}^{d}\alpha_k$.
Remark 3.7. Given $N\in\mathbb{N}$, it is possible, using the oblique extension principle, to construct a generator $\Psi$ of a framelet system such that $\Psi\subset S_N(\mathbb{R}^d)$ (see e.g. [11]).
Theorem 3.8. Let $1 < p < \infty$ and $r \ge 0$, and let $\{\psi^\ell\}_{\ell\in E}$ be the generators of a TWF for $L_2(\mathbb{R}^d)$ such that $\psi^\ell\in S_r(\mathbb{R}^d)$ for all $\ell\in E$. Then [...] (3.5), with equivalence depending only on $p$ and $r$.
Proof. For notational convenience we write [...] for any $f\in W^r(L_p(\mathbb{R}^d))$. Let us first consider the case $r\in\mathbb{N}$. According to Theorem 6.6.21 in [12] and the characterization of Sobolev functions using wavelet expansions, there exist constants $C$ and $C'$ depending only on $r$ and $p$ such that [...] This gives us the lower bound in (3.5) for $r\in\mathbb{N}$.
To get the upper bound we recall that $\|f\|_{W^r(L_p(\mathbb{R}^d))} \asymp \|f\|_p + \|(-\Delta)^{r/2}f\|_p$, and since $\|f\|_p \le C\|W_0 f\|_p \le C\|W_r f\|_p$ by Theorem 3.1, it suffices to show that [...] Thus, by the Cauchy-Schwarz inequality, [...] in [12], and the characterization of Lebesgue functions using wavelet expansions gives [...] Now, taking the supremum over all $g\in L_{p'}(\mathbb{R}^d)\cap L^r_2(\mathbb{R}^d)$ with $\|g\|_{p'} \le 1$, we obtain [...]

In order to conclude the theorem we need to prove (3.5) for a general $r > 0$. Define for each $I\in\mathcal{D}$, $\ell\in E$ and $x\in\mathbb{R}^d$ the discrete weight function $w_r := w_r(I,\ell)$ [...] Then the arguments above show that $P\circ J = \mathrm{Id}_{W^N(L_p(\mathbb{R}^d))}$ for all $N\in\mathbb{N}_0$; in other words, $W^N(L_p(\mathbb{R}^d))$ is a retract of $L_p(\ell_2(w_N))$.
For a given $r > 0$, $r\notin\mathbb{N}$, take $N\in\mathbb{N}_0$ such that $r = (1-\theta)N + \theta(N+1)$ for some $\theta\in(0,1)$. Notice that $w_r \asymp w_N^{(1-\theta)}w_{N+1}^{\theta}$. Now, according to Theorem 5.5.3 in [1] we have [...] with equivalent norms. Thus, $W^r(L_p(\mathbb{R}^d))$ is a retract of $L_p(\ell_2(w_r))$ for a general $r \ge 0$.
Remark 3.9. The reader will notice that the spaces studied so far, $L_p(\mathbb{R}^d)$ and $L_p$-based Sobolev spaces, belong to the Triebel-Lizorkin scale of function spaces. It can be verified (at the expense of "messy" estimates) that sufficiently nice TWFs can also be used to characterize the Triebel-Lizorkin norms. We leave the details to the reader.

JACKSON INEQUALITIES FOR TIGHT WAVELET FRAMES
We will now look at some of the implications that can be derived from the various characterizations given in the previous section. The main result will be a Jackson inequality that gives a certain rate for $m$-term approximation for "nice" functions. We consider two interpretations of the word "nice". When we do not assume any smoothness or vanishing moments for the TWF, we get the Jackson estimates for functions in a sparsity class defined in terms of the TWF. If we assume that the generators of the TWF have some smoothness and vanishing moments (the OEP tells us that such nice generators do exist), then we can state the Jackson inequality in terms of smoothness measured on the Besov scale.
First we introduce some notions that will be used later. A dictionary $D = \{g_k\}_{k\in\mathbb{N}}$ in $L_p(\mathbb{R}^d)$ is a countable collection of quasi-normalized elements from $L_p(\mathbb{R}^d)$.
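For $D$ we consider the collection of all possible $m$-term expansions with elements from $D$:
$$\Sigma_m(D) := \Big\{\sum_{k\in\Lambda} c_k g_k \;:\; c_k\in\mathbb{C},\ \Lambda\subset\mathbb{N},\ \operatorname{card}\Lambda \le m\Big\}.$$
The error of the best $m$-term approximation to an element $f\in L_p(\mathbb{R}^d)$ is then
$$\sigma_m(f, D) := \inf_{S\in\Sigma_m(D)}\|f - S\|_p.$$

Definition 4.1 (Approximation spaces). The approximation space $\mathcal{A}^\alpha_q(L_p(\mathbb{R}^d), D)$ is defined by
$$|f|_{\mathcal{A}^\alpha_q(L_p(\mathbb{R}^d), D)} := \Big(\sum_{m=1}^{\infty}\big(m^\alpha\sigma_m(f, D)\big)^q\,\frac{1}{m}\Big)^{1/q} < \infty$$
and (quasi)normed by $\|f\|_{\mathcal{A}^\alpha_q(L_p(\mathbb{R}^d), D)} = \|f\|_p + |f|_{\mathcal{A}^\alpha_q(L_p(\mathbb{R}^d), D)}$,
with the $\ell_q$ norm replaced by the sup-norm when $q = \infty$.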
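To make the two quantities in this part concrete, here is a minimal sketch in a deliberately simplified setting. It assumes $p = 2$ and an orthonormal basis instead of a redundant frame, so that the best $m$-term approximation simply keeps the $m$ largest coefficients and the error is the $\ell_2$ norm of the remaining ones. The seminorm is assumed to have the standard form $\big(\sum_{m\ge 1}(m^\alpha\sigma_m)^q/m\big)^{1/q}$, truncated to the length of the coefficient list. The function names are hypothetical.

```python
import math


def best_m_term_error(coeffs, m):
    """l2 error of keeping the m largest-magnitude coefficients (orthonormal basis case)."""
    tail = sorted((abs(c) for c in coeffs), reverse=True)[m:]
    return math.sqrt(sum(t * t for t in tail))


def approximation_seminorm(coeffs, alpha, q):
    """Finite-sequence version of (sum_{m>=1} (m^alpha * sigma_m)^q / m)^(1/q)."""
    total = sum(
        (m ** alpha * best_m_term_error(coeffs, m)) ** q / m
        for m in range(1, len(coeffs) + 1)
    )
    return total ** (1.0 / q)
```

For the coefficient list `[3, -4, 1, 0]` the errors for `m = 1, 2, 3` are `sqrt(10)`, `1` and `0`. For a redundant dictionary such as a TWF the infimum in the definition is generally not attained this simply, which is why the thresholding results of this section are of interest.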
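It is well known that the main tool in the characterization of $\mathcal{A}^\alpha_q(L_p(\mathbb{R}^d), D)$ comes from the link between approximation theory and interpolation theory (see e.g. [7, Theorem 9.1, Chapter 7]). Let $X_p(\mathbb{R}^d)$ be a Banach space with semi-(quasi)norm $|\cdot|_{X_p}$, continuously embedded in $L_p(\mathbb{R}^d)$. Given $\alpha > 0$, the Jackson inequality
$$\sigma_m(f, D) \le C m^{-\alpha}|f|_{X_p}, \qquad f\in X_p(\mathbb{R}^d),\ m\in\mathbb{N},$$
and the Bernstein inequality
$$|S|_{X_p} \le C' m^{\alpha}\|S\|_p, \qquad S\in\Sigma_m(D),$$
(with constants $C$ and $C'$ independent of $f$, $S$ and $m$) imply, respectively, the continuous embedding
$$\big(L_p(\mathbb{R}^d), X_p(\mathbb{R}^d)\big)_{\beta/\alpha,q} \hookrightarrow \mathcal{A}^\beta_q(L_p(\mathbb{R}^d), D)$$
and the converse embedding
$$\mathcal{A}^\beta_q(L_p(\mathbb{R}^d), D) \hookrightarrow \big(L_p(\mathbb{R}^d), X_p(\mathbb{R}^d)\big)_{\beta/\alpha,q}$$
for all $0 < \beta < \alpha$ and $q\in(0,\infty]$. We want to obtain a Jackson estimate for $\sigma_m$ when $D$ is any (reasonable) TWF.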
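A Jackson-type rate can be observed numerically in the simplest possible model. The sketch below is an illustration under stated assumptions, not the $L_p$ result of this paper: it works in $\ell_2$ with an orthonormal basis and coefficients $c_k = 1/k$, for which the best $m$-term error is the $\ell_2$ norm of the tail and should decay like $m^{-1/2}$. The function names and the truncation length `n_terms` are hypothetical.

```python
import math


def tail_error(m, n_terms=100_000):
    """Best m-term l2 error for coefficients c_k = 1/k, k = 1..n_terms (already sorted)."""
    return math.sqrt(sum(1.0 / (k * k) for k in range(m + 1, n_terms + 1)))


def scaled_errors(ms=(10, 100, 1000)):
    """Return m^(1/2) * sigma_m for each m; bounded values indicate the rate m^(-1/2)."""
    return [math.sqrt(m) * tail_error(m) for m in ms]
```

The scaled errors stay between fixed positive bounds as `m` grows, which is exactly the behaviour a Jackson inequality with exponent $1/2$ predicts for this sequence.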
For this we need to define a class of "nice" and "smooth" functions. This will be the following class, as introduced in [8] for Hilbert spaces. [...] $p\in(1,\infty)$, $\tau\in(0,\infty)$ and [...]
Therefore, it is interesting to study the set of functions in $L_p(\mathbb{R}^d)$ depending only on the behavior of the coefficients $\langle f, \psi^{\ell,p}_I\rangle$. For $p\in(1,\infty)$, $\tau\in(0,\infty)$ and $q\in(0,\infty]$, we let $\tilde{\mathcal{K}}_{\tau,q}(L_p(\mathbb{R}^d), X(\Psi))$ denote the set [...] with equivalent norms.
Proof. By Remark 4.3 we have $\tilde{\mathcal{K}}_{\tau,q}(L_p(\mathbb{R}^d), X(\Psi)) \hookrightarrow \mathcal{K}_{\tau,q}(L_p(\mathbb{R}^d), X(\Psi))$. Thus, [...] is bounded on [...] Let us introduce the notation $\psi_{j,k} := \psi_I$ and $c_{j,k} := c_I$ for $I = [2^{-j}k, 2^{-j}(k+1)]$, $j\in\mathbb{Z}$, $k\in\mathbb{Z}^d$. By Proposition 6.6.20 in [12] we have, for $j \le j'$, [...]; this gives the bound [...] For notational convenience we suppress the index $\ell$ in the following. For fixed $j\in\mathbb{Z}$ and $k\in\mathbb{Z}^d$ we have [...] Using the bound (4.4) and Hölder's inequality for the sum over $j'$, with $1 = 1/\tau + 1/\tau'$, we get [...] Lemma 8.10 in [14] implies, for any $\{d_k\}_k\in\ell_\tau$, $1 \le \tau < \infty$, [...] for $m\in\mathbb{Z}$.
This estimate and (4.5) yield [...]

We now want to use Theorem 3.3 to show that it is possible to obtain the same asymptotic upper bound for $\sigma_m(f, X(\Psi))$ as in (4.3), just by including the $m$ largest normalized framelet coefficients in the approximation. That is to say, we can obtain the approximation rate associated with the general Jackson inequality just by thresholding the TWF analysis coefficients.
The basic observation we need is the following: Let $1 < p < \infty$ and let $\Lambda\subset\mathcal{D}\times E$ be a finite set. Since the TWF system is $\ell_{p,1}$-hilbertian, we have [...]

Proof. The proof is an easy extension of the proof of Theorem 7.5 in [5]. Let [...] The estimates in (4.6) give [...] Denote $\tilde\Lambda := \Lambda\setminus\bigcup_{j=-\infty}^{k}\Lambda_j$ and notice that $\tilde\Lambda\subset\Lambda_{k+1}$. Now, according to (4.6), $\|T_{m_{k+1}} - T_m\|_p \le C2^{-k}(\operatorname{card}\tilde\Lambda)^{1/p}$, and thus [...] Finally, using (4.8) and (4.9) in (4.7), the result follows.
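The idea of approximating by the $m$ largest analysis coefficients can be tried out in a small redundant example. The sketch below is a finite-dimensional illustration under assumptions of our own choosing: the frame is the union of the standard basis of $\mathbb{R}^4$ and a normalized Hadamard basis, which is tight with constant $A = 2$, and the names `FRAME`, `threshold_approximation` and `error` are hypothetical. It keeps the $m$ largest coefficients $\langle f, g\rangle$ and synthesizes with the same frame; this is generally not the best $m$-term approximation, only a simple one to compute.

```python
import math

# Union of two orthonormal bases of R^4: the standard basis and a normalized Hadamard basis.
STANDARD = [tuple(1.0 if i == j else 0.0 for i in range(4)) for j in range(4)]
HADAMARD = [
    tuple(0.5 * s for s in row)
    for row in ((1, 1, 1, 1), (1, -1, 1, -1), (1, 1, -1, -1), (1, -1, -1, 1))
]
FRAME = STANDARD + HADAMARD  # assumed tight frame with constant A = 2


def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def threshold_approximation(f, m, A=2.0):
    """Keep the m largest analysis coefficients and synthesize with the same frame."""
    coeffs = [(dot(f, g), g) for g in FRAME]
    kept = sorted(coeffs, key=lambda cg: abs(cg[0]), reverse=True)[:m]
    return tuple(sum(c * g[i] for c, g in kept) / A for i in range(4))


def error(f, m):
    """Euclidean distance between f and its m-term thresholded approximation."""
    approx = threshold_approximation(f, m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, approx)))
```

With all eight coefficients kept the reconstruction is exact, and with none kept the error is the norm of `f`. The error need not decrease monotonically in `m` for a redundant frame, which is one reason the rate statement in the text requires proof.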

Tight wavelet frames with vanishing moments.
It is possible to prove that $\mathcal{K}_{\tau,q}(L_p(\mathbb{R}^d), X(\Psi))$ is a (quasi-)Banach space for any system $X(\Psi)$. However, when we have a "nice" system, we can actually identify $\mathcal{K}_{\tau,q}(L_p(\mathbb{R}^d), X(\Psi))$ with the space given by interpolation between $L_p(\mathbb{R}^d)$ and a Besov space. This will lead to a Jackson inequality for a nice TWF, for functions that are smooth measured on the classical Besov scale. Let us give the details.

[...] (3.4)). Then, for $1 < p < \infty$, $1 < q \le \infty$ and $s \le r$, the following identity holds, with equivalent norms: $B^s_q(L_p(\mathbb{R}^d)) = B^s_q(L_p, \Psi)$.
Proof. The embedding $B^s_q(L_p, \Psi) \hookrightarrow B^s_q(L_p(\mathbb{R}^d))$ follows from the theory of atomic decomposition of $B^s_q(L_p(\mathbb{R}^d))$ (see e.g. [9]). To get the other inclusion, let $\{\psi^{m,k}\}_{k=1}^{2^d-1}$ be the Meyer wavelets defined on $\mathbb{R}^d$. Then for any $f\in B^s_q(L_p(\mathbb{R}^d))$ we have an expansion $f = \sum_{I\in\mathcal{D}}$ [...] Since $\psi^{m,s}$ are Meyer wavelets and $\psi^\ell$ satisfies (3.4), the matrix $M_{s,\ell}$ having $\langle\psi^{m,s}_I, \psi^\ell_{I'}\rangle$ as coefficients is a sparse matrix and thus bounded on $\dot b^s_{p,q}$ provided that $r \ge s$ (see e.g. [9, Lemma 3.3]). In particular, this implies that [...]

With this characterization in hand, we read off the following result. [...]

[...] compactly supported for $\ell = 1, 2, \ldots, L$. Suppose the associated refinable scaling function $\phi$ has compact support, has nonnegative two-scale coefficients, and there is $s > 0$ such that $\phi\in W^s(L_\infty(\mathbb{R}^d))$. Suppose, furthermore, that $\hat\phi(2\pi j) = \delta_{j,0}$, $j\in\mathbb{Z}^d$, and $D^\alpha\hat\phi(2\pi j) = 0$ for $j \ne 0$, $\alpha\in\mathbb{N}^d$, $|\alpha| < s$. Then there exists a constant $C < \infty$ depending only on $\phi$, $s$, $d$, and $p$ such that
$$(4.10)\qquad |g|_{B^\gamma_\tau(L_\tau)} \le C m^{\gamma/d}\|g\|_{L_p}, \qquad \frac{1}{\tau} = \frac{\gamma}{d} + \frac{1}{p},\quad 0 < \gamma < s,$$
for $g\in\Sigma_m(X(\Psi))$.