Continuous characterizations of Besov-Lizorkin-Triebel spaces and new interpretations as coorbits

We give characterizations for homogeneous and inhomogeneous Besov-Lizorkin-Triebel spaces in terms of continuous local means for the full range of parameters. In particular, we prove characterizations in terms of Lusin functions and spaces involving the Peetre maximal function in order to apply the classical coorbit space theory due to Feichtinger and Gröchenig. This results in atomic decompositions and wavelet bases for homogeneous spaces. Moreover, we give sufficient conditions for suitable wavelets in terms of moment, decay and smoothness conditions.


Introduction
This paper deals with Besov-Lizorkin-Triebel spaces $\dot B^s_{p,q}(\mathbb{R}^d)$ and $\dot F^s_{p,q}(\mathbb{R}^d)$ on the Euclidean space $\mathbb{R}^d$ and their interpretation as coorbits. For this purpose we prove a number of characterizations for homogeneous and inhomogeneous spaces for the full range of parameters. These spaces were classically introduced in Triebel's monograph [28, 2.3.1] by means of a dyadic decomposition of unity; here we use more general building blocks and provide in addition continuous characterizations in terms of Lusin and maximal functions. Equivalent (quasi-)normings of this kind were first given by Triebel in [29]. His proofs use in an essential way the fact that the function under consideration belongs to the respective space. Therefore, the obtained equivalent (quasi-)norms could not yet be considered as a definition or characterization of the space. Later on, Triebel was able to solve this problem partly in his monograph [30, 2.4.2, 2.5.1] by restricting to the Banach space case. Afterwards, Rychkov [23] completed the picture by simplifying a method due to Bui, Paluszyński, and Taibleson [3,4]. However, [23] contains some problematic arguments. One aim of the present paper is to provide a complete and self-contained reference for general characterizations of discrete and continuous type by avoiding these arguments. We use a variant of a method from Rychkov's subsequent papers [24,25], which is originally due to Strömberg and Torchinsky and was developed in their monograph [27, Chapt. 5].
In a different language the results can be interpreted in terms of the continuous wavelet transform (see Appendix A.1) belonging to a function space on the $ax+b$-group $G$. Spaces on $G$ considered here are mixed norm spaces like tent spaces [5] as well as Peetre type spaces.
For a multi-index $\bar\alpha=(\alpha_1,\dots,\alpha_d)\in\mathbb{N}_0^d$ we put $|\bar\alpha|_1=\alpha_1+\dots+\alpha_d$ and define the differential operators $D^{\bar\alpha}$ and $\Delta$ by
$$D^{\bar\alpha}=\frac{\partial^{|\bar\alpha|_1}}{\partial x_1^{\alpha_1}\cdots\partial x_d^{\alpha_d}}\quad\text{and}\quad \Delta=\sum_{j=1}^d\frac{\partial^2}{\partial x_j^2}.$$
If $X$ is a (quasi-)Banach space and $f\in X$ we use $\|f|X\|$ or simply $\|f\|$ for its (quasi-)norm. The space of linear continuous mappings from $X$ to $Y$ is denoted by $L(X,Y)$ or simply $L(X)$ if $X=Y$. Operator (quasi-)norms of $A\in L(X,Y)$ are denoted by $\|A:X\to Y\|$, or simply by $\|A\|$. As usual, the letter $c$ denotes a constant, which may vary from line to line but is always independent of $f$, unless the opposite is explicitly stated. We also use the notation $a\lesssim b$ if there exists a constant $c>0$ (independent of the context dependent relevant parameters) such that $a\le c\,b$. If $a\lesssim b$ and $b\lesssim a$ we will write $a\asymp b$.
2 Function spaces on $\mathbb{R}^d$

Vector valued Lebesgue spaces
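The space $L_p(\mathbb{R}^d)$, $0<p\le\infty$, denotes the collection of complex-valued functions (equivalence classes) with finite (quasi-)norm
$$\|f|L_p(\mathbb{R}^d)\|=\Big(\int_{\mathbb{R}^d}|f(x)|^p\,dx\Big)^{1/p},$$
with the usual modification if $p=\infty$. The Hilbert space $L_2(\mathbb{R}^d)$ plays a separate role for our purpose (Section 3). Given a sequence of complex-valued functions $\{f_k\}_{k\in I}$ on $\mathbb{R}^d$, where $I$ is a countable index set, we put
$$\|f_k|\ell_q(L_p(\mathbb{R}^d))\|=\Big(\sum_{k\in I}\|f_k|L_p(\mathbb{R}^d)\|^q\Big)^{1/q}\quad\text{and}\quad\|f_k|L_p(\ell_q,\mathbb{R}^d)\|=\Big\|\Big(\sum_{k\in I}|f_k(x)|^q\Big)^{1/q}\Big|L_p(\mathbb{R}^d)\Big\|,$$
where we modify appropriately in the case $q=\infty$.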

Maximal functions
For a locally integrable function $f$ we denote by $Mf(x)$ the Hardy-Littlewood maximal function defined by
$$(Mf)(x)=\sup_{Q}\frac{1}{|Q|}\int_Q|f(y)|\,dy,\quad x\in\mathbb{R}^d,$$
where the supremum is taken over all cubes $Q$ centered at $x$ with sides parallel to the coordinate axes. The following theorem is due to Fefferman and Stein [6].
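Theorem 2.1. For $1<p<\infty$ and $1<q\le\infty$ there exists a constant $c>0$, such that
$$\|Mf_k|L_p(\ell_q,\mathbb{R}^d)\|\le c\,\|f_k|L_p(\ell_q,\mathbb{R}^d)\|$$
holds for all sequences $\{f_k\}_{k\in\mathbb{Z}}$ of locally Lebesgue-integrable functions on $\mathbb{R}^d$.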
Let us recall the classical Peetre maximal function, introduced in [19]. Given a sequence of functions $\{\Psi_k\}_{k\in\mathbb{N}}\subset S(\mathbb{R}^d)$, a tempered distribution $f\in S'(\mathbb{R}^d)$ and a positive number $a>0$ we define the system of maximal functions
$$(\Psi_k^*f)_a(x)=\sup_{y\in\mathbb{R}^d}\frac{|(\Psi_k*f)(x+y)|}{(1+2^k|y|)^a},\quad x\in\mathbb{R}^d. \tag{2.1}$$
Since $(\Psi_k*f)(y)$ makes sense pointwise (see the following paragraph) everything is well-defined. However, the value "$\infty$" is also possible for $(\Psi_k^*f)_a(x)$. This was the reason for the problematic arguments in [23] mentioned in the introduction. We will often use dilates $\Psi_k(x)=2^{kd}\Psi(2^kx)$ of a fixed function $\Psi\in S(\mathbb{R}^d)$, where $\Psi_0(x)$ might be given by a separate function. Also continuous dilates are needed. Let the operator $D^{L_p}_t$, $t>0$, generate the $p$-normalized dilates of a function $\Psi$ given by $D^{L_p}_t\Psi:=t^{-d/p}\Psi(t^{-1}\cdot)$. If $p=1$ we omit the super index and use additionally $\Psi_t:=D_t\Psi:=D^{L_1}_t\Psi$. We define $(\Psi_t^*f)_a(x)$ by
$$(\Psi_t^*f)_a(x)=\sup_{y\in\mathbb{R}^d}\frac{|(\Psi_t*f)(x+y)|}{(1+|y|/t)^a},\quad x\in\mathbb{R}^d,\ t>0.$$
We will refer to this construction later on. It turns out that this maximal function construction can be used to interpret classical smoothness spaces as coorbits of certain Banach function spaces on the $ax+b$-group, see Section 4.
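As a quick sanity check on the normalizations above (a worked computation added for illustration, assuming $0<p<\infty$ and a Fourier transform with kernel $e^{-ix\cdot\xi}$ up to a constant factor), the substitution $y=x/t$ shows that $D^{L_p}_t$ preserves the $L_p$ (quasi-)norm and that the $L_1$-normalized dilate acts on the Fourier side as a pure dilation:

```latex
\begin{align*}
\|D^{L_p}_t\Psi\,|\,L_p(\mathbb{R}^d)\|^p
  &= \int_{\mathbb{R}^d} t^{-d}\,|\Psi(x/t)|^p\,dx
   = \int_{\mathbb{R}^d} |\Psi(y)|^p\,dy
   = \|\Psi\,|\,L_p(\mathbb{R}^d)\|^p, \\
\mathcal{F}(\Psi_t)(\xi)
  &= \mathcal{F}\Psi(t\xi), \qquad t>0,\ \xi\in\mathbb{R}^d .
\end{align*}
```

In particular, the dyadic dilate $\Psi_k=2^{kd}\Psi(2^k\cdot)$ is the continuous dilate $\Psi_t$ at $t=2^{-k}$, so the weight $(1+|y|/t)^a$ reduces to $(1+2^k|y|)^a$.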

Tempered distributions, Fourier transform
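As usual, $S(\mathbb{R}^d)$ denotes the locally convex space of rapidly decreasing infinitely differentiable functions on $\mathbb{R}^d$, whose topology is generated by the family of semi-norms
$$\|\varphi\|_{k,\ell}=\sup_{x\in\mathbb{R}^d,\,|\bar\alpha|_1\le\ell}|D^{\bar\alpha}\varphi(x)|\,(1+|x|)^k,\quad k,\ell\in\mathbb{N}_0.$$
The space $S'(\mathbb{R}^d)$, the topological dual of $S(\mathbb{R}^d)$, is also referred to as the set of tempered distributions on $\mathbb{R}^d$. Indeed, a linear mapping $f:S(\mathbb{R}^d)\to\mathbb{C}$ belongs to $S'(\mathbb{R}^d)$ if and only if there exist numbers $k,\ell\in\mathbb{N}_0$ and a constant $c=c_f$ such that $|f(\varphi)|\le c_f\,\|\varphi\|_{k,\ell}$ for all $\varphi\in S(\mathbb{R}^d)$.
The convolution $\varphi*\psi$ of two integrable (square integrable) functions $\varphi,\psi$ is defined via the integral
$$(\varphi*\psi)(x)=\int_{\mathbb{R}^d}\varphi(x-y)\psi(y)\,dy.$$
For $\varphi\in S(\mathbb{R}^d)$ and $f\in S'(\mathbb{R}^d)$ the convolution is extended via $(\varphi*f)(x)=f(\varphi(x-\cdot))$. It makes sense pointwise and is a $C^\infty$-function in $\mathbb{R}^d$ of at most polynomial growth.
As usual, the Fourier transform $\mathcal{F}$ is defined on both $S(\mathbb{R}^d)$ and $S'(\mathbb{R}^d)$. The mapping $\mathcal{F}$ is a bijection (in both cases), with inverse $\mathcal{F}^{-1}$.
In order to deal with homogeneous spaces we need to define the subset $S_0(\mathbb{R}^d)\subset S(\mathbb{R}^d)$. Following [28, Chapt. 5] we put
$$S_0(\mathbb{R}^d)=\{\varphi\in S(\mathbb{R}^d):D^{\bar\alpha}(\mathcal{F}\varphi)(0)=0\ \text{for every multi-index}\ \bar\alpha\in\mathbb{N}_0^d\}.$$
Its topological dual is denoted by $S_0'(\mathbb{R}^d)$. Every $f\in S_0'(\mathbb{R}^d)$ can be extended from $S_0(\mathbb{R}^d)$ to $S(\mathbb{R}^d)$, i.e., to an element of $S'(\mathbb{R}^d)$. However, this fact is not trivial and makes use of the Hahn-Banach theorem in locally convex topological vector spaces. We may identify $S_0'(\mathbb{R}^d)$ with the factor space $S'(\mathbb{R}^d)/\mathcal{P}(\mathbb{R}^d)$, since two different extensions differ by a polynomial.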

Besov-Lizorkin-Triebel spaces
Let us first introduce the concept of a dyadic decomposition of unity, see also [28, 2.3.1].
Now we are ready for the definition of the Besov and Lizorkin-Triebel spaces. See for instance [28, 2.3.1] for details and further properties.
In case q = ∞ we replace the sum by a supremum in both cases.
The homogeneous counterparts are defined as follows. For details, further properties and how to deal with occurring technicalities we refer to [28, Chapt. 5].
In case q = ∞ we replace the sum by a supremum in both cases.

Inhomogeneous spaces
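Essential for the sequel are functions $\Phi_0,\Phi\in S(\mathbb{R}^d)$ satisfying
$$|\mathcal{F}\Phi_0(\xi)|>0\ \text{on}\ \{|\xi|<2\varepsilon\},\qquad|\mathcal{F}\Phi(\xi)|>0\ \text{on}\ \{\varepsilon/2<|\xi|<2\varepsilon\}, \tag{2.4}$$
for some $\varepsilon>0$, and
$$D^{\bar\alpha}(\mathcal{F}\Phi)(0)=0\quad\text{for all}\ |\bar\alpha|_1\le R. \tag{2.5}$$
We will call the functions $\Phi_0$ and $\Phi$ kernels for local means. Recall that $\Phi_k=2^{kd}\Phi(2^k\cdot)$, $k\in\mathbb{N}$, and $\Psi_t=D_t\Psi$. The upcoming four theorems represent the main results of the first part of the paper.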
Theorem 2.6. Let $s\in\mathbb{R}$, $0<p<\infty$, $0<q\le\infty$, $a>d/\min\{p,q\}$ and $R+1>s$. Let further $\Phi_0,\Phi\in S(\mathbb{R}^d)$ be given by (2.4) and (2.5). Then the space $F^s_{p,q}(\mathbb{R}^d)$ can be characterized by with the usual modification in case $q=\infty$. Furthermore, all quantities $\|f|F^s_{p,q}(\mathbb{R}^d)\|_i$, $i=1,\dots,5$, are equivalent (quasi-)norms in $F^s_{p,q}(\mathbb{R}^d)$.
For the inhomogeneous Besov spaces we obtain the following.
Theorem 2.7. Let $s\in\mathbb{R}$, $0<p,q\le\infty$, $a>d/p$ and $R+1>s$. Let further $\Phi_0,\Phi\in S(\mathbb{R}^d)$ be given by (2.4) and (2.5). Then the space $B^s_{p,q}(\mathbb{R}^d)$ can be characterized by with the usual modification if $q=\infty$. Furthermore, all quantities $\|f|B^s_{p,q}(\mathbb{R}^d)\|_i$, $i=1,\dots,4$, are equivalent quasi-norms in $B^s_{p,q}(\mathbb{R}^d)$.

Homogeneous spaces
The homogeneous spaces can be characterized similarly. Here we do not have a separate function $\Phi_0$ anymore. We put $\Phi_0=\Phi$.
Remark 2.10. Observe that the (quasi-)norms $\|\cdot|\dot F^s_{p,q}(\mathbb{R}^d)\|_3$ and $\|\cdot|F^s_{p,q}(\mathbb{R}^d)\|_3$ are characterizations via Lusin functions, see [30, 2.4.5] and [28, 2.12.1] and the references given there. We will return to this later when defining tent spaces, see Definition 4.1 and (4.1).

Particular kernels
For more details concerning particular choices for the kernels Φ 0 and Φ we refer mainly to Triebel [30, 3.3].
The most prominent nontrivial examples (besides the one given in Remark 2.3) of functions Φ 0 and Φ satisfying (2.4) and (2.5) are the classical local means. The name comes from the compact support of Φ 0 , Φ, which is admitted in the following statement.

Proofs
We give the proof of Theorem 2.6 in full detail. The proof of Theorem 2.8 is similar and even less technical. Let us refer to the respective paragraph for the necessary modifications. The proofs in the Besov scale are analogous, so we omit them completely. The proof technique is a modification of the one in Rychkov [23], where he proved the discrete case, i.e., that (2.9) and (2.10) characterize $F^s_{p,q}(\mathbb{R}^d)$. However, Hansen [15, Rem. 3.2.4] recently observed that the arguments used for proving (34) in [23] are problematic. The finiteness of the Peetre maximal function is assumed, which is not true in general under the stated assumptions. Consider for instance in dimension $d=1$ the functions and, if $a>0$ is given, the tempered distribution $f(t)=|t|^n$ with $a<n\in\mathbb{N}$. Then $(\Psi_k^*f)_a(x)$ is infinite in every point $x\in\mathbb{R}$. The mentioned incorrect argument was inherited by some subsequent papers dealing with similar topics, for instance [1], [17] and [33]. Nevertheless, the stated results hold true. There is an alternative method to prove the crucial inequality (34) which avoids Lemma 3 in [23]. It is given in Rychkov [24] as well as [25]. A variant of this method, which is originally due to Strömberg and Torchinsky [27, Chapt. V], is also used in our proof below.
We start with a convolution type inequality which will often be needed below. The following lemma is essentially Lemma 2 in [23].
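Lemma 2.13. Let $0<p,q\le\infty$ and $\delta>0$. Let $\{g_k\}_{k\in\mathbb{N}_0}$ be a sequence of non-negative measurable functions on $\mathbb{R}^d$ and put
$$G_\ell(x)=\sum_{k\in\mathbb{N}_0}2^{-|k-\ell|\delta}g_k(x),\quad x\in\mathbb{R}^d,\ \ell\in\mathbb{N}_0.$$
Then there is some constant $C=C(p,q,\delta)$, such that
$$\|G_\ell|\ell_q(L_p(\mathbb{R}^d))\|\le C\,\|g_k|\ell_q(L_p(\mathbb{R}^d))\|$$
and
$$\|G_\ell|L_p(\ell_q,\mathbb{R}^d)\|\le C\,\|g_k|L_p(\ell_q,\mathbb{R}^d)\|$$
hold true.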
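The mechanism behind Lemma 2.13 can be illustrated at the level of sequences. The following is only a sketch under the assumption that, as in [23, Lem. 2], $G_\ell=\sum_{k}2^{-|k-\ell|\delta}g_k$; it treats the inner $\ell_q$ (quasi-)norm for fixed $x$ and $0<q<\infty$, using Young's inequality for $q\ge1$ and the elementary inequality $(\sum_k a_k)^q\le\sum_k a_k^q$ for $0<q<1$:

```latex
% Pointwise sequence estimate (sketch); g_k >= 0, extended by zero to k < 0.
\begin{align*}
q \ge 1:\quad
\Big(\sum_{\ell} G_\ell^{\,q}\Big)^{1/q}
 &\le \Big(\sum_{m\in\mathbb{Z}} 2^{-|m|\delta}\Big)
      \Big(\sum_{k} g_k^{\,q}\Big)^{1/q}
  = \frac{1+2^{-\delta}}{1-2^{-\delta}}
      \Big(\sum_{k} g_k^{\,q}\Big)^{1/q}, \\
0<q<1:\quad
\sum_{\ell} G_\ell^{\,q}
 &\le \sum_{\ell}\sum_{k} 2^{-|k-\ell|\delta q}\, g_k^{\,q}
  \le \frac{1+2^{-\delta q}}{1-2^{-\delta q}} \sum_{k} g_k^{\,q}.
\end{align*}
```

Taking the $L_p$ (quasi-)norm of both sides gives the $L_p(\ell_q)$ estimate; the $\ell_q(L_p)$ estimate follows from the same computation applied to the numbers $\|g_k|L_p\|$, with the triangle inequality in $L_p$ replaced by its $p$-th power version when $p<1$.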
We are going to prove the relations We just give the proof of $\|f|F^s_{p,q}\|_1\asymp\|f|F^s_{p,q}\|_2$ in detail since the remaining equivalences are analogous.
We need a bit more. Fix $1\le t\le2$. Clearly, we also have We dilate this identity with $2^\ell$, i.e., $g_\ell(\eta)=g(2^{-\ell d}\eta(2^{-\ell}\cdot))$ for $\eta\in S(\mathbb{R}^d)$. An elementary calculation gives for every $g\in S'(\mathbb{R}^d)$. Obviously, we can rewrite (2.13) to obtain Plugging this into (2.15) we end up with the pointwise representation ($\ell\in\mathbb{N}$) for all $y\in\mathbb{R}^d$. Let us mention that the case $\ell=0$ plays a particular role. In this case we have
Substep 1.2. Let us prove the following important inequality first. For every $r>0$ and every $N\in\mathbb{N}_0$ we have $x\in\mathbb{R}^d$ and $\ell\in\mathbb{N}_0$. Again the case $\ell=0$ has to be treated separately according to the remark after (2.17). The representation (2.17) will be the starting point to prove (2.18). Namely, we have for Elementary properties of the convolution yield (compare with (2.34)) with the appropriate modification in case $\ell=0$.
Next, we apply the elementary inequalities where $0<r\le1$. Let us define the maximal function and estimate Observe that we can estimate the term $(\dots)^{1-r}$ in the right-hand side of (2.26) by where we again used the inequality (compare with (2.23)) and put Hence $\gamma_{k+\ell}$ gives us only two different functions from $S(\mathbb{R}^d)$. This implies the boundedness Observe that the right-hand side of (2.18) decreases as $N$ increases. Therefore, we have (2.18) on the left-hand side and $(\Phi_{k+\ell})_t$ by $\Phi_{k+\ell}=\Phi_0$ for $k=0$ on the right-hand side. We proved that the inequality (2.29) holds for all $t\in[1,2]$ where $c>0$ is independent of $t$. If we choose $r<\min\{p,q\}$, we can apply the norm on both sides and use Minkowski's inequality for integrals, which yields If $ar>d$ then we have Now we use a well-known majorant property in order to estimate the convolution on the right-hand side by the Hardy-Littlewood maximal function (see Paragraph 2.2 and [26, Chapt. 2]). This yields An index shift on the right-hand side gives Choose now $d/a<r<\min\{p,q\}$, $N>\max\{0,-s\}+a$ and put We obtain for $\ell\in\mathbb{N}$ Now we apply Lemma 2.13 in $L_{p/r}(\ell_{q/r},\mathbb{R}^d)$ which yields The Fefferman-Stein inequality (see Paragraph 2.2/Theorem 2.1, having in mind that $p/r,q/r>1$) gives Hence, we obtain The summand $\|(\Phi_0^*f)_a|L_p(\mathbb{R}^d)\|$ can be estimated similarly using (2.29) in case $\ell=0$. This proves $\|f|F^s_{p,q}(\mathbb{R}^d)\|_2\lesssim\|f|F^s_{p,q}(\mathbb{R}^d)\|_1$. With slight modifications of the argument we prove This finishes the proof of (2.12).
Step 2. Let $\Psi_0,\Psi\in S(\mathbb{R}^d)$ be functions satisfying (2.5). Indeed, we do not need (2.4) for the following inequality which holds true for all $f\in S'(\mathbb{R}^d)$. We decompose $f$ similarly as in Step 1. Exploiting the property (2.4) for the system $(\Phi_0,\Phi)$ we find functions $\lambda_0,\lambda\in S(\mathbb{R}^d)$ such that $\operatorname{supp}\lambda_0\subset\{\xi\in\mathbb{R}^d:|\xi|\le2\varepsilon\}$ and $\operatorname{supp}\lambda\subset\{\xi\in\mathbb{R}^d:\varepsilon/2\le|\xi|\le2\varepsilon\}$ and for $\xi\in\mathbb{R}^d$. Putting $\Lambda_0=\mathcal{F}^{-1}\lambda_0$ and $\Lambda=\mathcal{F}^{-1}\lambda$ we obtain the decomposition for every $g\in S'(\mathbb{R}^d)$. We put $g=\Psi_\ell*f$ for $\ell\in\mathbb{N}_0$ and see Now we estimate as follows We first observe that for $x\in\mathbb{R}^d$ and functions $\mu,\eta\in S(\mathbb{R}^d)$ the following identity holds true for $u,v>0$ This yields in case $\ell\ge k$ (with a minor change if $k=0$) where we used Lemma A.3 for the last estimate.
If $k>\ell$ we change the roles of $\Psi$ and $\Lambda$ to obtain again with Lemma A.3 (minor change if $\ell=0$) where $L$ can be chosen arbitrarily large since $\Lambda$ satisfies $(M_L)$ for every $L\in\mathbb{N}$ according to its construction. Let us further use the estimate Consequently, Plugging this into (2.33), choosing $L\ge a+|s|$ and $\delta=\min\{1,R+1-s\}$ we obtain the inequality for all $x\in\mathbb{R}^d$. Applying Lemma 2.13 gives (2.31).
Step 3. What remains is to show that (2.8) is equivalent to the rest.
We return to (2.29) in Substep 1.3. If $|z|<2^{-(\ell+k)}t$, formula (2.29) implies by a shift in the integral the following Indeed, we have $1+2^\ell|x-y|\le1+2^\ell(|x-(y+z)|+|z|)$ where the last estimate follows from the fact that $k\in\mathbb{N}_0$ in the sum. Instead of the integral $(\int_1^2|\cdot|^{q/r}\,dt/t)^{r/q}$ we now take on both sides of (2.37) the norm The integration over $z$ does not influence the left-hand side. Instead of (2.30) we obtain $(1+2^\ell|x-y|)^{ar}\,dy$.
We continue with analogous arguments as after (2.30) and end up with (2.36) .
Indeed, it is easy to see that we have for all $t>0$ and we are done. The proof is complete.
Proof of Theorem 2.8. The proof of Theorem 2.8 is almost the same as the previous one. It is less technical since we do not have to deal with a separate function $\Phi_0$, which causes several difficulties. However, there are still some technical obstacles which have to be discussed.
1. Although we are in the homogeneous world, we use the same decomposition as used in (2.14), even with the inhomogeneity Φ 0 . In the definition of Λ m,ℓ (x) in (2.16) we have to put in addition Φ(x), if ℓ = 0 and m > 0. The consequence is equation (2.17) for every ℓ ∈ Z. Hence, the inhomogeneity is shifted to Λ m,ℓ . This yields (2.29) for all ℓ ∈ Z, where k still runs through N 0 . We need this for the argument in Substep 3.1.
Proof of Corollary 2.11 and 2.12.
1. The proof of Corollary 2.11 is immediate. We know that $\Delta^N$ gives $(\sum_{k=1}^d|\xi_k|^2)^N$ as a factor on the Fourier side. This gives (2.5) immediately and together with (2.11) we have (2.4) for $\varepsilon>0$ small enough.
2. In the case of Corollary 2.12 the situation is a bit more involved. Clearly, Condition (2.5) holds true. But the problem here is, that (2.4) may be violated for all ε > 0. However, we argue as follows. In Step 2 in the proof above we have seen, that we do not need (2.4) for the system (Ψ 0 , Ψ). Hence, we can estimate (2.9) and (2.10) from above by a further characterization of F s p,q (R d ) . For the remaining estimates we apply Theorem 2.6 with the system
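The Fourier-side factor used in part 1 of this proof can be written out as follows. This is a worked sketch for a generic $\Psi\in S(\mathbb{R}^d)$ (the symbol $\Psi$ and the constant $c_N$ are introduced here for illustration); $c_N\neq0$ depends only on the normalization of $\mathcal{F}$, with $c_N=(-1)^N$ for the kernel $e^{-ix\cdot\xi}$:

```latex
\begin{align*}
\mathcal{F}(\Delta^N \Psi)(\xi)
  &= c_N \Big(\sum_{k=1}^d |\xi_k|^2\Big)^{N}\,\mathcal{F}\Psi(\xi)
   = c_N\, |\xi|^{2N}\,\mathcal{F}\Psi(\xi), \\
D^{\bar\alpha}\big(\mathcal{F}(\Delta^N \Psi)\big)(0) &= 0
  \quad\text{for all } |\bar\alpha|_1 \le 2N-1 .
\end{align*}
```

The second line follows from the Leibniz rule: $|\xi|^{2N}$ is a homogeneous polynomial of degree $2N$, so each of its derivatives of order at most $2N-1$ vanishes at the origin. This is the moment condition (2.5) with $R=2N-1$.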

Classical coorbit space theory
In [7,8,9,14] a general theory of Banach spaces related to integrable group representations has been developed. The ingredients are a locally compact group $G$ with identity $e$, a Hilbert space $H$ and an irreducible, unitary and continuous representation $\pi:G\to L(H)$, which is at least integrable. One can associate a Banach space $CoY$ to any solid, translation-invariant Banach space $Y$ of functions on the group $G$. The main achievement of this abstract theory is a powerful discretization machinery for $CoY$, i.e., a universal approach to atomic decompositions and Banach frames. It allows one to transfer certain questions concerning Banach space or interpolation theory from the function space to the associated sequence space level, see [8,9,18]. In connection with smoothness spaces of Besov-Lizorkin-Triebel type the philosophy of this approach is to measure smoothness of a function in terms of decay properties of the continuous wavelet transform $W_gf$, which is studied in detail in the appendix. Indeed, homogeneous Besov and Lizorkin-Triebel type spaces turn out to be coorbits of properly chosen spaces $Y$ on the $ax+b$-group $G$.
There are some more examples according to this abstract theory. One main class of examples refers to the Heisenberg group H, the short-time Fourier transform and leads to the well-known modulation spaces as coorbits of weighted L p (H) spaces, see [7, 7.1] and also [10].

Function spaces on G
Integration on $G$ will always be with respect to the left Haar measure $d\mu(x)$. The Haar module on $G$ is denoted by $\Delta$. We define further $L_xF(y)=F(x^{-1}y)$ and $R_xF(y)=F(yx)$, $x,y\in G$, the left and right translation operators. A Banach function space $Y$ on the group $G$ is supposed to have the following properties
The continuous weight $w$ is called sub-multiplicative if $w(xy)\le w(x)w(y)$ for all $x,y\in G$. The space $L^w_p(G)$, $1\le p\le\infty$, of functions $F$ on the group $G$ is defined via the norm
$$\|F|L^w_p(G)\|=\Big(\int_G|F(x)w(x)|^p\,d\mu(x)\Big)^{1/p},$$
where we use the essential supremum in case $p=\infty$. If $w\equiv1$ then we simply write $L_p(G)$. It is easy to show that these spaces provide left and right translation invariance if $w$ is sub-multiplicative. Later, in Paragraph 4.1 we are going to introduce certain mixed norm spaces where the translation invariance is no longer automatic.
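The translation invariance of $L^w_p(G)$ for sub-multiplicative $w$ can be checked directly. The following is a sketch for $1\le p<\infty$, assuming the norm $\|F|L^w_p(G)\|=(\int_G|F(x)w(x)|^p\,d\mu(x))^{1/p}$ and the convention $\mu(Ex)=\Delta(x)\mu(E)$ for the left Haar measure, so that $\int_GF(yx)\,d\mu(y)=\Delta(x^{-1})\int_GF(z)\,d\mu(z)$:

```latex
\begin{align*}
\|L_xF\,|\,L^w_p(G)\|^p
 &= \int_G |F(x^{-1}y)|^p\,w(y)^p\,d\mu(y)
  = \int_G |F(z)|^p\,w(xz)^p\,d\mu(z)
  \le w(x)^p\,\|F\,|\,L^w_p(G)\|^p, \\
\|R_xF\,|\,L^w_p(G)\|^p
 &= \int_G |F(yx)|^p\,w(y)^p\,d\mu(y)
  = \Delta(x^{-1})\int_G |F(z)|^p\,w(zx^{-1})^p\,d\mu(z)
  \le \Delta(x^{-1})\,w(x^{-1})^p\,\|F\,|\,L^w_p(G)\|^p .
\end{align*}
```

Hence $\|L_x\|\le w(x)$ and $\|R_x\|\le\Delta(x^{-1})^{1/p}w(x^{-1})$ on $L^w_p(G)$; only left invariance of $\mu$ and $w(xy)\le w(x)w(y)$ are used.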

Sequence spaces
Definition 3.1. Let $X=\{x_i\}_{i\in I}$ be some discrete set of points in $G$ and $V$ be a relatively compact neighborhood of $e\in G$.
(i) $X$ is called $V$-dense if $G=\bigcup_{i\in I}x_iV$.
(ii) $X$ is called relatively separated if for all compact sets $K\subset G$ there exists a constant $C_K$ such that $\sup_{j\in I}\sharp\{i\in I:x_iK\cap x_jK\neq\emptyset\}\le C_K$.
(iii) $X$ is called $V$-well-spread (or simply well-spread) if it is both relatively separated and $V$-dense for some $V$.
Definition 3.2. For a family $X=\{x_i\}_{i\in I}$ which is $V$-well-spread with respect to a relatively compact neighborhood $V$ of $e\in G$ we define the sequence spaces $Y^\flat$ and $Y^\sharp$ associated to $Y$ as
Remark 3.3. For a well-spread family $X$ the spaces $Y^\flat$ and $Y^\sharp$ do not depend on the choice of $V$, i.e., different sets $V$ define equivalent norms on $Y^\flat$ and $Y^\sharp$, respectively. For more details on these spaces we refer to [8].

Coorbit spaces
Given a Hilbert space $H$ and an integrable, irreducible, unitary and continuous representation $\pi:G\to L(H)$, the general voice transform of $f\in H$ with respect to a fixed atom $g$ is defined as the function $V_gf$ on the group $G$ given by
$$V_gf(x)=\langle\pi(x)g,f\rangle,\quad x\in G, \tag{3.1}$$
where the brackets denote the inner product in $H$.
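Definition 3.4. For a sub-multiplicative weight $w(\cdot)\ge1$ on $G$ we define the space $A_w\subset H$ of admissible vectors by
$$A_w=\{g\in H:V_gg\in L^w_1(G)\}.$$
If $A_w\neq\{0\}$, we fix $g\in A_w\setminus\{0\}$ and put
$$H^1_w=\{f\in H:V_gf\in L^w_1(G)\},\qquad\|f|H^1_w\|=\|V_gf|L^w_1(G)\|.$$
Finally, we denote by $(H^1_w)^\sim$ the canonical anti-dual of $H^1_w$, i.e., the space of conjugate linear functionals on $H^1_w$.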
We see immediately that $A_w\subset H^1_w\subset H$. The voice transform (3.1) can now be extended to $H^1_w\times(H^1_w)^\sim$ by the usual dual pairing. The space $H^1_w$ can be considered as the space of test functions and the reservoir $(H^1_w)^\sim$ as distributions. Let now $Y$ be a space on $G$ such that (i)-(iii) in Paragraph 3.1 hold true. We define further where the operator norms are considered from $Y$ to $Y$.
Definition 3.5. Let $Y$ be a space on $G$ satisfying (i)-(iii) in Paragraph 3.1 and let the weight $w(x)$ be given by (3.2). Let further $g\in A_w$. We define the space $CoY$, which we call coorbit space of $Y$, through
$$CoY=\{f\in(H^1_w)^\sim:V_gf\in Y\},\qquad\|f|CoY\|=\|V_gf|Y\|. \tag{3.3}$$
The following basic properties are proved for instance in [20, Thm. 4.5.13].
Theorem 3.6. (i) The space CoY is a Banach space independent of the analyzing vector g ∈ A w .
(ii) The definition of the space CoY is independent of the reservoir in the following sense: Assume that S ⊂ H 1 w is a non-trivial locally convex vector space which is invariant under π. Assume further that there exists a non-zero vector g ∈ S ∩ A w for which the reproducing formula holds true for all f ∈ S ∼ . Then we have

Discretizations
This section collects briefly the basic facts concerning atomic (frame) decompositions in coorbit spaces. We are interested in atoms of type {π(x i )g} i∈I , where {x i } i∈I ⊂ G represents a discrete subset, whereas g denotes a fixed admissible analyzing vector.
in some suitable topology.
(c) If $\{\lambda_i\}_{i\in I}\in B^\sharp$ then $\sum_{i\in I}\lambda_ig_i\in B$ and there exists a constant $C_2>0$ such that
(a) We have $\{h_i(f)\}_{i\in I}\in B^\flat$ for all $f\in B$ and there exist constants $C_1,C_2$ such that
Remark 3.9. This setting differs slightly from the understanding of Triebel in [30,31].
The following abstract result for the atomic decomposition in $CoY$ is due to Feichtinger and Gröchenig (see [8, Thm. 6.1]).
Theorem 3.10. Let $Y$ be a function space on the group $G$ satisfying the hypotheses (i)-(iii) from Paragraph 3.1 and let $w(x)$ be given by (3.2). Furthermore, the element $g\in A_w$ is supposed to satisfy Then there exists a neighborhood $U$ of $e\in G$ and constants $C_0,C_1>1$ such that for every $U$-well-spread discrete set $X=\{x_i\}_{i\in I}\subset G$ the following is true.
with coefficients $\{\lambda_i\}_{i\in I}$ depending linearly on $f$ and satisfying the estimate
(ii) (Synthesis) Conversely, for any sequence $\{\lambda_i\}_{i\in I}\in Y^\sharp$ the element $f=\sum_{i\in I}\lambda_i\pi(x_i)g$ is in $CoY$ and one has In both cases, convergence takes place in the norm of $CoY$ if the finite sequences are norm dense in $Y^\sharp$, and in the weak*-sense of $(H^1_w)^\sim$ otherwise.
Remark 3.11. According to Definition 3.7 the family $\{\pi(x_i)g\}_{i\in I}$ represents an atomic decomposition for $CoY$.
Theorem 3.12. Under the same assumptions as in Theorem 3.10 the system $\{\pi(x_i)g\}_{i\in I}$ represents a Banach frame for $CoY$, i.e.,
The following powerful result goes back to Gröchenig [13] and was generalized by Rauhut [21].
Theorem 3.13. Suppose that the functions $g_r,\gamma_r$, $r=1,\dots,n$, satisfy (3.4). Let $X=\{x_i\}_{i\in I}$ be a well-spread set such that for all $f\in H$. Then expansion (3.5) extends to all $f\in CoY$. Moreover, $f\in(H^1_w)^\sim$ belongs to $CoY$ if and only if $\{\langle\pi(x_i)\gamma_r,f\rangle\}_{i\in I}$ belongs to $Y^\flat$ for each $r=1,\dots,n$. The convergence is considered in $CoY$ if the finite sequences are dense in $Y^\flat$. In general we have weak*-convergence.
Proof. The proof of this result relies on the fact that there exists an atomic decomposition $\{\pi(y_i)g\}_{i\in I}$ by Theorem 3.10 with a certain $g$ satisfying (3.4) and a corresponding sequence of points $Z=\{y_i\}_{i\in I}$. This has to be combined with Theorem 3.12 and Theorem 3.10/(ii) and we are done. See [13] for the details.

Coorbit spaces on the ax + b-group
Let $G=\mathbb{R}^d\rtimes\mathbb{R}^*_+$ be the $d$-dimensional $ax+b$-group. Its multiplication is given by $(x,t)(y,s)=(x+ty,st)$.
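The left Haar measure $\mu$ on $G$ is given by $d\mu(x,t)=dx\,dt/t^{d+1}$, the Haar module is $\Delta(x,t)=t^{-d}$. Given a function $F$ on $G$, the left and right translations $L_{(y,r)}$ and $R_{(y,r)}$ are given by
$$L_{(y,r)}F(x,t)=F((y,r)^{-1}(x,t))=F\Big(\frac{x-y}{r},\frac{t}{r}\Big)\quad\text{and}\quad R_{(y,r)}F(x,t)=F((x,t)(y,r))=F(x+ty,rt).$$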
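The stated Haar measure and Haar module can be verified by a change of variables. The computation below is a sketch using only the group law $(x,t)(y,s)=(x+ty,st)$; the element $h=(y,s)$ is a fixed group element introduced for the check, and $(x,t)^{-1}=(-x/t,1/t)$:

```latex
% Left translation by h=(y,s):  (x,t) -> h(x,t) = (y+sx, st),  Jacobian s^{d+1}.
% Right translation by h=(y,s): (x,t) -> (x,t)h = (x+ty, ts),  Jacobian s.
\begin{align*}
(x',t') = (y+sx,\,st):\quad
 \frac{dx'\,dt'}{(t')^{d+1}}
 &= \frac{s^{d}\,dx\;s\,dt}{s^{d+1}t^{d+1}}
  = \frac{dx\,dt}{t^{d+1}}, \\
(x',t') = (x+ty,\,ts):\quad
 \frac{dx'\,dt'}{(t')^{d+1}}
 &= \frac{s\,dx\,dt}{s^{d+1}t^{d+1}}
  = s^{-d}\,\frac{dx\,dt}{t^{d+1}} .
\end{align*}
```

The first line is left invariance of $\mu$; the second shows $\mu(Eh)=s^{-d}\mu(E)$ for measurable $E\subset G$, which matches $\Delta(y,s)=s^{-d}$ under the convention $\mu(Eh)=\Delta(h)\mu(E)$.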

Peetre type spaces on G
The present paragraph is devoted to the definition of certain mixed norm spaces on the group. Such spaces have been considered in various papers, see [5,7,13,14]. In particular, so-called tent spaces have some important applications in harmonic analysis. Indeed, it is possible to recover Lizorkin-Triebel spaces as coorbits of tent spaces.
Here we use a different approach and define a new scale of function spaces on the group $G$. We call them Peetre type spaces since a quantity related to the Peetre maximal function (2.1) is involved in their definition. It turns out that they are straightforward to handle in connection with translation invariance. In contrast to the tent space approach they represent the more natural choice for considering Lizorkin-Triebel spaces as coorbits. Additionally, they seem to be suitable for inhomogeneous spaces and more general situations like weighted spaces and general 2-microlocal spaces, which will be studied in a further contribution to the subject.
Step 1. The left and right translation invariance of $\dot L^s_{p,q}(G)$ and $\dot T^s_{p,q}(G)$ was shown in [20, Lem. 4.7.10].
Step 2. Let us consider $\dot P^{s,a}_{p,q}(G)$. Clearly, we have for $F\in\dot P^{s,a}_{p,q}(G)$ Hence, we obtain $\|L_{(z,r)}:\dot P^{s,a}_{p,q}(G)\to\dot P^{s,a}_{p,q}(G)\|=r^{d(1/p-1/q)-s}$. The right translation invariance is obtained by Observe that This yields and consequently $\|R_{(z,r)}:\dot P^{s,a}_{p,q}(G)\to\dot P^{s,a}_{p,q}(G)\|\le r^{s+d/q}\max\{1,r^{-a}\}(1+|z|)^a$.
Remark 4.3. Note that we used neither the translation invariance of the Lebesgue measure nor any change of variable in order to prove the right translation invariance of $\dot P^{s,a}_{p,q}(G)$. This gives room for further generalizations, for instance replacing the space $L_p(\mathbb{R}^d)$ by some weighted Lebesgue space $L_p(\mathbb{R}^d,\omega)$.

New old coorbit spaces
We start with $H=L_2(\mathbb{R}^d)$ and the representation
$$\pi(x,t)=T_xD^{L_2}_t,\quad(x,t)\in G,$$
where $T_xf=f(\cdot-x)$ and $D^{L_2}_tf=t^{-d/2}f(\cdot/t)$ has already been defined in Paragraph 2.2. This representation is unitary, continuous and square integrable on $H$ but not irreducible. However, if we restrict to radial functions $g\in L_2(\mathbb{R}^d)$ then $\operatorname{span}\{\pi(x,t)g:(x,t)\in G\}$ is dense in $L_2(\mathbb{R}^d)$.
Another possibility to overcome this obstacle is to extend the group by SO(d), which is more or less equivalent, see [7,8] for details. The voice transform in this special situation is represented by the so-called continuous wavelet transform W g f which we study in detail in Paragraph A.1 in the appendix.
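That translations composed with $L_2$-normalized dilations form a unitary representation of the $ax+b$-group can be checked by hand. The lines below are a sketch assuming $\pi(x,t)=T_xD^{L_2}_t$ with $T_xf=f(\cdot-x)$ and $D^{L_2}_tf=t^{-d/2}f(\cdot/t)$:

```latex
\begin{align*}
D^{L_2}_t T_y f &= t^{-d/2} f\big(\tfrac{\cdot}{t}-y\big)
                 = t^{-d/2} f\big(\tfrac{\cdot - ty}{t}\big)
                 = T_{ty} D^{L_2}_t f, \\
D^{L_2}_t D^{L_2}_s f &= (ts)^{-d/2} f\big(\tfrac{\cdot}{ts}\big) = D^{L_2}_{ts} f, \\
\pi(x,t)\,\pi(y,s) &= T_x D^{L_2}_t T_y D^{L_2}_s
                    = T_{x+ty}\, D^{L_2}_{ts}
                    = \pi\big((x,t)(y,s)\big), \\
\|\pi(x,t) f\,|\,L_2(\mathbb{R}^d)\|^2
  &= \int_{\mathbb{R}^d} t^{-d}\,\big|f\big(\tfrac{z-x}{t}\big)\big|^2\,dz
   = \|f\,|\,L_2(\mathbb{R}^d)\|^2 .
\end{align*}
```

The commutation rule in the first line is what produces the semidirect product law $(x,t)(y,s)=(x+ty,st)$.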
Recall the abstract definition of the spaces $H^1_w$ and $A_w$ from Definition 3.4. The following result is implied by our Lemma A.3 on the decay of the continuous wavelet transform. It states under which conditions on the weight $w$ the space $H^1_w$ is nontrivial.
for some $r,s,s'\ge0$ then $S_0(\mathbb{R}^d)\hookrightarrow H^1_w$. This is a kind of minimal condition which is needed in order to define coorbit spaces in a reasonable way. Instead of $(H^1_w)^\sim$ one may use $S_0'(\mathbb{R}^d)$ as reservoir and a radial $g\in S_0(\mathbb{R}^d)$ as analyzing vector. Considering (3.2) we have to restrict to such function spaces $Y$ on $G$ satisfying (i), (ii), (iii) in Paragraph 3.1 where additionally holds true for some $r,s,s'\ge0$. The following theorem shows how the spaces of Besov-Lizorkin-Triebel type from Section 2 can be recovered as coorbit spaces with respect to $G$.
[7,13,14] and rely on the characterizations given by Triebel in [29] and [30, 2.4, 2.5], see in particular [30, 2.4.5] for the variant in terms of tent spaces which were invented in [5].
From the deep result in [5, Prop. 4] it follows that the spaces $\dot T^s_{p,q}(G)$ are translation invariant Banach function spaces on $G$, which makes them feasible for coorbit space theory.
(b) Assertion (iii) is indeed new and makes the rather complicated tent spaces $\dot T^s_{p,q}(G)$ obsolete for this issue. We showed that $Y=\dot P^{s,a}_{p,q}(G)$ is a much better choice since the right translation invariance is immediate and gives more transparent estimates for its norm. This becomes important once we are interested in reasonable conditions for atomic decompositions, see Section 4.5.

Sequence spaces
In the sequel we consider a compact neighborhood of the identity element in $G$ given by , where $\alpha>0$ and $1<\beta$. Furthermore, we consider the discrete set of points This family is $U$-well-spread. Indeed, Note that in this case the spaces $Y^\sharp$ and $Y^\flat$ coincide. We will further use the notation
$$\chi_{j,k}(x)=\begin{cases}1&:\ x\in Q_{j,k},\\0&:\ \text{otherwise}.\end{cases}$$
Definition 4.7. Let $Y$ be a function space on $G$ as above. We put
Theorem 4.8. Let $1\le p,q\le\infty$, $s\in\mathbb{R}$ and $a>d/\min\{p,q\}$. Then and
Proof. We prove the first statement. The proof for the second one is even simpler. Let Discretizing the integral over $t$ by $t\asymp\beta^{-\ell}$ we obtain
and estimate (4.4) In order to include also the situation $\min\{p,q\}\le1$ we use the following trick. Obviously, we can rewrite and estimate (4.4) with $0<r<1$ in the following way We continue with the useful estimate $\sup_w|\chi_{\ell,k}(x+w)|\,(1+\beta^\ell|w|)^{ar}$ (4.6) Indeed, the first estimate is obvious. Let us establish the second one Note that the functions $g_\ell=\beta^{\ell d}/(1+\beta^\ell|\cdot|)^{ar}$ belong to $L_1(\mathbb{R}^d)$ with uniformly bounded norm, where we need that $ar>d$. Putting (4.7) and (4.6) into (4.5) we obtain Now we are in a position to use the majorant property of the Hardy-Littlewood maximal operator (see Paragraph 2.2 and [26, Chapt. 2]), which states that a convolution of a function $f$ with an $L_1(\mathbb{R}^d)$-function (having norm one) can be estimated from above by the Hardy-Littlewood maximal function of $f$. We choose $r<\min\{p,q\}$ and apply Theorem 2.1 for the $L_{p/r}(\ell_{q/r})$ situation. This gives and finishes the upper estimate. Both conditions, $ar>d$ and $r<\min\{p,q\}$, are compatible if $a>d/\min\{p,q\}$ is assumed at the beginning. For the estimate from below we go back to (4.2) and observe A further use of (4.3) gives finally The proof is complete.
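The role of the condition $ar>d$ in the proof above can be made explicit. The computation below is a sketch assuming the kernels have the form $g_\ell(x)=\beta^{\ell d}(1+\beta^\ell|x|)^{-ar}$; here $\omega_{d-1}$ denotes the surface area of the unit sphere in $\mathbb{R}^d$ (notation introduced for this check):

```latex
\begin{align*}
\int_{\mathbb{R}^d} \frac{\beta^{\ell d}}{(1+\beta^{\ell}|x|)^{ar}}\,dx
 &= \int_{\mathbb{R}^d} \frac{dy}{(1+|y|)^{ar}}
  = \omega_{d-1}\int_0^\infty \frac{\rho^{\,d-1}}{(1+\rho)^{ar}}\,d\rho
  < \infty
 \iff ar > d .
\end{align*}
```

The substitution $y=\beta^\ell x$ removes the dependence on $\ell$, which is why the $L_1$-norms are uniformly bounded. A number $r$ with $d/a<r<\min\{p,q\}$ exists exactly when $a>d/\min\{p,q\}$.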

Atomic decompositions
The following theorem is a direct consequence of the abstract results in Theorems 3.10, 3.12.
is a Banach frame forḞ s p,q (R d ) in the sense of (3.5) . Proof. Let us prove (a). First of all, we apply Theorem 4.5/(i). Afterwards, we use Proposition 4.2 in order to estimate the weight w Y (x, t) for Y =L Let us distinguish the cases s ≥ 0 and s < 0. In the first case we can put Finally (4.9), (4.10) and Theorem 3.13 yield (a) .
Step 2. We prove (b). We apply Theorem 4.5/(iii) and afterwards Proposition 4.2 and obtain for Y =Ṗ This yields the lower bound in (b) and we are done.
The following corollary is a consequence of Theorem 4.12 and the facts in Section A.2.
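(D) For every $N\in\mathbb{N}$ there exists a constant $c_N$ such that $|\Psi(x)|\le c_N(1+|x|)^{-N}$, $x\in\mathbb{R}^d$.
$(M_L)$ We have vanishing moments $D^{\bar\alpha}\mathcal{F}\Psi(0)=0$ for all $|\bar\alpha|_1\le L$.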
Remark A.2. If a function $g\in L_2(\mathbb{R}^d)$ satisfies $(S_K)$ for some $K>0$ then by well-known properties of the Fourier transform we have $g\in C^{\lfloor K\rfloor}(\mathbb{R}^d)$.
The following lemma provides a useful decay result for the continuous wavelet transform under certain smoothness, decay and moment conditions, see also [12,23,16] for similar results in a different language. It represents a continuation of [23, Lem. 1], where one deals with $S(\mathbb{R}^d)$-functions.
Lemma A.3. Let $L\in\mathbb{N}_0$, $K>0$ and $\Phi,\Psi,\Phi_0\in L_2(\mathbb{R}^d)$.
(i) Let Φ satisfy (D), (M L−1 ) and let Φ 0 satisfy (D), (S K ). Then for every N ∈ N there exists a constant C N such that the estimate holds true for x ∈ R d and 0 < t < 1 .
We exploit property (S K ) for Φ 0 and proceed analogously as above. This proves (A.2).
This completes the proof.
Corollary A.4. Let Φ, Ψ belong to the Schwartz space S 0 (R d ). By Lemma A.3/(ii) for every L, N ∈ N there is a constant C L,N > 0 such that Additionally, we obtain for Φ ∈ S 0 (R d ) and Φ 0 ∈ S(R d ) that

A.2 Orthonormal wavelet bases
The following Lemma is proved in Wojtaszczyk [34, 5.1]. is an orthonormal basis in L 2 (R d ).

Spline wavelets
As a main example we will consider the spline wavelet system. The normalized cardinal B-spline of order $m+1$ is given by the generator of an orthonormal wavelet system is defined. For $m=1$ it is easily checked that $-\psi_1(x-1)$ is the Haar wavelet. In general these functions $\psi_m$ have the following properties:
• $\psi_m$ restricted to intervals $[\frac{k}{2},\frac{k+1}{2}]$, $k\in\mathbb{Z}$, is a polynomial of degree at most $m-1$.
In particular, $\psi_m$ satisfies $(M_L)$ for $0<L\le m$ and $\psi_m,\varphi_m$ satisfy (D) and $(S_K)$ for $K<m-1$.