Preconditioning Filter Bank Decomposition Using Structured Normalized Tight Frames

We turn a given filter bank into a filtering scheme that provides perfect reconstruction, in which synthesis is the adjoint of the analysis part (a so-called unitary filter bank), all filters have equal norm, and the essential features of the original filter bank are preserved. Unitary filter banks providing perfect reconstruction are induced by tight generalized frames, which enable signal decomposition by means of a set of linear operators. If, in addition, the frame elements have equal norm, then the signal energy is spread across the various filter bank channels in a uniform fashion, which is often more suitable for further signal processing. We start with a given generalized frame whose elements allow for fast matrix vector multiplication, as is the case for convolution operators, and compute a normalized tight frame for which signal analysis and synthesis still preserve those fast algorithmic schemes.


Introduction
Increasingly detailed data are acquired in all sorts of measurements nowadays, so that fast algorithms are an important factor for successful signal processing. The concept of generalized frames has a long tradition in signal processing, and many unitary filter bank schemes with the perfect reconstruction property are induced by tight generalized frames. Frames themselves are basis-like systems that span a vector space but allow for linear dependency. The inherent redundancy of frames can yield advantageous features unavailable within the basis concept [1][2][3][4]. If the frame is tight and its elements have unit norm, then it resembles an orthonormal basis, with the add-on of useful redundancy, and the frame coefficients measure the signal energy in a uniform fashion. Generalized frames were introduced in [5] as a tool for signal decomposition using a set of linear operators. In [6], collections of orthogonal projectors were considered under the name fusion frames; fusion frame filter banks were considered in [7]; and the concept of tight p-fusion frames was developed in [8]. As convolution operators are linear, most filter banks can be thought of as pairs of generalized frames, one for analysis and the other for synthesis. Hence, in view of filter banks, it is not sufficient to deal with frames, but we must inevitably consider their generalized counterpart. Tightness of a generalized frame means that the induced unitary filter bank provides perfect reconstruction. As with frames, we seek unit norm tight generalized frames because the signal energy is then spread across the various channels in a more uniform fashion. The latter was used in [9] to verify robustness of tight fusion frames against erasures, meaning that it is beneficial to have tight fusion frames with equal norm elements when dealing with distortions and loss of data.
To keep the filter bank perspective, we will focus on generalized frames consisting of convolution operators enabling fast algorithms.
In the present paper we start with a generalized frame whose elements allow for fast matrix vector multiplications (e.g., convolution operators) and construct a unit norm tight generalized frame that induces a filter bank scheme preserving those fast algorithms. The latter is related to the so-called Paulsen problem for frames, where one is given a unit norm frame and asks for the closest tight frame with unit norm and for an algorithm to find it. This problem for frames has been partially solved in [10][11][12]. Note that if we are given a unit norm generalized frame whose elements allow for fast matrix vector multiplications, then the closest tight generalized frame with unit norm may not provide such fast algorithmic schemes in general. Here, we aim to find a related tight unit norm generalized frame in such a way that signal analysis and synthesis can still benefit from the underlying fast matrix vector multiplications.
We should point out that we use the term filter bank beyond sets of convolution operators, similar to [7], where weighted orthogonal projectors are considered. Nonetheless, if the starting generalized frame consists of convolution operators, then the resulting scheme still represents convolution operators in each channel, but we require one additional linear operator for global pre- and postmultiplication. As this operator has a special structure, being the inverse of a convolution frame operator, there are still fast computation schemes available [13].
Our construction is inspired by pseudocovariance estimators of elliptical distributions in [14]; see also [15,16]. We derive an iterative algorithm on positive definite matrices, for which we prove convergence, so that we obtain a positive definite matrix Γ that enables us to construct the tight unit norm generalized frame.
For related research topics, such as the optimal rescaling of filter banks, we refer to [17,18]. Preconditioning in the context of Gabor frames is addressed in [19]. In order to assess the benefits of our approach in image analysis, we suggest the use of the structural similarity proposed in [20].
The outline is as follows: In Section 2, we introduce the concept of generalized frames and motivate the construction of unit norm tight generalized frames. In Section 3, we present our iterative algorithm, for which we verify convergence, enabling us to construct tight generalized frames with unit norm that preserve fast analysis and synthesis due to their special structure. In Section 4, we provide a few examples of random matrices whose samples satisfy the convergence assumptions needed. We also point out examples of convolution operators and further operators enabling fast matrix vector multiplications. In Section 5, we discuss the structure of our construction when the underlying generalized frame is a sample from an elliptical distribution. Some concluding remarks are contained in Section 6.

Generalized Frames
Let K denote either R or C. We follow [5] and call a collection {V_j}_{j=1}^N ⊂ K^{b×d} a generalized frame (or a b-frame for short) if there are two constants 0 < A ≤ B < ∞ such that
A‖x‖² ≤ ∑_{j=1}^N ‖V_j x‖² ≤ B‖x‖², for all x ∈ K^d.
If the constants can be chosen 0 < A = B, then {V_j}_{j=1}^N is called a tight b-frame, and it is called a Parseval b-frame if 0 < A = B = 1. For b = 1, each V_j is a row vector v_j^*, so that V_j x = ⟨x, v_j⟩ and we recover the concept of frames (cf. [1]). It turns out that a collection {V_j}_{j=1}^N ⊂ K^{b×d} is a b-frame if and only if ⋃_{j=1}^N range(V_j^*) spans K^d.
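As a quick numerical sanity check of this definition, note that ∑_j ‖V_j x‖² = x^* S x with S = ∑_j V_j^* V_j, so the optimal bounds A and B are the extreme eigenvalues of S. The following minimal NumPy sketch (the helper name frame_bounds and the concrete sizes are ours, not from the text) computes them:

```python
import numpy as np

def frame_bounds(Vs):
    """Optimal frame bounds of {V_j}: since sum_j ||V_j x||^2 = x^* S x
    with S = sum_j V_j^* V_j, A and B are the extreme eigenvalues of S."""
    S = sum(V.conj().T @ V for V in Vs)
    w = np.linalg.eigvalsh(S)
    return w[0], w[-1]

rng = np.random.default_rng(0)
Vs = [rng.standard_normal((2, 5)) for _ in range(8)]  # eight 2x5 blocks, d = 5
A, B = frame_bounds(Vs)
print(A > 0)  # b-frame: the ranges of the V_j^* together span R^5

# splitting the rows of the identity gives a Parseval b-frame (A = B = 1)
P = [np.eye(5)[:2], np.eye(5)[2:4], np.eye(5)[4:]]
print(np.allclose(frame_bounds(P), 1.0))
```

A positive smallest eigenvalue is exactly the spanning condition stated above.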
If {V_j}_{j=1}^N is a b-frame, then the analysis operator is F : K^d → K^{Nb}, x ↦ (V_j x)_{j=1}^N, with adjoint F^* : (y_j)_{j=1}^N ↦ ∑_{j=1}^N V_j^* y_j, such that the generalized frame operator is
S = F^*F = ∑_{j=1}^N V_j^* V_j.
The collection {V_j S^{-1}}_{j=1}^N is called the canonical dual b-frame and yields the expansion
x = ∑_{j=1}^N (V_j S^{-1})^* V_j x, for all x ∈ K^d,
which simply follows from S^{-1}S = S S^{-1} = I_d, where I_d denotes the d × d identity matrix.
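The canonical dual expansion is easy to verify numerically. The following sketch (sizes chosen only for illustration) performs analysis with the V_j and synthesis with the canonical dual:

```python
import numpy as np

rng = np.random.default_rng(1)
Vs = [rng.standard_normal((2, 5)) for _ in range(8)]
S = sum(V.T @ V for V in Vs)            # frame operator S = sum_j V_j^T V_j
Sinv = np.linalg.inv(S)

x = rng.standard_normal(5)
coeffs = [V @ x for V in Vs]            # analysis: F x = (V_j x)_j
# synthesis with the canonical dual: x = sum_j (V_j S^{-1})^* V_j x
x_rec = sum((V @ Sinv).T @ c for V, c in zip(Vs, coeffs))
print(np.allclose(x, x_rec))
```

Reconstruction is exact because S is invertible whenever the V_j^* ranges span the whole space.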
Proposition 1. Let {V_j}_{j=1}^N be a b-frame with frame operator S. Then {V_j S^{-1/2}}_{j=1}^N is a Parseval b-frame. Here, ‖ ⋅ ‖ denotes the Hilbert-Schmidt (Frobenius) norm. Most parts of the proof of Proposition 1 follow the lines in [21], where b = 1 is considered, so we omit the proof.

Remark 2.
We have supposed that the linear operators of a b-frame all have the same dimensions, which simplifies notation but is not necessary. The entire paper could also deal with sets of linear operators {V_j}_{j=1}^N, where V_j ∈ K^{b_j×d}, for j = 1, . . . , N. Then V_j^* V_j ∈ K^{d×d}, and this is all we need.
Tight frames are desirable because synthesis is simply the adjoint of the analysis part. For signal processing purposes, we are interested in tight b-frames that additionally have unit norm, because those more closely resemble orthonormal bases, and {‖V_j x‖}_{j=1}^N then carries more information about the signal energy in the direction of a particular frame element; see [22] for b = 1.
Given some b-frame, say with unit norm elements, let us seek a nearby tight b-frame with equal norm elements. If we give up the equal norm requirement and S is the frame operator of some b-frame {V_j}_{j=1}^N, then {V_j S^{-1/2}}_{j=1}^N is a Parseval b-frame; see Proposition 1. In general, however, {V_j S^{-1/2}}_{j=1}^N may not have equal norms. The search for the closest Parseval frame with equal norm elements has become known as the Paulsen problem. It is essentially the same problem if we restrict ourselves to the sphere; that is, given a unit norm b-frame, we aim for the closest unit norm tight b-frame. For b = 1, this problem was partially solved in [10][11][12]. Suppose now that we are given a b-frame {V_j}_{j=1}^N that allows for fast matrix vector multiplications for each V_j and V_j^*. The closest equal norm Parseval b-frame may not preserve such features. From a computational point of view, it would be preferable to find an equal norm Parseval b-frame that still allows for fast analysis and synthesis schemes, and this is indeed our topic in the subsequent sections.
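The gap between Proposition 1 and the equal norm requirement can be seen in a small experiment: the sketch below (an illustration under our notation, not code from the paper) confirms that {V_j S^{-1/2}} is Parseval but typically has unequal element norms:

```python
import numpy as np

def inv_sqrt(S):
    """S^{-1/2} for a symmetric positive definite S via its eigendecomposition."""
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(w ** -0.5) @ Q.T

rng = np.random.default_rng(2)
Vs = [rng.standard_normal((2, 5)) for _ in range(8)]
Vs = [V / np.linalg.norm(V) for V in Vs]      # unit Hilbert-Schmidt norm
S = sum(V.T @ V for V in Vs)
R = inv_sqrt(S)
Ws = [V @ R for V in Vs]                      # canonical Parseval b-frame

S_W = sum(W.T @ W for W in Ws)
norms = [np.linalg.norm(W) for W in Ws]
print(np.allclose(S_W, np.eye(5)))            # Parseval: frame operator = I
print(max(norms) - min(norms) > 0)            # but the norms are no longer equal
```

This is precisely the defect that the construction of the next section repairs.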
To construct a unit norm tight b-frame that preserves fast matrix vector multiplications, we take inspiration from Proposition 1 and aim to find a positive definite matrix Γ such that
{V_j Γ^{1/2}/‖V_j Γ^{1/2}‖}_{j=1}^N (5)
is a unit norm tight b-frame. As opposed to Proposition 1, we replace S^{-1} with Γ and normalize. The unit norm b-frame (5) is tight if and only if
∑_{j=1}^N Γ^{1/2} V_j^* V_j Γ^{1/2} / ‖V_j Γ^{1/2}‖² = c I_d, for some c > 0. (6)
Signal analysis and synthesis require pre- and postmultiplication by Γ^{1/2}, but in between we can use N times the fast algorithms provided by V_j and V_j^* (cf. Figure 1). Now, V_j Γ^{1/2}/‖V_j Γ^{1/2}‖ has unit norm, so that the signal energy better relates to the magnitudes of {‖V_j Γ^{1/2} x‖/‖V_j Γ^{1/2}‖}_{j=1}^N. Thus, the special structure (5) can be advantageous over other unit norm tight b-frames that may be closer to the original one.
Remark 3. The filtering scheme in Figure 1 can preserve many properties of the original b-frame {V_j}_{j=1}^N that go beyond fast matrix vector multiplications, such as the V_j being orthogonal projectors or sparse matrices, as long as the application of Γ^{1/2} is implemented separately and we do not apply the products V_j Γ^{1/2} directly. Remark 4. We point out that structure (5) is different from the approach in [23], where rescalings are sought to derive tight frames. The authors in [24] discuss the setting in which a linear operator exists that maps a frame into a unit norm tight frame. We are more general here, because we join both approaches: we apply a linear operator and allow for rescaling.
Note that (6) is equivalent to
∑_{j=1}^N V_j^* V_j / ‖V_j Γ^{1/2}‖² = c Γ^{-1}. (7)
Thus, Γ is, up to scaling, the inverse of the generalized frame operator of {V_j/‖V_j Γ^{1/2}‖}_{j=1}^N. Since this equation is invariant under scalings, we can look for a solution Γ with trace(Γ) = 1. Let P be the collection of hermitian positive definite matrices in K^{d×d} and denote by P_1 the same space with the additional requirement that the trace is 1. The fixed point equation (7) gives rise to an iterative scheme that was already considered in [14, 15] for b = 1 to estimate the covariance of elliptical distributions. As initialization we choose Γ_0 := (1/d) I_d ∈ P_1 and define
Γ_{k+1} := T(Γ_k), where T(Γ) := M(Γ)^{-1}/trace(M(Γ)^{-1}) and M(Γ) := ∑_{j=1}^N V_j^* V_j / trace(V_j Γ V_j^*). (10)
Note that Γ_1 = S^{-1}/trace(S^{-1}) when the V_j have unit norm, and, to verify convergence, we will follow the ideas of the technical procedure used in [14, 15] for b = 1. For analysis purposes, we have introduced the mapping T : P → P. We will first check that the mapping T is injective up to scalings, which generalizes [14, Theorem 2.1] from b = 1 to the general case.
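Scheme (10) is straightforward to implement. The sketch below reflects our reading of (10), with ‖V_j Γ^{1/2}‖² written as trace(V_j Γ V_j^*); the function name and the concrete sizes are ours. It iterates to a fixed point Γ and then checks that the resulting unit norm b-frame (5) is tight:

```python
import numpy as np

def tightening_iteration(Vs, steps=1000):
    """Fixed-point scheme (10), in our reading: Gamma_0 = I/d and
    Gamma_{k+1} = M(Gamma_k)^{-1} / trace(M(Gamma_k)^{-1}),
    where M(Gamma) = sum_j V_j^* V_j / trace(V_j Gamma V_j^*)."""
    d = Vs[0].shape[1]
    G = np.eye(d) / d
    for _ in range(steps):
        M = sum(V.T @ V / np.trace(V @ G @ V.T) for V in Vs)
        G = np.linalg.inv(M)
        G /= np.trace(G)
    return G

rng = np.random.default_rng(3)
Vs = [rng.standard_normal((2, 6)) for _ in range(12)]  # N = 12, b = 2, d = 6
G = tightening_iteration(Vs)

# build the unit norm b-frame (5): W_j = V_j G^{1/2} / ||V_j G^{1/2}||
w, Q = np.linalg.eigh(G)
G_half = Q @ np.diag(np.sqrt(w)) @ Q.T
Ws = [V @ G_half / np.linalg.norm(V @ G_half) for V in Vs]
S_W = sum(W.T @ W for W in Ws)
print(np.allclose(S_W, (12 / 6) * np.eye(6), atol=1e-6))  # tight: S_W = (N/d) I
```

Since each W_j has unit norm, the tight frame constant is necessarily N/d by taking traces in (6).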
The following result says that if we find a proper tight b-frame with unit norm based on (5), then this tight b-frame is unique. Proposition 6. Let {V_j}_{j=1}^N be a b-frame and suppose that there are two positive definite matrices Γ and Γ̃ such that both induced systems (5) are unit norm tight b-frames. Then those two tight b-frames are identical.
Proof. The tightness assumptions imply that Γ and Γ̃ are both fixed points of the mapping T. According to Lemma 5, there is a positive constant c such that Γ̃ = cΓ. Therefore, the two tight b-frames are identical.
Next, we use scheme (10) to compute a unit norm tight b-frame.
This theorem generalizes results in [14, 15], where convergence is verified for b = 1. The conditions in Theorem 7 are redundant: condition (ii) clearly implies (i), and, for N ≥ 2, (iii) yields (ii). In fact, conditions (i), (ii), and (iii) depend on the range of each V_j but not on their norm. Note that condition (iv) is independent of global scalings, since multiplication of all V_j by some constant c means that the inverse frame operator needs to be divided by c². It requires b < d, which is, in fact, quite weak.
The assumption trace(V_j S^{-1} V_j^*) = d/N implies that the two above inequalities become equalities, which yields the required statements.
Note that Proposition 8 bounds the worst case scenario. Since I_d = S^{-1} S, taking the trace on both sides yields d = ∑_{j=1}^N trace(V_j S^{-1} V_j^*). If {V_j}_{j=1}^N has unit norm and is close to being tight, meaning S ≈ (N/d) I_d, then trace(V_j S^{-1} V_j^*) ≈ d/N. If {V_j}_{j=1}^N is sufficiently generic, or in sufficiently general position, then (i)-(iv) are satisfied for sufficiently large N.
If {V_j}_{j=1}^N is a sample of a continuous distribution on K^{b×d} and N is sufficiently large, then with probability one all of the assumptions in Theorem 7 are satisfied.
It remains to verify convergence, which we check in two steps.
Step 2 (refers to Theorem 2.2 and Corollary 2.2 in [14]). Since each Γ_k is positive definite with trace 1, there is a subsequence (Γ_{k_i})_{i=1}^∞ that converges towards some positive semidefinite matrix Γ. We must now verify that Γ is positive definite and that the entire sequence converges.

Manipulations as in Step 1 applied to the formulas for Λ and Λ_0 imply the corresponding identities. According to Step 1, the largest and smallest eigenvalues of both Λ and Λ_0 are 1 and λ, respectively. Let P and P_0 be the eigenprojectors of Λ and Λ_0, respectively, associated with the eigenvalue 1, and set r = rank(P) and r_0 = rank(P_0). As in [14], without loss of generality, we can suppose r_0 ≥ r.
By multiplying both sides from the left and the right by P_0, the relations from Step 1 restrict each summand in (21). Thus, for j ∈ J, one of two options holds. The first option yields P_0 Λ^{-1/2} Γ^{1/2} V_j^* = 0. The second option implies ⟨P, Γ^{1/2} V_j^* V_j Γ^{1/2}⟩ = ⟨P_0, Γ^{1/2} V_j^* V_j Γ^{1/2}⟩, which yields after some computations V_j Γ^{1/2} P = V_j Γ^{1/2}. For j ∉ J, we obtain analogous alternatives: the first option yields P_0 Λ^{-1/2} V_j^* = 0, and the second option implies V_j P = V_j. We now premultiply both sides of (24) by P_0 and postmultiply by I − P. The above four options, together with the fact that Λ and Λ^{-1/2} commute, imply P_0 Λ (I − P) = 0, which is equivalent to P Λ P_0 = Λ P_0. Since r_0 ≥ r, we obtain P = P P_0, so that P = P_0. The latter implies, with the above, that, for j ∈ J, either V_j Γ^{1/2} P = 0 or V_j Γ^{1/2} P = V_j Γ^{1/2}. Hence, we can split {1, . . . , N} into two disjoint index sets J_1 and J_2 according to which alternative holds. Condition (ii) yields that span(range(V_j^*) : j ∈ J_i) equals K^d for either i = 1 or i = 2. If this holds for i = 1, then we must have Γ^{1/2} P = 0. If it holds for i = 2, then we derive Γ^{1/2} P = Γ^{1/2}. Suppose now that Γ^{1/2} P = 0 holds. The same arguments as in the previous paragraph yield, for j ∈ J, that either V_j P = 0 or V_j P = V_j; see also [14]. Pre- and postmultiplying both sides in (21) by P yields an identity involving J_0 := {j : j ∉ J, V_j P = V_j}. Next, we take the trace on both sides and use that trace(V_j^* V_j) = 1 to derive a bound in terms of N_1, the number of j whose range is contained in the null space of Γ. Condition (iii) then forces N_1/N < 1 in a quantitative form that contradicts the results of Step 1. Thus, we must have Γ^{1/2} P = Γ^{1/2}, so that (33) holds. Since the ranks of the two summations in (21) are additive (see also [14]), the ranks of the two summations in (33) are additive. Hence, the two terms themselves must be orthogonal projections. According to condition (i), the rank of the first term equals rank(Γ). If rank(Γ) < d, then taking the trace of the second term implies with condition (iii) that 1 < 1/(d − rank(Γ)) ≤ 1, which is a contradiction to Step 1. Therefore, J_0 is empty and r = rank(Γ).
Taking the trace of the first term in (33) yields (34), and a lower bound on trace(P) then shows that (34) implies (35). At this point, we claim that assumption (iv) implies, for at least one j, that (36) holds for all proper linear subspaces of K^d, but we postpone the verification to the end of this proof. Since the quantity in (36) is increasing in N, (36) violates (35) for N sufficiently large. Hence Γ must have full rank and, therefore, Γ is positive definite. Also, P must have full rank, implying r = d and P = I_d. Since the eigenvalues are monotone, the entire sequence of eigenvalues converges. The latter can be used with Banach's fixed point theorem to verify that (Γ_k)_{k=1}^∞ must converge, hence towards Γ. By continuity, we obtain T(Γ) = Γ.

Remark 9. The inversion of M(Γ_k) in iterative scheme (10) must be computed in each step of the algorithm.
Next, we illustrate Theorem 7 with a few numerical examples.
The subsequent sections are dedicated to providing some examples of random samples satisfying the assumptions of Theorem 7. We will also provide examples that allow for fast matrix vector multiplications, such as convolution operators, and we support the intuition that Γ is close to the identity if the sample is close to being tight.

Examples of Random Matrices Satisfying the Assumptions for Convergence
We first fuse the concepts of generalized frames and probabilistic frames as developed in [5] and [4], respectively; see also [25].
Definition 15. Let k ≥ 1 be an integer. One says that a random matrix V ∈ K^{b×d} is a random b-frame of order k if there are positive constants A_k, B_k > 0 such that
A_k I_d ≤ E[(V^* V)^k] ≤ B_k I_d.
A random b-frame of order k is called tight if we can choose A_k = B_k.
Following the lines of the proof for rank one projectors considered in [2] yields that any random b-frame of order 1 satisfies A_1 ‖x‖² ≤ E‖V x‖² ≤ B_1 ‖x‖², for all x ∈ K^d. Similar to finite frames, if V is a random b-frame of order 1, then the random b-frame operator S_V := E[V^* V] is positive, self-adjoint, and invertible. Thus, we obtain the reconstruction formula
x = E[(V S_V^{-1})^* V x], for all x ∈ K^d.
Moreover, V is a tight random b-frame of order 1 if and only if S_V = c I_d, where c = (1/d) E‖V‖². Note that the case b = 1 of the following result is already explicitly contained in [26]; see [27] for related results on orthogonal projectors. Theorem 16. Let {V_j}_{j=1}^N be an i.i.d. sample of a tight random b-frame of order 1. If N is sufficiently large, then, with high probability, the sample is a b-frame whose frame bounds are close to being tight. Proof. Let λ_min(S) and λ_max(S) denote the smallest and largest eigenvalues of the sample frame operator S, respectively. The matrix Chernoff bounds as stated in [28] yield, for all 0 ≤ δ ≤ 1, exponential tail estimates on the deviations of λ_min(S) and λ_max(S) from their common expected value. Since (1 + δ)ln(1 + δ) − δ > 0, for all δ ∈ (0, 1], we can find a suitable constant c̃ > 0 if N is sufficiently large. Remark 17. The constants in Theorem 16 can be explicitly computed. By using c̃ := (1 + δ)ln(1 + δ) − δ > 0, the failure probability decays exponentially in c̃N − ln(2).
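The reconstruction formula for random b-frames can be checked by Monte Carlo simulation. In the sketch below we take V with i.i.d. standard Gaussian entries (an assumption of ours for illustration), for which S_V = b I_d, so the formula reduces to x = E[(1/b) V^* V x]:

```python
import numpy as np

rng = np.random.default_rng(4)
b, d, N = 2, 4, 20000
x = rng.standard_normal(d)

# For V with i.i.d. standard Gaussian entries, E[V^* V] = b I_d,
# so the reconstruction formula reads x = E[(1/b) V^* V x].
samples = [rng.standard_normal((b, d)) for _ in range(N)]
x_rec = sum(V.T @ (V @ x) for V in samples) / (b * N)
print(np.max(np.abs(x - x_rec)) < 0.1)  # sample average approximates E
```

The residual decays like N^{-1/2}, in line with the concentration bounds used in the proof of Theorem 16.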

Next, we discuss a few examples.
Example 18 (Gaussian matrices). Let 1 ≤ b < d and consider the b × d random matrix V whose entries are i.i.d. standard Gaussian. Its joint element density is
V ↦ (2π)^{-bd/2} exp(−trace(V V^*)/2). (48)
The resulting self-adjoint matrix V^* V ∈ R^{d×d} is a singular Wishart matrix (cf. [29]). According to (48), the distribution of V is invariant under orthogonal transformations, so that V is a tight random b-frame of order k, for all integers k. By using the moments of the chi-squared distribution, we see that the bounds satisfy A_k = B_k, given by products of the form m(m + 2) ⋅ ⋅ ⋅ (m + 2k − 2) in the degrees of freedom m.
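Tightness of higher order can be checked empirically. For b = 1 and a standard Gaussian row vector v ∈ R^d, rotation invariance gives E[(v v^*)²] = E[‖v‖² v v^*] = (d + 2) I_d, a value obtained from chi-squared moments; the following Monte Carlo sketch (our illustration) confirms it:

```python
import numpy as np

rng = np.random.default_rng(5)
d, N = 3, 200000
X = rng.standard_normal((N, d))                  # rows: samples of v^T (b = 1)
# E[(v v^*)^2] = E[||v||^2 v v^*] = (d + 2) I by rotation invariance
M2 = (X * (X ** 2).sum(axis=1, keepdims=True)).T @ X / N
print(np.allclose(M2, (d + 2) * np.eye(d), atol=0.3))
```

The same invariance argument gives tightness of every order k, with constants determined by higher chi-squared moments.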
Example 19 (fusion frames). If the columns of a matrix U ∈ K^{d×b}, b < d, are orthonormal, then we can identify U with a subspace in G_{b,d}(K), where G_{b,d}(K) denotes the Grassmann space, that is, the collection of b-dimensional subspaces of K^d. The Haar measure on G_{b,d}(K) then induces a random b-frame of order k for all integers k.
Example 20 (Gabor). Time-frequency structured matrices were considered in [30] in relation to compressed sensing, in which some window vector is modulated and shifted. We use cyclic shifts, which can be performed by applying the matrix T having ones on the lower secondary diagonal, another one in the upper right corner, and zeros everywhere else. The modulation operator M on C^d is the diagonal matrix of the d-th roots of unity. For any nonzero g ∈ C^d, the full Gabor system {M^ℓ T^k g : ℓ, k = 0, . . . , d − 1} has cardinality d² and forms a tight frame for C^d (cf. [31]). We will use the d² × d matrix G whose rows are formed by the tight frame vectors. A short computation yields that if g is chosen at random as the Rademacher sequence, then the associated random row vector is a tight random b-frame of order 1. Moreover, each V_j^* V_j, suitably normalized, is an orthogonal projector, so that the system corresponds to a tight random fusion frame. The same holds when g is the Steinhaus sequence; that is, each entry is uniformly distributed on the complex unit circle.
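The tightness of the full Gabor system is quick to verify numerically. The sketch below builds T and M as described, generates all d² time-frequency shifts of a Rademacher window, and checks that the frame operator is a multiple of the identity (the tight frame bound is d‖g‖² = d², a standard fact we rely on here):

```python
import numpy as np

d = 8
T = np.roll(np.eye(d), 1, axis=0)        # ones on the lower secondary diagonal
                                         # and one in the upper right corner
omega = np.exp(2j * np.pi / d)
M = np.diag(omega ** np.arange(d))       # modulation operator on C^d

rng = np.random.default_rng(6)
g = rng.choice([-1.0, 1.0], size=d)      # Rademacher window
# full Gabor system {M^l T^k g : l, k = 0, ..., d-1}: d^2 vectors
G = np.array([np.linalg.matrix_power(M, l) @ np.linalg.matrix_power(T, k) @ g
              for l in range(d) for k in range(d)])
S = G.conj().T @ G                       # frame operator of the system
print(np.allclose(S, d ** 2 * np.eye(d)))  # tight with bound d * ||g||^2 = d^2
```

The tight bound d² holds for any window with unimodular entries, so Steinhaus windows pass the same check.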
Next, we have an example that indeed allows for fast matrix vector multiplication.

Example 21 (circulant matrices). Given a vector c = (c_1, . . . , c_d)^⊤ ∈ R^d, the corresponding circulant matrix C̃ ∈ R^{d×d} has entries C̃_{j,k} = c_{((j−k) mod d)+1}. (50) Each column of C̃ is a cyclic shift of the previous one. A block of the matrix C̃ was used as a compressed sensing measurement matrix in [32]. If the entries of c are i.i.d. with zero mean and nonvanishing second moments, then C̃ is a tight random frame of order 1 with A_1 = B_1 = d E(c_1²). For instance, if c is the Rademacher sequence, that is, the entries are independent and equal to ±1 with probability 1/2, then C̃ is tight of order 1 but not of order 2 in general. It is well known that the discrete Fourier matrix diagonalizes circulant matrices, so that fast matrix vector multiplications are available. In fact, the terms "filter bank" and "filtering" are usually associated with the application of convolution operators, so that each channel corresponds to a circulant matrix with potentially some subsampling involved.
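Because the discrete Fourier matrix diagonalizes every circulant matrix, applying C̃ reduces to circular convolution, computable in O(d log d) via the FFT. A minimal sketch of this fast matrix vector multiplication:

```python
import numpy as np

def circulant_matvec(c, x):
    """Apply the circulant matrix with first column c via the FFT:
    C x equals the circular convolution ifft(fft(c) * fft(x))."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

rng = np.random.default_rng(7)
d = 16
c = rng.choice([-1.0, 1.0], size=d)          # Rademacher filter taps
x = rng.standard_normal(d)

# dense circulant for comparison: column k is the k-step cyclic shift of c
C = np.column_stack([np.roll(c, k) for k in range(d)])
print(np.allclose(C @ x, circulant_matvec(c, x)))
```

This is exactly the fast algorithmic scheme our construction preserves in each channel.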

Remark 22.
Samples of all of the above examples satisfy conditions (i)-(iv) with high probability for sufficiently large sample size. The circulant matrices represent convolution operators and hence correspond to a proper filter bank scheme. They enable fast matrix vector multiplications; hence, the circulant samples in Example 21 are indeed suitable for our construction in Section 3, which preserves this fast algorithmic scheme using the filter bank shown in Figure 1. Each channel corresponds to filtering, but we require one additional linear operator for pre- and postmultiplication.
It must be mentioned that filter banks usually involve some subsampling. Let D ∈ K^{b×d}, d ≥ b, be a random matrix with a single one in each row, whose position is chosen independently at random in a uniform fashion. Then each matrix of the sample {D_j}_{j=1}^N corresponds to a sampling operator, so that we derive samplings of length b. Indeed, D is a tight random b-frame, but it may not satisfy all other conditions in Theorem 7. Nonetheless, subsampling operators in a filter bank are used in combination with more sophisticated filters, say {C_j}_{j=1}^N ⊂ K^{d×d}, so that it is possible that the conditions are satisfied by {D_j C_j}_{j=1}^N ⊂ K^{b×d}.
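Tightness of the random subsampling operator is immediate in expectation: under our reading (one uniformly placed nonzero per row of D), E[D^* D] = (b/d) I_d. The following Monte Carlo sketch confirms it:

```python
import numpy as np

rng = np.random.default_rng(8)
b, d, N = 3, 6, 20000
acc = np.zeros((d, d))
for _ in range(N):
    pos = rng.integers(0, d, size=b)      # one 1 per row at a uniform position
    D = np.zeros((b, d))
    D[np.arange(b), pos] = 1.0
    acc += D.T @ D                        # accumulate D^* D
print(np.allclose(acc / N, (b / d) * np.eye(d), atol=0.03))  # E[D^* D] = (b/d) I
```

Each fixed realization D^* D is a diagonal 0/1 matrix and thus far from a frame operator on its own, which is why the other conditions of Theorem 7 can fail without additional filtering.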

Closeness to the Original -Frame
To relate the algorithm of the previous section to the Paulsen problem, we would need estimates on the distance between the original and the resulting b-frame. In particular, if the original unit norm b-frame is close to being tight, then we aim to verify that the computed unit norm tight b-frame is nearby. We do not derive any estimates for fixed N but provide a framework for random samples that supports this intuition.

Theorem 23.
Let V be a random matrix continuously distributed on the set of matrices in K^{b×d} and {V_j}_{j=1}^N an associated i.i.d. sample, with Γ^{(N)} being the corresponding limit of the iterative algorithm (10). Then Γ^{(N)} converges almost surely, as N → ∞, towards some positive definite Γ, so that Σ := Γ^{-1} satisfies
Σ = d E[V^* V / trace(V Σ^{-1} V^*)]. (51)
Before we provide the proof, some discussion is in order. As in [14], we observe that the results of the previous section applied to a continuously distributed random matrix yield that (51) has a solution Σ among the hermitian positive definite matrices and that it is unique up to multiplication by a positive constant.
For elliptical distributions, (51) has a very special meaning. Here, we call a probability distribution on K^{b×d} elliptical if it has a density with respect to the standard volume element on K^{b×d} of the form
V ↦ det(Σ)^{-b/2} g(trace((V − μ) Σ^{-1} (V − μ)^*)),
where Σ ∈ K^{d×d} is hermitian positive definite, μ ∈ K^{b×d}, and g is some nonnegative function not dependent on μ and Σ with ∫_{K^{b×d}} g(trace(V^* V)) dV = 1. For instance, the Gaussian random matrix in Example 18 is elliptically distributed. A direct computation yields that the matrix Σ of an elliptically distributed random matrix with μ = 0 satisfies (51). For simplicity, we restrict ourselves to the case μ = 0 and point out that general μ can be handled in a similar fashion; see [14] for b = 1.
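For the Gaussian case of Example 18 we have Σ = I_d, and (51), in our reconstructed form, can be verified by Monte Carlo simulation: rotation invariance gives d E[V^* V / trace(V V^*)] = I_d. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(9)
b, d, N = 2, 4, 100000
X = rng.standard_normal((N, b, d))             # i.i.d. Gaussian b x d matrices
num = np.einsum('nbi,nbj->nij', X, X)          # V^* V per sample
den = np.einsum('nii->n', num)                 # trace(V Sigma^{-1} V^*), Sigma = I
est = d * (num / den[:, None, None]).mean(axis=0)
print(np.allclose(est, np.eye(d), atol=0.02))  # Sigma = I solves (51)
```

Note that the estimator is scale-free: rescaling every sample leaves V^* V / trace(V V^*) unchanged, mirroring the scale invariance of (51).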
Theorem 23 directly implies the following.
It remains to prove the two Lemmas 25 and 26.
Proof of Lemma 26. We can simply follow the lines of [14, Proof of Statement (3.3)], where b = 1 is discussed. A first order expansion of h combined with the frame property and Kantorovich's inequality yields Lemma 26. No new ideas are involved when dealing with b > 1, so we refer to [14] for the details.

Some Concluding Remarks
For some signal processing aspects, the most attractive filter bank schemes are those that provide perfect reconstruction, in which synthesis is the adjoint of the analysis scheme (so-called unitary filter banks) and all filters have equal norm. Tight fusion frames, for instance, correspond to perfect reconstruction filter banks in which each channel corresponds to an orthogonal projection, and it was verified in [9] that robustness of tight fusion frames against distortions and erasures is maximized when the tight fusion frame has equal norm elements. Our aim was to turn a given filter bank into such a more attractive scheme while preserving the essential features of the original filtering process. In terms of frames, we turned a given generalized frame into a tight b-frame with unit norm by rescaling and then applying the inverse square root of the new b-frame operator. Due to our special focus on filter banks, we started with a generalized frame consisting of convolution operators, hence allowing for fast matrix vector multiplications. Through an iterative scheme, we constructed a generalized tight frame with unit norm, which induces a filter bank that preserves the convolution structure, and hence the fast algorithmic scheme, in each channel. Only one additional global pre- and postmultiplication by Γ^{1/2} is necessary. Naturally, the application of Γ^{1/2} needs special care because it may be structured but is not exactly a convolution operator.
We observed that the assumptions of our algorithm are satisfied by any sufficiently large sample drawn from any continuous distribution or from random convolution operators. Fields of application are filter banks in which the additional computational cost of applying Γ^{1/2} or Γ, respectively, can be tolerated, for instance, when the number of channels is large or when computations are performed completely offline.
Our findings provide a tool to design new filter banks with improved properties on a theoretical level. Substantial numerical verification goes beyond the scope of the present paper and will be provided in future work. We hope that our theoretical findings can provide the basis for use in more elaborate signal processing methods.

Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.