1. Introduction
The Cauchy-Bunyakovsky-Schwarz, or for short the C.B.S.-inequality, plays an important role in different branches of Modern Mathematics including Hilbert Space Theory, Probability & Statistics, Classical Real and Complex Analysis, Numerical Analysis, and Qualitative Theory of Differential Equations and their applications.
Given the n-dimensional complex space $\mathbb{C}^n$ and two linear subspaces U and V such that
(1)  $U \cap V = \{0\}$,
there exists
(2)  $\gamma = \gamma(U,V) \in [0,1)$
such that for all x∈U and y∈V the following strengthened C.B.S.-inequality holds (see [1]):
(3)  $|x^* y| \le \gamma \, \|x\| \cdot \|y\|$,
where ∥·∥ denotes the standard Euclidean norm. The smallest such γ may be called the cosine of the angle ϕ between the subspaces U and V, or the C.B.S.-ratio of U and V.
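As a small illustration of the C.B.S.-ratio, the following sketch computes γ for two planes in the real space R^4. It relies on the standard fact that, for orthonormal bases of U and V, γ is the largest singular value of the matrix of inner products between the basis vectors. The function name `cbs_ratio_planes` and the example subspaces are assumptions made for this illustration only.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cbs_ratio_planes(u_basis, v_basis):
    """C.B.S.-ratio of two planes given by orthonormal bases (two vectors each).

    gamma is the largest singular value of the 2x2 matrix M = [u_i . v_j].
    """
    (a, b), (c, d) = [[dot(u, v) for v in v_basis] for u in u_basis]
    s = a * a + b * b + c * c + d * d        # trace of M^T M
    det = a * d - b * c                      # det M, so det(M^T M) = det^2
    disc = max(s * s - 4.0 * det * det, 0.0)
    return math.sqrt((s + math.sqrt(disc)) / 2.0)

# Hypothetical example: U = span{e1, e2}; V is U tilted by alpha and beta.
alpha, beta = math.pi / 3, math.pi / 4
U = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
V = [[math.cos(alpha), 0.0, math.sin(alpha), 0.0],
     [0.0, math.cos(beta), 0.0, math.sin(beta)]]

gamma = cbs_ratio_planes(U, V)   # equals max(cos(alpha), cos(beta)) here
phi = math.acos(gamma)           # angle between U and V
```

In this example the two planes meet only at the origin, so γ is strictly less than 1 and the angle ϕ is the smaller of the two tilt angles.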
The strengthened C.B.S.-inequality has a long history and exists in various versions. The earliest result of this kind is due to Wielandt [2] and was later generalized by many researchers. Among these generalizations, two important extensions of the Wielandt inequality were given by Bauer and Householder [3] and by Wang and Ip [4]. The aim of this paper is to present a matrix version of the Bauer-Householder inequality, analogous to the matrix version of the Wielandt inequality given by Wang and Ip in [4].
On the practical side, the strengthened C.B.S.-inequality has been used in the analysis of two-level methods. The survey by Eijkhout and Vassilevski [1] presents the basic theory of this inequality and its applications in multilevel methods for the solution of linear systems arising from finite element or finite difference discretisation of elliptic partial differential equations. Auzinger and Kirlinger [5] proposed another extension of this inequality for the resolvent conditions in the Kreiss matrix theorem [6].
Throughout the paper, we denote by R(X) the range of the matrix $X \in \mathbb{C}^{n\times p}$,
(4)  $R(X) = \{x \mid x = Xz,\ z \in \mathbb{C}^p\}$,
and by W(A) the (closed) numerical range of an operator A on the space $\mathbb{C}^n$,
(5)  $W(A) = \{x^* A x \mid x \in R(A),\ \|x\| = 1\}$,
and by $L(a_1, a_2, \ldots, a_p)$ the linear subspace spanned by $\{a_1, a_2, \ldots, a_p\}$ with $a_i \in \mathbb{C}^n$, i=1,2,…,p.
Definition 1.
For $A \in \mathbb{C}^{n\times n}$, $P_A = AA^+$ denotes the orthogonal projector onto the column space (range) of A, where $A^+$ is the Moore-Penrose inverse of A.
Definition 2.
For two n×n positive semidefinite Hermitian matrices A and B, we say that A is below B with respect to the Löwner partial ordering, and we write $A \le_L B$, if B-A is positive semidefinite.
2. A Matrix Inequality
Let A be an n×n positive definite Hermitian matrix. For any two nonzero complex vectors x∈U and y∈V, Bauer and Householder [3] asserted that
(6)  $\dfrac{(x^* A y)^2}{x^* A x \cdot y^* A y} \le \cos^2\theta$,
where θ satisfies
(7)  $\cot^2\dfrac{\theta}{2} = \dfrac{\lambda_1}{\lambda_n}\cot^2\dfrac{\phi}{2}$,
with ϕ∈(0,π/2] being the angle between the two subspaces U and V mentioned above, and with $\lambda_1$ and $\lambda_n$ being the largest and smallest (necessarily real and positive) eigenvalues of A.
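The bound in (6) and (7) can be checked numerically in the simplest setting. The sketch below assumes a real diagonal 2×2 matrix A and one-dimensional subspaces U = span{x}, V = span{y} at a fixed angle ϕ; the helper names `theta_from_phi` and `a_ratio` and the chosen eigenvalues are illustrative assumptions.

```python
import math

def theta_from_phi(phi, lam_max, lam_min):
    """Solve cot^2(theta/2) = (lam_max/lam_min) * cot^2(phi/2) for theta, as in (7)."""
    return 2.0 * math.atan(math.tan(phi / 2.0) / math.sqrt(lam_max / lam_min))

def a_ratio(x, y, lam):
    """Left-hand side of (6) for A = diag(lam) and real vectors x, y."""
    xAy = sum(l * a * b for l, a, b in zip(lam, x, y))
    xAx = sum(l * a * a for l, a in zip(lam, x))
    yAy = sum(l * b * b for l, b in zip(lam, y))
    return xAy ** 2 / (xAx * yAy)

lam = (4.0, 1.0)                 # assumed eigenvalues of A
phi = math.pi / 3                # angle between the lines span{x} and span{y}
theta = theta_from_phi(phi, max(lam), min(lam))
bound = math.cos(theta) ** 2     # right-hand side of (6)

# Rotate the pair of lines in one-degree steps, keeping their angle fixed at phi.
ts = [-phi / 2 + k * math.pi / 180 for k in range(180)]
worst = max(a_ratio((math.cos(t), math.sin(t)),
                    (math.cos(t + phi), math.sin(t + phi)), lam) for t in ts)
```

In this configuration the largest ratio occurs when the two lines are placed symmetrically about the eigenvector of the largest eigenvalue, where the sampled value coincides with the bound up to rounding.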
A very interesting and important special case of (6) is the Wielandt inequality
(8)  $\dfrac{(x^* A y)^2}{x^* A x \cdot y^* A y} \le \left(\dfrac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}\right)^2$
established by Wielandt [2] when ϕ=π/2. The upper bound in (8) is called the Wielandt ratio (see [7]). Wang and Ip [4] generalized this inequality as follows.
Let X and Y be complex n×p and n×q matrices, respectively. If $X^*Y = 0$, then for all generalized inverses $(Y^*AY)^-$,
(9)  $X^*AY\,(Y^*AY)^-\,Y^*AX \le_L \left(\dfrac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}\right)^2 X^*AX$
in the Löwner partial ordering.
The first statistical application of the Wielandt inequality seems to be due to Eaton [8]. There are various equivalent versions of (8) in the literature. The most important of them is the famous Kantorovich inequality [9, 10], which was used to estimate the convergence rate of the steepest descent method for quadratic minimization problems. Historical and biographical remarks on them can be found in [7].
Lemma 3 (see Zhan, [11]).
Let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix with 2×2 block form
(10)  $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$.
Then A is positive semidefinite if and only if $A_{11}$ and $A_{22}$ are positive semidefinite and there is a contraction matrix W such that $A_{12} = A_{11}^{1/2} W A_{22}^{1/2}$.
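If A is partitioned as in (10) and
(11)  $X = \begin{pmatrix} I_p \\ 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 \\ I_{n-p} \end{pmatrix}$,
then (9), applied with the roles of X and Y interchanged, becomes
(12)  $A_{21} A_{11}^{-1} A_{12} \le_L \left(\dfrac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}\right)^2 A_{22}$,
so that the contraction matrix W of Lemma 3 satisfies
(13)  $W^*W \le_L \left(\dfrac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}\right)^2 I$.
In other words, if A is positive definite with the 2×2 block form (10), then there is a contraction matrix W, with maximum singular value at most $(\lambda_1-\lambda_n)/(\lambda_1+\lambda_n)$, such that $A_{12} = A_{11}^{1/2} W A_{22}^{1/2}$.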
The main result of this paper is stated in the following theorem.
Theorem 4.
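Let $A \in \mathbb{C}^{n\times n}$ be a positive semidefinite Hermitian matrix with rank r≤n and eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0$. For any $X \in \mathbb{C}^{n\times p}$ and $Y \in \mathbb{C}^{n\times q}$, if ϕ∈(0,π/2] is the angle between the two vector spaces $R(P_A X)$ and $R(P_A Y)$ and
(14)  $\cot^2\dfrac{\theta}{2} = \dfrac{\lambda_1}{\lambda_r}\cot^2\dfrac{\phi}{2}$,
then
(15)  $X^*AY\,(Y^*AY)^-\,Y^*AX \le_L \cos^2\theta \cdot X^*AX$.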
Theorem 4 will be proved in the next section. It states a very general form of a (strengthened) C.B.S.-inequality that covers various types of C.B.S.-inequalities and their matrix forms. For instance, the inequality (15) reduces to (6) when p=q=1 and to (9) when ϕ=π/2.
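The reduction in the case ϕ=π/2 can be written out as a short worked computation. It uses only the definition of θ in Theorem 4 and the half-angle identity for the cosine; when A is nonsingular, r=n and the constant is exactly the Wielandt ratio appearing in (9).

```latex
\cot^2\frac{\theta}{2}
  = \frac{\lambda_1}{\lambda_r}\cot^2\frac{\pi}{4}
  = \frac{\lambda_1}{\lambda_r}
\quad\Longrightarrow\quad
\cos\theta
  = \frac{\cot^2(\theta/2)-1}{\cot^2(\theta/2)+1}
  = \frac{\lambda_1-\lambda_r}{\lambda_1+\lambda_r},
\qquad
\cos^2\theta
  = \left(\frac{\lambda_1-\lambda_r}{\lambda_1+\lambda_r}\right)^{2}.
```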
Theorem 4 can be rewritten in the following equivalent form.
Theorem 5.
Under the assumptions of Theorem 4, there is a contraction matrix W (from Lemma 3), with maximum singular value at most cos θ, such that
(16)  $X^*AY = (X^*AX)^{1/2}\, W\, (Y^*AY)^{1/2}$.
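The maximum singular value of the contraction matrix W in Theorem 5 might be strictly less than cos θ. For example, if
(17)  $A = \begin{pmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} \cos\psi & \sin\psi \\ -\sin\psi & \cos\psi \end{pmatrix} = \begin{pmatrix} \lambda_1\cos^2\psi+\lambda_2\sin^2\psi & (\lambda_1-\lambda_2)\sin\psi\cos\psi \\ (\lambda_1-\lambda_2)\sin\psi\cos\psi & \lambda_1\sin^2\psi+\lambda_2\cos^2\psi \end{pmatrix}$,
then $W = A_{11}^{-1/2} A_{12} A_{22}^{-1/2}$ satisfies
(18)  $W^*W = \dfrac{(\lambda_1-\lambda_2)^2\sin^2\psi\cos^2\psi}{(\lambda_1\cos^2\psi+\lambda_2\sin^2\psi)(\lambda_1\sin^2\psi+\lambda_2\cos^2\psi)} = \dfrac{(\lambda_1-\lambda_2)^2\sin^2\psi\cos^2\psi}{(\lambda_1-\lambda_2)^2\sin^2\psi\cos^2\psi+\lambda_1\lambda_2}$.
That is, the singular value of the contraction matrix W belongs to the interval $[0, (\lambda_1-\lambda_2)/(\lambda_1+\lambda_2)]$.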
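The two expressions in (18) and the stated range of the singular value can be checked numerically. The following is a minimal sketch for assumed eigenvalues λ1=5 and λ2=2; the function names are chosen for this illustration only.

```python
import math

def w_squared(lam1, lam2, psi):
    """W*W from (18), computed from the entries of A in (17)."""
    c, s = math.cos(psi), math.sin(psi)
    a11 = lam1 * c * c + lam2 * s * s
    a12 = (lam1 - lam2) * s * c
    a22 = lam1 * s * s + lam2 * c * c
    return a12 * a12 / (a11 * a22)

def w_squared_closed_form(lam1, lam2, psi):
    """Right-most expression in (18)."""
    t = ((lam1 - lam2) * math.sin(psi) * math.cos(psi)) ** 2
    return t / (t + lam1 * lam2)

lam1, lam2 = 5.0, 2.0                              # assumed eigenvalues
wielandt = ((lam1 - lam2) / (lam1 + lam2)) ** 2    # squared upper end of the interval
psis = [k * math.pi / 16 for k in range(17)]       # psi from 0 to pi
values = [w_squared(lam1, lam2, p) for p in psis]
```

The value 0 is attained at ψ=0, where A is diagonal, and the upper end of the interval is attained at ψ=π/4; for every other sampled ψ the singular value lies strictly in between.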
3. Proof of the Main Result
In this section we present an elementary proof of Theorem 4 by a biorthogonal procedure.
The C.B.S.-ratio γ in (3) can be redefined as
(19)  $\gamma = \max\{\, |x^* y| \mid \|x\| = \|y\| = 1,\ x \in U,\ y \in V \,\}$.
Since in finite dimensional spaces the unit sphere is compact, the maximum in (19) is attained. The following result is obvious.
Lemma 6.
Let γ be the C.B.S.-ratio of two subspaces U and V of $\mathbb{C}^n$ satisfying (1). Then there exist two unit vectors u∈U and v∈V satisfying $\gamma = u^* v$ such that
(20)  $u - \gamma v \perp V, \quad v - \gamma u \perp U$.
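The orthogonality relations (20) can be checked by hand in a small real example. The sketch below assumes U is a line and V a coordinate plane in R^3, so that the maximizing v is the normalized orthogonal projection of u onto V; all data are hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical data: U = span{u} (a line), V = the xy-plane with orthonormal basis v1, v2.
u = [0.6, 0.0, 0.8]
v1, v2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]

# Orthogonal projection of u onto V: its length is gamma, its direction is v.
proj = [dot(u, v1) * a + dot(u, v2) * b for a, b in zip(v1, v2)]
gamma = math.sqrt(dot(proj, proj))
v = [p / gamma for p in proj]

r_u = [ui - gamma * vi for ui, vi in zip(u, v)]   # u - gamma*v, to be orthogonal to V
r_v = [vi - gamma * ui for ui, vi in zip(u, v)]   # v - gamma*u, to be orthogonal to U
```

Here γ = u·v = 0.6, and both residual vectors are orthogonal to the opposite subspace, as Lemma 6 states.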
One direct consequence of Lemma 6 is the following theorem.
Theorem 7.
Let U and V be p- and q-dimensional linear subspaces of $\mathbb{C}^n$ satisfying (1) with p,q≥2 and p+q≤n. Then there exist two orthonormal bases $\{u_1, u_2, \ldots, u_p\}$ of U and $\{v_1, v_2, \ldots, v_q\}$ of V such that for each i=1,2,…,min{p,q}
(21)  $u_i \perp L(v_{i+1}, \ldots, v_q), \quad v_i \perp L(u_{i+1}, \ldots, u_p)$.
Proof.
We shall achieve the desired result by the following biorthogonal process. Start with $U_1 = U$, $V_1 = V$, and i=1. By Lemma 6, one finds two unit vectors $u_i \in U_i$ and $v_i \in V_i$ such that
(22)  $\gamma_i = u_i^* v_i$,
(23)  $u_i - \gamma_i v_i \perp V_i, \quad v_i - \gamma_i u_i \perp U_i$,
where $\gamma_i$ is the C.B.S.-ratio of $U_i$ and $V_i$ with $\gamma_i \le \gamma$. If $\dim(U_i) = 1$ or $\dim(V_i) = 1$, then the procedure is completed; otherwise, update $U_i$ and $V_i$ by setting
(24)  $U_{i+1} = \left\{ x - \dfrac{x^* v_i}{\gamma_i} u_i \,\middle|\, x \in U_i \right\}, \quad V_{i+1} = \left\{ y - \dfrac{y^* u_i}{\gamma_i} v_i \,\middle|\, y \in V_i \right\}$.
It is easily proved that
(25)  $u_i \perp V_{i+1}, \quad v_i \perp U_{i+1}$.
Replace i by i+1, and repeat the above procedure until i=min{p,q}.
If p is not equal to q, one finds an orthonormal basis $\{u_{q+1}, \ldots, u_p\}$ of $U_{q+1}$ (q<p) or $\{v_{p+1}, \ldots, v_q\}$ of $V_{p+1}$ (p<q). This procedure generates two bases $\{u_1, u_2, \ldots, u_p\}$ of U and $\{v_1, v_2, \ldots, v_q\}$ of V such that $U_i = L(u_i, u_{i+1}, \ldots, u_p)$ and $V_i = L(v_i, v_{i+1}, \ldots, v_q)$ for each i=1,2,…,min{p,q}. Equations (23) and (25) imply that
(26)  $u_i \perp U_{i+1}, \quad v_i \perp V_{i+1}$,
so that these two bases are orthonormal. Finally, (21) holds since it is equivalent to (25).
To obtain our main result, we need the following lemmas.
Lemma 8.
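Let W(A) be the numerical range of a linear operator A on the space $\mathbb{C}^n$. Then
(27)  $W(X^*AX) \subseteq W(A) \cdot W(X^* P_A X)$
for any operator $X: \mathbb{C}^k \to \mathbb{C}^n$ (k∈ℕ arbitrary), where the product of two sets $S, T \subseteq \mathbb{C}$ is defined by $S \cdot T = \{st : s \in S,\ t \in T\}$.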
Proof.
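Let $z \in \mathbb{C}^k$ with ∥z∥=1. If $P_A X z = 0$, then $0 = (X^*AXz, z) \in W(A) \cdot W(X^* P_A X)$. If $P_A X z \ne 0$, then, since $A = AA^+AA^+A = P_A A P_A$,
(28)  $(X^*AXz, z) = \|P_A X z\|^2 \left( A \dfrac{P_A X z}{\|P_A X z\|}, \dfrac{P_A X z}{\|P_A X z\|} \right) \in W(A) \cdot W(X^* P_A X)$,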
which results in the desired assertion.
When A is invertible, Lemma 8 was established by Fujii [12].
Lemma 9.
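If p≤q, then the matrix
(29)  $C = \begin{pmatrix} I_p & B \\ B^T & I_q \end{pmatrix}$
has the eigenvalues $1 \pm \gamma_1, \ldots, 1 \pm \gamma_p$, where $I_p$ and $I_q$ are the p×p and q×q identity matrices, and $B = (\Lambda, 0) \in \mathbb{C}^{p\times q}$ with the p×p diagonal matrix $\Lambda = \mathrm{diag}(\gamma_1, \gamma_2, \ldots, \gamma_p)$. Furthermore, if p<q, then C has the multiple eigenvalue 1.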
Proof.
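For a fixed i∈{1,…,p}, if two vectors $\xi = (\xi_1, \ldots, \xi_j, \ldots, \xi_{p+q})^T$ and $\eta = (\eta_1, \ldots, \eta_j, \ldots, \eta_{p+q})^T$ are defined by
(30)  $\xi_j = \begin{cases} 1, & j = i,\ p+i, \\ 0, & j \ne i,\ p+i, \end{cases} \qquad \eta_j = \begin{cases} 1, & j = i, \\ -1, & j = p+i, \\ 0, & j \ne i,\ p+i, \end{cases}$
then they are eigenvectors of C with the eigenvalues $1+\gamma_i$ and $1-\gamma_i$, respectively.
If p<q, for each j=2p+1,…,p+q, the jth column vector of the (p+q)×(p+q) identity matrix is an eigenvector of C with the multiple eigenvalue 1. The proof is completed.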
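The eigenpairs used in this proof can be verified directly for a concrete choice of the γ's. The sketch below builds C from (29) with plain Python lists and measures the largest entry of Cv-λv over all claimed pairs; the values of γ and q are assumptions for illustration.

```python
def build_c(gammas, q):
    """Matrix C of (29) with B = (diag(gammas), 0) of size p x q, p <= q."""
    p = len(gammas)
    n = p + q
    c = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, g in enumerate(gammas):
        c[i][p + i] = g      # entry of the block B
        c[p + i][i] = g      # entry of the block B^T
    return c

def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def claimed_eigenpairs(gammas, q):
    """Pairs (eigenvalue, eigenvector) as given in (30) and for p < q."""
    p = len(gammas)
    n = p + q
    pairs = []
    for i, g in enumerate(gammas):
        xi = [0.0] * n
        eta = [0.0] * n
        xi[i], xi[p + i] = 1.0, 1.0
        eta[i], eta[p + i] = 1.0, -1.0
        pairs.append((1.0 + g, xi))
        pairs.append((1.0 - g, eta))
    for j in range(2 * p, n):    # 0-based indices of columns 2p+1, ..., p+q
        e = [0.0] * n
        e[j] = 1.0
        pairs.append((1.0, e))
    return pairs

def max_residual(gammas, q):
    """Largest absolute entry of C v - lambda v over all claimed eigenpairs."""
    c = build_c(gammas, q)
    return max(abs(cv - lam * v)
               for lam, vec in claimed_eigenpairs(gammas, q)
               for cv, v in zip(matvec(c, vec), vec))

residual = max_residual([0.8, 0.5, 0.1], q=5)   # p = 3, q = 5
```

With p=3 and q=5 the construction yields 2p+(q-p)=8 linearly independent eigenvectors, which matches the size of C.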
Finally, we give the proof of Theorem 4.
Proof of Theorem 4.
Let U and V be matrices whose columns form orthonormal bases of the ranges $R(P_A X)$ and $R(P_A Y)$, respectively, of dimensions p and q, chosen as in Theorem 7. Without loss of generality, we may assume that p≤q. Let $U^*V = (\Lambda, 0) \in \mathbb{C}^{p\times q}$, where $\Lambda = \mathrm{diag}(\gamma_1, \gamma_2, \ldots, \gamma_p)$ is a p×p diagonal matrix with $\gamma_i \le \gamma$ for each i=1,2,…,p and γ the C.B.S.-ratio of $R(P_A X)$ and $R(P_A Y)$. There exist two matrices $X_1 \in \mathbb{C}^{p\times p_1}$ and $Y_1 \in \mathbb{C}^{q\times q_1}$ such that $P_A X = U X_1$ and $P_A Y = V Y_1$.
Letting Z=(U,V), the matrix $C = Z^* P_A Z$ can be expressed in the form (29), so that $W(Z^* P_A Z) = [1-\gamma, 1+\gamma]$ by Lemma 9. Lemma 8 shows that the matrix
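(31)  $Q = Z^*AZ = \begin{pmatrix} U^*AU & U^*AV \\ V^*AU & V^*AV \end{pmatrix}$
has its largest and smallest eigenvalues bounded above by $\lambda_1(1+\gamma)$ and below by $\lambda_r(1-\gamma)$, respectively. Since (14) holds and
(32)  $\dfrac{1+\gamma}{1-\gamma} = \dfrac{1+\cos\phi}{1-\cos\phi} = \cot^2\dfrac{\phi}{2}$,
the Wielandt ratio of Q is bounded by
(33)  $\left( \dfrac{\lambda_1(1+\gamma) - \lambda_r(1-\gamma)}{\lambda_1(1+\gamma) + \lambda_r(1-\gamma)} \right)^2 = \left( \dfrac{(\lambda_1/\lambda_r)\cot^2(\phi/2) - 1}{(\lambda_1/\lambda_r)\cot^2(\phi/2) + 1} \right)^2 = \left( \dfrac{\cot^2(\theta/2) - 1}{\cot^2(\theta/2) + 1} \right)^2 = \cos^2\theta$,
so that the matrix
(34)  $\begin{pmatrix} U^*AU & U^*AV \\ V^*AU & \cos^2\theta \cdot V^*AV \end{pmatrix}$
is positive semidefinite, by applying the Schur complement theory (see [13, 14]) to the inequality (9). It follows that the matrix
(35)  $\begin{pmatrix} X^*AX & X^*AY \\ Y^*AX & \cos^2\theta \cdot Y^*AY \end{pmatrix} = \begin{pmatrix} X_1^* & 0 \\ 0 & Y_1^* \end{pmatrix} \begin{pmatrix} U^*AU & U^*AV \\ V^*AU & \cos^2\theta \cdot V^*AV \end{pmatrix} \begin{pmatrix} X_1 & 0 \\ 0 & Y_1 \end{pmatrix}$
is positive semidefinite. Applying the Schur complement theory again yields the desired result.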
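The chain of equalities in (32) and (33) can be sanity-checked numerically. The following sketch compares the Wielandt ratio formed from λ1(1+γ) and λr(1-γ) with cos²θ, where θ is obtained from (14); the eigenvalues and the sampled angles are assumed values for illustration.

```python
import math

def wielandt_ratio(mu_max, mu_min):
    """Squared Wielandt ratio ((mu_max - mu_min) / (mu_max + mu_min))^2."""
    return ((mu_max - mu_min) / (mu_max + mu_min)) ** 2

def cos2_theta(phi, lam1, lamr):
    """cos^2(theta), with theta solving cot^2(theta/2) = (lam1/lamr) cot^2(phi/2)."""
    theta = 2.0 * math.atan(math.tan(phi / 2.0) * math.sqrt(lamr / lam1))
    return math.cos(theta) ** 2

lam1, lamr = 7.0, 0.5                     # assumed extreme nonzero eigenvalues of A
checks = []
for k in range(1, 10):
    phi = k * math.pi / 18                # 10, 20, ..., 90 degrees
    gamma = math.cos(phi)                 # C.B.S.-ratio corresponding to phi
    lhs = wielandt_ratio(lam1 * (1 + gamma), lamr * (1 - gamma))
    checks.append((phi, lhs, cos2_theta(phi, lam1, lamr)))
```

For each sampled ϕ the two quantities agree up to rounding, and at ϕ=π/2 both reduce to the squared Wielandt ratio of λ1 and λr.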