We consider the low-rank approximation problem arising in the
generalized Karhunen-Loeve transform. A sufficient condition for the existence of a solution is
derived, and the analytical expression of the solution is given. A numerical algorithm is proposed
to compute the solution. The new algorithm is illustrated by numerical experiments.
1. Introduction
Throughout this paper, we use Rm×n to denote the set of m×n real matrices. We use AT and A+ to denote the transpose and the Moore-Penrose generalized inverse of the matrix A, respectively. The symbol On×n stands for the set of all n×n orthogonal matrices. The symbols rank(A) and ∥A∥F stand for the rank and the Frobenius norm of the matrix A, respectively. For a=(ai)∈Rn, the symbol ∥a∥ stands for the l2-norm of the vector a, that is, ∥a∥=(∑i=1n ai2)1/2. The symbol A1/2 stands for the square root of the matrix A, that is, (A1/2)2=A. For a random vector x=(xi)∈Rn, we use E{xi} to stand for the expected value of the ith entry xi, and we use E{xxT}=(eij)n×n to stand for the covariance matrix of x, where eij=E[(xi-E{xi})(xj-E{xj})], i,j=1,2,…,n.
The generalized Karhunen-Loeve transform is a well-known signal processing technique for data compression and filtering (see [1–4] for more details). A simple description of the generalized Karhunen-Loeve transform is as follows. Given two random vectors x∈Rn, s∈Rm and an integer d (1≤d<min{m,n}), the generalized Karhunen-Loeve transform is represented by a matrix T*, which is a solution of the following minimization problem (see [1, 4]):
(1)minT∈Rm×n,rank(T)=dE{∥s-Tx∥2}.
Here the vector s depends on some prior knowledge about the data x.
Without the rank constraint on T, the solution of the minimization problem (1) is
(2)T0=RsxRx+,
where Rsx=E{sxT}, Rx=E{xxT}. The minimization problem with this case is associated with the well-known concept of Wiener filtering (see [3]).
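As a concrete sanity check, the unconstrained Wiener solution (2) can be estimated from sampled data with NumPy. The data model below (a random mixing matrix M plus small noise, zero-mean samples) is purely illustrative and is an assumption of this sketch, not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data model (an assumption of this sketch): s = Mx + small noise,
# zero-mean samples, so T0 should recover M closely.
N, n, m = 2000, 4, 3
X = rng.standard_normal((n, N))          # N samples of x in R^n, as columns
M = rng.standard_normal((m, n))          # hypothetical mixing matrix
S = M @ X + 0.1 * rng.standard_normal((m, N))

Rx  = X @ X.T / N                        # empirical E{x x^T}
Rsx = S @ X.T / N                        # empirical E{s x^T}
T0  = Rsx @ np.linalg.pinv(Rx)           # Wiener solution (2): T0 = Rsx Rx^+

# with small noise and many samples, T0 is close to the true M
assert np.linalg.norm(T0 - M) / np.linalg.norm(M) < 0.1
```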
With the rank constraint on T, that is, rank(T)=d, we first consider the cost function of the minimization problem (1). By using the fact RsxRxRx+=Rsx and the four Moore-Penrose equations of Rx+, it is easy to verify that (see also [1])
(3)E{∥s-Tx∥2}=tr{(T-T0)Rx(T-T0)T}+E{∥s-T0x∥2}.
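The identity (3) can be verified numerically: when Rx and Rsx are built from the same samples, T0Rx=Rsx holds exactly, so the decomposition is exact for empirical expectations. The sizes and the data below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative zero-mean samples; all expectations are empirical averages.
N, n, m = 500, 4, 3
X = rng.standard_normal((n, N))
S = rng.standard_normal((m, n)) @ X + rng.standard_normal((m, N))

Rx  = X @ X.T / N                    # E{x x^T}
Rsx = S @ X.T / N                    # E{s x^T}
T0  = Rsx @ np.linalg.pinv(Rx)       # unconstrained minimizer (2)

T = rng.standard_normal((m, n))      # an arbitrary competitor T

lhs = np.mean(np.sum((S - T @ X) ** 2, axis=0))        # E{||s - Tx||^2}
rhs = np.trace((T - T0) @ Rx @ (T - T0).T) \
      + np.mean(np.sum((S - T0 @ X) ** 2, axis=0))     # right side of (3)
assert abs(lhs - rhs) < 1e-8 * lhs                     # identity holds exactly
```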
Since the covariance matrix Rx is symmetric nonnegative definite, it can be factorized as
(4)Rx=Rx1/2(Rx1/2)T.
Substituting (4) into (3) gives rise to
(5) E{∥s-Tx∥2} = tr{(T-T0)Rx1/2(Rx1/2)T(T-T0)T} + E{∥s-T0x∥2}
  = tr{[(T-T0)Rx1/2][(T-T0)Rx1/2]T} + E{∥s-T0x∥2}
  = ∥(T-T0)Rx1/2∥F2 + E{∥s-T0x∥2}
  = ∥T0Rx1/2-TRx1/2∥F2 + E{∥s-T0x∥2},
since E{∥s-T0x∥2} is a constant, we have
(6)minT∈Rm×n,rank(T)=dE{∥s-Tx∥2}=minT∈Rm×n,rank(T)=d∥T0Rx1/2-TRx1/2∥F2+E{∥s-T0x∥2},
that is to say, minimizing E{∥s-Tx∥2} is equivalent to minimizing ∥T0Rx1/2-TRx1/2∥F2. Therefore, we can find the solution T* of (1) by solving the minimization problem
(7)minT∈Rm×n,rank(T)=d∥T0Rx1/2-TRx1/2∥F,
which can be summarized as the following low rank approximation problem:
Problem 1.
Given two matrices A∈Rm×n, B∈Rp×n and an integer d, 1≤d<min{m,p}, find a matrix X^∈Rm×p of rank d such that
(8)∥A-X^B∥F=minX∈Rm×p,rank(X)=d∥A-XB∥F.
In the last few years there has been constantly increasing interest in developing the theory and numerical methods for low rank approximations of a matrix, due to their wide applications. A well-known method for low rank approximation is the singular value decomposition (SVD) [5, 6]. When the desired rank is relatively low and the matrix is large and sparse, a complete SVD becomes too expensive. Less expensive alternatives, such as the Lanczos bidiagonalization process [7] and the Monte Carlo algorithm [8], are available. To speed up the computation of the SVD, random sampling has been employed in [9]. Recently, Ye [10] proposed the generalized low rank approximations of matrices (GLRAM) method, which has been shown to require less computational time than the traditional SVD-based methods in practical applications. Later, the GLRAM method was revisited and extended by Liu et al. [11] and Liang and Shi [12]. In some applications, we need to emphasize important parts and deemphasize unimportant parts of the data matrix, so weighted low rank approximations have been considered by many authors, and numerical methods such as a Newton-like algorithm [13], the left versus right representations method [14], and an unconstrained optimization method [15] have been proposed. Recently, by using the hierarchical identification principle [16], which regards the known matrix as the system parameter matrix to be identified, Ding et al. and Xie et al. presented gradient-based iterative algorithms [16–21] and least-squares-based iterative algorithms [22, 23] for solving matrix equations. These methods are innovative and computationally efficient.
A common and practical method to tackle the low rank approximation Problem 1 is the singular value decomposition (SVD) (see, e.g., [1]). We briefly review the SVD method as follows. The rank-d matrix XB minimizing (8) is known [5, page 69] to satisfy
(9)XB=Ad=∑i=1dσiuiviT,
where Ad denotes the rank-d SVD truncation of A; that is, if A has the SVD
(10)A=∑i=1rank(A)σiuiviT,
then Ad=∑i=1dσiuiviT. If the matrix B is square and nonsingular, then by (9) we obtain that the solution of Problem 1 is
(11)X=AdB-1=(∑i=1dσiuiviT)B-1.
The SVD method has two disadvantages: (1) it requires the matrix B to be square and nonsingular; (2) to derive the solution (11), we must compute the inverse of B, which is computationally expensive.
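For reference, a minimal NumPy sketch of the SVD method (9)–(11), assuming, as the method requires, that B is square and nonsingular; the test matrices are illustrative:

```python
import numpy as np

def svd_method(A, B, d):
    """Rank-d solution (11): X = A_d B^{-1}; requires B square and nonsingular."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Ad = (U[:, :d] * s[:d]) @ Vt[:d, :]       # rank-d SVD truncation A_d
    return Ad @ np.linalg.inv(B)

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
B = rng.standard_normal((6, 6)) + 5 * np.eye(6)   # comfortably nonsingular

X = svd_method(A, B, d=3)

# rank(X) = 3, and since A - XB = A - A_3 the residual equals the
# l2-norm of the trailing singular values of A
s = np.linalg.svd(A, compute_uv=False)
assert np.linalg.matrix_rank(X) == 3
assert np.isclose(np.linalg.norm(A - X @ B), np.sqrt((s[3:] ** 2).sum()))
```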
In this paper, we develop a new method for the low rank approximation Problem 1 which avoids the disadvantages of the SVD method. We first transform Problem 1 into finding a fixed rank solution of a matrix equation and then use the generalized singular value decomposition (GSVD) to solve it. Based on these, we derive a sufficient condition for the existence of a solution of Problem 1 and give the analytical expression of the solution. A numerical algorithm is proposed to compute the solution. Numerical examples are used to illustrate the numerical algorithm: the first is an artificial example showing that the new algorithm is feasible for solving Problem 1, and the second is a simulation showing that the new algorithm can be used for image compression.
2. Main Results
In this section, we give a sufficient condition and an analytical expression for the solution of Problem 1 by transforming Problem 1 into the fixed rank solution of a matrix equation. Finally, we establish an algorithm for solving Problem 1.
Lemma 2.
A matrix X^∈Rm×p is a solution of Problem 1 if and only if it is a solution of the following matrix equation:
(12)XBBT=ABT,rank(X)=d.
Proof.
It is easy to verify that a matrix X^∈Rm×p is a solution of Problem 1 if and only if X^ satisfies the following two equalities simultaneously:
(13) ∥A-X^B∥F = minX∈Rm×p ∥A-XB∥F,
(14) rank(X^) = d.
Since the normal equation of the least squares problem (13) is
(15)XBBT=ABT
and noting that the least squares problem (13) and its normal equation (15) have the same solution set, (13) and (14) can be equivalently written as
(16)XBBT=ABT,rank(X)=d
which implies that Problem 1 is equivalent to (12).
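A quick numerical check of the normal equation (15): the unconstrained least squares minimizer of ∥A-XB∥F, obtained here via numpy.linalg.lstsq on the transposed problem, satisfies XBBT=ABT. The matrix sizes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 4, 7, 5
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, n))

# Unconstrained least squares minimizer of ||A - XB||_F:
# transpose to the standard form min ||A^T - B^T X^T||_F.
Xt, *_ = np.linalg.lstsq(B.T, A.T, rcond=None)
X = Xt.T

# X satisfies the normal equation (15): X BB^T = AB^T
assert np.allclose(X @ B @ B.T, A @ B.T)
```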
Remark 3.
From Lemma 2 it follows that Problem 1 is equivalent to (12), hence we can solve Problem 1 by finding a fixed rank solution of the matrix equation XBBT=ABT.
Now we will use generalized singular value decomposition (GSVD) to solve (12). Set
(17)C=BBT∈Rp×p,D=ABT∈Rm×p.
The GSVD of the matrix pair (C,D) is given by (see [24])
(18)C=UΣ1W,D=VΣ2W,
where U∈Op×p, V∈Om×m, W∈Rp×p is a nonsingular matrix, k=rank([CT,DT]), r=rank(C), t=rank(C)+rank(D)-rank([CT,DT]), and
(19) Σ1 = ( IC 0 0 0
            0 SC 0 0
            0 0 OC 0 ),    Σ2 = ( OD 0 0 0
                                   0 SD 0 0
                                   0 0 ID 0 ),
where the block columns of Σ1 and Σ2 have widths r-t, t, k-r, and p-k; the block rows of Σ1 have heights r-t, t, and p-r; and the block rows of Σ2 have heights m-k-t+r, t, and k-r. Here IC and ID are identity matrices and OC and OD are zero matrices:
(20)SC=diag(α1,α2,…,αt),1>α1≥α2≥⋯≥αt>0,SD=diag(β1,β2,…,βt),0<β1≤β2≤⋯≤βt<1,αi2+βi2=1,i=1,2,…,t.
By (17) and (18), we have
(21)XBBT=ABT⟺XC=D⟺XUΣ1W=VΣ2W⟺XUΣ1W-VΣ2W=0⟺V(VTXUΣ1-Σ2)W=0.
Set
(22)Y=VTXU,
and Y is partitioned as follows:
(23) Y = ( Y11 Y12 Y13
           Y21 Y22 Y23
           Y31 Y32 Y33 ),
where the block rows have heights m-k-t+r, t, and k-r and the block columns have widths r-t, t, and p-r,
then
(24) VTXUΣ1-Σ2 = YΣ1-Σ2 = ( Y11 Y12SC 0 0
                            Y21 Y22SC-SD 0 0
                            Y31 Y32SC -ID 0 ),
whose block rows have heights m-k-t+r, t, and k-r and whose block columns have widths r-t, t, k-r, and p-k.
Therefore, by (21) and (24), we have
(25) XBBT=ABT ⟺ V( Y11 Y12SC 0 0; Y21 Y22SC-SD 0 0; Y31 Y32SC -ID 0 )W = 0
⟺ rank[V( Y11 Y12SC 0 0; Y21 Y22SC-SD 0 0; Y31 Y32SC -ID 0 )W] = 0
⟺ rank( Y11 Y12SC 0 0; Y21 Y22SC-SD 0 0; Y31 Y32SC -ID 0 ) = 0
⟺ k-r=0, Y11=Y21=Y31=0, Y12=Y32=0, Y22=SDSC-1,
that is to say, the matrix equation XBBT=ABT has a solution if and only if
(26)k-r=rank([CT,DT])-rank(C)=rank([BBT,BAT])-rank(BBT)=0,
and according to (22), we know that the expression of the solution is
(27)X=VYUT,
where
(28) Y = ( 0 0 Y13
           0 SDSC-1 Y23 ),   Y13∈R(m-t)×(p-r), Y23∈Rt×(p-r),
where the block rows have heights m-t and t and the block columns have widths r-t, t, and p-r.
By (26)–(28) and noting that Y13 and Y23 are arbitrary matrices, we have
(29) min rank(X) = min rank(VYUT) = minY13,Y23 rank(Y) = t = rank(D) = rank(ABT),
max rank(X) = max rank(VYUT) = maxY13,Y23 rank(Y) = t + min{p-r, m-t} = min{m, p+t-r}
  = min{m, p+rank(C)+rank(D)-rank([CT,DT])-rank(C)}
  = min{m, p+rank(D)-rank([CT,DT])}
  = min{m, p+rank(D)-rank(C)}
  = min{m, p+rank(ABT)-rank(BBT)},
where X ranges over the solutions of the matrix equation in (12), Y13∈R(m-t)×(p-r), and Y23∈Rt×(p-r).
Hence, if
(30) rank([BBT,BAT]) - rank(BBT) = k - r = 0,
(31) rank(ABT) ≤ d ≤ min{m, p+rank(ABT)-rank(BBT)},
then (12) has a solution, and the expressions of the solution are given by (26)–(28), that is,
(32) X = VYUT = V( 0 0 Y13
                   0 SDSC-1 Y23 )UT,
where Y23∈Rt×(p-r) is an arbitrary matrix and Y13∈R(m-t)×(p-r) is chosen such that
(33)rank(Y13)=d-t=d-rank(D)=d-rank(ABT).
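The rank count behind (29) and (33), namely rank(Y)=t+rank(Y13) for a block matrix of the form (28) with SDSC-1 nonsingular, can be checked numerically. The block sizes below are arbitrary, and for simplicity the first block column (of width r-t) is taken empty:

```python
import numpy as np

t, rows, cols = 2, 3, 4                       # illustrative block sizes
S = np.diag([0.9, 2.5])                       # nonsingular t×t, like S_D S_C^{-1}
Y23 = np.ones((t, cols))                      # arbitrary t×cols block

for r13 in range(min(rows, cols) + 1):
    Y13 = np.zeros((rows, cols))              # build Y13 with rank exactly r13
    Y13[:r13, :r13] = np.eye(r13)
    Y = np.block([[np.zeros((rows, t)), Y13],
                  [S,                   Y23]])
    # rank(Y) = t + rank(Y13): column operations through the nonsingular S
    # eliminate Y23 without touching Y13
    assert np.linalg.matrix_rank(Y) == t + r13
```

This is exactly why choosing rank(Y13)=d-t in (33) produces a solution of rank d.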
Noting that the low rank approximation Problem 1 is equivalent to (12) (Lemma 2), we obtain the following.
Theorem 4.
If
(34) rank([BBT,BAT]) - rank(BBT) = k - r = 0,
rank(ABT) ≤ d ≤ min{m, p+rank(ABT)-rank(BBT)},
then Problem 1 has a solution, and the expressions of the solution are given by
(35) X = VYUT = V( 0 0 Y13
                   0 SDSC-1 Y23 )UT,
where Y23∈Rt×(p-r) is an arbitrary matrix and Y13∈R(m-t)×(p-r) is chosen such that
(36)rank(Y13)=d-t=d-rank(D)=d-rank(ABT).
Remark 5.
In contrast with (11), the solution expression (35) does not require the matrix B to be square and nonsingular and does not require computing the inverse of B.
Based on Theorem 4, we can establish an algorithm for finding the solution of Problem 1.
Algorithm 6.
(1) Input the matrices A, B and the integer d;
(2) compute the GSVD of the matrix pair (C,D)=(BBT,ABT) according to (18);
(3) choose Y23∈Rt×(p-r) and Y13∈R(m-t)×(p-r) such that rank(Y13)=d-rank(ABT);
(4) compute the solution X according to (35).
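NumPy has no built-in GSVD routine, so the sketch below substitutes an equivalent pinv-based parametrization of the solution set of XBBT=ABT (Lemma 2): X = X0 + Z(I-CC+) with X0 = DC+, whose two terms have orthogonal row spaces so their ranks add. This is an assumed alternative route to the same solution set, not the paper's GSVD construction; the test matrices are illustrative:

```python
import numpy as np

def low_rank_solution(A, B, d, rng=None):
    """Sketch of a rank-d minimizer of ||A - XB||_F via the matrix equation
    X(BB^T) = AB^T (Lemma 2), with the GSVD replaced by a pinv-based
    parametrization (an assumption of this sketch)."""
    rng = rng or np.random.default_rng()
    m, p = A.shape[0], B.shape[0]
    C, D = B @ B.T, A @ B.T
    Cp = np.linalg.pinv(C)
    assert np.allclose(D @ Cp @ C, D), "X C = D has no solution"
    X0 = D @ Cp                          # minimum-rank solution, rank = rank(AB^T)
    r0 = np.linalg.matrix_rank(X0)
    if not (r0 <= d <= min(m, p + r0 - np.linalg.matrix_rank(C))):
        raise ValueError("d outside the feasible range of Theorem 4")
    P = np.eye(p) - C @ Cp               # projector onto null(C); C is symmetric
    # rows of X0 lie in range(C), rows of ZP in null(C), so the ranks add
    Z = rng.standard_normal((m, d - r0)) @ rng.standard_normal((d - r0, p))
    return X0 + Z @ P

rng = np.random.default_rng(5)
# rank-deficient B so that ranks strictly above rank(AB^T) are feasible
B = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 8))   # p=6, n=8, rank 3
A = rng.standard_normal((5, 8))                                 # m=5

X = low_rank_solution(A, B, d=4, rng=rng)
assert np.linalg.matrix_rank(X) == 4 and np.allclose(X @ B @ B.T, A @ B.T)
```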
3. Numerical Experiments
In this section, we first use a simple artificial example to illustrate that Algorithm 6 is feasible for solving Problem 1; then we use a simulation to show that Algorithm 6 can be used for image compression. The experiments were done with MATLAB 7.6 on a 64-bit Intel Pentium Xeon 2.66 GHz with emach≈2.0×10-16.
Example 7.
We compute the GSVD of the matrix pair (C,D)=(BBT,ABT) as follows:
(38)C=UΣ1W,D=VΣ2W,
where
(39) U = ( -0.7574 0.4215 -0.0590 0.0130 -0.2832 -0.4060
           -0.1509 -0.5893 -0.3965 -0.6437 -0.1562 -0.1845
           -0.1775 -0.3461 0.8997 -0.1454 -0.0810 -0.1071
           -0.1752 -0.5847 -0.1667 0.7506 -0.1161 -0.1510
           -0.3397 -0.0817 -0.0302 -0.0251 0.9311 -0.0970
           -0.4754 -0.0814 -0.0331 -0.0218 -0.0915 0.8703 ),
(40) V = ( -0.2818 0.4322 0.0937 0.7418 -0.4180
           -0.1814 -0.6971 -0.4417 0.0795 -0.5503
           0.9419 -0.0182 -0.0330 0.2445 -0.2272
           0.0166 0.5179 -0.2279 -0.5906 -0.5751
           -0.0160 -0.2424 0.8753 -0.1863 -0.3743 ),
(41) W = ( -6.2034 -5.7059 -4.4070 -5.9640 -4.5789 -6.1909
           -7.5279 -8.6034 -6.4574 -8.9389 -6.2317 -8.3756
           0.4175 -0.1314 0.2453 0.4166 -0.1876 -0.7345
           0.1928 0.6688 0.2243 -0.6061 -0.2199 -0.2228
           -0.0101 0.2451 -0.4669 -0.0248 0.7439 -0.4097
           0.1926 0.3245 -0.7213 0.2930 -0.4910 0.1022 ),
(42) Σ1 = ( 0.9985 0 0 0 0 0
            0 0.4192 0 0 0 0
            0 0 0 0 0 0
            0 0 0 0 0 0
            0 0 0 0 0 0
            0 0 0 0 0 0 ),   Σ2 = ( 0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0.0541 0 0 0 0 0
                                     0 0.9079 0 0 0 0 ).
It is easy to verify that
(43) rank([BBT,BAT]) - rank(BBT) = 0,  rank(ABT) = 2,  min{m, p+rank(ABT)-rank(BBT)} = 5,
that is, if 2≤d≤5, then the conditions of Theorem 4 are satisfied. Setting d=2∈[2,5], according to (35), we obtain that the solution of Problem 1 is
(44) X^ = V( 0 0 0 0 0 0
             0 0 0 0 0 0
             0 0 0 0 0 0
             0.0541/0.9985 0 0 0 0 0
             0 0.9079/0.4192 0 0 0 0 )UT,
which evaluates to
(45) X^ = ( -0.4121 0.5275 0.3062 0.5223 0.0603 0.0546
            -0.5057 0.7017 0.4117 0.6961 0.0959 0.0949
            -0.2175 0.2880 0.1680 0.2855 0.0357 0.0338
            -0.5008 0.7389 0.4367 0.7339 0.1126 0.1166
            -0.3341 0.4793 0.2823 0.4758 0.0696 0.0708 ).
Setting d=4∈[2,5], according to (35), we obtain that the solution of Problem 1 is
(46) X^ = V( 0 0 1 0 0 0
             0 0 0 1 0 0
             0 0 0 0 0 0
             0.0541/0.9985 0 0 0 0 0
             0 0.9079/0.4192 0 0 0 0 )UT
        = ( -0.3898 0.3610 -0.0102 0.8937 0.0579 0.0549
            -0.5040 1.2224 0.3498 0.2032 0.1188 0.1155
            -0.2733 -0.0736 1.0181 0.1148 0.0077 0.0030
            -0.4950 0.3989 0.3764 1.1198 0.0991 0.1052
            -0.3363 0.6416 0.3032 0.2965 0.0762 0.0764 ).
Example 7 shows that Algorithm 6 is feasible to solve Problem 1. However, the SVD method in [1] cannot be used to solve Example 7, because B is not a square matrix.
Example 8.
We will use the generalized Karhunen-Loeve transform, based on Algorithm 6 and the SVD method in [1], respectively, to realize the image compression. Figure 1(a) is the test image, which has 256×256 pixels and 256 levels on each pixel. We separate it into 32×32 blocks such that each block has 8×8 pixels. Let fi,j(k,l) and ni,j(k,l) (i,j=0,1,2,…,7; k,l=0,1,2,…,31) be the values of the image and of a Gaussian noise (generated by the MATLAB function imnoise) at the (i,j)th pixel in the (k,l)th block, respectively. For convenience, let a=i+8j, p=k+32l, and let the (i,j)th pixel in the (k,l)th block be expressed as the ath pixel in the pth block (a=0,1,2,…,63; p=0,1,…,1023). We can then also write fi,j(k,l) and ni,j(k,l) as fa(p) and na(p), respectively.
The test image is processed on each block. Therefore, we can assume that the blocked image space is 64-D real vector space R64. The pth block of the original image is expressed by the pth vector:
(47)sp=(s0p,s1p,…,s63p)T.
Hence the original image is expressed by 1024 64-D vectors {sp}p=01023. The noise is similarly expressed by {np}p=01023, where
(48)np=(n0p,n1p,…,n63p)T.
Figure 1(b) is the noisy image {xp}p=01023, where
(49)xp=sp+np,p=0,1,…,1023.
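The blocking step above, packing the 8×8 blocks of a 256×256 image into 64-dimensional vectors, can be sketched in NumPy as follows. Note that the row-major pixel ordering used here (a=8i+j, p=32k+l) is an assumption of this sketch and differs slightly from the paper's convention, and the random image stands in for a real test image:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical 256x256 test image; a real experiment would load an actual image.
img = rng.integers(0, 256, size=(256, 256)).astype(float)

# Split into 32x32 blocks of 8x8 pixels and stack each block as a 64-vector,
# giving the 64 x 1024 data matrix whose column p is the vector s_p.
blocks = img.reshape(32, 8, 32, 8).transpose(0, 2, 1, 3).reshape(1024, 64)
S = blocks.T
assert S.shape == (64, 1024)

# Reverse the blocking to make sure no pixel was lost or reordered
back = S.T.reshape(32, 32, 8, 8).transpose(0, 2, 1, 3).reshape(256, 256)
assert np.array_equal(back, img)
```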
By (47), (49), (2), (4), and the definition of the covariance matrix, we get T0 and Rx1/2 of (7). Then we use Algorithm 6 and the SVD method in [1] to realize the image compression, respectively; the experimental results are shown in Figures 2 and 3 and Table 1.
Figure 2 illustrates that Algorithm 6 can be used to realize image compression. Although it is difficult to see a difference between Figure 2 and Figure 3 (the latter compressed by the SVD method in [1]), Table 1 shows that the execution time of Algorithm 6 is less than that of the SVD method at the same rank. This shows that our algorithm outperforms the SVD method in execution time.
Table 1: Execution time for deriving Figures 2(a)–3(c).
Figure 2(a): 3.5835 (s)    Figure 3(a): 3.9216 (s)
Figure 2(b): 2.8627 (s)    Figure 3(b): 3.0721 (s)
Figure 2(c): 2.0591 (s)    Figure 3(c): 2.1433 (s)
Figure 1: (a) Original image; (b) noisy image.
Figure 2: Image compression by Algorithm 6 with different rank d: (a) d=40; (b) d=30; and (c) d=20.
Figure 3: Image compression by SVD method with different rank d: (a) d=40; (b) d=30; and (c) d=20.
4. Conclusion
The low rank approximation Problem 1 arising in the generalized Karhunen-Loeve transform is studied in this paper. We first transform Problem 1 into finding a fixed rank solution of a matrix equation and then use the generalized singular value decomposition (GSVD) to solve it. Based on these, we derive a sufficient condition for the existence of a solution, and the analytical expression of the solution is also given. Finally, we use numerical experiments to show that the new algorithm is feasible and effective.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (11101100; 11226323; 11261014; and 11171205), the Natural Science Foundation of Guangxi Province (2012GXNSFBA053006; 2013GXNSFBA019009; and 2011GXNSFA018138), the Key Project of Scientific Research Innovation Foundation of Shanghai Municipal Education Commission (13ZZ080), the Natural Science Foundation of Shanghai (11ZR1412500), the Ph.D. Programs Foundation of Ministry of Education of China (20093108110001), the Discipline Project at the corresponding level of Shanghai (A. 13-0101-12-005), and Shanghai Leading Academic Discipline Project (J50101).
References
[1] Y. Hua and W. Q. Liu, "Generalized Karhunen-Loeve transform," IEEE Signal Processing Letters, vol. 5, pp. 141–142, 1998.
[2] S. Kraut, R. H. Anderson, and J. L. Krolik, "A generalized Karhunen-Loeve basis for efficient estimation of tropospheric refractivity using radar clutter," IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 48–60, 2004.
[3] H. Ogawa and E. Oja, "Projection filter, Wiener filter, and Karhunen-Loève subspaces in digital image restoration," Journal of Mathematical Analysis and Applications, vol. 114, no. 1, pp. 37–51, 1986.
[4] Y. Yamashita and H. Ogawa, "Relative Karhunen-Loeve transform," IEEE Transactions on Signal Processing, vol. 44, pp. 371–378, 1996.
[5] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, Johns Hopkins University Press, Baltimore, Md, USA, 1996.
[6] P. C. Hansen, "The truncated SVD as a method for regularization," BIT, vol. 27, no. 4, pp. 534–553, 1987.
[7] H. D. Simon and H. Zha, "Low-rank matrix approximation using the Lanczos bidiagonalization process with applications," SIAM Journal on Scientific Computing, vol. 21, no. 6, pp. 2257–2274, 2000.
[8] P. Drineas, R. Kannan, and M. W. Mahoney, "Fast Monte Carlo algorithms for matrices II: computing a low-rank approximation to a matrix," SIAM Journal on Computing, vol. 36, no. 1, pp. 158–183, 2006.
[9] A. Frieze, R. Kannan, and S. Vempala, "Fast Monte-Carlo algorithms for finding low-rank approximations," Journal of the ACM, vol. 51, no. 6, pp. 1025–1041, 2004.
[10] J. P. Ye, "Generalized low rank approximations of matrices," Machine Learning, vol. 61, pp. 167–191, 2005.
[11] J. Liu, S. C. Chen, Z. H. Zhou, and X. Y. Tan, "Generalized low rank approximations of matrices revisited," IEEE Transactions on Neural Networks, vol. 21, pp. 621–632, 2010.
[12] Z. Z. Liang and P. F. Shi, "An analytical algorithm for generalized low rank approximations of matrices," Pattern Recognition, vol. 38, pp. 2213–2216, 2005.
[13] J. H. Manton, R. Mahony, and Y. Hua, "The geometry of weighted low-rank approximations," IEEE Transactions on Signal Processing, vol. 51, no. 2, pp. 500–514, 2003.
[14] I. Markovsky and S. Van Huffel, "Left versus right representations for solving weighted low-rank approximation problems," Linear Algebra and its Applications, vol. 422, no. 2-3, pp. 540–552, 2007.
[15] M. Schuermans, P. Lemmerling, and S. Van Huffel, "Block-row Hankel weighted low rank approximation," Numerical Linear Algebra with Applications, vol. 13, no. 4, pp. 293–302, 2006.
[16] F. Ding and T. Chen, "On iterative solutions of general coupled matrix equations," SIAM Journal on Control and Optimization, vol. 44, no. 6, pp. 2269–2284, 2006.
[17] F. Ding and T. Chen, "Gradient based iterative algorithms for solving a class of matrix equations," IEEE Transactions on Automatic Control, vol. 50, no. 8, pp. 1216–1221, 2005.
[18] F. Ding, P. X. Liu, and J. Ding, "Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle," Applied Mathematics and Computation, vol. 197, no. 1, pp. 41–50, 2008.
[19] J. Ding, Y. Liu, and F. Ding, "Iterative solutions to matrix equations of the form AiXBi = Fi," Computers & Mathematics with Applications, vol. 59, no. 11, pp. 3500–3507, 2010.
[20] L. Xie, Y. Liu, and H. Yang, "Gradient based and least squares based iterative algorithms for matrix equations AXB + CXTD = F," Applied Mathematics and Computation, vol. 217, no. 5, pp. 2191–2199, 2010.
[21] L. Xie, J. Ding, and F. Ding, "Gradient based iterative solutions for general linear matrix equations," Computers & Mathematics with Applications, vol. 58, no. 7, pp. 1441–1448, 2009.
[22] F. Ding and T. Chen, "Iterative least-squares solutions of coupled Sylvester matrix equations," Systems & Control Letters, vol. 54, no. 2, pp. 95–107, 2005.
[23] W. Xiong, W. Fan, and R. Ding, "Least-squares parameter estimation algorithm for a class of input nonlinear systems," Journal of Applied Mathematics, vol. 2012, Article ID 684074, 14 pages, 2012.
[24] C. C. Paige and M. A. Saunders, "Towards a generalized singular value decomposition," SIAM Journal on Numerical Analysis, vol. 18, no. 3, pp. 398–405, 1981.