1. Introduction

Mathematical Problems in Engineering (MPE), Hindawi

ISSN: 1563-5147, 1024-123X

2020, Article ID 6527462

DOI: 10.1155/2020/6527462

Research Article

Improved Shrinkage Estimator of Large-Dimensional Covariance Matrix under the Complex Gaussian Distribution

Bin Zhang¹² (https://orcid.org/0000-0001-7654-5072) and Jianfeng Li

¹ College of Education, Hubei Minzu University, Enshi, Hubei 445000, China (hbmy.edu.cn)

² College of Mathematics, Sichuan University, Chengdu, Sichuan 610064, China (scu.edu.cn)

Received 29 March 2020; Revised 5 May 2020; Accepted 28 May 2020; Published 7 July 2020

Copyright 2020

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Estimating the covariance matrix of a random vector is essential and challenging in large dimension and small sample size scenarios. The purpose of this paper is to produce a better-performing large-dimensional covariance matrix estimator in the complex domain via linear shrinkage regularization. Firstly, we develop a necessary moment property of the complex Wishart distribution. Secondly, by minimizing the mean squared error between the real covariance matrix and its shrinkage estimator, we obtain the optimal shrinkage intensity in closed form for the spherical target matrix under the complex Gaussian distribution. Thirdly, we propose a new available shrinkage estimator by unbiasedly estimating the unknown scalars involved in the optimal shrinkage intensity. Both the numerical simulations and an example application to array signal processing reveal that the proposed covariance matrix estimator performs well in large dimension and small sample size scenarios.


1. Introduction

The problem of estimating the covariance matrix of a random vector arises in both multivariate statistical theory and various applications [1, 2]. In the large-sample setting, where the dimension of the random vector is small and the sample size is large enough, the sample covariance matrix (SCM) is a reliable estimator of the real covariance matrix and is widely employed. However, owing to the curse of dimensionality, the SCM becomes ill-conditioned or even singular in large dimension scenarios [3]. Severe consequences may then follow if the SCM is retained as the covariance matrix estimator [4, 5].

During the last two decades, many regularization strategies have been proposed to obtain better-performing covariance matrix estimators in large dimension scenarios [6–10]. Among these, linear shrinkage estimation is an effective strategy for producing a well-conditioned covariance matrix estimator when the dimension is large compared to the sample size [11, 12]. When prior information on the covariance structure is available, the linear shrinkage estimator is modeled as a linear combination of the SCM and a proper target matrix. In the existing literature, the target matrices, which are usually formed by structuring the SCM in line with the prior information, include the spherical target, the diagonal target, the Toeplitz rectified target, and the tapered SCM [13].

With the aid of prior information, the linear shrinkage estimator can always outperform the SCM when the involved tuning parameter is carefully selected [14]. Therefore, one of the crucial difficulties in linear shrinkage estimation is to determine the optimal tuning parameter, which is also called the shrinkage intensity. By minimizing the mean squared error (MSE) between the shrinkage estimator and the real covariance matrix, the optimal tuning parameter can be expressed in closed form for an arbitrary target. However, it comprises unknown scalars which involve the expectation operator and the real covariance matrix, and this is the chief difficulty in generating an available shrinkage estimator. In particular, when the data follow a specific distribution such as the Gaussian distribution or an elliptical distribution, the expectation can be further calculated [11]. Notably, the optimal tuning parameter takes different expressions under different distributions, even for the same target [15]. Therefore, it is necessary to work out the specific properties of the shrinkage intensity under some typical distributions. In many applications, such as array signal processing, the data come from the complex domain, and some related studies exist. In [16], a linear shrinkage estimator for the Toeplitz rectified target is developed under the complex Gaussian distribution, but the unknown scalars involved in the optimal tuning parameter are not unbiasedly estimated, resulting in a suboptimal covariance matrix estimator [17]. In [18], a linear shrinkage estimator is proposed via low-complexity cross-validation under an arbitrary complex distribution. Therefore, when the data follow a specific distribution, the linear shrinkage estimator could be further improved by making full use of the distribution information.

In this paper, we further study the linear shrinkage estimator under the complex Gaussian distribution. The target matrix is chosen as the spherical target, which has been widely studied over the real number field [6, 11, 14]. The optimal tuning parameter is obtained by minimizing the MSE. Recall that this optimal tuning parameter involves both the expectation operator and the real covariance matrix. By developing a novel moment property of the complex Wishart distribution, we can evaluate the expectations. The optimal tuning parameter then depends only on some unknown scalars concerning the real covariance matrix. Following a popular approach, we replace these unknown scalars with their estimates to obtain an available tuning parameter. Good estimates of the unknown scalars benefit both the available tuning parameter and the corresponding shrinkage estimator [11, 19].

The main contributions of this paper are threefold:

(1) A necessary moment property of the complex Wishart distribution is developed. On this basis, the optimal tuning parameter for the spherical target is analytically expressed under the complex Gaussian distribution.

(2) All the unknown scalars involved in the optimal tuning parameter are unbiasedly estimated. Then, the corresponding available linear shrinkage estimator under the complex Gaussian distribution is proposed.

(3) The performance of the proposed covariance matrix estimator is verified by comparison with the existing estimators in numerical simulations and in an example application to adaptive beamforming.

The rest of this paper is organized as follows: Section 2 formulates the linear shrinkage estimation under the complex Gaussian distribution as a quadratic programming problem. The optimal solution is analytically obtained. Section 3 unbiasedly estimates the relative unknown scalars and subsequently proposes a new shrinkage estimator for the spherical target. Section 4 provides some numerical simulations and an example application for verifying the performance of the proposed covariance matrix estimator. Section 5 concludes.

1.1. Notations

The notation $\mathbb{C}^m$ denotes the set of all $m$-dimensional complex column vectors, and $\mathbb{H}^n$ the set of all $n\times n$ Hermitian matrices. The symbol $\mathbb{E}$ denotes the mathematical expectation. The bold symbols $\mathbf{0}$ and $\mathbf{1}$ denote the column vectors of appropriate dimension with all entries 0 and 1, respectively. The symbol $I_n$ denotes the $n\times n$ identity matrix. For a matrix $A$, $A^H$ and $\|A\|$ denote its conjugate transpose and Frobenius norm, respectively. For a square matrix $A$, $A^{-1}$ and $\operatorname{tr}(A)$ denote its inverse and trace, respectively. For two real numbers $a$ and $b$, $a\wedge b$ and $a\vee b$ denote the maximum and minimum of $a$ and $b$, respectively.

2. Formulation and the Optimal Solution

Assume a $p$-dimensional random vector $x\in\mathbb{C}^p$ follows the complex Gaussian distribution $\mathcal{CN}(\mathbf{0},\Sigma)$, where $\Sigma$ is the unknown covariance matrix. Let $x_1,x_2,\ldots,x_n\in\mathbb{C}^p$ be an independent and identically distributed (i.i.d.) sample; then the sample covariance matrix $S$ is defined by

$$S=\frac{1}{n}\sum_{i=1}^{n}x_i x_i^H. \tag{1}$$

For an arbitrary prespecified target matrix $T\in\mathbb{H}^p$, which encodes prior information on the structure of the real covariance matrix [20], the linear shrinkage estimator of the covariance matrix $\Sigma$ is modeled as

$$\hat{\Sigma}=(1-w)S+wT, \tag{2}$$

where $w\in[0,1]$ is the tuning parameter, also called the shrinkage intensity [21]. Because $S$ and $T$ are Hermitian, we have $\hat{\Sigma}\in\mathbb{H}^p$ for an arbitrary $w\in[0,1]$.
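The definitions in (1) and (2) can be illustrated with a minimal NumPy sketch. The function names, the sizes `p` and `n`, and the fixed intensity 0.5 are illustrative assumptions and are not taken from the paper; the target used here is the spherical one, built from the SCM itself.

```python
import numpy as np

def sample_covariance(X):
    """SCM of equation (1); X is p x n with one zero-mean sample per column."""
    n = X.shape[1]
    return (X @ X.conj().T) / n

def linear_shrinkage(S, T, w):
    """Linear shrinkage estimator of equation (2) for an intensity w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return (1.0 - w) * S + w * T

rng = np.random.default_rng(0)
p, n = 6, 4  # hypothetical sizes with p > n, so the SCM is singular
# Columns are i.i.d. CN(0, I_p): real and imaginary parts each have variance 1/2.
X = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
S = sample_covariance(X)
T = (np.trace(S).real / p) * np.eye(p)  # spherical target built from S
Sigma_hat = linear_shrinkage(S, T, 0.5)  # 0.5 is an arbitrary illustrative choice
```

With $p>n$ the SCM has rank at most $n$, while the shrunk matrix stays Hermitian and becomes positive definite for any $w>0$ as long as $\operatorname{tr}(S)>0$.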

To find the optimal shrinkage intensity, we employ the MSE criterion:

$$\mathcal{M}_T(w)=\mathbb{E}\,\|\hat{\Sigma}-\Sigma\|^2=\mathbb{E}\,\|(1-w)S+wT-\Sigma\|^2. \tag{3}$$

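Furthermore, we have

$$\mathcal{M}_T(w)=w^2\,\mathbb{E}\operatorname{tr}(T-S)^2-2w\,\mathbb{E}\operatorname{tr}\big[(T-S)(\Sigma-S)\big]+c, \tag{4}$$

where $c=\mathbb{E}\operatorname{tr}(\Sigma-S)^2$ is a constant that does not depend on $w$. Therefore, the optimal shrinkage intensity can be obtained by solving the following optimization problem:

$$\min_{w}\; w^2\,\mathbb{E}\operatorname{tr}(T-S)^2-2\,\mathbb{E}\operatorname{tr}\big[(T-S)(\Sigma-S)\big]\,w,\quad \text{s.t. } 0\le w\le 1. \tag{5}$$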
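As a brief sketch of the step from (3) to (4), one may write $\hat{\Sigma}-\Sigma=(S-\Sigma)+w(T-S)$ and expand the squared Frobenius norm. The expansion below assumes that $S$, $T$, and $\Sigma$ are Hermitian, so that the cross term is real and $\|A\|^2=\operatorname{tr}(A^2)$ for each difference involved.

```latex
\begin{aligned}
\mathcal{M}_T(w)
&= \mathbb{E}\,\big\|(S-\Sigma) + w\,(T-S)\big\|^2 \\
&= w^2\,\mathbb{E}\operatorname{tr}(T-S)^2
   + 2w\,\mathbb{E}\operatorname{tr}\big[(T-S)(S-\Sigma)\big]
   + \mathbb{E}\operatorname{tr}(S-\Sigma)^2 \\
&= w^2\,\mathbb{E}\operatorname{tr}(T-S)^2
   - 2w\,\mathbb{E}\operatorname{tr}\big[(T-S)(\Sigma-S)\big]
   + \mathbb{E}\operatorname{tr}(\Sigma-S)^2 .
\end{aligned}
```

The last term is the only one free of $w$, which is why it can be dropped when minimizing over $w$.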

It is worth noting that the objective function in optimization problem (5) is a convex quadratic function of $w$, and the optimal shrinkage intensity can be expressed in closed form as follows:

$$w_0^*=0\wedge\frac{\mathbb{E}\operatorname{tr}\big[(T-S)(\Sigma-S)\big]}{\mathbb{E}\operatorname{tr}(T-S)^2}\vee 1. \tag{6}$$
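The closed form in (6) is the unconstrained minimizer of a convex quadratic clipped to the interval $[0,1]$. A minimal sketch follows, where the hypothetical arguments `a` and `b` stand for the two expectations $\mathbb{E}\operatorname{tr}(T-S)^2$ and $\mathbb{E}\operatorname{tr}[(T-S)(\Sigma-S)]$; these names are assumptions for illustration only.

```python
def clamped_quadratic_minimizer(a, b):
    """Minimize a*w**2 - 2*b*w over 0 <= w <= 1, assuming a > 0."""
    if a <= 0:
        raise ValueError("the quadratic coefficient a must be positive")
    # The unconstrained minimizer is b / a; clipping enforces the constraint.
    return min(max(b / a, 0.0), 1.0)

# Three illustrative cases: interior minimum, clipped below, clipped above.
w_interior = clamped_quadratic_minimizer(2.0, 1.0)
w_low = clamped_quadratic_minimizer(2.0, -1.0)
w_high = clamped_quadratic_minimizer(1.0, 3.0)
```

Clipping is valid here because a convex quadratic is monotone on either side of its vertex, so the constrained minimizer is the feasible point closest to the vertex.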

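Furthermore, for the spherical target $T=(\operatorname{tr}(S)/p)\,I_p$, the optimal shrinkage intensity $w_0^*$ given by (6) becomes

$$w_1^*=0\wedge\frac{\frac{1}{p}\operatorname{tr}^2(\Sigma)-\operatorname{tr}(\Sigma^2)+\mathbb{E}\operatorname{tr}(S^2)-\frac{1}{p}\mathbb{E}\operatorname{tr}^2(S)}{\mathbb{E}\operatorname{tr}(S^2)-\frac{1}{p}\mathbb{E}\operatorname{tr}^2(S)}\vee 1. \tag{7}$$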

Denote the matrix

$$E_n=\begin{pmatrix}1 & \frac{1}{n}\\[2pt] \frac{1}{n} & 1\end{pmatrix}. \tag{8}$$

We can obtain the following moment property of the complex Gaussian distribution.

Proposition 1.

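Assume an i.i.d. sample $x_1,x_2,\ldots,x_n\in\mathbb{C}^p$ follows the complex Gaussian distribution $\mathcal{CN}(\mathbf{0},\Sigma)$ and $S$ is the sample covariance matrix; then we have

$$\mathbb{E}\begin{pmatrix}\operatorname{tr}(S^2)\\ \operatorname{tr}^2(S)\end{pmatrix}=E_n\begin{pmatrix}\operatorname{tr}(\Sigma^2)\\ \operatorname{tr}^2(\Sigma)\end{pmatrix}. \tag{9}$$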

Proof.

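Because the sample $x_1,x_2,\ldots,x_n\in\mathbb{C}^p$ follows a complex Gaussian distribution with mean $\mathbf{0}$, we have $nS\sim\mathcal{CW}(\Sigma,n)$. By the moment properties of the complex Wishart distribution [22], we can obtain

$$\mathbb{E}\operatorname{tr}(S^2)=\operatorname{tr}(\Sigma^2)+\frac{1}{n}\operatorname{tr}^2(\Sigma). \tag{10}$$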

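Furthermore, when a random matrix $W=(w_{ij})_{p\times p}$ follows the complex Wishart distribution $\mathcal{CW}(I_p,n)$ with $n$ degrees of freedom, we have, with $\delta_{ij}$ denoting the Kronecker delta,

$$\operatorname{tr}(W)=\sum_{i=1}^{p}\sum_{j=1}^{p}w_{ij}\,\delta_{ij}, \tag{11}$$

$$\operatorname{tr}^2(W)=\sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}\sum_{l=1}^{p}w_{ij}w_{kl}\,\delta_{ij}\delta_{kl}. \tag{12}$$

By taking expectations on both sides and using $\mathbb{E}(w_{ij}w_{kl})=b_1\delta_{ij}\delta_{kl}+b_2\delta_{il}\delta_{jk}$, we have

$$\mathbb{E}\operatorname{tr}^2(W)=\sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}\sum_{l=1}^{p}\delta_{ij}\delta_{kl}\,\mathbb{E}(w_{ij}w_{kl})=\sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}\sum_{l=1}^{p}\delta_{ij}\delta_{kl}\big(b_1\delta_{ij}\delta_{kl}+b_2\delta_{il}\delta_{jk}\big)=b_1\operatorname{tr}^2(I_p)+b_2\operatorname{tr}(I_p), \tag{13}$$

where $b_1=n^2$ and $b_2=n$. For a random matrix $W$ which follows the complex Wishart distribution $\mathcal{CW}(\Sigma,n)$ with $n$ degrees of freedom, let $\Sigma=GG^H$; then, we have $G^{-1}WG^{-H}\sim\mathcal{CW}(I_p,n)$. In the same manner, we can obtain

$$\mathbb{E}\operatorname{tr}^2(W)=n^2\operatorname{tr}^2(\Sigma)+n\operatorname{tr}(\Sigma^2). \tag{14}$$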

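Noting that $nS\sim\mathcal{CW}(\Sigma,n)$, we can obtain

$$\mathbb{E}\operatorname{tr}^2(nS)=n^2\operatorname{tr}^2(\Sigma)+n\operatorname{tr}(\Sigma^2). \tag{15}$$

Therefore, we have

$$\mathbb{E}\operatorname{tr}^2(S)=\operatorname{tr}^2(\Sigma)+\frac{1}{n}\operatorname{tr}(\Sigma^2). \tag{16}$$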

By (10) and (16), equality (9) holds.
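Proposition 1 can be checked numerically. The following Monte Carlo sketch is only a sanity check under stated assumptions: the sizes `p`, `n`, the number of runs, the seed, and the test covariance $\sigma_{ij}=0.5^{|i-j|}$ are illustrative choices, and the sampler draws $\mathcal{CN}(\mathbf{0},\Sigma)$ vectors through a Cholesky factor.

```python
import numpy as np

def complex_gaussian_samples(Sigma, n, rng):
    """Draw n i.i.d. CN(0, Sigma) vectors as the columns of a p x n array."""
    p = Sigma.shape[0]
    G = np.linalg.cholesky(Sigma)  # Sigma = G G^H
    Z = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
    return G @ Z

rng = np.random.default_rng(1)
p, n, runs = 4, 5, 20000  # hypothetical sizes for a quick check
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Accumulate tr(S^2) and tr^2(S) over independent samples.
acc = np.zeros(2)
for _ in range(runs):
    X = complex_gaussian_samples(Sigma, n, rng)
    S = X @ X.conj().T / n
    acc += [np.trace(S @ S).real, np.trace(S).real ** 2]
empirical = acc / runs

# Right-hand side of (9).
E_n = np.array([[1.0, 1.0 / n], [1.0 / n, 1.0]])
theory = E_n @ np.array([np.trace(Sigma @ Sigma), np.trace(Sigma) ** 2])
```

The two vectors `empirical` and `theory` should agree up to Monte Carlo error, which shrinks as `runs` grows.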

Theorem 1.

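When the target matrix is $T=(\operatorname{tr}(S)/p)\,I_p$, the optimal shrinkage intensity under the complex Gaussian distribution is

$$w^*=\frac{p\operatorname{tr}^2(\Sigma)-\operatorname{tr}(\Sigma^2)}{(p-n)\operatorname{tr}^2(\Sigma)+(np-1)\operatorname{tr}(\Sigma^2)}\in[0,1]. \tag{17}$$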

Proof.

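By plugging equalities (10) and (16) into (7), we have

$$\mathbb{E}\operatorname{tr}(S^2)-\frac{1}{p}\mathbb{E}\operatorname{tr}^2(S)=\frac{np-1}{np}\operatorname{tr}(\Sigma^2)+\frac{p-n}{np}\operatorname{tr}^2(\Sigma). \tag{18}$$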

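Therefore, we can obtain

$$w^*=0\wedge\frac{p\operatorname{tr}^2(\Sigma)-\operatorname{tr}(\Sigma^2)}{(p-n)\operatorname{tr}^2(\Sigma)+(np-1)\operatorname{tr}(\Sigma^2)}\vee 1. \tag{19}$$

By the Cauchy–Schwarz inequality, we have $w^*\in[0,1]$. Hence, we have

$$w^*=\frac{p\operatorname{tr}^2(\Sigma)-\operatorname{tr}(\Sigma^2)}{(p-n)\operatorname{tr}^2(\Sigma)+(np-1)\operatorname{tr}(\Sigma^2)}\in[0,1]. \tag{20}$$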
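The two steps that the proof leaves implicit can be spelled out as follows. This is a sketch that uses only (10), (16), and the fact that $\Sigma$ is positive semidefinite: the first line simplifies the numerator of (7), and the second compares the denominator and numerator of (19).

```latex
\begin{aligned}
&\tfrac{1}{p}\operatorname{tr}^2(\Sigma) - \operatorname{tr}(\Sigma^2)
  + \mathbb{E}\operatorname{tr}(S^2) - \tfrac{1}{p}\,\mathbb{E}\operatorname{tr}^2(S)
  = \frac{p\operatorname{tr}^2(\Sigma) - \operatorname{tr}(\Sigma^2)}{np}, \\[4pt]
&\big[(p-n)\operatorname{tr}^2(\Sigma) + (np-1)\operatorname{tr}(\Sigma^2)\big]
  - \big[p\operatorname{tr}^2(\Sigma) - \operatorname{tr}(\Sigma^2)\big]
  = n\,\big[p\operatorname{tr}(\Sigma^2) - \operatorname{tr}^2(\Sigma)\big] \;\ge\; 0 .
\end{aligned}
```

The inequality is the Cauchy–Schwarz bound $\operatorname{tr}^2(\Sigma)\le p\operatorname{tr}(\Sigma^2)$ applied to the eigenvalues of $\Sigma$, which gives $w^*\le 1$. The numerator is nonnegative because $\operatorname{tr}^2(\Sigma)\ge\operatorname{tr}(\Sigma^2)$ for a positive semidefinite $\Sigma$, which gives $w^*\ge 0$.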

By Theorem 1, the corresponding optimal linear shrinkage estimator is

$$\hat{\Sigma}=(1-w^*)S+w^*\frac{\operatorname{tr}(S)}{p}I_p. \tag{21}$$

Note that the optimal shrinkage estimator depends on the real covariance matrix and is therefore unavailable in practical applications. Nevertheless, it provides a theoretically optimal benchmark for evaluating the available ones.
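For a known covariance matrix, the optimal intensity in (17) is straightforward to evaluate. The sketch below is illustrative; the function name `oracle_intensity` and the two test matrices are assumptions introduced here.

```python
import numpy as np

def oracle_intensity(Sigma, n):
    """Optimal intensity w* of equation (17) for the spherical target."""
    p = Sigma.shape[0]
    tr_squared = np.trace(Sigma).real ** 2        # tr^2(Sigma)
    tr_of_square = np.trace(Sigma @ Sigma).real   # tr(Sigma^2)
    numerator = p * tr_squared - tr_of_square
    denominator = (p - n) * tr_squared + (n * p - 1) * tr_of_square
    return numerator / denominator

# Spherical truth: the target already has the right structure.
w_identity = oracle_intensity(np.eye(8), n=4)

# A non-spherical example with entries 0.5**|i - j|.
idx = np.arange(4)
Sigma_ar = 0.5 ** np.abs(idx[:, None] - idx[None, :])
w_ar = oracle_intensity(Sigma_ar, n=5)
```

For $\Sigma=I_p$ the numerator and denominator of (17) coincide, so $w^*=1$ and the estimator shrinks fully to the target, whereas a non-spherical $\Sigma$ gives an intensity strictly between 0 and 1.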

3. Available Linear Shrinkage Estimator

Theorem 2.

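Under the complex Gaussian distribution, the unbiased estimates of $\operatorname{tr}(\Sigma^2)$ and $\operatorname{tr}^2(\Sigma)$ are, respectively, given by

$$\alpha=\frac{1}{n^2-1}\big(n^2\operatorname{tr}(S^2)-n\operatorname{tr}^2(S)\big),\qquad \beta=\frac{1}{n^2-1}\big(n^2\operatorname{tr}^2(S)-n\operatorname{tr}(S^2)\big). \tag{22}$$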

Proof.

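Because the inverse matrix of $E_n$ is

$$E_n^{-1}=\frac{n^2}{n^2-1}\begin{pmatrix}1 & -\frac{1}{n}\\[2pt] -\frac{1}{n} & 1\end{pmatrix}, \tag{23}$$

we have

$$\mathbb{E}\left[E_n^{-1}\begin{pmatrix}\operatorname{tr}(S^2)\\ \operatorname{tr}^2(S)\end{pmatrix}\right]=\begin{pmatrix}\operatorname{tr}(\Sigma^2)\\ \operatorname{tr}^2(\Sigma)\end{pmatrix}. \tag{24}$$

Therefore, we can obtain

$$\mathbb{E}\,\alpha=\operatorname{tr}(\Sigma^2),\qquad \mathbb{E}\,\beta=\operatorname{tr}^2(\Sigma), \tag{25}$$

revealing that $\alpha$ and $\beta$ are unbiased estimates of $\operatorname{tr}(\Sigma^2)$ and $\operatorname{tr}^2(\Sigma)$, respectively.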
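The unbiasedness can also be verified directly, without the matrix inverse, by substituting the moments (10) and (16) into the definitions in (22). The following worked computation is a sketch of that check and assumes $n\ge 2$ so that $n^2-1>0$.

```latex
\begin{aligned}
\mathbb{E}\,\alpha
&= \frac{n^2\big[\operatorname{tr}(\Sigma^2) + \tfrac{1}{n}\operatorname{tr}^2(\Sigma)\big]
       - n\big[\operatorname{tr}^2(\Sigma) + \tfrac{1}{n}\operatorname{tr}(\Sigma^2)\big]}{n^2-1}
 = \frac{(n^2-1)\operatorname{tr}(\Sigma^2)}{n^2-1}
 = \operatorname{tr}(\Sigma^2), \\[4pt]
\mathbb{E}\,\beta
&= \frac{n^2\big[\operatorname{tr}^2(\Sigma) + \tfrac{1}{n}\operatorname{tr}(\Sigma^2)\big]
       - n\big[\operatorname{tr}(\Sigma^2) + \tfrac{1}{n}\operatorname{tr}^2(\Sigma)\big]}{n^2-1}
 = \frac{(n^2-1)\operatorname{tr}^2(\Sigma)}{n^2-1}
 = \operatorname{tr}^2(\Sigma).
\end{aligned}
```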

By plugging the unbiased estimates given by (22) into the optimal shrinkage intensity $w^*$, we can obtain the available shrinkage intensity:

$$\hat{w}=0\wedge\frac{p\beta-\alpha}{(p-n)\beta+(np-1)\alpha}\vee 1. \tag{26}$$

Therefore, the available linear shrinkage estimator is

$$\hat{\Sigma}=(1-\hat{w})S+\hat{w}\frac{\operatorname{tr}(S)}{p}I_p. \tag{27}$$

The linear shrinkage estimator given by (27) is positive definite even when the dimension exceeds the sample size, except when $\hat{w}=0$, in which case $\hat{\Sigma}$ degenerates into the SCM.
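Equations (22), (26), and (27) combine into a short routine. The sketch below is one possible reading rather than the authors' implementation: the function name, the test sizes, and the fallback $\hat{w}=1$ when the denominator of (26) vanishes (which happens only when $S$ is proportional to the identity, so the choice does not change the estimate) are assumptions.

```python
import numpy as np

def shrinkage_spherical_cg(X):
    """Sketch of the available estimator (27); X is p x n, columns ~ CN(0, Sigma)."""
    p, n = X.shape
    if n < 2:
        raise ValueError("need n >= 2 so that n**2 - 1 > 0")
    S = X @ X.conj().T / n
    tr_S2 = np.trace(S @ S).real       # tr(S^2)
    tr2_S = np.trace(S).real ** 2      # tr^2(S)
    alpha = (n**2 * tr_S2 - n * tr2_S) / (n**2 - 1)  # unbiased for tr(Sigma^2)
    beta = (n**2 * tr2_S - n * tr_S2) / (n**2 - 1)   # unbiased for tr^2(Sigma)
    denom = (p - n) * beta + (n * p - 1) * alpha
    # Assumed fallback: denom is zero only when S equals its spherical target.
    ratio = (p * beta - alpha) / denom if denom > 0 else 1.0
    w_hat = min(max(ratio, 0.0), 1.0)  # clipping of equation (26)
    T = (np.trace(S).real / p) * np.eye(p)
    return (1.0 - w_hat) * S + w_hat * T, w_hat

rng = np.random.default_rng(2)
p, n = 10, 5  # hypothetical sizes with the dimension exceeding the sample size
X = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
Sigma_hat, w_hat = shrinkage_spherical_cg(X)
```

Substituting (22) shows that the denominator in (26) equals $n\,[p\operatorname{tr}(S^2)-\operatorname{tr}^2(S)]$, which is nonnegative by the Cauchy–Schwarz inequality, so the guard only matters in the degenerate case.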

4. Numerical Simulations and Adaptive Beamforming

In this section, we provide some numerical simulations and an example application to adaptive beamforming for verifying the performance of the proposed covariance matrix estimator. The proposed linear shrinkage estimator is denoted as T1cg. The linear shrinkage estimator corresponding to the spherical target matrix in [18] is denoted as T1cv.

4.1. Numerical Simulations

As mentioned before, an accurate shrinkage intensity estimate can benefit the linear shrinkage estimator. In this section, we compare the proposed available shrinkage intensity with the existing one based on cross-validation in [18] to reveal the advantage of the proposed shrinkage estimator. In our simulations, the real covariance matrix is $\Sigma=(\sigma_{ij})_{p\times p}$ with

$$\sigma_{ij}=t^{|i-j|}. \tag{28}$$

The model parameter $t$ is set to 0.5, so that the real covariance matrix is close to a spherically structured matrix. The data are drawn from the complex Gaussian distribution $\mathcal{CN}(\mathbf{0},\Sigma)$. The MSE of each available shrinkage intensity relative to the optimal intensity given by (17) is computed by averaging over $5\times10^4$ Monte Carlo runs.
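The simulation protocol described above can be sketched as follows for the proposed intensity only; the cross-validation intensity of [18] is not reproduced here. The helper names, the default number of runs (far smaller than in the paper), the seed, and the reading of (28) as $\sigma_{ij}=t^{|i-j|}$ are assumptions.

```python
import numpy as np

def ar1_covariance(p, t=0.5):
    """Covariance of equation (28), read here as sigma_ij = t**|i - j|."""
    idx = np.arange(p)
    return t ** np.abs(idx[:, None] - idx[None, :])

def oracle_intensity(Sigma, n):
    """Optimal intensity of equation (17)."""
    p = Sigma.shape[0]
    tr2 = np.trace(Sigma).real ** 2
    tr_sq = np.trace(Sigma @ Sigma).real
    return (p * tr2 - tr_sq) / ((p - n) * tr2 + (n * p - 1) * tr_sq)

def available_intensity(X):
    """Available intensity of equation (26) from a p x n data matrix."""
    p, n = X.shape
    S = X @ X.conj().T / n
    tr_sq, tr2 = np.trace(S @ S).real, np.trace(S).real ** 2
    alpha = (n**2 * tr_sq - n * tr2) / (n**2 - 1)
    beta = (n**2 * tr2 - n * tr_sq) / (n**2 - 1)
    denom = (p - n) * beta + (n * p - 1) * alpha
    if denom <= 0:  # degenerate case; returning 1.0 is an assumption
        return 1.0
    return float(np.clip((p * beta - alpha) / denom, 0.0, 1.0))

def intensity_mse(p, n, t=0.5, runs=200, seed=0):
    """Monte Carlo MSE of the available intensity relative to the optimal one."""
    rng = np.random.default_rng(seed)
    Sigma = ar1_covariance(p, t)
    G = np.linalg.cholesky(Sigma)
    w_star = oracle_intensity(Sigma, n)
    sq_err = 0.0
    for _ in range(runs):
        Z = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
        sq_err += (available_intensity(G @ Z) - w_star) ** 2
    return sq_err / runs

mse_small = intensity_mse(p=20, n=10, runs=100)  # small illustrative setting
```

Calling `intensity_mse` over a grid of `n` with `p=60`, or over a grid of `p` with `n=60`, would mirror the two panels described for Figure 1, at a much smaller number of runs.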

Figure 1 reports the MSEs of the available shrinkage intensities versus the sample size and the dimension. We can see that the MSEs of the available shrinkage intensities in T1cv and T1cg decrease as the sample size or the dimension gets larger. Because the proposed T1cg, which plugs in the unbiased estimates of the unknown scalars, exploits the complex Gaussian distribution information, it outperforms the T1cv, which is based on a nonparametric approach.

Figure 1: The MSEs of available shrinkage intensities when $p=60$ (a) and $n=60$ (b).

4.2. Adaptive Beamforming

In this section, we apply the proposed covariance estimators to array signal processing. Specifically, we consider a uniform linear array (ULA) which consists of $p$ sensors with half-wavelength spacing. At time $t=1,\ldots,n$, the received signal can be modeled as

$$x(t)=a(\theta_0)s_0(t)+\sum_{k=1}^{K}a(\theta_k)s_k(t)+n(t)\in\mathbb{C}^p, \tag{29}$$

where $\theta_0$ and $\theta_k$ are the directions of the desired signal $s_0(t)$ and the interference signals $s_k(t)$, respectively, $a(\theta_0)$ and $a(\theta_k)$ are the corresponding array responses, and $n(t)$ is the noise [23]. Then, the minimum variance distortionless response (MVDR) beamformer is expressed as

$$w=\frac{\Sigma^{-1}a(\theta_0)}{a(\theta_0)^H\Sigma^{-1}a(\theta_0)}. \tag{30}$$

The covariance matrix $\Sigma$ in (30) is unknown and is suggested to be replaced with its estimate $\hat{\Sigma}$ [24]. Then, the corresponding output signal-to-interference-plus-noise ratio (SINR) is

$$\mathrm{SINR}=\frac{\sigma_0^2\,\big|\hat{w}^H a(\theta_0)\big|^2}{\hat{w}^H\big(\hat{\Sigma}-a(\theta_0)a(\theta_0)^H\big)\hat{w}}, \tag{31}$$

where $\hat{w}$ is the estimated beamformer corresponding to $\hat{\Sigma}$. The covariance matrix estimator with larger output SINR is preferred in array signal processing.

In our simulations, we assume the desired signal has an angle of arrival of $\theta_0=5°$ with power $\sigma_0^2=10$ dB, and the interference signals come from the directions $-10°$, $0°$, and $10°$ with power 8 dB. For each covariance estimator, the corresponding output SINR is approximated by averaging over $5\times10^6$ repetitions.
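The beamforming setup can be sketched in NumPy as follows. Several ingredients are assumptions rather than details from the paper: the steering-vector phase convention for the half-wavelength ULA, the array size `p`, a unit-power (0 dB) white noise floor, and the evaluation of the output SINR against the true interference-plus-noise covariance, which is the standard definition and may differ in notation from (31).

```python
import numpy as np

def steering_vector(theta_deg, p):
    """ULA response for half-wavelength spacing (phase convention assumed)."""
    k = np.arange(p)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(theta_deg)))

def mvdr_weights(Sigma, a):
    """MVDR beamformer of equation (30), computed without an explicit inverse."""
    Sa = np.linalg.solve(Sigma, a)
    return Sa / (a.conj() @ Sa)

def output_sinr(w, a0, sigma0_sq, R_in):
    """Desired-signal power over interference-plus-noise power at the output."""
    signal = sigma0_sq * np.abs(w.conj() @ a0) ** 2
    return float(signal / (w.conj() @ R_in @ w).real)

p = 10                                   # hypothetical array size
sigma0_sq = 10 ** (10 / 10)              # 10 dB desired signal
a0 = steering_vector(5.0, p)

# Interference-plus-noise covariance: 0 dB white noise plus three 8 dB interferers.
R_in = np.eye(p, dtype=complex)
for theta in (-10.0, 0.0, 10.0):
    a = steering_vector(theta, p)
    R_in += 10 ** (8 / 10) * np.outer(a, a.conj())
Sigma = R_in + sigma0_sq * np.outer(a0, a0.conj())

w = mvdr_weights(Sigma, a0)
sinr_db = 10 * np.log10(output_sinr(w, a0, sigma0_sq, R_in))
```

Replacing `Sigma` in `mvdr_weights` with an estimate built from $n$ snapshots, while keeping the true `R_in` in `output_sinr`, gives the kind of comparison between covariance estimators that the simulations describe.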

Figures 2 and 3 report the SINR and corresponding elapsed time of the adaptive beamformers based on different covariance matrix estimators under the complex Gaussian scenario, where the noise $n(t)$ follows the complex Gaussian distribution with power 0 dB. In Figure 2, the dimension is $p=60$ and the sample size ranges from 20 to 120. In Figure 3, the sample size is $n=60$ and the dimension ranges from 20 to 120. Our observations and analyses are summarized as follows:

(1) Although it has the lowest computation cost, the classic SCM performs unsatisfactorily in small sample size scenarios. Therefore, it is no longer an ideal covariance matrix estimator in these scenarios.

(2) The SINR based on each covariance matrix estimator increases when the sample size gets larger but decreases when the dimension gets larger. This reveals that covariance matrix estimators play an important role in adaptive beamforming in large dimension and small sample size scenarios.

(3) Both the proposed shrinkage estimator T1cg and the existing shrinkage estimator T1cv outperform the SCM at an additional but reasonable computation cost. Furthermore, the proposed T1cg dominates the T1cv in both SINR and computation cost, because the signal comes from the complex Gaussian distribution in the simulation setting and the proposed covariance matrix estimator exploits this specific distribution information.

Figure 2: The SINR (a) and corresponding elapsed time (b) of adaptive beamformer based on each covariance matrix estimator versus the sample size under the complex Gaussian distribution.

Figure 3: The SINR (a) and corresponding elapsed time (b) of adaptive beamformer based on each covariance matrix estimator versus the dimension under the complex Gaussian distribution.

Figures 4 and 5 report the SINR and corresponding elapsed time of the adaptive beamformers when each variate in the noise follows the complex Gaussian mixture distribution $0.3\,\mathcal{CN}(-20,1)+0.4\,\mathcal{CN}(0,1)+0.3\,\mathcal{CN}(20,1)$ with power 6 dB. We can see that the SINR and elapsed time show the same trends as in Figures 2 and 3. When the received signal comes from the complex non-Gaussian distribution, the proposed estimator T1cg is inferior to the existing T1cv in adaptive beamforming. It is worth noting that both T1cg and T1cv have analytical expressions, and that the proposed estimator T1cg requires $np^2+np+3p^2$ multiplications whereas T1cv requires $np^2+3np+5p^2$. Therefore, the proposed T1cg always has a lower computational complexity than T1cv.
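The gap between the two multiplication counts quoted above can be made explicit. The following short computation takes the two counts as given in the text; the evaluation at $p=n=60$ is an illustrative choice matching the sizes used in the simulations.

```latex
\begin{aligned}
\big(np^2 + 3np + 5p^2\big) - \big(np^2 + np + 3p^2\big) &= 2np + 2p^2 = 2p\,(n+p) > 0, \\[4pt]
p = n = 60:\quad np^2 + np + 3p^2 &= 216000 + 3600 + 10800 = 230400, \\
np^2 + 3np + 5p^2 &= 216000 + 10800 + 18000 = 244800 .
\end{aligned}
```

The difference of 14400 multiplications equals $2p(n+p)$ at these sizes. Both counts share the leading term $np^2$, so the relative saving shrinks as $n$ grows.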

Figure 4: The SINR (a) and corresponding elapsed time (b) of adaptive beamformer based on each covariance matrix estimator versus the sample size under the complex non-Gaussian distribution.

Figure 5: The SINR (a) and corresponding elapsed time (b) of adaptive beamformer based on each covariance matrix estimator versus the dimension under the complex non-Gaussian distribution.

On the whole, by employing additional distribution information, the proposed estimator T1cg outperforms T1cv in the complex Gaussian scenario and achieves performance comparable to T1cv in the complex non-Gaussian scenario. Moreover, the proposed estimator T1cg always has an advantage over T1cv in computation cost.

5. Conclusion

In this paper, we have proposed a new covariance matrix estimator via the linear shrinkage procedure under the complex Gaussian distribution. By calculating moments of the complex Wishart distribution, we obtain the optimal shrinkage intensity for the spherical target. Furthermore, the involved unknown scalars are unbiasedly estimated. Subsequently, we propose the corresponding available linear shrinkage estimator. Numerical simulations and an application to adaptive beamforming show that the proposed covariance matrix estimator compares favorably with the existing estimators. In future work, we will investigate the Cramér–Rao bound for linear shrinkage estimation and develop nonlinear shrinkage estimation of the large-dimensional covariance matrix.

Data Availability

All data included in this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the Youth Research Foundation of Hubei Minzu University under Grant MY2017Q021.

References

[1] Bai and Shi, "Estimating high dimensional covariance matrices and its applications," Annals of Economics and Finance, vol. 12, no. 2, pp. 199–215, 2011.

[2] B. D. Van Veen, "Adaptive convergence of linearly constrained beamformers based on the sample covariance matrix," IEEE Transactions on Signal Processing, vol. 39, no. 6, pp. 1470–1473, 1991. doi: 10.1109/78.136563

[3] Bai and J. W. Silverstein, Spectral Analysis of Large Dimensional Random Matrices, Springer, New York, NY, USA, 2010.

[4] Gini and Greco, "Covariance matrix estimation for CFAR detection in correlated heavy tailed clutter," Signal Processing, vol. 82, no. 12, pp. 1847–1859, 2002. doi: 10.1016/s0165-1684(02)00315-8

[5] D. M. Witten and Tibshirani, "Covariance-regularized regression and classification for high dimensional problems," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 71, no. 3, pp. 615–636, 2009. doi: 10.1111/j.1467-9868.2009.00699.x

[6] Ledoit and Wolf, "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection," Journal of Empirical Finance, vol. 10, no. 5, pp. 603–621, 2003. doi: 10.1016/s0927-5398(03)00007-0

[7] T. T. Cai and Yuan, "Adaptive covariance matrix estimation through block thresholding," The Annals of Statistics, vol. 40, no. 4, pp. 2014–2042, 2012. doi: 10.1214/12-aos999

[8] A. J. Rothman, "Positive definite estimators of large covariance matrices," Biometrika, vol. 99, no. 3, pp. 733–740, 2012. doi: 10.1093/biomet/ass025

[9] Besson and Y. I. Abramovich, "Regularized covariance matrix estimation in complex elliptically symmetric distributions using the expected likelihood approach-Part 2: the under-sampled case," IEEE Transactions on Signal Processing, vol. 61, no. 23, pp. 5819–5829, 2013. doi: 10.1109/tsp.2013.2285511

[10] Bien, Bunea, and Xiao, "Convex banding of the covariance matrix," Journal of the American Statistical Association, vol. 111, no. 514, pp. 834–845, 2016. doi: 10.1080/01621459.2015.1058265

[11] T. J. Fisher and Sun, "Improved Stein-type shrinkage estimators for the high-dimensional multivariate normal covariance matrix," Computational Statistics & Data Analysis, vol. 55, no. 5, pp. 1909–1918, 2011. doi: 10.1016/j.csda.2010.12.006

[12] Chen, Z. J. Wang, and M. J. McKeown, "Shrinkage-to-tapering estimation of large covariance matrices," IEEE Transactions on Signal Processing, vol. 60, no. 11, pp. 5640–5656, 2012.

[13] Schäfer and Strimmer, "A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics," Statistical Applications in Genetics and Molecular Biology, vol. 4, no. 1, pp. 1–32, 2009.

[14] Ikeda, Kubokawa, and M. S. Srivastava, "Comparison of linear shrinkage estimators of a large covariance matrix in normal and non-normal distributions," Computational Statistics & Data Analysis, vol. 95, pp. 95–108, 2016. doi: 10.1016/j.csda.2015.09.011

[15] Konno, "Shrinkage estimators for large covariance matrices in multivariate real and complex normal distributions under an invariant quadratic loss," Journal of Multivariate Analysis, vol. 100, no. 10, pp. 2237–2253, 2009. doi: 10.1016/j.jmva.2009.05.002

[16] Liu, Sun, and Zhao, "A covariance matrix shrinkage method with Toeplitz rectified target for DOA estimation under the uniform linear array," AEÜ-International Journal of Electronics and Communications, vol. 81, pp. 50–55, 2017. doi: 10.1016/j.aeue.2017.06.026

[17] Zhang and Zhou, "Improved shrinkage estimators of covariance matrices with Toeplitz-structured targets in small sample scenarios," IEEE Access, vol. 7, pp. 116785–116798, 2019. doi: 10.1109/access.2019.2936402

[18] Tong, Xiao, and Guo, "Linear shrinkage estimation of covariance matrices using low-complexity cross-validation," Signal Processing, vol. 148, pp. 223–233, 2018. doi: 10.1016/j.sigpro.2018.02.026

[19] Chen, Wiesel, Y. C. Eldar, and A. O. Hero, "Shrinkage algorithms for MMSE covariance estimation," IEEE Transactions on Signal Processing, vol. 58, no. 10, pp. 5016–5029, 2010. doi: 10.1109/tsp.2010.2053029

[20] Hannart and Naveau, "Estimating high dimensional covariance matrices: a new look at the Gaussian conjugate framework," Journal of Multivariate Analysis, vol. 131, pp. 149–162, 2014. doi: 10.1016/j.jmva.2014.06.001

[21] Ledoit and Wolf, "A well-conditioned estimator for large-dimensional covariance matrices," Journal of Multivariate Analysis, vol. 88, no. 2, pp. 365–411, 2004. doi: 10.1016/s0047-259x(03)00096-4

[22] J. A. Tague and C. I. Caldwell, "Expectations of useful complex Wishart forms," Multidimensional Systems and Signal Processing, vol. 5, no. 3, pp. 263–279, 1994. doi: 10.1007/bf00980709

[23] Mestre and M. A. Lagunas, "Finite sample size effect on minimum variance beamformers: optimum diagonal loading factor for large arrays," IEEE Transactions on Signal Processing, vol. 54, no. 1, pp. 69–82, 2006. doi: 10.1109/tsp.2005.861052

[24] Serra and Nájar, "Asymptotically optimal linear shrinkage of sample LMMSE and MVDR filters," IEEE Transactions on Signal Processing, vol. 62, no. 14, pp. 3552–3564, 2014. doi: 10.1109/tsp.2014.2329420