DOA Estimation for a Mixture of Uncorrelated and Coherent Sources Based on Hierarchical Sparse Bayesian Inference with a Gauss-Exp-Chi 2 Prior

,


Introduction
Direction of arrival (DOA) estimation has been a crucial issue in various application areas involving radar, wireless communication, and navigation [1][2][3].Multiple signal classification (MUSIC) [4] and estimation of signal parameter via rotational invariance technique (ESPRIT) [5], known as the two most classical subspace-based DOA estimation algorithms, have been proposed to resolve uncorrelated sources.However, in multipath propagation environments, sources from an identical target may undergo reflection from various surfaces, and thus, the received sources may be a mixture of uncorrelated and coherent sources.In such environments, the aforementioned subspace-based algorithms would suffer from serious performance deterioration or even fail [6].
In order to solve the aforementioned problem, several preprocessing techniques are developed for decorrelation.In the related studies, these preprocessing techniques are mainly classified into two categories: spatial smoothing (SS) [7] and matrix reconstruction (MR) [8].Theoretically, the SS technique is implemented by dividing the whole array into multiple subarrays to combat the rank deficiency, while the MR technique is performed by rearranging the rank-deficient matrix into Hankel matrix, Toeplitz matrix, or other matrices to restore the rank.By combing the SS or MR techniques with subspace-based algorithms, several DOA estimation algorithms have been developed to handle the scenarios where uncorrelated and coherent sources coexist [9][10][11][12][13].More specifically, the oblique projection spatial smoothing (OPSS) [9] and moduli property spatial smoothing (MPSS) [10] are the typical SS-based algorithms, and the oblique projection Toeplitz matrix reconstruction (OPT-MR) [11], symmetric non-Toeplitz matrix reconstruction (SNT-MR) [12], and real-valued Hankel matrix reconstruction (RVH-MR) [13] are the popular MR-based algorithms.
Unlike traditional subspace-based DOA estimation algorithms, the emerging sparse source reconstruction (SSR) algorithms [14][15][16][17][18][19], including matching pursuit (MP) algorithm [14], lp-norm optimization algorithms [15,16], and sparse Bayesian inference (SBI) algorithms [17][18][19], provide a new perspective for DOA estimation.Since SSR-based algorithms realize DOA estimation via sparse source reconstruction, instead of calculating the covariance matrix, they can resolve the coherent sources directly without extra preprocessing techniques.Among these SSR-based algorithms, both the MP and the lp-norm optimization algorithms are restricted to solve an optimization problem and are relied on the point estimate, so that these algorithms ignore the uncertainty of the underlying source in the process of source reconstruction.By contrast, SBI-based algorithms specially consider the uncertainty of the underlying source and estimate the source via choosing an appropriate prior, which yield favorable reconstruction performance [20].Moreover, they can provide good estimation performance in the case of low SNR or small number of snapshots [21].In general, SBI-based algorithms first specify a sparsityencouraging prior to the unknown source model and then the model parameters are estimated via Bayesian inference.Since the exact Bayesian inference is intractable, two mainstream approximation inference algorithms were presented to estimate the model parameters, in which one is evidence procedure [22] and the other is variational Bayesian approximation [23].For the evidence procedure, some unknown hyperparameters with respect to the hierarchical prior model are estimated iteratively by maximizing the evidence.For the variational Bayesian approximation, the posterior distribution is approximated as the product of several tractable distributions, and the model parameters keep updating to minimize the Kullback-Leibler (KL) divergence between the true posterior and the variational approximation, which has attractive computational efficiency along with high estimation performance [24].Note that these approximation inference algorithms operate under the premise that an appropriate sparse prior has been imposed on the source model for the purpose of encouraging sparse solutions.Many sparsity-encouraging prior models have been investigated in the SBI framework [25][26][27].In [25], Bayesian compressed sensing (BCS) was proposed with a two-layer hierarchical Gaussian-inverse-gamma prior (or Student's-t prior), where the first layer is a Gaussian probability density function (pdf) and the second layer is a gamma pdf.Babacan et al. [26] proposed an equivalent Laplace prior that is generated by a Gaussian prior and an Exponential prior.In [27], a normal product (NP) prior was developed with two algorithms: NP-0 (using one-layer source model) and NP-1 (using twolayer source model).However, these priors are concentrated near the origin with relatively light tails, which may cause overshrinkage of the incident sources [28].
In this paper, we develop a new sparse-encouraging prior (called Gauss-Exp-Chi 2 prior) whose pdf distribution exhibits a sharp peak at the origin and heavy tails.With the proposed prior, the DOA estimation for a mixture of uncorrelated and coherent sources is performed under the hierarchical SBI framework using a uniform linear array (ULA).A three-layer hierarchical Bayesian model is established based on the Gauss-Exp-Chi 2 prior.Subsequently, according to the variational Bayesian approximation, the model parameters (including the mean and variance of sparse sources and hyperparameters) keep alternately updating until the KL divergence between the true posterior and the variational approximation tends to be zero.By exploiting the estimated model parameters, the source power spectra is constructed, from which the number and locations of the highest peaks are extracted to obtain source number and DOA estimates.Simulation results show that the proposed algorithm has superior estimation performance.Now we briefly summarize the contributions of this work as follows: (i) To encourage sparse solutions, we develop a new sparse-encouraging prior, called Gauss-Exp-Chi 2 prior, whose pdf distribution has a sharp peak at the origin and heavy tails.
(ii) By constructing the source powers of all the potential directions in the angular space, both source number and DOA estimates are obtained.
(iii) Variational approximations are adopted for the estimation of the hierarchical sparse Bayesian model parameters.
(iv) Several implementation details for algorithm optimization including Woodbury matrix identity for dimension-reduction, pruning a basis function and the third kind Bessel function approximation are discussed, and the CRB of DOA estimation is derived.
The remainder of this paper is organized as follows.The DOA estimation model for mixed sources is formulated in Section 2. Section 3 presents the Gauss-Exp-Chi 2 prior and DOA estimation algorithm for a mixture of uncorrelated and coherent sources within the hierarchical SBI framework.The algorithm optimization and CRB of DOA estimation are discussed in Section 4. Section 5 presents the simulation results of the proposed algorithm.Conclusions are drawn in Section 6.
Notations.Vectors and matrices are denoted by lowercase and uppercase boldface letters, respectively.⋅ T , ⋅ H , ⋅ −1 , and ⋅ represent transpose, conjugate transpose, inverse, and the statistical expectation, respectively.I is an identity matrix.⋅ 2 denotes the Euclidean norm, and diag

Problem Formulation
Consider a total of K far-field narrowband sources impinging on the uniform linear array (ULA) consisting of M omnidirectional sensors with the interspacing between adjacent sensors d being a half of the carrier wavelength λ, that is, d = λ/2.Then, the array output vector at the tth snapshot can be expressed as where T is the number of snapshots, a θ k = 1, e −jπ sin θ k , … , e −jπ M−1 sin θ k T is the steering vector corresponding to the direction θ k of the kth impinging source s k t , and n t denotes the noise vector.Without loss of generality, the impinging source s k t K k=1 is a mixture of uncorrelated and coherent sources.Note that there exists power fading among the coherent source, and fading coefficients are used to describe the degree of power fading [5].Specifically, consider K far-field narrowband sources s k t K k=1 impinging on the ULA, in which the first K u sources s 1 t , s 2 t , … , s K u t are uncorrelated, and the last the uncorrelated source set is denoted as s k t K u k=1 and the coherent source set is denoted as Divide the entire angular space into J sampling grids , where J represents the grid number and generally satisfies ; thus, we have h j t = s j k t δ j − j k .Thus, x t can be rewritten in a sparse form Due to the fact that h t has K non-zero elements in J elements, it is a sparse vector.In the case of multiple snapshots, the sparse sources at all the snapshots share the same support, and the array output matrix of T snapshots can be represented by where The goal of this paper is to provide the DOA estimation under the coexistence of uncorrelated and coherent sources from a sparse Bayesian perspective.

DOA Estimation
In this section, a DOA estimation algorithm for a mixture of uncorrelated and coherent sources is proposed within the hierarchical SBI framework.A Gauss-Exp-Chi 2 prior is developed to encourage sparse solutions, and then the parameters of three-layer hierarchical Bayesian model are estimated via variational Bayesian approximations.By constructing source power spectra, the source number and DOA estimation are obtained.
3.1.Bayesian Model.In the Bayesian model, the pdf of a joint distribution p X, H, α, η with respect to all the unknown and observed quantities is required to be known.In this paper, the joint distribution can be expressed as where α and η are referred to as the hyperparameters; τ ≥ 0 and v ≥ 0 are, respectively, referred to as the rate parameter and shape parameter.It is assumed that the components n t , t = 1, 2, … , T are the independently zero-mean stationary Gaussian noise with known variance σ 2 .Thus, the pdf of the noise matrix N is given by Combining ( 3) and ( 5), the Gaussian likelihood model is obtained as follows: From a Bayesian perspective, the pdf distribution of an assigned prior is appealing to exhibit a sharp peak at the origin and heavy tails, which favors strong shrinkage of noise sources and avoids overshrinkage of the interest sources.This property is generally considered as a desirable property for enforcing sparsity and variable selection [24].Some typical sparse priors, such as Gaussian-inverse-gamma prior and Laplace prior, are widely used in the relevant research [25,26], which, however, are concentrated near the origin with relatively light tails [28].To alleviate this problem, we here develop a three-layer hierarchical prior, referred to as Gauss-Exp-Chi 2 prior, for H.In the first layer of prior, we adopt a zero-mean Gaussian prior In the second layer of prior, an exponential hyperprior is imposed on α 3 International Journal of Antennas and Propagation where Exp ⋅ denotes the exponential distribution.In the third layer of prior, a chi-square (Chi2) hyperprior is considered over η, that is, where Γ ⋅ denotes the gamma function.
The first two layers ( 7) and ( 8) of the proposed prior result in a generative Laplace (Gaussian-Exponential) distribution [26], and the last layer (i.e., a Chi2 distribution) is embedded to obtain the proposed three-layer Gauss-Exp-Chi 2 prior.Thus, the proposed prior has more free parameters to control the degree of sparseness as compared with the Laplace prior [29].
Based on the above analysis, the directed graph of the sparse Bayesian model is shown in Figure 1, where arrows are used to point to the generative model.Note that the first five blocks from the left (corresponding to the variables v, η, τ, α 1 , … , α J , and h 1 t , … , h J t ) depict the three-layer hierarchical prior mentioned above.
Since p α j | τv ∝ τα j + 1/2 − v/2+1 , j = 1, 2, … , J, it is obvious that most α j for j = 1, 2, … , J are favored to zero, which leads to the fact that most h j t are favored to zero.Thus, the proposed three-layer hierarchical Gauss-Exp-Chi 2 prior is a sparsity-encouraging prior, which can help to improve the performance of source reconstruction [30].By combining the three layers of the proposed prior, the generated prior can be computed via where where U chf ⋅ , ⋅ , ⋅ is the confluent hypergeometric function defined by The reason for choosing the Gauss-Exp-Chi 2 prior is twofold: its pdf distribution has a sharp spike at the origin and heavy tails; it forms in a conjugate manner since the exponential and Chi2 distributions belonging to the exponential distribution family are chosen as the 2nd layer and the 3rd layer of this prior, which significantly simplifies the form of posterior distributions [24].The tail to spike behavior for Gauss-Exp-Chi 2 prior, Laplace prior, Student's-t prior, and Gaussian prior is illustrated in Figure 2, wherein the parameter selection of these priors is according to the standard derivation [25,26].It is observed that the proposed Gauss-Exp-Chi 2 prior is a sparsity-encouraging prior which has sharper peak and heavier tails than the existing priors.

Variational Bayesian Approximations.
It is well known that Bayesian inference operates on the basis of the posterior distribution p H, α, η | X = p H, α, η, X /p X .However, this posterior distribution is intractable for the reason that p X = p H, α, η, X dHdαdη cannot be calculated analytically.In order to approximate this posterior distribution, we resort to mean-field variational Bayesian approximation [23], whose task is to seek several analytically approximate distributions that minimize the KL divergence between the posterior p H, α, η | X and its approximation distributionq H, α, η .According to [23], q H, α, η can be factorized as the product of three variational distributions q H , q α , and q η ; thus, we have   International Journal of Antennas and Propagation (1) Update of q H . Keeping the terms of q H only depend on H, we have with Λ = diag 1/α 1 , 1/α 2 , … , 1/α J .By gathering together the similar terms, we have which implies that q H is the product of multiple Gaussian distributions N h t | μ t Σ with the mean μ t and covariance Σ being

21
(2) Update of q α .Similarly, according to (16), q α is derived as Denoting h 2 j = ∑ T t=1 h 2 j t and <h 2 j > = μ 2 j + Σ jj , ( 22) can be further simplified to The posterior distribution of q α can be approximately equivalent to the product of a series of generalized inverse Gaussian (GIG) distributions, that is, α ∼ ∏ J j=1 GIG α j | 1/2 < h 2 j > 2τ < η > , and thus, we have where κ p ⋅ is the third kind Bessel function with order p.The case of n = 1 in (24) gives the calculation of <α j > , which can be used in (25).The case of n = −1 gives <α −1 j > used for the evaluation of Σ in (21).International Journal of Antennas and Propagation (3) Update of q η .For q η , we have Thus, it can be known that q η is a gamma distribution, that is, η ∼ Gamma J + v/2, τ∑ J j=1 < α j > +1/2 with the following mean The estimation of model parameters is implemented by alternately updating the mean μ, variance Σ, and hyperparameters <α j > and <η > to minimize the KL divergence between the true posterior and the variational approximation, and the main steps are summarized in Algorithm 1.

Source
Power and DOA Estimation.The DOAs of the impinging sources are estimated via the following two steps: (1) form spatial spectra using the estimated source powers of all the potential directions and (2) extract the number and the corresponding locations of the peaks beyond a power threshold from the spectra.Let U= μ 1 , μ 2 , … , μ T and Σ be the final estimates of U and Σ (the mean and variance with respect to H) and consider H row by row, then we have h j ∼ N U j , Σ jj I .As outlined in [18], let P o j be the source power of direction θ j ; then we have Therefore, the source powers of all potential directions in the angular space are estimated by calculating T .Suppose that there are K peaks exceeding the power threshold P thres and the corresponding grid indices are j k j k = 1, 2, … , K .Then, the estimated source number and DOAs will be K and θ j k j k = 1, 2, … , K .

Discussions
4.1.Algorithm Optimization.For alleviating the computational complexity and speeding up the update efficiency, some implementation details are adopted for algorithm optimization in this subsection.

Woodbury Matrix Identity for Dimension-Reduction.
At each iteration, it is inevitable to calculate a J × J matrix inversion when updating Σ according to (21), which requires O J 3 computations.To reduce the computational complexity, we adopt the Woodbury matrix identity, and then (21) can be rewritten as 4.1.2.Pruning a Basis Function.In order to speed up the update efficiency, <α j > and the corresponding a θ j would be pruned from the original sparse Bayesian model when <α j > is smaller than a certain threshold.This procedure is referred to as "pruning a basis function", and similar approaches have been used in [29,30].

The Third Kind Bessel Function Approximation.
It is known that (24) involves the third kind Bessel function, and this integral process is relatively complicated.According to [31], the third kind Bessel function of m can be approximated as κ p m ∼ 1/2Γ p m/2 −p , if p > 0, and κ p m ∼ 1/2 Γ −p m/2 p , if p < 0.Then, (24) can be simplified to update Σ i using (21) 4.
i = i + 1 8. end while Algorithm 1: The main steps of the model parameter estimation.

6
International Journal of Antennas and Propagation

Cramér-Rao Bound (CRB).
To evaluate the DOA estimation performance, the CRB of the DOA estimation for a mixture of uncorrelated and coherent sources is derived in this subsection.Consider the sparse form of the array output vector x t mentioned in (2), the log-likelihood function of x 1 , x 2 , … , x T is firstly construct as Let h r t and h i t be the real part and imaginary part of h t , that is, h r t ≜ Re h t and h i t ≜ Im h t .Then, the partial derivatives of ( 29) with respect to h r t , h i t , σ, and θ are, respectively, given as where , Then, the Fisher information matrix (FIM) [32] can be obtained as It is noted that the following equality holds Finally, by combing ( 32) and (33), we have

Simulation Results
In this section, several simulations are presented to illustrate the performance of the proposed algorithm as compared with the BCS [25] and NP-1 [27] algorithms.Consider a 9sensor ULA with sensor interspacing being a half of the carrier wavelength.In the proposed algorithm, hyperparameters α and η are initialized as 1/T∑ T t=1 A H x t and 0.1; the values of rate parameter τ, shape parameter v , and power International Journal of Antennas and Propagation threshold P thres are, respectively, set to be τ = 1 5, v = 1, and P thres = −10 dB.Two hundred independent Monte Carlo trials are conducted for the following simulations, and the success rate and root mean squared error (RMSE) are two significant performance metrics for evaluating the DOA estimation performance.The success rate is defined as the rate between the number of successful estimates and the total number of Monte Carlo trials, where a successful estimate is declared if the estimation error is within a certain small Euclidean distance of the true directions.
The RMSE is defined by where θ k,i is the estimates of θ k in the ith Monte Carlo trial.
In the first simulation, we evaluate the effectiveness of the proposed algorithm.Assume that two uncorrelated sources from −32 2 °, −11 5 °and two coherent sources from 6 7 °, 23 4 °with fading coefficients 1, 0 2891 − 0 7567j impinge this ULA.The grid number, SNR, and number of snapshots are, respectively, set to be 361, 0 dB, and 200. Figure 3 plots the source power spectrum, which is a function of DOA.
As can be seen from Figure 3, the power peaks in the vicinity of the true DOAs can be distinguished from other peaks by their relatively strong powers and the biases between these peaks and the true DOAs are slight, which demonstrate the effectiveness of the proposed algorithm.
The second simulation compares the estimation performance of three algorithms, including the proposed algorithm, BCS, and NP-1 algorithms.Consider one uncorrelated source from 19 2 °and two coherent sources from −25 5 °, 33 3 °with fading coefficients being 1, −0 5280 + 0 6010j .The grid number and SNR are, respectively, fixed at 361 and 0 dB.The success rate versus number of snapshots and SNR are depicted in Figures 4 and 5, whereas the RMSE versus number of snapshots and SNR are shown in Figures 6  and 7.The results from Figures 4 and 5 illustrate that all the three algorithms are able to estimate the DOA correctly under the condition that the SNR or number of snapshots are relatively high, but for low SNR or small number of snapshots, the proposed algorithm has a higher success rate than the BCS and NP-1 algorithms.Figures 6 and 7 show that the estimation accuracy of the three algorithms tends to improve with the increase of SNR or the number of snapshots, and the proposed algorithm yields the best estimation accuracy as compared to the other two algorithms.This is because the proposed Gauss-Exp-Chi 2 prior is a three-layer sparsityencouraging prior, which can help to improve the accuracy of source reconstruction.The third simulation investigates the RMSE versus grid number.The simulation settings are the same as those of the second simulation, except that grid number in this simulation is ranged from 61 to 361. Figure 8 plots the RMSE curves versus grid number with fixed SNR 10 dB and number of snapshots 500.
It is observed that the RMSE curves of the three algorithms rapidly decrease as grid number increases from 61 to 181, and these curves tend to slowly decrease with the increase of grid number when grid number is beyond 181.In addition, the proposed algorithm has minimum estimation errors among the three algorithms.Note that the three algorithms recover sparse sources from a Bayesian perspective; thus, they are not restricted to the restricted isometry property.This implies that the grid number can continue to increase to further improve the estimation accuracy and the computational complexity would increase accordingly.Therefore, for the purpose of balancing estimation accuracy and computational complexity, the reasonable range of grid number is from 181 to 361.
In the last simulation, we test the angular resolution of the three algorithms.Consider two coherent sources from  9 International Journal of Antennas and Propagation 10.1 °and 10 1 °+ Δθ with the angular separation Δθ ranging from 3 °to 10 °.The fading coefficients are 1, 0 4469 − 0 7696j , and the grid number is set to be 361.Figure 9 depicts the RMSE versus angular separation with the fixed SNR 10 dB and number of snapshots 500.The results from Figure 9 show that both the BCS and NP-1 algorithms fail to work when the angular separation is less than 4 °, whereas the proposed algorithm can still provide accurate DOA estimation as long as the angular separation is no less than 3 °.
Moreover, the proposed algorithm has smaller RMSE than the BCS and NP-1 algorithms at the same angular separation.Thus, we can say that the proposed algorithm has a higher angular resolution than the BCS and NP-1 algorithms.

Conclusion
In this paper, we develop a DOA estimation algorithm for a mixture of uncorrelated and coherent sources using SBI.In the Bayesian framework, a Gauss-Exp-Chi 2 prior is introduced to promote sparse solutions, and the corresponding three-layer hierarchical Bayesian model is established.Then, the model parameters are estimated via variational Bayesian approximations.Finally, we form the source power spectra by using the estimated model parameters, from which the number and the locations of the highest peaks are extracted to achieve source number and DOA estimates.Simulation results show that the proposed algorithm can effectively estimate the DOAs of mixed sources and has better estimation performance than the state-of-the-art BCS algorithm and NP-1 in terms of estimation accuracy, success rate, and angular resolution.

Appendix A. The Derivation of the Probability Density Function for Gauss-Exp-Chi 2 Prior
In this Appendix, we prove that (12) holds.

A1
The expression on the right hand side of (12) can be rewritten as p h j t ; τ, v =

2
International Journal of Antennas and Propagation ⋅ denotes a diagonal matrix.Additionally, b a vdv represents the integral of v from a to b.

Figure 1 :
Figure 1: Directed graph of the sparse Bayesian model.

Figure 2 :
Figure 2: Four kinds of pdf curves with the standard derivation: (a) compares four distributions at origin and (b) compares four distributions in tail.

Figure 3 :Figure 4 :Figure 5 :
Figure 3: Spectra for the proposed algorithm with the fixed SNR 0 dB and number of snapshots 200.

Figure 9 :
Figure 9: RMSE versus angular separation with the fixed SNR 10 dB and number of snapshots 500 for coherent sources.