Distributed Structured Compressive Sensing-Based Time-Frequency Joint Channel Estimation for Massive MIMO-OFDM Systems

In massive multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) systems, accurate channel state information (CSI) is essential to realize system performance gains such as high spectrum efficiency and energy efficiency. However, acquiring high-dimensional CSI requires prohibitively high pilot overhead, which significantly reduces both spectrum efficiency and energy efficiency. In this paper, we propose a more efficient time-frequency joint channel estimation scheme for massive MIMO-OFDM systems to resolve these problems. First, the partial channel common support (PCCS) is obtained by using time-domain training. Second, utilizing the spatio-temporal common sparse property of the MIMO channels and the obtained PCCS information, we propose the priori-information aided distributed structured sparsity adaptive matching pursuit (PA-DS-SAMP) algorithm to achieve accurate channel estimation in the frequency domain. Third, through performance analysis of the proposed algorithm, two signal power reference thresholds are given, which ensure that the signal is recovered exactly under power-limited noise and recovered exactly with a given probability under Gaussian noise. Finally, pilot design, computational complexity, spectrum efficiency, and energy efficiency are discussed. Simulation results show that the proposed method achieves higher channel estimation accuracy while requiring lower pilot overhead than other methods.


Introduction
Due to the scarcity of radio spectrum in the microwave band, improving spectrum efficiency (SE) has become one of the significant problems for future wireless communication systems. By leveraging the advantages of multiplexing and diversity, massive multi-input multi-output (MIMO), or large-scale antenna technology, can achieve high SE and energy efficiency (EE) and has become one of the key technologies for fifth-generation (5G) wireless communications [1,2]. In addition, orthogonal frequency division multiplexing (OFDM) has been widely adopted in modern wireless communication systems due to its excellent robustness to multipath [3]. Consequently, massive MIMO-OFDM is expected to become a new standard for 5G.
In massive MIMO-OFDM systems, channel state information (CSI) is indispensable for precoding/combining, channel equalization, and coherent detection, which makes accurate channel estimation crucial to improving system performance [3,4]. In conventional MIMO-OFDM systems, an orthogonal pilot training scheme is adopted for accurate downlink CSI acquisition. In massive MIMO-OFDM systems, however, since hundreds of transmit antennas are installed at the base station (BS), the acquisition of high-dimensional CSI results in prohibitively high pilot overhead, not to mention the computational complexity of high-dimensional matrix operations [5]. Meanwhile, the quantization accuracy of the CSI fed back from users to the BS, which is affected by the number of BS antennas, is also worth considering [6,7].
To solve these problems, many low-rank channel estimation approaches have been proposed, such as low-rank channel covariance matrix (CCM)-based methods [8,9], array signal processing-based methods [10,11], and compressive sensing (CS)-based methods [12-23]. Although the low-rank property of CCMs can greatly reduce the dimensions of MIMO channels, high-dimensional CCMs are difficult to obtain. Moreover, the high-dimensional matrix operations involved in the singular value decomposition (SVD) or eigenvalue decomposition (EVD) lead to high computational complexity. Array signal processing-based methods have many advantages, but they are mainly applicable to millimeter-wave massive MIMO systems, which have high angular resolution [4]. Fortunately, CS-based channel estimation can be adopted for massive MIMO-OFDM systems owing to the sparse characteristics of MIMO wireless channels [24].
Basically, there are three categories of CS-based channel estimation methods for MIMO-OFDM systems: the time-domain training channel estimation (TTCE) method [12,13], the frequency-domain pilot training channel estimation (FTCE) method [17-19], and the time-frequency joint training channel estimation (TFTCE) method [21-23]. TTCE exploits the interblock interference-free (IBI-free) region in the redundant portion of the received time-domain training sequences (TSs) for channel estimation. Although this scheme has high spectral efficiency, designing the time-domain TSs is rather difficult when the number of antennas is large, because the design must not only ensure orthogonality between different TSs but also make sure that the sensing matrices satisfy the restricted isometry property (RIP) [14,15]. Moreover, the estimation accuracy is seriously affected by the channel length [16]. Unlike the TTCE method, the FTCE method uses orthogonal or nonorthogonal frequency-domain pilots for channel estimation. Although the difficulty of designing training sequences is greatly reduced, the spectral efficiency decreases because the useful information provided by the preamble is not effectively utilized. Combining the advantages of these two schemes, Dai et al. [5] proposed the TFTCE scheme, but without adopting CS methods to improve estimation accuracy. On this basis, Ding et al. [21] proposed a TFTCE method based on compressed sensing. However, due to the interference of nonorthogonal time-domain TSs, it cannot be applied to massive MIMO systems. Ding et al. and Fan et al. [22,23] used identical TSs in the time domain and frequency domain over a transmission frame, whereas changing the TSs can bring better performance [25]. More importantly, these methods did not make effective use of the sparse characteristics of MIMO channels.
In this paper, we propose a distributed structured compressive sensing-based time-frequency joint channel estimation method for massive MIMO-OFDM systems. Our contributions are summarized as follows:
(i) By using the spatio-temporal common sparse characteristics of the wireless MIMO channel, we propose a priori information aided distributed structured sparsity adaptive matching pursuit (PA-DS-SAMP) algorithm. The proposed algorithm treats the partial channel common support as a priori information and combines the SAMP algorithm with structured multiple measurement vector (MMV) ideas [26]. It improves the spectral efficiency and the accuracy of the channel estimation without requiring knowledge of the sparsity of the MIMO channels.
(ii) The performance of the proposed algorithm is analyzed. First, the signal power thresholds that ensure exact recovery under power-limited noise, and exact recovery with a given probability under additive white Gaussian noise (AWGN), are obtained through rigorous derivation. Next, we give a pilot design scheme that is well suited for massive MIMO systems. Finally, the computational complexity, spectrum efficiency, and energy efficiency are given, which show that the computational complexity is lower than that of other algorithms and that the spectrum efficiency and energy efficiency are greatly improved.
The remainder of this paper is organized as follows. The distributed massive MIMO-OFDM system model with a block-sparse channel is described in Section 2. In Section 3, a distributed structured compressive sensing-based channel estimation algorithm is proposed. Next, Section 4 presents the performance analysis of the proposed algorithm. Simulation results are presented in Section 5 to demonstrate the performance of the proposed algorithm. Finally, the paper is concluded in Section 6.
Notation. Throughout our discussion, lowercase boldface letters denote vectors and uppercase boldface letters denote matrices; $0$ and $I$ denote the zero matrix of arbitrary size and the identity matrix; $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$, $(\cdot)^{\dagger}$, and $|\cdot|_c$ stand for the transpose, conjugate transpose, matrix inverse, Moore-Penrose pseudoinverse, and cardinality, respectively. $\|\cdot\|_p$ and $\|\cdot\|_F$ represent the $\ell_p$ norm and the Frobenius norm; the operator $\otimes$ represents circular convolution; $A[i]$ denotes the $i$th block of a matrix or vector $A$, and $\|\cdot\|_{p,q}$ represents the $\ell_p/\ell_q$ mixed norm of a matrix or vector.

Distributed System Model with Block-Sparse Channel
Consider a massive MIMO-OFDM system with $N_t$ transmit antennas at the base station (BS) serving $K$ single-antenna mobile terminals, where $N_t \gg K$. In this paper, we focus on downlink transmission, in which each frame, consisting of several time-frequency training OFDM (TFT-OFDM) symbols, is transmitted from the BS to the mobile terminals. The structure of the $i$th TFT-OFDM symbol is described in Figure 1. Unlike cyclic prefix OFDM (CP-OFDM) and zero-padding OFDM (ZP-OFDM), TFT-OFDM adopts a known pseudonoise (PN) sequence, instead of an unknown CP or ZP, as the guard interval to alleviate the interblock interference (IBI) caused by multipath. At the receiver, the TFT-OFDM symbols transmitted from all BS antennas are superimposed at each mobile terminal, so that the $i$th received TFT-OFDM symbol vector in the time domain can be written as in [5]. In that expression, $A^{(p)}_{i,\mathrm{ISI}}$ and $A^{(p)}_{i,\mathrm{IBI}}$ are $N \times N$ Toeplitz lower and upper triangular matrices whose first column and first row, respectively, are built from the entries of the channel impulse response (CIR) vector from the $p$th antenna to the receiver, which has length $L$, padded with zeros, and $n_i$ is the complex additive white Gaussian noise (AWGN) vector with zero mean and covariance $\sigma^2 I$.
After cyclicity reconstruction by the overlap-and-add (OLA) algorithm [5], the received pilot sequence of the $i$th OFDM symbol can be expressed through the discrete Fourier transform (DFT) as in (2), where $F$ is the $N \times N$ DFT matrix and $F_L$ is the $N \times L$ matrix consisting of the first $L$ columns of $F$; $\xi_i$ denotes the index set of subcarriers allocated to pilots, whose elements are distinct and selected from the set $\{1, 2, \ldots, N\}$; $(Fx^{(p)}_i)|_{\xi_i}$ and $F_L|_{\xi_i}$ are the subvector and submatrix obtained by selecting the rows of $Fx^{(p)}_i$ and $F_L$ according to $\xi_i$; $h_i = [(h^{(1)}_i)^T, \ldots, (h^{(N_t)}_i)^T]^T$; $N_p$ is the number of pilot subcarriers; and $w_i$ is the complex AWGN.
Wireless channels have sparse characteristics [24,27]. The presence of scatterers and reflectors in the wireless communication environment results in a multipath channel with several significant resolvable propagation paths. Because significant scatterers between the transmitter and receiver are usually scarce, the paths containing the majority of the channel energy are few and are sparsely distributed among the resolvable paths. In other words, the gains of most paths are close to zero except for a small number of significant paths, so the channel exhibits sparsity in the delay domain [12]. The sparsity support of the sparse channel $h^{(p)}_i$ is defined in (3) by comparing the tap power with $p^{(p)}_{th}$, the noise floor, which can be determined according to [28]. The sparsity level of the channel is the cardinality of this support. Because the antennas usually share the same scattering environment, all of the channels between one user and the different antennas deployed at the base station share a common sparsity [17].
Therefore, the nonzero entries of the CIR vector occur in clusters, so that the CIR vector $h_i$ can be rearranged into a block-sparse CIR vector
$$\tilde{h}_i = \left[h^{(1)}_{i,1}, \ldots, h^{(N_t)}_{i,1}, \ldots, h^{(1)}_{i,l}, \ldots, h^{(N_t)}_{i,l}, \ldots, h^{(1)}_{i,L}, \ldots, h^{(N_t)}_{i,L}\right]^T = \left[h_i[1]^T, \ldots, h_i[l]^T, \ldots, h_i[L]^T\right]^T,$$
where $h_i[l] = [h^{(1)}_{i,l}, h^{(2)}_{i,l}, \ldots, h^{(N_t)}_{i,l}]^T$. The blocking process of the CIR vector is shown in Figure 2. At the same time, the columns of the observation matrix are reordered correspondingly:
$$\tilde{\Psi}_i = \left[\varphi^{(1)}_{i,1}, \ldots, \varphi^{(N_t)}_{i,1}, \ldots, \varphi^{(1)}_{i,l}, \ldots, \varphi^{(N_t)}_{i,l}, \ldots, \varphi^{(1)}_{i,L}, \ldots, \varphi^{(N_t)}_{i,L}\right],$$
where the $l$th column block is $[\varphi^{(1)}_{i,l}, \varphi^{(2)}_{i,l}, \ldots, \varphi^{(N_t)}_{i,l}]$. Then (2) can be re-expressed as $y_i = \tilde{\Psi}_i \tilde{h}_i + w_i$.
Besides, the path delays of practical wireless channels vary much more slowly than the path gains, as shown in many studies [12]. This slow variation means that the CIRs over consecutive TFT-OFDM symbols share the same sparsity pattern [17]. Consequently, the spatio-temporal common sparse characteristic of the MIMO channel over $R$ TFT-OFDM symbols can be expressed as a support shared by all antennas, $S^{(1)}_i = S^{(2)}_i = \cdots = S^{(N_t)}_i$, that remains unchanged for $1 \le i \le R$. The spatio-temporal common sparse support $D$ of $H \triangleq [\tilde{h}_1, \ldots, \tilde{h}_R]$ is defined in (8), where $p_{th} = (1/N_t)\sum_{p=1}^{N_t} p^{(p)}_{th}$ and $p^{(p)}_{th}$ is determined as in (3). For simplicity, we assume that the same threshold $p_{th}$ is used over each frame. Thus, the spatio-temporal common sparse level $S$ of the channel vector set $H$ is $S = |D|_c$.
Because the observation matrices differ while the CIRs share the same sparse characteristics, this system model can be called a distributed system model. The structure of the distributed system model with a common block-sparse channel is described in Figure 3.
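The blocking of the CIR vector described above can be illustrated with a short sketch. It only shows the reordering from the antenna-major vector $h_i$ to the tap-major block vector $\tilde{h}_i$; the function name `to_block_sparse` and the toy channel values are illustrative assumptions, and taps are indexed from 0 in the code.

```python
def to_block_sparse(h_per_antenna):
    """Regroup N_t per-antenna CIRs (each of length L) into L blocks of length N_t."""
    n_t = len(h_per_antenna)
    length = len(h_per_antenna[0])
    if any(len(h) != length for h in h_per_antenna):
        raise ValueError("all CIRs must have the same length L")
    # Block l collects tap l of every antenna: [h^(1)_l, ..., h^(N_t)_l].
    return [[h_per_antenna[p][l] for p in range(n_t)] for l in range(length)]


# Toy example (hypothetical values): N_t = 3 antennas, L = 5 taps,
# with a common support on taps 1 and 3.
h = [
    [0, 1 + 1j, 0, 0.5, 0],
    [0, 0.8j, 0, -0.4, 0],
    [0, -1.2, 0, 0.3j, 0],
]
blocks = to_block_sparse(h)
nonzero_blocks = [l for l, blk in enumerate(blocks) if any(x != 0 for x in blk)]
print(nonzero_blocks)
```

Because the antennas share one support, the nonzero entries that were scattered across $h_i$ end up concentrated in a few whole blocks, which is the structure the recovery algorithm exploits.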

Distributed Structured Compressive Sensing-Based Channel Estimation
In this section, we introduce a time-frequency joint channel estimation method to improve estimation accuracy. First, time-domain training is used to obtain the partial common path delays; then accurate channel estimation is achieved using the proposed PA-DS-SAMP algorithm.

Acquisition of Partial Channel Common Support.
Although time-domain training cannot achieve accurate channel estimation, it can be used to acquire some prior information about the channels, such as partial channel delay information. Using the overlap-add method, the superimposed signal is formed by summing the main part and the tail of the received time-domain TS, that is, $r_{i,\mathrm{main}}$ and $r_{i,\mathrm{tail}}$, the received signals associated with the PN part and the tail part, which are expressed as in (10); $R$ is the number of symbols over which the channel delays remain unchanged. In (10), the measurement matrices $G_{i,\mathrm{main}} = [G^{(1)}_{i,\mathrm{main}}, \ldots, G^{(N_t)}_{i,\mathrm{main}}]$ and $G_{i,\mathrm{tail}} = [G^{(1)}_{i,\mathrm{tail}}, \ldots, G^{(N_t)}_{i,\mathrm{tail}}]$ are given in [12]. Thus, the superimposed signal can be written in terms of these measurement matrices and an additional term with subscript DBI. Taking advantage of circular convolution, the coarse CIR associated with the $u$th antenna can be estimated as in (13).
In equation (13), the second and third terms are the interference caused by nonorthogonal PN sequences and by data, respectively. The data interference is mainly due to the superposition of the previous TFT-OFDM data block and the current TFT-OFDM data block. Fortunately, the random OFDM data block and the fixed TS are uncorrelated, so the data interference from other antennas has little effect on the channel delay estimation, especially in large-scale antenna systems [29]. However, the PN sequence interference from different antennas increases linearly with the number of antennas when the circulant matrices of the PN sequences associated with different antennas are not mutually orthogonal. To the best of our knowledge, there is no set of sequences that have good autocorrelation and whose circulant matrices are mutually orthogonal. It is worth noting that this problem can be resolved if the same PN sequence is used for every antenna, in which case the estimate becomes the average CIR over the antennas. As a perfect sequence, the Zadoff-Chu sequence, which has good autocorrelation and cross-correlation and is widely used in OFDM systems for synchronization and channel estimation [30], is defined as in [30], where $M$ is the sequence length, $g$ is an integer, and $P$ is an integer relatively prime to $M$.
By taking the inverse discrete Fourier transform (IDFT) of a diagonal matrix whose diagonal elements comprise the Zadoff-Chu sequence $c$, a matrix $C$ is constructed whose row vectors also have excellent correlation properties. For each row sequence $c_i$ of the constructed matrix $C$, the correlation properties given in [30] hold, where $c(\cdot)_M$ represents the periodic extension with period $M$.
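As an illustration of the correlation properties mentioned above, the following sketch generates a base Zadoff-Chu sequence and checks its periodic autocorrelation. It assumes the commonly used textbook form $c(k) = \exp\{j 2\pi P (k^2/2 + gk)/M\}$ for even $M$ and $c(k) = \exp\{j 2\pi P (k(k+1)/2 + gk)/M\}$ for odd $M$, which may differ in sign or indexing from the expression in [30]. The function names are illustrative, and the construction of the matrix $C$ is not reproduced.

```python
import cmath
import math


def zadoff_chu(m, p, g=0):
    """Length-m Zadoff-Chu sequence with root p (coprime to m) and integer g."""
    if math.gcd(m, p) != 1:
        raise ValueError("p must be relatively prime to m")
    seq = []
    for k in range(m):
        # Even m uses k^2/2 and odd m uses k(k+1)/2; the factor 1/2 is
        # absorbed by writing the phase with pi instead of 2*pi.
        quad = k * k if m % 2 == 0 else k * (k + 1)
        phase = math.pi * p * (quad + 2 * g * k) / m
        seq.append(cmath.exp(1j * phase))
    return seq


def periodic_autocorr(seq, lag):
    """Periodic (circular) autocorrelation of seq at the given lag."""
    m = len(seq)
    return sum(seq[k] * seq[(k + lag) % m].conjugate() for k in range(m))


c = zadoff_chu(m=16, p=3)
peak = abs(periodic_autocorr(c, 0))
worst_sidelobe = max(abs(periodic_autocorr(c, lag)) for lag in range(1, 16))
print(peak, worst_sidelobe)
```

The peak equals the sequence length while every nonzero lag is zero up to floating-point rounding, which is what makes such sequences attractive for coarse delay estimation by circular correlation.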
It has been shown that using a constant TS within the transmission frame is not optimal for channel tracking and that changing the TS yields better channel estimation performance [5,25]. Thus, we adopt a changing time-domain training scheme in which different PN sequences, generated from the row sequences of $C$, are used to acquire accurate CSI over one frame of the transmitted signal. The partial common support of the MIMO channels over $R$ TFT-OFDM symbols is then calculated by thresholding the coarse estimates, where $p_{th}$ is the same as that in (8) and ensures the reliability of the channel common support.
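A minimal sketch of how a partial common support might be extracted from the coarse estimates follows. The exact decision rule is an assumption here: tap power is averaged over the $R$ symbols and compared with $p_{th}$. The function name `partial_common_support` and the numerical values are hypothetical, and taps are indexed from 0.

```python
def partial_common_support(coarse_cirs, p_th):
    """Taps whose power, averaged over all R symbols, exceeds p_th (assumed rule)."""
    num_symbols = len(coarse_cirs)
    num_taps = len(coarse_cirs[0])
    support = []
    for l in range(num_taps):
        avg_power = sum(abs(cir[l]) ** 2 for cir in coarse_cirs) / num_symbols
        if avg_power > p_th:
            support.append(l)
    return support


# R = 3 coarse estimates of a length-6 channel (hypothetical values).
# Taps 0 and 4 are strong in every symbol; tap 2 is weak and is left
# for the frequency-domain stage to find.
coarse = [
    [1.0, 0.05, 0.2, 0.02, 0.7j, 0.01],
    [0.9j, 0.03, 0.1, 0.04, -0.8, 0.02],
    [-1.1, 0.02, 0.3, 0.01, 0.6, 0.03],
]
d0 = partial_common_support(coarse, p_th=0.25)
print(d0)
```

A conservative threshold keeps only the strongest taps, so the resulting set is a reliable subset of the true support rather than a complete estimate of it.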

Acquisition of Accurate Path Gain Estimation.
In this section, we use the frequency-domain method in combination with the partial channel common support as a priori information for accurate channel estimation.
Since the support set of the CIR remains unchanged over several consecutive TFT-OFDM symbols, the CIRs can be estimated jointly by using multiple measurement vector (MMV) CS methods [31]. In general, MMV methods solve sparse signal recovery problems in which all measurement vectors are associated with the same sensing matrix, while generalized MMV (GMMV) CS methods are required for distributed systems with different sensing matrices. The sparse channel state acquisition is formulated as an optimization problem of this GMMV type. To solve this problem, the PA-DS-SAMP algorithm listed in Algorithm 1, which is developed from the DSAMP algorithm [18], is proposed.
The initialization of the PA-DS-SAMP algorithm is implemented in Steps 1∼3. In Step 5, the correlations of the residuals with the column vectors of the sensing matrices are calculated over each TFT-OFDM symbol, and the indexes of the $S_B + S_I$ largest nonzero entries are selected in Step 6, where $\arg\max\{A, b\}$ returns the indexes of the $b$ largest values in the set $A$, and $S_B$ and $S_I$ denote the basic sparsity and the increased sparsity, respectively. Steps 7∼10 prune the support to obtain a more precise sparsity support. Note that Step 9 does not guarantee that the basic support is fully contained, so Step 10 is needed to refine the effective support and ensure $S_B + S_I$ entries in the final support. Step 11 estimates $\tilde{h}_i$ on the effective support $D^k$ using least squares (LS), and Step 12 calculates the residuals over each symbol. The index of the block of $\tilde{h}_i$ with minimum power is found in Step 13, and the iteration is terminated in Step 14 if the average energy of the signal is less than a certain threshold. Steps 15∼21 are the adaptive step-size process, whose advantages are explained below, where $\lceil\cdot\rceil$ denotes the ceiling operation. The whole iteration is terminated if the average residual energy is less than a certain threshold, given in Step 22.
Finally, the estimate of the structured channel impulse response is obtained in Step 23. Compared with the DSAMP algorithm [18] and the classic SAMP algorithm [32], the proposed PA-DS-SAMP algorithm has four distinctive features:
(i) Since the signal to be estimated has a block structure, structured sparsity is exploited in the proposed PA-DS-SAMP algorithm for block signal recovery. This useful signal feature is not used in either the DSAMP algorithm or the SAMP algorithm.
(ii) In the PA-DS-SAMP algorithm, the sparsity support consists of a basic support and an additional support. Only the additional support is updated in each iteration, while the basic support does not change. The reason is that the basic support obtained by time-domain training has high reliability.
(iii) The step size is adaptively adjusted in the PA-DS-SAMP algorithm, which is not discussed for the DSAMP algorithm. Due to the tradeoff between recovery speed and recovery efficiency [33], an adaptive step-size selection scheme is used, based on the observation that a large step size is preferable for signals with flat magnitude and a small step size is suitable for signals with fast magnitude attenuation.
(iv) Finally, we give the termination threshold of the algorithm through rigorous mathematical derivation, which plays an important role in the performance of the algorithm.

ALGORITHM 1: PA-DS-SAMP algorithm.
Input: (1) initial support $D_0$, initial channel sparsity $S_0 = |D_0|_c$; (2) noisy measurements $y_i$ and sensing matrices $\tilde{\Psi}_i$; (3) step size selection threshold $\Gamma$, initial step size $s$; (4) termination threshold $p_{th}$, noise power $\sigma^2$, iteration index $k = 1$.
Initialization and iterations: as described step by step in the text.
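The correlation and selection stage described for Steps 5 and 6 can be sketched as follows. This is a simplified illustration under stated assumptions, not the listing of Algorithm 1: correlation energies are summed per block over all symbols, the basic support is always kept, and only the strongest blocks outside it are added. The function names and the toy matrices are hypothetical, and blocks are indexed from 0.

```python
def block_correlation_energy(residuals, sensing_matrices, n_t):
    """Sum over symbols of the correlation energy between each column block
    and the residual.

    residuals[i] is the length-N_p residual of symbol i; sensing_matrices[i]
    is an N_p x (L * n_t) matrix stored as a list of rows.
    """
    num_cols = len(sensing_matrices[0][0])
    num_blocks = num_cols // n_t
    energy = [0.0] * num_blocks
    for b, psi in zip(residuals, sensing_matrices):
        for col in range(num_cols):
            # Inner product <psi[:, col], b> = sum_r conj(psi[r][col]) * b[r].
            corr = sum(psi[r][col].conjugate() * b[r] for r in range(len(b)))
            energy[col // n_t] += abs(corr) ** 2
    return energy


def select_blocks(energy, basic_support, num_new):
    """Keep the basic support and add the num_new strongest blocks outside it."""
    candidates = [l for l in range(len(energy)) if l not in basic_support]
    candidates.sort(key=lambda l: energy[l], reverse=True)
    return sorted(set(basic_support) | set(candidates[:num_new]))


# Toy setup: N_t = 2, L = 3 blocks, N_p = 4 pilots, R = 2 symbols with
# different sensing matrices; the true signal occupies block 1 only.
psi_1 = [[1, 0, 0, 0, 0, 1], [0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0]]
psi_2 = [[0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 0, 1], [1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0]]
y_1 = [0, 0, 2, 1]   # psi_1 applied to [0, 0, 2, 1, 0, 0]
y_2 = [1, -1, 0, 0]  # psi_2 applied to [0, 0, 1, -1, 0, 0]

energy = block_correlation_energy([y_1, y_2], [psi_1, psi_2], n_t=2)
support = select_blocks(energy, basic_support=[2], num_new=1)
print(energy, support)
```

Accumulating the energy across symbols lets blocks that are consistently correlated with the residuals stand out even though each symbol uses a different sensing matrix, which is the distributed aspect of the selection.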
Remark 1. After the large-gain taps of the CIR are removed by using the a priori channel delay information, the remaining tap amplitudes do not differ much, so a large step size is suitable. However, as the large-magnitude taps are reconstructed after a few stages of the algorithm, fewer and fewer taps remain to be reconstructed. In other words, the energy of the reconstructed CIR becomes stable and the estimated sparsity level is close to the true sparsity level. Therefore, in order to determine the sparsity more accurately, the step size should be gradually reduced. Accordingly, a large step size is set in the first several stages of the PA-DS-SAMP algorithm to expedite convergence. Then, the step size decreases adaptively to provide fine tuning in the following stages once the change in the reconstructed signal falls below a certain threshold.
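The behavior described in Remark 1 can be sketched with a simple step-size schedule. The update rule below is an assumption made for illustration (halving with a ceiling once the relative change in reconstructed energy drops below $\Gamma$) and is not the rule of Steps 15∼21; the function name and the energy values are hypothetical.

```python
import math


def next_step_size(step, prev_energy, curr_energy, gamma):
    """Shrink the step once the reconstructed energy has nearly stopped changing."""
    relative_change = abs(curr_energy - prev_energy) / max(prev_energy, 1e-12)
    if relative_change < gamma:
        # Halving with a ceiling keeps the step an integer that never drops below 1.
        return max(1, math.ceil(step / 2))
    return step


step = 8
energies = [1.0, 2.0, 2.6, 2.7, 2.72, 2.721]  # reconstructed energy per stage
history = []
for prev, curr in zip(energies, energies[1:]):
    step = next_step_size(step, prev, curr, gamma=0.1)
    history.append(step)
print(history)
```

The step stays large while the reconstructed energy is still growing quickly and shrinks toward 1 as it levels off, trading speed in the early stages for accuracy of the final sparsity estimate.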

Performance Analysis
In this part, we discuss the performance of the proposed algorithm. First, we discuss the convergence of the algorithm and obtain its convergence condition and termination threshold. Second, pilot design for the sensing matrices is discussed. Then, the computational complexity of the proposed algorithm is compared with that of the OMP, SOMP, DSAMP, DS-SAMP, and PA-SOMP algorithms. Finally, the pilot overhead for CSI acquisition is obtained. Before the performance analysis, we first introduce two important definitions and two useful lemmas, which play a key role in the subsequent analysis and discussion.
Definition 1 [34]. The block-coherence of $\Psi$ is defined as in [34], where $M[k, l] = \Psi_k^H \Psi_l$ and $N_t$ is the size of the block.
Lemma 2 [35]. Suppose that the noise vector $w$ follows the distribution assumed in [35]. If the condition given there is satisfied, then the bound stated in [35] holds.

Convergence Analysis of Proposed PA-DS-SAMP Algorithm.
The PA-DS-SAMP algorithm begins with the residual initialization $b^0_i = y_i$, and the key requirement is to choose at least one correct column block of the sensing matrix in each iteration. At the $k$th stage ($k \ge 1$), the block best matched to $b^{k-1}_i$ is chosen. Suppose $D$ denotes the support set of the signal $\tilde{h}_i$ and $\bar{D}$ denotes the complementary set of $D$.
Theorem 1 reveals the convergence condition of the proposed algorithm in the absence of noise. We first discuss the noise-free case, then the power-limited noise case $\|w_i\|_2 \le \varepsilon$, and finally extend the result to the white Gaussian noise case $w_i \sim \mathcal{N}(0, \sigma^2 I_{N_p})$.

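Theorem 1. For block-sparse signals $\tilde{h}_i \in \mathbb{C}^{LN_t}$ with blocks of length $N_t$ that satisfy the sufficient condition (26), the proposed algorithm converges in the absence of noise.
Proof. See Appendix A. □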
The sufficient condition (26) depends on $\tilde{\Psi}_{i,D}$, which is determined by the nonzero positions of the signal $\tilde{h}_i$, but these positions are not known in advance. Therefore, Theorem 1 is not practical. The following theorem, under certain conditions on the block-coherence and subcoherence associated with $\tilde{\Psi}_i$, ensures that (A.1) holds.
Theorem 2. The sufficient condition (26) is satisfied if the block-coherence $\mu_{Bi}$ and the subcoherence $v_i$ of $\tilde{\Psi}_i$ satisfy the condition given in [34]. The proof of the theorem can be found in [34]. □
Based on Theorem 1 and Theorem 2, we discuss the condition under which the proposed algorithm selects the correct atom at each step in the case of power-limited noise.

Remark 2. Theorem 3 gives the double-threshold condition for the signal to be fully recovered with a certain probability. Although the noise power and an empirical threshold on the signal power are usually used as termination conditions, these thresholds are somewhat conservative. When the signal-to-noise ratio is low, they lead to incorrect supports. Theorem 3 provides a degree of quantification of correct recovery that can guide the choice of thresholds.

Pilot Design Discussion for Sensing Matrices.
According to (2), the design of the sensing matrices $\tilde{\Psi}_i$ is related to the pilot sequences $\{p^{(p)}_i\}_{p=1}^{N_t}$ and the pilot placement $\xi_i$. In CS theory, the RIP is treated as a sufficient condition for reliable and stable recovery of sparse signals. However, the RIP of a sensing matrix is so difficult to verify that the mutual incoherence property (MIP), which is stronger than the RIP, is widely used as a framework for sparse signal recovery. Besides, from Theorem 2, we can see that the MIP of the sensing matrix is directly related to the termination threshold of the proposed algorithm. The smaller $v_i$ and $\mu_{Bi}$ are, the lower the threshold of the recoverable signal and the more accurate the signal recovery. Therefore, our purpose is to design sensing matrices with good cross-correlation.
Using the central limit theorem, Gao [17] proposed a constant-envelope complex exponential random-phase pilot in which the phases $\theta_{i,t,p}$ are independent and identically distributed (i.i.d.) with the uniform distribution $U[0, 2\pi]$, and proved that the pilot vectors of different antennas and different OFDM symbols are asymptotically orthogonal when the pilots are placed at equal intervals; that is, their normalized inner product tends to zero as the number of pilots grows. Therefore, when the number of pilots is large enough, the sensing matrices $\tilde{\Psi}_i$ have good cross-correlation. As discussed below, the number of pilots is proportional to the number of antennas. So, for massive MIMO systems, the column vectors of the sensing matrices are approximately orthogonal. Besides, according to Theorem 2, we have the inequality $N_t < (1 + v_i)/((2S - 1)\mu_{Bi} + v_i)$, which means that smaller $v_i$ and $\mu_{Bi}$ allow a larger number of antennas.
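The asymptotic orthogonality of random-phase pilots can be observed numerically with a small sketch. It assumes unit-modulus pilots $e^{j\theta}$ with i.i.d. phases and compares the normalized inner product of two independent pilot vectors for growing $N_p$; the function names and the fixed seed are illustrative choices, and the equal-interval pilot placement of [17] is not modeled.

```python
import cmath
import math
import random


def random_phase_pilot(n_p, rng):
    """Constant-envelope pilot e^{j*theta} with i.i.d. theta drawn from U[0, 2*pi]."""
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_p)]


def normalized_correlation(a, b):
    """|<a, b>| / N_p, which equals 1 for identical unit-modulus pilots."""
    return abs(sum(x.conjugate() * y for x, y in zip(a, b))) / len(a)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for n_p in (64, 1024, 4096):
    p1 = random_phase_pilot(n_p, rng)
    p2 = random_phase_pilot(n_p, rng)
    print(n_p, normalized_correlation(p1, p1), normalized_correlation(p1, p2))
```

The cross term shrinks roughly like $1/\sqrt{N_p}$, so pilots of different antennas become nearly orthogonal once the number of pilots is in the hundreds or thousands.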


Computational Complexity of the DS-SAMP Algorithm.
The computational complexity of the algorithm is analyzed in this section. First, the computational complexity of each step in each iteration is given in Table 1.
Among the steps of the PA-DS-SAMP algorithm, the matrix inversion required for least squares (LS) estimation contributes the dominant computational complexity. Gao [18] compared the number of complex multiplications per iteration of the OMP, SAMP, and DSAMP algorithms. Compared with these algorithms, the DS-SAMP algorithm has almost the same per-iteration computational complexity because its calculation steps are almost the same. However, with the a priori common support, the number of iterations of the proposed algorithm is greatly reduced, so the total amount of calculation is significantly smaller.

Spectrum Efficiency and Energy Efficiency.
Block-structured processing of signals can reduce the pilot overhead, and the minimum pilot overhead is derived below.
Theorem 4 [31]. For $\tilde{\Psi}_i$, $1 \le i \le R$, whose elements obey an i.i.d. continuous distribution, the minimization problem over signals $\tilde{h}_i$ that share a common support satisfies the recovery condition (29).
After coarse CIR estimation, the system equation is restricted to the unknown part of the support, $D/D_0$, which denotes the difference set between $D$ and $D_0$. Due to the temporal common sparsity, the sparsity of the signal that needs to be recovered becomes $S_1 = N_t(|D|_c - S_0)$, and equation (29) can be transformed into a condition on the spark of the sensing matrices. In addition, in order to recover the signal on the known support, $N_t S_0$ pilots are needed. The total number of pilots must cover both parts. Subsequently, we obtain the smallest required pilot overhead $N_p = N_t(2S - S_0) - \min\{S_1, R\} + 1$, which is less than the minimum pilot overhead required for unstructured CS methods.
Due to the overhead caused by the time-domain guard interval and the frequency-domain pilots, the spectral efficiency of the proposed method, normalized by the ideal case without any overhead [5,13], can be expressed as a percentage. Table 2 compares the spectral efficiency of several commonly used algorithms [5], where a typical wireless digital television system with $N = 4096$ subcarriers is adopted. Besides, the channel model defined by the ITU has six resolvable paths, which means that the sparsity is $S = 6$. Without loss of generality, the guard interval length is $M = 256$ and the number of TFT-OFDM symbols in a frame is $R = 10$. Suppose the initial sparsity is $S_0 = 2$; for a large-scale $16 \times 16$ MIMO system, the spectral efficiency is calculated as $\eta_{SE} = 91.89\%$. As can be seen from Table 2, the proposed method has the highest spectral efficiency among the compared methods.
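The closed-form pilot count $N_p = N_t(2S - S_0) - \min\{S_1, R\} + 1$ with $S_1 = N_t(S - S_0)$ can be evaluated directly. The sketch below uses small illustrative values that are not taken from the paper's tables; the function name `min_pilots` is hypothetical.

```python
def min_pilots(n_t, s, s0, r):
    """Evaluate N_p = N_t (2S - S_0) - min(S_1, R) + 1 with S_1 = N_t (S - S_0)."""
    if not 0 <= s0 <= s:
        raise ValueError("S_0 must lie between 0 and S")
    s1 = n_t * (s - s0)
    return n_t * (2 * s - s0) - min(s1, r) + 1


# Illustrative values: N_t = 4 antennas, common sparsity S = 3, R = 2 symbols.
with_prior = min_pilots(n_t=4, s=3, s0=1, r=2)     # one support tap known a priori
without_prior = min_pilots(n_t=4, s=3, s0=0, r=2)  # no a priori support
print(with_prior, without_prior)
```

Each tap of the support that is known in advance removes $N_t$ pilots from the requirement, which is how the a priori support lowers the pilot overhead.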
Besides, the PN sequence power and the pilot power are boosted to achieve more reliable channel estimation in the TFT-OFDM scheme. Therefore, the energy efficiency can be expressed as a percentage as in [12], where $\alpha$ and $\beta$ denote the amplitude factors imposed on the time-domain PN sequence and the frequency-domain pilots, respectively. Generally, $\beta = 4/3$ and $\alpha = \sqrt{2}$ are specified in standards such as DVB-T2 and DTMB. The energy efficiency is then calculated as $\eta_{EE} = 85.21\%$. Table 3 summarizes the energy efficiency comparison for different OFDM schemes. The proposed method has the highest energy efficiency, because lower pilot overhead leads to higher energy efficiency.

Simulation Results
In this section, in order to verify the effectiveness of the proposed algorithm, we compare it through computer simulation with four other schemes: OMP, simultaneous OMP (SOMP), distributed SAMP (DSAMP) [18], and priori information aided SOMP (PA-SOMP) [22]. The simulation system is configured according to the most commonly used wireless broadcasting systems with 40 antennas, and the parameters are set as follows: center carrier frequency $f_c = 835$ MHz, signal bandwidth $f_s = 1/T_s = 7.56$ MHz, Doppler shift $f_d = 80$ Hz, DFT size $N = 4096$, and guard interval length $M = 256$.
The percentage of pilot overhead is calculated as $\eta = N_p/(N + M)$. The typical 6-tap multipath channel named ITU B is used to evaluate system performance, and its specific parameters can be found in [12].
Figure 4 shows the mean square error (MSE) performance of five channel estimation methods for the massive MIMO-OFDM system, where each frame of the transmitted signal contains 10 TFT-OFDM symbols. The Cramer-Rao lower bound (CRLB) is used as a benchmark for comparison. The number of pilots is 600, that is, the pilot overhead is 14.6%. The proposed algorithm has a 1.5∼2 dB signal-to-noise ratio (SNR) advantage over the other methods when the MSE is $10^{-1}$, and the OMP algorithm has the worst performance. However, when the MSE is less than $10^{-2}$, the performance of the algorithms tends to be the same, because all of them can choose the correct support when the SNR is high. Therefore, we conclude that the proposed algorithm has clear advantages at low and medium SNR.
Figure 5 shows the bit error rate (BER) of each algorithm for the massive MIMO-OFDM system. A zero-forcing (ZF) equalizer is adopted at the mobile terminals for signal detection by using the estimated channel state information. The modulation scheme is quadrature amplitude modulation (QAM) with Gray coding. The BER with ideal CSI is plotted as the benchmark. Although the BER curves of the algorithms are relatively close to each other, the enlarged view in the figure shows that the proposed algorithm has the lowest BER. This result is consistent with the results in Figure 4.
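As a worked check of the overhead figure, the following sketch evaluates the pilot overhead for $N_p = 600$, $N = 4096$, and $M = 256$ under two normalizations. The helper name `pilot_overhead` is hypothetical; the comparison suggests that the quoted 14.6% corresponds to normalizing by $N$ alone, whereas $N_p/(N + M)$ gives a slightly smaller value.

```python
def pilot_overhead(n_p, n, m=0):
    """Pilot overhead N_p / (N + M) as a fraction; m=0 ignores the guard interval."""
    return n_p / (n + m)


with_guard = pilot_overhead(600, 4096, 256)   # N_p / (N + M)
without_guard = pilot_overhead(600, 4096)     # N_p / N
print(f"{100 * with_guard:.2f}% {100 * without_guard:.2f}%")
```

The two normalizations differ by under one percentage point here, so the choice does not change the comparison between algorithms, but it matters when converting a quoted percentage back to a pilot count.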
To further evaluate the performance of the proposed method, Figures 6 and 7 show the MSE and BER performance, respectively, versus the number of measurements at a fixed SNR of 20 dB. As the number of pilots increases, the MSE curve of the proposed algorithm approaches the CRLB. This means that the proposed algorithm can obtain the correct support thanks to structured signal processing. The proposed algorithm achieves an MSE of 10^−2 with 9.77% pilot overhead, while the SOMP and PA-SOMP algorithms require at least 10.99% pilot overhead and DSAMP needs more. As expected, the OMP algorithm has the worst estimation performance, consistent with the previous analysis. From the performance curves and the analysis above, we conclude that the proposed algorithm requires less pilot overhead at the same estimation accuracy.

Table 3: Energy efficiency comparison.
TFT-OFDM MIMO [5]: 82.90%
TFT-OFDM MIMO [22]: 77.25%
FT-MIMO [5]: 55.91%
Proposed TFT-OFDM MIMO: 85.21%
In Figure 8, the correct signal recovery probabilities are plotted as functions of the pilot overhead to evaluate the five methods. Because of the additive white noise in the measurements, correct recovery of the signal is probabilistic. For convenience, the recovery probability of the signal is regarded as 1 when the MSE of the signal estimate is lower than 10^−2. The proposed method accurately recovers the signal at a pilot overhead of 9.77%, while the DSAMP, SOMP, and PA-SOMP algorithms need up to 12.82%, 11.6%, and 10.99% pilot overhead, respectively. This result is consistent with the performance analysis in Section 4. Therefore, compared with the other algorithms, the proposed method reduces the pilot overhead by roughly 1.2 to 3 percentage points, which improves the spectral efficiency.
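The overhead figures and the recovery criterion above can be made concrete with a short calculation. The quoted percentages are consistent with pilot counts of 400, 450, 475, 525, and 600 out of N = 4096 subcarriers. Both N and these counts are inferred from the percentages and are not stated in this section, so they should be read as assumptions. The MSE values passed to the recovery function are made up for illustration.

```python
import numpy as np

N_SUBCARRIERS = 4096  # assumption: inferred from "600 pilots = 14.6 percent"

def pilot_overhead_percent(n_pilots, n_subcarriers=N_SUBCARRIERS):
    """Pilot overhead as a percentage of all subcarriers."""
    return 100.0 * n_pilots / n_subcarriers

def recovery_probability(mse_trials, threshold=1e-2):
    """Fraction of trials counted as correct recovery (MSE below the threshold)."""
    return float(np.mean(np.asarray(mse_trials) < threshold))

# Assumed pilot counts matching the overheads quoted in the text.
for n in (400, 450, 475, 525, 600):
    print(n, "pilots ->", round(pilot_overhead_percent(n), 2), "%")

# Hypothetical per-trial MSE values at one pilot overhead.
p = recovery_probability([5e-3, 2e-2, 8e-3, 1e-3])
print("empirical recovery probability:", p)
```

Under these assumptions, the gap between the proposed method and the others corresponds to 50 to 125 fewer pilot subcarriers per OFDM symbol.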

Conclusions
In this paper, we proposed a time-frequency joint channel estimation scheme for massive MIMO-OFDM systems. First, the partial channel common support was obtained by using a changing time-domain training sequence. To obtain accurate CSI, we then proposed a priori-information aided distributed structured sparsity adaptive matching pursuit (PA-DS-SAMP) channel estimation algorithm that exploits the obtained common path delay information. Next, we analyzed the performance of the proposed algorithm, including the convergence analysis, pilot design, computational complexity, and spectrum and energy efficiencies. Through the convergence analysis, we gave two signal power thresholds, which ensure that the signal is accurately recovered under power-limited noise and accurately recovered in a probabilistic sense under Gaussian white noise, respectively. In addition, the pilot design results showed that the pilot sequences of different antennas tend to be orthogonal as the number of antennas increases. Furthermore, the computational cost of the proposed algorithm is significantly reduced, and the spectrum and energy efficiencies are improved. Simulation results showed that, compared with other algorithms, the proposed algorithm not only increases the estimation accuracy but also greatly reduces the pilot overhead.

A. Proof of Theorem 1
Assume that the next chosen indexes will at least contain a block in ). This is equivalent to the following inequality: (A.1) ) H ] H , inequality (A.1) can be converted to the following form: and it also can be expressed as the following form: (A.3) As a generalization of the matrix norm, the mixed matrix norm is defined as ‖A‖_{2,p} = max_{x≠0} (‖Ax‖_{2,p}/‖x‖_{2,p}). According to the properties of the pseudoinverse, † is the Hermitian matrix, the following equation holds (( The first inequality is obtained by using the definition of the mixed matrix norm, ‖Ax‖_{2,p} ≤ ‖A‖_{2,p} ‖x‖_{2,p}, and the second inequality is obtained by Lemma < 1 holds, the PA-DS-SAMP algorithm can pick up a correct new block in each step. If the correct block is selected in each iteration, the recursive refinement of the support set estimate, which adds the correct block and removes the wrong block, yields estimated subspaces whose distance from the measurement vector strictly decreases, so that the residuals decrease. After a finite number of iterations, the algorithm converges.

B. Proof of Theorem 2
By using orthogonal projection matrix Thus, in Step 6, the sufficient condition that at least one correct atom should be selected is The last equation is obtained by the projection theorem. Thus, the sufficient condition for and we have According to [34], there is inequality (B.5) established: Thus, we have Substituting inequality (B.7) into inequality (B.5), we have a sufficient condition that (A.2) holds, which means that at least one correct atom will be selected in the current cycle Meanwhile, we have (B.8) The second inequality is obtained by using Lemma 5 in [35]. According to the Gershgorin circle theorem, we have λ_min^i((Ψ_{i,D})^H Ψ_{i,D}) ≥ 1 − (N_t − 1)v_i − (S − 1)N_t μ_i. Besides, μ_i ≥ μ_{Βi}; thus, we have and N_{k−1,1} ≤ Rε^2. Therefore, we have This means that if the remaining signal energy is large enough, the algorithm will choose a correct atom at this step. To be conservative, we obtain the result that if max holds, then the correct atom will be selected in the next step. When all atoms are properly selected, the lowest energy of the selected atoms is min_{1≤i≤R} ((2 According to lemma
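The Gershgorin circle theorem step in this proof can be checked numerically on a generic example. For a matrix with unit-norm columns, the Gram matrix has ones on its diagonal, so its smallest eigenvalue is at least 1 minus the largest off-diagonal absolute row sum, which in turn is at least 1 − (n − 1)μ for n columns with mutual coherence μ. The random matrix below is a hypothetical stand-in for the sub-dictionary in the proof, and the single coherence parameter is a simplification of the two parameters that appear in the bound above.

```python
import numpy as np

def gershgorin_lambda_min_bound(G):
    """Lower bound on the smallest eigenvalue of a Hermitian G via Gershgorin discs."""
    diag = np.real(np.diag(G))
    radii = np.sum(np.abs(G), axis=1) - np.abs(np.diag(G))
    return float(np.min(diag - radii))

# Hypothetical stand-in: 6 random unit-norm columns in dimension 256.
rng = np.random.default_rng(2)
n_cols = 6
A = rng.standard_normal((256, n_cols)) + 1j * rng.standard_normal((256, n_cols))
A /= np.linalg.norm(A, axis=0)            # normalize each column
G = A.conj().T @ A                        # Gram matrix, diagonal equals 1

mu = np.max(np.abs(G - np.eye(n_cols)))   # mutual coherence of the columns
lam_min = float(np.min(np.linalg.eigvalsh(G)))
bound = gershgorin_lambda_min_bound(G)
coherence_bound = 1.0 - (n_cols - 1) * mu

print(f"lambda_min = {lam_min:.4f}")
print(f"Gershgorin bound = {bound:.4f}")
print(f"coherence bound = {coherence_bound:.4f}")
```

The three printed values should appear in non-increasing order, which mirrors how the proof replaces the exact smallest eigenvalue by a coherence-based lower bound.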

Figure 4: MSE performance comparison of different channel estimation methods.

Figure 5: BER performance comparison of different channel estimation methods.

Table 1: Computational complexity in one iteration.