Hybrid Precoding Algorithm for Millimeter-Wave Massive MIMO Systems with Subconnection Structures

In mmWave massive MIMO systems, traditional fully digital precoding is difficult to implement because of the high cost and energy consumption of RF chains. Hybrid precoding, which combines digital precoding and analog precoding, solves this problem and also improves system performance. However, the constant modulus constraint introduced by the phase shifters in the analog domain makes the hybrid precoding problem difficult to solve directly. One approach divides the overall optimization problem into two stages: first fix the digital precoding matrix and solve for the analog precoding matrix, and then optimize the digital precoding matrix given the obtained analog precoding matrix. In this paper, an energy-efficient hybrid precoding scheme is proposed for the subconnection structure. In the first stage, the optimization problem is decomposed into a series of subproblems by exploiting the independent submatrix structure of the analog precoding matrix. Once the optimized analog precoding matrix is obtained, the digital precoding matrix is solved by the minimum mean square error (MMSE) criterion. Finally, the digital precoding matrix is normalized to satisfy the power constraint. Simulation results demonstrate that the performance of the proposed algorithm is close to that of fully digital precoding based on the subconnection structure and better than that of existing algorithms. In addition, this paper analyzes the algorithm performance under imperfect channel state information by simulation. The results show that when the estimation accuracy of the channel state information is 0.8, the spectral efficiency of the proposed algorithm is already maintained at a good level.


Introduction
With the rapid development of technology, fifth generation mobile communication (5G) has attracted wide attention because of its higher frequencies, greater network capacity, and lower latency. It integrates many technologies, among which millimeter-wave (mmWave) communication and massive multiple-input multiple-output (MIMO) play important roles. Thanks to the rich spectrum resources available in the high frequency band, mmWave communication can provide extremely high-speed, high-gain links over short distances, but it also suffers from short transmission distance, poor penetration and diffraction capability, and vulnerability to climatic and environmental effects [1]. Massive MIMO systems, which communicate using large antenna arrays, can substantially improve wireless data rates and system energy efficiency [2]. The shorter wavelength of mmWave signals allows large antenna arrays to be integrated into a small space, which makes massive MIMO feasible in wireless communications [3]. In addition, combining the two can significantly improve user throughput, spectral efficiency, and energy efficiency and increase the potential mobile network capacity [4].
In millimeter-wave communication systems, the cost of traditional fully digital precoding is too high for practical implementation. Therefore, hybrid precoding, which combines digital precoding with analog precoding, is a better choice. It is usually based on one of two structures, fully connected and subconnected, as shown in Figure 1.
Since the fully connected structure can approach the theoretically optimal spectral efficiency, it has been studied extensively. In [5], the hybrid precoding problem was reformulated by exploiting the spatial sparsity of the mmWave channel, and the orthogonal matching pursuit (OMP) algorithm was proposed to solve it. Building on the OMP algorithm, a greedy algorithm was proposed in [6]; it does not need to consider the geometry of the antenna array, which reduces the computational complexity while achieving good performance. In [7], a threshold orthogonal matching pursuit (TOMP) algorithm was proposed; by setting an appropriate threshold, it performs close to the optimum and better than the OMP algorithm. Although the OMP algorithm performs well, its complexity is relatively high. Therefore, a real-time, high-performance precoding strategy based on singular value decomposition was studied in [8]; its performance is similar to that of the OMP algorithm at much lower complexity. In [9], an alternating minimization (AltMin) algorithm was proposed using the idea of manifold optimization, and the complexity of the algorithm was further reduced.
The subconnected structure achieves a better balance between performance and cost, which has attracted the attention of many researchers. Unlike the fully connected structure, in which each RF chain is connected to all antennas, each RF chain in the subconnected structure is connected to only a subset of the antennas. Research on hybrid precoding algorithms for the subconnection structure has therefore become a focus of attention in recent years. For the subconnected structure, an AltMin algorithm based on semidefinite relaxation (SDR-AltMin) was also developed in [9]. Similarly, a divide-and-conquer precoding scheme was proposed in [10]; its performance is close to that of SDR-AltMin, it saves computation time, and it is robust to potential saddle points with poor performance. For the energy-efficient subconnection structure, a hybrid precoding scheme based on successive interference cancellation (SIC) was proposed in [11]. It requires neither singular value decomposition nor matrix inversion, and its computational complexity is much lower than that of traditional sparse reconstruction precoding algorithms. In [12], an optimization model based on the minimum mean square error (MMSE) was established, and a low-complexity hybrid precoding algorithm based on particle swarm ant colony optimization (PSACO) was then proposed. A high-performance hybrid precoding algorithm for subconnection structures was studied in [13]; it outperforms the SIC algorithm when the channel state information is imperfect. A new subconnection structure was introduced in [14] to further reduce the power consumption of the system, and an efficient hybrid precoding scheme based on a quality-of-service constraint was proposed for this structure. A general hybrid precoding algorithm was studied in [15], in which a supplementary matrix is introduced and the regularized zero-forcing method is used to solve the problem.
In summary, most of the literature studies the ideal fully connected structure, which in practice has high complexity and energy consumption; its performance is considerable, but its practicality is poor. Some researchers have therefore gradually turned to the more energy-efficient and practical subconnection structure and have achieved some results, but both the performance and the complexity of the hybrid precoding algorithms still need to be improved. Unlike traditional fully digital precoding, hybrid precoding can only control the phase of the data stream in the analog domain and cannot adjust its amplitude. This constant modulus constraint on the analog precoding matrix makes the hybrid precoding problem more difficult to solve.
Thus, to simplify the problem, we solve for the analog precoding matrix and the digital precoding matrix separately, as in [11, 13]. In [11, 15], the total achievable rate R of the system is expanded directly, and the optimization problem is decomposed by recursion into a series of subrate optimization problems. In contrast, this paper starts from an equivalent formulation of the achievable-rate maximization problem, namely the Euclidean distance minimization derived in [5]. We exploit the block diagonal form of the analog precoding matrix to decompose the optimization problem into a series of subproblems that are solved in turn. Each subproblem, which corresponds to one column of the analog precoding matrix, is derived from the expansion of the corresponding Euclidean distance. After the complete analog precoding matrix is obtained, the corresponding digital precoding matrix optimization problem is easy to solve. The simulation results also show that the algorithm achieves considerable performance regardless of whether the acquired channel state information is perfect.
The rest of this paper is arranged as follows. Section 2 introduces the model of mmWave massive MIMO system and the channel model used. Section 3 describes the design process of the proposed algorithm in detail. In Section 4, the simulation results are given, and the performance of the proposed algorithm is analyzed. Finally, we conclude this paper in Section 5.
Notation: in this paper, $\mathbf{A}$, $\mathbf{a}$, and $a$ denote a matrix, a vector, and a scalar, respectively; $(\mathbf{A})^T$, $(\mathbf{A})^H$, and $\|\mathbf{A}\|_F$ are the transpose, conjugate transpose, and Frobenius norm of $\mathbf{A}$, respectively; $\mathbf{A}^{-1}$ and $\mathbf{A}^{\dagger}$ represent the inverse and Moore-Penrose pseudoinverse of $\mathbf{A}$; $\mathrm{Tr}(\mathbf{A})$ indicates the trace of $\mathbf{A}$; $\det(\mathbf{A})$ is the determinant of $\mathbf{A}$; $\mathrm{diag}(\mathbf{A})$ is the diagonal version of $\mathbf{A}$; $\mathrm{abs}(\mathbf{A})$ is the element-wise absolute value of $\mathbf{A}$; $\mathbf{A} \oslash \mathbf{B}$ is the Hadamard (element-wise) division of $\mathbf{A}$ by $\mathbf{B}$; $\mathcal{CN}(0, \sigma^2)$ is the complex Gaussian distribution with mean 0 and covariance $\sigma^2$; $\mathbb{E}(\cdot)$ denotes the expectation; $\mathbb{C}^{m \times n}$ denotes the $m \times n$-dimensional complex space; and $\Re(a)$ is the real part of $a$.

Wireless Communications and Mobile Computing

System Model and Channel Model
Consider a single-user mmWave massive MIMO system as shown in Figure 2. The transmitter and the receiver are equipped with $N_t$ and $N_r$ antennas, respectively [16]. $N_s$ data streams in the baseband are first precoded by a digital precoder $\mathbf{D}$ and then, after passing through the corresponding RF chains, precoded by an analog precoder $\mathbf{A}$. Each data stream is then transmitted by a subantenna array of only $M$ antennas associated with the corresponding RF chain. The numbers of RF chains at the transmitter and the receiver are $N_t^{RF}$ and $N_r^{RF}$, respectively. To support multistream transmission, the following constraints must be satisfied:
$$N_s \le N_t^{RF} \le N_t, \qquad N_s \le N_r^{RF} \le N_r.$$
The hybrid precoder is composed of a digital baseband precoder $\mathbf{D}$ and an analog precoder $\mathbf{A}$, whose dimensions are $N_t^{RF} \times N_s$ and $N_t \times N_t^{RF}$, respectively. Assuming that the initial signal is $\mathbf{s}$, the transmitted signal can be written as $\mathbf{x} = \mathbf{A}\mathbf{D}\mathbf{s}$, where $\mathbf{s}$ is the $N_s \times 1$ symbol vector satisfying $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \frac{1}{N_s}\mathbf{I}_{N_s}$, and $\|\mathbf{A}\mathbf{D}\|_F^2 = N_s$ is the normalized power constraint of the system [17]. The received signal vector $\mathbf{y} = [y_1, y_2, \cdots, y_K]^T$ of the system can be expressed as
$$\mathbf{y} = \sqrt{\rho}\,\mathbf{H}\mathbf{A}\mathbf{D}\mathbf{s} + \mathbf{n}, \tag{1}$$
where $\rho$ is the average received power, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ denotes the channel matrix, $\mathbf{s} = [s_1, s_2, \cdots, s_N]^T$ is the baseband transmitted signal vector, $\mathbf{A}$ and $\mathbf{D}$ are the analog and digital precoding matrices, respectively, and $\mathbf{n}$ is the noise vector with independent and identically distributed (i.i.d.) $\mathcal{CN}(0, \sigma^2)$ entries. $\mathbf{F} = \mathbf{A}\mathbf{D}$ is the hybrid precoding matrix of size $N_t \times N_s$, which satisfies $\|\mathbf{F}\|_F^2 \le N_s$ to meet the total transmit power constraint.
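In the subconnection architecture, each RF chain is connected to $M$ ($M = N_t/N_t^{RF}$) antennas via $M$ phase shifters [18]. Therefore, the digital precoding matrix $\mathbf{D}$ in this architecture is the diagonal matrix $\mathbf{D} = \mathrm{diag}[d_1, d_2, \cdots, d_N]$, where $N$ is the number of RF chains at the transmitter. The corresponding analog precoding matrix is a block diagonal matrix, which is represented as
$$\mathbf{A} = \begin{bmatrix} \bar{\mathbf{a}}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \bar{\mathbf{a}}_2 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \bar{\mathbf{a}}_N \end{bmatrix}_{N_t \times N}, \tag{2}$$
where $\bar{\mathbf{a}}_n \in \mathbb{C}^{M \times 1}$ can be regarded as the analog weighting vector of the $n$th subantenna array. Due to the constant modulus constraint, all elements of $\bar{\mathbf{a}}_n$ have the same amplitude but different phases [18].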
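The structure described above can be illustrated with a minimal NumPy sketch. It assembles a block diagonal, unit-modulus analog precoder and a diagonal digital precoder, scales them so that the power constraint holds, and forms the transmitted signal. The function name `build_subconnected_precoder`, the sizes N = 4 and M = 8, the random phases, and the choice N_s = N are illustrative assumptions, not values from the paper.

```python
import numpy as np


def build_subconnected_precoder(phases, d):
    """Assemble A (block diagonal, unit modulus) and D (diagonal).

    phases: (N, M) phase-shifter angles in radians, one row per RF chain.
    d:      length-N digital weights, one per RF chain.
    """
    n_rf, m = phases.shape
    A = np.zeros((n_rf * m, n_rf), dtype=complex)
    for n in range(n_rf):
        # Column n is nonzero only on the M antennas of subarray n.
        A[n * m:(n + 1) * m, n] = np.exp(1j * phases[n])
    D = np.diag(np.asarray(d, dtype=complex))
    return A, D


rng = np.random.default_rng(0)
N, M = 4, 8  # hypothetical sizes, so N_t = N * M = 32
A, D = build_subconnected_precoder(rng.uniform(0, 2 * np.pi, (N, M)), np.ones(N))

# Scale D so that ||A D||_F^2 = N_s (here N_s = N is assumed).
D = D * np.sqrt(N) / np.linalg.norm(A @ D, "fro")
F = A @ D

# Symbols with E[s s^H] = I / N_s, then the transmitted signal x = A D s.
s = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2 * N)
x = F @ s
```

Each column of `A` has exactly M nonzero entries, which is why the subconnected structure needs only N·M phase shifters.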

Channel Model.
Unlike traditional low-frequency channels, millimeter-wave channels no longer follow Rayleigh fading. Due to the high free-space path loss, the spatial selectivity of millimeter-wave propagation is limited [19]. In addition, large-scale dense antenna arrays result in pronounced antenna correlation in millimeter-wave channels [20]. Therefore, the traditional channel model is not suitable for the millimeter-wave channel in the massive MIMO system. For simplicity, this paper adopts the geometric Saleh-Valenzuela model used in most of the literature [20, 21]. The channel matrix is given by
$$\mathbf{H} = \sqrt{\frac{N_t N_r}{N_{cl} N_{ray}}} \sum_{i=1}^{N_{cl}} \sum_{k=1}^{N_{ray}} \alpha_{ik}\, \mathbf{a}_r(\phi_{ik}^r, \theta_{ik}^r)\, \mathbf{a}_t(\phi_{ik}^t, \theta_{ik}^t)^H, \tag{3}$$
where $N_{cl}$ is the number of scattering clusters and each cluster contributes $N_{ray}$ propagation paths. $\alpha_{ik}$ is the gain of the $k$th ray in the $i$th scattering cluster and is i.i.d. $\mathcal{CN}(0, \sigma_{\alpha,i}^2)$, where $\sigma_{\alpha,i}^2$ indicates the power of the $i$th cluster [22]. $\mathbf{a}_r(\phi_{ik}^r, \theta_{ik}^r)$ and $\mathbf{a}_t(\phi_{ik}^t, \theta_{ik}^t)$ represent the normalized antenna array response vectors at the receiver and transmitter, respectively. $\phi_{ik}^r$ ($\theta_{ik}^r$) and $\phi_{ik}^t$ ($\theta_{ik}^t$) are the azimuth (elevation) angles of the $k$th ray in the $i$th scattering cluster on the receiving and transmitting sides, respectively. The mean cluster angles are uniformly distributed in $[0, 2\pi)$ [23]. Within the $i$th cluster, the arrival and departure angles of each ray follow a Laplacian distribution whose mean is the mean cluster angle and whose standard deviation is the angular spread. The antenna array response vector depends on the structure of the antenna array at the transmitter and receiver. There are two types of antenna array structures: the uniform linear array (ULA) and the uniform planar array (UPA). For millimeter-wave massive MIMO systems, a uniform planar array is conducive to the packaging of antennas at the transceiver, and it makes beamforming more flexible. On the other hand, as the UPA can bring better energy efficiency and spectral efficiency, we mainly consider the UPA. For a uniform planar array with $W$ antenna elements in the horizontal direction and $H$ antenna elements in the vertical direction, the array response vector is given by [24]
$$\mathbf{a}(\phi, \theta) = \frac{1}{\sqrt{WH}} \left[1, \cdots, e^{j\frac{2\pi}{\lambda} d (m \sin\phi \sin\theta + n \cos\theta)}, \cdots, e^{j\frac{2\pi}{\lambda} d ((W-1) \sin\phi \sin\theta + (H-1) \cos\theta)}\right]^T, \tag{4}$$
where $\lambda$ is the wavelength of the carrier signal, $d$ is the distance between adjacent antennas, and $m$ and $n$ satisfy $0 \le m < W$ and $0 \le n < H$, respectively.
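As a rough illustration of the channel model, the sketch below draws one realization of a clustered geometric channel with UPA responses at both ends. It assumes the common normalization factor sqrt(N_t N_r / (N_cl N_ray)), unit cluster powers, half-wavelength spacing, and a Laplacian ray distribution whose standard deviation equals the angular spread; the function names `upa_response` and `sv_channel` are hypothetical, and the element ordering inside the response vector is one arbitrary but consistent choice.

```python
import numpy as np


def upa_response(phi, theta, W, H_el, d_over_lambda=0.5):
    """Normalized UPA response for azimuth phi and elevation theta (radians).

    Element (m, n) has phase 2*pi*(d/lambda)*(m*sin(phi)*sin(theta) + n*cos(theta)).
    """
    m = np.arange(W)[:, None]
    n = np.arange(H_el)[None, :]
    phase = 2 * np.pi * d_over_lambda * (
        m * np.sin(phi) * np.sin(theta) + n * np.cos(theta)
    )
    return np.exp(1j * phase).ravel() / np.sqrt(W * H_el)


def sv_channel(rng, Wt, Ht, Wr, Hr, n_cl=8, n_ray=10, spread=np.deg2rad(10.0)):
    """One realization of the clustered channel (unit cluster power assumed)."""
    Nt, Nr = Wt * Ht, Wr * Hr
    H = np.zeros((Nr, Nt), dtype=complex)
    scale = spread / np.sqrt(2.0)  # a Laplace variable has std = sqrt(2) * scale
    for _ in range(n_cl):
        # Mean azimuth/elevation angles at transmitter and receiver, uniform in [0, 2*pi).
        mean = rng.uniform(0.0, 2 * np.pi, size=4)
        ang = rng.laplace(mean[:, None], scale, size=(4, n_ray))
        alpha = (rng.standard_normal(n_ray) + 1j * rng.standard_normal(n_ray)) / np.sqrt(2.0)
        for k in range(n_ray):
            a_t = upa_response(ang[0, k], ang[1, k], Wt, Ht)
            a_r = upa_response(ang[2, k], ang[3, k], Wr, Hr)
            H += alpha[k] * np.outer(a_r, a_t.conj())
    return np.sqrt(Nt * Nr / (n_cl * n_ray)) * H


H = sv_channel(np.random.default_rng(1), Wt=8, Ht=8, Wr=4, Hr=4)  # 16 x 64 channel
```

The default values of 8 clusters, 10 rays per cluster, and a 10 degree angular spread mirror the simulation settings quoted later in the paper; the 8 × 8 and 4 × 4 array shapes are only one way to obtain 64 and 16 antennas.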

Hybrid Precoding Algorithm Design
In this paper, a near-optimal hybrid precoding algorithm is proposed for the subconnection structure. Because the joint transmitter-receiver optimization problem can be temporarily decoupled [5], this paper mainly discusses the design of the hybrid precoding algorithm at the transmitter of the system. The fundamental goal of the hybrid precoding algorithm is to maximize the spectral efficiency $R$ by designing the optimal hybrid precoder. The spectral efficiency of the system is expressed as follows [5]:
$$R = \log_2 \det\left(\mathbf{I}_{N_s} + \frac{\rho}{N_s \sigma^2}\, \mathbf{F}^H \mathbf{H}^H \mathbf{H} \mathbf{F}\right), \tag{5}$$
where $\mathbf{I}_{N_s}$ denotes the $N_s \times N_s$ identity matrix and $\mathbf{F} = \mathbf{A}\mathbf{D}$ is the hybrid precoding matrix with block diagonal form. Thus, the hybrid precoder optimization problem in this paper can be expressed as
$$(\mathbf{A}^{opt}, \mathbf{D}^{opt}) = \arg\max_{\mathbf{A}, \mathbf{D}} R, \quad \text{s.t.}\ \mathbf{A} \in \mathcal{A},\ \|\mathbf{A}\mathbf{D}\|_F^2 = N_s. \tag{6}$$
Hybrid precoding introduces phase shifters in the analog domain to process the transmitted signal. Since a phase shifter can only adjust the phase of the signal and not its amplitude, the nonzero entries of the analog precoding matrix must have the same modulus [18]. $\mathcal{A}$ denotes the set of matrices $\mathbf{A}$ that satisfy this constant modulus constraint. Because of the constant modulus constraint, problem (6) is a nonconvex optimization problem. Although the constraints make it difficult to obtain the exact solution directly, a near-optimal hybrid precoding matrix can be obtained by solving an approximation of (6). As described in [5], problem (6) can be further recast as problem (7):
$$(\mathbf{A}^{opt}, \mathbf{D}^{opt}) = \arg\min_{\mathbf{A}, \mathbf{D}} \|\mathbf{F}^{opt} - \mathbf{A}\mathbf{D}\|_F, \quad \text{s.t.}\ \mathbf{A} \in \mathcal{A},\ \|\mathbf{A}\mathbf{D}\|_F^2 = N_s. \tag{7}$$
The Euclidean distance is used to make the hybrid precoder as close as possible to the unconstrained optimal digital precoder $\mathbf{F}^{opt}$. It is shown in [5] that minimizing the objective function in (7) approximately maximizes the spectral efficiency. The unconstrained optimal precoding matrix can be obtained by singular value decomposition of the channel matrix: with $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$, $\mathbf{F}^{opt}$ consists of the first $N_s$ columns of the right singular matrix $\mathbf{V}$.
Problem (7) fundamentally aims to obtain a near-optimal analog precoding matrix $\mathbf{A}$ and digital precoding matrix $\mathbf{D}$. The constant modulus constraint in (7) makes it difficult to solve the problem directly. Thanks to the special block diagonal form of the hybrid precoding matrix, the precoding on different subarrays is independent. This allows us to decompose the overall optimization problem into a series of suboptimization problems, each of which can be represented by
$$\bar{\mathbf{f}}_n = d_n \bar{\mathbf{a}}_n, \tag{8}$$
where $\mathbf{f}_n$ is the $n$th column of the hybrid precoding matrix $\mathbf{F}$ and $\bar{\mathbf{f}}_n \in \mathbb{C}^{M \times 1}$ collects the nonzero elements of the $n$th column, with $\bar{\mathbf{a}}_n$ the nonzero part of the $n$th column of $\mathbf{A}$ and $d_n$ the $n$th diagonal element of $\mathbf{D}$. The concrete structure of $\mathbf{f}_n$ is
$$\mathbf{f}_n = \left[\mathbf{0}_{1 \times M(n-1)},\ \bar{\mathbf{f}}_n^T,\ \mathbf{0}_{1 \times M(N-n)}\right]^T. \tag{9}$$

Figure 2: Single-user mmWave massive MIMO system.
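The unconstrained reference precoder and the rate objective can be sketched in a few lines of NumPy. The sketch takes the first N_s right singular vectors of the channel and evaluates the log-determinant rate for any precoder, assuming equal power per stream and using `snr` for the ratio ρ/σ². The function names and the random test channel are illustrative assumptions.

```python
import numpy as np


def unconstrained_precoder(H, n_s):
    """First n_s right singular vectors of H (numpy returns V^H, so transpose back)."""
    _, _, Vh = np.linalg.svd(H)
    return Vh.conj().T[:, :n_s]


def spectral_efficiency(H, F, snr):
    """log2 det(I + (snr / N_s) F^H H^H H F), with snr = rho / sigma^2 (linear scale)."""
    n_s = F.shape[1]
    G = H @ F
    gram = np.eye(n_s) + (snr / n_s) * (G.conj().T @ G)
    _, logdet = np.linalg.slogdet(gram)  # natural-log determinant of a Hermitian PD matrix
    return logdet / np.log(2.0)


rng = np.random.default_rng(0)
H = (rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))) / np.sqrt(2.0)
F_opt = unconstrained_precoder(H, n_s=4)
R_opt = spectral_efficiency(H, F_opt, snr=10.0)
```

Among precoders with orthonormal columns and equal power per stream, `F_opt` gives the largest value of this rate expression, which is why it serves as the target in the Euclidean distance formulation.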

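It is clear from (8) that the design of the hybrid precoder $\mathbf{F}$ can be converted into optimizing the analog precoder $\mathbf{A}$ and the digital precoder $\mathbf{D}$ separately; equivalently, the problem reduces to optimizing the precoding vector $\mathbf{f}_n$ of each independent subantenna array. Therefore, we assume that the digital precoding matrix $\mathbf{D}$ is fixed and first optimize the analog precoding matrix $\mathbf{A}$. The optimization problem is expressed as
$$\mathbf{A}^{opt} = \arg\min_{\mathbf{A}} \|\mathbf{F}^{opt} - \mathbf{A}\mathbf{D}\|_F, \quad \text{s.t.}\ \mathbf{A} \in \mathcal{A},\ \|\mathbf{A}\mathbf{D}\|_F^2 = N_s. \tag{10}$$
Since the hybrid precoding matrix of the subconnected structure has a special block diagonal form, the squared distance separates over the columns, and the total optimization problem decomposes into a series of independent subarray optimization problems. The power constraint in (10) can be met by scaling after the analog precoder and the digital precoder have been optimized, so it is ignored for now. With the help of (8), the subarray optimization problem is described as
$$\bar{\mathbf{a}}_n^{opt} = \arg\min_{\bar{\mathbf{a}}_n} \left\|\mathbf{F}_n^{opt} - \bar{\mathbf{a}}_n d_n\right\|_F^2, \quad \text{s.t.}\ \left|(\bar{\mathbf{a}}_n)_m\right| = 1,\ m = 1, \cdots, M, \tag{11}$$
where $\bar{\mathbf{a}}_n \in \mathbb{C}^{M \times 1}$ is the nonzero part of the $n$th column of $\mathbf{A}$, $d_n$ is the $n$th diagonal element of $\mathbf{D}$, and $\mathbf{F}_n^{opt}$ is the $M \times 1$ part of the $n$th column of $\mathbf{F}^{opt}$ in the rows occupied by $\bar{\mathbf{a}}_n$. As can be seen from (11), the optimization of the analog precoding matrix $\mathbf{A}^{opt}$ is further decomposed into the optimization of the nonzero subvectors $\bar{\mathbf{a}}_n$. The objective function of subproblem (11) is expanded as
$$\left\|\mathbf{F}_n^{opt} - \bar{\mathbf{a}}_n d_n\right\|_F^2 = \mathrm{Tr}\left[\left(\mathbf{F}_n^{opt} - \bar{\mathbf{a}}_n d_n\right)^H \left(\mathbf{F}_n^{opt} - \bar{\mathbf{a}}_n d_n\right)\right] = \mathrm{Tr}\left((\mathbf{F}_n^{opt})^H \mathbf{F}_n^{opt}\right) - 2 d_n \Re\left\{\mathrm{Tr}\left(\bar{\mathbf{a}}_n^H \mathbf{F}_n^{opt}\right)\right\} + M d_n^2. \tag{12}$$
The identities $\mathrm{Tr}(\mathbf{A}\mathbf{B}) = \mathrm{Tr}(\mathbf{B}\mathbf{A})$ and $\bar{\mathbf{a}}_n^H \bar{\mathbf{a}}_n = M$ are used in the expansion (12), with $d_n$ treated as a real scalar. Because all the elements of $\bar{\mathbf{a}}_n$ have amplitude 1, $\bar{\mathbf{a}}_n^H \bar{\mathbf{a}}_n = M$ clearly holds. From (12), the expansion consists of three parts, of which the first term $\mathrm{Tr}((\mathbf{F}_n^{opt})^H \mathbf{F}_n^{opt})$ and the last term $M d_n^2$ are fixed. For positive $d_n$, the problem therefore becomes maximizing the real part of $d_n \mathrm{Tr}(\bar{\mathbf{a}}_n^H \mathbf{F}_n^{opt})$, that is, maximizing the real part of $\bar{\mathbf{a}}_n^H \mathbf{F}_n^{opt}$. Under the constant modulus constraint, when the phase of each element of $\bar{\mathbf{a}}_n$ equals the phase of the corresponding element of $\mathbf{F}_n^{opt}$, $\bar{\mathbf{a}}_n^H \mathbf{F}_n^{opt}$ is real and reaches its maximum. Therefore, the optimal $\bar{\mathbf{a}}_n$ is obtained by keeping the phase of each element of $\mathbf{F}_n^{opt}$ and normalizing its modulus, that is, $\bar{\mathbf{a}}_n^{opt} = \mathbf{F}_n^{opt} \oslash \mathrm{abs}(\mathbf{F}_n^{opt})$.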
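The phase-extraction step for each subarray can be sketched as follows. The sketch assumes that the number of data streams equals the number of RF chains, so that the reference precoder has one column per subarray, and that subarray n occupies rows nM to (n+1)M − 1 of column n; the function name `analog_precoder` is hypothetical. It uses `np.angle` rather than an element-wise division so that an exactly zero entry maps to phase 0 instead of producing a division by zero.

```python
import numpy as np


def analog_precoder(F_opt, M):
    """Block diagonal A whose nonzero entries copy the phases of F_opt.

    Assumes F_opt is (N*M) x N, i.e. one column per RF chain.
    """
    n_t, n_rf = F_opt.shape
    if n_t != n_rf * M:
        raise ValueError("F_opt must have N*M rows and N columns")
    A = np.zeros((n_t, n_rf), dtype=complex)
    for n in range(n_rf):
        rows = slice(n * M, (n + 1) * M)
        f_bar = F_opt[rows, n]  # part of column n seen by subarray n
        A[rows, n] = np.exp(1j * np.angle(f_bar))  # keep phase, set modulus to 1
    return A


rng = np.random.default_rng(0)
N, M = 4, 8  # hypothetical sizes
F_opt = rng.standard_normal((N * M, N)) + 1j * rng.standard_normal((N * M, N))
A = analog_precoder(F_opt, M)
```

With this choice, the inner product between each subarray vector and the matching part of the reference precoder is real and equals the sum of the moduli of that part, which is the largest value any unit-modulus vector can reach.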
If $n < N_t^{RF}$, then set $n = n + 1$ and solve the next subproblem in the same way. By solving for $\bar{\mathbf{a}}_n$ successively, the complete optimized analog precoding matrix $\mathbf{A} = \mathrm{diag}[\bar{\mathbf{a}}_1, \bar{\mathbf{a}}_2, \cdots, \bar{\mathbf{a}}_N]$ is obtained. Once the analog precoding matrix is determined, the corresponding optimal digital precoding matrix $\mathbf{D}$ is readily obtained by the minimum mean square error (MMSE) criterion.
Finally, the digital precoding matrix is normalized to meet the previously ignored power constraints.
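The paper's closed-form expression for the digital precoder is not reproduced in this text, so the sketch below substitutes a plain least-squares fit, which is a common choice for this step and may differ from the authors' formula. For a fixed analog precoder it computes the pseudoinverse solution, optionally keeps only the diagonal to match the diagonal form described for the subconnected architecture, and then rescales so that the power constraint holds. The function name `digital_precoder` and the `diagonal` flag are hypothetical.

```python
import numpy as np


def digital_precoder(A, F_opt, diagonal=True):
    """Least-squares digital precoder for fixed A, scaled so ||A D||_F^2 = N_s.

    Assumption: a least-squares fit stands in for the paper's MMSE expression.
    """
    n_s = F_opt.shape[1]
    D = np.linalg.pinv(A) @ F_opt  # minimizes ||F_opt - A D||_F over all D
    if diagonal:
        # Columns of A are orthogonal, so the best diagonal D is the diagonal of the fit.
        D = np.diag(np.diag(D))
    return D * np.sqrt(n_s) / np.linalg.norm(A @ D, "fro")


rng = np.random.default_rng(0)
N, M = 4, 8  # hypothetical sizes
A = np.zeros((N * M, N), dtype=complex)
for n in range(N):
    A[n * M:(n + 1) * M, n] = np.exp(1j * rng.uniform(0, 2 * np.pi, M))
# Stand-in for the unconstrained precoder: a random matrix with orthonormal columns.
F_opt, _ = np.linalg.qr(rng.standard_normal((N * M, N)) + 1j * rng.standard_normal((N * M, N)))
D = digital_precoder(A, F_opt)
```

Because the columns of a block diagonal, unit-modulus analog precoder are orthogonal with squared norm M, the pseudoinverse reduces to the conjugate transpose divided by M, so no matrix inversion is needed in practice.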
The complete process of hybrid precoding design is shown in Algorithm 1.
On the receiving side, the analog combining matrix satisfies $\mathbf{W}_{RF}^H \mathbf{W}_{RF} \propto \mathbf{I}$, while the digital combining matrix satisfies $\mathbf{W}_{BB}^H \mathbf{W}_{BB} \propto \mathbf{I}$. If $\mathbf{H}_{eff} = \mathbf{H}\mathbf{A}\mathbf{D}$ is regarded as an equivalent channel matrix, the spectral efficiency can be written in the form given in [13]. The spectral efficiency at the receiver and the mutual information at the transmitter have similar structures, so the analog and digital combiners at the receiver can be designed in the same way as the precoders at the transmitter.
The algorithm proposed in this paper solves for one column vector at a time, iteratively. As mentioned in [11], the expected performance of the algorithm can be achieved when the number of iterations is set to 5. The HPD-PS algorithm in [13] requires $5N_s$ iterations, and the dominant complexity of each iteration is $O(N_t^2/(N_t^{RF})^2)$, while the complexity of each iteration of the SIC algorithm in [11] is $O(N_t^2/N_s^2)$. The dominant complexity of each iteration of the algorithm presented in this paper is about $O(2N_tN_s/N_t^{RF})$. Compared with these subconnection-structure hybrid precoding algorithms, the computational complexity of the proposed algorithm is therefore not high.
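As a rough worked example of these per-iteration orders, take the configuration used later for Figure 4, with 144 transmit antennas and 4 RF chains, and assume that the number of data streams equals the number of RF chains. Constant factors hidden by the order notation are ignored, so the numbers indicate relative scale only.

```latex
\begin{align*}
\text{HPD-PS [13]:}\quad & \frac{N_t^2}{(N_t^{RF})^2} = \frac{144^2}{4^2} = 1296,\\
\text{SIC [11]:}\quad & \frac{N_t^2}{N_s^2} = \frac{144^2}{4^2} = 1296,\\
\text{Proposed:}\quad & \frac{2 N_t N_s}{N_t^{RF}} = \frac{2 \cdot 144 \cdot 4}{4} = 288.
\end{align*}
```

Under these assumptions the proposed per-iteration cost grows linearly in the number of transmit antennas, while the other two grow quadratically.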
In addition, mmWave communication is inherently wideband, so the extension of the algorithm to wideband communication is briefly discussed here. As in narrowband systems, the extra orthogonality constraint on the digital precoder has negligible impact on spectral efficiency in mmWave OFDM systems [9]. That is, the main difference between the hybrid precoding schemes of wideband and narrowband systems lies in the design of the analog precoder. In mmWave MIMO-OFDM systems, the signals of all subcarriers usually need to share one analog precoder. Thus, the analog precoding solution can be extended to wideband systems by carrier aggregation and other related operations.

Simulation Results
In this section, the performance of the proposed algorithm is simulated and compared with the hybrid precoding algorithm based on successive interference cancellation in [11] and the HPD-PS algorithm in [13]. For reference, the result figures also show the performance of fully digital precoding and fully analog precoding for the subconnection structure.
In this simulation, the default parameters are set as follows. The channel matrix is generated from the channel description above. We model the propagation environment with $N_{cl} = 8$ clusters and $N_{ray} = 10$ rays per cluster [24]. The angular spread of each cluster is 10 degrees. The carrier frequency is set to 28 GHz [11]. UPAs are used at both the transmitter and the receiver, and the distance between adjacent antennas is $d = \lambda/2$. We assume that the AODs follow the uniform distribution within $[-\pi/6, \pi/6]$, while the AOAs follow the uniform distribution within $[-\pi, \pi]$ [22]. Moreover, the maximum number of iterations is set to 5.

Performance under Perfect CSI.
We first consider the ideal case, that is, a millimeter-wave massive MIMO communication scenario in which perfect CSI is available. Figure 3 compares the spectral efficiency in a millimeter-wave MIMO system with $NM \times N_r = 64 \times 16$ and 8 RF chains. The subconnection-structure algorithm proposed in this paper outperforms the SIC hybrid precoding algorithm over the whole simulated signal-to-noise ratio range. In addition, the spectral efficiency of the proposed algorithm is close to that of the optimal unconstrained fully digital precoding algorithm with the subconnected structure.
Considering that the SIC algorithm was proposed in 2016, Figure 4 further compares the spectral efficiency of the proposed algorithm with that of the SIC-based hybrid precoding algorithm in [11] and the HPD-PS algorithm in [13], for a transmitter with $N_t = 144$ antennas, a receiver with $N_r = 16$ antennas, and $N = 4$. It has been shown in [13] that the HPD-PS algorithm performs better than the SIC hybrid precoding algorithm, and this result can also be seen in Figure 4. In addition, the hybrid precoding algorithm in this paper is much closer to the unconstrained fully digital precoding algorithm than the HPD-PS algorithm is.
There is a gap of 5 dB between the optimal fully connected method and the optimal unconstrained precoding for the subconnected structure. However, the subconnection structure is much lower in cost and hardware complexity than the fully connected structure. Taking the system model of this article as an example, the fully connected structure requires at least $N^2M$ phase shifters, while the subconnected structure needs only $NM$ phase shifters.
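As a quick worked example of the hardware saving, take the Figure 3 configuration, in which the transmitter has 64 antennas and 8 RF chains, so that each subarray has 8 antennas:

```latex
\begin{align*}
M &= \frac{N_t}{N} = \frac{64}{8} = 8,\\
\text{fully connected:}\quad N^2 M &= 8^2 \cdot 8 = 512 \ \text{phase shifters},\\
\text{subconnected:}\quad N M &= 8 \cdot 8 = 64 \ \text{phase shifters}.
\end{align*}
```

The subconnected structure therefore uses a factor of N = 8 fewer phase shifters in this setting.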
Algorithm 1: Hybrid precoding design.
Input: $\mathbf{H}$
  $[\sim, \sim, \mathbf{V}_d] = \mathrm{svd}(\mathbf{H})$;
  $\mathbf{F}^{opt} = (\mathbf{V}_d)_{:,1:N_s}$;
  for $l \le L$ do
    for $n \le N_{RF}$ do
      $\bar{\mathbf{a}}_n = \mathbf{F}_n^{opt} \oslash \mathrm{abs}(\mathbf{F}_n^{opt})$
    end for
  end for
  $\mathbf{A} = \mathrm{diag}[\bar{\mathbf{a}}_1, \bar{\mathbf{a}}_2, \cdots, \bar{\mathbf{a}}_N]$
  Compute $\mathbf{D}$ according to (16)

Impact of Imperfect CSI.
In practical communication, however, perfect channel state information is difficult to obtain because of the lack of cooperation between the base station and the user and because of uncertain factors such as channel estimation errors and limited feedback. As a result, the preprocessing of the transmit signal at the base station cannot eliminate interuser interference as completely as it can with accurate channel state information, and the users cannot apply interference cancellation at the receiving end because of their limited number of receive antennas. No matter how accurate the channel estimation algorithm is, there is always a fixed error bound, so an arbitrarily specified accuracy requirement cannot be met [25, 26]. Therefore, it is of great research and practical value to analyze the effect of imperfect channel state information on the performance of downlink multiuser MIMO systems, as well as methods for obtaining and using channel state information effectively. Data acquisition and processing in communication networks cannot guarantee a perfect result [27-30]. Although many channel estimation algorithms exist, a gap always remains between the estimate and perfect channel state information. The performance of the proposed hybrid precoding algorithm is therefore also compared under imperfect channel state information. The channel state information is estimated by the minimum mean square error method, and the estimated channel is expressed as follows [31]:
$$\hat{\mathbf{H}} = t\,\mathbf{H} + \sqrt{1 - t^2}\,\mathbf{E},$$
where $\hat{\mathbf{H}}$ is the estimated channel matrix, $0 \le t \le 1$ represents the CSI accuracy, and $\mathbf{E}$ is the error matrix with i.i.d. $\mathcal{CN}(0, 1)$ entries.
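The imperfect CSI model can be simulated with a short sketch. It assumes the commonly used form in which the estimate is t times the true channel plus sqrt(1 − t²) times an error matrix with i.i.d. unit-variance complex Gaussian entries, where t is the CSI accuracy; the function name `imperfect_csi` and the random test channel are illustrative.

```python
import numpy as np


def imperfect_csi(H, t, rng):
    """Return t * H + sqrt(1 - t^2) * E with E having i.i.d. CN(0, 1) entries."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("CSI accuracy t must lie in [0, 1]")
    E = (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)) / np.sqrt(2.0)
    return t * H + np.sqrt(1.0 - t ** 2) * E


rng = np.random.default_rng(0)
H = (rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))) / np.sqrt(2.0)
H_hat = imperfect_csi(H, t=0.8, rng=rng)  # the precoder would be designed from H_hat
```

In a simulation, the precoder is designed from `H_hat` while the spectral efficiency is evaluated on the true channel `H`, so that t = 1 recovers the perfect CSI case.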
Figure 5 compares the performance of the proposed algorithm with that of HPD-PS and optimal unconstrained fully digital precoding under both imperfect and perfect channel state information. The performance of the proposed algorithm is not as good as that of HPD-PS when $t = 0.6$, but it is close to that of HPD-PS when $t = 0.8$.
In addition, Figure 6 shows detailed simulation results for the proposed algorithm under imperfect channel state information. As can be seen from the figure, the spectral efficiency of the algorithm is relatively stable from perfect channel state information down to $t = 0.8$, and even at $t = 0.6$ the performance of the algorithm does not decrease greatly.

Conclusions
In this paper, we focus on hybrid precoding for millimeter-wave massive MIMO systems based on the subconnection structure. First, the hybrid precoding optimization problem is decoupled into analog precoding and digital precoding. In the optimization process, the digital precoding matrix is first fixed to solve for the optimal analog precoder. Thanks to the block diagonal form of the analog precoding matrix, the optimization problem can be decomposed into a series of subproblems that are solved in turn. Finally, the optimal digital precoding matrix is obtained by MMSE. The simulation results show that the spectral efficiency of the proposed algorithm is better than that of SIC and HPD-PS. Because perfect channel state information cannot be obtained in practice, the performance of the algorithm under imperfect channel state information is also simulated and analyzed. The performance degradation of the algorithm is acceptable even under imperfect CSI, and its performance remains similar to that of fully digital precoding under perfect channel state information.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.