Machine Learning Empowered Accurate CSI Prediction for Large-Scale 5G Networks

Wi-Fi networks rely on channel estimation to ensure their performance. The computational complexity and dependability of fifth generation telecommunication networks have significantly improved using supervised learning. In this paper, we develop a channel estimation model that uses a machine learning approach; the study uses multipath channel simulations to estimate channel state information (CSI) over arbitrary transceiver antennas. Simulations test the efficacy of the model against various machine learning channel estimation models. The results show that the proposed model achieves higher channel estimation quality and a lower bit error rate than the other methods. The proposed method also achieves a reduced mismatch rate of 1.26 × 10^−1.5 on Doppler frequency during channel estimation, whereas the mismatch rate is higher in existing methods.


Introduction
In the current mobile communication systems, more devices are connected at the base stations, and the volume of data traffic is predicted to grow rapidly as well [1]. Because of the large number of devices and applications, infrastructure management has become increasingly challenging [2].
Low-power connectivity is necessary for the Internet of Things (IoT), whereas higher-speed mobile communication is required for trains travelling at up to 300 kilometres per hour, and fiber-like broadband access is required for users at home [3]. A number of technologies have been proposed to support these goals [4]. Antenna beamforming, MIMO, custom-tailored and virtualized network functions (VNFs), and adequately provisioned network slices are only a few examples of these technologies [5].
It is feasible to use some data-based technologies to manage 5G networks, which would be advantageous. Dynamic mobile traffic analysis, for example, can be used to forecast the position of the user when it comes to handover procedures [4,5]. Another example is the allocation of network slices, which takes into account the state of the network and the availability of resources [6]. Each of these scenarios is built on the foundation of data analysis. Depending on the source, some predictions regarding future behaviour are based on historical data, while others are based on present conditions and are designed to assist in decision-making. It is possible to overcome these types of challenges with the use of machine learning techniques [7][8][9].
Conventional machine learning algorithms are limited in their ability to handle natural data in raw form. Building a machine learning or pattern recognition system has for decades required substantial domain expertise and rigorous engineering, which is why the design of a feature extractor is so important. Once this stage has been completed, the data can be translated into a representation suitable for the learning system [7].
The use of two alternative sparsifying bases in a hybrid feedback compression approach for a slowly varying propagation environment can help to achieve a better balance between the feedback load and the CSI recovery performance. However, because massive MIMO is likely to be deployed in mmWave frequency ranges in cellular networks, a compressive sensing-based approach may not be feasible: strong spatial correlation is unlikely at high carrier frequencies. The beamforming-based solution, on the other hand, is probably more practical because the restricted number of propagation channels available allows the user to concentrate on only a few angular beams for CSI measurement and reporting. This study discusses downlink estimation that uses a machine learning approach to maintain the trade-off between resource and energy consumption. The approach does not require CSI feedback from the users, which is extremely efficient in terms of resource and power savings at the user end.
In this paper, we develop a novel channel estimation model that uses a machine learning approach, namely, a back propagation neural network (BPNN); the study uses multipath channel simulations to estimate channel state information (CSI) over arbitrary transceiver antennas.

Background
The estimation of CSI is a persistent problem in wireless systems. CSI describes the quality of a communication link [4]: its characteristics determine how a signal travels from its source to its intended destination. Transmissions can be tailored to the current channel conditions on the basis of the CSI in order to improve overall communication performance. The CSI affects a variety of things, including radio resources, modulation, and coding schemes.
Traditional CSI estimation methods [10] sometimes require high-performance computation [11]. As a result, numerous authors now use machine learning models in their CSI estimation work, which is a significant advancement. Our review identified five papers on machine learning-based CSI estimation.
Three studies [11][12][13] proposed machine learning-based techniques for MIMO systems. MIMO systems employ an array of antennas at both the transmitter and the receiver, resulting in more efficient transmission and reception. Compared with LTE, this is an extremely important 5G technology because of the large gains in spectral and energy efficiency it delivers [14]. While MIMO is used in LTE, 5G employs massive MIMO, which makes use of extremely large antenna configurations.
In [11], the authors consider a MIMO system in a way that avoids explicit Doppler rate estimation. In carrier channels where the Doppler rate varies between packets, optimal operation is computationally difficult. Machine learning models were therefore applied to learn and estimate MIMO fading channels with varying Doppler rates.
Using a combination of machine learning and overlay coding techniques, the authors of [12] demonstrated CSI feedback. CSI estimation at the downlink and identification of user data at base stations are the key goals of that work. Xu et al. [13] evaluate the use of machine learning models to estimate CSI in three different use cases. The first scenario uses machine learning models to estimate the angular power spectrum; the other two involve static estimation based on machine learning and a variant that takes temporal variation into account, i.e., the machine learning model is recommended for estimating in-band CSI across time. According to Albataineh et al. [4], one method of predicting the quality of an Internet radio link transmission is to take into account elements. A machine learning model is used to replace the traditional methods of CSI estimation, equalization, and demapping [15,16].

Proposed Method
In this section, Figure 1 depicts the transmitter and receiver of a MIMO-OFDM system. Since the 5G channel profile is modeled after the NTNR MIMO channel model, the MATLAB 5G Toolbox is used to simulate the instantaneous channels of this propagation channel model. The TDL-C profile is used for 5G and beyond channels; its channel gain varies from 12 to 47 dB. Because the mobile communication frequency is 4 GHz in this case, the channel profile is not sparse.
Wireless Communications and Mobile Computing

Transmitter.
The modulation block in Figure 1 (transmitter block) encodes and maps binary data on the transmitter side using quadrature amplitude modulation (QAM). There are T time slots, and the QAM symbols are concatenated into x(t) ∈ C^N at time t, where N is the number of modulation symbols. The data are arranged as N_T vectors corresponding to the N_T transmit antennas. This is accomplished by first converting the data to parallel form; the pilot signals are then inserted with the data into each layer for use in channel estimation. An inverse fast Fourier transform (IFFT) converts the signals from the frequency domain to the time domain. The signal vector x_a(t) contains the pilots embedded in the data x_i(t). The CP insertion block then adds a cyclic prefix (CP) of length N_G to alleviate intersymbol interference (ISI). The transmitted time-domain signal with the cyclic prefix added is denoted x_ga(t), and N_FFT is the FFT size. That is, the last N_G samples of each symbol are copied to its beginning, which lengthens the symbol.
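The transmitter steps described above (QAM mapping, IFFT, cyclic prefix insertion) can be sketched in NumPy as follows. This is a minimal single-antenna illustration and not the authors' MATLAB implementation: the 4-QAM constellation, N_FFT = 64, and N_G = 16 are assumed values chosen only for the example, and pilot insertion and layer mapping are omitted.

```python
import numpy as np

N_FFT = 64   # FFT size (assumed value for illustration)
N_G = 16     # cyclic prefix length (assumed value for illustration)


def qam4_modulate(bits):
    """Map pairs of bits to unit-energy 4-QAM symbols (assumed constellation)."""
    pairs = bits.reshape(-1, 2)
    return ((1 - 2 * pairs[:, 0]) + 1j * (1 - 2 * pairs[:, 1])) / np.sqrt(2)


def ofdm_modulate(freq_symbols, n_g=N_G):
    """IFFT to the time domain, then prepend the last n_g samples as the CP."""
    time_symbol = np.fft.ifft(freq_symbols)
    return np.concatenate([time_symbol[-n_g:], time_symbol])


rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=2 * N_FFT)
x = qam4_modulate(bits)     # one QAM symbol per subcarrier
x_g = ofdm_modulate(x)      # time-domain samples with CP, length N_FFT + N_G
```

The first N_G samples of `x_g` are identical to its last N_G samples, which is what lets the receiver discard the prefix without losing information.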

Receiver.
The cyclic prefix is first removed from the signal received at each antenna to obtain vectors of length N_FFT (Figure 1, receiver block). For channel estimation, the pilot signals are extracted in the frequency domain. After the channel is estimated, a layer demapping module equalizes and concatenates the incoming signals from all receiver antennas. A demodulation scheme matching the one used at the transmitter is then applied to decode the signal. At this stage, the complete binary data sequence from the MIMO-OFDM model is recovered.
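The following sketch illustrates why removing the cyclic prefix and returning to the frequency domain makes per-subcarrier equalization possible. It is a noiseless, single-antenna example under stated assumptions: the 3-tap channel `h` is hypothetical, BPSK stands in for QAM, and the channel is taken as perfectly known, whereas in the system described here it would come from the pilot-based estimator.

```python
import numpy as np

N_FFT, N_G = 64, 16            # assumed sizes for illustration


def remove_cp_and_fft(rx_samples, n_fft=N_FFT, n_g=N_G):
    """Drop the cyclic prefix and return the frequency-domain vector."""
    return np.fft.fft(rx_samples[n_g:n_g + n_fft])


rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=N_FFT) + 0j      # BPSK stand-in for QAM data
x_time = np.fft.ifft(X)
tx = np.concatenate([x_time[-N_G:], x_time])      # symbol with cyclic prefix

h = np.array([1.0, 0.5, 0.25j])                   # hypothetical 3-tap channel
rx = np.convolve(tx, h)[:tx.size]                 # noiseless multipath channel

Y = remove_cp_and_fft(rx)
H = np.fft.fft(h, N_FFT)                          # channel frequency response
X_hat = Y / H                                     # one-tap equalization
```

Because the prefix is longer than the channel memory, the linear convolution acts as a circular one over the retained samples, so each subcarrier sees a single complex gain and `X_hat` recovers `X`.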
The pilot signals are mapped in accordance with the pilot structure. Pilots in 5G networks are organized in a comb-like pattern across the antennas. Pilot symbols are evenly spaced in the time and frequency domains, with spacings D_t and D_f, respectively. Different use scenarios for a 5G system specify the values of D_t and D_f. Pilot signals are arranged in an alternating pattern among the transmit antennas.
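A comb-type pilot grid with spacings D_t and D_f can be sketched as a boolean mask over the time-frequency resource grid. The function name `pilot_mask`, the grid size, and the per-antenna subcarrier offset used to alternate pilots between antennas are assumptions made for illustration; the exact pattern used in the paper is not specified here.

```python
import numpy as np


def pilot_mask(n_subcarriers, n_symbols, d_f, d_t, antenna=0):
    """Boolean grid that is True where the given antenna carries a pilot.

    Pilots sit every d_f subcarriers and every d_t OFDM symbols. Shifting
    the subcarrier offset by the antenna index is an assumed way to
    alternate pilots between antennas.
    """
    mask = np.zeros((n_subcarriers, n_symbols), dtype=bool)
    mask[antenna % d_f::d_f, ::d_t] = True
    return mask


# Hypothetical grid: 12 subcarriers by 14 OFDM symbols, D_f = 4, D_t = 7.
m0 = pilot_mask(12, 14, d_f=4, d_t=7, antenna=0)
m1 = pilot_mask(12, 14, d_f=4, d_t=7, antenna=1)
```

With this offset the two antennas never place pilots on the same resource element, so each transmit-receive channel can be estimated separately at its own pilot positions.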

Machine Learning Channel Estimation
Traditional estimation approaches can be used to estimate the propagation channels between transmitters and receivers in wireless communication systems that need coherent detection. This section motivates the use of machine learning frameworks to reduce channel estimation errors by presenting two widely used channel estimation approaches.
Error convergence, i.e., the reduction of the difference between the desired and computed unit values, is a critical issue in supervised learning. The aim is to determine a set of weights that minimizes this error. Least mean square (LMS) convergence has been used in many different learning paradigms.
Both the transfer function and weights for the units influence the behaviour of a BPNN.
For sigmoid units, the output varies continuously but not linearly as the input changes. Sigmoid units bear a closer similarity to real neurones than threshold units do, but they should still be taken as approximations.
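The difference between the unit types can be seen by evaluating their transfer functions on the same inputs. The function names and the threshold value below are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np


def linear_unit(z):
    """Output proportional to the total weighted input."""
    return z


def threshold_unit(z, theta=0.0):
    """Output jumps from 0 to 1 when the input exceeds theta (assumed 0)."""
    return np.where(z > theta, 1.0, 0.0)


def sigmoid_unit(z):
    """Output varies continuously, but not linearly, between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


z = np.array([-4.0, 0.0, 4.0])
out_linear = linear_unit(z)
out_threshold = threshold_unit(z)
out_sigmoid = sigmoid_unit(z)
```

The sigmoid output is smooth and differentiable everywhere, which is what allows error derivatives to be propagated through the unit during training.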
It is necessary to change the unit weights in a neural network so that the difference between the desired output and the actual output is minimized. To do so, the neural network calculates the error derivative of the weights (EW), that is, how the error changes as each weight is increased or decreased slightly. The EW can be determined using the back propagation methodology.
The back propagation algorithm is easiest to understand if all network units are linear. The algorithm computes each EW by first computing the EA, the rate at which the error changes as the activity level of a unit changes. For output units, the EA is the difference between the actual output and the desired output. To compute the EA for a hidden unit, the weights connecting that hidden unit to the other hidden units and output units it feeds must first be identified.
These weights are then multiplied by the EAs of the connected units, and the products are summed. This sum equals the EA of the chosen hidden unit. When all the EAs in one layer have been computed, the EAs of the other layers can be computed in the same way, moving from layer to layer in the direction opposite to that in which activity propagates through the network. This is what gives back propagation its name. Once the EA of a unit has been computed, the EW for each incoming connection of that unit follows directly: it is the product of the EA and the activity through the incoming connection.
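The EA and EW computations described above can be sketched for a small network of linear units with one hidden layer. This is a minimal illustration rather than the authors' BPNN: the layer sizes, the squared-error loss E = 0.5 * ||y - d||^2, the learning rate, and the function names are all assumptions made for the example.

```python
import numpy as np


def backprop_linear(x, d, W1, W2):
    """Return (EW1, EW2) for a two-layer network of linear units."""
    h = W1 @ x                      # hidden-unit activities
    y = W2 @ h                      # output-unit activities
    ea_out = y - d                  # EA of output units: actual minus desired
    ea_hid = W2.T @ ea_out          # EA of hidden units: weighted sum of output EAs
    ew2 = np.outer(ea_out, h)       # EW = EA of receiving unit * incoming activity
    ew1 = np.outer(ea_hid, x)
    return ew1, ew2


def error(x, d, W1, W2):
    """Assumed squared-error loss E = 0.5 * ||y - d||^2."""
    return 0.5 * np.sum((W2 @ (W1 @ x) - d) ** 2)


rng = np.random.default_rng(0)
x, d = rng.normal(size=3), rng.normal(size=2)              # input and desired output
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))  # assumed layer sizes
ew1, ew2 = backprop_linear(x, d, W1, W2)

# One gradient-descent update with an assumed learning rate.
lr = 0.01
W1_new, W2_new = W1 - lr * ew1, W2 - lr * ew2
```

Each entry of `ew1` and `ew2` is the derivative of the error with respect to the corresponding weight, which can be checked against a finite-difference estimate obtained by perturbing that weight slightly.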

Results and Discussions
Here, we compare our proposed machine learning-based channel prediction with existing approaches on the 5G channel profile and evaluate their effectiveness. The simulation models a MIMO-OFDM system with the characteristics necessary to represent the 5G network.
All of the proposed estimators are implemented on an Intel i5 CPU running at 2.90 GHz with 16 GB of memory, and Monte Carlo simulations are run in MATLAB 2021a. BER and MSE versus SNR are used to evaluate performance, and the results are compared with conventional estimation. Figure 2 shows the BER performance of the scenarios under consideration for the various channel estimation approaches. There is a strong correlation between the BER performance and the MSE performance of the estimators under study. In both instances, the BER performance of DBN is marginally lower than that of MLP. This is because the loss function is designed to decrease channel estimation errors rather than the bit error rate.
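The two evaluation metrics can be written down compactly. The definitions below are the common ones and are assumed here; the paper may, for example, normalize the MSE by the channel power. The function names and the toy values are illustrative only and are not results from the study.

```python
import numpy as np


def bit_error_rate(tx_bits, rx_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    return np.mean(np.asarray(tx_bits) != np.asarray(rx_bits))


def channel_mse(h_true, h_est):
    """Mean squared error between true and estimated complex channel gains."""
    diff = np.asarray(h_true) - np.asarray(h_est)
    return np.mean(np.abs(diff) ** 2)


# Toy values chosen only to exercise the functions.
ber = bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])   # one error in four bits
mse = channel_mse([1 + 1j, 2], [1, 2])              # one gain off by 1j
```

In a Monte Carlo study, both quantities would be averaged over many channel and noise realizations at each SNR point.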
Tables 1 and 2 illustrate the effect of pilot density on the robustness of the BPNN estimators. The performance of the three machine learning estimators remained constant as the pilot density fell, regardless of the SNR. We may therefore conclude that the BPNN models are robust to variable pilot densities.
The proposed machine learning models are also tested for the impact of the maximum Doppler frequency. As the maximum Doppler frequency grew, the performance of the machine learning models degraded, because the channel changed more frequently. We can also see from the figure that the performance of the BPNN model fell more severely than that of the other models. However, it still outperformed the DBN and MLP models, as shown in Table 3.
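For orientation, the maximum Doppler frequency is tied to receiver speed by the standard relation f_d = v * f_c / c. The sketch below evaluates it for a 4 GHz carrier at 300 km/h; combining these two figures is an assumption made only for illustration and does not reproduce the Doppler settings used in the simulations.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c for a receiver moving at speed v."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * carrier_hz / SPEED_OF_LIGHT


# Assumed example: 300 km/h receiver speed at a 4 GHz carrier.
f_d = max_doppler_hz(300.0, 4e9)   # roughly 1.1 kHz
```

Since f_d scales linearly with speed, doubling the receiver velocity doubles the maximum Doppler frequency and makes the channel vary correspondingly faster.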
We used Doppler frequency variation to test the sensitivity of the neural networks to changes in receiver velocity. The proposed models were also examined for prediction accuracy when the Doppler frequency differed between training and testing. In this simulation, a uniform distribution was used to randomly [...] Table 4 shows the results.
In spite of the mismatch in Doppler frequency, all of the machine learning channel estimation models performed well, as shown in Table 4. When the SNR is set to 20 dB, only the performance of the BPNN model suffers marginally. Even with channel mismatch, the BPNN model still performs better than DBN and MLP. Based on the results, BPNN is more sensitive to Doppler frequency than DBN and MLP.
Because of the time-varying channel features, Doppler frequency has a greater impact on BPNN than on MLP and DBN. However, the three presented models are still more efficient than conventional approaches because they are more resistant to variations in the Doppler frequency.

Conclusions
In this paper, a channel estimation model is developed using BPNN, and the study uses multipath channel simulations to estimate CSI over arbitrary transceiver antennas. Simulations test the efficacy of the model against various machine learning channel estimation models. The results show that the proposed model achieves higher channel estimation quality and a lower bit error rate than the other methods. Because of its capacity to exploit the temporal and frequency correlation across channels, BPNN showed the largest reduction in channel estimation error among the proposed channel estimators. Furthermore, the BPNN channel estimation algorithms showed excellent resilience to changes in pilot density and Doppler frequency. In future work, the noise model can be varied to check the efficacy of the model under different harsh scenarios.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.