Forecasting of Ionospheric Total Electron Content Data Using Multivariate Deep LSTM Model for Different Latitudes and Solar Activity

,


Introduction
Te ionosphere, found in the Earth's upper atmosphere, plays a crucial role in the propagation of radio signals. Variations in the density of electrons in the ionosphere can impact the speed and delay the radio signals traveling from Global Navigation Satellite System (GNSS) satellites to receivers, ultimately decreasing positioning accuracy. Te Total Electron Content (TEC) parameter is directly proportional to the infuence of the ionosphere on satellite signals. Te TEC causes delay in the radio waves traversing through the ionosphere. Tus, it is a major source of error in navigation systems like GNSS. Hence, prediction of ionospheric TEC is important in radio communications, radar systems, navigation and positioning systems. Forecasting the TEC successfully will help in correction of positioning errors caused by the ionosphere [1]. Various physical processes, such as solar radiation, geomagnetic storms, and atmospheric tides, can signifcantly impact the ionosphere, leading to substantial variations in TEC values. So, there is a need to comprehend the spatiotemporal variations in TEC and develop robust global and regional TEC models [2]. Tis requirement has been emphasized by experts in this feld and has become a critical research topic in ionospheric studies. Tus, this research paper aims to investigate the parameters which cause signifcant variations in TEC and develop accurate model to improve the prediction of ionospheric conditions for the reliable functioning of radiocommunication systems.
Ionospheric TEC forecasting methods can be classifed into empirical methods, statistical methods and machine learning methods. Empirical models describe the state of the ionosphere as a function of latitude, altitude, solar cycle, day of the year, season, geomagnetic activity etc. Te traditional ionospheric error mitigation approaches like the International Reference Ionosphere (IRI) model [3], the Klobuchar model [4], the NeQuick model [5] show limitations during complex ionospheric dynamics. Statistical methods like autoregressive [6], autoregressive moving average (ARMA) [7], autoregressive distributed lag (ARDL) model [8] have been developed in the past for forecasting regional short-term ionospheric TEC. Over the years, several neural network models have been developed for prediction of TEC and various related parameters at regional levels. Tese machine learning models use various algorithms to map nonlinear and complex relationship between input and output. Neural network [9][10][11], wavelet-based ANN [12], support vector machines [13], nonlinear radial basis function [14], and genetic algorithm-based neural network [15] are a few among them. However, these models fail to consider the time sequential feature of TEC. In deep learning, recurrent neural networks (RNNs) are very effective in modeling sequential data but vanishing and exploding gradient problems are common issues in these deep neural networks. LSTMs are a special type of RNN's which can model long range temporal dependencies due to the presence of memory cells. Te ability to learn long-term dependency in time series prediction makes LSTM models more powerful than any other type of neural networks. Several studies have analyzed and compared the performance of LSTM prediction models for Total Electron Content (TEC). LSTM was shown to outperform ARIMA and seq2seq models during a magnetic storm [16]. It was found that LSTM had the highest accuracy in predicting ionospheric delay, compared to NN and IRI models [17]. LSTM had better prediction accuracy than MLP under both quiet and stormy geomagnetic conditions [18]. Te prediction of TEC using LSTM was better than back propagation (BP) at midlatitude station during diferent solar conditions [19]. Tese studies collectively provide evidence that LSTM performs better in TEC prediction compared to other neural networks and existing empirical models. However, the performance of the TEC prediction models at diferent latitudes and during diferent solar conditions needs further study. Te current paper aims to investigate the performance of TEC prediction model in high latitude, midlatitude, low latitude, and equatorial region. Although literature suggests diferent exogenous parameters that could be used for TEC prediction, none of the literature studies have found out the correlation between TEC and other exogenous parameters which could be used for better prediction accuracy. Tis study focuses on determining the exogenous features which correlate well with TEC using heat map and Pearson correlation coefcients and using them for TEC prediction. Our model is tested for prediction accuracy during low solar activity year 2008, high solar activity 2014 and during the occurrence of geomagnetic storms. Additionally, this study also compares the prediction results of LSTM with MLP at diferent latitudinal regions during diferent solar and geomagnetic conditions.
Te contributions of the present study are as follows:

Input Parameters for the LSTM Model.
In the present study, ionospheric TEC data along with other exogenous parameters are used. Te TEC data sampled at every one hour was downloaded from IONOLAB site [20]. Te major parameters that cause ionospheric TEC variations are the local time, season, solar and geomagnetic activities [21,22]. Te solar parameters that afect TEC are solar radio fux F10.7 and sunspot number (SSN). Planetary index (Kp, Ap) and Disturbance storm time index (Dst) are used as geomagnetic parameters in the study of ionospheric TEC variations [23][24][25][26]. Also, the interplanetary magnetic feld (IMF) data By and Bz along with plasma speed (Vp) and proton density (Np) have shown to yield better results in prediction of TEC maps [27]. Tus, the literature suggests various exogenous parameters that can be used for better prediction of TEC. To select the right exogenous parameters which could be used for TEC prediction, we plotted the correlation heat map which shows the correlation of the exogenous parameters with TEC (see Figure 1). Correlation coefcients quantify the relationship between variables. A heat map provides a visual representation of these correlations, making it easier to identify patterns and relationships between variables. Strong correlations may indicate a potential relationship or dependency between variables. Tis can help identify which variables are most infuential or redundant in the dataset, allowing us to make informed decisions about feature selection. Te Pearson correlation coefcients are plotted in  Table 1 shows the exogenous parameters used along with their units and symbols. All these parameters were obtained from NASA OMNI website and were sampled after every 1 hour to maintain consistency with TEC data [28].

Journal of Electrical and Computer Engineering 3
A brief description of these exogenous parameters is given as follows: Solar fux F10.7: TEC tends to be lower during periods of low solar activity and higher during times of high solar activity [25]. Tis is because solar activity afects the ionosphere, which in turn afects TEC. One commonly used measure of solar activity is the F10.7 solar fux, which is directly linked to the number of sunspots, as well as the levels of ultraviolet and visible solar radiation. Terefore, we use F10.7 since it is a reliable predictor of solar activity levels. Disturbed storm index Dst: Te intensity of geomagnetic storms is often measured using the Disturbed storm index, also known as the Dst index. Tis index quantifes the deviation of the earth's magnetic feld from its normal variation during a quiet day. Under normal conditions, the Dst index typically ranges from 0 to −50 nT. However, during intense geomagnetic storms, the index can drop to below −200 nT [26]. Te Interplanetary Magnetic Field has two components that are particularly relevant for understanding its impact on the ionosphere: Te east-west (By) component and the north-south (Bz) component [26,27]. IMF By: Te By component of the magnetic feld is oriented perpendicular to the plane of the Earth's magnetic equator and is responsible for the formation of the equatorial ionization anomaly (EIA). Te EIA is a region of enhanced ionization that forms over the magnetic equator and is responsible for the peak in ionospheric total electron content (TEC) observed in the daytime. During periods of strong felds, the EIA can extend to higher latitudes, leading to an increase in TEC at these locations as well. IMF Bz: Te Bz component of the magnetic feld is oriented parallel to the earth's magnetic axis and is responsible for the formation and behavior of highlatitude ionospheric structures, such as the auroral oval. During periods of southward (negative) Bz, the solar wind can penetrate deeper into the Earth's magnetosphere, leading to enhanced ionization and TEC at high latitudes. Plasma speed, Vp: Tis refers to the speed at which plasma, a gas consisting of ions and free electrons, moves in the ionosphere. Te plasma speed in the ionosphere is afected by a variety of factors, including the earth's magnetic feld, solar activity, and the density of the plasma. When the plasma speed is high, it can cause irregularities in the ionosphere, which can lead to variations in TEC [27]. Proton density, Np: Np refers to the number of positively charged hydrogen ions in the ionosphere. Proton density is also afected by various factors, such as solar activity and the earth's magnetic feld. When the proton density is high, it can increase the ionization of the atmosphere, which can cause an increase in TEC [27]. Planetary 3-hour range index Kp: Tis index is a measure of the geomagnetic activity that is obtained by averaging the standardized K-index from 13 observatories located between 44°and 60°latitudes in both the hemispheres. Tis index serves as an indicator of the overall level of geomagnetic activity, with higher values indicating more severe activity [25]. Planetary index Ap: Te Ap index is a measure of the strength of geomagnetic activity that ranges from 0 to 400. Higher values of Ap index correspond to more intense geomagnetic activity [25]. Time of the day: Te ionospheric TEC exhibits a daily pattern where the electron density increases gradually to reach its peak at noon, then decreases to its lowest point at midnight. Te sine and cos functions, sin(2π/24) h and cos(2π/24) h, where local time, h, ranges from 0 to 23, help in normalizing the time input in the range (−1, 1) [25].
To study the impact of latitude on TEC prediction, we used four stations in the northern hemisphere each corresponding to a diferent latitudinal region. Te stations used along with their latitudes and longitudes are shown in Table 2. We used TEC data from 1 st -31 st July, 2008 corresponding to the low solar activity year, 1 st -31 st July, 2014, corresponding to the high solar activity year and 1 st July to 31 st October, 2011, for the geomagnetic storm event.

Analysis of the LSTM Model.
Te LSTM model is made of multiple LSTM cells. Each LSTM cell has a forget gate, input gate, memory cell and output gate. Te fow of information at any timestep t in the LSTM cell is controlled by three gates, namely, the forget gate, input gate, and the  [16][17][18][19]. Te internal structure of each LSTM cell is shown in Figure 3.
Forget gate: Te previous hidden state, h t−1 , and the current input data, x t , are fed into a forget gate, which is a neural network. Tis network (which uses a sigmoid function) is trained such that for irrelevant input it outputs close to 0 and closer to 1 when the input is relevant. Te output of this gate is f t . Tese values are then point wise multiplied with the previous cell state c t−1 to yield c ft . By performing a pointwise multiplication, the cell state components that the forget gate network has identifed as irrelevant will be multiplied by a value close to zero. As a result, these components will have minimal infuence on the following stages of the process. Te forget gate is represented mathematically by (1) and (2).
Store/Update the current cell state: Te input gate decides which new information should be added to the network's long-term memory (cell state). Te process involves two neural networks: the memory cell and the input gate. Te memory cell uses the tan h activation function to generate a new candidate vector g t that determines how much each component of the longterm memory should be updated based on the new input data. Te input gate utilizes the sigmoid activation function to selectively determine which information from the input should be stored in the new memory vector. Te outputs i t and g t are pointwise multiplied to retain only the relevant information. Te resulting combined vector, c it , is then added to the cell state, c ft , to update the long-term memory of the LSTM network, c t . Tis way, the network can selectively store and update relevant information while discarding irrelevant information.
where W denotes the diferent weight matrices with the connections of each weight matrix indicated by its indices and b denotes the bias terms of each of the fully connected layers. Equations (1) to (8) mathematically represent the internal operations occurring in each of the LSTM cell.

Implementation of the LSTM Model.
We built the LSTM model using Keras and TensorFlow deep learning libraries in Python. Te fowchart for the development and implementation of proposed methodology is shown in Figure 4. Te various steps performed to build, train, and test the model are as follows: Step 1: Data Preparation First, we read the .csv fle and fnd if there are any missing values. Next, to eliminate the infuence of diferent measurement scales and range diferences among the variables, the entire dataset is standardized using the minmax scaler between (0, 1). We then convert the time series into a supervised learning problem by using a sliding window algorithm [2]. In the sliding window approach, we have used a fxed-size window that slides over the time series data, and the model is trained on each window sequentially. At each time step, the input sequence to the LSTM model is the past observations within the window, and the output is the next value in the time series. Te window is then moved by a fxed step size and the process is repeated   Figure 4). We frame N − W instances of length W + 1 containing input and output datasets to train the LSTM model (see Figure 5). Te frst W samples of each instance is considered as training input data and the W + 1 th sample is the target data. In our algorithm, we have used past 24 values in each window as the input and the next value as the output. Te step size is taken as 1. For predicting the future TEC values, the model is trained to predict for one next time step and then the state of the network is updated after every prediction. For training, validation, and testing the model for solar quiet and active years, we have used 1 month data each. Te TEC data is sampled after every one hour, thus the total number of TEC samples is 744. Te total number of exogenous parameters used are 10 and these are also sampled after every one hour. Tus, the total number of samples used for exogenous data equals 7440. For training, validation, and testing the model for the geomagnetic storm event, we have used 4 months data. Total number of TEC samples is 2952. Te total exogenous parameters used are 10 and total number of samples used for exogenous data = 29520. Tis data is partitioned into 70-14-16 split. 70% of the data is used for training the model, 14% is used for validation, and then the model is tested on 16% of the data.
Step 2: Design and ft the model Te LSTM model consists of an input layer, one LSTM layer, dense (fully connected layer), and an output layer (see Figure 6). Te input layer inputs the past TEC data along with the selected exogenous parameters to the LSTM layer. Te model consists of a single layer LSTM with the 100 LSTM cells. Te fully connected dense layer with one output unit produces a single TEC value as the fnal prediction. Te model is trained such that it learns to predict the next time step for each iteration of the input time step. Te network optimizer is chosen as Adam and the loss function is MSE. Te number of epochs is set to 100, and validation loss value is monitored by the early stopping method, where patience is set to 10. Te dense layer (fully connected layer) uses the Leaky ReLU activation function to avoid the dead neuron problem. Some important hyperparameters for LSTM models include the number of LSTM layers, the number of LSTM cells per layer, the learning rate, the batch size, and the dropout rate [29]. In this work, the LSTM network with 100 LSTM cells has been used to achieve an adequate LSTM model's performance. We have used the hyperparameter grid search method for hyperparameter tuning and selecting the optimal values of batch size, dropout rate, and number of epochs. In addition, the Adam solver, a step-descent algorithm with a variable learning rate of 0.001 and the drop rate of 0.2 for 100 epochs, is considered (see Table 3).   Journal of Electrical and Computer Engineering Step 3: Forecast TEC Te model predicts one TEC value at a particular time step, and then the network state is updated. Tus, for each future prediction, it uses the past predicted value as an input. Since, the entire data were standardized to [0, 1] before training the model, we do the inverse normalization to recover the actual data. We calculate the MAE and RMSE. We also plot the observed and predicted TEC waveforms for the test data. Te plot of training and validation loss v/s number of epochs shows that the model ft is exact and the model can be used for forecasting (see Figure 7).
Te validity of our model's results could not be compared to the existing literature in a meaningful way due to the signifcant variations in methodologies used across studies, including diferences in station selection, amount of training data used, and the solar quiet and active years used. So, we simulated an MLP NN model for the same stations during the same solar quiet and active years, and we compare our results with the MLP NN model.

Implementation of MLP NN Model.
A multilayer perceptron (MLP) is an artifcial neural network that consists of an input layer, one or more hidden layers, and an output layer. Input features are normalized and passed through the network where each hidden layer applies an activation function to the weighted sum of inputs. Te output of each hidden layer is passed to the next layer until the fnal output is produced. Te weights of an MLP are updated during training using an optimization algorithm such as the stochastic gradient descent, based on the diference between the predicted and actual outputs. Te objective of training is to minimize the error or loss function while avoiding overftting or underftting. MLP can handle nonlinear relationships between inputs and output, but the number of hidden layers and neurons can signifcantly afect performance. Choosing the optimal hyperparameters, such as hidden layer size and activation function (fa), is critical. In our algorithm, hls can take four diferent values, (50), (100), (50, 50), and (100, 100), while fa can take three diferent values, ("relu," "tanh," and "logistic"). Te optimal hyperparameters are determined by performing a grid search to evaluate all possible combinations of hyperparameters and selecting those with the best performance metrics, such as RMSE, MAE, and R 2 .

Evaluation Metrics.
To evaluate the robustness and efectiveness in predicting TEC, we use RMSE and MAE as the metrics. RMSE helps in assessing the accuracy of forecasting models, as it measures the square root of the average squared diferences between the predicted and actual values. On the other hand, MAE provides the average of absolute errors, which can better indicate the magnitude of the errors in the predicted values. Terefore, using both RMSE and MAE provide a more comprehensive evaluation of the forecasting models.
wheret y i and y i are the actual and the predicted TEC values and N is the total number of observations.

For the Solar Quiet
Year 2008. Te original and predicted TEC waveforms for the period from 28 th July to 31 st July, 2008, are plotted in Figure 8. Figures 8(a)-8(d) show the original and the predicted TEC for the qaq1, baie, mas1, and bogt stations in the high-latitude, mid-latitude, low-latitude region, and the equatorial region, respectively. Figures 8(a)-8(d) clearly show that that the predicted TEC is very close to the original TEC most of the times except for small deviations when TEC reaches the maximum or minimum values. Te predicted TEC waveform, most of the times, exactly overlaps the original TEC for all the stations during the solar quiet year 2008. We tested the LSTM model performance by computing RMSE and MAE (see Table 5).
Te average RMSE and MAE values are 1.038 and 0.719 TECU, respectively. Te results indicate an accuracy improvement of 70% (using RMSE metric) and 76% (using MAE metric) over MLP.

For the Solar Active
Year 2014. Te original and predicted TEC waveforms are plotted in Figure 9 for the period from 28 th July to 31 st July, 2014. Figures 9(a)-9(d) show the original and the predicted TEC for the qaq1, baie, mas1, and bogt stations in the high-latitude, midlatitude, low-latitude regions and the equatorial region, respectively.
Figures 9(a)-9(d) show that that the predicted TEC is very close to the original TEC for high latitude (qaq1) and midlatitude (baie) stations compared to the low-latitude (mas1) and equatorial (bogt) regions. Tis is because the ionosphere is more intensely ionized near the equator during solar active years. We also tested the model performance for the solar active year (2014) by computing RMSE and MAE (see Table 6).
Te average RMSE and MAE are 2.656 and 2.111 TECU, respectively, which is much lower than that predicted by MLP. Te results indicate an accuracy improvement of 64% (using RMSE metric) and 66% (using MAE metric) over MLP. Te comparison of the MLP and LSTM models using RMSE and MAE evaluation metrics is shown in Figures 10  and 11, respectively.
As expected and as shown in Tables 5 and 6 and Figures 10 and 11, the RMSE increases with decreasing latitude, indicating better prediction at higher and midlatitude regions compared to lower and equatorial regions for both the solar quiet and active years. Te ionospheric TEC variations over the low-latitude regions are difcult to model and predict due to the equatorial ionization anomaly [17]. Te presence of more sunspots on the surface of the sun during an active solar year causes an increase in the amount of solar radiation that reaches the earth, resulting in a higher TEC in   the ionosphere. Tis phenomenon is particularly noticeable at the equator, where the ionosphere is most intensely ionized. During years of high solar activity, such as those with more solar fares and coronal mass ejections, there may be temporary disruptions in the ionosphere, leading to short-term increase in TEC. As a consequence, the RMSE and MAE also tend to be higher during such high solar activity periods (see Figures 10 and 11). Tis fnding is consistent with the research in this feld.

During Geomagnetic Storm Events.
Te year 2011 experienced 3 geomagnetic storms with a minimum Dst of less than −100 nT [30]. Te date of occurrence of these storms along with the corresponding Dst values are shown in Table 7. We tested our LSTM model for prediction of TEC during the geomagnetic storm at the midlatitude baie station. Te geomagnetic storm with Dst values −115 nT, −118 nT, and −147 nT on days of the year (DOY) 218, 269, and 298 caused sudden fuctuations in TEC (see Figure 12) [30]. We used the data from July 1 st , 2011, to October 31 st , 2011, which included these three storm events. Te data from July 1 st , 2011, to October 12 th , 2011, was used to train the LSTM model, and then it was tested for unseen data from October 13 th to October 31 st , 2011, which included a strong geomagnetic storm with Dst � −147 nT.      Figure 13 shows the original and predicted TEC by our LSTM model during the occurrence of the geomagnetic storm during the period from 13 th October to 31 st October, 2011. Tis includes a very strong storm with Dst � −147 occurring between 24 th and 25 th October, 2011. Before the occurrence of the storm, TEC increases steadily, but during the storm event, there is a sudden decrease in TEC, which is correctly and accurately predicted by our LSTM model (see Figure 13). Te periodic variation is also accurately captured by the LSTM model.    Table 8 shows the comparison of the MLP and LSTM model during the occurrence of the storm event. Te results indicate that the accuracy of our LSTM model is 74% better (using RMSE metric) and 77% better (using MAE metric) than MLP.
Compared to MLP, LSTM has predicted TEC with more accuracy (see Tables 5, 6, and 8). Tis is because LSTM has a memory cell, which allows the LSTM model to learn and maintain information about past inputs over longer periods of time. Tis is especially important for time series data, such as TEC, since it exhibits diurnal characteristics. MLPs do not utilize history information, so they may struggle to capture long-term dependencies in the data. As a result, MLPs fail to predict accurately in cases where the data is noisy or turbulent, such as in the case of TEC data, particularly during solar active years and during the occurrence of geomagnetic storms.

Conclusion
We propose a multivariate deep learning LSTM model for the prediction of TEC data. We determined the correlation of TEC with other exogenous parameters and used the ones that correlated well with TEC. We validated the performance of our model at diferent latitudinal regions during the quiet and active solar years. We also tested our model during the occurrence of a geomagnetic storm event. LSTM is a powerful tool for TEC prediction because it is capable of capturing long-term dependencies, handling nonlinear patterns, dealing with variable-length sequences, and mitigating the vanishing gradient problem, thus maintaining accuracy and performance while dealing with time series TEC prediction. Te proposed LSTM model has better performance accuracy than the classic MLP during the solar quiet year, the solar active year, and also during the occurrence of the geomagnetic storm. Te RMSE and the MAE values obtained using the LSTM model are found to be very low, suggesting that this method could be well-suited for forecasting the ionospheric TEC at all times and for all latitudinal regions. In the future, we wish to develop a global deep learning model that can capture both the spatial and temporal features of TEC data and can help in the prediction of TEC data covering a larger latitudinal and longitudinal range.

Data Availability
Te TEC data used in this paper were downloaded from IONOLAB (https://www.ionolab.org) site. All the exogenous parameters were obtained from NASA OMNI website: https://omniweb.gsfc.nasa.gov/form/dx1.html.

Conflicts of Interest
Te authors declare that they have no conficts of interest.