A New GPT2w Model Improved by PSO-LSSVM for GNSS High-Precision Positioning

Tropospheric delay is an important error affecting GNSS high-precision navigation and positioning, which will decrease the precision of navigation and positioning if it is not well corrected. Actually, tropospheric delay, especially in the zenith direction, is related to a series of meteorological parameters, such as temperature and pressure. To estimate the zenith tropospheric delay (ZTD) as accurately as possible, the paper proposes a new fused model using the least squares support vector machines (LSSVM) and the particle swarm optimization (PSO) to improve the precision and temporal resolution of meteorological parameters in global pressure and temperature 2 wet (GPT2w). The proposed model uses the time series of meteorological parameters from the GPT2w model as the initial value, and thus, the time series of the residuals can be obtained between the meteorological parameters from meteorological sensors (MS) and the GPT2w model. The long time series of meteorological parameters is the evident periodic signal. The GPT2w model describes its dominant frequency (harmonics), and the residuals thus can be seen as the short-period signal (nonharmonics). The combined PSO and LSSVM model (PSO-LSSVM) is used to predict the specific value of the short-period signal. The new GPT2w model, in which the meteorological parameter value is obtained by combining the estimated meteorological parameters residuals and the GPT2w-derived meteorological parameters, can be acquired. The GNSS network stations in Hong Kong throughout 2017-2018 are processed by the GNSS Processing and Analysis Software (GPAS), which is developed by the Chinese Academy of Surveying & Mapping, to estimate the zenith tropospheric delay and station coordinates using the new GPT2w model. Statistical results reveal that the accuracy of the new GPT2w model-derived ZTD was improved by 60% or more compared with that of the GPT2w-derived ZTD. In addition, the positioning accuracy of the GNSS station has been effectively improved up to 44.89%. Such results reveal that the new GPT2w model can greatly reduce the influence of nonharmonic components (short-period terms) of the meteorological parameter time series and achieve better accuracy than the GPT2w model.


Introduction
Tropospheric delay is an important error that affects the positioning accuracy of the global navigation satellite system (GNSS). However, it is also an important parameter to calculate tropospheric delay for GNSS meteorology. Usually, the value can reach tens of meters [1] for satellites with low altitude so that this error should be carefully considered in the GNSS positioning. Establishing an accurate and reliable ZTD model becomes crucial and critical for GNSS research.
According to the relationships between ZTD and meteorological parameters obtained from the ground, the com-monly ZTD models are established including Hopfield model [2], Saastamoinen model [3], Black model [4], and Ifadis model [5]. The accuracy of these models is limited by the input meteorological parameters and environments. When meteorological parameters cannot be acquired accurately, the accuracy of GNSS high-precision positioning will be decreased. Additionally, when GNSS stations are not equipped with the meteorological sensors, ZTD is not able to be obtained by these models.
Thus, the ZTD empirical models are proposed that only rely on the location and observation time without the need of any auxiliary information. For example, the UNB3 model [6] stores the meteorological parameters in the form of a table from the equator to poles at the intervals of 15 degrees. IGGtrop model [7] is established by using four years of National Centers for Environment Prediction (NCEP) data. Yao used the Global Geodetic Observing System (GGOS) Atmosphere data to establish the GZTD model [8] and improved the GZTD2 model [9] by introducing diurnal variations. Some other ZTD empirical models are also established by considering other factors, such as IGGtrop_SH and IGGtrop_RH models [10], the GZTDS model [11], the SHAtropE model [12], the RGZTD model [13], and the global pressure and temperature 2 wet (GPT2w) model [14]. The GPT2w model that is the paper concerned can offer precise ZTD products.
However, the ZTD empirical models are not applicable to some regions limited by the spatiotemporal resolution. Fortunately, the Continuously Operating Reference Station (CORS), whose ZTD products have high accuracy and high temporal resolution, provides an opportunity for establishing new ZTD models with higher accuracy. This paper proposes a new model combining the particle swarm optimization algorithm with the least squares support vector machine (PSO-LSSVM) model to improve the GPT2w model. First, the GPT2w is used to calculate meteorological parameters that as the initial value. Second, the time series of meteorological parameters residuals can be obtained as the difference between meteorological parameters from meteorological sensors (MS-derived) and GPT2w-derived meteorological parameters over GNSS stations. Third, the PSO-LSSVM model is used to predict meteorological parameters residuals. So the meteorological parameter value can be acquired by combining the estimated meteorological parameters and GNSS-derived meteorological parameters. Finally, the zenith tropospheric delay (ZTD), zenith hydrostatic delay (ZHD), and station coordinates can be obtained by GNSS Processing and Analysis Software (GPAS) using the meteorological parameters, which is developed by the Chinese Academy of Surveying & Mapping.

Metholodogy
2.1. The GPT2w Model. The Global Pressure and Temperature (GPT) series model is an empirical model to provide the global temperature and pressure at any GNSS station in the world. These models include GPT [15], GPT2 [16], and GPT2w [14]. GPT2w, a very comprehensive tropospheric model, can be used for a range of geodetic, meteorological, and climatic purpose without auxiliary information. It provides the annual and semiannual amplitudes of a set of meteorological parameters that include pressure (P) in hpa, weighted mean temperature (T m ) in K, water vapor pressure (e) in hpa, and water vapor lapse rate (λ In equation (1), rðtÞ is the estimated meteorological parameters. A 0 denotes the mean value as well as A 1 , B 1 and A 2 , B 2 are annual and semiannual amplitudes, which can be downloaded from https://vmf.geo.tuwien.ac.at/ codes/gpt2_1w.grd. doy denotes the day of the year. So the needed meteorological quantities at the four nearest grid points can be acquired. Then, the estimated meteorological parameters of a site can be calculated by the bilinear interpolation algorithm. The ZTD values are calculated [17] as equations (2)-(4).
In equations (2)-(4), ZHD and ZWD denote, respectively, the zenith hypostatic delay and zenith wet delay, φ is the latitude of station, H is the geoid height of the station, and P is the surface pressure. R d is the dry air ratio gas constant with a value of 287:0464JK −1 kg −11 , k 2 ′ and k 3 are the atmospheric refractive index formula constants with values of 16:52K/hpa and 3:776 × 10 5 K 2 /hpa, respectively; g m is the mean gravity acceleration with a value of 9:80665m/s 2 at the mass center of the vertical column of the atmosphere.
2.2. The Least Squares Support Vector Machine. Support vector machine (SVM) is a machine learning method with a perfect theoretical system, which is different from general statistical methods. It avoids the process from induction to deduction and thus realizes the inference and estimation from training samples to forecast samples and obtains the simplification of regression analysis and other problems [18]. The Least Squares Support Vector Machine (LSSVM) inherits the basic idea of SVM and uses the quadratic square loss function instead of the classical SVM quadratic programming method. The LSSVM model converts the optimization problem to the linear equation problem, reducing the complexity of the algorithm [19]. Therefore, the LSSVM model can solve the problem of inequality constrains and improve the speed and accuracy in solving the linear equation problem simultaneously [20]. Additionally, the random error and overtraining can be avoided in this model [21]. 2 Journal of Sensors The LSSVM can be explained as follows: for the training sample D = fðx k, y k Þ | k = 1, 2, ⋯, Ng, where x k ∈ R m is the input data and y k ∈ R is the output data. The linear regression function can be introduced to establish relationship between x k and y k as equation (5).
In equation (5), w is the weight vector of the hyperplane, b is the bias constant, and φðxÞ is the linear mapping function which can make the input vector be mapped to a highdimensional feature space so that the original linearly inseparable samples can be separated in the kernel space. Then, the regression problem is converted into a quadratic optimization problem whose objective function and constraints are as equations (6) and (7).
In equations (6) and (7), J is loss function, γ is the regularization parameter, and e is slack variable.
The equation (6) is convex function which belongs to the quadratic programming problems. To solve the optimization problem, the equation (8) is introduced.
In equation (8), the a k is the Lagrange multiplier of the model.
In equation (9), I is the identity matrix and the kernel function is Ω:a and b can be calculated by the least square 6 Journal of Sensors model. Thus, the regression estimation function of the LSSVM is equation (14).
In equation (14), the constructed support vector machine varies with the kernel function. Since the radial basis kernel function (RBF) corresponds to an infinite dimensional feature space, the limited samples are linearly separable in the feature space, and it has better computational performance. Therefore, the RBF kernel function is chosen as equation (15).
In equation (15), x k and x l are the input data and σ is the shape parameter of the RBF kernel function.

Fusion of Particle Swarm Optimization and LSSVM.
There are two important parameters (σ and γ) determined difficultly for the LSSVM model, which decrease the accuracy of the LSSVM model if they are not well treated. Actually, it is usually tried or determined by experience, which may not be effective. The Particle Swarm Optimization (PSO) originally originated from the foraging behavior of bird flocks [22]. It simulates the mutual learning behavior between individuals through the shared information in the biological group and finds the optimal solution in its solution space [23]. The PSO can determine the optimal parameters through individual and global particle optimization, which is practicable for the LSSVM parameter determination.
For the optimization problem of the PSO, the solution can be regarded as a bird in the search space, which has its own initial velocity, position, and fitness to a certain position. Finding the optimal solution is to find the position where the particle has the best fitness value from the starting position to the current position. The update formulas for the velocity and position of each particle in the particle swarm are as equations (16) and (17).
In equations (16) and (17), t is the defined number of iterations, and v i ðtÞ is the speed of the ith particle in the particle swarm at the tth iteration; x i ðtÞ is the location of the ith particle in the particle swarm at the tth iteration; c 1 and c 2 are the positive learning coefficients and influence partial search capability and global search capability, respectively; R 1 and R 2 are two uniformly distributed random numbers that ensure the particles move to their optimal position and the optimal position of the group in the form of equational probability; w is the 7 Journal of Sensors ability of a particle to maintain the state of motion at the previous moment that can achieve a balance between global search and partial search of particles. R b i ðtÞ is the optimal loca-tion of the individuals; R b g ðtÞ is the global optimal location of the community. φ is the shrink silver that keeps the particle speed in a certain range to ensure convergence.

Journal of Sensors
To demonstrate the advantage of the LSSVM and the PSO clearly, the optimization process is shown in Figure 1. The steps are as follows: Step 1. Initialize the particle swarm and the parameters of the LSSVM.
Step 2. Calculate the fitness of each particle. The fitness of the current particle is compared with the fitness of the individual's optimal position and the historical optimal position. If the fitness of the current particle is optimal, the position is replaced; otherwise, the original optimal position is maintained.

Journal of Sensors
Step 3. Update the position and velocity of the particles by the maximum number of evolutions.
Step 4. Judgment of termination conditions. When the error requirements or the maximum number of evolutions are met, the process will end. Otherwise, the process will repeat Step 2 to continue until reach the requirements.

GNSS Station Solution Based on PSO-LSSVM Model.
The GNSS Processing and Analysis Software (GPAS) is developed by Chinese Academy of Surveying & Mapping to obtain the zenith tropospheric delay (ZTD), zenith hydrostatic delay (ZHD), and station coordinates. The process of GPAS is shown in Figure 2. To clarify the tropospheric delay with the proposed PSO-LSSVM method, tropospheric delay is given before the least squares. Firstly, it is necessary to initialize GNSS orbit and determine GNSS clock including preparation of earth rotation parameters and ephemeris, calculating GNSS clock by broadcast ephemeris and checking the ephemeris. Secondly, GNSS observation data is preprocessed including conversion of data format, detecting cycle slip and marking resolution. At the same time, some errors are corrected including tropospheric delay, where GPT2w model and PSO-LSSVM model are contrastively analyzed. Then, the traditional GNSS data processing is implemented to get the station coordinates, such as the least squares estimation, residual edit, and ambiguity resolution.

Experiment Description
The Hong Kong Survey and Mapping Office (SMO) of the Lands Department builds a local satellite positioning reference station network (SatRef). It consists eighteen CORSs and six of them are chosen considering the continuity and completeness of observation data, which is shown in Figure 3. The paper adopts the meteorological observations from January 1, 2017, to December 31, 2018, with a temporal resolution of 1 hour and the GNSS observations' sampling rate is 30 s.
The ground weather stations obtain the temperature, air pressure, relative humidity, and other meteorological parameters at the site through meteorological sensors (MS). The air pressure detection accuracy can reach 0:5 hpa, and the air temperature detection accuracy is high. Therefore, the meteorological sensors are regarded as a criterion for selected GNSS stations. Since the temporal resolution of the GPT2w model is one day, the GPT2w-derived meteorological parameter for each hour is replaced the calculation value of GPT2w model. Then long-term meteorological parameters residuals between GPT2w-derived and MS-derived can be obtained. The residual data from January 1, 2017, to November 30, 2017, is used to train the PSO-LSSVM model; the residual data from December 2017 is used to test whether the model has overfitting Phenomenon; the residual data for the whole year of 2018 is used as the validation set. Then, the paper adds the fitted residual data and the GPT2w model value of the corresponding time period to obtain the improved meteorological parameters, and analyzes the root mean square error (RMS) as the evaluation index as equation (18).
In equation (18), Y i is the improved meteorological parameters; Y i i is the MS-derived parameters; n is the time of prediction.

Analysis and Discussion of Experiment
The time series of pressure and temperature residuals between GPT2w and MS at stations (HKOH, HKKT, HKPC, HKSC, HKSS, and HKST) are given in Figure 4. The MSderived meteorological parameters in the unit of day are obtained by taking the average of meteorological parameters a day. It can be seen from Figure 4 that the time series of meteorological parameters such as atmospheric pressure and temperature consists two parts: harmonic and nonharmonic. The harmonic part reflects its long-period characteristics, and the nonharmonic part reflects its short-period characteristics. The GPT2w model fits the harmonic part of the meteorological parameter time series well and reflects its long-period characteristics. However, for its nonharmonic 11 Journal of Sensors part, there is still some residual errors affecting the accuracy to a certain extent. If those errors can be reduced, it will be meaningful for the improvement of accuracy.
Taking the HKOH as example, Figure 5 presents the short-period (nonharmonic) part of the meteorological parameters. The results of other stations are similar.
In Figure 5, there is the phenomena of nonlinearity, nonstationarity, and noise in the time series of residuals. This means that it is difficult to establish the relationship between the time series in the past and in the future. The PSO-LSSVM model is used to solve this problem.

Journal of Sensors
The LSSVM is a nonparametric model, which means that it does not require any prior information about the underlying data. Thus, this paper uses the past 24 hours of historical meteorological parameters residuals (nonharmonic part) to train the model. To determine the appropriate forecast range, the meteorological parameters residuals of the next 1 h, 2 h, 4 h, 6 h, 8 h, 10 h, and 12 h are, respectively, predicted. The RMS values of the model meteorological parameters and the measured meteorological parameters are calculated to evaluate the performance of different prediction models. It can also be seen from Figure 6 that with the increase of the prediction range, the RMS value also increases. So the prediction range is 1 h. Figure 7 is the time series of the model prediction and the observed value (nonharmonic component). The RMS of the pressure residuals and temperature residuals are, respectively, 0.8536 hpa and 0.5609°C, which have high accuracy.
Thus, the predicted meteorological parameters are obtained by adding the predicted nonharmonic components and the harmonic components estimated by the GPT2w model and then the GPAS software is used to output ZHD and ZTD.
It can be seen from Figure 8 that the predicted results of the PSO-LSSVM model are usually consistent with the MS. The ZHD output by the meteorological parameters from the PSO-LSSVM are in good agreement with the MS. This is because the good performance of the GPT2w empirical model itself, which better fits the harmonic components of the meteorological parameter time series, provides a solid foundation for further prediction of nonharmonic components. At the same time, the nonlinear factors of the time series are not ignored, and the nonlinear relationship can be found by the RBF kernel function of the LSSVM.
The paper selects the data from August 10th, 2018, to August 27th, 2018, when the weather of Hong Kong was rainy continuously to validate the proposed model. The right picture in Figure 9 is the time series of ZHD from the GPT2w, MS, and PSO-LSSVM model from August 10th, 2018, to August 27th, 2018, at station HKOH. It can be seen from the right picture that the PSO-LSSVM ZHD are in good agreement with the MS ZHD. However, the GPT2w ZHD has a large bias with the MS ZHD. This verifies the new proposed model has a better performance than GPT2w model. The results of other stations are similar.
The left pictures of Figure 10 are the frequency histograms of the difference between the GPT2w-derived ZTD and the MS-derived ZTD, and the right are the difference between the ZTD output by the PSO-LSSVM model and MS-derived ZTD. In Figure 10, the frequency histograms of the right pictures are thinner than the left pictures which demonstrate the standard deviations of the right pictures are smaller. According to the properties of Gaussian function, the smaller the standard deviation is, the more stable the distribution will be. Thus, the LSSVM has a certain learning ability for Gaussian white noise in the time series of ZTD.
It can be seen from Table 1 that the improved meteorological parameter model proposed in this paper shows better performance than the GPT2w model. The accuracy of the ZTD value has been effectively improved. Especially, the proposed model achieves RMS of 0.4989 mm, 0.3231 mm, 0.3320 mm, 0.3141 mm, 0.3014 mm, and 0.3036 mm at stations of HKKT, HKOH, HKPC, HKSC, HKSS, and HKST, respectively, which has approximately 65.62%, 75.29%, 76.29%, 77.54%, 78.23%, and 76.78% improvements over them, respectively.
Finally, the proposed model is applied to the GPAS and compared with the GPT2w model. From Table 2, it is obvious that the proposed model also reaches improvements of more than 30% over stations of HKKT, HKOH, HKPC, and HKSC though the improvements are not evident over HKSS and HKST. Above all, for each station, the positioning accuracy has also been improved. The improvement verifies the universality and effectiveness of the model proposed in this paper for the research GNSS stations.

Conclusions
To improve the accuracy of estimating ZTD, the new meteorological parameters model based on GPT2w model and PSO-LSSVM model is proposed. Based on the advantages of MS-derived meteorological parameters and GPT2w model, the time series of meteorological parameters is divided into two categories: harmonic and nonharmonic. Then, the PSO-LSSVM model is used to estimate the nonharmonic and GPT2w is applied to fit the harmonic. Then, the improved meteorological parameters are obtained. Finally, these parameters are input into GPAS which can process GNSS data; the ZTD and coordinates of stations are obtained accordingly. The results show that the proposed model has higher temporal resolution and higher accuracy than GPT2w model. Also, the proposed model is robust even though the weather is rainy.

Journal of Sensors
The future work will focus on the following aspects. First, the past 24 h of historical meteorological parameter residuals (nonharmonic part) are used to train the PSO-LSSVM model. A shorter time should also be considered, such as 12 h. Second, the estimation and forecasting of meteorological parameters are a complicated process. This paper only considers the correlation of its one-dimensional time series. In the following work, more relevant variables can be considered to improve the accuracy of model.

Data Availability
The codes of GPT2w model can be downloaded at https:// vmf.geo.tuwien.ac.at/codes/. The GNSS observation and meteorological data of Hong Kong can be obtained from https://www.geodetic.gov.hk/en/satref/satref.htm.

Conflicts of Interest
The authors declare no conflict of interest.