A variety of climate factors influence the precision of long-term Global Navigation Satellite System (GNSS) monitoring data. To analyze the effect of different climate factors on long-term GNSS monitoring records, this study combines the extended seven-parameter Helmert transformation with a machine learning algorithm, Extreme Gradient Boosting (XGBoost), to establish a hybrid model. We established a local-scale reference frame, the stable Puerto Rico and Virgin Islands reference frame of 2019 (PRVI19), using ten continuously operating long-term GNSS sites located in the rigid portion of the Puerto Rico and Virgin Islands (PRVI) microplate. The stability of PRVI19 is approximately 0.4 mm/year in the horizontal direction and 0.5 mm/year in the vertical direction. Using the stable PRVI19 reference frame avoids bias from long-term plate motions when studying localized ground deformation. We then trained the XGBoost model on the postprocessed long-term GNSS records and daily climate data, and quantitatively evaluated the importance of various daily climate factors for the GNSS time series. The results show that wind is the most influential factor, with a unitless importance index of 0.013. We also used the model, with climate and GNSS records as inputs, to predict the GNSS-derived displacements. The predicted displacements have a slightly lower root mean square error than the results fitted with the spline method (prediction: 0.22 versus fitted: 0.31). This indicates that the proposed model, which takes climate records into account, produces suitable predictions for long-term GNSS monitoring.

Among the various remote sensing technologies, Global Navigation Satellite Systems (GNSS) play an important role in providing fundamental infrastructure and have been successfully applied to deformation monitoring. The global systems, such as the United States’ Global Positioning System (GPS), Russia’s Global Navigation Satellite System (GLONASS), the European Union’s Galileo, and China’s BeiDou Navigation Satellite System (BDS), serve as highly efficient monitoring tools for precise geodetic surveying. Different unions and countries have installed numerous Continuously Operating Reference Stations (CORS) for various monitoring purposes, including the Plate Boundary Observatory (PBO) maintained by the National Science Foundation (NSF) EarthScope, the CORS GPS network maintained by the U.S. National Geodetic Survey, and the GPS Earth Observation Network (GEONET) of Japan. As of December 8, 2019, the International GNSS Service (IGS) managed more than 506 permanent GNSS stations worldwide. The original RINEX (Receiver Independent Exchange Format) files for the GNSS stations in different CORS networks are free to download through the University NAVSTAR Consortium (UNAVCO) or National Geodetic Survey (NGS) data archiving facilities [

Understanding the climate-driven variations in GNSS time series is important for monitoring applications. In practice, long-term monitoring time series show cyclical fluctuations and rebounds with periods of half a year, one year, or two years that are triggered by climate influences, which can confuse users without a geodesy background about the precision of GNSS monitoring. How climate factors, mainly rainfall, temperature difference, wind speed, visibility, dew, and humidity, affect the records has long been an open question among long-term GNSS monitoring operators and data users. Xu et al. [

To quantify the impact of different daily climate factors on the GNSS time series, we need high-precision GNSS records. The accuracy and precision of GNSS observations depend on the type of GNSS equipment used and the processing method applied. In general, there are two widely used GNSS data postprocessing approaches: relative positioning and absolute positioning. The relative positioning approach uses the carrier-phase double-difference (DD) method to fix differenced phase ambiguities to integer values between stations and between satellites [

Precise Point Positioning (PPP) is a typical absolute-positioning approach to GNSS postprocessing, which uses single-receiver phase-ambiguity-fixed resolution to process the daily raw data. The precision of PPP solutions has increased dramatically during the last decade, primarily owing to the highly precise satellite orbit and clock data provided by the International GNSS Service (IGS) and to new algorithms for resolving phase ambiguities within a single receiver [

The scale, the orientation, the origin, and the change of these parameters over time are the main physical and mathematical properties of a reference frame. In geodetic applications, a stable local reference frame is primarily transformed from the latest and well-established global reference frame using the Helmert transformation. A group of GNSS reference stations (common points) are used to tie the target regional or local reference frames to a global reference frame (e.g., IGS14). Pearson and Snay [

In practice, even after highly precise GNSS records have been processed, it remains a challenge to quantify the weights of the impacts of different climate factors on the GNSS time series. However, with advances in computing science, machine learning has developed rapidly and provides a new tool for exploring analysis methods in geodesy and the geosciences. The approach can quantitatively test hypotheses and help handle high-dimensional data sets [

In this paper, we propose a hybrid method to evaluate the impact of daily climate factors on the GNSS time series, using the extended Helmert transformation method and the XGBoost algorithm. The model is trained on high-precision GNSS records and various long-term climate records. The contributions of this paper are as follows:

To remove the background tectonic movements when monitoring local ground deformation, we propose using the extended Helmert transformation to establish the highly stable PRVI19 local reference frame, based on ten well-distributed continuously operating GNSS stations with at least five years of data.

By combining GNSS records of millimeter accuracy with local climate data spanning at least five years, we apply the XGBoost machine learning algorithm to derive quantitative weights for the impact of different daily climate factors on the GNSS time series.

Based on the model, we predict the GNSS records and validate the predictions against the real raw GNSS data. The results show that the predictions are highly accurate, and we expect that this study can provide a new way to explore potential deformation monitoring problems.

The Helmert transformation, also called the 7-parameter transformation, is used in geodesy to conduct distortion-free reference frame transformations within three-dimensional (3D) space. It employs three translation parameters, three rotation parameters, and one scale parameter at a selected epoch; the extended form also includes the rates of change of these seven parameters over time. For daily GNSS positional coordinate transformation from the PPP solutions, the geocentric coordinates of a site with respect to the local reference frame can be approximated by the following formula:
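A commonly used linearized (small-angle) form of the seven-parameter transformation is sketched below. The symbols ($T_x, T_y, T_z$ for the translations, $R_x, R_y, R_z$ for the rotations in radians, and $s$ for the scale factor) and the sign convention of the rotation matrix are assumptions made for illustration; conventions differ between sources, so the signs should be checked against the origin of the parameters before use.

```latex
\begin{equation*}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{local}}
=
\begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix}
+ (1 + s)
\begin{bmatrix}
 1   &  R_z & -R_y \\
-R_z &  1   &  R_x \\
 R_y & -R_x &  1
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{global}}
\end{equation*}
```

Because the rotations and the scale factor are very small, products such as $s R_z$ are usually neglected, which leaves each local coordinate as the global coordinate plus a translation, a scale term, and two rotation terms.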

Theoretically, the seven parameters differ for each epoch (day). Therefore, to obtain the positional time series in the local reference frame, the seven parameters at each epoch have to be provided. Currently, two strategies can be employed to perform the transformation: the daily seven-parameter transformation and the 14-parameter transformation [

Here,

Then, the coordinates of the target GNSS stations referred to the local reference frame can be obtained through:

Also, since a linear model is assumed, the changing rates can be easily calculated with two sets of transformation parameters at two different epochs. In this study, the epochs are set as

In order to calculate the transformation parameters at another epoch

Here, the velocity in the local reference frame is regarded as zero, since the local reference frame is designed to have zero velocity relative to the rigid part of the region. With the coordinates at epoch

Then, the coordinates of a GNSS site at any epoch can be transformed from IGS14 to the local reference frame with
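The time-dependent transformation described in this section can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that the two frames coincide at a reference epoch `t0`, that each of the seven parameters grows linearly from zero at `t0`, and that the small-angle rotation convention with second-order terms neglected applies. The names `HelmertRates`, `global_to_local`, and `rates_from_two_epochs` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HelmertRates:
    """Hypothetical container for the seven parameter rates and the reference epoch."""
    d_tx: float  # translation rates, m/year
    d_ty: float
    d_tz: float
    d_rx: float  # rotation rates, radian/year
    d_ry: float
    d_rz: float
    d_s: float   # scale rate, 1/year
    t0: float    # reference epoch (decimal year); frames assumed aligned here


def rates_from_two_epochs(params_t1, params_t2, t1, t2):
    """Linear rate of each parameter from its values at two epochs."""
    return tuple((b - a) / (t2 - t1) for a, b in zip(params_t1, params_t2))


def global_to_local(xyz, t, p):
    """Transform geocentric XYZ (m) at epoch t (decimal year) into the local frame.

    Assumption: parameters are zero at p.t0 and change linearly with time;
    products of small quantities (e.g. scale times rotation) are neglected.
    """
    x, y, z = xyz
    dt = t - p.t0
    tx, ty, tz = p.d_tx * dt, p.d_ty * dt, p.d_tz * dt
    rx, ry, rz = p.d_rx * dt, p.d_ry * dt, p.d_rz * dt
    s = p.d_s * dt
    x_local = x + tx + s * x + rz * y - ry * z
    y_local = y + ty + s * y - rz * x + rx * z
    z_local = z + tz + s * z + ry * x - rx * y
    return (x_local, y_local, z_local)


# Illustrative values only (not the PRVI19 parameters).
example = HelmertRates(0.001, -0.002, 0.0005, 1e-9, -2e-9, 3e-9, 0.0, 2015.0)
local_xyz = global_to_local((2.4e6, -5.5e6, 2.0e6), 2018.5, example)
```

Keeping the rates and the reference epoch in one object means a whole daily time series can be transformed by calling `global_to_local` once per epoch with the same parameter set.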

In this study, we applied the Extreme Gradient Boosting (XGBoost) algorithm to predict ground displacements and to understand which climate factors have more impact on the GNSS monitoring time series. XGBoost was proposed by Chen et al. [

In XGBoost,

The optimization function of XGBoost can be written as:
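In the standard formulation of XGBoost, the regularized objective takes the form below. The notation is assumed for this sketch: $l$ is the loss function, $y_i$ and $\hat{y}_i$ are the observed and predicted displacements, $f_k$ is the $k$-th tree, $T$ is the number of leaves in a tree, $w$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are penalty coefficients.

```latex
\begin{equation*}
\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}
\end{equation*}
```

The first sum measures how well the ensemble fits the training data, and the second penalizes the complexity of each tree.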

However, because of the complex architecture, it is difficult to train the whole ensemble at once. An additive strategy is therefore widely applied, in which trees are trained one by one: the trees that have already been trained are fixed, and one new tree is added at each step. Suppose that the predicted displacement at step

With equations (
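The additive training strategy can be illustrated with a deliberately small sketch. It assumes a squared-error loss (so the gradient is the prediction minus the observation and the second derivative is 1), a single input feature, and depth-one trees; the leaf weight and split gain follow the standard second-order expressions used by XGBoost. All function names are hypothetical, and the real library adds many refinements that are not shown here.

```python
def leaf_weight(g_sum, h_sum, lam):
    """Optimal leaf weight w* = -G / (H + lambda)."""
    return -g_sum / (h_sum + lam)


def split_gain(gl, hl, gr, hr, lam, gamma):
    """Gain from splitting one leaf into a left and a right child."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(gl, hl) + score(gr, hr) - score(gl + gr, hl + hr)) - gamma


def fit_stump(x, grad, hess, lam, gamma):
    """Best single threshold on one feature: (threshold, w_left, w_right)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    g_tot, h_tot = sum(grad), sum(hess)
    w_all = leaf_weight(g_tot, h_tot, lam)
    best, best_gain = (None, w_all, w_all), 0.0
    gl = hl = 0.0
    for pos in range(len(order) - 1):
        i, j = order[pos], order[pos + 1]
        gl += grad[i]
        hl += hess[i]
        if x[i] == x[j]:
            continue  # cannot split between identical feature values
        gain = split_gain(gl, hl, g_tot - gl, h_tot - hl, lam, gamma)
        if gain > best_gain:
            best_gain = gain
            threshold = 0.5 * (x[i] + x[j])
            best = (threshold,
                    leaf_weight(gl, hl, lam),
                    leaf_weight(g_tot - gl, h_tot - hl, lam))
    return best


def predict_stump(stump, xi):
    threshold, w_left, w_right = stump
    if threshold is None:
        return w_left  # no useful split found: a single leaf
    return w_left if xi < threshold else w_right


def boost(x, y, n_rounds=50, eta=0.3, lam=1.0, gamma=0.0):
    """Train stumps one by one; earlier stumps stay fixed."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        grad = [p - t for p, t in zip(pred, y)]  # d/dpred of 0.5 * (t - p)^2
        hess = [1.0] * len(y)
        stump = fit_stump(x, grad, hess, lam, gamma)
        stumps.append(stump)
        pred = [p + eta * predict_stump(stump, xi) for p, xi in zip(pred, x)]
    return stumps, pred


stumps, fitted = boost([0, 1, 2, 3, 4, 5], [0, 0, 0, 1, 1, 1])
```

Each round fits a new stump to the gradients of the current predictions, which is the "one tree per step" idea; the learning rate `eta` shrinks each new tree's contribution.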

As studied by Breiman and Friedman [

XGBoost is implemented using the “xgboost” package in the R software [

In geodesy, a terrestrial reference frame is realized by selecting a set of reference stations and defining their positions and velocities. The selection of the reference stations is critical for establishing a stable reference frame. Here, it is very hard to set any mathematical or technical standard for selecting appropriate GNSS reference stations. In general, with some previously proposed guidelines, the reference stations are selected based on overall geographic distribution and long-term (e.g., >5 years) continuous records [

Also, the PRVI region has established an appropriate GNSS infrastructure and has a long history of land monitoring. Since 1986, researchers have installed GNSS stations to study Caribbean plate tectonic movements [

Map showing the locations of current GNSS stations in the Puerto Rico and Virgin Islands region. VI represents the Virgin Islands; WB represents the Whiting basin; VIB represents the Virgin Islands basin; SCB represents the St. Croix basin.

Figure showing the site views of four typical continuously operating permanent GNSS stations in the PRVI region. (a) A site view at Cerrillos_PR2008 GNSS station (ID: P780) installed in Ponce, PR. The antenna is mounted to a short drilled braced monument (SDBM) designed by UNAVCO. (b) A site view at BayamonSciPR2008 GNSS station (ID: BYSP) and a nearby strong motion accelerometer installed in the Bayamón Science Park, Bayamón, PR. (c) A typical building-based GNSS station (ID: PRLP) installed by the HLCM Group Inc. in Las Piedras, PR. (d) A site view at DVirgGordaBVI2013 GNSS station (ID: CN03) installed at the top of a mountain in North Sound, Virgin Gorda.

The climate affects seasonal ground deformation, and different climate factors carry different weights in the GNSS time series. In this study, the PRVI region, which presently has a tropical rainforest climate, is selected to test the hybrid method. The area has recorded an annual mean temperature of 28°C, which has been increasing since the 1950s [

This study used the GIPSY-OASIS software package (V6.4) to obtain daily solutions using the PPP method. Firuzabadi and King [

Plots showing the geocentric positional time series (

In this study, we used the extended Helmert transformation and selected ten permanent stations to realize the PRVI19 reference frame. The MIDAS method was used to calculate velocity and uncertainty [
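The core idea behind a MIDAS-style velocity estimate can be sketched as the median of slopes computed from data pairs separated by about one year: one-year pairs largely cancel annual signals, and the median resists outliers. The function below is a simplified stand-in under those assumptions, not the published MIDAS algorithm, which additionally trims outlying slopes and derives a velocity uncertainty. The name `median_pair_velocity` and its parameters are hypothetical.

```python
import math
from statistics import median


def median_pair_velocity(t_years, values, span=1.0, tol=0.002):
    """Median slope from pairs of epochs separated by roughly `span` years.

    t_years must be sorted in increasing order (decimal years);
    values are displacements in any length unit.
    """
    slopes = []
    n = len(t_years)
    j = 0
    for i in range(n):
        # Advance j to the first epoch at least (span - tol) after epoch i.
        while j < n and t_years[j] - t_years[i] < span - tol:
            j += 1
        if j == n:
            break  # no later epoch is far enough away
        dt = t_years[j] - t_years[i]
        if abs(dt - span) <= tol:
            slopes.append((values[j] - values[i]) / dt)
    if not slopes:
        raise ValueError("no data pairs separated by the requested span")
    return median(slopes)


# Synthetic daily series: 2 mm/year trend plus an annual cycle and one outlier.
days = range(3 * 365)
epochs = [2015.0 + k / 365 for k in days]
series = [2.0 * k / 365 + 3.0 * math.sin(2 * math.pi * k / 365) for k in days]
series[100] += 50.0
velocity = median_pair_velocity(epochs, series)
```

On this synthetic series the annual cycle and the single outlier have almost no effect on the estimated trend, which is the property that makes the approach attractive for seasonal GNSS time series.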

Map showing the locations of the ten selected GNSS reference stations within the PRVI region. The red velocity vectors are referred to the IGS14 reference frame. The green velocity vectors are referred to the PRVI19 reference frame.

Detailed information of the ten reference stations and the site velocities referred to IGS14 and PRVI19 reference frames.

Reference GNSS | Location | Longitude (degree) | Latitude (degree) | IGS14 EW (mm/year) | IGS14 NS (mm/year) | IGS14 UD (mm/year) | PRVI19 EW (mm/year) | PRVI19 NS (mm/year) | PRVI19 UD (mm/year) | Uncertainty EW (mm/year) | Uncertainty NS (mm/year) | Uncertainty UD (mm/year) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
P780 | Ponce, PR | -66.5791 | 18.075 | 8.9 | 13.8 | -0.3 | 0.1 | 0.1 | -0.3 | 0.2 | 0.2 | 0.8 |
BYSP | Bayamón, PR | -66.1612 | 18.408 | 8.8 | 13.8 | -0.5 | 0.2 | -0.2 | -0.9 | 0.2 | 0.2 | 0.8 |
PRJC | San Sebastian, PR | -66.9995 | 18.342 | 9.0 | 13.6 | -0.8 | 0.4 | 0.1 | -0.9 | 0.3 | 0.3 | 1.0 |
PRLP | Las Piedras, PR | -65.8683 | 18.195 | 9.3 | 14.8 | 0.1 | 0.5 | 0.7 | -0.1 | 0.3 | 0.3 | 1.0 |
PRAR | Cercadillo, PR | -66.647 | 18.450 | 8.9 | 14.4 | -0.5 | 0.3 | 0.7 | -0.9 | 0.3 | 0.3 | 0.9 |
PRN4 | Coamo, PR | -66.369 | 18.079 | 9.0 | 13.6 | 0.6 | 0.2 | -0.3 | 0.6 | 0.3 | 0.3 | 1.1 |
CUPR | Culebra Island, PR | -65.2825 | 18.307 | 8.5 | 14.6 | -0.5 | -0.3 | 0.2 | -0.9 | 0.3 | 0.3 | 1.0 |
VITH | St. Thomas Island, USVI | -64.9692 | 18.343 | 8.6 | 14.8 | -0.1 | -0.1 | 0.2 | -0.5 | 0.2 | 0.3 | 0.9 |
CN03 | Gorda Peak, BVI | -64.403 | 18.490 | 8.9 | 15.5 | -0.2 | 0.2 | 0.6 | -0.5 | 0.4 | 0.4 | 1.3 |
ABVI | Anegada Island, BVI | -64.3325 | 13.730 | 8.4 | 14.9 | -0.3 | -0.2 | -0.1 | -0.9 | 0.3 | 0.3 | 1.0 |

Seven parameters for Helmert reference frame transformation from IGS14 to PRVI19.

Parameters | Unit | IGS14 to PRVI19 |
---|---|---|
dTx | m/year | 2.652437 |
dTy | m/year | -2.130450 |
dTz | m/year | -9.310699 |
dRx | Radian/year | 1.386115 |
dRy | Radian/year | 1.273412 |
dRz | Radian/year | 5.511930 |
1/year | 2015.0 |

Theoretically, longer GNSS records and a better geographical distribution of reference stations improve the stability of a reference frame. Wang et al. [

Figure

Plots showing the three-component displacement time series of the BYSP station with respect to two different scale reference frames: a global reference frame (IGS14) and a local reference frame (PRVI19). The green dots represent the displacements with respect to the PRVI19 reference frame during 2008-2020. The red dots represent the displacements referred to the IGS14 reference frame within the same period of time.

Climate variation is considered an important external factor influencing GNSS data precision. However, clarifying the relationship between daily climate variation and daily GNSS records is a challenge, mainly because the influence of daily climate variation can be partially removed by the 24-hour GNSS processing method. We therefore selected the continuously operating GNSS station BYSP, which is installed near a real-time climate-monitoring device. The model uses five years of continuous climate data collected by the weather station (TJSJ) near BYSP (NOAA National Weather Service). We established two models whose only difference is that one includes the daily climate features and the other does not. We used these models to evaluate whether climate variation influences the precision of the 24-hour GNSS time series, which had been transformed to the PRVI19 reference frame using the extended Helmert transformation method. The dimensionless index is 0.32 for the model without the daily climate features and 0.25 for the model with them; a lower index means better model performance. The results indicate that daily climate variation is one of the factors affecting the GNSS time series.

Furthermore, we determined the quantitative weights of impact from different daily climate factors on the GNSS time series. Figure

Plot showing the impact weights of different climate factors evaluated by the combined method. Sea: sea level; Temp: temperature; Hum: humidity; Vis: visibility. NS represents the North-South displacements. Events represent the operation log of weather station. EW represents the East-West displacements. Avg represents the average of daily values. Low represents the lowest daily value. High represents the highest daily value.

XGBoost hyperparameters.

Hyperparameter | Value |
---|---|
nrounds | 400 |
nthread | 8 |
eta | 0.01 |
gamma | 0.01 |
max_depth | 8 |
min_child_weight | 2 |
subsample | 0.54 |
colsample_bytree | 0.54 |
lambda | 0.01 |
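For readers working in Python rather than R, the tabulated hyperparameters could be expressed as the parameter dictionary accepted by the Python `xgboost` package. This is only a sketch of an equivalent configuration: the study itself used the R package, the `objective` entry is an assumption (regression on displacement), and `X` and `y` in the commented lines are hypothetical feature and target arrays.

```python
# Hyperparameters from the table, keyed by the names used in the xgboost Python API.
params = {
    "eta": 0.01,               # learning rate
    "gamma": 0.01,             # minimum loss reduction required to split
    "max_depth": 8,
    "min_child_weight": 2,
    "subsample": 0.54,         # row sampling ratio per tree
    "colsample_bytree": 0.54,  # column sampling ratio per tree
    "lambda": 0.01,            # L2 penalty on leaf weights
    "nthread": 8,
    "objective": "reg:squarederror",  # assumption, not stated in the table
}
num_boost_round = 400  # corresponds to nrounds in the R package

# With xgboost installed, training would look like this:
# import xgboost as xgb
# dtrain = xgb.DMatrix(X, label=y)
# booster = xgb.train(params, dtrain, num_boost_round=num_boost_round)
```

The small `eta` combined with 400 rounds means each tree contributes only a small correction, while the subsampling ratios inject randomness that helps limit overfitting.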

The hybrid model forecasts GNSS monitoring displacements by learning from previous data and is expected to help explore potential deformation problems in various monitoring applications. The predictions are derived from previous displacements referred to the stable local PRVI19 reference frame. Here, we used the postprocessed BYSP GNSS records referred to PRVI19 to train the model, which was trained on 1823 days of GNSS displacements. We used the root mean square error (RMSE) as an indicator to evaluate the forecasting precision [
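The RMSE used as the evaluation indicator is the square root of the mean squared difference between observed and predicted values. A minimal Python sketch follows; the function name `rmse` and the sample numbers are illustrative and are not taken from the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative displacements (arbitrary units), not values from the paper.
observed_displacement = [0.0, 0.4, -0.2, 0.1]
predicted_displacement = [0.1, 0.3, -0.1, 0.0]
error = rmse(observed_displacement, predicted_displacement)
```

Because the RMSE carries the same unit as the displacements, the scores of two models evaluated on the same days can be compared directly, with the lower value indicating the closer fit.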

Plot showing a detailed comparison between the values fitted to the measurements and the predicted values over 600 days. The orange dots represent the results predicted by the combined machine learning method. The yellow dots represent the fitted results based on GNSS records using the cubic spline method.

To analyze the impact of various daily climate factors on the GNSS time series, we proposed a hybrid method and applied it in the PRVI area. We used the extended Helmert transformation method to establish the PRVI19 local reference frame, which helps avoid bias from background global or regional tectonic movements in the GNSS time series when studying local ground deformation. The stability of the PRVI19 reference frame is approximately 0.4 mm/year in the horizontal direction and 0.5 mm/year in the vertical direction. We also adopted the XGBoost algorithm, together with the highly stable PRVI19 local reference frame, to quantitatively assess the effects of daily climate factors on the daily (24-hour) GNSS observations. Based on the 13 years of GNSS records referred to PRVI19 and the climate data recorded by a nearby climate-monitoring device, we observed that wind had the biggest impact on the GNSS time series: the average, lowest, and highest wind speeds carry the first, second, and fourth-largest weights among all the climate factors. The lowest temperature also strongly affects the GNSS displacements, with the third-largest weight among all climate factors. This paper introduces a new method that can quantitatively determine the impact weights of different climate factors on the GNSS time series. Moreover, we used the model to predict the GNSS records, which can help users explore potential deformation risks. We hope that this study can promote the application of GNSS techniques and improve the understanding of the impact of different climate factors on GNSS monitoring time series.

Data available in a publicly accessible repository. Data is available through

The authors declare that they have no conflicts of interest regarding this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Linchao Li and Hanlin Liu contributed to the conceptualization, methodology, data curation, and writing—original draft preparation. Hanlin Liu contributed to the visualization and investigation. Linchao Li contributed to the supervision. Linchao Li and Linqiang Yang contributed to the writing—reviewing and editing.

The authors acknowledge Guoquan Wang (University of Houston) for providing the data quality check of the PPP solutions for this study. The first author also thanks UNAVCO and the Nevada Geodetic Laboratory (NGL) at the University of Nevada for sharing their GPS products with the public. Data is available through