A Quality Control Method Based on an Improved Kernel Regression Algorithm for Surface Air Temperature Observations

An improved kernel regression (IKR) method based on an adaptive algorithm and particle swarm optimization is proposed. Considering the limitations of current quality control methods in different regions and on multiple time scales, the kernel regression algorithm is applied to the quality control of surface air temperature observations. Observations of 12 reference stations in Jiangsu from 1961 to 2008 and of 14 regions in China from 2010 to 2014 were selected. The analysis of surface air temperature observations was performed in terms of the mean absolute error (MAE), root mean square error (RMSE), consistency indicator (IOA), and Nash–Sutcliffe model efficiency coefficient (NSC). The results indicate that, compared with the traditional IDW and SRT methods, the IKR method has a high error detection rate. Furthermore, the IKR method achieves better predictions and fitting in the single-station and multistation regression experiments in Jiangsu and in the national multistation regression prediction experiment.


Introduction
Compared with radiosonde stations, surface observation stations have a higher spatial resolution. A high-resolution model is sensitive to high-resolution initial fields. However, surface observations, except pressure, have not yet been assimilated in the numerical weather prediction system. The study shows that the forecast skill of numerical weather prediction will be improved if surface observations are assimilated properly. 2 m temperature observations (surface air temperature observations) have a more noticeable impact on the model forecast than other elements [1]. Surface air temperature is measured at surface observation stations at 2.0 m above the ground and represents the important energy interaction between the earth's surface and the atmospheric layer, and between the soil surface and underlayers [2,3]. The goals of the quality control (QC) of surface air temperature observations are to review the observations, identify missing data and suspect data, and supplement and correct the observation data to ensure the maximum extent of archived data [4]. Surface meteorological observations are the basic element in meteorological research and have important decision-making significance for data assimilation technology and numerical weather prediction technology [5]. The accuracy of numerical weather prediction (NWP), a key meteorological forecasting technology in the current era, is largely restricted by data assimilation technology, and the QC of surface air temperature observations in the process of data assimilation is the basis of research in this area. With rapid social and economic development, the distribution of surface weather stations is becoming more systematic and refined, leading to geometric growth in the number of meteorological observations; thus, the QC of surface meteorological observations is a basic and necessary task of meteorological research experiments [6].
Therefore, the QC of surface air temperature observations is the basis of China's meteorological industry: only by ensuring the accuracy and rationality of surface air temperature observations can we further complete data assimilation and improve the ability of NWP [7][8][9][10][11].
In general, foreign scholars' QC methods for surface air temperature observations follow single-station to multistation approaches to finally form comprehensive QC methods. Common QC methods for single-station surface meteorological observations include internal consistency tests, extreme value tests, time consistency tests, limit value tests, and time-varying tests [12][13][14].
These methods can identify large errors in the observations and pave the way for subsequent high-precision QC methods. The spatial consistency test is a representative QC method for surface air temperature observations from multiple stations. The test overcomes the problem of poor detection of reasonable changes in observations, and the effect of regional QC methods is obviously better than that of the QC methods used for single stations. Common spatial consistency test methods include inverse distance weighting (IDW), polynomial interpolation, and the spatial regression test (SRT) [15][16][17]. Among these methods, IDW and SRT are widely used in foreign countries. The Euclidean distance between stations is typically used in IDW to estimate the weights of neighbouring stations, whereas the SRT mainly uses the RMSE to weight the predictions from neighbouring stations. The core idea of these spatial consistency test methods is to estimate the observations of the target station from the observations of neighbouring stations; QC is then achieved by judging, from the difference between the estimated and observed values, whether the observations are acceptable or need to be corrected [18].
Domestic research on the QC of surface air temperature observations started late, so most scholars study and discuss the corresponding domestic situation on the basis of foreign research. QC methods for surface air temperature observations have been developed steadily in China due to the completion of a three-level QC system, which has improved the QC of surface air temperature observations [4]. Domestic scholars continue to promote the development of QC methods. For example, Ren et al. conducted comprehensive QC and analysis of observations of extreme anomalies, thus preventing correct extreme meteorological observations from being completely eliminated and further improving the QC effect [19]. Li et al. proposed a QC method based on the blackboard model and used the ontology analysis method to diagnose and analyse meteorological observations via cooperation between heterogeneous knowledge bodies [20]. Wang et al. conducted QC research on meteorological observations based on comprehensive time and spatial consistency and showed that the integrated consistency method can detect erroneous observations more effectively than a single method [21]. Ye et al. proposed a method based on intelligent equation fitting to construct QC equations to achieve the QC of surface air temperature observations. Furthermore, a QC method based on an improved random forest was proposed, which effectively improved the QC effect of surface air temperature observations [22,23]. Zhang et al. considered the continuity and stability of surface air temperature observations in short time series and proposed an integrated learning method based on particle swarm optimization for phase space reconstruction and extreme learning machines [24]. Xu et al. proposed a new QC method to analyse the observations of new surface automatic weather stations, which lack historical observations: according to regional and climatic characteristics, the national ground observation stations are divided into eight areas for QC [25]. Xiong et al. proposed a QC method for surface air temperature observations based on the difference of spatial observations [26].
All the above methods can be used for QC, but the existing models of surface air temperature observations have a single structure and fail to achieve high universality in the face of surface air temperature observations in different regions or at different scales. Aiming to overcome this deficiency, this paper proposes an IKR method to achieve the QC of surface air temperature observations: the test results are compared with the QC effects of traditional methods to verify the superiority of the method. The specific objectives are to (1) use surface air temperature observations from different time scales in Jiangsu and China to test the feasibility of the KR algorithm, (2) use surface air temperature observations of 14 regions in China to test the superiority of the IKR algorithm compared with the KR algorithm, and (3) test the QC effects of the IKR and KR algorithms and the traditional IDW and SRT algorithms using Jiangsu and national surface air temperature observations, including the error detection rate, predictions, and fitting.
All the above methods lack a physical basis, and information from multiple sources [2,3] has to be introduced for better quality control (QC) of surface air temperature in the future. The AE (e.g., generalization error) in the studies [2,3] can overcome the deficiency of all the above methods and achieve high universality for surface air temperature observations in different regions, at different scales, or over different topography and/or ground cover. However, the operation of those algorithms is not easy, and a powerful tool needs to be developed.

Data
To test the performance of the QC method, we refer to Hubbard's multistation QC method and randomly generate error values in the original data to simulate possible erroneous observations [27]. The error values $K_\lambda$ are generated from $p_\lambda$, a uniformly distributed random number in the interval [−3.5, 3.5], and $s_\lambda$, the standard deviation of the original observations, where $\lambda$ is the position of the error in the original data.
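The error-seeding step can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the seeded error is taken to be the product $s_\lambda p_\lambda$ added to the original value at position $\lambda$, which is one common reading of Hubbard's procedure and is not confirmed by the text above. The function name `seed_errors`, the use of the population standard deviation, and the sample values are hypothetical.

```python
import random
import statistics


def seed_errors(observations, n_errors, seed=0):
    """Return (corrupted_series, error_positions).

    Assumption: each seeded error equals s * p and is added to the
    original value, where s is the standard deviation of the original
    series and p is drawn uniformly from [-3.5, 3.5].
    """
    rng = random.Random(seed)
    s = statistics.pstdev(observations)  # population std (an assumption)
    positions = sorted(rng.sample(range(len(observations)), n_errors))
    corrupted = list(observations)
    for pos in positions:
        p = rng.uniform(-3.5, 3.5)
        corrupted[pos] = observations[pos] + s * p
    return corrupted, positions


# Hypothetical daily mean temperatures in degrees Celsius.
temps = [12.1, 13.4, 11.8, 15.2, 14.7, 13.9, 12.5, 16.0, 15.1, 14.2]
corrupted, positions = seed_errors(temps, n_errors=3)
```

Because the positions of the seeded errors are known, a QC method can later be scored by how many of them it flags.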

Inverse Distance Weighting.
IDW, one of the most commonly used spatial interpolation methods [28], was proposed by the US National Weather Service and interpolates according to the distance between the point to be interpolated and the observed sample points. The closer a sample point is, the greater its weight; that is, the weight contribution is inversely proportional to the distance. The calculation formula is

$$Z = \frac{\sum_{i=1}^{m} Z_i / d_i^{\,n}}{\sum_{i=1}^{m} 1 / d_i^{\,n}},$$

where $Z$ is the estimated value of the point to be interpolated, $Z_i$ is the measured value of the $i$-th sample point, $d_i$ is the distance between the $i$-th sample point and the point to be interpolated, $m$ is the number of measured sample points participating in the calculation, and $n$ is a power exponent that controls how quickly the weight coefficient decreases as the distance between the point to be interpolated and the sample point increases. When $n$ is larger, closer sample points have higher weights; when $n$ is smaller, the weights are more evenly distributed among the sample points.
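As a minimal sketch of the IDW estimate described above, the following Python function computes the weighted average with weights $1/d_i^n$. The function name `idw_estimate`, the default exponent of 2, and the handling of a zero distance are illustrative assumptions.

```python
def idw_estimate(values, distances, power=2.0):
    """Inverse distance weighted estimate from neighbouring stations.

    values[i] is the observation at neighbour i and distances[i] is its
    distance to the target point; `power` is the exponent n.
    """
    if not values or len(values) != len(distances):
        raise ValueError("values and distances must be non-empty and equal in length")
    # A neighbour at zero distance coincides with the target point.
    for z, d in zip(values, distances):
        if d == 0:
            return z
    weights = [1.0 / d ** power for d in distances]
    return sum(w * z for w, z in zip(weights, values)) / sum(weights)


# Two hypothetical neighbours, one twice as far away as the other.
estimate = idw_estimate([10.0, 20.0], [1.0, 2.0], power=2.0)
```

With these inputs the nearer station receives four times the weight of the farther one, so the estimate is pulled toward 10.0.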

Spatial Regression.
The SRT is a QC method that checks whether a variable is within the confidence interval formed by the surrounding station data over a given period. All stations (M) within a certain distance are selected, each station is paired with the target station, and linear regression is performed. For each surrounding station, an estimate is obtained from the regression formula. The weighted estimate x′ is then obtained from these estimates and the estimated standard errors, also known as the error in the process, of the N stations used in the estimation. N is selected by the user (N ≤ M), considering that the observation system may change over time and that the data of the surrounding stations change within a day. An estimated weighted standard deviation s′ is also calculated. The neighbouring stations are selected according to the minimum value of the standard error between each station and the target station.
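The SRT procedure can be illustrated with a short sketch. The exact weighting formulas are not reproduced in the text above, so this example swaps in plain inverse-variance weighting: $x' = \sum_i x_i s_i^{-2} / \sum_i s_i^{-2}$ and $s'^2 = N / \sum_i s_i^{-2}$, with an observation accepted when $|x - x'| \le f s'$. These formulas, the function names, the threshold `f`, and the sample data are all assumptions.

```python
import math


def fit_line(y, x):
    """Least-squares fit x ~ a + b*y; returns (a, b, s), s = RMS residual."""
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    syy = sum((v - mean_y) ** 2 for v in y)
    b = sum((v - mean_y) * (u - mean_x) for v, u in zip(y, x)) / syy
    a = mean_x - b * mean_y
    s = math.sqrt(sum((u - (a + b * v)) ** 2 for v, u in zip(y, x)) / n)
    return a, b, s


def srt_check(target_hist, neighbour_hists, neighbour_now, target_now,
              n_best=2, f=3.0):
    """Return (x_est, s_est, passed) for one target observation."""
    fits = [fit_line(h, target_hist) for h in neighbour_hists]
    # Keep the n_best neighbours with the smallest standard error.
    ranked = sorted(zip(fits, neighbour_now), key=lambda t: t[0][2])[:n_best]
    eps = 1e-9  # guards against a perfect fit with zero residual
    inv_var = [1.0 / max(s, eps) ** 2 for (_, _, s), _ in ranked]
    estimates = [a + b * y for (a, b, _), y in ranked]
    x_est = sum(w * e for w, e in zip(inv_var, estimates)) / sum(inv_var)
    s_est = math.sqrt(len(ranked) / sum(inv_var))
    return x_est, s_est, abs(target_now - x_est) <= f * s_est


# Hypothetical history for the target and three neighbouring stations.
target_hist = [10.0, 12.0, 14.0, 16.0, 18.0]
neighbour_hists = [
    [9.1, 11.0, 13.2, 14.9, 17.1],
    [11.5, 12.0, 15.5, 16.0, 20.0],
    [8.0, 13.0, 12.0, 18.0, 15.0],
]
neighbour_now = [15.0, 17.0, 14.0]
x_est, s_est, passed = srt_check(target_hist, neighbour_hists, neighbour_now, 16.1)
```

Ranking by the regression standard error mirrors the selection rule stated above; a larger `f` widens the acceptance interval.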

Kernel Regression Algorithm.
Kernel regression estimation is an important and commonly used method in nonparametric regression. Due to the particularity of air temperature observations and the multistation QC model that this paper considers, targeted improvements must be made to the proposed method.
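Assuming a given sample of temperature observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ of $X$ and $Y$, there is a regression model given by

$$Y_i = g(X_i) + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$

where $g(X_i)$ is the regression function and $\varepsilon_i$ is a sequence of mutually independent random errors with a mean of 0 and a variance of $\sigma^2$. The regression function is the conditional expectation given $X = x$, namely,

$$g(x) = E(Y \mid X = x) = \frac{\int y f(x, y)\,dy}{f_X(x)},$$

where $f(x, y)$ is the joint density function of $(X, Y)$ and $f_X(x)$ is the marginal density function of $X$. According to the kernel density estimation algorithm, the kernel estimators of $f(x, y)$ and $f_X(x)$ are defined as

$$\hat{f}(x, y) = \frac{1}{n h h_0} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right) K\left(\frac{y - Y_i}{h_0}\right), \quad \hat{f}_X(x) = \frac{1}{n h} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \tag{6}$$

where $K(\cdot)$ is the kernel function and $h$ and $h_0$ are the window widths of $X$ and $Y$. The estimator $g(x) = \int y f(x, y)\,dy / f_X(x)$ of the regression function can then be written in the form of a nonparametric kernel estimator as

$$g_{NW}(x) = \frac{\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right) Y_i}{\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)}. \tag{7}$$

Formula (7) shows that the estimator requires no specific functional relationship between the response variable $Y$ and the explanatory variable $X$ and is not tied to a particular type of surface meteorological observation, such as temperature, humidity, rainfall, or wind speed. This study involves only surface air temperature observations, and the feasibility of the algorithm requires further verification. The following is an explanation of formula (7):

$$g_{NW}(x) = \sum_{i=1}^{n} w_i Y_i, \quad w_i = \frac{K\left(\frac{x - X_i}{h}\right)}{\sum_{j=1}^{n} K\left(\frac{x - X_j}{h}\right)}. \tag{8}$$

Formula (8) can be interpreted in matrix form as

$$g_{NW}(x) = [w_1, w_2, \ldots, w_n]\,[Y_1, Y_2, \ldots, Y_n]^{T}, \tag{9}$$

where the weight coefficient is related to the explanatory variable $X$ itself and to how $x$ is selected in $g_{NW}(x)$. Then, for formula (5), surface air temperature observations are introduced for interpretation. If station 1 is the estimated station, its data form is $Y = [Y_1, Y_2, \ldots, Y_n]$. Station 2 provides the explanatory variables, and its data form is $X = [X_1, X_2, \ldots, X_n]$. Therefore, each $X_j$ has a corresponding weight coefficient matrix $[w_1, w_2, \ldots, w_n]_j$, and the standard format of formula (9) can be modified as follows:

$$g_{NW}(X_j) = [w_1, w_2, \ldots, w_n]_j\,[Y_1, Y_2, \ldots, Y_n]^{T}, \quad j = 1, 2, \ldots, n. \tag{10}$$

The above interprets how the regression estimate of the surface air temperature observations of the central station is calculated from a neighbouring station.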
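A single-neighbour kernel regression of this kind can be sketched as follows. The example assumes a Gaussian kernel, which the text above does not specify, and the names `gaussian_kernel` and `nw_estimate` as well as the sample values are illustrative only.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice of K)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_estimate(x, xs, ys, h):
    """Kernel-weighted average of ys, evaluated at x.

    xs are the neighbouring-station observations, ys the matching
    central-station observations, and h is the window width.
    """
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase h")
    return sum(w * yi for w, yi in zip(weights, ys)) / total


# Hypothetical paired observations (neighbour, central station).
neighbour = [10.0, 12.0, 14.0, 16.0]
central = [10.5, 12.4, 14.6, 16.3]
prediction = nw_estimate(13.0, neighbour, central, h=1.5)
```

Because the weights are non-negative and sum to one after normalization, the prediction always lies between the smallest and largest central-station values in the sample.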

Multistation Kernel Regression Algorithm.
Formula (11) gives the kernel regression from multiple stations to a central station. It can be explained as follows: for each group $X_k$, there is a weight coefficient matrix in which each column is a set of estimates for the central station. However, this formula increases the amount of calculated data from single-column data to matrix data, and the corresponding result is also matrix data, which does not represent the central station directly. Therefore, this paper draws on the idea of the multidimensional regression formula (12), in which $X_i$, $Y_i$, and $Z_i$ are different types of multidimensional data, such as temperature and humidity, and improves it into the multistation kernel regression formula (13), in which $X_{si}$ are the surface air temperature observations of multiple neighbouring stations, $Y_i$ are the surface air temperature observations of the central station, and $g_{NW}$ is the regression prediction of the central station observations.
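One way to realize a multistation estimate that returns a single value for the central station is a product kernel over the neighbouring stations, as in standard multidimensional kernel regression. The sketch below assumes that form and a Gaussian kernel; it is not necessarily identical to formula (13), and the function name and data are hypothetical.

```python
import math


def multistation_nw(x_now, neighbour_rows, centre_values, widths):
    """Product-kernel regression estimate for the central station.

    x_now[s]             current observation at neighbouring station s
    neighbour_rows[i][s] historical observation i at station s
    centre_values[i]     historical observation i at the central station
    widths[s]            window width used for station s
    """
    num = 0.0
    den = 0.0
    for row, y in zip(neighbour_rows, centre_values):
        w = 1.0
        for x_s, x_si, h_s in zip(x_now, row, widths):
            u = (x_s - x_si) / h_s
            w *= math.exp(-0.5 * u * u)  # normalizing constants cancel
        num += w * y
        den += w
    if den == 0.0:
        raise ValueError("all kernel weights underflowed; increase the widths")
    return num / den


# Two hypothetical neighbouring stations and two historical samples.
history = [[10.0, 11.0], [14.0, 15.0]]
centre = [10.5, 14.5]
estimate = multistation_nw([12.0, 13.0], history, centre, [2.0, 2.0])
```

Each historical sample contributes one scalar weight, so the result is a single prediction rather than a matrix of per-station estimates.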

Window Width Improved by PSO and Adaptive Algorithm.
Since a fixed window width cannot effectively reflect the influence of the sparseness of the data, the width must be improved by the adaptive algorithm. Simultaneously, based on the improvement of the adaptive algorithm, particle swarm optimization (PSO) is proposed to further improve the method. The specific design method is as follows: according to the kernel density estimates $f(x, y)$ and $f_X(x)$ in equation (6), the window width coefficient is designed on the basis of $h$ being proportional to $f^{-1/2}$ [29]; $\lambda = [f/g]^{-\alpha}$, where $g$ is the arithmetic mean of $f$, $g = \frac{1}{n}\sum_{i=1}^{n} f$, the effect of which is better than that of the geometric mean [30]; and $\alpha$ is a sensitivity parameter that satisfies $0 \le \alpha \le 1$. Studies have shown that the practical effect is best when $\alpha$ is 0.5. Therefore, the adaptive window width is $h^{*} = \lambda h$, and $h$ in equation (7) can be substituted to obtain the adaptive kernel regression algorithm formula as

$$g_{NW}(x) = \frac{\sum_{i=1}^{n} K\left(\frac{x - X_i}{\lambda h}\right) Y_i}{\sum_{i=1}^{n} K\left(\frac{x - X_i}{\lambda h}\right)}.$$

On this basis, a new window width formula is $h = c\sigma n^{-\alpha}$, where $c$ and $\alpha$ are parameters to be determined and optimized by the adjusted PSO algorithm. Taking the kernel density estimation function as the objective function and assuming an N-dimensional space, the air temperature observations of multiple stations constitute a particle population $X = (X_1, X_2, \ldots, X_d)$, where the $i$-th particle data $X_i = (x_{i1}, x_{i2}, \ldots, x_{iN})^{T}$ are evaluated by the objective function $f$ to obtain a set of potential solutions of the kernel density estimation function, $F(X_i) = (f_{i1}, f_{i2}, \ldots, f_{iN})$. The RMSE is used as the fitness function: in the initial solution, the parameters $c$ and $\alpha$ are set to 1.06 and 0.2, the speed parameter $V$ is adjusted to the two change factors $\omega$ and $\mu$, and the position parameter $X$ is adjusted to the window width $h$.
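The new window width formula is then $h = (c + \omega)\sigma n^{-(\alpha + \mu)}$, which is combined with the window width coefficient $\lambda$ and substituted into equation (7) to obtain the following equation:

$$g_{NW}(x) = \frac{\sum_{i=1}^{n} K\left(\frac{x - X_i}{\lambda (c + \omega)\sigma n^{-(\alpha + \mu)}}\right) Y_i}{\sum_{i=1}^{n} K\left(\frac{x - X_i}{\lambda (c + \omega)\sigma n^{-(\alpha + \mu)}}\right)}.$$

The window width improvement in the multistation kernel regression formula (13) is the same.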
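The adaptive window width and the PSO search over $c$ and $\alpha$ can be sketched together. This is an illustration under several assumptions that the text above does not fix: a Gaussian kernel, one local coefficient $\lambda_i$ per sample computed from a pilot density, a leave-one-out RMSE as the fitness, a standard PSO update with inertia 0.7 and acceleration 1.5, and a bounded search box for $(c, \alpha)$. All function names and data are hypothetical.

```python
import math
import random
import statistics


def base_width(xs, c=1.06, alpha=0.2):
    """Window width h = c * sigma * n**(-alpha)."""
    return c * statistics.pstdev(xs) * len(xs) ** (-alpha)


def pilot_density(x, xs, h):
    """Fixed-width Gaussian kernel density estimate at x."""
    norm = len(xs) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs) / norm


def adaptive_widths(xs, h, sensitivity=0.5):
    """Local widths lambda_i * h with lambda_i = (f_i / g)**(-sensitivity)."""
    f = [pilot_density(xi, xs, h) for xi in xs]
    g = sum(f) / len(f)  # arithmetic mean of the pilot densities
    return [h * (fi / g) ** (-sensitivity) for fi in f]


def adaptive_nw(x, xs, ys, widths):
    """Kernel regression with one width per sample; None if weights vanish."""
    w = [math.exp(-0.5 * ((x - xi) / hi) ** 2) for xi, hi in zip(xs, widths)]
    total = sum(w)
    if total == 0.0:
        return None
    return sum(wi * yi for wi, yi in zip(w, ys)) / total


def loo_rmse(xs, ys, c, alpha):
    """Leave-one-out RMSE of the adaptive estimator (the fitness)."""
    widths = adaptive_widths(xs, base_width(xs, c, alpha))
    sq = 0.0
    for i in range(len(xs)):
        est = adaptive_nw(xs[i], xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:],
                          widths[:i] + widths[i + 1:])
        if est is None:
            return math.inf
        sq += (est - ys[i]) ** 2
    return math.sqrt(sq / len(xs))


def pso_widths(xs, ys, n_particles=8, n_iter=20, seed=0):
    """Search (c, alpha); velocities play the role of omega and mu."""
    rng = random.Random(seed)
    lo, hi = (0.3, 0.05), (3.0, 0.5)  # assumed search box for (c, alpha)
    pos = [[1.06, 0.2]] + [[rng.uniform(lo[k], hi[k]) for k in range(2)]
                           for _ in range(n_particles - 1)]
    vel = [[0.0, 0.0] for _ in pos]
    best_pos = [p[:] for p in pos]
    best_fit = [loo_rmse(xs, ys, *p) for p in pos]
    g = min(range(len(pos)), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(n_iter):
        for i, p in enumerate(pos):
            for k in range(2):
                vel[i][k] = (0.7 * vel[i][k]
                             + 1.5 * rng.random() * (best_pos[i][k] - p[k])
                             + 1.5 * rng.random() * (g_pos[k] - p[k]))
                p[k] = min(hi[k], max(lo[k], p[k] + vel[i][k]))
            fit = loo_rmse(xs, ys, *p)
            if fit < best_fit[i]:
                best_pos[i], best_fit[i] = p[:], fit
            if fit < g_fit:
                g_pos, g_fit = p[:], fit
    return g_pos, g_fit


# Hypothetical paired observations (neighbour, central station).
xs = [10.2, 11.5, 12.1, 13.8, 14.4, 15.0, 16.7, 17.3, 18.9, 19.6]
ys = [9.8, 11.0, 12.3, 13.5, 14.9, 15.2, 16.1, 17.8, 18.5, 19.9]
(c_best, alpha_best), rmse_best = pso_widths(xs, ys)
```

The first particle starts at the initial solution $c = 1.06$, $\alpha = 0.2$, so the returned fitness can never be worse than that of the initial solution. Samples in sparse parts of the data receive larger local widths than samples in dense parts.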

Evaluation of Model Performance.
Commonly used evaluation parameters are the MAE, RMSE, IOA, and NSC. The MAE and RMSE represent the prediction accuracy: the smaller the value is, the higher the accuracy is. The NSC and IOA measure the goodness of fit: the closer the value is to 1, the better the goodness of fit is.
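The four indicators can be computed as in the sketch below. The definitions are the standard ones; in particular, the IOA is assumed here to be Willmott's index of agreement and the NSC the usual Nash–Sutcliffe coefficient, since the formulas themselves are not reproduced in the text above. Function names and sample values are illustrative.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def ioa(obs, pred):
    """Index of agreement (Willmott's definition, assumed here)."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def nsc(obs, pred):
    """Nash-Sutcliffe model efficiency coefficient."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


# Hypothetical observed and predicted temperatures.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
scores = (mae(observed, predicted), rmse(observed, predicted),
          ioa(observed, predicted), nsc(observed, predicted))
```

A perfect prediction gives MAE = RMSE = 0 and IOA = NSC = 1, while predicting the observed mean everywhere gives NSC = 0.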

Results and Analysis of Jiangsu Single-Station Test and Multistation Test.

Single-Station Basic Test at Different Time Scales.
The surface air temperature observations of Jiangsu Province from 1961 to 2008 were selected to meet the conditions of large samples. First, the single-station kernel regression test in Jiangsu was conducted: the Nanjing station was the central station, and each of the other stations was used as the neighbouring station for regression prediction. On this basis, four time scales (year, quarter, month, and day) were tested separately, and the four indicators (MAE, RMSE, NSC, and IOA) were calculated. The test results are as follows.

Figure 1 shows that, overall, the neighbouring stations with shorter distances to the central station (NJ), such as YZ and SZ, have better regression effects, but this conclusion is not absolute. For example, when the neighbouring station is ZJ, the effect of the regression test is not as good as that when the neighbouring stations are LYG and SQ. Compared to the number of observations in the kernel regression test at the annual time scale, that at the quarterly, monthly, and daily time scales is greatly increased. According to the kernel regression evaluation indicators in Figure 1, on the quarterly time scale, with the Nanjing station as the central station, the kernel regression effect is better for cities farther to the southeast. On the monthly and daily time scales, the overall effect is similar to that of the annual time scale: the shorter the Euclidean distance between the neighbouring station and the central station is, the better the kernel regression effect is. In addition, since the analysis uses a single-neighbour kernel regression test, even the daily average air temperature observations are acceptable in terms of computational efficiency.

The analysis is performed based on comparison charts at different time scales under different indicators (Figure 1). From the perspective of the prediction accuracy indicators MAE and RMSE, the prediction effects on the annual time scale are the best. On the daily time scale, the effectiveness of a few stations that are close to the central station is similar to that on the quarterly and monthly time scales, and the other effects are not satisfactory. The prediction effects on the quarterly and monthly time scales also do not reach the prediction accuracy standard (RMSE value below 0.6) of the conventional regression algorithm. From the perspective of the fitting accuracy indicators NSC and IOA, except for the low effectiveness of the daily time scale (in terms of the numerical value, the fitting effect is good) far from the central station in the northern part of Jiangsu Province, the fitting accuracy of the quarterly, monthly, and daily time scales is very good. The fitting effect on the annual time scale is not good: the overall indicator value is not high, the numerical value is unstable, and there is no clear relation. Jiangsu is located in the plain area in the southeast, where the climate is pleasant and no obvious air temperature differences exist throughout the whole area. Therefore, when conducting the experiments, the effect of the Euclidean distance will be influenced by factors such as the geographical location of different urban areas. These influencing factors are more obvious when the overall air temperature changes only slightly, so different regions in the country must be selected for verification experiments.

Multistation Test at Different Time Scales.
Based on the abovementioned experiments, the multistation test of kernel regression was conducted. The Jiangsu area was taken as an example, and the number of neighbouring stations was increased from 2 to 11. To solve the problem of irregularity in station selection when increasing the number of neighbouring stations, this paper calculates the indicators by traversing all the neighbouring stations in turn and taking the average. The specific test results are as follows.
According to Figure

Results and Analysis of the National Multistation Test.
The national average daily air temperature observations from 2010 to 2014 are selected for part of the experiment. Due to the inconsistent number of stations in the 14 regions across the country, there is no comparability between different regions when gradually increasing the number of stations; therefore, the central station is taken as the centre, and the selection radius is gradually expanded from 20 to 200 km. That is, the concept of gradually increasing the number of neighbours (n) is replaced with the concept of gradually expanding the range of adjacent stations. The test is performed as follows.

Figure 3 shows that the kernel regression effect for multiple stations in China is similar to that in Figure 2 (Jiangsu multistation kernel regression). The MAE and RMSE of 11 regions (excluding JH, LS, and MH, where the sites are sparse) continue to decrease and eventually stabilize as the radius increases. When the radius reaches 160 km and above, each region reaches its own stable value, and the improvement in prediction accuracy obtained by further increasing the radius is not substantial. The NSC and IOA gradually stabilize and approach 1 as the radius increases. When the radius reaches 120 km and above, the fitting indicator values are already close to 1, and the fitting accuracy cannot be further improved by increasing the radius. In addition, the effect of the kernel regression method is excellent in terms of the numerical values of the fitting indicators: even in the regression test with a radius of 20 km, the fitting result is quite good. Furthermore, to improve the universality of the kernel regression method in different regions, we use the data from all stations within 200 km of the target station in the 14 regions (including the JH, LS, and MH areas, where the stations are sparse) to conduct a comparative experiment to assess the prediction accuracy and fitting accuracy of the IKR method and KR method in different regions. The comparison chart is as follows.
As shown in Figure 4, in terms of the MAE and RMSE, the improvement of the IKR method relative to the KR method in the radius range of 20 km to 60 km is not obvious, and the prediction accuracy is only slightly increased. When the radius range is expanded to 100 km, the prediction accuracy of the IKR method is improved substantially, and when the radius is expanded to 200 km, the IKR method is qualitatively better than the KR method: the improvement gradually stabilizes in the range of 160-200 km. In terms of the fitting accuracy indicator NSC, because the KR method already achieves good results, the IKR method shows little improvement compared to the KR method. In summary, the IKR method achieves better universality and robustness and is suitable for QC research on surface air temperature observations in different regions of the country.

Comparative Test.
The regression predictions of the IDW, SRT, KR, and IKR methods are compared for the single-station and multistation tests in Jiangsu and the multistation tests in different regions of the country. The evaluation selects three indicators: MAE, RMSE, and NSC. The abscissas 1, 2, and 3 are the annual, quarterly, and monthly time scales.

Figure 5 shows the optimal effect comparison of the four methods in the Jiangsu single-station regression. The MAE and RMSE of the IKR method are better than those of the other three methods on the annual, quarterly, and monthly time scales. The traditional IDW and SRT methods are second only to the IKR method on the annual, quarterly, and monthly time scales, and the SRT method is slightly better than the IDW method. Moreover, the effect of the KR method is between those of the IDW and SRT methods on the annual time scale, but its prediction accuracy on the quarterly and monthly time scales is far worse than that of the other three methods. Therefore, the KR method does not achieve high universality in the single-station regression prediction in Jiangsu. The NSC indicates that the KR and IDW methods have lower fitting effects than the IKR and SRT methods on the annual time scale, and the IDW method is slightly lower than the KR method. On the quarterly and monthly time scales, the

Figure 6 presents the optimal results of the four methods in the Jiangsu multistation regression prediction experiment. In terms of the MAE and RMSE, the traditional IDW and SRT methods show little difference in prediction accuracy from the single-station regression prediction and achieve good prediction results. The prediction effect of the KR method on the quarterly and monthly time scales is obviously improved and approaches the prediction effect of the traditional methods. On the annual time scale, the KR method is far better than the traditional methods.
Overall, the KR method is more suitable for multistation regression prediction experiments than for single-station experiments. The prediction effect of the IKR algorithm on the annual and monthly time scales has been improved, but that on the quarterly time scale has been reduced. However, the IKR method is the best among the four methods. The overall trend of the NSC is similar to that in the Jiangsu single-station test, but the IKR and KR methods achieve better fitting effects than the traditional IDW and SRT methods on the annual time scale.

In Figure 7, abscissas 1-14 represent BH, CD, GZ, HK, HHHT, JH, LS, LZ, MY, MH, NJ, TY, WLMQ, and CC, respectively. The MAE and RMSE indicate that the IKR method has excellent prediction effects in the 14 regions of the country, whereas the other three methods have disadvantages in terms of universality and robustness. For example, in terms of the MAE, although the KR method has a good predictive effect in most areas, the effect in the MH region is not as good as that of the traditional IDW and SRT methods. Furthermore, the IDW and SRT methods provide good predictions in station-intensive areas, but the prediction accuracy in areas such as JH, LS, LZ, MH, and WLMQ is reduced or even unsatisfactory. The overall trend of the RMSE is similar to that of the MAE, but the prediction accuracy of IDW in HHHT and CC is not good. In terms of the NSC, the fitting effect of the IKR, KR, and SRT methods is very good in the 14 regions of the country, but the fitting effect of the IDW method is not good in JH, LS, and LZ and is especially poor in JH.

Error Detection Rate.
When testing a hypothesis, we always want to obtain a correct judgement, but according to probability statistics, it is impossible to do so completely. We may make an incorrect judgement when we accept or reject a hypothesis. There are two types of errors, one of which is called "the first kind of error" in mathematical statistics, that is, to "take the truth as false." In statistical QC testing, if the data are correct and rejected, the "first-type error" occurs. The second type of error is to "pass the false as true"; that is, the data are wrong and accepted, resulting in "the second kind of error." Table 1 shows the matrix of the relationship between the research conclusion and the actual situation. The first type of error is also called an α error, which refers to rejecting H0 when the null hypothesis H0 is correct. In this case, the researcher's conclusion is incorrect; that is, a treatment effect that does not actually exist is observed. Possible causes of this type of error are extreme values in the sample and the adoption of relatively loose decision criteria. The second type of error, also known as a β error, refers to the situation in which the null hypothesis H0 is accepted when the null hypothesis H0 is wrong; that is, a treatment effect that actually exists is not observed. The harm of making a type I error is substantial. Since a phenomenon that does not actually exist is reported, the harm of follow-up research and applications derived from the phenomenon will be inestimable. The potential harm of a type II error is relatively small. Therefore, on the basis of controlling the type I error, we should select appropriate QC parameters to minimize type II errors. According to this principle, the ratio of the number of errors detected by the QC method to the total number of inserted errors is calculated and referred to as the error detection rate. The performance of different QC models is evaluated by comparing the error detection rates.
In this paper, the QC parameter values are set to those that produce an equivalent number of type I and type II errors. Figure 8 shows the type I and type II error rates of the IKR, SRT, and IDW algorithms under different QC parameter settings (four regions, BH, HK, LS, and MH, are selected for display). As the QC parameter values increase, the type I error decreases and the type II error increases. Generally, the intersection of the type I and type II error curves is taken as the best QC parameter. In Figures 8(a)-8(l), at the best parameter value, the type I and type II error rates are ordered as follows: IKR < IDW < SRT. Therefore, the IKR algorithm can effectively reduce the type I and type II error rates. Tables 2 and 3 show the best QC parameters and corresponding error detection rates of the three algorithms in the 14 regions of China.
According to the data in Tables 2 and 3, in most regions, the effectiveness of the IKR algorithm in selecting the best QC parameters and maximizing the error detection rate is better than that of the IDW and SRT algorithms. In addition, in terms of selecting the best QC parameters, the IKR algorithm is between the IDW and SRT algorithms in the HHHT and JH areas. In the MH area, the error detection rate of the IKR algorithm is slightly lower (2%) than that of the other two methods. Combined with the comprehensive analysis in Figure 9, these results show that the IKR algorithm has good universality in terms of the error detection rate for different regions.
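The selection of the QC parameter at the intersection of the two error curves, and the resulting error detection rate, can be sketched as follows. The example assumes that each observation has a score equal to the absolute difference between the observed and estimated values divided by the estimated standard deviation, and that an observation is flagged when its score exceeds the QC parameter f; the scores, the candidate grid, and the function names are hypothetical.

```python
def error_rates(scores, is_seeded, f):
    """Return (type I rate, type II rate, detection rate) for threshold f."""
    flagged = [s > f for s in scores]
    n_bad = sum(is_seeded)
    n_good = len(scores) - n_bad
    # Type I: correct data rejected. Type II: seeded errors accepted.
    type1 = sum(fl and not bad for fl, bad in zip(flagged, is_seeded)) / n_good
    type2 = sum(bad and not fl for fl, bad in zip(flagged, is_seeded)) / n_bad
    detection = sum(fl and bad for fl, bad in zip(flagged, is_seeded)) / n_bad
    return type1, type2, detection


def best_parameter(scores, is_seeded, candidates):
    """Pick the candidate where the type I and type II rates are closest."""
    def gap(f):
        type1, type2, _ = error_rates(scores, is_seeded, f)
        return abs(type1 - type2)
    return min(candidates, key=gap)


# Hypothetical scores; True marks positions where an error was seeded.
scores = [0.2, 0.5, 3.1, 0.8, 4.0, 1.2, 0.3, 2.6]
is_seeded = [False, False, True, False, True, False, False, True]
grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
f_best = best_parameter(scores, is_seeded, grid)
type1, type2, detection = error_rates(scores, is_seeded, f_best)
```

Raising f lowers the type I rate and raises the type II rate, which is why the two curves cross; the detection rate equals one minus the type II rate.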

Conclusions
The IKR method for the QC of surface air temperature observations is introduced in this paper. The results show that the IKR method has better prediction accuracy and fitting accuracy than the traditional QC methods. Moreover, the IKR method can effectively identify problems in surface air temperature observations. The specific conclusions are as follows:
(1) Surface air temperature observations of the Jiangsu area are selected to verify the KR method. Whether the KR method can be applied to the regression prediction of surface air temperature observations is investigated, and the effectiveness of single-station and multistation predictions is determined. Finally, different regions of the country are selected to validate the feasibility of the KR method. The results show that application of the KR method in the regression prediction of surface air temperature observations is feasible, but its prediction accuracy and stability must be improved, and it cannot be applied to all regions.
(2) The surface air temperature observations of 14 regions in the country are selected to verify the prediction effect of the IKR method, which is improved via PSO and the adaptive algorithm, compared with the KR method. The results show that the IKR method has higher prediction accuracy, fitting accuracy, universality, and robustness in different regions.
(3) The surface air temperature observations of Jiangsu and 14 regions across the country are selected to compare the four methods considered in this paper. The error detection rate, prediction accuracy, and fitting accuracy demonstrate the superiority of the IKR method. Furthermore, the results show that the IKR method has good universality and robustness in terms of prediction accuracy, fitting accuracy, and error detection rate and is suitable for QC research in different regions.

Data Availability
The meteorological data used to support the findings of this study have not been made available because the data from the National Meteorological Center are confidential.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.