The Impact of Length-Scale Variation When Diagnosing the Standard Deviations of Background Error in a 4D-Var System and Filtering Method Investigation

)e four-dimensional variational data assimilation (4D-Var) method has been widely employed as an operational scheme in mainstream numerical weather prediction (NWP) centers. In addition to the ensemble data assimilation method, the randomization technique is still used to diagnose the standard deviations of background error in variational data assimilation (VAR) systems; however, such randomization techniques induce sampling noise, which may contaminate the quality of the standard deviations. First, this paper studies the properties of the sampling noise induced by the randomization technique.)e results show that the sampling noise is on a small scale displaying high-frequency oscillations around the estimate compared with the estimate and this difference motivates the use of filtering techniques to eliminate the sampling noise effects. )e characteristics of the standard deviation field of the control variables are also investigated, and the standard deviation fields of different model parameters have different scales and vary with the vertical model levels. To eliminate such sampling noise, the spectral filtering method used widely in the operational system and a modified spatial averaging approach are investigated. Although both methods have splendid performance in eliminating sampling noise, the spatial averaging approach is more efficient and easier to implement in operational systems. In addition, the optimal filtered results from the spatial averaging approach are dependent on model parameters and vertical levels, which is consistent with the variation in the standard deviation field. Finally, the spatial averaging approach is tested on the operational system at the global scale based on the YH4DVAR and the global NWP system, and the results indicate that the spatial averaging approach has positive effects on both analysis and forecast quality.


Introduction
A data assimilation system optimizes the combination of observations and a short-term forecast of the state variables to provide the best estimate of the initial state of the atmosphere [1]. Reports from the famous numerical weather prediction (NWP) Centre, European Centre for Medium-Range Weather Forecasts (ECMWF), show that the assimilated observations contribute 15% of the influence in any one analysis, and the complementary 85% is from the background which contains information from earlier assimilated observations [2]. e information from observations and a short-term forecast are combined and weighted by their corresponding error covariance matrix, the observation error covariance matrix (R matrix), and the background error covariance matrix (B matrix). e fourdimensional variational data assimilation (4D-Var) method is the mainstream scheme used by many leading NWP centers to estimate the initial state of the atmosphere for the NWP model [3]. 4D-Var can assimilate observations of nontraditional types, in particular from satellites or radar, and these observations are generally considered to be a key factor in the continuous improvement of the quality of NWP [4][5][6]. In the 4D-Var system, the B matrix has a profound impact on the analysis because it determines the weight of the prior state, propagates the information from observations, and imposes a balanced relationship between the model control variables [7,8]. e specification of the B matrix is a key component of the 4D-Var system, but it remains a great challenge. e dimensions of the model state vector are of the order of 10 7 in a current NWP system; thus, it is far too large for the B matrix to be explicitly represented. e lack of knowledge about the statistical properties of background error is also a limitation. To overcome these difficulties, significant efforts have been made to approximate the B matrix. In the operational variational data assimilation (VAR) system, generally, the B matrix has to be modeled by using control variable transforms (CVT) and represented as a product of operator form, B � ΣCΣ, where C is the background error correlation operator and Σ is the diagonal matrix of background error standard deviations [9][10][11][12][13][14]. Considering the constraints between multiple variables, the standard deviations need to be diagnosed in the VAR system. Fisher and Courtier [15] suggested that the standard deviations of background error can be estimated by a randomization technique if the B matrix is in the form B � UU T . e randomization approximation of the standard deviations is widely implemented in the current operational centers [16].
e YH4DVAR system is an operational deterministic 4D-Var system that uses the global spectral model as a constraint to impose a dynamic balance on the assimilation [17]. Considering that the sampling noise introduced by the randomization technique has a negative effect on the estimate, a spectral filtering method truncated at a fixed wavenumber in an operational setting solves the issue of sampling noise in the YH4DVAR system [18]. e spectral filtering method enables the removal of most of the sampling noise while preserving the standard deviations of interest. However, the local variation characteristics of the standard deviations may be destroyed through fixed wavenumber truncation. In this paper, we will introduce the modified spatial averaging approach to solve the issue of sampling noise. e article is organized as follows. First, the current operational version of the background error covariance matrix model in the YH4DVAR system is introduced in Section 2, together with the spatial structure of sampling noise and the characteristics of the local variation in the standard deviations for further investigations. In Section 3, filtering approaches are implemented on the operational system and the filtered results are obtained. e results of applying the filtering approaches to the operational system are presented in Section 4. Finally, conclusions follow in Section 5.

Motivations
In a 4D-Var system, the cost function measures the distance between the model trajectory and the observations over an assimilation window. e cost function [19] can be written in terms of increment δx by where subscript t is the time index, δx is the increment, and M 0,t δx is the increment evolved by the tangent linear model from the initial time to time index t. B and R t are the covariance matrices of background errors and observation errors at time index t, respectively. d t is the innovation vector at each time step, t is the background at time index t, and H t is the observation operator with a linear approximation H t .
To simplify the background term in J inc , most leading NWP centers are modeling the B matrix in the CVT framework. A general CVT form of a covariance matrix can be written as an eigenvector decomposition [20]. In the VAR system, the B matrix can be modeled as B � UU T , which is not computed explicitly in the assimilation but implied by the operator U. e construction of operator U is a challenge in background error covariance modeling, and it is currently decomposed into the parameter transform U p , horizontal transform U h , and vertical transform U v in operational settings [21]: where U p introduces balance constraints to the assimilations and the parameters of x u are thought to be nearly uncorrelated with each other [22][23][24]. erefore, the background error covariance matrix can be modeled by B � U p B u U T p , where the matrix B u is the covariance of x u .

Randomization Technique and Sampling
Noise. In the YH4DVAR operational system, the control variables contain the vorticity ζ, the divergence η, the temperature and surface pressure (T, p s ), and the specific humidity q. e U p matrix transforms [ζ, η u , (T, p s ) u , q u ] into [ζ, η, (T, p s ), q], and the parameters of x u correspond to the unbalanced components of the control variables; thus, B u is assumed to be a blockdiagonal matrix with no correlation between the parameters. erefore, the background error covariance matrix can be written explicitly by the autocovariance matrix B u and parameter transform U p , B � U p B u U T p , with the background autocovariance matrix C(·) for each variable. In the VAR system, it is crucial to input the standard deviations of the background error of control variables into the autocovariance matrix C(·) to estimate the B matrix, and the standard deviations of other variables are implicitly given by the parameter transform.
Fisher and Courtier suggested that the randomization technique can be used to diagnose the standard deviations of background error in the assimilation system, thereby alleviating the impact of the imposition of balance conditions on the correlations of background error [15]. As the B matrix is of the form B � UU T , the standard deviations of background error can be estimated by the sample standard deviations of model vectors, that is, applying U to a set of N (N � 10) random vectors ϕ 1 , ϕ 2 , . . . , ϕ n drawn from a Gaussian distribution with N (0, I). Figure 1 shows the vorticity standard deviation field at approximately 1000 hPa obtained by the randomization technique. Figures 1(a)-1(f ) correspond to the raw estimates obtained with 10, 50, 100, 200, 1000, and 10000 samples, respectively, which are abbreviated as ζ (10), ζ(50), ζ(100), ζ(200), ζ(1000), ζ(10000). e sampling noise tends to be characterized by a relatively small scale in ζ (10) such that the estimate strongly deteriorated, and the sampling noise is significantly reduced when the sample size increases to 50 that the main largescale features of the vorticity standard deviation field are well captured. e values of the vorticity estimate are smaller in the tropics and data-dense areas and higher in the high latitude areas, especially in the southern hemisphere in ζ(50). As the number of samples increases, the sampling noise is dramatically removed, and the scale of the standard deviation field is visible. When the number of samples reaches 1000, the basic shape of the estimate ceases to meaningfully change; for example, ζ(1000) is similar to ζ(10000). e randomization technique is widely used to diagnose the standard deviations of background error in VAR systems in many operational centers, and the sampling noise is inevitably introduced into the estimate when undersampling. Although increasing the sample number can alleviate this issue, it is usually unrealistic to employ a sample size over 100 for estimation in the operational system. It needs to be a compromise between the accuracy of the estimates and operational time constraints that considers maintaining affordable computational costs in the operational system. erefore, it is important to remove the sampling noise efficiently in the case of undersampling.
To separate the sampling noise induced by the randomization technique from the estimate, it is necessary to investigate the difference between the sampling noise and the standard deviation field. e spatial structure of the sampling noise in the assimilation system will be explored in the operational version of the B matrix. e true state of background error standard deviations for vorticity is never exactly known, but it is known from Figure 1 that when the number of samples reaches 1000, the estimate of the vorticity field is sufficiently accurate.
us, the reference for the vorticity standard deviation field is generated from 1000 samples.
e sampling noise included in the estimates is obtained by subtracting the reference ζ(1000) from the estimated values of 50, 100, and 200 samples. e sampling noise is defined by e energy spectrum of the signal is an effective means to examine the spatial structure of the signal. e wavenumber in the spectral space corresponds to the scale of the spatial structure in the grid-point space that a large-scale spatial structure means that the energy of the small wavenumber in the spectral space is much larger than that of the large wavenumber, whereas it is a small scale. Figure 2 shows the sampling noise and the corresponding energy spectrum of a specific transect on the spatial fields from the 50 samples, 100 samples, and 200 samples. ere are two striking features worth noting in Figure 2. First, the sampling noise is clearly presented in the vorticity standard deviation field and displays high-frequency oscillations around the globe on a small scale. From the corresponding energy spectrum in the right column, the energy spectrum of the noise over the entire band is more consistent and is close to the white noise. Another striking feature is that the structure distribution of sampling noise from different samples is similar, but as the number of samples increases, the absolute extreme value of the noise decreases, which is consistent with the fact that more samples generate a more accurate estimate. e information in Figures 1 and 2 indicates that the spatial structure of the sampling noise is different from that of the estimates, and this difference motivates the reduction in the sampling noise by filtering techniques.

Characteristics of the Standard Deviation Field.
e characteristics of the standard deviations of background error obtained by the randomization technique in the assimilation system will be explored in the operational version of the B matrix. e characteristics of the standard deviation estimate vary with the model parameters, as illustrated in Figure 3. An overview of Figure 3 shows that low values of vorticity, divergence, temperature, and U-wind estimates are found in the tropics, and higher values are found in the high latitude areas, especially in the southern hemisphere. It is also shown that the standard deviation fields of the four model parameters have different distributions: low values of temperature and divergence are more concentrated at low latitudes, and higher values are found in higher latitude areas; however, there are relatively large vorticity fields along the intertropical convergence zone, and U-wind has a larger value in the data-spare and Antarctic regions and a smaller value in the densely observed land area. e temporal evolution of the standard deviation field is examined in Figure 4 at 1200 UTC and 2400 UTC. It appears that the evolution of the standard deviations over 12 hours is not quite significant, and the distribution of the temperature standard deviation field at 500 hPa (1000 hPa), stretching along the west-east axis, is quite similar over the period. In the VAR system, the covariances matrix is estimated by the climatic information; thus, the characteristics of the standard deviation field obtained by the randomization technique have a certain similarity. In addition, the distribution of the standard deviation field is connected with the dynamics of the synoptic situation. ere is another striking feature in Figure 4 in the differences in the distribution of the standard deviation field at the vertical levels of the model. e values near the surface are larger than those at 500 hPa, which can be explained by the complicated physical process Advances in Meteorology near the surface directly affecting the relative error statistics. e corresponding energy spectra of the standard deviation at 500 hPa and 1000 hPa are plotted in Figure 5. Both of the standard deviations at different model levels have relatively larger energy at large scales, and the energy spectrum of the standard deviation at 1000 hPa has a larger amplitude at medium scales than that at 500 hPa. e length-scale of the estimate of the standard deviations varies with the model parameters and the vertical levels, and these variations have a profound impact on the procedure for eliminating sampling noise. erefore, the method of removing the sampling noise should not only preserve the standard deviations of interest but also consider the variation in the standard deviation field with model parameters and levels.

Filtering Approaches
e standard deviations of background error estimated by the randomization technique are contaminated by sampling noise, and this noise has a negative effect on the estimate. e previous paragraph shows that the sampling noise differs in the spatial structure based on the standard deviations and that the standard deviations are model parameter-dependent and vertical level-dependent. In this section, the spectral filtering method used in the operational system and the concept of a spatial averaging approach will be introduced, with a discussion of the implementation of these approaches in the YH4DVAR system. e filtered results of spatial averaging on the estimate will also be discussed in detail.

Filtering Approaches and Implementation.
In the YH4DVAR operational system, a spectral filtering method, truncated at a fixed wavenumber N tr , is employed to remove the sampling noise by multiplying a coefficient ρ(n) with each wavenumber. e coefficient ρ(n) is defined by where n s is the wavenumber in the spectral space. e spectral filtering method achieves the elimination of sampling noise by setting a reasonable truncated wavenumber N tr , based upon empirical settings, and the efficiency of the  spectral filtering method is more significant when the standard deviation field is isotropic and varies slowly.
For the filtered results in grid-point space, the spatial averaging [25] of the estimate is defined by where N x � n L + 1, L is the length of spatial averaging, and n L is the number of grid points in domain [− L/2, L/2]. e double sum in this equation indicates that spatial averaging has a potential effect of increasing the sampling realizations. is means that, in each model level, the total sampling realizations will be dramatically increased by N t � N × N x × N y , where N is the sampling number. us, the averaged results have less sampling noise when the independent sampling realizations are increased.
It is important to investigate the implementation and computational cost of filtering approaches when considering their application to operational systems. e implementation of the spectral filtering and spatial averaging approach on the operational system is discussed, and the flowchart is shown in Figure 6. e practical implementation of the spectral filtering method using fixed wavenumber truncation in the operational system can be summarized by the following steps: Advances in Meteorology the spatial averaging approach does not need to transform standard deviations into spectral space and has a higher degree of parallelism in the grid-point space. ere is another striking feature that once the optimal spatial average length is determined, there is no need to relocate the grid points satisfying the distance because the sample realizations in the field are invariable. e calculation of the optimal spatial averaging length L can be performed offline and updated with the frequency of the background error covariance statistics. Figure 7(a) is the reference for the vorticity standard deviation field generated from 1000 samples, and the raw estimate computed from 50 samples is represented in Figure 7(b). It is possible to capture the main large-scale features with only 50 samples, and the sampling noise tends to be characterized by a relatively small scale, such that the details of the estimate strongly deteriorated. Note that in the operational assimilation system, it is usually unrealistic to employ a sample size over 100 for estimating because of the computational cost. e filtered result of the .41, respectively, both of which are larger than that of ζ(50), and a larger signal-to-noise ratio of ζ(50) sp strongly supports the fact that the spatial averaging is able to remove most of the detrimental noise. ζ(50) Ntr still has obvious noise, which is caused by the fixed truncation of the wavenumber. Figure 8 shows the coefficient of the fixed truncation of the wavenumber of the spectral filtering method. e N tr � 1000 used here is a conservative spectral filtering function. It sets most of the wavenumber with a coefficient equal to 1, which mainly cuts off some lowfrequency noise and does not consider the local variation of the standard deviation field. e physical structure captured by ζ(50) sp is very detailed and is even as accurate as the estimate in ζ(1000). e averaged results have less sampling noise as the independent sampling realizations are increased, N t � N × N x × N y , which is coherent with the fact that the total sample size N t of the filtered 50 samples is close to the sample size of the raw 1000 samples. is attractive result shows the potential efficiency of the spatial averaging in that the averaging estimates can reach the accuracy of large sample estimates by significantly eliminating the noise. Figure 9 shows the relative error of the filtering result of vorticity ζ(50) sp, temperature T(50) sp, and U-wind U(50) sp at 1000 hpa:

Spatial averaging
Spatial averaging where value sp is the standard deviation filtered by the spatial averaging approach and value ref is the reference standard deviation field by 1000 realizations. It can be seen that, with the increase in the filtering length, the relative error of the filtering result can be expressed as a U shape. ere is a significant effect on reducing the relative error as the averaging length increases, and there is an optimal length at which the error of the filtering result is minimized. When the averaging length becomes excessive, the relative error increases again with a greater length. is can be explained by spatial filtering increasing the accuracy of the filtered results by increasing the independent sample realizations; when the length becomes excessive, the relevant points are also   Advances in Meteorology included in the realizations that result in a reduction in the filtering result. e optimal filtering length for vorticity is 250 km, for temperature is 750 km, and for U-wind is 500 km. e difference in the optimal length of these model parameters shows the difference in the standard deviation field, which is consistent with the characteristics of the standard deviation field mentioned in 2.2. It is absolutely necessary to consider the effects of the model parameters on the spatial averaging length in the procedure for removing sampling noise since the optimal filtering length is dependent on model parameters. e efficiency of spatial averaging at the different vertical levels for different model parameters is shown in Figure 10. Figure 10(a) shows the optimal averaging length is different at the different model levels; for example, the optimal length of temperature around the surface is 750 km, which is larger than 500 km at 500 hpa. In addition, the trend in the optimal length varying with the vertical levels is different from that for model parameters.
e optimal length of the U-wind increases obviously with the model levels, while the optimal length of the temperature decreases gradually from 1000 hPa to 500 hPa and then increases with the vertical model levels.
In general, the optimal length of the variable increases with the vertical levels. is can be explained by the fact that the correlation length-scale of the variable field at higher levels is greater than that at the surface, and a greater length-scale corresponds to a greater optimal length for more independent sampling realizations. Figure 10(b) shows the relative error of U-wind raw estimates U(50) and optimal filtering results U(50) sp at different vertical levels. e relative error of raw estimates U(50) remains around 0.08 while there is a significant reduction in the relative error with the optimal length. e essence of the spatial averaging to reduce sampling noise is to increase the independent sampling realizations by optimal length. A larger optimal length may result in the fact that the sampling realizations are not completely independent, and this is coherent with the relative error associated with optimal length being increasing above level 64.

Effects on Analysis and Forecast Quality
e previous results show that the large-scale features of background error standard deviation can be captured by a randomization technique, but this technique may be contaminated by the sampling noise because of the finite sample sizes. Furthermore, the fixed wavenumber truncation in the spectral filtering method has an inferior impact on the local variation characteristics of the estimates, but the spatial averaging approach offers some advantages in addressing this issue. In this section, the effect of the spatial averaging approach on assimilation and forecast quality will be examined in the operational NWP system consisting of YH4DVAR and the global spectral model.
Applications for assimilation diagnostics and forecasting will be considered with the operational NWP system on a low-resolution grid with 91 levels (abbreviated as T399 L91).
ere are two groups. In one, the number N of random vectors is taken as 50 in the first group, which is numbered CN, using the spectral filtering method truncated with a fixed wavenumber to eliminate the sampling noise. In the other experimental group, the number N of random vectors is taken as 10 with spatial averaging, numbered SP. e results conducted with the operational system on a highresolution grid with 91 levels (abbreviated as T799 L91) are used as a reference. e operational system run started at 00 UTC on 26 August 2015, ended at 00 UTC on 26 September 2015, and was updated every 12 hours. e statistical properties of the B matrix are generated using the popular National Meteorological Centre (NMC) method, an approach that uses the differences between forecasts of different lengths to evaluate forecasts error [9], and the optimal averaging length of the model parameters is calculated at 00 UTC on 26 August 2015 on each vertical level. Figure 11 displays the root mean square error (RMSE) of the geopotential height as a function of time for two schemes: CN and SP.
e Northern and southern hemisphere extratropics correspond to the regions of 30°N-90°N and 30°S-90°S, respectively. e RMSE in the southern hemisphere is generally higher than that in the northern hemisphere and tropics, and RMSE in the tropics is slightly smaller than that in the northern hemisphere.
e RMSE of the SP analysis in all three regions was less than that of the control, which leads to the same conclusion as that for 850 hPa in Figure 11(b). It can be observed that the RMSE in different regions is significantly reduced by the use of the spatial averaging approach in Table 1, and this tends to be consistent with the result that the spatial filtering has a positive impact on improving the accuracy of the estimate. e difference between the SP and the CN analysis fields at a level near 500 hPa and 850 hPa is shown in Figure 12. A visible common feature is that the differences in low latitudes were significantly smaller than those in high latitudes. is can be interpreted to mean that the equatorial waves coupled to convection explain most of the standard deviation in the tropics, especially in the lower troposphere; thus, the filtering effect of the spatial averaging approach is similar to that of the spectral method. Another interesting point is that the larger values are mainly located in the vicinity of ridges and troughs, which is in agreement with previous studies.
Applications for prediction have been achieved by examining the forecast quality of filtered results on a global spectral model. Figure 13 shows the RMSE of the forecast for temperature (500 hPa) at different times compared with the references at the corresponding time of T799. e RMSE has a small value in the tropics and northern hemispheres, whereas much higher values are found in the southern hemispheres in the data-poor areas. It is also observed that the RMSE in the three regions increases with a long forecast time, and the growth of RMSE in the tropics is slower than that in the nontropical area.
e filtered estimates have a positive impact on the forecast quality, and its forecast improvement is generally weaker than that in the analysis. is may be due to the filtering effect of the numerical model, and continued research is important to improve the skill of the forecast.

Discussion and Conclusions
In a VAR system, the randomization technique is widely implemented in the current operational centers to diagnose the standard deviations of background error for modeling the B matrix. In this case, the standard deviations are estimated from a finite sample and will inevitably be affected by sampling noise because of the finite size of the samples. erefore, some filtering techniques must be implemented to eliminate this sampling noise. e spatial structure of the sampling noise introduced by the randomization technique is explored in the operational version of the B matrix. e estimated standard deviations from a finite sample are deteriorated by sampling noise, and the noise tends to be characterized by a relatively small scale and evenly distributed across the global map compared to the estimated reference field. e absolute value of noise dramatically decreases as the sample size increases, and the large-scale of standard deviations is more visible in the estimated field while the global distribution of sampling noise remains even. Such a difference in the structure between sampling noise and the estimates is the key to designing filtering techniques to filter out the small scale noise and preserve the signal of interest. ere is a spectral filtering method truncated at a fixed wavenumber in an operational setting to resolve the issue of sampling noise in the YH4DVAR system. e application of this truncated spectral method is a kind of low-pass filter in spectral space that enables the small scale noise to be filtered out, preserves the large-scale information, and works well for eliminating sampling noise in the operational system.
Our investigation has also shown that the length-scale of the estimate of the standard deviations varies with the model parameters.
e low values of the divergent standard deviations are more concentrated at low latitudes, and higher values are found in higher latitude areas, but there are relatively large vorticity fields along the intertropical convergence zone. In addition, the distribution of the standard deviation field is different between model vertical levels.
e length-scale of the estimates from the randomization technique varies with the model parameters and vertical levels; thus, the fixed wavenumber truncation in the spectral filtering method has an inferior impact on the local variation characteristics of the standard deviations. e modified spatial averaging approach has some advantages in addressing this issue. In an operational context, the physical structure captured by the averaged results of 50 samples is fine, as the reference with the noise is dramatically eliminated, the spatial averaging approach has the potential to increase the independent sample realizations by averaging length, and the averaged results with the optimal length maximize the total sampling numbers. e optimal averaging length is connected with the model parameters and vertical level, which is consistent with the variation in the estimated standard deviations field; thus, the spatial averaging approach has better performance in eliminating the noise and preserving the characteristics of local variation in the estimates. In addition, the spatial averaging approach is easy to implement in the operational system and has a higher degree of parallelism in the grid-point space. e RMSE of the analysis in different regions is significantly reduced by the use of the spatial averaging approach, and relatively small values can be observed over data-spare regions such as the southern hemisphere. In addition, the results also show that the spatial averaging approach has a positive effect on improving forecast skill scores.
In the next step, we need to make further improvements to the spatial averaging approach. e optimal averaging length is computed offline, and the averaging length could be suboptimal with the variations in flowdependent information in the assimilation period. It would be interesting to consider updating the optimal averaging length adapted with the variations. In addition, some qualitative analysis conclusions given in this paper need to be examined in detail in future work. e homogenous and uniform filtering type of spatial averaging approach ignores the anisotropic characteristics of the signal; thus, some important characteristics may be smoothed or weakened in averaging, and it would be worth the effort to introduce an adaptive averaging algorithm in the future.
Data Availability e observations used in this paper are obtained from the NCEP and NOAA website (ftp://ftpprd.ncep.noaa.gov/pub/ data/nccf/com/gfs/prod). e background data are not freely available because it is a short-term forecast by a global spectral model NWP system at the National University of Defense Technology that is subject to commercial confidentiality.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.