ASCAT Wind Superobbing Based on Feature Box

Redundant observations impose a computational burden on an operational data assimilation system, and assimilating high-resolution satellite observation data sets at full resolution leads to poorer analyses and forecasts than assimilating lower-resolution data sets, since high-resolution data may introduce correlated error into the assimilation. Thus, it is essential to thin the observations to alleviate these problems. Superobbing, like other data thinning methods, lowers the effect of correlated error by reducing the data density. In addition, it has the advantage of reducing the uncorrelated error through averaging. However, a thinning method based on averaging can lose some meteorological features, especially in extreme weather conditions. In this paper, we offer a new superobbing method that takes the meteorological features into consideration. The new method shows very good error characteristics, and a numerical simulation experiment for typhoon "Lionrock" (2016) shows that it has a positive impact on the analysis and forecast compared to traditional superobbing.


Introduction
The demand for more accurate predictions of hurricanes is increasing so that timely warnings can be issued to society, which helps to minimize losses and damage. One primary objective is to enhance the observation targeting and observability of hurricanes. Satellite observations can effectively compensate for the shortcomings of traditional methods of sea surface measurement and provide all-weather observation over the sea surface, which is of great significance for improving the numerical prediction of strong convective weather over marine areas [1].
The spaceborne scatterometer observes the backscatter caused by sea surface roughness, from which the sea surface wind can be retrieved. Scatterometer data were first used in an operational numerical weather forecasting system in 1998, when the European Center for Medium-Range Weather Forecasts (ECMWF) incorporated ERS-1 scatterometer data into its global three-dimensional variational system [2]. Previous studies have shown that scatterometer data have significant impacts on weather forecasting and climate monitoring [3][4][5][6][7][8][9]. In particular, they have been demonstrated to be useful in the prediction of tropical cyclones (Isaksen and Stoffelen, 2000) [5] and extratropical cyclones [4]. ASCAT surface wind data have been used in daily weather forecast operations at many centers, such as the ECMWF, the United Kingdom's National Weather Service (Met Office), the National Weather Service of France (Meteo-France), and Environment Canada. In July 2009, the Japan Meteorological Agency (JMA) began to use ASCAT data in its Global Spectral Model (GSM) and found that the ASCAT wind can capture the development of low-pressure systems and improve the prediction precision. Hersbach (2010) pointed out that the neutral wind retrieved by ASCAT had a positive effect on the ECMWF forecasting system [10]. In 2011, Bi et al. evaluated the role of the ASCAT wind in the global data assimilation system of the National Centers for Environmental Prediction (NCEP); the results showed that the ASCAT ocean surface wind has a positive effect on the forecast of wind and temperature [11].
Current satellite observations generally have high temporal and spatial resolution. For example, the horizontal grid size of the ASCAT scatterometer has reached 12.5 km. If the high-resolution observations are brought directly into the assimilation system, the computational overhead increases greatly. In addition, high-resolution data inevitably contain some spatially correlated errors [12]. Therefore, observation thinning becomes a key preprocessing technique in practical satellite data assimilation. It plays an important role in improving the effect of data assimilation, and different thinning algorithms should be designed for different types of satellite observations. At present, the common ways of thinning satellite observations are temporal or spatial sampling, which distributes the observations evenly in time and space, and the "super-observation" method, in which the observation-minus-background (O − B) values, or innovations, are averaged within a certain region and added to the background at the location chosen for the superob. The superob has the advantage of reducing both the correlated error and the uncorrelated error of the observation (Howard, 2004) [13]. Ochotta et al. (2006) proposed two thinning methods [12]: one clusters the observations according to their spatial position and observed values and keeps the center of each cluster; the other iteratively removes the most redundant observations from the data set. Li et al. (2010) [14] proposed a thinning scheme that incorporates the background error covariance of the model and minimizes the analysis error variance by selecting observations. Bauer et al. (2011) [15] proposed a method based on singular vectors to find the sensitive regions of the satellite observations; the nonsensitive regions use the conventional thinning method, while the sensitive regions keep more observations. Gratton et al. (2015) [16] proposed a thinning method based on hierarchical observations, starting with the lowest (sparse) layer and adding observations gradually based on an a posteriori error estimate. However, the above methods give no special consideration to the real-time meteorological features of the observations. In this paper, based on the wind data of the ASCAT scatterometer, a new superobbing method that takes the features of the wind innovation field into consideration is proposed. To achieve thinning and decorrelation, the main idea of the algorithm is to retain the spatial wind variability in dynamic situations, while averaging winds with low spatial variability over larger areas.
The structure of this paper is as follows. A brief introduction to the superob is given in Section 2. Section 3 gives the specific flow of wind superobbing using the feature box. In Section 4, we give the error characteristics of the superob using the feature box and compare them with those of the regular superob. In Section 5, we use a typhoon forecast impact experiment to examine how the new method affects the forecast, and the results are compared with those of the traditional superobbing scheme. Finally, the conclusion is given in Section 6.

Wind Superobbing with Regular Box
In this section, we give the specific definition of the superob and derive the expression for the observation error within a box. Before deriving the expression for the superob, we make the following assumptions [13] in order to simplify the problem.
(1) Observation and background errors are not correlated to each other.
(2) All of the background errors within a box have the same magnitude and are fully correlated with each other.
(3) All of the observation errors are constant and the spatial correlation is constant on the length scales of a box.
(4) All of the innovations within a box are weighted equally.
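By fixing the size of a three-dimensional box (for the ASCAT wind field, the box is two-dimensional), we obtain $N$ observations $x_i^o$ (where $x$ is the $u$ or $v$ component of the wind) and the corresponding background values $x_i^b$ in a box. The superob $x_0^o$ of a box can then be formed from a weighted average of the observation minus the background; namely,

$$x_0^o = x_0^b + \sum_{i=1}^{N} w_i \left( x_i^o - x_i^b \right), \tag{1}$$

where $x_0^b$ is the background value at the chosen superob location and $w_i$ is the weight assigned to each O − B pair. Assume $x_i^t$ is the truth at location $i$; then $x_0^o = x_0^t + \varepsilon_0^o$, $x_0^b = x_0^t + \varepsilon_0^b$, $x_i^o = x_i^t + \varepsilon_i^o$, and $x_i^b = x_i^t + \varepsilon_i^b$, where $\varepsilon_i$ is the corresponding error at location $i$. Then equation (1) can be rewritten as

$$\varepsilon_0^o = \varepsilon_0^b + \mathbf{w}^T \left( \boldsymbol{\varepsilon}^o - \boldsymbol{\varepsilon}^b \right), \tag{2}$$

where $\mathbf{w}$, $\boldsymbol{\varepsilon}^o$, and $\boldsymbol{\varepsilon}^b$ are the vectors of weights, observation errors, and background errors within the box. Squaring the error in equation (2) and averaging over many boxes produce

$$\langle (\varepsilon_0^o)^2 \rangle = \langle (\varepsilon_0^b)^2 \rangle + \mathbf{w}^T \langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{oT} \rangle \mathbf{w} + \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \boldsymbol{\varepsilon}^{bT} \rangle \mathbf{w} + 2 \mathbf{w}^T \langle \boldsymbol{\varepsilon}^o \varepsilon_0^b \rangle - 2 \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \varepsilon_0^b \rangle - \mathbf{w}^T \langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{bT} \rangle \mathbf{w} - \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \boldsymbol{\varepsilon}^{oT} \rangle \mathbf{w}. \tag{3}$$

According to assumption (1) ($\langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{bT} \rangle = \langle \boldsymbol{\varepsilon}^b \boldsymbol{\varepsilon}^{oT} \rangle = \langle \boldsymbol{\varepsilon}^o \varepsilon_0^b \rangle = \langle \varepsilon_0^b \boldsymbol{\varepsilon}^{oT} \rangle = 0$), equation (3) becomes

$$(\sigma_0^o)^2 = \langle (\varepsilon_0^b)^2 \rangle + \mathbf{w}^T \langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{oT} \rangle \mathbf{w} + \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \boldsymbol{\varepsilon}^{bT} \rangle \mathbf{w} - 2 \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \varepsilon_0^b \rangle, \tag{4}$$

where $(\sigma_0^o)^2 = \langle (\varepsilon_0^o)^2 \rangle$ is the superob error variance. Since all of the innovations are weighted equally ($\mathbf{w} = [1/N \; \cdots \; 1/N]^T$) and according to assumption (2), we have

$$\mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \boldsymbol{\varepsilon}^{bT} \rangle \mathbf{w} = \mathbf{w}^T \langle \boldsymbol{\varepsilon}^b \varepsilon_0^b \rangle = \langle (\varepsilon_0^b)^2 \rangle = \sigma_b^2; \tag{5}$$

thus, the background terms in equation (4) cancel and

$$(\sigma_0^o)^2 = \mathbf{w}^T \langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{oT} \rangle \mathbf{w}. \tag{6}$$

Based on assumption (3), $\langle \boldsymbol{\varepsilon}^o \boldsymbol{\varepsilon}^{oT} \rangle$ can be written as the product of the observation error variance $\sigma_o^2$ and a correlation matrix $\mathbf{C}$; namely,

$$(\sigma_0^o)^2 = \sigma_o^2 \, \mathbf{w}^T \mathbf{C} \mathbf{w}, \quad \text{where} \quad \mathbf{C} = \begin{pmatrix} 1 & c & \cdots & c \\ c & 1 & \cdots & c \\ \vdots & \vdots & \ddots & \vdots \\ c & c & \cdots & 1 \end{pmatrix}. \tag{7}$$

With equal weights, $\mathbf{w}^T \mathbf{C} \mathbf{w} = [N + N(N-1)c]/N^2 = (1 - c)/N + c$. Defining the uncorrelated observation error $\sigma_u^2 = (1 - c)\sigma_o^2$ and the correlated error $\sigma_c^2 = c\,\sigma_o^2$, the superob error can be simplified to

$$(\sigma_0^o)^2 = \frac{\sigma_u^2}{N} + \sigma_c^2. \tag{8}$$

Howard (2004) [13] pointed out that the uncorrelated part of the observation error can be approximated by the innovation variance within a box; namely, $\sigma_u^2 = \sigma_{O-B}^2$. Since one box makes up a sample, the innovation variance can be estimated using the standard statistical formula:

$$\sigma_{O-B}^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left[ \left( x_i^o - x_i^b \right) - \overline{\left( x^o - x^b \right)} \right]^2, \tag{9}$$

where the overbar denotes the mean over the box. Thus, the superob error can be rewritten as

$$(\sigma_0^o)^2 = \frac{\sigma_{O-B}^2}{N} + \sigma_c^2. \tag{10}$$

Although equation (10) does not provide the true error of the superob, it provides a reasonable estimate of the superob error under the assumptions above. It can be seen from the equation that the superob greatly reduces the random error within a box but not the correlated error. If the observations are perfectly uncorrelated, the superob error depends only on the innovation variance within the box and on the number of observations; thus, the bigger the box is, the smaller the superob error would be. However, one major concern is that a box that is too big may lead to the loss of meteorological features. Another problem is that a big box would violate the hypothesis that the spatial correlation of the observations is constant within a box.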
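The superob and its error estimate for a single regular box can be sketched in a few lines of Python. This is a minimal illustration under the four assumptions above, not the authors' implementation: the function name `superob_regular_box` and the argument `sigma_c2` (an externally supplied correlated error variance) are hypothetical, and the correlated error is not estimated from the data here.

```python
import numpy as np

def superob_regular_box(obs, bkg, bkg_at_superob, sigma_c2=0.0):
    """Return (superob value, estimated superob error variance) for one box.

    obs, bkg       : the N observations in the box and the co-located backgrounds
    bkg_at_superob : background value at the chosen superob location
    sigma_c2       : assumed correlated error variance (hypothetical input)
    """
    obs = np.asarray(obs, dtype=float)
    bkg = np.asarray(bkg, dtype=float)
    innov = obs - bkg                      # O - B for every pair in the box
    n = innov.size
    # Equal weights 1/N: background plus the mean innovation.
    superob = bkg_at_superob + innov.mean()
    # Sample variance of the innovations approximates the uncorrelated error.
    var_innov = innov.var(ddof=1) if n > 1 else 0.0
    # Random part shrinks with N; the correlated part does not.
    err_var = var_innov / n + sigma_c2
    return superob, err_var

# Example with one wind component (m/s) in a four-observation box.
value, err = superob_regular_box([10.2, 9.8, 10.5, 10.1],
                                 [9.0, 9.1, 9.2, 8.9], 9.05)
print(value, err)
```

Passing a nonzero `sigma_c2` shows the point made in the text: enlarging the box lowers only the first term of the error estimate.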

Wind Superobbing with Feature Box
At present, the most commonly used thinning methods in data assimilation all use a fixed-size grid (two-dimensional) or box (three-dimensional). However, such global smoothing often destroys some of the structural characteristics of the data field, and these structural features often contain key information, such as the vortex structure of a typhoon wind field. Duan et al. (2017) [17] proposed a new thinning method called feature thinning, which preserves the characteristics of the wind field while removing the redundancy of the observations. The grid size adapts to the spatial structure of the wind field, and the feature comes from the wind field itself. This paper draws on the idea of feature thinning to determine the box size of each superob. However, the feature in this paper is extracted from the innovation field rather than the wind observation field, since data assimilation combines observation and background. The flow of the superobbing with feature box algorithm is shown in Figure 1.
The specific steps of the algorithm are as follows:
(1) Meshing: mesh the data according to the geographical location of regular grid cells, which can be stored in a two-dimensional array in which one grid cell corresponds to one element of the array. Assume that the wind field is divided into $m$ rows and $n$ columns and that each grid cell has no more than one observation. Since the ASCAT wind product is organized in units of tracks and each data element can be identified by the row and column numbers of its wind vector cell (WVC), each WVC is a grid cell and can be easily stored and processed in a two-dimensional array.
(2) Cluster initialization: initialize each grid cell so that each grid cell $g_{ij}$ belongs to a separate cluster $C_k$; thus, each observation has its own cluster. This can be expressed as

$$g_{ij} \in C_k,$$

where $k$ is the index of the cluster and $(i, j)$ is the row and column index of the grid cell. Each cluster contains the wind innovation vector $(u^o - u^b, v^o - v^b)_k$ and the geographical coordinate $(\lambda_k, \varphi_k)$ within the grid cell:

$$C_k = \left\{ (u^o - u^b,\; v^o - v^b)_k,\; (\lambda_k, \varphi_k) \right\}.$$

(3) Cluster scanning: scan all the clusters and compare each cluster $C_k$ with a cluster $C_l$ that is adjacent to it. If the wind innovation vector difference of the two clusters does not exceed a proportional threshold $\delta$, namely,

$$\frac{\lVert \mathbf{d}_k - \mathbf{d}_l \rVert}{\lVert \mathbf{d}_k \rVert} \le \delta, \qquad \mathbf{d}_k = (u^o - u^b,\; v^o - v^b)_k,$$

then the two clusters are combined into a new cluster $C_{(k,l)}$, whose components are the average of the two clusters:

$$C_{(k,l)} = \left\{ \overline{(u^o - u^b,\; v^o - v^b)}_{(k,l)},\; (\bar{\lambda}, \bar{\varphi})_{(k,l)} \right\},$$

and the grid cells that belong to the new cluster are updated:

$$g_{ij} \in C_{(k,l)} \quad \text{for all } g_{ij} \in C_k \cup C_l.$$

Otherwise, if the wind innovation vector difference of the two clusters exceeds the proportional threshold $\delta$, that is,

$$\frac{\lVert \mathbf{d}_k - \mathbf{d}_l \rVert}{\lVert \mathbf{d}_k \rVert} > \delta,$$

the two clusters are kept as separate clusters, and the search moves to the next cluster. After all the clusters have been scanned, the scanning process is repeated until the clusters are no longer updated or the number of iterations reaches a set limit.
(4) Superobbing: after step (3) is completed, we obtain a number of new clusters, which we call feature boxes. Note that $(u^o - u^b, v^o - v^b)_k$ is the average of all the wind innovation vectors that belong to the new cluster $C_k$, and we assume that all of the innovations within a box are weighted equally. Then, for feature box $C_k$, the superob $\mathbf{s}_k = (u_k^s, v_k^s)$ can be calculated by

$$u_k^s = u_k^b + (u^o - u^b)_k, \qquad v_k^s = v_k^b + (v^o - v^b)_k,$$

where $(\lambda_k, \varphi_k)$ is the coordinate average of all the wind innovation vectors in box $C_k$, and $(u_k^b, v_k^b)$ is the background wind vector at location $(\lambda_k, \varphi_k)$.
Figure 2 illustrates the concept of the feature box. Figure 2(a) is the wind innovation vector field (each vector is drawn as a colored circle for simplicity). At the beginning of the algorithm, each vector is initialized as an individual cluster, as Figure 2(b) shows. Then the clusters are merged according to their similarity to each other (Figure 2(c)). Finally, the value and coordinate of the new innovation vector within each new cluster are calculated as the average over all the wind innovation vectors that belong to the cluster, as shown in Figure 2(d).
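The initialization, scanning, and merging steps can be sketched as follows. This is an illustrative reading of the algorithm, not the authors' code. The following are all assumptions: the function name `feature_boxes`, the similarity test (vector difference relative to the magnitude of the first cluster's mean innovation), the choice of right and lower neighbours as "adjacent", and the member-count-weighted update of the cluster mean.

```python
import numpy as np

def feature_boxes(du, dv, delta=0.15, max_iter=20):
    """Group grid cells into feature boxes by merging similar neighbours.

    du, dv : 2-D arrays of u and v innovations (O - B); NaN marks empty cells.
    Returns an integer label array (-1 for empty cells) and a dict
    label -> (mean du, mean dv, number of cells).
    """
    du = np.asarray(du, dtype=float)
    dv = np.asarray(dv, dtype=float)
    rows, cols = du.shape
    valid = ~(np.isnan(du) | np.isnan(dv))
    # Cluster initialization: every observed cell starts as its own cluster.
    labels = np.where(valid, np.arange(rows * cols).reshape(rows, cols), -1)
    # Running sums keep each cluster mean equal to the average of all members.
    stats = {int(labels[i, j]): [du[i, j], dv[i, j], 1]
             for i in range(rows) for j in range(cols) if valid[i, j]}

    def mean(k):
        su, sv, n = stats[k]
        return np.array([su / n, sv / n])

    for _ in range(max_iter):
        merged = False
        for i in range(rows):
            for j in range(cols):
                for ni, nj in ((i, j + 1), (i + 1, j)):  # right, lower neighbour
                    if ni >= rows or nj >= cols:
                        continue
                    k, l = int(labels[i, j]), int(labels[ni, nj])
                    if k < 0 or l < 0 or k == l:
                        continue
                    dk, dl = mean(k), mean(l)
                    # Assumed test: relative vector difference within delta.
                    if np.linalg.norm(dk - dl) <= delta * np.linalg.norm(dk):
                        su, sv, n = stats.pop(l)
                        stats[k][0] += su
                        stats[k][1] += sv
                        stats[k][2] += n
                        labels[labels == l] = k
                        merged = True
        if not merged:      # stop once a full scan changes nothing
            break
    boxes = {k: (su / n, sv / n, n) for k, (su, sv, n) in stats.items()}
    return labels, boxes

# Two uniform regions with different innovations stay as two feature boxes.
du = [[1, 1, -1, -1], [1, 1, -1, -1]]
dv = [[0, 0, 2, 2], [0, 0, 2, 2]]
labels, boxes = feature_boxes(du, dv)
print(labels)
print(boxes)
```

The superob of each box then follows from step (4): the background wind at the box's mean coordinate plus the box-mean innovation stored in `boxes`.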

ASCAT Wind Superobbing
4.1. ASCAT Wind Data. ASCAT is one of the instruments carried by the Meteorological Operational (Metop) series of polar satellites launched by the European Space Agency (ESA) and operated by the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) [18]. It operates in the C-band (5.255 GHz), so the effects of clouds and precipitation on the observations are small. The swath width on each side of ASCAT is 550 km, and the instrument achieves daily quasi-global coverage. ASCAT surface wind data have been used in daily weather forecast operations at many centers, such as the European Center for Medium-Range Weather Forecasts (ECMWF), the United Kingdom's National Weather Service (Met Office), the National Weather Service of France (Meteo-France), and Environment Canada. Two sets of three antennas are used in ASCAT to generate radar beams looking 45 degrees forward, sideways, and 45 degrees backward with respect to the satellite's flight direction, on both sides of the satellite ground track. For each wind vector cell (WVC), ASCAT obtains three independent backscatter measurements from the three viewing directions, separated by a short time delay. The surface wind speed and direction can then be calculated by using these "triplets" within a geophysical model function (GMF) [1]. The wind product used in this paper is obtained through the retrieval processing of scatterometer data from the ASCAT instrument on EUMETSAT's Metop-B satellite, with a resolution of 12.5 km.
Figure 3 shows the ASCAT wind field around the center of typhoon "Lionrock" (2016) at 9 am on August 29, 2016.

4.2. Error Quantification of Superobbing. Figures 4(a) and 4(e) are the wind fields after superobbing with the regular box and the feature box, respectively. The regular box size is 37.5 km and the proportional threshold $\delta$ for the feature box is set to 0.15. The distribution of the wind field using the feature box reflects the key structure of the innovation field, and the innovations within one feature box are similar to each other. Thus, the standard deviation of the innovation within each feature box is much smaller than that within the regular box, as shown in Figures 4(b) and 4(f). We can then conclude from (10) that the superob error using the feature box is much smaller than that using the regular box when the feature box is equal to or bigger than the regular box. Meanwhile, from (1) with equal weights, we have

$$x_0^o - x_0^b = \frac{1}{N} \sum_{i=1}^{N} \left( x_i^o - x_i^b \right) \approx x_i^o - x_i^b$$

for every observation $i$ in a feature box, where $x_i^o$ and $x_i^b$ are the observation and background at location $i$, $x_0^o$ and $x_0^b$ are the superob and the background at the superob location, and the approximation holds because the innovations within a feature box are nearly equal. This shows that superobbing with the feature box not only has the characteristics of a superob but also retains the original observation information.
The information loss between the original wind field and the wind field after superobbing can be used as a criterion to evaluate a thinning method. One possible way to quantify the loss is to compare the original field with the resampled field. The difference between these two fields is sometimes called the representation error (RE), since RE can refer to the forward interpolation error [19,20], which is subject to the effects of discretization error and limited resolution [21].
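A minimal sketch of this comparison is given below, assuming that every location in a box inherits the innovation of that box's superob (which, with equal weights, is the box-mean innovation). The function name `representation_error_rms` and the label-array input are hypothetical conveniences, not part of the original method description.

```python
import numpy as np

def representation_error_rms(obs, bkg, labels):
    """Resample box-mean innovations to every cell; return (resampled, rms RE).

    obs, bkg : arrays of one wind component and its co-located background
    labels   : integer array of the same shape giving the box of each cell
               (negative values mark cells without an observation)
    """
    obs = np.asarray(obs, dtype=float)
    bkg = np.asarray(bkg, dtype=float)
    labels = np.asarray(labels)
    innov = obs - bkg
    resampled = np.full(obs.shape, np.nan)
    for k in np.unique(labels[labels >= 0]):
        members = labels == k
        # Each cell takes its own background plus the box's superob innovation.
        resampled[members] = bkg[members] + innov[members].mean()
    re = resampled - obs                    # representation error field
    rms = np.sqrt(np.nanmean(re ** 2))
    return resampled, rms

# One regular box across a sharp gradient versus two boxes that follow it.
obs = np.array([[2.0, 2.0, 5.0, 5.0]])
bkg = np.ones((1, 4))
_, rms_regular = representation_error_rms(obs, bkg, np.zeros((1, 4), dtype=int))
_, rms_feature = representation_error_rms(obs, bkg, np.array([[0, 0, 1, 1]]))
print(rms_regular, rms_feature)
```

In this toy case the single box smears the gradient and leaves a nonzero RE, while boxes aligned with the innovation structure reproduce the original field.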
Given the background and the superob wind field, we can resample the superob wind back to the scale of the original wind field. The innovation at every location within a box is set to the innovation at the superob location, so the resampled observation $\hat{x}_i^o$ at location $i$ within a box is calculated by

$$\hat{x}_i^o = x_i^b + \left( x_0^o - x_0^b \right),$$

where $x_i^b$ is the background at location $i$, and $x_0^o$ and $x_0^b$ are the superob and the background at the superob location. The resampled wind fields using the different boxes are shown in Figures 4(c) and 4(g). The resampled wind field using the regular box has obvious strip errors between boxes and sometimes introduces large errors due to the large deviation between the observation and the background, while the resampled wind field using the feature box is very close to the original observation field. Figures 4(d) and 4(h) give the RE of the two resampled wind fields, and Table 1 gives the root-mean-square (rms) RE of the $u$ and $v$ components; it can be seen that superobbing using the feature box greatly reduces the RE of the thinned wind field.

[22] is adopted in this study.
The WRFDA system is a widely used operational system that can produce a multivariate incremental analysis in the WRF model space [23]. The grid size of the assimilation region is 260 × 250; the horizontal resolution is 15 km, and there are 30 vertical layers. The assimilation time is based on the time window of the joint wind field of the typhoon region, which was 0900 UTC 29 August 2016. Using the FNL (final) global reanalysis data provided by the National Centers for Environmental Prediction (NCEP) as the initial field and boundary conditions, we take the 39-hour forecast adjustment from 1200 UTC 28 August 2016 to 0900 UTC 29 August 2016 as the background field of the assimilation system. After the assimilation, a 30-hour forecast is made, lasting until 1500 UTC 30 August 2016. A set of assimilation and comparison experiments is carried out as shown in Table 2.

Results. Table 3 shows the number of wind observations before and after thinning and assimilation. The second column shows a typical number of available satellite wind observations (without data selection and quality control) for each experiment. The third column shows the typical number of wind observations after thinning with the different thinning schemes. Since the regular box size is 37.5 km (three times the resolution of the ASCAT wind field), the typical number of wind observations after superobbing with the regular box is one-ninth (= 1/3 × 1/3) of the total number of winds, while the number after superobbing with the feature box depends on the structure of the wind innovation field and the proportional threshold of the feature box. The number of accepted wind observations used in the data assimilation system is larger for the feature box than for the regular box because WRFDA is a regional system and superobbing with the feature box preserves most of the observations in areas where the wind field has strong spatial variability, especially in the area where the typhoon is located, as can be seen from Figure 4(e).
One important diagnostic tool for understanding the impact of a data type on the assimilation is the analysis residual (the difference between the analysis and the observation). Figure 5 gives the O/A (observation/analysis) comparison of the bias, root-mean-square (rms) value, and standard deviation (std) of the $u$ and $v$ wind components of the two experiments. As shown in Figure 5, the wind field using the superobbing with the feature box has a smaller rms and std than that of the regular box, which implies that the superobbing with the feature box improves the assimilated wind field compared to the regular box. This is not surprising, since superobbing with the feature box has a smaller superob error than the regular box and contains more structural information of the wind field, thus increasing the pull of the analysis toward the observations. In order to display how the superobbing with the feature box has improved the pressure field, we use the FNL global reanalysis pressure field as a reference. We compare the absolute value of the pressure error field ($P_{\mathrm{abs}} = |P_{\mathrm{analysis}} - P_{\mathrm{fnl}}|$) of the two experiments and give the difference field of the pressure error $\Delta P$ ($\Delta P = P_{\mathrm{abs}}^{\mathrm{feature}} - P_{\mathrm{abs}}^{\mathrm{regular}}$). A negative $\Delta P$ value implies that the analysis of the pressure field using the superobbing with the feature box is closer to the reference, and vice versa. As shown in Figure 6, there are generally more areas with negative $\Delta P$ than areas with positive $\Delta P$, which means that the assimilation using the superobbing with the feature box improves most of the pressure field, especially in areas along the observations (the red circles). Meanwhile, the typhoon center (33.1°N, 141.4°E) is also located in an area with negative $\Delta P$.
It is also instructive to examine how the forecast skill changes with time. Figure 7 shows the true and forecast typhoon paths of the different experiments. The location of the typhoon center obtained with the different thinning schemes is very close to that of the control experiment (with no assimilation of the scatterometer wind) at the time of the assimilation; this may be due to a limitation of the algorithm used to locate the typhoon center. However, the forecast typhoon paths with wind assimilation all show some improvement compared to the control experiment. Generally, the forecast typhoon path of the assimilation using superobbing with the feature box is slightly better than that of the regular box.
The intensity of the typhoon obtained with the different schemes is compared in Figure 8. As shown in the figure, the assimilation of wind data improves the typhoon intensity forecast. Assimilation using superobbing with the feature box has the minimum error in pressure and maximum wind speed of the typhoon eye at the time of the assimilation. Table 4 gives the average forecast error of pressure and maximum wind speed; it can be seen that superobbing using the feature box slightly reduces the forecast error compared to the regular box.

Conclusion
In this paper, we proposed a new thinning method that combines superobbing with the structural features of the wind innovation field. Compared with traditional superobbing, the new thinning scheme shows a large reduction in RE and good superob error characteristics. From the results of the typhoon forecast impact experiment, one can see that superobbing with the feature box shows some skill (although probably not significant skill) compared with superobbing with the regular box in the analysis and forecast. However, more experiments are necessary to conclude that the positive impact is caused by the wind observation selection rather than by the increased number of winds influencing the analysis. Another important open question is the correlated and random error of the ASCAT winds, which is crucial for quantifying the exact superob error. The optimal selection of the proportional threshold $\delta$ for the feature box should also be considered in the next step.

Figure 1: Flow of the superobbing with feature box algorithm. $\delta$ is the proportional threshold.

Figure 5: O/A comparison of the bias, root-mean-square value, and standard deviation of the $u$ and $v$ wind components: (a) superobbing with regular box; (b) superobbing with feature box.

Table 2: Data assimilation experiment design.

Figure 6: Pressure error improvement using superobbing with feature box compared with regular box (the red circles indicate the ASCAT wind observation locations on the map).

Figure 8: Pressure and maximum wind speed forecast error of the typhoon eye.

Table 3: Typical number of winds available, after thinning, and after assimilation in the experiment.

Table 4: Pressure and maximum wind speed forecast error of data assimilation experiments.