Effect of Data Assimilation Using WRF-3DVAR for Heavy Rain Prediction on the Northeastern Edge of the Tibetan Plateau

Numerical weather prediction (NWP) is gaining attention as a means of providing high-resolution rainfall forecasts in arid and semiarid regions. However, modeling accuracy is negatively affected by errors in the initial conditions. Here we investigate the potential of data assimilation for improving NWP rainfall forecasts on the northeastern Tibetan Plateau. Three three-dimensional variational (3DVar) data assimilation experiments were designed and run with the Advanced Research Weather Research and Forecasting (WRF) model. Two heavy rain events with different rainfall distributions in space and time are used to examine the improvement in rainfall forecasts after data assimilation. For the spatial distribution, the improvement in rainfall accumulation and area is evident for both events. For the temporal variation, the improvement is more evident for the event with even rainfall distribution in time, while the effect of data assimilation is limited for the event with uneven distribution in space and time. Notably, for both the spatial and temporal distribution of rainfall, satellite radiances have a greater effect on rainfall forecasts than surface and upper-air meteorological observations in this high-altitude region. Moreover, the data assimilation experiments add more detailed information to the initial fields.


Introduction
Precipitation is a crucial component of the Earth's hydrological cycle and has a profound influence on climate and hydrology at regional to global scales [1]. High-resolution heavy rainfall forecasts play an important role in accurate flood forecasting and water resource regulation [2], especially in the arid and semiarid region of northwest China, where precipitation is crucial for water resources and heavy rain often causes flash floods in summer. Providing accurate rainfall forecasts over northwest China using numerical weather prediction (NWP) is therefore of prime importance. The Weather Research and Forecasting (WRF) model is a latest-generation mesoscale NWP model. Recent studies have shown that the WRF model has good potential for capturing rainfall features such as timing, location, and evolution [3][4][5].
However, for producing accurate values for rainfall quantities, the results are not ideal due to the low-quality initial conditions [6], which can be improved by data assimilation [7].
With the improvement of NWP models, several data assimilation techniques, such as three- and four-dimensional variational methods (3DVar/4DVar), ensemble Kalman filters (EnKF), and latent heat nudging (LHN), have been developed, drawing on variational and ensemble approaches [8,9]. Although the 4DVar and EnKF methods show great potential, they still suffer from unaffordable computational costs for operational NWP. In continuous cycling mode, 3DVar performs better in producing reasonable analyses of hydrometeorological fields, with greater computational efficiency than 4DVar, EnKF, and LHN [10,11].
Real-time observations have generally been used in assimilation systems and have been shown to improve the performance of NWP models markedly, in spite of the poor initial conditions provided by global NWP models [12]. In the case of high temporal and spatial resolution satellite data, the European Centre for Medium-Range Weather Forecasts (ECMWF) pioneered the direct assimilation of microwave radiance data affected by precipitation, first in a 1D+4D-Var assimilation approach [13,14] and later in an implementation of all-sky radiance assimilation in the operational 4D-Var system [15,16]. Additionally, many investigations have shown that rainfall forecasts from NWP models can be improved noticeably by assimilating radar reflectivity [17][18][19] or radar-derived precipitation data [20,21]. Nevertheless, remote sensing data need to be validated against ground truth [22], and surface observations contain a wealth of information that can be used to simulate mesoscale weather phenomena [23]. The assimilation of surface observations in an NWP model is therefore likely to improve model performance as well. Previous studies have sought to improve simulations of weather parameters by using direct or modeled surface observations. These include the assimilation of temperature, water vapor mixing ratio, and winds [24], and of 2 m potential temperature, 2 m dew point temperature, and 10 m wind observations [25], into the NWP model to determine planetary boundary layer (PBL) profiles and to analyse the surface cold pool, respectively. The results show a marked improvement in the model simulations after the assimilation.
In the WRF assimilation system, previous research has also shown that, by assimilating only the satellite data, the improvement in precipitation forecasts is not as great as when the assimilation of satellite data is combined with that of surface observations [26,27].
In this study, the WRF-3DVar system is used to explore the effect of assimilating NCAR surface and upper-air observations and AMSU-A and AMSU-B microwave radiance data on forecasts of heavy precipitation on the northeastern Tibetan Plateau. Two heavy rain events that occurred in June 2013 were selected to evaluate the improvement in rainfall forecasts after data assimilation. The paper is organized as follows. Section 2 gives a concise overview of the method and experimental design. Section 3 provides information on the study area and the data. The results of the data assimilation experiments are presented and evaluated in Section 4, and conclusions are given in Section 5.

Method and Experimental Design
2.1. WRF Model Set-Up.
The numerical data assimilation experiments in this study are conducted using the Advanced Research WRF model, Version 3.5. WRF is a nonhydrostatic, primitive-equation, mesoscale meteorological model with advanced dynamics, physics, and numerical schemes (details of the model are available at http://www.mmm.ucar.edu/). As shown in Figure 1, the model domains are two-way nested with 12 km (208 × 182) and 4 km (235 × 182) horizontal grid spacing. Each domain has 28 vertical pressure levels with the top level set at 50 hPa. The WRF physical parameterization schemes used in this study include the Purdue Lin microphysical parameterization, Rapid Radiative Transfer Model (RRTM) longwave radiation, Dudhia shortwave radiation, Monin-Obukhov surface layer, Noah land surface, Mellor-Yamada-Janjic (MYJ) planetary boundary layer scheme, and Grell-Devenyi (GD) cumulus scheme. The map projection is Lambert.

3D-Var Data Assimilation.
Data assimilation is the technique by which observations are combined with an NWP product (called the first guess or background field) and their respective error statistics to provide an improved estimate (the analysis) of the atmospheric (or oceanic) state. Variational (Var) data assimilation achieves this through the iterative minimization of a prescribed cost (or penalty) function [28]:
$$J(x) = \frac{1}{2}\left(x - x^{b}\right)^{T} B^{-1} \left(x - x^{b}\right) + \frac{1}{2}\left(y - y^{0}\right)^{T} R^{-1} \left(y - y^{0}\right), \tag{1}$$
where $x$ is the analysis to be found that minimizes the cost function $J(x)$, $x^{b}$ is the first guess of the NWP model, $y^{0}$ is the assimilated observation, and $y = H(x)$ is the model-derived observation transformed from the analysis $x$ by the observation operator $H$ for comparison against $y^{0}$. The solution of the cost function given by (1) represents the a posteriori maximum likelihood (minimum variance) estimate of the true state given the two sources of a priori data: the first guess $x^{b}$ and the observation $y^{0}$ [29]. The fit to individual observation points is weighted by the estimates of their errors, that is, $B$ and $R$, which are the background error covariance matrix and the observation error covariance matrix, respectively.
The WRF-3DVar system developed by Barker et al. [10] is used in this study in tandem with the WRF model for assimilating the satellite radiance data and the conventional observations. The performance of the data assimilation system depends largely on the plausibility of the background error covariance (BE), that is, the matrix $B$ in (1). In this study, the "CV5" background error option is used, with the control variables stream function, unbalanced velocity potential, unbalanced temperature, unbalanced surface pressure, and pseudo relative humidity. The background error covariance matrix is generated via the National Meteorological Center (NMC) method [30] for our own forecasting domain.
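To make the role of the two weighting matrices in (1) concrete, the following is a minimal sketch of a 3DVar analysis on a three-point toy state. It is not the WRFDA implementation: it assumes a linear observation operator (so the minimizer has a closed form), and the state values, the Gaussian-shaped background covariance, and the observation error variance are all hypothetical numbers chosen for illustration.

```python
import numpy as np


def cost(x, xb, y0, H, B_inv, R_inv):
    """Cost function J(x) of (1) for a linear observation operator H (a matrix)."""
    dxb = x - xb          # departure from the first guess
    dy = H @ x - y0       # departure from the observations
    return 0.5 * dxb @ B_inv @ dxb + 0.5 * dy @ R_inv @ dy


def analysis(xb, y0, H, B, R):
    """Closed-form minimizer of J for linear H: xa = xb + K (y0 - H xb)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
    return xb + K @ (y0 - H @ xb)


# Hypothetical toy problem: three grid points, observations at points 0 and 2.
xb = np.array([280.0, 282.0, 284.0])          # first guess (illustrative values)
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])               # picks out the observed points
y0 = np.array([281.0, 284.5])                 # observations (illustrative values)

idx = np.arange(3)
B = np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)  # assumed background covariance
R = 0.25 * np.eye(2)                                   # assumed observation error covariance

xa = analysis(xb, y0, H, B, R)
print("analysis:", xa)
```

Because the assumed B has nonzero off-diagonal terms, the unobserved middle point is also adjusted. This illustrates why the choice of background error covariance matters for how observational information spreads in space.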

Experimental Design.
Four sets of experiments have been conducted using two domains. As shown in Table 1, the model simulation without data assimilation is referred to as the control experiment (CTRL). Three assimilation experiments are designed using different observation parameters from different data sources. In the DA-OBS experiment, measurements of pressure, geopotential height, temperature, dew point temperature, and wind direction and speed from NCAR are assimilated, while in the DA-SAT experiment, only satellite radiance data are assimilated. Finally, in the DA-BOTH experiment, both NCAR observations and AMSU-A and AMSU-B radiance data are assimilated.
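The four experiments differ only in which data sources enter the assimilation, which can be summarized as a small lookup table. The sketch below is purely illustrative: the class, dictionary, and function names are hypothetical and are not part of WRF or WRFDA.

```python
from typing import NamedTuple


class Experiment(NamedTuple):
    conventional: bool  # NCAR surface and upper-air observations
    radiance: bool      # AMSU-A / AMSU-B satellite radiances


# Experiment design as described in the text (names of fields are illustrative).
EXPERIMENTS = {
    "CTRL": Experiment(conventional=False, radiance=False),
    "DA-OBS": Experiment(conventional=True, radiance=False),
    "DA-SAT": Experiment(conventional=False, radiance=True),
    "DA-BOTH": Experiment(conventional=True, radiance=True),
}


def assimilated_sources(name):
    """Return the list of data sources assimilated in the named experiment."""
    exp = EXPERIMENTS[name]
    sources = []
    if exp.conventional:
        sources.append("NCAR surface and upper-air observations")
    if exp.radiance:
        sources.append("AMSU-A/AMSU-B radiances")
    return sources


for name in EXPERIMENTS:
    print(name, assimilated_sources(name))
```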

Study Area and Data
As shown in Figure 2, the northeastern Tibetan Plateau (94°39′-103°27′E, 35°51′-40°31′N), with elevations ranging from 758 m to 5725 m a.s.l., is the headwater region of many inland rivers, which play an important role in the hydrology and agriculture of the downstream arid region. A total of 43 national observation stations have been used to verify the spatial distribution of the simulated precipitation, among which the Wulan station has the highest elevation of 3800 m and Dunhuang has the lowest elevation of 1137 m.
The initial and boundary conditions needed to run WRF are taken from the ERA-Interim data at 1° × 1° grid resolution obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF), rather than from the NCEP-NCAR Final Analysis (FNL) data. This is because several studies have demonstrated that the ERA reanalysis data are more reliable than the NCEP data in China [31,32]. In the model integration, the coordinates of the central point are 38.1°N and 98.9°E. The WRF-3DVar experiments have been conducted with modified initial conditions obtained by assimilating additional measured data. The assimilated data include NCAR surface and upper-air observations and AMSU-A and AMSU-B radiance data. The surface and upper-air data assimilated in this study are obtained from dataset "ds337.0" in the NCAR archives, which contains measurements of pressure, geopotential height, temperature, dew point temperature, and wind direction and speed from fixed and mobile land/sea stations. The data are downloaded in PREPBUFR format and can be assimilated directly into WRFDA. The AMSU-A and AMSU-B satellite radiance data (NOAA-15/16/17/18/19) from the NOAA ATOVS instruments are in BUFR format and can be read into WRFDA via CRTM 2.0.2.

Rainfall Event.
In June 2013, two heavy rain events occurred in the northeast of the Tibetan Plateau. The durations of the events and the maximum/mean rainfall accumulation observed by the rain gauge network are shown in Table 2. The two events are of different types according to the evenness of rainfall distribution in time and space. Figure 3 illustrates the spatial distribution of the rainfall accumulation for the durations of the two events, while Figure 4 presents the time series bars and the cumulative curves of the observed precipitation for the two events at Laohugou station. Comparing the evenness of the rainfall distribution in time and space in Figures 3 and 4 shows that the rainfall of Event A is uneven in both space and time. The rainfall distribution of Event B is also uneven in space, but continuous and almost constant in time. As for the meteorological characteristics of the storm events according to the weather analysis charts, Event A is likely caused by very strong local convection, and Event B may be a stratiform storm. The rainfall evenness/unevenness of the two heavy rain events can be further verified quantitatively by using the coefficient of variability (CV):
$$\mathrm{CV} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i}{\bar{x}} - 1\right)^{2}},$$
where $x_i$ is the rainfall accumulation of each rain gauge $i$ (when calculating the evenness in space) or the average areal rainfall at each time step $i$ (for the evenness in time), $\bar{x}$ is the mean value of $x_i$, and $N$ is the total number of rain gauges or the total number of time steps. The results for the two rainfall events are also shown in Table 2. A larger CV value represents higher variability and thus a less even rainfall distribution. The CV value of the spatial evenness for Event A is larger than that for Event B, which means Event A has higher variability in space.
The CTRL experiment predicts only a small amount of rainfall in the northwest of the study area, while in the DA-SAT experiment the precipitation area extends to the central and southern regions of the study area and the rainfall accumulation increases significantly (Figure 5(b)). In the northwest of the region, however, the rainfall accumulation in the DA-SAT experiment is similar to that shown in Figure 5(a). In the DA-OBS experiment (Figure 5(c)), the precipitation accumulation in the northwest of the region is greater than that in the CTRL experiment, but the precipitation area is similar in both experiments. Figure 5(d) shows the result of the DA-BOTH experiment, in which the precipitation area and accumulation are the largest among the four experiments.
The results for Event B are shown in Figure 6. The precipitation area in the northwest of the study region is significantly larger in the DA-SAT experiment (Figure 6(b)) than in the CTRL experiment. In Figure 6(c), the precipitation area in the DA-OBS experiment is the same as that in Figure 6(a), but the precipitation accumulation is clearly decreased in the southeast of the region. Figure 6(d) shows the DA-BOTH experiment result, which has a precipitation area and accumulation similar to those of the DA-SAT experiment.
To sum up, data assimilation has an important influence on the precipitation forecast, with significant changes in rainfall area and accumulation. It should be noted that the effect on the forecast depends on the rainfall event.
The rain-gauge data were used to analyze the changes in precipitation area and accumulation after the data assimilation experiments. Figure 7 compares the measured and modeled results for rainfall Event A. In the CTRL experiment (Figure 7(a)), almost all the triangles lie below the 1:1 line (y = x). However, in the DA-SAT experiment, the green triangles which represent the observation sites (Qilian, Yeniugou, Tuole,
The above results suggest that data assimilation has positive effects on the rainfall forecast by changing the rainfall area and accumulation in the study area. For the two events with different evenness in time and space, the results after data assimilation differ. For Event A, the data assimilation experiments provide larger rainfall areas and accumulation, which improves the prediction accuracy. For Event B, the data assimilation experiments improve the accuracy in the southeast of the study area by providing smaller precipitation accumulation. This means that the CTRL experiment underestimates the convective rainfall of Event A and overestimates the stratiform rainfall of Event B in the southeast of the region.
To evaluate quantitatively the impact of the data assimilation experiments on the rainfall forecast, rain-gauge data from 43 sites were used. Table 3 shows the comparison between the measured and modeled results. The CTRL experiment has the lowest simulation accuracy for both Event A and Event B: its R² is the smallest and its RMSE the largest compared with the three data assimilation experiments. The greatest improvement appears in the DA-BOTH experiment for both events. For Event A, the R² of the DA-BOTH experiment is increased by 0.46 and its RMSE is reduced by 4.7; for Event B, the R² of the DA-BOTH experiment is increased by 0.32 and its RMSE is reduced by 0.6. Additionally, the improvement in the DA-SAT experiment is larger than that in the DA-OBS experiment. Over the entire rainfall range, the ME of rainfall accumulation is positive for Event A but negative for Event B. This means that the four experiments underestimate rainfall accumulation for Event A but overestimate it for Event B. This is because, in the precipitation ranges of 0-5 mm and 5-10 mm, almost all the ME values are positive for Event A and all the ME values are negative for Event B.
For both events in this section, the forecast accuracy of the DA-SAT experiment is higher than that of the DA-OBS experiment. This may be because few stations are located in the high-altitude region of the northeastern Tibetan Plateau. The DA-BOTH experiment has the highest accuracy of the three data assimilation experiments, which means that the improvement from assimilating satellite radiances combined with surface and upper-air meteorological observations is greater than that from assimilating satellite radiances alone.

Temporal Variations of Precipitation.
The greatest difference between Event A and Event B is the rainfall distribution in time. Figure 9(a) shows the rainfall intensities and cumulative curves of the measured and modeled results for Event A at Laohugou station. The cumulative curve of the CTRL experiment lies below and far from that of the measured values. However, during the period from
In Figure 9(b) for Event B, the precipitation intensity in the CTRL experiment is significantly larger than the measured values in the first 6 hours. The cumulative curve of the CTRL experiment therefore lies far above that of the measured values, owing to the inaccurate initial field. After data assimilation, the rainfall intensities in the DA-SAT and DA-BOTH experiments are closer to the observations in the first 6 hours, while the rainfall intensity in the DA-OBS experiment is much lower than the observations. As a result, the cumulative curve of the DA-BOTH experiment is closest to the curve of the observed values, followed by the DA-SAT experiment, with the DA-OBS experiment farthest away. It is noteworthy that the cumulative curve of the DA-SAT experiment is closer to that of the observed values for Event B than for Event A.
The absolute error (AE) has been calculated for both Event A and Event B (Figure 10).
The results show that, for the temporal variation of rainfall, the improvement in the DA-SAT experiment is greater than that in the DA-OBS experiment, owing to the lack of ground observations in the high-altitude region. For Event B, the improvement in the DA-SAT experiment is greater than for Event A.

Impacts of Data Assimilation on the Initial Field.
From the above analysis, it is clear that the data assimilation experiments can improve the rainfall forecast. The purpose of data assimilation is to acquire accurate initial fields for numerical models. To reveal the impact of the data assimilation experiments on the initial fields, Event A is chosen as an example to analyze the difference in the initial fields between the data assimilation and control experiments. The geopotential height and moisture flux at 850 hPa are selected for analysis.
Figure 11 shows the geopotential height differences between the data assimilation and control experiments, with the CTRL experiment (Figure 11(a)) as the reference.

Conclusions
In this study, the 3DVar assimilation system is used to improve the forecasting of the spatial and temporal distribution of heavy rain on the northeastern Tibetan Plateau. A control experiment and three data assimilation experiments were designed. These experiments demonstrate that the rain forecast is significantly improved after data assimilation through enhanced accuracy of the initial field. Two heavy rainfall events with different evenness in space and time are used to examine the improvement in the rainfall forecast.
For the spatial distribution, the data assimilation experiments significantly changed the rainfall area and accumulation in the study area. For Event A, the data assimilation experiments provide larger rainfall areas and accumulation, while for Event B they provide smaller precipitation accumulation in the southeast of the study area. Rain-gauge data from 43 sites were used to evaluate the impact of data assimilation. The results show that the DA-BOTH experiment has the highest prediction accuracy, followed by the DA-SAT experiment and then the DA-OBS experiment, while the CTRL experiment has the lowest accuracy. For the temporal variation, the forecast results differ greatly between Event A and Event B. The satellite radiances have a greater positive effect for Event B than for Event A. For both events, the cumulative curve of the DA-BOTH experiment is closest to the curve of the measured values, while the cumulative curves of the DA-OBS experiment lie far below the curve of the measured values, owing to the lack of ground observations in the high-altitude region. In conclusion, for both the spatial and temporal distribution of rainfall, the satellite radiances have a greater effect than surface and upper-air meteorological observations in the high-altitude regions of the northeastern edge, and the improvement from assimilating satellite radiances together with surface and upper-air meteorological observations is greater than that from assimilating either satellite radiances or meteorological observations alone. Event A is chosen to evaluate the impact of data assimilation on the initial fields. It is found that better and more detailed information is added to the initial fields (e.g., geopotential height and moisture flux), especially in the northwest of Gansu Province and in Qinghai Province, where the precipitation forecasts are improved significantly.
It should be mentioned that many other assimilation techniques, such as 4DVar and EnKF, remain to be tested for this study area. These approaches have great potential, although they currently suffer from unaffordable computational costs. We also need to analyse the impact of different parameterization schemes on the WRF model after data assimilation, because the conclusions drawn in this study are subject to the specific parameterization schemes used here.

Appendix Statistic Calculation
The coefficient of determination (R²), root mean square error (RMSE), mean error (ME), and absolute error (AE) are used as the evaluation statistics. Here, S is the simulated value, O is the observed value, and N is the number of sites.
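Because the equations of this appendix did not survive in the text, the following is a sketch of how these statistics are commonly computed, not the authors' exact definitions. Two choices are assumptions: R² is taken as the squared Pearson correlation between simulated and observed values, and ME is taken as observed minus simulated, a sign convention inferred from the statement that positive ME corresponds to underestimated rainfall. The coefficient of variability used to characterize rainfall evenness is included in its usual form (population standard deviation divided by the mean). All function names are illustrative.

```python
import numpy as np


def r_squared(sim, obs):
    """Squared Pearson correlation (one common definition of R^2; assumed here)."""
    r = np.corrcoef(sim, obs)[0, 1]
    return float(r ** 2)


def rmse(sim, obs):
    """Root mean square error between simulated and observed values."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def mean_error(sim, obs):
    """Mean error as observed minus simulated (assumed sign: positive = underestimate)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.mean(obs - sim))


def absolute_error(sim, obs):
    """Element-wise absolute error, e.g. per time step of rainfall intensity."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return np.abs(sim - obs)


def coefficient_of_variability(x):
    """CV = sqrt(mean((x_i / mean(x) - 1)^2)); larger values mean less even rainfall."""
    x = np.asarray(x, float)
    return float(np.sqrt(np.mean((x / x.mean() - 1.0) ** 2)))


# Hypothetical gauge values (mm), for illustration only.
obs = np.array([2.0, 4.0, 6.0, 8.0])
sim = np.array([1.0, 3.0, 5.0, 7.0])
print(r_squared(sim, obs), rmse(sim, obs), mean_error(sim, obs))
print(absolute_error(sim, obs), coefficient_of_variability(obs))
```

In this made-up example the model is 1 mm too low at every gauge, so the correlation is perfect while RMSE and ME are both nonzero. This is a reminder that R² alone does not reveal a systematic bias.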

Figure 1: Domains for the ARW-WRF forecasts. The outer box is the coarse grid with a resolution of 12 km (d01); the inner box is the nested grid (d02) with a resolution of 4 km.

Figure 2: Location map showing the northeastern edge of the Tibetan Plateau and 44 observation stations.

Spatial Distribution of Precipitation.
In Figure 5, compared with the CTRL experiment (Figure 5(a)), the forecast precipitation area and accumulation increase noticeably, to a degree that depends on the data assimilated.

Figure 4: The time series bars and the cumulative curves of the observed precipitation for the two heavy rainfall events at Laohugou station: (a) Event A; (b) Event B.

Figure 5: The forecast precipitation for Event A from the different numerical experiments: (a) CTRL; (b) DA-SAT; (c) DA-OBS; (d) DA-BOTH.

Figure 7: Differences between rainfall values measured at the sites and those from the numerical experiments for Event A: (a) CTRL; (b) DA-SAT; (c) DA-OBS; (d) DA-BOTH.

Figure 8: Differences between rainfall values measured at the sites and those from the numerical experiments for Event B: (a) CTRL; (b) DA-SAT; (c) DA-OBS; (d) DA-BOTH.

Figure 9: The rainfall intensities and cumulative curves of the values measured at the sites and the simulated results from the four numerical experiments: (a) Event A; (b) Event B.
In Figure 10(a) for Event A, the AE of the numerical experiments grows with increasing rainfall intensity, while the values in the DA-BOTH and DA-SAT experiments are relatively small. This means that, for convective rainfall like Event A, the WRF model fails to capture the whole process of the event. However, in Figure 10(b) for Event B, with its even distribution in time, the AE of the numerical experiments is relatively constant, while the values for the CTRL and DA-OBS experiments are relatively large.

Figure 10: The AE of rainfall intensity between the values measured at the sites and the simulated results from the four numerical experiments: (a) Event A; (b) Event B.
Relative to Figure 11(a), there is no evident change of geopotential height in the DA-OBS experiment (Figure 11(b)) or the DA-SAT experiment (Figure 11(c)).

Figure 11(d) shows the geopotential height for the DA-BOTH experiment, which is much smaller than the values from the CTRL experiment. The region with the largest changes is mainly located in Qinghai Province and the northwest of Gansu Province, which is consistent with the improved forecasts of precipitation area and accumulation in the DA-BOTH experiment. As can be seen in Figure 12, after data assimilation the moisture flux changes noticeably in the study area in comparison with the control experiment (Figure 12(a)). The moisture flux of the DA-SAT experiment is markedly increased in the north of the study area (Figure 12(b)), which may bring more precipitation to the northeast of Qinghai Province. In Figure 12(c), there is a clear flow of moisture to the northwestern part of Gansu Province, which results in the heavy rain. The biggest change in moisture flux occurs in Figure 12(d) for the DA-BOTH experiment, which covers the largest area compared with the other numerical experiments and could give the greatest estimated precipitation on the northeastern Tibetan Plateau.

Table 1: Details of the observed data used in the assimilation experiments.

Table 2: Durations, maximum/mean rainfall accumulation, and spatial/temporal evenness of the heavy rain events.

Table 3: Differences between observation and numerical experiments.