Evaluating the Performance of a WRF Physics Ensemble in Simulating Rainfall over Lao PDR during Wet and Dry Seasons

Dynamical downscaling of General Circulation Model (GCM) data for any region has been made possible due to a set of physics options and model dynamics within the Weather Research and Forecasting (WRF) model. %is study evaluated the performance of an ensemble of physics options in simulating rainfall during wet and dry seasons of Lao PDR. %e model evaluation criteria focused on identifying the optimum physics options for a range of scenarios. No single combination of physics options performed well in all scenarios reflecting the importance of using different parameterizations according to the geographic location and the intended application of the results. For the dry season, none of the ensemble members performed satisfactorily for the southern region of Lao PDR, while all the ensemble members performed well for the northern and central regions. While almost all the WRF simulations overestimated the rainfall during the wet season, BMJ for cumulus physics performed better in the northern and central regions, and KF performed better in the south region.%e YSU scheme performed best as the planetary boundary layer for both wet and dry seasons, while WSM5 for the wet season and Lin for the dry season gave the best model performance as the microphysics option.


Introduction
e Weather Research and Forecasting (WRF) model is a mesoscale numerical weather prediction (NWP) and atmospheric simulation program created for both operational weather forecasting and research requirements. e span of its applications ranges from real-time NWP to data assimilation, parameterized physics research, regional climate simulation (downscaling), air quality modelling, atmosphere-ocean coupling, and idealized simulations [1]. e most widely used application of WRF among users globally is in downscaling coarser resolution General Circulation Model (GCM) outputs to finer resolution Regional Climate Models (RCM). NWP models for downscaling GCM data essentially involve calculating numerical solutions to the hydrodynamic equations that govern the atmospheric motions through different numerical methods. However, all these numerical methods involve some form of division of the model extent into 3D cells, which determines the level of detail and the computational load for the model host computer. is limits the ability of the differential equations that depict the atmospheric motions to describe only processes that occur in scales of more than twice the relevant grid size. However, subgrid scale processes that take place in the atmosphere such as cloud cover and microphysics directly affect the thermodynamic state of the atmosphere in larger spatial scales and need to be accounted for when downscaling. e technique that introduces these subgrid scale processes explicitly in the form of their statistical effect on the grid mean state is known as parameterization [2].
Selection of the best set of physics schemes for parameterization, which performs well, has become an increasingly difficult task with the availability of more schemes, which accounts for the various complex processes. Since testing large multiphysics ensembles over seasonal time scales is computationally intensive, many researchers opt to investigate event-based simulations spanning a couple of days or weeks to find out the optimal combination of physics options, while there is some focus on seasonal highlights in the form of the long-term mean [3]. In this study, a compromise of these two approaches was used, where an ensemble of 12 different combinations of parameterization schemes were tested for a period of one month each (with 30 days + 3 days spin-up time) over 2 seasons for Lao PDR for 2 years (2017 and 2018). e response to various processes in the atmosphere may change both spatially (with the geographic location) and temporally (with the change in time and seasons), which needs to be investigated in order to capture the real-world observations through model simulations. Previous studies have shown that the best interpretation of these subgrid scale processes in different geographical constraints across time has not been achieved by a single combination of physics schemes resulting in underestimation and overestimation of variables such as temperature and rainfall [3][4][5].
A study in northern Vietnam [6] using WRF to simulate heavy rainfall events suggested the importance of microphysics schemes in playing a major role in correct forecasting. is study involved an ensemble of microphysics, shortwave radiation, and boundary layer options. Another study conducted in ailand [7] tried to capture dry conditions, tropical cyclones, and monsoonal flow using reanalysis data. In this study, the model was run for 2 weeks for dry season simulation in March 1999. e simulation was able to capture major spatial precipitation patterns, although all the simulations overestimated the amount of precipitation.
Chotamonsak et al. [8] carried out a one-year simulation in 2005 for ailand with an ensemble of 8 members to evaluate the WRF model performance during the rainy, cool-dry, and hot-dry seasons. eir model output comparison for rainfall with 69 observation station datasets obtained correlation values of more than 0.8 for both the cool-dry and rainy seasons, while very low correlation values below 0.3 were observed for the hot-dry period.
According to the climate risk and adaptation reports published by the World Bank Group and World Meteorological Organization (WMO) [9], the baseline climate of Lao PDR can be divided into two distinct seasons: a dry season from mid-October to April and a rainy season from May to September dominated by the southwest monsoon, which brings a significant amount of precipitation to the country. ree climatic zones are recognized for Lao PDR: the northern mountainous region above 1000 m, the central region, and the tropical lowland plains and floodplains in the south [10].
In this study, both temporal (dry and wet seasons) and geographical (northern, central, and southern) regions of Lao PDR were assessed for each ensemble member performance to determine the most suitable physics schemes for each scenario. e identification of the patterns and period of seasonality (wet and dry seasons) for Lao PDR (not included in this paper) was carried out using meteorological station data for minimum temperature, maximum temperature, and daily rainfall data, which were then tested using statistical indicators (e.g., Autocorrelation Function (ACF), Standardized Precipitation Index (SPI), and Fast Fourier Transformation (FFT)). e regions were derived from the second level administrative boundaries of Lao PDR. e evaluation of the ensemble member results was done using the meteorological station data records from 19 stations across Lao PDR maintained by the Department of Meteorology and Hydrology (DMH) and the "Global Satellite Mapping of Precipitation" (GSMaP) data by the Japan Aerospace Exploration Agency (JAXA) Global Rainfall Watch which are produced and distributed by JAXA's Earth Observation Research Center. e results were then statistically evaluated using key indicators to obtain the best performing combinations of physics options for different scenarios.

Model Configuration
e Materials and Methods section should contain sufficient details so that all procedures can be repeated. It may be divided into headed subsections if several methods are described e physics ensemble was generated using the Advanced Research WRF (ARW) version 4.0 hosted at the National Center of Atmospheric Research (NCAR) along with the Coordinated Regional Climate Downscaling Experiment (CORDEX-WRF) version 1.3.
Each model simulation was a one-way nested run with domain 1 encompassing the South-East Asian region around Lao PDR with a resolution of 25 km grid size and domain 2 covering the whole of Lao PDR with a resolution of 5 km grid size ( Figure 1). e buffer length from domain 1 to domain 2 is about 525 km, which is 21 grids in the domain 1 scale from all 4 sides. e closest boundary of Lao PDR (southern tip of Champassak province) to the domain 2 boundary is about 20 km apart which is 4 grids in the domain 2 scale to give enough grid scale iterations to the model from the boundary conditions. e driving data (boundary condition and initial conditions) for the WRF model were derived from NCEP GDAS/FNL (final) global analysis data which are on 0.25degree by 0.25-degree grids. is dataset is derived from the Global Data Assimilation System (GDAS), which collects observational data from the Global Telecommunication System (GTS) and produces them with a delay of about one hour from NCEP Global Forecasting System (GFS) initiation. is delay allows more observational data to be incorporated during the initiation of FNL than in the GFS [11].
e simulations were carried out separately for wet and dry seasons with 33 days each (30 days + 3 days spin-up time) in a year. e wet season simulation spanned from 13 2 Advances in Meteorology July to 15 August and the dry season simulation spanned from 13 December to 15 January for the years 2017 and 2018.

Ensemble Design
e WRF model consists of several options for physics schemes allowing the users to set up the model with optimal performance for a range of temporal and spatial resolutions to cater to climatologically different geographies. In this study, three physics schemes, which proved to be most influential in model performance derived from a literature study of similar investigations in the region [4,[6][7][8], were tested. e schemes tested were the planetary boundary layer (PBL) scheme, cumulus (CU) scheme, and the microphysics (MP) scheme. e vertical thermodynamic and kinematic profiles of the atmosphere represented in mesoscale models depend on the correct representation of the turbulent mixing which occurs in the lower troposphere. Since turbulence is a subgrid scale process, the planetary boundary layer (PBL) parameterization allows this effect to be accounted during the simulations. e main classification of PBL parameterization options can be drawn according to their order of turbulence closure (integer and noninteger) and the depth over which their effect can be introduced (local and nonlocal). During this study, a first-order (integer) turbulent closure, nonlocal influenced scheme (YSU) [12], and a 1.5order (noninteger) closure, local influenced scheme (MYJ) [13], were used. e YSU PBL scheme was used along with the MM5 similarity theory surface layer [14,15], while the MYJ PBL scheme was used along with the Eta similarity scheme [16].
Cumulus parameterization predicts the effect of subgrid scale convection in the modelled atmosphere in terms of other grid scale variables. In doing so, a cumulus parameterization initially detects whether conventional rain exists (satisfying the trigger function) by analyzing grid scale variables and then introducing subsequent modifications to the grid scale variables so that it accommodates the effects of convection. Convection processes in general cover a range of about 10 km and, therefore, models that have a horizontal grid scale below 4 km are regarded as convection permitting (convection resolving/allowing) models and do not require an additional cumulus parameterization to resolve convection. Models with 5 km or more grid size, however, require cumulus parameterizations to represent the effects of deep convection [17] and therefore were used in this study. e main 2 types of cumulus parameterization schemes are those with deep layer control (related to the generation of Convective Available Potential Energy (CAPE) by largescale processes) and low level control (related to the convection generated to counter the Convective Inhibition (CIN)) schemes. In this study, a deep layer control type cumulus parameterization (BMJ) [13] and a low level control type cumulus parameterization (KF) [18] were used. e microphysics parameterization is one of the most challenging tasks in parameterization for NWP models due to the complexity introduced with the scale of the processes involved and the heavy computation it demands. For these reasons, bulk representation schemes of microphysics are mostly dominant in NWPs over bin representative schemes [19]. e bulk representation of microphysics in general involves the prediction of change to a specific moment (1 � mass, 2 � number, and 3 � radar reflectivity) of particle size distribution to a set of defined classes of hydrometeors (3 class, 5 class, 6 class, etc.) [20]. During this study, only single moment representing microphysics schemes was tested with 5 and 6 classes of hydrometeors (WSM5, WSM6, and Purdue Lin) [21][22][23] to account for the relatively smaller grid size. e short-and long-wave radiation parameterization aims at calculating the radiative flux, where the resultant is the sum of fluxes and vertical radiative flux density (RFD) and thus determines the temperature tendencies of the system. In this study, a combination of relatively simple schemes was used (rapid and accurate radiative transfer model (RRTM) for long wave and Dudhia for short wave) [24,25].
A total of 48 simulations were carried out in parallel for the complete task accounting for 1584 days simulated with different members of the physics ensemble (Table 1). e average time for one simulation using 32 cores was around 1.5 days (576 core hours).

Observation Data.
Two independent sets of data were used for the model evaluation: daily rainfall recorded in 19 meteorological stations maintained by the DMH and the gridded satellite data from GSMaP project. e meteorological station data for point locations were compared with the closest WRF model grid center point (grid to point) for 19 stations in Lao PDR ( Figure 2). e WRF model gridded data were also compared with GSMaP rain and gauge Advances in Meteorology calibrated gridded data to assess the spatial accuracy of the model outputs.

Evaluation Statistics.
e duration of wet and dry seasons was identified by analyzing 6 years of historical observation data, which agreed with the previous climate studies carried out in Lao PDR [10]. e wet season runs from May to September, while the dry season covers October to late April. e averages of daily rainfall results for all 19 stations from each WRF ensemble member were compared with the averages of daily observation records from DMH stations. In addition, the three major zones in Lao PDR (northern, central, and southern) were considered to evaluate the performance of the ensemble to varying dynamics and in different geographical regions. e following statistical metrics were used in this study to assess the suitability of the different physics schemes for a range of scenarios and applications. e bias (equation (1)) was calculated by computing the difference between the WRF simulated rainfall and the observed rainfall for each day and the mean bias was derived using the total days in the simulation period.
Mean absolute error (MAE) (equation (2)) was calculated to quantify the difference between the ensemble member result and the observation data while negating the effect of cancelling positive and negative errors seen in the bias.
Root mean square error/deviation (equation (3)) which is a measure of the combination of systematic error and random error was calculated for each ensemble member's results.
e Pearson correlation coefficient (R) (equation (4)) was calculated to measure the linear dependence between the WRF results and the observation data on a scale between −1 and +1. e value indicates the strength and direction of a linear relationship between WRF output and the observation. A value of 1 implies that a perfect linear equation describes the relationship between WRF and the observations, with all data points lying on a line for which the WRF values increase as the data values increase. e correlation is −1 in the case of a decreasing linear relationship and the values in between indicate the degree of linear relationship between the WRF model and the observations.
Slope/linear regression (equation (5)) goes one step beyond the correlation coefficient in identifying the linear relationship between the WRF simulated rainfall values and the DMH observed rainfall values. is statistic was used to derive the total model performance (TMP) which is explained below.
e Index of Agreement (IoA) (equation (6)) which is a dimensionless statistical measure of model performance [26] was used to compare the WRF model output values pairwise with observation values which were assumed to be reliable and close to reality. e values would range from 0 to 1 with Total model performance [27] (equation (7)) which is a combined index derived from MAE, RMSE, and slope was used to quantify the overall performance so that each ensemble member could be compared individually.
where x i is the daily average observed rainfall data, y i is the daily average rainfall data from WRF, x is the mean daily average observed rainfall data, and y is the mean daily average rainfall data from the WRF. e ability of the ensemble members to capture the spatial extent of rainfall was tested using an indicator derived from the Fractional Skill Score (FSS) method introduced by Roberts and Lean [28]. e original method proposed by Roberts and Lean aimed at defining an optimal grid scale resolution for capturing convective rainfall events using NWP. However, during this study, the FSS method was adjusted to determine the level of spatial accuracy for capturing the spatial extent of rainfall for the model's grid scale resolution only. First, the accumulated rainfalls from both WRF (I y ) and GSMaP (I x ) were transformed into binary fields using individual thresholds, where grids with accumulated rainfall more than 10% of the maximum value were set to 1 (equation (8)), while the rest were set to 0 (equation (9)).
Next, the mean square error (MSE) (equation (10)) was calculated for each ensemble member.
where N e and N s are the numbers of grids in the model domain and n is the ensemble member ID given in Table 1. e FSS is defined as follows: where MSE (n) perfect � 0 is the MSE of a perfect model output. In this study, the reference data derived from GSMaP were used to calculate MSE ref as follows: e FSS value for the best performing member will be closer to 1, while the worst performance will be closer to 0.

Temporal Analysis.
e daily average rainfall for the whole of Lao PDR derived from the WRF model was validated with the DMH observation data for both the wet and dry seasons of 2017 and 2018. e ensemble member giving the least variation for wet seasons was F, while E gave the least variation from DMH observation for the dry season (marked in red in Figure 3). While all the ensemble members overestimated the rainfall, two distinct groups of observation were evident with low and higher values of bias.
e simulations with the KF CU physics option showcased higher values of rainfall for both the wet and dry seasons. e relaxation time of the BMJ scheme, which has been largely calibrated with tropical systems, produced less rainfall during the simulations and therefore less bias compared to the KF scheme. e nonlocal closure scheme PBL option YSU, coupled with BMJ for CU physics, demonstrated the least variation from observation data (Figure 4). e isolated rainfall events during the dry season were also captured by all the ensemble members except for one event on 9 January 2018, which recorded rainfall of less than 10 mm. e ensemble mean was able to capture the rainfall closest to the DMH observations, which emphasizes the value of the ensemble products rather than one single configuration which either underestimates or overestimates it ( Figure 5). e WRF interpretations of high rainfall events using the KF scheme closely followed the DMH observation, while the relatively low precipitation days were captured well by the BMJ scheme suggesting the best use of cumulus schemes depending on the application (event-based simulations/long-term climate simulations).

Spatial Analysis.
e ability of the WRF ensemble to correctly map the spatial extent of rainfall over the wet season was assessed using the FSS of each member using the GSMaP rain and gauge corrected gridded data as the reference. Both the satellite rainfall data and the WRF downscaled rainfall showed similar rainfall extents in the central region of Lao PDR, where the terrain is relatively flat. However, the rainfall mapped by the GSMaP in the southern Lao PDR for both years was better captured by WRF model runs with the BMJ rather than the KF cumulus scheme (Figure 6). In these regions, the sudden change in relief in the form of the southwest facing slopes of the Annamese mountain range near the ailand border produces high rainfall due to the orographical intensification of the southwest monsoon [29]. However, the limitations of passive satellite-based rainfall data to provide less accurate information in complex terrains [30] should also be considered when used as a baseline for comparison.    Advances in Meteorology e adjusted FSS score was able to highlight the ability of each ensemble member to correctly map the extent of the accumulated rainfall within a period of 1 month for the wet season (Table 2). It was evident that the ensemble members with the BMJ CU scheme were able to capture the greatest number of grids, which received rainfall during the wet season simulation compared to the members with the KF CU scheme.

Zonal Analysis of Rainfall.
e daily average time series data of rainfall from 7 stations in the northern, 7 stations in the central, and 5 stations in the southern regions were compared with WRF model results to identify the variation of performance in different regions of Lao PDR (Table 3). e dry season did not show any significant variation among ensemble members and DMH observation for all the regions.
During the wet season, the performance of ensemble members showed greater variation. For the northern and central regions, the ensemble members that used BMJ demonstrated lower RMSE and hence lower bias compared to the members that used KF. e southern region was better captured by ensemble members that used the KF scheme and produced greater correlation and lower RMSE compared to members that used BMJ. e southern region of Lao PDR received the highest annual rainfall [31] dominated by the southwest monsoon and the KF scheme performed best in these conditions. No significant influence was observed with the coupled PBL and MP physics schemes for different regions of Lao PDR.

Sensitivity to Physical Parameterizations.
e sensitivity of each physics option to induce errors in downscaling results was analyzed by grouping all ensemble members together with a common physics option and comparing their respective range of bias using box and whisker plots (Figure 7). e analysis was carried out at the country level using the daily average rainfall results from the WRF model for the 19 station locations.
e KF scheme when used as the CU physics option tends to produce the highest bias compared to BMJ for both the wet and dry seasons in Lao PDR. e YSU scheme as the PBL option produced relatively low levels of bias for the wet season compared to MYJ and did not vary significantly during the dry season. e ensemble members with the WSM5 option gave the least bias during the wet season and the highest bias during the dry season. In contrast, the Lin option as the MP parameterization produced the least bias during the dry season and the highest bias during the wet season.

Identifying the Optimal Physics Scheme Combination.
During the wet season simulations, the ensemble members F, E, and A performed best for the northern (Figure 8(a)), central (Figure 8(c)), and southern regions (Figure 8(e)) of Lao PDR, respectively, considering the correlation coefficient, standard deviation, and RMSE. When considering the whole country, ensemble member E performed the best (Figure 9(a)       Advances in Meteorology 13

Conclusion
is study evaluated the performance of a WRF physics ensemble in reproducing the rainfall region-wise in wet and dry seasons of Lao PDR. e ensemble consisted of 12 members with 2 cumulus physics schemes (BMJ and KF), 2 planetary boundary layer schemes (YSU and MYJ), and 3 microphysics schemes (WSM5, WSM6, and Lin) along with Dudhia and RRTM as short-wave and long-wave radiation schemes.
Several significant observations can be made in relation to the variation of performance when analyzing each ensemble member's performance in both spatial and temporal domains. However, as suggested by previous studies of WRF parameterizations, no single combination of physics options performed best in all cases for each of the evaluation criteria. e results from different evaluation statistics suggested the best use of the physics option according to the intended application (short-term extreme rainfall events/long-term regular rainfall events) temporally and the focus of the study (region-wise average study/country-scale average study) in the spatial context.
During the wet season, while all the ensemble members overestimated the rainfall in general, the BMJ scheme as the CU physics option produced the least bias for regular longterm rainfall periods, while the KF scheme was able to better interpret the short-term extreme rainfall events. e isolated rainfall events in the dry season were also captured well by all ensemble members with the BMJ scheme performing with lower variation from observation than the KF scheme. e YSU PBL scheme was able to produce slightly better results than the MYJ PBL scheme when coupled with BMJ for both the wet and dry seasons in Lao PDR. e WSM5 microphysics option was able to produce the least bias during the wet season, revealing a clear preference during periods of regular rainfall events and the opposite during the dry season, where it demonstrated the highest bias. In contrast, the Lin microphysics option performed best during isolated rainfall events in the dry season with the least bias and produced the greatest bias during the wet season. In summary, the BMJ-YSU-WSM5 (ensemble F) combination performed better during the wet season and the BMJ-YSU-Lin (ensemble E) combination performed better during the dry season. In general, the BMJ scheme was also able to produce the spatial extent of long-term rainfall accumulation slightly better than KF, which is revealed by the adjusted FSS scores. e variation of region-wise evaluation statistics for independent ensemble members suggests the clear difference of dynamics governing the atmospheric condition within the country and the necessity of an ensemble to better predict the climate variables by producing a range of values, so that the reliability of the ensemble output is greater than a single member simulation. e most significant observation in region-wise evaluation of model performance was the behaviour of the southern region compared to the rest of the country. While BMJ performed better than KF for the rest of the country, the KF cumulus scheme performed best in the southern region.
Since the decision on simulation method, resolution, and time period depends on both resources and the intended application of the model output, the results of this study can provide the best possible parameterization schemes for multiple scenarios of domain configurations and model run periods.
Further extension of this study is ongoing with 3 components. First, the evaluation of different driving data 14 Advances in Meteorology for WRF such as ERA-interim and ERA5 reanalysis is being carried out to better establish the suitability of the FNL or otherwise in the regions of Lao PDR. e second step involves the use of optimum physics option combinations to downscale 30-year historical climate data for Lao PDR and to establish a bias correction for the region. e third step involves the downscaling of future climate scenarios with different GCM data at 5 km resolution to be used as input for the Agroecological Zonation (AEZ) of Lao PDR.

Data Availability
All the Python and NCL codes related to extracting data from WRF model outputs and comparison with observation data are available in the GitHub repositories https://github. com/Rajitha-Athukorala/wrf-post-NCL and https://github. com/Rajitha-Athukorala/wrf-post-processing. All the statistics and model output data used in this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.