Implementation of the WSM5 and WSM6 Single Moment Microphysics Scheme into the RAMS Model: Verification for the HyMeX-SOP1

This paper shows the results of the implementation of two widely used bulk microphysics parameterizations (BMP) into the Regional Atmospheric Modeling System (RAMS) to improve the Quantitative Precipitation Forecast (QPF). The schemes are the WSM5 and WSM6 (WRF-single-moment-microphysics classes 5 and 6). The RAMS is run at high horizontal resolution (4 km) over the whole Italian territory and, to mimic the operational context, it is initialized by the analysis/forecast cycle issued at 12 UTC by the European Centre for Medium-Range Weather Forecasts (ECMWF). The performance of the BMP is analysed for the period from September 11 to October 31, 2012, which spans most of the Special Observation Period 1 (SOP1) of the Hydrological cycle in the Mediterranean Experiment (HyMeX). For this period a database of daily precipitation from thousands of rain gauges over the Italian territory is available. A few hazardous events occurred over Italy during SOP1 and, for one of them, the model performance is shown in detail. Finally, the potential improvement gained by combining the model outputs with different BMP in a single forecast is explored.


Introduction
The Quantitative Precipitation Forecast (QPF) of numerical weather prediction models has continually improved over the last decades, thanks to increasingly sophisticated physical schemes, improved data assimilation, and postprocessing techniques.
Among the physical schemes, the microphysics parameterization plays an outstanding role for QPF [1]. The representation of cloud microstructure and processes in meteorological models can be done by sophisticated bin-resolving cloud models [2][3][4][5][6], which prognose multiple variables for specific intervals of the size spectrum of each hydrometeor species, or by simpler double and single moment schemes.
Bin-resolving cloud models are complex and very expensive in terms of computing time, which deters their use in the operational context, the target of this study. Double moment schemes are becoming more available [7][8][9][10] in meteorological models; however, their usage is still limited for the operational forecast because of their increased computational cost, caused by the prediction of the second moment, which, in most cases, is the number concentration. Therefore, single moment bulk microphysics schemes, which reduce the number of prognostic variables by assuming the hydrometeor size spectra, are typically used in the operational context.
The RAMS (Regional Atmospheric Modeling System) model is a fully compressible nonhydrostatic model developed at Colorado State University ([11,12], http://www.atmet.com/). Recently a 3D-Var [13] data assimilation system and the capability to simulate lightning [14] have been added to this model.
There are several options for the physical parameterization of the PBL and for the short- and long-wave radiation schemes [11,12]. However, for the single moment microphysics scheme (with the exception of pristine ice, for which a double moment scheme is always used) the only choice available is that reported in [15] (hereafter RM6).
The aim of this paper is to show the performance of two widely used bulk microphysics parameterizations (BMP) implemented into the RAMS to improve the QPF and to compare their performance with RM6.
Most of the BMP are based on the works [16,17], which have been a central feature of mesoscale and general circulation models. An important problem of [16] was the excessive production of cloud ice at cold temperatures. A revision of the ice microphysics processes was given in [18]. The most distinguishing features of the revised scheme were the following: (a) the ice nuclei number concentration is a function of temperature; (b) the ice crystal number concentration is a function of the ice amount.
The microphysics scheme [18] was coded into WRF in two different configurations: (a) WSM3 (WRF-single-moment-microphysics class 3), which predicts three categories of water substance, namely vapour, cloud water/ice, and rain/snow, and (b) WSM5 (WRF-single-moment-microphysics class 5), which considers vapour, cloud water, cloud ice, rain, and snow. The WSM5 allows supercooled water to exist and allows a gradual melting of snow falling below the melting layer, a process that occurs instantaneously in WSM3 when the snow falls below the 0 °C level.
The study [18] showed that the new microphysics scheme improved significantly the performance for high cloud ice amount, surface precipitation, and large scale mean temperature through a better representation of the ice cloud/radiation feedback.
The WSM5 was further tested in [19], where two heavy precipitation events over Korea as well as regional climate experiments were considered (2 months of accumulated precipitation for the period July 1 to August 31, 2002) to test the performance of the WSM5 in terms of long-range forecast, showing good results for QPF in all cases.
An extension of the WSM5 was proposed by [20], where graupel was added as a prognostic variable to the WSM5 scheme and the resulting scheme was referred to as WSM6. The WSM6 scheme was tested for an idealized 2D thunderstorm and for a 3D real case of heavy precipitation that occurred over Korea. The results of [20] show that the change in the number of predicted hydrometeors had a negligible impact on the QPF at coarse horizontal resolution (45 km), while it had a positive and nonnegligible impact, both on the surface precipitation amount and on the temporal evolution of the storm, at fine (5 km) horizontal grid resolution.
Although a decade old, WSM5 and WSM6 are still widely used BMP of the WRF model [1].
The performance of WSM5 and WSM6 is compared with that of RM6 [15]. This scheme uses a generalized gamma size spectrum, rather than a Marshall-Palmer one, considers ice-liquid mixed phase hydrometeors (graupel and hail), and introduces approximate solutions to the stochastic collection equation based on [21], which are used in place of the continuous accretion approximations of [18][19][20].
Despite the continuously improved models' performance in the last decades, QPF still remains one of the most difficult tasks in weather forecast, and any attempt to improve the QPF contributes to giving a better characterisation of the impact of the weather on the territory.
One way to improve the deterministic QPF is the multiphysics ensemble, which can be formed by combining forecasts issued by the same model with different physical parameterizations [22,23]. Recently, a multimodel superensemble [24] was applied for QPF in the Piedmont Region down to the hydrological discharge prediction for small- and medium-sized catchments, which are typical of the Mediterranean area, showing considerably good results.
In this paper, the performance of the unbiased ensemble is considered to improve the QPF over the Italian territory.
The HyMeX is an international experimental program that aims to advance knowledge of the water cycle variability in the Mediterranean Basin. The goal is pursued by monitoring, analysis, and modeling of the hydrological cycle on different spatial and temporal scales. In HyMeX special emphasis is given to the occurrence of heavy precipitation and floods, and to their societal impacts, which were the subjects of the Special Observation Period 1 (SOP1), held from September 5 to November 6, 2012. One of the products of the HyMeX-SOP1 is a database of daily precipitation collected by thousands of rain gauges over Mediterranean countries ([27], Figure 1 shows only the portion of the rain gauges available over Italy and surroundings). This database gives a unique opportunity to verify the QPF of the RAMS BMP.
The paper is divided as follows: Section 2 gives the details of the RAMS configuration and of the verification methods; Section 3 shows the results, while conclusions are given in Section 4.

Model Configuration, Observations, and Verification Methods
The RAMS configuration of this paper has one grid with 4 km horizontal resolution and covers the whole Italian territory (Figures 2(a)-2(c) show the model domain). The domain has 401 grid points in both N-S and W-E directions (Table 1). Thirty-six levels are used in the vertical. The levels below 1000 m are between 50 and 200 m thick, while the thickness of the layers above gradually increases up to 1000 m at about 8000 m of height; then the vertical grid spacing is held constant at 1000 m up to about 21800 m, which is the model top.
The Land Ecosystem-Atmosphere Feedback (LEAF) model is used to calculate the exchange between soil, vegetation, and atmosphere [28]. LEAF is a representation of surface features, including vegetation, soil, lakes and oceans, and snow cover and their influence on each other and on the atmosphere. Exchange terms among the LEAF surface features include turbulent exchange (heat and water) between the atmosphere and soil, heat conduction, and water diffusion and percolation in the snow cover and soil, long-wave and short-wave radiative transfer among the different LEAF components, transpiration, and precipitation.
A full-column, two-stream single-band radiation scheme is used to calculate short-wave and long-wave radiation [29]. The Chen and Cotton scheme accounts for condensate in the atmosphere, but not whether it is cloud water, rain, or ice.
RAMS parameterizes the unresolved transport using K-theory, in which the covariance is evaluated as the product of an eddy mixing coefficient and the gradient of the transported quantity. The turbulent mixing in the horizontal directions is parameterized following [30], which relates the mixing coefficients to the fluid strain rate and includes corrections for the influence of the Brunt-Väisälä frequency and the Richardson number [11].
Precipitation is assumed to be completely resolved by the microphysics scheme, and no convective parameterization for precipitation is adopted.
With the model configuration introduced above, 51 simulations from September 11 to October 31, 2012, were performed using each of the following microphysics schemes: the RM6 ([15], hereafter this configuration is referred to as R 6), the WSM5 ([19], hereafter RW5), and the WSM6 ([20], hereafter RW6). Initial and dynamic boundary conditions are given for all simulations by the ECMWF operational analysis/forecast cycle issued at 12 UTC, to put emphasis on the forecasting performance for QPF of the three BMP. Each RAMS simulation lasts 36 h and starts at 12 UTC on each day; this setting allows for a 12 h spin-up time and for a 24 h forecast. The verification of the precipitation is done on a daily basis, so the model accumulated daily precipitation is compared with the corresponding observation.
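The run schedule described above can be sketched in a few lines of Python. This is only an illustration: the text gives the run length (36 h), the spin-up (12 h), and the 12 UTC start, while the alignment of the verified 24 h with the calendar day (00 UTC to 00 UTC) and the function name `forecast_windows` are assumptions made here.

```python
from datetime import datetime, timedelta

SPIN_UP_H = 12     # hours discarded as spin-up (from the text)
RUN_LENGTH_H = 36  # total length of each simulation (from the text)


def forecast_windows(first_day, last_day):
    """Yield (init_time, verif_start, verif_end) for each verified day.

    Assumption: the run verifying calendar day D starts at 12 UTC on D-1,
    so that the 24 h after spin-up cover 00 UTC of D to 00 UTC of D+1.
    """
    day = first_day
    while day <= last_day:
        init = day - timedelta(hours=SPIN_UP_H)
        verif_start = init + timedelta(hours=SPIN_UP_H)
        verif_end = init + timedelta(hours=RUN_LENGTH_H)
        yield init, verif_start, verif_end
        day += timedelta(days=1)


windows = list(forecast_windows(datetime(2012, 9, 11), datetime(2012, 10, 31)))
print(len(windows))  # number of verified days in the period
```

Under these assumptions the loop yields one 24 h verification window per day of the period, each preceded by its 12 h spin-up.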
It is noted that, with the exception of the microphysics schemes, all settings of the R 6, RW5, and RW6 are identical, so the differences of the results are caused only by the different microphysics schemes.
As already stated, for the HyMeX-SOP1, a database of daily precipitation ( [27], http://mistrals.sedoo.fr/?editDatsId= 904&datsId=904&project name=MISTR) was created for thousands of rain gauges located in several countries of the Mediterranean Basin. Because the aim of this study is to verify the RAMS performance for QPF at high horizontal resolution, only a portion of the Mediterranean Basin, that is, the Italian area (IA, Figure 1), was selected for computational reasons.
The IA was selected by performing the forecast verification over the area 6°E-19°E and 36°N-47°N, which includes the Italian territory and Corsica. The number of rain gauges in the IA is 3212. However, not all rain gauges were available on each day, their number varying from 2328 (September 14) to 113 (October 8). For the whole period considered in this paper, the number of observed-forecast daily rainfall pairs is 76131 and the average number of rain gauges available for each day is 1492 (see also Table 2). The high number of daily rainfall data available for the HyMeX-SOP1 gives a unique opportunity to verify the model performance over a rather wide territory at high horizontal resolution.
In this paper, rain gauge (tipping bucket) errors are considered negligible compared to model errors; nevertheless, several sources of error affect these instruments. Wetting errors are caused by rainwater adhering to the inner wall of the receptacle, which does not reach the tipping bucket. Evaporation errors are caused by water remaining in the tipping bucket and subsequently evaporating. If a heater is attached to a tipping bucket rain gauge to melt snow, weak precipitation or snowfall may evaporate, thereby causing errors. About 300 of the rain gauges used in this paper, located over the Apennines and Alps, are heated to melt snow.
However, among the errors of the tipping bucket rain gauges, the most important is caused by the wind, which reduces the amount of rain/snow collected by the receptacle [31].This error is particularly important for small precipitation rates and for snow.
Snow, however, was not an issue for this work because (a) the vast majority of the rain gauges are below 1000 m (87% of the precipitation data are for stations below 1000 m and 95% are for stations below 1500 m); (b) the period considered in this work is at the end of summer and start of fall, when temperatures are mild even at elevations above 1000 m [32]. To better quantify this point the daily averaged temperature was computed for all the days and stations reporting some precipitation. This average was computed using the model output (in this paper three different model configurations are used but, for this issue, no particular differences were found among these configurations), which was interpolated to the station position taking into account the local temperature gradient and the difference between the model and station elevations as in [33]. It was found that 1% of the precipitation was reported for daily temperatures less than 0 °C, showing the negligible impact of the snow for this study.
Another important point is the data quality control, which is a two-step process. A first quality control is made on the hourly database, which is used to build the daily database, and consists of discarding unrealistically high rainfall rates or data transmitted with errors (e.g., duplicate data). If the rainfall for a specific hour is discarded, the daily data is flagged as missing (this quality control is performed within HyMeX, not by the author). The second quality control compares the daily values with nearby stations, using a methodology similar to [34]. Rainfall showing considerable departure from nearby stations is discarded. The departure is evaluated by comparing the precipitation standard deviation for the whole period for the considered station with the difference between the actual precipitation and a pseudoprecipitation computed at the station from the nearby stations [34]. This control discards isolated thunderstorms, which are very difficult to predict at 4 km horizontal resolution, and resulted in the rejection of about 1% of the available data. All the stations of the Italian territory, which are by far the largest part of the stations used in this paper, are certified and maintained according to the WMO standard.
For the verification of the QPF, the differences between the model and observed precipitation at the four grid points surrounding each station are computed. Then the model grid point with the smallest difference from the observation is taken as the predicted value. In this way, a shift in the precipitation field of up to 4√2 ≈ 5.7 km is considered negligible [35].
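The selection among the four surrounding grid points can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name `best_of_four`, the `[j][i]` indexing of the field, and the use of fractional grid coordinates for the station position are assumptions, and no bounds checking is done for stations on the domain edge.

```python
import math


def best_of_four(forecast, gx, gy, observed):
    """Return the forecast value, among the four grid points surrounding
    a station, that is closest to the observed precipitation.

    forecast : 2D list of daily precipitation, indexed [j][i]
    gx, gy   : station position in fractional grid units (hypothetical
               convention; 1 unit = one 4 km grid spacing)
    observed : observed daily precipitation at the station
    """
    i0, j0 = math.floor(gx), math.floor(gy)
    corners = [(j0, i0), (j0, i0 + 1), (j0 + 1, i0), (j0 + 1, i0 + 1)]
    values = [forecast[j][i] for j, i in corners]
    # Keep the corner with the smallest absolute difference from the gauge.
    return min(values, key=lambda v: abs(v - observed))


field = [[0.0, 2.0, 9.0],
         [1.0, 12.0, 30.0],
         [4.0, 18.0, 45.0]]
picked = best_of_four(field, gx=1.3, gy=0.6, observed=10.0)
print(picked)  # 9.0, the corner closest to the 10.0 mm observation
```

Because the best of four values is kept, this matching is deliberately tolerant of small displacement errors; scores computed this way are not directly comparable with those from a plain nearest-neighbour or bilinear matching.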
Precipitation observations from two or more stations falling in the same grid cell of the model are grouped into a single superobservation, whose precipitation is given by the average of the precipitation recorded by the stations inside the grid cell.
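A possible way to build the superobservations is sketched below. The text does not specify how a grid cell is delimited, so assigning each station to a cell by integer division of projected coordinates is an assumption, as are the function name `superobservations` and the input layout.

```python
from collections import defaultdict


def superobservations(stations, dx_km=4.0):
    """Average the stations that fall in the same model grid cell.

    stations : iterable of (x_km, y_km, precip_mm), with x and y measured
               from the south-west corner of the grid (hypothetical
               projected coordinates)
    Returns a dict {(i, j): mean precipitation in the cell}.
    """
    cells = defaultdict(list)
    for x_km, y_km, precip in stations:
        cell = (int(x_km // dx_km), int(y_km // dx_km))
        cells[cell].append(precip)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


obs = [(1.0, 1.0, 10.0), (3.5, 2.0, 20.0), (9.0, 1.0, 5.0)]
merged = superobservations(obs)
print(merged)  # {(0, 0): 15.0, (2, 0): 5.0}
```

The first two stations share cell (0, 0) and are replaced by their mean, while the third station remains a single observation.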
Statistical verification is performed in three ways. The first method uses 2 × 2 contingency tables for different precipitation thresholds, namely, 1, 5, 10, 20, 40, and 60 mm/day, where 60 mm/day is considered as a threshold for severe precipitation events in the Mediterranean Basin [36,37]. From the hits ($a$), false alarms ($b$), misses ($c$), and correct no forecasts ($d$) of the contingency tables, the probability of detection (POD; range [0, 1], where 1 is the perfect score, i.e., when no misses occur), false alarm ratio (FAR; range [0, 1], where 0 is the perfect score), the bias (range [0, +∞), where 1 is the perfect score), and the equitable threat score (ETS; range [−1/3, 1], where 1 is the perfect score and 0 is a useless forecast) are computed [23]:
$$\mathrm{POD} = \frac{a}{a+c}, \qquad \mathrm{FAR} = \frac{b}{a+b}, \qquad \mathrm{bias} = \frac{a+b}{a+c}, \qquad \mathrm{ETS} = \frac{a-a_r}{a+b+c-a_r}, \qquad a_r = \frac{(a+b)(a+c)}{a+b+c+d},$$
where $a_r$ is the number of hits expected by chance [23].
The bias tells us the fraction of yes forecast with respect to the yes events.The POD gives the fraction of the observed yes events that were correctly forecast.The FAR gives the fraction of yes forecasts that did not occur.The ETS measures the fraction of observed and/or forecast events that were correctly predicted and adjusted for hits associated with a random forecast (i.e., where forecast occurrence/nonoccurrence is independent of observation/nonobservation).
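The dichotomous scores can be illustrated with a short Python sketch. The letters a, b, c, and d are used here for hits, false alarms, misses, and correct no forecasts, following the usual convention; the function names and the choice of counting an event when the value is greater than or equal to the threshold are assumptions of this example.

```python
def contingency_table(forecasts, observations, threshold):
    """Count hits (a), false alarms (b), misses (c), correct negatives (d).

    Assumption: an event occurs when precipitation >= threshold.
    """
    a = b = c = d = 0
    for f, o in zip(forecasts, observations):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            a += 1
        elif f_yes:
            b += 1
        elif o_yes:
            c += 1
        else:
            d += 1
    return a, b, c, d


def contingency_scores(a, b, c, d):
    """POD, FAR, bias, and ETS from a 2 x 2 contingency table.

    Note: raises ZeroDivisionError if no event is observed or forecast.
    """
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n  # hits expected by chance
    return {
        "POD": a / (a + c),
        "FAR": b / (a + b),
        "bias": (a + b) / (a + c),
        "ETS": (a - a_r) / (a + b + c - a_r),
    }


fcst = [0.0, 12.0, 3.0, 25.0, 7.0, 0.5]  # mm/day, made-up values
obs = [0.2, 15.0, 8.0, 2.0, 6.0, 0.0]    # mm/day, made-up values
counts = contingency_table(fcst, obs, threshold=5.0)
scores = contingency_scores(*counts)
print(counts, scores)
```

For these six made-up pairs the table is (2, 1, 1, 2): the bias is 1 even though one event is missed and one is falsely forecast, which is why the bias is always read together with POD, FAR, and ETS.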
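The second verification method uses the QB and the mean absolute error (MAE), computed for different ranges of observed precipitation. The QB and MAE are given by
$$\mathrm{QB} = \frac{1}{N}\sum_{i=1}^{N}\left(f_i - o_i\right), \qquad \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|f_i - o_i\right|,$$
where $N$ is the total number of observation ($o_i$) and forecast ($f_i$) pairs for each observed precipitation range.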
The third verification method considers multicategorical contingency scores starting from a multicategory contingency table of forecasts and observations (details on the categories are given in Section 3.2). From this table the accuracy (AC) and the Kuipers skill score (KSS) are computed.
The AC gives the fraction of the forecasts in the correct category and is given by
$$\mathrm{AC} = \frac{1}{N}\sum_{i=1}^{K} n(F_i, O_i),$$
where $n(F_i, O_j)$ denotes the number of forecasts in the $i$th category that had observation in the $j$th category, $N$ is the total number of occurrences, and $K$ is the number of precipitation classes considered. The range of the AC is the interval [0, 1] and the perfect score is 1. The use of the AC, however, can be misleading because it is heavily dominated by the most populated category.
To avoid this problem, the Kuipers skill score (KSS) is computed. The KSS quantifies the accuracy of the forecast in predicting the correct category, relative to that of random chance. It is defined as
$$\mathrm{KSS} = \frac{\dfrac{1}{N}\sum_{i=1}^{K} n(F_i, O_i) - \dfrac{1}{N^2}\sum_{i=1}^{K} N(F_i)\,N(O_i)}{1 - \dfrac{1}{N^2}\sum_{i=1}^{K}\left(N(O_i)\right)^2},$$
where $N(F_i)$ denotes the total number of forecasts in the $i$th category and $N(O_j)$ denotes the number of observations in the $j$th category, while $n(F_i, O_j)$ and $N$ are as for AC. The KSS ranges in the interval [−1, 1] and 0 indicates no skill. The KSS possesses two advantages: (a) random and constant forecasts receive a zero score; (b) the contribution made to the Kuipers skill score by a correct "no" or "yes" forecast increases as the event is more or less likely, respectively.
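The two multicategory scores can be sketched as below, assuming the standard multicategory form of the Kuipers (Hanssen and Kuipers) score. The table layout (rows are forecast categories, columns are observed categories), the function name, and the numbers are illustrative assumptions.

```python
def accuracy_and_kss(table):
    """Accuracy and Kuipers skill score from a K x K contingency table.

    table[i][j] is the number of forecasts in category i whose
    observation fell in category j (layout assumed for this sketch).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    n_fcst = [sum(table[i]) for i in range(k)]                      # N(F_i)
    n_obs = [sum(table[i][j] for i in range(k)) for j in range(k)]  # N(O_j)
    ac = sum(table[i][i] for i in range(k)) / n
    chance = sum(n_fcst[i] * n_obs[i] for i in range(k)) / n ** 2
    kss = (ac - chance) / (1.0 - sum(x * x for x in n_obs) / n ** 2)
    return ac, kss


# Made-up three-category table (e.g., light, moderate, heavy rain).
skilful = [[50, 10, 0],
           [10, 15, 5],
           [0, 5, 5]]
# Constant forecast: always the most populated category.
constant = [[60, 30, 10],
            [0, 0, 0],
            [0, 0, 0]]
ac_s, kss_s = accuracy_and_kss(skilful)
ac_c, kss_c = accuracy_and_kss(constant)
print(ac_s, kss_s, ac_c, kss_c)
```

The constant forecast still reaches an accuracy of 0.6 because the first category dominates the sample, but its KSS is zero, which illustrates why the KSS is preferred when the categories are unevenly populated.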

Results for the HyMeX-SOP1
3.1. The October 15, 2012, Case Study. In this section the performance of the RAMS model with the three microphysics schemes is considered for the case study of October 15, 2012. This case study is selected because it was already considered in past studies [14,39] and it is well documented. Heavy precipitation was recorded on October 15 in several parts of Italy, and this case study shows the performance of R 6, RW5, and RW6 for a case of severe weather.
The October 15, 2012, case study was characterised by frontal precipitation over northern and central Italy. The precipitation was determined by a synoptic scale system with a wide upper level trough extending from northern Europe to Spain. The interaction between the trough and the western Alps formed a V-shaped secondary trough, which moved to the East and, by interaction with the orography of the Alps, generated a cyclone over the Gulf of Genoa (see Figure 8 of [39]).
This cyclone moved rapidly toward the Italian peninsula and the associated frontal system forced the development of deep convection, especially in northern and central Italy.
Figure 2 shows the simulated and observed precipitation on October 15. From observations (Figure 2(d)) it is apparent that precipitation affected several parts of Italy and Corsica. More specifically, precipitation larger than 60 mm occurred over a wide area south of the Alps and over Liguria and Tuscany. Other precipitation spells occurred over western Sardinia and Corsica, over central Italy (south of Rome, between 40 and 42°N), and over western Sicily. In all these areas there were stations reporting more than 60 mm for the whole day.
Considering the precipitation fields simulated by R 6, RW5, and RW6, it is noted that all models correctly forecast several features of the observed field. The main precipitation area south of the Alps is well forecast, as is the huge precipitation spell over Liguria and Tuscany. The rainfall south of Rome (between 40 and 42°N) is correctly forecast by all simulations, while the precipitation band between the Italian peninsula and western Sicily is forecast too far to the northwest, causing a notable underestimation of the precipitation over Sicily.
Despite several similarities among the precipitation fields of the BMP, the precipitation for R 6 is lower than those of RW5 and RW6. For example, over the southern Alps, the area with precipitation larger than 90 mm is much larger for RW5 and RW6, which overestimate the observed rainfall, compared to R 6. The lower precipitation forecast by R 6 compared to the other BMP is apparent also over central Italy. In this area RW5 and RW6 show precipitation bands between 42 and 44°N, which are less evident in R 6. Data are missing in the database used in this paper where these precipitation bands occurred; nevertheless, the bands were observed and caused the 35 mm precipitation recorded in central Rome (see Figure 9 of [39]). A similar underestimation of the accumulated rainfall by R 6 occurs over the western side of Corsica, while all RAMS configurations underestimate the precipitation over Sardinia.
Despite the lower rainfall simulated by R 6, it is noted that the precipitation for the lowest threshold (0.2 mm) covers a larger area in R 6 compared to RW5/RW6.
To investigate in more detail the differences among the rainfall simulated by R 6, RW5, and RW6, Figure 3 shows the hydrometeor mixing ratios averaged over the whole domain (only grid cells with hydrometeor mixing ratios larger than 10⁻³ g/kg are considered for this average). The ice mixing ratio is given by the sum of the snow and cloud ice mixing ratios for RW5 and RW6, while it is given by the sum of snow, pristine ice, and aggregates for R 6.
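The thresholded domain average used for Figure 3 can be sketched as follows. For brevity the example works on a 2D array of values, whereas the average in the text is taken over the whole model domain; the function name and the sample numbers are illustrative assumptions.

```python
def thresholded_mean(mixing_ratio, threshold=1e-3):
    """Mean mixing ratio (g/kg) over the cells exceeding the threshold.

    mixing_ratio : nested list of values in g/kg (2D here for brevity)
    Returns 0.0 when no cell exceeds the threshold.
    """
    values = [q for row in mixing_ratio for q in row if q > threshold]
    return sum(values) / len(values) if values else 0.0


rain = [[0.0, 0.2],
        [0.0005, 0.1]]  # g/kg, made-up values
mean_rain = thresholded_mean(rain)
print(mean_rain)  # about 0.15: the two cells below 1e-3 g/kg are ignored
```

Averaging only over cells that contain the hydrometeor means that the curves describe the typical in-cloud amount rather than the fractional coverage, so a scheme can show a larger mean while covering a smaller area.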
The domain-averaged cloud water mixing ratio (Figure 3(a)) varies between 0.10 g/kg and 0.16 g/kg for RW5 and RW6, while it varies between 0.16 g/kg and 0.21 g/kg for R 6. It is noted that the cloud water mixing ratio decreases during the central hours of the day. This decrease is caused by the dissipation of extended fog layers that form during the night and persist for part of the day over several countries of the simulated domain.
The situation for the rain mixing ratio is shown in Figure 3(b): its average over the whole domain increases during the day from 0.04 g/kg to 0.08 g/kg for RW5 and RW6, while it is roughly halved for R 6.
The differences found for the cloud water and rain mixing ratios for RW5/RW6 and R 6 suggest a slower autoconversion between cloud water and rain for the R 6 microphysics scheme, confirming the findings of [40,41] and showing a key mechanism for the smaller precipitation simulated by R 6 compared to the other BMP. Figure 3(c) shows the domain-averaged graupel mixing ratio simulated by R 6 and RW6 (RW5 does not consider graupel). The RW6 mixing ratio is about twice that of R 6, showing a larger conversion from other categories to graupel. This is in part confirmed by the lower ice mixing ratio for RW6 compared to R 6 (Figure 3(d)). It is also noted that RW5 has the largest ice concentration among all microphysics schemes. This is because, in RW5, the ice also accounts for the graupel category.
To further explore the difference among the hydrometeors simulated by the BMP, Figure 4 shows the mixing ratios of the hydrometeors of R 6 (Figures 4(a) and 4(b)), RW5 (Figures 4(d) and 4(e)), and RW6 (Figures 4(g) and 4(h)) for the convective and stratiform regions. The methodology to separate the convective and stratiform regions is adapted from [42], which partitions each vertical column containing clouds into convective and stratiform according to the following criteria: (a) a model grid point is classified as convective if it has a rain rate twice the average over the four surrounding grid points, that is, one on either side of the considered grid point; (b) any grid point with a precipitation rate larger than 10 mm/h is classified as convective; (c) grid points not classified as convective are identified as stratiform. In the original method, point (b) assumes a rain rate of 20 mm/h to classify a grid point as convective. This threshold was subjectively lowered to 10 mm/h to take into account the differences between the tropics, where the methodology of [42] was applied, and mid-latitude convection.
Following [42] the stratiform region is further checked and identified as convective when one of the following two conditions is met: (a) for raining grid points the cloud water below the melting layer is greater than 0.5 g/kg or the maximum updraft is larger than 5 m/s; (b) for nonraining grid points the cloud water is larger than 0.025 g/kg or the maximum updraft is larger than 5 m/s.
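The partition rules listed above can be condensed into a single function for one model column. This is a sketch under stated assumptions rather than the code used in the paper: "twice the average" is read as "at least twice", a point is taken as raining when its rain rate is positive, and the function and argument names are hypothetical.

```python
def classify_column(rain_rate, neighbour_rates, cloud_water_max, w_max):
    """Classify one cloudy column as 'convective' or 'stratiform'.

    rain_rate       : surface rain rate at the grid point (mm/h)
    neighbour_rates : rain rates at the four surrounding points (mm/h)
    cloud_water_max : cloud water mixing ratio (g/kg); for raining points
                      the value below the melting layer
    w_max           : maximum updraft in the column (m/s)
    """
    background = sum(neighbour_rates) / len(neighbour_rates)
    # Criterion (b): intense rain (threshold lowered from 20 to 10 mm/h).
    if rain_rate > 10.0:
        return "convective"
    # Criterion (a): rain rate at least twice the surrounding average.
    if rain_rate > 0.0 and rain_rate >= 2.0 * background:
        return "convective"
    # Further check of the stratiform points: cloud water and updraft.
    raining = rain_rate > 0.0
    cloud_water_limit = 0.5 if raining else 0.025
    if cloud_water_max > cloud_water_limit or w_max > 5.0:
        return "convective"
    return "stratiform"


print(classify_column(12.0, [11.0, 12.0, 13.0, 12.0], 0.1, 0.5))  # convective
print(classify_column(2.0, [2.0, 2.0, 2.0, 2.0], 0.1, 0.3))       # stratiform
print(classify_column(0.0, [0.0, 0.0, 0.0, 0.0], 0.03, 0.1))      # convective
```

The last call shows the role of the second check: a nonraining column is still treated as convective when its cloud water exceeds 0.025 g/kg.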
The methodology is able to identify and separate the convective and stratiform regions. For example, the vertical velocity (not shown) for the convective region is of the order of 20-30 cm/s for the lowest 3 km of the troposphere, while it is less than 10 cm/s and, for most layers, lower than 5 cm/s for the stratiform region for all BMP.
Figures 4(a) and 4(b) show the mixing ratios of different hydrometeors for the convective and stratiform regions for R 6. In Figure 4(a) the mixing ratio below 2 km is dominated by water (cloud water and rain), with cloud water having a larger mixing ratio than rain. Aggregates, which give a contribution below 2 km and may fall, are defined as ice particles that are formed by collision and coalescence of pristine ice, snow, and other aggregates [15]. They may retain a moderate amount of riming before being reclassified. Aggregates are the most abundant hydrometeor in R 6 between 2 and 5 km height, while above 5 km the pristine ice has the largest mixing ratio.
For the stratiform region, ice microphysics plays the major role in R 6. The most abundant hydrometeor below 5 km is the aggregate, while the pristine ice is the most abundant hydrometeor above 5 km.
Figures 4(a) and 4(b) confirm, indirectly, the ability of the scheme of [42] to divide the convective and stratiform regions, as their microphysics is substantially different: the water microphysics plays a much larger role in the convective region compared to the stratiform region [20,42,43]. In general, the ice integrated water path (IWP) is larger than the liquid integrated water path (LWP) for both convective and stratiform regions; nevertheless, the ratio of IWP to LWP is about 2 for the convective region, while it is 8 for the stratiform region, showing the much larger contribution of water microphysics in convective clouds compared to stratiform ones.
Figure 4(d) shows the mixing ratios of RW5 hydrometeors for the convective region. Below 2 km the microphysics is dominated by the water categories (cloud water and rain), with the largest contribution from rain (about 0.4 g/kg). Above 2 km the ice microphysics dominates and the most abundant hydrometeor is snow. In the stratiform region (Figure 4(e)), the contribution of snow is by far the largest, even if near the ground rain is still the dominant hydrometeor.
For RW5, the ratio between IWP and LWP is about 2 for the convective region and 6 for the stratiform region, showing the larger contribution of the water microphysics in convective clouds. Also, the contribution of the water microphysics in the stratiform region is more important for RW5 compared to R 6.
The comparison of Figures 4(a)-4(b) and 4(d)-4(e) shows the different behaviour of the two BMP. Three points are of particular interest: (a) the aggregates are very important in the ice microphysics of R 6, especially below 5 km (the importance of the aggregates was also highlighted in [15]); (b) in the convective region, below 2 km, rain is the most abundant hydrometeor in RW5, while cloud water is larger than rain for R 6; (c) the cloud water mixing ratio in RW5 is lower than in R 6 for both convective and stratiform regions, while the opposite occurs for rain. This shows that the conversion of cloud water to rain is larger in RW5 compared to R 6.
The differences between the R 6 and RW5 microphysics characteristics have an important impact on the surface accumulated precipitation. Figures 4(c) and 4(f) show the accumulated precipitation for different hydrometeors for R 6 and RW5. As expected, rain is the most important contributor to the total precipitation for both BMP. Contributions of aggregates and graupel (R 6) and the contribution of snow (RW5) refer to the precipitation over the Alps. The snow contribution in RW5 (105 m) accounts for both graupel and aggregates in R 6. However, the larger rain mixing ratio of RW5 at the lower levels for both stratiform and convective regions determines a much larger rainfall accumulated over the whole domain for RW5 (630 m) compared to R 6 (405 m).
Figure 4(g) shows the vertical profile of the averaged hydrometeors for the convective region of RW6. Below 2 km the most abundant hydrometeor is rain, followed by cloud water and graupel. Between 2 and 3 km graupel is the most abundant hydrometeor, followed by cloud water and rain. Above 3 km snow has the largest mixing ratio, while graupel decreases with height, with negligible values above 6 km. The ice mixing ratio has a maximum at 6.5 km (about 0.1 g/kg) and is the second most abundant hydrometeor above 5 km.
Comparing Figures 4(d) and 4(g), the impact of graupel on the mixing ratios is apparent. The snow is reduced at all levels in RW6 and its mixing ratio is replaced by graupel but also by cloud water, especially between 2 and 3 km. Graupel also accounts for the lower mixing ratio of the rain above 2 km, while the rain mixing ratio is similar for RW5 and RW6 at the lowest level (0.38 g/kg). The difference of the mixing ratios of the hydrometeors between RW5 and RW6 shows the important role of graupel in redistributing the hydrometeors, as is apparent from the many microphysics processes involving graupel in RW6 (see [20] for a detailed description). The role of graupel is less evident for the stratiform region, where the main consequence is the reduction of the snow mixing ratio below 4 km, which is converted to graupel.
The differences between the mixing ratios of RW6 and R 6 are similar to those discussed in the comparison between RW5 and R 6.
Figure 4(i) shows the accumulated precipitation for the different hydrometeors for the RW6 scheme.The most abundant precipitation is given by rain, whose accumulated value over the whole domain is similar to RW5 (630 m), followed by graupel (95 m) and snow (45 m).The sum of graupel and snow accumulated precipitation is larger than snow in RW5 (130 mm), similarly to the findings of [20], but the difference is small.
It is noted that the vertical profiles of the mixing ratios for the convective and stratiform regions were also examined for the whole period of this paper (September 11 to October 31, not shown). For this application, the methodology to separate the convective and stratiform regions was applied to each forecast day, and then the mixing ratios of the different days were averaged for the convective and stratiform regions, respectively. This analysis gave results similar to those of the October 15, 2012, case study, showing that the physical differences among the BMP outlined above are not specific to that case but are general properties of the BMP.
The main difference was found in the lower layers (below 1-2 km) for the aggregate mixing ratio for R 6, for the snow mixing ratio for RW5, and for the graupel and snow mixing ratios for RW6. These mixing ratios were found to be considerably lower for the whole period compared to those of Figure 4. This is expected because the contribution of snow and mixed phase hydrometeors to the precipitation is small for the whole period, as discussed in Section 2, so their concentration at low levels is expected to be small. In this respect, the October 15 case study is an exception, with snowfall and mixed phase precipitation over the Alps.
Overall, this case study shows that the precipitation of RW5 and RW6 is larger than that of R 6 through the key mechanism of the larger conversion between cloud water and rain; however, the precipitation area for the lowest precipitation amount is larger for R 6 compared to RW5 and RW6.These are general behaviours of the different BMP as will be shown in the next section.

Objective and Quantitative Scores for the HyMeX-SOP1.
This section shows the scores of the different RAMS microphysics schemes for the 51 days of the HyMeX-SOP1 from September 11 to October 31, 2012.
Figure 5(a) shows the bias for the three BMP. With the exception of the 60 mm/day threshold for RW5 and RW6, all RAMS configurations have a bias lower than one, showing an underestimation of the precipitation area. For RM6 the underestimation is larger compared to RW5 and RW6. In more detail, the bias of RM6 decreases from 0.7 (1 mm/day) to 0.4 (60 mm/day). This result is in agreement with the results of [33], which investigated the RM6 performance for a different period (one year, 2013) and for a different geographical area (Calabria, southern Italy), showing that this model configuration underestimates the precipitation.
The bias behaviour of RW5 and RW6 is different. The bias decreases from 0.7 (1 mm/day) to 0.6 (10 mm/day); then it increases up to 1.0 for the 60 mm/day threshold. From Figure 5(a) it is apparent that (a) RW5 and RW6 have a bias closer to 1 compared to RM6, with the exception of the 1 mm/day threshold, showing a better performance of RW5 and RW6 for this score; (b) the bias of RW5 and the bias of RW6 are similar.
The results for the bias confirm the finding of the previous section for the October 15 case study: RW5 and RW6 have larger precipitation areas compared to RM6 for all thresholds but 1 mm/day (see also Table 1). Moreover, RW5 and RW6 have similar behaviours for the QPF.
The ETS (Figure 5(b)) for RM6 ranges from 0.49 (1 mm/day) to 0.20 (60 mm/day). The decrease of the ETS between the 40 mm/day and 60 mm/day thresholds is noted.
The ETS behaviour of RW5 and RW6 is similar. The ETS decreases from 0.48 (1 mm/day) to 0.30 (60 mm/day). The results of Figure 5(b) show that, with the exception of the 1 mm/day threshold, the ETS of RW5 and RW6 is better compared to RM6.
The POD for the different BMP (Figure 5(c)) confirms the different behaviours of RM6 and RW5/RW6. The POD for RM6 decreases from 0.6 (1 mm/day) to 0.25 (60 mm/day), that is, one potentially hazardous event out of four is correctly forecast. This shows a low performance of RM6 for the 60 mm/day threshold.
The POD for RW5 and RW6 decreases up to the 10 mm/day threshold (0.5); then it is almost constant for larger thresholds. For the 60 mm/day threshold, RW5 and RW6 correctly forecast half of the potentially hazardous events.
Figure 5(d) shows the FAR for the different BMP. This score has an opposite behaviour compared to the POD. The FAR increases from 0.13 (1 mm/day) to 0.55 (60 mm/day) for RW5 and RW6, while it varies between 0.15 (1 mm/day) and 0.35 (60 mm/day) for RM6. The larger FAR for RM6 for the 1 mm/day threshold is noted, which shows that larger POD are also associated with larger FAR for the different BMP.
With the exception of the 1 mm/day threshold, the larger precipitation of RW5/RW6 compared to RM6 increases not only the correct yes forecasts (POD) but also the fraction of yes forecasts that did not occur (FAR), showing two different facets of the forecast. The choice between a forecasting system giving more hits and false alarms and one giving fewer hits and false alarms is not simple and depends on the specific application. In general, the use of different scores shows the many aspects of the QPF and is the guide for the final choice.
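The categorical scores discussed above can be illustrated with a short sketch. It assumes the standard definitions of bias, POD, FAR (fraction of yes forecasts that did not occur), and ETS from a 2x2 contingency table; the function names and the sample values are hypothetical and are not taken from the verification dataset of this paper.

```python
def contingency_table(forecast, observed, threshold):
    """Count hits, false alarms, misses and correct negatives (threshold in mm/day)."""
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Bias, POD, FAR and ETS from a 2x2 contingency table (standard definitions)."""
    n = hits + false_alarms + misses + correct_negatives
    forecast_yes = hits + false_alarms
    observed_yes = hits + misses
    # Number of hits expected by chance, removed in the equitable threat score.
    hits_random = forecast_yes * observed_yes / n
    return {
        "bias": forecast_yes / observed_yes,
        "pod": hits / observed_yes,
        "far": false_alarms / forecast_yes,
        "ets": (hits - hits_random) / (hits + false_alarms + misses - hits_random),
    }


# Hypothetical daily rainfall (mm/day) at six rain gauges.
forecast = [0, 12, 30, 5, 0, 25]
observed = [0, 15, 8, 20, 2, 40]
scores = categorical_scores(*contingency_table(forecast, observed, threshold=10))
```

A bias below one means fewer yes forecasts than yes observations, which is the underestimation of the precipitation area discussed for Figure 5(a).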
Figure 5(e) shows the quantitative bias (QB). This statistic is always less than zero because of the rainfall underestimation for all RAMS BMP. However, the rainfall underestimation is larger for RM6, showing a better performance of RW5 and RW6. The QB of RW5 and RW6 are very similar.
The MAE (Figure 5(f)) shows similar behaviours for all RAMS BMP. This is caused by the similar precipitation patterns forecast by all microphysics schemes, as shown in Figures 2(a)-2(c) for the October 15 case study. The slightly lower MAE of RM6 compared to RW5 and RW6 up to the 10-20 mm/day precipitation is caused by the lower precipitation amount forecast by RM6.
To explore further the difference among the QPF of RM6, RW5, and RW6, Table 3 shows the number of occurrences for different daily precipitation intervals for all BMP and observations. The number of occurrences for RM6 is larger than that of RW5 and RW6 up to the 10-20 mm/day interval, while the opposite occurs for intervals involving larger precipitation, in agreement with the analysis of the scores.
A chi-square test, with the null hypothesis that the datasets were drawn from the same distribution, was performed to infer the statistical difference of the precipitation datasets of Table 3. The results show that the RM6, RW5, and RW6 precipitation distributions are all different from the observed distribution (99% significance level), showing the difficulty for all BMP to correctly forecast the precipitation field. Moreover, the RM6 distribution is also different (99% significance level) from both the RW5 and RW6 distributions, while the null hypothesis that the QPF distributions of RW5 and RW6 are the same cannot be rejected.
The statistical test for the difference among the QPF distributions of RM6, RW5, and RW6 was repeated dividing the whole dataset into two batches: precipitation above 400 m (hill-mountain) and precipitation below 400 m (valley). The results of the chi-square test are similar to those for the whole dataset: (a) the RM6 QPF distribution is different from the observed and RW5/RW6 QPF distributions at the 99% significance level; (b) the null hypothesis that the RW5 and RW6 QPF are drawn from the same distribution cannot be rejected; (c) the RW5 and observed QPF distributions are different at the 99% significance level for both hill-mountain and valley; (d) the RW6 and observed distributions are different for the valley (99% significance level) and for hill-mountain (95% significance level).
Considering the results of the hypothesis testing, it is possible to conclude that RW5 and RW6 have similar QPF distributions, which are different from that of RM6. None of the forecast QPF distributions is similar to that observed, showing the difficulty of correctly predicting the rainfall in the different precipitation intervals.
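The comparison of two binned precipitation datasets can be sketched as follows. This is a minimal illustration, assuming the usual chi-square test of homogeneity on the occurrences per precipitation interval; the exact variant used for Table 3 is not specified here, and the function name, the counts, and the small table of critical values (1% level) are introduced only for the example.

```python
# Critical values of the chi-square distribution at the 1% level,
# indexed by degrees of freedom (standard tabulated values).
CHI2_CRITICAL_1PCT = {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277, 5: 15.086, 6: 16.812}


def chi_square_homogeneity(counts_a, counts_b):
    """Chi-square statistic and degrees of freedom for two binned samples.

    counts_a[k] and counts_b[k] are the occurrences in precipitation interval k.
    Every interval must contain at least one occurrence in one of the samples.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("the two samples must use the same intervals")
    total_a, total_b = sum(counts_a), sum(counts_b)
    total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        bin_total = a + b
        if bin_total == 0:
            raise ValueError("empty interval: merge it with a neighbouring one")
        for count, sample_total in ((a, total_a), (b, total_b)):
            # Expected count if both samples share the same distribution.
            expected = bin_total * sample_total / total
            statistic += (count - expected) ** 2 / expected
    return statistic, len(counts_a) - 1


# Hypothetical occurrences in two precipitation intervals.
statistic, dof = chi_square_homogeneity([60, 40], [40, 60])
reject_null = statistic > CHI2_CRITICAL_1PCT[dof]
```

When the statistic exceeds the critical value, the null hypothesis that the two datasets are drawn from the same distribution is rejected at the chosen level.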
From Table 4, the accuracy (AC) and the Kuipers skill score (KSS) were computed. The value of AC is 0.82 for all forecasts. The high value of AC, however, is largely dominated by the forecast/observed precipitation for the lowest interval, [0-1) mm. The KSS is 0.36 for all RAMS configurations, showing that the forecasts have skill compared to a random or constant forecast.
Interestingly, the KSS of RW5 and RW6 are not different from that of RM6, despite the fact that the QPF of the RAMS BMP are different, as shown by the chi-square test. There are two reasons for this: (a) RM6 has a better performance for the [1-5) mm interval, as shown by the diagonal elements of Table 4; (b) even if RM6 underestimates the precipitation compared to RW5 and RW6, it also has fewer false alarms, which are penalized by the KSS.
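As an illustration of how the two multicategory scores are obtained from a contingency table, the following sketch assumes the standard definitions of accuracy and of the multicategory Hanssen-Kuipers (Kuipers) skill score. The function name, the table layout (rows are forecast categories, columns are observed categories), and the sample tables are hypothetical and do not reproduce Table 4.

```python
def accuracy_and_kss(table):
    """Accuracy and Kuipers skill score from a K x K contingency table.

    table[i][j] is the number of cases forecast in category i and
    observed in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    forecast_totals = [sum(table[i]) for i in range(k)]
    observed_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Fraction of cases on the diagonal (correct category).
    accuracy = sum(table[i][i] for i in range(k)) / n
    # Fraction of correct forecasts expected by chance.
    chance = sum(f * o for f, o in zip(forecast_totals, observed_totals)) / n**2
    # Normalisation based on the observed frequencies only.
    reference = sum(o * o for o in observed_totals) / n**2
    return accuracy, (accuracy - chance) / (1.0 - reference)


perfect = accuracy_and_kss([[5, 0], [0, 5]])    # every forecast in the right category
constant = accuracy_and_kss([[6, 4], [0, 0]])   # always forecasting the first category
```

The constant forecast has a nonzero accuracy but a zero KSS, which is why the KSS is read as skill relative to a random or constant forecast.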
Again, the scores and the contingency tables, including multicategory contingency tables, reveal several aspects of a forecasting system; the choice of the best forecast depends on the specific application and is aided by the analysis of these verification tools.

A Simple Ensemble Forecast.
A method to exploit the implementation of different microphysics schemes in meteorological models is the multiphysics ensemble forecast [22], that is, to run an ensemble changing the physical/dynamical parameterizations of the model and, in addition to the probabilistic forecast, to give a deterministic forecast merging the different model outputs [24]. The use of ensemble techniques, however, involves the computation of statistical properties of the models, which requires a long simulation period, while this work spans only 51 days. Bearing this in mind, this section shows the comparison between the ensemble average ("poor man ensemble"), the best model among RM6, RW5, and RW6, and the unbiased ensemble. The unbiased ensemble is defined as
$$F_{\mathrm{UNB}} = \bar{O} + \frac{1}{N}\sum_{i=1}^{N}\left(F_i - \bar{F}_i\right), \qquad (5)$$
where $F_{\mathrm{UNB}}$ is the unbiased ensemble forecast, $F_i$ is the forecast of the $i$th model, $N$ is the number of models (three here), and $\bar{O}$ and $\bar{F}_i$ are the means of the observations and of the $i$th model in the training period.
The ensemble technique was applied by selecting each day in turn, that is, the actual day, for the period September 11 to October 31, 2012, then computing the statistics $\bar{F}_i$ and $\bar{O}$ (the means of the $i$th model and of the observations) considering all days but the actual day on which the forecast is issued, and then applying (5) for the actual day. In this way the statistics $\bar{F}_i$ and $\bar{O}$ are independent from the forecast day.
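The leave-one-out procedure can be sketched for a single rain gauge as follows. The sketch assumes that (5) is the usual bias-removed ensemble mean, that is, the mean of the observations in the training period plus the average of the model anomalies with respect to their own training means; the function name and the sample values are hypothetical.

```python
def unbiased_ensemble_loo(forecasts, observations):
    """Leave-one-out unbiased ensemble forecast for each day.

    forecasts[i][d] is the forecast of model i on day d and observations[d]
    is the observed value on day d. For each day the training statistics
    are computed from all the other days, so they are independent of the
    day being forecast.
    """
    n_models = len(forecasts)
    n_days = len(observations)
    result = []
    for day in range(n_days):
        training = [d for d in range(n_days) if d != day]
        obs_mean = sum(observations[d] for d in training) / len(training)
        anomaly_sum = 0.0
        for model in forecasts:
            model_mean = sum(model[d] for d in training) / len(training)
            # Anomaly of this model on the forecast day w.r.t. its training mean.
            anomaly_sum += model[day] - model_mean
        result.append(obs_mean + anomaly_sum / n_models)
    return result


# Two hypothetical models, one too dry and one too wet by 1 mm/day.
forecasts = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
observations = [2.0, 3.0, 4.0]
unbiased = unbiased_ensemble_loo(forecasts, observations)
```

In this toy example the systematic errors of the two models are removed exactly because they are constant in time; with real forecasts the correction only removes the mean error estimated over the training days.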
Figure 6(a) shows the bias of the (a priori unknown) best model, of the ensemble average, and of $F_{\mathrm{UNB}}$. $F_{\mathrm{UNB}}$ has a better performance compared to the best model up to the 10 mm/day threshold; then the best model performs better. The important impact of the unbiased ensemble on the bias is noted. For the 1 mm/day threshold the $F_{\mathrm{UNB}}$ bias is larger than 0.9, showing a good prediction of the precipitation area, as well as a considerable improvement compared to the ensemble average and to the best model, whose bias is 0.7.
The improvement of the unbiased ensemble forecast is also found for the ETS (Figure 6(b)) up to the 10 mm/day threshold, while for larger thresholds there is no improvement of $F_{\mathrm{UNB}}$ compared to the best model. However, $F_{\mathrm{UNB}}$ outperforms the ensemble mean, showing the positive impact of model output postprocessing, despite the short training period.
A similar behaviour was found for the POD (not shown), while $F_{\mathrm{UNB}}$ has a larger FAR (not shown) compared to the best model.
In general, comparing the results of the unbiased ensemble forecast with those of the best model, the former performs better up to the 10 mm/day threshold. For larger thresholds the objective scores of the best model are better than or similar to those of $F_{\mathrm{UNB}}$. However, the unbiased forecast clearly outperforms the ensemble mean.
There are two main reasons for the worse performance of the unbiased ensemble compared to the best model for thresholds larger than 10 mm/day. The first is that the dataset used is short (50 days of training period). As shown in Table 2, the total number of observations larger than a given threshold decreases steeply with increasing daily precipitation, and the available observations are nearly halved for each increasing threshold larger than 10 mm/day.
The second reason for the worse performance of $F_{\mathrm{UNB}}$ compared to the best model for thresholds larger than 10 mm/day is that RW5 and RW6 have similar behaviours for the daily accumulated rainfall, as shown in the previous sections. So, RW5 and RW6 are not truly statistically independent datasets, and this reduces the performance of the ensemble forecast.

Conclusions
This paper shows the performance of three different BMP implemented into the RAMS model. The schemes are the RM6 [15], already included in the RAMS model, and the WSM5 [18,19] and WSM6 [20] schemes, which were coded into RAMS with the aim of improving the model performance for QPF.
The HyMeX initiative gave a unique opportunity for the verification of the model performance. More specifically, during the SOP1 (September 5 to November 6, 2012), which focused on heavy precipitation events and the associated societal response, daily accumulated rainfall data were collected from thousands of rain gauges over the Italian territory, which is the target of this study.
Simulations were performed at high horizontal resolution (4 km) for the period from September 10 to October 31, and, to simulate the operational context, initial and dynamic boundary conditions were interpolated from the ECMWF analysis/forecast cycle issued at 12 UTC.
The performance of the RAMS BMP was first analysed for the case study of October 15, 2012, when heavy precipitation occurred in several parts of Italy.
The analysis of the diurnal evolution of the hydrometeor mixing ratios and of their vertical distributions for both the convective and stratiform regions shows important differences among the BMP. In particular, the rain mixing ratio near the surface is smaller for RM6 compared to RW5/RW6, while the cloud water mixing ratio has an opposite behaviour. This result shows a larger conversion from cloud water to rain for RW5/RW6 compared to RM6. As a consequence, the precipitation, which is dominated by rain, is larger for RW5/RW6.
Despite the different microphysical processes considered in RW5 and RW6, the rain has similar mixing ratios at the lowest levels of RW5 and RW6, determining a similar QPF.
Even if shown for a case study, these conclusions are valid also considering the whole period, showing that the differences among the mixing ratios of the hydrometeors of the different BMP on October 15, 2012, are general properties of the BMP.
To study the QPF performance of RM6, RW5, and RW6 for a longer period, objective and quantitative scores were considered for different precipitation thresholds (from 1 mm/day to 60 mm/day) for the whole SOP1. From the analysis of the scores it is apparent that RW5 and RW6 have larger rainfall compared to RM6. Also, the scores of RW5/RW6 are better than those of RM6, aside from the FAR and from the 1 mm/day threshold for most scores.
The analysis of the scores shows that RW5/RW6 have larger POD but also larger FAR compared to RM6. In general, the analysis of different scores shows the many aspects of the QPF and is the guide for the choice of the best forecast for the application of interest.
The difference among the QPF distributions of the BMP was studied in detail by the chi-square test. Results have shown that the RW5 and RW6 QPF distributions are similar, while they are different from that of RM6. This result is in agreement with the findings of [1], even if differences may arise for specific cases, especially at high horizontal resolution [20]. All the BMP precipitation distributions are different from the observed distribution.
Multicategory contingency scores were computed from a multicategory contingency table. The results for the accuracy and for the Kuipers skill score show that all the BMP have skill with respect to a random or constant forecast. Although RM6 and RW5/RW6 have different QPF distributions, the KSS for all BMP is the same.
The last point considered in this paper is the preliminary QPF performance of an ensemble forecast formed by RM6, RW5, and RW6. Despite the short training period of this paper, the unbiased ensemble forecast has a considerable impact on QPF up to the 10 mm/day threshold, and $F_{\mathrm{UNB}}$ outperforms the best model forecast. For example, for the 1 mm/day threshold, the bias of RM6, RW5, and RW6 is about 0.7, while that of $F_{\mathrm{UNB}}$ is 0.9. For the 10 mm/day threshold, the $F_{\mathrm{UNB}}$ scores are similar to those of the best model, while for larger thresholds the best model outperforms $F_{\mathrm{UNB}}$.
In almost all cases $F_{\mathrm{UNB}}$ outperforms the ensemble average, showing the importance of the postprocessing technique for the QPF.
Future developments of this work point in two main directions: first, to collect more simulations and larger statistics with the RAMS configurations used in this paper and, second, to add a double moment microphysical scheme to RAMS.
The first point is important to give a statistically robust quantification of the model performance, including different seasons to account for the natural variability of the Mediterranean climate [32,44]. Also, considering a longer period for training would be important to improve the ensemble performance.
For the second point, the implementation of the Thompson microphysical scheme [35,36] is underway. This scheme allows for a more detailed representation of ice-phase processes than WSM5/WSM6 and has several other improvements for both the warm and cold phases of the microphysical scheme and for computational efficiency. This double moment microphysical scheme reproduces the behaviour of hydrometeors of more sophisticated bin models, while preserving the advantages of simpler microphysical schemes. In recent studies [1,34], the scores of the Thompson scheme used with the WRF model were better than those of WSM5/WSM6.

Figure 1: Rain gauges considered in this study. The colours are the rain gauge heights in metres.

Figure 2: Accumulated rain on October 15, 2012, for (a) RM6; (b) RW5; (c) RW6; (d) objectively analysed rainfall for available rain gauges. The objectively analysed rainfall is given, at each grid point, by the average of the observations recorded by the rain gauges inside a radius of 30 km from the grid point.

Figure 3: Domain-averaged mixing ratios (g/kg) for (a) cloud; (b) rain; (c) graupel; (d) ice. The hydrometeor ice for RM6 is the sum of the pristine ice, snow, and aggregates hydrometeors, while it is the sum of cloud ice and snow categories for RW5 and RW6. The graupel of RM6 is the sum of graupel and hail categories. RM6 is in green, RW5 in red, and RW6 in blue.
Figure 3(c) shows the domain-averaged graupel mixing ratio simulated by RM6 and RW6 (RW5 does not consider the graupel). The RW6 mixing ratio is about twice that of RM6, showing a larger conversion from other categories to graupel. This is in part confirmed by the lower amount of ice mixing ratios for RW6 compared to RM6 (Figure 3(d)). It is also noted that RW5 has the largest ice concentration among all microphysics schemes. This is because the ice accounts also for the graupel category in RW5.
To further explore the difference among the hydrometeors simulated by the BMP, Figure 4 shows the mixing ratios of the hydrometeors of RM6 (Figures 4(a) and 4(b)), RW5 (Figures 4(d) and 4(e)), and RW6 (Figures 4(g) and 4(h)) for the convective and stratiform regions.
The methodology to separate the convective and stratiform regions is adapted from [42], which partitions each vertical column containing clouds into convective and stratiform according to the following criteria: (a) a model grid point is classified as convective if it has a rain rate twice the average over the four surrounding grid points, that is, one on either side of the considered grid point; (b) any grid point with a precipitation rate larger than 10 mm/h is classified as convective; (c) grid points not classified as convective are identified as stratiform. In the original method, point (b) assumes a rain rate of 20 mm/h to classify a grid point as convective. This threshold was subjectively lowered to 10 mm/h to take into account the differences between the tropics, where the methodology of [42] was applied, and mid-latitude convection. Following [42], the stratiform region is further checked and identified as convective when one of the following two conditions is met: (a) for raining grid points the cloud water below the melting layer is greater than 0.5 g/kg or the maximum updraft is larger than 5 m/s; (b) for nonraining grid points the cloud water is larger than 0.025 g/kg or the maximum updraft is larger than 5 m/s.
The methodology is able to identify and separate the convective and stratiform regions. For example, the vertical velocity (not shown) for the convective region is of the order of 20-30 cm/s for the lowest 3 km of the troposphere, while it is less than 10 cm/s and, for most layers, lower than 5 cm/s for the stratiform region for all BMP.
Figures 4(a) and 4(b) show the mixing ratios of different hydrometeors for the convective and stratiform regions for RM6. In Figure 4(a) the mixing ratio below 2 km is dominated by water (cloud water and rain), with cloud water having a larger mixing ratio than rain. Aggregates, which give a contribution below 2 km and may fall, are defined as ice particles that are formed by collision and coalescence of pristine ice, snow, and other aggregates [15]. They may retain a moderate amount of riming before being reclassified. Aggregates are the most abundant hydrometeor in RM6 between 2 and 5 km height, while above 5 km the pristine ice has the largest mixing ratio. For the stratiform region, ice microphysics plays the major role in RM6. The most abundant hydrometeor below 5 km is the aggregate, while the pristine ice is the most abundant hydrometeor above 5 km. Figures 4(a) and 4(b) confirm, indirectly, the ability of the scheme of [42] to divide the convective and stratiform regions.
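The partitioning criteria listed above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in RAMS: the function name and the input layout (2D lists of rain rate in mm/h, cloud water in g/kg, and maximum updraft in m/s, one value per column) are hypothetical, border points use only the neighbours that exist, criterion (a) is applied to raining points only, and cloud-free columns are not filtered out.

```python
def classify_columns(rain_rate, cloud_water, max_updraft, rate_threshold=10.0):
    """Label each grid column as 'convective' or 'stratiform'.

    cloud_water is the cloud water below the melting layer for raining
    points and the column cloud water for nonraining points (assumption).
    """
    ny, nx = len(rain_rate), len(rain_rate[0])
    labels = [["stratiform"] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            rate = rain_rate[j][i]
            neighbours = [
                rain_rate[jj][ii]
                for jj, ii in ((j - 1, i), (j + 1, i), (j, i - 1), (j, i + 1))
                if 0 <= jj < ny and 0 <= ii < nx
            ]
            mean_nb = sum(neighbours) / len(neighbours) if neighbours else 0.0
            # Criteria (a) and (b): rain-rate based classification.
            convective = rate > rate_threshold or (rate > 0 and rate >= 2 * mean_nb)
            if not convective:
                # Further check of the stratiform points.
                water_limit = 0.5 if rate > 0 else 0.025
                convective = (cloud_water[j][i] > water_limit
                              or max_updraft[j][i] > 5.0)
            if convective:
                labels[j][i] = "convective"
    return labels


# Hypothetical 3 x 3 example.
rain = [[0, 1, 1], [1, 12, 1], [1, 1, 1]]
water = [[0.03, 0, 0], [0, 0, 0], [0, 0, 0]]
updraft = [[0, 0, 0], [0, 0, 0], [0, 0, 6.0]]
labels = classify_columns(rain, water, updraft)
```

In the example the centre point is convective because of its rain rate, while two corner points are reclassified by the cloud water and updraft checks.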

Figure 5: Objective scores computed from contingency tables and quantitative statistics for the precipitation over the Italian area: (a) bias; (b) equitable threat score (ETS); (c) probability of detection (POD); (d) false alarm rate (FAR); (e) quantitative bias (QB); (f) mean absolute error (MAE). RM6 is in green, RW5 in red, and RW6 in blue.

Figure 6: Objective scores for the precipitation over the Italian area: (a) bias; (b) equitable threat score (ETS). Best (green bar) is the best model, PMan (red bar) is the ensemble mean, and UNB (blue bar) is the unbiased ensemble forecast.

Table 1: RAMS grid-setting. $\mathrm{NNP}_x$, $\mathrm{NNP}_y$, and $\mathrm{NNP}_z$ are the number of grid points in the west-east, north-south, and vertical directions. $L_x$ (km), $L_y$ (km), and $L_z$ (m) are the domain extension in the west-east, north-south, and vertical directions. $D_x$ (km) and $D_y$ (km) are the horizontal grid resolutions in the west-east and north-south directions. CENTLON and CENTLAT are the geographical coordinates of the grid centres.

Table 2: Number of data equal to or above a given threshold for observations, RM6, RW5, and RW6.