Evaluation of Precipitation Forecast of System: Numerical Tools for Hurricane Forecast

Heavy rainfall events, typically associated with tropical cyclones (TCs), provoke intense flooding, consequently causing severe losses to life and property. Therefore, the amount and distribution of rain associated with TCs must be forecasted precisely within a reasonable time to guarantee the protection of lives and goods. In this study, the skill of the Numerical Tool for Hurricane Forecast (NTHF) for determining rainfall pattern, average rainfall, rainfall volume, and extreme amounts of rain observed during TCs is evaluated against Tropical Rainfall Measuring Mission (TRMM) data. A sample comprising nine systems formed in the North Atlantic basin from 2016 to 2018 is used, where the analysis begins 24 h before landfall. Several statistical indices characterising the abilities of the NTHF and climatology and persistence model for rainfalls (R-CLIPER) for forecasting rain as measured by the TRMM are calculated at 24, 48, and 72 h forecasts for each TC and averaged. The model under consideration presents better forecasting skills than the R-CLIPER for all the attributes evaluated and demonstrates similar performances compared with models reported in the literature. The proposed model predicts the average rainfall well and presents a good description of the rain pattern. However, its forecast of extreme rain is only applicable for 24 h.


Introduction
Tropical cyclones (TCs) are among the most devastating atmospheric phenomena, as they result in strong surface winds, tornadoes, storm surges, and heavy rainfall events. Heavy rainfall events are distributed over wide areas and can cause flash flooding, thereby resulting in human and economic losses. It has been reported that approximately 60% of human deaths caused by hurricanes in the USA were related to flash flooding [1,2]. In Cuba, Hurricane Flora (1963) caused approximately 2000 casualties due to heavy persistent rains [3]. These facts highlight the importance of an accurate forecast of the distribution and amount of rain during the interaction of a TC with land. The rain pattern of a TC depends on different factors, namely, its internal dynamics, the synoptic situation around the cyclone, and its translational speed, which provoke azimuthal asymmetries [4]. It has been reported that vertical wind shear creates asymmetries in the inner-core rainfall distribution pattern [5,6]. The interaction of the storm with the Earth's surface, as well as the available humidity and the intensity of the cyclone, significantly affects the distribution and amount of rain [7]. For instance, a close relationship between precipitation distribution and thermodynamical symmetry has been discovered in the evolution of Hurricane Edouard [8]. In recent years, these factors have been incorporated in numerical models to perform a quantitative forecast of the track, intensity, and precipitation of TCs. The incorporation of a better physical representation of processes associated with hurricanes, as well as different parameterization schemes, allows a higher precision of the forecast of these elements with better spatial and time resolutions [9,10]. However, most previous studies have focused on forecasting the intensity and track of cyclones; fewer studies have focused on precipitation forecast.
For example, DeMaria and Tuleya [11] evaluated the precipitation forecast of the Geophysical Fluid Dynamics Laboratory (GFDL) model in the North Atlantic basin (NATL) for cyclones affecting mainland USA. Marks and DeMaria [12] developed an equivalent of the climatology and persistence model for rainfalls (R-CLIPER), where the intensity of the climatological precipitation is accumulated along the storm's track. This model is widely used as a benchmark to evaluate other precipitation forecast techniques.
Marchok et al. [7] reviewed and applied different validation schemes of the most frequently used forecast models regarding their ability to predict different aspects of rainfall, i.e., its distribution in time and space, its mean rainfall, and extreme rains observed in TCs.
The validation was performed with all the TCs land-falling in the USA from 1998 to 2004. A similar study for the North Indian Ocean was performed as well [13], where the precision of precipitation forecast of several global models used in the area was evaluated through comparison with TRMM-3B42 data. The sample was composed of nine TCs that formed in the North Indian Ocean from 2010 to 2013.
Those researchers discovered that, although the performances of some models were similar to the observations, no single model predicted all the observed features equally well. Nevertheless, the TRMM data must be used cautiously owing to the demonstrated underestimation of heavy rainfalls in mountainous regions [14] and other drawbacks, which are discussed below.
In Cuba, two operational systems are used for precipitation forecast. In 2015, Sierra et al. [15] proposed a configuration derived from the Weather Research and Forecasting model (WRF-ARW), known as Sistema de Pronóstico Inmediato. The main objective of the configuration was to perform a short-term prediction. They discovered the largest precipitation forecast errors from July to November, coinciding with the hurricane season in the NATL; therefore, they concluded that none of the tested configurations correctly forecasted the rainfall associated with TCs. The other operational system, named Sistema de Pronóstico Numérico Océano Atmósfera, combines the WRF-ARW with two sea wave models (WW3 and SWAN) and an oceanic circulation model (ROMS) [16,17]. The authors found the best forecast ability for rainfall thresholds above 5 mm/day during the months of April, May, and June, outside the cyclone season. Currently, an evaluated operational model demonstrating good rain forecasts for TCs does not exist.
The Department of Meteorology of the Higher Institute of Applied Technologies and Sciences, University of Havana developed an operational model that can be used to forecast precipitation. It is known as the Numerical Tool for Hurricane Forecast (NTHF) [18], which incorporates the option of a GFDL vortex tracker. It is a program that loads model forecasts in the GRIB/NetCDF format, objectively analyses data to provide an estimate of the vortex centre position (latitude and longitude) and tracks the storm for the duration of the forecast. It includes parameterization schemes to describe the physics related to the development and intensification of hurricanes.
Furthermore, it uses the atmospheric component of the HWRF3.9 model, which is specifically designed to be used for forecasting TCs [19,20]. It has been demonstrated that this system can forecast the track of TCs, particularly Category 4 and 5 hurricanes in the Saffir-Simpson scale [18]. Regarding intensity, the system has forecasting abilities of cyclonic systems from depressions to Category 3 hurricanes, with discrete results for Category 4 and 5 hurricanes. Although the model can forecast rainfall [20], it has not been evaluated for Cuba and the Inter-American Oceans. The aim of this study is to evaluate the abilities of the NTHF system for forecasting rainfall associated with TCs, as reported by the TRMM. The system demonstrated excellent predictions of the average rainfall and a good description of the rain pattern; however, its forecast of extreme rain was only applicable for 24 h.

HWRF and R-CLIPER Models.
The NTHF has been implemented and is operational at the Department of Meteorology of the Higher Institute of Technologies and Applied Sciences of the University of Havana. Its aim is to forecast the evolution of TCs formed in the NATL, particularly in intercontinental seas. The model uses the HWRF as a dynamic core for the solution of a system of primitive equations. The computational algorithms guarantee the initialisation of the model during operational runs with official information from the National Hurricane Centre (NHC) and the forecast outputs of the Global Forecasting System (GFS). Furthermore, it contains algorithms for postprocessing the results. The forecasts extend for 120 h durations, consistent with the time period of official NHC forecasts. Figure 1 shows the NTHF block diagram [18].
The configuration of the model for this study was based on the recommendations reported in [21,22] for the operational runs of the HWRF in the National Centre for Environmental Prediction (NCEP). Experiments were performed with bidirectional interactive nested domains of 27-9 km resolution. The external domain was located in the centre of the storm, whereas the internal domain tracked the centre of the storm during the integration of the model using a movable grid. Figure 2(a) shows the nested domains.
All performed simulations were initialised at 0000 UTC with the outputs of the GFS at a 0.5° resolution, as obtained from https://nomads.ncdc.noaa.gov/data/gfs4. They were performed for a forecast period of 72 h. The boundary conditions were updated every 6 h, and the integration time step was 69 s for the 27 km domain; the time step of the internal domain was 1/3 of that of the external domain. Table 1 shows the fundamental aspects of the configuration used.
This configuration has certain limitations associated with the noncoupling of the ocean model. Therefore, the model uses a static sea surface temperature, rendering it impossible to account for changes in temperature during the model integration, thereby affecting the calculated intensity [23]. Another shortcoming of the system is that it does not use vortex relocation, thereby affecting some evaluation parameters, as discussed below. For a description of the parameterizations used and a discussion of the shortcomings of the configuration, see [18].
Despite the limitations mentioned above, the NTHF is an alternative to the NOAA HWRF system [22]. It can be implemented in centres without high computational resources for operational use in TC forecasts in the NATL basin. In addition, the implementation of the NTHF as an operating system in the NATL allows the meteorological offices of countries in Central America and the Caribbean to develop graphic products, such as monitoring cones as well as intensity and monitoring forecasts, by incorporating the results of all numerical forecasting models from https://ftp.nhc.noaa.gov/atcf/com/. Furthermore, NTHF would become a powerful tool for the scientific research of hurricanes in these countries, as an alternative to the NOAA HWRF system, because the implementation of the HWRF model requires high computational resources. As a baseline, the R-CLIPER [12] model was implemented at the Department of Meteorology. The precipitation field was calculated using the best-track and maximum speed 72 h forecasts, for the latitudes and longitudes of the TRMM data mesh. Calculations were performed with a spatial resolution of 27 km and at 6 h time intervals.

TRMM Data Used in Evaluation.
The study area was the NATL basin. It was selected because it constitutes a region with intense cyclonic activity. The cyclonic systems are likely to affect the Caribbean Islands as well as Central and North Americas. In addition, the study area is susceptible to developing maximum intensities owing to high sea surface temperatures in areas of the Caribbean Sea, the West Atlantic, and the Gulf of Mexico, thereby increasing the likelihood of heavy rainfall events, large accumulations, and consequently loss of human lives and economic damages.
To evaluate the NTHF configuration, nine TCs were selected (see Table 2). All of them were formed in the NATL, and most of them in the Intra-American Seas. The selection criterion was based on selecting TCs that arrived on land during the 2016, 2017, and 2018 study periods. In addition, the runs were initialised 24 h before landfall at 0000 UTC to provide forecasts with the highest accuracies of the accumulated rainfall in that time interval. Cyclones with different characteristics and intensities were included, i.e., tropical storms and hurricanes with various categories, to verify the general performance of the operational configuration. Figure 2(b) shows the trajectories predicted by the NTHF and the best track for all study cases.
Owing to the development of satellite observation systems, the estimation of precipitation has been widely applied in weather and climate research, in particular for the study of rain in TCs over oceans and arriving on land. Although rain gauge data measurements are available, they are often not sufficiently dense or do not have an appropriate spatial distribution in important regions, mainly on land and near coastal areas. In addition, weather radars provide good spatial and temporal resolutions, but the area included is limited.
Therefore, satellite precipitation estimates are suitable for studying rainfall characteristics, especially over oceans and coasts, where surface observations are limited [24]. The efficacy of using TRMM data in describing the characteristics of systems that reach the land has been evaluated in several studies [25-27]. These studies indicated that TRMM-3B42 data can reasonably represent the distribution of rainfall compared with rain gauge data and radar observations. However, it has been reported that TRMM-3B42 data underestimate moderate and heavy rainfalls but overestimate light precipitation [27]. Meanwhile, TRMM data collected at daily to monthly intervals are often underestimated for mountainous regions in tropical and middle latitude mountain systems [28,29]. Hence, it is advisable to consider the corrections introduced in [14]. Furthermore, TRMM-3B42 has a lower detection probability and lower false-alarm rates than other products in both warm and cold seasons for North America [30]. TRMM data are available on a grid with a spatial resolution of 0.25° × 0.25°. They are the result of the combination of TRMM and other satellite estimates, known as the TRMM Multisatellite Precipitation Analysis (TMPA). It is a 3-hourly product that provides information from 50°N to 50°S, available at http://www.pmm.nasa.gov/data-access/downloads/trmm.shtml/.

Methodology
To evaluate the performance of each model in predicting observational data, a set of statigraphs from Brown et al. was used [31]. Rain field data simulated by models NTHF and R-CLIPER were statistically compared with the satellite observations of the TRMM. Three important elements in the forecast of TC precipitation were considered: ability to match rainfall patterns around the centre (pattern matching), ability to match average and distribution of rainfall volume (mean rainfall distribution), and ability to reproduce the largest rain values that are typically related with TCs (extreme rain prediction). The R-CLIPER model was used as a reference and comparison for all study cases.
For the interpolation of the NTHF precipitation fields to the latitudes and longitudes of the TRMM, the nearest neighbour method was used [32], in which a 600 km grid was selected with respect to the grid's centre to avoid the inclusion of rain fields associated with other types of synoptic systems [7,13]. A radius measuring 5° (∼555 km) around the centre has already been tested by Englehart and Douglas [33], who demonstrated that the distance between the centre of a TC and the outer edge of its cloud shield was between 500 and 600 km for 90% of the cases. Moreover, Larson et al. [34] conducted sensitivity tests using radii varying between 2.5° and 7.5°; they discovered that radii larger than 5° excluded most TC rainfall. Zhan et al. [35] used a radius of 500 km, which accounted for rainfall located from the inner core to adjacent rainbands. On the contrary, Rios Gaona et al. [36] used a radius of 1000 km to determine rainfall associated with a large number of TCs (166) for two years in different basins. However, in the presentation of the rain data, the graphs were always limited to a radius of approximately 600 km, showing rainfall intensities ≤1 mm/h at a radius of 600 km in all cases. Hence, it may be concluded that the inclusion of data for larger radii did not significantly improve the quality of the results. Once the rain fields had been calculated, they were used to calculate a group of statigraphs, which are described in the next paragraph. Using the statigraphs, the forecast skill indices of both models were obtained and compared to assess the performance of the NTHF. Next, we define the statigraphs and their associated indices, where $x_i$ represents an observed datum, $y_i$ the corresponding forecast value, and $n$ the sample size.
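The 600 km selection described above can be illustrated with a short sketch. It is not the QPF-verif1.0 implementation; the function names, the haversine distance, and the mean Earth radius of 6371 km are assumptions made here for illustration, and the nearest-neighbour interpolation step itself is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within_radius(grid_points, centre, radius_km=600.0):
    """Keep the (lat, lon, rain) tuples no farther than radius_km from centre.

    grid_points: iterable of (lat, lon, rain) tuples (hypothetical layout).
    centre: (lat, lon) of the storm centre.
    """
    clat, clon = centre
    return [p for p in grid_points
            if great_circle_km(p[0], p[1], clat, clon) <= radius_km]
```

With these assumptions, a 5° arc along the equator corresponds to about 556 km, which agrees with the ∼555 km quoted for the 5° radius.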

Rain Pattern Matching.
To determine the ability of the system in predicting rainfall distribution, two statigraphs were used. First, the correlation pattern was measured as the Pearson correlation coefficient between the simulated and observed values in all grid points. Next, the Equitable Threat Score (ETS), which is the percent of local rainfall correctly predicted by the system, was calculated. These two statigraphs are dependent on the geographical location of the rain and hence sensitive to the track error. The points were selected up to 600 km away from the centre of the real track for the NTHF, R-CLIPER, and TRMM.
The Pearson correlation coefficient is an index that measures the degree of linear correlation between two related variables. The closer the correlation coefficient of two variables to 1, the more similar is the behaviour between both variables. For a sample, it can be calculated as follows:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{1}$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ are the average values of the observed values and forecasts, respectively.
The second statigraph is related to the correct forecast of rainfall above a specified threshold in different grid points. In this case, the variable is dichotomous; therefore, it is convenient to construct a contingency table. To determine the accuracy of the forecast, the results are categorized into four groups:
(i) H: the number of times the system forecasts a precipitation event above a specified threshold, and it occurs (known as hits)
(ii) M: the number of times the event was not forecast but it occurred (missing)
(iii) FA: the number of times the event was forecast but did not occur (false alarms)
(iv) CR: the number of times the event was not forecast, and it did not occur (correct rejection)
The contingency table (Table 3) shows a comparison of the NTHF predictions and TRMM observations; YES means an event (predicted or observed) above a specified threshold, whereas NO means a nonoccurrence.
The ETS, also known as the Gilbert skill score, is calculated based upon the contingency table. It represents the fraction of all observed and predicted events that were correctly diagnosed and considers the possibility of a correct forecast that occurs by chance. Its drawback is that it does not distinguish between the sources of error. The index is calculated as follows:
$$\mathrm{ETS} = \frac{H - H_{\mathrm{random}}}{H + M + FA - H_{\mathrm{random}}}, \tag{2}$$
where $H_{\mathrm{random}}$ is an evaluation of the number of positive forecasts that occurred by chance, which can be calculated as
$$H_{\mathrm{random}} = \frac{N_{\mathrm{obs(Yes)}} \, N_{\mathrm{fsc(Yes)}}}{N}, \tag{3}$$
where $N_{\mathrm{obs(Yes)}} = H + M$ is the total number of rain occurrences, $N_{\mathrm{fsc(Yes)}} = H + FA$ is the total number of rain forecasts, and $N$ is the total number of events. The closer the score is to 1, the better is the skill.
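As an illustration of the pattern correlation, the following minimal sketch computes the Pearson coefficient for paired observed and forecast values. The function name and the plain-list inputs are assumptions made for this example; in practice the inputs would be the TRMM and model values at the grid points within 600 km of the centre.

```python
import math


def pearson_r(obs, fcst):
    """Pearson correlation between observed (x_i) and forecast (y_i) values."""
    n = len(obs)
    if n != len(fcst) or n < 2:
        raise ValueError("need two equally sized samples with n >= 2")
    mean_x = sum(obs) / n
    mean_y = sum(fcst) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(obs, fcst))
    sxx = sum((x - mean_x) ** 2 for x in obs)
    syy = sum((y - mean_y) ** 2 for y in fcst)
    if sxx == 0 or syy == 0:
        # A constant field has no variance, so r is undefined.
        raise ValueError("correlation undefined for a constant field")
    return sxy / math.sqrt(sxx * syy)
```

A forecast field that is an exact positive multiple of the observed field gives r = 1, so the coefficient rewards the right spatial pattern even when the amounts are biased; this is why it is paired with amount-sensitive measures elsewhere in the evaluation.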
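The contingency counts and the ETS can be sketched as follows. This is an illustrative example rather than the authors' code: the function names are hypothetical, and treating an event as rainfall greater than or equal to the threshold is an assumption, since the text only says "above a specified threshold".

```python
def contingency_counts(obs, fcst, threshold):
    """Return (H, M, FA, CR) for paired observed/forecast rainfall values.

    Assumption: an event is rainfall >= threshold.
    """
    h = m = fa = cr = 0
    for x, y in zip(obs, fcst):
        observed = x >= threshold
        forecast = y >= threshold
        if forecast and observed:
            h += 1       # hit
        elif observed:
            m += 1       # miss
        elif forecast:
            fa += 1      # false alarm
        else:
            cr += 1      # correct rejection
    return h, m, fa, cr


def equitable_threat_score(h, m, fa, cr):
    """ETS (Gilbert skill score) from the four contingency counts."""
    n = h + m + fa + cr
    h_random = (h + m) * (h + fa) / n   # hits expected by chance
    denom = h + m + fa - h_random
    if denom == 0:
        return float("nan")             # score undefined for this table
    return (h - h_random) / denom
```

For example, with observations [0, 10, 20, 30] mm, forecasts [0, 12, 5, 40] mm, and a 10 mm threshold, the counts are H = 2, M = 1, FA = 0, CR = 1, the chance hits are 1.5, and the ETS is 1/3.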

Mean Rainfall and Rain Flux Distribution.
The average rainfall and rain volume distributions are good indicators to evaluate the intensity of rain (heavy, moderate, and light) and the limitations of a model in predicting it. In our analysis for the NTHF and R-CLIPER, all grid points within 600 km from the centre of the forecast track (relative track) were considered, while for the TRMM the 600 km radius was centred in the real track. The radial distributions, volume, root mean square error (RMSE), mean absolute error (MAE), and bias were determined for both models. Subsequently, the probability distribution functions (PDFs) and cumulative distribution functions (CDFs) were constructed. The 50th percentile of the CDF is important to determine the central value of the distribution such that the behaviour of the remaining rainfall thresholds that are greater than the median can be ascertained. In addition, a more detailed frequency distribution analysis of the rain behaviour was performed based on bands of 100 km to an outer radius of 600 km. The definitions of the magnitudes involved are presented below.
The MAE is a measure of the deviation of forecast values from observed values. It is calculated as follows:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - x_i \right|. \tag{4}$$
The closer the MAE values are to zero, the more accurate is the forecast.
The root mean square error (RMSE) enables the magnitude of the deviation of the forecast values from the observed values to be quantified. It is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i)^2}. \tag{5}$$
When the RMSE = 0, the forecast is perfect. The RMSE is a measure of accuracy in the forecast.
Bias provides the difference between the estimated and observed values, considering the sign of the deviations as follows:
$$\mathrm{BIAS} = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i). \tag{6}$$
A simulation is good when the BIAS values are close to zero. Unlike the MAE, whose magnitude is always nonnegative, the BIAS can assume both positive and negative values, thereby allowing one to determine if an under- or overestimation occurred in the forecast.
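The three error measures can be computed together from the same paired values. The sketch below is illustrative only; the function name and the dictionary return format are choices made for this example, with the forecast-minus-observation sign convention taken from the definition of bias in the text.

```python
import math


def error_statistics(obs, fcst):
    """Return MAE, RMSE, and BIAS for observed (x_i) and forecast (y_i) values."""
    n = len(obs)
    if n == 0 or n != len(fcst):
        raise ValueError("need two non-empty samples of equal size")
    diffs = [y - x for x, y in zip(obs, fcst)]  # forecast minus observation
    return {
        "MAE": sum(abs(d) for d in diffs) / n,
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "BIAS": sum(diffs) / n,
    }
```

For observations [10, 20, 30] mm and forecasts [12, 18, 33] mm, the deviations are +2, -2, and +3 mm, giving MAE ≈ 2.33 mm, RMSE ≈ 2.38 mm, and BIAS = 1 mm. The positive and negative deviations partly cancel in the BIAS but not in the MAE, which is the behaviour the text relies on when comparing the two models.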

Extreme Rain Prediction.
It is important to assess if the system can reproduce extreme rain events. In this case, the 95th percentile of the cumulative frequency distribution was analysed. To calculate it, the relative track was used for the NTHF and R-CLIPER, whereas the TRMM was centred in the actual track. In addition, the same method was applied to the 0-100 km and 300-400 km bands.
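A minimal sketch of the percentile analysis is given below. The nearest-rank definition of the percentile is an assumption (the paper does not state which estimator was used), and both function names are hypothetical. The second helper returns the percentile that a given rain threshold occupies in a sample, which is the kind of quantity needed to compare the model CDF with the observed 95th-percentile threshold.

```python
import math


def percentile_nearest_rank(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a sample."""
    if not values:
        raise ValueError("empty sample")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def cdf_percent_at(values, threshold):
    """Percentage of the sample that does not exceed the threshold."""
    if not values:
        raise ValueError("empty sample")
    return 100.0 * sum(1 for v in values if v <= threshold) / len(values)
```

A typical use would be to take the 95th percentile of the TRMM values within 600 km of the actual track and then evaluate `cdf_percent_at` on the model values selected around the forecast track.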

Skill Indices.
The skill indices in the quantitative precipitation forecast (QPF) proposed by Marchok et al. [7] are used to determine the ability of the model. The closer the index to 1 (or 100%), the better the performance of the model.

Pattern Matching.
This index corresponds to the ability of the model in reproducing rain patterns. It is obtained by averaging the average ETS determined for all precipitation thresholds and the average correlation coefficient:
$$\text{Pattern matching index} = \frac{\overline{\mathrm{ETS}} + \bar{r}}{2}.$$

Average Rainfall and Its Distribution.
This corresponds to the ability of the model in representing the average rainfall and its distribution. It can be calculated using the average of the mean rainfall error index (MREI) and the median value index in the cumulative frequency distribution (CDF−MVI). In the expression for the MREI, $n$ is the number of radial intervals, and $R_{fi}$ and $R_{oi}$ are the predicted and observed rain averages for the $i$th radial interval, respectively. These were determined for 20 intervals from 30 to 600 km around the centre of the storm by averaging the rain at each grid point located in each interval. $R_{\max}$ is the maximum observed rain in some bands corresponding to the system. In the expression for the CDF−MVI, $R_{f50\%}$ and $R_{o50\%}$ are the rain thresholds corresponding to the 50th percentile in the CDF for the model and observations, respectively. In this formulation, the index is high (low) when the average differences between the rain thresholds are small (large). If the difference exceeds 1 inch (25.4 mm), the index value is zero.
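The behaviour described for the CDF−MVI (high when the two medians agree, zero once they differ by more than 25.4 mm) can be illustrated with a linear ramp. This form is an assumption chosen only because it matches that description; the exact expression is given by Marchok et al. [7] and may differ. The function name and the sample values in the usage note are hypothetical.

```python
INCH_MM = 25.4  # 1 inch, the cut-off stated in the text


def median_value_index(r_f50_mm, r_o50_mm, cap_mm=INCH_MM):
    """Illustrative median value index (assumed linear form, not from the paper).

    Returns 1 when the forecast and observed 50th-percentile thresholds agree
    and falls linearly to 0 when they differ by cap_mm or more.
    """
    diff = abs(r_f50_mm - r_o50_mm)
    return max(0.0, 1.0 - diff / cap_mm)
```

Under this assumed form, a forecast median that is 12.7 mm away from the observed median would score 0.5, and any difference of 25.4 mm or more would score 0.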

Extreme Rain Amount.
The maximum value index (CDF−MI) represents the ability of the model in forecasting extreme rain observed in TCs. In its expression, $\mathrm{CDF}_{95\%}$ is the percentile in the CDF of the model corresponding to the precipitation threshold at the 95th percentile of the CDF of the observations. Table 4 shows a summary of the indices and their dependence on the track error. A Python 3.7 library (QPF-verif1.0), based on Anaconda3 packages [37], was implemented for processing the TRMM data and the outputs of the NTHF system and R-CLIPER model, as well as for the representation of the results.

Errors in Track Prediction for Study Cases.
It is mandatory to evaluate the accuracy of the track forecast for TCs because it is a key parameter that affects the results of the precipitation forecast, both in the interest region and in concrete locations. For verification, the HURDAT2 database was used. This dataset contains six-hourly text information regarding the location, maximum winds, minimum central pressure, and size of all known tropical and subtropical cyclones [38]. It is available at https://www.nhc.noaa.gov/data/#hurdat. Figure 3 shows the errors in track forecast for the study cases, the average error for the sample, and the average error of the NHC for all the TCs studied.
As shown, the average track forecast error for the nine study cases was similar to that of the NHC until approximately 42 h. The difference increased considerably from 48 to 72 h, exceeding 100 km at the end of the forecast period. Most of the hurricanes considered have similar time evolution of the forecast error in the first 42 h. This enables us to associate the errors in the precipitation forecast for these time periods with the track error.

Comparison of Model Forecast and TRMM Output.
In Figure 4, a comparison of the accumulated precipitation forecasted using the NTHF (Figure 4(a)) for Hurricane Irma and the accumulated values obtained from the TRMM (Figure 4(b)) in the first six forecast hours is shown. Both results were obtained at 1800 UTC, September 8, 2017, and the accumulated precipitation in a duration of 6 h was considered. These values are the approximate data used to determine the statigraphs and indices for evaluating the forecast quality. For instance, to determine the BIAS (see (6)), the value of the difference between the corresponding grid points of the two maps is calculated and averaged. Figure 4(c) shows the distribution of the difference based on the considered area. The stars represent the centre of the storm according to the best track (black star) and the track predicted by the NTHF (red star). In Figure S1 of the Supplementary Materials, the accumulated precipitation forecasted by R-CLIPER for the same hurricane, date, and time is shown. By adding the precipitation values for consecutive time intervals, the accumulated values for longer forecast periods can be calculated. Next, we discuss the average of the values accumulated from the sample.

Pattern Correlation (PC).
In Figure 5, the correlation coefficients (calculated using (1)) between the rain distribution predicted by the NTHF and R-CLIPER with the values provided by the TRMM are shown. The results spanned from 6 to 72 h of forecast, with a 6 h interval. It is clear that the NTHF achieved correlation coefficients exceeding 0.6 for all forecast intervals. The PC increased with time, reaching its maximum values from 18 to 48 h, and then decreased. The initial increment appeared to be provoked by the self-adjustment of the fields, whereas the final decrement was governed by the increment in the track forecast error (see Figure 3). Regarding the R-CLIPER, its r ≥ 0.5; however, in the time interval from 48 to 60 h of the forecast time, its r ≥ 0.6, which is equal to that of the NTHF. This is likely due to the weakening of the cyclones after a landfall, which improves the R-CLIPER forecasting capabilities based on a mean representation. Both forecasts improved with time and exhibited a final decrement. In general, NTHF performed better than R-CLIPER.

ETS.
In Figure 6, the ETS values calculated for several rainfall thresholds at 24, 48, and 72 h of forecast are shown. This index gives a measure of the frequency of correctly predicted events. The reported value excluded a number of cases for which a correct forecast was performed by chance (according to (2)). It can be concluded that the NTHF always presented better ETSs than the R-CLIPER. The best predictions were obtained for thresholds ranging from 6.4 to 51 mm, for which the ETS was approximately 0.4. For heavy rainfall events, the ETS decreased, indicating the difficulty in forecasting large rainfall volumes for both models. In particular, the R-CLIPER provides an average description of the climatological precipitation, and it was unsuitable for the description of heavy rainfall. In both systems, the ETS was low for lighter rain.

Radial Distribution.
The mean radial profiles for 24 h forecasts of the TRMM, NTHF, and R-CLIPER are shown in Figure 7(a). In Figure 7(b), the bias, defined as the difference between the predictions and the measurements, is represented. The radii considered were in the interval 30 km ≤ R ≤ 600 km. In Figures S2 and S3 of the Supplementary Materials, the values for 48 and 72 h forecasts are shown. Both models showed the same decrement in the predicted rainfall with the distance from the centre of the storm. The predictions of the R-CLIPER underestimated the rainfall distribution in all the profiles, in which the larger bias (20-50 mm) was close to the core of the storm. The bias was almost zero in the periphery (500-600 km) of the cyclone. These results were identical for all forecast times.
Up to 24 h, the NTHF overestimated the observed rainfall by 10 mm from the centre of the storm up to a radius of 320 km, with a negligible bias at larger radii.
This is attributable to the use of the Ferrier-Aligo parameterization scheme. Wang and Phillips [39] demonstrated that this scheme generated less stratiform clouds and less anvil clouds out of the eye wall owing to the lower heating ratio in this zone. In particular, it provoked the maximum rainfall in the region from the centre to the eye wall. Additionally, the use of convective parameterization with a low horizontal resolution (∼27 km) further activated the convection in the region close to the centre. All these factors were reflected in the modelling of rain accumulates, surpassing the values detected by the TRMM. Ko et al. [40] modelled the precipitation provoked by Hurricane Harvey and obtained an overestimation of the rainfall (predicted by the HWRF) when compared with gauge data up to a radius of 120 km.

Advances in Meteorology
At 48 h, the NTHF system underestimated the precipitation from 0 to 130 km, whereas it overestimated it for all the other radii, although the amount was less than 10 mm. For 72 h, an underestimation (less than 20 mm) in the central zone of the cyclone was detected, whereas a behaviour similar to that of the TRMM was observed for larger distances from the centre.

Rain Flux.
The forecast of the total volume of water deposited by rain associated with a cyclone in a specified region is important because this volume is related with the possibility of flooding. Rain flux is often used as an indicator to compare between models that use different grid areas. It is defined as the product of rainfall in a specified grid point and the area represented by the grid point (∼27 × 27 km² in this study). It can be expressed in units of mm·km², or in km³. Furthermore, it is proportional to the rainfall amount, thereby enabling one to use the volume of water deposited by rain (the rain volume) instead of the number of times a specific threshold has been exceeded [7]. Figure 8(a) shows the total rain flux estimated by the TRMM and calculated by the NTHF and R-CLIPER as a function of forecast time up to 72 h. The bias, determined as the difference between the forecast and TRMM measurement, is shown in Figure 8(b). Both figures reveal the quality of the NTHF forecast of total rain volume, underestimating no more than 5 km³ (≈4.4 mm) in the first 42 h and overestimating a maximum value of 10 km³ (≈8.8 mm) in the final hours of the forecast.
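The unit conversions behind the rain flux can be checked with a short sketch. The function names are hypothetical. The conversion from a volume to an equivalent mean depth assumes that the water is spread over a disc of 600 km radius; the paper does not state this explicitly, but the assumption reproduces the values quoted in parentheses above.

```python
import math


def rain_volume_km3(rain_mm, cell_area_km2=27.0 * 27.0):
    """Total rain volume in km^3 from per-cell rainfall depths in mm.

    1 mm of rain over 1 km^2 is 1e-6 km^3.
    """
    return sum(rain_mm) * cell_area_km2 * 1e-6


def mean_depth_mm(volume_km3, radius_km=600.0):
    """Mean depth in mm if volume_km3 is spread over a disc of radius_km."""
    area_km2 = math.pi * radius_km ** 2
    return volume_km3 / area_km2 * 1e6
```

Under the disc assumption, 5 km³ corresponds to about 4.4 mm and 10 km³ to about 8.8 mm, consistent with the equivalences given in the text.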

Mean Rainfall Statigraphs.
To further characterise the mean rainfall (shown in Figure 7), we calculated the BIAS, MAE, and RMSE (defined in (6), (4), and (5), respectively) for every grid point. Figure 9 shows the BIAS of both models compared with TRMM. It is clear that the NTHF always had a smaller BIAS than the R-CLIPER, underestimating the mean rainfall until 42 h of the forecast. The BIAS in this time interval was smaller than 3 mm. Beyond 42 h, an overestimation occurred, with maximum values less than 9 mm. The R-CLIPER always underestimated the rainfall, with a maximum BIAS of 25 mm.
To evaluate the average disagreement between the forecast and the rainfall measured by the TRMM, the MAE was calculated (see (4)). Figure 10 shows the time evolution of the MAE for the NTHF and R-CLIPER. It is clear that the NTHF (to a forecast time of 60 h) had a smaller MAE than the R-CLIPER, i.e., less than 20 mm up to 30 h. For the forecast time beyond 60 h, both systems performed almost equally. Comparing Figures 9 and 10, it can be concluded that the R-CLIPER underestimated the mean rainfall in almost every grid point, whereas the NTHF underestimated it in some grid points and overestimated it in others, thereby yielding a bias that was much smaller than that of the R-CLIPER; however, the MAE was almost equal for both systems.
Another measure of the forecast quality is the RMSE (see (5)). Figure S4 of the Supplementary Materials shows the time dependence of the RMSE for both systems. For forecast times below 48 h, the NTHF performed better. For later forecast times, this statistic indicated a better performance of the R-CLIPER. Figures 10 and S4 provide similar results and indicate a similar time dependence.
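The contrast drawn above between a small BIAS and a comparable MAE can be illustrated with a minimal Python sketch. It assumes the standard definitions of the three statistics (mean error, mean absolute error, and root mean square error over grid points); the equations referenced as (4)-(6) are not reproduced here, and the rainfall values are invented purely for illustration.

```python
import math

def bias(forecast, observed):
    """Mean of (forecast - observed); errors of opposite sign cancel."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)

def mae(forecast, observed):
    """Mean absolute error; errors of opposite sign do not cancel."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)

def rmse(forecast, observed):
    """Root mean square error; penalises large errors more strongly."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))

observed = [20.0, 20.0, 20.0, 20.0]       # hypothetical rainfall (mm)
mixed_errors = [10.0, 30.0, 10.0, 30.0]   # under- and overestimates
always_low = [10.0, 10.0, 10.0, 10.0]     # underestimates everywhere

print(bias(mixed_errors, observed), mae(mixed_errors, observed))  # 0.0 10.0
print(bias(always_low, observed), mae(always_low, observed))      # -10.0 10.0
print(rmse(mixed_errors, observed), rmse(always_low, observed))   # 10.0 10.0
```

A forecast with compensating errors therefore shows a near-zero BIAS while its MAE matches that of a forecast that is systematically low, which is the pattern described for the NTHF and R-CLIPER.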

PDF and CDF.
The PDF provides the frequency of rainfall occurrence pertaining to different rain thresholds. It was calculated for the observations and the two forecast systems. Figure 11(a) shows the PDF for the first 24 h of forecast. Using these data, the CDF, which provides the percentile at which a specified rain threshold is reached, can be calculated. In particular, the 50th percentile provides a threshold above which 50% of the rain occurs. It is typically assumed that the low-to-moderate rainfall zone appears below this percentile. If the observations indicate a given 50th percentile threshold and the system forecasts a larger one, the system is forecasting less rain in the low-to-moderate range. Figure 11(b) shows the CDF for the first 24 h of forecast.
As shown in Figure 11(a), the maximum of the observed frequencies lies in the thresholds between 10 and 31.6 mm, consistent with the NTHF prediction. The R-CLIPER shows a maximum in the frequencies between 3.2 and 10 mm. This is reflected in the values of the 50th percentile determined from Figure 11(b) (TRMM, 12.6 mm; NTHF, 6.3 mm; and R-CLIPER, 3.2 mm). These values imply that the NTHF overestimated the frequency of light rain, and that the R-CLIPER overestimated it to a greater degree.
When the forecast was performed for a 48 h interval (see Figure S5 in the Supplementary Materials), the same behaviour was observed, except that the difference for the R-CLIPER was greater. The maximum frequencies for the TRMM, NTHF, and R-CLIPER were in the intervals 31.6-100 mm, 10-31.6 mm, and 3.2-10 mm, respectively. Meanwhile, their median values were 15.8, 10, and 5 mm, respectively. For 72 h (see Figure S6 in the Supplementary Materials), the maximum frequencies for the TRMM, NTHF, and R-CLIPER were in the intervals 31.6-100 mm, 10-31.6 mm, and 3.2-10 mm, respectively. Meanwhile, the median cumulative frequencies were 20.0, 15.8, and 5.0 mm, respectively, consistent with the tendency shown above.
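The construction of the PDF and CDF described in this subsection can be sketched in Python as follows. The half-decade logarithmic bin edges are an assumption inferred from the thresholds quoted in the text (3.2, 10, 31.6, and 100 mm); the authors' actual binning may be finer, and the sample rainfall values are hypothetical.

```python
import bisect

# Assumed bin edges in mm: 0.1, 0.316, 1, 3.16, 10, 31.6, 100, 316, 1000.
EDGES_MM = [10 ** (k / 2) for k in range(-2, 7)]

def pdf_and_cdf(values_mm, edges=EDGES_MM):
    """Relative and cumulative frequency of rainfall per bin (non-empty input)."""
    counts = [0] * (len(edges) - 1)
    for v in values_mm:
        if edges[0] <= v < edges[-1]:
            counts[bisect.bisect_right(edges, v) - 1] += 1
    total = sum(counts)
    pdf = [c / total for c in counts]
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return pdf, cdf

def percentile_threshold(cdf, q, edges=EDGES_MM):
    """Upper edge of the first bin whose cumulative frequency reaches q."""
    for i, c in enumerate(cdf):
        if c >= q:
            return edges[i + 1]
    return edges[-1]

sample_mm = [0.5, 2, 4, 6, 12, 15, 20, 25, 40, 120]  # hypothetical rainfall
pdf, cdf = pdf_and_cdf(sample_mm)
modal_bin = pdf.index(max(pdf))
print(round(EDGES_MM[modal_bin], 1), round(EDGES_MM[modal_bin + 1], 1))  # 10.0 31.6
print(round(percentile_threshold(cdf, 0.5), 1))                          # 31.6
```

Because the threshold is read from a binned CDF, its resolution is limited by the bin width; a finer set of edges would be needed to resolve medians such as 12.6 or 15.8 mm.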

PDF and CDF of Rain in Bands.
When the frequency distribution was segmented in bands around the centre of the forecast and observed track, the rainfall distribution could be seen in more detail, and the effect of the track error on the analysis of the precipitation forecast could be decreased. The total rainfall was segmented into circular bands 100 km wide. Among those bands, the innermost one (0 to 100 km, including the eyewall) presents the heaviest precipitation, whereas the outer bands include stratiform zones and rainbands [7]. In Figures S7-S9  In the 300-400 km band, the NTHF correctly described the rainfall distribution for all forecast times and for most of the thresholds. The R-CLIPER indicated a significant bias in the interval from 1 to 10 mm, overestimating the small amounts and underestimating the remaining thresholds. Figure 12 summarises the thresholds showing the maximum frequency for the different bands at 24 (Figure 12(a)), 48 (Figure 12(b)), and 72 h (Figure 12(c)). The NTHF overestimated the heavy rain in the central part of the storm (0-100 km band). In the 24 h forecast, it underestimated the rain in the 100-200 km zone and overestimated the rain in the outer zones of the cyclone. In the 48 and 72 h forecasts by the NTHF, the rainfall threshold in the outer zone was overestimated and that in the mid radii was underestimated. Meanwhile, the R-CLIPER underestimated the observed maximum frequency thresholds.
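The band segmentation described above can be sketched in Python by assigning each grid point to a 100 km band according to its great-circle distance from the storm centre. The haversine formula, the mean Earth radius, and the 600 km outer limit are assumptions made for illustration; the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)
BAND_WIDTH_KM = 100.0     # band width quoted in the text

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def band_index(lat, lon, centre_lat, centre_lon, max_radius_km=600.0):
    """Return 0 for 0-100 km, 1 for 100-200 km, ...; None beyond the limit."""
    distance = great_circle_km(lat, lon, centre_lat, centre_lon)
    if distance >= max_radius_km:
        return None
    return int(distance // BAND_WIDTH_KM)

centre = (25.0, -80.0)  # hypothetical storm centre (degrees)
print(band_index(25.0, -80.0, *centre))  # 0
print(band_index(26.0, -80.0, *centre))  # 1 (about 111 km north)
print(band_index(28.5, -80.0, *centre))  # 3 (about 389 km north)
print(band_index(31.0, -80.0, *centre))  # None (about 667 km north)
```

Applying the same function to the forecast field with the forecast centre and to the observed field with the observed centre compares like-for-like bands, which is how the banding reduces the influence of track error.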

Extreme Rain Prediction.
To evaluate the efficacy of a system in describing the extreme rain amount, the threshold for which the observed precipitation has a 95% cumulative frequency (95th percentile) in the CDF must be determined. Subsequently, from the CDF predicted by the system, the percentile corresponding to a rain amount equal to the 95% value of the observations is to be extracted. If the percentage is smaller than 95%, it means that the system overestimates the extreme rain amounts. On the contrary, if the percentage is larger than 95%, the system underestimates the amount of extreme rain.
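The two-step procedure described above can be sketched in Python as follows. The nearest-rank percentile and the use of raw values rather than a binned CDF are simplifying assumptions, and the rainfall samples are hypothetical.

```python
def observed_percentile_value(values, percent):
    """Nearest-rank percentile of the observations (percent is an integer)."""
    ordered = sorted(values)
    rank = max(1, -(-percent * len(ordered) // 100))  # integer ceiling
    return ordered[rank - 1]

def forecast_percentile(forecast, threshold):
    """Percentage of forecast values at or below the threshold."""
    return 100.0 * sum(1 for v in forecast if v <= threshold) / len(forecast)

def diagnose(observed, forecast, percent=95):
    """Locate the observed extreme-rain threshold in the forecast CDF."""
    threshold = observed_percentile_value(observed, percent)
    reached = forecast_percentile(forecast, threshold)
    if reached < percent:
        verdict = "overestimates extreme rain"
    elif reached > percent:
        verdict = "underestimates extreme rain"
    else:
        verdict = "matches the observations"
    return threshold, reached, verdict

observed = [float(v) for v in range(1, 101)]  # hypothetical rainfall (mm)
too_wet = [2.0 * v for v in observed]
too_dry = [0.5 * v for v in observed]

print(diagnose(observed, too_wet))  # (95.0, 47.0, 'overestimates extreme rain')
print(diagnose(observed, too_dry))  # (95.0, 100.0, 'underestimates extreme rain')
```

A forecast that reaches 100% at the observed 95th percentile threshold, as in the second case, never produces rain amounts in the observed extreme range.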
From the CDF of the 0-100 km band, which contains the zone with the largest precipitation, the NTHF behaved similarly to the observations at 24 h; however, for 48 and 72 h, it overestimated the part of the precipitation corresponding to large accumulations. The R-CLIPER could not forecast large rain amounts because the threshold corresponding to 95% of the observed cumulative frequencies corresponded to 100% of the cumulative frequency calculated by the system, implying that it cannot predict extreme rain events.  [7] for three dynamical models. The values for the R-CLIPER were between 0.2 and 0.4, which were determined by the low values of the ETS (∼0.15). This poor performance may be associated with the asymmetry in the rain distribution of the cyclones considered in the sample. These asymmetries can be predicted by the numerical model because the parameterizations used (involving microphysics and convection processes) yielded a close agreement between the model and reality. In the climatological model, the assessment of the rain amounts was based on a symmetric distribution of rain, which deviates significantly from reality. For the larger radii, although the R-CLIPER estimates were closer to the TRMM output, the ETS was extremely low, resulting in a low ability for predicting the local rain distribution. The values of pattern matching depended significantly on the track forecast error. In [18], it was demonstrated that the NTHF can forecast the cyclone track in the first 48 h with an error similar to that reported by the NHC in its official forecast until 2016. Therefore, the ETS of the NTHF and GFDL were similar (see [7]). For the 72 h forecast, the value was 0.48, which was similar to the evaluation of GFDL by Marchok et al. [7], i.e., 0.46. The inclusion of vortex relocation in the numerical system will improve this ability.
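The sensitivity of the ETS to displacement errors can be illustrated with a short Python sketch. It assumes the standard contingency-table definition of the equitable threat score for a single rain threshold; how the ETS is combined across thresholds into the pattern index is not specified here, and the rainfall values are invented for illustration.

```python
def equitable_threat_score(forecast, observed, threshold):
    """ETS for one threshold from hits, misses, and false alarms."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        forecast_yes, observed_yes = f >= threshold, o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1
        elif forecast_yes:
            false_alarms += 1
    # Hits expected by chance given the forecast and observed event counts.
    hits_random = (hits + misses) * (hits + false_alarms) / len(observed)
    denominator = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denominator if denominator else 0.0

# Ten grid points along a line; the observed rain band covers the first four.
observed = [30.0, 30.0, 30.0, 30.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
perfect = list(observed)
shifted = [0.0, 0.0, 30.0, 30.0, 30.0, 30.0, 0.0, 0.0, 0.0, 0.0]  # displaced band

print(equitable_threat_score(perfect, observed, 10.0))            # 1.0
print(round(equitable_threat_score(shifted, observed, 10.0), 3))  # 0.091
```

The shifted forecast contains exactly the right amount of rain but scores poorly because half of it falls in the wrong place, which is consistent with the dependence of pattern matching on the track forecast error.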

Average Rainfall and Its Distribution.
The NTHF performed better than the R-CLIPER in forecasting the mean rain. This is due to its superior performance in the two indices that compose the mean rain skill index. Its MREI value was ∼0.9, larger than that of the R-CLIPER (∼0.7). Furthermore, its CDF − MVI, related to the central value in the rainfall distribution, was ∼0.7, whereas that for the R-CLIPER was only 0.5.
These results indicate that, although the R-CLIPER represented the average radial distribution of rain effectively, it presented limitations in systems with uneven radial distributions of rain in different directions, which are associated with the occurrence of bands with intense rain and zones of almost no rain, particularly at radii below 400 km. These asymmetries caused the R-CLIPER to predict unrealistically large areas with extremely light rainfall, whereas the dynamical models reproduced the asymmetries. For instance, the values obtained by the NTHF (0.8-0.9) highlight the correct representation of rainfall for this system, even for cyclones with conditions that generate large asymmetries due to interactions with frontal bands, a complex topography, and/or wind shear. The obtained values compared well with the value (0.74) calculated in [7] for GFDL. The abovementioned factor also resulted in the higher ETS for the NTHF compared with the R-CLIPER. The quality of the predicted rain volume in the observation zone (radius 600 km around the track centre) was similar to that obtained by Marchok et al. [7] for 72 h using the GFDL model. The bias of 10 km³ in this study was close to the value of 9.4 km³ obtained in [7]. It is noteworthy that for the GFS (5.2 km³) and NAM (0.9 km³) models, the bias was smaller, which coincided with the results of Brennan et al. [41]. In that study, TCs from 2005 to 2007 were investigated; it was found that the NAM presented a small bias but indicated limitations in predicting extreme amounts of rain. Furthermore, it was found that the GFS and the European model ECMWF (European Centre for Medium-Range Weather Forecasts) performed well. The values of the MAE and BIAS for the mean rainfall were similar to those of Tuleya et al. [42]. An average BIAS of 8.3 mm and an MAE of 23 mm at 72 h of forecast were obtained using the GFDL for 25 TCs. The R-CLIPER (see Figure 8(b)) underestimated the total volume, reaching a maximum BIAS of 32 km³ at 72 h.
The reasons for the divergence are explained above. The differences in the distribution for the NTHF (overestimating the frequency in the smaller thresholds) are a limitation inherited from the HWRF model, a deficiency associated with the cumulus parameterization. Additionally, it is clear that the NTHF overestimated the maximum in the 0-100 km band (around the eyewall) compared with the observations. The R-CLIPER underestimated the frequencies in all bands, with larger differences found at radii up to 400 km owing to the large asymmetries observed. For radii above 400 km, where the rain distribution was more uniform, the differences were smaller. It is clear from Figure 12 that the maximum threshold decreased with the radius, and a radial profile exhibiting the behaviour obtained by Lonfat et al. [6] was obtained.
It is important to check these results by comparing the TRMM observations with gauge and radar data, because it has been reported [14,27] that satellite rainfall estimates (particularly TRMM) present biases when compared with gauge data. The final assessment of the NTHF bias should involve these comparisons.

Extreme Rain Amount.
By comparing the threshold value of the 95th percentile in the observation with the percentile corresponding to this value for the forecast and by calculating the maximum value index, it can be concluded that the NTHF can accurately predict (MVI∼1.0) extreme rain for a 24 h forecast. For 48 and 72 h forecasts (see Figures S8 and S9, Supplementary Materials), the values of extreme rain calculated by the model were approximately 9-11% in the CDF, which is a large overestimation, mainly in the 0-100 km band. This is attributable to the increase in the track forecast error for the sample used in this study (see Figure 3) and also to the representation of microphysics models at the resolution used [7]. These results were similar to those in [43], which reported an overestimation of approximately 8% rainfall in the vicinity of the storm core. For these time intervals, the system did not perform well. In the future, the innermost domain (9 km) of the NTHF will be evaluated to achieve a better representation of rain-generating processes, particularly those associated with extreme rains.

Conclusions
The NTHF system was evaluated for use in the QPF associated with TCs. It demonstrated the best performance up to a 24 h forecast. Its performance was of the same quality as that of other numerical systems reported in the literature.
Compared with the TRMM, the system underestimated the rain volume up to 42 h and overestimated it in the subsequent hours. At 24 h, the MAE and RMSE were 16 and 27 mm, respectively, both of which increased with time. The indices of the QPF were approximately 0.48 for determining the rain pattern and 0.80-0.90 for the mean rain and rain flux distribution at all the forecast hours, similar to the results of a previous study. Regarding the prediction of extreme rain, it only performed well for 24 h. In future studies, we plan to perform data assimilation to improve the precision in the beginning of the observation period. Furthermore, we plan to perform a hot start in the system initialisation. Additionally, we will include vortex relocation in the model to obtain the rain values, assuming that the simulated field coincides with the real field. Comparisons with real gauge and radar data will be performed to evaluate the local performance of the system. Finally, we will perform comparisons with other precipitation reanalysis databases, such as NARR, CFSR, and ERA5 for a greater generalisation of the results.
Data Availability
The data used in this article to carry out the validation are public and are available at http://www.pmm.nasa.gov/dataaccess/downloads/trmm.shtml. The outputs of the NTHF model can be reproduced by performing the simulations with the initialisation data. The outputs of the GFS at 0.5° resolution are obtained from https://nomads.ncdc.noaa.gov/data/gfs4/. Finally, the actual tropical cyclone trajectories are public and they are available at https://www.nhc.noaa.gov/data/#hurdat.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.