Evaluation of High-Resolution Multisatellite and Reanalysis Rainfall Products over East Africa

The performance of six satellite-based and three newly released reanalysis rainfall estimates is evaluated at a daily time scale and a spatial grid size of 0.25 degrees for the period 2000 to 2013 over the Upper Blue Nile Basin, Ethiopia, with a view to improving the reliability of precipitation estimates for the wet (June to September) and secondary rainy (March to May) seasons. The study evaluated both adjusted and unadjusted satellite-based products of TMPA, CMORPH, and PERSIANN, and ECMWF ERA-Interim reanalysis, as well as Multi-Source Weighted-Ensemble Precipitation (MSWEP) estimates. Among the six satellite-based rainfall products, adjusted CMORPH gives the most accurate wet season rainfall estimate. In the secondary rainy season, unadjusted CMORPH and 3B42V7 are nearly equivalent in terms of bias, POD, and CSI. All error metrics show that MSWEP outperforms both the unadjusted and gauge-adjusted ERA-Interim estimates. The magnitude of the error metrics increases linearly with increasing percentile threshold values of the gauge rainfall categories. Overall, all precipitation datasets need further improvement in detecting high rainfall intensities. MSWEP detects higher percentile values better than the satellite estimates in the wet season but more poorly in the secondary rainy season.


Introduction
Rainfall is an important parameter for the characterization of the water cycle. In Africa, assessment, planning, and management of water resources are often constrained by a lack of reliable rainfall data [1][2][3]. One of the reasons is that the spatial and temporal availability of rain gauge networks in Africa, and in Ethiopia in particular, is deteriorating by the year. As a result, the density and spatial distribution of rain gauges in the Upper Blue Nile Basin are uneven and time-varying. The availability of satellite-based and reanalysis global precipitation estimates is steadily rising; these offer precipitation datasets at high spatial and temporal resolution, which could potentially support research and operational water resources applications in these data-poor environments.
In addition to the satellite-based precipitation estimates, newly released European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis precipitation estimates and Multi-Source Weighted-Ensemble Precipitation (MSWEP) [4][5][6], at a spatial resolution of 0.25-degree grid size, are available at the eartH2Observe web site (http://www.earth2observe.eu). eartH2Observe is a Global Earth Observation for Integrated Water Resource Assessment project funded by the European Union to integrate available earth observations, in situ datasets, and models in order to construct a consistent global water resource reanalysis.
Both satellite-based and reanalysis precipitation estimates exhibit significant bias, which needs to be postprocessed [7,8] before these data are used in hydrological applications. There is also a pressing need for a better long-term rainfall dataset to be used in water resource analysis of the Upper Blue Nile Basin. However, the reanalysis precipitation datasets mentioned here have not yet been evaluated over the study domain in terms of accuracy and systematic error.

Advances in Meteorology
Figure 1: The Upper Blue Nile Basin in Eastern Africa, overlaid with the rain gauge network and the global precipitation estimates grid. Grid pixels highlighted in gray are those that include at least one gauge. (Map legend: rain gauge location, Lake Tana, Blue Nile River, Upper Blue Nile Basin.)
Past rainfall studies in the region and elsewhere [9][10][11][12][13][14][15][16] have demonstrated the challenges of satellite precipitation estimation over mountainous regions, and results have shown that satellite-based precipitation estimates (SPEs) generally underestimate heavy precipitation events, with a slight overestimation in some locations. Few studies conducted at a daily time scale for the Upper Blue Nile Basin [17][18][19][20][21] have evaluated the performance of satellite estimates. The evaluation error metrics at daily time scale have been reported for a period of two consecutive wet seasons [18,21] and for the period 2002 to 2006 [19,20], with a view to improving rainfall retrieval algorithms.
This study attempts to evaluate six commonly available satellite-based precipitation estimates (SPEs) and three newly available reanalysis precipitation estimates at 0.25-degree spatial grid size and daily temporal resolution, with a view to improving the reliability of precipitation estimates for the wet season (June to September) and secondary rainy season (March to May) over the Upper Blue Nile Basin. Evaluating state-of-the-art reanalysis products will allow us to understand their current strengths and limitations for water resources evaluations in the region. The results will also contrast the potential benefits and limitations of satellite-based and reanalysis precipitation estimates, information that can be helpful for approaches that blend the two.

Study Region and Data Type
2.1. Study Area. The Upper Blue Nile Basin, locally called "Abbay" in Ethiopia, is located within 7.5° to 12° north and 34° to 40° east (Figure 1). The basin has a complex topography, ranging from lowlands (∼500 m.a.s.l.) near the Ethio-Sudan border to mountain ranges (∼4250 m.a.s.l.) in the central highlands. The basin has an area of about 177,000 km², accounting for 17% of Ethiopia's land mass and nearly 7% of the Nile Basin surface area. The basin is the major source of the Nile water resources, contributing more than 60% of the overall Nile River flow at Aswan Dam in Egypt [1].
Past studies have demonstrated the difficulty of efficiently evaluating the water resources of the Blue Nile, owing to its complex terrain and the lack of adequate data, mainly precipitation, at subbasin and short time scales. Existing rain gauge observations are sparse in both time and space within the basin. This limits insight into water resource availability when assessing the impacts and benefits of major development interventions on the water resources management of the basin.
The rain-producing climate systems and rainfall characteristics of the study region have been described in several studies [22][23][24][25]. The period from June to September represents the wet season of the region, which receives almost 70% of the annual rainfall amount [26].

Observed Rainfall Data.
The surface rainfall observations are obtained from a network of 153 National Meteorological Agency (NMA) stations within the Upper Blue Nile Basin at a daily time scale. The distribution of these rainfall gauging stations is uneven, with very limited coverage in time and a spatial distribution that follows the local road network and major towns (Figure 1). Stations with more than 80% of the historic record available for the rainfall seasons were considered for the analysis. Based on this criterion, 126 station records passed a quality control (QC) process and were used as the reference in this study. The remaining gauges were discarded. Evaluation of the precipitation estimates is carried out at the 0.25-degree regular grid pixels represented by at least one gauge observation. A total of 92 satellite grid boxes that contain gauge observations were considered for the period 2000-2013. The observed gauge rainfall data were interpolated using the ordinary kriging (OK) algorithm to produce rainfall fields at 0.05-degree grid size, which were then aggregated to the 0.25-degree grid box and considered as the reference areal gauge rainfall to evaluate the satellite-based and reanalysis precipitation datasets [27].
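The aggregation step from the 0.05-degree kriged field to the 0.25-degree reference grid can be sketched as a simple block average. This is a minimal illustration, assuming the fine grid is aligned with the coarse grid so that each 0.25-degree box contains exactly 5 x 5 fine cells; the function name `aggregate_to_coarse` is hypothetical, the kriging step itself is not shown, and this is not necessarily the procedure the authors implemented.

```python
import numpy as np

def aggregate_to_coarse(fine, factor=5):
    """Block-average a 2-D rainfall field (hypothetical helper).

    Assumes the fine grid nests exactly in the coarse grid, e.g.
    factor=5 maps 0.05-degree cells onto 0.25-degree boxes.
    """
    fine = np.asarray(fine, dtype=float)
    ny, nx = fine.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    # Split each axis into (coarse index, index within the coarse box).
    blocks = fine.reshape(ny // factor, factor, nx // factor, factor)
    # Average over the two within-box axes.
    return blocks.mean(axis=(1, 3))
```

A plain mean is used here, so any missing value (NaN) in a fine cell propagates to its coarse box; a different treatment of gaps would be a separate design choice.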

Satellite-Based and Reanalysis Precipitation Products.
The satellite products used in the analysis are commonly used in operational and research activities focusing on water resources planning, design, and decision-making in the basin. This study evaluated six of the main satellite-based precipitation estimates (SPEs) at 0.25-degree spatial grid size and daily time scale for the period 2000 to 2013 (Table 1). The SPEs are (1) the National Aeronautics and Space Administration (NASA) Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) [7,28]; (2) the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center (CPC) morphing technique (CMORPH) [29]; and (3) the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) [30] technique.
The TMPA version-7 precipitation datasets were released in December 2012 [31] with a spatial grid resolution of 0.25 degrees and a nominal 3-hourly time scale. The dataset combines geostationary thermal infrared (IR) and low-earth orbiting (LEO) passive microwave (MW) sensor precipitation estimates. The MW sensor data come from the TRMM Microwave Imager (TMI), the Special Sensor Microwave Imager (SSMI) on the Defense Meteorological Satellite Program (DMSP) satellites, the Special Sensor Microwave Imager/Sounder (SSMI/S, only for the research product), the Advanced Microwave Sounding Unit-B (AMSU-B), the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) on Aqua, and the Microwave Humidity Sounder (MHS). Descriptions of the TMPA sensors can be found in [32].
The TMPA algorithm includes four steps [28]: (1) the MW data are converted into instantaneous rain rates at the individual sensors' fields of view, and the resulting datasets are calibrated and combined to produce 0.25-degree grid size MW estimates at 3-hourly time steps; (2) the IR-based rain-rate estimates are derived using a MW-precipitation-calibrated algorithm; (3) the near-real-time (RT) version-7 unadjusted precipitation estimate (hereafter, 3B42RT) is created by combining the two products; and (4) the post-RT gauge-adjusted research product (hereafter, 3B42V7) is produced [7,28,33].
The CMORPH product is created by a morphing method that combines MW precipitation estimates with IR sensor observations. The IR image data are used to propagate the MW-based precipitation estimates forward and backward in time between successive MW sensor observations [29]. CMORPH estimates are available as a postprocessed gauge-adjusted product (hereafter, CM) and as a near-real-time product (hereafter, CM-unadj).
The PERSIANN precipitation datasets are created from IR brightness temperature observations using an artificial neural network method [34,35]. The model is calibrated against MW rainfall estimates provided by the TMPA satellite through a procedure that adjusts model parameters iteratively for rain rate at 0.25-degree spatial grid size [30].
In addition to the SPEs, we examined unadjusted and gauge-adjusted ERA-Interim and Multi-Source Weighted-Ensemble Precipitation (MSWEP) estimates, as indicated in Table 1. The MSWEP dataset has been created using an optimal combination of the highest quality data from SPEs, reanalysis data sources, and gridded gauge observations. These include three satellite estimates (CMORPH, 3B42RT, and the Global Satellite Mapping of Precipitation Microwave-IR Combined Product, GSMaP-MVK), two atmospheric model reanalyses (ERA-Interim and the Japanese 55-year Reanalysis, JRA-55), and gauge observations [6].

Performance of Rainfall Detection.
The rainfall event detection capability was examined. The skill of the global rainfall products in detecting daily rainfall accumulation greater than 0.1 mm [36] is evaluated using Probability of Detection (POD), False Alarm Ratio (FAR), and Critical Success Index (CSI). POD represents the fraction of observed rainfall events correctly detected, with a perfect value of 1. Similarly, FAR represents the fraction of rainfall events reported by the products that were not observed, with a perfect value of 0. CSI describes the skill of the rainfall estimate by combining the marginal error metrics of POD and FAR, with a perfect value of 1. The three error metrics are defined from the contingency table (Table 2):
$$\mathrm{POD} = \frac{H}{H + M}, \qquad \mathrm{FAR} = \frac{F}{H + F}, \qquad \mathrm{CSI} = \frac{H}{H + F + M}, \tag{1}$$
where $H$, $F$, and $M$ represent hit, false detection, and missed rainfall, respectively. The rainfall detection metrics do not provide the volume of rainfall correctly or incorrectly detected. The additional error metrics of missed rainfall volume fraction (MRV) and falsely detected rainfall volume fraction (FRV) are useful in evaluating the satellite-based and reanalysis rainfall [10]. MRV measures the ratio of missed rainfall volume to the total observed rainfall volume, while FRV measures the ratio of falsely detected rainfall volume to the total rainfall volume for the period examined.
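The detection metrics described above can be sketched for a pair of daily series as follows. This is an illustrative sketch, not the authors' code: the function name `detection_metrics` is hypothetical, POD, FAR, and CSI use their standard contingency-table forms, and FRV is normalized here by the total gauge volume, which is an assumption because the text does not state the denominator unambiguously.

```python
import numpy as np

def detection_metrics(sat, gauge, threshold=0.1):
    """Categorical and volumetric detection scores (hypothetical helper).

    sat, gauge: daily rainfall (mm) at one grid box, same length.
    A day counts as rainy when rainfall exceeds `threshold`.
    """
    sat = np.asarray(sat, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    sat_rain = sat > threshold
    gauge_rain = gauge > threshold

    hit = sat_rain & gauge_rain      # rain estimated and observed
    false = sat_rain & ~gauge_rain   # rain estimated, none observed
    miss = ~sat_rain & gauge_rain    # rain observed, none estimated
    h, f, m = hit.sum(), false.sum(), miss.sum()

    total_gauge = gauge.sum()        # assumed normalizer for MRV and FRV
    return {
        "POD": h / (h + m),
        "FAR": f / (h + f),
        "CSI": h / (h + f + m),
        "MRV": gauge[miss].sum() / total_gauge,
        "FRV": sat[false].sum() / total_gauge,
    }
```

The sketch does not guard against series with no rainy days, for which the ratios are undefined.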

Statistical Error Metrics.
The quantitative comparison of satellite rainfall against gauge observations is based on statistical error metrics. We use the following error metrics to evaluate the performance of the satellite-based and reanalysis products: the bias ratio (bias), the Pearson correlation coefficient (CC), and the normalized root-mean-square error (NRMSE).
The bias ratio is computed as $\bar{S}/\bar{G}$, where $\bar{S}$ and $\bar{G}$ are the mean of satellite and gauge rainfall, respectively. It measures the systematic error component, with a perfect score of 1; a value less than or greater than one indicates underestimation or overestimation, respectively. In a similar way, the MRE error metric indicates the magnitude of under- or overestimation, with a perfect score of zero. The Pearson correlation indicates the linear association between observations and model estimates.
The NRMSE measures the variation of the random error component. Furthermore, the performance of the satellite products is evaluated for different rainfall magnitudes, conditional on different reference gauge rainfall threshold values. These threshold values correspond to the 10th, 25th, 50th, 75th, 90th, and 95th percentiles of the reference gauge rainfall.
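The quantitative metrics can be sketched as below for one grid box. This is a hedged illustration: the function name `error_metrics` is hypothetical, NRMSE is normalized by the mean reference rainfall as described later in the text, and MRE is written as the relative volume error (which equals the bias ratio minus one), an assumed form since the defining equation is not reproduced in the text.

```python
import numpy as np

def error_metrics(sat, gauge):
    """Bias ratio, MRE, Pearson CC, and NRMSE (hypothetical helper)."""
    sat = np.asarray(sat, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    gauge_mean = gauge.mean()
    diff = sat - gauge
    return {
        # Ratio of mean satellite to mean gauge rainfall; 1 is unbiased.
        "bias": sat.mean() / gauge_mean,
        # Assumed form of MRE: relative volume error; 0 is perfect.
        "MRE": diff.sum() / gauge.sum(),
        # Pearson linear correlation coefficient.
        "CC": np.corrcoef(sat, gauge)[0, 1],
        # Root-mean-square error normalized by the mean gauge rainfall.
        "NRMSE": np.sqrt(np.mean(diff ** 2)) / gauge_mean,
    }
```

For example, a product that reports exactly half of the gauge rainfall on every day would give a bias ratio of 0.5, an MRE of -0.5, and a CC of 1, which shows why the bias and correlation metrics need to be read together.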

Evaluation of Mean Seasonal Rainfall Patterns.
The Upper Blue Nile Basin has a marked wet season from June to September and a secondary rainy season from March to May. The climatological characteristics and seasonal rainfall driving systems of the region have been discussed in several past studies [22][23][24][26]. Spatial patterns of mean seasonal rainfall, obtained from the SPEs and global reanalysis products, are presented in Figures 2 and 3 for the wet and secondary rainy seasons, respectively.
The gauge-adjusted SPEs, 3B42RT, CM-unadj, and MSWEP show an equivalent spatial pattern with two regions of peak values (>1000 mm), whereas adjusted ERAI shows a similar spatial pattern without a distinct rainfall peak in the basin. ERAI-unadj estimates a mean seasonal rainfall above 1000 mm in most parts of the basin and exhibits a stronger overestimation of the wet season rainfall than the other products. PNN-unadj does not capture the spatial pattern of rainfall as well as the other products and underestimates the seasonal rainfall amount in the southern and eastern parts of the study domain. The mean rainfall pattern of the secondary rainy season is illustrated in Figure 3. All products give a similar spatial distribution of the seasonal rainfall, indicating that the southern part of the domain received relatively more rainfall during the season. The PNN-unadj product gave lower seasonal rainfall amounts than the other products (Figure 3).
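The mean seasonal rainfall maps discussed above can be derived from a stack of daily grids by summing each year's season and averaging over years. The sketch below assumes a `(time, y, x)` array with matching daily dates; the function name `mean_seasonal_total` and the array layout are illustrative assumptions, not details given in the paper.

```python
import numpy as np

def mean_seasonal_total(daily, dates, months):
    """Mean seasonal rainfall total per grid cell (hypothetical helper).

    daily:  array of shape (time, y, x), daily rainfall in mm
    dates:  array of numpy datetime64 days, one per time step
    months: calendar months in the season, e.g. (6, 7, 8, 9)
    """
    daily = np.asarray(daily, dtype=float)
    dates = np.asarray(dates, dtype="datetime64[D]")
    # Calendar month (1-12) and year of each time step.
    month = dates.astype("datetime64[M]").astype(int) % 12 + 1
    year = dates.astype("datetime64[Y]").astype(int) + 1970
    in_season = np.isin(month, months)
    # One seasonal total per year, then the mean over years.
    totals = [daily[in_season & (year == y)].sum(axis=0)
              for y in np.unique(year[in_season])]
    return np.mean(totals, axis=0)
```

Calling it with `months=(6, 7, 8, 9)` gives the wet season field and `months=(3, 4, 5)` the secondary rainy season field.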

Rainfall Detection.
We examined the performance of the six SPEs and the three reanalysis products using the categorical statistics POD, FAR, and CSI. These statistics, computed from the rain/no-rain events of the contingency table, are used to evaluate the skill of the products in detecting rainy events [37]. The results are shown as box plots (Figure 4) to allow visual inspection of the spread of rainfall detection skill across the study area. Each box ranges from the 25th to the 75th percentile; the middle line in the box shows the median and the dot represents the mean value of the error metric, as reported in Tables 3(a) and 3(b); the plus signs indicate values beyond the whiskers.
Results from the SPEs showed that the CMORPH products scored the highest mean POD (92%), indicating better skill in detecting rainfall events in the wet season, while the mean POD of 3B42V7 is about 83%. 3B42RT and gauge-adjusted PERSIANN are equivalent, with a mean POD of about 78%. The precipitation-estimating algorithms differ among the SPEs (as explained in the methodology section). The MW-based products detect precipitation events better than the IR-based estimates. Both CMORPH products result from propagation and morphing of MW-based estimates and are superior in rainfall event detection among the six SPEs in this study (Figure 4(a)); the plus symbols below the whiskers show that the detection performance is nearly 70% to 80% at some locations. All SPEs are nearly equivalent in wet season FAR (<1%). The results indicate that the CMORPH products capture most of the rainy events better than the TMPA and PERSIANN products in the study domain. This indicates that IR-based rainfall retrieval algorithms have major limitations in complex topographic regions, while the MW-based rainfall is more physically based and free of cold-surface (snow) effects [17,38]. Cloud-top brightness temperature does not always correlate well with gauge rainfall amount; cirrus clouds and nonprecipitating cold clouds can easily mislead the IR-based estimates [29].
In the March to May rainy season, the TMPA and CMORPH products perform nearly equivalently, with a mean POD of 62-67% and higher values at some locations. The mean FAR is relatively higher, about 20% for TMPA and CM-unadj and 16% for CM. The PERSIANN products exhibit a lower mean FAR (<10%). Figure 4(d) shows that TMPA and CMORPH exhibit larger false alarm ratios at some gauge locations. Figure 5 shows the missed and falsely detected rainfall volumes of the SPEs; the CMORPH products exhibit a lower MRV (∼4%), while all other SPEs have mean values of 10-15% in the wet season. TMPA and CMORPH have equivalent MRV in the small rainy season, about 18-20%, whereas the PERSIANN products have an MRV above 30%.
In addition to the SPEs, the POD and FAR results for the reanalysis products are shown in Figures 4(a)-4(d), indicating that the reanalysis products outperformed the SPEs, with MSWEP having the highest POD in both rainfall seasons. The mean POD values are 98-99% in the wet and 82-92% in the small rainy seasons. On the other hand, the reanalysis products exhibit a slightly higher mean FAR than the SPEs in both the wet and small rainy seasons. The three reanalysis products have a mean MRV <0.5% in the wet season, while in the small rainy season the MRV is ∼3% for MSWEP and ∼8% for the other two. This indicates that the reanalysis products are relatively better at avoiding missed rainfall volume in both seasons. The mean value of FRV indicates that the reanalysis products detected a relatively higher volume of false rainfall in the domain.
The CSI results showed that, during the wet season, the CMORPH products correctly detected 91% of rainfall events, followed by 3B42V7 (82%). The three

Rainfall Quantification Error Metrics.
The error analysis is done on a grid-cell-by-grid-cell basis for the unconditional case, where the reference gauge rainfall is ≥ 0.1 mm, and the averages over the study domain are described. The spread of the error metrics CC, bias ratio, and NRMSE across grid boxes is illustrated in Figure 7 for visual inspection, and mean values of the metrics are reported in Table 3. Results showed that the SPEs underestimated the wet season reference gauge rainfall. The gauge-adjusted SPEs perform better in terms of bias ratio, with 3B42V7 (88%), CM (87%), and PNN (82%), which indicates that ingesting gauge observations into the model improves the SPEs (Figure 7(a)). In the secondary rainy season, the TMPA products and CM-unadj overestimated and CM and PNN underestimated the rainfall (Figure 7(b)); PNN-unadj underestimated both the wet and secondary rainy seasons by nearly 50%.
The CC statistics showed that all products are nearly equivalent in both the wet (CC ∼0.3) and secondary rainy (CC ∼0.4) seasons (Figures 7(c) and 7(d)). The low linear relation between SPEs and gauge observations could be attributed to the inherent sampling nature of point observations at a daily time scale. The NRMSE is the root-mean-square error normalized by the mean reference rainfall; a lower value indicates less error variance (Figures 7(e) and 7(f)). The mean value of NRMSE ranged from 0.8 to 1.1, with the lower values corresponding to PERSIANN (0.81) and MSWEP, which indicates that the spread of the random error component is relatively lower for these products.
Recently, Abera et al. [39] compared point gauge observations with SPEs. The authors reported that CM-unadj was the most biased product, with an underestimation of about 72% in the domain, which seems unrealistically high. This is quite far from our findings and inconsistent with previous studies on the performance of the SPEs in the region [12, 17-21, 38, 40-42]. Our findings indicate that the magnitude of the error was lower than the results reported in previous studies. This results from considering a longer period (2000 to 2013), using all the available gauge rainfall data, and using an areal representation of gauge rainfall for the comparison.
The gauge-unadjusted ERA-Interim overestimated and is the most biased product in the wet season (Figure 2), whereas the other two reanalysis products have bias ratios nearly equivalent to those of 3B42V7 and CM. During the small rainy season, MSWEP is nearly unbiased, while the other products show slight under- and overestimation, as shown in Figure 7(b) and Table 3. The mean values of CC and NRMSE of MSWEP and the satellite-based products were nearly equivalent in both seasons.

Error Metrics for Different Gauge Rainfall Threshold Categories.
The performance of the SPEs and reanalysis products was evaluated with different percentile threshold values of gauge rainfall categories for the two rainy seasons (Figure 8). The error metrics were examined at six percentile threshold values corresponding to the 10th, 25th, 50th, 75th, 90th, and 95th percentiles. The corresponding percentiles of gauge rainfall values were 1.9, 4.2, 7.8, 12.3, 17.3, 20.8, and 29.1 and 0.2, 0.6, 1.9, 4.9, 9.1, 12.2, and 19.4 mm/day for the wet (June to September) and secondary rainy (March to May) seasons, respectively. In the wet season, ERAI-unadj overestimated below the median (by 30% at the 10th percentile) and exhibits lower underestimation for higher threshold values. The other eight products show an increasing magnitude of MRE with increasing percentile threshold values. ERAI, MSWEP, CMORPH, and 3B42V7 are nearly equivalent, with a lower magnitude of MRE (10-20%) below the 25th percentile, whereas CMORPH and 3B42V7 are better at capturing the higher percentile threshold values.
In the secondary rainy season, TMPA and CM-unadj show a slight overestimation (<∼5%) below the 25th percentile and outperform the other products, with a lower magnitude of underestimation for higher thresholds. For the reanalysis estimates, the magnitude of underestimation increases sharply for higher quantiles. PNN-unadj exhibits the highest underestimation in both seasons. The CC values are relatively low and decrease with increasing gauge rainfall threshold. ERAI and ERAI-unadj have a lower CC in both rainfall seasons.
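The threshold-conditional evaluation can be sketched as follows. This is a minimal sketch under stated assumptions: thresholds are taken as percentiles of the rainy-day (≥ 0.1 mm) gauge values, each category keeps the days on which the gauge rainfall is at or above the threshold, and MRE is the relative volume error; the function name `conditional_mre` and these selection rules are illustrative choices, not confirmed details of the authors' procedure.

```python
import numpy as np

def conditional_mre(sat, gauge, percentiles=(10, 25, 50, 75, 90, 95),
                    rain_min=0.1):
    """MRE conditional on gauge percentile thresholds (hypothetical helper).

    Returns {percentile: (threshold_mm, mre)}.
    """
    sat = np.asarray(sat, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    # Thresholds come from the rainy-day reference distribution (assumed).
    rainy = gauge >= rain_min
    thresholds = np.percentile(gauge[rainy], percentiles)
    result = {}
    for p, t in zip(percentiles, thresholds):
        sel = gauge >= t  # days at or above this gauge threshold
        mre = (sat[sel] - gauge[sel]).sum() / gauge[sel].sum()
        result[p] = (float(t), float(mre))
    return result
```

A negative MRE that grows in magnitude from the 10th to the 95th percentile would correspond to the increasing underestimation of heavy rainfall described above.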

Conclusion
The results of this study provide an examination of the performance of satellite-based and newly released reanalysis rainfall products at a daily time scale for the wet and secondary rainy seasons during the period 2000 to 2013 over the Blue Nile Basin. The study utilized six adjusted and unadjusted satellite precipitation products (TMPA, CMORPH, and PERSIANN), two ERA-Interim reanalysis products, and MSWEP, which is a blended product that combines satellite, reanalysis, and gauge precipitation. The evaluation was carried out on the 0.25-degree regular grid for grid boxes containing at least one gauge observation. A comparison using point observations or the averaged rainfall of unevenly distributed station networks at a daily time step could substantially affect the values of the error metrics. We used an unbiased linear estimator, the ordinary kriging (OK) algorithm, to produce an areal representation of gauge rainfall at 0.25-degree grid size for comparison at a daily time scale. Based on the categorical and quantitative error metrics used in our analysis, we summarize our findings as follows:
(1) The categorical error metrics for event detection showed that the CMORPH products have a higher POD and are better at detecting rainfall events in the wet season, while the TMPA and CMORPH products are nearly equivalent during the secondary rainy season. FAR is below 1% for all products during the wet season and 8 to 20% during the small rainy season. The MSWEP product outperformed the satellite-based and the other reanalysis products, with the highest POD in both rainfall seasons; the reanalysis products exhibit a slightly higher mean FAR than the satellite-based products in the wet and small rainy seasons.
(2) In terms of the volume of missed and falsely detected rainfall, CMORPH has a lower MRV than the TMPA and PERSIANN products. All products are nearly equivalent in the percentage of FRV. The ECMWF reanalysis and MSWEP products are relatively better than the satellite products at avoiding missed rainfall volume in both seasons. The mean value of FRV indicates that the reanalysis products detected a slightly higher volume of false rainfall.
(3) The CSI results showed that the CMORPH products outperform the other satellite products in correctly detecting rainfall events in the wet season. During the secondary rainy season, CMORPH and 3B42V7 are nearly equivalent. The three newly released reanalysis products outperform the satellite estimates in correctly detecting rainfall events in both seasons.
(4) The bias ratio results showed that the satellite-based rainfall products underestimated the gauge precipitation of the wet season and slightly overestimated that of the small rainy season. The CC statistics showed that all products are nearly equivalent in both seasons. The spread of the random error component was shown to be slightly higher for the TMPA products.
(5) Among the reanalysis products, the error metric statistics show that MSWEP outperforms the ERAI-unadj and ERAI estimates.
(6) The magnitude of the error metrics increases linearly with increasing percentile threshold values of the gauge rainfall categories. 3B42V7 and CM are relatively better at capturing higher percentiles in the wet season, while CM-unadj is better at capturing higher percentile

Figure 2: Spatial pattern of mean seasonal rainfall (mm) for the period June to September.

Figure 3: Spatial pattern of mean seasonal rainfall (mm) for the period March to May.

Figure 4: Categorical statistics for (a) POD for June to September, (b) POD for March to May, (c) FAR for June to September, and (d) FAR for March to May.

Figure 5: MRV and FRV for June to September (a and c) and March to May (b and d).

Figure 6: CSI for June to September (a) and March to May (b).

Figure 7: Statistical error metrics of bias ratio (a and b), CC (c and d), and NRMSE (e and f) for the rainfall seasons.

Table 1: Summary of the satellite-based and reanalysis rainfall products used in this study.

Table 3(a): Mean value of statistical error metrics for the wet (June to September) season.