Downscaling of Open Coarse Precipitation Data through Spatial and Statistical Analysis, Integrating NDVI, NDWI, Elevation, and Distance from Sea

This study aims to improve the statistical spatial downscaling of coarse precipitation data (the TRMM 3B43 product) and to explore its limitations in the Mediterranean area. It was carried out in Morocco and was based on an open dataset including four predictors (NDVI, NDWI, DEM, and distance from sea) that explain the TRMM 3B43 product. For this purpose, four groups of models were established based on different combinations of the four predictors, in order to compare, on the one hand, NDVI-based and NDWI-based models and, on the other hand, stepwise regression with multiple regression. The models that gave the best approximations and best fits were used to downscale the TRMM 3B43 product. The resulting downscaled and calibrated precipitation was validated against independent rain gauge stations (RGSs). In addition, the limitations of the proposed approach were assessed in five bioclimatic stages, and the influence of the sea was analyzed in five classes of distance. The findings showed that the models built using NDVI and NDWI have a high correlation and can therefore be used to downscale precipitation. The integration of elevation and distance improved the correlation of the models. According to R², RMSE, bias, and MAE, there is good agreement between downscaled precipitation and RGS measurements. In addition, the analysis showed that the contribution of the variable distance from sea is evident around the coastal area and decreases progressively. Likewise, the study demonstrated that the approach performs well in humid and arid bioclimatic stages compared to others.


Introduction
Researchers agree on the key importance of precipitation data and its broad spectrum of use [1,2]. In addition to its crucial role in the balance of the hydrological cycle, precipitation data is integrated into the assessment of extreme events, used as input in runoff and erosion modeling, and used as an important parameter in the assessment of hydrometeorological and agricultural hazards such as drought and flood. It is also one of the most challenging aspects of climate modeling. Precipitation is also a useful parameter in other fields such as ecology, natural resources, and environment.
Conventional measurements of precipitation at rain gauge stations (RGSs) provide point-based estimates at specific geographic locations. The quality of the recorded precipitation data depends heavily on field observations, and the establishment of an adequate measuring network at the watershed level entails a considerable cost (equipment, installation, maintenance, etc.). Furthermore, to account for the geographic variability of precipitation, one often relies on deterministic and geostatistical interpolation techniques such as IDW and Kriging. Although spatial interpolation techniques are widely used, they are hindered by several impediments [3] related to data precision, especially in watersheds where the number of RGSs is insufficient or inadequately distributed, as is the case in developing countries [4].
The progress in open satellite precipitation products has overcome this problem to some extent. Indeed, satellite missions such as TRMM (Tropical Rainfall Measuring Mission) and the Climate Prediction Center's (CPC) morphing technique precipitation product (CMORPH) provide spatialized precipitation data with a coarse spatial resolution that is adequate for the characterization of large watersheds. Among the freely accessible data, one can mention TRMM 3B43, which is a monthly product resulting from the combination of TRMM and other data sources (Huffman and Bolvin, 2014) [5,6]. TRMM data is freely available at a spatial resolution of 0.25° for a period of 16 years, from 1999 to 2014 (https://disc.sci.gsfc.nasa.gov/alerts/nearing-the-end-of-the-trmm-era). Although TRMM data was restricted starting from October 7, 2014, because the satellite ran out of fuel, this source of information still constitutes an important axis of research and is still applied in different studies [7-10].
TRMM 3B43 allows characterizing precipitation across large watersheds. However, the spatial resolution of this data is not fine enough to capture spatial variability over small and medium watersheds. For this purpose, different approaches are used to downscale this data to a fine resolution of 1 km [1,11,12]. Downscaling is of key importance in the field of remote sensing, since it allows an increase in spatial resolution [13]. Many downscaling techniques have been recently used in different fields and were reviewed by Jia et al. [12].
This study focuses on spatial downscaling of coarse satellite precipitation. This topic was recently studied by many researchers. Nichol and Abbas [10] studied the relationship between TRMM, at a spatial resolution of 0.25°, and the Normalized Difference Vegetation Index (NDVI) at 1 km, in the Iberian Peninsula. The significant statistical relationship between these indicators allowed the downscaling of TRMM 3B43 products to a resolution of 1 km. A similar approach was used by Immerzeel et al. [11] to downscale the same product in the Qaidam Basin of China. In their analysis, TRMM 3B43 was downscaled using a multiple linear regression model integrating NDVI and DEM [1,11]. Version 7 of TRMM 3B43 was also downscaled over humid and semiarid areas covering the Lake Tana Basin in Ethiopia and the Caspian Sea region in Iran [1]. The downscaling approach adopted in that study was based on a nonlinear relationship between the annual precipitation and the annual average of NDVI. The downscaled precipitation at 1 km was calibrated based on two approaches: Geographical Differential Analysis (GDA) and Geographical Ratio Analysis (GRA) [1]. In the same study, the researchers explored the disaggregation of monthly precipitation and demonstrated that the monthly downscaled precipitation has a good agreement with RGS measurements. Another strand of research has examined the downscaling of TRMM 3B42 for six rainstorm events in the mountainous area of the Xiao River Basin in China [14]. The downscaling scheme was developed using multivariate regression that explains the precipitation by local topography and prestorm meteorological conditions. In this study, elevation, the angle between slope aspect and prevailing wind, and the roughness index were used as proxies of local topography, while antecedent maximum temperature and average humidity served as indicators of prestorm meteorological conditions [13]. The study showed a good agreement between downscaled precipitation and ground observations and gave better results than the conventional spline and Kriging interpolation methods.
The main objective of this study is to improve downscaled precipitation at a spatial resolution of 1 km using stepwise regression and Akaike's Information Criterion (AIC), based on four predictors (NDVI, Normalized Difference Water Index (NDWI), elevation, and distance from sea). The specific objectives of this study are as follows.
First, it has been shown that vegetation response has a positive relationship with precipitation at the annual scale (e.g., Malo and Nicholson, 1990; Martiny et al., 2006; Nicholson et al., 1990). This relationship was exploited to downscale annual TRMM precipitation using NDVI [1,10]. NDWI could also be a good proxy of precipitation, and thus could be used to downscale TRMM 3B43, since it is sensitive to vegetation water content [15] (Gao, 1996). In this study, the potential of NDWI to disaggregate the TRMM 3B43 product is explored, compared to that of NDVI, and evaluated using in situ measurements.
This study also aims to assess how stepwise regression and AIC could improve model selection and thus the downscaled precipitation. To this end, precipitation disaggregated through the model selected by stepwise regression and AIC was compared with that based on multiple regression using the four predictors (NDVI, NDWI, elevation, and distance from sea) and then evaluated through four statistical metrics estimated using independent in situ measurements (R², RMSE, MAE, and bias).
Furthermore, the study investigated the contribution of distance from sea as a predictor to build robust regression models that could improve downscaled precipitation and assessed the sensitivity of the proposed spatial downscaled approach in five bioclimatic stages of the Mediterranean area.

Study Area
The study was carried out in Morocco, which is located in the southwest of the Mediterranean region, in the northwestern part of Africa. Morocco is bordered to the north by the Mediterranean Sea, to the west by the Atlantic Ocean, to the east by Algeria, and to the south and southeast by Mauritania (Figure 1). The country has a long coastline that extends for more than 3,500 kilometers.
Morocco is essentially characterized by a Mediterranean climate, with mild and relatively wet winters and hot, dry summers. The climate shows enormous variations, from subhumid in the north to Saharan in the south (Figure 1). This diversity is due to the combination of several factors, namely, its latitudinal location, the influence of the Atlantic Ocean and the Mediterranean Sea, and the influence of elevation through the Atlas and Rif mountains. Spatial and temporal rainfall variability is considerable. Mean annual rainfall ranges from less than 100 mm (Saharan bioclimatic stage) to 1200 mm (humid bioclimatic stage). The rainy season lasts from October to March in most of the country, and December, January, and February receive the maximum rainfall. The summer months generally have low rainfall of a stormy character. The total land area of Morocco is about 710,850 km², including 58,000 km² of forests (8%), 92,000 km² of agricultural lands (13%), and 460,000 km² of pastures, rangelands, and deserts.

Datasets and Methodology
Methodology. The methodology adopted in this study includes several steps (Figure 2). The most important ones are as follows.

Data Preparation.
The monthly TRMM 3B43 precipitation was accumulated in order to calculate the annual TRMM precipitation year by year. Also, the zonal average of each predictor was calculated at a spatial resolution of 0.25°, to produce a dataset with the same spatial resolution. The same data preparation process was applied by Nichol and Abbas [10], by Zheng and Zhu [7], and by Duan and Bastiaanssen [1].
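The two preparation steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the grids are small nested lists with made-up values, the monthly grids are assumed to already hold totals in mm, and the coarse cell is assumed to contain an exact `factor` x `factor` block of fine pixels (a real 0.25° cell does not divide evenly into 1 km pixels). The function names `annual_total` and `zonal_mean` are hypothetical.

```python
def annual_total(monthly_mm):
    """Sum 12 monthly precipitation grids (lists of rows) into one annual grid."""
    rows, cols = len(monthly_mm[0]), len(monthly_mm[0][0])
    return [[sum(month[r][c] for month in monthly_mm) for c in range(cols)]
            for r in range(rows)]


def zonal_mean(fine, factor):
    """Average a fine grid over non-overlapping factor x factor blocks."""
    rows, cols = len(fine) // factor, len(fine[0]) // factor
    coarse = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [fine[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


# Illustrative data only: 12 identical monthly grids and a 4 x 4 NDVI grid.
monthly = [[[10.0, 20.0], [30.0, 40.0]] for _ in range(12)]
annual = annual_total(monthly)

ndvi_fine = [[0.1, 0.3, 0.5, 0.7],
             [0.1, 0.3, 0.5, 0.7],
             [0.2, 0.2, 0.6, 0.6],
             [0.2, 0.2, 0.6, 0.6]]
ndvi_coarse = zonal_mean(ndvi_fine, 2)  # one mean per 2 x 2 block
```

After this step, every predictor and the annual precipitation share the same coarse grid, so each coarse cell provides one observation for the regressions.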

Comparison of the TRMM 3B43 and NDVI Relationship versus TRMM 3B43 and NDWI.
For each year between 1999 and 2012, regression models were established using TRMM average annual precipitation as a dependent variable and NDVI as an independent variable. These models were compared to those built using NDWI. Then, elevation and distance were integrated progressively into the models, in a second and a third iteration.

Stepwise Regression and AIC Analysis.
Stepwise multiple regression is a widely used approach to assess the importance of different predictors in explaining a dependent variable. It is considered a semiautomated process of building a model by successively adding or removing variables based on their estimated coefficients. The process of adding more variables stops when all of the variables have been included or when it is not possible to make a statistically significant improvement in R² using any of the variables not yet included in the model. This statistical technique is applied in different fields including mathematics, Earth observation, and geoinformation [20][21][22].
It is worth mentioning that although it is widely used by the remote sensing and GIS community, stepwise multiple regression has several limitations, such as the bias arising from variable selection on the basis of statistical significance [23,24]. To overcome these limitations, other model selection protocols are recommended. An interesting review of these techniques was given by Anderson et al. [25]. Among the techniques discussed by these authors, one can mention Akaike's Information Criterion (AIC), Kullback-Leibler Information, and Takeuchi's Information Criterion.
In this study, stepwise multiple regression and Akaike's Information Criterion were applied in order to select the best combinations of variables (NDVI, NDWI, altitude, and distance from sea) that explain the maximum variation of TRMM and give the best model fit. The stepwise multiple regressions were performed initially using a dataset of six years. These allowed choosing the most robust models for each year. Then, these models were double-checked and the best model fit for each year was selected using AIC. The regression models resulting from this approach formed a first group of models (Group 1) that were compared with a second group of models (Group 2), built on the basis of multiple regression using the same dataset. The models of the two groups were evaluated using in situ measurements through four statistical metrics (R², RMSE, MAE, and bias). This evaluation aims to assess whether stepwise regression and AIC improve the model fit and performance.
Figure 2: The methodology used in this study. DS: downscaled; GDA: Geographical Difference Analysis.
Using the same dataset, two other groups of models (Groups 3 and 4) were established. The models of Group 3 were built using stepwise regression integrating NDVI, altitude, and distance from sea, while the models of Group 4 were established using stepwise regression based on NDWI, altitude, and distance from sea. These two groups of models allowed us to compare and to evaluate downscaled precipitation using NDWI with that based on NDVI. The evaluation was undertaken through the same statistical metrics (R², RMSE, MAE, and bias).

Advances in Meteorology
It is important to emphasize that the assumptions of normality, linearity, and homoscedasticity of the residuals were checked for all the selected models.
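To make the selection idea concrete, the following is a minimal sketch of an AIC-driven forward selection, written in plain Python. It is a simplified stand-in, not the procedure used in the study, which combines significance-based stepwise regression with a subsequent AIC check. The least-squares form of the criterion, AIC = n ln(RSS/n) + 2k, is assumed; the helper names (`solve`, `ols_aic`, `forward_select`) and all data values are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_aic(columns, y):
    """Fit y on an intercept plus the given columns; return (coefficients, AIC)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    coef = solve(xtx, xty)
    rss = sum((y[i] - sum(cj * xj for cj, xj in zip(coef, row))) ** 2
              for i, row in enumerate(design))
    return coef, n * math.log(rss / n) + 2 * k


def forward_select(predictors, y):
    """Greedy forward selection: keep adding the predictor that lowers AIC most."""
    chosen = []
    _, best = ols_aic([], y)  # intercept-only model as the starting point
    while True:
        trials = {name: ols_aic([predictors[p] for p in chosen + [name]], y)[1]
                  for name in predictors if name not in chosen}
        if not trials:
            break
        name = min(trials, key=trials.get)
        if trials[name] >= best:
            break
        chosen.append(name)
        best = trials[name]
    return chosen, best


# Synthetic coarse-cell data (illustrative only, not from the study).
predictors = {
    "ndvi": [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65],
    "elevation": [200, 1500, 400, 1200, 300, 1400, 500, 1100, 250, 1300, 450, 1000],
    "distance": [10, 250, 90, 30, 180, 60, 220, 120, 40, 200, 150, 80],
}
noise = [3, -2, 1, -4, 2, -1, 4, -3, 1, -2, 2, -1]
trmm = [100 + 800 * v + 0.1 * e + eps
        for v, e, eps in zip(predictors["ndvi"], predictors["elevation"], noise)]

selected, aic = forward_select(predictors, trmm)
```

In this synthetic example precipitation depends only on NDVI and elevation, so those two predictors enter first; whether `distance` is added afterwards depends on whether it lowers the AIC by more than the penalty of one extra parameter.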

Downscaling and Calibration of TRMM Precipitation to 1 km.
Three groups of models were used to downscale TRMM precipitation. These are the selected models of Groups 2, 3, and 4. The selected models of Group 2 were chosen since they give a better fit compared to the models of Group 1. The selected models of Groups 3 and 4 were used to compare the contributions of NDWI and NDVI. The evaluation of the three groups of models was based on the four statistical metrics mentioned above.
For each year and for each model, downscaled and calibrated precipitation was calculated according to the scheme used by Nichol and Abbas [10] and Duan and Bastiaanssen [1]. This scheme considers all raster cells (in our case, 875 cells) and is implemented in the five steps described below:
(1) The selected regression models were used to estimate the precipitation at a spatial resolution of 0.25° (PE_0.25), as a function of the predictors.
(2) Residual values of precipitation (RES_0.25) were calculated at a spatial resolution of 0.25° as the difference between TRMM precipitation (TRMM_0.25) and estimated precipitation (PE_0.25). The residual values are considered as the amount of the annual precipitation that cannot be predicted by the models.
(3) The residual values (RES_0.25) were interpolated to a spatial resolution of 1 km through the spline algorithm. This interpolation method estimates values using a mathematical function that minimizes the total surface curvature, resulting in a smooth surface that passes exactly through the sampled points [26]. Such an algorithm is recommended when the point data is regularly spaced [10], as is the case in this study. These interpolations allowed estimating the residual values at a fine resolution (RES_1km). The same interpolation approach was adopted by Immerzeel et al. (2009) and Duan and Bastiaanssen (2013).
(4) Preliminary estimates of downscaled precipitation were obtained by applying the regression models using the predictors at fine resolution (1 km), and then the results were corrected by adding the corresponding residual values (RES_1km).
(5) The Geographical Differential Analysis (GDA) [27] was adopted for the calibration of downscaled precipitation using RGS measurements. This approach was also used for the calibration of downscaled precipitation by Duan and Bastiaanssen [1]. The GDA relies on the in situ measurements at the rain gauge stations and was implemented year by year and model by model. The difference between downscaled precipitation (DSP) and in situ measurement was calculated at each gauge station. This difference is noted as the likely precipitation error (Perr). It was then interpolated via the Inverse Distance Weighting algorithm, since the gauge stations are not regularly spaced. The final downscaled calibrated precipitation (DSC) was calculated by summing the downscaled precipitation (DSP) and the likely error (Perr) at a spatial resolution of 1 km × 1 km.
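The calibration in step (5) can be illustrated with a short sketch in plain Python. It assumes that the error at each station is taken as the measured value minus the downscaled value, so that adding the interpolated error moves the downscaled field toward the gauges; the source does not state the sign convention explicitly. The function names `idw` and `gda_calibrate`, the power parameter of 2, and the three-cell example are all hypothetical.

```python
def idw(points, values, target, power=2.0):
    """Inverse Distance Weighting estimate at `target` from scattered points."""
    num = den = 0.0
    for (x, y), v in zip(points, values):
        d2 = (x - target[0]) ** 2 + (y - target[1]) ** 2
        if d2 == 0.0:
            return v  # the target coincides with a station
        w = 1.0 / d2 ** (power / 2.0)
        num += w * v
        den += w
    return num / den


def gda_calibrate(dsp, stations, observed):
    """Add the IDW-interpolated station error to every downscaled cell.

    dsp      -- dict mapping (x, y) cell coordinates to downscaled precipitation
    stations -- list of (x, y) coordinates that are also keys of `dsp`
    observed -- gauge measurements, in the same order as `stations`
    """
    # Assumed sign convention: error = measured - downscaled.
    errors = [obs - dsp[s] for s, obs in zip(stations, observed)]
    return {cell: value + idw(stations, errors, cell)
            for cell, value in dsp.items()}


# Illustrative values only: three cells on a line, gauges at both ends.
dsp = {(0.0, 0.0): 300.0, (1.0, 0.0): 320.0, (2.0, 0.0): 340.0}
stations = [(0.0, 0.0), (2.0, 0.0)]
observed = [310.0, 330.0]
dsc = gda_calibrate(dsp, stations, observed)
```

At the two gauge cells the calibrated value equals the measurement; at the middle cell the two opposite errors are equally distant and cancel, so the downscaled value is left unchanged.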

Comparison and Validation.
The validation of the downscaled and calibrated precipitation was based on commonly used statistical metrics, namely, the coefficient of determination (R²), the root mean square error (RMSE), the bias, and the mean absolute error (MAE). The use of these indicators is widespread among the remote sensing and GIS community for model evaluation [1,10]. The RMSE and MAE have also been used as standard statistical metrics to measure model performance in meteorology, air quality, climate research studies, and geoscience [28]. It should be pointed out that since there is no consensus on the most appropriate metric for model errors, both RMSE and MAE were used. In addition to these two metrics, the bias was also assessed. The four metrics were calculated year by year and for all the models, based on independent RGSs, from the estimated precipitation, the measured precipitation, and the number of RGSs.
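As a reading aid, the four metrics can be written out as a small Python sketch using their usual textbook definitions. These definitions are assumptions, since the equations themselves are not reproduced in this text. In particular, the bias is taken here in its relative form, sum(estimated) / sum(measured) − 1, which is consistent with the small dimensionless bias values reported in the results but is not confirmed by the source. Function names and sample values are hypothetical.

```python
import math


def r_squared(est, obs):
    """Squared Pearson correlation between estimated and measured values."""
    n = len(obs)
    me, mo = sum(est) / n, sum(obs) / n
    cov = sum((e - me) * (o - mo) for e, o in zip(est, obs))
    ve = sum((e - me) ** 2 for e in est)
    vo = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (ve * vo)


def rmse(est, obs):
    """Root mean square error."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def mae(est, obs):
    """Mean absolute error."""
    return sum(abs(e - o) for e, o in zip(est, obs)) / len(obs)


def relative_bias(est, obs):
    """Assumed relative form of the bias: negative means underestimation."""
    return sum(est) / sum(obs) - 1.0


# Illustrative annual totals (mm) at four hypothetical validation gauges.
estimated = [100.0, 200.0, 300.0, 400.0]
measured = [110.0, 210.0, 290.0, 410.0]

scores = {
    "R2": r_squared(estimated, measured),
    "RMSE": rmse(estimated, measured),
    "MAE": mae(estimated, measured),
    "bias": relative_bias(estimated, measured),
}
```

With these sample values every station differs by 10 mm, so RMSE and MAE are both 10 mm, while the bias is slightly negative because the estimates sum to less than the measurements.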
It is worth noting that the same statistical metrics were used for the comparison of the regression models of the three groups (2, 3, and 4). Likewise, the resulting downscaled precipitation was compared by visual interpretation. Also, after comparison and validation, the approach that gives the best model fit was used to extend the study to the other years (between 2005 and 2012).

Sensitivity to Mediterranean Bioclimatic Stages.
The previously mentioned downscaling studies did not take climatic zoning into consideration. In fact, precipitation downscaled using the described scheme could be sensitive to climatic conditions, especially in areas where precipitation is low and/or vegetation is absent. In this study, we explored this potential sensitivity in five bioclimatic stages of the Mediterranean area.
In this regard, it is worth mentioning that one of the key steps in the downscaling process is the establishment of robust regression equations with the best fits. Regression models with low and statistically insignificant correlation coefficients will be unable to explain an important part of the TRMM 3B43 product, and thus they cannot downscale it with an acceptable approximation. The sensitivity of the proposed approach to the different bioclimatic stages can therefore be assessed based on the statistical parameters of the regression models. To this end, stepwise regression was performed year by year (1999-2004) for each of the five bioclimatic stages that characterize Morocco (humid, subhumid, semiarid, arid, and Saharan). The resulting regression models and their statistical parameters were compared over the five bioclimatic stages, year by year. The approach was considered sensitive to a bioclimatic stage when the correlation coefficients are low and/or statistically not significant (p > 0.05). Downscaling of precipitation using these models could lead to unsatisfactory results. The approach was considered nonsensitive when the correlation coefficients are high and statistically significant.

Influence of Distance from Sea.
The variable distance from sea was included in the original set of candidate models. However, the influence of distance could be important only in the first kilometers near the sea and not over all the study area. In addition, the area close to the Atlantic coast could be influenced by the sea breeze effect. To provide a better understanding of the influence and contribution of the distance from sea in explaining the TRMM product, a second analysis of this variable was performed. For this purpose, the map of this variable was classified into five classes, namely, Class 1 (0 to 0.25°), Class 2 (0.25° to 0.50°), Class 3 (0.50° to 0.75°), Class 4 (0.75° to 1.00°), and Class 5 (more than 1.00°).
For each of these five classes, the stepwise regression was performed, year by year, using the four predictors. Then, the standardized coefficients of the distance from sea were compared class by class, year by year, using Tamhane's post hoc test [29]. This allowed us to explore whether the contribution of the distance is significantly different across the five classes. The interval of 0.25° was chosen to be consistent with the spatial resolution of TRMM.
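The classification into distance classes amounts to a simple binning rule, sketched below in Python. The treatment of values that fall exactly on a class boundary is an assumption (they are assigned to the upper class here), since the class limits given above overlap at their endpoints. The function name `distance_class` and the sample cells are hypothetical.

```python
def distance_class(distance_deg, width=0.25, n_classes=5):
    """Return the class (1..n_classes) of a distance from sea given in degrees.

    Classes are `width` degrees wide; the last class is open-ended.
    Boundary values go to the upper class (an assumption of this sketch).
    """
    if distance_deg < 0:
        raise ValueError("distance from sea cannot be negative")
    return min(int(distance_deg // width) + 1, n_classes)


# Group hypothetical coarse cells by class before fitting one model per class.
cells = {"a": 0.10, "b": 0.30, "c": 0.60, "d": 0.90, "e": 2.50, "f": 0.05}
by_class = {}
for cell_id, dist in cells.items():
    by_class.setdefault(distance_class(dist), []).append(cell_id)
```

Each list in `by_class` then holds the cells on which the per-class stepwise regression would be fitted.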

Relationship of TRMM 3B43 versus NDVI and NDWI.
Figure 3 shows that the TRMM versus NDVI and TRMM versus NDWI relationships have high R² and that all the selected models have a significant F-statistic. R² ranges from 0.70 to 0.82 for TRMM versus NDVI and from 0.40 to 0.65 for TRMM versus NDWI. Although the correlation between NDWI and TRMM is relatively moderate, it remains statistically significant. It can be concluded that both NDVI and NDWI can be used to explain TRMM 3B43. Figure 4 illustrates graphically that, after integrating the predictors elevation (Figure 4(b)) and distance (Figure 4(c)) in both NDVI-based and NDWI-based models, all the correlation coefficients increase slightly and progressively. The lower limits of the correlation coefficients increased from 0.76 to 0.83 for the NDVI-based models and from 0.51 to 0.63 for the NDWI-based models, while the upper limits increased slightly from 0.90 to 0.92 for the NDVI versus TRMM relationship and from 0.81 to 0.84 for the NDWI versus TRMM relationship.
The summary of unstandardized regression coefficients (B) of the NDVI, NDWI, elevation, and distance is given in Table 2. It can be seen that the unstandardized coefficients of NDVI and NDWI are higher than those of elevation and distance. This means that NDVI and NDWI are the variables that contribute the most to the models. Elevation and distance have small unstandardized coefficients compared to NDVI and NDWI. By way of background, in addition to B, standardized regression coefficients are used in the interpretation of the contribution of each variable in the regression models. The standardized regression coefficients (Beta) refer to how many standard deviations a dependent variable will change per standard deviation increase in the predictor variable. They are calculated by multiplying the unstandardized coefficient, B, by the ratio of the standard deviations of the independent and dependent variables. The use of Beta coefficients facilitates comparisons among independent variables since they are all expressed in standardized units. According to this analysis, it could be concluded that the statistical metrics of both NDVI and NDWI are significant; nevertheless, those of NDVI are better than those of NDWI. This can be explained by the fact that NDWI is more dynamic than NDVI; therefore, the NDWI syntheses do not capture all the variation of water content.
According to Table 2, although the elevation has a small B, it is characterized by high standardized coefficients (Beta). This means that this variable makes an important contribution to the regression models, because it has a large absolute standardized coefficient (IBM Corp., 2012). The low values of Beta for the distance from sea indicate that this variable does not contribute significantly to the regression models. This may be explained by the fact that the influence of the sea could be important close to coastal areas and decreases with distance. The effect of distance was further analyzed in Section 4.5.2.
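The relation between B and Beta described above, Beta = B × (standard deviation of the predictor / standard deviation of the dependent variable), can be checked with a short Python sketch. The numbers are purely illustrative and are not taken from Table 2; the function name `standardized_beta` is hypothetical.

```python
import statistics


def standardized_beta(b, x, y):
    """Convert an unstandardized coefficient B into a standardized Beta."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Illustrative values: the same precipitation series is a perfect linear
# function of either predictor, with very different unstandardized slopes.
precipitation = [300.0, 360.0, 420.0, 480.0]       # mm
elevation = [200.0, 800.0, 1400.0, 2000.0]         # m, slope B = 0.1 mm per m
ndvi = [0.2, 0.3, 0.4, 0.5]                        # unitless, slope B = 600 mm

beta_elevation = standardized_beta(0.1, elevation, precipitation)
beta_ndvi = standardized_beta(600.0, ndvi, precipitation)
```

Although the two values of B differ by a factor of 6000 because of the units of the predictors, both Beta values equal 1 in this constructed case, which is why a small B for elevation does not by itself indicate a small contribution.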

Stepwise Regression of TRMM and Predictors.
It is important to recall that assumptions of normality, linearity, and homoscedasticity of the residuals were checked for all the selected models. Table 3 presents the statistical parameters of the models of Groups 1 and 2 that give the best model fit for the six years. As reported in Table 3(a), the six models selected by stepwise regression and AIC (Group 2) are characterized by high and statistically significant correlation coefficients (p < 0.001) and by very large unstandardized and standardized coefficients of NDVI and NDWI. This means that NDVI and NDWI make an important contribution to the models (because they have a large absolute standardized coefficient). The variable elevation also contributes, to a lesser extent, to the models. Although its unstandardized coefficients are low, its standardized coefficients are relatively large, ranging from 0.21 to 0.32. Also, it should be noted that the unstandardized coefficients of these variables (NDVI, NDWI, and elevation) are positive, meaning that the precipitation increases as the values of these variables increase. The distance is characterized by negative standardized and unstandardized coefficients. This indicates that rainfall decreases as distance from the sea increases. However, the variable distance contributes only slightly to the models, since the absolute values of its standardized and unstandardized coefficients are very small [30].
The models of Groups 1 and 2 were compared in order to assess whether stepwise regression and AIC improved the models through the selection of the appropriate variables and the best model fits. The comparison was based only on the years 2000, 2001, and 2003, since for the other years the models of Groups 1 and 2 are the same. It appears from Tables 3(a) and 3(b) that even though the correlation coefficients are more or less the same for the two groups of models, the absolute values of Student's t are small and statistically not significant for the models of Group 1 that correspond to the years 2000, 2001, and 2004. The use of stepwise regression and AIC allowed addressing this constraint by selecting the "best" models that do not include the variable distance.
On the other hand, the AIC values are relatively low for the models of Group 2 compared to those of Group 1. This confirms that the models of Group 2 perform relatively better than those of Group 1. The tolerances of the predictors range from 0.27 to 0.53 for NDVI, from 0.26 to 0.55 for NDWI, from 0.58 to 0.62 for distance, and from 0.62 to 0.98 for elevation. This suggests that there is no significant multicollinearity in the regression models.
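For readers less familiar with tolerance, the sketch below shows how it relates to the variance inflation factor (VIF = 1 / tolerance) and how it arises in the simplest case of two predictors, where tolerance = 1 − r² with r the correlation between them. The rule of thumb flagged in the comments (tolerance below about 0.1, or VIF above about 10, as a sign of problematic multicollinearity) is a common convention and an assumption here, not a threshold stated in this study. Function names and sample values are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def tolerance_two_predictors(x1, x2):
    """Tolerance of either predictor in a two-predictor model: 1 - r^2."""
    return 1.0 - pearson_r(x1, x2) ** 2


def vif_from_tolerance(tolerance):
    """Variance inflation factor corresponding to a given tolerance."""
    return 1.0 / tolerance


# The lowest tolerance reported above (0.26) corresponds to a VIF below 4,
# well under the commonly quoted (assumed) warning level of about 10.
vif_lowest = vif_from_tolerance(0.26)

# Two nearly collinear illustrative predictors give a very low tolerance.
tol = tolerance_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```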
It can be concluded that the use of stepwise regression and AIC-based model selection allowed refining relatively the approach by choosing the best models.The selected models are characterized by high and significant correlation coefficients, high and significant standardized coefficients, and low AIC values and are without multicollinearity problems.
Table 4 compares the relationships between NDVI and TRMM 3B43 (Group 3) with those between NDWI and TRMM 3B43 (Group 4). It reveals that the resulting regression models of Groups 3 and 4 have high and statistically significant correlation coefficients. Those of Group 3 are relatively higher than those of Group 4. According to the tolerances, the models of these two groups did not present any multicollinearity problem.

Spatial Downscaled and Calibrated Precipitation.
Three groups of models were used to downscale TRMM 3B43 to 1 km. The selected models of Group 2 perform better than those of Group 1. The selected models of Groups 3 and 4 were chosen in order to compare downscaled precipitation using NDVI and NDWI.
Figure 5 shows that the three groups of models captured the spatial pattern of precipitation in Morocco, year by year. In general, northern Morocco receives more rainfall than the south, and the west receives more than the east. The same figure helps to highlight the years that experienced relatively abundant rainfall (2001, 2003, and 2004) and the years that experienced low rainfall, such as 1999.
Although the estimated precipitation has captured the overall precipitation pattern over Morocco, some residual values were observed (Figure 6). Negative residual values indicate an underestimation of rainfall. This concerns the Saharan bioclimatic stage, where vegetation is very sparse or absent, and hence vegetation growth is not proportional to the rainfall. The positive residual values indicate an overestimation of rainfall. This corresponds to the wettest areas of Morocco, which are covered by forests and matorral that are characterized by relatively deep roots and therefore do not necessarily have an immediate response to rainfall. Similar residual values were observed in Spain [10]. The final downscaled and calibrated precipitation for the six years according to Groups 2, 3, and 4 is presented in Figure 7.
Figure 8 shows that all the correlation coefficients are high for the three groups. These coefficients range from 0.72 to 0.92 and are similar to or relatively higher than those found by other authors in China [11]. Also, all model fits passed the F statistical test (p < 0.001) and are statistically significant. This means that, in addition to having a finer spatial resolution of 1 km, the downscaled precipitation captured the pattern of TRMM 3B43.
The highest values of R² correspond to the models of Group 2, which range from 0.78 to 0.86, with an average of 0.83 for this group of models. The values of R² of Group 3 are very close to those of Group 2. They range from 0.77 to 0.84, with an average of 0.81, while R² of Group 4 is lower, with values ranging from 0.50 to 0.70 and an average of 0.61. Nevertheless, R² of this last group remains high and statistically significant. The performance of the models of Group 2 could be explained by the fact that the models of this group include all variables. Aside from that, the models of Group 3 have good statistics compared to those of Group 4, since the models of Group 3 are based on NDVI, which is less dynamic than NDWI.

Validation of Downscaled and Calibrated Precipitation.
Figure 9 reveals that the averages of R² for the six years are 0.89, 0.87, and 0.79 for Groups 2, 3, and 4, respectively. The averages of these coefficients for the three groups are slightly lower than those of the original TRMM 3B43. A similar result was observed by Duan and Bastiaanssen [1].
As mentioned earlier, the number of RGSs used for the spatial downscaling approach was 53, 61, 34, 34, 32, and 25 for the years 1999, 2000, 2001, 2002, 2003, and 2004, respectively (depending on data availability). In order to evaluate downscaled precipitation and to compare the models of Groups 2, 3, and 4, only independent rain gauge stations can be used. In this study, 10 available rain gauge stations were used to estimate the statistical metrics (R², RMSE, MAE, and bias). It is worth mentioning that all downscaling studies cited earlier use independent rain gauge stations for validation.
According to Table 5, the RMSE values range from 26 to 167, from 20 to 158, and from 33 to 170 for Group 2, Group 3, and Group 4, respectively. In general, these values remain lower than those of TRMM 3B43. The bias is also larger for TRMM 3B43 than for the three groups. Group 2 has the lowest bias (−0.021 to −0.006). The bias ranges from −0.026 to −0.06 for Group 3 and from −0.027 to 0.037 for Group 4. The bias of Group 2 is systematically negative for the six years; this means that the downscaled precipitation of this group slightly underestimates the precipitation.
It can be concluded that the models of Group 2, which were built using NDVI, NDWI, elevation, and distance, perform slightly better than the models of Groups 3 and 4. It is also evident that the models developed by stepwise regression based on NDWI, distance, and elevation have good agreement with the observed precipitation. However, these models perform slightly worse than those of Group 3.
The developed methodology was applied to the more recent years, from 2004 to 2012. This provided an updated picture of the spatial distribution of precipitation over Morocco during the last 14 years at a spatial resolution of 1 km² (Figure 10).

According to Table 6, the subhumid stage is characterized by a very high and significant R, with an average of 0.88 for the six years. The second area where the approach performs well is the semiarid stage, with an average of 0.8. The average R in the arid bioclimatic stage is around 0.65. The bioclimatic stages where R was lower are the humid and Saharan stages. In these stages, R ranged from 0.22 to 0.48 for the former and from 0.22 to 0.62 for the latter. The low values of R in the Saharan bioclimatic stage can be explained by the nature of the vegetation cover in this area, which is very sparse or even nonexistent. They can also be due to the nature of the soil, which is skeletal and sandy. Given these conditions, precipitation does not lead to substantial vegetation growth, since there are other limiting factors. Regarding the humid bioclimatic stage, the low R could be explained by the coincidence of these areas with mountain peaks that are characterized by the presence of rocky outcrops and/or forests. Indeed, rocky outcrops are almost devoid of vegetation, while forests have deep roots that do not draw their water directly and immediately from precipitation. It is worth mentioning that the low R in the Saharan bioclimatic stage could also be affected by the low number of RGSs. The same table reveals the absence of multicollinearity except in the humid stage, where the maximum VIF value can reach 7.56.
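The multicollinearity check mentioned above relies on the variance inflation factor (VIF), which for predictor j equals 1 / (1 − R_j²), where R_j² comes from regressing that predictor on all the others. The sketch below shows one way to compute it with NumPy; the function name and the toy data are hypothetical, and the threshold applied in the study is not stated in this section (common rules of thumb flag values above roughly 5 or 10).

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the remaining predictors plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(np.sum((target - target.mean()) ** 2))
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return vifs


# Toy example: the second column is almost a multiple of the first.
X = np.column_stack([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10.5]])
print(variance_inflation_factors(X))
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others, while large values indicate that its coefficient is poorly determined.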

Influence of Distance from the Mediterranean Sea.
Among all the possible combinations of variables, for the five classes of distance over the six years, stepwise regression identified 74 statistically significant models, of which only 16 include distance. This number is equal to 5, 4, 2, 0, and 5 for Classes 1, 2, 3, 4, and 5, respectively.
Figure 11 presents the average of the standardized coefficients of the variable distance for the five classes. It appears that the absolute values of these coefficients are large for Class 1 and decrease gradually for the other classes.
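For reference, the usual relation between a standardized and an unstandardized regression coefficient is given below. The notation is an assumption introduced here for illustration: b_j is the unstandardized coefficient of predictor x_j, and s_{x_j} and s_y are the sample standard deviations of the predictor and of the response.

```latex
\beta_j^{*} = b_j \, \frac{s_{x_j}}{s_y}
```

Because the standardized coefficient is unit-free, it can be compared across predictors and across distance classes, which is not the case for the unstandardized one.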
This result was confirmed through an ANOVA F-statistic test and was pursued further by applying Tamhane's post hoc test, to verify whether there is a statistically significant difference between the standardized coefficients of each pair of classes. The result (Table 7) shows that there is a significant difference between Class 1 and Classes 3 and 5. The difference is small and not significant between Classes 1 and 2. There is a significant difference between Classes 2 and 5 and between Class 3 and Class 5. According to this analysis, the standardized coefficients of the variable distance are larger in Classes 1 and 2 than in Classes 3 and 4. It can be concluded that the predictor distance has a relatively large and statistically significant contribution in Classes 1 and 2. The contribution of this variable over Classes 3 and 4 is nonsignificant. The same holds for Class 5, since the unstandardized coefficients are very small, with an average of −0.02.
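The first step of the comparison above, the one-way ANOVA F statistic, can be sketched in plain Python as follows. Each group stands for the standardized coefficients of distance in one class; the function name and the numbers are hypothetical, and a class with no coefficients has to be left out. The p-value and Tamhane's post hoc comparisons are not reproduced here and would normally be obtained from a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of non-empty samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical coefficients for three classes, for illustration only.
print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]))
```

A large F indicates that the class means differ more than the within-class scatter would suggest, which is what motivates the subsequent pairwise post hoc comparisons.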

Conclusion
This study investigated the spatial downscaling of coarse satellite-derived precipitation over five bioclimatic stages in Morocco, for a period of 14 years, from 1999 to 2012. The case of TRMM 3B43 was studied through multiple stepwise regressions and AIC based on an open dataset including NDVI, NDWI, elevation, and distance from sea.
The study demonstrated the existence of a strong and statistically significant relationship between NDVI and TRMM 3B43, with correlation coefficients reaching 0.81. The integration of the predictors elevation and distance from sea in these regression models can slightly improve the correlation coefficients. Likewise, the standardized coefficients of NDWI are high and statistically significant, meaning that they have a high contribution to the selected models.
The pairwise comparisons of the models selected through stepwise regression and AIC with those based on multiple regression showed that the former perform better than the latter. In fact, stepwise regression and AIC allowed the models to be refined further by choosing the best

Figure 4: Correlation coefficients between TRMM and explanatory variables: (a) TRMM versus NDVI compared with TRMM versus NDWI, (b) TRMM versus NDVI and elevation compared with TRMM versus NDWI and elevation, and (c) TRMM versus NDVI, elevation, and distance compared with TRMM versus NDWI, elevation, and distance.

Figure 7: Spatial distribution of downscaled and calibrated precipitation over Morocco based on TRMM 3B43 product, according to Groups 2, 3, and 4.

Figure 9: Scatterplot of the measured precipitation from 10 independent rain gauge stations versus the estimated precipitation according to Groups 2, 3, and 4.

Table 1: Datasets used in this study.

Table 2: Summary of standardized and unstandardized coefficients for different predictors.

Table 3: Regression parameters of stepwise regression models versus multiple regression models.

Table 4: Stepwise regression parameters of Group 3 (models based on NDVI, distance, and altitude) and Group 4 (models based on NDWI, distance, and altitude).

Table 5: Statistical metrics calculated based on measured precipitation from independent gauge stations for three groups of models (DSC: disaggregated and calibrated precipitation).

Table 6: Correlation parameters of the five bioclimatic stages during the six years.

Table 7: Pairwise comparison of standardized coefficients of distance from sea using Tamhane's post hoc tests.