Downscaling considerably improves the results of General Circulation Models (GCMs). However, little information is available on the performance of downscaling methods in the Andean mountain region. This paper presents the downscaling of monthly precipitation estimates of the NCEP/NCAR reanalysis 1 using the statistical downscaling model (SDSM), artificial neural networks (ANNs), and the least squares support vector machines (LS-SVM) approach. After bias and variance correction, the downscaled monthly precipitation estimates were compared to the median and variance of the 30-year observations of 5 climate stations in the Paute River basin in southern Ecuador, one of Ecuador’s main river basins. A preliminary comparison revealed that both artificial intelligence methods, ANN and LS-SVM, performed equally well. The results showed that the ANN and LS-SVM methods are, in general, more skillful than SDSM. However, in some months, SDSM estimates matched the median and variance of the observed monthly precipitation depths better. Since synoptic variables do not always represent local conditions, particularly in the period from September to December, it is recommended that future studies refine the downscaling estimates, for example, by combining dynamic and statistical methods or by selecting sets of synoptic predictors for specific months or seasons.
1. Introduction
General Circulation Models (GCMs) are widely used to predict the impact of climate change on, for instance, the regional precipitation trend. The resolution of these models, typically around 2°×2°, is unsuitable for climate change impact estimations at basin scale [1, 2]. Additionally, these models do not capture subgrid processes well, which can be very complex in mountain regions, and fail to account properly for the orographic features of those regions. Therefore, to obtain projections of the impact of climate change at basin scale, particularly in a mountain region, downscaling is a must. In general, downscaling methods can be subdivided into two large groups: dynamical downscaling (DD) and statistical downscaling (SD) methods. On the one hand, the DD methods nest a regional climate model (RCM) in the GCM, which enables capturing the atmospheric phenomena at a much higher resolution, in the order of tens of kilometers. The SD techniques, on the other hand, are based on the determination of statistical relations between large-scale synoptic predictors and local observations from ground stations. These relations are assumed to be stationary, an assumption that might not hold for future climate projections. The computational cost of the SD methods is low, they are relatively easy to implement, and they generally present higher accuracy than dynamical models. Particularly when the aim is not to understand the change in weather processes provoked by climate change, the generation of future projections at basin scale using SD methods might be convenient.
Several SD techniques exist, among which the statistical downscaling model (SDSM) is probably the most widely used [3]. The SDSM approach facilitates the rapid development of multiple, low-cost, single-site scenarios of daily surface weather variables and is considered a stochastic weather generator at the daily scale. The limited use of this technique in the Andean mountain region is perhaps the consequence of its complex topography, location in the tropical zone, and the influence of the warm and cold ENSO phases, El Niño and La Niña, respectively. All these influences add complexity to the atmospheric processes, making their representation by downscaling techniques more difficult. Artificial neural networks (ANNs) are another commonly used SD technique, although they have been applied only to a limited extent in the Andes region. As stated by [4], the success of this technique is primarily due to its ability to map highly nonlinear relations between the inputs and outputs of the model. Reference [5] applied ANN for downscaling monthly precipitation and temperature in the Paute basin in the Andes of Ecuador. The ANN-based SD was applied in order to evaluate its representation of seasonality against DD using a regional climate model, the Weather Research and Forecasting (WRF) model [6]. With respect to rainfall, they found that although both downscaling approaches represent seasonality qualitatively well in this highly complex terrain, the ANN estimates of rainfall were more accurate than the WRF estimates. This finding highlights the applicability of ANN when an understanding of the processes involved is not required. Another technique recently applied to the downscaling of GCMs is support vector machines, SVM [7]. In particular, least squares support vector machines, LS-SVM [8], downscale GCM output even better, mainly because the optimization problem is reduced to the solution of a linear system, which considerably reduces the computational requirements. Although SD methods present greater accuracy than DD methods, from a statistical point of view they have two handicaps: bias and low variance. Normally, the quantile mapping (QM) technique [9] is applied to improve the representation of the distribution in derived applications.
This paper presents a comparative evaluation of downscaled GCM estimates of monthly precipitation at the scale of a large river basin situated in the Andean mountain region in southern Ecuador, applying SDSM and two artificial intelligence (AI) techniques: artificial neural networks (ANNs) and the least squares support vector machines (LS-SVM) approach. The downscaled results were corrected for bias and variance inflation applying the QM technique prior to the comparative analysis. For the evaluation, historic data was used.
2. Study Area and Data
The Paute River basin, a tributary of the Amazon basin with an area of 6148 km^{2}, located between the eastern and western cordillera of the Andes in Ecuador, was selected for the comparative evaluation of the downscaling methods. The basin is characterized by a high spatial and temporal variation in precipitation, which can broadly be classified into three rainfall regimes, that is, subregions with a unimodal (UM), bimodal (BM), or trimodal (TM) precipitation pattern [10, 11]. Data from 5 rainfall stations (see Table 1) were used; their geographical distribution is shown in Figure 1. The measured monthly rainfall for the 30-year period 01-1980 to 12-2009 was used to quantify the differences in the performance of the selected downscaling methods. The dataset was split into a first set for the calibration of the methods, encompassing 75% of the total dataset, and a second set with the remaining 25%, used for the validation. All results presented herein belong to the validation set. Quality control of the data was performed using double mass curves on the time series with gaps not exceeding 20% of the observations. The infilling of the data was accomplished using multiple linear regression with stations with higher Pearson correlation [12].
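The 75%/25% split described above can be sketched as follows. This is a minimal illustration that assumes a chronological split, which the text does not state; the function name `split_calibration_validation` is hypothetical.

```python
def split_calibration_validation(series, calibration_fraction=0.75):
    """Split a time-ordered sequence into calibration and validation parts.

    Assumption: the first 75% of the months form the calibration set and
    the last 25% the validation set (chronological split).
    """
    n_cal = int(round(len(series) * calibration_fraction))
    return series[:n_cal], series[n_cal:]


# Monthly time axis of the study period, 01-1980 to 12-2009.
months = [(year, month) for year in range(1980, 2010) for month in range(1, 13)]
calibration, validation = split_calibration_validation(months)
```

Under this assumption the 360 months yield 270 calibration months and 90 validation months.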
Table 1: Stations used for the present study.

| Code | Station | Regime | Latitude (°) | Longitude (°) | Elevation (m asl) | Annual precipitation (mm) |
|---|---|---|---|---|---|---|
| M141 | El Labrado | TM | −2.732 | −79.073 | 3335 | 1286 |
| M139 | Gualaceo | BM | −2.882 | −78.776 | 2230 | 820 |
| M138 | Paute | TM | −2.800 | −78.763 | 2194 | 750 |
| M045 | Palmas | UM | −2.716 | −78.630 | 2400 | 1341 |
| M137 | Biblián | BM | −2.709 | −78.892 | 2640 | 1001 |
Figure 1: The study area: the Paute basin in the Andes of Ecuador.
NCEP/NCAR reanalysis 1 data [13] with 2.5°×2.5° resolution was used as input. The selected synoptic predictors for the SDSM and AI models are presented in Table 2.
Table 2: Synoptic predictors. An asterisk (∗) marks a selected predictor; the table columns are ANN/LS-SVM (all stations) followed by SDSM at El Labrado, Gualaceo, Paute, Palmas, and Biblián.

| # | Synoptic predictor | Selected (∗) |
|---|---|---|
| 1 | Precipitation | ∗ ∗ ∗ ∗ ∗ ∗ |
| 2 | Pressure (surface) | ∗ ∗ ∗ ∗ |
| 3 | Relative humidity (surface) | ∗ ∗ ∗ ∗ ∗ |
| 4 | Specific humidity 700 hPa | ∗ |
| 5 | Sea level pressure | ∗ |
| 6 | Temperature 2 m | ∗ ∗ ∗ ∗ ∗ |
| 7 | Potential temperature 700 hPa | ∗ ∗ ∗ ∗ |
| 8 | Zonal wind (surface) | ∗ |
| 9 | Omega 500 hPa | ∗ ∗ ∗ |
| 10 | Geopotential height 200 hPa | ∗ ∗ |
| 11 | Geopotential height 500 hPa | ∗ |
| 12 | Geopotential height 850 hPa | ∗ ∗ |
3. Methods
The methodology of the present study comprises the following steps: (i) selection of the predictors, (ii) calibration and validation of the SDSM and AI models, (iii) SDSM and AI ensemble generation, (iv) bias correction by quantile mapping, and (v) evaluation of results. SDSM can be conceived as a weather generator model, while ANN and LS-SVM are both transfer functions in statistical downscaling models. Given the analogy between the ANN and LS-SVM models, their results were combined into one ensemble, which was compared to the SDSM ensemble. Although it is well known [14] that selecting predictors for specific months or seasons can improve the accuracy of climate projections, in the present study only one set of predictors for the whole year was considered, given that the main interest was the comparison of downscaling methods rather than the analysis of climate projections. The same predictors were used for all stations in the AI ensemble of ANN and LS-SVM models, whereas a separate set of predictors was used for each station in the SDSM model. A more detailed description of the tested downscaling models is given in the following paragraphs.
3.1. Downscaling Using SDSM
SDSM is a hybrid between a stochastic weather generator and a multilinear regression method [14] that relates synoptic-scale weather variables to local meteorological variables through statistical relationships. To agree better with the variance of the observed time series, stochastic techniques are used to artificially inflate the variance of the downscaled weather time series. The reader is referred to [3] for a detailed description of the SDSM technique, an approach widely used for the downscaling of large-scale meteorological, hydrological, and environmental variables.
Partial correlation at a significance level of p=0.05 was used to select, for each climate station, the predictors that best capture the effect of global climate. Precipitation from the reanalysis product proved to be a predictor that considerably improved the downscaling. For the climate stations Biblián, Paute, and Gualaceo, 5 predictors were identified, of which 4 are shared: precipitation, pressure, relative humidity, and the temperature 2 m above the surface. For the Palmas station, the only station with a unimodal regime, the best results were obtained using three predictors, namely, precipitation, potential temperature at the 700 hPa level, and the geopotential height at the 850 hPa level. Table 2 provides the list of the selected predictors as a function of the downscaling technique and station.
Using the selected predictors, daily precipitation was downscaled. Given the poor match with the observed daily time series, monthly time series were generated. At the monthly scale the precipitation time series in the study basin are continuous; months with zero precipitation do not occur, even during the dry months. For the SDSM model, 20 versions were applied [15]. The median of the simulated results was then taken as the representative value of the ensemble of SDSM model variants.
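The aggregation from daily to monthly values and the ensemble median described above can be sketched as follows. This is a minimal illustration with toy numbers, not the SDSM software; the function names and the data layout (date tuples as keys) are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def monthly_totals(daily):
    """Sum daily precipitation (mm) into monthly totals.

    daily: iterable of ((year, month, day), precipitation_mm) pairs.
    """
    totals = defaultdict(float)
    for (year, month, _day), value in daily:
        totals[(year, month)] += value
    return dict(totals)


def ensemble_median(members):
    """Median across ensemble members for every (year, month) key.

    members: list of dicts {(year, month): mm} that share the same keys.
    """
    keys = sorted(members[0])
    return {key: median(m[key] for m in members) for key in keys}


# Toy data: one daily series and three hypothetical ensemble members.
daily = [((1980, 1, 1), 2.0), ((1980, 1, 2), 3.0), ((1980, 2, 1), 1.0)]
member = monthly_totals(daily)
members = [
    member,
    {key: value + 4.0 for key, value in member.items()},
    {key: value + 1.0 for key, value in member.items()},
]
representative = ensemble_median(members)
```

In the study the same idea would apply to the 20 SDSM versions instead of the three toy members used here.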
3.2. Downscaling Using ANN
An artificial neural network (ANN) is composed of several interconnected layers of processing units (the neurons) that transform inputs into outputs. The inputs of each neuron are multiplied by weights, summed, and then passed through an activation function. ANNs are characterized by their topology, and probably the most widely known neural network is the multilayer perceptron (MLP). It consists of multiple layers of adaptive weights with full connectivity between inputs and hidden units and between hidden units and outputs. The MLP is a feed-forward artificial neural network that maps sets of input data onto a set of appropriate outputs.
The neural network toolbox of Matlab [16] was used, and the networks were optimized with the Levenberg-Marquardt method, minimizing the mean square error. A total of four ANNs were tested, combining either one or two hidden layers with a linear or sigmoidal transfer function in the neurons (Table 3). The input layer of all networks had 12 neurons (equal to the number of predictors; see Table 2). The networks with one hidden layer used 8 neurons, a number determined by trial and error. Similarly, the networks with two hidden layers used 9 and 5 neurons, respectively.
Table 3: Artificial intelligence models.

| Model | Type | Transfer function |
|---|---|---|
| 1 | ANN | Linear & 1 hidden layer |
| 2 | ANN | Sigmoidal & 1 hidden layer |
| 3 | ANN | Linear & 2 hidden layers |
| 4 | ANN | Sigmoidal & 2 hidden layers |
| 5 | LS-SVM | Linear |
| 6 | LS-SVM | Polynomial |
| 7 | LS-SVM | Radial basis functions |
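The forward pass of an MLP with the 12-8-1 topology described in Section 3.2 (12 inputs, one hidden layer of 8 sigmoidal neurons, one output) can be sketched as follows. This is only an illustration of the feed-forward mapping: the weights are random and untrained, the linear output neuron is an assumption, and the Levenberg-Marquardt training performed in Matlab is not reproduced. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b) for every neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def mlp_forward(predictors, layers):
    """layers: list of (weights, biases, activation), applied in order."""
    signal = predictors
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal


rng = random.Random(0)
hidden = init_layer(8, 12, rng) + (sigmoid,)        # 12 inputs -> 8 neurons
output = init_layer(1, 8, rng) + (lambda z: z,)     # 8 neurons -> 1 output
predictors = [rng.gauss(0.0, 1.0) for _ in range(12)]  # toy standardized inputs
estimate = mlp_forward(predictors, [hidden, output])
```

A network with two hidden layers (9 and 5 neurons) would simply add one more `(weights, biases, activation)` entry to the list.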
3.3. Downscaling Using LS-SVM
Support vector machines, SVM [7], solve nonlinear classification problems and estimate functions and densities using quadratic programming. A least squares support vector machine, LS-SVM [8], is a reformulation of an SVM that replaces the solution of the convex quadratic programming problem by the solution of a set of linear equations. As explained by [8], it is possible to incorporate robustness, sparseness, and weightings into LS-SVM.
Because LS-SVM has been applied to the downscaling of GCMs more recently than the ANN and SDSM techniques, we present here a succinct description of LS-SVM theory. For a more in-depth description of LS-SVM see [8].
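Let us consider $x \in \mathbb{R}^{n}$ and $y \in \mathbb{R}$; the LS-SVM model, mapping $x$ into a feature space through $\varphi$, is
$$y = w^{T}\varphi(x) + b. \tag{1}$$
The optimization problem can be stated as
$$\min_{w,e} \Gamma(w,e) = \frac{1}{2} w^{T} w + \gamma \frac{1}{2} \sum_{i=1}^{N} e_{i}^{2}, \tag{2}$$
where $e$ and $\gamma$ are the error and the regularization parameter, respectively. The minimization of the cost function $\Gamma(w,e)$ is subject to the constraints
$$y_{i} = w^{T}\varphi(x_{i}) + b + e_{i}. \tag{3}$$
The Lagrangian of the optimization is
$$L(w,b,e,\alpha) = \Gamma(w,e) - \sum_{i=1}^{N} \alpha_{i}\left[w^{T}\varphi(x_{i}) + b + e_{i} - y_{i}\right], \tag{4}$$
where $\alpha_{i}$ are the Lagrangian multipliers.
After considering the conditions for optimality,
$$\left[\frac{\partial L}{\partial w}, \frac{\partial L}{\partial b}, \frac{\partial L}{\partial e_{i}}, \frac{\partial L}{\partial \alpha_{i}}\right]^{T} = [0, 0, 0, 0]^{T}, \tag{5}$$
we obtain the matrix equation
$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \tag{6}$$
where $\Omega_{ij} = \varphi(x_{i})^{T}\varphi(x_{j})$ is the kernel function $K(x_{i},x_{j})$, $\mathbf{1}$ is a vector of ones, and $I$ is the identity matrix.
Finally, the LS-SVM model for function estimation is
$$y(x) = \sum_{i=1}^{N} \alpha_{i} K(x_{i},x) + b. \tag{7}$$
A Bayesian framework with three levels of inference was developed for the optimization of parameters [8]. The LS-SVMlab tool [17], developed in Matlab, was used within the ensemble of AI methods with three kernel options (linear, polynomial, and RBF).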
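To make the reduction to a linear system concrete, the sketch below assembles the matrix of Eq. (6) for an RBF kernel, solves it, and evaluates Eq. (7). It is a minimal pure-Python illustration, not the LS-SVMlab implementation: the values of `gamma` and `sigma` are hypothetical and fixed rather than tuned with the Bayesian framework, the data are toy values, and the small Gaussian elimination routine stands in for a library solver.

```python
import math


def rbf_kernel(u, v, sigma=1.0):
    """K(u, v) = exp(-||u - v||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve_linear_system(A, rhs):
    """Gaussian elimination with partial pivoting; returns x with A x = rhs."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, Omega + I/gamma]] [b, alpha]^T = [0, y]^T."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = 1.0
        A[i + 1][0] = 1.0
        for j in range(n):
            ridge = 1.0 / gamma if i == j else 0.0
            A[i + 1][j + 1] = rbf_kernel(X[i], X[j], sigma) + ridge
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]          # b, alpha


def lssvm_predict(X_train, b, alpha, x, sigma=1.0):
    """Eq. (7): y(x) = sum_i alpha_i K(x_i, x) + b."""
    return sum(a * rbf_kernel(xi, x, sigma) for a, xi in zip(alpha, X_train)) + b


# Toy one-dimensional example (hypothetical values).
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 0.8, 0.9, 0.1, -0.8]
b, alpha = lssvm_fit(X, y, gamma=10.0)
```

Two properties follow directly from the optimality conditions and can serve as a check: the multipliers sum to zero, and the training residual of each sample equals `alpha[i] / gamma`.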
3.4. Bias Correction Using the Quantile Mapping Approach
First the predictors were selected, followed by the application of the multiple compositions of the SDSM model and the 7 AI models (4 ANNs and 3 LS-SVM models; Table 3). The output distributions of the ensemble of SDSM models and of the 7 AI models were grouped into two distinct populations. Both populations were corrected for bias and variance inflation by applying the quantile mapping technique. From now on, QM applied to the SDSM population is called SDSM_QM and QM applied to the AI population is called AI_QM.
The quantile mapping technique for bias correction and variance inflation can be regarded as a statistical transformation of the original distribution into a reference probability distribution. It aims to find the function that maps the distribution of the model variables into the distribution of the observed variables. Basically three types of approaches exist to find the mapping function; they are adjustment (i) based on theoretical distributions, (ii) based on parametric transformations, and (iii) based on nonparametric transformations.
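As an illustration of the general idea, the sketch below implements a crude nonparametric (empirical) quantile mapping, that is, approach (iii). It is not the transformation adopted in this study; it only shows how a modeled value is replaced by the observed value with the same empirical non-exceedance rank. The function name and the toy samples are hypothetical, and no interpolation between quantiles is attempted.

```python
from bisect import bisect_right


def empirical_quantile_map(value, modeled_cal, observed_cal):
    """Map `value` to the observed value that has the same empirical rank.

    modeled_cal, observed_cal: calibration samples of modeled and observed
    precipitation (mm). Values outside the modeled range are mapped to the
    smallest or largest observed value.
    """
    mod = sorted(modeled_cal)
    obs = sorted(observed_cal)
    rank = bisect_right(mod, value) / len(mod)      # empirical CDF of the model
    index = min(max(int(round(rank * len(obs))) - 1, 0), len(obs) - 1)
    return obs[index]


# Toy calibration samples (mm): the model is too dry and too narrow.
modeled_cal = [10.0, 20.0, 30.0, 40.0]
observed_cal = [15.0, 35.0, 55.0, 75.0]
corrected = [empirical_quantile_map(v, modeled_cal, observed_cal)
             for v in (5.0, 20.0, 100.0)]
```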
After evaluating several transformations, the power parametric transformation was applied because of its good results and the parsimony of the model [9]. The power parametric transformation is defined as
$$P_{0}^{*} = b\,(P_{m} - x_{0})^{a}, \tag{8}$$
where $P_{0}^{*}$ and $P_{m}$ are the corrected modeled and the modeled variables, respectively, and $a$, $b$, and $x_{0}$ are parameters to be determined. To find the parameters $a$, $b$, and $x_{0}$, the R package qmap was used [18]. For SDSM_QM and AI_QM the parameters obtained after QM adjustment are presented in Table 4.
Table 4: Quantile mapping parameters for artificial intelligence and SDSM ensembles.

| Model | Station | a | b | x0 |
|---|---|---|---|---|
| AI | Biblián | 0.03280 | 1.83727 | 0.87053 |
| AI | El Labrado | 0.00820 | 1.98426 | −8.57836 |
| AI | Palmas | 0.00334 | 2.21628 | 6.04480 |
| AI | Gualaceo | 0.31943 | 1.40124 | 16.92140 |
| AI | Paute | 0.02791 | 1.94766 | 0.03257 |
| SDSM | Biblián | 0.00986 | 2.08840 | 3.21195 |
| SDSM | El Labrado | 0.39313 | 1.19070 | −45.84738 |
| SDSM | Palmas | 0.00056 | 2.46774 | −13.06023 |
| SDSM | Gualaceo | 0.00077 | 2.62716 | −1.14051 |
| SDSM | Paute | 0.04141 | 1.80403 | 3.91130 |
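Applying the power transformation of Eq. (8) to a modeled series can be sketched as follows. The parameter values and the monthly values are hypothetical and chosen only for illustration (they are not taken from Table 4), and the treatment of values at or below x0 is an assumption of this sketch, not a documented behavior of the R tool used in the study.

```python
def power_transform(p_model, a, b, x0):
    """Eq. (8): corrected value b * (p_model - x0) ** a.

    Assumption: values at or below x0 are mapped to zero, so that the
    power is never taken of a negative number.
    """
    shifted = p_model - x0
    return b * shifted ** a if shifted > 0.0 else 0.0


# Hypothetical parameters and toy modeled monthly precipitation (mm).
a, b, x0 = 1.2, 0.5, 2.0
modeled = [1.0, 3.0, 10.0, 34.0, 66.0]
corrected = [power_transform(p, a, b, x0) for p in modeled]
```

Because the transformation is monotonic above x0, it preserves the ranking of the modeled values while reshaping their distribution.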
3.5. Criteria for Model Evaluation
For the qualitative assessment of the downscaling methods, the long-term monthly mean precipitation of both ensembles, SDSM and AI, was compared with the long-term monthly mean observed precipitation. For the quantitative evaluation, statistical metrics of the distributions of the observed and modeled time series were calculated for the validation period. In addition to the statistical metrics, Box-Whisker plots presenting both ensembles versus the observations were drawn. This type of graphical representation was selected because, in addition to the median, the Box-Whisker plot depicts the extreme values, that is, the minimum and maximum (the caps at the ends of the whiskers), and the outliers falling more than 1 interquartile range above the third or below the first quartile (the points in the graph).
As statistical metrics the following were used:
Pearson correlation (R): to evaluate the linear correlation between modeled and observed time series.
Mean bias (MB): to evaluate the mean (the 50th percentile) difference between modeled and observed distributions.
The root mean squared error (RMSE): to evaluate the error between modeled and observed time series.
The interquartile relative fraction (IRF): to evaluate the modeled variability representation relative to the observed:
$$\mathrm{IRF} = \frac{Q_{3m} - Q_{1m}}{Q_{3o} - Q_{1o}}, \tag{9}$$
where $Q_{3m}$ and $Q_{3o}$ are the 75th percentiles of the modeled and observed distributions and $Q_{1m}$ and $Q_{1o}$ are the corresponding 25th percentiles. A value of IRF>1 represents overestimation of the variability, IRF=1 is a perfect representation of the variability, and IRF<1 is an underestimation of the variability.
The absolute cumulative bias (ACB): to evaluate the bias of the 25th, 50th, and 75th percentiles:
$$\mathrm{ACB} = \left|Q_{1m} - Q_{1o}\right| + \left|Q_{2m} - Q_{2o}\right| + \left|Q_{3m} - Q_{3o}\right|, \tag{10}$$
where $Q_{2m}$ and $Q_{2o}$ are the 50th percentiles of the modeled and observed distributions. A value of ACB=0 is a perfect representation of the three percentiles (the 25th, 50th, and 75th) of the modeled and observed distributions, while any under- or overestimation makes ACB positive.
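The five metrics listed above can be computed with the Python standard library as sketched below. Several choices are assumptions of this sketch rather than statements of the study: MB is taken as the difference between the modeled and observed medians (modeled minus observed), following the reference to the 50th percentile; quartiles use the "inclusive" method of `statistics.quantiles`; and the toy series are hypothetical.

```python
import math
from statistics import mean, quantiles


def quartiles(values):
    """Return [Q1, Q2, Q3]; the percentile convention is an assumption."""
    return quantiles(values, n=4, method="inclusive")


def pearson_r(obs, mod):
    mo, mm = mean(obs), mean(mod)
    cov = sum((o - mo) * (m - mm) for o, m in zip(obs, mod))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_m = sum((m - mm) ** 2 for m in mod)
    return cov / math.sqrt(var_o * var_m)


def rmse(obs, mod):
    return math.sqrt(mean((m - o) ** 2 for o, m in zip(obs, mod)))


def median_bias(obs, mod):
    """MB interpreted as modeled median minus observed median (assumption)."""
    return quartiles(mod)[1] - quartiles(obs)[1]


def irf(obs, mod):
    """Eq. (9): modeled over observed interquartile range."""
    q1o, _, q3o = quartiles(obs)
    q1m, _, q3m = quartiles(mod)
    return (q3m - q1m) / (q3o - q1o)


def acb(obs, mod):
    """Eq. (10): sum of absolute differences of the three quartiles."""
    return sum(abs(qm - qo) for qo, qm in zip(quartiles(obs), quartiles(mod)))


# Toy series (mm): the model doubles every observed value.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
modeled = [2.0, 4.0, 6.0, 8.0, 10.0]
```

For these toy series the interquartile range is doubled (IRF = 2), the median bias is 3 mm, and ACB is 2 + 3 + 4 = 9 mm, while the correlation is perfect.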
4. Results and Discussion
4.1. Evaluation of SDSM and AI Ensembles
4.1.1. Relative Performance of ANN and LS-SVM Models
As mentioned before, the ANN and LS-SVM models were grouped into one ensemble, considering both as transfer-function-based models. Although previous studies [19, 20] demonstrated the superior performance of SVM-based downscaling methods relative to other approaches, including ANN-based methods, we compared, for the first time in the region to the best of our knowledge, an ANN ensemble against an LS-SVM ensemble to evaluate their downscaling performance. The ensembles were not bias corrected in order to evaluate their actual performance. Table 5 presents the values of R, RMSE, MB, IRF, and ACB for the ANN and LS-SVM ensembles. For the El Labrado and Paute stations both ensembles yield similar results. For the Gualaceo station, however, both ensembles underrepresent the variability: IRF is 0.55 and 0.40 for ANN and LS-SVM, respectively, so the LS-SVM ensemble represents 15 percentage points less of the observed variability than the ANN ensemble. The MB for ANN is −2.29 mm whereas for LS-SVM it is −0.21 mm, both very low at the monthly scale. The ACB for ANN is 33.41 mm whereas for LS-SVM it is 41.70 mm, meaning that although the MB is lower for LS-SVM, its bias in the 25th and 75th percentiles is higher than that of the ANN ensemble. For the Palmas station R is 0.25 and 0.45 for the ANN and LS-SVM ensembles, respectively. The IRF values of 0.45 and 0.53 for ANN and LS-SVM indicate a better representation of the variability by the latter approach. However, MB values of −1.85 mm and 7.87 mm mean that LS-SVM presents more bias in the 50th percentile than ANN. For the Biblián station, MB is −18.39 mm and −7.05 mm and ACB is 52.34 mm and 35.49 mm for the ANN and LS-SVM ensembles, respectively, meaning a stronger bias for the ANN ensemble than for the LS-SVM ensemble.
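Table 5: Statistical metrics for ANN and LS-SVM ensembles.

| Station | Metric | ANN | LS-SVM |
|---|---|---|---|
| El Labrado | Pearson correlation (R) | 0.49 | 0.58 |
| El Labrado | IRF | 0.59 | 0.53 |
| El Labrado | Mean bias, MB (mm) | 4.20 | 6.16 |
| El Labrado | Cumulative bias, ACB (mm) | 31.63 | 37.79 |
| El Labrado | RMSE (mm) | 40.73 | 37.63 |
| Gualaceo | Pearson correlation (R) | 0.70 | 0.71 |
| Gualaceo | IRF | 0.55 | 0.40 |
| Gualaceo | Mean bias, MB (mm) | −2.29 | −0.21 |
| Gualaceo | Cumulative bias, ACB (mm) | 33.41 | 41.70 |
| Gualaceo | RMSE (mm) | 34.82 | 35.33 |
| Paute | Pearson correlation (R) | 0.55 | 0.54 |
| Paute | IRF | 0.41 | 0.31 |
| Paute | Mean bias, MB (mm) | −9.90 | −8.34 |
| Paute | Cumulative bias, ACB (mm) | 44.16 | 48.21 |
| Paute | RMSE (mm) | 31.61 | 31.04 |
| Palmas | Pearson correlation (R) | 0.25 | 0.45 |
| Palmas | IRF | 0.45 | 0.53 |
| Palmas | Mean bias, MB (mm) | −1.85 | 7.87 |
| Palmas | Cumulative bias, ACB (mm) | 36.99 | 37.81 |
| Palmas | RMSE (mm) | 52.44 | 47.51 |
| Biblián | Pearson correlation (R) | 0.67 | 0.65 |
| Biblián | IRF | 0.57 | 0.54 |
| Biblián | Mean bias, MB (mm) | −18.39 | −7.05 |
| Biblián | Cumulative bias, ACB (mm) | 52.34 | 35.49 |
| Biblián | RMSE (mm) | 44.75 | 40.97 |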
For a qualitative evaluation of the ANN and LS-SVM ensembles, the Box-Whisker plots of the results for the validation period are presented in Figure 2. For the El Labrado station both ensembles represent the median similarly, although the low variance is clearly shown, as measured by IRF in Table 5. For the Gualaceo and Paute stations both ensembles underrepresent the variance, with LS-SVM presenting the lower values. The percentiles above the 75th are strongly underestimated at both stations, which makes the need to correct the distribution of the ensembles evident. For the Palmas station both ensembles underrepresent the variance, and the median is correctly represented by ANN but overestimated by LS-SVM. Finally, for the Biblián station the variance is strongly underestimated, as are the higher percentiles. The median is better represented by the LS-SVM ensemble and underestimated by the ANN ensemble. Both methods performed similarly well for the downscaling of monthly precipitation at the selected stations. In addition, comparing the quantitative analysis based on the statistical metrics with the qualitative analysis based on Box-Whisker plots shed light on the relative performance of the ANN and LS-SVM methods.
Figure 2: Box plots for ANN and LS-SVM ensembles evaluated in station El Labrado (M141), Gualaceo (M139), Paute (M138), Palmas (M045), and Biblián (M137).
4.1.2. Comparison of SDSM and AI Ensembles
Once the ANN and LS-SVM ensembles were evaluated, in a next step the derived SDSM and AI ensembles were compared after QM correction. Table 6 presents, for the 5 climate stations in the Paute River basin used in this study, the comparison of the SDSM and AI ensembles with SDSM_QM and AI_QM. The evaluation is based on the statistical metrics R, RMSE, MB, ACB, and IRF. For the El Labrado station SDSM_QM presents a lower correlation than AI_QM, with values of 0.38 and 0.58, respectively. Although SDSM_QM presents a lower bias than AI_QM, the RMSE of AI_QM is lower. For the Gualaceo station R is 0.72 for AI_QM and 0.50 for SDSM_QM. As for the El Labrado station, RMSE is lower for AI_QM than for SDSM_QM, with 32.87 and 44.86, respectively. For the Paute station R is also higher for AI_QM, with 0.57 versus 0.47 for SDSM_QM, although the ACB is higher for AI_QM, with 15.59 versus 7.77 for SDSM_QM. For the Palmas station there is a marked difference in R, with 0.44 for AI_QM and 0.14 for SDSM_QM, and the former also has a lower RMSE than the latter. Similarly, for the Biblián station the R value is higher for AI_QM than for SDSM_QM, 0.67 versus 0.45. As for the Palmas station, the RMSE of AI_QM, 39.87, is lower than the RMSE of 51.81 for SDSM_QM.
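Table 6: Statistical metrics for artificial intelligence and SDSM ensembles.

| Station | Metric | AI | SDSM | AI_QM | SDSM_QM |
|---|---|---|---|---|---|
| El Labrado | Pearson correlation (R) | 0.58 | 0.37 | 0.58 | 0.38 |
| El Labrado | IRF | 0.49 | 1.00 | 0.87 | 1.14 |
| El Labrado | Mean bias, MB (mm) | 4.37 | −41.49 | −3.24 | −0.54 |
| El Labrado | Cumulative bias, ACB (mm) | 38.23 | 126.89 | 12.19 | 9.96 |
| El Labrado | RMSE (mm) | 37.70 | 65.41 | 41.67 | 51.44 |
| Gualaceo | Pearson correlation (R) | 0.74 | 0.53 | 0.72 | 0.50 |
| Gualaceo | IRF | 0.52 | 0.46 | 1.04 | 1.03 |
| Gualaceo | Mean bias, MB (mm) | −1.01 | 10.28 | 4.33 | 1.85 |
| Gualaceo | Cumulative bias, ACB (mm) | 34.02 | 47.50 | 7.35 | 9.01 |
| Gualaceo | RMSE (mm) | 33.92 | 38.26 | 32.87 | 44.86 |
| Paute | Pearson correlation (R) | 0.59 | 0.47 | 0.57 | 0.47 |
| Paute | IRF | 0.36 | 0.47 | 0.80 | 0.89 |
| Paute | Mean bias, MB (mm) | −10.46 | 1.14 | −4.26 | 1.34 |
| Paute | Cumulative bias, ACB (mm) | 47.61 | 31.73 | 15.59 | 7.77 |
| Paute | RMSE (mm) | 30.60 | 30.26 | 31.13 | 35.03 |
| Palmas | Pearson correlation (R) | 0.44 | 0.16 | 0.44 | 0.14 |
| Palmas | IRF | 0.52 | 0.54 | 1.12 | 1.02 |
| Palmas | Mean bias, MB (mm) | 7.39 | 16.31 | 0.97 | −3.66 |
| Palmas | Cumulative bias, ACB (mm) | 38.14 | 56.93 | 8.77 | 12.28 |
| Palmas | RMSE (mm) | 47.45 | 56.09 | 55.73 | 66.62 |
| Biblián | Pearson correlation (R) | 0.66 | 0.46 | 0.67 | 0.45 |
| Biblián | IRF | 0.61 | 0.56 | 1.29 | 1.25 |
| Biblián | Mean bias, MB (mm) | −11.12 | −2.87 | −1.21 | 1.66 |
| Biblián | Cumulative bias, ACB (mm) | 35.01 | 29.99 | 18.91 | 17.30 |
| Biblián | RMSE (mm) | 41.73 | 44.89 | 39.87 | 51.81 |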
All other metrics in Table 6 present similar values. The strongest differences generally arise in the RMSE and R metrics, which might be related to the fact that QM corrects only the characteristics of the distribution, as can be seen in Figure 3. This figure presents the monthly precipitation Box-Whisker plots of the bias-corrected AI and SDSM ensembles for the 5 climate stations. As can be observed, the distributions are fairly alike. For all stations AI_QM presented higher values of R than SDSM_QM. Similarly, AI_QM presents better agreement with the observed data, with the exception of the Paute station. This might point to a slightly better representation of the observed monthly precipitation distribution by the AI_QM ensemble for this specific region.
Figure 3: Monthly precipitation box plots for artificial intelligence and SDSM ensembles bias corrected. Results evaluated in station El Labrado (M141), Gualaceo (M139), Paute (M138), Palmas (M045), and Biblián (M137).
4.2. Evaluation of Intra-Annual Precipitation Seasonality Representation
Whereas the previous section evaluated the entire distribution of the downscaled estimates, this section evaluates the representation of seasonality. Although the evaluation of seasonality representation might not help to quantify, for example, flooding events, it is very important for issues related to water availability for hydroelectricity generation, drinking water supply, and agriculture. Further, the evaluation of seasonality representation is of special importance in the study region because the low resolution of GCMs makes them unable to depict the precipitation regime, which is shaped by mesoscale influences [5].
4.2.1. The Added Value of Quantile Mapping
The QM correction parameters for the power parametric transformation applied to the AI and SDSM ensembles are presented in Table 4. The comparison of the multiyear monthly mean (mymm) precipitation of the SDSM ensemble with that of the SDSM_QM ensemble is presented in Figures 4(a)–4(e). As shown in Figure 4(a), SDSM applied to the El Labrado station fails to capture the observed seasonality. However, the performance, bias, and variance improved considerably after applying QM. For the Gualaceo station (Figure 4(b)), SDSM represents seasonality less correctly: it fails to capture the maximum in April and overestimates precipitation during the dry season in August. Application of QM corrects the representation in August, but not in April. For the Paute station (Figure 4(c)), the ensemble of SDSM compositions represents seasonality well but significantly underestimates the November maximum. Application of QM improves the representation of seasonality but does not improve the November estimate. The SDSM approach calibrated to the observations of the Palmas station (Figure 4(d)), a station with a unimodal regime (UM), depicts seasonality fairly correctly, notwithstanding the limited spatial extent of the UM regime and the poor representation of the mesoscale influences in the synoptic predictors. Application of QM negatively affects the SDSM representation during the first 6 months of the year but slightly improves the representation during the remainder of the year. The seasonality of the Biblián station (Figure 4(e)) is properly represented by both the SDSM and the SDSM_QM ensembles. Only the November peak in precipitation is captured by neither SDSM nor SDSM_QM.
Figure 4: Multiyear monthly mean precipitation for SDSM ensemble and SDSM ensemble bias corrected (a–e) and multiyear monthly mean precipitation for AI and AI bias corrected (f–j).
The comparison of the mymm of the AI ensemble with that of the AI_QM ensemble is presented in Figures 4(f)–4(j). For the El Labrado station (Figure 4(f)), the AI ensemble underestimates precipitation in April, overestimates it in July and August, and underestimates it in November, although to some extent it represents seasonality. QM improves the intra-annual variability, but the November maximum is still not correctly estimated. For the Gualaceo station (Figure 4(g)), the AI ensemble captures both peaks, in April and November, well but underestimates precipitation in August. The estimation in August is considerably improved after correction of the bias and inflation of the variance. For the Paute station (Figure 4(h)), both peaks, in April and November, and the minimum in August are well captured by the AI ensemble, while QM further improves the distribution of the median of the monthly precipitation. The AI ensemble represents the distribution of the Palmas station (Figure 4(i), UM regime) poorly, even after QM application. For the Biblián station (Figure 4(j)), the AI_QM ensemble captures the April peak one month early and fails to correctly depict the magnitude of the November peak.
The results clearly reveal that the application of QM to the output of both modeling approaches, SDSM and AI, improves overall the representation of seasonality, as well as the representation of rainy and dry periods. However, both approaches underestimate the median value of the precipitation depth in November. This could indicate that the set of synoptic predictors does not include a variable related to the enhancement of precipitation in this period. Further studies are needed to determine the variables and related phenomena.
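The multiyear monthly mean (mymm) used throughout this subsection is the average, over all years, of the precipitation of each calendar month. A minimal sketch with a hypothetical function name and toy data is:

```python
from collections import defaultdict
from statistics import mean


def multiyear_monthly_mean(series):
    """Return 12 values (January to December) from {(year, month): mm}."""
    by_month = defaultdict(list)
    for (_year, month), value in series.items():
        by_month[month].append(value)
    return [mean(by_month[m]) for m in range(1, 13)]


# Toy two-year series (mm): the second year is 2 mm wetter in every month.
series = {}
for month in range(1, 13):
    series[(2008, month)] = float(month)
    series[(2009, month)] = float(month) + 2.0
mymm = multiyear_monthly_mean(series)
```

Computing this for the observations and for each ensemble gives the twelve-value curves that are compared month by month.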
4.2.2. Representation of Monthly Variability by Downscaled Results
To compare the representativeness of SDSM_QM and AI_QM, the correlation of the mymm time series with the observed values is shown in Table 7. For all stations AI_QM presents a higher Pearson correlation coefficient than SDSM_QM. The multiyear median observed and estimated monthly precipitation depths, using the SDSM_QM and AI_QM model ensembles, are presented in Figures 5(a), 5(c), 5(e), 5(g), and 5(i). For the graphical presentation the Box-Whisker plot was selected in order to show, in addition to the median, the variation in the estimates (see Figures 5(b), 5(d), 5(f), 5(h), and 5(j)).
Table 7: Pearson correlations between observations and SDSM_QM and AI_QM models.

| Station | AI_QM | SDSM_QM |
|---|---|---|
| El Labrado | 0.741 | 0.516 |
| Gualaceo | 0.946 | 0.672 |
| Paute | 0.730 | 0.629 |
| Palmas | 0.788 | 0.593 |
| Biblián | 0.691 | 0.532 |
Figure 5: Comparison of SDSM and AI ensembles bias corrected (a, c, e, g, i) and Box-Whisker plots for SDSM and AI ensembles bias corrected, from January through December (b, d, f, h, j).
For the El Labrado station (Figures 5(a) and 5(b)) the observed interquartile range is higher for the period of January to April, with lower values in August and September. It is worthwhile noticing that although AI_QM captures seasonality, the intra-annual variability is not captured. Even for some months SDSM_QM captures the variability better, as is the case for March. This fact suggests that an assessment for SD should be based on several models. The median and interquartile range are relatively well captured for the Gualaceo station (Figures 5(c) and 5(d)); the variability of the months from January to September is similar for the SDSM-QM and AI_QM estimates. October and November variability is different for the two models, but the median is well represented. The variability and the median are better represented by AI_QM and slightly overestimated by SDSM_QM in the period of June to August. Figures 5(e) and 5(f) depict the results for the Paute station, illustrating that both models relatively well represent the median and interquartile ranges in the period of January to September but fail to do so for the period of October to December. This fact highlights the need to further explore the relation between the synoptic conditions and rainfall. Neither the SDSM_QM nor the AI_QM model estimates correctly the median of the Palmas station (Figures 5(g) and 5(h)), the only station with a unimodal regime. Because both model approaches indistinctly overestimate or underestimate the median in some months, it might be worthwhile to examine more in detail the representation of an ensemble of both models. The interquartile range of each month is relatively well represented except in a distinct number of months, such as January, April, June, July, September, and October. This could mean that in those months the influences of the mesoscale factors are not properly represented in the synoptic variables. 
One possible remedy for this is a methodology that accounts for the influence of mesoscale factors, for example, dynamic downscaling followed by statistical downscaling applied to the regional predictors. However, further studies are necessary to support the applicability of such an approach in mountain regions. For the Biblián station (Figures 5(i) and 5(j)), both approaches overestimate precipitation from January to March. AI_QM adequately captures the median from April to December, with the exception of November, as was the case for the other stations. Overall, AI_QM depicts the variability fairly well, except for October and November, whereas SDSM_QM underestimates the variability throughout the year.
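The month-by-month comparison of medians and interquartile ranges discussed above can be reproduced with a short script. The following is a minimal sketch in Python, not the code used in this study: the function names (`monthly_median_iqr`, `compare_to_observed`) and the input layout (one value per month, paired with its calendar month) are assumptions made for illustration, and the quartiles use the inclusive method of the Python standard library, which may differ slightly from the convention behind the box-and-whisker plots.

```python
from statistics import median, quantiles


def monthly_median_iqr(values, months):
    """Return {month: (median, IQR)} for a monthly precipitation series.

    values: precipitation depths, one per month of record (e.g. mm/month).
    months: calendar month (1-12) of each value, same length as values.
    Each calendar month needs at least two values.
    """
    # Group the series by calendar month.
    by_month = {}
    for value, month in zip(values, months):
        by_month.setdefault(month, []).append(value)

    stats = {}
    for month, vals in sorted(by_month.items()):
        # Quartiles with the inclusive method (assumed convention).
        q1, _, q3 = quantiles(vals, n=4, method="inclusive")
        stats[month] = (median(vals), q3 - q1)
    return stats


def compare_to_observed(observed, modelled, months):
    """Return {month: (median difference, IQR difference)}, modelled minus observed."""
    obs = monthly_median_iqr(observed, months)
    mod = monthly_median_iqr(modelled, months)
    return {m: (mod[m][0] - obs[m][0], mod[m][1] - obs[m][1]) for m in obs}
```

A positive first entry for a month indicates that the downscaled median exceeds the observed one; a negative second entry indicates that the downscaled series is less variable than the observations in that month.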
5. Conclusions
The evaluation of downscaling methods in mountain regions is of major importance because GCMs misrepresent the climate of these regions. The low resolution of GCMs limits the accurate prediction of the probable impacts of climate change at basin scale. The present work studied the applicability of downscaling monthly precipitation from global climate models with SDSM and with artificial intelligence methods, namely neural networks and least squares support vector machines, and compared the methods applied. The comparison revealed that neural networks and least squares support vector machine models perform equally in downscaling monthly precipitation. Considering statistical metrics such as the Pearson correlation, root mean square error, and percentile biases, the artificial intelligence methods overall showed better skill than SDSM, although for some stations and some months both model approaches had to be considered in order to derive robust conclusions. In general, the representation of precipitation from January to August is adequate; however, both approaches failed to represent precipitation at some stations, especially in November. Further analysis of the synoptic conditions for this period is therefore recommended, and a methodology that downscales with predictors specific to each month or season might be advisable. The analysis of the Palmas station, a station with important mesoscale influences, suggests that a methodology applying dynamic and statistical downscaling in cascade deserves further evaluation, since it could help capture features that GCMs are not able to represent.
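For reference, the three metrics named above can be computed as in the following minimal Python sketch. It is an illustration under stated assumptions rather than the implementation used in this work: the function names are hypothetical, and the percentile bias is taken here as the simple difference between the modelled and the observed percentile, which is one common convention and may not match the exact definition applied in the study.

```python
import math
from statistics import mean, quantiles


def pearson(observed, modelled):
    """Pearson correlation coefficient between two equally long series."""
    mo, mm = mean(observed), mean(modelled)
    cov = sum((o - mo) * (m - mm) for o, m in zip(observed, modelled))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sm = math.sqrt(sum((m - mm) ** 2 for m in modelled))
    return cov / (so * sm)


def rmse(observed, modelled):
    """Root mean square error of the modelled series against the observations."""
    squared_errors = [(m - o) ** 2 for o, m in zip(observed, modelled)]
    return math.sqrt(mean(squared_errors))


def percentile_bias(observed, modelled, p):
    """Modelled minus observed p-th percentile (assumed definition), 1 <= p <= 99."""
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer between 1 and 99")
    # quantiles(..., n=100) returns the 99 cut points P1..P99.
    obs_p = quantiles(observed, n=100, method="inclusive")[p - 1]
    mod_p = quantiles(modelled, n=100, method="inclusive")[p - 1]
    return mod_p - obs_p
```

Under this convention, a positive percentile bias at a high percentile (for example, p = 90) would indicate that the downscaled series overestimates the wet months, while the correlation and the RMSE summarise the agreement over the whole series.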
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was funded by the Dirección de Investigación de la Universidad de Cuenca (DIUC) through the project “Análisis de los Efectos del Cambio Climático en los Caudales en las Cuencas Andinas del Sur del Ecuador (Paute), Debido a los Cambios en los Patrones de Lluvia y Temperatura” and by the Secretaría de Educación Superior, Ciencia, Tecnología e Innovación (SENESCYT) through a Ph.D. grant for the first author. The authors would also like to thank INAMHI for providing the meteorological data.