Solar Radiation Models and Gridded Databases to Fill Gaps in Weather Series and to Project Climate Change in Brazil

The quantification of climate change impacts on several human activities depends on reliable weather data series, without gaps and long enough to build future climate scenarios. Based on that, this study aimed to evaluate the performance of temperature-based models for estimating global solar radiation and of gridded databases (AgCFSR, AgMERRA, NASA/POWER, and XAVIER) as alternative ways of filling gaps in historical weather series (1980–2009) in Brazil and to project climate change scenarios based on measured and gridded weather data. Projections for mid- and end-of-century periods (2040–2069 and 2070–2099), using seven global climate models from CMIP5 under intermediate (RCP4.5) and high (RCP8.5) emission scenarios, were performed. The Bristow–Campbell model was the one that best estimated solar radiation, whereas the XAVIER gridded database was the closest to observed weather data. Future climate projections, under RCP4.5 and RCP8.5 scenarios, as expected, showed warmer conditions for all scenarios over Brazil. In contrast, rainfall projections are more uncertain. Despite that, rainfall amounts are projected to decrease in the North-Northeast region and to increase in Southern Brazil. No significant differences between projections using the observed and XAVIER gridded databases were observed; therefore, this database proved to be reliable both for filling gaps and for generating climate change scenarios.


Introduction
Given the projections of global climate changes, simulation models can be used to estimate the impact of historical and future climates on human activities, mainly on crop growth and yield and food availability [1]. For proper simulations, these models require high-quality and long-term historical daily weather data [2]. However, the major difficulty regarding historical weather data in Brazil is the low density of weather stations, associated with the reduced number of measured variables and the large amount of missing data [3][4][5].
To overcome the lack of reliable weather data series, missing data can be filled in with estimated or interpolated data. Among the different approaches used to fill gaps in weather data, the main methods are weather generators, which generate stochastic sequences of daily data, such as the WGEN [6] and SIMMETEO [7] generators; empirical correlations using commonly measured meteorological variables present in the observed data [8][9][10]; and the use of gridded weather databases, based on satellite and/or surface data [2,4,11].
Once the historical data series have been filled, these can be used for generating future climate scenarios, derived from projections of climate models, which can be global (GCMs) or regional (RCMs). Despite the finer resolution of RCMs, considering the continental dimension of Brazil, GCMs (which would provide the RCM boundary conditions) offer insight into the general characteristics of future climate [12,13].
Due to the uncertainties associated with the GCM projections, different models can indicate different climate responses, and one way to reduce such uncertainty is by considering an ensemble modeling approach [14], with the projections being obtained from multiple models, resulting in more reliable scenarios than if the models are considered individually [15]. These future changes can be projected based on GCMs generated by the Coupled Model Intercomparison Project Phase 5 (CMIP5 [16]), under different greenhouse gas emissions that follow distinct representative concentration pathways (RCPs) [17][18][19], assessed in the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (IPCC) [20]. For South America and specifically for Brazil, the first projections have indicated an increase in temperatures and an uncertain pattern in the rainfall distribution [12,13]. Such patterns have been confirmed in the more recent studies of Chou et al. and Sánchez et al.
Given the great importance of historical weather data for assessing the impacts of climate change on human activities, mainly agriculture, in addition to the fact that Brazil has a low weather station density, with a large amount of missing data [3][4][5], the general objective of this study was to evaluate the performance of different alternatives to fill in weather data gaps and, based on that, to create climate change scenarios for Brazil. More specifically, this study aimed (i) to evaluate the performance of temperature-based models for estimating solar radiation and of gridded databases, such as AgCFSR, AgMERRA, NASA/POWER, and XAVIER, as procedures to fill in gaps in weather data (maximum and minimum air temperature, solar radiation, rainfall, wind speed, and relative humidity) for the period of 1980-2009; (ii) to generate, from the complete historical weather data, climate change scenarios for the medium-term (2040-2069) and long-term (2070-2099) periods, based on seven GCMs of CMIP5, under intermediate (RCP4.5) and high (RCP8.5) emission scenarios; and (iii) to identify patterns of climate change in air temperature and rainfall in different Brazilian regions to define the expected trends in relation to the historical climate.

Materials and Methods
The present study was developed following the sequence of steps presented in the flow chart of Figure 1 and described in the following sections.

Sites and Weather Data.
Historical daily measured weather data of maximum and minimum air temperature, sunshine hours, rainfall, wind speed, and relative humidity, from 1980 to 2009, were obtained from the Brazilian National Institute of Meteorology (INMET). Thirty-one sites well distributed across the country were considered, as presented in Figure 2. A more detailed description of the percentage of missing values for each weather variable is given in Table S1 of Supplementary Materials.

Filling Gaps in the Meteorological Database.
Due to the large percentage of missing data in the historical weather databases, ranging from 1 to 46% (Figure 2 and Table S1 of Supplementary Materials), weather variables were generated by temperature-based models (solar radiation) and gridded databases (all variables), as alternatives to fill in these gaps. [...] [8] and Prescott [24], with coefficients as suggested by Glover and McCulloch [25], and then admitted as the reference values (Table 1). The temperature-based models for estimating solar radiation use maximum and minimum air temperatures as inputs to estimate atmospheric transmissivity [10], which is affected by cloudiness. Five solar radiation models (Hargreaves (Ha), Hunt (Hu), Annandale (An), Bristow-Campbell (BC), and Donatelli-Campbell (DC)) were assessed as presented in Table 1.
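As an illustration of how such temperature-based models work, the sketch below implements two of them in Python. Table 1 is not reproduced in this text, so the functional forms used here are the commonly published Hargreaves and Bristow-Campbell forms and should be read as an assumption rather than as the exact equations of this study. The function names, the argument `q_o` (extraterrestrial solar radiation, assumed to be supplied by the caller in the same units as the output), and the use of the plain daily range Tmax minus Tmin are likewise illustrative choices. The default coefficients are values quoted in this paper (0.16 °C−0.5 for Ha; approximately 0.75, 0.03, and 1.63 for e, f, and g of BC).

```python
import math


def hargreaves(q_o, t_max, t_min, a=0.16):
    """Hargreaves-type estimate (assumed form): Qg = a * Qo * sqrt(Tmax - Tmin).

    q_o   : extraterrestrial solar radiation (hypothetical input name)
    a     : empirical coefficient in degC**-0.5
    """
    delta_t = max(t_max - t_min, 0.0)  # guard against inconsistent records
    return a * q_o * math.sqrt(delta_t)


def bristow_campbell(q_o, t_max, t_min, e=0.75, f=0.03, g=1.63):
    """Bristow-Campbell-type estimate (assumed form):
    Qg = Qo * e * (1 - exp(-f * dT**g)).

    e is the maximum (clear-sky) transmissivity; f and g control how fast
    transmissivity rises with the daily temperature range dT.
    """
    delta_t = max(t_max - t_min, 0.0)
    transmissivity = e * (1.0 - math.exp(-f * delta_t ** g))
    return q_o * transmissivity
```

In both functions a larger daily temperature range is interpreted as a clearer sky and therefore as higher transmissivity; in the Bristow-Campbell form the estimate can never exceed `e * q_o`.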

Daily Gridded Database.
Gaps in measured weather data (maximum and minimum air temperature, solar radiation, rainfall, wind speed, and relative humidity) were also filled in with data from the following four gridded databases: (a) AgCFSR and AgMERRA datasets [11], developed as a part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) [31], to provide consistent, daily time series with global coverage of climate variables. They are the result of a combination of NCEP's Climate Forecast System Reanalysis (CFSR) [32] and NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA) [33] with observed datasets from weather station networks and satellites, available on a daily temporal scale, for the period between 1980 and 2010, at 0.25° × 0.25° horizontal resolution.
Concerning the solar radiation models, two independent datasets were considered with two years each, for the calibration and evaluation of the adjusted coefficients. To avoid inconsistencies in the analysis, two consecutive years with less than 2% of missing data (temperature and sunshine hours) were chosen. For the evaluation of the gridded weather data, the entire database was employed for the period between 1980 and 2009. The performance of temperature-based solar radiation models and gridded databases for filling in daily data gaps was assessed by comparing estimated and measured data on a daily basis, using common model performance evaluation indices: the coefficient of determination (r²) as a measure of precision; the agreement index (d) [35] as a measure of accuracy; the confidence index (c) [36] (classified as great for values higher than 0.85, very good for values between 0.76 and 0.85, good between 0.66 and 0.75, median between 0.61 and 0.65, suffering between 0.51 and 0.60, bad between 0.41 and 0.50, and terrible for values lower than 0.41); the mean error or bias (Bias), which indicates the tendency of the error; and the mean absolute error (MAE), which gives the magnitude of the errors [37].
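The indices above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the usual definitions of Willmott's agreement index d and of the confidence index as the product c = r × d, takes Bias as estimated minus measured, and assigns c values that fall between the published class limits (for example 0.755) to the lower class. The function names are hypothetical.

```python
import math


def performance_indices(estimated, measured):
    """Return r2, d, c, bias and MAE for paired daily values."""
    n = len(measured)
    mean_e = sum(estimated) / n
    mean_m = sum(measured) / n

    # Pearson correlation r between estimated and measured values
    sxy = sum((e - mean_e) * (m - mean_m) for e, m in zip(estimated, measured))
    sxx = sum((e - mean_e) ** 2 for e in estimated)
    syy = sum((m - mean_m) ** 2 for m in measured)
    r = sxy / math.sqrt(sxx * syy)

    # Willmott agreement index d (assumed standard definition)
    num = sum((e - m) ** 2 for e, m in zip(estimated, measured))
    den = sum((abs(e - mean_m) + abs(m - mean_m)) ** 2
              for e, m in zip(estimated, measured))
    d = 1.0 - num / den

    bias = sum(e - m for e, m in zip(estimated, measured)) / n
    mae = sum(abs(e - m) for e, m in zip(estimated, measured)) / n
    return {"r2": r * r, "d": d, "c": r * d, "bias": bias, "mae": mae}


def classify_confidence(c):
    """Map c to the classes quoted in the text (lower bounds are inclusive)."""
    if c > 0.85:
        return "great"
    for lower, label in [(0.76, "very good"), (0.66, "good"), (0.61, "median"),
                         (0.51, "suffering"), (0.41, "bad")]:
        if c >= lower:
            return label
    return "terrible"
```

Under these assumptions, r² = 0.94 and d = 0.97 give c ≈ 0.94, which is consistent with the value reported for XAVIER solar radiation in the results.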

Climate Change Projections.
Climate change scenarios, based on measured weather data gap-filled with the best alternative, were projected by models that are publicly available through the CMIP5 [16], based on two RCPs [18]: an intermediate emission scenario (RCP4.5) and a high emission scenario (RCP8.5). As suggested by Ward et al. [38], the intermediate scenario appears as the most likely future for planning purposes, with which observed fossil fuel trajectories appear to be consistent, whereas the high emission scenario represents the extreme conditions. The future scenarios were generated based on the delta method [39], in which simulated mean monthly changes are imposed on the baseline for all sites by adding temperature changes and multiplying precipitation changes, without changing the variability within a month (e.g., the number of rainy days), following the procedure described by Hudson and Ruane [40]. All other variables were kept unchanged.
Projections were performed for mid-of-century (2040-2069) and end-of-century (2070-2099) periods, for the following CMIP5 GCMs: CNRM-CM5 [41], CSIRO-Mk3-6-0 [42], GISS-E2-R [43], HadGEM2-ES [44,45], INMCM4 [46], MIROC-ESM [47], and MPI-ESM_LR [48]. Seven different GCMs were used because uncertainties are inherent to the climate system, as a result of nonlinear interactions and the intrinsic complexity of natural atmospheric phenomena [49]. Therefore, for the same emission scenario, different models produce diverse projections of climate change, and one way to minimize these uncertainties is through a set of global and/or regional models, known as an ensemble approach [15]. In this sense, the climate projection presented here for each variable is an average of the outputs of the seven GCMs.
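The delta method described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Hudson and Ruane [40] itself: the record layout (a list of daily dictionaries with `month`, `tmax`, `tmin`, and `rain` keys) and the function name `apply_delta` are hypothetical, temperature changes are given in °C per calendar month, and rainfall changes are given as multiplicative factors (for example 0.9 for a 10% reduction).

```python
def apply_delta(baseline, temp_delta, rain_factor):
    """Impose mean monthly changes on a daily baseline series.

    baseline    : list of dicts with keys 'month', 'tmax', 'tmin', 'rain'
                  (other keys, e.g. wind or humidity, are copied unchanged)
    temp_delta  : {'tmax': {month: degC}, 'tmin': {month: degC}}
    rain_factor : {month: multiplicative change in rainfall}
    """
    future = []
    for day in baseline:
        month = day["month"]
        changed = dict(day)  # copy so the baseline is not modified
        changed["tmax"] = day["tmax"] + temp_delta["tmax"][month]  # additive
        changed["tmin"] = day["tmin"] + temp_delta["tmin"][month]  # additive
        changed["rain"] = day["rain"] * rain_factor[month]         # multiplicative
        future.append(changed)
    return future
```

Because rainfall is scaled rather than shifted, dry days remain dry, so the number of rainy days within each month is preserved, as the method requires.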
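The ensemble step amounts to averaging the changes projected by the individual GCMs. A minimal sketch, assuming each model's output has already been reduced to a list of changes (for example 12 monthly values) and using a hypothetical function name and made-up model labels:

```python
from statistics import mean


def ensemble_mean(changes_by_gcm):
    """Average projected changes across models, position by position.

    changes_by_gcm : dict mapping a GCM name to a list of changes
                     (all lists must have the same length)
    """
    series = list(changes_by_gcm.values())
    return [mean(values) for values in zip(*series)]


# Usage with two hypothetical models and two months of temperature change (degC)
example = ensemble_mean({"model_a": [1.0, 2.0], "model_b": [3.0, 4.0]})
```

An unweighted mean treats all models as equally credible, which matches the simple average described in the text; weighting schemes would be a different design choice.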
To assess whether gridded historical climate data can replace measured data as the baseline for future climate projections, we compared climate projections based on measured weather data with those based on the climatology provided by the best alternative method, considering only the nine sites which had a percentage of missing data on air temperature and rainfall lower than 10%, as presented in Table S1 of Supplementary Materials.

Solar Radiation Models.
Table 2 presents the average daily annual coefficients of the temperature-based solar radiation models for all Brazilian locations assessed. The Ha model displayed adjusted coefficients varying from 0.10 °C−0.5 to 0.18 °C−0.5, differing from the original values of 0.16 °C−0.5 and 0.19 °C−0.5 obtained by Hargreaves and Samani [50] for continental and coastal regions, respectively. The adjusted b coefficient for the Hu model ranged from 0.04 to 0.22. However, the c coefficient of this model showed quite distinct values, ranging from −7.70 to 9.98. The coefficients e of the BC model and h of the DC model were similar, ranging from 0.75 to 0.77 in both models; however, the coefficients f and g of the BC model were smaller than the coefficients i and j of the DC model: on average, f and g of the BC model were 0.03 and 1.63, whereas i and j of the DC model were 0.07 and 2.24.
Statistical indices for each temperature-based model assessed are presented in Figure 3. For more detailed results, see Tables S2 and S3 of Supplementary Materials. As presented in Figure 3, r² for the BC model ranges between 0.32 and 0.79, with a mean value of 0.62. For the DC model, r² values range from 0.26 to 0.76, with an average value of 0.59.
The estimated solar radiation values presented d between 0.44 and 0.93 for the Ha and Hu models and from 0.55 to 0.92 for the An model, with a mean value of 0.79 for all of them. For the BC and DC models, this index ranged from 0.62 to 0.93 and from 0.60 to 0.93, respectively, with average values of 0.86 and 0.85 (Figure 3; Tables S2 and S3). The confidence index (c) ranged from 0.31 to 0.81, with an average of 0.61, for the Ha and An models and from 0.25 to 0.82 for the Hu model, with an average of 0.62 (Figure 3). For the BC model, c ranged from 0.35 to 0.82, while for the DC model, c ranged from 0.32 to 0.80, with averages of 0.68 and 0.66, respectively. Considering the average values for all sites, the Ha, Hu, and An models presented performances classified as "median," whereas the performances of the BC and DC models were classified as "good," according to the Camargo and Sentelhas [36] classification.

Gridded Database.
Table 3 presents the performance of the different daily gridded databases used to fill the gaps in the historical weather series. All databases showed high accuracy (d ≥ 0.89) for maximum air temperature (Tmax), with XAVIER also showing very high precision (r² = 1). Except for AgCFSR, all databases underestimated Tmax. Among all databases, XAVIER was the best one for estimating Tmax, with MAE = 0.17 °C, whereas NASA/POWER presented the highest MAE, 2.46 °C.
All databases showed high accuracy (d ≥ 0.93) and good precision (r² ≥ 0.77) for minimum air temperature (Tmin). As for Tmax, XAVIER showed the best performance, with the lowest Bias (0.06 °C) and MAE (0.30 °C). In contrast, NASA/POWER presented the worst performance, with Bias = 0.76 °C and MAE = 1.74 °C. AgCFSR and AgMERRA presented similar Bias and MAE, as well as similar c indices of 0.84 and 0.86, respectively (Table 3).
For global solar radiation (Qg), NASA/POWER and XAVIER presented the best performance, with the latter presenting the highest accuracy (d = 0.97) and precision (r² = 0.94), resulting in a c index of 0.94, classified as great [36]. NASA/POWER showed r² = 0.76 and d [...]. XAVIER also presented the best performance for estimating relative humidity (RH), with high precision (r² = 0.90) and accuracy (d = 0.97) and small errors (Bias = 0.18% and MAE = 3.76%), whereas the other systems underestimated RH, with MAE higher than 11%. Despite the poor performance of all databases for estimating wind speed (WS2m), XAVIER displayed the best statistical indices, with r² = 0.47, d = 0.79, and c = 0.54, and the smallest error, with MAE = 0.49 m·s−1; its performance, however, is still classified as suffering according to Camargo and Sentelhas [36].

Climate Change Projections.
Based on the historical measured weather data (1980-2009) gap-filled with the XAVIER gridded database, the ensemble of climate change projections was produced for the RCP4.5 and RCP8.5 emission scenarios at the 31 sites, for the mid- and end-of-century periods. Annual maximum and minimum temperatures showed an increasing trend, while for annual rainfall, the South region will mostly experience increases and the North and Northeast regions will experience decreases, as presented in Figures 4-6. More details can be found in Tables S4 and S5 of Supplementary Materials.
Annual average changes in maximum temperature, for all 31 sites, showed increases in the medium- and long-term projections of 2.01 and 2.52 °C for RCP4.5 and 2.70 and 4.61 °C for RCP8.5, while for minimum temperature, the increases will be 1.79 and 2.25 °C for RCP4.5 and 2.56 and 4.45 °C for RCP8.5 (Table 4). Under the same emission scenarios and future periods, increases will be higher for maximum than for minimum temperatures. As expected, increases under the RCP8.5 scenario will be higher than those under RCP4.5. Such increases are much more pronounced in the long-term projections, with mean temperature increases of 2.39 and 4.48 °C under the intermediate and high emission scenarios, respectively.
Rainfall projections for the 31 sites showed decreases of 6.18 and 6.68% for RCP4.5 and of 4.34 and 8.62% for RCP8.5 for the medium- and long-term projections, respectively (Table 4); however, these changes must be analyzed carefully, since rainfall is a variable of high spatial variability and with distinct distribution patterns over the country.
The monthly climate changes projected for all 31 sites for the RCP8.5 scenario in the long term (2070-2099) are presented in Figure 7. Temperature changes will vary between 2 and 7 °C for Tmax (Figure 7(a)) and between 2 and 5.5 °C for Tmin (Figure 7(b)). The highest temperature increases will occur in the second half of the year, mainly in October, for both. Therefore, as shown before, higher temperatures are expected in future climate projections, with increases that will persist every month [13,22]. Rainfall reduction, especially in the North and Northeast regions, will occur mainly from August to October, which coincides with the dry season and the period of higher temperatures.
When the future climate projections based on the observed and XAVIER gridded databases as the reference climatology were compared, the projected annual averages of maximum and minimum temperature and rainfall were similar, with about the same variability for both databases (Figure 8). For air temperature projections based on the observed and gridded climatology, the differences were not greater than 0.06 and 0.08 °C for maximum and minimum temperatures, respectively, in both emission scenarios and future periods considered. Similarly, for rainfall, the differences between the two databases did not exceed 1%, considering all scenarios and periods.

Solar Radiation Models.
In general, the temperature-based models for estimating Qg presented very similar performance after their calibration for 31 sites in Brazil (Figure 3). However, the models based on three coefficients, BC and DC, performed slightly better, raising the general confidence index c above 0.6 for most simulations. As this is the first attempt to calibrate these models considering several locations around the country, the calibrated coefficients (a for Ha; b and c for Hu; d for An; e, f, and g for BC; and h, i, and j for DC) were quite different from those obtained by other authors for specific locations or locations within the same state, such as those presented by Barbosa et al. [51] for the state of Minas Gerais (MG), by Conceição and Marin [52] in the northwest of the state of São Paulo, and by Massignam [53] in the state of Santa Catarina. Also, the performances of these models when considering several locations spread across the country were slightly worse than those reported for specific locations [51][52][53], which is mainly caused by the greater Qg variability observed around the country, with different atmospheric transmissivity caused by diverse cloud types.
Despite the differences in performance reported above, the present study confirmed that BC and DC are the best temperature-based methods for estimating Qg. The performance of these methods, however, can vary according to the region and the season of the year, as reported by Rivington et al. [54]. In this study, the best Qg estimates were found in Southern and Southeastern Brazil, where there seems to be a better correlation between cloudiness and the daily temperature range. In these regions, the confidence index was classified between good and very good, as can be seen in Tables S2 and S3 of Supplementary Materials.

Gridded Database.
The gridded data provided by different sources presented distinct performances for simulating weather conditions and variability in different parts of Brazil. For Tmax and Tmin, as well as for Qg, the four systems assessed presented good to great performance, according to the classification of Camargo and Sentelhas [36], with r² ≥ 0.64, d ≥ 0.88, and c index always above 0.71. In general, XAVIER was the system that presented the best performance for these three variables, with c always above 0.90. In contrast, for Rain, RH, and WS2m, the performances were quite variable, with AgCFSR, AgMERRA, and NASA/POWER presenting the worst estimates, with c equal to or below 0.33, 0.61, and 0.21, respectively, whereas XAVIER presented great performance for Rain (c = 0.90) and RH (c = 0.92). For WS2m, XAVIER also performed better than the other sources, although with lower indices than for the other weather variables (r² = 0.47, d = 0.79, and c = 0.54).
Similar results were found by Monteiro et al. [55] and by Battisti et al. [5] when using NASA/POWER, XAVIER, and AgMERRA gridded databases in several Brazilian locations. Despite the similar performances observed by these authors regarding the gridded data they used, both of them concluded that the differences between observed and gridded data were not enough to lead to significant differences in estimates of the potential yield of sugarcane [55] and soybean [5]. However, when simulating the attainable yield, which depends on rainfall, Monteiro et al. [55] found that the use of observed data improved the estimates substantially, since NASA/POWER did not represent rainfall spatial and temporal variability very well, as also observed in the present study (Table 3). Following the same strategy, Battisti et al. [5] also observed that the use of rainfall data from AgMERRA did not provide reliable results for the soybean attainable yield, whereas XAVIER data did.
Regarding rainfall data, the major limitation for their spatial interpolation based on satellite data, as done by AgCFSR, AgMERRA, and NASA/POWER, is the low or inadequate resolution of the images, which is not good enough to capture extreme events [56,57] and local spatial variability associated with the topography [58,59]. Similarly, the poor performance of all databases in estimating WS2m is related to two main aspects: the small magnitude of this variable, which leads to large errors even with small deviations, and its high spatial variability associated with the topography and land cover [60]. Finally, the median to bad performance of AgCFSR, AgMERRA, and NASA/POWER in estimating RH is related to the fact that the former two provide RH at the time of maximum daily temperature rather than the daily average, which resulted in MAE between 14 and 17% in the assessed regions. NASA/POWER estimates RH based on procedures similar to those employed by AgCFSR and AgMERRA, which resulted in errors of similar magnitude, about 11%, very close to those reported by Stackhouse et al. [34] for several locations in the United States for a historical weather series of 31 years.
From the results presented in Table 3, the XAVIER gridded database was the best one to represent spatial and temporal weather data variability in Brazil, since it is based on data from ground stations from several sources. In addition, its high spatial resolution (0.25°) allows a reasonable characterization of the topography and land cover effects on [...] [21,22,61,62].
For air temperature, Torres and Marengo [61] projected increases exceeding 2 °C by the end of the present century in South America with more than 90% probability, which was confirmed by our results (Figures 4 and 5; Table 4). For rainfall, decreases are expected in the northern part of the country, whereas in the center-southern part, rainfall increases will prevail; these results are comparable to those obtained by Sánchez et al. and PMBC [22,49].
The rainfall reduction in Northern Brazil will occur mainly from August to October, which coincides with the dry season and with the period when high temperatures predominate. This leads to higher water deficits, increasing the risks for rainfed perennial crops as well as for annual and perennial irrigated crops by increasing the crop water demand and irrigation requirements [63,64].
Comparing the future climate projections generated from the observed and XAVIER gridded databases, considered as the historical basis for future climate projections, the results did not show any substantial difference in the projected scenarios of temperature and rainfall, which makes it possible to use the XAVIER database for studying the impacts of climate change on agriculture or any other human activity.

Conclusions
This study assessed the potential use of temperature-based solar radiation models and gridded databases as options to fill gaps in weather series and to project climate change scenarios in Brazil. Among the temperature-based solar radiation models, the one with the best performance was the BC model, which presented the lowest errors and highest precision and accuracy. In relation to the gridded data, the XAVIER database was the best one to represent observed weather series in Brazil, proving to be reliable both for filling gaps and as a reference for agricultural planning and agroclimatic risk studies for the present and future climates. Due to its outstanding performance, the XAVIER database can also be used for studies related to the impact of climate variability and climate change on other human activities in Brazil.

Figure 2: Weather stations from the Brazilian National Institute of Meteorology used in the present study, with total percentage of missing data (maximum and minimum air temperature, sunshine hours, rainfall, relative humidity, and wind speed), in the period from 1980 to 2009.

Figure 3: Boxplot of the statistical indices and errors of Hargreaves (Ha), Hunt (Hu), Annandale (An), Bristow-Campbell (BC), and Donatelli-Campbell (DC) temperature-based solar radiation models, when compared to measured solar radiation data of 31 Brazilian sites. Boxes denote the lower (25%) to upper quartile (75%) values, with a horizontal line at the median and crosses at mean values.

Figure 7: Monthly projected changes of maximum air temperature (a), minimum air temperature (b), and rainfall (c), averaged from seven global climate models (GCMs), in 31 Brazilian locations, at the end-of-century (2070-2099) period and under a high emission scenario (RCP8.5), when compared to the historical climate (1980-2009).

Figure 8: Boxplot of the projected annual average of maximum air temperature (a), minimum air temperature (b), and rainfall (c), based on seven global climate models (GCMs), for mid-of-century (2040-2069) and end-of-century (2070-2099) periods, under intermediate (RCP4.5) and high (RCP8.5) emission scenarios, based on the INMET and XAVIER historical database (1980-2009). Boxes denote the lower (25%) to upper (75%) values, with a horizontal line at the median and crosses at the mean values.

Table 1: Solar radiation-estimating models based on maximum and minimum air temperature.

Table 2: Average daily annual coefficients of Hargreaves (Ha), Hunt (Hu), Annandale (An), Bristow-Campbell (BC), and Donatelli-Campbell (DC) temperature-based solar radiation models for each of the Brazilian locations considered in this study.

Table 3: Statistical evaluation of daily gridded databases for maximum air temperature (Tmax), minimum air temperature (Tmin), solar radiation (Qg), rainfall (Rain), relative humidity (RH), and wind speed (WS2m), considering 31 locations in Brazil.

Table 4: Overall changes of maximum, minimum, and mean air temperature and rainfall, averaged from seven global climate models (GCMs) for 31 Brazilian sites for mid-of-century (2040-2069) and end-of-century (2070-2099) periods, under intermediate (RCP4.5) and high (RCP8.5) emission scenarios, when compared to the historical climate conditions (1980-2009).

[...] study are in line with the projections performed by Chou et al., Sánchez et al., Torres and Marengo, and Reboita et al.