Spatial Interpolation of Daily Rainfall Data for Local Climate Impact Assessment over Greater Sydney Region

1New South Wales Office of Environment and Heritage, P.O. Box 3720, Parramatta, NSW 2150, Australia 2Jiangsu Key Laboratory of Agricultural Meteorology, Nanjing University of Information Science and Technology, Nanjing 210044, China 3NSW Department of Industry, Skills & Regional Development, Wagga Wagga Agricultural Institute, Wagga Wagga, NSW 2650, Australia 4Graham Centre for Agricultural Innovation (an Alliance between NSW Department of Industry and Charles Sturt University), Wagga Wagga, NSW 2650, Australia 5New South Wales Office of Environment and Heritage, P.O. Box 733, Queanbeyan, NSW 2620, Australia


Introduction
Rainfall is a highly important piece of data which is frequently required for water resource management, hydrologic and ecologic modeling, recharge assessment, and irrigation scheduling.Such data are normally recorded as observational data through comprehensively designed rainfall station networks.However, rainfall records are often incomplete because of missing rainfall data in the measured period or insufficient stations in the study region.
More recently, global climate models (GCMs) are widely used for assessing the responses of the climate system to changes in atmospheric forcing.Projections of potential climate change are essential for sustainable natural resources planning and management [1].GCMs provide information (e.g., rainfall, temperature) at a spatial resolution (above 50 km [2]) that is too coarse to be directly used in local ground impact studies or regional planning.The NSW Office of Environment and Heritage (OEH) and the University of New South Wales (UNSW) have developed finer-scale (10 km resolution) climate projections for southeast Australia (SEA) as part of the NSW and ACT Regional Climate Modeling (NARCliM) project [3].The NARCliM project also produced a higher resolution (2 km) regional climate projection from 2040 to 2059 for the Greater Sydney Region (GSR).Time slices of recent climate (1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009) and future climate (2040-2059) were simulated using the Weather Forecasting and Research (WRF) model [4] and the outputs include daily rainfall and temperature.
Climate projections at 2 km resolution provide increased level of detailed information for fire and emergency management, water and energy management, agriculture, urban 2 Advances in Meteorology planning, and biodiversity management which need to adapt to a future climate [1].Such projections are still considered rather coarse in most hydrological and ecosystem modeling at local scales which often require even high-resolution rainfall data in both spatial and temporal contexts.Our current study on climate change impact assessments on rainfall erosivity and hillslope erosion across GSR required a spatial resolution down to 30-100 m [5,6].Finer-resolution rainfall data can be estimated through spatial interpolation techniques [7] which are widely used to produce continuous rainfall surfaces bridging the spatial gaps in the time series data.
There are many varieties of spatial interpolation techniques and they can be classified into three categories based on interpolation methods and scales of application.The first category includes Nearest Neighbor (NN), Thiessen polygons, Spline, and various forms of Kriging and Inverse Distance Weighting (IDW) which are frequently used in interpolating rainfall data from rain gauge stations [8][9][10][11].These interpolation methods are relatively simple, require relatively little input data, and are often used in small and medium scale catchments or basins.The second category uses ancillary data such as satellite imagery and digital elevation models, along with rain gauge station data, in the interpolation process for rainfall prediction at large scale [12][13][14].The third category forecasts rainfall based on complex interpolation models, such as fuzzy reasoning method, and artificial neural networks [15][16][17].
Many studies have been dedicated to the comparison and evaluation of different spatial interpolation methods at various spatial scales.For example, Dirks et al. [18] compared four spatial interpolation methods using rainfall data from a network of 13 rain gauges on Norfolk Island, and the results found that Kriging provided no significant advantage over IDW, Thiessen, or areal-mean methods.Hsieh et al. [19] used daily rainfall records from 20 rain gauges stations between 1990 and 2000 to predict the spatial rainfall distribution in the Shih-Men Watershed in Taiwan using ordinary Kriging (OK) and IDW.The results indicated that IDW produced more reasonable representations than OK.Dong et al. [20]  value for nearly all cases.These different models are usually compared with each other through a validation procedure in order to choose the process of reconstruction of the historical data that leads to better results, that is, the model characterized by the lower bias and the greater accuracy on the validation set [24,25].The previous studies on spatial interpolation techniques of rainfall suggest that each method has its advantages and disadvantages based on its objectives, and hence the optimal interpolation method to be adopted varies depending on the specific purposes.This study differs in several ways from previous ones.It specifically aimed to compare spatial interpolation methods for rainfall time series predicted from regional climate models (through the NARCliM project) for the current climate period and determine the suitable method to produce daily rainfall GIS layers for the future climate periods.The interpolated daily rainfall data have been directly used in rainfall erosivity and soil erosion risk modeling [6].The current and future rainfall variations and their potential impacts on soil erosion are examined based on the state plan regions so as to support the regional action plan as outlined by New South Wales Government [26].

Study Area.
We chose Greater Sydney Region (GSR) as the study area because (1) there is considerable seasonal variability in rainfall amount and intensity; (2) it is an area of significance in the state plan; and (3) 2 km projected rainfall data are available for this area.
The boundary of the GSR is defined as 148.8 ∘ E to 152.4 ∘ E and 35.7 ∘ S to 32.4 ∘ S and covers an area of 124,000 km 2 (Figure 1).There are six state plan regions (SPRs) in GSR, namely, Eastern/Inner, Northern, Northern Beaches, Southern, South Western, and Western.The SPRs are used in NSW regional action plans which focus on immediate and future actions the NSW Government will take to improve outcomes in each region.The key actions include land use planning to protect both the local environment and prime agricultural land.Our results are presented for SPRs so that the research outcomes can be directly used in the regional action plans.There are 72 rainfall station sites within GSR which can be used for comparison and accuracy assessment.There are also projected daily rainfall data for these sites for future periods up to 2100 [27].
2.2.The AWAP Data.This dataset has been developed by the Australian Water Availability Project (AWAP) and details on the creation of AWAP can be found in Jones et al. [28] and Raupach et al. [29].The AWAP data were produced using the WaterDyn model [29] for continental Australia and the datasets include rainfall, maximum and minimum temperature, and vapour pressure surfaces obtained by interpolating surface station measurements onto a 0.05 ∘ × 0.05 ∘ grid.Climatological averages are gridded using three-dimensional smoothing splines.The Barnes successive correction method was used for analysis of the anomalies [29].The number of stations reporting data varies with time and by variable, with rainfall interpolated from between 5,000 and 7,000 stations across Australia.The AWAP daily rainfall products have gone through several rounds of improvements and accuracy assessments against daily accumulation from a NASA multisatellite rainfall product and Bureau of Meteorology groundbased rain gauge stations.In this study, the most recent AWAP rainfall product (Run26j) was used to assess model performance.

The NARCliM Data. The New South Wales Office of Environment and Heritage (OEH) and the Climate Change
Research Centre at the University of New South Wales (UNSW) are developing an ensemble of future climate projections using regional climate models.The NARCliM project provides projected climate data for adaptation to a future climate for NSW and the Australian Capital Territory.The Sydney climate projections used in this study have been developed by UNSW as a pilot study using the GCM (CSIRO MK 3.5) and regional climate models.This model is just one of a suite of GCMs available for the GSR and was chosen because it performed best in replicating observed climate despite uncertainties in the downscaling sourced from the GCMs [30,31].CSIRO MK3.5 is considered a "wetter" model, projecting higher rainfall compared with the other GCMs (e.g., CCCMA3.1,ECHAM5, and CSIRO-MK3.0).CSIRO MK 3.5 was then dynamically downscaled to 2 km using the Weather Research and Forecasting (WRF) model [4] for two time slices of recent climate (1990-2009) and future climate (2040-2059) at daily temporal resolution.In this study, all daily rainfall data at both time slices were spatially interpolated, but only the rainfall data of recent time period were used for evaluation.

Interpolation Methods.
Four different interpolation methods (IDW, Kriging, ANUDEM, and Spline) were tested in this study to interpolate the 2 km data from the NARCliM project to a finer resolution of 100 m.We chose these methods as they are representative of available interpolation procedures, widely used in rainfall interpolation, and easy to implement in GIS (e.g., ESRI's ArcGIS).
IDW interpolation determines cell values using a weighted combination of a set of sample points.The weight is a function of the inverse distance [32].The surface being interpolated should be that of a locational dependent variable [33].Nearby data will have the most influence in the interpolation, and the surface will have more detail (be less smooth).IDW function in ArcGIS was used for IDW interpolation with a moderate weighting value (2) to control the significance of known points upon the interpolated values, based upon their distance from the output point.A radius of 6000 m was used in the interpolation.
Kriging is perhaps the most distinctive interpolation method [34,35].The term is derived from the name of D. G. Krige who introduced the use of moving averages to avoid systematic overestimation of reserves.Kriging is based on the regionalized variable theory which assumes the statistical surface to be interpolated has a certain degree of continuity.It is an advanced and complex geostatistical procedure that generates an estimated surface from a scattered set of points.Kriging interpolation offers several types of surface estimators; examples are ordinary Kriging (OK) and universal Kriging (UK).OK assumes that the variation in  values is free of any structural component and can be represented by the Spherical, Circular, Exponential, Gaussian, and Linear methods.UK assumes that the spatial variation in  values is the sum of three components: a structural component, a random but spatially correlated component, and random noise representing the residual error [36].Generally, OK has more restrictive assumptions but fewer computational problems, whereas the assumptions of UK are more general but difficulties of calculation are greater [37].For this study, we chose the ordinary Kriging method.We chose exponential semivariogram model using 12 neighboring samples and maximum radius of 6000 m.
ANUDEM was developed at the Australian National University [38] and it was implemented in the ESRI's ArcGIS (TOPOGRID program).It produces a regular grid of elevation data from relatively small and sparse sets of elevation and streamline data.The interpolation procedure of ANUDEM imposes an automatic drainage enforcement algorithm that removes spurious sinks or pits.This is of particular advantage in hydrological related studies.The interpolation procedure has been designed to take advantage of the types of input data and the known characteristics of elevation surfaces.The method uses an iterative finite difference interpolation technique and it is optimized to have the computational efficiency of "local" interpolation methods such as IDW, without losing the surface continuity of global interpolation methods such as Kriging and Spline.As ANUDEM is specifically designed for hydrological correction, we only used boundary (GSR) and cell size (100 m) as input parameters (no other subcommands used) so as to keep consistent with other methods.
Spline performs a two-dimensional minimum curvature spline interpolation on a point dataset resulting in a smooth surface that passes exactly through the input points [39].Spline interpolation is preferred over polynomial interpolation because the interpolation error can be made small even when using low degree polynomials for the splines.The basic minimum curvature technique is also referred to as thin plate interpolation.It ensures a smooth (continuous and differentiable) surface together with continuous first-derivative surfaces.Rapid changes in gradient or slope (the first derivative) may occur in the vicinity of the data points; hence, this model is not suitable for estimating second derivative (curvature).We used the "regularized" option that ensures a smooth surface together with smooth first-derivative surfaces, with 12 points per region and a cell size of 100 m.

GIS Operations and Implementation.
The above spatial interpolation techniques were implemented in ESRI's ArcGIS with the following procedures: (1) convert daily ASCII rainfall data to grid and point layers; (2) reproject rainfall data to the same projection as other datasets (geographic); (3) remove abnormal rainfall values (using a cut-off value of 350 mm/day); (4) interpolate daily rainfall using the four methods; (5) clip rainfall grids to the study area extent; (6) produce monthly and annual rainfall time series from the interpolated daily rainfall; (7) calculate rainfall erosivity based on the improved daily rainfall erosivity model [6] and soil erosion risk based on the revised universal soil loss equation (RUSLE) [40]; and (8) sample and statistically analyse rainfall, erosivity, and erosion data based on the state plan regions.
Automated GIS scripts have been developed for all the above procedures to process the projected daily rainfall data (originally in ASCII format) and spatially interpolate them into 100 m grids.A rainfall filter is applied such that any daily rainfall event above a maximum threshold (350 mm/day [3]) is replaced by that value.To save computation time, only wet days were selected for spatial interpolation and calculation of monthly and annual rainfall.A threshold of 0.1 mm was chosen to define wet days.Other thresholds could also be chosen (e.g., 0.2 mm in [41]), but a previous study showed that it is reasonable to choose the threshold in the range 0-1 mm [42].
2.6.Performance Assessment.In this study, mean absolute error (MAE), mean relative error (MRE), and root mean squared error (RMSE) were used to assess the performances of the four interpolation methods.MRE reflects the relative accuracy of the interpolation, and the MAE and RMSE are indicators of the magnitude of extreme errors.Lower MAE, MRE, and RMSE values indicate greater central tendencies and generally smaller extreme errors.In the meantime, the coefficient of correlation () was also used to evaluate whether the estimated data fits observed data.The formulas of MAE, MRE, and RMSE are given as follows following Taesombat and Sriwongsitanon [43]: where  is the number of rain events,   is the observed rainfall value at a time (), and    is the estimated rainfall value at a time ().

Error Assessment and Statistics.
With the aid of the automated GIS scripts developed in this study, we interpolated daily rainfall for the recent (1990-2009) and future (2040-2059) periods using the four interpolation methods as described above.We produced 13,318,880 daily, 1,920 monthly, and 140 annual rainfall GIS layers covering a 40year period.
The monthly and annual rainfall values (calculated from the interpolated daily rainfall) from the four interpolation methods were used in the assessment and evaluation as the comparison at daily step would be otherwise too complicated.We sampled the monthly and annual rainfall data at the 72 rainfall station sites [27] and 1000 random points for each SPR (total 6000 points across Sydney Region) for all periods.The sampling points were compared with the available gauged rainfall data for point-based assessment and the AWAP data for area-based (spatial) assessment at the same periods.
We compared MAE, MRE, RMSE, and  from the four interpolation methods to assess the relative accuracy and the comparisons are presented in Tables 1 and 2. The MRE values were similar for monthly and annual data.Both MAE and RMSE values of the four interpolation methods were in the order IDW < Kriging < ANUDEM < Spline.The  values of all the four interpolation methods reached significant correlation ( 0.05, = 0.081,  0.01, = 0.062).The sample number () is 1180 for annual rainfall and 14160 for monthly rainfall (12 * 1180).The  values between all four interpolation methods and AWAP data for monthly and annual rainfall are similar ranging between 0.53 and 0.55.Comparison suggests that IDW is slightly superior to the other three methods during interpolating the 2 km rainfall data.The low  (or large MAE and RMSE) is mainly caused by the input GCM daily rainfall projections.Figure 2 presents the scattered plot of the monthly and annual rainfall between interpolation and AWAP.Note that the sample numbers in the plots are slightly  different as some sampling points have no data (or no data cells in the GIS rainfall grids).
The error analysis also shows regional variation revealing that the interpolation accuracy is generally higher in the Western regions (Western and South Western) than the coastal regions (Tables 3 and 4).The MAE and RMSE values of monthly and annual rainfall are generally in the order Western < Eastern/Inner < Northern, and the MRE value in all three regions was similar.The MAE, MRE, and RMSE values for the four interpolation methods are that IDW < ANUDEM < Spline < Kriging.The  values for monthly and annual rainfall from all interpolation methods at all regions reached obviously significant correlation.The above performances indicate that all four interpolation methods

Comparison of Spatial Variations.
The spatial variations of monthly rainfall interpolated by all four methods have similar patterns (Figure 3).The minimum, mean, maximum, and standard derivation of monthly rainfall values from the  Beaches, and Eastern-Inner districts.The IDW and Kriging, as they emphasize the significant of known points upon the interpolated values, produced more localized patterns.But the Spline and ANUDEM generated more smooth surfaces (Figure 4).The monthly and annual rainfall patterns in the GSR are similar and consistent from the four interpolation methods.That is, the mean values in the Northern region are always higher than those of the Western and Southern regions for all the interpolation methods.The interpolated monthly and annual rainfall patterns show the obvious rainfall increase from west to east, but such trend in the north-south direction is not obvious.

Comparison of Seasonal Variations.
Compared with the AWAP rainfall, the interpolated rainfall shows similar seasonal variations within the GSR (Figure 5).All four interpolation methods produced a good reproduction of the recent seasonal rainfall patterns though there is generally overestimate, as inherited from the NARCliM projections, across all mid to high rainfall rates, particularly in autumn as observed in Evans and Argüeso [41].
In general, there is more rainfall from January to May, and little from July to September.The maximum, minimum, and annual values of future rainfall are increasing compared with the recent period and this might be due to the GCM (CSIRO MK 3.5) itself which is regarded as a "wetter" model (overestimate rainfall [3]).Figure 6 compares the monthly changes of rainfall from the four interpolation methods and AWAP.Three interpolation methods (ANUDEM, IDW, and Spline) show almost identical curves and monthly changes but interpolation from Kriging is noticeably different.The rainfall in February was recorded the highest during the recent period, but this is not reflected in the WFR projection and interpolation.This implies that the GCM and WFR (here the NARCliM project) can predict the general rainfall patterns and trends but they cannot correctly predict rainfall extremes or the spatial-temporal variations.

Potential Impacts on Soil Erosion
Risk.Future rainfall is predicted to increase significantly by 2050 in GSR using this specific climate model (CSIRO MK 3.5, Figure 7).This represents a "wet" scenario in soil erosion risk modeling to study the potential impacts of future rainfall on soil erosion and the relative risk across the state plan regions.
To assess the impacts of projected rainfall on soil erosion risk, we further applied an improved daily rainfall erosivity model [6] and the revised universal soil loss equation [40] to estimate rainfall erosivity and soil erosion for the recent (1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009) and future (2040-2059) periods using the interpolated daily rainfall.Based on the projected daily rainfall, both rainfall erosivity and erosion are expected to increase up to 61% by 2050 in Sydney SPRs if there is no ground cover or protection (Table 5).The 60%+ extra hillslope erosion   for unprotected soil is a serious concern with implications for Sydney water quality.Review of erosion and sediment control standards for construction and reassessment of land management practices may be required for the predicted high risk areas.However, the hillslope erosion rate will be significantly reduced if the ground cover or materials (i.e., urban build-up areas) are considered.Note that the changes in rainfall, rainfall erosivity, and hillslope erosion rates are noticeably uneven in space and time.Changes are the greatest near the coast and in the west (Figure 8).Rainfall, rainfall erosivity, and their changes in summer are mostly greater than those in other months suggesting that summer is the critical season for soil erosion prevention and management.

Conclusions
In this study, four common but representative spatial interpolation methods have been applied to downscale RCM modeled rainfall data into high-resolution GIS data layers.
The results of our study show that the projected rainfall data, at monthly and annual scales, are close to the AWAP data in spatial patterns despite about 7% deference in absolute values as the GCM (CSIRO MK 3.5) generally overestimates rainfall particularly in autumn, and the  value between them reached significant correlation ( < 0.01).These results demonstrate that rainfall projections provide relatively good estimations of daily rainfall and provide meaningful spatial and temporal information of rainfall changes for soil erosion risk modeling and climate change adaptation future plans.
In our study, IDW, ANUDEM, and Spline interpolation methods produced very similar results and they all reveal similar patterns of monthly rainfall distributions, while Kriging produced slightly different seasonal patterns.The study suggests that IDW is slightly superior, though not significantly, to others in relative errors and computational efficiency.
The four interpolation methods also resulted in similar spatial patterns across GSR and they all show the obvious variations across the state plan regions.Predicted rainfall tends to gradually reduce from East to West and from Eastern/Inner to Northern Beaches.All four interpolation methods produced more reasonable estimations in Western district than Northern or Eastern/ Inner district as the rainfall patterns in the latter district are more variable.This implied that the ordinary interpolation methods can not accurately reflect extremes in monthly and annual rainfall patterns.While they are similar, we chose IDW interpolation method as it performed slightly better, and it is easier and faster in computation and compatible with previous ones [19,23,44].
There is likely significant increase in rainfall erosivity and soil erosion risk based on the rainfall projections used in this study.The predicted future soil erosion risk and the changes are useful information for climate impact assessments and adaptation and planning.In the near future, we will extend the techniques developed in this study to produce daily rainfall data at a resolution adequate for local soil erosion risk prediction from climate projections for the entire southeast Australia.The NARCliM projected daily rainfall data (at 10 km spatial resolution) from all 12 ensembles (4 GCMs and 3 RCMs [3]) are to be used as model inputs to provide unbiased future prediction on future rainfall erosivity and hillslope erosion risk for two future periods (2020-2039 and 2060-2079).
In conclusion, this study has demonstrated a suitable approach and processes to interpolate daily rainfall values for recent and future periods from GCM projections.The methods have been successfully implemented in GIS for efficient calculation and mapping of the spatial and temporal variation of rainfall across GSR.The spatial interpolation greatly enhanced the level of detail which is useful for climate impact assessments and soil erosion modeling at local scale.With the automated GIS process developed in this study, the daily rainfall GIS layers, consequently rainfall erosivity and soil erosion layers, can be readily upgraded when better or future rainfall data become available.The methodology and GIS programs are also readily applicable to any other regions with GCM projections at about the same resolution.

Figure 1 :
Figure 1: Locational map of the Greater Sydney Region (GSR), rainfall station sites, and state plan regions within GSR.

Figure 2 :
Figure 2: Scatter plots of correlation between interpolation and AWAP for monthly rainfall (a) and annual rainfall (b).

Figure 3 :
Figure 3: Comparison of interpolation methods for mean monthly rainfall against AWAP.

Figure 4 :
Figure 4: Comparison of interpolation methods for annual mean rainfall against AWAP.

Figure 5 :
Figure 5: Comparisons of seasonal mean rainfall from AWAP and the four interpolation methods in Greater Sydney Region.

Figure 7 :
Figure 7: Predicted temporal change of rainfall in Greater Sydney Region.

Figure 8 :
Figure 8: Modeled mean annual rainfall erosivity and hillslope erosion in Sydney Region.

Table 1 :
Interpolation errors of monthly rainfall.

Table 2 :
Interpolation errors of annual rainfall.

Table 3 :
Error assessment of interpolated monthly rainfall in state plan regions.

Table 4 :
Error assessment of interpolated annual rainfall in state plan regions.Western district (Blue Mountains), and lower values appeared in the midwestern district.The spatial patterns of annual rainfall patterns of the four interpolation methods are similar to those with predicted mean annual rainfall value around 883 mm (882.7-883.2mm).The minimum values of annual rainfall from