Modeling Spatiotemporal Precipitation: Effects of Density, Interpolation, and Land Use Distribution

1U.S. Geological Survey, Utah Water Science Center, 2329 W. Orton Circle, Salt Lake City, UT 84119, USA 2Flow Apportionment Laboratory, Department of Civil and Environmental Engineering, University of Utah, Salt Lake City, UT 84112, USA 3Directorate of Boundary Conditions, Department of Hydrology, University of Bayreuth, 95447 Bayreuth, Germany 4Department of Soil Physics, University of Bayreuth, 95447 Bayreuth, Germany


Introduction
Rain gauges provide a point estimation of precipitation, which, depending on the network size and density, require interpolation techniques to estimate the spatial distribution throughout a catchment.When the distribution of rain gauges throughout the catchment is skewed or the network density is limited, interpolation is affected leading to increased uncertainty in spatiotemporal precipitation estimates and runoff [1,2].Several studies have investigated rain gauge network density, distribution, and temporal discretization to quantify the effects on hydrologic models and the range in uncertainty [3][4][5].Obled et al. [6] compared the catchmentwide precipitation distribution from different rain gauge densities and found that the number of rain gauges was irrelevant to precipitation estimates but drastically increased the estimation accuracy of runoff and flood peak (time of concentration).This result was verified by Faurès et al. [7] where they further found that rain gauge location or distribution is important.However, the importance of rain gauge location was found by Lopes [8] and later by Krajewski et al. [9] to be temporally variable and dependent on event characteristics, which further complicates rain gauge network designs.
The density of observational rain gauge data to the accuracy of interpolation methods has also been shown to be of particular importance [4,5].Spatial interpolation is generally classified as either deterministic or geostatistical and typically involves weighting observational point measurements to regionalize estimates at ungauged locations.Deterministic spatial interpolation methods include the Thiessen polygon, spline, and inverse distance weighted (IDW) methods.

Advances in Meteorology
Although geostatistical methods, such as Kriging, can incorporate secondary data like elevation and aspect to improve interpolation accuracy [10], the benefit of one classification or method over another has generally been inconclusive.Geostatistical methods have been shown to produce better estimates than deterministic methods in some studies [11][12][13][14][15].However, other studies found that deterministic methods such as IDW performed better than other methods, including Kriging [16][17][18][19].While similar results between Kriging and IDW were presented [20], Laslett et al. [14] found that splines worked better than either IDW or Kriging.These results demonstrate the wide variability in interpolation estimation accuracy between methods due to site specific constraints, event-based conditions, and spatiotemporal dynamics.While studies have typically investigated the varying gauge density or interpolation methods separately, a systematic comparison of gauge density combined with different interpolation methods is less common [21,22].
Accurate catchment-wide rainfall estimates developed from point observations are critical in defining modeling boundary conditions.However, estimated rainfall is highly dependent on the density of observational points, the interpolation method utilized, and any observational errors, which subsequently influences the uncertainty in hydrologic partitioning [3].Watershed-scale catchment models are a useful tool for coupling hydrometeorologic drivers with ecosystem conditions [23].Typically, these studies depend heavily on observational meteorological networks, which may be limited in density and availability outside of the USA and Europe.However, these meteorological drivers and precipitation in particular are often defined as key variables in determining surficial hydrological processes [24].Furthermore, discharge has been shown to be extremely sensitive to spatiotemporal variability in precipitation throughout a catchment [3,25,26] resulting in considerable spatial variability in hydrologic partitioning [27].One method to describe these processes in hydrological catchment modeling is to use a spatially distributed approach.The idea of the distributed parameter approach is to reduce the number of individual parameters with the compromise of increased uncertainty.The SWAT model has been used throughout the world to investigate catchment flow processes and the resulting nutrient and sediment transport.While all catchment models operate on the mass balance approach, their individual equations and coupling between processes can vary.The SWAT model can be used to examine how soil type and land use coupled with meteorological inputs affect hydrologic flow partitioning.
The objective of this study is to quantitatively describe the influence of rain gauge density and interpolation method on spatiotemporal discharge throughout a monsoon-driven catchment.To this end, we develop a weighted, distributed approach for estimating and gap-filling the observed meteorological dataset for a complete and robust analysis.We further quantify the effect of rain gauge density through multiple observed and simulated weather stations and the statistical variability between catchment-wide interpolation methods.We use a calibrated catchment-wide ecohydrologic SWAT model to test the influence of different rain gauge densities and interpolation schemes and comprehensively describe the resulting variability in spatiotemporal stream discharge output.To further examine the extreme variability in these processes, we study a monsoon dominated, topographically complex catchment in Republic of Korea with a mosaic of land uses.

Study Location and Physiography
The 62.7 km 2 Haean catchment is the chosen study area with a bowl-shaped physiography, ranging in elevation between 339 and 1321 m a.s.l.(Figure 1(a)).It is located in the northeastern portion of Republic of Korea along the demilitarized zone (DMZ) between Republic of Korea and Democratic People's Republic of Korea (38.239-38.329N, 128.083-128.173E;Figure 1(a)).The combination of high elevation gradients and the bowl-shaped geologic structure drastically alters the local meteorological conditions.The Haean catchment is located in a region with a monsoonal climate and an average temperature of 8.65 ± 0.35 ∘ C that ranges between −26.9 ∘ C in January and 33.4 ∘ C in August [28].The climate of Republic of Korea is humid continental to subtropical and is heavily influenced by the East Asian summer monsoon.The monsoon season essentially extends from June through September, with up to 70% of the total annual precipitation between the months of June and August.The average annual rainfall is 1514 mm (930 to 2299 mm yr −1 ) with precipitation as high as 48.6 mm h −1 or up to 223.2 mm d −1 .Average catchment outlet (S7) discharge is 4.3 m 3 s −1 (1.2 to 379 m 3 s −1 ) while the average headwater discharge (S1) is 0.03 m 3 s −1 (1.4 × 10 −4 to 10.0 m 3 s −1 ).The catchment hydrology is further described by Shope et al. [28,29].The higher elevation mountain ridges of the basin are a Precambrian gneiss complex surrounding an interior of highly weathered Jurassic biotite granite [30].Bedrock is typically observed between 20 and 45 m below land surface in the catchment interior.Upland soil texture is typically saprolitic sand and sandy loam with high infiltration capacity [31,32] and generally extends up to 2 m in depth (Figure 1(b)).The catchment is predominately composed of loam-sand, forest soils of soil hydrologic group C and sand-silt, moderately steep and flat dryland soils of soil hydrologic group D. Approximately 15% of the catchment has low permeability, sand to clay, rice paddy, and sealed ground soils with a soil hydrologic group C to D classification.The catchment is one of the most agriculturally productive catchments in the region [29].The uplands predominately include deciduous oak (Quercus spp.) and maple (Acer spp.) forest and comprise approximately 56% of the total catchment land classification (Figure 1(c)).The midelevation, dryland, and central, low-elevation crops include rice (13.6%),C3 grasses (9.4%), cabbage (5.1%), potato (3.9%), radish (3.4%), soybean (2.6%), orchards (1.4%), ginseng (1.3%), and corn (0.8%), among other crops.

Methodology and Model Construction
3.1.Observed Meteorological Data.Meteorological data were collected from a total of 18 weather stations, with 15 local TERRECO-based stations distributed throughout the catchment [33], the Korean Meteorological Agency (KMA) Haean station in the central catchment, and two additional regional KMA meteorological stations (Figure 1(a)).Hourly climate data for the period from 1998 to 2011 were measured and collected from the regional stations of the KMA (Figure 1).Precipitation and minimum/maximum temperature were obtained from the Haean KMA station (38.287N, 128.148E).Relative humidity, temperature, and wind speed were obtained from the Inje KMA station in the adjacent Yanggu County (38.207N, 128.017E).Solar radiation was collected from the Chuncheon KMA station (37.904N, 127.749E).Climate data were also collected from the 15 micrometeorological stations (Delta-T Devices, Ltd.) distributed throughout the catchment (Figure 1) between 2009 and 2011.The measured parameters and the instrument uncertainty included solar radiation (±5 W m −2 ), relative humidity (±2%), wind speed (±0.1 m s −1 ), minimum and maximum air temperature (±0.2 ∘ C), and precipitation (±0.2 mm) (Figure 2).All subhourly weather data collected from the catchment-based network were subsequently aggregated into hourly datasets for further analysis.

Meteorological Gap-Filling Algorithm.
Each of the meteorological parameters was quality controlled by removing erroneous data and subsequently gap-filling missing data from a similar station using a weighted algorithm based on elevation, station proximity, and aspect.This process enabled a comprehensive and robust spatial and temporal analysis of meteorological effects as a function of spatial location, gauge density, and interpolation method.Less than 5% of the total meteorological dataset required the need for gap-filling.The algorithm, as formulated for precipitation, is The variable   is the estimated precipitation (mm),  is the observation point aspect (deg.),  is the distance to the observation point (m),  is the elevation (m),  is the time step,  is the total number of consecutive missing data,   is the observed precipitation (mm),  is the total number of observational meteorological stations,  is the cumulative number of stations, the "" and "" subscripts are the estimated and observed location values,  is the weighting factor, and  and  subscripts are the first and second most proximal locations to the estimation location, respectively.Locally based relative humidity was also modified by accounting for the temperature dependent local dew point.

Interpolation Methods.
A number of deterministic and geostatistical interpolation methods have been utilized throughout the literature and discussed elsewhere; however, a brief description of the 5 interpolation methods utilized in this study is warranted.The Thiessen polygon method is a deterministic method that prescribes a value at an ungauged location based on the proximity to gauged observations [34].Effectively, the study area is divided into subbasins or polygons with the nearest individual gauge observation representing the entire area within that polygon.Inverse distance weighting (IDW) overcomes the drawback of discontinuities between polygons in the Thiessen method by weighting values at ungauged locations as a function of distance to the observational location [35].Therefore, predicted values at simulated locations are more influenced by measured values that are closer.The nearest neighbor algorithm uses the value of the nearest pixel value without considering the values of neighboring points, which yields a piecewise-constant interpolant [36].Effectively, the method uses the pixel value corresponding to the smallest absolute difference when a set of four known value pixels has no mode.The spline interpolation algorithm uses a polynomial between each pair of tabulated points with nonlocally determined coefficients [37].The nonlocality guarantees global smoothness in the interpolated function up to some order of derivative.For example, cubic splines produce an interpolated function that is continuous to the second derivative.Ordinary Kriging is a geostatistical method that defines the spatial dependence of ungauged locations using a semivariogram, which is a function of the difference between paired locations and the distance between them.The experimental semivariogram is then optimized to a theoretical analytic model (i.e., logarithmic, exponential, Gaussian, power, and spherical) to define coefficients [38][39][40].The semivariogram provides the residual or nugget, which reflects the sampling error and spatial variance and should be close to zero.It also provides the distance value of the range at which the sill is reached indicating asymptotic behavior.Distances greater than the range are considered spatially independent.Our results indicated that the spherical model worked optimally for our study area.A representative example of the spatial variability in the 5 interpolation methods is shown in  The spatial and temporal discharge measurements allowed us to calibrate the rainfall-runoff model described by Shope et al. [29].As described by Shope et al. [28], multiple discharge methods were analyzed at each monitoring location and a weighted uncertainty approach was used to optimize the stream discharge by location and environmental conditions.While spatially distributed discharge measurements throughout the catchment are described elsewhere, the observed discharges at interior locations (S1, S4, S5, and S6) and the catchment outlet (S7) were utilized in this study for daily model calibration to parameterize spatial hydrologic partitioning.The monitoring locations are arranged along an elevation gradient.Since the catchment has a bowl-shaped topography, model parameterization as a function of the elevation gradient enables us to adequately provide catchmentwide model coverage.

SWAT Model Construction.
The SWAT model is a physically based, distributed model developed to predict the long-term climate and land use impacts on hydrology [41].The catchment is divided into spatially linked subbasins that are further separated into unique hydrological response units (HRUs) by integrating land use, soil type, and slope.SWAT uses a water balance approach to simulate hydrologic partitioning and the simulated components include runoff, percolation, lateral flow, groundwater flow, evapotranspiration (ET), and transmission losses [42,43].Precipitation is partitioned into canopy storage, soil infiltration, and runoff through either the SCS (soil conservation service) curve number (CN) method [44] or the Green-Ampt [45] method.The vegetation is important in estimating the CN for HRUs; therefore the distributed CN was modified for individual HRUs by temporal LULC characterization and crop growth.The SCS curve number method, calculated through plant evapotranspiration, was selected for the simulations and flow routing based on the variable storage method was used for the Haean catchment [43].Model inputs include the digital elevation model (DEM), soil discretization (Figure 1(b)), land use/land characterization (LULC) identification (Figure 1(c)), spatiotemporal crop management, biomass sampling, and evapotranspiration (ET).Details on measurement collection and model implementation are described further in Shope et al. [29].
The SWAT model prescribes the nearest weather station parameters to the centroid of each subbasin rather than interpolating between the stations [43].Since the catchment has a significant amount of topographical complexity, assignation of meteorological data to each subbasin severely affects the precipitation volume, soil moisture, and plant growth, which ultimately impacts runoff and discharge.To compare the effect of network density on spatially variable river discharge, we prescribed the precipitation in five ways: (1) in a lumped sense with a single precipitation dataset from the KMA station or each of the local stations, (2) spatial interpolation of the 16 local weather stations, including the Haean KMA station, (3) spatial interpolation to 50 virtual stations, (4) spatial interpolation to 142 virtual stations, and (5) spatial interpolation to 300 virtual stations (Figure 1(d)).This resulted in 5 interpolation methods described previously for each of the 4 distributions or 20 model simulations plus the estimates of the individual stations.To begin, we performed several interpolation methods over the entire catchment (Thiessen polygon, inverse distance weighted (IDW), nearest neighbor, spline, and ordinary Kriging) and then gridded the interpolated meteorology throughout the catchment.For the 16, 50, 142, and the 300 point virtual weather stations that were applied to individual subbasins, we used the interpolated meteorological data at the centroid of each of the subbasins, respectively.For simplicity, the calibrated and validated model with virtual weather stations prescribed to 142 subbasins as presented by Shope et al. [29] is considered the baseline scenario and deviations associated with rain gauge density and interpolation method are relative to this baseline scenario.
3.6.Model Calibration and Validation.Model results were evaluated for performance against the baseline scenario using the mean absolute error (MAE), root-mean squared error (RMSE), coefficient of determination ( 2 ), percent bias (PBIAS), plant growth, and average rainfall metrics.Error indices typically used in model evaluation include the MAE and RMSE, which indicate error in the units of the parameter of interest.MAE and RMSE values of zero indicate a perfect fit, although values less than half of the standard deviation of the measured data are considered low [46]: Pearson's correlation coefficient () and the coefficient of determination ( 2 ) describe the degree of collinearity between simulated and measured data and the proportion of variance in the measured data explained by the model.The coefficient of determination ranges from zero to one, where higher values have less variance.Typically, values greater than 0.5 are considered acceptable [47,48].The drawback is that these statistics are oversensitive to high extreme values and insensitive to additive or proportional differences between the measured and simulated predictions: Percent bias (PBIAS) measures the average tendency of the simulated data to be larger or smaller than the observed values [49].The optimal value of PBIAS is zero with lower values indicating accurate model observations.Values that are positive and negative indicate model underestimation and overestimation bias, respectively: As described in Shope et al. [29], model calibration was also performed using plant growth and baseflow contributions as calibration metrics.Leaf area index (LAI) was measured throughout the growing season for the primary crops and deciduous forest and the SWAT ecohydrologic plant growth parameters were adjusted during the calibration process.Spatiotemporal baseflow contributions were also calculated using several methods including those of Eckhardt [50] to estimate the percent groundwater contribution to the surface waters distributed throughout the catchment.These independently calculated baseflow estimates were then used to optimize the simulated groundwater contributions to   surface water.The weighted calibration procedure that Shope et al. [29] describe enabled the baseflow contributions to be an important calibration metric for model optimization.

Comparison of Observed Meteorological
Data.Since all of the interpolation methods were based on the 16 total meteorological stations distributed throughout the catchment, we were able to compare the statistical spatiotemporal variation in distribution between the single Haean KMA location and the 15 additional stations.Our results indicate that the single KMA location adequately accounts for catchment-wide precipitation contributions (Figure 4).The minimum, maximum, range, and standard deviation in precipitation amounts on a particular time step are nearly identical between each of the 15 stations and the KMA station, although the average and the median of the KMA station were slightly less relative to the 15 stations.This is expected, since the KMA location is at the lower elevation central portion of the catchment where lower precipitation is reported on average.The average precipitation for the 15 stations over the period of record amounted to 2.78 ± 1.12 mm, while the cumulative precipitation was 2.71 mm for   the KMA station.The difference in monthly and hourly distribution of precipitation between the KMA station and the micrometeorological stations was negligible.However, the KMA station generally underpredicted the precipitation from the remaining stations.Overall, there was more precipitation with a higher degree of variability during the predawn and dusk hours (Figure 4(a)), although there was not a significant difference in the precipitation distribution relative to time of day.As expected, the precipitation amount and variability were highest during the summer monsoonal period between June and September with the minimum precipitation occurring in January (Figure 4(b)).These periods are also the times when the highest difference in precipitation volume was recorded for 2009 and 2010.As shown in Figure 4, the absolute magnitude of residuals between the KMA station and the average of the 15 other stations was as high as 4.8% (40 mm) for an individual time step.However, these time steps are typically balanced by an equal but opposite difference in precipitation at adjacent time steps, indicating the degree of local convective precipitation in the catchment.Overall, there was not a discernible trend in time lag due to convective precipitation events.We ranked the hourly precipitation record for each meteorological gauge over the 12-year period of record from 1999 through 2010 by the hourly standard deviation between all gauges.Our results show that precipitation occurred at any gauge less than 3% of the time.Of this period, the standard deviation in precipitation between all gauges was 0.74 mm (0.05 to 15.14 mm).Furthermore, the highest variability in precipitation occurred at approximately 5:00 and 20:00.As expected, the precipitation amount and variability were highest during the summer monsoonal period between June and September with the minimum precipitation occurring in January (Figure 4(b)).

Differences in Interpolated
Precipitation.We examined the 5 different interpolation methods (Thiessen polygon, IDW, nearest neighbor, spline, and ordinary Kriging) to quantify the spatiotemporal meteorological variations on precipitation for each drainage.The difference in drainage area normalized precipitation between each of the interpolation methods at each monitoring location for each of the meteorological station densities is provided in Figure 5.For each of the gauge densities, the differences are typically greater in the southern and eastern portions of the catchment at SK, SN, and to a lesser degree SD.These drainages are relatively small in area, but, more importantly, the deviations appear to be greater for the spline and nearest neighbor interpolation schemes.

Comparison of Simulated Discharge by Interpolation
Method.Ultimately, a goal of the study is to quantify the role that precipitation volume in each drainage area has on the rainfall-runoff processes affecting surface water discharge distributions throughout the catchment.Therefore, we examined the 5 different interpolation methods to further quantify the spatiotemporal meteorological variations on discharge.To reiterate, the calibrated and validated SWAT ecohydrologic model described by Shope et al. [29], which used 142 virtual weather stations interpolated to each subbasin using IDW, is considered the baseline scenario in the current study.We then used the same calibration parameters and model inputs but changed the precipitation drivers for each of the 5 interpolation schemes and for each of the 5 distributions.Deviations in discharge from this scenario, which was calibrated through statistical metrics, plant growth, and baseflow contribution, are expected to be based solely on precipitation drivers.
We summarized the results by calculating the cumulative discharge throughout 2010 normalized to the drainage area of each surface water location (Figure 6; 142 stations).This analysis provides a more comprehensive and comparable assessment between locations and, as discussed later, between density distributions.As expected, locations with smaller drainage area and higher elevation displayed cumulatively higher precipitation amounts, which decreased toward the catchment outlet.Location SN has smaller, high elevation drainage, hence the slight increase in simulated discharge.Overall, the ordinary Kriging results were the most similar to the IDW interpolation method, particularly at higher distributions.Thiessen is the next most similar, although variability is generally more amplified.The nearest neighbor interpolation was similar to IDW, with the exception of  locations S5, S6, and S7, which are located in the lower catchment.The spline interpolation had the most deviations overall and was the least similar in structure to the others.This is in large part due to the calculated discharge gradients at boundary edges.

Comparison of Gauge Density and Distribution on Simulated Discharge.
Parameterization of a single point precipitation value that is prescribed to the entire catchment enables us to reflect the runoff results in a lumped parameter sense.As shown in Figure 6 (1 station), parameterization of the range in precipitation is provided at a single location in the centroid of the catchment, resulting in the distributed surface water discharge throughout the catchment.This is useful because it provides an indication of the range in catchmentwide discharge expected from a single point meteorological measurement that characterizes the entire watershed.For the example of Thiessen polygon simulation, the incorporation of a single precipitation station into the model shows a very strong reliance of discharge on the prescribed precipitation driver.
The multiple point precipitation distribution enables a comparison of the different spatial levels of distributed modeling on discharge.For each of the interpolation methods, with the exception of the spline method, general smoothing of the data occurs with gauge density (Figure 6).As the gauge density increases from 142 to 300 virtual weather stations, each of the methods again shows more pronounced peaks.This is important because Figure 5 indicates that, for each of the interpolation methods, a density between 50 and 142 meteorological stations depresses sharp deviations between monitoring locations and generally provides a smoother discharge representation based on elevation.The general variability observed for the spline interpolation, and to a lesser degree nearest neighbor, over each of the gauge densities is a function of both the method and the boundary conditions.Typically, higher degree polynomials approximated with the spline interpolation have extra oscillations due to discontinuities between observational points.The boundary condition is parabolically terminated and is represented as the second degree polynomial rather than the third degree polynomial used in the inner intervals.Because of these limitations, spline interpolations can exhibit extreme behavior in regions with limited or no data.While the nearest neighbor interpolation is a simple multivariate method in multiple dimensions, the limitation is that it does not consider the values of neighboring points, but only the nearest point, thereby providing a piecewise-constant interpolant.This can also cause significant variability at simulated pixels.
While identification of the cumulative discharge at each location, for the 5 interpolation methods, over 5 gauge densities (Figure 6) is useful for comparative purposes, the  difference from the calibrated discharge can be used to quantify process-based influences.We quantified the relative difference in discharge between individual simulations as a function of interpolation method and gauge density relative to the baseline simulation model (Figure 7).As mentioned, the ordinary Kriging and Thiessen polygon methods most closely approximate the calibrated IDW discharge results and a gauge density between 50 and 142 stations appears to be optimal.Generally, the bulk of the variability in discharge between each of the interpolation methods for different gauge densities is accounted for at downstream monitoring locations S6 and S7 (see Figure 1(a)).6  and 7, there appear to be factors controlling surface water discharge besides precipitation volume and the interpolation method utilized.The Haean catchment is composed of a myriad of agricultural and forest dominated landscape patterns that could exhibit significant controls on rainfallrunoff processes.To identify the role that these landscape features have on surface water discharge, we analyzed the spatial variability of both land use (Figure 8) and soil distribution (Figure 9) in each of the monitoring location drainage areas.Deciduous forest is obviously the most abundant land use within the Haean catchment (Figure 8).However, the C3 grasses as identified by Timothy, cabbage, potato, and radish are consistently spatially extensive, though at a lower percentage of the catchment area.The rice paddy LULC is spatially consistent throughout the lower catchment, with the exception of drainage area SN.The impervious nature of rice paddies and the plastic mulching field management of crops like cabbage, potato, and radish all contribute to a decrease in infiltration at the expense of surface water runoff.

Distributed LULC and Soil Distribution. From Figures
The distribution of different soil types throughout each of the surface water monitoring locations can have a similar effect on infiltration and runoff processes as LULC.As expected, the distribution of flat, loamy, dryland soil is consistent with dryland crops such as cabbage, potato, and radish.A similar pattern is observed between the distribution of forest soils and the forest LULC.These patterns are not necessarily mutually exclusive.Field studies identified that an annual replenishment of loamy-sand soil was applied to many of the fields investigated prior to the tilling process.This soil amendment is seen as a means to primarily increase subsurface infiltration to the plant rooting zone with the added benefit of enhancing the soil nutrient profile.Therefore, the shallow subsurface soil distribution is directly correlated to the majority of crops throughout the Haean catchment.

Influence of LULC and Soil Distribution on Discharge.
Higher discharge observed particularly at monitoring locations S6 and S7 (Figure 6) is a function of the impervious  nature of increased residential areas and the plastic mulching associated with agricultural management of potato, radish, and cabbage LULC.The plastic mulching management, common throughout the Korean peninsula, is where plastic sheeting is placed over the elevated rows in a row and furrow tillage field.The mulching has holes cut longitudinally to emplace individual plants.As described by Arnhold et al. [31] and Ruidisch et al. [51], the mulching has the effect of increasing runoff and sediment transport while reducing infiltration.These results are more pronounced at S7, particularly with the nearest neighbor and spline interpolation methods.Agricultural management of the ginseng crop, which cumulatively accounts for approximately 1.3% of the catchment, has an intricate drainage system that further induces runoff.Analysis of monitoring location S5 reveals that the contributing drainage area has a high concentration of bean and rice but low percentage of deciduous forest.The S5 drainage area is about 1 order of magnitude less than drainage areas of S5 and S6; therefore, with an increased distribution of rice paddy soil, there is more runoff for the drainage area.Location S6 has nearly consistently high discharge, which is correlated to increased potato, cabbage, radish, and residential coverage within the drainage area of 12.8% and is therefore a large portion of nondeciduous forest coverage.This means higher infiltration with forest LULC and typically similar discharge to monitoring location S5.Alternatively, monitoring location S7 has a drainage area of 8.6% of the catchment and so, with higher discharge, all crops with the exception of ginseng and C3 grasses are depressed.Therefore, the rise in discharge is diminished due primarily to C3 grasses.Locations SS and SD have much higher drainage area at 19.0 and 21.6%, respectively.Therefore, discharge is expected to be lower with increases in deciduous forest, C3 grasses, and orchard LULCs contributing to more infiltration.The drainage areas including SK, SN, SD, and SW are located in the southern, eastern, and southwestern portions of the catchment and cumulatively account for 50.1% of the entire area.Each of the monitoring locations is depressed in terms of cumulative discharge and, while dominated by deciduous forest, contains a substantial percentage of rice, C3 grasses, and cabbage LULC.The benefit of infiltration obtained from the C3 grasses and balanced by the runoff dominated rice and cabbage crops.Because the soil distribution is highly correlated with the LULC, similar results are observed when comparing the soil discharge.When the difference in surface water discharge normalized to drainage area is examined,    the deviations from the calibrated model are more evident.With fewer stations, the land use, soil, and precipitation volume are more homogenous or are lumped and deviations in LULC and soil distribution between drainage areas are accentuated.

Characterization of Interpolation Method and Gauge Density.
For the lumped parameter model, the differences in cumulative trends by site are substantial.The Thiessen polygon interpolation indicates that there is a high peak in discharge between S4 and S7, consistent with the entire range in precipitation while others are dampened.This indicates that there is simply flow accumulation along an elevation gradient.With the Thiessen polygon for a distribution of 16 stations, there is increased variance about the calibrated model but good reproducibility (Figure 6).As the density of gauges increases, there is more discharge at locations other than S7 and S6 at the 300 stations' distribution.The spline and nearest neighbor interpolation methods show the opposite trends.Therefore, the difference in precipitation distribution in the catchment must play a role with higher discharge than other methods at S7, which is elucidated from Figure 5.The distributions are much different than the lumped simulations, indicating the inherent problem of aggregating to a single point.Our results suggest that, at low densities of 16 stations or less, there is considerable variability in discharge relative to the calibrated simulation (Figure 6).Since the IDW calibration is the interpolation method of significance, a comparison of this interpolation method throughout different gauge densities is warranted.
As simulations approach 50 stations, the trend is relatively smooth, indicating a general decrease in discharge as drainage area increases.More importantly is the increase in discharge between S5 and S7 at 142 stations that forms a peak in discharge by the 300 gauges' distribution.This suggests that while the 142 stations used in our study adequately approximate the effects of spatially distributed precipitation on discharge, with more distributed precipitation, the trend of increased discharge at S6 is evident.While this is likely a result of more observations with higher measured precipitation in the S6 catchment, it also indicates the role of overparameterization of meteorological station density.The performance of the simulations measured through discharge is repeated with the ordinary Kriging and to a lesser degree the Thiessen polygon methods.The Thiessen results show a shift from increased discharge at S5 and S6, which suggests incorporation of higher precipitation prescribed subbasins when density increased.Therefore, the results from the interpolation methods showed an apparent influence of gauge density on their performance, regardless of the specific method used.This indicates that the number of meteorological stations plays an important role in the estimated cumulative catchment discharge.Our results indicate that the IDW and Kriging methods appear to be superior to the Thiessen polygon, nearest neighbor, and spline methodologies.While it is unclear if the performance as measured by discharge decreased with increased gauge density, it is apparent that the summation of precipitation distributed in the vicinity of an observed station plays a critical role in the structure of estimated runoff distributed throughout the catchment.

Influence of Observations on Simulated
Discharge.Consistent with the results of Otieno et al. [22], we found that IDW was the most accurate interpolation method, even better than Kriging, over multiple gauge densities.While they found that the Thiessen method had the lowest accuracy and the performance was correlated to gauge density, our results indicate that the Thiessen method was comparable to those of IDW and Kriging.This is not necessarily surprising, as Otieno et al. [22] used statistically similar locations rather than heterogeneous distributions of gauge locations, as would be expected in field deployment.Our results build upon this body of work by demonstrating that the heterogeneity in observed precipitation can drastically contribute to the variability in runoff discharge produced, particularly when gauge density is increased.Caracciolo et al. [24] found that the optimal rain gauge position and distribution drastically influence the catchment-wide rainfall-runoff through the soil distribution.They also found that the characteristics specific to storm events and the observation distribution of rain gauges contribute to the runoff estimation.Our results indicate that these influences are a function of the soil distribution across the landscape and the gradient in convective precipitation volume.We found that when the precipitation volume is high, spatial variability is highest and is most likely due to the land use and soil distribution; however, when spatial rainfall variability is low, the distribution of land use and soil has less influence.Because of the influence of land use and soil distribution on discharge, it is difficult to extend the specific simulation results to other locations [24].However, several characteristic traits can be extended to other locations.The best gauge network depends on the question of hydrologic output, total volume runoff, peak discharge, time of travel, soil moisture, and other hydrograph responses, which are all influenced by the infiltration capacity of the land use soil type.While we focused on cumulative discharge distributed throughout the catchment, the peak discharge time series may reveal that a different gauge density is optimal.It is also advantageous to locate the gauges near locations of less permeable soil and land use, where, with higher precipitation variability, the significance of the soil influence increases.Since the model was calibrated and validated to plant growth, baseflow, and hydrologic partitioning and the associated drivers such as the soil distribution, land use, and topography were consistent throughout the simulations, deviations in discharge are directly related to precipitation.

Summary and Conclusions
Our study used data from 16 meteorological weather stations distributed through a 64 km 2 catchment between 1998 and 2010.Our objective was to quantitatively describe the influence of spatial interpolation methods and rain gauge density on spatiotemporal discharge.The study location is a monsoon dominated catchment, which provides a wide range of spatiotemporally variable, convective extreme events.To this end, we developed a weighted, distributed approach for estimating and gap-filling the observed meteorological dataset for a complete and robust analysis.We analyzed 5 different interpolation methods (Thiessen polygon, IDW, nearest neighbor, spline, and ordinary Kriging).We also examined the effect of a single meteorological station prescribed with a single precipitation time series, throughout the range of observed precipitation in a series of lumped-parameter simulations.These results are compared to simulations of meteorological stations distributed throughout the catchment at several gauge densities, including the 16 observed stations, 50 virtual stations, 142 virtual stations, and 300 virtual stations.The virtual weather station conditions were prescribed from the different interpolation schemes to the centroid of individual subbasins at the densities described.
The interpolation methods varied in complexity between simple deterministic and complex geostatistical methods, and the errors resulting from the different interpolation schemes are found to propagate through the hydrologic simulations, which affects the discharge characteristics.We use a calibrated catchment-wide ecohydrologic SWAT model to test the influence of different rain gauge densities and interpolation schemes and comprehensively describe the resulting variability in spatiotemporal stream discharge output.The calibrated model is based on the IDW interpolation scheme of 142 subbasins and provides the baseline scenario which the other simulations are compared to.Our results indicate that IDW and Kriging interpolation schemes are superior to the other methods investigated including Thiessen polygon, nearest neighbor, and spline methodologies.However, the Thiessen polygon performed adequately relative to nearest neighbor and spline.These results indicate that the simpler IDW interpolation scheme can be used in place of more complex spatially correlated Kriging interpolations.We found that optimum gauge density does not necessarily have to include a high number of meteorological stations and can be between 50 and 142 stations.As the number of stations increases above the optimal range, observed precipitation volume is translated to more subbasins and the summation of these stations in a higher density subbasin can be drastically influenced.To further examine the extreme variability in surface water discharge, we characterized the land use and soil distribution throughout each of the subbasins.Our results indicate that the optimal rain gauge position and distribution of the gauges drastically influence catchmentwide rainfall-runoff.These influences on runoff processes are a function of the soil distribution across the landscape and the gradient in convective precipitation volume.We found that when the precipitation volume is high, spatial variability is highest and is most likely due to the land use and soil distribution; however, when spatial rainfall variability is low, the distribution of land use and soil has less influence.With an observed gauge density of approximately 4 km 2 , this can be considered a fairly high resolution observational network.Further, while the geostatistical Kriging method incorporates correlations between the gauges, it does not show increased performance relative to the IDW methodology.Our study adds to the bulk of research examining spatial interpolation methods at different gauge densities by including a broader variety of common interpolation schemes and rain gauge densities.We also use a calibrated SWAT ecohydrologic model to evaluate the performance of each interpolation method at different densities on spatially distributed catchment-wide surface water discharge.We describe the benefit of using a distributed station approach relative to a single catchment-wide lumped simulation.The best gauge network depends on the question of hydrologic output, total volume runoff, peak discharge, time of travel, soil moisture, and other hydrograph responses, which are all influenced by the infiltration capacity of the land use soil type.It is also best to locate the gauges near locations of less permeable soil and land use, where, with higher precipitation variability, the significance of the soil influence increases.

Key Points/Highlights
(i) Development of a weighted spatiotemporal algorithm for gap-filling weather data.
(ii) Compared the estimated discharge between lumped (single) and distributed precipitation over 5 gauge densities.
(iii) Compared the simulated discharge results between 5 deterministic and geostatistical interpolation methods.
(iv) Characterized the role that the land use and soil distribution have on discharge as a function of precipitation.

3. 4 .
Discharge Measurements.Continuous and event-based discharge measurements were collected between 2003 and 2011 at up to 14 locations throughout the Haean catchment.

Figure 1 :
Figure 1: Haean study area location showing (a) observed surface water monitoring locations and meteorological stations, (b) aggregated soil distribution, (c) consolidated land use classification, and (d) meteorological station distribution of 16, 50, 142, and 300 point locations.

Figure 2 :Figure 3 :
Figure 2: Hourly variability in meteorological parameters ((a) temperature and precipitation, (b) solar radiation, (c) wind speed, and (d) relative humidity) between all of the stations in the Haean study area for 2009 and 2010.The average time series between all locations for each of the parameters is also provided.

Figure 4 :
Figure 4: Distribution of (a) hourly and (b) monthly precipitation for all of the 16 meteorological stations over the period of record, excluding time steps without precipitation.The mean of the 15 local stations (black dot) and the mean of the Haean KMA station (gold diamond) are identified.For comparative purposes, the percentage difference of the KMA station relative to the mean of the 15 local stations is provided.

Figure 5 :
Figure 5: Difference in average annual precipitation from the calibrated 142-subbasin IDW amount as described by Shope et al., 2014.The panels represent the difference for each of the 5 interpolation methods at each drainage area for the (a) 16 observed, (b) 50 simulated, (c) 142 simulated, and (d) 300 simulated meteorological stations.

Figure 6 :
Figure 6: Comparison of 2010 cumulative surface water discharge normalized to the drainage area of monitoring location.The comparison encompasses each of the 5 interpolation methods and the 4 meteorological station distributions.The range in discharge shown in the Thiessen polygon panel for 15 stations assumes a single lumped precipitation value based on the minimum and maximum precipitation.

Figure 7 :
Figure 7: Comparison of the difference in normalized surface water discharge for each monitoring location between interpolation methods and meteorological station density.The coefficient of determination ( 2 ) is a comparison between the individual simulation and the baseline model.The range in discharge shown in the Thiessen polygon panel for 15 stations assumes a single lumped precipitation value based on the minimum and maximum observed precipitation.

Figure 8 :
Figure 8: (a) Percentage of aggregated Haean catchment LULC for each of the monitoring location drainage areas.The LULC for inland water and deciduous evergreen forest was negligible and therefore was not included.(b) Land use classification for each of the surface water monitoring location drainage areas.

Figure 9 :
Figure 9: (a) Percentage of the Haean catchment soil distribution for each of the monitoring location drainage areas.The soil classification was aggregated to six heterogeneous soil types.(b) Soil distribution for each of the surface water monitoring location drainage areas.