Three methods of temporal data upscaling, which may collectively be called the generalized k-nearest neighbor (GkNN) method, are considered. The accuracy of the GkNN simulation of month by month yield is considered (where the term yield denotes the value of the dependent variable). The notion of an eventually well-distributed time series is introduced, and on the basis of this assumption some properties of the average annual yield and its variance for a GkNN simulation are computed. The behavior of the total yield over a planning period is determined. A general framework for analyzing the GkNN algorithm, based on the notion of stochastically dependent time series, is described, and it is shown that for a sufficiently large training set the GkNN simulation has the same statistical properties as the training data. An example of the application of the methodology is given for the problem of simulating the yield of a rainwater tank given monthly climatic data.
Commonwealth Scientific and Industrial Research Organisation

1. Introduction
The k-nearest neighbor method has its origins in the work of Mack [1], Yakowitz and Karlsson [2], and others, e.g., [3, 4]. In this work an estimate for Yi given an independently and identically distributed (i.i.d.) sequence (Xj,Yj) of random vectors with Xj∈Rp and Yj∈R (where R denotes the set of real numbers) on the basis of {(Xj,Yj):j<i} is obtained by taking the average of Yj over the set {Yj:j∈J}, where J is the set of indices of vectors Xj which form the k nearest neighbors of Xi, in which k>1.
In later work by Lall and Sharma [5] and Rajagopalan and Lall [6] a related method, also called the k-nearest neighbor method, was used for simulating hydrological stochastic time series (Xi,Yi). In this method the next value in the simulated time series is chosen randomly according to a probability distribution over the set J of indices j of the k nearest neighbors Xj of Xi in {Xj:j<i}.
More recent work in the area has been carried out by Biau et al. [7], Lee and Ouarda [8], and Zhang [9].
In the present paper we derive some general results about the k-nearest neighbor algorithm and related methods, which we group together into a general class that we call the generalized k-nearest neighbor (GkNN) method. We do not assume that the time series are i.i.d. [1], null-recurrent Markov [10], or Harris recurrent Markov chains [11]. We introduce the natural notion of a time series being eventually well distributed, from which some properties of the GkNN algorithm can be deduced.
The generalized k-nearest neighbor (GkNN) algorithm is described in Section 2. Section 3 investigates the problem of predicting the month by month yield (where we use the term "yield" to denote the value of the dependent variable Yi), while Section 4 considers the computation of the average annual yield. Section 5 computes the variance of the average annual yield, while Section 6 considers the behavior of the total yield. Section 7 describes a general framework for viewing the GkNN algorithm and conditions under which this framework is applicable in practice. Section 8 presents the particular example of the problem of simulating rainwater tank yield. The paper concludes in Section 9.
2. The Generalized k Nearest Neighbor (GkNN) Method
In the GkNN method we are given a time series {vt∈V:t=1,…,T} of predictor vectors which may be obtained from, for example, a stochastic simulation of climatic data. Here V denotes the space of predictor vectors. We are also given training data {(wi,ui)∈V×[0,∞):i=1,…,N}.
We want to assign yields yt for t=1,…,T in a meaningful way. We are given a metric μ:V×V→[0,∞). We are also given a probability distribution {p1,…,pN} on {1,…,N}. In the GkNN method the yield time series yt for t = 1,..., T is computed as follows.
For each t=1,…,T,
(1) Compute the metric values μ(vt,wi) for i=1,…,N and sort them from lowest to highest; let πt be the resulting permutation of {1,…,N}.
(2) Randomly choose an index i∈{1,…,N} according to the distribution {p1,…,pN} and denote it by i_selected.
(3) Return yt=uπt(i_selected).
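As an illustration, the three steps above can be sketched in Python. This is a minimal sketch, not an implementation from the paper; the function and argument names are illustrative.

```python
import random

def gknn_simulate(v_series, w_train, u_train, p, metric, rng=None):
    """Generalized k-nearest-neighbour (GkNN) simulation.

    For each predictor vector v_t, rank the training predictors w_i by
    distance under `metric`, draw a rank according to the probabilities
    p = (p_1, ..., p_N), and emit the yield of the training point of
    that rank.
    """
    rng = rng or random.Random()
    ranks = list(range(len(w_train)))
    yields = []
    for v in v_series:
        # Step (1): permutation pi_t of training indices, nearest first.
        pi_t = sorted(ranks, key=lambda i: metric(v, w_train[i]))
        # Step (2): draw a rank position according to {p_1, ..., p_N}.
        i_sel = rng.choices(ranks, weights=p, k=1)[0]
        # Step (3): emit the yield attached to the training point of that rank.
        yields.append(u_train[pi_t[i_sel]])
    return yields
```

With p concentrated on the first rank (p = (1, 0, …, 0)) the procedure reduces to the deterministic nearest-neighbor rule.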
3. Prediction of the Month by Month Yield by GkNN Simulation
We want to determine by either theoretical calculation or computational experiment how well the GkNN method predicts yields, or at least to find some sense in which it can be said that the GkNN method is predicting yields accurately. Suppose that we have a training set {(wi,ui):i=1,…,N}. Let {vt:t=1,…,T} be a given climatic time series and {zt:t=1,…,T} associated (unknown) yields. The GkNN method is a stochastic method for generating a yield time series. Suppose that we run it R times resulting in a yield time series {yt(r):t=1,…,T} for run r, where r∈{1,…,R}.
We will first work out how well the GkNN predicted yield approximates the actual yield for any given month. A measure of the error of the predicted yield compared to the actual yield for month t and run r is the square of the deviation, i.e., (yt(r)−zt)². The expected error for the GkNN computation of the yield for month t is
$$E_t=\lim_{R\to\infty}\frac{1}{R}\sum_{r=1}^{R}\left(y_t^{(r)}-z_t\right)^2.\tag{1}$$
We will show that this expected error exists and is positive. Let $\bar y_t$ denote the expected value of the GkNN prediction of the yield for month t; we will show that $\bar y_t$ exists. Let γ(t,r) denote the index i_selected chosen in Step (2) of the GkNN algorithm for month t and run r. By definition
$$\bar y_t=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}y_t^{(r)}=\lim_{R\to\infty}\frac1R\sum_{i=1}^{N}u_{\pi_t(i)}\bigl|\{r\in\{1,\dots,R\}:\gamma(t,r)=i\}\bigr|=\sum_{i=1}^{N}u_{\pi_t(i)}\lim_{R\to\infty}\frac1R\bigl|\{r\in\{1,\dots,R\}:\gamma(t,r)=i\}\bigr|=\sum_{i=1}^{N}p_i u_{\pi_t(i)}.\tag{2}$$
Thus $\bar y_t$ exists. Now
$$\left(y_t^{(r)}-z_t\right)^2=\left(y_t^{(r)}-\bar y_t\right)^2+\left(\bar y_t-z_t\right)^2+2\left(y_t^{(r)}-\bar y_t\right)\left(\bar y_t-z_t\right).\tag{3}$$
Therefore
$$E_t=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(y_t^{(r)}-\bar y_t\right)^2+\left(\bar y_t-z_t\right)^2=\operatorname{Var}(y_t)+\left(\bar y_t-z_t\right)^2.\tag{4}$$
Now the variance Var(yt) is given by
$$\operatorname{Var}(y_t)=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(y_t^{(r)}-\bar y_t\right)^2=\sum_{i=1}^{N}\left(u_{\pi_t(i)}-\bar y_t\right)^2\lim_{R\to\infty}\frac1R\bigl|\{r\in\{1,\dots,R\}:\gamma(t,r)=i\}\bigr|=\sum_{i=1}^{N}p_i\left(u_{\pi_t(i)}-\bar y_t\right)^2.\tag{5}$$
Thus
$$E_t=\sum_{i=1}^{N}p_i\left(u_{\pi_t(i)}-\bar y_t\right)^2+\left(\bar y_t-z_t\right)^2.\tag{6}$$
The expected error is the sum of two nonnegative terms. The first term can only be zero if all the points in the neighborhood {(wπt(i),uπt(i)):i=1,…,N; pi>0} have associated yields equal to $\bar y_t$, and this is seldom the case. The greater the spread of yields in the neighborhood, the greater the first term, and hence Et, will be. Thus the expected error Et is positive, and the error in the prediction of the yield during month t for any given run is likely to be positive.
A measure of the total error of the GkNN prediction of yield over the total simulation period for run r is
$$E^{(r)}=\sum_{t=1}^{T}\left(y_t^{(r)}-z_t\right)^2,\tag{7}$$
and its expected value is
$$E=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\sum_{t=1}^{T}\left(y_t^{(r)}-z_t\right)^2=\sum_{t=1}^{T}\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(y_t^{(r)}-z_t\right)^2=\sum_{t=1}^{T}E_t>0.\tag{8}$$
We may write
$$E_t=\sum_{i=1}^{N}p_i\left(u_{\pi_t(i)}-\bar y_t\right)^2+\left(\bar y_t-z_t\right)^2=E_t^{b}+E_t^{p},\tag{9}$$
where
$$E_t^{b}=\sum_{i=1}^{N}p_i\left(u_{\pi_t(i)}-\bar y_t\right)^2\tag{10}$$
and
$$E_t^{p}=\left(\bar y_t-z_t\right)^2.\tag{11}$$
We have
$$E_t^{b}=\sum_{i=1}^{N}p_i\left(u_{\pi_t(i)}-\sum_{j=1}^{N}p_j u_{\pi_t(j)}\right)^2.\tag{12}$$
Now define π:V×{1,…,N}→{1,…,N} by
$$\pi(v,i)=\text{the index of the }i\text{th closest element of }\{w_j:j=1,\dots,N\}\text{ to }v\text{ with respect to the metric }\mu,\tag{13}$$
and let, for v∈V and i∈{1,…,N}, πv(i)=π(v,i). Then
$$E_t^{b}=\mathcal E(v_t),\tag{14}$$
where $\mathcal E:V\to[0,\infty)$ is defined by
$$\mathcal E(v)=\sum_{i=1}^{N}p_i\left(u_{\pi_v(i)}-\sum_{j=1}^{N}p_j u_{\pi_v(j)}\right)^2.\tag{15}$$
$\mathcal E:V\to[0,\infty)$ may be called the base error map. We will show that $\mathcal E$ is bounded over the predictor vector space as follows:
$$\mathcal E(v)\le\sum_{i=1}^{N}p_i\left(u_{\max}+\sum_{j=1}^{N}p_j u_{\max}\right)^2\le N\left(u_{\max}+N u_{\max}\right)^2=N(1+N)^2 u_{\max}^2,\tag{16}$$
where umax=max{ui:i=1,…,N}.
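The base error map of Eq. (15), and the bound of Eq. (16), can be evaluated directly for a concrete training set. The following is a hypothetical numerical sketch (names and data are illustrative, not from the paper):

```python
def base_error(v, w_train, u_train, p, metric):
    """Base error E(v) of Eq. (15): the variance of the yield drawn by
    the GkNN method at predictor vector v."""
    n = len(w_train)
    # Permutation pi_v: training indices ordered by distance to v.
    pi_v = sorted(range(n), key=lambda i: metric(v, w_train[i]))
    mean = sum(p[i] * u_train[pi_v[i]] for i in range(n))  # Eq. (2)
    return sum(p[i] * (u_train[pi_v[i]] - mean) ** 2 for i in range(n))
```

For example, with training yields 10, 20, 30 and equal weight on the two nearest neighbors, the base error at v = 0.1 is 25, well inside the (crude) bound N(1+N)²u_max² of Eq. (16).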
4. Prediction of the Annual Average Yield by GkNN Simulation
Thus the GkNN method does not make accurate detailed month by month predictions of the yield. We would like to determine some way in which the GkNN method gives useful information about the system behavior. We will show that under certain assumptions the GkNN method gives an accurate prediction of the average annual yield and the accuracy of the prediction increases as the total time period of the simulation increases.
Given a permutation π:{1,…,N}→{1,…,N} let Vπ={v∈V:πv=π}. Let Π denote the set of all permutations of {1,…,N}. Suppose that the simulation is carried out over m years, so T=12m. The average annual yield for run r is
$$Y^{(r)}=\frac1m\sum_{t=1}^{12m}y_t^{(r)}=\frac1m\sum_{\pi\in\Pi}\sum\left\{y_t^{(r)}:v_t\in V_\pi,\ t\in\{1,\dots,12m\}\right\}.\tag{17}$$
Therefore the average of the average annual yield over R runs is given by
$$\frac1R\sum_{r=1}^{R}Y^{(r)}=\frac1m\sum_{\pi\in\Pi}\sum\left\{\frac1R\sum_{r=1}^{R}y_t^{(r)}:v_t\in V_\pi,\ t\in\{1,\dots,12m\}\right\}\to\frac1m\sum_{\pi\in\Pi}\sum\left\{\sum_{i=1}^{N}p_i u_{\pi(i)}:v_t\in V_\pi,\ t\in\{1,\dots,12m\}\right\},\tag{18}$$
as R→∞. Therefore the expected value of the predicted average annual yield is given by
$$\bar Y=\frac1m\sum_{\pi\in\Pi}\left(\sum_{i=1}^{N}p_i u_{\pi(i)}\right)\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|.\tag{19}$$
If X is a topological space and x={xt:t=1,2,…} is a time series in X then we will say that x is eventually well distributed if
$$\lim_{T\to\infty}\frac1T\bigl|\{t\in\{1,\dots,T\}:x_t\in U\}\bigr|\tag{20}$$
exists for all Borel sets U⊂X. (Borel(X) denotes the sigma algebra generated by the set of open sets in X [12].) This is a natural property for a time series to have. If x is eventually well distributed, define its distribution to be the mapping ν:Borel(X)→[0,1] defined by
$$\nu(U)=\lim_{T\to\infty}\frac1T\bigl|\{t\in\{1,\dots,T\}:x_t\in U\}\bigr|.\tag{21}$$
It is straightforward to show that ν is finitely additive and ν(X)=1.
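The limit in Eq. (21) can be estimated empirically by truncating at a finite horizon T. As a hypothetical illustration (the series and set below are our own, not from the paper), a periodic series is eventually well distributed, and the estimate recovers its distribution exactly when T is a multiple of the period:

```python
def empirical_distribution(series, in_U, T):
    """Estimate nu(U) of Eq. (21): the fraction of the first T terms of
    the time series lying in U (U given as an indicator function)."""
    return sum(1 for t in range(1, T + 1) if in_U(series(t))) / T

# Periodic series x_t = t mod 4 and U = {0, 1}: nu(U) = 1/2.
nu_hat = empirical_distribution(lambda t: t % 4, lambda x: x < 2, 400)
```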
If the climatic time series {vt:t=1,2,…} is eventually well distributed with distribution ν then the expected average annual yield converges to a limit as the number of years m in the simulation increases, given by
$$\lim_{m\to\infty}\bar Y=12\sum_{\pi\in\Pi}\nu(V_\pi)\sum_{i=1}^{N}p_i u_{\pi(i)}.\tag{22}$$
5. Variance of the Average Annual Yield Predicted by GkNN Simulation
We will now compute the variance of the average annual yield and show that it tends to zero as the number of years m in the simulation increases. We have
$$\operatorname{Var}(Y)=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(Y^{(r)}-\bar Y\right)^2=\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(Y^{(r)}\right)^2-\bar Y^2.\tag{23}$$
We may compute
$$\frac1R\sum_{r=1}^{R}\left(Y^{(r)}\right)^2=\frac1{m^2}\sum_{\pi_1,\pi_2\in\Pi}\sum\left\{\frac1R\sum_{r=1}^{R}y_t^{(r)}y_s^{(r)}:v_t\in V_{\pi_1},v_s\in V_{\pi_2},\ s,t\in\{1,\dots,12m\},\ s\ne t\right\}+\frac1{m^2}\sum_{\pi\in\Pi}\sum\left\{\frac1R\sum_{r=1}^{R}\left(y_t^{(r)}\right)^2:v_t\in V_\pi,\ t\in\{1,\dots,12m\}\right\}.\tag{24}$$
Now for s,t∈{1,…,12m} with s≠t, vt∈Vπ1 and vs∈Vπ2,
$$\frac1R\sum_{r=1}^{R}y_t^{(r)}y_s^{(r)}=\sum_{i,j=1}^{N}u_{\pi_1(i)}u_{\pi_2(j)}\frac1R\bigl|\{r\in\{1,\dots,R\}:\gamma(t,r)=i,\ \gamma(s,r)=j\}\bigr|\to\sum_{i,j=1}^{N}p_i p_j u_{\pi_1(i)}u_{\pi_2(j)},\tag{25}$$
as R→∞ (assuming that the index selection at Step (2) of the GkNN algorithm at time t is independent of its selection at time s). Therefore
$$\lim_{R\to\infty}\frac1R\sum_{r=1}^{R}\left(Y^{(r)}\right)^2=\frac1{m^2}\sum_{\pi_1,\pi_2\in\Pi}\sum_{i,j=1}^{N}p_i p_j u_{\pi_1(i)}u_{\pi_2(j)}\bigl|\{(s,t)\in\{1,\dots,12m\}^2:v_t\in V_{\pi_1},v_s\in V_{\pi_2},\ s\ne t\}\bigr|+\frac1{m^2}\sum_{\pi\in\Pi}\overline{y_t^2}\,\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|,\tag{26}$$
where $\overline{y_t^2}=\sum_{i=1}^{N}p_i u_{\pi(i)}^2$ denotes the expected value of $y_t^2$ for $v_t\in V_\pi$. Also we compute
$$\bar Y^2=\frac1{m^2}\sum_{\pi_1,\pi_2\in\Pi}\sum_{i,j=1}^{N}p_i p_j u_{\pi_1(i)}u_{\pi_2(j)}\bigl|\{(s,t)\in\{1,\dots,12m\}^2:v_t\in V_{\pi_1},v_s\in V_{\pi_2}\}\bigr|.\tag{27}$$
It follows that
$$\operatorname{Var}(Y)=\frac1{m^2}\sum_{\pi\in\Pi}\left(\overline{y_t^2}-\bar y_t^{\,2}\right)\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|=\frac1{m^2}\sum_{\pi\in\Pi}\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|.\tag{28}$$
Therefore
$$\operatorname{Var}(Y)\le\frac{C}{m},\tag{29}$$
where
$$C=12\sum_{\pi\in\Pi}\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2,\tag{30}$$
and so the variance of the average annual yield as computed by the GkNN method tends to zero as the total number of years m in the simulation increases. If the time series {vt:t=1,2,…} is eventually well distributed with distribution ν then
$$\lim_{m\to\infty}m\operatorname{Var}(Y)=12\sum_{\pi\in\Pi}\nu(V_\pi)\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2.\tag{31}$$
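The 1/m decay of Var(Y) can be checked exactly for a small example by evaluating Eq. (28) directly, under the stated assumption of independent index draws at each month. This is an illustrative sketch with invented data, not code from the paper:

```python
def annual_yield_variance(v_series, w_train, u_train, p, metric):
    """Exact Var(Y) from Eq. (28) for a 12*m-month predictor series,
    assuming independent GkNN index draws at each month.

    The sum over permutations pi in Eq. (28) is evaluated equivalently
    as a sum over months t, each month contributing Var(y_t)."""
    m = len(v_series) // 12
    ranks = range(len(w_train))
    total = 0.0
    for v in v_series:
        pi_v = sorted(ranks, key=lambda i: metric(v, w_train[i]))
        mean = sum(p[i] * u_train[pi_v[i]] for i in ranks)
        total += sum(p[i] * (u_train[pi_v[i]] - mean) ** 2 for i in ranks)
    return total / m ** 2

d = lambda a, b: abs(a - b)
var_1yr = annual_yield_variance([0.1] * 12, [0.0, 1.0, 2.0], [10.0, 20.0, 30.0], [0.5, 0.5, 0.0], d)
var_2yr = annual_yield_variance([0.1] * 24, [0.0, 1.0, 2.0], [10.0, 20.0, 30.0], [0.5, 0.5, 0.0], d)
```

Here each month contributes variance 25, so doubling m halves the variance of the average annual yield, in line with Eq. (29).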
6. Prediction of the Total Yield by GkNN Simulation
Thus the computation of the average annual yield using GkNN seems to be well behaved. However, it is perhaps of greater interest to consider the total yield at any month starting from the beginning of the simulation period. The total yield Ytot over a simulation period of m years is given by
$$Y_{\mathrm{tot}}=mY,\tag{32}$$
where Y is the average annual yield. Therefore the variance of the total yield is given by
$$\operatorname{Var}(Y_{\mathrm{tot}})=m^2\operatorname{Var}(Y)=\sum_{\pi\in\Pi}\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|.\tag{33}$$
If the time series {vt:t=1,2,…} is eventually well distributed with distribution ν then Var(Ytot)=mf(m), where
$$f(m)=\frac1m\sum_{\pi\in\Pi}\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2\bigl|\{t\in\{1,\dots,12m\}:v_t\in V_\pi\}\bigr|\to12\sum_{\pi\in\Pi}\nu(V_\pi)\sum_{i=1}^{N}p_i\left(u_{\pi(i)}-\sum_{j=1}^{N}p_j u_{\pi(j)}\right)^2,\tag{34}$$
as m→∞. This limit will be positive in practical applications. Thus, in this case, the variance of Ytot becomes unbounded as m→∞.
7. A General Framework for GkNN
Let {vt:t=1,2,…}⊂V be a time series, which may be a realization of some stochastic process, and let Z be a topological space. A stochastic process y={yt:t=1,2,…}⊂Z will be said to be stochastically dependent on {vt:t=1,2,…} if there exists a continuous kernel K:V×Borel(Z)→[0,1] such that
$$\Pr(y_t\in\Gamma)=K(v_t,\Gamma),\quad\forall t=1,2,\dots\tag{35}$$
The condition that K is a continuous kernel means that for all v∈V the mapping taking Γ∈Borel(Z) to K(v,Γ) is a probability measure and for all Γ∈Borel(Z) the mapping taking v∈V to K(v,Γ) is continuous. Equation (35) means that if {yt(r):t=1,2,…} for r=1,2,… are a collection of runs (replicates) of the stochastic process {yt:t=1,2,…} then
$$\lim_{R\to\infty}\frac1R\bigl|\{r\in\{1,\dots,R\}:y_t^{(r)}\in\Gamma\}\bigr|=K(v_t,\Gamma).\tag{36}$$
Consider the GkNN process defined by training data {(wi,ui):i=1,…,N}⊂V×[0,∞). In this case the space Z is the space [0,∞). We will show that the process {y1,y2,…} is stochastically dependent on the time series {vt:t=1,2,…}. In fact we have
$$\Pr(y_t\in\Gamma)=\lim_{R\to\infty}\frac1R\bigl|\{r\in\{1,\dots,R\}:y_t^{(r)}\in\Gamma\}\bigr|=\sum_{i=1}^{N}\delta_{u_{\pi_t(i)}}(\Gamma)\lim_{R\to\infty}\frac1R\bigl|\{r\in\{1,\dots,R\}:y_t^{(r)}=u_{\pi_t(i)}\}\bigr|=\sum_{i=1}^{N}p_i\,\delta_{u_{\pi_t(i)}}(\Gamma),\tag{37}$$
where for a∈Z, δa:Borel(Z)→[0,∞) denotes the Dirac measure concentrated on a, defined by
$$\delta_a(\Gamma)=\begin{cases}1&\text{if }a\in\Gamma,\\0&\text{otherwise}.\end{cases}\tag{38}$$
It follows that the GkNN process is stochastically dependent on {vt:t=1,2,…} with kernel K defined by
$$K(v,\Gamma)=\sum_{i=1}^{N}p_i\,\delta_{u_{\pi_v(i)}}(\Gamma).\tag{39}$$
Now suppose that {vt:t=1,2,…}⊂V is a time series and {yt:t=1,2,…}⊂[0,∞) is a stochastic process which is stochastically dependent on {vt:t=1,2,…} with kernel K, where K is defined by a continuous functional kernel ϕ:V×[0,∞)→[0,∞), i.e.,
$$K(v,\Gamma)=\int_\Gamma\phi(v,\xi)\,d\xi.\tag{40}$$
Let {zt:t=1,2,…} be a realization (replicate) of {yt:t=1,2,…} and let KN be the kernel associated with the GkNN process with training set WN={(vi,zi):i=1,…,N} and probabilities
$$p_i=\begin{cases}\dfrac{1}{k_N}&\text{for }i=1,\dots,k_N,\\[4pt]0&\text{otherwise},\end{cases}\tag{41}$$
for which kN→∞ as N→∞ but kN/N→0 as N→∞. An example of a sequence kN satisfying this is $k_N=\sqrt N$.
KN is given by
$$K_N(v,\Gamma)=\frac1{k_N}\sum_{i=1}^{k_N}\delta_{z_{\pi_v(i)}}(\Gamma)=\frac1{k_N}\bigl|\{i\in\{1,\dots,k_N\}:z_{\pi_v(i)}\in\Gamma\}\bigr|.\tag{42}$$
Therefore, for an interval (a,b),
$$K_N(v,(a,b))=\frac1{k_N}\bigl|\{i\in\{1,\dots,k_N\}:z_{\pi_v(i)}\in(a,b)\}\bigr|.\tag{43}$$
Now let ψ:V×[0,∞]→[0,1] be defined by
$$\psi(v,\xi)=\int_0^\xi\phi(v,\zeta)\,d\zeta.\tag{44}$$
Let {ρt} be defined by ρt=ψ(vt,zt) for t=1,2,…. Then {ρt} is a uniformly distributed sequence of random numbers and zt=(ψ(vt,·))⁻¹(ρt). Thus
$$K_N(v,(a,b))=\frac1{k_N}\bigl|\{i\in\{1,\dots,k_N\}:\psi(v_{\pi_v(i)},\cdot)^{-1}(\rho_{\pi_v(i)})\in(a,b)\}\bigr|\approx\frac1{k_N}\bigl|\{i\in\{1,\dots,k_N\}:\rho_{\pi_v(i)}\in(\psi(v,a),\psi(v,b))\}\bigr|\to\psi(v,b)-\psi(v,a)=K(v,(a,b)),\tag{45}$$
as N→∞, assuming that μ(vπv(i),v) is small for all i=1,…,kN for N large enough (this will follow, if {vt} is eventually well distributed with positive distribution, given that {ρπv(i)} is a uniformly distributed sequence).
Thus the GkNN kernel equals the kernel of the dependent process in the sense defined above as long as the training set for the GkNN process is large enough.
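The convergence of the empirical kernel K_N in Eq. (43) to the true kernel K can be observed numerically. The following sketch uses a hypothetical dependent process of our own construction (yield uniform on [v, v + 0.5] given predictor v), with k_N = √N as suggested above:

```python
import math
import random

def gknn_kernel(v, interval, train, k):
    """Empirical GkNN kernel K_N(v, (a, b)) of Eq. (43): the fraction of
    the k nearest training predictors whose yields fall in (a, b)."""
    a, b = interval
    nearest = sorted(train, key=lambda wz: abs(wz[0] - v))[:k]
    return sum(1 for _, z in nearest if a < z < b) / k

# Hypothetical dependent process: given v, the yield is uniform on
# [v, v + 0.5], so the true kernel gives K(0.5, (0.5, 0.75)) = 0.5.
rng = random.Random(0)
train = []
for _ in range(10000):
    v = rng.random()
    train.append((v, v + 0.5 * rng.random()))
k = int(math.sqrt(len(train)))  # k_N = sqrt(N): k_N -> inf, k_N/N -> 0
est = gknn_kernel(0.5, (0.5, 0.75), train, k)
```

With N = 10000 training points the estimate lies close to the true value 0.5, as Eq. (45) predicts.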
8. Example of Temporal Upscaling of (Rainwater) Tank Data
We would like to estimate the month by month yield of a rainwater tank (RWT) given monthly climatic data. This is not straightforward because a monthly time step is too coarse for the RWT simulation model. To obtain reasonably accurate results a daily time step must be used for the RWT simulation [13, 14].
The monthly climatic data arises from the water supply headworks (WSH) model [15] and is usually stochastically generated with a very large time span (e.g., 1,000,000 years). The problem of temporal scaling up would not arise if the climatic data for the WSH model had a daily time step (and also if the RWT simulation algorithm could be executed sufficiently fast).
Temporal downscaling has been used extensively in studying the short term effects of long-term climate models such as models of climate change [16–19]. However in the present paper we are considering the problem of upscaling relatively short records of daily data to generate long term records of monthly data.
Three methods of temporal upscaling are the nearest neighbor (NN) method of Coombes et al. [20], Kuczera’s bootstrap method [21] and the k-nearest neighbor (kNN) method [5, 6, 16].
In each of these methods the RWT month by month yield associated with a WSH climatic time series is estimated using a comparatively short (e.g., 140 years) historical record of daily climatic data. In each case the RWT simulation model or, more generally, the Allotment Water Balance model described in [20] is run on this daily historical record for various RWT parameter settings. In order to do this it is necessary to have a demand model, which is either a simulation or, less commonly, a historical record. The demand simulation will take into account the climatic variables, in particular the temperature.
The upscaling methods can be described in terms of the following general format. Each of the upscaling methods aggregates the daily RWT yields and climatic variables obtained from running the RWT simulation on the historical record into monthly time steps. They then generate a list {rjR:j=1,…,N} of records of the form
$$r_j^R=\left(\text{month\_label}_j^R,\ \text{climatic\_variable\_1}_j^R,\dots,\text{climatic\_variable\_n}_j^R,\ \text{RWT\_yield}_j^R\right),\tag{46}$$
where N is the number of months in the historical record. The month label is a number in {1,…,12} determined from the month corresponding to the record. For the method described in [20], n=3 and
climatic_variable_1 = average_temperature,
climatic_variable_2 = number_of_rainfall_days,
climatic_variable_3 = rainfall_depth.
For Kuczera’s bootstrap method and the kNN method as currently implemented n=1 and climatic_variable_1 = rainfall_depth.
Now, for all three upscaling methods, we are given a sequence c1H, c2H, … of monthly records coming from the WSH model, where
$$c_i^H=\left(\text{month\_label}_i^H,\ \text{climatic\_variable\_1}_i^H,\dots,\text{climatic\_variable\_n}_i^H\right).\tag{47}$$
For each i we want to select a RWT yield to associate with ciH. The NN method does this by finding the record in {rjR:j=1,…,N, month_labeljR=month_labeliH} which is closest to ciH as measured by the metric (a variant of the Manhattan metric) given by
$$\mu_{\mathrm{NN}}\left(r_j^R,c_i^H\right)=\sum_{p=2}^{d}w_p\left|\left(r_j^R\right)_p-\left(c_i^H\right)_p\right|,\tag{48}$$
where d=n+1 is the number of record components used in the comparison (e.g., d=2 when n=1) and w2,…,wd are weights, all chosen to be 1 in [20]. The NN method is deterministic.
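The deterministic NN selection just described can be sketched as follows. This is an illustrative sketch of the rule, with hypothetical record layout and names, not the implementation of [20]:

```python
def nn_upscale_yield(c_H, month_H, records):
    """NN upscaling rule: among historical records with the same month
    label, return the RWT yield of the record minimising the
    Manhattan-type metric of Eq. (48), with all weights equal to 1.

    records: list of (month_label, climate_vector, rwt_yield)."""
    same_month = [r for r in records if r[0] == month_H]

    def dist(rec):
        # Sum of absolute component differences over the climate vector.
        return sum(abs(rec[1][p] - c_H[p]) for p in range(len(c_H)))

    return min(same_month, key=dist)[2]
```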
The kNN method is a stochastic method in which the following steps are carried out.
(1) Evaluate the distance from each record rjR to ciH using the following metric (a variant of the Euclidean metric):
$$\mu_{k\mathrm{NN}}\left(r_j^R,c_i^H\right)=\left(\sum_{p=1}^{d}\left(\frac{\left(r_j^R\right)_p-\left(c_i^H\right)_p}{s_p}\right)^2\right)^{1/2},\tag{49}$$
where sp is the standard deviation of {(rjR)p:j=1,…,N}.
(2) Sort the metric values.
(3) Choose the top (closest) k records rj1R,…,rjkR.
(4) Assign to the record of rank t a probability proportional to 1/t, for t=1,…,k.
(5) Randomly select a rank t according to the assigned probabilities and return RWT_yieldjtR as the RWT yield corresponding to ciH.
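The five steps above can be sketched in Python. This is a minimal sketch under our own naming and record layout, not the implementation of [5, 6, 16]:

```python
import random

def knn_upscale_yield(c_H, records, k, stds, rng=None):
    """One step of the kNN upscaling method: choose a RWT yield for the
    monthly WSH record c_H from the k nearest historical records.

    records: list of (climate_vector, rwt_yield) monthly aggregates.
    stds: per-component standard deviations for the scaled metric (49).
    """
    rng = rng or random.Random()

    def dist(c):
        # Standard-deviation-scaled Euclidean metric of Eq. (49).
        return sum(((c[p] - c_H[p]) / stds[p]) ** 2
                   for p in range(len(c_H))) ** 0.5

    # Steps (2)-(3): sort by distance and keep the k closest records.
    ranked = sorted(records, key=lambda rec: dist(rec[0]))[:k]
    # Step (4): probability proportional to 1/t for rank t = 1, ..., k.
    weights = [1.0 / t for t in range(1, k + 1)]
    # Step (5): draw a rank and return its yield.
    return rng.choices(ranked, weights=weights, k=1)[0][1]
```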
The bootstrap method is a stochastic method in which a scatter plot of {(rainfalljR,RWT_yieldjR):j=1,…,N} is created. The domain of the plot is divided up into bands of 50 samples per band. Then, given a WSH climatic record ciH the corresponding RWT yield is obtained by finding the band containing rainfalliH, randomly choosing a sample in that band and then returning its RWT yield value.
The bootstrap method of Kuczera can be modified by taking the set of samples associated with any given rainfall value to be the 50 samples whose rainfall values are closest to the given rainfall value, rather than using predefined bands of 50 samples. It can be argued that the modified bootstrap method is superior to the bootstrap method because the closest values are the most appropriate values to use; moreover, if the given rainfall value falls near the boundary of one of the predefined bands, then the yield predicted by the bootstrap method will be biased towards the values near the centre of the band.
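The modified bootstrap rule can be sketched as follows (an illustrative sketch with hypothetical names; the band size of 50 follows the description above):

```python
import random

def modified_bootstrap_yield(rainfall, samples, band=50, rng=None):
    """Modified bootstrap: instead of predefined 50-sample bands, take
    the `band` samples whose rainfall is closest to the query value and
    draw one of them uniformly at random.

    samples: list of (rainfall, rwt_yield) pairs."""
    rng = rng or random.Random()
    nearest = sorted(samples, key=lambda s: abs(s[0] - rainfall))[:band]
    return rng.choice(nearest)[1]
```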
The modified bootstrap method, the Coombes method, and the kNN method are all examples of the GkNN method. For the modified bootstrap method the predictor vectors have one component, the rainfall. For the Coombes method the predictor vectors have three components, the average temperature, the number of rainfall days, and the rainfall depth. For the kNN method the predictor vectors have two components, the month label (an integer in {1,…,12}) and the rainfall depth. The training data is obtained by running the RWT simulation model using a daily time step over a relatively short period of time (e.g., 100 years) and then upscaling to a monthly time step by aggregation. The GkNN metric μ:V×V→[0,∞) may be the modified Manhattan metric of the Coombes method or the modified Euclidean metric of the kNN method.
For the bootstrap method the probability distribution on the set of nearest neighbors is given by
$$p_i=\begin{cases}\dfrac1{50}&\text{for }i=1,\dots,50,\\[4pt]0&\text{otherwise}.\end{cases}\tag{50}$$
For the kNN method the distribution is given by
$$p_i=\begin{cases}\nu^{-1}\dfrac1i&\text{for }i=1,\dots,k,\\[4pt]0&\text{otherwise},\end{cases}\tag{51}$$
where
$$\nu=\sum_{i=1}^{k}\frac1i.\tag{52}$$
(Here ν is a normalizing constant, not to be confused with the distribution ν of Section 4.)
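The kNN probabilities of Eqs. (51)-(52) can be computed directly; the function name below is illustrative:

```python
def knn_probabilities(k):
    """Probabilities of Eqs. (51)-(52): p_i proportional to 1/i over the
    k nearest neighbours, normalised by nu = sum_{i=1}^k 1/i."""
    nu = sum(1.0 / i for i in range(1, k + 1))
    return [(1.0 / i) / nu for i in range(1, k + 1)]
```

For k = 3, ν = 11/6 and the probabilities are 6/11, 3/11, and 2/11, decreasing with rank as intended.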
9. Conclusion
A generalization of three methods of temporal data upscaling, which we have called the generalized k-nearest neighbor (GkNN) method, has been considered. The accuracy of the GkNN simulation of month by month yield has been analyzed. The notion of an eventually well-distributed time series was introduced, and on the basis of this assumption some properties of the average annual yield and its variance for a GkNN simulation were computed. The behavior of the total yield over a planning period has been described. A general framework for considering the GkNN algorithm, based on the notion of stochastically dependent time series, has been described, and it was shown that for a sufficiently large training set the GkNN simulation has the same statistical properties as the training data. An example of the application of the methodology has been given in the problem of simulating the yield of a rainwater tank given monthly climatic data.
Data Availability
The work of the paper is a theoretical study. The author did not implement any code or generate any data relating to the work. Therefore no data were used to support this study.
Conflicts of Interest
The author declares that there are no conflicts of interest.
Acknowledgments
The work described in this paper was partially funded by the Commonwealth Scientific and Industrial Research Organisation (CSIRO, Australia). Also the author would like to thank Fareed Mirza, Shiroma Maheepala, and Yong Song for very helpful discussions.
References

[1] Y. P. Mack, "Local properties of k-NN regression estimates," SIAM Journal on Algebraic and Discrete Methods, vol. 2, no. 3, pp. 311–323, 1981. doi:10.1137/0602035.
[2] S. Yakowitz and M. Karlsson, "Nearest neighbour methods for time series, with application to rainfall/runoff prediction," in J. B. Macneill and G. J. Umphrey (Eds.), Reidel Publishing Company, pp. 149–160, 1987.
[3] T. Cover and P. E. Hart, "Nearest neighbour pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967. doi:10.1109/tit.1967.1053964.
[4] L. Devroye, "On the almost everywhere convergence of nonparametric regression function estimates," The Annals of Statistics, vol. 9, no. 6, pp. 1310–1319, 1981. doi:10.1214/aos/1176345647.
[5] U. Lall and A. Sharma, "A nearest neighbor bootstrap for resampling hydrologic time series," Water Resources Research, vol. 32, no. 3, pp. 679–693, 1996. doi:10.1029/95WR02966.
[6] B. Rajagopalan and U. Lall, "A k-nearest-neighbor simulator for daily precipitation and other weather variables," Water Resources Research, vol. 35, no. 10, pp. 3089–3101, 1999. doi:10.1029/1999WR900028.
[7] G. Biau, L. Devroye, V. Dujmovic, and A. Krzyzak, "An affine invariant k-nearest neighbour regression estimate," Journal of Multivariate Analysis, vol. 112, pp. 24–34, 2012. doi:10.1016/j.jmva.2012.05.020.
[8] T. Lee and T. B. M. J. Ouarda, "Identification of model order and number of neighbors for k-nearest neighbor resampling," Journal of Hydrology, vol. 404, no. 3-4, pp. 136–145, 2011. doi:10.1016/j.jhydrol.2011.04.024.
[9] S. Zhang, "Nearest neighbor selection for iteratively kNN imputation," Journal of Systems and Software, vol. 85, no. 11, pp. 2541–2552, 2012. doi:10.1016/j.jss.2012.05.073.
[10] S. Yakowitz, "Nearest neighbor regression estimation for null-recurrent Markov time series," Stochastic Processes and their Applications, vol. 48, no. 2, pp. 311–318, 1993. doi:10.1016/0304-4149(93)90050-E.
[11] A. Sancetta, "Nearest neighbor conditional estimation for Harris recurrent Markov chains," Journal of Multivariate Analysis, vol. 100, no. 10, pp. 2224–2236, 2009. doi:10.1016/j.jmva.2009.06.013.
[12] P. R. Halmos, Measure Theory, Springer, New York, NY, USA, 1974.
[13] J. Mashford and S. Maheepala, "A general model for the exact computation of yield from a rainwater tank," Applied Mathematical Modelling, vol. 39, no. 7, pp. 1929–1940, 2015. doi:10.1016/j.apm.2014.10.004.
[14] J. Mashford, S. Maheepala, L. Neumann, and E. Coultas, "Computation of the expected value and variance of the average annual yield for a stochastic simulation of rainwater tank clusters," in Proceedings of the 2011 International Conference on Modeling, Simulation and Visualization Methods, Las Vegas, Nev, USA, pp. 303–309, 2011.
[15] L.-J. Cui and G. Kuczera, "Optimizing urban water supply headworks using probabilistic search methods," Journal of Water Resources Planning and Management, vol. 129, no. 5, pp. 380–387, 2003. doi:10.1061/(ASCE)0733-9496(2003)129:5(380).
[16] S. Gangopadhyay, M. Clark, and B. Rajagopalan, "Statistical downscaling using K-nearest neighbors," Water Resources Research, vol. 41, no. 2, 2005. doi:10.1029/2004WR003444.
[17] H. J. Fowler, S. Blenkinsop, and C. Tebaldi, "Linking climate change modelling to impacts studies: recent advances in downscaling techniques for hydrological modelling," International Journal of Climatology, vol. 27, no. 12, pp. 1547–1578, 2007. doi:10.1002/joc.1556.
[18] D. Maraun, F. Wetterhall, A. M. Ireson, et al., "Precipitation downscaling under climate change: recent developments to bridge the gap between dynamical models and the end user," Reviews of Geophysics, vol. 48, no. 3, pp. 1–34, 2010. doi:10.1029/2009RG000314.
[19] R. J. Erhardt, L. E. Band, R. L. Smith, and B. J. Lopes, "Statistical downscaling of precipitation on a spatially dependent network using a regional climate model," Stochastic Environmental Research and Risk Assessment, vol. 29, no. 7, pp. 1835–1849, 2015. doi:10.1007/s00477-014-0988-y.
[20] P. J. Coombes, G. Kuczera, J. D. Kalma, and J. R. Argue, "An evaluation of the benefits of source control measures at the regional scale," Urban Water, vol. 4, no. 4, pp. 307–320, 2002. doi:10.1016/S1462-0758(02)00028-6.
[21] G. Kuczera, "Urban water supply drought security: a comparative analysis of complimentary centralised and decentralised storage systems," in Proceedings of Water Down Under 2008, pp. 1532–1543, 2008.