Information Analysis of Catchment Hydrologic Patterns across Temporal Scales

The catchment hydrologic cycle takes on different patterns across temporal scales. The transition between event-scale hydrologic processes and the mean annual water-energy correlation pattern requires further examination to reach a self-consistent understanding. In this paper, the temporal scale transition revealed by observation and simulation was evaluated in an information theoretical framework named Aleatory Epistemic Uncertainty Estimation. The Aleatory Uncertainty refers to the posterior uncertainty of runoff given the observations of the input variables. The Epistemic Uncertainty refers to the increase in posterior uncertainty caused by imperfect decoding of the observations in models. Daily hydrometeorological observations in 24 catchments were aggregated to temporal scales from 10 days to 1 year before implementing the information analysis. The estimated information contents and flows of hydrologic terms across temporal scales were related to the catchments' seasonality type. The results also showed that the information distilled by the monthly and annual water balance models applied here did not correspond to that provided by observations at temporal scales from two months to half a year. This calls for a better understanding of seasonal hydrologic mechanisms.


Introduction
A major task of the hydrologic community is to determine the components of the hydrologic cycle. Each component should be determined either by observation or by an independent governing equation to guarantee the solvability of the problem. The accuracy of observations and the domain of validity of governing equations usually change with scale. The term scale here refers to the characteristic time (or length) of a process, observation, or model [1]. While large scale hydrologic patterns are expected to emerge by integrating detailed event-scale hydrologic control functions along spatial and temporal paths, such a reductionist approach often fails to distill the dominant factors that contribute to a catchment's long-range hydrologic behaviour. On the other hand, the holistic perspective has been widely adopted to provide a coarse explanation of a catchment's mean annual water balance [2,3]. A bridge between the reductionist and holistic paradigms is required to reach a self-consistent understanding of hydrologic temporal scale transition [4].
One practical attempt toward this goal is to extend the existing mean annual models so that they fit small temporal scale hydrologic simulation. The classical Budyko curve [5], which connects the ratio of a catchment's actual evapotranspiration ($E$) to precipitation ($P$) with the dryness index ($E_p/P$, where $E_p$ denotes potential evapotranspiration), cannot exert first-order control on the water balance at small temporal scales because it excludes the impact of soil moisture change within the scale the model focuses on [6][7][8]. By including the soil moisture storage term ($S$), the extended models can be applied to seasonal, and even monthly, hydrologic simulation and prediction.
As stated above, the incorporation of a new variable increases the system's degrees of freedom, which should be compensated by introducing a new independent governing equation. In the Budyko curve, the water supply $P$ (precipitation) is partitioned into actual evapotranspiration ($E$) and runoff ($Q$), with the evapotranspiration demand being the potential evapotranspiration $E_p$. Accordingly, the adjusted models make a multistep partition of precipitation given the competition for water between catchment replenishment demand and evapotranspiration demand. Table 1 lists the analysis of some widely accepted water balance models following this cognitive framework.
Due to the same constraints of extreme zero-order and first-order boundary conditions, where water supply far surpasses or falls behind demand, the curves of these models take on similar shapes and achieve similarly satisfactory performance in monthly scale simulation. Each model requires 2 (in TPWB) to 4 (in ABCD or DWBM) parameters to adjust the concavity and position of the curve to fit observations. The statistical characteristics of state variables and parameters differ as the modelling scale changes [9]. Given the wide temporal scale gap between event-scale hydrologic processes and the annual mean water balance, it is worthwhile to examine more closely the data-revealed hydrologic pattern and model performance during scale transition.
Variables in large temporal scale hydrologic models are aggregations of the corresponding variables in small scale models. The goal of these models is to find the control that the aggregated variables exert on the total water balance of the period, which is determined by the within-scale temporal distribution of hydrologic events and the catchment's storage capacity [10]. For instance, given the same average water and energy supply, catchments with uniformly distributed rainfall and large storage capacity tend to generate less direct runoff and more evapotranspiration. These determinants are simplified as the state variable $S$ in the water balance models. The motivation of this paper was to quantify the data-revealed (potential) and model-revealed (achieved) control of the mean values of hydrometeorological variables on the catchments' average water balance over different temporal scales. The estimations were implemented within an information theoretical framework named Aleatory Epistemic Uncertainty Estimation (AEUE) [11], chosen for its mathematical elegance in assessing such insufficient information control problems.
The rest of the paper is structured as follows: in Section 2, the definitions and properties of AEUE are briefly introduced before clarifying its logical extensions and technical adaptations in this work. Section 3 gives the data description. The results and their interpretations are in Sections 4 and 5. The last section draws conclusions and recommends directions for future work.

Aleatory and Epistemic Uncertainty in Hydrologic Simulation.
It is intuitively believed that infrequent samples of a random variable bring more surprise, or information. The mathematical expression of this common sense is that the information provided by an observation should be a decreasing function of its probability. If we further require that information be additive between independent events, the information content attributed to a sample with probability $p$ should take the form $-\log p$. The average information content of a random variable $X$ is

$$H(X) = -\sum_{x} p(x) \log p(x), \qquad h(X) = -\int f(x) \log f(x)\, dx. \quad (1)$$

$H(X)$ and $h(X)$ are defined as the discrete and continuous Shannon entropy [12], both measured in bits for logarithm base 2. The two terms are connected by the following limit relation [13]:

$$H(X^{\Delta}) + \log \Delta \to h(X) \quad \text{as } \Delta \to 0, \quad (2)$$

where $X^{\Delta}$ denotes $X$ quantized into bins of width $\Delta$. As is shown in Figure 1, the uncertainty of one variable is reduced by the knowledge of another. In the literature of information theory, this relation is denoted as

$$H(Y \mid X) = H(Y) - I(X; Y)$$

or

$$h(Y \mid X) = h(Y) - I(X; Y).$$

Each term in the equations above is named in correspondence with that in Bayes' Theorem. Explicitly, $H(Y \mid X)$ and $h(Y \mid X)$ are called conditional entropy and represent the posterior uncertainty of $Y$ given the knowledge of $X$. $H(Y)$ and $h(Y)$ represent the prior uncertainty. $I(X; Y)$ is called mutual information. It represents the information that one variable contributes to the other. The application of Bayes' Theorem in hydrologic simulation assessment [14][15][16] endows each of the information terms with its hydrologic significance. Conceive $Y$ as the hydrologic variable to be simulated and $X$ as the input variable observations, both taken as continuous random variables; the conditional entropy $h(Y \mid X)$ then quantifies the residual uncertainty of the hydrologic system given the inaccurate and insufficient observation system. This uncertainty is named Aleatory Uncertainty (AU) [11]:

$$\mathrm{AU} = h(Y_o \mid X_o) = h(Y_o) - I(Y_o; X_o). \quad (7)$$

Here $Y_o$ and $X_o$ represent the observed output and input random variables of hydrologic models. It should be noted that $X_o$ is usually high dimensional, since information comes from different sources (both hydrometeorological and underlying surface observations) and from the lagging effects of former hydrologic behaviours. Models try to distill as much information as possible from $X_o$ to construct their simulations $Y_s$. Given that $Y_s$ is a function of $X_o$, the Data Processing Inequality [13] confirms that the maximum information models could distill (represented as $I(Y_o; Y_s)$) is not larger than that provided by the original data (represented as $I(Y_o; X_o)$). A detailed proof is given in the Appendix. The information loss due to imperfect input data processing is defined as Epistemic Uncertainty (EU) [11]:

$$\mathrm{EU} = I(Y_o; X_o) - I(Y_o; Y_s). \quad (8)$$

Equations (7) and (8) construct the Aleatory Epistemic Uncertainty Estimation (AEUE) framework. The sum of AU and EU is the posterior uncertainty of the hydrologic system given the simulation system, $h(Y_o \mid Y_s)$.
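The definitions of AU and EU can be illustrated on a toy discrete example. The following is a minimal sketch, assuming plug-in (histogram) entropies on small categorical samples rather than the continuous estimator used in the paper; the data and the function names are hypothetical and serve only to show that EU is non-negative when the simulation is a function of the input and that AU + EU equals the posterior uncertainty given the simulation.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in discrete Shannon entropy (bits) of a list of hashable samples."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())


def mutual_information(a, b):
    """Plug-in I(A;B) = H(A) + H(B) - H(A,B) in bits."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def aleatory_uncertainty(y_obs, x_obs):
    """AU = H(Y_o) - I(Y_o; X_o), the uncertainty left after seeing the input."""
    return entropy(y_obs) - mutual_information(y_obs, x_obs)


def epistemic_uncertainty(y_obs, x_obs, y_sim):
    """EU = I(Y_o; X_o) - I(Y_o; Y_s), the information the model fails to use."""
    return mutual_information(y_obs, x_obs) - mutual_information(y_obs, y_sim)


# Hypothetical data: four input classes, each observed four times.
x_obs = [x for x in range(4) for _ in range(4)]
# The observed output follows the input three times out of four.
y_obs = [x if rep < 3 else (x + 1) % 4 for x in range(4) for rep in range(4)]
# A deliberately coarse "model" that only uses half of the input resolution.
y_sim = [x // 2 for x in x_obs]

au = aleatory_uncertainty(y_obs, x_obs)
eu = epistemic_uncertainty(y_obs, x_obs, y_sim)
posterior_given_simulation = entropy(y_obs) - mutual_information(y_obs, y_sim)
```

In this toy case the coarse model discards part of the input information, so `eu` is positive, while `au` reflects the one-in-four mismatch between input and output that no model could remove.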

Extending AEUE for Temporal Scale Information Analysis.
To implement AEUE across temporal scales, daily hydrometeorological observations were aggregated to different temporal scales. The aggregated data were used to estimate each term in (7) and (8). To achieve a more explicit information analysis, we adopted the strategy of gradually expanding the types of input variables and the number of lagged steps to trace the decreasing trajectory of AU. The decrease is attributed to the information contribution of the newly included inputs. The EU of two typical water balance models was then estimated. The Two-Parameter Water Balance (TPWB) Model [17] was preferred for its simplicity and satisfactory performance at the monthly scale. Its structure is listed in Table 1. The other model was the Budyko Model, which combines the Budyko curve [18] with the mass conservation equation. The most significant distinction between the two models is that TPWB adopts an iterative structure. The performance of an iterative model depends on its state variable's capacity to distill information about the system's lagging effects and on its constitutive functions' capacity to utilize the distilled information. These two factors were discerned by distinguishing $I(Q; P, P_{\mathrm{former}}, E_p, E_{p,\mathrm{former}})$, $I(Q; P, E_p, S)$, and $I(Q; Q_s)$, where $Q$ is runoff, $P$ precipitation, $E_p$ potential evapotranspiration, $S$ the model's state variable, and $Q_s$ the simulated runoff. The difference between the first two terms indicates the state variable's representativeness, and the difference between the last two terms indicates the constitutive functions' data processing efficiency.
Given the analysis above, the explicit AEUE framework estimates the terms as listed in Table 2.
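The aggregation and lag-expansion steps described above can be sketched as follows. This is only an illustration under stated assumptions: the windows are taken as non-overlapping, the aggregate is a sum (a mean would differ only by a constant factor), an incomplete final window is dropped, and the function names are hypothetical.

```python
def aggregate(daily, window_days):
    """Sum a daily series over consecutive non-overlapping windows.

    Assumption: an incomplete window at the end of the record is discarded.
    """
    n_full = len(daily) // window_days
    return [
        sum(daily[i * window_days:(i + 1) * window_days])
        for i in range(n_full)
    ]


def lagged_inputs(series, n_steps):
    """Build input rows [x_t, x_(t-1), ..., x_(t-n_steps+1)].

    n_steps = 1 uses only the current step; larger values add lagged steps.
    """
    return [
        [series[t - lag] for lag in range(n_steps)]
        for t in range(n_steps - 1, len(series))
    ]


# Hypothetical ten-day record aggregated to 3-day totals, then paired with
# one lagged step so that each row holds the current and previous total.
totals = aggregate([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3)
rows = lagged_inputs(totals, 2)
```

Each row of `rows` would then form one sample of the high dimensional input whose mutual information with the aggregated runoff is estimated.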

High Dimensional Mutual Information Estimator.
The major obstacle AEUE faces is the estimation of the high dimensional mutual information terms in Table 2. Samples of finite length cannot support accurate estimation of a probability distribution in high dimensional spaces. This phenomenon, known as the curse of dimensionality [19], has hindered the definition-based estimation of mutual information. Given this, some "nonplugin" methods have been developed. The basic idea is either to relate carefully designed sample statistics to the entropy [20] or to perform a transform for step-by-step entropy estimation [21]. Mutual information is estimated afterwards with the following equation:

$$I(X; Y) = H(X) + H(Y) - H(X, Y). \quad (9)$$

Because the entropy terms are estimated separately, errors may accumulate in these algorithms.
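The estimator we adopted here belongs to the first category but estimates mutual information directly. The formula is based on the concept of $k$ nearest neighbour distances [22]:

$$I(X; Y) = \psi(k) - \left\langle \psi(n_x + 1) + \psi(n_y + 1) \right\rangle + \psi(N). \quad (10)$$

Here $\psi(x)$ is the digamma function, $\psi(x) = \Gamma(x)^{-1}\, d\Gamma(x)/dx$; $k$ is the order of the nearest neighbour; $N$ is the sample size; $\langle \cdot \rangle$ denotes the average over all sample points; and $n_x(i)$ and $n_y(i)$ are the numbers of samples that fall within the crisscross strips, in the $X$ and $Y$ directions, defined by the $k$th nearest neighbour distance around sample point $i$. We set $k = 4$ to balance the statistical error and the systematic error, according to Kraskov's suggestion [22].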
An intuitive explanation of (10) is that it estimates mutual information from statistics that describe the average sample density in the window opened around each sample point. Numerical experiments showed that even samples of fewer than 30 points produce satisfactory results. For a strict proof, please refer to Kraskov et al. [22].
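A nearest-neighbour estimator of this kind can be sketched in a few lines. The version below is a minimal illustration, assuming the first estimator of Kraskov et al. with a plain maximum-norm distance and a brute-force neighbour search; it does not use the kernel-based distance of this paper, and the function names and the synthetic Gaussian test data are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def max_norm(a, b):
    """Maximum-norm distance between two points given as tuples."""
    return max(abs(p - q) for p, q in zip(a, b))


def ksg_mutual_information(xs, ys, k=4):
    """Estimate I(X;Y) in bits; xs and ys are equal-length lists of tuples."""
    n = len(xs)
    mean_psi = 0.0
    for i in range(n):
        dx = [max_norm(xs[i], xs[j]) for j in range(n)]
        dy = [max_norm(ys[i], ys[j]) for j in range(n)]
        # Distance to the k-th nearest neighbour in the joint space.
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Neighbours strictly closer than eps in each marginal space.
        n_x = sum(1 for j in range(n) if j != i and dx[j] < eps)
        n_y = sum(1 for j in range(n) if j != i and dy[j] < eps)
        mean_psi += (digamma_int(n_x + 1) + digamma_int(n_y + 1)) / n
    nats = digamma_int(k) + digamma_int(n) - mean_psi
    return nats / math.log(2)


# Synthetic check: a strongly dependent pair versus an independent pair.
random.seed(0)
x = [(random.gauss(0, 1),) for _ in range(200)]
y_dependent = [(v[0] + 0.1 * random.gauss(0, 1),) for v in x]
y_independent = [(random.gauss(0, 1),) for _ in range(200)]
mi_dependent = ksg_mutual_information(x, y_dependent)
mi_independent = ksg_mutual_information(x, y_independent)
```

Because every digamma argument in the formula is a positive integer, the harmonic-sum form is sufficient here; the brute-force search is quadratic in the sample size and is meant for clarity rather than speed.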
A distance function between samples should be predefined to determine $n_x$ and $n_y$ in (10). Since different dimensions hold specific hydrologic meanings and are not symmetric, the distance cannot be efficiently depicted with the Euclidean norm. The kernel trick was adopted to implicitly measure sample distances in their feature space. Kernels were chosen by optimizing the performance of the corresponding support vector regression (SVR) [23] on the test set. Satisfactory kernel SVR performance suggests a well-balanced compromise between minimizing variance and bias in the proper feature space. Results have shown that kernel SVR is effective in performing high dimensional regression of hydrologic variables [24][25][26][27][28]. The distance between two input variable samples $x_1$ and $x_2$ was defined through $f(\cdot)$, the support vector regression function that fits the input to the output variable. $n_x$ and $n_y$ were determined after the distances between samples had been calculated. In practice, the support vector regression was implemented using the LIBSVM package [29]. The radial basis function kernel was adopted for its satisfactory performance. The data were first normalized to [−1, 1] to balance the impact of the different dimensions. Results were sensitive to the penalty parameter $C$ and the kernel parameter $\gamma$, both of which were automatically calibrated with the particle swarm optimization algorithm [30].
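The normalization step mentioned above can be illustrated with a short sketch. It assumes a linear min-max scaling applied separately to each input dimension, which is one common way to reach the range [−1, 1]; the text does not state which scaling was used, and the function name is hypothetical.

```python
def scale_to_symmetric_range(values):
    """Linearly map a list of numbers onto [-1, 1] (min -> -1, max -> 1).

    Assumption: a constant series carries no contrast and is mapped to 0.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical precipitation totals for one input dimension.
scaled = scale_to_symmetric_range([0.0, 5.0, 10.0])
```

Scaling each dimension separately keeps a variable with large numerical values, such as precipitation totals, from dominating the kernel distance.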

Data
Twenty-four catchments with daily hydrologic records (including precipitation $P$, potential evapotranspiration $E_p$, and runoff $Q$) from the MOPEX data set [31] were selected to implement the cross temporal scale information analysis. Given their temporal water-energy distribution patterns, the selected catchments are classified into 4 groups: weak seasonality with synchronous rainfall energy distribution (WS), weak seasonality with asynchronous rainfall energy distribution (WA), strong seasonality with synchronous rainfall energy distribution (SS), and strong seasonality with asynchronous rainfall energy distribution (SA). The classification was based on the amplitude and phase of the average daily rainfall fitted with a sine curve. If the amplitude was less than 0.45, the catchment was taken as weak seasonality. If the phase of rainfall was inverse to that of potential evapotranspiration, the catchment was taken as the asynchronous rainfall energy type. The general conditions of the catchments are listed in Table 3. The vegetation, soil type, land use, and other specific catchment information are available from the following link: ftp://hydrology.nws.noaa.gov/pub/gcip/mopex/USData/.
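The classification rule can be sketched as below. This is an illustration under several assumptions that the text does not spell out: the amplitude is taken relative to the mean of the fitted series, the fit is the first annual harmonic obtained by least squares over one full year of mean daily values, and "inverse phase" is read as a phase difference of more than 90 degrees. The function names and the synthetic series are hypothetical.

```python
import math


def first_harmonic(daily_mean):
    """Fit mean + R*sin(2*pi*t/n + phase) and return (R/mean, phase).

    For equally spaced values covering one full period, projecting onto
    sine and cosine gives the least-squares fit of the first harmonic.
    """
    n = len(daily_mean)
    mean = sum(daily_mean) / n
    cos_coef = 2.0 / n * sum(
        v * math.cos(2 * math.pi * t / n) for t, v in enumerate(daily_mean))
    sin_coef = 2.0 / n * sum(
        v * math.sin(2 * math.pi * t / n) for t, v in enumerate(daily_mean))
    amplitude = math.hypot(cos_coef, sin_coef) / mean
    phase = math.atan2(cos_coef, sin_coef)
    return amplitude, phase


def classify_seasonality(rain, pet, amplitude_threshold=0.45):
    """Return 'WS', 'WA', 'SS', or 'SA' for mean daily rainfall and PET."""
    amplitude, rain_phase = first_harmonic(rain)
    _, pet_phase = first_harmonic(pet)
    # Wrap the phase difference into [0, pi].
    diff = abs((rain_phase - pet_phase + math.pi) % (2 * math.pi) - math.pi)
    strength = "W" if amplitude < amplitude_threshold else "S"
    timing = "A" if diff > math.pi / 2 else "S"
    return strength + timing


# Synthetic examples with a 365-day climatology.
w = 2 * math.pi / 365
pet = [2.0 + math.sin(w * t) for t in range(365)]
rain_in_phase = [1.0 + 0.6 * math.sin(w * t) for t in range(365)]
rain_out_of_phase = [1.0 + 0.2 * math.sin(w * t + math.pi) for t in range(365)]
label_in_phase = classify_seasonality(rain_in_phase, pet)
label_out_of_phase = classify_seasonality(rain_out_of_phase, pet)
```

With these synthetic inputs, the first series has a large in-phase rainfall cycle and the second a small out-of-phase one, so they fall into the strong synchronous and weak asynchronous groups, respectively.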

Aleatory and Epistemic Uncertainty across Temporal Scales
Aleatory Uncertainty.
Considering the implication of (2), comparing the Aleatory Uncertainty across temporal scales requires a scheme that presets the resolution for each temporal scale. Since the acceptable deviation of small temporal scale hydrologic models should be stricter than that of larger scale models, we required a constant relative resolution across temporal scales. Explicitly, the width of the bin into which the objective variable is clustered in the p.d.f. curve is proportional to its mean value. It was further assumed that the mean value is proportional to the temporal scale. Thus, the quantization correction term in (2) varies linearly with the logarithm of the temporal scale. For two scales $T_1$ and $T_2$ into which the daily runoff observations were aggregated, with bin widths $\Delta_{T_1}$ and $\Delta_{T_2}$, the entropy difference for depicting them with these resolutions is

$$\log \Delta_{T_1} - \log \Delta_{T_2} = \log \frac{T_1}{T_2}.$$

Given this baseline, the estimated Aleatory Uncertainty is shown in Figure 2.
In each subgraph, the abscissa represents the input steps; for example, the number $n$ denotes that the input observations of the current step and of the $(n-1)$ lagged steps were used to decrease the uncertainty of the runoff estimation. The ordinate represents the temporal scale of the estimation, which varied from 10 days to a year.
The general pattern is that AU decreases as the temporal scale expands or as more lagged input observations are included. The detailed analysis discerning each term's information contribution for different catchments is discussed in the next section.
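The relative-resolution baseline used for comparing AU across scales amounts to a logarithmic offset between any two aggregation scales. A minimal sketch, assuming the bin width is exactly proportional to the aggregation length in days and that entropies are measured in bits (the function name is hypothetical):

```python
import math


def resolution_offset_bits(scale_a_days, scale_b_days):
    """Difference of the quantization terms, log2(width_a) - log2(width_b).

    Assumption: bin width is proportional to the aggregation length, so the
    unknown proportionality constant cancels in the difference.
    """
    return math.log2(scale_a_days / scale_b_days)


# Doubling the aggregation length shifts the baseline by one bit.
offset_20_vs_10 = resolution_offset_bits(20, 10)
# Offset between the annual scale and the 10-day scale.
offset_year_vs_10 = resolution_offset_bits(365, 10)
```

Only differences between scales are meaningful here, since the absolute bin width depends on a proportionality constant that the baseline leaves unspecified.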

Epistemic Uncertainty.
The estimated Epistemic Uncertainty across temporal scales is shown in Figure 3.
For the TPWB model, the maximum EU appears at temporal scales from 2 months to half a year. This shows that, at the seasonal temporal scale, the model cannot effectively distill the information provided by the data.
The EU difference between TPWB and the Budyko Model was related to the catchment's seasonality. In 11 out of 14 asynchronous seasonality catchments, EU differed significantly at small temporal scales. The difference diminished as the scale expanded. In the remaining 3 asynchronous catchments and 14 synchronous catchments, the difference stayed relatively constant across temporal scales.

Information Contribution of Included Input Terms.
Including new information sources can decrease the simulation uncertainty. The specific information contribution of including the energy provision (potential evapotranspiration $E_p$) and the observed previous runoff was obtained by subtracting the right column graphs from the left column ones (Figure 4). For instance, $\mathrm{AU}(Q; P) - \mathrm{AU}(Q; P, E_p)$, with $Q$ the runoff and $P$ the precipitation, denotes the information contribution of considering $E_p$ in the simulation.
For all 10 weak seasonality catchments and 5 out of 14 strong seasonality catchments, the information contribution of the energy provision was more significant at temporal scales of less than half a year. It was distributed more uniformly across temporal scales in the remaining 9 strong seasonality catchments.
The prominent information contribution of previous runoff at small scales in some catchments was attributed to the influence of runoff convergence.
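The subtraction of AU terms used in this analysis can be written out explicitly. The following short derivation is a sketch, assuming that the AU of runoff $Q$ for a given input set is the conditional entropy of $Q$ given that set, and writing $P$ for the water supply and $E_p$ for the energy supply (symbols chosen here for illustration):

```latex
\begin{aligned}
\mathrm{AU}(Q; P) - \mathrm{AU}(Q; P, E_p)
  &= h(Q \mid P) - h(Q \mid P, E_p) \\
  &= \bigl[ h(Q) - I(Q; P) \bigr] - \bigl[ h(Q) - I(Q; P, E_p) \bigr] \\
  &= I(Q; P, E_p) - I(Q; P) \\
  &= I(Q; E_p \mid P) \;\ge\; 0 .
\end{aligned}
```

The prior entropy $h(Q)$ cancels, so the contribution is the conditional mutual information between runoff and the added input and does not depend on the resolution preset for each scale. It is non-negative in theory, although finite-sample estimates can come out slightly negative.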

Information Contribution of Soil Moisture Memory.
Previous hydrologic behaviour influences the current hydrologic response because of the storage capacity of soil moisture. Here this influence is defined as soil moisture memory and is represented by the difference between splines in each subgraph of Figure 2.
The second dissection scheme checks the information contribution of including lagged inputs in the mutual information estimation. This is implemented by taking differences between the mutual information estimated with different numbers of input steps; for instance, the $i$th spline in each graph of Figure 5 equals the difference between the $(i+1)$th spline and the $i$th spline in the corresponding graph of Figure 6.
It can be seen that the first lagged step's input variables provide most of the information contribution across all the temporal scales estimated here. As shown in the first column of Figure 5, these lagging effects were not significant when only the water provision was considered. The consideration of energy provision is of key importance in estimating the soil moisture memory length.
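The differencing over input steps described above can be sketched as a simple successive difference. The function name and the numbers are hypothetical and only illustrate the bookkeeping, assuming the mutual information values are ordered by the number of input steps used:

```python
def lag_contributions(mi_by_steps):
    """Information gained by each additional lagged step.

    mi_by_steps[i] is the mutual information (bits) between runoff and the
    inputs when i + 1 steps (the current step plus i lagged steps) are used.
    """
    return [later - earlier
            for earlier, later in zip(mi_by_steps, mi_by_steps[1:])]


# Hypothetical estimates with 1, 2, and 3 input steps at one temporal scale.
gains = lag_contributions([1.0, 1.5, 1.75])
```

In this made-up case the first lagged step adds 0.5 bits and the second 0.25 bits, the kind of decay that would indicate a short soil moisture memory.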

Dissection of Model's Information Distilling Capacity.
As stated above, the simulation capacity of models with an iterative structure depends on their capacity to distill information about lagging effects and to process such information. The information distilling and processing capacities are discerned in the following graphs. The ordinate of each graph denotes mutual information. $I(Q; X_o, S)$ represents the mutual information between the runoff $Q$ and the current input $X_o$ together with the current state variable $S$. It denotes the model's capacity to distill lagging effects from previous hydrologic behaviours. The difference between $I(Q; X_o, S)$ and $I(Q; Q_s)$, where $Q_s$ is the simulated runoff, denotes the model's capacity to process the information it distilled.
It can be seen that, in synchronous climate catchments, the information distilled by TPWB and the Budyko Model increases as the temporal scale expands, while, in asynchronous climate catchments, the information distilling capacity of TPWB does not change monotonically with temporal scale.
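The dissection into a distilling part and a processing part can be sketched as two differences of mutual information terms. The class, the function, and the numbers below are hypothetical; the sketch assumes that the three mutual information values have already been estimated in bits for one catchment and one temporal scale.

```python
from dataclasses import dataclass


@dataclass
class Dissection:
    """Information losses (bits) of an iterative water balance model."""
    state_loss: float       # information in lagged inputs missed by the state
    processing_loss: float  # distilled information not used by the simulation


def dissect(mi_lagged_inputs, mi_inputs_and_state, mi_simulation):
    """Split the gap between data and simulation into two losses.

    mi_lagged_inputs:    I(runoff; current and former inputs)
    mi_inputs_and_state: I(runoff; current inputs, state variable)
    mi_simulation:       I(runoff; simulated runoff)
    """
    return Dissection(
        state_loss=mi_lagged_inputs - mi_inputs_and_state,
        processing_loss=mi_inputs_and_state - mi_simulation,
    )


# Hypothetical estimates for one catchment at one temporal scale.
result = dissect(2.0, 1.5, 1.0)
total_loss = result.state_loss + result.processing_loss
```

The two losses add up to the gap between the information in the lagged observations and the information in the simulation, so a large first term points to the state variable and a large second term to the constitutive functions.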

Discussion and Conclusion
The aggregation of event-scale hydrologic processes gives rise to the large temporal scale water-energy correlation pattern. The temporal scale transition was examined in the extended Aleatory Epistemic Uncertainty Estimation framework.
The Aleatory Uncertainty quantified the uncertainty caused by inaccurate and insufficient observation. For a large temporal scale, since the daily observations were aggregated, the law of large numbers guarantees that the relative accumulated error tends to 0 when there is no systematic observation bias. Thus, AU was mainly attributed to the insufficiency of data. The aggregated variables could exert certain control on the total water balance. The significance of this control was closely related to the seasonality type, as quantified in the previous section.
The Epistemic Uncertainty of a monthly and a mean annual water balance model was estimated. The performance of TPWB was evaluated by quantifying its information distilling capacity and data processing efficiency. Results showed that the information distilled by the models applied here did not correspond to the information provided by the input observations at temporal scales from two months to half a year. This calls for a better understanding of seasonal hydrologic mechanisms. The difference in information distilling capacity between TPWB and the Budyko Model was related to the within-year distribution of water and energy. In asynchronous catchments, the difference converged to 0 at the half-year scale, which suggests a closed hydrologic cycle.
The evaluations also revealed some counterintuitive phenomena that need to be stressed and explained. The meaning of soil storage capacity from a large temporal scale perspective is not as physically clear as it is at the event scale. The state variable is influenced by both the distribution of hydrologic processes and the soil properties. A strict definition is required to explain the uncertainty differences among catchments of different seasonality types.

Table 1: Structure analysis of water balance models. Symbols without explicit explanation are model parameters.

Table 2: Information terms to be estimated.
$AU_{SA} > AU_{WA} > AU_{WS} > AU_{SS}$.