Identifying and Evaluating Chaotic Behavior in Hydro-Meteorological Processes

The aim of this study is to identify and evaluate chaotic behavior in hydro-meteorological processes. The study poses two hypotheses about the source of that behavior. The first assumes that the input data are the significant factor imparting chaotic characteristics to the output data. The second assumes that the system itself is the significant factor imparting chaotic characteristics to the output data. To test these hypotheses, hydro-meteorological time series of precipitation, air temperature, discharge, and storage volume were collected in the Great Salt Lake and Bear River Basin, USA. The components with a period of approximately one year were extracted from the original series using the wavelet transform. Time series generated by summing sine functions were fitted to each series and used to investigate the hypotheses. Artificial neural networks were then built to model the reservoir system, and the correlation dimension was analyzed to evaluate the chaotic behavior of the inputs and outputs. The results indicate that the chaotic characteristic of the storage volume, which is the output, is likely a byproduct of the chaotic behavior of the reservoir system itself rather than that of the input data.


Introduction
Hydrologic phenomena arise as a result of interactions between climate inputs and landscape characteristics that occur over a wide range of space and time scales. Due to the tremendous heterogeneities in climatic inputs and landscape properties, such phenomena may be highly variable and "complex" at all scales [1]. The nonlinear behavior of hydrologic systems has been known for a long time [2,3]. The rainfall-runoff process is nonlinear almost regardless of the basin area, land use, rainfall intensity, and other influencing factors; these factors change in a highly nonlinear fashion, and so do the outputs, often in unknown ways [1].
To study the nonlinear characteristics of natural phenomena, many statisticians and scientists have turned to chaos theory, which analyzes and forecasts the nonlinear phenomena of natural systems. Lorenz [4] suggested the strange attractor in a simple model of convection rolls in the atmosphere. Packard et al. [5] suggested the method of delays, and Takens [6] proved the method of delays using differential topology. Grassberger and Procaccia [7] and Farmer et al. [8] demonstrated the characterization of chaos using the correlation dimension. Wolf et al. [9] calculated the largest Lyapunov exponent using Benettin's method. Fraser and Swinney [10] suggested a method for estimating the time delay using mutual information. Gilmore [11] introduced the topological method for chaos characterization, which is especially useful for small data sets. Farmer and Sidorowich [12] forecasted chaotic time series using local linear approximation. Casdagli [13] forecasted chaotic time series using radial basis functions, and Casdagli and Weigend [14] modeled and forecasted chaotic time series using the DVS (deterministic versus stochastic) algorithm. Kim et al. [15,16] suggested a new method for estimating delay parameters in chaos analysis. Falanga and Petrosino [17] estimated the complexity of a system by the degrees of freedom necessary to describe the asymptotic dynamics in a reconstructed phase space. The mechanism of stochastic resonance, which is a nonlinear phenomenon, has been applied in atmospheric physics since it was introduced by Benzi et al. [18,19] and Nicolis [20].

Advances in Meteorology
Many hydrologists have also analyzed hydrologic phenomena using nonlinear deterministic chaos to interpret the nonlinear characteristics of the hydrologic system. Rodriguez-Iturbe et al. [21] found chaotic characteristics in rainfall data recorded at a time interval of 15 seconds using the correlation dimension and the Lyapunov exponent. Wilcox et al. [22] tested the chaotic behavior of daily snowmelt runoff data using the correlation dimension. Sangoyomi et al. [23,24] searched for chaotic characteristics in the Great Salt Lake volume data recorded at a time interval of 15 days. Jeong and Rao [25] used 13 tree ring series to determine their chaos characteristics. Rodriguez-Iturbe et al. [26] investigated the nonlinear dynamics of soil moisture using a soil moisture balance equation. Kim et al. [27] searched for a strange attractor in wastewater flow using the C-C method. Ahn and Kim [28] showed with the BDS statistic that a nonlinear stochastic model is more valid for the analysis and modeling of the SOI time series than its linear stochastic analog. Kim et al. [29] assessed nonlinear deterministic characteristics in hydrologic time series such as rainfall, stream flow, and reservoir volume series. Sivakumar et al. [30] examined the utility of nonlinear dynamic concepts for the analysis of rainfall variability across Western Australia. Kim et al. [31] assessed the applicability of chaotic dynamics and filtering techniques to radar rainfall.
Even though Salas et al. [32] investigated how a low-dimensional chaotic hydrologic process (e.g., precipitation) is changed by transformations such as aggregation and sampling, most studies have analyzed a single hydrologic time series for its chaotic and nonlinear dynamic characteristics. Therefore, the aim of this study is to identify chaotic behavior in the components of hydro-meteorological processes, namely the air temperature, precipitation, discharge, and lake storage volume series. These components contribute to the hydro-meteorological system as inputs and outputs. The main question is: what is the significant factor that imparts chaotic characteristics to the output data? We pose the following two hypotheses. (1) Assume that the input data are the significant factor imparting chaotic characteristics to the output data. (2) Assume that the system itself is the significant factor imparting chaotic characteristics to the output data.
This paper is organized as follows. In Section 2, we give a brief overview of the methodology for estimating the correlation dimension, which can detect chaotic characteristics of a data series. In Section 3, we give a brief overview of the wavelet transform, used to extract the data of the representative period from the original time series, and of artificial neural networks (ANN), used to model the hydro-meteorological system. In Section 4, we apply these methods to identify chaotic behavior of the data series and discuss the results. Finally, in Section 5, we summarize the findings and conclusions.

Estimation of Correlation Dimension
2.1. Phase Space Reconstruction. Phase space is a useful tool for representing the evolution of a system in time. It is essentially a graph or a coordinate diagram, whose coordinates represent the variables necessary to completely describe the state of the system at any moment (in other words, the variables that enter the mathematical formulation of the system). The trajectories of the phase space diagram describe the evolution of the system from some initial state, which is assumed to be known, and, hence, represent the history of the system [5]. The "region of attraction" of these trajectories in the phase space provides at least important qualitative information on the "extent of complexity" of the system, which can subsequently be verified quantitatively using methods based on, for example, the concept of dimensionality.
For a dynamic system with known partial differential equations (PDEs), the system can be studied by discretizing the PDEs, and the set of variables at all grid points constitutes a phase space. One difficulty in constructing the phase space for such a system is that the (initial) values of many of the variables may not be known. However, a time series of a single variable of the system may be available, which may allow the attractor (a geometric object that characterizes the long-term behavior of a system in the phase space) to be reconstructed. The idea behind such a reconstruction is that a (nonlinear) system is characterized by self-interaction, so that a time series of a single variable can carry the information about the dynamics of the entire multivariable system. Many methods are available for phase space reconstruction from an available time series. Among these, the method of delays (e.g., [6]) is the most widely used one. According to this method, given a single-variable series $X_i$, where $i = 1, 2, \ldots, N$, a multidimensional phase space can be reconstructed as

$$Y_j = \left(X_j, X_{j+\tau}, X_{j+2\tau}, \ldots, X_{j+(m-1)\tau}\right), \tag{1}$$

where $j = 1, 2, \ldots, N - (m-1)\tau$; $m$ is the dimension of the vector $Y_j$, called the embedding dimension; and $\tau$ is an appropriate delay time (an integer multiple of the sampling time). A correct phase space reconstruction in a dimension $m$ generally allows interpretation of the system dynamics (if the variable chosen to represent the system is appropriate) in the form of an $m$-dimensional map $f_T$, given by

$$Y_{j+T} = f_T(Y_j), \tag{2}$$

where $Y_j$ and $Y_{j+T}$ are vectors of dimension $m$, describing the state of the system at times $j$ (current state) and $j + T$ (future state), respectively.
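The method of delays can be illustrated with a minimal sketch. The function name `delay_embed` and the toy series are illustrative assumptions, not part of the original study; indices are zero-based here, whereas the text counts from 1.

```python
def delay_embed(series, m, tau):
    """Return the delay vectors Y_j = (X_j, X_{j+tau}, ..., X_{j+(m-1)tau})."""
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors < 1:
        raise ValueError("series too short for this m and tau")
    return [
        tuple(series[j + k * tau] for k in range(m))
        for j in range(n_vectors)
    ]


# Toy series (hypothetical): embed in three dimensions with a delay of 2 samples.
x = [0, 1, 2, 3, 4, 5, 6, 7]
vectors = delay_embed(x, m=3, tau=2)
print(len(vectors))   # N - (m - 1) * tau = 8 - 4 = 4
print(vectors[0])     # (0, 2, 4)
```

Each returned tuple is one point of the reconstructed phase space, so plotting the tuples for m = 2 or m = 3 gives an attractor diagram of the kind discussed in the text.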

2.2. Correlation Integral and Correlation Dimension. The dimension of a time series is, in a way, a representation of the number of variables dominantly governing the underlying system dynamics. Correlation dimension is a measure of the extent to which the presence of a data point affects the position of the other points lying on the attractor in the phase space. The correlation dimension method uses the correlation integral (or function) for determining the dimension of the attractor and, hence, for distinguishing between low-dimensional chaos and a high-dimensional system. The concept of the correlation integral is that a time series arising from deterministic dynamics will have a limited number of degrees of freedom equal to the smallest number of first-order differential equations that capture the dominant features of the dynamics. Thus, when one constructs phase spaces of increasing dimension, a point will be reached where the dimension equals the number of degrees of freedom, beyond which increasing the phase space dimension will not have any significant effect on the correlation dimension. Many algorithms have been formulated for the estimation of the correlation dimension. Among these, the Grassberger-Procaccia algorithm [7] has been the most popular. The algorithm uses the concept of phase space reconstruction for representing the dynamics of the system from an available single-variable time series, as presented in (1). For an $m$-dimensional phase space, the correlation integral or function $C(r)$ is given by

$$C(r) = \lim_{N \to \infty} \frac{2}{N(N-1)} \sum_{\substack{i,j \\ 1 \le i < j \le N}} H\left(r - \lVert Y_i - Y_j \rVert\right), \tag{3}$$

where $H$ is the Heaviside step function, with $H(u) = 1$ for $u > 0$ and $H(u) = 0$ for $u \le 0$, where $u = r - \lVert Y_i - Y_j \rVert$, and $r$ is the vector norm (radius of sphere) centered on $Y_i$ or $Y_j$. If the time series is characterized by an attractor, then $C(r)$ and $r$ are related according to

$$C(r) \approx \alpha r^{\nu} \quad (r \to 0,\ N \to \infty), \tag{4}$$

where $\alpha$ is a constant and $\nu$ is the correlation exponent or the slope of the $\log C(r)$ versus $\log r$ plot. The slope is generally estimated by a least squares fit of a straight line over a certain range of $r$ (scaling regime) or through estimation of local slopes between $r$ values. The distinction between low-dimensional (perhaps deterministic) and high-dimensional (perhaps stochastic) systems can be made using the $\nu$ versus $m$ plot. If $\nu$ saturates after a certain $m$ and the saturation value is low, then the system is generally considered to exhibit low-dimensional and possibly deterministic dynamics. The saturation value of $\nu$ is defined as the correlation dimension ($D_2$) of the attractor, and the nearest integer above this value is generally an indication of the number of variables dominantly governing the dynamics. On the other hand, if $\nu$ increases without bound with increase in $m$, the system under investigation is generally considered to exhibit high-dimensional and possibly stochastic behavior.
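A finite-sample version of the Grassberger-Procaccia procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the synthetic sine series, the delay of 18 samples, and the radii used for the fit are all hypothetical choices. The slope is obtained by a least squares fit of log C(r) against log r.

```python
import bisect
import math


def delay_embed(series, m, tau):
    """Delay vectors Y_j = (X_j, X_{j+tau}, ..., X_{j+(m-1)tau})."""
    n_vectors = len(series) - (m - 1) * tau
    return [[series[j + k * tau] for k in range(m)] for j in range(n_vectors)]


def sorted_pair_distances(vectors):
    """Euclidean distances between all distinct pairs (i < j), sorted."""
    n = len(vectors)
    dists = [math.dist(vectors[i], vectors[j])
             for i in range(n - 1) for j in range(i + 1, n)]
    dists.sort()
    return dists


def correlation_integral(dists, r):
    """Finite-N estimate of C(r): fraction of pairs with distance below r."""
    return bisect.bisect_left(dists, r) / len(dists)


def correlation_exponent(dists, radii):
    """Least squares slope of log C(r) against log r over the given radii."""
    pts = []
    for r in radii:
        c = correlation_integral(dists, r)
        if c > 0:
            pts.append((math.log(r), math.log(c)))
    mean_x = sum(p[0] for p in pts) / len(pts)
    mean_y = sum(p[1] for p in pts) / len(pts)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pts)
    sxx = sum((px - mean_x) ** 2 for px, _ in pts)
    return sxy / sxx


# Synthetic periodic series (hypothetical): its attractor is a closed curve,
# so the correlation exponent should come out close to 1.
x = [math.sin(2.0 * math.pi * 7 * i / 500) for i in range(500)]
vectors = delay_embed(x, m=2, tau=18)          # tau near a quarter period
dists = sorted_pair_distances(vectors)
nu = correlation_exponent(dists, [0.1, 0.2, 0.4, 0.8])
print(round(nu, 2))
```

Repeating the last four lines for increasing m and plotting the resulting exponents against m gives the saturation plot described above; the choice of radii defines the scaling regime and strongly affects the estimate for short or noisy series.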

Wavelet Transform and Artificial Neural Networks
3.1. Wavelet Transform. According to Fourier theory, a signal can be expressed as the sum of a possibly infinite series of sines and cosines, referred to as a Fourier expansion [33]. However, a Fourier expansion has only frequency resolution and not time resolution; that is, no amplitude modulation of the signal at a given frequency is considered. Moving-window Fourier transforms have been used to address this issue, but this method is sensitive to the choice of window width. Alternatively, the wavelet transform [34,35] enables the identification of frequency components as well as their variation in time. The continuous wavelet transform of a discrete sequence $x_n$ is defined by the convolution of $x_n$ with a scaled and translated wavelet function $\psi$:

$$W_n(s) = \sum_{n'=0}^{N-1} x_{n'}\, \psi^{*}\left[\frac{(n'-n)\,\delta t}{s}\right], \tag{5}$$

where $(*)$ indicates the complex conjugate, $n$ is the localized time index, $s \neq 0$ is the scale parameter, $\delta t$ is the sampling interval, and $N$ is the number of points in the time series. In this study, we use the Morlet wavelet function defined as $\psi(\eta) = \pi^{-1/4} e^{i\omega_0 \eta} e^{-\eta^2/2}$, where $\omega_0$ is a frequency and $\eta$ is a nondimensional "time" parameter. By varying the wavelet scale $s$ and translating along the localized time index $n$, one can construct a picture that shows both the amplitude of any features versus the scale and how this amplitude varies with time. A vertical slice through a wavelet plot is a measure of the local spectrum. The time average over all the local wavelet spectra gives the global wavelet spectrum:

$$\overline{W}^{2}(s) = \frac{1}{N} \sum_{n=0}^{N-1} \left| W_n(s) \right|^{2}. \tag{6}$$

For a more detailed presentation of wavelet transform analysis, see Torrence and Compo [35].
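The wavelet power and its time average can be sketched by direct summation. The following is a slow but transparent illustration, not the code used in the study. Several choices are assumptions: the value 6 for the nondimensional frequency, the scale normalization by the square root of (sampling interval / scale) following the convention of Torrence and Compo [35], the conversion from scale to equivalent Fourier period for the Morlet wavelet, and the synthetic 12-month test series.

```python
import cmath
import math

OMEGA0 = 6.0  # assumed nondimensional frequency; the text does not give a value


def morlet(eta):
    """psi(eta) = pi^(-1/4) * exp(i * omega0 * eta) * exp(-eta^2 / 2)."""
    return (math.pi ** -0.25) * cmath.exp(1j * OMEGA0 * eta) * math.exp(-0.5 * eta * eta)


def wavelet_power(x, scales, dt=1.0):
    """Return power[k][n] = |W_n(s_k)|^2 by direct summation over the series."""
    n_pts = len(x)
    power = []
    for s in scales:
        norm = math.sqrt(dt / s)  # assumed normalization so scales are comparable
        row = []
        for n in range(n_pts):
            w = sum(x[k] * morlet((k - n) * dt / s).conjugate() for k in range(n_pts))
            row.append(abs(norm * w) ** 2)
        power.append(row)
    return power


def global_spectrum(power):
    """Time average of the local wavelet spectra at each scale."""
    return [sum(row) / len(row) for row in power]


def fourier_period(s):
    """Equivalent Fourier period of a Morlet wavelet at scale s."""
    return 4.0 * math.pi * s / (OMEGA0 + math.sqrt(2.0 + OMEGA0 ** 2))


# Synthetic monthly series (hypothetical) with a 12-month cycle over 10 years.
x = [math.sin(2.0 * math.pi * i / 12.0) for i in range(120)]
scales = [float(s) for s in range(4, 25)]
gws = global_spectrum(wavelet_power(x, scales))
peak_period = fourier_period(scales[gws.index(max(gws))])
print(round(peak_period, 2))  # expected to fall near 12 months
```

For long records an FFT-based convolution would be preferable; the direct sum is used here only because it mirrors the definition term by term.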

3.2. Artificial Neural Networks. An ANN is a model inspired by signal transmission between neurons, the nerve cells of the human brain. It is an empirical pattern search technique that can account for a nonlinear relationship between input variables and output variables. ANN is used in various areas because of its wide applicability [36,37]. This includes the field of climate science, where its applicability has been demonstrated [38,39].
Many studies recommend the ANN technique as a nonlinear model of data series, and systematic evaluations of various techniques have reported that ANN performs better than the alternatives [40,41]. Therefore, this study also applies ANN, which is judged to have superior applicability in simulating the nonlinear characteristics of the hydro-meteorological system.

Study Area and Data Series Used. The Bear River Basin, located in northeastern Utah, southeastern Idaho, and southwestern Wyoming, comprises 7,500 square miles of mountain and valley lands, including 2,700 in Idaho, 3,300 in Utah, and 1,500 in Wyoming. The Bear River crosses state boundaries five times and is the largest stream in the western hemisphere that does not empty into the ocean. The basin ranges in elevation from over 1,278 to 3,868 feet and is unique in that it is entirely enclosed by mountains, thus forming a huge basin with no external drainage outlets (http://www.greatsaltlakeinfo.org/Background/BearRiver). The Bear River is the largest tributary to the Great Salt Lake (see Figure 1).

Extraction of a Representative Time Series by Wavelet Transform. All hydrological measurements are to some extent contaminated by noise, and the noise limits the performance of many techniques of identification, modeling, prediction, and control of deterministic systems [42]. Independent component analysis (ICA) is a popular method that can extract periodic signals from noise or nonlinear mixtures [43,44]. It has been applied in the fields of meteorology [45], oceanography [46], volcanology [47,48], and remote sensing [49]. This study, however, uses the wavelet transform to extract the representative periodic components that affect the data series, because ICA often leads to a local minimum solution in which suitable source signals are not isolated [50]. Moreover, the order of the independent components (ICs) is more difficult to determine than with the wavelet transform.
The wavelet power spectrum, estimated with the Morlet function as the mother wavelet, is shown in Figure 3 (left), where the strength of the spectrum at each period of the time series can be identified. In this figure, the solid half-circle line marks the edge of the cone of influence (COI), the effect caused by the discontinuity at the beginning and end of the data series. The part above the solid line is statistically significant (95% confidence level), and the part below it is excluded from interpretation. High-density regions of the spectrum are observed at some periods within the significant part. The global wavelet power spectrum (GWP) in Formula (6), which represents the average value over the length of each period, provides more effective information about the spectrum. Figure 3 (center) shows the GWP. Considering that the part to the right of the solid line is statistically significant at the 95% confidence level, the periodic characteristics of the time series can be classified into one band. This band shows a strong spectrum at a period of approximately 1 year. The component of that period extracted from the wavelet spectrum is shown on the right of Figure 3.

Analysis of the Time Series Using Attractor and Correlation Dimension
4.3.1. Attractor Analysis. The attractor obtained by (1) can describe the characteristics of a time series. To obtain the attractor using (1), the delay time $\tau$ and the embedding dimension $m$ must be chosen appropriately. The autocorrelation function (ACF) is expected to provide a reasonable measure of the transition from redundancy to irrelevance as a function of delay. The decorrelation time, which is the lag (delay time $\tau$) at which the ACF first attains the value zero, is considered. Otherwise, $\tau$ should be chosen as the local minimum of the ACF, whichever occurs first [51,52]. When the ACF decays exponentially, we select the $\tau$ at which the ACF drops to zero [53], which is a lag time of 4 months in all of the series. The delay times of the systems are thus obtained from the ACFs, and the attractors are drawn in Figure 4 for each time series.
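The delay selection rule above (first zero of the ACF or first local minimum, whichever occurs first) can be sketched as follows. The function names and the synthetic 12-month series are illustrative assumptions, and the sample ACF estimator is one common choice among several.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def choose_delay(x, max_lag):
    """Smallest lag at which the ACF reaches zero or has a local minimum."""
    acf = [autocorrelation(x, k) for k in range(max_lag + 2)]
    for k in range(1, max_lag + 1):
        if acf[k] <= 0.0:
            return k                      # first zero crossing
        if acf[k] < acf[k - 1] and acf[k] < acf[k + 1]:
            return k                      # first local minimum
    return None                           # no suitable lag up to max_lag


# Synthetic monthly series (hypothetical) with a 12-month cycle over 20 years.
x = [math.sin(2.0 * math.pi * i / 12.0) for i in range(240)]
delay = choose_delay(x, max_lag=24)
print(delay)  # about a quarter of the 12-month cycle
```

For a nearly sinusoidal annual signal the ACF crosses zero around a quarter of the period, which is consistent in order of magnitude with the delay of a few months reported in the text.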
For the attractor analysis, this study uses the time series extracted by the wavelet transform. The attractors of the time series form a circle with a boundary. An attractor that lies within a very well defined boundary in the phase space suggests that the dynamics are simple and that the system is potentially low-dimensional. Every time series has a bounded shape that resembles a chaotic series. Air temperature in particular shows a very well defined boundary and is potentially a low-dimensional series. Precipitation, however, is relatively complex and irregular and is potentially a higher-dimensional system than the other data series. We therefore find that the time series have different chaos characteristics even though they are collected from the same hydro-meteorological system.

Correlation Dimension Analysis Using Synthetically Generated Series. Precipitation and air temperature from the meteorological system are considered input time series of the runoff system. By the same principle, the discharge at the basin outlet, which is the output of the runoff system driven by precipitation and air temperature, can be the input data of the reservoir system. The following methodology is proposed to test the two hypotheses. We compose input data sets with arbitrary correlation dimensions and build ANNs as a nonlinear model of the reservoir system. The modeling results for these input data sets serve as the criterion for the hypotheses. The first hypothesis is reasonable if the system responds sensitively to the arbitrary input data sets, whereas the second hypothesis is reasonable if the system does not respond sensitively to them.

Correlation Dimensions of Generated Input Series to the Reservoir System. The attractors of each time series (shown in Figure 4) have a limit cycle regime, which is characteristic of a periodic system. As a periodic function, each time series can be written as an infinite sum of sine and cosine terms. Fourier [54] was the first to realize this, so this infinite sum is called a Fourier series.
The input data consist of nine sets, three sums of sine functions for each of the three hydro-meteorological input series. The sine functions, which suit periodic time series data, are fitted with the fitting toolbox of MATLAB. Each time series is therefore represented by three cases, case (a), case (b), and case (c), as shown in Table 1. Case (a) is the sum of a few sine functions, case (c) is the sum of relatively many sine functions, and case (b) lies between case (a) and case (c). For precipitation and discharge, the fits are set to have at least three sine functions because, according to the correlation dimension analysis in Section 4.3.2, these series are dominantly governed by four variables. The fitted functions are found to be applicable over the 1116 months, with correlation coefficients (CC) of 0.54-0.65 for precipitation, 0.98-0.99 for air temperature, and 0.88-0.92 for discharge. The results of the correlation dimension analysis for each case are shown in Figure 6. The saturated correlation dimensions are (a) 2.54, (b) 3.26, and (c) 4.05 for precipitation; (a) 1.02, (b) 1.84, and (c) 2.52 for air temperature; and (a) 2.48, (b) 3.13, and (c) 3.80 for discharge. Case (c), which is composed of many sine functions, shows the highest correlation dimension, whereas case (a) shows the lowest correlation dimension in each time series.
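A generated series of this kind can be sketched as a sum of sine terms. The model form a·sin(b·t + c) per term is assumed here, and the coefficients below are hypothetical placeholders, not the fitted values of Table 1.

```python
import math


def sum_of_sines(t, terms):
    """Evaluate sum_k a_k * sin(b_k * t + c_k) for terms given as (a, b, c) tuples."""
    return sum(a * math.sin(b * t + c) for a, b, c in terms)


# Hypothetical coefficients, not the fitted values of Table 1.
annual = 2.0 * math.pi / 12.0     # angular frequency of a 12-month cycle
case_a = [(1.0, annual, 0.0)]     # few sine terms
case_c = case_a + [(0.4, 2.0 * annual, 0.5), (0.2, 3.0 * annual, 1.0)]  # more terms

months = range(1116)
series_a = [sum_of_sines(t, case_a) for t in months]
series_c = [sum_of_sines(t, case_c) for t in months]
print(len(series_a), len(series_c))
```

Series built this way can be passed to a correlation dimension estimate to check how the number of sine terms changes the saturated dimension, which is the comparison reported for cases (a) to (c).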

ANN Modeling and Correlation Dimension Analysis of Hydro-Meteorological System. To build the ANN model, this study sets precipitation, air temperature, and discharge as the input layer and storage volume as the output layer. As seen in Figure 7, a multilayered ANN model consisting of one input layer, two hidden layers, and one output layer has been built.
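The layer structure described above can be sketched as a forward pass through a small fully connected network. Only the three inputs, two hidden layers, and single output come from the text; the hidden layer width of 5, the tanh activation, the random untrained weights, and the scaled input values are all hypothetical, and no training step is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer, activation):
    """Apply one layer: activation(W . inputs + b) for each output unit."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """tanh on the hidden layers, linear activation on the output layer."""
    h = inputs
    for layer in layers[:-1]:
        h = dense(h, layer, math.tanh)
    return dense(h, layers[-1], lambda z: z)


rng = random.Random(0)
sizes = [3, 5, 5, 1]  # 3 inputs, two hidden layers (width 5 is hypothetical), 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

# Hypothetical scaled inputs: precipitation, air temperature, discharge.
storage = forward([0.2, 0.6, 0.4], layers)
print(len(storage))  # one output value: the storage volume estimate
```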
Monthly data series from 1903 to 1970 (800 months) are used for the learning period. The remaining 316 months are set aside as the verification period, and the applicability of the constructed ANN model is reviewed by comparing its output with the observed storage volume as the target data series (see Table 2). For the prediction data, we again compose three input data sets, cases (A), (B), and (C). Case (A) integrates case (a) of each input series, case (B) integrates case (b), and case (C) integrates case (c). First of all, according to the model verification measures (see Figure 8), namely the coefficient of correlation (CC, 0.986) and the root mean squared error (RMSE, 0.061), the ANN fits very well and shows good applicability.
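The two verification measures can be computed as in the following sketch. The Pearson form of the correlation coefficient is assumed, and the observed and simulated values are made-up numbers used only to exercise the functions.

```python
import math


def correlation_coefficient(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


# Hypothetical observed and simulated storage volumes (scaled units).
observed = [0.42, 0.47, 0.55, 0.61, 0.58, 0.50]
simulated = [0.40, 0.49, 0.54, 0.63, 0.57, 0.52]
cc = correlation_coefficient(observed, simulated)
err = rmse(observed, simulated)
print(round(cc, 3), round(err, 3))
```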
The storage volume series of the reservoir system is estimated using the ANN model with case (A), case (B), and case (C) as the input data. The correlation dimension analysis is then performed on the estimated storage volume in each case. The resulting correlation dimensions are 2.55 for case (A), which integrates the low-dimensional cases (a); 2.81 for case (B), which integrates the middle-dimensional cases (b); and 2.89 for case (C), which integrates the high-dimensional cases (c), as shown in Figure 9.

Summary and Discussions. In this study, we posed two hypotheses to identify chaotic behavior in hydro-meteorological processes. To test them, we composed the input data sets, cases (A), (B), and (C), and applied them to an ANN model of the reservoir system of the Great Salt Lake. The criterion for the hypotheses is the sensitivity of chaotic behavior in the system. In other words, the first hypothesis is reasonable if the chaotic behavior of the system is sensitive to the chaotic characteristics of the input data; otherwise, the second hypothesis is reasonable. The results of the correlation dimension analysis for every case analyzed in this study are summarized in Table 3.
As shown in Table 3, the correlation dimensions are 2.55 in case (A), obtained from integrating the low dimensions (precipitation 2.54, air temperature 1.02, and discharge 2.48); 2.81 in case (B), from integrating the middle dimensions (precipitation 3.26, air temperature 1.84, and discharge 3.31); and 2.89 in case (C), from integrating the highest dimensions (precipitation 4.05, air temperature 2.52, and discharge 3.80). The input data did not significantly affect the chaotic characteristics of the storage volume output, even though there was a small difference in dimension of 0.34 between case (A) and case (C). Therefore, the chaotic characteristic of the storage volume output in the Great Salt Lake is most likely a byproduct of the chaotic behavior of the reservoir system itself rather than that of the input data. However, this behavior will depend on the particular hydro-meteorological system. For example, small hydro-meteorological systems are likely to be very sensitive, so their chaotic characteristics are also likely to depend on the input data.
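The sensitivity argument can be checked with simple arithmetic on the reported values: the spread of the input correlation dimensions between the low and high cases is compared with the spread of the output correlation dimension. The dictionary layout and variable names below are illustrative.

```python
# Correlation dimensions reported in the text for the low (a) and high (c) cases.
input_dims = {
    "precipitation": (2.54, 4.05),
    "air temperature": (1.02, 2.52),
    "discharge": (2.48, 3.80),
}
output_dims = (2.55, 2.89)  # storage volume, case (A) and case (C)

input_spread = {name: round(high - low, 2) for name, (low, high) in input_dims.items()}
output_spread = round(output_dims[1] - output_dims[0], 2)

print(input_spread)   # {'precipitation': 1.51, 'air temperature': 1.5, 'discharge': 1.32}
print(output_spread)  # 0.34
```

Each input series changes by more than 1.3 in correlation dimension between the low and high cases, while the output changes by only 0.34, which is the basis for attributing the output's chaotic characteristic mainly to the system.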

Conclusions
This study set out to identify and evaluate chaotic behavior in hydro-meteorological processes. To address this issue, two hypotheses were posed. The first assumes that the input data are the significant factor imparting chaotic characteristics to the output data. The second assumes that the system itself is the significant factor imparting chaotic characteristics to the output data. Hydro-meteorological time series of precipitation, air temperature, discharge, and storage volume were collected in the Great Salt Lake and Bear River Basin, and the components with a period of approximately one year were extracted from the original time series using the wavelet transform. The results of the correlation dimension analysis showed precipitation 3.92, air temperature 1.41, discharge 3.02, and storage volume 2.65 in each time series. Input data sets were composed by summing sine functions, integrated into high-, middle-, and low-dimensional cases, and applied to the artificial neural networks that model the reservoir system. Finally, the correlation dimension was analyzed to evaluate the chaotic behavior of the storage volume, which is the final output of the hydro-meteorological system given the inputs of precipitation, air temperature, and discharge. The results showed that the chaotic characteristic of the storage volume is most likely a byproduct of the chaotic behavior of the reservoir system itself rather than that of the input data. We expect that the methodology and procedure suggested in this study will provide a clue to understanding chaotic behavior in hydro-meteorological processes.

Figure 5 shows the relationship between the correlation dimension, $D_2$, and the embedding dimension, $m$, from 1 to 15, for each time series. The correlation dimension seems to increase with the embedding dimension up to a certain point and saturate beyond that point. Such a saturation of the correlation dimension is an indication of the existence of deterministic dynamics. The saturation values of the correlation dimension for the series are 3.92, 1.41, 3.02, and 2.65 in Figures 5(a)-5(d). The low correlation dimensions suggest the presence of a low-dimensional chaotic nature in the underlying system dynamics. As the nearest integer above the correlation dimension value generally provides the number of dominant variables influencing the dynamics of the underlying system, the correlation dimensions indicate that the time series of precipitation, air temperature, discharge, and storage volume are dominantly governed by four, two, four, and three variables, respectively.

Figure 2: Monthly time series plots for the period of 1903-1995.

Figure 3: Extraction of the representative time series using the wavelet transform (left: the wavelet power spectrum, center: the global wavelet power spectrum, and right: the extracted time series about the period of approximately 1 year).

Figure 4: Attractors in each time series.

Figure 5: The estimated correlation dimension for each time series.

Figure 6: Correlation dimensions of generated time series.

discharge 3.02, and storage volume 2.65 in each time series.

Table 1: Fitting functions in each case of each time series.

Table 3, the correlation dimensions are 2.55 in case (A) obtained from integrating the low dimensions (precipitation 2.54, air temperature 1.02, and discharge 2.48) and 2.81 in case (B) from integrating the middle dimensions (precipitation 3.26, air temperature 1.84, and discharge 3.31)

Table 2: Input data of ANN.

Table 3: Summary of correlation dimension in each case and time series.