G-Filtering Nonstationary Time Series

The classical linear filter can successfully filter the components from a time series for which the frequency content does not change with time, and those nonstationary time series with time-varying frequency TVF components that do not overlap. However, for many types of nonstationary time series, the TVF components often overlap in time. In such a situation, the classical linear filtering method fails to extract components from the original process. In this paper, we introduce and theoretically develop the G-filter based on a time-deformation technique. Simulation examples and a real bat echolocation example illustrate that the G-filter can successfully filter a G-stationary process whose TVF components overlap with time.


Introduction and Background
In this paper we develop filters for G-stationary processes, which are nonstationary processes with time-varying frequency TVF behavior.We begin with a very brief discussion of linear filters since the filters we develop here are generalizations of the linear filter.The traditional linear filter is defined as Y t X * a t realization.In this discussion we will consider low-pass and high-pass filters which are designed to remove the high-frequency behavior and low frequencies, respectively.We will use the popular Butterworth filter see 2 .The squared frequency response function of an Nth order low-pass Butterworth filter is given by where f c is the cutoff frequency, and N is the order of the filter.The frequency response function for a high-pass Butterworth filter has a similar form.Traditional linear filters such as the Butterworth filter can successfully extract components from stationary processes where the frequency behavior of the data does not change with time.However, for many nonstationary time series with time varying frequency TVF behavior, the time-invariant assumption of traditional filters often results in the failure to extract TVF components from such processes.Example 1.1 examines the application of the Butterworth filter on data containing timevarying frequencies.
Example 1.1 linear filter applied to TVF data .Figure 1 a shows a realization of length n 400 from the model X t cos 36π ln t 175 ψ 1 .5 cos 85π ln t 175 ψ 2 , 1.2 where ψ 1 1 and ψ 2 1.59.The TVF behavior is clear, both in the data in Figure 1 a and the two components that are shown in Figures 1 b and 1 c .However, it should be noted that the frequency content at the beginning of the "low frequency" component Figure 1 b is about the same as that toward the end of the "high frequency" component seen in Figure 1 c .For this reason, a Butterworth-type filter will not be able to completely separate the components.
In Figures 1 d and 1 e we show the results of 3rd order low-pass Butterworth filters applied to the data in Figure 1 a with f c .15 and .06,respectively.In Figure 1 d the cutoff frequency, f c .15, does a good job of extracting the low-frequency behavior for time t ≤ 100, but toward the end of the signal, both components pass through the filter.This occurs since the frequency behavior which decreases with time in the "high frequency" component has decreased to the point that it passes through the filter with f c .15. Figure 1 e shows the effects of lowering the cut off frequency to f c .06.For t ≥ 250 the filter does a reasonable job of extracting only the low-frequency component.However, the effect of lowering the cut-off frequency is that neither the high-nor low-frequency componentpasses through the filter toward the beginning of the data.In fact, because of the overlap in frequency content between the first part of the signal in Figure 1 b and the latter part of the signal in Figure 1 c , no cut-off frequency will be able to separate these two components using a filter such as the Butterworth.
The data set in Example 1.1 is a realization from an M-stationary process which is a special case of the G-stationary processes that have been introduced to generalize the concept of stationarity 3 .As illustrated above, current time-invariant linear filtering would have to be adjusted in order to filter the data from a G-stationary processes.Baraniuk and Jones 4 used the theory of unitary equivalence systems for designing generic classes of signal analysis and processing systems based on alternate coordinate systems.Applying the principle of unitary equivalence systems, the filtering procedure is to 1 preprocess the data, 2 apply a traditional time-invariant filtering method, and 3 convert the results to the original time scale.To solve the filtering problem for G-stationary processes, in this paper we define the G-filter and illustrate its application on signals that originated from G-stationary processes.
The paper is organized as follows.In Section 2 we discuss G-stationary models with focus on strategies for fitting G λ -stationary models of Jiang, et al. 3 .In Section 3 we present results concerning G-filters designed for filtering data from G-stationary models, and in Section 4 we introduce a straight-forward implementation of the G-filter for extracting components such as those shown in Figures 1 b and 1 c .

Time Deformation Method and the G-Stationary Process
A stationary process is a stochastic process in which the distribution does not change with time.A large volume of statistical theory and methods have been developed for stationary time series.However, many processes are nonstationary where the distribution changes with time.Several techniques have been proposed for analyzing these nonstationary processes using time deformation.For example, Girardin and Senoussi 5 presented a framework based on harmonic analysis for semigroups, where they defined a semi-group stationary process including a local stationary process and a time-warped process.Clerc and Mallat 6 derived an estimator of the deformation by showing that the deformed autocovariance satisfies a transport equation at small scales.Among the different types of nonstationary processes, special attention has been given to warped nonstationary processes that are obtained by deforming the index set of stationary processes.For example, self-similar processes are obtained by applying the Lamperti operator on the time scale of a stationary process 7 .In this paper, we investigate the use of time deformation based on G-stationary models to filter TVF data.We refer to the resulting filters as G-filters.

G-Stationary Processes
Definition 2.1.Let {X t : t ∈ S} be a stochastic process defined on S ⊂ R, let u g t be a mapping onto a set R g ⊂ R, and let g −1 denote an inverse.Then X t is said to be a G-stationary Although not immediately obvious, this definition basically says the following.Let g be a transformation of the time axis and let g t u and t g that is, Definition 2.1 gives the conditions on X t and g t that are necessary and sufficient in order for the dual process, Y u , to be weakly stationary.The implication is that, whereas X t may not be stationary on the original index set on t , it can be mapped onto a new deformed time scale on which it is stationary.We will refer to Y u as the stationary dual of X t , and we let C Y ξ E Y u − μ Y u ξ − μ denote the autocovariance of the dual process.R X τ is called the G-stationary autocovariance, and clearly C Y ξ R X τ .The G-spectrum is defined in Definition 2.2.Definition 2.2.Let {X t , t ∈ a, b } be a G-stationary process and let {Y u , u ∈ −∞, ∞ } be the corresponding dual with respect to the time-deformation function, g t .The G-spectrum of the G-stationary process, X t , is defined as the spectrum of the dual process, that is, If the mapping, g t , from the original space t ∈ a, b to the dual space u ∈ −∞, ∞ is onto, then

2.4
When g t at b, t ∈ −∞, ∞ , the G-stationary process, X t , is simply the traditional weakly stationary process, and when g t ln t , t ∈ 0, ∞ , X t is called an M-stationary process 13 .When g t is the Box-Cox transformation, t ∈ 0, ∞ , then X t is called a G λ -stationary process 3 .When g t at 2 bt, t ∈ 0, ∞ with a > 0 and b ≥ 0, then X t is called a linear chirp stationary process.See Liu 15 , and Robertson et al. 16 .

General Strategy for Analyzing G-Stationary Processes
We give the following outline for analyzing G-stationary processes.
1 Transform the time axis to obtain a stationary dual realization, we will discuss this below .
2 Analyze the transformed dual realization using methods for stationary time series.For example, sometimes this is done by fitting an AR p or an ARMA p, q model to the dual data and then computing forecasts, spectral estimates, and so forth as desired for the problem at hand in the current setting we will be filtering the dual data .
3 Transform back to original time scale as needed.
Step 1 transform the time axis to obtain a stationary dual realization .Finding a time transformation, u g t , is equivalent to determining a sampling scheme such that X t k X g −1 u k Y k , where Y k is stationary.That is, for a G-stationary process, the stationary dual process is obtained theoretically by sampling X t at the points t t k g −1 u k .In general, however, the observed data will have been obtained at equally spaced points, and are not available at the points t k .Interpolation has primarily been used to deal with this problem.See Gray et  In the following discussion we will discuss techniques for fitting a G λ -stationary model to observed TVF data.The G λ model is sufficiently general to include TVF data with either increasing or decreasing frequency behavior.The values, t k , needed for obtaining the stationary dual based on the discrete Box-Cox time transformation, k , where ξΔ n λ 1 1/λ is called the realization offset and Δ n is the sampling increment.Jiang et al. 3 employ a search routine to determine the values of λ and offset that produce the "most stationary" dual.For each set of λ and offset values considered in the search, the data, X t k , k 1, 2, . .., are approximated at the t k 's using interpolation.By then indexing on k, the dual realization associated with the given λ and offset is obtained, and for each combination of λ and offset, the resulting dual realization is checked for stationarity.Jiang, et al. 3 suggest measuring stationarity by examining the characteristics e.g., sample autocorrelations of the first and second halves of the transformed data.They employ a Q-statistic for measuring the difference between the sample autocorrelations of the two halves.This measure is based on the fact that the correlation structure under stationarity stays constant across the realization.The software, GWS, written in S , can be used to perform this search procedure, and it is available from the authors at the website http://www.texasoft.com/atsa.This software can be used to fit a G λ model to a set of data, and it provides methods for spectral analysis, forecasting, and so forth.In the examples here involving analysis of TVF data, we use the GWS software package.
Example 2.3 G λ -analysis of the TVF data in Example 1.1 .In this example we perform G λ analysis on the time series discussed in Example 1.1.We also use Wigner-Ville plots, which display the time-frequency behavior in the data by computing inverse Fourier transforms of windowed versions of the autocovariance function see 14, 18, 19 .As previously noted, X t in 1.2 is an M-stationary process.Figure 2 a shows the data previously shown in Figure 1 a , and in Figure 2 c we show the associated Parzen spectral density.Figure 2 c is a so-called "spread spectrum" showing a wide range of frequency behavior between about f .06 to f .22 which is caused by the frequency changes in the data that were noted in Example 1.1.The frequency change can be visualized in the Wigner-Ville plot in Figure 2 e .In the plot, darker shading corresponds to locations of stronger frequency behavior.For example, Figure 2 a shows a very pronounced lower TVF component which is illustrated with the lower "strip" that is at about f .1 at the beginning of the data whereas by t 300 the period length associated with the lower TVF component is about 20 f .05which is visually consistent with the component shown in Figure 1 b .The higher TVF component, represented by a lighter strip, indicates frequency at about f .22periods of about 5 early in the data which decreases to about f .1 periods of about 10 by about t 300.Again, this is consistent with Figure 1 c .The checkered pattern between these two strips is interference and is not indicative of strong frequency behavior.The Wigner-Ville plot visually illustrates the fact mentioned in Example 1.1, that the lower frequency behavior the bottom strip has higher frequency at the beginning of the data than does the higher frequency component upper strip at the end of the data set.That is, a horizontal line cannot be drawn that stays entirely within the two strips.Using the GWS software, it is seen that a G λ transformation with λ 0 M-stationary and offset 175 is a reasonable choice for producing a stationary dual.The dual data based on this transformation is given in Figure 2 b .The high-frequency and low-frequency behavior is similar to that seen in Figure 2 a except that neither frequency displays time-varying behavior.The corresponding spectral density in Figure 2 d shows two distinct peaks one near .07and the other near .13 .That is, the spectral density of the transformed data i.e., the G-spectral density clearly shows that there are two dominant "frequency" components in the data.The Wigner Ville plot in Figure 2 f shows two parallel lines, one at about f .07 and another at about f .13.This plot conveys the fact that the frequency behavior is not changing with time, which is consistent with stationary data.

The G-Filter
In this section we define the G-filter for filtering G-stationary data.The G-filter can be viewed as a specific example of a unitary equivalent system for filtering G-stationary signals see 4 .
Definition 3.1.Given a stochastic process {X t , t ∈ a, b } and an impulse response function, h t , t ∈ a, b , the G-filter or G-convolution, denoted X ⊗ h t , is defined as Proof.Consider the following: 3 b Let P X d f and P Y d f be the spectra of the stationary dual of the processes X d u and Y d u , that is, the G-spectra of X t and Y t .Since A h d f is the frequency response function of h d u , then based on the fundamental linear filtering theorem, it follows that If the mapping g t from the original space t ∈ a, b to the dual space

The M-Filter
Gray and Zhang 12 introduced the M-convolution or M-filter for filtering the M-stationary process.When g t ln t , t ∈ 0, ∞ , then it follows that which is the M-convolution defined by Gray and Zhang 1988 .

Filtering Data Using the G-Filter
Definition 3.1 shows that the G-filter is not a linear filter if viewed from the original space, but it is a linear filter if viewed from the dual space.Consequently, based on the definitions and results concerning G-filters in Section 3, we use the strategy proposed by Baraniuk and Jones 4 : 1 Make an appropriate time deformation, u g t , on the original time space to obtain a stationary dual process.
2 Apply a traditional linear filter on the dual space to extract components.
3 Transform the filtered dual components back to the original time space.In the following examples we illustrate the implementation of the G-filter using the steps outlined above.0 that has two pairs of complex conjugate roots quite close to the unit circle.Associated with each component is a system frequency.Realizations from this model will be characterized by frequency behavior in the neighborhoods of the two system frequencies, and consequently spectral estimates will have peaks close to these two frequency values.See 14, 17 .Figure 4 a shows a realization of length n 200 from the model in 4.1 , and it is seen that there is a dramatic increase in frequency with time i.e., period lengths decrease .There is also an indication of two TVFs.Using the GWS code, a time transformation with λ 2.9 and offset 14 was selected.Based on this time transformation, the dual data are obtained as in Figure 4 b where the data appear stationary, again with two underlying frequencies.The extreme increase in frequencies in the original data set is illustrated in the Wigner-Ville plot in Figure 4 c .The lower strip going from near zero frequency for small values of t to about f .1 at t 200.The upper less visible strip is associated with a higher-frequency TVF component that also begins at near zero frequency and increases to about f .25 toward t 200.The Wigner-Ville plot for the dual data indicates that the two frequencies have been stabilized and further support the stationary nature of the dual.The dual model is dominated by two frequencies, a lower one at about f .03and an upper again less visible frequency of about f .1.The results of applying 3rd order low-pass and high-pass Butterworth filters with a cutoff frequency f c .065 are shown in Figures 4 e and 4 f , respectively.Figures 4 g and 4 h show the filtered data sets plotted on the original time axis.
The behavior of the two filtered components is consistent with that in the original data as is shown in Figures 5 a and    another TVF associated with higher frequencies than the two main components.Gray et al. 13 analyzed this data set as an M-stationary model with offset 203.Their analysis suggests that there are three dominant TVF components and that the highest component is sufficiently high frequency at the beginning of the time series that it is too rapid for the sampling rate to detect until about t 100.In our analysis here we will use the time transformation suggested by Gray,et al. 13 to compute the dual data.Using this time transformation produces the dual data in Figure 6 b .The Wigner-Ville plot in Figure 6 c shows overlap between the TVF strips, for example, the frequency associated with the lower TVF strip near t 0 is similar to that of the upper strip at around t 250.The Wigner-Ville plot of the dual data is given in Figure 6 d where it can be seen that the two dominant frequencies have been stabilized, and that the two dominant dual frequencies are well separated and fall at about f .18and f .3.
Figures 6 e and 6 f show the low-pass and high-pass filter results, respectively, plotted on the original time axis.

Concluding Remarks
In this paper, we show that the classical linear filtering methods cannot be directly applied to TVF data where the TVF components overlap over time.We discussed a family of models, Gstationary models, which have proven to be a useful extension of the usual stationary models, and which can be used to model a certain range of nonstationary time series with TVF.Then

5 eFigure 1 :
Figure 1: a Realization from signal in 1.2 , b true low-frequency component, c true high-frequency component, d output from low-pass filter with f c .15, and e output from low-pass filter with f c .06.
Hlawatsch and Taub öck 8 and Hlawatsch et al. 9 discuss the use of time frequency displacement operators to displace signals in the time-frequency plane.See also Sampson and Guttorp 10 and Perrin and Senoussi 11 .Gray and Zhang 12 base time deformation on a log transformation of the time axis.They refer to processes that are stationary on the log scale as M-stationary processes and show that the resulting spectral representation is based on the Mellin transform.Flandrin et al. 7 point out that the log transformation of Gray and Zhang 12 is a special case of the Lamperti transform.Gray et al. 13 extend the M-stationary model to analyze data collected at discrete time points.Jiang et al. 3 defined a more general family of nonstationary processes called Gstationary processes whose frequencies monotonically change with time.See also Woodward et al. 14, Chapter 13 .

Figure 2 :
Figure 2: a Realization from signal in 1.2 , b dual data based on λ 0 and offset 200, c Parzen spectral density for data in a , d Parzen spectral density for dual data in b , e Wigner-Ville plot for data in a , and f Wigner-Ville plot for data in b .

d
High pass-filtered data

Figure 3 :
Figure 3: a, b Outputs from low-pass and high-pass filters with f c .1 applied to the dual data in Figure 2 b ; c, d are the filtered dual components after transforming back to the original time scale.

Example 4 . 1
Examples 1.1 and 2.3 revisited .In this example we revisit the TVF data set shown in Figure 1 a that was discussed in Examples 1.1 and 2.3.We will G-filter the data based on a G λ model fit to the data following the steps outlined above. 1 The first step was done in Example 2.3 yielding the dual data in Figure 2 b . 2 Based on the discussion in Example 2.3 and the Wigner-Ville plot in Figure 2 f it seems that the low and high frequency components in the dual data could be separated using a cutoff frequency of about f c .1.Figures 3 a and 3 b show the results of a low pass and high pass Butterworth filter, respectively, applied to the dual data in Figure 2 b .It can be seen that these filters do a good job of separating the low and high frequency components in the dual data.3 Figures 3 c and 3 d show the data in 3 a and 3 b , respectively, plotted on the original time scale using linear interpolation .Comparing Figures 3 c and 3 d with Figures 1 b and 5 b .Also, the TVF components are close to those in the original data as seen by comparing Figures 5 c and 5 d with Figure 4 c .

Example 4 . 3 d
bat echolocation data .In this example, we consider echolocation data from a large brown bat.The data were obtained courtesy of Al Feng of the Beckman Institute at the University of Illinois.The data shown in Figure6a consist of 381 data points taken at 7microsecond intervals with a total duration of .00266seconds.This signal is quite complex and appears to be made up of possibly two different signals.The Wigner-Ville plot in Figure6c suggests that the data contain two TVF components with a suggestion of possibly Wigner-Ville plot for high TVF component

Figure 5 :
Figure 5: a, b Two filtered components solid lines with the original signal dashed lines , c, d Wigner-Ville plot for the two extracted components.
al. 13 , Jiang et al. 3 , and Woodward et al. 14 .For an approach based on Kalman filtering, see Wang et al. 17 .
are the duals of X t and h t with respect to the time-deformation function g t , andX d * h d t is the usual convolution, X d * h d t X d τ h d t − τ dτ.Theorem 3.2.If the mapping g t from the original space t ∈ a, b to the dual space u .3 a the output process, Y t X ⊗ h t , t ∈ a, b , where h t is the impulse response function, is also G-stationary, b the G-spectra of X t and Y t satisfy 1 c , respectively, shows that the procedure did a good job of separating the two TVF frequencies that we were previously unable to separate in Example 1.1 using standard methods.The somewhat jagged behavior of the peak heights is due to the interpolation.Alternatives to linear interpolation are under current investigation.