An Adaptive Resolution Computationally Efficient Short-Time Fourier Transform

The short-time Fourier transform (STFT) is a classical tool for characterizing time-varying signals. Its main limitation is its fixed time-frequency resolution. An enhanced version of the STFT, based on level-crossing sampling, is therefore devised. It adapts the sampling frequency and the window function length by following the local characteristics of the input signal, and thus provides an adaptive resolution time-frequency representation of the input signal. The computational complexity of the proposed STFT is deduced and compared to that of the classical one. The results show a significant gain in computational efficiency and hence in processing power.


INTRODUCTION
Most real-life signals, such as speech, Doppler, seismic, and biomedical signals, are time varying in nature. The spectral contents of these signals vary with time, which is a direct consequence of the signal generation process [1]. The STFT is a classical tool for characterizing such signals [2]. Its limitation is that it provides a fixed resolution time-frequency representation of the input signal. This fixed resolution motivated the multiresolution analysis (MRA) techniques [3][4][5], which provide a good frequency but a poor time resolution for low-frequency events, and a good time but a poor frequency resolution for high-frequency events. This type of analysis is well suited for most real-life signals [3].
In this article, the fixed resolution dilemma is resolved to a certain extent by revising the STFT. The motivation behind the proposed STFT is to achieve a smart time-frequency representation of time-varying signals. The idea is to adapt the time-frequency resolution, along with the computational load, by following the local characteristics of the input signal. An efficient solution is proposed by combining the features of both uniform and nonuniform signal processing tools.

PROPOSED ADAPTIVE RESOLUTION STFT
The block diagram of the proposed STFT is shown in Figure 1. The description of different blocks is given below.

Asynchronous analog to digital converter (AADC)
According to [6], the sampling instants of a nonuniformly sampled signal obtained with the level crossing sampling scheme (LCSS) are defined by (1), where $t_n$ is the current sampling instant, $t_{n-1}$ is the previous one, and $dt_n$ is the time delay between the current and the previous sampling instants (cf. (2)):
$$t_n = t_{n-1} + dt_n \quad (1)$$
$$dt_n = t_n - t_{n-1} \quad (2)$$
The LCSS drastically reduces the activity of the post-processing chain, because it only captures the relevant information [7][8][9]. In this context, analog to digital converters based on the LCSS have been developed [10][11][12]. The AADC [10] is employed for digitizing x(t). An M-bit resolution AADC has $2^M - 1$ quantization levels, which are uniformly distributed over the amplitude dynamics of x(t). The AADC has a finite bandwidth. Thus, to ensure proper signal capture, a band-pass filter with pass band $[f_{\min}; f_{\max}]$ is employed at the AADC input. Let $\Delta V_{in}$ and $\Delta x(t)$ be the amplitude dynamics of the AADC and of x(t), respectively. In order to exploit the complete AADC resolution in the studied case, $\Delta x(t)$ is always adapted to match $\Delta V_{in}$. For an AADC, the maximum and the minimum sampling frequencies [7] are defined by (3) and (4), respectively, where $Fs_{\max}$ and $Fs_{\min}$ are the maximum and the minimum sampling frequencies of the AADC, $f_{\max}$ is the bandwidth, and $f_{\min}$ is the fundamental frequency of x(t):
$$Fs_{\max} = 2 \cdot f_{\max} \cdot (2^M - 1) \quad (3)$$
$$Fs_{\min} = 2 \cdot f_{\min} \cdot (2^M - 1) \quad (4)$$
Figure 1: Block diagram of the proposed STFT. Solid lines ("-") represent the signal flow, dotted lines ("......") the control flow, and dashed lines ("-----") the parameter flow at the different stages of the system.
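As a rough illustration of the LCSS, the following Python sketch detects the level crossings of a densely simulated sinusoid and compares the resulting sampling rate with the bounds $2 \cdot f \cdot (2^M - 1)$, which reproduce the 7 kHz and 0.7 kHz figures quoted later for a 3-bit AADC and a 50 to 500 Hz band. The level placement (all $2^M - 1$ levels strictly inside the input range), the dense simulation grid, and the function names are assumptions made for this sketch; they are not details of the AADC of [10].

```python
import math

def aadc_rate_bounds(f_min, f_max, m_bits):
    """Return (Fs_min, Fs_max) in Hz for an M-bit AADC driven over its full range."""
    n_levels = 2 ** m_bits - 1
    return 2.0 * f_min * n_levels, 2.0 * f_max * n_levels

def uniform_levels(m_bits, v_range):
    """2^M - 1 equally spaced levels strictly inside [-v_range/2, +v_range/2].

    Assumption: the levels sit at multiples of v_range / 2^M, so a
    full-range sinusoid crosses every level twice per period.
    """
    step = v_range / 2 ** m_bits
    return [-v_range / 2.0 + k * step for k in range(1, 2 ** m_bits)]

def level_crossing_sample(signal, duration, levels, dt_fine):
    """Return the sorted instants t_n at which `signal` crosses any level.

    The analog signal is emulated on a fine grid of step `dt_fine`;
    each crossing instant is refined by linear interpolation.
    """
    instants = []
    t_prev, x_prev = 0.0, signal(0.0)
    for k in range(1, int(round(duration / dt_fine)) + 1):
        t_cur = k * dt_fine
        x_cur = signal(t_cur)
        for level in levels:
            # A crossing occurred if the signal changed side of the level.
            if (x_prev >= level) != (x_cur >= level):
                frac = (level - x_prev) / (x_cur - x_prev)
                instants.append(t_prev + frac * (t_cur - t_prev))
        t_prev, x_prev = t_cur, x_cur
    instants.sort()
    return instants

if __name__ == "__main__":
    fs_min, fs_max = aadc_rate_bounds(50.0, 500.0, 3)
    tone = lambda t: 0.9 * math.sin(2.0 * math.pi * 500.0 * t)
    t_n = level_crossing_sample(tone, 0.02, uniform_levels(3, 1.8), 1e-6)
    dt_n = [b - a for a, b in zip(t_n, t_n[1:])]  # delays between instants
    print(fs_min, fs_max, len(t_n) / 0.02, min(dt_n), max(dt_n))
```

Under these assumptions a full-range 500 Hz tone is captured at roughly 14 samples per period, while an idle input produces no samples at all, which is the property the rest of the chain relies on.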

Enhanced activity selection algorithm (EASA) and window selector
The relevant parts of the nonuniformly sampled signal obtained with the AADC are selected by the EASA; this selection corresponds to applying a variable-length rectangular window. The EASA is defined as shown in Algorithm 1. $T_0 = 1/f_{\min}$ is the fundamental period of x(t). $T_0$ and $dt_n$ are used to detect the parts of the nonuniformly sampled signal that contain activity. The condition on $dt_n$ is chosen in order to satisfy the Nyquist criterion for $f_{\min}$ when sampling x(t) nonuniformly with the AADC [13]. $N_i$ represents the number of nonuniform samples lying in the ith selected window, which lies on the jth active part of the nonuniformly sampled signal, where i and j both belong to the set of natural numbers $\mathbb{N}^*$. The window selector implements the condition given by expression (5). Jointly, the EASA and the window selector provide an efficient spectral leakage reduction in the case of transient signals [13]. Indeed, spectral leakage is caused by signal truncation. Usually an appropriate smoothing (cosine) window function is employed to reduce the truncation effect. In the proposed case, as long as condition (5) is true, the leakage problem is resolved by avoiding signal truncation altogether; since no truncation occurs, no cosine window is required. In this case the window decision $D_i = 1$, which sets the switch to state 1 (cf. Figure 1). Otherwise, an appropriate cosine window is employed to reduce the truncation effect. In this case $D_i = 0$, which sets the switch to state 0. In expression (5), $t_1^i$ represents the first sampling instant of the ith selected window and $t_{end}^{i-1}$ represents the last sampling instant of the (i − 1)th selected window.
For proper spectral representation, the condition given by expression (6) should be satisfied [13], where $L_i$ is the length in seconds of the ith selected window. In order to satisfy this condition for the worst case, which occurs for … The lower and the upper bounds on $L_{ref}$ are imposed, respectively, by $T_0$ and by the system resources (the maximum sample frame that the system can process at once). For $N_{ref}$ (cf. (7)), condition (6) holds for all selected windows except when the actual length of the jth activity is less than $T_0$.
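The selection logic described in this subsection can be sketched in Python as follows. Algorithm 1 and expression (5) are not reproduced here, so the concrete rules below are assumptions inferred from the prose: a delay $dt_n > T_0/2$ is treated as the end of an active part (the Nyquist condition for $f_{\min}$), a window is closed once it holds $N_{ref}$ samples, and $D_i = 1$ only when the window both starts after an idle gap and is not cut at $N_{ref}$. The function name `select_windows` is hypothetical.

```python
def select_windows(instants, t0, n_ref):
    """Group sorted sampling instants into selected windows.

    Returns a list of (start_index, end_index, d_i) tuples, where the
    window covers instants[start_index:end_index] and d_i is the window
    decision: 1 = no truncation (rectangular window is enough),
    0 = truncated (a cosine window would be applied).

    Assumed rules (not taken verbatim from Algorithm 1):
      * a delay dt_n > t0 / 2 marks the end of an active part;
      * a window is closed as soon as it holds n_ref samples.
    """
    windows = []
    start = 0
    starts_after_gap = True  # the very first window follows "silence"
    for n in range(1, len(instants) + 1):
        at_end = n == len(instants)
        gap = (not at_end) and (instants[n] - instants[n - 1] > t0 / 2.0)
        full = (n - start) >= n_ref
        if at_end or gap or full:
            # Truncated if the window was cut only because it was full.
            truncated = full and not (gap or at_end)
            d_i = 1 if (starts_after_gap and not truncated) else 0
            windows.append((start, n, d_i))
            start = n
            starts_after_gap = gap or at_end
    return windows

if __name__ == "__main__":
    # Two synthetic active parts separated by a long idle gap.
    part_a = [k * 0.001 for k in range(8)]
    part_b = [1.0 + k * 0.001 for k in range(25)]
    for start, end, d_i in select_windows(part_a + part_b, t0=0.02, n_ref=10):
        print("N_i =", end - start, "D_i =", d_i)
```

With these rules a short activity yields a single untruncated window, while an activity longer than $N_{ref}$ samples is split into several consecutive windows that all receive $D_i = 0$.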

Adaptive sampling rate
The AADC sampling frequency is correlated to the local variations of x(t) [7,13]. Let $Fs_i$ represent the AADC sampling frequency for the ith selected window. $Fs_i$ can be calculated as $Fs_i = N_i / L_i$.
The selected data obtained with the EASA can be used directly for further nonuniform digital processing [8,14]. However, in the studied case the selected data is resampled uniformly, which makes it possible to take advantage of both nonuniform and uniform signal processing tools [7,13]. This resampling introduces an additional error. Nevertheless, prior to this transformation, one can take advantage of the inherent oversampling of the relevant signal parts in the system [7], which adds to the accuracy of the post-resampling process [11]. Nearest neighbour resampling interpolation (NNRI) is employed for data resampling. The reasons for preferring the NNRI are discussed in [13,15]. A reference sampling frequency $F_{ref}$ is chosen such that it remains greater than and closest to $F_{Nyq} = 2 \cdot f_{\max}$. Depending upon the values of $F_{ref}$ and $Fs_i$, the resampling frequency $Frs_i$ (cf. Figure 1) can be adapted for the ith selected window. For the case $Fs_i > F_{ref}$, $Frs_i$ is chosen as $Frs_i = F_{ref}$. This is done in order to resample the selected data lying in the ith selected window closer to the Nyquist frequency. It avoids unnecessary interpolations during the data resampling process and so reduces the computational load of the proposed technique.
For the case $Fs_i \le F_{ref}$, $Frs_i$ is chosen as $Frs_i = Fs_i$. In this case, it appears that the data lying in the ith selected window may be resampled at a frequency lower than $F_{Nyq}$, which could cause aliasing. However, the sampling rate of the AADC varies according to the slope of x(t) [10]: a high-frequency signal part has a high slope and the AADC samples it at a higher rate, and vice versa. Hence, a signal part with only low-frequency components can be sampled by the AADC at a frequency below the Nyquist frequency of x(t), yet this signal part is still locally oversampled in time with respect to its local bandwidth [7]. Hence, there is no danger of aliasing. This statement is further illustrated with the results summarized in Table 1.
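The rate adaptation and resampling steps of this subsection can be sketched as below. The per-window rate is taken as the number of samples divided by the window length, and the two cases for $Frs_i$ follow the text. The nearest-neighbour routine is only one straightforward reading of the NNRI (the cited references [13,15] give the actual details); the uniform grid starting at the first instant of the window, the tie-breaking rule, and all function names are assumptions.

```python
import math

def average_sampling_frequency(n_i, l_i):
    """Fs_i: number of nonuniform samples divided by window length (s)."""
    return n_i / l_i

def choose_resampling_frequency(fs_i, f_ref):
    """Frs_i = F_ref when Fs_i > F_ref, otherwise Frs_i = Fs_i."""
    return f_ref if fs_i > f_ref else fs_i

def nnri_resample(instants, amplitudes, frs_i):
    """Nearest-neighbour resampling of one window onto a uniform grid.

    For every uniform instant t_k = t_start + k / Frs_i the amplitude of
    the closest nonuniform sample is copied; ties go to the later sample.
    The search pointer only moves forward, so each output point costs a
    bounded number of comparisons on average.
    """
    t_start, t_end = instants[0], instants[-1]
    n_out = int(math.floor((t_end - t_start) * frs_i)) + 1
    resampled = []
    j = 0
    for k in range(n_out):
        t_k = t_start + k / frs_i
        while (j + 1 < len(instants)
               and abs(instants[j + 1] - t_k) <= abs(instants[j] - t_k)):
            j += 1
        resampled.append(amplitudes[j])
    return resampled

if __name__ == "__main__":
    f_ref = 1250.0
    print(choose_resampling_frequency(7000.0, f_ref))  # fast part: use F_ref
    print(choose_resampling_frequency(700.0, f_ref))   # slow part: keep Fs_i
    print(nnri_resample([0.0, 0.1, 0.35, 0.4, 1.0], [1, 2, 3, 4, 5], 4.0))
```

Because no new amplitude values are computed, this kind of resampler needs only comparisons, which is consistent with the operation count used in the complexity section.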

Adaptive resolution analysis
The STFT of a sampled signal $x_n$ is determined by computing the discrete Fourier transform (DFT) of an N-sample segment centred on τ, which describes the spectral contents of $x_n$ around the instant τ. Here N is defined as N = L·Fs, where L is the effective length in seconds of the window function $w_n$ and Fs is the sampling frequency. The STFT can be expressed mathematically by (9). In Equation (9), f is the frequency index, which is normalized with respect to Fs.
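For orientation, one common discrete form that matches this verbal description is written out below. It is a sketch rather than a transcription of (9): the summation limits and the indexing of the window $w_{n-\tau}$ are assumptions, while $x_n$, $w_n$, N, τ and the normalized frequency index f are the quantities defined in the text.

```latex
X[\tau, f] \;=\; \sum_{n=\tau - N/2}^{\tau + N/2 - 1} x_n \, w_{n-\tau} \, e^{-j 2 \pi f n},
\qquad N = L \cdot Fs
```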
L controls the time and frequency resolution of the STFT [2]. In the classical case, the input signal is sampled at a fixed sampling frequency Fs, regardless of its local variations. Thus, a fixed L results in a fixed N. In this case, the time resolution Δt and the frequency resolution Δf of the STFT can be defined by (10) and (11), respectively:
$$\Delta t = L \quad (10)$$
$$\Delta f = \frac{Fs}{N} \quad (11)$$
Equation (11) shows that, for a fixed Fs, the frequency resolution can be improved (Δf made smaller) by increasing N. But increasing N requires increasing L, which degrades the time resolution (Δt becomes larger, cf. (10)). Thus, a larger L provides a better frequency resolution but a poorer time resolution, and vice versa. This conflict between Δf and Δt motivated the MRA techniques [3][4][5].
The proposed STFT is a smart alternative to the MRA techniques. It performs an adaptive time-frequency resolution analysis, which is not attainable with the classical STFT. This is achieved by adapting $Frs_i$, $L_i$, and $Nr_i$ according to the local variations of x(t), where $Nr_i$ is the number of resampled data points lying in the ith selected window. Thus, the time resolution $\Delta t_i$ and the frequency resolution $\Delta f_i$ of the proposed STFT can be specific to the ith selected window and are defined by (12) and (13), respectively:
$$\Delta t_i = L_i \quad (12)$$
$$\Delta f_i = \frac{Frs_i}{Nr_i} \quad (13)$$
Because of this adaptive resolution, the proposed STFT is referred to as the adaptive resolution STFT (ARSTFT) in the remainder of this article. The adaptation of $Frs_i$, $L_i$, and $Nr_i$ also adds to the computational gain of the ARSTFT compared to the classical one. This gain is achieved firstly by not processing unnecessary samples and secondly by avoiding the cosine window function as long as condition (5) is true. The ARSTFT is defined by (14). In (14), $\tau_i$ and $f_i$ are the central time and the frequency index of the ith selected window, respectively. $f_i$ is normalized with respect to $Frs_i$, and n is the index of the resampled data points lying in the ith selected window. The notation $w_n^i$ indicates that the window function length $L_i$ and shape (rectangular or cosine) can be adapted for the ith selected window.
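A per-window spectrum computation in the spirit of the ARSTFT can be sketched as follows. The sketch assumes that the resolution of a window is $\Delta t_i = L_i = Nr_i / Frs_i$ and $\Delta f_i = Frs_i / Nr_i$, that a plain DFT is evaluated on the resampled points, and that a Hann window stands in for the "cosine window" when $D_i = 0$. The function names and the test tone are illustrative choices, not part of the original method description.

```python
import cmath
import math

def window_resolution(frs_i, nr_i):
    """Return (delta_t_i, delta_f_i) for a window of Nr_i points at Frs_i."""
    l_i = nr_i / frs_i  # window length in seconds
    return l_i, frs_i / nr_i

def hann(n_points):
    """Symmetric Hann (raised cosine) window of n_points samples."""
    if n_points < 2:
        return [1.0] * n_points
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def window_spectrum(resampled, d_i):
    """Direct DFT of one selected window.

    d_i == 1: no truncation occurred, the data is used as is
              (rectangular window).
    d_i == 0: the data is tapered with a Hann window first.
    """
    nr_i = len(resampled)
    taper = [1.0] * nr_i if d_i == 1 else hann(nr_i)
    xw = [x * w for x, w in zip(resampled, taper)]
    return [sum(xw[n] * cmath.exp(-2j * math.pi * k * n / nr_i)
                for n in range(nr_i))
            for k in range(nr_i)]

if __name__ == "__main__":
    frs_i, nr_i = 1250.0, 125  # hypothetical window: 0.1 s of data
    tone = [0.9 * math.sin(2.0 * math.pi * 200.0 * n / frs_i)
            for n in range(nr_i)]
    spectrum = window_spectrum(tone, d_i=1)
    magnitudes = [abs(c) for c in spectrum[:nr_i // 2]]
    delta_t_i, delta_f_i = window_resolution(frs_i, nr_i)
    peak_bin = magnitudes.index(max(magnitudes))
    print(delta_t_i, delta_f_i, peak_bin * delta_f_i)  # peak frequency in Hz
```

Shorter windows give a smaller $\Delta t_i$ and a larger $\Delta f_i$, so the resolution trade-off is made separately for each selected window instead of once for the whole signal.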

ILLUSTRATIVE EXAMPLE
In order to illustrate the ARSTFT, an input signal x(t), shown on the left part of Figure 2, is employed. Its total duration is 30 seconds and it consists of three active parts. Each activity is a sinusoid of 0.9 V amplitude, with frequencies of 50, 200, and 500 Hz, respectively. The durations of the activities are 5, 0.5, and 1.6 seconds, respectively. x(t) is band limited between 50 and 500 Hz and is sampled by employing a 3-bit resolution AADC. Thus, $Fs_{\max}$ and $Fs_{\min}$ become 7 kHz and 0.7 kHz, respectively (cf. (3) and (4)).
Following the criteria given in Section 2, $N_{ref} = 4096$ is chosen, which leads to five selected windows. The first two selected windows correspond to the first two activities and the remaining three correspond to the third activity. The last three selected windows are not distinguishable on the right part of Figure 2, because they lie consecutively on the third activity. The parameters of each selected window are summarized in Table 1.
Table 1 exhibits the interesting features of the ARSTFT. $Fs_i$ shows the adaptation of the sampling frequency to the local variations of x(t), which is achieved by the AADC and the EASA. $N_i$ shows that the relevant signal parts are locally oversampled in time with respect to their local bandwidths [7]. $Frs_i$ shows the adaptation of the resampling frequency for each selected window. It further adds to the computational gain of the ARSTFT by avoiding unnecessary interpolations during the resampling process. $Nr_i$ shows how the adaptation of $Frs_i$ avoids the processing of unnecessary samples during the spectral computation. $L_i$ exhibits the dynamic feature of the EASA, which is to correlate the window function length with the local variations of x(t). The adaptation of $L_i$, $Frs_i$, and $Nr_i$ leads to the adaptive time-frequency resolution, which is clear from the values of $\Delta t_i$ and $\Delta f_i$ in Table 2. Table 2 demonstrates that the ARSTFT adapts its time-frequency resolution by following the local variations of x(t).
It provides a good time but a poor frequency resolution for the high-frequency parts of x(t), and vice versa. This is the type of analysis that is well suited for most real-life signals [3]. The spectrum of each selected window is computed and plotted with respect to $\tau_i$ in Figure 3. Figure 3 shows the fundamental and the periodic spectrum peaks of each selected window. In this case, the spectrum periodic frequency $f_p^i$ for the ith selected window is equal to $Frs_i$. The adaptation of $Frs_i$ can therefore be visualized from Figure 3.
The ARSTFT also adapts the window shape (rectangular or cosine) for the ith selected window. Condition (5) remains true for the first two selected windows, which sets $D_i = 1$. As no signal truncation occurs, no cosine window is required in this case. On the other hand, the number of samples for the third activity is 11200. Therefore, $N_{ref} = 4096$ leads to three selected windows for the time span of the third activity. Condition (5) becomes false in this case, which sets $D_i = 0$. As signal truncation occurs, cosine (Hanning) windows of suitable length are employed to reduce this effect.
In the classical case, if Fs = $F_{ref}$ is chosen in order to satisfy the Nyquist sampling criterion for x(t), then the whole signal will be sampled at 1.25 kHz, regardless of its local variations. This produces more samples than required. Moreover, the windowing process is not able to select only the active parts of the sampled signal. In addition, L remains static and cannot adapt to the local variations of the signal. Thus, the system processes needless samples, which results in a higher computational activity than in the proposed case. For the classical case, a fixed N = 4096 will produce nine fixed windows of L = 3.3 seconds for the total x(t) time span of 30 seconds. This leads to a fixed Δt = 3.3 seconds and Δf = 0.31 Hz for all nine windows (cf. (10) and (11)).
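The classical figures quoted above can be checked with a few lines of Python. The check assumes N = L·Fs, Δt = L and Δf = Fs/N, and counts only complete windows; the function name is hypothetical.

```python
def classical_stft_parameters(fs, n, total_duration):
    """Window length, frequency resolution, window and sample counts
    for a classical STFT with fixed Fs (Hz) and fixed N (samples)."""
    window_length = n / fs                       # L = N / Fs, also delta_t
    delta_f = fs / n                             # frequency resolution in Hz
    n_windows = int(total_duration // window_length)  # complete windows only
    total_samples = int(total_duration * fs)     # samples taken over x(t)
    return window_length, delta_f, n_windows, total_samples

if __name__ == "__main__":
    l, delta_f, n_windows, total = classical_stft_parameters(1250.0, 4096, 30.0)
    print(round(l, 2), round(delta_f, 2), n_windows, total)
    # Only 5 + 0.5 + 1.6 seconds of the 30-second record contain activity.
    print((5.0 + 0.5 + 1.6) / 30.0)
```

At 1.25 kHz the 30-second record holds 37500 samples, although less than a quarter of its duration contains activity, which is the redundancy the adaptive scheme avoids.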

COMPUTATIONAL COMPLEXITY
This section compares the computational complexity of the ARSTFT with the classical STFT. The complexity evaluation is made by considering the number of operations executed to perform the algorithm.
In the classical case, Fs is fixed, and a time-invariant cosine window function of fixed length L is employed to window the sampled data. If N is the number of samples lying in the window, then the windowing operation performs N multiplications between $w_n$ and $x_n$ (cf. (9)). The spectrum of the windowed data is obtained by computing its DFT. A complex term is involved in the DFT computation, so the DFT complexity is calculated by taking the real and the imaginary parts separately. The DFT performs $2 \cdot N^2$ additions and $2 \cdot N^2$ multiplications, so the operation count becomes $4 \cdot N^2$ for N output frequencies. The combined computational complexity $C_1$ of the STFT is given by (15), where A is the total number of windows occurring over the observation length of x(t).
For the proposed ARSTFT, $Fs_i$, $Frs_i$, and $w_n^i$ are not fixed and are adapted according to the local variations of x(t). The EASA performs $2 \cdot N_i$ comparisons and $N_i$ increments for the ith selected window (cf. Section 2). The choice of $Frs_i$ and of the window shape requires three comparisons. The selected signal is resampled before computing its DFT. The NNRI is employed for this purpose; it only requires a comparison operation for each resampled observation, so the resampler performs $Nr_i$ comparisons. If $D_i = 0$, then a cosine window function is applied to the resampled data, which requires $Nr_i$ multiplications (cf. Figure 1). The DFT performs $4 \cdot (Nr_i)^2$ operations for the ith selected window. The combined computational complexity $C_2$ of the ARSTFT is given by (16), where i = 1, 2, . . . , K is the index of the selected window and α is a multiplying factor whose value is 1 for $D_i = 0$ and 0 for $D_i = 1$.
The computational gain of the ARSTFT over the classical STFT is calculated by employing the results of the illustrative example. The results are summarized in Table 3. Table 3 shows the computational gain of the ARSTFT over the STFT for each activity of x(t).
It shows that the ARSTFT leads to a significant reduction in the total number of operations compared with the classical STFT. This reduction is achieved by adapting $Fs_i$, $Frs_i$, and $w_n^i$ according to the local variations of x(t).
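The operation counts described in this section can be tallied with the short Python sketch below. It assumes that every addition, multiplication, comparison, and increment counts as one operation, which may differ from how (15) and (16) weight the different operation types. The window tuples in the usage lines are made-up numbers for illustration only; they are not the entries of Table 1 or Table 3, and the function names are hypothetical.

```python
def classical_stft_operations(n, a):
    """Operations for A windows of N samples each:
    N windowing multiplications plus 4*N^2 DFT operations per window."""
    return a * (n + 4 * n * n)

def arstft_operations(windows):
    """Operations over the selected windows.

    `windows` is a list of (n_i, nr_i, d_i) tuples:
      n_i  : nonuniform samples in the window (EASA: 2*n_i comparisons
             plus n_i increments),
      nr_i : resampled points (NNRI: nr_i comparisons, DFT: 4*nr_i^2),
      d_i  : window decision; a cosine window (nr_i multiplications)
             is applied only when d_i == 0.
    Three more comparisons per window choose Frs_i and the window shape.
    """
    total = 0
    for n_i, nr_i, d_i in windows:
        alpha = 1 if d_i == 0 else 0
        total += 3 * n_i + 3 + nr_i + alpha * nr_i + 4 * nr_i * nr_i
    return total

if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the two formulas.
    c1 = classical_stft_operations(n=1024, a=4)
    c2 = arstft_operations([(600, 400, 1), (900, 512, 0)])
    print(c1, c2, c1 / c2)
```

In both counts the quadratic DFT term dominates, so most of the saving comes from keeping $Nr_i$ small and from skipping idle portions of the signal entirely.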

CONCLUSIONS
A new tool for adaptive resolution time-frequency analysis is proposed. The ARSTFT is especially well suited for low-activity sporadic signals such as electrocardiogram, phonocardiogram, and seismic signals. It is shown that $Fs_i$ and $L_i$ adapt by following the local variations of x(t). Criteria for choosing appropriate values of $F_{ref}$ and $N_{ref}$ are developed. A complete methodology for adapting $Frs_i$ and $w_n^i$ for the ith selected window has been demonstrated.
The ARSTFT outperforms the STFT. Its advantages over the STFT are the adaptive time-frequency resolution and the computational gain. These features are achieved through the joint benefits of the AADC, the EASA, and the resampling, as they make it possible to adapt $Fs_i$, $Frs_i$, $N_i$, $Nr_i$, and $w_n^i$ by exploiting the local variations of x(t). The use of fast algorithms in place of the DFT for the spectrum computation is in progress; it will further add to the computational efficiency of the ARSTFT. Moreover, the performance comparison of the ARSTFT with other MRA techniques, in terms of computational complexity and quality, opens the way to new research perspectives.