Measures of Dependence for 𝛼-Stable Distributed Processes and Its Application to Diagnostics of Local Damage in Presence of Impulsive Noise



Introduction
Local damage detection is a crucial task in modern condition monitoring. Most methods are based on detecting a cyclic impulsive signal in a noisy observation, and many authors explore new possibilities for more efficient fault detection. The interest in this task has a serious economic background. The difficulty of the problem lies in the properties of the vibration data acquired from industrial machines. Such signals are often affected by non-Gaussian noise, which results from vibration contamination by nearby working machines and which strongly reduces the chance of detecting the fault during signal processing. Moreover, in early stages local damage is usually masked by the high energy of other signal components. This results in a low signal-to-noise ratio, and thus such a fault often remains undetected.
The main problem in local damage detection based on vibration signal analysis is the type of background noise. In many fields the noise is non-Gaussian. In such a case, the standard methods are not sufficient [1][2][3][4][5][6]. The impulsive behavior of the noise requires methods adequate for this type of signal. In this paper, we provide a technique based on one of the best-known impulsive-type models, namely, the one based on the α-stable distribution.
The α-stable distributions and processes belong to the so-called heavy-tailed family, which means that extreme values of the corresponding random variable (or process) are more probable than in the Gaussian case. Since Mandelbrot introduced the α-stable distribution in the modeling of financial asset returns [7,8], numerous empirical studies have been carried out in different applications. Indeed, this distribution can be applied in finance [9][10][11], biology and microbiology [12,13], plasma fluctuations in fusion devices [14][15][16][17][18], physics [19], and electrical engineering [20]. The α-stable distributions have also found applications in technical diagnostics [21,22].
As mentioned, local damage of a rotating machine manifests itself as cyclic behavior of the vibration signal. The cyclic nature of the data can be recognized by analyzing statistics that indicate the dependence inside the process. The classical measure is the autocovariance (or autocorrelation). However, when the signal is impulsive, the classical measure should not be used: for heavy-tailed models the second moment is infinite, and thus the covariance is not defined. In this case, alternative measures of dependence are examined. The best-known measures adequate for heavy-tailed processes are the codifference and the covariation [23]. The codifference is considered for infinitely divisible processes and is defined through the characteristic function of the process. Therefore the empirical counterpart of the codifference is based on the empirical characteristic function [24]. The second measure, defined only for α-stable distributed processes, is based on the spectral measure, so its estimation is much more complicated. The main goal of this paper is to introduce a new estimation method for the covariation which follows directly from its definition. This approach fills a gap in the problem of covariation estimation, which is rarely discussed in the literature.
As an application, we show how the empirical codifference and covariation can be useful in the problem of local damage detection based on vibration signal analysis.
The rest of the paper is organized as follows. In Section 2 we present the main properties of α-stable distributed random variables in the one-dimensional and multidimensional cases. In Section 3 we present the alternative measures of dependence adequate for processes with infinite variance, namely, the covariation and the codifference. Section 4 is devoted to the empirical counterparts of these measures; its main goal is the introduction of a new technique for estimating the covariation, which is based on spectral measure estimation. In Section 5 we show how the proposed techniques can be useful in the problem of local damage detection in rotating machines. The last section concludes the paper.

The 𝛼-Stable Distribution
In this section we introduce the α-stable distribution and present the main properties of α-stable distributed random variables. We consider the one-dimensional and multidimensional cases separately.

One-Dimensional Case.
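There are several equivalent definitions of α-stable distributed random variables. One of them is given in terms of the characteristic function.
Namely, a random variable $X$ is said to have an α-stable distribution if there are parameters $0 < \alpha \le 2$, $\sigma > 0$, $-1 \le \beta \le 1$, and $\mu \in \mathbb{R}$ such that the characteristic function of $X$ has the following form [23]:

$$E \exp\{i\theta X\} = \begin{cases} \exp\{-\sigma^{\alpha}|\theta|^{\alpha}\,(1 - i\beta\,\mathrm{sign}(\theta)\tan(\pi\alpha/2)) + i\mu\theta\}, & \alpha \neq 1, \\ \exp\{-\sigma|\theta|\,(1 + i\beta\,(2/\pi)\,\mathrm{sign}(\theta)\ln|\theta|) + i\mu\theta\}, & \alpha = 1. \end{cases} \tag{1}$$

In the above function $\alpha$ is called the stability index, $\sigma$ is the scale parameter, $\beta$ is the skewness parameter, and $\mu$ is the shift parameter. It is worth mentioning that for $\beta = 0$ and $\mu = 0$ the random variable $X$ with the characteristic function given in (1) is called symmetric α-stable.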
Although the α-stable distributions and processes have found many practical applications, they have some drawbacks. One of them is that the probability density function (as well as the cumulative distribution function) of the α-stable distribution does not have a closed analytical form. There are, however, three exceptions: the Gaussian distribution (α = 2), the Lévy distribution (α = 1/2 and β = 1), and the Cauchy distribution (α = 1 and β = 0).
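The Gaussian and Cauchy special cases can be checked numerically. The following is a minimal sketch, assuming the characteristic function in the standard parametrization of [23]; the function name `stable_cf` and the test point are illustrative choices and do not come from the paper.

```python
import cmath
import math

def stable_cf(theta, alpha, sigma=1.0, beta=0.0, mu=0.0):
    """Characteristic function E exp(i*theta*X) of an alpha-stable variable.

    Assumes the parametrization with stability index alpha, scale sigma,
    skewness beta and shift mu (hypothetical helper, for illustration only).
    """
    if theta == 0.0:
        return 1.0 + 0.0j
    sign = math.copysign(1.0, theta)
    if alpha != 1.0:
        skew = 1.0 - 1j * beta * sign * math.tan(math.pi * alpha / 2.0)
        exponent = -(sigma ** alpha) * abs(theta) ** alpha * skew
    else:
        skew = 1.0 + 1j * beta * (2.0 / math.pi) * sign * math.log(abs(theta))
        exponent = -sigma * abs(theta) * skew
    return cmath.exp(exponent + 1j * mu * theta)

theta = 0.7
# alpha = 2 should match a Gaussian N(0, 2*sigma^2): exp(-sigma^2 * theta^2)
print(abs(stable_cf(theta, 2.0) - math.exp(-theta ** 2)))
# alpha = 1, beta = 0 should match a Cauchy law: exp(-sigma * |theta|)
print(abs(stable_cf(theta, 1.0) - math.exp(-abs(theta))))
```

Both printed differences should be at the level of floating-point rounding. Note that in this parametrization the Gaussian case has variance 2σ², not σ².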
One of the main properties of α-stable distributed random variables is the so-called heavy-tailed behavior. In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded; that is, they have heavier tails than the exponential distribution. For $\alpha < 2$ the moment $E|X|^p$ is finite only for $p < \alpha$, while for $p \ge \alpha$ it is infinite. Therefore, for $\alpha < 2$ the second moment of an α-stable distributed random variable $X$ does not exist, which means that many of the techniques valid in the Gaussian case cannot be applied.
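The practical consequence of the infinite second moment can be illustrated by simulation. The sketch below is an assumption-laden illustration and not part of the paper: it draws symmetric α-stable samples with unit scale using the Chambers-Mallows-Stuck method and prints the sample variance for growing sample sizes. The function name `sas_sample`, the seed, and the sample sizes are arbitrary choices.

```python
import numpy as np

def sas_sample(alpha, size, rng):
    """Symmetric alpha-stable sample (scale 1, shift 0), Chambers-Mallows-Stuck method."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform angle
    w = rng.exponential(1.0, size)                 # unit-mean exponential
    if alpha == 1.0:
        return np.tan(u)                           # Cauchy case
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(0)
for alpha in (2.0, 1.5):
    x = sas_sample(alpha, 200_000, rng)
    # sample variance computed on growing prefixes of the same sample
    print(alpha, [float(np.var(x[:n])) for n in (2_000, 20_000, 200_000)])
```

For α = 2 the sample variance settles near 2 (the Gaussian case with variance 2σ²), whereas for α = 1.5 it is dominated by the largest observations and typically does not stabilize as the sample grows.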

Multidimensional Case.
A $d$-dimensional α-stable random vector $\mathbf{X}$ can be defined by means of its characteristic function. Namely, $\mathbf{X}$ is a $d$-dimensional α-stable random vector if there exist a finite measure $\Gamma$ on the $(d-1)$-dimensional unit sphere $S_{d-1}$ (called the spectral measure) and a shift vector $\mu^0 \in \mathbb{R}^d$ such that its characteristic function has the following form [23]:

$$\Phi(\mathbf{t}) = E \exp\{i\langle \mathbf{t}, \mathbf{X}\rangle\} = \exp\{-I(\mathbf{t}) + i\langle \mathbf{t}, \mu^0\rangle\}, \qquad I(\mathbf{t}) = \int_{S_{d-1}} \psi_{\alpha}(\langle \mathbf{t}, \mathbf{s}\rangle)\,\Gamma(d\mathbf{s}), \tag{2}$$

where

$$\psi_{\alpha}(u) = \begin{cases} |u|^{\alpha}\,(1 - i\,\mathrm{sign}(u)\tan(\pi\alpha/2)), & \alpha \neq 1, \\ |u|\,(1 + i\,(2/\pi)\,\mathrm{sign}(u)\ln|u|), & \alpha = 1, \end{cases}$$

and $\langle \mathbf{t}, \mathbf{s}\rangle$ is the inner product. Let us observe that for $\mu^0 = 0$ we have $I(\mathbf{t}) = -\ln \Phi(\mathbf{t})$, and therefore $I$ is the exponent of the characteristic function $\Phi(\mathbf{t})$ of the random vector $\mathbf{X}$. The α-stable random vectors inherit properties from the one-dimensional case, for instance, the heavy-tailed behavior of the marginal distributions. It is also interesting that any linear combination of the components of $\mathbf{X}$ of the form $Y = \sum_{k=1}^{d} b_k X_k$ has a one-dimensional α-stable distribution; see Theorem 2.1.2 in [23]. The converse is true only if all linear combinations $Y = \sum_{k=1}^{d} b_k X_k$ are symmetric, or have a stability index greater than or equal to one, or are strictly stable.
As mentioned, despite many interesting properties and applications, α-stable distributions have several disadvantages. One of them is that the main measure of dependence, correlation (or covariance), cannot be used for $\alpha < 2$, since in this case, as indicated above, the theoretical second moment is infinite. Therefore, there is a need to consider alternative measures of dependence that are adequate for infinite variance processes. In the next section we introduce two of them and indicate their properties.

Alternative Measures of Dependence for Infinite Variance Processes
One of the measures often considered as a tool for measuring interdependence is the codifference. This measure is defined for a general class of processes, namely, infinitely divisible ones. For a stationary infinitely divisible process $\{X(t)\}$ the codifference is defined as follows [23]:

$$CD(X(t), X(s)) = \ln E \exp\{i(X(t) - X(s))\} - \ln E \exp\{iX(t)\} - \ln E \exp\{-iX(s)\}. \tag{3}$$

The main properties of the codifference can be found, for example, in [23]; we only mention here that this measure can be considered as an extension of the autocovariance. If the process $\{X(t)\}$ is Gaussian, the codifference reduces simply to the autocovariance [25]. The codifference carries enough information to detect ergodic properties of the process $\{X(t)\}$ [26]. It is also closely related to another measure, namely, the dynamical functional, discussed, for example, in [27][28][29][30][31][32] in the context of chaotic behavior recognition. The codifference is also used to examine the so-called long range dependence (or long memory) when the correlation function is not defined [33].
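The reduction to the autocovariance in the Gaussian case can be verified in a few lines. The derivation below is a sketch under stated assumptions: the codifference is taken as $\ln E e^{i(X(t)-X(s))} - \ln E e^{iX(t)} - \ln E e^{-iX(s)}$, and $\{X(t)\}$ is a stationary Gaussian process with mean $m$, variance $\sigma^2$, and autocovariance $r(t-s)$ (these three symbols are introduced here only for illustration).

```latex
\begin{align*}
\ln E\, e^{i(X(t)-X(s))} &= -\tfrac{1}{2}\operatorname{Var}\bigl(X(t)-X(s)\bigr)
                          = -\sigma^{2} + r(t-s), \\
\ln E\, e^{iX(t)} &= im - \tfrac{1}{2}\sigma^{2}, \qquad
\ln E\, e^{-iX(s)} = -im - \tfrac{1}{2}\sigma^{2}, \\
CD\bigl(X(t),X(s)\bigr) &= \bigl(-\sigma^{2} + r(t-s)\bigr)
   - \bigl(im - \tfrac{1}{2}\sigma^{2}\bigr)
   - \bigl(-im - \tfrac{1}{2}\sigma^{2}\bigr) = r(t-s).
\end{align*}
```

The mean terms cancel because of stationarity, so only the autocovariance remains.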
Another alternative measure of dependence is the covariation. This measure is defined only for symmetric α-stable random variables with $\alpha > 1$. If $X_1$ and $X_2$ are jointly symmetric α-stable with $\alpha > 1$ and $\Gamma$ is the spectral measure of the random vector $(X_1, X_2)$, then the covariation of $X_1$ on $X_2$ is the real number defined as [23]

$$CV(X_1, X_2) = \int_{S_1} s_1\, s_2^{\langle \alpha - 1 \rangle}\,\Gamma(d\mathbf{s}), \tag{4}$$

where $S_1$ is the unit circle in $\mathbb{R}^2$ and the signed power $a^{\langle p \rangle}$ is given by $a^{\langle p \rangle} = |a|^{p}\,\mathrm{sign}(a)$. The main properties of the covariation can be found in [23]. We only mention that, in contrast to the codifference, the covariation is not a symmetric measure when $\alpha < 2$, and when $\alpha = 2$ it is equal to half of the covariance of $X_1$ and $X_2$. Moreover, for $\alpha > 1$, the covariation induces a norm on the linear space of jointly symmetric stable random variables. Namely, if $X_1$ is a symmetric α-stable random variable with $\alpha > 1$, then

$$\|X_1\|_{\alpha} = \bigl(CV(X_1, X_1)\bigr)^{1/\alpha}. \tag{5}$$

The covariation norm of a random variable is equal to its scale parameter. If we assume that $X_1$ and $X_2$ are symmetric stable with $\alpha > 1$, then the codifference of $X_1$ on $X_2$ can be expressed by means of the covariation norm; namely [23,34],

$$CD(X_1, X_2) = \|X_1\|_{\alpha}^{\alpha} + \|X_2\|_{\alpha}^{\alpha} - \|X_1 - X_2\|_{\alpha}^{\alpha}. \tag{6}$$

Empirical Measures of Dependence for Infinite Variance Processes
In this section we introduce methods of estimating the alternative measures of dependence, adequate for infinite variance processes, considered in the previous section. The estimator of the codifference was introduced previously in the literature; therefore we only outline the idea of the estimation method. The main result of the paper is presented in Section 4.2, where we introduce a new tool for covariation estimation.

Codifference.
For the stationary process $\{X(t)\}$ we define an estimator of the codifference in the following way [25]:

$$\widehat{CD}(X(t), X(s)) = \ln\bigl(\hat\varphi(1, -1, X(t), X(s))\bigr) - \ln\bigl(\hat\varphi(1, 0, X(t), X(s))\bigr) - \ln\bigl(\hat\varphi(0, -1, X(t), X(s))\bigr), \tag{7}$$

where $\hat\varphi(u, v, X(t), X(s))$ is an estimator of the characteristic function:

$$\varphi(u, v, X(t), X(s)) = E \exp\{i(uX(t) + vX(s))\}. \tag{8}$$

In [24] an efficient methodology is introduced for estimating the codifference from a single trajectory of a stationary process. Namely, if $\{x_k,\ k = 1, \ldots, N\}$ is a realization of a stationary process $\{X(t)\}$, then the estimator of the characteristic function takes the form

$$\hat\varphi(u, v, X(t), X(s)) = \frac{1}{N} \sum_{k=1}^{N - |t-s|} \exp\bigl\{i\,(u\,x_{k+|t-s|} + v\,x_k)\bigr\}. \tag{9}$$

Finally, we mention that the estimator of the codifference for a linear α-stable process has very good properties [24,35,36] and has been successfully used, for instance, in the problem of proper model recognition [25] as well as to detect impulsive behavior of real data [21].
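The single-trajectory estimator can be sketched in a few lines. The code below is an illustration under assumptions, not the authors' implementation: the empirical characteristic function at lag k is taken as the sum of exp(i(u·x[j+k] + v·x[j])) over the N − k available pairs, divided by N (the normalization is an assumption), and the real part of the result is returned. The names `ecf2` and `codifference` are hypothetical.

```python
import numpy as np

def ecf2(x, u, v, k):
    """Empirical characteristic function at lag k: sum over pairs (x[j+k], x[j]), divided by N."""
    n = len(x)
    return np.sum(np.exp(1j * (u * x[k:] + v * x[:n - k]))) / n

def codifference(x, k):
    """Empirical codifference at lag k from a single trajectory x (real part)."""
    x = np.asarray(x, dtype=float)
    cd = (np.log(ecf2(x, 1, -1, k))
          - np.log(ecf2(x, 1, 0, k))
          - np.log(ecf2(x, 0, -1, k)))
    return float(np.real(cd))

# Usage on Gaussian white noise with unit variance (illustrative data only)
rng = np.random.default_rng(0)
noise = rng.standard_normal(20_000)
print([round(codifference(noise, k), 3) for k in range(4)])
```

For Gaussian white noise with unit variance the value at lag 0 should be close to the variance and the values at nonzero lags close to zero, in line with the fact that the codifference reduces to the autocovariance in the Gaussian case.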

Covariation.
We estimate the covariation using formula (4); therefore the spectral measure $\Gamma$ should be estimated first. In general, the spectral measure of a $d$-dimensional stable random vector is a measure over the unit sphere in $\mathbb{R}^d$. It can be continuous as well as discrete. In practice, we use the fact proved by Byczkowski et al. [37] that every spectral measure can be approximated by a discrete one. For this reason, formula (4) can be estimated in the following way:

$$\widehat{CV}(X_1, X_2) = \sum_{k=1}^{n} s_{1,k}\, s_{2,k}^{\langle \alpha - 1 \rangle}\,\Gamma^{*}(\mathbf{s}_k), \tag{10}$$

where $\Gamma^{*}$ is the discrete approximation of the spectral measure $\Gamma$ and $\mathbf{s}_k = (s_{1,k}, s_{2,k})$ for $k = 1, 2, \ldots, n$ are the locations of the masses of $\Gamma^{*}$ on the unit circle.
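Once the masses of a discrete spectral measure are available, the covariation is a weighted sum of signed powers. The sketch below assumes the masses and their unit-circle locations are already given; the function names and the example measures are illustrative and do not come from the paper.

```python
import numpy as np

def signed_power(a, p):
    """Signed power a^<p> = |a|^p * sign(a)."""
    return np.sign(a) * np.abs(a) ** p

def covariation_discrete(points, masses, alpha):
    """Covariation of X1 on X2 for masses located at unit-circle points of shape (n, 2)."""
    points = np.asarray(points, dtype=float)
    masses = np.asarray(masses, dtype=float)
    s1, s2 = points[:, 0], points[:, 1]
    return float(np.sum(s1 * signed_power(s2, alpha - 1.0) * masses))

alpha = 1.5
# Masses on the coordinate axes only: independent components
axes = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(covariation_discrete(axes, [0.25, 0.25, 0.25, 0.25], alpha))
# Masses on the diagonal: fully dependent components
a = 1 / np.sqrt(2)
print(covariation_discrete([(a, a), (-a, -a)], [0.5, 0.5], alpha))
```

When all the mass lies on the coordinate axes the product of the coordinates vanishes at every location, so the covariation is zero. For the diagonal measure the sum equals (1/√2)^α.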
In the literature one can find several approaches to spectral measure estimation, starting from methods based on empirical characteristic functions of bivariate vectors [38,39], through spherical harmonic analysis [40] and quantile lines [41], to the generalized empirical likelihood method [42]. In this work, for illustration purposes, we use the method of estimating the discrete spectral measure proposed by Nolan et al. [38], which is based on one-dimensional projections of bivariate vectors and empirical characteristic functions. The method is presented in the following part.
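In the next step, for all samples $\langle \mathbf{X}_1, \mathbf{t}_j\rangle, \langle \mathbf{X}_2, \mathbf{t}_j\rangle, \ldots, \langle \mathbf{X}_N, \mathbf{t}_j\rangle$ we estimate the one-dimensional parameters $\alpha(\mathbf{t}_j)$, $\sigma(\mathbf{t}_j)$, $\beta(\mathbf{t}_j)$, $\mu(\mathbf{t}_j)$ of the univariate α-stable random variables using the McCulloch empirical quantile method [43]. Then we derive the empirical estimator of the characteristic function's exponent:

$$\hat I(\mathbf{t}_j) = \hat\sigma(\mathbf{t}_j)^{\hat\alpha(\mathbf{t}_j)}\,\bigl(1 - i\,\hat\beta(\mathbf{t}_j)\tan(\pi\hat\alpha(\mathbf{t}_j)/2)\bigr), \qquad \hat\alpha(\mathbf{t}_j) \neq 1. \tag{11}$$

The obtained vector $\hat{\mathbf{I}} = [\hat I(\mathbf{t}_1), \ldots, \hat I(\mathbf{t}_n)]$ is the projection estimator of $I$. Since we estimate a discrete approximation of the spectral measure, we assume that all its mass is concentrated in points $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_n$ of the unit circle $S_1$. Therefore, we have

$$I(\mathbf{t}_j) = \sum_{k=1}^{n} \psi_{\alpha}(\langle \mathbf{t}_j, \mathbf{s}_k\rangle)\,\Gamma^{*}(\mathbf{s}_k), \qquad j = 1, \ldots, n, \tag{12}$$

where $\Gamma^{*}$ is the discrete approximation of the true spectral measure $\Gamma$ and the function $\psi_{\alpha}$ is defined in (2). Nolan et al. in [38] suggest taking $\mathbf{s}_j = \mathbf{t}_j$ for $j = 1, \ldots, n$ and assume that $n = 2m$, $m \in \mathbb{N}$; then the empirical version of equation (12) can be rewritten in the following matrix form:

$$\hat{\mathbf{c}}^{T} = \hat{A}\,\boldsymbol{\gamma}^{T}, \qquad \boldsymbol{\gamma} = [\Gamma^{*}(\mathbf{s}_1), \ldots, \Gamma^{*}(\mathbf{s}_n)], \tag{13}$$

where $\hat{\mathbf{c}} = [c_1, c_2, \ldots, c_n]$ is a row vector with elements

$$c_j = \begin{cases} \Re\,\hat I(\mathbf{t}_j), & j = 1, \ldots, m, \\ \Im\,\hat I(\mathbf{t}_{j-m}), & j = m+1, \ldots, n, \end{cases} \tag{14}$$

and $\hat{A} = [a_{j,k}]_{j,k=1}^{n}$ is a square matrix with the following entries:

$$a_{j,k} = \begin{cases} \Re\,\psi_{\alpha}(\langle \mathbf{t}_j, \mathbf{s}_k\rangle), & j = 1, \ldots, m, \\ \Im\,\psi_{\alpha}(\langle \mathbf{t}_{j-m}, \mathbf{s}_k\rangle), & j = m+1, \ldots, n. \end{cases} \tag{15}$$

In the above formulas $\Re$ and $\Im$ denote the real and imaginary parts of a complex number, respectively.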
To avoid numerical problems with inverting formula (13), such as obtaining spectral measure masses of negative sign, we can obtain $\Gamma^{*}$ by minimizing a functional:

$$\hat{\boldsymbol{\gamma}} = \arg\min_{\boldsymbol{\gamma} \ge 0} \bigl\|\hat{\mathbf{c}}^{T} - \hat{A}\,\boldsymbol{\gamma}^{T}\bigr\|^{2}, \tag{16}$$

where $\boldsymbol{\gamma} = [\Gamma^{*}(\mathbf{s}_1), \ldots, \Gamma^{*}(\mathbf{s}_n)]$ is the vector of spectral measure masses. In our numerical experiments on estimating the covariation we implemented the algorithms presented above in MATLAB, using the fmincon routine. We also note that another method of estimating the covariation was suggested previously in the literature [44]. This method leads to an estimate of the normalized covariation and is based on the $p$th moment approach. In order to obtain the empirical covariation corresponding to definition (4) with the method introduced in [44], one needs to know the theoretical stability index and the scale parameter of the appropriate random variable. From the practical point of view, this means that both of them have to be estimated. We have compared that method to the one proposed in this paper, and the results suggest that the methodology proposed here leads to better estimates of the true values for the theoretical models examined. One can also find other approaches to calculating the covariation from financial data [45].
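The constrained fitting step can be illustrated with a small self-contained sketch. Several assumptions are made here and none of them comes from the paper: the grid is equally spaced on the unit circle with s_k = t_k, α differs from 1, the exponent values are computed exactly from a known discrete measure instead of being estimated with the McCulloch method, and a plain projected-gradient iteration replaces the MATLAB fmincon routine. All function names are hypothetical.

```python
import numpy as np

def psi(u, alpha):
    """psi_alpha(u) = |u|^alpha * (1 - i*sign(u)*tan(pi*alpha/2)), valid for alpha != 1."""
    return np.abs(u) ** alpha * (1 - 1j * np.sign(u) * np.tan(np.pi * alpha / 2))

def unit_circle_grid(n):
    """n equally spaced points on the unit circle (assumed grid, n even)."""
    ang = 2 * np.pi * np.arange(n) / n
    return np.column_stack([np.cos(ang), np.sin(ang)])

def build_system(I_hat, alpha):
    """Real system c = A @ gamma built from the first m = n/2 grid directions."""
    n = len(I_hat)
    m = n // 2
    t = unit_circle_grid(n)
    P = psi(t[:m] @ t.T, alpha)            # m x n matrix psi(<t_j, s_k>)
    A = np.vstack([P.real, P.imag])        # real parts on top, imaginary parts below
    c = np.concatenate([I_hat[:m].real, I_hat[:m].imag])
    return A, c

def nonneg_least_squares(A, c, iters=5000):
    """Projected gradient for min ||c - A g||^2 subject to g >= 0."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
    g = np.zeros(A.shape[1])
    for _ in range(iters):
        g = np.maximum(g - step * (A.T @ (A @ g - c)), 0.0)
    return g

# Usage: recover known masses from noise-free exponent values (illustrative numbers)
alpha = 1.5
gamma_true = np.array([0.3, 0.0, 0.7, 0.2])
t = unit_circle_grid(4)
I_exact = psi(t @ t.T, alpha) @ gamma_true
A, c = build_system(I_exact, alpha)
print(nonneg_least_squares(A, c))
```

Projecting onto the nonnegative orthant after every gradient step is what prevents masses of negative sign. With estimated exponent values and finer grids the system can be poorly conditioned, so a dedicated constrained solver may be preferable in practice.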

Application for Local Damage Detection in Rotating Machines
Local damage (crack, pitting, spall, breakage, etc.) in mechanical components produces short-time (impulsive), wideband disturbances in the measured vibration. Undetected local damage can develop into a more severe one and lead to the breakdown of the whole machine. Thus, local damage detection is one of the most widely explored problems in modern condition monitoring. In industrial reality the detection of such damage may be difficult due to a poor signal-to-noise ratio and specific properties of the informative signal. Localized damage causes a significant local increase of interaction between the surfaces in contact. This means that at these time instants the forces and moments are several (or more) times bigger than during normal operation. This accelerates degradation and may cause catastrophic failure much more quickly than distributed damage.
Vibration analysis seems to be the most effective approach to this problem. The mechanism of generation of informative signals is well recognized [1][2][3][4]46]. A local change of stiffness associated with a crack or loss of surface causes an impulsive disturbance in the signal. Due to the rotation of the elements, these disturbances should be cyclic [5]. These two features, impulsiveness and periodicity, are the basis for damage detection. However, detecting the impulsive behavior or its period is often difficult. Thus, many different decomposition techniques can be applied to the raw signal. In this paper we use one of the best-known decomposition methods, namely, time-frequency decomposition via the Short-Time Fourier Transform (spectrogram).
The time series from the spectrogram corresponding to a given frequency are called subsignals. They are analyzed using appropriate statistics (called selectors). Undoubtedly, the most popular statistic has been kurtosis [6], one of the measures that can point out the frequency bins of the time-frequency map with the most impulsive nature. However, for many real signals the kurtosis-based approach does not give the expected results because it can be sensitive to impulses not associated with damage (related to artifacts, non-Gaussian noise, or even regular operation of other parts of the machine, e.g., valves).
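The sensitivity of a kurtosis-based selector to a single artifact can be seen on a toy example. The sketch below is illustrative only: it computes the sample kurtosis (fourth standardized moment) of each row of a time-frequency matrix, and both the function name and the two synthetic subsignals are assumptions made for the example.

```python
import numpy as np

def kurtosis_rows(S):
    """Sample kurtosis of each row (subsignal) of a time-frequency matrix S."""
    S = np.asarray(S, dtype=float)
    d = S - S.mean(axis=1, keepdims=True)
    m2 = np.mean(d ** 2, axis=1)           # second central moment
    m4 = np.mean(d ** 4, axis=1)           # fourth central moment
    return m4 / m2 ** 2                    # rows with zero variance give nan/inf

# Row 0: regular oscillation; row 1: silence with one isolated spike (an "artifact")
regular = np.tile([1.0, -1.0], 50)
artifact = np.zeros(100)
artifact[37] = 1.0
print(kurtosis_rows(np.vstack([regular, artifact])))
```

A single isolated spike yields a very large kurtosis even though nothing in that subsignal is cyclic, which is the kind of false indication described above.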
The idea of spectral kurtosis was extended, and other selectors were proposed for local damage detection [5,[47][48][49]. Generally, these selectors are constructed under the assumption that the distribution of a subsignal corresponding to the healthy condition is closer to Gaussian than the distribution of a subsignal corresponding to the damaged condition. The idea of modeling the subsignals from the time-frequency representation was also extended to a more general class of distributions, namely, α-stable [21,50]. As mentioned in the previous sections, α-stable distributions are especially important for modeling data with visible jumps; therefore they can be appropriate for describing subsignals related to damage. In this case the value of the α parameter is significantly smaller than 2. On the other hand, for α close to 2 the stable distribution is close to the Gaussian one; therefore the class of α-stable distributions can also be used for modeling subsignals in the healthy case. For this reason the α-stable approach seems to be appropriate in the problem of local damage detection.
To get information about hidden periodicity in the signal, the so-called Spectral Correlation Density map and spectral coherence have been calculated as bifrequency maps. One of the most popular algorithms is simple: decompose the signal into a set of df frequency bins using the spectrogram and then calculate the spectrum for each bin. Unfortunately, the signal is nonstationary and highly impulsive for a set of frequency bins, so, at least from the theoretical point of view, using classical techniques is not appropriate. We therefore propose to use the α-stable perspective and, consequently, measures adequate for infinite variance processes, namely, the codifference and the covariation described in the previous sections.
Visualization of such measures for each subsignal results in a new form of lag-frequency map that presents diagnostic information in a very clear way [21]. A similar approach without the assumption of an α-stable distribution of the subsignals was presented in [51].
Mining machines are among the most complex machines in industry, with a complex structure, high power, time-varying load, and so on. In this paper we concentrate on the belt conveyor system, commonly used for continuous transport of bulk material (coal, overburden, copper ore, etc.) in both opencast and underground mines. Depending on the design (the power required by the machine to move), a belt conveyor driving station may consist of one to four drive units with 630 or 1000 kW power each. In our research a 630 kW two-stage gearbox was considered. Real data analysis was performed on the vibration signal of a two-stage gearbox that operates in an open-pit mine and transfers torque from an engine to a belt conveyor pulley. Measurements were made using a Bruel-Kjaer Pulse system with the following acquisition parameters: signal length 2.5 s and sampling frequency 16384 Hz. Sensors were mounted on the housing of the gearbox.
The gearbox was operating under regular conditions. The load was not measured directly; however, based on visual inspection of the material stream transported on the belt, it was assumed that the gearbox was properly loaded. The sampling frequency used in the data acquisition is one of the typical values for this measurement system (a power of 2). It is worth noting that mining machines are high-power machines and their characteristic frequencies are relatively low. In the further analysis it can be seen that the higher frequency bands (above a couple of kHz) contain no information. The scheme of the gearbox is presented in Figure 1, with the frequencies of the shafts presented in Table 1. As one can see, the frequencies of the shafts ($f_{01}$ of shaft 1, $f_{02}$ of shaft 2, and $f_{03}$ of shaft 3), the gear mesh frequencies ($f_{12}$ for the first stage, $f_{34}$ for the second stage), and the ratios ($u_i$) can be easily estimated. We have also given the calculated frequencies for the investigated gearbox. The first shaft of the investigated gearbox rotates at 995 RPM. The speeds of the second and third shafts are equal to 246 RPM and 78 RPM, respectively. The ratios of the first and second stages are equal to $u_1 = 4.04$ and $u_2 = 3.14$, respectively. The numbers of teeth on the gears are as follows: $z_1 = 23$, $z_2 = 93$, $z_3 = 36$, and $z_4 = 113$. However, we were not able to inspect the condition of the gears visually. On the other hand, according to the state of the art and our own experience, we know that cyclic impulsive …

In Figure 2 we present both the time waveform of the acquired signal and its time-frequency decomposition, namely, the spectrogram. We remind the reader that the spectrogram is the squared absolute value of the Short-Time Fourier Transform, defined for time point $t$ and frequency $f$ as follows [5]:

$$\mathrm{STFT}(t, f) = \sum_{k=0}^{N-1} x_k\, w(t - k)\, e^{-2\pi i f k / N}, \tag{17}$$

where $w(t - k)$ is the shifted window and $x_0, \ldots, x_{N-1}$ is the input signal. The impulsive behavior of the analyzed signal cannot be easily observed in the time domain. Cyclic changes of the amplitude are present due to the high energy in the low-frequency band, which is responsible for the deterministic component of the signal.
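A spectrogram whose rows can be used directly as subsignals may be computed as follows. This is a minimal sketch and not the processing chain used for the reported results: the Hann window, the window length, the hop size, and the synthetic test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(x, nwin=64, hop=16):
    """Squared magnitude of the STFT; rows are frequency bins, columns are time frames."""
    x = np.asarray(x, dtype=float)
    w = np.hanning(nwin)                               # assumed window shape
    starts = range(0, len(x) - nwin + 1, hop)
    frames = np.array([x[s:s + nwin] * w for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

# Usage: a 128 Hz tone sampled at 1024 Hz concentrates its energy in one bin
fs = 1024
time = np.arange(fs) / fs
S = spectrogram(np.sin(2 * np.pi * 128 * time))
bin_width = fs / 64                                    # frequency resolution in Hz
print(S.shape, int(np.argmax(S.mean(axis=1))) * bin_width)
```

Each row `S[i, :]` is the time series of energy in one frequency bin, that is, one subsignal to which a selector or a dependence measure can be applied.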
Observing the spectrogram (see Figure 2(b)), one can point out three main types of frequency bands. The first type has high energy (red band on the spectrogram) and is located at low frequencies (below 1 kHz). It carries strong energy and is thus responsible for the amplitude modulation of the signal, as can be seen in Figure 2(a). One can see impulsive behavior repeating 4 times per second. However, the second fault (16.5 Hz) is not visible there due to its lower energy in comparison to the first one. The second type of frequency band has low energy and is connected with high-frequency additive noise in the signal (>7 kHz). In this band we can observe abnormal behavior at 0.2 s. It is related to an artifact, an impulsive event which is not connected with the fault. The third type of band contains information about the fault and is usually called an informative frequency band. In this band we can observe mainly impulsive behavior. In our case we can see that impulses are present in the bands 2-3.5 kHz and 4-5 kHz. The energy of the impulses significantly exceeds the energy of the background noise.
As the next step of our analysis we estimated the codifference for each of the subsignals extracted from the time-frequency decomposition, each related to a certain frequency bin. Combining the estimates, one obtains a lag-frequency map of the codifference. Figure 3 shows the result for the investigated signal.
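The construction of such a lag-frequency map can be sketched as follows. The code is an illustration under assumptions rather than the authors' implementation: the empirical characteristic function is normalized by the full length N, the real part of the codifference is kept, and the two synthetic subsignals stand in for rows of a real spectrogram. The function names are hypothetical.

```python
import numpy as np

def codifference(x, k):
    """Empirical codifference of a single trajectory x at lag k (real part)."""
    n = len(x)
    a, b = x[k:], x[:n - k]                       # pairs (x[j+k], x[j])
    phi = lambda u, v: np.sum(np.exp(1j * (u * a + v * b))) / n
    return float(np.real(np.log(phi(1, -1)) - np.log(phi(1, 0)) - np.log(phi(0, -1))))

def codifference_map(S, max_lag):
    """Rows: frequency bins (subsignals) of S; columns: lags 0..max_lag."""
    S = np.asarray(S, dtype=float)
    return np.array([[codifference(row, k) for k in range(max_lag + 1)] for row in S])

# Usage: one subsignal with an impulse every 10 samples, one empty subsignal
cyclic = np.zeros(200)
cyclic[::10] = 1.0
cmap = codifference_map(np.vstack([cyclic, np.zeros(200)]), max_lag=15)
print(cmap.shape, int(np.argmax(cmap[0, 1:])) + 1)
```

In the row with cyclic impulses the largest value over nonzero lags appears at the lag equal to the impulse period, which is the pattern that the map makes visible for every frequency bin at once.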
It can be observed that the lag-frequency representation is an improvement over the spectrogram. In both informative frequency bands the fault is more visible. In these bands one can easily observe impulsive behavior, with peaks separated by 33 lags, which translates to 0.0606 s and corresponds to 16.5 Hz. The uncertainty of the location of the impulses is related to the resolution of the spectrogram. Increasing the resolution allows a more precise location of the fault in the new lag-frequency representation, at the cost of increased computational time. One can observe that the low-frequency band, which contained no information in the spectrogram, now holds important information about the second fault. We observe recurring impulses with peaks separated by 132 lags, which translates to 0.243 s and corresponds to 4.11 Hz, close to the real fault frequency.
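The conversion from lags to fault frequencies, and the comparison with the shaft speeds, is simple arithmetic. In the sketch below the duration of one lag is an assumption inferred from the reported pair "33 lags = 0.0606 s" and is not stated explicitly in the text; the helper names are illustrative.

```python
def lag_to_frequency(lag, lag_step):
    """Frequency in Hz of a pattern repeating every `lag` lags of `lag_step` seconds each."""
    return 1.0 / (lag * lag_step)

def shaft_frequency(rpm):
    """Rotational frequency in Hz for a shaft speed given in revolutions per minute."""
    return rpm / 60.0

lag_step = 0.0606 / 33      # assumed seconds per lag, inferred from 33 lags = 0.0606 s
for lag, rpm in ((33, 995.0), (132, 246.0)):
    print(lag, round(lag_to_frequency(lag, lag_step), 2), round(shaft_frequency(rpm), 2))
```

With these numbers the two cyclic patterns fall within about 0.1 Hz of the rotational frequencies of the first (995 RPM) and second (246 RPM) shafts.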
Summing up, the new lag-frequency representation provides better overall information than the spectrogram. In Figure 4 we present the estimated covariation map. This map is constructed similarly to the previous one; however, for each subsignal from the spectrogram we calculate the empirical covariation using the new algorithm. It can be noted that the two frequency bands in which repeating sequences are present are similar to the informative frequency bands from the spectrogram (except for the low-frequency band containing information about the 4.1 Hz fault).
It can be concluded that this representation provides a clearer view for the selection of the informative frequency band. Furthermore, the repeating sequences allow us to detect the fault. The dots in the informative frequency bands are located at peaks separated by 33 lags, which translates to 0.0606 s and corresponds to 16.5 Hz, the frequency of the fault.
As mentioned earlier, the kurtosis-based approach does not give the expected results, because kurtosis is sensitive to single impulses (e.g., artifacts). In Figure 5 we present the kurtogram of the investigated signal. The artifact has a wideband frequency signature; however, it is more visible in the low-energy high-frequency band. Thus, the kurtosis in this band is significantly higher than for the impulsive behavior in the informative frequency bands. The highest value of the kurtosis is reached at the dyad [8106.6667 Hz; 170.6667 Hz], which results in the frequency band [7936 Hz; 8277.3334 Hz], which is not the correct one.
Summing up, the new lag-frequency representations, when combined, provide substantial information about the fault and its location in the carrier frequency. Furthermore, using both of them allows easier and more precise detection of the fault frequency. The features are more clearly visible in the new lag-frequency representations than in the spectrogram.

Conclusions
In this paper the problem of local damage detection based on vibration signal analysis is discussed. Real signals acquired in a noisy environment often exhibit impulsive behavior which is not related to the fault. Therefore the classical methods of fault detection seem to be inappropriate in this case. The main point of our methodology is the assumption that the examined signals exhibit heavy-tailed behavior and are modeled by the α-stable distribution. In general, local damage manifests itself as periodic behavior of the signal. However, this periodicity is very often not visible in the raw signal but can be recognized by statistics that measure dependence between data. Under the assumption of the α-stable distribution we propose to consider empirical measures of dependence adequate for such distribution-based models, namely, the codifference and the covariation. Because the problem of codifference estimation has already been addressed in the literature, the main attention of this paper is paid to the covariation estimation method. We give a complete description of the new estimation technique and show its application to local damage detection based on a real signal from a mining machine working in a harsh environment. The proposed techniques are compared with the classical kurtosis-based approach.

Figure 3: Codifference map of the investigated signal.

Figure 4: Covariation map of the investigated signal.