Algorithm Indicating Moment of P-Wave Arrival Based on Second-Moment Characteristic

The moment of P-wave arrival can provide much information about the nature of a seismic event. Without adequate knowledge of the onset moment, many properties of the event related to location, polarization of the P-wave, and so forth are impossible to obtain. In order to save the time required to indicate the P-wave arrival moment manually, one can benefit from automatic picking algorithms. In this paper two algorithms based on a method for finding a regime switch point are applied to seismic event data in order to find the P-wave arrival time. The algorithms operate on signals transformed via a basic transform rather than on raw recordings. They involve partitioning the transformed signal into two separate series, fitting a logarithm function to the first subset (which corresponds to pure noise and is therefore considered stationary) and an exponential or power function to the second subset (which corresponds to the nonstationary seismic event), and finding the partition point at which these functions best fit the statistic in terms of the sum of squared errors. The effectiveness of the algorithms is tested on seismic data acquired from the O/ZG "Rudna" underground copper ore mine, with moments of P-wave arrival initially picked by the broadly known STA/LTA algorithm and then corrected by seismic station specialists. The results of the proposed algorithms are compared to those obtained using STA/LTA.


Introduction
Obtaining accurate information about seismic phenomena induced by mining activity can be a difficult task. The recordings depend strongly on the distance between the source and the measuring device, the energy of the event, the lithology of the rock mass, device parameters, noise induced by the transmission line, and so forth. In order to acquire exact features of the event (e.g., its 3-dimensional location), recordings from at least four different one-axial sensors are required.
When a seismic event occurs, its energy is transported via different types of seismic waves, which can be primarily classified as body waves (P-wave, S-wave) and surface waves (Rayleigh wave, Love wave, and Stoneley wave). P-waves have the highest velocity of all these waves; thus they mark the onset of the event. Therefore, in order to obtain detailed information about a particular phenomenon, the first step is to indicate its moment of P-wave arrival.
From a mathematical point of view the problem is equivalent to finding the moment at which a time series loses stationarity (as the background noise is considered to be stationary), or to finding a structural break point.
The moment of P-wave arrival is commonly used in the estimation of event location [1], energy [2], and focal mechanism [3]. Determining such a moment manually is time-consuming and requires considerable experience. With the development of science and technology, however, many automatic P-wave picking algorithms have been proposed. Implementing and using such methods is much faster but not 100% reliable, as the results frequently differ from the indications given by seismic station specialists. Thus the algorithms are frequently used to produce an initial pick, which is then manually corrected by experts.
Many different algorithms exist so far; they can be divided into two main groups: those operating in the time domain and those operating in the frequency domain [4]. Broadly known time domain methods include AR-AIC [5,6], which fits an autoregressive model to the data and places the moment of P-wave arrival at the point where the Akaike Information Criterion [7] is minimized, and the STA/LTA algorithm [8,9], which for a fixed characteristic function (e.g., the square of the signal) computes its average over a short and a long time window and indicates the onset time when the ratio of the averages exceeds a predefined value. The moment of P-wave arrival may also be determined with the use of neural networks [10,11] and methods based on the wavelet transform [12,13], the spectrogram [14,15], and cross-correlation [16].
When dealing with the problem of the P-wave arrival moment, one may treat it as an element of a signal segmentation procedure [17,18], as the indication of the onset moment is the basis for segmentation. Common methods are often used in both problems.
Recently, a method of finding a critical point which divides the time series into two stationary parts with different variances has been proposed [19]. The basis for this method is a statistical property of the second central moment; that is, the expected value of the cumulative sum of squares of a stationary time series increases linearly with time. This property is independent of the underlying probability distribution, as long as the variance is finite. The method has already been utilized in a structural break detection method [20]. We decided to apply this idea to P-wave arrival point estimation. However, the entire seismic event does not possess the stationarity property, nor can it be split into two stationary time series. Thus the method requires a modification. In this paper two similar methods are proposed and compared to the widely used STA/LTA algorithm. All of the investigated automatic P-wave picking methods are compared with arrivals indicated by the seismic station specialists of the O/ZG "Rudna" underground copper ore mine, who have extensive experience in the analysis of mining-induced seismic events.
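The linear growth of the cumulative sum of squares mentioned above can be checked in one line. The following is a brief sketch, assuming a zero-mean stationary series $x_1, \ldots, x_n$ with finite variance $\sigma^2$ (the symbols and the zero-mean assumption are introduced here for illustration):

```latex
\mathbb{E}\left[\sum_{i=1}^{j} x_i^{2}\right]
  = \sum_{i=1}^{j} \mathbb{E}\left[x_i^{2}\right]
  = j\,\sigma^{2}, \qquad j = 1, \ldots, n .
```

Only linearity of expectation is used, which is why no particular distribution is required. If the variance changes at some point, the slope of this line changes there as well, and that change is what a break point search can look for.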
The rest of the paper is organized as follows: in Methodology, the new method of structural break detection is presented. Moreover, we recall the STA/LTA algorithm (the classical method used for detection of the P-wave arrival time). Next, in Section 3, Application to Real Seismic Data, the new methodology is applied to real seismic signals. The obtained results are compared with the STA/LTA technique. The last section contains conclusions.

Methodology
2.1. STA/LTA Algorithm.
One of the classical algorithms often used for detecting the moment of P-wave arrival is based on the short-term-average and long-term-average (STA/LTA) trigger method. The underlying idea of this method is to evaluate, in a continuous fashion, the value of a characteristic function (CF) of a seismic signal in two moving time windows (one short and one long) in order to detect the seismic event. The characteristic function can be defined as the energy, absolute amplitude, or envelope function of the microseismic trace. Irrespective of the definition of the CF, the short time window (STA) is supposed to measure the instantaneous amplitude of the seismic signal, whereas the long time window (LTA) provides information about the amplitude of the seismic noise. When their ratio exceeds a predefined value $\lambda$ (the activation threshold), the following recorded samples are marked as event-driven until the ratio falls below another predefined value $\lambda_{\mathrm{off}}$, called the deactivation threshold. In this algorithm, for a raw signal $x_1, \ldots, x_n$, the following statistic $\mathrm{SLR}_i$ is calculated:

$$\mathrm{SLR}_i = \frac{\frac{1}{s}\sum_{j=i-s+1}^{i} \mathrm{CF}(x_j)}{\frac{1}{l}\sum_{j=i-l+1}^{i} \mathrm{CF}(x_j)}, \qquad i = l, \ldots, n,$$

where $s$ and $l$ denote the lengths of the short and long time windows (in samples), respectively, and $\mathrm{CF}(\cdot)$ is the chosen characteristic function. In this paper we consider the CF defined in terms of signal energy, $\mathrm{CF}(x) = x^2$.
In the STA/LTA algorithm the $\mathrm{SLR}_i$ statistic is inspected and, on this basis, the moment of P-wave arrival is detected. This moment is the minimum $i$ for which the STA/LTA ratio exceeds the predefined threshold $\lambda$; that is, $\min\{i : \mathrm{SLR}_i > \lambda\}$. In this paper we compare the classical approach based on the STA/LTA algorithm with the new algorithm based on the cumulative empirical second moment of the given raw signal.
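The trigger described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name `sta_lta_pick` is hypothetical, and it assumes that both windows end at the current sample and that the characteristic function is the squared signal.

```python
from itertools import accumulate


def sta_lta_pick(signal, short_len, long_len, threshold):
    """Return the first 0-based index where STA/LTA exceeds threshold, or None."""
    if not 0 < short_len < long_len:
        raise ValueError("expected 0 < short_len < long_len")
    # Characteristic function CF(x) = x^2 (signal energy).
    cf = [x * x for x in signal]
    # Prefix sums turn every window average into an O(1) lookup.
    prefix = [0.0] + list(accumulate(cf))
    # Assumption: both windows end at sample i, so scanning starts
    # once the long window is completely filled.
    for i in range(long_len - 1, len(cf)):
        sta = (prefix[i + 1] - prefix[i + 1 - short_len]) / short_len
        lta = (prefix[i + 1] - prefix[i + 1 - long_len]) / long_len
        if lta > 0 and sta / lta > threshold:
            return i
    return None  # the ratio never exceeded the threshold
```

A call such as `sta_lta_pick(x, 20, 320, 2.2)` returns the first triggered sample index, or `None` when the ratio never exceeds the threshold, which corresponds to a missed arrival.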

Algorithm Based on the Empirical Second Moment.
As mentioned above, the proposed method is based on the empirical second moment of the given raw signal $x_1, \ldots, x_n$. First, we introduce the statistic which is the cumulative second moment of the given sample:

$$C(j) = \sum_{i=1}^{j} x_i^2, \qquad j = 1, \ldots, n.$$

The $C(j)$ statistic was used in [19] as the basis of a method for the segmentation problem in the case when some characteristics of real data change with respect to time. This statistic was also the main element of the procedure testing whether a structural break point exists in the given sample.
In this paper we extend the methodology presented in [19] and propose to analyze the following statistic:

$$\mathrm{LC}(j) = \log C(j) = \log\left(\sum_{i=1}^{j} x_i^2\right), \qquad j = 1, \ldots, n.$$

This choice is motivated by the characteristics of seismic recordings and is discussed in further sections. As one can expect, $\mathrm{LC}(j)$ diverges to $-\infty$ if at least the first reading is equal to 0. In order to avoid this problem we modify the raw signal: in the further analysis, instead of $x_1, \ldots, x_n$, we use the series in which the first reading $x_1$ is replaced with the first nonzero reading. This technical issue concerns a single sample at the very beginning of the recording; thus it does not influence the results. We denote the corrected series as $\tilde{x}_1, \ldots, \tilde{x}_n$. Until the moment of P-wave arrival, the seismic recordings $\tilde{x}_i$ consist of ambient noise, which is considered stationary [21], and they can be described by independent identically distributed Gaussian random variables. Moreover, we assume that the theoretical second moment of the distribution is finite. It can be shown that for data before the moment of P-wave arrival we have the following:

$$\mathbb{E}\,C(j) = \sigma^2 j, \qquad \text{so that} \qquad \log \mathbb{E}\,C(j) = \log \sigma^2 + \log j,$$

where $\sigma^2$ denotes the second moment of the noise. Our methodology is based on this observation.
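As an illustration of the two statistics, the sketch below computes the cumulative second moment and its logarithm, including the replacement of the first reading by the first nonzero one. The function names are hypothetical and the code is a minimal sketch rather than the implementation used in the study.

```python
import math
from itertools import accumulate


def cumulative_second_moment(signal):
    """Cumulative sum of squares: x_1^2 + ... + x_j^2 for j = 1..n."""
    return list(accumulate(x * x for x in signal))


def log_cumulative_second_moment(signal):
    """Logarithm of the cumulative sum of squares of the corrected series."""
    # Replace the first reading with the first nonzero reading so that
    # the logarithm is finite from the very first sample.
    first_nonzero = next((x for x in signal if x != 0), None)
    if first_nonzero is None:
        raise ValueError("signal contains only zeros")
    corrected = [first_nonzero] + list(signal[1:])
    return [math.log(c) for c in cumulative_second_moment(corrected)]
```

For a noise-like series with constant squared amplitude $\sigma^2$, the cumulative sum equals $\sigma^2 j$ exactly, so the log statistic equals $\log \sigma^2 + \log j$, which is the shape the logarithm fit relies on.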
In the procedure, in contrast to [19], we fit the logarithm function $f_1(t) = a + b\log(t)$ to the first $k$ points of the log-transformed cumulative second-moment statistic $\mathrm{LC}(j)$.
After the moment of P-wave arrival the character of the $\mathrm{LC}(j)$ statistic changes. It is not exactly known what kind of function can be observed after the moment of P-wave arrival; however, it was noted that in general the statistic is concave with respect to $j$. Here we decided to test two different concave functions: exponential, $f_2(t) = \exp(bt) + \mathrm{const}$, and power, $f_3(t) = t^{b} + \mathrm{const}$. These functions are fitted with a time shift; that is, they are evaluated at $t - k$ for $t = k + 1, \ldots, n$. In order to reduce computational time we subtract $\mathrm{LC}(k)$ or $\mathrm{LC}(k + 1)$ and then fit the exponential or power function, respectively. The coefficients of the fitted functions are obtained using the Levenberg-Marquardt algorithm (LMA) [22,23], an iterative algorithm used to solve nonlinear least squares problems. It combines features of the Gauss-Newton method and the method of gradient descent [24]. The LMA requires at least 3 points to fit the considered functions. The next step is to calculate the squared errors between $\mathrm{LC}(j)$ and the fitted functions. The estimated point of P-wave arrival is the $k$ for which the error is minimized. The entire detection algorithm can be described as follows: (1) Set $k = 3$.
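The search over the split point described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the second segment is modelled as `c + a * (t - k) ** b`, and the Levenberg-Marquardt fit is replaced here by a simpler stand-in (a small grid over the exponent `b` with `a` and `c` solved by ordinary least squares) so that the example needs only the standard library.

```python
import math


def _ols_sse(u, y):
    """Fit y ~ c + a*u by least squares; return the sum of squared errors."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suy = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y))
    a = suy / suu if suu > 0 else 0.0
    c = my - a * mu
    return sum((yi - c - a * ui) ** 2 for ui, yi in zip(u, y))


def pick_arrival(lc, exponents=(0.1, 0.25, 0.5, 0.75, 1.0), min_pts=3):
    """Return k, the number of pre-arrival samples minimising the total SSE.

    lc holds the log of the cumulative sum of squares for j = 1..n.
    """
    n = len(lc)
    best_k, best_err = None, math.inf
    for k in range(min_pts, n - min_pts + 1):
        # Noise part: a + b*log(t) fitted to the first k points, t = 1..k.
        head = _ols_sse([math.log(t) for t in range(1, k + 1)], lc[:k])
        # Event part: c + a*(t - k)^b for t = k+1..n (time-shifted),
        # keeping the best exponent from the grid (assumption, not LMA).
        tail_y = lc[k:]
        tail = min(
            _ols_sse([s ** b for s in range(1, n - k + 1)], tail_y)
            for b in exponents
        )
        if head + tail < best_err:
            best_k, best_err = k, head + tail
    return best_k
```

The returned `k` counts the samples attributed to noise, so the estimated onset is sample `k + 1` in 1-based indexing. A nonlinear least squares routine could replace the exponent grid to follow the text more closely, at the cost of an external dependency.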

Application to Real Seismic Data
In this paper the proposed algorithm was applied to 188 single-event recordings from the O/ZG "Rudna" underground copper ore mine. The signals were gathered by the ELOGOR-C seismic system, which is used for rock mass observation. The system consists of 2 sets of 32 Willmore MK-III seismometers; each collects velocity data in the frequency band 0.5-150 Hz, which is adequate for mining-induced events. This band is sufficient for localization, seismic energy estimation, and focal mechanism indication by analysis of first motion direction, which is the basic purpose of the monitoring system. Microseismic events at higher frequencies are registered in this mine by a different system. The data is transmitted to the seismic station using analog transmission (frequency modulation) and sampled with a sampling frequency of 500 Hz. Due to the characteristics of the deposit, the seismic network is relatively flat, with a few additional sensors located in shafts. The analyzed signals date from August 1, 2015, to August 19, 2015. The event lengths range from 4.6 s to 33 s. Moments of P-wave arrival were indicated preliminarily using the STA/LTA algorithm and then manually corrected by seismic station experts.
In Figure 1 an exemplary seismic event is presented with the moment of P-wave arrival marked by a red cross. Figure 1(b) shows a zoom on the arrival time. The stationarity of the background noise before the arrival of the P-wave (red cross) is easy to spot.
Application of the cumulative second-moment statistic $C(j)$ can be seen in Figure 2.
As noted in [19], when the $C(j)$ statistic is applied to a stationary process with variance $\sigma^2$, its expected value is $\mathbb{E}\,C(j) = \sigma^2 j$. Seismic recordings before the moment of P-wave arrival (denoted $k$) fulfill the stationarity assumptions. However, strict application of the algorithm proposed in [19] cannot work properly, because after the P-wave arrives the series is no longer stationary. To verify this, $R^2$ statistics were computed for linear fits from the arrival point indicated by seismic station specialists to the end of the recording (see the example in Figure 2). The mean value of these $R^2$ statistics for the entire set of seismic records is 0.349, which is unacceptable. Thus a P-wave arrival indicated using an inappropriately fitted function might be false. Applying the logarithm to the $C(j)$ statistic can highlight the P-wave arrival, since the structural change is sudden in the case of the log-transformed statistic $\mathrm{LC}(j) = \log C(j)$, contrary to $C(j)$.
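For reference, the goodness-of-fit check described above can be sketched as the coefficient of determination of a least-squares line fitted to a segment of the statistic. The function name is hypothetical, and the segment is assumed to be indexed by $t = 1, \ldots, n$ and not constant.

```python
def r_squared_linear(y):
    """R^2 of the least-squares line fitted to y against t = 1..n."""
    n = len(y)
    t = range(1, n + 1)
    mt, my = (n + 1) / 2, sum(y) / n
    stt = sum((ti - mt) ** 2 for ti in t)
    sty = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y))
    slope = sty / stt
    intercept = my - slope * mt
    # Residual and total sums of squares.
    ss_res = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)  # assumed nonzero
    return 1.0 - ss_res / ss_tot
```

Values close to 1 indicate that a line describes the segment well, while low values, such as the mean of 0.349 reported above, indicate that a linear model is a poor description of the statistic after the arrival.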
In Figure 3 the values of the log-transformed statistic $\mathrm{LC}(j)$ are shown. With the use of the logarithm, the break point (Figure 3, marked with red cross) can be noticed much more easily than from $C(j)$ (Figure 2).
It is worth noting that the $\mathrm{LC}(j)$ statistic can be divided into 2 concave series, with the division point located at the moment of P-wave arrival (marked with red cross).
Figure 3 presents the quality of fit. The average $R^2$ statistic for the power function fit (fitted on the interval from the onset moment indicated by seismic station experts to the end of the recording) is equal to 0.976, and for the exponential function it is 0.974. This indicates that these functions appropriately approximate the $\mathrm{LC}(j)$ statistic.

Algorithm Results with Exponential Function Fitted.
In Figure 4 the results of the algorithm (exponential function fitted to the second part of the statistic) are shown. The analysis shows that 54.3% of the algorithm picks differ by no more than 10 samples (which corresponds to 0.02 s) from the moments indicated by seismic station experts, and 79.9% of the differences do not exceed 50 samples (0.1 s). The largest difference is 177 samples (0.354 s).

Figure 3: Log-transformed statistic $\mathrm{LC}(j)$ applied to the exemplary seismic signal presented in Figure 1 (a) and its zoom (b). Fit parameter $R^2$ = 0.9557 and 0.9591 for exponential and power function fitting, respectively.

Algorithm Results with Power Function Fitted.
The results presented in Figures 4 and 5 show that exponential and power fitting lead to similar results. With the power function fitted, 58.5% of the differences do not exceed 10 samples (0.02 s) and 83% do not exceed 50 samples (0.1 s). The largest difference is 255 samples (0.51 s).
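The accuracy figures quoted for both fits (share of picks within 10 and 50 samples, largest difference in samples and seconds) can be computed with a short helper. The sketch below is illustrative only: the function name and the dictionary keys are hypothetical, and the 500 Hz sampling frequency is taken from the data description.

```python
def pick_accuracy(auto_picks, expert_picks, tolerances=(10, 50), fs_hz=500.0):
    """Summarise differences (in samples) between automatic and expert picks."""
    abs_diffs = [abs(a - e) for a, e in zip(auto_picks, expert_picks)]
    n = len(abs_diffs)
    summary = {
        "max_abs_samples": max(abs_diffs),
        # Convert samples to seconds using the sampling frequency.
        "max_abs_seconds": max(abs_diffs) / fs_hz,
    }
    for tol in tolerances:
        # Percentage of picks that differ by at most tol samples.
        summary[f"within_{tol}"] = 100.0 * sum(d <= tol for d in abs_diffs) / n
    return summary
```

At 500 Hz a tolerance of 10 samples corresponds to 0.02 s and 50 samples to 0.1 s, which matches the conversions used in the text.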

Results Based on STA/LTA.
In order to examine the performance of the proposed algorithms we compare them to P-wave picks obtained by the STA/LTA method with optimal parameters. The method requires a predefined threshold $\lambda$; the P-wave arrival is triggered when the STA/LTA ratio exceeds $\lambda$. The lengths of the short and long time windows also need to be predefined. This is a drawback of the method, as the optimal values of these quantities can change for different working conditions of the sensors.
The algorithm was tested with threshold values $\lambda$ from 1 to 10 (step 0.05). Simultaneously, different lengths of the short and long time windows were tested (short window from 10 to 200 samples, step 10 samples, and long window from 10 to 400 samples longer than the short one, step 10 samples). The most accurate arrival estimates were obtained with $\lambda = 2.2$, a short window of $s = 20$ samples, and a long window of $l = 320$ samples.
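A parameter sweep of this kind can be sketched as an exhaustive grid search. The code below is an illustration only: the function names are hypothetical, the windows are assumed to end at the current sample, and the selection criterion (the number of picks within `tol` samples of the expert pick) is an assumption, since the text does not state how the most accurate setting was chosen.

```python
from itertools import accumulate, product


def sta_lta_pick(signal, short_len, long_len, threshold):
    """First 0-based index where STA/LTA of the squared signal exceeds threshold."""
    prefix = [0.0] + list(accumulate(x * x for x in signal))
    for i in range(long_len, len(signal) + 1):
        sta = (prefix[i] - prefix[i - short_len]) / short_len
        lta = (prefix[i] - prefix[i - long_len]) / long_len
        if lta > 0 and sta / lta > threshold:
            return i - 1
    return None


def tune_sta_lta(recordings, expert_picks, thresholds, short_lens, long_extras, tol=10):
    """Return ((threshold, short_len, long_len), hits) with the most accurate picks."""
    best_params, best_hits = None, -1
    for thr, s, extra in product(thresholds, short_lens, long_extras):
        hits = 0
        for signal, expert in zip(recordings, expert_picks):
            pick = sta_lta_pick(signal, s, s + extra, thr)
            # A missed arrival (None) never counts as a hit.
            if pick is not None and abs(pick - expert) <= tol:
                hits += 1
        if hits > best_hits:
            best_params, best_hits = (thr, s, s + extra), hits
    return best_params, best_hits
```

The grid from the text would correspond to `thresholds=[1 + 0.05 * i for i in range(181)]`, `short_lens=range(10, 201, 10)`, and `long_extras=range(10, 401, 10)`. That is 144,800 combinations per recording, so a vectorised implementation would be preferable for the full data set.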
In Figure 6 one can observe that the results are significantly worse than those provided by the novel algorithms based on the second statistical moment. The analysis shows that 47.8% and 71.7% of picks differ from the expert indications by no more than 10 and 50 samples, respectively. Additionally, the STA/LTA algorithm missed 3 P-wave arrivals: it proceeded through the entire signal without triggering. The largest difference between the algorithm results and the seismic station specialists' picks is 490 samples (0.98 s). Moreover, significantly more events are indicated before the actual moment of P-wave arrival. This shows that STA/LTA is prone to outliers.
In Table 1 basic statistics are included in order to compare the three investigated methods. As can be seen, the proposed algorithms outperform the STA/LTA algorithm in all aspects except the mode of absolute differences, which is equal to 3 for all methods. Fitting the power function provides the best results in terms of correct picks and mean of absolute differences but has a worse standard deviation than the exponential fitting.

Conclusions
In this paper a regime switching detection method was adapted in order to find the P-wave arrival. The algorithm was tested on seismic signal recordings from the O/ZG "Rudna" underground copper ore mine. The results included in this paper show that the proposed algorithms are capable of indicating P-wave arrival moments, as the estimated points were close to the points manually indicated by mine station experts. The results were also compared to those provided by

Figure 1: Exemplary seismic event with P-wave arrival determined by specialists (a) and its zoom (b).

Figure 2: Cumulative second-moment statistic $C(j)$ applied to the exemplary seismic signal presented in Figure 1 (a) and its zoom (b).

Figure 4: Histogram of differences between algorithm picks and those given by specialists: exponential function fitted (a) and histogram of absolute differences (b).

Figure 5: Histogram of differences between algorithm picks and those given by specialists: power function fitted (a) and histogram of absolute differences (b).

Figure 6: Histogram of differences between STA/LTA picks and those given by experts (a) and histogram of absolute differences (b).

Table 1: Comparison between algorithms based on second statistical moment and STA/LTA method.