Rolling Element Bearing Performance Degradation Assessment Using Variational Mode Decomposition and Gath-Geva Clustering Time Series Segmentation

By focusing on the issue of rolling element bearing (REB) performance degradation assessment (PDA), a solution based on variational mode decomposition (VMD) and Gath-Geva clustering time series segmentation (GGCTSS) has been proposed. VMD is a new decomposition method. Since it is different from the recursive decomposition method, for example, empirical mode decomposition (EMD), local mean decomposition (LMD), and local characteristic-scale decomposition (LCD), VMD needs a priori parameters. In this paper, we will propose a method to optimize the parameters in VMD, namely, the number of decomposition modes and moderate bandwidth constraint, based on genetic algorithm. Executing VMD with the acquired parameters, the BLIMFs are obtained. By taking the envelope of the BLIMFs, the sensitive BLIMFs are selected. And then we take the amplitude of the defect frequency (ADF) as a degradative feature. To get the performance degradation assessment, we are going to use the method called Gath-Geva clustering time series segmentation. Afterwards, the method is carried out by two pieces of run-to-failure data. The results indicate that the extracted feature could depict the process of degradation precisely.


Introduction
Rolling element bearings (REBs) are critical components of "rotor-bearing" system in rotating machinery.Since the cruel working condition, the REBs are vulnerable.So it is important to monitor their condition to avoid catastrophic accident in modern industry.There are many ways to monitor bearings, such as vibration [1], acoustic emission [2], oil-debris [3], and ultrasound [4].Among them, the method based on vibration signal analysis is extensively used for a bearing.
Recently, the prediction of the residual (or remaining) useful life (RUL) is a hotspot issue.To get a better bearing fault prediction result, the so-called performance degradation assessment (PDA) is a premise.The PDA contains two important aspects.One is to extract proper features that can reflect the process of degradation.The other one is to use a method to assess the REB's performance.Feature extraction is the fundamental of PDA.The feature must reveal the real performance of the REB and be sensitive to the degradation.The types of features are usually classified into three categories, time domain features, frequency domain features, and time-frequency domain features.The time-frequency domain features are always based on timefrequency analysis, combined with the concept of spectrum, entropy, and complexity, for example, the Rényi entropy [5], the permutation entropy [6], and the general mathematical morphology particle [7].In general, mechanical equipment undergoes a complete degradation from normal stage to failure.With the increasing of the running time and deepening of degradation, the amplitude of defect frequency (ADF) is raising therewith.So the ADF can reflect the degradation of REB directly.
To get the ADF, the first thing is to locate the defect frequency.In extraction of ADF, the common method is to use EMD and select appropriate intrinsic mode function (IMF) components, taking the envelope spectrum, and finally get the ADF [8].However, EMD remains an exclusively empirical algorithm, and it lacks a solid mathematical foundation.Despite numerous attempts to improve its performance, EMD is still of low efficiency and has endpoint effect and mode mixing problems.In 2014, Dragomiretskiy and Zosso [9] proposed a method called variational mode decomposition (VMD) as an alternative to EMD, which can adaptively decompose a multicomponent signal into a number of quasiorthogonal IMFs.It has been verified that VMD outperforms EMD with regard to tone detection and separation as well as noise robustness [10].It has a good performance and high operation efficiency.Particularly, it has solid foundation for a mathematical theory.Now the method has been applied to the abnormal ECG signal detection [11], the stock market forecasting [12], wind power prediction [13], power quality classification [14], and so on.
However, the parameters of VMD are selected by experience.Jun Zhu presented a method to optimize the parameters based on the kurtosis index through artificial fish swarm algorithm [15].Tang and Wang used Shannon entropy as an index to optimize parameters by panicle swarm algorithm [16].In this paper, we proposed a method to optimize the parameters based on BLIMFs themselves using genetic algorithm.As to PDA, Pan et al. [17] proposed a bearing degradation assessment method based on lifting wavelet packet decomposition and fuzzy C-means clustering.Wang et al. [18] proposed a method for PDA based on mathematical morphology fractal dimension and fuzzy C-means clustering.However, when using clustering, they only considered the distance factor, and no time parameter was recommended.The clusters should be contiguous in time.From this point of view, after extraction of ADF from the raw signals, the assessment of bearing performance is carried out by GGCTSS.
The remainder of this paper is organized as follows.In Section 1, the theory of VMD is briefly introduced.The optimization of the parameters is proposed in Section 2. In Section 3, by carrying out two run-to-failure data, we have extracted the ADF and used GGCTSS for PDA.Our conclusions are presented in Section 4.

VMD and Its Parameters
where {  } = { 1 , . . .,   } and {  } = { 1 , . . .,   } are the decomposed BLIMFs for the set of all K modes and their estimated center frequencies, respectively.The VMD process can be briefly described as follows; for further details, refer to [9].

Parameters Involved in VMD.
There are six parameters for VMD, where  represents the numbers of decomposition modes and  represents moderate bandwidth constraint; they are two parameters that significant influence on the decomposition results.In general, the parameter  is set up to zero, which means that the Lagrangian multiplier is effectively shut off.The parameter init is set to be one, which suggests that center frequencies of all the modes are initialized in the uniform distribution.The parameter DC is set to zero, for no DC part imposed.And the parameter tol for tolerance is set as default value 1e − 6.

The Optimization of VMD Parameters.
Figure 1 shows the impact of the parameter .It can be seen that the parameter  seems to have an inverse ratio of the decomposition modes' bandwidth.In Figure 1(a), there is no high frequency mode.It seems to be a better one with  = 200 in Figure 1(b).Because the parameter init is set to be one, the bandwidth of each mode is almost the same.To optimize the parameter  and , A criterion must be raised.We have proposed a criterion  in formula (2).
where  0 is the supposed frequency band of each mode and l is the actual frequency band of each mode.For example, in Figure 1(a),  0 is 0-2000 Hz, 2000-4000 Hz, 4000-6000 Hz, 6000-8000 Hz, and 8000-10000 Hz, and the actual frequency bands of BLIMFs are 0-1396 Hz, 1426-2549 Hz, 2900-4014 Hz, 3686-4620 Hz, and 3926-5376 Hz with the -axis threshold 0.001. is calculated to 0.4021, since K must be set as integer, but not for parameter .We can find the optimized  using genetic algorithm of each mode.The settings of genetic algorithm are using default in MATLAB and finally get the best K and .The minimum of K is 2. EMD can be used to find the maximum value of K.There are 10 IMFs of the signal decomposed by EMD, so the maximum of K is set up to 10.

Experiment Illustration.
The run-to-failure REB' test data are derived from IMS center [19].The bearing test rig hosted four test bearings on a shaft.The shaft was driven by an AC motor.The rotation speed was kept constant at 2000 rpm.A radial load of 6000 lbs was added to the shaft and bearing by a spring mechanism.All the bearings are force oil-lubricated.Four Rexnord ZA-2115 double row bearings were installed on the shaft as shown in Figure 2. The bearings had 16 rollers in each row, a pitch diameter of 2.815 inch, roller diameter of 0.331 inch, and a tapered contact angle of 15.17 housing.The data sampling rate was 20 kHz and the data length was 20480 points.The sampling interval was ten minutes.We use the Bearing1 1's data of set number 2 which exhibits outer race defect as shown in Figure 3.The number of the data files is 982 (the original files number is 984, for the last 2 files it is vague and distinctive).
For the theoretical value of outer race defect frequency, we can use the formula as follows and then have the theoretical frequency of 236.4 Hz: From Figure 4, we could not find the defect frequency of #600 (where # means the number of files).The range of  is set from 1000 to 10000.Now, by using genetic algorithm, the optimized parameter  of K mode decomposition is presented in Table 1.
From Table 1, the optimized parameter K is for 4 and  for 9534.The decomposition result by optimized parameters is shown in Figure 5.
As shown in the figure, the signal is decomposed into 4 BLIMFs.The spectrum is in accordance with the order from low to high; that is, VMD has the role of band pass filtering.The BLIMFs do not show the phenomenon of mode mixing.The contrast group EMD, as shown in Figure 6, is decomposed for 11 IMFs.As we can see, EMD has a characteristic that, with the order from low to high, IMF is narrowed approximate dichotomy, and each IMF has low frequency components, and LCD and LMD have similar characteristics.At the same time, the computing time of  VMD is far less than EMD, because there is no recursive iteration process of VMD.The signal of #600 is decomposed into 4 BLIMFs by optimized VMD and the envelope spectrum is shown in Figure 7.
It can be seen that the BLIMF2 and BLIMF3 have the peak of 230.7 Hz, which is close to the theoretical value 236.4 Hz.So the BLIMF2 and BLIMF3 are the right mode which have the defect frequency.By extracting the amplitude of all BLIMF2 and BLIMF3's defect frequency, an ADF of all REB's data is shown in Figure 8.
It can be seen from Figure 8 that, before #520, the amplitude is close to zero.That indicates that there is no obvious defect before #520.From #520 to #700, the ADF increases in a linear way.Guess that spalling is formed and become larger, and the REB is in the slight fault stage.About #700, there is a sudden change, and we speculate that it is caused by a bulge or a crack.From #700 to #850, the amplitude decreases and rises again.The fluctuating trend can be explained by the nature of the propagating process of the damage, when the spalling formed and later smoothed by the continuous rolling contact.As the damage spread over a broader area, the vibration level raises again.This is called "healing" phenomenon and has been stated in [1,20,21].Form #850 to the end #982, the condition of REB is becoming fierce.The "healing" phenomenon expands, and the variances enlarge.Overall, the ADF is increasing.If the ADF is segmented in 4 parts, it is rational for #1 to #520 for normal stage, #521 to #699 for slight fault stage, #700 to #850 for serious stage, and #851 to end for fault stage.According to the clustering order from left to right, the first two clustering centers are in the normal state of the REB.The third center is in the slight fault state.The fourth one is almost close to the boundary of severely fault and failure.The result is unacceptable.References [17,18] both used fuzzy C-means to realize PDA, Where owing to the degradation features which they have proposed is roughly monotonic, the cluster results of fuzzy C-means are basically distributed in time growth order.If the degradation features are inconspicuous or existing oscillation, we could not get a good clustering result.Therefore, it is necessary to import the concept of time series segmentation.
A sequence of N times observed data,  = { 1 ,  2 , . . .,   }, is called time series.Time series segmentation is to find a partitioning of T into c segments that are internally homogeneous.After a segmentation,  = { 1 ,  2 , . . .,   } can be partitioned into { 1   ,  2  , . . .,    }, where  1  = (1,  1 ),    = ( −1 + 1,   ), and 0 < The times  1 , . . .,  −1 , . . .,   are called segment boundaries or breaks, and the number c is called the order of the segmentation.This segmentation is a crisp one.A time series segmentation problem can convert into an optimization.Himberg et al. [22] have defined the cost function.Under the definition, Kehagias et al. [23] proposed a dynamic programming (DP) segmentation procedure and Gedikli et al. [24] improved it with modified dynamic programming (mDP).Both methods are based on minimization of the summation of cost functions followed by distances between actual values and constant or linear fittings.
Expanding and improving the crisp bounds of the segmentation, Abonyi et al. [25] developed an algorithm for dividing time series into fuzzy segments, which considered time series segmentation as Gath-Geva clustering with time parameter as an additional variable.The GGCTSS used local probabilistic principal component analysis (PPCA) models to measure the homogeneity of the segments and fuzzy sets to represent segments in time.The function to be optimized can be written as where the data point   = [  ,    ]  can be effectively modeled as a mixture of multivariate Gaussian distribution and  , stands for the th segment membership of   .The term  is weighing exponent of fuzzy degree,  ∈ [1, ∞].For general,  = 2.The procedure of the GGCTSS can be written as follows: (1) Initial the order segmentation c, iteration number s1, and the threshold for compatible cluster merging (0.75 as default).
(2) Input the number of principal components q and make sure that the sum of the anterior q eigenvalues account for more than 98% accuracy of the sum of all eigenvalues.
(3) Finish the iteration and give the results of segmentation.
The results include the fuzzy segmentation of a time series,   (  ), which is defined as where   (  ) is Gaussian membership function given by See [22] for more details of the method.The mDP algorithm is used to treat the ADF for the fourth-order segmentations, and the results are shown in Figure 10.
It can be seen that the segmentations by mDP algorithm are not good.So, we use GGCTSS for PDA.The process of life was divided into 4 segments.Set the iteration to be 100 times.
Figure 11: The result of the ADF based on GGCTSS.
It is about 70 iterations that the result is basically unchanged.
Figure 11 shows the result.
It can be seen that the four segmentations are partitioned according to the order of time.The normal stage is from #0 to #522, the slight stage is from #522 to #700, the severe stage is from #700 to #900, and the failure stage is from #900 to #982.It is possible to show that GGCTSS can be a good one for PDA.

Another Experiment Verification. IEEE PHM 2012
Prognostic Challenge data is another REBs' data.Figure 12 shows the experiment rig, the PRONOSTIA platform, where the data included three different loads for 1800 rpm and 4000 N, 1650 rpm and 4200 N, and 1500 rpm and 5000 N. The sampling frequency was 25.6 kHz, and 2560 samples were recorded each 10 seconds.There were 6 pieces of runto-failure data which were the learning sets, and 11 test bearings were truncated so that participants were supposed to predict the remaining life (see Table 2) and thereby preformed RUL estimating.The learning set was quite small while the spread of the life duration of all bearings was very wide.Performing good estimates was thereby difficult and this made the challenge more exciting.More details can be seen in [26].
The data is difficult to analysis because we know nothing about the nature and the origin of the degradation: inner or outer races.We are going to use Bearing1 1's data and the characteristics of the tested bearing are shown in Table 3.And the characteristics' frequency of REB is 168.3Hz for outer race, 221.7 Hz for inner race, 107.7 Hz for ball race, and 12.9 Hz for cage.
The numbers of Bearing1 1 are 2803, thus taking #2600 for analysis, since the REB is almost run to failure. Figure 13 shows the signal.Now, by using genetic algorithm, we have the optimized K and , and K is for 3 and  for 4032.The signal is    By extracting the amplitude of all BLIMF1 and BLIMF2' defect frequency, an ADF of all Bearing1 1's data is shown in Figure 15.
From Figure 15, there are no evident boundaries of each degradation stage.At the end of the bearing's life, there is a sudden increase.If we set the segment number as 3, the result is shown in Figure 16.The normal stage is from #0 to #1432, slight stage is from #1433 to #2264, and severe stage is from #2265 to #2803.

Conclusions
This paper has proposed a method to assess the performance degradation of REB using VMD and GGCTSS.We have given a novel method to optimize the VMD parameter.By using GGCTSS, we can effectively segment the degradation process.From the above analysis, we can conclude that (1) the ADF extracted by VMD can commendably reflect the degradation development of REBs; (2) compared with crisp time series segmentation, the GGCTSS is more suitable for PDA of REBs.
However, there are also some problems that should be focused on in the future study, for example, to optimize the selection of GGCTSS segment number.

Figure 1 :Figure 2 :
Figure 1: The impact of the parameter .

Figure 9 .
Figure 9.The four clustering results of the ADF are conducted by fuzzy C-means (the data have been normalized).According to the clustering order from left to right, the first two clustering centers are in the normal state of the REB.The third center is in the slight fault state.The fourth one is almost close to the boundary of severely fault and failure.The result is unacceptable.References[17,18] both used fuzzy C-means to realize PDA, Where owing to the degradation features which they have proposed is roughly monotonic, the cluster results of fuzzy C-means are basically distributed in time growth order.If the degradation features are inconspicuous or existing oscillation, we could not get a good clustering result.Therefore, it is necessary to import the concept of time series segmentation.A sequence of N times observed data,  = { 1 ,  2 , . . .,   }, is called time series.Time series segmentation is to find a partitioning of T into c segments that are internally homogeneous.After a segmentation,  = { 1 ,  2 , . . .,   } can be partitioned into {1   ,  2  , . . .,    }, where  1  = (1,  1 ),    = ( −1 + 1,   ), and 0 <  1 < ⋅ ⋅ ⋅ <  −1 < ⋅ ⋅ ⋅ <   = .The times  1 , . . .,  −1 , . . .,   are called segment boundaries or breaks, and the number c is called the order of the segmentation.This segmentation is a crisp one.A time series segmentation problem can convert into an optimization.Himberg et al.[22] have defined the cost function.Under the definition, Kehagias et al.[23] proposed a dynamic programming (DP) segmentation procedure and Gedikli et al.[24] improved it with modified dynamic programming (mDP).Both methods are based on minimization of the summation of cost functions followed by distances between actual values and constant or linear fittings.Expanding and improving the crisp bounds of the segmentation, Abonyi et al.[25] developed an algorithm for dividing time series into fuzzy segments, which considered time series segmentation as Gath-Geva clustering with time parameter as an additional variable.The GGCTSS used local probabilistic principal component analysis (PPCA) models to measure the homogeneity of the segments and fuzzy sets to

4 a (m/s 2 )Figure 10 :
Figure 10: The optimal fourth-order segmentations of ADF by mDP algorithm.(a) Constant fitting and (b) linear fitting.

Table 1 :
The optimized parameter  of each K mode decomposition.

Table 3 :
Characteristics of tested bearings.BLIMFs and the envelope spectrum is shown in Figure14.It is obvious that BLIMF1 and BLIMF2 have the frequency of inner defect.So Bearing1 1 gets inner defect finally.