Bearing Degradation Process Prediction Based on the Support Vector Machine and Markov Model

Predicting the degradation process of bearings before they reach the failure threshold is extremely important in industry. This paper proposes a novel method based on the support vector machine (SVM) and the Markov model to achieve this goal. First, features are extracted by time domain and time-frequency domain methods. However, the original features are still high dimensional and include superfluous information, so the nonlinear multifeature fusion technique local tangent space alignment (LTSA) is used to merge the features and reduce the dimension. Then, based on the extracted features, the SVM model is used to predict the bearing degradation process, and the CAO method is used to determine the embedding dimension of the SVM model. After the bearing degradation process is predicted by the SVM model, the Markov model is used to improve the prediction accuracy. The proposed method was validated by two bearing run-to-failure experiments, and the results proved the effectiveness of the methodology.


Introduction
The bearing is one of the most important components in rotating machinery. Accurate prediction of the bearing degradation process is the key to effective implementation of condition-based maintenance; it can prevent unexpected failures and minimize overall maintenance costs [1,2].
To achieve effective prediction of the bearing degradation process, features should first be extracted from the collected vibration data. Then, based on the extracted features, effective prediction models should be selected [3]. Feature extraction is the process of transforming the raw vibration data collected from running equipment into relevant information about the health condition. There are three types of methods for processing raw vibration data: time domain analysis, frequency domain analysis, and time-frequency domain analysis, and all three are commonly used to extract features. For example, Yu [4] chose the time domain and the frequency domain transform to describe the characteristics of the vibration signals. Yan et al. [5] chose the short-time Fourier transform to extract the features. Ocak et al. [6] chose the wavelet packet transform to extract features of bearing wear information. Because the frequency features obtained from FFT analysis tend to average out transient vibrations and thus do not provide a complete measure of the bearing health status, in this paper the time domain and the time-frequency domain characteristics are used to extract the original features.
Although the original features can be extracted, they are still high dimensional and include superfluous information. A feature fusion and dimensional reduction method should therefore be applied to the original features so as to select the typical features. The most commonly used feature fusion and dimensional reduction method is principal component analysis (PCA) [7,8]. However, PCA is mainly suited to linear data sets, while bearing vibration features usually exhibit nonlinear characteristics, so PCA cannot work effectively. It is therefore a challenge to find an effective nonlinear feature fusion and dimensional reduction method. In this research a new feature extraction method, local tangent space alignment (LTSA) [9], is chosen. The LTSA is an efficient manifold-learning algorithm, which can be used as a preprocessing method to transform high dimensional data into more easily handled low dimensional data [10]; the method has been used in many fields, such as face recognition, character recognition, and image recognition [11,12]. In this paper, the LTSA is used to extract the more sensitive features.
After selecting the typical features, another challenge is how to effectively predict the bearing degradation process based on them. Existing equipment degradation prediction methods can be roughly classified into model-based (or physics-based) methods and data-driven methods [13]. The model-based methods predict the equipment degradation process using physical models of the components and damage propagation models based on damage mechanics [14,15]. However, equipment dynamic response and damage propagation processes are typically very complex, and authentic physics-based models are very difficult to build [16]. Data-driven methods, also known as artificial intelligence approaches, are derived directly from routine condition monitoring data of the monitored system and predict the failure progression through a learning or training process. The more prior data are used for training, the more accurate the obtained model is [17]. Artificial intelligence techniques have been increasingly applied to bearing remaining life prediction recently. Lee et al. [18] presented an Elman neural network method for health condition prediction. Huang et al. [19] proposed a backpropagation network-based method for bearing degradation process prediction. However, neural networks have the drawbacks of slow convergence, difficulty in escaping from local minima, and uncertain network structure; these problems become more troublesome when the bearing degradation process is predicted from large data sets. The SVM [20] has been widely used recently and has good classification and regression ability. In this paper, the SVM is used to predict the bearing degradation process.
Although the SVM is effective in predicting the bearing running state, prediction error still exists. Because any method that predicts the future from historical data will have some prediction error, it is necessary to improve the prediction results. However, the prediction error is affected by many factors, fluctuates, shows great randomness, and its points are not related to each other. To achieve high prediction accuracy, we therefore need to find the regularity of the prediction error and correct it. The Markov model [21] uses the state transition matrix to achieve a more precise prediction, so in this research the Markov model is used to improve the prediction accuracy.
The remainder of this paper is organized as follows. The methods of feature extraction and the theory of the dimensional reduction method LTSA are introduced in Section 2. The SVM model and the Markov model for bearing degradation process prediction are described in Section 3. In Section 4, the flowchart and the procedure of this research are introduced. The case validation and actual application are presented in Section 5. Finally, the conclusions are given in Section 6.

Methods of Signal Processing and Dimensional Reduction
The time domain features are calculated from the vibration signal as follows. The root mean square (RMS) and the sample variance are defined as

$$x_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}}, \qquad s^{2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{2},$$

where $N$ is the number of discrete points, $x_i$ represents the signal value at those points, and $\bar{x}$ is the mean value.

The $k$th central moment for a set of data is defined as

$$m_k = \frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{k}.$$

The normalized fourth moment, kurtosis, which is commonly used in bearing diagnostics, is defined as

$$\text{kurtosis} = \frac{m_4}{m_2^{2}}.$$

The skewness is defined as

$$\text{skewness} = \frac{m_3}{m_2^{3/2}}.$$

The peak-peak value is defined as

$$x_{\mathrm{pp}} = \max_i x_i - \min_i x_i.$$

Empirical mode decomposition (EMD) is a powerful tool in time-frequency domain analysis. The advantage of EMD is that it presents signals in time-frequency distribution diagrams with multiresolution, without the need to choose any parameters. This property is essential in the detection of bearing faults. The EMD energy can represent the characteristics of vibration signals, and thus it is used as an input feature. The intrinsic mode function (IMF) energy data sets are chosen as original features in this paper. The original features used for bearing degradation process prediction are shown in Table 1.
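The time domain features named above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `time_domain_features` is hypothetical, the sample variance is assumed to use the $N-1$ denominator, and kurtosis and skewness are assumed to be the central moments normalized by powers of the second central moment.

```python
import math


def time_domain_features(x):
    """Return RMS, sample variance, kurtosis, skewness and peak-peak of a signal.

    Assumptions (not stated in the text): variance uses the N-1 denominator;
    kurtosis = m4 / m2**2 and skewness = m3 / m2**1.5, with m_k the k-th
    central moment computed with a 1/N factor.
    """
    n = len(x)
    mean = sum(x) / n

    def central_moment(k):
        return sum((v - mean) ** k for v in x) / n

    m2 = central_moment(2)
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "variance": sum((v - mean) ** 2 for v in x) / (n - 1),
        "kurtosis": central_moment(4) / m2 ** 2,
        "skewness": central_moment(3) / m2 ** 1.5,
        "peak_to_peak": max(x) - min(x),
    }


if __name__ == "__main__":
    print(time_domain_features([1.0, -1.0, 1.0, -1.0]))
```

A kurtosis well above that of a healthy-state signal is the usual indicator of impulsive content, which is why it is commonly tracked in bearing diagnostics.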

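Dimensional Reduction Based on the LTSA

Because the generated original feature sets are still high dimensional and include superfluous information, the feature extraction method LTSA is used to fuse the relevant useful features and extract more sensitive features, which serve as the input of the proposed prediction model. The basic idea of LTSA is to use the tangent space at each sample point to represent the local geometry; these local manifold structures are then aligned to construct the global coordinates. Given a data set $X = [x_1, x_2, \ldots, x_N]$, $x_i \in R^{m}$, a manifold of dimension $d$ ($m > d$) is extracted. The LTSA feature extraction algorithm is as follows [9].

(1) Extract the local neighborhood: for each $x_i$, find its $k$ nearest neighbors (including $x_i$ itself) and collect them as $X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$.

(2) Fit the local linear structure: compute an orthonormal basis $Q_i$ of the $d$-dimensional tangent space at $x_i$ from the $d$ leading left singular vectors of $X_i(I - ee^{T}/k)$, and the local coordinates $\Theta_i = Q_i^{T} X_i (I - ee^{T}/k)$.

(3) Align the local coordinates: the global coordinates $T_i = [t_{i1}, t_{i2}, \ldots, t_{ik}]$ of the neighborhood should reflect the local geometry, with the reconstruction error

$$E_i = T_i\left(I - \frac{1}{k}ee^{T}\right) - L_i\Theta_i,$$

where $I$ is the identity matrix, $e$ is the vector of all ones, $k$ is the number of points in the neighborhood, and $L_i$ is the transformation matrix. In order to minimize the error, $L_i$ and $T_i$ should be found, and then

$$L_i = T_i\left(I - \frac{1}{k}ee^{T}\right)\Theta_i^{+}, \qquad E_i = T_i\left(I - \frac{1}{k}ee^{T}\right)\left(I - \Theta_i^{+}\Theta_i\right),$$

where $\Theta_i^{+}$ is the Moore-Penrose generalized inverse of $\Theta_i$. Suppose

$$W_i = \left(I - \frac{1}{k}ee^{T}\right)\left(I - \Theta_i^{+}\Theta_i\right).$$

Let $T = [t_1, t_2, \ldots, t_N]$ and $T S_i = T_i$, where $S_i$ is a 0-1 selection matrix; $T$ contains the global coordinates, and the alignment matrix is

$$B = S W W^{T} S^{T}, \qquad S = [S_1, \ldots, S_N], \quad W = \mathrm{diag}(W_1, \ldots, W_N).$$

The constraint is $T T^{T} = I_d$.

(4) Extract the low-dimensional manifold feature: $T$ is composed of the eigenvectors of matrix $B$ corresponding to its 2nd to $(d+1)$th smallest eigenvalues. $T$ is the global coordinate mapping of the low-dimensional manifold transformed from the nonlinear high-dimensional data set $X$.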
The procedure of feature extraction can be described as follows.
(1) Use the time domain analysis methods (kurtosis, skewness, peak-peak, RMS, and sample variance) to extract the statistical features.
(2) Use the EMD method to decompose the collected vibration signal of each data set into IMF components; calculate the energy of each IMF component to obtain the features of the bearing at that time; then obtain the features of the other data sets in the same way.
(3) Use the LTSA to reduce the dimension of the original features and obtain the main features; the extracted features are used as the input of the SVM model for bearing degradation process prediction.
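Step (2) of the procedure can be illustrated with a short sketch. It assumes the IMF components have already been produced by some EMD routine (the decomposition itself is not shown), and it assumes that "energy" means the sum of squared samples of each IMF; the function names are hypothetical.

```python
def imf_energies(imfs):
    """Energy of each IMF, assumed here to be the sum of its squared samples."""
    return [sum(v * v for v in imf) for imf in imfs]


def original_feature_vector(time_features, imfs):
    """Concatenate time domain features and IMF energies into one feature vector.

    `time_features` is a list of already computed statistics (for example
    kurtosis, skewness, peak-peak, RMS, variance); `imfs` is a list of IMF
    sample lists obtained from an EMD routine that is not part of this sketch.
    """
    return list(time_features) + imf_energies(imfs)


if __name__ == "__main__":
    imfs = [[1.0, -1.0, 2.0], [0.5, 0.5]]
    print(original_feature_vector([3.1, 0.2], imfs))
```

One such vector is produced per recorded data set; stacking them over time gives the high dimensional feature matrix that LTSA then reduces.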

The SVM and Markov Model for Degradation Process Prediction
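Given a set of training samples $\{(x_i, y_i)\}_{i=1}^{l}$, the SVM regression function is expressed as

$$f(x) = w \cdot \varphi(x) + b, \tag{11}$$

where $\varphi(x)$ is the high dimensional feature space, which is nonlinearly mapped from the input space $x$, $w$ is the weight vector, and $b$ is the bias [22].

After training, the corresponding $y$ can be found through $f(x)$ for any $x$ outside the sample. The $\varepsilon$-support vector regression ($\varepsilon$-SVR) by Vapnik controls the precision of the algorithm through a specified tolerance error $\varepsilon$. If the error of a sample is $\xi$, the loss is ignored when $|\xi| \le \varepsilon$; otherwise the loss is taken as $|\xi| - \varepsilon$. First, the sample is mapped into a high dimensional feature space by a nonlinear mapping function, which converts the nonlinear function estimation problem into a linear regression problem in that feature space. If we let $\varphi(x)$ be the conversion function from the sample space into the high dimensional feature space, then the problem of solving the parameters of $f(x)$ is converted to solving the optimization problem (12) with the constraints in (13):

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{l}\left(\xi_i + \xi_i^{*}\right), \tag{12}$$

subject to

$$y_i - \left(w \cdot \varphi(x_i) + b\right) \le \varepsilon + \xi_i, \qquad \left(w \cdot \varphi(x_i) + b\right) - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0. \tag{13}$$

The feature space is high dimensional and the target function is nondifferentiable. In general, the SVM regression problem is solved by establishing a Lagrange function and converting the problem to a dual optimization, that is, problem (14) with the constraints of (15), in order to determine the Lagrange multipliers:

$$\max_{\alpha,\,\alpha^{*}} \; -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\left(\alpha_i - \alpha_i^{*}\right)\left(\alpha_j - \alpha_j^{*}\right)K(x_i, x_j) - \varepsilon\sum_{i=1}^{l}\left(\alpha_i + \alpha_i^{*}\right) + \sum_{i=1}^{l} y_i\left(\alpha_i - \alpha_i^{*}\right), \tag{14}$$

subject to

$$\sum_{i=1}^{l}\left(\alpha_i - \alpha_i^{*}\right) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C, \tag{15}$$

where $\alpha_i$, $\alpha_i^{*}$ are Lagrange multipliers with $\alpha_i, \alpha_i^{*} \ge 0$ and $\alpha_i \times \alpha_i^{*} = 0$. $C$ controls the tradeoff between the empirical risk and the smoothness of the model.

The SVM regression problem has therefore been transformed into a quadratic programming problem, and the regression equation can be obtained by solving it. With the kernel function $K(x_i, x_j)$, the corresponding regression function is

$$f(x) = \sum_{i=1}^{l}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b, \tag{16}$$

where the kernel function $K(x_i, x_j)$ is the inner product of the vectors $\varphi(x_i)$ and $\varphi(x_j)$ in the feature space.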
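To make the two ingredients of the regression concrete, the sketch below evaluates the $\varepsilon$-insensitive loss and a kernel regression function of the form "weighted sum of kernel values plus bias". It is only an illustration: the coefficients are made-up values rather than the solution of the quadratic program, and the RBF kernel is assumed to have the form $\exp(-\|u-v\|^{2}/(2\sigma^{2}))$, which the text does not confirm.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss is zero inside the eps tube and |error| - eps outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def rbf_kernel(u, v, sigma2):
    """RBF kernel; the 2*sigma2 scaling is an assumption of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma2))


def svr_predict(x, support_vectors, dual_coefs, bias, sigma2):
    """Evaluate f(x) = sum_i coef_i * K(x_i, x) + bias.

    Each entry of `dual_coefs` plays the role of (alpha_i - alpha_i*); the
    values passed in are hypothetical, not obtained by solving the dual problem.
    """
    return sum(c * rbf_kernel(sv, x, sigma2)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


if __name__ == "__main__":
    svs = [[0.0, 0.0], [1.0, 1.0]]
    coefs = [2.0, -1.0]
    print(svr_predict([0.5, 0.5], svs, coefs, bias=0.5, sigma2=20.0))
    print(eps_insensitive_loss(1.0, 1.5, eps=0.1))
```

In practice the coefficients and bias come from a quadratic programming solver or an SVM library; only the evaluation step is shown here.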

The Prediction Strategy and the Structure of the SVM Model.
Traditional forecasting methods mainly achieve single-step prediction; when they are used for multistep prediction, they cannot capture the overall development trend of the series. A multistep prediction method can obtain overall information about the series, which makes long-term prediction possible. There are two typical alternatives for building a multistep life prediction model: iterated prediction and direct prediction. Comparisons of the two strategies can be found in the literature [23]. Marcellino et al. [24] presented a large-scale empirical comparison of iterated versus direct prediction, and their results show that iterated prediction typically outperforms direct prediction. The iterated multistep prediction strategy is therefore adopted in this paper.
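The iterated strategy can be sketched as a loop that feeds each one-step prediction back into the input window. The `one_step_model` argument is a placeholder for any trained one-step predictor (in this paper it would be the SVM); the function name and the toy extrapolation model in the usage example are hypothetical.

```python
def iterated_forecast(history, one_step_model, window, steps):
    """Iterated multistep prediction.

    history        : list of past observations
    one_step_model : callable mapping a list of `window` values to the next value
    window         : number of past values fed to the model
    steps          : number of future values to predict
    """
    current = list(history[-window:])
    predictions = []
    for _ in range(steps):
        nxt = one_step_model(current)
        predictions.append(nxt)
        # Slide the window: drop the oldest value, append the new prediction.
        current = current[1:] + [nxt]
    return predictions


if __name__ == "__main__":
    # Toy one-step model: linear extrapolation from the last two values.
    linear = lambda w: w[-1] + (w[-1] - w[-2])
    print(iterated_forecast([1.0, 2.0, 3.0], linear, window=2, steps=3))
```

Because predicted values are reused as inputs, errors can accumulate with the horizon, which is one motivation for correcting the output afterwards.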
In order to determine the structure of the SVM, we constructed a three-layer SVM prediction model. To achieve multistep time series life prediction, however, a basic problem must be solved: how many past observations (inputs) are needed to forecast the future value (the number of output nodes is 1). This number is the so-called embedding dimension $d$. To solve this problem, the CAO method [25], which is particularly efficient at determining the minimum embedding dimension through the expansion of neighboring points in the embedding space, is employed to select an appropriate embedding dimension $d$. The number of SVM input nodes is then determined.
To select an appropriate embedding dimension with the CAO method, the phase space reconstruction method must first be introduced. The fundamental theorem of phase space reconstruction was pioneered by Takens [26]. For an $N$-point time series $X = \{x_1, x_2, \ldots, x_N\}$, a sequence of vectors $y_i$ in a new space can be generated as $y_i(d) = \{x_i, x_{i+\tau}, \ldots, x_{i+(d-1)\tau}\}$, where $i = 1, 2, \ldots, N_d$, $N_d = N - (d-1)\tau$ is the number of reconstructed vectors $y_i$, $d$ is the embedding dimension of the reconstructed state space, and $\tau$ is the embedding delay time. The time delay $\tau$ is chosen through the autocorrelation function [27]:

$$C(\tau) = \frac{\sum_{i=1}^{N-\tau} x'_i\, x'_{i+\tau}}{\sum_{i=1}^{N} \left(x'_i\right)^{2}}, \tag{17}$$

where $x'_i = x_i - \bar{x}$ and $\bar{x}$ is the average value of the time series. The optimal time delay $\tau$ is determined when the first minimum value of $C(\tau)$ occurs.
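The reconstruction and the delay selection can be sketched as follows. The normalization of the autocorrelation by the total sum of squared deviations is an assumption of this sketch, and the function names are hypothetical.

```python
def delay_embed(x, d, tau):
    """Build the reconstructed vectors y_i(d) = [x_i, x_{i+tau}, ..., x_{i+(d-1)tau}].

    Returns N - (d - 1) * tau vectors, each of length d.
    """
    n_vectors = len(x) - (d - 1) * tau
    return [[x[i + j * tau] for j in range(d)] for i in range(n_vectors)]


def autocorrelation(x, tau):
    """Autocorrelation of the mean-removed series at lag tau.

    Normalizing by the total sum of squares is an assumption of this sketch.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    num = sum(dev[i] * dev[i + tau] for i in range(n - tau))
    den = sum(v * v for v in dev)
    return num / den


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]
    print(delay_embed(series, d=2, tau=2))
    print([round(autocorrelation(series, t), 3) for t in range(4)])
```

Scanning `autocorrelation(x, tau)` over increasing `tau` and stopping at the first local minimum gives the delay used for the embedding.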
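The embedding dimension $d$ is chosen through the CAO method, which defines the quantity

$$a(i, d) = \frac{\left\| y_i(d+1) - y_{n(i,d)}(d+1) \right\|}{\left\| y_i(d) - y_{n(i,d)}(d) \right\|}, \qquad i = 1, 2, \ldots, N - d\tau, \tag{18}$$

where $\|\cdot\|$ is the distance between vectors, given here by the maximum norm, $y_i(d+1)$ is the $i$th reconstructed vector with embedding dimension $d+1$, and $n(i, d)$ is an integer such that $y_{n(i,d)}(d)$ is the nearest neighbor of $y_i(d)$ in the embedding dimension $d$. A new quantity is defined as the mean value of all $a(i, d)$:

$$E(d) = \frac{1}{N - d\tau}\sum_{i=1}^{N - d\tau} a(i, d), \tag{19}$$

where $E(d)$ depends only on the dimension $d$ and the time delay $\tau$. To investigate its variation from $d$ to $d+1$, the parameter $E_1$ is given by

$$E_1(d) = \frac{E(d+1)}{E(d)}. \tag{20}$$

As $d$ increases, $E_1(d)$ also increases, and it stops increasing when the time series comes from a deterministic process. If a plateau is observed for $d \ge d_0$, then $d_0 + 1$ is the minimum embedding dimension. In practice, however, it can be difficult to tell whether $E_1(d)$ is still slowly increasing or has stopped changing when $d$ is sufficiently large. CAO therefore introduced another quantity $E_2(d)$ to overcome the problem:

$$E_2(d) = \frac{E^{*}(d+1)}{E^{*}(d)}, \tag{21}$$

where

$$E^{*}(d) = \frac{1}{N - d\tau}\sum_{i=1}^{N - d\tau} \left| x_{i+d\tau} - x_{n(i,d)+d\tau} \right|. \tag{22}$$

Through the CAO method, the embedding dimension $d$ of the SVM prediction model is chosen, and the structure of the SVM model is determined.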
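A brute-force sketch of the CAO quantities $E_1(d)$ and $E_2(d)$ is given below. It follows the commonly published form of the method (maximum norm, nearest neighbor searched among the first $N - d\tau$ vectors, neighbors at zero distance skipped); these details and all function names are assumptions of the sketch rather than settings confirmed by the text.

```python
import math


def _embed(x, d, tau):
    return [[x[i + j * tau] for j in range(d)] for i in range(len(x) - (d - 1) * tau)]


def _max_norm(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def cao_E(x, d, tau):
    """Return (E(d), E*(d)) for series x, using a brute-force neighbor search."""
    y_d = _embed(x, d, tau)
    y_d1 = _embed(x, d + 1, tau)
    n = len(y_d1)  # N - d * tau vectors are usable in both dimensions
    a_sum, s_sum, count = 0.0, 0.0, 0
    for i in range(n):
        best_j, best_dist = None, float("inf")
        for j in range(n):
            if j == i:
                continue
            dist = _max_norm(y_d[i], y_d[j])
            if 0.0 < dist < best_dist:  # skip identical points
                best_j, best_dist = j, dist
        if best_j is None:
            continue
        a_sum += _max_norm(y_d1[i], y_d1[best_j]) / best_dist
        s_sum += abs(x[i + d * tau] - x[best_j + d * tau])
        count += 1
    if count == 0:
        raise ValueError("no valid neighbors found")
    return a_sum / count, s_sum / count


def cao_E1_E2(x, tau, d_max):
    """Return lists E1(1..d_max) and E2(1..d_max)."""
    vals = [cao_E(x, d, tau) for d in range(1, d_max + 2)]
    e1 = [vals[k + 1][0] / vals[k][0] for k in range(d_max)]
    e2 = [vals[k + 1][1] / vals[k][1] for k in range(d_max)]
    return e1, e2


if __name__ == "__main__":
    series = [math.sin(0.3 * i) for i in range(200)]
    e1, e2 = cao_E1_E2(series, tau=5, d_max=4)
    print([round(v, 3) for v in e1])
    print([round(v, 3) for v in e2])
```

The embedding dimension is then read off as the first $d$ after which `e1` stays roughly constant; `e2` staying close to 1 for all $d$ would suggest a random rather than deterministic series.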

SOM Clustering Method to Divide the Prediction Error.
State division is the process of determining the mapping from the random variable to the state space, and obtaining it is a crucial step for the Markov model. Traditionally, it is performed as follows. Let $X = \{x_1, x_2, \ldots, x_n\}$ be the random sequence and let $E = \{1, 2, \ldots, i, \ldots, m\}$ denote the state space. Given

$$x_t \in [a_{i-1}, a_i],$$

where $1 \le t \le n$ and $1 \le i \le m$, the variable $x_t$ belongs to state $i$, and the intervals $[a_{i-1}, a_i]$ are usually divided uniformly. However, the uniform division depends on human experience, which affects the prediction precision. In this research, the SOM neural network [28] is used to divide the states. The SOM can be created from highly deviating, nonlinear data. After the data are input, the SOM is trained iteratively.
In each training step, one sample vector $x$ is chosen randomly from the input data set, and the distance between it and all the weight vectors of the SOM, which are initialized randomly, is calculated using some distance measure. The best matching unit (BMU) is the map unit whose weight vector is closest to $x$. After the BMU is identified, the weight vectors of the BMU and of its topological neighbors are updated so that they move closer to the input vector in the input space. The vectors are updated following the learning rule

$$w_i(t+1) = w_i(t) + \alpha(t)\, h(n_{\mathrm{BMU}}, n_i, t)\left(x - w_i(t)\right), \tag{23}$$

where $h(n_{\mathrm{BMU}}, n_i, t)$ is the neighborhood function, which is monotonically decreasing with respect to the distance between the BMU $n_{\mathrm{BMU}}$ and $n_i$ in the grid and with respect to the training time $t$, and $\alpha(t)$ is the learning rate, a decreasing function with $0 < \alpha(t) < 1$.
At the end of the learning process, the weight vectors are grouped into clusters depending on their distance in the input space. Unlike networks based on supervised learning, which require that target values corresponding to the input vectors are known, the SOM can cluster data without knowing the class membership of the input data. This property suits the prediction error of the SVM model, whose structure is not known in advance and which must therefore be classified without prior learning; the SOM is thus an efficient and suitable method for clustering the states. Based on the clustering results, the state space is divided into several intervals.
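A minimal one-dimensional SOM for dividing scalar prediction errors into states might look as follows. This is a sketch under several assumptions that differ from or go beyond the text: the weights are initialized evenly across the data range rather than randomly, the neighborhood is Gaussian with a linearly shrinking radius, and the learning rate decays linearly from 0.5. All names and the sample data are hypothetical.

```python
import math
import random


def train_som_1d(data, n_units=3, n_iter=2000, seed=0):
    """Train a 1-D SOM (n_units x 1 map) on scalar data and return its weights."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    # Assumption: evenly spaced initial weights (the text uses random initialization).
    weights = [lo + (hi - lo) * (k + 0.5) / n_units for k in range(n_units)]
    for t in range(n_iter):
        x = rng.choice(data)
        frac = t / n_iter
        alpha = 0.5 * (1.0 - frac)                          # decreasing learning rate
        radius = max(1e-3, (n_units / 2.0) * (1.0 - frac))  # shrinking neighborhood
        bmu = min(range(n_units), key=lambda k: abs(x - weights[k]))
        for k in range(n_units):
            h = math.exp(-((k - bmu) ** 2) / (2.0 * radius ** 2))
            weights[k] += alpha * h * (x - weights[k])
    return weights


def assign_state(x, weights):
    """State index (1-based) of the unit whose weight is closest to x."""
    return 1 + min(range(len(weights)), key=lambda k: abs(x - weights[k]))


if __name__ == "__main__":
    errors = ([-0.10 + 0.002 * i for i in range(5)]
              + [0.00 + 0.002 * i for i in range(5)]
              + [0.10 + 0.002 * i for i in range(5)])
    w = train_som_1d(errors)
    print(w, [assign_state(e, w) for e in errors])
```

The boundaries between states are the midpoints between neighboring weights, which gives the intervals that the Markov model later works with.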

Markov Prediction Model to Improve the Prediction Accuracy.
Consider a stochastic process $\{X_n, n = 1, 2, \ldots\}$ that takes on a finite or countable number of possible values. Unless otherwise mentioned, this set of possible values of the process will be denoted by the set of positive integers $\{1, 2, \ldots\}$. If $X_n = i$, then the process is said to be in state $i$ at time $n$. Suppose that whenever the process is in state $i$, there is a fixed probability $P_{ij}$ that it will next be in state $j$. That is,

$$P\{X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0\} = P_{ij} \tag{24}$$

for all states $i_0, i_1, \ldots, i_{n-1}, i, j$ and all $n \ge 0$. Such a stochastic process is known as a Markov chain. Equation (24) can be interpreted as stating that, for a Markov model, the conditional distribution of any future state $X_{n+1}$, given the past states $X_1, X_2, \ldots, X_{n-1}$ and the present state $X_n$, is independent of the past states and depends only on the present state. This is called the Markovian property. The value $P_{ij}$ represents the probability that the process will, when in state $i$, next make a transition into state $j$. Since probabilities are nonnegative and since the process must make a transition into some state, we have

$$P_{ij} \ge 0, \qquad \sum_{j} P_{ij} = 1, \qquad i = 1, 2, \ldots. \tag{25}$$

If the process has a finite number of states, which means the state space is $E = \{1, 2, \ldots, i, j, \ldots, m\}$, then the Markov chain model can be defined by the matrix of one-step transition probabilities, denoted as

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}. \tag{26}$$

The initial probability is computed by

$$P_{ij} = \frac{M_{ij}}{M_i}, \tag{27}$$

where $M_{ij}$ denotes the number of transitions from state $i$ to state $j$ and $M_i$ denotes the number of random variables $\{x_t, t = 1, 2, \ldots, n\}$ belonging to state $i$. The Markov model adopts the state vector and the state transition matrix to deal with the prediction issue. Suppose that the state vector at moment $t-1$ is $S_{t-1}$, the state vector at moment $t$ is $S_t$, and the state transition matrix is $P$; then the relationship is

$$S_t = S_{t-1} P, \qquad t = 1, 2, \ldots, n. \tag{28}$$

Updating $t$ from 1 to $n$ gives

$$S_t = S_0 P^{t}, \tag{29}$$

where $S_t$ is the state vector at moment $t$. Equation (29) is the basic Markov prediction model: if the initial state vector and the transition matrix are given, any future state vector can be calculated.
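The estimation of the transition matrix from an observed state sequence and the propagation of a state vector can be sketched as follows. One deliberate deviation is labeled in the code: each row is normalized by the number of transitions leaving the state, so that every nonempty row sums to 1, whereas the text counts all variables belonging to the state. Function names are hypothetical.

```python
def transition_matrix(states, m):
    """Estimate one-step transition probabilities from a sequence of states 1..m.

    Assumption: rows are normalized by the number of transitions that leave
    state i (so each nonempty row sums to 1).
    """
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(states, states[1:]):
        counts[a - 1][b - 1] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def propagate(state_vector, matrix, t):
    """Apply S_t = S_{t-1} P repeatedly, t times, starting from state_vector."""
    s = list(state_vector)
    m = len(matrix)
    for _ in range(t):
        s = [sum(s[i] * matrix[i][j] for i in range(m)) for j in range(m)]
    return s


if __name__ == "__main__":
    P = transition_matrix([1, 2, 1, 2, 3, 1], m=3)
    print(P)
    print(propagate([1.0, 0.0, 0.0], P, t=2))
```

Starting from a one-hot vector for the current state, the entries of the propagated vector are the probabilities of being in each state after `t` steps.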

Proposed Method
The flowchart of the proposed method is shown in Figure 1.
The method consists of four sequential procedures: data processing and feature extraction, merging of the original features, constructing and training the SVM model for prediction, and applying the Markov model to improve the prediction result. The role of each procedure is explained as follows.
Step 1. Data processing and features extraction.The time domain and time-frequency domain signal processing methods are used to extract the original features from the collected mass vibration data.
Step 2. Merge of the original features.The LTSA method is used to extract the typical features and reduce the dimension of the features.The extracted features are used for training the SVM model.
Step 3. Constructing the SVM model.The SVM model is constructed; the CAO method is used to determine the embedding dimension.The iterated multistep prediction method is used to forecast the future value.
Step 4. Markov model for improving the prediction result. This procedure uses the SOM method to cluster the prediction error before the Markov method is applied; based on the state division results, the Markov model is used to correct the prediction results obtained by the SVM model and obtain a more precise prediction.

Validation and Application
From the extracted features shown in Figures 5 and 6 we can see the following.
(1) The bearing is in normal condition during the time corresponding to the first 700 points. After that time, the condition of the bearing changes suddenly, which indicates that faults have occurred in this bearing.
(2) Different features reflect the bearing running state in different shapes. For example, the kurtosis and the IMF1 ...
We performed feature extraction by means of LTSA to extract a sensitive feature and reduce the dimensionality of the calculated features. After LTSA is applied (in this article the neighborhood factor $k$ equals 8 and the embedding dimension $d$ equals 1), the bearing running state feature dataset is obtained. The first main projected vector is chosen as the input of the SVM model. The result is shown in Figure 7. For comparison with the LTSA method, we also extracted the features through the PCA method, choosing the first principal component. The result is shown in Figure 8.
From Figures 7 and 8 we can see that the LTSA method extracts an effective feature dataset that is sensitive to changes in the bearing running state, while the feature extracted by the PCA method performs poorly: before the 700th point the fluctuation of the bearing running state is not even visible, and the change in trend is not obvious, so the bearing running state cannot be assessed effectively. This result indicates that the information extracted by LTSA could be more effective than that extracted by PCA.
After extracting the typical features, the CAO method is used to determine the embedding dimension of the SVM model, based on the theorem of phase space reconstruction. We first choose the delay time $\tau$ through the autocorrelation function.
The optimal time delay $\tau$ is determined when the first minimum value of $C(\tau)$ occurs.
Based on the extracted feature dataset, the delay time $\tau$ is set to 3 for the projected vector values through the autocorrelation function, as shown in Figure 9.
Then the embedding dimension is selected by the CAO method. The result is shown in Figure 10; the optimal embedding dimension $d$ for the projected vector is chosen as 10. Based on the selected optimal embedding dimension, the SVM model is used to achieve the multistep prediction. The RBF kernel function is used:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}}\right).$$

In this research the regularization parameter $C$ is set to 90.3, the kernel function parameter $\sigma^{2}$ is set to 20, and $\varepsilon$ is set to 0.001. The parameters $C$ and $\sigma$ are selected by the Particle Swarm Optimization (PSO) algorithm [30]. The population size of the PSO is set to 100, the iteration number is set to 20, and the fitness function is chosen so that the selected parameters minimize the fitting error of the SVM model in the training process. The error goal is set to 0.05, the dimension of the PSO is set to 2, the inertia weight is set to $w = 0.5$, and the acceleration coefficients are set to $c_1 = c_2 = 1.2$. Based on the selected typical features, the feature dataset is used to train the SVM model, and the number of input features of the SVM is 9, as determined by the embedding dimension. Then, the trained SVM model is used to predict the bearing running state. Before the 700th point the bearing is working in a normal state, so points 701-900 are used to train the SVM model and the following 85 points are employed for testing. In order to evaluate the prediction performance, the root-mean-square error (RMSE) is utilized as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}},$$

where $n$ represents the total number of data points in the test set, $y_i$ is the actual value in the training set or test set, and $\hat{y}_i$ represents the predicted value of the model. The actual value and the predicted result are shown in Figure 11.
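The RMSE used to score the predictions is simple enough to state directly in code. This is a small sketch with a hypothetical function name; the sample values in the usage example are made up and are not results from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


if __name__ == "__main__":
    print(rmse([0.30, 0.35, 0.40], [0.28, 0.37, 0.45]))
```

Because the errors are squared before averaging, a few badly predicted points, such as those at the end of a long horizon, dominate the score.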
From Figure 11 we can see that the trend of the bearing running state can be predicted by the SVM model. From the prediction we can get a general understanding of the future bearing running state, but the predicted result is not accurate, especially in the stage of points 70-85. The calculated RMSE between the actual and the predicted result is 0.0469, so the prediction result is not satisfactory. The prediction error of the SVM model (the predicted results minus the actual data) is shown in Figure 12.
Then the SOM method is used to divide the error into several intervals; the iteration number of the SOM is 2000, and the structure of the state classification matrix is [3 × 1]. The results of the state division by SOM are shown in Table 2.
It can be seen from Table 2 that when there is a downward trend or the point value is less than 0, the state is set to 1, and when there is an upward trend, the state is set ... Choose the three points 77, 76, and 75, which are the most recent to point 78, and set the transfer steps as 1, 2, and 3; the state prediction results based on the Markov model are shown in Table 3.
From Table 3 we can see that the accumulated value of state 3 is the largest, so the state of point 78 is set to 3, which agrees with the result of the SOM clustering method.
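The accumulated-probability vote described above can be sketched as follows: for the point one step back the row of $P$ for its state is taken, for the point two steps back the row of $P^{2}$, and so on; the rows are summed and the state with the largest accumulated value is chosen. The transition matrix and the recent states in the usage example are invented for illustration and are not the values behind Table 3.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)] for i in range(m)]


def predict_next_state(recent_states, P):
    """Predict the next state from the most recent states (most recent first).

    The k-th most recent point contributes the row of P**k that belongs to its
    state; the state with the largest accumulated probability wins.
    Returns (predicted_state, accumulated_values), with states numbered from 1.
    """
    m = len(P)
    acc = [0.0] * m
    power = P
    for k, state in enumerate(recent_states):
        if k > 0:
            power = mat_mul(power, P)
        acc = [a + p for a, p in zip(acc, power[state - 1])]
    best = 1 + max(range(m), key=lambda j: acc[j])
    return best, acc


if __name__ == "__main__":
    # Hypothetical transition matrix and states of points 77, 76, 75.
    P = [[0.1, 0.2, 0.7],
         [0.2, 0.2, 0.6],
         [0.1, 0.1, 0.8]]
    print(predict_next_state([3, 3, 2], P))
```

Using several past points with different step counts makes the vote less sensitive to a single misclassified state than a one-step lookup would be.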
In this research, according to the state predicted by the Markov model, the corrected value is calculated by $\tilde{x}^{(0)}(k) = \hat{x}^{(0)}(k) - \Delta$, where $\Delta$ is the median value of the divided state interval and $\hat{x}^{(0)}(k)$ is the value predicted through the SVM model.
For point 78, the corrected value is 0.57298−(0.1326+0.43612)/2 = 0.2956, where the actual value at this point is 0.37206. The other points have also been corrected through this method. The corrected results of points 70-85 are shown in Table 4.
From Table 4 we can see that the Markov model makes the results more precise, which validates the necessity of using the Markov model to improve the effect of the proposed method. The RMSE between the actual and the corrected result is 0.0091, so the prediction accuracy is improved significantly. However, because those points are far from the start of the prediction compared with predicted points 1-69, the results still have some error.
In order to compare the prediction effect, the most commonly used prediction model, the BP neural network (BPNN), is used to predict the bearing running state based on the selected features. The learning rate of the neural network and its momentum coefficient are 0.01; the weights are initialized to uniformly distributed random values between −0.1 and 0.1; the iteration number is 2000; the training error is 0.001; the number of input nodes is 9; the number of hidden nodes is 15; the number of output nodes is 1. The prediction results are shown in Figure 13.
From Figure 13 we can see that the prediction based on the BPNN model does not work effectively. There are some peaks at positions where the actual status shows no obvious change. The RMSE of the predicted result is 0.0932, so the traditional BPNN model is less effective than the SVM model. In addition, the prediction method based on the BPNN gives unstable results: when the same data are used to train and predict, the results differ, and the neural network may even fall into a local optimum, as shown in Figures 14 and 15.
With the same data, the proposed method has also been compared with other methods proposed in related research.
(1) The principal signal features extracted by PCA are utilized by HMM to predict the bearing running state [31].
(2) The time domain and frequency domain features are used directly as the input of the prediction model, and the result is predicted by a neural network algorithm [1].
(3) The original features are extracted by PCA and used as the input of the SVM prediction model [32].
(4) The method proposed in this research.
The RMSE of the results predicted by the different methods is shown in Table 5.
From Table 5, we can see that the RMSE values of the different prediction methods differ considerably. The method in which the original features are used directly as the input of the NN model works the worst. This is because the original features are still high dimensional and include superfluous information, which is not appropriate for state prediction; in addition, the NN prediction model has the drawbacks of slow convergence and difficulty in escaping from local minima. The prediction method based on the HMM model is less effective than the method based on the SVM model, because the HMM is not appropriate for long-term forecasting. The proposed method works the best, because the LTSA feature extraction method can effectively extract the typical features and reduce the dimension, and the SVM-Markov model can predict the state more precisely than the SVM or the Markov model alone. The comparison therefore shows that the proposed method is very effective in bearing running state prediction.

Application.
After validating the effectiveness of the proposed method, the method was applied to an actual application. The test rig is shown in Figure 16.
The bearings are mounted on the shaft, and the shaft is driven by an AC motor. The rotation speed is kept at 1000 rpm and a radial load of 3 kg is applied to the bearing. The data sampling rate is 25600 Hz and the data length is 102400 points; the data collected on 2011.11.25 are shown in Figure 17. The vibration data are collected once every 2 hours. The data collected from 2011.11.25 to 2011.12.17, after the bearing had been running for 1 year, are analyzed.
The time domain and time-frequency domain methods are used to process the collected vibration data as described in Section 2, Table 1. Then the features are normalized through $x_{\mathrm{norm}} = (x - x_{\min})/(x_{\max} - x_{\min})$ and mapped into the interval [0, 1]. The LTSA is used to reduce the dimensionality of the calculated features, and the result is shown in Figure 18.
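The min-max normalization applied to each feature can be sketched as below. The handling of a constant feature (all values mapped to 0) is an assumption added to avoid division by zero; it is not discussed in the text, and the function name is hypothetical.

```python
def min_max_normalize(values):
    """Scale a feature series into [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no trend, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Normalizing each feature separately keeps features with large physical ranges, such as IMF energies, from dominating the neighborhood distances used by LTSA.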
From Figure 18, we can see that the bearing running state fluctuates and has an upward trend. In particular, at about point 150 there is a sudden change of trend, which reflects a change in the bearing's working status at this moment.
After extracting the typical features, the CAO method is used to determine the embedding dimension of the SVM model. The delay time $\tau$ is set as 2 for the projected vector values through the autocorrelation function, as shown in Figure 19.
The embedding dimension is selected by the CAO method. The result is shown in Figure 20, and the optimal embedding dimension $d$ for the projected vector is chosen as 10.
Based on the selected optimal embedding dimension, the SVM model is used to achieve the prediction. The regularization parameter $C$ is set as 909.5 and $\sigma$ is set to 0.01, both selected by the PSO method. Based on the selected typical features, the feature dataset is used to train the SVM model: points 1-130 are used for training, and the following 20 points are employed for testing. The prediction results are shown in Figure 21.
From Figure 21 we can see that the trend of the bearing running state can be predicted by the SVM model; the prediction shows a general upward trend similar to the actual status. In addition, the sudden change at points 8 to 12 (near point 150 in the original signal, as mentioned for Figure 18) is also reproduced. However, the results are still not precise. In order to improve the prediction effect, the SOM method is used to divide the prediction error into several intervals; the iteration number of the SOM is 1000, and the structure of the state classification matrix is [3 × 1]. The results of the state division by SOM are shown in Table 6.
Based on the classification of the SOM model, the Markov states are divided into the intervals [−0.13365, −0.034682], [−0.034682, 0], and [0, 0.053935]; the Markov model is then used to reduce the prediction error.
The corrected results of points 1-20 are shown in Table 7.
From Table 7, we can see that the Markov model makes the results more precise.
The validation and the actual application show that the proposed method can predict the future status of the bearing, which is necessary for planning maintenance and reducing the risk of unexpected accidents.

Conclusions
(1) The time domain and time-frequency domain methods are used to extract the original features from the mass vibration data. In order to reduce the dimension and the superfluous information of the original features, the multifeature fusion technique LTSA is used to fuse the original features and reduce the dimension.
(2) The proposed SVM model is used to achieve bearing running state prediction. The proposed approach is validated by real-world vibration signals. The results show that the proposed methodology has high accuracy and is effective for bearing running state prediction.
(3) This research gives an example of combined approaches for bearing running state prediction. The analysis and validation show that the proposed method makes good use of the advantages of each part and achieves high recognition accuracy and efficiency.
(4) As the redundancy increases, the complexity of computation increases as well. This is one of the main shortcomings of the proposed method, which will be explored in the future.

Figure 1: The flowchart of the proposed method.

Figure 4: The bearing components seriously damaged after the test, with a roller element defect and an outer race defect.

Figure 7: The first main projected vector of the test bearing based on the LTSA method.

Figure 10: Selection of the embedding dimension by the CAO method of LTSA1.

Figure 12: The prediction error of the SVM model.

Figure 13: Prediction result based on the BPNN model.

Figure 14: The training process curve of the BPNN method.

Figure 15: The actual data and the result predicted by BPNN when it falls into the local optimum.

Figure 19: Selection of the delay time by the autocorrelation function value of LTSA1.

Figure 20: Selection of the embedding dimension by the CAO method of LTSA1.

Table 1.
The measurements value of kurtosis and skewness are depicted in Figures 5(a

Table 2: The result of the state division and the prediction results. Columns: prediction number, prediction error, state.

Table 3: Table of the probability status.

Table 4: The corrected results of the points 70-85 based on Markov model.

Table 5: The RMSE results of different prediction methods. Methods listed: the features extracted by PCA are utilized by the HMM prediction model; the features are directly used as the input of the Neural Network prediction model; the features extracted by PCA are used as the input of the SVM prediction model.

Table 6: The result of the state division and the prediction results.

Table 7: The corrected results of the points 1-20 based on Markov model.