Manifold Learning with Self-Organizing Mapping for Feature Extraction of Nonlinear Faults in Rotating Machinery

A new method for automatically extracting low-dimensional features with a self-organizing mapping manifold is proposed for the detection of nonlinear faults in rotating machinery (such as rubbing and pedestal looseness). In the phase space reconstructed from a single vibration signal, self-organizing mapping (SOM) with the expectation maximization iteration algorithm is used to divide the local neighborhoods adaptively without manual intervention. The local tangent space alignment algorithm is then adopted to compress the high-dimensional phase space into a low-dimensional feature space. The proposed method combines the strength of manifold learning in low-dimensional feature extraction with the adaptive neighborhood construction of SOM, and it can extract the intrinsic fault features of interest in a two-dimensional projection space. To evaluate the performance of the proposed method, the Lorenz system was simulated and vibration data from rotating machinery with nonlinear faults were obtained for test purposes. Compared with holospectrum approaches, the results reveal that the proposed method is superior in identifying faults and is effective for rotating machinery condition monitoring.


Introduction
Rotating machinery covers a wide range of mechanical equipment and is important in industrial applications. Faults in rotating machinery may therefore severely affect industrial operations and even safety. To minimize the number of breakdowns and to increase reliability, the condition of rotating machinery should be monitored for symptoms and incipient faults. In this way, the life of the machinery can be prolonged and the catastrophic consequences of unplanned failure can be avoided. Traditionally, vibration signals are most often selected to monitor the condition and diagnose the faults of rotating machinery because they are easy to measure and analyze [1][2][3][4]. The vibration signal processing techniques in common use are the frequency spectrum, the axis center orbit, time-frequency analysis, and so on. However, for nonlinear faults of rotating machinery, such as rubbing, pedestal looseness, fluid excitation, and so forth, some problems remain; for example, the amplitude and phase information of the vibration signals are separated from each other, and the correlation between the vibration signals in the vertical and horizontal directions cannot be obtained. Hence, it is difficult to recognize the full picture of the rotor's vibration.
To overcome the shortcomings of the traditional methods, the holospectrum was put forth to synthesize the phase, amplitude, and frequency information [5]. As an effective fault diagnosis technology for rotating machinery, it uses an improved Fourier transform algorithm to analyze the vibration displacement signals from one measuring section, which can provide much more information about rotor vibration behavior. However, it relies on human experience for good results. With the holospectrum, the proper harmonic components must be selected for a correct judgment. For example, for a misalignment fault, the axis center orbit of the rotor is synthesized from the 1x, 2x, and 4x frequency components, while the axis center orbit for oil whirl is built from the 0.45x frequency component. That is to say, the method cannot deal with vibration data without manual intervention.
Due to instantaneous variations in friction, damping, and load, mechanical systems are often characterized by nonlinear behavior. Nonlinear analysis methods are therefore a good choice for extracting defect-related features hidden in the measured signals, which may not be effectively identified using conventional methods. Many nonlinear methods, such as the correlation dimension, the Lyapunov exponent, and approximate entropy [6], have been investigated. These methods are suitable for revealing variations of the dynamical system under noise-free or low-noise conditions. In fact, the vibration signals obtained from a mechanical system are inevitably contaminated by noise. Thus the above methods are conducted by averaging all points in the embedding space, and this may lose significant information about the time domain.
As a new dimension reduction technique, manifold learning methods, such as locally linear embedding [7], isometric feature mapping [8], and local tangent space alignment [9], have emerged in nonlinear research fields to identify meaningful low-dimensional structures hidden in high-dimensional observations. These methods have been applied in computer vision, document analysis, and fault diagnosis [10][11][12]. Yang et al. [13] proposed a method for nonlinear time series noise reduction based on principal manifold learning and applied it to the analysis of the vibration signal of a gearbox with a broken tooth. Li et al. [14] proposed a multiple manifolds analysis approach to extract manifold information from bearing vibration signals with different faults. For rotor systems, Jiang et al. [15] recently proposed a supervised manifold learning algorithm for effective feature extraction. This survey indicates that manifold learning is an effective method for feature extraction. However, the features are usually extracted under the assumption that the sample data are uniformly distributed, ignoring the influence of neighborhood size.
Obviously, a high-dimensional neighborhood constructed from a vibration signal cannot ensure a uniform distribution. A single neighborhood size can falsely estimate the relationships between the neighbors; it is therefore worth considering a variable number of neighbors that is chosen adaptively. In order to distinguish the nonlinear faults of rotating machinery from vibration signals, a new low-dimensional embedding extraction method based on local tangent space alignment combined with self-organizing mapping is proposed. The main advantages of the approach, compared with other nonlinear analysis methods, are as follows: vibration signals are embedded into a high-dimensional space, which is more effective for discovering the essential characteristics of the dynamical system, and the type of fault can be distinguished with less manual intervention. In short, the new approach extracts the low-dimensional embedding from the manifold to reflect the states of the mechanical system, rather than extracting a feature by averaging all points of the time waveform.
The rest of the paper is organized as follows: a brief introduction to manifold learning with self-organizing mapping is given in Section 2. In Section 3, the details of the feature extraction scheme are presented. Section 4 applies the method to the detection of nonlinear faults in rotating machinery. Finally, conclusions are drawn in Section 5.

Manifold Learning with Self-Organizing Mapping
2.1. Adaptive Selection of Neighborhood. Obviously, large neighborhoods cause confusion when dealing with a highly twisted manifold. In contrast, small neighborhoods can falsely estimate the relationships between the neighbors. When noise is added, the distribution of samples in the feature space is usually nonuniform. Thus, neighborhoods of fixed size cannot follow the changing manifold structure. The neighborhood size should therefore be selected adaptively, on the principle that all subspaces should be connected to construct the topological structure of the manifold. Meanwhile, there should be enough overlap between adjacent neighborhoods to transmit the local information.
From the network point of view, self-organizing mapping (SOM) has the ability to divide nodes adaptively. By using competing-layer neurons to match the centers of the local neighborhoods of the manifold structure, the node grid is organized to cover the topological structure. Then, as the SOM learns, the local neighborhoods of the high-dimensional manifold are divided adaptively.

2.2. Self-Organizing Mapping with EM.
A SOM is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional, discretized representation of the input space of the training samples, called the mapping space. A self-organizing mapping consists of components called nodes. Associated with each node are a weight vector of the same dimension as the input data vectors and a position in the mapping space.
Let $p_{ij}$ denote the probability that input $x_i$ is assigned to the node with weight $w_j$. It is constrained by $\sum_j p_{ij} = 1$ and $p_{ij} > 0$. There is a neighborhood function $h_{j,k}$ that corresponds to the control strength between node $j$ and node $k$. Usually, it is a decreasing function of the distance between nodes $j$ and $k$. Given the data X, the goal is to find the probability assignments P and weights W that minimize the objective (1), in which $d(j, k)$ is the distance between nodes of the network and D is the distance between inputs X and weights W.
The closer the distance between the nodes, the smaller the value of $h_{j,k}$. The log-likelihood function (2) is then defined in terms of the prior probability distribution, whose initial value is usually set to the uniform distribution. Substituting (1) into (2) gives the free energy function $F(P, W)$ in (3). To minimize $F(P, W)$ subject to the constraint conditions, the implementation of SOM can be viewed as seeking the posterior distribution density function of P and W given the known data X. This makes neighborhood selection with EM iteration [16] a natural choice. The neighborhood selection algorithm is therefore as follows.
(1) Calculate the neighborhood matrix H of the topology network in the initial output layer (with columns $h_1, h_2, \ldots$) and normalize it. The neighborhood function depends on $d(j, k)$, the distance between nodes of the network. Set the initial values, from which the sum of the relative entropy in the initial competitive-layer network is defined. The initial weight matrix W is given by $W = \mathrm{diag}\big((\sum Q)^{-1}\big)\, Q\, X$, where the elements of Q are random values.
(2) The location coordinates of the topology nodes are set to the elements of the weight W. Then the function F is calculated from two terms: a sum of minima taken over the distances D between inputs X and weights W, and the sum of relative entropy in the competition-layer network. For each input $x_i$, the global optimal unit is the node that attains the minimum distance, and its index is taken as the winner neuron index for input $x_i$.
(3) F is minimized iteratively. The nodes whose values exceed that of the optimal unit are found in the updated result and are labeled with V, and the node distribution is adjusted accordingly.
(4) Set $h_{\mathrm{new}} = h \cdot \mathrm{rate}$, in which rate is the learning ratio.
Then the new e, Q, W, and F are calculated, and the procedure returns to Step (3) until the elements of H are big enough. Finally, with the above iteration, the weight W is calculated and then fixed in the maximization for the new value.
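The alternation between updating the assignments P and the weights W can be sketched in Python as follows. This is a minimal soft-assignment SOM in the spirit of the procedure above, not a reproduction of it: the Gaussian neighborhood, the squared Euclidean distance, the sharpness parameter `beta`, and all function and argument names are assumptions made for illustration.

```python
import numpy as np

def soft_som_em(X, grid_shape=(4, 4), n_iter=30, sigma=1.0, beta=5.0, seed=0):
    """Soft-assignment SOM: alternate an E-step (P) and an M-step (W).

    X is (n_samples, dim); requires n_samples >= number of grid nodes.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Node positions on a 2-D grid in the mapping space.
    gy, gx = np.meshgrid(np.arange(grid_shape[0]),
                         np.arange(grid_shape[1]), indexing="ij")
    pos = np.column_stack([gy.ravel(), gx.ravel()]).astype(float)
    k = pos.shape[0]
    # Assumed Gaussian neighborhood over the grid distance d(j, k),
    # with each row normalized to sum to one.
    d2_grid = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    H = np.exp(-d2_grid / (2.0 * sigma ** 2))
    H /= H.sum(axis=1, keepdims=True)
    # Initialize the weights from randomly chosen samples.
    W = X[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iter):
        # D[i, j]: squared distance between input x_i and weight w_j.
        D = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=-1)
        # E-step: neighborhood-smoothed cost, then softmax assignments.
        cost = D @ H.T                      # cost[i, j] = sum_k h_jk * D[i, k]
        cost -= cost.min(axis=1, keepdims=True)
        P = np.exp(-beta * cost)
        P /= P.sum(axis=1, keepdims=True)
        # M-step: each weight is a neighborhood-weighted mean of the inputs.
        R = P @ H                           # R[i, k] = sum_j p_ij * h_jk
        W = (R.T @ X) / R.sum(axis=0)[:, None]
    return W, P
```

The rows of `W` returned here would play the role of the center nodes used for neighborhood selection. The scaling of H by a learning ratio between iterations, described in Step (4), is left out of this sketch.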

2.3. Manifold Learning with SOM.
Manifold learning aims at discovering the intrinsic structure of nonlinear data. The process of manifold learning with SOM is shown in Figure 1, and the implementation procedure is detailed as follows.
(1) Given a set of inputs X, the SOM network is adopted to optimize the weights W. Because the vibration signals include multifrequency components or noise, the trajectory in the reconstructed phase space is complex. Therefore, to balance computation and efficiency, the network size is usually set to a larger scale. Meanwhile, to keep the optimal results consistent, the initial weight W can be set to the unit matrix, and the learning ratio of H is set to 1.1 for a gradual learning process.
(2) Selecting neighborhoods adaptively: each element of W is used as the center node of a local neighborhood. To ensure enough overlap, the radius of a neighborhood is set equal to half of the maximal distance between center nodes. According to this radius, the local neighborhoods $X_i$ are selected, where $i = 1, 2, \ldots, N$ and $N$ is the number of topology grids.
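(3) Extracting local information: compute the $d$ largest eigenvectors $g_1, \ldots, g_d$ of the correlation matrix $(X_i - \bar{x}_i e^T)^T (X_i - \bar{x}_i e^T)$, and, as in LTSA [9], set $G_i = [e/\sqrt{k_i},\, g_1, \ldots, g_d]$ and $B_i = I - G_i G_i^T$, where $\bar{x}_i$ is the mean of $X_i$, $k_i$ is the number of samples in $X_i$, and $e$ is a vector of ones.
(4) Constructing the alignment matrix: form the matrix B by locally summing $B(X_i, X_i) \leftarrow B(X_i, X_i) + B_i$ if a direct eigensolver will be used. Otherwise, implement a routine that computes the matrix-vector multiplication $Bu$ for an arbitrary vector $u$.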
From the viewpoint of geometry, vibration data from the same operating state of the rotor system have the same geometric properties in spatial distribution or topological structure, so their mapping points in the low-dimensional embedding space are distributed on the embedded manifold or in its neighborhood. However, the embedding dimension $d$ of the manifold is related to $m_{\min}$, the minimum embedding dimension of the dynamical system, which is estimated with the Cao algorithm [17].
Based on the adaptive neighborhood selection, the low-dimensional embedding can be extracted effectively. Since the node neighborhoods are divided adaptively by the network grid through the competition mechanism, SOM can overcome the limitation of fixed-neighborhood algorithms.
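The neighborhood selection and alignment steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the radius is read literally as half of the largest pairwise distance between center nodes, each sample is additionally attached to its nearest center so that no sample is left uncovered (a safeguard added here), and the last step takes the eigenvectors of B for the 2nd to (d+1)th smallest eigenvalues, as in standard local tangent space alignment. The function names are hypothetical.

```python
import numpy as np

def radius_neighborhoods(X, centers):
    """Index sets of the samples that fall around each center node."""
    dc = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    # Assumed literal reading: half of the maximal distance between centers.
    radius = 0.5 * dc.max()
    dx = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    nearest = dx.argmin(axis=1)
    # A sample joins neighborhood j if it lies within the radius of center j
    # or if center j is its nearest center (coverage safeguard, an assumption).
    return [np.flatnonzero((dx[:, j] <= radius) | (nearest == j))
            for j in range(centers.shape[0])]

def ltsa_embed(X, neighborhoods, d=2):
    """Align local tangent coordinates into a global d-dimensional embedding."""
    n = X.shape[0]
    B = np.zeros((n, n))
    for idx in neighborhoods:
        k = idx.size
        if k <= d:
            continue  # too few samples to span a d-dimensional tangent space
        Xc = X[idx] - X[idx].mean(axis=0)            # center the neighborhood
        U, _, _ = np.linalg.svd(Xc, full_matrices=False)
        # G = [e / sqrt(k), g_1, ..., g_d]: constant vector plus the d
        # leading local coordinate directions.
        G = np.column_stack([np.full(k, 1.0 / np.sqrt(k)), U[:, :d]])
        B[np.ix_(idx, idx)] += np.eye(k) - G @ G.T   # local summation into B
    _, vecs = np.linalg.eigh(B)                      # eigenvalues ascending
    # Skip the first eigenvector (ideally constant) and keep the next d.
    return vecs[:, 1:d + 1]
```

With `centers` taken from trained SOM weights, `ltsa_embed(X, radius_neighborhoods(X, centers), d=2)` returns one two-dimensional point per phase-space sample. A dense B is used for brevity; for long signals a sparse matrix and an iterative eigensolver would be the more practical choice.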

Schematic Diagram of Feature Extraction
3.1. Manifold Reconstruction from Vibration Signal. The state space is constructed from a set of basis vectors composed of the dynamic variables of a system. Most commonly, however, not all the dynamic variables of the system are accessible for measurement, so an alternative form known as the embedded phase space is convenient for studying the dynamics of the system. Suppose the measurements obtained through sampling are denoted by $x_i$. In order to reconstruct a manifold from a time series, time delay embedding [18] is employed. For a time series $X = [x_1, x_2, \ldots, x_n]$, the delay vectors in the embedded phase space are given as
$$X_i = [x_i,\; x_{i+\tau},\; \ldots,\; x_{i+(m-1)\tau}],$$
where $i = 1, 2, \ldots, N$ with $N = n - (m-1)\tau$, $m$ is the embedding dimension, and $\tau$ is the delay time.
Roughly speaking, the $m$-dimensional delay embedding space is equivalent to the original observed state space of the dynamical system. Taking the steady-state response of the variable $x$ and performing the time delay embedding transformation leads to trajectories in the embedding space that form a geometric manifold object. Takens has shown [19] that, using delay coordinates and the embedding procedure, a manifold can be reconstructed by embedding a time series into a high-dimensional space, in which the topological structure and nonlinear characteristics hidden in the one-dimensional time series can be easily extracted. Thus, reconstructing a manifold in a high-dimensional space is the key to the approach proposed in this paper. However, the selection of the time delay and the embedding dimension in phase space reconstruction remains an open question. Besides a uniform time delay, a nonuniform time delay is also used to build the phase space [20]. It is well known, however, that even a nonuniform time delay cannot guarantee that the delay vectors are uniformly distributed. So, in order to simplify the problem, the Hankel matrix is adopted in this paper to represent the high-dimensional space, where $\tau = 1$ and the embedding dimension $m$ is large enough. Namely, if $m \gg m_{\min}$, where $m_{\min}$ is the minimum embedding dimension, it is possible to reconstruct the underlying dynamic structure of the entire system from the behavior of a single variable.
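The delay embedding, and the Hankel matrix as its special case with a unit delay, can be sketched in Python as follows. The function name `delay_embed` and its arguments are hypothetical; each row of the returned array is one delay vector.

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Return the delay vectors of a 1-D series as rows of a 2-D array.

    Row i is [x[i], x[i + tau], ..., x[i + (m - 1) * tau]].
    With tau = 1 the result is the Hankel matrix of the series.
    """
    x = np.asarray(x, dtype=float)
    n_vec = x.size - (m - 1) * tau
    if n_vec <= 0:
        raise ValueError("series too short for the requested embedding")
    # Broadcast a column of start indices against a row of lag offsets.
    idx = np.arange(n_vec)[:, None] + tau * np.arange(m)[None, :]
    return x[idx]
```

For example, `delay_embed(signal, m=30, tau=1)` builds a 30-dimensional phase space from a single vibration channel, with one row per point of the trajectory.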

3.2. Schematic Diagram.
For feature extraction in rotating machinery fault diagnosis, manifold learning with SOM is adopted to explore the geometric distribution properties embedded in the high-dimensional space. On the basis of the principles above, a new feature extraction method based on adaptive manifold learning is proposed. First, the high-dimensional observation space is built by phase space reconstruction. The phase space data are then mapped into a feature space, and the intrinsic distribution of the samples is estimated to obtain the embedded manifold structure. Finally, the feature is represented by two-dimensional projections for intuitive analysis of the equipment operating status. The schematic diagram of the feature extraction method based on manifold learning is shown in Figure 2.
In application, it should be noted that the holospectrum is drawn from different harmonic components chosen by manual intervention. In the proposed method, instead of selecting frequency components, the signal is used directly to construct the dynamic trajectory by phase space reconstruction; then, through the adaptive neighborhood selection strategy, the embedded low-dimensional manifold is extracted, which reduces the dependence on human experience.

Experimental Verification
To verify the feature extraction capability of the proposed method, the nonlinear Lorenz system was adopted for the test. It is described as
$$\dot{x} = \sigma (y - x), \qquad \dot{y} = x (r - z) - y, \qquad \dot{z} = x y - b z,$$
where the parameters are $\sigma = 16$, $b = 4$, and $r = 45.92$, and 8 dB white noise was also mixed in. According to the evaluation index proposed in the Cao algorithm, the $E_1(d)$ and $E_2(d)$ curves were calculated, where the time delay $\tau = 1$. The result is shown in Figure 3.
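A test signal of this kind can be generated with the following Python sketch. It integrates the standard Lorenz equations with the three parameter values quoted above (named `sigma`, `b`, and `r` here by assumption) using a fixed-step fourth-order Runge-Kutta scheme, and it interprets "8 dB white noise" as a signal-to-noise ratio of 8 dB. The step size, initial state, and function names are also assumptions.

```python
import numpy as np

def lorenz_rhs(s, sigma=16.0, b=4.0, r=45.92):
    """Right-hand side of the standard Lorenz equations."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - b * z])

def simulate_lorenz(n_steps=5000, dt=0.01, s0=(1.0, 1.0, 1.0)):
    """Integrate the Lorenz system with fixed-step RK4; returns (n_steps, 3)."""
    traj = np.empty((n_steps, 3))
    s = np.array(s0, dtype=float)
    for i in range(n_steps):
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        s = s + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[i] = s
    return traj

def add_white_noise(x, snr_db, seed=0):
    """Add Gaussian white noise at the given signal-to-noise ratio in dB."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean((x - x.mean()) ** 2)       # power of the zero-mean signal
    p_noise = p_signal / 10.0 ** (snr_db / 10.0)
    return x + rng.normal(0.0, np.sqrt(p_noise), size=x.shape)
```

A single noisy observable, for example `add_white_noise(simulate_lorenz()[:, 0], 8.0)`, then stands in for the measured vibration signal in the rest of the procedure.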
In manifold learning, the minimum embedding dimension is set to 3. The two-dimensional projection of the phase space is shown in Figure 4, where the embedding dimension $m$ of the Hankel matrix is set to 30. Because of the mixed noise, the dynamic trajectory is difficult to identify. The two-dimensional projection of the Lorenz system extracted by the proposed manifold learning technique is shown in Figure 5. The embedded manifold structure is found to be similar to that of the Lorenz system.
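The minimum embedding dimension quoted above comes from Cao's method, which can be sketched in Python as follows. This is a brute-force illustration with hypothetical function names: it uses the maximum norm, builds a full pairwise distance array (so memory grows with the square of the series length), and simply skips points whose nearest neighbor is at zero distance, whereas Cao's original description takes the next nearest neighbor in that case.

```python
import numpy as np

def _embed(x, d, tau):
    """Delay vectors of dimension d as rows."""
    n = x.size - (d - 1) * tau
    return x[np.arange(n)[:, None] + tau * np.arange(d)[None, :]]

def cao_e1_e2(x, d_max=8, tau=1):
    """Return the arrays E1(d) and E2(d) for d = 1, ..., d_max."""
    x = np.asarray(x, dtype=float)
    E, Es = [], []
    for d in range(1, d_max + 2):
        n = x.size - d * tau              # points that also exist in dimension d + 1
        Y = _embed(x, d, tau)[:n]
        Y1 = _embed(x, d + 1, tau)        # has exactly n rows
        # Pairwise maximum-norm distances in dimension d.
        dist = np.abs(Y[:, None, :] - Y[None, :, :]).max(axis=-1)
        np.fill_diagonal(dist, np.inf)
        nn = dist.argmin(axis=1)          # nearest neighbor of each point
        dn = dist[np.arange(n), nn]
        dn1 = np.abs(Y1 - Y1[nn]).max(axis=1)
        ok = dn > 0                       # skip exact duplicates (simplification)
        E.append(np.mean(dn1[ok] / dn[ok]))
        # E*(d): mean one-step difference between a point and its neighbor.
        Es.append(np.mean(np.abs(x[np.arange(n) + d * tau] - x[nn + d * tau])))
    E, Es = np.array(E), np.array(Es)
    return E[1:] / E[:-1], Es[1:] / Es[:-1]
```

The minimum embedding dimension is read off as the point where `E1` stops changing with `d`, while `E2` staying close to 1 for every `d` indicates a signal that is indistinguishable from random noise.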
With the EM iteration, the neighborhood sizes learned from the SOM are shown in Figure 6. The neighborhood sizes of the nodes fluctuate between 4 and 78. For comparison, the adaptive neighborhood selection algorithm based on neighborhood contraction and expansion [21] is also used to extract the low-dimensional embedding. The two-dimensional projection is displayed in Figure 7. Compared with Figure 5, the projection of the reconstructed phase space shows a certain deformation, because the coefficient selection of contraction and expansion of

Application
The proposed method is applied to feature extraction for nonlinear faults of rotating machinery. First, a fluid excitation failure in the high-pressure cylinder of an N2 compressor in a petrochemical plant is used for feature extraction. In normal operation of rotating machinery, the fluid medium passes the rotor in a laminar flow state, so the vibration of the machinery is small. However, inappropriate adjustment of the process parameters can lead to steady turbulent flow, which results in impacts on the rotor. The compressor rotating speed is 11 416 r/min. Displacement transducers were used to acquire the vibration signals of the rotor at the corresponding measurement points on the coupling end. The sampling frequency is 2 000 Hz, and the vibration waveform and frequency spectrum are shown in Figures 8 and 9, respectively. In Figure 9, apart from the rotation frequency of 190.3 Hz, there are a large number of low-frequency noise components. Therefore, the original axis center orbit plotted in Figure 10 is submerged in noise. In order to eliminate the noise, a phase-preserving band-pass filter is used, and the corresponding filtered axis center orbit is shown in Figure 11. According to the mechanism of the fluid excitation fault, the typical axis center orbit is an unstable ellipse. However, because the filtered axis center orbit represents the fluid excitation fault poorly, it is not easy to identify the fault pattern accurately. For comparison, the proposed method is adopted to extract the embedded manifold from the collected vibration data. The $E_1(d)$ and $E_2(d)$ curves were calculated and are shown in Figure 12, and the minimum embedding dimension is set to 6. Among the corresponding 6 embedding vectors extracted with $m = 30$, there are three kinds of frequency spectrum structure, shown in Figures 13-15. Compared with Figure 9, it is clear that the rotation frequency and the low-frequency noise components are separated effectively. The projection space of the embedded manifold corresponding to the rotation frequency is displayed in Figure 16.
It is easy to see that, because of the fluid excitation as the flow passes the rotor, the projection trajectory has the characteristic of an unstable ellipse trace, which differs from that of other faults. So, according to this special curve of the embedded manifold, the fault can be identified effectively. By contrast, the axis center orbit shown in Figure 11 is of limited use for identifying this fault.
In order to verify the capability of the proposed method, LTSA with different fixed neighborhood sizes was used to extract the low-dimensional embedding, where the neighborhood size $k$ is set to 10 and 30, respectively. The experimental results with the different values of $k$ are shown in Figures 17 and 18. It is clear that the projection trajectory of the embedded manifold with a fixed neighborhood size cannot reflect the feature structure of the fluid excitation fault, and the projection results are not as good as those of the proposed method displayed in Figure 16. The SOM neighborhood selection is thus more effective than a fixed neighborhood size. In addition, the other adaptive neighborhood selection approach in manifold learning is also used to extract the low-dimensional embedding, and the two-dimensional projection is displayed in Figure 19. Although the projection of this low-dimensional embedding is better than that obtained with a fixed neighborhood size, it is not as good as that of the proposed method.
Next, a rotor-stator rub fault of a turbine in a refinery plant is analyzed. It should be noted that, because of the motion interference between stator and rotor, the axis orbit of a rub fault is extremely complex and often shows a sudden change in the axis center orbit. However, considering the influence of noise, the characteristics can be observed effectively only when the fault is severe. Each component in the frequency spectrum corresponds to a sinusoidal component. So different faults can be represented with eight major harmonic frequency components and four low-frequency components, and these frequency components are usually extracted for diagnosis purposes. Thus, the rotation frequency and its harmonic components can be selected to reconstruct the time domain waveform via the inverse Fourier transform, and the purified axis center orbit is drawn from the reconstructed signals in the measuring section. The main purpose is to eliminate the noise of the original axis center orbit. As seen from Figure 22, it is difficult to identify the fault from the waveform or the axis center orbit. According to the evaluation, the minimum embedding dimension is set to 4. Among the corresponding four embedding vectors, there are two kinds of frequency spectrum structure, shown in Figure 23. It is clear that the rotation frequency and its harmonic components are extracted in different embeddings. The projections of the embedded manifold corresponding to the rotation frequency and to the high frequency are displayed in Figure 24. It can be seen that, because of the influence of the high-frequency components, there are some sudden changes in the two projection trajectories. Compared with the purified axis center orbit, the change of trajectory in the projection shown in Figure 24 reflects the characteristics of the rub fault very clearly. As a typical nonlinear fault, it can thus be identified effectively in the low-dimensional embedding space.
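The purification step described above, in which only the rotation frequency and its harmonics are kept before an inverse Fourier transform, can be sketched in Python as follows. The function name, the default harmonic orders, and the tolerance `half_width_hz` around each harmonic are assumptions for illustration; the source does not give these details.

```python
import numpy as np

def harmonic_reconstruct(x, fs, f_rot, harmonics=(1, 2, 3, 4), half_width_hz=1.0):
    """Rebuild a waveform from selected harmonics of the rotation frequency.

    x: sampled vibration signal, fs: sampling frequency in Hz,
    f_rot: rotation frequency in Hz, harmonics: multiples of f_rot to keep.
    """
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    keep = np.zeros(freqs.size, dtype=bool)
    for h in harmonics:
        # Keep the spectral bins within the tolerance of each harmonic.
        keep |= np.abs(freqs - h * f_rot) <= half_width_hz
    # Zero every other bin and transform back to the time domain.
    return np.fft.irfft(np.where(keep, spec, 0.0), n=x.size)
```

Applying the function to the horizontal and vertical displacement channels of one measuring section and plotting one reconstructed signal against the other gives a purified axis center orbit. The choice of `harmonics` is exactly the manual selection that the proposed method is meant to avoid.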
In order to further test the effectiveness of the proposed method, LTSA with a fixed neighborhood size was used to extract the low-dimensional embedding, where $k$ is set to 20. The experimental result is shown in Figure 25, and Figure 26 shows the corresponding frequency spectrum of the embedded manifold. Because of the small neighborhood size, the high-frequency components are filtered out of the embedding, as shown in Figure 26; therefore, the sudden change behavior cannot be reflected in the trajectory of the low-dimensional embedding. It is clear that the method with a fixed neighborhood size cannot extract an effective low-dimensional embedding; thus the identification of the fault is seriously affected by the poor projection orbit.

Figure 1: Schematic diagram of the proposed manifold learning with SOM method.

Figure 2: Schematic diagram of feature extraction strategy.

Figure 24: Projection of embedding manifold of rotor-stator rub.