A New Feature Extraction Technique Based on 1D Local Binary Pattern for Gear Fault Detection

Gear fault detection is one of the underlying research areas in the field of condition monitoring of rotating machines. Many methods have been proposed for this task. One of the major tasks in obtaining the best fault detection is to determine what type of feature(s) should be extracted from the signal. In this paper, a new method, called the 1D local binary pattern (1D LBP), is used to extract features from the vibration signal. Vibration signals of a rotating machine with normal, break, and crack gears are processed for feature extraction. The features extracted from the original signals are used as inputs to classifiers based on k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) for three classes (normal, break, or crack). The effectiveness of the proposed approach is evaluated for gear fault detection on the vibration data obtained from the Prognostic Health Monitoring (PHM’09) Data Challenge. The experimental results show that the 1D LBP method can extract effective and relevant features for detecting faults in the gear. Moreover, we have adopted the Leave One Speed Out (LOSO) and Leave One Load Out (LOLO) cross-validation approaches to investigate the effects of speed and load on fault detection.


Introduction
Ensuring the safe running of rotating machines is a major challenge in fault detection and diagnostics. Vibration signal analysis has been widely used for fault diagnostics. With increasing improvements in vibration signal analysis, more accurate fault-detection techniques are being developed. In the area of gear fault detection, researchers are constantly investigating techniques for extracting features relevant to fault detection.
Among several signal analysis methods, the fast Fourier transform (FFT) is one of the most widely used and well-established methods. For instance, Pan and Sas in [1] conducted two tests, one to measure transient vibration signals and another to analyse the nonstationary vibration response of a rotor-dynamic system with both clutch and brake. Unfortunately, FFT-based methods are not suitable for nonstationary signal analysis and are not able to reveal the inherent information of nonstationary signals [1]. On the other hand, both the wavelet scalogram and the wavelet transform are effective methods for extracting relevant features of vibration signals for fault diagnostics of rotating machinery and are suitable for nonstationary signal analysis. In [2], statistical feature vectors were obtained using Morlet wavelet coefficients, which were utilized as the input to Support Vector Machine (SVM) classifiers. Al-Atat et al. in [3] developed a model that made specific fault signatures more visible by applying wavelet decomposition to the raw signal. However, the wavelet scalogram is incapable of achieving good concentration in both the time and the frequency domain. Moreover, the wavelet transform cannot fully estimate the "good" features, because the vibration signal is composed of several components, which makes it difficult to identify features for each component by wavelet transform alone [2,4,5]. Momoh and Dias [6] applied both FFT and wavelet transform to the extraction of features for fault detection and found that the wavelet transform scheme outperformed the FFT scheme.

Shock and Vibration
Another method of fault detection is called Time Synchronous Average (TSA). TSA is a signal processing technique that is used to extract repetitive signals from additive noise [7,8]. Peng et al. [4] used a TSA technique in the time and frequency domains. A TSA signal was obtained by applying the TSA technique to the vibration signal. Statistical features were then obtained from the TSA signal. Their results showed that the TSA in the frequency domain is more sensitive to fault detection; however, the spectral analysis may be incapable of detecting gear failures at an early stage [7]. Moreover, the TSA in the frequency domain can be a successful technique if the frequency of the deterministic component is constant, but in reality a vibration signal contains small frequency variations [9,10].
Do and Chong in [11] reported that the one-dimensional vibration signal could be converted to a two-dimensional grayscale image. They extracted local features from the grayscale image using the scale-invariant feature transform (SIFT). SIFT produced 128-dimensional key point descriptors, which were utilized for the classification of motor faults. The proposed method was efficient at diagnosing motor faults in the presence of background noise. However, there are some serious disadvantages of using SIFT. Firstly, the number of key points is uncertain and differs between images. Secondly, SIFT has a high computational cost in processing 128-dimensional feature descriptors.
Shahriar et al. in [12] extracted LBP features from images obtained from the vibration signal in order to create a fault diagnosis system for induction motors. These feature descriptors were then utilized by the classifier to diagnose faults in the motor. The method was effective in discriminating between the normal condition and a single fault at a time but was incapable of discriminating texture patterns for different fault categories. Moreover, the method required more complex computation, namely, the conversion of the vibration signal into an image followed by the application of LBP.
In this paper, we use a one-dimensional LBP inspired by the works in [13,14], whose authors were the first to adopt 1D LBP extraction from a one-dimensional speech signal. The advantage of 1D LBP is the possibility of choosing fewer than eight bits and consequently a smaller number of features. Additionally, there is no need to normalize the vibration signal values to fit a proper image format. Our experimental results show comparable accuracy between our 1D LBP-based model, which considers six neighbours, and a 2D LBP scheme that exploits eight neighbours.
In order to investigate the effect of different conditions (speed and load), we adopt two special cross-validation techniques called Leave One Speed Out (LOSO) and Leave One Load Out (LOLO). This kind of cross-validation provides an experimental environment in which all the samples belonging to one condition are used to test the model, while the model is trained on samples belonging to the other conditions.
Section 2 explains the 1D LBP procedure, Section 3 describes the vibration data used in the experiments, Section 4 explains the experimental work, Section 5 discusses the results obtained, and Section 6 concludes the paper.

1D Local Binary Pattern
The local binary pattern is a nonparametric operator. The LBP code describes the data using the differences between a sample and its neighbours [15,16]. LBPs have been widely used, particularly in face recognition systems [16-18]. At a fixed pixel position, the LBP operator is described as an ordered set of binary comparisons of pixel intensities between the centre pixel and its neighbouring pixels. LBPs used for images utilize the pixel neighbours in two dimensions, which is called 2D LBP.
Although it is not widely used, 1D LBP can provide similar characteristics to the 2D LBP. For example, the researchers in [13] showed that 1D LBP features are a distinctive marker of certain properties of the speech signal and were able to distinguish the unvoiced and the voiced components of speech signals. Additionally, the authors of [14] adopted 1D LBP for Voice Activity Detection (VAD) to segment the speech signal.
The 1D LBP operator labels every single value of the vibration signal by considering its neighbourhood and using the value of the centre position as a threshold for the neighbours. If a neighbour value is less than the centre value, the neighbour is set to 0; otherwise it is set to 1. A local binary pattern code for the neighbourhood is then produced. The decimal value of the LBP binary code represents the local structural knowledge around the fixed value [15].
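The thresholding rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `lbp_1d` is hypothetical, samples without a full neighbourhood at the signal borders are simply skipped, and the bit weights (leftmost neighbour as least significant bit) are an assumption about the ordering.

```python
def lbp_1d(signal, num_neighbours=6):
    """Return the 1D LBP code of every sample that has a full neighbourhood.

    Hypothetical helper: border samples are skipped, and the r-th neighbour
    (counted left to right, centre excluded) is assumed to carry weight 2**r.
    """
    if num_neighbours <= 0 or num_neighbours % 2 != 0:
        raise ValueError("num_neighbours must be a positive even number")
    signal = list(signal)
    half = num_neighbours // 2
    codes = []
    for i in range(half, len(signal) - half):
        centre = signal[i]
        # Neighbours in left-to-right order, skipping the centre itself.
        neighbours = signal[i - half:i] + signal[i + 1:i + half + 1]
        code = 0
        for r, value in enumerate(neighbours):
            # A neighbour below the centre contributes 0, otherwise 1.
            if value >= centre:
                code += 1 << r
        codes.append(code)
    return codes
```

For six neighbours the codes lie between 0 and 63. With the short made-up signal `[5, 1, 7, 4, 4, 9, 2]`, only the middle sample has three neighbours on each side, and its code under the assumed bit order is 29.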
The histogram of the 1D LBP signal displays how often these various patterns appear in a given signal. The distribution of the patterns denotes the whole structure of the signal. The 1D LBP operation on a sample value can be defined as

$$\mathrm{LBP}_P(x[i]) = \sum_{r=0}^{P/2-1} \left\{ \mathrm{Sign}\big[x[i+r-P/2] - x[i]\big]\, 2^{r} + \mathrm{Sign}\big[x[i+r+1] - x[i]\big]\, 2^{r+P/2} \right\}, \qquad (1)$$

where the Sign function is

$$\mathrm{Sign}[x] = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases} \qquad (2)$$

where x[i] is the signal and P is the number of considered neighbours. The Sign function transforms the differences to a P-bit binary code. In this paper only six neighbours are considered (three to the left of the centre and three to the right). Equation (1) illustrates how the 1D LBP is evaluated. Hence, the value range of the new signal is between 0 and 63.

The obtained values are divided into two groups, uniform and nonuniform numbers. The uniform numbers are those with at most two bit transitions (from 1 to 0 or from 0 to 1) in their circular bit patterns. The nonuniform numbers have more than two transitions. For instance, the patterns 111111 (0 transitions) and 100011 (2 transitions) are uniform, while the patterns 10101 (4 transitions) and 010101 (6 transitions) are nonuniform. There are 21 uniform numbers in the range 0-63 and the rest are nonuniform numbers. The histogram is computed such that an independent bin represents each uniform number, while all the nonuniform numbers are represented in one bin. Therefore, the set of features consists of 22 bins: 21 bins, one for each uniform number, and one bin for all nonuniform numbers. These bins are utilized as features to detect faults. The number of bins in the histogram depends on how many neighbours are considered.
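The histogram construction can be sketched as follows, taking a list of LBP codes as input. The helper names are hypothetical, and the sketch assumes the strict circular-transition rule stated above. Under that rule it yields P(P-1)+2 uniform codes for P neighbours, so its bin count need not match the 22 bins reported in the text. The authors' exact grouping cannot be recovered from this section.

```python
def circular_transitions(code, num_bits):
    """Count 0/1 changes between adjacent bits, wrapping from last to first."""
    bits = [(code >> b) & 1 for b in range(num_bits)]
    return sum(bits[b] != bits[(b + 1) % num_bits] for b in range(num_bits))


def uniform_histogram(codes, num_bits=6):
    """Histogram with one bin per uniform code plus one shared final bin.

    Assumption: a code is uniform when its circular bit pattern has at most
    two transitions; every other code is counted in the last bin.
    """
    uniform_codes = [c for c in range(2 ** num_bits)
                     if circular_transitions(c, num_bits) <= 2]
    bin_of = {c: k for k, c in enumerate(uniform_codes)}
    shared_bin = len(uniform_codes)  # index of the bin for nonuniform codes
    histogram = [0] * (len(uniform_codes) + 1)
    for c in codes:
        histogram[bin_of.get(c, shared_bin)] += 1
    return histogram
```

In practice the histogram would usually be normalized by the number of codes before being passed to a classifier, so that signals of different lengths remain comparable. That normalization step is a common convention and is not stated in the text.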
Figure 1 demonstrates a 1D LBP operator for P = 6 with the centre sample as given. After applying 1D LBP, the six neighbour samples in the example produce the code 100101. The code is then converted to its decimal value, 37, which is stored at the same index as the centre sample.

Vibration Data
Fault detection is an important problem in machinery diagnostics. Many techniques have been developed to detect faults in rotating machinery through vibration signal analysis. Vibration analysis is a way of determining where a fault is occurring in a rotating machine (e.g., motor and gearbox). In this paper, we apply our investigation to the vibration data prepared by the Prognostics and Health Management Society, known as the Prognostic Health Monitoring (PHM) Data Challenge. The challenge is how to detect and isolate faults in an industrial gearbox using vibration data that have been collected from two accelerometers. There are a total of 560 recorded samples for two typical gearboxes. One of the gearboxes contains spur gears and the other contains helical gears. The data were recorded at shaft speeds of 30, 35, 40, 45, and 50 Hz, each under high and low load (see Table 1). The data consist of three gear modes, which are No Fault (NF), Chipped Tooth (CT), and Broken Tooth (BT) [19]. In this paper, fault detection in the helical gear is considered; hence the data comprise 120 recorded samples from the gearbox.

Experimental Work
The vibration data are used to detect faults in the gear. The data adopted in this paper consist of three gear situations: NF, CT, and BT. One of the challenges of detecting faults in the gear is how to extract relevant features from the vibration signal. The 1D LBP is used as a technique to extract the features from the vibration signal. The 1D LBP procedure is explained in Section 2. The features are then utilized as input to two classifiers (SVM and k-NN). In the case of the SVM scheme, a pairwise approach is adopted for our multiclass problem. The kernel function of the SVM is linear and the optimization method is sequential minimal optimization.
The second adopted classifier is the k-NN, which is a geometric classifier; here it considers only one neighbour. Three types of cross-validation are exploited: Leave One Out (LOO), Leave One Speed Out (LOSO), and Leave One Load Out (LOLO). In order to investigate the influence of different conditions, it is necessary to train the classification model with samples belonging to one condition and evaluate it with samples of another condition. This investigation has been performed by adopting LOLO and LOSO cross-validation. Unlike LOLO and LOSO, LOO is used for the experiments that do not separate the conditions (speed and load) between the training and testing data. Figure 2 illustrates the procedure of the adopted algorithm for fault detection.
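The three cross-validation schemes differ only in what is held out at each step: a single sample (LOO), all samples recorded at one speed (LOSO), or all samples recorded under one load (LOLO). The following minimal sketch expresses all three as one leave-one-group-out loop around a one-neighbour classifier. The function names are hypothetical, and the Euclidean distance is an assumption, since the text does not state the distance metric.

```python
import math


def nearest_neighbour_label(train_x, train_y, query):
    """1-NN: return the label of the closest training vector (Euclidean)."""
    best = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], query))
    return train_y[best]


def leave_one_group_out_accuracy(features, labels, groups):
    """Hold out each distinct group value in turn and return overall accuracy.

    groups = speeds gives LOSO, groups = loads gives LOLO, and a unique
    group value per sample gives ordinary LOO.
    """
    if len(set(groups)) < 2:
        raise ValueError("at least two distinct groups are required")
    correct = 0
    for held_out in sorted(set(groups)):
        train_idx = [j for j, g in enumerate(groups) if g != held_out]
        test_idx = [j for j, g in enumerate(groups) if g == held_out]
        train_x = [features[j] for j in train_idx]
        train_y = [labels[j] for j in train_idx]
        for j in test_idx:
            predicted = nearest_neighbour_label(train_x, train_y, features[j])
            if predicted == labels[j]:
                correct += 1
    return correct / len(labels)
```

Passing `groups=list(range(len(labels)))` reproduces LOO, while passing the per-sample speed or load labels gives LOSO or LOLO. Writing the three schemes this way makes explicit that in LOSO and LOLO no training sample shares the test sample's held-out condition.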
We partition the experiments into four different models. The first model detects faults in the gear when both the speed and the load are fixed. We call this model Fixed Speed Fixed Load (FSFL). This model consists of 10 different cases because there are five speeds with two different loads. For example, one of the cases is when the speed of the vibration signal is 30 Hz with high load. LOO cross-validation is used in each case.
The second model detects faults in the vibration signal when the speed is fixed and both loads are combined. We call this model Fixed Speed Various Load (FSVL). Five cases are considered in this model. An example is when the speed is 45 Hz with both high and low loads. Two cross-validations are used in the second model, LOO and LOLO.
The third model is built for fault detection when the load is fixed and all the speeds are combined, for example, when the data include the speeds 30, 35, 40, 45, and 50 Hz with one load. We call this model Various Speed Fixed Load (VSFL). Here, two cases are considered and both LOO and LOSO are utilized as cross-validation. Finally, the fourth model is designed to detect faults when all the vibration data are combined, which means that all speeds and both loads are combined together. We call this model Various Speed Various Load (VSVL). In this model, three cross-validation methods are used: LOO, LOSO, and LOLO.
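The four models amount to four ways of grouping the ten (speed, load) recording conditions. The following sketch enumerates the cases of each model together with the cross-validation schemes listed above. The names are hypothetical, and the figure of 12 samples per condition is inferred from the 120 helical-gear samples spread over ten conditions rather than stated directly.

```python
from itertools import product

SPEEDS_HZ = (30, 35, 40, 45, 50)
LOADS = ("high", "low")
SAMPLES_PER_CONDITION = 12  # inferred: 120 samples over 10 conditions

# Cross-validation schemes used with each model, as listed in the text.
CROSS_VALIDATION = {
    "FSFL": ("LOO",),
    "FSVL": ("LOO", "LOLO"),
    "VSFL": ("LOO", "LOSO"),
    "VSVL": ("LOO", "LOSO", "LOLO"),
}


def model_cases():
    """Map each model name to its cases; a case is a list of (speed, load)."""
    all_conditions = list(product(SPEEDS_HZ, LOADS))
    return {
        "FSFL": [[condition] for condition in all_conditions],
        "FSVL": [[(speed, load) for load in LOADS] for speed in SPEEDS_HZ],
        "VSFL": [[(speed, load) for speed in SPEEDS_HZ] for load in LOADS],
        "VSVL": [all_conditions],
    }


def case_size(case):
    """Number of samples in a case under the per-condition assumption."""
    return len(case) * SAMPLES_PER_CONDITION
```

Under these assumptions the enumeration gives 10, 5, 2, and 1 cases with 12, 24, 60, and 120 samples per case for FSFL, FSVL, VSFL, and VSVL, respectively.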

Results and Discussion
The fault detection is processed in all models mentioned in Section 4. The models are Fixed Speed Fixed Load (FSFL), Fixed Speed Various Load (FSVL), Various Speed Fixed Load (VSFL), and Various Speed Various Load (VSVL).

Model FSFL.
Tables 2 and 3 show the results of the FSFL model. Based on p values computed using a chi-square test, the performance of the SVM and k-NN classifiers is not significantly different for the cases of high load with speeds 30, 35, and 45 Hz (p = 0.3, p = 0.3, and p = 0.08, respectively). For the cases of low load with speeds 30 and 35 Hz, the p values are p = 0.04 and p = 0.08, respectively, which means that k-NN significantly outperforms SVM only at the speed of 30 Hz under low load. The reason for this statistically unclear performance is the limited number of samples involved in this model.
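The text does not say how the chi-square test was set up. One plausible reading, sketched below, compares two classifiers through a 2x2 table of correct and incorrect counts and applies Pearson's chi-square test with one degree of freedom. The table layout, the absence of a continuity correction, and the function name are all assumptions, and the counts in the usage note are made up.

```python
import math


def chi_square_p_value(correct_a, correct_b, total):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table.

    Rows are the two classifiers and columns are correct / incorrect counts.
    Both classifiers are assumed to be evaluated on the same number of
    samples (total). No continuity correction is applied.
    """
    observed = [[correct_a, total - correct_a],
                [correct_b, total - correct_b]]
    col_sums = [correct_a + correct_b, 2 * total - correct_a - correct_b]
    statistic = 0.0
    for row in observed:
        for k, obs in enumerate(row):
            # Expected count = row sum * column sum / grand total.
            expected = total * col_sums[k] / (2 * total)
            if expected > 0:
                statistic += (obs - expected) ** 2 / expected
    # Survival function of chi-square with 1 dof: P(X > s) = erfc(sqrt(s/2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

With hypothetical counts of 12 and 6 correct decisions out of 12 samples each, the statistic is 8 and the p value falls below 0.05. With so few samples per case the chi-square approximation is coarse, which is consistent with the remark above about the limited number of samples.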
To compare the 1D LBP scheme with the 2D LBP scheme proposed in [12], Tables 4 and 5 show the results under the same conditions using the 2D LBP scheme. There is no significant difference between the two schemes when k-NN is used, and the 2D LBP scheme significantly outperforms the 1D LBP scheme with SVM in only one case (speed 30 Hz with low load).

Model FSVL.
The size of the data of this model (24 samples, eight for each class) is twice that of the data of the FSFL model because both loads are considered. From Table 6 it can be observed that, based on p values calculated by a chi-square test, the SVM and k-NN classifiers do not differ significantly in performance for the speeds 35, 45, and 50 Hz. However, k-NN significantly outperforms the SVM at the speeds 30 and 40 Hz (p = 0.03 and p = 0.04, respectively).
In comparison with the 2D LBP approach, whose results are shown in Table 7, there is no significant difference between 1D LBP and 2D LBP, with the exception of the case of speed 35 Hz when using SVM.
Table 8 shows a comparison between 1D and 2D LBP under LOLO cross-validation, which highlights the load effect on the data of a specific speed. The results show that 1D LBP significantly outperforms 2D LBP at every speed except 35 Hz, with p values of p = 0.02, p = 0.15, p = 0.02, p = 0.02, and p = 0.04 for the 30, 35, 40, 45, and 50 Hz speeds, respectively.

Model VSFL.
In this model we investigate the case in which data of one load with various speeds are available, which means that 60 samples participate in each experiment. In the case of SVM, there is no difference in performance between 1D and 2D LBP. Additionally, the performance of 1D LBP with SVM is not significantly different from that with k-NN. However, the 2D LBP outperforms the 1D LBP using k-NN in the case of low load (p = 0.01). The results are shown in Tables 9 and 10.
The effect of speed on fault detection is very clear from the low performance in the case of LOSO (see Table 11). However, neither LBP scheme shows a significant improvement over the other. Removing the samples recorded at the same speed as the testing sample from the training set led to a reduction in accuracy of nearly 60%.

Model VSVL.
The data for this model are collected from all speeds and both loads together. The size of the data of this model is 120 samples. The performance of both classifiers is high when the cross-validation is LOO. However, the k-NN significantly outperforms the SVM classifier with p = 0.03, as shown in Table 12. Furthermore, the performance of SVM with 1D and 2D LBP is not significantly different (p = 0.06), but the 2D LBP outperforms the 1D LBP using k-NN with a p value of 0.03 (see Tables 12 and 13). A significant degradation in performance occurs when LOLO and LOSO cross-validation are used. For example, when the cross-validation is LOSO, the performance of SVM is degraded by 56%, and when the cross-validation is LOLO, the performance of SVM is degraded by 40%. The results in Table 14 also show that 1D LBP is significantly more effective for fault detection across conditions; that is, the 1D LBP features can adapt to data from different speeds and are less sensitive than the 2D LBP features to speed and load conditions.

Conclusion
In this paper, it has been shown that 1D LBP is an effective technique for extracting features to detect faults in gears when data with the same speed and/or load are available for training and testing the model. Moreover, the 1D LBP is computationally cheaper than the 2D LBP scheme. The 1D LBP scheme is shown to be less sensitive to a specific load and speed; that is, 1D LBP features reduce the effect of different conditions such as speed and load. We have adopted the LOSO and LOLO cross-validation approaches to investigate the effect of speed and load on fault detection.

Table 1: The adopted helical gear data distribution on speeds and loads.

Table 2: The performance of SVM in the FSFL model when the 1D LBP scheme is used for feature extraction.

Table 3: The performance of k-NN in the FSFL model when the 1D LBP scheme is used for feature extraction.

Table 4: The performance of SVM in the FSFL model when the 2D LBP scheme is used for feature extraction.

Table 5: The performance of k-NN in the FSFL model when the 2D LBP scheme is used for feature extraction.

Table 6: The performance of classifiers for the FSVL model when the 1D LBP scheme is used for feature extraction.

Table 7: The performance of classifiers for the FSVL model when the 2D LBP scheme is used for feature extraction.

Table 8: The performance of SVM in both schemes when LOLO cross-validation is used.

Table 9: The performance of classifiers for the VSFL model when the 1D LBP scheme is used for feature extraction.

Table 10: The performance of classifiers for the VSFL model when the 2D LBP scheme is used for feature extraction.

Table 11: The performance of SVM in both schemes when LOSO cross-validation is used.

Table 12: The performance of classifiers for the VSVL model when the 1D LBP scheme is used for feature extraction.

Table 13: The performance of classifiers for the VSVL model when the 2D LBP scheme is used for feature extraction.

Table 14: The performance of SVM in both schemes when the cross-validation is LOLO and LOSO.