Vibration Analysis of Shaft Misalignment Using Machine Learning Approach under Variable Load Conditions

The Industry 4.0 revolution strongly encourages the use of machine learning-based processes and condition monitoring. This paper emphasizes a machine learning-based approach for condition monitoring of shaft misalignment. The work presents a combined approach of an artificial neural network and a support vector machine for the identification and measurement of shaft misalignment. Measuring misalignment under variable load conditions normally requires more features to be extracted. Hence, the primary objective is to measure misalignment with a minimum number of extracted features. This is achieved through normalization of the vibration signal. An experimental setup is prepared to collect the required vibration signals. The normalized, nonstationary time-domain signals are passed to the discrete wavelet transform for feature extraction. Statistical features of the extracted detailed coefficients, viz. skewness, kurtosis, maximum, minimum, root mean square, and entropy, are considered for feature selection. The ReliefF algorithm is used to decide the best feature on a rank basis. The ratio of maximum energy to Shannon entropy is used for wavelet selection. The best feature is used to train the machine learning algorithms. The rank-based feature selection has improved the classification accuracy of the support vector machine. The results obtained with the combined approach are discussed for different misalignment conditions.


Introduction
All production and processing industries use rotary machines on a major scale. To ensure hassle-free operation and lower maintenance costs, it is essential to monitor machine health effectively. Machines eventually develop faulty conditions over time due to various inherent causes. Among the many listed causes, misalignment is one of the most prominent sources of faults. Hence, continuous monitoring is essential to avoid such faults. Vishwakarma et al. [1] have discussed different modes of condition monitoring techniques and emphasized the importance of both time and frequency domain analysis for nonstationary signals. Tang et al. [2] have proposed an adaptive waveform decomposition method to extract time-frequency features of nonstationary signals. The feature extraction for the vibration signal of a rolling bearing is carried out with the Adaptive Waveform Decomposition (AWD) algorithm and the local frequency concept. The Adaptive Neuro Fuzzy Inference System (ANFIS) architecture has been implemented effectively for simulation of nonlinear components in online control systems. The effective implementation of ANFIS has reduced the nondimensional error index and required far fewer adjustable parameters than other methods such as the cascade correlation neural network and the backpropagation neural network [3]. Li et al. [4] have explained the importance of online condition monitoring and diagnosis of power equipment. A brief review of transformers, gas-insulated switchgear, cables, generators, and capacitors is given with the help of big data, Internet of things, and cloud computing techniques. The wavelet gray moment vector approach is claimed to be an effective tool in fault diagnosis of rotating machinery [5]. The detailed fault classification method [6] is based on the wavelet packet energy ratio of resampled vibration signals.
When the wavelet transform is compared with other methods such as variational mode decomposition and empirical mode decomposition, the evaluation of the upper and lower envelopes is one of the main calculation steps in those methods. For that reason, the error developed in envelope estimation propagates through the recursive decomposition results [7]. The analysis is carried out at a multilevel energy ratio to extract fault features of the vibration signal. Sohn and Farrar [8] have presented time series analysis for locating the fault source in mechanical systems. A two-stage model combining autoregression and an exogenous input technique is used for damage location. Dhumale and Lokhande [9] have presented fault diagnosis of a voltage source inverter. The authors have effectively used features extracted from normalized current signals to train an Artificial Neural Network (ANN). The fault diagnosis system is developed for diagnosis of the voltage source inverter under variable load conditions. Wilson Wang has proposed an extended Neuro Fuzzy system for real-time machinery condition monitoring. The developed monitoring system has been validated with experimental results, which confirmed its adaptability to different fault conditions [10]. Liu and Wang have explained fault diagnosis analysis for low-speed, heavy-load slewing bearings. Different options, viz. vibration, temperature, oil, stress, etc., are discussed and compared with an innovative current analysis [11]. Shao et al. [12] have adopted a deep wavelet autoencoder to handle unsupervised feature learning and developed an intelligent fault diagnosis system for rolling bearings. A multiple wavelet autoencoder is used to improve the unsupervised feature learning ability. Tonks and Wang [13] have presented a combined approach of Artificial Neural Network (ANN) and Support Vector Machine (SVM) for fault identification in radial distribution systems.
The principal component analysis technique is used for data analysis, and faults are classified in combination with support vector classifiers. The change in thermometric condition, in combination with a SCADA system, has been effectively used by Tonks and Wang [13] for detection of angular and offset misalignment in a wind turbine shaft. The change in the condition of misalignment has been mapped to the changing temperature of the system. In this case, fault isolation is essential to correlate misalignment with a change in temperature. The effect of torsional-longitudinal vibration on the aligned condition [14] has been studied through a simple lumped mass model, and the results are verified experimentally. The coupling stiffness coefficient required for reducing torsional vibration has been discussed. The acoustic emission technique [15] is used in place of conventional vibration analysis to detect angular misalignment of the shaft. The change in sound condition at the support bearing is considered as the input source. Especially at remote locations, misalignment is detected accurately with a combined approach of thermography, i.e., thermal imaging, and vibration analysis.
This is claimed to be an effective technique where measurements must be taken at elevated positions, such as a windmill gearbox [16]. A theoretical analysis of the combined effect of shaft misalignment and unbalance is presented in the first part of the paper [17], and experimental validation is carried out to support the claims in its conclusion. Fuzzy-based control of current to avoid the drawbacks of nonlinear loads has been explained, in which the compensating currents are injected with the help of a static current distribution compensator [18]. Singh et al. have explained helical gearbox fault diagnosis using wavelet theory and the J48 algorithm. The maximum accuracy in feature extraction is claimed by using the SYM8 wavelet [19]. Patra and Bruzzone [20] have explored the combined advantage of a self-organizing map neural network and a support vector machine to select uncertain and diverse samples in image classification. An effective combination of feature selection and feature extraction techniques with SVM [21] is used for the prediction of defective software modules. The correlation-based feature selection technique with SVM has been compared with other available techniques to support the accuracy claimed in the results. The vibration signal obtained is normalized [22] for effective feature selection. The discrete wavelet transform is applied for suitable feature extraction, and the discrete wavelet transform and fuzzy logic have been used in combination to predict shaft misalignment. In their review paper, Hsu and Lin [23] compared several multiclass SVM methods and showed, through sample cases, that one-against-one (OAO) is the most suitable for practical use. The mathematical analysis of binary and multiclass SVM, with an example of disease classification, is explained in [24].
An integrated approach of data mining and machine learning methods is proposed for classifying the type of damage condition in wood poles [25]. The nonlinear mapping capability of SVM, along with enhanced cat swarm optimization, is used to predict the compressive strength of high performance concrete [26]. The nonlinear behavior of a magnetorheological elastomer base isolator is optimized based on an artificial neural network and the ant colony algorithm [27]. An artificial neural network is proposed for accurate estimation of the modulus of elasticity by considering the effect of the alkali-silica reaction in concrete [28].
Statistical methods such as the fuzzy system perform much better by formulating rules for handling ambiguity and defining the relationship between input and output. If there is no ambiguity in the information collected and the data is labeled, there is no need to use fuzzy systems or an unsupervised machine learning algorithm such as K-NN [29]. ANN is a nonlinear model that is as easy to use and understand as a simple statistical method [30]. Most statistical methods are parametric models that require a strong background in statistics, whereas ANN is a nonparametric model. However, it cannot define the relationship between input and output and cannot deal with uncertainty. To overcome this, a number of approaches have been combined with ANN, for example to select features [31]. Deep learning requires huge datasets and must be trained with complex data models, which can be very expensive.
In an overview of the literature, the feature extraction-based condition monitoring technique is discussed for various faults other than misalignment [1][2][3][4][5][6][7][8]. Various fault analysis techniques based on vibration, temperature, and stress analysis, some of them in combination with ANN, are considered for fault identification. In many cases of mechanical fault analysis, measurement of the fault is essential to understand the severity of the fault generated. The measure of fault severity has not been the focus in cases such as unbalance, misalignment, and crack analysis [9][10][11][12][13][14][15][16].
There are different artificial intelligence algorithms for fault classification. The selection depends on the type of problem and the size of data available [28][29][30][31][33]. Fault prediction and the measure of fault severity are both essential parts of run-time condition monitoring. The present work focuses on the classification and measure of shaft misalignment under variable load conditions using a combined approach of ANN and SVM. The normalized vibration signals are used for feature extraction. The suitable mother wavelet is selected on the basis of the maximum Energy to Shannon Entropy (ESE) ratio. The ReliefF algorithm is used for best feature selection. The selected best features are used to train the SVM for classification of misalignment and as an input to the ANN for the measure of misalignment. The suitable ANN structure is selected out of several trained structures based on the accuracy of prediction. The novelty of the proposed Classification and Prediction of Shaft Misalignment (CPSM) method is that it classifies the type of misalignment and measures misalignment under variable speed conditions with a minimum number of features and the least data size for training. This is achieved by normalization of the vibration signals before feature extraction. The results obtained show that the accuracy of the SVM and ANN classifiers has been improved by rank-based feature selection.

Methodology
The proposed CPSM is applied to the output signals obtained for healthy and faulty conditions, and observations are recorded for all conditions. The outline of the test rig used is shown in Figure 1. An accelerometer is placed on the casing of the second bearing to sense vibration in all three directions, viz. Longitudinal (V_g), Lateral (V_t), and Vertical (V_r). The misalignment is generated artificially in the setup to visualize a proportional change in the Overall Vibration Level (OVL). A wide range of vibration levels is observed for different ranges of misalignment and speed conditions. The output signals obtained are normalized to the range of 0 to 1. The normalization of the signal maintains distinctive values of the extracted features under varying load conditions without loss of information. The normalized signals, viz. V_gNn, V_tNn, and V_rNn, are obtained following [22], where j is the direction index representing the three directions g, r, and t. The output vibration signals are recorded in these directions. The vibration signals are normalized, and features such as the Detailed coefficient (DC) and Average coefficient (AC) are extracted. Normalization reduces the data size needed to train the classifier and helps to improve accuracy. The selection of the appropriate mother wavelet is carried out on the basis of the DC.
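The scaling of each directional signal into the range 0 to 1 can be illustrated with a minimal sketch. It assumes plain min-max scaling applied independently per direction, which is one common way to reach that range; the exact expression used in [22] is not reproduced here, and the function name and sample values are hypothetical.

```python
def normalize_signal(samples):
    """Min-max scale one vibration record into the range 0 to 1 (assumed form)."""
    lo, hi = min(samples), max(samples)
    span = hi - lo
    if span == 0:
        # A constant record carries no amplitude variation; map it to zeros.
        return [0.0 for _ in samples]
    return [(v - lo) / span for v in samples]


# Hypothetical accelerometer readings for the three directions j = g, t, r.
raw = {
    "g": [0.12, -0.40, 0.85, -0.10, 0.33],  # longitudinal
    "t": [1.5, -2.0, 0.5, 2.5, -1.0],       # lateral
    "r": [0.02, 0.01, -0.03, 0.04, 0.00],   # vertical
}
normalized = {j: normalize_signal(v) for j, v in raw.items()}
```

Because every record ends up on the same 0 to 1 scale, features computed from records taken at different loads or speeds remain comparable, which is the property the method relies on.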
The DC and AC are obtained from equations (2) and (3), where h and g are the filter coefficients, p is the number of samples, and u is the shifting parameter. The extracted coefficients are used to compute the statistical features considered for feature selection, viz. Maximum (Max), Minimum (Min), Skewness, Kurtosis, Rms, and Entropy. The selected DC feature reflects the change in OVL more clearly for different conditions of misalignment. Hence, in CPSM, selection of the correct mother wavelet is carried out on the basis of the DC.
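The extraction step can be sketched as a single-level decomposition followed by the six statistical features. This is an illustrative sketch only: it assumes the Daubechies-2 filter pair, periodic extension at the signal boundary, an even-length record, and Shannon entropy computed on normalized coefficient energies. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies-2 low-pass filter h and its quadrature-mirror high-pass filter g.
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
G = [((-1) ** k) * H[len(H) - 1 - k] for k in range(len(H))]


def dwt_level1(signal):
    """Return (average, detailed) coefficients for one decomposition level."""
    p = len(signal)  # number of samples
    if p < 4 or p % 2:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for u in range(0, p, 2):  # u is the shift; step 2 performs the downsampling
        approx.append(sum(H[k] * signal[(u + k) % p] for k in range(len(H))))
        detail.append(sum(G[k] * signal[(u + k) % p] for k in range(len(G))))
    return approx, detail


def statistical_features(coeffs):
    """Max, Min, Skewness, Kurtosis, Rms, and Entropy of a coefficient set."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    m2 = sum((c - mean) ** 2 for c in coeffs) / n
    m3 = sum((c - mean) ** 3 for c in coeffs) / n
    m4 = sum((c - mean) ** 4 for c in coeffs) / n
    energy = sum(c * c for c in coeffs)
    # Assumed entropy definition: Shannon entropy of normalized energies.
    probs = [c * c / energy for c in coeffs if c != 0.0] if energy else []
    return {
        "max": max(coeffs),
        "min": min(coeffs),
        "skewness": m3 / m2 ** 1.5 if m2 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 else 0.0,
        "rms": math.sqrt(energy / n),
        "entropy": -sum(q * math.log(q) for q in probs),
    }


# Hypothetical normalized record (values between 0 and 1).
record = [0.1, 0.4, 0.9, 0.3, 0.2, 0.8, 0.5, 0.6]
ac, dc = dwt_level1(record)
dc_features = statistical_features(dc)
```

With periodic extension the transform is orthogonal, so the energy of the record equals the combined energy of the average and detailed coefficients.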

Experiment Facilities and Instrumentation
In the training of ANN and SVM, a large amount of real-time data with the actual misaligned condition is the most important requirement. It is obtained from the experimental setup. Figure 2 shows a pictorial view of the experimental setup. It comprises a motor, coupling, base plate, and two bearings. Vibration isolation pads at the base of a heavy foundation plate are used to isolate vibration from other sources. In the setup preparation, the major focus is on the actual induction of parallel and angular misalignment. A directional slot at the interface of the motor base and the base plate allows offset (parallel) and angular misalignment to be induced artificially for experimental purposes. It is very important to ensure a zero-misalignment state of the experimental setup in static conditions before carrying out the experiments. A special fixture has been prepared to verify zero misalignment. This fixture facilitates the use of the Face and Rim method for checking alignment conditions. The fixture is clamped on the motor side coupling, and the dials are simultaneously mounted on the face and rim of the rotor side coupling. The face dial indicates angular alignment deviation and the rim dial indicates offset alignment deviation, both with reference to the motor side coupling. A variable frequency drive (VFD) is used to run the setup at different operating speeds. The range of motor speed is chosen close to the standard rated speeds of industrial motors.
In the implementation of CPSM, two sets of observations are recorded: one with varying speed and constant misalignment, and the other with varying misalignment and constant speed. A few samples collected at 1200 rpm for different misalignment conditions are presented in the accompanying figures. The variation of OVL with respect to change in misalignment and speed is shown in Figure 6. In order to prevent any unbalance or runout problem, the shaft along with the rotor was tested on a dynamic balancing machine before assembly. The vibration signals are collected at the bearing. An accelerometer (PCB make, Model: 352B70, measurement range: ±49000 m/s², frequency: 0.4 Hz to 20 kHz) is used for sensing vibration signals in three directions. Directional slots are provided at the base plate, which make it possible to introduce offset and angular misalignment in the setup artificially. The misalignment is introduced in steps of 0.02 mm, i.e., approximately 0.79 mils, over the entire range of experiments.
The digital storage oscilloscope (DSO) (Tektronix make TBS 1064, 60 MHz, 4 channels, measurement accuracy: vertical ±3%, from 10 mV/div to 5 V/div) is used to record and store vibration signals obtained from the accelerometer. These signals are analyzed with the Discrete Wavelet Transform (DWT) and further considered as input for processing data with SVM-ANN. The input referred for experimentation is shown in Table 1.

Implementation
In the experimental setup, misalignment is introduced externally to obtain the vibration signal required for analysis. The output vibration signals are recorded in all three directions, viz. Longitudinal (V_g), Lateral (V_t), and Vertical (V_r). For detailed discussion and comparison, a sample vibration signal at 1200 rpm with misalignment varied in the range 0 to 0.2 mm is considered, as shown in Figure 7.
It is observed that the overall vibration level (OVL) increases with an increase in the value of misalignment, as compared in Figure 6. A fault occurring in a rotary machine is confirmed by its particular fault frequency. It is clear that misalignment of the shaft is observed at the 1X and 2X frequency harmonics [16,22]. It is essential to understand the fault information associated with nonstationary signals. This is possible with multilevel analysis using DWT, a widely used technique to obtain information in the time and frequency domains. The extracted DC feature is compared for the various sample conditions mentioned in Figure 7. The change in OVL for different misaligned conditions is reflected in a corresponding change in DC values, as compared and depicted in Figure 8. It is observed that a few faults are registered at the same common frequency, as shown in the mechanical fault diagnostic chart [32]. In such cases, DWT helps to identify and isolate the distinguishing features of a fault.
Energy and Shannon entropy are commonly selected features that can be obtained from the extracted features of vibration signals. The required feature extraction of the vibration signal is carried out using the wavelet transform. DWT is useful in the time and frequency domains and reveals fault-induced impulses effectively. Different mother wavelets are examined at different levels to select the most suitable wavelet and level based on the maximum Energy to Shannon Entropy (ESE) ratio, as shown in Table 2.
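As a small worked example of the 1X and 2X harmonics mentioned above, the running-speed orders can be converted to hertz. The helper name is hypothetical; the only relation used is that 1X equals the shaft speed in revolutions per second.

```python
def harmonic_frequencies_hz(speed_rpm, orders=(1, 2)):
    """Return the shaft-order frequencies (1X, 2X, ...) in hertz."""
    running_speed_hz = speed_rpm / 60.0  # revolutions per second
    return {f"{k}X": k * running_speed_hz for k in orders}


print(harmonic_frequencies_hz(1200))  # {'1X': 20.0, '2X': 40.0}
```

For the 1200 rpm sample case, misalignment would therefore be expected to show up around 20 Hz and 40 Hz in the spectrum.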
All mother wavelets considered are compared at different levels of decomposition based on the ESE ratio, as shown in Figure 9. It is clearly observed for all considered mother wavelets that the energy contained in the signal reduces as the level of signal decomposition increases. Therefore, level one is considered, as it contains the maximum information for wavelet and feature selection. Different types of mother wavelet, viz. Daubechies (DB), Coiflet, Symlet, Haar, DMEY, Biorthogonal, and Reverse Biorthogonal, are considered during the analysis using DWT. The correct decomposition level for the mother wavelet is decided on the basis of the maximum Energy to Shannon Entropy (ESE) ratio. This ratio is maximum at level 1 for all considered wavelets. The average ESE for each class of fault and for each mother wavelet is plotted and compared in Figure 10. It is clear from Figure 10 that DB2 and SYM2 have the highest ESE ratio. As the mathematical functions of these two wavelets are the same, either one can be selected. Hence, the DB2 mother wavelet at level 1 is selected for analysis.
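The selection criterion can be sketched as follows. The sketch assumes the commonly used definitions in which energy is the sum of squared coefficients and Shannon entropy is computed over the normalized energy distribution of those coefficients; the exact definitions used by the authors are not shown in the text. The candidate names and coefficient values are hypothetical, not measured results.

```python
import math


def energy(coeffs):
    """Total energy of a coefficient set (sum of squares)."""
    return sum(c * c for c in coeffs)


def shannon_entropy(coeffs):
    """Shannon entropy (base 2) of the normalized energy distribution."""
    e = energy(coeffs)
    if e == 0.0:
        return 0.0
    probs = [c * c / e for c in coeffs if c != 0.0]
    return -sum(q * math.log2(q) for q in probs)


def ese_ratio(coeffs):
    """Energy to Shannon Entropy ratio; larger means a more concentrated signal."""
    e = energy(coeffs)
    if e == 0.0:
        return 0.0
    s = shannon_entropy(coeffs)
    return math.inf if s == 0.0 else e / s


def select_wavelet(candidates):
    """candidates maps a wavelet name to its level-1 detailed coefficients."""
    return max(candidates, key=lambda name: ese_ratio(candidates[name]))


# Hypothetical coefficient sets for two candidate wavelets.
candidates = {
    "wavelet_a": [3.0, 0.1, 0.1, 0.1],  # energy concentrated in one coefficient
    "wavelet_b": [1.0, 1.0, 1.0, 1.0],  # energy spread evenly
}
best = select_wavelet(candidates)
```

A wavelet that packs the fault energy into few coefficients has low entropy and therefore a high ratio, which is why the maximum ESE is used as the ranking rule.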

Multiclass SVM Theory
The SVM is used as a classifier in the present work. SVM belongs to the category of supervised learning algorithms, in which a set of output values is given to the learning machine. Let us consider two classes of misalignment to be identified as Q_1 and Q_2. M is an unknown feature vector to be classified into one of these classes. The classification rule is applied as shown in equation (4), in which M_i is the unknown input to be classified, W^T is the orientation of the hyperplane, and a is the position of the hyperplane. The classifier is trained on the data by finding the values of W^T and a. During training, the values of W^T and a are modified in such a way that P(M_1) falls on the positive side of the hyperplane. Similarly, the values of W^T and a are modified in such a way that P(M_2) falls on the negative side of the hyperplane. The Support Vector Machine (SVM) finds the best position of the separating line. SVM tries to keep the maximum distance between the classes and the separating boundary so that small noise cannot cause the features of an unknown input to be misclassified.
For every M_i, the class to which it belongs is assigned. The condition for M_i belonging to class Y_i, where Y_i = ±1, is written separately for each class, and the generalized equation, irrespective of class, is written as equation (6). Once W and a are determined, the unknown vector M can be classified into the two classes using equation (4). In SVM, the maximum-margin condition is written in terms of β, where β is the margin, together with the distance of M from the hyperplane. By proper scaling, the β parameter can be set to unity. Hence, equation (8) can be rewritten so that equality holds if M_i is a support vector and strict inequality holds if M_i is not a support vector.
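For reference, the standard hard-margin formulation that matches the symbols defined above (W, a, M_i, Y_i, β, and the multipliers α_i) can be written as follows. This is the textbook form, given here as an assumption about what the referenced equations express; the equations are deliberately left unnumbered because the original numbering cannot be confirmed.

```latex
\begin{align*}
P(M) &= W^{T} M + a, \qquad
  M \in Q_1 \ \text{if}\ P(M) > 0, \quad M \in Q_2 \ \text{if}\ P(M) < 0 \\
Y_i \left( W^{T} M_i + a \right) &> 0, \qquad Y_i = \pm 1 \\
d(M) &= \frac{\left| W^{T} M + a \right|}{\lVert W \rVert}, \qquad
  \frac{Y_i \left( W^{T} M_i + a \right)}{\lVert W \rVert} \ge \beta \\
Y_i \left( W^{T} M_i + a \right) &\ge 1
  \quad \text{(equality only for support vectors)} \\
\min_{W,\,a}\ \tfrac{1}{2} W^{T} W
  &\quad \text{subject to} \quad Y_i \left( W^{T} M_i + a \right) \ge 1 \\
L(W, a, \alpha) &= \tfrac{1}{2} W^{T} W
  - \sum_i \alpha_i \left[ Y_i \left( W^{T} M_i + a \right) - 1 \right] \\
\frac{\partial L}{\partial a} = 0 &\;\Rightarrow\; \sum_i \alpha_i Y_i = 0, \qquad
\frac{\partial L}{\partial W} = 0 \;\Rightarrow\; W = \sum_i \alpha_i Y_i M_i
\end{align*}
```

With the constraint scaled to unity, the margin equals 1/||W||, so maximizing the margin is equivalent to minimizing (1/2) W^T W.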

Shock and Vibration
From equation (8), it can be observed that the margin β can be maximized by minimizing ||W|| and maximizing the bias a. Therefore, a function to minimize the weight can be written, and for the support vectors the constraint is obtained under which the term (1/2) W^T W is minimized. The above constrained optimization problem can be converted into an unconstrained optimization problem using the Lagrangian multiplier, as in equation (11), where α_i is the Lagrangian multiplier. Optimization of equation (11) is carried out by taking its derivative with respect to a and equating it to zero, which gives equation (12). Similarly, differentiating equation (11) with respect to W gives equation (13).
The above binary classification is applicable if class labels have only two values (k-class, k = 2). In the actual fault diagnosis of a machine, it is sometimes required to deal with more than two classes. In such a case (k-class, k > 2), a Multiclass SVM classifier is used, which can be obtained by combining several binary classifiers. The methods of obtaining a Multiclass SVM are, viz., One-against-all (OAA), One-against-one (OAO), and Directed acyclic graph (DAG). OAO is the most commonly considered method [24]. This method constructs k(k − 1)/2 classifiers, each of which is trained on data from two classes. For training on data from the ith and jth classes, a binary SVM minimization problem is solved for that pair. The decision is based on the following rule: if the sign of (w^ij)^T ϕ(x) + b^ij indicates that x is in the ith class, one vote is added to the ith class; otherwise, a vote is added to the jth class. The class with the largest vote count is then assigned to x.
For training of the ANN structures listed in Table 3, the Levenberg-Marquardt algorithm is used with the tan-sigmoid activation function.
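The one-against-one voting rule described above can be sketched in a few lines. The pairwise decision function below is a stand-in (a nearest-centre rule on a single hypothetical feature), not a trained SVM; the class centres are invented for illustration, and only the vote counting follows the rule in the text. The labels 0, 1, and −1 stand for three classes.

```python
from itertools import combinations


def oao_predict(x, classes, pairwise_decision):
    """Vote over all k*(k-1)/2 class pairs and return (winner, vote counts).

    pairwise_decision(i, j, x) must return a positive value when x is judged
    to belong to class i and a non-positive value when it belongs to class j.
    """
    votes = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        winner = i if pairwise_decision(i, j, x) > 0 else j
        votes[winner] += 1
    # Ties are resolved in favour of the class listed first.
    return max(votes, key=votes.get), votes


# Hypothetical one-dimensional class centres used by the stand-in decision.
CENTRES = {0: 0.2, 1: 0.5, -1: 0.8}
CLASSES = [0, 1, -1]


def nearest_centre_decision(i, j, x):
    """Positive when x lies closer to the centre of class i than of class j."""
    return abs(x - CENTRES[j]) - abs(x - CENTRES[i])


predicted, votes = oao_predict(0.55, CLASSES, nearest_centre_decision)
```

For three classes the loop evaluates exactly three pairwise classifiers, which matches the k(k − 1)/2 count given above.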
The selection of the best structure depends on the size of the training data, the number of neurons in the input, hidden, and output layers, and the initial weights assigned to the input signals.

Training of ANN
In the range of 0 to 0.2 mm, 10 conditions of misalignment are considered for different operating speeds up to 2100 rpm. In the proposed work, 6000 samples are considered for ANN training, which include 4500 samples of the misaligned condition and 1500 samples of the aligned (healthy) condition. Of the total, 50% of the samples are used for training of the ANN, 25% for testing, and the remaining 25% for cross-validation. Testing several Machine Learning Models (MLMs), starting with the fundamental ones, helps to identify the best-performing model. Cross-validation is a method for assessing MLMs by training numerous MLMs on subsets of the available input data and assessing them on the complementary subset of the data. Cross-validation is used to detect overfitting, i.e., the failure of an MLM to generalize.
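The 50/25/25 partition of the 6000 samples can be sketched as a shuffled index split. The random shuffle and the fixed seed are assumptions made for reproducibility of the example; the text does not state how samples were assigned to each subset.

```python
import random


def split_indices(n_samples, fractions=(0.50, 0.25, 0.25), seed=0):
    """Shuffle sample indices and cut them into train, test, and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = int(n_samples * fractions[0])
    n_test = int(n_samples * fractions[1])
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # whatever remains
    return train, test, validation


train, test, validation = split_indices(6000)
print(len(train), len(test), len(validation))  # 3000 1500 1500
```

Keeping the validation subset disjoint from the training subset is what allows overfitting to show up as a gap between training and validation error.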
An input layer, a hidden layer, and an output layer are the main parts of the ANN structure. The proposed ANN structure has three inputs. The accuracy attained governs the number of hidden-layer neurons. Hence, the number of hidden-layer neurons is chosen on a trial and error basis from values such as 5, 10, 15, 20, and 25. The learning rate during the training of the ANN is varied in the range of 0.01 to 0.04. The number of epochs considered is 1800. The different structures considered for training and testing are shown in Table 3. The ANN structure 3-20-11 with a learning rate of 0.03 performs well for the measure of misalignment (MoM) in both offset misalignment and angular misalignment. The performance of the ANN during training is shown in Figure 11. The mean squared error (MSE) is 0.05, which is obtained at 1300 epochs. MATLAB software is used for training, testing, and validation of the ANN. The output error observed in the testing of the trained ANN is calculated from the expected output E_o and the actual output A_o.
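A minimal sketch of the two error measures mentioned above is given below. It assumes the output error is the absolute difference between E_o and A_o expressed as a percentage of E_o, which is the usual form; the exact expression used by the authors is not shown in the text. The sample values are hypothetical.

```python
def percent_error(expected, actual):
    """Output error as a percentage of the expected output E_o (assumed form)."""
    if expected == 0:
        raise ValueError("percentage error is undefined for a zero expected output")
    return abs(expected - actual) / abs(expected) * 100.0


def mean_squared_error(expected, actual):
    """MSE over paired expected and actual outputs."""
    if len(expected) != len(actual):
        raise ValueError("inputs must have the same length")
    return sum((e - a) ** 2 for e, a in zip(expected, actual)) / len(expected)


# Hypothetical test case: 0.10 mm of misalignment predicted as 0.098 mm.
error = percent_error(0.10, 0.098)  # about 2 percent
```

Note that this form cannot be applied to the healthy condition, where the expected misalignment is zero, so such cases would need an absolute error instead.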

Results and Discussion
In the present study of shaft misalignment, the output vibration signals are obtained from the experimental setup. The output signals are normalized to reduce the data size required for training, as explained earlier.
The average ESE is highest for DB2 and SYM2. The DB2 wavelet is selected as the suitable mother wavelet, as explained earlier.
Eighteen statistical features are obtained to analyze the information in the output signal. The ReliefF algorithm is used to optimize feature selection on a rank basis. The sample vibration signal at 1200 rpm is considered for presenting the key points of the analysis. It is clearly seen that all mother wavelets show a good ESE ratio at level 1.
This is explained by the fact that disorder, i.e., entropy, is minimum and information is maximum at the first level of signal decomposition. This is the basis for the selection of level 1 for comparison. The rank-based feature optimization is carried out using the ReliefF algorithm. The kurtosis feature is observed to be the rank 1 feature. Therefore, kurtosis features of all signals are obtained using DB2 at level 1.
The kurtosis feature of the normalized vibration signals is considered as an input to train the ANN and SVM. In the rank-based feature selection, kurtosis as a single feature shows a classification efficiency of 19.8%. This efficiency is improved up to 89.7% by adopting the combination of the top eight ranked features, as depicted in Table 4. It is clear that the eight top-ranked features are sufficient to obtain the highest efficiency for the fine Gaussian SVM classifier used in this implementation.
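The idea behind rank-based selection can be illustrated with a simplified Relief weighting. This is not the full ReliefF algorithm used in the paper: it uses a single nearest hit and a single nearest miss per sample, treats every other class as one pooled "miss" set, and assumes each class has at least two samples. The data and function names are hypothetical.

```python
def relief_weights(samples, labels):
    """Simplified Relief: reward features that differ across classes
    and penalize features that differ within a class."""
    n, d = len(samples), len(samples[0])
    # Per-feature ranges scale every difference into the interval 0 to 1.
    spans = []
    for f in range(d):
        column = [s[f] for s in samples]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i, (x, y) in enumerate(zip(samples, labels)):
        hits = [s for j, (s, l) in enumerate(zip(samples, labels)) if l == y and j != i]
        misses = [s for s, l in zip(samples, labels) if l != y]
        hit = min(hits, key=lambda s: distance(x, s))     # nearest same-class sample
        miss = min(misses, key=lambda s: distance(x, s))  # nearest other-class sample
        for f in range(d):
            weights[f] += (diff(x, miss, f) - diff(x, hit, f)) / n
    return weights


def top_ranked(weights, k):
    """Indices of the k highest-weighted features, best first."""
    order = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
    return order[:k]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.10, 0.5], [0.20, 0.9], [0.15, 0.1],
           [0.90, 0.6], [0.80, 0.2], [0.85, 0.8]]
labels = [0, 0, 0, 1, 1, 1]
weights = relief_weights(samples, labels)
ranking = top_ranked(weights, k=2)
```

Selecting the top eight features, as reported in Table 4, would correspond to calling the hypothetical `top_ranked` helper with k=8 on the eighteen feature weights.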
In the CPSM approach, SVM and ANN are used for misalignment analysis. The SVM classifier is applied to identify the class of misalignment. The conditions to be classified are assigned the numbers 0, 1, and −1 for the healthy condition, offset misalignment, and angular misalignment, respectively. A few samples of output vibration signals are tested with the SVM to confirm the output of fault classification, as shown in Table 5. It contains the classification results obtained for the three conditions, i.e., healthy and the two types of misalignment. Sample cases of parallel and angular misalignment for varying load conditions are considered, as shown in Table 6. The output of the Multiclass SVM is obtained with good accuracy, as shown in Table 6. The selected kurtosis feature of the DC of all normalized vibration signals is taken as input to the SVM. The points that have a significant effect on improving the accuracy of CPSM are as follows. The most important part of CPSM is the normalization of the vibration signal. The mother wavelet is correctly selected on the basis of the ESE ratio. The selection of the correct ANN structure has also contributed to minimizing the MSE. The application of the ReliefF algorithm for deciding the rank of features and selecting the top-ranked feature combination for training has also helped to improve the classification accuracy of misalignment.

Conclusion
A combined approach of Support Vector Machine and Artificial Neural Network is implemented successfully to obtain classification and prediction of shaft misalignment. The main contribution of the study lies in implementing normalization of signals and ranking of features, which is uncommon in misalignment analysis. The selection of a proper mother wavelet on the basis of maximum ESE has contributed well to feature selection. The classification and prediction of shaft misalignment method shows the least error in the output results for misalignment fault classification and MoM. The accuracy of the Support Vector Machine is at a good level for all conditions, viz. healthy, parallel, and angular misalignment. In the tested results of the trained artificial neural network, the 3-20-11 structure is observed to be the best structure for the measure of offset and angular misalignment. The use of the first eight ranked features has improved classification accuracy. It is observed that the selection of epochs and learning rate also affects the minimization of error. The different samples tested for parallel and angular misalignment have been classified successfully with the support vector machine. The average error observed in the output of the trained artificial neural network is 2.28%. It is concluded that the error in the output of classification and prediction of shaft misalignment is within limits due to normalization, correct wavelet selection on the basis of the maximum energy to Shannon entropy ratio, and rank-based feature selection using the ReliefF algorithm. This approach is helpful for effective real-time condition monitoring of machines. In future, this work can be extended to prognosis of other faults related to bearings, gears, insufficient lubrication, and unbalance combined with misalignment to ensure effective condition-based maintenance.

Data Availability
Data used in this work can be made available on request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.