Machine Learning-Based Fault Diagnosis of Self-Aligning Bearings for Rotating Machinery Using Infrared Thermography

Department of Mechanical Engineering, National Institute of Technical Teacher’s Training and Research, Chandigarh, India
Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
School of Interdisciplinary Research, Indian Institute of Technology, Delhi, India
Laboratory of Robotics, Informatics and Complex Systems (LR16ES07), National Engineering School of Tunis, University of Tunis El Manar, BP. 37, Le Belvédére, 1002 Tunis, Tunisia


Introduction
Condition-based maintenance and condition monitoring refer to maintaining equipment on the basis of the real-time condition of the machine's subsystem(s). Every year, industries around the world spend billions of dollars on plant maintenance; it has been documented that maintenance expenses can account for up to a third of production expenses [1,2]. Bearings are core components of the rotating machines in any industrial unit, but they are frequently exposed to severe conditions during operation that can cause catastrophic failure; therefore, early detection of bearing faults is essential [3][4][5]. Bearing fault (BF) is a widely recognized fault in any rotating machine. The bearing failure may be due to lack of lubrication, disproportionate greasing, corrosion, overheating, and so on. The malfunctioning of the bearing gives rise to process downtime and increases the maintenance cost [6,7]. For the diagnosis of such faults, numerous condition monitoring (CM) techniques have been used over the last few decades, namely, vibration-based CM, acoustic emission, and motor current signature analysis. These traditional techniques are expensive since their setup includes sensors and a data acquisition system; moreover, mounting the sensors is quite difficult. Infrared thermography (IRT), by contrast, is a well-known CM technique that is noninvasive and contactless. Owing to its noninvasive and contactless nature, reliability, fast response, and accuracy, IRT has been adopted in this article for fault diagnosis [8,9]. Further, the acquired thermal data are processed by image processing techniques.
The denoising of thermal images is commonly done through an image processing technique named the two-dimensional discrete wavelet transform (2D-DWT), which has been pursued in the current work. The decomposition process yields high-dimensional data, which not only degrades performance but also makes the data difficult to store. To eliminate this problem, principal component analysis (PCA) has been applied in this article to reduce the dimensionality of the extracted features, so that the most relevant features are retained. Lastly, the selected features are used by three classifiers, namely, support vector machine (SVM), linear discriminant analysis (LDA), and k-nearest neighbor (KNN), for the diagnosis of bearing faults, and the results of these classifiers have been compared.
The article is organized into the following sections. Section 2 presents a comprehensive literature survey on fault diagnosis of bearings using IRT as the CM technique. Section 3 describes the experimental methodology for the detection of bearing faults. Section 4 covers the experimental setup and data processing. Section 5 presents the results obtained by using the different classifiers, whereas Section 6 gives the conclusions for the fault diagnosis of bearings.

Fault Diagnosis of Bearings Using IRT
IRT has been considered one of the most promising emerging CM techniques, with numerous applications. IRT has been used in civil construction [10,11], inspection of power supplies [12,13], estimation of plastic distortion [14], surveying of fatigue fractures [15], and analysis of printed circuit boards [16]. In industries like aerospace [17], wood [18], and nuclear [19], IRT serves as an evolving CM tool for the detection of faults. Based on the external energy source, IRT is predominantly classified as active or passive. In active IRT, an external energy source is required to generate thermal contrast, whereas passive thermography does not require an external energy source. Image processing together with machine learning methods has made IRT a more compelling CM technique for the fault diagnosis of rotating machinery; since the relevant information is extracted from the thermal signal by the image processing technique, a suitable and competitive technique is needed for diagnosing the bearing faults. In the early period, the fast Fourier transform (FFT) was used very frequently as the signal processing method, but over time the discrete wavelet transform (DWT) has been adopted by various researchers for bearing fault diagnosis [20]. Yang et al. in [21] used histogram features for the fault detection of rotating machines. Younus et al. in [22] used 2D-DWT for the decomposition of thermal images in their research on fault diagnosis of rotary machines. Lim et al. in [23] used an innovative fault diagnostic technique, compared vibration signals with thermal images, and reported a plausible accuracy. Schulz et al. in [24] used IRT for the fault diagnosis of bearings under different bearing conditions, and the features taken into consideration were the Gini coefficient, the moment of light, and the standard deviation.
Jeffali et al. in [25] presented a novel methodology using IRT for the fault diagnosis of the asynchronous induction motor, which in turn is helpful in predicting the remaining useful life of the machine. The selection of features is a very crucial step during the fault diagnosis of rotary machines. Eftekhari et al. in [26] used a single Gaussian model (SGM) based on the Mahalanobis distance (MD) for the segmentation of the hot spot region in an induction motor. Hwang et al. in [27] presented an integrated system for the fault diagnosis of bearings using the cepstrum coefficient method and artificial neural network (ANN) models. Zhiyi et al. in [28] proposed an intelligent method for rotor-bearing fault diagnosis using thermal images and an enhanced CNN transferred from a convolutional autoencoder. Glowacz in [29] proposed an effective method for the fault diagnosis of electric impact drills using IRT. The author proposed BCAoID as the feature extraction method, and the extracted features were analyzed using NN and BNN classifiers. Eren in [30] proposed an intelligent method using a one-dimensional CNN for bearing fault detection. The author proposed a fast and accurate system that combines the feature extraction and classification phases into a single learning body with the help of a one-dimensional CNN, which reduces the computational complexity without compromising the fault detection accuracy. Chen et al. in [31] proposed a method for roller bearing fault diagnosis using empirical mode decomposition (EMD) and quantile permutation entropy, with an SVM classifier used for classification and performance evaluation. Kaur and Singh in [32] proposed a multiobjective evolutionary approach for the optimization of a hyperchaotic map in image encryption. The optical parameters of the hyperchaotic map were obtained by a dual local search-based multiobjective optimization. Using these optical parameters, the secret keys were created, which further help in the encryption process. Two levels of permutation and diffusion operations were used in the encryption process for better performance.
Kaur et al. in [33] proposed a minimax differential evolution approach for the optimization of a 7D hyperchaotic map in image encryption. The optical parameters of the 7D hyperchaotic map were obtained by minimax differential evolution. Using these optical parameters, the secret keys were created, which are further used to perform the diffusion operation on the input image for the encryption process. Lei et al. in [34] presented a novel technique for the fault detection of bearings using empirical mode decomposition and multiple adaptive neuro-fuzzy inference systems (ANFISs). Kankar et al. in [35] proposed an effective method for ball bearing fault diagnosis using support vector machine (SVM) and artificial neural network (ANN) classifiers, and presented a comparative experimental study of the effectiveness of both classifiers.

Experimental Methodology
The experimental methodology used for the fault diagnosis is shown in Figure 1. Firstly, the data are acquired in the form of thermal images using the FLIR E60 thermal camera. Further, the thermal images were decomposed by 2D-DWT, and features were extracted. The decomposition process yields high-dimensional data, which not only degrades performance but also makes the data difficult to store. To remove this problem, principal component analysis (PCA) has been applied to reduce the dimensionality of the extracted features, so that the most relevant features are retained. Lastly, the selected features are used by three classifiers, namely, support vector machine (SVM), linear discriminant analysis (LDA), and k-nearest neighbor (KNN), for the diagnosis of bearing faults, and the results of these classifiers have been compared.

Experimental Setup and Data Processing
The experimentation was carried out on a bearing test rig with a single-phase, 2 HP, 2-pole, 220 V induction motor under diverse bearing conditions, and the FLIR E60 thermal camera was used to capture the thermal images for fault diagnosis of bearings in rotating machines. The setup used for the experimentation work is shown in Figure 2. The thermal camera captures the thermal radiation emitted by the object and then converts that radiation into temperature by taking some input variables into account. The relative humidity, emissivity, temperature, and distance are some of the decisive variables of the FLIR E60 thermal camera, but emissivity is the most significant variable as it varies with the bearing surface temperature; therefore, the emissivity value for cast iron, in the range 0.65-0.77, was kept unchanged during the present experimentation work. The working distance of the bearing under examination from the thermal camera is about 2.5 feet, as presented in Figure 2. The specification of the FLIR E60 thermal camera is shown in Table 1. The specification of the induction motor used for experimental measurement is shown in Table 2.
The type of bearing used in the experimentation work, along with its specifications, is presented in Table 3. Bearings with an inner diameter of 25 mm and an outer diameter of 52 mm were used, and three different bearing conditions were considered. Of these three bearing conditions, one is a healthy bearing (H) and the other two are faulty bearings, namely, inner race fault (IRF) and outer race fault (ORF). The IRF and ORF states were artificially created at similar depth with the help of wire electrical discharge machining, as shown in Figure 3. The real-time thermal images of the three bearing states captured by the thermal camera are presented in Figure 4. The experimentation cycle for each bearing state lasted 90 minutes. The parameters taken into consideration during the experimentation were the load (kg) and the shaft speed (rpm). The variation in the load, as well as the shaft speed, is shown in Table 2. The thermal camera captures a thermal image after every successive interval of 15 min until the cycle is completed, which gives six images per cycle. Since there are three bearing states, three loads, and three shaft speeds, the full factorial design yields 162 (6 × 3 × 3 × 3) thermal images during the experimentation work, as shown in Table 4.
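The image count of the full factorial design can be checked with a short enumeration. The sketch below is illustrative only: the loads (no load, 2 kg, 4 kg) and the 500 rpm speed are taken from the text, while the other two shaft speeds are hypothetical placeholders, since the actual values are listed in a table that is not reproduced here.

```python
from itertools import product

# Bearing states named in the text: healthy, inner race fault, outer race fault.
BEARING_STATES = ["H", "IRF", "ORF"]
# Loads in kg (no load, 2 kg, 4 kg are mentioned in the text).
LOADS_KG = [0, 2, 4]
# Only 500 rpm is stated in the text; 1000 and 1500 are hypothetical placeholders.
SHAFT_SPEEDS_RPM = [500, 1000, 1500]
# One image every 15 min over a 90 min cycle gives six capture times.
CAPTURE_TIMES_MIN = list(range(15, 91, 15))


def build_image_plan():
    """Enumerate every (state, load, speed, capture time) combination."""
    return [
        {"state": s, "load_kg": w, "speed_rpm": n, "minute": t}
        for s, w, n, t in product(
            BEARING_STATES, LOADS_KG, SHAFT_SPEEDS_RPM, CAPTURE_TIMES_MIN
        )
    ]


plan = build_image_plan()
print(len(plan))  # 162
```

Each load and speed combination therefore contributes 18 images (6 capture times × 3 bearing states).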

Image Processing.
Image processing is a crucial step in fault diagnosis, as it converts the images acquired from the thermal camera into digital form so that useful information can be extracted from them. In this article, three different bearing conditions were taken into consideration, and the acquired thermal images of these bearings contain considerable noise and high-dimensional data; therefore, their decomposition or denoising becomes a crucial step. For this purpose, discrete wavelet transform (DWT) has been applied in this article. The real-time decomposition of the original image into the denoised image using DWT is shown in Figure 5. DWT has been accepted as one of the outstanding approaches for the decomposition of thermal images [30]. 2D-DWT is a comparably straightforward extension of 1D-DWT. It can be perceived as a series of consecutive levels of decomposition in which an original image with a particular scale is decomposed by using a high pass and a low pass filter and then down-sampled by a factor of 2. This process is done along both the rows and the columns of the sample image. The output obtained at each level by using 2D-DWT is the wavelet coefficients, named the approximation and detail coefficients. The four sub-bands obtained after first-level decomposition are the LL, LH, HL, and HH sub-bands, as shown in Figure 6. The LL sub-band gives an approximation of the input image, the LH sub-band extracts the horizontal features of the input image, the HL sub-band extracts the vertical features of the input image, and the HH sub-band emphasizes the edges along the diagonals of the image. The scale of the input image ($I_0$) at $m = 0$ is given by $2^m = 2^0 = 1$. The other sub-bands, at $m = 1$, are obtained by filtering and subsampling $I_0$, where $\downarrow$ and $*$ represent the subsampling and convolution of the input image, and $L_x$, $L_y$, $H_x$, and $H_y$ represent the low and high pass filters.
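For reference, one commonly used way of writing the first-level sub-bands with the symbols defined above is the following. This is an assumed form and not necessarily the authors' exact notation:

$$
\begin{aligned}
LL_1 &= \left[ L_x * \left[ L_y * I_0 \right]_{\downarrow 2,1} \right]_{\downarrow 1,2}, &
LH_1 &= \left[ L_x * \left[ H_y * I_0 \right]_{\downarrow 2,1} \right]_{\downarrow 1,2}, \\
HL_1 &= \left[ H_x * \left[ L_y * I_0 \right]_{\downarrow 2,1} \right]_{\downarrow 1,2}, &
HH_1 &= \left[ H_x * \left[ H_y * I_0 \right]_{\downarrow 2,1} \right]_{\downarrow 1,2}.
\end{aligned}
$$

A minimal sketch of a single decomposition level is given below. It assumes the Haar wavelet, which the text does not specify, and the function name `haar_dwt2` is hypothetical; it is meant to illustrate the filter-then-subsample structure, not to reproduce the authors' implementation.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a 2D Haar DWT; returns the LL, LH, HL, HH sub-bands.

    Assumes a 2D array with an even number of rows and columns.
    """
    img = np.asarray(image, dtype=float)
    if img.ndim != 2 or img.shape[0] % 2 or img.shape[1] % 2:
        raise ValueError("expected a 2D image with even height and width")
    s = np.sqrt(2.0)
    # Low/high pass along each row, then keep every second column.
    lo = (img[:, 0::2] + img[:, 1::2]) / s
    hi = (img[:, 0::2] - img[:, 1::2]) / s
    # Low/high pass along each column, then keep every second row.
    ll = (lo[0::2, :] + lo[1::2, :]) / s  # approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / s  # responds to horizontal edges
    hl = (hi[0::2, :] + hi[1::2, :]) / s  # responds to vertical edges
    hh = (hi[0::2, :] - hi[1::2, :]) / s  # responds to diagonal detail
    return ll, lh, hl, hh


flat = np.ones((4, 4))
ll, lh, hl, hh = haar_dwt2(flat)
print(ll.shape)  # (2, 2)
```

For a deeper decomposition, the same function can be applied again to the returned LL sub-band.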

Extraction of Features.
Extracting the features from the procured thermal images is an essential step of fault diagnosis.
The various attributes such as texture, pixel, and region of interest can be obtained through the features of the input image. In this article, a total of eleven statistical features have been extracted for further processing. These features include mean, standard deviation, variance, skewness, kurtosis, median, energy, correlation, entropy, contrast, and homogeneity. These features have been extracted from thermal images for the healthy and faulty bearing conditions. These sets of features serve as an input for the feature selection stage, which is a very crucial stage in the fault diagnosis of bearings.
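The eleven features can be computed in several ways, and the text does not give the exact definitions used. The sketch below is one plausible reading, under stated assumptions: the first-order statistics are taken over pixel values, entropy and energy come from the gray-value histogram, and contrast, homogeneity, and correlation come from a gray-level co-occurrence matrix (GLCM) of horizontally adjacent pixels. The function name `statistical_features` and the parameter `glcm_levels` are hypothetical.

```python
import numpy as np


def statistical_features(image, glcm_levels=8):
    """Return the eleven named features for a 2D image (width >= 2)."""
    img = np.asarray(image, dtype=float)
    x = img.ravel()
    mean, std, var = x.mean(), x.std(), x.var()
    centered = x - mean
    # Standardized third and fourth moments; set to 0 for a flat image.
    skewness = float((centered ** 3).mean() / std ** 3) if std > 0 else 0.0
    kurtosis = float((centered ** 4).mean() / std ** 4) if std > 0 else 0.0

    # Histogram over the distinct gray values: entropy in bits, and energy.
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    entropy = float(-(p * np.log2(p)).sum())
    energy = float((p ** 2).sum())

    # Quantize to glcm_levels gray levels and count horizontal neighbor pairs.
    span = img.max() - img.min()
    if span == 0:
        q = np.zeros(img.shape, dtype=int)
    else:
        q = ((img - img.min()) / span * glcm_levels).astype(int)
        q = np.minimum(q, glcm_levels - 1)
    glcm = np.zeros((glcm_levels, glcm_levels))
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm = glcm + glcm.T  # make the matrix symmetric
    P = glcm / glcm.sum()
    i, j = np.indices(P.shape)
    contrast = float((P * (i - j) ** 2).sum())
    homogeneity = float((P / (1.0 + (i - j) ** 2)).sum())
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sd_i = np.sqrt((P * (i - mu_i) ** 2).sum())
    sd_j = np.sqrt((P * (j - mu_j) ** 2).sum())
    if sd_i * sd_j > 0:
        correlation = float((P * (i - mu_i) * (j - mu_j)).sum() / (sd_i * sd_j))
    else:
        correlation = 1.0  # convention chosen here for a flat image

    return {
        "mean": float(mean), "std": float(std), "variance": float(var),
        "skewness": skewness, "kurtosis": kurtosis,
        "median": float(np.median(x)), "energy": energy,
        "correlation": correlation, "entropy": entropy,
        "contrast": contrast, "homogeneity": homogeneity,
    }


checker = np.indices((4, 4)).sum(axis=0) % 2  # 4x4 checkerboard of 0s and 1s
features = statistical_features(checker)
print(sorted(features))
```

In a pipeline like the one described, such a function would be applied to each sub-band or denoised image, and the resulting dictionaries stacked into a feature matrix with one row per thermal image.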

Selection of Features.
Selecting the most appropriate features from a complete set of features reduces the computation and improves the classification accuracy. In the present work, PCA has been utilized as the feature selection technique.
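As a rough illustration of the dimensionality reduction step, PCA can be written in a few lines using the singular value decomposition. The function name `pca_reduce` and the toy data are hypothetical, and the number of components retained in the present work is not stated, so it is left as a parameter.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the first n_components principal axes.

    Returns (scores, components, explained_variance_ratio).
    Assumes X has at least two rows and nonzero total variance.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variances = singular_values ** 2 / (X.shape[0] - 1)
    ratio = variances / variances.sum()
    scores = centered @ components.T
    return scores, components, ratio[:n_components]


# Toy feature matrix: four samples whose two features are perfectly correlated.
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
scores, components, ratio = pca_reduce(X_toy, n_components=1)
print(scores.shape)  # (4, 1)
```

Because the eleven features live on very different scales, they are typically standardized (zero mean, unit variance per feature) before PCA is applied; that step is omitted here for brevity.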

Classification of Faults and Performance Evaluation.
For the fault classification, three different classifiers, namely, LDA, KNN, and SVM, were trained on the set of selected features, and the accuracies of these classifiers were compared for the performance evaluation. LDA is used not only as a classifier but is also a well-known method for feature dimension reduction. With the assistance of a linear transformation matrix, LDA projects the features from parametric to feature space. LDA can also be computed for large datasets. In the present work, KNN was also utilized for fault classification. KNN is a supervised learning algorithm that stores the data during training and, based on feature similarity, assigns any new data point to the category whose features are most similar to it. Further, another supervised technique named SVM was used for solving the classification problems. In contrast to other machine learning techniques, SVM proves to be more accurate and reliable, especially for the classification problems associated with IRT [38]. The crucial parameter for solving classification problems through the SVM classifier is the kernel function. Gaussian, quadratic, cubic, and linear kernel functions were considered for performance evaluation. In the present work, the quadratic kernel-based SVM proved to be the most accurate.
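The KNN principle described above, and the quadratic kernel mentioned for the SVM, can be sketched as follows. This is an illustration under assumptions: Euclidean distance and majority voting are assumed for KNN, the value of k used in the present work is not stated, and `(x . y + c)^2` is one common form of the quadratic (degree-2 polynomial) kernel. The function names and the toy feature values are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, X_new, k=3):
    """Classify each row of X_new by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    predictions = []
    for x in X_new:
        distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance
        nearest = np.argsort(distances, kind="stable")[:k]
        votes = Counter(y_train[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def quadratic_kernel(x, y, c=1.0):
    """Degree-2 polynomial kernel (x . y + c)^2, one common 'quadratic' kernel."""
    return float((np.dot(x, y) + c) ** 2)


# Toy 2D features: three well-separated clusters standing in for H, IRF, ORF.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
           [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
           [10.0, 0.0], [10.2, 0.1], [9.9, 0.3]]
y_train = ["H"] * 3 + ["IRF"] * 3 + ["ORF"] * 3

labels = knn_predict(X_train, y_train, [[0.1, 0.1], [5.0, 5.1], [10.0, 0.2]])
print(labels)  # ['H', 'IRF', 'ORF']
```

In practice, a library implementation with cross-validated hyperparameters would normally be preferred over a hand-written classifier; the sketch only makes the nearest-neighbor voting rule concrete.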

Results and Discussion
This section presents the experimental results obtained from the thermal analysis of bearings for fault diagnosis. It first presents the measurement of the response parameter, which is the temperature of the region of interest for the different bearing states. In addition, this section presents the classification and performance evaluation for the different bearing conditions using LDA, KNN, and SVM.

Measurement of Response Parameter for Different Bearing States.
During experimentation, the temperature of the region of interest (the bearing mounted at the free end of the shaft of the bearing test rig) was measured. The recorded temperature values for each bearing state at different shaft speeds and loads are shown in Table 5.

Thermal Performance Curves.
The thermal performance curves for each bearing state at different shaft speeds and different loads are shown in Figures 7(a)-7(i).
At no load and 500 rpm, there is only a slight increase in temperature for the healthy bearing state, which means that the rise in temperature at 500 rpm relative to room temperature is small. For both faulty bearing states, the temperature difference is larger than for the healthy bearing state, which signifies that a fault in the bearing leads to more heat generation and, in turn, a greater temperature rise. As the load increases to 2 and 4 kg, the temperature rise is larger for the faulty bearings because, as the shaft load increases, the load on the motor also increases, which in turn increases the temperature of the bearings recorded by the thermal camera.

Scatter Representation.
In image processing, entropy and energy provide a measure of how the pixel values are distributed across the gray-level range. Usually, an image with few gray levels will have higher energy than one with many gray levels. Contrast is the difference in luminance or color that makes an object distinguishable. In statistics, homogeneity and its opposite, heterogeneity, arise in describing the properties of a dataset or several datasets. They relate to the validity of the often convenient assumption that the statistical properties of any one part of an overall dataset are the same as those of any other part, whereas the standard deviation gives important information about the contrast of the image. The 2D feature space of different selected features, such as contrast, homogeneity, mean, kurtosis, entropy, and standard deviation, is shown in Figure 8. Standard deviation is taken as the reference among all the extracted features because it is a measure of variability or diversity used in statistics. In terms of image processing, it shows how much variation exists from the average value. A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data points are spread out over a very large range of values.
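The relation between the number of gray levels, energy, and entropy can be checked on a toy example. The sketch below assumes the histogram-based definitions (energy as the sum of squared gray-level probabilities, entropy in bits); the function name `histogram_energy_entropy` and the pixel values are illustrative only.

```python
from collections import Counter
from math import log2


def histogram_energy_entropy(pixels):
    """Return (energy, entropy in bits) of the gray-level histogram."""
    counts = Counter(pixels)
    total = len(pixels)
    probabilities = [count / total for count in counts.values()]
    energy = sum(p * p for p in probabilities)
    entropy = -sum(p * log2(p) for p in probabilities)
    return energy, entropy


two_levels = [0, 0, 255, 255]    # only two gray levels
four_levels = [0, 85, 170, 255]  # four equally likely gray levels

print(histogram_energy_entropy(two_levels))   # (0.5, 1.0)
print(histogram_energy_entropy(four_levels))  # (0.25, 2.0)
```

The image with fewer gray levels has the higher energy and the lower entropy, in line with the statement above.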

Fault Classification and Performance Evaluation.
The current section presents the results for the different bearing conditions using LDA, KNN, and SVM. The results obtained through these classifiers are presented in the form of a matrix known as the confusion or error matrix. The confusion matrix describes the performance of each classifier for each bearing condition in the form of rows and columns. Rows in the matrix represent the predicted class, whereas columns refer to the true class. The confusion matrix of the LDA classifier for the different bearing states at different loads and shaft speeds is presented in Table 6. The results show that the highest accuracy achieved by the LDA classifier is 94.4%, obtained at no load and 500 rpm. However, as the load and the shaft speed increase, the accuracy reduces to nearly 90%. The confusion matrix obtained by using KNN as a classifier for the different bearing states at different loads and shaft speeds is presented in Table 7. From the results, it is clear that LDA surpasses KNN, with 100% accuracy for the healthy condition at the different loads and shaft speeds. The confusion matrix obtained by using SVM as a classifier for the different bearing states at different loads and shaft speeds is presented in Table 8. From the results, it is clear that 100% accuracy has been achieved for all bearing states at no load and at 2 kg load, whereas at 4 kg load 100% accuracy is achieved only for the healthy and inner race conditions, and the accuracy reduces to nearly 95% for the outer race condition.
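A confusion matrix with the row and column convention stated above (rows for the predicted class, columns for the true class) can be built as sketched below. The labels are hypothetical and do not reproduce the entries of Tables 6 to 8; they only illustrate how one misclassified image out of the 18 available per load and speed combination (162 / 9) corresponds to an accuracy of 17/18, about 94.4%.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the predicted class, columns index the true class."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[p]][index[t]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def per_class_accuracy(matrix, classes):
    """Correct predictions for each true class divided by its column total."""
    result = {}
    for k, c in enumerate(classes):
        column_total = sum(row[k] for row in matrix)
        result[c] = matrix[k][k] / column_total if column_total else float("nan")
    return result


classes = ["H", "IRF", "ORF"]
# Hypothetical labels: 6 images per state, one ORF image predicted as IRF.
true = ["H"] * 6 + ["IRF"] * 6 + ["ORF"] * 6
pred = ["H"] * 6 + ["IRF"] * 6 + ["ORF"] * 5 + ["IRF"]

cm = confusion_matrix(true, pred, classes)
print(cm)                              # [[6, 0, 0], [0, 6, 1], [0, 0, 5]]
print(round(overall_accuracy(cm), 3))  # 0.944
```

Reading down a column shows how the images of one true bearing state were distributed over the predicted states.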
The results make clear that SVM outperformed LDA and KNN in every aspect of bearing fault classification. An overall summary of fault diagnosis of various faults using IRT is presented in Table 9. The results indicate that the IRT-based approach could be efficiently utilized for fault detection of bearings in rotating machines.

Conclusions
The present work proposed an intelligent IRT-based system for fault classification of distinct bearing states. The acquired thermal images of the distinct bearing conditions were first preprocessed using 2D-DWT, after which the most appropriate features were selected through PCA; these features were then used for classification and performance evaluation with LDA, KNN, and SVM, wherein SVM outperformed both LDA and KNN. The main outcomes obtained from the present work are as follows:
(i) DWT gives diverse resolutions at diverse frequencies while analyzing the signal, which makes DWT a better approach for the decomposition of the thermal images
(ii) PCA has been applied for the selection of the most relevant features among the set of features, which in turn reduces the computation and enhances the classification accuracy
(iii) Fault classification was done by using three classifiers, namely, LDA, KNN, and SVM, among which SVM outperformed both LDA and KNN
(iv) The present research work using the IRT approach for fault diagnosis compares well with the various approaches utilized by other researchers
The classification outcomes indicate that the present scheme could be utilized for detecting and inspecting rotating machine faults and their condition. A multisensor-based approach combining a thermal camera with an accelerometer or acoustic emission sensor can be considered for studying the different behavior of rotating components. ANSYS tools can also be used to investigate the thermal behavior of bearings in greater depth.

Abbreviations
CM: Condition monitoring
NDT: Nondestructive testing
AE: Acoustic emission
IRT: Infrared thermography
2D-DWT: 2-dimensional discrete wavelet transform
ANN: Artificial neural network
RTD: Resistance temperature detector
MoASoID: Method of areas selection of image differences
BCAoID: Binarized common area of image differences
NN: Nearest neighbor classifier
CWT: Continuous wavelet transform
SVM: Support vector machine
ICA: Independent component analysis
PCA: Principal component analysis
MD: Mahalanobis distance
STD: Standard deviation
ROI: Region of interest
LDA: Linear discriminant analysis
STFT: Short-term Fourier transform
FFT: Fast Fourier transform
MCSA: Motor current signature analysis
BNN: Back propagation neural network
SURF: Speeded up robust features
GMM: Gaussian mixture model
BoVW: Bag-of-visual word
SIFT: Scale invariant feature transform
CNN: Convolutional neural network

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.