Vibration Analysis for Machine Monitoring and Diagnosis: A Systematic Review

Untimely machinery breakdown incurs significant losses, especially for manufacturing companies, because it affects production rates. During operation, machines generate vibrations, and some of these are unwanted vibrations that disrupt the machine system and result in faults such as imbalance, wear, and misalignment. Thus, vibration analysis has become an effective method to monitor the health and performance of the machine. The vibration signatures of the machines contain important information regarding the machine condition, such as the source of failure and its severity. Operators are also provided with an early warning for scheduled maintenance. Numerous approaches for analyzing the vibration data of machinery have been proposed over the years, and each approach has its own characteristics, advantages, and disadvantages. This manuscript presents a systematic review of up-to-date vibration analysis for machine monitoring and diagnosis. It covers data acquisition (the instruments applied, such as analyzers and sensors), feature extraction, and fault recognition techniques using artificial intelligence (AI). This manuscript also aims to answer several research questions (RQs). A combination of time domain statistical features and deep learning approaches is expected to be widely applied in the future, where fault features can be automatically extracted from the raw vibration signals. The presence of various sensors and communication devices in the emerging smart machines will present a new and huge challenge in vibration monitoring and diagnosis.


Introduction
Machines are widely used in today's industries and are paramount for factory operation. They must be monitored carefully, because an unexpected breakdown causes massive losses to the company. This can be prevented by diagnosing the machine to determine the fault or potential fault such as imbalance, wear, misalignment, defective bearing, friction whirl, and cracked teeth in gearing [1,2]. Several diagnosis methods have been applied over the years, including oil analysis, vibration signal analysis, particle analysis, corrosion monitoring, acoustic signal analysis, and wear debris analysis [3,4]. Among these, acoustic and vibration signal analysis have emerged as popular choices because many faults can be identified without stopping the machine or tearing it down. Changes in these signals often indicate the presence of a fault. Acoustic analysis has the advantages of short analysis time, high recognition efficiency, and nondestructive testing. However, it is very challenging to capture the acoustic signals properly because of factors such as environmental conditions, different parameters of the recording software, and reflected acoustic signals [5]. Vibration signal analysis also has advantages and disadvantages. Real-time machine monitoring can be achieved using vibration analysis, and many well-developed signal processing techniques can be applied. The limitations of vibration analysis are noise contamination and the need for a proper mounting position of the vibration sensors [6]. Another technique that can be used for machine monitoring and diagnosis is thermal imaging analysis. In this analysis, an infrared camera is usually used to detect many electrical faults in the machine based on thermal anomalies. The thermal images obtained are useful in detecting and locating the machine's faults.
However, this technique is expensive and requires a longer time to process the thermal images compared to processing the acoustic and vibration signals. Vibration analysis is considered the best method for determining the machine condition [7]. According to Saucedo-Dorantes et al. [8], more than 82% of fault diagnosis techniques are conducted by means of vibration analysis. Machines are mostly made up of moving parts that generate unwanted vibration, and with vibration analysis, a decision can be made on whether the machine can continue to operate or needs to be shut down and repaired [3]. The machine's condition can be determined from the vibration amplitude and frequency, which reveal the severity and the source of the machine problem, respectively [9]. At first, without vibration equipment, machine conditions could still be diagnosed by trained personnel whose brain, coupled with the senses of touch and hearing, acted as a vibration analyzer. However, human perception is limited, and it is impossible to detect problems that are beyond the capability of the human senses of touch and hearing. Later, vibration analysis was based on a real-time spectral analyzer, and now it can be categorized into time, frequency, and time-frequency domain analysis [10]. Time and frequency domain analyses examine the data with respect to time and frequency, respectively. Time-frequency domain analysis uses both the time and frequency domains at the same time [3]. Vibration analysis for most machine monitoring and diagnosis can be divided into low-speed and high-speed machines. Although there is currently no universally accepted speed range to differentiate the two types, machines with rotating speeds up to 600 rpm, such as wind turbines and paper mills, are considered low-speed machines. According to Kim et al.
[11], low-speed machines can be very time-consuming and more difficult to monitor than high-speed machines because the vibration energy generated by the rotating elements and by the fault is typically low, so the fault is hard to detect until it rises above the background noise level. Vibration monitoring of low-speed machines is highly associated with low-speed bearings in ensuring the reliability of the machine. The vibration analysis for machine monitoring and diagnosis typically consists of three main steps: data acquisition, signal processing, and fault recognition. To date, many techniques and instruments are used in each of these steps, and choosing the right ones can be challenging.
This is because each method and instrument has its own characteristics, advantages, and disadvantages. These methods can be divided into two main groups: model-based and data-driven methods. Model-based methods require an analytical model of the system, whereas data-driven methods do not need any assumption about the system's model. In data-driven methods, advanced signal processing techniques are applied. Because it is very difficult to model a faulty system, data-driven methods are more widely applied in machine diagnosis and monitoring than model-based methods. Thus, the main contribution of this article is the review of various data-driven vibration analysis techniques and instruments used for monitoring and diagnosing machines. However, because the range of techniques is wide, only the widely used ones are discussed in this article. There are several review articles in this area, and Table 1 discusses each of them. Based on Table 1, we aim to fill the research gap in reviews of vibration analysis for machine monitoring and diagnosis, namely that the data acquisition system and the comparison between different vibration analysis techniques, including the latest deep learning approaches, are not discussed. We also aim to answer the research questions described in Table 2.
This manuscript is organized as follows. In the next section, the research approach for this review article is discussed. Then, Sections 3-5 follow the vibration analysis steps, as shown in Figure 1. Firstly, in Section 3, the data acquisition stage is discussed. This stage involves the types of vibration sensors used to obtain the vibration data and the analyzers for analyzing the acquired data. The data acquisition stage is not discussed in most of the published review articles. Different sensor mounting techniques are also discussed in this section. Section 4 covers the signal processing or feature extraction methods used by researchers over the years, based on the time, frequency, and time-frequency domains. The final stage in vibration analysis, the fault recognition stage, is discussed in Section 5. In this section, different AI-based methods used in the fault recognition step are discussed, such as support vector machine (SVM), neural network (NN) including deep learning, fuzzy logic, and genetic algorithm (GA). The discussions and findings of our review are explained in Section 6, and we conclude our study in Section 7.

Research Approach
A comprehensive literature study was conducted to answer the five research questions. A Systematic Literature Review (SLR) approach was employed to collect the relevant primary studies regarding the vibration analysis for machine monitoring and diagnosis [17]. Firstly, we classified the articles and the selected papers were then analyzed and differentiated through the content analysis method.
The results can be classified into four main categories: (1) Survey, where review articles made by other researchers on vibration analysis for machine monitoring and diagnosis are discussed.

Analyzer
An analyzer is an instrument used to analyze the vibration data produced by machinery. It is composed of a sensor (presented in a later section of this paper), an amplifier, a filter, and an A/D converter. The signal from the vibration sensor passes through the amplifier to increase the resolution and signal-to-noise ratio. The amplified signal then passes through a filter so that aliasing is not encountered in the digitization stage. The signal is digitized in the A/D converter and then goes through the processing unit, where it can be displayed as a time waveform or processed further to obtain the frequency spectrum [10,18]. Vibration analyzers can be divided into conventional and computer-based vibration analyzers. A conventional vibration analyzer is a standalone instrument built specifically for vibration measurement. It is a complex and expensive instrument, usually used by vibration experts. This instrument can help the user to determine the presence of a problem as well as its root cause and the time remaining before the machine fails. There are single, dual, and four-channel analyzers available in the market. A single-channel analyzer can only receive input from one accelerometer at a time, whereas a dual-channel analyzer can receive inputs from two differently placed accelerometers at the same time [19]. A four-channel analyzer can accept input from multiple sensors and is capable of simultaneous horizontal, vertical, and axial measurements together with early bearing fault detection. It is usually used with a triaxial accelerometer. The key advantage of a four-channel analyzer is the ability to observe the operating deflection shape (ODS) of a machine. Nuawi et al. [20] used a four-channel vibration analyzer to monitor bearing condition, and applications of a dual-channel analyzer for machine monitoring can be seen in [21,22]. A cheaper alternative is a handheld vibration meter.
Table 1: Review articles in this area.
Reference | Contribution | Limitation
[12] | Presented a review of some vibration feature extraction methods applied to different types of rotating machines | No discussion on the data acquisition system and AI-based fault recognition techniques
Kumar et al. [3] | Reviewed various techniques including the AI methods used for fault diagnosis based on vibration analysis methods | No discussion on the data acquisition system, and the comparison between different feature extraction and fault recognition methods is not presented
Boudiaf et al. [13] | Presented the vibration analysis techniques in terms of their capabilities, advantages, and disadvantages in monitoring rolling element bearings | Only discussed four types of vibration analysis techniques
Aherwar [14] | Discussed a variety of vibration analysis approaches in diagnosing the rotating machinery including the AI methods | No discussion on the data acquisition system and the review was conducted in 2012; thus, it lacks the latest AI methods such as deep learning
Sait and Sharaf-Eldeen [15] | Presented a review of vibration-based analysis damage detection techniques to monitor gearbox condition |

This battery-powered device is equipped with an accelerometer and provides a display of vibration levels when in contact with machinery [19]. It requires very little skill to use, but its measurement capability is somewhat limited and it lacks data storage capability. A computer-based vibration analyzer is an emerging instrument in which vibration data are processed virtually with the help of specific software and a personal computer.
This method has gained popularity because it is simple, inexpensive, and easy to repair, and it can perform most of the functions available in the conventional vibration analyzer, such as oscilloscope, multimeter, and waveform generator. LabVIEW is a widely used programming language in this method because of the wide range of data acquisition cards and measurement systems it supports [23-25]. Ansari and Baig [23] used a computer-based vibration analyzer to monitor the condition of a machine, and they found that a conventional vibration analyzer is faster and more accurate. To overcome these limitations, dedicated hardware such as a Digital Signal Processor (DSP) or Field Programmable Gate Array (FPGA) was used alongside a personal computer and sensor for user control and result display [18]. The computer processor usually has to handle the whole operating system in addition to the virtual analyzer, whereas the DSP performs only one task. This makes the vibration analysis faster in a computer equipped with a DSP. In [26], vibration analysis of a rotating machine was conducted by employing two DSPs. Rangel-Magdaleno et al. [27] conducted a vibration analysis on a CNC machine using an FPGA device. The FPGA processed the vibration data, and the computer screen displayed the results obtained for further analysis. Rodriguez-Donate et al. [28] developed an online monitoring system for an induction motor using an FPGA, and they found that the FPGA-based system has better processing speed than a DSP and that all the peripheral digital structures and the processing unit can be included in a single chip. Compared to a DSP, a computer-based vibration analyzer using an FPGA is better because it can achieve true parallelism. Both devices provide better performance than a computer-based vibration analyzer alone [18]. More information regarding DSP and FPGA analyzers can be found in [29-31].
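The role of the filter ahead of the A/D converter can be illustrated numerically. The following is a minimal sketch, not taken from the reviewed works, assuming a hypothetical sampling rate of 1000 Hz and a hypothetical 900 Hz vibration component: once sampled, that component produces exactly the same samples as a 100 Hz component, which is why content above half the sampling rate has to be removed before digitization. The function names are illustrative only.

```python
import math


def sample_tone(freq_hz, fs_hz, n_samples):
    """Sample a unit-amplitude cosine of frequency freq_hz at rate fs_hz."""
    return [math.cos(2 * math.pi * freq_hz * k / fs_hz) for k in range(n_samples)]


def alias_frequency(freq_hz, fs_hz):
    """Apparent frequency of a real tone after sampling at fs_hz."""
    folded = freq_hz % fs_hz
    return min(folded, fs_hz - folded)


FS = 1000.0          # hypothetical sampling rate in Hz
TONE = 900.0         # hypothetical component above FS / 2

apparent = alias_frequency(TONE, FS)
high = sample_tone(TONE, FS, 64)
low = sample_tone(apparent, FS, 64)

# Largest sample-by-sample difference between the two sampled tones.
max_diff = max(abs(a - b) for a, b in zip(high, low))
print(apparent, max_diff)
```

Because the two sample sequences agree to within floating-point rounding, no later processing step can tell them apart; only an analog low-pass filter before sampling avoids the ambiguity.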

Sensor
A sensor or transducer is a device that converts mechanical signals to electrical signals [32]. The type of sensor used usually depends on the frequency range, sensitivity, design, and operational limitations. Whatever type of sensor is used, the stiffer its mounting, the higher the frequency range and the reading accuracy [33]. In vibration analysis, three sensors are widely used for acquiring the vibration signal: the accelerometer, the velocity sensor, and the displacement sensor. The noncontact LDV sensor is also discussed in this section. The advantages and disadvantages of each vibration sensor can be seen in Table 4.

Sensor Mounting Method.
Choosing a mounting method and implementing it correctly is an important factor in vibration data collection. For continuous or online monitoring of machine condition, vibration sensors are usually mounted permanently at a specific location on the machine. Mounting can be divided into four main methods, namely, stud-mounted, adhesive-mounted, magnet-mounted, and nonmounted. Stud mounting is usually preferable for permanent mounting applications. The sensor is screwed onto a stud and secured to the machine. Apart from being highly reliable and secure, this mounting technique has the widest frequency response of all the methods. The location where the sensor is to be mounted must be clean and paint free, because any irregularities in the mounting surface will produce improper measurements or, worse, damage the sensor itself [19]. For adhesive mounting, no extensive machining is required, as epoxy, glue, or wax is applied. If the machine cannot be drilled for stud mounting, adhesive mounting is generally the best alternative. Although this mounting technique is easy to apply, the accuracy of the measurement is reduced because of the damping in the adhesive [19]. In addition, it is more difficult to remove the sensor than with other mounting methods. The magnetic mounting method is usually limited to temporary applications with a portable analyzer and is not preferable for permanent monitoring because the high-frequency signals might be disrupted. The nonmounting method is typically applied with a probe tip, where there is no external mechanism between the transducer and the target surface. It is usually used in areas that are difficult to reach. The length of the probe tip, however, affects the measurement, with longer probes leading to more inaccuracies.

Accelerometer.
An accelerometer is a device used to measure the vibration or acceleration of a structure, usually expressed in g or in the SI unit m/s². Its working mechanism is that when the piezoelectric material in the accelerometer is subjected to a force, it produces a charge corresponding to the force applied. Because force is directly proportional to acceleration, any change in the acceleration produces a change in the charge, which is then amplified [33]. A uniaxial accelerometer can only detect movement in one plane, whereas a triaxial accelerometer covers all three dimensions. Compared to the uniaxial accelerometer, the triaxial accelerometer has a higher memory capacity but is much more expensive [34]. The accelerometer is a widely used sensor because of its reliability, simplicity, and robustness. It can be further divided into piezoelectric and MEMS accelerometers. A piezoelectric accelerometer relies on the piezoelectric effect of quartz or ceramic crystals, which are usually preloaded, to generate an electrical output that is proportional to the applied acceleration. Changes in the charge produced depend on this acceleration [35,36]. The piezoelectric accelerometer has several advantages, such as better frequency and dynamic range, light weight, and high sensitivity. However, it is vulnerable to interference from the external environment [37]. It also requires electronic integration to obtain velocity and displacement data because it is AC coupled [37]. Salami et al. [38] demonstrated the application of LabVIEW in monitoring and analyzing vibration signals, where a piezoelectric accelerometer was used in their study. Igba et al. [39] installed piezoelectric accelerometers on operational turbines to obtain vibration data for time domain analysis. Khadersab and Shivakumar [40] used a piezoelectric accelerometer to obtain vibration data from rotating machinery to analyze bearing faults.
Figure 3(a) shows vibration measurement using a piezoelectric accelerometer.
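Because an accelerometer delivers acceleration, velocity has to be obtained by integration, as noted above. The following is a minimal numerical sketch of that step, not the method of any cited study. It assumes a hypothetical 50 Hz sinusoidal acceleration of amplitude 9.81 m/s² sampled at 10 kHz, integrates it with the trapezoidal rule, and removes the mean to discard the integration constant. For a single sinusoid of frequency f, the velocity amplitude should equal the acceleration amplitude divided by 2πf.

```python
import math


def integrate_trapezoid(samples, dt):
    """Cumulative trapezoidal integral of uniformly sampled data, starting at 0."""
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def remove_mean(samples):
    """Subtract the average value (removes the unknown integration constant)."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


FS = 10_000.0        # hypothetical sampling rate in Hz
F0 = 50.0            # hypothetical vibration frequency in Hz
ACC_AMP = 9.81       # hypothetical acceleration amplitude in m/s^2
N = 2000             # 0.2 s of data, i.e. 10 full cycles at 50 Hz

accel = [ACC_AMP * math.sin(2 * math.pi * F0 * k / FS) for k in range(N)]
velocity = remove_mean(integrate_trapezoid(accel, 1.0 / FS))

vel_amp_numeric = max(abs(v) for v in velocity)
vel_amp_theory = ACC_AMP / (2 * math.pi * F0)   # a / (2*pi*f), in m/s
print(vel_amp_numeric, vel_amp_theory)
```

In practice, measured signals contain noise and low-frequency drift, so a high-pass filter is normally applied before or after integration; the mean removal here is only the simplest stand-in for that step.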
A MEMS accelerometer usually consists of a movable proof mass with plates, attached to the frame by a mechanical suspension system [41]. When it is subjected to acceleration, the proof mass tends to resist motion because of its own inertia, and therefore the spring is stretched or compressed. As a result, a force corresponding to the applied acceleration is created. The MEMS accelerometer is DC coupled and very suitable for measuring low-frequency vibration and acceleration. It requires low processing power and provides superior sensitivity [41]. Modern MEMS accelerometers provide quite good data quality up to several tens of kHz. The drawback is that they suffer from a poor signal-to-noise ratio. Contreras-Medina et al. [42] used a low-cost MEMS accelerometer to detect machinery failures. Chaudhury et al. [41] used MEMS accelerometers on different rotating machines for vibration monitoring. A performance comparison between conventional piezoelectric accelerometers and MEMS accelerometers can be seen in [43,44]. It was found that the sensitivity of the MEMS accelerometer is more stable than that of the piezoelectric accelerometer and that the low-cost MEMS accelerometer can be a good alternative to the high-cost piezoelectric accelerometer. This sensor has also been applied in [26,39].

Velocity Transducer.
A velocity transducer measures vibration velocity, usually in m/s or cm/s, by means of the voltage produced by the relative movement of the object. It works on the principle of electromagnetic induction, and it can operate without any external device [45]. As the surface on which the sensor is mounted vibrates, the movement of the magnet in the coil produces a voltage proportional to the velocity of the vibration [46]. This voltage signal represents the vibration produced and is then fed to a meter or analyzer [33]. Velocity sensors are not recommended for diagnosing high-speed machinery because the operational frequency range is limited to 10 Hz to 2 kHz [10]. Generally, a velocity transducer costs less than other sensors, and together with its easy installation, this makes it favorable for monitoring the vibration of rotating machinery. However, it is big and heavy, and most velocity transducers are prone to reliability problems at operational temperatures that exceed 121°C [37,47]. A velocity transducer was applied by Rossi [48] to measure the frame vibration of a compressor, which usually comprises frequencies below 10 Hz. Figure 3(b) shows vibration measurement using a velocity transducer.

Displacement Sensor.
A displacement sensor, sometimes called an eddy current or proximity sensor, measures both the relative vibration and the position of the shaft. The displacement can be expressed in m, cm, or mm. It is usually used to measure low-frequency vibration below 10 Hz, but it can also measure vibration up to 300 Hz [45]. However, it does not excel at measuring a shaft bending away from the probe location [47]. Unbalance and misalignment are the types of problems that can be detected by the displacement probe. For measured vibration frequencies above 1 kHz, the amplitude is usually lost in the noise level [47]. The sensor has the advantages of a good dynamic range within a specific frequency range, reasonable sensitivity, and a simple postprocessing circuit with negligible maintenance. However, it is difficult to install and susceptible to shocks, and some traditional displacement sensors are not calibrated for unknown metal materials [37]. Sarhan et al. [49] used a displacement sensor to monitor the cutting forces of a machining center under different cutting conditions. Saimon et al. [50] developed a low-cost fiber optic displacement sensor (FODS) for industrial applications that is immune to electromagnetic interference. The capability of a fiber optic displacement sensor in capturing the amplitude and frequency of vibration was studied by Binu et al. [51], and based on the results, this sensor can solve many sensing problems in aircraft.

LDV.
A laser Doppler vibrometer (LDV) is a noncontact optical measurement instrument that can be applied to determine the vibration velocity at any point on the surface of a particular machine [52,53]. The working mechanism of the LDV is based on the laser Doppler concept, where a frequency-modulated coherent laser beam is reflected from a vibrating surface and the Doppler shift of the reflected beam is compared with the reference beam. Currently, a higher-power infrared (invisible) fiber laser is more popular in LDV than the He-Ne laser. The introduction of this technology has fulfilled the goal of achieving long-range measurements without compromising the signal quality [54]. Continuous-scan laser Doppler vibrometry (CSLDV) has accelerated measurement at many points. The laser beam scans continuously along a defined path across a structure according to the desired scan frequencies. One major advantage of the LDV is the ease of changing the measurement point, which can be done simply by deflecting the laser beam. Despite that, the application of LDV in machine monitoring and diagnosis is limited because of price and portability.
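The relationship between surface velocity and Doppler shift can be made concrete with a short calculation. The sketch below is an illustration, not part of the reviewed works. It assumes the standard back-reflection relation, in which the shift equals twice the surface velocity divided by the laser wavelength, with the beam aligned with the direction of motion. The He-Ne wavelength of 632.8 nm is a well-known constant; the 10 mm/s surface velocity is a hypothetical value.

```python
HE_NE_WAVELENGTH_M = 632.8e-9   # He-Ne laser wavelength in metres


def doppler_shift_hz(surface_velocity_m_s, wavelength_m):
    """Doppler shift of a beam reflected back from a surface moving along the beam."""
    return 2.0 * surface_velocity_m_s / wavelength_m


def velocity_from_shift(shift_hz, wavelength_m):
    """Inverse relation: recover the surface velocity from a measured shift."""
    return 0.5 * shift_hz * wavelength_m


# Hypothetical example: a surface moving at 10 mm/s toward the vibrometer.
shift = doppler_shift_hz(0.010, HE_NE_WAVELENGTH_M)
recovered = velocity_from_shift(shift, HE_NE_WAVELENGTH_M)
print(round(shift), recovered)
```

Even a slow surface velocity of 10 mm/s yields a shift of roughly 31.6 kHz, which is tiny compared with the optical frequency itself and is why the shift is measured by interference with a reference beam rather than directly.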

Time Domain Analysis
The simplest vibration analysis for machine diagnosis is to analyze the measured vibration signal in the time domain. The vibration signals obtained are a series of values representing proximity, velocity, or acceleration, and in time domain analysis, the amplitude of the signal is plotted against time.
Although more sophisticated time domain approaches have been used, visually inspecting the time waveform should not be underestimated because a great deal of information can be obtained in this manner.
This information includes the presence of amplitude modulation, shaft unbalance, transients, and higher-frequency components [55]. However, simply looking at these vibration signals cannot separate the variations caused by different machine failures because of the noisy data, especially at the early stage of failure. Thus, a signal processing method is required to obtain the important information from the time domain signals by converting the raw signals into appropriate statistical parameters such as peak, RMS, crest factor, and kurtosis. Several statistical parameters are usually extracted from the time domain signal so that the most significant parameter, the one that most effectively differentiates between healthy and defective machine vibration signals, can be chosen [56]. In this article, the statistical parameters of peak, RMS, crest factor, and kurtosis are discussed, and the advantages and disadvantages of each parameter can be seen in Table 5.

Peak.
The peak is the maximum value of the signal, v(t), over the measured time and can be defined as [55]

$$\text{peak} = |v(t)|_{\max}. \tag{1}$$

If impacts are present, the peak values of the vibration signal will vary. Under a fault condition, the peak value increases. The fault's severity and type can be assessed based on the amplitudes of the corresponding peaks. The peak value feature was studied by Lahdelma and Juuso [57] to diagnose bearing and gear faults in a machine. The proposed approach is suitable for online analysis because its frequency range requirements are small. Shrivastava and Wadhwani [58] used statistical parameters such as peak, RMS, crest factor, and kurtosis to diagnose a rotating electrical machine. Although all parameters can differentiate between healthy and faulty conditions, they concluded that determining the type of fault in this manner is not very efficient. Igba et al. [39] used the peak value approach to monitor the condition of wind turbine gearboxes because faults can be detected from changes in the peak values. This approach can also counter a limitation of the RMS feature, namely that the RMS is not significantly affected by low-intensity vibrations.
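For sampled data, the peak definition in equation (1) reduces to the largest absolute sample value. The following minimal sketch uses two short hypothetical sample sequences, one of which contains a single impact, to show how the peak feature responds to it. The values are invented for illustration only.

```python
def peak(signal):
    """Peak feature: maximum absolute value of the sampled signal v(t)."""
    return max(abs(v) for v in signal)


# Hypothetical vibration samples (arbitrary units).
healthy = [0.10, -0.20, 0.15, -0.10, 0.20, -0.15]
faulty = [0.10, -0.20, 0.15, -1.30, 0.20, -0.15]   # one impact at index 3

print(peak(healthy), peak(faulty))
```

Taking the absolute value matters because an impact can drive the signal in either direction; here the largest excursion in the faulty sequence is negative.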

RMS.
The RMS value represents the power content of the vibration and is useful in detecting imbalance in rotating machinery. According to Vishwakarma et al. [59], it is the simplest effective technique to detect faults, especially imbalance, in rotating machines. However, detecting faults at an early stage is still a problem with this method, and the technique is only appropriate for the analysis of a single sinusoid waveform.
The RMS value is more suitable for steady-state applications and the analysis of a single sinusoid waveform [1]. RMS is preferred over the peak value because of the peak value's sensitivity to noise. The RMS value of a pure sinusoid is equal to 1/√2 ≈ 0.707 times its peak amplitude. The RMS value can be represented by

$$\text{RMS} = \sqrt{\frac{1}{T}\int_{0}^{T} v(t)^{2}\,dt},$$

where T represents the time duration and v(t) is the signal. According to Igba et al. [39], the RMS method has two disadvantages. The first is that the RMS value of a vibration signal is not affected by isolated peaks in the signal, which reduces its sensitivity towards incipient gear tooth failure. The second is that it is not significantly affected by short bursts of low-intensity vibration, which complicates the detection of the early stages of bearing failure. Bartelmus et al. [60] applied RMS values as a diagnostic feature to diagnose gearbox faults, presenting models of gearbox behavior that correlate the transmission error function and load variation. Sheldon et al. [61] used the RMS feature in diagnosing a wind turbine gearbox and stated that the RMS feature is not recommended for detecting the early stages of bearing failure. RMS was among the statistical parameters applied by Krishnakumari et al. [62] in fault diagnostics of a spur gear. The parameters were then combined with fuzzy logic, and the diagnostic accuracy was found to be 95%, where a decision tree (DT) reduces the demand for human expertise. Other applications of RMS values in vibration analysis for machine monitoring can be seen in Table 6 [63, 64].
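Two of the statements above can be checked numerically: that the RMS of a pure sinusoid is about 0.707 of its amplitude, and that an isolated peak barely changes the RMS, as reported by Igba et al. [39]. The sketch below is an illustration under stated assumptions, using the discrete form of the RMS definition (square root of the mean of the squared samples) and a hypothetical sine of amplitude 2 with one added spike.

```python
import math


def rms(signal):
    """Discrete RMS: square root of the mean of the squared samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


N = 1000
AMPLITUDE = 2.0      # hypothetical sine amplitude (arbitrary units)

# Five full cycles of a pure sinusoid.
sine = [AMPLITUDE * math.sin(2 * math.pi * 5 * k / N) for k in range(N)]
rms_sine = rms(sine)                      # close to AMPLITUDE / sqrt(2)

# Same signal with one isolated spike added at an arbitrary sample.
spiky = list(sine)
spiky[123] += 10.0
rms_spiky = rms(spiky)
peak_spiky = max(abs(v) for v in spiky)

print(rms_sine / AMPLITUDE, rms_spiky / rms_sine, peak_spiky / AMPLITUDE)
```

In this example the spike raises the peak value by more than a factor of four, while the RMS changes by only a few percent, which is the insensitivity to isolated peaks described in the text.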

Crest Factor.
The crest factor is the ratio of the peak value of the input signal to its RMS value and is represented as follows [55]:

$$\text{crest factor} = \frac{\text{peak}}{\text{RMS}}.$$

For a pure sine wave, the crest factor is √2 ≈ 1.414, and for normally distributed random noise, the value is approximately 3. Compared to peak and RMS values, the crest factor is usually used when measurements are conducted at different rotational speeds because it is independent of speed. The crest factor is also reliable only in the presence of significant impulsiveness [1]. Jiang et al. [65] employed crest factor features and SVM to diagnose gear faults. It was found that the crest factor is the most sensitive feature for gear failure, and by applying this feature, the achieved diagnostic accuracy is 93.33%. Shrivastava and Wadhwani [58] applied crest factor values along with other time domain features for the fault detection and diagnosis of rotating electrical machines. They found that the crest factor feature is unable to distinguish between a healthy bearing, a bearing with a defective ball, and a bearing with a defective outer race. Aiswarya et al. [66] used the crest factor feature along with other time domain features to diagnose faults in the turbo pump of a liquid rocket engine. Combined with the SVM method in the fault classification stage, the proposed method can diagnose the fault effectively with 100% accuracy.
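The sine-wave value of about 1.414 quoted above, and the dependence of the crest factor on impulsiveness rather than on overall level, can be verified with a few lines. The sketch below is illustrative only: the signals are synthetic, and the spike amplitude is a hypothetical choice.

```python
import math


def crest_factor(signal):
    """Ratio of the peak (maximum absolute value) to the RMS of the samples."""
    peak = max(abs(v) for v in signal)
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    return peak / rms


N = 1000
sine = [math.sin(2 * math.pi * 5 * k / N) for k in range(N)]   # five full cycles

cf_sine = crest_factor(sine)                         # about sqrt(2)
cf_scaled = crest_factor([3.0 * v for v in sine])    # amplitude cancels out

# Hypothetical impulsive signal: the same sine with one large spike.
impulsive = list(sine)
impulsive[10] = 5.0
cf_impulsive = crest_factor(impulsive)

print(cf_sine, cf_scaled, cf_impulsive)
```

Scaling the whole signal leaves the crest factor unchanged because peak and RMS scale together, whereas a single impact raises the peak far more than the RMS and so increases the ratio.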

Kurtosis.
Kurtosis is a nondimensional statistical measure of the number of outliers in a distribution, and in vibration analysis, it corresponds to the number of transient peaks. A high number of transient peaks and a high kurtosis value may be indicative of wear. Kurtosis is not sensitive to running speed or load, and its effectiveness depends on the presence of significant impulsiveness in the signal [67]. The kurtosis feature provides information regarding the non-Gaussianity or impulsiveness of the vibration signals [68,69]. In machine condition monitoring applications, kurtosis is usually preferable to the crest factor, but the latter is more widely used. This is because meters that can record the crest factor value are easily available and more affordable than kurtosis meters. Kurtosis was among the five parameters applied by Fu et al. [70] in combination with an unsupervised AI method to diagnose rolling bearings. Based on the results, the proposed method was found to be sensitive in identifying faults, including slight faults. Runesson [67] employed kurtosis along with the RMS value to monitor the condition of a mechanical press. The results demonstrated that kurtosis is generally not reliable but contains some useful information for monitoring the gearbox of the mechanical press machine. Other research studies that applied the kurtosis approach in vibration analysis for machine monitoring can be seen in Table 6 [63, 71].
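The text does not give a formula for kurtosis, so the following sketch assumes the common definition: the fourth central moment divided by the square of the variance. Under that definition, a pure sinusoid gives 1.5 and Gaussian noise gives approximately 3, while added impacts push the value higher. The signals, the random seed, and the impact amplitude are hypothetical choices for illustration.

```python
import math
import random


def kurtosis(signal):
    """Fourth central moment divided by the squared variance (assumed definition)."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((v - mean) ** 2 for v in signal) / n
    m4 = sum((v - mean) ** 4 for v in signal) / n
    return m4 / (m2 * m2)


# Pure sinusoid over five full cycles.
sine = [math.sin(2 * math.pi * 5 * k / 1000) for k in range(1000)]

# Synthetic Gaussian noise standing in for a healthy broadband signal.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Same noise with a hypothetical impact added every 500 samples.
impulsive = list(noise)
for k in range(0, len(impulsive), 500):
    impulsive[k] += 8.0

k_sine = kurtosis(sine)
k_noise = kurtosis(noise)
k_impulsive = kurtosis(impulsive)
print(k_sine, k_noise, k_impulsive)
```

Only 40 of the 20000 samples are modified, yet the kurtosis rises well above the Gaussian value, which illustrates why the feature is useful only when the signal contains significant impulsiveness.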

Frequency Domain Analysis
Most real-world signals can be broken down into a combination of unique sine waves. Each sine wave appears as a vertical line in the frequency domain, where the height and position of the line represent the amplitude and frequency, respectively. In frequency domain analysis, the amplitude is plotted against frequency, and compared to the time domain, the detection of the resonant frequency component is easier.
This is one of the reasons why frequency domain methods are favorable in detecting faults in the machine [59]. Several characteristics of the signal that are not visible in the time domain can be observed using frequency domain analysis. However, frequency analysis is not suitable for signals whose frequencies vary over time. The advantages and disadvantages of each frequency domain method can be seen in Table 7.

FFT.
Fourier transform (FT) converts a signal f(t) in the time domain to the frequency domain, generating the spectrum F(ω). FT is given by

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt$$

where ω is the frequency and t is the time. It can be converted back to the time domain from the frequency domain by the inverse Fourier transform (IFT). This can be obtained as

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega$$

FFT is an efficient and widely used algorithm to obtain the FT of discretized time signals. The FFT plot of fault-free industrial machines consists of only one peak, which represents the natural frequency of the operating machine. Thus, a defect in the machine can be identified when other peaks are present in the plot aside from the natural frequency peak. However, Goyal and Pabla [37] claimed that during the conversion between domains, there is a little loss of time information. FFT is also unable to investigate transient features efficiently in time, and it can predict the fault but cannot determine the severity of the fault [3]. Nevertheless, it is the quickest way to separate the frequencies of the signal for the diagnosis process. A combination of time domain signal analysis and FFT is normally used in diagnosing low-speed machines to produce more accurate results, but the major concern is its reliance on the magnitude of the fault having an effect on the carrier frequency [11]. Saucedo-Dorantes et al. [8] used the FFT and PSD methods to diagnose the fault in a gearbox and detect the bearing defect in the induction motor. They found that the proposed method can perfectly detect the presence of wear at low operating frequencies but is not suitable at high operating frequencies.
Patel et al. [72] proposed the FFT method as an analysis tool in monitoring a rotating machine, and major faults such as misalignment and bearing faults can be monitored in this manner. Other applications of FFT can also be seen in Table 6 [23,73].
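The idea of reading a fault from additional spectral peaks can be sketched as below. To stay dependency-free, the sketch evaluates the discrete Fourier transform directly; an FFT routine returns the same coefficients far more quickly. The sampling rate, the 25 Hz main component, the 60 Hz "fault" component, and the 0.1 detection threshold are all hypothetical choices.

```python
import cmath
import math

def amplitude_spectrum(signal, fs):
    """One-sided amplitude spectrum from a direct DFT (O(N^2))."""
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        # Interior bins are doubled to fold in the negative frequencies.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * fs / n)
        amps.append(scale * abs(coeff) / n)
    return freqs, amps

fs = 256  # sampling rate in Hz (assumed)
n = 256   # one second of data, so the bins are 1 Hz apart

# Hypothetical signal: a dominant 25 Hz component plus a weaker 60 Hz one.
signal = [1.0 * math.sin(2 * math.pi * 25 * t / fs)
          + 0.3 * math.sin(2 * math.pi * 60 * t / fs)
          for t in range(n)]

freqs, amps = amplitude_spectrum(signal, fs)

# Frequencies whose amplitude exceeds an assumed detection threshold.
peaks = [f for f, a in zip(freqs, amps) if a > 0.1]
print(peaks)
```

Both components fall exactly on a frequency bin here; with real measurements, spectral leakage and speed fluctuations spread each peak over neighboring bins, so windowing and averaging are normally needed.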

Cepstrum Analysis.
Cepstrum analysis was developed in the 1960s and can be defined as the power spectrum of the logarithm of the power spectrum [74]. Cepstrum analysis can be used to detect any periodic structure in the spectrum such as harmonics, sidebands, or echoes [66]. This allows faults such as bearing and localized tooth faults, which produce low-level harmonically related frequencies, to be detected. There are four types of cepstrum, which are real cepstrum, complex cepstrum, power cepstrum, and phase cepstrum, but power cepstrum is the most widely used cepstrum in machine diagnosis and monitoring. According to Goyal and Pabla [45], cepstrum analysis is important in gearbox diagnosis. Dalpiaz et al. [75] compared the cepstrum analysis with other methods in monitoring the condition of a gearbox and found that cepstrum analysis is insensitive to gear cracks. To detect the rubbing phenomenon of the sliding bearing in the machine, the cepstrum analysis method was applied by Sako et al. [76]. It was found that the proposed approach can even detect mild rubbing, which is difficult to achieve by using conventional abnormality diagnostic methods. Experimental work was conducted by Aralikatti et al. [77] to diagnose the universal lathe machine, where cepstrum analysis was applied to the time domain signal. The vibration signal was obtained by a triaxial accelerometer. It was concluded that analyzing the vibration signal in the frequency domain does not guarantee detection of the presence of a fault. Other applications of cepstrum analysis can be discovered in Table 6 [78, 79].

Table 6 (continued).
Reference | Approach | Findings
 | Used two cepstrum analysis approaches, namely, automated cepstrum editing procedure (ACEP) and cepstrum prewhitening (CPW), to detect bearing fault | CPW approach is more suitable for applications that do not require bandpass filtering, but applying both approaches without prior knowledge can lead to false results in detecting bearing faults
[81] | Applied the envelope analysis to diagnose faults in rotating machines under variable speed conditions | Squared envelope method is an optimal approach in fault diagnosis in terms of computational cost and simplicity compared to the improved synchronous average (ISA), the cepstrum prewhitening (CPW), and the generalized synchronous average (GSA)
[86] | Used the envelope analysis to diagnose bearing faults | The squared envelope method is more suitable to analyze the cyclostationary signals compared to the envelope method
[90] | Combined the higher-order spectrum analysis and SVM to diagnose faults in power electronic circuit | The proposed method achieved an accuracy of up to 99%
[91] | Applied the power spectrum analysis (PSA) and SVM to diagnose rolling bearing fault | Using the PSA with SVM classifier gives better results compared to NN classifier
[100] | Applied the WPT method to monitor the condition of the machine | Using the proposed method as input to a NN classifier produced nearly 100% classification accuracy. Also, the proposed method produced a better result compared to FFT when the data are corrupted by noise
[102] | Combined the Hilbert transform and WPT to detect gearbox fault | The proposed method is capable of detecting early gear fault
[101] | Applied a denoising method based on WT to diagnose rolling bearing and gearbox | The proposed method is more effective and has more advantages compared to Donoho's soft-thresholding denoising method
[103] | Proposed an orthonormal DWT (ODWT) method to monitor and diagnose bearing faults at an early stage | The proposed method outperforms the EEMD and Hilbert envelope spectrum analysis method
[115] | Applied the HHT and FT methods to diagnose machine fault | HHT outperforms FT, where FT can only differentiate characteristic frequency in the low-frequency band
[116] | Proposed a new local mean parameter to improve the HHT method to detect gearbox fault | Introducing the new parameter improves the HHT process and makes the fault detection process simpler
[120] | Applied the STFT method to diagnose faults in a hydroelectric machine | The basis for the effective fault diagnosis of hydroelectric machines was proposed
[146] | Applied the combination of EMD and NN methods to diagnose the roller bearing fault | The proposed method can successfully diagnose roller bearing fault but has an end effect complication
[147] | Combined the WPT, GA, NN, and SVM methods to diagnose fault in diesel engine | The proposed method produced 100% classification accuracy
[148] | Proposed a CNN approach that makes use of cyclic spectrum maps (CSMs) of raw vibration signal to diagnose the motor bearing in the rotating machine | Based on the validation with benchmark vibration data collected from bearing tests, the proposed technique is superior to its referenced methods in terms of the classification accuracy
[149] | Proposed a distribution-invariant deep belief network (DIDBN) as a basis for intelligent fault diagnosis of machines | The proposed method is able to achieve a high diagnosis accuracy even with new working conditions
[150] | Used the CNN method with 1D image of raw three-axis accelerometer signal as the input | It was found that CNN trained with a higher number of kernels in the first layer produced slightly better performance
[151] | Proposed a hybrid deep signal processing method to diagnose bearing faults in the machine, where the signal processing, feature extraction, and bearing fault diagnosis were automatically conducted | The proposed method is superior to the manual extraction methods and commonly used deep learning structures in terms of accuracy, and it is not affected by the operation conditions
[152] | Presented the augmented deep sparse autoencoder (ADSAE) method in diagnosing gear faults, where data shifting technique was incorporated to enhance the SAE model | Compared with other deep learning architectures, the proposed method provides a higher accuracy (99%) and only requires a few raw vibration signal data
[62] | Combined the decision tree and fuzzy logic methods to diagnose spur gear fault based on the statistical features such as RMS, crest factor, and kurtosis | The performance of the proposed method in diagnosing fault was found to be 95%
[160] | Proposed the fuzzy logic method to diagnose the operation of rotating machines | The proposed method can easily diagnose the operational status of the rotating system
[161] | Applied the fuzzy logic method to monitor and diagnose the condition of the pump | The proposed method can successfully identify and classify the faults of the five-plunger pump
[162] | Proposed the combination of DWT and fuzzy logic to predict the presence of misalignment in rotating machinery | The proposed approach has an error of less than 1% in predicting the degree of misalignment
[163] | Developed a gas turbine vibration monitoring approach based on Takagi-Sugeno fuzzy logic | Experts' knowledge regarding the maintenance of the gas turbine in accordance with the vibration level detected can be successfully expressed by the proposed method
[168] | Proposed a combination of wavelet support vector machine (WSVM) and immune genetic algorithm (IGA) to diagnose gearbox fault | The proposed method yielded a better diagnostic accuracy compared to the SVM and NN method, in addition to strong generalization capability
[169] | Combined the GA, SVM, and EEMD methods to diagnose gear faults | Incorporating the GA to select the parameter of SVM can improve the generalization ability and classification accuracy of the diagnostic system
[170] | Applied the combination of GA and SVM in bearing fault diagnosis | Applying the cross-validation method to optimize SVM outperforms the SVM method optimized by GA in bearing fault diagnosis

Advantages and disadvantages (continued).
Method | Advantages | Disadvantages
 | High performance in detecting periodic impulse force; highly sensitive to shock; can be assimilated with a shape factor; independent of the signal amplitude | Costly kurtosis meter; can be erroneous
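A minimal sketch of the power cepstrum, following the definition given in this section (the power spectrum of the logarithm of the power spectrum), is shown below. It uses a direct DFT to stay dependency-free, and the test signal is hypothetical: a decaying transient plus a half-amplitude echo delayed by 20 samples. The echo adds a periodic ripple to the log spectrum, which the cepstrum turns into a single peak at the delay.

```python
import cmath
import math

def power_spectrum(values):
    """Squared magnitude of the direct DFT of a real sequence."""
    n = len(values)
    return [abs(sum(values[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]

def power_cepstrum(signal):
    """Power spectrum of the logarithm of the power spectrum."""
    log_power = [math.log(p) for p in power_spectrum(signal)]
    return power_spectrum(log_power)

n = 128
delay = 20  # echo delay in samples (hypothetical)

# A decaying transient plus a half-amplitude, circularly delayed echo.
transient = [0.5 ** t for t in range(n)]
signal = [transient[t] + 0.5 * transient[(t - delay) % n] for t in range(n)]

cepstrum = power_cepstrum(signal)

# Skip the lowest quefrencies, which hold the smooth spectral envelope.
peak_quefrency = max(range(5, n // 2), key=lambda q: cepstrum[q])
print(peak_quefrency)
```

The same mechanism applies to families of harmonics or sidebands: any regular spacing in the spectrum collapses to a peak at the corresponding quefrency, which is what makes the method useful for gearbox and bearing signatures.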

Envelope Analysis.
Envelope analysis, also known as amplitude demodulation or demodulated resonance analysis, was introduced by Mechanical Technology Inc. [80]. This technique separates the low-frequency signal from background noise [59]. Envelope analysis is made up of a bandpass filtering step and a demodulation step that extracts the signal envelope, whose spectrum possibly contains the desired diagnostic information [81]. It is widely used in rolling element bearing and low-speed machine diagnosis and has the advantage of early detection of bearing problems [55,81]. The challenge of this approach is determining the best frequency band to envelope. Envelope analysis needs a sharp filter and precise specification of the frequency band for filtering in order to work smoothly [55]. Regarding bearing failure, noise components make it difficult for envelope analysis to determine the fault. The introduction of the squared envelope analysis method has solved this problem, where a squared envelope can be computed as shown in [82]. It is highly preferable in analyzing cyclostationary signals. Rubini and Meneghetti [83] compared the envelope analysis with the wavelet transform (WT) method in diagnosing the incipient faults in ball bearings. The results demonstrated that after 30 min (48000 cycles), the envelope analysis is no longer able to diagnose the presence of the fault, whereas the WT method is still effective. An envelope analysis approach has also been applied by Widodo et al. [84] to preprocess the vibration signals of a low-speed bearing and thus determine the bearing characteristic frequencies. This method was then compared with the acoustic emission (AE) signal analysis, and during the fault recognition stage with the SVM technique, it produced worse performance than the AE approach. Envelope analysis was also applied by Leite et al. [85] to detect the bearing fault in an induction motor, and the proposed method can efficiently detect the fault without any information regarding the model.
Application of envelope analysis in machine monitoring can also be seen in Table 6 [81,86].
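The bandpass filtering and demodulation steps described in this section can be sketched as follows. The sketch is only one possible realization: it applies an ideal bandpass filter in the frequency domain, builds the analytic signal by discarding negative frequencies, and takes the squared magnitude as the squared envelope (a common definition, assumed here rather than taken from [82]). The 80 Hz resonance, the 8 Hz modulation standing in for a fault frequency, the 10 Hz interference, and the 60 to 100 Hz band are hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Direct DFT or inverse DFT (O(N^2)); an FFT gives the same result."""
    n = len(values)
    sign = 1j if inverse else -1j
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2 * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def squared_envelope(signal, fs, band):
    """Bandpass filter, form the analytic signal, return |analytic|^2."""
    n = len(signal)
    spectrum = dft(signal)
    low, high = band
    analytic = [0j] * n
    for k in range(1, n // 2):
        if low <= k * fs / n <= high:
            analytic[k] = 2 * spectrum[k]  # keep positive frequencies only
    return [abs(v) ** 2 for v in dft(analytic, inverse=True)]

def dominant_frequency(values, fs):
    """Frequency of the largest spectral line after removing the mean."""
    n = len(values)
    mean = sum(values) / n
    spectrum = dft([v - mean for v in values])
    k_peak = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))
    return k_peak * fs / n

fs = 256
n = 256
signal = [(1.0 + 0.8 * math.cos(2 * math.pi * 8 * t / fs))
          * math.cos(2 * math.pi * 80 * t / fs)
          + 2.0 * math.sin(2 * math.pi * 10 * t / fs)
          for t in range(n)]

envelope_sq = squared_envelope(signal, fs, band=(60.0, 100.0))
fault_frequency = dominant_frequency(envelope_sq, fs)
print(fault_frequency)
```

In the raw spectrum the 8 Hz modulation appears only as sidebands around 80 Hz, while the strong 10 Hz interference dominates the low-frequency range. After bandpass filtering and demodulation, the envelope spectrum shows the modulation frequency directly, which illustrates why the choice of band is the critical step.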

Spectrum Analysis/Comparison.
Spectrum analysis is related to FFT in that FFT is often used in spectrum analysis to transform the signal from the time domain to the frequency domain [87]. Spectrum comparison should be conducted on a logarithmic amplitude scale (dB), as the changes on a logarithmic axis can determine the state of the vibration. However, one has to deal with the small fluctuations of the rotating speed of the machine [55]. A fault that is able to change the vibration signature significantly over a short period of time can be determined by this method [88]. Spectrum analysis is complex, and even with the amount of literature available, expert skills are still required to exploit its diagnostic capabilities. Compared to the cepstrum analysis, spectrum analysis does not provide any information regarding the time localization of frequency components [77]. Salami et al. [38] employed the spectrum analysis method for machine condition monitoring and observed that this approach can produce smoothed and high-resolution spectral estimates of the vibration signals compared to the FFT approach. These spectral estimates are useful for monitoring the state of machines. Ciabattoni et al. [89] proposed a novel statistical spectrum analysis (SSA) in which the spectral content of vibration signals was calculated using FFT and then transformed into statistical spectral images for diagnosing faults of rotating machines. Other applications of spectrum analysis can be observed in Table 6 [90,91].
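Spectrum comparison on a logarithmic scale can be sketched as follows. The amplitude values and the 6 dB alarm threshold are hypothetical; in practice, the baseline and current spectra would come from measurements at comparable speed and load.

```python
import math

def to_db(amplitude, reference=1.0, floor=1e-12):
    """Amplitude in dB relative to a reference, with a floor to avoid log(0)."""
    return 20.0 * math.log10(max(amplitude, floor) / reference)

def compare_spectra(baseline, current, threshold_db=6.0):
    """Return (bin index, change in dB) for bins that changed by the threshold or more."""
    changes = []
    for k, (a0, a1) in enumerate(zip(baseline, current)):
        delta = to_db(a1) - to_db(a0)
        if abs(delta) >= threshold_db:
            changes.append((k, round(delta, 1)))
    return changes

# Hypothetical amplitude spectra (same frequency bins, arbitrary units).
baseline = [0.01, 1.00, 0.02, 0.10, 0.02]
current = [0.01, 1.05, 0.02, 0.40, 0.02]

flagged = compare_spectra(baseline, current)
print(flagged)
```

On a linear axis, the growth of the small component in bin 3 is easily hidden by the dominant component in bin 1. On the dB scale, a fourfold increase stands out regardless of the absolute level.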

Time-Frequency Domain Analysis
Time and frequency domains are integrated into the time-frequency domain analysis. This means that the signal frequency components and their time-variant features can be determined simultaneously in this analysis. The vibration analysis approaches mentioned before (time domain and frequency domain methods) mostly rely on the stationary assumption and are therefore unable to detect the local features in the time and frequency domains simultaneously [92]. Thus, such methods are inappropriate for nonstationary signal analysis. As mentioned before, the time-frequency domain analysis methods discussed in this study include WT, HHT, WVD, STFT, and PSD. The advantages and disadvantages of each time-frequency domain method can be seen in Table 8.

WT.
The WT technique was first proposed by Morlet in 1974 [93]. It is a linear transformation that decomposes a time signal into wavelets, which are local functions of time equipped with predetermined frequency content. Instead of sinusoidal functions, wavelets are used as the basis [92,94]. A suitable wavelet basis has to be chosen according to the signal structure to avoid misleading diagnosis results. WT provides a superior time localization at high frequencies compared to STFT. WT is preferable when dealing with nonstationary signals and in analyzing the transient signal from the measured vibration signal [95]. According to Zou and Chen [96], the WT technique is highly sensitive to stiffness variation compared to WVD. The WT method can be divided into the discrete and continuous wavelet transforms (DWT and CWT). In DWT, the power of two acts as a scaling factor, and it is typically applied through a pair of lowpass and highpass wavelet filters. For the CWT, the scaling factor is selected arbitrarily or via convolution [45]. Both DWT and CWT are referred to as standard WT, which cannot efficiently perform feature extraction of certain types of signals due to its incapability to produce a sparse representation. It has a low frequency resolution for high-frequency components and poor time localization for low-frequency components. This gave rise to the wavelet packet transform (WPT) technique. It is a more advanced form of the DWT that further decomposes the detailed information of the signal in the high-frequency region and improves the frequency resolution, making it applicable for the analysis of various nonstationary signals. Dalpiaz and Rivola [94] applied the WT method in monitoring the condition of an automatic packaging machine and found that WT is able to determine the variation of vibration frequency content within the machine cycle. The advantage of CWT is that it has a finer scale parameter compared to the DWT method, but in terms of computation, DWT is more efficient [97]. Al-Badour et al. [98] employed the CWT and WPT in detecting faults in rotating machinery. WPT is an extension of the DWT method but with finer frequency resolution. They found that in terms of speed and spectral characterization of the vibration signal, the WPT method is better than the CWT method. Rangel-Magdaleno et al. [99] applied the DWT method to detect the incipient broken bar in the induction motor, and the detection accuracies achieved are 96.55% for the unloaded condition, 80.5% for the half-load condition, and 87.6% for the full-load condition. Other applications of the WT technique can be found in Table 6 [100-103].
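The filter-pair view of the DWT can be sketched with the Haar wavelet, the simplest choice; the studies reviewed above generally use other wavelet families. Each level applies a lowpass (sum) and a highpass (difference) filter followed by downsampling by two, so the scale doubles per level. The test signal, one slow sine cycle with a spike at sample 37, is hypothetical.

```python
import math

def haar_step(signal):
    """One DWT level: lowpass/highpass Haar filters, then downsample by 2."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [s * (signal[i] + signal[i + 1]) for i in pairs]
    detail = [s * (signal[i] - signal[i + 1]) for i in pairs]
    return approx, detail

def haar_wavedec(signal, levels):
    """Multilevel decomposition: final approximation plus details per level."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Hypothetical signal: one slow sine cycle with a short transient.
n = 64
signal = [math.sin(2 * math.pi * t / n) for t in range(n)]
signal[37] += 2.0

approx, details = haar_wavedec(signal, levels=3)

# The transient shows up in the finest detail band, localized in time.
finest = details[0]
spike_index = max(range(len(finest)), key=lambda i: abs(finest[i]))
print(spike_index, len(approx), [len(d) for d in details])
```

Because the Haar filters are orthonormal, the total energy of the coefficients equals the energy of the signal. The transient stays confined to a few coefficients, which is the time localization that a single Fourier spectrum cannot provide.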

WVD.
Wigner introduced the WVD method and Ville applied it to process the signal, and thus, it was named the Wigner-Ville distribution. It is a specific case of the Cohen class distributions, which yields a time-frequency energy density computed by correlating the signal with a time and frequency translation of itself [104]. The WVD of a signal x(t) is represented as

$$W_x(t, \omega) = \int_{-\infty}^{\infty} x\left(t + \frac{\tau}{2}\right) x^{*}\left(t - \frac{\tau}{2}\right) e^{-j\omega\tau}\, d\tau$$

where $x^{*}$ is the conjugate of x and τ is the delay variable. WVD has several advantages, such as better resolution than STFT, excellent accuracy, and no need for a window function in its analysis [45]. WVD is not directly used by researchers to determine the time-frequency structures of signals because of the cross-term interference problem [93]. Staszewski et al. [105] performed fault detection analysis on a gearbox using the original WVD method and its weighted form, and they claimed that compared to the original WVD, its weighted form can reduce interference in the time-frequency domain at the cost of a reduction in the frequency resolution. Directional Wigner distribution (dWD) was specifically developed for transient complex-valued signal analysis and applied in rotating or reciprocating machines in [106]. Baydar and Ball [107] used the smooth pseudo WVD technique on acoustic and vibration signals to diagnose the gearbox condition, and the results showed that acoustic signals were more effective in early fault detection compared to vibration signals. An application of the WVD method in diagnosing induction machines was demonstrated by Climente-Alarcon et al. [104]. By using this proposed technique, more reliable diagnosis results might be obtained in a situation where harmonics tracing is difficult.

HHT.
David Hilbert first introduced the Hilbert transform in 1905. Then, Huang et al. [108] introduced the HHT in 1998 to determine the characteristics of stationary, nonstationary, and transient signals. HHT consists of empirical mode decomposition (EMD) of signals and the Hilbert transform, and by combining these two methods, a Hilbert spectrum can be obtained, from which the faults in a running machine can be diagnosed [92]. Thus, in this manuscript, any works regarding the EMD method fall into the HHT section. A complicated multicomponent signal can be broken down into a series of intrinsic mode functions (IMFs) using this method. By using the EMD technique, a complicated signal x(t) can be reconstructed with the help of IMFs, expressed as

$$x(t) = \sum_{i=1}^{n} c_i(t) + x_n(t)$$

where $c_i(t)$ is the $i$th IMF and $x_n(t)$ is the residual signal that represents the slowly varying or constant trend of the signal [92]. HHT has several advantages, such as low computational time, and it does not involve any convolution [45]. However, EMD, which is the main part of HHT, has certain drawbacks. The result might be misinterpreted due to undesirable IMFs generated in the low-frequency region, and in addition, the signals of the low-frequency components cannot be separated. Thus, the ensemble EMD (EEMD) method was introduced to overcome the limitation of EMD by introducing Gaussian white noise to the EMD [92,109]. However, in the low-frequency region, the energy leakage and modal aliasing problems still exist. This is what motivated Torres et al. [110] to propose the complete EEMD with adaptive noise (CEEMDAN) method, which can produce better modal frequency spectrum separation outputs. Peng et al. [111] introduced a better version of HHT that incorporated the WPT technique to decompose the vibration signal into a set of narrowband signals. The proposed method was shown to have a better resolution in the time and frequency domains compared to the wavelet-based scalogram. Wu et al. [112] employed the HHT approach in diagnosing the looseness faults of rotating machinery, and the proposed technique is successful in determining the faults at different components of the machine. Osman and Wang [113] proposed a normalized HHT (NHHT) technique to tackle the problem of selecting the appropriate distinctive IMF components, especially for bearing health condition monitoring. However, the EMD of the proposed method only processes signals over narrow bands. Chen et al. [114] proposed a combination of CEEMDAN and particle swarm optimization least squares support vector machine (PSO-LSSVM) to improve the diagnosis accuracy of rolling bearings. The diagnosis accuracy of several rolling bearing fault types was improved by this method to 100%. Other applications of HHT in machine monitoring and diagnosis can be observed in Table 6 [115,116].

Table 7 (continued).
Method | Advantages | Disadvantages
 |  | Can only be applied to well-separated harmonics; the fluctuations of the curve of the spectrum are averaged out due to the filtering
Envelope analysis | Excellent application in bearing systems; works well even in the presence of a small random fluctuation | Can lead to a gross diagnosis error; not suitable to be applied in gear systems
Spectrum analysis | Useful in detecting signals that change significantly over a short period of time; higher spectral estimation performance compared to the FFT | Requires experts' skills due to its complexity

9.4. STFT.
STFT was pioneered by Gabor in 1946 in the communication field [117]. It has the ability to counter the limitations of FFT and is mostly applied to extract the narrowband frequency content in nonstationary or noisy signals [37]. In the STFT method, the initial vibration signal is broken down into time segments by windowing, and then FT is applied to each time segment [45]. The mathematical equation for STFT is given by

$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$$

where x(t) is the signal being analyzed and $w(t - \tau)$ is the window function centered at time τ. STFT depends on the width of the window.
A large window width is chosen to obtain greater accuracy in frequency, whereas to improve the accuracy in time, a small window width is desired. The main drawback of this approach is that it cannot achieve high resolution in the time and frequency domains simultaneously. Safizadeh et al. [118] proposed the STFT application in machinery diagnosis and proved that although STFT provides the time-frequency information with limited precision, it is better than the conventional methods of machine diagnosis. To avoid cross-term effects, Burriel-Valencia et al. [119] implemented the STFT method for fault diagnosis of induction machines, where the spectra in the relevant frequency band are filtered. The proposed method greatly reduced the computing time and memory resources. The STFT method has also been applied in fault diagnosis of a hydroelectric machine [120], an induction motor [121], and a rolling element bearing [122] (refer to Table 6).

9.5. PSD.
PSD can be applied to measure the amplitude of oscillatory signals in the time series data and determine the energy strength of frequencies, which may be useful for further analysis. From the complex spectrum, the one-sided PSD can be computed in (m/s²)²/Hz as

$$\mathrm{PSD}(f) = \frac{2\,|X(f)|^{2}}{t_2 - t_1}$$

where $t_2 - t_1$ is the time range and X(f) is the complex spectrum of the vibration x(t) in that time range, which can be expressed in units of (m/s²)/Hz. PSD can also be directly calculated in the frequency domain if the FFT of the vibration signal is used, by applying the following formula [123]:

$$\mathrm{PSD} = \frac{G_{\mathrm{rms}}^{2}}{f}$$

where $G_{\mathrm{rms}}$ is the root-mean-square of acceleration in a certain frequency f. PSD can analyze the faulty frequency bands without facing a slip variation issue and does not necessarily focus on one specific harmonic [124]. It requires very little processing power and can be directly computed by FFT or by converting the autocorrelation function [45]. The PSD technique was used by Cusido et al. [124] alongside WT to diagnose faults in induction machines, and the proposed technique can successfully diagnose faults for every operating point of the induction motor. However, to improve the diagnosis accuracy, good knowledge for determining the suitable mother wavelet and sampling frequencies is still required. Mollazade et al. [123] also used the PSD values in the feature extraction stage and fuzzy logic in the fault recognition stage of fault diagnosis of hydraulic pumps. The classification accuracy of the proposed technique for the 1000, 1500, and 2000 rpm conditions is 96.42%, 100%, and 96.42%, respectively. Other applications of PSD in machine monitoring and diagnosis can be seen in Table 6 [125,126].
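The one-sided PSD computed from the complex spectrum over a time range can be sketched as below. The sketch assumes the common discretization in which X(f) is approximated by the DFT multiplied by the sampling interval and in which the interior bins carry a factor of two. The sampling rate and the 10 Hz test acceleration with an amplitude of 2 m/s² are hypothetical.

```python
import cmath
import math

def one_sided_psd(signal, fs):
    """One-sided PSD, 2|X(f)|^2 / (t2 - t1), from a direct DFT."""
    n = len(signal)
    duration = n / fs   # the time range t2 - t1 in seconds
    dt = 1.0 / fs       # sampling interval
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        # Approximate the continuous spectrum X(f) by dt times the DFT.
        xk = dt * sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                      for t in range(n))
        # DC and Nyquist bins have no negative-frequency partner.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * fs / n)
        psd.append(scale * abs(xk) ** 2 / duration)
    return freqs, psd

fs = 128  # sampling rate in Hz (assumed)
n = 128   # one second of data
accel = [2.0 * math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]

freqs, psd = one_sided_psd(accel, fs)

peak_frequency = freqs[max(range(len(psd)), key=lambda k: psd[k])]
band_power = sum(psd) * (fs / n)           # integrate the PSD over frequency
mean_square = sum(a * a for a in accel) / n
print(peak_frequency, band_power, mean_square)
```

Integrating the PSD over frequency recovers the mean square of the signal, which is a convenient check that the scaling is consistent with the stated units of (m/s²)²/Hz.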

Fault Recognition/AI-Based Technique (RQ 3)
The application of AI in vibration analysis for machine monitoring and diagnosis has become increasingly popular, and based on this review, AI-based techniques contribute about 57% of the overall vibration analysis methods in machine diagnosis and monitoring, as shown in Figure 5. This is because most of the techniques mentioned before require considerable expertise for successful implementation, which makes them unsuitable for common users [80]. Furthermore, an expert is not always immediately available. This is where AI-based methods come in, because a nonexpert user can make reliable decisions without the presence of a machine diagnosis expert. AI can be defined as any task performed by a program or a machine that is difficult enough that it requires intelligence to accomplish it [127]. Several AI-based methods of vibration analysis for machine monitoring and diagnosis are SVM, NN, fuzzy logic, and GA.

SVM.
SVM was initially introduced by Vapnik and is the most widely used classification algorithm. This method transforms the data set or sample space to a high-dimensional, kernel-induced feature space by nonlinear transformation and then determines the best hyperplane [1]. The best hyperplane is the one with the largest margin between the two classes A and B, as shown in Figure 6. The data points from both classes that are closest to the hyperplane and influence its position and orientation are called support vectors. The learning and test data for the SVM are obtained from the feature extraction process, and after training the SVM algorithm, the SVM matrix is obtained [128]. Optimization methods such as GA and particle swarm optimization (PSO) are usually incorporated with SVM in order to achieve better results. One of the reasons why SVM is widely applied in vibration analysis for machine diagnosis is its compatibility with large and complex datasets, such as data collected in the manufacturing industry [129]. SVM is very useful as the number of features of classified entities will not affect the performance of SVM [130]. This means that for the base of the diagnosis system, there is no limit on the number of attributes that can be selected. SVM does not require experts' knowledge, unlike fuzzy logic, and no layers are involved in the SVM structure, unlike NN.
Poyhonen et al. [130] implemented the SVM technique to diagnose faults in an electrical machine, and the results showed that the classification accuracy was high, except for the detection of eccentric rotors. Tabrizi et al. [131] combined the SVM with the WPT (for signal preprocessing) and EEMD (for feature extraction) methods to detect small defects on roller bearings under different operating conditions. A classification tree kernel-based SVM has also been employed together with CWT to identify bearing faults, and this combination proved to be a promising method and superior to other SVM methods with common kernels in diagnosing rolling element bearing faults [132]. Pinheiro et al. [128] used the SVM method for fault diagnosis of a rotary machine and successfully detected several unbalance faults. However, its performance is still questionable as only a small number of samples were used. Further implementation of SVM in vibration analysis for machine diagnosis can be found in Table 6 [90, 133-135].
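The maximum-margin idea can be sketched with a linear soft-margin SVM trained by batch subgradient descent on the regularized hinge loss. This is a simplification: the kernel-induced feature space described above is replaced by a linear kernel, and the optimizer is not the one used in the cited studies. The two features (imagined as standardized RMS and kurtosis values), the data points, and the hyperparameters are hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the regularized hinge loss."""
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the margin (norm) term
        label_sum = 0                     # sum of labels over margin violators
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:              # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                label_sum += y
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b += lr * label_sum / n
    return w, b

def predict(w, b, x):
    """Class +1 (faulty) or -1 (healthy) from the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else -1

# Hypothetical standardized features: (RMS z-score, kurtosis z-score).
faulty = [(1.0, 1.2), (0.8, 0.9), (1.3, 0.7)]
healthy = [(-1.0, -1.2), (-0.8, -0.9), (-1.3, -0.7)]
samples = faulty + healthy
labels = [1] * len(faulty) + [-1] * len(healthy)

w, b = train_linear_svm(samples, labels)
train_predictions = [predict(w, b, x) for x in samples]
print(train_predictions, predict(w, b, (1.1, 1.0)), predict(w, b, (-0.9, -1.1)))
```

Only the samples that violate the margin contribute to each update, which mirrors the role of the support vectors: points far from the hyperplane do not influence its final position.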

NN.
NN is made up of a large number of richly interconnected artificial processing neurons called nodes, connected to each other in layers forming a network [136]. NN has the ability to model processes and systems from vibration data extracted by the frequency and time-frequency domain techniques mentioned before [137]. In the training stage of NN, dominant input variables might suppress the influence of the weak variables. Thus, the data must be properly processed and scaled before being fed into the NN. Normalizing the raw vibration data to values between 0 and 1 can help to reduce the dominance of such input variables [137]. Training time increases according to the complexity of the network, and this directly affects the accuracy of the results. Due to its robustness and efficiency in handling noisy data, the backpropagation neural network (BPNN) is widely used in machine diagnosis [138]. BPNN, pioneered by Rumelhart and McClelland in 1986, is made up of three layers, which are the input, hidden, and output layers [93]. The presence of hidden layers gives the NN an ability to explain nonlinear systems, and the higher the number of hidden layers, the deeper the NN [139]. Similar to the SVM method, NN does not need a knowledge base to detect the location of the faults, unlike the fuzzy logic method. Ertunc et al. [140] compared the performance of NN and the combination of NN and fuzzy logic, which is known as adaptive neurofuzzy inference systems (ANFIS). Envelope analysis was applied for the signal processing step. They concluded that the ANFIS method was superior to the NN, especially in diagnosing fault severity. Castelino et al. [137] applied NN in the vibration monitoring carried out on industrial rotary machines that were running in real operating conditions. The results showed that NN performed better for nonstationary signals in the time-frequency domain compared to the frequency domain. Recently, deep learning or the deep neural network (DNN) has been applied widely in machine monitoring and diagnosis. It is a type of NN that contains more than one hidden layer. Compared to the NN and SVM methods, the DNN approach can adaptively learn the hierarchical representation from raw data through multiple nonlinear transformations and approximate complex nonlinear functions instead of extracting the fault feature manually [141].
However, due to its deep architecture, a high number of parameters are involved, which leads to the risk of overfitting. Widely applied DNN architectures include autoencoders (AE), convolutional neural network (CNN), restricted Boltzmann machines, and deep belief networks. Table 9 shows the comparison between NN and deep learning/DNN in machine monitoring and diagnosis.
Hoang and Kang [142] applied the CNN technique, which is a class of DNN that is most commonly applied for image analysis, to diagnose the rolling element bearing in rotary machines. Raw vibration signals in the time domain are converted into a 2D form for the CNN to perform vibration image classification. This method achieved 100% accuracy using the bearing datasets from Case Western Reserve University, but the hyperparameters for the CNN model can still be improved. A combination of transfer learning and DNN has been employed by Qian et al. [143] in diagnosing rotating machines under different working conditions. Transfer learning focuses on how to store the knowledge or solution obtained when solving a problem and apply it to different but related problems so that the amount of data collection and training costs can be reduced [144]. The proposed method is robust, and there is no requirement for further training when supplied with new datasets from different working conditions. Similar applications can also be found in Table 6 [78, 145-152].
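Two preprocessing steps mentioned in this section, scaling the raw vibration data to values between 0 and 1 and arranging a 1D signal as a 2D array for an image-based network, can be sketched as follows. The row-by-row layout and the sample values are assumptions made for illustration; they are not claimed to be the exact procedure of the cited works.

```python
def min_max_scale(values):
    """Linearly map the values onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant signal carries no contrast
    return [(v - lo) / (hi - lo) for v in values]

def to_image(values, width):
    """Arrange a 1D sequence row by row into a 2D list of the given width."""
    rows = len(values) // width       # leftover samples at the end are dropped
    return [values[r * width:(r + 1) * width] for r in range(rows)]

# Hypothetical raw vibration samples (arbitrary units).
raw = [0.2, -1.5, 3.0, 0.7, -0.4, 1.1, -2.2, 0.0,
       0.9, -0.8, 2.4, -1.1, 0.3, 1.8, -0.6, 0.5]

scaled = min_max_scale(raw)
image = to_image(scaled, width=4)

for row in image:
    print([round(v, 2) for v in row])
```

Each entry of the resulting array plays the role of a pixel intensity. In practice, the scaling bounds would normally be fixed from the training data and reused for test data, so that both are mapped consistently.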

Fuzzy Logic.
Compared to conventional logic, fuzzy logic aims at modeling the imprecise modes of reasoning to make rational decisions in an environment of uncertainty and imprecision [153,154]. There are four main stages in a fuzzy logic system, namely fuzzification, an inference mechanism, a rule-base, and a defuzzification component. The fuzzification stage converts the input data into fuzzy sets before the fuzzy inference stage draws a reliable conclusion based on the rules created in the rule-base stage. Finally, the defuzzification stage produces quantifiable results. Fuzzy logic is associated with membership functions, whose role is to map the nonfuzzy input values to fuzzy linguistic terms and vice versa. To obtain a diagnostic system with excellent sensitivity, the rules and membership functions can be tuned [155]. However, properly determining the fuzzy rules and optimizing the membership functions is the biggest challenge in fuzzy logic. Fuzzy logic is easier to implement than SVM and NN. Apart from that, unlike other AI methods such as SVM and NN, it does not rely on datasets, as there is no training or testing stage in fuzzy logic. In certain cases, this method can only provide a general diagnosis, as the specific fault symptom of a machine cannot be regularly determined. However, it is the only available alternative when collecting the fault data is not possible [156]. Lasurt et al. [157] compared the performance of fuzzy logic with that of NN in the fault diagnosis of electrical machines, and they claimed that fuzzy logic performs better than NN in detecting different faults over a wide range of operating conditions. Wu and Hsu [158] combined DWT and fuzzy logic to detect gear faults, and the results showed that the recognition rate of the proposed method is over 96% under various experimental conditions. Mukane et al. [159] applied the FFT method for signal processing and fuzzy logic for fault recognition in identifying machinery faults. 
Multiple faults can be identified in this manner, including the severity of the faults. Other applications of fuzzy logic in vibration analysis for machine diagnosis can be found in Table 6 [160-163].
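The four stages described above can be sketched for a single input. Everything specific in this example is assumed: the input is taken to be an RMS velocity in mm/s, the breakpoints 1, 3, and 5 are hypothetical, and defuzzification uses a weighted average of singleton outputs, which is a simplification of centroid defuzzification.

```python
def falling(x, a, b):
    """Shoulder membership: 1 below a, 0 above b, linear in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Mirror image of falling(): 0 below a, 1 above b."""
    return 1.0 - falling(x, a, b)


def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Rule-base (hypothetical): linguistic input term -> singleton output.
RULES = {"low": 0.0, "medium": 0.5, "high": 1.0}


def severity_index(rms_velocity):
    # Fuzzification: crisp input -> degree of membership in each term.
    memberships = {
        "low": falling(rms_velocity, 1.0, 3.0),
        "medium": triangular(rms_velocity, 1.0, 3.0, 5.0),
        "high": rising(rms_velocity, 3.0, 5.0),
    }
    # Inference: each rule has one antecedent, so its firing strength
    # equals the membership degree of that antecedent.
    total = sum(memberships.values())
    if total == 0.0:
        return 0.0
    # Defuzzification: weighted average of the rule outputs.
    return sum(memberships[t] * RULES[t] for t in RULES) / total
```

Tuning such a system amounts to adjusting the breakpoints and the rule outputs, which is the step the text identifies as the main difficulty.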
10.4. GA.
GA, derived from a study of a biological system, can solve both constrained and unconstrained optimization problems based on a natural selection process [164,165]. At each step, GA randomly selects the best individuals based on their quality from the current population to become parents for the children of the next generation, in order to reach an optimal solution. This step continues until a terminating condition is reached.
[Figure: chart comparing AI-Method and Non AI-Method, with values 57% and 43%.]
There are three main processes that occur in each GA, namely, selection, crossover, and mutation. GA is usually used to optimize the monitoring system parameters and to boost the speed and accuracy of fault diagnosis. Samanta et al. [145] presented a combination of ANN and SVM with GA for bearing fault detection. Based on the results, SVM performs better than NN, and GA helps to reduce the training time of both methods. Han et al. [166] used GA in the process of diagnosing an induction motor and found that GA helps the diagnosis system to perform better by choosing critical features and optimizing the network structure. Hajnayeb et al. [167] claimed that removing some features from the input features results in quicker and more accurate diagnosis systems. GA is usually combined with other AI methods such as SVM, fuzzy logic, and NN. In NN, GA can be used as an alternative for learning the weight values and to optimize the topology of the NN. For fuzzy control, GA can be used to tune the associated membership function parameters as well as to generate the fuzzy rules. The applications of GA can also be observed in Table 6 [168-170]. The advantages and disadvantages of each fault recognition method can be seen in Table 10.
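The selection, crossover, and mutation steps can be sketched for a feature-selection task of the kind reported in [166,167]. This is an illustrative sketch only: binary tournament selection, single-point crossover, elitism, the parameter values, and the stand-in fitness function `example_fitness` are assumptions. In practice, the fitness would typically be the accuracy of a classifier trained on the selected features.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=30,
           crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Maximize fitness over bit masks; returns (best_mask, history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        next_gen = [elite[:]]  # elitism: the best individual survives
        while len(next_gen) < pop_size:
            # Selection: binary tournament, the fitter of two random picks.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # Crossover: splice the parents at a single cut point.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical setup: of 8 candidate features, indices 0, 3, and 5 are
# informative (+2 each when selected); any other selected feature costs 1.
USEFUL = {0, 3, 5}


def example_fitness(mask):
    return sum((2 if i in USEFUL else -1) for i, bit in enumerate(mask) if bit)


best, history = run_ga(example_fitness, n_bits=8)
```

The loop here stops after a fixed number of generations, which is one possible terminating condition; a fitness threshold or a stall limit would be alternatives.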

Discussion
More than 100 articles were discussed in this study, covering topics associated with vibration analysis in machine monitoring and diagnosis in terms of the instruments used in the data acquisition stage, feature extraction methods, and fault recognition by AI techniques. Because there is a large amount of literature in this field, a review of all of it is impossible, and some papers might have been omitted. For the data acquisition process, and to answer RQ 1, most of the studies applied the simpler and cheaper alternative of the computer-based analyzer, and with the help of DSP and FPGA, this analyzer is nearly as good as the conventional analyzer. The accelerometer is still the best sensor option for vibration analysis, as shown by most of the articles reviewed. However, due to the high cost of the piezoelectric accelerometer, researchers are continuously working towards the application of the MEMS accelerometer, which can provide the same or better performance. A velocity transducer is preferable to an accelerometer for diagnosing low-speed machinery, as the absolute accelerations measured are much smaller in value for similar vibration displacements. Noncontact sensors also have huge potential in machine monitoring, as mounting the sensor on the machine is no longer a concern, which in turn produces a more accurate measurement. However, due to their cost, they are not widely used. LDV with multichannel measurements is expected to be widely applied in the future, once the cost of implementing LDV is reduced.
Regarding the signal processing techniques (RQ 2), work on improving the detection and diagnosis of faults in the time-frequency domain has attracted considerable attention from researchers. This is because such techniques can be used to investigate nonstationary signals, as failure signals are not repetitive at the earliest stage. Usually, these nonstationary signals contain abundant information on machine faults. Time and frequency domain techniques for machine monitoring are based on the assumption of stationary signals, and this is not suitable for detecting short-duration dynamic phenomena, especially in rotating machinery. However, traditional methods should not be omitted, as they are preferable in certain applications. For low-speed machines, envelope analysis is widely applied as it can detect low-energy signals, and the envelope spectrum is further analyzed using time-frequency domain methods such as WT and HHT, so that the noise level in the vibration signals is reduced. To make use of the advantages of a certain method and to make up for its limitations, some researchers applied a fusion of certain techniques. Based on Figures 7(a)-7(c), the most applied time, frequency, and time-frequency domain methods in machine monitoring and diagnosis are the RMS, FFT, and WT techniques, respectively.
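As a concrete illustration of the most applied time and frequency domain features named above, the following sketch computes the RMS, the crest factor, and the dominant spectral frequency of a sampled signal. To stay self-contained it evaluates a naive discrete Fourier transform directly; an FFT computes the same transform far more efficiently. The function names and the 50 Hz test tone are assumptions for illustration.

```python
import cmath
import math


def rms(signal):
    """Root mean square of the samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def crest_factor(signal):
    """Ratio of the absolute peak to the RMS value."""
    return max(abs(v) for v in signal) / rms(signal)


def dominant_frequency(signal, fs):
    """Frequency in Hz of the largest non-DC bin of a naive DFT.

    fs is the sampling rate in Hz. The cost is O(N^2), so this is only
    suitable for short records.
    """
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, v in enumerate(signal))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fs / n


# Synthetic 50 Hz tone sampled at 1 kHz for 0.2 s (ten full periods).
fs = 1000
tone = [math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
features = (rms(tone), crest_factor(tone), dominant_frequency(tone, fs))
```

Both transforms assume a stationary record, which is the limitation that motivates the time-frequency methods discussed in the text.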
To answer RQ 3, researchers are also moving towards implementing intelligent systems in vibration analysis for automated decision making. From the review, and referring to Figure 7(d), SVM is the most widely used method, mainly due to its high classification accuracy and low computational time. Besides, it was found that the application of time domain parameters is directly proportional to the application of AI methods. This is because time domain features can improve the performance of AI methods and have a low computational cost, which does not put much computational burden on AI methods. Work on refining the algorithms for lower computational cost and easier implementation of vibration analysis is still in progress for the AI-based techniques. Based on the previous studies regarding vibration analysis for machine monitoring and diagnosis, most of the reported studies applied the vibration analysis method to one test rig or machine, where the results produced are excellent but the same performance cannot be guaranteed when applied to other machines.
This also applies to the environment: most of the reported works were conducted in a controlled environment, and the performance might differ if employed in an industrial setting. Similar cases apply to fault recognition using AI, especially with the SVM and NN methods. These data-driven methods are based on training with historically obtained datasets fed to the algorithm, and when entirely new datasets are used, generalization issues might occur. Thus, it is important to train the AI algorithms with appropriate, diverse, and optimized datasets. It is also recommended to test the AI techniques at operating or environmental conditions different from those of the training datasets to counter the generalization problem. It was discovered that almost 80% of the previous studies applied feature extraction or signal processing techniques such as envelope analysis, STFT, WT, and HHT, whether in combination with an AI method or not. Although these techniques achieved good performance in monitoring and diagnosing the machine condition, expert knowledge regarding signal processing is still required. Furthermore, the feature extractor has to be reconstructed for every specific fault diagnosis task. This is one of the reasons researchers are migrating to the deep learning approach, where fault features are automatically extracted from the raw vibration signals.
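The recommendation to test at conditions different from those used for training can be sketched as a leave-one-condition-out split. The record layout `(condition, features, label)`, the function name, and the sample values are hypothetical; any classifier could be trained and scored on each resulting pair.

```python
def leave_one_condition_out(records):
    """Yield (held_out, train, test) so that each operating condition is
    used once as an unseen test set.

    records is a list of (condition, features, label) tuples.
    """
    conditions = sorted({cond for cond, _, _ in records})
    for held_out in conditions:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test


# Made-up records: shaft speed in rpm, one RMS feature, fault label.
records = [
    (600, [0.9], 0), (600, [2.8], 1),
    (1200, [1.4], 0), (1200, [3.9], 1),
]
splits = list(leave_one_condition_out(records))
```

A large gap between the scores on held-out conditions and on conditions seen during training would be one sign of the generalization issue described above.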
In terms of noise, most of the reported existing methods can effectively distinguish noise from vibration signals. However, this is based on the assumption of Gaussian-distributed vibration signals. In an industrial environment, vibration signals are usually corrupted with non-Gaussian noise, due to the abnormal operation of gears or bearings and random disturbances that occur in the machine, which is not widely considered in the reported techniques. This is because industrial machines are complex systems made up of various components such as shafts, bearings, and gearboxes, which run simultaneously. Thus, some fault signatures are often covered by the machine natural frequencies and submerged by high non-Gaussian noise, and as a result, the faulty frequencies become nondominant in the spectrum, which makes the machine diagnosis and monitoring process more difficult [171,172]. For RQ 4, recent applications of smart machines present a new and huge challenge in monitoring and diagnosing their condition. The presence of various sensors and communication devices in the smart machine will produce very noisy data.
Thus, research on a vibration technique that is precise and robust and can handle a huge amount of noisy data efficiently is desired. Sometimes the vibration data collected from the sensors are insufficient for machine diagnosis and monitoring. This is why the need for the Digital Twin modeling approach arises. According to [173,174], the Digital Twin model can map various characteristics of the physical machine into the virtual world to produce a digital replica of the machine that is transferable, detachable, modifiable, reproducible, repeatable, and erasable. Thus, extrapolating the vibration data acquired from the sensors is possible based on the mathematical representation of the machine. By combining the AI-based data-driven approach and the physics-based simulation model, Digital Twin can obtain additional information regarding the prediction of machine failures. This model can also be applied to run several simulations in different operational and environmental conditions, increasing the robustness of the diagnosis technique. However, producing an effective and proper Digital Twin model remains a challenge due to the nonlinear dynamics and uncertainty that occur in operating smart machines [175]. Thus, work on constructing a Digital Twin model that can properly represent the actual conditions of the machine can still be explored. In addition, a technique to monitor and diagnose the machine's condition from a remote location without the need to visit the machine is worth exploring. For AI-based techniques, the performance can be further evaluated by studying the effects of simultaneous fault occurrence on the machine. In terms of AI algorithms, incorporating the transfer learning approach into the algorithm can be further explored since the deep transfer learning method is still in its early stages. Table 11 shows the most cited articles reviewed in this manuscript for each method to answer RQ 5.
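The idea of running a physics-based simulation at different operating conditions, mentioned above for the Digital Twin approach, can be hinted at with a deliberately small sketch. It is not a Digital Twin in the sense of [173,174]: it assumes a linear single-degree-of-freedom mass-spring-damper excited by rotating imbalance, and the function name and all parameter values are hypothetical.

```python
import math


def simulate_imbalance(speed_rpm, imbalance=1e-4, mass=1.0, stiffness=1e4,
                       damping=5.0, duration=1.0, dt=1e-4):
    """Simulate m*x'' + c*x' + k*x = imbalance * omega**2 * sin(omega*t).

    imbalance is (unbalanced mass * eccentricity) in kg*m; the return
    value is the displacement history in metres, starting from rest.
    """
    omega = 2.0 * math.pi * speed_rpm / 60.0
    x, v = 0.0, 0.0
    displacement = []
    for step in range(int(round(duration / dt))):
        force = imbalance * omega ** 2 * math.sin(omega * step * dt)
        a = (force - damping * v - stiffness * x) / mass
        v += a * dt  # semi-implicit Euler: update the velocity first,
        x += v * dt  # then the position using the new velocity
        displacement.append(x)
    return displacement


# Synthetic responses at two shaft speeds and two imbalance levels.
low_speed = simulate_imbalance(600)
high_speed = simulate_imbalance(900)
doubled = simulate_imbalance(600, imbalance=2e-4)
```

Synthetic records of this kind could supplement scarce fault data, although a real machine would need a far richer model because of the nonlinear dynamics and uncertainty noted in the text.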

Conclusion
Vibration analysis for machine monitoring and diagnosis has become increasingly cheaper thanks to emerging technology and developments in the data acquisition process and signal processing techniques, including the instruments applied. Nowadays, even inexperienced users can conduct effective vibration monitoring without the presence of an expert. In this study, we have conducted a systematic review of vibration analysis for machine monitoring and diagnosis, which can be divided into data acquisition, feature extraction, and fault recognition stages. Several RQs have been answered in this study, which might provide useful information on this area. From the study, several key factors are determined:
(i) With the advancement of powerful software and the Internet, a computer-based analyzer is preferable in the future due to its low cost and performance, which is as good as that of the standalone analyzer.
(ii) The noncontact sensor is the future of vibration analysis for machine monitoring and diagnosis due to its flexibility and independence of any mass-loading effects, without compromising the signal quality.
(iii) Time and frequency domain methods are suitable for stationary signals, and time-frequency domain techniques are preferable for nonstationary signals and early fault detection.
(iv) Deep learning, especially the deep transfer learning method, is starting to be applied in vibration analysis for machine monitoring and diagnosis as it helps in minimizing the requirement for expert knowledge in the complicated feature extraction step. Traditional AI methods such as SVM, NN, and fuzzy logic still require expert knowledge in the feature extraction stage of newly fed datasets.
(v) Traditional time domain features such as RMS and crest factor will remain relevant in the future, and their application with AI will continue to increase.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no conflicts of interest.

[63]. From the results, the proposed method can still effectively diagnose the machine even with the reduced number of inputs.

762
Frequency domain: Research was conducted by [86] to study the relationship between classical envelope analysis and spectral correlation analysis in the diagnostics of bearing faults. It was shown that the envelope analysis provided the same results as the complex spectral correlation function.

613
Time-frequency domain: A study on applying the CWT method in the feature extraction of mechanical vibration signals was conducted by [101]. The results proved that the proposed method is more effective than Donoho's "soft-thresholding denoising" method.