A Fault Diagnosis Method of Rolling Bearing Integrated with Cooperative Energy Feature Extraction and Improved Least-Squares Support Vector Machine

To address the difficulty of identifying and classifying bearing faults under variable working conditions in industrial settings, this paper proposes a new method that combines the optimization of multidimensional fault energy features with an improved least-squares support vector machine (LSSVM). First, because the traditional wavelet energy feature cannot effectively reflect the characteristics of rolling bearings under different working conditions, wavelet energy feature extraction is analyzed in detail, and a cooperative multidimensional fault energy feature extraction method combined with Transfer Component Analysis (TCA) is constructed. This method improves the discrimination between features of different fault states and the compactness of features of the same state. Then, to overcome the tendency of particle swarm optimization (PSO) to become trapped in local optima during fault diagnosis and recognition of rolling bearings, an improved LSSVM based on PSO and wavelet mutation optimization is established to realize the cooperative optimization and adjustment of the LSSVM dynamic parameters. Based on the improved LSSVM and the optimized multidimensional energy features, a new fault diagnosis method for rolling bearings is designed. Finally, the proposed algorithm is verified by simulation and analysis using experimental data from different working conditions. The experimental results show that the method can effectively extract multidimensional fault features under variable working conditions and achieves a high fault recognition rate.


Introduction
In industrial equipment, the rolling bearing is an essential part of high-speed rotating machinery. During actual operation, once a rolling bearing fails (for example, through internal cracking, abrasion, or external cracking), the safety and reliability of the entire system are directly affected. As a result, condition monitoring and fault diagnosis of rolling bearings have become a hot research topic in prognostics and health management (PHM) of industrial systems.
Over the past decade, acceleration data, which reflect the running performance of rolling bearings, have commonly been used to analyze and test their fault characteristics. However, because the operation of rolling bearings is influenced by many dynamic factors, the fault characteristics in the acceleration data may quickly be submerged by ambient noise. This makes the diagnosis of rolling bearing faults very difficult. Fortunately, when the fault characteristics of rolling bearings are located in the blind areas, the energy waveform of the original signal shows the characteristics of high efficiency and low amplitude. Some researchers have therefore tried to extract energy features from the acceleration data to diagnose rolling bearing faults. Unfortunately, because the fault signal of a rolling bearing is a nonlinear and nonstationary vibration response in the industrial environment, the energy features extracted by wavelet-packet-based and improved methods cannot effectively capture the differences between different states or the compactness within the same state in complex industrial environments.
To extract useful energy feature information, wavelet theory has commonly been used in previous work to analyze the vibration data of rolling bearings [1]. The key reason is that wavelet packets can be selected adaptively according to the characteristics of the signal and can divide a frequency band into multiple sub-bands. On this basis, several improved energy feature extraction methods have been presented, such as the wavelet packet transform (WPT), the fuzzy mutual information of wavelet packet transform (FMIWPT), the dual-tree complex wavelet packet transform, the support vector machine based on WPT, complex wavelet packet energy moment entropy, and the maximal overlap wavelet packet transform [2][3][4][5][6][7]. These methods can not only provide an initial enhancement of the fault feature but also extract multiple permutation entropy features in real applications.
To address this problem, some scholars have decomposed the fault signal into different frequency bands with the wavelet packet and reconstructed the nodes in each frequency band [8]. The advantage of this method is that characteristic frequency points can be located quickly in an industrial environment. Meanwhile, to handle irregular vibration signals, fault features can be extracted in the time domain by using the wavelet transform (see [9]). In addition, the optimization of the structure of the energy features has been discussed briefly using Transfer Component Analysis (TCA) in [10]. With this algorithm, the data properties are preserved and the data distributions in different domains converge to a stable scale. However, because the running state of rolling bearings is affected by endogenous factors, the choice of decomposition depth for the energy features is a key problem under our working conditions. Owing to the diversity and variability of actual fault distributions, some methods (such as the optimized transfer learning (TL) algorithm and multilayer regularization terms) aim to solve the domain adaptation problem by reducing the distribution discrepancy and the among-class distance of the learned transferable features [11,12]. These methods can better optimize the structure of feature sets. At the same time, to improve fault diagnosis under variable working conditions, several improved transfer learning methods (such as high-order Kullback-Leibler, parameter transfer, and improved joint distribution adaptation) have also been presented [13][14][15][16]. It is therefore essential to find a new method that further optimizes the structure of the energy features of rolling bearings in real applications. This problem is the first key issue addressed in this paper.
In addition, the purpose of extracting the energy features is to achieve accurate diagnosis of the fault state in industrial scenes. For this reason, the diagnosis method must be considered together with the extraction of multiple features from the fault signal. The support vector machine (SVM), which has good classification ability, is commonly used to classify and recognize faults during the operation of rolling bearings. However, the algorithm is not well suited to large amounts of data. Thus, some researchers have presented improved algorithms such as binary SVM or HV-based SVM to identify multiple fault types of the rolling bearing [17]. Further, the least-squares support vector machine (LSSVM) was constructed to reduce the computational difficulty and improve the recognition speed. This algorithm replaces the inequality constraints of the SVM with equality constraints, but how to make this substitution is very difficult in practice. To overcome this problem, some optimized methods (such as multiclass LSSVM and trend-analysis-based LSSVM) have been presented to diagnose the fault state [18][19][20][21][22]. These models can better identify the fault state in complex industrial scenarios. Meanwhile, to obtain better classification performance, some integrated intelligent diagnosis methods and models (such as SVM-based neural networks and LSSVM-based neural networks) have also been established [23]. The results showed that the improved algorithms achieve good classification performance for rolling bearings in industrial systems.
Although these improved models and algorithms can achieve the desired goal in fault diagnosis of rolling bearings, two crucial parameters of the LSSVM are worth noting: the penalty factor and the kernel function parameter. Because the penalty factor trades off misclassified samples against the simplicity of the decision boundary, and the kernel function parameter defines how far the influence of a single training sample reaches, they determine the accuracy of fault diagnosis to a great extent. At present, the optimization of these two parameters has not yet been resolved. From the viewpoint of practical engineering application, there are few methods that cooperatively adjust the structure of the feature set to make it better suited to practical fault diagnosis. To overcome this problem, some optimized algorithms (such as multimode PSO, PSO based on the Mahalanobis distance (MD), and PSO with mutation) have been proposed to adjust the significant parameters [24][25][26]. The second focus of this paper is therefore the interaction between the optimized selection of the energy features and the accuracy of fault diagnosis [27][28][29][30][31].
Accordingly, the method in this paper combines three techniques to construct a bearing fault diagnosis model. First, the wavelet transform and energy features are used to represent the characteristics of the bearing signal; however, the eight-dimensional energy feature set alone cannot distinguish the five bearing states. Then, TCA is introduced to optimize the distribution of the energy feature set. Because TCA can reduce the distribution discrepancy and increase the distance between the learned transferable features of different bearing states, the optimized feature set helps to improve diagnostic accuracy. Finally, the improved PSO is used to find the optimal parameters of the LSSVM.
Based on these two points, a new fault diagnosis method for rolling bearings is presented that integrates cooperative energy feature extraction with an improved LSSVM to extract the multidimensional feature set and enhance the accuracy of fault diagnosis. The rest of the paper is arranged as follows. Section 2 discusses the cooperative energy feature extraction rule in detail, combining TCA and the wavelet packet (WP). Section 3 establishes an improved LSSVM algorithm with dynamic parameter adjustment based on Particle Swarm Optimization (PSO) and Wavelet Mutation Optimization (WMO). Section 4 uses fault data from the laboratory of the Guangdong Institute of Petrochemical Technology to verify the effectiveness of the model and algorithm. Finally, Section 5 discusses some promising applications of the model.

Cooperative Energy Feature Extraction Model and Algorithm for Vibration Signal of Rolling Bearings
In general, extracting suitable features from the original signal data is a common approach in industrial scenarios. However, the original signal data of a bearing are large and complicated in real industrial scenes, and they are disturbed by complex noise. Therefore, how to extract the energy features of the original signal data is very important for accurately representing the running state of the rolling bearing. For this purpose, wavelet theory and Transfer Component Analysis are introduced to construct the cooperative energy feature extraction model. This approach has two advantages. First, the primary signal components in different frequency bands can be depicted in detail by the wavelet packet, because the wavelet packet provides satisfactory localization properties in both the time and frequency domains. Second, the structure of the energy features can be optimized by TCA. This cooperative processing method can also reveal the internal form of the energy features. Next, the energy feature extraction model based on the wavelet packet is presented.

Energy Feature Extraction Model Based on Wavelet Packet.
To address the first problem above, the vibration data can be divided into multiple frequency bands by the wavelet packet. In real applications, the frequency bands can be selected adaptively according to the internal characteristics of the signal. The wavelet packet algorithm is described in detail next.
To further analyze the data, let $L^2(R) = \bigoplus_{j \in Z} W_j$ denote the multiresolution analysis over the scale factors $j$. In this multifrequency analysis, $L^2(R)$ is decomposed into the orthogonal sum of the subspaces $W_j$ ($j \in Z$), where $W_j$ is the subspace of the wavelet function. In our work, the wavelet space $W_j$ is refined in binary mode to increase the frequency resolution. To represent the scale space $V_j$ and the wavelet subspace $W_j$ in a unified family of subspaces $U_j^n$, let $U_j^0 = V_j$ and $U_j^1 = W_j$; the iteration formula is then defined as follows:
$$U_{j+1}^{n} = U_j^{2n} \oplus U_j^{2n+1}, \quad j \in Z,\ n \in Z_+,$$
where the subspace $U_j^n$ is the closure space of the function $\omega_n(t)$ and $U_j^{2n}$ is the closure space of the function $\omega_{2n}(t)$. The following two-scale equations should also be satisfied:
$$\omega_{2n}(t) = \sqrt{2}\sum_{k \in Z} h(k)\,\omega_n(2t-k), \qquad \omega_{2n+1}(t) = \sqrt{2}\sum_{k \in Z} g(k)\,\omega_n(2t-k),$$
where $h(k)$ and $g(k) = (-1)^k h(1-k)$ are the low-pass and high-pass filter coefficients, and the sequence $\{\omega_n(t)\}$, $n \in Z_+$, is the basis function. The sequence determined by the basis function $\omega_0(t) = \phi(t)$ is called the orthogonal wavelet packet; $\omega_0(t)$ and $\omega_1(t)$ are the scaling function $\phi(t)$ and the wavelet basis function $\psi(t)$, respectively. In addition, $\{2^{j/2}\omega_n(2^j t - l)\}_{l \in Z}$ is the normalized orthogonal basis of $U_j^n$. Further, an arbitrary $c_j^n(t) \in U_j^n$ can be expressed as follows:
$$c_j^n(t) = \sum_{l} d_l^{j,n}\, 2^{j/2}\,\omega_n(2^j t - l).$$
Then, $c_{j+1}^n(t)$ can be decomposed into $c_j^{2n}(t)$ and $c_j^{2n+1}(t)$ by wavelet packet decomposition.
In addition, $d_l^{j+1,n}$ is used to obtain $d_l^{j,2n}$ and $d_l^{j,2n+1}$ according to the following formula:
$$d_l^{j,2n} = \sum_{k} h(k-2l)\, d_k^{j+1,n}, \qquad d_l^{j,2n+1} = \sum_{k} g(k-2l)\, d_k^{j+1,n}.$$
The conventional three-layer wavelet packet decomposition structure is shown in Figure 1.
As shown in Figure 1, an arbitrary signal $S$ with frequency range $[0, f]$ can be decomposed into a high-frequency part $D_1$ and a low-frequency part $A_1$. After the first layer of the multiresolution analysis, the frequency range of the high-frequency part is $[f/2, f]$, and that of the low-frequency part is $[0, f/2]$. Once the first layer is complete, the second-layer decomposition is performed: the low-frequency part $AA_2$ and the high-frequency part $DA_2$ are obtained by decomposing the low-frequency part $A_1$, and the high-frequency part $D_1$ is likewise decomposed into the low-frequency component $AD_2$ and the high-frequency component $DD_2$. This means that the four frequency ranges are $[0, f/4]$, $[f/4, f/2]$, $[f/2, 3f/4]$, and $[3f/4, f]$. Analogously, the signal can be decomposed layer by layer. The decomposition relationship for signal $S$ may be formulated as follows:
$$S = AAA_3 + DAA_3 + ADA_3 + DDA_3 + AAD_3 + DAD_3 + ADD_3 + DDD_3.$$
Through the above algorithm, the different orthogonal wavelet spaces $U_j^n$ have different time-frequency resolutions, and together the spaces $U_j^n$ cover the entire bandwidth of signal $S$. The time-frequency analysis can therefore adaptively project the spectral components of the signal onto the orthogonal wavelet packet space of the corresponding frequency band. In energy feature extraction for rolling bearings, because the components of the original signal at each decomposition level represent the signal information in the corresponding local time-frequency area, the information of the component signals remains intact. The energy of the signal in each orthogonal wavelet packet space at a given decomposition level can then be calculated. Then, the wavelet packet energies, indexed by frequency band, are arranged to form the eigenvectors of the original signals.
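The band splitting described above can be illustrated with a short sketch. It follows the idealized halving used in the text, where each node is split into a lower half labelled A and an upper half labelled D, and it ignores the band reordering that practical filter banks can introduce. The function name `packet_bands` and the letter-only labels (without the level subscript) are assumptions made for this illustration.

```python
def packet_bands(f_max, levels):
    """Return {label: (low, high)} for every node at the final level.

    A new letter is prefixed at each level, so the child labels of "A"
    are "AA" (lower half) and "DA" (upper half), as in the text.
    """
    bands = {"": (0.0, float(f_max))}
    for _ in range(levels):
        next_bands = {}
        for label, (lo, hi) in bands.items():
            mid = (lo + hi) / 2.0
            next_bands["A" + label] = (lo, mid)   # low-frequency half
            next_bands["D" + label] = (mid, hi)   # high-frequency half
        bands = next_bands
    return bands


if __name__ == "__main__":
    for label, (lo, hi) in packet_bands(1.0, 3).items():
        print(f"{label}: [{lo:.3f} f, {hi:.3f} f]")
```

With three levels the sketch yields eight bands of width f/8, which matches the eight-dimensional energy feature set used later in the paper.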
To better characterize the energy feature, the energy of the wavelet packet node $(j, n)$ is calculated as follows:
$$E_{j,n} = \sum_{l} \left| d_l^{j,n} \right|^2,$$
where $d_l^{j,n}$ is the wavelet packet transform coefficient.
To further understand and analyze the distribution of energy features, the statistical distribution of the energy is calculated according to the decomposed signals at different frequency bands.
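As a rough illustration of the band-energy calculation, the following sketch performs a full three-level wavelet packet split and sums the squared coefficients of each terminal node. The paper does not state which mother wavelet is used here, so the orthonormal Haar filter is an assumption chosen only because it keeps the code short; the function names are likewise hypothetical. Nodes are returned in natural (tree) order, not sorted by frequency.

```python
import math


def haar_split(x):
    """One orthonormal Haar step: return (approximation, detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_energy(signal, levels=3):
    """Return the energy of each terminal node after `levels` full splits."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # Node energy = sum of squared wavelet packet coefficients.
    return [sum(c * c for c in node) for node in nodes]


if __name__ == "__main__":
    sig = [math.sin(0.3 * i) for i in range(1024)]
    energies = wavelet_packet_energy(sig, levels=3)
    total = sum(energies)
    print([round(e / total, 4) for e in energies])
```

Because the Haar step is orthonormal, the eight node energies add up to the energy of the input signal, so dividing by the total gives a relative energy distribution that can serve as a normalized feature vector.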
Unfortunately, when the energy features from different working conditions are input into the classifier, the training accuracy is 97.5%, but the test accuracy is only 87.2%. The energy features cannot fully capture the differences among the bearing states, which results in low accuracy for bearing fault diagnosis. Thus, a method is needed to make up for this shortcoming of the wavelet packet. To reduce the data dimension and optimize the data distribution, TCA is used in our research to further optimize the feature sets. Next, an improved cooperative energy feature extraction method that combines the wavelet packet and TCA is established.

An Improved Cooperative Energy Feature Extraction Method Based on the Transfer Component Analysis Algorithm
In the practical extraction of energy features from rolling bearing signal data, accurately distinguishing the differences among the states under variable working conditions is crucial. To ensure that the energy feature set of the rolling bearing has strong between-class separation and compact within-class structure, TCA is introduced to reduce the distribution discrepancy and the among-class distance of the learned transferable features. The main role of TCA is to optimize the structure of the energy features obtained in the above section. After introducing the basic concept and approach of transfer features, this section designs and implements a cooperative energy feature extraction method using the TCA algorithm.

Basic Concepts of Transfer Feature.
Note that the TCA algorithm adjusts the marginal probability distribution of the data set, that is, the probability distribution of its features. The original distribution of the bearing feature set is not sufficient to meet the accuracy requirements of fault diagnosis. To reduce the distance within the same feature set and enlarge the gap between different feature sets, the method reduces the distribution discrepancy between the source domain and target domain data. The transfer feature mapping process is shown in Figure 2.
The circles and triangles represent the source domain and the target domain, respectively, and A and B denote different data. Before the common mapping is applied, the marginal probability distribution of the source domain feature set differs from that of the target domain feature set, so the mapping relationship from the source domain to the target domain must be described.
For simplicity of further analysis, assume that the source domain is $D_S = \{X_S, Y_S\}$ and the target domain is $D_T = \{X_T\}$, where $X_S$ is the feature set of the source domain, $Y_S$ is its label set, and $X_T$ is the feature set of the target domain. In general, $P(X_S) \neq P(X_T)$.
After feature mapping with the TCA algorithm, the marginal probability distributions of $M(X_S)$ and $M(X_T)$ should be as similar as possible; that is, the following relationship should be satisfied:
$$P(M(X_S)) \approx P(M(X_T)).$$
Once this relationship holds, the source domain feature sample set and the target domain feature sample set are mapped to a shared subspace, and the knowledge gained in the feature sample transfer process can be fully utilized to improve the cross-domain learning ability.

Energy Feature Extraction Method Based on the Transfer Component Analysis Algorithm
To reduce the difference between the source domain and the target domain by finding their common components while retaining the original features of the two data sets, the two data sets are defined as follows:
$$D_S = \{(x_{S_1}, y_{S_1}), \ldots, (x_{S_{n_s}}, y_{S_{n_s}})\}, \qquad D_T = \{x_{T_1}, \ldots, x_{T_{n_t}}\},$$
where $n_s$ is the number of labeled training samples in the source domain and $n_t$ is the number of unlabeled samples in the target domain. In this situation, the goal is to predict the sample labels $y_{T_i}$. At the same time, the data mapping function $\phi$ between the source domain and the target domain is defined as follows:
$$X_S^{*} = \phi(X_S), \qquad X_T^{*} = \phi(X_T).$$
The objective of this process is to reduce the difference between the marginal probability distributions $P(X_S)$ and $P(X_T)$ so that $P(X_S^{*}) \approx P(X_T^{*})$. Similarly, for a given source domain data set $X_S$ and the associated target domain data set $X_T$, the maximum mean discrepancy (MMD) distance between the two data sets can be expressed as follows:
$$\mathrm{Dist}(X_S, X_T) = \left\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(x_{S_i}) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(x_{T_j}) \right\|_{H}^{2},$$
where $\|\cdot\|_H^2$ is the squared norm in the reproducing kernel Hilbert space. The source and target domain data are mapped into a shared low-dimensional latent space through the nonlinear mapping, and the kernel matrix can then be written as follows:
$$K = \begin{bmatrix} K_{S,S} & K_{S,T} \\ K_{T,S} & K_{T,T} \end{bmatrix} \in R^{(n_s+n_t)\times(n_s+n_t)}.$$
Here, $K_{S,S}$, $K_{S,T}$, $K_{T,S}$, and $K_{T,T}$ are the kernel matrices obtained from the source domain, the target domain, and the hybrid (cross) domain, respectively.
Further, the MMD distance can be rewritten as
$$\mathrm{Dist}(X_S, X_T) = \mathrm{Tr}(KL),$$
where Tr represents the trace of a matrix. For simplicity, $L_{i,j}$ may be expressed as follows:
$$L_{i,j} = \begin{cases} 1/n_s^2, & x_i, x_j \in X_S, \\ 1/n_t^2, & x_i, x_j \in X_T, \\ -1/(n_s n_t), & \text{otherwise}. \end{cases}$$
Through the above analysis, the distribution discrepancy between the data sets can be reduced, and a shared feature representation of the two domains is obtained. The representation also maintains the data feature attributes of the two domains, and the method extracts the components needed to transfer data between different but related fields. The main purpose of the algorithm is twofold. First, the distance between $\phi(X_S)$ and $\phi(X_T)$ is minimized; second, the main feature attributes of the raw data sets $X_S$ and $X_T$ are preserved.
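The equivalence between the mean-difference form of the MMD and its trace form can be checked numerically. The sketch below assumes a linear kernel (so that the feature map is the identity) and that NumPy is available; the function names are hypothetical and the random data are for illustration only.

```python
import numpy as np


def mmd_direct(xs, xt):
    """Squared MMD with a linear kernel: squared distance between domain means."""
    diff = xs.mean(axis=0) - xt.mean(axis=0)
    return float(diff @ diff)


def mmd_trace(xs, xt):
    """The same quantity written as Tr(KL)."""
    ns, nt = len(xs), len(xt)
    x = np.vstack([xs, xt])
    k = x @ x.T                                   # linear kernel matrix K
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    l = np.outer(e, e)                            # entries 1/ns^2, 1/nt^2, -1/(ns*nt)
    return float(np.trace(k @ l))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(5, 3))
    target = rng.normal(loc=1.0, size=(7, 3))
    print(mmd_direct(source, target), mmd_trace(source, target))
```

Writing the coefficient matrix as an outer product makes the block structure of $L$ explicit: source-source entries are $1/n_s^2$, target-target entries are $1/n_t^2$, and cross entries are $-1/(n_s n_t)$.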
For all mapped samples, we can find an embedding matrix $W \in R^{(n_s+n_t)\times m}$ ($m \ll n_s + n_t$) such that the kernel matrix becomes
$$\tilde{K} = K W W^{T} K.$$
Based on this embedding, the distance between the two domains may be rewritten as follows:
$$\mathrm{Dist}(X_S^{*}, X_T^{*}) = \mathrm{Tr}\big((K W W^{T} K) L\big) = \mathrm{Tr}(W^{T} K L K W).$$
Once the matrix $W$ is found, the largest variance of the energy features can be maintained in the newly created subspace. The variance of the projected samples can be expressed as $W^{T} K H K W$, where $H = I - \frac{1}{n_s+n_t}\mathbf{1}\mathbf{1}^{T} \in R^{(n_s+n_t)\times(n_s+n_t)}$ is the centering matrix. Therefore, the problem may be transformed into an optimization problem under the constraint $W^{T} K H K W = I_m$, where $I_m \in R^{m\times m}$ is the identity matrix. The final kernel learning problem can be established as follows:
$$\min_{W}\ \mathrm{Tr}(W^{T} K L K W) + \mu\,\mathrm{Tr}(W^{T} W) \quad \text{s.t.}\quad W^{T} K H K W = I_m,$$
where $\mu > 0$ is the trade-off parameter. Next, the optimization problem can be transformed into finding the mapping matrix $W$, which can be obtained by matrix decomposition: we first calculate the matrix $(KLK + \mu I)^{-1} KHK$, and its $m$ leading eigenvectors form $W$.
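The eigenvector step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel width `gamma`, the default values of `m` and `mu`, and the function names are all hypothetical, and the random inputs merely stand in for eight-dimensional energy features.

```python
import numpy as np


def rbf_kernel(x, gamma):
    """Pairwise RBF kernel matrix exp(-gamma * ||xi - xj||^2)."""
    sq = np.sum(x * x, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def tca(xs, xt, m=2, mu=1.0, gamma=0.5):
    """Project source and target samples onto m transfer components."""
    ns, nt = len(xs), len(xt)
    n = ns + nt
    k = rbf_kernel(np.vstack([xs, xt]), gamma)
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    l = np.outer(e, e)                              # MMD coefficient matrix L
    h = np.eye(n) - np.ones((n, n)) / n             # centering matrix H
    # Solve (KLK + mu*I) A = KHK instead of forming the inverse explicitly.
    a = np.linalg.solve(k @ l @ k + mu * np.eye(n), k @ h @ k)
    vals, vecs = np.linalg.eig(a)
    order = np.argsort(-vals.real)[:m]              # m leading eigenvalues
    w = vecs[:, order].real
    z = k @ w                                       # embedded samples
    return z[:ns], z[ns:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    zs, zt = tca(rng.normal(size=(20, 8)), rng.normal(loc=0.5, size=(15, 8)), m=3)
    print(zs.shape, zt.shape)
```

The regularization term $\mu I$ keeps the linear system well conditioned, which is why the sketch solves it directly rather than using a pseudo-inverse.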
So far, the core energy features can be selected with the above model. Moreover, because features of the same state move closer together, the energy features of different states become more clearly separable. Overall, the compactness of the features is greatly improved after integrating TCA, which makes it convenient to use classifiers to improve the fault diagnosis accuracy of bearings.

Design and Analysis of a Cooperative Energy Feature Extraction Algorithm.
According to the above theoretical analysis, an improved cooperative energy feature extraction algorithm may be designed as follows. The data are input into the optimized LSSVM, training is performed with the source domain data, and the target domain data are used to test the training result. Finally, the classification results are obtained and the accuracy is assessed. The detailed flowchart is shown in Figure 3.
Regardless of how the energy feature extraction mechanism is improved, the final goal of the energy features is to improve the accuracy of fault recognition for rolling bearings. Good differentiation among different states and a high correlation within the same state bring gains in diagnostic accuracy. That is to say, they make it convenient to use classifiers to improve the fault diagnosis accuracy of rolling bearings. In the next step, a fault diagnosis method for rolling bearings is established to achieve this goal.

Classification Process of Improved LSSVM with Dynamic Parameter Adjustment
According to the above analysis, an improved fault diagnosis method based on the improved LSSVM with dynamic parameter adjustment is shown in Figure 4. Its steps are as follows.
(i) Step 1. Input the extracted data features into the improved LSSVM model and train the two parameters that need to be optimized.
(ii) Step 2. Initialize the parameters of the particle swarm, such as the number of evolutionary generations, the learning factors, the initial position x_id of each particle, and the initial velocity v_id.
(iii) Step 3. Set the best position of each particle to its initial position, and take the fitness at that position as its optimal fitness. Calculate the speed and position of each particle according to the formula.
(iv) Step 4. Calculate the scale parameter and the wavelet function value from the wavelet mutation function. Perform the mutation operation on the current optimal particle according to the wavelet function formula.
(v) Step 5. Update p_best and g_best according to the fitness value of each particle. Then, update the velocity and position of each particle.
(vi) Step 6. Determine whether the results of the algorithm reach the optimal condition. The training classification accuracy of the classification model is defined as the fitness of the PSO. If the fitness value calculated in the current cycle is the best, the current particle is saved as the best particle. If the fitness is not the best, the optimal parameters from the end of the previous cycle are used. The optimal particle search continues until the end of the cycle.
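The steps above can be sketched as a small PSO loop with a wavelet mutation of the best particle. Several parts are assumptions made for illustration: the mutation uses a Morlet wavelet with a simple dilation schedule `a = 1 + (g - 1)(t/T)^zeta`, which is not taken from the paper; the inertia weight and learning factors are arbitrary; and the fitness is minimized, so in the paper's setting it would stand in for something like one minus the training accuracy of the LSSVM.

```python
import math
import random


def morlet(x):
    """Morlet mother wavelet."""
    return math.exp(-x * x / 2.0) * math.cos(5.0 * x)


def wavelet_mutate(x, lo, hi, t, t_max, rng, g=1000.0, zeta=2.0):
    """Mutate one coordinate; the result always stays inside [lo, hi]."""
    a = 1.0 + (g - 1.0) * (t / t_max) ** zeta     # assumed dilation schedule, a >= 1
    phi = rng.uniform(-2.5 * a, 2.5 * a)
    sigma = morlet(phi / a) / math.sqrt(a)         # |sigma| <= 1 because a >= 1
    return x + sigma * (hi - x) if sigma > 0 else x + sigma * (x - lo)


def pso_wavelet(fitness, bounds, n_particles=20, t_max=50, seed=0,
                w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over the box `bounds`; return (best, best_f, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    best_i = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[best_i][:], pbest_f[best_i]
    history = [gbest_f]
    for t in range(1, t_max + 1):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
            if f < gbest_f:
                gbest, gbest_f = pos[i][:], f
        # Wavelet mutation of the current best particle; keep it only if it improves.
        cand = [wavelet_mutate(x, lo, hi, t, t_max, rng)
                for x, (lo, hi) in zip(gbest, bounds)]
        cand_f = fitness(cand)
        if cand_f < gbest_f:
            gbest, gbest_f = cand, cand_f
        history.append(gbest_f)
    return gbest, gbest_f, history
```

Because the dilation grows with the iteration count, the mutation step shrinks over time: early iterations can jump out of a local optimum, while late iterations only fine-tune the best particle. For the LSSVM, `bounds` would hold the search ranges of the penalty coefficient and the kernel parameter.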
The penalty coefficient C and the Gaussian radial basis kernel parameter R are saved to construct the LSSVM classification model. Through this algorithm, the cooperative energy feature extraction model and algorithm for the vibration signal of a rolling bearing are used to build a multidimensional feature set, and the fault diagnosis can then be carried out. The detailed flowchart of the fault diagnosis is shown in Figure 4.
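To make the role of the two saved parameters concrete, the following sketch trains a binary LSSVM classifier by solving its linear system. It is an illustration under assumptions: the standard LSSVM classifier formulation is used, the kernel parameter is written as a Gaussian width `sigma` (playing the role of R), NumPy is assumed to be available, and the function names are hypothetical. The five-state problem in the paper would additionally need a multiclass scheme such as one-versus-rest, which is not shown.

```python
import numpy as np


def rbf(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = (np.sum(a * a, axis=1)[:, None]
          + np.sum(b * b, axis=1)[None, :]
          - 2.0 * (a @ b.T))
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(x, y, c=100.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/c]] [b; alpha] = [0; 1] for labels in {-1, +1}."""
    n = len(y)
    omega = np.outer(y, y) * rbf(x, x, sigma)
    a = np.zeros((n + 1, n + 1))
    a[0, 1:] = y
    a[1:, 0] = y
    a[1:, 1:] = omega + np.eye(n) / c              # penalty enters as I / c
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(a, rhs)
    return sol[0], sol[1:]                          # bias b, multipliers alpha


def lssvm_predict(x_train, y_train, b, alpha, x_new, sigma=1.0):
    """Sign of the kernel expansion sum_i alpha_i y_i K(x, x_i) + b."""
    scores = rbf(x_new, x_train, sigma) @ (alpha * y_train) + b
    return np.where(scores >= 0, 1.0, -1.0)


if __name__ == "__main__":
    x = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
    y = np.array([-1.0, -1.0, 1.0, 1.0])
    b, alpha = lssvm_fit(x, y)
    print(lssvm_predict(x, y, b, alpha, x))
```

The equality constraints turn training into a single linear solve, which is the computational advantage of the LSSVM over the standard SVM; the price is that every training sample receives a nonzero multiplier.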

Experiments and Discussions
To verify the effectiveness of the proposed fault diagnosis method, the experimental acceleration data of bearings are used for fault diagnosis.
The experimental data were obtained from the multifault diagnosis equipment of the rotary unit in the State Key Laboratory of Bearings, Guangdong University of Petrochemical Technology. Figure 5 shows the single-stage centrifugal fan fault diagnosis unit. Figure 6 shows the schematic diagram of the inner and outer cracks of bearings.
With this experimental platform, data can be acquired under five bearing states: normal, external cracking, internal cracking, missing steel ball, and wear.
To verify the effectiveness of our algorithm, a sample containing 10240 sampling points in one sample period is used to extract the energy features. To facilitate signal processing and extraction, the original signal is divided into 10 groups of 1024 points (1024 * 10). Figure 7 shows the acceleration signal of the bearing under normal conditions, divided in this way. The signal in each group is decomposed into 8 frequency bands by the wavelet packet algorithm, as shown in Figure 8. Because of the high sampling frequency, the signal is large in volume, and it is difficult to distinguish faults directly from it. First, the original signal is decomposed by a three-layer wavelet packet to obtain signal components in eight different frequency bands. Figure 8 shows the signal components obtained by decomposing the original signals for the original states of the bearing. The frequency band of the original signal is divided into multiple bands, and, according to the characteristics of the signal, the corresponding frequency band is adaptively selected to match the signal frequency, thereby improving the frequency resolution. However, Figure 8 only shows the signal decomposed into components of different frequency bands and does not reflect obvious fault characteristics. Therefore, the next step is to further extract the energy features of the signal components.
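The grouping step can be written as a short helper. This sketch assumes that the groups are consecutive and non-overlapping, which the text suggests but does not state explicitly; the function name `split_into_groups` is hypothetical.

```python
def split_into_groups(signal, group_len=1024):
    """Split a sampled signal into consecutive, non-overlapping groups."""
    if len(signal) % group_len != 0:
        raise ValueError("signal length must be a multiple of group_len")
    return [signal[i:i + group_len] for i in range(0, len(signal), group_len)]


if __name__ == "__main__":
    sample = [0.0] * 10240                 # placeholder for one 10240-point record
    groups = split_into_groups(sample)
    print(len(groups), len(groups[0]))     # 10 groups of 1024 points
```

Each group would then be passed to the three-layer wavelet packet decomposition, giving one eight-dimensional energy vector per group.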
After the original signals in the five states are decomposed by the wavelet packet, the characteristic histogram is obtained by calculating the energy of each node in the component signals. The energies describe the multidimensional feature set of the bearing, illustrated here by the energy features extracted from one group of the normal signal. From a practical point of view, it is reasonable to construct the feature set of the signals from the groups within a sample period. As shown in Figure 9, the distribution of energy differs between bearing states. The energy features therefore give an initial indication of the differences, and the energy feature values are extracted to construct the energy feature table. These features can be used to constitute a complete feature set for structure processing and fault diagnosis. On this basis, the energy feature values extracted from the 10 groups (1024 * 10) of the normal signal are listed in Table 1.
In our experiment, the data sets for the five bearing states include 10 * 10240 groups, and each group of signals is divided into ten groups for signal decomposition. Eight different frequency bands are obtained from the original signal, and the energies of the corresponding nodes are used to construct a multidimensional energy feature set for the bearing. Table 1 shows the multidimensional energy feature data for the original states; different bearing states clearly have different energy features. Then, the table of energy features is input into the classifier. The energy features extracted from the bearing fault vibration signal constitute a feature set that has been normalized. The labeled source domain sample set and the unlabeled target domain sample set are mapped to the reproducing kernel Hilbert space. The maximum mean discrepancy between the source domain and the target domain reflects the difference between their distributions: the smaller the maximum mean discrepancy, the stronger the transferability from the source domain to the target domain. It is therefore beneficial to select source domain data with high similarity to the target domain data.
Energy feature is recalculated from each component by the improved cooperative energy feature extraction algorithm. In our simulation experiment, the feature set in A is inputted into TCA which is used to optimize the distribution of the feature set. In this hidden subspace, a classifier can be trained using the tagged samples from the mapped source domain, and the classifier is used to test the target domain data in the hidden space. e simulation results are shown in Figures 10 and 11. Figure 10 shows the original energy distributions of the bearing. e five state characteristics of the bearing (normal, outer crack, inner crack, wear, and missing steel ball in bearing) are not distinct, and a poor energy distribution leads to low classification accuracy. Obviously, after inputting the energy characteristics into the TCA algorithm, the energy distribution of the bearing is shown in Figure 11. For indeed, the distance between same-state features becomes increasingly similar and the energy features possessed the advantage of the time-space concentricity.
This shows that our model and algorithm are effective.
Regardless of how the energy feature extraction mechanism is improved, the final goal is to improve the accuracy of fault recognition for rolling bearings. Good differentiation among different states and high correlation within the same state bring gains in diagnostic accuracy. That is to say, they also make it easier for classifiers to improve the fault diagnosis accuracy of rolling bearings. In the next step, a fault diagnosis method for rolling bearings is established to achieve this goal.

Comparative Experimental Analysis.
As a classifier, the optimized LSSVM is used for random cross-validation experiments. The data set of the source domain is used as the training set, and the data set of the target domain is used as the test set.

In the TCA algorithm, the kernel function maps the data from the source domain and the target domain to a high-dimensional space. Therefore, the choice of the kernel function affects the data mapping process of the source domain and the target domain. Four different kernel functions, namely, primal, RBF, linear, and SAM, are used to conduct comparative experiments. Under each kernel function in TCA, the ability to analyze the corresponding energy characteristics is tested, and the training accuracy and test accuracy are calculated. Because TCA is a dimensionality reduction algorithm, the reduced dimension affects the classification accuracy. In this paper, the original dimension of the energy feature data set is 8, and the reduced dimension is varied from 1 to 8 to test the diagnostic accuracy of the fault diagnosis method. Combining the results from Table 2 and Figure 12, the diagnostic accuracy of the RBF kernel function is relatively high and stable. Therefore, the RBF kernel function is used for bearing fault diagnosis analysis.
Based on the above fault diagnosis classification model, each group uses 100 sets of data features for fault identification. The simulation results for the training phase and the test phase are shown in Figure 12(a).
According to the test data in Figure 13(b), there are 120 groups of test data for the 5 states ((1) normal, (2) internal cracking, (3) outer cracking, (4) wear, and (5) missing steel ball). Among them, two samples are incorrectly classified, so the test accuracy is 98.3%.
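The reported figure follows directly from the counts: 118 of 120 test samples are classified correctly, and 118/120 ≈ 98.3%. The short sketch below reproduces this arithmetic. It assumes, for illustration only, that the 120 samples are split evenly into 24 per state; which two samples are misclassified is also hypothetical, since the section does not say.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# 5 bearing states, labeled 1..5, with an assumed 24 test samples each (120 total).
true_labels = [state for state in range(1, 6) for _ in range(24)]

# Start from perfect predictions, then inject two hypothetical errors.
predicted_labels = list(true_labels)
predicted_labels[0] = 2    # a state-1 sample predicted as state 2
predicted_labels[50] = 1   # a state-3 sample predicted as state 1

test_accuracy = accuracy(true_labels, predicted_labels)
print(f"{test_accuracy:.1%}")  # 118 correct out of 120
```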
To verify the validity and superiority of the algorithm presented in this paper, we compare different unoptimized algorithms with the optimized classification algorithm proposed.
Four different methods are compared under the same experimental environment and the same experimental data. Table 3 shows that the correct rate can reach 100% during the training process using the method developed in this paper.
Additionally, the correct rate can reach 99.8% during the test process. The fault diagnosis accuracy is better than that of the other three methods.
The comparison shows that the TCA algorithm is effective in analyzing the energy characteristics of wavelet packets. Moreover, the optimized classification algorithm is superior to the traditional single classification algorithm and has a better diagnostic ability.

Conclusions
In this paper, to improve the accuracy of identifying and classifying faults under variable working conditions, a new method is proposed that is based on optimization of multidimensional fault energy characteristics and is integrated with an improved least-squares support vector machine (LSSVM). The main contributions of this paper are as follows.
(1) The wavelet packet method is used to reduce the surrounding noise and decompose the original signal into eight different frequency bands. The energy of every component is calculated to construct a feature set for the bearing.
(2) Because TCA can correct the distribution of the energy features, the distribution of the feature set is optimized, and features of the same state are more compact than before. The optimized feature structure can improve the accuracy of bearing fault diagnosis.
(3) Particle swarm optimization and wavelet mutation are integrated to optimize two parameters of the LSSVM. On real bearing data, the training accuracy of the proposed method is 100%, and the test accuracy is 99.8%. The experimental results show that the proposed method is effective in addressing the low-precision problem of fault diagnosis for complex bearings in equipment.
(4) Unfortunately, two problems remain to be solved in future research. First, complex noise in the original signal interferes with bearing fault diagnosis. Second, only a single kernel function is selected in the TCA algorithm. Therefore, the next step is to focus on signal denoising and on constructing TCA with multiple kernel functions to further improve the fault diagnosis accuracy.

Data Availability
The data used to support the findings of this study are included within the article. The data sets are provided by the Guangdong University of Petrochemical Technology for experimental verification.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.