Tensor Transfer Learning for Intelligence Fault Diagnosis of Bearing with Semisupervised Partial Label Learning

A new tensor transfer approach is proposed in this paper for intelligent fault diagnosis of rotating machinery with semisupervised partial label learning. Firstly, the vibration signals are arranged as a three-way tensor indexed by trial, condition, and channel. Secondly, to adapt the source and target domain tensor representations directly, without vectorization, the domain adaptation (DA) approach named tensor-aligned invariant subspace learning (TAISL) is applied for the first time to tensor representations in fault diagnosis, for the case where the testing and training data are drawn from different distributions. Then, semisupervised partial label learning (SSPLL) is introduced for the first time to tackle the problem that a large number of instances are hard to label, so much of the data is left unlabeled. Finally, the proposed method is used to identify faults. The effectiveness and feasibility of the proposed method have been validated by transfer fault experiments. The experimental results show that the presented technique achieves higher testing accuracy than the compared methods.


Introduction
Fault diagnosis is a key process for ensuring the reliable and cost-effective performance of engineered systems. Downtime caused by component failures, such as bearing faults, directly affects the economic viability of large systems [1][2][3][4]. Therefore, to maintain reliability and operational safety, fault detection has attracted a lot of attention [5].
In recent years, fault classification methods for bearings have been very successful under the assumption that candidate label sets are provided for all training examples [6]. Based on this assumption, much effort has been devoted to traditional intelligent fault diagnosis approaches. Liu et al. [7] proposed a personalized diagnosis method to detect faults in a bearing based on acceleration sensors and a finite element method (FEM) simulation-driven SVM. A novel supervised sparse feature extraction method was proposed for rotating machine fault diagnosis in [8]. Reference [9] proposed a novel fault diagnosis method based on a local-global deep neural network algorithm. A deep learning model named renewable fusion fault diagnosis network, which updates automatically as the collected fault data increase, was proposed in [10]. Various other intelligent fault diagnosis techniques [11][12][13][14] have also enriched the field. However, these approaches are applicable to vector data only. Aside from this, some tensor-based diagnosis techniques have flourished in the field of fault diagnosis, especially in the age of big data [5,15,16].
Although the research above achieved good performance, it may suffer from the following two drawbacks. (1) The literature review shows that an important assumption in these intelligent fault diagnosis methods is that the labeled training data and the unlabeled testing data come from the same distribution [17]. However, this assumption fails for two main reasons [18]. Firstly, labeled fault signals are hard to obtain from some equipment. Secondly, an intelligent fault diagnosis algorithm trained with labeled data may fail to classify unlabeled data when the labeled and unlabeled data come from different machines. Thus, a distribution discrepancy exists between the source and target domains, which causes significant degradation in classification performance [17]. To handle this distribution discrepancy, DA techniques have been developed [19].
DA methods transfer knowledge from the source domain to the target domain by learning a domain-invariant feature subspace [17]. DA techniques have been successfully developed and applied in [20,21]. Reference [22] proposed a defect identification method for wind turbine blades based on defect semantic features with a transfer feature extractor. Reference [23] presented a novel domain adaptation model based on the geodesic flow kernel (GFK), strengthened feature extraction, and Z-score normalization. Aside from this, reference [24] proposed a feature-based transfer neural network to identify the health states of motor bearings and gearbox bearings. A transferable convolutional neural network (CNN) [25] was proposed for intelligent fault diagnosis of rotary machinery.
Nevertheless, the existing transfer learning diagnosis techniques with DA approaches focus on vector data. Therefore, when these approaches are applied to high-dimensional data, the data must be vectorized. Moreover, vectorization leads to problems such as high computational complexity.
To address these issues, a new method is used for tensor data representation. The idea of the proposed method is that an invariant tensor subspace is used to adapt the tensor representations [17].
Different from a vector subspace, a tensor subspace comprises a set of subspaces, each characterizing one mode separately [21]. The proposed technique performs mode-wise partial adaptation to alleviate the dimensionality issue. A joint optimization problem is then formulated that seeks such a tensor subspace and learns the alignment matrices [17]. The problem is solved via alternating minimization. TAISL has achieved great success in cross-domain visual recognition. However, there are no reports of TAISL for rotating machinery fault classification in the available literature.
(2) The literature review also shows that existing methods often assume that each training example is associated with a ground-truth label. Nevertheless, in many practical applications, one can only access a candidate label set for each training example, among which only one label is valid [6]. Therefore, partial label learning (PLL) has been proposed to deal with this kind of training example [26]. PLL has attracted increasing research attention, and many PLL methods have been proposed [27,28]. However, there are no reports of PLL for rotating machinery fault classification in the available literature.
Previous research on PLL is based on the assumption that candidate label sets are provided for all training samples. Nevertheless, in practical applications, this assumption rarely holds [6]. Some faults can be labeled with a candidate label set, but many faults have no label information at all.
In this setting, neither PLL nor semisupervised learning (SSL) alone can address the issue concerned. On the one hand, PLL ignores large numbers of unlabeled instances, although some of them could be very helpful. On the other hand, SSL assumes that the ground-truth single label is accessible for each labeled training example, which is not the case in our situation.
A new method named SSPLL is introduced into the field of bearing fault diagnosis. The key requirement is that the candidate label sets of partial label instances are disambiguated while the distribution information of unlabeled examples is exploited at the same time. In the proposed algorithm, an iterative label propagation procedure between partial label and unlabeled examples disambiguates the candidate label sets of the partial label instances and assigns valid labels to the unlabeled examples. Thus, a new approach is proposed for the classification of bearing faults with semisupervised partial label learning based on tensor representation. The main highlights of the proposed method are summarized as follows:
(1) To deal with the domain shift issue in tensor space, a novel DA method is proposed for bearing fault diagnosis based on tensor representation.
(2) To adapt the source domain and the target domain based on tensor representation, tensor transfer learning is introduced.
(3) To tackle the problem that a large number of instances are hard to label, so much of the data is left unlabeled, a new method named SSPLL is introduced into the field of intelligent fault diagnosis.
(4) To propagate label information from the source domain to the target domain, a weighted graph is established in this paper.
(5) To assist the iterative label propagation step, four normalized weight matrices are established, one for each of the four phases of label propagation.
The remainder of this paper is structured as follows: in Section 2, the basic theory of the proposed method is described. The proposed method is illustrated in Section 3. The developed method is validated in Section 4. In Section 5, the conclusions are drawn.

The Basic Theory of the Proposed Method
In this section, the theory of semisupervised partial label learning is introduced. In the original PLL setting, $Y = \mathbb{R}^t$ denotes the $t$-dimensional example space and $X = \{x_1, x_2, \ldots, x_n\}$ represents the label space with $n$ category labels. Formally, $C = \{(y_j, H_j) \mid 1 \le j \le m\}$ denotes the partial label training set, where $y_j \in Y$ is a $t$-dimensional feature vector $(y_{j1}, y_{j2}, \ldots, y_{jt})$ and $H_j \subseteq X$ is the associated candidate label set. Under the key assumption of PLL, the true label of $y_j$ is concealed in its candidate label set $H_j$ and thus cannot be accessed directly by the learning method [6].

Journal of Sensors
The training set contains partial label instances $D_p$ and unlabeled instances $D_u$. From the semisupervised partial label training set $C = D_p \cup D_u$, SSPLL induces a classification model $f: Y \to X$; for an unseen example, $f$ predicts its label. Please refer to [5,15,16] for the basic theory of the proposed method.
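The setting above can be made concrete with a small sketch of how partial label and unlabeled examples might be encoded before learning. The function name `init_confidences` and the uniform initialization are assumptions made for illustration (uniform confidences over the candidate set are a common starting point in PLL); they are not taken from the paper.

```python
import numpy as np

def init_confidences(candidate_sets, n_unlabeled, n_labels):
    """Build initial labeling-confidence matrices (illustrative assumption).

    candidate_sets: list of sets of label indices, one per partial label example.
    Returns (F_p, F_u, mask); every row of F_p and F_u sums to 1.
    """
    mask = np.zeros((len(candidate_sets), n_labels))
    for j, H_j in enumerate(candidate_sets):
        mask[j, sorted(H_j)] = 1.0
    # Partial label examples: uniform confidence over the candidate labels only.
    F_p = mask / mask.sum(axis=1, keepdims=True)
    # Unlabeled examples carry no label information: uniform over all labels.
    F_u = np.full((n_unlabeled, n_labels), 1.0 / n_labels)
    return F_p, F_u, mask

# Three partial label examples over four fault classes, plus two unlabeled examples.
candidate_sets = [{0, 2}, {1}, {0, 1, 3}]
F_p, F_u, mask = init_confidences(candidate_sets, n_unlabeled=2, n_labels=4)
```

The binary `mask` records which labels are candidates, so that later disambiguation steps can keep confidences inside each candidate set.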

The Proposed Method
The framework of the proposed technique is shown in Figure 1.

TAISL.
Please refer to [29] for a detailed discussion of TAISL. The domain adaptation and the domain shift based on tensor representation are illustrated in Figure 2.

A Scheme.
The domain adaptation problem is tackled by introducing an invariant subspace shared by the source domain S and the target domain T.
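To illustrate what a shared tensor subspace with one basis per mode looks like, the following sketch projects source and target tensors onto mode-wise orthonormal bases without vectorizing them. It is a simplified HOSVD-style stand-in, not TAISL itself: no alignment matrices are learned and no alternating minimization is performed. The sample-first array layout, the function names, and the ranks are assumptions chosen for the example.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Mode-n product T x_n M, where M has shape (new_dim, T.shape[mode])."""
    out = np.tensordot(M, T, axes=(1, mode))  # the new axis comes first
    return np.moveaxis(out, 0, mode)

def shared_mode_bases(source, target, ranks):
    """One orthonormal basis per feature mode, from the pooled unfoldings.

    source, target: arrays of shape (n_samples, d1, d2).
    ranks: (r1, r2), the subspace dimension kept for each feature mode.
    """
    pooled = np.concatenate([source, target], axis=0)
    bases = []
    for mode, r in zip((1, 2), ranks):
        U, _, _ = np.linalg.svd(unfold(pooled, mode), full_matrices=False)
        bases.append(U[:, :r])
    return bases

def project(X, bases):
    """Project every sample onto the mode-wise subspaces."""
    for mode, U in zip((1, 2), bases):
        X = mode_product(X, U.T, mode)
    return X

rng = np.random.default_rng(0)
source = rng.normal(size=(6, 8, 4))  # e.g. 6 samples, 8 x 4 features each
target = rng.normal(size=(5, 8, 4))
bases = shared_mode_bases(source, target, ranks=(3, 2))
Z_s, Z_t = project(source, bases), project(target, bases)
```

Both domains end up in the same low-dimensional tensor space (here 3 x 2 per sample), which is the property the invariant subspace is meant to provide; the real method additionally aligns the source data to that subspace.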
SSPLL is difficult because the learning approach must disambiguate the candidate label sets of partial label instances and exploit the distribution information of unlabeled data simultaneously. A simple scheme is to first disambiguate the candidate label sets of the partial label training instances, that is, to identify the valid single label in each candidate label set. The original problem is thereby reduced to a standard SSL problem, which can be tackled by an SSL method.
The label set disambiguation step and the unlabeled data exploitation step are completely separated in the scheme above, so the disambiguation accuracy cannot be improved by the unlabeled examples. For solving this key …
According to the graph G established above, a $t \times s$ weight matrix $W = [w_{j,i}]_{t \times s}$ can be specified, where $w_{j,i} \geq 0$ if $(x_i, x_j) \in E$ and $w_{j,i} = 0$ otherwise. In this paper, to capture the subtle influences between examples, the weight calculation approach of the IPAL method [27] is adopted, which determines the weights by solving the optimization problem in Equation (1). Equation (1) is a linear least-squares problem in the weight vector, which can be solved with a quadratic programming solver. Then, the weight matrix W is row-normalized as $W \leftarrow D^{-1} W$, where $D = \mathrm{diag}[d_1, d_2, \ldots, d_t]$ is a diagonal matrix with $d_j = \sum_{i=1}^{s} w_{j,i}$.
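The weight computation and row normalization can be sketched as follows. The sketch assumes that the objective has the usual IPAL form, i.e., each example is reconstructed from its k nearest neighbors by minimizing the squared reconstruction error under non-negativity constraints; the exact form of Equation (1) should be taken from [27]. A small projected-gradient routine stands in for the quadratic programming solver, and all function names are hypothetical.

```python
import numpy as np

def nonneg_least_squares(A, b, n_iter=500):
    """Minimize ||A w - b||^2 subject to w >= 0 by projected gradient descent."""
    w = np.zeros(A.shape[1])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        w = np.maximum(0.0, w - step * (A.T @ (A @ w - b)))
    return w

def weight_matrix(targets, sources, k):
    """Row j holds the weights with which the k nearest sources reconstruct targets[j].

    Assumes `targets` and `sources` are different sets; excluding an example
    from its own neighborhood is not handled here.
    """
    t, s = len(targets), len(sources)
    W = np.zeros((t, s))
    for j in range(t):
        dist = np.linalg.norm(sources - targets[j], axis=1)
        nn = np.argsort(dist)[:k]
        # Columns of the design matrix are the neighbor feature vectors.
        W[j, nn] = nonneg_least_squares(sources[nn].T, targets[j])
    return W

def row_normalize(W):
    """Compute D^{-1} W with d_j the j-th row sum; all-zero rows are left as zero."""
    d = W.sum(axis=1, keepdims=True)
    d[d == 0] = 1.0
    return W / d

rng = np.random.default_rng(0)
sources = rng.uniform(1.0, 2.0, size=(6, 5))  # s = 6 examples, 5 features
targets = rng.uniform(1.0, 2.0, size=(3, 5))  # t = 3 examples
k = 2
W = weight_matrix(targets, sources, k)
W_norm = row_normalize(W)
```

After normalization every row of the weight matrix sums to one, so multiplying it by a matrix of labeling confidences yields a weighted average of the neighbors' confidences.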

Iterative Label Propagation Algorithm.
Four normalized weight matrices are established to facilitate the iterative label propagation step, one for each of its four phases. Specifically, H = WGC(C_p, C_u, k) is used for label propagation from the source domain set C_p to the target domain set C_u. J = WGC(C_p, C_p, k) is used for label propagation from C_p to itself. V = WGC(C_u, C_u, k) is used for label propagation from C_u to itself. L = WGC(C_u, C_p, k) is used for label propagation from C_u to C_p. A detailed description of the algorithm is given in reference [6].
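A minimal sketch of how the four matrices could drive a joint propagation loop is given below. The update rule, the equal weighting of the two incoming phases, the mixing parameter `alpha`, and the matrix shapes are assumptions made for illustration; the actual algorithm is the one described in [6].

```python
import numpy as np

def propagate(F_p0, F_u0, H, J, V, L, mask, alpha=0.5, n_iter=50):
    """Jointly propagate labeling confidences (illustrative update rule).

    F_p0: (p, n) initial confidences of the partial label examples
    F_u0: (u, n) initial confidences of the unlabeled examples
    H: (u, p) C_p -> C_u    J: (p, p) C_p -> C_p
    V: (u, u) C_u -> C_u    L: (p, u) C_u -> C_p
    mask: (p, n) binary, 1 where a label belongs to the candidate set
    """
    F_p, F_u = F_p0.copy(), F_u0.copy()
    for _ in range(n_iter):
        # Partial label examples: neighbors' confidences mixed with the initial labeling.
        new_p = alpha * 0.5 * (J @ F_p + L @ F_u) + (1 - alpha) * F_p0
        new_p = new_p * mask  # confidences stay inside the candidate sets
        new_p /= new_p.sum(axis=1, keepdims=True)
        # Unlabeled examples: average of what arrives from C_p and from C_u.
        new_u = 0.5 * (H @ F_p + V @ F_u)
        new_u /= new_u.sum(axis=1, keepdims=True)
        F_p, F_u = new_p, new_u
    return F_p, F_u

# Toy case: two partial label examples, one unlabeled example, two classes.
mask = np.array([[1.0, 1.0], [0.0, 1.0]])  # example 0 is ambiguous, example 1 is not
F_p0 = mask / mask.sum(axis=1, keepdims=True)
F_u0 = np.array([[0.5, 0.5]])
H = np.array([[0.5, 0.5]])
J = np.array([[0.0, 1.0], [1.0, 0.0]])
V = np.array([[1.0]])
L = np.array([[1.0], [1.0]])
F_p, F_u = propagate(F_p0, F_u0, H, J, V, L, mask)
pred_p, pred_u = F_p.argmax(axis=1), F_u.argmax(axis=1)
```

In this toy case the unambiguous example pulls both the ambiguous partial label example and the unlabeled example toward class 1, which is the intended effect of propagating between the two kinds of weakly supervised data.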

Experimental Validation
In this work, to validate the proposed method, several compared methods are introduced. These approaches are GFK [30] (geodesic flow kernel), TJM [31] (transfer joint matching), JDA [32] (joint distribution adaptation), and TCA [33] (transfer component analysis). The results of the transfer fault detection experiments are shown in Figure 4, where the transfer results of the proposed method are compared with those of the four approaches. According to the comparison results, the proposed approach obtains the highest testing accuracy on the 12 transfer tasks among all the methods.

Dataset. By Case Western Reserve University [34] and
From Figure 4(a), the average testing accuracy of the proposed method is 95.6%, which is the highest among the five methods. Because JDA is not suited to tensor data, its average testing accuracy reaches only 35.7%, which is lower than that of the proposed method. The average testing accuracy of TJM is 34.7% and that of GFK is 31.8%. Because they cannot extract high-level features from the tensor samples of the target domain, these two techniques achieve poorer testing accuracies than the proposed method. The average accuracy of TCA is 37.8%, which is also lower than that of the proposed method.
When p = 0.20, the experimental results are presented in Figure 4(b). From Figure 4(b), the average testing accuracy of the proposed method is 96.8%, which is the highest among the five methods. Because JDA is not suited to tensor data, its average accuracy is 37.3%. The average testing accuracy of TJM is 36.8% and that of GFK is 33.9%. The average accuracy of TCA is 39.7%.
When p = 0.30, the experimental results are presented in Figure 4(c). From Figure 4(c), the average testing accuracy of the proposed method is 97.9%, which is the highest among the five methods. Because JDA is not suited to tensor data, its average accuracy is 39.2%. The testing accuracy of TJM is 38.4%, and that of GFK is 36.8%. The average accuracy of TCA is 41.8%.
When p = 0.40, the experimental results are presented in Figure 4(d). From Figure 4(d), the average testing accuracy of the proposed method is 98.9%, which is the highest among the five methods. Because JDA is not suited to tensor data, its average testing accuracy is 41.4%. The average testing accuracy of TJM is 41.9%, and that of GFK is 39.4%. The average accuracy of TCA is 44.8%.
According to the experimental results, the presented technique can accurately classify the 12 transfer tasks in the target domain. The results clearly show that the TRSSPLL technique can identify fault categories more accurately and effectively than the other methods.

Transfer Fault Identification Results on the GUET Dataset.
In this section, the data are acquired by Guilin University of … The experimental results are shown in Figure 7. Compared with the other approaches, the presented approach achieves the best result. This further demonstrates the effectiveness and superiority of the proposed method. In addition, transfer diagnosis tasks can benefit from DA algorithms. The GUET data demonstrate the performance of the presented approach.
When p = 0.10, the results of the cross-domain fault detection methods are shown in Figure 7(a). From Figure 7(a), it can be seen that the average test accuracy of the proposed algorithm reaches 96.4%, which is the highest of the five methods. The average test accuracies of JDA, TJM, GFK, and TCA are 35.7%, 43%, 49%, and 44.2%, respectively.
When p = 0.20, the results of the cross-domain fault detection methods are shown in Figure 7(b). It can be seen that the average test accuracy of the proposed algorithm reaches 97.9%, which is the highest of the five methods. The average test accuracies of JDA, TJM, GFK, and TCA are 37.5%, 45%, 50.7%, and 46.4%, respectively.
When p = 0.30, the results of the transfer fault diagnosis methods are shown in Figure 7(c). The average accuracy of the proposed algorithm is 98.6%, which is the highest of the five methods. The average test accuracies of JDA, TJM, GFK, and TCA are 39.5%, 47.4%, 52.8%, and 48.6%, respectively.
When p = 0.40, the results of the cross-domain fault detection methods are shown in Figure 7(d). The average test accuracy of the proposed algorithm reaches 99.4%, which is the highest of the five methods. The average test accuracies of JDA, TJM, GFK, and TCA are 41.6%, 49.3%, 54.6%, and 50.8%, respectively.
According to the experimental results, the presented approach can accurately identify the six transfer tasks in the target domain. The results clearly show that the proposed technique can identify fault categories more accurately and effectively than the other methods.
The proposed method is validated on different experimental data. The purpose of this paper is to deal with tensor data in the source and target domains. JDA, TJM, GFK, and TCA are traditional transfer algorithms. It is … [17,39] are proposed for tackling domain shift problems. Nevertheless, these approaches cannot handle tensor data. Aside from this, unlabeled data are not considered by these fault diagnosis techniques. Thus, to deal with transfer, tensor data, and unlabeled data, a new tensor transfer approach is proposed in this paper for intelligent fault diagnosis of rotating machinery with semisupervised partial label learning.
In this paper, the training and testing data are acquired in the source and target domains separately, so the transfer fault detection experiment is more difficult than existing cross-domain tasks. The proposed technique achieves testing accuracies of 98.9% and 99.4% in the two transfer experiments, respectively. Therefore, according to the results, the proposed method is competitive.

Conclusion
Since some information is hard to represent in vector form, a new DA method based on tensor representation is applied for the first time in the field of intelligent fault diagnosis to adapt the source and target domain tensor data directly, without vectorization. Then, SSPLL is introduced for training sets that contain two kinds of weak supervision, i.e., partial label data and unlabeled data. An iterative label propagation method is introduced, which processes the two kinds of weakly supervised data simultaneously by jointly propagating labels between partial label and unlabeled instances, and derives a good label assignment. The employed approach achieves higher classification accuracy for bearing health states than vector-based representation algorithms. Aside from this, the experimental results verify that the presented approach is superior to methods that consider only one kind of weak supervision. In future work, the model will be applied to large-scale data, and weakly supervised data will also be considered in dynamic environments. In addition, a more powerful invariant tensor subspace needs to be learned in further research.

Data Availability
The data used to support the findings of this study can be obtained through the first author's email.

Conflicts of Interest
The authors declare that they have no conflicts of interest.