Fault Diagnosis Method of Check Valve Based on Multikernel Cost-Sensitive Extreme Learning Machine

The check valve is one of the most important and most easily damaged components of the high pressure diaphragm pump, which is a typical representative of reciprocating machinery. To ensure the normal operation of the pump, it is necessary to monitor its running state and diagnose faults. However, in the fault diagnosis of the check valve, classification models with a single kernel function cannot fully interpret the classification decision function, and the unreasonable assumption of equal diagnostic costs has a significant impact on classification results. Therefore, the multikernel function and a cost-sensitive mechanism are introduced in this paper to construct a fault diagnosis model of the check valve based on the multikernel cost-sensitive extreme learning machine (MKL-CS-ELM). Comparative tests on the check valve of a high pressure diaphragm pump show that MKL-CS-ELM obtains comparable or slightly better performance than ELM, CS-ELM, MKL-ELM, and the multikernel cost-sensitive support vector machine (MKL-CS-SVM). At the same time, the presented method obtains very high accuracy on imbalanced datasets, effectively overcomes the weakness of diagnostic cost equalization, and improves the interpretability and reliability of the decision function of the classification model. It is therefore more suitable for practical application.


Introduction
The high pressure diaphragm pump is the most important equipment for high concentration slurry pipeline transportation. Its working condition is directly related to whether the pump can be restarted after stopping and whether accelerated flow will occur in batch transportation. The check valve is the core component of the high pressure diaphragm pump and also the most easily damaged one. To ensure the normal operation of the pump, it is necessary to monitor its running state and diagnose faults [1]. So, research on condition monitoring and fault diagnosis of the high pressure diaphragm pump has important practical significance for the development of slurry pipeline transportation.
However, the fault characteristics of reciprocating machinery are difficult to extract because of its complex structure, multiple excitation sources, unstable operation, and so on [2]. In order to carry out condition monitoring and fault diagnosis of reciprocating machinery effectively, both domestic and foreign scholars have introduced the fault diagnosis methods of rotating machinery into the fault diagnosis of reciprocating machinery and obtained many valuable research results [3][4][5]. Ogle and Morrison [6] analyzed a failure accident of a diaphragm pump and found that environmental stress cracking of the diaphragm is one of the main reasons for diaphragm pump failure. These results have provided effective theoretical support for accident prevention and pipeline maintenance and greatly reduced maintenance costs. In recent years, wavelet transform and Fourier transform, information entropy, neural networks, bispectrum analysis, feature fusion, evidence theory, chaos theory, fractal theory, decision trees, and SVM have been widely applied to the fault diagnosis of reciprocating machinery, and many significant research achievements have been obtained [7][8][9][10][11][12][13][14][15][16]. Yet, compared with the fault diagnosis of rotating machinery, there is still much to be improved. (1) The data sample size of reciprocating machinery is huge, and a great deal of multisource heterogeneous information is held within the data due to the complex structure, multiple excitation sources, multiple wearing parts, signal coupling, and strong nonlinearity of reciprocating machinery. It is not reasonable to use a single kernel function (such as the radial basis function kernel or the polynomial kernel) to process all the samples, because a single kernel cannot explain the signal completely. Consequently, combining multiple kernel functions is the natural choice for achieving better processing results [17][18][19].
(2) Fault diagnosis models cannot achieve ideal classification results when the fault diagnosis datasets are imbalanced (the fault samples are far fewer than the normal samples) and the diagnostic costs are unequal. For example, the cost of "a normal state identified as a fault state" differs greatly from that of "a fault state identified as a normal state": the former only results in an unnecessary inspection and repair for the operator, whereas the latter may result in major safety incidents. So, the deficiencies of the minimum classification error and equal diagnostic cost hypotheses in existing classification models need to be overcome [20].
(3) At present, the BP neural network and SVM are relatively mature classification learning methods and play an important role in the fault diagnosis of reciprocating machinery. However, the BP neural network easily falls into local minima and may fail to converge. Meanwhile, the computational load of SVM optimization increases with the number of parameters to be optimized and with the data sample size, and many parameters must be tuned to obtain an optimal SVM classification model. Therefore, exploring new classification methods that train quickly, have fewer parameters to optimize, and obtain a globally optimal solution is one of the hot topics [21].
In recent years, ELM has been widely used in machine learning and related fields because of its effectiveness, high speed, ease of implementation, and support for multiclassification [22][23][24]. Moreover, modified ELM models can validly solve the problem of imbalanced samples and obtain better performance [25][26][27][28]. Therefore, modified ELM methods have become the main research direction. On the one hand, the transfer function of the original hidden layer based on random feature mapping is replaced by more efficient transfer functions. The sigmoid function and the radial basis function (RBF) [29][30][31][32], which are widely used in neural networks, have been introduced into ELM and have yielded better experimental results. On the other hand, how to improve the classification performance of ELM under multisource heterogeneous data and information fusion is also one of the latest research trends of modified ELM classification models. Liu et al. [33] proposed the multikernel ELM (MKL-ELM), which combines ELM with multikernel learning under $\ell_q$ norm constraints. Compared with traditional ELM, MKL-ELM can address the selection and optimization of the multikernel function as well as the application of multisource heterogeneous data processing and information fusion methods in classification. However, [33] does not consider the impact of classification cost on the classification model. So, the cost-sensitive mechanism was introduced into conventional ELM [34], and a new cost-sensitive classification model was proposed to overcome the drawback of diagnostic cost equalization. However, that model is not very effective in dealing with multisource heterogeneous data and information fusion because it is restricted to a single, fixed kernel during subsequent processing.
With the intensive study of ELM theory and application, MKL-ELM and CS-ELM have greatly promoted the development of ELM. But there is still plenty of room for improvement and extension. This is typically shown in two aspects: (1) how to select the most appropriate cost-sensitive method; (2) how to construct a more general multikernel function which can be widely used in the fault diagnosis field. Based on the points discussed above, the multikernel function and the cost-sensitive mechanism are introduced into ELM in this paper to construct a fault diagnosis model based on MKL-CS-ELM for the check valve of the high pressure diaphragm pump.
This paper has the following main contributions. First, the advantages, shortcomings, and application ranges of oversampling, undersampling, and threshold adjusting are analyzed to provide theoretical support for the choice of cost-sensitive methods. Second, a new fault diagnosis method based on MKL-CS-ELM is proposed to diagnose the check valve faults of the high pressure diaphragm pump. Third, comparison experiments of ELM, CS-ELM, MKL-ELM, MKL-CS-SVM, and MKL-CS-ELM are carried out, and the effectiveness of the proposed MKL-CS-ELM method is verified.
The remainder of this paper is organized as follows. Section 2 describes the fundamental theory of ELM, MKL-ELM, cost-sensitive learning, and the evaluation indices of the classification model. Section 3 presents the implementation process of the proposed method in detail. Section 4 describes the experimental process. Section 5 analyzes the experimental results. Section 6 offers the discussion and conclusion.

Extreme Learning Machine (ELM).
From the classification optimization point of view, the principle of ELM is similar to that of SVM and LSSVM: the goal is to obtain the minimum training error and the maximum classification margin (generalization ability). So, on the basis of the SVM principle, the optimization model of ELM is described in (1) [35]. In (1), $\beta \in \mathbb{R}^{|\phi(\cdot)| \times T}$ stands for the connecting weights between the hidden layer and the output layer, $\|\cdot\|_F$ is the Frobenius norm, $C$ represents the regularization parameter or penalty factor which balances the minimum training error against the maximum classification margin, $\xi_i = [\xi_{1i}, \xi_{2i}, \ldots, \xi_{Ti}]^{\top}$ $(1 \le i \le n)$ is the $i$th column of the error matrix $\xi \in \mathbb{R}^{T \times n}$, $\beta^{\top}$ represents the transpose of matrix $\beta$ (similarly hereinafter), $\phi(x_i)$ is the output function of the hidden layer for the input $x_i$, $\{(x_i, y_i)\}_{i=1}^{n}$ represents the given training set, and $y_i = [0, \ldots, 0, 1_t, 0, \ldots, 0]^{\top} \in \{0, 1\}^{T}$ represents the case that sample $x_i$ belongs to the classification label $t$. $n$ and $T$ are the numbers of training samples and categories, respectively. According to the KKT (Karush-Kuhn-Tucker) theory, the analytical solution of (1) can be calculated; the detailed solution process can be found in [36]. The output weight $\beta^{*}$ is solved using the Moore-Penrose generalized inverse $\Phi^{+}$, as given in (2). In (2), the output matrix of the hidden layer is $\Phi = [\phi(x_1), \ldots, \phi(x_n)]^{\top} \in \mathbb{R}^{n \times |\phi(\cdot)|}$, the output result of the ELM classification model is $Y = [y_1, \ldots, y_n] \in \mathbb{R}^{T \times n}$, and $I$ is the identity matrix.
For a given new sample $x$, the output decision function $f(x)$ of the ELM is given in (3).
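To make the training step concrete, the following minimal sketch implements a regularized ELM with a random sigmoid hidden layer. It assumes the commonly used closed-form solution $\beta^{*} = (I/C + \Phi^{\top}\Phi)^{-1}\Phi^{\top}Y^{\top}$; whether this is exactly the form of (2) is an assumption, and the function names, the toy data, and the hyperparameter values are illustrative only.

```python
import numpy as np

def hidden_output(X, W, b):
    """Sigmoid hidden-layer output phi(x) for every row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, labels, n_classes, n_hidden=30, C=100.0, seed=0):
    """Fit a regularized ELM; returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases are drawn once and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    Phi = hidden_output(X, W, b)              # shape (n, n_hidden)
    Y = np.eye(n_classes)[labels]             # one-hot rows, shape (n, T)
    # Assumed closed form: beta = (I/C + Phi^T Phi)^(-1) Phi^T Y
    A = np.eye(n_hidden) / C + Phi.T @ Phi
    beta = np.linalg.solve(A, Phi.T @ Y)      # shape (n_hidden, T)
    return W, b, beta

def predict_elm(model, X):
    """Predicted class = index of the largest output node."""
    W, b, beta = model
    return np.argmax(hidden_output(X, W, b) @ beta, axis=1)

# Toy data (hypothetical): two well separated 2-D clusters.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-3.0, 0.5, size=(20, 2)),
               data_rng.normal(3.0, 0.5, size=(20, 2))])
labels = np.array([0] * 20 + [1] * 20)

model = train_elm(X, labels, n_classes=2)
acc = float(np.mean(predict_elm(model, X) == labels))
print(f"training accuracy: {acc:.2f}")
```

Only the output weights are learned, and they come from a single linear solve, which is the source of the fast training speed attributed to ELM.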

Multikernel Extreme Learning Machine (MKL-ELM).
The multikernel function is commonly defined as a linear combination of basic kernel functions. So, the combination coefficients of the optimal kernel function and the maximum margin of ELM are the core of MKL-ELM [33]. A typical form of the multikernel function is shown in (4), $k(\cdot, \cdot; \gamma) = \sum_{p=1}^{m} \gamma_p k_p(\cdot, \cdot)$. In (4), $\{k_p(\cdot, \cdot)\}_{p=1}^{m}$ represents $m$ basic kernel functions. For the convenience of processing and computing, the combination coefficients $\gamma_p$ of the basic kernel functions satisfy the restricting condition $\sum_{p=1}^{m} \gamma_p = 1$. The feature mapping of (4) is shown in (5). In (5), $\phi(\cdot; \gamma)$ and $\{\phi_p(\cdot)\}_{p=1}^{m}$ are the high dimensional feature mappings of $k(\cdot, \cdot; \gamma)$ and $\{k_p(\cdot, \cdot)\}_{p=1}^{m}$, respectively. In the construction of the multikernel function, the RBF kernel function, the Laplace kernel function, and the inverse-distance kernel function are selected as basic kernel functions.
In order to ensure that the final solution of the multikernel optimization problem is bounded and that the combination kernel function is symmetric positive semidefinite, the $\ell_q$ norm is used as the constraint condition on the combination coefficient $\gamma$ of the multikernel function. Different values of $q$ in the $\ell_q$ norm represent different constraint norms. According to the theoretical basis of multikernel SVM [37, 38] and (5), the theoretical expression of conventional MKL-ELM is described in (7). In (7), the connecting weighting coefficient $\beta_p$ is the connecting weight of the $p$th basic kernel function.
Substituting (5) into (7), the expression of MKL-ELM is obtained and shown in (9). Equation (9) is similar to the expression of ELM. So, the Lagrangian function of MKL-ELM can be calculated as in (10). In (10), $\alpha \in \mathbb{R}^{T \times n}$ and $\tau$ are the Lagrangian multipliers. Then, the KKT optimality conditions are calculated and shown in (11). The matrix form of (11) is expressed in (12). In (12), the compound kernel function $k(\cdot, \cdot; \gamma)$ represents $k(x_i, x_j; \gamma) = \phi(x_i; \gamma)^{\top} \phi(x_j; \gamma) = \sum_{p=1}^{m} \gamma_p k_p(x_i, x_j)$. Then, the solution of $\alpha$ is shown in (13). At the same time, the combination coefficient $\gamma$ of the multikernel function can be calculated by the derivative in (14). The sparse MKL-ELM constrained by the $\ell_1$ norm is given by $q = 1$ in (14). The optimal parameters $\alpha^{*}$ and $\gamma^{*}$ are calculated by iterative optimization methods. Now, for a given sample $x$, the output decision function $f(x)$ of MKL-ELM is given in (15). In (15), the component of
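The kernel combination described above can be sketched as follows. The snippet builds the RBF, Laplace, and inverse-distance base kernels, combines them with fixed weights that sum to 1, and solves a kernel ELM of the form $\alpha = (I/C + K)^{-1}Y$. Several points are assumptions rather than the paper's definitions: the exact form of the inverse-distance kernel, the shared width `sigma`, the uniform weights, and the closed-form solve. The iterative update of the combination coefficients is not shown.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between rows of A and rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.sqrt(np.maximum(d2, 0.0))   # clip tiny negative round-off

def base_kernels(A, B, sigma=1.0):
    """Three base kernel matrices; the inverse-distance form is assumed."""
    d = pairwise_dist(A, B)
    rbf = np.exp(-d ** 2 / (2.0 * sigma ** 2))
    laplace = np.exp(-d / sigma)
    inverse_distance = 1.0 / (1.0 + d / sigma)
    return [rbf, laplace, inverse_distance]

def combined_kernel(A, B, gamma, sigma=1.0):
    """Weighted sum of the base kernels with coefficients gamma."""
    return sum(g * K for g, K in zip(gamma, base_kernels(A, B, sigma)))

# Toy data (hypothetical): two well separated 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3.0, 0.5, size=(20, 2)),
               rng.normal(3.0, 0.5, size=(20, 2))])
labels = np.array([0] * 20 + [1] * 20)
Y = np.eye(2)[labels]                      # one-hot rows, shape (n, T)

gamma = np.array([1.0, 1.0, 1.0]) / 3.0    # fixed uniform weights, sum to 1
C = 100.0
K = combined_kernel(X, X, gamma)
alpha = np.linalg.solve(np.eye(len(X)) / C + K, Y)   # assumed closed form

# Decision values for samples Z are k(Z, X) @ alpha; here Z = X.
pred = np.argmax(combined_kernel(X, X, gamma) @ alpha, axis=1)
acc = float(np.mean(pred == labels))
print(f"training accuracy: {acc:.2f}")
```

Because each base kernel equals 1 on identical inputs and the weights sum to 1, the combined kernel also has a unit diagonal, and a nonnegative weighted sum of positive semidefinite kernels stays positive semidefinite.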

Cost-Sensitive Methods.
The cost-sensitive methods largely fall into three groups [39]: constructing the cost-sensitive classification model directly, establishing the cost-sensitive classification model using Bayesian risk theory, and building the cost-sensitive classification model by changing the sample distribution. The latter two groups are the focus of this section [40].
Assuming that the number of class labels in the training set is $T$ and the number of training samples in category $i$ is $N_i$, the classification cost is defined as follows.
In oversampling and undersampling, the number of samples $N_k^{*}$ is defined by (16), where $N_k^{*}$ is the number of samples of category $k$ after resampling. $\lambda$ represents the reference category of oversampling and undersampling, which is calculated by (17) and (18), respectively. The realization principle of threshold adjusting can be interpreted as in (19). In (19), $O_i$ ($i \in \{1, \ldots, T\}$) is the actual output of the $i$th output node of ELM, and it satisfies the associated constraint condition. At the same time, the output of threshold adjusting also satisfies a corresponding constraint condition.
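The three cost-sensitive treatments can be illustrated with a small sketch. It follows the rescaling rules commonly used for cost-sensitive neural networks (resampled size proportional to the class cost relative to a reference class, and outputs multiplied by the class cost and renormalized). Whether (16) to (19) take exactly this form is an assumption, and the cost matrix and the output values below are hypothetical; only the class counts of 70 and 20 come from the experiments.

```python
import math

def class_costs(cost):
    """Total cost of each class: Cost[i] = sum over j of Cost[i][j]."""
    return [sum(row) for row in cost]

def resample_sizes(cost, counts, mode):
    """Target sample count per class for mode 'over' or 'under' (assumed rule)."""
    ci = class_costs(cost)
    c_min, n_min = min(ci), min(counts)
    ratio = [ci[j] / c_min * n_min / counts[j] for j in range(len(counts))]
    # Reference class: smallest ratio for oversampling, largest for undersampling.
    pick = min if mode == "over" else max
    lam = pick(range(len(counts)), key=lambda j: ratio[j])
    return [math.floor(ci[k] * counts[lam] / ci[lam]) for k in range(len(counts))]

def adjust_outputs(outputs, cost):
    """Threshold adjusting: scale each output by its class cost, renormalize."""
    raw = [o * c for o, c in zip(outputs, class_costs(cost))]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical binary cost matrix: row = true class, column = predicted class.
# Class 0 = normal (70 samples), class 1 = fault (20 samples), fault cost 2.5.
cost = [[0, 1], [2.5, 0]]
counts = [70, 20]

over = resample_sizes(cost, counts, "over")     # fault class is duplicated
under = resample_sizes(cost, counts, "under")   # normal class is reduced
adjusted = adjust_outputs([0.6, 0.4], cost)     # decision moves toward fault
print(over, under, adjusted)
```

Under these assumptions undersampling leaves very few normal samples, which is consistent with the loss of accuracy reported for undersampling when the dataset is small.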

The Evaluation Indicators of Classification Model.
The evaluation indicators of binary classification and multiclassification are introduced in this section to validate the effectiveness of the proposed method.

The Cost-Sensitive Evaluation Indicators of Binary Classification.
In binary imbalanced learning, the cost matrix is shown in Table 1. It is generally recognized that the cost of correct classification is zero for both the positive and the negative class.
Based on Table 1, the cost-sensitive evaluation indicators of binary classification are defined as follows.

The Cost-Sensitive Evaluation Indicator of Multiclassification.
The cost-sensitive evaluation indicator of multiclassification is more complicated than that of binary classification. The robustness indicator $R_i$ referred to in [41] is introduced to describe the classification performance in multiclassification.
The robustness indicator $R_i$ is calculated by (23). In (23), $\mathrm{AVCost}_i$ is the average cost of method $i$, and $\max_i \mathrm{AVCost}_i$ represents the maximum average cost among the designed methods. The lower the robustness indicator $R_i$, the better the robust performance of the method.
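A minimal sketch of the robustness indicator follows, assuming it is the ratio of a method's average cost to the largest average cost among the compared methods, which is what the description of (23) suggests. The average-cost values are made-up numbers for illustration, not results from the experiments.

```python
def robustness(avg_costs):
    """Ratio of each method's average cost to the worst (largest) average cost."""
    worst = max(avg_costs.values())
    return {name: cost / worst for name, cost in avg_costs.items()}

# Hypothetical average costs of three cost-sensitive treatments.
avg_costs = {
    "oversampling": 0.20,
    "undersampling": 0.50,
    "threshold adjusting": 0.25,
}
scores = robustness(avg_costs)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

The method with the largest average cost always scores 1, and values closer to 0 indicate better robust performance.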

Classification Method of Imbalance Sample Distribution Based on MKL-CS-ELM
The main procedure of the proposed MKL-CS-ELM method involves data preprocessing (data normalization and feature extraction), construction of the multikernel function, and cost-sensitive learning. The brief process of MKL-CS-ELM is shown in Figure 1. The detailed process of the proposed method is described in Algorithms 1 and 2. Algorithm 1 gives the oversampling process (the principle of undersampling is similar), and Algorithm 2 gives the implementation of the threshold adjusting method.
The check valve completes one feeding and discharging process in every stroke of the diaphragm pump. Assuming that the stroke rate of the diaphragm pump is 50 r/min, the inlet and outlet check valves each perform 72000 reciprocating actions during one day of normal operation. Therefore, the check valve is a core component in frequent motion in the diaphragm pump, and this frequent motion is one of the most important reasons for check valve failure. The high pressure diaphragm pump and a failed check valve for mineral slurry pipeline transportation with solid-liquid two-phase flow are shown in Figure 2.
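As a quick check of the daily count quoted above, and assuming one reciprocating action of each check valve per pump stroke (the assumption implied by the text):

```latex
\[
50\ \frac{\text{strokes}}{\text{min}} \times 60\ \frac{\text{min}}{\text{h}} \times 24\ \frac{\text{h}}{\text{day}} = 72\,000\ \frac{\text{actions}}{\text{day}}
\]
```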

Experimental Description
In Figure 2, the check valve of the high pressure diaphragm pump is a cone valve, and its simplified structure is shown in Figure 3. The "spool-spring" assembly forms a weakly damped oscillation system. There are two reasons for the vibration of the system: one is an external factor (resonance); the other is its own characteristics. When the frequency of the external excitation source is an integral multiple of the natural frequency of the valve system, resonance of the whole system will occur during operation. So, the different running states of the check valve can be effectively judged by analyzing its vibration signal. A three-cylinder diaphragm pump includes 3 pairs of check valves, that is, 3 inlet check valves and 3 outlet check valves. So, in the process of data acquisition, six PCB 352C33 accelerometers are installed on the check valve housings to collect vibration data through a PXI-3342. The data sampling frequency $f_s$ is 2560 Hz and the number of data points $N$ is 20480.
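The acquisition settings fix the record length and the spectral resolution. The short derivation below assumes that 20480 is the number of points in one record; the symbol names $T_{\mathrm{rec}}$, $\Delta f$, and $f_{\mathrm{Nyq}}$ are introduced here for illustration.

```latex
\[
T_{\mathrm{rec}} = \frac{N}{f_s} = \frac{20480}{2560\ \mathrm{Hz}} = 8\ \mathrm{s}, \qquad
\Delta f = \frac{f_s}{N} = \frac{2560\ \mathrm{Hz}}{20480} = 0.125\ \mathrm{Hz}, \qquad
f_{\mathrm{Nyq}} = \frac{f_s}{2} = 1280\ \mathrm{Hz}
\]
```

At a stroke rate of 50 r/min, an 8 s record would cover between six and seven pump strokes.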

Experimental Setting.
The data attributes of the check valve and the classification information are defined in Table 2.
Based on the data characteristics in Table 2, three kinds of cost matrices are introduced and defined as follows [42].

The Feature Extraction of Wavelet Packet Energy Entropy.
Figure 5 shows the time and frequency waveforms of the vibration signal of the check valve under 3 different operating conditions: normal condition (NC), stuck valve fault (NK), and abrasion fault (NM). From the time domain and frequency domain waveforms, it can be seen that an abnormality of the check valve has occurred, but its cause or category cannot be determined. In order to realize automatic identification of the different running states of the check valve, it is necessary to extract effective characteristics of the running state and then construct the state identification model. The feature extraction in this paper makes full use of the advantages of the wavelet packet and of entropy. The third-layer wavelet packet energy distribution coefficients and the energy entropy are extracted as the characteristic parameters of the subsequent classification model [42]. The selection of the feature extraction method is based on the following considerations.
(1) With the wavelet packet technique, the vibration signal of the check valve can be mapped onto wavelet basis functions without information loss, and the technique has superior ability in the localized analysis of nonstationary signals.
(2) Entropy is introduced to depict the operation state characteristics of the check valve, mainly because the more disordered the system is, the greater the entropy becomes. Sensitive and transient features can then be extracted to describe the operation state of the check valve.
The steps of feature extraction are listed below.
(1) Signal decomposition and reconstruction: the vibration signal of the check valve is analyzed by a three-layer wavelet packet transform to get the wavelet coefficients of the third-layer decomposition. In this paper, the "db10" wavelet is chosen as the wavelet basis function, mainly because it can well reflect the sensitive and transient features of the vibration signal of the check valve.
(2) Extraction of the feature vector: the wavelet packet energy distribution coefficients $p_{3j}$ of the reconstructed signals of the third-layer wavelet packet coefficients and the energy entropy $H$ compose the feature vector $[p_{30}, p_{31}, \ldots, p_{37}, H]$. $p_{3j}$ and $H$ are calculated by (27) and (28), where the number of component signals is 8 and $E(S_{3j})$ represents the energy of the reconstructed signal of the third-layer wavelet coefficients.
According to the definition of feature extraction in (27) and (28), the feature vectors of the check valve can be calculated. Because of limited space, only part of the features are shown in Table 3. Comparing the normal check valve with the faulty check valve, the operating conditions can be easily distinguished based on the wavelet packet energy distribution coefficients and the energy entropy. This shows that the feature extraction method based on wavelet packet energy entropy is effective and reliable.
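The feature vector can be sketched as follows, assuming the usual definitions: the energy of each third-layer band is the sum of squares of its reconstructed signal, the distribution coefficient is that energy divided by the total energy, and the energy entropy is the Shannon entropy of the coefficients. The natural logarithm and the function name are assumptions, and the wavelet packet decomposition itself (three layers, db10) is taken as already done, so the eight band signals are the input.

```python
import math

def energy_entropy_features(band_signals):
    """Return [p_0, ..., p_7, H] from the reconstructed band signals."""
    # Band energy: sum of squared samples of each reconstructed signal.
    energies = [sum(v * v for v in signal) for signal in band_signals]
    total = sum(energies)
    # Energy distribution coefficients (they sum to 1).
    p = [e / total for e in energies]
    # Energy entropy; empty bands contribute nothing (0 * log 0 taken as 0).
    entropy = -sum(pj * math.log(pj) for pj in p if pj > 0)
    return p + [entropy]

# Two illustrative extremes (synthetic signals, not measured data).
uniform = energy_entropy_features([[1.0, -1.0]] * 8)
concentrated = energy_entropy_features([[1.0, 1.0]] + [[0.0, 0.0]] * 7)
print(uniform[-1], concentrated[-1])
```

Energy spread evenly over the eight bands gives the maximum entropy $\ln 8$, while energy concentrated in a single band gives zero, which matches the statement that a more disordered system has greater entropy.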

Discussion of Experimental Results
Based on the definition of the cost functions, the diagnosis cost matrix is constructed and shown in Table 4. The diagnostic cost ranges from 1 to 5 and increases by a fixed step length (usually 0.5) in the experiments. In the experiments, 110 data samples are collected, including 70 NC data samples, 20 NK data samples, and 20 NM data samples. The data samples of the check valve are processed by combining the cost matrix shown in Table 4 with the theory of oversampling, undersampling, and threshold adjusting described in Section 2.3. Then, the fault diagnosis classification models of ELM, CS-ELM, MKL-ELM, MKL-CS-ELM, and MKL-CS-SVM are constructed. The experimental results of binary classification and multiclassification for the check valve are elaborated in detail as follows.
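The experimental grid described above can be written down in a few lines. The sketch enumerates the diagnostic cost values from 1 to 5 in steps of 0.5 and computes training and test sizes for a 60/40 split. The 60/40 ratio is inferred from the sample counts reported for the experiments (54/36 out of 90 and 66/44 out of 110), and the function names are illustrative.

```python
def cost_grid(low=1.0, high=5.0, step=0.5):
    """Diagnostic cost values from low to high inclusive."""
    n_steps = int(round((high - low) / step))
    return [low + k * step for k in range(n_steps + 1)]

def split_sizes(n_samples, train_fraction=0.6):
    """Number of training and test samples for a given split fraction."""
    n_train = round(n_samples * train_fraction)
    return n_train, n_samples - n_train

costs = cost_grid()
binary_split = split_sizes(70 + 20)        # NC and NK samples
multi_split = split_sizes(70 + 20 + 20)    # NC, NK, and NM samples
print(costs, binary_split, multi_split)
```

The grid contains nine cost values, so each cost-sensitive model is evaluated nine times per resampling or threshold adjusting scheme.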

The Experimental Results Analysis of Binary Classification for Check Valve.
In the binary classification experiments, the datasets of NC and NK are selected as the test data. The cost matrix is consistent with Table 4. The experimental results are described as follows.

The Experimental Results of Oversampling.
The data sample distribution of oversampling is calculated according to the cost matrix in Table 4, (16), and (17) and is shown in Table 5. Of the 90 collected data samples, 54 are selected as training samples and the remaining 36 as test samples. The recognition results of the classification models are presented in Figure 6.
From Figure 6, the following conclusions can be drawn. (1) In the cost-sensitive processing of oversampling, the AP of CS-ELM, MKL-CS-SVM, and MKL-CS-ELM increases at first and then decreases with increasing cost, the AN increases at first and then reaches a steady state with increasing cost, and the global classification accuracy (Accuracy) increases at first and then decreases with increasing cost. (2) The recognition results of the ELM and MKL-ELM methods do not change with increasing cost, mainly because the data distribution used by ELM and MKL-ELM does not change either. These two methods therefore serve only as a baseline for comparison and are independent of the diagnostic cost. (3) In the CS-ELM, MKL-CS-SVM, and MKL-CS-ELM methods, the optimal recognition effect is obtained when the diagnostic cost is 2.5. (4) Compared with the ELM and MKL-ELM methods without diagnostic cost, the diagnostic cost improves the accuracy and reliability of the classification models in the CS-ELM, MKL-CS-SVM, and MKL-CS-ELM methods. (5) The experimental results also show that the multikernel learning mechanism helps to further improve the diagnostic performance of the classification models. At the same time, Figure 6 also shows that CS-ELM and MKL-CS-ELM are more sensitive to the cost than MKL-CS-SVM.

The Experimental Results of Undersampling.
The data sample distribution of undersampling is calculated based on Table 4, (16), and (18). Then, the recognition results of the above-mentioned classification models are displayed in Figure 7.
As shown in Figure 7, the conclusions are similar to those obtained for oversampling. Moreover, the classification performance of the mentioned classification models for the check valve is slightly poorer in undersampling. The experimental results indicate that this is mostly due to the lack of enough check valve samples and to the extreme imbalance of the sample distribution caused by the undersampling processing. At the same time, an interesting phenomenon can be observed: when the sample size is very small, the classification results of MKL-CS-ELM are slightly worse than those of the other classification models. This is probably an indirect argument that the training process of MKL-CS-ELM also needs sufficient samples, since MKL-CS-ELM is in essence a single-hidden layer feedforward neural network. The presented results also indirectly demonstrate the superiority of SVM in classification with smaller samples.
In Figure 8, the classification models of CS-ELM, MKL-CS-SVM, and MKL-CS-ELM obtain good results due to the introduction of the cost-sensitive learning mechanism. The AN, AP, and Accuracy of these classification models are significantly improved with increasing cost. At the same time, the misclassification and missed diagnosis samples are sharply reduced with increasing cost. Compared with the performance of oversampling and undersampling, the experimental results show that the threshold adjusting algorithm can also achieve satisfactory results. Therefore, the cost-sensitive method of threshold adjusting is also an effective choice for binary classification problems with imbalanced data and unequal diagnosis costs.

The Experimental Results Analysis of Multiclassification for Check Valve.
In order to test the validity and generalization ability of MKL-CS-ELM, the aforementioned three cost-sensitive methods are applied to identify the multiple operation states of the check valve. The effectiveness of the proposed method is then verified by multiclassification tests.

The Experimental Results of Oversampling.
In the multiclassification experiments, 110 data samples are collected, 66 samples are selected as training samples, and the remaining 44 samples are used as test samples. The data sample distribution of oversampling is calculated based on (16) and (17). The recognition results of the classification models are presented in Figure 9.
As seen in Figure 9, the classification accuracy of CS-ELM, MKL-CS-SVM, and MKL-CS-ELM increases with increasing cost. Conversely, the number of misclassified samples falls sharply with increasing cost. The three classification models of CS-ELM, MKL-CS-SVM, and

The Experimental Results of Undersampling.
Similar to the previous oversampling approach, the data sample distribution of undersampling is calculated. Then, the experimental results of the above-mentioned classification models are presented in Figure 10. As shown in Figure 10, the classification accuracy of the multikernel cost-sensitive diagnosis models decreases noticeably because undersampling sharply reduces the number of data samples. However, the misclassification samples can also be effectively restrained (even reduced to 0) by undersampling when the cost is equal to 2.5. Figure 10 also shows that the undersampling method should not be used when samples are insufficient and high accuracy is required.

The Experimental Results of Threshold Adjusting.
In the same way, the multiclassification recognition results of the five mentioned classification models under threshold adjusting are presented in Figure 11.
As depicted in Figure 11, under threshold adjusting the misclassification samples are reduced to 0 when the cost is increased to 2.5. The cost-sensitive classification models reach a balanced state when the cost is increased to 2.5, and the missed diagnosis samples and accuracy of CS-ELM, MKL-CS-SVM, and MKL-CS-ELM show no obvious change as the cost increases further.

Robust Performance Evaluation of Three Cost-Sensitive Methods for Check Valve.
In order to assess the effectiveness of the three cost-sensitive classification methods and choose the proper evaluation index for fault diagnosis of the check valve, the robustness indicator described in Section 2.4.2 is calculated; its variation with the cost is obtained and shown in Figure 12.
Figure 12 shows the comparative tests of robust performance evaluation for the three cost-sensitive methods. The robust performance index of undersampling is the biggest. That is to say, when the sample distribution is very imbalanced, it is not suitable to adopt the cost-sensitive method of undersampling. Moreover, in the CS-ELM, MKL-CS-SVM, and MKL-CS-ELM methods, the robust performance index in oversampling decreases at first and then increases with increasing cost, and the robust performance index in threshold adjusting decreases at first and then reaches a steady state with increasing cost. At the same time, Figure 12 also shows that the robust performance index of oversampling is smaller than that of threshold adjusting when the diagnosis cost is less than 2.5, and that the trend is reversed when the diagnostic cost is greater than 2.5. Therefore, oversampling and threshold adjusting are the more appropriate cost-sensitive methods for recognizing the multiple operation states of the check valve.

Discussion and Conclusion
6.1. Discussion. The high pressure diaphragm pump is often used as the core power equipment in slurry pipeline transportation, and its operating conditions are extremely complex. Therefore, it is critical to improve state recognition accuracy to ensure operational safety and stability. The check valve is the core component of the high pressure diaphragm pump, and it is one of the most easily damaged and most frequently replaced parts. Meanwhile, in the developed data acquisition system of the check valve, vibration data of normal operation are collected most of the time; in contrast, vibration data of fault states account for a much smaller share. Therefore, it is of great significance to identify the operation state of the check valve effectively under the condition of complex operation and information asymmetry. Inspired by multikernel learning and cost-sensitive analysis, a fast diagnosis method of the check valve based on MKL-CS-ELM is proposed. The presented MKL-CS-ELM method can rapidly locate and analyze check valve faults and provide theoretical support for the adjustment and optimization of the operation conditions of the check valve during follow-up operation.
We need to combine the signal characteristics with previous empirical rules on kernel selection so as to select effective kernel functions and construct the multikernel function.
(2) In order to overcome the deficiency of assuming equal classification costs in the classification model and to improve the practical adaptability of the model, the paper selects among the common cost-sensitive processing methods to construct the CS-ELM model. The effectiveness of introducing the cost-sensitive mechanism has been demonstrated through the binary classification and multiclassification recognition results. The experimental results of the three kinds of cost-sensitive methods have also been compared with each other in different situations to provide theoretical support and guidance for the selection of the cost-sensitive method. However, the experimental comparison shows that the diagnostic cost needs to be moderate; otherwise it will reduce the overall recognition accuracy of the classification model.

Conclusion. The fault diagnosis model of MKL-CS-ELM based on multikernel learning and cost-sensitive learning is constructed, and the datasets of the check valve are used to verify the effectiveness of the proposed method.

$\mathrm{Cost}[i]$ ($i \in \{1, \ldots, T\}$) represents the total cost function of category $i$, namely, $\mathrm{Cost}[i] = \sum_{j=1}^{T} \mathrm{Cost}[i, j]$. The cost expressions of oversampling, undersampling, and threshold adjusting by definition are discussed as follows.

4.1. The Principle of Check Valve and Experiment Platform
4.1.1. The Principle of Check Valve.

Figure 2: The high pressure diaphragm pump and fault check valve.

Figure 3: The structure diagram of cone-shaped check valve.

Figure 6: The recognition results of five classification models in oversampling.

5.1.3. The Experimental Results of Threshold Adjusting.
Based on Table 4 and (19), the recognition results of the five classification models in threshold adjusting are presented in Figure 8.

Figure 7: The recognition results of five classification models in undersampling.
MKL-CS-ELM can gain the optimal classification performance when the cost is equal to 2.5 in oversampling processing. Meanwhile, comparing the experimental results illustrated in Figures 9(a), 9(b), and 9(c), some conclusions are summarized as follows: (1) CS-ELM and MKL-CS-ELM are more sensitive to the cost than MKL-CS-SVM. (2) The classification performance of MKL-CS-ELM is slightly better than that of the other above-mentioned classification models. (3) The variation of classification accuracy, misclassification samples, and missed diagnosis samples with the cost is as follows: (1) the diagnosis cost of 2.5 can be regarded as a demarcation line and inflection point of classification accuracy. (2) The misclassification and missed diagnosis samples fall drastically while the cost is less than 2.5. When the cost is greater than 2.5, the misclassification samples are reduced to 0 and reach a balanced state, but the missed diagnosis samples increase sharply and the classification accuracy gradually decreases. (3) The experimental results show that the above-mentioned cost-sensitive methods are feasible for check valve fault diagnosis in industrial settings.

Figure 8: The recognition results of five classification models in threshold adjusting.

Figure 9: The recognition results of five classification models in oversampling.

(1) The multikernel learning mechanism is introduced to realize the multikernel projection of nonlinear and nonstationary data, which effectively overcomes the incomplete information captured by a single kernel function and improves the ability to represent signals. Three kinds of common kernel functions are used to construct the multikernel classification model in the experiments. The comparison of MKL-ELM and ELM shows that the introduction of multikernel learning effectively improves the recognition accuracy of the classification model. However, a normative mechanism for choosing which kernel functions, and how many, to use is still lacking. Therefore,

Figure 10: The recognition results of five classification models in undersampling.

Table 1: The cost matrix of binary classification.

Table 2: The description of the experimental datasets.

Table 3: The wavelet packet energy entropy features.

Table 4: The cost matrix of check valve.

Table 5: The data sample distribution by oversampling for check valve.