Autodetection of J Wave Based on Random Forest with Synchrosqueezed Wavelet Transform

The J wave is a bulge on the descending slope of the terminal portion of the QRS complex in the electrocardiogram (ECG). The presence of a J wave may lead to sudden death. However, the diagnosis of J wave variation currently depends only on the doctor's clinical experience, so missed diagnoses occur easily. In this paper, a new method is proposed for the automatic detection of the J wave. First, the synchrosqueezed wavelet transform (SST) is used to obtain precise time-frequency information from the ECG. Then, the inverse SST is computed to obtain the intrinsic mode functions of the ECG. Finally, the SST-based time-frequency features and the mode-based entropy features are fed to a Random forest to detect the J wave automatically. The experimental results show that the proposed method achieves the highest accuracy, sensitivity, and specificity compared with existing techniques.


Introduction
The J wave is the bulge or ectrosis generated on the descending slope of the terminal portion of the QRS complex in the electrocardiogram (ECG). Its morphologic pattern, amplitude, and duration vary; moreover, it is often hidden in the ST segment [1,2]. The presence of a J wave may lead to fatal malignant arrhythmia and even sudden death. Therefore, research on the J wave has received increasing attention.
In 1936, Shipley and Hallaran discovered the J wave in the ECG of patients with premature repolarization syndrome for the first time [3]. In 1938, Tomashewski found a J wave in a frozen male patient's ECG [4]. In the 1980s, sudden death during sleep occurred frequently among young healthy men in Southeast Asian countries [5]. From 1948 to 1982 in Manila, Philippines, 722 cases of sudden death in healthy youth were reported, and J waves occurred in their ECGs [6][7][8]. In 1984, Otto et al. reported three healthy young men who had ventricular fibrillation during sleep; their heart structures were normal, but their ECGs showed J waves [9]. In 1992, the Brugada brothers reported 8 cases of sudden cardiac death in which J waves were found in the ECGs [10]. In 1996, Professor Yan and Professor Antzelevitch published an article in Circulation investigating the molecular and electrophysiological principles of the J wave [11]. Since then, the study of the J wave has attracted growing attention from experts and scholars, but these studies mainly take a medical point of view, and so far J wave syndrome is diagnosed only by the doctor's clinical experience combined with visual inspection of the ECG [12]. However, clinical misdiagnosis and missed diagnosis occur easily when the disease is diagnosed only by the doctor's clinical experience, because the morphologic pattern, amplitude, and duration of the J wave vary and the resulting symptoms also differ. Therefore, automatic detection of the J wave from the perspective of signal processing and machine learning is a significant task.
At present, very few researchers work on this problem. To the best of our knowledge, Clack et al. were the first to analyze the ECG for this purpose with the help of a computer, in 2014. They set up a breakpoint on the descending slope of the QRS wave and achieved a sensitivity of 89.5%, a specificity of 94.5%, and an accuracy of 91.3% [13]. In 2015, Wang et al. used signal processing combined with functional analysis to recognize the J wave automatically, achieving a sensitivity of 88.45%, a specificity of 87.8%, and an accuracy of 89.6% [14]. However, the datasets were too small and the results do not generalize. In our previous work, we used curve fitting and the wavelet transform to extract ECG features. Combined with an SVM classifier, this achieved a sensitivity of 93.21%, a specificity of 93.87%, and an accuracy of 92.58% [15]. However, the amount of data at that time was too small for the result to be convincing. Another drawback of that method is its low computational efficiency. Since the incidence of J wave syndrome is low, we tried to build a J wave database in [16] before we had collected enough samples. With that system, we achieved an average sensitivity of 91.32%, an average specificity of 92.2%, and an average accuracy of 93.35%. However, the constructed database is not real data, so we reexplore methods for the automatic detection and identification of the J wave now that enough data have been collected.
The wavelet transform (WT) is a good time-frequency analysis method, but it is constrained by the Heisenberg time-frequency uncertainty principle [17]. In other words, WT cannot improve the time resolution and the frequency resolution simultaneously: a higher time resolution means a lower frequency resolution and vice versa. The temporal and frequency resolutions vary with the wavelet scale in WT, and time-frequency blurring occurs on the transformed time-scale plane. Empirical mode decomposition (EMD) is an effective tool for the time-frequency analysis of signals, but it suffers from many problems, such as the sifting criterion, the endpoint effect, and mode mixing [18]. Moreover, EMD does not have a firm mathematical framework. In view of these shortcomings, Daubechies et al. proposed a new time-frequency transform named the synchrosqueezed wavelet transform (SST). It is a powerful tool for the time-frequency analysis of ECG, and precise time-frequency information can be evaluated using SST [19].
In this paper, a new methodology based on SST and Random forest (RF) is proposed to realize the automatic detection of J wave. We computed the time-frequency feature based on SST as the first feature. Through the inverse transformation of SST we obtained five modes of the ECG episodes and we have evaluated Renyi entropy, approximate entropy, and sample entropy as the nonlinear features. Then, the RF is utilized to achieve the detection and classification of J wave-positive and J wave-negative from ECGs. The flow chart of the proposed method for the automatic detection of J wave is provided in Figure 1. The remaining part of the paper is organized as follows. In Section 2, we describe the database used in this work. In Section 3, the developed method is described. The results and discussion are presented in Section 4 and Section 5, respectively. Finally, the present work is concluded in Section 6.

Data Preparation
2.1. Data Source. In our work, the ECG signals were collected from the Shanxi Dayi Hospital, which is the cooperating partner of our project. The Infiniti digital twelve-channel ECG SE 1200-Express was used, and the ECG data were sampled at 500 Hz. The database consists of 30 normal ECG recordings (20 males and 10 females), which come from health checkups, and 25 abnormal ECG recordings (23 males and 2 females), which come from patients with J wave related diseases. All participants enrolled in the study signed informed consent. We chose a 20-minute duration of Holter monitoring for each ECG record; that is, we extracted 1200 heartbeats from each ECG recording. In this paper, the normal ECG pattern is defined as J wave-negative and the abnormal ECG pattern is defined as J wave-positive. We divided the data into training sets and testing sets. The training sets contain 18 J wave-negative and 15 J wave-positive recordings, while the testing sets comprise 12 J wave-negative and 10 J wave-positive recordings.

Preprocessing of Data.
In the preprocessing stage, the ECG signal is denoised with an eight-level Daubechies 6 (db6) wavelet decomposition [20]. The Pan-Tompkins algorithm is used to detect the R-peaks in the preprocessed ECG signal, after which the ECG episodes are segmented using the detected R-points [20,21]. The numbers of ECG beats for J wave-positive and J wave-negative used in this study are given in Table 1. Since the J wave is often hidden in the ST segment, we take the 120 samples after each R-point as the objects of study.
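The segmentation step can be illustrated with a minimal sketch. It assumes that R-peak indices are already available (for example from a Pan-Tompkins detector, which is not implemented here) and that the window starts at the sample immediately after the R-point; the function name and this boundary convention are illustrative assumptions. At the 500 Hz sampling rate stated above, 120 samples correspond to 240 ms.

```python
from typing import List, Sequence


def segment_after_r_peaks(
    ecg: Sequence[float], r_peaks: Sequence[int], window: int = 120
) -> List[List[float]]:
    """Return one episode per R-peak: the `window` samples that follow it.

    Assumption: the R-point sample itself is excluded, and peaks too close
    to the end of the recording (incomplete window) are skipped.
    """
    episodes = []
    for r in r_peaks:
        start, stop = r + 1, r + 1 + window
        if r < 0 or stop > len(ecg):
            continue  # not enough samples after this R-point
        episodes.append(list(ecg[start:stop]))
    return episodes
```

Skipping incomplete windows keeps every episode the same length, which is convenient when the episodes are later turned into fixed-size feature vectors.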

Time-Frequency Feature SST-Based
3.1.1. The Basic Theory of SST. SST is a powerful and promising tool for analyzing the time-frequency (TF) information of nonstationary signals and is based on WT and reallocation methods [22]. It is computed by reassigning the wavelet coefficients from the time-scale plane to the TF-plane, so that a sharper TF distribution is achieved. It is therefore a postprocessing step applied to the WT. It also inherits the philosophy of EMD. Unlike EMD, however, it has a sound theoretical basis, and the mode mixing phenomenon is overcome in SST. Another advantage of SST is that the choice of mother wavelet has little influence on its results [19]. The basic principles of SST are as follows [23,24].
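The Continuous Wavelet Transform (CWT) of a signal $s(t)$ is [25]
$$W_s(a,b) = \frac{1}{\sqrt{a}} \int s(t)\, \overline{\psi\!\left(\frac{t-b}{a}\right)}\, dt, \quad (1)$$
where $\psi$ is the mother wavelet, $a$ is the scaling factor, and $b$ is the time shift factor. According to Plancherel's theorem, equation (1) can be rewritten as
$$W_s(a,b) = \frac{1}{2\pi} \int \hat{s}(\xi)\, \sqrt{a}\; \overline{\hat{\psi}(a\xi)}\, e^{ib\xi}\, d\xi, \quad (2)$$
where $\hat{s}(\xi)$ is the Fourier transform of $s(t)$ and $\hat{\psi}(\xi)$ is the Fourier transform of $\psi(t)$.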
One of the properties of WT is that the TF energy of the result is always concentrated around the central frequency of the signal. The place with the highest energy, commonly known as the "ridge", marks the signal frequency. However, the energy smeared around the "ridge" always affects the recognition of the signal. That is, for a signal of frequency $\omega$, when the Fourier transform $\hat{\psi}(\xi)$ of the mother wavelet is concentrated around $\xi = \omega_0$, the wavelet coefficients $W_s(a,b)$ at scale $a$ and time shift $b$ will be concentrated around $a = \omega_0/\omega$, but in practice they are diffused around the "ridge" $a = \omega_0/\omega$. On the other hand, the oscillation of $W_s(a,b)$ in $b$ points to the original frequency $\omega$, regardless of the value of $a$ [22]. This is the theoretical basis of SST.
The process of the SST is as follows [26,27]:
(1) Calculate the frequency domain form of the results of WT, just as (3).
(2) Calculate the instantaneous frequency (IF) of the signal.
(4) Compress and rearrange the coefficients of WT. The information can be transformed from the time-scale plane to the time-frequency plane; moreover, the IF can be extracted in this step.
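For reference, the instantaneous-frequency and reassignment steps are usually written as follows in the standard SST formulation of Daubechies et al. The notation is an assumption, since the corresponding equations are not reproduced in this text: $W_s(a,b)$ denotes the wavelet coefficients at scale $a$ and time shift $b$, $\omega_s(a,b)$ the instantaneous frequency estimate, $T_s(\omega_l,b)$ the synchrosqueezed coefficient in the frequency bin of width $\Delta\omega$ centered at $\omega_l$, and $c$ a constant that depends on the Fourier transform convention.

```latex
% Instantaneous frequency estimate, defined wherever W_s(a,b) \neq 0
\omega_s(a,b) = -\,i\,\frac{\partial_b W_s(a,b)}{W_s(a,b)}

% Worked check for a pure tone s(t) = A\cos(\omega t), with \hat{\psi}
% supported on positive frequencies only:
W_s(a,b) = c\,A\,\sqrt{a}\;\overline{\hat{\psi}(a\omega)}\;e^{i b \omega}
\;\Longrightarrow\;
\partial_b W_s(a,b) = i\,\omega\,W_s(a,b)
\;\Longrightarrow\;
\omega_s(a,b) = \omega \quad \text{for every scale } a

% Synchrosqueezing: move each coefficient from (a_k, b) to the frequency
% bin \omega_l that contains its instantaneous frequency estimate
T_s(\omega_l, b) = \frac{1}{\Delta\omega}
  \sum_{a_k \,:\, |\omega_s(a_k,b) - \omega_l| \le \Delta\omega/2}
  W_s(a_k, b)\; a_k^{-3/2}\; (\Delta a)_k
```

The pure-tone check makes the key property explicit: the coefficients are spread over a range of scales, but the estimated frequency is the same at every scale, so reassignment collapses the smeared energy onto a single frequency line.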

The Parameter Selection of SST.
SST is an improvement on the CWT. The choice of wavelet basis and the setting of its parameters strongly affect the results of the CWT. It is proved in [25] that the wavelet basis has a much smaller effect on SST than on CWT, which is another advantage of SST. The Morlet wavelet is used in this paper, and we set its center frequency to 25 Hz, 35 Hz, and 45 Hz in turn to find the best value. The TF curves of the J wave-positive and J wave-negative signals obtained by WT and SST at the different center frequencies are shown in Figures 2-4. The WT-based TF-plane is obtained by (4). Figures 2-4 show that the TF-plane derived from the WT suffers from poor TF resolution and a serious smearing effect along the frequency axis. In contrast, the SST-based TF representation is more focused and its energy is more concentrated. Moreover, the best TF resolution is obtained when the center frequency of the wavelet basis is 35 Hz. Because SST achieves a high-precision TF resolution, we choose the SST results as the first kind of feature for the automatic detection of J wave. SST can also avoid frequency mixing effectively: even if the decomposed signal contains modes with relatively close frequencies, SST can still extract them. This capability rests on the exact reconstruction theory of SST, which is as follows.
(2) Calculate the frequency center of each individual component, that is, $\omega_k(b)$ [28].
BioMed Research International
(4) Calculate the individual component $s_k(b)$ of $s(t)$. It can be reconstructed from the discretized SST $\tilde{T}_{\tilde{s}}(\omega, b)$ using the inverse CWT over a narrow frequency band $\omega \in [\omega_k(b) - \frac{1}{2}\Delta\omega,\ \omega_k(b) + \frac{1}{2}\Delta\omega]$ around the $k$th component [30]. It can be evaluated as follows:
$$s_k(b) = 2\, C_\psi^{-1}\, \mathrm{Re}\Bigg( \sum_{\omega_l \in [\omega_k(b) - \frac{1}{2}\Delta\omega,\ \omega_k(b) + \frac{1}{2}\Delta\omega]} \tilde{T}_{\tilde{s}}(\omega_l, b) \Bigg),$$
where $C_\psi$ is a normalization constant that depends on the mother wavelet.
Here, we obtain five modes, and the intrinsic modes of J wave-positive and J wave-negative are provided in Figure 5. Figure 5 shows that the amplitude and frequency information of J wave-positive and J wave-negative differ significantly, especially in mode 3, mode 4, and mode 5. The frequencies of J wave-positive in mode 3, mode 4, and mode 5 are evidently higher than those of the corresponding modes of J wave-negative, while the amplitudes are lower.
The entropy features extracted in this paper are derived from these intrinsic mode functions, as discussed in the next subsection.

Nonlinear Entropy Feature Inverse SST-Based.
Because biological signals are nonlinear, researchers often analyze them with methods from nonlinear dynamics, which have proved effective. Entropy, as a kind of nonlinear feature, often performs well in the study of biological signals. For this reason, Renyi entropy (RE), approximate entropy (ApEn), and sample entropy (SampEn) are used in this study for the automatic detection of J wave.

Mode Renyi Entropy.
In [31], Williams et al. introduced the RE of a TF distribution. RE can be used as a measure of signal complexity in the frequency domain, and the essence of a signal can be studied by computing its RE in the frequency domain [32]. Suppose that $X$ is a random variable with a finite number of values. Its probability distribution is $P = \{p_1, p_2, \ldots, p_n\}$ with $W(P) = \sum_i p_i \le 1$, and its RE of order $\alpha$ is defined as
$$R_\alpha(P) = \frac{1}{1-\alpha} \log_2 \left( \frac{\sum_i p_i^{\alpha}}{\sum_i p_i} \right).$$
As $\alpha \to 1$, the first-order RE degenerates into Shannon entropy, so we regard RE as a more general form of information entropy. The theoretical derivation and simulation experiment in [33] concluded that the measurement of RE is most stable when $\alpha = 3$. Therefore, the third-order RE can describe the information of different signals effectively, and we choose $\alpha = 3$ in this article.
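As a minimal sketch of this definition, the function below computes the order-$\alpha$ RE in bits for a possibly incomplete distribution, that is, $\frac{1}{1-\alpha}\log_2(\sum_i p_i^\alpha / \sum_i p_i)$. How the probabilities are obtained from a mode is not specified in this text; normalizing nonnegative energy values to sum to one, as in the hypothetical helper `to_distribution`, is an assumption for illustration.

```python
import math
from typing import List, Sequence


def to_distribution(energies: Sequence[float]) -> List[float]:
    """Hypothetical helper: scale nonnegative energies so they sum to one."""
    total = sum(energies)
    if total <= 0 or any(e < 0 for e in energies):
        raise ValueError("energies must be nonnegative with a positive sum")
    return [e / total for e in energies]


def renyi_entropy(p: Sequence[float], alpha: float = 3.0) -> float:
    """Renyi entropy of order alpha, in bits, for sum(p) <= 1."""
    total = sum(p)
    if total <= 0 or total > 1 + 1e-12 or any(x < 0 for x in p):
        raise ValueError("p must be nonnegative with 0 < sum(p) <= 1")
    probs = [x for x in p if x > 0]  # zero entries contribute nothing
    if abs(alpha - 1.0) < 1e-12:
        # Limit alpha -> 1: Shannon entropy (normalized by the total weight)
        return -sum(x * math.log2(x) for x in probs) / total
    return math.log2(sum(x ** alpha for x in probs) / total) / (1.0 - alpha)
```

For a uniform distribution over four outcomes, the third-order RE equals 2 bits, the same as the Shannon entropy; for non-uniform distributions the third-order value is smaller because higher orders weight the most probable outcomes more heavily.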

Mode Approximate Entropy.
ApEn is a nonnegative quantitative description of the complexity of a nonlinear time series. A more complex time series corresponds to a greater value of ApEn [34]. In addition, ApEn yields stable statistics even when the data are short, which is why it can perform well in our work. It is defined as
$$\mathrm{ApEn}(m, r, N) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \log C_i^{m}(r) - \frac{1}{N-m} \sum_{i=1}^{N-m} \log C_i^{m+1}(r),$$
where $C_i^{m}(r)$ is the correlation integral, which can be written as
$$C_i^{m}(r) = \frac{1}{N-m+1} \sum_{j=1}^{N-m+1} \Theta\big(r - \lVert x_i - x_j \rVert\big),$$
where $x_i$ and $x_j$ represent phase trajectory points and $N$, $r$, $\Theta$, and $m$ denote the number of points in the phase space, the radial length of a circular disk centered at the reference points, the step function, and the embedding dimension, respectively [34].
The performance of ApEn is related to the values of $m$, $r$, and $N$. The results in the literature [34][35][36] show that when $m$ is 2 and $r$ is 0.2 times the standard deviation of the data (SDNN), the value of ApEn has steady statistical properties. Accordingly, we take $m = 2$ and $r = 0.2 \times \mathrm{SDNN}$ in this work.
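The ApEn computation with an embedding dimension of 2 and a tolerance of 0.2 times the standard deviation can be sketched as below. This follows the usual formulation in which templates are compared with the maximum (Chebyshev) distance and self-matches are counted; both choices, and the use of the population standard deviation, are assumptions rather than details confirmed by this text.

```python
import math
import statistics
from typing import Sequence


def approximate_entropy(x: Sequence[float], m: int = 2, r_factor: float = 0.2) -> float:
    """Approximate entropy ApEn(m, r, N) with r = r_factor * std(x)."""
    n = len(x)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")
    r = r_factor * statistics.pstdev(x)

    def phi(dim: int) -> float:
        # Average log-fraction of templates within distance r of each template
        count = n - dim + 1
        templates = [x[i:i + dim] for i in range(count)]
        total = 0.0
        for ti in templates:
            matches = sum(
                1 for tj in templates
                if max(abs(a - b) for a, b in zip(ti, tj)) <= r
            )  # always >= 1 because each template matches itself
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)
```

The double loop is quadratic in the series length, which is acceptable for short episodes such as the 120-sample windows used here but would need a faster neighbor search for long recordings.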

Mode Sample Entropy.
Proposed by Richman and Moorman, SampEn is similar to ApEn but measures the complexity of a time series with higher precision. To compute SampEn, points are matched continuously within the threshold $r$ until no further match exists [34,37]. The variables $A(k)$ and $B(k)$, for all lengths $k$ up to $m$, keep track of all matching templates. It is given by
$$\mathrm{SampEn}(k, r, N) = -\ln\left( \frac{A(k)}{B(k-1)} \right),$$
where $k = 0, 1, \ldots, m-1$ and $N$ is the length of the study object. As with ApEn, $m = 2$ and $r = 0.2 \times \mathrm{SDNN}$ are used in this paper [37].
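A minimal SampEn sketch in the common Richman and Moorman form is given below: it counts template pairs of length m and m + 1 that match within the tolerance, excludes self-matches, and returns the negative logarithm of their ratio. The Chebyshev distance, the inclusive tolerance test, and returning infinity when no matches are found are assumptions made for illustration.

```python
import math
import statistics
from typing import Sequence


def sample_entropy(x: Sequence[float], m: int = 2, r_factor: float = 0.2) -> float:
    """Sample entropy with r = r_factor * std(x); self-matches are excluded."""
    n = len(x)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")
    r = r_factor * statistics.pstdev(x)

    def count_pairs(dim: int) -> int:
        # The same n - m starting points are used for both template lengths
        templates = [x[i:i + dim] for i in range(n - m)]
        pairs = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    pairs += 1
        return pairs

    b = count_pairs(m)      # matches of length m
    a = count_pairs(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")  # undefined in theory; flagged as infinite here
    return -math.log(a / b)
```

Because self-matches are excluded, a perfectly regular series gives exactly zero, which is one reason SampEn is often described as less biased than ApEn on short data.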

Classification.
Combining his bagging ensemble learning theory, proposed in 1996, with the random subspace method proposed by Ho in 1998, Leo Breiman introduced the Random forest (RF) in 2001. RF is widely regarded as an excellent ensemble classifier [38]. The core of RF is to build many decision trees from random features of random samples with a bagging strategy; the final classification result is obtained by a vote among these trees [39]. The classification process of RF is as follows:
(1) Adopt the technique of bootstrap resampling to extract multiple samples from the original samples.
(2) Build CART decision tree by selecting features randomly from all features of above samples.
(3) Repeat the above two steps $k$ times to build $k$ CART decision trees.
(4) Combine multiple decision trees' prediction and draw a final classification results by voting.
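The four steps above can be sketched in a few lines. To stay short, this toy version replaces full CART trees with one-split decision stumps, so it illustrates bootstrap resampling, random feature selection, and majority voting rather than the classifier actually used in this work; all function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

# (feature index, threshold, label if value <= threshold, label otherwise)
Stump = Tuple[int, float, int, int]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int], features: Sequence[int]) -> Stump:
    """Pick the single-feature threshold split with the fewest training errors."""
    best, best_err = None, None
    for f in features:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            err = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best_err is None or err < best_err:
                best, best_err = (f, thr, l_lab, r_lab), err
    return best


def fit_forest(X, y, n_trees: int = 150, n_features: int = 1, seed: int = 0) -> List[Stump]:
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # step 1: bootstrap sample
        feats = rng.sample(range(d), n_features)     # step 2: random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest                                    # step 3: repeated n_trees times


def predict(forest: Sequence[Stump], row: Sequence[float]) -> int:
    """Step 4: majority vote over all base learners."""
    votes = Counter(l if row[f] <= thr else r for f, thr, l, r in forest)
    return votes.most_common(1)[0][0]
```

Individual stumps trained on unlucky bootstrap samples can be wrong, but the vote over many of them is stable, which is the intuition behind the ensemble.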
RF is selected as the classifier for the automatic detection of J wave in this paper, since it has the following excellent properties compared with other classifiers:
(1) RF can deal with high-dimensional data and weakly relevant data effectively [39][40][41].
(3) It can rank the features by importance [40].
(4) Fewer parameters need to be set in RF than in other state-of-the-art classifiers. The number of base decision trees is usually the only parameter that needs to be set; following [42,43], the number of base learners is set to 150 in this work.
The results reveal that the mean and the standard deviation values of the J wave-positive episodes are higher than those of the J wave-negative episodes. To analyze the statistical significance of these features from mode 1 to mode 5, we used Welch's two-tailed t-test [44,45] in the SPSS statistical analysis software. This yields the t-value and the p-value, which are typically used to quantify statistical significance. The t-value and the p-value of RE from mode 1 to mode 5 are listed in Table 2. The high t-values and low p-values show that RE discriminates significantly between J wave-negative and J wave-positive subjects.
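As an illustration of the statistic behind Welch's t-test, the sketch below computes the t-value and the Welch-Satterthwaite degrees of freedom for two samples with unequal variances. It does not reproduce the SPSS analysis: the p-value additionally requires the Student t distribution, which is not part of the Python standard library and is left out here.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t-value, degrees of freedom) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb                                  # squared standard error
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

A large absolute t-value at a given number of degrees of freedom corresponds to a small two-tailed p-value, which is how the tables in this section should be read.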

Analysis of Mode Entropy
The within-class variation of the ApEn and SampEn features for the J wave-negative and J wave-positive classes from mode 1 to mode 5 is shown in Figures 8(a), 8(b), 10(a), and 10(b), respectively. The mean and standard deviation values of ApEn and SampEn are shown in Figures 9(a), 9(b), 11(a), and 11(b), respectively. These figures show that the statistical features of ApEn and SampEn differ significantly between J wave-negative and J wave-positive episodes. The J wave-positive class has higher standard deviation values from mode 1 to mode 5, while the opposite tendency is observed in the mean values of ApEn and SampEn. The t-values and p-values of ApEn and SampEn from mode 1 to mode 5 are listed in Tables 3 and 4. The p-values of the ApEn feature at mode 1, mode 2, and mode 5 are 0.003, 0.007, and 0.002, respectively, whereas the p-values of SampEn at mode 1 to mode 5 are all less than 0.001; thus, SampEn may perform better in classification. Overall, the entropy features are statistically significant for separating the J wave-negative and J wave-positive classes and are suitable for the detection of J wave-positive.

Performance Metrics.
The following five performance indicators are used to evaluate the proposed method for ECG J wave detection [44,46,47]: sensitivity (Se), specificity (Sp), accuracy (ACC), the Matthews correlation coefficient (MCC), and the area under the receiver operating characteristic curve (AUC). The first four are defined as
$$\mathrm{Se} = \frac{TP}{TP + FN}, \qquad \mathrm{Sp} = \frac{TN}{TN + FP}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN},$$
$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$
where true positive (TP) and false negative (FN) stand for the number of J-positive heartbeats classified correctly and incorrectly, respectively, while true negative (TN) and false positive (FP) stand for the number of J-negative heartbeats classified correctly and incorrectly, respectively. An ideal classification system should have low FN and FP, so that it achieves high Se, Sp, and ACC as well as high MCC. In addition, the AUC is used in our work to obtain a more objective evaluation: the higher the AUC, the more desirable the classification system.
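The four count-based indicators follow directly from the confusion counts: Se = TP/(TP+FN), Sp = TN/(TN+FP), ACC = (TP+TN)/(TP+TN+FP+FN), and MCC as the Matthews correlation coefficient. The sketch below uses these standard definitions; the function name and the convention of returning an MCC of zero when its denominator vanishes are illustrative assumptions.

```python
import math
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, and MCC from confusion counts."""
    se = tp / (tp + fn)                       # J-positive beats found
    sp = tn / (tn + fp)                       # J-negative beats correctly rejected
    acc = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Se": se, "Sp": sp, "ACC": acc, "MCC": mcc}
```

For example, with 90 true positives, 10 false negatives, 80 true negatives, and 20 false positives, the function gives Se = 0.9, Sp = 0.8, ACC = 0.85, and an MCC of about 0.70.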

Experimental Results.
Firstly, we compared the method presented in this paper with some existing techniques; the results are listed in Table 5. Table 5 shows that the proposed method outperforms the methods reported in [13,14]. The proposed method achieved the highest ACC, Se, Sp, MCC, and AUC of 96.9%, 96.5%, 95.8%, 0.923, and 0.957, respectively. Besides, the databases of [13,14] are too small. In [13] the database comprises 100 resting 12-lead samples. In [14] the training set contains 100 samples and the test set contains 116 samples, so the experimental results lack generality.
Secondly, the previous methods that we reported in [15,16] have been evaluated with the latest collected data, and the results are shown in Table 6. Table 6 shows that the proposed method achieved the highest ACC, Se, Sp, MCC, and AUC, which proves the effectiveness of the proposed method for the automatic detection of J wave.

Figure 12: The ranking results of features extracted in this paper (feature importance, plotted as relative importance).

Discussion
In this section, the effect of the different features on the classification results is discussed first. RF can rank features according to their importance, which is one of its advantages. Figure 12 depicts the ranking results. It can be observed from Figure 12 that the time-frequency feature based on SST outperforms the other features. Among the nonlinear entropy features, RE performs best, and SampEn performs better than ApEn. The RE feature emphasizes the spectral variation and benefits from the excellent time and frequency properties of SST, which may be why RE ranks first among the entropy features. The receiver operating characteristic (ROC) curves of the RF classifier for the various features are provided in Figure 13. The RF classifier has the highest AUC, 0.957, when it is fed all features (time-frequency feature, RE, ApEn, and SampEn). The areas under the remaining ROC curves are 0.951, 0.809, 0.721, and 0.702 for the time-frequency feature, RE, ApEn, and SampEn, respectively, which is consistent with our earlier analysis.
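The AUC values quoted above can be understood through the rank interpretation of the ROC curve: the AUC equals the probability that a randomly chosen positive episode receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way; using the fraction of trees that vote J wave-positive as the score is an assumption, not a detail stated in this text.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the gap between 0.951 for the time-frequency feature and about 0.7 for ApEn or SampEn alone reflects a large difference in how well each feature ranks the two classes.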
In addition, the computational efficiency and detection results of different classifiers are discussed in this section. Table 7 lists the ACC, MCC, AUC, training time, and testing time obtained when the features extracted in this paper are fed to different classifiers. For RF, the number of base learners is set to 100, 150, and 300 in turn. For K-Nearest Neighbour (KNN), $k$ is set to 6 [48]. For support vector machines (SVM), 10-fold cross validation and grid search are adopted to find satisfactory values of the radial basis function (RBF) parameter and the penalty factor $C$ that give the shortest running time [49,50].
It can be seen from Table 7 that RF is more time-consuming than KNN and DT on the training sets, since it needs to build many decision trees and vote on the samples through the trees during training. It can also be observed that the training time and the testing time increase linearly with the number of base learners. However, the testing time of RF is far less than its training time. In application, the testing time is more important, since the offline data is usually adopted in the process of testing. When the number of base learners is 150, RF achieved the highest ACC, MCC, and AUC of 96.9%, 0.923, and 0.957, respectively, compared with the other classifiers, which supports the choice of classifier in our paper. Although SVM achieved classification results comparable to those of RF, its time consumption was much greater.

Conclusion
A new method is proposed in this paper for the automatic detection of J wave. The experimental results show that the proposed method can detect the J wave automatically and accurately. Moreover, it provides a reliable foundation for clinical diagnosis. We introduced time-frequency domain features and nonlinear entropy features (RE, ApEn, and SampEn) in the feature extraction stage, and RF is then used in the classification stage. The entropy features are computed from the modes of the ECG, which are obtained by the inverse transformation of SST. Compared with the existing techniques, the advantages of the proposed method are as follows. This is the first time that the intrinsic mode functions of the ECG have been obtained through SST. The good time-frequency characteristics and the reconstruction ability of SST make it a powerful tool for discriminating J wave-negative from J wave-positive ECGs. Combined with RF, an ensemble classifier with strong performance, we obtain the best results for the automatic detection of J wave.
In the future, the work can be extended in two aspects:
(1) The developed methodology can be applied to the diagnosis and recognition of other heart diseases, and even to other biosignals, such as the electroencephalogram (EEG).
(2) Feature selection can be studied to further improve the computational efficiency of J wave automatic detection.

Data Availability
All data used and analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.