Atrial fibrillation (AF) is one of the most common cardiovascular diseases, with high disability and mortality rates. The early detection and treatment of atrial fibrillation have great clinical significance. In this paper, a multiple-feature fusion method is proposed to screen out AF recordings from short single-lead electrocardiogram (ECG) recordings. The proposed method uses discriminant canonical correlation analysis (DCCA) for feature fusion. It takes both intraclass and interclass correlation into consideration and addresses the computational cost and information redundancy of simple serial or parallel feature fusion. DCCA integrates traditional features extracted using expert knowledge with deep learning features extracted by a residual network and a gated recurrent unit network, improving on the low accuracy obtained with a single feature type. Experiments on the PhysioNet/CinC Challenge 2017 dataset are designed to verify the effectiveness of the proposed algorithm. In the experiments, the F1 index reaches 88%, and the accuracy, sensitivity, and specificity are 91.7%, 90.4%, and 93.2%, respectively.
Atrial fibrillation (AF) is the most common persistent cardiovascular disease. It can easily lead to strokes, hemiplegia, and other diseases, seriously threatening patients' health; thus, timely diagnosis and treatment are necessary. However, owing to the shortage of medical resources and the reliance on diagnosis by individual doctors, improving automatic detection technology has become urgent. Automatic detection of cardiac rhythm is a meaningful and important issue in different age groups, including adults [
The detection of atrial fibrillation signals is mainly divided into four parts, including data preprocessing, feature extraction, feature selection, and classification. Among them, feature extraction directly affects the accuracy and efficiency of atrial fibrillation signal classification. Commonly used feature extraction in the literature usually falls into two categories, traditional feature extraction and feature extraction based on deep learning methods. Traditional feature extraction methods are generally divided into three categories. The first is to extract the statistical characteristics of ECG signals, that is, use the statistical data to summarize a series of ECG data. Typical statistics include mean, maximum, minimum, variance, skewness, kurtosis, count, and percentage. Kaya et al. [
With the rapid development of deep learning, the advantages of feature-level fusion have become more and more obvious. In recent years, some researchers have used feature fusion for ECG signal detection. Smoleń [
This paper presents a robust method capable of detecting AF from short single-lead ECG recordings. The four main contributions of this paper are as follows: (1) a novel combination of deep learning features and traditional features; (2) an improved residual network and gated recurrent unit network that extracts deep learning features in both the spatial and temporal domains; (3) ECG feature fusion using discriminant canonical correlation analysis; and (4) superior classification results compared to the above-cited methods on the same database [
The structure of this paper is as follows: Section
This section mainly introduces deep learning feature extraction methods and traditional feature extraction methods based on expert knowledge.
This article uses a large dataset released by the PhysioNet/CinC Challenge in 2017, which contains 8528 single-lead ECG records [
The PhysioNet 2017 dataset.
Type | Recording | Average time length (s) |
---|---|---|
Normal | 5076 | 31.9 |
AF | 758 | 31.6 |
Other rhythm | 2415 | 34.1 |
Noisy | 279 | 27.1 |
The Butterworth band-pass filter is used to denoise the original ECG. The frequency response of the Butterworth filter is maximally flat (i.e., has no ripples) in the passband and rolls off towards zero in the stopband [
The number of samples per class in the database is uneven: normal and other rhythms are abundant (5076 and 2415 recordings, respectively), while atrial fibrillation and noisy recordings are scarce (758 and 279, respectively). This imbalance can easily degrade model training and cause overfitting. In this paper, class_weight is used to balance the samples by assigning a weight to each output class. The weights of the normal and other classes are small, while the weights of the atrial fibrillation and noise classes are much larger. The class_weight method uses the balanced mode, whose weight is calculated as n_samples/(n_classes
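As an illustration of the balanced weighting, the following minimal sketch assumes the widely used heuristic in which each class weight equals n_samples divided by the product of n_classes and the number of samples in that class. The function name `balanced_class_weights` and the label strings are hypothetical; the recording counts are taken from the dataset table.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)}.

    Assumption: this is the "balanced" heuristic; the exact formula
    used by the authors is only partially given in the text.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Recording counts from the dataset table (label names are illustrative).
labels = (
    ["normal"] * 5076 + ["af"] * 758 + ["other"] * 2415 + ["noisy"] * 279
)
weights = balanced_class_weights(labels)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

Under this assumption, the scarce AF and noisy classes receive weights well above 1, while the normal and other classes receive weights below 1, which matches the qualitative description in the text.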
This paper adopts a residual network and a gated recurrent unit for deep learning feature extraction, which can not only reduce the depth of the network and effectively prevent overfitting but also extract the temporal characteristics of the signal together with its spatial characteristics. The specific network structure is shown in Figure
Deep learning feature extraction uses ResNet (residual network) and GRU (gated recurrent unit).
To deal with the degradation problem of deep neural networks, identity mappings are established through a residual structure, which effectively simplifies a multilayer network into a shallower one. Based on the characteristics of the residual network, a one-dimensional residual network suitable for processing atrial fibrillation signals is designed. The residual network consists of six residual convolution blocks. In the first two residual blocks, the number of filters is 16. Each residual ConvBlock is composed of four convolution blocks and a one-dimensional average pooling layer. Each convolution block contains a one-dimensional convolution with a stride of 1, batch normalization (BN), a leaky rectified linear unit (LeakyReLU), and spatial dropout (SpatialDropout). The final activation layer is followed by the one-dimensional average pooling layer. Spatial dropout prevents overfitting and promotes independence between feature maps more effectively than standard dropout. The number of filters is doubled every two residual blocks, and the convolution stride in each convolution block is 1. The output of the residual network is fed into the gated recurrent unit network, whose number of neurons is set to 32; finally, the output of the last hidden layer is extracted as the deep learning feature.
In fact, the ECG signal is used as input to extract relevant statistical features. First, the multilead differential electrocardiogram summation absolute value and adaptive threshold real-time detection algorithm [
ECG detection algorithm detects QRS.
After the R waves are detected, the RR intervals are calculated from them as follows:
RR intervals and P waves [
The statistical characteristics of the RR intervals include the standard deviation and variance, the maximum, minimum, and average RR interval, pNN50 (the proportion of successive RR-interval differences greater than 50 ms among all RR intervals in the ECG sequence), RMSSD (root mean square of the differences between successive RR intervals), SDSD (standard deviation of the differences between successive RR intervals), and the mean, variance, skewness, and kurtosis of each of the six segments into which the RR intervals are divided.
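The RR-interval statistics listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `rr_features` is hypothetical, RR intervals are taken as differences between consecutive R-peak positions, population (not sample) statistics are assumed, and pNN50 is normalized by the number of successive differences.

```python
import math
import statistics


def rr_features(r_peaks, fs=300.0):
    """Compute basic RR-interval statistics.

    r_peaks: sample indices of detected R waves (at least three peaks).
    fs: sampling rate in Hz (300 Hz for this dataset).
    """
    # RR interval = time between consecutive R peaks, in milliseconds.
    rr = [(b - a) / fs * 1000.0 for a, b in zip(r_peaks, r_peaks[1:])]
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return {
        "mean_rr": statistics.fmean(rr),
        "std_rr": statistics.pstdev(rr),
        "var_rr": statistics.pvariance(rr),
        "max_rr": max(rr),
        "min_rr": min(rr),
        # Fraction of successive differences larger than 50 ms.
        "pnn50": sum(abs(d) > 50.0 for d in diffs) / len(diffs),
        # Root mean square of successive differences.
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # Standard deviation of successive differences.
        "sdsd": statistics.pstdev(diffs),
    }


features = rr_features([0, 300, 600, 930, 1200])
```

The per-segment statistics (mean, variance, skewness, and kurtosis over six segments) would apply the same kind of computation to each slice of the RR series.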
The statistical characteristics of the P wave include the mean, variance, skewness, kurtosis, sample entropy, and sample entropy coefficient, as well as the mean, variance, and skewness of each of the six segments into which the P wave is divided.
In order to extract the features of the ECG signal more comprehensively, we also extract signal features based on the medical field and the frequency domain. For the frequency-domain features, the ECG data are first transformed from the time domain into the frequency domain, and frequency-related features are then extracted. In this paper, the periodogram power spectral density (PSD) and the energy spectral density are calculated. The PSD is calculated using the Fast Fourier Transform (FFT). After the transformation, the energy within each specific range (band) is obtained. The bands are delimited by five frequencies: 0.1, 6, 12, 20, and 30 Hz. Another four features compute the variation based on QRS [
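The band-energy computation can be sketched as below. This is an illustrative sketch under several assumptions: NumPy is used for the FFT, the periodogram is computed without windowing, the one-sided scaling factor is omitted, and the five frequencies are interpreted as the edges of four adjacent bands. The function name `band_energies` is hypothetical.

```python
import numpy as np

# Band edges in Hz, as listed in the text (four adjacent bands).
BAND_EDGES_HZ = [0.1, 6.0, 12.0, 20.0, 30.0]


def band_energies(signal, fs=300.0, edges=BAND_EDGES_HZ):
    """Sum the periodogram PSD inside each frequency band."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    # Periodogram estimate (one-sided scaling factor omitted).
    psd = (np.abs(spectrum) ** 2) / (fs * x.size)
    energies = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        energies.append(float(psd[mask].sum()))
    return energies


# Usage: a 10 s, 8 Hz sine sampled at 300 Hz concentrates its energy
# in the second band (6 to 12 Hz).
t = np.arange(3000) / 300.0
energies = band_energies(np.sin(2 * np.pi * 8.0 * t))
```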
Based on expert knowledge, this model performs time domain and frequency domain feature extraction on the denoised ECG signal to obtain traditional feature vectors. It uses a convolutional residual network and a gated recurrent unit to form a deep learning network, and the padded ECG signal is input into this network to obtain deep feature vectors.
The two feature vectors obtained are fused into one feature vector in series and input into the classifier composed of the fully connected layers to classify ECG signals, as shown in Figures
The structure of the proposed simple feature fusion.
The specific process of simple feature fusion.
In view of the shortcomings of the above-mentioned concatenation method, this section uses discriminant canonical correlation analysis (DCCA) [
The structure of the proposed DCCA feature fusion.
In this paper, the discriminant canonical correlation analysis (DCCA) method is used to fuse the deep learning features and the traditional features. The two feature vectors are extracted separately from the preprocessed ECG signals, and the DCCA method is then used for feature fusion. The specific implementation is divided into four steps as follows:
Find a set of projection direction Calculate the intraclass correlation matrix
A graphical representation of the relationship between sample features. Hexagons and circles represent the features, solid lines represent correlation within a class, and dashed lines represent correlation between classes.
Then, the intraclass correlation matrix and the interclass correlation matrix are, respectively, shown as
Solve the eigenvalues and eigenvectors. The optimization problem of DCCA can be transformed into
Use the Lagrangian multiplier method to solve the above optimization problem, which turns it into a problem of finding eigenvalues and eigenvectors.
The eigenvector For each pair of samples
Block diagram for realizing canonical correlation analysis.
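The DCCA steps can be sketched with NumPy as follows. This is a hedged illustration of one common DCCA formulation, not necessarily the exact procedure used here. It assumes that the features are centered, in which case the interclass correlation matrix is the negative of the intraclass one and the problem reduces to maximizing the intraclass correlation under unit-variance constraints. It also assumes that the projected features are fused by concatenation and that a small ridge term (`reg`) is added for numerical stability. The names `dcca_fit`, `dcca_fuse`, and `_inv_sqrt` are hypothetical.

```python
import numpy as np


def _inv_sqrt(S, reg=1e-6):
    """Inverse square root of a symmetric matrix (ridge-regularized)."""
    vals, vecs = np.linalg.eigh(S + reg * np.eye(S.shape[0]))
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def dcca_fit(X, Y, labels, n_components, reg=1e-6):
    """X: (n, p) and Y: (n, q) feature matrices; labels: (n,) class ids."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    # Intraclass correlation matrix: sum over classes of the outer
    # product of the per-class feature sums.
    Cw = np.zeros((X.shape[1], Y.shape[1]))
    for c in np.unique(labels):
        sx = Xc[labels == c].sum(axis=0)
        sy = Yc[labels == c].sum(axis=0)
        Cw += np.outer(sx, sy)
    # Whiten both views, then solve the eigenproblem through an SVD.
    Sx_is = _inv_sqrt(Xc.T @ Xc, reg)
    Sy_is = _inv_sqrt(Yc.T @ Yc, reg)
    U, _, Vt = np.linalg.svd(Sx_is @ Cw @ Sy_is)
    Wx = Sx_is @ U[:, :n_components]
    Wy = Sy_is @ Vt[:n_components].T
    return Wx, Wy, mx, my


def dcca_fuse(X, Y, Wx, Wy, mx, my):
    """Project both feature sets and concatenate them (assumed fusion)."""
    return np.hstack([(X - mx) @ Wx, (Y - my) @ Wy])
```

In this sketch, `X` would hold the traditional feature vectors and `Y` the deep learning feature vectors of the same recordings; the fused output would then be passed to the classifier.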
In order to optimize the atrial fibrillation detection model, a large number of experiments are carried out using a single-lead ECG dataset. The models in this article are trained on a server equipped with a Tesla V100-SXM2 GPU and the Ubuntu 16.04 operating system, with 32480 MiB of memory.
In this paper, four metrics are used to evaluate the classification performance of the experiments: the F1 score of the normal class, the F1 score of the atrial fibrillation class, the F1 score of the other class, and the average of these three F1 scores. These four metrics are defined as
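As a sketch of how these scores can be computed, the following assumes the usual per-class F1 definition, in which twice the number of correctly classified recordings of a class is divided by the sum of its true total and its predicted total. The function names and the example confusion matrix are hypothetical, and the class order (normal, AF, other, noisy) is an assumption.

```python
def per_class_f1(confusion):
    """confusion[i][j]: recordings of true class i predicted as class j."""
    scores = []
    for c in range(len(confusion)):
        true_total = sum(confusion[c])                    # row sum
        pred_total = sum(row[c] for row in confusion)     # column sum
        denom = true_total + pred_total
        scores.append(2 * confusion[c][c] / denom if denom else 0.0)
    return scores


def mean_f1(confusion, n_scored=3):
    """Average the F1 scores of the first n_scored classes."""
    return sum(per_class_f1(confusion)[:n_scored]) / n_scored


# Illustrative confusion matrix, rows/columns: normal, AF, other, noisy.
example = [
    [8, 1, 1, 0],
    [1, 6, 1, 0],
    [2, 0, 10, 0],
    [0, 0, 0, 2],
]
f1_scores = per_class_f1(example)
f1_mean = mean_f1(example)
```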
Because the noise recordings are too few and unbalanced, results computed over the entire dataset are unstable, so only the first three classes are used for the final F1 index. Even so, the noise class still affects the F1 scores of the other three classes. In addition to F1, we also use the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) to calculate accuracy (Acc), specificity (Spe), and sensitivity (Sen). The calculation formula is as follows:
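The three measures follow their standard definitions from TP, TN, FP, and FN, sketched here. The function name `acc_spe_sen` and the example counts are hypothetical; how the multi-class predictions are reduced to these four counts is not specified in this sketch.

```python
def acc_spe_sen(tp, tn, fp, fn):
    """Return (accuracy, specificity, sensitivity) from the four counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # fraction classified correctly
    spe = tn / (tn + fp)                    # true negative rate
    sen = tp / (tp + fn)                    # true positive rate
    return acc, spe, sen


# Illustrative counts only.
acc, spe, sen = acc_spe_sen(tp=90, tn=180, fp=20, fn=10)
```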
Four experiments are used to verify the feasibility and efficiency of the proposed feature fusion model. The first three experiments are comparative experiments.
In this experiment, after the ECG signal is denoised, its statistical features and frequency domain features are extracted manually based on expert knowledge, and finally, the XGBoost (Extreme Gradient Boosting) classifier is used for classification. The experimental block diagram based on traditional feature extraction and classification is shown in Figure
Block diagram of the AF detection pipeline using traditional features.
The XGBoost parameters are tuned using random grid search with cross-validation, and the optimal parameters are selected. The minimum leaf node weight is set to 20, the maximum depth of the tree is set to 11, the subsample is set to 0.8, the colsample_bytree is set to 0.9, and the learning rate is 0.2.
The minimum loss reduction is set to 1, the softmax objective function is used for classification, and the final F1 is 75%.
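For reference, these settings can be written as an XGBoost parameter dictionary. The mapping to parameter names is an interpretation: the minimum leaf node weight is assumed to be `min_child_weight`, the minimum loss reduction is assumed to be `gamma`, and `num_class` is assumed to be 4 because the dataset has four classes.

```python
# Hyperparameters reported in the text, mapped to XGBoost parameter names.
xgb_params = {
    "min_child_weight": 20,        # minimum leaf node weight
    "max_depth": 11,               # maximum depth of a tree
    "subsample": 0.8,              # row subsampling ratio
    "colsample_bytree": 0.9,       # column subsampling ratio per tree
    "learning_rate": 0.2,
    "gamma": 1,                    # assumed: minimum loss reduction
    "objective": "multi:softmax",  # softmax objective
    "num_class": 4,                # assumed: normal, AF, other, noisy
}
```

Such a dictionary could be passed as the parameter argument of `xgboost.train` together with a `DMatrix` of the expert features.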
In this experiment, the ECG signal is detected based on the model of residual network and gated recurrent unit. The experimental block diagram of using deep learning feature extraction to classify atrial fibrillation is shown in Figure
Block diagram of the AF detection pipeline using deep learning features.
First, the original ECG data are padded. Since the ECG recordings in the database vary in length from 9 s to 61 s and the convolutional network requires inputs of equal length, each recording is padded to the same length. This paper uses the maximum length of the ECG signals: the sampling rate is 300 Hz, and the maximum length is 18286 samples. Each ECG recording is input into the residual network. The residual network includes six residual convolution blocks, each of which consists of a convolution block, a residual block, and a one-dimensional average pooling layer. Each convolutional block includes four parts: a one-dimensional convolution layer with a stride of 1, a batch normalization layer, a leaky rectified linear unit (LeakyReLU), and a spatial dropout layer. After the residual network, the data are input to the gated recurrent unit for training. The number of neurons in the gated recurrent unit is 32. Finally, the result is output through the fully connected layer. The resulting F1 is 83%.
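The padding step can be sketched as follows. The text does not state the padding value or position, so this sketch assumes zeros appended at the end of each recording; the function name `pad_recording` is hypothetical.

```python
import numpy as np

MAX_LEN = 18286  # maximum recording length in samples, from the text


def pad_recording(signal, target_len=MAX_LEN):
    """Zero-pad a 1-D ECG recording at the end up to target_len samples."""
    x = np.asarray(signal, dtype=np.float32)
    if x.size > target_len:
        raise ValueError("recording is longer than the target length")
    # Assumption: zeros are appended after the signal.
    return np.pad(x, (0, target_len - x.size), mode="constant")


# Usage: a 9 s recording at 300 Hz has 2700 samples.
padded = pad_recording(np.ones(2700))
```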
In this experiment, the features are simply concatenated and input to the fully connected layer for classification.
The feature vectors based on expert knowledge and the feature vectors extracted by the residual network and gated recurrent unit are concatenated in series to obtain the fused features, which are input to the fully connected layer for classification. The specific process is as follows: first, a flatten layer is added to make the traditional feature vector one-dimensional; then, the deep learning model is trained, and the output of the last hidden layer of the gated recurrent unit is taken as the deep learning feature vector; finally, the two feature vectors are concatenated into one, and a fully connected layer is added for classification. The value of F1 is 85%, and the accuracy and loss diagrams are shown in Figures
The accuracy diagram of series feature fusion.
The loss diagram of series feature fusion.
In this experiment, the feature vectors extracted by the traditional feature extraction method based on expert knowledge and the deep learning feature vectors extracted using the residual network and gated recurrent unit are fused with discriminant canonical correlation analysis and then input to the fully connected layer for classification. The final accuracy on the validation set is 91.7%, and F1 is 88%. The accuracy and loss diagrams are shown in Figures
The accuracy diagram of DCCA feature fusion.
The loss diagram of DCCA feature fusion.
Results of the different models.
Model | F1 (normal) | F1 (AF) | F1 (other) | F1 (mean) | Acc | Spe | Sen |
---|---|---|---|---|---|---|---|
Expert features | 87% | 73% | 65% | 75% | 79% | 82% | 72% |
ResNet+GRU | 91% | 81% | 77% | 83% | 86% | 85% | 84% |
Simple fusion | 92% | 83% | 80% | 85% | 88% | 89% | 86% |
Proposed | 93% | 88% | 84% | 88% | 92% | 93% | 90% |
As can be seen from Table
In order to verify the effectiveness of the proposed method, comparisons are also performed with previous studies. Table
Comparison of previous studies of ECG based on the PhysioNet/CinC challenge 2017 public dataset.
Method | F1 (normal) | F1 (AF) | F1 (other) | F1 (mean) | Acc | Spe | Sen |
---|---|---|---|---|---|---|---|
Convolutional recurrent neural network [ | 92.4% | 81.4% | 80.9% | 84.9% | 87.5% | 94.6% | 82.9% |
Decision tree ensemble [ | 88.9% | 79.1% | 70.2% | 79.4% | —— | —— | —— |
16-layer 1D residual convolutional network [ | 90.0% | 82.0% | 75.0% | 82.0% | 80.2% | —— | —— |
2D convolutional network with LSTM layer [ | 88.8% | 76.4% | 72.6% | 79.2% | 82.3% | —— | —— |
1DCNN containing residual blocks and recurrent layers [ | 91.9% | 85.8% | 81.6% | 86.4% | —— | —— | —— |
Proposed in this paper | 93.1% | 88.3% | 84.0% | 88.3% | 91.7% | 93.2% | 90.4% |
This paper proposes a classification method for atrial fibrillation signals based on feature fusion with discriminant canonical correlation analysis. This method not only extracts the deep learning features of ECG signals but also fuses them with the traditional features of the same ECG samples. With DCCA, both the correlation within classes and the correlation between classes are taken into account, and the recognition results are better than those of series feature fusion and of deep learning or traditional features used alone. This method has been verified on the public short single-lead ECG dataset of the 2017 PhysioNet/CinC Challenge, with a validation accuracy of 91.7%, a sensitivity of 90.4%, and a specificity of 93.2%. The database used in this article has large differences in size among its categories, which shows that the fusion method in this article improves the overall accuracy while taking other measurement standards into account and steadily improves the classification performance of ECG signals. However, this paper only considers the comprehensive and complementary representation of ECG features through feature-level fusion and does not consider decision-level fusion, such as neural network algorithms, hidden Markov models, and combinations of multiple classifiers. In future research, the classification model and feature fusion method will be further improved. On the basis of DCCA feature fusion, kernel-based DCCA will be introduced. At the same time, more advanced classifiers will be selected for classification and recognition, which is expected to further improve the recognition results.
The datasets used during the present study are available from the corresponding author upon reasonable request or can be downloaded from
The authors declare no conflict of interest.
Q.Z. and C.C. performed the conceptualization; J.S. contributed to the methodology; J.S. and C.C. helped in the validation; Q.Z., H.L., and M.S. performed the formal analysis; J.S. did the investigation; Q.Z., H.L., and M.S. helped in finding resources; J.S. wrote and prepared the original draft; C.C. and Q.Z. wrote, reviewed, and edited the manuscript; Q.Z. did the supervision, project administration, and funding acquisition. All authors have read and agreed to the published version of the manuscript.