R Peak Detection Method Using Wavelet Transform and Modified Shannon Energy Envelope

Rapid automatic detection of the fiducial points—namely, the P wave, QRS complex, and T wave—is necessary for early detection of cardiovascular diseases (CVDs). In this paper, we present an R peak detection method using the wavelet transform (WT) and a modified Shannon energy envelope (SEE) for rapid ECG analysis. The proposed WTSEE algorithm performs a wavelet transform to reduce the size and noise of ECG signals and creates SEE after first-order differentiation and amplitude normalization. Subsequently, the peak energy envelope (PEE) is extracted from the SEE. Then, R peaks are estimated from the PEE, and the estimated peaks are adjusted from the input ECG. Finally, the algorithm generates the final R features by validating R-R intervals and updating the extracted R peaks. The proposed R peak detection method was validated using 48 first-channel ECG records of the MIT-BIH arrhythmia database with a sensitivity of 99.93%, positive predictability of 99.91%, detection error rate of 0.16%, and accuracy of 99.84%. Considering the high detection accuracy and fast processing speed due to the wavelet transform applied before calculating SEE, the proposed method is highly effective for real-time applications in early detection of CVDs.


Introduction
An electrocardiogram (ECG) is a recording of the electrical activity of the heart [1,2] and a graphical representation of the signals obtained from electrodes placed on the skin near the heart [1,3,4]. The recent use of computers in conducting ECG analysis allows the patterns of the ECG signal, composed of multiple cycles that include numerous sample points, to be visualized [5]. Some of these sample points are fiducial-namely, the P wave, QRS complex, and T wave [4]. The identification of these points is a critical step in analyzing the ECG signal and has become possible by analyzing its morphological patterns [6].
In 2012, cardiovascular diseases (CVDs) accounted for 37% of premature (under the age of 70) noncommunicable disease mortality [7][8][9]. The detection of fiducial points is critical for the initial diagnosis and analysis of CVDs. In Figure 1, we can see the peaks of the QRS complex, the highest of which is known as the R peak in the QRS interval [10]. Other time segments are also shown, such as the P-Q and S-T segments. The R peak in the QRS interval is the most important feature for analyzing the ECG data. All these waves are electrical manifestations of the contractile activity of the heart [11]. Detection of the main characteristic waves in an ECG is one of the most essential tasks, and the performance of any CVD analysis method depends on the reliable detection of these waves. R peak detection in ECG is one such method that is widely used to diagnose heart rhythm irregularities and estimate heart-rate variability (HRV) [12,13].
The Pan and Tompkins methods (PT) [14] appear to be the most common benchmark given that they incorporate several fundamental techniques including low-pass filtering, high-pass filtering, derivative filtering, squaring, and windowing for the detection of the R peaks. The principal drawback of a filtering-based approach is the adverse effect on performance [20] because of the change in frequency of the characteristic wave. Shannon energy with the Hilbert transform method (SEHT) [22] provides good accuracy for detecting R peaks. However, the Hilbert transform in SEHT requires large memory and processing time, making it unsuitable for real-time application. Moreover, SEHT detects many noise peaks in ECG data with long pauses. Zhu and Dong [23] developed an R peak detection method called PSEE by using only the SEE. The authors used an amplitude threshold that affects the performance of the algorithm for valid peak detection. A QRS complex generally overlaps in the frequency domain [26], resulting in false positive detections. Some of the threshold techniques are highly noise-sensitive [17]; therefore, developments of sophisticated, automatic, and computationally efficient techniques are required that can outperform existing methods to ensure the real-time analysis of an ECG for the proper diagnosis of CVDs.
In this paper, we propose a novel approach of using the wavelet transform (WT) and modified SEE for the rapid detection of R peaks in the QRS complex. First, the proposed WTSEE algorithm performs a WT to reduce the size and noise of ECG signals and subsequently calculates the SEE after first-order differentiation and amplitude normalization. Following this step, the peak energy envelope (PEE) is made from the SEE for easy identification of peaks. R peaks are then estimated from the PEE, and the estimated peaks are adjusted from the input ECG. Finally, the algorithm generates the final R peaks by validating R-R intervals and updating the extracted R peaks.
The proposed R peak detection method was validated using 48 first-channel ECG records of the MIT-BIH arrhythmia database with sensitivity, positive predictability, detection error rate, and accuracy. We also measured the mean of R-R interval (MRR), standard deviation of normal to normal R-R intervals (SDNN), and root mean square of successive heartbeat interval differences (RMSSDs), which can be used for analyzing heart rate variability.
The remainder of this paper is organized as follows. In Section 2, we discuss the proposed R peak detection method using the wavelet transform and SEE. In Section 3, we provide the experimental results, the performance analysis of the proposed method, and a discussion of the real-time implementation. In Section 4, we conclude our work and provide insight into future study.

Methods
An ECG signal consists of many cycles of P, Q, R, and S waves, with each cycle comprising many sample points. In MIT-BIH record 100, a cycle comprises approximately 280 sample points [27]. There are up to 10 fiducial points that determine the overall characteristics of a cycle of ECG signal. Among various existing SEE-based methods, band-pass filters, such as the Chebyshev type I filter and Butterworth filter, are used for denoising input ECG signals as preprocessing steps. In the proposed method, wavelet transform replaces the band-pass filters by level 2 down-sampling, soft thresholding for denoising detailed coefficients [17], and reconstructing the level 1 signal. By applying this procedure, we can reduce the size and time required for extracting R peaks.
The proposed WTSEE algorithm extracts these fiducial points (R peaks) from the down-sampled ECG signals by calculating SEE, repeating a similar procedure as SEE to emphasize peak information (thus, we call this procedure the peak energy envelope), detecting R peaks, and updating the R peaks by comparing the R-R intervals. Figure 2 presents the data flow in the proposed R peak detection algorithm with its principle stages. First, the proposed WTSEE algorithm performs a wavelet transform, instead of a band-pass filter, to reduce the size and noise of ECG signals. It then calculates SEE after first-order differentiation and amplitude normalization. In the third stage, the PEE is created from the SEE for easy identification of peaks. In the next stage, R peaks are estimated from the PEE, and the estimated peaks are adjusted from the input ECG. Finally, the algorithm generates the final R peaks by validating R-R intervals and updating the extracted R peaks. In Figure 2    analysis combines filtering and down-sampling as shown in Figure 3 [4,17].
(1) As the first step of the proposed method, WT reduces the size and noise of an ECG. By applying these transform, memory requirements and processing time are dramatically reduced.
Symlets wavelet sym5 is chosen as the wavelet function to decompose the ECG signals, and the thresholding method    Low-pass filter  is employed to remove the noise [4]. Figure 4 demonstrates the difference between the scaling function and wavelet function with symlets [28]. By applying this procedure, the peak signal becomes clearer. The sym5 wavelet transformation is performed as where A C is an approximation coefficient vector, D C is a detail coefficient vector, E I is an input ECG signal vector, and dwt * is a DWT function.
(2) To reduce the noise of an ECG signal, we applied a soft thresholding that is recognized as more powerful than hard thresholding as where coefficients D C j and D C j are detail coefficients after and before thresholding, respectively. In the proposed scheme, we chose the universal threshold selection method. Here, the value of threshold (t) is computed as where N is the total number of wavelet coefficients and σ = median D C j /0 6745 is the standard deviation of the noise. After the noise removal, reconstruction is applied to obtain noise-free ECG signal as follows: where A C is a previously extracted approximation coefficients vector, D C is a noise-removed detail coefficient vector, E F is a denoised ECG signal, and idwt * is an inverse DWT function.

Shannon Energy Envelope Calculation.
(3) After applying the 2nd DWT along with noise removal and reconstruction, we perform the first-order differentiation of the signal to obtain the slope information. The first-order differentiation is equivalent with a high-pass filter that passes high-frequency components (QRS complex) and attenuates lower frequency components (P and T waves). The mathematical implementation of the first-order differentiation can be shown as (4) Next, the signal is normalized to scale its value to 1. This is to prepare the signal for SEE computation.
(5) After differentiating the ECG signal, it becomes a bipolar signal. Given that the method is based on peak detection, we must transform the differentiated signal into a unipolar signal. The unipolar signal can be obtained by the following Shannon energy function: To obtain a smooth SEE, the signal is passed through a zero-shift moving average filter [13] that can be considered as a moving window integrator. The mathematical expression of a moving average filter is expressed in the following where the window width is set as 33: The length of the moving average filter is an important parameter for this detection method. Generally, the length of the moving average filter is taken as approximately the width of the QRS complex. In the existing method [13], 65 samples are used for the moving average filter. However, in this paper, we reduce the length to 33.

Peak Energy Envelope Calculation.
After the SEE calculation stage, we obtain the smooth peak signal, but it contains both true R peaks and false R peaks. In this stage, we thus attenuate the false R peaks and emphasize the true R peaks. This is performed by similar steps to those of the SEE, so we call this procedure the peak energy envelope (PEE). (7) It is natural that the amplitude values for true R peaks are higher than those for false peaks. Thus, if we take the first-order differentiation of the signal, it stores the slope information of the true peaks but reduces the slope information of the false peaks. The first-order differentiation can be expressed as (8) Next, the signal is amplitude normalized to unity as The normalized bipolar signal is converted to a unipolar signal by a squaring operation. The amplitude of false peaks is very low, and the squaring operation will attenuate these peaks completely. As a result, the true R peaks are amplified, and false R peaks are diminished. The squared unipolar signal can be expressed as: (10) The signal is then passed through a moving average filter to obtain a resulting signal with smooth peak. In our proposed technique, we use a moving average filter length of 43 samples, which is smaller than the existing length of    resized ECG after wavelet transform, (c) output of 1st-order differential, (d) squared differential, (e) normalized differential, (f) calculated Shannon energy (SE), (g) extracted Shannon energy envelope (SEE) after moving average, (h) normalized difference of SEE, (i) calculated peak energy (PE) after square operation, (j) extracted peak energy envelope (PEE) after moving average, (k) estimated peaks in PEE (red circles), and (l) true detected R peaks in ECG signal (red circles).
85 [13]. The final signal expressed next is used for peak detection where the window width is set as 43. Figure 5 shows an example flow of the proposed WTSEE method for detecting R peaks. After squaring, to obtain a smooth SEE, a moving average filter operation is performed. The output of the moving average filter does not provide any unnecessary rising peaks. (11) The locations of rising peaks are referred to as the locations of true R peaks, as shown in the Figure 5(k). Thus, no amplitude threshold value is required for the detection of R peaks.
where R E is the estimated locations of rising peaks and f indPeaks * is a peak-finding function from smooth PEE P S . (12) In this stage, the R peaks are detected by applying a peak-finding algorithm. However, the detected peak locations are slightly different from the actual positions of the R peaks in the ECG signal. Thus, to find the real positions of R peaks, the actual sample instant of R peaks in the input ECG signal is found by searching for the maximum amplitude within ±25 samples of the identified location in the previous step as where R C k is the searched real position from the input ECG signal and E F k is the amplitude of the kth positions of estimated R peak R E k . In Figure 5, the total signal processing of the proposed method is shown, where red circles indicate the detected R peaks by using the proposed method in Figure 5(l).
2.5. R Peak Update. As the final procedure, the previously detected peak set R C is validated and updated using the following steps. This validation and update process is based on the R-R intervals between neighboring R peaks. The objective of this procedure is balancing the R-R intervals. (13) Measure the x-axis intervals between neighboring R peaks as follows: Generate the final R feature, R F x , by repeating the validation and update for all candidate peaks of R C x . Each peak R C x k can be classified into one of three categories according to the value of the ΔR C k : (a) R C x k is not included to R F x if the x-axis intervals between two neighboring peaks, ΔR C k − 1 or ΔR C k , are less than a given threshold; that is, where θ Δ1 is a threshold value that determines the minimum offset between R-R peaks. This threshold value can be set as follows: where α = 0 5 and μ Δ is the average spacing of R-R peaks.
(b) Find additional R peaks between R C x k−1 and R C x k+1 when x-axis intervals with two neighboring peaks are greater than a given threshold: where θ Δ2 = β * μ Δ is a threshold value that determines the maximum interval between R-R peaks and γ is a parameter for defining the size of search area. We used β and γ as 1.5 and 0.5, respectively.
(c) R C x k is maintained as R F x when the x-axis intervals are located at the proper intervals:

Experimental Data.
We used the MIT-BIH arrhythmia database [28] from the PhysioNet site. This database was developed with the aim of benchmarking references for automated analysis of ECG signals. The MIT-BIH arrhythmia DB has 48 two-channel ambulatory ECG recordings of 30 min duration each. They are available at different frequencies and different time lengths. These signals are sampled at 360 samples per second and have a resolution of 11 bits over a 10 mV range. We used the first lead of all 48 records to validate the performance of the proposed method. There are 112,647 labeled beats, where the "Normal beat" class contains 75,052 beats of annotation type "1." Table 1 summarizes the annotations in the MIT-BIH arrhythmia DB and shows the examples of the 23 types of annotated beats. In the table, 1: NORMAL, 2: LBBB,…, and 38: PFUS are the annotation types and their abbreviations, respectively (please refer to the site for the meaning, annotation types, and their abbreviations). And # indicates the number of corresponding beats in MIT-BIH DB, and R-# is the record number of illustrated data.
As shown in Table 1, there are many abnormal beats in the ECG signals, so considering all annotations to be the reference peaks is not the correct way to verify the performance of the R peak detection methods. To solve the problem of incorrect or ambiguous annotation, we apply the validation process to determine whether each annotated position is true peak or not. Figure 6 shows examples of the detected R peaks from various records in the MIT-BIH DB. In this figure, black asterisks ( * ) denote the annotated beats in DB and red circles (O) denote the extracted peaks. As shown in the figure, the proposed method can detect normal R peaks under various conditions such as baseline drift, noisy signal, tall T waves as shown in Figure 6(c), or long paused waves as shown in Figure 6(h). Figure 7 shows another example of the detected R peaks and annotated beats from various records in the MIT-BIH database. In this figure, numbers, such as 1, 2, 3, 5, 14, and 28, indicate the annotation types of each beat. As shown in the figure, the proposed method can remove abnormal beats at the initial position of the ECG signal and correct the location errors of record 117 as shown in Figure 7(c).

R Peak Detection Results.
Despite the superiority of the proposed method, detection errors still exist.  method. FP is mainly generated in a high-frequency region, and FN occurs in a small-noise area between R peaks. Especially, the DER of the proposed method appears the best among others.

Evaluation Measure and Results.
To evaluate the performance of our proposed R peak detection method, we require three parameters-namely, true positive (TP), false negative (FN), and false positive (FP)-from the detected R peak.   Here, TP is the number of correctly detected R peaks, FN is the number of missed R peaks, and FP is the number of noise spikes incorrectly detected as R peaks. Sensitivity (Se), positive predictability (+P), detection error rate (DER), and accuracy (Acc) can be computed by using TP, FN, and FP by using the following equations, respectively, as widely used in the literatures [1,2,4,13,[24][25][26]: where TP are the correctly detected beats (true positive), FP are falsely detected beats (false positive), and FN are the undetected beats (false negative). The sensitivity is used for evaluating the ability of the algorithm to detect true beats, the positive predictability is used for evaluating the ability of the algorithm to discriminate between true and false beats, and the error rate is used for evaluating the accuracy of the algorithm [25]. The performance of the proposed R peak detection method for 48 ECG recordings of the MIT-BIH arrhythmia database is summarized in Table 2. The proposed method detects a total of 109,415 (In total, there are 116,137 annotated beats in MIT-BIH DB, but this number varies among different references due to the use of different steps and comparing tools [24]) true peaks. It also produces 79 FNs and 99 FPs. The average accuracy for the proposed method is 99.838%, the sensitivity is 99.93%, the positive predictability is 99.91%, and the detection error rate is 0.163%. The obtained performance is comparable to the best performances reported in the literature considering the processing speed and memory use. The high detection accuracy of the proposed method is especially meaningful because it is based on wavelet transform. Previous studies using wavelet transform did not show high performance compared with nonwavelet transform-based methods. Moreover, the final validation procedure proposed in our method verifies the validity of each detected peak and further improved the peak detection accuracy.
In Table 3, the performance of the proposed method on the MIT-BIH arrhythmia DB is compared with other existing methods. It shows that our proposed method has comparable accuracy to other conventionally used methods including wavelet transform techniques, the differential operation method, the Pan and Tompkins algorithm, SEHT, and the Shannon energy technique.
The previous four measures are related only to the R peaks. Recently, measures of HRV have been suggested, such as the mean of R-R intervals MRR , standard deviation of Sample 50000 Sample 230000 Sample 460000 Annotated beats Detected peaks   normal to normal R-R intervals SDNN , and root mean square of normal to normal RR intervals RMSSD , where R is the peak of a QRS complex (heartbeat) [29].
(i) MRR is measured by where N is the total heart beats N = TP + FN , I n is R-R interval between nth R peak and previous R peak, and I − is a mean of R-R intervals.
(ii) SDNN is defined by (iii) RMSSD is measured by I n − I n − 1 2 23 Figure 9 depicts the measured MMR, SDNN, and RMSSD from R-R intervals of the extracted R peaks. By investigating these measures, initial diagnosis of CVDs for patients can be achieved. In Figure 9, half of MRRs are illustrated to show three different kinds of measures together. The proposed method obtained MRR of 0.808, SDNN of 0.145, and RMSSD of 0.193, on average.

Experimental Environment and Real-Time Implementation.
The proposed R peak detection method using WT and modified SEE has been implemented with MATLAB R2016b on a PC with a 3.3-GHz Intel Core i5-4590 CPU, 4 GB of memory on Windows 7. The average processing time of our method is approximately 13.7 s on 48 ECG records in the MIT-BIH arrhythmia DB with a length of 30 minutes. As a result, the average throughput is 28.6 ms for an ECG signal. We showed exact parameter values used in our experiments around each of the equations throughout the manuscript to help researchers who want to implement our methods. It is recommended that researchers carefully follow the descriptions provided in Section 2.5 to obtain the same peak detection accuracies as those presented in this paper.  0  100  101  102  103  104  105  106  107  108  109  111  112  113  114  115  116  117  118  119  121  122  123  124  200  201  202  203  205  207  208  209  210  212  213  214  215  217  219  220  221  222  223  228  230  231  232 233 234 Figure 9: HRV measurements of the MIT-BIH DB. MRR: mean R-R intervals (ms), SDNN: standard deviation of normal to normal R-R intervals, RMSSD: root mean square of normal to normal R-R intervals.

3.5.
Discussions. This study presented our approach using WT and modified SEE for detecting R peaks of an ECG signal in the QRS complex. We used WT to take advantage of its representation power of temporal features at various resolutions. We applied WT to ECG signals to perform both down-sampling and noise reduction in a single step. This WT replaces the band-pass filtering used in a few representative ECG signal analysis methods [13,25]. We also applied a soft-thresholding method for noise removal with keeping informative signals. The application of SEE and PEE can be considered as similar to the methods in [13]. However, we introduced additional postprocessing steps (Section 2.5) where the R-R peak intervals were inspected to determine whether to keep each detected R peak or detect additional R peaks in an R-R peak interval. The novelty of our work is twofold: (1) Firstly, we proposed an ECG analysis technique that starts with WT method. There have been many WT-based methods, but their performances have been observed lower than those of non-WT-based methods [1,4,10,13,[16][17][18]25]. Our study demonstrates the possibility of obtaining high-peak detection accuracy in ECG signal analysis by using WT method. (2) Secondly, we proposed a postprocessing method that verifies the validity of each detected peak and indicates the probability of a missing peak.
The experimental results of our study show reasonable R peak detection performance with using less memory and processing time. This study has diverse applications in the initial diagnosis of CVDs for remote patients. The quantitative parameters (i.e., sensitivity, positive predictability, detection error rate, and accuracy) confirm acceptable performance of the proposed WTSEE detection method in practical applications.

Conclusion
In this paper, we presented a novel approach to detecting R peaks of an ECG signal and calculating the R-R interval of the R peaks via the wavelet transform (WT) method and a modified Shannon energy envelope (SEE) for rapid ECG analysis. The proposed method utilized WT and showed superior performance to other WT-based methods. We believe that the use of WT combined with subsequent processing steps properly analyzed the ECG signals to detect R peaks with high accuracy. Moreover, the proposed postprocessing method effectively validated each detected peak and indicated miss detection to further improve the peak detection accuracy.
The proposed system was tested using the MIT-BIH arrhythmia DB. We analyzed the system performance for sensitivity, positive predictability, detection error rate, and accuracy. We compared the proposed approach with the existing techniques to validate the superior performance of the former. In the future, we would like to apply our analysis method to detecting other peak points in ECG signals. Detecting various types of peaks in ECG signal will provide more useful information for diagnosing CVDs and create a number of related applications.