Deep Layer Kernel Sparse Representation Network for the Detection of Heart Valve Ailments from the Time-Frequency Representation of PCG Recordings

Heart valve ailments (HVAs) are caused by defects in the valves of the heart and, if untreated, may lead to heart failure, clots, and even sudden cardiac death. Automated early detection of HVAs is necessary in hospitals for the proper diagnosis of pathological cases, to provide timely treatment, and to reduce the mortality rate. Heart valve abnormalities alter the heart sounds and murmurs, which can be faithfully captured by phonocardiogram (PCG) recordings. In this paper, a time-frequency based deep layer kernel sparse representation network (DLKSRN) is proposed for the detection of various HVAs using PCG signals. The spline kernel-based Chirplet transform (SCT) is used to evaluate the time-frequency representation of the PCG recording, and features such as the L1-norm (LN), sample entropy (SEN), and permutation entropy (PEN) are extracted from the different frequency components of this time-frequency representation. The DLKSRN, formulated using the hidden layers of extreme learning machine- (ELM-) autoencoders and kernel sparse representation (KSR), is used for the classification of PCG recordings as normal or as one of the pathological cases: mitral valve prolapse (MVP), mitral regurgitation (MR), aortic stenosis (AS), and mitral stenosis (MS). The proposed approach has been evaluated using PCG recordings from both public and private databases, and the results demonstrate that average sensitivities of 100%, 97.51%, 99.00%, 98.72%, and 99.13% are obtained for the normal, MVP, MR, AS, and MS cases, respectively, using the hold-out cross-validation (CV) method. The proposed approach is applicable to Internet of Things- (IoT-) driven smart healthcare systems for the accurate detection of HVAs.


Introduction
Heart valve ailments (HVAs) are cardiovascular abnormalities that occur due to a defect in any of the valves (tricuspid, pulmonary, mitral, and aortic) of the heart [1,2]. The valves of the heart prevent the backward flow of blood, and for the proper functioning of the heart, each valve should open and close effectively during the flow of blood from one chamber of the heart to another [3]. The HVAs are classified as mitral stenosis (MS), mitral valve prolapse (MVP), mitral regurgitation (MR), and aortic stenosis (AS) based on the defect in the heart valves [4]. MR occurs due to the improper closing of the mitral valve, which causes the reverse flow of blood from the left ventricle to the left atrium [5]. Similarly, aortic regurgitation (AR) refers to the improper closing of the aortic valve; as a result, the backward flow of blood from the aorta to the left ventricle may occur [5]. Moreover, MS is a problem in the opening of the mitral valve, where the left ventricle does not receive a sufficient amount of blood from the left atrium [6]. Similarly, the AS pathology refers to the improper opening of the aortic valve, which restricts the flow of blood from the left ventricle to the aorta [5,6]. For the diagnosis of these pathologies, different imaging techniques such as computed tomography scan, magnetic resonance imaging (MRI), echocardiography, and ultrasonic devices have been used [7][8][9][10]. It has been reported in the literature that various quantitative parameters such as transvalvular velocity, average valve area, and mean value of the transvalvular gradient have been considered to determine the progression of HVAs [11]. The aforementioned imaging modalities have limitations, such as the selection of tuning parameters in ultrasonic devices to obtain better resolution images of heart chambers and valves for the diagnosis of HVAs [10,12].
Also, these imaging techniques are costly and require trained medical staff for the accurate assessment of HVAs [13]. Phonocardiography (PCG) is a noninvasive and low-cost diagnostic test used for the detection of HVAs [14,15]. Diagnostic features such as the durations of the systolic and diastolic segments, the morphologies of the S1 and S2 components, and the appearance of murmurs have been investigated for the diagnosis of HVAs [14,16]. To assist clinicians in the diagnosis of HVAs, an automatic diagnosis system (ADS) will be helpful, especially while treating patients admitted to the intensive care unit, where the PCG signal is recorded and monitored continuously around the clock [17]. The ADS comprises the evaluation of various diagnostic features from the PCG recording and the automated classification of HVAs using the PCG signal features [13]. For smart healthcare and Internet of healthcare things (IoHT) applications [18,19], the automated diagnosis of HVAs from the PCG signal is a challenging area of research. Therefore, the development of new methods for the extraction of PCG signal features and the classification of HVAs is required.
In the last two decades, various algorithms have been used for the automated detection of HVAs using PCG signals. These algorithms have considered different feature extraction methodologies to extract the features from the PCG signal and used various machine learning classifiers for the categorization of HVAs. A review of various automated methods for the detection of HVAs has been reported in [20,21]. The time, frequency, time-scale, and time-frequency domain-based features from the PCG signal have been used for the detection of HVAs. The time-domain features from the PCG signals have been used in [22][23][24][25][26] for the categorization of both normal and abnormal heart sounds. Similarly, in [27][28][29][30], the frequency domain features from the PCG signals have been considered for the discrimination of normal and abnormal cardiac sounds. The time-scale-based methods such as discrete wavelet transform (DWT) [31,32], empirical mode decomposition (EMD) [31,32], and tunable Q-wavelet transform (TQWT) [33] of PCG signals have also been used for the detection of HVAs. Moreover, the time-frequency analysis-based approaches such as the short-time Fourier transform (STFT) [34,35], synchrosqueezing transform (SST) [36], and other time-frequency decomposition-based approaches [37][38][39] of PCG signals are used for the categorization of HVAs. The machine learning techniques such as the support vector machine (SVM) [40], random forest (RF) [41], convolutional neural network (CNN) [42], and hidden Markov model (HMM) [43] have been used for the classification of HVAs. It is evident from the literature that time-frequency and time-scale analysis-based approaches have demonstrated higher classification performance for the detection of HVAs using PCG signals. Son et al. [44] have combined the Mel frequency cepstral coefficients (MFCC) and DWT-based features from the PCG signals and used these features for the detection of HVAs.
They have considered various machine learning classifiers for HVA detection. In [45], the authors have applied a novel algorithm based on wavelet fractal dimension and a twin support vector machine (TWSVM) for the classification of HVAs using PCG signals. Moreover, Ghosh et al. [36] have extracted the magnitude and phase features from the time-frequency representation of the segmented PCG cycles for the discrimination of HVAs. They have used the synchrosqueezing transform (SST) for the evaluation of the time-frequency matrix from the PCG signal. The SST-based method has the drawback of poor time-frequency resolution for PCG signals, as it uses coefficient reassignment in the time-frequency plane based on the instantaneous frequency of the PCG signal [36,46]. Also, the SST method has shown lower performance for the detection of HVAs. The methods reported in the literature have segmented the PCG signal into cardiac cycles and then extracted features from the segmented heart sound cycles for the detection of HVAs. A PCG signal with multiple heart sound cycles effectively captures the variations in the amplitudes and shapes of the S1 and S2 sound components and the durations of the systolic and diastolic segments [14]. The existing approaches have not considered the PCG signals from all HVA classes to design the automated diagnosis frameworks. Therefore, an intelligent system which uses the PCG signal with multiple heart sound cycles and classifies all HVAs is required for healthcare applications.
The PCG signal is nonstationary, and the components of this signal such as S1, S2, and murmurs are nonlinear and time-varying [47,48]. In our previous work, we have analyzed the PCG signal using the Chirplet transform (CT) for the detection of HVAs [13]. The CT works well for chirp-like signals with linearly time-varying components [49,50]. However, the CT fails to capture the transition from the S1 component to the systolic murmur, and from the S2 component to the diastolic murmur, in the time-frequency plot of the pathological PCG signals [13]. In this work, we have considered the spline CT (SCT) as the extension of CT for the evaluation of the time-frequency matrix from the PCG signal. The SCT has the advantage of better time-frequency localization for the nonlinearly time-varying components of a nonstationary signal as compared to CT [51]. Therefore, we can expect that the time-frequency matrix computed using the SCT of the PCG signal can effectively capture the pathological variations and provide better resolution in the time-frequency domain of the PCG signal. Recently, the convolutional neural network (CNN) and stacked autoencoder- (SAE-) based deep neural network (DNN) methods have been used for the automated assessment of HVAs using PCG signals [44,52]. In order to obtain the optimal parameters in CNN and SAE networks, rigorous training based on the gradient descent algorithm is used [53]. Also, these networks require more instances during the training process for obtaining the optimal model parameters [54]. The DNN based on the extreme learning machine- (ELM-) autoencoder has the advantages that it requires less training time for the evaluation of the model parameters [55] and that the ELM-autoencoder model can be efficiently implemented in real time for dimension reduction [56]. The sparse representation-driven classification methods have been widely used for various biomedical applications [13,[57][58][59].
These methods require fewer features for training instances and also have fewer training parameters for the prediction of class labels from the test feature vectors [59]. The sparse representation classifier (SRC) has shown better performance as compared to other machine learning approaches for the classification of HVAs from PCG signal features [13]. The kernel sparse representation classifier (KSRC) uses the kernel trick to map the feature instances to a higher dimensional space, and the SRC is applied in that higher dimensional space for the classification [60,61]. The KSRC has shown better classification performance than the SRC for datasets which consist of nonlinearly separable feature instances [57,62]. Therefore, a DNN developed based on the ELM-autoencoder and the KSRC will be effective for the automated detection of HVAs using the time-frequency representation of the PCG signal. The contributions of this paper are written as follows: The remaining sections of this manuscript are organized as follows. In Section 2, the proposed method for the detection and classification of HVAs is described. The results obtained from the proposed work are discussed in Section 3, and conclusions are presented in Section 4.

Proposed Method
The flow diagram of the proposed HVA detection approach is depicted in Figure 1, and the various steps involved in the proposed approach are explained in detail in the following subsections.

PCG Signal Collection and Filtering.
In this work, we have collected the PCG recordings from a public database available at (https://github.com/yaseen21khan/Classification-of-Heart-Sound-Signal-Using-Multiple-Features-). The detailed description of the PCG signals database is given in [44]. The database contains a total of 1000 PCG recordings, with 200 PCG recordings in each class (normal or pathological). The annotations of the PCG signals for the normal (N) and pathological (MS, MR, AS, and MVP) classes are given in the database. The PCG recordings are given in wav file format, and these signals were recorded from the subjects with different time durations [44]. The resolution of each PCG recording in the database is 16 bits, and the sampling frequency is 8 kHz. In this work, the collected PCG recordings are downsampled to 4 kHz for the time-frequency analysis. Moreover, we have also evaluated the performance of the proposed approach using a private database of 15 recorded PCG signals. These 15 PCG signals were recorded from 15 different subjects (12 males and 3 females with the age group of 27 ± 5 years) using a Thinklab digital stethoscope (https://www.thinklabs.com/). The subjects have given written consent before the noninvasive recording of the PCG signal [36]. The sampling frequency of each recorded signal is 4 kHz. In this work, we have also considered the Michigan heart sound and murmur database (MHSMD) (http://www.med.umich.edu/lrc/psb_open/html/repo/primer_heartsound/primer_heartsound.html) to evaluate the performance of the proposed method. The MHSMD contains both normal and abnormal (AS, MS, MR, and MVP) PCG signals with a sampling frequency of 44.1 kHz [63]. Each PCG signal from the MHSMD has been downsampled to 4 kHz. For each PCG recording from these databases, a Butterworth bandpass filter with lower and upper cutoff frequencies of 25 Hz and 900 Hz, respectively, is used [64].
After filtering, the amplitude normalization is performed with respect to the maximum amplitude value of the PCG recording. The normalized PCG recording, $x(n)$, is evaluated as follows [65]:

$$x(n) = \frac{x(n)}{\max\left(|x(n)|\right)} \qquad (1)$$

where $|x(n)|$, $n = 1, 2, \cdots, N$, is the absolute value of the amplitude of the nth sample of the PCG recording, and $N$ is the total number of samples. After normalization, the time-frequency representation of each PCG recording is computed using the SCT. The following subsection describes the spline kernel-based CT for the extraction of the time-frequency matrix from the PCG recording.
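The preprocessing chain described above (downsampling to 4 kHz, Butterworth bandpass filtering between 25 Hz and 900 Hz, and peak normalization) can be sketched as follows. This is a minimal illustration and not the authors' code: the filter order (4), the zero-phase filtering, the polyphase resampler, and the function name `preprocess_pcg` are all assumptions, since the text specifies only the cutoff frequencies and the target sampling rate.

```python
import math

import numpy as np
from scipy.signal import butter, resample_poly, sosfiltfilt


def preprocess_pcg(x, fs_in, fs_out=4000, band=(25.0, 900.0), order=4):
    """Downsample, bandpass filter, and peak-normalize one PCG recording.

    The order and the zero-phase (forward-backward) filtering are
    illustrative assumptions; the paper only gives the cutoffs.
    """
    x = np.asarray(x, dtype=float)
    if fs_in != fs_out:
        g = math.gcd(int(fs_in), int(fs_out))
        # Polyphase resampling by the rational factor fs_out / fs_in.
        x = resample_poly(x, int(fs_out) // g, int(fs_in) // g)
    # Butterworth bandpass designed as second-order sections for stability.
    sos = butter(order, band, btype="bandpass", fs=fs_out, output="sos")
    x = sosfiltfilt(sos, x)
    # Amplitude normalization by the maximum absolute sample value.
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


# Usage on a synthetic 2 s signal sampled at 8 kHz (not real PCG data).
fs_in = 8000
t = np.arange(0, 2.0, 1.0 / fs_in)
raw = 0.3 * np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
clean = preprocess_pcg(raw, fs_in)
```

The same function covers the 44.1 kHz MHSMD recordings, because the resampling ratio is reduced to its lowest terms before the polyphase filter is applied.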

Spline Kernel-Based Chirplet Transform (SCT).
The spline kernel-based CT is the CT with a modified kernel function [51]. This modified kernel function uses different frequency-rotate and frequency-shift operators for the time-frequency representation of the nonstationary signal. For a PCG signal, $x(n)$, containing $N$ samples, the discrete SCT is evaluated as follows [51]: with $\bar{x}(n) = x(n) \cdot \Psi_R(n, Q) \cdot \Psi_S(n, \tilde{n}, Q)$. $T$ represents the time-frequency matrix, where $\Psi_R(n, Q)$ and $\Psi_S(n, \tilde{n}, Q)$ are the frequency-rotate and frequency-shift operators, respectively. The window function is given by [49,50]: The frequency-rotate operator is expressed as in (4), and the frequency-shift operator is as in (5) [51]: where $Q(i, l) = q_l^i$ represents the local polynomial coefficient matrix for the spline kernel. The parameter $L$ denotes the order of the spline. The parameter $\gamma_i$ in the SCT should satisfy the following conditions [51]: with the initial value $\gamma_1 = 0$. The index $i = 1, 2, \cdots, I$ denotes the ith piece, where the spline is defined in a piecewise polynomial form and $I$ is the total number of pieces [51]. For a pathological PCG signal, we have compared the time-frequency representations that are obtained using both the CT and SCT methods. The AS pathological PCG recording is shown in Figure 2(a). The time-frequency contour plots of the pathological PCG signal computed using CT and SCT are shown in Figure 2(b) and Figure 2(c), respectively. It can be observed from the figure that the time-frequency plot obtained using CT has an energy distribution between 25 Hz and 300 Hz. However, the murmurs are high-frequency sounds produced during the recording of the PCG signal [66,67]. It is clearly observed from the time-frequency plot of the PCG recording obtained using SCT that the murmur energies are distributed between 100 Hz and 780 Hz.
This shows that the information regarding the murmurs is not effectively captured in the CT-based time-frequency representation and that the SCT provides better time-frequency localization for the PCG recording as compared to CT. The PCG signals for the normal (N) class and the pathological classes MR, MS, AS, and MVP are depicted in Figure 3(a), Figure 3(c), Figure 3(e), Figure 3(g), and Figure 3(i), respectively, and the time-frequency plots of these signals obtained using SCT are shown in Figure 3(b), Figure 3(d), Figure 3(f), Figure 3(h), and Figure 3(j), respectively. It can be observed that the pattern associated with the pathological PCG signal has a different morphology for each type of HVA as compared to the normal PCG signal. The energies in the S1 and S2 components of the normal PCG signals are grossly distributed from 25 Hz to 300 Hz (as shown in Figure 3(b)). However, during HVA, the energy is distributed above 300 Hz in the time-frequency plot of the PCG signal. Each frequency component in the time-frequency matrix of the PCG recording has different characteristics for normal and pathological PCG signals. Therefore, the features computed from each frequency component of the PCG recording in the time-frequency domain will be helpful for the accurate detection of HVAs. In this study, we have extracted three types of nonlinear features, namely, the L1-norm, sample entropy, and permutation entropy, from the first 400 frequency atoms or components of the time-frequency representation of the PCG recording. The L1-norm (LN) feature for the kth frequency component is evaluated as [68]

$$LN_k = \sum_{n=1}^{N} |T(k, n)|$$

Moreover, we have also evaluated the sample entropy (SEN) [69] and permutation entropy (PEN) [70] features from the kth frequency atom of the matrix $T$. These features are denoted as $SEN_k$ and $PEN_k$. A 1200-dimensional feature vector based on the combination of 400 LN, 400 SEN, and 400 PEN features is formulated for each PCG recording obtained from the database and the 15 recorded PCG signals.
The KSRC classifier is used to detect HVAs from the 1200-dimensional feature vector. In the following subsection, the descriptions of DLKSRN for the classification of HVAs are presented.
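The feature extraction step can be sketched as below, assuming the time-frequency matrix is stored with one row per frequency component and that the entropies are computed on the magnitude of each row. The embedding dimension, tolerance, permutation order, and delay (`m=2`, `r_factor=0.2`, `order=3`, `delay=1`) are common defaults chosen for illustration; the text does not state the values used, and the function names are hypothetical.

```python
import math

import numpy as np


def l1_norm(row):
    """L1-norm of one frequency component (sum of absolute values)."""
    return float(np.sum(np.abs(row)))


def sample_entropy(row, m=2, r_factor=0.2):
    """Sample entropy with tolerance r = r_factor * std (assumed defaults)."""
    x = np.asarray(row, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def count_matches(length):
        # The same n - m template start points are used for both lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates (O(n^2) memory).
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return int(np.sum(dist <= r)) - len(templates)  # drop self-matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no template matches are found
    return float(-np.log(a / b))


def permutation_entropy(row, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] (assumed order and delay)."""
    x = np.asarray(row, dtype=float)
    n_vec = len(x) - (order - 1) * delay
    patterns = np.array(
        [np.argsort(x[i:i + order * delay:delay], kind="stable") for i in range(n_vec)]
    )
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)) / np.log2(math.factorial(order)))


def tf_features(T, n_components=400):
    """Concatenate LN, SEN, and PEN over the first n_components rows of T."""
    mag = np.abs(np.asarray(T))[:n_components]
    ln = [l1_norm(row) for row in mag]
    sen = [sample_entropy(row) for row in mag]
    pen = [permutation_entropy(row) for row in mag]
    return np.array(ln + sen + pen)
```

With `n_components=400` this yields the 1200-dimensional vector described in the text. The pairwise distance matrix in `sample_entropy` grows quadratically with the number of time samples, so long recordings would need a blockwise or approximate variant.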

Deep Layer KSRC.
In this work, the DLKSRN is proposed for the classification of HVAs using PCG signal features. The architecture of DLKSRN is shown in Figure 4. It consists of an input layer, a first ELM-autoencoder hidden layer, a second ELM-autoencoder hidden layer, and an output layer. In this work, the hold-out and 10-fold cross-validation (CV) techniques are used to select the training and test PCG recordings. The feature matrix, which comprises the training feature vectors of the PCG recordings, and the class labels are given as $\{z_i, y_i\}_{i=1}^{m}$ with $z_i \in \mathbb{R}^{p}$ and $y_i \in \{1, 2, 3, 4, 5\}$, where 1, 2, 3, 4, and 5 are the class labels for the normal, MVP, AS, MR, and MS classes, respectively. Here, $p$ is the size of the feature vector obtained from each PCG recording, and $m$ is the number of PCG recordings considered during the training of the DLKSRN. The hidden layer matrix in DLKSRN is evaluated by solving the following optimization problem: where $W_i$ is the ith hidden layer weight matrix and $\tilde{Z}$ is the input feature matrix for the ELM-autoencoder. For the first ELM-autoencoder, $\tilde{Z}$ is the feature matrix ($Z$) containing PCG instances and time-frequency-based features. Similarly, for the second ELM-autoencoder, the input feature matrix ($\tilde{Z}$) is the hidden layer matrix ($H_1$) obtained from the first ELM-autoencoder. The weight matrix evaluation for each ELM-autoencoder is given by: The feature matrix obtained in the second hidden layer of the ELM-autoencoder is given as follows: The new feature matrix, $Z^{*}$, is used as the input to the KSRC layer of the proposed DLKSRN for the classification. KSRC is a kernel-based sparse representation technique, and it does not require rigorous training like deep neural networks (DNNs) to evaluate the class labels of the test feature vectors [60,61]. It consists of four steps to estimate the class label of test PCG feature vectors.
These steps are (i) mapping of the feature vectors of the PCG signal into a higher dimensional space using a kernel function, (ii) use of kernel-based dimension reduction for feature reduction, (iii) evaluation of the coefficient vector and the residual for the test PCG feature vector by solving an L1-norm optimization problem, and (iv) assignment of the class label to the test PCG vector by finding the minimum residual over all classes [60,61]. The SRC performs poorly when the feature vectors are not linearly separable. To overcome this limitation, KSRC maps the input PCG feature vector to a higher dimensional space and performs the SRC in that new space.
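The two stacked ELM-autoencoder layers that feed the KSRC stage can be sketched as follows. Because the exact optimization problem and weight expression are not reproduced in this text, the sketch assumes the commonly used multilayer ELM formulation: random input weights, a ridge-regularized least-squares solution for the output weights, and reuse of the transposed output weights as the encoder. The `tanh` activation, the regularization constant `reg`, and the function names are illustrative assumptions.

```python
import numpy as np


def elm_autoencoder_layer(Z, n_hidden, reg=1e3, rng=None):
    """Fit one ELM-autoencoder and return its output weights.

    Z has shape (m, p): m training instances with p features each.
    The returned beta has shape (n_hidden, p) and reconstructs Z from the
    random hidden representation in the regularized least-squares sense.
    """
    rng = np.random.default_rng(rng)
    m, p = Z.shape
    # Random, untrained input weights and biases (the ELM principle).
    A = rng.standard_normal((p, n_hidden)) / np.sqrt(p)
    b = rng.standard_normal(n_hidden)
    H = np.tanh(Z @ A + b)
    # Ridge solution: beta = (H^T H + I / reg)^(-1) H^T Z.
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / reg, H.T @ Z)
    return beta


def encode(Z, beta):
    """Project Z through the learned weights to get the hidden features."""
    return np.tanh(Z @ beta.T)


# Usage with small synthetic sizes (the paper uses 1200 -> 800 -> 600).
rng = np.random.default_rng(0)
Z = rng.standard_normal((40, 30))
beta1 = elm_autoencoder_layer(Z, n_hidden=20, rng=1)
H1 = encode(Z, beta1)                       # first hidden layer matrix
beta2 = elm_autoencoder_layer(H1, n_hidden=10, rng=2)
Z_star = encode(H1, beta2)                  # input to the KSRC layer
```

No gradient-based training is involved: each layer costs one linear solve, which is the reason given in the Introduction for preferring ELM-autoencoders over CNN and SAE networks.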
The mapping function $\Psi(z^{*})$ projects each training feature vector to a higher dimensional space, and it is given as: In the new feature space, we can represent the mapped feature vector, $\Psi(z^{*}_{t})$, as the linear combination of the mapped training feature vectors of the PCG recordings, and it is given by: In KSRC, the dimension of the kernel space, $r$, is higher than that of the second hidden layer space, $\tilde{p}$, and it can also be higher than the number of training instances, $m$. Therefore, for getting a sparse solution of $\gamma$ in (11), the dimension reduction step is used in the kernel space. The dimension reduction is performed based on the use of the transformation matrix $A$. The constraint for the optimization problem in (11) is modified as follows: where the transformation matrix can be evaluated as follows: The matrix $S$ is the pseudotransformation matrix, and it is evaluated using any one of the dimension reduction techniques (random projection, kernel principal component analysis (KPCA), and kernel linear discriminant analysis (KLDA)) [60,61]. The expression in (12) can be simplified as follows: The above equation can also be written as $S^{T} k(z^{*}_{t}, z^{*}_{i}) = S^{T} K \gamma$, where $k(z^{*}_{t}, z^{*}_{i})$ and $K$ are the kernel function and the kernel matrix, respectively. The original optimization problem in KSRC is modified as follows: subject to $S^{T} k(z^{*}_{t}, z^{*}_{i}) = S^{T} K \gamma$. The residual for the test instance $z^{*}_{t}$ for the cth class is obtained as follows [60]: where $\delta_c = [\delta_c(\gamma_1), \delta_c(\gamma_2), \cdots, \delta_c(\gamma_m)]$, and $\delta_c(\gamma_i)$ is the characteristic function for the cth class. This function is defined as follows [60,61]: The residual for each class is computed, and the final class label for the second hidden layer feature vector of the test PCG recording is given by: In this study, the number of neurons used in the first and second hidden layers of the proposed DLKSRN is 800 and 600, respectively.
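The KSRC steps described above can be illustrated with the following sketch. It makes several assumptions that are not taken from the paper: an RBF kernel, a Gaussian random projection as the pseudotransformation matrix S, and an L1-penalized least-squares (lasso) relaxation of the equality-constrained problem, solved with a plain iterative soft-thresholding (ISTA) loop. The kernel width, projection dimension `d`, and penalty `lam` are hypothetical values.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def ista_lasso(D, b, lam, n_iter=500):
    """Minimize 0.5 * ||b - D g||^2 + lam * ||g||_1 by soft-thresholding."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
    g = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = g - step * (D.T @ (D @ g - b))                       # gradient step
        g = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrinkage
    return g


def ksrc_predict(Z_train, y_train, z_test, d=20, gamma=0.5, lam=1e-3, seed=0):
    """Predict the class of z_test from class-wise reconstruction residuals."""
    rng = np.random.default_rng(seed)
    y_train = np.asarray(y_train)
    m = Z_train.shape[0]
    K = rbf_kernel(Z_train, Z_train, gamma)                  # kernel matrix
    k_t = rbf_kernel(Z_train, z_test[None, :], gamma)[:, 0]  # test kernel vector
    S = rng.standard_normal((m, d)) / np.sqrt(d)             # random projection
    D = S.T @ K        # reduced dictionary, S^T K
    b = S.T @ k_t      # reduced test vector, S^T k
    coef = ista_lasso(D, b, lam)
    classes = np.unique(y_train)
    # Keep only the coefficients of class c (the characteristic function),
    # then measure how well that class alone reconstructs the test vector.
    residuals = [
        np.linalg.norm(b - D @ np.where(y_train == c, coef, 0.0)) for c in classes
    ]
    return classes[int(np.argmin(residuals))]
```

In the proposed network, `Z_train` would be the second hidden layer feature matrix and `z_test` the corresponding representation of a test recording. KPCA or KLDA could replace the random projection, as the text notes.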
Moreover, we have also considered the random forest (RF) [36] and K-nearest neighbour (KNN) [65] classifiers for the classification of HVAs from the feature vectors of test PCG recordings. The optimal parameters of the RF classifier [71] such as the number of trees, number of splits for each decision tree, and depth of each decision tree obtained using the grid-search technique are 20, 20, and 15, respectively. For the KNN classifier, we have considered the number of the nearest neighbours as 3 and used Euclidean as the distance metric [72]. The performance of the 1200 dimensional SCT-based time-frequency features of PCG recordings is evaluated using DLKSRN, KSRC, RF, and KNN classifiers with the hold-out cross-validation (CV) strategy. For hold-out CV, 60%, 10%, and 30% of PCG signal instances are considered for training, validation, and testing of the DLKSRN classifier. Similarly, for the 10-fold CV case, 90% of PCG signal instances from the feature matrix are used to train the DLKSRN classifier. The remaining 10% PCG signal instances are evaluated during the testing phase of the DLKSRN classifier in each fold. The metrics, namely, the sensitivity, specificity, precision, F-score, and overall accuracy (OA), are used to evaluate the performance of DLKSRN, KSRC, RF, and KNN classifiers [72]. In the following section, the results obtained using the proposed approach are discussed in detail.
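The hold-out partition and the evaluation metrics listed above can be sketched as follows. The 60%/10%/30% fractions come from the text; stratifying the split by class, the random seed, and the function names are assumptions made for illustration. The metrics are computed one-vs-rest from the confusion matrix.

```python
import numpy as np


def holdout_split(y, fractions=(0.6, 0.1, 0.3), seed=0):
    """Return train/validation/test indices, split separately within each class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    train, val, test = [], [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n_tr = int(round(fractions[0] * len(idx)))
        n_va = int(round(fractions[1] * len(idx)))
        train.extend(idx[:n_tr])
        val.extend(idx[n_tr:n_tr + n_va])
        test.extend(idx[n_tr + n_va:])      # the remainder is the test set
    return np.array(train), np.array(val), np.array(test)


def _ratio(num, den):
    """Elementwise num / den that returns 0 where the denominator is 0."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def per_class_metrics(y_true, y_pred, labels):
    """Sensitivity, specificity, precision, F-score per class, and OA."""
    index = {c: i for i, c in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(np.asarray(y_true).tolist(), np.asarray(y_pred).tolist()):
        cm[index[t], index[p]] += 1         # rows: true class, columns: predicted
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f_score": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "overall_accuracy": float(tp.sum() / cm.sum()),
    }
```

For the public database with 200 recordings in each of the five classes, this split gives 600 training, 100 validation, and 300 test instances.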

Results and Discussion
In the first part of this section, the statistical analysis results of the SCT-based features of PCG recordings are presented. In the second part, the classification results using the RF, KNN, KSRC, and the proposed DLKSRN models are shown. The third part of this section describes the comparison and advantages of the proposed approach for HVA detection. In this study, we have conducted a statistical analysis of all features (Table 1). It is noted that each feature has distinct mean values for each of the pathological classes (AS, MS, MR, MVP) and the normal class. The SEN feature for more than 300 frequency components of the SCT-based time-frequency matrix has a lower mean value for the normal class as compared to the pathological classes. Similarly, more than 230 PEN features have lower mean values for the normal class, and more than 200 L1-norm features have higher mean values for the AS class. The pathological signature for MS is the presence of diastolic murmurs [73], and murmurs are observed within the systolic interval of the PCG recording in MVP pathology [74]. In MS and AS pathologies, the murmurs have low-pitch sounds. Similarly, high-pitch sounds are observed in the PCG recording during AR-based HVA [14]. The aforementioned pathological changes in the PCG recording affect the morphologies of the SCT-based time-frequency matrices. Hence, the features from the time-frequency matrices have distinct mean and standard deviation values. We have also used the analysis of variance- (ANOVA-) based test [75] to verify the statistical significance of the SCT-based time-frequency features. It is observed from the ANOVA test that all 1200 features extracted from the SCT-based time-frequency representation of the PCG recording have p values less than 0.001 and are significant for the detection of HVAs.
The classification results of RF, KNN, KSRC, and DLKSRN are shown in Table 2. In this work, we have considered five random trials based on the hold-out CV to evaluate the performance of each classifier. The performance metrics ... It can be noted that the nonlinear features extracted using the SCT of the PCG recording are able to classify the HVAs accurately using DLKSRN. In this study, the parameters of the DLKSRN classifier such as the number of neurons in the 1st and 2nd hidden layers are selected based on the maximum accuracy values in the validation and test sets. The variations in the overall accuracy values with the number of hidden neurons in the 1st and 2nd hidden layers are shown in Table 4. It can be observed from the table that the overall accuracy value of the DLKSRN classifier is high when the numbers of neurons in the 1st and 2nd hidden layers are 800 and 600, respectively. The overall accuracy value decreases as the number of neurons in both hidden layers is increased. Similarly, for the MHSMD database, the classification results obtained using the DLKSRN classifier are shown in Table 5. It is observed that the proposed SCT-based time-frequency domain features combined with the DLKSRN classifier have obtained an overall accuracy value of 96.79%. The sensitivity and specificity values are greater than 94% for each class using the DLKSRN classifier. Moreover, we have tested the effectiveness of our proposed approach with 15 recorded PCG signals. The DLKSRN model, which has been trained using the features from the public database, has been used to test the performance on the private database. The LN, SEN, and PEN features are extracted from all 15 recorded PCG signals. The trained DLKSRN model successfully predicted all 15 feature vectors of PCG recordings as the normal class, thereby showing the effectiveness of the proposed approach for the real-time prediction of HVAs.
The objective of this study is HVA detection using nonlinear features extracted from the SCT-based time-frequency analysis of the PCG recording. The proposed features are found to be discriminative, with low p values obtained using the statistical test. The classification results obtained using the hold-out and 10-fold CV-based PCG instance selection reveal that the proposed approach has obtained an overall accuracy of more than 99% for the detection of HVAs. A comparison with the existing algorithms for automated HVA detection is shown in Table 6. Safara et al. [76] developed an automated approach using a wavelet packet decomposition-based feature extraction technique and an SVM classifier for the discrimination of MR, AS, and AR-based HVAs with PCG recordings. They have achieved an accuracy of 97.56%. Maglogiannis et al. [77] used the SVM classifier coupled with morphological features (standard values of S1 and S2 peaks and other features) for the detection of MR and MS pathologies and reported an accuracy of 91.23% in classifying the two HVAs. Moreover, Zheng et al. [78] employed the energy fraction and energy-based features coupled with the SVM classifier for the automated discrimination of HVAs such as tricuspid insufficiency (TI), pulmonary stenosis (PS), mitral insufficiency (MI), and AS. They have obtained an overall accuracy of 97.17% in classifying the four HVAs. The time-frequency domain magnitude and phase features extracted using the SST of the PCG signal have been used in [36] for the discrimination of the AS, MS, and MR classes. They have obtained an overall accuracy value of 95.12% in classifying the three classes. The combination of MFCC- and DWT-based features extracted from the PCG signal, along with the SVM classifier, has been used for automated HVA detection with an overall accuracy of 97.9% [44].
The CT-based time-frequency features obtained from the PCG signal, combined with a composite classification model, yielded an overall accuracy value of 98.33% [13]. Oh et al. [79] have proposed a WaveNet-based DNN model for the classification of HVAs using PCG recordings and obtained an overall accuracy value of 98.20%. The proposed approach demonstrated higher classification performance as compared to the existing algorithms for automated HVA detection. The method reported in [13] has classified the AS, MS, and MR pathologies using PCG. However, in the present work, we have considered the MVP pathology along with the AS, MS, and MR ailments for the development of an automated HVA detection system. The advantages of the proposed HVA detection approach are given as follows: In this work, the local features from the frequency components of the time-frequency representation of the PCG signal are evaluated. In future work, a two-dimensional convolutional autoencoder [80] can be used for the extraction of learnable features from the SCT-based time-frequency representation of the PCG signal for the classification of HVAs. The sparse residual entropy features [81] and wavelet bispectrum-based features [82] can be used for the detection of HVAs from the PCG signal. The convolutional neural network [83], convolutional attention-based network [84], and other deep learning methodologies [85] can be used for the detection of HVAs without using extracted features from PCG recordings.

Conclusion
A novel HVA detection approach using PCG signals is proposed. This approach uses the SCT to compute the time-frequency representation of the PCG recording. The nonlinear features (LN, SEN, and PEN) are computed from the frequency components of the time-frequency representation. The DLKSRN classifier is used to automatically classify the PCG recordings into the normal class and four HVA classes using the extracted features. The proposed approach demonstrated average accuracies of 99.23% and 99.24% using the hold-out and 10-fold CV methods, respectively. The proposed approach is also evaluated using the recorded signals, and the results obtained show the practicality of the proposed approach. In the future, we intend to extend this method to detect coronary artery disease and psychological stress using PCG signals. The approach can also be implemented in real time for Internet of Medical Things (IoMT) applications.

Data Availability
The codes and the classification results of the proposed work are available upon request to the authors.