The paper presents the accuracy of machine learning approaches applied to the analysis of cardiac activity. The study evaluates the diagnostic possibilities for arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis,
According to World Health Organization data, hypertension affects more than 1 billion people worldwide. Many factors can contribute to hypertension, including occupational stress and job strain [
Heart rate variability (HRV) is one of the widely used biomedical signals, owing to the ease of recording the electrical activity of the heart [
Common HRV analysis applies a variety of methods: statistical, spectral, and nonlinear analysis. Generally, only a limited number of features are extracted in a single study, such as in [
Machine learning classification of conditions based on HRV information commonly relies on several available methods: support vector machine (SVM), discriminant analysis (DA), and ordinal pattern statistics (OPS) [
In one of the previous works, linear and quadratic discriminant analysis was investigated for arterial hypertension diagnostics using single features of short-term HRV signals. In that work, the features and the classifier efficacy were evaluated by means of in-house software produced in MATLAB [
In summary, the goal of the present work is to study the efficacy of different machine learning approaches for the diagnostics of arterial hypertension by means of short-term HRV, using combinations of statistical, spectral (Fourier and wavelet transforms), and nonlinear features. By combining features from different methods, we aim to build more robust and accurate classifiers.
The clinical part of the study was performed in the Sverdlovsk Clinical Hospital of Mental Diseases for Military Veterans (Yekaterinburg, Russian Federation). For the HR signal registration, the electroencephalograph-analyzer “Encephalan-131-03” (“Medicom MTD,” Taganrog, Russian Federation) was used. A Lojer rotating table (Vammalan Konepaja OY, Finland) changed the spatial position of the patient during the passive orthostatic load; the head end of the table was lifted up to 70° from the horizontal position. The clinical part of the study was approved by the local Ethics Committee of the Ural State Medical University.
Participants of this study were 30 healthy volunteers and 41 patients suffering from arterial hypertension of II and III degree. The electrocardiography (ECG) signals were recorded in two functional states: functional rest (state F) and passive orthostatic load (state O). The length of the signal in each state was about 300 seconds. The HRV signals were subsequently derived from the ECG signals automatically by the “Encephalan-131-03” software. Figure
Diagram of the study.
Prior to processing, the original time series were cleaned of artifacts. In this study, artifacts are RR interval values that differ from the mean of the series by more than three standard deviations. NN is the abbreviation for the “normal to normal” time series, that is, the series without artifacts. Among all studied time series, less than 2% of the data were removed. For the spectral and multifractal analyses, the NN time series were interpolated using cubic spline interpolation with a 10 Hz sampling frequency.
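The preprocessing described above can be sketched as follows. This is a minimal illustration and not the authors' software: the function names `clean_rr` and `resample_nn`, the single-pass three-sigma rule, and the synthetic RR series are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def clean_rr(rr_ms, n_sigma=3.0):
    """Drop RR intervals farther than n_sigma standard deviations from the mean."""
    rr = np.asarray(rr_ms, dtype=float)
    keep = np.abs(rr - rr.mean()) <= n_sigma * rr.std()
    return rr[keep]


def resample_nn(nn_ms, fs=10.0):
    """Resample an NN series onto a uniform time grid with a cubic spline."""
    nn = np.asarray(nn_ms, dtype=float)
    t = np.cumsum(nn) / 1000.0               # beat times in seconds
    grid = np.arange(t[0], t[-1], 1.0 / fs)  # uniform grid at fs Hz
    return grid, CubicSpline(t, nn)(grid)


# Synthetic example (hypothetical data): 300 beats with one gross artifact.
rng = np.random.default_rng(0)
rr = 800.0 + 40.0 * rng.standard_normal(300)
rr[50] = 2000.0
nn = clean_rr(rr)
grid, nn_uniform = resample_nn(nn)
```

The uniform series `nn_uniform` is what a spectral or multifractal routine would then consume; whether the original software applied the three-sigma rule once or iteratively is not stated in the text.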
The feature dataset is the same as used previously [
Statistical methods are used for the direct quantitative evaluation of the HR time series. Main quantitative features are as follows:
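As a minimal sketch of such statistical features, the snippet below computes the quantities named in the feature list (mean RR, HR, SDNN, CV, RMSSD, and NN50) using their standard textbook definitions. The function name `time_domain_features`, the dictionary keys, and the choice of the sample standard deviation are assumptions; the exact equations of the paper are not reproduced here.

```python
import numpy as np


def time_domain_features(nn_ms):
    """Standard time-domain HRV features from an NN series given in milliseconds."""
    nn = np.asarray(nn_ms, dtype=float)
    diff = np.diff(nn)                       # successive differences
    mean_rr = nn.mean()
    sdnn = nn.std(ddof=1)                    # sample standard deviation (assumption)
    return {
        "mean_RR": mean_rr,                  # ms
        "HR": 60000.0 / mean_rr,             # beats per minute
        "SDNN": sdnn,                        # ms
        "CV": 100.0 * sdnn / mean_rr,        # percent
        "RMSSD": np.sqrt(np.mean(diff ** 2)),
        "NN50": int(np.sum(np.abs(diff) > 50.0)),
    }


features = time_domain_features([800, 860, 800, 860])
```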
The geometric methods analyze the distribution of the RR intervals as random numbers. The common features of these methods are as follows:
The following indexes are derived from common geometric features:
Spectral analysis is used to quantify periodic processes in the heart rate by means of the Fourier transform (Fr). The main spectral components of the HRV signal are high frequency, HF (0.15–0.4 Hz); low frequency, LF (0.04–0.15 Hz); very low frequency, VLF (0.003–0.04 Hz); and ultralow frequency, ULF (lower than 0.003 Hz) [
The studied quantitative features of spectral analysis are
spectral power of the
total power of the spectrum—
normalized values of the spectral components by the total power—
the LF/HF ratio, also known as the autonomic balance exponent,
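The spectral features listed above can be illustrated with a short sketch. It assumes a uniformly resampled NN series at 10 Hz, uses Welch's method from SciPy as the spectral estimator, and takes the total power as the sum of the VLF, LF, and HF bands; the estimator, the function name `band_powers`, and this definition of total power are assumptions that may differ from the original software.

```python
import numpy as np
from scipy.signal import welch

# Band limits in Hz, as given in the text.
BANDS = {"VLF": (0.003, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}


def band_powers(x, fs=10.0):
    """Band powers, normalized powers, and LF/HF ratio from a Welch spectrum."""
    x = np.asarray(x, dtype=float)
    f, pxx = welch(x - x.mean(), fs=fs, nperseg=min(len(x), 2048))
    df = f[1] - f[0]
    p = {name: pxx[(f >= lo) & (f < hi)].sum() * df
         for name, (lo, hi) in BANDS.items()}
    p["TP"] = p["VLF"] + p["LF"] + p["HF"]   # assumption: TP is the sum of the bands
    for name in BANDS:
        p[name + "n"] = p[name] / p["TP"]    # normalized by total power
    p["LF/HF"] = p["LF"] / p["HF"]
    return p


# Synthetic 300 s signal (hypothetical): strong LF and weaker HF oscillation.
t = np.arange(0.0, 300.0, 0.1)
x = 800.0 + 30.0 * np.sin(2 * np.pi * 0.10 * t) + 10.0 * np.sin(2 * np.pi * 0.25 * t)
powers = band_powers(x)
```

With a 300 s record, the lowest VLF frequencies are resolved by only a few spectral bins, so the VLF estimate is the least stable of the three.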
For nonstationary time series, one can also use the wavelet transform (wt) to study time-frequency patterns simultaneously. The general equation for the continuous wavelet transform is as follows:
Moreover, the connection between the scale and the analyzed frequency is in accordance with the following:
The same spectral features can be acquired by means of the wavelet transform:
Spectral power of the
Normalized values of the spectral components by the total power—
The LF/HF ratio.
Additionally, standard deviations SDHF(wt), SDLF(wt), and SDVLF(wt) of the HF_{wt}(
Moreover, one can study informational characteristics of the wavelet transform by analyzing the
Example of
As the nonlinear feature in this study, we have used the Hurst exponent calculated by the aggregated variance method. The variance can be written as follows:
Note that
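For illustration, the aggregated variance method can be sketched as follows. The series is split into blocks of size m, the block means are computed, and the variance of those means is assumed to scale as m^(2H-2), so H is recovered from the slope of a log-log fit. The function name `hurst_aggvar` and the choice of block sizes are assumptions, not the settings of the study.

```python
import numpy as np


def hurst_aggvar(x, block_sizes=None):
    """Estimate the Hurst exponent by the aggregated variance method."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if block_sizes is None:
        # Log-spaced block sizes; at least 20 blocks at the largest size (assumption).
        block_sizes = np.unique(
            np.logspace(0.5, np.log10(n // 20), 15).astype(int))
    variances = []
    for m in block_sizes:
        k = n // m                                   # number of whole blocks
        means = x[:k * m].reshape(k, m).mean(axis=1)
        variances.append(means.var())
    # Var(block mean) ~ m^(2H - 2), so the log-log slope equals 2H - 2.
    slope, _ = np.polyfit(np.log(block_sizes), np.log(variances), 1)
    return 1.0 + slope / 2.0


# Uncorrelated noise should give an estimate close to 0.5.
rng = np.random.default_rng(1)
h_white = hurst_aggvar(rng.standard_normal(10000))
```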
As nonlinear methods, we adopted the multifractal detrended fluctuation analysis (MFDFA) [
The main steps of the method include the following:
The detrending procedure with a second-degree polynomial on nonoverlapping segments, where the length of the segments corresponds to the studied time scale boundaries.
In the current study, we investigated time scale boundaries that correspond to the LF and VLF frequency bands: 6–25 sec and 25–300 sec, respectively. In our earlier works and in works by other authors, it was noted that multifractal analysis of the HF component is not informative because of noise [
Determination of the fluctuation functions for
Estimation of the slope exponent
Calculation of the scaling exponent
Application of the Legendre transform to estimate the probability distribution of the spectrum:
Figure
The features of multifractal analysis.
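The MFDFA steps listed above can be sketched in a compact form. This is a generic textbook version under stated assumptions, not the authors' implementation: the function name `mfdfa`, the set of q values, and the scales used in the example are hypothetical, and the multifractal spectrum is obtained with a numerical derivative of the scaling exponent.

```python
import numpy as np


def mfdfa(x, scales, q_values, order=2):
    """Generalized Hurst exponents h(q), tau(q), and the spectrum (alpha, f(alpha))."""
    x = np.asarray(x, dtype=float)
    q_values = np.asarray(q_values, dtype=float)
    profile = np.cumsum(x - x.mean())
    log_f = np.empty((len(q_values), len(scales)))
    for j, s in enumerate(scales):
        n_seg = len(profile) // s
        segs = profile[:n_seg * s].reshape(n_seg, s).T   # shape (s, n_seg)
        t = np.arange(s)
        coefs = np.polyfit(t, segs, order)               # one fit per segment
        trend = np.vander(t, order + 1) @ coefs
        f2 = np.mean((segs - trend) ** 2, axis=0)        # detrended variance
        for i, q in enumerate(q_values):
            if q == 0:
                log_f[i, j] = 0.5 * np.mean(np.log(f2))  # logarithmic average
            else:
                log_f[i, j] = np.log(np.mean(f2 ** (q / 2.0))) / q
    log_s = np.log(scales)
    h = np.array([np.polyfit(log_s, row, 1)[0] for row in log_f])  # slope exponent
    tau = q_values * h - 1.0                                       # scaling exponent
    alpha = np.gradient(tau, q_values)                             # Legendre transform
    f_alpha = q_values * alpha - tau
    return h, tau, alpha, f_alpha


# White-noise check (hypothetical data): h(2) should be near 0.5.
rng = np.random.default_rng(2)
q = np.arange(-3, 4)
scales = np.array([16, 32, 64, 128, 256])
h, tau, alpha, f_alpha = mfdfa(rng.standard_normal(8192), scales, q)
```

For a series resampled at 10 Hz, the LF and VLF time scale boundaries of 6–25 sec and 25–300 sec would correspond to segment lengths of roughly 60–250 and 250–3000 samples; features such as the spectrum width would then be read from `alpha` and `f_alpha`.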
List of studied features.
Feature  Description  Equation 


Mean value of the RR  ( 
HR  Heart rate  ( 
SDNN  Standard deviation of the RR  ( 
CV  Coefficient of the variation  ( 
RMSSD  Square root of mean of squares of differences between successive RR  ( 
NN50  Variation higher than 50 ms in RR signal  — 

Mode of the RR signal  — 
VR  Variation range of the RR signal  — 
AM_{0}  Amplitude of the mode  — 
SI  Stress index  ( 
IAB  Index of autonomic balance  ( 
ARI  Autonomic rhythm index  ( 
IARP  Index of adequate regulation processes  ( 
HF(Fr)  High frequency Fourier spectral power  — 
LF(Fr)  Low frequency Fourier spectral power  — 
VLF(Fr)  Very low frequency Fourier spectral power  — 
TP(Fr)  Total power of the Fourier spectrum  — 
LF/HF(Fr)  Autonomic balance exponent of the Fourier spectrum  — 
HF_{max}(Fr)  Maximum power of the HF  — 
HF_{n}(Fr), LF_{n}(Fr), and VLF_{n}(Fr)  Normalized power of the HF, LF, and VLF Fourier spectrum  ( 
IC  Index of centralization  ( 
IAS  Index of the subcortical nervous center’s activation  ( 
RF  Respiration frequency  — 
HF(wt)  High frequency wavelet spectral power  — 
LF(wt)  Low frequency wavelet spectral power  — 
VLF(wt)  Very low frequency wavelet spectral power  — 
HF_{n}(wt), LF_{n}(wt), and VLF_{n}(wt)  Normalized power of the HF, LF, and VLF wavelet spectrum  — 
SDHF(wt), SDLF(wt), and SDVLF(wt)  Standard deviations of the HF( 
— 
TP(wt)  Total power of the wavelet spectrum  — 
LF/HF(wt)  Autonomic balance exponent of the wavelet spectrum  — 
(LF/HF)_{max}  Maximal value of dysfunctions  — 
(LF/HF)_{int}  Intensity of dysfunctions  — 
Nd  Number of dysfunctions  — 

Hurst exponent  ( 

Smallest fluctuations of the LF and VLF spectral band  — 

Greatest fluctuations of the LF and VLF spectral band  — 
WLF, WVLF  Spectrum width of the LF and VLF spectral band  — 

Correlation degree of the LF and VLF spectral band  — 

Spectrum height of the LF and VLF spectral band  — 

1/2width measure of the LF and VLF spectral band  — 
For the machine learning evaluation, the respective functions of the
In this work, two variants of discriminant analysis were tested: linear and quadratic discriminant analyses (LDA and QDA). The LDA aims to find the best linear combination of the input features to properly separate the studied classes. In the case of the QDA, the studied classes are separated by a quadratic function [
The
The basic idea of support vector machine methods is to construct a decision hyperplane that separates the classes such that the margin between the hyperplane and the nearest points on either side of it is maximal. In the present study, the radial basis function (RBF) kernel is used. For implementation in
The decision tree classification model is built around a sequence of Boolean queries, which forms a tree structure. In the present work, variations of the classifier were analyzed: with a fixed value of the maximal tree depth (
This method is based on the application of Bayes’ theorem under the assumption of strong (naive) independence between the features. In the current study, a Gaussian distribution of the data is assumed [
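The set of classifiers described in this section can be assembled as in the sketch below. It assumes scikit-learn, which is an assumption of this example, and leaves every hyperparameter not mentioned in the text at the library default; the helper name `make_classifiers` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_classifiers():
    """Return one instance of each classifier family discussed in the text."""
    return {
        "LDA": LinearDiscriminantAnalysis(),
        "QDA": QuadraticDiscriminantAnalysis(),
        "NN3": KNeighborsClassifier(n_neighbors=3),
        "NN4": KNeighborsClassifier(n_neighbors=4),
        "NN5": KNeighborsClassifier(n_neighbors=5),
        "RBF SVM": SVC(kernel="rbf"),
        "DT, max depth 5": DecisionTreeClassifier(max_depth=5, random_state=0),
        "DT, no max depth": DecisionTreeClassifier(max_depth=None, random_state=0),
        "Naive Bayes": GaussianNB(),
    }


# Synthetic two-class data (hypothetical): 30 and 40 samples with 4 features each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 4)), rng.normal(4.0, 1.0, (40, 4))])
y = np.array([0] * 30 + [1] * 40)
train_accuracy = {name: clf.fit(X, y).score(X, y)
                  for name, clf in make_classifiers().items()}
```

Training accuracy on well-separated synthetic data only confirms that the objects are wired correctly; it says nothing about the generalization performance reported in the study.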
In the current investigation, all possible combinations of all features were considered. However, it is well known that combining correlated features in machine learning may lead to misleading results. Therefore, the first step in this investigation is to select the combinations of uncorrelated features. For this task, we compute the correlation coefficient. The whole flowchart of the script for noncorrelated feature combination selection is presented in Figure
Flowchart of the noncorrelated combination selection.
The threshold correlation value was set to 0.25. Usually, a correlation above 0.75 is considered high, so a value below 0.25 is a good benchmark for low correlation. In the current work, combinations of two to five features were formed. For combinations of more than two features, the correlation was checked pairwise. When all calculations were finished, the noncorrelated feature combinations were saved to a file for later use.
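A minimal sketch of this selection step is given below. It assumes the Pearson correlation coefficient and compares its absolute value with the threshold; both choices, as well as the function name `noncorrelated_combinations`, are assumptions of the example.

```python
from itertools import combinations
from math import comb

import numpy as np


def noncorrelated_combinations(X, k, threshold=0.25):
    """Return all k-feature index tuples whose pairwise |correlation| is below threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))   # feature-by-feature matrix
    ok = corr < threshold
    selected = []
    for combo in combinations(range(X.shape[1]), k):
        # Every pair inside the combination must pass the threshold.
        if all(ok[i, j] for i, j in combinations(combo, 2)):
            selected.append(combo)
    return selected


# Hypothetical data: feature 1 nearly duplicates feature 0, feature 2 is independent.
rng = np.random.default_rng(0)
f0 = rng.standard_normal(500)
X = np.column_stack([f0, f0 + 0.1 * rng.standard_normal(500),
                     rng.standard_normal(500)])
pairs = noncorrelated_combinations(X, k=2)

# Number of candidate combinations of k features out of 53.
totals = {k: comb(53, k) for k in (2, 3, 4, 5)}
```

The exhaustive loop grows combinatorially with k: for 53 features, `totals` gives 1378, 23,426, 292,825, and 2,869,685 candidates for k = 2 to 5.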
Table
Noncorrelated combination selection data.

Features in combination  Total  Selected  Calculation time

2  1378  586  0.027 
3  23,426  1669  0.477 
4  292,825  1339  11.559 
5  2,869,685  295  228.267 
Figure
Flowchart of classifier efficacy evaluation algorithm.
Cross-validation implies division of the original datasets into
In the current investigation, the number of random folds l was set to 5. For the implementation of 5-fold cross-validation, we randomly divided the original dataset into 5 subsets. The division was carried out for both groups simultaneously. As a result, each subset included 6 healthy volunteers and 8 patients diagnosed with hypertension.
Many machine learning methods are sensitive to the selection of the training set; to remove this influence, the cross-validation procedure was repeated 100 times with different folds. Repeated cross-validation increases the number of classification accuracy estimates [
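The repeated cross-validation scheme can be sketched as follows, assuming scikit-learn. Stratified folds reproduce the 6 + 8 composition of each subset for a 30 + 40 dataset. The helper name `repeated_cv_score`, the synthetic data, and the way the spread is summarized (standard deviation of the per-repetition mean accuracy) are assumptions of this example.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score


def repeated_cv_score(clf, X, y, n_splits=5, n_repeats=100, seed=0):
    """Mean and standard deviation of accuracy over repeated stratified k-fold CV."""
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats,
                                 random_state=seed)
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    # Scores arrive repetition by repetition; average the folds of each repetition.
    per_repeat = scores.reshape(n_repeats, n_splits).mean(axis=1)
    return per_repeat.mean(), per_repeat.std()


# Hypothetical data: 30 "healthy" and 40 "patient" samples with 4 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 4)), rng.normal(2.0, 1.0, (40, 4))])
y = np.array([0] * 30 + [1] * 40)
mean_acc, std_acc = repeated_cv_score(LinearDiscriminantAnalysis(), X, y,
                                      n_repeats=10)
```

The example uses 10 repetitions to keep the run short; setting `n_repeats=100` matches the number of repetitions stated in the text.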
Table
Calculation times of classifier efficacy evaluation, sec.
Features in combinations  LDA  QDA  NN3  NN4  NN5  RBF SVM  DT  Naive Bayes 

2  165  113  281  281  281  139  89  130 
3  482  346  917  860  806  403  269  375 
4  397  288  640  643  642  325  222  300 
5  88  64  141  140  140  71  50  66 
The classifier performance was averaged over the 5 cross-validation folds and over the 100 repetitions. Figures
Classifier score for 2-feature combinations.
Classifier score for 3-feature combinations.
Classifier score for 4-feature combinations.
Classifier score for 5-feature combinations.
Maximal scores achieved by each machine learning approach.
Scores of the PCA achieved by each machine learning approach.
Figure
According to the data presented in Figures
It is worth mentioning that classification accuracy generally rises as the number of features in the feature set increases. The maximum is achieved for 4-feature sets; accuracy for 5-feature sets is lower for all machine learning approaches and drops significantly for the support vector machine approach.
Table
Best classification scores.
Score, %  Features  

Linear discriminant analysis  
91.33 ± 1.75  HR  VLF_{n}(Fr)  LF/HF(Fr)  VLF(wt) 
90.30 ± 1.37  HR  VLF_{n}(Fr)  VLF(wt)  (LF/HF)_{int} 
90.04 ± 1.85  HR  LF/HF(Fr)  VLF(wt)  VLFn(wt) 
90.44 ± 1.60  HR  VLF_{n}(Fr)  LF/HF(Fr)  SDVLF 
90.11 ± 1.80  HR  LF/HF(Fr)  SDVLF  VLF_{n}(wt) 
90.16 ± 1.61  HR  SDVLF  VLF_{n}(wt)  (LF/HF)_{int} 


Quadratic discriminant analysis  
90.31 ± 1.71  HR  VLF_{n}(Fr)  LF/HF(Fr)  VLF(wt) 


3-nearest neighbors
87.14 ± 2.12  LF/HF(Fr)  SDVLF  VLF_{n}(wt)  W_{1/2}VLF 


4-nearest neighbors
85.56 ± 2.40  SDVLF  VLF_{n}(wt)  LF/HF(wt)  W_{1/2}VLF 


5-nearest neighbors
86.63 ± 1.30  HR  HF(Fr)  LF_{n}(Fr)  W_{1/2}VLF 


Support vector machine, radial basis function
86.73 ± 2.24  IAS  RF 

WVLF 


Decision trees, max depth 5  
87.10 ± 3.40  IARP  LF/HF(Fr)  IAS  WLF 


Decision trees, no max depth  
87.34 ± 3.08  IARP  LF/HF(Fr)  IAS  WLF 


Naïve Bayes classifier  
88.17 ± 1.07  VLF(Fr)  VLF_{n}(Fr)  LF/HF(Fr)  W_{1/2}LF 
Data in Table
Among the 53 studied features, 36 form combinations with a classification score higher than 85%. Table
Feature occurrences for classification score higher than 85%.
Features  Occurrences, %  Features  Occurrences, % 

VLF_{n}(Fr)  50.89  Nd  4.73 
VLF(Fr)  50.89 

4.73 
VLF_{n}(wt)  47.93  WLF  4.73 

34.91  IARP  3.55 
LF/HF(Fr)  34.32  IC  2.96 
HR  33.73  HF(Fr)  2.96 
SDVLF  30.18 

2.96 
(LF/HF)_{max}  24.26  LF_{n}(wt)  2.37 
LF/HF(wt)  18.34  SI  2.37 
(LF/HF)_{int}  18.34  LF_{n}(Fr)  1.78 

17.75 

1.78 

13.61  ARI  1.78 
WVLF  13.02  HF_{n}(Fr)  1.78 
IAS  10.65  SDHF  1.18 
VLF(wt)  7.69  IAB  0.59 
RF  6.51  NN50  0.59 

5.92 

0.59 

5.33 

0.59 
Table
Feature occurrences for classification score higher than 90%.
Features  Occurrences, % 

HR  100.00 
LF/HF(Fr)  62.50 
VLF(wt)  62.50 
VLF_{n}(wt)  50.00 
VLF_{n}(Fr)  50.00 
SDVLF  37.50 
(LF/HF)_{int}  37.50 
For discussion purposes, the results of the current study were compared with the results of a commonly used procedure, principal component analysis (PCA). PCA is a statistical procedure used to reveal the internal structure of the dataset [
Table
Dataset analysis by PCA.
Principal component  Explained variance, %  Cumulative variance, % 

1  34.88  34.88 
2  17.65  52.54 
3  13.03  65.57 
4  8.87  74.45 
5  5.38  79.83 
6  4.10  83.93 
7  3.55  87.47 
8  2.42  89.89 
9  1.78  91.67 
10  1.36  93.03 
11  1.01  94.04 
12  0.95  94.98 
13  0.84  95.82 
14  0.74  96.56 
15  0.61  97.17 
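A variance table of this kind can be produced with a few lines, assuming scikit-learn. The standardization of the features before PCA, the function name `pca_variance_table`, and the random placeholder data are assumptions of this sketch; the preprocessing of the original dataset is not stated in the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_variance_table(X, n_components=15):
    """Explained and cumulative variance (in percent) of the leading components."""
    Z = StandardScaler().fit_transform(X)     # assumption: features are standardized
    pca = PCA(n_components=n_components).fit(Z)
    explained = 100.0 * pca.explained_variance_ratio_
    return explained, np.cumsum(explained)


# Placeholder data with the dimensions of the study: 70 subjects, 53 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((70, 53))
explained, cumulative = pca_variance_table(X)
```

The component scores fed to the classifiers would then be the leading columns of `PCA(...).fit_transform(Z)`.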
To compare the results of the semi-optimal search of the noncorrelated feature space with PCA, combinations of the first 10 components were subsequently tested for all machine learning approaches using 5-fold cross-validation repeated 100 times. Figure
Comparing the results of Figures
In this work, various machine learning approaches were tested in the task of arterial hypertension diagnostics. In earlier works, the same datasets were used to investigate the linear and quadratic DA methods [
The results of the current investigation showed that, for the studied task, discriminant analysis (linear and quadratic) proved to be the most appropriate classifier. These approaches have a high classification score and low deviation over different realizations. Four features in a combination seems to be the optimal number, as the classification accuracy score is higher and more consistent than for combinations of two, three, and five features.
The prevalence of the VLF and LF/HF spectral features among the best combinations might indicate that the sympathetic nervous system plays an important part in the initiation of arterial hypertension and in the maintenance of the increased vascular tone as well as the increased cardiac output. These results are in accordance with the current scientific interpretation of arterial hypertension development [
The results of the suggested approach were compared with a dataset prepared by the commonly used procedure of principal component analysis. Results of the
In future works, our research group will continue to improve the results on this problem. One planned investigation is to analyze the robustness of classifiers based on multiple signals recorded simultaneously. Another promising direction of future investigation is the use of advanced neural networks [
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The work was supported by the Act 211 Government of the Russian Federation, Contract no. 02.A03.21.0006, and by the FCT project AHA CMUPERI/HCI/0046/2013.