Multifeature Deep Cascaded Learning for PPG Biometric Recognition

Aiming at the problem that the traditional photoplethysmography (PPG) biometric recognition based on sparse representation is not robust to noise and intraclass variations when the sample size is small, we propose a PPG biometric recognition method based on multifeature deep cascaded sparse representation (MFDCSR). The method consists of multifeature signal coding and deep cascaded coding. The function of multifeature signal coding is to extract the shape, wavelet, and principal component analysis features of the PPG signal and to perform sparse representation. Deep cascaded coding is multilayer feature coding. Each layer combines multifeature signal coding with the result of the previous layer as input, and the output of each layer is the input of the next layer. The function of deep cascade coding is to learn the features of the PPG signal, layer by layer, and to output the category distribution vector of the PPG signal in the last layer. Experiments demonstrate that MFDCSR has better recognition performance than current methods for PPG biometric recognition.


Introduction
Photoplethysmography (PPG) biometric recognition has attracted the attention of many researchers in the past decade [1][2][3][4][5][6]. PPG signals share the characteristics of traditional biometric traits, such as universality, uniqueness, stability, and ease of collection, and also have the following advantages: (1) Liveness detection. PPG signals can only be captured from living individuals. (2) High security. PPG signals are very difficult to imitate. (3) Universality. PPG signals can be captured from any individual. (4) Small amount of data. A PPG signal is one-dimensional and is easier to store and process than image signals such as fingerprints and irises.
Although many findings have been published, research on PPG biometric recognition is still at the laboratory stage and far from practical application. There is no mature PPG biometric recognition product on the market, mainly for the following reasons: (1) Noise. PPG signals often contain different kinds of noise due to a variety of factors, including acquisition equipment, body position, and collection environment. During the acquisition process, the noise level of each individual also varies over time. (2) Intraclass variation. Differences in acquisition principles and in the environment required by different types of acquisition equipment lead to considerable variance in the height of the main wave, descending midisthmus amplitude, height of the repetition wave, and pulse signal origination point of acquired PPG signals. The PPG signal is also nonstationary and changes over time. Therefore, intraclass variation is a very challenging problem for PPG biometric recognition.
Sparse representation selects a linear dictionary of original training signals to reconstruct the testing signal and has been used in fields such as noise removal, signal compression, feature extraction, and pattern recognition [7][8][9]. Sparse representation can concentrate the energy of PPG signals in a small number of samples and reconstruct the signal from them, thus effectively removing noise and redundant information. However, sparse representation is a shallow decision model. When a PPG signal contains a large quantity of noise and intraclass variations, the shallow decision model has weak learning ability and poor robustness in PPG biometric recognition. Therefore, how to further improve the robustness of sparse representation in PPG biometric recognition is a problem worthy of study.
Most PPG signal feature extraction methods extract a single feature, such as waveform, global, or wavelet features. Because PPG signals are affected by noise and intraclass variations, a single feature often fails to give reliable identification results. Different features carry complementary information, which provides more adequate information for PPG biometric recognition. In multifeature learning, fusing heterogeneous features of PPG signals can improve recognition performance. However, the heterogeneity of the features also poses obstacles to fusion learning. Therefore, it is worthwhile to study how to establish the connection between PPG features and thus obtain an effective fused feature representation of PPG signals.
To solve the above problems, we integrate sparse representation learning into deep cascade learning and propose a multifeature deep cascaded sparse representation (MFDCSR) for PPG biometric recognition. A simple illustration of the proposed methodology is shown in Figure 1. The proposed method includes two parts: multifeature signal coding and deep cascade coding. Multifeature signal coding extracts the shape, wavelet, and principal component analysis features of the PPG signal and performs sparse representation. Deep cascaded coding learns the discriminative features of the PPG signal, layer by layer, and outputs the category distribution vector of the PPG signal in the last layer. The main innovations of this work are as follows: (1) The proposed multilayer feature extraction model based on multifeature deep cascaded sparse representation does not require a large training database and has good feature representation capabilities. (2) The application of multifeature learning to PPG biometric recognition improves the recognition performance by exploiting the complementarity of features. (3) Because different base classifiers can be substituted, the proposed model has good scalability.
The rest of this paper is organized as follows: we review the related work in Section 2, introduce the proposed methodology in Section 3, and report the experimental results and analysis in Section 4. Finally, we give a brief conclusion and some future work in Section 5.

Related Work
The existing methods of PPG biometric recognition can be divided into fiducial and nonfiducial methods. The fiducial methods use time domain characteristics as fiducial points, such as the amplitude, time interval, and slope of the PPG signal. Chakraborty and Pal [1] used the first and second derivatives to extract 12 feature points of amplitude and time from the PPG signals of 15 healthy individuals and calculated the Euclidean distance of the PPG signal feature statistical parameters for recognition. Lee and Kim [2] took 708 data records from 10 healthy individuals and used a feedforward neural network for PPG biometric recognition. Nadzri et al. [3] extracted the systolic peaks, diastolic peaks, and dicrotic notches of PPG signals and used the Bayes network, radial basis function, and multilayer perceptron for PPG biometric recognition. Sancho et al. [4] extracted time domain and Karhunen-Loève transform features and used matching metrics on four different PPG databases.
Nonfiducial approaches take a more holistic view and extract the overall signal morphology. Spachos et al. [5] proposed a PPG biometric recognition method based on linear discriminant analysis (LDA) and K-Nearest Neighbor (KNN). Karimian et al. [6] used the discrete wavelet transform (DWT) for PPG biometric recognition. Yadav et al. [10] proposed a method based on the continuous wavelet transform (CWT) and direct linear discriminant analysis (DLDA). Faragó et al. [11] presented a correlation-based nonfiducial feature extraction technique that computes correlations of PPG signals. Lee et al. [12] used the discrete cosine transform (DCT) for PPG biometric recognition.
Recently, many PPG biometric recognition methods based on deep learning have been proposed. Everson et al. [13] proposed a PPG biometric recognition method based on a four-layer deep neural network, which included two convolutional neural network layers and two long short-term memory (LSTM) layers. Biswas et al. [14] proposed a deep learning framework that estimates heart rate and performs PPG biometric recognition using only wrist-worn single-channel PPG signals collected in a mobile environment. Hwang and Hatzinakos [15] proposed a PPG recognition method that uses a convolutional neural network with long short-term memory to construct a personalized data-driven network and model the time-series sequence inherent in the PPG signal.
The fiducial and nonfiducial approaches are sensitive to external factors, and their recognition results for PPG signals are not always reliable. Although deep learning has good recognition performance, it needs more computing resources, more training time, and extensive tuning of training parameters. Moreover, the interpretability and theoretical analysis of deep learning are still not completely clear.

Proposed Methodology
The proposed methodology includes multifeature signal coding and deep cascade coding. We describe the whole procedure in detail below.

Multifeature Signal Coding.
Multifeature signal coding contains multifeature extraction and sparse residual coding (SRC). We first extract the shape, wavelet, and principal component analysis (PCA) features of the PPG samples. Then, we obtain the sparse representation coding of each sample.
(1) Shape Feature. We acquire the fiducial points of PPG signals, from which forty features can be extracted [16]. These include amplitudes and time intervals of PPG signals, such as the pulse interval, augmentation and alternative augmentation index, systolic peak time, and dicrotic notch time.
(2) Wavelet Feature. The low-frequency component of PPG signals contains the discriminative feature, and the high-frequency component contains the detail feature. The discrete wavelet transform extracts the low- and high-frequency components, yielding wavelet coefficients that carry the discriminative and detailed information of PPG signals in time and frequency [17]. We choose the Daubechies wavelet Db8 as the wavelet basis, which can reduce the influence of noise.
(3) PCA Feature. A PCA feature is a linear combination of the original features; PCA reduces the dimension by mapping the high-dimensional data space onto a low-dimensional subspace spanned by the principal components. In this way, we obtain the principal component features of the PPG signal [18]. PCA features summarize the most important features and compress the scale of the original PPG signals.
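As a rough illustration of the wavelet and PCA features described above, the following minimal sketch decomposes a signal into low- and high-frequency coefficients and projects samples onto a leading principal direction. It makes several assumptions that are not part of the method itself: the Haar wavelet is used purely for brevity in place of Db8, only the first principal component is computed (by power iteration), and the function names, the three-level depth, and the toy data are all hypothetical.

```python
import math

def haar_wavedec(signal, levels=3):
    """Multilevel Haar DWT (a stand-in for Db8): returns [cA_n, cD_n, ..., cD_1]."""
    approx, details = list(signal), []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last value
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / root2 for a, b in pairs])  # high-frequency detail
        approx = [(a + b) / root2 for a, b in pairs]         # low-frequency approximation
    return [approx] + details[::-1]

def first_principal_component(rows, iterations=200):
    """Leading PCA direction via power iteration on the sample covariance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d  # start vector; assumed not orthogonal to the leading direction
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v

def pca_project(row, mean, direction):
    """Coordinate of one sample along the principal direction."""
    return sum((x - mu) * u for x, mu, u in zip(row, mean, direction))

# Hypothetical toy data, not PPG recordings: a slow wave plus a small fast ripple.
toy_signal = [math.sin(2 * math.pi * k / 32) + 0.05 * (-1) ** k for k in range(64)]
wavelet_feature = [c for band in haar_wavedec(toy_signal) for c in band]

# Hypothetical 2-D samples lying on the line y = 2x.
toy_rows = [[t, 2.0 * t] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
mean, direction = first_principal_component(toy_rows)
pca_feature = [pca_project(r, mean, direction) for r in toy_rows]
```

In practice a wavelet library that provides Db8 filters and a full PCA routine would replace these hand-written helpers; the sketch is only meant to show how each feature vector is formed.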

Sparse Residual Coding.
We assume that $X = (X_1, X_2, \ldots, X_C) \in \mathbb{R}^{m \times n}$ represents the PPG training samples, where $C$ is the number of classes, $m$ is the sample dimension, each class $X_i = (x_{i1}, x_{i2}, \ldots, x_{in_i}) \in \mathbb{R}^{m \times n_i}$ has $n_i$ samples, and $n = \sum_{i=1}^{C} n_i$. A testing sample $y$ can be represented by a linear combination of the training samples $X$ as
$$y = X w^p, \tag{1}$$
where $w^p = (0, \ldots, 0, w_{i1}, w_{i2}, \ldots, w_{in_i}, 0, \ldots, 0)^T \in \mathbb{R}^n$ is a coefficient vector whose nonzero entries belong to the $i$-th class. The advantage of representing the test sample as a linear combination of training samples has been explored in [7][8][9].
As in [9], the sparse representation coefficient of sample $y$ can be obtained as follows:
$$w^p = \arg\min_{w} \|y - Xw\|_2^2 + \lambda_1 \|w\|_1, \tag{2}$$
where $\lambda_1$ is a regularization parameter and $\|\cdot\|_1$ is the $L_1$ norm.
We can obtain the coefficient $w^p$ by solving equation (2), and the sparse residual $r_i^p$ can be obtained as
$$r_i^p = \|y - X\delta_i(w^p)\|_2, \tag{3}$$
where $\delta_i(\cdot)$ is a function that keeps only the coefficients associated with the $i$-th class and sets all others to zero, $1 \le i \le C$. At last, we obtain the residual representation $s^p$ from the sparse residuals $r_i^p$. If a testing sample $y$ has label $i$, then the entry $s_i^p$ is larger than the entries of $s^p$ for other labels, which gives more discrimination information.
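The coding step above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes an $L_1$-regularized least-squares objective of the form $\|y - Xw\|_2^2 + \lambda_1\|w\|_1$, solves it with ISTA (iterative soft thresholding, chosen here only for simplicity), and computes one residual per class with the Euclidean norm. The conversion of residuals into the representation $s^p$ is not reproduced; the sketch simply picks the class with the smallest residual. All function names and the toy dictionary are hypothetical.

```python
import math

def soft_threshold(x, t):
    """Proximal operator of t * |x|."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def sparse_code(samples, y, lam=0.01, iterations=500):
    """ISTA for min_w ||y - Xw||_2^2 + lam * ||w||_1 (objective form assumed).

    `samples` holds the columns of X, each a list of length m.
    """
    n, m = len(samples), len(y)
    # 2 * ||X||_F^2 upper-bounds the Lipschitz constant of the smooth term's gradient.
    step = 1.0 / (2.0 * sum(v * v for s in samples for v in s))
    w = [0.0] * n
    for _ in range(iterations):
        recon = [sum(w[k] * samples[k][j] for k in range(n)) for j in range(m)]
        grad = [-2.0 * sum(samples[k][j] * (y[j] - recon[j]) for j in range(m))
                for k in range(n)]
        w = [soft_threshold(w[k] - step * grad[k], step * lam) for k in range(n)]
    return w

def class_residuals(samples, labels, y, w):
    """Residual of y when only the coefficients of one class are kept."""
    residuals = {}
    for c in sorted(set(labels)):
        recon = [sum(w[k] * samples[k][j]
                     for k in range(len(samples)) if labels[k] == c)
                 for j in range(len(y))]
        residuals[c] = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, recon)))
    return residuals

# Hypothetical toy dictionary: two classes with two training samples each.
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
train_labels = [0, 0, 1, 1]
query = [1.0, 0.05, 0.0]

weights = sparse_code(train, query)
residuals = class_residuals(train, train_labels, query, weights)
predicted = min(residuals, key=residuals.get)  # class with the smallest residual
```

The step size is deliberately conservative (it uses the Frobenius norm rather than the spectral norm), which keeps the iteration stable at the cost of slower convergence.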
To simplify the description, we define the residual coding $v$ of a testing sample $y$ with respect to the training samples $X$ as $v = \mathrm{SRC}_X(y)$.
Similarly, each column in $W^X_1$ is input into the SRC unit to obtain the coding vector set $W_1$, which is concatenated with $W^X_1$ to form the input training samples $M_1$ of the second level: $W_1 = \mathrm{SRC}_{W^X_1}(W^X_1)$, $M_1 = (W_1^T, (W^X_1)^T)^T$. Second, in a similar way, the codings $v^y_2$ and $v^y_3$ of the query sample $y$, each augmented by the coding obtained at the previous level, are input as the query samples of the third and fourth levels, respectively. $W^X_2$ and $W^X_3$, each augmented by the training samples obtained at the previous level, are input as the training samples of the third and fourth levels, respectively. All of these steps are repeated $N$ times, and the final query coding $d_N \in \mathbb{R}^{2C \times 1}$ and training coding $M_N \in \mathbb{R}^{2C \times n}$ are obtained.
Finally, $d_N \in \mathbb{R}^{2C \times 1}$ and $M_N \in \mathbb{R}^{2C \times n}$ obtained at level $N$ are input into the sparse representation classification to get the final prediction coding $v^{(N+1)} = \mathrm{SRC}_{M_N}(d_N)$.
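The dimensions in the cascade can be checked with a short worked example. This is only a bookkeeping sketch: it assumes that every SRC unit returns one value per class, so that the first-level feature coding matrix $W^X_1$ has size $C \times n$, which is consistent with the stated sizes of $M_N$, $d_N$, and $v^{(N+1)}$ but is not stated explicitly for $W^X_1$.

```latex
\begin{aligned}
W^X_1 &\in \mathbb{R}^{C \times n} && \text{(assumed)} \\
W_1 &= \mathrm{SRC}_{W^X_1}(W^X_1) \in \mathbb{R}^{C \times n} \\
M_1 &= \begin{pmatrix} W_1 \\ W^X_1 \end{pmatrix} \in \mathbb{R}^{2C \times n} \\
v^{(N+1)} &= \mathrm{SRC}_{M_N}(d_N) \in \mathbb{R}^{C \times 1},
\qquad M_N \in \mathbb{R}^{2C \times n},\; d_N \in \mathbb{R}^{2C \times 1}
\end{aligned}
```

Under this assumption the input size stays at $2C$ rows from one level to the next, so adding levels does not enlarge the dictionary that each SRC unit has to solve against.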

Recognition.
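At the recognition stage, we obtain the label of the query sample $y$ from the coding $v^{(N+1)} \in \mathbb{R}^{C \times 1}$ as follows:
$$\mathrm{label}(y) = \arg\max_{1 \le i \le C} v^{(N+1)}_i. \tag{6}$$
The total process of MFDCSR is given as Algorithm 1.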
Input: the PPG training sample set $X \in \mathbb{R}^{m \times n}$, query sample $y \in \mathbb{R}^{m}$, class number $C$, regularization coefficient $\lambda_1$, level number $N$.
Output: the label of the query sample $y$.
(1) Initialize: $i = 1$;
(2) Obtain the feature coding vectors $v^y_1$, $v^y_2$, and $v^y_3$ of the query sample $y$ and the feature coding matrices $W^X_1$, $W^X_2$, and $W^X_3$ by multifeature signal coding;
(10) Obtain the feature coding vector $v^{(N+1)} = \mathrm{SRC}_{M_N}(d_N)$;
(11) Find the index of the maximum value in $v^{(N+1)}$ using equation (6), which gives the label of $y$.

Databases.
To verify the validity of the proposed MFDCSR, we choose three databases: Beth Israel Deaconess Medical Center (BIDMC) [20,21], Multiparameter Intelligent Monitoring for Intensive Care (MIMIC) [22], and CapnoBase [23]. The BIDMC database contains 53 recordings, each 8 minutes long and sampled at 125 Hz, collected from individuals aged 19 to 90 years. MIMIC was captured from patients in ICUs, and its signal recordings are of different types.
The MIMIC database contains PLETH, ABP, and RESP data; PLETH is the PPG signal, sampled at 125 Hz. The CapnoBase database includes PPG, ECG, and other biometric recordings for 42 cases, each 8 minutes long and sampled at 300 Hz. In all experiments, we first take a 1-minute recording per subject as the experimental data and use 60% of the data for training, 30% for validation, and 10% for testing. The testing samples

Performance Metrics.
To verify the correctness and feasibility of the proposed MFDCSR, we use the subject recognition rate as the evaluation criterion, which is defined as follows:
$$\text{Subject recognition rate} = \frac{N_{\text{correct sample}}}{N_{\text{test sample}}} \times 100\%,$$
where $N_{\text{test sample}}$ is the total number of test samples, and $N_{\text{correct sample}}$ is the number of correctly identified testing samples.
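The metric is simple enough to state directly in code. The following is a minimal sketch; the function name and the toy labels are hypothetical.

```python
def subject_recognition_rate(true_labels, predicted_labels):
    """Percentage of test samples whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one test sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Hypothetical example: three of four test samples are identified correctly.
rate = subject_recognition_rate([1, 2, 3, 4], [1, 2, 3, 0])
```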

Parameter Evaluation.
First, we examine how the number of cascade levels influences performance. The subject recognition rates with different level numbers on the three databases are shown in Figure 3. From Figure 3, we can see that the subject recognition rate increases as the level number grows. On BIDMC, the subject recognition rate increases slowly once the level number exceeds seven; on MIMIC and CapnoBase, it increases slowly once the level number exceeds five. Therefore, for all databases, we set the level number to seven in the proposed MFDCSR. Then, we evaluate how the parameters of sparse representation influence performance. As suggested in [24], we set the regularization coefficient $\lambda_1$ according to the sample dimension $m$ of the PPG signals. The iteration number of sparse representation also affects the recognition performance, and the subject recognition rate with different iteration numbers is shown in Figure 4.
From Figure 4, we can see that the subject recognition rate obtains better results if the iteration number is more than 50.
Finally, to evaluate the performance influence of cycle numbers per sample, the subject recognition rate under different cycle numbers is shown in Table 1.
From Table 1, we can see that the recognition performance improves as the number of cycles increases on the three databases. When the number of cycles reaches 1.5, the subject recognition rate is 99.88% on CapnoBase. When the number of cycles varies from 0.5 to 1.5, the recognition performance increases quickly on all three databases. In our experiments, we set 1.5 cycles per sample, which takes about 1-3 seconds and is acceptable for practical application.

Comparison with Single-Feature MFDCSR Method.
MFDCSR uses three features (shape, wavelet, and PCA), and we compare it with MFDCSR using only a single feature. The recognition performance on the three databases is shown in Table 2.
As shown in Table 2, the multifeature MFDCSR performs better than every single-feature MFDCSR on the three databases, which shows that multifeature learning can enhance the recognition performance.
From Figure 5, we can see that MFDCSR has better recognition performance than the other methods at different noise levels on the three databases. MFDCSR can extract more discriminative information through multifeature deep cascaded sparse representation and is more robust than the other methods.

Comparison with State-of-the-Art Methods.
In this section, we compare MFDCSR with state-of-the-art methods for PPG biometric recognition on the three databases, and the results are shown in Table 3.
From Table 3, we can see that MFDCSR outperforms other methods on all three databases. For example, the subject recognition rates of our method increase by 0.21% on the MIMIC database, and it is evident that multifeature deep cascaded sparse representation can enhance the performance of the proposed method.
It is worth noting that MFDCSR differs from our previous work [19]. Our previous work extracted only one feature as the feature descriptor, whereas MFDCSR extracts multiple features: shape, wavelet, and PCA. Our previous work used a multiscale representation to deal with noise, whereas MFDCSR exploits the complementarity of features to improve the recognition performance. In short, MFDCSR is a multiple-feature learning method, and our previous work is a single-feature learning method.
Scientific Programming

Analysis of Computation Time.
It is important to analyze the computation time of the proposed method for PPG biometric recognition, and we report the time cost of the recognition process on the CapnoBase database. We could not obtain the source code of the other methods for PPG biometric recognition, so we give only the time cost of our proposed method in Table 4. Our experiments are carried out on an Intel i7-4790 3.60 GHz CPU with 16 GB RAM using MATLAB 2016b. As shown in Table 4, the feature extraction time of MFDCSR is short. The training time of MFDCSR, which does not use gradient backpropagation, is acceptable, given that deep learning training is known to take more time.

Conclusions
There has been growing interest in PPG biometric recognition in recent years. In this paper, we propose a PPG biometric recognition method based on multifeature deep cascaded sparse representation. First, we extract the shape, wavelet, and PCA features of PPG signals. Then, we use SRC to obtain the sparse representation of the different features. Next, to mine more discriminative information, we perform levelwise coding learning without backpropagation. Finally, experimental results demonstrate that the PPG biometric recognition method based on multifeature deep cascaded sparse representation has better recognition performance. In future work, we aim to change the cascade model to improve the feature extraction technique for PPG biometric recognition.