A Fetal Electrocardiogram Signal Extraction Algorithm Based on Fast One-Unit Independent Component Analysis with Reference

Fetal electrocardiogram (FECG) extraction is very important procedure for fetal health assessment. In this article, we propose a fast one-unit independent component analysis with reference (ICA-R) that is suitable to extract the FECG. Most previous ICA-R algorithms only focused on how to optimize the cost function of the ICA-R and payed little attention to the improvement of cost function. They did not fully take advantage of the prior information about the desired signal to improve the ICA-R. In this paper, we first use the kurtosis information of the desired FECG signal to simplify the non-Gaussian measurement function and then construct a new cost function by directly using a nonquadratic function of the extracted signal to measure its non-Gaussianity. The new cost function does not involve the computation of the difference between the function of the Gaussian random vector and that of the extracted signal, which is time consuming. Centering and whitening are also used to preprocess the observed signal to further reduce the computation complexity. While the proposed method has the same error performance as other improved one-unit ICA-R methods, it actually has lower computation complexity than those other methods. Simulations are performed separately on artificial and real-world electrocardiogram signals.


Introduction
The fetal electrocardiogram (FECG) contains much important information about the health and possible diseases of the fetus, which reflects the complete view of the heart activities. Additionally, it is more sensitive than color Doppler ultrasound in the case of fetal acidosis and anoxia. With the analysis of the FECG, the doctor can discover fetal hypoxemia, umbilical abnormality, and other fetal abnormality situations timely and take early effective actions to ensure the health of the fetus. However, the FECG signal is considerably weaker than the maternal electrocardiogram (MECG) and is often embedded in the noise, MECG, baseline wandering, power line interference, and so forth. Accordingly, it is quite difficult to extract the FECG signal and further reduce the diagnostic accuracy.
A number of methods have been reported for extracting the FECG signal, like filtering method [1], singular value decomposition [2], wavelet transform [3], independent component analysis (ICA) [4], and others [5]. Every method has its advantages and disadvantages [5,6]. In this paper, we only consider the method based on ICA. The FECG extraction based on the ICA methods considers the extracting FECG signal as a blind source separation problem. They consider the unknown FECG signal, MECG, and other interference signals as source signals and the measured signals from the maternal abdomen and chest as mixed signals (measured signals). It actually can separate all source signals, including the FECG signal and MECG signal. Compared with the other methods, it has a simple structure and has been proved to be efficient in extracting FECG in many studies [4][5][6][7][8][9]. However, for extracting the FECG, the ICA method must first separate all the source signals from the measured signals and then artificially selects the FECG signal. When the number of measured signals is very large, the extraction of the FECG signal will consume plenty of time.

Computational and Mathematical Methods in Medicine
To overcome this problem, the independent component analysis with reference (ICA-R) is applied to extract the FECG [10][11][12][13][14], which separates the desired FECG signal by using some prior information about the desired FECG without separating all the source signals. Therefore, this FECG extraction method based on ICA-R is more efficient than those based on ICA. ICA-R was proposed and implemented by Lu and Rajapakse [15][16][17] by incorporating the prior information of the desired signal into the contrast function of the ICA [18]. To improve the ICA-R performance, some authors proposed improved ICA-R methods. For example, to reduce the computational complexity of ICA-R, Lin et al. [11] proposed a fast one-unit ICA-R by using prewhitening to deal with the observed signal. Compared with the original ICA-R, this method has shorter running time and the same error. Also, Huang and Mi [19] studied the gradient of inequality constraint function and corrected the gradient. This method has a higher success rate for extracting the desired signal. In addition, Li et al. [14] first divided the cost function of ICA-R into two parts that are the negentropy function and the closeness measure function and then directly used the firstorder derivative of the negentropy contrast function to update the separating vector. This method has faster convergence rate and higher success rate to extract the desired signal than the previously reported method [19]. Moreover, Sun and Shang [20] studied the stability of ICA-R and proposed an initialization method of separating the matrix. This method has a higher accuracy and stability than the original ICA-R. Kavuri et al. [21] proposed a one-unit ICA-R by using evolutionary algorithm to optimize the cost function to find a global optimal solution. Although this method has a smaller error than the original ICA-R, it does consume a considerable amount of time. Additionally, Zhang et al. [22] proposed the ICA-R based on kurtosis and analyzed how to choose the reference signal. Wang [23] used the method that has been used in the fast ICA to propose a fixed-point ICA method. Rodriguez et al. [24] proposed a multiunit ICA-R method based on nonorthogonal decoupling of separated matrix. This method cannot be extended to one-unit ICA-R. Mi [25] proposed a strategy to detect future misconvergence to improve the probability of extracting the desired signal. Chen et al. [26] proposed an ICA-R that can be used in single channel by discrete wavelet transform.
The one-unit ICA-R is a special case of the multiunit ICA-R. It only separates one desired source signal every time. The only difference between the one-unit ICA-R and the multiunit ICA-R is that the former only extracts one desired signal, and the latter is based on the former and uses the orthogonalization method to extract multiple desired signals. If we use the deflationary orthogonalization method or symmetric orthogonalization method, which are widely used in ICA [18] to extract the signal in the one-unit ICA-R, the one-unit ICA-R can be easily extended to multiunit ICA-R. In the FECG extraction, we only want to extract the FECG with just one signal; thus, the one-unit ICA-R is enough to extract FECG.
Although the one-unit ICA-R and multiunit ICA-R have a little difference, they share the same cost function. Additionally, they both use the same non-Gaussian measurement function that is used in the classical ICA algorithm to measure the non-Gaussian component of the extracted signal. The negentropy approximation is needed to compute the difference between the function of the extracted signal and the function of a Gaussian vector that has the same mean and variance as the extracted signal. This will consume plenty of time. The classical ICA algorithm has no information about the source signals, except that the non-Gaussian measurements and statistics are independent. However, the ICA-R has prior information about the desired signal, but the previous methods did not use such information to reduce the computational complexity of the negentropy approximation. The aim of this paper is to reduce the computational complexity for the one-unit ICA-R by taking advantage of the prior information of the desired ECG to simplify the negentropy approximation.
This paper is organized as follows. Section 2 reviews the previously reported algorithms including the traditional ICA and one-unit ICA-R. Section 3 analyzes in detail the non-Gaussian measurement function that is used in the one-unit ICA-R cost function. In Section 4, we simplify the non-Gaussian measurement function based on the analysis in Section 3 and, then, derive an improved one-unit ICA-R with lower computation complexity. In Section 5, we test Mi's method [25] and our proposed method on artificial ECG data and real-world ECG data to compare the performance of each method. Some conclusions are included in Section 6.

Traditional ICA.
The traditional ICA is a signal processing method for estimating the source signals from observed signals that are a mixture of source signals. The model of linear ICA is as follows: where x( ) = [ 1 ( ), 2 ( ), . . . , ( )] is the observed signal, A is the mixing matrix with the size ( × ), and s( ) = [ 1 ( ), 2 ( ), . . . , ( )] is the source signal. The aim of the ICA is to estimate the source signal s( ) from the observed signal x( ) by computing the separating matrix W = [w 1 , w 2 , . . . , w ] with size ( × ). The estimated source signal can be expressed as follows: where y( ) = [ 1 ( ), 2 ( ), . . . , ( )] is the estimated source signal, which is the same as s( ) under ideal condition. However, under real condition, the shape of the waveforms of y( ) is close to s( ), except for the amplitude and sequence of the waveforms. The source signals s( ) are considered as non-Gaussian and mutually statistically independent. Accordingly, it has strong non-Gaussian properties. The mixing signals x( ) have strong Gaussian characteristics according to the central limit theorem. When the estimated source signal is close to the source signal s( ), it will have strong non-Gaussian properties. Thus, the ICA uses a non-Gaussian measurement function of the estimated source signals as a cost function [18]: where is a positive constant, is the estimated signal (extracted signal), is the nonquadratic function, and V is a Gaussian variable with the same mean and variance as . By maximizing the cost function, we can obtain the separating matrix and recover the source signal.

One-Unit ICA-R.
The ICA can separate all the source signals from the observed signals. However, if we only want to extract some or one of the desired source signals, we must use ICA to separate all source signals and further select the desired source signal from the estimated signals. When the dimension of observed signals is very large, especially for some biomedical signals, it will take a long period of time to extract the desired signal. The one-unit ICA-R was proposed by Lu and Rajapakse [15] to reduce computational time and avoid additional operations to select the desired source signal from the estimated signals.
The one-unit ICA-R method combines prior information of the desired source signal into the ICA cost function (3) and constructs a new cost function. The cost function for the oneunit ICA-R is as follows [15]: where = w x, w is the separating vector, x is the observed signals, is the extracted signal, is a positive constant, and V is a Gaussian vector with the same mean and variance as . The inequality constraint term is ( ) = ( , ) − , where is the reference signal constructed by the prior information of desired signal, is used to measure the closeness between the estimated signal and the reference signal , and is a threshold parameter to control the closeness level. A common and simple measure of closeness is the mean squared error (MSE) ( , ) = {( − ) 2 }. ℎ( ) is used to restrict the estimated signal to unit variance. The above inequality constraints can be transformed into equality constraints by introducing the slack factor . Thus, the cost function of the one-unit ICA with reference to equality constraint can be expressed as follows: The augmented Lagrange multiplier method is used to optimize the above cost function with equality constraint and derive the augmented Lagrange function that can be expressed as follows: where and are the Lagrange multipliers for the constrains ( ) and ℎ( ), respectively, and is a scalar penalty parameter. A Newton-like learning algorithm is applied to optimize the above equation and get the updated separating vector. The updated formula is as follows [15]: where is the current iterative number, is the fixed learning rate, R is the covariance matrix of the observed signals, and and are the first and second derivatives of , respectively, and can be expressed as , ( ) and ( ) are the first and the second derivatives of ( ) with respect to , respectively, and ( ) and ( ) are the first and the second derivatives of ( ) with respect to , respectively. The optimum multipliers and are updated by the following equations: The same author Lu and Rajapakse proposed [17] another cost function for the one-unit ICA-R, as follows: The learning rule of the separating vector is where The symbols of the parameters in (12) have the same meaning as those in (7). Many studies about the ICA-R method are based on the cost function (11)

Analysis of the Non-Gaussian Measurement Function in the One-Unit ICA-R Method
The non-Gaussian measurement function (5) that is also the cost function of the fast ICA can also be expressed as where |(⋅)| is the absolute value of (⋅). Maximizing (14) equals maximizing (3). In ICA-R, V is a Gaussian variable having zero mean and unit variance [17]. It actually has nothing to do with the separating vector w and can be seen as a constant for w. Accordingly, maximizing (14) equals maximizing or minimizing { ( )}. We denote it by ( ); that is, where = w x = w As = b s, s is the source signals, A is the mixing matrix, and w is the separating vector. In the oneunit ICA-R, we only extract one desired signal. We consider that the successfully extracted signal equals ; hence, the At the stability point, the first-order derivative of (b s) with respect to b * is The { ( ) } ̸ = = 0, which means that the source signals with zero are mutually statistically independent, is used in (16). It is a fundamental assumption in ICA-R [17] and its second derivative is as follows: Adding a small fluctuation = ( 1 , . . . , −1 , , +1 , . . .) into b * , we obtain Taylor's series expansion of (b new s) at the point b * is In the one-unit ICA-R, the extracted signal is considered to have unit variance. When the desired signal has unit variance, then ‖b * + ‖ = I. That is, 2 1 + 2 2 + ⋅ ⋅ ⋅ + ( + 1) 2 + ⋅ ⋅ ⋅ = 1. Thus, we obtain ( + 1) = √1 − 2 1 ⋅ ⋅ ⋅ − 2 −1 − 2 +1 ⋅ ⋅ ⋅ and its Taylor's series at point zero is Substituting (20) in (19), we get As the components of are very small, we can neglect its higher order term in (20) and (21). From the last item in (21), we can obtain that when { ( )}− { ( )} > 0 then (b * ) is the minimum, and vice versa is the maximum.   (14) can be simplified as

Improve One-Unit ICA-R Method
As V has nothing to do with the separating vector, we can further simplify the non-Gaussian measurement function (22) as On the other hand, when the { ( )} − { ( )} > 0, the non-Gaussian measurement function can be simplified as Therefore, the negentropy approximation in (23) and (24) can be expressed as In [16], the following function ( ) was proposed for extracting the super-Gaussian and sub-Gaussian signal, respectively.
where , ∈ + . In [25], Mi has proved that (27) should be corrected as Accordingly, in this paper, we use (29) instead of (27). For simplicity, we set = 1 in (29). The first and second derivatives are tanh( ) and 1 − tanh 2 ( ), respectively. If we use (29) as a nonlinear function, in (25) can be expressed as follows: In (30), we used Taylor's series expansion of tanh( ), which is described as Neglecting the higher order term in (30) and considering that the desired signal has unit variance, which is a very common hypothesis in ICA and ICA-R, we can obtain the following: According to the kurtosis definition of a super-Gaussian signal, when is a super-Gaussian signal with unit variance, { 4 } > 3. Then < 0 in (32). This means that if the desired signal is super-Gaussian, we can directly simplify (25) as

Improved One-Unit ICA-R for Extracting FECG.
The oneunit ICA-R is used to extract one desired signal. The prior information that the desired signal is a super-Gaussian or a sub-Gaussian signal is easily obtained. The FECG and MECG are both super-Gaussian signals in general. Therefore, in the extraction of FECG, we can directly use (33) instead of (14) to obtain the non-Gaussian measurement of an extracted signal. Centering and whitening can reduce the computation complexity of one-unit ICA-R [11]. Thus, we use centering and whitening to process the observed signals. After preprocessing, the new cost function of the one-unit ICA-R for extracting FECG can be expressed as where = w x is the extracted signal, x is the centered and whitened signal, and w is the separating vector. in (34) is different from̂in (12). is just a positive constant. Correspondingly, the augmented Lagrangian function ( , ) for the problem in (34) is expressed as follows: To find the maximum of ( , ), Newton's method is used to optimize ( , ). The learning rule of the separating vector is given by the following equation: where is the learning rate: The main difference between the improved algorithm in (36) and the original algorithm in (12)
The one-unit ICA-R uses Newton's method to optimize cost function. Newton's method is sensitive for the initial value. The initial separating vector directly affects the convergence of Newton's method. However, the original algorithm randomly initializes the separating vector. To avoid random initialization, inspired by previous a report [20], we use the following expression to initialize the separating vector: where r is the whitened reference signal, x is the whitened signal, x −1 is the inversion of x, and w 0 is the initialization separating vector. The closer to whitened desired signal r is, the closer to the perfect separating vector w 0 will be. This can speed up the algorithm convergence rate and improve the probability of global convergence compared with random initialization. Therefore, the proposed one-unit ICA-R algorithm for extracting FECG can be described as follows: (1) Center the observed signals.

Simulation and Discussion
To demonstrate the effectiveness of the proposed one-unit ICA-R algorithm, both artificial ECG data and real-word ECG data are used in the following two experiments. To quantitatively compare the performance of the proposed oneunit ICA-R and Mi's one-unit ICA-R [25], we adopt the individual performance index (IPI) as an indicator function, as was used in a previous study [11]. It is defined as where | ⋅ | denotes the absolute value operator, denotes element of the global system vector p = w BA = ( 1 , 2 , . . . , ), A is mixing matrix, B is the whiten matrix, w is the separating vector of mixed signals after whitening, w B is the global separating vector, is the number of mixing signals, and max | | is to find the maximum valued among the elements in the vector p = ( 1 , 2 , . . . , ). The extracted signal by proposed method can be described as From (41), it can be easily seen that if p = (0, 0, . . . , , . . . , 0), ̸ = 0, we will have the extracted signal = , which is just rescaled signals of . It is obvious from (40) that we have IPI ≥ 0 and that IPI equals zero if and only if p is a rescaled canonical base vector that only one element of vector is not zero. Therefore, the closer to a rescaled canonical base vector the global system vector p, the nearer to zero the IPI and thus the better the performance of one-unit ICA-R method.
In the experiments, we select ( ) = log cosh ( ) as the nonquadratic function and ( , ) = {( − ) 2 } as the closeness measure function. Additionally, ( ) = ( , ) − . We also use the corrected first derivative of ( ) as its first derivative. The corrected first derivative is given in [19] as In the following experiments, we set , , and all equal to one and perform 1000 trials for each data set.

Experiment with Artificial ECG Data.
The artificial signals include the power line interference with 50 Hz, Gaussian noise, FECG, and MECG. The FECG and MECG are generated by using the ECG toolbox of Sameni [28]. The reference signals for the FECG and MECG are constructed with sign operation to roughly give the signs of the most data samples of the desired source signal [14,15]. There are 5000 samples in the experiment. The mixing matrix is randomly generated as ] .
Four artificial source signals are depicted in Figure 1, which are considered independently of each other. s1 is the power line interference with sub-Gaussian distribution. s2 is the random Gaussian noise with Gaussian distribution. s3 and s4 are FECG and MECG with super-Gaussian distribution, respectively. The mixtures are shown as mixed signals in Figure 2. The FECG extracted by the proposed method and its reference are shown in Figure 3. The MECG extracted by the proposed method and its reference are shown in Figure 4. Comparisons of the extracted FECG and the extracted MECG are shown in Figures 3 and 4 with the desired signals s3 and s4 in Figure 1, respectively. It is evident that the waveforms of the extracted signals are most identical to the waveforms of the desired signals.
To quantitatively compare the performance of the proposed method and that of Mi's method [25], the IPI, defined in a previous report (40), are computed for the extraction results of both algorithms in the same experiment. Both algorithms use the same parameters and reference signal. The results are shown in Table 1, where it can be seen that the IPIs    widely used by other researchers [11][12][13]25]. It was recorded over 10 s and sampled at 250 Hz by placing eight electrodes on different locations of a pregnant woman. The real-word ECG data are shown in Figure 5, where the signals Ch1-Ch5 were the recordings from five electrodes placed on the woman's abdomen. Accordingly, the FECG, respiratory motion artifacts, and the MECG were visible in these recordings. The signals Ch6-Ch8 were the recordings from three electrodes placed on the woman's thorax. In these thoracic measurements, the FECG was invisible because of the distance between the fetus and the chest leads.
In the Ch1 recording, the strong and slow heart belongs to the mother, while the weak and fast belongs to the fetus. Therefore, we can use the information about fetus in the Ch1 recording to construct reference signal for extracting the FECG. We use impulse series whose occurrence time is the same as that of the subpeaks in the first channel signal Ch1 as reference for extracting the FECG. The impulse series whose occurrence time is the same as the peaks of any channel signal were used as reference for extracting the MECG. The reference signal and the FECG signal extracted by both methods are shown in Figure 6, where the reference signal is denoted as Ref, the FECG extracted by the proposed method is denoted as A, the FECG extracted by Mi's method is denoted as B, and both extracted FECG signals are redescribed in C and represented with different color curves. The reference signal and the MECG signal extracted by both methods are shown in Figure 7, where the reference signal is also described as Ref, the MECG extracted by the proposed method is denoted as A, the MECG extracted by the previous method is denoted as B, and both extracted MECG signals are redescribed in C with different color curves. We can see in Figures 6 and 7 that the FECG and MECG are both successfully extracted by both methods. Since the mixing A and the pure FECG and MECG signals are not available for the real-word ECG recordings, the IPI performance cannot be computed. But we can compare the waveforms of extracted signals by both methods to estimate their relative error. Therefore, we redescribe the extracted FECG and MECG signal waveforms by both methods in the same subfigure C with different color curves in Figures 6 and 7, respectively. From the subfigure C in Figures 6 and 7, it can be seen that the different color curves are almost the same. This means that the proposed method and Mi's method have almost the same error. However, their consumed running time is considerably different, as shown in Table 2. The running time is approximately reduced to half of that of the Mi's algorithm.

Conclusion
In this paper a fast one-unit ICA-R algorithm for extracting ECG is proposed by simplifying the non-Gaussian measurement method of the extracted signal. In the proposed algorithm, the nonlinear function of the extracted signal is directly used as non-Gaussian measurement function. This avoids having to compute the difference between the function of the extracted signal and the function of the Gaussian random vector with the same mean and variance as the extracted signal. Centering and whitening are also used to preprocess the observed data to avoid computing the observed signal covariance matrix. As a result, the computation complexity for the one-unit ICA-R algorithm is greatly reduced.
The validity of the proposed algorithm is tested and compared to Mi's algorithm using artificial ECG data and real-world ECG data. Both experimental results demonstrate that the convergence rate of the proposed algorithm is two times faster than Mi's algorithm, while both methods have the same error.