A Hybrid Model of Maximum Margin Clustering Method and Support Vector Regression for Noninvasive Electrocardiographic Imaging

Noninvasive electrocardiographic imaging, such as the reconstruction of myocardial transmembrane potentials (TMPs) distribution, can provide more detailed and complicated electrophysiological information than the body surface potentials (BSPs). However, the noninvasive reconstruction of the TMPs from BSPs is a typical inverse problem. In this study, this inverse ECG problem is treated as a regression problem with multi-inputs (BSPs) and multioutputs (TMPs), which will be solved by the Maximum Margin Clustering- (MMC-) Support Vector Regression (SVR) method. First, the MMC approach is adopted to cluster the training samples (a series of time instant BSPs), and the individual SVR model for each cluster is then constructed. For each testing sample, we find its matched cluster and then use the corresponding SVR model to reconstruct the TMPs. Using testing samples, it is found that the reconstructed TMPs results with the MMC-SVR method are more accurate than those of the single SVR method. In addition to the improved accuracy in solving the inverse ECG problem, the MMC-SVR method divides the training samples into clusters of small sample sizes, which can enhance the computation efficiency of training the SVR model.


Introduction
The technique of noninvasive imaging of the heart's electrical activity from the body surface potentials (BSPs) constitutes one form of the inverse problem of ECG [1,2]. Approaches to solving the inverse ECG problem have been usually based on either an activation-based model or a potential-based model, which includes epicardial, endocardial, or transmembrane potentials. Activation-based models are used to investigate the arrival time of the propagation wavefront within the myocardium [3,4]. The potential-based models are used to evaluate the potential values on the cardiac surface [5][6][7] or within the myocardium [8] at certain time instants. In this study, we explore a new solution for ECG inverse problem using the potential-based approach.
Due to its inherent ill-posed property, the inverse ECG problem is usually solved by "regularization" techniques. In the last decades, numerous regularization methods have been proposed to solve this ill-posed problem, including truncated total least squares (TTLS) [9], GMRes [10], and the LSQR [11,12]. Most of them are essentially L2norm based regularization schemes, which inherently lead to considerable smoothness of the inverse solutions. L1-norm regularization method can overcome this drawback of L2norm regularization method, which has been applied for epicardial potential reconstruction [13][14][15]. Although the above-mentioned regularization methods can more or less deal with the geometry and measurement noises for the ECG inverse problems, which depends on the regularization parameters, the robustness of the inverse solution is not always guaranteed. In this paper, without seeking assistance  from the regularization techniques, we explore an alternative, more robust approach to solve the inverse ECG problem. The method is called Support Vector Regression (SVR) [16]. To find the solution for the inverse ECG problem, a regression model will be set up with multi-inputs (BSPs) and multioutputs (transmembrane potentials, TMPs). This statistic method based solution will be assessed with the quality of the inversely predicated TMPs from the measured BSPs. Compared with conventional regularization methods (e.g., zero order Tikhonov and LSQR), the SVR method can produce more accurate results in terms of reconstruction of the transmembrane potential distributions on epi-and endocardial surface. In addition, when the PCA and KPCA are adopted to extract useful features from the original inputs for building the SVR model, the SVR method with feature extraction (PCA-SVR and KPCA-SVR) outperforms that without the extract feature extraction (single SVR) in terms of the reconstruction of the TMPs [17].
Compared with using single SVR model, the hybrid models by integrating difference methods show better performance. The self-organizing map (SOM) is an unsupervised and competitive learning algorithm, which can be viewed as clustering techniques [18]. Combining the SOM with SVR or LS-SVM, the proposed hybrid method has the potential to find better inverse solutions than using a single SVR model [19,20]. Xu et al. [21] proposed the Maximum Margin Clustering (MMC) method, which performs clustering by simultaneously finding the large margin separating hyperplane between clusters. The MMC method has been successfully applied to many clustering problems [22]. However, its efficiency is an issue of concern. Recently, Zhang et al. proposed [23,24] an efficient approach for solving the MMC via an alternative optimization procedure, which was implemented by using the SVR method with the Laplacian loss in the inner optimization subproblem. The modified MMC algorithm is more accurate, much faster and therefore more practical for solving engineering inverse problems. In this paper, the hybrid model of modified MMC method and SVR is proposed to solve the inverse ECG problem, which is referred to as an MCC-SVR method. The conference version of this submission has appeared in CINC 2011 [25]. This submission has undergone substantial revisions and offers extended experiment results.
The main purpose of this study is to use an MCC-SVR model to investigate the reconstruction capability of TMPs. In this study, based on our previously developed realistic heart-torso model, the equivalent double layer (EDL) source model method was applied to generate the data set for training and testing the SVR model. The proposed algorithm was also compared with a single SVR model for noninvasive ECG imaging.

Theory and Methodology
The framework of the proposed MCC-SVR method is shown in Figure 1. The MCC method is used to classify the input data; the SVR is then applied to construct the regression model of each cluster.
Algorithm 1: Iterative SVR procedure for MCC method. [23,24]. The clustering principle is to find a labeling to identify dominant structures in the data and to group similar instances together, so the margin obtained would be maximal over all possible labelings, that is, given a training set

Maximum Margin Clustering (MMC) Method
, where x i ∈ χ is the input and y i ∈ {±1} is the output. The SVM finds a large margin hyperplane to separate patterns of opposite classes by the classify function f (x) [26]: where ϕ(x) denotes the high-dimensional feature space, which is nonlinearly mapped from the input space x by the kernel function k, ω is the normal vector of the hyperplane, and b is the offset of the hyperplane. Computationally, this leads to the following optimization problem [24,26]: where ξ = [ξ 1 , . . . , ξ n ] T is the vector of a slack variable for the errors, and C > 0 is the trade-off parameter between the smoothness ω 2 and the fitness (ξ T e) of the decision function f (x). MMC attempts to extend large margin methods to allocate the input data points to different classes, leading to large separation between the different classes. Here, the case with two clusters is considered in this work. Since one could simply assign all the data points to the same class and obtain an unbounded margin, a proper constraint on the class balance needs to be imposed. Xu et al. [21] introduced a class constraint that requires y to satisfy where ≥ 0 is a user-defined constant controlling the class imbalance. Then the margin is maximized with respect to both unknown y and unknown SVM parameter (ω, b) as follows: The origin nonconvex MMC problem in (4) can be formulated as a sequence of QPs which can be solved by some efficient QP solvers. However, it suffers from a premature convergence and easily gets stuck in poor local optima. Zhang et al. [23,24] proposed to replace the SVM by SVR with Laplacian loss, which can lead to a significant improvement in the clustering performance compared to that of iterative SVM procedure. The primal problem of SVR with Laplacian loss can be formulated as where ξ i and ξ * i are slack variables. With the obtained labels, the MMC problem based on the iterative SVR with the Laplacian loss becomes After ω is obtained from the optimization of SVR, the problem in (6) is reduced to the form According to Zhang's proposition [24], for a fixed b, the optimal strategy to determine the y i 's in (7) is to assign all y i 's as −1 for those with ω T ϕ(x i ) + b < 0 and assign y i 's as 1 for those with ω T ϕ(x i ) + b > 0. The bias b can be determined as follows. (i) we sort the ω T ϕ(x i )'s and use the set of midpoints between any two consecutive sorted values 4 Computational and Mathematical Methods in Medicine as the candidates of b; (ii) from these sorted b's, the first and the last (n − )/2 of them can be dropped, and the middle can be remained; (iii) for each remaining candidate, we determine the y i 's according to the above proposition and compute the corresponding objective value in (7); (iv) finally, we choose the b that has the smallest objective. The complete iterative SVR procedure for MCC method is shown in Algorithm 1.

Support Vector Regression (SVR) Model.
The SVR algorithm [26] is only briefly described here; for details, see [16,26]. As a linear regression model, the SVR algorithm relies on an estimation of a linear regression function: where ω and b are the slope and offset of the regression linear, and ·, · denotes the dot product in . The above regression problem can be written as a convex optimization problem: In (9), an implicit assumption is that a function f essentially approximates all pairs (x i , y i ) with ε precision, but sometimes this may not be the case. Therefore, one can introduce two additional positive slack variables ξ i , ξ * i to refine the estimation of variables ω and b. Now (9) can be reformulated [16] as where the constant C is a trade-off parameter and n denotes the number of samples; ξ i represents the upper training error, and ξ * i is the lower training error subject to ε intensive tube. According to the strategy outlined by Vapnik [26], using Lagrange multipliers, the constrained optimization problem shown in (3) can be further restated as the following equation: where α i and α * i are the Lagrange multipliers. The term K(x i , x j ) in (11) is defined as the kernel function, whose values are the inner product of two vectors x i and x j in the feature space ϕ(x i ) and ϕ(x j ). And bias b can be computed as follows: The kernel function handles any dimension feature space with no explicit calculation of ϕ(x). In this study, the Gaussian kernel function is chosen as the SVR's application mapping in this study: where x i and x j are input vector spaces; σ 2 is the bandwidth of the kernel function.
In this study, an accurate and fast approach based on the GA and the simplex search techniques is presented to determine the optimal hyperparameters of the SVR model [17], as shown in Figure 2. The GA algorithm used here is based on a GA toolbox developed by Chipperfield et al. [27], and the simplex optimization method is implemented using the MATLAB optimization toolbox. The developed SVR model was trained and validated with the software LIBSVM [28].

Simulation Protocol and Data
Set. The SVR model is tested with our previously developed realistic heart-torso model [6,17]. In this study, an equivalent double layer (EDL) source model is adopted to simulate the cardiac equivalent source, which represents the cardiac electrical activity by means of double layer source on the closed surface (including the endo-and epicardial surface of ventricle). For the ECG inverse problem studies, the ventricular surface TMPs and body surface potentials (BSPs) are evaluated based on the EDL source model. The transfer matrix A between TMPs and BSPs is evaluated by the boundary element method (BEM), and it has the dimension of 412 × 478 and its condition number (the ratio of largest and smallest singular values) is 5.6 × 10 12 . As shown in Figure 3, the EDL source method is used to obtain the BSPs ϕ B and the TMPs ϕm. For the construction of the training and testing data set Different Action Potentials (APs) for various myocardial cells and the normal Ventricular Excitation Sequence (VES) are used to calculate the TMPs (ϕm) at different times; from the calculated TMPs, the corresponding BSPs are deduced with the transfer matrix A.
In this study, a normal ventricular excitation data set is prepared for the setup of the SVR model. The considered ventricular excitation period from the first breakthrough to the end is 357 ms and the time step is 1 ms, and, thus, 358 BSPs ϕ B and TMPs ϕm temporal data sets are numerically recorded; in addition, the 30 dB simulated Gaussian white noise is added into the BSPs ϕ B representing the measurement noises. 60 datasets at times of 3 ms, 9 ms, where ϕ Bt are the body surface potentials at time instant t, ϕ Bt max is the maximum value of BSPs at the time t, and ϕ Bt min is the minimum value of BSPs at the time t.
As the TMPs are known in advance in the simulation study, the accuracy of reconstructed TMPs at the testing time t can be evaluated by either relative errors (REs): or the correlation coefficient (CC), given by where n is the number of nodes on the ventricular surface. ϕ e t denotes the simulated TMPs distribution at time t, and ϕ c t are inversely computed. The quantities ϕ c t and ϕ e t are the mean value of ϕ c t and ϕ e t over the whole ventricular surface nodes at time t.

Results
According to the MCC method, the above 298 training samples are classified four clusters as shown in Figure 4(a), and the numbers of the four clusters is 80, 74, 70, and 74, respectively. Then the individual SVR model is trained for each cluster, and the hyperparameters are determined using the GA-Simplex method. For 60 testing samples, the MCC method is used to find their corresponding clusters, as shown in Figure 4(b).
To illustrate the performances of the reconstructed TMPs, four sequential testing time points (3,15,27, and 39 ms after ventricle excitation) are presented. The inverse ECG solutions are shown in Figure 5; in contrast to the conventional regularization methods, such as zero order Tikhonov regularization method and LSQR regularization method, the single SVR method can yield rather better results with lower RE and higher CC. Moreover, it can be seen that the MCC-SVR method offers superior performances than the single SVR method, as its solution is more close to the simulated TMPs distributions. The time courses of the simulated TMPs and reconstructions for one representative source point on the heart surface are depicted in Figure 6. It can be found that, in reconstructing the TMPs for one representative source point over all the testing times, the MCC-SVR method offers better solution compared with single SVR method.    The RE and CC of the reconstructed TMPs with different methods can be found in Figure 7. In contrast to the single SVR method, the MCC-SVR method can yield improved results with a lower RE and a higher CC over the 60 testing samples.
By dividing the training samples into smaller clusters, the training time for each cluster can be reduced. The training times for each cluster by using the MCC-SVR method are 6715.9 seconds, 6821.1 seconds, 4550.6 seconds, and 5162.4 seconds, respectively, and the total time of the four clusters is 23250 seconds. When using the single SVR model to train the model of the all training samples, it takes 35233.4 seconds.

Discussion and Conclusion
In this study, MCC-SVR method is proposed to solve the noninvasive ECG imaging problem. Here, the MMC approach is adopted to cluster the training samples firstly, and then SVR method is applied to construct the model for each cluster. After building different cluster models, for the testing sample, we can find its matched cluster and then use the corresponding SVR model to reconstruct the TMPs. From the reconstructed TMPs as shown in Figures 5 and  6, it can be seen that the MCC-SVR methods offer better solution compared with single SVR method. According to the evaluation indices RE and CC, the performances of the reconstructed TMPs by using the MCC-SVR can constantly converge to a smaller RE and a higher CC on the testing samples than those of the single SVR method, as shown in Figure 7. In terms of the computation efficiency of the training SVR model, for the given training samples, the MCC-SVR method can save about 34% time than the single SVR method. With the increasing of the training samplings, the MCC-SVR method lessens more training time than the single SVR method. Moreover, the training process for each cluster can be implemented simultaneously using parallel computing, therefore further enhance the training efficiency.
In summary, this paper proposed the MCC-SVR method for the inverse solutions of the ECG problem. The new algorithm was tested and compared with single SVR schemes using a realistic heart-torso model. The experimental results show that the MCC-SVR can improve the generalization performance of the single SVR in reconstructing the TMPs, leading to a more accurate reconstruction of the TMPs. In our future work, we plan to improve the MCC-SVR method for solving various nonlinear regression problems in noninvasive ECG imaging.