In this paper, we focus on recognizing epileptic seizure from scant EEG signals and propose a novel transfer enhanced

Epilepsy is a chronic disease caused by sudden abnormal discharges of brain neurons, which result in transient brain dysfunction. Patients usually have no clear recollection of the seizure itself. For this reason, doctors can often diagnose the patient’s condition only from the accounts of family members or other people who were present during past seizures, and the accuracy of this manual diagnosis is low. The pathogenesis of epilepsy is mainly manifested by abnormal neural discharge and abnormal brain waves. Although medical imaging, such as computed tomography (CT), magnetic resonance imaging (MRI), functional magnetic resonance imaging (fMRI), single-photon emission computed tomography (SPECT), and positron emission computed tomography (PET), has made great progress over the years, the major diagnostic method for epilepsy is still based on the electroencephalogram (EEG). More specifically, PET and fMRI cannot be used as routine tools because of their technical requirements and costs. In addition to its high cost, MRI cannot identify nonstructural lesions. Invasive cortical electroencephalography (ECoG) requires craniotomy and electrode implantation, which carries a high risk, whereas noninvasive EEG and magnetoencephalography (MEG) can provide functional and structural detection. Taking all these factors into account, EEG has attracted growing attention in both theoretical research and clinical practice because of its low cost, convenient signal acquisition, and noninvasiveness.

Diagnosing epilepsy from EEG signals has long been a hot research topic in related fields. Compared with manual diagnosis, machine learning methods are less time-consuming and more accurate [

General steps of machine learning model processing EEG signals.

In summary, one significant issue in machine-learning-based EEG signal processing is insufficient training data. Here we briefly introduce some mechanisms for epilepsy diagnosis through EEG signals. Jiang [

Transfer learning is believed to be an effective strategy to solve problems caused by insufficient training data [

According to the transfer learning theory [

In the scenario of recognizing epileptic seizures, we aim to diagnose actual patients. Because TrEEM is an exemplar-based clustering model built on graph theory and a pairwise similarity matrix, it selects exemplars from the actual data, which fits the requirements of this scenario.

TrEEM embeds KL distance between target data and source data into the calculation of similarity matrix. Thus, the optimization mechanism utilized in EEM can be directly used to solve the new target function of TrEEM. In detail, we leverage

The paper is organized as follows. The related works are discussed in Section

Many researchers have applied machine learning techniques, including SVM, fuzzy systems, naïve Bayes, and exemplar-based clustering models, to classify EEG signals. In this section, we illustrate two popular learning frameworks, namely, Enhanced

Consider a dataset

The target function of a typical exemplar-based clustering model is defined as follows [

In [

When, for

Enhanced

EEM algorithm is one of the most popular exemplar-based clustering models, and it performs effectively and steadily in numerous simulation experiments involved [

TSK fuzzy system is a rule-based system and it is widely used as a typical fuzzy system model for both classification and clustering. Generally, the

Accordingly, based on the relevant theory of TSK fuzzy system, the target model above in equation (

In this section, we briefly introduce two popular machine learning clustering frameworks used in the recognition of EEG signals, namely, EEM and TSK fuzzy system. The detailed descriptions are shown in Table

Descriptions of two popular machine learning algorithms used in recognition of EEG signals.

Algorithms | Theoretical basis | Descriptions | Optimization frameworks |
---|---|---|---|
EEM | Graph theory | Selects exemplars from actual data; does not need a preset cluster number | Enhanced graph-cuts optimization algorithm; expands the candidate region |
TSK | Fuzzy system | Rule-based learning model; strong interpretability and robustness | Parameter learning process of the corresponding linear regression model |
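To make the TSK row of the table more concrete, the following is a minimal sketch of a first-order TSK fuzzy system whose consequent parameters are learned as a linear regression problem. It assumes Gaussian membership functions with fixed, hand-picked centers and widths; the function names (`tsk_design_matrix`, `fit_consequents`, `tsk_predict`) and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def tsk_design_matrix(X, centers, widths):
    """Map inputs to the regression features of a first-order TSK system.

    Assumption: Gaussian antecedents; centers and widths have shape (K, d).
    """
    X = np.asarray(X, dtype=float)
    z = (X[:, None, :] - centers[None, :, :]) / widths[None, :, :]
    firing = np.exp(-0.5 * np.sum(z ** 2, axis=2))       # rule firing strengths, shape (n, K)
    firing /= firing.sum(axis=1, keepdims=True)          # normalize over the K rules
    X1 = np.hstack([np.ones((len(X), 1)), X])            # add a bias term, shape (n, d + 1)
    # Each rule contributes its normalized firing strength times [1, x].
    return (firing[:, :, None] * X1[:, None, :]).reshape(len(X), -1)

def fit_consequents(X, y, centers, widths):
    """Learn the linear consequent parameters by ordinary least squares."""
    G = tsk_design_matrix(X, centers, widths)
    theta, *_ = np.linalg.lstsq(G, y, rcond=None)
    return theta

def tsk_predict(X, centers, widths, theta):
    """Output of the TSK system: firing-weighted sum of the rule consequents."""
    return tsk_design_matrix(X, centers, widths) @ theta

# Toy data (hypothetical): a linear target that two rules can represent exactly.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(50, 1))
y = 2.0 * X[:, 0] + 1.0
centers = np.array([[-1.0], [1.0]])
widths = np.array([[1.0], [1.0]])
theta = fit_consequents(X, y, centers, widths)
pred = tsk_predict(X, centers, widths, theta)
print(np.max(np.abs(pred - y)))
```

Once the antecedents are fixed, the model is linear in the consequent parameters, which is why the table describes its optimization as a linear regression learning process.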

In this section, we first analyze the theoretical basis of TrEEM within a Bayesian probabilistic framework. Second, we derive the novel TrEEM algorithm in detail. Then, following the optimization algorithm used in EEM, we optimize the target function. The overall structure of this model is shown in Figure

Structure of TrEEM algorithm.

Besides, we list the frequently used notations in Table

Involved notations and descriptions.

Descriptions | Notations |
---|---|
Target data | |
Source data | |
Pairwise similarity matrix | |
Source-data-based exemplar set | |
Source exemplar for target sample | |
Target-data-based exemplar set | |
Exemplar for target sample | |

As mentioned before, transfer learning considers two datasets from similar source, namely, source data and target data; and the relationship between source data and target data is considered as a significant factor in this model (see Table

As for the exemplar set, we must exclude the case in which an exemplar chooses another member of the current exemplar set, rather than itself, as its own exemplar. Consequently, the Bayesian posterior probability of an exemplar set is defined as follows:

Accordingly, under Bayesian probabilistic framework and the discussion of EEM algorithm in Section

In conclusion, equation (

According to information theory, the Kullback-Leibler distance (KL distance) is a natural distance between two real probability distributions and it has been widely applied to solve numerous issues [

Consider two probability distributions as

What is worth mentioning is the fact that KL distance is an asymmetric measurement, namely,
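As a small numerical illustration of this asymmetry, the sketch below evaluates the standard discrete KL divergence, D(p || q) = Σ p_i log(p_i / q_i), in both directions. The two distributions `p` and `q` are arbitrary example values chosen here, not data from the paper.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) = sum_i p_i * log(p_i / q_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                      # normalize so both inputs are distributions
    q = q / q.sum()
    mask = p > 0                         # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

p = [0.7, 0.2, 0.1]                      # hypothetical example distributions
q = [0.4, 0.4, 0.2]
d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)
print(d_pq, d_qp)                        # the two directions give different values
```

Because the two directions differ, any model that embeds the KL distance has to fix which distribution plays the role of the reference.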

Furthermore, given

Although the target data are not exactly the same as the source data, according to the theoretical analyses of transfer learning, the source-data-based learning model and its results should contribute to learning from the new target data as well [

Observing equation (

Introducing the definitions of

Comparing equations (

As mentioned before, the novel target function in equation (

In detail, we redefine the similarity relationship of the target data by embedding the source-data-based exemplar set

single out the nearest exemplar

compute probabilistic Euclidean similarity

calculate transfer similarity matrix

call the optimization process of EEM as shown in Algorithm
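The first step listed above, singling out the nearest source exemplar for each target sample, can be sketched as follows. This is only an illustration under the assumption that "nearest" means smallest Euclidean distance; the function name `nearest_source_exemplar` and the toy arrays are hypothetical. The later steps (the probabilistic similarity and the transfer similarity matrix) depend on formulas that are not reproduced here, so they are deliberately left out.

```python
import numpy as np

def nearest_source_exemplar(target, source_exemplars):
    """For each target sample, return the index of its closest source exemplar.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    target = np.asarray(target, dtype=float)
    source_exemplars = np.asarray(source_exemplars, dtype=float)
    diff = target[:, None, :] - source_exemplars[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)                 # shape (n_target, n_exemplars)
    idx = np.argmin(sq_dist, axis=1)
    return idx, sq_dist[np.arange(len(target)), idx]

# Toy example (hypothetical values): two source exemplars, three target samples.
source_exemplars = np.array([[0.0, 0.0], [5.0, 5.0]])
target = np.array([[0.1, 0.0], [4.9, 5.2], [0.3, -0.2]])
idx, sq_dist = nearest_source_exemplar(target, source_exemplars)
print(idx, sq_dist)
```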

Randomly generate expansion order

Let

compute

for

compute

for

for

Accept the new exemplar

end

Until convergence
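The loop structure of this procedure (random expansion order, tentative new exemplar, accept if the energy decreases, repeat until convergence) can be illustrated with a much simpler greedy stand-in. The sketch below is not the enhanced graph-cuts expansion used by EEM: it only ever adds exemplars and evaluates the energy by brute force. The energy definition (negative sum of each point's similarity to its chosen exemplar, with exemplars paying their self-similarity), the function names, and the toy data are assumptions made for illustration.

```python
import numpy as np

def energy(S, exemplars):
    """Negative total similarity when every point picks its best exemplar."""
    ex = np.array(sorted(exemplars))
    best = S[:, ex].max(axis=1)          # similarity of each point to its best exemplar
    best[ex] = S[ex, ex]                 # an exemplar must be its own exemplar
    return -float(best.sum())

def greedy_expansion(S, seed=0, max_sweeps=100):
    """Greedy stand-in for the expansion loop: add exemplars while the energy drops."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    # Start from the single exemplar with the lowest energy.
    exemplars = {int(np.argmin([energy(S, {k}) for k in range(n)]))}
    current = energy(S, exemplars)
    for _ in range(max_sweeps):
        accepted = False
        for k in rng.permutation(n):     # randomly generated expansion order
            k = int(k)
            if k in exemplars:
                continue
            trial = energy(S, exemplars | {k})
            if trial < current:          # accept the new exemplar
                exemplars.add(k)
                current = trial
                accepted = True
        if not accepted:                 # convergence: a full sweep changed nothing
            break
    ex = np.array(sorted(exemplars))
    labels = ex[np.argmax(S[:, ex], axis=1)]
    labels[ex] = ex
    return ex, labels, current

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
np.fill_diagonal(S, -5.0)                # self-similarity, chosen by hand here
exemplars, labels, final_energy = greedy_expansion(S)
print(exemplars, labels, final_energy)
```

Note that the number of clusters is never passed in: it emerges from the self-similarity value, which matches the earlier remark that exemplar-based models do not need a preset cluster number.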

EEM utilizes

Note that TrEEM model redefines the similarity matrix as equation (

Obviously, in the process of optimization, this current exemplar

Specifically, if

Then, we take the greater value of 0 and

Namely, if

In this case, we define the current exemplar of

On the other hand, may be current exemplar set

Remember that equations (

To sum up, the optimization mechanism is shown below in detail.

The similarity matrix is calculated according to the Euclidean distance;
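A minimal sketch of this step is given below. It assumes that the similarity is the negative squared Euclidean distance and that the diagonal holds the self-similarity parameter; defaulting the self-similarity to the median off-diagonal similarity is a common convention in exemplar-based clustering and is an assumption here, not a setting taken from the paper. The function name `similarity_matrix` is hypothetical.

```python
import numpy as np

def similarity_matrix(X, self_similarity=None):
    """Pairwise similarity as negative squared Euclidean distance.

    Assumption: the diagonal stores the self-similarity; if none is given,
    the median of the off-diagonal similarities is used.
    """
    X = np.asarray(X, dtype=float)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dist = np.maximum(sq_dist, 0.0)               # guard against tiny negative round-off
    S = -sq_dist
    if self_similarity is None:
        off_diagonal = S[~np.eye(len(X), dtype=bool)]
        self_similarity = float(np.median(off_diagonal))
    np.fill_diagonal(S, self_similarity)
    return S

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])   # toy points (hypothetical)
S_default = similarity_matrix(X)
S_custom = similarity_matrix(X, self_similarity=-10.0)
print(S_default)
```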

To comprehensively evaluate the TrEEM model, we have conducted several experiments on both synthetic and real-world datasets. For comparison, we also evaluate other machine learning methods, namely, EEM [

Before feeding data into the TrEEM model, we need to preprocess the original nonstationary EEG signals [

Various methods have been commonly used to extract EEG signals’ features, including wavelet [

Besides, we use both synthetic and real-world datasets in this section. Firstly, we randomly generate 300 two-dimensional data points as 3 classes, shown in Figure

Synthetic dataset.
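A synthetic dataset of this shape (300 two-dimensional points in 3 classes) could be generated as sketched below. The text does not state the generating distribution, so the Gaussian blobs, the class centers, the spread, and the equal class sizes are all assumptions; `make_synthetic` is a hypothetical helper name.

```python
import numpy as np

def make_synthetic(n_per_class=100, spread=0.8, seed=0):
    """Generate 3 Gaussian blobs in 2-D (assumed layout, not the paper's data)."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.5]])   # hypothetical class centers
    X = np.vstack([rng.normal(c, spread, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

X, y = make_synthetic()
print(X.shape, np.bincount(y))
```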

Description of Bonn EEG dataset.

Subjects | Groups | Descriptions |
---|---|---|
Healthy | A | Signals captured from volunteers with eyes open |
Healthy | B | Signals captured from volunteers with eyes closed |
Epileptic | C | Signals captured from volunteers during seizure-free intervals |
Epileptic | D | Signals captured from volunteers during seizure-free intervals |
Epileptic | E | Signals captured from volunteers during seizure activity |

In addition, we examine the involved experimental results from two performance indices, namely,

All experiments are implemented in MATLAB R2010a on a PC with 64-bit Microsoft Windows 10, an Intel (R) Core (TM) i7-4712MQ, and 8 GB memory.

As mentioned before, five machine learning methods are involved in this section, namely, EEM, multiclass SVM, TSK-FS, TSC, and the proposed TrEEM algorithm. EEM and TrEEM do not require the cluster number to be preset, which is a major advantage of all exemplar-based clustering frameworks, whereas the cluster number is an important parameter for TSK-FS. Multiclass SVM and TSC are two typical classification methods. Both EEM and TrEEM need parameter self-similarity

Parameters settings of involved algorithms.

Algorithms | Parameter setting |
---|---|
EEM: a typical exemplar-based clustering model | Self-similarity |
Multiclass SVM: a typical classification learning model | Kernel function |
TSK-FS: a widely used fuzzy-rule-based learning model | FCM [ |
TSC: transfer spectral clustering model | Preset the cluster number |
TrEEM: the proposed transfer exemplar-based learning model | Self-similarity |

To construct the transfer learning scenario, for both synthetic and real-world EEG signal datasets, we randomly choose 80% data as source data and the remaining 20% as target data. For statistical analysis, in the experiment procedure, each algorithm is repeatedly executed 10 times; and we record the average performance and the corresponding standard deviation of

Comparison results of both synthetic and Bonn EEG datasets (the number in parentheses is the standard deviation).

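Datasets (source data, target data, attributes, and classes) | Algorithms | Performance indices | |
---|---|---|---|
Synthetic dataset (240, 60, 2, 3) | EEM | 0.8316 (0.0258) | 0.7150 (0.1131) |
Synthetic dataset (240, 60, 2, 3) | Multiclass SVM | 0.8513 (0.0214) | 0.8523 (0.0812) |
Synthetic dataset (240, 60, 2, 3) | TSK-FS | 0.8712 (0.0145) | 0.8816 (0.0914) |
Synthetic dataset (240, 60, 2, 3) | TSC | 0.8823 (0.0313) | 0.9236 (0.0158) |
Synthetic dataset (240, 60, 2, 3) | TrEEM | 0.8957 (0.0264) | 0.9856 (0.0000) |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | EEM | 0.7754 (0.2146) | 0.9800 (0.0000) |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | Multiclass SVM | 0.7827 (0.1834) | 0.9643 (0.0023) |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | TSK-FS | 0.6819 (0.1579) | 0.9623 (0.0000) |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | TSC | 0.7217 (0.1241) | 0.9636 (0.0002) |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | TrEEM | 0.8323 (0.1652) | 0.9600 (0.0012) |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | EEM | 0.7925 (0.0091) | 0.7530 (0.0514) |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | Multiclass SVM | 0.7815 (0.0165) | 0.9034 (0.0135) |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | TSK-FS | 0.7303 (0.0251) | 0.9800 (0.0000) |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | TSC | 0.7501 (0.1252) | 0.9600 (0.0021) |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | TrEEM | 0.8071 (0.0078) | 0.9800 (0.0000) |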
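The evaluation protocol behind these numbers (a random 80%/20% source/target split, 10 repetitions, mean and standard deviation) can be sketched as follows. The scoring function `centroid_accuracy` is a hypothetical stand-in (nearest class-centroid accuracy) used only so that the loop runs; it is neither TrEEM nor one of the paper's performance indices, and the toy data are random values generated here.

```python
import numpy as np

def repeated_evaluation(X, y, run_once, n_repeats=10, source_fraction=0.8, seed=0):
    """Repeat a random source/target split and report the mean and std of a score."""
    rng = np.random.default_rng(seed)
    n_source = int(round(source_fraction * len(X)))
    scores = []
    for _ in range(n_repeats):
        perm = rng.permutation(len(X))
        src, tgt = perm[:n_source], perm[n_source:]
        scores.append(run_once(X[src], y[src], X[tgt], y[tgt]))
    scores = np.asarray(scores, dtype=float)
    return scores.mean(), scores.std()

def centroid_accuracy(X_src, y_src, X_tgt, y_tgt):
    """Hypothetical placeholder score: nearest class-centroid accuracy on the target."""
    classes = np.unique(y_src)
    centroids = np.array([X_src[y_src == c].mean(axis=0) for c in classes])
    dist = np.sum((X_tgt[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
    return float(np.mean(classes[np.argmin(dist, axis=1)] == y_tgt))

# Toy data (hypothetical): 300 points, 3 classes, matching the (240, 60) split size.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(3), 100)
X = rng.normal(size=(300, 2)) + 3.0 * y[:, None]
mean, std = repeated_evaluation(X, y, centroid_accuracy)
print(f"{mean:.4f} ({std:.4f})")         # same "mean (std)" layout as the table
```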

Observing Table

In the experiment procedure, we also find that parameter self-similarity

The regularization factor

Effects of parameter

Effects of parameter

Effects of parameter

Table

Average running time (seconds) of the models on both synthetic and Bonn EEG datasets.

Datasets (source data, target data, attributes, and classes) | EEM | Multiclass SVM | TSK-FS | TSC | TrEEM |
---|---|---|---|---|---|
Synthetic dataset (240, 60, 2, 3) | 0.1870 | 0.4562 | 0.0430 | 0.1520 | 0.2050 |
Bonn EEG dataset (400, 100, 6, 5) (use KPCA to extract feature) | 0.6650 | 0.8934 | 0.3730 | 0.7923 | 0.6700 |
Bonn EEG dataset (400, 100, 6, 5) (use wavelet to extract feature) | 0.4850 | 0.6327 | 0.1800 | 0.5331 | 0.5290 |

Therefore, from experimental results in Tables

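For both the synthetic and the real-world EEG datasets, TrEEM attains the highest value of the first performance index. It also attains the highest or tied-highest value of the second index on the synthetic and wavelet-feature datasets, whereas on the KPCA features its second index (0.9600) is slightly below that of the other methods. Thus, we believe that TrEEM can effectively absorb knowledge from scant target data when similar source data exist.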

For time consumption, TrEEM takes source data into account, which will inevitably increase the time complexity. Remember that the scale of target data will not be big, and the time consumption is very acceptable especially when combined with the performance in Table

Although TrEEM requires the most parameters shown in Table

In conclusion, the contribution of this paper is a novel TrEEM framework that learns from few EEG signals when recognizing epileptic seizures. Starting from information theory, the proposed TrEEM method embeds the similarity relationship between source and target data into the exemplar-based clustering model to improve the utilization of EEG signals, while retaining all merits of the original optimization scheme. Therefore, without increasing the complexity of the model, TrEEM uses transfer learning to learn from scant EEG signals. Although our experimental results show promising performance for TrEEM, several other aspects should be considered as well. For instance, will TrEEM still work when the classes are unbalanced? And if multiple source datasets are available, how can they be made to collaborate rather than cause negative transfer? These are problems that we will address in future work.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

The authors declare that they have no conflicts of interest.

This work was supported in part by the 2018 Natural Science Foundation of Jiangsu Higher Education Institutions under Grant 18KJB5200001, by the Natural Science Foundation of Jiangsu Province under Grant no. BK20161268, and by the Humanities and Social Sciences Foundation of the Ministry of Education under Grant 18YJCZH229.