Recently, data with complex characteristics such as epilepsy electroencephalography (EEG) time series has emerged. Epilepsy EEG data has special characteristics including nonlinearity, nonnormality, and nonperiodicity. Therefore, it is important to find a suitable forecasting method that covers these special characteristics. In this paper, we propose a coercively adjusted autoregression (CA-AR) method that forecasts future values from a multivariable epilepsy EEG time series. We use the technique of random coefficients, which forcefully adjusts the coefficients with
Forecasting time series data predicts future values by discovering a set of rules or identifying patterns from past data. Linear regression models for forecasting time series such as autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) are widely used [
EEG time series signals obtained from the brain have irregular and complex wave structures and contain a large amount of noise. Epilepsy EEG data is a representative example of a complex time series. Epilepsy is a disorder characterized by abnormal electrical activity in the brain, and this activity is central to its diagnosis. Epilepsy EEG signals change over time through constant interaction with external factors [
In recent years, studies have been conducted to automatically detect and predict epilepsy seizures using EEG data. Univariate, bivariate, and multivariate algorithms were proposed to solve the problem of seizure detection and prediction based on the EEG analysis of single or multiple electrodes [
We can give the following problem definition.
In this paper, we propose an adaptive forecasting algorithm that forcibly adjusts the coefficients of the autoregressive (AR) model. To forecast future values of epilepsy EEG data with these special characteristics, we use random coefficients of −1 and 1 together with the fractal dimension, which determines the order of the CA-AR model. We conduct experiments on sets of EEG time series to evaluate the suitability of our forecasting approach. The experimental results demonstrate that the proposed method provides better forecasting performance than previous methods. The proposed algorithm provides the following benefits:
The remainder of the paper is organized as follows. In Section
An autoregressive model is a simple model to estimate the future value of a series using previous input values. The AR
An important feature of the AR model is utilizing recent past observations in the process of estimating the current observation
In particular, if we take
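The standard AR idea described above can be illustrated with a minimal sketch. It fits the coefficients of an ordinary AR model of a given order by least squares and forecasts one step ahead. This is the conventional AR baseline only, not the CA-AR method; the function names `fit_ar` and `forecast_next` and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares estimate of AR(p) coefficients (no intercept term)."""
    x = np.asarray(x, dtype=float)
    # Column k holds the value observed k+1 steps before each target value.
    X = np.column_stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)])
    y = x[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast_next(x, coeffs):
    """One-step-ahead forecast from the most recent len(coeffs) observations."""
    p = len(coeffs)
    recent = np.asarray(x, dtype=float)[-1 : -p - 1 : -1]  # newest value first
    return float(np.dot(coeffs, recent))
```

For multi-step forecasts, each forecast value can be appended to the series and `forecast_next` called again, which is the usual recursive scheme for AR models.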
Epilepsy EEG data has special characteristics, such as nonlinearity and abnormal and nonstandard distributions [
In this paper, the fractal dimension is used to determine the order of the CA-AR model. To calculate the fractal dimension, we apply the box-counting method [
In this paper, the measured value
In this section we present the empirical verification of our data analysis to forecast epilepsy EEG data. EEG datasets are provided in [
In this paper, we propose a novel approach to improve epileptic seizure forecasting in nonlinear and nonperiodic EEG signals. In this section, we first analyze the characteristics of epilepsy EEG data, which show nonlinearity and nonperiodicity, by applying the cepstrum and lag plots. The cepstrum is employed to extract periodicities or repeated patterns [
Periodicity detection using cepstrum in Subject E.
The 50th original signal of Subject E
The 50th signal of Subject E using cepstrum
Although the original signal appears to repeat like a sinusoidal wave during seizure activity, the cepstrum of the seizure activity signals differs from the original signal and, in our experiments, shows no periodicity. Since most conventional forecasting or prediction approaches require periodicity in the observed data, these approaches are not appropriate for the nonperiodic seizure activity signals.
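As a rough illustration of how a cepstrum exposes repeated patterns, the sketch below computes the real cepstrum (the inverse FFT of the log magnitude spectrum) and reports the quefrency with the largest value. The function names, the small `eps` guard against taking the log of zero, and the choice to search only the first half of the cepstrum are assumptions for illustration, not the procedure used in the experiments.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    x = np.asarray(x, dtype=float)
    log_magnitude = np.log(np.abs(np.fft.fft(x)) + eps)
    return np.fft.ifft(log_magnitude).real

def dominant_quefrency(x, min_q=1):
    """Quefrency (in samples) of the largest cepstral value, skipping quefrency 0."""
    c = real_cepstrum(x)
    half = len(c) // 2  # the real cepstrum is symmetric, so half is enough
    return min_q + int(np.argmax(c[min_q:half]))
```

A signal that contains a delayed copy of itself produces a clear cepstral peak at the delay, whereas a signal without repeated structure produces no dominant peak, which is the behavior reported above for the seizure activity signals.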
In this paper we also applied lag plots to find hidden characteristics in the data. Lag plots are useful in the analysis of cyclical data [
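A lag plot pairs each observation with the observation a fixed number of steps later. The following sketch builds those pairs and summarizes them with a correlation coefficient; the helper names `lag_pairs` and `lag_correlation` are hypothetical, and the correlation summary is an added convenience rather than part of the analysis in this paper.

```python
import numpy as np

def lag_pairs(x, lag=1):
    """Return (x[t], x[t + lag]) pairs as two aligned arrays."""
    x = np.asarray(x, dtype=float)
    if not 0 < lag < len(x):
        raise ValueError("lag must be between 1 and len(x) - 1")
    return x[:-lag], x[lag:]

def lag_correlation(x, lag=1):
    """Pearson correlation between the series and its lagged copy."""
    current, lagged = lag_pairs(x, lag)
    return float(np.corrcoef(current, lagged)[0, 1])
```

The two arrays returned by `lag_pairs` can be passed directly to any scatter-plot routine. A structured cloud suggests nonrandom data, while a shapeless cloud with a correlation near zero suggests random data.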
Lag plots: epilepsy EEG dataset. Subject A is shown in (a). This appears hard to predict. (d) is Subject C, and Subject E is shown in (g). (b), (e), and (h) show the two-dimensional lag plots of
The 1st original signal of Subject A
The 1st original signal of Subject C
The 1st original signal of Subject E
The
Fractal dimension: the log-log plot about the 1st signal (a vector
Subject A
Subject C
Subject E
In this section, we present how the order of CA-AR is determined using the fractal dimension. We use box-counting analysis, a common method for fractal dimension estimation that is known to be easy, automatically computable, and applicable to patterns with or without self-similarity [
We measured the fractal dimension of the 100 single signals from each subject to determine the order of CA-AR using box-counting analysis. Figure
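The box-counting estimate can be sketched as follows: cover the point set with grids of decreasing box size, count the occupied boxes at each size, and take the slope of the log-log plot. The function name, the particular grid sizes in `scales`, and the rescaling of each coordinate to the unit interval are assumptions for illustration; how the resulting dimension is mapped to the model order is not shown here.

```python
import numpy as np

def box_counting_dimension(points, scales=(2, 4, 8, 16, 32, 64)):
    """Slope of log(occupied boxes) against log(boxes per axis)."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]
    # Rescale every coordinate to [0, 1] so one grid fits all axes.
    mins = pts.min(axis=0)
    span = pts.max(axis=0) - mins
    span[span == 0] = 1.0
    unit = (pts - mins) / span
    counts = []
    for n in scales:
        # n boxes per axis; clip so the value 1.0 falls in the last box.
        idx = np.minimum((unit * n).astype(int), n - 1)
        counts.append(len(np.unique(idx, axis=0)))
    slope, _intercept = np.polyfit(np.log(scales), np.log(counts), 1)
    return float(slope)
```

For a single signal, `points` could be the curve formed by the pairs of time index and amplitude. A densely sampled straight line gives a slope of about 1 and a filled square gives about 2, which is a quick sanity check before applying the estimate to EEG data.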
We applied the box-counting method to estimate the fractal dimension of the Phase Space from a signal of each subject. The vector space of the delay coordinate vectors is termed the Phase Space [
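A minimal sketch of building delay coordinate vectors from a single signal is given below. The function name `delay_embed` and the parameters `dim` (embedding dimension) and `tau` (delay in samples) are assumptions for illustration; the values used in the experiments are not restated here.

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Stack delay coordinate vectors [x[t], x[t+tau], ..., x[t+(dim-1)*tau]] as rows."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and tau")
    return np.column_stack([x[k * tau : k * tau + n_vectors] for k in range(dim)])
```

Each row of the returned array is one point of the reconstructed Phase Space, so the array can be handed to a box-counting routine to estimate the fractal dimension of that space.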
Figure
Fractal dimension of time delay space: (a) the 13th electrode of each subject, (b) the 14th electrode, (c) 17th electrode, and (d) 23rd electrode.
The 13th electrode
The 14th electrode
The 17th electrode
The 23rd electrode
Figure
To evaluate the reliability of the optimal order for our model, we measured the root mean square error (RMSE) of the forecasts for all signals of each subject. An autoregressive model of order
We forecasted the signals from time points 481 to 500 with the proposed model and compared the forecasting errors between the optimal order, which was decided by the average of the fractal dimensions, and the other orders. For forecasting, we used the
Root Mean Square Error of forecast using CA-AR and AR.
Electrode | Proposed method | | | | Standard AR | | | |
---|---|---|---|---|---|---|---|---|
RMSE of Subject A | ||||||||
| ||||||||
7 | 0.1878 | 0.1320 | 0.0419 | 0.0523 | 1.0044 | 0.8149 | 0.5294 | 0.5399 |
15 | 0.3766 | 0.5306 | 0.9532 | 0.9942 | 0.9947 | 0.9514 | 0.9985 | 1.3158 |
20 | 0.2629 | 0.1525 | 0.1943 | 0.3741 | 1.0003 | 1.0071 | 1.0478 | 1.0128 |
27 | 0.1226 | 0.0757 | 0.1347 | 0.0897 | 1.0093 | 1.0961 | 1.2423 | 1.3061 |
35 | 0.0213 | 0.0559 | 0.0348 | 0.0563 | 0.9995 | 0.9299 | 0.8068 | 0.8334 |
50 | 0.1271 | 0.1436 | 0.1093 | 0.1339 | 1.0038 | 0.9907 | 1.0183 | 1.1479 |
60 | 0.0356 | 0.0273 | 0.0332 | 0.0070 | 0.8364 | 0.5043 | 0.7012 | 0.5120 |
70 | 0.0554 | 0.0449 | 0.0369 | 0.0402 | 0.9960 | 0.9987 | 0.9937 | 1.0352 |
80 | 0.0101 | 0.0068 | 0.0078 | 0.0110 | 0.4783 | 0.3180 | 0.2287 | 0.2108 |
87 | 0.1368 | 0.1458 | 0.1524 | 0.1370 | 1.0011 | 0.9736 | 1.0224 | 1.3125 |
95 | 0.0999 | 0.0677 | 0.0902 | 0.0385 | 1.0025 | 1.0141 | 1.0118 | 0.9709 |
| ||||||||
Average | 0.1306 | | 0.1626 | 0.1758 | 0.9388 | 0.8726 | 0.8728 | 0.9270 |
| ||||||||
RMSE of Subject C | ||||||||
| ||||||||
7 | 0.0302 | 0.0427 | 0.0372 | 0.0939 | 0.9501 | 0.5978 | 0.6480 | 0.6142 |
15 | 0.8090 | 0.7386 | 0.8873 | 1.5213 | 1.4048 | 1.1270 | 1.2222 | 1.8517 |
20 | 0.0384 | 0.0305 | 0.0335 | 0.0798 | 0.9979 | 0.9496 | 0.8857 | 0.8286 |
27 | 0.1288 | 0.1376 | 0.1122 | 0.1107 | 0.9469 | 1.02888 | 1.0845 | 0.7823 |
35 | 0.0495 | 0.0394 | 0.0428 | 0.1079 | 1.0635 | 1.0051 | 0.9715 | 0.9691 |
50 | 0.0369 | 0.0500 | 0.0437 | 0.0694 | 1.0187 | 0.9474 | 0.9118 | 0.8815 |
60 | 0.0319 | 0.0412 | 0.0282 | 0.1105 | 0.9208 | 0.6429 | 0.7676 | 0.7703 |
70 | 0.1612 | 0.1302 | 0.1527 | 0.0376 | 1.0973 | 1.3006 | 1.2392 | 1.3000 |
80 | 0.1068 | 0.1126 | 0.1087 | 0.1012 | 1.2428 | 1.4660 | 1.4313 | 1.4892 |
87 | 0.1478 | 0.1470 | 0.1621 | 0.0706 | 1.1839 | 1.5971 | 1.6903 | 1.9786 |
95 | 0.1158 | 0.1105 | 0.1141 | 0.0710 | 1.6958 | 1.7924 | 1.6513 | 1.6922 |
| ||||||||
Average | 0.1506 | | 0.1566 | 0.2158 | 1.1384 | 1.1322 | 1.1367 | 1.1962 |
| ||||||||
RMSE of Subject E | ||||||||
| ||||||||
7 | 0.2225 | 0.0098 | 0.0074 | 0.0272 | 0.9566 | 0.9761 | 0.7667 | 0.7669 |
15 | 0.0709 | 0.0749 | 0.0763 | 0.0739 | 1.0041 | 0.9226 | 0.9306 | 0.9534 |
20 | 0.1757 | 0.0540 | 0.1248 | 0.1429 | 1.2710 | 1.1595 | 0.9591 | 0.3047 |
27 | 0.0661 | 0.0494 | 0.0822 | 0.1094 | 1.1139 | 0.9623 | 1.1128 | 0.8602 |
35 | 0.1138 | 0.0375 | 0.0605 | 0.0774 | 1.0139 | 1.0181 | 1.2581 | 2.8540 |
50 | 0.0345 | 0.0310 | 0.0121 | 0.1890 | 0.7694 | 0.8749 | 0.8942 | 1.6892 |
60 | 0.0062 | 0.0343 | 0.0109 | 0.0794 | 0.8563 | 0.6543 | 0.8975 | 1.0238 |
70 | 0.0975 | 0.0246 | 0.0590 | 0.0450 | 0.1912 | 1.1594 | 1.7443 | 1.5908 |
80 | 0.0642 | 0.0237 | 0.0517 | 0.1455 | 0.5931 | 0.5950 | 3.4392 | 9.2907 |
87 | 0.0039 | 0.0802 | 0.1155 | 0.1507 | 1.0098 | 0.5522 | 0.5753 | 0.6716 |
95 | 0.1567 | 0.0389 | 0.1023 | 0.1285 | 1.2754 | 1.1505 | 1.0432 | 0.619 |
| ||||||||
Average | 0.0920 | | 0.0639 | 0.1063 | 0.9141 | 0.9114 | 1.2383 | 1.8749 |
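For reference, the RMSE used in the table above can be computed as in the following sketch. The function name `rmse` is an assumption. The second part takes the first proposed-method column for Subject A and averages it, which appears consistent with the reported Average row being the mean over the listed electrodes.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error between observed and forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if actual.shape != forecast.shape:
        raise ValueError("actual and forecast must have the same shape")
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# First proposed-method column for Subject A, copied from the table above.
subject_a_first_column = [0.1878, 0.3766, 0.2629, 0.1226, 0.0213, 0.1271,
                          0.0356, 0.0554, 0.0101, 0.1368, 0.0999]
column_average = round(float(np.mean(subject_a_first_column)), 4)
```

Here `column_average` matches the 0.1306 shown in the Average row for that column.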
Figure
Forecast comparison: (a) and (b) plot forecasts of the 20th and 80th electrode of Subject A to compare the proposed and standard AR methods. (c) and (d) show forecast plots of Subject C. (e) and (f) display the forecast plot of the 20th and 80th electrode of Subject E, respectively.
The 20th electrode of Subject A
The 80th electrode of Subject A
The 20th electrode of Subject C
The 80th electrode of Subject C
The 20th electrode of Subject E
The 80th electrode of Subject E
In this paper, we compared the forecasting results of several existing methods with those of the CA-AR method. Table
Root Mean Square Error of forecast comparison.
Forecasting time points | 500 (training time points) | | | | 1000 (training time points) | | | |
---|---|---|---|---|---|---|---|---|
 | ANN | Fuzzy | NN | CA-AR ( | ANN | Fuzzy | NN | CA-AR ( |
150 | 0.2059 | 0.1210 | 0.1462 | 0.0162 | 0.1297 | 0.0721 | 0.0950 | 0.0114 |
1500 | 0.6174 | — | 0.1050 | 0.0541 | 0.3221 | — | 0.0887 | 0.0533 |
2000 | 0.6328 | — | 0.0998 | 0.0487 | 0.356 | — | 0.0918 | 0.0486 |
2500 | 0.7607 | — | 0.0995 | 0.0497 | 0.3741 | — | 0.0910 | 0.0507 |
3000 | 0.7801 | — | 0.0976 | 0.0480 | 0.3613 | — | 0.0874 | 0.0479 |
As a result of Table
Table
Forecast time comparisons.
Forecasting | 500 (training time points) | | | | 1000 (training time points) | | | |
---|---|---|---|---|---|---|---|---|
 | ANN | Fuzzy | NN | CA-AR ( | ANN | Fuzzy | NN | CA-AR ( |
150 | 0.5304 | 186.94 | 1.060 | 0.0780 | 4.430 | 675.38 | 1.669 | 0.0156 |
1500 | 0.9516 | — | 13.722 | 0.0499 | 5.004 | — | 13.887 | 0.0811 |
2000 | 0.9953 | — | 20.439 | 0.0749 | 4.995 | — | 20.689 | 0.0718 |
2500 | 0.9766 | — | 28.189 | 0.0967 | 5.098 | — | 29.178 | 0.0874 |
3000 | 1.0764 | — | 32.723 | 0.0748 | 5.248 | — | 37.272 | 0.0736 |
We evaluated forecasting error with each signal in each subject, and Table
The measured forecasting error with several signals from each subject.
Electrode | Subject A | | | | Subject C | | | | Subject E | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
 | NN | Fuzzy | ANN | CA-AR ( | NN | Fuzzy | ANN | CA-AR ( | NN | Fuzzy | ANN | CA-AR ( |
4 | 0.320 | 0.050 | 0.429 | 0.019 | 0.013 | 0.031 | 0.156 | 0.023 | 0.018 | 0.044 | 0.136 | 0.029 |
8 | 0.086 | 0.050 | 0.177 | 0.127 | 0.010 | 0.018 | 0.093 | 0.018 | 0.002 | 0.121 | 0.086 | 0.016 |
35 | 0.043 | 0.086 | 0.113 | 0.056 | 0.045 | 0.029 | 0.394 | 0.078 | 0.069 | 0.169 | 0.203 | 0.038 |
70 | 0.145 | 0.058 | 0.166 | 0.045 | 0.038 | 0.048 | 0.130 | 0.066 | 0.012 | 0.072 | 0.038 | 0.025 |
95 | 0.093 | 0.087 | 0.173 | 0.068 | 0.030 | 0.061 | 0.111 | 0.024 | 0.013 | 0.081 | 0.115 | 0.039 |
| ||||||||||||
Average | 0.137 | 0.066 | 0.212 | | | 0.037 | 0.177 | 0.042 | | 0.097 | 0.115 | 0.029 |
Our method needs only the past values of
Comparison of the forecasting error rates and the execution time between the existing methods and the proposed method.
Forecasting | Electrodes | RMSE | | | | | | Execution time (sec) | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
 | | 500 (training) | | | 1000 (training) | | | 500 (training) | | | 1000 (training) | | |
 | | ANN | NN | CA-AR ( | ANN | NN | CA-AR ( | ANN | NN | CA-AR ( | ANN | NN | CA-AR ( |
1000 | 4 | 0.091 | 0.020 | 0.047 | 0.13 | 0.023 | 0.05 | 1.01 | 7.44 | 0.08 | 5.34 | 9.11 | 0.06 |
 | 8 | 0.039 | 0.006 | 0.043 | 0.055 | 0.016 | 0.051 | 0.08 | 7.75 | 0.03 | 5.01 | 8.86 | 0.06 |
 | 35 | 0.228 | 0.072 | 0.039 | 0.192 | 0.062 | 0.045 | 0.90 | 7.64 | 0.08 | 4.90 | 8.78 | 0.05 |
 | 70 | 0.057 | 0.010 | 0.056 | 0.049 | 0.011 | 0.054 | 0.10 | 7.57 | 0.08 | 4.48 | 8.63 | 0.06 |
 | 95 | 0.091 | 0.032 | 0.044 | 0.145 | 0.038 | 0.034 | 0.94 | 7.44 | 0.06 | 4.70 | 8.75 | 0.05 |
Average | | 0.101 | | 0.046 | 0.114 | | 0.047 | 0.61 | 7.57 | | 4.88 | 8.83 | |
2000 | 4 | 0.217 | 0.021 | 0.046 | 0.245 | 0.03 | 0.038 | 1.11 | 20.55 | 0.08 | 4.87 | 20.14 | 0.05 |
 | 8 | 0.060 | 0.019 | 0.047 | 0.082 | 0.027 | 0.018 | 1.11 | 20.64 | 0.08 | 5.55 | 20.87 | 0.05 |
 | 35 | 0.253 | 0.063 | 0.052 | 0.191 | 0.088 | 0.048 | 0.83 | 20.31 | 0.06 | 4.91 | 21.03 | 0.05 |
 | 70 | 0.058 | 0.015 | 0.049 | 0.046 | 0.012 | 0.033 | 1.01 | 20.12 | 0.08 | 4.52 | 20.94 | 0.04 |
 | 95 | 0.173 | 0.037 | 0.048 | 0.198 | 0.038 | 0.048 | 0.92 | 20.58 | 0.08 | 5.12 | 20.47 | 0.05 |
Average | | 0.152 | | 0.048 | 0.152 | 0.039 | | 1.00 | 20.44 | | 5.00 | 20.69 | |
3000 | 4 | 0.376 | 0.029 | 0.052 | 0.399 | 0.019 | 0.051 | 0.92 | 32.74 | 0.08 | 5.79 | 38.31 | 0.05 |
 | 8 | 0.094 | 0.015 | 0.051 | 0.112 | 0.032 | 0.051 | 1.22 | 32.39 | 0.03 | 5.68 | 36.47 | 0.05 |
 | 35 | 0.259 | 0.070 | 0.049 | 0.213 | 0.087 | 0.047 | 0.98 | 32.40 | 0.08 | 4.76 | 36.57 | 0.05 |
 | 70 | 0.055 | 0.023 | 0.048 | 0.044 | 0.013 | 0.046 | 1.33 | 32.79 | 0.08 | 5.13 | 37.53 | 0.05 |
 | 95 | 0.227 | 0.049 | 0.050 | 0.233 | 0.059 | 0.049 | 0.94 | 33.29 | 0.11 | 4.88 | 37.47 | 0.05 |
Average | | 0.202 | | 0.050 | 0.2 | | 0.049 | 1.08 | 32.72 | | 5.25 | 37.27 | |
The accuracy of time series forecasting is a very important factor in many decision processes, and hence research on improving the effectiveness of forecasting models has continued. Both the neural network and the AR model capture all of the patterns in the data [
Epilepsy is a common neurological disorder in which some nerve cells intermittently produce excessive electrical discharges for a short time. Seizure prediction is mostly handled by applying statistical analysis methods to EEG recordings of brain activity. Forecasting epilepsy seizures can provide a warning that a seizure will occur on a certain time scale by estimating changes in brain waves. That is, seizure forecasting alerts patients before an epilepsy seizure occurs, so that they can avoid potentially dangerous situations such as brain damage or injury during seizures.
In recent years, much research has looked into the prediction of epilepsy seizures using EEG data. Mormann et al. [
Li and Yao [
Several techniques have been proposed to analyze characteristics of seizures via various methods. Liu et al. [
In this paper, we proposed a new CA-AR forecasting method based on the AR model that can forecast the seizure of complex epilepsy EEG data by applying the property of nonstandard distribution from [
Epilepsy may be caused by a number of unrelated conditions, including damage resulting from high fever, stroke, toxicity, or electrolyte imbalances. An algorithm capable of effective real-time epileptic seizure prediction will allow the patient to take appropriate precautions minimizing the risk of a seizure attack or injuries resulting from such an attack. Conventional methods for forecasting or prediction of data require periodicity in the observed data. However, when we applied the cepstrum, seizure activity signals did not exhibit periodicity. In addition, we could distinguish whether the epilepsy EEG data is random or nonrandom using the lag plot. If the lag plot has a nonrandom pattern, it can be used for prediction by conventional approaches. However, our data appears to have a random distribution.
This study proposed random coefficients suited to randomly distributed data. Further, we used log-log plots obtained by box counting, based on the concept of fractal dimension, to estimate the optimal forecasting order
Future research could focus on extending CA-AR to forecast multiple coevolving time series that include linear or nonlinear correlations and periodicity or nonperiodicity. A more ambitious direction would be to automatically readjust the parameter and coefficient equations.
This research was supported by the MKE (the Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-3005) supervised by the NIPA (National IT Industry Promotion Agency). It was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012-0007810).