A Framework on Performance Analysis of Mathematical Model-Based Classifiers in Detection of Epileptic Seizure from EEG Signals with Efficient Feature Selection

Epilepsy is one of the most commonly diagnosed neurological conditions. Electroencephalography (EEG) recordings are the primary tool for diagnosing and analyzing epilepsy. Epileptic EEG data display the electrical activity of the neurons and provide substantial information on pathology and physiology. Because visual inspection of these recordings is time-consuming, several automated classification methods have been developed. In this paper, three wavelets (Haar, dB4, and Sym8) are employed to extract features from the A–E sets of the Bonn epilepsy dataset. To select the best features of epileptic seizures, a Particle Swarm Optimization (PSO) technique is applied. The features are then classified using seven classifiers: linear regression, nonlinear regression, Gaussian Mixture Modeling (GMM), K-Nearest Neighbor (KNN), Support Vector Machine (SVM) with a linear kernel, SVM with a polynomial kernel, and SVM with a Radial Basis Function (RBF) kernel. Classifier performance is analyzed through benchmark parameters: sensitivity, specificity, accuracy, F1 score, error rate, and G-mean. The SVM classifier with RBF kernel on Sym8 wavelet features with PSO feature selection attains the highest accuracy, 98%, with an error rate of 2%, and outperforms all other classifiers.


Introduction
Epilepsy is a serious and potentially fatal neurological disorder. Approximately 1% of the world population suffers from this ailment. It is normally identified by analyzing EEG signals [1]. In clinics, visual inspection of EEG signals is relied on as the standard detection method. This type of detection is time-consuming and error-prone. Above all, an epileptic seizure should be diagnosed promptly and accurately before the patient enters an ictal state [2]. Hence, an accurate seizure detection system would be of great benefit. Various seizure detection methods have been attempted; these methods are broadly classified into three major groups: feature extraction techniques, feature selection, and classifiers [3]. The interpretation and identification of epilepsy using EEG signals have emerged as an interesting field of study in the last few decades. Identification of epileptic seizures, spike detection, interictal and ictal analysis, linear and nonlinear analysis, and optimization algorithms have all been extensively studied [4].
Epilepsy is characterized by abrupt disturbances in the brain's electrical activity, and it afflicts a significant number of individuals all over the globe. Epilepsy can lead to many serious injuries, such as broken bones, accidents, and burns. Some of these injuries can even be fatal. This issue imposes a very high societal cost on middle-class families and, as a result, causes them a great deal of financial difficulty. Both surgical and pharmaceutical approaches may be used to treat the condition, depending on the severity of the patient's epilepsy [3]. Antiepileptic medication cannot properly manage seizures in all people, and surgery may not be an option for certain patients due to the severity of their condition [4]. Therefore, forecasting the onset of an epileptic seizure and then identifying the kind of seizure that has occurred is highly significant. The techniques for feature extraction, feature selection, and classification are explained in detail in this article. A significant number of publications in the literature address the identification of epilepsy based on EEG data.

Related Works.
Harikumar et al. [4] employed the discrete wavelet transform (Haar, dB4, Sym8) to extract EEG signal features and identified epilepsy risk levels using EM, MEM, and SVD classifiers with a code converter technique, achieving an overall accuracy of 97.03%. Murugavel and Ramakrishnan [5] utilized the wavelet transform with approximate entropy to extract the features of EEG signals and a multiclass SVM with ELM to identify epileptic seizures, reaching a classification accuracy of 96%. Truong et al. [6] described a hills algorithm to extract the EEG features with a sensitivity of 91.95% and a specificity of 94.05%, and their data demonstrated the efficacy of their proposed approach. Manjusha and Harikumar [7] proposed detrended fluctuation analysis with power spectral density to reduce the dimensionality of EEG data. K-means clustering and a KNN classifier were applied to identify the epilepsy risk levels. The proposed work achieved a sensitivity of 90.48% and a specificity of 92.85%. Radüntz et al. [8] used two classification methods, a support vector machine (SVM) and an artificial neural network (ANN), to identify the epilepsy risk levels and found that the ANN was more accurate than the SVM (95.85% vs. 94.04%).
Ijaz et al. [9] utilized a hybrid prediction model that combines density-based spatial clustering of applications with noise, to detect outliers in diabetes and hypertension data, with the synthetic minority oversampling technique and random forest, to identify diabetes and hypertension, and reached a classification accuracy of 92.56%. Vulli et al. [10] proposed FastAI and a one-cycle policy with a tuned DenseNet-169 to normalize the breast data. The proposed model was used to detect breast cancer metastasis and achieved 97.4% accuracy. Ghaemi et al. [11] utilized the improved binary gravitation search algorithm with the wavelet domain to extract the features of EEG signals and an SVM to identify the optimal channels, reaching a classification accuracy of 80%. Gonzalez et al. [12] used binary particle swarm optimization (BPSO) to choose the best channels and Fisher discriminant analysis to find the auditory event-related potentials, which gave the best overall accuracy. Poli [13] analyzed the applications of particle swarm optimization (PSO). Azzerboni et al. [14] employed independent component analysis (ICA) to extract EMG signal features and identified muscle activation intervals using the wavelet transform. Greco et al. [15] used ICA to minimize EMG signal interference and the Morlet wavelet transform to determine muscle activation intervals. To detect the features of an epileptic seizure, various expansion methods have been proposed in the literature, such as the discrete wavelet transform (DWT), continuous wavelet transform (CWT), Fourier transform (FT), discrete Fourier transform (DFT), fast Fourier transform (FFT), and short-time Fourier transform (STFT). From the detailed literature survey, DWT is taken to be the best method to detect seizure features. The DWT has the advantage of evaluating the signal in both the time and frequency domains.
The most important objectives of this research are as follows: (a) Within DWT, the Haar, dB4, and Sym8 wavelets are proposed to detect the seizure features. (b) The Particle Swarm Optimization technique is proposed to select the best features. (c) The features derived from DWT are fed into the classifiers for further classification. Classifiers are used to identify whether a signal is epileptic or not. Seven classifiers are used in this study: LR, NLR, GMM, K-NN, and SVM (Linear, Polynomial, and RBF). The organization of the paper is as follows: Section 2 describes the materials and methods and explains the Haar wavelet, dB4 wavelet, and Sym8 wavelet-based feature extraction of EEG signals, Section 3 discusses the PSO-based feature selection, Section 4 describes the classifiers, Section 5 presents the results and discussion, and Section 6 presents the conclusion and future work.

Materials and Methods
The suggested method for automated epileptic seizure detection is presented in this section. The schematic diagram of the proposed method is shown in Figure 1. In this schematic diagram, the effectiveness of the EEG signal is maximized in the feature extraction stage by using multiple feature extraction approaches. The remainder of this section provides a full discussion of the feature extraction techniques used. A Particle Swarm Optimization (PSO) technique is used to choose the best features of epileptic seizures after the features have been extracted. After feature extraction and selection, the extracted and selected features are passed to several classifiers, and performance benchmark results are analyzed and compared. The most effective classifiers have the highest benchmark values. Next, the dataset and specifics of each subsystem are detailed. The implementation environment details of the study are given in Table 1.
The publicly available Bonn University datasets are chosen for the analysis. The Bonn University EEG datasets comprise sets A, B, C, D, and E with a sampling frequency of 173.6 Hz [16]. Dataset A represents the normal signal and dataset E the abnormal (epileptic seizure) signal; these two sets are considered for this analysis. The details of the dataset are given in Table 2. Each set has 100 epochs, each with a recording period of 23.6 seconds. In sets (A) and (B), signals were obtained from healthy subjects without epilepsy, with set (A) being recorded when the subjects' eyes were open and set (B) when their eyes were closed. Signals from patients with epilepsy were obtained in sets (C), (D), and (E). For sets (C) and (D), signals were recorded from epileptic patients but not during a seizure, whereas in set (E), signals were obtained from individuals during a seizure [17]. Each epoch has 4096 samples of EEG signal. In this research, the analysis is performed on the A-E epilepsy sets only.

Wavelet Feature Extraction.
In this work, the first step in analyzing epileptic seizures is the extraction of features from the EEG datasets obtained from the Bonn University database. The discrete wavelet transform (DWT) is used to extract the EEG features. The three wavelet families employed for feature extraction from the EEG signal (A-E Bonn) datasets at level 4 wavelet decomposition are the Haar wavelet (HAAR), dB4 wavelet (Daubechies), and Sym8 wavelet (Symlet8).
After passing through the wavelets at level 4 decomposition, the input EEG signals of [4096 × 100] samples per set are reduced to [256 × 100] approximation coefficients. The essential features of the wavelets are described in the following sections of the paper.
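The size reduction from 4096 to 256 values per epoch follows from halving the signal length at each of the four decomposition levels (4096 / 2^4 = 256). A minimal sketch for the Haar case is given below, assuming an orthonormal Haar low-pass filter and no boundary padding; the function name `haar_approximation` and the synthetic sine-wave epoch are hypothetical stand-ins and are not taken from the study.

```python
import math

def haar_approximation(signal, levels=4):
    """Return the approximation coefficients after `levels` Haar DWT steps."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        # Orthonormal Haar low-pass step: scaled sum of each neighbouring pair.
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx

# Hypothetical stand-in for one 4096-sample EEG epoch sampled at 173.6 Hz.
epoch = [math.sin(2 * math.pi * 5 * n / 173.6) for n in range(4096)]
features = haar_approximation(epoch, levels=4)
print(len(features))  # 256
```

For the longer dB4 and Sym8 filters, the number of coefficients per level also depends on the boundary handling of the wavelet library used, so the exact output length may differ from this Haar sketch.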

Haar Wavelet.
The Haar wavelet is essentially a discontinuous function that resembles a step function. It is equivalent to the Daubechies dB1 wavelet. The Haar wavelet is a basic kind of compression that involves average and difference terms, storing detail coefficients, removing data, and reconstructing the matrix so that it resembles the original matrix [18]. The Haar wavelet is the only wavelet that is compactly supported, orthogonal, and symmetric.
The Haar decomposition has excellent time localization because of the compact support of the Haar wavelets [19]. The mathematical expression of the Haar wavelet function ($\psi_{j,k}$) and scaling function ($\phi_{j,k}$) is represented as follows:
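For reference, the textbook form of the Haar mother wavelet, scaling function, and their dilated and translated versions is sketched below. The symbol $\phi$ for the scaling function and the normalization $2^{j/2}$ are assumptions based on the standard definition and may differ from the notation of [19].

```latex
\[
\psi(t) =
\begin{cases}
 1, & 0 \le t < \tfrac{1}{2},\\
-1, & \tfrac{1}{2} \le t < 1,\\
 0, & \text{otherwise},
\end{cases}
\qquad
\phi(t) =
\begin{cases}
1, & 0 \le t < 1,\\
0, & \text{otherwise},
\end{cases}
\]
\[
\psi_{j,k}(t) = 2^{j/2}\,\psi\!\left(2^{j}t - k\right),
\qquad
\phi_{j,k}(t) = 2^{j/2}\,\phi\!\left(2^{j}t - k\right).
\]
```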

dB4 Wavelet.
Ingrid Daubechies, one of the leading figures in wavelet research, devised compactly supported orthonormal wavelets, which made discrete wavelet analysis feasible. The order of the Daubechies family wavelets is N, and dB is the wavelet's "family name." These wavelets are energy-preserving since they are orthogonal and compactly supported [20]. The dB4 wavelet function is utilized in this work. Due to the overlapping windows used by Daubechies (dB) wavelets, all high-frequency changes are reflected in the spectrum of the high-frequency coefficients. Filter coefficients are used to create the Daubechies (dB) family of wavelets and scaling functions [21]. The first step in Daubechies' technique for creating orthogonal compactly supported wavelets is the 2π-periodic trigonometric polynomial associated with the filter $h_k$. From the filter's element sequence $\{h_k\}$ [22], this function is represented as $m_0(\omega) = \frac{1}{\sqrt{2}} \sum_{k} h_k e^{-ik\omega}$. By designing this function to provide orthogonality and smoothness, a new family of wavelets may be generated. As dB4 has a very short basis function, it may isolate signal discontinuities more effectively.

Sym8 Wavelet.
The name of the Symlet wavelet family is an abbreviation for "symmetrical wavelets." They are constructed to have the least amount of asymmetry and the greatest number of vanishing moments for a given compact support. In this work, a wavelet function of type Sym8 was used. The Sym8 wavelet is a nearly symmetrical and smooth wavelet function [23]. In order to identify the presence of nonlinearity in the wavelet features, statistical parameters such as the mean, variance, skewness, kurtosis, Pearson correlation coefficient, and Canonical Correlation Analysis (CCA) are given in Table 3 for the case without feature selection.
As indicated in Table 3, the statistical parameters of the wavelet features show the presence of nonlinearity among the A-E sets for all three wavelets. The Pearson Correlation Coefficient (PCC) shows essentially no correlation among the epochs within the A set. At the same time, CCA shows a higher correlation between the two classes of the A-E sets.
This is an indication that features in the A-E sets are correlated and overlapped. This is evident in the histogram plots shown below.
Figure 2 shows the histogram of Haar wavelet features for the epilepsy E set, and Figure 3 displays the histogram of Haar wavelet features for the normal A set. Figure 2 demonstrates the nonlinear nature of the wavelet features of the E set, with few outliers. Figure 3 shows the presence of outliers in the wavelet features for the normal A set. Finally, these extracted features are fed as input to feature selection using the Particle Swarm Optimization (PSO) algorithm.
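The first four statistical parameters and the Pearson correlation coefficient used in this section can be computed from feature vectors as sketched below. This is only an illustration under stated assumptions: population (biased) estimators are assumed because the estimator is not specified in the text, CCA is omitted, and the function names and sample values are hypothetical.

```python
import math

def moments(x):
    """Return mean, variance, skewness, and kurtosis (population estimators)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return mean, var, skew, kurt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical feature vectors standing in for two epochs.
epoch_a = [1.0, 2.0, 3.0, 4.0, 5.0]
epoch_b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(moments(epoch_a))
print(pearson(epoch_a, epoch_b))
```

A skewness far from 0 or a kurtosis far from 3 indicates a departure from a Gaussian distribution, which is how such parameters are commonly read.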

PSO as a Feature Selection Algorithm
PSO is a well-known method developed by Kennedy and Eberhart in 1995 [24]. The search space is traversed by a collection of particles. Each swarm member i carries parameters for location y and velocity v. Each particle's location represents a possible solution to the optimization problem. Particles move through the search space according to their own and their neighbors' experiences. In the basic formulation of PSO, every particle is a potential solution to the problem in a D-dimensional space [25]. In a D-dimensional space, particle i is represented as $Y_i = (y_{i1}, y_{i2}, y_{i3}, \ldots, y_{iD})$. Furthermore, each particle remembers its prior optimum location. The i-th particle's best prior location may be expressed as $P_i = (p_{i1}, p_{i2}, p_{i3}, \ldots, p_{iD})$, and its velocity as $V_i = (v_{i1}, v_{i2}, v_{i3}, \ldots, v_{iD})$. The greatest fitness value is assigned to the global best, which is the best particle chosen from all the particles in the population. It is mathematically expressed as $P_g = (p_{g1}, p_{g2}, p_{g3}, \ldots, p_{gD})$.
The cognitive component represents the velocity adjustment made toward the particle's own prior best position. In contrast, the social component represents the velocity adjustment made toward the global best position. The update is expressed as follows [26]:
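The canonical PSO update with inertia weight has the form sketched below. Here the velocity is written $v$ to keep it distinct from the inertia weight $w$, $y$ is the particle location, and $r_1$, $r_2$ are random numbers drawn uniformly from $[0, 1]$; these notational choices are assumptions based on the standard formulation rather than on [26].

```latex
\[
\begin{aligned}
v_{id}(t+1) &= w\,v_{id}(t)
  + \eta_1 r_1 \bigl(p_{id} - y_{id}(t)\bigr)
  + \eta_2 r_2 \bigl(p_{gd} - y_{id}(t)\bigr),\\
y_{id}(t+1) &= y_{id}(t) + v_{id}(t+1),
\end{aligned}
\]
```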
where w denotes the inertia weight, and η1 and η2 represent the positive acceleration constants. The velocity vector drives the optimization process and reflects the socially exchanged information. Figure 4 shows the MSE against the number of iterations for PSO feature selection at different weights. It is observed from Figure 4 that the optimum weight is w = 0.5, which gives lower MSE values than the other weight values. Accordingly, the inertia weight (w) is set to 0.5, while η1 and η2 are both set to 1.
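A minimal sketch of a PSO loop with these parameter values (w = 0.5, η1 = η2 = 1) is given below. It is a generic minimizer under stated assumptions, not the feature selection procedure of this study: the sphere function is a hypothetical placeholder for the MSE-based fitness, and the swarm size, iteration count, initialization range, and all function names are illustrative choices.

```python
import random

def pso_minimize(fitness, dim, n_particles=20, n_iter=50,
                 w=0.5, eta1=1.0, eta2=1.0, seed=0):
    """Minimize `fitness` over a `dim`-dimensional space with basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best location
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best location
    history = [gbest_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + cognitive component + social component.
                vel[i][d] = (w * vel[i][d]
                             + eta1 * r1 * (pbest[i][d] - pos[i][d])
                             + eta2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history

def sphere(x):
    """Hypothetical placeholder fitness; the study uses an MSE criterion."""
    return sum(v * v for v in x)

best, best_fit, history = pso_minimize(sphere, dim=5)
print(best_fit)
```

The `history` list records the best fitness after each iteration, which is the kind of curve that a plot of MSE against iteration count would show for a given inertia weight.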
After PSO feature selection, the [256 × 100] wavelet features are reduced to [256 × 10].
We now analyze the presence of nonlinearity in the PSO features. For this purpose, statistical parameters such as the mean, variance, skewness, kurtosis, Pearson correlation coefficient, and Canonical Correlation Analysis (CCA) are well suited. These parameters are therefore computed for the wavelet features with PSO feature selection and are given in Table 4. From Table 4, the statistical parameters indicate the presence of nonlinearity in the PSO features of both classes. The PCC shows that the PSO features are uncorrelated within each class. CCA likewise shows no correlation between the classes, that is, the A-E sets. Figure 5 shows the normal probability plot for dB4 wavelet coefficients with PSO feature selection for the epilepsy E set, and Figure 6 displays the normal probability plot for dB4 wavelet features with PSO feature selection for the normal A set. It is observed from Figures 5 and 6 that the PSO features for dB4 wavelet feature extraction exhibit the uncorrelated, overlapped, and nonlinear nature of the A-E sets. The extracted features, both without and with PSO feature selection, are then fed as input to the various classifiers: linear regression (LR), nonlinear regression (NLR), Gaussian mixture model (GMM), K-Nearest Neighborhood (K-NN), SVM (Linear), SVM (Polynomial), and SVM (RBF).
These are discussed in the following sections.

Mathematical Model-Based Classifiers for Epilepsy Detection
In this section, model-based classifiers are used to classify the features that were extracted and selected with the help of wavelet (Haar, db4, and sym8) techniques and PSO methodology.

Linear Regression.
Linear regression is a supervised learning technique in which one or more independent variables are linearly related to the dependent variable [27]. Simple linear regression employs only one independent variable, whereas multiple linear regression uses several independent variables [28]. A residue value is computed depending on the targeted value using conventional linear regression. The linear regression model equation is then applied to the residue value. The performance of the classifier is evaluated based on the deviation from its target value. The mathematical expression for simple linear regression is $Y = a + bX$, where Y represents the dependent variable (y-axis), X represents the independent variable (x-axis), b indicates the slope of the line, and a represents the y-intercept. The slope (b) and intercept (a) are mathematically expressed as follows:
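The ordinary least-squares estimates of the slope and intercept for the model $Y = a + bX$ take the standard form sketched below, where $n$ denotes the number of sample pairs $(X, Y)$. The symbol $n$ and the choice of the least-squares estimator are assumptions, since the fitting criterion is not stated explicitly in the text.

```latex
\[
b = \frac{n \sum XY - \sum X \sum Y}{n \sum X^{2} - \left(\sum X\right)^{2}},
\qquad
a = \frac{\sum Y - b \sum X}{n}.
\]
```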

Nonlinear Regression.
Nonlinear regression (NLR) is a regression analysis method in which empirical data are represented by a function that depends on one or more independent variables and is a nonlinear combination of model parameters. The data are fitted by a method of successive approximations. The statistical model for nonlinear regression is expressed as follows [29]: where x represents the vector of independent variables, y indicates the vector of dependent variables, and f represents the nonlinear expectation function. The expectation function f is mathematically expressed as follows: On the basis of the target set, a residue value is computed. The performance is then evaluated by applying the residue value and the EEG signal samples to the nonlinear equation.

Gaussian Mixture Model (GMM).
The Gaussian mixture model (GMM) is a weighted sum of Gaussian component densities that defines a parametric probability density function. With sufficiently many components, GMMs can model arbitrary densities. The probability density of the random vector x is expressed as $p(x \mid \theta) = \sum_{i=1}^{L} B_i \, \mathcal{N}(x; \mu_i, \Sigma_i)$ [30], where L represents the number of Gaussian mixture components, $B_i$ indicates the weight of the i-th mixture component, and $\mu_i$ and $\Sigma_i$ are its mean and covariance. The mixing parameters (θ) are usually computed by maximizing the log-likelihood function, $\ln \mathcal{L}(\theta) = \sum_{n=1}^{N} \ln p(x_n \mid \theta)$, taken over the N observed feature vectors $x_n$. The expectation-maximization (EM) method is a frequently employed strategy for maximum likelihood estimation.
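The EM iteration for a Gaussian mixture can be illustrated with the one-dimensional, two-component sketch below. The number of components, the initialization from the data minimum and maximum, the variance floor, and the sample values are all assumptions chosen for illustration; they are not the settings used in this study.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[i] * gaussian_pdf(x, means[i], variances[i])
                     for i in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for i in range(2):
            n_i = sum(r[i] for r in resp)
            weights[i] = n_i / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / n_i
            var_i = sum(r[i] * (x - means[i]) ** 2
                        for r, x in zip(resp, data)) / n_i
            variances[i] = max(var_i, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances

# Hypothetical scalar features drawn from two well-separated groups.
data = [-0.5, -0.2, 0.0, 0.1, 0.4, 4.6, 4.9, 5.0, 5.2, 5.3]
weights, means, variances = fit_gmm_1d(data)
print(weights, means)
```

Once fitted, a sample can be assigned to the class whose mixture gives it the higher likelihood, which is one common way to use a GMM as a classifier.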

K-Nearest Neighborhood (K-NN).
The K-nearest neighborhood (K-NN) method is based on the supervised learning approach and is one of the most basic machine learning algorithms. The K-NN approach may be used for both regression and classification; however, it is more commonly utilized for classification tasks [31]. The steps of the K-NN algorithm are as follows:
Step 1: choose the number of neighbors K
Step 2: compute the Euclidean distance from the new data point to each training point
Step 3: using the computed Euclidean distances, find the K closest neighbors
Step 4: count how many of these K neighbors fall in each category
Step 5: assign the new data point to the category with the highest number of neighbors
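The five steps above can be sketched in a few lines. The function name `knn_predict`, the value K = 3, and the two-dimensional sample points are hypothetical and serve only to illustrate the procedure.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: Euclidean distance from the query to every training point.
    distances = [(math.dist(x, query), label)
                 for x, label in zip(train_x, train_y)]
    # Step 3: keep the K closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: count how many neighbors fall in each category.
    votes = Counter(label for _, label in nearest)
    # Step 5: assign the category with the highest count.
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two classes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["normal", "normal", "normal", "seizure", "seizure", "seizure"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))  # normal
print(knn_predict(train_x, train_y, (5.5, 5.5)))  # seizure
```

An odd K is a common choice for two-class problems because it avoids tied votes.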

Support Vector Machine (SVM).
SVM is widely used for pattern classification. The SVM algorithm maps nonlinearly separable samples into a higher-dimensional space by means of kernel functions and then locates the optimal separating hyperplane by solving a quadratic optimization problem [32]. The kernel functions of SVM include the linear kernel, polynomial kernel, radial basis function (RBF), and sigmoidal neural network kernel. SVM-Linear, SVM-RBF, and SVM-Polynomial are used in this work.
where c represents the bandwidth of the kernel and σ indicates the positive parameter that standardizes the radius.
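The three kernels used in this work can be sketched in their common textbook forms as below. The exact parameterization of this study cannot be recovered from the text, so the parameter names `degree`, `coef0`, and `sigma`, their default values, and the placement of σ in the RBF exponent are assumptions.

```python
import math

def linear_kernel(x, z):
    """Linear kernel: inner product of the two vectors."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, sigma=1.0):
    """RBF kernel: exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2 * sigma ** 2))

x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z))      # 11.0
print(polynomial_kernel(x, z))  # 144.0
print(rbf_kernel(x, x))         # 1.0
```

The RBF kernel depends only on the distance between the two vectors, so identical inputs give the maximum value of 1 and the value decays as the inputs move apart.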

Results and Discussion
This paper considers regular 10-fold training and testing, with 90% and 10% of the input features used for training and testing, respectively [33]. Table 3 highlights the average MSE results for Haar, dB4, and Sym8 wavelet features in various classifiers without PSO feature selection, and Table 5 illustrates the average MSE for Haar, dB4, and Sym8 wavelet features in various classifiers with PSO feature selection. Table 6 depicts the confusion matrix for seizure detection. Table 7 displays the average performance of the classifiers for Haar, dB4, and Sym8 wavelet features without PSO feature selection, and Table 8 shows the average performance of the classifiers for Haar, dB4, and Sym8 wavelet features with PSO feature selection. The following performance measures may be calculated from the confusion matrix and used to examine the classifier's performance. The formulae for the sensitivity, specificity, accuracy, F1 score, error rate, G-mean, and MSE are given below.
From Table 6, True-Positive is represented as TP, True-Negative as TN, False-Positive as FP, and False-Negative as FN [34]. A TP is a positive sample that has been correctly predicted as positive. A TN is a negative sample that has been correctly predicted as negative. An FP occurs when a result is incorrectly predicted to be positive but is really negative. An FN occurs when a result is incorrectly predicted to be negative when it is really positive [35].
The sensitivity is computed as $\text{Sensitivity} = \frac{TP}{TP + FN}$. The specificity is expressed as $\text{Specificity} = \frac{TN}{TN + FP}$. The overall accuracy of the classifier is computed as $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$. The F1 score is expressed as $F1 = \frac{2TP}{2TP + FP + FN}$. The geometric mean (G-mean) is computed as $\text{G-mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}$. The mean square error (MSE) is computed as follows [36]: $\text{MSE} = \frac{1}{B} \sum_{i=1}^{B} (P_i - Q_j)^2$, where $P_i$ indicates the value observed at a particular time, $Q_j$ represents the target value at j (j = 1 to 100), and B represents the number of observations per patient, which in our case is 25600. A low MSE value corresponds to higher classification benchmark parameters of the classifier.
Table 9 portrays the average performance measures (sensitivity, specificity, accuracy, F1 score, error rate, and G-mean) for Haar, dB4, and Sym8 wavelet features in various classifiers without the feature selection method. Table 9 illustrates that Haar wavelet SVM with RBF kernel classifier ... Table 11 outlines the previous identification efforts for EEG signals. The accuracy of these efforts ranged from 73.5% to 97.3%. The suggested approaches for the linear regression, nonlinear regression, GMM, K-NN, SVM-linear, SVM-polynomial, and SVM-RBF classifiers using wavelet (Haar, dB4, Sym8) and PSO features outperformed other existing approaches in epileptic seizure classification. The SVM classifier with RBF kernel on Sym8 wavelet features with the PSO feature selection method attains the highest accuracy, 98%, with an error rate of 2%. This classifier outperforms all other classifiers.
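The benchmark parameters of this section can be computed from the four confusion-matrix counts as sketched below. The error rate is taken as one minus the accuracy, which matches the reported pairing of 98% accuracy with a 2% error rate; the function names and the counts in the example are hypothetical and are not results from this study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Benchmark parameters derived from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1_score": 2 * tp / (2 * tp + fp + fn),
        "error_rate": 1 - accuracy,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

def mean_square_error(observed, target):
    """Mean of squared deviations of the observed values from one target value."""
    return sum((p - target) ** 2 for p in observed) / len(observed)

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(mean_square_error([1.0, 2.0, 3.0], 2.0))
```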

Conclusion
Epilepsy, or "seizure disorder," is a chronic disorder and is the fourth most common neurological disorder, affecting people of all ages. Early diagnosis can help the patient's rehabilitation. This paper applied four levels of decomposition using Haar, dB4, and Sym8 wavelet transforms for feature extraction from the Bonn A and E EEG signals. The PSO technique was used to reduce the size of the decomposed signals. Then seven classifiers were used to classify the signals as seizure or nonseizure. The SVM classifier with RBF kernel on Sym8 wavelet features with the PSO feature selection approach achieves the highest accuracy, 98%, with a 2% error rate, and outperforms all the other classifiers. Further research is proposed in the direction of deep neural networks and other mathematical model-based classifiers like NBC and Random Forest.

Data Availability
The data used to carry out this study can be obtained from the corresponding author upon request. The EEG dataset can be obtained from the Bonn University EEG database.