EEG Channel Selection Using Multiobjective Cuckoo Search for Person Identification as Protection System in Healthcare Applications

Recently, the electroencephalogram (EEG) signal has shown excellent potential as the basis for a new person identification technique. Several studies have characterized the EEG as having unique features, universality, and natural robustness, which makes it a promising new track for preventing spoofing attacks. EEG signals are a graphical recording of the brain's electrical activity, measured by placing electrodes (channels) at various scalp positions. However, traditional EEG-based systems are highly complex because they use many channels, and only some of those channels carry information that is critical for the identification system. Several studies have proposed single-objective approaches to EEG channel selection for person identification. Unfortunately, these studies focused only on increasing the accuracy rate, without balancing the accuracy against the total number of selected EEG channels. The novelty of this paper is a multiobjective binary version of the cuckoo search algorithm (MOBCS-KNN) that finds optimal EEG channel selections for person identification. The proposed method (MOBCS-KNN) uses a weighted sum technique to implement the multiobjective approach, and a KNN classifier is used for EEG-based biometric person identification. It is worth mentioning that this is the initial investigation of a multiobjective technique for the EEG channel selection problem. A standard EEG motor imagery dataset is used to evaluate the performance of MOBCS-KNN. The experiments show that MOBCS-KNN obtained an accuracy of 93.86% using only 24 sensors with AR20 autoregressive coefficients. Another critical point is that MOBCS-KNN selects channels that are not too close to each other, so that relevant information is captured from all over the head. In conclusion, the MOBCS-KNN algorithm achieves the best results compared with the other metaheuristic algorithms. Finally, the recommended approach can draw future directions to be applied to different research areas.


Introduction
Over the years, our world has moved toward a digital community in which each person lives with a particular digital identifier. There are several kinds of identifiers, such as identification cards and passwords; however, they can be easily circumvented, stolen, or forgotten [1]. Therefore, personal behavior or characteristics can be used to strengthen identification systems. Such techniques, the so-called biometrics, make use of several pieces of in-person information to allow more robust identification systems, such as face and voice recognition, fingerprint information, and iris data [2]. The use of EEG and body sensors in healthcare systems has been an area of interest for many researchers [3-5].
On the other hand, the widespread and influential deployment of biometric systems leads to a new challenge, which is called "spoofing" [1, 6-8]. This type of attack is classified as the most dangerous in security systems since it is designed to break the security of biometric systems, thus allowing unwarranted persons to gain admission to the system [2].
In real life, there have already been several spoofing attacks on biometric systems, such as face spoofing (printed photos and 3D mask attacks [9-12]), fake fingerprints (gummy fingers), finger-vein systems fooled with a piece of paper [13], iris recognition systems fooled by an eyeball placed in front of the iris scanner, and voice recognition fooled by replaying a voice recording in front of the recognition system [14]. Therefore, people are looking for biometric authentication systems that can grant access to a person based on invisible characteristics, which are harder for an external threat to attack. In this context, one shall refer to user authentication based on brain signals, which can be captured by the well-known electroencephalogram (EEG) exam [15]. EEG signals appear to be a great alternative for designing new biometric systems since several studies showed that such information presents uniqueness, universality, and natural robustness to spoofing attacks [1, 16]. These signals represent the graphical recording of the brain's electrical activity, which can be measured by placing electrodes (sensors) at different positions on the scalp [1, 17-20]. EEG channel selection is totally dependent on the characteristics of the EEG signal, where the most informative features that provide the highest accuracy rate from the channel selection shall be determined.
Multiobjective optimization algorithms have been applied in different methodological phases and criteria [21,22]. In a multiobjective formulation of the feature selection problem, the possible features define the vector of decision variables.
The space of features in many classification problems is usually very large [23]. In [24, 25], the authors proposed new methods for feature selection, based on a multiobjective evolutionary algorithm, which managed to select a small subset of features, trying to avoid overfitting and to reduce classification problems in a high-dimensional feature space. In the same direction, the feature selection problem has also been solved by an embedded multiobjective genetic optimization procedure, subject to the simultaneous minimization of the misclassification ratio and the number of selected attributes [26, 27]. Multiobjective optimization algorithms have also been used in the denoising stage of the EEG signal. In [28], an approach was adopted in the signal denoising stage to model decomposition and present two metrics that quantify the amount of EEG information lost during the cleaning process. Furthermore, in [7], the authors proposed a novel method for extracting unique features from the original EEG signals using the multiobjective flower pollination algorithm and the wavelet transform.
One of the significant challenges concerning EEG-based user identification is signal acquisition, which is performed by placing several electrodes (sensors) on a person's head. As a drawback, such a process is usually uncomfortable, and it requires good knowledge to place the sensors in their correct positions. Additionally, some questions must be considered, such as the following: "Is it really necessary to put all these electrodes on a person's head? If not, can we identify the most relevant ones for user identification and then use a smaller number of sensors?" These questions motivated our attempts to model EEG channel selection as an optimization problem. In this work, we aim at learning the most important EEG channels by proposing a hybrid approach composed of the cuckoo search algorithm (CSA) [29], hereinafter called "MOCS-EEG." The classification step is performed by a KNN classifier. The main contributions of this paper are threefold:
(1) To evaluate the binary cuckoo search algorithm (BCSA) for EEG-based person identification
(2) To model the problem of EEG channel selection as an evolutionary-based optimization task
(3) To propose a multiobjective technique combined with the KNN classifier for EEG-based biometric person identification
We use evolutionary optimization algorithms for EEG channel selection due to their simplicity and their efficiency when solving challenging real-world problems. The evaluation measures include Specificity (EEG_Precision), F-score (EEG_Fscore), and Recall (EEG_Recall). The results show that the proposed technique (MOCS-KNN) is able to outperform other metaheuristic algorithms in almost all results produced. The remainder of this paper is organized as follows: related works on EEG-based identification techniques are reviewed in Section 2. The cuckoo search algorithm is discussed in Section 3. The proposed method is provided in Section 4. Results are discussed in Section 5, and Section 6 provides conclusions and future work.

Related Works
This section presents an overview of previous studies related to the use of optimization algorithms in BCI applications. Several works based on optimization algorithms have been proposed for tackling issues relevant to the person identification case study. According to [1, 30], the number of selected channels can positively affect the accuracy of the classification task. The authors specifically addressed the problem of reducing the number of required sensors while maintaining a comparable performance. They achieved significant results by obtaining very good person identification rates using far fewer channels. In [1], the authors proposed a binary version of the flower pollination algorithm under different transfer functions to select the best subset of channels that maximizes the accuracy. In [31], the authors proposed a genetic algorithm to reduce the number of electrodes needed for measurements by EEG devices. The results were encouraging, and it was possible to accurately identify a subject using about 10 out of 64 electrodes. The authors in [16] proposed a hybrid optimization technique based on the binary flower pollination algorithm (FPA) and β-hill climbing (called FPA β-hc) for selecting the most relevant EEG channels (i.e., features) that yield an efficient accuracy rate for personal identification. The proposed method is able to identify persons with high Acc, Sen, F_s, and Spe while selecting fewer channels. However, these studies have concentrated only on the number of channels as the baseline for the optimization process.
In [7], the authors proposed a novel method for EEG signal denoising based on the multiobjective flower pollination algorithm. They designed a multiobjective function that balances reducing the EEG noise against keeping the signal energy. In [32], the authors proposed a multiobjective optimization method for optimal electroencephalographic (EEG) channel selection to provide access to subjects with permission in a system by detecting intruders and identifying the subject. The optimization process was performed by the nondominated sorting genetic algorithm (NSGA). It consists of finding the best nu and gamma for the SVM with the RBF kernel to increase the TAR, TRR, and accuracy of subject identification, or to maintain them as high as possible for previous configurations, while using the smallest possible number of EEG channels. However, the optimization process is restricted to SVM hyperparameters, and it is hard to generalize to another study, especially when different classifiers are used, since each classifier has its own characteristics and hyperparameters.
To summarize, two schemes have been observed for person identification with optimization algorithms: first, optimization schemes based on single-objective criteria, mainly channel selection; second, optimization schemes based on multiobjective criteria such as channel selection, EEG noise, and classifier hyperparameters. Bioinformatics applications frequently involve classification problems that require improving the learning accuracy [33]. According to [32], certain aspects need to be analyzed and improved before new biometric systems reach industrial-level application. One is person identification, which is an essential security layer in any secure system. This is also important for the development of portable low-density EEG devices that retain accuracy similar to that of high-density EEG.
Thus, the accuracy of person identification is a very important aspect. To the best of our knowledge, on the one hand, this is the first study to present multiobjective optimization based on the number of channels and the classification accuracy weights as the baseline for the optimization process. On the other hand, this is the first study to implement and test eight optimization algorithms in order to determine the best algorithm that can be adopted for the person identification task.

Cuckoo Search Algorithm
The cuckoo search (CS) algorithm is a nature-inspired swarm-intelligence metaheuristic proposed by Yang and Deb [34] to imitate the behavior of cuckoo birds in the reproduction process. It simulates the way a cuckoo lays its fertilized eggs in another bird's nest, where its offspring are looked after by proxy parents. The cuckoo may also remove the original eggs from the nest to improve the hatching chance of its own eggs. When the proxy parents discover that the foreign eggs do not belong to them, they either throw them out of the nest or abandon the nest. This reproduction process is modeled as an optimization algorithm to formulate CS. Three assumptions are adopted to formulate the CS algorithm in an optimization context.
(i) Each cuckoo chooses only one nest and lays one egg in that nest.
(ii) The high-quality egg in the best nest is marked to be used in the next generations.
(iii) The number of hosting nests is predetermined in advance, and the host bird can discover that the eggs in the nest are not its own with a probability of $p_a$, where $p_a \in (0, 1)$. In this case, the host bird either throws the foreign eggs out or abandons the nest and rebuilds another one in a different place.
In the CS algorithm, the eggs in each nest represent the set of solutions, while the cuckoo eggs represent the new solution (see Figure 1). The quality of the eggs in the nest is the fitness function value of that solution. The ultimate aim is to replace the eggs in a nest with potentially better cuckoo eggs. The cuckoo frequently changes its position using Lévy flights after leaving the nest. The host bird can throw the cuckoo eggs out or leave the nest when cuckoo eggs are discovered. The flowchart of the CS algorithm is given in Figure 1. The pseudocode of the CS algorithm is given in Algorithm 1. Table 1 shows the local and global search parameters of the CS algorithm. The discussion below provides the procedural steps of the CS algorithm.
Step 1: initialize CS algorithm parameters. Initially, the optimization problem is conventionally modeled in terms of an objective function as follows:

$$\operatorname{optimize}\; f(x), \quad x_i \in [LB_i, UB_i], \quad i = 1, 2, \ldots, d, \qquad (1)$$

where $f(x)$ is the objective function that evaluates the quality of the solution $x$, and $LB_i$ and $UB_i$ are the lower and upper bounds of variable $x_i$, respectively, for each of the $d$ decision variables. The parameters of the optimization problem are normally extracted from the datasets. The objective function is used to evaluate the solutions of the problem. The CS parameters can be divided into two types: (i) algorithmic parameters, such as the maximum number of iterations and the number of nests (population size); (ii) the control parameter $p_a$, which is the discovery rate of alien eggs/solutions.
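Step 2: initialize the host nest population (HNP). The host nest population is formulated as a matrix HNP of size $N \times d$, where $N$ is the total number of eggs in the nests and $d$ is the solution dimension. Each row in the HNP represents a solution as shown in the following equation:

$$HNP = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_d^1 \\ x_1^2 & x_2^2 & \cdots & x_d^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^N & x_2^N & \cdots & x_d^N \end{bmatrix} \qquad (2)$$

The objective function value $f(x_i)$ of each solution $x_i$ is also calculated.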
Step 3: this step is also called the global random walk. Each solution $x_i(t)$ in the HNP is updated (i.e., to $x_i(t + 1)$) using a Lévy flights step as formulated in the following equation:

$$x_i(t + 1) = x_i(t) + \alpha \oplus \text{Lévy}(\lambda), \qquad (3)$$

where

$$\text{Lévy}(\lambda) \sim \frac{\lambda \, \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \cdot \frac{1}{s^{1 + \lambda}}, \quad s \gg s_0 > 0, \qquad (4)$$

where $\alpha > 0$ is the step size scale factor and $s$ is the step size. Note that the step size is calculated based on the scale of the optimization problem at hand [35-37]. The mathematical notation $\oplus$ refers to the entrywise product operation. Lévy($\lambda$) denotes the Lévy flights and is calculated from the heavy-tailed probability distribution formulated in (4). The random walk is represented by the stochastic equation (3). $\Gamma$ is the gamma function.
Step 4: update the host nest population. In order to update the HNP, each solution $x_i(t + 1)$, $i \in (1, \ldots, N)$, updated by the Lévy flights step is compared with another randomly selected solution $x_j(t)$. Thereafter, the solution $x_j(t)$ is replaced by the solution $x_i(t + 1)$ if the latter is better.
Step 5: local random walk. With a probability of $p_a$, each solution $x_i(t + 1)$, $i \in (1, \ldots, N)$, in the HNP(t + 1) is checked for whether or not it is abandoned as follows:

$$x_i(t + 1) = x_i(t) + \alpha s \oplus H(p_a - \varepsilon) \oplus \left( x_j(t) - x_k(t) \right), \qquad (5)$$

where $x_j(t)$ and $x_k(t)$ are two different randomly selected solutions and $H(u)$ is the Heaviside function. $\varepsilon$ is a random number drawn from a uniform distribution, and $s$ is the step size. For more details about the convergence behavior of the CS algorithm, the reader is referred to [34, 35, 38].
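The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: it minimizes a continuous objective, it draws the Lévy step with Mantegna's algorithm (a common choice that the text does not specify), it scales the global step by the width of the search range, and the function names and default values (`cuckoo_search`, `levy_step`, `alpha=0.1`, `pa=0.25`) are hypothetical.

```python
import math
import random


def levy_step(rng, lam=1.5):
    """One heavy-tailed step via Mantegna's algorithm (assumed, not from the paper)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / lam)


def clip(x, lb, ub):
    """Keep every variable inside [lb, ub]."""
    return [min(max(v, lb), ub) for v in x]


def cuckoo_search(f, d, lb, ub, n=15, pa=0.25, alpha=0.1, iters=100, seed=0):
    """Minimize f over [lb, ub]^d; returns (best_solution, best_value)."""
    rng = random.Random(seed)
    # Step 2: host nest population (N x d) and its objective values.
    nests = [[rng.uniform(lb, ub) for _ in range(d)] for _ in range(n)]
    fit = [f(x) for x in nests]
    for _ in range(iters):
        for i in range(n):
            # Step 3: global random walk with a Levy flight.
            new = clip([v + alpha * (ub - lb) * levy_step(rng) for v in nests[i]], lb, ub)
            # Step 4: compare with a randomly selected nest j, keep the better one.
            j = rng.randrange(n)
            f_new = f(new)
            if f_new < fit[j]:
                nests[j], fit[j] = new, f_new
        for i in range(n):
            # Step 5: local random walk; H(pa - eps) switches each entry on or off.
            j, k = rng.sample(range(n), 2)
            s = rng.random()
            new = clip([v + alpha * s * (1.0 if pa - rng.random() > 0 else 0.0) * (xj - xk)
                        for v, xj, xk in zip(nests[i], nests[j], nests[k])], lb, ub)
            f_new = f(new)
            if f_new < fit[i]:
                nests[i], fit[i] = new, f_new
    best = min(range(n), key=fit.__getitem__)
    return nests[best], fit[best]
```

Because both walks only accept a candidate when it improves the nest it replaces, the best objective value in the population never gets worse from one iteration to the next.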

Methodology
This section provides a full explanation of the methodology of the proposed MOCS with KNN classifier (MOCS-KNN) for solving the EEG channel selection problem. Overall, the methodology has five phases. Figure 2 shows the flowchart of these phases. Phase I involves the EEG signal acquisition task, which was done using 64 electrodes. Section 5.2 provides more details about this phase. In Phase II, two conventional filters (band-pass and notch filters) were used to remove unwanted artifacts from the original EEG signal, as in [16], and then the wavelet transform was applied to denoise the EEG signal as suggested in [17]. In Phase III, three autoregressive coefficients were extracted from the denoised EEG signal as features, that is, AR5, AR10, and AR20, which are suggested by Rodrigues et al. [1].
Phase IV is the main contribution of this work, where a multiobjective cuckoo search algorithm for EEG channel selection is proposed. The following subsections explain the proposed method in detail.

Formulation of Multiobjective Approach.
In this work, we used a weighted sum approach to implement multiobjective optimization, as suggested by [66]. In the weighted sum approach, the weighting coefficients reflect the preferences among the multiple objectives. Basically, the multiobjective optimization for EEG channel selection can be defined as follows:

$$F(x) = \sum_{k=1}^{N} W_k f_k(x),$$

where $N$ is the number of objective functions and $W_k$ refers to the nonnegative weights. For EEG channel selection, the weighted-sum objective is maximized, where $f_1$ and $f_2$ refer to the accuracy measure (12) and the number of electrodes selected, respectively. $W_1 = 0.8$ and $W_2 = 0.2$ are the weights of $f_1$ and $f_2$.

TABLE 1: Local and global search parameters of the CS algorithm.
Parameter        Description
$x_i^{t+1}$      The next position
$x_k^t$          The current position selected randomly at the position k
$x_j^t$          The current position selected randomly at the position j
$\alpha$         Positive step size scaling factor
$s$              Step size
$\otimes$        Entrywise product of two vectors
$H$              Heaviside function
$p_a$            Used to switch between local and global random walks
$\varepsilon$    Random number from uniform distribution
$L(s, \lambda)$  Lévy distribution, used to define the step size of the random walk

ALGORITHM 1: The CS algorithm pseudocode.
Initialize the parameters of the CS algorithm.
Initialize the host nest population.
Calculate the fitness of each host nest in the population.
while termination criterion is not met do
    Global random walk using (3)
    Update the host nest population
    Local random walk using (5)
end while
Return the best solution.
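As a hedged illustration of the weighted sum, the sketch below combines the two objectives with the weights stated above ($W_1 = 0.8$, $W_2 = 0.2$). The exact form of the channel term is not recoverable from the text, so the sketch assumes it is normalized as `1 - n_selected / n_total`, which rewards smaller subsets when the sum is maximized; the function name `weighted_fitness` is hypothetical.

```python
def weighted_fitness(accuracy, n_selected, n_total=64, w1=0.8, w2=0.2):
    """Weighted-sum fitness to be maximized.

    accuracy   : classification accuracy in [0, 1] (objective f1)
    n_selected : number of selected channels (objective f2)
    The channel term 1 - n_selected / n_total is an ASSUMED normalization.
    """
    if not 0 <= n_selected <= n_total:
        raise ValueError("n_selected must lie between 0 and n_total")
    channel_score = 1.0 - n_selected / n_total
    return w1 * accuracy + w2 * channel_score
```

Under this assumption, two subsets with the same accuracy are ranked by size, and a small loss in accuracy can be offset by a large reduction in channels, which is the trade-off the weights are meant to control.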

Transfer Function.
Since the proposed approach was initially designed to handle continuous-valued optimization problems, we need to map each possible solution onto a binary-valued position (i.e., the EEG channel selection problem requires encoding each possible solution as a binary vector, where "0" means the channel will not be used and "1" the opposite situation) [1, 67]. In order to restrict the search to binary solutions only, we use the so-called "transfer function" V together with a random number $\phi \sim U(0, 1)$. Figure 3 illustrates how to build a binary vector and select the optimal subset of EEG channels using MOCS-KNN. There are three steps that must be considered to select the optimal subset of channels. First, random initialization of the binary vector (representing the EEG channels) is conducted, where "1" means that a given channel will be selected and "0" indicates that the channel will not be selected. Then, MOCS-KNN searches the space to find the optimal subset of channels, i.e., the one that can provide the highest accuracy rate. Finally, we discard all channels with "0" values and keep the remaining ones.
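The mapping from a continuous position to a binary channel mask can be sketched as follows. The specific transfer function is not recoverable from the text, so this sketch assumes a sigmoid (S-shaped) function and sets a bit to 1 when its output exceeds a uniform random number; the names `transfer`, `binarize`, and `selected_channels` are hypothetical.

```python
import math
import random


def transfer(x):
    """ASSUMED S-shaped transfer function (sigmoid), clamped to avoid overflow."""
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Map a continuous position to a 0/1 mask by comparing V(x) with phi ~ U(0, 1)."""
    return [1 if transfer(x) > rng.random() else 0 for x in position]


def selected_channels(mask):
    """Discard channels marked 0 and return the 1-based indices of the rest."""
    return [i + 1 for i, bit in enumerate(mask) if bit == 1]
```

For example, `selected_channels(binarize(position, random.Random(0)))` returns the list of channel numbers that would be kept for classification.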

Cuckoo Search Algorithm for EEG Channel Selection.
MOCS-KNN is a powerful metaheuristic swarm-based optimization algorithm. It has a high ability to explore and exploit the search space of a given problem using its control parameter $p_a$, which switches between local and global random walks. In addition, it can explore the search space using its best three solutions. Therefore, MOCS-KNN is adapted for the EEG channel selection problem in an attempt to find the optimal or near-optimal EEG channel set and achieve the highest accuracy rate. Each solution provided by MOCS-KNN is evaluated based on the objective function (12). The main adaptation steps of MOCS-KNN for the EEG channel selection problem are discussed below.
Step 1: initialize MOCS parameters. The first step of adapting MOCS is initializing the EEG channel selection problem and the MOCS parameters. The EEG channel selection problem parameters are CHn. The MOCS parameters are the minimum (LB) and maximum (UB) ranges for the search agent, which are initialized to 1 and 64, respectively, according to the total number of EEG channels, the number of search agents in the population (N), and the maximum number of iterations (I).
Step 2: initialize MOCS population. In this step, all MOCS solutions are generated randomly to form the population. Each solution represents a cuckoo in MOCS and contains the AR feature's coefficient. Figure 4 presents a solution in the MOCS population (P). P contains 30 solutions as shown in the following equation:

$$P = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_{CHn}^1 \\ x_1^2 & x_2^2 & \cdots & x_{CHn}^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{30} & x_2^{30} & \cdots & x_{CHn}^{30} \end{bmatrix}$$

[Figure 3: Optimal EEG channels selected using the multiobjective approach (binary vector; all EEG channels with value 0 are omitted).]

Step 3: objective function evaluation. As mentioned previously, each solution is evaluated based on the objective function in (8). After this evaluation, the best three solutions will be assigned to best cuckoo.
Step 4: update MOCS-KNN population. The population of the MOCS-KNN method is updated in this step in an attempt to find better solutions and reach the optimal EEG channel set. This update can be done using (4). The updating mechanism starts by generating new solutions.
Step 5: check the stop criterion.
Steps 3 and 4 of MOCS-KNN are repeated until the stop criterion is met.
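The overall loop in Steps 1 to 5 can be sketched as a small driver that is independent of how new solutions are generated. This is an illustrative skeleton only: the fitness is passed in as a function to be maximized, the update of Step 4 is a pluggable callable, and the `flip_one_bit` update shown here is a deliberately simple placeholder rather than the cuckoo search update. All names and default values other than the 64 channels and 30 solutions mentioned above are hypothetical.

```python
import random


def run_channel_search(fitness, update, n_channels=64, n_solutions=30,
                       max_iter=100, seed=0):
    """Generic binary channel-selection loop; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    # Step 2: random binary population, one row per solution.
    population = [[rng.randint(0, 1) for _ in range(n_channels)]
                  for _ in range(n_solutions)]
    # Step 3: evaluate every solution with the (multiobjective) fitness.
    scores = [fitness(sol) for sol in population]
    # Step 5: repeat Steps 3 and 4 until the iteration budget is spent.
    for _ in range(max_iter):
        for i, sol in enumerate(population):
            candidate = update(sol, rng)      # Step 4: generate a new solution
            cand_score = fitness(candidate)   # Step 3: evaluate it
            if cand_score > scores[i]:        # keep it only if it is better
                population[i], scores[i] = candidate, cand_score
    best = max(range(n_solutions), key=scores.__getitem__)
    return population[best], scores[best]


def flip_one_bit(solution, rng):
    """Placeholder update (NOT the cuckoo search update): toggle one random channel."""
    candidate = list(solution)
    j = rng.randrange(len(candidate))
    candidate[j] = 1 - candidate[j]
    return candidate
```

In practice, `update` would be replaced by the Lévy-flight and local random walks followed by binarization, and `fitness` by the weighted sum of KNN accuracy and channel count.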

Result and Discussion
In this section, we discuss the details of the experiments used to assess the robustness of the proposed approach as well as the dataset employed in this work. Parameter setting and experimental setup are later discussed in Section 5.3, while Section 5.4 presents a comparison between the proposed approach, MOCS-KNN, and other metaheuristic algorithms.

The Performance of Traditional EEG Classification Methods.
The main purpose of this section is to provide a brief idea about the performance of the traditional machine learning classification approaches used for the EEG-based personal identification problem. The measurements used to evaluate the performance are the classification accuracy and the area under the curve (AUC). The results obtained are summarized in Table 2 using three datasets. Several traditional EEG classification methods are experimented with: artificial neural networks (ANN), linear support vector machines, support vector machine with radial basis function (RBF-SVM), k-nearest neighbors (k-NN), decision tree (J48), optimum-path forest (OPF), Naive Bayes, and linear discriminant analysis (LDA). Based on the results reported, the KNN is able to achieve better classification accuracy for the EEG-based personal identification problem. Therefore, the KNN is adopted for MOCS-EEG. The area under the curve (AUC) measures and the confusion matrix are also visualized for the KNN results in Figure 5.

Three autoregressive coefficients are used: AR5, AR10, and AR20. To reduce the dispersion of the EEG patterns and to quickly process the extracted features, we compute the mean value of each electrode. Figure 6 shows the distribution of the electrodes of the EEG dataset used in the paper.

Experimental Setup.
In this section, the performance of the proposed channel selection approach and other approaches was evaluated using three EEG signal datasets collected by applying autoregressive (AR) models according to three different coefficients. The solution in all channel selection approaches is represented by a vector that consists of a series of 1's and 0's, where "1" means that the channel is selected and "0" means that the channel is ignored. During the classification process, the dimension of the EEG signal data is reduced so that it is formed solely from the channels endorsed by the channel selection approach, while the channels that were not endorsed are removed from the original dataset. Afterwards, the classification process is applied using 10-fold cross-validation, where the reduced data is divided into 10 parts, one part is used as testing data, and the remaining parts are used as training data. This process is repeated 10 times, with the testing part allocated new samples each time until all samples have been covered. The 10-fold cross-validation process is presented in Figure 7. In this work, the classification accuracy obtained by k-nearest neighbors (KNN) and the number of channels are used to design the multiobjective fitness function. Table 3 shows the parameters used for the selected metaheuristic algorithms in this work. With respect to CS, we need to define β, which is used for the computation of the Lévy distribution, and $p_a$, which stands for the probability of replacing the worst nests with newly constructed ones.
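The evaluation of one candidate channel mask, as described above, can be sketched in plain Python. This is a simplified illustration, not the experimental code: it assumes one feature column per channel, Euclidean distance, and `k=1` neighbors (the value of k is not stated in this section), and the function names are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest training samples (squared Euclidean distance)."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, mask, n_folds=10, k=1, seed=0):
    """Mean accuracy of KNN over n_folds folds, using only channels where mask == 1."""
    cols = [j for j, bit in enumerate(mask) if bit == 1]
    if not cols:
        raise ValueError("at least one channel must be selected")
    # Keep only the endorsed channels; the others are removed from the data.
    reduced = [[row[j] for j in cols] for row in X]
    order = list(range(len(reduced)))
    random.Random(seed).shuffle(order)
    folds = [order[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_x = [reduced[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            # Each sample is tested exactly once, in the fold that holds it out.
            correct += knn_predict(train_x, train_y, reduced[i], k) == y[i]
    return correct / len(reduced)
```

The returned accuracy is the quantity that would feed the accuracy term of the multiobjective fitness function, with the number of 1's in `mask` feeding the channel term.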

Comparison between MOCS-KNN and Other Metaheuristic Algorithms.
Since the proposed approach is nondeterministic, we computed the mean accuracy rate over 25 runs to avoid biased results. The experiments were performed on a Lenovo PC with an Intel Core i7 2.59 GHz processor, 12 GB of RAM, and Windows 10. The performance of MOCS-KNN is evaluated using five measures: (i) accuracy (EEG_ACC), (ii) Specificity (EEG_Precision), (iii) F-score (EEG_Fscore), (iv) Recall (EEG_Recall), and (v) the number of channels selected (EEG_Len). In these measures, FR, FA, TA, and TR represent the false rejection, false acceptance, true acceptance, and true rejection, respectively. Figures 8-10 show the boxplots and convergence rates over 25 runs for the selected metaheuristic algorithms during the experimental evaluation using AR5, AR10, and AR20, respectively. The boxplot components are defined as follows: the box length illustrates the interquartile range, the whiskers indicate the range of the values, the horizontal line in the box indicates the median value, and the outliers are represented by circles. The boxplots reveal that MOCS-KNN managed to yield highly accurate results. As shown in Figures 8-10, for the AR5 and AR20 datasets, MOCS-KNN shows superior efficacy in convergence trends compared to the other metaheuristic algorithms. For the AR10 dataset, MOCS-KNN and MOMVO-KNN show competitive efficacy in the early stage of convergence, but in the later stages, MOCS-KNN surpasses MOMVO-KNN. Overall, MOCS-KNN shows improved convergence because it is able to guide its swarm along the best trajectory toward the global optimal solution while avoiding stagnation.
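For reference, the measures above can be computed from the four counts TA, TR, FA, and FR as sketched below. The formulas used in this paper are not recoverable from the text, so the sketch assumes the standard definitions; because the measure labeled "Specificity (EEG_Precision)" is ambiguous, both specificity and precision are returned. The function name is hypothetical.

```python
def identification_measures(ta, tr, fa, fr):
    """Standard measures from true/false acceptance and rejection counts.

    ASSUMED definitions (not taken from the paper); raises ZeroDivisionError
    if a denominator is zero.
    """
    precision = ta / (ta + fa)
    recall = ta / (ta + fr)
    return {
        "accuracy": (ta + tr) / (ta + tr + fa + fr),
        "precision": precision,
        "specificity": tr / (tr + fa),
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }
```

With counts accumulated over the cross-validation folds, the same function yields every reported measure except the number of selected channels, which is read directly from the binary solution vector.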
To further validate the results obtained by MOCS-KNN and other methods, the Wilcoxon signed-rank statistical test [70] is adopted in this study to show whether there is a statistically significant difference between these methods. In Table 4, the Z-value represents the standardized test statistic, and the P-value represents the statistical significance (P < 0.05). A P-value < 0.05 means that there is a statistically significant difference between the compared methods.
The comparison includes five measures, which are accuracy ratio (EEG_ACC), channels selected (EEG_Len), Specificity (EEG_Precision), F-score (EEG_Fscore), and Recall (EEG_Recall). Table 5 shows the results of the proposed technique (MOCS-KNN) and the other metaheuristic algorithms using three different autoregressive coefficients. Overall, it is worth mentioning that MOCS-KNN obtained the best results on all measurements for all datasets.
To be more precise, in terms of the number of channels, the results of MOCS-KNN and MOWOA-KNN are equal: MOCS-KNN obtained the minimum number of channels in AR20 (24), MOWOA-KNN obtained the minimum number of channels in AR10 (25), and both methods reduced the number of channels in AR5 to the smallest value (24). To gain a clear overview of the performance of MOCS-KNN and the other methods, the summation of ranks is applied to the AR5, AR10, and AR20 datasets for all measurements, as shown in Table 6. In the summation of ranks procedure, the method that achieved the best result is given a rank of "1," the second best method is given a rank of "2," and so on. The summation of ranks in the last row of Table 6 represents the sum of the ranks of each method over the corresponding datasets. The bold font highlights the best result. The results suggest that MOCS-KNN achieved the best performance in all evaluation measurements, followed by MOMVO, MOPSO, MOWOA, MOFFA, MOMFO, and MOGWO.
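The summation of ranks procedure can be sketched as follows. This is an illustrative sketch only: the function name and the input layout are hypothetical, ties are broken by insertion order because the text does not say how they are handled, and any numbers passed to it in an example would be made up rather than taken from Table 6.

```python
def rank_sum(scores, higher_is_better):
    """Sum of per-measure ranks for each method (lower total is better).

    scores           : {measure: {method: value}}
    higher_is_better : {measure: bool}, e.g. True for accuracy and
                       False for the number of selected channels
    """
    totals = {}
    for measure, by_method in scores.items():
        # Best method first: descending for accuracy-like measures,
        # ascending for measures to be minimized.
        ordered = sorted(by_method, key=by_method.get,
                         reverse=higher_is_better[measure])
        for rank, method in enumerate(ordered, start=1):
            totals[method] = totals.get(method, 0) + rank
    return totals
```

The method with the smallest total is the one ranked best overall, which corresponds to the bold entry in the last row of the rank table.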

Conclusion and Future Work
In this work, we proposed a binary version of a multiobjective approach using several metaheuristic algorithms with the aim of addressing the challenge of channel selection in EEG-based biometric person identification. The main purpose of this work is to demonstrate that not all available EEG channels need to be used to achieve a high accuracy rate. Therefore, we model channel selection as an optimization problem, in which the recognition ratio of a channel subset over a validation set is used as the fitness function. The experimental outcomes showed that the introduced method outperformed several metaheuristic algorithms and the one proposed by Rodrigues et al. [1]. It is worth noting that, while retaining high accuracy rates, the number of sensors has been reduced by half. Additionally, the outcomes displayed a positive correlation between the number of features obtained from the EEG signal and the accuracy ratio; i.e., more features lead to higher accuracy rates. This finding suggests that the proposed algorithm has the potential to remove duplicate and undesirable features while retaining specific features.
Regarding future work, we intend to evaluate the selected metaheuristic algorithms over different features, such as time- and frequency-domain information, to improve the overall identification performance while selecting fewer channels.