Arithmetic Optimization with RetinaNet Model for Motor Imagery Classification on Brain Computer Interface

Brain-computer interface (BCI) technology is commonly used to enable communication for people with movement disabilities. It allows a person to communicate and control assistive robots using electroencephalogram (EEG) or other brain signals. Although several approaches for learning EEG signal features are available in the literature, deep learning (DL) models need further exploration to generate novel representations of EEG features and achieve enhanced outcomes for motor imagery (MI) classification. With this motivation, this study designs an arithmetic optimization with RetinaNet based deep learning model for MI classification (AORNDL-MIC) technique on BCIs. The proposed AORNDL-MIC technique first applies the Multiscale Principal Component Analysis (MSPCA) approach for EEG signal denoising, and the Continuous Wavelet Transform (CWT) is then used to transform the 1D EEG signal into a 2D time-frequency-amplitude representation, which makes it possible to use the DL model via a transfer learning approach. In addition, the DL-based RetinaNet is applied to extract feature vectors from the EEG signal, which are then classified with the ID3 classifier. To optimize the classification efficiency of the AORNDL-MIC technique, the arithmetic optimization algorithm (AOA) is employed for hyperparameter tuning of the RetinaNet. The experimental analysis of the AORNDL-MIC algorithm on benchmark data sets shows promising performance compared with recent state-of-the-art methodologies.


Introduction
A brain-computer interface (BCI) is a technology that permits us to communicate with a computer, whereby the device predicts abstract aspects of cognitive states from brain signals such as electroencephalography (EEG). It is also called a brain-machine interface (BMI) and is commonly associated with AI-enabled approaches that permit the user to harness brain signals [1]. EEG is a noninvasive approach that gathers brain oscillatory activation patterns from the scalp. The human brain produces electrical signals that can be identified using EEG. Therefore, it is a highly reliable and applicable method for receiving control commands for BCI [2]. Studies that apply artificial intelligence (AI) techniques to EEG signals recorded while imagining limb or finger movement, widely called motor imagery (MI), have been reported [3]. An effective BCI scheme has two fundamental needs: an efficient set of EEG features capable of differentiating task-induced brain activities, and an effective machine learning (ML) method for classifying the extracted features. The aim is to identify the MI-task-induced EEG patterns [4,5]. Most BCI systems involve filtering or preprocessing to remove undesirable components embedded in the EEG signals, which can lead to wrong conclusions and bias the analysis of the EEG [6]. Appropriate preprocessing within the BCI scheme yields cleaner EEG signals, thus enhancing the classification outcomes. One study focuses on a quantum-mechanics-inspired preprocessing phase within the BCI scheme that extracts further information from the noisy EEG signal and leads to increased classification performance when the signal is categorized using multiple classification methods [7]. In particular, SVM is widely employed for MI classification in BMI. Imagery signal classification is performed by the LR method. KNN is utilized in seizure detection, whereas NB is utilized for detecting lower limb movement by analyzing EEG signals [8]. At the same time, DT is primarily utilized for hand amplitude modulation and movement interpretation from spatial activity. Deep learning (DL) methods could considerably simplify the processing pipeline and allow automated end-to-end training of preprocessing, feature retrieval, and classification models [9], while delivering better performance on the target task.
The rest of the paper is organized as follows. Section 2 reviews related works, Section 3 presents the proposed model, Section 4 discusses performance validation, and Section 5 draws conclusions.

Related Works
Zhang et al. [10] developed and validated a DL-based algorithm for automatically recognizing two distinct MI states by selecting the relevant EEG channels. It employs an automated channel selection (ACS) approach. Furthermore, they proposed a CNN method that fully exploits the time-frequency features and therefore outperforms conventional classification methods in terms of robustness and accuracy. Kant et al. [11] present an integration of DL-based transfer learning (TL) and CWT for solving these problems. The CWT transforms the 1D EEG signal into a 2D time-frequency-amplitude representation, which enables the use of deep networks via the TL method. Corsi et al. [12] adopted a fusion technique that integrates features from simultaneously recorded MEG and EEG signals to enhance classification performance in MI-based BCI. Thomas et al. [13] introduce a discriminative filter bank (FB) common spatial pattern model for extracting FB features for the classification of MI. The presented model improves classifier performance on BCI datasets.
Dong et al. [14] proposed a hierarchical SVM (HSVM) approach for an EEG-based 4-class MI classification task. The wavelet packet transform is applied to decompose the original EEG signal. The EEG feature vector is extracted, and a two-layer HSVM is developed to classify it, with an "OVO" classifier in the first layer and an "OVR" classifier in the second layer. Zhang et al. [15] proposed a "brain-ID" architecture based on a hybrid DNN (HDNN) using the TL method to handle individual differences in 4-class MI tasks. A dedicated HDNN is designed to learn the common features of MI signals. The suggested algorithm comprises LSTM and CNN models that decode the spatiotemporal features of the MI signal. Zhang et al. [16] introduce 5 schemes for adapting a DCNN-based EEG-BCI system for decoding hand MI. Each scheme starts from a pretrained model and adapts it to improve efficiency.

The Proposed Model
In this study, an AORNDL-MIC approach is introduced to classify MI on BCIs. The proposed AORNDL-MIC technique encompasses a series of operations, namely MSPCA based denoising, CWT based decomposition, RetinaNet based feature extraction, AOA based hyperparameter tuning, and ID3 based classification.

Data Preprocessing.
Initially, data preprocessing takes place in two stages, namely MSPCA based noise removal and CWT based decomposition. Consider a measurement data set with $m$ sensors, so that each measurement is $x \in \mathbb{R}^m$. Each sensor contributes $n$ samples, which are assembled into a data matrix $X \in \mathbb{R}^{n \times m}$ [2]. Each column of $X$ represents a measurement variable, and each row of $X$ denotes a sample. The PCA model starts by normalizing all samples of $X$ and computing the covariance matrix of $X$. $X$ is then decomposed into its principal components, where $P \in \mathbb{R}^{m \times A}$ holds the first $A$ eigenvectors of $\mathrm{cov}(X)$. After the eigendecomposition, the eigenvalues are sorted from largest to smallest. $A$ is the number of principal components and equals the number of columns of $T$, where $T \in \mathbb{R}^{n \times A}$ is a matrix whose columns are called the principal component variables. Here, $\lambda_1, \lambda_2, \ldots, \lambda_A$ denote the $A$ largest eigenvalues of the covariance matrix of $X$ (equation (4)).
Journal of Healthcare Engineering
In this study, the wavelet transform is integrated into the PCA model to create MSPCA for denoising the incoming signal. MSPCA combines the ability of PCA to extract covariance among variables with the ability of orthonormal wavelets to separate a signal into scales, so the capability of PCA is improved by multiscale analysis [17]. It finds linearly correlated wavelet coefficients in the multilevel sub-bands obtained by the wavelet transform, and it represents every sub-band with fewer features while eliminating the autocorrelated coefficients. The signal is then reconstructed by wavelet synthesis. This removes unnecessary noise from the received signals and yields simpler, denoised versions of the signals.
The denoised signal can then be represented as a scalogram, given by the absolute value of the CWT of the signal. An MI signal is a gradually changing event punctuated by abrupt transients, with features occurring at distinct scales. The scalogram therefore offers better time localization for high-frequency, short-duration events and better frequency localization for low-frequency, long-duration events.
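The MSPCA idea described in this section can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and NumPy only; the function names (`haar_dwt`, `haar_idwt`, `pca_project`, `mspca_denoise`), the wavelet choice, and the number of retained components are illustrative assumptions and are not taken from the study, which does not state these settings.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar transform along axis 0 (number of rows must be even)."""
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-frequency sub-band
    detail = (even - odd) / np.sqrt(2.0)   # high-frequency sub-band
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt (wavelet synthesis)."""
    n = approx.shape[0]
    x = np.empty((2 * n,) + approx.shape[1:], dtype=float)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def pca_project(X, n_components):
    """Keep the top principal components of X (rows = samples, cols = channels)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)              # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    P = eigvecs[:, order]                       # loadings, m x A
    T = Xc @ P                                  # scores, n x A
    return T @ P.T + mean                       # reconstruction from A components

def mspca_denoise(X, n_components=1):
    """Apply PCA separately in each wavelet sub-band, then reconstruct."""
    approx, detail = haar_dwt(X)
    approx_hat = pca_project(approx, n_components)
    detail_hat = pca_project(detail, n_components)
    return haar_idwt(approx_hat, detail_hat)

# Toy 3-channel example (synthetic data, not EEG from the study).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
clean = np.outer(np.sin(2 * np.pi * 5 * t), [1.0, 0.8, 0.6])
noisy = clean + 0.3 * rng.standard_normal(clean.shape)
denoised = mspca_denoise(noisy, n_components=1)
```

A full MSPCA would use several decomposition levels and a data-driven choice of the number of components per sub-band; the single level here only keeps the sketch short.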

RetinaNet Based Feature Extraction.
Following the data preprocessing phase, the AORNDL-MIC technique uses the RetinaNet model as a feature extractor. RetinaNet comprises a residual network (ResNet), a feature pyramid network (FPN), and two fully convolutional networks (FCNs). ResNet is available with different numbers of layers. The key idea of ResNet is residual learning (RL), which enables the raw input to be passed directly to succeeding layers. The widely used variants have 50, 101, and 152 layers. This study chooses the 101-layer variant for its training efficiency [18]. The features of the input image are extracted with ResNet and then passed to the following subnetworks. FPN is an approach for effectively extracting features at all scales of an image with a conventional CNN. Figure 1 illustrates the overview of CNN. First, a single-scale image is given as input to ResNet. Next, starting from a later layer of the convolutional network, the features of each layer are selected by the FPN and then combined to generate the final output. The class subnetwork in the FCN performs the classification. Focal loss is a modified form of the binary cross-entropy loss. The cross-entropy loss is
$$\mathrm{CE}(p, y) = \begin{cases} -\log(p), & y = 1, \\ -\log(1 - p), & \text{otherwise}, \end{cases}$$
where $y \in \{\pm 1\}$ denotes the ground-truth class and $p \in [0, 1]$ is the probability predicted by the model for $y = 1$.
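With
$$p_t = \begin{cases} p, & y = 1, \\ 1 - p, & \text{otherwise}, \end{cases}$$
the cross-entropy loss is abbreviated as
$$\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t).$$
To address the data imbalance between negative and positive instances, this form is changed into
$$\mathrm{CE}(p_t) = -\alpha_t \log(p_t), \qquad \alpha_t = \begin{cases} \alpha, & y = 1, \\ 1 - \alpha, & \text{otherwise}, \end{cases}$$
where $\alpha \in [0, 1]$ is the weight factor. To handle hard samples, the parameter $\gamma$ is introduced to obtain the final form of the focal loss [19]:
$$\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t).$$
Figure 2 illustrates the structure of RetinaNet.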
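As an illustration of the focal loss discussed above, the following NumPy sketch evaluates the loss per sample for labels in {+1, -1}. The function name `focal_loss` and the default values `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration; the study does not report the values it used.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Per-sample focal loss.

    p     : predicted probability of the positive class (y = 1)
    y     : ground-truth label, +1 or -1
    alpha : weight factor in [0, 1] (assumed default)
    gamma : focusing parameter, gamma >= 0 (assumed default)
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)               # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# A confident correct prediction is down-weighted much more than a hard one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma=0` the expression reduces to the weighted cross-entropy, which is a quick way to check an implementation.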
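
AOA Based Hyperparameter Tuning.
Since the hyperparameters of the RetinaNet model influence the overall classification results of the AORNDL-MIC technique, the AOA is utilized to tune them. Like other metaheuristic (MH) approaches, the AOA consists of exploration and exploitation phases, which are inspired by the arithmetic operators −, +, ×, and ÷. First, the AOA generates a set of $N$ solutions [20]. The population $X$ of solutions (agents) is represented as
$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \cdots & x_{N,n} \end{bmatrix},$$
where $n$ is the number of parameters. Then, the fitness of each solution is calculated to find the best one, $X_b$. Depending on the value of the Math Optimizer Accelerated (MOA) function, the AOA performs either exploration or exploitation. MOA is updated by
$$\mathrm{MOA}(t) = \mathrm{Min}_{\mathrm{MOA}} + t \times \frac{\mathrm{Max}_{\mathrm{MOA}} - \mathrm{Min}_{\mathrm{MOA}}}{M_t}, \tag{11}$$
where $t$ is the current iteration, $M_t$ is the total number of iterations, and $\mathrm{Min}_{\mathrm{MOA}}$ and $\mathrm{Max}_{\mathrm{MOA}}$ are the minimal and maximal values of the accelerated function, respectively. The division (D) and multiplication (M) operators are applied in the exploration stage of the AOA as follows:
$$x_{i,j}(t+1) = \begin{cases} X_{b,j} \div (M_{OP} + \epsilon) \times \big((UB_j - LB_j) \times \mu + LB_j\big), & r_2 < 0.5, \\ X_{b,j} \times M_{OP} \times \big((UB_j - LB_j) \times \mu + LB_j\big), & \text{otherwise}, \end{cases}$$
where $\epsilon$ is a small value, $LB_j$ and $UB_j$ are the lower and upper limits of the search space for the $j$th parameter, $r_2$ is a random value in $[0, 1]$, and $\mu = 0.5$ is the control parameter. Furthermore, the Math Optimizer probability ($M_{OP}$) is determined by
$$M_{OP}(t) = 1 - \frac{t^{1/\alpha}}{M_t^{1/\alpha}}, \tag{13}$$
where $\alpha = 5$ is the dynamic parameter that defines the accuracy of the exploitation stage. Additionally, the subtraction (S) and addition (A) operators are employed in the AOA exploitation phase as follows:
$$x_{i,j}(t+1) = \begin{cases} X_{b,j} - M_{OP} \times \big((UB_j - LB_j) \times \mu + LB_j\big), & r_3 < 0.5, \\ X_{b,j} + M_{OP} \times \big((UB_j - LB_j) \times \mu + LB_j\big), & \text{otherwise}, \end{cases}$$
where $r_3$ is a random value in $[0, 1]$. Next, the agent updating procedure is executed by the AOA operators [21]. In summary, Algorithm 1 lists the steps of the AOA.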
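The AOA loop described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `aoa_minimize`, the MOA range (`moa_min=0.2`, `moa_max=1.0`), the greedy acceptance of improved candidates, the clipping to the search bounds, and the toy objective that stands in for the RetinaNet validation error are all assumptions. The values `alpha=5` and `mu=0.5` follow the text.

```python
import numpy as np

def aoa_minimize(objective, lb, ub, n_agents=20, n_iters=100,
                 moa_min=0.2, moa_max=1.0, alpha=5.0, mu=0.5,
                 eps=1e-12, seed=0):
    """Minimal arithmetic optimization algorithm (minimization)."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.size
    X = rng.uniform(lb, ub, size=(n_agents, dim))      # initial population
    fit = np.array([objective(x) for x in X])
    b = int(np.argmin(fit))
    best_x, best_f = X[b].copy(), float(fit[b])
    history = [best_f]                                 # best fitness per iteration
    scale = (ub - lb) * mu + lb                        # shared step term
    for t in range(1, n_iters + 1):
        moa = moa_min + t * (moa_max - moa_min) / n_iters
        mop = 1.0 - t ** (1.0 / alpha) / n_iters ** (1.0 / alpha)
        for i in range(n_agents):
            r1, r2, r3 = rng.random((3, dim))
            # Exploration: division or multiplication around the best agent.
            explore = np.where(r2 < 0.5,
                               best_x / (mop + eps) * scale,
                               best_x * mop * scale)
            # Exploitation: subtraction or addition around the best agent.
            exploit = np.where(r3 < 0.5,
                               best_x - mop * scale,
                               best_x + mop * scale)
            cand = np.clip(np.where(r1 > moa, explore, exploit), lb, ub)
            f = objective(cand)
            if f < fit[i]:                             # greedy acceptance (assumption)
                X[i], fit[i] = cand, f
                if f < best_f:
                    best_x, best_f = cand.copy(), float(f)
        history.append(best_f)
    return best_x, best_f, history

# Hypothetical stand-in for a validation-error surface over two hyperparameters.
def toy_objective(x):
    return float(np.sum((x - 1.0) ** 2))

best_x, best_f, history = aoa_minimize(toy_objective, lb=[-5.0, -5.0], ub=[10.0, 10.0])
```

In a tuning setting, `objective` would train or evaluate the model for a candidate hyperparameter vector and return its error, which makes each call expensive; small populations and iteration counts are then the practical choice.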

ID3 Based Classification.
Lastly, the ID3 model receives the feature vector as input and carries out the classification.
The ID3 technique selects test attributes by computing and comparing their information gain (IG). Let $S$ be the set of data instances. Suppose the class attribute $C$ has $m$ distinct values that denote $m$ class labels $C_i$ ($i = 1, 2, \ldots, m$), and let $S_i$ be the number of instances in class $C_i$ ($i = 1, 2, \ldots, m$). The expected information needed to classify $S$ is given in equation (15):
$$I(S_1, S_2, \ldots, S_m) = -\sum_{i=1}^{m} p_i \log_2 p_i, \tag{15}$$
where $p_i$ is the probability that a sample from $S$ belongs to class $C_i$. $I(S_1, S_2, \ldots, S_m)$ is the average information needed to identify the class label of every instance in $S$. Let the attribute $A$ have $v$ distinct values $a_1, a_2, \ldots, a_v$ in the training data set $S$. When $A$ is a nominal attribute, it separates $S$ into $v$ subsets $S_1, S_2, \ldots, S_v$, where $S_j = \{s \in S \mid A = a_j\}$, $j \in \{1, 2, \ldots, v\}$, is the subset of $S$ whose samples share the value $a_j$ on $A$. However, instances in $S_j$ can have different class labels [22]. Let $s_{ij}$ be the number of instances with class label $C_i$ in the subset $S_j$. The information (i.e., entropy) required after splitting the training data set $S$ on attribute $A$ is measured by (16):
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{|S|} \, I(s_{1j}, s_{2j}, \ldots, s_{mj}). \tag{16}$$
The smaller the required information, the higher the purity of the sub-datasets. Here,
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij},$$
where $p_{ij}$ is the probability that an instance of $S_j$ belongs to class $C_i$, and $I(s_{1j}, s_{2j}, \ldots, s_{mj})$ is the average information needed to identify the class labels of every instance in $S_j$. The IG of $A$ is determined as
$$\mathrm{Gain}(A) = I(S_1, S_2, \ldots, S_m) - E(A),$$
that is, the original information requirement (based only on the classes) minus the new information requirement (after the split on attribute $A$). The attribute with the maximal $\mathrm{Gain}(A)$ is selected as the test attribute and assigned to an internal node of the DT. In this way, the information required to classify the samples is minimal.
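The attribute selection step of ID3 can be sketched in a few lines of standard-library Python. The helper names (`entropy`, `information_gain`, `best_attribute`) and the toy attribute values are hypothetical. ID3 works on nominal attributes, so the continuous RetinaNet feature vectors would presumably need to be discretized first, which the study does not describe.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Expected information I(S_1, ..., S_m) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Gain(A) = I(S) - E(A) for one nominal attribute."""
    total = len(labels)
    subsets = {}
    for v, c in zip(values, labels):
        subsets.setdefault(v, []).append(c)       # S_j: labels of samples with A = a_j
    expected = sum(len(s) / total * entropy(s) for s in subsets.values())  # E(A)
    return entropy(labels) - expected

def best_attribute(rows, labels):
    """Return the attribute name with the largest information gain."""
    return max(rows[0].keys(),
               key=lambda a: information_gain([r[a] for r in rows], labels))

# Toy data: attribute "a" separates the classes perfectly, attribute "b" not at all.
rows = [{"a": "x", "b": "u"}, {"a": "x", "b": "v"},
        {"a": "y", "b": "u"}, {"a": "y", "b": "v"}]
labels = ["left", "left", "right", "right"]
chosen = best_attribute(rows, labels)
```

Building the full tree repeats this selection recursively on each subset until the subsets are pure or no attributes remain.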

Results and Discussion
The performance of the AORNDL-MIC technique was validated on two datasets: BCI competition 2003 dataset III and BCI competition IV data set 2b. The BCI competition 2003 dataset III [23] comprises 3-channel EEG data from a healthy female subject during imagination of right- and left-hand movements. The data were recorded over the motor cortex area of the brain using 3 electrodes (C3, Cz, and C4) during motor imagery of right- or left-hand movement. Each trial contains 9 seconds of data for the channels C3, Cz, and C4. The dataset holds 280 trials, of which 140 were available with labels and the other 140 were used for validation. The BCI competition IV data set 2b comprises nine subjects, each with 5 sessions of motor imagery experiments; the first 2 sessions were recorded without feedback and the remaining 3 sessions with online feedback [24].
[Algorithm 1 (fragment of the listing): update the MOA and M_OP using equations (11) and (13).]
Kappa values are reported in Figure 6 and Table 2. The results show that the SqueezeNet, ResNet50, GoogleNet, DenseNet201, ResNet18, and ResNet101 techniques yielded lower kappa values of 57%, 41%, 44%, 36%, 29%, and 30%, respectively. The VGG19, AlexNet, and VGG16 models yielded higher kappa values of 91%, 87%, and 90%, respectively. The proposed AORNDL-MIC technique achieved the highest kappa value of 94.84%.
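The text does not state how the kappa values are computed. Assuming Cohen's kappa is meant, which is the usual choice for BCI competition data, it can be obtained from a confusion matrix as in the sketch below. The function name `cohen_kappa` and the example matrix are illustrative and are not results from the study.

```python
import numpy as np

def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows = true, cols = predicted)."""
    c = np.asarray(confusion, dtype=float)
    total = c.sum()
    p_observed = np.trace(c) / total                               # overall accuracy
    p_chance = (c.sum(axis=0) * c.sum(axis=1)).sum() / total ** 2  # expected agreement
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical two-class example: 90% accuracy on balanced classes gives kappa = 0.8.
kappa = cohen_kappa([[45, 5], [5, 45]])
```

For a balanced two-class problem the chance agreement is 0.5, so kappa equals 2 × accuracy − 1; a kappa of 94.84% would then correspond to an accuracy of about 97.4%.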

Result Analysis on BCI Competition 2003 III Dataset.
A comparative study of the AORNDL-MIC method with recent algorithms on the BCI competition 2003 dataset III is given in Table 3.

Result Analysis on BCI Competition IV Data Set 2b Dataset.
The classification results of the AORNDL-MIC method on the BCI competition IV data set 2b under several subjects and runs are shown in Table 4 and Figure 8. The experimental values indicate that the AORNDL-MIC algorithm achieved average accuracies of 85.33%, 84.22%, 90.11%, 87.11%, and 85.89% under runs 1-5, respectively. The average classification results of the AORNDL-MIC method for each subject are portrayed in Figure 9.
The results show that the AORNDL-MIC system achieved average accuracies of 81.20% under S-1, 87.20% under S-2, 84.60% under S-3, 91.60% under S-4, and so on. Table 5 and Figure 10 provide a comparative study of the AORNDL-MIC system with current methodologies in terms of accuracy.
The experimental results indicate that the AORNDL-MIC technique is competitive with the other methodologies across subjects, although it is not the most accurate for every subject. For instance, with S-1, the AORNDL-MIC algorithm achieved an accuracy of 81.20%, whereas the CSP, FBCSP MIBIF, FBCSP MIRSR, and FDBN techniques attained lower accuracies of 66%, 68%, 70%, and 81%, respectively. With S-5, the AORNDL-MIC approach reached an accuracy of 85.80%, whereas the CSP, FBCSP MIBIF, FBCSP MIRSR, and FDBN methods attained 77%, 93%, 93%, and 93%, respectively. With S-9, the AORNDL-MIC approach reached an accuracy of 87.60%, whereas the CSP, FBCSP MIBIF, FBCSP MIRSR, and FDBN methods attained 83%, 88%, 87%, and 91%, respectively. To confirm the improvement of the AORNDL-MIC model, an average accuracy analysis is also given in Figure 11. From the figure, it is apparent that the CSP and FBCSP MIBIF techniques reached lower average accuracies of 76.33% and 79.56%, respectively. The FBCSP MIRSR and FDBN systems resulted in moderately higher average accuracies of 80.22% and 84.22%, respectively. However, the AORNDL-MIC approach achieved the highest average accuracy of 86.53%. These results indicate that the AORNDL-MIC approach outperforms the other methodologies on average.

Conclusion
In this study, an AORNDL-MIC system was developed to classify MI on BCIs. The proposed AORNDL-MIC technique encompasses a series of operations, namely MSPCA based denoising, CWT based decomposition, RetinaNet based feature extraction, AOA based hyperparameter tuning, and ID3 based classification. The AOA is employed to tune the hyperparameters of RetinaNet and thereby improve the classification performance of the AORNDL-MIC technique. To assess the AORNDL-MIC method, a number of experiments were performed and the outcomes were examined under different aspects. The experimental results of the AORNDL-MIC algorithm on the benchmark datasets show promising outcomes compared with current state-of-the-art approaches. In the future, hybrid DL models can be utilized to boost the efficacy of the MI classification process.

Data Availability
Data sharing is not applicable to this article as no datasets were generated during the current study.

Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.

Consent
Not applicable.

Conflicts of Interest
The authors declare that they have no conflicts of interest.