Analysis of Different Classification Techniques for Two-Class Functional Near-Infrared Spectroscopy-Based Brain-Computer Interface

We analyse and compare the classification accuracies of six different classifiers for a two-class mental task (mental arithmetic and rest) using functional near-infrared spectroscopy (fNIRS) signals. The signals of the mental arithmetic and rest tasks from the prefrontal cortex region of the brain for seven healthy subjects were acquired using a multichannel continuous-wave imaging system. After removal of the physiological noises, six features were extracted from the oxygenated hemoglobin (HbO) signals. Two- and three-dimensional combinations of those features were used for classification of mental tasks. In the classification, six different modalities, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbour (kNN), the Naïve Bayes approach, support vector machine (SVM), and artificial neural networks (ANN), were utilized. With these classifiers, the average classification accuracies among the seven subjects for the 2- and 3-dimensional combinations of features were 71.6, 90.0, 69.7, 89.8, 89.5, and 91.4% and 79.6, 95.2, 64.5, 94.8, 95.2, and 96.3%, respectively. ANN showed the maximum classification accuracies: 91.4 and 96.3%. To validate the results, a statistical significance test was performed on the HbO-based accuracies, which confirmed that the differences between ANN and each of the other classifiers were statistically significant (p < 0.005).

Although fMRI and EEG have shown positive developments for rehabilitation of patients suffering from different motor disabilities, for example, amyotrophic lateral sclerosis (ALS), locked-in syndrome (LIS), and other physical disabilities, fMRI machines are quite expensive as well as heavy, rendering them infeasible for the purposes of portable BCI systems [22]. More recently, alternative fNIRS-based BCI systems have been widely used due to their well-balanced spatial and temporal resolution, safety, ease of use (portability), and lower susceptibility to gross electrophysiological artifacts caused by eye blinks, eyeball movements, and muscle activity [23]. Indeed, over the past few decades, fNIRS-based BCI systems have shown promising results in becoming an effective medium of communication for patients with disabilities [18].
Near-infrared spectroscopy (NIRS) functions by utilizing the near-infrared (NI) spectrum of light (wavelength 600∼1000 nm) to measure the hemodynamic response represented by oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR), after which the modified Beer-Lambert law is used to determine the changes in the HbO and HbR concentrations ($\Delta c_{\mathrm{HbO}}(t)$ and $\Delta c_{\mathrm{HbR}}(t)$, resp.) [24][25][26][27][28]. Jobsis first introduced, in 1977, the principle of near-infrared spectroscopy [29], which entails the use of emitters and detectors separated by a distance of 3∼4 cm. The distance is critical, as a small distance (1 cm) contains only a skin-layer contribution, while a large distance (5 cm) can result in low-quality and undesirable signals [23].
In fNIRS-based BCI studies, various mental tasks like motor imagery [15,16], music imagery [17,30-32], mental arithmetic (MA) tasks [17,33,34], object rotation [34][35][36][37], and others [38][39][40][41] have been used to acquire maximum classification accuracies that facilitate communication with patients suffering from LIS and ALS. In an fNIRS-based BCI system, the prefrontal cortex of the brain plays an important role in the acquisition of fine signals, for two specific reasons: usually, it is not involved in motor disabilities, and its hair-free region enhances signal strength and penetration depth [24]. After acquiring brain signals using an fNIRS-based BCI system, the first step is to eliminate physiological noises using different kinds of filters [42], the next step is to extract the features from the signals, and the final step is to apply classification techniques to acquire the maximum accuracy for the specified task.
In recent decades, various classification schemes have been used in the fNIRS-based BCI area to classify different mental tasks and, thus, acquire maximum classification accuracies, thereby improving the quality and effectiveness of communication with patients suffering afflictions such as ALS and LIS [30,33,34,43-45]. In this study, we acquired mental arithmetic (MA) task versus rest signals from the prefrontal cortex of the brain, after which we removed the signals' physiological noises using a 4th-order Butterworth band-pass filter [18,19,46]. Subsequently, those filtered signals were utilized to calculate the different combinations of the statistical properties of the time-domain signals. Then, after obtaining the features, we employed different types of classifiers, that is, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbour (kNN), Naïve Bayes, support vector machine (SVM), and artificial neural networks (ANN), to acquire maximum classification accuracies across all of the subjects using $\Delta c_{\mathrm{HbO}}(t)$ signals. By using 2-dimensional $\Delta c_{\mathrm{HbO}}(t)$ feature combinations with those classifiers, the classification accuracies were 71.6 ± 1.1, 90.0 ± 1.3, 69.7 ± 0.5, 89.8 ± 1.4, 89.5 ± 1, and 91.4 ± 0.8, respectively, and using the 3-dimensional feature combinations, the classification accuracies were 79.6 ± 1.5, 95.2 ± 1, 64.5 ± 0.3, 94.8 ± 1.2, 95.2 ± 0.7, and 96.3 ± 0.3, respectively.

Subjects.
Seven healthy subjects participated in the experiment. All of them had normal vision and no history of any physical, mental, or psychological disorder. The experiments were conducted in accordance with the latest Declaration of Helsinki, and verbal consent was obtained from all of the subjects after explaining the experimental paradigm.

Experimental Paradigm.
The subjects were seated in a quiet room on a comfortable chair in front of a computer monitor. They were asked to relax and to restrict their motor motions before the start of the experimental paradigm. The subjects were asked to rest and then to perform a mental arithmetic task, as shown in Figure 1(a). Specifically, each subject first rested for 44 s to adjust the baseline correction of the signals and then performed a mental arithmetic task for 44 s; this paradigm was repeated five times. The total length of the experiment was 440 s for each subject. The 44 s task-rest periods are longer than the conventionally used 20 or 30 s task-rest periods [47][48][49][50][51]. The reason for using a longer duration was to get more data to extract statistical features for the purpose of training the classifiers, since statistical features are more reliable if the number of data points is larger. Since the main objective of this work was to determine the best performing classifier, training with a large and reliable amount of data was desirable. In the mental arithmetic task, the subjects performed a mental calculation consisting of the subtraction of a two-digit number (10∼20) from a three-digit number with successive subtraction of another two-digit number from the result of the initial subtraction (e.g., 300 − 14, 286 − 11, and 275 − 16) [19,43,52].

Optodes Placement.
A total of 4 emitters and 10 detectors were positioned over the prefrontal cortex for the detection of mental arithmetic and rest signals; this configuration comprised 16 channels. In fNIRS-based BCI systems, the prefrontal cortex is the most widely used brain region, as its hairlessness results in fewer slippage-related motion artifacts and less signal attenuation. The distance between the emitter and the detector plays an important role in acquiring high-quality signals and obtaining maximum information from them [53]. Usually in fNIRS-based BCI systems, the emitter-to-detector distance is 3∼4 cm [54]; in our research, the distance was set to 2.8 cm, as shown in Figure 1(b).

Signal Acquisition.
A multichannel continuous-wave system (DYNOT: DYnamic Near-infrared Optical Tomography; two wavelengths: 760 and 830 nm; sampling rate: 1.81 Hz) obtained from NIRx Medical Technologies was used for the detection of brain activity. The near-infrared (NIR) light is transmitted to the scalp from the source at the above-specified wavelengths and is then scattered through the cortical region of the brain, where the chromophores HbO and HbR absorb some of the NIR light; the remainder is detected by the detectors.
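As a rough illustration of how the block design maps onto the sampled signal, the sketch below labels every sample as rest or task, using the 44 s periods, five repetitions, and 1.81 Hz sampling rate stated in the text. The function name and the convention of dropping the final partial sample are assumptions made for illustration only.

```python
import math

SAMPLING_RATE_HZ = 1.81  # sampling rate stated in the text
BLOCK_SECONDS = 44       # duration of each rest or task period
N_TRIALS = 5             # number of rest + task repetitions


def block_labels(rate_hz=SAMPLING_RATE_HZ, block_s=BLOCK_SECONDS, n_trials=N_TRIALS):
    """Return one label per sample: 0 = rest, 1 = mental arithmetic."""
    total_s = 2 * block_s * n_trials            # 440 s in the described paradigm
    n_samples = math.floor(total_s * rate_hz)   # assumption: drop the partial sample
    labels = []
    for i in range(n_samples):
        t = i / rate_hz                         # time stamp of sample i in seconds
        block_index = int(t // block_s)         # even blocks are rest, odd are task
        labels.append(block_index % 2)
    return labels


labels = block_labels()
```

Under these assumptions each 44 s period contributes roughly 80 samples per channel, which is the amount of data available for computing one set of statistical features.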

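Signal Processing.
The modified Beer-Lambert law (MBLL) is used to calculate the concentration changes of HbO and HbR ($\Delta c_{\mathrm{HbO}}(t)$ and $\Delta c_{\mathrm{HbR}}(t)$) in the microvessels of the cortex:

$$
\begin{bmatrix} \Delta c_{\mathrm{HbO}}(t) \\ \Delta c_{\mathrm{HbR}}(t) \end{bmatrix}
= \frac{1}{d \, l}
\begin{bmatrix} a_{\mathrm{HbO}}(\lambda_1) & a_{\mathrm{HbR}}(\lambda_1) \\ a_{\mathrm{HbO}}(\lambda_2) & a_{\mathrm{HbR}}(\lambda_2) \end{bmatrix}^{-1}
\begin{bmatrix} \Delta A(t; \lambda_1) \\ \Delta A(t; \lambda_2) \end{bmatrix}
$$

where $\Delta A(t; \lambda_j)$ ($j = 1, 2$) is the absorbance (optical density) measured at the two wavelengths $\lambda_j$, $a_{\mathrm{HbX}}(\lambda_j)$ is the extinction coefficient of HbX (i.e., HbO and HbR) in M$^{-1}$ mm$^{-1}$, $d$ is the differential path length factor (DPF), and $l$ is the emitter-detector distance (in millimetres). The signals obtained after conversion to $\Delta c_{\mathrm{HbX}}(t)$ contain physiological noises; so, we used a notch filter with band-reject ranges of 1∼1.2 Hz, 0.3∼0.4 Hz, and below 0.01 Hz to minimize the effects of heartbeat-, respiration-, and Mayer-wave-related noises, respectively.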
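The MBLL conversion amounts to inverting a 2×2 extinction matrix and scaling by the DPF and the emitter-detector distance. The following is a minimal sketch of that step for a single sample; the function name is hypothetical, and any extinction coefficients or DPF value passed to it are the caller's responsibility, since the text does not list the numerical values used.

```python
def mbll(delta_a_l1, delta_a_l2, ext, dpf, distance_mm):
    """Convert absorbance changes at two wavelengths into (dHbO, dHbR).

    ext holds the extinction coefficients as
    ((a_HbO(l1), a_HbR(l1)), (a_HbO(l2), a_HbR(l2))).
    """
    (a11, a12), (a21, a22) = ext
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("extinction matrix is singular")
    scale = 1.0 / (dpf * distance_mm)
    # Closed-form inverse of the 2x2 extinction matrix applied to the absorbances.
    d_hbo = (a22 * delta_a_l1 - a12 * delta_a_l2) / det * scale
    d_hbr = (-a21 * delta_a_l1 + a11 * delta_a_l2) / det * scale
    return d_hbo, d_hbr
```

Applying the function to every time sample of a channel yields the concentration-change time series that is subsequently filtered.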

Feature Extraction.
In this study, we used the following statistical properties of time-domain signals as features: signal mean [18,36,45,52,55,56], signal peak [33,45,57], signal slope [18,58], signal variance [45,59], signal kurtosis [45,59], and signal skewness [45,59]. Two- and three-dimensional combinations of those features were used for classification of the signals extracted from $\Delta c_{\mathrm{HbO}}(t)$. These features were calculated across all 16 channels spatially during the entire task and rest periods. All the features were normalized between 0 and 1 by the following equation [42]:

$$
x' = \frac{x - \min(x)}{\max(x) - \min(x)}
$$

where $x'$ represents the feature values rescaled between 0 and 1, $x \in \mathbb{R}^{n}$ are the original values of the features, and $\max(x)$ and $\min(x)$ represent the largest and smallest values, respectively. Figure 2 shows the 3D feature space of the mental arithmetic and rest tasks for mean, peak, and skewness.
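The six features and the min-max rescaling can be sketched as follows for a single signal window. The exact definitions are assumptions: the peak is taken as the window maximum, the slope as a least-squares line fit against the sample index, and variance, skewness, and kurtosis as population moments. The text does not specify these choices.

```python
def mean(x):
    return sum(x) / len(x)


def variance(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)  # population variance (assumption)


def slope(x):
    """Least-squares slope of x against its sample index (assumption)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = mean(x)
    num = sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
    den = sum((i - t_mean) ** 2 for i in range(n))
    return num / den


def skewness(x):
    m, s2 = mean(x), variance(x)
    return sum((v - m) ** 3 for v in x) / len(x) / s2 ** 1.5


def kurtosis(x):
    m, s2 = mean(x), variance(x)
    return sum((v - m) ** 4 for v in x) / len(x) / s2 ** 2


def extract_features(x):
    """Return the six time-domain features of one window."""
    return {
        "mean": mean(x),
        "peak": max(x),  # window maximum (assumption)
        "slope": slope(x),
        "variance": variance(x),
        "skewness": skewness(x),
        "kurtosis": kurtosis(x),
    }


def min_max(values):
    """Rescale a list of feature values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]
```

Normalization is applied per feature across all windows, so that features with large numerical ranges do not dominate distance-based classifiers.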

Linear Discriminant Analysis.
LDA has been most frequently used for pattern recognition in fNIRS-based BCI systems, thanks to its low computational cost and high speed [46,55,60-62]. Basically, LDA finds the projection onto a line such that the samples from the classes are well separated from each other, thus achieving its main objective, dimensionality reduction. Specifically, LDA does this by maximizing the ratio of between-class variance to within-class variance. The Matlab command "classify linear" was used with 10-fold cross-validation to estimate the classification performance.

Quadratic Discriminant Analysis.
QDA uses quadratic decision boundaries between classes, thereby enabling the classifier to perform more effectively and enhancing classification accuracy [17,63]. The Matlab command "classify quadratic" was used with 10-fold cross-validation to estimate the classification performance. In the present work, normal LDA and QDA, that is, without shrinkage or regularization, are used.
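To make the LDA projection concrete, here is a minimal sketch of a two-class Fisher discriminant for 2-dimensional feature vectors. It is not the Matlab routine used in the study; the function names are hypothetical, and the midpoint threshold assumes equal class priors.

```python
def class_mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mu):
    """2x2 within-class scatter matrix of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def fit_lda(rest, task):
    """Return (w, threshold) for the projection w that separates the classes."""
    m0, m1 = class_mean(rest), class_mean(task)
    s0, s1 = scatter(rest, m0), scatter(task, m1)
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^{-1} (m1 - m0), using the closed-form 2x2 inverse.
    w = [(sw[1][1] * dm[0] - sw[0][1] * dm[1]) / det,
         (-sw[1][0] * dm[0] + sw[0][0] * dm[1]) / det]
    # Midpoint between the projected class means (assumes equal priors).
    threshold = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, threshold


def predict_lda(model, x):
    w, threshold = model
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0
```

QDA differs in that each class keeps its own covariance matrix, which turns the linear boundary into a quadratic one.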

k-Nearest Neighbour.
kNN is the simplest machine-learning classification technique used in fNIRS-based BCI systems [64]. The kNN algorithm works by determining which of the points from the training data are close enough to be considered when selecting the class to predict for a new observation. In the present research, the value of k was set to 1, so that each new observation is assigned the class of its single closest training sample. The Matlab command "kNN classify" was used with 10-fold cross-validation to estimate the classification performance.
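A nearest-neighbour classifier with a single neighbour, evaluated by 10-fold cross-validation, can be sketched as below. This is an illustrative stand-in rather than the Matlab command used in the study; the interleaved fold assignment (no shuffling) and the Euclidean distance are simplifying assumptions.

```python
def nearest_neighbour(train_x, train_y, query):
    """Return the label of the training sample closest to query (Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def k_fold_accuracy(xs, ys, n_folds=10):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for fold in range(n_folds):
        # Assumption: interleaved folds; sample i belongs to fold i % n_folds.
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        for i in sorted(test_idx):
            correct += nearest_neighbour(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)
```

Because every sample is tested exactly once, the returned value is the cross-validated accuracy over the whole data set.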

Naïve Bayes Classifier.
In addition to LDA, QDA, and kNN, the Naïve Bayes approach was also implemented in our study, due to its simplicity and transparency among machine-learning modalities. This approach is fundamentally based on the Bayes theorem with assumptions of strong independence among the features [65,66]:

$$
P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}
$$

where $P(c \mid x)$ is the posterior probability of the class (target) given the feature, $P(c)$ is the prior probability of the class, $P(x \mid c)$ is the likelihood, which is the probability of the feature given the class, and $P(x)$ is the prior probability of the feature.
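The Bayes rule with independent features can be sketched as a small classifier. The per-feature Gaussian likelihood used here is an assumption, since the text does not state which likelihood model was applied; the function names and the variance floor are likewise illustrative.

```python
import math


def fit_gaussian_nb(xs, ys):
    """Estimate the prior and per-feature mean/variance for every class."""
    model = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        n_features = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-9  # floor avoids zero variance
            for j in range(n_features)
        ]
        model[label] = (len(rows) / len(xs), means, variances)
    return model


def predict_nb(model, x):
    """Return the class with the largest log posterior (up to the constant P(x))."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        # Independence assumption: per-feature log likelihoods simply add up.
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=log_posterior)
```

The denominator of the Bayes rule is the same for every class, so it can be omitted when only the most probable class is needed.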

Support Vector Machine.
SVM is a widely employed classification modality in fNIRS-based BCI systems due to its high classification performance, relatively good scalability to high-dimensional data, and explicit control of errors [19,34,44,59,67,68]. The main idea of SVM is to create the hyperplanes that maximize the margins between the classes, which can be obtained by minimizing the cost function and, thereby, enable maximum classification accuracy. The training points that lie closest to the hyperplane and thereby define it are known as support vectors. The optimal solution that maximizes the distance between the hyperplane and the nearest training point(s) can be obtained by minimizing the cost function:

$$
\min_{w,\, b,\, \xi} \; \frac{1}{2}\|w\|^{2} + C \sum_{i} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{T} x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0,
$$

where $w, x_i \in \mathbb{R}^{2}$ and $b \in \mathbb{R}^{1}$, $\|w\|^{2} = w^{T} w$, $C$ is the trade-off parameter between error and margin, $\xi_i$ is the slack variable measuring the training error, and $y_i$ is the class label for the $i$th sample. The main advantage of SVM is that it can be used as both a linear and a nonlinear classifier. In order to make SVM a nonlinear classifier, one of various types of kernel functions (i.e., polynomial, radial basis, and sigmoid functions) can be used. In our present research, we utilized a third-degree polynomial kernel function with $C = 0.5$. Tenfold cross-validation was then used to estimate the classification accuracies. The reason for using nonlinear SVM is that it has been shown to yield better classification accuracies than the linear classifiers [19].
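The soft-margin cost function and a third-degree polynomial kernel can be written out directly, as in the sketch below. It only evaluates the cost for a given hyperplane and does not solve the optimization problem. The function names, the kernel offset `coef0`, and the reading of the 0.5 in the text as the trade-off parameter are assumptions.

```python
def soft_margin_cost(w, b, xs, ys, c=0.5):
    """Evaluate 0.5 * ||w||^2 + C * sum of slacks for labels y in {-1, +1}.

    c defaults to 0.5 on the assumption that the value quoted in the text
    is the trade-off parameter.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack_total = 0.0
    for x, y in zip(xs, ys):
        decision = sum(wi * xi for wi, xi in zip(w, x)) + b
        # The slack is zero when the sample lies on the correct side of the margin.
        slack_total += max(0.0, 1.0 - y * decision)
    return margin_term + c * slack_total


def polynomial_kernel(u, v, degree=3, coef0=1.0):
    """Polynomial kernel (u . v + coef0) ** degree; coef0 is a hypothetical offset."""
    return (sum(a * b for a, b in zip(u, v)) + coef0) ** degree
```

A larger trade-off parameter penalizes margin violations more heavily, whereas a smaller one favours a wider margin at the expense of training errors.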

Artificial Neural Networks.
ANN is a classification technique widely used for deep machine-learning and pattern recognition in fNIRS-based BCI systems [35,69,70]. The ANN classification modality plays an important role in the rehabilitation of patients suffering from afflictions such as ALS and LIS by decoding useful information. In our research, we used a three-layer perceptron consisting of an input layer, a hidden layer, and an output layer. The number of hidden neurons is determined from the number of input neurons, the number of output neurons, and a constant in the interval (0, 1]. For the ANN classifier, the Matlab toolbox was used with 10 hidden neurons; 70% of the total data was used for training, 15% for validation (a measure of network generalization), and 15% for testing (an independent measure of network performance during and after training) [71].
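The 70/15/15 division of the data can be sketched as a random index split. This is a hypothetical helper, not the toolbox routine used in the study; the fixed seed and the rounding of subset sizes are illustrative choices.

```python
import random


def split_indices(n_samples, train=0.70, validation=0.15, seed=0):
    """Shuffle sample indices and split them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_train = round(train * n_samples)
    n_val = round(validation * n_samples)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],         # the remainder is the test set
    )
```

The validation part is used to decide when to stop training, while the test part is left untouched until the final accuracy is reported.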

Results and Discussion
In this study, we analyse and compare the performance of LDA, QDA, kNN, Naïve Bayes, SVM, and ANN classifiers in order to determine the best classifier for an fNIRS-based BCI system using mental arithmetic tasks and rest. The classification accuracies for mental arithmetic task and rest were calculated for all possible 2- and 3-feature combinations of six different features. The extracted features include the signal mean, signal peak, signal skewness, signal slope, signal variance, and signal kurtosis. These features are calculated for the whole task and rest periods. It was found that the presence of signal mean and signal peak in both 2- and 3-feature combinations yielded maximum classification accuracies. This finding corroborates our previous finding in [21]. Tables 1, 2, 3, 4, 5, and 6 show the classification accuracies among all of the subjects for the respective classifiers. Those accuracies were extracted from 2-dimensional combinations of features derived from $\Delta c_{\mathrm{HbO}}(t)$ signals. The average classification accuracies of the LDA, QDA, kNN, Naïve Bayes, SVM, and ANN classifiers for the 2-dimensional feature combinations were 71.6 ± 1.1, 90.0 ± 1.3, 69.7 ± 0.5, 89.8 ± 1.4, 89.5 ± 1, and 91.4 ± 0.8, respectively. To further examine the performances of the classifiers used in our study, we also employed 3-dimensional combinations of features and extracted the corresponding classification accuracies, which were 79.6 ± 1.5, 95.2 ± 1.0, 64.5 ± 0.3, 94.8 ± 1.2, 95.2 ± 0.7, and 96.3 ± 0.3, respectively. In both (2- and 3-dimensional) cases, it was found that the ANN classifier has the highest classification accuracies: 91.4 and 96.3% for mental arithmetic task and rest. Figure 3 shows the averaged HbO and standard deviation for mental arithmetic and rest task.
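The number of feature sets evaluated per classifier follows from simple counting, as the short sketch below shows. The feature labels are shorthand for the six features named in the text; the variable names are illustrative.

```python
from itertools import combinations

FEATURES = ["mean", "peak", "slope", "variance", "kurtosis", "skewness"]

# All unordered 2- and 3-feature subsets of the six features.
pairs = list(combinations(FEATURES, 2))
triples = list(combinations(FEATURES, 3))

# Subsets that contain both signal mean and signal peak.
pairs_with_mean_peak = [c for c in pairs if "mean" in c and "peak" in c]
triples_with_mean_peak = [c for c in triples if "mean" in c and "peak" in c]
```

Six features give 15 two-feature and 20 three-feature combinations, of which one pair and four triples contain both signal mean and signal peak.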
Tables 7 and 8 provide the comparison of all classifiers-in terms of average
Several previous studies have used multiple types of classifiers to extract the classification accuracies for fNIRS-based BCI systems. For example, Naseer et al. [19] used LDA and SVM to acquire the classification accuracies for a two-class BCI system; the classification accuracies were 74.2 and 82.1%, respectively. Moreover, Khan and Hong [72] used LDA and SVM classifiers for a two-class BCI system; the classification accuracies were 84.6 and 85.8%. In the present study, six different classifiers were used to obtain the highest average classification accuracies for a two-class (mental arithmetic and rest) BCI system. The ANN classifier showed the maximum average classification accuracies, 91.4 and 96.3%, for 2- and 3-dimensional combinations of features derived from $\Delta c_{\mathrm{HbO}}(t)$ signals, respectively. Figure 4 plots the

Conclusion
In this study, we examined the effects of using different classification modalities for the classification of a two-class functional near-infrared spectroscopy- (fNIRS-) based brain-computer interface (BCI) according to a mental arithmetic task and rest experimental paradigm. It was shown