A New Feature Analysis Approach to Selecting Channels of EEG for Fatigue Driving

Fatigued driving is a significant contributor to traffic accidents. There are some issues with common EEG data of 32 channels, 64 channels, and 128 channels, such as difficult acquisition, high data redundancy, and difficult practical application. A new channel selection method called ReliefF_SFS is proposed to address the problem of how to reduce the number of channels while maintaining classification accuracy. It combines the ReliefF algorithm and the sequential forward selection (SFS) algorithm. When only T6, O1, Oz, T4, P3, and FC3 are used, the classification accuracy under Theta_Std+FE combined with ReliefF_SFS achieves 99.45%. The strategy suggested in this paper not only ensures the recognition accuracy but also reduces the number of channels when compared to other models based on the same data set.


Introduction
Electroencephalogram (EEG) signals are spontaneous, highly random electrical activity of brain cells recorded by electrodes on the surface of the scalp. EEG signals record the electrical changes in brain activity and can directly reflect the fatigue state. Feature extraction methods based on EEG signals fall into two main categories: linear analysis methods and nonlinear analysis methods. Linear analysis methods mainly include time domain, frequency domain, and time-frequency domain analysis methods. Time domain analysis methods mainly extract features such as mean, median, variance, standard deviation, skewness, and kurtosis. Frequency domain analysis methods mainly decompose the EEG signal into multiple bands by wavelet transform or Fourier transform. Muhammad et al. [1] extracted the mean, variance, minimum, maximum, δ, θ, α, β, γ, and sample entropy of the ECG signal and used a support vector machine (SVM) for classification; the accuracy of binary classification reached 80%. Nonlinear analysis methods mainly extract features such as entropy and fractal dimension. Ye et al. [2] proposed a fatigue driving state recognition method based on sample entropy and kernel principal component analysis, which combined the high recognition accuracy of sample entropy with the strong capability of kernel principal component analysis for nonlinear dimensionality reduction, and achieved good results. Lin et al. [3] proposed a method for the dynamic construction of functional brain networks based on singular value entropy and fractal dimension. The experimental results showed that the method has high accuracy in fatigue driving recognition.
Although these methods perform better in feature extraction of EEG signals, most scholars only analyse EEG signals from a single aspect of linearity or nonlinearity, which is one-sided. Therefore, in this paper, the frequency domain features and fuzzy entropy features of the EEG signal are extracted separately, and the best performing subband features of the frequency domain features are fused with the fuzzy entropy features to form fused features, which are used as preparatory data for channel selection.
The 32-channel, 64-channel, and 128-channel EEG signal acquisition devices require electrodes to be arranged in various brain regions of the human brain, which is not only time-consuming and labour-intensive but also lacks relevance and convenience, as well as resulting in significant data redundancy and inefficient data processing. In the practical application of fatigue driving detection systems, due consideration should be given to the convenience and speed of EEG signal acquisition, the comfort of the driver, and the impact of the device on the driver's operation. Therefore, it is of practical importance to investigate the use of as few electrode channels as possible to detect the driver's driving status, not only to reduce the difficulty of EEG signal acquisition but also to improve the practicality.
Many scholars have conducted in-depth analysis and research on the channel selection of EEG signals in different fields. Zheng et al. [4] proposed a feature extraction and channel selection method of EEG signals for portable HCI systems for emotion recognition. This method extracts discriminative features of EEG signals in different dimensions and combines the Relief algorithm with the floating generalized sequential backward selection algorithm. The experimental results showed that the majority of the optimal channel set was located at the front end, and 10 channels of EEG signals with extremely high accuracy were selected, with an average classification accuracy of 91.31% on both the self-collected and public datasets. Ru et al. [5] proposed a dynamic channel selection method based on channel location and EEG signal power spectral density and selected one of the channels with the strongest epilepsy detection ability as the feature extraction channel, so as to enhance the performance of epilepsy recognition and detection. Finally, 6 channels were selected from 21 channels, achieving a better performance of 98.99% accuracy, 98.52% sensitivity, and 99.52% specificity. Shoka et al. [6] proposed an automatic epilepsy diagnosis system based on EEG signal feature extraction and channel selection, which minimized the dimensionality by selecting the most affected channels through the variance parameter and finally reduced 23 channels to 3. Zhang et al. [7] proposed a ReliefF-Pearson based channel selection algorithm for olfactory EEG signals, combining the weighting idea of ReliefF and the correlation principle of Pearson. The results showed that the method was able to significantly reduce the number of channels while ensuring a certain classification accuracy. Praveena et al. [8] proposed a supervised classifier-based important feature selection method for seizure recognition, in which the ReliefF method was used to reduce the dimensionality of extracted features, and the long short-term memory (LSTM) method was used for classification. The results showed that the classification accuracy of the method was improved by 0.6%-16%.
In the field of fatigue driving, although there are researchers working on channel selection methods, they are still in the early stages of research, with few researchers or research results, and even more distant from practical applications. EEG signal channel selection for fatigue driving mainly includes single-channel selection and multichannel selection. Single-channel selection methods ensure a minimum number of channels, but ignore the fact that EEG signals from different drivers are different, resulting in poor detection results. The multichannel selection method ensures detection results with the lowest possible number of channels.
Hu [9] used a channel+feature+classifier approach applied to the fatigue driving dataset, and the selected combination of the CP4 channel, fuzzy entropy feature, and random forest classifier achieved 96.6% accuracy. Liu [10] proposed an adaptive multiscale sample entropy feature extraction algorithm based on empirical modal decomposition applied to the fatigue driving dataset and achieved 97.87% recognition accuracy on Fp1 and Fp2 electrodes. Chai et al. [11] used independent component analysis (ICA) and scalp map projection for EEG-based driver fatigue classification. The channels are reduced from 32 to 16, and the classification results of 16 channels are equivalent to those of 32 channels. Min et al. [12] proposed a feature extraction method of multi entropy fusion to select 10 channels of fatigue driving EEG data in 4 regions on an accuracy-based weight calculation method, which achieved 98.3% recognition accuracy.
The above studies show channel selection has practical significance in the field of fatigue driving. However, how to reduce the number of EEG signal channels as much as possible while improving the recognition accuracy still requires continuous research. Therefore, this paper focuses on the study of multichannel selection methods. Based on the extracted different EEG signal feature data, combined with the weight calculation of ReliefF algorithm and the feature selection of SFS algorithm, a channel selection method based on ReliefF_SFS is proposed to explore the use of as few EEG signal channels as possible to achieve a high recognition accuracy. It not only reduces redundant channels but also improves the practicality of fatigue detection in the driving field.
The rest of the paper is organized as follows: Section 2 introduces the relevant theory and methods of the proposed method. Section 3 focuses on the feature extraction part of channel selection. Section 4 introduces the channel selection algorithm with ReliefF_SFS on different features. Section 5 presents the experiments and analyses the results. Section 6 summarizes the paper.

Frequency Domain Features.
A large amount of EEG signal feature information is reflected in the frequency features, and extracting the frequency domain features after wavelet decomposition of EEG signals is a common analysis method. Wavelet packet decomposition (WPD) [13] is a mainstream signal analysis method that has been widely used in various signal-related fields, including medical diagnosis, metal detection, and natural disaster signal analysis. WPD can decompose and reconstruct a signal into multiple signal components with the same bandwidth but different center frequencies. It provides higher resolution in the high frequency part of the signal, with no redundant or missing information, and it has a strong ability to decompose nonstationary signals into multiscale signals. Therefore, it is commonly used for signal feature extraction. Equation (1) is used for wavelet packet decomposition of the EEG signal:
$$W_{2n}(x) = \sqrt{2}\sum_{k} h_k W_n(2x - k), \qquad W_{2n+1}(x) = \sqrt{2}\sum_{k} g_k W_n(2x - k), \tag{1}$$
where $h_k$ is a low-pass filter bank and $g_k$ is a high-pass filter bank. The wavelet packet decomposition is a collection of functions with certain connections, including the scale function $W_0(x) = \Phi(x)$ and the wavelet function $W_1(x) = \varphi(x)$. The WPD method can decompose both low frequency signals and high frequency signals at the same time and is more efficient than the wavelet transform [14] for feature extraction.
In the experiments of this paper, the sampling frequency of the EEG signal was first reduced to 128 Hz. A six-layer wavelet packet decomposition tree was then built, and the original EEG signal was decomposed into four subbands: the Theta subband (4-8 Hz), the Alpha subband (8-13 Hz), the Beta1 subband (13-20 Hz), and the Beta2 subband (20-30 Hz). Finally, the standard deviation (Std) features are extracted for each subband, and the best performing subband features are fused with the fuzzy entropy features to form the fused features. The standard deviation is a measure of the dispersion of the data; in its definition, $x(i)$ denotes the time series and $N_s$ denotes the size of the time series.
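The band arithmetic behind this decomposition can be checked with a short sketch. It rests on two assumptions that the text does not state: the leaf nodes of the six-level tree are indexed in ascending frequency order, and the standard deviation uses the population form (division by N_s). The function names are illustrative only.

```python
import math

FS = 128      # sampling frequency after downsampling (Hz), from the text
LEVELS = 6    # depth of the wavelet packet tree, from the text

# Subband limits in Hz, as listed in the text.
SUBBANDS = {"Theta": (4, 8), "Alpha": (8, 13), "Beta1": (13, 20), "Beta2": (20, 30)}


def node_bandwidth(fs=FS, levels=LEVELS):
    """Width in Hz of one leaf node: the 0..fs/2 range split into 2**levels bands."""
    return (fs / 2) / (2 ** levels)


def nodes_for_band(low, high, fs=FS, levels=LEVELS):
    """Leaf indices covering [low, high) Hz, assuming frequency-ordered leaves."""
    width = node_bandwidth(fs, levels)
    return list(range(int(low / width), int(high / width)))


def std(x):
    """Population standard deviation (assumed normalisation: division by N_s)."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


if __name__ == "__main__":
    print("leaf bandwidth (Hz):", node_bandwidth())
    for name, (low, high) in SUBBANDS.items():
        print(name, nodes_for_band(low, high))
```

With a 128 Hz sampling frequency, the six-level tree has 64 leaves of 1 Hz each, so every subband edge listed above falls on a leaf boundary. Wavelet packet libraries often return leaves in natural rather than frequency order, so the indices would need reordering in practice.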

FE.
The concept of fuzzy entropy (FE) was first proposed by Chen et al. [15] in 2007. FE describes the fuzziness of a fuzzy set [16] and measures the probability of a new pattern being generated. The larger the measure, the greater the probability of a new pattern being generated and the greater the sequence complexity. The specific algorithm is described as follows:
Step 1. Given a time series of length n.
Step 2. Divide the time series into k = n − m + 1 sequences with a window of m.
Step 3. Calculate the distance between each sequence and all k sequences, which yields the distance matrix of Equation (6).
Step 4. Calculate the fuzzy membership based on the distance d, and average over all memberships except that of each sequence with itself.
Step 5. Grow the window from m to m + 1 and repeat Steps 2 to 4.
Step 6. Calculate the fuzzy entropy.
In this paper, the fuzzy entropy is calculated by taking m as 2 and r as 0.25.
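The steps above can be sketched as follows. The equations of the original algorithm are not reproduced in this section, so several choices here are assumptions rather than the authors' definitions: the Chebyshev (maximum) distance between windows, the membership function exp(-(d/r)^2), the treatment of r as an absolute tolerance, and the final value ln Φ(m) − ln Φ(m+1).

```python
import math


def fuzzy_entropy(series, m=2, r=0.25):
    """Fuzzy entropy sketch following Steps 1-6; m and r default to the paper's values."""
    n = len(series)
    if n - m < 2:
        raise ValueError("series too short for the chosen window m")

    def phi(window):
        # Step 2: k = n - window + 1 overlapping sequences of length `window`.
        k = n - window + 1
        vectors = [series[i:i + window] for i in range(k)]
        total = 0.0
        for i in range(k):
            acc = 0.0
            for j in range(k):
                if i == j:
                    continue  # Step 4: exclude the sequence's membership with itself
                # Step 3: assumed Chebyshev distance between the two sequences.
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                # Step 4: assumed membership function exp(-(d / r) ** 2).
                acc += math.exp(-((d / r) ** 2))
            total += acc / (k - 1)
        return total / k

    # Steps 5-6: repeat with window m + 1 and take the log difference.
    return math.log(phi(m)) - math.log(phi(m + 1))


if __name__ == "__main__":
    print(fuzzy_entropy([0.0, 0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.0, 0.1]))
```

A constant series gives a value of zero, since every membership equals one for both window lengths. If all distances are very large relative to r, the averaged membership can underflow to zero and the logarithm fails, so real data would typically be scaled first.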

ReliefF.
The Relief algorithm was first proposed by Kira and Rendell [17] in 1992. Relief is a feature weighting algorithm that assigns different weights to features based on the correlation between features and categories, where this correlation is measured by the ability of a feature to discriminate between samples that lie close together. The Relief algorithm is simple and efficient and has been widely used. However, it can only handle two-category data. Therefore, Kononenko [18] extended it in 1994 to obtain the ReliefF algorithm, which can handle multicategory problems. A further variant, RReliefF, deals with regression problems where the target attributes are continuous values. The larger the weight of a feature, the better its classification performance. The ReliefF algorithm [8] is described in Algorithm 1.
In this paper, the weights are calculated with m = 80, k = 10, and N = 30.
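A minimal sketch of the ReliefF weight update is given below. It is not a transcription of Algorithm 1: the Manhattan distance over range-normalised features, the random sampling with a fixed seed, and the division by the number of neighbours actually found are assumptions, and the names `relieff`, `X`, and `y` are illustrative. Here m is read as the number of sampling iterations and k as the number of nearest neighbours.

```python
import random


def relieff(X, y, m=80, k=10, seed=0):
    """Return one ReliefF weight per feature (column of X); y is a list of class labels."""
    rng = random.Random(seed)
    n, n_feat = len(X), len(X[0])
    # Feature ranges, used to normalise diff() into [0, 1].
    span = []
    for a in range(n_feat):
        col = [row[a] for row in X]
        span.append((max(col) - min(col)) or 1.0)
    prior = {c: y.count(c) / n for c in set(y)}

    def diff(a, r1, r2):
        return abs(r1[a] - r2[a]) / span[a]

    def nearest(i, cls):
        """Up to k samples of class cls closest to sample i (assumed Manhattan distance)."""
        cand = [j for j in range(n) if j != i and y[j] == cls]
        cand.sort(key=lambda j: sum(diff(a, X[i], X[j]) for a in range(n_feat)))
        return cand[:k]

    w = [0.0] * n_feat
    for _ in range(m):
        i = rng.randrange(n)  # randomly selected sample R
        for cls in prior:
            neigh = nearest(i, cls)
            if not neigh:
                continue
            # Near hits lower the weight; near misses raise it, scaled by class prior.
            scale = -1.0 if cls == y[i] else prior[cls] / (1.0 - prior[y[i]])
            for a in range(n_feat):
                w[a] += scale * sum(diff(a, X[i], X[j]) for j in neigh) / (m * len(neigh))
    return w


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 1.0], [0.0, 0.5], [0.1, 0.2],
         [1.0, 0.0], [0.9, 1.0], [1.0, 0.5], [0.9, 0.2]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(relieff(X, y, m=20, k=2))
```

In the toy data, the first feature separates the two classes and the second does not, so the first weight comes out positive and the second negative. In the channel selection setting of this paper, each column would hold the feature values of one EEG channel.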

SFS.
The sequential forward selection (SFS) algorithm [19] is used to reduce the initial d-dimensional feature space to a k-dimensional feature subspace, where k < d. The algorithm can be described as follows: the feature subset X starts from the empty set, and one feature x at a time is added to X such that the feature function Y(X) is optimal. In other words, SFS is a simple greedy algorithm that at each step chooses the one feature that makes the evaluation function optimal.
In this paper, the feature function is defined as
$$Y(X) = \mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN},$$
where Acc indicates accuracy, TP indicates positive samples predicted by the model as positive class, TN indicates negative samples predicted by the model as negative class, FP indicates negative samples predicted by the model as positive class, and FN indicates positive samples predicted by the model as negative class. The SFS algorithm is described in Algorithm 2.
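The greedy selection and the accuracy measure can be illustrated with a short sketch. It is a generic SFS, not a transcription of Algorithm 2; the `score` callback stands in for the feature function Y(X), and all names are illustrative.

```python
def accuracy(tp, tn, fp, fn):
    """Acc = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def sequential_forward_selection(n_features, k, score):
    """Greedily grow a subset from the empty set until it holds k features.

    score(subset) is a caller-supplied evaluation function (higher is better).
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Add the single feature that makes the evaluation function optimal.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    gains = [0.1, 0.5, 0.3]  # toy per-feature gains, purely for illustration
    print(sequential_forward_selection(3, 2, lambda s: sum(gains[i] for i in s)))
    print(accuracy(tp=45, tn=50, fp=3, fn=2))
```

Because each step keeps the earlier choices fixed, SFS evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing a better combination.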
2.5. KNN. The K-Nearest Neighbor (KNN) classification algorithm was originally proposed by Silverman et al. [20] in 1951 and later modified by Cover and Hart [21] in 1967. The KNN classifier is a simple and general classification method. Due to its simplicity and robustness, it has been widely used in a number of fields, including pattern recognition, model ranking, and text classification. KNN is a nonparametric lazy learning algorithm: when a new value X is to be predicted, the class of X is determined from the classes of the K points nearest to it. The two most important steps in this algorithm are the calculation of point distances and the selection of the K value. For the distance calculation, the KNN algorithm uses the Euclidean distance, which for two points $(x_1, y_1)$ and $(x_2, y_2)$ in two dimensions is calculated as
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}.$$
For the selection of the K value, cross-validation starts from a small K and keeps increasing it. The variance of the validation set is then calculated, and a more appropriate value of K is finally found [22].
In this paper, K was taken to be 10.
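A minimal KNN sketch with Euclidean distance and majority voting is shown below. The function names and the toy data are illustrative; the labels JX and ZD simply reuse the paper's abbreviations for the resting and fatigue states.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train_X, train_y, x, k=10):
    """Predict the class of x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: euclidean(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    train_y = ["JX", "JX", "JX", "ZD", "ZD", "ZD"]
    print(knn_predict(train_X, train_y, [0.2, 0.1], k=3))
    print(knn_predict(train_X, train_y, [5.5, 5.5], k=3))
```

An odd k is used in the toy example to avoid tied votes between two classes; with an even k such as 10, a tie-breaking rule has to be chosen.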

Feature Extraction
3.1. Data Collection. The experimental data were collected mainly through a vehicle driving simulator (ZY-31D Vehicle Driving Simulator, Beijing Zhongyulai Fit Teaching Equipment Co., Ltd.) and a 32-channel EEG electrode cap (sampling frequency 1000 Hz). The position of the 32 electrodes in the electrode cap according to the international 10-20 standard is shown in Figure 1.
During data collection, subjects first drove for 20 minutes using the vehicle driving simulator, and the last 5 minutes of data were recorded as resting state data (recorded as JX). Subjects then drove continuously for more than 1 hour, and the Fatigue Scale-14 (FS-14) [23] was used to assess the driver's state until the subject was in a fatigued state; the last 5 minutes of data were recorded as fatigue state data (recorded as ZD). The final EEG data thus comprised 300 seconds each in the resting and fatigued states, with a sampling frequency of 1000 Hz and 32 channels. In the actual processing, the data of the two reference electrodes (A1 and A2) were removed. The final experimental data for each subject therefore covered 600 seconds and 30 channels.
The names of the electrodes correspond to their positions as follows: Fp1 (1), Fp2 (2), ...
Firstly, after acquiring the resting-state data and fatigue-state data, we divided each part of the resting-state or fatigue-state data into 1-second segments and formed them together. Then, we splice the processed resting-state data and fatigue-state data into a complete dataset. During the training and testing period, the whole dataset is shuffled and divided into the training set and the testing set.
In Algorithm 1, the calculations are as shown in Equations (10) and (11), where $\mathrm{diff}(A, R_1, R_2)$ denotes the difference between samples $R_1$ and $R_2$ on feature $A$, and $M_j(C)$ denotes the $j$th nearest neighbor sample in class $C \neq \mathrm{class}(R)$.
3.2. Feature Matrix Construction. In this paper, experimental data from 10 subjects were collected at a sampling frequency of 1000 Hz for 300 seconds each in the resting and fatigue states, using 30 channels placed according to the international 10-20 standard (after removing the two reference electrodes). For each subject, this constitutes a sample matrix of (2 × 300 × 1000) × 30, and for 10 subjects a sample matrix of (2 × 10 × 300 × 1000) × 30, where 2 × 10 × 300 × 1000 is the number of rows of the sample matrix and 30 is the number of channels. The total sample size is 6,000,000 (3,000,000 resting state samples and 3,000,000 fatigue state samples). For each feature, each subject yields a 300 × 30 resting state feature sample matrix $X_{JX}$ and a 300 × 30 fatigue state feature sample matrix $X_{ZD}$, as shown in Equations (12) and (13), where $x_{i,j}$ denotes the feature value. The 10 subjects together yield a (10 × 300) × 30 resting state feature sample matrix and a (10 × 300) × 30 fatigue state feature sample matrix, where 10 × 300 is the number of rows and 30 is the number of channels. That is, 3000 resting state samples and 3000 fatigue state samples over 30 channels were obtained after feature extraction, as preparatory data for subsequent channel selection.
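The matrix sizes quoted above follow directly from the recording parameters, as this short sketch confirms. The constants come from the text; the function names, and the choice to discard a trailing partial second, are illustrative assumptions.

```python
FS = 1000        # sampling frequency (Hz)
SECONDS = 300    # recording length per state (s)
STATES = 2       # resting (JX) and fatigue (ZD)
CHANNELS = 30    # 32 electrodes minus the two references A1 and A2


def raw_rows(subjects):
    """Rows of the raw sample matrix: states x subjects x seconds x samples per second."""
    return STATES * subjects * SECONDS * FS


def feature_rows_per_state(subjects):
    """Rows of the feature matrix per state: one feature value per 1-second segment."""
    return subjects * SECONDS


def split_into_epochs(signal, fs=FS):
    """Cut a single-channel signal into consecutive 1-second segments (partial tail dropped)."""
    n_epochs = len(signal) // fs
    return [signal[i * fs:(i + 1) * fs] for i in range(n_epochs)]


if __name__ == "__main__":
    print(raw_rows(1), raw_rows(10))       # per subject, all subjects
    print(feature_rows_per_state(10))      # feature samples per state
    print(len(split_into_epochs(list(range(3500)))))
```

Each 1-second segment of 1000 raw samples is reduced to a single feature value per channel, which is why the 6,000,000 raw rows shrink to 3000 feature rows per state.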

Channel Selection Model Construction Based on ReliefF_SFS.
The ReliefF method is a widely used feature selection method in classification problems, with the advantages of simple computation and high operational efficiency. However, the ReliefF method only yields the weight of each feature: it can evaluate the contribution of a feature to the classification but cannot help delete redundant features [24]. The SFS method determines the optimal feature subset by selecting, one at a time, the feature that results in the optimal value of the evaluation function. Therefore, this paper proposes a channel selection method based on the ReliefF and SFS methods (recorded as the ReliefF_SFS method) by combining the weight calculation of the ReliefF method with the feature selection of the SFS method. The method first uses the ReliefF method to calculate the channel weights of the EEG signals after feature extraction, then uses the SFS method to iteratively select the channel with the largest weight to join the channel subset (the channel subset starts from the empty set), and then uses a KNN classifier to perform five-fold cross-validation for each channel subset to obtain the recognition accuracy (i.e., the value of the feature function) for each channel subset. Finally, the optimal number of channels is determined based on the recognition accuracy. This method not only solves the problem of redundancy of EEG signal channels but also reduces the data dimensionality and facilitates the acquisition of signals and data processing. Figure 2 shows the model construction process of the method, which includes the following main sections:
(1) EEG Signal Data Acquisition. The EEG signal data set is obtained by acquisition with specialised equipment. (The specific method is shown in Section 3.1.)
(2) EEG Signal Feature Extraction. The frequency domain features, fuzzy entropy features, and fusion features of all channels were extracted for each S-second data of R subjects in the EEG signal data set in resting and fatigue states, respectively, as shown in Section 3.2. The obtained feature data were used as the preparatory data for channel selection.
(3) Weighting Calculation. Based on the extracted full-channel EEG signal feature data, the ReliefF method was used for channel weight calculation.
(4) Channel Subset Selection. Using the SFS method, the channel subset starts from the empty set, and the channel data with the largest weight is selected to join the channel subset each time. The channel subset is constructed iteratively.
(5) Recognition Test. For each channel subset, five-fold cross-validation was used to randomly select 80% of the data as the training set and the remaining 20% as the test set. The recognition accuracy of each channel subset was calculated separately based on the KNN classifier, and the optimal channel subset was determined by the recognition accuracy. The recognition accuracy and channel selection results of different features were compared to obtain the optimal combination of feature, accuracy, and number of channels.
Algorithm 3: Channel selection algorithm based on a single feature combined with ReliefF_SFS.
Obtain the optimal channel subset that satisfies the classification threshold.
Calculate the feature values of R subjects in two driving states (resting state and fatigue state) in S seconds and construct a feature matrix. For EEG signal data with a raw sampling frequency of 1000 Hz, the feature values are calculated with a division of 1 second to obtain the feature matrix $F_{(2 \times R \times S) \times N}$, where 2 × R × S is the number of rows of the matrix F, i.e., the number of samples, and N is the number of electrodes, i.e., the number of channels;
3: Channel Weight Calculation. Based on the feature matrix $F_{(2 \times R \times S) \times N}$, the ReliefF method is used to calculate the channel weights of the feature data to obtain the weight matrix $W_{1 \times N}$;
4: Channel Subset Selection. Using the SFS method, the channel subset starts from the empty set, and the channel with the largest weight is selected to join the channel subset each time. The channel subset feature matrix $F'_{(2 \times R \times S) \times n}$ is constructed iteratively, where n (taking values from 1 to 30) is the number of channels in each subset;
5: Divide the Training Set and the Test Set. For each channel subset, the feature matrix $F'_{(2 \times R \times S) \times n}$ is randomly divided into two parts: the training set matrix $F'_{m_1 \times n}$ and the test set matrix $F'_{m_2 \times n}$, where $m_1 : m_2 = 8 : 2$, i.e., $m_1 = 80\% \times (2 \times R \times S)$ and $m_2 = 20\% \times (2 \times R \times S)$;
6: Calculate the Recognition Accuracy. The training data and the testing data are input into the KNN classifier, and the classification test is performed using five-fold cross-validation to obtain the recognition accuracy (i.e., feature function value) matrix $Acc_{N \times 1}$. Here, KNN is responsible for the verification of the channels selected by the SFS algorithm. If the recognition accuracy of the channel subset reaches the classification threshold, the channel subset is regarded as the optimal channels. Otherwise, a new channel subset is selected by the SFS algorithm. When the SFS algorithm is finished, the best combination of channels is output.
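The overall selection loop can be sketched as below. This is an illustration under assumptions, not the authors' implementation: `weights` stands for the ReliefF channel weights, `evaluate` is a hypothetical callback that would return the cross-validated KNN accuracy for a channel subset, and the fallback to the best subset seen when the threshold is never reached is a design choice made here.

```python
def relieff_sfs_select(weights, evaluate, threshold):
    """Add channels in descending weight order until the accuracy threshold is met.

    weights   : per-channel weights (e.g. from ReliefF)
    evaluate  : hypothetical callback, subset of channel indices -> accuracy
    threshold : classification threshold that ends the search
    """
    order = sorted(range(len(weights)), key=lambda c: weights[c], reverse=True)
    subset = []
    best_subset, best_acc = [], float("-inf")
    for channel in order:
        subset.append(channel)          # the subset starts empty and grows by one
        acc = evaluate(list(subset))
        if acc > best_acc:
            best_subset, best_acc = list(subset), acc
        if acc >= threshold:
            return list(subset), acc
    # Threshold never reached: fall back to the most accurate subset seen.
    return best_subset, best_acc


if __name__ == "__main__":
    toy_weights = [0.2, 0.9, 0.5]
    toy_evaluate = lambda chans: 0.5 + 0.2 * len(chans)  # toy accuracy, for illustration
    print(relieff_sfs_select(toy_weights, toy_evaluate, threshold=0.85))
```

Because the channel order is fixed by the weights, only one subset per size is evaluated, so at most N classifier runs are needed for N channels.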

Channel Selection Algorithm Based on a Single Feature Combined with ReliefF_SFS.
In this section we focus on the channel selection algorithm based on a single feature combined with ReliefF_SFS. The single features described here are the Std features of the Theta, Alpha, Beta1, and Beta2 subbands (recorded as Theta_Std, Alpha_Std, Beta1_Std, and Beta2_Std, respectively) and the FE feature. The algorithm is described in detail in Algorithm 3.

Channel Selection Algorithm Based on Fusion Features Combined with ReliefF_SFS.
In this section, the best performing subband features selected from the frequency domain features in Section 4.2 are fused with FE features before ReliefF_SFS channel selection, and the fusion method is shown in Section 3.2.2.
The overall process is similar to the channel selection algorithm based on a single feature in Section 4.2. However, unlike the single-feature case, the fusion features require the construction of a fusion feature matrix.
The selected optimal frequency domain features and the fuzzy entropy features are fused to obtain the fusion feature matrix $F1_{(2 \times R \times S) \times x \times N}$, where 2 × R × S is the number of rows of the matrix F1, i.e., the number of samples, N is the number of electrodes, i.e., the number of channels, and x is the number of fused features (x = 2), i.e., the number of features extracted from each channel.
After obtaining the fusion feature matrix, ReliefF_SFS is used for channel subset selection.
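One possible layout of the fusion feature matrix is sketched below. The text gives only its shape, so the stacking order and the way a single weight per channel is derived are assumptions; here the channel weight is taken as the arithmetic mean of the weights of its x = 2 features, and all names are illustrative.

```python
def fuse_features(theta_std, fe):
    """Stack two (samples x channels) matrices into a samples x 2 x channels structure."""
    if len(theta_std) != len(fe):
        raise ValueError("both feature matrices need the same number of samples")
    return [[t_row, f_row] for t_row, f_row in zip(theta_std, fe)]


def average_channel_weights(feature_weights):
    """Average the per-feature channel weights into one weight per channel.

    feature_weights[f][c] is the weight of feature f on channel c.
    """
    x = len(feature_weights)
    return [sum(column) / x for column in zip(*feature_weights)]


if __name__ == "__main__":
    theta = [[1.0, 2.0, 3.0]]   # one sample, three channels (toy values)
    fe = [[4.0, 5.0, 6.0]]
    print(fuse_features(theta, fe))
    print(average_channel_weights([[1.0, 3.0], [3.0, 5.0]]))
```

Averaging keeps the selection unit at the channel level, so that choosing a channel brings in both of its features together.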

Validity Test Based on a Single Feature Combined with ReliefF_SFS
(1) Validity Test Based on Frequency Domain Features Combined with ReliefF_SFS. Table 1 shows the results of the four frequency domain feature data (Theta_Std, Alpha_Std, Beta1_Std, and Beta2_Std) sorted by channel weight value from largest to smallest. Table 2 shows the recognition accuracy of each channel subset obtained from the four frequency domain feature data based on the ReliefF_SFS method after classification and recognition using a KNN classifier. Figure 3 shows the optimal recognition accuracy and the corresponding number of channels obtained from the four frequency domain feature data after being processed by the ReliefF_SFS channel selection method.
As can be seen from Table 2 and Figure 3, the channel selection method based on Theta_Std features combined with ReliefF_SFS achieves a maximum recognition accuracy of 99.42% when using the 15 channels with the highest weights; the method based on Alpha_Std features achieves a maximum recognition accuracy of 91.73% when using the 19 channels with the highest weights; the method based on Beta1_Std features achieves a maximum recognition accuracy of 97.00% when using the 17 channels with the highest weights; and the method based on Beta2_Std features achieves a maximum recognition accuracy of 86.70% when using the 20 channels with the highest weights. The Theta_Std features therefore perform best among the four subbands, reaching 99.42% classification accuracy with the 15 highest-weighted channels, named (numbered) as T6 (27), ... (5), and CP3 (19).
(2) Validity Testing Based on Fuzzy Entropy Features Combined with ReliefF_SFS. Table 3 shows the results obtained by sorting the FE feature data according to the channel weight values from largest to smallest. Table 4 shows the recognition accuracy of each channel subset obtained from the FE feature data based on the ReliefF_SFS method after classification and recognition using a KNN classifier. From Table 4, it can be seen that the channel selection method based on FE features combined with ReliefF_SFS achieves 99.22% classification accuracy when using the 7 channels with the highest weights, named (numbered) as O1 (28), T6 (27), FC3 (9), Oz (29), TP8 (22), T4 (17), and P3 (24).

Validity Testing Based on Fusion Features Combined with ReliefF_SFS.
Table 5 shows the results obtained by sorting the fused feature data Theta_Std+FE according to the average channel weight value from largest to smallest. Table 6 shows the recognition accuracy of each channel subset obtained from the fused feature data Theta_Std+FE based on the ReliefF_SFS method after classification and recognition using a KNN classifier. From Table 6, it can be seen that the channel selection method based on Theta_Std+FE features combined with ReliefF_SFS achieves 99.45% classification accuracy when using the 6 channels with the highest weights, named (numbered) as T6 (27), O1 (28), Oz (29), T4 (17), P3 (24), and FC3 (9).

Comparative Analysis.
Frequency domain, fuzzy entropy, and fusion features were extracted from the EEG data and then processed using the ReliefF_SFS channel selection method proposed in this paper. The accuracy and number of channels obtained are shown in Table 7. As can be seen from Table 7, the channel selection method based on Theta_Std+FE features combined with ReliefF_SFS has the best performance in terms of both the number of channels and accuracy, with 99.45% classification accuracy when only six channels (T6, O1, Oz, T4, P3, and FC3) are used.
Fatigue driving recognition experiments were also run on the same data set with the algorithm in this paper and with algorithms from other papers. Compared with them, the proposed method reduces the number of channels and improves the accuracy, which indicates that it is feasible. A likely reason is that the full-channel data contain redundant or unimportant data, which lowers the accuracy when all channels are used for driver fatigue recognition. Table 8 lists some of the compared methods and their corresponding channel numbers and recognition accuracies. It can be seen that the proposed channel selection method based on Theta_Std+FE features combined with ReliefF_SFS has the best recognition accuracy.

Subject-Specific Validity of Selected Channels.
To verify the subject-specific validity of channel selection with ReliefF_SFS, we draw the brain topographic maps for the selected subjects. The specific process is as follows. Firstly, we randomly select 5 subjects from the dataset. Secondly, for each subject, we calculate the specific features (frequency domain features, fuzzy entropy features, and fusion features) for each channel. Thirdly, we normalize the selected features and then draw their brain topographic maps. Figure 4 shows the brain topographic map of each subject based on the frequency domain features. Figure 5 shows the brain topographic map of each subject based on the fuzzy entropy features. Figure 6 shows the brain topographic map of each subject based on the fusion features.
Each row in the figures shows the brain topography of the normalized features for one selected subject in the resting and fatigue states. JX1 and ZD1 denote the brain topography of the first subject's resting and fatigue states, respectively, and the other labels follow the same pattern. The darker the area of the graph, the greater the feature value of the channel.
As can be seen from Figures 4-6, overall, the channel area with the highest weights based on different features varies among states. The variation in features of the selected channels by our method is more obvious for different features. This also validates that the channels chosen by our approach are significant.

Conclusion
In this paper, a channel selection model based on ReliefF_SFS is proposed by extracting different features of EEG signals and combining the weight calculation of the ReliefF algorithm and the feature selection of the SFS algorithm. The experimental results show that the channel selection method proposed in this paper is feasible, and the number of channels is reduced while the recognition accuracy is guaranteed, which is of great significance for the implementation of practical applications.

Data Availability
The data that support the findings of this study are restricted to protect the subjects' privacy.

Conflicts of Interest
The authors declare no conflicts of interest.