Time-Frequency Analysis of EEG Signals and GLCM Features for Depth of Anesthesia Monitoring

One of the important tasks in the operating room is monitoring the depth of anesthesia (DoA) during surgery, and noninvasive techniques are very popular. Hence, we propose a new scheme for DoA monitoring considering the time-frequency analysis of electroencephalography (EEG) signals and GLCM features extracted from them. To this end, at first, the time-frequency map (TFM) of each channel of each EEG is computed by smoothed pseudo-Wigner–Ville distribution (SPWVD), where the EEG signal used in this paper is recorded in 15 channels. After that, we consider the gray-level co-occurrence matrix (GLCM) to obtain the content of TFM, and after that, four features such as homogeneity, correlation, energy, and contrast are obtained for each GLCM. Finally, after the selection of efficient features using the minimum redundancy maximum relevance (MRMR) method, the K-nearest neighbor (KNN) classifier is utilized to determine the DoA. Here, we consider the three states, namely, deep hypnotic, surgical anesthesia, and sedation and awake states according to bispectral index (BIS), and each EEG epoch is classified to these states. We also employ data augmentation to enhance the training phase and increase accuracy. We obtain the accuracy and confusion matrix of the proposed method. We also analyze the effects of a number of gray levels of GLCM, distance measure in KNN classifier, and parameters of data augmentation on the performance of the proposed method. Results indicate the efficiency of the proposed method to determine the DoA during surgery.


Introduction
General anesthesia (GA) is a necessary state for many surgical procedures. ere are several essential features of anesthesia which are displayed by patients. Some of these features are the lack of movement, awareness, and recall of the surgical intervention as well as unresponsiveness to painful stimuli [1,2]. As overly light anesthesia is the most common cause of awareness, anesthesiologists use several indicators to measure the depth of anesthesia (DoA). A continuum of progressive central nervous system (CNS) depression and decreased responsiveness to stimulation is referred to as DoA or depth of hypnosis [3,4].
One of the most forlorn and fearsome senses is awareness during anesthesia. It is a complication with potential long-term psychological consequences like posttraumatic irritability, stress, and anxiety [5]. Monitoring DoA is a solution to this issue. e most important task in the operational room is preventing excessive DoA or awareness and improving patients' outcomes which can be performed by precise drug delivery to the patients. An accurate evaluation of DoA can help us to this end [6].
Numerous approaches and devices were designed and produced to assess DoA. ese devices work based on clinical/conventional monitoring and/or brain electrical activity monitoring, and each of them has its special drawbacks. Previous studies demonstrate that since CNS is the final target for GA drugs, the electroencephalogram (EEG) signal can be more informative for the determination of DoA compared to those works just based on simple vital signs. BIS [7], auditory evoked potential (AEP) monitors [8], entropy [9], and narcotrend [10] are some of EEG-based commercially DoA monitors. It turned out that these devices are not exactly accurate and suffer from several drawbacks.
Recently, several works aimed at introducing new methods to measure the DoA. Bayesian techniques were employed in [11] to the assessment of DoA, where the limiting large-sample normal distribution was considered, and it was shown that the maximum a posteriori (MAP) values increase gradually as the anesthesia states change from awake to light, moderate, and deep anesthesia. Distinguishing awake states from GA using EEG signals was the aim of authors in [12]. ey extracted 11 features from EEG signals such as entropy, fractal, and spectral, and then, efficient ones were selected. It was found that entropies including permutation and sample, Beta-index, and detrended fluctuation analysis yield the highest accuracy with a classifier based on the adaptive neuro-fuzzy inference with linguistic hedges. Quasi-periodicities in EEG were used for analyzing the variations of DoA in [13]. To this end, phaserectified signal averaging was employed. e results indicated that this method achieves better results than the sample and permutation entropies. e six features including beta ratio, spectral edge frequency, and four bands of spectral energy were extracted from the EEG signal, and then, the decision tree classifier was used to determine the DoA in [14]. e authors considered the four classes for DoA as deep, moderate, and light versus awake state. In [15], at first, the noise removal from EEG signals was performed by Hurst's method, and the maximum of Hurst's ranges was considered as EEG response. en, it was shown that maximum PSD can be used to distinct transitions of DoA states. Atomic decomposition was considered in [16] to decompose the EEG signals. en, several features were extracted from decomposed subbands, and the SVM classifier discriminates between awake and sedated states. e near-infrared spectroscopy (NIRS) signals were considered for recording the cerebral hemodynamic variables in [17] to monitor the DoA. e authors proposed to measure the sample entropy to describe the complexity information of cerebral hemodynamic variables. e multimodal system, which simultaneously uses the EEG and NIRS signals, was considered in [18] to monitor the DoA. It was shown that, with the EEG + NIRS signals, the clinically important transition from the awake to deep state can be detected, while transition in a clinical trial cannot be detected by BIS.
Ordinal power spectral density (O-PSD) was introduced in [19] for measuring the DoA. A deep neural network (DNN) named AnesNet was introduced in [20] to quantify the DoA. e raw EEG signals were given to a convolutional neural network (CNN) with convolution, pooling, and fully connected layers to determine the DoA. In [21], short-time Fourier transform (STFT) was employed for obtaining PSD, and CNN determines the DoA according to the given PSD. e wavelet transform was employed in [22] for analyzing the DoA from EEG signals. For this purpose, extracted features were clustered using a classifier based on the wavelet. Also, the specifications of eigenvectors were considered for extracting the specs of the midlatency auditory evoked EEG under anesthesia.
Since EEG signals present nonlinear characteristics in anesthesia conditions [23], we present a method for DoA monitoring considering the time-frequency analysis of EEG signals. We employ the smoothed pseudo-Wigner-Ville distribution (SPWVD) [24] to obtain a time-frequency map (TFM) of EEG epochs. In order to obtain the characteristics of TFM, we employ the gray-level co-occurrence matrix (GLCM) and, then, extract four features from it. Since all extracted features are not informative, we employ minimum redundancy maximum relevance (MRMR) to select the efficient ones from the feature vector. Finally, the K-nearest neighbor classifier (KNN) determines the depth of anesthesia. We also utilize the data augmentation by adding zeromean white Gaussian noise to training samples in order to enhance the generality of the trained classifier. Hence, the contributions of this research can be summarized as follows: (i) Employing SPWVD for obtaining the TFM of EEG signals (ii) Employing GLCM features in order to describe the time-frequency content (iii) Feature selection by MRMR algorithm to reduce the complexity of classification (iv) Employing data augmentation to increase the generality of KNN classifier (v) Obtaining the accuracy and confusion matrix of the proposed scheme (vi) Analyzing the accuracy for different distance measures as well as different number of gray levels and augmentation parameters Following this introduction, Section 2 describes the recorded EEG signals which are used in this research. Section 3 explains the proposed for monitoring DoA in detail. Section 4 contains the results and eventually. Section 5 presents the concluding remarks and the directions of future works.

Data
Six female participants aging in the range 26-72 years old, with a mean of 45.5 years old, were contributed in order to record the required data. ese participants were scheduled for elective gynecological surgeries. It should be mentioned that this research was approved by the Institutional Research Ethics Committee. According to the American Association of Anesthesiology physical status classification, all patients were in ASA I and ASA II. All patients confirmed informed consent before initiation of data recording [6].
In order to avoid unnecessary delays in the surgical program, the preoperation (Pre-Op) period started about an hour before surgery, and then, spontaneous EEGs were recorded for 5 minutes. After that, the patients were transferred to the operation room and electrodes of the BIS device were attached before starting the operational (OP) 2 Computational Intelligence and Neuroscience period. Spontaneous EEG signals were recorded during maintenance and emergence periods of anesthesia. A long recording was made ten minutes before the anesthesia was stopped and waking up. Table 1 contains the duration of spontaneous EEG and BIS data recording during surgery for different patients [6]. e EEG electrodes were montaged according to the 10/ 20 standard to include 15 channels, namely, Fp1, Fp2, F7, F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P3, Pz, and P4 [6]. Simultaneously, BIS was recorded in a parallel manner using the available anesthesia monitor (Aspect Medical Systems) and was considered as a reference. BIS index is in the range [0, 100], where 0 denotes full cortical silence and 100 is the fully awake state. e appropriate state for adequate surgical anesthesia is the BIS level between 40 and 60 [2]. e EEG signals were segmented into the epochs of 30 seconds, which have 50% overlap with each other. e average of BIS during each epoch was computed, and each was labeled as deep hypnotic (D), surgical anesthesia (A), and sedation and awake (S) considering the average BIS of an epoch. e epochs with the average BIS smaller than the 40 are labeled as D. e average BIS between 40 and 60 labels the epoch as A. e epochs with average BIS greater than 60 are called S. In Table 2, the numbers of epochs from each label for different subjects are reported. Figure 1 presents the procedure of the proposed scheme for DoA classification. As shown, the proposed scheme generally determines the DoA in three steps including preprocessing, feature extraction and selection, and classification, which are explained in detail in the following.

3.1.
Preprocessing. At first, artifacts and corrupted BIS data were identified and removed manually from raw recorded signals. e frequency content of the cleaned data is in the range [0, 300] Hz and has a maximum amplitude of about 100 μV. At first, a high-pass filter with a cutoff frequency of 0.5 Hz was employed to remove the disturbances at very low frequencies. Also, to remove the high-frequency noise, a low-pass filter with a cutoff frequency of 70 Hz was used. Furthermore, a notch filter with a null frequency of 50 Hz was employed to remove power supply noise. Since the maximum informative frequencies of EEG signals are less than 60 Hz, we consider the decimation to reduce the sampling frequency from 1000 Hz to 100 Hz, which reduces the computational complexity [21]. One epoch from states of overdeep, surgical anesthesia, and sedation and awake in different channels is shown in Figures 2-4 , respectively.

Feature Extraction.
e proposed feature extraction in this paper employs the three steps as follows: In the following, we present each step in more detail.

Time-Frequency Analysis.
It was mentioned that EEG signals are not stationary and should be analyzed as pseudostationary signals.
erefore, traditional frequency-domain analysis tools such as Fourier transform cannot help to characterize it in the frequency domain. Hence, we should consider time-frequency transforms in order to obtain the frequency-domain content. e first step for computing the WVD of a signal x(t) is obtaining its analytical signal as where the Hilbert transform of x(t), i.e., x(t), is defined as Accordingly, the Winger distribution is defined as where (·) * denotes the conjugate operation and j � �� � −1 √ . e Wigner-Ville distribution provides the energy density in the time-frequency domain. is distribution is one of the Cohen class distributions. Distribution smoothing should be applied to diminish the cross-term, which is the zero-density energy components of Winger distribution. Reducing the crossterm enhances the accuracy of distribution. In order to improve WVD, a smoothing function, either frequency or time smoothing function, can be added [25,26].
Considering the frequency smoothing function h(τ), the pseudo-WVD (PWVD) is defined as As the smoothing function g(t) is added in the time domain, the smoothed PWVD (SPWVD) is obtained as  Computational Intelligence and Neuroscience 3 It should be mentioned that both h(τ) and g(t) are rectangular windows.
EEG signals are recorded in 15 channels; hence, SPWVD should be calculated for each channel separately.  the time-frequency plane. It is observed that the deep hypnotic epoch demonstrates the minimum activity compared to the surgical anesthesia and sedation and awake epochs. Because in this case person has the lowest activity and as observed EEG rhythms with frequencies lower than two Hz are dominant. Comparing Figure 6 with Figures 5 and 7 depicts that the EEG epoch from the surgical anesthesia class has higher activity in the frequency range in the band [0 8] Hz compared to surgical anesthesia and sedation and awake states. e activity in higher frequencies reduces in sedation and awake compared to the surgical anesthesia state, but it is higher than the deep hypnotic state.
3.2.2. GLCM Features. As shown, the EEG epochs have different time-frequency contents during different levels of GA. Hence, we can use texture analysis methods to describe the time-frequency content. Several texture-based methods were introduced to obtain the content of images such as local binary pattern (LBP) [27][28][29][30], autocorrelation function (ACF) [31], binary Gabor pattern (BGP) [32], gray-level cooccurrence matrix (GLCM) [28,33], and local spiking pattern (LSP) [34]. Different combinations of gray levels within the image can be described by GLCM which can be useful in the identification of the different regions of interest in the images. GLCM extracts the texture features considering the second-order relationship between reference and neighboring pixels [35,36]. GLCM develops the co-occurrence matrix by comparing the pixel values of neighboring pixels, where the number of rows and columns of the matrix is equal to a number of gray levels. After computing GLCM for k th channel of EEG signal, i.e., G k , with L levels, four features are extracted from it including contrast (c k ), correlation (r k ), energy (e k ), and homogeneity (h k ) which are computed as follows [37]: Computational Intelligence and Neuroscience 5 where Let the 4 × 1 vector ] k � [c k , r k , e k , h k ] T denote the feature vector for each channel; hence, the feature vector of each epoch considering all channels with 60 features is obtained as follows:  [40]. Its goal is to find the set of features such that selected features are mutually and maximally dissimilar which is achieved by minimizing the redundancy and maximization of the relevance of the selected features to the actual classes. MRMR considers the pairwise mutual information of features and mutual information of a feature and actual classes to quantify the redundancy and relevance. Let r h and e h denote the relevance of h with respect to a response y and the redundancy of h, respectively, which are defined as follows: where |h| is the cardinality of set h and I (A, B) is the mutual information of two sets A and B which is computed as follows: e selected feature set h should maximize the r h and minimize the e h . ere are 2 |f| combinations and finding the optimal set h requires them all. But, MRMR employs the forward addition scheme by using the mutual information quotient (MIQ) value to rank the features. MIQ value is defined as where r f and e f denote the relevance and redundancy of feature f, respectively, and defined as , y), e steps by which MRMR selects the features in the steps are as follows.
(1) Initialize the set of selected features as h �.   I(x, z) . (14) (6) Repeat step 5 until the relevance is zero for all features in h c . (7) Add the features with zero relevance to h in random order.

Classification.
e nonlinear and commonly used KNN classifier is considered in this paper to classify the features obtained from the feature selection step. KNN classifies the samples based on the distance between the unknown features and the features of training samples. It considers the labels of K as the most similar neighbors to predict the class of the training samples and considers the label of the class with the greatest number of samples among them [41][42][43][44]. In this study, we consider several distance measures to compute the similarity between the test and training samples.
Consider two feature vectors with N features as v � v 1 , . . . , v N T and w � w 1 , . . . , w N T . e different distance metrics are calculated as follows:   Computational Intelligence and Neuroscience (i) Euclidean distance: (ii) Standardized Euclidean distance: where V denotes the diagonal N × N matrix. e ith diagonal element of V is S 2 (i), where S is a vector of scaling factors for each dimension. (iii) Minkowski distance: (iv) Chebyshev distance: (v) City block distance:  Computational Intelligence and Neuroscience (vi) Cosine distance: (20) (vii) Correlation distance: , (21) where v and w are the mean of vectors v and w, which are computed as follows: (viii) Mahalanobis distance: where C denotes covariance matrix.
It should be noted that, for p � 1, p � 2, and p � ∞, Minkowski and the city block distances are the same, Euclidean distance, and Chebyshev distances, respectively.

Results
Here, we provide the obtained results to demonstrate the robustness of the proposed DoA classification, where a 10fold cross-validation method is used to obtain the accuracy. In this way, the available data is randomly partitioned into 10 nonoverlapping parts with an equal number of samples, and training and testing procedure is repeated 10 times. At each time, nine parts are considered as training samples and one part is used for testing. Also, training and test are performed for each subject separately, and the results are reported for each subject.
In order to increase the number of training samples and enhancing the generality of the trained classifier, the data augmentation scheme is presented in [45].
is method increases the number of training samples by adding zeromean Gaussian noise with standard deviation σ to them. It should be noted that data augmentation is only performed on training samples, and test samples are those selected from original samples. e classification accuracy is evaluated for several augmented multiples and noise variances.

Classification Accuracy.
e classification accuracy of the proposed DoA classification is presented in Table 3 considering the features extracted from each channel of recorded EEG signals as well as considering whole features extracted from all channels. It is observed that the selection of the channel has considerable effect accuracy. ese results are obtained considering augmented multiple and noise variance equal to 35 and 0.1, respectively. It should be noted that the performance of the proposed method is presented for different values of the augmented multiple and noise variances in the following subsections. Also, the number of gray levels is set to 16, and Mahalanobis distance is considered for measuring the distance between training and test samples.
It is observed that channel T7 provides the highest average accuracy which is equal to 75.54%. Also, subjects 4 and 6 have the highest accuracy as 67.93% and 82.41%, respectively. On the other side, channel Fz yields the lowest average accuracy at 65.63%, where subjects 4 and 2 have the lowest and highest accuracy equal to 52.29% and 77.98%, respectively.
According to reported results, considering whole extracted features from all channels enhances the classification accuracy considerably. In this case, the lowest and highest accuracy results belong to subjects 3 and 5 with 93.34% and 96.92%, respectively, and an average result of 95.32% is obtained. ese results indicate the efficiency of the proposed method for the classification of EEG signals in order to determine DoA. Table 4, in the revised version, compares the performances of several TFMs and classifiers. We considered the PWVD, WVD, and STFT as well as KNN, SVM, random forest, and decision tree classifiers. It is observed that the pair (SPWVD, KNN) outperforms the other pairs of TFMs and classifiers. For all TFMs, KNN provides the best accuracy, and then SVM yields good performance. Among TFMs, SPWVD and WVD have the highest and lowest accuracy, respectively. Table 5 presents the confusion matrix of the proposed method for the classification of DoA for different subjects. It is observed that the proposed method provides efficient accuracy for all subjects in the presence of biased data. It is noticeable that the proposed method does not classify the deep hypnotic as sedation and awake epoch and vice versa, which indicates the efficiency of the proposed method. e misclassifications of epochs from classes deep hypnotic and sedation and awake are classified as class surgical anesthesia. Also, most misclassifications of epochs from surgical anesthesia are classified as sedation and awake. e sensitivity of different classes is also provided in Table 4 which indicates the classification accuracy of each class in different subjects.

e Effect of the Number of Gray Levels.
e effect of the number of gray levels in computing GLCM on the performance of the proposed DoA classification is presented in Table 6 in which the classification accuracy is obtained considering all channels. It is observed that, for all subjects, accuracy increases considerably as the number of gray levels increases from four to 16 while increasing the number of gray levels from 16 to 64 reduces the accuracy. Hence, 16 gray levels are considered in order to extract texture-based features from time-frequency images obtained from SPWVD.

Accuracy of Distance Measures.
e accuracy of the proposed method considering different distance measures considered in the KNN classifier is given in Table 7. As mentioned earlier, the Mahalanobis distance provides the highest accuracy of 95.32%, and after that, the Chebyshev distance reaches the accuracy of 93.26%. Also, the standardized Euclidean distance has the lowest accuracy of 90.73%.

e Effect of Augmentation.
e effect of the augmentation multiple and noise variances in data augmentation on   Computational Intelligence and Neuroscience the performance of the proposed DoA classification is shown in Figure 8, where the average accuracy of all subjects is reported. As observed from Figure 8(a), increasing the augmentation multiple from 1 to 35 increases the accuracy, but after that, accuracy reduces. Hence, the augmentation multiple is set to 35 in all results reported previously in this section. When the greater number of augmented data increases, it is possible that there is outlier data which reduces the accuracy. Also, it is seen from Figure 8(b) that increasing the noise variance from 0.1 to 0.1 increases the accuracy, but after that, accuracy reduces; therefore, we choose the variance of 0.1.

Conclusion
In this paper, we presented a noninvasive method for monitor the DoA based on time-frequency analysis of EEG signals recorded in 15 channels considering the electrode placement in 10/20 standard. EEG signals were recorded from six subjects and were partitioned into epoch with the length of 30 seconds, where consequent epochs have 50% overlap with each other. e TFM of each channel was calculated using SPWVD. Obtained TFM showed that frequencies higher than 8 Hz have near-zero amplitude, and we can remove them from TFM. en, GLCM was employed to   describe the gray content of each TFM. Contrast, correlation, energy, and homogeneity were calculated for each GLCM; hence, the feature vector of each epoch was constructed by 60 features. e redundant features were removed by the MRMR algorithm and KNN classified the remaining feature to determine the DoA. e results showed that the proposed method achieves the average accuracy of 95.32% with 16 gray levels and Mahalanobis distance and the minimum and maximum accuracy among subjects are 93.34% and 96.92%, respectively. e accuracy with high mean and low variation among subjects indicates the efficiency of the proposed method. We also analyzed the effect of the parameters of data augmentation, which indicates that augmentation multiple and noise variance equal to 35 and 0.1, respectively, achieve the highest accuracy.
Employing the methods based on deep learning and transfer learning can be considered as future works. Convolutional neural networks (CNNs) to classify the TFM are obtained from time-frequency analysis. From the viewpoint of time-series analysis, we can use long short-term memory (LSTM) networks to determine the DoA.

Data Availability
Data used to support the study are available from the corresponding author upon reasonable request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.