Covert Intention to Answer “Yes” or “No” Can Be Decoded from Single-Trial Electroencephalograms (EEGs)

Interpersonal communication is based on questions and answers, and the most useful and simplest case is the binary “yes or no” question and answer. The purpose of this study is to show that it is possible to decode intentions to answer “yes” or “no” from multichannel single-trial electroencephalograms, which were recorded while subjects covertly answered self-referential questions with either “yes” or “no.” The intention decoding algorithm consists of a common spatial pattern and a support vector machine, employed for feature extraction and pattern classification, respectively, after dividing the overall time-frequency range into subwindows of 200 ms × 2 Hz. The decoding accuracy using the information within each subwindow was investigated to find useful temporal and spectral ranges and was found to be highest for 800–1200 ms in the alpha band and 200–400 ms in the theta band. When the features from multiple subwindows were utilized together, the accuracy increased significantly, up to ∼86%. The most useful features for the “yes/no” discrimination were found to be focused in the right frontal region in the theta band and the right centroparietal region in the alpha band, which may reflect the violation of autobiographic facts and a higher cognitive load for “no” compared to “yes.” Our task requires the subjects to answer self-referential questions just as in interpersonal conversation, without any self-regulation of the brain signals or high cognitive effort, and the “yes” and “no” answers are decoded directly from the brain activities. This implies that “mind reading” in a true sense is feasible. Beyond its contribution to the fundamental understanding of the neural mechanisms of human intention, the decoding of “yes” or “no” from brain activities may eventually lead to a natural brain-computer interface.


Introduction
The most fundamental linguistic communication consists of questions and answers, and the simplest one is the binary "yes or no" question and answer. This enables fundamental interpersonal communications (e.g., "Is your name John?" "Yes" or "Do you want to drink water?" "No"). Thus, by decoding the intentions to answer either "yes" or "no" from brain activities, a natural interpersonal communication tool, which does not require any operant training or heavy cognitive effort, may be developed. As the first step toward this, here we tried to demonstrate that it is possible to decode the intentions to answer "yes" or "no" in response to self-referential questions from noninvasive electroencephalograms (EEGs) on a single-trial basis. This was motivated by our recent studies, which showed that the intentions to answer "yes" or "no" to self-referential questions are represented significantly differently in event-related EEGs, particularly in alpha-band activities [1,2]. Direct decoding of "yes" and "no" intentions may eventually lead to advancement of the brain-computer interface (BCI), a technological means to deliver a user's intention to the external world (a device or other people) without behavioral outputs, by direct interpretation of brain activities. The most important targets of the BCI are patients with severe motor impairment who are unable to communicate with others, including those in the completely locked-in state (CLIS) due to amyotrophic lateral sclerosis, spinal cord injury, or brainstem stroke [3-6]. One of the most crucial technologies to enable the BCI is to read or "decode" the users' intention from their brain activities. Two major approaches have been pursued for intention decoding. The first is based on voluntary self-regulation of specific brain signals such as the slow cortical potential [7] and sensorimotor rhythms [8]. This requires extensive operant training using feedback and reward.
Unfortunately, many people are unable to regulate their brain activities as required, a phenomenon known as "BCI illiteracy" [9,10]. The other approach utilizes evoked brain activities such as the P300 event-related potential (ERP) [11,12] and the steady-state evoked potential [13,14]. Operant training is not required, but sustained attention is needed to induce discriminable increases in brain responses, resulting in a significant cognitive workload.
Both approaches may not be successful for patients with CLIS [15]. It is speculated that the failure is due to the extinction of goal-directed cognition and thought in CLIS patients [15]. An alternative approach to mind reading, one that does not require volitional and highly cognitive efforts, is therefore crucial. Birbaumer and colleagues suggested an approach based on classical conditioning [16-18]. They tried to associate language stimuli with unpleasant and painful sensory stimuli so that cortical responses to these nonlanguage stimuli are conditioned according to the language stimuli. This is remarkable considering that language is the most natural means of communication.
The specific aim of this study is to show that it is feasible to decode "yes" and "no" answers in mind from single-trial EEGs. We demonstrated that mind reading in a true sense, based on predicting the intentions to answer questions from brain activities, is achievable. For the intention decoding, the discriminative characteristics of EEGs that we found in our previous study were utilized to find the time-frequency features for "yes/no" decoding. The decoding algorithm was developed based on the same data used in our previous study [2].

Materials and Methods
2.1. Subjects. Twenty-three subjects with no record of neurological or psychiatric illness participated in the experiment (age: 23.13 ± 2.97 years, 12 males). All subjects were undergraduate students of Yonsei University and right-handed native Korean speakers. Written informed consent was obtained from each subject before the experiment. The experimental procedure was approved by the Yonsei University Wonju Institutional Review Board (IRB). All experiments were performed in accordance with the guidelines and regulations of the IRB.

Experimental Task.
Before the experiment, all subjects completed a written questionnaire on their autobiographical facts (e.g., job, name, age, and date of birth). We generated two opposite types of questions from a single autobiographical fact: one question should be answered "yes," and the other (i.e., an autobiographical fact violation (AFV)) should be answered "no." These two questions were almost identical except for one critical word (italicized in the example below), which determined whether the question agreed with the subject's identity. For example, if the subject's job was a student, the two questions were as follows: Type (a), to be answered "yes": Is your job a student? Type (b), to be answered "no": Is your job a teacher?
In total, 40 type (a) questions and 40 type (b) questions were generated from the questionnaire for each subject. Each question was composed of 2 or 3 Korean words, and the average number of characters (Korean "Hangul") in each critical word was 3.18 ± 1.02. Each character was 3.3 cm wide and 4.27 cm high.
All questions were presented visually through commercial software (PRESENTATION; Neurobehavioral Systems, Berkeley, CA). After explaining the detailed procedure of the experimental task, we requested the subjects to watch each word presented on a 17-inch computer screen carefully so that they could respond to the critical words as soon as possible. The distance between the subject's eyes and the monitor was set to ∼0.75 m. Each word in a question was presented sequentially, one by one, at the center of the monitor, as described below. Figure 1(a) illustrates the experimental procedure. A cross mark ("+") for fixation appeared for 1000 ms, followed by a black screen for 300 ms. Then, each word in a question was presented sequentially for 300 ms, with a black screen for 300 ms between the words. The last word in the question, referred to as the critical word (CW), was presented for 300 ms along with a question mark. Although this question mark may naturally induce a decision of "yes" or "no" and thus evoke an answer automatically, we instructed the subjects not to make any response, either covertly or overtly, but to retain the answer in mind during the following 1 s blank period. This enabled us to explore the cortical activity during the retention of the "yes" or "no" information in working memory (WM). Finally, when "Please respond" (in Korean) was presented for 300 ms, the subjects were requested to respond covertly in mind with either "yes" or "no," without any behavioral response. Figure 1(b) illustrates the expected temporal sequence of cognitive processing from the CW onset until the "Please respond" cue appeared, based on our previous studies on cortical information processing of intention [1,2], which showed that the brain activities differed between "yes" and "no" answers at both early (0-600 ms) and late periods (600-1300 ms) relative to the CW onset.
We found that the early period was associated with semantic processing and the automatic decision to answer [1] (denoted by a red box in Figure 1(b)), while the late period was involved in the retention of the answer in memory (denoted by a blue box in Figure 1(b)) until the "Respond" cue appeared (denoted by a yellow box in Figure 1(b)) [2]. Thus, the temporal period of interest for decoding the intentions to answer "yes" or "no" was the late period, corresponding to retaining the intention in mind (600-1300 ms).
Each subject performed two blocks of tasks. Each block included all questions generated from the questionnaire (i.e., 40 type (a) and 40 type (b) questions), and 10 of the 40 questions of each type were randomly selected and presented once again. Consequently, each block included 50 type (a) and 50 type (b) questions in total. The average duration of each single trial (i.e., one question and answer) was 4380 ± 274.95 ms. The total time for performing the tasks was approximately 20 minutes, including at least 5 minutes of rest between blocks.

Electroencephalogram (EEG) Recording and Data Analysis.
Sixty-channel EEGs were recorded based on the 10-10 system using an EEG amplifier (Brain Products GmbH, Munich, Germany) with an Ag/AgCl electrode cap (EASYCAP, FMS, Munich, Germany). The ground and reference electrodes were at AFz and the linked mastoids, respectively. The impedances of all electrodes were kept under 10 kΩ. The sampling rate was 500 samples/s. A bandpass filter (0.03-100 Hz) and a notch filter (60 Hz) were applied in order to reduce background noise and powerline interference.
The open-source toolbox EEGLAB was used for the whole preprocessing procedure [19]. First, single-trial EEGs were segmented in the −500 to 1300 ms period relative to the critical word onset. By visual inspection, we removed single-trial waveforms severely contaminated by nonstereotyped artifacts such as drifts and discontinuities. Then, an independent component analysis (ICA) was applied to the remaining single-trial EEGs in order to correct ocular and muscular artifacts [20]. The group-averaged percentage of epochs remaining per subject was 98.88 ± 3.08% and 97.96 ± 5.86% for "yes" and "no" questions, respectively. Figure 2(a) illustrates the structure of the "yes/no" intention decoding algorithm using local time-frequency information. First, we selected 29 channels out of 60 (i.e., Fp1, Fpz, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CP1, CP2, CP6, P7, P3, Pz, P4, P8, O1, Oz, and O2), following the standard 10-20 system. This was based on a recent simulation study which showed that the decoding accuracy with common spatial pattern (CSP) spatial filtering was optimal when ∼30 channels were used and decreased for more channels [21]. The overall time-frequency range (0-1200 ms, 4-50 Hz) was divided into subwindows of 200 ms × 2 Hz. The intention decoding within each local time-frequency subwindow was performed as follows. Single-trial EEGs were bandpass filtered in the frequency range of the subwindow using a linear-phase finite impulse response filter (filter order: 512, bandwidth: 2 Hz). The multichannel bandpass-filtered signals within the temporal period of the subwindow were subsequently projected to a lower dimension (four dimensions) by the CSP algorithm [22]. The four time series obtained from the CSP spatial filter were used to construct a four-dimensional feature vector, which was passed to a support vector machine (SVM) classifier.
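The sub-window tiling and per-window filtering described above can be sketched as follows. This is a minimal illustration with assumed array shapes and helper names, not the authors' code; SciPy's FIR routines stand in for the unspecified filter design, and zero-phase `filtfilt` is used in place of a single forward pass.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

FS = 500                     # sampling rate (samples/s), as in the recording
T_STEP, F_STEP = 0.2, 2.0    # 200 ms x 2 Hz sub-windows

def subwindow_grid(t_range=(0.0, 1.2), f_range=(4.0, 50.0)):
    """Enumerate (t_lo, t_hi, f_lo, f_hi) tiles over the full range."""
    return [(t, t + T_STEP, f, f + F_STEP)
            for f in np.arange(*f_range, F_STEP)
            for t in np.arange(*t_range, T_STEP)]

def extract_subwindow(epochs, t_lo, t_hi, f_lo, f_hi):
    """Band-pass filter (order-512 linear-phase FIR) and crop in time.

    epochs: (n_trials, n_channels, n_samples), with t = 0 at the first
    sample (assumes epochs already cropped to start at the critical word).
    """
    taps = firwin(513, [f_lo, f_hi], pass_zero=False, fs=FS)  # 513 taps = order 512
    pad = min(3 * len(taps), epochs.shape[-1] - 1)            # keep padlen valid for short epochs
    filtered = filtfilt(taps, [1.0], epochs, axis=-1, padlen=pad)
    return filtered[:, :, int(t_lo * FS):int(t_hi * FS)]
```

The grid yields 6 time steps × 23 frequency steps, i.e., one classifier per tile in the scheme above.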
The final output of the classifier was either "yes" or "no," a decision of the answer for each single trial. The performance of the trained classifier was validated by 10-fold cross-validation as follows. First, for each class, we randomly split all trials into 10 folds with the same number of trials (i.e., ∼10 trials per fold for each class). Then, we selected one fold (the kth, where k = 1, 2, ..., 10) as the test data (10%) and trained the classifier using the rest of the data (i.e., the 9 folds excepting the kth fold, 90%). In order to keep a balance between the numbers of "yes" and "no" trials, the training/testing data were selected within each class (i.e., "yes" or "no"), as shown in Figure 2(b). The ground truth for each single trial was determined by whether the question in the trial included an AFV or not. The decoding accuracy for each subject was estimated by averaging the ratio of correct classifications over the 10 repetitions (i.e., k = 1, 2, ..., 10) of this procedure.
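The class-balanced 10-fold cross-validation can be sketched with scikit-learn, where `StratifiedKFold` reproduces the per-class splitting of Figure 2(b). The SVM parameters below are placeholders, not the values used in the study.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def decode_accuracy(features, labels, seed=0):
    """Estimate decoding accuracy by class-balanced 10-fold CV.

    features: (n_trials, 4) CSP feature vectors; labels: 0 = "yes", 1 = "no".
    StratifiedKFold keeps the yes/no ratio equal in every fold, mirroring
    the within-class splitting described above. SVM parameters are
    illustrative assumptions.
    """
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    accs = []
    for train_idx, test_idx in skf.split(features, labels):
        clf = SVC(kernel="rbf", C=1.0, gamma="scale")
        clf.fit(features[train_idx], labels[train_idx])          # train on 9 folds (90%)
        accs.append(clf.score(features[test_idx], labels[test_idx]))  # test on 1 fold (10%)
    return float(np.mean(accs))
```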

Yes/No Decoding.
Additionally, we also made effective use of all the features obtained from multiple time-frequency subwindows, in order to investigate whether more accurate decoding is possible by combining useful features, each of which was localized in the time-frequency domain.
Time-frequency subwindows were selected if their decoding accuracies were higher than a predetermined threshold (2 × standard deviation above the mean over all time-frequency subwindows). Then, the classifier was trained and tested as described above, with input feature vectors obtained by combining all the selected subwindows.
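Under the stated threshold rule, subwindow selection and feature concatenation might look like this (function names and the data layout are our assumptions for illustration):

```python
import numpy as np

def select_subwindows(acc_map):
    """Pick tiles whose decoding accuracy exceeds mean + 2 * SD.

    acc_map: dict mapping a tile key (e.g., (t_lo, t_hi, f_lo, f_hi))
    to its decoding accuracy in percent. Returns the selected keys.
    """
    vals = np.array(list(acc_map.values()))
    thr = vals.mean() + 2 * vals.std()
    return [k for k, v in acc_map.items() if v > thr]

def combine_features(per_tile_features, selected):
    """Concatenate the 4-D CSP feature vectors of the selected tiles,
    e.g., 3 tiles -> 12 features per trial."""
    return np.concatenate([per_tile_features[k] for k in selected], axis=1)
```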

Event-Related Spectral Perturbation (ERSP) Analysis.
The time-frequency activation patterns, i.e., ERSPs, were investigated to reveal statistical differences between "yes" and "no" and to find the time and frequency ranges of interest for effective classification. A continuous wavelet transform (CWT) based on a complex Morlet wavelet was used for the ERSP analysis [23]. The number of cycles for the CWT increased linearly with frequency, from 4 at the lowest frequency (1 Hz) to 13.5 at the highest (100 Hz) [19].
This method provides better frequency resolution at high frequencies and is better matched to the linear scale that we adopted to visualize the time-frequency map [19]. The induced spectral power was calculated by averaging the ERSP patterns of the single trials [24]. The time-frequency distribution of the ERSP patterns was represented as the ratio of the relative change to the power in a baseline interval from −300 to 0 ms prior to stimulus onset, to reduce intersubject variability and to normalize power changes across different frequency bands.
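A minimal ERSP computation along these lines, with a hand-rolled complex Morlet transform and the baseline-ratio normalization, can be sketched as follows (array layouts and the linear cycle interpolation are our assumptions; an actual analysis would use EEGLAB's routines):

```python
import numpy as np

FS = 500  # sampling rate (samples/s)

def morlet_power(signal, freqs, n_cycles):
    """Single-channel time-frequency power via complex Morlet convolution."""
    t = np.arange(-0.5, 0.5, 1 / FS)          # wavelet support shorter than the epoch
    powers = []
    for f, nc in zip(freqs, n_cycles):
        sigma = nc / (2 * np.pi * f)          # temporal width from cycle count
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma**2))
        wavelet /= np.abs(wavelet).sum()
        analytic = np.convolve(signal, wavelet, mode="same")
        powers.append(np.abs(analytic) ** 2)
    return np.array(powers)                   # (n_freqs, n_samples)

def ersp(epochs, freqs, times, base=(-0.3, 0.0)):
    """Induced ERSP: per-trial power, trial average, baseline ratio."""
    # Cycles grow linearly from 4 at 1 Hz to 13.5 at 100 Hz, as above.
    n_cycles = 4 + (freqs - 1) * (13.5 - 4) / (100 - 1)
    tf = np.mean([morlet_power(tr, freqs, n_cycles) for tr in epochs], axis=0)
    b = (times >= base[0]) & (times < base[1])
    baseline = tf[:, b].mean(axis=1, keepdims=True)
    return tf / baseline                      # ratio relative to prestimulus power
```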
We employed the mass-univariate approach with the cluster-based permutation test for correcting multiple comparisons [25] in order to find the times, frequencies, and electrodes showing significant differences between "yes" and "no" without a priori knowledge. The detailed procedure is as follows: (1) A large number of paired-sample t-tests were applied to the data for all time-frequency-electrode bins within the range of 0-1200 ms (time), 5-30 Hz (frequency), and 29 electrodes. The number of bins was 181,714 = 241 × 26 × 29, since there were 241 time samples, 26 frequency points, and 29 electrodes. The electrodes showing high t values were selected, and the average spectral power was calculated over the selected electrodes, as follows. First, from the spatial distribution of the t values averaged within the frequency band of interest (e.g., theta band: 4-8 Hz; alpha band: 8-13 Hz) during the overall time period (0-1200 ms), the electrodes showing t values above a predetermined threshold (the upper 10% of values) were selected. The average spectral power was calculated over the selected electrodes for the next step.
(2) After significant locations were found in step 1, time-frequency bins were screened as significant among all 6,266 (= 241 × 26) bins if their p values were below a predetermined threshold (p < 0.05). A cluster of time-frequency bins was formed if more than two successive bins were selected along either the time or the frequency axis. The sum of t values within the cluster, t_mass, was then calculated and compared with the null distribution from surrogate data to determine the statistical significance of the cluster (above the highest 5% of the null distribution). The null distribution of t_mass was obtained from the largest value of t_mass for each of 5,000 surrogate datasets, which were derived by random permutation of the "yes" and "no" answers.
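The cluster-statistic logic can be illustrated in one dimension (time bins only); the study's version operates on the full 2-D time-frequency plane, so this is a simplified sketch with assumed data shapes.

```python
import numpy as np
from scipy import stats

def clusters_1d(sig_mask):
    """Runs of >= 2 consecutive significant bins along one axis."""
    runs, start = [], None
    for i, s in enumerate(sig_mask):
        if s and start is None:
            start = i
        elif not s and start is not None:
            if i - start >= 2:
                runs.append((start, i))
            start = None
    if start is not None and len(sig_mask) - start >= 2:
        runs.append((start, len(sig_mask)))
    return runs

def cluster_perm_test(yes, no, n_perm=1000, alpha=0.05, seed=0):
    """Paired cluster-based permutation test (1-D simplification).

    yes, no: (n_subjects, n_bins) condition averages per subject.
    Returns [(cluster, t_mass, p)] where p compares |t_mass| against the
    null distribution of the largest |t_mass| under random label flips.
    """
    rng = np.random.default_rng(seed)
    diff = yes - no

    def tmass_clusters(d):
        t, p = stats.ttest_1samp(d, 0.0, axis=0)
        return [(c, t[c[0]:c[1]].sum()) for c in clusters_1d(p < alpha)]

    observed = tmass_clusters(diff)
    null = np.zeros(n_perm)
    for j in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=diff.shape[0])[:, None]
        null[j] = max((abs(m) for _, m in tmass_clusters(diff * flips)), default=0.0)
    return [(c, m, float((null >= abs(m)).mean())) for c, m in observed]
```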

Feature Extraction by Common Spatial Pattern (CSP) Filtering.
CSP is a mathematical procedure to derive a spatial filter which separates a multichannel signal into additive subcomponents such that the difference in variances between two classes is maximized. That is, the most discriminative features between the two classes are obtained by maximizing the variance of the spatially filtered signal of one class while minimizing that of the other [22]. The CSP algorithm is recognized to be effective for the discrimination of mental states from event-related EEG spectral power [26]. The results of the CSP can be visualized as topographic maps on the scalp, which facilitates the interpretation of their functional neuroanatomical meaning [26]. The CSP spatial filter, W, can be obtained by simultaneous diagonalization of the covariance matrices of classes 1 and 2:

W^T Σ_1 W = Λ_1, W^T Σ_2 W = Λ_2, where Λ_1 + Λ_2 = I.

Σ_1 and Σ_2 represent the spatial covariance matrices averaged over all single-trial EEGs of each class, and Λ_1 and Λ_2 denote diagonal matrices. The projection vectors w (the column vectors of W) can be obtained from a generalized eigenvalue decomposition:

Σ_1 w_k = λ_k Σ_2 w_k, k = 1, ..., C,

where w_k is the generalized eigenvector, C is the number of channels, λ_k = λ_1,k / λ_2,k, and λ_1,k = w_k^T Σ_1 w_k and λ_2,k = w_k^T Σ_2 w_k are the kth diagonal elements of Λ_1 and Λ_2, respectively. Importantly, λ_1,k and λ_2,k (which range from 0 to 1) reflect the variance of each class, and λ_1,k + λ_2,k = 1. Thus, if λ_1,k is close to 1, λ_2,k must be close to 0.
This means that the corresponding projection vector, w_k, yields high variance in class 1 but low variance in class 2. The difference in variances between the two classes enables discriminating one class from the other. The eigenvalues are sorted in descending order during the calculation, meaning that the first projection vector yields the highest variance for class 1 (but the lowest for class 2), whereas the last projection vector yields the highest variance for class 2 (but the lowest for class 1). Thus, the first and last projection vectors are the most useful for the discrimination [22]. The spatial filter W provides the decomposition of a single-trial multichannel EEG, E, as Z = W^T E, where E is a matrix with C (the number of channels) rows and T (the number of time samples) columns. The columns of W^-1 form the common spatial patterns and can be visualized as topographies on the scalp. The variances of the spatially filtered time series Z are calculated as features for the classification:

f_p = log( var(Z_p) / Σ_{i=1}^{2m} var(Z_i) ), p = 1, ..., 2m,

where Z_p is the pth retained spatially filtered time series and 2m is the number of features. m was set to 2, meaning that the first 2 and last 2 projection vectors were used as features; thus, the number of features was 4 for all classifications. The log transformation was adopted to approximate a normal distribution of the data.
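The CSP decomposition and log-variance features described above can be sketched via a generalized eigendecomposition. This is a compact illustration under our own naming conventions, not the authors' implementation; regularization of the covariance estimates is omitted.

```python
import numpy as np
from scipy.linalg import eigh

def csp_fit(trials_a, trials_b, m=2):
    """CSP by joint diagonalization of the two class covariances.

    trials_*: iterable of (n_channels, n_samples) arrays.
    Returns W (columns sorted by descending eigenvalue) and a feature
    function keeping the first and last m projections (p = 2m features).
    """
    def avg_cov(trials):
        return np.mean([np.cov(tr) for tr in trials], axis=0)

    S1, S2 = avg_cov(trials_a), avg_cov(trials_b)
    # Solve S1 w = lambda (S1 + S2) w, so that W^T S1 W = L1 and
    # W^T S2 W = L2 with L1 + L2 = I, as in the equations above.
    evals, W = eigh(S1, S1 + S2)
    order = np.argsort(evals)[::-1]
    W = W[:, order]
    keep = np.r_[np.arange(m), np.arange(-m, 0)]   # first m and last m filters

    def features(trials):
        out = []
        for tr in trials:
            Z = W[:, keep].T @ tr                  # spatially filtered time series
            v = Z.var(axis=1)
            out.append(np.log(v / v.sum()))        # log-normalized variance features
        return np.array(out)

    return W, features
```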

Pattern Classification Using Support Vector Machine (SVM).
The SVM has been recognized as a practical and robust method for the classification of human brain signals [27,28]. The SVM is trained to determine an optimal hyperplane such that the distance to the support vectors (the samples closest to the separating boundary) is maximized [29,30].
In the case of linear SVM classification, the hyperplane a^T x + b = 0 satisfies

y_i (a^T x_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, ..., N,

where x_i = {f_p,i} denotes a feature vector (in which p = 1, ..., 4) obtained from the CSP algorithm and y_i ∈ {+1, −1} denotes its correct class label. N and ξ_i denote the total number of training samples and the deviation from the optimal condition of linear separability, respectively. The pair of hyperplanes that provides the maximum separating margin can be found by minimizing the cost function (1/2) a^T a + P Σ_{i=1}^{N} ξ_i subject to the constraints above, where P > 0 represents a regularization penalty parameter for the error term. By transforming this optimization problem into its dual problem, the solution may be determined as a = Σ_{i=1}^{N} α_i y_i x_i, which achieves equality in the constraints for nonzero values of α_i only. The corresponding data samples are referred to as support vectors, which are crucial in identifying the decision boundary. Instead of the basic linear SVM, we used a radial basis function (RBF) kernel, which nonlinearly projects the feature vectors onto a higher-dimensional space and is thus better suited for nonlinear relationships between features and class labels [29]. The detailed parameters of the SVM, including the RBF kernel parameter and the regularization penalty, were determined by trial and error.

Results

Figure 3 shows the time-frequency representation of the "yes/no" decoding accuracy averaged over all subjects for each time-frequency subwindow. The time-frequency map of decoding accuracy was generated by representing the decoding accuracies averaged over all subjects within each time-frequency subwindow, which enables estimation of the decoding accuracies over the entire time-frequency range. We used two criteria to define the most important time-frequency subwindows showing high decoding accuracies. The first was a threshold decoding accuracy of 75%, determined by the theoretical 95% confidence limit of the chance level when 10 trials per class are used for testing [31].
Another criterion was that the decoding accuracy should be above the mean + 2 × standard deviation (79.34% here). Decoding accuracies above these two threshold levels were obtained for three subwindows in the alpha and theta bands (denoted by the three boxes in Figure 3), covering both early and late periods. The highest and second highest decoding accuracies were found in the upper alpha band (10-12 Hz) in the late epoch (box ①: 81.08 ± 8.89% at 1000-1200 ms; box ②: 79.99 ± 8.99% at 800-1000 ms). The third highest decoding accuracy was found in the upper theta band (6-8 Hz) in the early period (box ③: 79.76 ± 10.21% at 200-400 ms). When all 12 features within these three best time-frequency subwindows were used together, the decoding accuracy was drastically enhanced compared to the best single subwindow (10-12 Hz, 1000-1200 ms), as shown in Figure 4(a) (single: 81.08 ± 8.89%, combined: 86.03 ± 8.69%, t(22) = −5.95, p < 0.001, paired-sample t-test).

Yes/No Decoding.
The individual decoding accuracies are presented in Table 1. The sensitivity and specificity values for each time-frequency subwindow are presented in Supplementary Figure 1. Figure 5 shows the difference between the most important common spatial patterns for the "no" and "yes" answers within the three time-frequency subwindows (averaged over all subjects). Each topography was obtained from the difference between the last ("no" answer) and first ("yes" answer) columns of the inverse of the projection matrix W for each subject (Supplementary Figure 2), calculated in each time-frequency subwindow and then averaged over all subjects. The difference between the most important common spatial patterns in the alpha band showed the strongest coefficients in the right centroparietal region in both the 1000-1200 ms and 800-1000 ms periods (the leftmost and middle panels in Figure 5, respectively). The difference between the most important common spatial patterns in the theta band in the 200-400 ms period was focused in the right frontal region (the rightmost panel in Figure 5). In the theta band, three electrodes (FC2, FC6, and C4) showing t values above a predetermined threshold (t > 1.62, corresponding to the highest 10%) were selected over the right frontal region (denoted by black dots in the left panel in Figure 6(a)). A significant "yes/no" difference was found within a single time-frequency range around 200-800 ms in the upper theta and lower alpha bands (6-10 Hz), which was stronger for "no" compared to "yes" (denoted by a solid contour in the left panel in Figure 6(b)).

Event-Related Spectral Perturbation (ERSP) Analysis.
In the alpha band, three electrodes in the right parietal area (CP2, Pz, and P4) with high t values were selected as described above (t > 1.52, the highest 10%), as denoted by black dots in the right panel in Figure 6(a). The "yes/no" difference in spectral power in this region was significant within a single time-frequency range (300-1200 ms, 9-12 Hz), where the alpha-band power was stronger for "no" compared to "yes" (denoted by a solid contour in the right panel in Figure 6(b)).

Table 1 note: values are mean ± standard deviation (SD) of decoding accuracy (%); the range of decoding accuracies is given in parentheses. Abbreviations: TF, time-frequency; Ave, average over all subjects.

Discussion
We showed that it is possible to decode the intentions to answer "yes" and "no" with high accuracy from single-trial EEGs. The best decoding accuracy averaged over the 23 subjects was as high as 86.03% when the useful features in multiple time-frequency subwindows were all combined. The decoding accuracy was above 70% for most of the subjects (22 out of 23), which is considered a reasonable accuracy for binary classification [32]. We decoded the "yes" and "no" answers directly from the brain activities representing the two different answers, which implies that "mind reading" in a true sense is feasible. The experimental paradigm of our study is based on a natural task which required the subjects to answer self-referential questions as in conversation with others, without any self-regulation of the brain signals or high cognitive effort.

Figure 5: Difference between the most important common spatial patterns for "no" and "yes" answers averaged over all subjects within 3 time-frequency subwindows. The topography was obtained from the difference between the last ("no" answer) and first ("yes" answer) columns of the inverse of the matrix W for each subject and then averaged over all subjects.

No unpleasant stimuli, volition, or high cognitive efforts are required, since our approach is based on direct decoding of "yes" and "no" without any self-regulation of the brain signals. Birbaumer's group has suggested an alternative approach based on classical conditioning to solve the problems of conventional BCIs in CLIS patients [16-18].
For the training, two distinct unconditioned stimuli are presented to the subjects immediately after simple "yes/no" questions (corresponding to the conditioned stimuli) so that the cortical responses can be conditioned differently for "yes" and "no." The unconditioned stimuli include auditory pink noise and white noise [16,18] and weak electrical stimulation of the thumb [17]. The main idea of this approach is to modulate the users' brain activities indirectly through the unconditioned stimuli so that "yes" and "no" can be easily discriminated from the neural signals responding to the sensory stimuli, rather than to read the users' answers from the neural signals directly. This approach may provide an alternative to conventional BCI approaches in that volition or high cognitive efforts are not required. However, it remains unclear how long the conditioned cortical response can be maintained, considering the extinction effect of classical conditioning [33]. Moreover, unconditioned stimuli such as auditory noise or electrical stimulation can evoke significant displeasure. Recently, a more natural approach to "yes/no" decoding was demonstrated based on functional near-infrared spectroscopy (fNIRS) in CLIS patients [34]. They achieved "yes/no" decoding accuracy over 70% based on fNIRS signals recorded while the patients repeatedly answered "yes" or "no" in mind to personal and open questions. Interestingly, for the same experimental protocol, they reported that EEG-based decoding yielded accuracy below the chance level.
That study employed a natural question/answer task which does not require high cognitive efforts or volition, just as ours does. However, due to the slow nature of hemodynamics, the duration of each trial used for the decoding was quite long (>10 s). Here, we showed the possibility of "yes/no" decoding from a considerably shorter signal recording, which is more beneficial for a practical BCI communication tool.
We took a systematic approach of finding features of brain activities reflecting "yes/no" answers in mind and then developing the decoding algorithm utilizing these features. Further studies may be necessary to investigate whether the patients who would potentially benefit from the BCI can hold the intentions to answer in mind for a short time, and to validate our method on patients' data.
In this study, the intentions regarding self-referential questions based on autobiographic facts were investigated. It is important to further try decoding the intentions to answer other types of questions, including those about desire, feeling, and preference. In addition, our questions were presented only as visual stimuli. Neurological patients may have abnormal visual function, such as an inability to fixate their gaze on specific visual stimuli [35]. Different sensory modalities such as auditory stimuli have been tried for BCI communication tools [34,36]. It would be beneficial if our approach could be validated with auditory stimuli such as voice, considering that a high decoding accuracy above 80% was obtained even when only the brain activities during the period of retaining the decision in mind (10-12 Hz, 1000-1200 ms) were used for the decoding. Thus, we expect that it is possible to decode the "yes" and "no" intentions in a similar way even if other types of questions and/or auditory stimuli are employed in further studies. In addition, we did not try to optimize the detailed parameters of the SVM, including the RBF kernel parameter and the regularization penalty. Using the best SVM parameters, for example, found by the "grid-search" method [37], would likely yield better results.
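For instance, such a grid search could be set up with scikit-learn's `GridSearchCV`; the grid values below are illustrative assumptions, not taken from the paper.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Tune the RBF-SVM regularization penalty (C) and kernel width (gamma)
# by cross-validated grid search instead of trial-and-error.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
# search.fit(train_features, train_labels)   # hypothetical training data
# search.best_params_ then holds the selected (C, gamma) pair
```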
We found two time-frequency regions containing useful information for the "yes/no" decoding, in the early theta and late alpha bands. The useful features for the "yes/no" decoding in the alpha band were found by the CSP algorithm to be concentrated in the parietal region at 800-1200 ms. Recently, we showed that the alpha rhythms in the right parietal region differentiate between the intentions to answer either "yes" or "no" in mind, presumably due to the difference in cognitive load during WM retention [2]. Several previous studies showed that higher parietal alpha power reflects increased memory load [38,39] or attentional demand [40,41] during WM retention. The higher alpha power is attributed to active inhibitory control that blocks incoming stimuli during WM retention, for efficient cortical information processing [38,39,42,43]. Our results showed higher parietal alpha power for "no" compared to "yes," which may imply a higher cognitive load while retaining "no" in mind [2]. The greater increase in alpha-band activity for "no" may reflect the increased WM load during the intention retention. In Korean, "yes" is the one-character word "네," and "no" is the three-character word "아니오." It is plausible that a higher WM load is required to represent the intention to respond "no" than "yes" due to the length of the Korean words, resulting in the higher alpha rhythm. This assumption is supported by an ERP study which reported that greater alpha-band power was induced when retaining longer words [44].
It can also be interpreted that the significantly higher alpha-band activity in the centroparietal region for "no" is due to a higher attentional demand [40,45], and that this contributed to the high decoding accuracy. This is also in agreement with a recent study [46], which reported a higher alpha rhythm in the right parietal cortex for a higher internal attention condition during a divergent thinking task. Our result of greater alpha power for "no" than for "yes" may imply a stronger inhibition of outer stimuli by the bottom-up attention network for "no," induced by a higher internal attentional demand.
This is supported by psychophysical studies which showed that saying "no" requires more effortful reconsideration after comprehending a sentence and a longer response time than saying "yes" [47,48]. The theta-band activity in the frontal region at 200-500 ms was another major feature for the "yes/no" decoding. The theta ERS showed a topography focused on the midline frontal and lateral temporal regions. The difference between "yes" and "no" was also most prominent in these regions. Hald et al. reported that temporal and frontal theta-band activity at 300-800 ms was significantly higher for semantically incongruent compared to congruent sentences [49]. This is commensurate with our result in that the "no" stimuli are incongruent with autobiographic facts. The increase of theta-band activity for semantic incongruence was interpreted to reflect a general error detection mechanism, which is associated with the error-related negativity (ERN) [50]. Interestingly, Luu and Tucker showed that frequency-domain analysis of the ERN yields theta-band activity in the midfrontal region [50]. A related study reported higher theta oscillations for syntactic violations as well [51]. We observed that frontal theta power at 200-500 ms contributed to the high decoding accuracy. Considering its location and frequency band, our result on the usefulness of frontal theta power at 200-500 ms can be interpreted as further evidence that error-related frontal theta oscillation is a general phenomenon underlying the processing of incoming stimuli that violate internal information.

Data Availability
The data used to support the findings of this study have not been made available because some participants of this study did not agree to distribute their physiological signals.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.