Electroencephalogram (EEG) signals are usually contaminated with various artifacts of noncerebral origin, such as signals associated with muscle activity, eye movement, and body motion. The amplitude of such artifacts is larger than that of the electrical activity of the brain, so they mask the cortical signals of interest, resulting in biased analysis and interpretation. Several blind source separation methods have been developed to remove artifacts from EEG recordings. However, the iterative process for measuring separation within multichannel recordings is computationally intractable. Moreover, manually excluding the artifact components requires a time-consuming offline process. This work proposes a real-time artifact removal algorithm that is based on canonical correlation analysis (CCA), feature extraction, and the Gaussian mixture model (GMM) to improve the quality of EEG signals. CCA decomposes the EEG signals into components; feature extraction then derives representative features from each component, and the GMM clusters these features into groups to recognize and remove artifacts. The feasibility of the proposed algorithm was demonstrated by effectively removing artifacts caused by blinks, head/body movement, and chewing from EEG recordings while preserving the temporal and spectral characteristics of the signals that are important to cognitive research.
Electroencephalography (EEG), which is the most convenient brain imaging tool that reveals electrical activity in the brain, has the most near-term potential for real-time applications in everyday environments [
Canonical correlation analysis (CCA) has been demonstrated to outperform ICA and frequency filters in eliminating muscle artifacts (electromyography, EMG) [
An efficient approach to identifying artifacts is to extract informative features from the frequency, spatial, and temporal domains of EEG signals [
Feature extraction is often followed by classification. Most classification systems for artifact removal use a supervised learning method [
Eleven right-handed adults (seven males and four females, aged from 18 to 27 years) were recruited to participate in the study. None of the participants had a history of psychological disorders. Following a detailed explanation of the experimental procedure, all participants completed a consent form before participating. All subjects were required to wear a wired EEG cap with 62 Ag/AgCl electrodes, including 60 EEG electrodes and two reference electrodes (opposite lateral mastoids) (Figure
(a) Experimental paradigm. (b) Experimental setup.
All subjects were instructed to look at the center of a screen and follow the audible instructions to generate artifacts of various types purposefully. Subjects participated in four experimental runs; in one, no motion was performed, and in the other three, common artifacts (blinking, chewing, and head rotation) were generated. Each run involved two sessions. Each session comprised three parts, which were instruction, stimulation, and resting (Figure
The flicker stimulus experiment was conducted in a shielded room to prevent any unwanted artifact from appearing in the EEG data. The two flickering stimuli had frequencies of 1 Hz and 15 Hz to induce visual-evoked potential (VEP) and steady-state visual-evoked potential (SSVEP), respectively.
The proposed algorithm comprises three main parts: CCA, artifact feature extraction, and the Gaussian mixture model. These are applied after preprocessing, which involves downsampling to 256 Hz and band-pass filtering at 0.1–60 Hz. Figure
Flowchart of proposed artifact removal algorithm.
Let the observed EEG signals be
Canonical correlation analysis (CCA) is a BSS method that solves the problem by forcing the sources to be maximally autocorrelated and mutually uncorrelated; it has been extensively used to separate muscle artifacts from other EEG activity [
CCA is used to find the matrices
The unknown mixing matrix
The corrected EEG signals
Figure
Demonstration of removal of EEG artifacts by BSS-CCA. (a) A 2 s portion of EEG time series that contains blinking. (b) Corresponding CCA component activations. (c) EEG corrected by removing C1, C2, C15, and C16 from (b).
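The decomposition and reconstruction steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the common BSS-CCA formulation in which CCA is computed between the signal and a copy of itself delayed by one sample, and the function names `bss_cca` and `remove_components` are hypothetical.

```python
import numpy as np

def _inv_sqrt(C):
    """Inverse matrix square root of a symmetric covariance matrix."""
    vals, vecs = np.linalg.eigh(C)
    vals = np.maximum(vals, vals.max() * 1e-12)  # guard against tiny eigenvalues
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def bss_cca(X):
    """Decompose X (channels x samples) into CCA components.

    Assumption: the second CCA data set is X delayed by one sample.
    Returns the sources S, the mixing matrix A, and the canonical
    correlations rho, sorted from most to least autocorrelated.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    Xt, Xd = Xc[:, 1:], Xc[:, :-1]           # signal and one-sample-delayed copy
    n = Xt.shape[1]
    Cxx, Cyy, Cxy = Xt @ Xt.T / n, Xd @ Xd.T / n, Xt @ Xd.T / n
    Kx, Ky = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    U, rho, _ = np.linalg.svd(Kx @ Cxy @ Ky)  # singular values = canonical correlations
    W = Kx @ U                                # columns are unmixing vectors
    S = W.T @ Xc                              # component activations
    A = np.linalg.pinv(W.T)                   # estimated mixing matrix
    return S, A, rho

def remove_components(S, A, bad):
    """Reconstruct the (mean-removed) EEG with the listed components zeroed."""
    A_clean = A.copy()
    A_clean[:, np.asarray(list(bad), dtype=int)] = 0.0
    return A_clean @ S
```

With 16 components, for instance, `remove_components(S, A, [0, 1, 14, 15])` would correspond to discarding the two most and the two least autocorrelated components, as in panel (c) of the figure above; which components to discard is decided by the later clustering stage, not by this sketch.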
Different types of artifact in EEG typically have different characteristics. For example, the amplitudes of ocular or body movement artifacts are usually much higher than those of the EEG activities of interest, whereas muscle artifacts are characterized by high-frequency, low-amplitude activity. Therefore, this work proposes ten features (six spectral and four temporal) to reflect the variability of CCA components.
Six spectral features are extracted from the power spectral density (PSD) by fast Fourier transform (FFT) into six specific frequency bands (lower
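As a rough illustration of the spectral features, the sketch below computes the relative power of one CCA component in six frequency bands from an FFT-based periodogram. The band names and edges in `BANDS` are placeholders chosen for the example and are not the bands used in this work; the use of relative rather than absolute power is likewise an assumption, and the four temporal features are not shown.

```python
import numpy as np

# Placeholder band edges in Hz (assumption: not the bands defined in the paper).
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "low_gamma": (30.0, 45.0),
    "high_gamma": (45.0, 60.0),
}

def band_powers(component, fs=256.0, bands=BANDS):
    """Return the relative power of a 1-D component in each band."""
    x = np.asarray(component, dtype=float)
    x = x - x.mean()                                   # remove the DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * x.size)  # simple periodogram
    # Sum the PSD over each half-open band [lo, hi).
    power = np.array([psd[(freqs >= lo) & (freqs < hi)].sum()
                      for lo, hi in bands.values()])
    return power / (power.sum() + 1e-20)               # normalize to fractions
```

For a 2 s epoch sampled at 256 Hz the frequency resolution is 0.5 Hz, so each band in this sketch covers at least six frequency bins.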
The Gaussian mixture model (GMM) is used for unsupervised learning. It is a probabilistic model that assumes that all data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters [
Consider
The parameters are estimated from the maximum likelihood function. The parameters
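The maximum-likelihood fit is commonly carried out with the expectation-maximization (EM) algorithm. The following is a minimal NumPy sketch of EM for a full-covariance GMM, under stated assumptions: the farthest-point initialization, the regularization term `reg`, and the function name `fit_gmm` are illustrative choices, not details taken from this work.

```python
import numpy as np

def fit_gmm(X, k, n_iter=100, tol=1e-6, reg=1e-6, seed=0):
    """Fit a k-component full-covariance GMM to X (samples x features) by EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialization (assumption): one random point, then farthest points.
    means = [X[rng.integers(n)]]
    for _ in range(1, k):
        d2 = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[d2.argmax()])
    means = np.array(means, dtype=float)
    covs = np.array([np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of weight times Gaussian density for every point/component.
        log_p = np.empty((n, k))
        for j in range(k):
            L = np.linalg.cholesky(covs[j])
            z = np.linalg.solve(L, (X - means[j]).T)
            log_det = 2.0 * np.log(np.diag(L)).sum()
            log_p[:, j] = np.log(weights[j]) - 0.5 * (
                d * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)            # responsibilities
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        ll = log_norm.sum()                        # log-likelihood before this M-step
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Hard labels from the responsibilities of the last E-step.
    return weights, means, covs, resp.argmax(axis=1)
```

In the setting of this work, each row of `X` would hold the ten features of one CCA component, and the returned labels would assign each component to a cluster.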
In this work, the continuous EEG signals are processed using a fixed-length window (2 s) with an overlap of 1.5 s between consecutive windows. Therefore, the 50 s of raw EEG data in each stimulus block were segmented into 117 2 s epochs, yielding a total of 936 epochs for each subject. The EEG signals for each epoch were decomposed into components by CCA. Traditionally, a neurophysiologist manually labels all components as ocular artifacts, EMG components, or EEG components of interest by inspecting their time series, power spectral density, or topography. The EEG dataset contains tens of thousands of individual EEG patterns from and across individual participants. In this work, 11 × 936 × 50 = 514,800 (number of subjects × number of epochs × number of EEG channels) CCA components are involved. Manually labeling various types of artifact among these decomposed CCA components is difficult and time-consuming. Therefore, the GMM is used to cluster the components automatically according to their extracted features. Researchers/experts then manually score and label the clusters as common artifacts, such as those associated with blinking, muscle activity, and motion, or as EEG signals of interest by checking the temporal and spectral properties of the decomposed components in each GMM cluster. In this work, the Fisher criterion
The performance of the proposed algorithm was compared with that of artifact subspace reconstruction (ASR), a widely used artifact removal method [
In this work, the number of clusters was obtained by finding the maximum value of the Fisher criterion over the candidate set {2, 3, …, 20}. Figure
Fluctuation of Fisher criterion value from two to 20 clusters.
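The selection of the number of clusters can be sketched as a search over candidate values. Because the exact form of the Fisher criterion used in this work is not reproduced here, the code below assumes one common variant: the average, over all cluster pairs, of the squared distance between cluster means divided by the sum of the within-cluster variances. The names `fisher_criterion`, `select_num_clusters`, and `cluster_fn` are hypothetical; `cluster_fn(X, k)` stands for any clustering routine that returns one label per row of `X`.

```python
import numpy as np

def fisher_criterion(X, labels):
    """Mean pairwise Fisher ratio (assumed form) for a given clustering."""
    ids = np.unique(labels)
    if ids.size < 2:
        raise ValueError("at least two clusters are required")
    means = np.array([X[labels == c].mean(axis=0) for c in ids])
    # Within-cluster variance, summed over features.
    variances = np.array([X[labels == c].var(axis=0).sum() for c in ids])
    ratios = []
    for a in range(ids.size):
        for b in range(a + 1, ids.size):
            between = ((means[a] - means[b]) ** 2).sum()
            within = variances[a] + variances[b]
            ratios.append(between / (within + 1e-12))
    return float(np.mean(ratios))

def select_num_clusters(X, cluster_fn, candidates=range(2, 21)):
    """Return the candidate k with the highest criterion, plus all scores."""
    scores = {k: fisher_criterion(X, cluster_fn(X, k)) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores
```

The default `candidates` range mirrors the set {2, 3, …, 20} used in the text; other definitions of the criterion would plug into `select_num_clusters` in the same way.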
In the experimental design, participants generated typical artifacts (blinking, chewing, and head rotation) by following instructions during the experiment. The GMM clusters were grouped into four classes, corresponding to muscle artifacts, ocular artifacts, body movement artifacts, and nonartifacts, by visually inspecting the temporal waveforms and PSDs of EEG signals. Figure
Histogram of artifact and nonartifact significance for extracted features. M, OA, BM, and NA represent muscle artifact, ocular artifact, body movement artifact, and nonartifact activity.
Three 5 s segments of EEG data were selected to illustrate the effectiveness of the proposed artifact removal algorithm (Figure
Five seconds of EEG data from a representative subject with (a) blinking, (b) chewing, and (c) head rotation. In each subfigure, left and right plots are obtained before and after the proposed artifact removal algorithm is applied.
Temporal (ERPs) and spectral (PSDs) responses of EEG in VEP and SSVEP tasks were used to evaluate the effectiveness of the proposed artifact removal algorithm. In the VEP task, EEG epochs were extracted from 500 ms before to 1000 ms after the visual stimulus onset. In the SSVEP task, EEG epochs were extracted from 0 ms to 1000 ms after the visual stimulus onset. The power spectra of the EEG signals were calculated by fast Fourier transform (FFT) and converted to decibels by taking their log power. The EEG waveforms and the power spectra were vertically stacked by epoch, yielding 2D images (Figure
Temporal waveform and power spectrum activities of Oz-EEG channel from a representative subject. (a and c) SSVEP task. (b and d) VEP task. (a and b) Chewing. (c and d) Head rotation. Left and right-hand images in subfigures show EEG waveforms and corresponding log power activity, respectively. Top and middle 2D images show EEG results before and after artifact removal, respectively, and bottom images present averaged results. Each horizontal line in 2D images represents a single trial. The time zero on the
Averaged evoked potentials in Fz, Cz, Pz, and Oz channels during (a) SSVEP and (b) VEP tasks from a representative subject. Each subplot figure shows the averaged waveforms for three artifacts: (i) blinking artifacts, (ii) head rotation artifacts, and (iii) chewing artifacts. In each panel, black line presents averaged waveform without motion, whereas blue and red lines present averaged waveforms before and after artifact removal, respectively.
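The epoch extraction, averaging, and log-power computation used in this evaluation can be sketched as follows. This is a minimal illustration for a single channel, assuming that stimulus onsets are available as sample indices and that the data are sampled at 256 Hz after preprocessing; the function names `extract_epochs` and `log_power_db` are hypothetical, and no baseline correction is applied.

```python
import numpy as np

def extract_epochs(signal, onsets, fs=256.0, tmin=-0.5, tmax=1.0):
    """Cut a 1-D signal into epochs spanning [tmin, tmax) s around each onset.

    Onsets whose window would run past either end of the signal are skipped.
    """
    signal = np.asarray(signal, dtype=float)
    start, stop = int(round(tmin * fs)), int(round(tmax * fs))
    epochs = [signal[o + start:o + stop] for o in onsets
              if o + start >= 0 and o + stop <= signal.size]
    return np.array(epochs)                  # shape: (n_epochs, n_samples)

def log_power_db(epochs, fs=256.0):
    """FFT power of each epoch, converted to decibels (10 * log10)."""
    freqs = np.fft.rfftfreq(epochs.shape[1], d=1.0 / fs)
    power = np.abs(np.fft.rfft(epochs, axis=1)) ** 2
    return freqs, 10.0 * np.log10(power + 1e-20)  # small offset avoids log(0)
```

Stacking the rows of `epochs` or of the decibel array gives 2D trial-by-time or trial-by-frequency images, and `epochs.mean(axis=0)` gives the averaged evoked waveform. The defaults correspond to the VEP window; for the SSVEP window, `tmin=0.0` and `tmax=1.0` would be used.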
Visual stimuli elicit visual-evoked potentials (VEPs) at most scalp recording sites, including the occipital, parietal, central, and frontal electrode sites [
Averaged evoked potentials in Fz, Cz, Pz, and Oz channels during (a) SSVEP and (b) VEP tasks from a representative subject. Each subplot figure shows the averaged waveforms for three artifacts: (i) blinking artifacts, (ii) head rotation artifacts, and (iii) chewing artifacts. In each panel, black line presents averaged waveform without motion, whereas blue and red lines present averaged waveforms after ASR and after proposed artifact removal method, respectively.
The performance of the proposed artifact removal algorithm was evaluated by classical visual-evoked tasks with common artifacts. Results thus obtained demonstrated that common artifacts in EEG were successfully suppressed by the proposed artifact removal algorithm. EEG signals appeared much cleaner after artifact removal than before it (Figure
One widely used BSS method in EEG studies is independent component analysis (ICA), which decomposes EEG signals into a set of statistically independent components by maximizing their statistical independence. Several methods of ICA-based artifact removal have been proposed for removing artifacts from contaminated EEG signals [
Standard methods involve specific feature extraction and classifier learning. Feature extraction is a highly efficient means of achieving satisfactory artifact classification performance. If the extracted features achieve high separability, then it is easier to distinguish between different classes. Several studies [
This work evaluated the efficacy of the proposed artifact removal algorithm on EEG signals recorded during activities that are likely to produce artifacts. Despite its contributions, this work has certain limitations. First, the collected data are limited to typical artifacts (ocular, muscle, and body movement artifacts). Unlike in laboratory settings, many unknown and typical application-specific artifacts may be generated in real-world situations. For example, an EEG experiment involving car driving on a real road, a speed breaker, or emergency braking may produce atypical artifacts in the EEG recording. Second, if extracerebral artifacts make the informative cortically generated signals very noisy, then the visual-evoked phenomena presented here are no longer present in the EEG signals, and no such components can be separated and recovered. Finally, the proposed algorithm was applied to EEG signals in two typical visual-evoked tasks. Other tasks should be carried out to evaluate the efficacy of the proposed artifact removal algorithm. Efforts are underway to examine these problems.
This work proposes a real-time EEG artifact removal algorithm involving CCA, artifact feature extraction, and the GMM. The efficacy of the proposed method was demonstrated using two classical visual-evoked tasks (regular screen flashes at 1 Hz and 15 Hz) performed alongside user-generated artifacts. The GMM-based methodology clusters all components automatically, which reduces the inconvenience and complexity of manually labeling a training dataset. Experimental results show the feasibility of using the proposed approach to remove artifacts from EEG signals while retaining the properties of
The views and the conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation herein.
The authors declare that they have no conflicts of interest.
This work was supported in part by the UST-UCSD International Center of Excellence in Advanced Bio-engineering sponsored by the Taiwan National Science Council I-RiCE Program under Grant no. MOST 103-2911-I-009-101, in part by MOST 104-2627-E-009-001, in part by the Aiming for the Top University Plan, under Contract 106W963, and in part by the Army Research Laboratory, W911NF-10-2-0022.