This research proposes to develop a monitoring system that uses the Electrocardiogram (ECG) as a fundamental physiological signal to analyze and predict the presence or absence of cognitive attention in individuals during task execution. The primary focus of this study is to identify the correlation between fluctuating levels of attention and their implications on the cardiac rhythm recorded in the ECG. Furthermore, Electroencephalogram (EEG) signals are also analyzed and classified to serve as a benchmark for comparison with the ECG analysis. Several advanced signal processing techniques have been implemented and investigated to derive multiple latent and informative features from both of these physiological signals. Decomposition and feature extraction are performed using the Stockwell transform for the ECG signal, while the Discrete Wavelet Transform (DWT) is used for the EEG. These features are then fed to various machine-learning algorithms to produce classification models capable of differentiating between a person being attentive and a person not being attentive. The presented results show that detection and classification of cognitive attention using ECG are fairly comparable to EEG.
In today’s high-paced, high-tech, and high-stress environment, a common casualty is our cognitive processing capacity. Cognitive psychology primarily deals with people’s ability to acquire, process, and retain information, which is a fundamental necessity for task execution [
Studying alertness and drowsiness is not a new domain in scientific research. Numerous research areas actively study the concepts of attention, alertness, distraction, and drowsiness. Many of these studies focus on nonsensory mechanisms to identify and quantify levels of attention in individuals [
For this reason, this research attempts to use the Electrocardiogram (ECG) for detecting cognitive attention in individuals. The ECG is a fundamental physiological signal that can be collected easily with a tiny, wearable, portable monitor. Since the collection device is portable and has a small footprint on the body, it allows ECG signals to be captured from individuals in various situations in a noninvasive manner. The portability of such a data collection unit allows a more realistic study of human cognitive activities during task execution under various circumstances. The research presented in this paper attempts to establish a correlation between cognitive attention and its implications on the ECG. By identifying a pattern and correlation between the two, it becomes possible to predict well in advance an individual's potential loss of attention and onset of sleepiness during task execution. This also provides the ability to give preemptive feedback to users upon identifying diminishing attention levels, thereby improving their overall performance.
The rest of this paper is organized as follows: Section
An essential aspect of this research has been the collection of the data itself. An extensive search revealed that no dataset was available, freely or otherwise, which catered to the exact needs of this particular study. Since the study is about utilizing ECG collected via a portable armband to detect the presence or lack of attention/focus in an individual, the dataset had to be collected specifically based on the requirements of this research.
In the designed experiment, volunteer subjects were individually asked to watch a series of preselected video clips while two physiological signals, the ECG and EEG, were acquired. Based on their content, the chosen video clips fell into one of two categories, "interesting" or "noninteresting," requiring high and low levels of viewer engagement, respectively. Each selected video clip was about 4 minutes long on average. For each category, the respective video clips were put together to form a video montage of about 20 minutes' viewing duration. The first video montage, named "interesting," included engaging scenes from documentaries, popular movie scenes, high-speed car chases, and so forth, which were intended to keep the viewers attentive and engaged with its content. The second video montage, named "noninteresting," contained videos which were repetitive and monotonous in nature, such as a clock ticking and still images shown for extended periods of time. These were intended to induce boredom in subjects and thereby reduce their attentiveness. Viewing the two categories of video montages one after the other required contrasting levels of engagement and focus from the participant, thereby ensuring (as far as possible) that the subjects were interested and paid attention during the interesting video set and were subsequently bored and lost focused attention during the noninteresting videos.
During the experiment the ECG signal was collected using the SenseWear-Pro armband developed by BodyMedia Inc. This armband is capable of collecting ECG data at 128 Hz [
As shown in Figure
Two leads ECG collection from Armband.
The EEG signal was collected from the subjects using the MP150 EEG-100C, a product of Biopac Inc. This system includes an EEG cap that fits snugly on the subject's head and collects the EEG signal at a sampling rate of 1000 Hz. Signals were collected from the forehead over the frontal cortex (Fp1 and Fp2) with a ground reference from the ear lobe. The frontal cortex is primarily responsible for attention and higher-order functions including working memory, language, planning, judgment, and decision-making [
The schematic diagram in Figure
Methodology overview.
The acquired raw signals are first preprocessed to remove unwanted artifacts present within the signals. Next, the preprocessed signals are decomposed using various decomposition and analysis methods. Then, valuable and informative features are extracted from the decomposed components of the signal. These extracted features are finally fed to the machine-learning step, where classification models are developed to classify each feature instance into one of two cases, "attention" or "nonattention."
The acquired raw ECG signal contains some inherent unwanted artifacts that need to be dealt with before any analysis can be performed on it. These artifacts, usually frequency noise or baseline trend, can arise from a number of causes such as subject movement (motion artifacts), breathing patterns, loose skin contact of the electrodes, and electrical interference (usually found around 55 Hz). Therefore, a preprocessing step has been designed to ensure that the signal is as clean and artifact-free as possible before analysis.
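A minimal sketch of such a noise-filtering stage is shown below, assuming the armband's 128 Hz sampling rate and the ~55 Hz interference mentioned above; the paper does not specify its filter designs, so the cutoffs and orders here are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt, sosfiltfilt

FS = 128.0  # armband ECG sampling rate (Hz)

def denoise_ecg(ecg, fs=FS):
    """Suppress ~55 Hz electrical interference with a narrow notch, then
    keep a typical ECG band with a band-pass filter (illustrative design)."""
    b, a = iirnotch(55.0, Q=30.0, fs=fs)
    x = filtfilt(b, a, ecg)                                # zero-phase notch at 55 Hz
    sos = butter(4, [0.5, 40.0], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)                             # zero-phase 0.5-40 Hz band-pass
```

Zero-phase filtering (`filtfilt`/`sosfiltfilt`) is used so the QRS complex timing is not shifted by the filters.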
The preprocessing steps for the ECG signal are shown in Figure
ECG preprocessing.
Next, the filtered ECG data is sent through a baseline drift removal step. Typically, baseline drift is observed in ECG recordings due to respiration, muscle contraction, and changes in electrode impedance caused by subject movement [
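One common way to remove such drift is to estimate the slow-moving baseline with a cascade of median filters and subtract it; the window lengths below (~200 ms and ~600 ms) are illustrative values, not taken from the paper.

```python
import numpy as np
from scipy.signal import medfilt

def remove_baseline(ecg, fs=128):
    """Estimate the baseline wander with a two-stage median filter and
    subtract it from the signal (window sizes are illustrative)."""
    k1 = int(0.2 * fs) | 1          # medfilt requires odd kernel sizes
    k2 = int(0.6 * fs) | 1
    baseline = medfilt(medfilt(ecg, k1), k2)
    return ecg - baseline
```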
Given
After the raw ECG signal has been filtered of noise and baseline drift, it is split into two portions based on the acquisition and experiment framework. The two portions, namely "interesting" and "noninteresting," are extracted from the original signal using timestamps recorded and indexed during signal acquisition. Splitting and analyzing the two sections of data separately facilitates a supervised learning mechanism during the training phase of the machine learning step.
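The timestamp-based split can be sketched as simple index slicing; the segment labels and times below are illustrative placeholders for the indexed timestamps recorded during acquisition.

```python
import numpy as np

def split_by_timestamps(signal, fs, segments):
    """Cut a recording into labeled portions.
    segments: list of (label, t_start_sec, t_end_sec) recorded during acquisition."""
    return {label: signal[int(t0 * fs):int(t1 * fs)]
            for label, t0, t1 in segments}
```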
The EEG signal is composed of a complex and nonlinear combination of several distinct waveforms, also called band components. Each band component is categorized by the frequency range in which it exists. The state of consciousness of the individual may make one frequency range more pronounced than others [
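Band-component extraction can be sketched with one band-pass filter per band. The band boundaries below are the conventional EEG ranges; the paper does not list its exact cutoffs, so treat them as assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Conventional EEG band boundaries in Hz (assumed, not taken from the paper).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def extract_bands(eeg, fs=1000.0):
    """Split a raw EEG trace into band components with zero-phase band-pass filters."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        out[name] = sosfiltfilt(sos, eeg)
    return out
```

Second-order sections (`output="sos"`) are used because the very low normalized cutoffs (e.g., 0.5 Hz at a 1000 Hz sampling rate) make transfer-function filters numerically fragile.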
EEG preprocessing steps.
The
When it comes to analyzing the dynamic spectrum or local spectral nature of nonstationary observations such as the ECG, some of the popular methods include the Short-Time Fourier Transform (STFT) [
Although
There are two varieties of
The STFT of a signal \(x(t)\) with a window \(w\) is \(\mathrm{STFT}(\tau, f) = \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-i 2\pi f t}\, dt\); because the window width is fixed, the STFT has a fixed time-frequency resolution.
The
For application of
The discrete
Let
Figure
After the windowing step, each of the 10-second windows is decomposed using the Stockwell transform (ST).
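The per-window decomposition can be sketched with the standard FFT-based formulation of the discrete Stockwell transform; this is a generic implementation, not the authors' code, and it returns one frequency-time matrix per window.

```python
import numpy as np

def stockwell(x):
    """FFT-based discrete Stockwell transform.
    Returns an (N//2 + 1) x N complex matrix: rows = frequency, columns = time."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    X = np.fft.fft(x)
    S = np.zeros((N // 2 + 1, N), dtype=complex)
    S[0, :] = x.mean()                       # zero-frequency row is the signal mean
    m = np.arange(N)
    p = ((m + N // 2) % N) - N // 2          # frequency offsets centered at 0
    for k in range(1, N // 2 + 1):
        # Frequency-dependent Gaussian window: wider in frequency at higher k
        gauss = np.exp(-2 * np.pi ** 2 * p ** 2 / k ** 2)
        S[k, :] = np.fft.ifft(np.roll(X, -k) * gauss)
    return S
```

Unlike the STFT, the Gaussian window here scales with frequency, which is what gives the ST its progressive time-frequency resolution.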
An example output of a 5-second window of an ECG data after
(a) shows the contour-based visualization of frequency spectrum along time, based on the
Figure
The output of each window is a frequency-time matrix. Each element of the matrix corresponds to a frequency point and a time point (given by its row and column position, respectively). The entire output matrix can thus be represented as follows:
The extraction of features from the derived output matrix of the ST is performed in two steps. In the first step, the output matrix is reduced from two dimensions to a single dimension. This is done by computing certain statistical measures along the frequency dimension: mean of frequencies, sum of frequencies, product of frequencies, standard deviation of frequencies, and range.
At the end of the first step we get an array of features from the frequency domain as follows:
Mean:
Sum:
Mean of autocovariance:
Sum of cross-correlation:
Log2 of Variance:
Two additional features are calculated from the initially obtained ST matrix. Mean of max frequencies:
Mean absolute deviation of frequencies:
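The two-step reduction described above can be sketched as follows; this computes an illustrative subset of the listed features (the statistics along the frequency axis, then scalar summaries), with the exact formulas of the remaining features left out since the paper's equations are not reproduced here.

```python
import numpy as np

def st_features(S):
    """Reduce a (freq x time) Stockwell-transform matrix to scalar features.
    Step 1 collapses the frequency axis; step 2 summarizes the resulting
    time series. This is an illustrative subset of the paper's feature set."""
    M = np.abs(S)                                   # magnitude spectrum
    return {
        "mean_of_freq": M.mean(axis=0).mean(),      # mean over frequency, then time
        "sum_of_freq": M.sum(axis=0).mean(),        # sum over frequency, then mean
        "std_of_freq": M.std(axis=0).mean(),        # per-time std, averaged
        "range_of_freq": (M.max(axis=0) - M.min(axis=0)).mean(),
        "mean_of_max_freq": M.max(axis=0).mean(),   # mean of max frequencies
        "mad_of_freq": np.mean(np.abs(M - M.mean())),  # mean absolute deviation
    }
```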
The EEG signal exhibits complex behavior and nonlinear dynamics. In the past, a wide range of work has been done in understanding the complexities associated with the brain through multiple windows of mathematics, physics, engineering, chemistry, physiology, and so forth [
The small yet complex varying frequency structure found in scalp-recorded EEG waveforms contains detailed neuroelectric information about the millisecond time frame of underlying processing systems, and many studies indicate that waveform structure at distinct scales holds significant basic and clinical information [
Wavelet transforms exist in two distinct types: the Continuous Wavelet Transform (CWT) and the Discrete Wavelet Transform (DWT). In this study, the DWT method has been employed for the analysis of the EEG signal. The advantage of using the DWT is that it allows the analysis of signals by applying only discrete values of shift and scaling to form the discrete wavelets. Also, if the original signal is sampled with a suitable set of scaling and shifting values, the entire continuous signal can be reconstructed from the DWT (using the inverse DWT). A natural way of setting up the scale and shift parameters is to use dyadic (powers-of-two) sampling.
Discrete mother wavelet representation: \(\psi_{j,k}(t) = 2^{-j/2}\,\psi\left(2^{-j}t - k\right)\), where \(j\) and \(k\) are integers.
Analysis equation (DWT): \(c_{j,k} = \sum_{n} x(n)\,\psi_{j,k}(n)\).
Synthesis equation (inverse DWT): \(x(n) = \sum_{j}\sum_{k} c_{j,k}\,\psi_{j,k}(n)\).
In this study, the Discrete Wavelet Transform (DWT) is applied to the EEG band components which are extracted in the preprocessing step.
As shown in Figure
EEG decomposition and analysis steps using wavelet transform.
Each window is then decomposed using the DWT. The performance of the wavelet transform depends on the mother wavelet chosen for decomposition of the signal. A common heuristic is to choose one similar in shape to the signal of interest. Therefore, for the set of band components extracted from the original EEG signal, different mother wavelets suited to the different bands are applied during decomposition.
As shown in Figure
(a) “COIF3” wavelet, (b) “DB4” wavelet, and (c) “BIOR3.9” wavelet.
The decomposition process in the wavelet transform can be performed iteratively over several levels. The number of levels chosen for decomposition is application specific and also depends on the complexity of the signal. For each window of the EEG signal's band components, 5 levels of decomposition seemed to provide all the required useful information; further decomposition did not yield a better result. The detail coefficients of all the levels from 1 through 5 and the approximation coefficients of level 5 are retained for the feature extraction step.
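A 5-level decomposition that retains the detail coefficients of levels 1 through 5 plus the level-5 approximation can be sketched as below. For simplicity this sketch uses the Haar wavelet implemented directly in NumPy as a stand-in for the DB4/COIF3/BIOR3.9 wavelets named above; in practice a library routine such as PyWavelets' `wavedec` with the chosen mother wavelet would be used instead.

```python
import numpy as np

def haar_dwt(x, levels=5):
    """Multi-level DWT with the Haar wavelet (a stand-in for the paper's
    mother wavelets). Returns [cD1, cD2, ..., cD5, cA5]."""
    coeffs = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if len(a) % 2:                                  # pad to even length
            a = np.append(a, a[-1])
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)       # low-pass branch
        detail = (a[0::2] - a[1::2]) / np.sqrt(2)       # high-pass branch
        coeffs.append(detail)
        a = approx
    coeffs.append(a)                                    # deepest approximation
    return coeffs
```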
The features computed from these coefficients are as follows. Standard deviation:
Entropy: entropy is a statistical measure of randomness. It is very useful in evaluating the information present within a signal:
Log of variance: let the probability mass function of each element be as follows.
Mean of frequencies (discrete Fourier domain):
Variance of probability distribution:
Sum of autocorrelation:
Mean of autocovariance:
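The feature set above can be sketched for one coefficient array as follows; the formulas are the standard definitions matching the listed feature names (Shannon entropy over the normalized energy distribution, log variance, and autocorrelation-based statistics), since the paper's exact equations are not reproduced here.

```python
import numpy as np

def coeff_features(c):
    """Standard-definition features for one DWT coefficient array
    (an illustrative reading of the feature names listed above)."""
    c = np.asarray(c, dtype=float)
    # Shannon entropy of the normalized energy distribution
    p = c ** 2 / np.sum(c ** 2)
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))
    var = np.var(c)
    ac = np.correlate(c - c.mean(), c - c.mean(), mode="full") / len(c)
    return {
        "std": c.std(),
        "entropy": float(entropy),
        "log2_variance": float(np.log2(var)) if var > 0 else 0.0,
        "sum_autocorr": float(ac.sum()),
        "mean_autocov": float(ac.mean()),
    }
```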
In this application, the result of signal processing on the various acquired physiological signals is a large set of features. Since the data was collected in a systematic and controlled environment, the features extracted from the respective portions of the signals can be labeled under the two presumed categories: "attention" and "nonattention." Hence, a supervised learning method is used in this study to develop classification heuristics.
Three different machine learning algorithms have been implemented and tested for this experiment. These are as follows.
There are different models for predicting continuous or categorical variables from a set of continuous predictors and/or categorical factor effects, such as General Linear Models (GLMs) and General Regression Models (GRMs). Regression-type problems are those where an attempt is made to predict the values of a continuous variable from one or more continuous and/or categorical predictor variables [
In more general terms, the purpose of analysis via tree-building algorithms is to determine a set of if-then logical (split) conditions that permit accurate prediction or classification of cases. Tree classification techniques, when applied correctly, produce accurate predictions or predicted classifications based on few logical if-then conditions. The advantage of regression-tree-based classifiers over many alternative techniques is the simplicity of the resulting classifier. This simplicity is not only useful for rapid classification of new observations but can also often yield a much simpler "model" for explaining why observations are classified or predicted in a particular manner. The process of computing classification and regression trees can be characterized by four basic steps: specifying the criteria for predictive accuracy, selecting splits, determining when to stop splitting, and selecting the "right-sized" tree.
C4.5 is also a decision-tree-based classification algorithm, developed by Quinlan [
Breiman developed the random forest classification method, which is essentially an ensemble classifier consisting of multiple decision trees [
All of the above-mentioned machine-learning methods are known to have comparable performance to methods such as neural networks in physiological and medical applications [
In the machine learning step, the three mentioned classifiers are independently applied to the extracted features of the ECG and EEG, and the results of these classifiers are compared. This is based on a setup developed earlier, during the initial stages of this experiment. For this experiment, ECG signals from 21 subjects and EEG signals from 12 subjects were collected.
The classification model for each of the classifiers is developed using "by-subject" or "leave-one-subject-out" training and test sets. In this type of training and testing, out of the given number of subjects, say
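The leave-one-subject-out protocol can be sketched as follows. The evaluation loop is generic; a toy nearest-centroid classifier stands in for C4.5, classification via regression, and random forest, which would be substituted in via the same `fit`/`predict` interface.

```python
import numpy as np

def leave_one_subject_out(X, y, subjects, fit, predict):
    """'By-subject' evaluation: train on all subjects but one, test on the
    held-out subject, and average accuracy across the folds."""
    accs = []
    for s in np.unique(subjects):
        test = subjects == s
        model = fit(X[~test], y[~test])
        accs.append(np.mean(predict(model, X[test]) == y[test]))
    return float(np.mean(accs))

# Toy stand-in classifier (NOT one of the paper's three methods).
def fit_centroid(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_centroid(model, X):
    classes = list(model)
    d = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[d.argmin(axis=0)]
```

Holding out whole subjects, rather than random samples, prevents a subject's own data from appearing in both training and test sets, which would inflate the reported accuracy.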
The results obtained from the analysis and classification of the features computed with the Stockwell transform (ST) from the ECG signal are presented.
Table
| | Accuracy (average) | Specificity (average) | Sensitivity (average) |
|---|---|---|---|
| C4.5 | 74.22% | 67.31% | 81.13% |
| Classification via regression | 71.63% | 63.11% | 80.15% |
| Random forest | | 66.73% | 87.20% |
It can be seen that the overall accuracy of the random-forest-based classification model was higher than that of both the C4.5 and classification-via-regression models, with a classification accuracy of nearly 77%.
The features computed from the analysis of the EEG signal using the discrete wavelet transform are used to develop classification models based on the three described classification methods. The results of these classifications are presented in Table
DWT features classification results of EEG.
| DWT feature classification result EEG | Accuracy (average) | Specificity (average) | Sensitivity (average) |
|---|---|---|---|
| C4.5 | 80.93% | 81.11% | 80.96% |
| Classification via regression | 82.5% | 76.74% | 88.26% |
| Random forest | | 79.74% | 91.66% |
Table
The results from the ECG feature classification for all three classifiers are compared against the classification results of the EEG.
From Figure
ECG versus EEG classification comparison.
The analysis of the EEG signals primarily sets a benchmark against which the analysis of the physiological features from the armband can be compared. The proposed system focuses primarily on the electrocardiogram (ECG) signal, on which various methods of decomposition are performed. The following conclusions can be drawn from the system's performance so far. To a reasonable level of accuracy, the system is able to identify cognitive attention in comparison with that detected by the EEG collected in the same experiment. The focus of this work was entirely on the ECG alone, and with just this signal it was demonstrated that its classification accuracy is comparable to that of the EEG. Amongst the various machine learning methods investigated, "classification via regression" seems to perform best on the combined feature set. However, it was also demonstrated that random-forest-based classification works well on the subsets of features from the different decomposition and analysis methods. This study also establishes that the ECG alone can be used in analyzing cognitive attention and that fluctuations of attention do have a translated impact on the cardiac rhythm of an individual.
Several items of future work are planned to improve the system's classification and prediction performance. A larger data set is needed to further validate this experiment and is expected to yield a more robust classifier model. More novel features will be developed and tried in the feature extraction step after decomposition; a more diverse base of features usually provides insight into innate characteristics of the signal which might not be openly evident. Feature pruning and other classification methods also need to be tried to increase the accuracy.
The authors would like to acknowledge BodyMedia Advanced Development (BodyMedia) for providing the armbands for this research. This study was designed and conducted in collaboration with Dr. Paul Gerber, Professor of Dyslexia Studies, Special Education and Disability Policy, VCU. The authors also wish to acknowledge the subjects who volunteered for this study.