Cortical Tasks-Based Optimal Filter Selection: An fNIRS Study

Functional near-infrared spectroscopy (fNIRS) is one of the latest noninvasive brain function measuring technique that has been used for the purpose of brain-computer interfacing (BCI). In this paper, we compare and analyze the eﬀect of six most commonly used ﬁltering techniques (i.e., Gaussian, Butterworth, Kalman, hemodynamic response ﬁlter (hrf), Wiener, and ﬁnite impulse response) on classiﬁcation accuracies of fNIRS-BCI. To conclude with the best optimal ﬁlter for a speciﬁc cortical task owing to a speciﬁc cortical region, we divided our experimental tasks according to the three main cortical regions: prefrontal, motor, and visual cortex. Three diﬀerent experiments were performed for prefrontal and motor execution tasks while one for visual stimuli. The tasks performed for prefrontal include rest (R) vs mental arithmetic (MA), R vs object rotation (OB), and OB vs MA. Similarly, for motor execution, R vs left ﬁnger tapping (LFT), R vs right ﬁnger tapping (RFT), and LFTvs RFT. Likewise, for the visual cortex, R vs visual stimuli (VS) task. These experiments were performed for ten trials with ﬁve subjects. For consistency among extracted data, six statistical features were evaluated using oxygenated hemoglobin, namely, slope, mean, peak, kurtosis, skewness, and variance. Combination of these six features was used to classify data by the nonlinear support vector machine (SVM). The classiﬁcation accuracies obtained from SVM by using hrf and Gaussian were signiﬁcantly higher for R vs MA, R vs OB, R vs RFT, and R vs VS and Wiener ﬁlter for OB vs MA. Similarly, for R vs LFTand LFT vs RFT, hrf was found to be signiﬁcant ( p < 0 . 05 ) . These results show the feasibility of using hrf for eﬀective removal of noises from fNIRS data.


Introduction
Brain-computer interface (BCI) also known as human-machine interface (HMI) or brain-machine interface (BMI) provides a communication mean between the user and external devices through a combination of hardware and software systems [1][2][3].
ese systems are trained to generate control commands based on a specific set of patterns of brain signals [4].
Brain signal acquisition is categorized between invasive and noninvasive techniques. However, due to surgical risks and limited access to the cortical region, noninvasive the experimental noises are removed prior to the change of the raw signal to its magnitude through the modified Beer-Lambert law [8,12,13]. Noises produced due to hardware or by surrounding are known to be instrumental noises. ese noises usually have high frequency that can be removed using low-pass filter; furthermore, keeping isolation from external sources such as light can reduce such type of noises. Experimental noises include motion artifacts such as head motion during signal acquisition that can cause the dislocation of optodes from the assigned position, thus generates a spike-like noise due to change in light intensity. Various studies [13][14][15] have utilized commonly developed filtering techniques randomly for noise removal. However, obtained signals can be corrupted by different kinds of noises that could affect further analysis. Noises can be physiologically produced due to Mayer waves (∼0.1 Hz), respiration (0.2∼0.5 Hz), and heartbeat (1∼1.5 Hz), mainly due to fluctuations of blood pressure [8,[16][17][18][19]. ese noises can be removed using adaptive or bandpass filtering [20,21]. After preprocessing, useful information is extracted from the filtered data afterward classified using different classifiers mainly named as linear discriminant analysis (LDA), support vector machine (SVM), quadratic discriminant analysis (QDA), and naïve bayes (NB) [22] to generate control commands, hence completing the loop for BCI. Previous studies [13,14,23] show that an appropriate filter for correctness of data is the key to achieve more accurate results.
In this study, we hypothesized to find an optimal filter for commonly used cortical tasks, owing to a particular cortical region. Hence, we compared six commonly used filters to remove previously discussed noises. ese filters include discrete Kalman [24], time-varying Wiener [25], 4 th order Butterworth, hemodynamic response filter (hrf ), Gaussian [26], and window-based finite impulse response (FIR) [27]. For the said purpose, cortical data were acquired from the three main regions of the brain, namely, prefrontal (PFC), motor (MC), and visual cortex (VC). Since the data acquired from PFC relate to thinking tasks [28][29][30][31], hence arithmetic and object rotation tasks were performed for this cortical region [32][33][34]. Similarly, tasks related to movement of limbs or fingers is related to the motor cortex [35]; therefore, the finger tapping tasks were performed for desired data acquisition [36][37][38]. Likewise, flickering of checker box was performed for visual cortex data [39]. Keeping in view the target of an optimal filter for a specific cortical region, a previous study [22] reported the different combinations of statistical features. erefore, for consistency of extracted data, statistical features were kept the same for all experimental tasks, hence making combinations of six features, namely, signal mean (SM), signal slope (SS), signal peak (SP), signal skewness (SK), signal kurtosis (KR), and signal variance (SV). For classification, a number of studies [22,40,41] reported nonlinear SVM classifiers for comparatively better accuracies, hence all experimental tasks were classified using a nonlinear SVM classifier. erefore, the main contribution of this work is (1) to analyze the effect of the six most commonly used filtering techniques and (2) to propose the optimal one among the most frequently discussed noise removal techniques. For the aforementioned experiment, we select three main cortical regions: prefrontal, motor, and visual cortex with seven various paradigms. e canonical hemodynamic response filter (hrf ) [26] performed overall best among the opted techniques. On the basis of these systematic and explicit analyses, it can be seen that selection of an optimal filter has a significant role in enhancing accuracies. Hence, these observations can serve as a standard guide for others to test the effect of noise correction algorithms for fNIRS experiments, and therefore can select a significant methodology.

Experimental Setup.
To acquire experimental data, seven paradigms owing to three main cortical regions were designed and explicitly performed using a dynamic near-infrared optical tomography (DYNOT-232; NIRx Medical Technologies, NY, USA) device at Pusan National University. It operates on two wavelengths that are 760 and 830 nm where the signal acquisition sampling frequency was 1.81 Hz. Five healthy subjects with normal or corrected-to-normal vision took part in the experiment with a verbal consent before experimentation. All subjects were right-handed with an age range of 26 ± 3. Righthanded subjects were selected to minimize hemodynamic response variation due to hemispheric-dominance difference. e experimental participants had no history of alcoholism, psychiatric, neurological, and visual disorder, cardiovascular and respiratory disease, mental illness, or any motor disability. Moreover, three hours before the commencement of the study, participants were asked to refrain from caffeinated drinks. As discussed in the literature [28,30,31,35,37,38], for thinkingrelated task, signals were acquired from PFC, similarly for motor execution tasks from the primary motor cortex (PMC), and visual stimuli task from VC. Performed experiments were according to the latest Declaration of Helsinki.

Experimental Paradigms.
In accordance with the literature [35,39], subjects were seated on a comfortable chair and were asked to take rest with restricted movements as they can, so that the hemodynamic response activation owing to previous activities can be avoided. Hence, each paradigm related to PFC, MC, and VC starts with a rest of 20 s period to set up the baseline conditions. As the literature [42][43][44][45] show, (10∼12) s task is adequate to acquire hemodynamic response of brain activity, hence 20 s initial rest was followed by 10 s task, and this was followed in turn by another 20 s rest period permitting signals to return to their baseline values before the start of the next trial in paradigm (a). For paradigm (b), 20 s rest after 10 s task 1 was again followed by 10 s task 2. e 20 s rest between two 10 s tasks was added to differentiate two classes through the baseline value. Figure 1 depicts paradigm (a) and paradigm (b). For optimal filter selection, tasks were selected concerning specific cortical regions such that for PFC, three different experiments were performed, rest (R) vs mental arithmetic (MA), R vs object rotation (OB), and OB vs MA task. Similarly, for MC, R vs left finger tapping (LFT), R vs right finger tapping (RFT), and LFT vs RFT. Likewise, for VC, R vs visual stimuli (VS) task.

Prefrontal Tasks
(1) Mental Arithmetic. According to previous studies [44,46,47], for R vs MA task, subjects were asked to perform a series of arithmetic calculations for 10 s based on the pseudorandom order, such that to subtract the random twodigit number (between 10 and 20) successively from the previous result of the three-digit number subtraction appearing on the screen (e.g., 400-11, 389-17, and 372-14). Afterwards, the screen was turned black so that subjects does not go beyond the 10 s task.
(2) Object Rotation. For the object rotation task, subjects were asked to imagine a cube rotating for the 10 s task while the "object rotation" word appeared for 10 s on the screen at about 2 m distance [48][49][50].
(3) Mental Arithmetic vs Object Rotation. In this protocol, subjects were asked to perform the aforementioned MA task versus OB task in between 20 s of rest to distinguish two tasks. e experimental paradigm is depicted in Figure 1(b).

Motor Tasks
(1) Finger Tapping. According to the literature [8,17,19,23,31,37,38,51], subjects were asked to tap the selfpaced index finger of one hand for 10 s afterwards the 20 s rest task was performed allowing signals to return to their reference values. Also, repetition for 10 times was performed as depicted in Figure 1(a). Similar trials were performed on the other hand, while for the LFT vs RFT task, 20 s rest was performed in between two tasks for the restoration of the signal to the baseline level as shown in Figure 1(b).

Visual Task
(1) Checker Box Flickering. In this experiment [4,18,39], a screen was placed in front of the subjects at a distance of approximately 2 m, and also subjects were requested to avoid eye blinking during the experiment. e 10 s task of checker box flickering at 4 Hz was performed followed by 20 s rest of the black screen. e sound was also generated during the transition between rest and task. e paradigm followed for visual stimuli is shown in Figure 1(a).

Experimental
Setup. Since the mental imagery task activates the PFC [34,43], a total 11 of near-infrared (NI) light optodes were placed on PFC, 3 of which were detectors and 8 were the source in accordance with the literature [46,47]. Similarly, for the motor execution task, the primary motor cortex (PMC) is activated [37,43], hence 15 optodes were placed on PMC out of which 8 were the source and 7 were detectors. To extract data for LFT, optodes were placed on the right hemisphere, while for RFT on the left hemisphere [43]. Similarly, for the visual stimuli task data acquisition from the visual cortex [4,18,39], eleven optodes were placed having eight sources and three detectors. e distance between the source and the detector was 3 cm. Optode placement with channel configuration for MC, VC, and PFC is shown in Figure 2.

Signal Acquisition.
In accordance with the literature [6,8,29], the raw optical density signal is converted to oxyhemoglobin (Δc HbO (t)) and deoxyhemoglobin (Δc HbR (t)) concentration using the modified Beer-Lambert law (MBLL) as decribed in the following equation: where β HbX (λ) is the HbX extinction coefficient in µM − 1 cm − 1 , d is the differential path length factor for the curved path in (mm), l is the detector and emitter distance in (mm), and Δψ HbX (t) is the absorbance difference of the light emitter wavelength of λ i .

Signal
Processing. e acquired raw signals of the brain contain various noises that can be categorized into physiological, experimental, and instrumental noise [8,52]. In fNIRS, the instrumental and experimental noises are removed prior to change of the raw signal to its magnitude through MBLL [8,12,14].

Instrumental Noises.
Noises produced due to hardware or by surrounding are categorized under instrumental noises. ese noises usually have high frequency that can be removed using a low-pass filter; furthermore, keeping isolation from external sources such as light can reduce such type of noises [8].

Experimental Noises.
ese errors contain motion artifacts due to unintentional body movements like head motion during signal acquisition. It can cause the dislocation of optodes from the assigned positions that result in a spike-like noise due to change in light intensity. Filters like Kalman and Wiener can be used to remove such type of noises [12][13][14][15].

Data
Analysis. NIRS-SPM is a toolbox designed for fNIRS data analysis. For signal processing, it provides common filtering techniques, namely, Butterworth, Gaussian, and hrf [26]. e comparative data analysis was performed by implementing Gaussian and hrf filtering using the NIRS-SPM toolbox, while other techniques on MATLAB ® 2017b. e generalized mathematical models with necessary details are as follows.

Gaussian Filtering.
A Gaussian filter is used in various forms depending upon the nature of the signal. Generally, a Gaussian filter is based on a Gaussian function which defines the probability distribution of noise or data. It can also be used as a smoothing operator. A Gaussian kernel is used for smoothing the signal in which each value is replaced with the weighted average of itself and its neighboring values [14,21,26]. A simple representation of the 2D Gaussian filter can be defined in the following equation as where x and y are the distance from the origin in horizontal and vertical axis and σ is the known standard deviation of the distribution.

Hemodynamic Response Filter (hrf ).
e hrf is based on the canonical representation of the hemodynamic response functions (HRF) and is used for the temporal smoothing of the fNIRS time series signal. In NIRS-SPM, the given functional data were smoothen using the least square estimate with ideal HRF. e hrf and Gaussian filter model details are in accordance with the literature [26].

Butterworth Filter.
Butterworth filter is a model-based bandpass filter which performs on frequency attenuation using high and low-pass filter. e filtered value not only depends on the weighted average of the unfiltered time series, but also recursively on the previous values of the filtered time series. is filter aims to have a flat frequency response in the desired pass band [26]. e 4 th order Butterworth filter with a band pass of (0.01∼0.1) Hz was applied by MATLAB ® build in a library for the desired experiments.

Finite Impulse Response
Filter. FIR filter is designed by finding the coefficients and filter order so that it performs a cross-correlation between the input signal and the time reversed impulse response; therefore, by sampling the pulse shape, coefficients of the filter are designed [14]. FIR filter of order N can be defined as the following equation:  Journal of Healthcare Engineering where y[n] is the input signal, x[n] is the output signal, N is the filter order, and b i is the value of impulse response at i th instance. Here, the 4 th order FIR filter with a low-pass band of 0.1 Hz was utilized. e coefficients were estimated using the least square estimate. However, a time-varying Wiener filter, based on the short-time Fourier series, was implemented as in [25,53].

Kalman Filter.
Kalman filters the input signal containing statistical and other noises by linear quadratic estimation. e estimated unknown variables based on the Bayesian inference and joint probability distribution tend to be more accurate [13,14]. Its simple model can be seen in the following equation: where F k is the state transition model applied to previous state x k−1 , B k is the control input model applied to the control vector u k , and w k is the process noise. A discrete model of Kalman was implemented in accordance with [24].

Feature Selection.
For the consistency between extracted data across all paradigms, six statistical features (SV, KR, SS, SM, SK, and SP) were used to extract information across data [28,43,46,54,55]. SM is calculated in the following equation as where A x is the input signal such that Δc HbO (t) and n is the total number of observations. Signal variance is calculated in equation (6) as where Z x is the input signal, μ is the mean found from equation (6), n is the number of samples, and σ is the standard deviation. For KR, calculation was made by the following equation: where X is the input signal and E is the expected value of X. SK is the asymmetry of values relative to normal distribution around the mean, hence calculated in the following equation: MATLAB ® polyfit function fits the line to all input data points, therefore used to calculate SS. Similarly, max function was used to calculate SP. Statistical features were rescaled between 0 and 1 using the following equation: where Z ′ is the rescaled feature and Z refers to the original feature values. e scatter plot across all six statistical features for the OB vs MA task of subject 1 is shown in Figure 3.

Support Vector Machine.
Statistical significance of accuracy is analyzed for selection of an optimal filter for a specific cortical region; therefore, for higher classification performance, nonlinear SVM is used [35,38]. It can rescale high-dimensional data and can control errors explicitly by maximizing the margins between two or more classes thus creating hyperplanes named as support vectors [40,41,56]. e cost function that is to be maximized for the SVM classifier gives a correlation between training data and hyperplane as defined in equations (10) and (11), respectively, where z, ξ x ε R 2 , b ε R 1 , z 2 � z T z, k is the positive regularization parameter, ε x is the measure of the training error, and y x is the class label for the nth sample. Here, the thirddegree polynomial kernel function with k � 0.5 and 10-fold cross-validation was applied for the estimation of classification accuracies [53].

Results and Discussion
3.1. Results. In this study, an optimal filter was chosen based on cortical tasks. Activities were categorized based on three main regions such as PFC, MC, and VC. Δc HbO (t) signals were filtered using six filters. Figures 4-6 show the averaged Δc HbO (t) filtered signals across trials of prefrontal, motor execution and, visual stimuli tasks. e three different signals show various paradigms for each brain region, while the horizontal axis is aligned with one complete event. e pictorial analysis of filtered responses shows that using the 4 th order Butterworth changes data form altogether while discrete Kalman and Wiener filtered the signals, but there remain some noises in the output response, whereas, canonical hrf and Gaussian give much smoother response as compared to any other technique. Moreover, previous studies have commonly utilized Butterworth for signal processing, while the obtained visual does not show any significant improvement in comparison. Likewise, the Gaussian filter is also utilized in various studies that assume the acquired data with some normal distribution; therefore, it considers Gaussian function which defines the probability distribution of noise or data. It can also be used as a smoothing operator. A Gaussian kernel is used for smoothing the signal in which each value is replaced with the weighted average of itself and its neighbouring values.

Journal of Healthcare Engineering
However, the hrf filter considers ideal hemodynamic response as a reference for smoothing the data that are closely related to chi-squared distribution. Due to this fact, hrf outperforms any other considered technique.
For consistency among extracted information, classification accuracies were obtained across six statistical features. e accuracies obtained across these features are shown in Table 1. e two-tailed independent t-test was performed to check statistical significance with Holm-Bonferroni for multiple comparison of filters. Results obtained in Table 2 show significance using hrf or the Gaussian filter for R vs MA and R vs OB while for the OB vs MA task, no significant filter was found; however, the time-varing Wiener filter outperformed others. Similarly, in motor execution tasks, hrf was found to be significant for R vs LFT, likewise for R vs RFT hrf and Gaussian. Moreover, for LFT vs RFT, the hrf filter shows better performance. In the VS task, hrf and Gaussian were significant. Moreover, we plot the mean accuracies of filters with respect to cortical regions as depicted in Figure 7. e Gaussian and canonical hrf performed consistently better across all three cortical regions. However, comparatively better accuracies were obtained using hrf only. e statistical results validate our hypothesis of optimal filter selection by assuring that hrf generally has better performance for fNIRS-based studies, which is in accordance with a previous study [53].

Discussion.
Despite of the fact that fNIRS offers portability, low cost, and ease of equipment setup, there still remains a challenge of removing noises like systematic physiological (Mayer waves, muscle activity, blood pressure, and respiration and heart rate) and artifacts [17,57], as fNIRS signals are highly contaminated by measurement noises and physiology-based systemic interference [58]. Several studies have proposed methodologies that can remove noises robustly. In [59], temporal filtering using a low-pass filter with 0.6 Hz cutoff frequency and canonical hemodynamic response function with 4 s full width at half maximum was applied. In [60], exponential moving average and Chebyshev filter were used to remove artifacts from the fNIRS data. Similarly, in [61], only low-pass filter with a 0.14 Hz cutoff was applied to remove physiological noises from fNIRS signals. Aqil et al. [62] used recursive least square estimation for online imaging.
is adaptive approach provided a spatial filtering with low and high pass, detrending, and baseline correction. Similarly, Seo et al. [63] assessed the utility of NIRS in removal of physiological noises in fMRI data by reducing variance of the residual error in the baseline fMRI signal through the NIRS signal in the model. Similarly, in [64], adaptive filtration with the affine projection algorithm was used to accelerate convergence with colored noise, but it increases computational cost. In [65], a bandpass filter based on the 5 th order Butterworth filter was used to filter motor execution signals based on EEG. In literature [52], fNIRS-based walking signals and walking signals while talking were acquired. ese signals were preprocessed using a low-pass with finite impulse response filter, while the talking task results in lowamplitude artifacts with a similar frequency of hemodynamic response, hence it may affect cortical activity [57]. In [66], multiple filters were applied for EEG-and ECG-based signal preprocessing such as for removal of motion artifacts, a median filter with 5-point, for systemic component removal, a second-order Chebychev type-2 filter, high-pass filter for signal drift removal, 1 st order low-pass Butterworth filter for blood pressure signal, and the 4 th order Butterworth bandpass filter for ECG were applied, while using too many filters for every aspect may result in cost of computation. In [8], previously mentioned aspects in Section 2.4 were briefly discussed. To remove such noise methods like MBLL, eigenbased vector approach [19], a bandpass filter was introduced in a MATLAB-based graphical user interface-based program, HomER. However, no such statistical significance comparison was seen for generic tasks. In literature [14,15], cancellation of motion artifacts using Wiener and discrete Kalman filter was discussed; meanwhile, t-tests performed in comparison showed better performance with Kalman. Similarly, in [12,13], four techniques, namely, Kalman, principal component analysis, wavelet analysis, and spline interpolation, were compared to remove NIRS data artifacts. Results showed that spine interpolation and wavelet analysis were significant for such noise removals. In [67], systemic noise was removed using wavelet minimum description length detrending approach and artifact using moving standard deviation and spline interpolation. Eggenberger  Journal of Healthcare Engineering et al. [68] removed movement artifacts by using visual inspection, and Mayer waves were avoided using averaged blocks. In [69], physiological noise was removed with a lowpass filter, and sliding window was applied for motion artifact rejection. Holtzer et al. [70] combined independent component analysis and principal component analysis to remove noise and signal drifts. In [70][71][72][73][74][75][76][77], the signal was low-pass filtered with a cutoff frequency at 0.14 Hz, and in [78], with a cutoff frequency of 0.2 Hz and in [78], a low-pass filter set at 0.67 Hz, meanwhile a moving average filter with a width of 4 s was used to smoothen the signal. In [79], Gaussian smoothing with a full width at half max of 2 s was applied, while motion artifacts were removed using the wavelet minimum description length detrending algorithm. In [80], a bandpass filter (0.01-1.25 Hz) was applied to the signal. Likewise, in [81], a bandpass filter (0.01-0.2 Hz) was applied while motion artifacts were removed through principal component analysis and spike rejection. Similarly, in [82], data were bandpass filtered (0.01 Hz to 0.14 Hz), while the wavelet filter and correlation-based signal improvement were applied to remove motion artifacts. In [83], data were filtered with a 0.01 Hz high-pass filter and a 5.0 s moving average filter and also principal component analysis were applied to reduce physiological noise. Metzger et al.       10 Journal of Healthcare Engineering [84] removed channels with large movement or technical artifacts. Smaller artifacts were corrected with the correlation-based signal improvement method while a low-pass filter with a 5.0 s moving average filter was applied. In [85][86][87], signals were analyzed with SPM99 (Statistical Parametric Mapping software; Wellcome Department of Cognitive Neurology, London, UK). In [88], moving standard deviation-based artifact removal (moving artifact reduction algorithm: MARA) with a threshold of 0.45 for HbO and 0.18 for HbR was applied and signals were linearly detrended and low-pass filtered at 0.1 Hz.
Keeping in view the discussions in the literature, various methodologies were adopted for artifact removal while for systematic physiological noises, mostly high and low-pass filter were applied. erefore, to make use of the best optimal filter for a specific cortical task, we performed an explicit and systematic analysis to find the filter based on statistical significance for mostly used cortical tasks in fNIRS study applications. e result shows that hrf outperforms the discussed techniques due to the fact that it considers ideal hemodynamic response distribution for smoothing the data. Keeping this in view, our future work involves designing of  an adaptive filter technique, considering different distributions of data and noise. erefore, opting more accurate separation of data.

Conclusion
To the best of our knowledge, this is the first study to propose filter selection for commonly used cortical tasks of functional near-infrared spectroscopy (fNIRS) contaminated with artifacts and systematic physiological noises. Six filters, namely, Gaussian, hemodynamic response filter (hrf ), Butterworth, time-varying Wiener, discrete Kalman, and window-based finite impulse response, were tested. e results obtained have validated the overall statistical significance of the hrf for prefrontal, motor, and visual cortex tasks. Furthermore, signals acquired from different cortical regions may contain different types of noises. Hence, the prime goal of this study was to suggest an optimal filter for a specific task, owing to the specific cortical region so that further studies can achieve maximum accuracies by reliably improving the recovered hemodynamic response function. Outcome of this study shows that there is a significant impact of filter selection on the accuracy of the classified data, therefore facilitating users to avoid analysis of complex signal techniques by themselves.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.