Incremental Activation Detection for Real-Time fMRI Series Using Robust Kalman Filter

Real-time functional magnetic resonance imaging (rt-fMRI) is a technique that enables us to observe human brain activations in real time. However, some unexpected noises that emerged in fMRI data collecting, such as acute swallowing, head moving and human manipulations, will cause much confusion and unrobustness for the activation analysis. In this paper, a new activation detection method for rt-fMRI data is proposed based on robust Kalman filter. The idea is to add a variation to the extended kalman filter to handle the additional sparse measurement noise and a sparse noise term to the measurement update step. Hence, the robust Kalman filter is designed to improve the robustness for the outliers and can be computed separately for each voxel. The algorithm can compute activation maps on each scan within a repetition time, which meets the requirement for real-time analysis. Experimental results show that this new algorithm can bring out high performance in robustness and in real-time activation detection.


Introduction
Functional magnetic resonance imaging (fMRI) offers a noninvasive method in studying human brain functions by recording blood-oxygen-level-dependent (BOLD) signal changes related to neuronal activity across the brain with high spatial resolution [1]. Real-time fMRI (rt-fMRI) is a method to assess the acquired data for evidence of an experimentally induced effect at every intracerebra voxel individually and simultaneously. In rt-fMRI, data are processed as fast as they are acquired [2]. For real-time fMRI applications, mapping the activations within a repetition time makes it possible to interact with fMRI experiments in a much more efficient way [3]. Online functional mapping enables researchers to monitor data quality, evolve experimental protocols more rapidly, perform interactive experimental paradigms for neurological investigation [4], achieve neurofeedback by providing feedback of brain activation to the subject in real time [5], which may have potential use in clinical applications [6].
In common fMRI experiments, MRI scanner acquires whole brain data at an interval of 2 seconds, also called repetition time. To meet the real-time requirements, all the processing steps of real-time fMRI need to be completed within a repetition time. Simple real-time fMRI processing steps consist of data reconstruction, spatial realignment (head motion correction), and statistical analysis. Among them, incremental statistical analysis on each voxel of the fMRI dataset will result in huge computational costs. To overcome the computational costs of the statistical analysis, a number of incremental activation detection algorithms have been developed for rt-fMRI applications.
Cox et al. [7] proposed the first real-time incremental activation detection algorithm and the correlation based activation detection methods are not able to model multiple experimental and confounding effects simultaneously. Gembris et al. [8] proposed correlation analysis method using sliding window technique. Friston et al. [9] proposed the general linear model (GLM), which can be used as a unified framework in the analysis of fMRI data and support multiple experimental design, but it cannot be used to rt-fMRI applications, because it needs all of the data to do the statistical analysis. The widely used correlation based techniques are special cases of GLM with a white noise model for the temporal errors in the signal. Bagarinao et al. [10] proposed an algorithm using an orthogonalization procedure to estimate the coefficient of general linear models. Roche et al. [11] proposed an algorithm using extended kalman filter (EKF) method to fit a general linear model on fMRI time courses. This technique adopts the GLM-AR model under the assumption that the fMRI noise is significantly autocorrelated. The extended kalman filter is able to fit incrementally GLM coefficients along with the one-order autoregressive noise model. In this paper, we mainly focus on the EKF method, because it requires low computation costs and memory costs, and the design matrix can be assembled incrementally, which makes it possible for more complex interactive experiment design.
The fMRI time series has low signal to noise ratio [12]. The noise in the fMRI signal is complicated. It includes not only the usual MRI sensor noise but also physiological fluctuations that affect the signal. The fluctuations of MRI scanner or severe head motion may generate sparse noise in the signal. In our model, under the assumption that the sparse noise will not change the noise distribution, a variation is added to the extended kalman filter to handle the additional measurement noise term, that is, sparse; this term can be used to model the sparse measurement outliers.
In recent years, with the developments of convex optimization, Mattingley and Boyd [13] created a robust kalman filter by replacing the standard measurement update, which can be interpreted as the result of solving a similar convex minimization problem, which includes an 1 term to handle the sparse noise. We use the robust kalman filter method to solve the value of the sparse term in our model.

Method
Real-time fMRI signal is three-dimensional volume data at each scan during a repetition time and the intensity of each voxel represents the blood oxygen level associated with neural activities. Data of each incoming fMRI scans is spatially aligned with the first scan of the series. Voxel values at different scan time points are arranged to time sequence, forming the measured time series. Each voxel will form a time series, and the length of time series is growing with the time increasing. The time series of each voxel can be calculated independently, so in the following discussion we only consider the situation of a single voxel time series.

Extended Kalman Filter Incremental Detection.
Roche et al. [11] proposed an extended kalman filter based algorithm to fit incrementally the general linear model along with a oneorder autoregressive noise model. The general linear model (GLM) explains the measured time series = [ 1 , 2 , . . . , ] in terms of a linear combination of the explanatory variables = [ 1 , 2 , . . . , ] plus an error term. The explanatory variable contains paradigm-related regressors based on the experiment design and signal characteristics and regressors are obtained by convolving the different stimulation onsets with a canonical hemodynamic response function. Model the low-frequency drift, hence enabling us to "detrend" the signal (we use polynomials up to order three). Then combine the explanatory variables into a design matrix = [ 1 , 2 , . . . , ]. The GLM can be expressed as The design matrix is is the explanatory variable, = [ 1 , 2 , . . . , ] contains (unknown) parameters which represent the coefficients of the explanatory variables. The design matrix contains paradigmrelated regressors based on the experiment design. The regressors can be obtained by convolving the different stimulation onsets with a canonical hemodynamic response function, or modeling the low-frequency drift to detrend the signal.
In the GLM-AR model, it is assumed that is a stationary Gaussian zero mean AR(1) random process and the relationship between and −1 can be expressed as.
is an (unknown) autocorrelation parameter, and is a white noise with instantaneous Gaussian distribution (0, 2 ). Alexis Roche proved that the maximum likelihood estimate of ( , ) can be computed independently from and they found that one can reach the maximum of the likelihood function by finding the minimum of ( , ): where = [ 1 , 2 , . . . , ] denote the values of explanatory variables at time .
First, they combine the parameters to be estimated and into a ( + 1) × 1 state vector = [ ; ] and assume that, at time , the current estimate of iŝ Secondly, linearize the error function ( , ) to ( ) around the current estimate using a first-order Taylor expansion: Finally, they solved the (nonlinear) least-squares regression problem by means of an EKF. The EKF updates the parameters using the following recursion: where is the kalman gain in th step and Σ denote the normalized posterior covariance matrix of given the information available at time . At time , the posterior covariance matrix of the state vector is approximated by Cov( ) ≈̂2Σ and the incremental update rule for is as follows: Computational and Mathematical Methods in Medicine 3 The number of explanatory variables is , and is the × 1 vector which contains the coefficients related to the explanatory variables. Σ × is obtained by extracting the left superior × block from the matrix̂2Σ . For a given contrast vector , we can identify the voxels that show a contrasted effect: where = ( Σ × ) −1/2 and erf(⋅) is the error function. Testing for positive activations can be achieved at any time by thresholding the image of -statistics.

Outlier Detection Method.
The sensor failures or measurement outliers will cause the sparse measurement noise and they may cause rapid degrade on the detection performance. We derive a new algorithm to detect the outliers in order to eliminate the effect on the kalman filter algorithm.
We suppose that there is a sparse term which is always zero, and it has no effect on the distribution of the total noise; then the general linear model can be modified as Then linearize the ( , , ) to ( , ): , is an unknown variable at time , while −1 is an known variable at time .
The linearized constraint equation is approximate as follows: The measurement update step of standard kalman filter algorithm is essentially an optimization problem, and the linearized parameter optimization problem can be described as follow: where and V are the unknown parameters to be estimated, Σ denotes the steady-state error covariance associated with predicting the next state , and the measurement is and the measurement noise term V is a white noise with instantaneous Gaussian distribution (0, ).
We use the robust kalman filter method to detect the outlier hidden in the measurement by replacing the measurement update with the solution of a similar convex minimization problem, which includes an 1 term to handle the sparse noise.
To (approximately) handle the additional sparse noise term , we modify the kalman filter measurement update step. The modified optimization problem is as follows: In the optimization problem (12), , , and V are the variables to be estimated.
To solve this convex optimization problem, we adopt a fast transform method proposed by Mattingely and Boyd [13], which makes problem solving become more effective: After the transformation, the original problem was transformed into an equivalent convex quadratic program problem: With variable , solving the convex quadratic program, we can achieve the sparse noise at each time. Fortunately, for one voxel at time , the size of , and is 1 × 1. Hence this optimization problem is equivalent to solve the solution of a piecewise-quadratic function. This problem has an analytical solution, means we can use analytical expression instead of the searching optimization loop.

Robust Extended Kalman Filter. The standard kalman filter consists of alternating time and measurement updates.
Since is slowly varying parameter, so it remains unchanged in the time update step. Both of the time and measurement updates are derived by the minimum mean-square error estimates of . As is shown in (5), the measurement update equation is as follows: After solving the problem, we achieve an estimate outlier valuêand then combine it into the update step: Furthermore, we combine thêinto the incremental update rule for :  For the real-time -test, the method is exactly the same as Alexis Roche's method.
Finally, the algorithm recursion is summarized as follows:

Result
The algorithm is tested on a single run from an fMRI experiment involving a visual and auditory task. The protocol is block design, and the run consisted of 10 blocks, each block including one activation epoch (20 s) and one control epoch (10 s). We aim at finding the voxels associated with the visual and auditory function; each activation epoch, the visual stimulus, and auditory stimuli are present, but the intensity is different. The data has an abrupt head motion during the scan in the 45th repetition time (TR) and the head motion caused  severe motion artifacts. The repetition time was 2 seconds for a total of 152 scans. Functional images have 80 * 80 * 33 voxels. As is shown in Figure 1, the red curve is the reference vector obtained by convolving the stimulation onset with a canonical hemodynamic response function, which is assumed the time series should be; the blue curve is the time series data of an active voxel; obviously there is an outlier at the 45th TR.
In Figure 2, all of the three curves tend to be stabilized and the correlation coefficient of activate voxel is all converge to around 26. The green curve presents Cox method, and it has a great drop at the outlier scan, and the extended kalman   filter method has an abnormal rise at the 45th TR, while our method has no great fluctuation at the outlier scan. This shows its robustness to the sparse noise. Cox method derived a threshold for active detection, so there is no value in their method. Figure 3 shows the comparison of the -test value between the Robust extended kalman filter and the extended kalman filter. We can see that both of the two algorithms have significant decrease on the outlier. Obviously, our method is more stable at the outlier scan and has a higher value after the outlier scan.
We also tested the algorithm on an inactivate voxel. In Figure 4, the blue curve is the time series of an inactive voxel and at the 45th TR there is also an outlier.  The -test value of the inactive voxel should be low, as is shown in Figure 5; the correlation curves of the three methods tend to be stable below zero. There is a big fluctuation at the outlier scan according to Cox method and our method finally obtained a stable value lower than the extended kalman method. Before the outlier scan, the correlations of the two methods show little difference. After the outlier scan, our method becomes lower than the extended kalman method.
In Figure 6, a comparison of the value between our method and the extended kalman filter method is shown. We can see that value of both the two algorithms has fluctuations at the outlier, and the fluctuations of our method are 6 Computational and Mathematical Methods in Medicine Figure 8 smaller than those of extended kalman filter method. Our method tends to be stable at a lower level. This again shows our new method's robustness to the sparse noise in the inactive voxel. Figure 7 is the result of outlier detection, as is shown in Figures 1 and 4; the original signal contains a big outlier at the 45th TR and the margin of the two outlier is different. Our outlier detection algorithm detect two outliers from both the active and inactive voxel at 45th TR, it is shown to be robust for both the active and inactive voxels.
We have applied our detection algorithm to the whole brain data, in the case where threshold value the same, the final activation maps are shown in Figure 8; left half is the activation result achieved by the extended kalman algorithm and right half is the result of the algorithm described in this paper. As expected for a visual auditory task, activations are found in the visual cortex and auditory cortex on both sides. This means that, under the same threshold of value, we can get more voxels associated with the task.
We have a C++ version of the algorithm, and we test it on a 16-core processor workstation, in which computing hole brain (80 * 80 * 32) voxels costs 0.3 s to 0.4 s; there is plenty of time to do some other processes. Our algorithm can be used on the real-time fMRI application.

Discussion
In our algorithm, parameter needs to be defined before the experiment. In the optimization object function, the parameter can be regarded as the weight of the sparse term. With a large , the outlier detection algorithm will find nothing and result zero in , the measurement update will be the same as the extended kalman filter method. So if no outlier is detected, the result of the algorithm will be the same as that of in the extended kalman filter. Small will make the algorithm modify the measurement update frequently. Under this situation, the mistaken outlier will have an effect on the noise distribution and cause the unstability of the kalman filter. Here we can regard as the detection threshold and by holding a large threshold we can detect the obvious outlier, which may be caused by the machine failure. The value of in our algorithm is in a wide range. We suggest taking a relatively large to test if the result is stable. In this experiment, we test from 50 to 300 and the algorithm can detect the outlier and the result is robust to the outlier.
The fMRI signal has a low signal to noise ratio, so we cannot give an exact threshold value and this algorithm provides a method to threshold the signal, and we can even achieve the value of the outlier and use it to modify the detection algorithm. fMRI experiment is complex, which needs cooperation with subject and operator; any problem of them may cause the experiment to fail. Our algorithm can detect this kind of failure and it can detect the outliers, which may have a great effect on follow processes, where we can stop the experiment to check the problem or just mark it and eliminate the outlier data after the experiment.
In future works, we will explore a proper method to determine , just enough to detect the outlier but not cause the unstability of the kalman filter algorithm.

Conclusion
The robust kalman filter is introduced in the activation detection of fMRI experiment. Convex optimization method is used to modify the extended kalman filter by introducing a sparse noise term. The robustness of our method to the sparse noise is improved. Moreover, the performance of the proposed method does not degrade rapidly when disturbances are involved. When applied to the time series voxels, our method can obtain more stable -test value in both activate and inactivate voxels. The proposed algorithm can also detect outliers, which may have a great effect on following processes. The detected outlier information can be used to reject or modify the bad data, which may have potential benefit for real-time applications that require higher data quality.