Multiscale Autoregressive Identification of Neuroelectrophysiological Systems

Electrical signals between connected neural nuclei are difficult to model because of the complexity and high number of paths within the brain. Simple parametric models are therefore often used. A multiscale version of the autoregressive with exogenous input (MS-ARX) model has recently been developed which allows selection of the optimal amount of filtering and decimation depending on the signal-to-noise ratio and degree of predictability. In this paper, we apply the MS-ARX model to cortical electroencephalograms and subthalamic local field potentials simultaneously recorded from anesthetized rodent brains. We demonstrate that the MS-ARX model produces better predictions than traditional ARX modeling. We also adapt the MS-ARX results to show differences in internuclei predictability between normal rats and rats with 6OHDA-induced parkinsonism, indicating that this method may have broad applicability to other neuroelectrophysiological studies.


Introduction
VARIOUS types of methods have been used to assess the degree of similarity or shared information between two signals. The methods used depend on the type of the presumptive system which processes the one "input" signal into the other "output" signal. Two basic classifications of systems are whether they are memoryless or not, and whether they are linear or not.
Common linear memoryless methods include the crosscorrelation in the time domain or coherence in the frequency domain. Common linear models with memory include autoregressive models with exogenous input (ARX), autoregressive moving average (ARMA), Box-Jenkins, Output-Error, and linear state-space models.
Higher-order nonlinear methods with memory are also sometimes used, such as polyspectral models, nonlinear ARX models, neural networks, Hammerstein-Wiener models, and Volterra models, but these are more difficult to train.
Other statistical evaluations of the similarity focus on the transfer of information between two signals, rather than explicit modeling and prediction. Examples of these analyses include Granger causality analysis, time-delay mutual information, and transfer entropy. The information theoretic analyses can measure nonlinear as well as linear effects.
Complex systems such as the brain are difficult to analyze because of the huge number of individual neuronal/synaptic paths between nuclei, the nonlinear nature of neuronal connections, and the operation at multiple time scales.
One approach is to use simple low-order linear models to approximate the transfer function relationship, such as autoregressive with exogenous (ARX) models. The advantage of using linear ARX models is that there is no need to estimate nonlinearity parameters, and less training data is required. However, the performance of such models depends crucially on the model order, scale, and prefiltering. To mitigate this dependency, a multiscale version of the autoregressive with exogenous input (MS-ARX) model has recently been developed by Nounou and colleagues [1]. The MS-ARX model allows automatic selection of the optimal scale for the ARX prediction.
In this paper, we adapt and apply the MS-ARX model to evaluate the degree of information transfer between cortical electroencephalogram (EEG) and subthalamic nucleus (STN) local field potential (LFP) signals. In a rat model of Parkinson's disease, the multiscale ARX approach showed significant differences in connectivity compared to normal.

Autoregressive System
Identification. The ARX model is a common method to represent output signals from an unknown system by using a linear combination of past output signal values and past input values. We will be following the notation of Nounou and colleagues in our model description. The equation for the ARX model is where y is the output, u is the input, α i and β m are the estimated system coefficients, and p and q are the maximum orders of the autoregressive and input filters, respectively. Equation (1) may be written in matrix form as where y(n−1) · · · y n− p u(n−1) · · · u n−q y(n−2) · · · y n− p−1 u(n−2) · · · u n−q−1 y(n−3) · · · y n− p−2 u(n−3) · · · u n−q−2 . . .
The weight parameters α i and β m may be solved using least squares: The maximum filter lengths p and q may be estimated by minimizing some criterion such as the Akaike information criterion (AIC): where r is the number of model parameters, and L is the likelihood function quantifying the model goodness of fit.

Wavelet Decomposition.
Signals may be decomposed into a multiscale time-frequency representation by projecting the signal onto an orthonormal set of basic functions. These functions correspond to a particular scale and translation of a prototype scaling function φ jk (t) and wavelet function ψ jk (t), given by

Select optimum scale during training
For the Haar wavelet used in this paper, 2.1.2. Multiscale ARX Modeling. We applied the multiscale ARX approach presented by Nounou and colleagues. Briefly, the input (EEG) and output (LFP) data were first split in half into a training and a validation set. Second, both sets were decomposed using Haar wavelets into multiple scaled approximations in addition to the original undecimated scale. Third, at each scale an ARX model was trained using the model structure selected by an AIC minimization. Fourth, the computed ARX model from each scale was converted to the original sampling rate using the following theorem proved by Nounou et al.: an ARX transfer function Y j (z)/U j (z) = G(z) at scale j is equivalent to the transfer function Y 0 (z)/U 0 (z) = G(z 2 j ) at scale 0, the original undecimated scale. Fifth, the optimal scale ARX model was selected as the one which provided the smallest mean-square Computational and Mathematical Methods in Medicine 3 error (MSE) on the validation set ( Figure 1). The MSE is defined as

Neural Data
Collection. The motor cortex has been shown to project into the subthalamic nucleus (STN), thus implying a system of unknown electrical parameters with the EEG as the input and the STN local field potential (LFP) as the output [2]. Furthermore, studies in preclinical models of Parkinson's disease have shown increased correlation between neighboring neurons [3] and increased coherence in the 15-30 Hz band between basal ganglia nuclei [4,5].
To examine the connection strength between these disparate brain areas using MS-ARX, we simultaneously recorded voltage data from the motor cortex EEG and STN LFP of anesthetized normal rats and hemiparkinsonian (HP) rats. Parkinsonism was induced by the vendor (Charles River) by 6-hydroxydopamine injection (12 μg in 4 μL, injected into coordinates AP −1.5 mm, ML +1.8, DV −7.5 from dura at 0.67 μL/minute, cf. [6,7]) and was verified by apomorphineinduced rotation testing and histological verification as detailed elsewhere [7,8]. Briefly, rats were injected with apomorphine HCl (0.2 mg/kg, subcutaneously) at 3 and 5 weeks after 6-OHDA exposure and the number of contralateral turns over 35 minutes were counted with an automated rotameter. Rats averaging more than 7 turns per minute were included in the HP group. After recording and euthanasia, the brain was frozen, sectioned coronally, stained for tyrosine hydroxylase to confirm 95% unilateral lesioning of the substantia nigra pars compacta, and stained with cresyl violet to confirm electrode localization. All procedures were approved by the Pennsylvania State University Institutional Animal Care and Use Committee.
For EEG recording, animals were deeply anesthetized with urethane, with nominal initial dose 1.3 g/kg (i.p.) and additional doses given as needed to maintain surgical anesthesia. Stainless steel screws were implanted above bregma and above motor cortex (AP +3.7 ± 1.0 mm, ML +2.5), and the EEG signal recorded as the potential difference between these screws was subsequently amplified and filtered between 0.1 Hz and 500 Hz (3500, A-M Systems) and digitized at an initial rate of 1000 samples per second. An occipital screw was used as the reference electrode. The LFP signal was taken from the tip of the tungsten microelectrode (1-2 MΩ, FHC Inc), filtered between 5 Hz-500 Hz, amplified, and digitized. Offline, both signals were filtered between 0.1 Hz and 100 Hz and downsampled to 200 samples per second.
Recordings were taken from 9 normal rats (37 distinct STN sites) and 8 HP rats (26 distinct STN sites). Electrode tracts were histologically confirmed.
We used a maximum of 260 taps in our AIC structure selection step. This maximum was empirically selected based on the observation that the peak frequencies in our data were usually between 0.8-1.3 Hz or higher, thus allowing at least one cycle period within the ARX filter length at scale zero. The maximum wavelet decomposition level was 4. Only  Figure 2: Mean MSE across all recordings at their optimal scale ( * denotes P < 0.05 rank-sum test between Normal and HP groups).
recordings with robust slow-wave activity were included in the analysis [9]. Table 1 shows the MSEs at different scales. The undecimated scale was the optimal scale for approximately half of the recordings. The other recordings saw better ARX prediction performance at higher wavelet scales (more heavily filtered wavelet approximations).

Results
The mean MSE at the optimum scale was significantly lower in the HP group compared to the normal group ( Figure 2, P < 0.05, rank-sum test). Also, the ratio (computed for each individual recording) of the mean of the absolute value of the best-scale AR coefficients (α i ) to the mean of the absolute value of the best-scale exogenous input coefficients (β m ) was significantly lower in the HP group ( Table 2, P < 0.05, rank-sum test). Figures 3 and 4 show samples of the wavelet decimation and MS-ARX predictions, illustrating the differences between scales.

Discussion
The MS-ARX technique showed improved prediction accuracy compared to the traditional ARX approach (which uses scale 0 only). This makes sense because the MS-ARX  approach uses cross-validation to automatically select the best tradeoff between smoothing and preservation of signal details.
The results seen of decreased MSE and increased proportion of exogenous input weights also indicate that linear prediction of the STN LFP based on the cortical EEG is more accurate in the HP condition and is based more on cortical input. This indicates a greater amount of similarity between the population-based cortical and STN electrical activity in the HP case. Future studies should investigate whether this increased predictability is correlated to parkinsonian pathophysiology or symptomatic severity. Brown and colleagues have shown that there is increased coherence between the cortical EEG and STN LFP in the 15-30 Hz "beta" band in PD [4,5,10,11], and also that STN LFP beta activity is correlated to clinical symptoms [12][13][14], suggesting that the linear predictability in those scales may also be correlated to clinical symptoms. However, unlike the coherence, the MS-ARX prediction measure we describe can also measure nonperiodic content similarity between the cortical EEG and STN LFP. Thus it may provide an additional useful tool to investigate nonperiodic corticosubthalamic interactions. The recent optogenetic study by Deisseroth and colleagues showed that high-frequency stimulation of the motor cortex achieved similar symptomatic amelioration as STN stimulation, presumably affecting the STN through the hyperdirect pathway [2,15]. However, this effect was only seen in high-frequency stimulation, suggesting that the temporal scale of synchronization may be important. The multiscale predictability analysis presented here can aid in analyzing these scale-differential effects.
In conclusion, the MS-ARX method is well-adapted to the high-level analysis of neural signals from different brain nuclei at multiple scales.