Automatic Lateralization of Temporal Lobe Epilepsy Based on MEG Network Features Using Support Vector Machines

1 Key Laboratory for Neuroinformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
2 Department of Magnetoencephalography, Nanjing Brain Hospital Affiliated to Nanjing Medical University, Nanjing, China
3 State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
4 Department of Medical Information and Engineering, College of Electrical Engineering and Information Technology, Sichuan University, Chengdu, China
5 School of Biomedical Engineering, Hubei University of Science and Technology, Xianning, China


Introduction
Temporal lobe epilepsy (TLE) is the most common type of drug-resistant focal epilepsy in adults [1]. TLE is traditionally associated with mesial temporal sclerosis (MTS), cell loss, and gliosis in the hippocampus, entorhinal cortex, and amygdala [2]. Currently, surgical intervention is the main choice of treatment for medically intractable TLE [3]. However, surgery helps only 70% of patients become seizure free [4]. Indeed, approximately one-third of TLE patients are unable to control their seizures, even with the best available medications and surgery. Correct clinical diagnosis for TLE is critical for improving surgical outcomes and requires highly trained professionals [5]. Manual diagnosis of unilateral TLE using brain-neuroimaging methods is time-consuming, and different experts may give contradictory diagnoses for the same data [6]. Therefore, an objective and automated tool that can accurately classify brain images is desirable for presurgical evaluation of epileptogenic lateralization in TLE.
The support vector machine (SVM) method is a supervised machine learning technique for classification, and the radial basis function (RBF) kernel is the most commonly used kernel function for SVM [7]. As a classification tool, the SVM technique is flexible, automated, and sufficiently fast to operate in a clinical setting [8]. SVM algorithms have been applied for measuring brain morphology [9], including cortical thickness, volume, curvature, and identification of MTS in TLE patients. SVM approaches have been utilized to determine lateralization of the TLE epileptogenic focus with diffusion tensor imaging (DTI) structural connectomes [10]. Another study verified the use of SVM for voxel-based MRI classification, showing that TLE with MTS can be distinguished from TLE without MTS with over 88% accuracy [8]. These investigations have combined SVM with MRI or DTI. However, one of the newest neuroimaging tools, magnetoencephalography (MEG), has rarely been applied for solving the classification problem of unilateral TLE.
During the last decade, MEG has become increasingly available as a noninvasive, reliable, fast, and patient-friendly technique for recording brain activity [11][12][13][14]. MEG has been widely applied for studying epilepsy disorders, particularly the localization of pathological brain activity or lesions in candidates for epilepsy surgery [15][16][17]. In addition, the sensitivity of MEG has been investigated for spike detection that depends on two sensor types (magnetometer and gradiometer) in patients with epileptic foci in the mesial temporal lobe [18]. The results revealed that the magnetometer was more sensitive in the mesial temporal area, whereas the gradiometer performed better in the lateral temporal area. In another study, the ability of MEG to determine the localization and orientation of medial temporal spikes was evaluated in patients with known medial TLE using magnetic source imaging (MSI) with equivalent current dipoles (ECD) [19]. The results revealed that MSI with ECD was able to detect medial temporal spikes, providing important localizing information in patients with medial TLE. In our previous MEG study, we investigated the relationship between endogenous neuromagnetic signals in patients with epilepsy and epileptic foci determined by clinical data, analyzing the performance of several existing methods for localizing the epileptic focus, such as ECD, imaginary coherence (IC), and synthetic aperture magnetometry (SAM) [20]. The results suggested that the IC method performs better than ECD or SAM for quantitatively identifying epileptic activity. Thus, in the current study, we calculated connectivity matrices to construct a brain-network for all participants, based on the IC method.
Brain regions and the structural or functional associations between them constitute a brain-network. Complex network analysis is useful for quantitatively characterizing the properties of brain-networks with a small number of network measures [21,22]. Recent studies have established that TLE affects a distributed neural network, with widespread extratemporal effects, rather than having a single focal epileptogenic source [23][24][25][26]. Based on both structural and functional connectivity (FC) analyses, accumulating evidence suggests that brain-networks in TLE patients are pathologically altered [27][28][29][30]. Hsiao et al. [31] investigated FC alterations in the default mode network (DMN) in TLE, using resting-state spike-free MEG recordings. Their results revealed that TLE involved changes in FC within the DMN and that the change was associated with the lateralization of TLE. In right TLE, FC between the DMN and the right medial temporal region is enhanced, while left TLE involves enhanced FC between the DMN and bilateral medial temporal regions [31]. By analyzing resting-state MEG signals, Jin et al. [32] calculated betweenness centrality at the source-level functional network in mesial TLE. The results indicated that altered electrophysiological functional hubs reflected pathophysiological brain-network reorganization [32]. To test the hypothesis that FC and network characteristics are useful for determining the lateralization of TLE, Yang et al. extracted resting-state functional brain-network features as inputs to an SVM [33]. Using a leave-one-out cross-validation strategy, their SVM model achieved a prediction accuracy of 83% with 12 TLE patients. However, there are few studies using MEG data to construct functional brain-networks for investigating the lateralization of unilateral TLE.
In the current study, we extracted brain-network features of patients and healthy controls based on a resting-state MEG scan, including nodal degree, betweenness centrality, and nodal efficiency. The features were used as input to the SVM to classify left TLE, right TLE, and healthy controls. The results indicated that nodal degree exhibited the best performance for left TLE versus healthy control and right TLE versus healthy control, while betweenness centrality performed best for left TLE versus right TLE. Thus, our findings indicated that MEG is helpful for solving the lateralization problem of unilateral TLE.

[34] and the laterality of the seizure origin was determined using clinical history, a comprehensive neurological examination, interictal EEG recordings, and neuroimaging. Fifteen healthy volunteers (age range: 19-45 years, mean age: 27.9 years) were recruited as controls who had no history of symptoms related to neurological or psychiatric disorders. There were no significant differences in age or sex between the groups.

Ethics Statement.
A full explanation of this study was provided to all patients and control participants before enrollment. All participants provided voluntary and informed written consent according to the standards set by the ethical committee of Nanjing Brain Hospital of Nanjing Medical University, which approved the study.

MEG Recordings.
MEG recordings were performed while participants were seated inside a magnetically shielded room using a whole-head CTF 275-Channel MEG system (VSM MedTech Systems Inc., Coquitlam, BC, Canada). Before beginning data acquisition, three small coils were attached to the nasion, left, and right preauricular points on the head of each participant. The three coils were subsequently activated at different frequencies for measuring each individual's head position relative to the MEG sensors. The system allowed for head localization to an accuracy of 1 mm. The sampling rate of MEG recordings was 1200 Hz.
All MEG data were recorded with noise cancellation of third-order gradients. Each epoch took 120 seconds and 15 epochs were recorded from each participant. The head position was measured before and after each epoch. The limit for acceptable head movement during MEG recording was 5 mm. Spontaneous resting-state activity was recorded while participants remained in a supine position with their eyes closed. The resting-state in this study was defined as spontaneous activity not evoked by cognitive tasks and in the absence of seizure activity.

MRI Acquisition.
MRI acquisition was similar to that in our previous study [20]. MR images were acquired on a 3 T scanner (Siemens Medical Solutions, Erlangen, Germany). The protocol typically included the following sequences: (1) a T1-weighted, 3D spoiled gradient-recalled echo in a steady-state sequence with TR = 7.5 ms, TE = min full, flip angle = 15°, and field of view = 240 mm × 240 mm; (2) a T2-weighted image, 2D fluid-attenuated inversion recovery (FLAIR) with TR = 8000 ms, TE = 120 ms, TI = 2000 ms, matrix = 192 × 256, field of view = 240 mm × 240 mm, and slice thickness = 5 mm in the coronal and axial planes, respectively. To allow for accurate coregistration of the two data sets, the three fiducial points were placed in locations identical to those of the three coils used in the MEG recordings.

MEG Preprocessing.
All MEG recordings were visually examined off-line by two experienced epileptologists, and the peaks of all epileptic spikes were marked manually based on the MEG recordings. We extracted three quasi-stationary segments (20 s per segment) from each participant. The segments were at least 10 s away from recent epileptic discharge and free from eye- or muscle-related artifacts or epileptic activities. Then, the data were band-pass filtered in the 1 to 4 Hz range as a frequency band of interest for further analysis. Preprocessing and analysis of MEG data were performed with Brainstorm [35], a well-documented software package that is freely available to download online under the GNU general public license (http://neuroimage.usc.edu/brainstorm).

Network Construction.
The Freesurfer open-source software package was used to extract the cortical envelope (http://surfer.nmr.mgh.harvard.edu/fswiki/DownloadAndInstall) of each subject based on their individual MRI results. The cortical regions of interest (ROIs) in each of the individual hemispheres were identified from the reconstructed brain of each participant according to the automatic anatomical labeling template, using Desikan-Killiany atlases to define ROI [36]. The cortical surface of each subject was downsampled to 15,000 vertices. The standardized low resolution brain electromagnetic tomography (sLORETA) was then used to extract time series for each vertex [37] (see Brainstorm Tutorials for details, http://neuroimage.usc.edu/brainstorm/Tutorials). sLORETA was based on minimum-norm estimation (MNE) and the cortical source activities (current density) were normalized with an individual estimate of the source standard deviation at each point. MNE can fit the MEG data through a forward model with a solution of minimum energy. The method has been identified as an efficient tool for functional mapping, since it is consistent with physiology and capable of correcting localization. Subsequently, the dynamic current strengths of the source activity at grid points within each ROI were averaged to represent the cortical source dynamics of each ROI. Based on the time-varying source strengths, the IC, which is insensitive to volume conduction effects [38], was used to estimate the functional connectivity between each pair of ROIs. The details of our IC calculations have recently been published [20]. IC calculation resulted in a full 68 × 68 adjacency matrix between ROIs for each individual. The flowchart for the brain-network construction is displayed in Figure 1.
In general, networks can be represented by graphs that consist of sets of nodes and the corresponding sets of links between the nodes [22,39]. In this sense, ROIs are considered to be the nodes of a network and the IC values are considered to be the links between them. We calculated graph-theory metrics from these weighted matrices using the GRETNA toolbox [40] (https://www.nitrc.org/projects/gretna/) running in Matlab (version 8.1 (R2013a), Mathworks Inc.). The sparsity threshold used for constructing weighted networks was varied from 5% (the 5% strongest connections (edges) were kept along with their weights) to 40% of all possible connections within the network, with a step size of 1%, generating 36 thresholded weighted networks.
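The connectivity step can be illustrated with a minimal NumPy sketch. It is not the published IC implementation [20]; it assumes the common definition of imaginary coherence as the absolute imaginary part of the cross-spectrum divided by the square root of the product of the two auto-spectra, with spectra averaged over Hann-windowed segments. The function names, the segment length, and the averaging over the 1 to 4 Hz band are illustrative assumptions.

```python
import numpy as np

def imaginary_coherence(x, y, fs, seg_len):
    """Absolute imaginary coherence per frequency bin (segment-averaged spectra)."""
    win = np.hanning(seg_len)
    n_seg = len(x) // seg_len
    n_bins = seg_len // 2 + 1
    sxy = np.zeros(n_bins, dtype=complex)   # cross-spectrum accumulator
    sxx = np.zeros(n_bins)                  # auto-spectrum of x
    syy = np.zeros(n_bins)                  # auto-spectrum of y
    for k in range(n_seg):
        seg = slice(k * seg_len, (k + 1) * seg_len)
        fx = np.fft.rfft(win * (x[seg] - np.mean(x[seg])))
        fy = np.fft.rfft(win * (y[seg] - np.mean(y[seg])))
        sxy += fx * np.conj(fy)
        sxx += np.abs(fx) ** 2
        syy += np.abs(fy) ** 2
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    eps = np.finfo(float).tiny              # guards against 0/0 in empty bins
    ic = np.abs(sxy.imag) / np.sqrt(sxx * syy + eps)
    return freqs, ic

def ic_matrix(roi_ts, fs, seg_len, band=(1.0, 4.0)):
    """Symmetric ROI-by-ROI matrix of band-averaged imaginary coherence."""
    n_roi = roi_ts.shape[0]
    mat = np.zeros((n_roi, n_roi))
    for i in range(n_roi):
        for j in range(i + 1, n_roi):
            freqs, ic = imaginary_coherence(roi_ts[i], roi_ts[j], fs, seg_len)
            in_band = (freqs >= band[0]) & (freqs <= band[1])
            mat[i, j] = mat[j, i] = ic[in_band].mean()
    return mat
```

With 68 ROI time courses as rows of `roi_ts`, `ic_matrix` would return a 68 × 68 matrix with a zero diagonal. Two signals that differ only by a zero-lag mixing factor give a value of zero, which is the property that makes the measure insensitive to volume conduction.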
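The sparsity thresholding can be sketched in a few lines of NumPy. This is an illustration of the idea rather than the GRETNA implementation; the function name and the rounding of the number of retained edges are assumptions.

```python
import numpy as np

def threshold_by_sparsity(w, sparsity):
    """Keep the strongest `sparsity` fraction of undirected edges with their weights."""
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    iu = np.triu_indices(n, k=1)              # each undirected edge counted once
    weights = w[iu]
    n_keep = int(round(sparsity * weights.size))
    out = np.zeros_like(w)
    if n_keep > 0:
        keep = np.argsort(weights)[::-1][:n_keep]   # indices of the strongest edges
        out[iu[0][keep], iu[1][keep]] = weights[keep]
        out = out + out.T                      # restore symmetry
    return out

# Sparsity levels from 5% to 40% in steps of 1%: 36 thresholded networks.
sparsities = np.arange(5, 41) / 100.0
```

For a 68-node network there are 68 × 67 / 2 = 2278 possible edges, so a sparsity of 5% keeps about 114 of them under this rounding rule.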

Feature Extraction.
Graph-theory measures provided three features that characterize the nodal properties of the network in this work: nodal degree ($k_i^w$), nodal efficiency ($E_i^w$), and betweenness centrality ($b_i^w$).
Measures of node centrality can effectively describe the importance of individual nodes in the network. The nodal degree is considered a basic and important measure of centrality and represents how strongly one node is interacting, structurally and functionally, with other nodes in the network. Besides node degree, measures of centrality may be based on the length or number of shortest paths between nodes [21]. The weighted degree of node $i$ is defined as
$$k_i^w = \sum_{j \in N} w_{ij},$$
where $N$ is the set of all nodes in the network and $w_{ij}$ is the connection weight of the link between nodes $i$ and $j$ ($i, j \in N$).
Nodal efficiency is one of the most common measures of integration and can be considered as the average inverse shortest path length [41]. The weighted nodal efficiency is defined as
$$E_i^w = \frac{1}{n-1} \sum_{j \in N,\, j \neq i} \left(d_{ij}^w\right)^{-1},$$
where $n$ is the number of nodes and $d_{ij}^w$ is the shortest weighted path length between nodes $i$ and $j$. Betweenness centrality based on the number of shortest paths between nodes is a relatively sensitive measure of centrality [42]. The weighted betweenness centrality is defined as
$$b_i^w = \frac{1}{(n-1)(n-2)} \sum_{\substack{h, j \in N \\ h \neq j,\ h \neq i,\ j \neq i}} \frac{\rho_{hj}^w(i)}{\rho_{hj}^w},$$
where $\rho_{hj}^w$ is the number of shortest weighted paths between nodes $h$ and $j$ and $\rho_{hj}^w(i)$ is the number of shortest weighted paths between nodes $h$ and $j$ that pass through node $i$.
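As a rough illustration of the weighted degree and nodal efficiency described above, the following NumPy sketch computes both from a thresholded weight matrix. It is not the GRETNA code; it assumes that connection lengths are the inverse of connection weights (a common convention that the text does not state explicitly) and uses a Floyd-Warshall pass for the shortest paths. Betweenness centrality is left out to keep the sketch short.

```python
import numpy as np

def weighted_degree(w):
    """Sum of the connection weights attached to each node."""
    return np.asarray(w, dtype=float).sum(axis=1)

def shortest_path_lengths(w):
    """All-pairs shortest weighted path lengths (Floyd-Warshall).

    Assumption: the length of an edge is 1 / weight; absent edges have infinite length.
    """
    w = np.asarray(w, dtype=float)
    with np.errstate(divide="ignore"):
        d = np.where(w > 0, 1.0 / w, np.inf)
    np.fill_diagonal(d, 0.0)
    for k in range(w.shape[0]):
        d = np.minimum(d, d[:, [k]] + d[[k], :])   # allow paths through node k
    return d

def nodal_efficiency(w):
    """Average inverse shortest path length from each node to all other nodes."""
    d = shortest_path_lengths(w)
    n = d.shape[0]
    with np.errstate(divide="ignore"):
        inv = 1.0 / d
    np.fill_diagonal(inv, 0.0)                      # exclude self-distances
    return inv.sum(axis=1) / (n - 1)
```

For a three-node chain with weights 1 and 0.5, the end nodes are 1 + 2 = 3 length units apart, so the first node has efficiency (1/1 + 1/3) / 2 ≈ 0.667. Disconnected pairs contribute zero because their inverse distance is zero.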
For each given threshold of network sparsity, 68 node-based metrics were extracted from the segmented ROIs of each participant, and the network metrics over the 36 thresholds were used to construct the feature vectors for each metric and each subject. All thresholds were evaluated, and the threshold that provided the highest classification accuracy was selected to construct the prediction model.

Support Vector Machine Classification.
An SVM is a classifier that can automatically learn (supervised learning) and produce a decision hyperplane to classify new examples [43,44]. In this study, a radial basis function kernel SVM (RBF-SVM) was used to distinguish between the groups (left TLE versus healthy controls, right TLE versus healthy controls, and left TLE versus right TLE). In contrast to a linear SVM, an RBF-SVM can provide a nonlinear decision boundary by using the kernel trick to map the input space into a higher-dimensional feature space [45]. The RBF kernel is commonly written as $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)$, where $\sigma^2$ is the variance of the Gaussian kernel [5,46]. Principal component analysis (PCA) was used to further reduce the resulting feature space [47]. SVM computing was performed using the LIBSVM toolbox, available at https://www.csie.ntu.edu.tw/~cjlin/libsvm [48].
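The RBF kernel can be made concrete with a short NumPy sketch. The function name is illustrative; the only library-specific remark is that LIBSVM parameterizes the same kernel as exp(-gamma * ||u - v||^2), so gamma corresponds to 1 / (2 * sigma^2).

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of `a` and the rows of `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    sq = np.maximum(sq, 0.0)                 # clip tiny negatives from round-off
    return np.exp(-sq / (2.0 * sigma ** 2))
```

Each entry lies in (0, 1] and equals 1 only for identical feature vectors, so the kernel acts as a similarity score that decays with distance at a rate set by sigma.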
We used leave-one-subject-out cross-validation in this study. In each step, one subject from class 1 and one subject from class 2 form the test set, and all the remaining data are treated as the training set. For example, in the case of left TLE versus right TLE, the left and right TLE groups each contain 15 subjects. In the cross-validation, one subject from the left TLE group forms the test set with any one subject from the right TLE group, exhaustively yielding 15 × 15 = 225 combinations corresponding to a series of binary classifications. Mixing one subject's data in both the training and test sets would give the algorithm prior knowledge and cause spuriously high accuracy. Hence, leave-one-subject-out cross-validation is a fair evaluation scheme that reveals the robustness of the classifier to individual differences. In other words, this is an out-of-sample strategy to ensure the generalizability of the established classifier to out-of-sample individual subjects [49][50][51].
A confusion matrix was used to evaluate the performance of the algorithm. The lateralization problem was decomposed into three binary classifications: right TLE versus healthy control, left TLE versus healthy control, and, most importantly, left TLE versus right TLE. For each task, samples in the first class were considered "positive" while samples in the second class were considered "negative." Therefore, for each test sample, a binary classifier has four possible outcomes: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). The flowchart of the RBF-SVM classification is shown in Figure 2.
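The pairing scheme described above can be enumerated with a small generator. This is a sketch with hypothetical subject identifiers; in practice every segment belonging to a held-out subject would be excluded from training.

```python
from itertools import product

def leave_one_pair_out(subjects_a, subjects_b):
    """Yield (train_subjects, test_subjects) for every cross-class pair of test subjects."""
    for test_a, test_b in product(subjects_a, subjects_b):
        train = [s for s in subjects_a if s != test_a]
        train += [s for s in subjects_b if s != test_b]
        yield train, [test_a, test_b]

# Hypothetical identifiers for two groups of 15 subjects each.
left_tle = [f"L{i:02d}" for i in range(1, 16)]
right_tle = [f"R{i:02d}" for i in range(1, 16)]
folds = list(leave_one_pair_out(left_tle, right_tle))
```

Splitting by subject identifier rather than by segment is what keeps segments of the same person from appearing on both sides of a fold.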
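The four outcomes translate into summary measures as in the sketch below. Sensitivity, specificity, and accuracy are named later in the text; including the positive and negative predictive values as the remaining measures is an assumption made here for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary task with the given positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def _ratio(num, den):
    # Undefined ratios (empty denominator) are reported as NaN.
    return num / den if den else float("nan")

def performance(tp, fp, tn, fn):
    """Summary measures derived from the confusion matrix."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),   # assumed measure, not named in the text
        "npv": _ratio(tn, tn + fn),   # assumed measure, not named in the text
    }
```

In the left TLE versus right TLE task, for example, left TLE would be the positive class, so sensitivity is the fraction of left TLE samples that are labeled correctly.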

Principal Component Analysis (PCA).
PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. It is used here to reduce the dimension of the feature vectors used in classification. A smaller dimension may improve computational efficiency, especially in problems with a large number of samples and high-dimensional feature vectors. In addition, PCA may drop uninformative components of the feature vector, which can improve the performance measures derived from the confusion matrix.
The algorithm traverses the first $n$ ($n$ = 1, ..., 68) of the 68 PCA components, ordered by descending eigenvalue. Only the optimal $n$ components are retained to construct the prediction model.
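A minimal NumPy sketch of this component sweep is given below. It is not the authors' code: the function names are illustrative, `score_fn` is a hypothetical callback that returns the cross-validated accuracy obtained with the first n components, and fitting the PCA on training data only is an assumption about good practice rather than something the text states.

```python
import numpy as np

def pca_fit(x_train):
    """Return the mean, the principal axes (rows), and eigenvalues in descending order."""
    mean = x_train.mean(axis=0)
    _, s, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    eigvals = s ** 2 / (x_train.shape[0] - 1)   # variance explained by each axis
    return mean, vt, eigvals

def pca_transform(x, mean, components, n):
    """Project samples onto the first n principal axes."""
    return (x - mean) @ components[:n].T

def best_n_components(score_fn, max_n):
    """Evaluate n = 1..max_n and return the smallest n with the highest score."""
    scores = [score_fn(n) for n in range(1, max_n + 1)]
    return int(np.argmax(scores)) + 1, scores
```

Because `np.argmax` returns the first maximum, ties are resolved in favor of the smaller number of components, which matches the goal of reducing dimension without losing accuracy.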

Results
Although SVM is a powerful tool for classification, the kernel-based algorithm is usually very time-consuming. Before using SVM, it is therefore valuable to inspect whether the network metrics can separate the groups without the assistance of SVM. Group differences in the network metrics of the 68 brain regions, calculated at the optimal threshold, were tested with two-tailed t-tests for independent samples. The brain regions with between-group differences (p < 0.05, uncorrected) are displayed in Table 1. False discovery rate (FDR) correction [52] at the 0.05 level was used to control for multiple comparisons. None of the significant differences survived the FDR correction. This result suggests the need for mapping the feature vectors into a higher-dimensional space, making it possible to distinguish the MEG data with a powerful classifier such as RBF-SVM.
The RBF-SVM was utilized to build a nonlinear model that predicted TLE laterality based on the brain-network features using a leave-one-subject-out cross-validation strategy. The performance of the RBF-SVM classification was estimated by calculating the confusion matrix values as defined above. The classification performance between groups is shown in Table 2. For nodal degree, the RBF-SVM classifiers achieved the highest classification accuracy for left TLE versus healthy controls (80.76%). For nodal efficiency, the resulting classifiers reached an acceptable level of accuracy both for left TLE versus healthy controls (77.38%) and for right TLE versus healthy controls (73.81%). Betweenness centrality provided the lowest accuracy for left TLE versus healthy controls and right TLE versus healthy controls. However, for left TLE versus right TLE, betweenness centrality achieved the highest accuracy (88.10%), making it the best feature for lateralization of TLE in the patient groups.
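The multiple-comparison step can be illustrated with a short NumPy sketch. It assumes that the FDR procedure of [52] is the Benjamini-Hochberg step-up rule, which the text does not state; the function name is illustrative.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg procedure."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Largest rank k with p_(k) <= q * k / m; all smaller ranks are rejected too.
    below = ranked <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject
```

With 68 regional tests, the smallest p-value has to fall below 0.05 / 68 ≈ 0.0007 unless several other p-values are also small, which shows how regions that pass the uncorrected 0.05 level can all fail to survive correction.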

Discussion
The feature vectors were constructed from the network measures based on brain regions in the source space, rather than the sensor nodes over the scalp. Each brain region with an anatomical or functional label from a given atlas has its own size and shape, whereas the sensor nodes are equal-sized planes, which are approximately equidistributed over the scalp [18]. Although TLE can result in pathophysiological changes in some brain regions, changes might not be evenly spaced in the cerebral cortex. On the other hand, network analysis in source space using MEG recordings has been increasingly applied to neuroscience and neurological disorders [53]. Therefore, we selected the brain regions as the basic unit (network node) to calculate the network measures, which were taken as feature vectors for the SVM.
The metrics used to measure the connectivity between each pair of nodes within networks should also be carefully selected. Compared with other volume conduction insensitive measures, such as the phase lag index (PLI) [54] and the weighted phase lag index (WPLI) [55], IC is not the newest and is thought to systematically underestimate the true coupling strength because it normalizes the imaginary component via signal amplitudes [54]. However, IC is still a widely used technique for revealing altered FC and has been applied in cases of brain tumors [56], brain lesions [57], and other disorders. In addition, in source-level network analyses, the IC has been suggested to effectively represent the true coupling of two brain regions with a nonzero time delay using MEG data [31,58]. Thus, we used the IC method to calculate the FC matrix for each subject in the source space.
In the current study, we took advantage of the nodal properties of the brain-network to classify unilateral TLE patients and healthy controls through RBF-SVMs that were trained and tested using the leave-one-subject-out cross-validation method. The classification accuracies were 73.81% to 80.76% for left TLE versus healthy controls, 65.48% to 75.00% for right TLE versus controls, and 73.81% to 88.10% for left TLE versus right TLE. Among the different classification tests, the best classification accuracy (88.10%) was obtained using betweenness centrality in left TLE versus right TLE. The current results reveal that not only nodal degree but also betweenness centrality and nodal efficiency achieve similar classification accuracies for unilateral TLE versus healthy controls. Interestingly, betweenness centrality performed better than nodal degree and nodal efficiency in our study when classifying left TLE and right TLE. To visually represent the results, we used a radar chart (Figure 3) to compare the relative confusion matrix values for the three features. In the chart, a pentagon represents the performance of each feature on five performance measures derived from the confusion matrix defined in the section above; for each feature, the five performance measures are represented by the five vertices of its pentagon in Figure 3. Better features occupy a larger area in the radar chart. It is clear that betweenness centrality performed better in all confusion matrix values than nodal degree and nodal efficiency. The maximum difference between betweenness centrality and the other features was 14.29% on sensitivity, specificity, and accuracy. The relatively superior performance of betweenness centrality for left TLE versus right TLE makes it more valuable for clinical use, such as assisting epileptic foci localization.
Generally, a large feature vector dimension may exponentially increase the complexity of the SVM classifier. As such, the feature vector dimension should be as small as possible without compromising the classification accuracy. This is not a significant obstacle in the current research because the dimension is limited to a maximum value of 68 and the sample size of each class is only 45. However, to provide a reference for further study, the issue was examined nevertheless. For brevity, we illustrate an additional experiment on the best feature (betweenness centrality) in the left TLE versus right TLE classification to demonstrate the effect of PCA in our study. Figure 4 indicates the relationship between classification accuracy and the number of PCA components according to the descending order of eigenvalues. The classification accuracy increases with the number of PCA components and plateaus at 55 components (95% percentile) with a classification accuracy of 88.10%, achieving a dimensionality reduction of (68 − 55)/68 = 19.12%. The relationship between the number of PCA components and the classification accuracy maintained a similar pattern in the other classification cases.

The current study has two advantages that should be considered. First, similar to a previous study [10], our study used a graph-based approach in which the nodal properties of the brain-network were input into the SVM. Evidence indicates that unilateral TLE is a network disease rather than a disease with a single focal region [23][24][25][26]. Additionally, graph-theory metrics are able to summarize the network properties with less computational cost than the voxel-based and skeleton-based methods [8,10,59]. Second, for classification of TLE lateralization, MEG has higher temporal resolution than MRI [9,45], fMRI [33], or DTI [8,10,59]. Thus, the method can analyze the effects of different frequency bands, especially the frequency band in which epileptiform discharges are generated. Studies have shown that asymmetric and slow activity of the delta band (1-4 Hz) can reliably lateralize to the epileptogenic hemisphere [14,[60][61][62]. Indeed, significant differences in delta band activity were found between TLE patients and controls in network analyses [31,63,64]. Thus, we selected only one frequency band, the delta band, as our frequency band of interest. Although we propose that delta band-based network metrics are optimal for lateralization, further investigation is warranted to identify and confirm the bands with the best lateralization outcomes.
The large age range of the subjects in the current study should be considered as a potential limitation. Because a relatively small number of suitable patients were available, it was difficult to recruit enough patients within a narrow age range. To enhance the reliability of our results, we plan to conduct future studies of TLE lateralization in a patient sample with a smaller age range.
Clinically, it is more important to improve the classification accuracy for left TLE versus right TLE than for TLE patients versus healthy controls. Our results indicate that nodal degree is better than nodal efficiency and betweenness centrality in classifying left TLE and right TLE versus healthy controls, with accuracies of 80.76% and 75.00%, respectively. However, in the classification of left TLE versus right TLE, betweenness centrality has better classification accuracy (88.10%) than nodal degree and nodal efficiency, which is important for determining lateralization of unilateral TLE in the clinic. Our results can be compared with those from a recent study [10] that also utilized an SVM approach to determine lateralization of the TLE epileptogenic focus. In that study, the input vectors were four graph-theory metrics that were based on DTI signals. Interestingly, and in line with our findings, their results indicated that for left TLE versus healthy control and right TLE versus healthy control, classifiers based on nodal degree had the best classification performance compared with the other three measures. For left TLE versus right TLE, their results showed that betweenness centrality had the highest classification accuracy. Thus, we suggest that, in the clinic, nodal degree is an optimal choice when identifying TLE patients from controls, and betweenness centrality is a better choice when evaluating lateralization of unilateral TLE.

Conclusion
As a relatively new noninvasive clinical recording system, MEG is a powerful tool for epilepsy diagnosis. However, MEG is seldom applied for determining lateralization of unilateral TLE. This study demonstrated that MEG is a good candidate for solving the classification problem of unilateral TLE and that MEG source space brain-network features can be effectively used for determining lateralization. The RBF-SVM method based on betweenness centrality achieved an accuracy of 88.10% for TLE lateralization.

Figure 1: Flowchart showing the brain-network construction of spontaneous MEG based on the Desikan-Killiany Atlas. Numbers indicate the five steps. Step (1): individual MRI slices were input into Freesurfer to create the brain-structure model. In this step, the skull and cortex were recognized and their boundary surfaces were determined. The single sphere-based MEG forward model was employed to calculate MEG lead fields. Step (2): the cortex was partitioned into regions based on the Desikan-Killiany Atlas in preparation. Step (3): filtered MEG recordings were projected using the inverse problem algorithm (sLORETA) on the cortices of individual participants to obtain time series for each voxel. Step (4): the time series were averaged over each atlas region to produce 68 atlas-based time courses. Step (5): the connectivity matrix was constructed using the 68 atlas time courses in preparation for network parameter calculation.

Figure 2: Flowchart showing the network parameter-based determination of TLE lateralization. First, based on the connectivity matrix shown in Figure 1, node-related network parameters (node degree, node efficiency, and node betweenness) were calculated (step (1)). Node degree is considered a basic and important measure of centrality and represents how strongly one node is interacting, structurally and functionally, with other nodes in the network. Node efficiency is one of the most common measures of integration and can be considered as the average inverse shortest path length. Node betweenness based on the number of shortest paths between nodes is a relatively sensitive measure of centrality. The parameters were then incorporated into feature vectors. Here, we conducted dimension reduction using PCA (step (2)). The optimal feature vectors were then input into an SVM for classification (step (3)). Finally, by training the SVM, unilateral TLE could be classified with a high degree of accuracy.

Figure 3: Radar chart of the performance of three features for left TLE versus right TLE. The output of each feature is represented by a pentagon in a specific color. The outermost grey line marks 100% for each of the five values derived from the confusion matrix.

Figure 4: Relationship between classification accuracy and the number of PCA components.

Table 1: Brain regions with significant differences (p < 0.05, two-tailed t-test, uncorrected) in nodal parameters between the groups.

Table 2: Classification performance between groups.