EEG Resting-State Brain Topological Reorganization as a Function of Age

Resting state connectivity has been increasingly studied to investigate the effects of aging on the brain. A reduced organization in the communication between brain areas was demonstrated by combining a variety of different imaging technologies (fMRI, EEG, and MEG) and graph theory. In this paper, we propose a methodology to get new insights into resting state connectivity and its variations with age, by combining advanced techniques of effective connectivity estimation, graph theoretical approach, and classification by SVM method. We analyzed high density EEG signals recorded at rest from 71 healthy subjects (age: 20–63 years). Weighted and directed connectivity was computed by means of Partial Directed Coherence based on a General Linear Kalman filter approach. To keep the information collected by the estimator, weighted and directed graph indices were extracted from the resulting networks. A relation between brain network properties and age of the subject was found, indicating a tendency of the network to randomly organize increasing with age. This result is also confirmed dividing the whole population into two subgroups according to the age (young and middle-aged adults): significant differences exist in terms of network organization measures. Classification of the subjects by means of such indices returns an accuracy greater than 80%.


Introduction
In the last years, several anatomical, functional neuroimaging, and electrophysiological studies have investigated the resting-state connectivity in order to understand the effects of aging on the human brain. Functional connectivity studies have suggested that aging is related to decreased connectivity patterns in the Default Mode Network (DMN; [1][2][3][4]) but influences also attention networks and sensory-motor systems [5][6][7][8]. Also, in EEG field, this phenomenon has been investigated confirming fMRI results [9][10][11][12][13][14]. Different approaches based on graph theory applied to brain functional networks have been used to describe the effects of aging on cerebral processes. In particular, the resting-state network organization is suggested to tend to a more random configuration by a reduction of overall connectivity (decreased clustering and increased path length): this phenomenon was demonstrated in [10,11] estimating brain connectivity by means of Synchronization Likelihood and in [14] by means of lagged-phase synchronization. In fact, among indices describing the properties of a brain network, characteristic path length (overall integration), clustering (local segregation), and small-worldness, defined as ratio between them, represent some key topological metrics [15] describing the organization of information flows in a network. On the basis of such metrics, networks can be described as regular, smallworld, or random. Several anatomical, functional neuroimaging, and electrophysiological studies demonstrated that brain networks inferred from healthy individuals are characterized by an optimal small-world organization [16,17].
Furthermore, a description of the effective resting-state networks during three different ages (children, mid-aged, and 2 Computational Intelligence and Neuroscience elderly) was proposed by Zhu et al. [12]: in this study, effective connectivity was estimated starting from EEG signals, but the aging effects in the effective resting-state networks were investigated only in terms of hemispheres asymmetry.
The aim of the present study was to exploit a combination of advanced methods for the estimation of cortical connectivity, the use of directed and weighted graph theoretical indices, and a classification approach to investigate differences in resting-state networks related to age.
Partial Directed Coherence (PDC, [18]) is a powerful estimator of effective connectivity: it returns information about the existence, the intensity, and the direction of the causal relations between EEG signals. In order to exploit all the information returned by the PDC, in this study, we computed the directed and weighted version of state-ofthe-art graph theory indices. We extracted salient indices synthesizing the architecture of the connectivity networks elicited during the rest condition in a group of 71 healthy adults, of age ranging from 20 to 63 years. A first investigation was conducted by mean of correlation study in order to find the topological properties that are related to age. A second step focused on the characterization of two subgroups (young and middle-aged adults) and aimed to identify indices showing statistical differences between the two groups, to be used as features for classifying resting-state patterns in relation to the age of the subject.

Experimental
Design. 71 healthy subjects took part in the study (age ranging from 20 to 63 years). EEG signals were collected from 61 positions, assembled on an electrode cap (according to an extension of the 10-20 International System) during one minute of eyes-closed resting-state condition. We avoided scalp positioned reference electrodes, to minimize the effect on the signal phases. Subjects were instructed to relax and not to think of anything. After band-pass filtering (1-45 Hz + 50 Hz Notch filter), muscular and environmental artifacts were removed using a semiautomatic procedure, based on the definition of a voltage threshold (±80 V), and Independent Component Analysis [19] was applied to remove ocular artifacts related to involuntary twitch during eyesclosed condition. Particular attention was put in avoiding any preprocessing procedure that may induce changes in the phase relations between channels. EEG traces were then segmented in 1 s-epochs and the brain connectivity was estimated by means of PDC: the obtained patterns were averaged over frequency range of 1-30 Hz. The frequency band was selected in accordance with previous studies [12,20].

Multivariate Autoregressive Modeling of Brain Signals.
Suppose that the following multivariate autoregressive (MVAR) model is an adequate description of the dataset : where ( ) is the data vector in time, ( ) is a vector of multivariate zero-mean uncorrelated white noise processes, Λ( ) is the matrix of model coefficients at lag , and is the model order that can be chosen by means of the Akaike Information Criteria (AIC) for MVAR processes [21]. For the dataset used in this study, the resulting optimal model order was around 20 for each subject. We checked that the amount of data points was at least an order of magnitude higher than the number of parameters to be estimated for the model. In order to investigate the spectral properties of the examined process, (1) is transformed to the frequency domain: where Δ is the temporal interval between two samples. An adaptive formulation for MVAR model (AMVAR) is used in the study [22]. The time dependent parameter matrices can be estimated by means of GLKF method (described in the following).

Partial Directed
Coherence. The PDC [18] is a full multivariate spectral measure, used to determine the directed influences between any given pair of signals in a multivariate dataset. This estimator was demonstrated to be a frequency version of the concept of Granger causality [23]. In this study, we adopted the following formulation of PDC: where Λ( ) is the matrix containing the coefficients of associated MVAR model and Due to the normalization reported in (4), PDC values are in the interval [0 1] . With respect to the original definition of PDC [18], according to which the estimator is normalized according to the amount of connectivity emitted from the "source" channel, here, we normalized the estimator according to the total connectivity incoming to the "target" channel (row-wise normalization instead of column-wise one), to avoid the fact that PDC from given electrode is decreased when multiple signals are emitted from it. The formulation of the estimator proposed in [18] is based on the hypothesis of stationarity of signals included in the estimation process. Unfortunately, such hypothesis leads to a complete loss of the information about the temporal evolution of estimated information flows. For overcoming this limitation, a time-varying adaptation of PDC was introduced. The adaptation consisted of modifying the original formulation of PDC by including dependence from the time in the MVAR coefficients. Thus, the adaptive PDC estimator can be defined as follows: Computational Intelligence and Neuroscience 3 where refers to a dependence of the MVAR coefficients from time and Λ ( , ) represents the entry of the matrix of MVAR model coefficients Λ at frequency and time .
In this study, we used the squared formulation of PDC due to its higher accuracy and stability [24].

General Linear Kalman
Filter. The coefficients of the AMVAR model were estimated by means of General Linear Kalman Filter (GLKF). In the GLKF, an adaptation of the Kalman Filter to the case of multitrial time series is provided [25]. In particular, the equations at the basis of the algorithm are where represents the observation, is the state process, and are the transition matrices, and and are the additive noises. To obtain the connection with the time-varying MVAR, it is necessary to make the following associations: where denotes the number of trials, whereas is the dimension of the measured process. The details of the algorithm are provided in [25]. The quality of estimation is related to the definition of two adaptation parameters, 1 and 2, which regulate the compromise between the quality of estimation and the speed of adaptation to transitions. This algorithm was developed for time-varying connectivity estimation but it can be also applied in the stationary case, to provide high stability and accuracy with multichannel data [25]. In this case, the appropriate choice of the adaptation constants allows for strengthening the estimator accuracy ( 1 = 2 = 0.001).

Support Vector Machine (SVM). The Support Vector
Machine (SVM) was first proposed by Vapnik and has since attracted a high degree of interest in the machine learning research community [26]. SVMs are supervised learning models used for classification and regression analysis. In order to perform a binary classification (two separate classes), this method needs training data, marked as belonging to one of two categories, for introducing a separating hyperplane: this hyperplane must maximize the margin between the two classes and it is known as the optimum separating hyperplane. The details of the method are provided in [27].

Graph Theory Approach.
A graph is an abstract representation of a network that consists of a set of vertices (or nodes) linked by means of edges (or connections) indicating the presence of some sort of interaction between the vertices. The adjacency matrix is the mathematical representation of a graph. It contains the information about the connectivity structure of the graph: the entry is different from 0 if there is an effective link between nodes and and equal to 0 if no link exists.
Several indices based on the elements of such matrix can be computed for the characterization of the main properties of brain networks [28]. In this study, we used weighted versions of the indices, to take into account strength and direction of the connectivity links returned by PDC.
2.6.1. Node Strength. The strength index represents the total intensity associated with arcs that involve the node , both ingoing ones (in-strength, in ) and outgoing ones (outstrength, out ): where is the intensity of the link from node to node .

Characteristic Path Length.
The characteristic path length is the average shortest path length in the network [29]. Its weighted and directed version can be defined as follows: where is the average distance between node and all other nodes, is the distance between node and node , is a map (i.e., the inverse) from weight to length, and → is the directed shortest path from to (the superscript " " indicates that the equation is associated with the weighted version).

Clustering Coefficient.
The clustering coefficient describes the intensity of interconnections between the neighbors of a node [29]. It is defined as the fraction of triangles around a node or the fraction of node's neighbors that are neighbors of each other. The weighted directed version of clustering coefficient is defined as follows: where represents the number of triangles involving node , in and out are the number of incoming and outcoming edges of nodes , respectively, and is the entry of connectivity matrix (i.e., weight of connection from node to node ).

4
Computational Intelligence and Neuroscience 2.6.4. Global Efficiency. The global efficiency is the average of the inverse of the geodesic length (shortest path between two nodes in the network) and represents the efficiency of the communication between all the nodes in the network [30]. The weighted and directed formulation is shown in the following equation: where is the weighted distance between node and node .

Local Efficiency.
The local efficiency is the average of the global efficiencies computed on each subgraph belonging to the network and represents the efficiency of the communication between all the nodes around the node in the network [30]. Its weighted and directed version is defined as follows: where is the entry of connectivity matrix (i.e., weight of connection from node to node ), in and out are the number of incoming and outcoming edges of node , respectively, and is the weighted distance between node and node .
2.6.6. Weight. The weight is the mean value of all connections in the filtered network: where is the entry of weighted connectivity matrix and is the number of filtered connections.

Statistical Analysis.
A weighted directed adjacency matrix was extracted for each subject by applying a threshold able to maintain the 20% of the stronger connections. As a first step of analysis, a correlation study (Spearman correlation) was performed between age and each global index (all the indices described in the previous section except for the local index node strength) for the whole population (71 healthy subjects). Then, two subgroups were selected for classification analysis with respect to age: "young" group (20 subjects with age: 23.8 ± 1.05 years) and "middle-aged" group (20 subjects with age: 46.05 ± 5.27 years). To characterize the restingstate connectivity patterns elicited by the two age groups, a first investigation was performed by means of the local index. The node strength was computed for each subject and a Grand Average was performed for each group in order to identify the brain regions mainly involved in the resting-state network. Furthermore, to quantify the differences related to the network architecture of the two groups of subjects, the global graph indices described above were subjected to the following steps: (1) Statistical comparison (two-sample Student's -test) for a significance level of 5% between indices from young and middle-aged subjects networks. (2) Classification of the extracted features by means of SVM classifier [26] (linear kernel).
Before computing the first step, the hypothesis of normal distribution was verified applying the Kolmogorov-Smirnov one-sample test. -tests were performed for investigating which indices were significantly different between the populations: this step was important to allow for choosing the features to be used in the classification process. For the second step, a Leave One Out approach has been implemented to perform the classification. In particular, graph indices extracted from a single subject were used for classifying him/her in one of the two groups, using the indexes achieved by other randomly chosen 30 subjects (15 young and 15 middle-aged adults) as training data for SVM classifier. Each subject was tested singularly for 50 times and each iteration was characterized by a different combination of the 30 subjects used for the classifier training. During each iteration, the algorithm returned score equal to 1 for right classification and 0 otherwise. The classification performance was obtained for each subject among the iterations; then, total, young, and middle-aged performances were computed by performing the average of the performances among all the subjects belonging to each group, respectively.

Results
All the investigated indices correlate with age: in particular, Spearman's correlation coefficient is negative in all cases except for the path length ( Figure 1) indicating that in the effective resting-state networks the communication (measured by means of efficiencies, path length, and clustering) and the global strength (measured by means of weight) tend to decrease with the age.
After the preliminary investigation about the network properties in relation to the age, we focused on the characterization of effective resting-state networks related to two age subgroups (young and middle-aged). In Figure 2, the Grand Average of the node strength index is shown on a scalp map for each group. Figure 2(a) shows the strength map for the young individuals: the obtained pattern reveals a symmetric behavior with respect to the hemispheres and the parieto-occipital areas are mainly involved in the resting-state network (regions characterized by stronger connections), with weaker sources in frontal region. Instead, Figure 2 areas characterize the pattern. However, this analysis reveals that the strength values obtained for the young group are higher than those obtained for the middle-aged group and that these latter are more balanced for the frontal and parietooccipital areas.
The global indices extracted separately from the two subgroups were subjected to an independent -test for a significance level of 5%. Bar diagrams reported in Figure 3 show mean values of the indices achieved. In accordance with the correlation results, the statistical comparison reveals a significantly higher value of path length and significantly lower values of the other indices in the middle-aged group when compared to the young group.
At last, to validate at the single subject level the obtained results, four different groups of indices were used as features in the classification analysis (Figure 4(b)). Classification performances reported in Figure 4(a) reveal accuracies higher than 82% for all the examined cases.

Discussion
Studying the resting-state activity and connectivity has gained more and more importance in the last years, as a consequence of the definition of the so-called Default Mode Network in functional MRI studies and, more importantly, of its alterations in many different pathologies. However, its neuroelectrical counterpart, though intensively studied, has not provided stable and repeatable results. One of the most difficult aspects of the problem is probably the high variability intrinsic in neuroelectrical signals and their low Signal to Noise Ratio.
In this paper, we described a methodology able to combine advanced techniques for the estimation of stable, repeatable effective connectivity and graph theory for describing and classifying age-related changes in restingstate networks. The method we used for the estimation of brain networks (a modified formulation of Partial Directed Coherence, [18]) is able to return information about the existence, the weight, and the direction of the connection flow between brain regions. In the last years, different formulations of PDC have been proposed [18,[31][32][33] with the purpose to address specific issues related to the original PDC, like the decrease of PDC from a given channel when multiple signals are emitted from that channel, the effect of differences in amplitude (e.g., signals of different nature, like EEG and EMG, EEG, and LFP), and the interpretability of connectivity values. In this paper, as the study focuses on the variations of effective connectivity patterns in relation to the age, a renormalization of the original PDC, performed with respect to all connectivity links incoming to each channel, was applied to signals recorded by means of high density EEG during rest condition. As a general consideration, when dealing with signals with different scales (like EEG and EMG, EEG and LFP, and so on, or signals resulting from recordings on patients) or when being interested in the interpretation of the connectivity values, the use of gPDC or renormalized PDC [31] should be preferred.
The use of a General Linear Kalman Filtering approach [25] allowed for overcoming the limitation of the number of channels simultaneously included in a single AMVAR model, permitting a full multivariate approach even with a high number of nodes (61 EEG channels). The advantage of using a multivariate approach with respect to a pairwise one is significant, in terms of accuracy of the results and reduction of false positives, as shown, for instance, in [34]. Its main limitation is due to the amount of data necessary to provide a correct estimation of a high dimensional model, with a large number of parameters. This usually compels to an a priori selection of regions or channels to be studied, which is more difficult in a resting-state condition with respect to task related data. Our GLKF approach, on the contrary, has Computational Intelligence and Neuroscience : Bar diagrams reporting indices mean value achieved for the two age-related groups. Two-sample -test for a significance level of 5% was performed between the two populations: the symbol ( * ) highlights a significant difference between the two groups.
shown excellent properties of stability and accuracy even with a high number of channels (up to 100) and an amount of data reduced to few minutes of recordings [24,34]. The plethora of information returned by the connectivity estimation and the complexity of the resulting networks requires pointing to a graph theoretical approach to extract and quantify the relevant information about the network organization. An important aspect of this study is that we used the directed and weighted formulation of main graph indices, to exploit the richness of information provided by the connectivity estimation.
All these methodological steps proved to be a valid procedure for the description of age-related changes in human brain at rest. In fact, results of correlation study for the whole group of 71 subjects indicated that the organization of brain networks turns to a more random (less structured) condition with the normal aging. A decrease of connections weight, efficiencies, and clustering and an increase of the characteristic path length denote that "middle-aged" networks are less organized and characterized by lower power with respect to young age.
The selection of two subgroups of subjects in accordance with their "young" or "middle" age has revealed differences in the resting-state EEG pattern. The use of the local node strength index allowed for identifying the brain regions mainly involved in the eyes-closed resting-state network.  Indeed, in both groups, a prevalent role of frontal and parietooccipital areas characterizes the obtained patterns. However, results showed a weakening of connections strength with the aging (in agreement with the correlation between the weight index and age): this behavior is more related to the parietooccipital area. Furthermore, significant differences were found by means of global network indices. This result is important not only in the context of the study of normal ageing, but also in the study of pathological alterations of the brain intrinsic organization.
In fact, such alterations are often expressed in terms of their distance from a physiological conditions as represented by healthy subjects. The use of normative databases of healthy subjects is then desirable and it is important to understand how such normative parameters have to be computed and how homogeneous the population has to be to provide a correct baseline.
Graph indices studied in this work proved to be good descriptors for the architecture of human brain networks. Their use of a combination of such indices as a vector of features for the classification analysis allowed for obtaining high performances: in fact, in all the cases examined, the accuracy is higher than 82%.
Results of this study show that the proposed approach allows for evaluating age-related differences in neuroelectrical resting-state activity and for classifying these differences at single subject level just on the basis of 1-minute recordings of eyes-closed resting condition. As for the physiological conclusion, the results of this study suggest that the transition from young to middle age impacts the topological configuration of resting-state networks towards a decrease of the small-world configuration and to a tendency to the random configuration (decrease of clustering and increase of path length). This is in accordance with the literature [3,10,11,14], adding, at the same time, more details about the nature of such changes (as conveyed by the weighted, directed indices) and a stronger reliability of the data.
Future works will focus on an enlargement of the subjects sample (including the 65-85 years' range) to provide a baseline for pathologies typically involving elderly age. The suggested approach can be easily extended to more specific topological properties of the resting-state network, in order to answer questions related to the involvement of specific circuits or regions of interest in view of the application to specific clinical conditions, thus providing a versatile tool for the study of the brain intrinsic organization starting just from few minutes of noninvasive EEG recordings.