When evaluating the causal influence of one time series on another in a multivariate data set, it is necessary to take into account the conditioning effect of the other variables. In the presence of many variables and possibly of a limited number of samples, full conditioning can lead to computational and numerical problems. In this paper, we address the problem of partial conditioning on a limited subset of variables, in the framework of information theory. The proposed approach is tested on simulated data sets and on an example of intracranial EEG recording from an epileptic subject. We show that, in many instances, conditioning on a small number of variables, chosen as the most informative ones for the driver node, leads to results very close to those obtained with a fully multivariate analysis, and to even better results when only a small number of samples is available. This is particularly relevant when the pattern of causalities is sparse.

Determining how the brain is connected is a central question in neuroscience. To gain a better understanding of which neurophysiological processes are linked to which brain mechanisms, structural connectivity in the brain can be complemented by the investigation of statistical dependencies between distant brain regions (functional connectivity) or of models aimed at elucidating drive-response relationships (effective connectivity). Advances in imaging techniques guarantee an immediate improvement in our knowledge of structural connectivity. A sustained computational and modelling effort is needed to optimize and adapt functional and effective connectivity analyses to the qualitative and quantitative changes in data and physiological applications. The paths of information flow throughout the brain can shed light on its functionality in health and pathology. Whenever we record brain activity, we can imagine that we are monitoring the activity at the nodes of a network. This activity is dynamical and sometimes chaotic. Dynamical networks [

Granger causality has become the method of choice to determine whether and how two time series exert causal influences on each other [

From the beginning [

Several approaches have been proposed in order to reduce dimensionality in multivariate sets, relying on generalized variance [

In this paper we address the problem of partial conditioning on a limited subset of variables, in the framework of information theory. Intuitively, one may expect that conditioning on a small number of variables should be sufficient to remove indirect interactions if the connectivity pattern is sparse. We will show that this subgroup of variables can be chosen as the set most informative for the driver variable, and we describe applications to simulated examples and to a real data set.

We start by describing the connection between the Granger causality and information-theoretic approaches like the transfer entropy in [

Let us consider
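As a compact reminder of the connection between Granger causality and transfer entropy mentioned above, the following relation holds for jointly Gaussian variables. The symbols are assumptions introduced here for illustration and may differ from the notation of the original text: $x$ is the present value of the target, $X$ its past, $Y$ the past of the candidate driver, $Z$ the past of the conditioning variables, $W$ a generic set of regressors, and $\sigma^{2}(x \mid W)$ the variance of the residual of the linear regression of $x$ on $W$. The Granger causality is taken here as the log-ratio of residual variances, $\delta(Y \to x \mid Z) = \ln\left[\sigma^{2}(x \mid X, Z)/\sigma^{2}(x \mid X, Y, Z)\right]$.

```latex
\begin{align}
H(x \mid W) &= \tfrac{1}{2}\ln\!\left(2\pi e\,\sigma^{2}(x \mid W)\right), \\
T_{Y \to x \mid Z} &= H(x \mid X, Z) - H(x \mid X, Y, Z)
  = \tfrac{1}{2}\ln\frac{\sigma^{2}(x \mid X, Z)}{\sigma^{2}(x \mid X, Y, Z)}
  = \tfrac{1}{2}\,\delta(Y \to x \mid Z).
\end{align}
```

Under the Gaussian assumption, the two quantities therefore differ only by a factor of two, so statements about conditioning sets carry over from one to the other.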

Turning now to the central point of this paper, we address the problem of coping with a large number of variables, when the application of multivariate Granger causality may be questionable or even unfeasible, whilst bivariate causality would also detect indirect causalities. Here, we show that conditioning on a small number of variables, chosen as the most informative for the candidate driver variable, is sufficient to remove indirect interactions for sparse connectivity patterns. Conditioning on a large number of variables requires a large number of samples in order to get reliable results. Reducing the number of conditioning variables would thus provide better results for small data sets. In the general formulation of Granger causality, one has no way to choose this reduced set of variables; in the framework of information theory, on the other hand, it is possible to identify the most informative variables one by one. Once it has been demonstrated [

Concretely, let us consider the causality
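The procedure described above (select, one by one, the variables most informative for the candidate driver, then condition on them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: mutual information is evaluated with the Gaussian formula, the model is linear, the target is excluded from the candidate conditioning set because its own past always enters the regression, and all function names and the toy system `simulate_chain` are hypothetical.

```python
import numpy as np

def lagged(series, order):
    """Past values of each column; row r holds times t-1..t-order for t = order + r."""
    series = np.asarray(series, dtype=float).reshape(len(series), -1)
    n = series.shape[0]
    return np.column_stack([series[order - k:n - k] for k in range(1, order + 1)])

def gaussian_mi(a, b):
    """Mutual information (nats) between column blocks a and b, assuming joint Gaussianity."""
    ca = np.atleast_2d(np.cov(a, rowvar=False))
    cb = np.atleast_2d(np.cov(b, rowvar=False))
    cab = np.cov(np.column_stack([a, b]), rowvar=False)
    logdet = lambda m: np.linalg.slogdet(m)[1]
    return 0.5 * (logdet(ca) + logdet(cb) - logdet(cab))

def residual_variance(target, predictors):
    """Variance of the least-squares residual of target on predictors plus an intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return (target - design @ coef).var()

def select_informative(data, driver, target, n_cond, order=1):
    """Greedily add the variable that most increases the MI with the driver's past."""
    past = {i: lagged(data[:, i], order) for i in range(data.shape[1])}
    candidates = [i for i in range(data.shape[1]) if i not in (driver, target)]
    chosen = []
    for _ in range(min(n_cond, len(candidates))):
        def score(c):
            block = np.column_stack([past[j] for j in chosen + [c]])
            return gaussian_mi(past[driver], block)
        chosen.append(max((c for c in candidates if c not in chosen), key=score))
    return chosen

def partial_granger(data, driver, target, n_cond, order=1):
    """Linear Granger causality driver -> target, conditioned on the selected subset."""
    cond = select_informative(data, driver, target, n_cond, order)
    y = data[order:, target]
    reduced = np.column_stack([lagged(data[:, j], order) for j in [target] + cond])
    full = np.column_stack([reduced, lagged(data[:, driver], order)])
    return np.log(residual_variance(y, reduced) / residual_variance(y, full)), cond

def simulate_chain(n_samples=5000, seed=0):
    """Toy system (hypothetical): 0 -> 1 -> 2, plus independent noise channels 3 and 4."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, 5))
    for t in range(1, n_samples):
        x[t, 0] += 0.8 * x[t - 1, 0]
        x[t, 1] += 0.8 * x[t - 1, 0]
        x[t, 2] += 0.8 * x[t - 1, 1]
    return x

if __name__ == "__main__":
    data = simulate_chain()
    for n_cond in (0, 1):
        gc, cond = partial_granger(data, driver=0, target=2, n_cond=n_cond)
        print(f"n_cond={n_cond}  conditioning on {cond}  causality 0->2 = {gc:.4f}")
```

In this toy chain the influence of node 0 on node 2 is purely indirect. With no conditioning variables the estimated causality is clearly positive; conditioning on the single variable most informative for the driver, which here is the intermediate node, should bring it close to zero.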

Let us consider linear dynamical systems on a lattice of

A directed rooted tree of 16 nodes.

The sensitivity (a) and the specificity (b) are plotted versus
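For reference, sensitivity and specificity of a recovered causality pattern can be computed from the true and the estimated adjacency matrices as sketched below. The sketch assumes the standard definitions (sensitivity: fraction of existing links that are detected; specificity: fraction of absent links correctly reported as absent) and ignores self-loops; the function name and the matrix convention `adj[i, j]` meaning a link from i to j are assumptions made for illustration.

```python
import numpy as np

def sensitivity_specificity(true_adj, est_adj):
    """Return (sensitivity, specificity) over all ordered pairs i != j."""
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    off_diag = ~np.eye(true_adj.shape[0], dtype=bool)  # exclude self-loops
    tp = np.sum(true_adj & est_adj & off_diag)    # existing links that were detected
    fn = np.sum(true_adj & ~est_adj & off_diag)   # existing links that were missed
    tn = np.sum(~true_adj & ~est_adj & off_diag)  # absent links correctly rejected
    fp = np.sum(~true_adj & est_adj & off_diag)   # absent links wrongly detected
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

if __name__ == "__main__":
    # True pattern: chain 0 -> 1 -> 2. Estimate: 0 -> 1 found, 1 -> 2 missed,
    # and a spurious indirect link 0 -> 2 reported.
    truth = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
    estimate = np.array([[0, 1, 1], [0, 0, 0], [0, 0, 0]])
    print(sensitivity_specificity(truth, estimate))
```

A bivariate analysis typically lowers specificity by reporting indirect links such as the spurious one in this example, whereas too few samples for the chosen conditioning set tends to lower sensitivity.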

As another example, we now fix

The directed network of 34 nodes obtained by randomly assigning a direction to the links of the Zachary network.

In Figure

Sensitivity and specificity are plotted versus

The mutual information gain, when the (

We consider now a real data set from an

The causality analysis of the preictal period. The causality

Concerning the preictal period, the sum of all causalities is plotted versus the number of conditioning variables.

In Figure

The sum of outgoing causality from each electrode in the EEG application, ictal period. (a) Bivariate analysis. (b) Our approach with

The sum of outgoing causality from each electrode in the EEG application, preictal period. (a) Bivariate analysis. (b) Our approach with

The causality analysis of the ictal period. The causality

We have addressed the problem of partial conditioning on a limited subset of variables while estimating causal connectivity, as an alternative to full conditioning, which can lead to computational and numerical problems. Analyzing simulated examples and a real data set, we have shown that conditioning on a small number of variables, chosen as the most informative ones for the driver node, leads to results very close to those obtained with a fully multivariate analysis, and to even better results when only a small number of samples is available, especially when the pattern of causalities is sparse. Moreover, looking at how causality changes with the number of conditioning variables provides information about the sparseness of the connectivity.