The connectivity of a network contains information about the relationships between nodes, which can denote interactions, associations, or dependencies. We show that this information can be analyzed by measuring the uncertainty (and certainty) contained in paths along nodes and links in a network. Specifically, we derive from first principles a measure known as

Networks provide a powerful syntax for representing a wide range of systems, from the trivially simple to the highly complex [

Here we introduce information-theoretic measures that capture the information contained in the connectivity of a network, which can be used to identify when these networks possess informative higher scales. To do so, we focus on the out-weight vector,

The first is the uncertainty of a node’s outputs, which is the Shannon entropy [

The second property is how weight is distributed across the whole network,

The

The entropy of the distribution of out-weights in the network forms an upper bound of the amount of unique information in the network’s relationships, from which the information lost due to the uncertainty of those relationships is subtracted. Networks with high

Here, we use this measure to develop a general classification of networks (key terms can be found in Supplementary Materials, SM V A). Furthermore, we show how the connectivity and different growth rules of a network have a deep relationship to that network’s

This work expands to networks previous research on using effective information to measure the amount of information in the causal relationships between the mechanisms or states of a system. Originally,

Our current derivation from first principles of an

To expand this framework to networks in general, we relax this intervention requirement by assuming that the elements in

Here we describe how this generalized structural

Figure

Effective information depends on network structure. (a) In Erdős-Rényi (ER) networks, we see the network’s

We report another key relationship between a network’s connectivity and its

The maximum possible

The picture that emerges is that

Comparing determinism and degeneracy. (a) Left column: three example out-weight vectors,

In a maximally deterministic network wherein all nodes have a single output,

These two quantities provide clear explanations for why different networks have the

So far, we have been agnostic as to the origin of the network under analysis. As described previously, to measure the

Since the

As the determinism of a network decreases toward its minimum possible value and its degeneracy increases toward its maximum possible value, the effectiveness of that network tends to 0.0. Conversely, regardless of its size, a network wherein each node has a deterministic output to a unique target has an effectiveness of 1.0.
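These limiting cases can be checked numerically. The sketch below is illustrative rather than the authors' implementation. It assumes the network is given as a row-stochastic matrix of out-weights, that determinism and degeneracy are each measured relative to log2(N), and that effectiveness is the effective information divided by log2(N); the function and variable names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def effective_information(W):
    """Return (determinism, degeneracy, EI, effectiveness) for a
    row-stochastic out-weight matrix W (a list of N rows of length N).

    Assumed definitions (not reproduced in the text above):
      determinism   = log2(N) - mean entropy of each node's out-weights
      degeneracy    = log2(N) - entropy of the mean out-weight vector
      EI            = determinism - degeneracy
      effectiveness = EI / log2(N)
    """
    n = len(W)
    if n < 2:
        raise ValueError("need at least two nodes to normalize by log2(N)")
    # Average uncertainty in the outputs of individual nodes.
    mean_row_entropy = sum(entropy(row) for row in W) / n
    # Average out-weight vector: how weight is spread over the whole network.
    mean_out = [sum(W[i][j] for i in range(n)) / n for j in range(n)]
    determinism = math.log2(n) - mean_row_entropy
    degeneracy = math.log2(n) - entropy(mean_out)
    ei = determinism - degeneracy
    return determinism, degeneracy, ei, ei / math.log2(n)


# Each node has one deterministic output to a unique target (a directed cycle).
cycle = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
# Every node spreads its weight uniformly: minimal determinism.
complete = [[0.25] * 4 for _ in range(4)]
# Every node points at node 0: deterministic but maximally degenerate.
star = [[1, 0, 0, 0] for _ in range(4)]

for name, W in [("cycle", cycle), ("complete", complete), ("star", star)]:
    det, deg, ei, eff = effective_information(W)
    print(f"{name}: determinism={det:.3f} degeneracy={deg:.3f} "
          f"EI={ei:.3f} effectiveness={eff:.3f}")
```

Under these assumptions the cycle reaches an effectiveness of 1.0, while the uniform and star networks both reach 0.0, the first through a lack of determinism and the second through degeneracy.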

In Figure

Effective information of real networks. Effectiveness, a network’s

Lower effectiveness values correspond to structures that have either high degeneracy (as in right column, Figure

It may be surprising that evolved networks have such low effectiveness. But, as we will show, a low effectiveness can actually indicate that there is informative higher-scale (macroscale) connectivity in the system. That is, a low effectiveness can reflect the fact that biological systems often contain higher-scale structure, which we demonstrate in the following section.
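A toy example makes this point concrete. The sketch below is a minimal illustration, not the procedure used in our analysis. It assumes a simple grouping rule in which the grouped node's out-weights are the uniform average of its members' out-weights and its inputs are the sum of theirs; the helper names `ei` and `coarse_grain` are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def ei(W):
    """Effective information of a row-stochastic matrix W: entropy of the
    average out-weight vector minus the average entropy of each row."""
    n = len(W)
    mean_out = [sum(W[i][j] for i in range(n)) / n for j in range(n)]
    return entropy(mean_out) - sum(entropy(row) for row in W) / n


def coarse_grain(W, group):
    """Replace the nodes in `group` with one node, placed last.

    Assumption for illustration: the new node's out-weights are the uniform
    average of its members' out-weights, and weight sent to any member is
    summed into weight sent to the new node.
    """
    n = len(W)
    keep = [i for i in range(n) if i not in group]
    members = sorted(group)

    def collapse(row):
        # Keep weights to ungrouped nodes; pool weights to grouped nodes.
        return [row[j] for j in keep] + [sum(row[k] for k in members)]

    macro = [collapse(W[i]) for i in keep]
    avg = [sum(W[k][j] for k in members) / len(members) for j in range(n)]
    macro.append(collapse(avg))
    return macro


# Nodes 0-2 wander uniformly among themselves; node 3 only points to itself.
third = 1 / 3
micro = [
    [third, third, third, 0],
    [third, third, third, 0],
    [third, third, third, 0],
    [0, 0, 0, 1],
]
macro = coarse_grain(micro, {0, 1, 2})

ei_micro, ei_macro = ei(micro), ei(macro)
gain = ei_macro - ei_micro
print(f"EI micro = {ei_micro:.4f}, EI macro = {ei_macro:.4f}, gain = {gain:.4f}")
```

In this example the microscale network has low effectiveness because of the uncertainty inside nodes 0 to 2, yet the two-node macroscale is fully deterministic and non-degenerate, so its effective information is higher.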

This new global network measure,

Bringing these issues to network science, we can now ask, what representation will minimize the uncertainty present in a network? We do this by examining

Notably, the phenomenon can be measured by recasting networks at higher scales and observing how the

First, we must introduce how to recast a network,

A macronode

Here, to decide whether or not a macronode is a consistent summary of its underlying subgraph, we formalize consistency as a measure of whether random walkers behave identically on

Specifically, we define the

This consistency measure addresses the extent to which a random dynamical process on the microscale topology will be recapitulated on a dimensionally reduced topology (for how this is applied in our analysis, see Materials & Methods).
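One way to make this concrete follows the formulation given in Materials & Methods, where inconsistency is a Kullback-Leibler divergence between random-walker distributions started from a maximum entropy distribution on the shared nodes. The sketch below is a rough illustration under several assumptions of ours: the divergence is taken from the microscale to the macroscale distribution, it is summed over a fixed number of steps, and all probability mass outside the shared nodes is lumped into a single bin so that both vectors are proper distributions. The function names and the example matrices are hypothetical.

```python
import math


def step(p, W):
    """Advance a distribution of random walkers one step: p' = p W."""
    n = len(W)
    return [sum(p[i] * W[i][j] for i in range(n)) for j in range(n)]


def kl_divergence(p, q):
    """D(p || q) in bits; infinite if q is zero somewhere p is not."""
    total = 0.0
    for x, y in zip(p, q):
        if x > 0:
            if y == 0:
                return math.inf
            total += x * math.log2(x / y)
    return total


def inconsistency(W_micro, W_macro, shared_micro, shared_macro, steps=4):
    """Summed KL divergence between walker distributions on shared nodes.

    shared_micro[k] and shared_macro[k] index the same node in each network.
    """
    # Maximum entropy start: uniform over the shared nodes in both networks.
    p = [0.0] * len(W_micro)
    q = [0.0] * len(W_macro)
    for i, j in zip(shared_micro, shared_macro):
        p[i] = q[j] = 1 / len(shared_micro)

    def restrict(dist, shared):
        # Distribution over shared nodes plus one bin for everything else.
        rest = sum(x for k, x in enumerate(dist) if k not in set(shared))
        return [dist[k] for k in shared] + [rest]

    total = 0.0
    for _ in range(steps):
        p, q = step(p, W_micro), step(q, W_macro)
        total += kl_divergence(restrict(p, shared_micro),
                               restrict(q, shared_macro))
    return total


# Macroscale: micro nodes 0 and 1 grouped into node M (index 2);
# micro nodes 2 and 3 become macro nodes 0 and 1.
W_macro = [[0, 0, 1], [0, 0, 1], [0.5, 0.5, 0]]

# Nodes 0 and 1 have identical out-weights, so averaging loses nothing.
W_micro_a = [[0, 0, 0.5, 0.5], [0, 0, 0.5, 0.5],
             [0.5, 0.5, 0, 0], [0.5, 0.5, 0, 0]]
# Nodes 0 and 1 differ, and all input arrives at node 0.
W_micro_b = [[0, 0, 1, 0], [0, 0, 0, 1],
             [1, 0, 0, 0], [1, 0, 0, 0]]

inc_a = inconsistency(W_micro_a, W_macro, [2, 3], [0, 1])
inc_b = inconsistency(W_micro_b, W_macro, [2, 3], [0, 1])
print(inc_a, inc_b)
```

The first microscale network is reproduced exactly by the macroscale, giving zero inconsistency. The second is not, because the averaged macronode ignores which of its members actually receives the input; this is the kind of case where a higher-order dependency is needed.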

What constitutes a consistent macroscale depends on the connectivity of the subgraph that gets grouped into a macronode, as shown in Figure

Macronodes. (a) The original network,

Different subgraph connectivities require different types of HOMs to consistently represent them. For instance, HOMs can be based on the input weights to the macronode, which take the form

Subgraphs with complex internal dynamics can require a more complex type of HOM in order to preserve the macronode’s consistency. For instance, in cases where subgraphs have a delay between their inputs and outputs, this can be represented by a combination of

We present these types of macronodes not as an exhaustive list of all possible HOMs, but rather as examples of how to construct higher scales in a network by representing subgraphs as nodes and also sometimes using higher-order dependencies to ensure those nodes are consistent. This approach offers a complete generalization of previous work on coarse-grains [

A network has an informative macroscale when a recast network,

Checking all possible groupings is computationally intractable for all but the smallest networks. Therefore, in order to find macronodes which increase the

By generating undirected preferential attachment networks and varying the degree of preferential attachment,

The emergence of scale in preferential attachment networks. (a) By repeatedly simulating networks with different degrees of preferential attachment (

Correspondingly the size of

Our results offer a principled and general approach to such community detection by asking whether there is an informational gain from replacing a subgraph with a single node. Therefore, we can define

The presence and informativeness of macroscales should vary across real networks, depending on connectivity. Here, we investigate the disposition toward causal emergence of real networks across different domains. We draw from the same set of networks that are analyzed in Figure

Propensity for causal emergence in real networks. Growing snowball samples of the two network domains that previously showed the greatest divergence in

That subsets of biological systems show a high disposition toward causal emergence is consistent with, and even explanatory of, many long-standing hypotheses surrounding the existence of noise and degeneracy in biological systems [

We have shown that the information in the relationships between nodes in a network is a function of the uncertainty intrinsic to their connectivity as well as how that uncertainty is distributed. To capture this information, we adapted a measure, effective information

We also illustrated that what has been called “causal emergence” can occur in networks. This is the gain in

The study of higher-order structures in networks is an increasingly rich area of research [

While some [

Networks were chosen to represent the four categories of interest: social, informational, biological, and technological (see SM Figure

Previously we outlined methods for creating consistent macronodes of different types. Here, we explore their implementation, which requires deciding which macroscales are consistent. Inconsistency is measured as the Kullback-Leibler divergence between the expected distribution of random walkers on both the microscale

To measure the inconsistency we use an initial maximum entropy distribution on the shared nodes between

We focus on the shared nodes between

Here, we examine our methods of using higher-order dependencies in order to demonstrate that this creates consistent macronodes. We use 1000 simulated preferential attachment networks, which were chosen as a uniform random sample between parameters

The greedy algorithm used for finding causal emergence in networks is structured as follows: for each node,

All data used in this work were retrieved from the Konect Database [

The opinions expressed in this publication are those of the author(s) and do not necessarily reflect the views of Templeton World Charity Foundation, Inc.

The authors declare no conflicts of interest.

B.K. and E.H. conceived the project. B.K. and E.H. wrote the article. B.K. performed the analyses.

The authors thank Conor Heins, Harrison Hartle, and Alessandro Vespignani for their insights about notation and formalism of effective information. This research was supported by the Allen Discovery Center program through The Paul G. Allen Frontiers Group (12171). This publication was made possible through the support of a grant from Templeton World Charity Foundation, Inc. (TWCFG0273). This work was also supported in part by the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program.

A: table of key terms. B: effective information calculation. C: deriving the effective information of common network structures. D: network motifs as causal relationships. E: table of network data. F: examples of consistent macronode. G: emergent subgraphs.