Over the past few decades, network science has introduced several statistical measures to determine the topological structure of large networks. Initially, the focus was on binary networks, where edges are either present or not. Thus, many of the earlier measures can only be applied to binary networks and not to weighted networks. More recently, it has been shown that weighted networks have a rich structure, and several generalized measures have been introduced. We use persistent homology, a recent technique from computational topology, to analyse four weighted collaboration networks. We include the first and second Betti numbers for the first time for this type of analysis. We show that persistent homology corresponds to tangible features of the networks. Furthermore, we use it to distinguish the collaboration networks from similar random networks.
Networks are a useful abstraction for many real-world systems. Some examples are the Internet, communication networks, biological networks, and social networks. Many of these networks are intrinsically weighted [
There are several approaches to analysing weighted networks. One is to suitably generalize measures for binary networks [
In this paper, we take a different approach. Instead of finding the optimal threshold weight, which is often only optimal for a specific property, we study all different levels of resolution at once. To do so we use persistent homology, a recent technique from computational topology. The framework of persistent homology records structural properties and their changes for a whole range of thresholds. There are only a few other papers that use persistent homology to analyse networks [
Here, we include the first and second Betti numbers for the first time. This leads to richer network measures. We show that the first Betti numbers correspond to tangible features of the network and use this richer form of persistent homology to distinguish structured networks from random networks.
In this section we introduce concepts from computational topology in the setting of networks. For a more elaborate introduction to persistent homology we refer to [
Persistent homology computes the topological features of a filtration of a space. A filtration of a space can be thought of as the evolution of a space or a growing sequence of spaces. More formally a filtration of a space
A filtration of a triangle (a). We start with three connected components. The yellow and the green components die in steps two and three, respectively, but the red component persists throughout the filtration. In the fourth step a loop is born, which dies in the final step of the filtration. The zeroth Betti number equals the number of connected components, and the first Betti number equals the number of loops (b). We use a barcode to visualise the births and deaths of these features (c).
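The triangle filtration described in this caption can be reproduced with a short computation. The following Python sketch is not the authors' code. It applies the standard column reduction for persistent homology over Z/2 to a five-step filtration. The vertex names, the step numbers, and the order in which the edges enter are assumptions chosen to match the caption.

```python
from itertools import combinations


def persistence_intervals(filtration):
    """Persistent homology over Z/2 by standard column reduction.

    filtration: list of (step, simplex) pairs, where a simplex is a tuple of
    vertex labels and every face is listed before its cofaces.
    Returns {dimension: [(birth_step, death_step or None), ...]}.
    """
    index = {tuple(sorted(s)): i for i, (_, s) in enumerate(filtration)}

    # Boundary matrix: one column per simplex, stored as a set of row indices.
    columns = []
    for _, simplex in filtration:
        simplex = tuple(sorted(simplex))
        if len(simplex) == 1:
            columns.append(set())
        else:
            faces = combinations(simplex, len(simplex) - 1)
            columns.append({index[f] for f in faces})

    pivot_owner = {}  # lowest row index -> column that owns this pivot
    paired = {}       # index of birth simplex -> index of death simplex
    for j, col in enumerate(columns):
        while col and max(col) in pivot_owner:
            col.symmetric_difference_update(columns[pivot_owner[max(col)]])
        if col:
            pivot_owner[max(col)] = j
            paired[max(col)] = j

    deaths = set(paired.values())
    intervals = {}
    for i, (step, simplex) in enumerate(filtration):
        if i in deaths:
            continue  # this simplex kills a feature, it does not create one
        death = filtration[paired[i]][0] if i in paired else None
        intervals.setdefault(len(simplex) - 1, []).append((step, death))
    return intervals


# Assumed labelling of the triangle filtration from the caption.
triangle = [
    (1, ("red",)), (1, ("yellow",)), (1, ("green",)),
    (2, ("red", "yellow")),
    (3, ("red", "green")),
    (4, ("green", "yellow")),
    (5, ("green", "red", "yellow")),
]
triangle_bars = persistence_intervals(triangle)
```

Here `triangle_bars[0]` holds three intervals, one of which never dies (death recorded as `None`), and `triangle_bars[1]` holds the single loop born in step four and killed in step five.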
Using the inclusion maps
Here, we will restrict our attention to the zero-, one-, and two-dimensional homology of spaces. This will reduce our computations significantly, since we do not need to include parts of our space of dimension higher than two. We will make this statement more precise in the following section.
It is well known that
A weighted graph is a graph
Note that for two different thresholds
Since a graph can be equipped with a topology that turns it into a one-dimensional space, we can directly apply persistent homology to a graph filtration. We will then obtain nontrivial Betti numbers in dimensions zero and one only.
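For a graph viewed as a one-dimensional space, both Betti numbers follow from counting: the zeroth is the number of connected components, and the first is the number of edges minus the number of nodes plus the number of components. The sketch below evaluates both at a single threshold. It assumes that all nodes are present at every threshold and that an edge is included once its weight is at least the threshold; the function name and input format are illustrative only.

```python
def graph_betti_numbers(nodes, weighted_edges, threshold):
    """Return (beta_0, beta_1) of the graph that keeps edges with weight >= threshold.

    nodes: iterable of hashable node labels (all present at every threshold).
    weighted_edges: iterable of (u, v, weight) with no repeated edges or self-loops.
    """
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    n_edges = 0
    for u, v, weight in weighted_edges:
        if weight >= threshold:
            n_edges += 1
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v
                components -= 1

    # Euler characteristic of a graph: V - E = beta_0 - beta_1.
    return components, n_edges - len(parent) + components


example_nodes = ["a", "b", "c", "d"]
example_edges = [("a", "b", 1.0), ("b", "c", 0.5), ("a", "c", 0.25)]
betti_curve = [graph_betti_numbers(example_nodes, example_edges, t)
               for t in (1.0, 0.5, 0.25)]
```

Evaluating the function over a decreasing list of thresholds, as in `betti_curve`, gives the Betti numbers of the graph filtration step by step.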
We can encode more of the topological information of the graph into a higher-dimensional space, a simplicial complex. There are many different ways to construct a filtration of simplicial complexes from a graph filtration. A common choice is the clique complex since it reduces computational effort [
We obtain the clique complex of a graph by “filling in” all cliques, that is, all complete subgraphs. A 3-clique will turn into a filled triangle and a 4-clique into a solid tetrahedron, and similarly for higher-dimensional cliques. A nice property of the clique complex is that cliques correspond to highly connected groups of nodes that may represent communities [
A vertex is also known as a 0-simplex, an edge as a 1-simplex, a triangle as a 2-simplex, and a tetrahedron as a 3-simplex. A face of a simplex
Let
From the definition of homology we know that the
We have applied persistent homology to four collaboration networks of scientists [
Through this construction we obtain a network that has a very different weight distribution from a more traditional social network as described by Granovetter [
Instead, in these collaboration networks, weak ties are necessarily part of communities. In fact, the weaker the tie, the larger the community that it is part of. For example, let two scientists be connected by a weak tie with weight 0.125. This implies that they have coauthored a paper with at least seven other authors (they could also both have appeared on, e.g., two papers with 15 other authors each). Let us for simplicity assume the former. This paper with nine authors corresponds to a 9-clique in our network. All edges in this clique have weight greater than or equal to 0.125. If we inspect edges with weight lower than 0.125, we find even more coauthors and larger cliques.
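The arithmetic in this example can be checked directly. The sketch below assumes that each paper with n authors contributes 1/(n - 1) to the weight of every pair of its authors, which is consistent with the value 0.125 for a nine-author paper; the function name and the author labels are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def collaboration_weights(papers):
    """Map each coauthor pair to its total weight.

    papers: iterable of author lists. Assumed rule: a paper with n authors
    adds 1 / (n - 1) to every pair of its authors. Single-author papers
    create no edges.
    """
    weights = {}
    for authors in papers:
        authors = sorted(set(authors))
        if len(authors) < 2:
            continue
        share = Fraction(1, len(authors) - 1)
        for pair in combinations(authors, 2):
            weights[pair] = weights.get(pair, 0) + share
    return weights


# One paper with nine authors: a 9-clique whose edges all have weight 1/8.
nine_authors = ["author%d" % i for i in range(9)]
nine_author_weights = collaboration_weights([nine_authors])

# Two papers by the same seventeen authors: every pair again has weight 1/8.
seventeen_authors = ["author%d" % i for i in range(17)]
two_paper_weights = collaboration_weights([seventeen_authors, seventeen_authors])
```

Under this assumed rule, the nine-author paper yields 36 edges of weight 1/8 = 0.125, and two papers shared by the same seventeen authors give each pair the same weight through 2 × 1/16.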
We will use the network scientists data to explore
We will first discuss the zeroth Betti numbers of the clique complex filtration. As discussed in the previous section, we may restrict to the 1-skeleton of the complex for this computation, that is, the graph itself. We start our filtration with
As we lower
The network is not connected while
Collaboration networks.
Network  No. of nodes  No. of edges  …  No. of edges = …  No. of edges …
Network science  379  914  0.143  47  10 
Condensed matter  36458  171735  0.034  315  0 
High-energy theory  5835  13815  0.056  171  0 
Astrophysics  14845  119652  0.018  357  0 
Largest connected component of the network science collaboration network. The enlarged nodes are the nodes that join the largest connected component at the lowest filtration value. Their colours correspond to the component they belonged to before this filtration value is reached.
On the left (a) we plot the zeroth Betti number against the threshold
We tested whether the zeroth Betti numbers could distinguish this collaboration network from random Erdős-Rényi graphs with the same number of nodes and edges and with the same weights assigned to the edges. We generated 1000 random graphs and used the bottleneck distance [
Next we inspect the first Betti numbers of the clique complex associated to our network. To do so we built the 2-skeleton, which includes all vertices, edges, and triangles. Note that a filled triangle is added whenever three scientists are pairwise connected. As mentioned in the previous section, without filling in these triangles, each triple of pairwise collaborating scientists would be a loop and increase the first Betti number by one. However, we are interested in the loops in the network on a larger scale. In Figure
We only show the central part, see Figure
We investigated if the first Betti numbers give us further power to distinguish between the collaboration network and the random networks. We found that for random networks we obtain much higher first Betti numbers. For 1000 randomly generated networks we found an average of 520.65 (s.d. 4.39) intervals, while our structured network only has 9 intervals. The reason that this number is so much higher for random networks is that there is less clustering and thus fewer triangles that are filled in and more loops with more than three edges.
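The effect of filling in triangles on the first Betti number can be illustrated on small graphs. The following sketch is not the javaPlex computation used for the results above. It computes the first Betti number of the 2-skeleton of the clique complex over Z/2 as the cycle rank of the graph minus the rank of the triangle boundary matrix. The helper `random_graph`, which draws a graph with a prescribed number of nodes and edges, is a hypothetical stand-in for the random model.

```python
import random
from itertools import combinations


def first_betti_clique(nodes, edges):
    """First Betti number (over Z/2) of the clique complex of a simple graph.

    nodes: iterable of sortable labels; edges: iterable of (u, v) pairs.
    """
    nodes = sorted(nodes)
    edge_list = sorted({tuple(sorted(e)) for e in edges})
    edge_index = {e: i for i, e in enumerate(edge_list)}
    neighbours = {v: set() for v in nodes}
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(nodes)
    for u, v in edge_list:
        neighbours[u].add(v)
        neighbours[v].add(u)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1

    # Rank of the triangle boundary matrix; each column is a bitmask of edges.
    pivots = {}
    for u, v in edge_list:
        for w in neighbours[u] & neighbours[v]:
            if w > v:  # visit each triangle u < v < w exactly once
                col = ((1 << edge_index[(u, v)]) | (1 << edge_index[(u, w)])
                       | (1 << edge_index[(v, w)]))
                while col:
                    top = col.bit_length() - 1
                    if top not in pivots:
                        pivots[top] = col
                        break
                    col ^= pivots[top]

    cycle_rank = len(edge_list) - len(nodes) + components
    return cycle_rank - len(pivots)


def random_graph(nodes, n_edges, seed=None):
    """Hypothetical helper: choose n_edges distinct node pairs uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(sorted(nodes), 2)), n_edges)


filled_triangle = first_betti_clique([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
square_loop = first_betti_clique([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (0, 3)])
```

A triangle of pairwise collaborators contributes no loop once it is filled in, whereas a cycle of four scientists without diagonals contributes one. Applying `first_betti_clique` to graphs drawn with `random_graph` is one way to explore how low clustering leaves many such longer loops unfilled.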
Using the first Betti numbers, it suffices to compare only the final networks to distinguish between random and structured networks. We hope that the persistent homology of the whole filtration will be able to detect more subtle structural differences and so distinguish networks that are more similar in structure. Notice that all of the loops that were born persisted to the end of the filtration. It would have been possible for a loop to die. For instance, if the four scientists (A. Vazquez, A. Vespignani, A. Barrat, and M. Weigt) appearing in the red loop found at
For this network all higher Betti numbers are trivial.
In this section we analyse three larger collaboration networks. Again we restrict our attention to the largest connected component of each network. In Table
We investigated if we can distinguish these collaboration networks from random networks using the persistence barcodes. We noticed that all three networks have several intervals corresponding to second Betti numbers.
Let
Collaboration networks.
Network  No. of intervals …  No. of intervals …  …  …
Condensed matter  11361  274  0.00026  −0.79 
High-energy theory  1389  2  0.00081  −0.82 
Astrophysics  4879  222  0.0011  −0.71 
A filtration of a random network corresponds to increasing
We used Gephi [
We wrote code in Java that imports a weighted edge list and converts it to a graph filtration. Subsequently we used javaPlex to build the clique complex filtration and compute the persistence intervals. Computing the persistence intervals is the bottleneck. This took longest for the astrophysics network: 267 s (on a MacBook Pro with a 2.4 GHz Intel Core 2 Duo and 4 GB RAM), presumably since it is the densest network. For our current purposes these computation times are sufficient; however, if we want to apply the same computations to larger networks we need faster algorithms. This should be possible as described in Chapter 12 of [
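The first step of this pipeline, turning a weighted edge list into a graph filtration, can be sketched as follows. This is a Python illustration, not the Java code described above. The input format (one whitespace-separated `node node weight` triple per line, with `#` marking comments) and the ordering from the highest weight downwards are assumptions.

```python
def graph_filtration(lines):
    """Group the edges of a weighted edge list into filtration steps.

    lines: iterable of strings, each "u v weight" (assumed format).
    Returns a list of (weight, edges_added_at_this_weight), ordered from the
    highest weight to the lowest, so each step adds the edges of one weight.
    """
    edges_by_weight = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        u, v, weight = line.split()
        edges_by_weight.setdefault(float(weight), []).append((u, v))
    return [(w, edges_by_weight[w]) for w in sorted(edges_by_weight, reverse=True)]


example_lines = [
    "# scientist scientist weight",
    "a b 1.0",
    "b c 0.5",
    "a c 0.5",
    "",
]
example_filtration = graph_filtration(example_lines)
```

Each entry of `example_filtration` lists the edges that enter the graph when the threshold reaches that weight; the cumulative union of these steps gives the nested sequence of graphs.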
By applying persistent homology to four collaboration networks of scientists we have shown that it gives us interesting information about the structure of weighted networks. We found that due to the construction of collaboration networks, weak ties form cliques and strong ties act as local bridges between those cliques. This is contrary to what has been described in other social networks. We would like to investigate this in greater detail in future work.
We used persistent homology to analyse the structure of weighted networks. The inclusion of the first and second Betti numbers gave us a richer measure to work with than in the existing literature. We were able to use persistent homology to distinguish these collaboration networks from random networks. Using the one- and two-dimensional Betti numbers of the network, we did not need to take the weights into account. We hope that using the weights will give us the ability to distinguish networks that are more similar in structure. This is left as future work.
Research by both authors was partially supported by the Australian Department of Defence under Research Agreement 4500743680.