When reconstructing complex mesoscale network structures, it is generally believed that models encoding the organization of nodes into modules must be employed. The present paper focuses on two block structures that characterize the empirical mesoscale organization of many real-world networks, i.e., the

The analysis of mesoscale network structures is a topic of great interest within the community of network scientists: most attention, however, has been devoted to community detection [

The present work aims at contributing to this stream of research by exploring how effectively models that constrain only local information reproduce complex mesostructures such as the bow-tie and the core-periphery ones. When approaching such a problem it is, in fact, commonly believed that models encoding the organization of nodes into modules must be employed: here we test this hypothesis by comparing models that enforce topological information like the total number of links, the degree sequences, and the reciprocity structure with their block-wise counterparts.

To this aim, we have considered real-world networks whose topological structure is

Remarkably, all models considered in the present paper can be recovered within the same framework, i.e., the entropy-maximization one, which has been proven to be rather effective for approaching both pattern detection and real-world networks reconstruction problems [

As a byproduct, our paper enriches the toolbox for the analysis of bipartite networks. Among the many, available, network representations, the bipartite one has recently received much attention [

This is especially true when considering that bipartite networks emerge quite naturally when studying the aforementioned mesoscale structures. It is, in fact, evident that analysing the way nodes cluster together unavoidably leads to the analysis of the way such modules interact. From an algebraic point of view, this boils down to considering matrices characterized by diagonal square blocks (i.e., the adjacency matrices of the modules themselves) and off-diagonal rectangular blocks (i.e., the adjacency matrices of the bipartite networks encoding their interactions).
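To make the algebraic picture concrete, the following minimal sketch splits the adjacency matrix of a directed network into two diagonal (square) blocks and two off-diagonal (rectangular) blocks, and computes the link density of each. The function names `split_blocks` and `link_density`, as well as the toy network, are illustrative assumptions and are not taken from the paper.

```python
def split_blocks(adjacency, module_a):
    """Split a square adjacency matrix (list of lists) into four blocks.

    module_a is the set of node indices of the first module; all the
    remaining nodes form the second module.
    """
    members = set(module_a)
    a = sorted(members)
    b = [i for i in range(len(adjacency)) if i not in members]

    def sub(rows, cols):
        return [[adjacency[i][j] for j in cols] for i in rows]

    return {
        "within_a": sub(a, a),  # square, diagonal block
        "a_to_b": sub(a, b),    # rectangular, off-diagonal block
        "b_to_a": sub(b, a),    # rectangular, off-diagonal block
        "within_b": sub(b, b),  # square, diagonal block
    }


def link_density(block, is_diagonal):
    """Fraction of realized links; self-loops are excluded on diagonal blocks."""
    rows = len(block)
    cols = len(block[0]) if rows else 0
    possible = rows * (cols - 1) if is_diagonal else rows * cols
    links = sum(sum(row) for row in block)
    return links / possible if possible else 0.0


# Toy directed network (hypothetical): nodes 0-1 form one module, nodes 2-4 the other.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
blocks = split_blocks(toy, {0, 1})
for name, block in blocks.items():
    print(name, link_density(block, is_diagonal=name.startswith("within")))
```

Each off-diagonal block can be read as the biadjacency matrix of a directed bipartite network between the two modules, which is why bipartite benchmarks enter the analysis of block structures.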

Our method will be employed to analyse economic and financial networks empirically characterized by either bow-tie or core-periphery structures: more specifically, we will focus on two systems, the World Trade Web and the Dutch Interbank Network. As we will show, while the former can be described by a partial bow-tie structure, the latter is characterized by the coexistence of a core-periphery-like structure and a proper bow-tie one, the second one carrying a larger amount of information about the system evolution than the first one.

Let us now describe the two systems we have considered for the present analysis.

Let us, first, provide an algebraic representation of the mesoscale structures considered in the present paper, i.e., the bow-tie and the core-periphery ones.

Networks whose topology is empirically characterized by a core-periphery structure can be represented as follows:

Notice that the two matrices

While the definition of core-periphery structure is quite intuitive, the definition of bow-tie structure, on the other hand, is based on the concept of node

SCC: each node in the Strongly Connected Component (SCC) is reachable from any other node belonging to the SCC;

IN: each node in the SCC is reachable from any node belonging to the IN-component;

OUT: each node in the OUT-component is reachable from any node belonging to the SCC.
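The three definitions above translate directly into reachability queries. The sketch below is one possible, deliberately simple implementation: it assumes that the SCC of interest is the largest strongly connected component, it finds that component by intersecting the descendants and the ancestors of each node (adequate for small networks, not optimized for large ones), and it collects any node outside the three sets under a hypothetical label `OTHER`. The function names and the toy edge list are illustrative assumptions.

```python
from collections import deque


def reachable(neighbours, sources):
    """Return all nodes reachable from `sources` (included) via breadth-first search."""
    seen = set(sources)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges):
    """Partition the nodes of a directed edge list into SCC, IN, OUT and OTHER."""
    nodes, forward, backward = set(), {}, {}
    for u, v in edges:
        nodes.update((u, v))
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)

    # The strongly connected component of a node is the intersection of the
    # nodes it can reach and the nodes that can reach it; keep the largest one.
    scc, assigned = set(), set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        component = reachable(forward, [node]) & reachable(backward, [node])
        assigned |= component
        if len(component) > len(scc):
            scc = component

    out_component = reachable(forward, scc) - scc  # reachable from the SCC
    in_component = reachable(backward, scc) - scc  # can reach the SCC
    other = nodes - scc - in_component - out_component
    return {"SCC": scc, "IN": in_component, "OUT": out_component, "OTHER": other}


# Toy example (hypothetical): 0 feeds the cycle 1 -> 2 -> 3 -> 1, which feeds 4;
# node 5 points to 4 but is neither reachable from nor able to reach the cycle.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 1), (3, 4), (5, 4)]
parts = bow_tie(toy_edges)
print(parts)
```

A node cannot belong to both the IN- and the OUT-component: if it could both reach the SCC and be reached from it, it would be part of the SCC itself.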

According to the definitions above, networks whose topology is empirically characterized by a bow-tie structure can be represented by the following adjacency matrix:

Let us now provide a brief description of the set of models that will be implemented to analyse the two kinds of mesoscale structures described above (for a detailed description see Appendix

The first class of null models we consider for the present analysis is the one including the so-called

Interestingly, the

When analysing directed networks, however, a nontrivial piece of information to be taken into account is represented by reciprocity [

Models in both classes are

Although raising the number of parameters to better reproduce empirical patterns is tempting, the risk of overfitting should nevertheless be avoided. A criterion to identify the best model out of a basket of possible ones is, thus, needed. In what follows, we will adopt the Akaike Information Criterion (AIC hereafter)

In order to make (

Specifying the degree sequences further raises the number of parameters: the Directed Configuration Model (DCM) is, in fact, defined by

Accounting also for the information provided by the reciprocity requires a number of parameters to be specified that is

The model selection framework based upon the two information criteria above allows the probability that a given model
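As an illustration of how the two criteria and the associated weights can be computed, the following sketch uses the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, together with weights proportional to exp(-Δ/2), where Δ is the difference from the best score. Taking n to be the number of ordered node pairs is an assumption made here, and the log-likelihoods and parameter counts in the example are invented purely for illustration: they are not results from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_observations):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_observations) - 2 * log_likelihood


def criterion_weights(scores):
    """Turn a dict of criterion values into normalized weights (lower score, higher weight)."""
    best = min(scores.values())
    raw = {name: math.exp(-(value - best) / 2) for name, value in scores.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical fits on a directed network with 100 nodes (9900 ordered pairs):
# model name -> (maximized log-likelihood, number of parameters).
n_pairs = 100 * 99
fits = {"DRG": (-5800.0, 1), "SBM": (-4389.0, 4), "DCM": (-4100.0, 200)}

aic_weights = criterion_weights({m: aic(ll, k) for m, (ll, k) in fits.items()})
bic_weights = criterion_weights({m: bic(ll, k, n_pairs) for m, (ll, k) in fits.items()})
print("AIC weights:", aic_weights)
print("BIC weights:", bic_weights)
```

Because BIC penalizes each parameter by ln n rather than by 2, the two criteria may rank the same set of models differently when the number of parameters varies widely, as it does in these made-up numbers.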

Top panel: the WTW bow-tie structure, composed of the SCC and the IN-component only. The panels below show the countries belonging to the SCC (in colors) and the countries belonging to the IN-component (in gray) in 1993, 1998 and 2002, respectively. Countries belonging to the SCC keep increasing their reciprocated degree (see also Figure

Dynamics of the in-degree (defined as

From a macroeconomic point of view, the increasing number of nodes within the SCC may evidence a sort of ongoing globalization process [

Let us now analyse what kind of topological information is actually needed in order to explain the mesoscale WTW structure. To this aim, let us sum up the observations about the empirical structure of the WTW by imagining a

The need to consider a block model becomes evident when comparing the homogeneous benchmark provided by the DRG with its block-wise counterpart, i.e., the SBM (see Figure

Evolution of the AIC and BIC values for the WTW across the years 1992-2002: while the SBM (blue trend) must be preferred to the traditional DRG (since the network is composed of parts with different link densities), heterogeneous benchmarks are, generally speaking, to be preferred. Although the DCM and the RCM are characterized by very similar AIC values, the AIC and BIC weights always favour the DCM. The ddc-SBM experiences convergence problems throughout the entire temporal period.

Generally speaking, however, benchmarks encoding the degree heterogeneity are to be preferred. Interestingly, both nonblock models outperform block models, indicating that specifying information beyond that encoded into local properties is indeed unnecessary. This is not surprising, however, when considering that the nodes belonging to the IN-component have zero in-degrees. The latter, in fact, are exactly reproduced by both the DCM and the RCM: the “peripheral” part of the network under analysis is, thus, automatically explained by a simpler kind of statistics with no need to invoke any

Let us now compare our degree-informed models over the

The same consideration, together with the observation that the large

On the other hand, comparing the BCM and the DCM on the SCC leads to the conclusion that, as the latter enlarges,

Apparently, thus, two nonblock models compete, i.e., the DCM and the RCM (see Figure

Notably, the DIN is also characterized by a certain degree of bow-tieness, given the presence of an SCC, an IN-component, and, differently from the WTW, also a nonvanishing OUT-component: both the

Evolution of the DIN bow-tie structure (the SCC is shown in gray, the IN-component is shown in blue, and the OUT-component is shown in green). The crisis period (last four points) is signalled by a sharp decrease in the size of the SCC and of the IN-component (and a corresponding increase in the size of the OUT-component). The size of the SCC, however, starts shrinking in 2004Q1 (deviating from the approximately constant trend observed since 1998Q1), seemingly constituting an additional, early-warning signal of the upcoming crisis. On the other hand, the DIN core (shown in orange) does not undergo any significant variation throughout the whole temporal interval.

In order to identify the null model encoding the right amount of topological information to explain the DIN bow-tie structure, let us notice that its SCC can be imagined as a

Evolution of the AIC and BIC values for the DIN across the quarters 1998Q1-2008Q4: while the SBM (blue trend) must be preferred to the traditional DRG (since the network is composed of parts with different link densities), heterogeneous benchmarks are, generally speaking, to be preferred. Although the DCM wins in the vast majority of cases (both for the bow-tie and the core-periphery mesoscale structures), quarters exist where the DCM and the RCM compete; BIC, on the other hand, sometimes favours the SBM when analysing the DIN core-periphery structure. The ddc-SBM experiences convergence problems throughout the entire temporal period.

Generally speaking, however, models accounting for the degree heterogeneity are to be preferred. As for the WTW, zero in-degrees and zero out-degrees are exactly reproduced by nonblock models like the DCM and the RCM. On top of this, the low reciprocity value of the DIN (amounting to

Deviations from this idealized picture, however, exist. This is particularly evident when analysing the

Consistently, AIC and BIC weights let the DCM win in the vast majority of cases, although in some periods the DCM and the RCM compete. Overall, this is valid when considering the DIN core-periphery structure too.

The WTW and the DIN represent two real-world systems characterized by (apparently) nontrivial mesoscale structures: while the first one is characterized by a (partial) bow-tie organization, in the second one the bow-tie partition coexists with a core-periphery partition. Both kinds of mesoscale structures are characterized by interacting blocks whose internal topology is commonly believed to be determined by a nontrivial interplay between node connectivity and the reciprocity of connections. It is, thus, interesting to ask to what extent such structures are, instead, accounted for by purely local information.

Remarkably, what our analysis points out is that specifying the degree sequences is often enough to reproduce these mesoscale structures, thus suggesting that the observed modules emerge as a consequence of local connectivity patterns between nodes: for example, the absence of incoming/outgoing connections for a set of nodes naturally leads them to be identified as an IN-/OUT-component.
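The mechanism can be illustrated with a minimal sketch of a degree-constrained model. It assumes the standard form of the Directed Configuration Model, in which the probability of a link from node i to node j is x_i y_j / (1 + x_i y_j), and solves for the hidden variables with a plain fixed-point iteration; this solver, the function names and the toy degree sequences are assumptions made for illustration and are not the estimation procedure described in the Appendix. Convergence of the iteration is not guaranteed in general, in particular when some degree is saturated.

```python
import math


def fit_dcm(k_out, k_in, max_iter=5000, tol=1e-10):
    """Fixed-point iteration for p_ij = x_i * y_j / (1 + x_i * y_j).

    At a fixed point the expected out- and in-degree of every node equal
    the observed ones; nodes with zero degree get a zero hidden variable.
    """
    n = len(k_out)
    links = sum(k_out)
    if links == 0:
        return [0.0] * n, [0.0] * n
    x = [k / math.sqrt(links) for k in k_out]
    y = [k / math.sqrt(links) for k in k_in]
    for _ in range(max_iter):
        new_x = []
        for i in range(n):
            denom = sum(y[j] / (1 + x[i] * y[j]) for j in range(n) if j != i)
            new_x.append(k_out[i] / denom if k_out[i] > 0 else 0.0)
        new_y = []
        for i in range(n):
            denom = sum(new_x[j] / (1 + new_x[j] * y[i]) for j in range(n) if j != i)
            new_y.append(k_in[i] / denom if k_in[i] > 0 else 0.0)
        change = max(abs(a - b) for a, b in zip(new_x + new_y, x + y))
        x, y = new_x, new_y
        if change < tol:
            break
    return x, y


def link_probabilities(x, y):
    """Matrix of link probabilities implied by the hidden variables (no self-loops)."""
    n = len(x)
    return [
        [0.0 if i == j else x[i] * y[j] / (1 + x[i] * y[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy degree sequences (hypothetical): node 0 has outgoing links only.
k_out = [2, 2, 1, 2, 1]
k_in = [0, 3, 2, 2, 1]
x, y = fit_dcm(k_out, k_in)
p = link_probabilities(x, y)
print([row[0] for row in p])  # probabilities of links pointing to node 0
```

Since the hidden variable y of a node with zero in-degree vanishes, every link pointing to it has zero probability under the model: the node can only send links, which is what makes it look like a member of an IN-component without any block information being supplied.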

Differences between systems, naturally, exist. Let us notice that, contrary to what is observed in the WTW case, AIC and BIC provide different answers to the question concerning the performance of block models in explaining the DIN core-periphery structure: while the Akaike criterion ranks the BCM first, the Bayesian criterion assigns the highest score to the SBM in the vast majority of temporal snapshots. If, on the one hand, this saves the role potentially played by blocks, on the other it points out that the large difference between the connectivity values of the core and the periphery [

A second comment about the DIN concerns the observation that, when considering the core-periphery structure, the AIC values of block models overlap with the AIC values of the simpler models to a larger extent (see Figure

A third comment concerns reciprocity: although it plays a role in the definition of the “core” parts (i.e., the SCC and the properly defined core), its explanatory power is much more limited than expected. As a result, the degree sequence seems to encode all relevant information needed to reproduce the mesoscale structures considered in the present paper, thus questioning the role supposedly played by some kind of higher-level information (e.g., a partition into blocks) in explaining them.

Generally speaking, all null models considered in this paper can be recovered within the Exponential Random Graphs (ERG) framework. Following [

All degree-informed null models can be recovered as particular cases of the following Hamiltonian:

Let us explicitly solve the BCM in the two, off-diagonal matrices

The aforementioned probability coefficients are determined via the likelihood condition in (

The estimation step, thus, reads

The SBM can be recovered by setting

Inserting the information about reciprocity into a bipartite null model leads to the following probability coefficient:

The probability coefficients defining our bipartite, reciprocal model read

The aim of this appendix is to provide simple examples of network configurations that further illustrate the methodology presented in the paper.

To this aim let us consider a bimodular structure where the link density of the two communities (whose number of nodes is

Let us now plot the trends of

Left panel: comparison between the numerical values of the BIC computed for the SBM (blue trend) and the DRG (red trend), on a bimodular network where the link density of the two communities (
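The comparison between the DRG and the SBM on a bimodular network can also be sketched numerically. The snippet below assumes that each block of the SBM is described by its own link density (hence four parameters for a directed, two-module network), that the DRG has a single parameter, and that the number of observations entering the BIC is the number of ordered node pairs; module sizes and densities are hypothetical values chosen for illustration only.

```python
import math


def bernoulli_log_likelihood(links, pairs):
    """Maximized log-likelihood of `pairs` independent link slots, `links` of them realized."""
    if links == 0 or links == pairs:
        return 0.0
    p = links / pairs
    return links * math.log(p) + (pairs - links) * math.log(1 - p)


def bic(log_likelihood, n_params, n_observations):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_observations) - 2 * log_likelihood


def compare_drg_sbm(blocks):
    """blocks: list of (number of links, number of ordered node pairs), one entry per block."""
    total_links = sum(links for links, _ in blocks)
    total_pairs = sum(pairs for _, pairs in blocks)
    drg_ll = bernoulli_log_likelihood(total_links, total_pairs)
    sbm_ll = sum(bernoulli_log_likelihood(links, pairs) for links, pairs in blocks)
    return {
        "DRG": bic(drg_ll, 1, total_pairs),
        "SBM": bic(sbm_ll, len(blocks), total_pairs),
    }


# Two modules of 50 nodes each: 50 * 49 ordered pairs inside each module,
# 50 * 50 ordered pairs in each direction between them.
within, between = 50 * 49, 50 * 50

# Dense modules (density 0.5), sparse connections between them (density 0.05).
modular = compare_drg_sbm([(1225, within), (1225, within), (125, between), (125, between)])

# Same density (0.2) everywhere: the partition carries no information.
uniform = compare_drg_sbm([(490, within), (490, within), (500, between), (500, between)])

print("modular:", modular)
print("uniform:", uniform)
```

When the blocks have clearly different densities, the gain in likelihood outweighs the penalty for the additional parameters and the SBM attains the lower BIC; when all blocks share the same density, the two likelihoods coincide and the penalty alone makes the DRG preferable.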

Let us now consider a core-periphery structure where the link density of the two communities (whose number of nodes is

As a last case study, let us now consider the comparison between the DCM and the RCM. To this aim, let us explicitly solve both models on binary, directed networks with an increasing level of reciprocity

Comparison between the numerical values of the BIC computed for the RCM (blue trend) and the DCM (red trend) on a network with an increasing level of reciprocity

World Trade Web data that support the findings of this study are openly available at the UN Comtrade Database (http://comtrade.un.org/). Dutch interbank exposures data are not publicly available due to privacy restrictions.

The manuscript has been presented at the CCS 2018 Conference (

The authors declare no competing financial interests.

Jeroen van Lidth de Jeude, Riccardo Di Clemente, Guido Caldarelli, Fabio Saracco, and Tiziano Squartini developed the method. Jeroen van Lidth de Jeude performed the analysis. Jeroen van Lidth de Jeude, Riccardo Di Clemente, Guido Caldarelli, Fabio Saracco, and Tiziano Squartini wrote the manuscript. All authors reviewed and approved the manuscript.

This work was supported by the EU Projects CoeGSS (Grant no. 676547), DOLFINS (Grant no. 640772), MULTIPLEX (Grant no. 317532), Openmaker (Grant no. 687941), and SoBigData (Grant no. 654024). RDC, as Newton International Fellow of the Royal Society, acknowledges support from the Royal Society, the British Academy, and the Academy of Medical Sciences (Newton International Fellowship, NF170505).