Information Feedback in Temporal Networks as a Predictor of Market Crashes

Laboratory for Financial and Risk Analytics, Faculty of Electrical Engineering and Computing, University of Zagreb, 10000 Zagreb, Croatia Woodrow Wilson School of Public and International Affairs and Department of Economics, Princeton University, Princeton, NJ 08544, USA CERGE-EI, Politickch vz. 7, P.O. Box 882, 111 21 Prague, Czech Republic Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA Faculty of Civil Engineering, University of Rijeka, 51000 Rijeka, Croatia Zagreb School of Economics and Management, 10 000 Zagreb, Croatia Luxembourg School of Business, Luxembourg, Luxembourg


Introduction
Connectivity patterns in complex systems and their dynamic properties have been the focus of extensive research in physical, biological, neurological, and social systems [1][2][3][4]. In many studies, the identification of strong dependencies between interconnected components has been linked to systemic risk, that is, the risk associated with the collapse of the entire system [5][6][7][8]. This line of research is especially prominent in modeling risk in financial systems [9,10], where dependent components within the system (e.g., banks, companies, and financial assets) are more likely to fail simultaneously and to influence other connected components (a phenomenon also known as the spillover effect), thus inducing a potential cascade of failures in the entire system [11]. However, due to the dynamic nature of financial systems and their often complex dependency relationships, identifying and quantifying these effects is generally not a trivial task [12,13]. In addition, financial variables and time series (e.g., prices, returns, and volumes) are known to exhibit strongly non-Gaussian characteristics, heavy tails, and long-range dependence, which calls into question standard parametric approaches [14].
The complexity of economic and financial systems has been studied from various perspectives, employing Ising models [15,16], agent-based models [17][18][19], and game theory [20]. The instabilities therein, observed as crises and crashes [21,22], are especially elusive and hard to model and predict [23,24]. Generally, connectivity patterns in financial markets are modelled by estimating and analyzing graphs of financial assets [25], which have been found to capture the structural properties of these complex systems, such as the hierarchical structures captured by spanning trees in financial markets [26,27]. Empirical research on connectedness in financial systems and its relation to systemic risk has intensified in the aftermath of the 2008 subprime crisis, focusing either on contagion within the banking sector or on comovement in financial markets [28]. Longin and Solnik [29] first provided formal evidence of increased correlations during bear markets, and recent studies report that volatility cross-correlations exhibit long memory, meaning that once high volatility (risk) has spread across the entire market, it can last for a long time [30]. Adrian and Brunnermeier's ΔCoVaR and the systemic expected shortfall (SES) by Acharya et al. [31] quantify the potential distress of financial institutions conditional on other institutions' poor performance, thus measuring the spillover of losses within the financial sector. Kritzman and Li [32] quantify the divergence of the cross section of financial returns from their historical behavior, expressed as the Mahalanobis distance, and find that it can be used similarly to implied volatility in sets of assets without liquid option markets, while accounting for their interactions. To measure the total contribution of assets to the systemic risk of the entire market, Kritzman et al. [33] propose the absorption ratio, based on the principal component analysis of the cross section of financial returns. A more detailed look into the connectivity patterns in financial systems was proposed by Billio et al. [34], who analyzed the dynamic causality patterns in networks of hedge funds, brokers, insurance companies, and banks. They report a highly asymmetric relationship during the subprime crisis of 2008 and find that the proportion of significant causal relationships in the network increases with financial distress and crises in the market. Recently, Curme et al. [35] introduced a numerical method for validating time-lagged correlation networks of assets and found a growing number of statistically validated links and a rise of instability in financial networks. Although the state-of-the-art approaches employ correlation-based measures and Granger causality tests to infer dependency relationships in financial networks, the assumptions of linearity and Gaussianity behind these methods are often violated by financial data. In addition, the conclusions drawn from such analyses require further long-range historical backtests of the relationship between specific network patterns and systemic risk in the markets, which would include more systemic events than the single 2008 subprime crisis.
In this paper, we investigate the dynamic relationships within a network of financial assets, their evolution through time, and their relation to the systemic risk in the market. We take a nonparametric approach to identify and validate time-lagged cross-sectional links between pairs of assets in a financial market. Based on the information-theoretic concept of entropy, which quantifies uncertainty, we measure the dependence between and within time series as a reduction in uncertainty, as quantified by Schreiber's transfer entropy [36]. The concept of entropy has been used to measure sequential irregularity in many time series applications, with notable results in finance. Moreover, transfer entropy-based methods yield state-of-the-art results in detecting information flows in computational neuroscience, bioinformatics, and financial economics [37]. Specific financial applications include estimation of serial irregularities and risk in time series of returns and inference of the global dependence networks of financial indices [20][38][39][40][41][42]. We expand on previous results by investigating the evolution of dynamic causal networks (sometimes referred to as information flow networks) through time and analyzing their association with systemic risk in the market. Moreover, we focus on the emergence of information feedback within these networks and hypothesize that strongly connected feedback components may indicate future distress in the system. The concept of feedback in financial systems was previously linked to systemic risk in various studies [22,43]; most notably, the DebtRank methodology proposed by Battiston et al. [44] uses interbank lending networks to assess the risk within the financial sector. Our approach moves beyond interbank lending networks and relies on time series of returns for any set of financial assets to infer directed dependency networks and study the information feedback within them. In addition, previous approaches often depend on fundamental firm-level financial data, which are available only at quarterly intervals and are often heterogeneous in nature, while in this paper we estimate dependency networks using asset prices from financial markets, allowing for wider areas of application.
Based on the proposed methodology for estimating dependency networks of financial assets from time series of asset returns, we examine the general levels of predictability in the market and the topologies of the estimated dependency networks. Furthermore, we study the emergence of information feedback in the networks and introduce a network-based systemic risk indicator to test our hypothesis. We apply the proposed approach to 9 U.S. sector indices on a period from 1999 to 2016 and a selection of S&P 500 constituent company stocks from 1980 to 2016. In addition, we consider the U.S. House Price Index [45] data and apply our approach to the real estate market as well. Our results suggest that the dynamic dependency networks exhibit strongly connected feedback components, particularly around periods of financial crises. The proposed systemic risk indicator is shown to yield predictive power for future market distress, both for the two stock market datasets (sector indices and individual stocks) and for the real estate market. These results demonstrate the validity of our approach and indicate that the proposed methodology can be used to construct early-warning signals for crashes in financial markets.

Methods and Data
2.1. Information Theoretic Causality Measures. In thermodynamics and statistical mechanics, entropy is a measure of disorder in a system. In information theory, it quantifies the uncertainty of a process, based on the Shannon information content of an event $x$: $\eta(x) = -\log p(x)$ [46]. For a sequence of events $X$, entropy is defined as the average information:
$$H(X) = -\sum_{x \in X} p(x) \log p(x). \tag{1}$$
In analogy, the joint entropy $H(X, Y)$ for two variables $X$ and $Y$ is defined with the joint probabilities $p(x, y)$ in the sum. The entropy is maximal when all events are equally probable, meaning their distribution is uniform, and minimal (equal to zero) for deterministic processes [47]. In analogy with (1), the conditional entropy $H(X \mid Y)$ measures the uncertainty in $X$ left after accounting for the context in $Y$:
$$H(X \mid Y) = -\sum_{x, y} p(x, y) \log p(x \mid y). \tag{2}$$
To measure causality relationships between time series, the most common approach is the Wiener-Granger concept of causality [48]: if the prediction of time series $X_i^t$ given its own past $X_i^{t-1, \ldots, t-k}$ is improved by using the past values of another time series $X_j^{t-1, \ldots, t-l}$, then $X_j \to X_i$ [49]. Although the most popular approach to Granger causality is based on estimation of VAR models, Schreiber [36] proposed the notion of transfer entropy as a nonparametric version of Granger causality. In the information theoretic sense, transfer entropy is defined as the amount of uncertainty in $X_i$ given its own past that is removed by including the past of $X_j$:
$$T_{j \to i} = H\left(X_i^t \mid X_i^{t-1, \ldots, t-k}\right) - H\left(X_i^t \mid X_i^{t-1, \ldots, t-k}, X_j^{t-1, \ldots, t-l}\right), \tag{3}$$
where the uncertainty is quantified using the concept of conditional entropy $H(A \mid B)$: the uncertainty left in $A$ after accounting for the context $B$ [37]. Note that owing to the lack of autocorrelation in return time series, we may restrict the past of $X_i$ to $k = 1$ and the past of $X_j$ to $l = 1$ steps and thus reduce the dimensionality of the estimation procedures.
By incorporating Shannon's entropy given in (2), the transfer entropy formula from (3) can be specified as
$$T_{j \to i} = \sum_{x_i^t,\, x_i^{t-1},\, x_j^{t-1}} p\left(x_i^t, x_i^{t-1}, x_j^{t-1}\right) \log \frac{p\left(x_i^t \mid x_i^{t-1}, x_j^{t-1}\right)}{p\left(x_i^t \mid x_i^{t-1}\right)}. \tag{4}$$
The above expression is also known as the Kullback-Leibler divergence between the distributions $p(X_i^t \mid X_i^{t-1}, X_j^{t-1})$ and $p(X_i^t \mid X_i^{t-1})$ [36]. If the distribution of $X_i^t$ given its own past $X_i^{t-1}$ were independent of $X_j^{t-1}$, the KL divergence given in (4) would be equal to 0. On the other hand, if $X_j$ deterministically predicted $X_i$, the KL divergence would reach its maximum value, equal to the entropy $H(X_i^t \mid X_i^{t-1})$. In the Granger-Wiener causality sense, by using expression (4) we measure the improvement in predicting $X_i$ when knowing $X_j$, conditional on the past of $X_i$ itself. It has been shown that for Gaussian variables, transfer entropy is equivalent to Granger causality [50], which implies that the standard linear and VAR model-based methods are a special case of the proposed approach. Due to a positive bias in the estimates $T_{j \to i}$, Marschinski and Kantz [38] propose the effective transfer entropy for financial time series, calculated by subtracting the mean of the surrogate measurements $T^s_{j \to i}$, obtained using random permutations of the source time series $X_j$. In order to measure the fraction of the maximum possible value, we employ the concept of normalized transfer entropy, used in neurophysiology and computational neuroscience [51,52]. It is defined as the effective transfer entropy divided by the entropy of $X_i$ given its own past, which is the maximum theoretical value of $T_{j \to i}$ (for the case when $H(X_i^t \mid X_i^{t-1}, X_j^{t-1}) = 0$):
$$T^n_{j \to i} = \frac{T_{j \to i} - E\left[T^s_{j \to i}\right]}{H\left(X_i^t \mid X_i^{t-1}\right)}. \tag{5}$$
This measurement represents the fraction of information in $X_i$ not explained by its own past which is explained by including the past of $X_j$.
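The chain from the plug-in transfer entropy to its effective and normalized versions can be illustrated with a minimal sketch. It assumes binary (0/1) sequences and plain occurrence counting, which is a simplification rather than the estimator used in this paper; the function names, the base-2 logarithm, and the number of surrogates are hypothetical choices made for illustration.

```python
import numpy as np

def joint3(x, y):
    """Empirical p(x_t, x_{t-1}, y_{t-1}) for 0/1 integer sequences (hard counts)."""
    p = np.zeros((2, 2, 2))
    np.add.at(p, (x[1:], x[:-1], y[:-1]), 1.0)
    return p / p.sum()

def transfer_entropy(x, y):
    """Return (T_{y->x}, H(x_t | x_{t-1})) in bits, with lags k = l = 1."""
    p_abc = joint3(x, y)
    p_ab = p_abc.sum(axis=2)   # p(x_t, x_{t-1})
    p_bc = p_abc.sum(axis=0)   # p(x_{t-1}, y_{t-1})
    p_b = p_ab.sum(axis=0)     # p(x_{t-1})
    te, h = 0.0, 0.0
    for a in range(2):
        for b in range(2):
            if p_ab[a, b] > 0:
                h -= p_ab[a, b] * np.log2(p_ab[a, b] / p_b[b])
            for c in range(2):
                if p_abc[a, b, c] > 0:
                    cond_full = p_abc[a, b, c] / p_bc[b, c]  # p(a | b, c)
                    cond_own = p_ab[a, b] / p_b[b]           # p(a | b)
                    te += p_abc[a, b, c] * np.log2(cond_full / cond_own)
    return te, h

def normalized_te(x, y, n_surrogates=100, seed=0):
    """Effective TE (shuffled-source bias removed) divided by H(x_t | x_{t-1})."""
    rng = np.random.default_rng(seed)
    te, h = transfer_entropy(x, y)
    bias = np.mean([transfer_entropy(x, rng.permutation(y))[0]
                    for _ in range(n_surrogates)])
    return (te - bias) / h if h > 0 else 0.0
```

When the target is an exact one-step-delayed copy of a random source, the plug-in transfer entropy equals the conditional entropy of the target, so the normalized value is close to 1; for two independent sequences it fluctuates around 0.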
To include potential serial dependency of each time series $X_i^t$ on its own past $X_i^{t-1}$, we also estimate the mutual information:
$$I_i = H\left(X_i^t\right) - H\left(X_i^t \mid X_i^{t-1}\right).$$
In the discrete case, the expression can be reformulated as follows:
$$I_i = \sum_{x_i^t,\, x_i^{t-1}} p\left(x_i^t, x_i^{t-1}\right) \log \frac{p\left(x_i^t, x_i^{t-1}\right)}{p\left(x_i^t\right) p\left(x_i^{t-1}\right)}.$$
The above expression again corresponds to the KL divergence [47] between the joint distribution $p(X_i^t, X_i^{t-1})$ and the product of marginal distributions $p(X_i^t)\, p(X_i^{t-1})$. If the return distributions were independent, the joint distribution would be equal to the product of marginals and the KL divergence would be equal to 0. On the other hand, if the subsequent returns were functionally (deterministically) dependent, then the KL divergence would reach its maximum value, equal to the entropy $H(X_i^t)$.
In addition, the mutual information for Gaussian distributions is determined by the Pearson correlation coefficient $\rho$, $KL = -0.5 \log(1 - \rho^2)$, again implying that the proposed approach is a generalization of correlation-based methods. In analogy with (5), we estimate the effective mutual information $\hat{I}_i - E[\hat{I}^s_i]$ (where $\hat{I}^s_i$ is estimated using the shuffled time series $X_i^{t-1}$) and normalize it by $H(X_i^t)$, which is its theoretical maximum.

Empirical Estimation
The probability distributions appearing in the above expressions, including the marginal $p(X_i^{t-1})$, need to be estimated, which is commonly not a simple task due to the scarcity of data. Various methods for discretizing asset returns for transfer entropy estimation have been utilized [40,41], mainly based on binning. Specifically, due to the dynamic nature of financial systems, dependencies are estimated using time windows rather than the entire history, which makes the studied time series considerably short. Since a large number of bins may significantly deteriorate the estimation of multidimensional distributions, we substantially reduce the number of discrete states by restricting returns to just two classes based on their sign: positive and negative. However, rather than just taking the signs of returns and counting their occurrences on a given time window, we employ concepts from fuzzy set theory [53] and define a sigmoid membership function which encodes realized returns $r_t$ into two classes (positive and negative) with memberships $\mu^+_t$ and $\mu^-_t$:
$$\mu^+_t = \frac{1}{1 + e^{-\alpha r_t}}, \qquad \mu^-_t = \frac{1}{1 + e^{\alpha r_t}},$$
where $\alpha$ is a parameter defining the steepness of the sigmoid function. Note that $\forall r_t: \mu^+_t + \mu^-_t = 1$. Therefore, the sigmoid functions define the membership of $r_t$ in the positive and negative classes depending on its magnitude: very positive returns will have much higher $\mu^+_t$ (and therefore smaller $\mu^-_t$), and vice versa, as shown in Figure 1.
To better understand the procedure of estimating discrete distributions of returns with just two discrete realizations based on sigmoid membership functions, we give an illustrative example: let $r = (0.03, -0.01, 0.04, -0.02)$ be a time series of asset returns at $n = 4$ discrete points in time. By simply counting the occurrences of positive and negative returns, $\mathrm{sgn}(r) = (1, -1, 1, -1)$, one can estimate the discrete distribution of the two return classes: $p^+ = 0.5$ and $p^- = 0.5$. However, when the sigmoid membership functions are applied, one obtains $\mu^+ = (0.99, 0.18, 0.997, 0.05)$ and $\mu^- = (0.01, 0.82, 0.003, 0.95)$. From these memberships, the probabilities can be estimated as $p^+ = \frac{1}{n} \sum_t \mu^+_t$ (and likewise for $\mu^-$): $p^+ = 2.22/4 \approx 0.55$ and $p^- = 1.78/4 \approx 0.45$. The procedure for multidimensional returns is analogous: returns are discretized into two classes using the proposed sigmoidal membership function, and discrete distributions are estimated by summing the memberships of the samples. In this fashion, we do not treat all signs equally but, through the memberships, assign more weight to returns of larger magnitude; thus, very small positive or negative returns are not as significant as larger ones. By doing so, we manage to keep the dimensionalities of the discrete distributions in (6) and (4) low, while accounting for the magnitude of returns.
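The worked example above can be reproduced with a short sketch. The steepness value `ALPHA = 150` is an assumption: it is not stated in the text and was chosen here only because it reproduces the quoted memberships to the printed precision. The function name `memberships` is hypothetical.

```python
import numpy as np

ALPHA = 150.0  # assumed steepness; inferred from the example values, not given in the text

def memberships(returns, alpha=ALPHA):
    """Sigmoid memberships of each return in the positive and negative class."""
    r = np.asarray(returns, dtype=float)
    mu_pos = 1.0 / (1.0 + np.exp(-alpha * r))
    mu_neg = 1.0 - mu_pos  # identical to 1 / (1 + exp(alpha * r))
    return mu_pos, mu_neg

r = [0.03, -0.01, 0.04, -0.02]
mu_pos, mu_neg = memberships(r)

# Class probabilities are the average memberships over the window.
p_pos, p_neg = mu_pos.mean(), mu_neg.mean()
```

Under this assumed steepness the memberships round to the values quoted in the example, and the two class probabilities sum to one by construction.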
To obtain directed causality networks from multiple time series (which represent components of a dynamical complex system), we estimate $T^n_{i \to j}$ for all pairs of time series $(i, j)$. From these we infer directed networks in which each directed edge $i \to j$ carries a weight equal to the normalized transfer entropy $T^n_{i \to j}$, and each self-loop carries a weight equal to the estimated normalized mutual information $\hat{I}^n_i$. Therefore, we estimate a temporal network where links represent causality relationships between individual financial assets (including self-loops, which indicate serial dependency).
In this contribution, we provide empirical results by performing our analysis on three distinct datasets: (i) daily prices of 9 U.S. sector indices on a period from 1990 to 2017, (ii) daily stock prices of 47 companies which are long-established constituents of the S&P 500 index from 1980 to 2017, and (iii) quarterly house price indices of 51 U.S. states on a period from 1975 to 2017 [45]. Since prices generally have a positive drift (often modeled as Brownian motion), to measure causality between financial time series we use logarithmic returns, calculated as the change in the logarithm of the price $S_t$:
$$r_t = \log S_t - \log S_{t-1}.$$
Thus, in our analysis the time series $X_i$, $i = 1, \ldots, N$, correspond to the $N$ individual components of the considered financial systems (specifically: 9 sector indices, 47 companies, and 51 house price indices).
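As a small illustration of the return definition, the following sketch computes logarithmic returns from a price series. The function name `log_returns` and the sample prices are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """r_t = log S_t - log S_{t-1}; works column-wise for a (time, assets) array."""
    s = np.asarray(prices, dtype=float)
    return np.diff(np.log(s), axis=0)

# Illustrative prices only, not data from the paper.
r = log_returns([100.0, 110.0, 99.0])
```

Log returns add up over time, so the sum of `r` equals the logarithm of the ratio of the last price to the first.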

Information Feedback in Directed Dependency Networks.
The proposed methodology is based on detecting pairs of time series where knowing the past of one helps predict the future of the other with respect to only knowing its own past. By estimating pairwise links in such a way, we obtain networks where the nodes represent individual time series and the directed links between them denote the estimated dependencies. Even though pairwise methods may not capture a common third driving factor, accounting for all such effects is virtually impossible in real-world datasets. On a network level, the goal is not to measure the entire dependency of the system (for instance, as measured by the total correlation); this would require estimation in high-dimensional settings, which is especially problematic in systems such as financial markets, where long time windows may not be appropriate due to their dynamic properties. Rather than that, we are interested in the patterns created by the estimated dependency networks, inferred from pairwise links. Note that these are not instantaneous correlations (which would be measured by the correlation/covariance matrices), but rather time-lagged effects which may not exist even in the presence of high correlations, and represent a nonlinear extension of the well-known Granger causality [54].
We hypothesize that feedback effects in the dependency structure of financial assets are symptomatic of severe inefficiencies in the system, and thus are related to the overall level of systemic risk. We define information feedback not only as a bidirectional dependency (or information transfer, as measured by transfer entropy) estimated in pairs of assets, but also as a loop of any size in the network, forming a pattern of cyclic dependency. This is closely related to strongly connected components (SCCs) from network theory, defined as subgraphs consisting of nodes which are all reachable from every other node within the same subgraph, as shown in Figure 2.
Evidently, the emergence of SCCs indicates feedback in the networks, not only between pairs of nodes, but also cycles through multiple nodes and edges.We measure this by employing Tarjan's procedure [56] for identifying SCCs in each network.The procedure is a depth-first search algorithm which traverses all nodes and their respective neighbors, and partitions the original directed graph into subgraphs corresponding to the SCCs.The algorithm passes once through each node, building a forest of trees and subtrees containing nodes reachable from each traversed node, while keeping track of the highest reachable parent node for each traversed node in the graph (since these links are not necessarily preserved in the trees).Nodes which contain a link to a parent in their tree or another node which may link to a common parent in the tree form a strongly connected component, including all other nodes in their subtree.The algorithm pseudocode for a set of vertices V and edges E is given below.
The algorithm does not depend on the ordering of nodes or the choice of the first root node. Moreover, since the depth-first search traverses each node only once, the computational complexity is $O(|V| + |E|)$. We employ Tarjan's algorithm to detect SCCs in the estimated directed dependency networks obtained from financial time series, and study the properties and emergence of information feedback in the system.
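For reference, a compact recursive sketch of Tarjan's procedure is shown here. It follows the standard textbook formulation and is not taken from this paper's own listing; the adjacency-dictionary input format and the function name `tarjan_scc` are assumptions made for illustration.

```python
def tarjan_scc(graph):
    """Strongly connected components of a directed graph.

    graph: dict mapping each node to an iterable of its successors.
    Returns a list of components, each a list of nodes.
    """
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)  # tree edge: recurse, then propagate the low-link
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])  # edge back into the stack
        if lowlink[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components

# Toy network: a 3-cycle, a feedback pair, and an isolated node.
toy = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: []}
sccs = tarjan_scc(toy)
```

The recursion depth grows with the length of the longest search path, which is unproblematic for networks of a few dozen nodes such as those studied here; an iterative variant would be preferable for very large graphs.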

3.1. U.S. Stock Market. First, we analyze the dependency networks of the U.S. financial markets represented by the 9 sector indices and 47 stocks of S&P 500 constituent companies. Before estimating temporal networks, to measure the general amount of predictability we estimate the entropies $H(X_i^t)$ and conditional entropies $H(X_i^t \mid X_i^{t-1})$ for all stocks $i$, and the conditional entropies $H(X_i^t \mid X_i^{t-1}, X_j^{t-1})$ for all stock pairs $(i, j)$, on a rolling time window of $T = 1$ year and a step of 1 day. Figure 3 shows the estimated entropies of the 47 stocks of S&P 500 constituent companies, averaged over all companies $i$ and pairs $(i, j)$ and subtracted from the theoretical maximum entropy, thus quantifying the amount of predictability in the time series of returns through time. The evidence in Figure 3 suggests that the predictability of stock returns from their own past, $H(X_i^t \mid X_i^{t-1})$, and from the past of other stock returns, $H(X_i^t \mid X_i^{t-1}, X_j^{t-1})$, generally diminishes through time, as indicated by the distinct negative trends in conditional entropies and in the differences between the conditional entropies and the entropies of each stock $i$. This may be interpreted through the implications of the efficient market hypothesis [57,58] and suggests that over the last four decades the frictions in the U.S. stock market have decreased; this is in line with a reported increase of liquidity and reduction of trading costs in the U.S. stock market in the observed period from 1980 to 2017 [59].
We estimate the normalized transfer entropies $T^n_{j \to i}$ for all pairs of stocks $(i, j)$ and analyze snapshots of the inferred causality networks at different points in time, shown in Figure 4. It is evident that the causality networks exhibit considerable differences in different periods. These results suggest that cross-sectional causal relationships between stocks rise during turbulent market periods and are substantially lower when the entire system is stable.
To analyze this relationship, we measure the total number of links as a percentage of the maximum possible number of links in the network and the dynamics of this quantity through time. In addition, we study the emergence of feedback relationships in the estimated networks; here we define the elementary feedback pair as a situation where both $T^n_{j \to i}$ and $T^n_{i \to j}$ are nonnegative for a pair of time series $(i, j)$. We count the number of pairs $(i, j)$ for which such relationships are found and again express it as a percentage of the total possible number of such pairs in a network. In addition, we inspect the average link weight (equal to the average $T^n_{i \to j}$ for identified links), and display these quantities for the networks of 47 stocks of long-established S&P 500 constituents in Figure 5.
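The three network statistics described here can be sketched as follows. The input is assumed to be a square matrix `T` with `T[i, j]` holding the normalized transfer entropy from `i` to `j`; the function name is hypothetical. The sketch treats strictly positive entries as links, which differs from the nonnegativity condition in the text only for entries that are exactly zero.

```python
import numpy as np

def link_statistics(T):
    """Share of links, share of feedback pairs, and mean link weight."""
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    A = (T > 0) & off_diagonal                 # directed links, self-loops excluded
    link_share = A.sum() / (n * (n - 1))       # out of N(N-1) possible links
    feedback_pairs = np.triu(A & A.T, k=1).sum()
    pair_share = feedback_pairs / (n * (n - 1) / 2)  # out of N(N-1)/2 pairs
    mean_weight = T[A].mean() if A.any() else 0.0
    return link_share, pair_share, mean_weight

# Illustrative 3-node matrix, not estimates from the paper.
T_example = [[0.0, 0.2, -0.1],
             [0.1, 0.0, 0.3],
             [-0.2, -0.05, 0.0]]
link_share, pair_share, mean_weight = link_statistics(T_example)
```

In the example, three of the six possible links are present, one of the three possible pairs forms a feedback pair, and the mean weight of the identified links is 0.2.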
The average link weight is shown to gradually decrease over time, which is in line with the predictability levels in Figure 3.However, due to the bias correction in (5), the number of nonnegative estimated normalized transfer entropies (i.e., the number of links) does not have this drift.
We estimate the SCCs in each network through the observed period and analyze the individual SCC sizes (number of nodes within the SCC), demonstrated in Figure 6.
These results reveal the nature of the feedback in temporal causality networks, suggesting that it mainly concentrates within one large SCC, rather than multiple commensurable components.This finding is especially interesting when considering the fact that for correlation matrices of contemporaneous asset returns, the first eigenvalue in the spectral decomposition (also known as market mode) is found to account for the majority of variance in the market [33,34].This finding suggests that the common risk component in interconnected financial markets is not only contemporaneous, but also spills over through temporal dependencies between assets, and that this effect seems to persist through time.
Moreover, the notable rise in the SCC size around 1998, which remains relatively high until after the 2008 subprime crash (with some fluctuations between the 2000 Dot-com bubble and the 2008 crash), might be evidence of a phase transition in the complex network of financial assets [2,60]. The emergence of such a large SCC in the network might be a consequence of investors forming patterns of feedback trading across almost the entire market, something reported for both individual and institutional investors [61]. A possible line of further research might try to find which behavioral properties of agents on a microstructural level cause such dynamics to be observed in the system [17,62,63].
In this paper, we hypothesize that the information feedback observed in directed dependency networks indicates inefficiencies in the financial market and may be used as a measure of systemic risk. To verify our hypothesis, we measure the amount of feedback in each network by only considering the largest SCC (motivated by the fact that it is significantly larger than the others and consists of up to 100% of nodes in the network) and propose a measure based on the outdegrees $d^+_i$ of all nodes $i$ within the largest SCC:
$$S = \frac{1}{N(N-1)} \sum_{i \in \mathrm{SCC}} d^+_i. \tag{10}$$
We normalize by $N(N-1)$, which is the maximum possible number of directed links in a fully connected giant SCC that would contain all nodes in the network. We call the quantity in (10) the SCC index and calculate $S_t$ from the networks estimated at each point in time. Because the existence of feedback within the network does not necessarily mean that a negative shock is about to occur, we combine the estimated SCC index with the VIX index, calculated using implied volatilities on the S&P 500 index and also known as the fear index. The goal of including the VIX in the early-warning indicator is to measure both the feedback and the fear within the market; however, it is beyond the scope of this paper to detail the construction of a comprehensive early-warning indicator for financial crises: this is left for further analyses which would include exogenous effects and data. In the following results, we only use the observed market data to demonstrate the predictive power of the proposed approach. The results for the SCC index and the $\mathrm{VIX} \cdot S$ indicator for both stocks and sector index datasets are shown in Figure 7.
Figure 2: The causality network of the U.S. stock market estimated in the period 1992-1994, showing only nodes connected to the main component. Two strongly connected components and the edges within are marked in green and blue. The network was visualized in the Cytoscape software environment [55].
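A minimal sketch of the SCC index computation is given here. Two points are assumptions rather than details confirmed by the text: the outdegrees are counted over links between members of the largest SCC only (self-loops excluded), and the SCC is found through a simple reachability closure, which is adequate for small networks but is not Tarjan's algorithm. The function name `scc_index` is hypothetical.

```python
import numpy as np

def scc_index(A):
    """SCC index: outdegrees within the largest SCC, divided by N(N-1)."""
    A = np.asarray(A, dtype=bool).copy()
    n = A.shape[0]
    np.fill_diagonal(A, False)            # self-loops carry no cross-asset feedback
    reach = A | np.eye(n, dtype=bool)
    for k in range(n):                    # Warshall transitive closure
        reach |= reach[:, k][:, None] & reach[k, :][None, :]
    mutual = reach & reach.T              # i and j reach each other
    members = mutual[np.argmax(mutual.sum(axis=1))]  # mask of the largest SCC
    out_degrees = A[np.ix_(members, members)].sum(axis=1)
    return out_degrees.sum() / (n * (n - 1))

# Toy network: cycle 0 -> 1 -> 2 -> 0 plus a link 2 -> 3.
A_example = np.zeros((4, 4), dtype=bool)
for i, j in [(0, 1), (1, 2), (2, 0), (2, 3)]:
    A_example[i, j] = True
S_example = scc_index(A_example)
```

With this convention a network without cycles scores 0 and a fully connected network scores 1. Counting all outgoing links of the SCC members, including those leaving the component, would be an alternative reading of the definition.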
The estimated SCC index builds up in the bull market prior to the Dot-com bubble of 2000, as well as just prior to the 2008 subprime mortgage crisis, but in both cases deflates in the aftermath of the crashes. The shift in the behavior of the system seen in 2003 has also been reported in a mutual information-based analysis by Harré and Bossomaier [64]. It is important to note that events such as the September 11 attacks are exogenous to the system and thus cannot be expected to be found in the data and predicted by such endogenous approaches; such considerations are important, especially in the presence of so-called "twin crises" [65]. In the case of the subprime mortgage crisis, although some econometric evidence exists on liquidity and return dependence prior to the crisis [66,67], the question remains to what extent the housing bubble was priced in the stock market. Our SCC index seems to rise just at the onset of the 2007 instabilities, but prior to the large drawdown in the S&P 500 index of 2008; a similar increase in risk was observed in other systemic risk measures [32,33].
Figure 3: Estimated entropies and conditional entropies for all stocks $i$ and pairs $(i, j)$ of the 47 stocks of S&P 500 constituent companies, averaged over all companies $i$ and pairs $(i, j)$, and subtracted from the theoretical maximum entropy. The difference $H(X_i^t \mid X_i^{t-1}) - H(X_i^t)$ corresponds to the amount of predictability of time series of financial returns from their own past, and $H(X_i^t \mid X_i^{t-1}, X_j^{t-1}) - H(X_i^t)$ corresponds to the amount of predictability of time series from both their past and the past of other time series; both of these quantities diminish through time.
Generally, there are two requirements that each early-warning indicator should meet. First, it should warn prior to the arrival of large shocks, that is, ex ante rather than ex post. Second, the peaks indicating these shocks should be substantially higher than the bulk of the indicator values, which quantify the remaining events. To verify that the SCC index $S_t$ and the proposed systemic risk indicator $\mathrm{VIX} \cdot S_t$ both rise prior to periods of increased volatility, we inspect the cross-correlation functions between each of them and the volatility of the S&P 500 index, measured by the standard deviation of returns on a rolling time window of 1 year (the past 1-year realized volatility $\sigma_t^{t-1y}$). The cross-correlation functions for both stocks and sector index data are shown in Figure 8, indicating that both signals increase prior to the increase in volatility, by approximately τ = 150 days for stocks and τ = 250 days for sector indices.

Furthermore, to test the capacity of the SCC index to predict future distress in the market, we perform a regression on the future 1-year realized volatility $\sigma_t^{t+1y}$ of the S&P 500 index, calculated as the sample standard deviation. We choose this period even though the cross-correlation analysis may suggest that the index calculated from U.S. stock market data is predictive over a shorter time frame. Since different time windows for the estimation of the SCC index could well yield different time frames for the prediction of volatility, this is left to further research, and we proceed with the 1-year time window both for the estimation of the SCC index and for inference on forward volatility prediction.

First, we define the reduced model, where the input variables are the VIX indicator and the past 1-year realized volatility $\sigma_t^{t-1y}$, to account for the autoregressive nature of market volatility: $\sigma_t^{t+1y} = \beta_0 + \beta_1 \mathrm{VIX}_t + \beta_2 \sigma_t^{t-1y}$. The VIX is included in the null model in order to investigate how much the proposed SCC index alone improves volatility prediction compared with standard techniques and signals. It is also important to note that the past volatility is calculated on the period $[t - 1y, t]$ and the future volatility on $[t, t + 1y]$, so the two windows do not overlap. In the expanded model, we include the SCC index $S_t$: $\sigma_t^{t+1y} = \beta_0 + \beta_1 \mathrm{VIX}_t + \beta_2 \sigma_t^{t-1y} + \beta_3 S_t$. To compare the models, we use the adjusted $R^2$ measure (which takes into account the number of parameters in the model) and the Akaike information criterion (AIC). Note that the AIC is observed on an additive scale and only the differences in AIC between models are interpretable, in such a way that a model with AIC reduced by 10 or more is considered to have substantial support [68].

The results in Table 1 demonstrate a significant improvement in future volatility prediction when the SCC index is included, compared with the reduced models that include only past volatility and the VIX index. This is implied both by the increase in the adjusted $R^2$ and by the strong decrease in the AIC in the expanded model. Although from a modelling perspective the $R^2_{\mathrm{adj}}$ values of 0.45 and 0.5 for the expanded models may not seem an exceptional fit, they are not far from other results in financial research; for instance, the fit of Shiller's CAPE on 10-year forward returns is around 0.5-0.6 (depending on the period) [69]. However, to the best of our knowledge, there are no similar results for such short- to mid-term predictions of market volatility (i.e., the forward 1-year period used here). In addition, to see how the combination of VIX and the SCC index fares against the linear model, we include the interaction term $\mathrm{VIX} \cdot S$ in the regression, and observe that the model does not seem to improve significantly, as suggested by the $R^2_{\mathrm{adj}}$ and AIC measures. However, we obtain the best AIC when estimating a linear model using only the interaction term, $\sigma_t^{t+1y} = \beta_0 + \beta_1 \mathrm{VIX}_t \cdot S_t$, for both the stocks and the sector index datasets. This implies that this single variable captures the most important dynamics pertaining to future volatility prediction and allows for the simplest form of the model. Owing to the normalization of the SCC index by the maximum possible size of the network, the SCC index estimated from 47 stocks and the one estimated from 9 sector indices perform very consistently with respect to the magnitude of the estimated coefficients. In addition, the fact that the respective models fit very similarly indicates that the proposed methodology captures common effects in the market, observable from different datasets.
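The comparison of the reduced and expanded models can be illustrated with a minimal Python sketch. It is not the code used in this study: the data are synthetic, the helper names `solve` and `ols` are hypothetical, and the AIC is assumed to take its Gaussian least-squares form $n \ln(\mathrm{RSS}/n) + 2k$, which differs from other conventions only by a constant that cancels in differences between models fitted to the same data.

```python
import math
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols(y, columns):
    """Least squares with an intercept; returns (beta, adjusted R^2, AIC)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])  # number of coefficients, intercept included
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y[i] for i, r in enumerate(rows)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r2_adj = 1.0 - (rss / (n - p)) / (tss / (n - 1))
    aic = n * math.log(rss / n) + 2 * p  # assumed Gaussian form, up to a constant
    return beta, r2_adj, aic

# Synthetic stand-ins for VIX_t, past volatility, and the SCC index S_t.
random.seed(0)
n = 400
vix = [random.gauss(20.0, 5.0) for _ in range(n)]
past_vol = [random.gauss(0.15, 0.03) for _ in range(n)]
scc = [random.random() for _ in range(n)]
future_vol = [0.05 + 0.002 * v + 0.3 * pv + 0.1 * s + random.gauss(0.0, 0.01)
              for v, pv, s in zip(vix, past_vol, scc)]

beta_reduced, r2_reduced, aic_reduced = ols(future_vol, [vix, past_vol])
beta_expanded, r2_expanded, aic_expanded = ols(future_vol, [vix, past_vol, scc])
delta_aic = aic_expanded - aic_reduced  # negative values favour the expanded model
print(round(r2_reduced, 3), round(r2_expanded, 3), round(delta_aic, 1))
```

Because the synthetic target is constructed to depend on the third regressor, the expanded model necessarily wins here; the sketch only shows how the two criteria are computed and compared, not the empirical result in Table 1.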
3.2. U.S. House Price Index. In addition to the financial market data, we also perform our analysis on U.S. house price data, which consist of quarterly house price indices for 51 U.S. states from 1970 to 2017. We estimate the directed dependency networks on a time window of T = 20 quarters (5 years), and compare the calculated SCC index with the overall house price index for the entire U.S. real estate market in Figure 9.
It is evident from the results that the estimated SCC index is high in the midst of the 2007-2009 housing bubble, indicating that there were strong feedback components and dependencies between the real estate prices in U.S. states. We repeat the previous analysis and inspect the cross-correlation between the estimated SCC index and the volatility of the U.S. HPI index, estimated on a rolling window of 5 years (i.e., 20 quarters); the results are shown in Figure 10.
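A lagged cross-correlation of this kind can be sketched in a few lines of Python. The sketch below is illustrative only: the series are synthetic, the function names `pearson` and `cross_correlation` are hypothetical, and the sign convention (a positive lag means the indicator leads the volatility) is an assumption made for the example.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def cross_correlation(indicator, volatility, lag):
    """Correlation between indicator[t] and volatility[t + lag]."""
    if lag >= 0:
        a, b = indicator[:len(indicator) - lag], volatility[lag:]
    else:
        a, b = indicator[-lag:], volatility[:len(volatility) + lag]
    return pearson(a, b)

# Synthetic example: the volatility repeats the indicator 5 steps later, plus noise.
random.seed(1)
n, true_lead = 500, 5
signal = [0.0]
for _ in range(n + true_lead - 1):
    signal.append(0.5 * signal[-1] + random.gauss(0.0, 1.0))
indicator = signal[true_lead:]
volatility = [s + random.gauss(0.0, 0.2) for s in signal[:n]]

ccf = {lag: cross_correlation(indicator, volatility, lag) for lag in range(-10, 11)}
best_lag = max(ccf, key=ccf.get)  # lag at which the cross-correlation peaks
print(best_lag, round(ccf[best_lag], 3))
```

A peak at a positive lag is what an early-warning indicator should exhibit under this convention; a peak at zero or at a negative lag would mean the indicator moves with or after the volatility.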
Even though the SCC index was estimated on a rolling time window of length T = 20 quarters (i.e., 5 years), the cross-correlation in Figure 10 suggests that it is predictive of future volatility on a shorter time span, from 2 to 8 quarters (half a year to two years). To measure the extent to which volatility prediction is improved by including the SCC index, we employ regressions on the future volatility of the U.S. HPI index. To avoid overlapping windows in calculating the input and output variables (past and future volatility), we adopt a 2-year window (8 quarters) to estimate both the past volatility $\sigma_t^{t-2y}$ and the future volatility $\sigma_t^{t+2y}$, which is the dependent variable in the model. The reduced model reads: $\sigma_t^{t+2y} = \beta_0 + \beta_1 \sigma_t^{t-2y}$. The expanded model includes the SCC index: $\sigma_t^{t+2y} = \beta_0 + \beta_1 \sigma_t^{t-2y} + \beta_2 S_t^{\mathrm{HPI}}$. The results of the performed regressions are given in Table 2.
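The construction of non-overlapping past and future volatility windows can be sketched as follows. This is a minimal illustration under stated assumptions: the return series is made up, the function name `past_future_volatility` is hypothetical, and the windows are taken as the half-open index ranges [t − w, t) and [t, t + w), so that no observation enters both.

```python
import statistics

def past_future_volatility(returns, window):
    """For each admissible t, return (t, past volatility, future volatility).

    Past volatility uses returns[t - window:t], future volatility uses
    returns[t:t + window]; both are sample standard deviations.
    """
    rows = []
    for t in range(window, len(returns) - window + 1):
        past = statistics.stdev(returns[t - window:t])
        future = statistics.stdev(returns[t:t + window])
        rows.append((t, past, future))
    return rows

# Made-up quarterly returns: a calm, a moderate, and a turbulent 8-quarter block.
returns = [0.01, -0.01] * 4 + [0.02, -0.02] * 4 + [0.05, -0.05] * 4
rows = past_future_volatility(returns, window=8)
for t, past, future in rows[:3]:
    print(t, round(past, 4), round(future, 4))
```

Each row pairs the regressor (past volatility) with the dependent variable (future volatility) at the same time t, which is the alignment the reduced and expanded models require.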
The results suggest that including the SCC index significantly improves the model, indicating that the proposed indicator captures these effects and signals the systemic risk associated with the U.S. housing market in a timely manner. The strength of the relationship is similar to the one found in the financial market data in Section 3.1, although volatilities are generally lower in real estate prices. These results additionally support our hypothesis and extend the applicability of our approach from stock market data to economic complex systems such as the housing market.

Summary
In this contribution, we have analyzed directed dependency networks of financial assets, estimated using the information-theoretic concepts of transfer entropy and mutual information. We employed a resampling technique to remove the estimation bias and obtain validated directed networks. First, we found that the general predictability levels in the U.S. stock market have been diminishing over the last decades, which coincides with the reduction in bid-ask spreads and transaction costs and may be interpreted as an indicator of rising market efficiency. In addition, we examined the estimated directed dependency networks for various periods in time, and report that the networks exhibit strong connections with feedback loops during periods of high volatility and market crashes, as opposed to the sparse networks estimated during periods of stable market growth. To test the hypothesis that information feedback in the financial network indicates elevated systemic risk levels, we estimated directed temporal networks on a rolling time window and identified the strongly connected component (SCC) for each time step.
To estimate the contribution of the SCC to the entire system, we defined the SCC index as the sum of all SCC node out-degrees, normalized by the maximum theoretical number of links. Using stock market and real estate data, we showed that the SCC index can be used to indicate systemic risk in a timely manner and to predict future market volatility with remarkable precision, consistently across all the considered datasets. These results indicate that our methodology yields relevant information for evaluating systemic risk in financial networks, and may be useful for both academics and practitioners as a tool for developing early-warning signals for future market crashes.
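As an illustration of the SCC index definition, the following Python sketch computes it for a small directed graph. It rests on several assumptions that the definition above leaves open: only the largest strongly connected component is used, out-degrees count all outgoing links of the component's nodes, the maximum number of links is taken as N(N − 1), and a component consisting of a single node is treated as carrying no feedback. The function names and the toy graph are hypothetical.

```python
def reachable(adj, start):
    """Set of nodes reachable from start (including start) by following links."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def largest_scc(nodes, edges):
    """Largest strongly connected component, found by mutual reachability."""
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
    reach = {u: reachable(adj, u) for u in nodes}
    best = set()
    for u in nodes:
        # v belongs to the component of u if each can reach the other
        comp = {v for v in reach[u] if u in reach[v]}
        if len(comp) > len(best):
            best = comp
    return best, adj

def scc_index(nodes, edges):
    """Sum of out-degrees of the largest SCC's nodes, divided by N(N - 1)."""
    comp, adj = largest_scc(nodes, edges)
    if len(comp) < 2:
        return 0.0  # assumption: a single node carries no feedback
    n = len(nodes)
    return sum(len(adj[u]) for u in comp) / (n * (n - 1))

# Toy network: A -> B -> C -> A forms a feedback loop, and C also links to D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
largest, _ = largest_scc(nodes, edges)
index = scc_index(nodes, edges)
print(sorted(largest), round(index, 4))
```

The mutual-reachability search is quadratic in the number of nodes, which is adequate for networks of a few dozen assets; for larger graphs a linear-time method such as Tarjan's algorithm would be the usual choice.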

Figure 1: Membership functions with parameter α = 150 used for calculating discrete distributions of asset returns.
(a) During the period of stable market growth from 1994 to 2006, the estimated network is sparse with relatively low values of normalized transfer entropy links. (b) During the Dot-com bubble and crash between 1999 and 2001, the network is much denser, with higher link weights. (c) Again, during a period of market recovery from 2004 to 2006, the network is considerably lighter in terms of link weights. (d) During the subprime bubble and crisis from 2007 to 2009, the estimated network is almost fully connected, with considerably heavier weights.

Figure 2: The causality network of the U.S. stock market estimated in the period 1992-1994, showing only nodes connected to the main component. Two strongly connected components and the edges within are marked in green and blue. The network was visualized in the Cytoscape software environment [55].

Figure 4: [Network figure. Legible node labels are company names, including 3M, Abbott Laboratories, Altria Group, American Electric Power, Archer Daniels Midland, Boeing, Bristol-Myers Squibb, Campbell Soup, Caterpillar, Chevron, CMS Energy, Coca-Cola, Colgate-Palmolive, Consolidated Edison, CSX, CSV, Deere, Dow Chemical, DTE Energy, DuPont, Eaton, Edison International, Entergy, Exelon, Exxon Mobil, Ford Motor, General Dynamics, General Electric, and General Mills.]

Figure 5: Number of links (shown in dark blue, scale on the left axis), and the number of feedback pairs (light blue, left axis) as a percentage of total possible links/pairs, and the average link weight (black, right axis) for the causality networks estimated from the U.S. stock market data on a rolling time window of T = 1 year.

Figure 6: The number of nodes within the largest and second largest strongly connected components of the networks, expressed as a percentage of all the nodes in the network. The transparent lines are the raw data and the full blue and green lines correspond to the 1-year moving average.

Figure 7: The estimated SCC index and the product of the VIX index and the SCC index as early-warning indicators for financial crises, shown with the S&P 500 market index as a benchmark of U.S. stock market performance. The sector index data are available only since 2000 (green lines), and the VIX data are available since 1990, thus $\mathrm{VIX} \cdot S$ in the lower graph is displayed only from that point on; the time frames are nevertheless kept the same in both graphs for comparison.

Figure 8: Cross-correlation functions between the volatility of the S&P 500 index $\sigma_t$ and the SCC index $S_t$ (a), and the cross-correlation function of the S&P 500 volatility and the $\mathrm{VIX} \cdot S_t$ indicator (b). The observed cross-correlation functions peak around τ = 150 for stocks and τ = 250 for sector index data.

Figure 9: The estimated SCC index for the house price indices of 51 U.S. states on a rolling time window of T = 20 quarters (blue, scale on the right axis), together with the overall U.S. house price index (black, left axis), and the 1-year moving average return of the US House Price Index (green, right axis).

Figure 10: The cross-correlation function between the volatility of the U.S. HPI index $\sigma_t$ and the SCC index $S_t^{\mathrm{HPI}}$ estimated for the house price dataset. The observed cross-correlation function peaks around τ = 5 quarters and remains relatively high between τ = 2 and τ = 8.

Table 1: Regression analysis of the developed SCC index on the future realized market volatility, using reduced (without the SCC index S) and expanded (including S) models: without, with, and only using the interaction term $\mathrm{VIX} \cdot S$, performed on both the U.S. stocks and U.S. sector index data. Coefficients significant at the 1% significance level are marked with an asterisk (*). The respective adjusted $R^2$ measures and the Akaike information criterion (AIC) are also reported, including the AIC difference (Δ) from the null (reduced) model.

Table 2: Regression analysis of the developed SCC index on the future realized market volatility, using reduced (without the SCC index S) and expanded (including S) models, performed on the U.S. house price data. Coefficients significant at the 1% significance level are marked with an asterisk (*). The respective adjusted $R^2$ measures and the Akaike information criterion (AIC) are also reported, including the AIC difference (Δ) from the null (reduced) model.